COLLATION (often used via the COLLATE clause) tells the database which locale-specific rules to apply when comparing, sorting, or hashing character strings. A collation bundles language, culture, and case-sensitivity rules, affecting ORDER BY results, equality checks, DISTINCT, GROUP BY, indexes, and joins. Collations can be assigned at the server, database, column, expression, or literal level, with later scopes overriding earlier ones. Choosing the correct collation prevents unexpected sort orders and comparison mismatches between character sets. Mixing collations in a single query triggers implicit or explicit collation coercion; when coercion is impossible, the engine raises an error. Not all collations are available in every database or OS, and some operations (e.g., LIKE optimizations) may use slower code paths with non-default collations.
collation_name
(string) - Locale identifier or predefined collation to applyCHARACTER SET, NLS parameters, ORDER BY, LIKE, CASE SENSITIVITY, INDEXES
SQL-92 (standardized), extended in SQL:1999; vendor support predates the standard
A character set defines the valid symbols and their binary encoding, while a collation defines how those symbols are compared and ordered. Many collations can exist for one character set.
In SQL Server, query sys.columns. In MySQL, use SHOW FULL COLUMNS or INFORMATION_SCHEMA.COLUMNS. PostgreSQL exposes it via the collcollation column in pg_attribute.
Altering the default collation updates metadata only; stored data remains unchanged. However, indexes may need to rebuild and queries without explicit COLLATE will now use the new rules.
No. Collation names and behaviors are vendor-specific. Always check the target DBMS and specify portable Unicode collations where possible.