Unicode and Collation Support in Microsoft SQL Server - PowerPoint PPT Presentation

1 / 26
About This Presentation
Title:

Unicode and Collation Support in Microsoft SQL Server

Description:

Prague, Czech Republic (IUC23) Unicode Support ... Prague, Czech Republic (IUC23) At the server level. Acts as a default for all databases ... – PowerPoint PPT presentation

Number of Views:279
Avg rating:3.0/5.0

less

Transcript and Presenter's Notes

Title: Unicode and Collation Support in Microsoft SQL Server


1
Unicode and Collation Support in Microsoft SQL
Server
  • Michael S. Kaplan
  • Globalization Infrastructure and Font Technology
  • Windows International
  • Microsoft

2
Unicode Support
  • Uses the "N" or national data types from the
    SQL-92 specification
  • NCHAR, NVARCHAR, NTEXT
  • What the SQL-99 spec says about Unicode
  • Interoperability with other clients

3
Collation in SQL Server lt 6.5
  • No Unicode support at all
  • One code page per server
  • One collation per server
  • No good solution for multilingual support

4
Collation in SQL Server 7.0
  • Unicode datatypes supported
  • Two collations
  • Unicode
  • Non-Unicode
  • Number of collations distilled down to the
    minimum necessary

5
7.0 flattening of collations
  • Example the General Unicode sort order handles
    Afrikaans, Albanian, Arabic, Basque, Belarusian,
    Bulgarian, English, Faeroese, Farsi, Georgian
    (Traditional), Greek, Hebrew, Hindi, Indonesian,
    Malay, Russian, Serbian, Swahili, and Urdu

6
OS independence
  • Collation independent of operating system
  • Based on the Jet Unicorn DLLs

7
SQL Language Support(limited locale information)
  • Messages
  • Date/Time
  • First Day of Week
  • Currency and currency symbols
  • Month/day names and abbreviated month names

8
SQL Language Support(list of languages)
  • Arabic
  • British English
  • Brazilian
  • Bulgarian
  • Simplified Chinese
  • Traditional Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch
  • English
  • Estonian
  • Finnish
  • French
  • German
  • Greek
  • Hungarian
  • Italian
  • Japanese
  • Korean
  • Latvian
  • Lithuanian
  • Norwegian
  • Polish
  • Portuguese
  • Romanian
  • Russian
  • Slovak
  • Slovenian
  • Spanish
  • Swedish
  • Thai
  • Turkish

9
Getting at the list of languages
  • sp_helplanguage stored procedure
  • syslanguages/sysmessages tables
  • SET LANGUAGE
  • SET LANGUAGE cetina
  • SET LANGUAGE ???
  • Each language has a langid (0 32)

10
Collation in SQL Server 2000
  • Combined code pages and collations into a single
    entity

11
"Windows" collations
  • Added for unique code pages(Example Arabic)
  • Added for unique ordering (Example French)
  • Removed for identical ordering(Example
    Finnish_Swedish)

12
43 Windows Collations
  • Albanian
  • Arabic
  • Chinese_PRC
  • Chinese_PRC_Stroke
  • Chinese_Taiwan_Bopomofo
  • Chinese_Taiwan_Stroke
  • Cyrillic_General
  • Croatian
  • Czech
  • Danish_Norwegian
  • Estonian
  • Finnish_Swedish
  • French
  • Georgian_Modern_sort
  • German_PhoneBook
  • Greek
  • Hebrew
  • Hindi
  • Hungarian
  • Japanese_Unicode
  • Korean_Wansung
  • Korean_Wansung_Unicode
  • Latin1_General
  • Latvian
  • Lithuanian
  • Lithuanian_Classic
  • FYRO Macedonian
  • Spanish (Spain)
  • Polish
  • Romanian
  • Slovak
  • Slovenian
  • Thai
  • Traditional_Spanish
  • Turkish
  • Ukrainian
  • Vietnamese  

13
Windows collations, continued
  • Suffix meanings
  • _BIN (Binary)
  • _CI/_CS (Case sensitivity)
  • _AI/_AS (Accent sensitivity)
  • _KS - kanatype sensitivity (hiragana/katakana)
  • _WS - width sensitivity (full/half width)

14
SQL Collations
  • Provided for backwards compatibility with prior
    versions of SQL Server

15
SQL Collations
  • SQL_1xCompat_CP850
  • SQL_Estonian_CP1257
  • SQL_Latin1_General_Pref_CP437
  • SQL_AltDiction_CP1253
  • SQL_Hungarian_CP1250
  • SQL_Latin1_General_Pref_CP850
  • SQL_AltDiction_CP850
  • SQL_Icelandic_Pref_CP1
  • SQL_Latvian_CP1257
  • SQL_AltDiction_Pref_CP850
  • SQL_Latin1_General_CP1
  • SQL_Lithuanian_CP1257
  • SQL_Croatian_CP1250
  • SQL_Latin1_General_CP1250
  • SQL_MixDiction_CP1253
  • SQL_Czech_CP1250
  • SQL_Latin1_General_CP1251
  • SQL_Polish_CP1250
  • SQL_Danish_Pref_CP1
  • SQL_Latin1_General_CP1253
  • SQL_Romanian_CP1250
  • SQL_EBCDIC037_CP1
  • SQL_Latin1_General_CP1254
  • SQL_Scandinavian_CP850
  • SQL_EBCDIC273_CP1
  • SQL_Latin1_General_CP1255
  • SQL_Scandinavian_Pref_CP850
  • SQL_EBCDIC277_CP1
  • SQL_Latin1_General_CP1256
  • SQL_Slovak_CP1250
  • SQL_EBCDIC278_CP1
  • SQL_Latin1_General_CP1257
  • SQL_Slovenian_CP1250
  • SQL_EBCDIC280_CP1
  • SQL_Latin1_General_CP437
  • SQL_SwedishPhone_Pref_CP1
  • SQL_EBCDIC284_CP1
  • SQL_Latin1_General_CP850
  • SQL_SwedishStd_Pref_CP1
  • SQL_EBCDIC285_CP1
  • SQL_Latin1_General_Pref_CP1
  • SQL_Ukrainian_CP1251
  • SQL_AltDiction_CP1253
  • SQL_Hungarian_CP1250 
  • SQL_Latin1_General_Pref_CP850  

16
Collation at four levels
  • Server
  • Database
  • Column
  • Expression

17
At the server level
  • Acts as a default for all databases
  • Can be changed with RebuildM.exe in the
    tools\BINN dir
  • Querying the server collation
  • SELECT CONVERT(char, SERVERPROPERTY('collation'))

18
At the database level
  • Every database has a collation (default is the
    server collation)
  • Collation can be changed under some circumstances

19
At the column level
  • Overrides database level collation
  • Specifies code page for non-Unicode columns
  • Again, can be changed under some circumstances
  • No multilingual columns with separate collations

20
At the expression level
  • Can be used to override any other collation
  • uses the COLLATE keyword

21
Metadata in System Tables
  • All stored as Unicode no matter what the database
    collation is
  • Unicode 2.0 repertoire is used for identifiers
    (use brackets or quotes around anything else)

22
More on the COLLATE keyword
  • COLLATE ltWindows_Collation_namegtltSQL_Collation_N
    ame
  • Specific rules of precedence
  • Explicit (two explicits runtime error)
  • Implicit (two implicits no collation)
  • Default
  • ltno collationgt

23
Limitations
  • Features people will want for future versions
  • LCID --gt Collation
  • ISO string lt--gt Collation
  • Creating custom collations?

24
References
  • http//microsoft.com/globaldev/
  • International Features in Microsoft SQL Server
    2000(by Michael Kaplan) at http//msdn.microsoft
    .com/

25
Questions?
26
Dont Forget Your Evaluations!
  • Unicode and Collation Support
  • in Microsoft SQL Server
Write a Comment
User Comments (0)
About PowerShow.com