Supported Encodings

The tables below are taken from Sun's list of supported encodings at http://java.sun.com/j2se/1.4.2/docs/guide/intl/encoding.doc.html and are reproduced here for convenience only. Consult Sun's document (an Internet connection is required) for the authoritative list.



Basic Encoding Set

java.nio Canonical Name  java.io/java.lang Canonical Name  Description
US-ASCII                 ASCII                             American Standard Code for Information Interchange
windows-1252             Cp1252                            Windows Latin-1
ISO-8859-1               ISO8859_1                         ISO 8859-1, Latin alphabet No. 1
ISO-8859-15              In extended encoding set          Latin alphabet No. 9
UTF-8                    UTF8                              Eight-bit UCS Transformation Format
UTF-16                   UTF-16                            Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark
UTF-16BE                 UnicodeBigUnmarked                Sixteen-bit Unicode Transformation Format, big-endian byte order
UTF-16LE                 UnicodeLittleUnmarked             Sixteen-bit Unicode Transformation Format, little-endian byte order
Not available            UnicodeBig                        Sixteen-bit Unicode Transformation Format, big-endian byte order, with byte-order mark
Not available            UnicodeLittle                     Sixteen-bit Unicode Transformation Format, little-endian byte order, with byte-order mark
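
The relationship between the two naming schemes can be sketched in Java. This is an illustrative example only, assuming a current JDK (Java 8 or later) on which Charset.forName accepts the java.io/java.lang names as aliases; the class and method names (BasicEncodingNames, nioName, ioName) are made up for this sketch and are not part of any API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

class BasicEncodingNames {
    /** Resolves a name from either column to the canonical java.nio name. */
    static String nioName(String anyName) {
        return Charset.forName(anyName).name();
    }

    /** Reports the name the java.io API uses for the same encoding. */
    static String ioName(String anyName) {
        Charset charset = Charset.forName(anyName);
        // An empty in-memory stream is enough; only the reader's encoding name is needed.
        try (InputStreamReader reader =
                new InputStreamReader(new ByteArrayInputStream(new byte[0]), charset)) {
            return reader.getEncoding();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // java.io/java.lang names from the table, resolved to java.nio names.
        for (String name : new String[] {"ASCII", "ISO8859_1", "UTF8", "UnicodeBigUnmarked"}) {
            System.out.println(name + " -> " + nioName(name));
        }
        System.out.println("UTF-8 -> " + ioName("UTF-8"));

        // UTF-16 writes a byte-order mark when encoding; UTF-16BE does not.
        System.out.println("A".getBytes(Charset.forName("UTF-16")).length);   // 4 bytes
        System.out.println("A".getBytes(Charset.forName("UTF-16BE")).length); // 2 bytes
    }
}
```

Either name can usually be passed wherever an encoding name is expected, so the choice mostly matters when a name is compared as a string or written to a configuration file.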


Extended Encoding Set

java.nio Canonical Name  java.io/java.lang Canonical Name  Description
windows-1250             Cp1250                            Windows Eastern European
windows-1253             Cp1253                            Windows Greek
windows-1254             Cp1254                            Windows Turkish
windows-1255             Cp1255                            Windows Hebrew
windows-1256             Cp1256                            Windows Arabic
windows-1257             Cp1257                            Windows Baltic
windows-1258             Cp1258                            Windows Vietnamese
ISO-8859-2               ISO8859_2                         Latin Alphabet No. 2
ISO-8859-3               ISO8859_3                         Latin Alphabet No. 3
ISO-8859-4               ISO8859_4                         Latin Alphabet No. 4
ISO-8859-5               ISO8859_5                         Latin/Cyrillic Alphabet
ISO-8859-6               ISO8859_6                         Latin/Arabic Alphabet
ISO-8859-7               ISO8859_7                         Latin/Greek Alphabet
ISO-8859-8               ISO8859_8                         Latin/Hebrew Alphabet
ISO-8859-9               ISO8859_9                         Latin Alphabet No. 5
ISO-8859-13              ISO8859_13                        Latin Alphabet No. 7
windows-31j              MS932                             Windows Japanese
EUC-JP                   EUC_JP                            JIS X 0201, 0208, and 0212, EUC encoding, Japanese
EUC-JP-LINUX             EUC_JP_LINUX                      JIS X 0201, 0208, EUC encoding, Japanese
Shift_JIS                SJIS                              Shift-JIS, Japanese
ISO-2022-JP              ISO2022JP                         JIS X 0201, 0208, in ISO 2022 form, Japanese
windows-936              MS936                             Windows Simplified Chinese
GB18030                  GB18030                           Simplified Chinese, PRC standard
EUC-CN                   EUC_CN                            GB2312, EUC encoding, Simplified Chinese
GBK                      GBK                               GBK, Simplified Chinese
ISCII91                  ISCII91                           ISCII91 encoding of Indic scripts
ISO-2022-CN-GB           ISO2022CN_GB                      GB 2312 in ISO 2022 CN form, Simplified Chinese (conversion from Unicode only)
windows-949              MS949                             Windows Korean
EUC-KR                   EUC_KR                            KS C 5601, EUC encoding, Korean
ISO-2022-KR              ISO2022KR                         ISO 2022 KR, Korean
windows-950              MS950                             Windows Traditional Chinese
EUC-TW                   EUC_TW                            CNS 11643 (Plane 1-3), EUC encoding, Traditional Chinese
ISO-2022-CN-CNS          ISO2022CN_CNS                     CNS 11643 in ISO 2022 CN form, Traditional Chinese (conversion from Unicode only)
Big5                     Big5                              Big5, Traditional Chinese
Big5-HKSCS               Big5_HKSCS                        Big5 with Hong Kong extensions, Traditional Chinese
TIS-620                  TIS620                            TIS620, Thai
KOI8-R                   KOI8_R                            KOI8-R, Russian
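
Because encodings in the extended set may not be installed on every Java runtime, code that relies on one can check for it first. The following is a minimal sketch of that check, not a prescribed approach; the class name ExtendedEncodingLookup, the helper charsetOrFallback, and the choice of UTF-8 as the fallback are assumptions made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ExtendedEncodingLookup {
    /** Returns the named charset if this runtime provides it, otherwise the fallback. */
    static Charset charsetOrFallback(String name, Charset fallback) {
        return Charset.isSupported(name) ? Charset.forName(name) : fallback;
    }

    public static void main(String[] args) {
        // Names taken from the table above; whether each resolves depends on the runtime.
        for (String name : new String[] {"Shift_JIS", "EUC-KR", "Big5", "KOI8-R"}) {
            Charset charset = charsetOrFallback(name, StandardCharsets.UTF_8);
            System.out.println(name + " -> " + charset.name());
        }
    }
}
```

Silently falling back to another encoding can corrupt text, so in practice it may be preferable to report the missing encoding to the user instead.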


Extended Encoding Set (supported by java.io and java.lang APIs)

Canonical Name    Description
Big5_Solaris      Big5 with seven additional Hanzi ideograph character mappings for the Solaris zh_TW.BIG5 locale
Cp037             USA, Canada (Bilingual, French), Netherlands, Portugal, Brazil, Australia
Cp273             IBM Austria, Germany
Cp277             IBM Denmark, Norway
Cp278             IBM Finland, Sweden
Cp280             IBM Italy
Cp284             IBM Catalan/Spain, Spanish Latin America
Cp285             IBM United Kingdom, Ireland
Cp297             IBM France
Cp420             IBM Arabic
Cp424             IBM Hebrew
Cp437             MS-DOS United States, Australia, New Zealand, South Africa
Cp500             EBCDIC 500V1
Cp737             PC Greek
Cp775             PC Baltic
Cp838             IBM Thailand extended SBCS
Cp850             MS-DOS Latin-1
Cp852             MS-DOS Latin-2
Cp855             IBM Cyrillic
Cp856             IBM Hebrew
Cp857             IBM Turkish
Cp858             Variant of Cp850 with Euro character
Cp860             MS-DOS Portuguese
Cp861             MS-DOS Icelandic
Cp862             PC Hebrew
Cp863             MS-DOS Canadian French
Cp864             PC Arabic
Cp865             MS-DOS Nordic
Cp866             MS-DOS Russian
Cp868             MS-DOS Pakistan
Cp869             IBM Modern Greek
Cp870             IBM Multilingual Latin-2
Cp871             IBM Iceland
Cp874             IBM Thai
Cp875             IBM Greek
Cp918             IBM Pakistan (Urdu)
Cp921             IBM Latvia, Lithuania (AIX, DOS)
Cp922             IBM Estonia (AIX, DOS)
Cp930             Japanese Katakana-Kanji mixed with 4370 UDC, superset of 5026
Cp933             Korean Mixed with 1880 UDC, superset of 5029
Cp935             Simplified Chinese Host mixed with 1880 UDC, superset of 5031
Cp937             Traditional Chinese Host mixed with 6204 UDC, superset of 5033
Cp939             Japanese Latin Kanji mixed with 4370 UDC, superset of 5035
Cp942             IBM OS/2 Japanese, superset of Cp932
Cp942C            Variant of Cp942
Cp943             IBM OS/2 Japanese, superset of Cp932 and Shift-JIS
Cp943C            Variant of Cp943
Cp948             OS/2 Chinese (Taiwan), superset of 938
Cp949             PC Korean
Cp949C            Variant of Cp949
Cp950             PC Chinese (Hong Kong, Taiwan)
Cp964             AIX Chinese (Taiwan)
Cp970             AIX Korean
Cp1006            IBM AIX Pakistan (Urdu)
Cp1025            IBM Multilingual Cyrillic: Bulgaria, Bosnia, Herzegovina, Macedonia (FYR)
Cp1026            IBM Latin-5, Turkey
Cp1046            IBM Arabic - Windows
Cp1097            IBM Iran (Farsi)/Persian
Cp1098            IBM Iran (Farsi)/Persian (PC)
Cp1112            IBM Latvia, Lithuania
Cp1122            IBM Estonia
Cp1123            IBM Ukraine
Cp1124            IBM AIX Ukraine
Cp1140            Variant of Cp037 with Euro character
Cp1141            Variant of Cp273 with Euro character
Cp1142            Variant of Cp277 with Euro character
Cp1143            Variant of Cp278 with Euro character
Cp1144            Variant of Cp280 with Euro character
Cp1145            Variant of Cp284 with Euro character
Cp1146            Variant of Cp285 with Euro character
Cp1147            Variant of Cp297 with Euro character
Cp1148            Variant of Cp500 with Euro character
Cp1149            Variant of Cp871 with Euro character
Cp1381            IBM OS/2, DOS People's Republic of China (PRC)
Cp1383            IBM AIX People's Republic of China (PRC)
Cp33722           IBM-eucJP - Japanese (superset of 5050)
ISO8859_15        ISO 8859-15, Latin alphabet No. 9
JISAutoDetect     Detects and converts from Shift-JIS, EUC-JP, ISO 2022 JP (conversion to Unicode only)
MS874             Windows Thai
MacArabic         Macintosh Arabic
MacCentralEurope  Macintosh Latin-2
MacCroatian       Macintosh Croatian
MacCyrillic       Macintosh Cyrillic
MacDingbat        Macintosh Dingbat
MacGreek          Macintosh Greek
MacHebrew         Macintosh Hebrew
MacIceland        Macintosh Iceland
MacRoman          Macintosh Roman
MacRomania        Macintosh Romania
MacSymbol         Macintosh Symbol
MacThai           Macintosh Thai
MacTurkish        Macintosh Turkish
MacUkraine        Macintosh Ukraine
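
As an illustration of using one of these names through the java.lang API, the sketch below decodes bytes with the String(byte[], String) constructor and treats an unavailable encoding as an empty result. The class name LegacyEncodingDecode and the helper decode are hypothetical, and whether Cp037 is available depends on the Java runtime in use.

```java
import java.io.UnsupportedEncodingException;
import java.util.Optional;

class LegacyEncodingDecode {
    /** Decodes bytes with the named encoding, or returns empty if the runtime lacks it. */
    static Optional<String> decode(byte[] bytes, String encodingName) {
        try {
            return Optional.of(new String(bytes, encodingName));
        } catch (UnsupportedEncodingException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // The word HELLO in EBCDIC code page 037.
        byte[] ebcdic = {(byte) 0xC8, (byte) 0xC5, (byte) 0xD3, (byte) 0xD3, (byte) 0xD6};
        System.out.println(decode(ebcdic, "Cp037").orElse("(Cp037 is not available on this runtime)"));
    }
}
```

Returning an Optional keeps the missing-encoding case explicit at the call site instead of forcing every caller to handle a checked exception.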

