{"id":2229,"date":"2014-10-03T12:16:15","date_gmt":"2014-10-03T19:16:15","guid":{"rendered":"http:\/\/www.curlybrace.com\/words\/?p=2229"},"modified":"2014-10-03T12:16:15","modified_gmt":"2014-10-03T19:16:15","slug":"windows-console-and-doublemulti-byte-character-set","status":"publish","type":"post","link":"https:\/\/www.curlybrace.com\/words\/2014\/10\/windows-console-and-doublemulti-byte-character-set\/","title":{"rendered":"Windows Console and Double\/Multi Byte Character Set"},"content":{"rendered":"<p>The Windows Console doesn&#8217;t support Unicode. It does, however, support Double Byte Character Sets using Code Pages. By changing the system locale, the Console can display Japanese, Korean, and Chinese text:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/CMD_CodePage932_SystemLocaleSetTo932_MSGothicFont.png\" alt=\"Code Page 932, Japanese file names and Unicode file content work correctly, UTF-8 file content is gibberish.\" width=\"757\" height=\"601\" class=\"size-full wp-image-2236\" srcset=\"https:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/CMD_CodePage932_SystemLocaleSetTo932_MSGothicFont.png 757w, https:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/CMD_CodePage932_SystemLocaleSetTo932_MSGothicFont-300x238.png 300w\" sizes=\"auto, (max-width: 757px) 100vw, 757px\" \/><\/p>\n<h2>Terminology<\/h2>\n<p>UTF-8 and UTF-16 are types of Unicode. However, it&#8217;s common on Windows to refer to UTF-16 as Unicode, and UTF-8 as UTF-8. I will follow this convention. DBCS (Double Byte Character Set) is the only type of MBCS (Multi Byte Character Set) supported by legacy (i.e. non-Unicode) Windows applications. Japanese, Chinese, and Korean are supported via DBCS encodings. 
None of these DBCS encodings are Unicode, and all of them are proprietary Microsoft implementations of other standards.<\/p>\n<h2>Code Pages Supported by Windows<\/h2>\n<p>Windows supports four Double Byte Character Set code pages:<\/p>\n<ul>\n<li>932 (Japanese Shift-JIS)<\/li>\n<li>936 (Simplified Chinese GBK)<\/li>\n<li>949 (Korean Unified Hangul Code)<\/li>\n<li>950 (Traditional Chinese Big5)<\/li>\n<\/ul>\n<p><strong>The available code pages are determined by your System Locale<\/strong>. If your System Locale is set to &#8220;English (United States)&#8221;, then these code pages will be unavailable to you. In this post, I cover only Japanese, since it&#8217;s the only one of these languages with which I have any familiarity. The steps and results should be similar for the other languages.<\/p>\n<h2>How to Change System Locale<\/h2>\n<p>To change your system locale, open &#8220;Change date, time, or number formats&#8221;:<\/p>\n<p><a href=\"http:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/StartMenu_ChangeDateTime.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/StartMenu_ChangeDateTime-300x37.png\" alt=\"StartMenu_ChangeDateTime\" width=\"300\" height=\"37\" class=\"aligncenter size-medium wp-image-2223\" srcset=\"https:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/StartMenu_ChangeDateTime-300x37.png 300w, https:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/StartMenu_ChangeDateTime.png 379w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a> Select the Administrative tab, and click on &#8220;Change system locale&#8221;. Select the new system locale, click OK, and reboot. 
<strong>The system must be rebooted to change the system locale<\/strong>:<\/p>\n<p><a href=\"http:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/SystemLocaleSetting.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/SystemLocaleSetting-271x300.png\" alt=\"SystemLocaleSetting\" width=\"271\" height=\"300\" class=\"aligncenter size-medium wp-image-2230\" srcset=\"https:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/SystemLocaleSetting-271x300.png 271w, https:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/SystemLocaleSetting.png 498w\" sizes=\"auto, (max-width: 271px) 100vw, 271px\" \/><\/a><\/p>\n<h2>Windows Console Font and Code Page<\/h2>\n<p>The font typically recommended for Japanese output is MS Gothic. I have, however, found that Japanese text also displays with the Terminal font selected, but it&#8217;s entirely possible that the UI is lying to me.<\/p>\n<p>To change the Windows Console code page, use the chcp command followed by the code page number (for example, chcp 932). Run with no arguments, chcp displays the active code page.<\/p>\n<h2>Code Page 932 (Japanese Shift-JIS)<\/h2>\n<p>With the code page set to 932 (Japanese Shift-JIS), the path separator character (backslash) will display as the Yen symbol (because only the backslash and tilde positions differ from ASCII in the lower 7 bits of Shift-JIS). Japanese file names will display in Japanese, as will text saved as Unicode. 
Japanese text saved as UTF-8 will display as gibberish:<\/p>\n<p><a href=\"http:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/CMD_CodePage932_SystemLocaleSetTo932_MSGothicFont.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/CMD_CodePage932_SystemLocaleSetTo932_MSGothicFont-300x238.png\" alt=\"CMD_CodePage932_SystemLocaleSetTo932_MSGothicFont\" width=\"300\" height=\"238\" class=\"aligncenter size-medium wp-image-2236\" srcset=\"https:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/CMD_CodePage932_SystemLocaleSetTo932_MSGothicFont-300x238.png 300w, https:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/CMD_CodePage932_SystemLocaleSetTo932_MSGothicFont.png 757w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<h2>Code Page 65001 (UTF-8)<\/h2>\n<p>I have found that setting the code page to 65001 (UTF-8) <em>sometimes<\/em> works. When it does, Japanese file names, Japanese Unicode file content, <em>and<\/em> Japanese UTF-8 content all display, as shown below. However, when I experimented with this, it stopped working after I changed fonts and code pages a few times. 
My final impression is that it <em>should<\/em> work, but that the Console has some bugs in this regard.<\/p>\n<p><a href=\"http:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/CMD_CodePage65001_SystemLocaleSetTo932_RasterFont.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/CMD_CodePage65001_SystemLocaleSetTo932_RasterFont-300x215.png\" alt=\"CMD_CodePage65001_SystemLocaleSetTo932_RasterFont\" width=\"300\" height=\"215\" class=\"aligncenter size-medium wp-image-2237\" srcset=\"https:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/CMD_CodePage65001_SystemLocaleSetTo932_RasterFont-300x215.png 300w, https:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/CMD_CodePage65001_SystemLocaleSetTo932_RasterFont.png 837w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>Here&#8217;s a screen shot of the Console after code page 65001 stopped working as expected: <a href=\"http:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/CMD_CodePage65001_SystemLocaleSetTo932_RasterFont_StoppedWorking.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/CMD_CodePage65001_SystemLocaleSetTo932_RasterFont_StoppedWorking-300x215.png\" alt=\"Code Page 65001 (UTF-8), Japanese output stopped working\" width=\"300\" height=\"215\" class=\"aligncenter size-medium wp-image-2238\" srcset=\"https:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/CMD_CodePage65001_SystemLocaleSetTo932_RasterFont_StoppedWorking-300x215.png 300w, https:\/\/www.curlybrace.com\/words\/wp-content\/uploads\/2014\/10\/CMD_CodePage65001_SystemLocaleSetTo932_RasterFont_StoppedWorking.png 837w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<h2>References<\/h2>\n<ul>\n<li><a href=\"http:\/\/msdn.microsoft.com\/en-US\/goglobal\/bb964654.aspx\">Code Pages Supported by Windows<\/a> 
(MSDN)<\/li>\n<li><a href=\"http:\/\/www.sljfaq.org\/afaq\/encodings.html\">Encodings of Japanese<\/a> (sci.lang.japan FAQ)<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>The Windows Console doesn&#8217;t support Unicode. It does, however, support Double Byte Character Sets using Code Pages. By changing the system locale, the Console can display Japanese, Korean, and Chinese text: Terminology UTF-8 and UTF-16 are types of Unicode. However, &hellip; <a href=\"https:\/\/www.curlybrace.com\/words\/2014\/10\/windows-console-and-doublemulti-byte-character-set\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[15,283],"tags":[],"class_list":["post-2229","post","type-post","status-publish","format-standard","hentry","category-technology","category-windows-technology"],"_links":{"self":[{"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/posts\/2229","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/comments?post=2229"}],"version-history":[{"count":32,"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/posts\/2229\/revisions"}],"predecessor-version":[{"id":2266,"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/posts\/2229\/revisions\/2266"}],"wp:attachment":[{"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/media?parent=2229"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/categories?post=2229"},{"taxo
nomy":"post_tag","embeddable":true,"href":"https:\/\/www.curlybrace.com\/words\/wp-json\/wp\/v2\/tags?post=2229"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}