Update: after several more hours of Googling and experimenting, I have found a way to display Japanese in the Console. For more information, check out my new post, “Windows Console and Double/Multi Byte Character Set”. The rest of this post is still accurate with regard to Unicode support and Western system locales.
Have you been hoping to see Japanese (or Thai, Hindi, Arabic, etc.) characters appear when you type dir into your command prompt? Well, prepare to be disappointed, as the Windows CMD.exe Console cannot display Unicode characters. You’ll have to use the PowerShell ISE if you want to see full Unicode text output.
The best the Command Shell can do is write out boxes or question marks. When those characters are marked and copied, however, the clipboard is populated with the correct Unicode characters, which can then be pasted into smarter executables, like Notepad.
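To give a feel for where the question marks come from, here is a minimal Python sketch of the general fallback idea: text that has no mapping in a Western code page gets replaced with "?". This is only an illustration. The function name to_code_page, the sample file name, and the choice of code page 437 are my own assumptions, and the sketch does not claim to reproduce what CMD.exe does internally.

```python
# Illustration only: shows how unmappable characters degrade to "?" when text
# is forced into a legacy Western code page. The code page (cp437) and the
# sample file name are assumptions made for this example.

def to_code_page(text: str, code_page: str = "cp437") -> bytes:
    """Encode text for a legacy code page, replacing unmappable characters with '?'."""
    return text.encode(code_page, errors="replace")


if __name__ == "__main__":
    # The three Japanese characters have no equivalent in cp437.
    print(to_code_page("dir 日本語.txt"))  # prints b'dir ???.txt'
```

The original string is untouched by this lossy conversion, which is loosely analogous to the clipboard still receiving the correct characters even though the display does not.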
Michael S. Kaplan, an expert on all things Unicode and Microsoft, wrote about this at great length on MSDN Blogs. Unfortunately, Microsoft decided to wipe his blog from the Internet, even though it breaks links from the likes of Raymond Chen’s The Old New Thing.
Michael’s relevant blog posts can be found on The Internet Archive’s Wayback Machine:
- Anyone who says the console can’t do Unicode isn’t as smart as they think they are from 4/7/2010 (explains that the Unicode characters won’t display, but they will copy to the clipboard).
- The real problem(s) with all of these console “fallback” discussions from 2/15/2010
- Cunningly conquering communicated console caveats. Comprende, mon Capitán? from 5/7/2010 (provides functions to determine whether output is to the Console or the PowerShell ISE).
- A confluence of circumstances leaves a stone unturned… from 9/23/2010 (discusses problems with stdin).
- Conventional wisdom is retarded, aka What the @#%&* is _O_U16TEXT? from 3/18/2008 (explains wide-character output).
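As a rough companion to the post above about telling the Console apart from other output targets, here is a hedged Python sketch that checks whether a stream is attached to a terminal and which encoding it reports. These are not Michael's functions. The helper name describe_output is hypothetical, and I have not verified what this check reports inside the PowerShell ISE.

```python
import sys


def describe_output(stream=None):
    """Report whether a text stream looks like an interactive console.

    Hypothetical helper for illustration; defaults to sys.stdout.
    """
    stream = sys.stdout if stream is None else stream
    return {
        # True when the stream is attached to a terminal or console window,
        # False when it is redirected to a file, a pipe, or an in-memory buffer.
        "is_console": stream.isatty(),
        # The encoding the stream will use for text, if it reports one.
        "encoding": getattr(stream, "encoding", None),
    }


if __name__ == "__main__":
    print(describe_output())
```

A program could use a check like this to decide whether to emit full Unicode text or fall back to something the current display can handle.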