While working on my page on the Chinese calendar, I needed to put Chinese characters and pinyin on the web. The most common way to put Chinese characters on the web is to use Guobiao (GB) encoding. To put pinyin on the web, you can use one of the many special pinyin fonts, or use numbers to indicate the tones, as in Guo2biao1. I have instead decided to use Unicode rather than Guobiao encoding on my web pages. This has many advantages, and I believe that it will eventually become the standard. Unfortunately, there are some problems at the moment.
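Tone-number notation of the kind mentioned above maps mechanically onto the tone-marked letters that Unicode provides. The following is a minimal sketch of that conversion, not a complete pinyin library. The function names are my own, and the placement rule is the usual textbook one (a or e takes the mark, in "ou" the o takes it, otherwise the last vowel). It assumes syllables are written with a trailing digit 1 to 5 and that "v" may stand in for ü.

```python
import re
import unicodedata

# Combining macron, acute, caron and grave, for tones 1 to 4.
TONE_COMBINING = ["\u0304", "\u0301", "\u030c", "\u0300"]
VOWELS = "aeiou\u00fc"  # \u00fc is u with diaeresis


def mark_syllable(syllable: str, tone: int) -> str:
    """Place the tone mark on one pinyin syllable (hypothetical helper)."""
    syllable = syllable.replace("v", "\u00fc")  # "v" is a common ASCII stand-in
    if tone not in (1, 2, 3, 4):
        return syllable  # tone 5 is neutral and carries no mark
    lower = syllable.lower()
    if "a" in lower:
        pos = lower.index("a")
    elif "e" in lower:
        pos = lower.index("e")
    elif "ou" in lower:
        pos = lower.index("ou")  # index of the "o"
    else:
        candidates = [i for i, ch in enumerate(lower) if ch in VOWELS]
        if not candidates:
            return syllable
        pos = candidates[-1]
    # Compose vowel + combining mark into a single precomposed character.
    marked = unicodedata.normalize("NFC", syllable[pos] + TONE_COMBINING[tone - 1])
    return syllable[:pos] + marked + syllable[pos + 1:]


def numbered_to_marked(text: str) -> str:
    """Convert every letters-plus-digit syllable in the text."""
    return re.sub(
        r"([A-Za-z\u00fc]+)([1-5])",
        lambda m: mark_syllable(m.group(1), int(m.group(2))),
        text,
    )


print(numbered_to_marked("Han4yu3 Pin1yin1"))  # Hànyǔ Pīnyīn
```

The output is ordinary Unicode text, so it can go straight into a UTF-8 page without any special pinyin font, provided the reader's font has the glyphs.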
For XHTML 1.0, I set <?xml version="1.0" encoding="UTF-8"?>, and for HTML 4.0 I set <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> rather than charset=gb2312. Fortunately, the fonts in the language packs from Microsoft (MS Song – Simplified/Serif, MS Hei – Simplified/Gothic and MingLiU – Traditional) and the Office 2000 fonts (Simsun – Simplified and PMingLiU – Traditional) have both GB and Unicode encoding tables associated with them.
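To see why the declared charset matters, it helps to look at the bytes. The sketch below uses Python's standard codecs to encode the same character both ways, and checks which characters each encoding can represent at all. The helper name can_encode is my own. Python's gb2312 codec is used here as a stand-in for what a browser does with charset=gb2312, and real browsers may be more lenient.

```python
def can_encode(text: str, charset: str) -> bool:
    """Return True if every character of text exists in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


zhong = "\u4e2d"  # the character zhong, "middle", as in Zhongwen

utf8_bytes = zhong.encode("utf-8")   # three bytes: E4 B8 AD
gb_bytes = zhong.encode("gb2312")    # two bytes: D6 D0

# The bytes only make sense when read back with the charset that wrote them.
assert utf8_bytes.decode("utf-8") == zhong
assert gb_bytes.decode("gb2312") == zhong

# Simplified guo (U+56FD) is in GB2312; Traditional guo (U+570B) is not.
# UTF-8 can carry both on the same page.
print(can_encode("\u56fd", "gb2312"))        # True
print(can_encode("\u570b", "gb2312"))        # False
print(can_encode("\u56fd\u570b", "utf-8"))   # True
```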
In Internet Explorer, if the “Install On Demand” option is checked at Tools | Internet Options | Advanced, you can simply select Chinese at View | Encoding, and the fonts and code pages will be downloaded and installed automatically. Alternatively, go to Windows Update and select “Chinese (Simplified) Language Support” or “Chinese (Traditional) Language Support”.
If you use Netscape, you can search for the files ie3lpktw.exe for Traditional Chinese or ie3lpkcn.exe for Simplified Chinese. (The name contains 3 followed by the letter L, not the number thirty-one.)
Versions 2.76 or higher of Times New Roman, Arial and Courier New contain all the pinyin vowels. They are available from the TrueType core fonts for the Web section of the Microsoft Typography site. The fonts in Microsoft’s Simplified Chinese Language Pack also have them, but they display all accented letters as if they were followed by spaces. The reason for this is that the width of the accented vowels is matched to the width of hanzi.
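For reference, the pinyin vowels in question can be listed with their Unicode code points. The sketch below prints each one with its code point, the HTML numeric character reference you could type into a page, and its official Unicode name. It also counts how many lie outside Latin-1, which is why an older Latin font may lack them. The variable and function names are my own.

```python
import unicodedata

# The 24 tone-marked vowels, normalised so each is one precomposed character.
PINYIN_VOWELS = unicodedata.normalize(
    "NFC", "āáǎà ēéěè īíǐì ōóǒò ūúǔù ǖǘǚǜ".replace(" ", "")
)


def describe(ch: str):
    """Return (character, code point, HTML reference, Unicode name)."""
    code = ord(ch)
    return (ch, f"U+{code:04X}", f"&#{code};", unicodedata.name(ch))


# Latin-1 ends at U+00FF; anything above needs a font with extended coverage.
outside_latin1 = [ch for ch in PINYIN_VOWELS if ord(ch) > 0xFF]

for row in map(describe, PINYIN_VOWELS):
    print(*row, sep="\t")
print(len(outside_latin1), "of", len(PINYIN_VOWELS), "lie outside Latin-1")
```

Only the vowels with acute and grave accents on a, e, i, o and u are in Latin-1. The macron and caron forms, and every tone-marked ü, are the ones that depend on the newer font versions.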
If you use Internet Explorer and have installed support for Chinese, it should be automatic. It will use a Chinese font (like MS Song) for the Chinese characters and a Latin font (like Times New Roman) for the rest. If your Latin font supports pinyin you’re fine!
Part of the reason IE can do this is that it “cheats”. It doesn’t treat Unicode as a codepage, but uses the fonts specified in the language settings. In Netscape, I go to Edit | Preferences | Fonts. There’s an item for Unicode and I can select a suitable font. But in Internet Explorer, when I go to Tools | Options | Fonts, there’s no item for Unicode. They have “Latin based” and “Chinese simplified” and so on. So instead of specifying one font for Unicode, I have to set each language separately. And if I want a language that IE hasn’t heard about, or want to use symbols from Unicode, I may be in trouble.
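The per-language font choice described above can be pictured as splitting the text into runs and picking a font for each run. The sketch below is purely conceptual and is not how IE is actually implemented. The font table, the function names, and the single CJK range check are all assumptions made for illustration.

```python
from itertools import groupby

# Hypothetical table, loosely modelled on per-language font settings.
FONT_FOR_SCRIPT = {"han": "MS Song", "latin": "Times New Roman"}


def script_of(ch: str) -> str:
    """Classify one character; only the main CJK ideograph block is checked."""
    return "han" if "\u4e00" <= ch <= "\u9fff" else "latin"


def font_runs(text: str):
    """Group consecutive characters of the same script and attach a font."""
    return [
        (FONT_FOR_SCRIPT[script], "".join(chars))
        for script, chars in groupby(text, key=script_of)
    ]


print(font_runs("\u4e2d\u6587 Zh\u014dngw\u00e9n"))
```

A character whose script has no entry in the table has nowhere to go, which is the trouble with languages and symbols that the settings do not list. A single Unicode font setting avoids the lookup, but then one font has to cover everything.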
Netscape can only use characters from a single encoding to display a Web page, and it ignores any alternative encoding that you select from the View menu if the page specifies a charset in a meta tag. It does not build Unicode from its constituent codepages but treats it just like any other codepage. That’s why Netscape can’t use Times New Roman for the Latin text and pinyin and MS Song for the Chinese characters the way IE does.
You will have to go to Edit | Preferences | Fonts and select an appropriate font. If you have a Chinese Unicode font like Arial Unicode MS or Bitstream Cyberbit, you’re OK. Just select that for Unicode. If not, choose a Chinese GB font like MS Song for Unicode. Unfortunately, this is not a good solution. The Latin characters in that font are not very pretty, and the font leaves an extra space after the pinyin characters with tone marks, as explained above.
For more help on configuration, you can take a look at the page on Setting up Windows Internet Explorer 5, 5.5 and 6 for Multilingual and Unicode Support or Setting up Windows Netscape Browsers for Multilingual and Unicode Support, part of Alan Wood’s Unicode Resources.
(Source: www.math.nus.edu)