When writing in pinyin, the 2nd and 4th tones are easy. They are just the acute and grave accents and are part of any standard font. But the 1st and 3rd tones, and tone marks over the u with umlaut, are harder. The 1st tone mark is called the macron and the 3rd tone mark the caron.
Warning: Do not try to use the breve (ă), an a with a round upside-down hat, for the third tone; it doesn’t look right (the upside-down hat should be pointed) and the character is not part of MS Song. So if you are using Netscape with MS Song as your Unicode font, you will not see it.
I will demonstrate two methods for inputting pinyin.
- My first method is to use Unicode character codes, either numeric character references or named character entities. This is not as hard as it seems. If you write a lot of pinyin and use a smart editor, you can define keyboard macros, or write things like “zha1ng” and then do a replace at the end. If your editor is smart, you can even define a script that does this for all the combinations.
Or you can simply use the wonderful Pinyin to Unicode Converter at Konrad Mitchell Lawson’s The Fool’s Workshop. You just input “zhong1guo2 shi4 shi4jie4 zui4 hao3 de guo2jia1” and out comes “zhōngguó shì shìjiè zuì hǎo de guójiā” in both Unicode and character codes.
Warning: Some older browsers have trouble with hexadecimal numeric character references, so it may be safest to use decimal.
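Latin-1 Supplement – Unicode U+0080 – U+00FF – (128-255)
á = &aacute; = &#225; = &#xE1;
à = &agrave; = &#224; = &#xE0;
é = &eacute; = &#233; = &#xE9;
è = &egrave; = &#232; = &#xE8;
í = &iacute; = &#237; = &#xED;
ì = &igrave; = &#236; = &#xEC;
ó = &oacute; = &#243; = &#xF3;
ò = &ograve; = &#242; = &#xF2;
ú = &uacute; = &#250; = &#xFA;
ù = &ugrave; = &#249; = &#xF9;
ü = &uuml; = &#252; = &#xFC;
Subtract 32 from the decimal code for upper case.
Latin Extended-A – Unicode U+0100 – U+017F – (256-383)
ā = &#257; = &#x101;
ē = &#275; = &#x113;
ě = &#283; = &#x11B;
ī = &#299; = &#x12B;
ō = &#333; = &#x14D;
ū = &#363; = &#x16B;
Subtract 1 from the decimal code for upper case.
Latin Extended-B – Unicode U+0180 – U+024F – (384-591)
ǎ = &#462; = &#x1CE;
ǐ = &#464; = &#x1D0;
ǒ = &#466; = &#x1D2;
ǔ = &#468; = &#x1D4;
ǖ = &#470; = &#x1D6;
ǘ = &#472; = &#x1D8;
ǚ = &#474; = &#x1DA;
ǜ = &#476; = &#x1DC;
Subtract 1 from the decimal code for upper case.
Notice that e with 3rd tone (caron) is part of Latin Extended-A, while the other 3rd tones are part of Latin Extended-B.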
- The second method is to use Microsoft Word, or another Unicode-enabled editor or word processor. For the 2nd and 4th tones I use the acute and grave accents that are part of any standard font. They are easy to insert using keyboard shortcuts. (You may want to set your keyboard to US International.) The 1st tone mark is called the macron and the 3rd tone mark the caron. (Do not try to use the breve for the third tone.) They are available in most Unicode fonts, and I get them by using MS Word and Insert – Symbol. To simplify, I assign keystrokes to the 1st and 3rd tones and the tones on u with umlaut. I use F3 and a for ǎ, and F2 and y for ǘ. I can’t use F1 for the first tones, since that calls up help, so I use F5 instead. I then save the doc file as “Extended Text”, select UTF-8 (not just Unicode) and paste into my text editor.
ā á ǎ à a
ē é ě è e
ī í ǐ ì i
ō ó ǒ ò o
ū ú ǔ ù u
ǖ ǘ ǚ ǜ ü
Warning: At one stage while testing this out, I created a file called pinyin.txt and opened it in IE. I couldn’t see the Extended-B characters, but when I renamed the file pinyin.html, everything came out all right (after I manually switched to Unicode encoding in the browser). The reason for this is that when displaying a *.txt file, IE will use a fixed-width font like Courier New, and at the time I was using version 2.50 of Courier New, which did not have the Latin Extended-B characters.
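The "write the tone number, then replace at the end" idea from the first method can be sketched as a short Python script. This is only an illustration, not the converter mentioned above: the function names numbered_to_marks and to_decimal_refs are made up for this example, the tone digit is assumed to follow the whole syllable (as in “zhong1guo2”), the letter v is assumed to stand in for ü, and only lower-case vowels are handled.

```python
import re

# Tone-marked forms for each vowel, in tone order 1, 2, 3, 4.
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}

def mark_index(syllable):
    """Return the index of the vowel that carries the tone mark, or -1."""
    # Usual placement rule: a or e takes the mark if present,
    # then the o of "ou", otherwise the last vowel of the syllable.
    for vowel in "ae":
        if vowel in syllable:
            return syllable.index(vowel)
    if "ou" in syllable:
        return syllable.index("o")
    for i in range(len(syllable) - 1, -1, -1):
        if syllable[i] in TONE_MARKS:
            return i
    return -1

def _replace(match):
    # Assumption: "v" is typed in place of "ü".
    syllable = match.group(1).replace("v", "ü")
    tone = int(match.group(2))
    if tone == 5:  # neutral tone: just drop the digit
        return syllable
    i = mark_index(syllable)
    if i < 0:  # no vowel found, leave the text untouched
        return match.group(0)
    marked = TONE_MARKS[syllable[i]][tone - 1]
    return syllable[:i] + marked + syllable[i + 1:]

def numbered_to_marks(text):
    """Convert numbered pinyin such as 'hao3' to tone-marked pinyin."""
    return re.sub(r"([a-zü]+)([1-5])", _replace, text)

def to_decimal_refs(text):
    """Replace every non-ASCII character with a decimal character reference."""
    return "".join(c if ord(c) < 128 else f"&#{ord(c)};" for c in text)

if __name__ == "__main__":
    marked = numbered_to_marks("zhong1guo2 shi4 shi4jie4 zui4 hao3 de guo2jia1")
    print(marked)
    print(to_decimal_refs(marked))
```

Decimal references are used here because of the warning above about older browsers and hexadecimal references.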
You can also use a character map, like Character Agent, ListFont or International Character Code Map.
You may also find it convenient to use some of the keyboard utilities listed on Alan Wood’s Unicode Resources.
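If no character map is at hand, a few lines of Python can serve as a rough stand-in. The sketch below lists each pinyin vowel with its decimal and hexadecimal character reference and its official Unicode name, and shows how far the lower-case letter sits from its upper-case form. The names describe and upper_offset are made up for this example; only the standard unicodedata module is used.

```python
import unicodedata

# The tone-marked vowels used in pinyin, plus plain ü.
PINYIN_VOWELS = "āáǎàēéěèīíǐìōóǒòūúǔùǖǘǚǜü"

def describe(ch):
    """Return one line: character, decimal ref, hex ref, code point, name."""
    cp = ord(ch)
    return f"{ch}  &#{cp};  &#x{cp:X};  U+{cp:04X}  {unicodedata.name(ch)}"

def upper_offset(ch):
    """Return how much to subtract from the code to get the upper-case letter."""
    return ord(ch) - ord(ch.upper())

for ch in PINYIN_VOWELS:
    print(describe(ch), f"(upper case: subtract {upper_offset(ch)})")
```

The offsets it prints should agree with the notes under the tables: 32 for the Latin-1 Supplement letters and 1 for the Latin Extended-A and Extended-B letters.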
(Source: www.math.nus.edu)




