Thursday, May 15, 2008

Online Texts in Hanyu Pinyin

See this blog post on The webmaster there wrote that post in response to an inquiry from me asking if he had a list of texts that were written solely in Pinyin, not in Chinese characters. That, in turn, was prompted by an argument that I had with one of my teachers and some of my classmates about the necessity of Chinese characters.

It's an argument I've had before. As much as I'm fascinated by them, and as much as I'm proud of how many I've learned and know how to write (last count: just over 2800), I hate them. I think it's a ridiculous, inefficient, and burdensome writing system. So, occasionally I'll get into a discussion where I'll suggest that they should be completely scrapped in favor of Pinyin, which is the phonetic writing system based on the Roman alphabet. Whenever I suggest this to a Chinese person, or even a foreigner who's studied Chinese for any amount of time, the response is amused skepticism, to put it mildly. 不可能! (Bù kěnéng! No way!)

No one here has even considered the possibility, and when they first consider it, it seems patently absurd. The problem as they see it is that the written characters contain much more information than just the pronunciation. In a nutshell, for every syllable in Mandarin Chinese, there are many possible characters. What they fail to take into account, though, is that in standardized Pinyin, capitalization and word breaks are also brought into play, and have a powerful effect on reducing ambiguity.

What it boils down to is this: if a listener could understand something read in Mandarin (without seeing it written) then he or she would be able to understand the same thing written in the phonetic alphabet of Hanyu Pinyin. Note that this doesn't apply to obscure texts deliberately obfuscated, or to ancient Chinese, but it does apply to modern-day texts written in the vernacular.

I wish I had time to titivate (Merriam-Webster's word of the day today!) this post with lots of links to bolster some of my claims above, but I don't.

Update: I added a tag in-pinyin to track these babies.


  1. I believe this works for you, the phonetic learner. I'm most definitely the visual one and it reflects in a language too -- I hardly ever "hear aloud" things when I'm reading and well, reading 拼音 is a pain for me (no matter how formatted it is).

    I wonder though, whether the majority of Chinese is like you or me ;-) My bet would be, being raised in the environment, pretty much visual.

  2. Hi, Jan, thanks for the comment. I'm not sure why you would assume I'm a "phonetic learner" when it comes to reading. I guess I'm not even sure what the difference is when it comes to learning to read Chinese.

    My intuition is that you don't "hear aloud" things when you read simply because you read much more fluently than I do. Also, that reading Pinyin is a pain for you because you're not used to it.

    That's a point that I didn't adequately drive home in my post. Yes, of course it's hard to read these Pinyin articles when you're first exposed to them, but that skill would improve over time, with practice. Whether it would improve to the point where one could read Pinyin as fast as characters, I don't know. But, the fact that it's difficult at first is not really a good argument against Pinyin as a script.

  3. @klortho, I'm glad you asked about pinyin texts on The world needs a lot more pinyin texts precisely because of what you noted in the comment above -- that you only become a "fluent reader" through practice. I'm going to start referring people that direction.

    @jan, a commenter at Beijing Sounds responded with a similar note about "visual learning" a while back when I was ranting against characters. In my calmer moments I'd agree that some people do pretty well by the characters. But I think they'd do even better by pinyin, if that was the script we generally used to write Mandarin. Even you, a visual learner, would write things in pinyin and memorize them that way -- at least that's my guess.

    BTW, there's some interesting research that's been done on native speakers reading 汉字. It shows pretty conclusively that readers access meaning through sound when reading 汉字 just as they would with any other script. So 汉字 are just another sound-based script like every other one (read DeFrancis's book: Chinese Language Fact and Fantasy for more on this). The only difference with pinyin is that it's a thousand orders of magnitude more complicated!

  4. Oh, that's one interesting article on the Beijing Sounds indeed!

    Sounds like writing system is a big obstacle for many... Well, maybe it's universal. As I think about it, few languages I know have good writing system, English included (if you're all natives, I'd love to tell you English writing and reading is a pain! Reform it ASAP, please ;-))

    As for Chinese (or any other living language), I bet the spoken form is more important and thus, pinyin... Some statistics say that 95% how people use the language is speak, the remaining 5% is read and write. Pretty clear then.

    Anyway, good luck with your readings!

  5. @syz: Your comment is very interesting, and I'm not actually surprised to see that Chinese people "access meaning through sound" when they read. It's in line with the link that you just sent me to The Ideographic Myth. DeFrancis says that, indeed, the Chinese character writing system is a phonetic script.

    @Jan: I know the English writing system is screwed up!! My spelling, I think, is better than average, and yet I make plenty of mistakes. I often find myself in front of a class full of students scratching my head at a word I've just written. I know it doesn't look right, but I'm damned if I know how to fix it.

  6. For foreigners I believe using pinyin to learn Chinese at first is the right thing to do, otherwise progress is painfully slow. But then we should learn the characters too. The methodology used for teaching characters to adult foreigners is totally inappropriate and makes learning them more difficult. I believe a more systematic approach to learning characters is needed and I am developing one.

  7. @Chris
    That sounds great -- I'd like to hear more about what you're developing. Have you seen Anki? There are a few others like it -- based loosely on Supermemo, and with an emphasis on using mnemonics to aid in memorization. I think there's a book, something about memorizing 2000 kanji in a short time, but I can't seem to find a link right now.

  8. I've just started some blogs:

    My methods are based on character maps which I have developed. You can follow the link to see a sample chart.
    There are arrows on the originals and more characters but it should give you an idea of what I'm doing. It's a bit like what has done but in 2-dimensions instead of one to make the characters easier to find. Any information about the other methods you know about would be very welcome.

  9. You guys are just silly. Let me introduce myself. I'm croatian. Croatian is one of rare languages that has PURELY PHONETICAL writing system, you can learn yourself read and write croatian in day or two. Of course you won't understand what are you reading, but i can assure you you'll read it and write correctly. As for chinese characters and pinyin... Well, i have been learning english since i was 10 years old, and chinese since i was 20 (i'm 30 now)... i understand some 2000 chinese characters, and i think it's silly idea to abandon chinese characters in favor of pinyin. WHY? Because there's no much difference in learning to write english, french or chinese. Chinese characters are even a bit simpler to learn because it's not designed to be phonetical script, unlike roman letters which are practically abused by french and anglophone nations. You'd like of chinese to abandon their writing. Well, why french and english don't abandon their present writing system and adopt phonetical writing system like we have here in Croatia and other parts of Balkan peninsula? Or... why should we use latin letters at all?! Croatians developed phonetical writing using arabian script some 300 years ago when turkish empire dominated in this part of europe. See? Or why don't anglophone countries use phonetical script developed by G.B. Shaw?
    You's stupid idea. There's nothing wrong with chinese characters. At least, you can take some text written 1000 years ago and understand it without much trouble. And if i take some text written in croatia 1000 years ago.... i barely understand every second word.
    If you hate HanZi...why you learned chinese at all?

  10. Croat, the whole point of switching to pinyin (or to any alphabetic writing system) is to allow real literacy for all the masses that, unlike you or me, don't have the leisure to dedicate years to learning characters.

    I'd ask you what's better: the 8 year old who can write a diary or notes with all the SPOKEN words she knows, using pinyin, or the 8 year old who will have to make do with the 200 or so characters (which is the vocabulary a TWO year old would handle) she has learnt?

    As for the reading of classical texts, no one can read those without additional training on vocabulary and grammar, and so it's an activity for an already very educated elite. Where's the advantage in that for people who cannot even write or read all they can say in their own modern tongue, much less understand classical versions of it?

    Leave the characters for the elites that have the time to master them, and give pinyin to all the rest who could benefit from being able to read and write with just a year or so of training in an alphabetical writing.

