Learning the Teochew (Chaozhou) Dialect

Lately I’ve been learning my girlfriend’s dialect of Chinese, called the Teochew dialect. Teochew is spoken by about 15 million people in the eastern part of Guangdong province, including the cities of Chaozhou, Shantou, and Jieyang. It is part of the Min Nan (闽南) branch of Chinese languages.


Above: Map of major dialect groups of Chinese, with Teochew circled. Teochew is part of the Min branch of Chinese. Source: Wikipedia.

Although the different varieties of Chinese are usually referred to as “dialects”, linguists consider them different languages, as they are not mutually intelligible. Teochew is not intelligible to either Mandarin or Cantonese speakers. Teochew and Mandarin diverged about 2000 years ago, so today they are about as similar as French is to Portuguese. Interestingly, linguists claim that Teochew is one of the most conservative Chinese dialects, preserving many archaic words and features from Old Chinese.

Above: Sample of Teochew speech from entrepreneur Li Ka-shing.

Since I like learning languages, naturally I started learning my girlfriend’s native tongue soon after we started dating. It helped that I spoke Mandarin, but Teochew is not close enough to simply pick up by osmosis; it still requires deliberate study. Compared to other languages I’ve learned, Teochew is challenging because very few people try to learn it as a foreign language, so there are few language-learning resources for it.

Writing System

The first hurdle is that Teochew is primarily spoken, not written, and does not have a standard writing system. This is the case with most Chinese dialects. Almost all Teochews are bilingual in Standard Chinese, which they are taught in school to read and write.

Sometimes people try to write Teochew using Chinese characters by finding the equivalent Standard Chinese cognates, but there are many dialectal words which don’t have any Mandarin equivalent. In these cases, you can invent new characters or substitute similar-sounding characters, but there’s no standard way of doing this.

Still, I needed a way to write Teochew, to take notes on new vocabulary and grammar. At first, I used IPA, but as I became more familiar with the language, I devised my own romanization system that captured the sound differences.

Cognates with Mandarin

Note (Jul 2020): People in the comments have pointed out that some of these examples are incorrect. I’ll keep this section the way it is because I think the high-level point still stands, but these are not great examples.

Knowing Mandarin was very helpful for learning Teochew, since there are lots of cognates. Some cognates are obviously recognizable:

  • Teochew: kai shim, happy. Cognate to Mandarin: kai xin, 开心.
  • Teochew: ing ui, because. Cognate to Mandarin: ying wei, 因为

Some words have cognates in Mandarin, but mean something slightly different, or aren’t commonly used:

  • Teochew: ou, black. Cognate to Mandarin: wu, 乌 (dark). The usual Mandarin word is hei, 黑 (black).
  • Teochew: dze, book. Cognate to Mandarin: ce, 册 (booklet). The usual Mandarin word is shu, 书 (book).

Sometimes, a word has a cognate in Mandarin, but sounds quite different due to centuries of sound change:

  • Teochew: hak hau, school. Cognate to Mandarin: xue xiao, 学校.
  • Teochew: de, pig. Cognate to Mandarin: zhu, 猪.
  • Teochew: dung, center. Cognate to Mandarin: zhong, 中.

In the last two examples, we see a fairly common sound change, where a dental stop initial (d- and t-) in Teochew corresponds to an affricate (zh- or ch-) in Mandarin. It’s not usually enough to guess the word, but serves as a useful memory aid.
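As a rough illustration, this correspondence can be written as a simple check. This is only a sketch: the function name is made up for this example, the spellings follow the informal romanization used in this post, and a plain prefix test is an assumption that will misfire on spellings such as dz-, which is not a plain stop.

```python
# Initials taken from the pattern described above.
TEOCHEW_STOP_INITIALS = ("d", "t")
MANDARIN_AFFRICATE_INITIALS = ("zh", "ch")


def fits_stop_affricate_pattern(teochew: str, mandarin: str) -> bool:
    """Return True if the pair looks like a Teochew d-/t- word
    matched with a Mandarin zh-/ch- word (hypothetical helper)."""
    return (
        teochew.lower().startswith(TEOCHEW_STOP_INITIALS)
        and mandarin.lower().startswith(MANDARIN_AFFRICATE_INITIALS)
    )


# Word pairs from the examples in this post.
for teochew, mandarin in [("de", "zhu"), ("dung", "zhong"), ("hak", "xue")]:
    print(teochew, mandarin, fits_stop_affricate_pattern(teochew, mandarin))
```

A match only says that a pair is consistent with the pattern, not that the two words are actually cognate.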

Finally, a lot of dialectal Teochew words (I’d estimate about 30%) don’t have any recognizable cognate in Mandarin. Examples:

  • da bo: man
  • no gya: child
  • ge lai: home

Grammatical Differences

Generally, I found Teochew grammar to be fairly similar to Mandarin, with only minor differences. Most grammatical constructions can transfer cognate by cognate and still make sense in the other language.

One significant difference in Teochew is the many fused negation markers. Here, a syllable starts with the initial b- or m- joined with a final to negate something. Some examples:

  • bo: not have
  • boi: will not
  • bue: not yet
  • mm: not
  • mai: not want
  • ming: not have to

Phonology and Tone Sandhi

The sound structure of Teochew is not too different from Mandarin, and I didn’t find it difficult to pronounce. The biggest difference is that syllables may end with a stop (-t, -k, or -p) or with the nasal -m, whereas Mandarin syllables can only end with a vowel or the nasals -n and -ng. A characteristic feature of a Teochew accent in Mandarin is replacing /f/ with /h/, since there is no /f/ sound in Teochew.

The hardest part of learning Teochew for me was the tones. Teochew has either six or eight tones depending on how you count them, and they aren’t difficult to produce in isolation. However, Teochew has a complex system of tone sandhi rules, where the tone of each syllable changes depending on the tone of the following syllable. Mandarin has tone sandhi to some extent (for example, the third tone sandhi rule where nǐ + hǎo is pronounced níhǎo rather than nǐhǎo). But Teochew takes this to a whole new level, where nearly every syllable undergoes contextual tone change.

Some examples (the numbers are Chao tone numerals, with 1 meaning lowest and 5 meaning highest tone):

  • gu5: cow
  • gu1 nek5: beef

Another example, where a falling tone changes to a rising tone:

  • seng52: to play
  • seng35 iu3 hi1: to play a game
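The notation in these examples can be unpacked mechanically. The sketch below is illustrative only: the function names are made up for this example, and it handles just the one- and two-digit Chao numerals that appear in this post. It splits each syllable from its tone digits and labels the contour as level, rising, or falling.

```python
import re
from typing import List, Tuple


def parse_syllable(token: str) -> Tuple[str, List[int]]:
    """Split a token such as 'seng52' into ('seng', [5, 2])."""
    match = re.fullmatch(r"([a-z]+)([1-5]{1,2})", token)
    if match is None:
        raise ValueError(f"not a syllable with Chao tone digits: {token!r}")
    letters, digits = match.groups()
    return letters, [int(d) for d in digits]


def contour(levels: List[int]) -> str:
    """Compare the starting and ending pitch (1 = lowest, 5 = highest)."""
    start, end = levels[0], levels[-1]
    if end > start:
        return "rising"
    if end < start:
        return "falling"
    return "level"


def describe(phrase: str) -> List[Tuple[str, str]]:
    """Label the contour of every syllable in a space-separated phrase."""
    return [(syl, contour(tone)) for syl, tone in map(parse_syllable, phrase.split())]


print(describe("seng52"))          # [('seng', 'falling')]
print(describe("seng35 iu3 hi1"))  # [('seng', 'rising'), ('iu', 'level'), ('hi', 'level')]
```

This only reads the notation; it does not predict which tone a syllable takes in context.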

There are tables of tone sandhi rules describing in detail which tone each tone changes into, but this process is not entirely regular and there are exceptions. As a result, I frequently get the tone wrong.

Update: In this blog post, I explore Teochew tone sandhi in more detail.

Resources for Learning Teochew

Teochew is seldom studied as a foreign language, so there aren’t many language learning resources for it. Even dictionaries are hard to find. One helpful dictionary is Wiktionary, which has the Teochew pronunciation for most Chinese characters.

Also helpful were formal linguistic grammars:

  1. Xu, Huiling. “Aspects of Chaoshan grammar: A synchronic description of the Jieyang dialect.” Monograph Series Journal of Chinese Linguistics 22 (2007).
  2. Yeo, Pamela Yu Hui. “A sketch grammar of Singapore Teochew.” (2011).

The first is a massively detailed, 300-page description of Teochew grammar, while the second is a shorter grammar sketch of a similar variety spoken in Singapore. They require some linguistics background to read. Of course, the best resource is my girlfriend, a native speaker of Teochew.

Visiting the Chaoshan Region

After practicing my Teochew for a few months with my girlfriend, we paid a visit to her hometown and relatives in the Chaoshan region. More specifically, we went to Raoping County, located on the border between Guangdong and Fujian provinces.


Left: Chaoshan railway station, China. Right: Me learning the Gongfu tea ceremony, an essential aspect of Teochew culture.

Teochew people are traditional and family-oriented, very much unlike the individualistic Western values that I’m used to. In Raoping and Guangzhou, we attended large family gatherings in the afternoon, chatting and gossiping while drinking tea. Although they are still Han Chinese, the Teochew consider themselves a distinct subgroup within Chinese, with their own unique culture and language. The Teochew are especially proud of their language, which they consider to be extremely hard for outsiders to learn. Essentially, speaking Teochew is what separates “ga gi nang” (roughly translated as “our people”) from the countless other Chinese.

My Teochew is not great. Sometimes I struggle to get the tones right and make myself understood. But at a large family gathering, a relative asked me why I was learning Teochew, and I was able to reply, albeit with a Mandarin accent: “I want to learn Teochew so that I can be part of your family”.


Above: Me, Elaine, and her grandfather, on a quiet early morning excursion to visit the sea. Raoping County, Guangdong Province, China.

Thanks to my girlfriend Elaine Ye for helping me write this post. Elaine is fluent in Teochew, Mandarin, Cantonese, and English.

13 thoughts on “Learning the Teochew (Chaozhou) Dialect”

  1. very interesting 🙂

    I stumbled across this post looking for Teochew grammar resources, but as you noticed, there aren’t many true resources out there on the internet. I only have some exposure from my parents. It would be nice to document the grammar the way many sites have for Mandarin, but I lack a strong enough grasp to do so. It’s also difficult because there are many stylistic differences between Indonesian, Malaysian, Singaporean, Vietnamese, Cambodian and other variations among Teochew speakers.


    1. Very true. Even within Indonesia, there are differences. In West Kalimantan there’s noticeable overlap between Teochew and Hakka, whereas in the Riau Islands or Jambi there tends to be more overlap with Hokkien.

      Similarly, I only have exposure to Teochew through my family. We’ve visited West Kalimantan before and spoke Teochew with the locals. However, being native speakers doesn’t translate to being able to explain the language in depth. We would need both native speakers and researchers like this post’s author to work together to make Teochew learning resources.

  2. It is great that you are learning Teochew.

    I point out, however, that the Teochew word for 書, ze1 (using the Peng’im system published by Guangdong Provincial Education Department), is cognate with the Mandarin shu1. 冊 is a different word. It would be cêh4 (using the Peng’im system again). 冊 is commonly used in Hokkien in place of 書. However, mainstream Teochew usage is for 書 instead of 冊.

    If you want to learn Teochew, I suggest that you first familiarize yourself with two romanization systems: (a) the Peng’im system mentioned above, and (b) the Teochew version of the Pe̍h-ōe-jī system. You can also look up books on the Teochew language as written by Ms. Adele Fielde. They should be available on Archive.org. There are many other old Teochew language resources on Archive.org.

    Compared with most Chinese languages, there is in fact a wealth of resources for learning and studying Teochew, some of which can be accessed for free online, and some of which can be purchased in bookstores (particularly in Hong Kong or Guangdong). You can also access The Teochew Store online (not affiliated).


    1. You’re right that ze1 is cognate with 書 and not 冊 — my bad!

      I don’t think it matters much if you learn Peng-im or Peh-oe-ji or make up your own because neither is commonly used by Teochew people. Teochew is seldom written at all; in the few times it’s written, they use Chinese characters, with non-standardized characters for dialectal words (obviously not ideal for a language learner). So use whatever feels most intuitive to you.

      I’ve seen the books by Adele Fielde but have not looked at them in detail. One thing to keep in mind is that they were written over 100 years ago, which is enough time for a language to change significantly. Teochew is known to be a rather conservative language, so maybe it hasn’t changed that much; I don’t know whether this is the case.


      1. It is true that few Teochew people know either Peng’im or Peh-oe-ji. However, those few Teochew people, who care about their language (e.g. the people on http://www.ispeakmin.com), are usually familiar with both.

        Much of the Bible has also been translated into Teochew by way of Peh-oe-ji. I am not Christian. I do not believe in the Bible. However, I enjoyed the opportunity to read Teochew-language material in Peh-oe-ji.

        It is true that Adele Fielde wrote her books more than 100 years ago. However, I found her books to be quite useful in my personal Teochew-learning journey. First, Teochew has not changed that much. The main change is the merger of endings, e.g. -n and -ng, and -t and -k. Some Teochew varieties have not merged those endings. Arguably, even if we merge those endings in speech, (bowing to popular pressure), we should maintain their differences when writing in Peh-oe-ji.

        Additionally, in learning Teochew, the main difficulty is vocabulary. Certainly, we must also master phonology. However, phonology is a given. It is the first barrier, which we can and should pass through as soon as possible. Indeed, after a few days of practice, the standard pattern of Teochew sandhi should be second nature. For someone with minimal understanding of IPA, Teochew phonology is not difficult. After having passed through this first barrier, the main difficulty will be vocabulary – e.g. lexical differences between Teochew and Mandarin, Cantonese or Hokkien (depending on our respective personal linguistic backgrounds).

        Lexical differences are why Hokkien speakers cannot readily understand Teochew, even though their respective phonologies are so similar. It is also why Cantonese and Mandarin speakers cannot readily understand Teochew. The advantage of Adele Fielde’s materials is that she teaches “real Teochew”, e.g. all those words that do not exist in Cantonese or Mandarin. It is my belief that a person, who has diligently worked his or her way through Adele Fielde’s materials, will speak better Teochew than most native Teochew speakers. He or she will have a vocabulary that is broader and deeper than most native Teochew speakers today. He or she will also be able to explain with great clarity the hidden logic that underlies the language, in a way that a native Teochew speaker, who has not reflected upon his or her language, cannot.

        Another great resource for learning Teochew is stories told by 林江 and 陳四文 in Teochew. These are readily found on Youtube and other video-hosting websites.


    2. Thanks, some of these resources I wasn’t aware of before, I’ll look into them. I’m also working on natural language processing in Teochew, but for this I need computer-readable data. Anything structured (dictionaries, parallel text, etc) would be useful — do you know of anything that’s available? I’m also curious: what’s your background and what made you interested in studying Teochew?


      1. Mogher.com hosts a Teochew dictionary. However, I find that it has certain issues. First, if I recall correctly, Mogher initially used a Peh-oe-ji-derived system and subsequently changed to a Peng-im-derived system. I am not sure what prompted the change. Second, like most Teochew dictionaries, Mogher is designed for finding out the Teochew pronunciation of Chinese characters. This is useful, if you are reading a newspaper article in Teochew, and you need to find out how to pronounce certain characters in Teochew. What is more important for studying Teochew, however, is a dictionary that sets out and explains the Teochew lexicon (e.g. colloquial Teochew words). There are two main sources for this kind of dictionary: (1) Adele Fielde’s dictionary and others like it (e.g. written by westerners in the 19th century), and (2) certain Chinese-language publications in the 20th and 21st centuries. Those Chinese-language publications are harder to get. You have to come by them in bookstores. As soon as you see one, you have to buy it immediately. Otherwise, you might not see it again.

        There are also cell-phone apps for Teochew.

        I am of Teochew descent. When I was a kid (in Hong Kong), I wasn’t taught Teochew. As I grew up, I felt that I had to learn Teochew. The good thing about Teochew is that there are many Teochew people, who are quite enthusiastic about our language. I am sure you can find plenty of help online, if you look around (starting with the people on http://www.ispeakmin.com and The Teochew Store).

        Good luck on your natural language processing project.


  3. Thanks so much for this blog post! It’s refreshing to see someone write about a Chinese topolect that isn’t Mandarin or Cantonese (and specifically is Teochew)! 🙂
    Just a few notes though, 1) my own family is also originally from Raoping, but by way of Cambodia, so I would say that your comment about “almost all Teochews are bilingual in Standard Chinese,” is inaccurate for Chinese Southeast Asians. I would actually say that due to my personal networks being what they are, the Teochew folks I know are 90% Teochew from Cambodia, Vietnam, Laos, Thailand, Indonesia. Of that 90%, over half do not speak Standard Chinese/Mandarin at all. Side note that if, from a linguistic perspective, you’re interested in exploring the “flavours” of Teochew spoken in Southeast Asia, many differences are not just stylistic, but actually lexical in nature.
    2) While I’m willing to accept the probability of dialectal Teochew words that don’t have Mandarin cognates, I’m not sure that you’ve actually provided any evidence of that. And I’m not sure that there are truly as many as you seem to think. For example, the word for “man,” is actually simply the Teochew pronunciation of 大夫, which is actually still used in Modern Mandarin, although in a much narrower sense. And the word for “home” is just the Teochew pronunciation of 家內, which may be a strange noun construction for Mandarin speakers, but that points to more grammatical differences than you suggested rather than not having a Mandarin cognate at all. And the word for “child” is 孥囝, and while I will admit that I’m not sure where the character 囝 comes from, 孥 is just archaic Chinese. Here it is used in a line from Du Fu’s poem Qiang Village: 妻孥怪我在,惊定还拭泪
    Apologies if anything I’ve written is vague or inartfully rendered. I’m a biochemist by training, so my understanding of linguistics is, while enthusiastic, fractured at best. Again, thanks for this post; any contribution to the linguistics of Southern Min languages is very welcome!

    1. Cool, I didn’t know these words had Mandarin cognates. They’re not easily recognizable as cognates (at least to me). Having a good etymological dictionary would be useful to match up these difficult cognates, but I don’t know of any.


  4. 黑人 (heck nang) is cognate to Mandarin hei ren. In Teochew, 黑 may be read oul or heck depending on where it is placed. Blackboard (黑板), for example, is called heck pang and not oul pang. So in this case, 黑 (heck) in Teochew is cognate to hei in Mandarin.


  5. 我来自潮州 (wa lai si tio jiu) is a TV show in Hong Kong that depicts Chaozhou, Guangdong, China, the hometown of a man who overstayed in Hong Kong. It became the autobiography of Lee Ka Shing, who couldn’t speak Cantonese, only his hometown language, Teochew.


  6. Lee Ka Shing was depicted in I’m From Chaozhou, whose theme song 胜利双手创 (Winning in My Hands) is the Cantonese version of the Taiwanese Hokkien song Ai Pia Cia He Yia, popularized by Ye Qi Tian from Taiwan. It is sung by Johnny Yip Chun Tong, as well as by other Teochews from South and Southeast Asia, the Americas, Oceania, Europe, etc.


  7. Who told you that Teochew is not a written language? We have many characters dating as far back as the Tang and Song Dynasties. We have our own dictionaries [pls note several] and I have just ordered a Teochew dictionary from Amazon which is arriving on Monday, and that will be my 4th Teochew dictionary.
    I write Teochew songs in Teochew using our own characters. There are tonnes of Teochew learning videos to teach people how to speak Teochew, and if we do not have a written language, how do we write our song lyrics in Teochew? You should not give fake information on the Internet because some stupid people might believe in the rubbish you have written.

