The ABC Cantonese-English Comprehensive Dictionary

By Robert S. BAUER 包睿舜.

Copyright © 2017–2023 Wenlin Institute, Inc.

For print editions of dictionaries in the ABC Dictionary Series, please see UH Press, and

Please note: in the Wenlin Dictionaries Wiki, entries from this dictionary are all presented in the namespace Jyut, which stands for jyut6 jyu5 (粵語) ‘Cantonese’. Headwords are in Cantonese and glosses are in English.



My dedication of the ABC Cantonese-English Comprehensive Dictionary recognizes my two greatest inspirations for why I have written it:

First of all, I dedicate this dictionary to the cherished memory of my late, dear mother, Joan E. Bauer (1925–2018). With both profound sadness and hopeful gladness, I acknowledge my mother as the loving, compassionate, gentle, but strong-willed soul who was also my best friend. She always provided me with her positive support and inspiring encouragement, all of which I much appreciated. I miss her deeply and think about her every single day that passes by. She was well aware that I had been working on this dictionary for a number of years and had wholeheartedly urged me to keep working on it till the very end when I thought I had completed writing it and then to have it published.

Secondly, it is with great pleasure that I also dedicate this dictionary to the Cantonese-speaking people of Hong Kong whose fascinating, marvelous, mesmerizing, and distinctive 香港粵語 hoeng1 gong2 jyut6 jyu5 or 港語 gong2 jyu5 hit directly upon my ears for the first time on my first trip to Hong Kong during a spring break back in 1974. Because my ears had been accustomed to hearing mainly Mandarin and Taiwanese spoken around me back in Taiwan, where I had begun studying Cantonese, Hong Kong Cantonese felt amazingly, refreshingly, and happily different. It was during that early visit to Hong Kong that I realized that my study of Cantonese was opening another window through which to view, understand, and appreciate the Chinese language from a broader perspective. Since that time I have been profoundly inspired by witnessing first-hand here in Hong Kong the fervent love that Cantonese-speaking Hongkongers deeply feel for their very own Hong Kong Cantonese language. Furthermore, over the past few years I have observed how some of them sense that it may be endangered by Mandarin, and so they have come to increasingly cherish and support speaking and writing it.

My dual hope for this dictionary is that it not only serves as a concrete record that documents Cantonese, but it also becomes regarded as a practical, quotidian reference Cantonese-speakers can hold in their hands to help them celebrate speaking and writing their unique language.

May speaking and writing the Cantonese language thrive in Hong Kong and other Cantonese-speaking communities around the world till the end of time!


ji4 gaa1 gong2 gwong2 dung1 waa6/2 hai6 mai6 faan6 faat3 sin1

as said by the taxi-driver to the policeman in the dystopian Hong Kong movie “Ten Years” (2015)

Series Editor’s Preface

It is my great pleasure to contribute a preface to Robert S. Bauer’s ABC Cantonese-English Comprehensive Dictionary (ABC CECD). I have watched this stupendous lexicographical treasure grow from its inception to the monumental achievement that it is today. The genesis of this dictionary took place nearly two decades ago when I expressed to Bob my desire for an up-to-date, reliable Cantonese-English dictionary to which I could refer while learning this new (to me!) Sinitic language. Although Bob mentioned several existing dictionaries, we both agreed that none of them were quite suitable for my purposes so I gently suggested that Bob write one himself. Much to my gratification, he happily agreed to do so. Knowing Bob to be a doyen of Cantonese language studies, I was fully confident that he would do a thoroughly responsible job of compiling my dream dictionary. He has done that, and he has exceeded my expectations.

Bob began studying Cantonese in 1974, visited Hong Kong a number of times thereafter, and has lived in Hong Kong continuously since 1997. He started seriously collecting written Cantonese way back in 1980 while researching the language in Hong Kong for his Berkeley PhD, which he completed in 1982. After discovering that some Hong Kong magazines and newspapers were being written in Cantonese, Bob became interested in understanding and identifying the conventions on which written Cantonese is based. This identification process led to his landmark 1988 publication “Written Cantonese of Hong Kong” (Cahiers de Linguistique Asie Orientale 17.2: 245–293).

Over time Bob has published a stream of authoritative scholarly articles and books on diverse aspects of Cantonese, which is quite different from Mandarin (in terms of grammar, syntax, morphology, lexicon, etc.) and requires more than a thousand special characters. As Bob’s expertise in the language increased, so did his reputation in Hong Kong. He has become a sort of folk hero, a gweilo who is an authority on Cantonese language. He has appeared often on television and radio and is regularly interviewed in newspapers and magazines for his views on the nature and preservation of Hong Kong language. These public appearances are especially delightful because he gives them in Cantonese.

After Bob had been working on the dictionary for about ten years and it was clear that it was well on its way to becoming a reality, I arranged for its publication in the ABC Chinese Dictionary Series at the University of Hawai‘i Press. During the last five years, it was very hard to pry the dictionary out of Bob’s hands. He always wanted to collect one more specimen of current Cantonese, then satisfy himself that it was actually still in use and that he had its precise meaning. In addition, there were the enormous challenges of compiling and typesetting the dictionary in an acceptable format. Bob was stymied for a couple of years about how to deal with the software and programming, but the geniuses at Wenlin worked their magic: Despite the gargantuan scope and complexity of these tasks, the last stages of putting the dictionary together went quite smoothly.

Some of the most poignant memories I have of Bob, which I will forever bear in mind, are our meetings whenever I passed through Hong Kong, at a tea shop or, more often, our favorite Indian restaurant, to talk over dictionary matters and the state of Hong Kong Cantonese. Most of all, I will never forget Bob’s shirt pocket bulging with slips of paper, little notebooks, and pens, to record yet another live specimen of Cantonese in action, and Bob always at the ready with a smile on his face.

Victor H. Mair


This dictionary sprouted from an inspirational seed that was planted some years back (in 2002, to be exact) by none other than Victor Mair when he was a visiting professor teaching in the Chinese Department at the University of Hong Kong. One day Victor called me up, said he was studying Cantonese, and asked me to recommend a Cantonese-English dictionary that could help him learn the language. Well, his request was not at all a simple matter. At the time (and even now), the only such dictionary that came to mind and that was still in print was Sidney Lau’s A Practical Cantonese-English Dictionary (Hong Kong Government Printer, 1977). Since Victor seemed to urgently need a dictionary, I suggested this one to him, but with the caveat that it might not prove all that useful, as it omitted a good portion of the contemporary colloquial Cantonese vocabulary and was skewed toward formal, written Chinese. Upon hearing this, Victor replied: Given our lack of a suitably adequate Cantonese-English dictionary, the obvious, if not the only, thing to do is for you (that is, me) to write one! Now in truth, I had already recognized the crucial need for a comprehensive Cantonese-English dictionary, and so for some time I had been mulling over this very idea of authoring one. Right at that moment Victor’s words so resonated with me that I decided the time had come to act, and I took concrete steps to start a project writing and compiling the ABC Cantonese-English Comprehensive Dictionary. Here I offer my heartfelt thanks to Victor for inspiring me to start and then complete this dictionary, the writing of which, as it turned out, spanned a decade and a half.

In the following year I began the initial phase of writing this dictionary with the help of Mr. Cheng Siu Bong who assisted me through his computer input and English translation of lexical data into a Word document file in which the lexical entries were organized according to the “bands” format of the University of Hawai‘i Press. Here I recognize Siu Bong’s contribution in the early phase of this dictionary.

One special and invaluable feature of this dictionary for which I wanted some expert opinion is its inclusion of example sentences which are intended to indicate how lexical items are used in everyday, colloquial Cantonese. As I am not a native speaker of Cantonese, I thought I could benefit from feedback provided by some people who are native speakers; and so I asked a few willing friends to take a look at the Cantonese sentences which I had composed and then tell me how to improve them to make them as natural-sounding as possible. In this regard, several individuals shared their native-speakers’ knowledge of the Cantonese language with me: some years ago Mr. Alan Mok Hing Lun, a former MA linguistics student of mine, very kindly allowed me to send him my example sentences which he then carefully checked and corrected to make them sound authentic. Dr. Zeng Zifan also checked some of my sentences and made them sound naturally spontaneous. He and I occasionally met at the City University of Hong Kong where he was teaching Putonghua, and he would generously share his impressive knowledge of and insights into the Cantonese language with me. Lastly, Ms. Jin Chow, who was a high school student in Hong Kong at the time she was helping me (but later became the valedictorian of one of America’s top-ranked universities), also checked some of my Cantonese sentences and sent me her candid opinions and suggested corrections for them. On further reflection, I think the people whose help I have acknowledged here were not just native speakers of Cantonese, but they were also especially interested in their own language, and some had also pursued specialized linguistic knowledge through advanced study for higher degrees. Their precious insiders’ insights into Cantonese have made this dictionary better than it would have been otherwise. 
I have very much appreciated all the help Siu Bong, Alan, Zifan, and Jin have so kindly given to me, and I now take this opportunity to warmly thank them.

In early 2017 I was nominated for a Spirit of Hong Kong Award for writing this Cantonese-English dictionary; then on September 29, 2017, at the formal ceremony and banquet, I felt greatly honored (and delightfully surprised) to have the Spirit of Hong Kong Award for Cultural Preservation bestowed upon me in recognition of my dictionary’s contribution to the preservation of Hong Kong’s Cantonese language. I hereby express my gratitude for this Award, which was presented to me by the South China Morning Post, Hong Kong’s oldest English-language newspaper (established in 1903), and Sino Group, one of Hong Kong’s leading property developers.

Lastly, I take this opportunity to thank sincerely and applaud wholeheartedly the following people who performed those pivotal tasks that advanced this dictionary along its path to publication: Mr. Tom Bishop of Wenlin, who has expertly accomplished the seemingly daunting, Herculean feat of computer-typesetting my enormous Word document (more than once), thus creating this most elegant-looking dictionary; Mr. Richard Cook, also of Wenlin, who devised some Cantonese characters and assiduously checked through and made corrections in the later stages of my writing this dictionary; Professor James E. Dew and Mr. Marc H. Miyake, whose painstaking proofreading of the dictionary caught and corrected typos, punctuation errors, and inconsistencies; and Ms. Stephanie Chun, acquisitions editor of the University of Hawai‘i Press, who diligently and deftly guided and coordinated this book through the myriad phases of proofreading and publication that have finally and successfully produced this print version of my dictionary.

Robert S. Bauer


Robert S. BAUER 包睿舜

What are the Special Features of the ABC Cantonese-English Comprehensive Dictionary?

The scope of this dictionary’s lexical entries reflects the broad range of the Hong Kong Cantonese lexicon by including many different kinds of words: nouns, verbs, stative verbs, coverbs, modal particles (or discourse markers), fixed expressions, idiomatic expressions, English loanwords, triad jargon, vulgar (taboo) words, etc. Among the special features associated with this dictionary’s lexical entries are the following nine items:

(1) Alphabetical ordering of head words. The head words of lexical entries are listed in strictly alphabetical order of their Cantonese pronunciations as romanized in the Jyutjyu Pingjam 粵語拼音 jyut6 jyu5 ping3 jam1 (or Jyutping 粵拼) system that was developed by the Linguistic Society of Hong Kong (2002). In general, this alphabetization principle has rarely been adopted for organizing Cantonese words in dictionaries (notable exceptions are Lee (2003) and Morrison (1828)); the more usual practice has been to list the lexical items under the first Chinese characters that occur in their head words. Since the tones of words can change depending on how they are used in various contexts, the romanization indicates the original tone of the morphosyllable (this is a general cover term that includes monosyllabic free and bound morphemes and semantically-unanalyzable syllables) followed by its so-called “changed tone” (變音 bin3 jam1); e.g., in the case of 相 soeng3/2 ‘photo, photograph’, the original tone is 3 (mid level), but it changes to tone 2 (high rising) when the syllable carries this particular meaning. For the reader who is not familiar with the Jyutping romanization system, a table at the end of this introduction presents the initial consonants, rimes, and tones with their corresponding IPA symbols.

(2) Written forms. The corresponding written forms of the head words are represented with standard Chinese characters, colloquial (or so-called dialectal, nonstandard) characters, and in some cases English letters. As will be explained below in the section on Written Cantonese, there are some colloquial Cantonese words and English loanwords that lack Chinese characters to write them with; the solution has been to write them either with their original English spelling but pronounced with Cantonese syllables or with English letters in a kind of ad hoc romanization of their pronunciations.

(3) Parts of speech. The parts of speech or syntactic categories to which the head words belong, such as noun, verb, stative verb, fixed expression, etc., are so indicated; in the case of nouns, their classifiers (or measure words) have also been included.

(4) Cross-referencing. The head words have been extensively cross-referenced to semantically-related lexical items, such as synonyms, to alert readers to these additional entries that may interest them but which they might not encounter except through the cross-referencing.

(5) Variant pronunciations and written forms. Some Cantonese words can have two or more variant pronunciations, and because written Cantonese has not been standardized, one word can have two or more written forms, and so all of these variant forms have been included in the lexical entry and indicated as such.

(6) Notes on usage and social status of words. To help explain the usage, register, social status, and meanings of head words, some entries have included concise notes, such as literal, figurative, colloquial, slang, specialist jargon (e.g., student, financial, triad, etc.), humorous, derogatory, impolite, offensive, obscene, vulgar, obsolete, old-fashioned, etc.

(7) Explanatory material on cultural, historical, and political associations of head words. Explanatory information on the cultural, historical, and political aspects of some head words has been included in the entry where this information is deemed particularly revealing, appropriate, and helpful for the reader’s better understanding of the word’s meaning and usage.

(8) English definition. The precise equivalent English definition of the Cantonese head word is stated.

(9) Example sentence. For many lexical entries at least one example sentence has been included in order to demonstrate how the head words are used in Cantonese.
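
For readers who process romanized head words by computer, the tone notation described in item (1) can be illustrated with a minimal sketch. It assumes that each syllable is written as lowercase letters plus an original tone digit, optionally followed by a slash and a changed tone digit, as in soeng3/2. The function names, the sample list (including soeng1), and the collation rule (letters first, original tones as a tie-breaker) are hypothetical and are not taken from the dictionary's own software; the dictionary's actual ordering rules may differ.

```python
import re

# Assumed token shape: lowercase letters, an original tone digit 1-6, and an
# optional "/" plus changed tone digit, as in "soeng3/2".
TOKEN = re.compile(r"([a-z]+)([1-6])(?:/([1-6]))?")


def parse_syllable(token):
    """Split one romanized syllable into (letters, original tone, changed tone or None)."""
    match = TOKEN.fullmatch(token)
    if match is None:
        raise ValueError(f"not a recognized syllable: {token!r}")
    letters, tone, changed = match.groups()
    return letters, int(tone), int(changed) if changed else None


def sort_key(headword):
    """Hypothetical collation key: syllable letters first, original tones as tie-breaker."""
    parsed = [parse_syllable(token) for token in headword.split()]
    return [p[0] for p in parsed], [p[1] for p in parsed]


# Illustrative head words only; most are romanizations quoted in this front matter.
headwords = ["soeng3/2", "jyut6 jyu5", "bin3 jam1", "gong2 jyu5", "soeng1"]
print(parse_syllable("soeng3/2"))
print(sorted(headwords, key=sort_key))
```

Keeping the changed tone as a separate field means that a lookup can match either the original or the changed pronunciation, while the ordering here depends only on the letters and the original tone.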

In addition, there are two Chinese-character indices: the first one has listed all the individual Chinese characters that occur in the dictionary in the alphabetical order of their romanized Cantonese pronunciations; and the second one has sorted all the Chinese characters according to their 214 traditional Kangxi radicals 康熙部首 and stroke counts. An index of characters arranged by their Mandarin pinyin forms with corresponding Cantonese romanizations may be included in a later edition of the dictionary.

Why Do We Need a Cantonese-English Comprehensive Dictionary?

In recent years my own personal experience of visiting Hong Kong bookstores and inquiring if a Cantonese-English dictionary was available for sale has invariably elicited the same answer from the store clerks, viz., they had no such dictionary. Some years ago I went over to the Hong Kong Government Publications Office and bought a copy of Sidney Lau’s A Practical Cantonese-English Dictionary (Hong Kong Government Printer), which was published in 1977 (it has never been revised, and the same first edition still seems to be available on the Sidney Lau website). I consult this stalwart among such dictionaries almost every day; and, as useful as it has been to me, it has some limitations despite its length of 1000 pages: most of its lexical entries are given over to the standard Chinese lexicon with a good portion of the colloquial vocabulary omitted; and, of course, it is now 40+ years old, so it is also quite out of date. For some time I have felt the obvious need for a full-scale, warts-and-all (i.e. with both the good words and the bad words, whatever Cantonese-speakers say) Cantonese-English dictionary, so a big part of my motivation in writing this ABC Cantonese-English Comprehensive Dictionary has been to try to fill in the dictionary-gap that currently exists for students of the Cantonese language. To the best of my knowledge, this is the first relatively comprehensive dictionary of its kind to be published in the past 40 years; it has attempted to document the Hong Kong Cantonese lexicon by including 16,000+ lexical entries which are uniquely and specifically Hong Kong Cantonese; this is to say that those words and expressions that also occur in standard Chinese with identical meanings and collocations have been excluded – unless they were originally Cantonese and later made their way into standard Chinese.

In terms of published reference works, such as dictionaries, grammars, glossaries, descriptive analyses, etc., my subjective impression based on 40+ years of study is that Cantonese has been the most intensively and voluminously documented regional Chinese variety after standard Chinese (Mandarin, Putonghua). However, as mentioned above, up till now if anyone were looking for a Cantonese-English dictionary they would do well to be able to find Sidney Lau’s A Practical Cantonese-English Dictionary. In my own experience of using this dictionary the main problem I have found with it is that much of the colloquial Cantonese lexicon is lacking. As it is now over 40 years old, it also does not reflect the many changes that have occurred in the lexicon over these past several decades.

This ABC Cantonese-English Comprehensive Dictionary is intended to satisfy the practical, concrete needs that at least two groups of people have for a Cantonese-English dictionary: first, English-speaking students who are learning Cantonese as an additional language (Bauer 2019; Bauer and Wakefield 2019; Wakefield 2019) and second, Cantonese-speakers who want to know what the English equivalents of Cantonese words are. Unquestionably, Putonghua/Mandarin is China’s unifying national language and its lingua franca, yet we still must recognize the inescapable fact that Cantonese is the predominant speech variety that is spoken either as the first language or an additional one by the 7.5 million residents of Hong Kong, one of the world’s leading financial and commercial centers, as well as by millions of other people in mainland China and overseas Chinese communities. Cantonese simply cannot and should not be ignored; as the world’s second major Chinese language, Cantonese merits our deep interest, commands our scholarly respect, and deserves our serious attention.

What is the Motivation for Writing ABC Cantonese-English Comprehensive Dictionary?

In addition, however, I should make clear that my writing of this dictionary has been motivated by another, relatively abstract issue, and so I should explain it here. Over the last few years, questions have been raised about what the future holds for the Hong Kong Cantonese language due to Hong Kong’s increasing use of Putonghua, not only as the principal medium of instruction in the majority of schools, but also as the major language in the broadcast media, entertainment and tourism industries, business, etc. Will Putonghua eventually replace Cantonese as the Hong Kong community’s primary Chinese language? I can quite understand that, given how well Cantonese currently fares, this possibility may seem remote, even far-fetched, and so unlikely. According to data released by the 2016 By-Census in early 2017, Cantonese predominates as the usual, daily language spoken by 89% of Hong Kong’s ethnic Chinese population which is currently about 7.5 million; if persons speaking Cantonese as another linguistic variety are also included, then the figure rises to 95%. Nonetheless, some people, myself included, do feel a genuine concern about the future of Cantonese in Hong Kong as it becomes increasingly mainlandized (i.e., more like mainland China): In 2015 when I saw printed on the ominously-black cover of a local Hong Kong magazine the provocative, even chilling question, The Death of Cantonese? (Tam and Cummins 2015), that did arch my eyebrows, and, of course, I was enticed to read the article inside. At any rate, in the meantime, this dictionary should go some way to help document the Hong Kong Cantonese lexicon, what many of its words and expressions mean in English, how their standard and variant pronunciations are romanized in the Jyutping system, and how they are transcribed in standard and non-standard Chinese characters. 
As for what the future may hold for the Hong Kong Cantonese language, I would like to hope this dictionary can contribute to supporting and promoting it in the years ahead.

At the turn of this century in observing how widespread has been the use of Hong Kong Cantonese in both its spoken and written forms at that time, I (Bauer 2000:37) had felt quite optimistic about its status; here I quote what I wrote in describing its thriving state back then:

“. . . Cantonese has achieved in Hong Kong a unique and very special status in comparison to any other Chinese dialects wherever they are spoken. I would go so far as to say that Cantonese is now enjoying its Golden Age in Hong Kong. Where else in China, or the world for that matter, can one witness Shakespeare’s A Midsummer Night’s Dream performed in Cantonese; read a newspaper article, novel, or adult comic book written in Cantonese; watch a movie in which the dialogue has been originally recorded in Cantonese; attend a university lecture delivered in Cantonese; listen to a radio play or international news program broadcast in Cantonese; or hear legislative councilors [somewhat comparable to elected representatives in a parliament] and the Chief Executive [who is the head of Hong Kong’s government] vigorously debate proposed laws in Cantonese? The answer is obvious, and most Cantonese speakers take all these things for granted because they perceive no threat to the language and feel there is nothing to get excited about. Their attitude reflects the healthy state of the language, but it also makes me wonder that if we now live in the Golden Age of Cantonese, how much longer can it continue?”

Although the Cantonese language still seems to be doing relatively well in Hong Kong 20+ years after its return to China’s sovereignty in 1997, some recent survey findings on the use of languages in Hong Kong clearly indicate that in the eyes of some people the Cantonese language is becoming endangered. According to findings from a telephone survey conducted by Bacon-Shone, Bolton, and Luke (2015:7), Cantonese continues to function as “the key language for oral communication in many settings in Hong Kong”. In addition, this survey reported that the Hong Kong government has been making good progress in promoting trilingualism, i.e., fluency in Cantonese, Putonghua, and English, especially among younger people who claim they have some degree of proficiency in all three of Hong Kong’s principal languages. At the same time, however, there were some Hongkongers who stated they felt some unease regarding the state of Cantonese. People who participated in this telephone survey were presented with the question, How seriously endangered is Cantonese at present? While 23.1% answered with Not at all, we should note that a combined total of 77% of the respondents said they thought Cantonese was endangered to some degree: viz., A Little at 31.8%, Moderately at 30.1%, A Lot at 11.7%, and Critically at 3.4% (Bacon-Shone, Bolton, and Luke 2015:27).

The use of Putonghua as the medium of instruction in schools has been steadily increasing, and so it can be regarded as the direct cause for the fall in the numbers of schoolchildren who have been learning to read and write the Chinese characters with Cantonese pronunciation. At the time of this writing, about 70% of Hong Kong’s primary schools and 40% of its secondary schools are using Putonghua as their medium of instruction. Some people are not at all happy about this development: a few years ago one member of the Putonghua as Medium of Instruction Student Concern Group bluntly stated, “It’s ridiculous that we cannot use our mother tongue to learn in our own place” (Yau and Yung 2014:C5).

It comes as no surprise to me to read that some people are calling for laws to be enacted that would protect and preserve Cantonese:

“Although there is no determined campaign to eliminate Cantonese, Hong Kong gives little encouragement for children to study Cantonese when Putonghua is seen as one of the main languages of business today. The city’s laws provide scant protection for Cantonese . . . Cantonese is an important part of the intangible cultural heritage of Hong Kong and vital for the preservation of its cultural identity. Hopefully, the [Hong Kong government’s cultural heritage] survey will identify Cantonese as worthy of protection, not just as a vehicle for communication of other elements such as Cantonese opera, local festivals and rituals, but as an element in its own right.” From “Both city and nation must preserve Cantonese language”, Letter of the Law by Steven Gallagher, associate dean of law at Chinese University of Hong Kong, South China Morning Post, April 24, 2014, Page C2.

In a few years’ time when people look back at the current era in which Putonghua has replaced Cantonese as the medium of instruction in most schools, I can imagine they will sadly say it was this shift that marked the beginning of the inevitable decline in the status of Cantonese in Hong Kong. We can also see this as another step towards the community’s further mainlandization, against which resentment has been intensifying among some Hongkongers; the localist movement’s support of Hong Kong Cantonese as one of its issues can only make the language more politicized. I predict that at some point down the road speaking Cantonese will be regarded as a politically-sensitive or even illegal act. Indeed, just such a development has already been anticipated: In the Hong Kong movie “十年 Ten Years”, which was released in 2015 and vividly imagined a future dystopian Hong Kong in the year 2025, there is the unforgettable scene in which the taxi-driver character who speaks only Cantonese says to the policeman who is giving him a ticket for speaking Cantonese in a Putonghua-only zone: “而家講廣東話係咪犯法先?” ji4 gaa1 gong2 gwong2 dung1 waa6/2 hai6 mai6 faan6 faat3 sin1 ‘Is speaking in Cantonese against the law now?’.

After having spent more than 40 years learning and researching the Cantonese language, I feel the responsibility to do what I can to support and promote it; my mission has been to record the contemporary pronunciation and written form of the Hong Kong Cantonese language and to translate its vocabulary into English before it declines or even disappears.

What is the Special Connection Between Cantonese and Hong Kong?

In the 1950s, following the establishment of the People’s Republic of China in 1949, Cantonese in Guangzhou, Guangdong’s provincial capital and the regional home of the language, began to fall into a steep decline due to the adoption and then heavy-handed promotion of Putonghua/Mandarin as the national (and official) language and the medium of instruction in the schools and broadcast media. Indeed, today schoolchildren in Guangzhou speak Putonghua rather than Cantonese, as speaking Cantonese is banned in the schools there (Yu 2018). As a direct result of this language switch, the center of Cantonese language and culture shifted away from Guangzhou to Hong Kong, which has been called – and quite rightly so – the Cantonese-speaking capital of the world (Bolton 2011:64). According to the Hong Kong 2016 By-Census (Hong Kong Census and Statistics Department 2017), Cantonese is the usual, daily language that is spoken by 88.9% of Hong Kong’s population aged five years and older; if we include those people who speak it as another variety, then the percentage rises to 94.6% of the territory’s inhabitants, who number just over 7 million. Without a doubt, today Hong Kong’s predominant speech variety is the Cantonese language. Although it is not Hong Kong’s de jure official language, nonetheless, based on its widespread use by government officials, the broadcast media, and ordinary Hongkongers, Cantonese can be considered Hong Kong’s de facto official spoken language (Bauer 2015:39). In 2015 the world-wide total population of Cantonese speakers, including those speaking so-called dialects of 粵語 Jyut6 jyu5 (or Yueyu, the name of the major Chinese dialect family to which Cantonese belongs), was 62.2 million, as calculated by SIL’s Ethnologue (Lewis et al. 2015).

What Makes the Hong Kong Cantonese Language So Special?

Bradley (1992) was the first scholar to recognize that the Chinese language is not one vast monolithic language, but through its geographical dispersal has developed into a number of distinctive varieties that are now written and spoken across East and Southeast Asia; the term he quite appropriately applied to Chinese was pluricentric, i.e., it is not just one but a whole series of languages that have evolved with multiple standards. To some extent this state of affairs resembles global English, or more appropriately World Englishes that are called American, Australian, British, Canadian, Hong Kong, Indian, New Zealand, Singaporean, South African, etc. Each of these has its own particular standard within the community where it is spoken and is distinguished by possessing its own unique features. By the same token, it needs to be asserted that today there is not just one kind of “proper” or standard Chinese language, but recognizably-different varieties are being spoken and written in Hong Kong, Macao, Malaysia, Singapore, Taiwan, Thailand, as well as mainland China. Indeed, one may presume that the recognition of this fact was the primary motivation for the publication in 2010 of the 《全球华语词典》Quánqiú Huáyǔ Cídiǎn [dictionary of global Chinese language] which classified a wide range of lexical items as belonging to the Chinese varieties spoken and written in these very nations and territories (Li 2010). Occupying a prominent position among these various Chinese varieties is Hong Kong Cantonese.

In its Hong Kong environment the Cantonese language has evolved into a dynamic and independent Chinese variety with its own uniquely and distinctively defining features. At the outset we should make one thing crystal clear: the Cantonese language is not simply the standard Chinese characters cloaked in Cantonese pronunciation. Innumerable differences between their sound systems, vocabularies, and grammars have made spoken Cantonese and Putonghua (or Mandarin, the national language of mainland China) two mutually unintelligible Chinese speech varieties.

At least five features of the Hong Kong Cantonese language have combined together to bestow upon it a special – even unique – status as follows:

(1) Distinctive lexical items and localized Chinese characters.

(2) Phonetic features that are associated with the colloquial Cantonese pronunciation, and non-standard phonetic variations in initial consonants, rimes, and tones which have been widely observed, identified as so-called 懶音 laan5 jam1 ‘lazy pronunciations’, and formally investigated in sociolinguistic studies (in 2007 the Hong Kong government’s advisory Standing Committee on Language Education and Research or SCOLAR conducted the community-wide campaign entitled “Say ‘No’ to ‘laan5 jam1’” in order to help Cantonese speakers, especially students, “correct” their non-standard pronunciations).

(3) English loanwords (over 700 have been documented; Bauer 2003, 2006a, 2010; Bauer and Wong 2010; Wong, Bauer, and Lam 2009) that have been borrowed into the Cantonese lexicon primarily through phonetic transliteration as the direct result of the intimate contact that began between the English and Cantonese languages back in the late 17th century and still continues today.

(4) The two traditions of Cantonese lexicography and Cantonese romanization which were first combined together by 19th-century Western missionaries. Today we may have more dictionaries that document the lexicon of Putonghua and modern standard written Chinese than for Cantonese; nonetheless, over the past two decades the publication of several Hong Kong Cantonese dictionaries has codified the written form of the Hong Kong Cantonese lexicon and transcribed its pronunciation accurately in various systems of romanization which historically were alien to the Chinese language.

(5) The extraordinary and particularly noteworthy development, conventionalization, and widespread use throughout the speech community of the written form of Cantonese speech across many domains of written language, such as Hong Kong’s mainstream newspapers, gossip magazines, personal letters, government posters, comic books, novels, textbooks for teaching Cantonese, supermarket receipts, etc. Plainly stated, the written form of Hong Kong’s Cantonese language is unprecedented among all regional Chinese varieties being spoken today within China and across Southeast Asia; the Hong Kong Cantonese language uniquely stands out by having developed its own independent, distinctively-separate written form that is in competition with modern standard written Chinese. The conventions that underlie Hong Kong’s written Cantonese language are presented and analyzed in a section that follows below.

What is the Cantonese Language?

To answer this question let us first consider the names by which the Cantonese language has been known among its speakers, viz., 廣東話 gwong2 dung1 waa6/2 ‘speech of Guangdong (province)’, 廣州話 gwong2 zau1 waa6/2 ‘Guangzhou speech’, 香港話 hoeng1 gong2 waa6/2 ‘Hong Kong speech’, 唐話 tong4 waa6/2 ‘Tang (dynasty) language’, i.e., the language of the Tang dynasty (618–907 C.E.), which some Cantonese speakers have traditionally looked upon as the high point in the history of the Chinese civilization, 白話 baak6 waa6/2 ‘plain language (literally, white speech)’, and 粵語 jyut6 jyu5 ‘Yue language’ (Bauer and Benedict 1997:xxxi). While Cantonese has been in a steady decline in Guangzhou, the capital city of Guangdong province, as a direct result of the heavy-handed promotion of Putonghua, nonetheless, it still thrives in Hong Kong; it is the view of this writer that today the most appropriate name for the contemporary Cantonese language is 香港粵語 hoeng1 gong2 jyut6 jyu5 ‘Hong Kong Cantonese language’.

There are a number of special linguistic features, including phonological, morphological, lexical, syntactic, and social, which give the Cantonese language its unique identity and so distinguish it from other Chinese varieties of southern China.

Among its special phonological traits are the syllable endings -m, -p, -t, -k that have been retained from the Ancient Chinese language (but have become lost in Putonghua). While the doubling of the original four-tone categories of Ancient Chinese to eight is found not only in Yue but also in some other Chinese dialects, one distinctive, definitive characteristic of many (but not all) Yue dialects (Norman 1988:217–18) has been the split of the 陰入 jam1 jap6 Yin Ru tone category (carried by syllables with stop endings -p, -t, -k) into two subcategories of 上陰入 soeng6 jam1 jap6 ‘High-stopped Tone, or Upper Yin Ru’ and 下陰入 haa6 jam1 jap6 ‘Mid-stopped Tone, or Lower Yin Ru’; this development has been conditioned by vowel length for the standard reading pronunciations of the standard Chinese characters (with only a very few exceptions); that is, syllables with short vowels co-occur with the High-stopped tone 1, while syllables with long vowels co-occur with the Mid-stopped tone 3 (下陰入 haa6 jam1 jap6). Other phonological features include the co-occurrence of colloquial morphosyllables with sonorant initials (m-, n-, ng-, l-) with high register tones, and the use of the 變音 bin3 jam1 ‘changed tone’ on syllables to derive additional words with different meanings.
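The vowel-length conditioning of the Yin Ru split described above can be illustrated with a minimal sketch in Python. The function name expected_yin_ru_tone and the two rime sets are hypothetical and are not taken from this dictionary; the sets assume the usual Jyutping description in which a, eo, and the i and u that precede -k are short vowels, while the remaining vowels before stop endings are long. The sketch covers only standard reading pronunciations of Yin Ru syllables, so colloquial items and the few exceptions mentioned above are not predicted by it.

```python
import re

# Optional Jyutping initial followed by the rime (everything that remains).
ONSET_AND_RIME = re.compile(r"^(gw|kw|ng|[bpmfdtnlgkhwzcsj])?(.*)$")

# Assumption: stop-final rimes with short vowels (expected High-stopped tone 1).
SHORT_STOP_RIMES = {"ap", "at", "ak", "ik", "uk", "eot"}

# Assumption: stop-final rimes with long vowels (expected Mid-stopped tone 3).
LONG_STOP_RIMES = {"aap", "aat", "aak", "ek", "ip", "it",
                   "ot", "ok", "ut", "oek", "yut"}


def expected_yin_ru_tone(syllable):
    """Return 1 or 3 for a stop-final syllable, or None if it has no stop ending.

    Tone digits (including a changed-tone suffix such as '6/2') are
    stripped before the rime is examined.
    """
    base = syllable.rstrip("0123456789/")
    rime = ONSET_AND_RIME.match(base).group(2)
    if rime in SHORT_STOP_RIMES:
        return 1
    if rime in LONG_STOP_RIMES:
        return 3
    return None


for example in ["fuk1", "baak3", "gwok3", "gin3"]:
    print(example, expected_yin_ru_tone(example))
```

Syllables cited elsewhere in this introduction, such as fuk1 in 福建 fuk1 gin3 and baak3 in 百粵 baak3 jyut6, pattern as expected under these assumptions, whereas a colloquial particle such as 唧 zek1 does not, which is consistent with the rule being stated for reading pronunciations only.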

Morphological features include the occurrence of modal particles at the ends of utterances; e.g., 吖嘛 aa1 maa3, 啊 aa3, 噃 bo3, 𠺢嘛 gaa1 maa3, 㗎 gaa3, 啩 gwaa3, 喇 laa3, 囉 lo1, 嚕 lu3, 嘛 maa3, 咩 me1, 喎 wo3, 喎 wo5, 啫 ze1, 唧 zek1, 之嘛 zi1 maa3, etc. Also known as discourse markers and sentence-final particles, these morphosyllables carry no semantic content in isolation but convey the speakers’ feelings and attitudes towards their utterances, such as certainty, disbelief, disdain, dismay, doubt, exasperation, impatience, indisputableness, intimacy, irritation, surprise, etc. The series of aspect markers which also lack semantic content on their own and that are suffixed to verbs include 𡁵 gan2, 𠹺 maai4, 晒 saai3, 咗 zo2, 住 zyu6, etc.

As will be further explained below, colloquial Cantonese speech includes a number of vocabulary items that are regarded as giving the language its distinctively Cantonese identity; interestingly and curiously enough, these words cannot be etymologically related to their semantic equivalents in standard Chinese; and, as we can see in the following examples, some of them are written with nonstandard, dialectal characters: e.g., 骲 beu6 ‘to jostle with the hips’, 𨳍 cat6 ‘penis (vulgar)’, 揼 dam3 ‘to droop, hang down’, 揼 dap6 ‘to beat, pound’, 掟 deng3 ‘to throw (at a target)’, 曱甴 gaat6 zaat6/2 ‘cockroach’, 𨅝 jaang3 ‘to kick off’, 𡲢 ke1 ‘shit’, 佢 keoi5 ‘he, she, it’, 𡃈 kwaak3 ‘loop, circle’, 嚟 lai4 ‘to come’, 冇 mou5 ‘not to have; no’, 腍 nam4 ‘soft, tender’, 啱 ngaam1 ‘to be all right, good’, 䠋 pe5 ‘to stagger’, 氹 tam5 ‘puddle’ (Cheung and Bauer 2002).

What is the Origin of the Cantonese Language?

During the Qin 秦 ceon4 dynasty (221–206 BCE) Han Chinese soldiers were dispatched to South China to conquer the region called 粵 jyut6 by subjugating its indigenous inhabitants who were referred to by the Chinese as 百越 or 百粵 baak3 jyut6 ‘hundred Yue’ [tribes]; one important consequence was that the Old Chinese language brought to the South by these soldiers and other immigrants came into contact with the aboriginal peoples’ non-Sinitic languages which are believed to have belonged to such language families as Austro-Asiatic, Tai-Kadai, and Miao-Yao (Yue-Hashimoto 1991b). Today these language families are still represented among South China’s non-Han ethnolinguistic groups who are called 少數民族 siu2 sou3 man4 zuk6 ‘minority nationalities’, as they continue to inhabit this region; the largest group is the 壯 zong1 ‘Zhuang’ who speak varieties of northern Tai over in neighboring Guangxi (Holm 2013). We can assume that the early contacts among peoples speaking Qin-Chinese and the local non-Sinitic aboriginal languages led to their intermarriage, and thus created the conditions for the formation of pidgins and creoles which developed into Norman’s hypothesized Old Southern Chinese, the ancestor of South China’s three main Chinese topolects of Yue 粵 jyut6, Kejia 客家 haak3 gaa1, and Min 閩 man5 (Norman 1988:210–214). According to Norman (1988:210), “[i]n areas which for geographic or topographic reasons were more exposed to Northern influence, these archaic Southern Chinese dialects freely incorporated [linguistic] elements from each new wave of Northern immigration, while in other more remote and mountainous regions (like Fújiàn [福建 fuk1 gin3]), they guarded their archaic aspect more faithfully.”

The claim that these early contacts with non-Sinitic indigenous languages influenced the development of Chinese varieties in South China, such as Cantonese, is based on the identification of lexical substrata linked to Tai-Kadai, Miao-Yao, and other language families (Bauer 1987, 1996; Li 1994a, 1994b; Yuan 1983; Yue-Hashimoto 1991). In this regard, one particularly interesting and salient morphosyllable that we can cite from the Cantonese basic lexicon is 呢 ni1, as in the two words 呢個 ni1 go3 ‘this’ and 呢度 ni1 dou6 ‘here’ which are obviously not etymologically related to their standard Chinese semantic equivalents 這個 zhège ‘this’ and 這裡 zhèli ‘here’, respectively; indeed, we cannot find any semantically-equivalent morphosyllable in the standard Chinese written language to which Cantonese ni1 could be etymologically related. From my own comparative study of numerous phonosemantically-similar items that are widespread throughout the languages of the Tai-Kadai family, as well as other families distributed across Southeast Asia, including Austronesian, Tibeto-Burman, and Mon-Khmer (Bauer 1999:35–38), I have concluded that Cantonese ni1 must have descended from a very old lexical root that occurred in some ancient indigenous non-Sinitic language that is now no longer spoken.

What is the Cantonese Lexicon?

With regard to the Cantonese lexicon, we distinguish between two forms of its words, viz., the phonetic shapes with which they occur in the spoken language, on the one hand, and the written forms with which they are transcribed in Cantonese writing, on the other. Furthermore, spoken words and Chinese characters belong to two different, separate, albeit related systems. Although one might think that it should go without saying, I believe I had better say it here anyway: the Chinese characters do not equal the Cantonese language (or even the standard spoken Chinese language, or any other Chinese speech variety). This is not only because the scope of the spoken language is far broader than the set of Chinese characters for writing it, but also because speech is dynamic, constantly changing and evolving. Speech is primary, written language is secondary; human beings are born into the world with the innate capacity to learn to speak the language(s) they hear being spoken around them by their caregivers; learning how to read and write a written language, on the other hand, requires the formal process of education by which children attend school where they receive instruction in the written language from their teachers. Even more important, as we will see in the analysis and discussion of written Cantonese that follows, is the inescapable fact that the Chinese characters on their own are simply inadequate for unambiguously transcribing Cantonese speech in its fully expressed form. The Cantonese language includes a number of indigenous morphosyllables which cannot be linked to their etymological (or original) Chinese characters. How can these so-called “unwritable” Cantonese morphosyllables be written? As will be explained in more detail in sections that follow below, there have been essentially four solutions for writing such lexical items:

(1) Borrowing standard characters for their similar or homophonous pronunciations;

(2) Creating new Cantonese characters;

(3) Borrowing individual letters of the English alphabet or combinations of them for their homophonous or similar pronunciations as a kind of ad hoc romanization;

(4) Adopting the “empty-box” 囗 as a kind of place-holder of last resort.

As pointed out above, it is a given in linguistics that speech is primary, but writing a language is secondary. While young children learn to speak the language they hear being spoken around them without really needing direct instruction on pronunciation, lexicon, and syntax (although caregivers may still give them some help with these), they do need to go to school to learn from their teachers how to read and write the language(s) they speak.

In etymology, i.e., the study of the origins of words, there is the saying, Every word has a history of its own. We could paraphrase this by saying simply, Every word has its own story. In the course of writing this Cantonese-English dictionary I have learned first-hand just how true this is; in the case of some words their stories have turned out to be long and convoluted but almost always quite interesting.

As one way to learn more about the Cantonese lexicon I have tried to make it a daily habit to read Hong Kong’s most popular newspaper, namely, 《蘋果日報》 ping4 gwo2 jat6 bou3 Apple Daily, as many of its articles are written in colloquial Cantonese and Hong Kong Chinese. As examples of such items of local Hong Kong Chinese vocabulary that appear in this newspaper, we may cite 衝紅公仔 cung1 hung4 gung1 zai2 ‘(for a pedestrian) to walk against the red light at a pedestrian crossing’, 行街紙 haang4 gaai1 zi2 ‘colloquial term for a document issued by the Hong Kong Immigration Department to a person who has applied to the Hong Kong government not to be returned to their country of origin due to fear of persecution; this document allows the person to move about Hong Kong freely but not to engage in employment or travel outside Hong Kong’, 石壆 sek6 bok3 ‘concrete curb, as alongside a road; ledge, as on the side of a building’, and 劏房 tong1 fong4/2 ‘subdivided flat’.

In addition, a matter of much interest to me is this newspaper’s practice of quoting verbatim what Cantonese-speakers have said, that is, the speaker’s words are transcribed in written Cantonese (by the same token, if a Putonghua-speaker is quoted, then their speech is so reproduced). As a result of reading Apple Daily, it has been my experience that almost every week I encounter a word or expression that is either new to me or is being used in a different context, and so I feel compelled to learn more about it; if it turns out to be a Cantonese or Hong Kong Chinese word or expression, then I have felt the need to create a lexical entry for it in this dictionary.

The Cantonese lexicon can be found transcribed in some fairly mundane places, e.g., my bilingual supermarket receipt which the cashier hands over to me after I have paid for my groceries; a recent one listed the following Cantonese items (their standard Chinese equivalents are enclosed in parentheses): 椰菜仔 je4 coi3 zai2 ‘Brussels sprouts’ (球芽甘藍), 美國西芹 mei5 gwok3 sai1 kan4 ‘American celery’ (美國芹菜), 無核青提子 mou4 wat6 cing1 tai4 zi2 ‘seedless green grapes’ (無核青葡萄), 無核提子乾 mou4 wat6 tai4 zi2 gon1 ‘seedless raisins’ (無核葡萄乾), 毛茄 mou4 ke4/2 ‘okra’ (秋葵), 薯仔 syu4 zai2 ‘potatoes’ (土豆), 翠玉瓜 ceoi3 juk6 gwaa1 ‘zucchini’ (綠皮密生西葫蘆); a couple of standard Chinese items also appeared on the receipt: 菠蘿 bo1 lo4 ‘pineapple’, and 茄子 ke4 zi2 ‘eggplant’. And, not surprisingly, several English loanwords were also found listed: 士多啤梨 si6 do1 be1 lei4/2 ‘strawberries’ (草莓), 比爾芝士 bei2 ji5 zi1 si6/2 ‘brie cheese’ (布里干酪), 青奇異果 cing1 kei4 ji6 gwo2 ‘green kiwi’ (青猕猴桃), 長啤梨 coeng4 be1 lei4/2 ‘conference pear’, 馬來西亞車厘茄 maa5 loi4 sai1 aa3 ce1 lei4 ke4/2 ‘Malaysian cherry tomatoes’ (馬來西亞小西红柿).

How Can the Cantonese Lexicon be Analyzed?

Broadly speaking, in terms of the origins and distributions (or collocations) of Cantonese words in the spoken language and their corresponding written forms, we can analyze the Hong Kong Cantonese lexicon as comprising a series of three distinctive main layers or strata as follows:

(1) The Hong Kong Standard Chinese and Literary Layer includes many lexical items that overlap with those in the modern standard written Chinese language of mainland China; however, while some items have identical meanings in modern standard Chinese, their collocations differ when they are used in Hong Kong Cantonese; in addition, there are some distinctively Hong Kong lexical items that do not occur in standard Chinese, or if they do so, they have quite different meanings in Cantonese.

(2) The Colloquial Cantonese Layer includes uniquely Cantonese words that are etymologically unrelated to their semantic equivalents in modern standard written Chinese, so writers have resorted to various means to overcome the challenge of writing them. Some colloquial words are written with so-called “dialectal”, i.e. non-standard, characters, while others are transcribed with standard Chinese characters that have been borrowed solely for their homophonous (or nearly homophonous) pronunciations; some colloquial words are written with standard Chinese characters that have been borrowed solely for their meanings, but are read with the semantically-equivalent colloquial syllables; and there are even a few colloquial words that are written with English letters in a kind of ad hoc Cantonese romanization of last resort, as there do not seem to be any characters with which to write them.

(3) The English Loanwords Layer comprises lexical items that have been borrowed from the English language with which Cantonese has been in intimate, unbroken historical contact for over 300 years; as a result of this contact, no other Chinese dialect has been more influenced by a European language than Cantonese. These loanwords, which number over 700, are written in several different ways (Bauer 2010) as follows:

(1) Phonetic transliteration uses Chinese characters with suitable Cantonese pronunciations.

(2) Semantic translation uses standard Chinese characters with appropriate meanings.

(3) Combination of phonetic transliteration with semantic translation.

(4) Ad hoc romanization uses individual English letters pronounced with Cantonese syllables.

(5) Transcription retains the original English spellings which are pronounced with similar-sounding Cantonese syllables.

Examples of these various written forms of English loanwords are presented and discussed below.

As already stated above, those Cantonese words that also occur in standard Chinese with essentially identical meanings and collocations have not been included in this dictionary – unless they have been identified as being originally Cantonese and at some later stage were adopted into the standard Chinese vocabulary.

As for just how Cantonese words (and other morphosyllables) are written with Chinese characters and letters of the English alphabet, our closer inspection reveals that the lexicon is relatively more complex than the general outline just sketched above. The following section on written Cantonese provides a more in-depth systematic analysis of the processes and principles on which the written forms of Cantonese words are based.

What is Written Cantonese?

As already pointed out above, among China’s regional linguistic varieties Cantonese is extraordinary by having developed its own highly conventionalized written form which is widely used across many domains in Hong Kong; i.e., not only is Cantonese a spoken language, but it is also a written one (Bauer 1988; Bauer 2018). The Chinese phrase 《我手寫我口》 ngo5 sau2 se2 ngo5 hau2 ‘my hand writes my mouth’, that is, I write the way I speak, was first advocated by the Qing dynasty poet, diplomat, statesman, and educator, 黃遵憲 Huang Zunxian (1848–1905), and then it later was adopted as a slogan of the May 4th Movement which called for 白話文 baak6 waa6/2 man4 ‘colloquial spoken language’ to replace the old style 文言文 man4 jin4 man4 ‘classical Chinese’ as the language of modern Chinese literature. Indeed, this sentence I write the way I speak expresses precisely what Cantonese speakers are doing – writing down pretty much verbatim with Chinese characters in combination with letters of the English alphabet the vocabulary and grammar of their Cantonese speech. This is quite unusual, and in the context of Cantonese lexicography it is necessary to draw attention to the fact that the written form of Cantonese has never undergone the formal process of standardization; nonetheless, as Cheung and Bauer (2002:2) have observed: “[t]he compilation and publication of [numerous] Cantonese grammars, Cantonese-Putonghua dictionaries, Cantonese-English dictionaries, Cantonese textbooks for foreign students, etc. have contributed to the codification of written Cantonese and even implicitly conferred on it a kind of quasi-official status.” Such publications, along with glossaries, stories, and other forms of documentation, have promoted the development of written Cantonese and allowed it to accumulate over time relatively consistent conventions which writers have generally adopted and adhered to in producing their texts so that they are intelligible to their Cantonese-speaking readers.

Another important fact that makes written Cantonese especially remarkable and noteworthy is that its conventions are not explicitly taught in Hong Kong schools; nonetheless, Cantonese-speaking schoolchildren have still been able to pick them up informally and so learn to read and write Cantonese through their contact with and exposure to its texts that pervade the domains in which written language is used (of course, being able to learn to read Cantonese texts requires that the learner speak Cantonese, so that speaking and reading Cantonese go hand in hand to reinforce each other). As mentioned above, Cantonese and Putonghua are mutually unintelligible, so that a text of written Cantonese can be almost unintelligible to literate Mandarin-speakers from Beijing or Taipei. By pointedly highlighting the differences in the ways Cantonese and Mandarin are written, the author deliberately intends to dispel the curious myth which he has occasionally encountered that they are written the same way; we can now toss this baseless myth into the dustbin where it belongs.

In regard to the study of the historical development of written Cantonese in Guangdong and Hong Kong, Snow (2004) remains the most comprehensive and authoritative reference. According to Snow (2004:77–99), the origins of written Cantonese can be traced back to a wide range of written materials: (1) transcribed vernacular Chinese texts that were associated with Buddhism and intended for singing and chanting; (2) 木魚書 muk6 jyu4 syu1 ‘wooden fish books’ which contained different types of folk songs and popular songs; (3) manuscripts of 木魚歌 muk6 jyu4 go1 ‘wooden fish songs’ with popular stories from history, Buddhist scriptures, popular plays, folk tales, etc.; (4) 南音 naam4 jam1 ‘Southern songs’ from the early 1800s; (5) 粵謳 jyut6 au1 ‘Cantonese love songs’; (6) Cantonese textbooks of various types for Cantonese speakers; (7) 粵劇 jyut6 kek6 ‘Cantonese operas’; (8) popular works of fiction; (9) political articles published in newspapers in Guangzhou and Hong Kong in the early 1900s. In the late 1940s leftists and communists who were involved in the Hong Kong Dialect Literature Movement and 廣東方言文藝研究組 gwong2 dung1 fong1 jin4 man4 ngai6 jan4 gau3 zou2 ‘Guangdong Dialect Arts Research Group’ wrote literary works in colloquial Cantonese. Genres associated with this movement included novels, poetry, plays, anthologies, short stories, and theoretical essays (Snow 2004:106–107). These writers claimed that writing in Cantonese had certain advantages, e.g. it helped to spread education and writing skills, produced literature in the areas where dialects were spoken, and appealed to the masses who felt it was more “intimate” than standard written Chinese. However, the Hong Kong Dialect Literature Movement was not popular and abruptly shut down when the Chinese communist government was established in 1949.

After 1949 in Hong Kong, under the British colonial government, which regarded the Cantonese language as a potentially useful barrier to communication between Hong Kong and the mainland, written Cantonese continued to develop and flourish; in Guangzhou, by contrast, writing in Cantonese faded out (Snow 2004:121). Linguistic developments in Hong Kong in the 1950s included the emergence of so-called 三及第 saam1 kap6 dai6/2 (or saam1 gap6 dai6/2) which was a satirical style of writing that brought together features of classical Chinese, Cantonese, and standard Chinese. In addition, the 小報 siu2 bou3 ‘mosquito press’ comprised cheap, four-page newspapers written for entertainment with gossip about movie stars and opera performers.

Why do Cantonese speakers write in Cantonese? Cheung and Bauer (2002:4) have answered this question as follows:

“. . . writing in Cantonese is perceived by writers and readers as conveying the writer’s message with a greater degree of informality, directness, intimacy, friendliness, casualness, freedom, modernity, and authenticity than writing it in standard Chinese, which is the formal language the Hong Kong Cantonese speaker learns to read and write in school, but its spoken counterpart s/he does not ordinarily use when speaking with coworkers, friends, and family members.”

As for the process of transforming Cantonese speech into its corresponding written form, we must recognize there is a mismatch between the inventory of Cantonese morphosyllables that occur in speech and the standard Chinese characters. By this I mean we cannot link up each and every Cantonese morphosyllable in the spoken language with its etymologically-related standard Chinese character because we simply do not know what some of these characters are (despite the best efforts of scholars who concentrate on 本字考 bun2 zi6 haau2 ‘investigation into the original Chinese character’). This mismatch or disjunct has everything to do with the very early formation and historically-complex evolution through language contact of the Cantonese language with various Sinitic and non-Sinitic languages. Furthermore, because there are more Cantonese morphosyllables in the Cantonese syllabary than there are standard Chinese characters with suitable pronunciations (whether or not etymologically related) for writing the morphosyllables, writing Cantonese words has been made that much more difficult.

In regard to the Hong Kong community’s attitudes toward written Cantonese, they can at best be described as ambivalent, accepting to some extent but not necessarily approving. Some decades ago when I was conducting research for my Ph.D. dissertation on Hong Kong Cantonese phonetic variation and change, I had an unforgettable experience, as I still remember it to this day: as part of my research work to collect texts of tape-recorded Cantonese speech, I requested the participants in my study to read aloud various kinds of research instruments, one of which was a story that was written out in Cantonese and included some Chinese characters whose pronunciations were related to the phonological variables I was investigating. When I gave this story to one of my subjects and asked him to read it aloud, he looked at the story with some surprise and then said to me as if I should know, “Cantonese is not a written language.” In hindsight I think his point was that written Cantonese is not the proper standard Chinese language, it is not taught in school, and that in comparison to modern standard written Chinese in which important documents are transcribed, any text of written Cantonese is not to be taken seriously. And he was and would still be quite correct in saying this. Nonetheless, we cannot ignore the fact that the spoken Cantonese language does indeed have its written counterpart, even though it has never undergone the formal process of standardization, and no one actually learns how to read it in school.

As for Cantonese words in the spoken language, they exist independently of how they are written. Yes, we are quite interested in knowing how the words are written with Chinese characters and English letters, but that is not necessarily the most important aspect of a word. In the case of Cantonese, there are many words that have two or more written forms, and we may not be able to determine which one is “better” or “correct” or “original” or “proper”. Which Chinese character should be used to write a Cantonese word? This is a question one often sees raised on the Internet and is usually framed as: What is the “correct” and “original” Chinese character that should be used to write a particular colloquial Cantonese word? The questioner may have paraphrased its meaning and even romanized it in order to represent its pronunciation.

From the point of view of the lexicographer, that the word has a written form certainly makes it more convenient to work with that word and keep track of it. Fortunately, in the case of Cantonese, we are not limited to transcribing its words with Chinese characters; we can also write them down with Cantonese romanization, and very often a particular word’s romanized form is more accurate in representing its pronunciation and so more useful to us than the Chinese characters with which it is written, since characters can have multiple pronunciations and different meanings associated with them.

In analyzing the written form of Cantonese, we should keep in mind that the conventions of written Cantonese cannot produce a fully accurate transcription of the spoken language but only an approximate representation of it. As matters currently stand, we will discover in the following discussion that the Chinese characters are not fully adequate for writing the Cantonese language. Writers have been doing the best they can with the ad hoc conventions they have at hand, yet they still face many gaps in the tools that are available to them. To enhance and improve the writing system, Cantonese writers have even taken to supplementing the Chinese characters with letters of the English alphabet to romanize the pronunciations of some “unwritable” Cantonese morphosyllables, as mentioned above.

Close examination of Cantonese texts reveals that five processes operate in written Cantonese: viz., traditional usage of the Chinese characters, as well as their phoneticization, indigenization, semanticization, and alphabetization (as the result of intimate contact with English over the past 300 years). In addition, 12 basic principles have been identified as underlying written Cantonese, and these help us better understand how the five processes operate. Finally, there are two main problems of variation in how lexical items are written in Cantonese which need to be resolved in order to advance its standardization. The conventions of written Cantonese and issues associated with the transcription of Cantonese speech in this dictionary are presented and explained in some detail in the following section.

Five Processes Operate in Written Cantonese

The following five processes have been identified as operating in written Cantonese:

1. Traditional usage of the Chinese characters

The Chinese characters are read with their usual, regular standard Cantonese pronunciations and meanings; that is, they are used in written Cantonese according to their traditional phonetic-semantic etymological development just as in modern standard Chinese. The meanings and usages of many lexical items that are written with standard Chinese characters are essentially identical in both Chinese varieties.

2. Phoneticization of the Chinese characters

This process of phoneticization means that the Chinese characters are read only for their pronunciations, while their meanings are ignored (or suppressed). Numerous colloquial Cantonese words do not have etymologically-related Chinese characters as their written forms; so in order to write these “characterless” words, written Cantonese resorts to borrowing some standard Chinese characters solely for their homophonous (or nearly homophonous) Cantonese pronunciations. This is to say that colloquial Cantonese words that would otherwise be unwritable are written with standard Chinese characters which have the same or similar pronunciations as the colloquial words; when Cantonese-speaking readers encounter such written items, they know they should ignore the standard characters’ usual meanings, read them just for their pronunciations, and associate the colloquial meanings of the words with the standard characters that are being used phonetically.

3. Indigenization of the Chinese characters

Cantonese characters have been created in order to write colloquial words. As noted above with the 2nd process, there are a number of words in the colloquial Cantonese language which do not have etymologically-related Chinese characters as their written forms. So Cantonese-speaking writers have adopted another solution for writing such “characterless” words, viz., the creation of Cantonese characters according to traditional principles of character formation, i.e. typically through the combination of semantic (radical or semantophore) and phonetic (sound or phonophore) character components. In addition, some ancient Chinese characters that are rarely if ever used in modern standard Chinese have been retained and revived to write Cantonese words.

4. Semanticization of the Chinese characters

In contrast to the 2nd process described above, we also find the opposite process of semantic substitution by which some standard Chinese characters are read only for their meanings with the corresponding colloquial Cantonese morphosyllables that are etymologically-unrelated but are semantically-equivalent to the standard Chinese characters. When the process of semanticization operates on the standard Chinese characters, the etymologically-derived pronunciations of the standard Chinese characters are ignored.

5. Alphabetization

The fifth process is actually a more specific kind of phoneticization as described above for the 2nd process. Some individual English letters whose syllabic pronunciations are perceived as being similar to Cantonese morphosyllables are borrowed into written Cantonese to transcribe these morphosyllables. Over the past 300 years Cantonese has borrowed hundreds of English words; older loanwords are typically written with Chinese characters, but some relatively more recent lexical borrowings still retain their original English spellings in written Cantonese, but they are pronounced with Cantonese morphosyllables that approximate the pronunciations of the original English words.

Twelve Basic Principles Underlie Written Cantonese

On closer examination and more specific analysis, we find that at least 12 basic principles underlie the written form of Hong Kong Cantonese and at the same time demonstrate how these five processes function. Taken together, these two approaches to the analysis of written Cantonese can help us better understand its present state and ongoing developments.

1st Principle: Traditional usage of the Chinese characters

Standard Chinese and Cantonese share many vocabulary items; written Cantonese uses the same standard Chinese characters with the same meanings to transcribe these identical lexical items, which generally have the same usages and collocations in both varieties, as indicated in the following examples:

八月 baat3 jyut6 ‘August’
隨時隨地 ceoi4 si4 ceoi4 dei6 ‘at all times and places’
飛機 fei1 gei1 ‘airplane’
女人 neoi5 jan4/2 ‘woman’
銀行 ngan4 hong4 ‘bank (financial institution)’
星期一 sing1 kei4 jat1 ‘Monday’

2nd Principle: Phoneticization (1): Borrowing standard Chinese Characters for Their Pronunciations to Transcribe Cantonese Words

Standard Chinese characters are borrowed solely for their pronunciations and their meanings are completely ignored to transcribe homophonous (or nearly homophonous) but etymologically and semantically-unrelated Cantonese morphosyllables. Two types of phoneticization are distinguished in this analysis, with the first type applying to indigenous Cantonese lexical items, and the second type to English loanwords as described in the following section. This principle of phoneticization is actually one of the six traditional principles of character formation, or 六書 luk6 syu1, namely, 假借 gaa2 ze3 ‘character loan’ (literally, ‘false borrowing’). Some lexical examples of this first type of phoneticization are as follows:

呢度 ni1 dou6 ‘here’ = standard Chinese 這裏 ze3 leoi5; cf. standard Chinese 呢 ne1 ‘heavy woolen cloth’ + 度 dou6 ‘degree; pass’.
唔 m4 ‘no; not’ = standard Chinese 不 bat1; cf. standard Chinese 唔 ng4 ‘exclamation particle’.
邊 bin1 ‘where; which; who’ = standard Chinese 哪 naa3; 誰 seoi4; cf. standard Chinese 邊 bin1 ‘side’.
使 sai2 ‘to spend; to need (negative context)’ = standard Chinese 花 faa1; 需要 seoi1 jiu3; cf. standard Chinese 使 si2, sai2 ‘to send, use, cause’.

3rd Principle: Phoneticization (2): Borrowing Chinese Characters for their Pronunciations to Transcribe English Loanwords

English loanwords are typically represented in written Cantonese via phonetic transliteration: standard (and nonstandard) Chinese characters are borrowed solely for their pronunciations to approximate the pronunciations of the loanwords, and the original meanings of the Chinese characters are ignored in this context (just as in the case of the second principle for indigenous Cantonese words).

Some examples of English loanwords that are phonetically transliterated with standard and nonstandard Chinese characters are as follows:

巴士 baa1 si6/2 ‘bus’ ⇐ bus
波士 bo1 si6/2 ‘boss’ ⇐ boss
的士 dik1 si6/2 ‘taxi’ ⇐ taxi
科文 fo1 man4/2 ‘foreman’ ⇐ foreman
士多 si6 do1 ‘store’ ⇐ store
天拿水 tin1 naa4/2 seoi2 ‘(paint) thinner’ ⇐ thinner
貼士 tip3/1 si6/2 ‘tips’ ⇐ tips
威吔 wai1 jaa2 ‘wire’ ⇐ wire

4th Principle: Indigenization (1): Creation of Cantonese Characters

As has been noted above, Cantonese and Putonghua are two mutually unintelligible Chinese languages; one of the major features contributing to their separation is their significantly different lexicons. Cantonese includes numerous lexical items that are etymologically unrelated to their semantic and functional equivalents in standard written Chinese and Putonghua, so the corresponding standard Chinese characters are not suitable for writing the Cantonese words (except in some special cases, as will be described below). As a result, Cantonese characters have been specially created by Cantonese writers to transcribe indigenous Cantonese morphosyllables which are etymologically unrelated to their semantic and functional equivalents in modern standard Chinese. This kind of character creation typically follows the traditional principles of character formation involving the combination of semantic (radical, also known as semantophore) and phonetic (pronunciation indicator, also known as phonophore) components of Chinese characters. When standard Chinese characters are put to use in this way solely for their pronunciations, their original meanings are ignored. Many Cantonese characters are created by adding the so-called “mouth” radical #30 口 hau2 ‘mouth’ to existing standard Chinese characters as demonstrated by the following examples:

哋 dei2 [⇐ 口 + 地 dei6 ‘earth; land; soil’] (1) ‘suffix for reduplicated stative verbs’, 紅紅哋 hung4 hung4/2 dei2 ‘somewhat red’; (2) dei6 ‘plural marker for pronouns and 人 jan4’, 我哋 ngo5 dei6 ‘we, us’, 你哋 nei5 dei6 ‘you (plural)’, 佢哋 keoi5 dei6 ‘they, them’, 人哋 jan4 dei6 ‘people (in general)’; cf. standard Chinese 們 mun4.
啲 di1 [⇐ 口 + 的 dik1 ‘target’] (1) ‘plural marker for nouns’, 啲學生 di1 hok6 saang1 ‘the students’, cf. standard Chinese 些 se1; (2) ‘marker of comparative degree’, 好啲 hou2 di1 ‘better’, cf. standard Chinese 點 dim2.
嘢 je5 [⇐ 口 + 野 je5 ‘wild’] ‘thing’, cf. standard Chinese 東西 dung1 sai1.
𠵱 ji1 in 𠵱家 ji1 gaa1 ‘now’, cf. standard Chinese 現在 jin6 zoi6.
𡃁 leng1 [⇐ 口 + 靚 Cantonese leng3 ‘pretty’, standard Chinese zing6 ‘to dress up’] ‘young man, teenager; young triad member’.
啱 ngaam1 [⇐ 口 + 岩 ngaam4 ‘rock; cliff’] ‘right; correct; suitable’.
𡁻 zeu6 [⇐ 口 + 趙 ziu6 ‘a surname’] ‘to screw, have sex with (someone); to beat up, strike, hit (someone)’.
咗 zo2 [⇐ 口 + 左 zo2 ‘left (side)’] ‘marker of completed action’, cf. standard Chinese 了 liu5.

Some additional examples of Cantonese characters that have been created to transcribe Cantonese lexical items which are etymologically unrelated to their semantic and functional equivalents in standard Chinese, including some English loanwords, are as follows:

𥄫 gap6 ‘to keep an eye on, fix one's gaze on, closely watch (someone or something)’.
gip1 ‘bag, grip’ ⇐ English grip.
𥇣 gwat6 ‘to glare at (someone or something); to glance at (someone or something)’.
kaat1 ‘card’ ⇐ English card.
佢 keoi5 ‘he, she, it’, cf. standard Chinese 他 taa1.
𠹭 ko1 ‘call’ ⇐ English call.
𨋢 lip1 ‘elevator, lift’ ⇐ British English lift.
冇 mou5 ‘not have’, cf. standard Chinese 沒有 mut6 jau5.
zong1 ‘to peep at, peek at, spy on, surreptitiously look at (someone or something); to steal a glance at (someone or something)’.

5th Principle: Indigenization (2): Use of Nonstandard Chinese Characters

A second type of indigenization is associated with the use of nonstandard Chinese characters: although modern standard Chinese and Cantonese share the same etymologically related morphosyllables, which have the same (or ultimately related) meanings in both varieties, Hong Kong Cantonese writers may write them with variant, nonstandard characters. The number of such variant graphs is not large. This use of variant, nonstandard characters is shown in the following examples:

fu3 ‘trousers, pants’ = standard Chinese 褲子,裤子 kùzi
gaan2 ‘soap; alkali’ = standard Chinese 碱 jiǎn
韮 in 韮菜 gau2 coi3 ‘Chinese chives’ = standard Chinese 韭 jiǔ
杧菓 mong1 gwo2 ‘mango’ = standard Chinese 芒果 mángguǒ
餂, 𧵳 sit6 ‘to lose (money), suffer a loss’ = standard Chinese 蝕, 蚀 shí
zaan6 ‘to earn, make (money)’ = standard Chinese 賺, 赚 zhuàn

6th Principle: Indigenization (3): Standard Chinese Characters with Nonstandard Usage

The third kind of indigenization is the use in written Cantonese of some standard characters whose meanings are similar or identical in both varieties, but in written Cantonese these characters have developed quite different meanings, usages, and collocations than are ordinarily associated with them in standard Chinese as indicated by the following examples:

係 hai6 ‘to be’ = standard Chinese 是 shì ‘to be’ (= Cantonese si6; 係 xì ‘to be’).
飲 jam2 ‘to drink’ = standard Chinese 喝 hē ‘to drink’ (= Cantonese hot3; 飲 yǐn ‘to drink’).
衫 saam1 ‘clothing; dress; shirt’ = standard Chinese 衣服 yīfu (= Cantonese ji1 fuk6; 衫 shān ‘upper garment’).
食 sik6 ‘to eat’ = standard Chinese 吃 chī (= Cantonese hek3; 食 shí ‘to eat’).
睇 tai2 ‘to see, watch, look at, gaze at, observe; to read (silently)’ = standard Chinese 看 kàn (= Cantonese hon3; 睇 ‘to look askance, cast a sidelong glance’).

7th Principle: Indigenization (4): Revival of Old Chinese Characters

Some Chinese characters which have come to be thought of as “Cantonese characters” have not necessarily been created by Cantonese-speakers. This fourth kind of indigenization refers to the revival (or recycling) in written Cantonese of some old, abandoned Chinese characters that occurred in earlier stages of the Chinese language and that can be found recorded in old dictionaries of various kinds, for example, 《說文解字》Shuōwén Jiězì (100, 121), 《廣韻》 Guǎngyùn (1008), 《集韻》 Jíyùn (1037), 《康熙字典》 Kāngxī Zìdiǎn (1716; Zhang et al 1987), etc. Such old characters, for whatever reasons, have fallen by the wayside over time and are rarely or never used in modern standard written Chinese. However, some of them may still be listed as entries in dictionaries but specially marked as 方 fong1 ‘dialectal’ for the benefit of users who speak regional Chinese varieties, such as Cantonese (e.g. 《新华字典》Xīnhuá Zìdiǎn 1972; Yao 2000). In addition, the meanings of some old Chinese characters have changed with the passage of time as far as their usage is concerned in Hong Kong Chinese and Hong Kong written Cantonese; some examples of old characters follow below:

畀 bei3/2 ‘to give’ = standard Chinese 給 kap1, gěi (畀 ‘to give’ from Zhou Dynasty 900–700 BCE; Karlgren 1957:141).
入禀 jap6 ban2 ‘to bring lawsuit to a court’ = standard Chinese 提起訴訟 tíqǐ sùsòng (in standard Chinese 禀 ban2, bǐng ‘to report to a superior’)

8th Principle: Indigenization (5): Reading Chinese Characters for their Meanings and Not Their Pronunciations

The fifth kind of indigenization is observed when Cantonese writers employ some modern standard Chinese characters that are read (i.e. pronounced) with semantically-equivalent but etymologically-unrelated colloquial Cantonese morphosyllables which replace the standard Cantonese readings of these standard characters. This phenomenon is the so-called 訓讀 fan3 duk6 (xùn dú) ‘reading the Chinese character for its meaning and not its pronunciation’; this is a common phenomenon in the Japanese language and is referred to as kundoku or kun-yo(mi) ‘reading Chinese characters with Japanese sounds’ (Li 2000:209). Some examples of standard characters being read with semantically-equivalent but etymologically-unrelated colloquial syllables are as follows:

孖 maa1 ‘twin, pair, double’, as in 孖仔 maa1 zai2 ‘twin boys’, 孖女 maa1 neoi5/2 ‘twin girls’ = standard Chinese 孖 ‘twins, two children born from the same pregnancy of the same mother’.
歪 me2 ‘slanting, askew, aslant, awry, crooked, not straight’ = standard Chinese 歪 waai1.
仰 ngong5, 打仰瞓 daa2 ngong5 fan3 ‘to sleep lying on one’s back’ = standard Chinese 仰 joeng5.

In addition, some standard characters can also be read with English loanword syllables which replace their usual standard Cantonese reading pronunciations as indicated below:

泊 paak3 ‘to park’; loan from English park as in 泊車 paak3 ce1 ‘to park a car’ (= standard Chinese 泊 bok6 ‘to anchor, moor (a boat); to anchor alongside shore; (for a boat) to be at anchor’).
阿蛇 aa3 soe4 ‘address term for policemen and teachers; policeman, teacher’; soe4 is loan from English sir; (= standard Chinese 蛇 se4 ‘snake’).

Romanization and Alphabetization in Written Cantonese

Some colloquial Cantonese morphosyllables cannot be etymologically traced back to their original (historical or etymological) Chinese characters, so they lack Chinese characters as their written forms. Because there is a serious mismatch between the Cantonese syllabary with its large inventory of colloquial morphosyllables and the literary syllables associated with the standard Chinese characters as their reading pronunciations, there are simply not enough standard characters with suitable pronunciations that can be borrowed (reused or recycled) to write the etymologically-unrelated Cantonese colloquial morphosyllables. So how can such “unwritable” morphosyllables be transcribed? The solution has been to devise a kind of informal or ad hoc romanization for them.

9th Principle: Alphabetization (1): Use of Ad hoc Romanization

Some indigenous, colloquial Cantonese morphosyllables that are etymologically unrelated to their semantic and functional equivalents in standard Chinese cannot be traced back to and associated with their original Chinese characters, so they lack standard Chinese characters as their written forms. In order to write such morphosyllables Hong Kong Cantonese-speakers who are typically familiar with the English alphabet by having learned to speak English to some degree have been inventing their own ad hoc romanizations to transcribe the Cantonese pronunciations of these seemingly unwritable lexical items but without marking or indicating their tones. Some examples of such commonly occurring romanized Cantonese words are the following:

CHOK (= cok3) ‘to pull with force; to jerk on (sth.); (for a vehicle) to jolt, lurch, suddenly move forward but then suddenly stop; to shake or jerk something up and down; to probe or tease someone so as to get the person to reveal something; (for a person) to look cool or sexy’
CHUR (= coe2) ‘to breathe in deeply, as smoke from a cigarette; to feel breathless, smothered; to make material demands on (someone); (for something) to be difficult’.
HEA (= he3) ‘to be idle, indolent, lazy, laid-back, doing nothing; to hang around, lounge around; to idle away one's time’
JER (= zoe1) ‘young boy’s or man’s penis’

10th Principle: Alphabetization (2): Use of Individual English Letters to Represent Cantonese Morphosyllables

Every letter of the English alphabet has its own Cantonese pronunciation with one or two Cantonese syllables. Cantonese writers borrow letters of the English alphabet according to their English pronunciations to transcribe indigenous, (nearly) homophonous Cantonese morphosyllables (or just their initial consonants) because they lack suitable Chinese characters with which to write them.

D di1 ‘plural marker’, as in 呢D ni1 di1 ‘these’; ‘marker of comparative degree’, as in 好D hou2 di1 ‘better’ (also written as 啲 di1)
E家 ji1 gaa1 ‘now’ (also written as 依, 𠵱 ji1)
J zei1, as in 打J daa2 zei1 ‘to jerk off (i.e. masturbate)’, 戒J gaai3 zei1 ‘to cease the habit of jerking off’, J咗 zei1 zo2 ‘to have jerked off’, J zei1 = romanization of initial consonant of 朘 zoe1 ‘penis’
K ke1 ‘shit’ as in 吔K jaak3 ke1 ‘eat shit’ (also written as 𡲢 ke1)

Individual English letters are also borrowed to represent tabooed morphosyllables which the reader may recognize, and so either pronounce them as such, or instead pronounce the English letters as a kind of euphemism. English letters can function as euphemisms for tabooed Cantonese morphosyllables:

Q kiu1 replaces 𨳊 gau1 or 𨶙 lan2 ‘cock (vulgar term for male sex organ); damn’ as in 麻Q煩 maa4 kiu1 faan4 for 麻𨶙煩 maa4 lan2 faan4 ‘damn troublesome’;
X written in place of 𨳒 diu2 as in X你老母 instead of 𨳒你老母 diu2 nei5 lou5 mou5/2 ‘fuck your mother!’.

11th Principle: Alphabetization (3): Use of Individual English Letters to Write English Loanwords

Letters of the English alphabet are combined together with standard and nonstandard Chinese characters to transcribe the Cantonese syllables that are used in the phonetic transliteration or representation of some English loanwords.

A ei1, as in AA制 ei1 ei1 zai3 ‘to split the bill, as for a meal in a restaurant; to go Dutch treat’ (original meaning of AA ei1 ei1 is uncertain, but likely refers to the two persons eating together but each one paying for his or her own meal).
B bi1, bi4, as in BB女 bi4 bi1 neoi5/2 ‘baby girl’, BB仔 bi4 bi1 zai2 ‘baby boy’, BB bi4 bi1 ⇐ Baby.
K kei1, as in K士 kei1 si6/2 ‘case, as in this is a case for the police’, ⇐ Case.
M em1 in M到 em1 dou3 ‘to menstruate’, M ⇐ Menstruate, Menstruation; M記 em1 gei3 ‘McDonald’s’, M ⇐ McDonald’s; 維他命 M wai4 taa1 ming6 em1, M ⇐ Money.
O ou1 in O記 ou1 gei3 ‘O.C.T.B.’ O ⇐ Organized, i.e. Organized Crime and Triad Bureau 有組織罪案及三合會調查科 jau5 zou2 zik1 zeoi6 on3 kap6 saam1 hap6 wui6/2 diu6 caa4 fo1 (within the Hong Kong Police Force).
T ti1, as in T恤 (or T裇) ti1 seot1 ‘T-shirt’, ⇐ Tee-shirt.

12th Principle: Alphabetization (4): Retention of Original English Spelling of English Loanwords that are Pronounced with Cantonese Morphosyllables

Written Cantonese transcribes some English loanwords with their regular (i.e. usual) English orthography, but Cantonese-speaking readers know to transform (or adapt) such loanwords to the Cantonese sound system by reading them with suitable Cantonese morphosyllables that approximate the original pronunciations of the English loanwords.

CALL ko1 ‘call’ ⇐ call
車CAM ce1 kem1 ‘dash cam, dashboard camera in a motor vehicle’, CAM kem1 ⇐ cam, camera.
DOWNLOAD daang1 lou1 ‘download’ ⇐ download
FACE fei1 si2 ‘face (respect)’ ⇐ face
MAN men1 ‘manly’ ⇐ man, as in 你好似MAN咗好多! nei5 hou2 ci5 men1 zo2 hou2 do1 ‘You seem to have become much more manly!’ (as said by a young woman in admiration of her handsome boyfriend)
OK ou1 kei1 ‘OK, all right’ ⇐ OK, okay
PROJECTOR pou3 zek1 taa2 ‘projector’ ⇐ projector
SIR in 阿SIR aa3 soe4 ‘address term for police officer or teacher; police officer, teacher’; SIR ⇐ sir. However, as indicated above, this same morphosyllable is also written as 蛇 with change in pronunciation of standard Cantonese se4.
VAN wen1, as in 貨VAN fo3 wen1 ‘van for hauling goods’, 綠VAN luk6 wen1 ‘green minibus’; VAN wen1 ⇐ van (also written as 𨋍 wen1, but VAN is much more common as the Cantonese character 𨋍 does not display in Internet search engines)
WARM wom1 ‘warm’ ⇐ warm

What are the Problems of Variation in Written Cantonese that Still Need to be Resolved?

Hong Kong’s written Cantonese language has never been formally standardized, and this is why there is much variation in the way it is written. In Hong Kong there is no formal body of Cantonese-language experts who have been officially appointed by the government and explicitly entrusted with the task of looking after the Cantonese language by standardizing its pronunciation and written form. While grammars, glossaries, dictionaries, and various other kinds of studies of Hong Kong Cantonese have been published over the years and have certainly contributed to the development of its ad hoc de facto standard form, nonetheless, these works have been produced by self-motivated individuals who have been working on their own with or without funding support.

Due to the lack of formal standardization of written Cantonese, there are at least three major outstanding problems that still need to be resolved in order to make the writing of Cantonese more accurate and less ambiguous. These three problems that need resolution can be stated as follows:

1. Two or more graphs are used to transcribe the same morphosyllable:

a. The morphosyllable bei2 ‘to give’ can be found written in at least the following four ways:


b. The morphosyllable di1 ‘plural marker for nouns; marker of comparative degree’ can be found written in at least the following four ways:


c. The word ngaa6 zaa6 ‘to bar the way, obstruct’ can be found written in at least the following three ways:


2. One Chinese character can carry two or more pronunciations, each of which corresponds to and so represents a different morphosyllable and meaning:

a. The character「𠹌」carries at least the following four pronunciations and associated meanings:

(1) lan2 ‘vulgar term for penis’;
(2) lang1 in 溜𠹌 liu1 lang1 ‘uncommon, rare, highly specialized and unusual’;
(3) lang3 in 半𠹌𠼰 bun3 lang3 kang3 ‘half-way’;
(4) nang3 in 𠹌埋一齊 nang3 maai4 jat1 cai4 ‘join together’

b. The character「揼」carries at least the following five pronunciations and associated meanings:

(1) dam1 ‘to delay’;
(2) dam2 ‘to beat, pound’;
(3) dam3 ‘to hang down, let fall’;
(4) dam4 as in 圓揼𡇙 jyun4 dam4 doe4 ‘to be round and full (as the moon)’;
(5) dap6 ‘to beat, thump; to fall; to soak’

3. The Empty Box as Last Resort: Some Morphosyllables Lack Characters as Their Written Forms and so are Written with the Empty Box as 囗:

The Cantonese lexicon includes a number of colloquial morphosyllables with which no Chinese characters can be etymologically associated. As a kind of last resort, the so-called “empty box” 囗 has typically functioned as a place-holder to indicate that no etymological (or original) character has been identified to write the morphosyllables. We may note here that the Cantonese dictionaries by Rao, Ouyang, and Zhou (1997:363) and Zhang, Ni, and Pan (2018) have adopted this convention for representing a number of such Cantonese morphosyllables that seem to lack Chinese characters as their written forms. However, for some morphosyllables colloquial Cantonese characters may have been created or borrowed to write them, but they may not be widely known and used (in the 2016 edition of the dictionary by Rao, Ouyang, and Zhou the authors either created or found a number of colloquial characters with which to write previously unwritable morphosyllables and have included them in the relevant lexical entries). To exemplify the use of the empty box to represent morphosyllables that lack Chinese characters with which to transcribe them, the following lexical items can be cited:

囗 faak3 ‘to whisk’
囗 gong6 ‘crab’s claw’ (= 弶, 蝄)
囗 he3 ‘to hang out, idle away one’s time’
囗 kwaang2 ‘stalk of a plant’ (= 䆲, 框 kwaang1/2)
囗囗 laau2 gaau6 ‘in a mess, topsy-turvy’ (= 嘮)
囗囗 lak1 kak1 ‘(for person) stammering; (for road) to be bumpy’ (= 𬧊揢)
囗 lem2, lim2 ‘to lick’ (= 𬜐)
囗 ngong5 ‘facing upwards’; 打~瞓 ‘to sleep lying on one’s back’ (= 仰 joeng5 ⇒ ngong5)
囗 soe4 ‘to slide down’ (= 𠿬, 𡄽)
囗囗聲 wiu1 wiu1 seng1 ‘sound of siren’

As we can see in the list above of lexical items that are represented by the empty box, nonstandard characters have been created to write some of these colloquial Cantonese morphosyllables, but they may not be widely known and used, or they cannot be input with a computer because they still do not occur in a computer’s fonts. Nonetheless, at the end of the day the problem of unwritable Cantonese morphosyllables can be resolved through the creation or discovery of the characters that are needed.
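The first two problems described above (several graphs for one morphosyllable, and several pronunciations for one graph) can be pictured as two lookups over the same set of attested pairs. The following is only a minimal sketch in Python: the names `ATTESTED`, `build_indexes`, and `ambiguous` are hypothetical and are not part of this dictionary or of any existing tool, and the data are limited to a few pairs cited in this section.

```python
from collections import defaultdict

# (written form, Jyutping) pairs cited in this section; illustrative only.
ATTESTED = [
    ("啲", "di1"), ("D", "di1"),
    ("𠵱", "ji1"), ("依", "ji1"), ("E", "ji1"),
    ("𠹌", "lan2"), ("𠹌", "lang1"), ("𠹌", "lang3"), ("𠹌", "nang3"),
    ("揼", "dam1"), ("揼", "dam2"), ("揼", "dam3"), ("揼", "dam4"), ("揼", "dap6"),
]


def build_indexes(pairs):
    """Return (graphs per morphosyllable, readings per graph)."""
    graphs_by_syllable = defaultdict(set)
    readings_by_graph = defaultdict(set)
    for graph, syllable in pairs:
        graphs_by_syllable[syllable].add(graph)
        readings_by_graph[graph].add(syllable)
    return graphs_by_syllable, readings_by_graph


def ambiguous(index):
    """Keep only the keys that map to more than one value."""
    return {key: sorted(values) for key, values in index.items() if len(values) > 1}


graphs_by_syllable, readings_by_graph = build_indexes(ATTESTED)
# Problem 1: one morphosyllable, several written forms.
print(ambiguous(graphs_by_syllable))
# Problem 2: one written form, several pronunciations.
print(ambiguous(readings_by_graph))
```

Any standardization effort would in effect have to reduce both of these ambiguity lists; the sketch only makes the two kinds of variation visible and does not suggest which form should be preferred.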

What are the Conclusions on Written Cantonese?

The development of written Cantonese is quite extraordinary for at least the following five reasons (although there may be more):

1. Written Cantonese has developed spontaneously, naturally, and concretely in response to the wide-ranging but particular needs and interests of Cantonese speakers who have been determined to write down their Cantonese speech. In Hong Kong written Cantonese coexists and even thrives as a parallel system in brazen competition with modern standard written Chinese. Although texts of written Cantonese may be generally regarded as less serious than those written in standard Chinese, the tradition of writing in Cantonese continues to advance uninhibitedly in Hong Kong and so persists as a pervasive phenomenon.

2. In Hong Kong no officially-appointed body of language experts has ever been formally assigned the task of developing and promoting any standard set of conventions for transcribing Cantonese speech; the conventions of written Cantonese have been evolving informally, sporadically, inconsistently, but also with some degree of agreement among Cantonese writers as they seek and improvise ad hoc solutions to the problems inherent with transcribing topolects.

3. Written Cantonese has still not been formally standardized, it has no officially recognized status in the community, and it is not taught in the schools. Nonetheless, despite all these negative factors that would seem to undermine its existence, schoolchildren and adults have managed to learn it, although informally and haphazardly, and so they know how to read it.

4. Many colloquial Cantonese words are etymologically unrelated to their semantic and functional equivalents in modern standard Chinese, and so these words, along with some English loanwords, have required Cantonese writers to use their ingenuity to devise special graphs (or characters) in order to write them down.

5. Although Hong Kong’s written Cantonese has often been severely condemned and criticized by education authorities, academic experts, and community leaders for undermining (and even corrupting the purity of) the modern standard written Chinese language, paradoxically, written Cantonese continues to thrive in this community.

What are the Cantonese Phonetic Variations Called 懶音 Laan5 Jam1 ‘Lazy Pronunciations’?

This dictionary has transcribed the conservative, standard pronunciation of Cantonese lexical items; however, as users of the dictionary who have learned to speak Cantonese are likely aware, the pronunciations we hear uttered by many Cantonese speakers with whom we come into contact in the course of our daily lives typically differ from the standard forms. For example, where the standard language indicates that words and Chinese characters are pronounced with initial consonant /n-/ (alveolar nasal), as in 你 nei5 ‘you’, many Cantonese speakers are pronouncing this word and other related lexical items with /l-/ (lateral approximant), i.e. lei5. This is just one common example of the so-called Cantonese 懶音 laan5 jam1 “lazy sounds”. What this means is that if readers were to try to look up in the dictionary a word that they assume begins with l-, it may actually be the case that the word begins with n- in the standard language, and so they may also want to check that possibility. In order to help the reader recognize how variation has affected certain initial consonants, rimes, and tones in relation to the standard, the sounds that can vary have all been collected together and listed in Table 1 that follows below. As we observe from Table 1, all components of a syllable can vary, viz., the initial consonant, nuclear vowel, ending consonant, and tone. This table has listed the standard Cantonese pronunciations of words on the left and their variant pronunciations on the right. If readers keep in mind these variations that occur in relation to the standard Cantonese sound system, then their access to the dictionary’s lexical entries can be enhanced.

Table 1. Phonetic variations in the Hong Kong Cantonese sound system.

1. Labialized velar varies with delabialized velar:

gw- ⇒ g-/oC, kw- ⇒ k-/oC (C = either final consonant -ng or -k):
gwo3 ‘to go across’ ⇒ go3
gwong1 ‘bright’ ⇒ gong1
gwok3 ‘nation, country’ ⇒ gok3
kwong4 ‘crazy; violent’ ⇒ kong4

2. Alveolar nasal varies with lateral approximant:

n- ⇒ l-
naa2 ‘female suffix for animals’ ⇒ laa2
𨂾 naam3 ‘to step across’ ⇒ laam3
ni1 ‘this’ ⇒ li1
𢆡 nin1 ‘female breast; milk’ ⇒ lin1

3. Voiceless velar stop varies with glottal fricative:

k- ⇒ h-
keoi5 ‘he, she, it’ ⇒ heoi5
(only this one word)

4. Velar nasal varies with zero initial:

ng- ⇒ 0 (zero)
ngaak1 ‘to cheat, trick’ ⇒ aak1
ngaam1 ‘all right, good’ ⇒ aam1
我哋 ngo5 dei6 ‘we’ ⇒ o5 dei6
外便 ngoi6 bin6 ‘outside’ ⇒ oi6 bin6

5. Zero initial varies with velar nasal initial:

0- ⇒ ng-
aak3/2 ‘bracelet’ ⇒ ngaak3/2
oi3 ‘love’ ⇒ ngoi3

6. Velar nasal syllabic varies with bilabial nasal syllabic:

ng ⇒ m
ng5 ‘five’ ⇒ m5
ng4 ‘surname’ ⇒ m4

7. Velar nasal ending varies with alveolar nasal ending:

-ng ⇒ -n
(1) -aang ⇒ -aan
caang2 ‘orange (fruit)’ ⇒ caan2
(2) -ang ⇒ -an
dang1 ‘lamp’ ⇒ dan1
(3) -ong ⇒ -on
gwong2 ‘broad’ ⇒ gwon2
(4) -oeng ⇒ -oen
hoeng1 ‘fragrant’ ⇒ hoen1
(5) -eng ⇒ -en
teng1 ‘listen’ ⇒ ten1

8. Velar stop ending varies with alveolar stop ending -t or weakens further to glottal stop -ʔ:

(1) -aak ⇒ -aat / -aaʔ
baak3 ‘hundred’ ⇒ baat3 / baaʔ3
(2) -ak ⇒ -at / -aʔ
bak1 ‘north’ ⇒ bat1 / baʔ1
(3) -ek ⇒ -et / -eʔ
sek6 ‘rock, stone’ ⇒ set6 / seʔ6
(4) -oek ⇒ -oet / -oeʔ
goek3 ‘foot, leg’ ⇒ goet3 / goeʔ3
(5) -ok ⇒ -ot / -oʔ
gok3 ‘horn; corner’ ⇒ got3 / goʔ3

9. Rimes of certain high frequency words vary with other rimes:

(1) -i ⇒ -ei
呢個 ni1 go3 ‘this’ ⇒ lei1 go3
(2) -ai ⇒ -ei
lai4 ‘to come’ ⇒ lei4

10. Certain tones vary with certain other tones:

(1) High Level 1 ˥55 ⇒ High Falling ˥˨52
saan1 ‘hill’ ⇒ saan52
(2) High Rising 2 ˨˥25 ⇒ Mid-Low Rising 5 ˨˧23
ji2 ‘chair’ ⇒ ji5
(3) Mid-Low Rising 5 ˨˧23 ⇒ High Rising 2 ˨˥25
ji5 ‘ear’ ⇒ ji2
(4) Mid Level 3 ˧33 ⇒ Mid-Low Rising 5 ˨˧23
si3 ‘try’ ⇒ 考試 haau2 si5

From this list of phonetic variations, we observe that in some cases speakers reduce the articulatory effort needed to produce certain sounds. They may simplify a sound, e.g., the labialized velar initial stop loses its labialization to become the plain velar (gw- ⇒ g-); eliminate a sound completely, e.g., the velar nasal initial is dropped (ng- ⇒ 0); or change a sound to another one that requires less effort, e.g., the velar nasal syllabic varies with the bilabial nasal syllabic (ng ⇒ m), and velar endings vary with alveolar endings (-ng ⇒ -n, -k ⇒ -t). As for variation of tones, the merger of the high rising Tone 2 (˨˥25) and the mid-low rising Tone 5 (˨˧23), with the resulting loss of the contrast between them, simplifies the tone system by eliminating one contrastive tone contour. Despite the extent of these phonetic variations and the loss of some phonological contrasts, speakers’ communication does not seem to have been negatively affected; this is most likely the result of compensating contextual cues, as well as the speakers’ subconscious awareness of these variations.
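Several of the variations in Table 1 are regular enough to be stated as string rewrites on romanized syllables. The following Python sketch is only an illustration and is not part of the dictionary: the `RULES` list, its labels, and the `variants` function are hypothetical names. It covers only variations 1, 2, 4, 6, 7, and the -k ⇒ -t part of 8, restricted to the rimes listed in the table; the remaining variations are word-specific or tonal and are left out.

```python
import re

# Each rule is (label, regex pattern, replacement). A rule operates on one
# romanized syllable that ends in its tone digit, e.g. "gwok3".
# The labels and the choice of rules are assumptions made for illustration.
RULES = [
    ("1 gw-/kw- => g-/k- before o", r"^(g|k)w(?=o)", r"\1"),
    ("2 n- => l-", r"^n(?!g)", "l"),
    ("4 ng- => zero", r"^ng(?=[aeiou])", ""),
    ("6 syllabic ng => m", r"^ng(?=\d)", "m"),
    # Variations 7 and 8 are listed only for rimes whose last vowel
    # letter is a, e or o (-aang, -ang, -ong, -oeng, -eng, and so on).
    ("7 -ng => -n", r"(?<=[aeo])ng(?=\d)", "n"),
    ("8 -k => -t", r"(?<=[aeo])k(?=\d)", "t"),
]


def variants(syllable):
    """Return {rule label: variant form} for each rule that applies.

    Every rule is applied on its own to the original syllable;
    rules are not chained.
    """
    found = {}
    for label, pattern, replacement in RULES:
        changed = re.sub(pattern, replacement, syllable)
        if changed != syllable:
            found[label] = changed
    return found


for word in ["gwok3", "ngaak1", "ng5", "ni1", "hoeng1"]:
    print(word, variants(word))
```

Because each rule is applied separately, a syllable such as gwok3 yields one delabialized form and one alveolar-ending form. The sketch makes no claim about which of the generated forms are actually attested for a given word.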

Cantonese Romanization Jyut6 jyu5 Ping3 jam1 粵語拼音 with Corresponding IPA Symbols [enclosed in brackets]

1. Initial Consonants:

b = [p], p = [pʰ], d = [t], t = [tʰ], g = [k], k = [kʰ], gw = [kʷ], kw = [kʰʷ], m = [m], n = [n], ng = [ŋ], f = [f], s = [s], h = [h], dz = [ts, tʃ], c = [tsʰ, tʃʰ], w = [w], l = [l], j = [j], Ø = [ʔ].

2. Final Consonants:

m = [m], n = [n], ng = [ŋ], p = [p̚], t = [t̚], k = [k̚].

3. Vowels in Rimes:

i = [iː], ing = [eʲŋ], ik = [eʲk]
yu = [yː], yun = [yːn], yut = [yːt]
e = [ɛː], ei = [eʲj], eu = [ɛːw], em = [ɛːm], en = [ɛːn], eng = [ɛːŋ], ek = [ɛːk]
oe = [œː], oem = [œːm], oeng = [œːŋ], oek = [œːk]
eoi = [ɵy], eon = [ɵn], eot = [ɵt]
ai = [ɐj], au = [ɐw], am = [ɐm], an = [ɐn], ang = [ɐŋ], ak = [ɐk]
aa = [aː], aai = [aːj], aau = [aːw], aam = [aːm], aan = [aːn], aang = [aːŋ], aap = [aːp], aat = [aːt], aak = [aːk]
u = [uː], ui = [uːj], un = [uːn], ung = [oŋ], ut = [uːt], uk = [ok]
o = [ɔː], oi = [ɔːj], ou = [ɔw], om = [ɔːm], on = [ɔːn], ong = [ɔːŋ], op = [ɔːp], ot = [ɔːt], ok = [ɔːk].

4. Tones as Jyutping Numbers with Corresponding Tone Categories [followed by corresponding Chao tone letters and tone values]:

1 陰平 Jam1 Ping4 High Level = [˥˥55],
上陰入 Soeng5 Jam1 Jap6 High Stopped = [˥5];
2 陰上 Jam1 Soeng5 High Rising = [˨˥25];
3 陰去 Jam1 Heoi3 Mid Level = [˧˧33],
下陰入 Haa6 Jam1 Jap6 Mid Stopped = [˧˧33];
4 陽平 Joeng4 Ping4 Mid-low Falling = [˨˩21];
5 陽上 Joeng4 Soeng5 Mid-low Rising = [˨˧23];
6 陽去 Joeng4 Heoi3 Mid-low Level = [˨˨22],
陽入 Joeng4 Jap6 Mid-low Stopped = [˨2], [˨˨22].
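The correspondences above can be read as lookup tables. The following Python sketch, which is an illustration rather than a tool supplied with the dictionary, splits a romanized syllable into initial, rime, and tone and looks each part up. The names `INITIALS`, `RIMES`, `TONES`, `STOPPED_TONES`, and `to_ipa` are hypothetical. Where the lists give two values, the sketch arbitrarily keeps the first; it includes only part of the rime list, writes tones as numeric values, and does not handle the syllabic nasals.

```python
import re

# Initials copied from the list above; the empty string is the zero initial.
INITIALS = {
    "b": "p", "p": "pʰ", "d": "t", "t": "tʰ", "g": "k", "k": "kʰ",
    "gw": "kʷ", "kw": "kʰʷ", "m": "m", "n": "n", "ng": "ŋ", "f": "f",
    "s": "s", "h": "h", "dz": "ts", "c": "tsʰ", "w": "w", "l": "l",
    "j": "j", "": "ʔ",
}

# A subset of the rimes listed above (assumption: enough for the examples).
RIMES = {
    "i": "iː", "ing": "eʲŋ", "ik": "eʲk",
    "yu": "yː", "yun": "yːn", "yut": "yːt",
    "e": "ɛː", "ei": "eʲj", "eng": "ɛːŋ", "ek": "ɛːk",
    "oe": "œː", "oeng": "œːŋ", "oek": "œːk",
    "eoi": "ɵy", "eon": "ɵn", "eot": "ɵt",
    "ai": "ɐj", "au": "ɐw", "am": "ɐm", "an": "ɐn", "ang": "ɐŋ", "ak": "ɐk",
    "aa": "aː", "aai": "aːj", "aam": "aːm", "aan": "aːn", "aang": "aːŋ",
    "aap": "aːp", "aat": "aːt", "aak": "aːk",
    "o": "ɔː", "oi": "ɔːj", "ou": "ɔw", "on": "ɔːn", "ong": "ɔːŋ", "ok": "ɔːk",
}

# Tone values for syllables that do not end in a stop.
TONES = {"1": "55", "2": "25", "3": "33", "4": "21", "5": "23", "6": "22"}
# Tone values for stopped syllables (ending in -p, -t, -k).
STOPPED_TONES = {"1": "5", "3": "33", "6": "2"}

# Two-letter initials are tried before one-letter initials.
SYLLABLE = re.compile(r"^(gw|kw|ng|dz|[bpdtgkmnfshcwlj])?([a-z]+)([1-6])$")


def to_ipa(syllable):
    """Convert one romanized syllable, e.g. 'hoeng1', to IPA plus tone value."""
    match = SYLLABLE.match(syllable)
    if match is None:
        raise ValueError(f"not a romanized syllable: {syllable!r}")
    initial = match.group(1) or ""
    rime, tone = match.group(2), match.group(3)
    if rime not in RIMES:
        raise ValueError(f"rime not in this sketch's table: {rime!r}")
    if rime[-1] in "ptk":
        value = STOPPED_TONES.get(tone, TONES[tone])
    else:
        value = TONES[tone]
    return INITIALS[initial] + RIMES[rime] + value


for word in ["hoeng1", "gong2", "bak1", "oi3"]:
    print(word, to_ipa(word))
```

Keeping the stopped tones in a separate table mirrors the way the tone list pairs each stopped category with its level counterpart under the same tone number.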


