The Phantom Category of Chinese Characters



Describes how one traditional category of Chinese characters, semantic compounds, is non-existent, and argues that all compound characters are phonosemantic compounds.

Author: Lawrence J. Howell (admin at kanjinetworks dot com)
17 April 2013
Keywords: Chinese characters; categories; semantic compounds; 會意字; networks

Attempts to arrange Chinese characters logically have a long history. I joined the crowd about ten years ago with an online dictionary that highlights word families of characters based on ancient pronunciations.

Typically, however, lexicographers and other compilers present the Chinese characters in groups based on visual cues. An early and famous example is the 說文解字 (Shuōwén Jiězì; early second century CE) of 許慎 (Xǔ Shèn), which distributed characters by 部首 (bùshǒu: headers, or classifiers; habitually mistranslated as "radicals").

The 說文解字 and earlier references describe the characters as being of four types: Pictographs, Ideographs, Phonosemantic Compounds, and Semantic Compounds. (Another two types concern usage, not typology.)

Pictographs and ideographs are self-explanatory. Phonosemantic compounds combine 1) an element suggesting both the character's pronunciation and a concept and 2) a purely semantic element. An example of a series of phonosemantic compounds is 匪菲悲棐扉斐琲腓緋翡蜚誹霏騑鯡, in which the phonosemantic element is 非. In all of these characters, 非 suggests the pronunciation and also a concept: Spread in alignment (to right and left).

"Semantic compounds" is one of many ways the Chinese term 會意字 (huìyìzì; Japanese: 会意文字 kaiimoji) has been rendered in English. [1] The 會意字 category was devised because Xǔ Shèn and the commentators who preceded him lacked data (forms of the characters inscribed in oracle bones and turtle plastrons, excavated only at the end of the nineteenth century CE) and analytical tools (reconstructions of the Old Chinese pronunciations of the characters) available to modern scholars.
As a result, they were hampered in detecting the phonosemantic element in certain characters and in tracing how the phonosemantic element became altered or was lost when variant forms took over as standard ones.

The traditional four-category classification scheme remains the norm even in scholarly treatments of Chinese characters. This holds true despite the fact that, as we will see below, questions about the validity of the semantic compound category have been in circulation for nearly two decades.

This paper argues that 會意字 is a phantom category, devoid of substance as a descriptor for how compound characters were actually formed in ancient China. It concludes that interpretations of Chinese characters and their arrangement into meaningful networks will benefit from eliminating 會意字 from the traditional classification scheme.

Terminological note: This treatment of 會意字 refers to Chinese characters, 漢字 (hànzì: "characters of the Han"). Characters created in Japan (国字: kokuji) are not 漢字, merely modeled after them, and so are excluded from consideration.

Lagging Scholarship

Here follow five statements taken from recently published scholarly papers.

"肥 ... is a Logical-aggregate character 會意字, where the 月 signifies meat and the 巴 signifies a person down on his knees." [2]
"森 (forest) … follows the principle of combined ideogram: it is a stack of three 木 (tree)." [3]
"日 + 月 = 明"; "木 + 木 = 林" [4]
"天 can be decomposed into 一 (one) and 大 (a person standing up, or big)." [5]
"Initially, the character 天 (sky) refers to the head, the primary part of a person, by placing a bar over the character 大 (person, big)." [6]

How Things Actually Work

Let's examine the statements (and mathematical formulas) in reverse order.

The top horizontal line of 天 (Proto-Chinese: *tan [7]) is not 一 *kat "one." Nor is it a bar, but a single-stroke character pronounced *tan that represented the horizon. This now-disappeared character also appears at the bottom of 旦 *tan, which originally indicated the sun rising upon the long, flat line of the horizon. In both 天 and 旦, the horizontal line functioned as the phonosemantic element of the compound character.

How do we arrive at the conclusion that the horizontal line in 天 and 旦 is distinct from 一 *kat, and that its pronunciation was *tan? By inferring, from conceptually related characters, the pronunciations of elements that did not survive as independent characters. Like all *tan characters, 天 and 旦 are informed by the concepts Straight (the concept conveyed by initial T-) and Adhere/Be Proximate (conveyed by final -N). Compare 晨 *tan ("dawn"), which is likewise connected with the sun rising over the horizon in bringing the light of morning.

The variant oracle bone styles of 旦 show a squarish object below 日 the sun, as opposed to the horizontal line in the present style. This object is also seen in some oracle bone variants of 天, although most show the horizontal line. The present state of evidence does not allow us to determine what the object was intended to represent, but what we do know is that as time went on it was dropped in favor of the horizontal line, which was taken as referring to the horizon.
As the completely different pronunciations tell us, this line was etymologically distinct from the independent character 一.

The line and the character 一 were written identically, or nearly so, but normally when an element disappears it is absorbed into another, more common one that in early stages of the characters was clearly graphically distinct.

In the present form of 設 *tat, the original elements (a chisel + a hand holding a bar) have morphed into etymologically unrelated ones (言 *kan and 殳 *tug). The apparent 言+殳 construction prompted the early commentators to reason that because 設 lacks an element that can convey the sound *tat, the character must have been created purely for the semantic value of 言 and of 殳.

Nowadays, however, we can discover the phonosemantic element of 設 *tat by comparing it with a cognate term, 舌 *tat (tongue).

Characters belonging to the *tat word family are informed by the concepts Straight (the concept conveyed by initial T-) and Cut/Divide/Reduce (conveyed by final -T). Representative characters in this word family include 制 (originally: cut/trim to control the growth of a tree) and 折 (originally: chop/cut lumber).

In 舌, the top element was a slender chisel. The addition of 口 *kug mouth indicated the slender, chisel-like tongue protruding from the mouth in speaking. Given that 舌 and 設 are cognates, and that the chisel element in 舌 is the phonosemantic element, we can infer that it performs the same function in 設 as well. As a side note, observe how the chisel in 設 came to be written 言 (speaking), another overlap with 舌.

Chinese Characters are not Mathematical Formulas

Now let's turn to the formulas 日 + 月 = 明 and 木 + 木 = 林.

To assert "日 + 月 = 明" makes sense only on the premise that what is being discussed is the way
明 appears, as opposed to the way 明 was created, which is an entirely different matter.

明 *mang ("bright"; "clear" etc.) originally combined 月 *kuat moon not with 日 *nat sun/day but with 向 *kang (penetrate a window or hole), indicating bright moonlight penetrating an open window. 向 was the phonosemantic element, via consonant shift in the initial from K- to M-.

As we have seen above with respect to character formation, it would be nonsense to claim that 言 + 殳 = 設. The same applies to the equation 日 + 月 = 明.

For its part, 林 *lam is most certainly not 木 *muk + 木 *muk. Rather, 林 is a representation of two trees, devised in accordance with the phonosemantic influence of initial L- (Continuum) and final -M (Encompass). 林 originally indicated (multiple) trees growing alongside each other and encompassing villages, sacred ground and other spaces. If the formation process of 林 seems counter-intuitive, recall the role of ancient readings in revealing how, despite appearances to the contrary, the horizontal line element in 天 and 旦 is not the independent character 一.

With 林 having been brought into the discussion, we can continue working in reverse order to treat the statement: "森 (forest) … follows the principle of combined ideogram: it is a stack of three 木 (tree)."

Attuned as we are to the sounds of the character, the *lam pronunciation of 林 is a tip-off that 林 is not 木 *muk + 木 *muk. The same attention to sound permits us to determine that 森 *sam is not a stack of three 木 *muk but rather a combination of the semantic element 木 and the phonosemantic element 林 *lam. The transition from L- to S- between 林 *lam and 森 *sam represents another example of consonant shift in the initial, as we observed in the K- to M- shift between 向 and 明.

Abbreviated Elements in Compound Characters

Now we arrive at the first statement: "肥 ...
is a Logical-aggregate character 會意字, where the 月 signifies meat and the 巴 signifies a person down on his knees."

Before plunging into an analysis of 肥, let's review the key points.

・ Elements contained in ancient and present forms of compound characters are not necessarily identical.
・ When the elements are not identical, the ancient readings of the characters can provide important interpretive clues.
・ These clues are obtained by analogy with cognate terms.
・ "Cognate terms" refers both to terms pronounced exactly alike and to those in which consonant shift is at work. (Some terms feature vowel shift.)

Here is an additional key point:

・ Elements in compound characters are often abbreviated forms of more complex, cognate characters.

After considering the information presented in the three charts that follow, we will be prepared to discuss the etymology of 肥.

In Chart 1, we can confirm the use of abbreviated forms in compound characters by means of our eyes. Note how part of the phonosemantic element is eliminated when the semantic element is inserted.
Chart 1: Abbreviated Elements Verifiable by Sight

1          2          3
畿 *kar    幾 *kar    田
範 *pam    笵 *pam    車
度 *tag    庶 *tag    又
島 *tog    鳥 *tog    山

1 = Character/Proto-Chinese Pronunciation; 2 = Abbreviated Phonosemantic Element/Proto-Chinese Pronunciation; 3 = Semantic Element

Note: For scholarly reconstructions of the Old Chinese pronunciations of the characters in Chart 1 and in those following, see the Appendix of this paper.

In 畿, the element at lower left has disappeared in being replaced by 田. In 範, the disappeared element is 氵, a reduced form of 水 water. Meanwhile, it may appear that the same element, 灬, vanished when 度 added 又 hand/action indicator to an abbreviated form of 庶 and when 島 added 山 mountain to an abbreviated form of 鳥, but this is not the case. The "lost" 灬 of 庶 is a variant of 火 fire that appears in characters such as 烈, 煮, 然, 煎 and 熏. In 鳥 bird, the four strokes were originally part of the pictograph on which 鳥 is based.

In Chart 2, we can confirm the use of abbreviated forms of elements in compound characters by means of our ears. Note the cognate pronunciations of the compound characters in Column 1 and the abbreviated phonosemantic elements in Column 3. Also note the unrelated pronunciations of the apparent constituent elements in Column 2, the kind of discrepancies that led early analysts to the mistaken conclusion that some characters were created as semantic compounds.

Chart 2: Abbreviated Elements Verifiable by Sound

1           2           3           4
看 *kan     目 *mok     見 *kan     手
旬 *kuan    勹 *pog     匀 *kuan    日
貞 *tang    貝 *puat    鼎 *tang    卜
質 *tat     貝 *puat    實 *tat     斦
討 *tog     寸 *suan    守 *tog     言
豚 *tuan    豕 *tar     彖 *tuan    肉

1 = Character/Proto-Chinese Pronunciation; 2 = Phonosemantic Element in Disguised Form/Proto-Chinese Pronunciation; 3 = Abbreviated Phonosemantic Element/Proto-Chinese Reading; 4 = Semantic Element

Notes: A) Having been devised later in history, 勹 has no proto-Chinese reading; its reading is via 包 *pog, the parent character of 勹.
B) 實 is the traditional form of 実 *tat.

Another example is 尿 *nog (phonosemantic element 尻 *kog), with consonant shift in the initial.

In Chart 3, we can confirm the use of abbreviated forms of elements in compound characters by
how they are attested in ancient forms of the characters. A handy online reference for reproductions of these and other ancient forms is [...]. For 黄 and 原, refer to the bronze inscription style; for the others, the seal inscription style.

Chart 3: Abbreviated Elements Verifiable by Ancient Forms of the Characters

1            2                              3
黄 *kuang    光 *kuang                      Arrow
原 *kuan     AF 泉 *suan                    厂
票 *pog      AF 要 *kog                     火
災 *sag      Rock damming a stream *sag     火
脆 *sat      色 (= AF 絶 *sat)              肉
承 *tang     丞 *tang                       手
壽 *tog      老 *log                        Narrow ridges

1 = Character/Proto-Chinese Pronunciation; 2 = Ancient Phonosemantic Element/Proto-Chinese Pronunciation; 3 = Semantic Element

Note: AF = Abbreviated Form of

Note the pronunciation shift in the initial consonant for 原, 票 and 壽.

Having familiarized ourselves with the use of abbreviated forms of elements in compound characters, we are now prepared to examine 肥.

To say that, with respect to 肥, the 巴 element "signifies a person down on his knees" is not entirely mistaken, although 巴 *pag is a pictograph of a person spread flat on the ground (not "down on his knees"). 巴 signifies "spread" in many characters, including 把, 芭, 杷, 爬, 笆, 耙, 靶, 皅 and 疤, the Proto-Chinese pronunciations of which were all *pag.

The pronunciation *puar for 肥 suggests that the 巴 element in 肥 is not the original form. By inspecting several of the seal-inscription forms of another character cognate with 肥 *puar (配 *puar), we find that in 配 too the element now written 己 was originally identical with the kneeling figure that features in 肥.
That is to say, the kneeling figure element now transcribed as 巴 in 肥 and as 己 in 配 functioned as the phonosemantic element in both compound characters.

Further, just as we saw abbreviated forms of elements at work in the characters listed in Chart 2 (and for some of the characters in Chart 3), so too in 肥 is the 巴 element an abbreviated form, in this case bearing the concept "aligned" that is attached to 配.

The conclusion is that 肥, like 天, 旦, 設, 明, 林 and 森, is by no means "a Logical-aggregate character" but rather, like all compounds, a phonosemantic character.

Additional Examples

Let's inspect two more characters that are generally denoted as semantic compounds: 休 and 好.

As it happens, 休 and 好 are phonosemantic compounds via a mechanism now familiar to us from the discussion above: The use of abbreviated elements.

休 *kog ("rest; stop") is 人 person with 木 being an abbreviated form of the phonosemantic element 槁 *kog (a dehydrated or withered tree): A person resting against a withered tree.

As for 好 *kog ("like"; "prefer"), 女 woman is the semantic element and 子 is the phonosemantic element.
In ancient times there were (at least) two characters used to represent a child. One (*sag) was a pictograph of a small child. The other (*sog) was a pictograph of a small child with (rapidly) growing hair. 子 is the present-day form of the *sag character: The *sog character went out of use, and so eventually did the reading *sog, but not before it influenced the pronunciation of 好. In 好 *kog, the *sog character is the phonosemantic element, via consonant shift in the initial (S- to K-).

Reasons for the Absence of 會意字 Among Chinese Characters

It has been posited [8] that the apparent absence of a phonosemantic element in the characters regarded as semantic compounds can be explained by secondary readings borne by these characters in ancient times. This proposition is almost as much a phantom as the 會意字 category itself.

To be sure, it was not unknown for characters to possess secondary readings in ancient times. As the example of 好 suggests, it was also not unknown for the secondary reading of a phonosemantic element to influence a compound character, but such cases are extremely rare.

The mechanisms at work in the transformation or loss of the phonosemantic element in the characters traditionally regarded as semantic compounds are quite transparent. There is no need to conjure up unattested readings to account for the apparent absence of phonosemantic elements in these characters.

The mechanisms detailed above include:

・ Absorption of Unrelated Character/Element: 設肥思
・ Abbreviated Phonosemantic Element: 畿範度島看旬貞質討豚尿 etc.
・ Consonant Shift: 明森苗尿原票壽 etc.

Here are others.
(PE = Phonosemantic Element; PC = Phonosemantic Compound Character)

・ PE Lost in Character Simplification: 栄/営/蛍 (榮/營/螢)
・ Unrelated Element Substituted for the PE: 閉 ( →必 才), 仙
・ Entire Character a Variant of a PC: 岩 (巖/巌), 国/國, 法
・ Character Devised from Part of a PC: 早 (← 朝)
・ Consonant Shift in the Final: 半 *puan (PE is 八 *puat)
・ PE Died Out and was not Absorbed into another Character: 事 (PE was a bamboo tube)
・ PE Unrecognized on Account of Rarity: 炭 (← 屵)

(Note: In 半, the slanting strokes were originally written as 八)

Identifying the lost/transformed phonosemantic element in all alleged semantic compounds is labor-intensive but productive, and well worth the effort.

Conclusion

As a descriptor for how compound characters were actually formed in ancient China, the 會意字 category is meaningless. All characters assigned to this category originated as phonosemantic compounds; the apparent absence of a phonosemantic element is accounted for by a variety of transformation mechanisms.

Supplement: Application to Creation of Elemental Character Networks

What practical difference arises from how we interpret Chinese characters? Let's look at another piece of data that appears in the recent scholarly publication entitled "Chinese character structure analysis based on complex networks." [9]

The abstract of this paper states, "We ... simulate the formation of Chinese phono-semantic
characters using bipartite graph theory. The bipartite graph model generates non-Poisson distributions and disassortative mixing as the empirical networks, which effectively explain the origin and formation of phono-semantic characters." [10]

But how effectively can the origin and formation of phonosemantic characters be explained when a single network groups etymologically unrelated characters together?

Figure 2 of the paper offers 田 *tan in a network with 苗 *mog and 思 *sag. Such a network is completely flawed: The only way it can be significant is if the element 田 is the common factor. However, in 苗 *mog, 田 *tan (field) is the semantic element, with 艸 *sog the phonosemantic element (in the reduced form 艹) via shift in the initial consonant.

Much worse, the 田 element in 思 is not "field" at all but rather derives from a pictograph showing a profusion of fine bones in fontanels (open spaces in an infant's skull [over which the skull bones eventually fuse]). This fontanels element was once an independent character with the pronunciation *kag, as can be deduced from the phonosemantic compound 兒 *kag (the traditional form of 児) where the semantic element is 儿 *nan (human figure).

In other words, the *kag fontanels element has been transformed in 思, coming to be written 田. As we know from the discussions of 明, 設, 肥 and 配, this sort of transformation is commonplace.

To find the flawed network of 田, 苗 and 思 in a scholarly work that purports to "... effectively explain the origin and formation of phono-semantic characters" is astonishing.

Suggestions for Elemental Networks of Chinese Characters

How should constructors of elemental networks of Chinese characters proceed? To ensure that the networks are based not on (apparent) similarities in the present-day forms but on actual etymological relations, it is necessary to expunge all data that assumes the viability of the 會意字 category, elucidating the phonosemantic nature of all compound characters.
Taking a network centering on 田 as an example, the only characters qualifying as proper members are those in which 田 functions as the phonosemantic element (佃, 甸, 沺, 畋, 鈿 etc.).

As for characters such as 苗 and 思, in which 田 is either the semantic element or a replacement element for a character no longer in existence, students may find it helpful if network creators distinguish them by color or other method and cross-reference them to their appropriate networks.

Notes

[1] Other English glosses of the term 會意字 include logical aggregate characters, ideogrammic compounds, associative compounds, compound indicatives, compound ideographs, combined ideograms, and semasiographs.
[2] Ho Cheong Lam: "A Critical Analysis of the Various Ways of Teaching Chinese Characters"; Electronic Journal of Foreign Language Teaching 2011, Vol. 8, No. 1, pp. 57–70; Page 12 of electronic version.
[3] Xiaoyong Yan, Ying Fan et al: "Efficient learning strategy of Chinese characters based on network approach"; March 8, 2013; Page 7 of electronic version.
[4] David Al-dabass and Manling Ren: "Interweaving of Syntax and Semantics in Algorithms For Recognising Chinese Characters"; Undated; Page 1 of pdf version.
[5] Xiaoyong Yan et al op. cit. Page 1
[6] Xiaoyong Yan et al op. cit. Page 7
[7] Proto-Chinese readings in this paper are reconstructions by the author and his research collaborator, Hikaru Morimoto.
[8] Boltz, William G. "The Origin and Early Development of the Chinese Writing System." 1994: American Oriental Society, pages 104–110, ISBN 0-940490-18-8.
[9] Jianyu Li and Jie Zhou: "Chinese character structure analysis based on complex networks." Physica A: Statistical Mechanics and its Applications, Volume 380, 1 July 2007, Pages 629–638. Figure 2 as noted at Abstract Page.
[10] Ibid.
Appendix: Scholarly Transcriptions of Old Chinese Terms Listed in Charts 1-3

1 = Character
2 = Schuessler, Axel: "ABC Etymological Dictionary of Old Chinese."
3 = Starostin StarLing Database Server Chinese Characters
4 = Baxter-Sagart Old Chinese reconstruction, version of 20 February 2011; William H. Baxter and Laurent Sagart; order: by radical and stroke

(The StarLing Database Server gives two readings: Preclassic Old Chinese and Classic Old Chinese. When the two are the same, the reading is listed only once.)

1    2                    3                          4
畿   *gəi                 g(h)ǝj                     (None)
幾   *kəi                 kǝj                        *kəj
範   (None)               (None)                     *bomʔ {*[b](r)omʔ}
笵   (None)               b(h)(r)amʔ; b(h)(r)am      (None)
度   *dâkh                dāk                        *dˤak {*[d]ˤak}
庶   (None)               (None)                     *s-tak-s
島   *tûʔ                 tūʔ; tūm                   (None)
鳥   *tiûʔ                (None)                     (None)
看   *khâns               khāns (~ -rs); khānh       *kʰˤar-s {*kʰˤa[r]-s}
目   *muk                 mhuk                       *C.muk {*C.m(r)[u]k}
見   *kêns                kēns; kēnh                 *kˤen-s {*[k]ˤen-s}
旬   *s-win               whin                       *s-ɢʷin {*s-[ɢ]ʷi[n]}
匀   *win                 (None)                     *ɢʷiŋ {*[ɢ]ʷi[ŋ]}
貞   *treŋ                (None)                     *treŋ
貝   *pâts < *pops ?      pāts; pāc                  (None)
鼎   *têŋʔ                tēŋʔ; tēŋ                  *tˤeŋʔ
質   (None)               tit                        *t<r>ip-s {*[t]<r>ip-s}
實   *m-dit ?             lit                        *mə.lit {*mə.li[t]}
討   (None)               thūʔ; thūm                 *tʰˤuʔ
寸   *tshûns              shūns; shwǝ• nh            *tsʰˤun-s {*[tsʰ]ˤu[n]-s}
守   *-uʔ                 tuʔ; tu                    *s.tuʔ-s
豚   *dûn                 (None)                     *lˤunʔ
豕   *lheʔ or *lhaiʔ      (None)                     *l†ajʔ
彖   (None)               ƛōns (~ -rs) ƛwānh         *l†ˤon-s {*l†ˤo[n]-s}
黃   *wâŋ                 ghʷāŋ; ghwāŋ               *N-kʷˤaŋ
光   *kwâŋ                kʷāŋ; kwāŋ                 *kʷˤaŋ
原   *ŋwan                ŋor                        *ŋʷar {*[ŋ]ʷar} (< uvular)
泉   *dzwan (!)           ʒuar; ʒwan                 *s-N-ɢʷar I!
票   (None)               phew                       *pʰew
要   (None)               ʔew                        *ʔew-s
災   *tsə”                cǝ• (~ c-); cǝ•            *tsˤə {*[ts]ˤə}
脆   (None)               chots; chwac               (None)
承   *dəŋ                 d(h)ǝŋ                     *dəŋ {*[d]əŋ}
丞   *dəŋ                 d(h)ǝŋ                     *m-təŋ
壽   (None)               d(h)uʔ; d(h)u              duʔ-s {*[d]uʔ-s}
老   *rûʔ                 rhūʔ; rhūm                 *C-rˤuʔ
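
As a rough illustration of the proposal made under "Suggestions for Elemental Networks of Chinese Characters," the following Python sketch groups characters by the phonosemantic element this paper assigns to them, rather than by the element visible in the present-day form. The class name Entry, the function name build_networks and the role labels are hypothetical names introduced only for this example. The character data come from the discussion of 田 in this paper; the placeholder string standing in for the lost fontanels element is an assumption, since that element has no present-day character.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    char: str      # compound character
    visible: str   # element visible in the present-day form
    role: str      # role of the visible element: "phonosemantic", "semantic" or "replacement"
    true_pe: str   # phonosemantic element according to the analysis in this paper


# Data taken from the 田 example; the fontanels label is a placeholder (assumption).
ENTRIES = [
    Entry("佃", "田", "phonosemantic", "田"),
    Entry("甸", "田", "phonosemantic", "田"),
    Entry("沺", "田", "phonosemantic", "田"),
    Entry("畋", "田", "phonosemantic", "田"),
    Entry("鈿", "田", "phonosemantic", "田"),
    Entry("苗", "田", "semantic", "艸"),
    Entry("思", "田", "replacement", "fontanels element (*kag)"),
]


def build_networks(entries):
    """Return (members, cross_refs).

    members maps each phonosemantic element to the characters that properly
    belong to its network. cross_refs maps a visible element to characters
    that merely look as if they belong, with a pointer to their own network.
    """
    members = defaultdict(list)
    cross_refs = defaultdict(list)
    for e in entries:
        members[e.true_pe].append(e.char)
        if e.role != "phonosemantic":
            cross_refs[e.visible].append((e.char, e.role, e.true_pe))
    return dict(members), dict(cross_refs)


members, cross_refs = build_networks(ENTRIES)
print("Proper members of the 田 network:", members["田"])
for char, role, pe in cross_refs["田"]:
    print(f"{char}: 田 is a {role} element here; see the network of {pe}")
```

In this sketch, 苗 and 思 never enter the 田 network itself. They are kept in a separate cross-reference list, which corresponds to the suggestion that network creators mark such characters distinctly and point to their appropriate networks.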