
CJK characters

In internationalization, CJK is a collective term for the Chinese, Japanese, and Korean languages, all of which use Chinese characters and derivatives (collectively, CJK characters) in their writing systems. Occasionally, Vietnamese is included, making the abbreviation CJKV, since Vietnamese historically used Chinese characters as well.

The characters are known as hànzì in Chinese, kanji in Japanese (alongside the derived kana syllabaries), hanja in Korean, and Hán tự in Vietnamese (alongside the derived Chữ Nôm).



Character repertoire

Chinese is written almost exclusively in Chinese characters. General literacy requires approximately 4,000 characters, while reasonably complete coverage requires up to 40,000. Japanese uses fewer characters: general literacy in Japan can be expected with about 2,000 characters. The use of Chinese characters in Korea is becoming increasingly rare, although their idiosyncratic use in proper names requires knowledge (and therefore availability) of many more characters.

Other scripts used for these languages, such as bopomofo and the Latin-based pinyin for Chinese, hiragana and katakana for Japanese, and hangul for Korean, are not strictly "CJK characters", although CJK character sets almost invariably include them as necessary for full coverage of the target languages.
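
The distinction between Han characters and the accompanying scripts can be illustrated with a small Python sketch. It uses the standard `unicodedata` module and matches Unicode character-name prefixes; that matching rule, the `SCRIPT_PREFIXES` table, and the `script_of` function are assumptions made here for illustration, not an official script classification.

```python
import unicodedata

# Hypothetical lookup table: Unicode character-name prefix -> script label.
SCRIPT_PREFIXES = {
    "CJK UNIFIED IDEOGRAPH": "Han",
    "HIRAGANA": "hiragana",
    "KATAKANA": "katakana",
    "HANGUL": "hangul",
    "BOPOMOFO": "bopomofo",
    "LATIN": "Latin",
}


def script_of(ch):
    """Return a rough script label for one character, based on its Unicode name."""
    name = unicodedata.name(ch, "")  # "" for characters without a name
    for prefix, script in SCRIPT_PREFIXES.items():
        if name.startswith(prefix):
            return script
    return "other"


# One sample character per script: Han, hiragana, katakana, hangul,
# bopomofo, and a pinyin vowel carrying a tone mark.
for ch in "漢あア한ㄅā":
    print(ch, script_of(ch))
```

Only the first sample is a "CJK character" in the strict sense; the rest belong to the phonetic scripts that CJK character sets carry alongside the Han repertoire. The prefix test is deliberately coarse and, for example, does not cover the CJK compatibility ideographs.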

Until the early 20th century, Literary Chinese was the written language of government and scholarship in Vietnam. Popular literature in Vietnamese was written in the Chữ Nôm script, consisting of borrowed Chinese characters together with many characters created locally. By the end of the 1920s, both scripts had been replaced by writing in Vietnamese using the Latin-based Vietnamese alphabet.

The sinologist Carl Leban (1971) produced an early survey of CJK encoding systems.

Encoding


The number of characters required for complete coverage of all these languages' needs cannot fit in the 256-character code space of 8-bit character encodings; it requires at least a 16-bit fixed-width encoding or a multi-byte variable-length encoding. The 16-bit fixed-width encodings, such as those from Unicode up to and including version 2.0, are now deprecated for two reasons: more characters must be encoded than a 16-bit encoding can accommodate (Unicode 5.0 has some 70,000 Han characters), and the Chinese government requires that software in China support the GB18030 character set.
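
The limit of a 16-bit code space can be seen with a short Python sketch. It compares one Han character inside the 16-bit range with one outside it, using only the codecs shipped with the standard library; the function name `encoded_sizes` and the two sample characters are choices made here for illustration.

```python
def encoded_sizes(ch):
    """Return the code point of one character and its size in several encodings."""
    return {
        "code_point": hex(ord(ch)),
        "fits_in_16_bits": ord(ch) <= 0xFFFF,
        # UTF-16 uses 16-bit code units; characters above U+FFFF need two of them.
        "utf16_units": len(ch.encode("utf-16-be")) // 2,
        "utf8_bytes": len(ch.encode("utf-8")),
        "gb18030_bytes": len(ch.encode("gb18030")),
    }


bmp_han = "\u6f22"                # 漢, within the 16-bit range
supplementary_han = "\U00020000"  # first character of CJK Unified Ideographs Extension B

for ch in (bmp_han, supplementary_han):
    print(encoded_sizes(ch))
```

The second character has a code point above U+FFFF, so it occupies two 16-bit units in UTF-16 and four bytes in both UTF-8 and GB18030. A strictly 16-bit fixed-width encoding has no way to represent it.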

Although CJK encodings have common character sets, the encodings often used to represent them have been developed separately by different East Asian governments and software companies, and are mutually incompatible. Unicode has attempted, with some controversy, to unify the character sets in a process known as Han unification.
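
The incompatibility can be sketched in Python by encoding the same two Han characters with several codecs from the standard library. The sample text and the selection of codecs are assumptions made for illustration; the point is only that a shared character maps to different byte sequences, so bytes read with the wrong codec do not round-trip.

```python
text = "漢字"  # sample text; both characters exist in all the character sets below

codec_names = ["utf-8", "shift_jis", "euc_jp", "big5", "gb18030", "euc_kr"]

# The same characters, encoded once per codec.
encoded = {name: text.encode(name) for name in codec_names}
for name, data in encoded.items():
    print(f"{name:10} {data.hex(' ')}")

# Decoding with the matching codec recovers the text...
round_trips = all(data.decode(name) == text for name, data in encoded.items())

# ...but decoding Shift JIS bytes as GB18030 yields unrelated characters.
misread = encoded["shift_jis"].decode("gb18030", errors="replace")
print(round_trips, misread == text)
```

Every codec produces a distinct byte sequence for the same text, which is why a document's encoding has to be known or declared before its bytes can be interpreted.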

CJK character encodings should consist minimally of Han characters plus language-specific phonetic scripts such as pinyin, bopomofo, hiragana, katakana and hangul.

CJK character encodings include GB18030 and Unicode, among others.

The CJK character sets take up the bulk of the assigned Unicode code space. There is much controversy among Japanese experts on Chinese characters about the desirability and technical merit of the Han unification process used to map multiple Chinese and Japanese character sets into a single set of unified characters.
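
The share of the code space taken by Han characters can be estimated with Python's `unicodedata` module. This is a rough sketch under two assumptions: a character counts as assigned if it has a Unicode name, and a Han character is one whose name starts with "CJK UNIFIED IDEOGRAPH". The counts depend on the Unicode version bundled with the Python interpreter, so no fixed figures are given here.

```python
import unicodedata

named_count = 0  # characters that have a Unicode name in this Python build
han_count = 0    # of those, the unified Han ideographs

for code_point in range(0x110000):
    name = unicodedata.name(chr(code_point), "")
    if not name:
        continue  # unassigned, control, surrogate, or private-use code point
    named_count += 1
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        han_count += 1

print("Unicode data version:", unicodedata.unidata_version)
print("named characters:    ", named_count)
print("unified Han:         ", han_count)
print("share:               ", round(han_count / named_count, 2))
```

The count covers only the unified Han ideographs; adding hangul syllables, kana, and the compatibility ideographs would raise the CJK share further.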

All three languages can be written both left-to-right and top-to-bottom, but are usually considered left-to-right scripts when discussing encoding issues.
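
In Unicode terms, this left-to-right treatment shows up in the bidirectional class of the characters, which can be inspected with Python's `unicodedata.bidirectional`. The sample characters below are chosen here for illustration, with a Hebrew letter added as a right-to-left contrast.

```python
import unicodedata

samples = {
    "Han": "漢",
    "hiragana": "あ",
    "katakana": "ア",
    "hangul": "한",
    "Hebrew (contrast)": "א",
}

# Bidirectional class per sample: "L" is left-to-right, "R" is right-to-left.
bidi_classes = {label: unicodedata.bidirectional(ch) for label, ch in samples.items()}

for label, bidi in bidi_classes.items():
    print(f"{label:18} {bidi}")
```

The CJK samples all report class `L`. Vertical, top-to-bottom layout is left to the rendering layer and is not recorded in the character encoding itself.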

Legal status

According to Ken Lunde, in 1996 the abbreviation "CJK" was a registered trademark of Research Libraries Group[1] (which merged with OCLC in 2006). Justia lists the trademark as being owned by OCLC between 1987 and 2009 but says it has now expired.[2]

References

This article is based on material taken from the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.

  • DeFrancis, John. The Chinese Language: Fact and Fantasy. Honolulu: University of Hawaii Press, 1990. ISBN 0-8248-1068-6.
  • Hannas, William C. Asia's Orthographic Dilemma. Honolulu: University of Hawaii Press, 1997. ISBN 0-8248-1892-X (paperback); ISBN 0-8248-1842-3 (hardcover).
  • Lemberg, Werner. The CJK package for LaTeX2ε: Multilingual support beyond babel. TUGboat, Volume 18 (1997), No. 3, Proceedings of the 1997 Annual Meeting.
  • Leban, Carl. Automated Orthographic Systems for East Asian Languages (Chinese, Japanese, Korean), State-of-the-art Report, Prepared for the Board of Directors, Association for Asian Studies. 1971.
  • Lunde, Ken. CJKV Information Processing. Sebastopol, Calif.: O'Reilly & Associates, 1998. ISBN 1-56592-224-7.

External links

  • CJKV: A Brief Introduction
  • Lemberg CJK article from above, TUGboat18-3
  • On “CJK Unified Ideograph”, from
  • FGA: Unicode CJKV character set rationalization