Romanization of Tajik

The Tajik language has been written in three alphabets over the course of its history: an adaptation of the Perso-Arabic script (specifically the Persian alphabet), an adaptation of the Latin script, and an adaptation of the Cyrillic script. Any script used specifically for Tajik may be referred to as the Tajik alphabet, which is written in Tajik as follows: Persian alphabet: الفبای تاجیکی, Cyrillic: алифбои тоҷикӣ, Latin: alifboi toçikī.

The use of a specific alphabet generally corresponds with stages in history: the Perso-Arabic script was used first, followed by Latin for a short period and then Cyrillic, which remains the most widely used alphabet in Tajikistan. A related language, Judæo-Tajiki, spoken by the Bukharan Jews, traditionally used the Hebrew alphabet but today is more often written in the Cyrillic variant.

Political context

As in many post-Soviet independent states, the change in writing system and the debate surrounding it are closely intertwined with political themes. In simple terms, the Latin script, although it has not been used since the adoption of Cyrillic, is supported by those who wish to bring the country closer to Uzbekistan.[1] The Persian alphabet is supported by the devoutly religious and by Islamists, along with those who wish to bring the country closer to Iran and to their Persian heritage. As the current de facto standard, the Cyrillic alphabet is generally supported by those who wish to maintain the status quo and not distance the country from Russia.


History

As a result of the influence of Islam in the region, Tajik was written in the Persian alphabet up to the 1920s. Until then, the language was not thought of as separate and was simply considered a dialect of Persian. The Soviets began by simplifying the Persian alphabet in 1923, before moving to a Latin-based system in 1927.[2] The Latin script was introduced by the Soviet Union as part of an effort to increase literacy and to distance the population, largely illiterate at that time, from Islamic Central Asia. There were also practical considerations. The regular Persian alphabet, being an abjad, does not provide sufficient letters for representing the vowel system of Tajik. In addition, the abjad is more difficult to learn, since each letter has different forms depending on its position in the word.[3]

The Decree on Romanisation made this law in April 1928.[4] The Latin variant for Tajik was based on the work of Turcophone scholars who aimed to produce a unified Turkic alphabet,[5] despite Tajik not being a Turkic language. The literacy campaign was successful, with near-universal literacy achieved by the 1950s.

As part of the "Russification" of Central Asia, the Cyrillic script was introduced in the late 1930s. The alphabet remained Cyrillic until the end of the 1980s, when the Soviet Union began to disintegrate. In 1989, with the growth of Tajik nationalism, a law was enacted declaring Tajik the state language. The law also officially equated Tajik with Persian, placing the word "Fârsi" (the local name for Persian) after Tajik, and called for a gradual reintroduction of the Persian (Arabic) alphabet.

The Persian alphabet was introduced into education and public life, although the banning of the Islamic Renaissance Party in 1993 slowed its adoption. In 1999, the word "Fârsi" was removed from the state language law.[6] As of 2004 the de facto standard in use was a Cyrillic alphabet,[7] and as of 1996 only a very small part of the population could read the Persian alphabet.[8]


The letters of the major variants of the Tajik alphabet are presented below, along with their phonetic values. There is also a comparative table below.

Persian alphabet

A variant of the Persian alphabet (technically an abjad) is used to write Tajik. In the Tajik version, as in all other versions of the Arabic script, vowels other than 'ا' (alef) are not given unique letters but are optionally indicated with diacritic marks.

The Tajik alphabet in Persian
ر ذ د خ ح چ ج ث ت پ ب ا
/ɾ/ /z/ /d/ /χ/ /h/ /tʃ/ /dʒ/ /s/ /t/ /p/ /b/ /o/
ق ف غ ع ظ ط ض ص ش س ژ ز
/q/ /f/ /ʁ/ /ʔ/ /z/ /t/ /z/ /s/ /ʃ/ /s/ /ʒ/ /z/
ی ه و ن م ل گ ک
/j/ /h/ /v/ /n/ /m/ /l/ /ɡ/ /k/



Latin alphabet

The Latin script was introduced after the Russian Revolution in order to increase literacy and distance the language from Islamic influence. Only lowercase letters were found in the first versions of the Latin variant, between 1926 and 1929. A slightly different version was used by the Jews of Central Asia, with three extra characters for phonemes not found in the other dialects, among them ů and ə̧.[9]

The Tajik alphabet in Latin
A a B b C c Ç ç D d E e F f G g Ƣ ƣ H h I i Ī ī
/a/ /b/ /tʃ/ /dʒ/ /d/ /e/ /f/ /ɡ/ /ʁ/ /h/ /i/ /ˈi/
J j/Y y K k L l M m N n O o P p Q q R r S s Ş ş T t
/j/ /k/ /l/ /m/ /n/ /o/ /p/ /q/ /ɾ/ /s/ /ʃ/ /t/
U u Ū ū V v X x Z z Ƶ ƶ '
/u/ /ɵ/ /v/ /χ/ /z/ /ʒ/ /ʔ/

The unusual character Ƣ is called Gha and represents the phoneme /ʁ/. The character is found in the Uniform Turkic alphabet, in which most non-Slavic languages of the Soviet Union were written until the late 1930s. The Latin alphabet is not used today, although its adoption is advocated by certain groups.[10]

Transliteration standards

The standards for transliterating the Tajik Cyrillic alphabet into the Latin alphabet are as follows:

Cyrillic ISO 9 (1995) (1) KNAB (1981) (2) WWS (1996) (3) ALA-LC (4) IPA Allworth (5) BGN/PCGN (6)
А а a a a a /a/ a a
Б б b b b b /b/ b b
В в v v v v /v/ v v
Г г g g g g /ɡ/ g g
Ғ ғ ƣ gh gh /ʁ/ gh gh
Д д d d d d /d/ d d
Е е e e, ye e e /je, e/ ye‐, ‐e‐ e
Ё ё jo yo ë ë /jɒ/ yo yo
Ж ж ž zh zh ž /ʒ/ zh zh
З з z z z z /z/ z z
И и i i i i /i/ i i
Ӣ ӣ ī ī ī ī /i/ ī í
Й й j y ĭ j /j/ y y
К к k k k k /k/ k k
Қ қ ķ q q ķ /q/ q q
Л л l l l l /l/ l l
М м m m m m /m/ m m
Н н n n n n /n/ n n
О о o o o o /ɒ/ o o
П п p p p p /p/ p p
Р р r r r r /r/ r r
С с s s s s /s/ s s
Т т t t t t /t/ t t
У у u u u u /u/ u u
Ӯ ӯ ū ū ū ū /ø/ ū ŭ
Ф ф f f f f /f/ f f
Х х h kh kh x /χ/ kh kh
Ҳ ҳ h x /h/ h h
Ч ч c ch ch č /tʃ/ ch ch
Ҷ ҷ ç j j č̦ /dʒ/ j j
Ш ш š sh sh š /ʃ/ sh sh
Ъ ъ ' ' ' ' /ʔ/ " '
Э э è è, e ė è /e/ e ė
Ю ю ju yu i͡u ju /ju/ yu yu
Я я ja ya i͡a ja /ja/ ya ya

Notes to the table above:

  1. ISO 9 — The International Organization for Standardization ISO 9 specification.
  2. KNAB — From the placenames database of the Institute of the Estonian Language.
  3. WWS — From World’s Writing Systems, Bernard Comrie (ed.)
  4. ALA-LC — The standard of the Library of Congress and the American Library Association.
  5. Edward Allworth, ed. Nationalities of the Soviet East. Publications and Writing Systems (NY: Columbia University Press, 1971)
  6. BGN/PCGN — The standard of the United States Board on Geographic Names and the Permanent Committee on Geographical Names for British Official Use.
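
As a rough illustration of how such a table is applied, the sketch below transliterates Cyrillic Tajik using the values in the last (BGN/PCGN) column above. The names `BGN_PCGN` and `transliterate` are made up for this example, and the capitalisation rule (only the first letter of a digraph is capitalised) is an assumption, not something taken from the standard itself.

```python
import unicodedata

# Values copied from the BGN/PCGN column of the table above.
_TABLE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "ғ": "gh", "д": "d",
    "е": "e", "ё": "yo", "ж": "zh", "з": "z", "и": "i", "ӣ": "í",
    "й": "y", "к": "k", "қ": "q", "л": "l", "м": "m", "н": "n",
    "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ӯ": "ŭ", "ф": "f", "х": "kh", "ҳ": "h", "ч": "ch", "ҷ": "j",
    "ш": "sh", "ъ": "'", "э": "ė", "ю": "yu", "я": "ya",
}
# Normalise the keys so precomposed and decomposed input both match.
BGN_PCGN = {unicodedata.normalize("NFC", k): v for k, v in _TABLE.items()}


def transliterate(text: str) -> str:
    """Replace each Tajik Cyrillic letter with its Latin value."""
    text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        latin = BGN_PCGN.get(ch.lower())
        if latin is None:
            out.append(ch)                  # punctuation, spaces, digits
        elif ch.isupper():
            out.append(latin.capitalize())  # assumption: "Ш" -> "Sh"
        else:
            out.append(latin)
    return "".join(out)


print(transliterate("Тамоми одамон озод ба дунё меоянд"))
```

A plain character-by-character lookup is enough here because every Cyrillic letter in this column maps to a fixed Latin string; a column with position-dependent values such as "ye‐, ‐e‐" would need extra context rules.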


Cyrillic alphabet

The Cyrillic script was introduced for Tajik in the Tajik Soviet Socialist Republic in the late 1930s, replacing the Latin script that had been in use since the late 1920s. After 1939, materials published in Persian in the Persian alphabet were banned from the country.[11] The alphabet below was supplemented by the letters Щ and Ы in 1952.

The Tajik alphabet in Cyrillic
А а Б б В в Г г Д д Е е Ё ё Ж ж З з И и Й й К к
/a/ /b/ /v/ /ɡ/ /d/ /e/ /jɒ/ /ʒ/ /z/ /i/ /j/ /k/
Л л М м Н н О о П п Р р С с Т т У у Ф ф Х х Ч ч
/l/ /m/ /n/ /o/ /p/ /ɾ/ /s/ /t/ /u/ /f/ /χ/ /tʃ/
Ш ш Ъ ъ Э э Ю ю Я я Ғ ғ Ӣ ӣ Қ қ Ӯ ӯ Ҳ ҳ Ҷ ҷ
/ʃ/ /ʔ/ /e/ /ju/ /ja/ /ʁ/ /ˈi/ /q/ /ɵ/ /h/ /dʒ/

In addition to these thirty-five letters, the letters ц, щ, and ы can be found in loan words, although they were officially dropped in the 1998 reform, along with the letter ь. The 1998 reform also changed the order of the alphabet, which now has the characters with diacritics following their unaltered partners, e.g. г, ғ and к, қ,[12] leading to the present order: а б в г ғ д е ё ж з и ӣ й к қ л м н о п р с т у ӯ ф х ҳ ч ҷ ш ъ э ю я. In 2010 it was suggested that the letters е, ё, ю and я might be dropped as well.[13]
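
The post-1998 order matters for sorting, because a plain sort by Unicode code point places ғ, қ, ҳ and the other extended letters after я. The following sketch, which assumes nothing beyond the order quoted above, builds a sort key from that order; `ORDER`, `RANK` and `tajik_sort_key` are names invented for the example.

```python
import unicodedata

# Post-1998 alphabet order as quoted in the paragraph above.
ORDER = [
    unicodedata.normalize("NFC", letter)
    for letter in (
        "а б в г ғ д е ё ж з и ӣ й к қ л м н о п р "
        "с т у ӯ ф х ҳ ч ҷ ш ъ э ю я"
    ).split()
]
RANK = {letter: i for i, letter in enumerate(ORDER)}


def tajik_sort_key(word: str) -> tuple:
    """Map a word to a tuple of alphabet positions for use with sorted()."""
    word = unicodedata.normalize("NFC", word).lower()
    # Assumption: characters outside the alphabet sort after all letters.
    return tuple(RANK.get(ch, len(ORDER) + ord(ch)) for ch in word)


words = ["қадам", "кадом", "ҳофиз", "хондан", "ғор", "гул"]
print(sorted(words, key=tajik_sort_key))
```

With this key, ғор sorts directly after гул and қадам directly after кадом, matching the rule that letters with diacritics follow their unaltered partners.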

The alphabet includes a number of letters not found in the Russian alphabet:

Letter Description Phoneme
Ғ Г with bar /ʁ/
Ӣ И with macron /ij/
Қ К with descender /q/
Ӯ У with macron /ɵ/
Ҳ Х with descender /h/
Ҷ Ч with descender /dʒ/

During the period when the Cyrillicization took place, Ӷ ӷ also appeared a few times in the table of the Tajik Cyrillic alphabet.[14]


Hebrew alphabet

Like the Persian alphabet, the Hebrew alphabet is an abjad. It is used for Bukhori, a dialect of Tajik spoken by the Bukharan Jews in Samarqand and Bukhara. Since 1940, when the Bukharan Jewish schools in Central Asia were closed, the Hebrew alphabet has fallen into disuse outside Hebrew liturgy, and Bukharan Jewish publications such as books and newspapers have appeared in the Tajik Cyrillic alphabet. Today, many older Bukharan Jews who speak Bukhori and went to Tajik or Russian schools in Central Asia know only the Tajik Cyrillic alphabet for reading and writing Bukhori and Tajik.

The Tajik alphabet in Hebrew
ג״ ג׳ ג גּ בּ ב איֵ איִ אוּ אוׄ אָ אַ
/dʒ/ /tʃ/ /ʁ/ /ɡ/ /b/ /v/ /e/ /i/ /u/ /ɵ/ /o/ /a/
מ ם ל כּ ךּ כ ך י ט ח ז ז ו ה ד
/m/ /l/ /k/ /χ/ /j/ /t/ /ħ/ /ʒ/ /z/ /v/ /h/ /d/
ת שׁ ר ק צ ץ פּ ףּ פ ף ע ס נ ן
/t/ /ʃ/ /r/ /q/ /s/ /p/ /f/ /ʔ/ /s/ /n/

Sample text: דר מוקאבילי זולם איתיפאק נמאייד. מראם נאםה פרוגרמי פירקהי יאש בוכארייאן. – Дар муқобили зулм иттифоқ намоед. Муромнома – пруграми фирқаи ёш бухориён.[15]


Tajik Latin, Tajik Cyrillic and Persian alphabet

Latin: Tamomi odamon ozod ba dunjo meojand va az lihozi manzilatu huquq bo ham barobarand. Hama sohibi aqlu viçdonand, bojad nisbat ba jakdigar barodarvor munosabat namojand.

Cyrillic: Тамоми одамон озод ба дунё меоянд ва аз лиҳози манзилату ҳуқуқ бо ҳам баробаранд. Ҳама соҳиби ақлу виҷдонанд, бояд нисбат ба якдигар бародарвор муносабат намоянд.

Persian: تمام آدمان آزاد به دنیا می‌آیند و از لحاظ منزلت و حقوق با هم برابرند. همه صاحب عقل و وجدانند، باید نسبت به یکدیگر برادروار مناسبت نمایند.

English: All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.

For reference, the Persian-script version transliterated letter-for-letter into the Latin script appears as follows:

tmạm ậdmạn ậzạd bh dnyạ my̱ ậynd w ạz lḥạẓ mnzlt w ḥqwq bạ hm brạbrnd. hmh ṣḥb ʿql w wjdạnnd, bạyd nsbt bh ykdygr brạdrwạr mnạsbt nmạynd.

And the Cyrillic version transliterated[16] into the Latin script:

Tamomi odamon ozod ba dunyo meoyand va az lihozi manzilatu huquq bo ham barobarand. Hama sohibi aqlu vijdonand, boyad nisbat ba yakdigar barodarvor munosabat namoyand.

Tajik Cyrillic and Persian alphabet

Vowel-pointed Persian marks the vowels that are not usually written.

Cyrillic: Баниодам аъзои як пайкаранд, ки дар офариниш зи як гавҳаранд. Чу узве ба дард оварад рӯзгор, дигар узвҳоро намонад қарор. Саъдӣ
Vowel-pointed Persian: بَنی‌آدَم اَعضایِ یَک پَیکَرَند، که دَر آفَرینِش زِ یَک گَوهَرَند. چو عُضوی به دَرد آوَرَد روزگار، دِگَر عُضوها را نَمانَد قَرار. سعدی
Persian: بنی‌آدم اعضای یک پیکرند، که در آفرینش ز یک گوهرند. چو عضوی به درد آورد روزگار، دگر عضوها را نماند قرار. سعدی

Cyrillic: Мурда будам, зинда шудам; гиря будам, ханда шудам. Давлати ишқ омаду ман давлати поянда шудам. Мавлавӣ
Vowel-pointed Persian: مُرده بُدَم، زِنده شُدَم؛ گِریه بُدَم، خَنده شُدَم. دَولَتِ عِشق آمَد و مَن دَولَتِ پایَنده شُدَم. مَولَوی
Persian: مرده بدم، زنده شدم؛ گریه بدم، خنده شدم. دولت عشق آمد و من دولت پاینده شدم. مولوی
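
The relationship between the vowel-pointed and the ordinary Persian spelling can be shown mechanically: removing the optional vowel marks from the pointed text yields the everyday form. The sketch below assumes that the marks in question are the Arabic-script diacritics in the Unicode range U+064B to U+0652; the name `strip_vowel_points` is hypothetical. It deliberately avoids Unicode decomposition so that the letter آ (alef with madda) is left intact.

```python
# Short-vowel and related marks of the Arabic script (fathatan .. sukun).
VOWEL_POINTS = {chr(cp) for cp in range(0x064B, 0x0653)}


def strip_vowel_points(text: str) -> str:
    """Drop the optional vowel diacritics, keeping every base letter."""
    return "".join(ch for ch in text if ch not in VOWEL_POINTS)


# Pointed spelling of the word written "санг" in Cyrillic:
# seen, fatha, noon, gaf.
pointed = "\u0633\u064E\u0646\u06AF"
print(strip_vowel_points(pointed))
```

The reverse direction, adding the vowel marks to unpointed text, cannot be done by a simple filter like this, since the unpointed spelling does not record which vowels were omitted.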

Comparative table

The table below compares the different writing systems used for Tajik. The Latin is based on the 1929 standard, the Cyrillic on the revised 1998 standard, and the Persian letters are given in their stand-alone forms.

Cyrillic Latin Persian Phonetic Value (IPA) Examples
А а A a َ, اَ /a/ санг= سنگ = سَنگ
Б б B ʙ /b/ барг = برگ = بَرگ
В в V v و /v/ номвар = نامور = ناموَر
Г г G g گ /ɡ/ санг= سنگ = سَنگ
Ғ ғ Ƣ ƣ /ʁ/ ғор = غار, Бағдод = بغداد = بَغداد
Д д D d /d/ модар = مادر = مادَر, Бағдод = بغداد = بَغداد
Е е E e ی /e/ шер = شیر, меравам = می‌روم = می‌رَوَم
Ё ё Jo jo یا /jɔ/ дарё = دریا, осиёб = آسیاب
Ж ж Ƶ ƶ ژ /ʒ/ жола = ژاله, каждум = کژدم = کَژدُم
З з Z z ﺽ, ﻅ, ﺫ, ﺯ /z/ баъз = بعض, назар = نَظَر, заҳоб = ذَهاب, замин = زَمیِن
И и I i اِ, ِ /i/ ихтиёр = اِختیار
Ӣ ӣ Ī ī ی /ˈi/ зебоӣ = زیبائی
Й й J j یْ, ی /j/ май = مَی
К к K k ک /k/ кадом = کَدام
Қ қ Q q /q/ қадам = قَدَم
Л л L l /l/ лола = لاله
М м M m /m/ мурдагӣ = مُردَگیِ
Н н N n /n/ нон = نان
О о O o ا, آ /ɔ/ орзу = آرزو
П п P p پ /p/ панҷ = پَنج
Р р R r /ɾ/ ранг = رَنگ
С с S s ﺙ, ﺹ, ﺱ /s/ сар = سَر, субҳ = صُبح, сурайё = ثُرَیاَ
Т т T t ﺕ, ﻁ /t/ тоҷик = تاجیک, талаб = طَلَب
У у U u اُ, ُ /u/ дуд = دُود
Ӯ ӯ Ū ū او, و /ɵ/ хӯрдан = خوردَن, ӯ = اُو
Ф ф F f /f/ фурӯғ = فُروُغ
Х х X x /χ/ хондан = خواندَن
Ҳ ҳ H h /h/ ҳофиз = حافِظ
Ч ч C c چ /tʃ/ чӣ = چی
Ҷ ҷ Ç ç /dʒ/ ҷанг = جَنگ
Ш ш Ş ş /ʃ/ шаб = شَب
ъ ' /ʔ/ таъриф = تعریف
Э э E e ای /e/ эй = ای
Ю ю Ju ju یُ, یو /ju/ июн = اِیون
Я я Ja ja یه, یَ /ja/ ягонагӣ = یَگانَگی

References

  1. Sociolinguistic Changes in Transformed Central Asian Societies
  2. Keller, S. (2001) To Moscow, Not Mecca: The Soviet Campaign Against Islam in Central Asia, 1917-1941
  3. Soviet Language Policy in Central Asia
  4. Khudonazar, A. (2004) "The Other" in Berkeley Program in Soviet and Post-Soviet Studies, November 1, 2004.
  5. Perry, J. R. (2005) A Tajik Persian Reference Grammar (Boston: Brill) p. 34
  6. Siddikzoda, S. "Tajik Language: Farsi or not Farsi?" in Media Insight Central Asia #27, August 2002
  7. UNHCHR, Committee for the Elimination of Racial Discrimination, Summary Record of the 1659th Meeting: Tajikistan. 17 August 2004. CERD/C/SR.1659
  8. Library of Congress Country Study: Tajikistan
  9. Perry, J. R. (2005) A Tajik Persian Reference Grammar (Boston: Brill) p. 35
  10. Sociolinguistic Changes in Transformed Central Asian Societies
  11. Perry, J. R. (1996) "Tajik literature: Seventy years is longer than the millennium" in World Literature Today, Vol. 70 Issue 3, p. 571
  12. Perry, J. R. (2005) A Tajik Persian Reference Grammar (Boston: Brill) p. 36
  13. Судьба «русских букв» в таджикском алфавите будет решаться
  14. Ido, S. (2005) Tajik (München: Lincom GmbH) p. 8
  15. Rzehak, L. (2001) Vom Persischen zum Tadschikischen. Sprachliches Handeln und Sprachplanung in Transoxanien zwischen Tradition, Moderne und Sowjetunion (1900-1956) (Wiesbaden: Reichert)
  16. IBM, International Components for Unicode, ICU Transform Demonstration


  • Goodman, E. R. (1956) "The Soviet Design for a World Language." in Russian Review 15 (2): 85-99.

