Transcription and Transliteration

Transcription or transliteration is the conversion of the characters of a text from one writing system to another. For instance, a text written in Cyrillic, Japanese or Greek characters can be converted to the Latin writing system. Conversion to the Latin writing system is often called romanization or latinization. The reverse direction is also possible, namely the conversion of a Latin text to another writing system. Transcriptions are used, for instance, to make a text readable for someone who cannot read its original writing system, or to render proper names in another writing system.

The following summary shows step by step how to carry out such a conversion quickly and easily, explains the difference between transliterations and transcriptions, and covers everything else worth knowing. The software used is the Text Converter, with which you can apply conversions to many files at the same time. The Text Converter comes with many built-in transcriptions and transliterations, each of which can be applied with a single click. Furthermore, you can create any customized conversion with your own rules or edit the predefined ones.

Difference between Transcription and Transliteration

A transcription converts the characters of one language to the characters of another language in accordance with the pronunciation of the target language. Hence, different rules can apply to one and the same source language, depending on the target language (e.g. German or English). An advantage of transcription is that everybody who speaks the target language can read and pronounce the converted words of the source language correctly. A disadvantage is that a transcription often cannot be inverted unambiguously. The Text Converter supports, for example, Cyrillic transcription rules for English and German.

In contrast, a transliteration assigns each character of the source writing system a unique character or character combination of the target writing system, so that an exact inversion is possible. If the source writing system has more characters than the target one, combinations of characters (for example KA for Hiragana か) and diacritics (for example Š for Cyrillic Ш) can be used. Although pronunciation is less important in a transliteration than in a transcription, it is often possible to recognize the pronunciation from the characters used. The Text Converter supports some transliterations according to ISO standards, for example Cyrillic-Latin (ISO 9), Arabic-Latin (ISO 233), Hebrew-Latin (ISO 259), Greek-Latin (ISO 843), Japanese-Latin (ISO 3602), Georgian-Latin (ISO 9984) and Thai-Latin (ISO 11940).

It follows that a transcription reflects the phonetic articulation of words, that is, how the words are pronounced. By contrast, a transliteration carries the written characters of a word over from one writing system to another, character by character. The Arabic word كتب ("books") would therefore be "kutub" in the transcription and "ktb" in the transliteration.
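
The distinction can be sketched in a few lines of Python. The two tables are small illustrative subsets (the first uses ISO 9 values, the second a common English-style spelling); the names convert and invert are hypothetical and not part of the Text Converter.

```python
# Illustrative subsets only; not the Text Converter's actual rule lists.
# Transliteration: one Latin character per Cyrillic letter (ISO 9 values).
TRANSLITERATION = {"ш": "š", "щ": "ŝ", "ч": "č", "а": "a"}
# Transcription: spelled the way an English reader would pronounce it.
TRANSCRIPTION = {"ш": "sh", "щ": "shch", "ч": "ch", "а": "a"}


def convert(text: str, table: dict[str, str]) -> str:
    """Replace every character that has an entry in the table."""
    return "".join(table.get(ch, ch) for ch in text)


def invert(table: dict[str, str]) -> dict[str, str]:
    """Swap source and target; only safe for single-character targets."""
    return {target: source for source, target in table.items()}


# Transliteration keeps "щ" and "шч" apart, so the round trip is exact.
latin = convert("шчащ", TRANSLITERATION)
restored = convert(latin, invert(TRANSLITERATION))

# Transcription maps both "щ" and "шч" to the same Latin string,
# so the original spelling can no longer be recovered from it.
same = convert("щ", TRANSCRIPTION) == convert("шч", TRANSCRIPTION)
```

Here restored equals the original Cyrillic input, while same is true: the transcription is easier to read aloud but loses information.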

Default Transliterations

In the Text Converter, you can easily apply some well-known transliterations. After opening some files, click on "Transcription and Transliteration" in the "Actions" column on the right side of the main window. There you can choose the transliteration in the "Profiles" box. After clicking on a text file in the file list (the input area of the software), you should immediately see the result of the transliteration, which you can then save.

The following transliterations are available in the Text Converter. With the option "Invert Transcription", you can carry out a re-transliteration. For instance, this option makes it possible to transliterate from Latin to Cyrillic instead of from Cyrillic to Latin. Please note that often not every Latin character has a counterpart in the target writing system, in which case the character is left unchanged. If you want to avoid this, use a transcription with full Latin alphabet support (see next section). The disadvantage is that such a transcription often cannot be inverted unambiguously.
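
The effect of inverting a transliteration can be sketched as follows. This is only an illustration under assumptions: the table is a hypothetical six-letter subset with ISO 9 values, and convert_and_report is an invented helper, not a function of the Text Converter.

```python
# Hypothetical subset of a Cyrillic-to-Latin table (ISO 9 values).
CYRILLIC_TO_LATIN = {"к": "k", "о": "o", "т": "t", "а": "a", "с": "s", "и": "i"}


def invert(table):
    """Turn a Cyrillic-to-Latin table into a Latin-to-Cyrillic one."""
    return {latin: cyrillic for cyrillic, latin in table.items()}


def convert_and_report(text, table):
    """Convert known characters and collect the letters left unchanged."""
    converted = []
    leftover = set()
    for ch in text:
        if ch in table:
            converted.append(table[ch])
        else:
            converted.append(ch)  # no counterpart: keep the character
            if ch.isalpha():
                leftover.add(ch)
    return "".join(converted), sorted(leftover)


result, untouched = convert_and_report("taxi kot", invert(CYRILLIC_TO_LATIN))
```

In this sketch the Latin letter x has no counterpart in the inverted table, so it stays in the output as it is and is reported in untouched.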

  • Cyrillic - Latin (ISO 9): Detailed transliteration of Cyrillic letters to Latin letters according to the ISO 9 standard.
  • Arabic - Latin (ISO 233): Transliteration of Arabic characters to Latin according to the ISO 233 standard.
  • Hebrew - Latin (ISO 259): Transliteration of Hebrew characters to Latin according to the ISO 259 standard.
  • Greek - Latin (ISO 843): Transliteration of Greek letters to Latin letters according to the ISO 843 standard.
  • Japanese - Latin (ISO 3602): Transliteration of Hiragana and Katakana to Latin letters according to the ISO 3602 standard.
  • Georgic - Latin (ISO 9984): Transliteration of Georgian letters to Latin letters according to the ISO 9984 standard.
  • Thai - Latin (ISO 11940): Transliteration of Thai to Latin according to the ISO 11940 standard.
  • Cyrillic - Latin (Scientific): Scientific transliteration of the Cyrillic letters to Latin (see this table).

Default Transcriptions

Transcriptions are applied in the Text Converter in the same way as transliterations. After adding some files, go to the action "Transcription and Transliteration" and select the desired transcription in the "Profiles" box. The result of the transcription should then be visible in the preview.

The following transcriptions can be selected in the Text Converter. Because some writing systems have no unique equivalent for every single Latin character (letters like Q or X, for example, often do not occur), substitute spellings are used (for example KS for X) so that all 26 letters of the Latin alphabet can be transcribed. This can distort the text slightly and, of course, lead to ambiguity. If you want to avoid these problems, use one of the transliterations (previous section). They do not have these problems, but often not all characters can be converted.
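
The trade-off of a "Complete ABC" profile can be illustrated with a short sketch. The two tables are hypothetical subsets chosen for this example (they are not the Text Converter's lists), and the helper names are invented.

```python
import string

# Hypothetical base table: Latin letters with a direct Cyrillic counterpart.
BASE = {"a": "а", "k": "к", "s": "с", "t": "т", "v": "в"}
# Substitute spellings for letters without a counterpart, e.g. KS for X.
FALLBACK = {"x": "кс", "q": "к", "w": "в"}


def convert(text, table):
    """Replace every character that has an entry in the table."""
    return "".join(table.get(ch, ch) for ch in text)


def missing_letters(table):
    """Return the Latin letters a table cannot convert."""
    return [c for c in string.ascii_lowercase if c not in table]


def colliding_targets(table):
    """Group Latin letters that end up as the same target string."""
    by_target = {}
    for source, target in table.items():
        by_target.setdefault(target, []).append(source)
    return {t: s for t, s in by_target.items() if len(s) > 1}


complete = {**BASE, **FALLBACK}
collisions = colliding_targets(complete)
```

With the fallback rules added, x can be converted, but k and q (and v and w) now share a target, and convert("tax", complete) gives the same result as convert("taks", complete). This is the ambiguity that makes an exact inversion impossible.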

  • Hindi - Latin: Transcription of Hindi letters to Latin letters, see this table.
  • Cyrillic - Latin (German): Transcription of Cyrillic letters to Latin letters in accordance with German pronunciation. Summary in this table.
  • Cyrillic - Latin (English): Transcription of Cyrillic letters to Latin letters in accordance with English pronunciation. Summary in this table.
  • Latin - Arabic (Complete ABC): Transcription of Latin characters to Arabic characters. Because Arabic script normally does not write short vowels, this transcription often cannot be inverted unambiguously.
  • Latin - Georgic (Complete ABC): Exact inversion of the ISO 9984 standard. This is possible here because the ISO standard has a unique Georgian character for each Latin one.
  • Latin - Greek (Complete ABC): Transcription of the Latin characters to Greek characters. See this table.
  • Latin - Hebrew (Complete ABC): Transcription of Latin letters to Hebrew letters. Problematic, since Hebrew script normally does not write vowels, which complicates a re-transcription. See this table for an overview.
  • Latin - Hindi (Complete ABC): Inversion of the transcription "Hindi - Latin" with the missing characters added.
  • Latin - Japanese (Hiragana, Complete ABC): Inversion of ISO 3602 with the missing letters added. Problematic, since Japanese does not have a character for every single letter and often only has syllables such as MA, MI, MU, ME and MO. Hence, syllables must often be used for single letters (for example SU for S, because there is no single S in Hiragana) to be able to transcribe the whole alphabet. Of course, a text transcribed in this way loses its pronunciation, but there is no alternative.
  • Latin - Japanese (Katakana, Complete ABC): The same difficulties as with Hiragana.
  • Latin - Cyrillic (German, Complete ABC): Inversion of the transcription of Cyrillic letters to Latin letters according to German pronunciation, with all missing letters added.
  • Latin - Cyrillic (English, Complete ABC): Inversion of the transcription of Cyrillic letters to Latin letters according to English pronunciation, with all missing letters added.
  • Latin - Cyrillic (Scientific, Complete ABC): Inversion of the scientific transliteration, with the missing letters added.
  • Latin - Thai (Complete ABC): Inversion of the ISO 11940 standard, with the missing letters added.
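
The Hiragana difficulty mentioned in the list can be reproduced with three rules. This is a minimal sketch: the rule list is an assumed excerpt built from the SU-for-S example above, not the actual profile shipped with the Text Converter.

```python
# Assumed excerpt of a Latin-to-Hiragana profile: syllables first,
# then the single-letter substitute (SU for S).
TO_HIRAGANA = [("ma", "ま"), ("su", "す"), ("s", "す")]
# The way back only knows the syllables.
TO_LATIN = [("ま", "ma"), ("す", "su")]


def apply_rules(text, rules):
    """Apply each rule to the whole text, in list order."""
    for old, new in rules:
        text = text.replace(old, new)
    return text


kana_from_mas = apply_rules("mas", TO_HIRAGANA)
kana_from_masu = apply_rules("masu", TO_HIRAGANA)
back = apply_rules(kana_from_mas, TO_LATIN)
```

Both "mas" and "masu" become the same two kana, so converting back yields "masu" in either case. The single S cannot be recovered, which is the loss described for these profiles.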

Customized Transcriptions

Besides the transcriptions and transliterations introduced in the last two sections, which can be applied with a single click in the Text Converter, you can also create and use any customized transcriptions. These transcriptions are managed under "Transcription and Transliteration > Customize", where you can also edit the existing profiles. When working with customized profiles, keep the following points in mind:

  • All transcriptions are saved as normal text files (format is arbitrary) in a folder named "Transliteration" in the same directory as the Text Converter. All files saved in this folder are loaded when the software starts and are displayed in the "Profiles" list, where you also select the predefined profiles. So you can choose your own transcriptions in the same way as any other profile. All saved transcriptions are also shown in the profile manager, so that you can work on them at any time.
  • The text files all have the same structure. Each line contains one transcription rule, in which the old and the new string are separated by the character |. For instance, a line with "A|B" means that an A is replaced by a B in this transcription. It is also possible to replace strings by nothing ("A|") or to replace character combinations ("CH|X"). Note that the lines are processed in order, from the first to the last line. Hence, if you want to introduce a rule like "CH|X" but also have rules for the single letters C and H, the rule for CH must appear before the rules for C and H, because otherwise CH will already have been replaced letter by letter.
  • The text files can be edited with any editor, or you can use the profile manager under "Transcription and Transliteration > Customize", where you can work on all currently loaded files as well as all predefined transcriptions and transliterations. This window also lets you look at the predefined rules and edit them if necessary.
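
The rule format and the importance of the line order can be tried out with a small sketch. It assumes that each rule is applied to the whole text before the next rule is read, which matches the first-to-last processing described above; parse_rules and apply_rules are hypothetical names, not part of the Text Converter.

```python
def parse_rules(profile_text):
    """Turn the lines of a profile into (old, new) pairs, in file order."""
    rules = []
    for line in profile_text.splitlines():
        if "|" not in line:
            continue  # assumption: lines without a separator are ignored
        old, new = line.split("|", 1)
        if old:  # an empty left-hand side cannot be searched for
            rules.append((old, new))
    return rules


def apply_rules(text, rules):
    """Apply each rule to the whole text, first line first."""
    for old, new in rules:
        text = text.replace(old, new)
    return text


# CH is listed before C and H, so it is still intact when its rule runs.
good_order = parse_rules("CH|X\nC|K\nH|")
# C and H are listed first, so CH is already gone when its rule runs.
bad_order = parse_rules("C|K\nH|\nCH|X")

good_result = apply_rules("CHOC", good_order)
bad_result = apply_rules("CHOC", bad_order)
```

With the CH rule first, "CHOC" becomes "XOK". With the single-letter rules first, the C is turned into K and the H is deleted before the CH rule is reached, so the result is "KOK".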

With these points in mind, you can quickly and simply create your own transcriptions, for example for individual purposes or for completely different writing systems, because the Text Converter supports all Unicode characters. If you create a list in the Text Converter, it appears in the profiles after you save it with "Save". The predefined transcriptions cannot be overwritten, so to change one you have to save the list under a new name. To do so, simply type a new name into the box in which you select the profiles. The new or changed profile is saved under this name in the directory "Transliteration".
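
If you prefer to generate a profile file with a script instead of the profile manager, the following sketch shows one way to do it. The .txt extension, the UTF-8 encoding and the function names are assumptions made for this example; the example writes to a temporary folder rather than to a real installation directory.

```python
from pathlib import Path
import tempfile


def save_profile(folder: Path, name: str, rules: list[tuple[str, str]]) -> Path:
    """Write one 'old|new' rule per line into <folder>/<name>.txt."""
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{name}.txt"  # assumption: .txt extension
    lines = [f"{old}|{new}" for old, new in rules]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")  # assumption: UTF-8
    return path


def list_profiles(folder: Path) -> list[str]:
    """Return the profile names, i.e. the file names without extension."""
    return sorted(p.stem for p in folder.iterdir() if p.is_file())


# A temporary directory stands in for the Text Converter's program folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "Transliteration"
    saved = save_profile(folder, "My Cyrillic", [("CH", "Ч"), ("A", "А")])
    content = saved.read_text(encoding="utf-8")
    profiles = list_profiles(folder)
```

The multi-character rule is written first, in line with the ordering requirement for rules such as CH, C and H.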

The function "Inverse List" in the same window offers another possibility. It inverts the complete list at once, so that you can create a modified re-transcription: the rule "A|B" becomes the rule "B|A". If you have created a transcription that could be useful to others, you are welcome to send us the list for publication in the next version of the Text Converter.
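
What such an inversion does to a rule list can be sketched as follows. How the Text Converter treats rules with an empty right-hand side is not documented here, so skipping them is an assumption of this example, and invert_rules is a hypothetical name.

```python
def invert_rules(rules):
    """Swap both sides of every rule: ("A", "B") becomes ("B", "A")."""
    inverted = []
    for old, new in rules:
        if new == "":
            continue  # assumption: a deletion rule has nothing to invert
        inverted.append((new, old))
    return inverted


rules = [("CH", "X"), ("A", "B"), ("Q", "")]
inverse = invert_rules(rules)
```

After inverting, it is worth checking the order of the new list again, because rules whose new left-hand side is longer than one character may have to be moved in front of the single-character rules.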