WordCreator

Syllable Lists

This article explains how the syllable lists in the WordCreator are structured and how you can create or adapt your own syllable lists.

If you have never worked with the WordCreator, we recommend reading the introduction first.

Structure of Syllable Lists

The WordCreator always uses the syllable list that is currently set in the "Used Syllables" box in the main window. The syllables in this box can be edited freely at any time.

Each line of the syllable list defines one syllable. The structure of each line is as follows:

<Frequency><Space><Syllable>

The line begins with a number indicating the frequency of the syllable (more on this later). This is followed by a space dividing frequency and syllable. Finally comes the syllable itself. This can consist of any characters, including spaces, and can be of any length. It therefore doesn't matter whether you use single letters, any combination of letters, or longer sequences of any characters as your syllable.
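The line format can be illustrated with a short Python sketch. The function name `parse_line` is hypothetical and not part of the WordCreator; the sketch only assumes that the first space separates the frequency from the syllable, so any further spaces belong to the syllable itself. It handles purely numeric frequencies only.

```python
def parse_line(line: str) -> tuple[int, str]:
    """Split '<Frequency><Space><Syllable>' at the first space only."""
    frequency, syllable = line.split(" ", 1)
    return int(frequency), syllable


print(parse_line("1 A"))     # (1, 'A')
print(parse_line("12 X Y"))  # (12, 'X Y')
```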

Let us have a look at an example:

1 A
1 B

With this syllable list, we work with the elements "A" and "B". Both elements should occur with the same frequency, so we have used the same number for both.

The numbers do not have to meet any particular criteria. They do not need to add up to a specific sum, nor do they need a specific length or size. Only the ratio between the numbers matters: if two numbers are the same, the letters and syllables defined after them will occur with the same probability; if the numbers differ, the elements will be weighted accordingly.

In other words, the example above could also be written using higher numbers:

129 A
129 B

Here we are using the number 129 instead of 1. However, the generated words would be similar because the ratio between the numbers is the same.
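Why only the ratio matters can be checked with a small calculation. The following Python sketch assumes that, without any further rules, a syllable is picked with a probability proportional to its frequency; the helper `probabilities` is hypothetical and only serves as an illustration.

```python
from fractions import Fraction


def probabilities(weights: dict[str, int]) -> dict[str, Fraction]:
    """Turn relative frequencies into exact selection probabilities."""
    total = sum(weights.values())
    return {syllable: Fraction(weight, total) for syllable, weight in weights.items()}


small = probabilities({"A": 1, "B": 1})
large = probabilities({"A": 129, "B": 129})
print(small == large)  # True: both lists give A and B a probability of 1/2
```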

Syllables with different Frequencies

In the next example, we would like to weight two letters differently:

2 A
6 B

Here, the number for B is three times as high as the number for A. Therefore, under ideal circumstances, B will occur three times as often as A.

This holds, of course, only as long as the rules for readability do not restrict it. If you would like to create readable words and the only elements are A and B, the only possible words are those in which A and B alternate. In other words, generating readable words with this syllable list will indeed result in more words beginning with B than with A, but because the letters alternate, they will still occur with roughly equal frequency overall, especially in long words.

Hence, the probability distribution becomes easier to see with longer lists and fewer rules, as in the next example:
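The effect of alternation can be made concrete with a few lines of Python. The sketch assumes that, for a list containing only A and B, the readability rules reduce to strict alternation; the function `alternating_word` is a hypothetical stand-in, not the WordCreator's actual generator.

```python
def alternating_word(first: str, length: int) -> str:
    """Build a word from A and B in which the two letters strictly alternate."""
    other = "B" if first == "A" else "A"
    return "".join(first if i % 2 == 0 else other for i in range(length))


word = alternating_word("B", 8)
print(word)                              # BABABABA
print(word.count("A"), word.count("B"))  # 4 4
```

Whatever the weights are, they can only influence which letter comes first; the rest of the word is fixed by the alternation.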

100 A
100 B
001 E

In this list, we have defined the elements A, B and E, where A and B should occur with the same probability and E much less frequently. With this list, you will get words like ABAB or BABA far more often than words containing an E.

By the way, we have used leading zeros in "001" in the list above (the same applies to the lists available in the WordCreator). The leading zeros only serve clarity: they align the syllables in the same column so that the list is easier to read. Otherwise, the leading zeros have no special meaning, so you could also write just "1" instead of "001" with the same effect.
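The following Python sketch reads the three lines above and shows both points: leading zeros do not change the weight, and E is rare. It assumes that letters are drawn independently in proportion to their frequency and ignores the readability rules, so the numbers are only a rough guide.

```python
from fractions import Fraction

lines = ["100 A", "100 B", "001 E"]

# int() ignores leading zeros, so "001" and "1" yield the same weight.
weights = {}
for line in lines:
    frequency, syllable = line.split(" ", 1)
    weights[syllable] = int(frequency)

total = sum(weights.values())
p_e = Fraction(weights["E"], total)
# Chance that none of four independently drawn letters is an E.
p_no_e = (1 - p_e) ** 4

print(weights["E"])  # 1
print(p_e)           # 1/201
print(float(p_no_e))
```

Under these assumptions, roughly 98 out of 100 four-letter words would contain no E at all.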

Combination of different Types of Syllables

As already mentioned, neither the length nor the structure of the defined syllables matters. Syllables of different types can therefore be mixed as desired.

2 A
2 BE BU
2 COM
2 2

In this syllable list, we have, for example, defined the four elements "A", "BE BU", "COM" and "2". All elements should have the same probability and it doesn't matter that one of the elements contains a space, another element only consists of one letter, another consists of three letters, while the last element contains no letters at all but consists of a digit.

Despite the differing structure and content of these four syllables, only the frequencies and any restrictions imposed by the readability rules determine where and how often the syllables are built into the generated words.

Positions of the Syllables in the generated Words

Up to now, the position of a syllable within a word has not played a role. With all the lists we have used so far, the defined syllables are allowed at every position in a word.

However, it is also possible to define elements that are only allowed to appear at the beginning, at the end, in the middle or at another defined position within a word. How to do that can be seen in the next examples.

01 A
01 C
01 E
1B K
1M I
1E D

In this syllable list, all characters should be used with the same frequency. The letters A, C and E are allowed to appear at every position within a created word, whereas the letters K, I and D are only allowed to occur at specific positions. As you can see, we have written "B" (begin) after the number for K. This means that K should only be placed at the beginning of a word. Accordingly, M stands for the middle and E for the end of a word. For the letters A, C and E we have defined only a plain number and thus (as in all previous example lists) do not force any positioning of these letters.

Using this list, we are able to produce words like KID, CID, KECA, ECID or ECA, but no words such as DIK or ICE.
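The position markers can be expressed as a simple check in Python. This is only a sketch: it assumes that "middle" means any position that is neither the first nor the last, it treats every letter as its own syllable, and it ignores the readability rules. The names `RULES` and `is_allowed` are hypothetical.

```python
# Position marker per letter: "" = anywhere, "B" = begin, "M" = middle, "E" = end.
RULES = {"A": "", "C": "", "E": "", "K": "B", "I": "M", "D": "E"}


def is_allowed(word: str) -> bool:
    """Check every letter of the word against its position marker."""
    last = len(word) - 1
    for index, letter in enumerate(word):
        marker = RULES.get(letter)
        if marker is None:
            return False  # letter is not in the syllable list
        if marker == "B" and index != 0:
            return False
        if marker == "E" and index != last:
            return False
        if marker == "M" and index in (0, last):
            return False
    return True


for word in ["KID", "CID", "KECA", "ECID", "ECA", "DIK", "ICE"]:
    print(word, is_allowed(word))
```

The first five words pass the check, while DIK and ICE are rejected because D and I would have to stand at the beginning.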

With the following rules, a direct positioning is possible:

1P1 K
1P2 I
1P3 D
1P4 O

The letter "P" followed by a number specifies the exact position within a word. In the example, the letters K, I, D and O should have the same probability. K should only appear at position 1 in a word (P1), I only at position 2, D only at position 3 and O only at position 4.

Using this list, it will only be possible to create the word "KIDO". If you add "1P1 L" (letter "L" at position 1) to the list, the list will produce the words "KIDO" or "LIDO" but nothing else.

However, you can also define positions relative to the beginning or the end:
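To see why no other words are possible, one can enumerate all combinations. The Python sketch below assumes a fixed word length of four letters; the dictionary `positions` and the function `four_letter_words` are hypothetical names used only for this illustration.

```python
from itertools import product

# Letters allowed at each 1-based position, taken from the "P" markers.
positions = {1: ["K"], 2: ["I"], 3: ["D"], 4: ["O"]}


def four_letter_words(allowed: dict[int, list[str]]) -> list[str]:
    """List every word that takes one allowed letter per position 1 to 4."""
    return ["".join(letters) for letters in product(*(allowed[p] for p in range(1, 5)))]


print(four_letter_words(positions))  # ['KIDO']

positions[1].append("L")             # corresponds to adding "1P1 L"
print(four_letter_words(positions))  # ['KIDO', 'LIDO']
```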

01L1 A
01L2 C
01R2 I
01R1 D
0001 E
0001 F

L1 means that the element is allowed to appear at the first position (from the left); L2 means that the element can appear at one of the first two positions. Accordingly, R1, R2, R3 and so on stand for the positions counted from the end (from the right). Using this list, words like ACID, CEFI or EFID will be produced. We have added the letters E and F (allowed at every position) so that readable words can be created in every case.
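The relative markers can be checked in the same spirit. In this Python sketch, each letter is limited to the first n or last n positions of a word; letters without a marker may appear anywhere. The names `RULES` and `fits` are hypothetical, and the readability rules are again left out.

```python
# (side, n) limits a letter to the first n ("L") or last n ("R") positions;
# None means the letter may appear anywhere.
RULES = {"A": ("L", 1), "C": ("L", 2), "I": ("R", 2), "D": ("R", 1), "E": None, "F": None}


def fits(word: str) -> bool:
    """Check each letter against its relative position limit."""
    for index, letter in enumerate(word):
        if letter not in RULES:
            return False
        rule = RULES[letter]
        if rule is None:
            continue
        side, n = rule
        # Distance from the relevant edge, counted from 1.
        distance = index + 1 if side == "L" else len(word) - index
        if distance > n:
            return False
    return True


for word in ["ACID", "CEFI", "EFID", "DICA"]:
    print(word, fits(word))
```

ACID, CEFI and EFID satisfy all limits; DICA does not, because D may only stand at the last position.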

Comments in Syllable Lists

If you want to add comments to your syllable lists, you can simply precede the line or lines in question with a number sign. As soon as a line begins with the character #, it is no longer treated as a syllable definition when words are generated.

The following list is an example of a syllable list with comments:

# Vowels
1 A
1 E
1 U
# Consonants
1 B
1 C
1 D
#1 F

On the one hand, this example list uses comments in the form of the two headings "Vowels" and "Consonants", below which the corresponding letter types are grouped. On the other hand, the syllable "F" in this list has been deactivated with the help of a comment, so that this syllable (or letter) is not used when the syllable list is applied.

Incidentally, the two headings "Vowels" and "Consonants" would not have been used as syllables for the generation of words even if they had not been commented out here. The reason is that the strings "Vowels" and "Consonants" do not contain any indication of frequency and therefore do not correspond to the required structure of a syllable definition. Only if you wrote, for example, "1 Vowels" and "2 Consonants" would those words be used as syllables. Nevertheless, it makes sense to comment out plain text with a # character as well, since the WordCreator checks each syllable list before it is used and indicates whether the list contains syllables without frequencies. So, to avoid this hint before word generation, you should always use "real" comments. Furthermore, using genuine comments prevents the lines in question from being accidentally assigned an automatic frequency.
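The three kinds of lines described here can be told apart with a small Python sketch. The function `classify` is hypothetical; it assumes that a valid definition starts with a digit and is followed by a space and a syllable, which is a simplification of whatever check the WordCreator performs internally.

```python
def classify(line: str) -> str:
    """Return 'comment', 'syllable' or 'missing frequency' for one line."""
    if line.startswith("#"):
        return "comment"
    head, _, rest = line.partition(" ")
    # Assumption: a valid definition starts with a digit and has a syllable part.
    if head[:1].isdigit() and rest:
        return "syllable"
    return "missing frequency"


for line in ["# Vowels", "1 A", "#1 F", "Vowels", "1 Vowels"]:
    print(f"{line!r}: {classify(line)}")
```

The plain heading "Vowels" ends up in the "missing frequency" group, which corresponds to the case the WordCreator warns about.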

Predefined and automatically generated Syllable Lists

Of course, you do not have to painstakingly create each syllable list letter by letter, syllable by syllable and frequency by frequency by hand. Instead, the WordCreator offers several options, allowing you to access predefined syllable lists or to generate entire syllable lists automatically. We will examine how this works in the following sections.

Regardless of which of these options you choose, you can still adapt any loaded or generated syllable list to your own needs afterwards, for example by removing, editing or adding individual syllables.

Syllable Lists from Languages

In the menu "Syllable Lists > Syllable Lists from Languages" of the WordCreator, you will find syllable lists with frequency profiles from approximately 60 different languages using Latin (for example, German, English, Spanish, Portuguese, Italian or French), Cyrillic (for example, Russian, Ukrainian or Kyrgyz), Greek, Hebrew and Hindi alphabets. These syllable lists contain the letters and letter combinations that occur in the respective language.

When you click on one of these languages, a syllable list representing the frequency distribution of the letters of the selected language is loaded automatically.

Via the menu "Settings > Syllable Lists", you can specify whether only single letters and/or also digrams and trigrams (combinations of two and three letters) should be loaded. For this purpose, the options "Add single letters", "Add two-piece syllables (digrams)" and "Add three-piece syllables (trigrams)" are available.

Random Syllable Lists

In addition to the fixed syllable lists, which are based on the letter and syllable frequencies of real-world languages, the WordCreator can also create arbitrary random syllable lists which are automatically generated based on a selection of letters, numbers or other freely definable characters.

You can access the corresponding functions via the menu "Syllable Lists", where you will find the following sub-items:

The way in which the selected character set is incorporated into your generated list depends - as with the syllable lists from languages - on the settings you can configure under "Settings > Syllable Lists". Here you can specify whether only single letters or also letter combinations such as digrams and/or trigrams should be generated automatically.

As mentioned earlier, all the functions presented in this section assign random frequencies to the generated syllables. If you want to quickly and easily change these frequencies - for example, to a uniform distribution - it's not necessary to change the frequencies one by one manually. Instead, you can simply use one of the methods presented in the section about the automatic assignment of syllable frequencies.
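The idea of a random syllable list can be sketched in Python as follows. This is not the WordCreator's algorithm: the function `random_syllable_list`, its parameters and the frequency range of 1 to 100 are assumptions made purely for illustration, and only single characters are generated.

```python
import random
from typing import Optional


def random_syllable_list(characters: str, max_frequency: int = 100,
                         seed: Optional[int] = None) -> list[str]:
    """Create one '<Frequency> <Syllable>' line per character with a random frequency."""
    rng = random.Random(seed)  # a seed makes the result reproducible
    return [f"{rng.randint(1, max_frequency)} {character}" for character in characters]


for line in random_syllable_list("ABCDE", seed=42):
    print(line)
```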

Syllable Lists based on Texts

A third way to create usable syllable lists within the WordCreator is to generate syllable lists based on the frequency distribution of letters and characters of any text source. To do this, just follow these steps:

With this, the WordCreator automatically creates a usable syllable list from your count and switches back to the tab "Creator" so that you can immediately start generating new words from this base. You can define which letters and characters of the text source are to be included and in what form using the button "Settings" below the text fields.
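The counting step behind such a list can be approximated in a few lines of Python. The sketch below is an assumption about the general approach, not the WordCreator's implementation: the hypothetical function `syllable_list_from_text` counts single letters (n = 1), digrams (n = 2) or trigrams (n = 3) inside each word, ignores all non-letter characters and converts everything to upper case.

```python
from collections import Counter


def syllable_list_from_text(text: str, n: int = 1) -> list[str]:
    """Count all n-letter sequences inside words and format them as list lines."""
    counts = Counter()
    for word in text.upper().split():
        letters = "".join(ch for ch in word if ch.isalpha())
        for start in range(len(letters) - n + 1):
            counts[letters[start:start + n]] += 1
    return [f"{count} {ngram}" for ngram, count in sorted(counts.items())]


print(syllable_list_from_text("banana band"))       # ['4 A', '2 B', '1 D', '3 N']
print(syllable_list_from_text("banana band", n=2))  # ['3 AN', '2 BA', '2 NA', '1 ND']
```

The resulting lines already follow the frequency, space, syllable structure and could be pasted into a syllable list as they are.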

Automatic Assignment of Syllable Frequencies

Finally, one more way of handling syllable lists should be mentioned: bulk frequency assignment. Instead of manually assigning a new frequency to each syllable in your syllable list, you can simply right-click on the syllable list text box to open the corresponding context menu. There you will find the following functions:

Although we have so far always talked about "all syllables", you can also use these functions to assign frequencies to only parts of your syllable list. To do this, just select the syllables you want to assign a new frequency to with your mouse before opening the context menu.

By the way, this function does not only work for syllables that already have an assigned frequency. Letters and syllables that simply appear in a line without any number will also receive a frequency after this function has been called. This allows you to focus solely on the syllables when creating your syllable list, or even copy syllables from another context into the WordCreator and leave the frequency definitions entirely to the program. Of course, this excludes lines explicitly marked as comments with the hash symbol (#).
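As an illustration of bulk assignment, the following Python sketch gives every syllable the same frequency. It is a simplified assumption of how a uniform assignment could behave, not the WordCreator's own function: the name `assign_uniform` is hypothetical, and lines with position markers such as "1M I" are not handled.

```python
def assign_uniform(lines: list[str], frequency: int = 1) -> list[str]:
    """Give every non-comment line the same frequency."""
    result = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            result.append(line)  # comments and blank lines stay untouched
            continue
        head, _, rest = line.partition(" ")
        # If the line already starts with a number, replace it; otherwise
        # treat the whole line as a syllable that has no frequency yet.
        syllable = rest if head.isdigit() and rest else line
        result.append(f"{frequency} {syllable}")
    return result


print(assign_uniform(["# Vowels", "7 A", "E", "3 BE BU", "OU"]))
# ['# Vowels', '1 A', '1 E', '1 BE BU', '1 OU']
```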

Save and load Syllable Lists

Completed syllable lists can easily be saved to and reloaded from plain text files. To do this, either right-click on the syllable list field and then click on "Save" or "Load", or use the functions of the same name from the menu "Syllable Lists". Alternatively, you can also use the keyboard shortcuts CTRL+S (Save) and CTRL+O (Open).