TextConverter

Split Text Files into several new Files

If we want to divide the content of a text file into several new files, automating this task can save us a lot of work and, above all, a lot of time. The savings are particularly large if we want to split a very large number of files and the separation always follows the same pattern, because such a task is easy to automate. In this tutorial, we would like to show you an easy way to get a quick result without much effort. We use the program TextConverter for this.

General Procedure

Before we take a detailed look at the individual criteria for separation and the associated options, we would first like to look at the general procedure for using the TextConverter to split individual files into several new files:

In this general description of the procedure, we have not yet talked about which criteria we can select for the separation. We would like to go into this in the next section.

Possibilities of Separation

The TextConverter offers you 3 different options or criteria according to which you can split your files. These options can also be combined:

Split Files on a Text or a Regular Expression

With this option you can divide your original file at a specific text. This means that a new file begins after each occurrence of this search text. Accordingly, if your text occurs twice in the original file, three new files are stored: one with the text that appears in the original file before the first occurrence of the search text, one with the text between the first and the second occurrence, and a third with the text that follows the second occurrence.

It does not matter whether your search text consists of only one character, several words or even multiple lines. Furthermore, the search text does not have to be a static text: If you activate the option "Interpret as Regular Expression" under the text box, you can also work with regular expressions at this point. A simple example would be the regular expression [0-9] which executes a separation on any digit.

If you would like to keep the search text at which the file was split in the new files, you can activate one or both of the options "Keep Search Text at the Beginning of each new File" or "Keep Search Text at the End of each new File". If you do not activate either of these two options, the search text will not appear in the new files.

Another option makes it possible not to separate directly at the search text but on the next line break. If the option "Split at next Line Break" is activated, related words of a paragraph remain in the same file and are not separated from each other. This allows you to separate, for example, according to sections that contain certain words without tearing the respective sections apart.
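To make the behavior of the search text and the two "keep" options more tangible, here is a minimal Python sketch. It is not the TextConverter's implementation. The function name split_on_text and its parameters are hypothetical, and we assume that "at the Beginning" attaches the search text to the part that follows it while "at the End" attaches it to the part that precedes it. The option "Split at next Line Break" is not modeled here.

```python
import re

def split_on_text(content, search, is_regex=False,
                  keep_at_start=False, keep_at_end=False):
    """Split content at every occurrence of search (hypothetical helper)."""
    # A static search text is escaped so that it is matched literally.
    pattern = re.compile(search if is_regex else re.escape(search))
    parts = []
    pos = 0
    carried = ""  # search text to put at the beginning of the next part
    for match in pattern.finditer(content):
        part = carried + content[pos:match.start()]
        if keep_at_end:
            part += match.group(0)  # assumption: kept at the end of the preceding part
        parts.append(part)
        carried = match.group(0) if keep_at_start else ""
        pos = match.end()
    parts.append(carried + content[pos:])  # text after the last occurrence
    return parts

# Two occurrences of the search text give three parts.
print(split_on_text("aXbXc", "X"))                      # ['a', 'b', 'c']
print(split_on_text("aXbXc", "X", keep_at_start=True))  # ['a', 'Xb', 'Xc']
print(split_on_text("aXbXc", "X", keep_at_end=True))    # ['aX', 'bX', 'c']
print(split_on_text("a1b2c", "[0-9]", is_regex=True))   # ['a', 'b', 'c']
```

Each element of the returned list corresponds to the content of one new file.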

Split Files on Line Breaks

With this option you can separate the original file on its line breaks. This means that for each line of the original file a new file is created that contains the text of the respective line.

For this option, the settings under "Actions > Files > Line Break Type" apply. By default, that is, if you do not make any changes here, the line break type of the original file is recognized automatically and you get the result that you would generally expect. The decisive factor is then the typical line break that you know from an average text editor. However, you can also define other criteria for a line break in the TextConverter. For example, it is possible to define any character, any character string or several different characters as a line break. This gives you further ways to separate your files flexibly. You can find out how this works in the explanations of custom line breaks on one or several characters.
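The difference between the default line breaks and custom line breaks can be illustrated with a short Python sketch. The function split_on_line_breaks and its parameter custom_breaks are hypothetical names chosen for this illustration and do not correspond to settings of the TextConverter. The sketch also assumes that Python's str.splitlines() is an acceptable stand-in for the automatic line break recognition, although it recognizes some additional separators.

```python
import re

def split_on_line_breaks(content, custom_breaks=None):
    """Return one part per line (hypothetical helper)."""
    if not custom_breaks:
        # Default: common line breaks such as \n, \r\n and \r.
        return content.splitlines()
    # Custom: any of the given strings counts as a line break.
    # Longer separators come first so that they win over shorter ones.
    ordered = sorted(custom_breaks, key=len, reverse=True)
    pattern = "|".join(re.escape(b) for b in ordered)
    return re.split(pattern, content)

print(split_on_line_breaks("a\r\nb\nc"))             # ['a', 'b', 'c']
print(split_on_line_breaks("a;b||c", [";", "||"]))   # ['a', 'b', 'c']
```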

Split Files after Number of Characters

With this option you can cut your original file into pieces of a freely selectable length, measured in characters. You can enter any numerical value into the field. For example, if your original file has 2500 characters and you specify a value of 1000 characters, your file is split into 3 parts: The first new file contains the first 1000 characters of the original file, the second new file contains the second 1000 characters of the original file and the third new file contains the remaining 500 characters. If your original file contains fewer characters than the specified value, there is no separation and the original file remains with its content as it is.

You can also use this option to limit the text of all created files to a maximum number of characters, for example by combining this option with the other options.
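The arithmetic of this example can be checked with a few lines of Python. This is only a sketch of the described behavior with a hypothetical function name, not code from the TextConverter.

```python
def split_by_length(content, size):
    """Cut content into pieces of at most size characters (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be a positive number of characters")
    if len(content) <= size:
        return [content]  # no separation, the content stays as it is
    return [content[i:i + size] for i in range(0, len(content), size)]

parts = split_by_length("x" * 2500, 1000)
print([len(p) for p in parts])  # [1000, 1000, 500]
```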

Combination of multiple Criteria

At least one of the options introduced above must be activated for the function to be performed. It is also possible to activate more than one of these options. In this case, the file is first split according to the criterion of the first activated option. The resulting parts are then split again according to the criterion of the second activated option, and so on.

For example, if you activate both the option for a separation on line breaks and the option for a separation after a certain number of characters, the file is first split at the line breaks. Then all parts (here the parts are equal to the lines) are processed in turn, and if a line consists of more than the permitted number of characters, it is split again within that line in accordance with the second criterion.
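The two-stage separation from this example can be sketched in Python as follows. The function name is hypothetical, and str.splitlines() is used as an assumed stand-in for the line break recognition of the TextConverter.

```python
def split_lines_then_length(content, max_chars):
    """First split on line breaks, then cut lines that are too long (hypothetical helper)."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    parts = []
    for line in content.splitlines():        # first criterion: line breaks
        if len(line) <= max_chars:
            parts.append(line)
        else:                                # second criterion: number of characters
            parts.extend(line[i:i + max_chars]
                         for i in range(0, len(line), max_chars))
    return parts

print(split_lines_then_length("abcdefg\nhi", 3))  # ['abc', 'def', 'g', 'hi']
```

The first line has 7 characters and is therefore cut into three pieces, while the second line is short enough to stay in one part.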

General Options for all Separations

Under the 3 options with which you can determine the criteria for the separation of the files, you will find further general options that are always used regardless of the selected criteria:

Placeholders for the Numbering of the Parts

In addition to the simple placeholders and the placeholders for references, the TextConverter provides two other placeholders that can only be used in connection with splitting files: %part_num% and %part_abs%.

The placeholder %part_num% stands for the number of the current part, while the placeholder %part_abs% stands for the total number of parts. Both placeholders can be used in the file name (that is, in the fields "Folder", "Name" and "File Extension" of the storage options) as well as in the actions and the files themselves.

If a file is split into 5 parts, as an example, the placeholder %part_abs% always stands for "5" while the placeholder %part_num% depends on the respective part. For the first part, %part_num% is "1", for the second part, it is "2", and so on. With this placeholder it is therefore possible, for example, to write the number of each part in the respective partial file, to number the file names of the parts consecutively or to save the individual parts in different folders whose names contain the number of the part.
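The following Python sketch illustrates how the two placeholders resolve for a file that is split into 5 parts. The function fill_part_placeholders is a hypothetical helper for this illustration. It performs a plain text replacement and ignores any padding with leading zeros.

```python
def fill_part_placeholders(template, part_num, part_abs):
    """Replace %part_num% and %part_abs% in a template (hypothetical helper)."""
    return (template
            .replace("%part_num%", str(part_num))
            .replace("%part_abs%", str(part_abs)))

total = 5
for number in range(1, total + 1):
    print(fill_part_placeholders("Part %part_num% of %part_abs%", number, total))
# The first line printed is "Part 1 of 5", the last one "Part 5 of 5".
```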

Since the current version of the TextConverter does not provide a preview for file separations, the placeholders %part_num% and %part_abs% are not considered in the preview.

Storage and Configuration of the File Names of the Parts

At the bottom right of the main window of the TextConverter, you can define in which folder and under what name the new files should be saved. Here you can select an arbitrary folder and determine a base name for all files. With the option "keep", this can also be the folder or the name of the original file.

If you use the default settings, the files containing the individual parts are numbered by appending a consecutive number to the specified name. For example, the file names of the saved parts could be "file-01.txt", "file-02.txt" to "file-20.txt".

If you want to number the files in a different way, you can use the placeholder %part_num% within the storage options, which stands for the number of the part in question. For example, if you use "%part_num% %name%" as the file name, the partial files from the example would be named "01 file.txt", "02 file.txt" to "20 file.txt" or if you use "%name% (%part_num%)", the resulting file names would be "file (01).txt", "file (02).txt" through "file (20).txt".

If the file name contains the placeholder "%part_num%", there is no automatic numbering by appending the number of the part. On the other hand, if the file name does not contain the placeholder "%part_num%", an automatic numbering always occurs, unless the option "Number File Names of Parts only if necessary" is activated and no file with the resulting name already exists.
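The naming rule described above can be summarized in a small Python sketch. The function part_file_names is hypothetical, and two details are assumptions derived only from the example names in this tutorial: the hyphen before an automatically appended number, and the padding of the number with leading zeros to the width of the total number of parts. The option "Number File Names of Parts only if necessary" is not modeled.

```python
def part_file_names(name_template, base_name, extension, total_parts):
    """Build the file names of all parts (hypothetical helper)."""
    width = len(str(total_parts))  # assumption: pad to the width of the total
    names = []
    for num in range(1, total_parts + 1):
        number = str(num).zfill(width)
        if "%part_num%" in name_template:
            # Placeholder present: no automatic numbering.
            stem = name_template.replace("%part_num%", number)
        else:
            # Placeholder absent: the number is appended automatically.
            stem = f"{name_template}-{number}"
        stem = stem.replace("%name%", base_name)
        names.append(f"{stem}.{extension}")
    return names

print(part_file_names("%name%", "file", "txt", 20)[0])              # file-01.txt
print(part_file_names("%part_num% %name%", "file", "txt", 20)[1])   # 02 file.txt
print(part_file_names("%name% (%part_num%)", "file", "txt", 20)[19])  # file (20).txt
```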

For the file naming of the individual parts, you can also use references. An example would be using the placeholder "%ref:line=1%" which represents the first line of the file. If you use this placeholder as a file name, the first line of each part is used as the file name for this part. If you specify, for example, the placeholder "%ref:word=1%" as the folder, the individual parts will be sorted according to their first word into different folders, each folder having the first word of the respective file as its name. Of course, you can also use any other of the available references or combine the references with other characters or placeholders. If you use references and thus already get a unique file name, you can activate the option "Number File Names of Parts only if necessary" if you do not want any additional automatic numbering of the files.
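As a rough illustration of the two references mentioned above, the following Python sketch derives a folder from the first word and a file name from the first line of a part. It does not reproduce the reference syntax of the TextConverter. The function reference_path is hypothetical, words are assumed to be separated by whitespace, and characters that are not allowed in file names are not handled.

```python
from pathlib import PurePosixPath

def reference_path(part_text, extension="txt"):
    """Folder from the first word, file name from the first line (hypothetical helper)."""
    lines = part_text.splitlines()
    first_line = lines[0] if lines else ""   # corresponds to the idea of %ref:line=1%
    words = part_text.split()
    first_word = words[0] if words else ""   # corresponds to the idea of %ref:word=1%
    return PurePosixPath(first_word) / f"{first_line}.{extension}"

print(reference_path("Invoice 2024\nAmount: 100"))  # Invoice/Invoice 2024.txt
```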

Even if we sometimes only speak of one file as the original file in this tutorial, the function can of course also be used with multiple files at the same time. This means that if you have more than one file in your file list, each file is split on its own, independently of the other files in the file list.

Join several Text Files

In addition to the possibility of dividing individual files into several new files, the TextConverter also offers the reverse way: you can learn how to put any number of files together in the tutorial about combining several text files.