Ice Bear Soft

Typesetting an illustrated publication with floating text

An abstract of a lecture delivered (in Czech) at the TeXperience 2012 conference. The slides (in Czech) are available for download in PDF, as are the tools used for typesetting the book.

TeX's algorithms are designed for processing text, not for working with external images. That work is left to the output drivers, which follow developments in printing technology. With the graphicx package in particular, images in common formats can easily be included in a way that is independent of the output driver used. The LaTeX format is aimed primarily at typesetting texts that contain a few tables or figures, and its float mechanism ensures that tables and figures automatically find a suitable place within the text.

The situation is different when the publication consists mainly of pictures whose positions on the page are strictly fixed and the text has to flow around them. Moreover, the text must be able to continue on the next page in the middle of a paragraph and be typeset there in a column of a different width. The existing approach cannot be used in such a case, and a set of dedicated macros must be developed. The principal part of the source code then consists of macros defining the positions of images and text areas. The macros developed for this purpose not only typeset the text and place the images but also allow the alignment of the text areas and the positions of the images to be verified visually even before the text is written.
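The idea of a paragraph flowing through text areas of different widths can be illustrated outside TeX. The following Python sketch is not the macro code used for the book: `TextArea` and `flow` are hypothetical names, widths are counted in characters rather than typographic units, and lines are filled greedily without hyphenation or justification.

```python
from dataclasses import dataclass


@dataclass
class TextArea:
    """A rectangular text area (hypothetical model, not the book's macros)."""
    page: int    # page on which the area lies
    width: int   # line width in characters (stand-in for a column width)
    lines: int   # number of lines that fit into the area


def flow(text, areas):
    """Fill the areas in order; a paragraph may continue in the next area.

    Returns a list of (area, lines) pairs. A single word longer than the
    area width is placed on a line of its own and overflows that line.
    """
    words = text.split()
    placed = []
    i = 0
    for area in areas:
        lines = []
        while i < len(words) and len(lines) < area.lines:
            line = words[i]
            i += 1
            # Greedily append words while they still fit into this area's width.
            while i < len(words) and len(line) + 1 + len(words[i]) <= area.width:
                line += " " + words[i]
                i += 1
            lines.append(line)
        placed.append((area, lines))
        if i == len(words):
            break
    if i < len(words):
        raise ValueError("text does not fit into the given areas")
    return placed


if __name__ == "__main__":
    areas = [TextArea(page=1, width=9, lines=2), TextArea(page=2, width=20, lines=5)]
    for area, lines in flow("one two three four five six seven eight", areas):
        print(f"page {area.page}, width {area.width}:")
        for line in lines:
            print("   ", line)
```

Because the list of areas exists independently of the text, the layout can be drawn and checked before any text is written, which mirrors the visual verification described above.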

Typesetting a book in TeX requires not only macros that create the requested layout but also a suitable tool for writing the source text comfortably. TeX expects a plain text file with macros as its input, so any text editor that does not insert its own formatting instructions can be used. The macros defining the positions of pictures and text areas can be syntactically complicated, which can lead to errors that are difficult to locate. It would therefore be useful to have a tool that verifies the syntax while the source text is being written. Such an external validation plug-in for a text editor is not easy to develop because, as is well known, only TeX can read TeX.

The solution is not difficult if several technologies are combined. The pages are described in XML; a Relax NG schema was developed for this purpose, and the text is written in a validating editor. The text is then transformed by XSLT. This brings one clear advantage. With direct input in TeX, a misprint in the name of a macro is noticed only later, during TeX compilation. The validating editor, by contrast, indicates an error immediately and moreover offers the names of elements as well as the names and values of attributes according to the context. The TeX macros are not typed by the author but generated without errors during the transformation.
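The principle of describing a page in XML and generating the TeX macros from it can be sketched in a few lines. The example below is an assumption-laden illustration in Python, not the XSLT stylesheet or the Relax NG schema used for the book: the elements `page`, `image` and `textarea`, their attributes, and the macros `\PlaceImage` and `\TextArea` are all hypothetical, and a simple attribute check stands in for real schema validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: element name -> required attributes, in macro order.
REQUIRED = {
    "image": ("file", "x", "y", "width"),
    "textarea": ("x", "y", "width", "height"),
}

# Characters with a special meaning in TeX, escaped in a single pass.
TEX_ESCAPES = str.maketrans({
    "\\": r"\textbackslash{}", "&": r"\&", "%": r"\%", "$": r"\$",
    "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
})


def page_to_tex(xml_source):
    """Turn one <page> description into lines of (hypothetical) TeX macros."""
    page = ET.fromstring(xml_source)  # raises ParseError on malformed XML
    out = []
    for el in page:
        if el.tag not in REQUIRED:
            raise ValueError(f"unknown element <{el.tag}>")
        missing = [a for a in REQUIRED[el.tag] if a not in el.attrib]
        if missing:
            raise ValueError(f"<{el.tag}> lacks attributes: {', '.join(missing)}")
        args = tuple(el.attrib[a] for a in REQUIRED[el.tag])
        if el.tag == "image":
            out.append("\\PlaceImage{%s}{%s}{%s}{%s}" % args)
        else:
            # Normalise whitespace and escape TeX special characters in the text.
            text = " ".join((el.text or "").split()).translate(TEX_ESCAPES)
            out.append("\\TextArea{%s}{%s}{%s}{%s}{%s}" % (args + (text,)))
    return "\n".join(out)


if __name__ == "__main__":
    source = (
        '<page>'
        '<image file="taj.jpg" x="10" y="20" width="80"/>'
        '<textarea x="10" y="110" width="60" height="40">Tea &amp; spices</textarea>'
        '</page>'
    )
    print(page_to_tex(source))
```

A misspelled element name is rejected before any TeX is produced, and the macro names themselves are written only once, in the transformation, so the author cannot mistype them.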

The transformed text is compiled by XeLaTeX, which is why the Czech text can contain short passages in Hindi and Urdu. The <oXygen/> XML editor honours the values of the lang and xml:lang attributes and switches the spell-checking language accordingly. Typing multilingual documents is thus considerably easier, and the author can concentrate fully on creating the content.

The system also uses XSLT to find out which photos from the large collection are actually used in the book, so that only these image files are converted to the CMYK colour space for printing. The files required for typesetting such a book comprise fewer than 600 lines of Relax NG schema, fewer than 700 lines of XSLT and fewer than 100 lines of TeX macros; the rest is taken from standard packages that are included in both the MiKTeX and TeX Live distributions.
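Selecting only the photos that the book actually references amounts to collecting one attribute from the XML source. The sketch below shows the idea in Python rather than XSLT; the `book`, `page` and `image` elements, the `file` attribute and both function names are assumptions made for illustration, and the CMYK conversion itself is left out.

```python
import xml.etree.ElementTree as ET


def used_images(book_xml):
    """Return the set of file names referenced by (hypothetical) <image> elements."""
    root = ET.fromstring(book_xml)
    return {el.get("file") for el in root.iter("image") if el.get("file")}


def select_for_conversion(book_xml, collection):
    """Return the sorted list of referenced files that should be converted.

    `collection` is an iterable of the file names available in the photo
    collection. A reference to a file outside the collection is an error.
    """
    used = used_images(book_xml)
    missing = sorted(used - set(collection))
    if missing:
        raise ValueError("referenced but not in collection: " + ", ".join(missing))
    return sorted(used)


if __name__ == "__main__":
    book = (
        '<book>'
        '<page><image file="b.jpg"/><image file="a.jpg"/></page>'
        '<page><image file="a.jpg"/></page>'
        '</book>'
    )
    # c.jpg is part of the collection but unused, so it is not selected.
    for name in select_for_conversion(book, ["a.jpg", "b.jpg", "c.jpg"]):
        print(name)
```

Each file is listed once no matter how often it is used, so a photo that appears on several pages would be converted only once.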

Note: The macros described here were developed for typesetting the book entitled भारत के रंग / بھارت کے رنگ (The Colours of India).
