By Clemens Lode, January 21, 2022

Adding TeX4ht to Your LaTeX Book Project

Here, we will examine the technical details of converting LaTeX to HTML, and how to add that capability to an existing project that does not use the template. If you are already using the configuration set up by the template, you can safely ignore this article.

This is an excerpt from Better Books with LaTeX, the Agile Way.

TeX4ht Configuration

To convert LaTeX to HTML, we need an additional tool, TeX4ht. The setup described here runs it alongside pdfLaTeX, so if your project uses XeLaTeX (or another LaTeX compiler), you might run into problems when switching to pdfLaTeX. You can stay compatible with both compilers by wrapping compiler-specific code in the conditional \ifxetex statement (provided by the ifxetex package).
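
As a minimal sketch of the \ifxetex conditional mentioned above, the following preamble loads different font packages depending on the compiler. The package choices in each branch (fontspec, inputenc, fontenc) are assumptions for illustration; your project may need different ones.

```latex
\documentclass{book}

% The ifxetex package defines the \ifxetex conditional.
\usepackage{ifxetex}

\ifxetex
  % XeLaTeX branch: Unicode input and system fonts via fontspec.
  \usepackage{fontspec}
\else
  % Branch taken by pdfLaTeX (and by the HTML conversion run):
  % classic input and font encoding packages.
  \usepackage[utf8]{inputenc}
  \usepackage[T1]{fontenc}
\fi

\begin{document}
This file compiles with both XeLaTeX and pdfLaTeX.
\end{document}
```

Keeping all compiler-specific code inside one such block makes it easier to see what differs between the print and the e-book build.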

On the Overleaf platform, no separate installation of TeX4ht is needed. All you need to do is add two files to your project: a file named latexmkrc in the main directory (which overrides the default latexmkrc provided by Overleaf) and a TeX4ht configuration file.

Figure 1: Build chain using different tools to produce different output formats.
Latexmk ⋅  Latexmk is the build tool Overleaf uses to automatically build your LaTeX project. The configuration file latexmkrc can be used to override build settings and configure compilers for a different output format (like using TeX4ht to generate HTML output in addition to generating a PDF).

First, let us create the latexmkrc file in the main directory of your project and insert this code (depending on your project, if you are not using the template, you might need additional settings from https://www.overleaf.com/help/216-how-does-overleaf-compile-my-project):

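# Run pdflatex first (PDF and TikZ images), then htlatex (HTML); the htlatex output goes to output.txt.
$pdflatex = "rm tikz-cache/*.pdf; pdflatex -shell-escape -synctex=1 %O %S; htlatex %S \"htlatex/htlatex.cfg,MyFonts,NoFonts\" \"\" \"\" -shell-escape > output.txt";
# When latexmk exits, zip all project files (except cache files) for download.
END {system('zip -r allfiles.zip . -x *.cache*');}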

This creates a hook in the compilation chain: latexmk runs the command stored in $pdflatex whenever it compiles the document. The command clears the TikZ cache and calls pdflatex before calling htlatex, giving you a PDF output (and compiled TikZ images) in addition to the HTML output. It also writes the output of the htlatex run to a new file called output.txt, which you can use for debugging. At the end, the END block zips all the files of your project into allfiles.zip. To download it, click the Logs and output files icon at the top of the right window, scroll all the way down to Other logs & files, and select allfiles.zip.

Converting that HTML file into a real e-book format like MOBI or EPUB takes some extra effort as we need to adjust the settings, take care of the table of contents, add a cover, and optimize our images.

HTML Output Formatting

Unfortunately, TeX4ht cannot do a 1:1 conversion simply because printed books are based on pages while HTML documents and e-books are continuous texts. Also, formatting, spacing, and images are handled differently, so we need to configure this separately. In the listing above, you can see a reference to htlatex.cfg—that is where the TeX4ht configuration resides:

\Preamble{xhtml}

\Configure{VERSION}{}
\Configure{DOCTYPE}{\HCode{<!DOCTYPE html>\Hnewline}}
\Configure{HTML}{\HCode{<html>\Hnewline}}{\HCode{\Hnewline</html>}}
\Configure{@HEAD}{\HCode{<!-- for beautifying --><link rel="stylesheet" type="text/css" href="site.css" />\Hnewline}}

% Translate \textbf, \textit and \texttt directives into <strong>, <em> and <code>
\Configure{emph}{\ifvmode\ShowPar\fi\HCode{<em>}}{\HCode{</em>}}
\Configure{textbf}{\ifvmode\ShowPar\fi\HCode{<strong>}}{\HCode{</strong>}}
\Configure{textit}{\ifvmode\ShowPar\fi\HCode{<em>}}{\HCode{</em>}}
\Configure{texttt}{\ifvmode\ShowPar\fi\HCode{<code>}}{\HCode{</code>}}
\Configure{textsc}{\ifvmode\ShowPar\fi\HCode{<span class="sc">}}{\HCode{</span>}}

% Translate verbatim and lstlisting blocks into <pre> elements
\ConfigureEnv{verbatim}{\HCode{<pre>}}{\HCode{</pre>}}{}{}
\ConfigureEnv{lstlisting}{\HCode{<pre>}}{\HCode{</pre>}}{}{}
\ConfigureEnv{minipage}{\ifvmode\IgnorePar\fi\HCode{<div class="minipage">}}{\ifvmode\IgnorePar\fi\HCode{</div>\Hnewline}}{}{}%

% Do not set `indent`/`noindent` classes on paragraphs
\Configure{HtmlPar}
{\EndP\Tg<p>}
{\EndP\Tg<p>}
{\HCode{</p>\Hnewline}}
{\HCode{</p>\Hnewline}}
\begin{document}
\EndPreamble

The file configures the mapping between LaTeX and HTML. If you are familiar with HTML, you will recognize that htlatex.cfg controls the contents of the output HTML file. It starts by setting up the HTML header and then configures how individual LaTeX commands (\emph, \textbf, \textit, …) are translated into HTML. For example, text formatted in italics (\textit) is translated into the HTML emphasis tag (<em>). The \HCode command inserts HTML code directly into the output file and can also be used in your regular LaTeX files. For example, you can use

\HCode{<hr style="clear: both" />}

to add a horizontal line directly to the HTML output file and thus the e-book.
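
Note that \HCode only exists when TeX4ht processes the file. Unless your project defines it elsewhere, the plain pdflatex run in the build chain would stop at it with an undefined-command error. One common way to guard against this, sketched here, is to test whether \HCode is defined. The macro name \ebookrule is a hypothetical helper, not part of TeX4ht, and the PDF fallback (a thin rule across the text width) is only an assumption about what you might want in print.

```latex
% Hypothetical helper: a rule in the e-book and a matching rule in the PDF.
\newcommand{\ebookrule}{%
  \ifdefined\HCode
    % TeX4ht run: \HCode exists, so emit the raw HTML tag.
    \HCode{<hr style="clear: both" />}%
  \else
    % pdfLaTeX (or XeLaTeX) run: draw an ordinary rule instead.
    \par\noindent\rule{\linewidth}{0.4pt}\par
  \fi
}
```

In the text, you would then write \ebookrule wherever the separator should appear, and the same source compiles in both builds.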

CSS ⋅  CSS files determine the final design and appearance of a website (or e-book).

The htlatex.cfg file also references the site.css file. You can adjust this file to your needs; in my experience, the following settings work nicely for Kindle e-books:

1. You might want to adapt the sizes of the chapter title and section title fonts:

.chapterHead { font-size: 1.5em; margin-top: 0.83em; margin-bottom: 0.83em; font-weight: bold; text-align: left; }
.sectionHead { font-size: 1.17em; margin-top: 1em; margin-bottom: 1em; font-weight: bold; }
.subsectionHead { margin-top: 1.33em; margin-bottom: 1.33em; font-weight: bold; }
.subsubsectionHead { font-size: 0.83em; margin-top: 1.67em; margin-bottom: 1.67em; font-weight: bold; }

2. In Kindle e-books, new paragraphs have indents on the first line. If you do not like that, this is the workaround:

p { margin-top: 1em; margin-bottom: 1em; text-indent: 0.01em; }

3. One way to highlight a quotation:

.quotation { margin: 0.25em 0; padding: 0.35em 40px; line-height: 1.45; position: relative; color: #383838; }
.quotation:before { display: block; padding-left: 10px; content: "\201C"; font-size: 80px; position: absolute; left: -15px; top: -20px; color: #7a7a7a; }
.quotation cite { color: #999999; font-size: 14px; display: block; margin-top: 5px; }
.quotation cite:before { content: "\2014 \2009"; }
div.quotation { width: auto; }

4. Add support for small capitals:

.sc { font-variant: small-caps; }

5. Print description list elements in bold:

dt.description { font-weight: bold; }
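
To check the rules above against your own book, a small test document that exercises each selector can help. The sketch below assumes that TeX4ht emits the class names the selectors target (chapterHead, sectionHead, quotation, description) and that the \textsc configuration from htlatex.cfg is active. I have not verified these class names against a specific TeX4ht version, so inspect the generated HTML if a rule does not seem to apply.

```latex
\documentclass{book}

\begin{document}

\chapter{Sample Chapter}   % heading targeted by .chapterHead
\section{Sample Section}   % heading targeted by .sectionHead

An ordinary paragraph, styled by the p rule.

\begin{quotation}          % block targeted by .quotation
  A quoted passage.
\end{quotation}

\textsc{Small Capitals}    % span with class "sc" (see htlatex.cfg)

\begin{description}
  \item[Term] Its explanation.  % label targeted by dt.description
\end{description}

\end{document}
```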



