How To Convert Markdown to Beautiful PDF

I write documentation, tutorials and guidelines on a regular basis and often spent a lot of time on formatting, until I got to know Markdown. Markdown is a lightweight markup language that helps you focus on writing. I have been using it for a while now and have found that it has dramatically increased my productivity. Precisely because Markdown is lightweight and easy to learn, it has become increasingly popular over the years. Another great advantage is that its simplicity makes it easy and efficient to convert to other formats.

A Brief Introduction to Markdown

Markdown was developed and specified by John Gruber and Aaron Swartz back in 2004 (version 1.0.1). The goal was for the source to be easy for humans to read, even before conversion. There are now several extensions and flavours of Markdown that add more features.

Key advantages of using Markdown are:

  • Markdown is easy to learn and easy to use. You can focus on content and (once you have defined a layout) don’t have to struggle with formatting and layout.
  • You can use your favorite editor to edit Markdown (.md) files, such as Visual Studio Code, Atom or Sublime Text. These editors are slim and efficient and help you focus on content.
  • You can easily search files and content, since Markdown files are plain text that humans can read.
  • Because Markdown is slim and efficient, it can be converted into many formats, such as PDF, HTML or EPUB.
  • Since Markdown files are small and text-based, they can be synced quickly and easily between various devices and services.
  • Working together with others is easy and highly efficient if you use a version control system like Git (e.g. on GitHub or GitLab) or Subversion (SVN).

You can find an introduction to vanilla Markdown and a tutorial here.

Why Did I Deal With This?

If you are familiar with Markdown and have ever wondered how to convert Markdown files to beautiful PDFs, you might find the answer by reading on.

Yes, Markdown is simple, easy to learn and has a small overhead, but what if you have to provide documentation to management, customers or other external parties? Plain text files rarely impress anyone, and it makes a better impression if your documents are properly formatted and follow the corporate design.

Back in the day I struggled with this until I stumbled across a repository called “Eisvogel” on GitHub. It contains a generic, customisable LaTeX template that is used with pandoc (a universal document conversion tool) to create beautiful PDFs. I experimented a lot and initially used a local setup, following the documentation of the repository.

However, my demands and desire for automation grew, so I started converting our documents, which we host on GitHub, automatically using GitHub Actions. Of course, it was inefficient to build an environment for each run: setting it up took 5-6 minutes, while the conversion itself took only 10-30 seconds.

That is why I have built a container.

Setup Container

This setup converts Markdown files to beautiful PDFs using pandoc, TeX Live and the “Eisvogel” template within a Docker container. Not all Markdown flavours are supported due to pandoc limitations, but vanilla Markdown works like a charm.

Used Components

  • TeX Live 2021 docker image (Debian Bookworm / Testing)
  • TeX Live add-ons:
    adjustbox babel-german background bidi collectbox csquotes everypage filehook footmisc footnotebackref framed fvextra letltxmacro ly1 mdframed mweights needspace pagecolor sourcecodepro sourcesanspro titling ucharcat ulem unicode-math upquote xecjk xurl zref
  • pandoc 2.11.2
  • Eisvogel template 2.0.0

Requirements

  • Your favorite text editor
  • Docker
  • An x86_64 (x64) compatible system (ARM is not supported at the moment)
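
If you want to check the last two requirements before going further, a small pre-flight script can do it. This is only a sketch: the function name check_requirements is made up for this example, and it merely looks at the output of uname -m and whether a docker binary is on the PATH. It does not verify that the Docker daemon is actually running.

```shell
#!/bin/sh
# Pre-flight check for the requirements listed above.
# check_requirements is a hypothetical helper name, not part of the repository.
check_requirements() {
    rc=0

    # The image targets x86_64; other architectures (e.g. ARM) are reported.
    arch=$(uname -m)
    case "$arch" in
        x86_64|amd64) echo "architecture: $arch (ok)" ;;
        *) echo "architecture: $arch (not supported by the image)"; rc=1 ;;
    esac

    # Only checks that a docker executable can be found on the PATH.
    if command -v docker >/dev/null 2>&1; then
        echo "docker: found"
    else
        echo "docker: not found"
        rc=1
    fi

    return "$rc"
}

check_requirements || echo "at least one requirement is missing"
```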

Document Folder Structure

To use this container, you need the following folder structure:

  • /docker: Contains the Dockerfile for creating Docker images.
  • /config: Contains the YAML config files for pandoc and the Eisvogel LaTeX template (metafile).
  • /docs: Contains all Markdown files and assets (e.g. images).
  • /docs/assets: Contains additional assets (e.g. images or PDFs) and is optional.
  • /output: Contains generated PDFs. Folder and file names can be changed in the config files.
  • /templates: Contains the Eisvogel LaTeX template, page layout and a logo.

You can download the sample structure here: https://github.com/maholick/md-pdf-conversion.
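
If you would rather start from an empty skeleton than from the sample repository, the folders listed above can also be created by hand. The following is a minimal sketch. The location ./workdir is just a placeholder I picked for this example, and the script only creates empty folders; the Dockerfile, config files and template still have to come from the repository.

```shell
#!/bin/sh
# Create the folder skeleton described above.
# WORKDIR is a placeholder; point it at your own working directory.
WORKDIR="${WORKDIR:-./workdir}"

# docs/assets is optional, but creating it also creates docs.
for dir in docker config docs/assets output templates; do
    mkdir -p "$WORKDIR/$dir"
done

# Show the directories that now exist.
find "$WORKDIR" -type d | sort
```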

Usage

Please clone the repo locally. If you are familiar with GitHub Actions, you can also use this setup to run automated PDF conversion directly on GitHub.

Build Docker Image

From the cloned directory of the repo, build the Docker image. This step only needs to be performed once. If you don’t want to build the image yourself, feel free to skip this step and proceed with Using the Prebuilt Docker Image.

docker build -t md-pdf-conversion -f docker/Dockerfile .

Setup Container

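Now create a container using the following command. The name of the container can of course be changed. Note that the command references the image from Docker Hub; to use the image you just built, replace maholick/md-pdf-conversion:latest with the local tag md-pdf-conversion.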

docker run -v /path/to/your/workdir/:/var/opt/pandoc --name md-pdf-conversion maholick/md-pdf-conversion:latest

Use Container

If you decided to build the image yourself, you can skip the next part and proceed with Create the Container.

Using the Prebuilt Docker Image

You can find the prebuilt Docker image (maholick/md-pdf-conversion), which is based on the above-mentioned repository, on Docker Hub:

https://hub.docker.com/r/maholick/md-pdf-conversion

Download the image using the docker pull command:

docker pull maholick/md-pdf-conversion

You should see output similar to the following:

[~] # docker pull maholick/md-pdf-conversion  
Using default tag: latest                                                                    
latest: Pulling from maholick/md-pdf-conversion
277c3ceb6ade: Pull complete 
580a80152554: Pull complete 
7bfd52c0000d: Pull complete 
7f8901bd7e4b: Pull complete 
e52047f39c0c: Pull complete 
1c092bbb5320: Pull complete 
cfee95d50208: Pull complete 
320119eb2697: Pull complete 
6bd300cf6af9: Pull complete 
c93bd34180d9: Pull complete 
3fb5bf20a2d1: Pull complete 
83cb0f8303fd: Pull complete 
aa9436434dc4: Pull complete 
be6b4cb165dd: Pull complete 
b6f8dc5a05d5: Pull complete 
e6046f25124e: Pull complete 
738b8b88784f: Pull complete 
34186473e003: Pull complete 
414e757c63b7: Pull complete 
6aa9266be2b4: Pull complete 
6f6544062c50: Pull complete 
5eeef9f9eed0: Pull complete 
5c1ac9571595: Pull complete 
d4aab6faf008: Pull complete 
eb71361fb3a8: Pull complete 
efea4e718ab5: Pull complete 
05229a15e1c5: Pull complete 
91d3fbc55d8a: Pull complete 
5661cf43c0e0: Pull complete 
4b676f374ca2: Pull complete 
cd4f21db67ec: Pull complete 
1f51504aeb7c: Pull complete 
Digest: sha256:43d1dfd8c23bd6cd532e47ac8a0eddda8ede4186dc90c4b17ef2a4a1828d7f4b
Status: Downloaded newer image for maholick/md-pdf-conversion:latest
docker.io/maholick/md-pdf-conversion:latest

Create the Container

After you have downloaded the image, we need to create the container with docker create. Please make sure that you specify the path to the directory that contains the above-mentioned folder structure. You can choose any name you want for the container. The container is configured to start

docker create -v /path/to/your/workdir/:/var/opt/pandoc --name md-pdf-conversion maholick/md-pdf-conversion:latest

The container will be created and you should see output similar to this:

[~] # docker create -v /path/to/your/workdir/:/var/opt/pandoc --name md-pdf-conversion maholick/md-pdf-conversion:latest
18a67ddaaee649d2eaa7a716608359afa13aef037f336653aa07a9d522f52707

Start the Container

Now we are ready to start the container:

[~] # docker start md-pdf-conversion
md-pdf-conversion

Then check whether the container is running:

[~] # docker ps
CONTAINER ID   IMAGE                               COMMAND      CREATED         STATUS          PORTS     NAMES
18a67ddaaee6   maholick/md-pdf-conversion:latest   "/wait.sh"   2 minutes ago   Up 18 seconds             md-pdf-conversion

Convert Markdown to PDF

Now we are ready for our first conversion. For this we will use the example files that are included in the repository. Please navigate to the docs folder in your working directory (the one you mounted into the container) and create the start file “.start”. The conversion starts immediately and converts all “*.md” files recursively and in order, using the configuration in “/config”.

[/] # cd /path/to/your/workdir/docs
[/path/to/your/workdir/docs] # ls
00_Lorem_ipsum.md  assets/
[/path/to/your/workdir/docs] # touch .start
[/path/to/your/workdir/docs] # ls
[/path/to/your/workdir/docs] # ls -all ../output/
total 160
drwxrwxrwx 2 demo users   4096 2021-06-26 07:28 ./
drwxrwxrwx 6 demo users   4096 2022-01-23 21:34 ../
-rw-rw-rw- 1 demo users 146270 2022-01-24 21:12 example.pdf

You can find the converted file in the output folder of your working directory. You should now see a beautiful PDF file, converted from Markdown and based on your adjustments in the config files.
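
Because the files are converted in order, numeric prefixes such as the 00_ in the example file are a simple way to control the sequence of chapters. To preview that sequence before triggering a run, something like the sketch below can help. It assumes that "ordered" means a plain sort by path, which I have not verified against the container's script, and the function name list_markdown_files is made up for this example.

```shell
#!/bin/sh
# Preview which Markdown files would be picked up, and in which order.
# Assumption: "recursively and ordered" means a plain sort by path;
# the container's own script may sort differently.
list_markdown_files() {
    docs_dir="${1:-docs}"    # folder to scan; defaults to ./docs
    find "$docs_dir" -type f -name '*.md' | LC_ALL=C sort
}

# Only print something when a docs folder exists in the current directory.
if [ -d docs ]; then
    list_markdown_files docs
fi
```

Run it from the working directory, before you create the start file.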

Successful Conversion

After a successful conversion, the conversion folder (docs) is emptied. Make sure that you only place copies of your files there and keep the originals elsewhere.

Unsuccessful Conversion

If the conversion fails, the start file is removed and you can check your config for errors.
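
In both cases the start file ends up gone, so a script can simply poll for it and then look into the output folder. The helper below is a minimal sketch of that idea. The name wait_for_conversion and the 120-second default timeout are my own choices for this example, not something the container provides.

```shell
#!/bin/sh
# Wait until the start file disappears, i.e. until the container has
# finished (or aborted) the run.
# Assumption: the file is gone after both successful and failed runs.
# wait_for_conversion is a hypothetical helper name.
wait_for_conversion() {
    start_file="$1"          # path to the .start file
    timeout="${2:-120}"      # seconds to wait before giving up (arbitrary default)
    elapsed=0

    while [ -e "$start_file" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out after ${timeout}s waiting for $start_file" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

From the working directory, a run could then look like this: touch docs/.start && wait_for_conversion docs/.start && ls -l output/. If no new PDF shows up in the output folder, the conversion most likely failed and the config is the first place to look.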

Example Output

Contribution

If you like this small project and want to contribute, please feel free to add code, templates or other improvements by creating pull requests with git. The repository is public and can be accessed by anyone.

Repository: https://github.com/maholick/md-pdf-conversion/
Container: https://hub.docker.com/r/maholick/md-pdf-conversion