mdtopdf

package module
v2.2.18
Published: Jun 10, 2025 License: MIT Imports: 21 Imported by: 0

README

Markdown to PDF

A CLI utility which, as the name implies, generates a PDF from Markdown.

This package depends on two other packages:

  • gomarkdown parser to read the markdown source
  • fpdf to generate the PDF

Features

Supported Markdown elements

  • Emphasised and strong text
  • Headings 1-6
  • Ordered and unordered lists
  • Nested lists
  • Images
  • Tables
  • Links
  • Code blocks and backticked text

Installation

You can obtain the pre-built md2pdf binary for your OS and arch here; you can also install the md2pdf binary directly onto your $GOBIN dir with:

$ go install github.com/mandolyte/mdtopdf/v2/cmd/md2pdf@latest

md2pdf is also available via Homebrew:

$ brew install md2pdf

Syntax highlighting

md2pdf supports colourised output via the gohighlight module.

For examples, see testdata/syntax_highlighting.md and testdata/syntax_highlighting.pdf.

Custom themes

md2pdf supports both light and dark themes out of the box (use --theme light or --theme dark - no config required).

However, if you wish to customise the font faces, sizes and colours, you can use the JSON files in custom_themes as a starting point. Edit them to your liking and pass --theme /path/to/json to md2pdf.
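
The three forms of the --theme value can be mapped onto the package's Theme constants (DARK, LIGHT, CUSTOM). The following is a minimal sketch of that mapping, not md2pdf's actual code: resolveTheme is a hypothetical helper, and the Theme type is redeclared locally so the snippet compiles without the module.

```go
package main

import (
	"fmt"
	"strings"
)

// Theme mirrors the documented constants: DARK=1, LIGHT=2, CUSTOM=3.
type Theme int

const (
	DARK   Theme = 1
	LIGHT  Theme = 2
	CUSTOM Theme = 3
)

// resolveTheme is a hypothetical helper. "light" and "dark" select the
// built-in themes; any other value is assumed to be a path to a custom
// JSON theme file. An empty value falls back to the documented default.
func resolveTheme(arg string) (Theme, string) {
	switch strings.ToLower(arg) {
	case "", "light":
		return LIGHT, ""
	case "dark":
		return DARK, ""
	default:
		return CUSTOM, arg
	}
}

func main() {
	for _, arg := range []string{"light", "dark", "/path/to/theme.json"} {
		theme, file := resolveTheme(arg)
		fmt.Printf("%-22s theme=%d file=%q\n", arg, theme, file)
	}
}
```

When using the package directly, the resulting values would presumably go into the Theme and CustomThemeFile fields of PdfRendererParams.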

Auto Generation of Table of Contents

md2pdf can automatically generate a TOC in which each item corresponds to a heading in the document, and place it on the first page. TOC items can then be clicked to navigate to the relevant section (similar to HTML <a> anchors).

To make use of this feature, simply pass --generate-toc as an argument.

Quick start

$ cd cmd/md2pdf
$ go run md2pdf.go -i test.md -o test.pdf

To enable syntax highlighting, pass the path to the syntax files with -s:

$ go run md2pdf.go -i syn_test.md -s /path/to/syntax_files -o test.pdf

To convert multiple MD files into a single PDF, use:

$ go run md2pdf.go -i /path/to/md/directory -o test.pdf

This repo has the gohighlight module configured as a submodule, so if you clone with --recursive, you will have the highlight dir in its root. Alternatively, you may issue the following command to update an existing clone:

$ git submodule update --remote --init

Note 1: the cmd folder has an example for the syntax highlighting. See the script run_syntax_highlighting.sh. This example assumes that the folder with the syntax files is located at a relative location: ../../../jessp01/gohighlight/syntax_files.

Note 2: when annotating the code block to specify the language, the annotation name must match the syntax base filename.

Additional options

  -author string
    	Author name; used if -footer is passed
  -font-file string
    	path to font file to use
  -font-name string
    	Font name ID; e.g 'Helvetica-1251'
  -generate-toc
    	Auto Generate Table of Contents (TOC)
  -help
    	Show usage message
  -i string
    	Input filename, dir consisting of .md|.markdown files or HTTP(s) URL; default is os.Stdin
  -log-file string
    	Path to log file
  -new-page-on-hr
    	Interpret HR as a new page; useful for presentations
  -o string
    	Output PDF filename; required
  -orientation string
    	[portrait | landscape] (default "portrait")
  -page-size string
    	[A3 | A4 | A5] (default "A4")
  -s string
    	Path to github.com/jessp01/gohighlight/syntax_files
  -theme string
    	[light | dark | /path/to/custom/theme.json] (default "light")
  -title string
    	Presentation title
  -unicode-encoding string
    	e.g 'cp1251'
  -version
    	Print version and build info
  -with-footer
    	Print doc footer (<author>  <title>  <page number>)

For example, the below will:

  • Set the title to My Grand Title
  • Set Random Bloke as the author (used in the footer)
  • Set the dark theme
  • Start a new page when encountering an HR (---); useful for creating presentations
  • Print a footer (author name, title, page number)

$ go run md2pdf.go -i /path/to/md \
    -o /path/to/pdf --title "My Grand Title" --author "Random Bloke" \
    --theme dark --new-page-on-hr --with-footer
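
The option listing above has the shape produced by Go's standard flag package, which accepts both -name and --name spellings. Assuming that is what md2pdf uses, a minimal sketch of parsing a few of these flags could look like the following. The options struct and parseArgs are illustrative names, not part of md2pdf.

```go
package main

import (
	"flag"
	"fmt"
)

// options mirrors a handful of the md2pdf flags listed above.
type options struct {
	Input, Output, Title, Author, Theme string
	NewPageOnHR, WithFooter             bool
}

// parseArgs parses args with the standard flag package and enforces
// that -o is present, as the usage text marks it "required".
func parseArgs(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("md2pdf-sketch", flag.ContinueOnError)
	fs.StringVar(&o.Input, "i", "", "input file, directory or URL")
	fs.StringVar(&o.Output, "o", "", "output PDF filename; required")
	fs.StringVar(&o.Title, "title", "", "presentation title")
	fs.StringVar(&o.Author, "author", "", "author name")
	fs.StringVar(&o.Theme, "theme", "light", "light, dark or path to JSON")
	fs.BoolVar(&o.NewPageOnHR, "new-page-on-hr", false, "treat HR as a page break")
	fs.BoolVar(&o.WithFooter, "with-footer", false, "print document footer")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	if o.Output == "" {
		return o, fmt.Errorf("-o is required")
	}
	return o, nil
}

func main() {
	o, err := parseArgs([]string{
		"-i", "talk.md", "-o", "talk.pdf",
		"--title", "My Grand Title", "--author", "Random Bloke",
		"--theme", "dark", "--new-page-on-hr", "--with-footer",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", o)
}
```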

Using non-ASCII Glyphs/Fonts

To use a non-ASCII language, the PDF generator must be configured with WithUnicodeTranslator:

// https://en.wikipedia.org/wiki/Windows-1251
opts := []mdtopdf.RenderOption{mdtopdf.WithUnicodeTranslator("cp1251")}
pf := mdtopdf.NewPdfRenderer(mdtopdf.PdfRendererParams{PdfFile: *output, TracerFile: "trace.log", Opts: opts})

In addition, this package's Styler must be used to set the font to match what is configured with the PDF generator.

A complete working example can be found for Russian in the cmd folder named russian.go.
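
To show what translating to cp1251 means at the byte level, the sketch below converts UTF-8 text to Windows-1251 for ASCII and the basic Russian alphabet only. It is not the translator the package configures; toCP1251 is a hypothetical helper, and replacing unmapped runes with '?' is a choice made for this example.

```go
package main

import "fmt"

// toCP1251 converts UTF-8 text to Windows-1251 bytes for ASCII and the
// basic Russian alphabet. Runes outside that set become '?'.
func toCP1251(s string) []byte {
	out := make([]byte, 0, len(s))
	for _, r := range s {
		switch {
		case r < 0x80: // ASCII is identical in both encodings
			out = append(out, byte(r))
		case r >= 0x0410 && r <= 0x044F: // А..я map to 0xC0..0xFF
			out = append(out, byte(r-0x0410+0xC0))
		case r == 0x0401: // Ё
			out = append(out, 0xA8)
		case r == 0x0451: // ё
			out = append(out, 0xB8)
		default:
			out = append(out, '?')
		}
	}
	return out
}

func main() {
	fmt.Printf("% X\n", toCP1251("Привет")) // CF F0 E8 E2 E5 F2
}
```

Each Cyrillic letter occupies two bytes in UTF-8 but a single byte in cp1251, which is why the font selected via the Styler has to be built for the same code page.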

For a full example, run:

$ go run md2pdf.go -i russian.md -o russian.pdf \
    --unicode-encoding cp1251 --font-file helvetica_1251.json --font-name Helvetica_1251

Tests

The tests included in this repo (see the testdata folder) were taken from the BlackFriday package. While the tests may complete without errors, visual inspection of the created PDF is the only way to determine if the tests really pass!

The tests create log files that trace the gomarkdown parser callbacks. This is a valuable debugging tool, showing each callback and the data provided to it as the AST is walked.

Limitations and Known Issues

  • It is common for Markdown to include HTML. HTML is treated as a "code block". There is no attempt to convert raw HTML to PDF.
  • GitHub-flavoured Markdown permits strikethrough using tildes. This is not supported by fpdf as a font style at present.
  • The markdown link title (which would show when converted to HTML as hover-over text) is not supported. The generated PDF will show the URL, but this is a function of the PDF viewer.
  • Definition lists are not supported.
  • The following text features may be tweaked: font, size, spacing, style, fill colour, and text colour. These are exported and available via the Styler struct. Note that fill colour only works when using CellFormat(). This is the case for tables, code blocks, and backticked text.

Contributions

  • Set up and run pre-commit hooks:
# Install the needed GO packages:
go install github.com/go-critic/go-critic/cmd/gocritic@latest
go install golang.org/x/tools/cmd/goimports@latest
go install golang.org/x/lint/golint@latest
go install github.com/gordonklaus/ineffassign@latest

# Install the `pre-commit` util:
pip install pre-commit

# Generate `.git/hooks/pre-commit`:
pre-commit install

Following that, these tests will run every time you invoke git commit:

go fmt...................................................................Passed
go imports...............................................................Passed
go vet...................................................................Passed
go lint..................................................................Passed
go-critic................................................................Passed
  • Submit a pull request and include a succinct description of the feature or issue it addresses

Documentation

Overview

Package mdtopdf implements a PDF document generator for markdown documents.

Introduction

This package depends on two other packages:

* The [gomarkdown](https://github.com/gomarkdown/markdown) parser to read the markdown source

* The fpdf package to generate the PDF

The tests included here are from the BlackFriday package. See the "testdata" folder. The tests create PDF files and thus while the tests may complete without errors, visual inspection of the created PDF is the only way to determine if the tests *really* pass!

The tests create log files that trace the gomarkdown parser callbacks. This is a valuable debugging tool, showing each callback and the data provided to it as the AST is walked.

Installation

To install the package:

go get github.com/mandolyte/mdtopdf/v2

Quick start

The cmd/md2pdf folder contains an example using the package. It demonstrates a number of features. The test PDF was created with this command:

go run md2pdf.go -i test.md -o test.pdf

See the README for limitations and known issues.

Package mdtopdf converts markdown to PDF.

Functions

func ExtractTextFromNode

func ExtractTextFromNode(node ast.Node) string

ExtractTextFromNode recursively extracts text content from AST nodes

Types

type Color

type Color struct {
	Red, Green, Blue int
}

Color is an RGB set of ints; for a nice picker see https://www.w3schools.com/colors/colors_picker.asp

func Colorlookup

func Colorlookup(s string) Color

Colorlookup returns an RGB triple corresponding to a named color, an "rgb(r,g,b)" string or a "#rrggbb" string. On error, it returns black.
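
The three accepted input forms can be illustrated with a small stdlib-only sketch. This is not the package's implementation: lookup is a hypothetical stand-in, and the named map holds only a tiny illustrative subset of color names. As in the description above, any unparseable input yields black (the zero Color).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Color repeats the documented struct so the sketch is self-contained.
type Color struct {
	Red, Green, Blue int
}

// named is an illustrative subset, not the package's color table.
var named = map[string]Color{
	"black": {0, 0, 0},
	"white": {255, 255, 255},
	"red":   {255, 0, 0},
	"blue":  {0, 0, 255},
}

// lookup accepts a color name, "#rrggbb" or "rgb(r,g,b)".
// It returns black when the input cannot be parsed.
func lookup(s string) Color {
	s = strings.ToLower(strings.TrimSpace(s))
	if c, ok := named[s]; ok {
		return c
	}
	if len(s) == 7 && s[0] == '#' {
		v, err := strconv.ParseUint(s[1:], 16, 32)
		if err != nil {
			return Color{}
		}
		return Color{int(v >> 16), int(v >> 8 & 0xff), int(v & 0xff)}
	}
	if strings.HasPrefix(s, "rgb(") && strings.HasSuffix(s, ")") {
		parts := strings.Split(s[4:len(s)-1], ",")
		if len(parts) != 3 {
			return Color{}
		}
		var rgb [3]int
		for i, p := range parts {
			n, err := strconv.Atoi(strings.TrimSpace(p))
			if err != nil || n < 0 || n > 255 {
				return Color{}
			}
			rgb[i] = n
		}
		return Color{rgb[0], rgb[1], rgb[2]}
	}
	return Color{}
}

func main() {
	for _, s := range []string{"blue", "#ff8800", "rgb(10, 20, 30)", "no-such-color"} {
		fmt.Printf("%-16s %+v\n", s, lookup(s))
	}
}
```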

type PdfRenderer

type PdfRenderer struct {
	// Pdf can be used to access the underlying created fpdf object
	// prior to processing the markdown source
	Pdf *fpdf.Fpdf

	// normal text
	Normal Styler

	// link text
	Link Styler

	// backticked text
	Backtick Styler

	// blockquote text
	Blockquote  Styler
	IndentValue float64

	// Headings
	H1 Styler
	H2 Styler
	H3 Styler
	H4 Styler
	H5 Styler
	H6 Styler

	// Table styling
	THeader Styler
	TBody   Styler

	// code styling
	Code Styler

	// update styling
	NeedCodeStyleUpdate       bool
	NeedBlockquoteStyleUpdate bool
	HorizontalRuleNewPage     bool
	SyntaxHighlightBaseDir    string
	InputBaseURL              string
	Theme                     Theme
	BackgroundColor           Color

	Extensions   parser.Extensions
	ColumnWidths map[ast.Node][]float64
	// contains filtered or unexported fields
}

PdfRenderer is the struct to manage conversion of a markdown object to PDF format.

func NewPdfRenderer

func NewPdfRenderer(params PdfRendererParams) *PdfRenderer

NewPdfRenderer creates and configures a PdfRenderer object, which satisfies the Renderer interface.

func NewPdfRendererWithDefaultStyler

func NewPdfRendererWithDefaultStyler(orient, papersz, pdfFile, tracerFile string, defaultStyler Styler, opts []RenderOption, theme Theme) *PdfRenderer

NewPdfRendererWithDefaultStyler creates and configures a PdfRenderer object, which satisfies the Renderer interface, and updates the default styler for normal text.

func (*PdfRenderer) Process

func (r *PdfRenderer) Process(content []byte) error

Process takes the markdown content and parses it to generate the PDF.

func (*PdfRenderer) RenderFooter

func (r *PdfRenderer) RenderFooter(w io.Writer, _ ast.Node)

RenderFooter is not supported.

func (*PdfRenderer) RenderHeader

func (r *PdfRenderer) RenderHeader(w io.Writer, ast ast.Node)

RenderHeader is not supported.

func (*PdfRenderer) RenderNode

func (r *PdfRenderer) RenderNode(w io.Writer, node ast.Node, entering bool) ast.WalkStatus

RenderNode is a default renderer of a single node of a syntax tree. For block nodes it will be called twice: first time with entering=true, second time with entering=false, so that it could know when it's working on an open tag and when on close. It writes the result to w.

The return value is a way to tell the calling walker to adjust its walk pattern: e.g. it can terminate the traversal by returning Terminate. Or it can ask the walker to skip a subtree of this node by returning SkipChildren. The typical behavior is to return GoToNext, which asks for the usual traversal to the next node. (above taken verbatim from the blackfriday v2 package)

func (*PdfRenderer) Run

func (r *PdfRenderer) Run(content []byte) error

Run takes the markdown content and parses it but does not generate the PDF. You can access the PDF with yourRenderer.Pdf.

func (*PdfRenderer) SetCustomTheme

func (r *PdfRenderer) SetCustomTheme(themeJSONFile string)

SetCustomTheme sets a custom theme based on JSON config

func (*PdfRenderer) SetDarkTheme

func (r *PdfRenderer) SetDarkTheme()

SetDarkTheme sets theme to 'dark'

func (*PdfRenderer) SetLightTheme

func (r *PdfRenderer) SetLightTheme()

SetLightTheme sets theme to 'light'

func (*PdfRenderer) SetPageBackground

func (r *PdfRenderer) SetPageBackground(colorStr string, color Color)

SetPageBackground - sets the background colour of the page. String IDs ("blue", "grey", etc.) and `Color` structs are both supported.

func (*PdfRenderer) SetTOCLinks

func (r *PdfRenderer) SetTOCLinks(tocHeaders map[string]*int)

SetTOCLinks stores the TOC header links. They are used in `nodeProcessing.go:processText()` when a header is encountered, since `r.Pdf.SetLink()` must be called in that case.

func (*PdfRenderer) UpdateBlockquoteStyler

func (r *PdfRenderer) UpdateBlockquoteStyler()

UpdateBlockquoteStyler - update Blockquote fill styler

func (*PdfRenderer) UpdateCodeStyler

func (r *PdfRenderer) UpdateCodeStyler()

UpdateCodeStyler - update code fill styler

func (*PdfRenderer) UpdateParagraphStyler

func (r *PdfRenderer) UpdateParagraphStyler(defaultStyler Styler)

UpdateParagraphStyler - update with default styler

type PdfRendererParams

type PdfRendererParams struct {
	Orientation, Papersz, PdfFile, TracerFile, FontFile, FontName string
	Opts                                                          []RenderOption
	Theme                                                         Theme
	CustomThemeFile                                               string
}

PdfRendererParams holds the parameters passed to NewPdfRenderer.

type RenderOption

type RenderOption func(r *PdfRenderer)

RenderOption allows defining functions that configure the renderer.
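
A function type that receives the renderer is the usual Go "functional options" pattern. The sketch below shows how such options are typically applied, using stand-in types so it compiles without the module. The names renderer, option, hrNewPage, syntaxDir and newRenderer are hypothetical, and how NewPdfRenderer applies its Opts internally is an assumption.

```go
package main

import "fmt"

// renderer is a stand-in for PdfRenderer carrying two of its documented fields.
type renderer struct {
	HorizontalRuleNewPage  bool
	SyntaxHighlightBaseDir string
}

// option has the same shape as RenderOption: a function that mutates the renderer.
type option func(r *renderer)

// hrNewPage plays the role of IsHorizontalRuleNewPage.
func hrNewPage(value bool) option {
	return func(r *renderer) { r.HorizontalRuleNewPage = value }
}

// syntaxDir plays the role of SetSyntaxHighlightBaseDir.
func syntaxDir(path string) option {
	return func(r *renderer) { r.SyntaxHighlightBaseDir = path }
}

// newRenderer builds a renderer with defaults, then applies each option in order.
func newRenderer(opts ...option) *renderer {
	r := &renderer{}
	for _, apply := range opts {
		apply(r)
	}
	return r
}

func main() {
	r := newRenderer(hrNewPage(true), syntaxDir("/path/to/syntax_files"))
	fmt.Printf("%+v\n", *r)
}
```

The pattern keeps the constructor signature stable: new settings become new option functions rather than new parameters.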

func IsHorizontalRuleNewPage

func IsHorizontalRuleNewPage(value bool) RenderOption

IsHorizontalRuleNewPage, if true, starts a new page when encountering an HR (---). Useful for presentations.

func SetSyntaxHighlightBaseDir

func SetSyntaxHighlightBaseDir(path string) RenderOption

SetSyntaxHighlightBaseDir sets the path to https://github.com/jessp01/gohighlight/tree/master/syntax_files

func WithUnicodeTranslator

func WithUnicodeTranslator(cp string) RenderOption

WithUnicodeTranslator configures a Unicode translator to support characters for Latin, Russian, etc.

type Styler

type Styler struct {
	Font      string
	Style     string
	Size      float64
	Spacing   float64
	TextColor Color
	FillColor Color
}

Styler is the struct to capture the styling features for text. Size and Spacing are specified in points. The sum of Size and Spacing is used as the line height value in the fpdf API.
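
As a worked example of the line-height rule, the sketch below redeclares Styler locally and adds the two fields. lineHeight is a hypothetical helper, and the font name and sizes are made-up values.

```go
package main

import "fmt"

// Color and Styler repeat the documented field layout so the sketch
// compiles without the mdtopdf module.
type Color struct {
	Red, Green, Blue int
}

type Styler struct {
	Font      string
	Style     string
	Size      float64
	Spacing   float64
	TextColor Color
	FillColor Color
}

// lineHeight returns Size plus Spacing, both in points.
func lineHeight(s Styler) float64 {
	return s.Size + s.Spacing
}

func main() {
	body := Styler{Font: "Arial", Size: 12, Spacing: 2}
	heading := Styler{Font: "Arial", Style: "B", Size: 24, Spacing: 5}
	fmt.Println(lineHeight(body))    // 14
	fmt.Println(lineHeight(heading)) // 29
}
```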

type TOCEntry

type TOCEntry struct {
	Level int
	Title string
	ID    string
}

TOCEntry represents a table of contents entry

func GetTOCEntries

func GetTOCEntries(content []byte) ([]TOCEntry, error)

GetTOCEntries returns TOC entries

type TOCVisitor

type TOCVisitor struct {
	Entries []TOCEntry
}

TOCVisitor implements ast.NodeVisitor to collect headers

func (*TOCVisitor) Visit

func (v *TOCVisitor) Visit(node ast.Node, entering bool) ast.WalkStatus

Visit implements the ast.NodeVisitor interface

type Theme

type Theme int

Theme [light|dark|custom]

const (
	// DARK theme const
	DARK Theme = 1
	// LIGHT theme const
	LIGHT Theme = 2
	// CUSTOM theme const
	CUSTOM Theme = 3
)

Directories

Path Synopsis
cmd
md2pdf command
