
Futurile



Complex text formatting in Matplotlib
using LaTeX

To create complex text formatting in Matplotlib we have to
use LaTeX. The standard text methods in Matplotlib use a
text object that formats the entire string: this means we
can make all of a string bold but not part of it. For complex
formatting Matplotlib lets us delegate all text handling to
LaTeX. By using LaTeX mark-up within the string we can do
complex formatting like partially bolding some words.
Figure 1 below demonstrates some of the formatting
possibilities using LaTeX.


Figure 1: LaTeX formatting example

This post covers how to set-up LaTeX on Linux (Ubuntu),
how to format labels and annotations, and some of the
gotchas I've discovered. We'll reserve colouring text for
another time.

It builds on previous posts that covered text handling and
describing plots in Matplotlib, see styling plots and
standard text handling.

Matplotlib was primarily developed as a scientific
visualisation library, so it's not that surprising that it uses
LaTeX [1] for complex formatting. It's fairly complicated to
set-up and use if you don't know LaTeX. I'm not an expert
on LaTeX so treat this post with caution! Everything was
tested on Ubuntu 14.04 LTS and Python 3.

To use LaTeX to format Matplotlib text we have to use
LaTeX for all text formatting [2]. The downside is that
processing a plot becomes significantly slower. The
output formats (backends) that support LaTeX are limited
to AGG, PS, PDF and PGF. Consequently, creating SVG
requires a conversion step, which means two steps to
create a Web ready image. Finally, the grey backgrounds
in some styles didn't show up (e.g. ggplot,
fivethirtyeight and BMH), which makes ggplot in particular
unusable.

With those constraints in mind, our first challenge is to set-up
LaTeX.

Install LaTeX

TeX Live is a distribution of LaTeX [3] that is available through the
Ubuntu repositories. Installation is simple:

$ sudo apt-get install texlive
$ sudo apt-get install texlive-latex-extra
$ sudo apt-get install texlive-latex-recommended
$ sudo apt-get install dvipng

This is a very large download (~500MB); you can reduce it a little bit
by removing the documentation.

Test LaTeX

There are a couple of resources on checking that your LaTeX
environment is working correctly [4]. The core of the instructions
is to create a file called 'test.tex' and put the following in it [5].

\documentclass[a4paper,12pt]{article}
\begin{document}

The foundations of the rigorous study of \emph{analysis}
were laid in the nineteenth century, notably by the
mathematicians Cauchy and Weierstrass. Central to the
study of this subject are the formal definitions of
\emph{limits} and \emph{continuity}.

Let $D$ be a subset of $\bf R$ and let
$f \colon D \to \mathbf{R}$ be a real-valued function on
$D$. The function $f$ is said to be \emph{continuous} on
$D$ if, for all $\epsilon > 0$ and for all $x \in D$,
there exists some $\delta > 0$ (which may depend on $x$)
such that if $y \in D$ satisfies
\[ |y - x| < \delta \]
then
\[ |f(y) - f(x)| < \epsilon. \]

One may readily verify that if $f$ and $g$ are continuous
functions on $D$ then the functions $f+g$, $f-g$ and
$f.g$ are continuous. If in addition $g$ is everywhere
non-zero then $f/g$ is continuous.

\end{document}
Run the following:

# convert the tex file to a dvi file
$ latex test.tex
# open the dvi file viewer
$ xdvi test.dvi
# convert the tex file to a pdf
$ pdflatex test.tex
# open the pdf with whatever pdf viewer is installed
$ xdg-open test.pdf
The first command converts test.tex to test.dvi and the second
command lets you view it with xdvi so you can check the formatting.
Rather than going through DVI, the third command converts from
LaTeX directly to PDF, and the fourth opens the result.
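
Before moving on to Matplotlib, it can help to confirm from Python that the command line tools are actually on the PATH. The following is a minimal sketch using only the standard library. The function name find_tex_tools and the default tool list are illustrative choices, not part of any LaTeX or Matplotlib API.

```python
import shutil


def find_tex_tools(tools=("latex", "pdflatex", "dvipng")):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    # Print one line per tool so a missing install is easy to spot.
    for tool, path in find_tex_tools().items():
        print(f"{tool:10s} {path or 'NOT FOUND'}")
```

Any tool reported as NOT FOUND points back to the apt-get install step above.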

Test LaTeX in Matplotlib

With LaTeX installed and functioning on the system, the next step is
to confirm it's working correctly in Matplotlib. The Matplotlib
documentation covers using LaTeX extensively, see Text rendering
with LaTeX. The easiest way to check is:
1. Download the example from the documentation.
2. Run the demo as python3 tex_demo.py
3. Check the created file with xdg-open tex_demo.png

You should see something similar to Figure 2.

Figure 2: Standard Matplotlib Tex demo

Setting Matplotlib for PDF and XeTeX

To create plots we have to decide which output format to use, as
conversion from LaTeX to an output format isn't handled correctly
by all of them. For example, if we format some text using LaTeX in
Matplotlib and save directly as SVG (with plt.savefig()) the
backend will not process this correctly and we'll lose the formatting.
Outputting from Matplotlib as PDF is the best option as it supports
vector graphics output. As a second step we can use pdf2svg or
inkscape to convert to SVG.

For processing we'll hand off fonts and the mark-up of our plot to LaTeX.
This means we have to set-up LaTeX fully for processing all text
handling elements, and tell Matplotlib's native text handling to get
out of the way. It's surprisingly complicated!

LaTeX is a mature system that was created before modern standard
fonts (e.g. TTF and OpenType) so by default it's unaware of them.
To use standard fonts we set Matplotlib to use PGF [6] and the
XeTeX processor as these are font aware. The first step is to install
XeTeX:

$ sudo apt-get install texlive-xetex


In our plotting source code we tell Matplotlib to use the PGF
backend for processing PDF output. This fragment of code goes at
the top of a plot:

import matplotlib
from matplotlib.backends.backend_pgf import FigureCanvasPgf
matplotlib.backend_bases.register_backend('pdf', FigureCanvasPgf)
By default LaTeX will use a different font to the general
font that Matplotlib uses, so it's best to set the fonts explicitly and to
tell it how we want the Figure set-up. It's possible to define these
settings one at a time with the more Pythonic
plt.rcParams['blah']='blah' but there are a lot of them, so
it's easier to do this [7]:

pgf_with_latex = {
    "pgf.texsystem": "xelatex",         # use Xelatex which is TTF font aware
    "text.usetex": True,                # use LaTeX to write all text
    "font.family": "serif",             # use serif rather than sans-serif
    "font.serif": "Ubuntu",             # use 'Ubuntu' as the standard font
    "font.sans-serif": [],
    "font.monospace": "Ubuntu Mono",    # use Ubuntu mono if we have mono
    "axes.labelsize": 10,               # LaTeX default is 10pt font.
    "font.size": 10,
    "legend.fontsize": 8,               # Make the legend/label fonts a little smaller
    "xtick.labelsize": 8,
    "ytick.labelsize": 8,
    "pgf.rcfonts": False,               # Use pgf.preamble, ignore standard Matplotlib RC
    "text.latex.unicode": True,
    "pgf.preamble": [
        r'\usepackage{fontspec}',
        r'\setmainfont{Ubuntu}',
        r'\setmonofont{Ubuntu Mono}',
        r'\usepackage{unicode-math}',
        r'\setmathfont{Ubuntu}'
    ]
}

matplotlib.rcParams.update(pgf_with_latex)
The first option, pgf.texsystem tells Matplotlib to use the xelatex
program to process the PGF backend. The second option,
text.usetex tells Matplotlib that all text should be processed
using LaTeX.

The various font and axes lines set how Matplotlib processes parts
of the plot, we covered many of these in a previous post. Defining
pgf.rcfonts to False means that the backend will obey the fonts
defined in the pgf.preamble rather than over-riding and using
whatever is in your Matplotlib configuration parameters (e.g.
~/.matplotlibrc). The benefit is we're explicitly defining how XeTeX
will function, and there's no risk of confusion with Matplotlib using
settings from elsewhere. We also tell XeTeX to use unicode
(text.latex.unicode) which allows us to send extended
characters [8].
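
The settings above reflect the Matplotlib release this post was tested with. Later Matplotlib releases are understood to have removed the text.latex.unicode option and to expect pgf.preamble as a single string rather than a list. Treat that as an assumption and check the rcParams documentation for your installed version. Under that assumption, a minimal sketch of building the same settings looks like this. The helper name make_pgf_rc is hypothetical and it only builds a plain dictionary.

```python
def make_pgf_rc(main_font="Ubuntu", mono_font="Ubuntu Mono"):
    """Build an rcParams-style dict with the PGF preamble as one string.

    Assumption: the installed Matplotlib wants pgf.preamble as a string
    and no longer accepts text.latex.unicode, so that key is left out.
    """
    preamble_lines = [
        r"\usepackage{fontspec}",
        r"\setmainfont{" + main_font + "}",
        r"\setmonofont{" + mono_font + "}",
        r"\usepackage{unicode-math}",
        r"\setmathfont{" + main_font + "}",
    ]
    return {
        "pgf.texsystem": "xelatex",   # font-aware TeX engine
        "text.usetex": True,          # hand all text to LaTeX
        "font.family": "serif",
        "pgf.rcfonts": False,         # obey the preamble, not matplotlibrc
        "pgf.preamble": "\n".join(preamble_lines),
    }


rc = make_pgf_rc()
print(rc["pgf.preamble"])
```

The resulting dictionary can be passed to matplotlib.rcParams.update() in place of pgf_with_latex.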

At this point we've told Matplotlib how to handle the plot and that
it should hand over to the LaTeX system. Next, we have to tell the
LaTeX processor how it should handle the incoming stream.

The pgf.preamble section has directives that control how the
xelatex command processes the document for PGF output. These
are LaTeX commands to load packages and alter settings. If we
were using the standard LaTeX backend then we'd provide an
equivalent text.latex.preamble section. We set
\usepackage{fontspec} so that we can define the fonts in
LaTeX output, specifically \setmainfont{Ubuntu} and
\setmonofont{Ubuntu Mono}. The fontspec package is part of
the Ubuntu texlive-latex-recommended package which you may
need to install.

This set-up is sufficient to show the Ubuntu font for text output like
annotate() and xlabel().

Axis Font

At the moment the Axis (i.e. the numbers along the X and Y axes) will
be in the default sans-serif font: many people consider this to be
correct for complex maths which is why it's the default. However, I
want the same font on all elements of my plot. Matplotlib defines
the font LaTeX should use for the Axis in the Mathsfont setting
which also controls how maths equations display. There are a few
options for changing it, depending on your requirements.

Many free fonts cannot display maths script completely [9]. If you
use maths script then the easiest option is to use the cmbright
sans serif font package. To install it get the package
texlive-fonts-extra and put the following in pgf.preamble:

r'\usepackage{cmbright}'
An alternative way to solve this is to update the Axis after it's been
plotted [10], but while a clever hack it's messy as it intermingles
standard matplotlib labelling and LaTeX labelling.

The last option is to change the font Matplotlib uses for maths. The
Ubuntu font is not fully maths capable, but as I'm not using maths
script (my graphs are simple numbers) it's fine for my purposes.
Consequently, I define the maths font using the unicode-math
package. Install the texlive-math-extra package, then in Matplotlib
we can do:

r'\usepackage{unicode-math}',
r'\setmathfont{Ubuntu}'
It's possible to use any TTF font. The easiest way to see which ones
are available on the command line is:

$ fc-list : family file | grep -i libertine


Running gnome-font-viewer gives a visual view of what the font
looks like.
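
The same check can be scripted. The sketch below assumes that `fc-list : family` prints one line per font, with alternative family names separated by commas. That is an assumption about fontconfig's output format, so inspect the raw output on your system first. The names parse_families and installed_families are hypothetical helpers.

```python
import shutil
import subprocess


def parse_families(fc_list_output):
    """Return a sorted list of unique family names from `fc-list : family` text."""
    families = set()
    for line in fc_list_output.splitlines():
        # A line may carry several comma-separated names for one font.
        for name in line.split(","):
            name = name.strip()
            if name:
                families.add(name)
    return sorted(families)


def installed_families():
    """Run fc-list if it is available; return [] when it is not installed."""
    if shutil.which("fc-list") is None:
        return []
    result = subprocess.run(["fc-list", ":", "family"],
                            capture_output=True, text=True, check=True)
    return parse_families(result.stdout)


if __name__ == "__main__":
    print("Ubuntu" in installed_families())
```

Checking for the font name before putting it in \setmainfont{} avoids a slow XeTeX run that only fails at savefig() time.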

Formatting strings

Having completed the set-up, formatting text is pretty
straightforward!

LaTeX uses a lot of backslashes to express formatting. To represent
them in a normal Python string literal requires doubling up the
backslashes so that Python knows we're not trying to create an escape
sequence (e.g. \n). It's nicer to define formatting strings as raw
string literals for Python, which is just a string with r at the start. An
example of this:

plt.ylabel(r'\textbf{A Python raw string} with LaTeX formatting')


This code sets the part of the string "A Python raw string" to be
bold. We can use many of the common LaTeX mark-up commands for
formatting text strings. The common ones are:

Format string                       Output

\textbf{words to bold}              Bold text
\underline{some underlined text}    Underlined text
\textit{words to italic}            Italics
\newline or \\                      Embed a newline in the text (not with the PGF backend)
plt.text(x, y, r'\textit{Some text}', fontsize=16, color='blue')
                                    Standard string formatting options in Matplotlib work
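
To see why the raw string prefix matters for these commands, the following sketch compares a raw literal with its escaped and unescaped equivalents. It uses only built-in Python and nothing here is Matplotlib specific.

```python
raw = r'\textbf{Sales}'        # raw literal: the backslash is kept as typed
escaped = '\\textbf{Sales}'    # normal literal with the backslash doubled
broken = '\textbf{Sales}'      # normal literal: '\t' becomes a TAB character

print(raw == escaped)          # True
print(raw == broken)           # False
print(repr(broken[0]))         # '\t'
```

The \underline command is even less forgiving: written as a normal string, Python 3 rejects '\underline{...}' outright because it reads \u as the start of a Unicode escape.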

The PGF backend doesn't support using LaTeX codes for newlines:
according to this GitHub issue [11] the underlying problem is that
LaTeX doesn't support newlines in a variety of situations. This is
problematic if you're doing a multi-line annotate() or text() but
there are a couple of options. The first is to mix raw text strings and
normal strings together, putting newlines in the normal string:

txtcomment = r'\textbf{First line} of text' + '\n'


The downside with this approach is you have to manually work out
where you want newlines.

The second option is to use textwrap.dedent() and
textwrap.fill() with multi-line strings. The advantage of the
multi-line string is we can tab the text in nicely in the source code,
and in the output we can automatically wrap it at whatever length
we want. We have to use double backslashes to escape the LaTeX
codes properly, and add normal strings if we want to specifically
force a newline at a set point in the string:

import textwrap as tw

note1_txt = 'This is the first line, with a line break \n'
note1_multiline = '''\
    \\textit{These lines are} tabbed in to match
    but will be displayed using textwraps width
    argument. Both strings can have LaTeX in them
    '''
# Remove the indents from the multi-line text, then reformat it to 80 chars
# Add the two strings together so we make a final one to put on the plot
note1_txt += tw.fill(tw.dedent(note1_multiline.rstrip()), width=80)
plt.text(0.6, 130, note1_txt)
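
The same idea can be packaged as a small helper so it is reusable across annotations. This is a sketch only, and wrap_note is a hypothetical name. Note that textwrap counts the mark-up characters (such as \textit{ and }) towards the width, so lines containing mark-up come out visually shorter than the limit.

```python
import textwrap


def wrap_note(first_line, body, width=80):
    """Return first_line, a forced newline, then body re-wrapped to `width`."""
    wrapped = textwrap.fill(textwrap.dedent(body).strip(), width=width)
    return first_line.rstrip() + "\n" + wrapped


note = wrap_note(
    r"\textbf{Notes:}",
    r"""
    Sales for \textit{Product A} have been flat
    through the year. We expect improvement after the new release.
    """,
    width=40,
)
print(note)
```

Because the body is a raw triple-quoted string here, the LaTeX commands need only a single backslash.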
Post processing

The output with plt.savefig() should either be a PGF image to use
within a LaTeX document, or a PDF document. We can convert the
PDF image into an SVG image suitable for the Web [12] using
inkscape [13], pdf2svg or pdftocairo:

$ /usr/bin/pdftocairo -svg some-example.pdf file-to-publish.svg


Generally, the best results are from using transparent=True and
bbox_inches='tight' in the call to plt.savefig().
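
If the conversion is run often it can be driven from Python. The following is a minimal sketch that wraps the pdftocairo command shown above. The function names svg_command and pdf_to_svg are hypothetical, and the choice to write the SVG next to the PDF is an assumption made for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def svg_command(pdf_path):
    """Build the pdftocairo command that writes an SVG next to the PDF."""
    pdf_path = Path(pdf_path)
    svg_path = pdf_path.with_suffix(".svg")
    return ["pdftocairo", "-svg", str(pdf_path), str(svg_path)]


def pdf_to_svg(pdf_path):
    """Run the conversion; raises if pdftocairo is missing or fails."""
    if shutil.which("pdftocairo") is None:
        raise FileNotFoundError("pdftocairo not found on PATH")
    cmd = svg_command(pdf_path)
    subprocess.run(cmd, check=True)
    return Path(cmd[-1])


print(svg_command("matplot-latex.pdf"))
```

Calling pdf_to_svg('matplot-latex.pdf') straight after plt.savefig() keeps the two steps in one script.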

LaTeX example
In this example we use the PGF backend with LaTeX to do complex
formatting on strings. It was output as a PDF and then converted to
SVG for display with pdftocairo. The results are shown in Figure 1
at the top of this post.

#!/usr/bin/env python3
# Set-up PGF as the backend for saving a PDF
import matplotlib
from matplotlib.backends.backend_pgf import FigureCanvasPgf
matplotlib.backend_bases.register_backend('pdf', FigureCanvasPgf)

import matplotlib.pyplot as plt


import textwrap as tw

# Style works - except no Grey background


plt.style.use('fivethirtyeight')

pgf_with_latex = {
    "pgf.texsystem": "xelatex",         # Use xetex for processing
    "text.usetex": True,                # use LaTeX to write all text
    "font.family": "serif",             # use serif rather than sans-serif
    "font.serif": "Ubuntu",             # use Ubuntu as the font
    "font.sans-serif": [],              # unset sans-serif
    "font.monospace": "Ubuntu Mono",    # use Ubuntu for monospace
    "axes.labelsize": 10,
    "font.size": 10,
    "legend.fontsize": 8,
    "axes.titlesize": 14,               # Title size when one figure
    "xtick.labelsize": 8,
    "ytick.labelsize": 8,
    "figure.titlesize": 12,             # Overall figure title
    "pgf.rcfonts": False,               # Ignore Matplotlibrc
    "text.latex.unicode": True,         # Unicode in LaTeX
    "pgf.preamble": [                   # Set-up LaTeX
        r'\usepackage{fontspec}',
        r'\setmainfont{Ubuntu}',
        r'\setmonofont{Ubuntu Mono}',
        r'\usepackage{unicode-math}',
        r'\setmathfont{Ubuntu}'
    ]
}

matplotlib.rcParams.update(pgf_with_latex)

fig = plt.figure(figsize=(8, 6), dpi=400)


plt.bar([1, 2, 3, 4], [125, 100, 90, 110], label="Product A",
width=0.5, align='center')
ax1 = plt.axis()

# LaTeX \newline doesn't work, but we can add multiple lines together
annot1_txt = r'Our \textit{"Green Shoots"} Marketing campaign, started '
annot1_txt += '\n'
annot1_txt += r'in Q3, shows some impact in Q4. Further \textbf{positive} '
annot1_txt += '\n'
annot1_txt += r'impact is expected in \textit{later quarters.}'

# Annotate using an altered arrowstyle for the head_width, the rest


# of the arguments are standard
plt.annotate(annot1_txt, xy=(4, 80), xytext=(1.50, 105),
arrowprops=dict(arrowstyle='-|>, head_width=0.5',
linewidth=2, color='black'),
bbox=dict(boxstyle="round", color='yellow', ec="0.5",
alpha=1))

# Adjust the plot upwards at the bottom so we can fit the figure
# comment as well as the ylabel()
plt.subplots_adjust(bottom=0.15)

# We want a figure text with a separate new line


fig_txt = '\\textbf{Notes:}\n'
comment2_txt = '''\
Sales for \\textit{Product A} have been flat
through the year. We expect improvement after the new release
(codename: \\underline{Starstruck}) in Q2 next year.
'''
fig_txt += tw.fill(tw.dedent(comment2_txt.rstrip()), width=80)
# The YAxis value is -0.06 to push the text down slightly
plt.figtext(0.5, -0.06, fig_txt, horizontalalignment='center',
fontsize=12, multialignment='left',
bbox=dict(boxstyle="round", facecolor='#D8D8D8',
ec="0.5", pad=0.5, alpha=1))

# Standard description of the plot


# Set xticks, font for them is set globally
plt.xticks([1, 2, 3, 4], ['Q1', 'Q2', 'Q3', 'Q4'])
plt.xlabel(r'\textbf{Time} - FY quarters')
plt.ylabel(r'\textbf{Sales} - unadjusted')
plt.title('Total sales by quarter')
plt.legend(loc='best')

plt.savefig('matplot-latex.pdf', bbox_inches='tight',
transparent=True)
LaTeX resources

These are the most useful resources I found for Matplotlib and
LaTeX:

• Matplotlib tutorial section - text rendering with LaTeX


Matplotlib tutorial documentation on using LaTeX for
formatting.

• Bold, italics and underlining in LaTeX


Tutorial on LaTeX basic formatting, with good simple
examples.

• A short introduction to LaTeX 2E


Comprehensive introduction to LaTeX.
• Seamlessly Embedding Matplotlib Output into LaTeX
Sebastian Billaudelle's post placing Matplotlib plots within
LaTeX documents.

• Sebastian Billaudelle's Matplotlibrc


Good ideas here for configuring Matplotlib.

• Bartosz Teleńczuk's post on publication quality plots


Generally covers altering plots with Inkscape, but very
interesting.

• Ubuntu font in LaTeX


Setting up Ubuntu font (or any TTF font) in LaTeX.

• Vim LaTeX plugin or VimTeX or LaTeX-BoX


LaTeX plugins for Vim.

Final words

With LaTeX handling the formatting of all text we can mark-up our
plots with any form of complex formatting we want. The constraints
are that there are two steps to creating Web ready images, grey
backgrounds on styles don't display properly and the processing of
a plot is slow. Despite those issues I think the results are worth the
extra effort.

[1] Overview of LaTeX on Wikipedia and TeX.
[2] This is not strictly true. If you're only interested in maths text
    then LaTeX input is supported by default, see the documentation.
[3] LaTeX on Ubuntu provides a good introduction to the LaTeX
    distribution options on Linux.
[4] James Trimbles' answer to Getting started on LaTeX and Manuel
    Quintero's short tutorial How to install LaTex on Ubuntu 14.04 LTS.
[5] Example from the LaTex Primer via James Trimbles' answer.
[6] PGF provides direct embedding in LaTeX documents and
    matplotlib, for my use case it's really that we're using XeTeX for
    handling the LaTeX input which is key.
[7] Most of these settings are from Bennett Kanuka's great post on
    Native LaTeX plots.
[8] Matplotlib and Xelatex explains the main settings, note that your
    source file also needs to be set for Unicode e.g. coding:utf-8 in vim.
[9] Linux maths fonts
[10] Latex font issues using amsmath and sfmath for plot labelling
[11] PGF backend: Lines in multi-line text drawn at same position
[12] Convert PDF to clean SVG
[13] Using Inkscape is covered in Wikipedia PDF Conversion to SVG.

Posted in Tech
Sunday 13 March 2016

Tagged with Python Matplotlib


1 Comment

David Cortés-Ortuño • a year ago
This is a very useful article :) , I was having a hard time
trying to figure out how to make PGF to work correctly in
Matplotlib. If you use a Jupyter notebook, you must avoid
the `%matplotlib inline` magic, since it resets the matplotlib
settings



Content by Steve George


(Some rights reserved)