These notes are from a workshop on pandoc I ran for Innovative Learning Week 2015.
You can get the PDF file produced with pandoc from the same source as this page here.
* pandoc is the program written by the philosopher John MacFarlane to convert texts between different formats
* pandoc is able to take files in a variety of formats and produce output in an even bigger variety of formats
* markdown is a lightweight mark-up language
* markdown
* docx, i.e. Word
* tex, which pandoc uses internally to produce PDF
pandoc
This code will produce a presentation slide with a nested bullet list with the LaTeX package beamer:
\begin{frame}{Slide title}
\begin{itemize}
\item Item 1
\item Item 2
\begin{itemize}
\item Subitem 1
\item Subitem 2
\end{itemize}
\item Item 3
\end{itemize}
\end{frame}
The same code will be produced if you run pandoc to produce beamer output on the following markdown fragment:
## Slide title ##
* Item 1
* Item 2
    * Subitem 1
    * Subitem 2
* Item 3
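As a minimal sketch of how the conversion above might be run: the file names slides.md and slides.tex are placeholders chosen for this illustration, and the snippet skips the conversion if pandoc is not installed. The exact LaTeX that comes out can differ in small details between pandoc versions.

```shell
# Save the markdown fragment as slides.md (the file name is arbitrary).
# The sub-items are indented by four spaces so that they nest.
cat > slides.md <<'EOF'
## Slide title ##

* Item 1
* Item 2
    * Subitem 1
    * Subitem 2
* Item 3
EOF

# -t beamer selects beamer output. Without -s, pandoc writes only the
# frame itself rather than a complete LaTeX document.
if command -v pandoc >/dev/null 2>&1; then
    pandoc slides.md -t beamer -o slides.tex && cat slides.tex \
        || echo "pandoc reported an error"
else
    echo "pandoc not found; skipping the conversion"
fi
```

Adding -s to the command should give a complete beamer document that LaTeX can compile directly.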
* With pandoc, you can get LaTeX typography without ever touching the LaTeX code: pandoc creates perfectly serviceable LaTeX code and can run LaTeX for you to produce a PDF file
pandoc?
pandoc workflow
* markdown support built-in
* markdown support
* Emacs and Vim. They are very powerful and have excellent support for both markdown and pandoc. I would not recommend learning markdown and one of these editors at the same time (or at least not with any sort of deadline looming!), but these editors are what I recommend. Gentler introductions to Emacs are available in the shape of Aquamacs (this is essentially Emacs with a slightly more traditional interface; OS X only) and Kieran Healy’s Emacs Starter Kit, specifically geared towards social scientists (close enough to linguistics!)
* pandoc. There is no graphical user interface (unless you use Emacs or Vim…), so the program must be run from the command line
* Run cmd or powershell to open the Terminal
* Change to your working folder with cd, e.g. cd ~/Documents/Essays (~ is an abbreviation for /Users/<your username>, i.e. your home folder) or cd C:\Users\<your name>\Documents
* Press <TAB> to auto-complete the path
pandoc
The basic way of running pandoc looks like the following:
pandoc notes.md -o notes.docx
This will run pandoc on the file notes.md. The input file’s extension (.md is conventionally used for markdown files) tells pandoc that it is written in markdown, and the output file’s extension (.docx) tells it that you want Word output.[1]
Now try this (this requires a LaTeX system to be installed):
pandoc notes.md -o notes.pdf
This creates a .tex file and runs LaTeX on it to produce a PDF file with the LaTeX defaults, which is probably already better than Word!
The following sample gives the main syntax constructs for Markdown as extended by pandoc:
# Top-level title #

## Second-level title ##

### Third-level title ###

(you get the picture)

_italics like this_ or *this is also italics*

__bold like this__ or **this is also bold**

A link that leads to [pandoc's homepage](http://johnmacfarlane.net/pandoc)

* A top-level bullet list
* Another item
    * A sub-item
    * Another sub-item
        * And even deeper nesting

1. A numbered list
2. Another item on the list
1. The actual numbers do not matter
5. So you don't have to renumber things if you rearrange them

~~This is strikethrough~~ (not really useful perhaps except for some
syntax?)

> If you have long quotations, you can typeset them in blocks like
> this

This will be ~subscript~ and this will be ^superscript^

You can also have footnotes.[^1]

[^1]: Again, the precise number does not matter, as long as it's the same in the reference and the note itself. You can intersperse the footnotes with the text or put them all at the end; they will come out as footnotes anyway.

(@ex) This will be a numbered example

You can refer to it in the text by writing (@ex) again --- as long as
the label is unique within the document, the numbering and referencing
will be automatic
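One way to check how pandoc treats its own extensions to Markdown is to convert a small file and read the result in the terminal. The sketch below assumes a file called syntax.md (a name picked for this illustration) and HTML as the target; it skips the conversion if pandoc is not installed.

```shell
# Write a small file that exercises the pandoc-specific constructs:
# numbered examples, subscript and superscript, and footnotes.
cat > syntax.md <<'EOF'
(@ex) This will be a numbered example

You can refer to it in the text by writing (@ex) again.

This will be ~subscript~ and this will be ^superscript^.

You can also have footnotes.[^1]

[^1]: The precise number does not matter.
EOF

# With -t and no -o, pandoc prints the converted text to the terminal.
if command -v pandoc >/dev/null 2>&1; then
    pandoc syntax.md -t html || echo "pandoc reported an error"
else
    echo "pandoc not found; skipping the conversion"
fi
```

Swapping html for latex or plain in the command shows how the same source comes out in other formats.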
You can pass options to pandoc that influence what it does. Most options have a short and a long form:
pandoc myfile.md -o myfile.pdf
pandoc myfile.md --output=myfile.pdf
* -o FILE / --output=FILE: the name of the file you want to produce. pandoc tries to guess the output format using the extension. If you do not pass this option, pandoc will just spit out the result of the conversion back into the terminal
* -t FORMAT / --to=FORMAT: the output format, such as docx, latex, html or even plain
* -S / --smart (capitalization matters! This is also a logical option, meaning there is no argument): typographically correct output
    * - is a hyphen (used in contexts such as ‘a difficult-to-parse document’)
    * -- is an en dash (used to denote ranges of numbers, such as 2–4)
    * --- is an em dash (the parenthetical dash — like this)
    * " and ' are corrected to curly quotes depending on context
    * ... is corrected to …
* -s / --standalone: if you are converting to a format such as LaTeX or HTML, use this option: it will produce a complete file with all the necessary headers and footers
* -V KEY[=VAL] / --variable=KEY:VAL: this is used for setting variables such as author or font; see examples below
    * -V mainfont="Times New Roman" or --variable=mainfont:"Times New Roman"
* -M KEY[=VAL] / --metadata=KEY:VAL: this is used for setting metadata
* --toc / --table-of-contents: include a table of contents
* -N / --number-sections: what it says. By default, sections in LaTeX (and therefore PDF) output are unnumbered, so it makes sense to turn this on. This has no effect in .docx files, however; you need to fix the .docx template to achieve that.
* --reference-docx=FILE: you can create a .docx file with the correct styling (e.g. fonts, sizes, colours) and reuse it by passing in this option. The content of the reference file will be ignored. It is recommended to create a .docx file using pandoc, edit its styles to achieve the desired result, and reuse it.
* --latex-engine=pdflatex|lualatex|xelatex: see below
By default, pandoc creates PDF output using pdflatex, a very stable version of LaTeX that is, however, quite archaic in its handling of fonts
* --latex-engine=xelatex
* --variable=mainfont:<name of font>
* --variable=geometry:a4paper
The --metadata option is very similar to --variable, and it’s used in a similar way:
--metadata=author:"Pavel Iosad" --metadata=title:"Pandoc notes"
Alternatively, you can start your markdown source file with three lines that all start with %:
% This is the title
% This is the author
% This is the date
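The title block and a few of the options above can be combined as in the following sketch. The file names myfile.md and myfile.tex and the section text are made up for this illustration. The example writes LaTeX rather than PDF so that it does not need a LaTeX installation, and it skips the conversion if pandoc is not installed.

```shell
# A source file that opens with a three-line title block.
cat > myfile.md <<'EOF'
% This is the title
% This is the author
% This is the date

# Introduction #

Some text that mentions pages 2--4.

# Conclusion #

Some more text.
EOF

# -s     produce a complete document with headers and footers
# --toc  include a table of contents
# -N     number the sections
if command -v pandoc >/dev/null 2>&1; then
    pandoc myfile.md -s --toc -N -t latex -o myfile.tex \
        || echo "pandoc reported an error"
else
    echo "pandoc not found; skipping the conversion"
fi
```

For PDF output you would replace -t latex -o myfile.tex with -o myfile.pdf and add the engine and font options listed above. Note that pandoc 2.0 and later call the engine option --pdf-engine rather than --latex-engine.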
For more on metadata, see the pandoc manual.
pandoc can also do automated tracking of references and citations
* When you cite a work in the text, pandoc will take care of putting it into your reference list and typesetting the entry in line with the style you require. This process is automatic, so if you end up deleting the in-text citation the entry will also not appear in the reference list.
* Since .bib files are also plain text files, it is possible to edit them using your text editor too.
* .bib when you update it.
* Other bibliography formats are supported by pandoc, but BibTeX is the most portable
@Book{kiparsky82:_explan,
  author = {Kiparsky, Paul},
  title = {Explanation in phonology},
  publisher = {Foris},
  year = 1982,
  location = {Dordrecht}}

* The citation key is what follows the first { in the source and ‘cite key’ in the window
* SPE for Chomsky & Halle 1967
* [@kiparsky82:_explan] ⇒ (Kiparsky 1982)
* [@kiparsky82:_explan, p. 1] ⇒ (Kiparsky 1982, p. 1)
* @kiparsky82:_explan shows ⇒ Kiparsky (1982) shows
* @kiparsky82:_explan [p. 1] shows ⇒ Kiparsky (1982, p. 1) shows
* Phonology has some explaining to do [as shown by @kiparsky82:_explan] ⇒ Phonology has some explaining to do (as shown by Kiparsky 1982)
* One of Kiparsky's important works [-@kiparsky82:_explan] ⇒ One of Kiparsky’s important works (1982)
* [@spe; @kiparsky82:_explan] ⇒ (Chomsky & Halle 1967; Kiparsky 1982)
pandoc uses the .csl format to describe citation styles
The following command produces a .docx file with a bibliography styled using the Unified Style Sheet:
pandoc myfile.md -o myfile.docx --bibliography=path/to/your/bib/file --csl=path/to/your/csl/file
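A self-contained way to try the citation machinery might look like the sketch below. The file names refs.bib and cite.md are made up for this illustration, and no --csl file is given, so pandoc falls back on its default citation style. One assumption deserves a flag: recent pandoc releases only process citations when --citeproc is passed, whereas older ones did so as soon as --bibliography was given, so the snippet checks which applies.

```shell
# A one-entry bibliography, reusing the entry shown earlier.
cat > refs.bib <<'EOF'
@Book{kiparsky82:_explan,
  author = {Kiparsky, Paul},
  title = {Explanation in phonology},
  publisher = {Foris},
  year = 1982,
  location = {Dordrecht}}
EOF

# A source file with one in-text citation and a final heading
# under which the reference list will appear.
cat > cite.md <<'EOF'
@kiparsky82:_explan [p. 1] shows that phonology has some explaining to do.

# References #
EOF

if command -v pandoc >/dev/null 2>&1; then
    # Assumption: if --help lists --citeproc, this release needs it.
    if pandoc --help | grep -q -- '--citeproc'; then
        pandoc cite.md --citeproc --bibliography=refs.bib -t plain
    else
        pandoc cite.md --bibliography=refs.bib -t plain
    fi || echo "pandoc reported an error"
else
    echo "pandoc not found; skipping the conversion"
fi
```

Adding --csl=path/to/your/csl/file to either pandoc call switches the output to the style described in that file.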
pandoc puts the reference list at the end of the document, so end your source file with a heading for it:
# References #
* knitr, the R library for reproducible research. By default (as set up in RStudio), knitr outputs HTML files, but it can also be set up to generate other formats via pandoc (see here)
* pandoc utilities (both are also excellent LaTeX editors if you go down that particular rabbit hole)
[1] This is equivalent to spelling out both formats: pandoc notes.md -f markdown -t docx -o notes.docx
I’m Pavel Iosad, and I’m a Professor in the department of Linguistics and English Language at the University of Edinburgh.