Convert Markdown Files to PDF with Rake and Pandoc


A post I wrote last fall about using Make to convert all the Markdown files in a directory to PDFs continues to draw lots of traffic, so it must be a solution that people find useful. I’ve since started using Rake instead of Make. You can find any number of comparisons between Make and Rake. I use Rake because I like having the power of Ruby, though Make handles dependencies much better than Rake does.

Here is a generic Rakefile that builds a PDF and a DOCX from every Markdown file in a directory.

    require "rake/clean"

    # Define inputs and outputs
    MDFILES = FileList["*.md"]
    PDFS = MDFILES.ext(".pdf")
    DOCX = MDFILES.ext(".docx")

    # Clobber only PDFs and DOCXs we've generated
    CLOBBER.include(PDFS, DOCX)

    # Define bibliography and CSL files
    BIB = "/Users/lmullen/acad/research/bib/master.bib"
    CSL = "chicago-mullen.csl"

    desc "Build all documents in all formats."
    task :default => [:pdfs, :docx]

    desc "Build PDFs of all documents."
    task :pdfs => PDFS

    desc "Build DOCXs of all documents."
    task :docx => DOCX

    # Build PDFs from Markdown source
    rule ".pdf" => ".md" do |t|
      sh "pandoc #{t.source} -o #{t.name} --csl=#{CSL} --bibliography=#{BIB}"
    end

    # Build DOCXs from Markdown source
    rule ".docx" => ".md" do |t|
      sh "pandoc #{t.source} -o #{t.name} --csl=#{CSL} --bibliography=#{BIB}"
    end
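The PDFS and DOCX lists in the Rakefile come from calling `ext` on a `FileList`, which swaps each file's extension. Here is a minimal sketch of that mapping on its own. The helper `output_names` and the file names are hypothetical, chosen only for illustration, and the sketch assumes the `rake` gem is installed.

```ruby
require "rake"

# Hypothetical helper, not part of the Rakefile above: given Markdown
# file names, return the output names Rake would be asked to build.
def output_names(md_names, extension)
  # FileList#ext replaces each file's extension with the one given.
  Rake::FileList[*md_names].ext(extension).to_a
end

# Made-up file names; they do not need to exist on disk for this to run.
puts output_names(["notes.md", "chapter-1.md"], ".pdf").inspect
puts output_names(["notes.md", "chapter-1.md"], ".docx").inspect
```

Because each name in PDFS ends in `.pdf`, the `rule ".pdf" => ".md"` line tells Rake to look for a source with the same base name and a `.md` extension, and to rebuild only when that source is newer than the output.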

If you prefer that your filenames be document.md.pdf instead of document.pdf, then you have to do something ugly and replace the rules above with these rules:

    # Build PDFs from Markdown source
    # (dots are escaped so they match a literal "." and not any character)
    rule( /\.md\.pdf$/ => [
      proc {|task_name| task_name.sub(/\.md\.pdf$/, '.md') }
      ]) do |t|
      sh "pandoc #{t.source} -o #{t.name} --csl=#{CSL} --bibliography=#{BIB}"
    end

    # Build DOCXs from Markdown source
    rule( /\.md\.docx$/ => [
      proc {|task_name| task_name.sub(/\.md\.docx$/, '.md') }
      ]) do |t|
      sh "pandoc #{t.source} -o #{t.name} --csl=#{CSL} --bibliography=#{BIB}"
    end
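One step these rules seem to leave implicit: the PDFS and DOCX lists would also need to produce names like document.md.pdf, since `ext(".pdf")` yields document.pdf, which the regex rules will not match. A minimal sketch of one way to do that uses `pathmap` to append a suffix instead of swapping the extension. The helpers `appended_names` and `source_for` are hypothetical names introduced here for illustration, and the sketch assumes the `rake` gem is installed.

```ruby
require "rake"

# Hypothetical helper: append a suffix to each name, keeping ".md".
# In pathmap, "%p" stands for the complete original path.
def appended_names(md_names, suffix)
  Rake::FileList[*md_names].pathmap("%p#{suffix}").to_a
end

# Hypothetical helper mirroring the proc in the rules: map an output
# name such as "notes.md.pdf" back to its Markdown source.
def source_for(task_name)
  task_name.sub(/\.md\.(pdf|docx)\z/, ".md")
end

puts appended_names(["notes.md"], ".pdf").inspect
puts source_for("notes.md.docx")
```

In the Rakefile itself this would presumably mean defining the lists as `PDFS = MDFILES.pathmap("%p.pdf")` and `DOCX = MDFILES.pathmap("%p.docx")`.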

