Nanoc, Pandoc, and Pygments ― Oh my!
February 5, 2015
I wanted to add syntax highlighting to my blog, but I was at a loss as to where to begin. As of writing, I’m using nanoc to compile my site, but my posts are written in pandoc’s markdown.
For my initial iteration, I simply enabled syntax highlighting in pandoc. Pandoc uses the highlighting-kate package. Both packages are developed by the extraordinary John MacFarlane.
Unfortunately, while highlighting-kate has support for a large number of languages (112 as of writing), I find pygments still outperforms other highlighters, both in the number of supported languages (over 300) and in the richness and accuracy of its syntax definitions.
It turns out nanoc has support for pygments through its :colorize_syntax filter. Unfortunately, when pandoc generates code blocks, it nests a <code> element inside a <pre> element, and places a CSS class with the language’s name on the <pre> element. :colorize_syntax, however, requires that the CSS class be on the <code> element, and that it carry the prefix language- [1].
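To make the mismatch concrete, here is a minimal sketch of the class rewrite that’s needed, using only the Ruby standard library. KNOWN_LANGUAGES and code_classes_for are hypothetical names for illustration; the hard-coded set stands in for a real lexer lookup, and no HTML parsing is done here.

```ruby
require "set"

# Hypothetical stand-in for a real lexer lookup (assumption for illustration).
KNOWN_LANGUAGES = Set.new(%w[ruby python haskell]).freeze

# Maps the class list pandoc puts on <pre> to the class string that
# :colorize_syntax expects on <code>. Classes that name a known language
# get the language- prefix; everything else is passed through unchanged.
def code_classes_for(pre_classes)
  pre_classes
    .split(/\s+/)
    .reject(&:empty?)
    .map { |name| KNOWN_LANGUAGES.include?(name) ? "language-#{name}" : name }
    .join(" ")
end

# Before (pandoc):           <pre class="ruby"><code>...</code></pre>
# After (:colorize_syntax):  <pre><code class="language-ruby">...</code></pre>
puts code_classes_for("ruby")             # prints "language-ruby"
puts code_classes_for("sourceCode ruby")  # prints "sourceCode language-ruby"
```

Only classes that name a language are prefixed, so any other classes on the element survive the move untouched.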
To work around this, I’ve devised a simple nanoc filter that uses nokogiri to transform pandoc’s output into one supported by nanoc.
# You may use this snippet under the WTFPL <http://www.wtfpl.net/>
#
# Converts pandoc-generated HTML to the format the :colorize_syntax filter
# understands, with the goal of using pygments with pandoc output.
#
# There are two issues:
# - Pandoc puts the classes on the <pre> tag, but nanoc wants them on <code>
# - Pandoc doesn't use the language- prefix on classes that nanoc wants
class PandocToColorize < Nanoc::Filter
  require "nokogiri"
  require "pygments"

  identifier :pandoc_to_colorize
  type :text

  def run(content, params={})
    doc = Nokogiri::HTML.parse content
    doc.css("pre > code").each do |element|
      # Nothing to move if the enclosing <pre> carries no classes.
      next unless element.parent["class"]
      element["class"] ||= ""
      element.parent["class"].split(/\s+/).each do |cl|
        # Prefix only the classes that pygments recognizes as a language.
        if Pygments::Lexer.find(cl)
          element["class"] += " language-#{cl}"
        else
          element["class"] += " #{cl}"
        end
      end
      element.parent.delete "class"
      element["class"] = element["class"].strip
    end
    doc.to_html
  end
end
I apologize if my Ruby isn’t idiomatic. I’m unfamiliar with the language.
You can then simply drop the filter into lib/pandoc_to_colorize.rb, and add filter :pandoc_to_colorize to your Rules file.
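For context, a compile rule using the filter might look something like the sketch below. The glob pattern (nanoc 4 style), the layout name, and the choice of :pygmentsrb as the default colorizer are illustrative assumptions; the part that matters is the ordering, with the new filter sitting between pandoc and :colorize_syntax.

```ruby
# Rules (sketch; the pattern and layout name are placeholders)
compile '/**/*.md' do
  filter :pandoc                # markdown -> HTML, classes end up on <pre>
  filter :pandoc_to_colorize    # move classes to <code>, add language- prefix
  filter :colorize_syntax, default_colorizer: :pygmentsrb
  layout '/default.*'
end
```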
Update (November 2016)
It looks like pygments.rb isn’t actively maintained anymore. However, rouge is, and it’s pretty easy to convert this filter to one that works with Rouge. Simply replace require "pygments" with require "rouge" and Pygments::Lexer with Rouge::Lexer.
[1] This is probably for the best anyway, as it prevents naming conflicts.