Dec 15 2007

Syntax highlighting with Markdown in WordPress

Published by Dougal at 5:18 pm under Blogging, Computing, Programming

I’ve finally got it together and installed Markdown here on my blog. It took a bit of effort to get exactly what I wanted, so I thought I’d document it.

I wanted:

  • Markdown syntax (or similar) for writing blog posts. I’m sick to high heaven of WYSIWYG editors pretending to know what I’m asking for.
  • Syntax highlighting for code snippets.
  • Minimal effort to actually use both of the above.

The third point is pretty important because Markdown on its own has no notion of which language a code block is written in, though it has good means of separating code from other information in a page. You can’t specifically say, “this snippet here, it’s written in Python, so I want you to highlight it accordingly”.

My first thought was a client-side solution. I could forget all about doing my own syntax highlighting and just roll out the PHP Markdown plugin that exists for WordPress. The browsers could then run a bit of JavaScript to highlight any code that appears in my posts. But the number of languages those solutions support is very small. I couldn’t find one that recognised Haskell. Pah! Get with the programme, guys!

So I was back to square one, doing it server-side. I did some more creative googling and discovered this guy’s blog where he’d used a program called highlight on the server to process code fragments. He wrote a small syntax extension to Markdown to enable easy tagging of languages. I set up his solution only to find that highlight segfaults dramatically when asked to do… well, anything actually.

In the end, I used his adaptation of Markdown combined with the very comprehensive GeSHi to do syntax highlighting.

For writing code, I indent all my snippets by four spaces (in the usual Markdown style). If I want it highlighted in a particular language, I create a new line at the beginning of the code fragment which looks like {{lang:something}}. This is pretty unobtrusive to read (an important part of Markdown, I feel) and yet as powerful as I could want.
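
To make the convention concrete, here is a minimal sketch of how an optional first-line tag could be separated from the rest of a snippet. The function name split_lang_tag is hypothetical (it is not part of PHP Markdown or WP-Syntax), and the "txt" fallback simply mirrors the default used in the patch further down.

```php
<?php

// Hypothetical helper: split an optional {{lang:xyz}} first line from a snippet.
// Returns array($lang, $code); $lang falls back to "txt" when no tag is present.
function split_lang_tag($codeblock)
{
    // \A and \z anchor to the whole string; the "s" flag lets "." span newlines.
    if (preg_match('/\A\{\{lang:(\w+)\}\}\n(.*)\z/s', $codeblock, $m)) {
        return array($m[1], $m[2]);
    }
    return array('txt', $codeblock);
}

list($lang, $code) = split_lang_tag("{{lang:haskell}}\nmain = putStrLn \"hi\"");
// $lang now holds the tag's language and $code the snippet without the tag line.
```

A snippet without the tag passes through untouched and is treated as plain text, which is what keeps the tag optional.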

It took a fair amount of effort so I should really lighten the load on everybody else. You need to download and install the WP-Syntax plugin to give you GeSHi. (This will also give you the ability to specify the language in regular <pre> tags if you want.) Then fetch PHP-Markdown Extra (which lets you make footnotes, definition lists and simple tables).

Put the file highlight_helper.php in your plugins directory:

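<?php

// Highlight $source as $lang with GeSHi and return the resulting HTML.
// The GeSHi class itself is loaded by the include added to markdown.php below.
function highlight_src($source, $lang)
{
        $geshi = new GeSHi($source, $lang);

        return $geshi->parse_code();
}

?>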

Edit markdown.php like so:

--- php-markdown-extra_1.1.7/markdown.php   2007-09-26 14:41:22.000000000 +0000
+++ markdown.php    2007-12-15 17:55:09.000000000 +0000
@@ -11,6 +11,8 @@
 # <http://daringfireball.net/projects/markdown/>
 #
 
+include_once 'highlight_helper.php';
+include_once 'wp-syntax/geshi/geshi.php';
 
 define( 'MARKDOWN_VERSION',  "1.0.1k" ); # Wed 26 Sep 2007
 define( 'MARKDOWNEXTRA_VERSION',  "1.1.7" ); # Wed 26 Sep 2007
@@ -1047,7 +1049,7 @@
                                )
                                ((?=^[ ]{0,'.$this->tab_width.'}\S)|\Z) # Lookahead for non-space at line-start, or end of doc
                        }xm',
-           array(&$this, '_doCodeBlocks_callback'), $text);
+           array(&$this, '_doCodeBlocks_highlight_callback'), $text);
 
                return $text;
        }
@@ -1063,6 +1065,21 @@
                $codeblock = "<pre><code>$codeblock\n</code></pre>";
                return "\n\n".$this->hashBlock($codeblock)."\n\n";
        }
+   function _doCodeBlocks_highlight_callback($matches) {
+       $codeblock = $matches[1];
+
+       $codeblock = $this->outdent($codeblock);
+
+       # trim leading newlines and trailing whitespace
+       $codeblock = preg_replace(array('/\A\n+/', '/\s+\z/'), '', $codeblock);
+
+       $codeblock = preg_replace_callback('/^(\{\{lang:([\w]+)\}\}\n|)(.*?)$/s', 
+       create_function('$matches', '
+       return highlight_src($matches[3], empty($matches[2]) ? "txt" : $matches[2]);
+       '), $codeblock);
+
+       return "\n\n<div>\n".$this->hashBlock($codeblock)."\n</div>\n\n";
+   }
 
 
        function makeCodeSpan($code) {

Then turn on the Markdown Extra plugin from the WordPress admin page and you should be ready!
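
One caveat for newer PHP versions: the patch builds its inner callback with create_function, which was deprecated in PHP 7.2 and removed in PHP 8.0. An anonymous function should do the same job. The sketch below is a free-standing approximation of the callback body, not a drop-in replacement for the patch: highlight_tagged_block is a hypothetical name, and the stand-in highlight_src exists only so the example runs without GeSHi installed.

```php
<?php

// Stand-in for the highlight_src() helper from highlight_helper.php,
// so this sketch runs without GeSHi. Assumption: the real helper returns
// highlighted HTML for ($source, $lang).
if (!function_exists('highlight_src')) {
    function highlight_src($source, $lang)
    {
        return '<pre class="' . $lang . '">' . htmlspecialchars($source) . '</pre>';
    }
}

// Hypothetical free-standing version of the patched callback body.
function highlight_tagged_block($codeblock)
{
    return preg_replace_callback(
        '/\A(?:\{\{lang:(\w+)\}\}\n)?(.*)\z/s',
        function ($matches) {
            // No tag on the first line: fall back to plain text, as the patch does.
            $lang = empty($matches[1]) ? 'txt' : $matches[1];
            return highlight_src($matches[2], $lang);
        },
        $codeblock
    );
}

echo highlight_tagged_block("{{lang:python}}\nprint(1 < 2)"), "\n";
```

Inside markdown.php the same closure could replace the create_function call, with the rest of the method left as it is.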

If you know how to combine these two files into one (it shouldn’t be that hard, but it gives me error messages I don’t comprehend), please let me know. I don’t know the first thing about PHP: I’m just guessing and throwing stuff together at random here…

13 Responses to “Syntax highlighting with Markdown in WordPress”

  1. Robert Hulme on 15 Dec 2007 at 7:42 pm

    I am very jealous!

  2. Dougal on 16 Dec 2007 at 5:48 pm

    I know, it’s very exciting. Now I’ll have to start posting stuff with lots of code segments just to use it. Of course, maybe I could just use it to create Christmassy colour schemes.

  3. ConalBlog : Switching blog engines on 15 Jan 2008 at 6:58 am

    […] steps are described in a post on Syntax highlighting with Markdown in WordPress. It uses a combination of PHP Markdown Extra and GeSHi (Generic Syntax Highlighter), plus a small […]

  4. […] steps are described in a post on Syntax highlighting with Markdown in WordPress. It uses a combination of PHP Markdown Extra and GeSHi (Generic Syntax Highlighter), plus a small […]

  5. Conal Elliott on 10 Apr 2008 at 5:16 pm

    I’m having a problem with markdown in post comments made by other users. The inline markup (e.g., foo and bar) comes through fine, but not “block markup” like code blocks (indented by four), sections or horizontal rules.

    Any ideas? Thanks.

  6. Dougal on 10 Apr 2008 at 7:39 pm

    Hi Conal. I just tested the code block facility on this blog and it seems to work. But I’m not really sure what the problem is. Could you paste something in that doesn’t work for you? Cheers!

    Language-specific highlights:

    main = do string <- getLine
              if null string
                 then putStrLn "You didn't write anything!"
                 else putStrLn ("You wrote " ++ string)

    Non-specific code block:

    $ cat .ssh/config
    Host *example.com
    User dougal
    ForwardX11 yes

    And the amazing horizontal line game:


    Edit: Well, maybe not the prettiest but it all seems to be working. :-\

  7. Conal Elliott on 10 Apr 2008 at 10:05 pm

    Thanks for the test, Dougal. The following code and horizontal rule don’t work in comments (other than from me) on my blog:

    foo :: Bar

    I do have the Filosofo Comments Preview plugin. Perhaps it could be interfering, though I have this same issue when the plugin is deactivated.

  8. Conal Elliott on 10 Apr 2008 at 10:06 pm

    hm. got the code block but not the lines. trying again w/o the code block.

    between lines

    that’s it.

  9. FractalizeR on 12 Jan 2009 at 11:12 am

    Can I suggest my syntax highlighter for WordPress? It works ok in GUI mode unlike others

    http://wordpress.org/extend/plugins/wp-synhighlight

    It utilizes shortcodes to mark code snippets.

  10. Dave Abrahams on 22 Aug 2009 at 5:20 pm

    I’ve made some modifications that allow this to work with PHP Markdown Extra’s “fenced code blocks” and allow the addition of line numbering as follows:

    {{lang=cpp,line=1}}

    If anyone’s interested, please let me know: dave-AT-boostpro-DOT-com

  11. Dave Abrahams on 30 Aug 2009 at 1:41 am

    It looks like wp-syntax using the

    <pre lang="xxx">
    ...
    </pre>

    syntax is still a bit more careful than markdown with this particular patch about not interfering with other WP formatting. If I put an unmatched “[” in a markdown code block (e.g. in a comment), any straight quotes in later non-code text are displayed straight rather than the WP default of making them curly.

  12. Dave Abrahams on 18 Sep 2009 at 1:28 am

    Another problem: try

    ~~~
    #include 
    ~~~
    

    Markdown picks up “#include” and treats it as a section header

  13. Cameron Bracken on 01 Dec 2009 at 4:38 am

    You can use the WP-Syntax Colorizer plugin with this hack if the second to last line is changed to

    return "\n\n<div class=\"wp_syntax\">\n".$this->hashBlock($codeblock)."\n</div>\n\n";
