Welcome to Arkanis Development

Markdown and HTML5 block elements


Markdown is a plain-text format that is easy to write and to convert to HTML. It has several nice features, and one of them is inline HTML. In one of my current projects (you will see it soon enough…) I'm combining PHP Markdown with some of the new HTML5 elements:

Normal text…

<figure>
    <figcaption>The appearance of FooBar</figcaption>
    <img src="foobar.jpg" />
</figure>

Some more text…

Markdown wraps p elements around everything that looks like a paragraph. This would also apply to inline HTML, but Markdown has a built-in list of HTML block elements, and anything that starts with one of those is not wrapped in a p element. Because HTML5 is pretty new, Markdown doesn't know the HTML5 block elements yet. It therefore wraps the figure element in a p element, which makes the document invalid (and screws up the styles).

Thankfully the PHP Markdown source code is very well documented, which made it a piece of cake to add the figure element to the list of block-level elements. On line 1770 of PHP Markdown 1.0.1n the variable $block_tags_re contains that list:

# Tags that are always treated as block tags:
var $block_tags_re = 'p|div|h[1-6]|…|hr|legend';

I abbreviated the list a bit for clarity. Just add |figure to the end and you're done. After that modification PHP Markdown stopped wrapping p elements around the figure elements.
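To illustrate why appending one word to that list is enough, here is a minimal sketch of the idea. It is not PHP Markdown's actual parser code: the helper functions `starts_with_block_tag()` and `wrap_chunk()` are made up for this example, and the tag list is a shortened stand-in. The only assumption is that the list is used as a regular expression alternation to recognise block-level tags.

```php
<?php
// Illustrative sketch only: NOT PHP Markdown's real implementation.
// Shortened stand-in for the block tag list (the real list is longer).
$block_tags_re = 'p|div|h[1-6]|hr|legend';

// Hypothetical helper: true if the chunk opens with a known block-level tag.
function starts_with_block_tag($chunk, $block_tags_re) {
    // The list is dropped into the pattern as a plain alternation.
    return preg_match('{^<(?:' . $block_tags_re . ')\b}i', ltrim($chunk)) === 1;
}

// Hypothetical helper: wrap a chunk in <p> unless it starts with a block tag.
function wrap_chunk($chunk, $block_tags_re) {
    if (starts_with_block_tag($chunk, $block_tags_re)) {
        return $chunk;
    }
    return '<p>' . $chunk . '</p>';
}

$figure = '<figure><img src="foobar.jpg" /></figure>';

// With the original list, figure is unknown and gets wrapped in a p element.
$before = wrap_chunk($figure, $block_tags_re);

// With |figure appended, the chunk is recognised and left untouched.
$after = wrap_chunk($figure, $block_tags_re . '|figure');
```

Because the list is just a string of alternatives, other HTML5 block elements could presumably be added the same way, separated by further | characters.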

Since HTML5 is nowhere near finished, it is probably too early to add the HTML5 block-level elements to the source of the PHP Markdown project itself. However, for the advantages of HTML5 I'm willing to live with such a small modification. :)

4 comments for this post


#1 by David Turner

Hi,

I've just started making use of Markdown myself and was running into issues with regards to some HTML5 elements being wrapped in p tags. Thought I'd comment on your post as it's been of great use in identifying what was going on and how to go about fixing it.

Thanks!

#2 by Stephan Soller

Glad this post was helpful to someone. :)

ps.: I really like the usage of fonts on your website.

#3 by Ryan

"Since HTML5 is no way near finished"

What an stupid thing to say. As far as the WHATWG is concerned, HTML is now versionless. The W3C is now just their little bitch so it makes no difference what they call it.

There is no concept of "finished" in an evolving specification and anyone who says you can use it beccause it's "not finished yet" is a monumental retard.

#4 by Stephan

"Finished" might have been the wrong word. Maybe I should have used "unstable" instead.

Before I wrote the post, the future of the "details" element was in question. Not long after, removal of the "s" and "small" elements was under discussion. I also vaguely remember a "dialog" element that was removed from the spec. And some elements are still under discussion (e.g. "hgroup").

The W3C's HTML working group tracker is a nice way to stay up to date: http://www.w3.org/html/wg/tracker/products/1/all

While HTML5 can surely be used in everyday tasks, an element list would need some updates from time to time. Like you said, it's an evolving standard.
