
markdown's strengths and weaknesses

January 30, 2022

I'm spending my morning writing about markdown, aren't I a cool and interesting person? Well, whatever, I'm riled up now, and if I put it off I'll never write it. I've been wanting to write something about this for a while, ever since ~bouncepaw wrote a post about markdown on the now-defunct tanelorn.city in uhh… mid-2020. Did not realize it was that long ago.

Inspired by a post on the fediverse as well as existing lingering thoughts I've had for a while.

Markdown is pretty great for quickly writing formatted documents. It's not my favorite, but I am quite fond of it.

However, it tends to be used for many purposes that it really shouldn't be used for. It strikes a chord with me in particular when someone recommends raw markdown be used as an exchange format.

markdown's target

For the uninitiated, markdown's entire reason for existence is… rendering to HTML.

It has literally no other purpose in existence.

The original markdown design had zero goals other than rendering to HTML.
Every markdown library I've seen, across various programming languages, does nothing other than render to HTML.
Almost every markdown utility does nothing but render to HTML.
Markdown itself mandates that embedding HTML elements in the source text be allowed (even CommonMark mandates it for backwards compatibility).

This certainly strikes me as a great way to replace HTML yes, certainly. Even when you're rendering to a non-HTML format you still require a full tag soup HTML parser! Potentially for numerous HTML versions since no single one is specified!

The point I'm trying to make is that using markdown is a poor way to try to supplant HTML. Sure, one could write a markdown parser that renders directly rather than to HTML, and one could just ignore HTML tags in the source, but markdown is literally designed around, and for, HTML.
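To make the "you still need an HTML parser" problem concrete, here's a minimal sketch in Python. It assumes a hypothetical plain-text renderer that wants to drop inline HTML from markdown source; the names TagStripper and strip_inline_html are made up for illustration, and the sample string is invented. Even this "just ignore the tags" approach ends up leaning on an HTML parser (the standard library's html.parser here).

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collects only the text between tags and discards the tags themselves."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text outside a tag.
        self.chunks.append(data)


def strip_inline_html(markdown_source):
    """Return the markdown source with embedded HTML tags removed."""
    stripper = TagStripper()
    stripper.feed(markdown_source)
    stripper.close()  # flush any text still buffered by the parser
    return "".join(stripper.chunks)


# Hypothetical markdown source with embedded HTML, as the format allows.
source = 'Press <kbd>Ctrl</kbd> to *continue*, see <a href="https://example.org">the docs</a>.'
print(strip_inline_html(source))
# prints: Press Ctrl to *continue*, see the docs.
```

Note that the markdown emphasis is left untouched; all this does is handle the HTML half of the problem, which is already a parser's worth of work.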

Even stuff that translates markdown to non-HTML, such as Sandra Snan's 7off (which is nice, although I don't really use it), simply translates the markdown to a different markup language (Gemtext), which is then rendered by whatever Gemini browser you're using. I do not know of a single text renderer that directly renders markdown without converting it to another format first.
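As a rough idea of what such a translation step involves, here's a small Python sketch that moves inline markdown links onto their own Gemtext link lines (which start with =>). This is not how 7off works, I haven't modeled it on its source; the function name paragraph_to_gemtext, the regex, and the example URL are all assumptions for illustration, and it only handles the simplest [label](url) form.

```python
import re

# Matches the simplest inline link form: [label](url), with no spaces in the URL.
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def paragraph_to_gemtext(paragraph):
    """Replace inline links with their labels and list the links afterwards."""
    links = LINK.findall(paragraph)      # list of (label, url) pairs
    text = LINK.sub(r"\1", paragraph)    # keep only the label inline
    lines = [text]
    for label, url in links:
        # Gemtext links live on their own line: "=> URL optional label"
        lines.append(f"=> {url} {label}")
    return "\n".join(lines)


print(paragraph_to_gemtext("Read the [spec](gemini://example.org/spec.gmi) first."))
# prints:
# Read the spec first.
# => gemini://example.org/spec.gmi spec
```

The output is still markup that something else has to render, which is the point: the markdown got converted, not displayed.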

rendering ambiguities

Anyone who has even looked at the code for a markdown parser (or even just read the CommonMark spec) should know how ambiguous and painful to parse many markdown constructs can be.

For instance, look at this snippet of simple but rather ambiguous markdown:

**md **is great*

How should it be rendered?

With "md" in bold and a trailing asterisk? Or with "md" and "is great" emphasized, with a leading asterisk? If there's an asterisk left over on the front, should the snippet then be treated as an unordered list item?

Let's look at how a few different implementations render that snippet (note that all of the output is in HTML).

markdown.pl, the original markdown reference implementation:

md is great*

reddit:

*md *is great*

mastodon:

*md **is great

github:

**md *is great

Wow, literally every single one of them is different. Markdown clearly is something that I know I can write and it will be rendered exactly how it was intended on everybody's system.

Considering that the W3C lists over fifty markdown implementations, for many documents there could potentially be dozens of different interpretations of how it should be rendered (many I've tested render it the way shithub does, but many also don't).
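It doesn't take much to get this kind of divergence. Here's a toy sketch in Python with two made-up regex-based renderers that differ only in which rule runs first. The names strong_first and em_first are hypothetical, and neither is meant to reflect what any of the implementations above actually do internally; it just shows how one small design decision changes the output for the same input.

```python
import re

STRONG = re.compile(r"\*\*(.+?)\*\*")  # **text** -> <strong>
EM = re.compile(r"\*(.+?)\*")          # *text*   -> <em>


def strong_first(text):
    """Toy renderer A: resolve double asterisks before single ones."""
    text = STRONG.sub(r"<strong>\1</strong>", text)
    return EM.sub(r"<em>\1</em>", text)


def em_first(text):
    """Toy renderer B: resolve single asterisks before double ones."""
    text = EM.sub(r"<em>\1</em>", text)
    return STRONG.sub(r"<strong>\1</strong>", text)


snippet = "**md **is great*"
print(strong_first(snippet))  # <strong>md </strong>is great*
print(em_first(snippet))      # <em>*md </em><em>is great</em>
```

Same input, same two rules, two different documents. Real parsers make many more decisions like this one (whitespace around delimiters, nesting, leftover markers), so every one of them is another place for implementations to disagree.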

Even though you can't control the presentation of Gemtext, you know that a link is going to be a link, and a preformatted block is going to be preformatted, and a header is going to be a header.

what markdown is good at

Markdown is excellent for writing documents that are readable in plaintext form, and that will only be rendered by a known parser implementation with known characteristics.

For example: writing a README.md for a repository that'll only be rendered by the git forge's markdown renderer, and otherwise will only be read as plaintext after people clone it.

Alternately, writing a wiki article or a post on a forum that'll be displayed pre-rendered by the site's renderer.

In this way you could call markdown a source format (making up terms here). It is great at being a source text that is input into a known renderer to generate the final output that will be distributed to others. If anyone's ever had to suffer to write BBCode you should be able to attest that it's a lot faster and easier to write markdown for a post somewhere (that'll be rendered by the forum software).

Markdown is also wonderful—better than all competition I've seen—at being naturally readable in plaintext form, and was in fact designed to roughly emulate how people tended to write complex documents in plaintext, e.g. on Usenet or in emails.

However, markdown is not good at being an exchange format (real term there). Raw markdown should absolutely not be transmitted like HTML to be displayed in a wide range of clients with a wide range of renderer implementations. It'll be incredibly unpredictable and a hectic mess, most likely with tons of "this page best viewed in X" notices, just like the good ol' Internet Explorer browser wars days! That's not the part of the old web people yearn for reviving, I'd wager.

And don't tell me that CommonMark is the savior of everything, because so far it very clearly hasn't helped, considering that all the implementations seem to take CommonMark as a “guideline” and extend or modify it as they see fit.

conclusion

While this post may seem like it's hating on markdown in spots, I'm not saying markdown sucks.

Markdown is amazing for what John Gruber originally intended it for: quickly and naturally writing HTML documents (particularly documents embedded within an existing HTML page skeleton). It also excels at being a format that can be rendered prettily while also still looking good in a terminal.

However, it is terrible at being something that is portable and can be transferred to many users with varied software installations. Please do not ever use it that way.

There are much better formats for archival and exchange that will be reliably parsed. Maybe XHTML? Roff? (me hoping too much there, lol). It'd probably be best to design a bespoke language if you really are designing something to replace HTML. Oh wait, that's what Gemini did!