Why is Markdown so popular?

Notable's Markdown Editor... The left is somehow better than the right.

I truly loathe Markdown. Truly. But given the widespread use of Markdown, it might seem strange that I have such an aversion to it. If you somehow really like it, or are so used to it by now, you might be tempted to think I'm the oddball. But I'm definitely not alone in my dislike of the format:

You get the idea. My point is, I'm not the only one.

First the good...

Markdown definitely has some benefits, otherwise people wouldn't use it. Off the top of my head:

  1. It separates the design of a document from its content, allowing writers to focus on what they’re writing.
  2. The plain-text nature means that .md docs are lightweight and portable across platforms.
  3. The small set of simple formatting rules is easily pulled into any publishing system and styled as needed.
  4. It appeals to techies - Markdown’s principal users - who want to use vim or emacs to write up documentation and notes (or say they do, anyways).
  5. It’s developer-friendly. Like code, it’s in plain-text so is usable within an IDE and easy to copy/paste chunks of code into documentation.
  6. It contains 95% of the functionality that most people need to create a document.
  7. It’s extensible: If you need additional content in your doc - like a video, or extra formatting options - these can easily be added with HTML tags or custom text markers that can be post-processed.
  8. Because it’s plain-text, it can easily be diffed and managed via git.
  9. Parsing can be as easy as literally 10 lines of RegEx.
  10. It's widely supported at this point, with lots of libraries, projects and editors.
  11. Meta-data can be thrown into a YAML frontmatter section at the top.
  12. It’s the only option available, really, as how else are you going to write documents?
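To make item 9 concrete, here is a rough sketch of what a "ten lines of RegEx" parser might look like. It is an illustration only: the function name `tiny_markdown` is made up, it handles just headings, bold, italics, links and paragraphs, and it does not follow any particular Markdown spec.

```python
import html
import re

def tiny_markdown(text: str) -> str:
    """Convert a very small Markdown subset to HTML using only regexes."""
    out = html.escape(text, quote=False)  # escape &, < and > before adding tags
    # Headings: one to six '#' characters followed by a single space.
    out = re.sub(r"^(#{1,6}) (.+)$",
                 lambda m: f"<h{len(m.group(1))}>{m.group(2)}</h{len(m.group(1))}>",
                 out, flags=re.M)
    out = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", out)        # bold
    out = re.sub(r"\*(.+?)\*", r"<em>\1</em>", out)                    # italics
    out = re.sub(r"\[(.+?)\]\((.+?)\)", r'<a href="\2">\1</a>', out)   # links
    # Blank-line-separated blocks that are not headings become paragraphs.
    blocks = [b.strip() for b in re.split(r"\n{2,}", out) if b.strip()]
    return "\n".join(b if b.startswith("<h") else f"<p>{b}</p>" for b in blocks)

print(tiny_markdown("# Title\n\nSome **bold** and *italic* [link](http://example.com)."))
```

It works for the happy path, and that is about it: lists, blockquotes, tables, code blocks and nested emphasis are all missing, and every one of them adds more patterns and more ordering problems.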

So just suck it up and use Markdown

Because, as pointed out by others before me, Markdown sucks. Let me list some reasons why: 

  1. It’s barely a spec - just a cobbled together bunch of general rules which every implementation breaks in one way or another.
  2. There are at least a dozen variations, GFM being the most common, but also MultiMarkdown, Pandoc, CommonMark, etc. And site-specific variations such as Wikimedia, Reddit, WordPress and more.
  3. Each variation produces different default HTML output.
  4. The HTML output is antiquated at best. Though the basic structure of headers and paragraphs is generally semantic, there are no modern semantic elements such as main, article, section, nav, header, footer, figure, picture, etc. Embedding videos, social media widgets, etc. isn't possible at all without dropping into raw HTML.
  5. Adding in any sort of extra meta data usually requires using YAML, the rules of which are a mystery to me. 
  6. In order to be truly useful, Markdown needs to be post-processed (so the text can be used for blog posts, research papers or online docs, for example) and needs to be extended via embedded HTML or custom tags. For example, Markdoc adds in tags for post processing, Hugo adds in templates for blog posts, etc. And the nightmare which is MDX is… just… wow. Not even once.
  7. Parts of Markdown are truly atrocious, even if you love the general simplicity of it:
    1. Tables: The reason you don’t see many tables in Markdown is because they’re ludicrous. You either use a WYSIWYG tool or you don’t use them.
    2. Images: image links are hard to tell apart from regular links.
    3. Blockquotes. Are people really typing > before each paragraph? No, they’re using an editor.
    4. Numbered lists. Again, you need to use an editor to stay sane.
    5. What the hell are task lists anyways? Why do they exist?
  8. The above means that anyone who writes Markdown regularly is already using a WYSIWYG editor or an IDE, so the whole ‘plain text’ thing doesn’t matter. Why not use a format that isn't completely hamstrung?
  9. The markup is brittle as hell and ends up causing weird edge cases, challenging even the best parsers. (###**this is __test__** … Is that bold with italics? Italic bold? In a heading? Wait…)
  10. The libraries for manipulating Markdown seem to be either RegEx based, or otherwise use an Abstract Syntax Tree. If you haven’t tried manipulating a document using an AST, let me assure you it's a non-trivial effort. The APIs are either so low-level as to make any change a 50 line script, or they end up using a badly made version of the standard DOM API. I spent several hours trying to convince one library to simply wrap an element with another before finally giving up, loading up JSDOM and doing it in 3 lines of code. 
  11. Markdown will never get beyond developers. It’s a way for people who are used to writing code to write docs without being bothered about how it looks. But the fact is that the output has to be readable and decent looking.
  12. Any developer who does care about how the output looks - say for someone trying to set up a new blog or documentation - either spends hours trying to figure out various libraries or gives up and uses an existing project with mixed results. 
  13. Goddamn it’s fugly. This is last, but honestly, this is my number one gripe. It’s 2022, we shouldn’t be using ASCII text to write documents.
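For comparison with item 10, this is roughly what "wrap an element with another" looks like with a standard DOM API. It is not the JSDOM code mentioned above: as a stand-in it uses Python's built-in `xml.dom.minidom`, which implements the same W3C DOM methods, and the document fragment and the `figure` wrapper are made up for the example.

```python
from xml.dom import minidom

# Hypothetical fragment; minidom needs well-formed XML, hence the self-closed img.
doc = minidom.parseString('<article><p>Intro</p><img src="cat.png"/></article>')
img = doc.getElementsByTagName("img")[0]

# The wrap itself: create the wrapper, swap it in, then move the target inside it.
figure = doc.createElement("figure")
img.parentNode.replaceChild(figure, img)
figure.appendChild(img)

result = doc.documentElement.toxml()
print(result)
```

The wrap is three calls (`createElement`, `replaceChild`, `appendChild`), and those same method names exist in every browser.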

OK, so use something else...

Like what? Plain text doesn't provide rich text. And the standard apps for creating documents are neither simple, open nor standard.

  1. Word processors are generally bloated, non-standard, self-contained applications with file formats that aren’t suited for post-processing or for creating web pages. They’re barely usable for creating documents at this point.
  2. Google Docs doesn't even have a document format. It's all online or exported in different formats - all except .docx are read-only. If you don't feel like storing your docs on Google's servers, you're out of luck. 
  3. Word processors and editors that support HTML exporting - from TextEdit or WordPad to Google Docs and Microsoft Word - create HTML that is non-standard, bloated, and ugly. Each one has its own way of formatting. This isn't that surprising: there's no standard HTML Document format, and given the web's flexibility, there are dozens of ways to format a page. Bold text could be <b> or <strong> or a <span style="font-weight:bold"> or <span class="bold">, and all are used regularly.
  4. The HTML “web design” apps that do exist - and there aren’t many left - are focused on site design. They aren't a tool to be used for writing, but for layout of already created content.
  5. HTML document editors - aka word processors - focused on writing and sharing docs don’t exist. (Until now.)
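To illustrate the bold-text example in item 3, here is a small sketch of the cleanup that any tool consuming exported HTML ends up doing. The function name `normalize_bold` and the list of patterns are assumptions made for this example. It only covers the exact spellings quoted above, and being regex-based it will break on nested spans.

```python
import re

# The bold spellings mentioned above, all mapped onto <strong>.
BOLD_VARIANTS = [
    r"<b>(.*?)</b>",
    r'<span style="font-weight:\s*bold;?">(.*?)</span>',
    r'<span class="bold">(.*?)</span>',
]

def normalize_bold(html_text: str) -> str:
    """Rewrite each known bold variant as <strong>...</strong>."""
    for pattern in BOLD_VARIANTS:
        html_text = re.sub(pattern, r"<strong>\1</strong>", html_text, flags=re.S)
    return html_text

sample = ('<b>a</b> <span style="font-weight:bold">b</span> '
          '<span class="bold">c</span> <strong>d</strong>')
print(normalize_bold(sample))
```

Multiply this by every other style (italics, headings, lists, alignment) and every exporter, and the lack of a shared format becomes a real cost.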

This means that if you want to write a document in HTML as easily as you create a Google or Word Doc, you’re out of luck. It's not that word processors always create horrible output - Google Docs does a decent job of exporting a web page as an .html or zip file - but it's always a one-way process. Good luck importing that back into the editor or Word or anything else. And again, each has its own custom way of implementing each style. Oh, and the exported web page inevitably doesn’t actually look exactly like the document you were just editing - margins and spacing are off, etc.

Why don’t we have an HTML-based rich text standard yet?

I have zero idea. In fact, we seem to be getting farther away from one as Markdown popularity surges. 

The other day I read a blog post by the folks at Mozilla - who are supposed to be the standard bearers of, um, web standards - and was truly shocked that they decided to convert all their documentation to Markdown last year. What?? In fact, they stopped using HTML in order to do it.

The blog post starts out by saying that the reason was that Markdown is "much more approachable and friendlier", then proceeds to list the various Mozilla-specific GitHub projects that one needs to download and install (not including Node, Yarn and Git), and the multiple steps needed to actually generate the docs. Again... what??

And of course, Mozilla too has its own variant of Markdown, which builds on GFM. And of course, since Markdown doesn't do much besides headers, paragraphs, bold, italics and bullets, they need custom macro tags to make it do what they need. These are of course both non-standard and completely invisible to the writer, so who knows what the end result will be:

Kill Me Now

I mean, kill me now. If Mozilla, of all organizations, has dumped HTML in favor of Markdown and considers the above better than, you know, <span class="foo">, then I must be tilting at windmills. I get it.

That doesn't mean I'm wrong.

As I wrote in my previous post, we need an HTML Document standard. It's not a matter of technology at this point, it's a matter of simply deciding on a manageable subset of HTML and CSS and then calling it a standard. Then it needs to be built into every browser on the planet. I'm using my own Hypertext HTML Document editor to write this, but I shouldn't have needed to go through the effort. Mozilla should really be ashamed of itself for not doing it first, quite honestly. There is a W3C Working Group dedicated to Web Editing, but it seems to be focused on just a few APIs, basically continuing the focus on "web apps". The ePub Working Group is only focused on read-only e-books, and doesn't seem to be concerned at all about doing anything to enable actually writing them using web standards.
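To show how little technology "a manageable subset of HTML" actually needs, here is a minimal sketch of a conformance check using nothing but Python's standard library. The allow-list is purely illustrative and is not a proposal for what the subset should contain; `ALLOWED_TAGS`, `SubsetChecker` and `check_document` are names invented for this example.

```python
from html.parser import HTMLParser

# Purely illustrative allow-list; which tags belong in a real subset is an open question.
ALLOWED_TAGS = {"article", "section", "h1", "h2", "h3", "p", "a", "strong", "em",
                "ul", "ol", "li", "blockquote", "figure", "figcaption", "img",
                "table", "tr", "th", "td", "pre", "code"}

class SubsetChecker(HTMLParser):
    """Collects every tag that falls outside the allowed subset."""

    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names; self-closing tags also arrive here.
        if tag not in ALLOWED_TAGS:
            self.violations.append(tag)

def check_document(html_text: str) -> list:
    """Return the disallowed tags found in html_text, in document order."""
    checker = SubsetChecker()
    checker.feed(html_text)
    checker.close()
    return checker.violations

print(check_document('<article><h1>T</h1><p>x <font color="red">y</font></p>'
                     '<marquee>z</marquee></article>'))
```

A real check would also need rules for attributes and CSS, but the hard part is agreeing on the list, not enforcing it.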

My next post will be about the three-decade-long quagmire of "encapsulated HTML" formats that are out there, as it's an interesting story of dead-ends and disagreements between browser makers. After that I'll be posting a proposal for what an HTML Document spec would look like.

-Russ