Markdown is one of the most popular markup languages on the Web. Unfortunately, with no standard specification, every implementation works differently, producing varying results across different platforms. The CommonMark specification fixes this by providing an unambiguous syntax specification and a comprehensive suite of tests. Attendees will learn about this standard and how to integrate the league/commonmark parser into their applications. We will also cover how to add new syntax and other features to the parser to fit your custom needs.
2. COLIN O’DELL
Creator & Maintainer of league/commonmark
Lead Web Developer at Unleashed Technologies
Author of PHP 7 Migration Guide e-book
@colinodell
4. ORIGINS OF MARKDOWN
Created in March 2004 by John Gruber
Informal plain-text formatting language
Converts readable text to valid (X)HTML
Primary goal - readability
5. HISTORY OF MARKDOWN
Hello Nomad PHP!
----------------
Markdown is **awesome**!
1. Foo
2. Bar
3. Baz
Wikipedia entry:
<https://en.wikipedia.org/wiki/Markdown>
6. WHY IS IT SUCCESSFUL?
1. Syntax is visually similar to the resulting markup
2. Non-strict, forgiving parsing
3. Easily adaptable for different uses
9. COMMONMARK IS…
A strongly defined, highly compatible specification of Markdown.
Written by John MacFarlane
(in collaboration with people from GitHub, Stack Overflow, Reddit, and others)
Spec includes:
Strict rules (precedence, parsing order, handling edge cases)
Specific definitions (ex: “whitespace”, “punctuation”)
616 examples
14. WHY IS IT NEEDED?
*I love Markdown*
<p><em>I love Markdown</em></p>
24. FEATURES
100% compliance with the CommonMark spec
Easy to implement
Easy to customize
Well-tested
Decent performance
(Relatively) stable
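36.
class TwitterUsernameAutolinkParser extends AbstractInlineParser {
    // Tell the main parser to stop whenever a less-than sign is encountered
    public function getCharacters() {
        return ['<'];
    }

    // Control is transferred here when one of those characters is encountered
    public function parse(InlineParserContext $inlineContext) {
        $cursor = $inlineContext->getCursor();
    }
}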
50. EXAMPLE 3: CUSTOM RENDERER
class ImageHorizontalRuleRenderer implements BlockRendererInterface {
    public function render(...) {
        return new HtmlElement('img', ['src' => 'hr.png']);
    }
}
52. BUNDLING INTO AN EXTENSION
class MyCustomExtension extends Extension {
    public function getInlineParsers() {
        return [new TwitterUsernameAutolinkParser()];
    }
    public function getDocumentProcessors() {
        return [new ShortenLinkProcessor()];
    }
    public function getBlockRenderers() {
        return [new ImageHorizontalRuleRenderer()];
    }
}
53. BUNDLING INTO AN EXTENSION
$environment = Environment::createCommonMarkEnvironment();
$environment->addExtension(new MyCustomExtension());
$converter = new CommonMarkConverter($environment);
$html = $converter->convertToHtml("...");
56. WELL-TESTED
94% code coverage
Functional tests
  All 616 spec examples
  Library of regression tests
Unit tests
  Cursor
  Environment
  Utility classes
59. PERFORMANCE
[Bar chart: Time (ms), axis 0 to 100; bars for Parsedown and PHP Markdown Extra; series PHP 5.6 and PHP 7.0]
league/commonmark is ~35-40ms slower
Tips:
• Use PHP 7 (50-80% boost)
• Choose library based on your needs
• Cache rendered HTML (100% boost)
• Optimize custom functionality
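The "cache rendered HTML" tip can be sketched roughly as follows. This is a minimal illustration, not part of league/commonmark: the `renderCached()` helper, the file-based cache, and the `$render` callable are all assumptions made for the example. In practice `$render` could be a closure wrapping the converter's `convertToHtml()` method.

```php
<?php
// Hypothetical helper: caches rendered HTML on disk, keyed by a hash of the
// Markdown source. $render is any callable that turns Markdown into HTML.
function renderCached(string $markdown, callable $render, string $cacheDir): string
{
    $file = $cacheDir . '/' . sha1($markdown) . '.html';

    if (is_file($file)) {
        // Cache hit: the Markdown is not parsed again.
        return file_get_contents($file);
    }

    // Cache miss: render once and store the result for next time.
    $html = $render($markdown);
    file_put_contents($file, $html);

    return $html;
}
```

Because the key is a hash of the input, edited Markdown produces a new cache entry automatically; cleaning up stale entries is left out of this sketch.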
62. STABILITY
Current version: 0.15.0
Conforms to CommonMark spec 0.26
1.0.0 will be released once CommonMark spec is 1.0
No major stability issues
Backwards Compatibility Promise:
No BC breaks to CommonMarkConverter class in 0.x
Other BC breaks will be documented
2. Informal, unlike XML and XHTML, which reject data simply because it fails to adhere to strict, unforgiving standards
3:
LaTeX – academia, for scientific and mathematical papers
DocBook – technical documentation
We’re all familiar with the usual Markdown syntax
This specification dictates exactly how compliant parsers should handle Markdown input
It includes
That’s cool, but do we really NEED a spec? Is Markdown really that complicated?
Well, let’s look at an example
Straightforward example of emphasizing text
What happens if we add two asterisks?
Babelmark2
John MacFarlane, CommonMark spec maintainer
Like 3V4L, but for Markdown
Whole string emphasized with nested inner emphasis, as you’d expect
Another approach some parsers take is two separate emphasis elements
Kinda makes sense
What’s really strange and unexpected
3 other ways of parsing this (22%)
Actually 15 different ways that parsers interpret this
OUTRO:
The CommonMark standard is designed to eliminate this ambiguity
so that your Markdown is handled in a logical, predictable fashion
The league/commonmark library
It basically takes Markdown in and spits HTML out.
And it does so in a way that’s compliant with the CommonMark spec.
Several integrations for this library built by the community
We have a URL we want to link to using the standard autolinking syntax
Wrapped with a less-than and greater-than sign
“A” tag with href and text label of the URL
Show how our engine does this behind-the-scenes
Start off with Markdown input
Run it through the various sub-parsers which results in an Abstract Syntax Tree
Also known as an AST
Tree structure of PHP objects, each representing a certain type of element
For easier visualization I’m showing what these PHP objects MIGHT look like if we showed their data as XML
Once we’ve got the final AST, we pass that along to the renderers…
…which convert the AST into HTML
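To make the parse-then-render flow concrete, here is a toy sketch in plain PHP. It does not use the library's own node or renderer classes: the `node()` and `render()` functions and the array-based node shape are made up for illustration. It only shows the idea of a tree of typed elements being walked and turned into HTML.

```php
<?php
// Toy AST node: a type, child nodes, and some data.
// Illustrative only; the library represents nodes as PHP objects.
function node(string $type, array $children = [], array $data = []): array
{
    return ['type' => $type, 'children' => $children, 'data' => $data];
}

// Toy renderer: walks the tree depth-first and emits HTML per node type.
function render(array $node): string
{
    $inner = '';
    foreach ($node['children'] as $child) {
        $inner .= render($child);
    }

    switch ($node['type']) {
        case 'document':
            return $inner;
        case 'paragraph':
            return '<p>' . $inner . '</p>';
        case 'link':
            return '<a href="' . htmlspecialchars($node['data']['url']) . '">' . $inner . '</a>';
        case 'text':
            return htmlspecialchars($node['data']['text']);
        default:
            throw new InvalidArgumentException('Unknown node type: ' . $node['type']);
    }
}

// The kind of tree a parser might build for the autolink "<http://example.com>".
$ast = node('document', [
    node('paragraph', [
        node('link', [
            node('text', [], ['text' => 'http://example.com']),
        ], ['url' => 'http://example.com']),
    ]),
]);

echo render($ast), "\n";
// prints: <p><a href="http://example.com">http://example.com</a></p>
```

Keeping parsing and rendering as separate steps over a shared tree is what lets custom parsers, processors, and renderers each hook into one stage without touching the others.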
Now what’s really cool is that…
You can hook into any of these three aspects, adding “your own custom…”
THREE EXAMPLES
Now let’s go back to our autolink example
What if we wanted to add similar autolinking functionality, but for Twitter handles?
For example, say we want to enclose the Twitter handle in a similar fashion… which results in a link to that profile page
Let me show you how simple it is to add this feature
Simply create a sub-parser
Tell the main parser to stop whenever a less-than sign is encountered
When encountered, control is transferred to parse()
Cursor
Cursor is a simple yet powerful wrapper around the current line
Stores current line’s text and current position being parsed
Because we’ve told the engine we’re interested in less-than characters…
When control is passed into the parse() method, we can use these specially designed methods to parse the string at that location
Cursor provides several highly optimized, UTF-8-aware methods for Markdown parsing
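The idea of a cursor (a line of text plus a current position) can be sketched with a simplified stand-in. `ToyCursor` below is hypothetical and is not the library's Cursor class, which is more capable and more heavily optimized. The method names are chosen for the sketch, and the `<@handle>` syntax is an assumed example input.

```php
<?php
// Hypothetical, simplified cursor: stores the current line and a position.
class ToyCursor
{
    private $line;
    private $position = 0;

    public function __construct(string $line)
    {
        $this->line = $line;
    }

    // Character at the current position, or null at end of line (UTF-8 aware).
    public function getCharacter()
    {
        if ($this->position >= mb_strlen($this->line, 'UTF-8')) {
            return null;
        }
        return mb_substr($this->line, $this->position, 1, 'UTF-8');
    }

    // Move forward, never past the end of the line.
    public function advance(int $characters = 1)
    {
        $this->position = min($this->position + $characters, mb_strlen($this->line, 'UTF-8'));
    }

    // Try a "^"-anchored regex at the current position.
    // On success, consume the matched text and return it; otherwise return null.
    public function match(string $regex)
    {
        $rest = mb_substr($this->line, $this->position, null, 'UTF-8');
        if (!preg_match($regex, $rest, $m)) {
            return null;
        }
        $this->advance(mb_strlen($m[0], 'UTF-8'));
        return $m[0];
    }
}

// Skip ahead to the first "<", then try to match a bracketed handle there.
$cursor = new ToyCursor('Hi <@colinodell>!');
while ($cursor->getCharacter() !== null && $cursor->getCharacter() !== '<') {
    $cursor->advance();
}
$matched = $cursor->match('/^<@\w+>/'); // "<@colinodell>"
```

After a successful match the position sits just past the matched text, so parsing can continue from there.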
Match this regular expression
Extract just the username from the matched text
That’s it! Now, the APIs and methods might be unfamiliar, but hopefully you can see how features can be added seamlessly with only a few lines of code
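The match-and-extract step can be tried out in plain PHP without the library. The sketch below rests on assumptions: `twitterAutolink()` is a hypothetical helper, the `<@handle>` syntax and the 1-15 character limit are guesses at what the pattern might look like, and the output is built as a string rather than as AST nodes.

```php
<?php
// Hypothetical helper: if $text starts with a bracketed handle such as
// "<@colinodell>", return link HTML for that profile; otherwise return null.
function twitterAutolink(string $text)
{
    // Assumed pattern: "<@", then 1-15 word characters, then ">".
    if (!preg_match('/^<@(\w{1,15})>/', $text, $matches)) {
        return null;
    }

    // Capture group 1 holds just the username, without "<@" and ">".
    $username = $matches[1];
    $url = 'https://twitter.com/' . $username;

    return '<a href="' . $url . '">@' . $username . '</a>';
}

echo twitterAutolink('<@colinodell> says hi'), "\n";
// prints: <a href="https://twitter.com/colinodell">@colinodell</a>
```

Returning null when nothing matches mirrors the idea that a sub-parser should decline input it does not recognize, so other parsers get a chance at the same "<" character.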
What if we wanted to automatically shorten URLs with a service like bitly?
Guaranteed to be compatible with all other CommonMark parsers in other languages
CONTEXT: Blinking your eye takes about 100ms
You can find the library on Packagist under league/commonmark
Installation instructions & documentation can be found
Hopefully you find this library useful and can try it out in your next project.
Thank you so much!