Markdown is one of the most popular markup languages on the web. Unfortunately, with no standard specification, every implementation works differently, producing varying results across different platforms. The CommonMark specification fixes this by providing an unambiguous syntax specification and a comprehensive suite of tests. In this session you'll learn about this standard and how to integrate the league/commonmark parser into your PHP applications. We'll also cover how to customize the library to implement new features like custom Markdown syntax or advanced renderers.
2. COLIN O’DELL
Creator & Maintainer of league/commonmark
Lead Web Developer at Unleashed Technologies
Author of PHP 7 Migration Guide e-book
@colinodell
4. ORIGINS OF MARKDOWN
Created in March 2004 by John Gruber
Informal plain-text formatting language
Converts readable text to valid (X)HTML
Primary goal - readability
5. HISTORY OF MARKDOWN
Hello ZendCon!
--------------
Markdown is **awesome**!
1. Foo
2. Bar
3. Baz
Wikipedia entry:
<https://en.wikipedia.org/wiki/Markdown>
6. WHY IS IT SUCCESSFUL?
1. Syntax is visually similar to the resulting markup
2. Non-strict, forgiving parsing
3. Easily adaptable for different uses
9. WHY IS IT NEEDED?
*I love Markdown*
<p><em>I love Markdown</em></p>
10. WHY IS IT NEEDED?
*I *love* Markdown*
11. WHY IS IT NEEDED? Source: http://johnmacfarlane.net/babelmark2/
12. WHY IS IT NEEDED?
30%
*I *love* Markdown*
<p><em>I <em>love</em> Markdown</em></p>
*I *love* Markdown*
<p><em>I </em>love<em> Markdown</em></p>
*I *love* Markdown*
<p><em>I *love</em> Markdown*</p>
15%
33%
Source: http://johnmacfarlane.net/babelmark2/
13. WHY IS IT NEEDED?
1. > Hello
World!
------
Source: http://johnmacfarlane.net/babelmark2/
18. COMMONMARK IS…
A strongly defined, highly compatible specification of Markdown.
Written by people from GitHub, Stack Overflow, Reddit, and others.
Spec includes:
Strict rules (precedence, parsing order, handling edge cases)
Specific definitions (ex: “whitespace”, “punctuation”)
624 examples
23. FEATURES
100% compliance with the CommonMark spec
Easy to implement
Easy to customize
Well-tested
Decent performance
(Relatively) stable
25. ADDING LEAGUE/COMMONMARK
$ composer require league/commonmark:^0.15
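<?php
use League\CommonMark\CommonMarkConverter;
require __DIR__ . '/vendor/autoload.php'; // Composer's autoloader
$converter = new CommonMarkConverter();
echo $converter->convertToHtml('Hello **ZendCon!**');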
34. EXAMPLE 1: CUSTOM PARSER
<http://www.zendcon.com>
<a href="http://www.zendcon.com">
http://www.zendcon.com
</a>
<@colinodell>
<a href="https://twitter.com/colinodell">
@colinodell
</a>
35.
class TwitterUsernameAutolinkParser extends AbstractInlineParser {
    public function getCharacters() {
        return ['<'];
    }
    public function parse(InlineParserContext $inlineContext) {
        // TODO
    }
}
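37.
class TwitterUsernameAutolinkParser extends AbstractInlineParser {
    // Ask the engine to hand control to parse() whenever it encounters '<'
    public function getCharacters() {
        return ['<'];
    }
    public function parse(InlineParserContext $inlineContext) {
        // The cursor holds the current line's text and the position being parsed
        $cursor = $inlineContext->getCursor();
    }
}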
47. EXAMPLE 3: CUSTOM RENDERER
class ImageHorizontalRuleRenderer implements BlockRendererInterface {
    public function render(...) {
        return new HtmlElement('img', ['src' => 'hr.png']);
    }
}
49. BUNDLING INTO AN EXTENSION
class MyCustomExtension extends Extension {
    public function getInlineParsers() {
        return [new TwitterUsernameAutolinkParser()];
    }
    public function getDocumentProcessors() {
        return [new ShortenLinkProcessor()];
    }
    public function getBlockRenderers() {
        // Keyed by the block class that the renderer handles
        return ['League\CommonMark\Block\Element\ThematicBreak' => new ImageHorizontalRuleRenderer()];
    }
}
50. BUNDLING INTO AN EXTENSION
$environment = Environment::createCommonMarkEnvironment();
$environment->addExtension(new MyCustomExtension());
// In 0.x the constructor takes the config array first, then the environment
$converter = new CommonMarkConverter([], $environment);
$html = $converter->convertToHtml("...");
53. WELL-TESTED
94% code coverage
Functional tests
    All 624 spec examples
    Library of regression tests
Unit tests
    Cursor
    Environment
    Utility classes
56. PERFORMANCE
[Bar chart: time in ms (axis 0 to 80) for Parsedown, cebe/markdown gfm, PHP Markdown Extra, and league/commonmark, on PHP 5.6 and PHP 7.1]
league/commonmark is ~22-24ms slower
Tips:
• Choose library based on your needs
• Cache rendered HTML (100% boost)
• Use PHP 7 (50% boost)
• Optimize custom functionality
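The "cache rendered HTML" tip could be sketched roughly as follows. This is a minimal illustration, not part of league/commonmark: the function name cachedMarkdownToHtml, the file-based cache, and the $cacheDir parameter are all assumptions made for this example, and $convert stands in for any callable that turns Markdown into HTML.

```php
<?php
// Hypothetical caching wrapper (not a library API).
// $convert is any callable that accepts Markdown and returns HTML.
function cachedMarkdownToHtml(string $markdown, callable $convert, string $cacheDir): string
{
    // Key the cache on the content itself, so edited Markdown never serves stale HTML.
    $cacheFile = $cacheDir . '/' . sha1($markdown) . '.html';

    // Cache hit: skip parsing and rendering entirely.
    if (is_file($cacheFile)) {
        return file_get_contents($cacheFile);
    }

    // Cache miss: convert once and store the result for next time.
    $html = $convert($markdown);
    file_put_contents($cacheFile, $html, LOCK_EX);

    return $html;
}
```

With the converter from the earlier slide, a call might look like cachedMarkdownToHtml($text, [$converter, 'convertToHtml'], $cacheDir). Because the key only covers the Markdown text, the cache directory would need clearing whenever extensions or configuration change.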
59. STABILITY
Current version: 0.15.6
Conforms to CommonMark spec 0.28
1.0.0 will be released once CommonMark spec is 1.0
No major stability issues
Backwards Compatibility Promise:
No BC breaks to CommonMarkConverter class in 0.x
Other BC breaks are documented (see UPGRADING.md)
The league/commonmark library
It basically takes Markdown in and spits HTML out.
And it does so in a way that’s compliant with the CommonMark spec.
OUT: Now I’ve mentioned the word CommonMark a few times, but what is that?
In collaboration with Aaron Swartz
2. Informal; not like XML and XHTML which reject data simply because it fails to adhere to strict, unforgiving standards
Straightforward example of emphasizing text
What happens if we add two asterisks?
Babelmark2
John MacFarlane, CommonMark spec maintainer
Like 3V4L, but for Markdown
Whole string emphasized with nested inner emphasis, as you’d expect
Another approach some parsers take is two separate emphasis elements
Kinda makes sense
What’s really strange and unexpected
3 other ways of parsing this (22%)
Actually 15 different ways that parsers interpret this
OUTRO:
The CommonMark standard is designed to eliminate this ambiguity
so that your Markdown is handled in a logical, predictable fashion
We’re all familiar with the usual markdown syntax
This specification dictates exactly how compliant parsers should handle Markdown input
It includes
Look at the spec
What’s really cool:
- Examples can easily be parsed out and tested against by our Markdown library!
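To give a feel for how the examples might be parsed out, here is a rough sketch. It assumes the spec's plain-text source, where each example sits between lines of 32 backticks (the opening one followed by " example"), a line containing a single "." separates the Markdown from the expected HTML, and "→" stands for a tab. The function name extractSpecExamples is hypothetical, and the library's own test code may work differently.

```php
<?php
// Hypothetical helper: pulls (markdown, html) pairs out of the CommonMark spec text.
function extractSpecExamples(string $specText): array
{
    // Example blocks are fenced by a run of 32 backticks.
    $fence = str_repeat('`', 32);

    // m: ^ and $ match at line boundaries; s: "." also matches newlines.
    $pattern = '/^' . $fence . ' example\n(.*?)^\.\n(.*?)^' . $fence . '$/ms';
    preg_match_all($pattern, $specText, $matches, PREG_SET_ORDER);

    $examples = [];
    foreach ($matches as $match) {
        $examples[] = [
            // The spec writes tabs as "→" so they stay visible; convert them back.
            'markdown' => str_replace('→', "\t", $match[1]),
            'html'     => str_replace('→', "\t", $match[2]),
        ];
    }

    return $examples;
}
```

Each pair can then be fed to a test that converts the 'markdown' value and compares the result with the 'html' value.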
Several integrations for this library built by the community
We have a URL we want to link to using the standard autolinking syntax
Wrapped with a less-than and greater-than sign
“A” tag with href and text label of the URL
Show how our engine does this behind-the-scenes
Start off with Markdown input
Run it through the various sub-parsers which results in an Abstract Syntax Tree
Also known as an AST
Tree structure of PHP objects, each representing a certain type of element
For easier visualization I’m showing what these PHP objects MIGHT look like if we showed their data as XML
Once we’ve got the final AST, we pass that along to the renderers…
…which convert the AST into HTML
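The flow just described (Markdown in, AST in the middle, HTML out) can be imitated with a toy sketch. ToyNode, toyParse, and toyRender are hypothetical names invented for this illustration; they are not league/commonmark classes, and this toy only understands a single level of *emphasis* inside one paragraph.

```php
<?php
// Toy model only: far simpler than the library's real AST classes.
class ToyNode
{
    public $type;
    public $text;
    public $children = [];

    public function __construct(string $type, string $text = '')
    {
        $this->type = $type;
        $this->text = $text;
    }
}

// Step 1: "parse" one line of Markdown into a tiny AST.
function toyParse(string $markdown): ToyNode
{
    $paragraph = new ToyNode('paragraph');

    // Split on *...* spans, keeping the delimited pieces in the result.
    $parts = preg_split('/(\*[^*]+\*)/', $markdown, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    foreach ($parts as $part) {
        if (strlen($part) > 2 && $part[0] === '*' && substr($part, -1) === '*') {
            // An emphasis node wraps a text node holding the inner text.
            $emphasis = new ToyNode('emphasis');
            $emphasis->children[] = new ToyNode('text', substr($part, 1, -1));
            $paragraph->children[] = $emphasis;
        } else {
            $paragraph->children[] = new ToyNode('text', $part);
        }
    }

    return $paragraph;
}

// Step 2: render the AST to HTML by walking it recursively.
function toyRender(ToyNode $node): string
{
    if ($node->type === 'text') {
        return htmlspecialchars($node->text);
    }

    $inner = implode('', array_map('toyRender', $node->children));
    $tag = $node->type === 'emphasis' ? 'em' : 'p';

    return "<{$tag}>{$inner}</{$tag}>";
}

echo toyRender(toyParse('I *love* Markdown')); // <p>I <em>love</em> Markdown</p>
```

Keeping the tree separate from both steps is what makes the hooks possible: a custom parser only adds nodes, a processor only rearranges them, and a custom renderer only changes how one node type becomes HTML.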
Now what’s really cool is that…
You can hook into any of these three aspects, adding “your own custom…”
THREE EXAMPLES
Now let’s go back to our autolink example
What if we wanted to add similar autolinking functionality, but for Twitter handles?
For example, say we want to enclose the Twitter handle in a similar fashion… which results in a link to that profile page
Let me show you how simple it is to add this feature
Simply create a sub-parser
Tell the main parser to stop whenever a less-than sign is encountered
When encountered, control is transferred to parse()
Cursor is a simple yet powerful wrapper around the current line
Stores current line’s text and current position being parsed
Because we’ve told the engine we’re interested in less-than characters…
When control is passed into the parse() method, we can use these specially designed methods to parse the string at that location
Cursor provides several highly optimized, UTF-8-aware methods for Markdown parsing
DESCRIBE METHODS
HOW TO IMPLEMENT OUR TWITTER AUTOLINK PARSER?
WE COULD TRY THIS
BETTER
That’s it! Now the APIs and methods might be unfamiliar, but hopefully you can see how features can be added seamlessly with only a few lines of code
Match this regular expression
Extract just the username from the matched text
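The two steps above (match a regular expression, then extract the username) can be tried in isolation with plain PHP. This is only a standalone sketch: the function name matchTwitterAutolink and the returned array shape are made up for illustration, the pattern assumes handles are 1 to 15 letters, digits, or underscores, and the real sub-parser would run its match through the Cursor rather than preg_match().

```php
<?php
// Hypothetical standalone helper; the real parser would work through the Cursor instead.
function matchTwitterAutolink(string $text)
{
    // Assumed handle rules: 1-15 letters, digits, or underscores, wrapped as <@handle>.
    if (preg_match('/^<@([A-Za-z0-9_]{1,15})>/', $text, $matches) !== 1) {
        return null; // Not a Twitter autolink; let other parsers try.
    }

    // Capture group 1 is just the username, without "<@" and ">".
    $username = $matches[1];

    return [
        'url'    => 'https://twitter.com/' . $username,
        'label'  => '@' . $username,
        // Number of bytes consumed, so a caller knows how far to advance.
        'length' => strlen($matches[0]),
    ];
}
```

Returning null for non-matching input mirrors the idea that a sub-parser should decline anything it does not recognise, so that a standard autolink such as <http://www.zendcon.com> is left for the regular autolink handling.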
What if we wanted to automatically shorten URLs with a service like bitly?
Guaranteed to be compatible with all other CommonMark parsers in other languages
Blinking your eye is 100ms
You can find the library on Packagist under league/commonmark
Installation instructions & documentation can be found
Hopefully you find this library useful and can try it out in your next project.
Thank you so much!