Parsing for Fun and Profit

3,998 views

Slides from my talk, Parsing for Fun and Profit. Code is available here: https://github.com/patchspace/parsing_for_fun_and_profit

Published in: Technology

  1. Parsing for Fun and Profit (but mainly fun). Ash Moran, ash.moran@patchspace.co.uk, PatchSpace Ltd. Saturday, 23 February 13
  2. What?
  3. Parsing
     Adding structure and meaning to text
  4. Parsing Human Languages
     Jake stretched his legs
     “Jake”, “stretched”, “his”, “legs”
     “Jake”<noun>, “stretched”<verb, past>, “his”<possessive pronoun>, “legs”<noun>
     “Jake”<noun, subject>, “stretched”, (“his”, “legs”)<noun phrase, object>
  5. Parsing Computer Languages
     “foo = bar + 123”
     “foo”, “=”, “bar”, “+”, “123”
     “foo”<var>, “=”<assignment_op>, “bar”<var>, “+”<op_plus>, “123”<int_literal>
  6. Why?
  7. Not just compiling!
     Compilers breathe fire.
  8. Pretty Printing
  9. Pretty Printing
     gofmt
     http://gofmt.com/
  10. Code Smell Detectors
      https://rubygems.org/gems/reek
  11. Code Smell Detectors
  12. Other ideas
      Code metrics
      Bug detectors
      Domain-specific languages
      Language translators (e.g. Ruby -> PHP)
      Code obfuscators
      Alternative syntaxes (e.g. CoffeeScript)
      Refactoring tools
  13. How?
  14. Step 1
      3 year computer science degree
  15. Lexing/Tokenising
      if x > 100 then return “big” else return “small”
      if x > 100 then return “big” else return “small”
  16. Tree Building
      if x > 100 then return “big” else return a + b
      if
        >
          x
          100
        then
          return
            “big”
        else
          return
            +
              a
              b
  17. Parsing Expression Grammars
      Like regular expressions, but can handle recursion, e.g. HTML
      Not actually that much harder to use
  18. Regexes and HTML
  19. Treetop PEG grammar
  20. Doing Sums
  21. Switch to Sublime Text, idiot
      Code is now available: https://github.com/patchspace/parsing_for_fun_and_profit/
  22. A Ruby Syntax Highlighter
  23. What
      A tool to read in simple Ruby source and output syntax highlighted HTML
  24. Why
      Because I thought it would be fun
      It was
      Because I thought it would be easy
      …
  25. Why
  26. How
      Build a parse tree of the Ruby source
      Walk the tree and spit out a <span> element for each bit of text
      Oh yes, make sure each line goes in <div> and <pre> tags
      Wrap it in <html>
      And for bonus points, do some fancy method highlighting
  27. Switch to Chrome, idiot
  28. Switch to Sublime Text again, idiot
      Code is now available: https://github.com/patchspace/parsing_for_fun_and_profit/
  29. We’re doing this the hard way
      Ruby’s grammar is too complex and undefined to easily implement as a PEG
      Tools for parsing Ruby already exist
  30. Ripper (Ruby 1.9.3)
  31. Learn more!
      Skip theoretical physics, start by playing with Lego
  32. Do more
      Ideas you might like to try:
      CSV parser
      JSON parser (return arrays & hashes)
      XML parser
      JSON highlighter
      A simple JavaScript minifier (just kill whitespace)
  33. Thank you
      Ash Moran, ash.moran@patchspace.co.uk, PatchSpace Ltd
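
The lexing and tree-building steps on slides 5, 15 and 16 can be sketched in plain Ruby. This is a minimal, hand-rolled illustration and not the Treetop grammar from the talk's repository. The token labels come from slide 5; everything else (TOKEN_RULES, tokenise, parse_statement, parse_sum, parse_operand, expect, and the nested-array tree shape) is hypothetical and only handles statements of the form "var = operand + operand + ...".

```ruby
require "strscan"

# Token labels follow slide 5; the rule table itself is an assumption
# made for this sketch.
TOKEN_RULES = [
  [:int_literal,   /\d+/],
  [:var,           /[a-z_]\w*/],
  [:assignment_op, /=/],
  [:op_plus,       /\+/]
].freeze

# Lexing: split the source into [type, text] pairs, skipping whitespace.
def tokenise(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    next if scanner.skip(/\s+/)
    rule = TOKEN_RULES.find { |_type, pattern| scanner.match?(pattern) }
    raise ArgumentError, "unexpected character at offset #{scanner.pos}" if rule.nil?
    tokens << [rule[0], scanner.scan(rule[1])]
  end
  tokens
end

# Tree building: turn the flat token list into nested arrays.
#   statement := var "=" sum
#   sum       := operand ("+" operand)*
def parse_statement(tokens)
  tokens = tokens.dup
  target = expect(tokens, :var)
  expect(tokens, :assignment_op)
  tree = [:assign, target, parse_sum(tokens)]
  raise ArgumentError, "trailing tokens" unless tokens.empty?
  tree
end

def parse_sum(tokens)
  left = parse_operand(tokens)
  while tokens.first && tokens.first[0] == :op_plus
    tokens.shift
    left = [:plus, left, parse_operand(tokens)]
  end
  left
end

def parse_operand(tokens)
  type, text = tokens.shift
  case type
  when :int_literal then [:int, text.to_i]
  when :var         then [:var, text]
  else raise ArgumentError, "expected an operand, got #{text.inspect}"
  end
end

# Consume one token of the expected type and return its text.
def expect(tokens, expected_type)
  type, text = tokens.shift
  raise ArgumentError, "expected #{expected_type}, got #{type.inspect}" unless type == expected_type
  text
end

tokens = tokenise("foo = bar + 123")
tree = parse_statement(tokens)
# tree => [:assign, "foo", [:plus, [:var, "bar"], [:int, 123]]]
```

A PEG tool such as Treetop generates this kind of parser from a grammar file, so the hand-written loop above is only meant to show what the two stages produce.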

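Slides 26, 29 and 30 describe emitting a <span> for each bit of text and note that Ripper already knows how to read Ruby. The sketch below combines the two, under stated assumptions: it uses Ripper.lex from Ruby's standard library, so it walks a flat token stream instead of a parse tree, and it skips the <div>, <pre> and <html> wrapping. The highlight method, the HTML_ESCAPES table and the CSS class names (Ripper's event names with the "on_" prefix removed) are hypothetical and are not taken from the talk's code.

```ruby
require "ripper"

# Characters that must be escaped inside HTML text.
HTML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

# Wrap every token Ripper finds, whitespace included, in a <span>
# whose class is derived from the token type (:on_ident -> "ident").
def highlight(source)
  Ripper.lex(source).map do |_position, type, text|
    escaped = text.gsub(/[&<>]/, HTML_ESCAPES)
    css_class = type.to_s.sub(/\Aon_/, "")
    %(<span class="#{css_class}">#{escaped}</span>)
  end.join
end

puts highlight("x = 1")
```

Because whitespace tokens are kept, stripping the tags from the output gives back the original source, which is a handy sanity check for a highlighter.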