jsdom @ nodeconf 2011


  • Notes
  • name, rank, serial number
    purpose of talking:
    - explain my insanity and transfer knowledge in case of an encounter with a bus
    in order to understand why node has a dom, we must first know what a dom is.
  • The Document Object Model (or DOM) is a platform- and language-neutral interface that will allow programs and scripts to dynamically access and update the content, structure and style of documents.
    well, that’s nice. but I’m still confused.
  • When I think of the DOM, I immediately think of the browser.
  • this is the result of rendering this page’s html + styling
    The dom lives above the parser and below the renderer/reflow engine.
    markup -> parser -> dom -> styling -> reflow -> renderer -> image
    For instance...
  • this is a snippet taken from nodejs.org (it is the navigation on the left side of the page).
    When the browser retrieves the html, it asks the parser to create a dom.
  • the parser is a sort of assembly line, taking raw material (markup) and turning it into usable components; in this case, it’s a tree of dom nodes.
    these are f22’s
  • this is an htmlgraph representation of the nodejs homepage
    circles represent dom nodes, color indicates type
    the orange arrow shows where the markup from the previous slide lives in the DOM tree: below the body, and above the li’s, which contain anchors
  • - you can treat documents as a tree
    - manipulation is easier because we now have the tools to locate and manipulate nodes/branches/content
    - a mostly consistent set of tools is available everywhere you find a dom implementation
  • now, I’d like to dive a bit into what other server-side platforms are doing with their doms
  • every server-side language has at least one dom implementation, but they share a fundamental problem: there is no way to execute javascript in the context of the dom
    so you might be asking: what are these libraries used for?
  • These implementations are used primarily for XML manipulation, including:
    XML-RPC, SOAP, XPATH, XSLT
  • but there is a rift here. you can process html, but you don’t get any of the benefits.
    - no window
    - no events (in most cases)
    - no default behavior of elements (links)
    - meant to be short lived (mostly php+python)
    - difficult to bootstrap
    - no javascript
  • the thing is, markup is good.
    html is by far the most widely known language
  • many people, including non-technical types, know what this does
  • how do we fix this problem?
    - we MUST be compliant with the w3c spec (no surprises)
    - many platforms make bootstrapping a DOM a gigantic pain. We must solve this.
    - they also don’t execute javascript, which is why many crawlers do not evaluate ajax’d content
    - in the ideal implementation, the dom should be scriptable like any browser. But since node is written in js, we should be able to script the dom from both sides (node and inside the window context)
    - because node is the server, we can run a dom for as long as we need. this makes it useful for a whole slew of things that most of the other platforms can’t even consider
    these are some of the things that were missing, and this void prompted me to start jsdom
  • now that we know what a dom is, and the problems with other implementations:
    what is this jsdom thing?
  • it is the components between the parser and the renderer
    Parsers:
    - sax: (xml only) strict + fast
    - htmlparser: less strict, slower
    - html5: lenient (useful for the wild internet)
  •
  • jsdom isn’t done until all of the tests pass, and it’s fast.
    each level is an extension of the previous, defining new functionality or updating existing functionality
    example: level1 core -> level2 xml & html
  • jsdom.env creates a browser-like environment
  • NOTE: pages run with jsdom.env will not process script tags
    it works!
  •
  • lots of scrapers, the biggest category
  • testing framework / smoke testing
  • optimization /
  •
  • if you haven’t, you must see Dav Glass’ yui3 on node project
    graphing, charting, reading
  • whether it’s a 1 line patch or rewriting the entire test infrastructure, without these 46 folks, jsdom wouldn’t even be worth using.
  • - currently clicking on a link has no default action
    - jsdom resorts to a fairly slow mechanism to handle live node lists
    - we have memory leaks in the script tag executor
    - and we need to turn the window stubs into functional code and properly test it (act more like a browser)
  •
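
The notes above describe the parser turning markup into a tree of dom nodes, and the value of having tools to locate nodes in that tree. A minimal sketch of that idea in plain JavaScript follows. The `el` and `findByTag` helpers, the node shape, and the hrefs are all made up for illustration; this is not jsdom's internal representation or API, and the markup is not the real nodejs.org navigation.

```javascript
// Hypothetical, hand-built stand-in for a parsed DOM tree (not jsdom's internals).
function el(tagName, attrs, children) {
  return { tagName: tagName, attrs: attrs || {}, children: children || [] };
}

// Roughly what a parser might hand back for markup like:
//   <ul><li><a href="/about">About</a></li><li><a href="/docs">Docs</a></li></ul>
// (the hrefs are invented for this example)
var nav = el("ul", {}, [
  el("li", {}, [el("a", { href: "/about" }, ["About"])]),
  el("li", {}, [el("a", { href: "/docs" }, ["Docs"])])
]);

// Depth-first walk that collects every element with the given tag name.
function findByTag(node, tagName, found) {
  found = found || [];
  if (typeof node === "string") return found; // text node: nothing to match
  if (node.tagName === tagName) found.push(node);
  node.children.forEach(function (child) {
    findByTag(child, tagName, found);
  });
  return found;
}

// No text manipulation needed: locate the anchors, then read their attributes.
var hrefs = findByTag(nav, "a").map(function (a) {
  return a.attrs.href;
});
console.log(hrefs);
```

A real DOM exposes the same kind of lookup through standard methods such as `getElementsByTagName`, so code written against the tree works wherever a compliant implementation exists.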

    1. Why node has a DOM
       Elijah Insua (tmpvar) @ nodeconf 2011
    2. What is a “DOM”?
       "The Document Object Model is a platform- and language-neutral interface that will allow programs and scripts to dynamically access and update the content, structure and style of documents."
       - http://www.w3.org/DOM/
    3. wah wah wah
    4. The browser
    5. Markup
    6. Parser
    7. nodejs.org
    8. Why is this valuable?
       • No more text manipulation!
       • An API for manipulating the DOM tree
       • An extremely common paradigm
    9. Server-side Implementations
    10. Non-Javascript
        • PHP (DOMDocument, libxml)
        • Ruby (Nokogiri)
        • Java (org.w3c.dom)
        • Python (xml.dom, xml.minidom)
        • c/c++ (libxml, gdome)
    11. XML Processing
    12. The xml/html rift
        • no window
        • no events
        • difficult to bootstrap
        • short lived
        • no javascript!
    13. markup is good.
    14. Many people know
    15. HTML Represent!
        • w3c compliant (xml and html)
        • easy to bootstrap
        • execute javascript
        • act like a headless browser (but better)
        • events / default actions
        • long living if necessary
    16. What is jsdom?
    17. The w3c DOM, implemented in javascript
    18. What makes it great?
    19. w3c compliance
        • 100% DOM Level 1 (xml/svg/html)
        • 100% DOM Level 2 (xml/html/events)
        • 15% DOM Level 3 (xml)
        • Passed tests: 2451/3069
    20. Easy to bootstrap
        jsdom.env
    21. jsdom.env
    22. Projects using jsdom
        http://search.npmjs.org/#/jsdom
    23. Screen Scraping
        • Apricot
        • node-crawler
        • wsscraper
        • scraper
        • node-moviesearch
        • spider
        • jsgrep
        • query
        • jjw
    24. Testing
        • Zombie
        • Tobi
        • Ace
        • Viewjs
        • Jellyfish (admc/jellyfish)
        • node-xmpp-bosh
    25. Development
        • Assetgraph
        • jspp
        • inliner
        • csskeeper
        • packer
        • html2jade
    26. Unobtrusive Templating
        • pure (pure/pure)
        • weld.js (hij1nx/weld)
        • minimal.js (ruidlopes/minimal.js)
        • graft (Shadowfiend/graft)
    27. Code Reuse
        • YUI3
        • node-flot
        • node-highcharts
        • node-readability
        • node-raphael
    28. Shout outs
        • Adrian Makowski • Alexander Flatter • Andreas Lind Petersen • Aria Stewart • Arrix • Avery Fay • Damian Janowski • Daniel Cassidy • Daniël van de Burgt • Dav Glass • Edward OConnor • Evan Haas
        • Evan Jones • Felix Gnass • Gord Tanner • Jerry Sievert • Jos Shepherd • Joshua Peek • José Valim • Julien Guimont • Karuna Sagar • Karuna Sagar K • Marak Squires • Matthew King
        • Matthew Pflueger • Michael Fleet • Nick Stenning • Nicolas LaCasse • Olivier El Mekki • Phil Dokas • Rodrigo Flores • Ryan Wolf • Sam Ruby • Shimon Doodkin • Swizec Teller • Tom Taylor
        • Vincent Desjardins • Vytautas Jakutis • Wei Dai • Yonathan • hij1nx • indexzero • isaacs • steve • ulteriorlife • waslogic
    29. TODO
        • default actions for elements
        • speed improvements
        • fix memory leak
        • better window implementation + testing
    30. Thank you!
        https://github.com/tmpvar/jsdom
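
The TODO notes mention that jsdom handles live node lists with a fairly slow mechanism. To show what "live" means and why a naive approach costs time, here is a small sketch in plain JavaScript. It assumes the simplest possible strategy, re-walking the tree on every read; that strategy, and the `el`, `collect` and `liveList` helpers, are assumptions for illustration and not a description of jsdom's actual code.

```javascript
// Hypothetical mini tree; illustrates "live" vs. snapshot lists, not jsdom's code.
function el(tagName, children) {
  return { tagName: tagName, children: children || [] };
}

// Depth-first walk collecting every node with the given tag name.
function collect(node, tagName, out) {
  if (node.tagName === tagName) out.push(node);
  node.children.forEach(function (child) {
    collect(child, tagName, out);
  });
  return out;
}

// A "live" list: every read re-walks the tree, so it always reflects the
// current document, at the cost of a full traversal per access.
function liveList(root, tagName) {
  return {
    get length() {
      return collect(root, tagName, []).length;
    },
    item: function (i) {
      return collect(root, tagName, [])[i] || null;
    }
  };
}

var list = el("ul", [el("li"), el("li")]);
var live = liveList(list, "li");
var snapshot = collect(list, "li", []); // plain array, computed once

list.children.push(el("li")); // mutate the tree after both lists were created

console.log(live.length);     // the live list sees the new node
console.log(snapshot.length); // the snapshot does not
```

Caching the result and invalidating it when the tree changes is one common way to reduce the cost, at the price of tracking every mutation.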