GRDDL in a nutshell
Fabien Gandon, INRIA
I want my data back from my web pages.
so much data in my web page
I want your data back from your web pages.
so much data in many web pages
open your data to anyone who might use it (© W3C)
the deep web in particular…
many dialects of XML are in use

<ccxml version="1.0" xmlns="http://www.w3.org/2002/09/ccxml">
  <eventprocessor>
    <transition event="connection.alerting" name="evt">
      <log expr="'The number called is' + evt.connection.remote + '.'"/>
      <if cond="evt.connection.remote == 'tel:+18315551234'">
        <log expr="'Go away! we do not want to answer the phone.'"/>
        <reject/>
        <else/>
        <log expr="'We like you! We are going to answer the call.'"/>
        <accept/>
      </if>
    </transition>
    <transition event="connection.connected">
      <log expr="'Call was answered,Time to disconnect it.'"/>
      <disconnect/>
    </transition>
    <transition event="connection.disconnected">
      <log expr="'Call has been disconnected. Ending CCXML Session.'"/>
      <exit/>
    </transition>
  </eventprocessor>
</ccxml>

<mroot>
  <mrow>
    <mn>1</mn> <mo>-</mo>
    <mfrac> <mi>x</mi> <mn>2</mn> </mfrac>
  </mrow>
  <mn>3</mn>
</mroot>

<users>
  <person login="fgandon" uid="19536">
    <home>/net/user/fg</home>
    <pref>/sys/19536.inf</pref>
    <access_level>8</access_level>
  </person>
  <person login="fgandon" uid="19536">
    <home>/net/user/fg</home>
    <pref>/sys/19536.inf</pref>
    <access_level>8</access_level>
  </person>
</users>

<p:pipeline name="fig2" xmlns:p="http://example.org/PipelineNamespace">
  <p:input port="doc" sequence="no"/>
  <p:output port="out" step="xform" source="result"/>
  <p:choose name="vcheck" step="fig2" source="doc">
    <p:when test="/*[@version &lt; 2.0]">
      <p:output name="valid" step="val1" source="result"/>
      <p:step type="p:validate" name="val1">
        <p:input port="document" step="fig2" source="doc"/>
        <p:input port="schema" href="v1schema.xsd"/>
      </p:step>
    </p:when>
    <p:otherwise>
      <p:output name="valid" step="val2" source="result"/>
      <p:step type="p:validate" name="val2">
        <p:input port="document" step="fig2" source="doc"/>
        <p:input port="schema" href="v2schema.xsd"/>
      </p:step>
    </p:otherwise>
  </p:choose>
  <p:step type="p:xslt" name="xform">
    <p:input port="document" step="vcheck" source="valid"/>
    <p:input port="stylesheet" href="stylesheet.xsl"/>
  </p:step>
</p:pipeline>

<HTML>
  <HEAD>
    <TITLE>title</TITLE>
    <LINK REL=STYLESHEET TYPE="text/css" HREF="http://style.com/cool" TITLE="Cool">
  </HEAD>
  <BODY>
    <H1>Headline is blue</H1>
    <P STYLE="color: green">While the paragraph is green.
  </BODY>
</HTML>
many ways to weave data with web resources
several initiatives to make data embedding explicit
GRDDL = an easy way to open your data
GRDDL = an easy way to extract RDF from XML data (image: http://www.flickr.com/photos/cho45/1402634073/)
imagine…
Stephan wishes to buy a guitar,
he visits a site offering reviews,
he uses GRDDL to aggregate reviews & profiles
GRDDL at work
GRDDL step 1: declare that a document is a source
GRDDL step 2: link it to one or more extractors
GRDDL step 3: let GRDDL agents extract RDF from the document
declare that an XHTML document is a source with the generic profile

<head profile="http://www.w3.org/2003/g/data-view">
  <title>The man who mistook his wife for a hat</title>
  <link rel="transformation"
        href="http://www.w3.org/2000/06/dc-extract/dc-extract.xsl" />
  <meta name="DC.Subject" content="clinical tales" />
  …
</head>
link an XHTML document to a transformation

<head profile="http://www.w3.org/2003/g/data-view">
  <title>The man who mistook his wife for a hat</title>
  <link rel="transformation"
        href="http://www.w3.org/2000/06/dc-extract/dc-extract.xsl" />
  <meta name="DC.Subject" content="clinical tales" />
  …
</head>
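
How an agent might discover the transformations declared this way can be sketched in Python. This is only an illustration using the standard library, not a conforming GRDDL agent: the function name `xhtml_transformations` is hypothetical, the `<html>` wrapper and XHTML namespace around the slide's `<head>` are assumed, and relative `href` values are not resolved.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


def xhtml_transformations(xhtml_text):
    """Return the transformation URIs an XHTML page declares.

    Hypothetical helper: returns [] when the page does not reference
    the GRDDL profile, i.e. when it is not declared as a source.
    """
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None:
        return []
    # The profile attribute is a space-separated list of URIs.
    if GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    uris = []
    for tag in ("link", "a"):
        for el in root.iter(f"{{{XHTML_NS}}}{tag}"):
            # rel is a space-separated list of tokens.
            if "transformation" in el.get("rel", "").split():
                href = el.get("href", "").strip()
                if href:
                    uris.append(href)
    return uris


# The slide's <head>, wrapped in an assumed <html> root so that it parses.
page = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>The man who mistook his wife for a hat</title>
    <link rel="transformation"
          href="http://www.w3.org/2000/06/dc-extract/dc-extract.xsl" />
    <meta name="DC.Subject" content="clinical tales" />
  </head>
  <body/>
</html>"""

print(xhtml_transformations(page))
```

Without the `profile` attribute the same page yields an empty list, which matches the idea that step 1 (declaring the source) comes before step 2 (linking the extractor).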
what is extracted by a standard GRDDL agent

<head profile="http://www.w3.org/2003/g/data-view">
  <title>The man who mistook his wife for a hat</title>
  <link rel="transformation"
        href="http://www.w3.org/2000/06/dc-extract/dc-extract.xsl" />
  <meta name="DC.Subject" content="clinical tales" />
  …
</head>

# dc:title "The man who mistook his wife for a hat"
# dc:subject "clinical tales"
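
To make the extracted statements concrete, here is a hand-written Python stand-in that produces the two property/value pairs shown on the slide. It does not run the real dc-extract.xsl stylesheet; the function name `extract_dc`, the `<html>` wrapper, and the mapping rule (`<title>` to dc:title, `DC.X` meta names to dc:x) are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def extract_dc(xhtml_text):
    """Return (property, value) pairs read from an XHTML <head>.

    Hypothetical stand-in for an XSLT extractor, for illustration only.
    """
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None:
        return []
    statements = []
    title = head.find(f"{{{XHTML_NS}}}title")
    if title is not None and title.text:
        # Collapse whitespace so line-wrapped titles compare cleanly.
        statements.append(("dc:title", " ".join(title.text.split())))
    for meta in head.iter(f"{{{XHTML_NS}}}meta"):
        name = meta.get("name", "")
        content = meta.get("content")
        if name.startswith("DC.") and content is not None:
            statements.append(("dc:" + name[3:].lower(), content))
    return statements


# The slide's <head>, wrapped in an assumed <html> root so that it parses.
page = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>The man who mistook his wife for a hat</title>
    <link rel="transformation"
          href="http://www.w3.org/2000/06/dc-extract/dc-extract.xsl" />
    <meta name="DC.Subject" content="clinical tales" />
  </head>
  <body/>
</html>"""

for prop, value in extract_dc(page):
    print(prop, repr(value))
```

A real agent would instead fetch the linked stylesheet, apply it with an XSLT processor, and parse the resulting RDF.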
declare a source and a transformation at once with a custom profile
no transformation, just reference the custom profile

<head profile="http://purl.org/NET/erdf/profile">
  <title>Fabien’s agenda</title>
  (…)
</head>
profile document = GRDDL source giving the transformation

<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <link rel="transformation" href="http://www.w3.org/2003/g/glean-profile" />
  </head>
  <body>
    <p>
      <a rel="profileTransformation"
         href="http://purl.org/NET/erdf/extract-rdf">GRDDL transform</a>
    </p>
  </body>
</html>
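
The indirection through a profile document can be sketched as follows. This is a shortcut, not the full mechanism: an agent would normally dereference the profile URI and apply the profile document's own transformation (glean-profile above), whereas this sketch reads the `profileTransformation` links directly. The function name `profile_transformations` is hypothetical, and the profile document is passed in as text so that no network access is needed.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def profile_transformations(profile_doc_text):
    """Return the transformation URIs a profile document advertises.

    Hypothetical helper: scans <a> and <link> elements whose rel
    attribute contains the token "profileTransformation".
    """
    root = ET.fromstring(profile_doc_text)
    wanted = {f"{{{XHTML_NS}}}a", f"{{{XHTML_NS}}}link"}
    uris = []
    for el in root.iter():
        if el.tag in wanted and "profileTransformation" in el.get("rel", "").split():
            href = el.get("href", "").strip()
            if href:
                uris.append(href)
    return uris


# The profile document from the slide.
profile_doc = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <link rel="transformation" href="http://www.w3.org/2003/g/glean-profile" />
  </head>
  <body>
    <p>
      <a rel="profileTransformation"
         href="http://purl.org/NET/erdf/extract-rdf">GRDDL transform</a>
    </p>
  </body>
</html>"""

print(profile_transformations(profile_doc))
```

A page that only references the custom profile thus inherits the transformation without naming it.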
declare a source and a transformation for an XML document
declare that an XML document is a source with the generic profile

<book xmlns="http://example.org/book/"
      xmlns:grddl="http://www.w3.org/2003/g/data-view#"
      grddl:transformation="http://example.org/book/getAuthor.xsl">
  <title>The man who mistook his wife for a hat</title>
  …
</book>
link an XML document to a transformation

<book xmlns="http://example.org/book/"
      xmlns:grddl="http://www.w3.org/2003/g/data-view#"
      grddl:transformation="http://example.org/book/getAuthor.xsl">
  <title>The man who mistook his wife for a hat</title>
  …
</book>
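
For a plain XML document, reading the declaration is a matter of looking at one namespaced attribute on the root element. A minimal Python sketch, assuming the attribute holds a space-separated list of URIs; the function name `xml_transformations` is hypothetical and the elided part of the slide's example (…) is simply left out.

```python
import xml.etree.ElementTree as ET

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"


def xml_transformations(xml_text):
    """Return the URIs listed in grddl:transformation on the root element."""
    root = ET.fromstring(xml_text)
    # ElementTree exposes namespaced attributes as "{namespace}localname".
    return root.get(f"{{{GRDDL_NS}}}transformation", "").split()


# The slide's example, without its elided content.
book = """<book xmlns="http://example.org/book/"
      xmlns:grddl="http://www.w3.org/2003/g/data-view#"
      grddl:transformation="http://example.org/book/getAuthor.xsl">
  <title>The man who mistook his wife for a hat</title>
</book>"""

print(xml_transformations(book))
```

A document without the attribute gives an empty list, so the same call doubles as a test for whether the document is a GRDDL source.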
take-away message
Don't bury your data in some HTML page or XML document
when you publish a document that contains data…
do reference GRDDL profiles and/or transformations
Fabien Gandon
