How to get your data into Sindice and Google with sitemap4rdf

Transcript of "How to get your data into Sindice and Google with sitemap4rdf"

  1. How to get your data into Sindice and Google with sitemap4rdf
     Boris Villazón-Terrazas (OEG), Richard Cyganiak (DERI)
  2. Publishing Linked Data from a triple store
  3. Linked Data frontends for triple stores
     Source: Pubby website, http://www4.wiwiss.fu-berlin.de/pubby/
  4. Search engines
  5. Sindice: the best RDF search engine
  6. Sindice: the best RDF search engine
     120M+ documents
     Continuously updating since 2006
     Low-latency search API
     RDF/XML, Turtle, RDFa, microformats
  7. The Sitemap protocol
  8. Sitemap Protocol
     Used by web crawlers
     Efficiently find all your content & discover what has been updated
     http://sitemaps.org/
  9. Sitemap Protocol: Simple example
     <?xml version="1.0" encoding="UTF-8"?>
     <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
       <url>
         <loc>http://yoursite/</loc>
       </url>
       <url>
         <loc>http://yoursite/products/53546</loc>
       </url>
       <url>
         <loc>http://yoursite/products/98421</loc>
       </url>
       <url>
         <loc>http://yoursite/products/41003</loc>
       </url>
     </urlset>
  10. Sitemap Protocol: Optional parts
     <?xml version="1.0" encoding="UTF-8"?>
     <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
       <url>
         <loc>http://yoursite/</loc>
         <lastmod>2010-06-24</lastmod>
         <changefreq>daily</changefreq>
       </url>
     </urlset>
  11. Sitemap Protocol: Huge sitemaps
     Gzip-compress your sitemap
     Limit: 50k URLs or 10MB
     Split into multiple sitemap files
     Add a sitemap index file
  12. Sitemap Protocol: Discovery
     Publish the sitemap file
     Add a line to http://yoursite/robots.txt
       Sitemap: http://yoursite/sitemap.xml
  13. sitemap4rdf
     Generate Sitemap files from a SPARQL endpoint
  14. sitemap4rdf
     Simple command line tool
     Sends a SPARQL query to list all URIs
     Generates sitemap
     sitemap4rdf http://yoursite/sparql http://yoursite/resource/
  15. Submit the sitemap location - Sindice
     http://sindice.com/main/submit
  16. Submit the sitemap location - Google
     https://www.google.com/webmasters/tools/
  17. Summary
     Sitemap protocol informs search engines about available pages
     Supported by Sindice!
     sitemap4rdf generates Sitemap files by listing URIs in a SPARQL endpoint
     Open source, Java
     http://lab.linkeddata.deri.ie/2010/sitemap4rdf/
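
The slides describe the pipeline only in outline: list the resource URIs, write them into a sitemap, and split into several files plus a sitemap index once the 50k-URL limit is reached. The following is a minimal sketch of the file-generation step, written in Python rather than the tool's Java, and it is not sitemap4rdf's actual code. It assumes the URIs have already been fetched from the SPARQL endpoint; the function names `build_urlset` and `build_sitemaps` and the `sitemapN.xml` file naming are hypothetical choices made for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50000  # per-file URL limit quoted on slide 11


def build_urlset(uris, lastmod=None):
    """Return one <urlset> document (as a string) listing the given URIs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for uri in uris:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = uri
        if lastmod is not None:
            # Optional element, as on slide 10 (e.g. "2010-06-24").
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


def build_sitemaps(uris, base_url, max_urls=MAX_URLS):
    """Split URIs into files of at most max_urls entries.

    Returns ({filename: xml_string}, index_xml), where index_xml is None
    when a single sitemap file is enough.
    """
    uris = list(uris)
    chunks = [uris[i:i + max_urls] for i in range(0, len(uris), max_urls)]
    if len(chunks) <= 1:
        return {"sitemap.xml": build_urlset(uris)}, None

    files = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for n, chunk in enumerate(chunks, start=1):
        name = f"sitemap{n}.xml"  # hypothetical naming scheme
        files[name] = build_urlset(chunk)
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = base_url + name
    return files, ET.tostring(index, encoding="unicode")
```

Calling `build_sitemaps(uris, "http://yoursite/")` with fewer than 50,000 URIs yields a single `sitemap.xml`; beyond that it yields numbered files and an index whose `<loc>` entries point at them. The sketch does not enforce the 10MB size limit, apply gzip compression, or write the XML declaration; those steps would be added when the strings are written to disk.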