How to get your data into Sindice and Google with sitemap4rdf
Published in: Technology
Transcript

  • 1. How to get your data into Sindice and Google with sitemap4rdf
    Boris Villazón-Terrazas (OEG), Richard Cyganiak (DERI)
  • 2. Publishing Linked Data
    from a triple store
  • 3. Linked Data frontends for triple stores
    Source: Pubby website, http://www4.wiwiss.fu-berlin.de/pubby/
  • 4. Search engines
  • 5. Sindice: the best RDF search engine
  • 6. Sindice: the best RDF search engine
    120M+ documents
    Continuously updating since 2006
    Low-latency search API
    RDF/XML, Turtle, RDFa, microformats
  • 7. The Sitemap protocol
  • 8. Sitemap Protocol
    Used by web crawlers
    Efficiently find all your content & discover what has been updated
    http://sitemaps.org/
  • 9. Sitemap Protocol: Simple example
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
    <loc>http://yoursite/</loc>
    </url>
    <url>
    <loc>http://yoursite/products/53546</loc>
    </url>
    <url>
    <loc>http://yoursite/products/98421</loc>
    </url>
    <url>
    <loc>http://yoursite/products/41003</loc>
    </url>
    </urlset>
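To illustrate how a crawler might consume such a file, here is a minimal Python sketch that lists every `<loc>` entry in a sitemap. The function name `list_locations` and the embedded `EXAMPLE` document are assumptions made for illustration; they are not part of the talk or of any crawler's actual code.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (see http://sitemaps.org/).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def list_locations(sitemap_xml: str) -> list[str]:
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(sitemap_xml.encode("utf-8"))
    return [
        loc.text.strip()
        for loc in root.iter(f"{{{SITEMAP_NS}}}loc")
        if loc.text
    ]


# Hypothetical two-entry sitemap, shortened from the slide's example.
EXAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://yoursite/</loc></url>
  <url><loc>http://yoursite/products/53546</loc></url>
</urlset>"""

if __name__ == "__main__":
    for location in list_locations(EXAMPLE):
        print(location)
```

Note that the lookup is namespace-qualified: a `<urlset>` without the `xmlns` attribute would yield no locations with this sketch.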
  • 10. Sitemap Protocol: Optional parts
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
    <loc>http://yoursite/</loc>
    <lastmod>2010-06-24</lastmod>
    <changefreq>daily</changefreq>
    </url>
    </urlset>
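A sitemap with the optional `<lastmod>` and `<changefreq>` parts could be produced along the following lines. This is a sketch only: `build_sitemap` and the dictionary-based entry format are hypothetical choices, not something the slides prescribe.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries) -> bytes:
    """Serialize entries to a sitemap.

    Each entry is a dict with a required 'loc' key and optional
    'lastmod' and 'changefreq' keys (hypothetical input format).
    """
    # Emit the sitemap namespace as the default namespace (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for entry in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # Optional parts are written only when the entry provides them.
        for tag in ("loc", "lastmod", "changefreq"):
            if entry.get(tag):
                ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}").text = entry[tag]
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    sitemap = build_sitemap([
        {"loc": "http://yoursite/", "lastmod": "2010-06-24", "changefreq": "daily"},
        {"loc": "http://yoursite/products/53546"},
    ])
    print(sitemap.decode("utf-8"))
```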
  • 11. Sitemap Protocol: Huge sitemaps
    Gzip-compress your sitemap
    Limit per sitemap file: 50k URLs or 10MB
    Beyond the limit, split into multiple sitemap files
    and add a sitemap index file
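The splitting step can be sketched as follows, assuming the list of URLs is already in memory. The file names (`sitemap1.xml.gz`, `sitemap_index.xml`), the `base_url` parameter, and all function names are hypothetical. The sketch enforces only the URL-count limit from the slide, not the byte-size limit.

```python
import gzip
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit quoted on the slide


def chunk(urls, size=MAX_URLS):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def _document(root_tag, child_tag, locations) -> bytes:
    """Build <root_tag><child_tag><loc>...</loc></child_tag>...</root_tag>."""
    ET.register_namespace("", SITEMAP_NS)
    root = ET.Element(f"{{{SITEMAP_NS}}}{root_tag}")
    for location in locations:
        child = ET.SubElement(root, f"{{{SITEMAP_NS}}}{child_tag}")
        ET.SubElement(child, f"{{{SITEMAP_NS}}}loc").text = location
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


def build_split_sitemaps(urls, base_url, size=MAX_URLS):
    """Return {filename: bytes}: gzip-compressed parts plus one index file."""
    files = {}
    part_locations = []
    for number, part in enumerate(chunk(urls, size), start=1):
        name = f"sitemap{number}.xml.gz"  # hypothetical naming scheme
        files[name] = gzip.compress(_document("urlset", "url", part))
        part_locations.append(base_url + name)
    # The index lists the location of every part in <sitemap><loc> entries.
    files["sitemap_index.xml"] = _document("sitemapindex", "sitemap", part_locations)
    return files


if __name__ == "__main__":
    demo = [f"http://yoursite/resource/{i}" for i in range(5)]
    for name in build_split_sitemaps(demo, "http://yoursite/", size=2):
        print(name)
```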
  • 12. Sitemap Protocol: Discovery
    Publish the sitemap file
    Add a line to http://yoursite/robots.txt
    Sitemap: http://yoursite/sitemap.xml
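From the crawler's side, discovery amounts to reading `Sitemap:` lines out of robots.txt. A rough sketch, with `sitemap_urls` as a hypothetical helper name and no claim that any particular crawler works exactly this way:

```python
def sitemap_urls(robots_txt: str) -> list[str]:
    """Collect the URLs given on 'Sitemap:' lines of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments; this would also truncate a URL containing '#'.
        line = line.split("#", 1)[0].strip()
        field, separator, value = line.partition(":")
        # Field names are matched case-insensitively.
        if separator and field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow:\nSitemap: http://yoursite/sitemap.xml\n"
    print(sitemap_urls(robots))
```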
  • 13. sitemap4rdf
    Generate Sitemap files from a SPARQL endpoint
  • 14. sitemap4rdf
    Simple command line tool
    Sends a SPARQL query to list all URIs
    Generates sitemap
    sitemap4rdf http://yoursite/sparql http://yoursite/resource/
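The slide does not show the query sitemap4rdf sends, so the following Python sketch is only a guess at the general idea: ask the endpoint for all subject URIs under a prefix and read them from a SPARQL JSON result. The query text, the function names, and the use of `STRSTARTS` are assumptions, not the tool's actual implementation (which is written in Java).

```python
import json
import urllib.parse
import urllib.request


def build_query(prefix: str) -> str:
    """Hypothetical URI-listing query; the tool's real query may differ.

    The prefix is assumed to be trusted and is inserted without escaping.
    """
    return (
        "SELECT DISTINCT ?s WHERE { ?s ?p ?o . "
        f'FILTER(isIRI(?s) && STRSTARTS(STR(?s), "{prefix}")) }}'
    )


def uris_from_results(results_json: str) -> list[str]:
    """Extract ?s URIs from a SPARQL JSON results document."""
    data = json.loads(results_json)
    return [
        binding["s"]["value"]
        for binding in data["results"]["bindings"]
        if binding["s"]["type"] == "uri"
    ]


def list_uris(endpoint: str, prefix: str) -> list[str]:
    """Send the query over HTTP GET and return the matching URIs."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": build_query(prefix)})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request) as response:
        return uris_from_results(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder endpoint and prefix taken from the slide's example command.
    for uri in list_uris("http://yoursite/sparql", "http://yoursite/resource/"):
        print(uri)
```

Each returned URI would then become one `<url><loc>` entry in the generated sitemap.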
  • 15. Submit the sitemap location - Sindice
    http://sindice.com/main/submit
  • 16. Submit the sitemap location - Google
    https://www.google.com/webmasters/tools/
  • 17. Summary
    Sitemap protocol informs search engines about available pages
    Supported by Sindice!
    sitemap4rdf generates Sitemap files by listing URIs in a SPARQL endpoint
    Open source, Java
    http://lab.linkeddata.deri.ie/2010/sitemap4rdf/
