How to get your data into Sindice and Google with sitemap4rdf
Published in: Technology
Transcript

  • 1. How to get your data into Sindice and Google with sitemap4rdf
    Boris Villazón-Terrazas (OEG), Richard Cyganiak (DERI)
  • 2. Publishing Linked Data from a triple store
  • 3. Linked Data frontends for triple stores
    Source: Pubby website, http://www4.wiwiss.fu-berlin.de/pubby/
  • 4. Search engines
  • 5. Sindice: the best RDF search engine
  • 6. Sindice: the best RDF search engine
    120M+ documents
    Continuously updating since 2006
    Low-latency search API
    RDF/XML, Turtle, RDFa, microformats
  • 7. The Sitemap protocol
  • 8. Sitemap Protocol
    Used by web crawlers
    Efficiently find all your content & discover what has been updated
    http://sitemaps.org/
  • 9. Sitemap Protocol: Simple example
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://yoursite/</loc>
      </url>
      <url>
        <loc>http://yoursite/products/53546</loc>
      </url>
      <url>
        <loc>http://yoursite/products/98421</loc>
      </url>
      <url>
        <loc>http://yoursite/products/41003</loc>
      </url>
    </urlset>
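A minimal sketch of producing a file like the one on this slide from a list of URLs, using only the Python standard library. The function name `build_sitemap` is an assumption made for illustration; it is not part of sitemap4rdf.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap document (str) with one <url><loc> entry per URL.

    Hypothetical helper for illustration only.
    """
    # Setting xmlns as a plain attribute keeps the output identical in shape
    # to the slide's example and avoids global namespace registration.
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for u in urls:
        url = ET.SubElement(urlset, "url")
        # ElementTree entity-escapes characters such as & in the URL text.
        ET.SubElement(url, "loc").text = u
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap(["http://yoursite/", "http://yoursite/products/53546"]))
```

The output is written on a single line; crawlers do not need the indentation shown on the slide.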
  • 10. Sitemap Protocol: Optional parts
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://yoursite/</loc>
        <lastmod>2010-06-24</lastmod>
        <changefreq>daily</changefreq>
      </url>
    </urlset>
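The optional elements can be sketched in the same way. The helper below, `url_entry` (a hypothetical name), builds one `<url>` element and rejects `changefreq` values outside the set defined by the Sitemap protocol; `lastmod` is written as a W3C date (YYYY-MM-DD).

```python
import xml.etree.ElementTree as ET
from datetime import date

# changefreq values allowed by the Sitemap protocol (sitemaps.org).
CHANGEFREQ_VALUES = {"always", "hourly", "daily", "weekly",
                     "monthly", "yearly", "never"}


def url_entry(loc, lastmod=None, changefreq=None):
    """Build one <url> element; lastmod and changefreq are optional.

    Hypothetical helper for illustration only.
    """
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    if lastmod is not None:
        # date.isoformat() yields YYYY-MM-DD, as on the slide.
        ET.SubElement(url, "lastmod").text = lastmod.isoformat()
    if changefreq is not None:
        if changefreq not in CHANGEFREQ_VALUES:
            raise ValueError(f"invalid changefreq: {changefreq!r}")
        ET.SubElement(url, "changefreq").text = changefreq
    return url


if __name__ == "__main__":
    entry = url_entry("http://yoursite/", date(2010, 6, 24), "daily")
    print(ET.tostring(entry, encoding="unicode"))
```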
  • 11. Sitemap Protocol: Huge sitemaps
    Gzip-compress your sitemap
    Limit: 50k URLs or 10MB per sitemap file
    Above the limit: split into multiple sitemap files
    and add a sitemap index file
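A rough sketch of the splitting step, assuming the 50k-URL limit from this slide. It writes gzip-compressed sitemap parts plus an uncompressed index that points at them. The file names (`sitemap-1.xml.gz`, `sitemap_index.xml`) and the function `write_split_sitemaps` are assumptions for illustration, and the sketch checks only the URL count, not the byte-size limit.

```python
import gzip
import xml.etree.ElementTree as ET
from pathlib import Path

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit quoted on the slide
XML_DECL = b'<?xml version="1.0" encoding="UTF-8"?>\n'


def _document(root_tag, item_tag, locs):
    """Serialize <root_tag> containing one <item_tag><loc> per location."""
    root = ET.Element(root_tag, {"xmlns": SITEMAP_NS})
    for loc in locs:
        item = ET.SubElement(root, item_tag)
        ET.SubElement(item, "loc").text = loc
    return XML_DECL + ET.tostring(root, encoding="utf-8")


def write_split_sitemaps(urls, out_dir, base_url, max_urls=MAX_URLS):
    """Write gzipped sitemap parts and an index file; return the index path.

    Hypothetical helper; file naming is an assumption, not a convention
    required by the protocol.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    part_locs = []
    for start in range(0, len(urls), max_urls):
        name = f"sitemap-{len(part_locs) + 1}.xml.gz"
        data = _document("urlset", "url", urls[start:start + max_urls])
        (out_dir / name).write_bytes(gzip.compress(data))
        # The index must list the public URL of each part, not its local path.
        part_locs.append(base_url.rstrip("/") + "/" + name)
    index_path = out_dir / "sitemap_index.xml"
    index_path.write_bytes(_document("sitemapindex", "sitemap", part_locs))
    return index_path
```

The index uses the `<sitemapindex>` and `<sitemap>` elements from the same namespace as regular sitemap files.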
  • 12. Sitemap Protocol: Discovery
    Publish the sitemap file
    Add a line to http://yoursite/robots.txt
    Sitemap: http://yoursite/sitemap.xml
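To illustrate the crawler's side of discovery, here is a small sketch that pulls `Sitemap:` lines out of the text of a robots.txt file. The function name `sitemap_urls` is hypothetical, and the parsing is deliberately simple (whole-line comments are skipped, field names are matched case-insensitively).

```python
def sitemap_urls(robots_txt):
    """Return the URLs declared on 'Sitemap:' lines of a robots.txt body.

    Hypothetical helper for illustration only.
    """
    urls = []
    for line in robots_txt.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and whole-line comments
        # Split on the first colon only, so the URL's own "://" is preserved.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /private/\nSitemap: http://yoursite/sitemap.xml\n"
    print(sitemap_urls(robots))
```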
  • 13. sitemap4rdf
    Generate Sitemap files from a SPARQL endpoint
  • 14. sitemap4rdf
    Simple command line tool
    Sends a SPARQL query to list all URIs
    Generates sitemap
    sitemap4rdf http://yoursite/sparql http://yoursite/resource/
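The idea of listing all URIs through SPARQL can be sketched as follows. This is not sitemap4rdf's actual query or code; the query text, the use of `STRSTARTS` (SPARQL 1.1), and the function names are assumptions made for illustration. The request uses the standard SPARQL protocol `query` parameter and asks for JSON results.

```python
import json
import urllib.parse
import urllib.request


def build_query(prefix):
    """Build a SELECT query for all subject IRIs that start with prefix."""
    if '"' in prefix or "\\" in prefix:
        raise ValueError("prefix must not contain quotes or backslashes")
    return (
        "SELECT DISTINCT ?s WHERE { "
        "?s ?p ?o . "
        f'FILTER(isIRI(?s) && STRSTARTS(STR(?s), "{prefix}")) '
        "}"
    )


def uris_from_results(results_json):
    """Extract ?s values from a SPARQL JSON results document."""
    data = json.loads(results_json)
    return [b["s"]["value"]
            for b in data["results"]["bindings"]
            if b["s"]["type"] == "uri"]


def list_uris(endpoint, prefix):
    """Send the query to the endpoint via HTTP GET and return the URIs."""
    params = urllib.parse.urlencode({"query": build_query(prefix)})
    req = urllib.request.Request(
        endpoint + "?" + params,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(req) as resp:
        return uris_from_results(resp.read().decode("utf-8"))
```

With the slide's placeholder arguments, a call would look like `list_uris("http://yoursite/sparql", "http://yoursite/resource/")`. Large stores may need paging with `LIMIT` and `OFFSET`, which this sketch leaves out.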
  • 15. Submit the sitemap location - Sindice
    http://sindice.com/main/submit
  • 16. Submit the sitemap location - Google
    https://www.google.com/webmasters/tools/
  • 17. Summary
    Sitemap protocol informs search engines about available pages
    Supported by Sindice!
    sitemap4rdf generates Sitemap files by listing URIs in a SPARQL endpoint
    Open source, Java
    http://lab.linkeddata.deri.ie/2010/sitemap4rdf/
