How to get your data into Sindice and Google with sitemap4rdf
 

Usage Rights: CC Attribution-ShareAlike License


    Presentation Transcript

    • How to get your data into Sindice and Google with sitemap4rdf
      Boris Villazón-Terrazas (OEG), Richard Cyganiak (DERI)
    • Publishing Linked Data
      from a triple store
    • Linked Data frontends for triple stores
      Source: Pubby website, http://www4.wiwiss.fu-berlin.de/pubby/
    • Search engines
    • Sindice: the best RDF search engine
      120M+ documents
      Continuously updating since 2006
      Low-latency search API
      RDF/XML, Turtle, RDFa, microformats
    • The Sitemap protocol
    • Sitemap Protocol
      Used by web crawlers
      Efficiently find all your content & discover what has been updated
      http://sitemaps.org/
    • Sitemap Protocol: Simple example
      <?xml version="1.0" encoding="UTF-8"?>
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <url>
          <loc>http://yoursite/</loc>
        </url>
        <url>
          <loc>http://yoursite/products/53546</loc>
        </url>
        <url>
          <loc>http://yoursite/products/98421</loc>
        </url>
        <url>
          <loc>http://yoursite/products/41003</loc>
        </url>
      </urlset>
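
A sitemap like the one above can also be produced programmatically. The following is a minimal Python sketch, not part of the original slides; the function name `build_sitemap` is hypothetical, and the URLs are the placeholder ones from the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML document (as a str) listing the given URLs."""
    # Serialize the sitemap namespace as the default namespace (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        "http://yoursite/",
        "http://yoursite/products/53546",
    ]))
```

Using an XML library rather than string concatenation means characters such as `&` in URLs are escaped automatically.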
    • Sitemap Protocol: Optional parts
      <?xml version="1.0" encoding="UTF-8"?>
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <url>
          <loc>http://yoursite/</loc>
          <lastmod>2010-06-24</lastmod>
          <changefreq>daily</changefreq>
        </url>
      </urlset>
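
To illustrate how a consumer might treat the optional parts, here is a small Python sketch (not from the slides) that reads a sitemap and reports `lastmod` and `changefreq` when present. The function name `read_sitemap` is hypothetical; missing optional elements simply come back as `None`.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def read_sitemap(xml_bytes):
    """Return one dict per <url> entry; optional fields may be None."""
    root = ET.fromstring(xml_bytes)
    entries = []
    for url_el in root.findall("sm:url", NS):
        entries.append({
            "loc": url_el.findtext("sm:loc", namespaces=NS),
            "lastmod": url_el.findtext("sm:lastmod", namespaces=NS),
            "changefreq": url_el.findtext("sm:changefreq", namespaces=NS),
        })
    return entries


if __name__ == "__main__":
    sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://yoursite/</loc>
    <lastmod>2010-06-24</lastmod>
    <changefreq>daily</changefreq>
  </url>
</urlset>"""
    for entry in read_sitemap(sample):
        print(entry)
```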
    • Sitemap Protocol: Huge sitemaps
      Gzip-compress your sitemap
      Limit per sitemap file: 50k URLs or 10MB
      If you exceed the limit: split into multiple sitemap files
      and add a sitemap index file that lists them
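
The splitting step can be sketched in Python as follows. This is an illustration under stated assumptions, not the behaviour of any particular tool: the function name `build_sitemap_files`, the file names `sitemap-N.xml.gz` and `sitemap_index.xml`, and the `base_url` parameter are all hypothetical, and the sketch only enforces the URL-count limit, not the file-size limit.

```python
import gzip
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit stated on the slide

ET.register_namespace("", SITEMAP_NS)


def _document(root_tag, child_tag, locs):
    """Serialize <root_tag> with one <child_tag><loc>…</loc></child_tag> per entry."""
    root = ET.Element(f"{{{SITEMAP_NS}}}{root_tag}")
    for loc in locs:
        child = ET.SubElement(root, f"{{{SITEMAP_NS}}}{child_tag}")
        ET.SubElement(child, f"{{{SITEMAP_NS}}}loc").text = loc
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def build_sitemap_files(urls, base_url, max_urls=MAX_URLS):
    """Return {file name: bytes}: gzipped sitemap chunks plus an index file."""
    files = {}
    sitemap_locs = []
    for n, start in enumerate(range(0, len(urls), max_urls), start=1):
        name = f"sitemap-{n}.xml.gz"
        chunk = urls[start:start + max_urls]
        files[name] = gzip.compress(_document("urlset", "url", chunk))
        sitemap_locs.append(base_url + name)
    # The index lists the public location of every sitemap chunk.
    files["sitemap_index.xml"] = _document("sitemapindex", "sitemap", sitemap_locs)
    return files
```

Returning the files as an in-memory mapping keeps the sketch free of filesystem assumptions; writing each entry to disk with `open(name, "wb")` is left to the caller.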
    • Sitemap Protocol: Discovery
      Publish the sitemap file
      Add a line to http://yoursite/robots.txt
      Sitemap: http://yoursite/sitemap.xml
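
From the crawler's side, discovery amounts to scanning robots.txt for `Sitemap:` lines. A minimal Python sketch, with the hypothetical function name `find_sitemaps`, might look like this:

```python
def find_sitemaps(robots_txt):
    """Return the sitemap URLs declared in the text of a robots.txt file."""
    sitemaps = []
    for line in robots_txt.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        # Split at the first colon only, so the URL's own colons survive.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            sitemaps.append(value.strip())
    return sitemaps


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow:\nSitemap: http://yoursite/sitemap.xml\n"
    print(find_sitemaps(robots))
```

Python 3.8 and later also offer `urllib.robotparser.RobotFileParser.site_maps()`, which returns the same information after the file has been fetched and parsed.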
    • sitemap4rdf
      Generate Sitemap files from a SPARQL endpoint
    • sitemap4rdf
      Simple command line tool
      Sends a SPARQL query to list all URIs
      Generates sitemap
      sitemap4rdf http://yoursite/sparql http://yoursite/resource/
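
The general idea of listing URIs through SPARQL can be sketched in Python. The slides do not show the query that sitemap4rdf actually sends, so the query below is an assumption made for illustration only, and the function names `listing_query`, `request_url` and `uris_from_results` are hypothetical. The result parsing follows the standard SPARQL JSON results layout.

```python
import json
from urllib.parse import urlencode


def listing_query(uri_prefix):
    """Assumed query: every subject IRI that starts with the given prefix."""
    return (
        "SELECT DISTINCT ?s WHERE { "
        "?s ?p ?o . "
        f'FILTER(isIRI(?s) && STRSTARTS(STR(?s), "{uri_prefix}")) '
        "}"
    )


def request_url(endpoint, uri_prefix):
    """Build the GET URL for the SPARQL protocol's `query` parameter."""
    return endpoint + "?" + urlencode({"query": listing_query(uri_prefix)})


def uris_from_results(results_json):
    """Extract the ?s URIs from a SPARQL JSON results document."""
    data = json.loads(results_json)
    return [
        binding["s"]["value"]
        for binding in data["results"]["bindings"]
        if binding["s"]["type"] == "uri"
    ]


if __name__ == "__main__":
    print(request_url("http://yoursite/sparql", "http://yoursite/resource/"))
```

In practice the URL would be fetched with an `Accept: application/sparql-results+json` header, and the resulting list of URIs written out as sitemap `<loc>` entries.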
    • Submit the sitemap location - Sindice
      http://sindice.com/main/submit
    • Submit the sitemap location - Google
      https://www.google.com/webmasters/tools/
    • Summary
      Sitemap protocol informs search engines about available pages
      Supported by Sindice!
      sitemap4rdf generates Sitemap files by listing URIs in a SPARQL endpoint
      Open source, Java
      http://lab.linkeddata.deri.ie/2010/sitemap4rdf/