Semantic Search on the Public Web with Creative Commons
Usage Rights

CC Attribution License

Semantic Search on the Public Web with Creative Commons Presentation Transcript

  • 1.
      • Semantic Search on the Public Web with Creative Commons
      • 2006.03.07
      • Mike Linksvayer
  • 2. Billion$ (0)
    • Let's get the hype out of the way....
  • 3. Billion$ (1)
    • Let's get the hype out of the way....
  • 4. Billion$ (2)
    • Let's get the hype out of the way....
  • 5. Billion$ (3)
    • This calls for a mashup...
  • 6. Billion$ (4)
  • 7. Billion$ (5)
    • Fortunately CC's founders thought of that from the beginning...
  • 8. Billion$ (6)
  • 9. Billion$ (7)
  • 10.
    • About Creative Commons
  • 11.  
  • 12. Core Licensing Suite: Creator/Licensor chooses license options
    • Every Creative Commons license allows the world to copy and distribute a work, provided that the licensee credits the creator/licensor
    • In addition, the creator/licensor may apply the following conditions:
      • NonCommercial
      • No Derivatives
      • ShareAlike
  • 13.  
  • 14. Simple License Generator
  • 15. Internet Archive Free Hosting for CC works http://www.archive.org/
  • 16.
    • Creative Commons Metadata
  • 17. Creative Commons Metadata Example
    • <rdf:RDF xmlns="http://web.resource.org/cc/"
    • xmlns:dc="http://purl.org/dc/elements/1.1/"
    • xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    • <Work rdf:about="http://example.com/article.html">
    • <dc:title>An Example Article</dc:title>
    • <dc:date>2003-10-01</dc:date>
    • <dc:type rdf:resource="http://purl.org/dc/dcmitype/Text" />
    • <license rdf:resource="http://creativecommons.org/licenses/by-nc-sa/2.5/" />
    • </Work>
    • <License rdf:about="http://creativecommons.org/licenses/by-nc-sa/2.5/">
    • <permits rdf:resource="http://web.resource.org/cc/Reproduction" />
    • <permits rdf:resource="http://web.resource.org/cc/Distribution" />
    • <requires rdf:resource="http://web.resource.org/cc/Notice" />
    • <requires rdf:resource="http://web.resource.org/cc/Attribution" />
    • <prohibits rdf:resource="http://web.resource.org/cc/CommercialUse" />
    • <permits rdf:resource="http://web.resource.org/cc/DerivativeWorks" />
    • <requires rdf:resource="http://web.resource.org/cc/ShareAlike" />
    • </License>
    • </rdf:RDF>
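To make the structure of the example above concrete, here is a minimal Python sketch that reads the License block and groups its permits/requires/prohibits statements. It treats the RDF/XML as plain XML with the standard library's ElementTree, which is enough for this fixed layout but is not a general RDF parser. The function name license_terms and the trimmed SAMPLE string are illustrative assumptions, not part of the slides.

```python
import xml.etree.ElementTree as ET

# Namespaces taken from the slide's example.
CC = "http://web.resource.org/cc/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Trimmed copy of the slide's example (hypothetical test input).
SAMPLE = """<rdf:RDF xmlns="http://web.resource.org/cc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Work rdf:about="http://example.com/article.html">
    <dc:title>An Example Article</dc:title>
    <license rdf:resource="http://creativecommons.org/licenses/by-nc-sa/2.5/" />
  </Work>
  <License rdf:about="http://creativecommons.org/licenses/by-nc-sa/2.5/">
    <permits rdf:resource="http://web.resource.org/cc/Reproduction" />
    <permits rdf:resource="http://web.resource.org/cc/Distribution" />
    <requires rdf:resource="http://web.resource.org/cc/Notice" />
    <requires rdf:resource="http://web.resource.org/cc/Attribution" />
    <prohibits rdf:resource="http://web.resource.org/cc/CommercialUse" />
    <permits rdf:resource="http://web.resource.org/cc/DerivativeWorks" />
    <requires rdf:resource="http://web.resource.org/cc/ShareAlike" />
  </License>
</rdf:RDF>"""


def license_terms(rdf_xml):
    """Map each License URL to its permits/requires/prohibits terms."""
    root = ET.fromstring(rdf_xml)
    terms = {}
    for lic in root.findall(f"{{{CC}}}License"):
        url = lic.get(f"{{{RDF}}}about")
        entry = {"permits": [], "requires": [], "prohibits": []}
        for kind in entry:
            for el in lic.findall(f"{{{CC}}}{kind}"):
                res = el.get(f"{{{RDF}}}resource")
                if res:
                    # Keep only the short term name, e.g. "Attribution".
                    entry[kind].append(res[len(CC):] if res.startswith(CC) else res)
        terms[url] = entry
    return terms


if __name__ == "__main__":
    for url, entry in license_terms(SAMPLE).items():
        print(url, entry)
```

A search engine could store these three term lists per page and filter on them, for example returning only works whose license does not prohibit CommercialUse.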
  • 18. Rights Description Use Cases
    • Discovery
    • Expression
    • Commerce
    • Management(1)
  • 19. Rights Description vs. Rights Management(2)
    • Copy/Use promotion vs. Copy/Use protection
    • Encourage fans vs. Discourage casual pirates
    • Resource management vs. Customer management
    • Web content model vs. 20th century content model
    • Not mutually exclusive in theory.
  • 20. Why Semantic Web?
    • Small organization, no central registration for every license
    • Decentralization: Let a thousand search engines bloom; web as API
    • Existing RDF tools could take advantage of CC RDF
  • 21. Why RDF-in-HTML comments? (yuck)
    • Considered:
    • Robots.txt-like
    • HTML meta tags
    • LINK to external RDF file
    • RDF-in-HTML comments wins because
    • Metadata colocated with human visible HTML, only single copy & paste for licensors
    • Full power of RDF
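As an illustration of what a crawler has to do with RDF-in-HTML comments, the sketch below pulls rdf:RDF blocks out of HTML comments using Python's standard html.parser. The class name RdfCommentExtractor, the helper extract_rdf, and the PAGE sample are hypothetical; the slides do not describe how CC's own tools implemented this step.

```python
from html.parser import HTMLParser


class RdfCommentExtractor(HTMLParser):
    """Collect HTML comments whose body is an <rdf:RDF> block."""

    def __init__(self):
        super().__init__()
        self.rdf_blocks = []

    def handle_comment(self, data):
        text = data.strip()
        # Ignore ordinary comments; keep only complete RDF blocks.
        if text.startswith("<rdf:RDF") and text.endswith("</rdf:RDF>"):
            self.rdf_blocks.append(text)


def extract_rdf(html):
    """Return the RDF/XML strings embedded in the page's comments."""
    parser = RdfCommentExtractor()
    parser.feed(html)
    parser.close()
    return parser.rdf_blocks


# Hypothetical page: visible license notice plus commented-out metadata.
PAGE = """<html><body>
<p>This work is licensed under a Creative Commons license.</p>
<!-- just a note -->
<!--
<rdf:RDF xmlns="http://web.resource.org/cc/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Work rdf:about="http://example.com/article.html">
    <license rdf:resource="http://creativecommons.org/licenses/by-nc-sa/2.5/" />
  </Work>
</rdf:RDF>
-->
</body></html>"""

if __name__ == "__main__":
    for block in extract_rdf(PAGE):
        print(block)
```

Each extracted string is ordinary RDF/XML, so it can be handed to any existing RDF or XML tool, which is the "full power of RDF" argument on the slide.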
  • 22. CC Search History I
    • Postgresql/tsearch2/python prototype (early 2004)
      • Sloooowwwww, but did what a prototype should do
  • 23. CC Search History II
    • CC-Nutch (late 2004)
      • Nutch aims to be an open source search engine comparable to commercial web-scale search engines
      • Built on top of Lucene full text index
      • CC plugin only ~500 lines of code (not counting UI, CC-required additions to Nutch core)
      • http://search.creativecommons.org uses Nutch, >1m CC-licensed pages indexed
  • 24.  
  • 25. CC Search History III
    • Yahoo! Search for Creative Commons (early 2005)
      • Search CC-licensed subset of Yahoo!’s index (~15m* pages)
      • *very rough guesstimate
  • 26.  
  • 27.  
  • 28.  
  • 29. CC Search History IV
    • Google CC search (November 2005)
      • Search CC-licensed subset of Google’s index (~45m* pages)
      • *very rough guesstimate
  • 30.  
  • 31.  
  • 32.  
  • 33. CC Search History V (the future)
    • Better metadata formats
    • Image and Video search
    • Derivatives search
    • Content commerce search
    • “Live” web search
    • “Management” (desktop, workgroup)
    • Semantic mashups
  • 34. Future CC metadata formats
    • “Semantic XHTML” AKA “lowercase semantic web” AKA “microformats” (now)
    • <a rel="license" href="http://creativecommons.org/licenses/by/2.5/">
    • RDF/A AKA XHTML2 metadata (in working group)
    • GRDDL (gleaning resource descriptions from dialects of languages)
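The rel="license" link above is simple enough to detect without any RDF tooling. The following is a rough Python sketch of that detection using the standard html.parser; the class name RelLicenseFinder, the helper find_licenses, and the sample markup are assumptions made for illustration.

```python
from html.parser import HTMLParser


class RelLicenseFinder(HTMLParser):
    """Collect href values of <a> and <link> elements marked rel="license"."""

    def __init__(self):
        super().__init__()
        self.licenses = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        # rel is a space-separated token list, e.g. rel="license nofollow".
        rel_tokens = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "license" in rel_tokens and href:
            self.licenses.append(href.strip())


def find_licenses(html):
    """Return the license URLs asserted by the page, in document order."""
    finder = RelLicenseFinder()
    finder.feed(html)
    finder.close()
    return finder.licenses


# Hypothetical page fragment.
SNIPPET = """
<p>Some rights reserved:
<a rel="license" href="http://creativecommons.org/licenses/by/2.5/">CC BY 2.5</a>.
See also <a rel="nofollow" href="http://example.com/">an unrelated link</a>.</p>
"""

if __name__ == "__main__":
    print(find_licenses(SNIPPET))
```

The trade-off compared with embedded RDF is that the link only says which license applies to the page; it carries no title, type, or per-work detail.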
  • 35.  
  • 36.  
  • 37.  
  • 38.  
  • 39. Image and Video search
    • Better metadata formats
    • Image and Video search
    • Derivatives search
    • Content commerce search
    • “Live” web search
    • “Management” (desktop, workgroup)
    • Semantic mashups
  • 40. Searching for Derivative Works
  • 41. Creative Commons (0)
  • 42. Creative Commons (0)
  • 43. Creative Commons (0)
  • 44. Creative Commons (0)
  • 45. Derivatives search
    • RDF/XML snippet:
    • <dc:source rdf:resource="http://ccmixter.org/media/files/victor/3385"/>
    • Query like Yahoo! link: search or Technorati Cosmos search
    • source:http://ccmixter.org/media/files/victor/3385
    • “Who sampled this” as the new “who linked to this”
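One way to picture the "who sampled this" query is as a reverse index from each dc:source URL to the works that cite it. The Python sketch below builds such an index from a few RDF/XML documents. It assumes dc:source appears as a child of a Work element, which the slide's one-line snippet does not confirm, and the example.org remix URLs, sample_rdf, derivation_pairs, and build_source_index are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

CC = "http://web.resource.org/cc/"
DC = "http://purl.org/dc/elements/1.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def sample_rdf(work_url, source_url):
    """Build a tiny RDF/XML document for testing (assumed layout)."""
    return f"""<rdf:RDF xmlns="{CC}" xmlns:dc="{DC}" xmlns:rdf="{RDF}">
  <Work rdf:about="{work_url}">
    <dc:source rdf:resource="{source_url}"/>
  </Work>
</rdf:RDF>"""


def derivation_pairs(rdf_xml):
    """Return (work, source) URL pairs declared in one RDF/XML document."""
    root = ET.fromstring(rdf_xml)
    pairs = []
    for work in root.findall(f"{{{CC}}}Work"):
        work_url = work.get(f"{{{RDF}}}about")
        for src in work.findall(f"{{{DC}}}source"):
            source_url = src.get(f"{{{RDF}}}resource")
            if work_url and source_url:
                pairs.append((work_url, source_url))
    return pairs


def build_source_index(documents):
    """Reverse index: source URL -> sorted list of works derived from it."""
    index = defaultdict(set)
    for doc in documents:
        for work_url, source_url in derivation_pairs(doc):
            index[source_url].add(work_url)
    return {source: sorted(works) for source, works in index.items()}


if __name__ == "__main__":
    original = "http://ccmixter.org/media/files/victor/3385"
    docs = [
        sample_rdf("http://example.org/remix-b", original),
        sample_rdf("http://example.org/remix-a", original),
    ]
    # Equivalent of the query source:http://ccmixter.org/media/files/victor/3385
    print(build_source_index(docs).get(original, []))
```

This mirrors how link: search works: the crawler records outgoing references, and the query reads the index in the opposite direction.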
  • 46. Content commerce search
    • Transaction costs should be low even if rights are reserved
    • Commercial terms and other commerce described by metadata associated with a work
    • Find me work I can use at a price I can pay for usage rights, warranty/paper trail (even if rights not reserved)
    • Reintermediate consumer and creator
  • 47. “Live” web search (feeds)
    • Feeds are explicitly metadata-rich (unlike typical web page)
    • Existing blog search ignores metadata
    • Web search will become more like blog search, vice versa?
  • 48. “Management” (desktop, workgroup)
    • Desktop search (OS-level)
    • Content creation and media player integration
    • XMP
    • Semantic Wikis
  • 49. Semantic mashups
  • 50. Issues for Semantic Search on the Public Web
    • Metadata quality
    • Trust
    • Scalability
    • Usability
    • Compatibility
    • Critical mass
    • State of the art IR works very well – high expectations!
  • 51.
      • Semantic Search on the Public Web with Creative Commons
      • 2006.03.07
      • Mike Linksvayer
      • Questions, feedback, flames:
      • [email_address]
      • http://developer.creativecommons.org