Semantic Search on the Public Web with Creative Commons

    Semantic Search on the Public Web with Creative Commons - Presentation Transcript

        • Semantic Search on the Public Web with Creative Commons
        • 2006.03.07
        • Mike Linksvayer
    1. Billion$ (0)
      • Let's get the hype out of the way....
    2. Billion$ (1)
      • Let's get the hype out of the way....
    3. Billion$ (2)
      • Let's get the hype out of the way....
    4. Billion$ (3)
      • This calls for a mashup...
    5. Billion$ (4)
    6. Billion$ (5)
      • Fortunately CC's founders thought of that from the beginning...
    7. Billion$ (6)
    8. Billion$ (7)
      • About Creative Commons
    9.  
    10. Core Licensing Suite: Creator/Licensor chooses license options
      • Every Creative Commons license allows the world to copy and distribute a work, provided that the licensee credits the creator/licensor
      • In addition, the creator/licensor may apply the following conditions:
        • NonCommercial
        • No Derivatives
        • ShareAlike
    11.  
    12. Simple License Generator
    13. Internet Archive Free Hosting for CC works http://www.archive.org/
      • Creative Commons Metadata
    14. Creative Commons Metadata Example
      • <rdf:RDF xmlns="http://web.resource.org/cc/"
      • xmlns:dc="http://purl.org/dc/elements/1.1/"
      • xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      • <Work rdf:about="http://example.com/article.html">
      • <dc:title>An Example Article</dc:title>
      • <dc:date>2003-10-01</dc:date>
      • <dc:type rdf:resource="http://purl.org/dc/dcmitype/Text" />
      • <license rdf:resource="http://creativecommons.org/licenses/by-nc-sa/2.5/" />
      • </Work>
      • <License rdf:about="http://creativecommons.org/licenses/by-nc-sa/2.5/">
      • <permits rdf:resource="http://web.resource.org/cc/Reproduction" />
      • <permits rdf:resource="http://web.resource.org/cc/Distribution" />
      • <requires rdf:resource="http://web.resource.org/cc/Notice" />
      • <requires rdf:resource="http://web.resource.org/cc/Attribution" />
      • <prohibits rdf:resource="http://web.resource.org/cc/CommercialUse" />
      • <permits rdf:resource="http://web.resource.org/cc/DerivativeWorks" />
      • <requires rdf:resource="http://web.resource.org/cc/ShareAlike" />
      • </License>
      • </rdf:RDF>
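A minimal sketch of how a crawler might read metadata like the example above, using only Python's standard library. It treats the RDF/XML as plain XML, which is an assumption: it only works for this particular serialization, and a real RDF parser would be needed to handle equivalent forms. The function name `license_terms` and the shortened `SAMPLE` document are illustrative, not part of any Creative Commons tool.

```python
import xml.etree.ElementTree as ET

CC = "http://web.resource.org/cc/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened copy of the slide's example (assumption: same element layout).
SAMPLE = """<rdf:RDF xmlns="http://web.resource.org/cc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Work rdf:about="http://example.com/article.html">
    <dc:title>An Example Article</dc:title>
    <license rdf:resource="http://creativecommons.org/licenses/by-nc-sa/2.5/" />
  </Work>
  <License rdf:about="http://creativecommons.org/licenses/by-nc-sa/2.5/">
    <permits rdf:resource="http://web.resource.org/cc/Reproduction" />
    <permits rdf:resource="http://web.resource.org/cc/Distribution" />
    <requires rdf:resource="http://web.resource.org/cc/Attribution" />
    <prohibits rdf:resource="http://web.resource.org/cc/CommercialUse" />
    <permits rdf:resource="http://web.resource.org/cc/DerivativeWorks" />
    <requires rdf:resource="http://web.resource.org/cc/ShareAlike" />
  </License>
</rdf:RDF>"""


def q(namespace, tag):
    """Build an ElementTree qualified name such as '{namespace}tag'."""
    return "{%s}%s" % (namespace, tag)


def license_terms(rdf_xml):
    """Map each Work URL to its license URL and that license's terms."""
    root = ET.fromstring(rdf_xml)

    # First pass: collect permits/requires/prohibits for every License.
    licenses = {}
    for lic in root.findall(q(CC, "License")):
        terms = {"permits": [], "requires": [], "prohibits": []}
        for kind in terms:
            for el in lic.findall(q(CC, kind)):
                uri = el.get(q(RDF, "resource"), "")
                terms[kind].append(uri.rsplit("/", 1)[-1])
        licenses[lic.get(q(RDF, "about"))] = terms

    # Second pass: attach the terms to each Work that names a license.
    result = {}
    for work in root.findall(q(CC, "Work")):
        lic_el = work.find(q(CC, "license"))
        if lic_el is None:
            continue
        lic_url = lic_el.get(q(RDF, "resource"))
        empty = {"permits": [], "requires": [], "prohibits": []}
        result[work.get(q(RDF, "about"))] = dict(
            licenses.get(lic_url, empty), license=lic_url
        )
    return result


if __name__ == "__main__":
    for url, info in license_terms(SAMPLE).items():
        print(url, info)
```

A search engine could index the `permits`, `requires` and `prohibits` values as filterable fields, which is one way a "find work I can use commercially" query might be supported.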
    15. Rights Description Use Cases
      • Discovery
      • Expression
      • Commerce
      • Management(1)
    16. Rights Description vs. Rights Management(2)
      • Copy/Use promotion vs. Copy/Use protection
      • Encourage fans vs. Discourage casual pirates
      • Resource management vs. Customer management
      • Web content model vs. 20th century content model
      • Not mutually exclusive in theory.
    17. Why Semantic Web?
      • Small organization, no central registration for every license
      • Decentralization: Let a thousand search engines bloom; web as API
      • Existing RDF tools could take advantage of CC RDF
    18. Why RDF-in-HTML comments? (yuck)
      • Considered:
      • Robots.txt-like
      • HTML meta tags
      • LINK to external RDF file
      • RDF-in-HTML comments wins because
      • Metadata colocated with human visible HTML, only single copy & paste for licensors
      • Full power of RDF
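To illustrate the RDF-in-HTML-comments approach described above, here is a rough sketch of how an indexer might pull the embedded RDF out of a page with Python's standard `html.parser`. The class name `RDFCommentExtractor`, the helper `extract_rdf`, and the sample page are assumptions made for this example, not code from CC Search.

```python
from html.parser import HTMLParser

# Hypothetical page: one ordinary comment and one comment carrying CC RDF.
PAGE = """<html><body>
<p>An example article.</p>
<!-- just a note -->
<!--
<rdf:RDF xmlns="http://web.resource.org/cc/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Work rdf:about="">
    <license rdf:resource="http://creativecommons.org/licenses/by/2.5/" />
  </Work>
</rdf:RDF>
-->
</body></html>"""


class RDFCommentExtractor(HTMLParser):
    """Collect HTML comments whose body looks like an rdf:RDF document."""

    def __init__(self):
        super().__init__()
        self.rdf_blocks = []

    def handle_comment(self, data):
        text = data.strip()
        # Heuristic check only; a real indexer would parse and validate.
        if text.startswith("<rdf:RDF") and text.endswith("</rdf:RDF>"):
            self.rdf_blocks.append(text)


def extract_rdf(html):
    """Return the RDF/XML strings found inside HTML comments."""
    parser = RDFCommentExtractor()
    parser.feed(html)
    parser.close()
    return parser.rdf_blocks


if __name__ == "__main__":
    for block in extract_rdf(PAGE):
        print(block)
```

Each extracted string can then be handed to an RDF or XML parser. Because the metadata travels inside the page itself, the licensor's single copy and paste is all the crawler needs.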
    19. CC Search History I
      • PostgreSQL/tsearch2/Python prototype (early 2004)
        • Sloooowwwww, but did what a prototype should do
    20. CC Search History II
      • CC-Nutch (late 2004)
        • Nutch aims to be an open source search engine comparable to commercial web-scale search engines
        • Built on top of Lucene full text index
        • CC plugin only ~500 lines of code (not counting UI, CC-required additions to Nutch core)
        • http://search.creativecommons.org uses Nutch, >1m CC-licensed pages indexed
    21.  
    22. CC Search History III
      • Yahoo! Search for Creative Commons (early 2005)
        • Search CC-licensed subset of Yahoo!’s index (~15m* pages)
        • *very rough guesstimate
    23.  
    24.  
    25.  
    26. CC Search History IV
      • Google CC search (November 2005)
        • Search CC-licensed subset of Google’s index (~45m* pages)
        • *very rough guesstimate
    27.  
    28.  
    29.  
    30. CC Search History V (the future)
      • Better metadata formats
      • Image and Video search
      • Derivatives search
      • Content commerce search
      • “Live” web search
      • “Management” (desktop, workgroup)
      • Semantic mashups
    31. Future CC metadata formats
      • “Semantic XHTML” AKA “lowercase semantic web” AKA “microformats” (now)
      • <a rel="license" href="http://creativecommons.org/licenses/by/2.5/">
      • RDF/A AKA XHTML2 metadata (in working group)
      • GRDDL (gleaning resource descriptions from dialects of languages)
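As a sketch of why the `rel="license"` microformat is easy for crawlers to consume, the following uses Python's standard `html.parser` to list the license URLs a page declares. The class name `RelLicenseParser`, the helper `find_licenses`, and the sample markup are hypothetical; accepting `<link>` elements as well as `<a>` is an assumption of this example.

```python
from html.parser import HTMLParser

# Hypothetical page fragment with one license link and one unrelated link.
PAGE = """<p>
  This work is licensed under a
  <a rel="license" href="http://creativecommons.org/licenses/by/2.5/">
  Creative Commons Attribution 2.5 License</a>.
  See also <a rel="nofollow" href="http://example.com/">this page</a>.
</p>"""


class RelLicenseParser(HTMLParser):
    """Collect href values of <a> and <link> elements with rel="license"."""

    def __init__(self):
        super().__init__()
        self.licenses = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        # rel is a space-separated token list, so split before matching.
        rels = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "license" in rels and href:
            self.licenses.append(href.strip())


def find_licenses(html):
    """Return the license URLs declared via rel="license" in the HTML."""
    parser = RelLicenseParser()
    parser.feed(html)
    parser.close()
    return parser.licenses


if __name__ == "__main__":
    print(find_licenses(PAGE))
```

Compared with RDF in comments, this carries less information (only the license URL), but the link is visible to humans and needs no separate metadata block.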
    32.  
    33.  
    34.  
    35.  
    36. Image and Video search
      • Better metadata formats
      • Image and Video search
      • Derivatives search
      • Content commerce search
      • “Live” web search
      • “Management” (desktop, workgroup)
      • Semantic mashups
    37. Searching for Derivative Works
    38. Creative Commons (0)
    39. Creative Commons (0)
    40. Creative Commons (0)
    41. Creative Commons (0)
    42. Derivatives search
      • RDF/XML snippet:
        • <dc:source rdf:resource="http://ccmixter.org/media/files/victor/3385"/>
      • Query like Yahoo! link: search or Technorati Cosmos search
        • source:http://ccmixter.org/media/files/victor/3385
      • “Who sampled this” as the new “who linked to this”
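One way to picture a `source:` query is as a lookup in a reverse index built from `dc:source` statements, much as a link: search looks up inbound links. The sketch below assumes the crawler has already extracted (work, source) pairs; the function names and the `example.org` remix URLs are hypothetical.

```python
from collections import defaultdict

# Hypothetical crawl output: each work URL with the sources it declares
# through dc:source. Only the ccMixter URL comes from the slide.
CRAWLED = {
    "http://example.org/remix-a": ["http://ccmixter.org/media/files/victor/3385"],
    "http://example.org/remix-b": [
        "http://ccmixter.org/media/files/victor/3385",
        "http://example.org/remix-a",
    ],
    "http://example.org/original": [],
}


def build_source_index(works):
    """Invert work -> sources into source -> set of derivative works."""
    index = defaultdict(set)
    for work_url, sources in works.items():
        for source_url in sources:
            index[source_url].add(work_url)
    return index


def who_sampled(index, source_url):
    """Answer a 'source:<url>' query with a sorted list of derivatives."""
    return sorted(index.get(source_url, ()))


if __name__ == "__main__":
    idx = build_source_index(CRAWLED)
    print(who_sampled(idx, "http://ccmixter.org/media/files/victor/3385"))
```

This returns only direct derivatives; following the index repeatedly would give the full remix tree for a work.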
    43. Content commerce search
      • Transaction costs should be low even if rights are reserved
      • Commercial terms and other commerce described by metadata associated with a work
      • Find me work I can use at a price I can pay for usage rights warranty/paper trail (even if rights not reserved)
      • Reintermediate consumer and creator
    44. “Live” web search (feeds)
      • Feeds are explicitly metadata-rich (unlike typical web page)
      • Existing blog search ignores metadata
      • Web search will become more like blog search, vice versa?
    45. “Management” (desktop, workgroup)
      • Desktop search (OS-level)
      • Content creation and media player integration
      • XMP
      • Semantic Wikis
    46. Semantic mashups
    47. Issues for Semantic Search on the Public Web
      • Metadata quality
      • Trust
      • Scalability
      • Usability
      • Compatibility
      • Critical mass
      • State of the art IR works very well – high expectations!
        • Semantic Search on the Public Web with Creative Commons
        • 2006.03.07
        • Mike Linksvayer
        • Questions, feedback, flames:
        • [email_address]
        • http://developer.creativecommons.org


    CC Attribution License
