Querying the Web

A discussion of the various ways that data on the web can be published and queried, and why SQL is not the right tool for this.

Published in: Business, Technology
Transcript

  • 1. Querying the Web SlipstreamUSA :: April 2, 2008
  • 2. Querying the Web
      • “Information wants to be free”
        • Stewart Brand, Whole Earth Catalogue
        • May 1985
      • “Data is the Next Intel Inside”
        • Tim O’Reilly
        • September 2005
      • “The internet is my hard drive”
        • Bruce Schneier
        • February 2008
  • 3. Freebase
  • 4. Freebase
  • 5. Freebase
  • 6. Freebase
  • 7. Freebase
    • Metaweb Query Language
    • Request:
      • { "type" : "/medicine/physician",
      • "name" : "Michael Maher" }
    • Response:
      • { "code": "/api/status/ok", "result": { "type": "/medicine/physician", "name": "Michael Maher", "gender": "Male",
      • "education": "Leeds University" }
      • }
    • JSON
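
The request and response on this slide can be sketched in Python as plain JSON handling. This is a minimal illustration, not the Metaweb client API: the function names `build_query` and `extract_result` are hypothetical, no network call is made, and the only envelope fields assumed are the `code` and `result` keys shown on the slide.

```python
import json


def build_query(type_id, name):
    """Serialise a query in the style shown on the slide: a JSON object
    whose known properties act as constraints."""
    return json.dumps({"type": type_id, "name": name})


def extract_result(response_text):
    """Return the "result" member of a response envelope, or raise if the
    status code is not the "/api/status/ok" value shown on the slide."""
    response = json.loads(response_text)
    if response.get("code") != "/api/status/ok":
        raise ValueError(f"query failed: {response.get('code')!r}")
    return response["result"]


# Usage with the slide's own data (no server involved).
query = build_query("/medicine/physician", "Michael Maher")
sample_response = json.dumps({
    "code": "/api/status/ok",
    "result": {
        "type": "/medicine/physician",
        "name": "Michael Maher",
        "gender": "Male",
        "education": "Leeds University",
    },
})
physician = extract_result(sample_response)
```

The query is itself a JSON object shaped like the answer, which is why the response simply echoes the constraints alongside the extra properties.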
  • 8. REST
      • REpresentational State Transfer
      • Less rigorous equivalent of SOAP
      • Data are considered to be resources
      • Every resource has a unique address
      • Layered over HTTP:
        • Client/Server separation
        • Stateless
        • Cacheable
      • Request:
        • GET http://rest.georgejames.com/product/Serenji/
      • Response:
        • Name=Serenji
        • Price=195.00
        • OrderCode=H1001
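
To make the "every resource has a unique address" idea concrete, here is a small Python sketch that builds a resource URL like the one above and parses the `Name=Value` response body into a dictionary. The helper names `resource_url` and `parse_key_value_body` are hypothetical, and the sketch assumes the response is exactly one `key=value` pair per line, as on the slide. It does not contact the server.

```python
from urllib.parse import quote


def resource_url(base, *segments):
    """Join path segments onto a base address, ending with a slash as in
    the slide's example URL."""
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"{base.rstrip('/')}/{path}/"


def parse_key_value_body(body):
    """Parse a body made of one key=value pair per line into a dict."""
    fields = {}
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        name, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"not a key=value line: {line!r}")
        fields[name] = value
    return fields


url = resource_url("http://rest.georgejames.com", "product", "Serenji")
product = parse_key_value_body("Name=Serenji\nPrice=195.00\nOrderCode=H1001")
```

Values are left as strings here; converting `Price` to a number is a decision for the client.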
  • 9. Amazon S3
      • S3 :: Simple Storage Service
      • Online storage space
      • $0.15 per Gbyte per month for storage
      • ~ $0.20 per Gbyte data transfer
      • Storage request:
        • PUT http://s3.amazonaws.com/[bucket-name]/[key-name]
      • Retrieval request:
        • GET http://s3.amazonaws.com/[bucket-name]/[key-name]
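
The storage and retrieval requests above differ only in the HTTP method. A minimal Python sketch, assuming the path-style URLs shown on the slide: it builds `urllib.request.Request` objects without sending them. The helper names are hypothetical, and the authentication headers that a real S3 request needs are deliberately left out.

```python
from urllib.parse import quote
from urllib.request import Request

S3_ENDPOINT = "http://s3.amazonaws.com"  # endpoint as written on the slide


def object_url(bucket, key):
    """Address of one stored object: /[bucket-name]/[key-name]."""
    return f"{S3_ENDPOINT}/{quote(bucket)}/{quote(key)}"


def put_request(bucket, key, body):
    """Build (but do not send) the storage request. Unsigned."""
    return Request(object_url(bucket, key), data=body, method="PUT")


def get_request(bucket, key):
    """Build (but do not send) the retrieval request. Unsigned."""
    return Request(object_url(bucket, key), method="GET")


store = put_request("my-bucket", "notes/a b.txt", b"hello")
fetch = get_request("my-bucket", "notes/a b.txt")
```

Both requests address the same URL, which is the REST principle from the previous slide applied to storage.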
  • 10. Amazon SimpleDB
      • Storage request:
      • https://sdb.amazonaws.com/
        ?Action=PutAttributes
        &Attribute.0.Name=Color&Attribute.0.Value=Blue
        &Attribute.1.Name=Size&Attribute.1.Value=Med
        &Attribute.2.Name=Price&Attribute.2.Value=14.99
        &AWSAccessKeyId=[valid access key id]
        &DomainName=MyDomain
        &ItemName=Item123
      • Retrieval request:
      • https://sdb.amazonaws.com/
        ?Action=GetAttributes
        &AWSAccessKeyId=[valid access key id]
        &DomainName=MyDomain
        &ItemName=Item123
      • Retrieval response:
      • <GetAttributesResult>
        <Attribute><Name>Color</Name><Value>Blue</Value></Attribute>
        <Attribute><Name>Size</Name><Value>Med</Value></Attribute>
        <Attribute><Name>Price</Name><Value>14.99</Value></Attribute>
        </GetAttributesResult>
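
The SimpleDB exchange above is a query string going out and a small XML document coming back. The following Python sketch, under stated assumptions, builds the storage URL from a list of attributes and turns the retrieval response into a dictionary. The function names are hypothetical, request signing is omitted, and the XML is assumed to be un-namespaced exactly as printed on the slide.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SDB_ENDPOINT = "https://sdb.amazonaws.com/"


def put_attributes_url(access_key, domain, item, attributes):
    """Build the PutAttributes URL; attributes is a list of (name, value)."""
    params = [("Action", "PutAttributes")]
    for index, (name, value) in enumerate(attributes):
        params.append((f"Attribute.{index}.Name", name))
        params.append((f"Attribute.{index}.Value", value))
    params += [
        ("AWSAccessKeyId", access_key),
        ("DomainName", domain),
        ("ItemName", item),
    ]
    return SDB_ENDPOINT + "?" + urlencode(params)


def parse_attributes(xml_text):
    """Collect each <Attribute>'s <Name>/<Value> pair into a dict."""
    root = ET.fromstring(xml_text)
    return {
        attribute.findtext("Name"): attribute.findtext("Value")
        for attribute in root.iter("Attribute")
    }


# "KEY" is a stand-in for a real access key id.
url = put_attributes_url(
    "KEY", "MyDomain", "Item123",
    [("Color", "Blue"), ("Size", "Med"), ("Price", "14.99")],
)
item = parse_attributes(
    "<GetAttributesResult>"
    "<Attribute><Name>Color</Name><Value>Blue</Value></Attribute>"
    "<Attribute><Name>Size</Name><Value>Med</Value></Attribute>"
    "<Attribute><Name>Price</Name><Value>14.99</Value></Attribute>"
    "</GetAttributesResult>"
)
```

Every value comes back as text, an instance of the weak data typing these services rely on.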
  • 11. Astoria
  • 12. Astoria in action
    • Request:
      • http://astoria.sandbox.live.com/northwind/northwind.rse/Categories
    • Response:
  • 13. Astoria in action
    • Request:
      • http://astoria.sandbox.live.com/northwind/northwind.rse/Customers
    • Response:
  • 14. Astoria in action
    • Request:
      • /Customers[FRANK]
    • Response:
  • 15. Astoria in action
    • Request:
      • /Customers[FRANK]/Orders
    • Response:
  • 16. Astoria in action
    • A variety of response formats:
      • POX
      • Web3S (Web, Structured, Schema’d and Searchable)
      • ATOM
      • JSON
    • JSON request:
      • /Customers[FRANK]?$format=json
    • Response:
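
The Astoria requests on slides 12 to 16 follow one addressing pattern: an entity set, an optional key in square brackets, an optional related set, and an optional `$format` parameter. A minimal Python sketch of that pattern, using only the syntax shown on the slides; `resource_path` is a hypothetical helper, not part of any Astoria client library, and nothing is sent over the network.

```python
# Service root as shown on slides 12 and 13.
SERVICE_ROOT = "http://astoria.sandbox.live.com/northwind/northwind.rse"


def resource_path(entity_set, key=None, navigation=None, response_format=None):
    """Compose a path such as /Customers[FRANK]/Orders?$format=json."""
    path = f"/{entity_set}"
    if key is not None:
        path += f"[{key}]"          # select one entity by key
    if navigation is not None:
        path += f"/{navigation}"    # follow a relationship
    if response_format is not None:
        path += f"?$format={response_format}"
    return path


all_customers = SERVICE_ROOT + resource_path("Customers")
franks_orders = resource_path("Customers", key="FRANK", navigation="Orders")
frank_as_json = resource_path("Customers", key="FRANK", response_format="json")
```

Each step narrows the previous URL, so a client can navigate the data by editing the address alone.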
  • 17.
      • Where is all this information going to come from?
  • 18. Crowdsourcing
      • Jeff Howe, Wired Magazine, June 2006
      • Delegating an activity to a large number of unidentified individuals
      • Small finite tasks
      • Quantity more important than quality
      • The sum is greater than the parts
      • Examples:
        • Wikipedia
  • 19. Crowdsourcing
  • 20. Crowdsourcing
  • 21. Google Maps
  • 22. Google Maps
  • 23. Crowdsourcing
      • Jeff Howe, Wired Magazine, June 2006
      • Delegating an activity to a large number of unidentified individuals
      • Small finite tasks
      • Quantity more important than quality
      • The sum is greater than the parts
      • Examples:
        • Wikipedia
        • Galaxy Zoo
        • Amazon Mechanical Turk
        • Google route planner
      • Consequences:
        • Drives down the cost of data
        • Owners may not be the traditional incumbents
        • Client / user needs to discriminate
  • 24. What does this mean for you?
      • Data Provider
        • Publish data via simple APIs
        • Your data may have unexpected value
        • Innovative usage
        • Usage can enhance the quality of your data
      • Data Consumer
        • Many potential data sources
        • Explosive growth in available data
        • Quality of the data is potentially lower
        • … but is outweighed by quantity and richness
      • Technical
        • Caché database is an ideal container
        • Dynamic / extensible data structure
        • Weak data typing
        • High performance and scalability
  • 25.
      • The Internet is the Database
  • 26.
      • Thank you
      • Questions?
