Linked Data Marketplaces
Linked Data Marketplaces Presentation Transcript

  • 1. (Linked) Data Marketplaces Marin Dimitrov (Ontotext) v0.3 / Jan 2011
  • 2. Contents
      • Introduction
      • Data Marketplaces
          – Factual, InfoChimps, Azure DataMarket, Freebase, Socrata
          – DataMarket, Timetric, xIgnite
      • Linked Data for Data Marketplaces
    (Linked) Data Marketplaces Jan 2011 #2
  • 3. INTRODUCTION
  • 4. Definitions
      • Data-as-a-Service (DaaS)
          – “Like all members of the "as a Service" (XaaS) family, DaaS is based on the concept that the product, data in this case, can be provided on demand to the user regardless of geographic or organizational separation of provider and consumer. Additionally, the emergence of service-oriented architecture (SOA) has rendered the actual platform on which the data resides also irrelevant” (Wikipedia)
      • Data Marketplaces
          – “Services that make it easy to find data from a range of secondary data sources, then consume the data in a usable and unified format. Several of these services are trying to create marketplaces for data, envisioning that data providers can offer their data sets for sale to data seekers” (DataMarket.com)
  • 5. Data Marketplaces properties
      • Proposed classification by Bauereiss & Fensel
          1. Data domain
          2. Population of content
          3. Community management
          4. Operating party
          5. Pricing models
          6. Data exchange
      • Some additional differentiating characteristics
          – Data model, Data size, Data export
          – Branded marketplaces, SLA
          – Query languages, Data tools
  • 6. DATA MARKETPLACES
  • 7. Factual
      • www.factual.com / @factual
  • 8. Factual (2)
      • Data domain
          – Travel, finance, sports, autos, movies, music, TV, books, health, food, politics, education, science, arts, …
          – High quality local data
              • USA, Germany, France, Italy, UK, Japan, Switzerland, Australia, …
              • Used by Facebook Places
      • Data population
          – Crawling the web
          – Public data sources
          – Community contributions
              • Upload XLS/ODS, CSV
  • 9. Factual (3)
      • Data model – tabular
          – Taxonomy of 400 categories
              • 13 Level 1 categories: Arts, Automotive, Business, Government, …
      • Data size
          – 500,000 datasets
      • Company info
          – Factual Inc. (USA)
          – $27M VC funding so far
  • 10. Factual (4)
      • Monetization model
          – Pricing model not finalised yet (currently free)
          – Pay-per-use pricing (per API call) with subscriptions
              • Companies that contribute data will have a fee reduction
      • Data access options
          – REST API
              • Read from table, Add/Write to table, Get schema info
          – Web applications
              • Read/write raw data from a web page (JavaScript)
              • Web widgets for visualising, filtering and sorting data
  • 11. Factual (5)
      • Data tools
          – AutoClipper – find tables on the web
          – PageClipper – extract tabular data from a web page
          – FactClipper – find individual facts (query templates)
  • 12. InfoChimps
      • www.infochimps.com / @infochimps
  • 13. InfoChimps (2)
      • Data domain
          – All purpose
              • Including data from Freebase, Wikipedia infoboxes, CKAN, Twitter, Data.gov, Data.gov.uk, GeoNames, …
      • Data population
          – Public datasets
          – User submitted datasets
      • Data model is dataset specific
      • 10,000+ datasets organised in 13 collections
  • 14. InfoChimps (3)
      • Company info
          – InfoChimps (USA)
          – $1.6M VC funding so far
          – Acquired DataMarketplace in 12/2010
      • Monetization model
          – Charge data sellers
              • Data sellers choose the price & licensing of their data
              • Charge for data storage
              • 30% commission for InfoChimps on each sale
  • 15. InfoChimps (4)
      • Monetization model (2)
          – Charge data buyers
              • Baboon – free, 100K API calls / mo
              • Brass Monkey – $20/mo, 500K API calls / mo
              • Silverback – $250/mo, 2M API calls / mo
              • Golden Ape – $4,000/mo, 15M API calls / mo
      • Data access options
          – REST API
              • api.infochimps.com/DATASET/METHOD.json?PARAM=VALUE
          – YQL tables
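A minimal sketch of the request template and buyer plans from slide 15, in Python. The `http://` scheme, the dataset name `some_dataset`, the method `search` and the parameter `q` are assumptions made for illustration; the slide gives only the URL template and the plan figures.

```python
from urllib.parse import urlencode

# Plan figures copied from slide 15: (name, USD per month, API calls per month).
PLANS = [
    ("Baboon", 0, 100_000),
    ("Brass Monkey", 20, 500_000),
    ("Silverback", 250, 2_000_000),
    ("Golden Ape", 4_000, 15_000_000),
]


def build_url(dataset, method, **params):
    """Fill in the DATASET/METHOD.json?PARAM=VALUE template.

    The http:// scheme is an assumption; the slide omits it.
    """
    url = f"http://api.infochimps.com/{dataset}/{method}.json"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


def cheapest_plan(calls_per_month):
    """Return (name, price) of the first plan whose quota covers the volume."""
    for name, price, quota in PLANS:  # PLANS is sorted by price
        if calls_per_month <= quota:
            return name, price
    return None  # volume exceeds every listed plan


# Hypothetical dataset, method and parameter names.
example_url = build_url("some_dataset", "search", q="ontotext")
example_plan = cheapest_plan(300_000)
```

With these figures, a buyer making 300K calls a month would land on the Brass Monkey plan, since the free Baboon plan stops at 100K.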
  • 16. Azure DataMarket
      • https://datamarket.azure.com
  • 17. Azure DataMarket (2)
      • Data domain
          – All purpose, incl. Data.gov, UN data, Wolfram|Alpha, ESRI
      • Data population
          – Data publishers (need prior approval)
              • Data can be stored on SQL Azure, Azure Storage or 3rd party clouds (via Data Access Layers)
      • Data model
          – Depends on the dataset and the storage, but always presented as OData to consumers
      • Data size
          – 90 datasets
  • 18. Azure DataMarket (3) (c) Microsoft
  • 19. Azure DataMarket (4)
      • Company info
          – Microsoft
      • Monetization model
          – Subscription for data buyers (limited/unlimited API calls)
      • Access options
          – OData (feeds, queries, updates)
      • Data tools
          – Service Explorer
          – Excel add-in (find, purchase, consume data)
          – Integration with SQL Server Reporting Services / Integration Services
  • 20. DataMarket
      • www.datamarket.com / @datamarket
  • 21. DataMarket (2)
      • Data domain
          – Statistical data from 2,000 providers, incl. UN, Eurostat, World Bank, US agencies, BP, FIFA, …
      • Data population
          – Data aggregation (2,000 data providers)
      • Data size
          – 13K datasets, 100M time series, 600M facts
      • Company info
          – DataMarket (Iceland)
  • 22. DataMarket (3)
      • Monetization model
          – Charge data sellers
              • Free datasets – $249/mo
              • Paid datasets – 25% commission
              • Branded datasets – $699/mo + commission
          – Charge data buyers
              • Free – 50 API calls/mo
              • $99 – 500 API calls/mo
              • $299 – 10K API calls/mo
              • $799 – 100K API calls/mo
      • Data access
          – REST API
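A rough worked example of the figures on slide 22, in Python. It assumes the buyer prices are charged per month and that the 25% commission is deducted from the sale price; the slide states neither explicitly. The function names are made up for this sketch.

```python
# Buyer tiers from slide 22: (label, USD price, API calls per month).
BUYER_TIERS = [
    ("Free", 0, 50),
    ("$99", 99, 500),
    ("$299", 299, 10_000),
    ("$799", 799, 100_000),
]


def usd_per_1000_calls(price_usd, calls):
    """Effective price of 1,000 API calls when the whole quota is used."""
    return price_usd * 1000 / calls


def seller_payout_cents(sale_cents, commission_pct=25):
    """Amount left for the seller after the marketplace commission.

    Works in integer cents to avoid floating-point rounding.
    """
    return sale_cents * (100 - commission_pct) // 100


# Effective price per 1,000 calls for each paid tier.
effective = {
    label: usd_per_1000_calls(price, calls)
    for label, price, calls in BUYER_TIERS
    if price > 0
}
```

Under these assumptions the per-call price falls steeply with volume: roughly $198 per 1,000 calls on the $99 tier against about $8 per 1,000 calls on the $799 tier.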
  • 23. Socrata
      • www.socrata.com / @socrata
  • 24. Socrata (2)
      • Data domain
          – Business, education, government data
      • Data population
          – Uploads from data publishers
      • Data size
          – 13K datasets
      • Data model – tabular
  • 25. Socrata (3)
      • Company info
          – Socrata (USA)
      • Monetization model
          – Charge data buyers (“Plans starting at $499 per month”)
              • Basic – 100K API calls/mo + 50GB traffic
              • Plus – 250K API calls/mo + 250GB traffic
              • Premium – 1M API calls/mo + 1.2TB traffic
              • Ultimate – 10M API calls/mo + 5TB traffic
      • Data access
          – REST API (Socrata Open Data API)
          – Data export (XLS, CSV, RDF, XML)
          – RSS updates
  • 26. Freebase
      • www.freebase.com / @fbase
  • 27. Freebase (2)
      • Data domain
          – General purpose
      • Data model
          – Graph (RDF dumps available)
      • Data population
          – Community curated data (licensed as CC-BY)
          – Import of public data sources (Wikipedia, MusicBrainz, WordNet, LoC, …)
      • Data size
          – 20M entities
  • 28. Freebase (3)
      • Company info
          – Metaweb (USA), now Google
      • Monetization model
          – Free for 100K read API calls per day (10K write)
          – Paid for higher volumes
      • Data access
          – REST API
          – Linked Data endpoint (http://rdf.freebase.com)
          – Triple uploader / RDF dumps
          – Acre (application hosting platform)
  • 29. Freebase (4)
      • Data tools
          – Web based – schema editor, review queue, viewers, …
          – GridWorks (Google Refine)
              • Exploring, data cleaning, transformation of tabular data
              • Map data to Freebase schema & RDF export (3rd party extension)
          – Acre
              • Application hosting platform – User contributed JavaScript code (converted to Java with Rhino)
              • Access & store data directly into Freebase
  • 30. timetric
      • www.timetric.com / @timetric
  • 31. timetric (2)
      • Data domain
          – Economic data
      • Data population
          – Aggregate data from the world's leading sources of economic data (World Bank, Eurostat, …)
          – User uploaded data
      • Data size
          – 2.5M public statistics
  • 32. timetric (3)
      • Company info
          – Timetric Ltd. (UK)
      • Monetization model
          – Free public datasets
          – Paid exclusive datasets
      • Data access
          – REST API
  • 33. xIgnite
      • www.xignite.com
  • 34. xIgnite (2)
      • Data domain
          – Financial data
      • Data population
          – Aggregate data from leading sources (Dow Jones, Thomson Reuters, stock exchanges, …)
          – Public datasets (national banks, SEC, Federal Reserve, …)
          – User uploaded data
      • Company info
          – Xignite (USA)
  • 35. xIgnite (3)
      • Monetization model
          – Paid subscriptions
      • Data access
          – Web services (REST/SOAP)
  • 36. Coming soon…
      • Kasabi
          – www.kasabi.com / @kasabidata @teamkasabi
          – Company: Talis
      • BuzzData
          – www.buzzdata.com / @buzzdata
          – Company: BuzzData
  • 37. Data marketplaces – features summary
      • Data
          – Data model, domain, export options
      • Monetization
          – Charge buyers / sellers
          – Free API calls
          – Branded marketplaces & Service Level Agreement
      • For developers
          – REST API; query language
          – Tools for data management / integration
          – Application hosting
  • 38. Feature matrix

      Feature                  Factual  InfoChimps  Azure DataMarket  DataMarket  Socrata  Freebase  timetric  xIgnite
      Data from all domains    +        +           +                 -           +        +         -         -
      Data model               tabular  various     various           ?           tabular  graph     ?         ?
      Data export              -        -           +                 -           +        +         -         -
      RDF export               -        -           -                 -           +        +         -         -
      Charge buyers            +        +/-         +                 +/-         +        +/-       +/-       +
      Charge sellers           ?        +           -                 +           -        -         ?         ?
      Free API calls (month)   ?        100K        ?                 50          -        3M        ?         -
      Branded marketplaces     -        -           +                 +           +        -         -         -
      Service Level guarantee  ?        -           -                 -           -        -         -         -
      REST API                 +        +           +                 +           +        +         +         +
      Query language           +        -           +                 -           -        +         -         -
      Tools                    +        -           +                 -           -        +         -         -
      App hosting              -        -           +                 -           -        +         -         -
  • 39. LINKED DATA + MARKETPLACES
  • 40. Linked Data cloud (Sep 2010) (c) R. Cyganiak and A. Jentzsch
  • 41. Benefits of Linked Data for Data Marketplaces
      • Unified data representation model (RDF)
          – Easy consumption of the data
      • Global identifiers for all objects (URI)
          – Makes incremental data integration & federation easier
      • Interlinked datasets
          – New data added to the marketplace can be integrated/mapped to existing data
      • Data marketplace interoperability
          – Data from different marketplaces can be easily integrated & consumed
  • 42. Benefits of Linked Data for Data Marketplaces (2)
      • Derived knowledge / facts
          – RDF inference of additional implicit facts
          – (see FactForge and LinkedLifeData)
      • Rich queries
          – SPARQL offers unmatched query expressivity
      • Easy import of existing LOD datasets
          – Linked Open Data cloud already includes 200+ datasets with 20+ billion RDF triples
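The two benefits above (integration through shared identifiers, and inference of implicit facts) can be illustrated with a toy sketch in Python. It models triples as plain tuples and applies only two RDFS rules (subclass transitivity and type propagation); the `ex:` identifiers and both datasets are hypothetical, and a real marketplace would use full URIs and an RDF store rather than Python sets.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# Two hypothetical datasets that share the identifier ex:Capital.
dataset_a = {("ex:Sofia", RDF_TYPE, "ex:Capital")}
dataset_b = {
    ("ex:Capital", SUBCLASS, "ex:City"),
    ("ex:City", SUBCLASS, "ex:Place"),
}


def infer(triples):
    """Forward-chain two RDFS rules until no new triples appear."""
    closed = set(triples)
    while True:
        subclass_pairs = {(s, o) for s, p, o in closed if p == SUBCLASS}
        new = set()
        # Subclass transitivity: A < B and B < C gives A < C.
        for a, b in subclass_pairs:
            for c, d in subclass_pairs:
                if b == c:
                    new.add((a, SUBCLASS, d))
        # Type propagation: x is an A and A < B gives x is a B.
        for s, p, o in closed:
            if p == RDF_TYPE:
                for a, b in subclass_pairs:
                    if o == a:
                        new.add((s, RDF_TYPE, b))
        if new <= closed:
            return closed
        closed |= new


# Integration is a set union because both datasets use the same identifiers.
merged = dataset_a | dataset_b
inferred = infer(merged) - merged
```

Neither dataset alone says that `ex:Sofia` is an `ex:Place`; the fact only appears after the merge, which is the kind of derived knowledge the slide refers to.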
  • 43. Linked Data for marketplaces – challenges
      • Quality of data
          – Different (public) datasets may come with inconsistent or controversial data
      • Large scale data integration
          – Ontology (schema) mapping of different datasets & vocabularies
      • Licensing
          – Some datasets come with “CC-BY-NC” or unclear licensing
      • Billing
          – API calls / SPARQL queries with varying granularity
      • …
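To make the billing challenge concrete, here is a small Python sketch comparing two ways of charging for the same usage: a flat price per call and a price per returned row. The rates, the query log and the function names are all invented for illustration; the slides do not propose a specific scheme.

```python
def bill_per_call(queries, cents_per_call):
    """Flat pricing: every request costs the same, whatever it returns."""
    return len(queries) * cents_per_call


def bill_per_row(queries, cents_per_1000_rows):
    """Volume pricing: cost follows the number of rows returned."""
    total_rows = sum(q["rows"] for q in queries)
    return total_rows * cents_per_1000_rows // 1000  # integer cents


# Hypothetical log: one simple lookup and one large SPARQL query.
query_log = [
    {"kind": "lookup", "rows": 1_000},
    {"kind": "sparql", "rows": 249_000},
]

flat_cents = bill_per_call(query_log, cents_per_call=2)
volume_cents = bill_per_row(query_log, cents_per_1000_rows=10)
```

With these made-up rates the flat scheme bills 4 cents and the volume scheme 2,500 cents for the same two requests, which shows why a single SPARQL query is hard to price as "one API call".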
  • 44. LinkedLifeData & FactForge [figure labels: FactForge, LinkedLifeData] (c) R. Cyganiak and A. Jentzsch
  • 45. LinkedLifeData & FactForge
      • FactForge
          – Integrates some of the most central LOD datasets
          – General-purpose information (not specific to a domain)
          – 1.2 billion explicit and 1 billion inferred statements
          – The largest upper-level knowledge base
          – http://www.FactForge.net
      • Linked Life Data
          – 25 of the most popular life-science datasets
          – 2.7 billion explicit and 1.4 billion inferred statements
          – http://www.LinkedLifeData.com
  • 46. (Linked) Data Marketplaces
      • Strategic questions
          – Which (linked) datasets can be monetized
          – Monetization strategy
              • Charge buyers / charge sellers / free quota
              • Branded marketplaces / virtual RDF data warehouses
          – Community building
              • Crowdsource the data curation to the community
              • How to provide incentives to data curators?
          – Tools
              • Scalable RDF databases
              • Linked Data curation, discovery, exploration & integration
  • 47. Data monetization with WebServius (c) WebServius
      • Benefits
          – User management, quotas & restrictions
          – Metering, pricing, billing
          – Security, scalability, SLAs
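The quota and metering items above can be pictured with a generic Python sketch. This is not WebServius code or its API; the class, exception and key names are hypothetical and only show the idea of counting calls per API key against a quota.

```python
class QuotaExceeded(Exception):
    """Raised when an API key has used up its quota."""


class Meter:
    """Counts calls per API key and enforces a fixed quota."""

    def __init__(self, quota):
        self.quota = quota
        self.used = {}  # api_key -> number of calls recorded so far

    def record(self, api_key):
        """Record one call and return how many calls remain."""
        used = self.used.get(api_key, 0)
        if used >= self.quota:
            raise QuotaExceeded(api_key)
        self.used[api_key] = used + 1
        return self.quota - self.used[api_key]


# Hypothetical key with a quota of two calls.
meter = Meter(quota=2)
remaining_after_first = meter.record("demo-key")
remaining_after_second = meter.record("demo-key")
try:
    meter.record("demo-key")
    third_call_allowed = True
except QuotaExceeded:
    third_call_allowed = False
```

The per-key counts kept in `used` are also what a billing step would read at the end of a period.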
  • 48. Q&A: Questions? @ontotext