SlideShare a Scribd company logo
1 of 30
Download to read offline
Freebase
A socially managed semantic database



Jamie Taylor
SemTech 2010 Data Camp
Freebase has Many Types of Things
12 Million Topics
A Multiplicity Strong Identifiers

            http://rdf.freebase.com/ns/en.berlin_wall




            http://www.ellerdale.com/topics/view/0080-6ba0




            http://www.bbc.co.uk/music/artists/7f347782-eb14-40c3-98e2-17b6e1bfe56c

                   http://musicbrainz.org/artist/7f347782-eb14-40c3-98e2-17b6e1bfe56c

http://rdf.freebase.com/ns/authority.musicbrainz.7f347782-eb14-40c3-98e2-17b6e1bfe56c
Relations
contains
                          400 Million
           contained-by

                                  event               label
                                          albums

                            member-of
                                          member-of

           nationality

                           education
                                          education

                          contained-by
What’s in Freebase?
http://www.bestbuy.com/site/She+Wolf…

              http://www.daylife.com/topic/Shakira

                         http://twitter.com/shakira

                  http://www.facebook.com/shakira

                  http://www.myspace.com/shakira

                  http://www.last.fm/music/Shakira

http://www.netflix.com/RoleDisplay/Shakira/20046629

          http://www.guardian.co.uk/music/shakira
99% pure

All data undergoes rigorous QA before load
Major focus is reconciliation
Use sampling to assure 99% accuracy
Data that does not meet 99% accuracy is not loaded
What's been built on Freebase?
Up to 100,000 Queries a Day




 Quarterly dumps of graph
    http://download.freebase.com
Users contribute data




Users extend the data model
The Freebase Commons
                      Top-level domains
                      ·American football       ·Internet
                      ·Anime/Manga             ·Language
                      ·Architecture            ·Law
                      ·Astronomy               ·Library
                      ·Automotive              ·Location
                      ·Aviation                ·Martial Arts
                      ·Awards                  ·Measurement Unit
                      ·Baseball                ·Media Common
                      ·Basketball              ·Medicine
                      ·Bicycles                ·Metaweb Types
                      ·Biology                 ·Meteorology
                      ·Boats                   ·Military
                      ·Broadcast               ·Music
                      ·Business                ·Olympics
                      ·Celebrities             ·Opera
                      ·Chemistry               ·Organization
                      ·Comics                  ·People
                      ·Common                  ·Geography
                      ·Computers               ·Projects
                      ·Conferences             ·Protected Places
                      ·Cricket                 ·Publishing
                      ·Data World              ·Radio
                      ·Digicams                ·Rail
                      ·Education               ·Religion
                      ·Engineering             ·Royalty
                      ·Event                   ·Soccer
                      ·Clothing and Textiles   ·Spaceflight
                      ·Fictional Universes     ·Sports
                      ·Film                    ·Symbols
                      ·Food & Drink            ·Tennis
                      ·Freebase                ·Theater
                      ·Games                   ·Time
                      ·Geology                 ·Transportation




schema = vocabulary
                      ·Government              ·Travel
                      ·Hobbies and Interests   ·TV
                      ·Ice Hockey              ·Video Games
                      ·Influence               ·Visual Art
The Scope of Schema
   10,448 Properties
      describing
     4,936 Types*
     organized into
     641 Domains
     (77 Commons)
            *types with 10 or more instances
Strength through Exemplars
                                                   Type Instances


            100,000,000


             10,000,000



                                                              >10 instances,
              1,000,000


               100,000
                                                              4936 types
Instances




                10,000


                  1,000
                                                              1424 Commons
                   100


                    10


                     1
                          0   1000   2000   3000   4000   5000    6000   7000   8000   9000   10000 11000
                                                                 Rank
Metaweb Query Language
      [{
           "name" : null,
           "type" : "/film/film"
      }]




               MQL
[{
     "name" : null,
     "type" : "/film/film",
     "directed_by":{"id":"/en/george_lucas"},
     "starring":[{
            "actor":{"id":"/en/harrison_ford"}
         }]
}]




                      MQL
[{
      "name" : null,
      "type" : "/film/film",
      "directed_by":{"id":"/en/george_lucas"},
      "starring": [{
          "actor": {
             "name": null,
             "film": [{
                 "film": {"id": "/en/the_great_escape"}
             }]
          }
     }]
}]


                     Donald Pleasence
                        THX 1138
Freebase Suggest
Reconciliation
        {
             "/type/object/name":"Blade Runner",
             "/type/object/type":"/film/film",
             "/film/film/starring/actor":["Harrison Ford", "Rutger Hauer"],
             "/film/film/director":"Ridley Scott",
             "/film/film/release_date_s":"1981"
         }
[{
     "id":"/guid/9202a8c04000641f8000000000009e89",
     "name":["Blade Runner", "Bladerunner"],
     "score":1.4320519,
     "match":true,
     "type":["/common/topic", "/film/film","/media_common/adapted_work", "/award/award_winning_work",
     ]},
 {
     "id":"/guid/9202a8c04000641f80000000002643d0",
     "name":["Blade"],
     "score":0.48852453,
     "match":false,
     "type":["/common/topic", "/film/film", "/award/award_winning_work", "/award/award_nominated_work",
     ]}

               http://data.labs.freebase.com/recon/
Topic Blocks
Topic API
         Shortcut to building Topic displays
         Two forms:
             basic (names, types, description)
             standard (basic + keys, properties)




http://www.freebase.com/experimental/topic/standard?id=/en/ncis
Geo Search API



Semantic              Spatial              Semantic




      http://www.freebase.com/docs/geosearch
Gridworks
Acre Development Environment
Getting Started++
•   Freebase Documentation Hub
    •   http://www.freebase.com/docs
•   Developer Mailing List
    •   http://lists.freebase.com/mailman/listinfo/freebase-discuss
    •   http://freebase.markmail.org
•   Real Time help on IRC
    •   Freenode #freebase
•   Freebase Happenings
    •   http://blog.freebase.com
•   About the Graph Store
    •   Google: "ACM SIGMOD schema last tuple store"

More Related Content

Similar to Freebase - Semantic Technologies 2010 Code Camp

Freebase API @ HackTO 2
Freebase API @ HackTO 2Freebase API @ HackTO 2
Freebase API @ HackTO 2narphorium
 
Text Analytic Summit 2010
Text Analytic Summit 2010Text Analytic Summit 2010
Text Analytic Summit 2010Jamie Taylor
 
Real-time Semantic Web with Twitter Annotations
Real-time Semantic Web with Twitter AnnotationsReal-time Semantic Web with Twitter Annotations
Real-time Semantic Web with Twitter AnnotationsJoshua Shinavier
 
ServerSide Javascript on Freebase - SF JavaScript meetup #9
ServerSide Javascript on Freebase - SF JavaScript meetup #9ServerSide Javascript on Freebase - SF JavaScript meetup #9
ServerSide Javascript on Freebase - SF JavaScript meetup #9Will Moffat
 
YQL:: Select * from Internet
YQL:: Select * from InternetYQL:: Select * from Internet
YQL:: Select * from Internetdrgath
 
The NoTube BeanCounter: Aggregating User Data for Television Programme Recomm...
The NoTube BeanCounter: Aggregating User Data for Television Programme Recomm...The NoTube BeanCounter: Aggregating User Data for Television Programme Recomm...
The NoTube BeanCounter: Aggregating User Data for Television Programme Recomm...MODUL Technology GmbH
 
YQL: Select * from Internet
YQL: Select * from InternetYQL: Select * from Internet
YQL: Select * from Internetdrgath
 
Ruby Kaigi July 2009 Tokyo (Japanese)
Ruby Kaigi July 2009 Tokyo (Japanese)Ruby Kaigi July 2009 Tokyo (Japanese)
Ruby Kaigi July 2009 Tokyo (Japanese)Adhearsion Foundation
 
yourHistory - entity linking for a personalized timeline of historic events
yourHistory - entity linking for a personalized timeline of historic eventsyourHistory - entity linking for a personalized timeline of historic events
yourHistory - entity linking for a personalized timeline of historic eventsDavid Graus
 
Iccv2009 recognition and learning object categories p3 c00 - summary and da...
Iccv2009 recognition and learning object categories   p3 c00 - summary and da...Iccv2009 recognition and learning object categories   p3 c00 - summary and da...
Iccv2009 recognition and learning object categories p3 c00 - summary and da...zukun
 
R, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsR, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsKrishna Sankar
 
How Brands Can Survive & Thrive Online - Digital Evolution
How Brands Can Survive & Thrive Online - Digital EvolutionHow Brands Can Survive & Thrive Online - Digital Evolution
How Brands Can Survive & Thrive Online - Digital EvolutionAndrea Vascellari
 
Sounddogsppt
SounddogspptSounddogsppt
Sounddogspptpoopshkin
 
A Training & Simulation Perspective on Maritime Information & Automation
A Training & Simulation Perspective on Maritime Information & AutomationA Training & Simulation Perspective on Maritime Information & Automation
A Training & Simulation Perspective on Maritime Information & AutomationAndy Fawkes
 
Looking at Content Recommendations through a Search Lens - Extended Version
Looking at Content Recommendations through a Search Lens - Extended VersionLooking at Content Recommendations through a Search Lens - Extended Version
Looking at Content Recommendations through a Search Lens - Extended VersionSonya Liberman
 
Evaluating Methods to Rediscover Missing Web Pages from the Web Infrastructure
Evaluating Methods to Rediscover Missing Web Pages from the Web InfrastructureEvaluating Methods to Rediscover Missing Web Pages from the Web Infrastructure
Evaluating Methods to Rediscover Missing Web Pages from the Web InfrastructureMartin Klein
 
COMP 4010 - Lecture 7: Introduction to Augmented Reality
COMP 4010 - Lecture 7: Introduction to Augmented RealityCOMP 4010 - Lecture 7: Introduction to Augmented Reality
COMP 4010 - Lecture 7: Introduction to Augmented RealityMark Billinghurst
 

Similar to Freebase - Semantic Technologies 2010 Code Camp (19)

Freebase API @ HackTO 2
Freebase API @ HackTO 2Freebase API @ HackTO 2
Freebase API @ HackTO 2
 
Text Analytic Summit 2010
Text Analytic Summit 2010Text Analytic Summit 2010
Text Analytic Summit 2010
 
Real-time Semantic Web with Twitter Annotations
Real-time Semantic Web with Twitter AnnotationsReal-time Semantic Web with Twitter Annotations
Real-time Semantic Web with Twitter Annotations
 
ServerSide Javascript on Freebase - SF JavaScript meetup #9
ServerSide Javascript on Freebase - SF JavaScript meetup #9ServerSide Javascript on Freebase - SF JavaScript meetup #9
ServerSide Javascript on Freebase - SF JavaScript meetup #9
 
YQL:: Select * from Internet
YQL:: Select * from InternetYQL:: Select * from Internet
YQL:: Select * from Internet
 
The NoTube BeanCounter: Aggregating User Data for Television Programme Recomm...
The NoTube BeanCounter: Aggregating User Data for Television Programme Recomm...The NoTube BeanCounter: Aggregating User Data for Television Programme Recomm...
The NoTube BeanCounter: Aggregating User Data for Television Programme Recomm...
 
ChContext
ChContextChContext
ChContext
 
YQL: Select * from Internet
YQL: Select * from InternetYQL: Select * from Internet
YQL: Select * from Internet
 
Ruby Kaigi July 2009 Tokyo (Japanese)
Ruby Kaigi July 2009 Tokyo (Japanese)Ruby Kaigi July 2009 Tokyo (Japanese)
Ruby Kaigi July 2009 Tokyo (Japanese)
 
yourHistory - entity linking for a personalized timeline of historic events
yourHistory - entity linking for a personalized timeline of historic eventsyourHistory - entity linking for a personalized timeline of historic events
yourHistory - entity linking for a personalized timeline of historic events
 
Iccv2009 recognition and learning object categories p3 c00 - summary and da...
Iccv2009 recognition and learning object categories   p3 c00 - summary and da...Iccv2009 recognition and learning object categories   p3 c00 - summary and da...
Iccv2009 recognition and learning object categories p3 c00 - summary and da...
 
SC in SL
SC in SLSC in SL
SC in SL
 
R, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsR, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science Competitions
 
How Brands Can Survive & Thrive Online - Digital Evolution
How Brands Can Survive & Thrive Online - Digital EvolutionHow Brands Can Survive & Thrive Online - Digital Evolution
How Brands Can Survive & Thrive Online - Digital Evolution
 
Sounddogsppt
SounddogspptSounddogsppt
Sounddogsppt
 
A Training & Simulation Perspective on Maritime Information & Automation
A Training & Simulation Perspective on Maritime Information & AutomationA Training & Simulation Perspective on Maritime Information & Automation
A Training & Simulation Perspective on Maritime Information & Automation
 
Looking at Content Recommendations through a Search Lens - Extended Version
Looking at Content Recommendations through a Search Lens - Extended VersionLooking at Content Recommendations through a Search Lens - Extended Version
Looking at Content Recommendations through a Search Lens - Extended Version
 
Evaluating Methods to Rediscover Missing Web Pages from the Web Infrastructure
Evaluating Methods to Rediscover Missing Web Pages from the Web InfrastructureEvaluating Methods to Rediscover Missing Web Pages from the Web Infrastructure
Evaluating Methods to Rediscover Missing Web Pages from the Web Infrastructure
 
COMP 4010 - Lecture 7: Introduction to Augmented Reality
COMP 4010 - Lecture 7: Introduction to Augmented RealityCOMP 4010 - Lecture 7: Introduction to Augmented Reality
COMP 4010 - Lecture 7: Introduction to Augmented Reality
 

More from Jamie Taylor

The next phase of Web2.0: Data
The next phase of Web2.0: DataThe next phase of Web2.0: Data
The next phase of Web2.0: DataJamie Taylor
 
Public private-cloud
Public private-cloudPublic private-cloud
Public private-cloudJamie Taylor
 
Using Semantics to Enhance Content
Using Semantics to Enhance ContentUsing Semantics to Enhance Content
Using Semantics to Enhance ContentJamie Taylor
 
Freebase Workshop, December 2009
Freebase Workshop, December 2009Freebase Workshop, December 2009
Freebase Workshop, December 2009Jamie Taylor
 
Using Semantics to Enhance Content Publishing
Using Semantics to Enhance Content PublishingUsing Semantics to Enhance Content Publishing
Using Semantics to Enhance Content PublishingJamie Taylor
 
ISWC 2009 Consuming LOD
ISWC 2009 Consuming LODISWC 2009 Consuming LOD
ISWC 2009 Consuming LODJamie Taylor
 
Drupal and the Semantic Web
Drupal and the Semantic WebDrupal and the Semantic Web
Drupal and the Semantic WebJamie Taylor
 

More from Jamie Taylor (7)

The next phase of Web2.0: Data
The next phase of Web2.0: DataThe next phase of Web2.0: Data
The next phase of Web2.0: Data
 
Public private-cloud
Public private-cloudPublic private-cloud
Public private-cloud
 
Using Semantics to Enhance Content
Using Semantics to Enhance ContentUsing Semantics to Enhance Content
Using Semantics to Enhance Content
 
Freebase Workshop, December 2009
Freebase Workshop, December 2009Freebase Workshop, December 2009
Freebase Workshop, December 2009
 
Using Semantics to Enhance Content Publishing
Using Semantics to Enhance Content PublishingUsing Semantics to Enhance Content Publishing
Using Semantics to Enhance Content Publishing
 
ISWC 2009 Consuming LOD
ISWC 2009 Consuming LODISWC 2009 Consuming LOD
ISWC 2009 Consuming LOD
 
Drupal and the Semantic Web
Drupal and the Semantic WebDrupal and the Semantic Web
Drupal and the Semantic Web
 

Recently uploaded

Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 

Recently uploaded (20)

Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

Freebase - Semantic Technologies 2010 Code Camp

  • 1. Freebase A socially managed semantic database Jamie Taylor SemTech 2010 Data Camp
  • 2.
  • 3. Freebase has Many Types of Things
  • 5.
  • 6. A Multiplicity Strong Identifiers http://rdf.freebase.com/ns/en.berlin_wall http://www.ellerdale.com/topics/view/0080-6ba0 http://www.bbc.co.uk/music/artists/7f347782-eb14-40c3-98e2-17b6e1bfe56c http://musicbrainz.org/artist/7f347782-eb14-40c3-98e2-17b6e1bfe56c http://rdf.freebase.com/ns/authority.musicbrainz.7f347782-eb14-40c3-98e2-17b6e1bfe56c
  • 7. Relations contains 400 Million contained-by event label albums member-of member-of nationality education education contained-by
  • 9.
  • 10. http://www.bestbuy.com/site/She+Wolf… http://www.daylife.com/topic/Shakira http://twitter.com/shakira http://www.facebook.com/shakira http://www.myspace.com/shakira http://www.last.fm/music/Shakira http://www.netflix.com/RoleDisplay/Shakira/20046629 http://www.guardian.co.uk/music/shakira
  • 11. 99% pure All data undergoes rigorous QA before load Major focus is reconciliation Use sampling to assure 99% accuracy Data that does not meet 99% accuracy is not loaded
  • 12. What's been built on Freebase?
  • 13. Up to 100,000 Queries a Day Quarterly dumps of graph http://download.freebase.com
  • 14.
  • 15.
  • 16. Users contribute data Users extend the data model
  • 17. The Freebase Commons Top-level domains ·American football ·Internet ·Anime/Manga ·Language ·Architecture ·Law ·Astronomy ·Library ·Automotive ·Location ·Aviation ·Martial Arts ·Awards ·Measurement Unit ·Baseball ·Media Common ·Basketball ·Medicine ·Bicycles ·Metaweb Types ·Biology ·Meteorology ·Boats ·Military ·Broadcast ·Music ·Business ·Olympics ·Celebrities ·Opera ·Chemistry ·Organization ·Comics ·People ·Common ·Geography ·Computers ·Projects ·Conferences ·Protected Places ·Cricket ·Publishing ·Data World ·Radio ·Digicams ·Rail ·Education ·Religion ·Engineering ·Royalty ·Event ·Soccer ·Clothing and Textiles ·Spaceflight ·Fictional Universes ·Sports ·Film ·Symbols ·Food & Drink ·Tennis ·Freebase ·Theater ·Games ·Time ·Geology ·Transportation schema = vocabulary ·Government ·Travel ·Hobbies and Interests ·TV ·Ice Hockey ·Video Games ·Influence ·Visual Art
  • 18. The Scope of Schema 10,448 Properties describing 4,936 Types* organized into 641 Domains (77 Commons) *types with 10 or more instances
  • 19. Strength through Exemplars Type Instances 100,000,000 10,000,000 >10 instances, 1,000,000 100,000 4936 types Instances 10,000 1,000 1424 Commons 100 10 1 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 11000 Rank
  • 20. Metaweb Query Language [{ "name" : null, "type" : "/film/film" }] MQL
  • 21. [{ "name" : null, "type" : "/film/film", "directed_by":{"id":"/en/george_lucas"}, "starring":[{ "actor":{"id":"/en/harrison_ford"} }] }] MQL
  • 22. [{ "name" : null, "type" : "/film/film", "directed_by":{"id":"/en/george_lucas"}, "starring": [{ "actor": { "name": null, "film": [{ "film": {"id": "/en/the_great_escape"} }] } }] }] Donald Pleasence THX 1138
  • 24. Reconciliation { "/type/object/name":"Blade Runner", "/type/object/type":"/film/film", "/film/film/starring/actor":["Harrison Ford", "Rutger Hauer"], "/film/film/director":"Ridley Scott", "/film/film/release_date_s":"1981" } [{ "id":"/guid/9202a8c04000641f8000000000009e89", "name":["Blade Runner", "Bladerunner"], "score":1.4320519, "match":true, "type":["/common/topic", "/film/film","/media_common/adapted_work", "/award/award_winning_work", ]}, { "id":"/guid/9202a8c04000641f80000000002643d0", "name":["Blade"], "score":0.48852453, "match":false, "type":["/common/topic", "/film/film", "/award/award_winning_work", "/award/award_nominated_work", ]} http://data.labs.freebase.com/recon/
  • 26. Topic API Shortcut to building Topic displays Two forms: basic (names, types, description) standard (basic + keys, properties) http://www.freebase.com/experimental/topic/standard?id=/en/ncis
  • 27. Geo Search API Semantic Spatial Semantic http://www.freebase.com/docs/geosearch
  • 30. Getting Started++ • Freebase Documentation Hub • http://www.freebase.com/docs • Developer Mailing List • http://lists.freebase.com/mailman/listinfo/freebase-discuss • http://freebase.markmail.org • Real Time help on IRC • Freenode #freebase • Freebase Happenings • http://blog.freebase.com • About the Graph Store • Google: "ACM SIGMOD schema last tuple store"