NY Freebase Workshop 10 Dec 2009

Intro slides for NY Freebase Workshop on Dec 10, 2009

Presentation Transcript

  • Freebase
    New York Workshop
    10 Dec 2009
  • Presenters
    Robert Cook
    Jamie Taylor
    Will Moffat
  • Today’s Workshop
    9:30 – Intro
    10:30 – Prepackaged Freebase solutions
    12:30 – Lunch
    1:15 – Connecting your data to Freebase
    2:30 – Freebase in the data service ecosystem
    3:30 – Wrap up, “office hours”
  • Agenda
    Intro to Freebase
    Freebase as an identity directory
    The Freebase platform
  • Metaweb
    Technology company based in San Francisco
    ~60-person team of engineers and business people
    Venture funded, with long-term outlook
    Focused on Freebase.com platform
  • Freebase is a database of entities
    One entity per thing in the world
    Stable, long-lived identifiers
    Inclusive policy
    Practical data
    Focus on available data
    People, places, products, etc.
    Data to build apps
    Names, images, descriptions
    Dates, measurements and relationships
  • Actresses (37,079)
  • Football Players (16,568)
  • Cheeses (488)
  • Musical Instruments (1,034)
  • Airports (11,556)
  • TV Programs (33,630)
  • Related entities are connected, forming a graph
    Current stats:
    • ~10M entities
    • ~1,800 “types” (Celebrity, Movie, TV show, Book, Company, Location, Sports team, Product, etc.)
    • ~275M facts
    • Continuous data input, cleanup, and syncing
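One way to picture "entities connected into a graph" is as a list of facts, each linking a subject to an object through a property. The sketch below is only an illustration of that idea, not Freebase's storage model; the property names (`acted_in`, `directed_by`) and the `neighbours` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical facts as (subject, property, object) triples.
# The property names are made up for illustration.
facts = [
    ("Will Smith", "acted_in", "Men in Black"),
    ("Men in Black", "directed_by", "Barry Sonnenfeld"),
]

# Index the facts by subject so each entity lists its outgoing edges.
graph = defaultdict(list)
for subject, prop, obj in facts:
    graph[subject].append((prop, obj))


def neighbours(entity):
    """Return the entities directly linked from `entity`."""
    return [obj for _, obj in graph.get(entity, [])]


print(neighbours("Will Smith"))
print(neighbours("Men in Black"))
```

Following edges from one entity to the next is what lets a query hop from an actor to a film to its director.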
  • Each entity contains rich, structured metadata
  • Entities are language independent
  • As a writeable graph, Freebase gets better over time
    • Add (or remove) entities
    • Add (or remove) metadata (facts, keys, translations, etc.)
    • Extend and improve the schemas
  • Bulk data into Freebase
    15-person group dedicated to algorithmic data import, processing, and tools development
    Reconciliation, reconciliation, reconciliation
    Critical part of everything we do
    Automate wherever possible
    Crowdsource for tasks requiring human judgment (semi-automated)
    Pipelined, ongoing syncing with large external sources (Wikipedia, partners, etc.)
  • Reconciliation
    Guaranteeing one entity
    per thing in the world
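As a rough illustration of "one entity per thing", the sketch below matches an incoming record against known entities before creating a new one. It is a deliberately naive stand-in that keys only on a normalized name plus a type, and it is not the reconciliation method used by Freebase; `EntityStore`, `normalize`, and the `entity-N` identifiers are hypothetical.

```python
import unicodedata
from typing import Tuple


def normalize(name: str) -> str:
    """Case-fold, strip accents, and collapse whitespace for comparison."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.casefold().split())


class EntityStore:
    """Toy store that hands out one id per (normalized name, type) pair."""

    def __init__(self) -> None:
        self._by_key = {}  # (normalized name, type id) -> entity id
        self._next_id = 1

    def reconcile(self, name: str, type_id: str) -> Tuple[str, bool]:
        """Return (entity id, created), reusing an existing entity if one matches."""
        key = (normalize(name), type_id)
        if key in self._by_key:
            return self._by_key[key], False
        entity_id = f"entity-{self._next_id}"
        self._next_id += 1
        self._by_key[key] = entity_id
        return entity_id, True


store = EntityStore()
first = store.reconcile("Will Smith", "/film/actor")
second = store.reconcile("  will   SMITH ", "/film/actor")
print(first, second)
```

The second record resolves to the entity created by the first, so no duplicate is added. Matching on name and type alone would wrongly merge two different people who share a name, which is why the slides pair automation with human judgment.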
  • “US Politicians who have taken more than $30K from foreign companies”
  • Freebase is open
  • Open platform means more data
    Creative Commons Attribution (CC-BY) licensing
    Robust set of APIs
    SLAs for higher volume users (typically >100K API calls per day)
    Hosted developer platform for building tools and apps on top of the data
  • External site data and/or keys
    Beer (3,100)
    The Oxford Bottled Beers Database
    TV episode (715,032)
    The TVDB, TV Rage, etc.
  • A global community is actively improving it
    Creating new data sets
    Curating existing data
    Jet Engines
    Maritime museums
  • The community is defining new schemas
    Top-level domains
  • Agenda
    Intro to Freebase
    Freebase as an identity directory
    The Freebase platform
  • Everybody is creating entities
    Topic pages
    User profiles
    Relevant apps
    Artist pages
    Other fans
  • Millions of users are helping them
    #sxsw09 (Event)
    (Movies, Celebrities, Companies, Products, etc.)
  • Freebase is connecting these entities together
    Will Smith (Actor)
  • An entity directory can power new applications
  • Example:
    Each film review is tagged with the corresponding movies in Freebase
    When the page loads, it grabs data from Freebase (images, film info and links) to enhance the article
    Freebase also returns links to related WSJ film reviews the user might enjoy (based on genre, director, actors, release year, etc.)
    A Freebase search box allows the user to quickly find any film review in the WSJ archives
  • Agenda
    Intro to Freebase
    Freebase as an identity directory
    The Freebase platform
  • Freebase architecture
  • Query editor
  • Querying Freebase
    “Russian cosmonauts”
    "type": "/spaceflight/astronaut",
    "name": null,
    "/people/person/nationality": "russia"
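The slide shows only the property list of the query. A minimal sketch of turning it into a complete query document follows. It assumes the usual MQL conventions that `null` marks a value to be filled in and that wrapping the object in a list asks for every match. The variable names are illustrative, the nationality value is copied as written on the slide, and no service call is made.

```python
import json

# Property/value pairs copied from the slide. Python's None serialises
# to JSON null, which in MQL asks for the value to be filled in.
cosmonaut_fragment = {
    "type": "/spaceflight/astronaut",
    "name": None,
    "/people/person/nationality": "russia",
}

# Wrapping the object in a list requests all matching entities
# rather than a single one.
query = [cosmonaut_fragment]

query_json = json.dumps(query, indent=2)
print(query_json)
```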
  • Querying Freebase
    “Tropical storms in the 90s”
    "type": "/meteorology/tropical_cyclone",
    "name": null,
    "formed>=": "1990",
    "a:formed<": "2000"
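The two `formed` constraints together select a half-open range: formed in or after 1990 and before 2000. The sketch below generates that pair for any decade. `decade_range` is a hypothetical helper, and the `a:` prefix on the second constraint is copied from the slide.

```python
def decade_range(prop: str, start_year: int, prefix: str = "a") -> dict:
    """Return MQL-style constraints for start_year <= prop < start_year + 10."""
    return {
        f"{prop}>=": str(start_year),
        f"{prefix}:{prop}<": str(start_year + 10),
    }


# Rebuild the slide's query from the helper.
storms_query = [{
    "type": "/meteorology/tropical_cyclone",
    "name": None,
    **decade_range("formed", 1990),
}]

print(storms_query)
```

Calling `decade_range("formed", 1980)` would give the equivalent constraints for the 1980s.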
  • Querying Freebase
    “French actresses born pre-WWII”
    "type": "/film/actor",
    "name": null,
    "/people/person/gender": "female",
    "/people/person/date_of_birth<=": "1939",
    "/people/person/nationality": "France",
    "sort": "/people/person/date_of_birth"
  • ACRE
    Server-side JavaScript + webpage templating
    Used to build the WSJ application (and others)
    Advanced APIs
    Code sharing – programmer ecosystem
  • Other platform services
    Freebase suggest
    Lucene-based topic search interface
    Blob store (text, image thumbnailing)
    Reconciliation service
    Extended MQL
  • www.freebase.com
    blog.freebase.com