API's, Freebase, and the Collaborative Semantic web

A presentation about the state of the collaborative semantic web, including:

- What?
- Why?
- Where do we stand?
- A case study on Metaweb's Freebase project

    Presentation Transcript

    • Diana Tamabayeva & Dan Delany: API’s, Freebase, and the Collaborative Semantic Web
    • Dan Delany dan.delany@gmail.com cognitiveharmony.net 970‐309‐8598
    • Object‐Oriented Representation
      String of characters ‐ useful to people: “John Smith is a five‐foot‐tall male who weighs one‐hundred eighty pounds. He’s forty‐two years old and he loves dogs.”
      Structured data in object ‐ useful to computers: sortByHeight(users); isADude(“John Smith”);
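The contrast on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `User` record, its field names, the height in inches, and the second user are invented here, and `sort_by_height` and `is_a_dude` are hypothetical stand-ins for the slide's `sortByHeight` and `isADude`.

```python
from dataclasses import dataclass


@dataclass
class User:
    # Field names are assumptions; the slide only gives the English sentence.
    name: str
    sex: str
    height_in: int   # height in inches (five feet = 60)
    weight_lb: int
    age: int
    loves: tuple = ()


def sort_by_height(users):
    """Return users ordered from shortest to tallest."""
    return sorted(users, key=lambda u: u.height_in)


def is_a_dude(users, name):
    """True if a user with this name is recorded as male."""
    return any(u.name == name and u.sex == "male" for u in users)


users = [
    User("Jane Doe", "female", 66, 140, 35),             # invented example
    User("John Smith", "male", 60, 180, 42, ("dogs",)),  # from the slide's sentence
]

print([u.name for u in sort_by_height(users)])
print(is_a_dude(users, "John Smith"))
```

Neither function could be written reliably against the free-text sentence; both are one-liners once the same facts are stored as fields.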
    • Object‐Oriented Representation
      class Dog < Mammal
        @home, @age, @human
        def goHome
          go(@home)
        end
      end
      class Place
        @latlong, @elevation, @address, @name
        def isBelowSeaLevel
          @elevation < 0
        end
      end
      class Human < Mammal
        @home, @age, @pets, @spouse, @friends
        def divorce
          @spouse = nil
        end
      end
      O‐O instance variables and methods represent structured semantic relationships between pieces of information.
      [Diagram labels: Home, Pet, Home, Human, Spouse, Friends]
    • Structured Semantic Relationships Among Objects: We visualize this with a semantic graph.
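One common way to hold such a semantic graph in code is a list of (subject, predicate, object) triples. The sketch below assumes that representation; the node names (`Rex`, `Smith House`) and predicate names (`isA`, `hasPet`, `hasHome`) are invented for illustration and do not come from the slides.

```python
from collections import defaultdict

# Each triple is one labelled edge of the graph: subject --predicate--> object.
triples = [
    ("John Smith", "isA", "Human"),
    ("John Smith", "hasPet", "Rex"),
    ("John Smith", "hasHome", "Smith House"),
    ("Rex", "isA", "Dog"),
    ("Rex", "hasHome", "Smith House"),
    ("Smith House", "isA", "Place"),
]

# Index outgoing edges by subject so lookups do not scan every triple.
graph = defaultdict(list)
for subject, predicate, obj in triples:
    graph[subject].append((predicate, obj))


def neighbors(node, predicate):
    """Follow all edges with the given label out of a node."""
    return [o for p, o in graph.get(node, []) if p == predicate]


print(neighbors("John Smith", "hasPet"))
print(neighbors("Rex", "hasHome"))
```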
    • The (hypertext) Web Today
      [Diagram labels: Web Servers (LAMP), WWW, User Computer]
    • The (semantic) Web Tomorrow?
      [Diagram labels: Servers; User Computer; Structured Data (OWL/RDF); Document Data (HTML/CSS); Data Aggregator/Visualizer; Hyperlinks (Old www links)]
    • A Cloud that Talks To Itself. vs.
    • Why? Searchability. With semantic graphs, you can perform semantic searches  by traversing the graph: “Where does the woman who lives at 2408 Walnut work?”
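The question on this slide can be answered by two hops over a triple store. The following is a minimal sketch, not a real query engine: the people, employers, and predicate names (`sex`, `livesAt`, `worksAt`) are all made up for the example.

```python
# A tiny in-memory graph of (subject, predicate, object) facts; all data is invented.
triples = {
    ("Alice", "sex", "female"),
    ("Alice", "livesAt", "2408 Walnut"),
    ("Alice", "worksAt", "Acme Corp"),
    ("Bob", "sex", "male"),
    ("Bob", "livesAt", "2408 Walnut"),
    ("Bob", "worksAt", "City Library"),
}


def subjects(predicate, obj):
    """All subjects that have this predicate pointing at this object."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def objects(subject, predicate):
    """All objects reached from a subject along this predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def workplace_of_woman_at(address):
    # Hop 1: who lives at the address and is a woman?
    women_there = subjects("livesAt", address) & subjects("sex", "female")
    # Hop 2: where does each of them work?
    return {place for w in women_there for place in objects(w, "worksAt")}


print(workplace_of_woman_at("2408 Walnut"))
```

A keyword search over HTML pages has no equivalent of the second hop, which is the point the slide is making.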
    • Why? So You Can Mash it Up. Mashups create meaning from data. Examples: twittermap.com, mapmash.googlepages.com/gaza.html, housingmaps.com
    • Why? Context‐Aware AI
      Today’s AI is limited by the domain‐specificity of its input data.
      Example: Image Analysis. Context‐Unaware: Blob Tracking. Useful in its domain (in this case, touch detection), but ultimately has no ‘intelligence.’
      [Image labels: Blob1, Blob2]
    • Why? Context‐Aware AI
      Today’s AI is limited by the domain‐specificity of its input data.
      Example: Image Analysis. Context‐Aware: Mine the semantic graph for heuristics and clues.
      You’re in the mountains near Aspen, CO. It is the fall. Your friends Taifur and Rachel are at the same location. Etc.
      [Image labels: Trees, Sky, Rachel, Taifur, Dan]
    • Why? Context‐Aware AI
      “I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web ‐ the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day‐to‐day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize.” ‐ Tim Berners‐Lee, 1999
    • Challenges ‐ Motivation/Critical Mass
      Value comes from ubiquity, and ubiquity comes from value.
      How do we encourage adoption of technology that does not yet provide value?
      More importantly, how will it provide value to content producers? How is giving away your data’s meaning (your secret sauce) instead of presenting it alongside ads valuable?
      Should semantic feeds be monetized? “Information As A Service”
    • Challenges ‐ Privacy/Security
      • Reduced anonymity on the Web
      • Increased invasion of privacy
      • Solution: access privileges must be controlled by infrastructure‐level security
    • The State of the Semantic Web
    • Top‐Down: Content Structuring
      • Semantic search engines generate and leverage an internal semantic database
      • Data still returned as HTML, no API
      Not helping create the semantic web!
    • Top‐Down: Dapper
      • Dapper ‐ UI for setting up rules to scrape semi‐structured data (CSV, RSS, XML) from any set of HTML documents!
      [Diagram: WWW to semantic web; labels: RDF, CSV, RSS, OWL]
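The idea of a scraping rule can be sketched with Python's standard `html.parser`. This is not Dapper's interface or API, which the slides do not show. It is a hypothetical stand-in where a "rule" is simply a CSS class name, and the sample HTML is invented. It only handles flat, non-nested elements.

```python
from html.parser import HTMLParser


class ClassTextScraper(HTMLParser):
    """Collect the text of every element carrying a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.capturing = False
        self.results = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.capturing = True
            self.results.append("")

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the capture (no nesting support).
        self.capturing = False

    def handle_data(self, data):
        if self.capturing:
            self.results[-1] += data


def scrape(html, css_class):
    parser = ClassTextScraper(css_class)
    parser.feed(html)
    return parser.results


page = (
    '<ul>'
    '<li class="title">One Love</li><li class="artist">Massive Attack</li>'
    '<li class="title">Anything But Love</li>'
    '</ul>'
)
print(scrape(page, "title"))
```

The extracted list could then be written out as CSV, RSS, or XML, which is the step that turns a document meant for people into data meant for programs.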
    • Top‐Down: Yahoo! Pipes
      • Yahoo! Pipes ‐ UI for transforming, reformatting, and combining data feeds into more useful data feeds.
      [Diagram: feeds to semantic web; labels: JSON, CSV, RSS, JSON, RSS, RDF, OWL]
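Combining feeds of different formats into one more useful feed can be sketched as follows. Yahoo! Pipes was a visual tool, so this is only an analogy in code, not its API: the two sample feeds, their field names, and the helper functions are all assumptions made for the example.

```python
import csv
import io
import json

# Two invented input feeds in different formats.
json_feed = (
    '[{"title": "Post B", "date": "2009-02-01"},'
    ' {"title": "Post A", "date": "2009-01-15"}]'
)
csv_feed = "title,date\nPost C,2009-01-20\n"


def read_json_feed(text):
    """Normalize a JSON feed to a list of {title, date} dicts."""
    return [{"title": i["title"], "date": i["date"]} for i in json.loads(text)]


def read_csv_feed(text):
    """Normalize a CSV feed to the same shape."""
    rows = csv.DictReader(io.StringIO(text))
    return [{"title": r["title"], "date": r["date"]} for r in rows]


def combine(*feeds):
    """Merge any number of normalized feeds and sort them by ISO date."""
    merged = [item for feed in feeds for item in feed]
    return sorted(merged, key=lambda item: item["date"])


combined = combine(read_json_feed(json_feed), read_csv_feed(csv_feed))
print(json.dumps(combined, indent=2))
```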
    • Bottom‐Up: Publishing Standards
      RDF and OWL standards
      • W3C‐sponsored standards for defining semantic relationships and resources
      • Powerful, but complex and hard for humans to read/create
      • No motivation for developers to create
      • No W3C‐sponsored universal ontologies
      RDF Data Node:
        <rdf:Description rdf:about="http://.../DansCar">
          <car:color>Red</car:color>
          <car:make>Honda</car:make>
          <car:year>1999</car:year>
        </rdf:Description>
      OWL Ontology:
        owl:Class rdf:ID="Car"
          rdfs:subClassOf rdf:resource="#Vehicle"
          rdfs:subClassOf [a owl:Restriction;
                           owl:cardinality "4"^^xsd:nonNegativeInteger;
                           owl:onProperty <#Wheel> ]
      [Diagram labels: RDF Data Node: Dan’s Car hasA color, make, year (1999); OWL Ontology: Car isA Vehicle, Car HasMultiple(4)]
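To show what a program gets out of an RDF data node like the one on this slide, here is a minimal sketch that reads it into triples with Python's standard `xml.etree.ElementTree`. The `car:` namespace URI and the `http://example.org/DansCar` address are placeholders invented for this example (the slide elides the real URL), and a production system would normally use a dedicated RDF library rather than a generic XML parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CAR_NS = "http://example.org/car#"  # hypothetical vocabulary, not from the slides

rdf_xml = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:car="{CAR_NS}">
  <rdf:Description rdf:about="http://example.org/DansCar">
    <car:color>Red</car:color>
    <car:make>Honda</car:make>
    <car:year>1999</car:year>
  </rdf:Description>
</rdf:RDF>
""".strip()


def to_triples(xml_text):
    """Turn each property of each rdf:Description into a (s, p, o) triple."""
    root = ET.fromstring(xml_text)
    triples = []
    for node in root.findall(f"{{{RDF_NS}}}Description"):
        subject = node.get(f"{{{RDF_NS}}}about")
        for prop in node:
            # ElementTree spells tags as "{namespace}local"; shorten for display.
            predicate = prop.tag.replace(f"{{{CAR_NS}}}", "car:")
            triples.append((subject, predicate, prop.text))
    return triples


for triple in to_triples(rdf_xml):
    print(triple)
```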
    • API’s: I dream of a RESTful tomorrow Thousands of content creators are already sharing structured data  with API’s!
    • Bottom‐Up: Knowledge Bases
      • Collaborative attempt to build factual knowledge base
      • “Object‐Oriented Wikipedia”
      [Diagram label: Domain Experts]
    • Bottom‐Up: Knowledge Bases
      • Content contributed by experts manually, or by dataset owners automatically.
      [Diagram labels: Domain Experts, Semantic Analysis]
    • Freebase: The Everything Graph
    • Metaweb Freebase Statistics
      • 5.3 million topics today
      • Growing by ~15,000/month
      • Pulled from public data
      • By comparison, Wikipedia has 2.64 million English articles
      • 25,379 users today.
      • Growing by 600‐800/month
      • Freebase Launch, March 2007
      • LLC Founded, July 2005
    • The Freebase Approach
      [Diagram labels: Creative Commons Attribution License; Content Publishers; WWW; Open API (MQL); RDF API; Node Editor GUI; App Developers; Data Modelers; Expert Users & Dataset Owners; Casual Collaborators; Existing Datasets; WWW]
    • Freebase Data Policy
      Creative Commons Attribution License
      • Requires Attribution of Source
      Open API (MQL), RDF API: Data Freedom!
      MQL Query:
        [{
          "album" : { "artist" : [], "name" : null, "release_date" : null },
          "limit" : 25,
          "name" : null,
          "name~=" : "Love*",
          "type" : "/music/track"
        }]
      Returns:
        [{
          "album" : { "artist" : ["Massive Attack"], "name" : "Blue Lines", "release_date" : "1991-04-08" },
          "name" : "One Love",
          "type" : "/music/track"
        },{
          "album" : { "artist" : ["Squirrel Nut Zippers"], "name" : "The Inevitable", "release_date" : "1995-03-17" },
          "name" : "Anything But Love",
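The query-by-example shape of MQL (fill in what you know, leave `null` or `[]` where you want an answer) can be illustrated with a small local matcher. This sketch does not call Freebase and is not the MQL implementation. It emulates only the handful of features visible in the slide's query, over an in-memory list. The `matches` and `run` helpers are hypothetical, the third record is invented, and the word-level wildcard handling of `name~=` is an assumption chosen because it agrees with the results shown on the slide.

```python
import json
from fnmatch import fnmatchcase

query = [{
    "album": {"artist": [], "name": None, "release_date": None},
    "limit": 25,
    "name": None,
    "name~=": "Love*",
    "type": "/music/track",
}]

records = [
    {"name": "One Love", "type": "/music/track",
     "album": {"artist": ["Massive Attack"], "name": "Blue Lines",
               "release_date": "1991-04-08"}},
    {"name": "Anything But Love", "type": "/music/track",
     "album": {"artist": ["Squirrel Nut Zippers"], "name": "The Inevitable",
               "release_date": "1995-03-17"}},
    {"name": "Unrelated Song", "type": "/music/track",   # invented non-match
     "album": {"artist": ["Example Artist"], "name": "Example Album",
               "release_date": "2000-01-01"}},
]


def matches(record, template):
    """Return the filled-in template for a record, or None if it does not match."""
    result = {}
    for key, want in template.items():
        if key == "limit":
            continue
        if key.endswith("~="):
            # Assumed semantics: some word of the field matches the pattern.
            words = str(record.get(key[:-2], "")).lower().split()
            if not any(fnmatchcase(w, want.lower()) for w in words):
                return None
            continue
        have = record.get(key)
        if isinstance(want, dict):
            sub = matches(have or {}, want)
            if sub is None:
                return None
            result[key] = sub
        elif want is None or want == []:
            result[key] = have          # blank slot: copy the stored value
        elif have != want:
            return None                 # constraint not satisfied
        else:
            result[key] = have
    return result


def run(query, records):
    template = query[0]
    limit = template.get("limit", len(records))
    hits = (matches(r, template) for r in records)
    return [h for h in hits if h is not None][:limit]


print(json.dumps(run(query, records), indent=2))
```

The response has the same shape as the query, which is what makes the format easy to consume from application code.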
    • Freebase Community
      App Developers List; Data Modelers List; IRC Chat (open to all); Discussion Threads on Individual Topics
      • No Central Forum
      • No Backlog of Mailing List
      • No Friends
      • No Private Messages
      [Diagram labels: App Developers, Data Modelers, Expert Users & Dataset Owners, Casual Collaborators]
    • Freebase Community Tools
      “Acre, the Freebase application development platform, lets anyone mashup Freebase data using Javascript and have it hosted for free.” ‐ Shawn Simister, developer
      “We have been working hard recently to provide bulk import tools for Freebase. While such tools exist internally, the reconciliation process has thus far been too complicated for public release.” ‐ Brian Culbertson, Metaweb Engineer
      “[Data Modeling is hard because new schemas are a slow process, and they can break users’ code. Let’s have a “Sloppy Freebase” that allows users to enter unstructured data until new schemas are defined.]” ‐ Jack Alves, former Metaweb Director of Engineering
      “As a programmer, I feel that I'm most effective when I'm contributing large data sets … facilities built into Freebase that let users upload lists of topics are limited to specific situations.” ‐ Shawn Simister
      [Diagram labels: Employees, App Developers, Data Modelers, Expert Users & Dataset Owners, Casual Collaborators]
    • “Sloppy” Data Modeling
      “[Data Modeling is hard because new schemas are a slow process, and they can break users’ code. Let’s have a “Sloppy Freebase” that allows users to enter unstructured data until new schemas are defined.]” ‐ Jack Alves, former Metaweb Director of Engineering
      Currently, Freebase users cannot submit data if there is not already a data structure + ontology built for that data type.
      “Sloppy” data creation allows users to create their own data types, which will later be cleaned and standardized.
      User‐Generated, Semi‐structured “sloppy” data: Bob Jones, major: English; John Green, focus: Sociology; Eric Bradley, studying: CS
      Clean, structured data: Bob Jones, area of study: English; John Green, area of study: Sociology; Eric Bradley, area of study: CS
      [Diagram labels: Casual Collaborators; Automated Data Cleaner‐Upper; RDF, OWL; to semantic web]
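The "Automated Data Cleaner‐Upper" step can be sketched as a mapping from user-invented property names to one standard name. The slides do not describe how Freebase would actually do this, so the `SYNONYMS` table and the `clean` helper below are assumptions, and the pairing of each person with a sloppy property name is reconstructed from the slide's diagram.

```python
# Hypothetical synonym table: sloppy property name -> standardized name.
SYNONYMS = {
    "major": "area of study",
    "focus": "area of study",
    "studying": "area of study",
}

# User-generated records, each using its own ad hoc property name.
sloppy = [
    {"name": "Bob Jones", "major": "English"},
    {"name": "John Green", "focus": "Sociology"},
    {"name": "Eric Bradley", "studying": "CS"},
]


def clean(record):
    """Rename known sloppy keys; leave everything else untouched."""
    return {SYNONYMS.get(key, key): value for key, value in record.items()}


cleaned = [clean(r) for r in sloppy]
for row in cleaned:
    print(row)
```

In practice the hard part is building the synonym table, which is where later schema design and human review would come in.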
    • Conclusion
      • Progress
        • RDF/OWL
        • Freebase ‐ 5.3 mil. topics!
        • Dapper, Yahoo! Pipes, other data abstractors
        • API’s out the wazoo
      • Issues
        • Adoption
        • Privacy/Security
        • Intellectual Property
        • We’re good at making content, but we suck at mining it and describing it.
      [Diagram label: We are here.]