[Figure: data from several sources drawn as one graph. Nodes: Tori Amos, Newton, NC, North Carolina, USA, Crucify, Under The Pink, EMI, Atlantic. Types: Musician, City, Music Album. Arc labels: Date Of Birth (“8/22/63”), birthplace, Author, publisher, Located in, temperature (62 F), instanceof. Sources: Geo Almanac, Weather channel, CDNow, People Magazine. Every source refers to the shared nodes as “Newton, NC” and “Tori Amos”.]
We get a mess like this
[Figure: the graph of Tori Amos, her albums and Newton, NC, as the sources actually publish it. Sources: Geo Almanac, Weather channel, CDNow, People Magazine. Each source uses its own identifier for the shared nodes: NTNC, Newton,_NorthCar, USNC0491, 0,9855,109071,00, 328723677.]
The Name Problem
Names are crucial in information exchange
Two parties cannot exchange information about an object without agreeing on how they are going to refer to it
The Problem: too many names to keep track of!
No URN for <Newton, NC> or <Tori Amos>
Different sites have different names for the same thing!
URN efforts to date have largely failed
Traditional Approach : Name-Mapping tables
[Figure: the graph with per-source identifiers (NTNC, Newton,_NorthCar, USNC0491, 0,9855,109071,00, 328723677) and a calling program that keeps name-mapping tables between them: 328723677 <-> 0,9855,1…, USNC0491 <-> NTNC <-> …]
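A minimal sketch of the name-mapping approach, to show why it scales poorly. The three identifiers are taken from the slides; the site labels (`site_a`, `site_b`, `site_c`) and the assumption that all three identifiers denote Newton, NC are illustrative only.

```python
from itertools import combinations

# Local identifiers copied from the slides. The site labels are placeholders,
# and treating all three as names for Newton, NC is an assumption.
LOCAL_NAMES = {
    "site_a": "NTNC",
    "site_b": "Newton,_NorthCar",
    "site_c": "USNC0491",
}


def build_mapping_tables(local_names):
    """Build one translation table per ordered pair of sites."""
    tables = {}
    for src, dst in combinations(local_names, 2):
        tables[(src, dst)] = {local_names[src]: local_names[dst]}
        tables[(dst, src)] = {local_names[dst]: local_names[src]}
    return tables


def translate(tables, src, dst, name):
    """Translate a name used by site `src` into the name used by site `dst`."""
    return tables[(src, dst)][name]


tables = build_mapping_tables(LOCAL_NAMES)
print(len(tables))  # number of directed tables the caller must maintain
print(translate(tables, "site_a", "site_c", "NTNC"))
```

With n sites the calling program maintains n(n-1) directed tables, and every table needs a row for every shared object.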
TAP Naming
Reference by descriptions
E.g., “A Musician whose firstName is ‘Tori’ and whose lastName is ‘Amos’ and whose …”
Names are degenerate descriptions
Amzn:B000002UB2, CDNOW: 328723677
Description based name negotiation
Core Insight
Don’t require globally unique names for everything if we can describe things using a starting vocabulary
Need a description language, starting vocabulary and negotiation mechanism
Bootstrapping some shared meaning into more shared meaning
The vision: descriptions choreograph the integration
[Figure: the graph with per-source identifiers (NTNC, Newton,_NorthCar, USNC0491, 0,9855,109071,00, 328723677). The calling program sends descriptions D1 and D2 to the sources instead of keeping mapping tables. D1 = description of Newton, NC; D2 = description of Tori Amos.]
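The idea of referring to things by description rather than by a shared name can be sketched roughly as follows. This is an illustration, not TAP's actual mechanism: the `Site` class, the `name` and `locatedIn` property names, and the second weather record are all hypothetical.

```python
class Site:
    """A site that stores objects under its own local identifiers."""

    def __init__(self, records):
        # records: local identifier -> properties the site knows about it
        self.records = records

    def resolve(self, description):
        """Return the local identifiers whose properties match the description."""
        return sorted(
            local_id
            for local_id, props in self.records.items()
            if all(props.get(arc) == value for arc, value in description.items())
        )


# D1: a description of Newton, NC built from a small shared vocabulary.
D1 = {"instanceof": "City", "name": "Newton", "locatedIn": "North Carolina"}

geo = Site({
    "NTNC": {"instanceof": "City", "name": "Newton", "locatedIn": "North Carolina"},
})
weather = Site({
    "USNC0491": {"instanceof": "City", "name": "Newton", "locatedIn": "North Carolina"},
    # Hypothetical second record, added only to show an ambiguous description.
    "HYPOTHETICAL-OTHER-NEWTON": {
        "instanceof": "City", "name": "Newton", "locatedIn": "Massachusetts",
    },
})

print(geo.resolve(D1))      # each site maps D1 to its own local name
print(weather.resolve(D1))
print(weather.resolve({"instanceof": "City", "name": "Newton"}))  # ambiguous
```

The caller never learns or stores the local identifiers. A description that matches several records, or none, is how ambiguity and failure to denote would surface in this sketch.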
Description based References
The core protocol : GetData
GetData(Resource Description, arc-label)
GetData(<Tori Amos>, birthplace)
GetData(RDF Description of Tori Amos, birthplace)
A form of loose coupling:
Handling Ambiguity, Failure to denote, …
The core contract:
Expose your data as a Graph
Map incoming descriptions to nodes in your graph
In return, your data is now integrated into the global semantic web
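A rough sketch of what a site might do to honor this contract: hold its data as a graph of triples, map an incoming description to a node, and return the values on the requested arc. The `Graph` class, the node id `n1`, and the use of `LookupError` for ambiguity and failure to denote are assumptions made for illustration. The slides do not specify a wire format or error model.

```python
class Graph:
    """A site's data exposed as (subject, arc-label, object) triples."""

    def __init__(self, triples):
        self.triples = list(triples)

    def nodes_matching(self, description):
        """Map an incoming description to the nodes that satisfy every arc in it."""
        subjects = {s for s, _, _ in self.triples}
        return sorted(
            s for s in subjects
            if all((s, arc, value) in self.triples for arc, value in description.items())
        )

    def get_data(self, description, arc_label):
        """GetData(Resource Description, arc-label), for a single local graph."""
        nodes = self.nodes_matching(description)
        if not nodes:
            raise LookupError("description does not denote any node")
        if len(nodes) > 1:
            raise LookupError("description is ambiguous")
        node = nodes[0]
        return [o for s, a, o in self.triples if s == node and a == arc_label]


# "n1" is a local node id that callers never see.
graph = Graph([
    ("n1", "instanceof", "Musician"),
    ("n1", "firstName", "Tori"),
    ("n1", "lastName", "Amos"),
    ("n1", "birthplace", "Newton, NC"),
])

tori = {"instanceof": "Musician", "firstName": "Tori", "lastName": "Amos"}
print(graph.get_data(tori, "birthplace"))
```

Because the caller only sends a description and an arc label, the site can change its internal identifiers without breaking anyone, which is one way to read the loose coupling claim.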
Infrastructure: Kernel Vocabulary
Provides vocabulary for descriptions
Purpose is to provide the infrastructure for constructing descriptions with which programs can refer to things
“A Musician whose firstName is ‘Tori’ and whose lastName is ‘Amos’ and whose …”
It doesn’t reside anywhere : it’s a specification
Applications
Good infrastructures have waves of applications
WWW : home pages, portals, ecommerce, …
DNS : email, telnet, ftp, gopher, … WWW
Semantic Search
Adding Semantics to Search
The crawl, grab, index model of search doesn’t work for dynamic web sites or web applications
Semantics-based Search Augmentation enables search to cover time-sensitive data
Internet Wet Lab
Semantic Web Application: Semantic Search
Search Augmentation Example
How the Semantic Infrastructure gets used in Semantic Search
[Figure: the Search Front End receives the query “Yo Yo Ma”. The KB turns it into a description: “Musician whose genre is ClassicalMusic, First name is …”. UDDI++ is asked “Who has concert dates? discography? auctions? bio? For musician whose …”. Requests then go to EBay, CDNow, AllMusic and TicketMaster: “Concert Dates for Musician whose …”, “Bio for …”, “Discography for …”, “Auctions for …”. A Caching & Buffering layer sits between the front end and the sources.]
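The flow in this slide can be sketched as a three-step plan: look the search term up in the KB to get a description, ask a registry which sources offer data about that type, then send each source the description. The `KB` and `REGISTRY` contents below are assumptions: the slide names the four sources and the four kinds of data, but not which source provides which.

```python
# Knowledge base: search term -> description (shape is an assumption).
KB = {
    "yo yo ma": {"instanceof": "Musician", "genre": "ClassicalMusic"},
}

# Registry standing in for the slide's "UDDI++" box: which source offers what
# for a given type. The pairing of source to data kind is an assumption.
REGISTRY = {
    "Musician": {
        "concert dates": "TicketMaster",
        "discography": "CDNow",
        "auctions": "EBay",
        "bio": "AllMusic",
    },
}


def plan_requests(query):
    """Return (source, data kind, description) for each request the front end would send."""
    description = KB.get(query.strip().lower())
    if description is None:
        return []  # term not covered by the KB: fall back to plain search
    offers = REGISTRY.get(description["instanceof"], {})
    return [(source, kind, description) for kind, source in offers.items()]


for source, kind, description in plan_requests("Yo Yo Ma"):
    print(f"{source}: {kind} for {description['instanceof']} "
          f"whose genre is {description['genre']}")
```

The sketch stops at planning the requests. Actually calling the sources, and the caching and buffering shown in the figure, are left out.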
TAP KBs for Semantic Search
Large Knowledge Base of specific musicians, cities, athletes, …
Currently covers about 20% of search terms
Built in a largely automated fashion
Scrapers for free data sources
Simple noun phrase analysis of news articles
AP, Reuters, …
Scrapers for important sites to bootstrap
KB also helps bootstrap the semantic web
KB Coverage Today
Music
Musicians, instruments, styles
Movies
Movies, actors, tv-shows
Authors
Top authors, classic books, …
Sports
Athletes, sports, sports teams, equipment
Autos
Auto models, motorcycles, …
Companies
Fortune 500
Home Appliances
Types, brands
Toys
Types, brands
Baby products
Types, brands
Places
Countries, cities, tourist attractions, …
Consumer electronics
Audio/Video, Communication
Game : consoles, titles, …
Health
Diseases, Drugs, …
Semantic Site Search
Semantic Search is useful not just for internet-wide search, but also for site search
Same principles as internet-wide search
KBs created for searching related individual sites can be shared between sites
These KBs feed into global semantic web
Example: Semantic Search for www.w3.org
TAP Application: Internet Wet Lab
In many sciences, more data will be produced in the next 2 years than exists today
Increasingly, research consists of writing programs that mine this data
Data is isolated as islands in different labs
Data from one lab not easily available to programs in another lab
We want to use TAP to create a single virtual net-wide “database” containing all this experimental data
KBs, source-code, etc. freely available (via BSD license)
A number of new projects starting up … places, entertainment, …
We invite you to join
URL: http://tap.stanford.edu/
TAP: Summary
Small set of guidelines that create a coherent semantic web out of disparate web services
Potential solution to naming problem
Relevant to all web services
Semantic Search & Internet Wet Lab as driving applications
TAP is a research project
A lot of fundamental work remains to be done
Everything freely available. We want you to join!
Questions
[Figure: the integrated graph extended with a background KB. Nodes: Tori Amos, Newton, NC, North Carolina, USA, Crucify, Under The Pink, EMI, Atlantic. Types: Musician, City, US State, Country, Music Album. Arc labels: Date Of Birth (“8/22/63”), birthplace, Author, publisher, Located in, temperature (62 F), instanceof. Sources: Weather channel, Bg KB, People Magazine, CDNow, Geo Almanac.]
TAP : Summary
Focus is shifting from just storing and retrieving data to exchanging data. XML provides syntax. We need semantics.
We need an infrastructure layer for semantics
Applications drive infrastructures. The driving application for this layer is Semantics-based Search & News Augmentation.
What is an Internet Infrastructure Layer?
There is a data structure, pieces of which are in different places on the net
DNS: Hash table of host names to IP addresses accessed via GetHostByName
WWW : Directed graph of documents accessed via HTTP GET/POST
Infrastructure layer provides a set of standards & APIs to unify the different pieces so that a client can pretend it is all local
Application 2 : RTA for news articles
RTA for News Articles
[Figure: a news article goes through Text analysis in the Search/Syndication Front End, which finds SportsTeam_TexasRangers, AthleteRodriguez_Alex, … The Knowledge Base and Directory are asked “Whose team schedule? posters? auctions? bio?”. Requests then go to EBay, AOL Shopping, AllPosters and MLB.com: “Team Schedule for team whose title …”, “Poster for …”, “Videos for …”, “Auctions for …”.]