Open Calais For SF And LA Meetups

Here is the deck we shared with the SF and LA Semantic Web Meetups this past week (March, '09). It covers Calais 4.0 and its connection to the Linked Data cloud. Please join us at OpenCalais.com

Transcript

  • 1. Calais Thomson Reuters Calais Initiative
  • 2. Overview • Going to discuss five basic topics – What is Calais? – Why we’re doing it & what our goals are – How it works / What’s under the hood? – A few examples – Where it’s headed
  • 3. Calais… • Calais extracts smart metadata from unstructured text and links that metadata to the Linked Data cloud.
  • 4. Calais progress to date • Launched in late January, 2008 • 9,500 developers have joined OpenCalais.com • 1-3 million content ‘transactions’ per day • Delivered four major update releases • Free (as in free) for commercial or non-commercial use
  • 5. [Diagram] 1 Unstructured text • 2 Calais extracts entities, facts and events • 3 Metadata returned to the user with keys • 4 Keys provide access to the Calais Linked Data cloud • 5 Which provides information and other Linked Data pointers • 6 To a range of open and partner Linked Data assets, including Thomson Reuters
  • 6. Quick Demo You can find the Calais Viewer demonstration tool here: http://viewer.opencalais.com (Note that the Calais Viewer is not the Calais service. It is merely a demonstration of how the service works.) – Copy and paste the text of a business news article from AP, Dow Jones or Reuters.com into the viewer, and press submit. The article is sent to the Calais engine, which tags the content and returns it marked up. – The tags appear on the left-hand rail, and you can click on the plus (+) sign to expand them. – Since we are now on Calais 4.0, you can also use the viewer to see the Linked Data assets related to the tags Calais returns. • Click on a company name on the left-hand rail to find a Calais summary page featuring a basic description of that company, as well as a number of links. • Follow those links to see the other data entries on that company that are available for public use in the Linked Data cloud. – For example, here is the Calais summary page for IBM: http://d.opencalais.com/er/company/ralg-tr1r/9e3f6c34-aa6b-3a3b-b221-a07aa7933633.html – And here is the summary page for IBM in DBpedia (Wikipedia translated into computer language): http://dbpedia.org/page/IBM
  • 7. Why & What 1. Derive semantic metadata from textual assets 2. Use that semantic metadata to create entry points into the linked data ecosystem 3. Provide a simple mechanism for the sharing of semantic metadata about textual content assets 4. And just why are you doing this…
  • 8. 1: Semantics from Text: The Text Problem • People consume text • Most of it isn’t semantically enabled • Most of it won’t be semantically enabled • This isn’t about standards – microformats vs. RDFa vs. whatever. • Why: Latency, cost and short shelf-life
  • 9. 1: Semantics from Text: The Text Problem • Target areas where: – The economics don’t support metadata creation – The value of metadata is potentially high – The value of aggregated metadata is potentially extremely high • [Chart: Shelf Life vs. Latency, both axes running from Seconds to Years; plotted content types: Great Novels, Scient. Pubs, Legacy News, New Gen News, Tweets]
  • 10. 2: Getting from Text to the Linked Data Ecosystem
  • 11. The Linked Data Cloud
  • 12. 3: Semantic Metadata Transport Layer • I’m a content producer. We’ve loaded the car with rich semantic metadata – I’m sharing it within my four walls – How do I transport it to my consumers? – RSS / Atom, XML, Proprietary data feeds, Content APIs
  • 13. 4: Why We’re Doing It • Two simple answers: – Hyper-evolution of capabilities – better, faster, stronger – The walled garden content world
  • 14. How it Works – Under the Hood of Calais
  • 15. How it Works – Under the Hood of Calais • [Architecture diagram; labeled components: Calais Web Service, ClearForest NLP Engine, Rule Base, Lexicons, Stat Tools, Document Level Metadata, Entity Level Disambig., Metadata Management, Reference Data Assets, Linked Data and …, Output Formatting, RDF]
  • 16. Where From Here? • We’ve seen examples of first generation uses. • Where does this go in the future? • Beyond the document – Social Resume analysis – Museum Content Coalitions – Knowledge Management Applications – Investigative Journalism*
  • 17. Investigative Journalism • [Diagram: FOIA Contract Documents → Calais Web Service → Company:Contract, Company:Affiliation; News → Calais Web Service → Company:Person, FamilyRelation; both feed a Big Fuzzy Graph]
  • 18. What’s in the Pipeline? • 2009 (this is a fuzzy list) – Person disambiguation @ domain level? – Other disambiguation – Continued expansion of URIs (entities & events) – Calais as hub – Exposure of the IDE? – User managed lexicons – Languages – Opt-in SPARQL Endpoint?
  • 19. • www.opencalais.com – Gallery – code and application examples – Forums – Documentation • Twitter @opencalais, Facebook Group
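
The flow on slide 5 (unstructured text in, entities out, each carrying a key that points into Linked Data) can be illustrated with a toy stand-in. This is not the Calais engine or its API: the `LEXICON` table, the `extract_entities` function, and the `http://example.org/...` keys are all hypothetical, and a plain substring lookup stands in for the real NLP extraction.

```python
from dataclasses import dataclass

# Hypothetical lexicon: surface form -> (entity type, made-up key).
# A real service would use NLP rules and lexicons, not substring matching.
LEXICON = {
    "IBM": ("Company", "http://example.org/er/company/ibm"),
    "Thomson Reuters": ("Company", "http://example.org/er/company/thomson-reuters"),
}


@dataclass(frozen=True)
class Entity:
    name: str
    type: str
    key: str  # URI a client could follow to reach linked data about the entity


def extract_entities(text):
    """Return the lexicon entities found in text, in order of first appearance."""
    found = [
        Entity(name, etype, key)
        for name, (etype, key) in LEXICON.items()
        if name in text
    ]
    return sorted(found, key=lambda e: text.index(e.name))


article = "Thomson Reuters said on Monday that IBM would join the project."
entities = extract_entities(article)
for entity in entities:
    print(entity.type, entity.name, entity.key)
```

The point of the sketch is the shape of the result: each extracted item comes back with a key, so the caller can move from a name in a document to a resource elsewhere.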
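
The IBM summary-page link on slide 6 can be taken apart with the Python standard library. This is only a sketch: the reading of the path as `/er/<entity type>/<resolver>/<identifier>.html` is an assumption inferred from that single URL, not a documented scheme, and `split_calais_summary_url` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def split_calais_summary_url(url):
    """Split a summary-page URL into host, assumed entity type, and identifier.

    Assumes a four-segment path shaped like the slide's IBM example.
    """
    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]
    if len(segments) != 4:
        raise ValueError("unexpected path shape: " + parsed.path)
    last = segments[-1]
    # Drop the .html suffix to leave the bare identifier.
    identifier = last[: -len(".html")] if last.endswith(".html") else last
    return {"host": parsed.netloc, "entity_type": segments[1], "id": identifier}


ibm = split_calais_summary_url(
    "http://d.opencalais.com/er/company/ralg-tr1r/"
    "9e3f6c34-aa6b-3a3b-b221-a07aa7933633.html"
)
print(ibm)
```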
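
The "Big Fuzzy Graph" idea on slide 17 can be sketched as merging relations extracted from two document sets and then searching for a chain that connects two names. The relation labels come from the slide; the company, contract, and person names are invented for illustration, and `build_graph` and `find_path` are hypothetical helpers, not part of any Calais tooling.

```python
from collections import defaultdict, deque

# Invented sample relations, shaped as (subject, relation, object).
foia_relations = [
    ("Acme Corp", "Company:Contract", "Contract 42"),
    ("Acme Corp", "Company:Affiliation", "Beta Holdings"),
]
news_relations = [
    ("Beta Holdings", "Company:Person", "Jane Doe"),
    ("Jane Doe", "FamilyRelation", "John Doe"),
]


def build_graph(*relation_sets):
    """Merge relation sets into one undirected adjacency map."""
    graph = defaultdict(set)
    for relations in relation_sets:
        for subj, rel, obj in relations:
            graph[subj].add((rel, obj))
            graph[obj].add((rel, subj))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; returns the shortest chain of names, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _rel, neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(foia_relations, news_relations)
print(find_path(graph, "Contract 42", "John Doe"))
```

Neither source alone links the contract to the second person; the chain only appears once the two relation sets share a graph, which is the use the slide points at.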