ESWC SS 2012 - Friday Keynote Chris Welty: Inside the Mind of Watson
Presentation Transcript

  • Inside the mind of Watson Chris Welty IBM Research ibmwatson.com Do Not Record. Do Not Distribute. © 2011 IBM Corporation
  • The Core Technical Team*: Researchers and Engineers in NLP, ML, IR, KR&R and CL at IBM Labs and a growing number of universities
  • Automatic Open-Domain Question Answering: a long-standing challenge in Artificial Intelligence to emulate human expertise
    Given
    – Rich Natural Language Questions
    – Over a Broad Domain of Knowledge
    Deliver
    – Precise Answers: Determine what is being asked & give a precise response
    – Accurate Confidences: Determine the likelihood that the answer is correct
    – Consumable Justifications: Explain why the answer is right
    – Fast Response Time: Precision & Confidence in <3 seconds
  • What is Jeopardy?
    – Jeopardy! is an American quiz show (1964 – today)
    – Answer-and-question format: contestants are presented with clues in the form of answers and must phrase their responses in question form
    – Example
        Category: General Science
        Clue: When hit by electrons, a phosphor gives off electromagnetic energy in this form
        Answer: What is light?
  • The Jeopardy! Challenge: hard for humans, hard for machines, but hard for different reasons
    Requirements: Broad/Open Domain, Complex Language, High Precision, Accurate Confidence, High Speed
    For people, the challenge is knowing the answer. For machines, the challenge is understanding the question.
    – $200: If you are looking at the wainscoting, you are looking in this direction. (What is down?)
    – $600: In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus. (What is cytoplasm?)
    – $800: The conspirators against this man were wounded by each other while they stabbed at him. (Who is Julius Caesar?)
    – $1000: The first person mentioned by name in ‘The Man in the Iron Mask’ is this hero of a previous book by the same author. (Who is D’Artagnan?)
  • What It Takes to compete against Top Human Jeopardy! Players: Our Analysis Reveals the Winner’s Cloud
    [Figure: each dot is an actual historical human Jeopardy! game. Labeled regions: Winning Human Performance, Grand Champion Human Performance, 2007 QA Computer System. Axis annotation: More Confident / Less Confident.]
    Top human players are remarkably good.
  • What It Takes to compete against Top Human Jeopardy! Players: Our Analysis Reveals the Winner’s Cloud
    [Same figure as the previous slide.]
    Computers? Not So Good. In 2007, we committed to making a Huge Leap!
  • Welty’s Trident
    – A new software paradigm is emerging: increasingly, computational tasks require inexact solutions that combine multiple methods in unpredictable ways
    – Knowledge is not the destination: Watson does not answer a question by translating natural language input into formally represented knowledge and simply running queries against this knowledge
    – Machine intelligence is not human intelligence: the difference is most notable in the mistakes they make
  • DeepQA: The Technology Behind Watson (an example of a new software paradigm)
    DeepQA generates and scores many hypotheses using an extensible collection of Natural Language Processing, Machine Learning and Reasoning Algorithms. These gather and weigh evidence over both unstructured and structured content to determine the answer with the best confidence. Learned Models help combine and weigh the Evidence.
    [Figure: DeepQA architecture. Question → Question & Topic Analysis → Question Decomposition → Hypothesis Generation (Primary Search over Answer Sources, Candidate Answer Generation) → Hypothesis and Evidence Scoring (Answer Scoring, Evidence Retrieval from Evidence Sources, Deep Evidence Scoring) → Synthesis → Final Confidence Merging & Ranking (Models) → Answer & Confidence]
  • Example Question: In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city
    Question Analysis
    – Keywords: 1894, C.W. Post, created …
    – Lexical Answer Type: (Michigan city)
    – Date(1894)
    – Relations: Create(Post, cereal drink) …
    Primary Search over Related Content (Structured & Unstructured), then Candidate Answer Generation
    – Candidates: General Foods, 1985, Post Foods, Battle Creek, Grand Rapids, …
    Evidence Retrieval and Evidence Scoring
    – Each candidate receives a vector of evidence feature scores, e.g. [0.58 0 -1.3 … 0.97], [0.84 1 10.6 … 0.21]
    Merging & Ranking
    – 1) Battle Creek (0.85)  2) Post Foods (0.20)  3) 1985 (0.05)  …
  • Hypothesis Scoring
    Category: MICHIGAN MANIA
    Clue: In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city
    Answer Scorers (TyCor, Temporal, Spatial, Popularity, …) can be applied depending on the relations or constraints detected in the question. For example, the focus of this question, with its modifiers, is “Michigan city.” Watson can detect this as a geospatial relation indicating that the correct answer must be a city located within the state of Michigan.
    Evidence Feature Scores (Answer Scoring + Passage Scoring):
    Candidate Answer     Doc Rank  Pass Rank  TyCor  Geo
    General Foods        0         1          0.1    0
    Post Foods           2         1          0.1    0
    Battle Creek         1         2          0.8    1
    Will Keith Kellogg   3                    0.1    0
    Grand Rapids                              0.9    1
    1895                           0          0.0    0
  • Passage Scoring
    Category: MICHIGAN MANIA
    Clue: In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city
    In Deep Evidence Scoring, Watson retrieves evidence for each candidate answer, then evaluates the evidence using a large number of deep evidence scoring analytics. The evidence for a candidate answer may come from the original document or passage where the candidate answer was generated, or it may come from an evidence retrieval search performed by taking the keyword search query from Step 2, replacing the focus terms with the candidate answer, and retrieving the relevant passages that are found. The passages, or “context”, in which the candidate answer occurs are evaluated as evidence to support or refute the candidate answer as the correct answer for the question.
    Candidates shown: General Foods, Battle Creek, 1895, Post Foods
    Example passages:
    – 1895: In Battle Creek, Michigan, C.W. Post made the first POSTUM, a cereal beverage. Post created GRAPE-NUTS cereal in 1897, and POST TOASTIES corn flakes in 1908
    – C.W. Post came to the Battle Creek sanitarium to cure his upset stomach. He later created Postum, a cereal-based coffee substitute
    – 1854 C. W. Post (Charles William) was born. He founded the Postum Cereal Co. in 1895 (renamed General Foods Corp. in 1922) to manufacture Postum cereal beverage
    – General Foods' products go from breakfast (Post's cereals) to warm nightcaps (Postum, Sanka), also wash the pots and pans that its foods are cooked in (S.O.S. Scouring Pads
    – The company was incorporated in 1922, having developed from the earlier Postum Cereal Co. Ltd., founded by C.W. Post (1854-1914) in 1895 in Battle Creek, Mich. After a number of experiments, Post marketed his first product, the cereal beverage called Postum, in 1895
    – Post Foods, LLC, also known as Post Cereals (formerly Postum Cereals) was founded by C.W. Post. It began in 1895 with the first Postum, a "cereal beverage", developed by Post in Battle Creek, Michigan. The first cereal, Grape-Nuts, was developed in 1897
    – It was named after C. W. Post, the founder of the Postum Cereal Company that later became General Foods. The cereal company unit was later sold off and is now Post Foods
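The evidence retrieval step described on this slide can be sketched roughly as follows. This is a minimal illustration, not Watson's implementation: the function names `evidence_query` and `term_match`, the keyword list, and the simple term-counting score are all assumptions made for the example.

```python
def evidence_query(keywords, focus_terms, candidate):
    """Drop the focus terms from the keyword query and add the candidate answer."""
    focus = {term.lower() for term in focus_terms}
    kept = [kw for kw in keywords if kw.lower() not in focus]
    return kept + [candidate]


def term_match(query, passage):
    """Count how many query terms occur in the passage (case-insensitive).

    A deliberately naive stand-in for the passage scorers; real scorers
    are far more elaborate.
    """
    text = passage.lower()
    return sum(1 for term in query if term.lower() in text)


# Keywords loosely taken from the Postum clue; "Michigan city" is the focus.
keywords = ["1894", "C.W. Post", "created", "cereal", "drink", "Postum",
            "Michigan", "city"]
passage = ("1895: In Battle Creek, Michigan, C.W. Post made the first POSTUM, "
           "a cereal beverage.")

for candidate in ["Battle Creek", "Grand Rapids"]:
    query = evidence_query(keywords, ["Michigan", "city"], candidate)
    print(candidate, term_match(query, passage))
```

With this passage, the query built for Battle Creek matches one more term than the query built for Grand Rapids, which is the kind of signal a passage scorer contributes as one feature among many.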
  • Merging Candidate Answers and Scoring the Confidence
    Category: MICHIGAN MANIA
    Clue: In 1894 C.W. Post created his warm cereal drink Postum in this …
    In the final processing step, Watson detects variants of the same answer and merges their feature scores together. Watson then computes the final confidence scores for the candidate answers by applying a series of Machine Learning models that weight all of the feature scores to produce the final confidence scores.
    Evidence Feature Scores:
    Candidate Answer     Doc Rank  Pass Rank  TyCor  Geo  LFACS  Term Match  Temporal
    General Foods        0         1          0.1    0    0.2    22          1
    Post Foods           2         1          0.1    0    0.4    41          1
    Battle Creek         1         2          0.8    1    0.5    30          0.9
    Will Keith Kellogg   3                    0.1    0    0      23          0.5
    Grand Rapids                              0.9    1    0      10          0.5
    1895                           0          0.0    0    0      21          0.6
    Final Answers after Machine Learning Model Application (confidence):
    Battle Creek 0.946 (correct answer); Post Foods 0.152; 1895 0.040; General Foods 0.033; Grand Rapids 0.014
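A rough sketch of the merge-then-score step, under stated assumptions: the variant "Battle Creek, Michigan", the max-based merge rule, and the weights and bias are hypothetical choices for illustration only. The slides do not say how Watson merges feature scores or what its learned models look like. A single logistic model stands in for the "series of Machine Learning models".

```python
import math
from collections import defaultdict

# Hypothetical weights and bias; the real learned models are not given in the slides.
WEIGHTS = {"tycor": 2.0, "geo": 3.0, "temporal": 0.5}
BIAS = -3.0


def merge_variants(candidates, canonical):
    """Group answer variants under one canonical name, keeping each feature's maximum."""
    merged = defaultdict(dict)
    for name, features in candidates:
        key = canonical.get(name, name)
        for feat, value in features.items():
            merged[key][feat] = max(value, merged[key].get(feat, value))
    return dict(merged)


def confidence(features):
    """Logistic combination of feature scores into a value between 0 and 1."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


candidates = [
    ("Battle Creek", {"tycor": 0.8, "geo": 1.0, "temporal": 0.9}),
    ("Battle Creek, Michigan", {"tycor": 0.6, "geo": 1.0}),  # hypothetical variant
    ("Post Foods", {"tycor": 0.1, "geo": 0.0, "temporal": 1.0}),
]
canonical = {"Battle Creek, Michigan": "Battle Creek"}

merged = merge_variants(candidates, canonical)
ranked = sorted(merged, key=lambda name: confidence(merged[name]), reverse=True)
for name in ranked:
    print(name, round(confidence(merged[name]), 3))
```

The confidences printed here come from the made-up weights and will not match the values on the slide; only the shape of the computation is meant to carry over.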
  • “Minimal” Deep QA Pipeline
    Category: MICHIGAN MANIA
    Clue: In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city
    – Question Analysis: LAT = Michigan City
    – Primary Search, document search results (rank, title): 0 General Foods; 1 Battle Creek; 2 Post Foods; 3 Will Keith Kellogg
    – Hypothesis Generation, candidate answers: General Foods, Post Foods, Battle Creek
    – Hypothesis and Evidence Scoring, evidence features (TyCor, Geo): General Foods 0.1, 0; Post Foods 0.1, 0; Battle Creek 0.8, 1
    – Final Confidence Merging & Ranking, final answers (confidence): Battle Creek 0.946; Post Foods 0.152; 1895 0.040
  • A new software paradigm emerging (not that we invented it)
    – The basic Watson computation is Hypothesis Scoring: how well does an answer fit into a question?
    – More than 100 different hypothesis scoring software components
    – No single scoring component does the whole job
    – Many of them do very similar jobs: 12 typing components, 8 passage alignment components, 10 ngram components, …
    – These components are not integrated with each other beyond that they each produce a score for each hypothesis
    – A machine learning algorithm learns how to combine them to produce a final score
    – The development methodology involved an incremental approach of producing stable baseline systems and testing changes with “follow-ons”
    – Changes that improve performance according to our metrics are accepted into the next stable baseline
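The idea that independent scorers each emit a number and a learned model combines them can be sketched as below. This is a toy under loud assumptions: plain logistic regression trained by batch gradient descent stands in for whatever learner Watson used, only three scorer columns (TyCor, Geo, LFACS) are included, and a single question is used as training data, with feature values read from the Postum example slides.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, labels, steps=5000, lr=1.0):
    """Fit logistic-regression weights by batch gradient descent (toy learner)."""
    n = len(rows[0])
    w, b = [0.0] * n, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b


def score(x, w, b):
    """Final score for one hypothesis given its scorer outputs."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


# One row per candidate: [TyCor, Geo, LFACS], taken from the example slides.
names = ["General Foods", "Post Foods", "Battle Creek",
         "Will Keith Kellogg", "Grand Rapids", "1895"]
rows = [[0.1, 0, 0.2], [0.1, 0, 0.4], [0.8, 1, 0.5],
        [0.1, 0, 0.0], [0.9, 1, 0.0], [0.0, 0, 0.0]]
labels = [0, 0, 1, 0, 0, 0]  # only Battle Creek is correct

w, b = train(rows, labels)
best = max(names, key=lambda n: score(rows[names.index(n)], w, b))
print(best)
```

No single column separates Battle Creek from Grand Rapids here (their TyCor and Geo scores are nearly identical), so the learner has to lean on the combination, which is the point the slide makes about no one component doing the whole job.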
  • Follow-on development: + ~10%
  • Incremental Baselines
    [Figure: Precision (0% to 100%) vs. % Answered (0% to 100%), one curve per stable baseline: Baseline, 12/2007, 5/2008, 8/2008, 12/2008, 5/2009, 10/2009, 4/2010, 11/2010]
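A curve of precision against % answered can be computed from per-question confidences roughly as follows. This is a sketch assuming the system answers only its most confident questions; the function name `precision_at_answered` and the eight result pairs are made up for illustration and are not Watson data.

```python
def precision_at_answered(results, fraction):
    """Precision when only the most confident `fraction` of questions is answered.

    `results` is a list of (confidence, is_correct) pairs, one per question.
    """
    ranked = sorted(results, key=lambda r: r[0], reverse=True)
    k = max(1, round(fraction * len(ranked)))
    answered = ranked[:k]
    return sum(1 for _, ok in answered if ok) / k


# Hypothetical results for eight questions: (confidence, answered correctly?).
results = [(0.95, True), (0.90, True), (0.70, True), (0.60, False),
           (0.40, True), (0.30, False), (0.20, False), (0.10, False)]

for fraction in (0.25, 0.5, 1.0):
    print(f"{fraction:.0%} answered: precision {precision_at_answered(results, fraction):.2f}")
```

When confidence is well calibrated, precision is high at low % answered and falls as the system is forced to answer more questions, which gives the downward-sloping curves in the figure.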
  • ClassicQA: NOT The Technology Behind Watson
    From the dawn of AI, it was envisioned that question answering would work by having a process that completely translated natural language (content & questions) into an unambiguous (logical) representation, and a reasoning process would run on that representation to produce answers. This vision has never been realized.
    [Figure labels: Question, Answer Sources, Primary Search, GOFNLP, Formal Query, Formal Knowledge, Logical Reasoner, Answer & Confidence]
  • into the Gap [Figure labels: Language, Knowledge, NLP, Semantic Technology, Recall, Precision, Mentions, Scale, Brittleness, Acquisition; annotation: FAIL]
  • into the Gap [Same figure; annotation: No!]
  • into the Gap [Same figure; annotation: Knowledge is not the destination]
  • into the Gap [Figure labels: Language, IR, LF, NER, ML, Crowds, SemTech, Parsing, Task (e.g. QA)]
  • Using Structured Evidence
    – Exploit wealth of freely available structured information, e.g. Linked Open Data (LOD): Types, Relations, Links
    – Complement results from unstructured text analysis: classic Precision vs. Recall tradeoff
    – Useful for explanation data: precise and reliable evidence (e.g. spatial / temporal constraint match)
  • Structured Data and Inference in Watson
    – Relation Detection and Scoring Using Structured KBs: Q: “This 1997 Titanic hero..” matches <Dicaprio, lead-actor, Titanic>
    – Answer Typing (Type Coercion): LAT: Scottish Inventor; Answer: James Watt
    – Anti-Type Coercion: LAT: Country; Candidate: Einstein
    – Answer In Clue: Q: “In 2003, ‘Big Blue’ acquired this company..” → Downweigh IBM
    – Spatial Reasoning: Containment (“This African country..”); Relative direction (“This sea east of Florida..”); Border (“This state bordering the Great Lakes..”); Relative location (“bldg. near Times Square..”); Numeric Properties: area/population/height (“This sea, largest in area,..”)
    – Temporal Reasoning: Lifespan, Duration
    – LAT Inference (Using PRISMATIC): Q: “Annexation of this in 1803..”; “this” → Region
    – Evidence Diffusion: Q: “Sunan Intl. Airport is in this country”; diffuse evidence from (Pyongyang ->> N Korea)
    [Figure: these components overlaid on the DeepQA architecture, from Question & Topic Analysis through Final Confidence Merging & Ranking to Answer & Confidence]
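Three of the structured checks listed on this slide (type coercion, anti-type coercion, spatial containment) can be sketched against a tiny triple set. Everything here is an assumption for illustration: the triples are hand-written stand-ins for a structured source such as Linked Open Data, the predicate names `type` and `locatedIn` are invented, and the 0/1 scores simplify what would in practice be graded evidence.

```python
# Hand-written (subject, predicate, object) triples; a hypothetical stand-in for LOD.
TRIPLES = {
    ("Battle Creek", "type", "City"),
    ("Battle Creek", "locatedIn", "Michigan"),
    ("Grand Rapids", "type", "City"),
    ("Grand Rapids", "locatedIn", "Michigan"),
    ("Post Foods", "type", "Company"),
    ("Einstein", "type", "Person"),
}

# Pairs of types assumed to be mutually exclusive (hypothetical).
DISJOINT = {("Person", "Country")}


def types_of(entity):
    """All types asserted for an entity in the triple set."""
    return {o for s, p, o in TRIPLES if s == entity and p == "type"}


def tycor(candidate, lat):
    """Type coercion: 1.0 if the candidate is known to have the lexical answer type."""
    return 1.0 if lat in types_of(candidate) else 0.0


def anti_tycor(candidate, lat):
    """Anti-type coercion: 1.0 if a known type of the candidate rules out the LAT."""
    return 1.0 if any((t, lat) in DISJOINT for t in types_of(candidate)) else 0.0


def contained_in(candidate, region):
    """Spatial containment: 1.0 if the candidate is known to lie inside the region."""
    return 1.0 if (candidate, "locatedIn", region) in TRIPLES else 0.0


for name in ["Battle Creek", "Post Foods"]:
    print(name, tycor(name, "City"), contained_in(name, "Michigan"))
print("Einstein", anti_tycor("Einstein", "Country"))
```

Each function yields one feature score per candidate, so these checks slot into the same scoring table as the text-based scorers rather than deciding the answer on their own.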
  • LOD Impact on DeepQA for Typing Answers: + ~10%
    [Figure: chart for an ensemble of TyCor components; axis from 61.5% to 66.5%]
  • And the winner is… not human
    [Figure: Precision (0% to 100%) vs. % Answered (0% to 100%)]
  • IBM Research. Category: FATHERLY NICKNAMES. Clue: THIS FRENCHMAN WAS "THE FATHER OF BACTERIOLOGY”. Response: HOW TASTY WAS MY LITTLE FRENCHMAN © 2009 IBM Corporation
  • IBM Research. Category: THERE'S A FIRST TIME FOR EVERYTHING. Clue: IN 1824 THIS FIRST FOREIGNER TO ADDRESS A JOINT SESSION OF CONGRESS CONGRATULATED THE U.S. ON ITS GROWTH. Response: President Bush
  • IBM Research. Category: MUSIC. Clue: WHAT IS THE TEXT OF AN OPERA CALLED? Response: Michael
  • Category: HAPPY MEALS. Clue: GRASSHOPPERS EAT PRIMARILY THIS. Response: Kosher
  • Category: OLYMPIC ODDITIES. Clue: It was the anatomical oddity of U.S. gymnast George Eyser, who won a gold medal on the parallel bars in 1904. Response: Had only one hand
  • CONFIRMED KEYNOTE: TOM MALONE, MIT PAPER DEADLINES: MID JUNE iswc2012.semanticweb.org