Your SlideShare is downloading. ×
Sis sat 1000 josh dreller
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Sis sat 1000 josh dreller

679
views

Published on

Published in: Technology, Education

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
679
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
11
Comments
0
Likes
0
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide
  • In order to know we are making progress on scientific problems like open-domain QA well-defined challenges help demonstrate we can solve concrete & difficult tasks. As you might know Jeopardy! Is a long-standing, well-regarded and highly challenging Television quiz show in the US that demands human contestants to quickly understand and answer richly expressed natural language questions over a staggering array of topics.The Jeopardy! Challenge uniquely provides a palpable, compelling and notable way to drive the technology of Question Answering along key dimensionsIf you are familiar with the quiz show it asks an I incredibly broad range of questions over a huge variety of topics.In a single round there is a grid of 6 Categories and for each category 5 rows with increasing $ values. Once a cell is chosen by 1 of three players, A question, or what is often called a Clue is revealed.Here you see some example questions.<read some of the questions> Jeopardy uses complex and often subtle language to describe what is being asked. To win you have to be extraordinarily precise. You must deliver the exact answer – no more and no less – it is not good enough for it be somewhere in the top 2, 10 or 20 documents – you must know it exactly and get it in first place – otherwise no credit – in fact you loose points. You must demonstrate Accurate Confidences -- That is -- you must know what you know – if you “buzz –in” and then get it wrong you lose the $$ value of the question. And you have to do this all very quickly – deeply analyze huge volumes of content, consider many possible answers, compute your confidence and buzz in – all in just seconds.As we shall see compete with human champions at this game represents a Grand Challenge in Automatic Open-Domain Question Answering.<STOP><NEXT SLIDE>
  • Computer programs are natively explicit and exacting in their calculations over numbers and symbols.But Natural Language - -the words and phrases we humans use to communicate with one another -- is implicit -- the exact meaning is not completely and exactly indicated -- but instead is highly dependent on the context -- what has been said before, the topic, how it is being discussed -- factually, figuratively, fictionally etc.Moreover, natural language is often imprecise – it does not have to treat a subject with numerical precision…humans naturally interact and operate all the time with different degrees of uncertainty and fuzzy associations between words and concepts. We use huge amounts of background knowledge to reconcile and interpret what we read.Consider these examples….it is one thing to build a database table to exactly answer the question “Where is someone born?”. The computer looks up the name in one column and is programmed to know that the other column contains the birth place. STRUCUTRED information, like this database table, is designed for computers to make simple comparisons and to be exactly as accurate as the data entered into the database. Natural language is created and used by humans for humans. A reason we call natural language “Unstructured” is because it lacks the exact structure and meaning that computer programs typically use to answer questions. Understanding what is being represented is a whole other challenge for computer programs .Consider this sentence <read> It implies that Albert Einstein was born in Ulm – but there is a whole lot the computer has to do to figure that out any degree of certainty - it has to understand sentence structure, parts of speech, the possible meaning of words and phrases and how they related to the words and phrases in the question. What does a remembrance, a water color and an Otto have to do with where someone was born.Consider another question in the Jeopardy Style … X ran this? And this potentially answer-bearing sentence. Read the Sentence…Does this sentence answer the question for Jack Welch - -what does “ran” have to do with leadership or painting. How would a computer confidently infer from this sentence that Jack Welch ran GE – might be easer to deduce that he was at least a painter there.
  • Human performance is one of the things that makes the Jeopardy! Challenge so compelling. The best humans are very, very good at this task. In this chart, each dot corresponds to actual historical Jeopardy! games and represents the performance of the winner of those games. We refer to this cluster of dots as the “Winners Cloud”. For each dot, the X-axis, along the bottom of the graph, represents the percentage of questions in a game that the winning player got a chance to ANSWER. These were the questions he or she was confident enough and fast enough to ring-in or buzz in for FIRST. The Y-axis, going up along the left of the graph, represents the winning player’s PRECISION – that is, the percentage of those questions answered the player got RIGHT. CORemember, if a player gets a question wrong then they lose the $ value of the clue and their competitors still get a chance to answer or rebound. But what we humans, tend to do really, really well is – confidently know what we know – computing an accurate confidence turns out to be key ability for winning at Jeopardy! Looking at the center of the green cloud, what you see is that, on average, WINNERS are confident enough and fast enough to answer nearly 50% of the questions in a game and do somewhere between 85% and 95% precision on those questions. That is, they get 85-95% of the ones they answer RIGHT. The red dots represents Ken Jennings's performance. Ken won 74 consecutive games against qualified players. He was confident and fast enough to acquire 60% and even up to 80% of a game’s questions from his competitors and still do 85% and 95% precision on average.Good Jeopardy! players are remarkable in their breadth, precision, confidence and speed. 
  • Transcript

    • 1. IBM’s Watson: Is the World’s Trivia Champion the Future of Search?
      Josh Dreller
      VP Media Technology & Analytics
      Fuor Digital @fuordigital
    • 2.
    • 3.
    • 4. Two Types of Innovation
      Incremental – improving current projects and advancing them in a linear fashion
      Grand Challenges – pushing the limits of science
      Must be Difficult (has to be a challenge)
      Must be Inspiring
      Irresistible Vision – “just has to be done”
    • 5. Previous IBM Grand Challenges
    • 6. Open Question Answering
      The way normal humans communicate
      Natural Language – very ambiguous, but at the heart of human intelligence
      Last night I shot an
      elephant in my pajamas.
      How it got into my
      pajamas I’ll never know.
    • 7. The Ultimate Advancement:Computers that can communicate with humans in Natural Language
    • 8. What Can Search Learn From Watson?
      The only way to push forward is to take huge leaps and look for self-imposed challenges even if we can’t prove out the business case right now
      What kind of Grand Challenges could Search create?
      A non-spammable Search Engine?
      No need for Search Engine Optimization?
      ??
      8
    • 9.
    • 10. The Jeopardy! Challenge: A compelling and notable way to drive and measure the technology of automatic Question Answering along 5 Key Dimensions
      Broad/Open Domain
      $200
      If you're standing, it's the direction you should look to check out the wainscoting.
      Complex Language
      $1000
      Of the 4 countries in the world that the U.S. does not have diplomatic relations with, the one that’s farthest north
      High Precision
      $800
      In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus
      Accurate
      Confidence
      High
      Speed
    • 11. Jeopardy Reaction
      Not too inspired at first
      Weren’t interested in a circus sideshow
      2009, IBM setup a moch-studio at their New York research facility
      Sparring matches with ex-Jeopardy winners
      Eventually saw the potential and thought it was something special
    • 12. A Very Simple Question for a Computer
      In (( 12,546,798 * P ) ^ 2) / 34,567.46 = ?
      Greater than or less than 1?
      50/50 Shot
      = .00885
    • 13. Real Language is Real Hard
      Chess
      A finite, mathematically well-defined search space
      Limited number of moves and states
      Grounded in explicit, unambiguous mathematical rules
      Human Language
      Ambiguous, contextual and implicit
      Grounded only in human cognition
      Seemingly infinitenumber of ways to
      express the same meaning
    • 14. The Opposite of Current Computer Language
      Questions not designed for a computer to answer
      Slang
      Crafty questions
      Shorthand
      Rhyme
      Regionalism
      Anagrams
      Complex Language!
    • 15. Structured vs. Unstructured Data
      Where was Einstein born?
      Structured
      Unstructured
      One day, from among his city views of Ulm, Otto chose a watercolor to send to Albert Einstein as a remembrance of Einstein´s birthplace.
    • 16. Common Sense Knowledge Base
      An ontology of classes and individuals
      Parts and materials of objects
      Properties of objects (such as color and size)
      Functions and uses of objects
      Locations of objects and layouts of locations
      Locations of actions and events
      Durations of actions and events
      Preconditions of actions and events
      Effects (post conditions) of actions and events
      Subjects and objects of actions
      Behaviors of devices
      Stereotypical situations or scripts
      Human goals and needs
      Emotions
      Plans and strategies
      Story themes
      Contexts
      16
      Can a can CanCan?
    • 17. What Can Search Learn From Watson?
      We need to focus on what computers aren’t good at, not what they are good at
      Keywords and Links are not savvy enough. Natural language is key to a next generation search engine
      Most of human knowledge is kept in unstructured data sources or based on common sense context
      17
    • 18.
    • 19. The DeepQA Project
      Dr. David Ferrucci
      25-30 full time
      researchers from
      many disciplines.
      2007-2011
      Millions of dollars
      Post Jeopardy implications
    • 20. Speed Results
      Deployed Watson on 2,880 IBM POWER 750 computer cores
      Went from 2 hours per question on a single CPU to an average ofjust 3 seconds – fast enough to compete with the best.
    • 21. Example Question
      Related Content
      (Structured & Unstructured)
      Keywords: 1698, comet,
      paramour, pink, …
      AnswerType(comet discoverer)
      Date(1698)
      Took(discoverer, ship)
      Called(ship, Paramour Pink)

      Primary
      Search
      Question
      Analysis
      Candidate Answer Generation

      Temporal
      Taxonomic
      Lexical
      Spatial
      [0.58 0 -1.3 … 0.97]
      Isaac Newton
      [0.71 1 13.4 … 0.72]
      Wilhelm Tempel
      [0.12 0 2.0 … 0.40]
      HMS Paramour
      Edmond Halley (0.85)
      Christiaan Huygens (0.20)
      Peter Sellers (0.05)
      [0.84 1 10.6 … 0.21]
      Christiaan Huygens
      [0.33 0 6.3 … 0.83]
      Halley’s Comet
      [0.21 1 11.1 … 0.92]
      Edmond Halley
      Merging &
      Ranking
      [0.91 0 -8.2 … 0.61]
      Pink Panther
      [0.91 0 -1.7 … 0.60]
      Peter Sellers
      Evidence
      Retrieval

      Evidence
      Scoring
      IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE
    • 22. Confidence is Key
      Watson only rings in if it can reach a statistically significant confidence in time
      Some questions take longer than others
      Some questions will be able to answer less confident than others
      Watson manages risk in betting based on confidence
    • 23. Embarrassingly Parallel Computing
      Def: “Little or no effort is required to separate the problem into a number of parallel tasks”
      Works on many algorithms at once and comes back together with confidence scores.
      different from distributed computing problems (such as Google’s MapReduce) that require communication between tasks, especially communication of intermediate results.
    • 24. What Can Search Learn From Watson?
      We can’t be bound by the constraints of current technology
      Two fold process of first coming up with answers then vetting them with more evidence
      Ideas like Parallel Processing will allow us to jump ahead
      24
    • 25.
    • 26. Ken Jennings & Brad Rutter
    • 27. Top human players are remarkably good.
      Computers?
      The Best Human Performance: Our Analysis Reveals the Winner’s Cloud
      Each dot represents an actual historical human Jeopardy! game
      Winning Human Performance
      Grand Champion Human Performance
      2007 QA Computer System
      More Confident
      Less Confident
      27
    • 28. DeepQA: Incremental Progress in Precision and Confidence 6/2007-11/2010
      Now Playing in the Winners Cloud
      11/2010
      4/2010
      10/2009
      5/2009
      12/2008
      Precision
      8/2008
      5/2008
      12/2007
      Baseline
    • 29. Confidence Bar
    • 30. What Can Search Learn From Watson?
      Confidence bar would be a great addition to SERPs
      We must benchmark what is “good” and then aim higher
      These things take time (and money)
      30
    • 31.
    • 32. TJ Watson Research CenterYorktown, NYTwo Games: Aired February 14-16, 2011
    • 33. The End – Humans Win
    • 34.
    • 35. Financial Industry
      Generates large amounts of data and growing 70% per year
      Not just numbers, but all info that would influence the biz landscape (news, articles, blogs, etc)
      Recent financial crisis shows failures of lack of understanding in interdependencies
    • 36. DeepQA in Continuous Evidence-Based Diagnostic Analysis
      Considers and synthesizes a broad range of evidence improving quality, reducing cost
      Symptoms
      Diagnosis Models
      Meds
      Symp
      Fam
      Hist
      Find
      Confidence
      Renal failure
      Family History
      Patient History
      UTI
      Medications
      Tests/Findings
      Diabetes
      Influenza
      Notes/Hypotheses
      hypokalemia
      esophogitis
      Most Confident Diagnosis: Diabetes
      Most Confident Diagnosis: UTI
      Most Confident Diagnosis: Influenza
      Most Confident Diagnosis: Diabetes and Esophogitis
      Huge Volumes of Texts, Journals, References, DBs etc.
    • 37. What Can Search Learn From Watson?
      Even the most daunting task can be overcome
      It’s not company versus company, it’s stretching human knowledge
      How can search engines help other industries
      37
    • 38. Sources
      “What is Watson?” presentation by Adam Lally, IBM Research
      • Jeopardy website with videos: http://www.jeopardy.com/minisites/watson/
      • 39. NYTimes article: “What Is I.B.M.’s Watson?http://www.nytimes.com/2010/06/20/magazine/20Computer-t.html?_r=2&ref=opinion
      Wired magazine:“IBM’s Watson Supercomputer Wins Practice Jeopardy Round”http://www.wired.com/epicenter/2011/01/ibm-watson-jeopardy/#
      More technical: AI magazine “Building Watson: An Overview of the DeepQA Project”http://www.stanford.edu/class/cs124/AIMagzine-DeepQA.pdf
    • 40. Thank You
      39