Playing Trivia with a Bot

A short description of my "Watson" python bot that plays trivia on IRC - and wins!

Presentation Transcript

  • Playing Trivia with a Bot
    Jose Nazario
  • History
    ● ~2005? Created "#trivia" for my wife
    ● Uses Blitzed Trivia bot, "brainiac", and a 110k question/answer DB
    ● Winter 2012: had an interest in NLP for a potential project
    ● Decided to tackle a "toy problem"
    ● "Let's play trivia!"
  • Goals
    ● Learn NLP via NLTK
    ● Build a bot that can play trivia "competitively"
  • Natural Language Processing (NLP)
    Algorithms that can parse and process human language
    Major field of study related to AI, useful in
    ● Machine translation
    ● Grammar induction
    ● Information extraction
    ● Sentence understanding
  • Challenges & Advantages
    Unlike Jeopardy
    ● Can answer a question wrong without being penalized, and can try multiple times
    ● No puns or wordplay, straightforward questions
    Still ...
    ● Have to have a knowledge base - Google
    ● Have to be able to figure out the right answer
  • Watson Components
    ● Simple IRC library (not irclib)
    ● NLTK - Natural Language Toolkit
    ● Logic: hand crafted
  • Base Assumptions
    ● "Google knows all" - no need to make a local knowledge database
    ● The right answer will be commonly seen, exploit that repetition
  • Watson 1.0
    ~100 LoC, "an evening of futzing around"
    "Strategy"
    1. Read the question
    2. Throw it at Google, get a result page
    3. Find all the proper names (via NLTK) from page titles, rank by frequency
    4. Guess those sequentially
  • Watson 1.0 Results
    ● Very poor performance
    ● Not surprising
  • Watson 2.0
    Written a few days later
    ~300 LoC, "actually had to think this time"
    Strategy
    ● Check a DB of cached questions and answers (from observations), use similar ones if possible
    ● Read question, throw at Google (or Bing)
    ● Figure out what kind of answer is expected, extract matching text via NLTK and scoring
    ● If we get a hint, use it (as a regex)
  • Extracting Answers from Web Pages
    Challenge
    ● Web pages contain a lot of junk around the answer
    ● How do we find the answer in the sea of words?
    Simple strategy - extract proper names!
    (The trivia DB often has proper names for answers)
  • Where is Watson these days?
    00:08 < brainiac> Congratulations to rogueclown who has won this round! What a brain!
    00:08 < brainiac> Final scores:
    00:08 < brainiac> rogueclown: 10
    00:08 < brainiac> watson: 9
    00:08 < brainiac> purge: 2
    irc://
  • Additional Ideas for Watson 2.0
    ● New search engines: Bing, Ask, Wolfram Alpha
    ● Prune knowledge base: weed out useless “answers”
    ● New/different named entity recognition engine
    ● Experiment with scoring algorithms for guesses
  • Disappointments
    ● Only a minor increase in my knowledge of NLP: I did not become an NLP maestro
    ● No one else built a bot: was hoping for a competition
  • Watson 3 .. sorta in the works
    Ideas
    ● Natural language interface to semantic web (e.g. QuestIO, Quepy), SPARQL endpoints
    ● Wolfram Alpha-like UI, research prototypes available
    ● Teach the bot what kind of answer to look for: quantity, dates, names, etc.
    ● Probabilistic programming? Marry answers with confidence
  • IBM Watson Links com/researcher/view_project.php?id=2099 (Special issue of IBM JR&D on Watson)
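
The Watson 1.0 strategy (pull proper names out of search-result page titles, rank them by frequency, guess in that order) can be sketched roughly as follows. This is not the bot's actual code: the slides say the real bot used NLTK to find proper names, while this sketch swaps in a crude capitalized-word regex so it runs with the standard library alone. The names `NAME_RE`, `LEADING_STOPWORDS`, `candidate_names`, and `rank_guesses` are hypothetical, and fetching the result page is left out entirely.

```python
import re
from collections import Counter

# ASSUMPTION: a run of capitalized words approximates a proper name.
# The original bot used NLTK for this step; this regex is only a stand-in.
NAME_RE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")

# ASSUMPTION: capitalized words that often start a title but are not names.
LEADING_STOPWORDS = {"The", "A", "An", "Who", "What", "When", "Where", "Which", "How"}


def candidate_names(title):
    """Return the proper-name candidates found in one page title."""
    names = []
    for match in NAME_RE.findall(title):
        words = match.split()
        # Drop leading question words and articles ("The Moon" -> "Moon").
        while words and words[0] in LEADING_STOPWORDS:
            words.pop(0)
        if words:
            names.append(" ".join(words))
    return names


def rank_guesses(titles):
    """Count candidates across all titles; most frequent guess comes first."""
    counts = Counter()
    for title in titles:
        counts.update(candidate_names(title))
    return [name for name, _ in counts.most_common()]
```

The bot would then send the ranked guesses to the channel one at a time. This relies on the "right answer will be commonly seen" assumption from the Base Assumptions slide: a name that appears in many titles floats to the top.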
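
Two of the Watson 2.0 steps, reusing answers from cached similar questions and using a hint as a regex, might look something like the sketch below. The details are assumptions: the slides do not show the hint format, so this assumes the trivia bot masks unrevealed characters with `_`, and "similar" is approximated here with `difflib` from the standard library rather than whatever matching the real bot used. The function names and the `cutoff` value are hypothetical.

```python
import difflib
import re


def hint_to_regex(hint, mask="_"):
    """Turn a masked hint into a regex.

    ASSUMPTION: each `mask` character stands for exactly one hidden character.
    """
    parts = ["." if ch == mask else re.escape(ch) for ch in hint]
    return re.compile("".join(parts), re.IGNORECASE)


def filter_by_hint(guesses, hint):
    """Keep only the guesses that fit the hint, preserving their order."""
    pattern = hint_to_regex(hint)
    return [g for g in guesses if pattern.fullmatch(g)]


def cached_answer(question, cache, cutoff=0.8):
    """Look up the answer to the most similar cached question, if any.

    `cache` maps previously observed question text to its answer.
    ASSUMPTION: a difflib similarity ratio of at least `cutoff` counts as "similar".
    """
    matches = difflib.get_close_matches(question, list(cache), n=1, cutoff=cutoff)
    return cache[matches[0]] if matches else None
```

Filtering by the hint before guessing throws away candidates of the wrong length or with the wrong revealed letters, so later hints should narrow the ranked list quickly.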