Watson is an artificially intelligent computer system capable of answering questions posed in natural language. It was built by IBM to apply advanced natural language processing, information retrieval, knowledge representation, automated reasoning, and machine learning technologies to the field of open domain question answering. It draws on a large body of previously ingested information to answer new questions.
2. INTRODUCTION
Watson is an artificially intelligent computer system capable of answering
questions posed in natural language, developed in IBM's DeepQA project.
Question answering (QA) technology takes a question expressed in natural language, seeks to
understand it in detail, and returns a precise answer to the
question.
Watson applies advanced natural language processing, information retrieval,
knowledge representation, automated reasoning, and machine learning
technologies for this purpose.
It incorporates a local corpus (database) of information.
3. JEOPARDY!
Watson was initially developed to answer questions on Jeopardy!, a quiz show known for its tricky
questions.
In 2011, Watson competed against former champions Brad Rutter and Ken Jennings
and defeated both.
4. REQUIREMENTS
• 90 x IBM Power 750 servers
• 2880 POWER7 cores
• POWER7 3.55 GHz chip
• 500 GB per sec on-chip bandwidth
• 10 Gb Ethernet network
• 16 Terabytes of memory
• 20 Terabytes of disk, clustered
• Can operate at 80 Teraflops
• Runs IBM DeepQA software
• Scales out and searches vast amounts of unstructured
information using the UIMA and Hadoop open source components
• Linux provides a scalable, open platform, optimized
to exploit POWER7 performance
• 10 racks include servers, networking, shared disk system,
cluster controllers
5. ALGORITHMS USED
1. SVM (Support Vector Machine) Classifier
• An SVM is a supervised learning model that analyzes data and recognizes patterns.
• Given a set of training examples, each marked as belonging to one of two categories, it builds a
model that assigns new examples to one category or the other.
• It is a non-probabilistic binary linear classifier.
2. Naïve Bayes Classifier
• It is a family of classifiers based on applying Bayes' theorem with strong
(naive) independence assumptions between the features.
• It is therefore a conditional probability model.
• It is particularly suited to inputs of high dimensionality.
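As an illustration of the second classifier, the following is a minimal sketch of a multinomial Naïve Bayes classifier that assigns an answer type to a question. It is not Watson's implementation: the training questions, the labels PERSON and LOCATION, and the function names train and classify are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training set of (question, answer type) pairs.
# This is illustrative data, not anything used by Watson.
TRAINING = [
    ("who wrote this novel", "PERSON"),
    ("who was the first president", "PERSON"),
    ("which author won the prize", "PERSON"),
    ("where is this mountain peak", "LOCATION"),
    ("which city hosts the museum", "LOCATION"),
    ("where does the river end", "LOCATION"),
]


def train(examples):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in text.split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label maximising log P(label) + sum of log P(word | label)."""
    class_counts, word_counts, vocabulary = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for word in text.split():
            if word in vocabulary:  # words never seen in training are ignored
                # Laplace (add-one) smoothing avoids zero probabilities.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING)
print(classify("who climbed the peak", model))
print(classify("where is the city", model))
```

The "naive" assumption shows up in classify: each word contributes its own log-probability term, as if words were independent given the label.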
6. ALGORITHMS USED
3. Word Sense Disambiguation
• It is an open problem in natural language
processing and ontology. WSD identifies
which sense (i.e. meaning) of a word is used in
a sentence when the word has multiple
meanings.
• It strictly requires two things: a dictionary to
specify the senses and a corpus of
language data to be disambiguated. WordNet
is used as the dictionary in this context.
7. ALGORITHMS USED
3. Word Sense Disambiguation (contd.)
• The sentence and the query each form an ordered set of words. We then compute the sense
network between every pair of words from the query and the sentence.
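The slides do not spell out how the sense network is computed, so the sketch below swaps in a different and simpler technique, a simplified Lesk-style gloss overlap, purely to show what choosing a sense from context looks like. The two-sense inventory SENSES stands in for WordNet and is hypothetical, as are the stop-word list and the function names.

```python
# Hypothetical miniature sense inventory standing in for WordNet glosses.
SENSES = {
    "bank": {
        "bank.financial": "an institution that accepts deposits and lends money",
        "bank.river": "sloping land beside a river or lake",
    },
}

# Small illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "and", "or", "that", "the", "of", "to", "on", "in", "by"}


def content_words(text):
    """Lower-case the text and keep only words that are not stop words."""
    return {w for w in text.lower().split() if w not in STOP_WORDS}


def disambiguate(word, sentence):
    """Pick the sense whose gloss shares the most content words with the sentence."""
    context = content_words(sentence)
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("bank", "they fished from the bank of the river"))
print(disambiguate("bank", "she went to the bank to deposit money"))
```

With a real dictionary, the glosses would come from WordNet rather than from a hand-written table, and ties between senses would need a more careful rule than "first sense wins".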
8. PROCESS
Watson's basic operation is
based on four steps:
1. Question Analysis
2. Document Retrieval
3. Hypotheses Generation
4. Answer Extraction (Result)
9. PROCESS
1. Question Analysis
Step 1 – Determining answer type
• Uses machine learning techniques such as SVM
(Support Vector Machine) and Naïve Bayes
classifiers
• These techniques are applied to a tagged corpus
of information
Step 2 – Query formation
• Assume the question is a valid IR query
• Remove stop words from the question
Example:
In 1897 Swiss climber Matthias Zurbriggen
became the first to scale this Argentinean
peak.
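The query formation step can be sketched on the example clue as follows. The stop-word list and the function name form_query are assumptions chosen for the illustration; a production system would use a much longer list.

```python
import re

# Small illustrative stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "in", "is", "of", "the", "this", "to", "was"}


def form_query(question):
    """Tokenise the question and drop stop words, keeping the original order."""
    tokens = re.findall(r"[A-Za-z0-9]+", question)
    return [t for t in tokens if t.lower() not in STOP_WORDS]


question = ("In 1897 Swiss climber Matthias Zurbriggen became the first "
            "to scale this Argentinean peak.")
print(form_query(question))
# ['1897', 'Swiss', 'climber', 'Matthias', 'Zurbriggen', 'became',
#  'first', 'scale', 'Argentinean', 'peak']
```

The remaining terms are the ones most likely to single out relevant documents in a search.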
10. PROCESS
2. Document Retrieval
• The task of the document retrieval module is
to select a small subset of the collection that
can be practically handled in the later stages.
• Using important terms from the question,
Watson performs a search over millions of
documents to find relevant passages.
• Data can be stored in a local corpus or
accessed from the Internet.
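A minimal sketch of this retrieval step, assuming a tiny in-memory corpus and a simple ranking that sums an inverse-document-frequency weight for every query term found in a passage. The three passages, the scoring formula, and the name retrieve are illustrative assumptions, not the search components Watson used.

```python
import math
import re

# Hypothetical three-passage corpus; the real corpus held millions of documents.
CORPUS = {
    "doc1": "Aconcagua is a peak in Argentina first climbed by Matthias Zurbriggen in 1897",
    "doc2": "The Matterhorn is a peak in the Alps on the Swiss border",
    "doc3": "Jeopardy is a quiz show known for its tricky clues",
}


def tokenize(text):
    """Return the set of lower-cased alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query_terms, corpus, top_k=2):
    """Rank passages by the summed inverse document frequency of matched terms."""
    doc_tokens = {doc_id: tokenize(text) for doc_id, text in corpus.items()}
    n_docs = len(corpus)
    scores = {}
    for doc_id, tokens in doc_tokens.items():
        score = 0.0
        for term in {t.lower() for t in query_terms}:
            if term in tokens:
                # Terms found in fewer passages count for more.
                df = sum(1 for other in doc_tokens.values() if term in other)
                score += math.log(1 + n_docs / df)
        scores[doc_id] = score
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Keep only the best top_k passages that matched at least one term.
    return [doc_id for doc_id in ranked[:top_k] if scores[doc_id] > 0]


query = ["1897", "Swiss", "climber", "Matthias", "Zurbriggen", "became",
         "first", "scale", "Argentinean", "peak"]
print(retrieve(query, CORPUS))  # ['doc1', 'doc2']
```

Only the small set returned here would be passed on to the later, more expensive stages.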
11. PROCESS
3. Hypotheses Generation
• Extracts important entities, the so-called “candidate answers”, from the documents.
• WordNet is used as a sense/semantic dictionary.
• Statistics for a particular word are obtained from a large corpus by assigning probabilities based on
occurrences of the target concept.
[Figure: hypotheses generation for the example given above]
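Candidate extraction can be approximated very roughly as below. The sketch assumes that capitalised phrases are entities and discards any entity already named in the clue. This heuristic, the two passages, and the name candidate_answers are assumptions for illustration; it does not use WordNet or corpus statistics as the slide describes.

```python
import re


def candidate_answers(passages, question):
    """Collect capitalised phrases from passages that are not already in the question."""
    question_words = set(re.findall(r"[A-Za-z]+", question.lower()))
    candidates = []
    for passage in passages:
        # A run of capitalised words is treated as one entity (a crude heuristic).
        for phrase in re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*", passage):
            words = phrase.lower().split()
            if all(w in question_words for w in words):
                continue  # entity was given in the clue, so it cannot be the answer
            if phrase not in candidates:
                candidates.append(phrase)
    return candidates


# Hypothetical retrieved passages.
passages = [
    "Aconcagua is a peak in Argentina first climbed by Matthias Zurbriggen in 1897",
    "Edward Whymper led the first ascent of the Matterhorn in 1865",
]
question = ("In 1897 Swiss climber Matthias Zurbriggen became the first "
            "to scale this Argentinean peak.")
print(candidate_answers(passages, question))
# ['Aconcagua', 'Argentina', 'Edward Whymper', 'Matterhorn']
```

The list deliberately contains wrong answers as well; separating them from the right one is left to the scoring stage.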
12. PROCESS
4. Answer Extraction
Step 1 – Answer Scoring
• Candidate answers are scored using a large number of answer scoring analytics running in
parallel.
• Scorers such as the Type Coercion scorer and temporal match are used.
[Figure: answer scoring for the example given above]
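The idea of several independent scorers running side by side can be sketched with a thread pool. The two scorers below are toy stand-ins for a type check and a temporal match; the EVIDENCE table, the clue fields, and all function names are hypothetical and only loosely mirror the scorers named on the slide.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical evidence for two candidates; the values are illustrative only.
EVIDENCE = {
    "Aconcagua": {"type": "mountain",
                  "passage": "first climbed by Matthias Zurbriggen in 1897"},
    "Argentina": {"type": "country",
                  "passage": "Argentina declared independence in 1816"},
}


def type_score(candidate, clue):
    """1.0 when the candidate's type matches the expected answer type, else 0.0."""
    return 1.0 if EVIDENCE[candidate]["type"] == clue["answer_type"] else 0.0


def temporal_score(candidate, clue):
    """1.0 when the clue's year appears in the candidate's supporting passage."""
    return 1.0 if clue["year"] in EVIDENCE[candidate]["passage"] else 0.0


SCORERS = {"type": type_score, "temporal": temporal_score}


def score_candidate(candidate, clue):
    """Submit every scorer to the pool and collect one score per scorer."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, candidate, clue)
                   for name, fn in SCORERS.items()}
        return {name: future.result() for name, future in futures.items()}


clue = {"answer_type": "mountain", "year": "1897"}
for candidate in EVIDENCE:
    print(candidate, score_candidate(candidate, clue))
```

Because each scorer only reads shared data and returns a number, new scorers can be added to SCORERS without touching the others.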
13. PROCESS
4. Answer Extraction
Step 2 – Analysing Scores
• The scores are grouped into meaningful groups, or evidence dimensions.
• A plot of these yields the evidence profile for the candidate.
• Watson statistically combines the scores to produce a final confidence score.
Answer to the example: Aconcagua
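The slide does not say which statistical method combines the scores. One common choice, assumed here purely for illustration, is a logistic function over a weighted sum of the evidence scores. The weights, the bias, the score values, and the dimension names are all hypothetical; a real system would learn the weights from training data.

```python
import math

# Hypothetical weights and bias (assumptions, not learned values).
WEIGHTS = {"type": 2.0, "temporal": 1.5, "passage_support": 1.0}
BIAS = -2.5


def confidence(scores):
    """Map a weighted sum of evidence scores to a confidence between 0 and 1."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in scores.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical per-dimension scores for two candidates.
candidates = {
    "Aconcagua": {"type": 1.0, "temporal": 1.0, "passage_support": 0.8},
    "Argentina": {"type": 0.0, "temporal": 0.0, "passage_support": 0.6},
}

ranked = sorted(candidates, key=lambda c: confidence(candidates[c]), reverse=True)
for name in ranked:
    print(f"{name}: {confidence(candidates[name]):.2f}")
```

The candidate with the highest confidence is returned as the answer, and the confidence itself can be used to decide whether to answer at all.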
21. EXISTING CHALLENGES
• Healthcare: medical information doubles every three years, physicians are unable to stay up to date, and decision making is complex.
• Retail: fulfilling customers' high expectations of satisfaction and effectively analysing a growing mountain of data.
• Finance: a huge amount of financial information is generated each day and is difficult to harness.
• Public Sector: efficient analysis of enormous volumes of unstructured, unverified data.
23. FUTURE SCOPE
Recipe generating platform
Pharmaceutical industry
Publishing
Biotechnology
Research or inventions
24. LIMITATIONS
• Requires a huge database of prior knowledge and information
• Has trouble responding to short clues
• Incapable of coming up with fresh ideas
• Clues may require thought beyond base knowledge, an area where humans
still have an edge over Watson