TECHNICAL PROJECT REVIEW
14TH AUGUST, 2013
WATSON @ RPI
WATSON RESEARCH LAB
PROFESSOR JIM HENDLER
SIMON ELLIS
KATE MCGUIRE  NICOLE NEGEDLY
DILLON BURNS  MATT KLAWONN
AVI WEINSTOCK
WATSON RPI
Simon Ellis
INTRODUCTION
IBM Watson
Watson is…
• … a piece of software that will run on your laptop
  - Though very slowly
  - Specialised hardware and control platform
• … an implementation of the DeepQA concept
• … the first iteration of the "cognitive computing" platform
• … a very clever artificial intelligence
  - A very clever application of human intelligence
Background
• IBM agrees to give RPI a version of Watson
• Watson team is set up to undertake summer research on the Watson system
• Watson hardware/software configuration not ready at beginning of summer session
• So what do we do with: 10 weeks, 5 undergraduates and 1 graduate…
Challenge accepted!
• Build a new version of Watson
  - Based on research published in IBM J Res & Dev
  - With support and input from IBM Research
• Use open source libraries wherever possible
  - Faster development
  - No IP issues
• Turns out to be a very useful project
  - Trains team in the details of the operation of Watson system
  - Can be used in education, training, testing, evaluation
Sample output
• Demo run of RPI version of Watson
• Shows output representing most of the "pipeline"
Inside Watson
Watson pipeline as published by IBM; see IBM J Res & Dev 56 (3/4), May/July 2012, p. 15:2
WATSON RPI
Nicole Negedly
QUESTION ANALYSIS
Question Analysis
• What is the question asking for?
• What structured information can be determined from the unstructured text of the question?
• Topics
  - Parsers
  - Syntactic and Semantic Analysis Tools
  - Focus and Lexical Answer Type Detection
  - Future Work
Parsing
• Open-source parsers
  - Stanford Parser
  - Berkeley Parser
• Functions
  - Determine grammatical structure of text
  - Parse trees, part-of-speech tags, dependency relations
Coreference Resolution
• What terms in the question refer to the same entity?
Named Entity Extraction
• Identifies people, places, organizations, and time spans.
Focus and Lexical Answer Type
POETS & POETRY
He was a bank clerk in the Yukon before he published Songs of a Sourdough in 1907
• Focus: "he"
• LAT: "he", "clerk", "poet"
Future Work
• Adding additional parsers to the system
  - Comparison of parser output
• Relation extraction
  - Prolog code and database
• Improved focus and LAT detection
  - Princeton WordNet
WATSON RPI
Dillon Burns
PRIMARY SEARCH
Primary Search & Corpus Generation
• Primary search generates the corpus from which the system takes candidate answers, passages, supporting evidence, and essentially all other textual input.
• Search Wikipedia for the focus identified during the Question Analysis phase.
• Take the first 5 documents returned as the corpus.
• Uses the Jsoup library to collect and parse HTML.
JSoup
String[] results = {"/wiki/Snapple", "/wiki/Dr_Pepper_Snapple_Group", "/wiki/Snapple_Theater…."
To Cache
DBpedia
• As of 2011 it had 3.64 million things categorized in its database
• URLs are a direct map to Wikipedia's
• Wikipedia redirect lists help with alternate names for entities and closely related concepts to certain entities or people
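The "direct map" between Wikipedia and DBpedia URLs can be sketched as below. This is a minimal illustration, not the team's code: the class and method names are hypothetical, and it assumes the English DBpedia convention that a resource URI is `http://dbpedia.org/resource/` followed by the Wikipedia article name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: maps relative Wikipedia paths to DBpedia resource URIs.
final class DbpediaLinks {
    // Assumption: English DBpedia resource URIs reuse the Wikipedia article name.
    static final String RESOURCE_PREFIX = "http://dbpedia.org/resource/";
    static final String WIKI_PREFIX = "/wiki/";

    /** Converts a path such as "/wiki/Snapple" to its DBpedia resource URI. */
    static String toDbpediaUri(String wikiPath) {
        if (!wikiPath.startsWith(WIKI_PREFIX)) {
            throw new IllegalArgumentException("Not a /wiki/ path: " + wikiPath);
        }
        return RESOURCE_PREFIX + wikiPath.substring(WIKI_PREFIX.length());
    }

    /** Converts every search result path, preserving order. */
    static List<String> toDbpediaUris(String[] wikiPaths) {
        List<String> uris = new ArrayList<>();
        for (String path : wikiPaths) {
            uris.add(toDbpediaUri(path));
        }
        return uris;
    }

    public static void main(String[] args) {
        String[] results = {"/wiki/Snapple", "/wiki/Dr_Pepper_Snapple_Group"};
        toDbpediaUris(results).forEach(System.out::println);
    }
}
```

Because the mapping is purely textual, the search result paths collected from Wikipedia can be turned into DBpedia lookups without a second search.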
Future Directions
• Use DBpedia to fact-check answers about entities in the database
• Making use of the DBpedia subject matching
WATSON RPI
Kate McGuire
CANDIDATE GENERATION
Search Result Processing and Candidate Generation
Search Result Processing
• Passage Retrieval
  - Watson: Indri and Lucene
  - Identifies each HTML sentence and adds both the HTML and the clean text to the passage type
  - Adds information about each passage
• Passage Parsing
  - Forms parse trees for each individual sentence
  - Add an array of passages to each document
<p><b>Shirley Ann Jackson</b> (born August 5, 1946) is an <a href="/wiki/United_States" title="United States">American</a> <a
href="/wiki/Physicist" title="Physicist">physicist</a>, and the 18th president of <a href="/wiki/Rensselaer_Polytechnic_Institute"
title="Rensselaer Polytechnic Institute">Rensselaer Polytechnic Institute</a>. She received her <a href="/wiki/Doctor_of_Philosophy"
title="Doctor of Philosophy">Ph.D.</a> in physics from the <a href="/wiki/Massachusetts_Institute_of_Technology" title="Massachusetts
Institute of Technology">Massachusetts Institute of Technology</a> in 1973, becoming the first <a href="/wiki/African_American"
title="African American">African American</a> woman to earn a doctorate from MIT in nuclear physics.<sup id="cite_ref-profile_2-0"
class="reference"><a href="#cite_note-profile-2"><span>[</span>2<span>]</span></a></sup><sup id="cite_ref-AAAS_3-0"
class="reference"><a href="#cite_note-AAAS-3"><span>[</span>3<span>]</span></a></sup></p>
<div id="toc" class="toc">
<div id="toctitle">
<h2>Contents</h2>
</div>
<ul>
<li class="toclevel-1 tocsection-1"><a href="#Early_life_and_schooling"><span class="tocnumber">1</span> <span class="toctext">Early
life and schooling</span></a></li>
<li class="toclevel-1 tocsection-2"><a href="#Career"><span class="tocnumber">2</span> <span class="toctext">Career</span></a>
<ul>
<li class="toclevel-2 tocsection-3"><a href="#Rensselaer_Polytechnic_Institute"><span class="tocnumber">2.1</span> <span
class="toctext">Rensselaer Polytechnic Institute</span></a></li>
</ul>
</li>
<li class="toclevel-1 tocsection-4"><a href="#Honors_and_distinctions"><span class="tocnumber">3</span> <span
class="toctext">Honors and distinctions</span></a>
<ul>
<li class="toclevel-2 tocsection-5"><a href="#Boards_of_directors"><span class="tocnumber">3.1</span> <span class="toctext">Boards
of directors</span></a></li>
</ul>
</li>
<li class="toclevel-1 tocsection-6"><a href="#Personal"><span class="tocnumber">4</span> <span
class="toctext">Personal</span></a></li>
<li class="toclevel-1 tocsection-7"><a href="#References"><span class="tocnumber">5</span> <span
class="toctext">References</span></a></li>
<li class="toclevel-1 tocsection-8"><a href="#External_links"><span class="tocnumber">6</span> <span class="toctext">External
links</span></a></li>
</ul>
</div>
<h2><span class="mw-headline" id="Early_life_and_schooling">Early life and schooling</span><span class="mw-editsection"><span
class="mw-editsection-bracket">[</span><a href="/w/index.php?title=Shirley_Ann_Jackson&amp;action=edit&amp;section=1" title="Edit
section: Early life and schooling">edit source</a><span class="mw-editsection-divider"> | </span><a
href="/w/index.php?title=Shirley_Ann_Jackson&amp;veaction=edit&amp;section=1" title="Edit section: Early life and schooling"
class="mw-editsection-visualeditor">edit</a><span class="mw-editsection-bracket">]</span></span></h2>
<p>Jackson was born in Washington D.C. Her parents, Beatrice and George Jackson, strongly valued education and encouraged her in
school.<sup id="cite_ref-diaspora_4-0" class="reference"><a href="#cite_note-diaspora-4"><span>[</span>4<span>]</span></a></sup>
Her father spurred on her interest in science by helping her with projects for her science classes. At Roosevelt High School, Jackson
attended accelerated programs in both math and science, and graduated in 1964 as valedictorian.<sup id="cite_ref-diaspora_4-1"
class="reference"><a href="#cite_note-diaspora-4"><span>[</span>4<span>]</span></a></sup></p>
<b>Shirley Ann Jackson</b> (born August 5, 1946) is an <a
href="/wiki/United_States" title="United States">American</a> <a
href="/wiki/Physicist" title="Physicist">physicist</a>, and the 18th
president of <a href="/wiki/Rensselaer_Polytechnic_Institute"
title="Rensselaer Polytechnic Institute">Rensselaer Polytechnic
Institute</a>. She received her <a href="/wiki/Doctor_of_Philosophy"
title="Doctor of Philosophy">Ph.D.</a> in physics from the <a
href="/wiki/Massachusetts_Institute_of_Technology" title="Massachusetts
Institute of Technology">Massachusetts Institute of Technology</a> in
1973, becoming the first <a href="/wiki/African_American" title="African
American">African American</a> woman to earn a doctorate from MIT in
nuclear physics.<sup id="cite_ref-profile_2-0" class="reference"><a
href="#cite_note-profile-
2"><span>[</span>2<span>]</span></a></sup><sup id="cite_ref-
AAAS_3-0" class="reference"><a href="#cite_note-AAAS-
3"><span>[</span>3<span>]</span></a></sup>Jackson was born in
Washington D.C. Her parents, Beatrice and George Jackson, strongly
valued education and encouraged her in school.<sup id="cite_ref-
diaspora_4-0" class="reference"><a href="#cite_note-diaspora-
4"><span>[</span>4<span>]</span></a></sup> Her father spurred on
her interest in science by helping her with projects for her science classes.
At Roosevelt High School, Jackson attended accelerated programs in
both math and science, and graduated in 1964 as valedictorian.<sup
id="cite_ref-diaspora_4-1" class="reference"><a href="#cite_note-
diaspora-4"><span>[</span>4<span>]</span></a></sup></p>
Text:
<b>Shirley Ann Jackson</b> (born August 5, 1946) is an <a
href="/wiki/United_States" title="United States">American</a> <a
href="/wiki/Physicist" title="Physicist">physicist</a>, and the 18th president of <a
href="/wiki/Rensselaer_Polytechnic_Institute" title="Rensselaer Polytechnic
Institute">Rensselaer Polytechnic Institute</a>.
Cleaned Text:
Shirley Ann Jackson (born August 5, 1946) is an American physicist, and the 18th president of Rensselaer Polytechnic Institute.
Parse Tree:
(ROOT
(S
(NP
(NP (NNP ) (NNP Shirley) (NNP Ann) (NNP Jackson) (NNP ))
(PRN (-LRB- -LRB-)
(VP (VBN born)
(NP (NNP August) (CD 5) (, ,) (CD 1946)))
(-RRB- -RRB-)))
(VP (VBZ is)
(NP
(NP (DT an) (JJ ) (NNP American) (NNP ) (NNP ) (NN physicist) (NNS ))
(, ,)
(CC and)
(NP
(NP (DT the) (JJ 18th) (NN president))
(PP (IN of)
(NP (NNP ) (NNP Rensselaer) (NNP Polytechnic) (NNP Institute) (NNP ))))))
(. .)))
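The passage retrieval step above (split a paragraph into HTML sentences, then keep both the HTML and the clean text) can be sketched as follows. This is only an illustration under stated assumptions: the `Passage` type and method names are hypothetical, tag removal and sentence splitting use naive regular expressions rather than Jsoup or a real sentence segmenter, and parse trees are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of building passages from one paragraph of HTML.
final class PassageBuilder {
    /** Hypothetical passage record: the original HTML sentence plus its tag-free text. */
    static final class Passage {
        final String html;
        final String cleanText;

        Passage(String html, String cleanText) {
            this.html = html;
            this.cleanText = cleanText;
        }
    }

    /** Removes anything that looks like a tag and collapses runs of whitespace. */
    static String clean(String html) {
        return html.replaceAll("<[^>]*>", "").replaceAll("\\s+", " ").trim();
    }

    /**
     * Naive sentence split: break at whitespace that follows '.', '!' or '?'
     * and precedes an uppercase letter. Abbreviations can fool this rule.
     */
    static List<Passage> toPassages(String paragraphHtml) {
        List<Passage> passages = new ArrayList<>();
        for (String sentence : paragraphHtml.split("(?<=[.!?])\\s+(?=[A-Z])")) {
            passages.add(new Passage(sentence, clean(sentence)));
        }
        return passages;
    }

    public static void main(String[] args) {
        String html = "<b>Shirley Ann Jackson</b> (born August 5, 1946) is an American "
                + "<a href=\"/wiki/Physicist\" title=\"Physicist\">physicist</a>. "
                + "She received her Ph.D. in physics.";
        for (Passage p : toPassages(html)) {
            System.out.println(p.cleanText);
        }
    }
}
```

Keeping the HTML alongside the clean text matters later: the hyperlinks are still available for anchor text candidate generation, while the parser only sees the clean text.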
Candidate Generation
• Using each document, and the passages created by Search Result Processing, we generate candidates using three techniques:
  1. Title of Document (T.O.D.): Adds the title of the document as a candidate.
  2. Wikipedia Title Candidate Generation: Adds any noun phrases within the document's passage texts that are also the titles of Wikipedia articles.
  3. Anchor Text Candidate Generation: Adds candidates based on the hyperlinks and metadata within the document.
Wikipedia Title Candidate Generation
• Runs on the passage array from each search result.
• Using the parse tree, retrieves all the noun phrases in each passage.
• Checks if each noun phrase is the title of a Wikipedia article.
• Adds the verified candidates along with an array of the passages that contained them.
[Flow diagram: Array of Passages → Retrieving Noun Phrases → Check against Previous Data → Wikipedia URL Check → Candidate and Containing Passages]
(ROOT (S (NP (NP (NNP ) (NNP Shirley) (NNP Ann) (NNP Jackson) (NNP ))
(PRN (-LRB- -LRB-) (VP (VBN born) (NP (NNP August) (CD 5) (, ,) (CD 1946))) (-RRB- -RRB-)))
(VP (VBZ is) (NP (NP (DT an) (JJ ) (NNP American) (NNP ) (NNP ) (NN physicist) (NNS )) (, ,) (CC and)
(NP (NP (DT the) (JJ 18th) (NN president)) (PP (IN of) (NP (NNP ) (NNP Rensselaer) (NNP Polytechnic) (NNP Institute) (NNP ))))))
(. .)))
<b>Shirley Ann Jackson</b> (born August 5, 1946) is an <a href="/wiki/United_States"
title="United States">American</a> <a href="/wiki/Physicist"
title="Physicist">physicist</a>, and the 18th president of <a
href="/wiki/Rensselaer_Polytechnic_Institute" title="Rensselaer Polytechnic
Institute">Rensselaer Polytechnic Institute</a>.
Shirley Ann Jackson
Shirley Ann Jackson (born August 5, 1946)
August 5, 1946
An American Physicist
An American Physicist, and the 18th president of Rensselaer Polytechnic Institute
The 18th president
The 18th president of Rensselaer Polytechnic Institute
Rensselaer Polytechnic Institute
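The noun phrase retrieval step can be sketched as a small reader for bracketed parse trees that collects the words under every NP node. This is an assumption-laden illustration, not the project's code: the class name is hypothetical, the input is assumed to be a well-formed tree in the bracketed style shown on the slides, and the Wikipedia title check is not included.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: collects the text of every NP node in a bracketed parse tree.
final class NounPhrases {
    private final String tree;
    private int pos = 0;
    private final List<String> phrases = new ArrayList<>();

    private NounPhrases(String tree) {
        this.tree = tree;
    }

    /** Returns the noun phrases of a well-formed bracketed tree, inner phrases first. */
    static List<String> extract(String bracketedTree) {
        NounPhrases reader = new NounPhrases(bracketedTree);
        reader.parseNode();
        return reader.phrases;
    }

    /** Parses one "(LABEL child ...)" node and returns the words it covers. */
    private List<String> parseNode() {
        skipSpaces();
        pos++; // consume '('
        String label = readToken();
        List<String> words = new ArrayList<>();
        skipSpaces();
        while (tree.charAt(pos) != ')') {
            if (tree.charAt(pos) == '(') {
                words.addAll(parseNode()); // nested constituent
            } else {
                words.add(readToken());    // leaf word
            }
            skipSpaces();
        }
        pos++; // consume ')'
        if (label.equals("NP")) {
            phrases.add(String.join(" ", words));
        }
        return words;
    }

    /** Reads characters up to the next whitespace or parenthesis. */
    private String readToken() {
        int start = pos;
        while (pos < tree.length() && !Character.isWhitespace(tree.charAt(pos))
                && tree.charAt(pos) != '(' && tree.charAt(pos) != ')') {
            pos++;
        }
        return tree.substring(start, pos);
    }

    private void skipSpaces() {
        while (pos < tree.length() && Character.isWhitespace(tree.charAt(pos))) {
            pos++;
        }
    }

    public static void main(String[] args) {
        String tree = "(ROOT (S (NP (NNP Shirley) (NNP Ann) (NNP Jackson)) "
                + "(VP (VBZ is) (NP (DT a) (NN physicist))) (. .)))";
        extract(tree).forEach(System.out::println);
    }
}
```

Words are joined with single spaces, so punctuation tokens such as `-LRB-` would need extra handling before a phrase is compared against Wikipedia titles.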
http://en.wikipedia.org/w/index.php?title=Special%3ASearch&profile=default&search=Shirley+Ann+Jackson
Anchor Text Candidate Generation
• Runs on the passage array from each search result.
• Checks for hyperlinks within the HTML text of each passage.
• Adds the title of the hyperlinked article as a candidate.
• Adds each passage containing the candidate to an array.
<b>Shirley Ann Jackson</b> (born August 5, 1946) is an <a
href="/wiki/United_States" title="United States">American</a> <a
href="/wiki/Physicist" title="Physicist">physicist</a>, and the 18th
president of <a href="/wiki/Rensselaer_Polytechnic_Institute"
title="Rensselaer Polytechnic Institute">Rensselaer Polytechnic
Institute</a>. She received her <a href="/wiki/Doctor_of_Philosophy"
title="Doctor of Philosophy">Ph.D.</a> in physics from the <a
href="/wiki/Massachusetts_Institute_of_Technology"
title="Massachusetts Institute of Technology">Massachusetts
Institute of Technology</a> in 1973, becoming the first <a
href="/wiki/African_American" title="African American">African
American</a> woman to earn a doctorate from MIT in nuclear
physics.<sup id="cite_ref-profile_2-0" class="reference"><a
href="#cite_note-profile-
2"><span>[</span>2<span>]</span></a></sup><sup id="cite_ref-
AAAS_3-0" class="reference"><a href="#cite_note-AAAS-
3"><span>[</span>3<span>]</span></a></sup>Jackson was born
in Washington D.C. Her parents, Beatrice and George Jackson,
strongly valued education and encouraged her in school.<sup
id="cite_ref-diaspora_4-0" class="reference"><a href="#cite_note-
diaspora-4"><span>[</span>4<span>]</span></a></sup> Her
father spurred on her interest in science by helping her with projects
for her science classes. At Roosevelt High School, Jackson
attended accelerated programs in both math and science, and
graduated in 1964 as valedictorian.<sup id="cite_ref-diaspora_4-1"
class="reference"><a href="#cite_note-diaspora-
4"><span>[</span>4<span>]</span></a></sup>
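Anchor text candidate generation over a passage like the one above can be sketched with a regular expression that pulls the `title` attribute from each `/wiki/` link. This is a simplified illustration: the class name is hypothetical, it assumes `href` precedes `title` as in the sample HTML, and an HTML parser such as Jsoup (which the project uses elsewhere) would be more robust than a regex.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: article titles of the Wikipedia links inside one passage.
final class AnchorCandidates {
    // Matches <a href="/wiki/..." title="...">. Citation links (href="#...") do not match.
    // Assumption: href comes before title, as in the sample passage.
    private static final Pattern WIKI_LINK =
            Pattern.compile("<a\\s+href=\"/wiki/[^\"]*\"\\s+title=\"([^\"]*)\"");

    /** Returns the linked article titles in order of first appearance, without duplicates. */
    static Set<String> extract(String passageHtml) {
        Set<String> titles = new LinkedHashSet<>();
        Matcher m = WIKI_LINK.matcher(passageHtml);
        while (m.find()) {
            titles.add(m.group(1));
        }
        return titles;
    }

    public static void main(String[] args) {
        String html = "is an <a href=\"/wiki/United_States\" title=\"United States\">American</a> "
                + "<a href=\"/wiki/Physicist\" title=\"Physicist\">physicist</a>"
                + "<sup><a href=\"#cite_note-profile-2\">[2]</a></sup>";
        extract(html).forEach(System.out::println);
    }
}
```

Using the `title` attribute rather than the visible anchor text yields the article name ("United States") even when the link text differs ("American").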
Search Result Processing
<p><b>Shirley Ann Jackson</b> (born August 5, 1946) is an <a href="/wiki/United_States" title="United States">American</a> <a href="/wiki/Physicist" title="Physicist">physicist</a>, and
the 18th president of <a href="/wiki/Rensselaer_Polytechnic_Institute" title="Rensselaer Polytechnic Institute">Rensselaer Polytechnic Institute</a>. She received her <a
href="/wiki/Doctor_of_Philosophy" title="Doctor of Philosophy">Ph.D.</a> in physics from the <a href="/wiki/Massachusetts_Institute_of_Technology" title="Massachusetts Institute of
Technology">Massachusetts Institute of Technology</a> in 1973, becoming the first <a href="/wiki/African_American" title="African American">African American</a> woman to earn a
doctorate from MIT in nuclear physics.<sup id="cite_ref-profile_2-0" class="reference"><a href="#cite_note-profile-2"><span>[</span>2<span>]</span></a></sup><sup id="cite_ref-
AAAS_3-0" class="reference"><a href="#cite_note-AAAS-3"><span>[</span>3<span>]</span></a></sup></p>
<div id="toc" class="toc">
<div id="toctitle">
<h2>Contents</h2>
</div>
<ul>
<li class="toclevel-1 tocsection-1"><a href="#Early_life_and_schooling"><span class="tocnumber">1</span> <span class="toctext">Early life and schooling</span></a></li>
<li class="toclevel-1 tocsection-2"><a href="#Career"><span class="tocnumber">2</span> <span class="toctext">Career</span></a>
<ul>
<li class="toclevel-2 tocsection-3"><a href="#Rensselaer_Polytechnic_Institute"><span class="tocnumber">2.1</span> <span class="toctext">Rensselaer Polytechnic
Institute</span></a></li>
</ul>
</li>
<li class="toclevel-1 tocsection-4"><a href="#Honors_and_distinctions"><span class="tocnumber">3</span> <span class="toctext">Honors and distinctions</span></a>
<ul>
<li class="toclevel-2 tocsection-5"><a href="#Boards_of_directors"><span class="tocnumber">3.1</span> <span class="toctext">Boards of directors</span></a></li>
</ul>
</li>
<li class="toclevel-1 tocsection-6"><a href="#Personal"><span class="tocnumber">4</span> <span class="toctext">Personal</span></a></li>
<li class="toclevel-1 tocsection-7"><a href="#References"><span class="tocnumber">5</span> <span class="toctext">References</span></a></li>
<li class="toclevel-1 tocsection-8"><a href="#External_links"><span class="tocnumber">6</span> <span class="toctext">External links</span></a></li>
</ul>
</div>
<h2><span class="mw-headline" id="Early_life_and_schooling">Early life and schooling</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a
href="/w/index.php?title=Shirley_Ann_Jackson&amp;action=edit&amp;section=1" title="Edit section: Early life and schooling">edit source</a><span class="mw-editsection-divider"> |
</span><a href="/w/index.php?title=Shirley_Ann_Jackson&amp;veaction=edit&amp;section=1" title="Edit section: Early life and schooling" class="mw-editsection-
visualeditor">edit</a><span class="mw-editsection-bracket">]</span></span></h2>
<p>Jackson was born in Washington D.C. Her parents, Beatrice and George Jackson, strongly valued education and encouraged her in school.<sup id="cite_ref-diaspora_4-0"
class="reference"><a href="#cite_note-diaspora-4"><span>[</span>4<span>]</span></a></sup> Her father spurred on her interest in science by helping her with projects for her science
classes. At Roosevelt High School, Jackson attended accelerated programs in both math and science, and graduated in 1964 as valedictorian.<sup id="cite_ref-diaspora_4-1"
class="reference"><a href="#cite_note-diaspora-4"><span>[</span>4<span>]</span></a></sup></p>
???Future Work
 Search Result Processing
 Improve methods of imitating Indri or Lucene passage retrieval
without a corpus.
 Create a passage score and rank.
 Candidate Generation
 Continue to improve the speed and quality of Candidate
Generation
 Research and implement Candidate Generation from
Structured Sources (Prismatic, Answer Lookup)
 Record and measure recall in comparison with Watson and
other question-answering software.
WATSON RPI
Matt Klawonn
SCORING & RANKING
???Scoring & Ranking
???Differentiating between answers
 Making sense of candidates
 Filtering
 Supporting Evidence Retrieval (SER)
 Scoring (passage-based)
???Scorers
 Passage Term Match
 Textual Alignment
 Skip-Bigram
 Each of these scores the supporting evidence for a candidate
 These scores are then merged to produce a single candidate
score
???Passage Term Match
 Question terms extracted
 Passage is searched for those terms
 Score calculated for that passage
 Done per passage
 Question: “Where is Toronto?”
 Terms: “Where”, “is”, “Toronto”
 Passage: “Toronto is in Southern Ontario”
 Matched terms: “Toronto”, “is”
 Score = IDF(Toronto) + IDF(is)
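The Toronto example above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the names `tokenize`, `idf` and `passage_term_match`, the smoothed IDF formula, and the three-sentence collection used to estimate IDF are all assumptions made for the example.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def idf(term, documents):
    """Smoothed inverse document frequency of `term` over a list of token lists."""
    containing = sum(1 for doc in documents if term in doc)
    return math.log((1 + len(documents)) / (1 + containing)) + 1.0


def passage_term_match(question, passage, documents):
    """Sum the IDF of every distinct question term that occurs in the passage."""
    passage_terms = set(tokenize(passage))
    matched = set(tokenize(question)) & passage_terms
    return sum(idf(term, documents) for term in matched)


# Tiny illustrative collection, used only to estimate IDF values (assumption).
corpus = [tokenize(t) for t in (
    "Toronto is in Southern Ontario",
    "Ottawa is the capital of Canada",
    "The Good Earth is a novel",
)]

score = passage_term_match(
    "Where is Toronto?", "Toronto is in Southern Ontario", corpus
)
```

Because "is" occurs in every sentence of the toy collection, it contributes less to the score than the rarer "Toronto", which is the reason for weighting matches by IDF rather than simply counting them.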
???Textual Alignment
 Finds an optimal alignment of a question and a passage
 Assigns “partial credit” for close matches
 “Who is the President of RPI?”
 Shirley Ann Jackson is the President of RPI.
 Who is the President of RPI.
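One way to picture the alignment is as a local sequence alignment over tokens. The slides do not say which algorithm was used, so the sketch below swaps in a Smith-Waterman style dynamic program as an assumption; the function names, the gap penalty, the similarity threshold, and the use of `difflib` for "partial credit" are likewise illustrative choices.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return 1.0 for identical tokens, otherwise a 0..1 character-level similarity."""
    return 1.0 if a == b else SequenceMatcher(None, a, b).ratio()


def alignment_score(question_tokens, passage_tokens, gap=-0.5, threshold=0.8):
    """Best local alignment score between two token sequences (Smith-Waterman style)."""
    rows, cols = len(question_tokens) + 1, len(passage_tokens) + 1
    table = [[0.0] * cols for _ in range(rows)]
    best = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            sim = similarity(question_tokens[i - 1], passage_tokens[j - 1])
            # Close matches earn partial credit; dissimilar tokens are penalised.
            pair = sim if sim >= threshold else -1.0
            table[i][j] = max(
                0.0,
                table[i - 1][j - 1] + pair,  # align the two tokens
                table[i - 1][j] + gap,       # skip a question token
                table[i][j - 1] + gap,       # skip a passage token
            )
            best = max(best, table[i][j])
    return best


question = "who is the president of rpi".split()
passage = "shirley ann jackson is the president of rpi".split()
score = alignment_score(question, passage)
```

In this sketch the five shared tokens "is the president of rpi" line up exactly, while a near-miss such as a misspelled token would still contribute a fractional amount instead of nothing.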
???Skip-Bigram
 Constructs a graph
 Nodes represent terms (syntactic objects)
 Edges represent relations
 Extracts skip-bigrams
 A skip-bigram is a pair of nodes that are either directly connected
or separated by exactly one intermediate node
 Skip-bigrams represent close relationships between terms
 Scores based on number of common skip-bigrams
???Example
 Who authored “The Good Earth”?
 “Pearl Buck, author of the good earth…”
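The skip-bigram idea can be sketched as follows. The edge lists are hand-written and hypothetical (a real system would derive them from parser output), terms are assumed to be already normalised so that "authored" and "author" share the lemma "author", and the function names are illustrative.

```python
from itertools import combinations


def skip_bigrams(edges):
    """Return unordered term pairs that are adjacent or share one intermediate node."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    pairs = {frozenset((a, b)) for a, b in edges if a != b}  # directly connected
    for middle, adjacent in neighbours.items():
        # Any two neighbours of the same node are linked through that one node.
        for a, b in combinations(sorted(adjacent - {middle}), 2):
            pairs.add(frozenset((a, b)))
    return pairs


def skip_bigram_score(question_edges, passage_edges):
    """Count the skip-bigrams shared by the question graph and the passage graph."""
    return len(skip_bigrams(question_edges) & skip_bigrams(passage_edges))


# Hypothetical relation edges for the example question and passage.
question_graph = [("who", "author"), ("author", "good earth")]
passage_graph = [("pearl buck", "author"), ("author", "good earth")]

common = skip_bigram_score(question_graph, passage_graph)
```

Here the question and passage share the pair ("author", "good earth"), and the question's focus "who" sits in the same structural position as "pearl buck" in the passage.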
???Future Directions
 More algorithms
 Logical form answer candidate scoring
 Improved Type Coercion scoring
 Begin implementing machine learning
 Temporal/Spatial reasoning
WATSON RPI
Avi Weinstock
UIMA PIPELINE
???UIMA
 The DeepQA architecture is built on top of another
architecture, UIMA (Unstructured Information
Management Architecture).
 A UIMA CAS (Common Analysis Structure) contains a
contiguous block of data (normally text), and annotations,
which contain start & end indexes into the data, and
additional data (strings, integers, doubles, arrays,
annotation references).
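The shape of a CAS can be illustrated with a toy data structure. This is not the Apache UIMA API: the `Cas` and `Annotation` classes, their method names, and the type names "Focus", "NamedEntity" and "Question" are all assumptions made for the sketch, as is the convention that `end` is exclusive.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A typed span over the CAS text plus arbitrary feature values."""
    type_name: str
    begin: int  # start index into the text (inclusive)
    end: int    # end index into the text (exclusive)
    features: dict = field(default_factory=dict)


@dataclass
class Cas:
    """Toy stand-in for a CAS: one block of text and the annotations over it."""
    text: str
    annotations: list = field(default_factory=list)

    def annotate(self, type_name, begin, end, **features):
        """Create an annotation over text[begin:end] and record it."""
        annotation = Annotation(type_name, begin, end, features)
        self.annotations.append(annotation)
        return annotation

    def covered_text(self, annotation):
        """Return the slice of text that the annotation points at."""
        return self.text[annotation.begin:annotation.end]


cas = Cas("Who is the President of RPI?")
focus = cas.annotate("Focus", 0, 3)
org = cas.annotate("NamedEntity", 24, 27, kind="ORGANIZATION")
# A feature may hold a reference to another annotation.
question = cas.annotate("Question", 0, len(cas.text), focus=focus)
```

The text itself is never modified; each pipeline stage only adds annotations that point back into it, which is what lets independent components build on one another's output.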
???UIMA
 CAS Multipliers output multiple CASes based on the data
in the input CAS; this facilitates parallelization, which is
the key to Watson's response time.
???DeepQA Architecture
 The DeepQA architecture, which both IBM Watson and RPI
MiniDeepQA implement, is a QA (Question Answering) system
that answers questions by generating as many potential
answers as is practical, then filtering them with multiple
evidence scorers in parallel.
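The generate-then-score pattern can be sketched as below. The two scorers are deliberately trivial and hypothetical, merging by simple summation is an assumption (the slides do not specify how scores are combined), and a thread pool stands in for whatever parallelism the real pipeline uses.

```python
from concurrent.futures import ThreadPoolExecutor


def length_scorer(question, candidate):
    """Hypothetical scorer: prefer shorter candidates."""
    return 1.0 / len(candidate)


def overlap_scorer(question, candidate):
    """Hypothetical scorer: penalise candidates that merely repeat question words."""
    q = set(question.lower().split())
    c = set(candidate.lower().split())
    return 1.0 - len(q & c) / len(c)


def rank_candidates(question, candidates, scorers):
    """Score every (candidate, scorer) pair in parallel and rank by summed score."""
    jobs = [(c, s) for c in candidates for s in scorers]
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(lambda job: job[1](question, job[0]), jobs))
    totals = {c: 0.0 for c in candidates}
    for (candidate, _), score in zip(jobs, scores):
        totals[candidate] += score  # merge step: plain sum (assumption)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranked = rank_candidates(
    "Who is the President of RPI?",
    ["Shirley Ann Jackson", "the President"],
    [length_scorer, overlap_scorer],
)
```

Each (candidate, scorer) pair is independent of the others, which is why this stage parallelises so naturally.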
???Data cache
 IBM Watson has a pre-processed corpus of information,
generated automatically by a subset of the DeepQA
pipeline from an enormous volume of raw text, which the
remainder of the pipeline uses at question time.
 As our system retrieves information from the internet on a
per-question basis, it cannot (practically) process the
whole corpus in advance.
???Data cache
 Since parsing the documents takes a large amount of
time, in order to test/demonstrate the system, it is
beneficial to store webpages and associated parses
locally. This allows a question that has been asked
before, and candidates that come up for multiple
questions, to be processed faster.
 As a side-benefit of the caching, if a website is
temporarily down, its data can still be used (provided it
was fetched and cached while the site was up).
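A cache of this kind can be sketched as a small disk store keyed by URL. The `PageCache` class, the JSON file layout, and the `fetch` and `parse` callables are hypothetical names chosen for the example, not the project's actual code.

```python
import hashlib
import json
from pathlib import Path


class PageCache:
    """Disk cache keyed by URL, storing the page text together with its parse."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, url):
        # Hash the URL so that any URL maps to a safe file name.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.directory / (digest + ".json")

    def get_or_compute(self, url, fetch, parse):
        """Return the cached entry for `url`, fetching and parsing only on a miss."""
        path = self._path(url)
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))
        text = fetch(url)
        entry = {"url": url, "text": text, "parse": parse(text)}
        path.write_text(json.dumps(entry), encoding="utf-8")
        return entry
```

On a cache hit neither `fetch` nor `parse` is called, so a repeated question skips both the network round trip and the expensive parse, and a page that was cached earlier remains usable while its site is unreachable.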
???Graphical User Interface
 Towards the start of the project, we ran our system using
the Document Analyzer (a UIMA-provided tool).
 While it was useful, once the entire pipeline was set
up, testing the full system with it required more manual input
than necessary.
 Additionally, there wasn't a convenient way to display just
the intended output, nor intermediate output at a level
suitable for monitoring progress/giving demonstrations.
???Graphical User Interface
 The GUI addresses these concerns, and has additionally
been extensively tweaked to be demonstration-friendly.

More Related Content

What's hot

AI Beyond Deep Learning
AI Beyond Deep LearningAI Beyond Deep Learning
AI Beyond Deep LearningAndre Freitas
 
Effective Semantics for Engineering NLP Systems
Effective Semantics for Engineering NLP SystemsEffective Semantics for Engineering NLP Systems
Effective Semantics for Engineering NLP SystemsAndre Freitas
 
Building AI Applications using Knowledge Graphs
Building AI Applications using Knowledge GraphsBuilding AI Applications using Knowledge Graphs
Building AI Applications using Knowledge GraphsAndre Freitas
 
From metasearch to metaservices
From metasearch to metaservicesFrom metasearch to metaservices
From metasearch to metaservicesdswalker
 
SolrSherlock: Linkfinding among Biomolecules with Literature-based Discovery
SolrSherlock: Linkfinding among Biomolecules with Literature-based DiscoverySolrSherlock: Linkfinding among Biomolecules with Literature-based Discovery
SolrSherlock: Linkfinding among Biomolecules with Literature-based DiscoveryJack Park
 
Summarization for dragon star program
Summarization for dragon  star programSummarization for dragon  star program
Summarization for dragon star programYueshen Xu
 
Reputation Management for Early Career Researchers
Reputation Management for Early Career ResearchersReputation Management for Early Career Researchers
Reputation Management for Early Career ResearchersMicah Altman
 
"UX for the win!" at #CityMash: how we did grounded theory coding of qualitat...
"UX for the win!" at #CityMash: how we did grounded theory coding of qualitat..."UX for the win!" at #CityMash: how we did grounded theory coding of qualitat...
"UX for the win!" at #CityMash: how we did grounded theory coding of qualitat...Andrew Preater
 
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538Krishna Sankar
 
Data Science Folk Knowledge
Data Science Folk KnowledgeData Science Folk Knowledge
Data Science Folk KnowledgeKrishna Sankar
 
Opinion Dynamics on Networks
Opinion Dynamics on NetworksOpinion Dynamics on Networks
Opinion Dynamics on NetworksMason Porter
 
Master Beginners Workshop - September 2019
Master Beginners Workshop - September 2019Master Beginners Workshop - September 2019
Master Beginners Workshop - September 2019Miguel Pardal
 
OpenEssayist: Extractive Summarisation and Formative Assessment (DCLA13)
OpenEssayist: Extractive Summarisation and Formative Assessment (DCLA13)OpenEssayist: Extractive Summarisation and Formative Assessment (DCLA13)
OpenEssayist: Extractive Summarisation and Formative Assessment (DCLA13)Nicolas Van Labeke
 
The Hitchhiker's Guide to Machine Learning with Python & Apache Spark
The Hitchhiker's Guide to Machine Learning with Python & Apache SparkThe Hitchhiker's Guide to Machine Learning with Python & Apache Spark
The Hitchhiker's Guide to Machine Learning with Python & Apache SparkKrishna Sankar
 
R, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsR, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsKrishna Sankar
 
Iulia Pasov, Sixt. Trends in sentiment analysis. The entire history from rule...
Iulia Pasov, Sixt. Trends in sentiment analysis. The entire history from rule...Iulia Pasov, Sixt. Trends in sentiment analysis. The entire history from rule...
Iulia Pasov, Sixt. Trends in sentiment analysis. The entire history from rule...IT Arena
 
The Art of Social Media Analysis with Twitter & Python-OSCON 2012
The Art of Social Media Analysis with Twitter & Python-OSCON 2012The Art of Social Media Analysis with Twitter & Python-OSCON 2012
The Art of Social Media Analysis with Twitter & Python-OSCON 2012OSCON Byrum
 
Whither subject access?
Whither subject access?Whither subject access?
Whither subject access?kramsey
 
2021 Summary of Research on the STAGES Developmental Model
2021 Summary of Research on the STAGES Developmental Model2021 Summary of Research on the STAGES Developmental Model
2021 Summary of Research on the STAGES Developmental Modelperspegrity5
 
Dcla13 discourse, computation and context – sociocultural dcla
Dcla13 discourse, computation and context – sociocultural dclaDcla13 discourse, computation and context – sociocultural dcla
Dcla13 discourse, computation and context – sociocultural dclaSimon Knight
 

What's hot (20)

AI Beyond Deep Learning
AI Beyond Deep LearningAI Beyond Deep Learning
AI Beyond Deep Learning
 
Effective Semantics for Engineering NLP Systems
Effective Semantics for Engineering NLP SystemsEffective Semantics for Engineering NLP Systems
Effective Semantics for Engineering NLP Systems
 
Building AI Applications using Knowledge Graphs
Building AI Applications using Knowledge GraphsBuilding AI Applications using Knowledge Graphs
Building AI Applications using Knowledge Graphs
 
From metasearch to metaservices
From metasearch to metaservicesFrom metasearch to metaservices
From metasearch to metaservices
 
SolrSherlock: Linkfinding among Biomolecules with Literature-based Discovery
SolrSherlock: Linkfinding among Biomolecules with Literature-based DiscoverySolrSherlock: Linkfinding among Biomolecules with Literature-based Discovery
SolrSherlock: Linkfinding among Biomolecules with Literature-based Discovery
 
Summarization for dragon star program
Summarization for dragon  star programSummarization for dragon  star program
Summarization for dragon star program
 
Reputation Management for Early Career Researchers
Reputation Management for Early Career ResearchersReputation Management for Early Career Researchers
Reputation Management for Early Career Researchers
 
"UX for the win!" at #CityMash: how we did grounded theory coding of qualitat...
"UX for the win!" at #CityMash: how we did grounded theory coding of qualitat..."UX for the win!" at #CityMash: how we did grounded theory coding of qualitat...
"UX for the win!" at #CityMash: how we did grounded theory coding of qualitat...
 
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
 
Data Science Folk Knowledge
Data Science Folk KnowledgeData Science Folk Knowledge
Data Science Folk Knowledge
 
Opinion Dynamics on Networks
Opinion Dynamics on NetworksOpinion Dynamics on Networks
Opinion Dynamics on Networks
 
Master Beginners Workshop - September 2019
Master Beginners Workshop - September 2019Master Beginners Workshop - September 2019
Master Beginners Workshop - September 2019
 
OpenEssayist: Extractive Summarisation and Formative Assessment (DCLA13)
OpenEssayist: Extractive Summarisation and Formative Assessment (DCLA13)OpenEssayist: Extractive Summarisation and Formative Assessment (DCLA13)
OpenEssayist: Extractive Summarisation and Formative Assessment (DCLA13)
 
The Hitchhiker's Guide to Machine Learning with Python & Apache Spark
The Hitchhiker's Guide to Machine Learning with Python & Apache SparkThe Hitchhiker's Guide to Machine Learning with Python & Apache Spark
The Hitchhiker's Guide to Machine Learning with Python & Apache Spark
 
R, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsR, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science Competitions
 
Iulia Pasov, Sixt. Trends in sentiment analysis. The entire history from rule...
Iulia Pasov, Sixt. Trends in sentiment analysis. The entire history from rule...Iulia Pasov, Sixt. Trends in sentiment analysis. The entire history from rule...
Iulia Pasov, Sixt. Trends in sentiment analysis. The entire history from rule...
 
The Art of Social Media Analysis with Twitter & Python-OSCON 2012
The Art of Social Media Analysis with Twitter & Python-OSCON 2012The Art of Social Media Analysis with Twitter & Python-OSCON 2012
The Art of Social Media Analysis with Twitter & Python-OSCON 2012
 
Whither subject access?
Whither subject access?Whither subject access?
Whither subject access?
 
2021 Summary of Research on the STAGES Developmental Model
2021 Summary of Research on the STAGES Developmental Model2021 Summary of Research on the STAGES Developmental Model
2021 Summary of Research on the STAGES Developmental Model
 
Dcla13 discourse, computation and context – sociocultural dcla
Dcla13 discourse, computation and context – sociocultural dclaDcla13 discourse, computation and context – sociocultural dcla
Dcla13 discourse, computation and context – sociocultural dcla
 

Viewers also liked

"IBM Watson — компьютерная лингвистика". Артём Семенихин, IBM
"IBM Watson — компьютерная лингвистика". Артём Семенихин, IBM"IBM Watson — компьютерная лингвистика". Артём Семенихин, IBM
"IBM Watson — компьютерная лингвистика". Артём Семенихин, IBMYandex
 
IBM Watson: How it Works, and What it means for Society beyond winning Jeopardy!
IBM Watson: How it Works, and What it means for Society beyond winning Jeopardy!IBM Watson: How it Works, and What it means for Society beyond winning Jeopardy!
IBM Watson: How it Works, and What it means for Society beyond winning Jeopardy!Tony Pearson
 
Estre/Apresentação Paulinia
Estre/Apresentação PauliniaEstre/Apresentação Paulinia
Estre/Apresentação PauliniaAnita Rocha
 
SAGE | IOB CORPORATE - MAPEAMENTO DE RISCOS ESOCIAL
SAGE | IOB  CORPORATE - MAPEAMENTO DE RISCOS ESOCIALSAGE | IOB  CORPORATE - MAPEAMENTO DE RISCOS ESOCIAL
SAGE | IOB CORPORATE - MAPEAMENTO DE RISCOS ESOCIALMartcom Digital
 
Gate 2017 me 1 solutions with explanations
Gate 2017 me 1 solutions with explanationsGate 2017 me 1 solutions with explanations
Gate 2017 me 1 solutions with explanationskulkarni Academy
 
Microsoft visual basic 6 animasi
Microsoft visual basic 6 animasiMicrosoft visual basic 6 animasi
Microsoft visual basic 6 animasiI H
 
Introduction to epid
Introduction to epidIntroduction to epid
Introduction to epidBeMyApp
 
Jungle Scout's Million Dollar Case Study: How To Find Product Ideas
Jungle Scout's Million Dollar Case Study: How To Find Product IdeasJungle Scout's Million Dollar Case Study: How To Find Product Ideas
Jungle Scout's Million Dollar Case Study: How To Find Product IdeasGen Furukawa
 
Introduction ciot workshop premeetup
Introduction ciot workshop premeetupIntroduction ciot workshop premeetup
Introduction ciot workshop premeetupBeMyApp
 
Social Mobile Analytics Cloud
Social Mobile Analytics CloudSocial Mobile Analytics Cloud
Social Mobile Analytics CloudMphasis
 
Putting IBM Watson to Work.. Saxena
Putting IBM Watson to Work.. SaxenaPutting IBM Watson to Work.. Saxena
Putting IBM Watson to Work.. SaxenaManoj Saxena
 
Building with Watson - Serverless Chatbots with PubNub and Conversation
Building with Watson - Serverless Chatbots with PubNub and ConversationBuilding with Watson - Serverless Chatbots with PubNub and Conversation
Building with Watson - Serverless Chatbots with PubNub and ConversationIBM Watson
 
Gtext homes training manual. jan 2017
Gtext homes training manual. jan 2017Gtext homes training manual. jan 2017
Gtext homes training manual. jan 2017Arit Bassey
 

Viewers also liked (20)

"IBM Watson — компьютерная лингвистика". Артём Семенихин, IBM
"IBM Watson — компьютерная лингвистика". Артём Семенихин, IBM"IBM Watson — компьютерная лингвистика". Артём Семенихин, IBM
"IBM Watson — компьютерная лингвистика". Артём Семенихин, IBM
 
Watson and Open Source Tools
Watson and Open Source ToolsWatson and Open Source Tools
Watson and Open Source Tools
 
IBM Watson: How it Works, and What it means for Society beyond winning Jeopardy!
IBM Watson: How it Works, and What it means for Society beyond winning Jeopardy!IBM Watson: How it Works, and What it means for Society beyond winning Jeopardy!
IBM Watson: How it Works, and What it means for Society beyond winning Jeopardy!
 
Estre/Apresentação Paulinia
Estre/Apresentação PauliniaEstre/Apresentação Paulinia
Estre/Apresentação Paulinia
 
SAGE | IOB CORPORATE - MAPEAMENTO DE RISCOS ESOCIAL
SAGE | IOB  CORPORATE - MAPEAMENTO DE RISCOS ESOCIALSAGE | IOB  CORPORATE - MAPEAMENTO DE RISCOS ESOCIAL
SAGE | IOB CORPORATE - MAPEAMENTO DE RISCOS ESOCIAL
 
Watson Analytic
Watson AnalyticWatson Analytic
Watson Analytic
 
Gate 2017 me 1 solutions with explanations
Gate 2017 me 1 solutions with explanationsGate 2017 me 1 solutions with explanations
Gate 2017 me 1 solutions with explanations
 
Microsoft visual basic 6 animasi
Microsoft visual basic 6 animasiMicrosoft visual basic 6 animasi
Microsoft visual basic 6 animasi
 
Introduction to epid
Introduction to epidIntroduction to epid
Introduction to epid
 
Telephone language
Telephone languageTelephone language
Telephone language
 
Watson System
Watson SystemWatson System
Watson System
 
Jungle Scout's Million Dollar Case Study: How To Find Product Ideas
Jungle Scout's Million Dollar Case Study: How To Find Product IdeasJungle Scout's Million Dollar Case Study: How To Find Product Ideas
Jungle Scout's Million Dollar Case Study: How To Find Product Ideas
 
IBM's watson
IBM's watsonIBM's watson
IBM's watson
 
Safety Management & Training in Higher Ed
Safety Management & Training in Higher EdSafety Management & Training in Higher Ed
Safety Management & Training in Higher Ed
 
Introduction ciot workshop premeetup
Introduction ciot workshop premeetupIntroduction ciot workshop premeetup
Introduction ciot workshop premeetup
 
Social Mobile Analytics Cloud
Social Mobile Analytics CloudSocial Mobile Analytics Cloud
Social Mobile Analytics Cloud
 
Putting IBM Watson to Work.. Saxena
Putting IBM Watson to Work.. SaxenaPutting IBM Watson to Work.. Saxena
Putting IBM Watson to Work.. Saxena
 
Building with Watson - Serverless Chatbots with PubNub and Conversation
Building with Watson - Serverless Chatbots with PubNub and ConversationBuilding with Watson - Serverless Chatbots with PubNub and Conversation
Building with Watson - Serverless Chatbots with PubNub and Conversation
 
Gtext homes training manual. jan 2017
Gtext homes training manual. jan 2017Gtext homes training manual. jan 2017
Gtext homes training manual. jan 2017
 
Yoga Mantra
Yoga MantraYoga Mantra
Yoga Mantra
 

Similar to Watson at RPI - Summer 2013

ICPSR Data Exploration Tools
ICPSR Data Exploration ToolsICPSR Data Exploration Tools
ICPSR Data Exploration ToolsICPSR
 
Semantic Interpretation of User Query for Question Answering on Interlinked Data
Semantic Interpretation of User Query for Question Answering on Interlinked DataSemantic Interpretation of User Query for Question Answering on Interlinked Data
Semantic Interpretation of User Query for Question Answering on Interlinked DataSaeedeh Shekarpour
 
Data Designed for Discovery
Data Designed for DiscoveryData Designed for Discovery
Data Designed for DiscoveryOCLC
 
Institutional Repositories (NLA 2011)
Institutional Repositories (NLA 2011)Institutional Repositories (NLA 2011)
Institutional Repositories (NLA 2011)Paul Royster
 
Knowledge Technologies: Opportunities and Challenges
Knowledge Technologies: Opportunities and ChallengesKnowledge Technologies: Opportunities and Challenges
Knowledge Technologies: Opportunities and ChallengesFariz Darari
 
Csci 6530 2016 spring presentation
Csci 6530 2016 spring presentationCsci 6530 2016 spring presentation
Csci 6530 2016 spring presentationciakov
 
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...Susanna-Assunta Sansone
 
Dynamic Search Using Semantics & Statistics
Dynamic Search Using Semantics & StatisticsDynamic Search Using Semantics & Statistics
Dynamic Search Using Semantics & StatisticsPaul Hofmann
 
Information literacy
Information literacyInformation literacy
Information literacySean Socha
 
Importing life science at a into Neo4j
Importing life science at a into Neo4jImporting life science at a into Neo4j
Importing life science at a into Neo4jSimon Jupp
 
Extracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
Extracting Relevant Questions to an RDF Dataset Using Formal Concept AnalysisExtracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
Extracting Relevant Questions to an RDF Dataset Using Formal Concept AnalysisMathieu d'Aquin
 
Csci 6530 2015 fall_cli2
Csci 6530 2015 fall_cli2Csci 6530 2015 fall_cli2
Csci 6530 2015 fall_cli2ciakov
 
Data management for researchers
Data management for researchersData management for researchers
Data management for researchersDirk Roorda
 
ASEE2012 Presentation: iKNEER User Study
ASEE2012 Presentation: iKNEER User StudyASEE2012 Presentation: iKNEER User Study
ASEE2012 Presentation: iKNEER User StudyXin Chen
 
Archives Hub - Data in :: Data out
Archives Hub - Data in :: Data outArchives Hub - Data in :: Data out
Archives Hub - Data in :: Data outJane Stevenson
 
VIVO: enabling the discovery of research and scholarship
VIVO: enabling the discovery of research and scholarshipVIVO: enabling the discovery of research and scholarship
VIVO: enabling the discovery of research and scholarshipPaul Albert
 
One Scientist’s Wish List for Scientific Publishers
One Scientist’s Wish List for Scientific PublishersOne Scientist’s Wish List for Scientific Publishers
One Scientist’s Wish List for Scientific PublishersPhilip Bourne
 
Project Credit: Melissa Haendel - On the Nature of Credit
Project Credit: Melissa Haendel - On the Nature of CreditProject Credit: Melissa Haendel - On the Nature of Credit
Project Credit: Melissa Haendel - On the Nature of CreditCASRAI
 

Similar to Watson at RPI - Summer 2013 (20)

ICPSR Data Exploration Tools
ICPSR Data Exploration ToolsICPSR Data Exploration Tools
ICPSR Data Exploration Tools
 
Semantic Interpretation of User Query for Question Answering on Interlinked Data
Semantic Interpretation of User Query for Question Answering on Interlinked DataSemantic Interpretation of User Query for Question Answering on Interlinked Data
Semantic Interpretation of User Query for Question Answering on Interlinked Data
 
Data Designed for Discovery
Data Designed for DiscoveryData Designed for Discovery
Data Designed for Discovery
 
Institutional Repositories (NLA 2011)
Institutional Repositories (NLA 2011)Institutional Repositories (NLA 2011)
Institutional Repositories (NLA 2011)
 
Knowledge Technologies: Opportunities and Challenges
Knowledge Technologies: Opportunities and ChallengesKnowledge Technologies: Opportunities and Challenges
Knowledge Technologies: Opportunities and Challenges
 
Csci 6530 2016 spring presentation
Csci 6530 2016 spring presentationCsci 6530 2016 spring presentation
Csci 6530 2016 spring presentation
 
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
 
GE80C Bortnick 4 19 10
GE80C Bortnick 4 19 10GE80C Bortnick 4 19 10
GE80C Bortnick 4 19 10
 
Dynamic Search Using Semantics & Statistics
Dynamic Search Using Semantics & StatisticsDynamic Search Using Semantics & Statistics
Dynamic Search Using Semantics & Statistics
 
Information literacy
Information literacyInformation literacy
Information literacy
 
Importing life science at a into Neo4j
Importing life science at a into Neo4jImporting life science at a into Neo4j
Importing life science at a into Neo4j
 
Vivo Search
Vivo SearchVivo Search
Vivo Search
 
Extracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
Extracting Relevant Questions to an RDF Dataset Using Formal Concept AnalysisExtracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
Extracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
 
Csci 6530 2015 fall_cli2
Csci 6530 2015 fall_cli2Csci 6530 2015 fall_cli2
Csci 6530 2015 fall_cli2
 
Data management for researchers
Data management for researchersData management for researchers
Data management for researchers
 
ASEE2012 Presentation: iKNEER User Study
ASEE2012 Presentation: iKNEER User StudyASEE2012 Presentation: iKNEER User Study
ASEE2012 Presentation: iKNEER User Study
 
Archives Hub - Data in :: Data out
Archives Hub - Data in :: Data outArchives Hub - Data in :: Data out
Archives Hub - Data in :: Data out
 
VIVO: enabling the discovery of research and scholarship
VIVO: enabling the discovery of research and scholarshipVIVO: enabling the discovery of research and scholarship
VIVO: enabling the discovery of research and scholarship
 
One Scientist’s Wish List for Scientific Publishers
One Scientist’s Wish List for Scientific PublishersOne Scientist’s Wish List for Scientific Publishers
One Scientist’s Wish List for Scientific Publishers
 
Project Credit: Melissa Haendel - On the Nature of Credit
Project Credit: Melissa Haendel - On the Nature of CreditProject Credit: Melissa Haendel - On the Nature of Credit
Project Credit: Melissa Haendel - On the Nature of Credit
 


Watson at RPI - Summer 2013

  • 1. TECHNICAL PROJECT REVIEW, 14TH AUGUST, 2013. WATSON @ RPI, WATSON RESEARCH LAB. PROFESSOR JIM HENDLER, SIMON ELLIS, KATE MCGUIRE, NICOLE NEGEDLY, DILLON BURNS, MATT KLAWONN, AVI WEINSTOCK
  • 4. Watson is…
      - … a piece of software that will run on your laptop
          - Though very slowly
          - Specialised hardware and control platform
      - … an implementation of the DeepQA concept
      - … the first iteration of the “cognitive computing” platform
      - … a very clever artificial intelligence
          - A very clever application of human intelligence
  • 5. Background
      - IBM agrees to give RPI a version of Watson
      - Watson team is set up to undertake summer research on the Watson system
      - Watson hardware/software configuration not ready at beginning of summer session
      - So what do we do with: 10 weeks, 5 undergraduates and 1 graduate…
  • 6. Challenge accepted!
      - Build a new version of Watson
          - Based on research published in IBM J Res & Dev
          - With support and input from IBM Research
      - Use open source libraries wherever possible
          - Faster development
          - No IP issues
      - Turns out to be a very useful project
          - Trains team in the details of the operation of Watson system
          - Can be used in education, training, testing, evaluation
  • 7. Sample output
      - Demo run of RPI version of Watson
      - Shows output representing most of the “pipeline”
  • 8. Inside Watson
      - Watson pipeline as published by IBM; see IBM J Res & Dev 56 (3/4), May/July 2012, p. 15:2
  • 11. Question Analysis
      - What is the question asking for?
      - What structured information can be determined from the unstructured text of the question?
      - Topics
          - Parsers
          - Syntactic and Semantic Analysis Tools
          - Focus and Lexical Answer Type Detection
          - Future Work
  • 12. Parsing
      - Open-source parsers
          - Stanford Parser
          - Berkeley Parser
      - Functions
          - Determine grammatical structure of text
          - Parse trees, part-of-speech tags, dependency relations
  • 13. Coreference Resolution
      - What terms in the question refer to the same entity?
  • 14. Named Entity Extraction
      - Identifies people, places, organizations, and time spans.
  • 15. Focus and Lexical Answer Type
      - POETS & POETRY: He was a bank clerk in the Yukon before he published Songs of a Sourdough in 1907
      - Focus: “he”
      - LAT: “he”, “clerk”, “poet”
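
The focus in the slide 15 example can be illustrated with a deliberately naive heuristic. The Java sketch below is an assumption, not the team's method: it simply treats the first pronoun in the clue as the focus. The class name `FocusFinder`, the pronoun list, and the method `findFocus` are all hypothetical. Detecting LATs such as “clerk” or “poet” would need parser output and is not attempted here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

final class FocusFinder {
    // Hypothetical, incomplete pronoun list used only for this illustration.
    private static final Set<String> PRONOUNS = new HashSet<>(
            Arrays.asList("he", "she", "it", "they", "this", "these"));

    /** Returns the first pronoun in the clue (lower-cased), if any. */
    static Optional<String> findFocus(String clue) {
        for (String token : clue.split("\\W+")) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (PRONOUNS.contains(lower)) {
                return Optional.of(lower);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String clue = "He was a bank clerk in the Yukon before he published "
                + "Songs of a Sourdough in 1907";
        System.out.println(findFocus(clue).orElse("(no focus found)"));
    }
}
```

A real focus detector would work on the parse tree rather than on raw tokens, so this only shows the shape of the input and output.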
  • 16. Future Work
      - Adding additional parsers to the system
          - Comparison of parser output
      - Relation extraction
          - Prolog code and database
      - Improved focus and LAT detection
          - Princeton WordNet
  • 19. Primary Search & Corpus Generation
      - Primary search is used to generate our corpus of information from which to take candidate answers, passages, supporting evidence, and essentially all textual input to the system.
      - Search Wikipedia for the focus identified during the Question Analysis phase.
      - Grab the first 5 documents returned as the corpus.
      - Uses the Jsoup library to collect and parse HTML.
  • 20. JSoup
      String[] results = {"/wiki/Snapple", "/wiki/Dr_Pepper_Snapple_Group", "/wiki/Snapple_Theater…."
  • 21. JSoup
      String[] results = {"/wiki/Snapple", "/wiki/Dr_Pepper_Snapple_Group", "/wiki/Snapple_Theater…."
      To Cache
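
The link-collection step on the JSoup slides can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the project's code: a regular expression stands in for Jsoup so the example runs without dependencies, and the class `WikiLinkExtractor`, the method `topLinks`, and the sample HTML are hypothetical. The limit of 5 follows the "first 5 documents" rule from slide 19.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WikiLinkExtractor {
    // Matches href="/wiki/..." values, skipping anchors (#) and
    // namespaced pages such as File: or Help: (assumed unwanted here).
    private static final Pattern WIKI_HREF =
            Pattern.compile("href=\"(/wiki/[^\"#:]+)\"");

    /** Returns up to `limit` distinct /wiki/ paths in document order. */
    static List<String> topLinks(String html, int limit) {
        Set<String> seen = new LinkedHashSet<>();
        Matcher m = WIKI_HREF.matcher(html);
        while (m.find() && seen.size() < limit) {
            seen.add(m.group(1));
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        String html = "<a href=\"/wiki/Snapple\">Snapple</a>"
                + "<a href=\"/wiki/Dr_Pepper_Snapple_Group\">Group</a>"
                + "<a href=\"/wiki/Snapple\">Snapple again</a>"
                + "<a href=\"/wiki/File:Logo.png\">image</a>";
        System.out.println(topLinks(html, 5));
    }
}
```

With Jsoup itself, the same selection would more likely be written with a CSS selector such as `a[href^=/wiki/]` on a parsed document; the regular expression is used here only to keep the sketch self-contained.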
  • 23. DBpedia
      - As of 2011 it had 3.64 million things categorized in its database
      - URLs are a direct map to Wikipedia's
      - Wikipedia redirect lists help with alternate names for entities and closely related concepts to certain entities or people
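
The "direct map" between Wikipedia and DBpedia URLs can be shown with a small Java sketch. It assumes the usual convention that the article title after `/wiki/` is reused after `http://dbpedia.org/resource/`; the class `DbpediaMapper` and the method `toDbpediaUri` are hypothetical names, and redirects and character escaping are not handled.

```java
final class DbpediaMapper {
    private static final String WIKI_PREFIX = "/wiki/";
    private static final String DBPEDIA_PREFIX = "http://dbpedia.org/resource/";

    /** Maps a Wikipedia path such as /wiki/Snapple to its DBpedia resource URI. */
    static String toDbpediaUri(String wikiPath) {
        if (wikiPath == null || !wikiPath.startsWith(WIKI_PREFIX)) {
            throw new IllegalArgumentException("Not a /wiki/ path: " + wikiPath);
        }
        // Reuse the article title unchanged after the DBpedia prefix.
        return DBPEDIA_PREFIX + wikiPath.substring(WIKI_PREFIX.length());
    }

    public static void main(String[] args) {
        System.out.println(toDbpediaUri("/wiki/Snapple"));
    }
}
```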
  • 24. Future Directions
      - Use DBpedia to fact-check answers about entities in the database
      - Making use of the DBpedia subject matching
  • 26. Search Result Processing and Candidate Generation
  • 28. Search Result Processing
      - Passage Retrieval
          - Watson: Indri and Lucene
          - Identifies each HTML sentence and adds both the HTML and the clean text to the passage type
          - Adds information about each passage
      - Passage Parsing
          - Forms parse trees for each individual sentence
          - Add an array of passages to each document
      <p><b>Shirley Ann Jackson</b> (born August 5, 1946) is an <a href="/wiki/United_States" title="United States">American</a> <a href="/wiki/Physicist" title="Physicist">physicist</a>, and the 18th president of <a href="/wiki/Rensselaer_Polytechnic_Institute" title="Rensselaer Polytechnic Institute">Rensselaer Polytechnic Institute</a>. She received her <a href="/wiki/Doctor_of_Philosophy" title="Doctor of Philosophy">Ph.D.</a> in physics from the <a href="/wiki/Massachusetts_Institute_of_Technology" title="Massachusetts Institute of Technology">Massachusetts Institute of Technology</a> in 1973, becoming the first <a href="/wiki/African_American" title="African American">African American</a> woman to earn a doctorate from MIT in nuclear physics.<sup id="cite_ref-profile_2-0" class="reference"><a href="#cite_note-profile-2"><span>[</span>2<span>]</span></a></sup><sup id="cite_ref-AAAS_3-0" class="reference"><a href="#cite_note-AAAS-3"><span>[</span>3<span>]</span></a></sup></p> <div id="toc" class="toc"> <div id="toctitle"> <h2>Contents</h2> </div> <ul> <li class="toclevel-1 tocsection-1"><a href="#Early_life_and_schooling"><span class="tocnumber">1</span> <span class="toctext">Early life and schooling</span></a></li> <li class="toclevel-1 tocsection-2"><a href="#Career"><span class="tocnumber">2</span> <span class="toctext">Career</span></a> <ul> <li class="toclevel-2 tocsection-3"><a href="#Rensselaer_Polytechnic_Institute"><span class="tocnumber">2.1</span> <span class="toctext">Rensselaer Polytechnic Institute</span></a></li> </ul> </li> <li class="toclevel-1 tocsection-4"><a href="#Honors_and_distinctions"><span class="tocnumber">3</span> <span class="toctext">Honors and distinctions</span></a> <ul> <li class="toclevel-2 tocsection-5"><a href="#Boards_of_directors"><span class="tocnumber">3.1</span> <span class="toctext">Boards of directors</span></a></li> </ul> </li> <li class="toclevel-1 tocsection-6"><a href="#Personal"><span class="tocnumber">4</span> <span class="toctext">Personal</span></a></li> <li class="toclevel-1 tocsection-7"><a href="#References"><span class="tocnumber">5</span> <span class="toctext">References</span></a></li> <li class="toclevel-1 tocsection-8"><a href="#External_links"><span class="tocnumber">6</span> <span class="toctext">External links</span></a></li> </ul> </div> <h2><span class="mw-headline" id="Early_life_and_schooling">Early life and schooling</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/w/index.php?title=Shirley_Ann_Jackson&amp;action=edit&amp;section=1" title="Edit section: Early life and schooling">edit source</a><span class="mw-editsection-divider"> | </span><a href="/w/index.php?title=Shirley_Ann_Jackson&amp;veaction=edit&amp;section=1" title="Edit section: Early life and schooling" class="mw-editsection-visualeditor">edit</a><span class="mw-editsection-bracket">]</span></span></h2> <p>Jackson was born in Washington D.C. Her parents, Beatrice and George Jackson, strongly valued education and encouraged her in school.<sup id="cite_ref-diaspora_4-0" class="reference"><a href="#cite_note-diaspora-4"><span>[</span>4<span>]</span></a></sup> Her father spurred on her interest in science by helping her with projects for her science classes. At Roosevelt High School, Jackson attended accelerated programs in both math and science, and graduated in 1964 as valedictorian.<sup id="cite_ref-diaspora_4-1" class="reference"><a href="#cite_note-diaspora-4"><span>[</span>4<span>]</span></a></sup></p>
  • 30. Search Result Processing
      - Passage Retrieval
          - Watson: Indri and Lucene
          - Identifies each HTML sentence and adds both the HTML and the clean text to the passage type
          - Adds information about each passage
      - Passage Parsing
          - Forms parse trees for each individual sentence
          - Add an array of passages to each document
      <b>Shirley Ann Jackson</b> (born August 5, 1946) is an <a href="/wiki/United_States" title="United States">American</a> <a href="/wiki/Physicist" title="Physicist">physicist</a>, and the 18th president of <a href="/wiki/Rensselaer_Polytechnic_Institute" title="Rensselaer Polytechnic Institute">Rensselaer Polytechnic Institute</a>. She received her <a href="/wiki/Doctor_of_Philosophy" title="Doctor of Philosophy">Ph.D.</a> in physics from the <a href="/wiki/Massachusetts_Institute_of_Technology" title="Massachusetts Institute of Technology">Massachusetts Institute of Technology</a> in 1973, becoming the first <a href="/wiki/African_American" title="African American">African American</a> woman to earn a doctorate from MIT in nuclear physics.<sup id="cite_ref-profile_2-0" class="reference"><a href="#cite_note-profile-2"><span>[</span>2<span>]</span></a></sup><sup id="cite_ref-AAAS_3-0" class="reference"><a href="#cite_note-AAAS-3"><span>[</span>3<span>]</span></a></sup> <div id="toc" class="toc">">edit</a><span class="mw-edit]</span></span></h2> Jackson was born in Washington D.C. Her parents, Beatrice and George Jackson, strongly valued education and encouraged her in school.<sup id="cite_ref-diaspora_4-0" class="reference"><a href="#cite_note-diaspora-4"><span>[</span>4<span>]</span></a></sup> Her father spurred on her interest in science by helping her with projects for her science classes. At Roosevelt High School, Jackson attended accelerated programs in both math and science, and graduated in 1964 as valedictorian.<sup id="cite_ref-diaspora_4-1" class="reference"><a href="#cite_note-diaspora-4"><span>[</span>4<span>]</span></a></sup>
  • 31. Search Result Processing
      - Passage Retrieval
          - Watson: Indri and Lucene
          - Identifies each HTML sentence and adds both the HTML and the clean text to the passage type
          - Adds information about each passage
      - Passage Parsing
          - Forms parse trees for each individual sentence
          - Add an array of passages to each document
      <b>Shirley Ann Jackson</b> (born August 5, 1946) is an <a href="/wiki/United_States" title="United States">American</a> <a href="/wiki/Physicist" title="Physicist">physicist</a>, and the 18th president of <a href="/wiki/Rensselaer_Polytechnic_Institute" title="Rensselaer Polytechnic Institute">Rensselaer Polytechnic Institute</a>. She received her <a href="/wiki/Doctor_of_Philosophy" title="Doctor of Philosophy">Ph.D.</a> in physics from the <a href="/wiki/Massachusetts_Institute_of_Technology" title="Massachusetts Institute of Technology">Massachusetts Institute of Technology</a> in 1973, becoming the first <a href="/wiki/African_American" title="African American">African American</a> woman to earn a doctorate from MIT in nuclear physics.<sup id="cite_ref-profile_2-0" class="reference"><a href="#cite_note-profile-2"><span>[</span>2<span>]</span></a></sup><sup id="cite_ref-AAAS_3-0" class="reference"><a href="#cite_note-AAAS-3"><span>[</span>3<span>]</span></a></sup>Jackson was born in Washington D.C. Her parents, Beatrice and George Jackson, strongly valued education and encouraged her in school.<sup id="cite_ref-diaspora_4-0" class="reference"><a href="#cite_note-diaspora-4"><span>[</span>4<span>]</span></a></sup> Her father spurred on her interest in science by helping her with projects for her science classes. At Roosevelt High School, Jackson attended accelerated programs in both math and science, and graduated in 1964 as valedictorian.<sup id="cite_ref-diaspora_4-1" class="reference"><a href="#cite_note-diaspora-4"><span>[</span>4<span>]</span></a></sup></p>
  • 32. ???Search Result Processing  Passage Retrieval  Watson: Indri and Lucene  Identifies each HTML sentence and adds both the HTML and the clean text to the passage type  Adds information about each passage  Passage Parsing  Forms parse trees for each individual sentence  Add an array of passages to each document <b>Shirley Ann Jackson</b> (born August 5, 1946) is an <a href="/wiki/United_States" title="United States">American</a> <a href="/wiki/Physicist" title="Physicist">physicist</a>, and the 18th president of <a href="/wiki/Rensselaer_Polytechnic_Institute" title="Rensselaer Polytechnic Institute">Rensselaer Polytechnic Institute</a>. She received her <a href="/wiki/Doctor_of_Philosophy" title="Doctor of Philosophy">Ph.D.</a> in physics from the <a href="/wiki/Massachusetts_Institute_of_Technology" title="Massachusetts Institute of Technology">Massachusetts Institute of Technology</a> in 1973, becoming the first <a href="/wiki/African_American" title="African American">African American</a> woman to earn a doctorate from MIT in nuclear physics.<sup id="cite_ref-profile_2-0" class="reference"><a href="#cite_note-profile- 2"><span>[</span>2<span>]</span></a></sup><sup id="cite_ref- AAAS_3-0" class="reference"><a href="#cite_note-AAAS- 3"><span>[</span>3<span>]</span></a></sup>Jackson was born in Washington D.C. Her parents, Beatrice and George Jackson, strongly valued education and encouraged her in school.<sup id="cite_ref- diaspora_4-0" class="reference"><a href="#cite_note-diaspora- 4"><span>[</span>4<span>]</span></a></sup> Her father spurred on her interest in science by helping her with projects for her science classes. At Roosevelt High School, Jackson attended accelerated programs in both math and science, and graduated in 1964 as valedictorian.<sup id="cite_ref-diaspora_4-1" class="reference"><a href="#cite_note- diaspora-4"><span>[</span>4<span>]</span></a></sup></p>
  • 33–36. Search Result Processing
      Text: <b>Shirley Ann Jackson</b> (born August 5, 1946) is an <a href="/wiki/United_States" title="United States">American</a> <a href="/wiki/Physicist" title="Physicist">physicist</a>, and the 18th president of <a href="/wiki/Rensselaer_Polytechnic_Institute" title="Rensselaer Polytechnic Institute">Rensselaer Polytechnic Institute</a>.
      Cleaned Text: Shirley Ann Jackson (born August 5, 1946) is an American physicist, and the 18th president of Rensselaer Polytechnic Institute.
      Parse Tree: (ROOT (S (NP (NP (NNP ) (NNP Shirley) (NNP Ann) (NNP Jackson) (NNP )) (PRN (-LRB- -LRB-) (VP (VBN born) (NP (NNP August) (CD 5) (, ,) (CD 1946))) (-RRB- -RRB-))) (VP (VBZ is) (NP (NP (DT an) (JJ ) (NNP American) (NNP ) (NNP ) (NN physicist) (NNS )) (, ,) (CC and) (NP (NP (DT the) (JJ 18th) (NN president)) (PP (IN of) (NP (NNP ) (NNP Rensselaer) (NNP Polytechnic) (NNP Institute) (NNP )))))) (. .)))
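The "HTML plus clean text" step above can be sketched as follows. This is a minimal illustration using only Python's standard library, not the project's own code (the speaker notes mention jsoup in a Java application); the names `TagStripper` and `make_passage` are hypothetical.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collects only the text nodes of an HTML fragment (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def make_passage(html_sentence):
    """Return a passage record holding both the HTML and the clean text."""
    stripper = TagStripper()
    stripper.feed(html_sentence)
    stripper.close()
    # Join the text nodes and collapse any runs of whitespace.
    clean = " ".join("".join(stripper.chunks).split())
    return {"html": html_sentence, "text": clean}


sentence = (
    '<b>Shirley Ann Jackson</b> (born August 5, 1946) is an '
    '<a href="/wiki/United_States" title="United States">American</a> '
    '<a href="/wiki/Physicist" title="Physicist">physicist</a>, '
    'and the 18th president of '
    '<a href="/wiki/Rensselaer_Polytechnic_Institute" '
    'title="Rensselaer Polytechnic Institute">Rensselaer Polytechnic Institute</a>.'
)
passage = make_passage(sentence)
print(passage["text"])
```

Keeping the original HTML next to the clean text matters later: the parser works on the clean text, while anchor-based candidate generation needs the links.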
  • 37. Candidate Generation
      Using each document, and the passages created by Search Result Processing, we generate candidates using three techniques:
      1. Title of Document (T.O.D.): Adds the title of the document as a candidate.
      2. Wikipedia Title Candidate Generation: Adds any noun phrases within the document's passage texts that are also the titles of Wikipedia articles.
      3. Anchor Text Candidate Generation: Adds candidates based on the hyperlinks and metadata within the document.
  • 38–42. Wikipedia Title Candidate Generation
      - Runs on the passage array from each search result.
      - Using the parse tree, retrieves all the noun phrases in each passage.
      - Checks whether each noun phrase is the title of a Wikipedia article.
      - Adds the verified candidates along with an array of the passages that contained them.
      Steps: Array of Passages → Retrieving Noun Phrases → Check against Previous Data → Wikipedia URL Check → Candidate and Containing Passages
      Passage: <b>Shirley Ann Jackson</b> (born August 5, 1946) is an <a href="/wiki/United_States" title="United States">American</a> <a href="/wiki/Physicist" title="Physicist">physicist</a>, and the 18th president of <a href="/wiki/Rensselaer_Polytechnic_Institute" title="Rensselaer Polytechnic Institute">Rensselaer Polytechnic Institute</a>.
      Parse tree: (ROOT (S (NP (NP (NNP ) (NNP Shirley) (NNP Ann) (NNP Jackson) (NNP )) (PRN (-LRB- -LRB-) (VP (VBN born) (NP (NNP August) (CD 5) (, ,) (CD 1946))) (-RRB- -RRB-))) (VP (VBZ is) (NP (NP (DT an) (JJ ) (NNP American) (NNP ) (NNP ) (NN physicist) (NNS )) (, ,) (CC and) (NP (NP (DT the) (JJ 18th) (NN president)) (PP (IN of) (NP (NNP ) (NNP Rensselaer) (NNP Polytechnic) (NNP Institute) (NNP )))))) (. .)))
      Noun phrases retrieved:
          - Shirley Ann Jackson
          - Shirley Ann Jackson (born August 5, 1946)
          - August 5, 1946
          - An American Physicist
          - An American Physicist, and the 18th president of Rensselaer Polytechnic Institute
          - The 18th president
          - The 18th president of Rensselaer Polytechnic Institute
          - Rensselaer Polytechnic Institute
      Wikipedia URL check: http://en.wikipedia.org/w/index.php?title=Special%3ASearch&profile=default&search=Shirley+Ann+Jackson
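The noun-phrase retrieval and title check described above might look roughly like this. It is a sketch under stated assumptions: the bracketed tree is a shortened fragment of the slide's parse tree with the empty tokens left out, the function names are hypothetical, and the Wikipedia URL check is replaced by a small local set (`known_titles`) so that the example runs offline.

```python
import re


def parse_tree(text):
    """Parse a Penn Treebank style bracketed string into (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        pos += 1                      # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])   # a word (leaf)
                pos += 1
        pos += 1                      # consume ")"
        return (label, children)

    return read_node()


def leaves(node):
    """Return the words under a node, left to right."""
    _, children = node
    words = []
    for child in children:
        words.extend(leaves(child) if isinstance(child, tuple) else [child])
    return words


def noun_phrases(node):
    """Yield the text of every NP node, outermost first."""
    label, children = node
    if label == "NP":
        yield " ".join(leaves(node))
    for child in children:
        if isinstance(child, tuple):
            yield from noun_phrases(child)


tree = ("(NP (NP (DT the) (JJ 18th) (NN president)) (PP (IN of) "
        "(NP (NNP Rensselaer) (NNP Polytechnic) (NNP Institute))))")
phrases = list(noun_phrases(parse_tree(tree)))

# Assumption: a local set stands in for the Wikipedia URL check.
known_titles = {"Rensselaer Polytechnic Institute"}
candidates = [p for p in phrases if p in known_titles]
print(candidates)
```

A real parse also contains bracket tokens such as `-LRB-` and punctuation leaves, so the joined phrases would need detokenizing before any title lookup.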
  • 43–44. Anchor Text Candidate Generation
      - Runs on the passage array from each search result.
      - Checks for hyperlinks within the HTML text of each passage.
      - Adds the title of the hyperlinked article as a candidate.
      - Adds each passage containing the candidate to an array.
      Example passage HTML: <b>Shirley Ann Jackson</b> (born August 5, 1946) is an <a href="/wiki/United_States" title="United States">American</a> <a href="/wiki/Physicist" title="Physicist">physicist</a>, and the 18th president of <a href="/wiki/Rensselaer_Polytechnic_Institute" title="Rensselaer Polytechnic Institute">Rensselaer Polytechnic Institute</a>. She received her <a href="/wiki/Doctor_of_Philosophy" title="Doctor of Philosophy">Ph.D.</a> in physics from the <a href="/wiki/Massachusetts_Institute_of_Technology" title="Massachusetts Institute of Technology">Massachusetts Institute of Technology</a> in 1973, becoming the first <a href="/wiki/African_American" title="African American">African American</a> woman to earn a doctorate from MIT in nuclear physics.<sup id="cite_ref-profile_2-0" class="reference"><a href="#cite_note-profile-2"><span>[</span>2<span>]</span></a></sup><sup id="cite_ref-AAAS_3-0" class="reference"><a href="#cite_note-AAAS-3"><span>[</span>3<span>]</span></a></sup> Jackson was born in Washington D.C. Her parents, Beatrice and George Jackson, strongly valued education and encouraged her in school.<sup id="cite_ref-diaspora_4-0" class="reference"><a href="#cite_note-diaspora-4"><span>[</span>4<span>]</span></a></sup> Her father spurred on her interest in science by helping her with projects for her science classes. At Roosevelt High School, Jackson attended accelerated programs in both math and science, and graduated in 1964 as valedictorian.<sup id="cite_ref-diaspora_4-1" class="reference"><a href="#cite_note-diaspora-4"><span>[</span>4<span>]</span></a></sup>
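As a rough illustration of the anchor-text step, the sketch below collects the `title` attribute of every link in each passage and records which passages contained it. It uses Python's standard `html.parser` rather than the project's own tooling; the class and function names are hypothetical, and the two passages are shortened versions of the slide's example.

```python
from html.parser import HTMLParser


class AnchorTitleCollector(HTMLParser):
    """Records the title attribute of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            title = dict(attrs).get("title")
            if title:                      # citation links carry no title
                self.titles.append(title)


def anchor_candidates(passages_html):
    """Map each linked article title to the passages that contain the link."""
    found = {}
    for passage in passages_html:
        collector = AnchorTitleCollector()
        collector.feed(passage)
        for title in collector.titles:
            containing = found.setdefault(title, [])
            if passage not in containing:
                containing.append(passage)
    return found


passages = [
    '<b>Shirley Ann Jackson</b> is an '
    '<a href="/wiki/United_States" title="United States">American</a> '
    '<a href="/wiki/Physicist" title="Physicist">physicist</a>.',
    'She received her <a href="/wiki/Doctor_of_Philosophy" '
    'title="Doctor of Philosophy">Ph.D.</a> in physics.'
    '<sup><a href="#cite_note-profile-2">[2]</a></sup>',
]
candidates = anchor_candidates(passages)
print(list(candidates))
```

Footnote links such as `#cite_note-profile-2` have no `title` attribute, so this sketch skips them without any extra filtering.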
  • 45–46. Search Result Processing
      <p><b>Shirley Ann Jackson</b> (born August 5, 1946) is an <a href="/wiki/United_States" title="United States">American</a> <a href="/wiki/Physicist" title="Physicist">physicist</a>, and the 18th president of <a href="/wiki/Rensselaer_Polytechnic_Institute" title="Rensselaer Polytechnic Institute">Rensselaer Polytechnic Institute</a>. She received her <a href="/wiki/Doctor_of_Philosophy" title="Doctor of Philosophy">Ph.D.</a> in physics from the <a href="/wiki/Massachusetts_Institute_of_Technology" title="Massachusetts Institute of Technology">Massachusetts Institute of Technology</a> in 1973, becoming the first <a href="/wiki/African_American" title="African American">African American</a> woman to earn a doctorate from MIT in nuclear physics.<sup id="cite_ref-profile_2-0" class="reference"><a href="#cite_note-profile-2"><span>[</span>2<span>]</span></a></sup><sup id="cite_ref-AAAS_3-0" class="reference"><a href="#cite_note-AAAS-3"><span>[</span>3<span>]</span></a></sup></p>
      <div id="toc" class="toc"> <div id="toctitle"> <h2>Contents</h2> </div> <ul> <li class="toclevel-1 tocsection-1"><a href="#Early_life_and_schooling"><span class="tocnumber">1</span> <span class="toctext">Early life and schooling</span></a></li> <li class="toclevel-1 tocsection-2"><a href="#Career"><span class="tocnumber">2</span> <span class="toctext">Career</span></a> <ul> <li class="toclevel-2 tocsection-3"><a href="#Rensselaer_Polytechnic_Institute"><span class="tocnumber">2.1</span> <span class="toctext">Rensselaer Polytechnic Institute</span></a></li> </ul> </li> <li class="toclevel-1 tocsection-4"><a href="#Honors_and_distinctions"><span class="tocnumber">3</span> <span class="toctext">Honors and distinctions</span></a> <ul> <li class="toclevel-2 tocsection-5"><a href="#Boards_of_directors"><span class="tocnumber">3.1</span> <span class="toctext">Boards of directors</span></a></li> </ul> </li> <li class="toclevel-1 tocsection-6"><a href="#Personal"><span class="tocnumber">4</span> <span class="toctext">Personal</span></a></li> <li class="toclevel-1 tocsection-7"><a href="#References"><span class="tocnumber">5</span> <span class="toctext">References</span></a></li> <li class="toclevel-1 tocsection-8"><a href="#External_links"><span class="tocnumber">6</span> <span class="toctext">External links</span></a></li> </ul> </div>
      <h2><span class="mw-headline" id="Early_life_and_schooling">Early life and schooling</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/w/index.php?title=Shirley_Ann_Jackson&amp;action=edit&amp;section=1" title="Edit section: Early life and schooling">edit source</a><span class="mw-editsection-divider"> | </span><a href="/w/index.php?title=Shirley_Ann_Jackson&amp;veaction=edit&amp;section=1" title="Edit section: Early life and schooling" class="mw-editsection-visualeditor">edit</a><span class="mw-editsection-bracket">]</span></span></h2>
      <p>Jackson was born in Washington D.C. Her parents, Beatrice and George Jackson, strongly valued education and encouraged her in school.<sup id="cite_ref-diaspora_4-0" class="reference"><a href="#cite_note-diaspora-4"><span>[</span>4<span>]</span></a></sup> Her father spurred on her interest in science by helping her with projects for her science classes. At Roosevelt High School, Jackson attended accelerated programs in both math and science, and graduated in 1964 as valedictorian.<sup id="cite_ref-diaspora_4-1" class="reference"><a href="#cite_note-diaspora-4"><span>[</span>4<span>]</span></a></sup></p>
  • 47. Future Work
      - Search Result Processing
          - Improve methods of imitating Indri or Lucene passage retrieval without a corpus.
          - Create a passage score and rank.
      - Candidate Generation
          - Continue to improve the speed and quality of Candidate Generation.
          - Research and implement Candidate Generation from Structured Sources (Prismatic, Answer Lookup).
          - Record and measure recall in comparison with Watson and other question answering software.
  • 50. Differentiating between answers
      - Making sense of candidates
      - Filtering
      - Supporting Evidence Retrieval (SER)
      - Scoring (passage-based)
  • 51. Scorers
      - Passage Term Match
      - Textual Alignment
      - Skip-Bigram
      - Each of these scores supporting evidence
      - These scores are then merged to produce a single candidate score
  • 52. Passage Term Search
      - Question terms extracted
      - Passage is searched for those terms
      - Score calculated for that passage
      - Done per passage
      - Example
          - Question: “Where is Toronto?” → terms “Where”, “is”, “Toronto”
          - Passage: “Toronto is in Southern Ontario” → matched terms “Toronto”, “is”
          - Score = IDF(Toronto) + IDF(is)
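The Toronto example can be worked through in code. The following is a sketch only: the slide does not say how IDF is computed, so this assumes the common form log(N / document frequency) over a tiny made-up background collection, and all function names are hypothetical.

```python
import math
import re


def tokenize(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"\w+", text.lower())


def idf_table(documents):
    """Assumed IDF: log(N / number of documents containing the term)."""
    n = len(documents)
    df = {}
    for doc in documents:
        for term in set(tokenize(doc)):
            df[term] = df.get(term, 0) + 1
    return {term: math.log(n / count) for term, count in df.items()}


def passage_term_match(question, passage, idf):
    """Return the question terms found in the passage and their summed IDF."""
    passage_terms = set(tokenize(passage))
    question_terms = dict.fromkeys(tokenize(question))   # unique, in order
    matched = [t for t in question_terms if t in passage_terms]
    return matched, sum(idf.get(t, 0.0) for t in matched)


# Made-up background collection, used only to obtain IDF values.
background = [
    "Toronto is in Southern Ontario",
    "Ottawa is the capital of Canada",
    "Where is the nearest station",
    "Ontario borders Quebec",
]
idf = idf_table(background)
matched, score = passage_term_match(
    "Where is Toronto?", "Toronto is in Southern Ontario", idf)
print(matched, round(score, 3))
```

With this collection "is" appears in three of four documents and "Toronto" in one, so the rare term contributes most of the score, which is the point of weighting matches by IDF.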
  • 53. Textual Alignment
      - Finds an optimal alignment of a question and a passage
      - Assigns “partial credit” for close matches
      - Example
          - Question: “Who is the President of RPI?”
          - Passage: Shirley Ann Jackson is the President of RPI.
          - Question as aligned: Who is the President of RPI.
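One way to realise such an alignment is a Smith-Waterman style local alignment over word tokens. The slide does not name the algorithm or its weights, so the sketch below is an assumption throughout: the function name and the match, mismatch and gap values are chosen only for illustration.

```python
import re


def tokenize(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"\w+", text.lower())


def local_alignment_score(question, passage, match=1.0, mismatch=-0.5, gap=-0.5):
    """Best local alignment score between the two token sequences."""
    q, p = tokenize(question), tokenize(passage)
    # table[i][j] holds the best score of an alignment ending at q[i-1], p[j-1].
    table = [[0.0] * (len(p) + 1) for _ in range(len(q) + 1)]
    best = 0.0
    for i in range(1, len(q) + 1):
        for j in range(1, len(p) + 1):
            step = match if q[i - 1] == p[j - 1] else mismatch
            table[i][j] = max(
                0.0,                            # start a new alignment here
                table[i - 1][j - 1] + step,     # align the two tokens
                table[i - 1][j] + gap,          # skip a question token
                table[i][j - 1] + gap,          # skip a passage token
            )
            best = max(best, table[i][j])
    return best


exact = local_alignment_score(
    "Who is the President of RPI?",
    "Shirley Ann Jackson is the President of RPI.")
close = local_alignment_score(
    "Who is the President of RPI?",
    "Jackson is the 18th President of RPI.")
print(exact, close)
```

The second passage inserts "18th" into the matched span. It still scores well but lower than the exact match, which is the "partial credit" behaviour the slide describes.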
  • 54. Skip-Bigram
      - Constructs a graph
          - Nodes represent terms (syntactic objects)
          - Edges represent relations
      - Extracts skip-bigrams
          - A skip-bigram is a pair of nodes that are either directly connected or separated by only one intermediate node
          - Skip-bigrams represent close relationships between terms
      - Scores based on the number of common skip-bigrams
  • 55. Example
      - Who authored “The Good Earth”?
      - “Pearl Buck, author of the good earth…”
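The skip-bigram definition can be illustrated on the example above. The two small graphs here are hand-built assumptions (the slides do not show the graphs the system constructs), "authored" and "author" are assumed to have been normalised to the same term, and `skip_bigrams` is a hypothetical helper.

```python
from itertools import combinations


def skip_bigrams(edges):
    """Return unordered node pairs that are adjacent or share a neighbour."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    pairs = set()
    for node, adjacent in neighbours.items():
        for other in adjacent:
            pairs.add(frozenset((node, other)))        # directly connected
        for left, right in combinations(sorted(adjacent), 2):
            pairs.add(frozenset((left, right)))        # one intermediate node
    return pairs


# Hand-built graphs for the question and the passage (assumed structure).
question_edges = [("who", "author"), ("author", "good earth")]
passage_edges = [("pearl buck", "author"), ("author", "good earth")]

common = skip_bigrams(question_edges) & skip_bigrams(passage_edges)
score = len(common)
print(score, [sorted(pair) for pair in common])
```

Only the pair (author, good earth) is shared in this toy version. Treating the question word "who" as a slot that the candidate "Pearl Buck" can fill would let further pairs match, but that step is not modelled here.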
  • 56. Future Directions
      - More algorithms
      - Logical form answer candidate scoring
      - Improved Type Coercion scoring
      - Begin implementing machine learning
      - Temporal/Spatial reasoning
  • 58. UIMA
      - The DeepQA architecture is built on top of another architecture, UIMA (Unstructured Information Management Architecture).
      - A UIMA CAS (Common Analysis Structure) contains a contiguous block of data (normally text) and annotations. Each annotation contains start and end indexes into the data, plus additional data (strings, integers, doubles, arrays, annotation references).
  • 59. UIMA
      - CAS Multipliers output multiple CASes based on the data in the input CAS; this facilitates parallelization, which is the key to Watson's response time.
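To make the CAS and CAS Multiplier ideas concrete, here is a small data-structure sketch. It is not the UIMA API (UIMA is a Java framework with its own type system); the classes `Annotation` and `Cas`, the type names, and `candidate_multiplier` are all hypothetical and only mirror the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    type_name: str
    begin: int                      # start index into the CAS text
    end: int                        # end index, treated as exclusive here
    features: dict = field(default_factory=dict)   # additional data


@dataclass
class Cas:
    text: str                       # the contiguous block of data
    annotations: list = field(default_factory=list)

    def covered_text(self, annotation):
        return self.text[annotation.begin:annotation.end]


def candidate_multiplier(cas):
    """Yield one new CAS per 'Candidate' annotation in the input CAS."""
    for annotation in cas.annotations:
        if annotation.type_name == "Candidate":
            yield Cas(text=cas.covered_text(annotation))


text = "Shirley Ann Jackson is the president of Rensselaer Polytechnic Institute."
name = "Shirley Ann Jackson"
school = "Rensselaer Polytechnic Institute"
start = text.index(school)
cas = Cas(text, [
    Annotation("Candidate", 0, len(name)),
    Annotation("Token", 0, 7),
    Annotation("Candidate", start, start + len(school), {"source": "anchor"}),
])
children = list(candidate_multiplier(cas))
print([child.text for child in children])
```

Because each output CAS is independent of the others, downstream steps can process them in parallel, which is the property the slide attributes to CAS Multipliers.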
  • 60. DeepQA Architecture
      - The DeepQA architecture, which both IBM Watson and RPI MiniDeepQA implement, is a QA (Question Answering) system that answers questions by generating as many potential answers as is practical, then filtering them with multiple evidence scorers in parallel.
  • 61. Data cache
      - IBM Watson has a pre-processed corpus of information, generated automatically by a subset of the DeepQA pipeline from an enormous volume of raw text, which the remainder of the pipeline uses at question time.
      - As our system retrieves information from the internet on a per-question basis, it cannot (practically) process the whole corpus in advance.
  • 62. Data cache
      - Since parsing the documents takes a large amount of time, it is beneficial for testing and demonstrating the system to store webpages and their associated parses locally. This allows a question that has been asked before, and candidates that come up for multiple questions, to be processed faster.
      - As a side benefit of the caching, if a website is temporarily down, its data can still be used (provided it was retrieved successfully at some point in the past).
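A cache of this kind could be sketched as below. This is an illustration under assumptions, not the project's implementation: the class name `PageCache`, the JSON file layout, and the hashing of URLs into file names are all invented here, and the fetch and parse steps are passed in as functions so that the example runs without network access or a parser.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class PageCache:
    """Stores each page's URL, HTML and parses as one JSON file on disk."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, url):
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.directory / (digest + ".json")

    def get_or_build(self, url, fetch, parse):
        path = self._path(url)
        if path.exists():                       # cache hit: no fetch, no parse
            return json.loads(path.read_text(encoding="utf-8"))
        html = fetch(url)                       # slow: network access
        entry = {"url": url, "html": html, "parses": parse(html)}   # slow: parsing
        path.write_text(json.dumps(entry), encoding="utf-8")
        return entry


calls = []


def fake_fetch(url):
    """Stand-in for a real download; records that it was called."""
    calls.append(url)
    return "<p>Jackson was born in Washington D.C.</p>"


def fake_parse(html):
    """Stand-in for the parser; returns a placeholder string."""
    return ["parse of %d characters" % len(html)]


cache = PageCache(tempfile.mkdtemp())
url = "http://en.wikipedia.org/wiki/Shirley_Ann_Jackson"
first = cache.get_or_build(url, fake_fetch, fake_parse)
second = cache.get_or_build(url, fake_fetch, fake_parse)
print(len(calls), first == second)
```

The second lookup is served from disk, so the page is fetched and parsed only once. The same mechanism is what keeps a page usable while its website is temporarily down.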
  • 63. Graphical User Interface
      - Towards the start of the project, we ran our system using the Document Analyzer (a UIMA-provided tool).
      - While it was useful, once we had the entire pipeline set up, testing the full system required more input than necessary.
      - Additionally, there wasn't a convenient way to display just the intended output, nor intermediate output at a level suitable for monitoring progress or giving demonstrations.
  • 64. Graphical User Interface
      - The GUI addresses these concerns, and has additionally been extensively tweaked to be demonstration-friendly.

Editor's Notes

  1. Our project differs from IBM's Watson in that it draws from the internet on each run to generate a question-specific corpus. The reason we only take 5 documents is the time it takes to parse these documents with the Stanford Parser. This is our main limiting factor, and as we parse
  2. You can navigate HTML pages with jsoup much as you would with jQuery, selecting items by ID, class, or property.
  3. The cache stores the URL and the entire HTML body of the page. Passages are then used in different ways for Candidate Generation, Answer Scoring, and Supporting Evidence Retrieval.
  4. Online knowledge base; stores information as triples. We use a language called SPARQL (SPARQL Protocol and RDF Query Language) to query DBpedia, and a library called Jena to do this from our Java application.
  5. Enough information to be confident that the entities we're asking about will appear in DBpedia. We don't have to check whether the URL is valid if we already have the Wikipedia URL. We can learn other names for people, for example that Louis XIV is the Sun King.
  6. Verifying that we have the right length for the River Nile and the correct birth date of Harrison Ford.