2. 2
Natural Language Understanding – the key to
intelligent behavior
§ Most information and knowledge is encoded in unstructured form in
natural language
§ When humans learn about a new topic, they read about it – machines
should do the same
§ Natural language content on the internet is growing constantly
§ Natural language is evolving, and natural language processing should
account for that
Cognitive computing
Cognitive computing systems learn and interact naturally with people to
extend what either humans or machines could do on their own. They help
human experts make better decisions by penetrating the complexity of
Big Data.
http://www.research.ibm.com/cognitive-computing
6. 6
Why Language is difficult ..
He sat on the river bank and counted his dough.
She went to the bank and took out some money.
Lexical Layer
Concept Layer
synonymous
polysemous
9. 9
Why Not To Use Dictionaries or Ontologies
Advantages:
§ Sense inventory given
§ Linking to concepts
§ Full control
Photo by zeh fernando under Creative Commons licence
“give a man a fish and
you feed him for a day…
Disadvantages:
• Dictionaries have to be created
• Dictionaries are incomplete
• Language changes constantly: new
words, new meanings …
http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
10. 10
Structure Discovery Paradigm
… teach a man to fish and
you feed him for a lifetime”
Consequences:
§ Only raw text input required
§ No fine-grained control on categories
§ Cognitive system: learns from and adapts to data
[Diagram labels: Text Data; SD algorithms (find regularities by analysis; annotate data with regularities); Task (use annotations as features)]
11. 11
The JoBimText project –
www.jobimtext.org
Partners:
§ Lead at IBM: Alfio Gliozzo
IBM Watson DeepQA, Yorktown, NY, USA
§ Lead at TU DA: Chris Biemann
Language Technology, TU Darmstadt, Germany
Software Capabilities:
§ Compute a Distributional Thesaurus
§ Compute Sense Representations
§ 2-Dimensional Text: Contextualized Expansion
§ RESTful API and Web Demo
Features:
§ Scalable architecture
§ Open Source, ASL 2.0
12. 12
2D Text: Matching Meaning beyond Keywords
Where was the first professor for electric science established?
In 1883 the first faculty for electrical engineering was founded there.
Almost no word overlap.
13. 13
2D Text: Matching Meaning beyond Keywords
Where was the first professor for electric science established?
In 1883 the first faculty for electrical engineering was founded there.
Expansion lists shown around the two sentences (seven terms each):
teacher, professor, student, graduate, alumnus, staff, campus
electric, mechanical, thermal, electronic, industrial, optical, automotive
science, sciences, biology, physics, economics, mathematics, psychology
co-found, form, establish, own, join, rename, bear
director, emeritus, dean, lecturer, president, psychologist, historian
electrical, heavy-duty, antique, battery-powered, electronic, stainless, diesel
biology, economics, sciences, mathematics, physics, math, psychology
create, form, set, maintain, found, abolish, strengthen
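A minimal sketch of how expansions can match the two sentences even though their content words differ. The expansion lists are taken from the slide; which list belongs to which head word is an assumption based on the slide layout, and the lemmatized content words and the function names are hypothetical.

```python
# Expansion lists copied from the slide. ASSUMPTION: the pairing of each
# list with a head word (the dictionary key) is our reading of the layout.
EXPANSIONS = {
    "professor": ["director", "emeritus", "dean", "lecturer",
                  "president", "psychologist", "historian"],
    "electric": ["electrical", "heavy-duty", "antique", "battery-powered",
                 "electronic", "stainless", "diesel"],
    "science": ["biology", "economics", "sciences", "mathematics",
                "physics", "math", "psychology"],
    "establish": ["create", "form", "set", "maintain",
                  "found", "abolish", "strengthen"],
    "faculty": ["teacher", "professor", "student", "graduate",
                "alumnus", "staff", "campus"],
    "electrical": ["electric", "mechanical", "thermal", "electronic",
                   "industrial", "optical", "automotive"],
    "engineering": ["science", "sciences", "biology", "physics",
                    "economics", "mathematics", "psychology"],
    "found": ["co-found", "form", "establish", "own",
              "join", "rename", "bear"],
}

# Hand-picked content lemmas of the question and the answer (assumed).
QUESTION = ["professor", "electric", "science", "establish"]
ANSWER = ["faculty", "electrical", "engineering", "found"]


def expand(word, table=EXPANSIONS):
    """Return the word plus its expansions (the 'second dimension')."""
    return {word} | set(table.get(word, ()))


def keyword_overlap(question, answer):
    """Plain keyword matching: words that occur in both sentences."""
    return set(question) & set(answer)


def expanded_matches(question, answer, table=EXPANSIONS):
    """Pairs of words whose expanded sets share at least one term."""
    return [(q, a) for q in question for a in answer
            if expand(q, table) & expand(a, table)]


if __name__ == "__main__":
    print(keyword_overlap(QUESTION, ANSWER))
    print(expanded_matches(QUESTION, ANSWER))
```

Under these assumptions the four content words have no keyword overlap, while every question word finds its counterpart in the answer through the expansions.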
18. 18
Clustering of DT entries:
Sense Induction
bright#JJ
paper#NN
C. Biemann (2006): Chinese Whispers - an Efficient Graph Clustering Algorithm and its Application to Natural Language Processing
Problems. Proceedings of the HLT-NAACL-06 Workshop on Textgraphs-06, New York, USA.
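A minimal sketch of Chinese Whispers-style label propagation, the graph clustering approach cited above: every node starts in its own class and repeatedly adopts the class with the highest total edge weight among its neighbors. The toy similarity graph, the weights, the tie-breaking rule and the function name are assumptions for illustration, not data from the slides.

```python
import random
from collections import defaultdict


def chinese_whispers(edges, iterations=20, seed=0):
    """Cluster an undirected weighted graph given as (u, v, weight) triples."""
    graph = defaultdict(dict)
    for u, v, w in edges:
        graph[u][v] = w
        graph[v][u] = w

    labels = {node: node for node in graph}  # one class per node at the start
    rng = random.Random(seed)
    nodes = sorted(graph)

    for _ in range(iterations):
        rng.shuffle(nodes)  # nodes are visited in randomized order
        for node in nodes:
            scores = defaultdict(float)
            for neighbor, weight in graph[node].items():
                scores[labels[neighbor]] += weight
            # Adopt the strongest class; ties are broken by label name
            # (an assumption made here to keep the result deterministic).
            labels[node] = max(scores, key=lambda lab: (scores[lab], lab))

    clusters = defaultdict(set)
    for node, lab in labels.items():
        clusters[lab].add(node)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical similarity graph among neighbors of "paper".
TOY_EDGES = [
    ("newspaper", "magazine", 1.0),
    ("newspaper", "journal", 1.0),
    ("magazine", "journal", 1.0),
    ("plastic", "glass", 1.0),
    ("plastic", "cardboard", 1.0),
    ("glass", "cardboard", 1.0),
    ("journal", "cardboard", 0.1),  # weak link between the two groups
]

if __name__ == "__main__":
    for cluster in chinese_whispers(TOY_EDGES):
        print(cluster)
```

Each resulting cluster can then be read as one induced sense of the target word.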
19. 19
Features for Disambiguation
paper 0 (newspaper)
read#VB#-dobj 45
reading#VBG#-dobj 45
write#VB#-dobj 38
read#VBD#-dobj 37
writing#VBG#-dobj 36
wrote#VBD#-dobj 34
original#JJ#amod 27
wrote#VBD#-prep_in 26
recent#JJ#amod 26
published#VBN#partmod 25
written#VBN#-dobj 23
published#VBN#-nsubjpass 20
published#VBD#-dobj 19
copy#NN#-prep_of 18
said#VBD#-prep_in 18
author#NN#-prep_of 17
pages#NNS#-prep_of 16
told#VBD#-dobj 15
buy#VB#-dobj 14
published#VBN#-prep_in 14
page#NN#-prep_of 14
paper 1 (material)
piece#NN#-prep_of 21
pieces#NNS#-prep_of 17
made#VBN#-prep_from 13
bags#NNS#-nn 11
white#JJ#amod 9
paper#NN#-conj_and 9
glass#NN#-conj_and 9
products#NNS#-nn 9
industry#NN#-nn 8
plastic#NN#conj_and 8
plastic#NN#-conj_and 8
bits#NNS#-prep_of 8
bag#NN#-nn 8
plastic#NN#conj_or 8
sheet#NN#-prep_of 7
recycled#JJ#amod 7
tons#NNS#-prep_of 7
glass#NN#conj_and 7
buy#VB#-dobj 6
plates#NNS#-nn 6
pile#NN#-prep_of 6
These features are shared by "paper" and the members of the respective sense cluster.
Disambiguation: find these features in the context.
I am reading an original paper on the paper.
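A minimal sketch of disambiguation by feature lookup. The feature counts are a subset of the two lists above; scoring a sense by summing the counts of the context features it contains is an assumption made for illustration, and the context features are written by hand here rather than produced by a parser.

```python
# Subset of the sense feature lists from the slide (feature -> count).
SENSE_FEATURES = {
    "paper 0 (newspaper)": {
        "read#VB#-dobj": 45,
        "reading#VBG#-dobj": 45,
        "write#VB#-dobj": 38,
        "original#JJ#amod": 27,
        "buy#VB#-dobj": 14,
    },
    "paper 1 (material)": {
        "piece#NN#-prep_of": 21,
        "pieces#NNS#-prep_of": 17,
        "white#JJ#amod": 9,
        "recycled#JJ#amod": 7,
        "buy#VB#-dobj": 6,
    },
}


def score_senses(context_features, sense_features=SENSE_FEATURES):
    """Sum, per sense, the counts of the features found in the context."""
    return {
        sense: sum(features.get(f, 0) for f in context_features)
        for sense, features in sense_features.items()
    }


def disambiguate(context_features, sense_features=SENSE_FEATURES):
    """Return the sense with the highest score (hypothetical scoring rule)."""
    scores = score_senses(context_features, sense_features)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Hand-written features for the first "paper" in the example sentence.
    first_paper = ["reading#VBG#-dobj", "original#JJ#amod"]
    print(score_senses(first_paper))
    print(disambiguate(first_paper))
```

A feature such as buy#VB#-dobj occurs in both lists, so it contributes to both senses and only the difference in counts separates them.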
22. 22
JoBimText Model example “beetle”
S. Mitra, R. Mitra, M. Riedl, C. Biemann, A. Mukherjee, P. Goyal (2014):
That’s sick dude!: Automatic identification of word sense change across
different timescales. Proceedings of ACL-2014, Baltimore, MD, USA
http://www.thezooom.com/2013/01/10749/
24. 24
Outlook: From Similarities and Relations…
Cathy liked the blue dress very much.
She bought it for 15 Euros from the shop.
CLOTHING: gown, skirt, blouse
FIRSTNAME: Pat, Brian, Kevin
COLOR: red, purple, green
MONEY: currency, greenback, yen
SALESPOINT: store, restaurant, boutique
HAS-PROPERTY
1: ENTITIES
2: RELATIONS
26. 26
… to Frames and Causality
She bought it for 15 Euros from the shop.
MONEY SALESPOINT
FIRSTNAME adored CLOTHING
FIRSTNAME found CLOTHING great
POSITIVE-OPINION-ABOUT
subj=FIRSTNAME obj=CLOTHING
SALES-TRANSACTION
subj=AGENT obj=THING for=MONEY loc=SALESPOINT
FIRSTNAME: Cathy
CLOTHING: dress
3: FRAMES
4: CAUSALITY
Cathy liked the blue dress very much.
FIRSTNAME COLOR CLOTHING
HAS-PROPERTY
28. 28
Applications of JoBimText in IBM Watson
§ JoBimText informs relation extraction: significant improvements in the EMRA application, e.g. for finding drug prescriptions for diseases
§ JoBimText sense clusters are being used to inform term matching, e.g. when finding justifications for answers
§ JoBimText is one of the solutions for knowledge induction from text in new domains
30. 30
Conclusion
§ The role of Natural Language Processing in Cognitive Computing is two-fold:
  § it is the technology for natural interaction with the system
  § it is itself a technology to be framed in the cognitive paradigm
§ Adaptive Natural Language Processing
  § makes use of static AND dynamically generated resources
  § is driven by (text) data that defines its application domain
  § accounts for language evolution and new meanings by adaptation to the data
  § goes beyond NLP pipelines
31. 31
Thanks..
.. and now some (deep) QA!
www.jobimtext.org
Special Track: Semantic
and Cognitive Computing
33. 33
The @-ing (‘holing’) operation:
producing pairs of Jos and Bims
SENTENCE:
I suffered from a cold and took aspirin.
STANFORD COLLAPSED DEPENDENCIES:
nsubj(suffered, I); nsubj(took, I); root(ROOT, suffered); det(cold, a);
prep_from(suffered, cold); conj_and(suffered, took); dobj(took, aspirin)
WORD-CONTEXT PAIRS (Jo = word, Bim = context, count):
suffered  nsubj(@@, I)            1
took      nsubj(@@, I)            1
cold      det(@@, a)              1
suffered  prep_from(@@, cold)     1
suffered  conj_and(@@, took)      1
took      dobj(@@, aspirin)       1
I         nsubj(suffered, @@)     1
I         nsubj(took, @@)         1
a         det(cold, @@)           1
cold      prep_from(suffered, @@) 1
took      conj_and(suffered, @@)  1
aspirin   dobj(took, @@)          1
http://nlp.stanford.edu:8080/parser/
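The holing operation on this sentence can be sketched as follows. The dependency triples are typed in by hand from the parse shown above (no parser is called), and the function name and tuple layout are hypothetical.

```python
from collections import Counter

# (relation, governor, dependent) triples from the example parse.
DEPENDENCIES = [
    ("nsubj", "suffered", "I"),
    ("nsubj", "took", "I"),
    ("root", "ROOT", "suffered"),
    ("det", "cold", "a"),
    ("prep_from", "suffered", "cold"),
    ("conj_and", "suffered", "took"),
    ("dobj", "took", "aspirin"),
]


def holing(dependencies):
    """Turn each dependency into two (Jo, Bim) pairs and count them.

    The Bim is the dependency with the Jo replaced by the hole symbol @@.
    The root relation is skipped, as in the example output.
    """
    pairs = Counter()
    for relation, governor, dependent in dependencies:
        if governor == "ROOT":
            continue
        pairs[(governor, f"{relation}(@@, {dependent})")] += 1
        pairs[(dependent, f"{relation}({governor}, @@)")] += 1
    return pairs


if __name__ == "__main__":
    for (jo, bim), count in holing(DEPENDENCIES).items():
        print(jo, bim, count)
```

Over a large corpus the same pair is seen many times, so the counts grow beyond 1 and become the input for the similarity computation.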
34. 34
Distributional Thesaurus (DT)
§ Computed from distributional similarity statistics
§ Entry for a target word consists of a ranked list of neighbors
meeting
meeting 288
meetings 102
hearing 89
session 68
conference 62
summit 51
forum 46
workshop 46
hearings 46
ceremony 45
sessions 41
briefing 40
event 40
convention 38
gathering 36
...
articulate
articulate 89
explain 19
understand 17
communicate 17
defend 16
establish 15
deliver 14
evaluate 14
adjust 14
manage 13
speak 13
change 13
answer 13
maintain 13
...
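[Figure: First order: the words immaculate and perfect are linked to context features, including amod(condition,@@), amod(timing,@@), nsubj(@@,hair), cop(@@,remains) and amod(Church,@@). Second order: immaculate and perfect are linked directly by an edge labeled 3.]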
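A minimal sketch of the step from first-order to second-order relations: two words become neighbors according to how many context features they share. The slide does not show which features belong to which word, so the assignment below is hypothetical and chosen only so that three features are shared.

```python
# First order: word -> set of context features.
# ASSUMPTION: this assignment of features to words is invented for
# illustration; only the feature strings themselves come from the slide.
FIRST_ORDER = {
    "immaculate": {
        "amod(condition,@@)",
        "amod(timing,@@)",
        "nsubj(@@,hair)",
        "amod(Church,@@)",
    },
    "perfect": {
        "amod(condition,@@)",
        "amod(timing,@@)",
        "nsubj(@@,hair)",
        "cop(@@,remains)",
    },
}


def shared_features(word_a, word_b, first_order=FIRST_ORDER):
    """Second order: the context features that both words occur with."""
    return first_order[word_a] & first_order[word_b]


if __name__ == "__main__":
    common = shared_features("immaculate", "perfect")
    print(len(common), sorted(common))
```

Ranking all other words by this count gives the neighbor list of a target word.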
35. 35
Scaling Computation with MapReduce
Input text: "Roomano is a hard Gouda-like cheese from Friesland in the northern part of The Netherlands. It pairs well with aged sherries ..."
Steps (parameters in parentheses):
Holing (using grammatical relations)
FreqSig (t: min freq; s: min significance)
AggrPerFt
SimCounts (w: weighting for # words / feature)
PruneGraph (p: max number of features per word; s sum threshold)
Convert (output like the data below)
Intermediate data:
word      feature              t
hard#a    cheese#ADJ_MODn      17
cheese#n  Gouda-like#ADJ_MODa  5
cheese#n  hard#ADJ_MODa        17
pair#v    well#ADV_MODa        3
...       ...                  ...
word      feature              s
hard#a    cheese#ADJ_MODn      15.8
cheese#n  Gouda-like#ADJ_MODa  7.6
cheese#n  hard#ADJ_MODa        0.4
...       ...                  ...
feature          words
cheese#ADJ_MODn  hard#a, yellow#a, French#a
hard#ADJ_MODa    cheese#n, stone#n
...              ...
word      word      w.sum
hard#a    yellow#a  0.234
yellow#a  hard#a    0.234
cheese#n  stone#n   3.14
...       ...       ...
Resulting entry:
ibm
i.b.m. 164
intel 154
hewlett-packard 151
dell 141
cisco 134
microsoft 125
hp 124
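A minimal in-memory sketch of the data flow behind these steps; it is not the JoBimText implementation and does not use MapReduce. The significance score s and the weighting w are left out, pruning is applied to raw counts before aggregation (an assumption about the order), and the pair counts beyond those shown in the tables are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

PAIR_COUNTS = {
    # From the word/feature/t table.
    ("hard#a", "cheese#ADJ_MODn"): 17,
    ("cheese#n", "Gouda-like#ADJ_MODa"): 5,
    ("cheese#n", "hard#ADJ_MODa"): 17,
    ("pair#v", "well#ADV_MODa"): 3,
    # Hypothetical counts, added so that some words share features.
    ("yellow#a", "cheese#ADJ_MODn"): 4,
    ("French#a", "cheese#ADJ_MODn"): 6,
    ("stone#n", "hard#ADJ_MODa"): 8,
    ("French#a", "wine#ADJ_MODn"): 2,
}


def freq_filter(pair_counts, t):
    """Frequency part of FreqSig: keep pairs seen at least t times."""
    return {pair: n for pair, n in pair_counts.items() if n >= t}


def prune_features(pair_counts, p):
    """Keep at most p features per word, here ranked by count."""
    by_word = defaultdict(list)
    for (word, feature), n in pair_counts.items():
        by_word[word].append((n, feature))
    return {word: {f for _, f in sorted(items, reverse=True)[:p]}
            for word, items in by_word.items()}


def aggregate_per_feature(features_by_word):
    """AggrPerFt: invert to feature -> set of words."""
    words_by_feature = defaultdict(set)
    for word, features in features_by_word.items():
        for feature in features:
            words_by_feature[feature].add(word)
    return dict(words_by_feature)


def sim_counts(words_by_feature):
    """SimCounts with weight 1: count shared features per word pair."""
    sims = Counter()
    for words in words_by_feature.values():
        for a, b in permutations(sorted(words), 2):
            sims[(a, b)] += 1
    return sims


def to_thesaurus(sims, top_n=10):
    """Convert: word -> ranked list of (neighbor, score)."""
    entries = defaultdict(list)
    for (a, b), score in sims.items():
        entries[a].append((b, score))
    return {word: sorted(ns, key=lambda x: (-x[1], x[0]))[:top_n]
            for word, ns in entries.items()}


def build_thesaurus(pair_counts, t=3, p=2, top_n=10):
    features_by_word = prune_features(freq_filter(pair_counts, t), p)
    return to_thesaurus(sim_counts(aggregate_per_feature(features_by_word)), top_n)


if __name__ == "__main__":
    for word, neighbors in build_thesaurus(PAIR_COUNTS).items():
        print(word, neighbors)
```

Each function works on key-value pairs grouped by a single key (word or feature), which is the property that lets such steps be distributed as map and reduce jobs.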