Cognitive Era and Introduction to IBM Watson
1. Cognitive Era & Introduction to IBM Watson
AN INDUSTRY PERSPECTIVE
Subhendu Dey | Senior Solution Architect, Cognitive Business Solutions
2. What we want to cover today
What is Cognitive Era
How we are affected by the Cognitive Era
Introduction to IBM Watson
Foundational technologies behind IBM Watson
23/01/2016 KSHITIJ, The Annual Techno-Management Fest | January 21-24, 2016 | Organized by IIT KHARAGPUR 2
3. Welcome to a new era of computing
Nearly 80% of data is essentially invisible to computers.
Data in all sectors will increase phenomenally in the next two years, and so will the share of unstructured content.
Cloud adoption will drive a rewrite of software around the world through composable business and the use of APIs.
C-suite executives are ready to invest in cognitive computing, anticipating clear differentiation.
• A world awash in Data
• Cloud impact on software
• Advent of cognitive computing
Sector        Total increase in data   Unstructured %
Healthcare    +98%                     88%
Government    +94%                     84%
Utilities     +93%                     84%
Media         +92%                     82%
4. How are we affected by the new era?
Customer-centric business is not new; however, “understanding” the customer is changing in scope and context with deeper human engagement.
By 2020, about 1.7 MB of new information will be created every minute for every human being on the planet.
Marketers are pursuing data-driven personalization and expect a 15-20% benefit from it.
• Geo-location footprints
• Web interactions
• Transaction history
• Loyalty programme pattern
• Electronic medical records
• Data from wearable devices
• Others….
• Tone
• Sentiment
• Emotional state
• Environmental conditions
• Strength of person’s relationship
• Nature of person’s relationship (e.g. social credit)
= Value-added customer engagement
• Fine-grained picture of individuals
• Interact with individuals in natural language
• Consider peer (social) influence in every action in this globally connected world
• Increasing number of automated agents
5. How are we affected by the new era?
Medical knowledge is increasing at an unprecedented rate.
Each of us generates health-related data equivalent to 300 million books in a lifetime.
Opportunities to gain insight from medical records are being missed.
• The rate of increase of information/knowledge is more than a human can keep up with
• There is a significant gap between top performers and average ones in any industry.
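[Figure: timeline with axis years 1950, 1960, 1970, 1975, 1978, 1980, 2015, 2020]
Preventable medical errors are the #3 cause of death in the US.
[Figure: "N x gap" between average and expert performers; examples shown: Surgeons, Cook, Apple Dev]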
Companies all over the world spend heavily on learning and development, yet they continually face a shortage of the right skills.
6. How are we affected by the new era?
We expect more and more products and services to become cognitive in the next few years.
As the world is being rewritten in code, we see this coming in cars, medical devices, appliances and more.
• In this instrumented and interconnected world, machines will become more intelligent, able to sense, reason and learn about their users and the world around them.
Map internal and external structured and unstructured data
Example: “.. City becomes Venice during monsoons” may mean waterlogging
Create / leverage ontology
Example: Kolkata = Calcutta = City of Joy = Kolkatta
Cycle: Sense, Relate, Reason, Learn
Machine learn and add to ontology
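The ontology example on this slide (Kolkata = Calcutta = City of Joy = Kolkatta) can be illustrated with a minimal Python sketch. It assumes a plain dictionary stands in for the ontology; the names ALIASES, canonical and learn_alias are hypothetical and are not part of any Watson API.

```python
# Hypothetical alias table standing in for an ontology.
# Only the Kolkata aliases come from the slide.
ALIASES = {
    "kolkata": "Kolkata",
    "calcutta": "Kolkata",
    "city of joy": "Kolkata",
    "kolkatta": "Kolkata",
}


def _key(surface: str) -> str:
    """Lower-case a surface form and collapse repeated whitespace."""
    return " ".join(surface.lower().split())


def canonical(surface: str) -> str:
    """Return the canonical entity for a surface form, or the input unchanged."""
    return ALIASES.get(_key(surface), surface)


def learn_alias(surface: str, entity: str) -> None:
    """Add a newly observed surface form to the alias table."""
    ALIASES[_key(surface)] = entity


if __name__ == "__main__":
    print(canonical("Calcutta"))
    print(canonical("City of  Joy"))
```

A real system would learn such aliases from data rather than hard-code them; learn_alias only marks where that step would plug in.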
7. How are we affected by the new era?
Business process functions of all industries can be influenced by the power of cognition.
Banking & financial services
Automated agents in call centers
Social credit aspect in credit risk measurement
Fraud detection
Personalised rewards
Cognitive validation of trade, policies etc.
Enhanced KYC
……
Supply Chain
Telecom
…..
• With the help of cognitive capabilities, companies can transform their operations and functions with better decision making at the speed of data.
• Cognitive Bus / BPM would introduce evidence-based decisions in place of static rules.
8. How are we affected by the new era?
This is where cognitive computing and advanced analytics work hand-in-hand to produce meaningful insight from unstructured content.
• Better headlights for an increasingly volatile and complex future
• 42% of CXOs believe rigid and insufficient analytic tools are a major barrier to new drug development. 94% of them believe cognitive computing will be a disruptive force in life sciences.
[Figure labels: Unstructured Content; Cognitive Analysis; Structured Data; Other Structured Data from various sources; Analytical Models; Data exploration & discovery; Improved Business Insight]
9. What we want to cover today
What is Cognitive Era
How we are affected by the Cognitive Era
Introduction to IBM Watson
Foundational technologies behind IBM Watson
10. Introduction to IBM Watson
• Watson is meant to go beyond the programmatic era and help users interact with computers more as they would with a human.
Watson is not:
Just a search engine, although it comprises several search technologies
Just a new-fangled database system for supporting smart searches
Skynet or HAL 9000 (luckily)
Watson is:
A cognitive system
Combines information retrieval and Natural Language Processing (NLP)
Builds upon domain knowledge for a targeted industry
Runs on Apache UIMA (Unstructured Information Management Architecture) technology*
Derives answers from evidence and is not 100% correct all the time
* For a custom cognitive solution built by orchestrating cloud APIs, it is up to the developer whether to use the framework
11. Introduction to IBM Watson
• Watson is a cloud-based, open platform of expanding cognitive capabilities. With Watson, one can build cognition into digital applications, products and operations.
Watson for Oncology: Trained by Memorial Sloan-Kettering, it is designed to help oncologists anywhere make evidence-informed decisions.
Watson for Clinical Trial Matching: Applies Watson’s ability to combine natural language processing, evidence-based reasoning and other analytic techniques to help match patients to clinical trials and clinical trials to patients.
Watson Developer Cloud: Enables developers and businesses of all sizes to build new cognitive applications and add cognitive capabilities to existing applications. It provides a growing set of APIs and SDKs, and is accessible to anyone through the Bluemix cloud environment.
Watson Discovery Advisor: For researchers in information-intensive industries who are responsible for developing new knowledge in areas where consensus answers do not currently exist.
Watson Engagement Advisor: A question-answer based offering to improve self-service functions in most business domains through deep analytics.
Watson Ecosystem: A breakthrough partner program that supports organizations in building and selling innovations with Watson technology.
Watson Explorer: A cognitive exploration solution that combines information access, content analytics and cognitive computing to help users find and understand the information they need to work more efficiently and make better, more confident decisions.
Watson Analytics: A cloud-based service for data analysis and visualization that allows business people to discover patterns and meaning in their data all on their own.
12. Foundational Technologies behind Watson
• Fifty (50) foundational technologies draw upon five (5) distinct fields of study:
• Big Data & Analytics
• Artificial Intelligence
• Cognitive Experience
• Cognitive Knowledge
• Computing Infrastructure
Anaphoric Co-referencing | Feature Engineering | Learn to Rank | Question Analysis
Colloquialism Processing | Feature Normalization | Linguistic Analysis | Question-answering Reasoning Strategies
Content Management -- Versioning | Focus and Spurious Phrase Resolution | Logical Reasoning Analysis | Recursive Neural Networks
Convolutional Neural Networks | HTML Page Analysis | Logistic Regression | Rules Processing
Curation | Image Management | Machine Learning | Scalable Search
Deep Learning | Information Retrieval | Multi-dimensional Clustering | Similarity Analysis
Dialog Framing | Knowledge (Property) Graphs | Multilingual Training | Statistical Language Parsing
Ellipses | Knowledge Answering | N-gram analysis (word combinations & distance) | Support Vector Machines
Embedded Table Processing | Knowledge Extraction Annotators | Ontology Analysis | Syllable Analysis
Ensembles and Fusion | Knowledge Validation and Extrapolation | Pareto Analysis | Table Answering
Entity Resolution | Language Modeling | Passage Answering | Visual Analysis
Factoid Answering | Latent Semantic Analysis | PDF Conversion | Visual Rendering
 |  | Phoneme Aggregation | Voice Synthesis
13. Watson builds upon programmatic computing but differs in significant ways
Language is important – it can be incredibly imprecise yet amazingly accurate!
Precision – mechanical or scientific exactness that can be found in a passage of text. For example, we can determine with a high degree of precision whether a particular word exists within a passage.
Accuracy – the degree to which one passage infers that another passage might be considered true by reasonable people.
Shallow NLP – fairly precise within a narrow focus, but not accurate.
Deep NLP – uses context in the evaluation of a question or the classification of text.
• Natural Language Processing
• Move from decision tree-driven, deterministic applications to probabilistic systems that co-evolve with their users
[Figure: pipeline from question (Q) to answer (A): Question Decomposition; Hypothesis Generation; Hypothesis and Evidence Scoring; Synthesis; Final Confidence Merging and Ranking. Annotations: 100’s of possible answers; 1000’s of pieces of evidence; 100,000’s of scores from many deep analysis algorithms; Balance and Combine]
14. Question analysis & decomposition
Question or text analysis and decomposition starts with some of the core NLP techniques:
Tokenization / Lemmatization / Stemming
Named entity recognition / detectors (NER / NED) – for example type identification, part-of-speech tagging etc.
Relationship detection
Coreference / Anaphora (pronoun) identification
Keyword identification
Term / Lexical Answer Type (LAT) identification
Machine learning to consider the most likely LAT
Example: “…want to go for a vacation destination beside the sea with lots of kids activity, enough parking space and an array of good sea-food destinations – preferably within 5 hours drive from my place.”
What is the LAT here?
• Off-the-shelf tools are typically available for language detection, NER, POS tagging etc. However, keyword identification and LAT detection are often solution-specific and have to be custom coded.
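As a rough illustration of the custom-coded keyword identification mentioned on this slide, here is a minimal Python sketch using only the standard library. The stop-word list and the function names tokenize and keywords are assumptions made for the example; a real solution would more likely use an NLP toolkit and a domain-specific vocabulary, and this sketch does not attempt LAT detection.

```python
import re

# Tiny illustrative stop-word list (an assumption, not a standard resource).
STOP_WORDS = frozenset(
    {"a", "an", "and", "for", "from", "go", "my", "of", "the", "to", "with"}
)


def tokenize(text):
    """Lower-case the text and split it into word tokens, keeping hyphenated words whole."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def keywords(text):
    """Return the tokens that are not stop words, in order, without duplicates."""
    seen = set()
    result = []
    for token in tokenize(text):
        if token not in STOP_WORDS and token not in seen:
            seen.add(token)
            result.append(token)
    return result


if __name__ == "__main__":
    print(keywords("a vacation destination beside the sea with good sea-food"))
```

The surviving keywords are only candidates; deciding which of them is the answer type is the solution-specific part.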
15. Search sources & Hypothesis generation
The constructed queries are run against the corpora to generate candidate answers.
Parse the search results to find possible answers based on:
Titles
Anchor text
Passages and their parts
Check candidates against known constraints.
Next, scoring is used to filter the candidates:
Taxonomic, geospatial, temporal, source reliability, gender etc.
Context dependent (deep evidence)
Context independent
The output of scoring is the feature list of the candidates.
Finally, merge duplicate candidates by normalizing scores per feature, and rank them using a machine learning model built over the corpora.
• Cognitive systems are built on the basis of available corpora, so if the latter is updated, the machine learning algorithm has to be rebuilt to work effectively on the revised knowledge base.
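The final step on this slide (normalize scores per feature, merge duplicate candidates, then rank) can be sketched in Python as follows. This is an illustration under stated assumptions, not Watson's implementation: min-max scaling is one possible normalization, duplicates are detected by case-insensitive text match, and a fixed weight per feature stands in for the machine-learned ranking model. All function names and feature names are hypothetical.

```python
from collections import defaultdict


def normalize_per_feature(candidates):
    """Min-max scale every feature to [0, 1] across all candidates."""
    features = {f for _, scores in candidates for f in scores}
    scaled = [(answer, dict(scores)) for answer, scores in candidates]
    for f in features:
        values = [scores[f] for _, scores in candidates if f in scores]
        lo, hi = min(values), max(values)
        for _, scores in scaled:
            if f in scores:
                scores[f] = 0.0 if hi == lo else (scores[f] - lo) / (hi - lo)
    return scaled


def merge_duplicates(candidates):
    """Merge candidates with the same text, keeping the best score per feature."""
    merged = defaultdict(dict)
    for answer, scores in candidates:
        key = answer.strip().lower()
        for f, value in scores.items():
            merged[key][f] = max(value, merged[key].get(f, 0.0))
    return dict(merged)


def rank(merged, weights):
    """Order answers by weighted feature sum, highest first.

    The weights are assumed here; in practice they would be learned.
    """
    def total(scores):
        return sum(weights.get(f, 0.0) * v for f, v in scores.items())

    return sorted(merged, key=lambda answer: total(merged[answer]), reverse=True)


if __name__ == "__main__":
    raw = [
        ("Vasco da Gama", {"temporal": 8.0, "geo": 3.0}),
        ("vasco da gama ", {"temporal": 6.0, "geo": 5.0}),
        ("Gary", {"temporal": 2.0, "geo": 1.0}),
    ]
    merged = merge_duplicates(normalize_per_feature(raw))
    print(rank(merged, {"temporal": 0.6, "geo": 0.4}))
```

Because the ranking depends on weights fitted to a corpus, a change in the corpus means refitting them, which is the point the side note on this slide makes.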
16. Example of Shallow and Deep NLP
• Question: In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India. Who is he?
• Relying on keyword matching alone could produce an inaccurate result
Question keywords: Celebrated; In May 1898; 400th anniversary; Portugal; Arrival in India; Explorer
Passage: “In May, Gary arrived in India after he celebrated his anniversary in Portugal”
Passage keywords: Arrived in; Celebrated; In May; anniversary; In Portugal; India; Gary
[Figure: all five links between the question and the passage are labelled “Keyword matching”]
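The weakness of keyword matching can be reproduced with a few lines of Python. The sketch below scores a passage by the number of distinct words it shares with the question, which is a deliberately naive stand-in for shallow NLP; the function names are hypothetical. It compares the “Gary” passage from this slide with the Vasco da Gama passage used on the next slide.

```python
import re


def terms(text):
    """Return the set of lower-cased alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_overlap(question, passage):
    """Count the distinct words shared by the question and the passage."""
    return len(terms(question) & terms(passage))


QUESTION = (
    "In May 1898 Portugal celebrated the 400th anniversary "
    "of this explorer's arrival in India."
)
PASSAGE_GARY = (
    "In May, Gary arrived in India after he celebrated "
    "his anniversary in Portugal"
)
PASSAGE_GAMA = "On the 27th of May, 1498 Vasco da Gama landed in Kappad Beach"

if __name__ == "__main__":
    print("Gary:", keyword_overlap(QUESTION, PASSAGE_GARY))
    print("Vasco da Gama:", keyword_overlap(QUESTION, PASSAGE_GAMA))
```

The irrelevant passage shares six words with the question while the correct one shares only four, so a ranking based on overlap alone prefers the wrong answer.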
17. Example of Shallow and Deep NLP
• Search far and wide
• Explore many hypotheses
• Find and rank evidence
• Question: In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India. Who is he?
• Relying on keyword matching alone could produce an inaccurate result
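Question keywords: Celebrated; In May 1898; 400th anniversary; Portugal; Arrival in India; Explorer
Passage: “On the 27th of May, 1498 Vasco da Gama landed in Kappad Beach”
Passage keywords: Landed in; 27th May 1498; Kappad beach; Vasco da Gama
[Figure: the links between the question and the passage are labelled Geo-Spatial Reasoning, Statistical Paraphrasing and Temporal Reasoning]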
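Of the three kinds of reasoning named on this slide, temporal reasoning is the simplest to sketch: 1498 plus a 400th anniversary gives 1898, the year in the question keywords. The Python below is a toy check under that single rule, with hypothetical function names and regular expressions chosen for this example only; geo-spatial reasoning and statistical paraphrasing are not covered.

```python
import re


def years_in(text):
    """Extract four-digit years (1000-1999 or 2000-2099) from the text."""
    return [int(y) for y in re.findall(r"\b(1\d{3}|20\d{2})\b", text)]


def anniversary_ordinal(text):
    """Return N from a phrase such as '400th anniversary', or None if absent."""
    match = re.search(r"\b(\d+)(?:st|nd|rd|th) anniversary\b", text)
    return int(match.group(1)) if match else None


def temporally_consistent(question, passage):
    """True if a passage year plus the anniversary count equals a question year."""
    n = anniversary_ordinal(question)
    if n is None:
        return False
    question_years = set(years_in(question))
    return any(year + n in question_years for year in years_in(passage))


if __name__ == "__main__":
    question = (
        "In May 1898 Portugal celebrated the 400th anniversary "
        "of this explorer's arrival in India."
    )
    passage = "On the 27th of May, 1498 Vasco da Gama landed in Kappad Beach"
    print(temporally_consistent(question, passage))
```

A passage with no year, such as the “Gary” sentence, cannot pass this check, so the evidence it contributes is weaker however many keywords it shares with the question.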
18. Understanding Language is just the beginning
• Cognitive systems can be decomposed into several key elements.
• Some of the capabilities exist today and some are expected to appear in the near future.
Cognitive systems can later be used to find new concepts that lead to new discovery and insight, helping us get answers to questions we never thought to ask.
As the systems grow richer, they are expected to gain the ability to sense.
19. Accuracy is improved through generalization
Besides the pipeline discussed so far, many other things make up Watson’s core elements.
Watson Developer Cloud APIs are one such set, and they are gaining the most traction today.
These industry-agnostic, generalized APIs allow you to create a pipeline of your own and to leverage NLP and machine learning techniques in a custom way to build a cognitive solution appropriate to an industry / client.
Where is the domain specificity in those NLP techniques?
We are at a classic juncture: whether to specialize or generalize.
We are perhaps at the beginning of a new era of computing, one that is less precise but more accurate.
• If the ability to adapt with human-like dexterity makes cognitive systems special, we must generalize. We need to recognize and draw inferences from a broader set of linguistic variation, under a broader set of circumstances, as our knowledge changes, as the context changes, and as contemporary linguistics change.