Natural Language Classifier
Links, Best Practices, Source Code, and Tools
CONFIDENTIAL © 2016 International Business Machines Corporation
Welcome to the Natural Language Classifier! This document will help you get started.
Topic
Introduction
Documentation, Community, and Demos with Source Code
Tools
Ground Truth Management Tool
Demos
Build a Harry Potter sorting hat
Best Practices
Obtain end user questions ASAP
Train users to ask natural language questions
What’s the best naming convention for intent names?
Bootstrapping your next training set
NLC Confidence Scores Sum to 100%
Design Patterns
Classify Complex Text with a Multi-Intent Classifier
Chaining multiple classifiers
More Handbooks
Intro to Watson Development
https://ibm.box.com/s/nav52vt6q2xwib5zqwupwjf78mxtgems
Retrieve and Rank (R&R) Handbook
https://ibm.box.com/s/n0lqowt0v97nxb5mtei6qrbkkbtdt2dm
Personality (PI) Handbook
https://ibm.box.com/s/6h8dxsc3pq5idtgehjb6fwh7vbejsvtc
Topic
Design Patterns Continued
Answer FAQs with NLC and let R&R answer the “long tail”
Combine Dialog + NLC + R&R
Answer Detailed Questions Using NLC + Entity Extraction
Using NLC on large pieces of text through decomposition
Use Cases
Perform actions for users using NLC+Dialog
Topic Modeling Using the NLC
NLC on Stack Overflow: http://stackoverflow.com/questions/tagged/ibm-watson
Read questions and get answers from other Watson NLC developers.
Technical Webinars: http://www.pages03.net/ibmwatson/building-with-watson-web-series
Sign up for the upcoming webinars on how to build conversational apps using Watson
Docs, Tools, and Community
NLC on the WDC: https://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/nl-classifier.html
Read the NLC section of the Watson Developer Cloud to learn all about the service.
Webinar: Interpreting Language Using the NLC: https://goo.gl/5SFLmn
A recent webinar by Rahul A. Garg, Watson Offering Manager for the NLC
https://goo.gl/YEBfG0
https://goo.gl/OgfsN2
https://goo.gl/lUlQ9G
Architecture: http://goo.gl/BLNtbM
Demos w/source code
https://goo.gl/ARnUwJ
NodeJS Watson Library: https://goo.gl/2nhR1n
Kick start your Watson NodeJS development.
Java Watson Library: https://github.com/watson-developer-cloud/java-sdk
A library of Java utilities to jump start your Watson development.
iOS/Swift Watson Library: https://github.com/watson-developer-cloud/ios-sdk
Check out this iOS/Swift library to kick start your Watson development.
Code Libraries & SDKs
NLC Ground Truth Management Tool
You can now let non-technical people train classifiers and manage their ground truth. This NLC tooling allows you to quickly upload and preview
your ground truth while quickly viewing existing classifiers. Go here to get started:
http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/nl-classifier/tool_overview.shtml
Build a Harry Potter Sorting Hat
Impress your kids and your colleagues with the power of the NLC
This fun and creative demo shows how a very simple training set can roughly discern your “personality type” well enough to
determine which Hogwarts house you belong to. The training dataset plus code in R are available for download.
Top Demos
Blog: https://dreamtolearn.com/ryan/data_analytics_viz/98
Video: https://www.youtube.com/watch?v=7SoUSpxONcg
Obtain End User Text/Questions ASAP
What’s the first mistake most R&R implementations make?
That’s right. They spend too much time working with pseudo-questions generated internally to bootstrap R&R. Don’t let yourself spend too much time
before getting your application in front of real end users to validate your assumptions about how your users will ask questions.
Anecdotally, 99% of implementations waste 75% of their time generating ground truth that doesn’t properly match end user needs. Validate soon and
validate often!
Best Practice
Don’t wait for perfection. Present your users with “Good Enough”.
Pay close attention to George Box’s quote and get in front of your end users ASAP. Often a basic system is still useful enough to start asking meaningful
questions. That is, you need to quickly understand your users’ workflow, what questions/text they’ll submit, and how it’s worded. Normally a partial
system will still be enough to validate and extend your initial assumptions.
“Remember that all models are wrong.
The practical question is how wrong do
they have to be to be useful?”
George Box, 1987
Train users to ask natural language questions
Modify your users’ default behaviors. No more keywords. Use full sentences.
If your UI has a text box, your users will likely default to entering keywords as they do with Google. You’ll need to modify
their behavior so they use full sentences. Here are a few recommended ways to achieve this.
Best Practice
1. Your placeholder should be a full text sentence.
2. Provide examples of other full text sentences.
3. If the user types fewer than 3 words, then display a dialog suggesting that they enter full sentences.
4. Use type-ahead to display a drop-down list of possible sentences from your database of prior user sentences.
5. Track every sentence submitted by users. This is valuable IP and critical for training. Real end user questions are GOLDEN so always keep them separate from pseudo-questions generated by non-SMEs.
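Tips 3 and 5 are easy to wire into your UI layer. Here is a minimal sketch, assuming a plain word count is good enough to spot keyword-style input. The function names, the `source` tag, and the in-memory log are hypothetical and only for illustration.

```python
def needs_full_sentence_hint(text: str, min_words: int = 3) -> bool:
    """Return True when the input looks like keywords rather than a sentence."""
    return len(text.split()) < min_words


def record_submission(log: list, text: str, source: str = "end_user") -> None:
    """Store every submission, tagged so real user text stays separate
    from pseudo-questions (which would use a different source tag)."""
    log.append({"text": text, "source": source})


submissions = []
for user_text in ["pool hours", "When does the fitness center open?"]:
    record_submission(submissions, user_text)
    if needs_full_sentence_hint(user_text):
        print(f"Hint for {user_text!r}: please ask your question as a full sentence.")
```

In a real application the log would live in a database, which is also what the type-ahead in tip 4 would read from.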
What’s the best naming convention for intent names?
Best Practice
Take time to establish a consistent naming convention for your intents. First, determine a hierarchy that groups similar intents. We
recommend a naming convention with the broader category at the left moving to subcategories on the right:
Text                                  Intent
Where is the fitness center?          fitness_center-location
Where is the business center?         business_center-location
When does the fitness center open?    fitness_center-open_time
When does the fitness center close?   fitness_center-close_time
What are the business center hours?   business_center-hours
Next look for subtypes that are common between intents. E.g. “location” and “hours” in the examples above. These subtypes could span
multiple levels (as below), but we recommend not going beyond 4 levels in your intent hierarchy. Having good intent names will allow you
to rapidly sort/search for all intents of a given type.
Separate categories with dashes and replace spaces between words with underscores:
Text                                          Intent
Can you send 2 pillows to my room?            housekeeping-linens-pillows
Just spilled coffee on my bed sheets. Help!   housekeeping-linens-sheets
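If you generate intent names in code, the convention above (dashes between levels, underscores between words, at most 4 levels) can be enforced in one place. This is a small sketch. The helper names are hypothetical, and lowercasing the names is an assumption based on the examples shown.

```python
from collections import defaultdict

MAX_LEVELS = 4  # the recommended maximum depth of the intent hierarchy


def make_intent_name(*levels: str) -> str:
    """Join hierarchy levels with dashes; join words within a level with underscores."""
    if not 1 <= len(levels) <= MAX_LEVELS:
        raise ValueError(f"expected 1 to {MAX_LEVELS} levels, got {len(levels)}")
    return "-".join("_".join(level.lower().split()) for level in levels)


def group_by_category(intent_names):
    """Group intent names by their broadest (left-most) category."""
    groups = defaultdict(list)
    for name in intent_names:
        groups[name.split("-")[0]].append(name)
    return dict(groups)


names = [
    make_intent_name("fitness center", "location"),
    make_intent_name("fitness center", "open time"),
    make_intent_name("Housekeeping", "linens", "pillows"),
]
print(group_by_category(names))
```

Because the broadest category sits on the left, a plain alphabetical sort also groups related intents together.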
Bootstrap your NLC Training
Best Practice
Are you tired of spending so much time training your NLC classifiers?
Wouldn’t it be great if prior training could speed up the next round of training?
[Diagram: new training set → NLC classifier (bootstrap classification) → initial pass at intent classes → SMEs (validation classification) → final intent classes]
Try bootstrapping your next round of NLC training
Before you perform the next round of NLC training, run your next set of training data against your current classifier and let it
take a first pass at classification. Now go through this pre-classified training set; you only need to edit the classes that
were mislabeled. Be extra careful not to simply accept the classes created by this bootstrapped classification. We don’t want
the classifier to skew your training! Also pay attention to where your classifications are failing or succeeding, as that can guide
your future training efforts!
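The bootstrapping loop can be sketched in a few lines. This is an illustration, not the NLC tooling itself: `classify` is a hypothetical callable that wraps your current classifier and returns `(intent, confidence)`, and the CSV column names are made up. The `reviewed_intent` column is left blank on purpose so that an SME has to confirm or correct every row instead of silently accepting the suggestion.

```python
import csv
import io

FIELDS = ["text", "suggested_intent", "confidence", "reviewed_intent"]


def bootstrap_labels(texts, classify):
    """Pre-label new training texts with the current classifier."""
    rows = []
    for text in texts:
        intent, confidence = classify(text)
        rows.append({"text": text, "suggested_intent": intent,
                     "confidence": confidence, "reviewed_intent": ""})
    # Least confident suggestions first: these are the most likely mislabels.
    rows.sort(key=lambda row: row["confidence"])
    return rows


def to_review_csv(rows) -> str:
    """Render the pre-labeled rows as CSV text for SME review."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def stub_classifier(text):
    """Stand-in for a call to your trained classifier (hypothetical values)."""
    if "fitness" in text.lower():
        return ("fitness_center-hours", 0.92)
    return ("business_center-hours", 0.55)


review_rows = bootstrap_labels(
    ["When does the fitness center open?", "Is there a printer I can use?"],
    stub_classifier,
)
print(to_review_csv(review_rows))
```

Sorting by confidence also shows you where the current classifier struggles, which is the signal the slide suggests using to guide future training.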
NLC Confidence Scores Sum to 100%
Best Practice
What confidence is returned by a classifier trained with only 1 intent class?
A single-class NLC classifier would always return that intent class with confidence = 100%, regardless of the text submitted. Let’s see this
with a simplified example where we train the NLC to recognize A and then submit X for classification.
Training with A: given unrelated X, the NLC will always return 100% confidence that the answer is A.
What about training with 2 intent classes, then submitting input text that’s absolutely not related to either?
So this time, let’s train with A and B.
Training with A and B: given unrelated X, the NLC will return 50% confidence that the answer is A and 50% that it is B.
Similarly for more intent classes.
An NLC trained with 100 different intent classes would return confidence = 1% for each class when given text unrelated to the trained classes.
This may sound like an odd usage of the idea of confidence, but it’s incredibly useful. See the Design Pattern on how to “Classify Complex
Text with a Multi-Intent Classifier” to learn more.
Trick Question: What if you trained the NLC on 10 classes and the NLC was 50% confident in each of two of them?
It means the NLC is highly confident your text belongs equally to these two classes. So you should consider two actions: (1) review the semantic overlap
of the classes to determine whether they are really distinct, and (2) if they are distinct, use the Watson Dialog service to ask the user which of the two options
is actually the correct one, as the user’s text may ambiguously refer to both intents.
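To see why the numbers behave this way, it helps to play with any scoring scheme that normalizes its outputs to sum to 100%. The sketch below uses a softmax purely as an illustration. That choice is an assumption, since the slides do not say how the NLC computes confidences, and the raw scores are made up. Unrelated text is modeled as equal raw scores for every class.

```python
import math


def softmax(scores: dict) -> dict:
    """Normalize raw scores into confidences that sum to 1.0 (illustrative only)."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(score - peak) for label, score in scores.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


# One class: whatever the raw score, the only class gets 100%.
single = softmax({"A": -3.2})

# Two classes, unrelated text (equal raw scores): 50% each.
two = softmax({"A": 0.0, "B": 0.0})

# 100 classes, unrelated text: 1% each.
hundred = softmax({f"class_{i}": 0.0 for i in range(100)})

print(single, two, round(hundred["class_0"], 4))
```

The practical takeaway is the same as on the slide: a confidence is relative to the classes you trained, so always read it in light of how many classes there are.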
Classify Complex Text with a Multi-Intent Classifier
Design Pattern
How can we analyze text like “Show me pink Audi convertibles”?
We want to extract three intents from this text: color, model, and vehicle type. And we can do this by
assigning three classes to our NLC training set. For example something like this:
Text                                Class 1              Class 2              Class 3
Show me pink Audi convertibles      vehicle_color_pink   vehicle_model_audi   vehicle_type_convertible
Do you have blue BMW motorcycles?   vehicle_color_blue   vehicle_model_bmw    vehicle_type_motorcycle
Let me see pink BMW convertibles    vehicle_color_pink   vehicle_model_bmw    vehicle_type_convertible
…                                   …                    …                    …
Now use the fact that all NLC confidences always sum to 100%!
If the heading above doesn’t make sense, look at the Best Practices slide on “NLC Confidence Scores Sum to 100%.” That will
help you understand this next conceptual example, where we’re going to train the classifier on six classes:
Training with A1, A2, B1, B2, C1 & C2: given A1+B2+C1, the NLC will return 33% confidence that the answer is A1, 33% that it’s B2, and 33% that it’s C1!
And the trick? Let A = vehicle colors, B = vehicle models, and C = vehicle types!
Yes. You’ve just trained your classifier to distinguish three intent categories with one call! You can now create a virtual vehicle sales agent that classifies
user text by color, model, and vehicle type in a single NLC call. For our query above of “Show me pink Audi convertibles”, first separate the classes in
your NLC response by type (color, model, vehicle type) and normalize each intent category to 100%. This would allow you to determine
vehicle_color=“pink” at 100%, vehicle_model=“Audi” at 100%, and vehicle_type=“convertible” at 100%.
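The "separate and normalize" step can be sketched as follows. It assumes the response gives you a list of classes, each with a `class_name` and a `confidence`, and that class names follow the `vehicle_<category>_<value>` pattern from the table above. The confidence values are invented for illustration and are not real NLC output.

```python
from collections import defaultdict


def normalize_by_category(classes, depth: int = 2):
    """Split classes into categories by name prefix, then rescale each
    category so its confidences sum to 1.0."""
    groups = defaultdict(dict)
    for item in classes:
        parts = item["class_name"].split("_")
        category = "_".join(parts[:depth])   # e.g. "vehicle_color"
        value = "_".join(parts[depth:])      # e.g. "pink"
        groups[category][value] = item["confidence"]
    normalized = {}
    for category, values in groups.items():
        total = sum(values.values())
        normalized[category] = {
            value: (conf / total if total else 0.0) for value, conf in values.items()
        }
    return normalized


def top_per_category(normalized):
    """Pick the most confident value in each category."""
    return {category: max(values, key=values.get) for category, values in normalized.items()}


# Illustrative response for "Show me pink Audi convertibles" (made-up numbers).
response_classes = [
    {"class_name": "vehicle_color_pink", "confidence": 0.33},
    {"class_name": "vehicle_model_audi", "confidence": 0.33},
    {"class_name": "vehicle_type_convertible", "confidence": 0.33},
    {"class_name": "vehicle_color_blue", "confidence": 0.0033},
    {"class_name": "vehicle_model_bmw", "confidence": 0.0033},
    {"class_name": "vehicle_type_motorcycle", "confidence": 0.0034},
]

normalized = normalize_by_category(response_classes)
print(top_per_category(normalized))
```

After normalization, a raw 33% for pink becomes roughly 99% within the color category, which is the effect the slide describes.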
Chaining Multiple NLC Classifiers
Do you have more classes than allowed by a single classifier?
Do your text classes have higher level categories that are easily separated?
Is training a single classifier becoming increasingly difficult to manage?
Then perhaps a multi-classifier solution is right for you:
The initial classifier would perform a high-level separation of your text so that classifiers at the next level can
separate your classes with higher confidence.
Design Pattern
[Diagram: text related to Fitness, Diet, Wellness, Exercise, Food, or Allergies… goes to an NLC “Topic” classifier, which routes it to one of the second-level classifiers: NLC classifier for Fitness, for Diets, for Wellness, for Exercise, for Food, or for Allergies. The Wellness classifier, for example, returns the wellness intent.]
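In code, chaining is a two-step lookup: ask the topic classifier first, then hand the same text to the classifier registered for that topic. The sketch below uses stub functions in place of real classifier calls. Every name and confidence value is hypothetical.

```python
def chain_classify(text, topic_classifier, sub_classifiers):
    """Route text through a top-level topic classifier, then a per-topic classifier.

    Both kinds of classifier are callables returning (label, confidence).
    """
    topic, topic_confidence = topic_classifier(text)
    result = {"topic": topic, "topic_confidence": topic_confidence,
              "intent": None, "intent_confidence": None}
    second_level = sub_classifiers.get(topic)
    if second_level is not None:
        intent, intent_confidence = second_level(text)
        result.update(intent=intent, intent_confidence=intent_confidence)
    return result


def topic_stub(text):
    """Stand-in for the "Topic" classifier."""
    return ("Fitness", 0.90) if "gym" in text.lower() else ("Diet", 0.80)


SUB_CLASSIFIERS = {
    "Fitness": lambda text: ("fitness-gym_hours", 0.85),
    # No second-level classifier registered for "Diet" in this sketch.
}

print(chain_classify("When does the gym open?", topic_stub, SUB_CLASSIFIERS))
print(chain_classify("How many calories are in rice?", topic_stub, SUB_CLASSIFIERS))
```

Each second-level classifier only has to separate the classes within its own topic, which is where the higher confidence comes from.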
Answer FAQs with NLC and let R&R answer the “long tail”
The Natural Language Classifier is well suited for Frequently Asked Questions (FAQs), where the effort to associate a single static
answer with a question is rapidly rewarded. Retrieve and Rank is then used for infrequently asked questions, or for those where multiple
passages or frequently changing content must be searched.
Design Pattern
[Chart: axes labeled Questions and Variants (Clusters), with ticks at 1, 10, and 20. FAQs fall in the Natural Language Classifier region; infrequent questions fall in the Retrieve and Rank region.]
Combine Dialog + NLC + R&R
Design Pattern
Combine Retrieve and Rank with Dialog and NLC. Dialog provides a multi-turn experience in which you ask customers clarifying
questions and track state across queries. NLC can be used to (1) detect specific domains of user interest so R&R can
search only a subset of documents, (2) detect overlap between possible user intents so Dialog can ask the user for clarification,
or perhaps (3) inject valuable run-time features into R&R for more targeted ranking of answers.
[Diagram components: GT Mgm’t Tool, Doc Conversion, NLC, Dialog, Index, Retrieve, Rank]
Answer Detailed Questions Using NLC + Entity Extraction
Design Pattern
So you have the user’s intent from NLC, what’s next? Use entity extraction to obtain other useful details
from the user’s text. The NLC Factoid Assistant demo does this for entities contained in DBpedia.
Demo: http://goo.gl/u4jXiT. Source code: https://goo.gl/ARnUwJ.
First identify the intent (NLC) and the entities (Alchemy Entity Extraction) being asked about. A pipeline then builds the app’s response.
E.g. in the Factoid Assistant, a ‘place’ pipeline handles place-related questions.
The Factoid Assistant uses SPARQL queries to extract entity details for its answers.
Question                               Intent             App’s Response
What’s the population of California?   place_population   California's population is 38,802,500.
How tall is the Eiffel Tower?          place_height       The Eiffel Tower's height is 300 meters.
Entity examples shown on the slide: Sylvester Stallone, Eiffel Tower
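Once you have an intent and an entity, building a lookup query is mostly string assembly. The sketch below is not the Factoid Assistant's actual code. The intent-to-property mapping, the DBpedia property names, and the query shape are all assumptions chosen for illustration, and the query is only built here, not sent anywhere.

```python
# Hypothetical mapping from an NLC intent to a DBpedia ontology property.
INTENT_TO_PROPERTY = {
    "place_population": "dbo:populationTotal",
    "place_height": "dbo:height",
}


def build_sparql(intent: str, entity: str) -> str:
    """Build a SPARQL query string for one intent/entity pair."""
    prop = INTENT_TO_PROPERTY[intent]
    resource = entity.strip().replace(" ", "_")  # "Eiffel Tower" -> "Eiffel_Tower"
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "PREFIX dbr: <http://dbpedia.org/resource/>\n"
        f"SELECT ?value WHERE {{ dbr:{resource} {prop} ?value }} LIMIT 1"
    )


print(build_sparql("place_height", "Eiffel Tower"))
```

A production version would also need to escape entity names that are not valid in a prefixed name and handle entities with no matching resource.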
Using NLC on large pieces of text through decomposition
Design Pattern
NLC works with text shorter than 1,024 characters, so it won’t directly work with long pieces of text such as emails. Long text can be broken into sentences:
classify each sentence, then aggregate the results into a summary of the message. There are a number of open source sentence segmentation
tools. NLTK works very well for this purpose.
The training phase requires manual annotation of a set of documents (start with a couple of hundred; it should take a couple of hours to plow through
them and assign a class to each sentence).
The production flow will use the trained classifier to automatically do the sentence-level classification and then produce a summary. You can use
business rules to trigger actions at this point.
For an example use case where we detect the severity of complaint emails and rank them from minor to severe:
If the email contains 10% or higher "severe" sentences then prioritise triage.
If the email contains 90% "minor" sentences then leave at bottom of queue.
If the email contains 90% "severe" sentences then email the CEO and assemble the press team.
Or train a meta-classifier (decision tree or SVM) to make a classification based on the results from the sentence-level annotations.
[Diagram: new training set → NLTK sentence splitter → sentences → manual classification into intents → NLC]
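The production flow for the complaint-email example can be sketched like this. To keep the example free of dependencies, a naive regular-expression splitter stands in for a proper sentence segmenter such as NLTK, and `stub_severity` stands in for the trained classifier. Both, along with the action names, are assumptions. The three business rules overlap, so the strictest one is checked first.

```python
import re
from collections import Counter


def split_sentences(text: str):
    """Naive sentence splitter (stand-in for a real segmenter such as NLTK)."""
    return [part for part in re.split(r"(?<=[.!?])\s+", text.strip()) if part]


def summarize(sentences, classify):
    """Return the fraction of sentences assigned to each class."""
    if not sentences:
        return {}
    counts = Counter(classify(sentence) for sentence in sentences)
    return {label: count / len(sentences) for label, count in counts.items()}


def triage(fractions) -> str:
    """Apply the business rules, strictest first."""
    severe = fractions.get("severe", 0.0)
    minor = fractions.get("minor", 0.0)
    if severe >= 0.9:
        return "email_ceo_and_assemble_press_team"
    if severe >= 0.1:
        return "prioritise_triage"
    if minor >= 0.9:
        return "bottom_of_queue"
    return "normal_queue"  # assumed default; the slide does not define one


def stub_severity(sentence: str) -> str:
    """Stand-in for a per-sentence classifier call."""
    return "severe" if "lawyer" in sentence.lower() else "minor"


email = ("My order arrived late. The box was damaged. "
         "I will call my lawyer! Please reply soon.")
sentences = split_sentences(email)
fractions = summarize(sentences, stub_severity)
print(fractions, triage(fractions))
```

Swapping `triage` for a trained meta-classifier, as the last line of the slide suggests, only changes the final step. The per-sentence fractions are the features.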
Perform actions for users using NLC+Dialog
Use Case
[Diagram: the user says “I need to change my password.” The Natural Language Classifier, trained on the intent categories Checking, Savings, Customer Service, and Tech Support, identifies the intent (Intent = password_reset, Confidence = 0.874582). Dialog then asks “What is your mother’s maiden name?” and “What should your new password be?” … and the update is made in the authentication control systems.]
Your task? Build a voice-enabled online banking solution. For example, how would you allow a customer to request a password change? First train the
NLC to identify the various intent categories such as Tech Support, Checking, Savings, and Customer Service. When a password change request
arrives, the NLC would identify this and your system would recognize the need to first authenticate the user. The Watson Dialog service could be
configured to perform the back-and-forth to obtain the user’s authentication details and request the password reset from your bank’s authentication system.
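The hand-off from the NLC to Dialog is essentially a dispatch table keyed by intent. The sketch below is illustrative only. The handler, the session dictionary, the returned action names, and the `min_confidence` cut-off are all hypothetical; the intent and confidence values are the ones shown in the diagram.

```python
def start_password_reset(session: dict) -> str:
    """Queue the authentication questions the dialog will ask next."""
    session["pending_questions"] = [
        "What is your mother's maiden name?",
        "What should your new password be?",
    ]
    return "authenticate_user"


# Map each intent the system can act on to its handler.
ACTIONS = {"password_reset": start_password_reset}


def handle(intent: str, confidence: float, session: dict,
           min_confidence: float = 0.5) -> str:
    """Run the handler for an intent, or fall back when unsure."""
    action = ACTIONS.get(intent)
    if action is None or confidence < min_confidence:
        return "ask_user_to_rephrase"
    return action(session)


session = {}
print(handle("password_reset", 0.874582, session))
print(session["pending_questions"])
```

The actual password update would happen only after the dialog has collected and verified the answers, in your bank's authentication system.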
Topic Modeling Using the NLC
Use Case
Lucky you!
You have lots of natural language being generated by your customer/technical support teams. Or maybe it’s your customers writing reviews, sending
you emails, or tweeting about your products. Your challenge is to quickly categorize, analyze, and route the highest priority ones to the best person. Not
a problem! You can do that and more using the NLC and Alchemy Language.
If your content is < 1024 chars, then send it directly to the NLC. If it’s longer, reduce the size by parsing at the sentence or paragraph level. You could
also test running your text through Alchemy Keywords to remove the lower relevance words.
[Diagram: text under 1,024 characters goes directly to the NLC; longer text is first tokenized to the NLC’s 1,024-character limit, with Alchemy Keywords as an optional step to focus on only “high” value words. Alchemy sentiment also appears in the flow.]
Topics/Intents: Product Installation, Product Return, Product Malfunction, Competitor Mentioned
Questions to answer: What are we doing right? What’s wrong? Did our rep respond appropriately? What was the resolution?
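Reducing long content to NLC-sized pieces can be done by packing whole sentences into chunks that stay under the limit. This is a minimal sketch, assuming a naive regular-expression sentence split and a hard cut for any single sentence that is itself too long. The function name is hypothetical.

```python
import re

MAX_CHARS = 1024  # each chunk must be shorter than this


def chunk_text(text: str, max_chars: int = MAX_CHARS):
    """Pack whole sentences into chunks shorter than max_chars."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], ""
    for sentence in sentences:
        # A single over-long sentence is cut into hard slices.
        while len(sentence) >= max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars - 1])
            sentence = sentence[max_chars - 1:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) < max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


review = "This product stopped working after a week. " * 60
pieces = chunk_text(review)
print(len(review), [len(piece) for piece in pieces])
```

Each chunk can then be classified on its own and the results aggregated, in the same way as the sentence-level decomposition pattern.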

Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 

Recently uploaded (20)

DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 

Here is a design pattern for classifying complex text with multiple intents using the Natural Language Classifier:
1. Define intent classes for each "aspect" you want to extract, such as color, model, and type.
2. During training, assign multiple intent classes to examples that contain multiple aspects.
3. At runtime, submit the text and NLC will return confidences for each intent class.
4. Analyze the confidences to extract the different aspects, for example the top confidence for each of color, model, and type.
5. You may need to handle cases where NLC is equally confident in multiple classes for the same aspect. Ask the user to clarify, or pick the most likely.

  • 1. Natural Language Classifier: Links, Best Practices, Source Code, and Tools
  • 2. CONFIDENTIAL © 2016 International Business Machines Corporation
    Welcome to the Natural Language Classifier! This document will help you get started.
    Topic (Page)
    Introduction
      Documentation, Community, and Demos with Source Code (3)
    Tools
      Ground Truth Management Tool (4)
    Demos
      Build a Harry Potter sorting hat (5)
    Best Practices
      Obtain end user questions ASAP (6)
      Train users to ask natural language questions (7)
      What’s the best naming convention for intent names? (8)
      Bootstrapping your next training set (9)
      NLC Confidences Scores Sum to 100% (10)
    Design Patterns
      Classify Complex Text with a Multi-Intent Classifier (11)
      Chaining multiple classifiers (12)
    Design Patterns Continued
      Answer FAQs with NLC and let R&R answer the “long tail” (13)
      Combine Dialog + NLC + R&R (14)
      Answer Detailed Questions Using NLC + Entity Extraction (15)
      Using NLC on large pieces of text through decomposition (16)
    Use Cases
      Perform actions for users using NLC+Dialog (17)
      Topic Modeling Using the NLC (18)
    More Handbooks
      Intro to Watson Development: https://ibm.box.com/s/nav52vt6q2xwib5zqwupwjf78mxtgems
      Retrieve and Rank (R&R) Handbook: https://ibm.box.com/s/n0lqowt0v97nxb5mtei6qrbkkbtdt2dm
      Personality (PI) Handbook: https://ibm.box.com/s/6h8dxsc3pq5idtgehjb6fwh7vbejsvtc
  • 3. Docs, Tools, and Community
    NLC on Stack Overflow: http://stackoverflow.com/questions/tagged/ibm-watson
      Read questions and get answers from other Watson NLC developers.
    Technical Webinars: http://www.pages03.net/ibmwatson/building-with-watson-web-series
      Sign up for the upcoming webinars on how to build conversational apps using Watson.
    NLC on the WDC: https://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/nl-classifier.html
      Read the NLC section of the Watson Developer Cloud to learn all about the service.
    Webinar: Interpreting Language Using the NLC: https://goo.gl/5SFLmn
      A recent webinar by Rahul A. Garg, Watson Offering Manager for the NLC.
    https://goo.gl/YEBfG0 https://goo.gl/OgfsN2 https://goo.gl/lUlQ9G
    Architecture: http://goo.gl/BLNtbM
    Demos w/source code: https://goo.gl/ARnUwJ
    Code Libraries & SDKs
    NodeJS Watson Library: https://goo.gl/2nhR1n
      Kick start your Watson NodeJS development.
    Java Watson Library: https://github.com/watson-developer-cloud/java-sdk
      A library of Java utilities to jump start your Watson development.
    iOS/Swift Watson Library: https://github.com/watson-developer-cloud/ios-sdk
      Check out this iOS/Swift library to kick start your Watson development.
  • 4. NLC Ground Truth Management Tool
    You can now let non-technical people train classifiers and manage their ground truth. This NLC tooling lets you quickly upload and preview your ground truth and view your existing classifiers.
    Go here to get started: http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/nl-classifier/tool_overview.shtml
  • 5. Top Demos: Build a Harry Potter Sorting Hat
    Impress your kids and your colleagues with the power of the NLC. This fun and creative demo shows how a very simple training set can roughly discern your “personality type”, well enough to determine which Hogwarts house you belong to. The training dataset plus code in R are available for download.
    Blog: https://dreamtolearn.com/ryan/data_analytics_viz/98
    Video: https://www.youtube.com/watch?v=7SoUSpxONcg
  • 6. Best Practice: Obtain End User Text/Questions ASAP
    What’s the first mistake most R&R implementations make? That’s right. They spend too much time working with pseudo-questions generated internally to bootstrap R&R. Don’t let yourself spend too much time before getting your application in front of real end users to validate your assumptions about how your users will ask questions. Anecdotally, 99% of implementations waste 75% of their time generating ground truth that doesn’t properly match end user needs. Validate soon and validate often!
    Don’t wait for perfection. Present your users with “Good Enough”. Pay close attention to George Box’s quote and get in front of your end users ASAP. Often a basic system is still useful enough to start asking meaningful questions. That is, you need to quickly understand your users’ workflow, what questions/text they’ll submit, and how it’s worded. Normally a partial system will still be enough to validate and extend your initial assumptions.
    “Remember that all models are wrong. The practical question is how wrong do they have to be to be useful?” George Box, 1987
  • 7. Best Practice: Train users to ask natural language questions
    Modify your users’ default behaviors. No more keywords. Use full sentences.
    If your UI has a text box, your users will likely default to entering keywords as they do with Google. You’ll need to modify their behavior so they use full sentences. Here are a few recommended ways to achieve this.
    1. Your placeholder should be a full text sentence.
    2. Provide examples of other full text sentences.
    3. If the user types fewer than 3 words, display a dialog suggesting that they enter full sentences.
    4. Use type-ahead to display a drop-down list of possible sentences from your database of prior user sentences.
    5. Track every sentence submitted by users. This is valuable IP and critical for training. Real end user questions are GOLDEN, so always keep them separate from pseudo-questions generated by non-SMEs.
  • 8. Best Practice: What’s the best naming convention for intent names?
    Take time to establish a consistent naming convention for your intents. First, determine a hierarchy that groups similar intents. We recommend a naming convention with the broader category at the left moving to subcategories on the right. Separate categories with dashes and replace spaces between words with underscores:
      Text                                      Intent
      Where is the fitness center?              fitness_center-location
      Where is the business center?             business_center-location
      When does the fitness center open?        fitness_center-open_time
      When does the fitness center close?       fitness_center-close_time
      What are the business center hours?       business_center-hours
    Next, look for subtypes that are common between intents, e.g. “location” and “hours” in the examples above. These subtypes could span multiple levels (as below), but we recommend not going beyond 4 levels in your intent hierarchy. Having good intent names will allow you to rapidly sort/search for all intents of a given type.
      Text                                      Intent
      Can you send 2 pillows to my room?        housekeeping-linens-pillows
      Just spilled coffee on my bed sheets.     housekeeping-linens-sheets
      Help!
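The naming convention on this slide can be applied mechanically. The following is a minimal sketch, not part of the original deck: the helper names `intent_name` and `intents_under` are hypothetical, and the 4-level cap simply encodes the slide's recommendation.

```python
import re


def intent_name(*categories: str) -> str:
    """Join category labels, broadest first, into one intent name.

    Levels are separated by dashes; spaces inside a level become underscores.
    """
    if not 1 <= len(categories) <= 4:
        raise ValueError("use between 1 and 4 hierarchy levels")
    parts = []
    for label in categories:
        # Lowercase, then collapse every run of non-alphanumerics to "_".
        words = re.sub(r"[^a-z0-9]+", "_", label.lower()).strip("_")
        if not words:
            raise ValueError(f"empty category label: {label!r}")
        parts.append(words)
    return "-".join(parts)


def intents_under(prefix: str, names: list[str]) -> list[str]:
    """Return, sorted, every intent at or below the given hierarchy prefix."""
    return sorted(n for n in names if n == prefix or n.startswith(prefix + "-"))
```

With names built this way, `intents_under("housekeeping-linens", all_names)` returns every linens-related intent, which is the sort/search benefit the slide describes.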
  • 9. Best Practice: Bootstrap your NLC Training
    Are you tired of spending so much time training your NLC classifiers? Wouldn’t it be great if prior training could speed up the next round of training? Try bootstrapping your next round of NLC training.
    [Diagram: New training set -> NLC classifier (bootstrap classification) -> initial pass at intent classes -> SME validation -> final intent classes]
    Before you perform the next round of NLC training, run your next set of training data against your current classifier and let it take a first pass at classification. Then go through this pre-classified training set; you only need to edit the classes that were mislabeled. Be extra careful not to simply accept the classes created by this bootstrapped classification. We don’t want the classifier to skew your training! Also pay attention to where your classifications are failing or succeeding, as that can guide your future training efforts.
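The bootstrapping pass described on this slide can be sketched as follows. This is an illustration under stated assumptions: `classify` is a hypothetical callable standing in for a call to your current classifier and returning a `(class_name, confidence)` pair, and the `review_first_below` threshold is an arbitrary choice, not an NLC setting.

```python
from typing import Callable, Iterable


def bootstrap_labels(
    texts: Iterable[str],
    classify: Callable[[str], tuple[str, float]],  # hypothetical classifier call
    review_first_below: float = 0.8,               # assumed threshold
) -> list[dict]:
    """Pre-label new training texts with the current classifier.

    Every row still needs SME review; the flag only orders the work so the
    least certain suggestions are examined first.
    """
    rows = []
    for text in texts:
        label, confidence = classify(text)
        rows.append({
            "text": text,
            "suggested_class": label,
            "confidence": confidence,
            "review_first": confidence < review_first_below,
        })
    # Lowest confidence first: these are the most likely to be mislabeled.
    rows.sort(key=lambda row: row["confidence"])
    return rows
```

Sorting by confidence also surfaces where the current classifier is weakest, which the slide suggests using to guide future training.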
  • 10. Best Practice: NLC Confidences Scores Sum to 100%
    What confidence is returned by a classifier trained with only 1 intent class? A single-class NLC classifier would always return that intent class with confidence = 100% regardless of the text submitted. Let’s see this with a simplified example where we train the NLC to recognize A and then submit X for classification.
      Training: A. Then, given unrelated X, NLC will always return 100% confident the answer is A.
    What about training with 2 intent classes, then submitting input text that’s absolutely not related to either? This time, let’s train with A and B.
      Training: A and B. Then, given unrelated X, NLC will return 50% confident the answer is A and 50% that it is B.
    Similarly for more intent classes. An NLC trained with 100 different intent classes would return confidence = 1% for each class when given text unrelated to the trained classes. This may sound like an odd usage of the idea of confidence, but it’s incredibly useful. See the Design Pattern on how to “Classify Complex Text with a Multi-Intent Classifier” to learn more.
    Trick Question: What if you trained the NLC on 10 classes and the NLC was 50% confident in each of two of them? It means the NLC is highly confident your text belongs equally to these two classes. So you should consider two actions: (1) review the semantic overlap of the classes to determine whether they are really distinct, and (2) if they are distinct, use the Watson Dialog service to ask the user which of the two options is the correct one, as the user’s text may ambiguously refer to both intents.
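The "trick question" case on this slide, where two classes share the top confidence, can be detected in code before handing off to a clarifying dialog. A minimal sketch, assuming the classifier response has already been reduced to a `{class_name: confidence}` dictionary; the function name and the `margin` value are hypothetical choices, not NLC parameters.

```python
def ambiguous_top_classes(confidences: dict[str, float], margin: float = 0.1) -> list[str]:
    """Return the classes whose confidence is within `margin` of the best one.

    An empty list means there is a single clear winner; two or more names
    mean the user should be asked which intent they meant.
    """
    ranked = sorted(confidences.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) < 2:
        return []
    top = ranked[0][1]
    tied = [name for name, conf in ranked if top - conf <= margin]
    return tied if len(tied) > 1 else []
```

If the same pair of classes shows up here repeatedly, that is a hint to review their semantic overlap, which is the first action the slide recommends.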
  • 11. Design Pattern: Classify Complex Text with a Multi-Intent Classifier
    How can we analyze text like “Show me pink Audi convertibles”? We want to extract three intents from this text: color, model, and vehicle type. We can do this by assigning three classes to each example in our NLC training set. For example, something like this:
      Text                                  Class 1              Class 2              Class 3
      Show me pink Audi convertibles        vehicle_color_pink   vehicle_model_audi   vehicle_type_convertible
      Do you have blue BMW motorcycles?     vehicle_color_blue   vehicle_model_bmw    vehicle_type_motorcycle
      Let me see pink BMW convertibles      vehicle_color_pink   vehicle_model_bmw    vehicle_type_convertible
      …                                     …                    …                    …
    Now use the fact that all NLC confidences always sum to 100%!
    If the heading above doesn’t make sense, look at the Best Practices slide on “NLC Confidences Scores Sum to 100%.” That will help you understand this next conceptual example, where we’re going to train the classifier:
      Training: A1, A2, B1, B2, C1 & C2. Then, given A1+B2+C1, NLC will return 33% confident the answer is A1, 33% that it’s B2, and 33% that it’s C1!
    And the trick? Let A = vehicle colors, B = vehicle models, and C = vehicle types! Yes. You’ve just trained your classifier to distinguish three intent classes with one call! You can now create a virtual vehicle sales agent that classifies user text by color, model, and vehicle type in a single NLC call.
    For our query above of “Show me pink Audi convertibles”, first separate the classes in your NLC response by each type (color, model, type) and normalize each intent category to 100%. This would allow you to determine vehicle_color=“pink” at 100%, vehicle_model=“Audi” at 100%, and vehicle_type=“convertible” at 100%.
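The final step on this slide, separating the response by category and normalizing each category to 100%, can be sketched as below. It assumes the response has been reduced to a `{class_name: confidence}` dictionary and that class names follow the slide's `vehicle_<aspect>_<value>` pattern; both helper names are hypothetical.

```python
from collections import defaultdict


def normalize_by_category(confidences: dict[str, float], depth: int = 2, sep: str = "_"):
    """Group classes by their first `depth` name parts, then rescale each
    group so its confidences sum to 1.0 (i.e. 100%)."""
    groups: dict[str, dict[str, float]] = defaultdict(dict)
    for name, conf in confidences.items():
        parts = name.split(sep)
        category = sep.join(parts[:depth])   # e.g. "vehicle_color"
        value = sep.join(parts[depth:])      # e.g. "pink"
        groups[category][value] = conf
    normalized = {}
    for category, values in groups.items():
        total = sum(values.values())
        normalized[category] = {
            value: (conf / total if total else 0.0) for value, conf in values.items()
        }
    return normalized


def top_per_category(confidences: dict[str, float]) -> dict[str, tuple[str, float]]:
    """Pick the highest normalized value in each category."""
    return {
        category: max(values.items(), key=lambda kv: kv[1])
        for category, values in normalize_by_category(confidences).items()
    }
```

For a response where `vehicle_color_pink`, `vehicle_model_audi`, and `vehicle_type_convertible` each score about 33%, `top_per_category` yields pink, audi, and convertible, each close to 100% within its own category.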
  • 12. Design Pattern: Chaining Multiple NLC Classifiers
    Do you have more classes than allowed by a single classifier? Do your text classes have higher-level categories that are easily separated? Is training a single classifier becoming increasingly difficult to manage? Then perhaps a multi-classifier solution is right for you: the initial classifier performs a high-level separation of your text so that classifiers at the next level can separate your classes with higher confidence.
    [Diagram: Text related to Fitness, Diet, Wellness, Exercise, Food, or Allergies… -> NLC “Topic” classifier -> one of Fitness, Diet, Wellness, Exercise, Food, Allergies -> the NLC classifier for that topic (e.g. the NLC Wellness classifier) -> intent (e.g. a wellness intent)]
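The two-level chain on this slide can be sketched as a small router. This is an illustration only: the topic classifier and the per-topic classifiers are passed in as hypothetical callables that return a `(label, confidence)` pair, standing in for calls to separately trained classifiers.

```python
from typing import Callable, Optional

Classifier = Callable[[str], tuple[str, float]]  # hypothetical classifier call


def classify_chained(
    text: str,
    topic_classifier: Classifier,
    intent_classifiers: dict[str, Classifier],
) -> dict[str, Optional[object]]:
    """Route text through a topic classifier, then the classifier for that topic."""
    topic, topic_conf = topic_classifier(text)
    result: dict[str, Optional[object]] = {
        "topic": topic,
        "topic_confidence": topic_conf,
        "intent": None,
        "intent_confidence": None,
    }
    second_level = intent_classifiers.get(topic)
    if second_level is not None:
        # Only the classifier trained for this topic sees the text.
        intent, intent_conf = second_level(text)
        result["intent"] = intent
        result["intent_confidence"] = intent_conf
    return result
```

The two confidences are returned separately rather than combined, since how to weigh a first-level score against a second-level one is an application decision the slide does not prescribe.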
  • 13. Design Pattern: Answer FAQs with NLC and let R&R answer the “long tail”
    The Natural Language Classifier is well suited for Frequently Asked Questions (FAQs), where the effort to associate a single static answer with a question is rapidly rewarded. Retrieve and Rank is then used for infrequently asked questions, or for those where multiple passages or frequently changing content must be searched.
    [Chart: Questions vs. Variants (Clusters), with axis ticks at 1, 10, 20. The Natural Language Classifier covers the FAQs; Retrieve and Rank covers the infrequent questions.]
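One way to wire the FAQ/long-tail split together is a confidence-gated fallback. This is a sketch under assumptions the slide does not state: using a confidence threshold as the hand-off rule, the `min_confidence` value, and the `classify` and `search` callables (stand-ins for the NLC and Retrieve and Rank calls) are all illustrative.

```python
from typing import Callable


def answer_question(
    question: str,
    classify: Callable[[str], tuple[str, float]],  # hypothetical NLC call
    faq_answers: dict[str, str],                   # intent -> static answer
    search: Callable[[str], str],                  # hypothetical R&R call
    min_confidence: float = 0.7,                   # assumed threshold
) -> dict[str, str]:
    """Serve a static FAQ answer when the intent is clear, else fall back to search."""
    intent, confidence = classify(question)
    if confidence >= min_confidence and intent in faq_answers:
        return {"source": "nlc", "answer": faq_answers[intent]}
    # Long tail: infrequent questions or content that changes often.
    return {"source": "retrieve_and_rank", "answer": search(question)}
```

The threshold would need tuning against real end user questions; too low and the FAQ path answers questions it should not, too high and everything falls through to search.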
  • 14. Design Pattern: Combine Dialog + NLC + R&R
    Combine Retrieve and Rank with Dialog and NLC. Dialog provides a multi-turn experience in which you ask customers clarifying questions and track state across queries. NLC can be used to (1) detect specific domains of user interest so R&R can search only a subset of documents, (2) detect overlap between possible user intents so Dialog can request clarification from the user, or perhaps (3) inject valuable run-time features into R&R for more targeted ranking of answers.
    [Diagram components: GT Mgm’t Tool, Doc Conversion, NLC, Dialog, Retrieve, Rank, Index]
  • 15. Design Pattern: Answer Detailed Questions Using NLC + Entity Extraction
    So you have the user’s intent from NLC; what’s next? Use entity extraction to obtain other useful details from the user’s text. The NLC Factoid Assistant demo does this for entities contained in DBpedia. Demo: http://goo.gl/u4jXiT. Source code: https://goo.gl/ARnUwJ.
    App’s Response Pipeline
    First identify the intent (NLC) and the entities (Alchemy Entity Extraction) being asked about:
      Questions: What’s the population of California? How tall is the Eiffel Tower?
      Entity: Sylvester Stallone, Eiffel Tower
      Intent: place_population, place_height
    E.g. in the Factoid Assistant, a ‘place’ pipeline handles place-related questions.
    The Factoid Assistant uses SPARQL queries to extract entity details for its answers:
      California's population is 38,802,500.
      The Eiffel Tower's height is 300 meters.
  • 16. Design Pattern: Using NLC on large pieces of text through decomposition
    NLC works with sentences of fewer than 1,024 characters, so it won’t directly work with long pieces of text such as emails. Long text can be broken into sentences, classifying each sentence and then aggregating the results into a summary of the message. There are a number of open source sentence segmentation tools; NLTK works very well for this purpose.
    The training phase requires manual annotation of a set of test documents (start with a couple of hundred; it should take a couple of hours to plow through them and assign a class to each sentence).
    [Diagram: New training set -> NLTK Sentence Splitter -> Sentences -> Manual Classification -> intents -> NLC]
    The production flow will use the trained classifier to automatically do the sentence-level classification and then produce a summary; you can use business rules to trigger actions at this point.
    For an example use case where we detect the severity of complaint emails and rank them from minor to severe:
      If the email contains 10% or more "severe" sentences, then prioritise triage.
      If the email contains 90% "minor" sentences, then leave it at the bottom of the queue.
      If the email contains 90% "severe" sentences, then email the CEO and assemble the press team.
    Or train a meta-classifier (decision tree or SVM) to make a classification based on the results of the sentence-level annotations.
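The production flow and business rules on this slide can be sketched end to end. To keep the example self-contained it splits sentences with a simple regular expression instead of NLTK (the slide's suggestion), and `classify_sentence` is a hypothetical callable standing in for the trained classifier. The action names and the `normal_queue` default for emails that match no rule are assumptions.

```python
import re
from collections import Counter
from typing import Callable


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def triage_email(text: str, classify_sentence: Callable[[str], str]) -> str:
    """Classify each sentence, then apply the slide's percentage rules."""
    sentences = split_sentences(text)
    if not sentences:
        return "no_text"
    counts = Counter(classify_sentence(s) for s in sentences)
    severe = counts["severe"] / len(sentences)
    minor = counts["minor"] / len(sentences)
    # Check the strictest rule first so 90% severe is not caught by the 10% rule.
    if severe >= 0.9:
        return "email_ceo_and_assemble_press_team"
    if severe >= 0.1:
        return "prioritise_triage"
    if minor >= 0.9:
        return "bottom_of_queue"
    return "normal_queue"  # assumed default; the slide does not cover this case
```

The per-class fractions computed here are also the natural feature vector for the meta-classifier alternative the slide mentions.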
  • 17. Use Case: Perform actions for users using NLC+Dialog
    Your task? Build a voice-enabled online banking solution. For example, how would you allow a customer to request a password change?
    First train the NLC to identify the various intent categories such as Tech Support, Checking, Savings, and Customer Service. When a password change request arrives, the NLC would identify this and your system would recognize the need to first authenticate the user. The Watson Dialog service could be configured to perform the back-and-forth to obtain the user’s authentication details and request the password reset from your bank’s authentication system.
    [Diagram: User says “I need to change my password.” -> Natural Language Classifier identifies the intent (Intent = password_reset, Confidence = 0.874582) among the intent categories Checking, Savings, Customer Service, and Tech Support -> Dialog asks “What is your mother’s maiden name?” and “What should your new password be?” … -> Make update in authentication control systems]
  • 18. Use Case: Topic Modeling Using the NLC
    Lucky you! You have lots of natural language being generated by your customer/technical support teams. Or maybe it’s your customers writing reviews, sending you emails, or tweeting about your products. Your challenge is to quickly categorize and analyze these messages and route the highest priority ones to the best person. Not a problem! You can do that and more using the NLC and Alchemy Language.
    If your content is < 1,024 chars, then send it directly to the NLC. If it’s longer, reduce the size by parsing at the sentence or paragraph level. You could also test running your text through Alchemy Keywords to remove the lower relevance words.
    [Diagram: text < 1,024 chars goes straight to the NLC; text > 1,024 chars is first tokenized to the NLC’s 1,024 char limit, with Alchemy Keywords as an optional step to focus on only “high” value words. The NLC returns Topics/Intents (Product Installation, Product Return, Product Malfunction, Competitor Mentioned); Alchemy sentiment helps answer: What are we doing right? What’s wrong? Did our rep respond appropriately? What was the resolution?]
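Reducing long content to pieces under the 1,024-character limit can be sketched as a greedy sentence packer. This is a minimal illustration, not the deck's tooling: the function name is hypothetical, sentences are split with a simple regular expression, and a single sentence longer than the limit is hard-split, which is an assumption about acceptable behavior.

```python
import re


def chunk_text(text: str, max_chars: int = 1023) -> list[str]:
    """Pack whole sentences into chunks of at most `max_chars` characters.

    The default of 1023 keeps every chunk strictly under 1,024 characters.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself longer than the limit.
        pieces = [sentence[i:i + max_chars] for i in range(0, len(sentence), max_chars)]
        for piece in pieces:
            candidate = f"{current} {piece}" if current else piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be classified separately and the per-chunk topics aggregated, in the same way as the sentence-level decomposition pattern.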