NLU / Intent Detection
Benchmark
by Intento
August 2017
About
• At Intento, we want to make Machine Intelligence
services easy to discover, choose and use.
• So far, evaluation is the most problematic part: to
compare providers, one needs to sign a lot of contracts
and integrate a lot of APIs.
• We deliver that for FREE on public datasets. To
evaluate on your own dataset, contact our sales team.
• Also, check out our Machine Translation Benchmark.
Machine Translation is an easy way to build a
multilingual bot.
Overview
• Natural Language Understanding Services with Public
APIs*
• IBM Watson Conversation
• API.ai API
• Microsoft LUIS
• Amazon Lex
• Recast.ai API
• wit.ai API
• SNIPS API
• Benchmark Dimensions
• Intent Prediction Performance
• False Positives
• Learning Speed (performance on small datasets)
• Language Coverage, Price, Response time
August 2017© Intento, Inc.
* as of today, some of them don’t have a public API; we were given access for the purpose of this benchmark
NLU Engines Compared
API.ai / Google
• Website: https://api.ai
• Launched: 2010 (acquired by Google in 2016)
• Pricing model: FREE
• Interface: HTTP REST API
“Build delightful and natural
conversational experiences”
wit.ai / Facebook
• Website: https://wit.ai
• Launched: 2013 (acquired by Facebook in 2015)
• Pricing model: FREE
• Interface: HTTP REST API
“Natural Language for Developers”
IBM Watson Conversation
• Website: https://www.ibm.com/watson/services/conversation/
• Launched: 2016
• Pricing model: Pay As You Go
• Interface: HTTP REST API
“Quickly build and deploy chatbots and
virtual agents across a variety of
channels, including mobile devices,
messaging platforms, and even robots.”
Microsoft LUIS
• Website: https://www.luis.ai/
• Launched: 2015
• Pricing model: Pay As You Go
• Interface: HTTP REST API
“Language Understanding
Intelligent Service. Add
conversational intelligence to your
apps.”
Amazon Lex
• Website: https://aws.amazon.com/lex/
• Launched: 2016
• Pricing model: Pay As You Go
• Interface: HTTP REST API
• Important Restrictions apply (see Dataset slide)
“Conversational interfaces for your
applications. Powered by the same deep
learning technologies as Alexa”
Recast.ai
• Website: https://recast.ai
• Launched: 2016
• Pricing model: Free tier + Contact sales
• Interface: HTTP REST API
“The collaborative platform to
build, train, deploy and monitor
intelligent bots for developers”
SNIPS
• Website: https://snips.ai
• Launched: 2017
• Pricing model: Free to test + Pay per Device
• Interface: On Device*
“Snips is an AI-powered voice
assistant you can add to your
products. It runs on-device and is
Private by Design”
* we’ve tested a hosted version with a private API provided by SNIPS
Historical
timeline
[Figure: timeline of the providers, 2010-2017; labels include Conversation, Facebook, Google, LUIS]
The Approach
Dataset
• The original dataset is the SNIPS.ai 2017 NLU Benchmark
• English language only
• We removed duplicates that differ only in whitespace,
quotes, letter case, etc.
• Resulting dataset parameters*: 7 intents (next slide),
15.6K samples (~2K per intent), 340K symbols
• We also used ~450 samples from SNIPS.ai 2016 NLU
Benchmark to test for False Positives
* For Amazon Lex, the training set is capped at 200K symbols by an API limit; symbol restrictions
also apply, resulting in 4,500 utterances (~640 per intent).
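The de-duplication step can be sketched as follows. This is a minimal illustration, not the benchmark's actual code: the slides do not give the exact normalization rules, so the `normalize` key below (case folding, dropping quote characters, collapsing whitespace) and both function names are assumptions.

```python
import re

# Assumption: the exact rules are not stated in the slides. This key folds
# case, drops common quote characters and collapses runs of whitespace.
_QUOTES = re.compile(r"[\"'‘’“”`]")
_SPACES = re.compile(r"\s+")


def normalize(utterance):
    """Return a comparison key that ignores case, quotes and extra spaces."""
    text = _QUOTES.sub("", utterance.casefold())
    return _SPACES.sub(" ", text).strip()


def deduplicate(samples):
    """Keep the first (utterance, intent) pair seen for each normalized key."""
    seen = set()
    unique = []
    for utterance, intent in samples:
        key = (normalize(utterance), intent)
        if key not in seen:
            seen.add(key)
            unique.append((utterance, intent))
    return unique


samples = [
    ("Play the last track from Beyoncé off Spotify", "PlayMusic"),
    ("play  the last track from Beyoncé off spotify", "PlayMusic"),
    ('Add "Diamonds" to my roadtrip playlist', "AddToPlaylist"),
    ("Add Diamonds to my roadtrip playlist", "AddToPlaylist"),
]
print(len(deduplicate(samples)))
```

The original spelling of the first occurrence is kept, so the training data itself is not altered by the normalization.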
Intents [*]
• SearchCreativeWork (e.g. Find me the I, Robot television show)
• GetWeather (e.g. Is it windy in Boston, MA right now?)
• BookRestaurant (e.g. I want to book a highly rated restaurant for
me and my boyfriend tomorrow night)
• PlayMusic (e.g. Play the last track from Beyoncé off Spotify)
• AddToPlaylist (e.g. Add Diamonds to my roadtrip playlist)
• RateBook (e.g. Give 6 stars to Of Mice and Men)
• SearchScreeningEvent (e.g. Check the showtimes for Wonder
Woman in Paris)
[*] quoted from https://github.com/snipsco/nlu-benchmark/tree/master/2017-06-custom-intent-engines
I. Prediction Performance
Experimental setting
• Inspired by the SNIPS Benchmarks (2016, 2017)
• We benchmark intent detection only, no parameter
extraction yet
• For each provider, one model is trained to detect all
intents
• We run all models with the default confidence score
thresholds.
• 3-fold 80/20* Stratified Monte Carlo cross-validation
* 47/20 for Amazon Lex
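A stratified Monte Carlo cross-validation of this kind can be sketched with the standard library alone. The function names, the seed and the toy label list are assumptions made for illustration; the reduced 47/20 split used for Amazon Lex is not modeled here.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx), splitting every intent in the same ratio."""
    by_intent = defaultdict(list)
    for index, intent in enumerate(labels):
        by_intent[intent].append(index)
    train_idx, test_idx = [], []
    for indices in by_intent.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test_idx.extend(shuffled[:n_test])
        train_idx.extend(shuffled[n_test:])
    return train_idx, test_idx


def monte_carlo_splits(labels, n_repeats=3, test_fraction=0.2, seed=0):
    """Yield independent random stratified splits; test sets may overlap."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        yield stratified_split(labels, test_fraction, rng)


labels = ["GetWeather"] * 10 + ["PlayMusic"] * 10 + ["RateBook"] * 5
for train_idx, test_idx in monte_carlo_splits(labels):
    print(len(train_idx), len(test_idx))
```

Unlike k-fold cross-validation, each repetition draws a fresh random split, so a sample may land in the test set more than once across repetitions.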
Scores Normalization (I)
Intents: the good, the bad and the ugly
(to compare providers we need to remove intent-related bias)
[Figures: F1 scores* per intent; confusion matrix]
* The F1 score is the harmonic mean of precision and recall; it reaches its best value at 1 and its worst at 0.
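For reference, per-intent precision, recall and F1 can be computed one-vs-rest from the true and predicted intents, with F1 taken as the harmonic mean 2PR / (P + R). The sketch below is illustrative only; the function name and the toy label lists are assumptions, not benchmark data.

```python
def per_intent_scores(true_intents, predicted_intents):
    """Return {intent: (precision, recall, f1)}, computed one-vs-rest."""
    pairs = list(zip(true_intents, predicted_intents))
    scores = {}
    for intent in sorted(set(true_intents)):
        tp = sum(1 for t, p in pairs if t == intent and p == intent)
        fp = sum(1 for t, p in pairs if t != intent and p == intent)
        fn = sum(1 for t, p in pairs if t == intent and p != intent)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # Harmonic mean of precision and recall; 0.0 when both are zero.
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[intent] = (precision, recall, f1)
    return scores


true_intents = ["GetWeather", "GetWeather", "PlayMusic", "PlayMusic", "RateBook"]
predicted = ["GetWeather", "PlayMusic", "PlayMusic", "PlayMusic", "RateBook"]
for intent, (p, r, f1) in per_intent_scores(true_intents, predicted).items():
    print(f"{intent}: P={p:.2f} R={r:.2f} F1={f1:.2f}")
```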
Scores Normalization (II)
• Standardizing the F1, P and R scores for each intent (SD-normalization)
• Then adjusting the scales by multiplying by the global std and adding the global mean (next slide)
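The two normalization steps can be sketched as below. This is one reading of the slide, not the authors' code: it assumes a complete provider-by-intent table of scores, uses the population standard deviation (`pstdev`), and the provider names and numbers are made up for the example.

```python
from statistics import mean, pstdev


def normalize_scores(scores):
    """Map {provider: {intent: score}} to {provider: mean adjusted score}."""
    providers = list(scores)
    intents = sorted({i for per_intent in scores.values() for i in per_intent})
    all_values = [scores[p][i] for p in providers for i in intents]
    global_mean, global_std = mean(all_values), pstdev(all_values)

    adjusted = {p: [] for p in providers}
    for intent in intents:
        column = [scores[p][intent] for p in providers]
        mu, sigma = mean(column), pstdev(column)
        for p in providers:
            # Step 1: z-score within the intent removes intent-related bias.
            z = (scores[p][intent] - mu) / sigma if sigma else 0.0
            # Step 2: rescale to the original range with the global statistics.
            adjusted[p].append(z * global_std + global_mean)
    return {p: mean(values) for p, values in adjusted.items()}


# Hypothetical scores, for illustration only.
scores = {
    "provider_a": {"GetWeather": 0.99, "RateBook": 0.90},
    "provider_b": {"GetWeather": 0.97, "RateBook": 0.80},
}
print(normalize_scores(scores))
```

After standardization, a 0.02 gap on an "easy" intent and a 0.10 gap on a "hard" one weigh the same, which is the point of removing the intent-related bias before averaging.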
Detection Performance*
• Black bars indicate confidence intervals
• Amazon Lex is trained on a smaller dataset due to its API limits
* Mean standardized F1 scores, adjusted to the initial scale using global mean and std
Average Precision*
• Black bars indicate confidence intervals
• Amazon Lex is trained on a smaller dataset due to its API limits
* Mean standardized Precision, adjusted to the initial scale using global mean and std
Average Recall*
* Mean standardized Recall, adjusted to the initial scale using global mean and std
• Black bars indicate confidence intervals
• Amazon Lex is trained on a smaller dataset due to its API limits
Precision vs. Recall
Discussion
• Training models is cumbersome for most of the services:
• manual work (adding models in the web interface) and
• solving issues with the tech support.
• Based on the confidence intervals, all providers fall into several
groups:
• Top-runners: IBM Watson, API.ai, Microsoft LUIS
• Amazon Lex: API limits don’t allow training a multi-intent
model on enough samples
• Choosing the provider:
• For “good” intents (like GetWeather), all providers are good
enough. For “bad” intents, having a “good” provider is crucial
• Within each tier, the leader depends on the intent data
II. False Positives
Approach
• How does an NLU engine behave when a user expresses an
intent it wasn’t trained for?
• Expected behavior: produce the Fallback Intent
(no trained intents pass the detection threshold)
• 1411 utterances from domains other than
Music, Movie, Weather, RestaurantReservation,
Entertainment and TV.
• Compare % of (false) positive detections for each
NLU provider
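The false-positive percentage can be sketched as follows. The `(intent, confidence)` response shape, the `"Fallback"` label and the 0.5 threshold are all assumptions for illustration; each provider returns its own response format and applies its own default threshold.

```python
def false_positive_rate(predictions, threshold, fallback="Fallback"):
    """Share of out-of-domain utterances that were mapped to a trained intent.

    predictions: one (intent, confidence) pair per out-of-domain utterance.
    """
    if not predictions:
        raise ValueError("no predictions given")
    false_positives = sum(
        1 for intent, confidence in predictions
        if intent != fallback and confidence >= threshold
    )
    return false_positives / len(predictions)


# Hypothetical responses to four out-of-domain utterances.
predictions = [
    ("GetWeather", 0.91),  # confident but wrong: counts as a false positive
    ("Fallback", 1.00),    # expected behavior
    ("PlayMusic", 0.35),   # below the threshold: treated as fallback
    ("RateBook", 0.77),    # false positive
]
print(f"{false_positive_rate(predictions, threshold=0.5):.0%}")
```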
Out-of-the-Domain Samples
Discussion
• Only Snips.ai and API.ai are somewhat good at
detecting that a user is asking for something the agent
is not trained for
• IBM Watson and Microsoft LUIS are trying to map
any user request to one of the intents from the
training set
Perhaps the Fallback intent should be manually
added and trained on junk utterances?
III. Learning Curve
Experimental Setting
• Similar to the Prediction Performance (slide 18)
• 20% of the dataset reserved for testing (stratified)
• From the remaining 80%, for each intent we randomly
built a series of training sets of the following sizes:
10, 25, 50, 100, 200, 500, 1000*
• No cross-validation
• Analyzed F1 Scores, normalised as described on
slides 19-20.
* for Amazon Lex, 10, 25, 50, 100, 200
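One way to build such training sets is sketched below. It assumes the listed sizes are per intent and that each set is drawn independently from the training pool; the slides do not say whether the smaller sets are subsets of the larger ones. The function name and the synthetic utterances are placeholders.

```python
import random

SIZES = [10, 25, 50, 100, 200, 500, 1000]


def build_training_sets(pool_by_intent, sizes=SIZES, seed=0):
    """Return {size: [(utterance, intent), ...]} with `size` samples per intent."""
    rng = random.Random(seed)
    training_sets = {}
    for size in sizes:
        subset = []
        for intent, utterances in pool_by_intent.items():
            if size > len(utterances):
                raise ValueError(f"{intent}: only {len(utterances)} samples")
            # Sample without replacement from this intent's training pool.
            subset.extend((u, intent) for u in rng.sample(utterances, size))
        training_sets[size] = subset
    return training_sets


# Synthetic pool standing in for the 80% training portion.
pool = {
    "GetWeather": [f"weather utterance {n}" for n in range(1600)],
    "PlayMusic": [f"music utterance {n}" for n in range(1600)],
}
sets = build_training_sets(pool)
print({size: len(samples) for size, samples in sets.items()})
```

Each model is then trained on one of these sets and scored on the same held-out 20%, which gives one point on the learning curve per size.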
Learning curve by provider
Vertical bars denote confidence intervals
Discussion
• On <100 samples IBM Watson is superior*. API.ai
catches up at >100 samples.
• Detecting the user’s intent is crucial for the
subsequent slot extraction and response generation.
• The learning curve is quite steep: good performance
requires hundreds of utterances to train on.
• Most of the pre-built intents (Microsoft LUIS, API.ai,
etc.) are built on 10-50 utterances.
* SNIPS advertises a special enterprise feature to generate additional samples for smaller datasets, but it is not
available by default
IV. Language coverage
Supported languages
• Merged all dialects (e.g. en-uk and en-us).
• Note that we’ve tested the performance only for English
Language popularity
[Bar chart: number of supporting providers (0 to 7) per language. Languages shown: English, Korean, German, Spanish, French, Italian, Portuguese, Japanese, Chinese, Dutch, Arabic, Russian; grouped at the end: Norwegian, Polish, Hindi, Finnish, Danish, Czech, Swedish, Catalan, Ukrainian and 29 more]
Discussion
Potentially, Machine Translation may be used to
increase the language coverage and/or performance:
• either by translating both the training and testing
utterances to English
• or by translating only testing utterances and
using the English training model
That’s something to check
in future benchmarks
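The second option (translate only the test utterances, keep the English model) can be sketched as a thin wrapper. Both callables are injected and hypothetical: `toy_translate` and `toy_detect_intent` are stand-ins written for this example and do not correspond to any real MT or NLU API.

```python
def detect_intent_via_english(utterance, source_lang, translate, detect_intent):
    """Translate an utterance to English, then query the English-trained model."""
    english = translate(utterance, source_lang, "en")
    return detect_intent(english)


# Toy stand-ins so the sketch runs; neither is a real MT or NLU service.
def toy_translate(text, source_lang, target_lang):
    lookup = {
        ("fr", "en", "quel temps fait-il à paris ?"): "what is the weather in paris?",
    }
    return lookup.get((source_lang, target_lang, text.lower()), text)


def toy_detect_intent(english_text):
    return "GetWeather" if "weather" in english_text.lower() else "Fallback"


print(detect_intent_via_english("Quel temps fait-il à Paris ?", "fr",
                                toy_translate, toy_detect_intent))
```

With this option the model is trained once on English data, and every prediction request costs an extra translation call.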
V. Other observations
Average response time
* Snips assumes on-device deployment; the response time shown here is for a hosted test bench
Price per 1K requests*
* prediction requests; SNIPS has per-device pricing and is not shown on this chart
[Bar chart: price per 1K requests by provider; bar labels include “Contact sales” and “Free” (twice)]
Conclusions
Performance (F1) vs. Price*
[Scatter plot: Performance against Affordability = 1/Price; points: wit.ai, amazon.lex, microsoft.luis, ibm.watson, api.ai; the plot also carries a “FREE” label]
* Recast and SNIPS are not shown as they don't publish their pricing
Performance (F1) vs. Latency*
[Scatter plot: Performance against Speed = 1/Latency; points: snips, recast, wit.ai, amazon.lex, microsoft.luis, ibm.watson, api.ai]
* Snips assumes on-device deployment; the response time shown here is for a hosted test bench
Conclusions
1. API.ai, Microsoft LUIS and IBM Watson have the best overall
intent detection performance, speed and language
coverage.
• Within this group, API.ai is superior on price (free), Microsoft LUIS on speed
(almost 50% faster response), IBM Watson on performance (esp. on smaller
datasets). Within this group, only API.ai detects out-of-domain requests (~40% of them) and produces
the Fallback intent.
2. For extreme language coverage, go with wit.ai.
• It would be interesting to see whether Machine Translation can be applied at either the training or the
testing stage to increase language coverage for the top-3 providers.
3. Performance varies a lot across intents and
dataset sizes. We recommend evaluating several
providers on your data before making a choice.
Discover the best service providers
for your AI task
Evaluate performance on your own
data
Access any provider with no effort
using our Single API
Intento Service Platform
Intento
https://inten.to
Dmitry Labazkin,
Grigory Sapunov,
Konstantin Savenkov
Intento, Inc.

<hello@inten.to>

More Related Content

What's hot

Responsible Generative AI
Responsible Generative AIResponsible Generative AI
Responsible Generative AICMassociates
 
Amazon Rekognition을 이용하여 인공지능 안면 인식 키오스크 만들기 - 강정희 (AWS 솔루션즈 아키텍트)
Amazon Rekognition을 이용하여 인공지능 안면 인식 키오스크 만들기 - 강정희 (AWS 솔루션즈 아키텍트)Amazon Rekognition을 이용하여 인공지능 안면 인식 키오스크 만들기 - 강정희 (AWS 솔루션즈 아키텍트)
Amazon Rekognition을 이용하여 인공지능 안면 인식 키오스크 만들기 - 강정희 (AWS 솔루션즈 아키텍트)Amazon Web Services Korea
 
From knowledge graph to commonsense knowledge graph brighton seo
From knowledge graph to commonsense knowledge graph   brighton seoFrom knowledge graph to commonsense knowledge graph   brighton seo
From knowledge graph to commonsense knowledge graph brighton seoDateme Tubotamuno
 
데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...
데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...
데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...Amazon Web Services Korea
 
[금융사를 위한 AWS Generative AI Day 2023] 7_다양한 AI 워크로드를 위한 최적의 ...
[금융사를 위한 AWS Generative AI Day 2023] 7_다양한 AI 워크로드를 위한 최적의 ...[금융사를 위한 AWS Generative AI Day 2023] 7_다양한 AI 워크로드를 위한 최적의 ...
[금융사를 위한 AWS Generative AI Day 2023] 7_다양한 AI 워크로드를 위한 최적의 ...AWS Korea 금융산업팀
 
컨테이너와 서버리스 기술을 통한 디지털 트랜스포메이션::정도현::AWS Summit Seoul 2018
컨테이너와 서버리스 기술을 통한 디지털 트랜스포메이션::정도현::AWS Summit Seoul 2018컨테이너와 서버리스 기술을 통한 디지털 트랜스포메이션::정도현::AWS Summit Seoul 2018
컨테이너와 서버리스 기술을 통한 디지털 트랜스포메이션::정도현::AWS Summit Seoul 2018Amazon Web Services Korea
 
Deep Dive into AWS X-Ray: Monitor Modern Applications (DEV324) - AWS re:Inven...
Deep Dive into AWS X-Ray: Monitor Modern Applications (DEV324) - AWS re:Inven...Deep Dive into AWS X-Ray: Monitor Modern Applications (DEV324) - AWS re:Inven...
Deep Dive into AWS X-Ray: Monitor Modern Applications (DEV324) - AWS re:Inven...Amazon Web Services
 
Large Language Models, Data & APIs - Integrating Generative AI Power into you...
Large Language Models, Data & APIs - Integrating Generative AI Power into you...Large Language Models, Data & APIs - Integrating Generative AI Power into you...
Large Language Models, Data & APIs - Integrating Generative AI Power into you...NETUserGroupBern
 
Recommendation system (1).pptx
Recommendation system (1).pptxRecommendation system (1).pptx
Recommendation system (1).pptxprathammishra28
 
Generative-AI-in-enterprise-20230615.pdf
Generative-AI-in-enterprise-20230615.pdfGenerative-AI-in-enterprise-20230615.pdf
Generative-AI-in-enterprise-20230615.pdfLiming Zhu
 
클라우드 MSP에 강력한 '보안'을 더하다 - 최광호 클라우드사업본부장, 안랩 :: AWS Summit Seoul 2021
클라우드 MSP에 강력한 '보안'을 더하다 - 최광호 클라우드사업본부장, 안랩 :: AWS Summit Seoul 2021클라우드 MSP에 강력한 '보안'을 더하다 - 최광호 클라우드사업본부장, 안랩 :: AWS Summit Seoul 2021
클라우드 MSP에 강력한 '보안'을 더하다 - 최광호 클라우드사업본부장, 안랩 :: AWS Summit Seoul 2021Amazon Web Services Korea
 
Become a Quality Enabler
Become a Quality EnablerBecome a Quality Enabler
Become a Quality Enabler99X Technology
 
AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기
AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기
AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기Amazon Web Services Korea
 
Mobile-First SEO - The Marketers Edition #3XEDigital
Mobile-First SEO - The Marketers Edition #3XEDigitalMobile-First SEO - The Marketers Edition #3XEDigital
Mobile-First SEO - The Marketers Edition #3XEDigitalAleyda Solís
 
The AI revolution in sales and marketing
The AI revolution in sales and marketingThe AI revolution in sales and marketing
The AI revolution in sales and marketingZoodikers
 
Google Cloud GenAI Overview_071223.pptx
Google Cloud GenAI Overview_071223.pptxGoogle Cloud GenAI Overview_071223.pptx
Google Cloud GenAI Overview_071223.pptxVishPothapu
 
Recommender system
Recommender systemRecommender system
Recommender systemSaiguru P.v
 

What's hot (20)

Matt Lewis - The Hardest Thing-Final to Host.pdf
Matt Lewis - The Hardest Thing-Final to Host.pdfMatt Lewis - The Hardest Thing-Final to Host.pdf
Matt Lewis - The Hardest Thing-Final to Host.pdf
 
Responsible Generative AI
Responsible Generative AIResponsible Generative AI
Responsible Generative AI
 
Amazon Rekognition을 이용하여 인공지능 안면 인식 키오스크 만들기 - 강정희 (AWS 솔루션즈 아키텍트)
Amazon Rekognition을 이용하여 인공지능 안면 인식 키오스크 만들기 - 강정희 (AWS 솔루션즈 아키텍트)Amazon Rekognition을 이용하여 인공지능 안면 인식 키오스크 만들기 - 강정희 (AWS 솔루션즈 아키텍트)
Amazon Rekognition을 이용하여 인공지능 안면 인식 키오스크 만들기 - 강정희 (AWS 솔루션즈 아키텍트)
 
From knowledge graph to commonsense knowledge graph brighton seo
From knowledge graph to commonsense knowledge graph   brighton seoFrom knowledge graph to commonsense knowledge graph   brighton seo
From knowledge graph to commonsense knowledge graph brighton seo
 
Generative AI
Generative AIGenerative AI
Generative AI
 
데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...
데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...
데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...
 
[금융사를 위한 AWS Generative AI Day 2023] 7_다양한 AI 워크로드를 위한 최적의 ...
[금융사를 위한 AWS Generative AI Day 2023] 7_다양한 AI 워크로드를 위한 최적의 ...[금융사를 위한 AWS Generative AI Day 2023] 7_다양한 AI 워크로드를 위한 최적의 ...
[금융사를 위한 AWS Generative AI Day 2023] 7_다양한 AI 워크로드를 위한 최적의 ...
 
컨테이너와 서버리스 기술을 통한 디지털 트랜스포메이션::정도현::AWS Summit Seoul 2018
컨테이너와 서버리스 기술을 통한 디지털 트랜스포메이션::정도현::AWS Summit Seoul 2018컨테이너와 서버리스 기술을 통한 디지털 트랜스포메이션::정도현::AWS Summit Seoul 2018
컨테이너와 서버리스 기술을 통한 디지털 트랜스포메이션::정도현::AWS Summit Seoul 2018
 
Deep Dive into AWS X-Ray: Monitor Modern Applications (DEV324) - AWS re:Inven...
Deep Dive into AWS X-Ray: Monitor Modern Applications (DEV324) - AWS re:Inven...Deep Dive into AWS X-Ray: Monitor Modern Applications (DEV324) - AWS re:Inven...
Deep Dive into AWS X-Ray: Monitor Modern Applications (DEV324) - AWS re:Inven...
 
Large Language Models, Data & APIs - Integrating Generative AI Power into you...
Large Language Models, Data & APIs - Integrating Generative AI Power into you...Large Language Models, Data & APIs - Integrating Generative AI Power into you...
Large Language Models, Data & APIs - Integrating Generative AI Power into you...
 
Recommendation system (1).pptx
Recommendation system (1).pptxRecommendation system (1).pptx
Recommendation system (1).pptx
 
Generative-AI-in-enterprise-20230615.pdf
Generative-AI-in-enterprise-20230615.pdfGenerative-AI-in-enterprise-20230615.pdf
Generative-AI-in-enterprise-20230615.pdf
 
클라우드 MSP에 강력한 '보안'을 더하다 - 최광호 클라우드사업본부장, 안랩 :: AWS Summit Seoul 2021
클라우드 MSP에 강력한 '보안'을 더하다 - 최광호 클라우드사업본부장, 안랩 :: AWS Summit Seoul 2021클라우드 MSP에 강력한 '보안'을 더하다 - 최광호 클라우드사업본부장, 안랩 :: AWS Summit Seoul 2021
클라우드 MSP에 강력한 '보안'을 더하다 - 최광호 클라우드사업본부장, 안랩 :: AWS Summit Seoul 2021
 
Become a Quality Enabler
Become a Quality EnablerBecome a Quality Enabler
Become a Quality Enabler
 
Ashen Bhatti - How I Build Companies with LLM.pdf
Ashen Bhatti - How I Build Companies with LLM.pdfAshen Bhatti - How I Build Companies with LLM.pdf
Ashen Bhatti - How I Build Companies with LLM.pdf
 
AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기
AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기
AWS Summit Seoul 2023 | 생성 AI 모델의 임베딩 벡터를 이용한 서버리스 추천 검색 구현하기
 
Mobile-First SEO - The Marketers Edition #3XEDigital
Mobile-First SEO - The Marketers Edition #3XEDigitalMobile-First SEO - The Marketers Edition #3XEDigital
Mobile-First SEO - The Marketers Edition #3XEDigital
 
The AI revolution in sales and marketing
The AI revolution in sales and marketingThe AI revolution in sales and marketing
The AI revolution in sales and marketing
 
Google Cloud GenAI Overview_071223.pptx
Google Cloud GenAI Overview_071223.pptxGoogle Cloud GenAI Overview_071223.pptx
Google Cloud GenAI Overview_071223.pptx
 
Recommender system
Recommender systemRecommender system
Recommender system
 

Viewers also liked

State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)Konstantin Savenkov
 
Artificial Intelligence API Services Compared
Artificial Intelligence API Services ComparedArtificial Intelligence API Services Compared
Artificial Intelligence API Services ComparedCraig Milroy
 
BootstrapLabs - Tracxn Report - artificial intelligence for the Applied Arti...
BootstrapLabs - Tracxn  Report - artificial intelligence for the Applied Arti...BootstrapLabs - Tracxn  Report - artificial intelligence for the Applied Arti...
BootstrapLabs - Tracxn Report - artificial intelligence for the Applied Arti...BootstrapLabs
 
Neural Machine Translation: a report from the front line
Neural Machine Translation: a report from the front lineNeural Machine Translation: a report from the front line
Neural Machine Translation: a report from the front lineIconic Translation Machines
 
Create Your Own Voice Assistant Using Watson and IBM Bluemix
Create Your Own Voice Assistant Using Watson and IBM BluemixCreate Your Own Voice Assistant Using Watson and IBM Bluemix
Create Your Own Voice Assistant Using Watson and IBM BluemixVidyasagar Machupalli
 
How to ready your organization for Artificial Intelligence
How to ready your organization for Artificial IntelligenceHow to ready your organization for Artificial Intelligence
How to ready your organization for Artificial IntelligenceCraig Milroy
 

Viewers also liked (7)

State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)
 
Artificial Intelligence API Services Compared
Artificial Intelligence API Services ComparedArtificial Intelligence API Services Compared
Artificial Intelligence API Services Compared
 
BootstrapLabs - Tracxn Report - artificial intelligence for the Applied Arti...
BootstrapLabs - Tracxn  Report - artificial intelligence for the Applied Arti...BootstrapLabs - Tracxn  Report - artificial intelligence for the Applied Arti...
BootstrapLabs - Tracxn Report - artificial intelligence for the Applied Arti...
 
Neural Machine Translation: a report from the front line
Neural Machine Translation: a report from the front lineNeural Machine Translation: a report from the front line
Neural Machine Translation: a report from the front line
 
Machine Translation: The Neural Frontier
Machine Translation: The Neural FrontierMachine Translation: The Neural Frontier
Machine Translation: The Neural Frontier
 
Create Your Own Voice Assistant Using Watson and IBM Bluemix
Create Your Own Voice Assistant Using Watson and IBM BluemixCreate Your Own Voice Assistant Using Watson and IBM Bluemix
Create Your Own Voice Assistant Using Watson and IBM Bluemix
 
How to ready your organization for Artificial Intelligence
How to ready your organization for Artificial IntelligenceHow to ready your organization for Artificial Intelligence
How to ready your organization for Artificial Intelligence
 

Similar to NLU / Intent Detection Benchmark by Intento, August 2017

DEV206_Life of a Code Change to a Tier 1 Service
DEV206_Life of a Code Change to a Tier 1 ServiceDEV206_Life of a Code Change to a Tier 1 Service
DEV206_Life of a Code Change to a Tier 1 ServiceAmazon Web Services
 
IOT311_Customer Stories of Things, Cloud, and Analytics on AWS
IOT311_Customer Stories of Things, Cloud, and Analytics on AWSIOT311_Customer Stories of Things, Cloud, and Analytics on AWS
IOT311_Customer Stories of Things, Cloud, and Analytics on AWSAmazon Web Services
 
EXTENT-2017: Putting AI to Test
EXTENT-2017: Putting AI to TestEXTENT-2017: Putting AI to Test
EXTENT-2017: Putting AI to TestIosif Itkin
 
Where Open Source Meets Audit Analytics - ISACA North America CACS 2017
Where Open Source Meets Audit Analytics - ISACA North America CACS 2017Where Open Source Meets Audit Analytics - ISACA North America CACS 2017
Where Open Source Meets Audit Analytics - ISACA North America CACS 2017Andrew Clark
 
GPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
GPSTEC201_Building an Artificial Intelligence Practice for Consulting PartnersGPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
GPSTEC201_Building an Artificial Intelligence Practice for Consulting PartnersAmazon Web Services
 
MCL308_Using a Digital Assistant in the Enterprise for Business Productivity
MCL308_Using a Digital Assistant in the Enterprise for Business ProductivityMCL308_Using a Digital Assistant in the Enterprise for Business Productivity
MCL308_Using a Digital Assistant in the Enterprise for Business ProductivityAmazon Web Services
 
2018 04 20 Azure Global Bootcamp - Artificial Intelligence and Cognitive Serv...
2018 04 20 Azure Global Bootcamp - Artificial Intelligence and Cognitive Serv...2018 04 20 Azure Global Bootcamp - Artificial Intelligence and Cognitive Serv...
2018 04 20 Azure Global Bootcamp - Artificial Intelligence and Cognitive Serv...Bruno Capuano
 
Building a Real-Time Data Platform on AWS
Building a Real-Time Data Platform on AWSBuilding a Real-Time Data Platform on AWS
Building a Real-Time Data Platform on AWSInjae Kwak
 
Cloud Sentiment Analysis - Vendor Overview (April 2018)
Cloud Sentiment Analysis - Vendor Overview (April 2018)Cloud Sentiment Analysis - Vendor Overview (April 2018)
Cloud Sentiment Analysis - Vendor Overview (April 2018)Konstantin Savenkov
 
APIs as a Product Strategy
APIs as a Product StrategyAPIs as a Product Strategy
APIs as a Product StrategyRavi Kumar
 
DEV322_Continuous Integration Best Practices for Software Development Teams
DEV322_Continuous Integration Best Practices for Software Development TeamsDEV322_Continuous Integration Best Practices for Software Development Teams
DEV322_Continuous Integration Best Practices for Software Development TeamsAmazon Web Services
 
Introduction to Mobile Development with AWS
Introduction to Mobile Development with AWSIntroduction to Mobile Development with AWS
Introduction to Mobile Development with AWSAmazon Web Services
 
Introduction to Mobile Development with AWS
Introduction to Mobile Development with AWSIntroduction to Mobile Development with AWS
Introduction to Mobile Development with AWSAmazon Web Services
 
MBL306_Mobile State of the Union
MBL306_Mobile State of the UnionMBL306_Mobile State of the Union
MBL306_Mobile State of the UnionAmazon Web Services
 
CON320_Monitoring, Logging and Debugging Containerized Services
CON320_Monitoring, Logging and Debugging Containerized ServicesCON320_Monitoring, Logging and Debugging Containerized Services
CON320_Monitoring, Logging and Debugging Containerized ServicesAmazon Web Services
 
Building your own chat bot with Amazon Lex - Hebrew Webinar
Building your own chat bot with Amazon Lex - Hebrew WebinarBuilding your own chat bot with Amazon Lex - Hebrew Webinar
Building your own chat bot with Amazon Lex - Hebrew WebinarBoaz Ziniman
 
DevOps for a Mobile World: Building an iOS or Android Mobile App in the Cloud...
DevOps for a Mobile World: Building an iOS or Android Mobile App in the Cloud...DevOps for a Mobile World: Building an iOS or Android Mobile App in the Cloud...
DevOps for a Mobile World: Building an iOS or Android Mobile App in the Cloud...Amazon Web Services
 

Similar to NLU / Intent Detection Benchmark by Intento, August 2017 (20)

DEV206_Life of a Code Change to a Tier 1 Service
DEV206_Life of a Code Change to a Tier 1 ServiceDEV206_Life of a Code Change to a Tier 1 Service
DEV206_Life of a Code Change to a Tier 1 Service
 
IOT311_Customer Stories of Things, Cloud, and Analytics on AWS
IOT311_Customer Stories of Things, Cloud, and Analytics on AWSIOT311_Customer Stories of Things, Cloud, and Analytics on AWS
IOT311_Customer Stories of Things, Cloud, and Analytics on AWS
 
DevOps on AWS
DevOps on AWSDevOps on AWS
DevOps on AWS
 
DevOps on AWS
DevOps on AWSDevOps on AWS
DevOps on AWS
 
EXTENT-2017: Putting AI to Test
EXTENT-2017: Putting AI to TestEXTENT-2017: Putting AI to Test
EXTENT-2017: Putting AI to Test
 
Where Open Source Meets Audit Analytics - ISACA North America CACS 2017
Where Open Source Meets Audit Analytics - ISACA North America CACS 2017Where Open Source Meets Audit Analytics - ISACA North America CACS 2017
Where Open Source Meets Audit Analytics - ISACA North America CACS 2017
 
GPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
GPSTEC201_Building an Artificial Intelligence Practice for Consulting PartnersGPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
GPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
 
MCL308_Using a Digital Assistant in the Enterprise for Business Productivity
MCL308_Using a Digital Assistant in the Enterprise for Business ProductivityMCL308_Using a Digital Assistant in the Enterprise for Business Productivity
MCL308_Using a Digital Assistant in the Enterprise for Business Productivity
 
Samepoint API
Samepoint APISamepoint API
Samepoint API
 
2018 04 20 Azure Global Bootcamp - Artificial Intelligence and Cognitive Serv...
2018 04 20 Azure Global Bootcamp - Artificial Intelligence and Cognitive Serv...2018 04 20 Azure Global Bootcamp - Artificial Intelligence and Cognitive Serv...
2018 04 20 Azure Global Bootcamp - Artificial Intelligence and Cognitive Serv...
 
Building a Real-Time Data Platform on AWS
Building a Real-Time Data Platform on AWSBuilding a Real-Time Data Platform on AWS
Building a Real-Time Data Platform on AWS
 
Cloud Sentiment Analysis - Vendor Overview (April 2018)
Cloud Sentiment Analysis - Vendor Overview (April 2018)Cloud Sentiment Analysis - Vendor Overview (April 2018)
Cloud Sentiment Analysis - Vendor Overview (April 2018)
 
APIs as a Product Strategy
APIs as a Product StrategyAPIs as a Product Strategy
APIs as a Product Strategy
 
DEV322_Continuous Integration Best Practices for Software Development Teams
DEV322_Continuous Integration Best Practices for Software Development TeamsDEV322_Continuous Integration Best Practices for Software Development Teams
DEV322_Continuous Integration Best Practices for Software Development Teams
 
Introduction to Mobile Development with AWS
Introduction to Mobile Development with AWSIntroduction to Mobile Development with AWS
Introduction to Mobile Development with AWS
 
Introduction to Mobile Development with AWS
Introduction to Mobile Development with AWSIntroduction to Mobile Development with AWS
Introduction to Mobile Development with AWS
 
MBL306_Mobile State of the Union
MBL306_Mobile State of the UnionMBL306_Mobile State of the Union
MBL306_Mobile State of the Union
 
CON320_Monitoring, Logging and Debugging Containerized Services
CON320_Monitoring, Logging and Debugging Containerized ServicesCON320_Monitoring, Logging and Debugging Containerized Services
CON320_Monitoring, Logging and Debugging Containerized Services
 
Building your own chat bot with Amazon Lex - Hebrew Webinar
Building your own chat bot with Amazon Lex - Hebrew WebinarBuilding your own chat bot with Amazon Lex - Hebrew Webinar
Building your own chat bot with Amazon Lex - Hebrew Webinar
 
DevOps for a Mobile World: Building an iOS or Android Mobile App in the Cloud...
DevOps for a Mobile World: Building an iOS or Android Mobile App in the Cloud...DevOps for a Mobile World: Building an iOS or Android Mobile App in the Cloud...
DevOps for a Mobile World: Building an iOS or Android Mobile App in the Cloud...
 

NLU / Intent Detection Benchmark by Intento, August 2017

  • 1. NLU / Intent Detection Benchmark by Intento August 2017
  • 2. About
      • At Intento, we want to make Machine Intelligence services easy to discover, choose and use.
      • So far, evaluation is the most problematic part: to compare providers, one needs to sign a lot of contracts and integrate a lot of APIs.
      • We deliver that for FREE on public datasets. To evaluate on your own dataset, contact our sales.
      • Also, check out our Machine Translation Benchmark. Machine Translation is an easy way to build a multilingual bot.
  • 3. Overview
      • Natural Language Understanding Services with Public APIs*
          • IBM Watson Conversation
          • API.ai API
          • Microsoft LUIS
          • Amazon Lex
          • Recast.ai API
          • wit.ai API
          • SNIPS API
      • Benchmark Dimensions
          • Intent Prediction Performance
          • False Positives
          • Learning Speed (performance on small datasets)
          • Language Coverage, Price, Response time
      * As of today, some of them don’t have a Public API; we were given access for the purpose of this benchmark.
  • 4. NLU Engines Compared
  • 5. API.ai / Google
      • Website: https://api.ai
      • Launched: 2010 (acquired by Google in 2016)
      • Pricing model: FREE
      • Interface: HTTP REST API
      “Build delightful and natural conversational experiences”
  • 6. wit.ai / Facebook
      • Website: https://wit.ai
      • Launched: 2013 (acquired by Facebook in 2015)
      • Pricing model: FREE
      • Interface: HTTP REST API
      “Natural Language for Developers”
  • 7. IBM Watson Conversation
      • Website: https://www.ibm.com/watson/services/conversation/
      • Launched: 2016
      • Pricing model: Pay As You Go
      • Interface: HTTP REST API
      “Quickly build and deploy chatbots and virtual agents across a variety of channels, including mobile devices, messaging platforms, and even robots.”
  • 8. Microsoft LUIS
      • Website: https://www.luis.ai/
      • Launched: 2015
      • Pricing model: Pay As You Go
      • Interface: HTTP REST API
      “Language Understanding Intelligent Service. Add conversational intelligence to your apps.”
  • 9. Amazon Lex
      • Website: https://aws.amazon.com/lex/
      • Launched: 2016
      • Pricing model: Pay As You Go
      • Interface: HTTP REST API
      • Important Restrictions apply (see Dataset slide)
      “Conversational interfaces for your applications. Powered by the same deep learning technologies as Alexa”
  • 10. Recast.ai
      • Website: https://recast.ai
      • Launched: 2016
      • Pricing model: Free tier + Contact sales
      • Interface: HTTP REST API
      “The collaborative platform to build, train, deploy and monitor intelligent bots for developers”
  • 11. SNIPS
      • Website: https://snips.ai
      • Launched: 2017
      • Pricing model: Free to test + Pay per Device
      • Interface: On Device*
      “Snips is an AI-powered voice assistant you can add to your products. It runs on-device and is Private by Design”
      * We tested a hosted version with a private API provided by SNIPS.
  • 12. Historical timeline [figure: timeline with axis 2010-2017; labels: Conversation, Facebook, Google, LUIS]
  • 13. The Approach
  • 14. Dataset
      • The original dataset is the SNIPS.ai 2017 NLU Benchmark
      • English language only
      • We removed duplicates that differ only in whitespace, quotes, letter case, etc.
      • Resulting dataset parameters*: 7 intents (next slide), 15.6K samples (~2K per intent), 340K characters
      • We also used ~450 samples from the SNIPS.ai 2016 NLU Benchmark to test for False Positives
      * For Amazon Lex, the training set is capped at 200K characters by an API limit; character limitations also apply, resulting in 4500 utterances (~640 per intent).
  • 15. Intents [*]
      • SearchCreativeWork (e.g. Find me the I, Robot television show)
      • GetWeather (e.g. Is it windy in Boston, MA right now?)
      • BookRestaurant (e.g. I want to book a highly rated restaurant for me and my boyfriend tomorrow night)
      • PlayMusic (e.g. Play the last track from Beyoncé off Spotify)
      • AddToPlaylist (e.g. Add Diamonds to my roadtrip playlist)
      • RateBook (e.g. Give 6 stars to Of Mice and Men)
      • SearchScreeningEvent (e.g. Check the showtimes for Wonder Woman in Paris)
      [*] quoted from https://github.com/snipsco/nlu-benchmark/tree/master/2017-06-custom-intent-engines
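The duplicate removal mentioned on the Dataset slide could look roughly like the sketch below. The deck does not publish its cleaning rules, so the normalization key (`dedup_key`), the set of quote characters it strips, and the choice to keep the first occurrence of each duplicate are all assumptions made for illustration.

```python
import re

# Hypothetical rule set: the slide only says duplicates differ by whitespace,
# quotes, letter case "etc.", so the exact characters below are an assumption.
_QUOTES = str.maketrans("", "", "\"'“”‘’`")


def dedup_key(utterance: str) -> str:
    """Drop quote characters, collapse whitespace runs, and lowercase."""
    no_quotes = utterance.translate(_QUOTES)
    return re.sub(r"\s+", " ", no_quotes).strip().lower()


def deduplicate(samples):
    """Keep the first (utterance, intent) pair seen for each normalized key."""
    seen = set()
    unique = []
    for utterance, intent in samples:
        key = (dedup_key(utterance), intent)
        if key not in seen:
            seen.add(key)
            unique.append((utterance, intent))
    return unique


if __name__ == "__main__":
    data = [
        ("Play  'Diamonds'", "PlayMusic"),
        ("play diamonds", "PlayMusic"),
        ("Is it windy in Boston?", "GetWeather"),
    ]
    for utterance, intent in deduplicate(data):
        print(intent, "|", utterance)
```

The key is only used for comparison; the surviving utterance keeps its original spelling, so the training data itself is not altered.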
  • 16. I. Prediction Performance
  • 17. Experimental setting
      • Inspired by the SNIPS Benchmarks (2016, 2017)
      • We benchmark intent detection only, no parameter extraction yet
      • For each provider, one model is trained to detect all intents
      • We run all models with the default confidence score thresholds
      • 3-fold 80/20* Stratified Monte Carlo cross-validation
      * 47/20 for Amazon Lex
  • 18. Scores Normalization (I)
      Intents: the good, the bad and the ugly (to compare providers, we need to remove intent-related bias)
      [figures: F1 Scores*; Confusion matrix]
      * The F1 score is the harmonic mean of precision and recall; it reaches its best value at 1 and its worst at 0.
  • 19. Scores Normalization (II)
      • Standardizing F1, P and R scores for each intent (SD-normalization)
      • Then adjusting scales by multiplying by the global std and adding the global mean (next slide)
  • 20. Detection Performance*
      • Black bars indicate confidence intervals
      • Amazon Lex is trained on a smaller dataset due to its API limits
      * Mean standardized F1 scores, adjusted to the initial scale using global mean and std
  • 21. Average Precision*
      • Black bars indicate confidence intervals
      • Amazon Lex is trained on a smaller dataset due to its API limits
      * Mean standardized Precision, adjusted to the initial scale using global mean and std
  • 22. Average Recall*
      • Black bars indicate confidence intervals
      • Amazon Lex is trained on a smaller dataset due to its API limits
      * Mean standardized Recall, adjusted to the initial scale using global mean and std
  • 23. Precision vs. Recall
  • 24. Discussion
      • Training models is cumbersome for most of the services:
          • manual work (adding models in the web interface) and
          • solving issues with the tech support.
      • Based on the confidence intervals, all providers fall into several groups:
          • Top-runners: IBM Watson, API.ai, Microsoft LUIS
          • Amazon Lex: API limits don’t allow for training a multi-intent model on enough samples
      • Choosing the provider:
          • For “good” intents (like GetWeather), all providers are good enough. For “bad” intents, having a “good” provider is crucial
          • Within each tier, the leader depends on the intent data
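The two-step score normalization from the "Scores Normalization" slides (standardize per intent, then rescale with the global standard deviation and mean) can be sketched as follows. This is one reading of the slides rather than the benchmark's actual code: the function name `normalize_scores`, the input layout, the use of the population standard deviation, and the handling of intents where all providers tie are assumptions.

```python
from statistics import mean, pstdev


def normalize_scores(scores):
    """scores: {provider: {intent: score}} -> {provider: adjusted mean score}.

    Step 1: z-score each intent's column across providers, which removes the
            bias of intents that are easy or hard for everyone.
    Step 2: average the z-scores per provider, then map back to the original
            scale with the global std and global mean.
    """
    providers = list(scores)
    intents = list(next(iter(scores.values())))
    all_values = [scores[p][i] for p in providers for i in intents]
    global_mean, global_std = mean(all_values), pstdev(all_values)

    z = {p: [] for p in providers}
    for intent in intents:
        column = [scores[p][intent] for p in providers]
        mu, sigma = mean(column), pstdev(column)
        for p in providers:
            # Assumption: an intent on which every provider ties carries no signal.
            z[p].append(0.0 if sigma == 0 else (scores[p][intent] - mu) / sigma)

    return {p: mean(z[p]) * global_std + global_mean for p in providers}


if __name__ == "__main__":
    f1 = {
        "provider_a": {"GetWeather": 0.99, "SearchCreativeWork": 0.80},
        "provider_b": {"GetWeather": 0.97, "SearchCreativeWork": 0.60},
    }
    for provider, score in normalize_scores(f1).items():
        print(provider, round(score, 3))
```

The provider names and scores in the usage example are made up. The same function applies unchanged to precision or recall values.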
  • 25. II. False Positives
  • 26. Approach
      • How does an NLU engine behave when a user expresses an intent it wasn’t trained for?
      • Expected behavior: produce the Fallback Intent (no trained intents pass the detection threshold)
      • 1411 utterances from domains other than Music, Movie, Weather, RestaurantReservation, Entertainment and TV.
      • Compare % of (false) positive detections for each NLU provider
  • 28. Discussion
      • Only Snips.ai and API.ai are somewhat good at detecting that a user asks for something the agent is not trained for
      • IBM Watson and Microsoft LUIS try to map any user request to one of the intents from the training set
      Perhaps the Fallback intent should be manually added and trained on junk utterances?
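The false-positive measurement described in this section reduces to a simple percentage, sketched below. The prediction format (a trained intent name, or `None` for the Fallback Intent, paired with a confidence) and the `threshold` parameter are hypothetical; the benchmark itself relied on each provider's default threshold rather than a common one.

```python
def false_positive_rate(predictions, threshold=0.5):
    """Percentage of out-of-domain utterances mapped to a trained intent.

    predictions: list of (intent_or_None, confidence) pairs, one per
    out-of-domain utterance. None stands for the Fallback Intent.
    threshold: illustrative cut-off below which a detection is ignored.
    """
    if not predictions:
        raise ValueError("no predictions to score")
    false_positives = sum(
        1
        for intent, confidence in predictions
        if intent is not None and confidence >= threshold
    )
    return 100.0 * false_positives / len(predictions)


if __name__ == "__main__":
    out_of_domain = [
        ("GetWeather", 0.9),  # wrongly detected: counts as a false positive
        (None, 0.0),          # fallback: the expected behavior
        ("PlayMusic", 0.3),   # below the cut-off: treated as fallback
        ("RateBook", 0.7),    # wrongly detected
    ]
    print(false_positive_rate(out_of_domain))
```

A lower value is better here: an ideal engine would return the fallback for every out-of-domain utterance.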
  • 29. III. Learning Curve
  • 30. Experimental Setting
      • Similar to the Prediction Performance (slide 18)
      • 20% of the dataset reserved for testing (stratified)
      • From the remaining 80%, for each intent we randomly built a series of training sets of the following cardinality: 10, 25, 50, 100, 200, 500, 1000*
      • No cross-validation
      • Analyzed F1 Scores, normalised as described on slides 19-20.
      * for Amazon Lex, 10, 25, 50, 100, 200
  • 31. Learning curve by provider
      Vertical bars denote confidence intervals
  • 32. Discussion
      • On <100 samples IBM Watson is superior*. API.ai catches up at >100 samples.
      • Detecting the user’s intent is crucial for the subsequent slot extraction and response generation.
      • The learning curve is quite steep: good performance requires hundreds of utterances to train on.
      • Most of the pre-built intents (Microsoft LUIS, API.ai, etc.) are built on 10-50 utterances.
      * SNIPS advertises a special enterprise feature to generate additional samples for smaller datasets, but it is not available by default
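The learning-curve setup (a stratified 20% test split, then per-intent training sets of growing size) can be sketched as below. The slides do not say whether the smaller training sets are subsets of the larger ones or how the random draw was seeded; the nested sets, the fixed `seed`, and the function name `learning_curve_sets` are assumptions.

```python
import random
from collections import defaultdict

SIZES = (10, 25, 50, 100, 200, 500, 1000)  # per-intent cardinalities from the slide


def learning_curve_sets(samples, test_share=0.2, sizes=SIZES, seed=0):
    """samples: list of (utterance, intent) pairs.

    Returns (test_set, {size: training_set}), where each training set holds
    up to `size` utterances per intent drawn from the non-test remainder.
    """
    rng = random.Random(seed)
    by_intent = defaultdict(list)
    for utterance, intent in samples:
        by_intent[intent].append(utterance)

    test_set = []
    train_sets = {size: [] for size in sizes}
    for intent, utterances in by_intent.items():
        shuffled = utterances[:]
        rng.shuffle(shuffled)
        # Stratified split: the same share of every intent goes to the test set.
        n_test = round(len(shuffled) * test_share)
        test_set += [(u, intent) for u in shuffled[:n_test]]
        pool = shuffled[n_test:]
        for size in sizes:
            # Assumption: nested sets, each smaller set is a prefix of the larger.
            # If the pool is smaller than `size`, the whole pool is used.
            train_sets[size] += [(u, intent) for u in pool[:size]]
    return test_set, train_sets


if __name__ == "__main__":
    data = [(f"weather {i}", "GetWeather") for i in range(50)]
    data += [(f"music {i}", "PlayMusic") for i in range(50)]
    test, train = learning_curve_sets(data, sizes=(10, 25))
    print(len(test), {size: len(rows) for size, rows in train.items()})
```

Holding the test set fixed across all training sizes keeps the points of the curve comparable, which matters because the setup uses no cross-validation.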
  • 33. IV. Language coverage
  • 34. Supported languages
      • Merged all dialects (e.g. en-uk and en-us).
      • Note that we tested performance only for English
  • 36. Discussion
      Potentially, Machine Translation may be used to increase the language coverage and/or performance:
      • either by translating both the training and testing utterances to English
      • or by translating only testing utterances and using the English training model
      That’s something to check in future benchmarks
  • 37. V. Other observations
  • 38. Average response time*
      * Snips assumes on-device deployment; the response time shown here is for a hosted test bench
  • 39. Price per 1K requests*
      [chart labels: CONTACT SALES, Free, Free]
      * prediction requests; SNIPS has per-device pricing and is not shown on this chart
  • 41. Performance (F1) vs. Price*
      [scatter plot: Performance vs. Affordability (= 1/Price); points: wit.ai, amazon.lex, microsoft.luis, ibm.watson, api.ai; label: FREE]
      * Recast and SNIPS are not shown as they don't provide public pricing
  • 42. Performance (F1) vs. Latency*
      [scatter plot: Performance vs. Speed (= 1/Latency); points: snips, recast, wit.ai, amazon.lex, microsoft.luis, ibm.watson, api.ai]
      * Snips assumes on-device deployment; the response time shown here is for a hosted test bench
  • 43. Conclusions
      1. API.ai, Microsoft LUIS and IBM Watson have the best overall intent detection performance, speed and language coverage.
          • Within this group, API.ai is superior on price (free), Microsoft LUIS on speed (almost 50% faster response), and IBM Watson on performance (especially on smaller datasets). Of these three, only API.ai detects out-of-domain requests (~40% of them) and produces the Fallback intent.
      2. For extreme language coverage, go with wit.ai.
          • It would be interesting to see whether Machine Translation can be applied at either the training or the testing stage to increase language coverage for the top-3 providers.
      3. Performance varies a lot across intents and dataset sizes. We recommend evaluating several providers on your own data before making a choice.
  • 44. Discover the best service providers for your AI task. Evaluate performance on your own data. Access any provider with no effort using our Single API. Intento Service Platform