© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS re:INVENT
Tensors for Large-scale Topic Modeling and Deep Learning
Anima Anandkumar, Principal Scientist, Amazon AI
MCL337
November 29, 2017
Machine learning in many domains…
Image Understanding
Object Classification
Text Understanding
Topic Detection
Topics: Government, Information Technology, Politics
Trinity in Machine Learning
Algorithms
Compute
Data
AWS ML Stack
Application Services
Vision: Rekognition Image, Rekognition Video
Speech: Polly, Transcribe
Language: Lex, Translate, Comprehend
Platform Services
Amazon SageMaker, AWS DeepLens, Amazon Machine Learning, Mechanical Turk, Spark & EMR
Frameworks & Infrastructure
AWS Deep Learning AMI
Frameworks: Apache MXNet, PyTorch, Cognitive Toolkit, Keras, Caffe2 & Caffe, TensorFlow, Gluon
Infrastructure: GPU (P3 Instances), CPU (C5 Instances), Mobile, IoT (Greengrass)
Amazon Comprehend for Text
ML Algorithms in SageMaker
Introducing Amazon SageMaker
The quickest and easiest way to get ML models from idea to production
• End-to-end machine learning platform
• Zero setup
• Flexible model training
• Pay by the second
More than just general purpose algorithms
• XGBoost, FM, and Linear for classification and regression
• K-means and PCA for clustering and dimensionality reduction
• Image classification with convolutional neural networks
• LDA and NTM for topic modeling, seq2seq for translation
LDA topic model on AWS SageMaker
LDA Topic Models
Topic Models for Document Categorization
Topics: Government, Information Technology, Politics
• Labeled sample documents hard to obtain
• How do we discover topics automatically?
ML Algorithms
Unsupervised Learning | Supervised Learning
Warm-up: Clustering
• Each data point is part of a cluster
• Data point = document
• Cluster = topic
But documents have multiple topics!
LDA Topic Model: Beyond Clustering
Topics: Justice, Education, Sports
Words: brain, comput, data, evolve, gene, neuron
Training and Inference in SageMaker LDA
• Training using spectralLDA algorithm
• Inference using stochastic gradient descent (SGD)
[Diagram labels: Document corpus; Learning topic-word matrix; LDA Model; Inference; example words: brain, comput, data, evolve, gene, neuron]
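As a rough illustration of the inference step on this slide: given a learned topic-word matrix, inference estimates a document's topic proportions from its word counts. The sketch below uses full-batch exponentiated-gradient ascent on the document log-likelihood as a simple stand-in for the SGD procedure named on the slide. It is not the SageMaker implementation, and the function name, step size, and toy matrix are assumptions made for this example.

```python
import math


def infer_topic_proportions(counts, topic_word, steps=200, step_size=0.5):
    """Estimate topic proportions for one document (illustrative only).

    counts:     word counts for the document, length V.
    topic_word: K rows, each a strictly positive distribution over V words.
    """
    num_topics = len(topic_word)
    vocab = len(counts)
    total = sum(counts)
    theta = [1.0 / num_topics] * num_topics  # start from a uniform mixture
    for _ in range(steps):
        # Probability of each word under the current topic mixture.
        mix = [sum(theta[j] * topic_word[j][w] for j in range(num_topics))
               for w in range(vocab)]
        # Gradient of the average log-likelihood with respect to theta[j].
        grad = [sum(counts[w] * topic_word[j][w] / mix[w]
                    for w in range(vocab) if counts[w] > 0) / total
                for j in range(num_topics)]
        # Multiplicative update keeps every entry positive;
        # renormalising keeps theta on the probability simplex.
        theta = [theta[j] * math.exp(step_size * grad[j])
                 for j in range(num_topics)]
        norm = sum(theta)
        theta = [t / norm for t in theta]
    return theta


# Hypothetical two-topic model over a three-word vocabulary.
toy_topic_word = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
toy_counts = [7, 2, 1]  # word counts that resemble the first topic
print([round(t, 3) for t in infer_topic_proportions(toy_counts, toy_topic_word)])
```

Because the topic-word matrix is fixed during this step, the same routine can be applied to new documents without retraining, which is the sense in which training and inference are separate stages.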
Notebook Demo
https://github.com/awslabs/amazon-sagemaker-examples
LDA synthetic data generation
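The notebook's own generator is not reproduced in this transcript. As a hedged sketch of what synthetic LDA data generation typically involves, the standard LDA generative process draws each topic's word distribution and each document's topic proportions from Dirichlet distributions, then samples every word by first picking a topic. All function names and parameter values below are assumptions for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(num_docs, num_topics, vocab_size, doc_length,
                    alpha=0.5, eta=0.5, seed=0):
    """Generate bag-of-words documents from the LDA generative process."""
    rng = random.Random(seed)
    # One word distribution per topic.
    topic_word = [sample_dirichlet([eta] * vocab_size, rng)
                  for _ in range(num_topics)]
    mixtures, docs = [], []
    for _ in range(num_docs):
        # Per-document topic proportions.
        theta = sample_dirichlet([alpha] * num_topics, rng)
        counts = [0] * vocab_size
        for _ in range(doc_length):
            # Pick a topic for this word, then a word from that topic.
            topic = rng.choices(range(num_topics), weights=theta)[0]
            word = rng.choices(range(vocab_size), weights=topic_word[topic])[0]
            counts[word] += 1
        mixtures.append(theta)
        docs.append(counts)
    return topic_word, mixtures, docs


topic_word, mixtures, docs = generate_corpus(
    num_docs=3, num_topics=2, vocab_size=6, doc_length=10)
for counts in docs:
    print(counts)
```

Because the true topic-word matrix and mixtures are known, a learned model can be compared against them directly, which is the usual reason for starting with synthetic data.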
Performance Analysis
Qualitative Analysis
New York Times topics
Lifestyle, Politics, Sports, Business
PubMed Topics
Blood, Clinical Trials, treatment, Public health, Cancer/genetics
Example Document in NYTimes
Topics: Government, Information Technology, Politics
Topics: Business, Information Technology
Performance Benchmarks
SageMaker LDA training is faster
[Figure: Training time for NYTimes (300000 documents); time in minutes vs. number of topics; series: Spectral, Mallet]
[Figure: Training time for PubMed (8 million documents); time in minutes vs. number of topics; series: Spectral, Mallet]
Callouts, in slide order: 22x faster on average; 12x faster on average
• Mallet is an open-source framework for topic modeling
• Mallet does training and inference together
• Benchmarks on AWS SageMaker Platform
SageMaker LDA is cheaper on AWS
[Figure: Training cost for NYTimes (300000 documents); cost ($) vs. number of topics; series: Spectral, Mallet]
[Figure: Training cost for PubMed (1 million documents); cost ($) vs. number of topics; series: Spectral, Mallet]
Callouts, in slide order: 22x cheaper on average; 12x cheaper on average
• Faster training translates to lower costs on AWS
• Benchmarks on C4.8x
SageMaker LDA inference is faster
[Figure: Inference time for NYTimes (300000 documents); inference time in minutes vs. number of topics; series: SpectralLDA, Mallet]
[Figure: Inference time for PubMed (1 million documents); inference time in minutes vs. number of topics; series: SpectralLDA, Mallet]
Callouts, in slide order: 13x faster on average; 3.5x faster on average
• Mallet is an open-source framework for topic modeling
• Mallet does training and inference together
• Benchmarks on AWS SageMaker Platform
SageMaker LDA training + inference faster
[Figure: Total Time (Training + Inference) for NYTimes (300000 documents); total time in minutes vs. number of topics; series: SpectralLDA, Mallet]
[Figure: Total Time (Training + Inference) for PubMed (1 million documents); total time in minutes vs. number of topics; series: SpectralLDA, Mallet]
Callouts, in slide order: 7x faster on average; 2.5x faster on average
• Mallet is an open-source framework for topic modeling
• Mallet does training and inference together
• Benchmarks on AWS SageMaker Platform
SageMaker LDA has better topic coherence
[Figure: Topic coherence for NYTimes; PMI vs. number of topics; series: Mallet PMI, Spectral PMI]
• Topic coherence = Pointwise Mutual Information (PMI)
• PMI: co-occurrence of top words in a topic
• Higher PMI represents better topic quality and is a better representative of human judgement
• Human judgement is not highly correlated with the log likelihood of a topic model
300000 documents
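To make the PMI coherence measure concrete, here is a minimal sketch, assuming document-level co-occurrence: for each pair of a topic's top words, PMI compares how often the two words appear in the same document with how often that would happen if they were independent. The function name and toy documents are hypothetical, and skipping pairs that never co-occur is just one common convention (smoothing is another). The benchmark on the slide may have been computed differently.

```python
import math
from itertools import combinations


def topic_pmi(top_words, documents):
    """Average PMI over all pairs of a topic's top words.

    documents: list of sets, each holding the distinct words of one document.
    """
    num_docs = len(documents)

    def doc_fraction(*words):
        # Fraction of documents that contain every given word.
        hits = sum(1 for doc in documents if all(w in doc for w in words))
        return hits / num_docs

    scores = []
    for first, second in combinations(top_words, 2):
        joint = doc_fraction(first, second)
        if joint == 0:
            continue  # log(0) is undefined; this sketch skips such pairs
        scores.append(
            math.log(joint / (doc_fraction(first) * doc_fraction(second))))
    return sum(scores) / len(scores) if scores else 0.0


toy_docs = [
    {"gene", "dna", "cell"},
    {"gene", "dna"},
    {"ball", "team"},
    {"ball", "team", "score"},
]
print(topic_pmi(["gene", "dna"], toy_docs))   # words that always co-occur
print(topic_pmi(["gene", "ball"], toy_docs))  # words that never co-occur
```

A topic whose top words tend to appear together scores higher, which matches the intuition that such a topic reads as coherent.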
Faster algorithm with competitive topic quality
Neural Topic Modeling on SageMaker
[Diagram: Input term counts vector; Encoder: feedforward net; Document Posterior; Sampled Document Representation; Decoder: Softmax; Output term counts vector]
[Figure: Perplexity vs. Number of Topics; series: NTM, Other]
Tensor Methods for LDA Topic Models
Tensors in ML Algorithms
LDA Topic Model
Words: brain, comput, data, evolve, gene, neuron
Topics: Justice, Education, Sports
Learning LDA Model
• Topic-word matrix: P[word = i | topic = j]
• Topic proportions: P[topic = j | document]
• Moment Tensor: Co-occurrence of Word Triplets
[Diagram: the moment tensor shown as a sum of three components, labeled Crime, Sports, Education]
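The "= + +" picture on this slide shows the word-triplet moment tensor as a sum of components, one per topic. A minimal sketch of that structure, assuming the simpler case where each document is drawn from a single topic: the tensor is then M3 = sum_j w_j a_j ⊗ a_j ⊗ a_j, where a_j is topic j's word distribution and w_j its weight. For LDA proper, the JMLR 2014 paper cited in this deck uses moments with additional correction terms, which this sketch omits. All names and numbers below are hypothetical.

```python
def rank_one(vector):
    """Third-order outer product of a vector with itself."""
    n = len(vector)
    return [[[vector[i] * vector[j] * vector[k] for k in range(n)]
             for j in range(n)]
            for i in range(n)]


def moment_tensor(topic_weights, topic_word):
    """Weighted sum of rank-one terms, one per topic (single-topic model)."""
    n = len(topic_word[0])
    tensor = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for weight, dist in zip(topic_weights, topic_word):
        component = rank_one(dist)
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    tensor[i][j][k] += weight * component[i][j][k]
    return tensor


# Three hypothetical topics over a four-word vocabulary.
weights = [0.5, 0.3, 0.2]
topics = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.4, 0.4],
]
m3 = moment_tensor(weights, topics)
# Entry [i][j][k] is the probability that three words drawn from one
# document are words i, j and k, so the entries sum to 1.
print(sum(x for plane in m3 for row in plane for x in row))
```

Tensor decomposition runs this construction in reverse: starting from an estimate of the tensor built from observed word triplets, it recovers the weights and the topic-word vectors.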
Tensor Decompositions
Spectral Decomposition
Why Tensors?
Statistical reasons:
• Incorporate higher order relationships in data
• Discover hidden topics (not possible with matrix methods)
A. Anandkumar et al., Tensor Decompositions for Learning Latent Variable Models, JMLR 2014.
Computational reasons:
• Tensor algebra is parallelizable like linear algebra.
• Faster than other algorithms for LDA
• Flexible: Training and inference decoupled
• Guaranteed in theory to converge to global optimum
TENSORS IN DEEP LEARNING
Existing Deep Networks
Deep Tensorized Networks
Space Saving in Deep Tensorized Networks
RNN and LSTM for Sequence Modeling
Tensor RNN and Tensor LSTM
TLSTM for Long-term Forecasting
Climate dataset | Traffic dataset
Visual Question & Answering
Tensors for multiple modalities
Tensor Sketching Algorithms
TensorLy: Framework for Tensor Algebra
• Python programming
• User-friendly API
• Multiple backends: flexible + scalable
• Example notebooks in repository
CONCLUSION
Conclusion
• AWS SageMaker: Serverless ML framework
• Algorithms on SageMaker: faster and cheaper
• LDA model for unsupervised document categorization
• SageMaker LDA is faster and yields good topic quality
• Tensors are extensions of matrices
• Multiple dimensions and modalities
• Can be combined with deep learning
THANK YOU!

More Related Content

What's hot

GPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
GPSTEC201_Building an Artificial Intelligence Practice for Consulting PartnersGPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
GPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
Amazon Web Services
 
Building Machine Learning inference pipelines at scale | AWS Summit Tel Aviv ...
Building Machine Learning inference pipelines at scale | AWS Summit Tel Aviv ...Building Machine Learning inference pipelines at scale | AWS Summit Tel Aviv ...
Building Machine Learning inference pipelines at scale | AWS Summit Tel Aviv ...
AWS Summits
 
Add Real-Time Personalization and Recommendations to Your Applications (AIM39...
Add Real-Time Personalization and Recommendations to Your Applications (AIM39...Add Real-Time Personalization and Recommendations to Your Applications (AIM39...
Add Real-Time Personalization and Recommendations to Your Applications (AIM39...
Amazon Web Services
 
Building a Recommender System on AWS
Building a Recommender System on AWSBuilding a Recommender System on AWS
Building a Recommender System on AWS
Amazon Web Services
 
Building an end to end image recognition service - Tel Aviv Summit 2018
Building an end to end image recognition service - Tel Aviv Summit 2018Building an end to end image recognition service - Tel Aviv Summit 2018
Building an end to end image recognition service - Tel Aviv Summit 2018
Amazon Web Services
 
Sviluppare applicazioni voice-first con AWS e Amazon Alexa
Sviluppare applicazioni voice-first con AWS e Amazon AlexaSviluppare applicazioni voice-first con AWS e Amazon Alexa
Sviluppare applicazioni voice-first con AWS e Amazon Alexa
Amazon Web Services
 
BI & Analytics - A Datalake on AWS
BI & Analytics - A Datalake on AWSBI & Analytics - A Datalake on AWS
BI & Analytics - A Datalake on AWS
Amazon Web Services
 
Intro to SageMaker
Intro to SageMakerIntro to SageMaker
Intro to SageMaker
Soji Adeshina
 
NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...
NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...
NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...
Amazon Web Services
 
Deep Dive on Big Data
Deep Dive on Big Data Deep Dive on Big Data
Deep Dive on Big Data
Amazon Web Services
 
Artifical Intelligence and Machine Learning 201, AWS Federal Pop-Up Loft
Artifical Intelligence and Machine Learning 201, AWS Federal Pop-Up LoftArtifical Intelligence and Machine Learning 201, AWS Federal Pop-Up Loft
Artifical Intelligence and Machine Learning 201, AWS Federal Pop-Up Loft
Amazon Web Services
 
AI & ML on AWS: State of the Union
AI & ML on AWS: State of the UnionAI & ML on AWS: State of the Union
AI & ML on AWS: State of the Union
Julien SIMON
 
Artificial Intelligence nella realtà di oggi: come utilizzarla al meglio
Artificial Intelligence nella realtà di oggi: come utilizzarla al meglioArtificial Intelligence nella realtà di oggi: come utilizzarla al meglio
Artificial Intelligence nella realtà di oggi: come utilizzarla al meglio
Amazon Web Services
 
Introduction to AI services for Developers - Builders Day Israel
Introduction to AI services for Developers - Builders Day IsraelIntroduction to AI services for Developers - Builders Day Israel
Introduction to AI services for Developers - Builders Day Israel
Amazon Web Services
 
BI & Analytics
BI & AnalyticsBI & Analytics
BI & Analytics
Amazon Web Services
 
Harness the Power of Crowdsourcing with Amazon Mechanical Turk (AIM351) - AWS...
Harness the Power of Crowdsourcing with Amazon Mechanical Turk (AIM351) - AWS...Harness the Power of Crowdsourcing with Amazon Mechanical Turk (AIM351) - AWS...
Harness the Power of Crowdsourcing with Amazon Mechanical Turk (AIM351) - AWS...
Amazon Web Services
 
Moving forward with AI
Moving forward with AIMoving forward with AI
Moving forward with AI
Amazon Web Services
 
AWS의 새로운 언어, 음성, 텍스트 처리 인공지능 서비스::Vikram Anbazhagan::AWS Summit Seoul 2018
AWS의 새로운 언어, 음성, 텍스트 처리 인공지능 서비스::Vikram Anbazhagan::AWS Summit Seoul 2018AWS의 새로운 언어, 음성, 텍스트 처리 인공지능 서비스::Vikram Anbazhagan::AWS Summit Seoul 2018
AWS의 새로운 언어, 음성, 텍스트 처리 인공지능 서비스::Vikram Anbazhagan::AWS Summit Seoul 2018
Amazon Web Services Korea
 
Deep Learning for Developers: An Introduction, Featuring Samsung SDS (AIM301-...
Deep Learning for Developers: An Introduction, Featuring Samsung SDS (AIM301-...Deep Learning for Developers: An Introduction, Featuring Samsung SDS (AIM301-...
Deep Learning for Developers: An Introduction, Featuring Samsung SDS (AIM301-...
Amazon Web Services
 
How Different Large Organizations are Approaching Cloud Adoption
How Different Large Organizations are Approaching Cloud AdoptionHow Different Large Organizations are Approaching Cloud Adoption
How Different Large Organizations are Approaching Cloud Adoption
Amazon Web Services
 

What's hot (20)

GPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
GPSTEC201_Building an Artificial Intelligence Practice for Consulting PartnersGPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
GPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners
 
Building Machine Learning inference pipelines at scale | AWS Summit Tel Aviv ...
Building Machine Learning inference pipelines at scale | AWS Summit Tel Aviv ...Building Machine Learning inference pipelines at scale | AWS Summit Tel Aviv ...
Building Machine Learning inference pipelines at scale | AWS Summit Tel Aviv ...
 
Add Real-Time Personalization and Recommendations to Your Applications (AIM39...
Add Real-Time Personalization and Recommendations to Your Applications (AIM39...Add Real-Time Personalization and Recommendations to Your Applications (AIM39...
Add Real-Time Personalization and Recommendations to Your Applications (AIM39...
 
Building a Recommender System on AWS
Building a Recommender System on AWSBuilding a Recommender System on AWS
Building a Recommender System on AWS
 
Building an end to end image recognition service - Tel Aviv Summit 2018
Building an end to end image recognition service - Tel Aviv Summit 2018Building an end to end image recognition service - Tel Aviv Summit 2018
Building an end to end image recognition service - Tel Aviv Summit 2018
 
Sviluppare applicazioni voice-first con AWS e Amazon Alexa
Sviluppare applicazioni voice-first con AWS e Amazon AlexaSviluppare applicazioni voice-first con AWS e Amazon Alexa
Sviluppare applicazioni voice-first con AWS e Amazon Alexa
 
BI & Analytics - A Datalake on AWS
BI & Analytics - A Datalake on AWSBI & Analytics - A Datalake on AWS
BI & Analytics - A Datalake on AWS
 
Intro to SageMaker
Intro to SageMakerIntro to SageMaker
Intro to SageMaker
 
NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...
NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...
NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...
 
Deep Dive on Big Data
Deep Dive on Big Data Deep Dive on Big Data
Deep Dive on Big Data
 
Artifical Intelligence and Machine Learning 201, AWS Federal Pop-Up Loft
Artifical Intelligence and Machine Learning 201, AWS Federal Pop-Up LoftArtifical Intelligence and Machine Learning 201, AWS Federal Pop-Up Loft
Artifical Intelligence and Machine Learning 201, AWS Federal Pop-Up Loft
 
AI & ML on AWS: State of the Union
AI & ML on AWS: State of the UnionAI & ML on AWS: State of the Union
AI & ML on AWS: State of the Union
 
Artificial Intelligence nella realtà di oggi: come utilizzarla al meglio
Artificial Intelligence nella realtà di oggi: come utilizzarla al meglioArtificial Intelligence nella realtà di oggi: come utilizzarla al meglio
Artificial Intelligence nella realtà di oggi: come utilizzarla al meglio
 
Introduction to AI services for Developers - Builders Day Israel
Introduction to AI services for Developers - Builders Day IsraelIntroduction to AI services for Developers - Builders Day Israel
Introduction to AI services for Developers - Builders Day Israel
 
BI & Analytics
BI & AnalyticsBI & Analytics
BI & Analytics
 
Harness the Power of Crowdsourcing with Amazon Mechanical Turk (AIM351) - AWS...
Harness the Power of Crowdsourcing with Amazon Mechanical Turk (AIM351) - AWS...Harness the Power of Crowdsourcing with Amazon Mechanical Turk (AIM351) - AWS...
Harness the Power of Crowdsourcing with Amazon Mechanical Turk (AIM351) - AWS...
 
Moving forward with AI
Moving forward with AIMoving forward with AI
Moving forward with AI
 
AWS의 새로운 언어, 음성, 텍스트 처리 인공지능 서비스::Vikram Anbazhagan::AWS Summit Seoul 2018
AWS의 새로운 언어, 음성, 텍스트 처리 인공지능 서비스::Vikram Anbazhagan::AWS Summit Seoul 2018AWS의 새로운 언어, 음성, 텍스트 처리 인공지능 서비스::Vikram Anbazhagan::AWS Summit Seoul 2018
AWS의 새로운 언어, 음성, 텍스트 처리 인공지능 서비스::Vikram Anbazhagan::AWS Summit Seoul 2018
 
Deep Learning for Developers: An Introduction, Featuring Samsung SDS (AIM301-...
Deep Learning for Developers: An Introduction, Featuring Samsung SDS (AIM301-...Deep Learning for Developers: An Introduction, Featuring Samsung SDS (AIM301-...
Deep Learning for Developers: An Introduction, Featuring Samsung SDS (AIM301-...
 
How Different Large Organizations are Approaching Cloud Adoption
How Different Large Organizations are Approaching Cloud AdoptionHow Different Large Organizations are Approaching Cloud Adoption
How Different Large Organizations are Approaching Cloud Adoption
 

Similar to Tensors for topic modeling and deep learning on AWS Sagemaker

Artificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to StartArtificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to Start
Vladimir Simek
 
Integrating Deep Learning into your Enterprise
Integrating Deep Learning into your EnterpriseIntegrating Deep Learning into your Enterprise
Integrating Deep Learning into your Enterprise
Amazon Web Services
 
Build, train, and deploy machine learning models at scale - AWS Summit Cape T...
Build, train, and deploy machine learning models at scale - AWS Summit Cape T...Build, train, and deploy machine learning models at scale - AWS Summit Cape T...
Build, train, and deploy machine learning models at scale - AWS Summit Cape T...
Amazon Web Services
 
Supercharge your Machine Learning Solutions with Amazon SageMaker
Supercharge your Machine Learning Solutions with Amazon SageMakerSupercharge your Machine Learning Solutions with Amazon SageMaker
Supercharge your Machine Learning Solutions with Amazon SageMaker
Amazon Web Services
 
McGraw-Hill Optimizes Analytics Workloads with Databricks
 McGraw-Hill Optimizes Analytics Workloads with Databricks McGraw-Hill Optimizes Analytics Workloads with Databricks
McGraw-Hill Optimizes Analytics Workloads with Databricks
Amazon Web Services
 
AWS & kreuzwerker Startup Day Warsaw - 09.11.2023
AWS & kreuzwerker Startup Day Warsaw - 09.11.2023AWS & kreuzwerker Startup Day Warsaw - 09.11.2023
AWS & kreuzwerker Startup Day Warsaw - 09.11.2023
kreuzwerker GmbH
 
Integrating Deep Learning In the Enterprise
Integrating Deep Learning In the EnterpriseIntegrating Deep Learning In the Enterprise
Integrating Deep Learning In the Enterprise
Amazon Web Services
 
엔터프라이즈를 위한 머신러닝 그리고 AWS (김일호 솔루션즈 아키텍트, AWS) :: AWS Techforum 2018
엔터프라이즈를 위한 머신러닝 그리고 AWS (김일호 솔루션즈 아키텍트, AWS) :: AWS Techforum 2018엔터프라이즈를 위한 머신러닝 그리고 AWS (김일호 솔루션즈 아키텍트, AWS) :: AWS Techforum 2018
엔터프라이즈를 위한 머신러닝 그리고 AWS (김일호 솔루션즈 아키텍트, AWS) :: AWS Techforum 2018
Amazon Web Services Korea
 
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und ExpertenMaschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
Maschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten
AWS Germany
 
Integrating Deep Learning Into Your Enterprise
Integrating Deep Learning Into Your EnterpriseIntegrating Deep Learning Into Your Enterprise
Integrating Deep Learning Into Your Enterprise
Amazon Web Services
 
STG206_Big Data Data Lakes and Data Oceans
STG206_Big Data Data Lakes and Data OceansSTG206_Big Data Data Lakes and Data Oceans
STG206_Big Data Data Lakes and Data Oceans
Amazon Web Services
 
Artificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to StartArtificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to Start
Vladimir Simek
 
GPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital MarketsGPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital Markets
Amazon Web Services
 
Best Practices for Building a Data Lake in Amazon S3 and Amazon Glacier, with...
Best Practices for Building a Data Lake in Amazon S3 and Amazon Glacier, with...Best Practices for Building a Data Lake in Amazon S3 and Amazon Glacier, with...
Best Practices for Building a Data Lake in Amazon S3 and Amazon Glacier, with...
Amazon Web Services
 
NEW LAUNCH! Infinitely Scalable Machine Learning Algorithms with Amazon AI - ...
NEW LAUNCH! Infinitely Scalable Machine Learning Algorithms with Amazon AI - ...NEW LAUNCH! Infinitely Scalable Machine Learning Algorithms with Amazon AI - ...
NEW LAUNCH! Infinitely Scalable Machine Learning Algorithms with Amazon AI - ...
Amazon Web Services
 
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
Amazon Web Services
 
Using Amazon SageMaker to build, train, and deploy your ML Models
Using Amazon SageMaker to build, train, and deploy your ML ModelsUsing Amazon SageMaker to build, train, and deploy your ML Models
Using Amazon SageMaker to build, train, and deploy your ML Models
Amazon Web Services
 
Supercharge Your Machine Learning Solutions with Amazon SageMaker
Supercharge Your Machine Learning Solutions with Amazon SageMakerSupercharge Your Machine Learning Solutions with Amazon SageMaker
Supercharge Your Machine Learning Solutions with Amazon SageMaker
Amazon Web Services
 
Real-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Real-time Analytics using Data from IoT Devices - AWS Online Tech TalksReal-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Real-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Amazon Web Services
 
Using Amazon SageMaker to Build, Train, and Deploy Your ML Models
Using Amazon SageMaker to Build, Train, and Deploy Your ML ModelsUsing Amazon SageMaker to Build, Train, and Deploy Your ML Models
Using Amazon SageMaker to Build, Train, and Deploy Your ML Models
Amazon Web Services
 

Similar to Tensors for topic modeling and deep learning on AWS Sagemaker (20)

Artificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to StartArtificial Intelligence (Machine Learning) on AWS: How to Start
Artificial Intelligence (Machine Learning) on AWS: How to Start
 
Integrating Deep Learning into your Enterprise
Integrating Deep Learning into your EnterpriseIntegrating Deep Learning into your Enterprise
Integrating Deep Learning into your Enterprise
 
Build, train, and deploy machine learning models at scale - AWS Summit Cape T...
Build, train, and deploy machine learning models at scale - AWS Summit Cape T...Build, train, and deploy machine learning models at scale - AWS Summit Cape T...
Build, train, and deploy machine learning models at scale - AWS Summit Cape T...
 
Supercharge your Machine Learning Solutions with Amazon SageMaker
Supercharge your Machine Learning Solutions with Amazon SageMakerSupercharge your Machine Learning Solutions with Amazon SageMaker
Supercharge your Machine Learning Solutions with Amazon SageMaker
 
McGraw-Hill Optimizes Analytics Workloads with Databricks
 McGraw-Hill Optimizes Analytics Workloads with Databricks McGraw-Hill Optimizes Analytics Workloads with Databricks
McGraw-Hill Optimizes Analytics Workloads with Databricks
 
AWS & kreuzwerker Startup Day Warsaw - 09.11.2023


Tensors for topic modeling and deep learning on AWS SageMaker

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. AWS re:INVENT. Tensors for Large-scale Topic Modeling and Deep Learning. Anima Anandkumar, Principal Scientist, Amazon AI. MCL337. November 29, 2017
  • 2. Machine learning in many domains…
  • 3. Machine learning in many domains… Image Understanding
  • 4. Machine learning in many domains… Object Classification
  • 5. Machine learning in many domains… Text Understanding
  • 6. Topic Detection
  • 7. Topic Detection. Topics: Government, Information Technology, Politics
  • 8. Trinity in Machine Learning: Algorithms, Compute, Data
  • 9. AWS ML Stack. Application Services: Vision (Rekognition Image, Rekognition Video); Speech (Polly, Transcribe); Language (Lex, Translate, Comprehend). Platform Services: Amazon Machine Learning, Mechanical Turk, Spark & EMR, Amazon SageMaker, AWS DeepLens. Frameworks & Infrastructure: AWS Deep Learning AMI; Apache MXNet, PyTorch, Cognitive Toolkit, Keras, Caffe2 & Caffe, TensorFlow, Gluon; GPU (P3 Instances), CPU (C5 Instances), Mobile, IoT (Greengrass)
  • 10. Amazon Comprehend for Text
  • 11. ML Algorithms in SageMaker
  • 12. Introducing Amazon SageMaker: the quickest and easiest way to get ML models from idea to production • End-to-end Machine Learning Platform • Zero setup • Flexible model training • Pay by the second
  • 13. More than just general purpose algorithms • XGBoost, FM, and Linear for classification and regression • Kmeans and PCA for clustering and dimensionality reduction • Image classification with convolutional neural networks • LDA and NTM for topic modeling, seq2seq for translation
  • 14. LDA topic model on AWS SageMaker
  • 15. LDA Topic Models
  • 16. Topic Models for Document Categorization. Topics: Government, Information Technology, Politics
  • 17. Topic Models for Document Categorization • Labeled sample documents hard to obtain
  • 18. Topic Models for Document Categorization • Labeled sample documents hard to obtain • How do we discover topics automatically?
  • 19. ML Algorithms: Unsupervised Learning, Supervised Learning
  • 20. Warm-up: Clustering • Each data point is part of a cluster
  • 21. Warm-up: Clustering • Each data point is part of a cluster • Data point = document • Cluster = topic
  • 22. Warm-up: Clustering • Each data point is part of a cluster • Data point = document • Cluster = topic. But documents have multiple topics!
  • 23. LDA Topic Model: Beyond Clustering. Topics: Justice, Education, Sports
  • 24. LDA Topic Model: Beyond Clustering. Words: brain, comput, data, evolve, gene, neuron. Topics: Justice, Education, Sports
  • 25. Training and Inference in SageMaker LDA • Training using spectralLDA algorithm • Inference using stochastic gradient descent (SGD) [Diagram labels: Document corpus, Learning topic-word matrix, LDA Model, Inference; example words: brain, comput, data, evolve, gene, neuron]
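The inference step named on this slide can be illustrated with a small sketch. It assumes a topic-word matrix has already been learned and estimates the topic proportions of one document by maximizing the multinomial likelihood of its word counts. Note the substitution: the sketch uses full-batch exponentiated-gradient updates, not the stochastic gradient descent the slide mentions, and the function name `infer_topic_proportions`, the step size, and the toy matrix are illustrative assumptions rather than SageMaker's implementation.

```python
import numpy as np

def infer_topic_proportions(word_counts, topic_word, num_iters=200, step_size=0.5):
    """Estimate topic proportions for one document (hypothetical helper).

    word_counts: length-V vector of term counts for the document.
    topic_word:  V x K matrix; column j holds P[word | topic = j].
    """
    word_freq = word_counts / word_counts.sum()
    num_topics = topic_word.shape[1]
    theta = np.full(num_topics, 1.0 / num_topics)  # start from uniform proportions
    for _ in range(num_iters):
        word_probs = topic_word @ theta            # P[word | document] under theta
        grad = topic_word.T @ (word_freq / word_probs)  # gradient of the log-likelihood
        theta = theta * np.exp(step_size * grad)   # multiplicative (mirror-descent) step
        theta /= theta.sum()                       # stay on the probability simplex
    return theta

# Toy example: 6 words, 2 topics (values chosen only for illustration).
topic_word = np.array([
    [0.40, 0.02],
    [0.40, 0.03],
    [0.10, 0.05],
    [0.05, 0.10],
    [0.03, 0.40],
    [0.02, 0.40],
])
true_theta = np.array([0.7, 0.3])
word_counts = 1000 * (topic_word @ true_theta)     # idealized (expected) counts
theta_hat = infer_topic_proportions(word_counts, topic_word)
```

Because inference only needs the topic-word matrix, it can run separately from training, which is the decoupling the deck points out later.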
  • 26. Notebook Demo: https://github.com/awslabs/amazon-sagemaker-examples
  • 27. LDA synthetic data generation
  • 28. LDA synthetic data generation
  • 29. LDA synthetic data generation
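The slides on synthetic data generation carry only a title, so the following is a minimal sketch of the standard LDA generative process, not the code from the demo notebook. The function name `generate_lda_corpus`, the Dirichlet parameters, and the corpus sizes are assumptions made for illustration.

```python
import numpy as np

def generate_lda_corpus(num_docs, vocab_size, num_topics, doc_length,
                        alpha=0.5, beta=0.5, seed=0):
    """Draw a synthetic bag-of-words corpus from the LDA generative model."""
    rng = np.random.default_rng(seed)
    # Each topic is a distribution over the vocabulary (rows sum to 1).
    topic_word = rng.dirichlet(np.full(vocab_size, beta), size=num_topics)
    # Each document mixes several topics (rows sum to 1).
    doc_topic = rng.dirichlet(np.full(num_topics, alpha), size=num_docs)
    counts = np.zeros((num_docs, vocab_size), dtype=np.int64)
    for d in range(num_docs):
        # Mixing the topics first and then drawing words gives the same
        # distribution as drawing a topic for every word individually.
        word_probs = doc_topic[d] @ topic_word
        word_probs /= word_probs.sum()  # guard against floating-point drift
        counts[d] = rng.multinomial(doc_length, word_probs)
    return counts, topic_word, doc_topic

counts, topic_word, doc_topic = generate_lda_corpus(
    num_docs=20, vocab_size=30, num_topics=4, doc_length=50)
```

Since the true `topic_word` matrix is known for synthetic data, a learned model can be compared against it directly.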
  • 30. Performance Analysis
  • 31. Qualitative Analysis
  • 32. New York Times topics. Four topics shown: Lifestyle, Politics, Sports, Business
  • 33. PubMed Topics. Five topics shown: Blood, Clinical trials, Treatment, Public health, Cancer/genetics
  • 34. Example Document in NYTimes. Topics: Government, Information Technology, Politics
  • 35. Example Document in NYTimes. Topics: Business, Information Technology
  • 36. Performance Benchmarks
  • 37. SageMaker LDA training is faster [Figure: training time in minutes vs. number of topics, Spectral vs. Mallet; panels "Training time for NYTimes" and "Training time for PubMed"; dataset sizes shown: 300000 documents and 8 million documents; annotations: 22x faster on average, 12x faster on average] • Mallet is an open-source framework for topic modeling • Mallet does training and inference together • Benchmarks on AWS SageMaker Platform
  • 38. SageMaker LDA is cheaper on AWS [Figure: training cost ($) vs. number of topics, Spectral vs. Mallet; panels "Training cost for NYTimes" and "Training cost for PubMed"; dataset sizes shown: 300000 documents and 1 million documents; annotations: 22x cheaper on average, 12x cheaper on average] • Faster training translates to lower costs on AWS • Benchmarks on C4.8x
  • 39. SageMaker LDA inference is faster [Figure: inference time in minutes vs. number of topics, SpectralLDA vs. Mallet; panels "Inference time for NYTimes" and "Inference time for Pubmed"; dataset sizes shown: 300000 documents and 1 million documents; annotations: 13x faster on average, 3.5x faster on average] • Mallet is an open-source framework for topic modeling • Mallet does training and inference together • Benchmarks on AWS SageMaker Platform
  • 40. SageMaker LDA training + inference faster [Figure: total time in minutes vs. number of topics, SpectralLDA vs. Mallet; panels "Total Time (Training + Inference) for NYTimes" and "Total Time (Training + Inference) for Pubmed"; dataset sizes shown: 300000 documents and 1 million documents; annotations: 7x faster on average, 2.5x faster on average] • Mallet is an open-source framework for topic modeling • Mallet does training and inference together • Benchmarks on AWS SageMaker Platform
  • 41. SageMaker LDA has better topic coherence [Figure: "Topic coherence for NYTimes", PMI vs. number of topics, Mallet PMI vs. Spectral PMI; 300000 documents] • Topic coherence = Pointwise Mutual Information (PMI) • PMI: co-occurrence of top words in a topic • Higher PMI indicates better topic quality and is a better proxy for human judgement • Human judgement is not highly correlated with the log likelihood of the topic model
  • 42. SageMaker LDA has better topic coherence [Figure: "Topic coherence for NYTimes", PMI vs. number of topics, Mallet PMI vs. Spectral PMI; 300000 documents] • Topic coherence = Pointwise Mutual Information (PMI) • PMI: co-occurrence of top words in a topic • Higher PMI indicates better topic quality and is a better proxy for human judgement • Human judgement is not highly correlated with the log likelihood of the topic model • Faster algorithm with competitive topic quality
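The slide describes PMI as the co-occurrence of a topic's top words. As a hedged sketch of one common way to compute such a score (the exact definition used for the benchmarks is not given in the deck), the code below averages the PMI of every pair of top words, using document-level co-occurrence. The function name `topic_pmi`, the smoothing constant `eps`, and the toy corpus are assumptions.

```python
import itertools
import numpy as np

def topic_pmi(top_word_ids, doc_word_counts, eps=1e-12):
    """Average PMI over all pairs of a topic's top words (hypothetical helper).

    doc_word_counts: D x V matrix of term counts; a word "occurs" in a
    document when its count is positive.
    """
    present = np.asarray(doc_word_counts) > 0
    scores = []
    for i, j in itertools.combinations(top_word_ids, 2):
        p_i = present[:, i].mean()                       # P[word i in document]
        p_j = present[:, j].mean()                       # P[word j in document]
        p_ij = (present[:, i] & present[:, j]).mean()    # P[both in document]
        scores.append(np.log((p_ij + eps) / (p_i * p_j + eps)))
    return float(np.mean(scores))

# Toy corpus: words 0 and 1 always appear together, as do words 2 and 3.
toy_counts = np.array([
    [2, 1, 0, 0],
    [1, 3, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 4, 2],
])
coherent_score = topic_pmi([0, 1], toy_counts)    # words that co-occur
incoherent_score = topic_pmi([0, 2], toy_counts)  # words that never co-occur
```

A topic whose top words tend to appear in the same documents scores higher, which matches the slide's reading that higher PMI indicates better topic quality.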
  • 43. Neural Topic Modeling on SageMaker [Diagram: Input term counts vector → Encoder: feedforward net → Document Posterior → Sampled Document Representation → Decoder: Softmax → Output term counts vector] [Figure: Perplexity vs. Number of Topics, NTM vs. Other]
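The encoder and decoder stages listed on this slide resemble a variational autoencoder over term counts. The following is only a rough sketch of one forward pass under that reading. The layer sizes, the tanh activation, the Gaussian posterior, the input normalization, and every name in `params` are assumptions, not details of the SageMaker NTM algorithm.

```python
import numpy as np

def ntm_forward(term_counts, params, rng):
    """One forward pass: term counts -> sampled representation -> word distribution."""
    x = term_counts / term_counts.sum()             # normalized input (an assumption)
    hidden = np.tanh(params["W_enc"] @ x + params["b_enc"])  # feedforward encoder
    mean = params["W_mean"] @ hidden                # document posterior: mean
    log_var = params["W_logvar"] @ hidden           # document posterior: log-variance
    noise = rng.standard_normal(mean.shape)
    z = mean + np.exp(0.5 * log_var) * noise        # sampled document representation
    logits = params["W_dec"] @ z + params["b_dec"]  # decoder
    logits = logits - logits.max()                  # stabilize the softmax
    word_probs = np.exp(logits) / np.exp(logits).sum()
    return z, word_probs

rng = np.random.default_rng(0)
vocab_size, hidden_size, num_topics = 8, 5, 3
params = {
    "W_enc": 0.1 * rng.standard_normal((hidden_size, vocab_size)),
    "b_enc": np.zeros(hidden_size),
    "W_mean": 0.1 * rng.standard_normal((num_topics, hidden_size)),
    "W_logvar": 0.1 * rng.standard_normal((num_topics, hidden_size)),
    "W_dec": 0.1 * rng.standard_normal((vocab_size, num_topics)),
    "b_dec": np.zeros(vocab_size),
}
term_counts = np.array([3.0, 0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 4.0])
z, word_probs = ntm_forward(term_counts, params, rng)
```

In training, the weights would be fitted so that `word_probs` explains the observed term counts; the sketch uses random, untrained weights purely to show the data flow.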
  • 44. Tensor Methods for LDA Topic Models
  • 45. Tensors in ML Algorithms
  • 46. LDA Topic Model. Words: brain, comput, data, evolve, gene, neuron. Topics: Justice, Education, Sports
  • 47. Learning LDA Model • Topic-word matrix: P[word = i | topic = j] • Topic proportions: P[topic = j | document] • Moment Tensor: Co-occurrence of Word Triplets [Diagram: tensor = sum of three terms, labeled crime, Sports, Education]
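The decomposition pictured on this slide can be written as a formula. As a hedged sketch, following the JMLR 2014 paper cited on the "Why Tensors?" slides: after a suitable adjustment of the raw word-triplet co-occurrence moments (the adjustment depends on the Dirichlet prior and is not shown in the deck), the third-order moment tensor is a sum of rank-one terms, one per topic. The symbols M_3, w_j, \mu_j, k, and d are notation assumed here, not taken from the slide.

```latex
M_3 \;=\; \sum_{j=1}^{k} w_j \,\mu_j \otimes \mu_j \otimes \mu_j ,
\qquad
\mu_j = \bigl( P[\text{word} = i \mid \text{topic} = j] \bigr)_{i=1}^{d},
\quad w_j > 0 .
```

Here k is the number of topics, d is the vocabulary size, and each \mu_j is one column of the topic-word matrix, so recovering the rank-one terms recovers the topics.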
  • 48. Tensor Decompositions. Spectral Decomposition
  • 49. Why Tensors? Statistical reasons: • Incorporate higher order relationships in data • Discover hidden topics (not possible with matrix methods). A. Anandkumar et al., Tensor Decompositions for Learning Latent Variable Models, JMLR 2014.
  • 50. Why Tensors? Statistical reasons: • Incorporate higher order relationships in data • Discover hidden topics (not possible with matrix methods). Computational reasons: • Tensor algebra is parallelizable like linear algebra • Faster than other algorithms for LDA • Flexible: Training and inference decoupled • Guaranteed in theory to converge to global optimum. A. Anandkumar et al., Tensor Decompositions for Learning Latent Variable Models, JMLR 2014.
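To make the idea of a tensor decomposition concrete, here is a minimal sketch of the tensor power method with deflation on a small symmetric tensor whose components are orthonormal. It illustrates the general technique only. It omits the whitening step that real moment tensors need, and the function name, iteration counts, and toy tensor are assumptions rather than the spectralLDA implementation.

```python
import numpy as np

def tensor_power_method(T, num_components, num_iters=100, num_restarts=10, seed=0):
    """Recover weights and vectors of T = sum_j w_j * v_j (x) v_j (x) v_j,
    assuming the v_j are orthonormal and the w_j are positive."""
    rng = np.random.default_rng(seed)
    T = T.copy()
    dim = T.shape[0]
    weights, vectors = [], []
    for _ in range(num_components):
        best_weight, best_vector = -np.inf, None
        for _ in range(num_restarts):
            u = rng.normal(size=dim)
            u /= np.linalg.norm(u)
            for _ in range(num_iters):
                u = np.einsum("ijk,j,k->i", T, u, u)   # power update T(I, u, u)
                u /= np.linalg.norm(u)
            weight = np.einsum("ijk,i,j,k->", T, u, u, u)  # eigenvalue T(u, u, u)
            if weight > best_weight:
                best_weight, best_vector = weight, u
        weights.append(best_weight)
        vectors.append(best_vector)
        # Deflate: subtract the recovered rank-one term before the next round.
        T = T - best_weight * np.einsum("i,j,k->ijk", best_vector, best_vector, best_vector)
    return np.array(weights), np.array(vectors)

# Toy tensor with three orthonormal components in four dimensions.
rng = np.random.default_rng(1)
basis, _ = np.linalg.qr(rng.normal(size=(4, 4)))
true_weights = np.array([3.0, 2.0, 1.0])
true_vectors = basis[:, :3].T
T = sum(w * np.einsum("i,j,k->ijk", v, v, v)
        for w, v in zip(true_weights, true_vectors))
weights, vectors = tensor_power_method(T, num_components=3)
```

Each update is a tensor contraction, which is the kind of operation the slide describes as parallelizable like linear algebra.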
  • 51. TENSORS IN DEEP LEARNING
  • 52. Existing Deep Networks
  • 53. Deep Tensorized Networks
  • 54. Space Saving in Deep Tensorized Networks
  • 55. RNN and LSTM for Sequence Modeling
  • 56. Tensor RNN and Tensor LSTM
  • 57. Tensor RNN and Tensor LSTM
  • 58. TLSTM for Long-term Forecasting [Figure panels: Climate dataset, Traffic dataset]
  • 59. Visual Question & Answering. Tensors for multiple modalities
  • 60. Visual Question & Answering. Tensors for multiple modalities
  • 61. Visual Question & Answering. Tensor Sketching Algorithms
  • 62. TensorLy: Framework for Tensor Algebra • Python programming • User-friendly API • Multiple backends: flexible + scalable • Example notebooks in repository
  • 63. CONCLUSION
  • 64. Conclusion • AWS SageMaker: Serverless ML framework • Algorithms on SageMaker: faster and cheaper • LDA model for unsupervised document categorization • SageMaker LDA is faster and yields good topic quality • Tensors are extensions of matrices • Multiple dimensions and modalities • Can be combined with deep learning [Diagram: tensor = sum of components]
  • 65. THANK YOU!

Editor's Notes

  1. Document, image, forecasting
  2. Document, image, forecasting
  3. Document, image, forecasting
  4. Document, image, forecasting
  5. End-to-End Machine Learning Platform: Amazon SageMaker offers a familiar integrated development environment so that you can start processing your training dataset and developing your algorithms immediately. With one-click training, Amazon SageMaker provides a distributed training environment complete with high-performance machine learning algorithms and built-in hyperparameter optimization for auto-tuning your models. When you’re ready to deploy, launching a secure and elastically scalable production environment is as simple as clicking a button in the Amazon SageMaker management console.
     Zero Setup: Amazon SageMaker provides hosted Jupyter notebooks that require no setup, so you can begin processing your training datasets and developing your algorithms immediately. With a few clicks in the Amazon SageMaker console, you can create a fully managed notebook instance, pre-loaded with useful libraries for machine learning and deep learning frameworks such as TensorFlow and Apache MXNet. You need only add your data.
     Flexible Model Training: With native support for bring-your-own algorithms and frameworks, model training in Amazon SageMaker is flexible. Amazon SageMaker provides native Apache MXNet and TensorFlow support, and offers a range of built-in, high-performance machine learning algorithms, in addition to supporting popular open source algorithms. If you want to train against another algorithm or with an alternative deep learning framework, you simply bring your own algorithms or deep learning frameworks via a Docker container.
     Pay by the Second: With Amazon SageMaker, you pay only for what you use. Authoring, training, and hosting are billed by the second, with no minimum fees and no upfront commitments. Pricing within Amazon SageMaker is broken down by on-demand ML instances, ML storage, and fees for data processing in notebooks and hosting instances.
  6. The result of this is:
     1) Linear Learner - Regression
     2) Linear Learner - Classification
     3) K-means
     4) Principal Component Analysis
     5) Factorization Machines
     6) Neural Topic Modeling
     7) Latent Dirichlet Allocation
     8) XGBoost
     9) Seq2Seq
     10) Image classification (ResNet)
  7. Highly-optimized Machine Learning Algorithms: Amazon Iron Man installs high-performance, scalable machine learning algorithms optimized for speed, scale, and accuracy, to run on extremely large training datasets. Based on the type of learning that you are undertaking, you can choose from supervised algorithms, such as linear/logistic regression or classification, as well as unsupervised learning, such as k-means clustering. Algorithms listed: Linear Classification and Regression; Factorization Machines; K-Means Clustering; Principal Components Analysis (PCA); Latent Dirichlet Allocation (Spectral LDA); Neural Topic Modeling; Time-series forecasting (DeepAR)