LEARNING FROM SETS
ANDREW CLEGG
IN A NUTSHELL
ABOUT ME
• Yelp (starting next week!)
• Etsy, Pearson, Last.fm, AstraZeneca, consulting
• Bioinformatics, information retrieval, natural language processing (UCL/Birkbeck)
• Main interests: search, recommendations, personalization
• @andrew_clegg
• http://andrewclegg.org/
LEARNING DEEP REPRESENTATIONS FOR UNORDERED ITEM SETS
LEARNING FROM ITEM COLLECTIONS
PROBLEM STATEMENT
• A lot of real-world data consists of collections of objects
• User’s session on a website (list of events) — ORDERED
• Products in a shopping cart (bag of items) — ORDERED OR NOT
• Product titles (list of words) — ORDERED… OR NOT?
• Songs played in a user’s history (list of items) — ORDERED
• Movies liked in a user’s signup flow (set of items) — UNORDERED
LEARNING FROM ITEM COLLECTIONS
PROBLEM STATEMENT
• Learning representations for variable-length sequences is “easy”
• RNNs, LSTMs, GRUs
• Input = sequence of embeddings
• Output = embedding for whole sequence
• Very effective but not always the cheapest or easiest to train
• But what if the data is unordered?
• What if it’s ordered, but that ordering is uninformative?
HOW CAN WE LEARN A SINGLE EMBEDDING FROM A BAG OR SET OF ITEM EMBEDDINGS?
(WHICH MIGHT NOT WORK VERY WELL)
REALLY SIMPLE APPROACH
• Learn item embeddings in an unsupervised manner
• e.g. “Item2Vec”, Barkan & Koenigstein 2016
• word2vec (skip-gram with negative sampling) on item IDs
• Average them together to get an embedding for the set/bag
• Often used in text mining / IR as a baseline or lower bound
• e.g. “word centroid distance” from Kusner et al 2015
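The averaging step can be sketched in a few lines of plain Python. This is only an illustration: the `EMBEDDINGS` table, its toy 3-dimensional values and the `mean_embedding` helper are made up for the example, standing in for vectors that would come from something like Item2Vec.

```python
from typing import Dict, List, Sequence

# Hypothetical pre-trained item embeddings (toy 3-dimensional values).
EMBEDDINGS: Dict[str, List[float]] = {
    "item_05": [0.2, -0.4, 1.0],
    "item_17": [0.6, 0.0, -1.0],
    "item_23": [-0.2, 1.0, 0.6],
}


def mean_embedding(items: Sequence[str]) -> List[float]:
    """Element-wise mean of the embeddings of the given items."""
    if not items:
        raise ValueError("cannot average an empty collection")
    vectors = [EMBEDDINGS[item] for item in items]
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


basket = ["item_05", "item_17", "item_23"]
forward = mean_embedding(basket)
# Reversing the basket gives the same vector, up to floating-point rounding,
# which is why the mean suits bags and sets.
backward = mean_embedding(list(reversed(basket)))
```

The result has the same dimensionality as a single item embedding, however many items the collection holds.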
[Diagram: embeddings for Item 05, Item 17 and Item 23 are combined by an element-wise mean]
Issues:
• Not task oriented
• Embeddings can’t adapt to problem domain
• No guarantee that taking the mean is the best strategy
LEARN EMBEDDINGS WHILE TRAINING ON A TASK
NEURAL BAG-OF-ITEMS
• Common baseline in NLP tasks: neural bag-of-words
• Initialize embeddings randomly
• Or from unsupervised pre-training, or third-party data
• Take mean (or sometimes sum)
• Feed into network, update embeddings via backprop
[Diagram: embeddings for Item 05, Item 17 and Item 23 are combined by an element-wise mean, which feeds the output layer or rest of network]
• Errors propagate back into embeddings
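To make "errors propagate back into embeddings" concrete, here is a minimal hand-derived sketch of one gradient step through a mean and a single linear output. Everything in it is hypothetical (the toy embeddings, the `params` dictionary, the learning rate and the target value); a real model would use a framework's automatic differentiation rather than manual gradients.

```python
from typing import Dict, List

# Hypothetical toy model: 2-dimensional item embeddings and one linear output.
embeddings: Dict[int, List[float]] = {
    5: [0.1, -0.2],
    17: [0.4, 0.3],
    23: [-0.3, 0.5],
}
params = {"w": [0.5, -0.5], "b": 0.0}


def pooled(items: List[int]) -> List[float]:
    """Element-wise mean of the embeddings for the given item IDs."""
    dim = len(params["w"])
    return [sum(embeddings[i][d] for i in items) / len(items) for d in range(dim)]


def predict(items: List[int]) -> float:
    """Linear output on top of the mean embedding."""
    mean = pooled(items)
    return sum(w * m for w, m in zip(params["w"], mean)) + params["b"]


def sgd_step(items: List[int], target: float, lr: float = 0.1) -> float:
    """One gradient step on squared error; returns the loss before the step."""
    mean = pooled(items)
    error = predict(items) - target
    grad_pred = 2.0 * error  # derivative of (pred - target)^2 w.r.t. pred
    # The mean passes an equal share of the gradient to every item in the bag.
    grad_item = [grad_pred * w / len(items) for w in params["w"]]
    params["w"] = [w - lr * grad_pred * m for w, m in zip(params["w"], mean)]
    params["b"] -= lr * grad_pred
    for i in items:
        embeddings[i] = [e - lr * g for e, g in zip(embeddings[i], grad_item)]
    return error ** 2


bag = [5, 17, 23]
loss_before = sgd_step(bag, target=0.6)
loss_after = (predict(bag) - 0.6) ** 2
```

After the step the item vectors themselves have moved, not just the output weights, which is the difference from averaging fixed, pre-trained embeddings.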
COMPOSE EMBEDDINGS VIA NON-LINEAR TRANSFORMATIONS
DEEP AVERAGING NETWORKS
• “Deep Unordered Composition Rivals Syntactic Methods for Text Classification” (Iyyer et al 2015)
• Developed for sentiment classification & question answering
• Proposed as a cheap alternative to recursive neural networks
• In a nutshell:
• Don’t use mean of embeddings directly
• Take mean and pass it through some fully-connected layers
• Probably prior art somewhere?
[Diagram: embeddings for Item 05, Item 17 and Item 23 are combined by an element-wise mean, which passes through FC1 and FC2 before the output layer or rest of network]
• Errors propagate back into FC layers and embeddings
• Activation of last FC layer is representation of whole set
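A forward pass through this architecture might look like the sketch below. It is an assumption-laden toy, not the code from the paper or the talk: `TinyDAN`, its random initialisation, the zero biases and the sizes are all invented for illustration, and training is left out entirely.

```python
import random
from typing import List, Sequence

Vector = List[float]


def relu(v: Vector) -> Vector:
    return [max(0.0, x) for x in v]


def dense(v: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Fully-connected layer with one row of weights per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def random_matrix(rng: random.Random, rows: int, cols: int) -> List[Vector]:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TinyDAN:
    """Hypothetical deep averaging network: mean, FC1, FC2, linear output."""

    def __init__(self, n_items: int, dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.embeddings = random_matrix(rng, n_items, dim)
        self.fc1 = random_matrix(rng, dim, dim)
        self.fc2 = random_matrix(rng, dim, dim)
        self.bias1 = [0.0] * dim
        self.bias2 = [0.0] * dim
        self.out_w = [rng.uniform(-0.5, 0.5) for _ in range(dim)]

    def represent(self, item_ids: Sequence[int]) -> Vector:
        """Activation of the last FC layer: the representation of the whole set."""
        vectors = [self.embeddings[i] for i in item_ids]
        mean = [sum(col) / len(vectors) for col in zip(*vectors)]
        hidden = relu(dense(mean, self.fc1, self.bias1))
        return relu(dense(hidden, self.fc2, self.bias2))

    def predict(self, item_ids: Sequence[int]) -> float:
        return sum(w * h for w, h in zip(self.out_w, self.represent(item_ids)))


model = TinyDAN(n_items=10, dim=4)
rep_a = model.represent([5, 7, 3])
rep_b = model.represent([3, 5, 7])  # same set, different order
```

Because the only place the items meet is the mean, `rep_a` and `rep_b` agree up to floating-point rounding; all the non-linearity sits after the pooling.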
“THE DEEP LAYERS OF THE DAN
AMPLIFY TINY DIFFERENCES IN THE
VECTOR AVERAGE THAT ARE
PREDICTIVE OF THE OUTPUT LABELS.”
(Iyyer et al)
“I really loved Rosamund Pike’s performance in the movie Gone Girl”
• Variants shown on the slide: “loved” swapped for “liked” or “despised”
• All three sentences have very similar vector means
REMOVING ENTIRE EMBEDDINGS FROM THE MEAN
WORD DROPOUT
• Additional contribution: alternative dropout scheme
• Don’t add dropout after fully-connected layers
• Instead, randomly drop words from the input sentences
• Maybe somewhat specific to sentiment and question answering?
• Most words in a sentence don’t affect the sentiment
• Most words in a sentence don’t describe the actual answer
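A minimal sketch of the idea, applied to item IDs rather than words: each item is dropped independently with some probability before the mean is taken, and only during training. The function name `drop_items`, the `drop_prob` parameter and the keep-at-least-one fallback are assumptions made for this example, not details taken from the paper.

```python
import random
from typing import List, Sequence


def drop_items(
    items: Sequence[int],
    drop_prob: float,
    rng: random.Random,
    training: bool = True,
) -> List[int]:
    """Randomly remove whole items from the input before averaging."""
    if not training or drop_prob <= 0.0 or not items:
        return list(items)
    kept = [item for item in items if rng.random() >= drop_prob]
    # Assumption: if every item was dropped, keep one so the mean is defined.
    if not kept:
        kept = [rng.choice(list(items))]
    return kept


rng = random.Random(42)
train_view = drop_items([1, 2, 3, 4, 5], drop_prob=0.3, rng=rng)
eval_view = drop_items([1, 2, 3, 4, 5], drop_prob=0.3, rng=rng, training=False)
```

At evaluation time the input passes through unchanged, so the model always sees the full collection when making predictions.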
DEEP AVERAGING NETWORKS FOR ECOMMERCE DATA
PREDICTING GROCERY RE-ORDERS
INSTACART KAGGLE CONTEST
Simplified version of task, for trying out DANs:
• Given previous order (n of ~50K products)…
• Predict what % of items in it will be re-ordered in next order
• Use only the items in the previous order (not user, metadata etc.)
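One way to read the regression target is sketched below, assuming it is the share of distinct products in the previous order that also appear in the next one. The helper name `reorder_fraction` and the choice to return a fraction between 0 and 1 (multiply by 100 for a percentage) are assumptions; the dataset's own labels may be defined differently.

```python
from typing import Iterable


def reorder_fraction(previous_order: Iterable[int], next_order: Iterable[int]) -> float:
    """Fraction of distinct items in the previous order that recur in the next."""
    previous = set(previous_order)
    if not previous:
        raise ValueError("previous order must contain at least one item")
    return len(previous & set(next_order)) / len(previous)


# Hypothetical product IDs: two of the four previous items come back.
target = reorder_fraction([1, 2, 3, 4], [2, 4, 9])
```

The model input is then just the set of product IDs in the previous order, and `target` is the single number it has to predict.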
TRAIN ON 2893386 SAMPLES, VALIDATE ON 321488 SAMPLES
DAN VS GRU HEAD-TO-HEAD
DAN (input: unordered item IDs)
• Dim-50 item embedding (2484450 trainable params)
• Mean + 2x dim-50 dense ReLU layers (5100 trainable params)
• Single linear output (51 trainable params)
GRU (input: ordered item IDs)
• Dim-50 item embedding (2484450 trainable params)
• GRU with 25 units + ReLU activation (5700 trainable params)
• Single linear output (26 trainable params)
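The parameter counts on this slide can be reproduced with some simple arithmetic. The sketch assumes standard dense layers (weights plus one bias per unit) and a GRU with three gates and a single bias vector per gate; GRU variants that keep a second bias vector would count more. The 49689 embedding rows are inferred from 2484450 / 50 and are not stated in the slides.

```python
EMBEDDING_DIM = 50
EMBEDDING_PARAMS = 2484450  # figure quoted on the slide

# Inferred, not stated in the slides: number of rows in the embedding table.
vocab_rows = EMBEDDING_PARAMS // EMBEDDING_DIM


def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out


def gru_params(n_in: int, units: int) -> int:
    """Three gates, each with input weights, recurrent weights and one bias."""
    return 3 * (units * (n_in + units) + units)


dan_hidden = 2 * dense_params(50, 50)  # two dim-50 dense layers
dan_output = dense_params(50, 1)       # single linear output
gru_layer = gru_params(50, 25)         # GRU with 25 units on dim-50 inputs
gru_output = dense_params(25, 1)       # single linear output

dan_total = EMBEDDING_PARAMS + dan_hidden + dan_output
gru_total = EMBEDDING_PARAMS + gru_layer + gru_output
```

Under these assumptions the two models are almost the same size (2489601 vs 2490176 parameters), and in both the embedding table accounts for nearly all of them.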
TRAINED WITH ADAM (ALL DEFAULTS) ON GOOGLE GPU BOX
DAN VS GRU HEAD-TO-HEAD
DAN
• Batch size: 100
• MSE loss
• One epoch: 4 minutes
• Mean training loss: 0.0631
• Validation loss: 0.0626
• Competitive result in minutes
GRU
• Batch size: 100
• MSE loss
• One epoch: 5 hours
• Mean training loss: 0.0626
• Validation loss: 0.0614
• Slightly better result… in hours!
DAN MATCHED GRU PERFORMANCE IN 12 MINUTES
DAN VS GRU HEAD-TO-HEAD
[Chart: DAN training and validation loss by epoch, epochs 1 to 5, loss axis from 0.0568 to 0.0640; marked level at 0.0615 ≈ GRU performance after 5 hours]
SOME REMARKS
DAN VS GRU HEAD-TO-HEAD
• Tried ‘neural bag-of-items’ (no hidden layers) for comparison
• Training time per epoch similar to DAN (few secs faster)
• Validation loss flattened out at 0.063 (worse than DAN at epoch 0)
• Not a thorough investigation — no hyperparameter search
• No dropout, weight decay, batch norm, etc.
• Item dropout (i.e. word dropout) didn’t seem to help
• Unlike text mining tasks, all items in bag are (potentially) important
ANY QUESTIONS?
THANKS!
• Code available on GitHub: andrewclegg/insta-keras
• Feel free to grab me afterwards to chat about anything
• Or ping me on Twitter: @andrew_clegg

Applied AI - 2017-07-11 - Learning From Sets
