Emirates Advanced Analytics Hackathon
Vera Ekimenko
Hackathon Use Case 1: Problem statement
Use case #1 Social media image analytics to uncover valuable customer insights
Description
More than 3 billion photos are shared daily on social media, and an estimated 85% of those photos lack text
references. It is therefore increasingly important to incorporate social image analytics to draw better
customer insights.
Use image analytics to identify and categorize the scenes, faces, logos, objects, and actions that
passengers post to social media every day.
Classify images from passenger posts as places, brands, food, events and actions. The model should be able
to identify the places the passengers visited or like, the brands the passengers like, and the events the
passengers attended.
Business Benefits
Provide exciting opportunities to uncover more valuable consumer insights from social media:
• What places passengers visited
• What kind of places passengers like
• Special events passengers attend
• What kind of food passengers like
• What brands passengers like
• Whom passengers travel with
• And much more.
This will help in targeted marketing to increase revenue.
Data set to use: Social media postings of passengers
Success criteria: Able to identify at least 3 of the above benefits
Social profiles provide information about customers' interests, hobbies, music
preferences, places they like, role models they follow, etc. This allows you
to segment your customers beyond demographics and communicate with
them in a new, highly personalized manner.
• Enrich Customer Genome with social IDs
• Target customers with the most relevant offers
• Detect and predict emerging trends among the target audience
• Turn new customers into loyal ones through personalized up-sell and cross-sell offers
Random public Instagram account - 1
Brands
Random public Instagram account - 2
Food Food Food
Brands Food
Food Food
Food
Random public Instagram account - 3
Events
Brands
Food
Food
Food
Image categories I picked to train my models
Events
• Concert, Show, Theatre, Opera, Ballet
Sports
• Football, Rugby, Tennis, Horse racing, Racing
Brands
• Gucci, Fendi, Chanel, Hermes, Dior
Food
• Dessert, Wine, Coffee, Seafood, Pizza, Salad, Pasta, Steak, Cake, Chicken, Burger, Fish, Sushi, Chocolate, Beer, Meat, Cheese, Ice cream, Beef, Vegetables, Fries, Salmon, Rice, Bread, Pork, Soup, Pancakes, Tacos, Sandwich
Sample Sports images
Sample Event images
Sample Brands images
Sample Food images
All food images are multi-labelled
Bread
Sea food
Fries
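Because one food image can carry several labels at once, the targets for a multi-label model are commonly encoded as multi-hot vectors. The sketch below illustrates that encoding only; the names `FOOD_CLASSES` and `multi_hot` and the shortened class list are assumptions for illustration, not taken from the original project.

```python
# Illustrative subset of the food categories; the real list is longer.
FOOD_CLASSES = ["bread", "fries", "seafood", "steak", "wine"]

def multi_hot(labels, classes=FOOD_CLASSES):
    """Return a 0/1 vector with a 1 for every class present in `labels`."""
    present = {label.lower() for label in labels}
    unknown = present - set(classes)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if c in present else 0 for c in classes]

# The sample image above is tagged Bread, Seafood and Fries.
print(multi_hot(["Bread", "Seafood", "Fries"]))  # [1, 1, 1, 0, 0]
```

A vector like this would typically be paired with one sigmoid output per class, so each label is predicted independently of the others.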
The prediction system architecture (17 models)
• Food/Non-food
• Multi-label food
• Multi-class brands: Fendi, Dior, Chanel, Gucci, Hermes
• Multi-class events: Ballet, Opera, Concert, Show, Theatre, Football, Rugby, Tennis, Horses, Racing
• Version 1: Transfer Learning using Keras
• Version 2: Places365 model
The project stages and resources
Collection
• Download Instagram images
Labelling
• Manually group images
Training
• Light version of VGG16
Lenovo X220
64-bit Windows 10
Memory: 12GB
No GPU
Collection
15,000 images for each hashtag (18 hashtags in total)
150
150
Hashtags: Concert, Show, Theatre, Opera, Ballet, Football, Rugby, Tennis, Horse racing, Racing, Gucci, Fendi, Chanel, Hermes, Dior, EatingOut, Food, Restaurant
270,000 images (10GB)
Labelling
Manual labelling
• Manually label 200 images for each category:
• 100 - ‘positive’
• 100 - ‘negative’
• Example: "chanel" / "not chanel" (100 + 100)
Helper model
• Train a helper model
• 5-10 epochs
• 85-90% accuracy
Final labelling
• With the helper model, predict/label all 15,000 images
• Manually correct the final labelling
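The helper-model step can be pictured as turning per-image scores into proposed labels that are then corrected by hand. The sketch below is a rough illustration, not the original implementation: the function name `propose_labels`, the score dictionary, and the 0.3 / 0.7 thresholds are all assumptions, and the separate "review" bucket is an added convenience for deciding which images to check first.

```python
def propose_labels(scores, low=0.3, high=0.7):
    """Split helper-model scores into proposed labels and a review queue.

    `scores` maps an image name to the predicted probability of the
    'positive' class. The thresholds are illustrative assumptions.
    """
    positive, negative, review = [], [], []
    for name, p in sorted(scores.items()):
        if p >= high:
            positive.append(name)
        elif p <= low:
            negative.append(name)
        else:
            review.append(name)  # too uncertain: check these by hand first
    return positive, negative, review

# Hypothetical scores for three downloaded images.
scores = {"img_001.jpg": 0.95, "img_002.jpg": 0.10, "img_003.jpg": 0.55}
pos, neg, rev = propose_labels(scores)
print(pos, neg, rev)
```

With a helper model that is only 85-90% accurate, the confident buckets still need the manual correction pass described above.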
Labelling – Instagram noise ratio
hashtag        labelled   ratio
concert        627        1/24
show           1046       1/14
theatre        768        1/20
opera          225        1/66
ballet         969        1/15
football       1247       1/12
rugby          245        1/60
tennis         368        1/40
horseracing    1550       1/10
racing         339        1/44
gucci          492        1/30
fendi          348        1/43
chanel         307        1/50
hermes         1862       1/8
dior           350        1/43
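The ratio column appears to relate the number of usable (labelled) images to the 15,000 images downloaded per hashtag. A minimal sketch of that arithmetic follows, assuming this interpretation is right; the helper name `noise_ratio` is hypothetical, and since the slide's figures look approximate, not every row will match exactly.

```python
DOWNLOADED_PER_HASHTAG = 15_000  # from the Collection slide

def noise_ratio(labelled, downloaded=DOWNLOADED_PER_HASHTAG):
    """Return N such that roughly 1 in N downloaded images was usable."""
    if labelled <= 0:
        raise ValueError("labelled must be positive")
    return round(downloaded / labelled)

for hashtag, labelled in [("concert", 627), ("hermes", 1862)]:
    print(f"{hashtag}: 1/{noise_ratio(labelled)}")
# concert: 1/24
# hermes: 1/8
```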

Labelling helper model
Food labelling
Hashtags processing
• Using NLTK, define a word category for each hashtag
• Select images with hashtag category ‘food’
Manual labelling
• Group and select top 50 food hashtags
• For each food category, manually label images (true/false)
Fries ✓
Steak ✗
Bread ✓
Red wine ✓
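The hashtag-processing steps above could be sketched as follows. The slides do not say which NLTK facility was used; this sketch assumes WordNet lexical categories (such as `noun.food`) and takes only the first noun sense of each word, which is a simplification. The function names `wordnet_category` and `top_food_hashtags` are hypothetical.

```python
from collections import Counter

def wordnet_category(word):
    """Look up a coarse WordNet category such as 'noun.food'.

    Requires NLTK and its WordNet corpus; imported lazily so the
    counting logic below can run without them.
    """
    from nltk.corpus import wordnet as wn
    synsets = wn.synsets(word, pos=wn.NOUN)
    return synsets[0].lexname() if synsets else None

def top_food_hashtags(posts, categorize=wordnet_category, n=50):
    """Count hashtags categorized as 'noun.food' and return the n most common."""
    counts = Counter(
        tag.lower()
        for tags in posts
        for tag in tags
        if categorize(tag.lower()) == "noun.food"
    )
    return [tag for tag, _ in counts.most_common(n)]
```

Here `posts` is a list of hashtag lists, one per image. Passing a different `categorize` function makes it easy to swap in another word-category lookup.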
Training – VGG16 for Brands and Events (v1)
Training – a unified model for Events
It didn’t train well, regardless of the parameters or base model used
Training – Places365 for Events (v2)
Places: A 10 million Image Database for Scene Recognition
Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., & Torralba, A.
IEEE Transactions on Pattern Analysis and Machine Intelligence
Food model (multi-label)
Next step - Segmentation
Customer profile
Thank you

Deep Learning Hackathon
