Tracing Multisensory Food
Experience on Twitter
Matīss Rikters
July 13, 2021
Outline
• Project overview
• Dataset collection, processing, annotation
• Sentiment analysis, named entity recognition, question answering
• Dataset analysis, aspects from cognitive science
Project overview
• Started as my bachelor’s thesis in 2011
• Has been running ever since with few disruptions
• https://tvitediens.tk/ - go check it out
• Has its own Twitter account https://twitter.com/Twitediens
• Every day it tweets
• 5 most mentioned foods of the last 24 hours
• 5 most active users of the last 24 hours
• A random recommendation for lunch
• Twitter users occasionally interact with it
Main keywords
taste lunch beet potato mandarin sweet
eat feast bun cabbage sauce mushroom
breakfast drink carrot candy pancake onion
dine treat chips sour cream dumpling chocolate
dinner nom vegetable cream soup gingerbread tea
bite appetite meat cake rice tomato
meal orange Hesburger drink salad grape
food apple coffee McDonald's ice cream strawberry
Dataset overview
https://github.com/Usprogis/Latvian-Twitter-Eater-Corpus
Domain-specific corpus about food and eating, written in Latvian
2.4M+ tweets
• ~5,500 + 744 tweets with manually annotated sentiment (positive, neutral,
negative) for training and testing
• 744 tweets with manually annotated named entity classes of person names,
locations, organizations, food and drinks, and miscellaneous named entities, like
• ~43,000 automatically aggregated question-answer tweet pairs
• ~155,000 tweets have images
• ~165,000 have location info
Data collection
• A script uses the Twitter streaming API to listen for any of the 363
pre-defined keywords
• Records the latest tweets in a large MySQL database for storage and a
separate database for displaying data from the last 3 months online
• At the beginning of each month a scheduled script converts data from
the database into JSON format for easier processing
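The collection steps above could be sketched roughly as follows. This is only an illustration, not the project's actual script: the keyword subset, the tweet field names (`id`, `text`, `created_at`) and both function names are assumptions, and the streaming API and MySQL storage are replaced by plain Python lists.

```python
import json
from datetime import datetime

# Hypothetical subset of the 363 keywords (English glosses from the slides).
KEYWORDS = {"lunch", "coffee", "pancake", "tea"}

def matches_keywords(text, keywords=KEYWORDS):
    """Return True if any keyword occurs as a whole token in the tweet text."""
    tokens = {token.strip(".,!?:;\"'()").lower() for token in text.split()}
    return not tokens.isdisjoint(keywords)

def export_month(tweets, year, month):
    """Serialize the tweets created in the given month as a JSON string."""
    selected = []
    for tweet in tweets:
        created = datetime.fromisoformat(tweet["created_at"])
        if created.year == year and created.month == month:
            selected.append(tweet)
    return json.dumps(selected, ensure_ascii=False, indent=2)

# Invented example tweets standing in for the incoming stream.
tweets = [
    {"id": 1, "text": "Coffee and a pancake for breakfast!", "created_at": "2020-03-01T08:15:00"},
    {"id": 2, "text": "Nice weather today", "created_at": "2020-03-02T12:00:00"},
    {"id": 3, "text": "Tea time", "created_at": "2020-04-01T17:30:00"},
]
food_tweets = [t for t in tweets if matches_keywords(t["text"])]
march_json = export_month(food_tweets, 2020, 3)
```

Here `ensure_ascii=False` keeps Latvian characters readable in the exported file instead of escaping them.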
Data processing
Experiments
• Sentiment analysis – about 5,500 tweets annotated for training and
744 as a test dataset
• Named entity recognition – the same 744 tweets annotated with
place, person, food, time, and misc entities
• Question answering – about 19,000 tweets that express questions,
along with any replies to them, make up about 43,000
question-answer tweet pairs
• Multimodal experiments – about 155,000 tweets have images, but
experiments are still in progress with no outstanding results just yet
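As a rough illustration of how question tweets and their replies might be aggregated into pairs: the sketch below assumes a tweet counts as a question when it contains a question mark, and the field names (`id`, `text`, `in_reply_to`) are hypothetical. The slides do not describe the actual selection rule.

```python
def build_qa_pairs(tweets):
    """Pair every question tweet with each tweet that replies to it."""
    # Assumption: a tweet "expresses a question" if it contains "?".
    questions = {t["id"]: t["text"] for t in tweets if "?" in t["text"]}
    pairs = []
    for tweet in tweets:
        parent_id = tweet.get("in_reply_to")
        if parent_id in questions:
            pairs.append((questions[parent_id], tweet["text"]))
    return pairs

# Invented example thread.
tweets = [
    {"id": 1, "text": "Where can I get good pancakes?", "in_reply_to": None},
    {"id": 2, "text": "Try the cafe on the corner.", "in_reply_to": 1},
    {"id": 3, "text": "Make them at home!", "in_reply_to": 1},
    {"id": 4, "text": "I love tea.", "in_reply_to": None},
    {"id": 5, "text": "Me too.", "in_reply_to": 4},
]
qa_pairs = build_qa_pairs(tweets)
```

Because one question can receive several replies, the number of pairs can exceed the number of question tweets, which is consistent with about 19,000 questions yielding about 43,000 pairs.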
Sentiment analysis
Training Data       TE      MP      MP+PE   TE+MP   TE+MP+RV+PE+NI   TE+MP+RV+PE
Naive Bayes         53.21   43.32   45.72   56.55   59.63            58.02
Perceptron          53.07   52.67   53.47   57.87   57.33            58.27
Stemmed
Naive Bayes         53.74   46.39   50.67   58.16   60.56            61.23
Perceptron          56.67   53.73   54.13   60.00   56.93            57.73
Lemmas
Naive Bayes         53.88   45.45   49.60   56.42   58.42            59.63
Perceptron          54.41   51.07   53.07   57.35   56.95            56.95
Stemmed Lemmas
Naive Bayes         54.41   45.99   49.33   57.62   59.63            59.63
Perceptron          53.34   51.47   52.67   58.29   56.68            57.09
Multilingual BERT   68.32
Latvian BERT        74.06
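For readers unfamiliar with the Naive Bayes baseline in the table, a minimal multinomial Naive Bayes classifier with add-one smoothing might look like the sketch below. It is not the code of the sentiment analysis toolkit; the class name, the whitespace tokenization and the English toy tweets are all invented for illustration.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Toy multinomial Naive Bayes over whitespace tokens."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(doc_count / total_docs)  # log prior
            for word in text.lower().split():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                # Add-one (Laplace) smoothed log likelihood.
                score += math.log((counts[word] + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented training examples, two per sentiment class.
texts = ["this soup is tasty", "tasty pancakes today",
         "the coffee was awful", "awful cold soup",
         "eating lunch now", "lunch at noon"]
labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
model = NaiveBayes().fit(texts, labels)
prediction = model.predict("tasty soup")
```

Stemming or lemmatizing the tokens before `fit` and `predict` would correspond to the "Stemmed" and "Lemmas" rows of the table.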
How to determine sentiment?
It was difficult to agree on the sentiment of some tweets
Consider these:
• “Batars tak arī viņus ēda paļube tgd mums no 9 izlabos uz 3 :D”
• “Batars was also eating them and now our grades will be marked from 9 to 3 :D”
• “Ja vēlies pazaudēt pāris kilogramus, izrauj savus zobus! Tad arī turpmāk būs
grūti apēst parāk daudz”
• “If you want to lose weight, just pull out your teeth! Then it is going to be
difficult to eat too much”
Sentiment over time
Relations to smell, taste and temperature
Fun fact – cold soup is very popular in Latvia
Relations to smell, taste and temperature
Food/Drink      Pleasant smell    Bad smell    Food/Drink
Tea             598               156          Meat
Chocolate       408               96           Tea
Coffee          386               93           Fish
Gingerbread     293               67           Cheese
Tangerines      244               65           Garlic
Strawberries    227               57           Coffee
Apple           220               48           Potatoes
Meat            183               39           Egg
Potatoes        142               38           Onion
Pancakes        142               37           Chocolate
Question answering
• We performed a small-scale human evaluation by asking
5 annotators to evaluate a random 10% of the evaluation set,
marking generated answers as either OK or not good (NG).
• The evaluators marked 46.40% of answers as OK.
• The evaluators had an overall agreement of 66.27%
(free-marginal kappa 0.33), which indicates moderate agreement.
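The relation between the two agreement figures can be checked with the free-marginal kappa formula, kappa = (P_o - 1/k) / (1 - 1/k), where P_o is the observed agreement and k the number of categories (here 2: OK and NG). The sketch below is a generic implementation with assumed function names and invented ratings, not the script used for the evaluation.

```python
from collections import Counter

def observed_agreement(ratings):
    """Mean proportion of agreeing rater pairs per item.

    `ratings` is a list of per-item label lists, one label per rater.
    """
    total = 0.0
    for labels in ratings:
        n = len(labels)
        counts = Counter(labels)
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        total += agreeing_pairs / (n * (n - 1))
    return total / len(ratings)

def free_marginal_kappa(p_observed, num_categories):
    """Free-marginal kappa: chance agreement is fixed at 1 / num_categories."""
    p_expected = 1.0 / num_categories
    return (p_observed - p_expected) / (1.0 - p_expected)

# Reported overall agreement of 66.27% with two categories (OK, NG).
kappa = free_marginal_kappa(0.6627, 2)

# Invented ratings from 5 annotators for a single answer.
example_agreement = observed_agreement([["OK", "OK", "OK", "NG", "NG"]])
```

With the reported 66.27% agreement and two categories this gives roughly 0.325, which rounds to the 0.33 shown on the slide.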
Dataset overview over time
[Figure: tweets and Twitter users per year, 2011 to 2020. Axes: Tweets; Twitter Users. Legend text: Tweets Trend Twitter Instagram Twitter Users]
Most popular foods and drinks
Food Count Drink Count
Chocolate 117,235 Tea 163,338
Ice cream 86,109 Coffee 120,040
Meat 85,574 Juice 18,179
Potatoes 70,135 Water 15,692
Salads 61,616 Beer 14,845
Cake 52,267 Cocktails 8,207
Soup 46,545 Coca-cola 5,016
Pancakes 40,203 Alcohol 4,766
Sauce 40,201 Champagne 3,673
Apple 36,571 Vodka 2,802
Trends and analysis
• Seasonal trends
• Large noticeable trends
• In-depth look into specific foods
Large trends
[Figure: weekly tweet counts, weeks 1 to 52. Axes: Meat Tweet Count (0 to 2500); Butter, Sprat, Buckwheat Tweet Count (0 to 90); Week. Series: Butter 2017, Sprats 2015, Salmon 2016, Buckwheat 2020, Meat 2013. Annotated peaks: 32 Drastic price increase; 65 Russia import ban; 86 COVID-19 panic buying; 2666 Horse meat burgers]
Horse meat 2013
Seasonal trends
[Figure: weekly tweet counts, weeks 1 to 52 (January to December). Axes: Ice cream, Strawberries; Chocolate, Tangerines, Gingerbread. Series: Gingerbread, Chocolate, Tangerines, Strawberries, Ice cream]
Timeline of tea tweets
[Figure: Tea, 10.2011 to 6.2020. Series: Green, Tasty, Sweet, Bitter, Black, White]
Timeline of coffee tweets
[Figure: Coffee 2011-2020 Normalized, 10.2011 to 8.2020. Series: Black, Tasty, Sweet, White, Bitter, Green]
Pancakes during the week / time of day
Monday Tuesday Wednesday Thursday Friday Saturday Sunday
Morning 1,107 1,128 1,122 1,049 1,221 1,617 1,887
Afternoon 2,122 2,071 2,015 2,030 2,236 2,704 3,410
Evening 2,133 2,171 2,096 2,044 1,810 1,856 2,515
Night 615 603 609 601 588 583 668
Salad during the week / time of day
Monday Tuesday Wednesday Thursday Friday Saturday Sunday
Morning 3,613 3,679 3,521 3,399 3,265 1,148 1,146
Afternoon 3,628 3,219 3,071 3,057 2,658 2,630 2,970
Evening 2,838 2,852 2,682 2,619 2,187 2,241 2,696
Night 923 904 908 883 776 678 899
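Tables like the two above can be produced by bucketing tweet timestamps by weekday and time of day. The slides do not state where the day is split, so the boundaries below (night 0-6, morning 6-12, afternoon 12-18, evening 18-24) are assumptions, as are the function names and the example timestamps.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def time_of_day(hour):
    """Map an hour (0-23) to a bucket; the boundaries are assumed."""
    if 6 <= hour < 12:
        return "Morning"
    if 12 <= hour < 18:
        return "Afternoon"
    if 18 <= hour < 24:
        return "Evening"
    return "Night"

def count_by_weekday_and_time(timestamps):
    """Count ISO-formatted timestamps per (weekday, time of day) cell."""
    counts = Counter()
    for stamp in timestamps:
        moment = datetime.fromisoformat(stamp)
        counts[(WEEKDAYS[moment.weekday()], time_of_day(moment.hour))] += 1
    return counts

# Invented timestamps of tweets mentioning pancakes.
pancake_times = [
    "2020-03-01T09:30:00",  # Sunday morning
    "2020-03-01T13:00:00",  # Sunday afternoon
    "2020-03-07T10:00:00",  # Saturday morning
    "2020-03-08T11:15:00",  # Sunday morning
    "2020-03-02T02:00:00",  # Monday night
]
table = count_by_weekday_and_time(pancake_times)
```

Indexing `WEEKDAYS` with `weekday()` avoids locale-dependent day names.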
Some conclusions so far, more to come
Large-scale social network data can help us better understand the relationship between
people and food, and shape strategies and tactics for nudging towards healthier (but not
necessarily less tasty) food behavior
By researching food-related behavior on social media we can move from fragmented and
valuable data to a better understanding of food choice and the sentiment
associated with it
Our results reveal that negative sentiment about meat on Twitter is rising
steadily; however, a large share of tweets remain neutral
Neutrality has, however, decreased sharply since the beginning of the COVID-19 pandemic
Publications
• Sproģis, U., Rikters, M. (2020). What Can We Learn From Almost a Decade of
Food Tweets. In The 9th Conference on Human Language Technologies - the Baltic
Perspective.
• Kāle, M., Rikters, M. (2021). Fragmented and Valuable: Following Sentiment
Changes in Food Tweets. In Smell, Taste, and Temperature Interfaces. ACM CHI
2021 workshop.
• Kāle, M., Rikters, M., Šķilters, J. (2021). Tracing Multisensory Food Experience on
Twitter. In review for International Journal of Food Design.
All on GitHub
• Website - https://github.com/M4t1ss/TwitEdiens
• Main corpus - https://github.com/Usprogis/Latvian-Twitter-Eater-Corpus
• NER corpus - https://github.com/RinaldsViksna/Latvian-food-NER-corpus
• Sentiment analysis - https://github.com/M4t1ss/sentiment-analysis-toolkit
• Processing scripts - https://github.com/M4t1ss/Latvian-Twitter-Eater-Corpus-Processing
End
