© 2018 Converseon Inc. Proprietary and Confidential 1
Classification with Memes
Converseon.AI
Natural Language Processing Meetup | 09 May 2019
Michelle A. McSweeney, PhD
● Since 2008, Converseon has been recognized as a leader providing Consumer Intelligence through our award-winning machine learning social intelligence technologies.
● We provide a comprehensive suite of machine learning technology, models and insights, ranging from full turnkey solutions to “DIY” enablement.
● Our “no code required” machine learning as a service platform, Conversus.AI, is revolutionizing social and VoC text analysis and model development by putting the power of the technology directly into the hands of business analysts and subject matter experts, helping to lead the democratization of AI for practical and valuable business use.
● We work with a growing range of leading ecosystem partners to offer our technologies seamlessly to our clients and theirs, including Brandwatch, Sprinklr, Crimson Hexagon, Tableau, and many others.
● We have been honored to work with a wide range of leading brands around the world including IBM, Uber, Dell, J&J, Walmart, and more.
Converseon: An Overview
https://blog.playment.io/training-data-for-computer-vision/
● Dawkins, 1976, The Selfish Gene
● Memes are deeply cultural artifacts
● Spread virally
● Quickly change and transform
● Often humorous
● Cultural (self) criticism
An Internet meme is a piece of culture, typically a joke, which gains influence
through online transmission (Davison, 2012)
What are Memes?
● Familiar
● Relatable
● Quickly understood by the target audience
● Short, succinct, culturally relevant
● Structurally distinct from longer form media
What makes a meme successful?
● Widely adopted
● Immediately understood
● Often culturally (self) critical
● Structurally distinct from earlier techniques
https://www.cnbc.com/2018/05/22/meet-the-2018-cnbc-disruptor-50-companies.html
Memes and our Global Moment
● >15 million rides per DAY
● Operates in 65 countries, 600 cities
● 75 million passengers
● 3 million drivers
● about 1% of the WORLD population has taken an Uber
● about 13% of the US population has taken an Uber
Some notes on Uber
Basic Classification Task
● Three-way sentiment
○ Positive
○ Negative
○ Neutral
● Nine-way emotions
○ Joy
○ Trust
○ Anger
○ Disgust
○ Fear
○ Sadness
○ Surprise
○ Anticipation
○ Other
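The two label sets above can be written down as a fixed schema so that annotators and models share one vocabulary. This is a minimal sketch, not Converseon's actual data model: the class names, field names, and the example post are all made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Sentiment(Enum):
    """Three-way sentiment labels from the slide."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


class Emotion(Enum):
    """Nine-way emotion labels from the slide."""
    JOY = "joy"
    TRUST = "trust"
    ANGER = "anger"
    DISGUST = "disgust"
    FEAR = "fear"
    SADNESS = "sadness"
    SURPRISE = "surprise"
    ANTICIPATION = "anticipation"
    OTHER = "other"


@dataclass(frozen=True)
class LabeledPost:
    """One annotated post (hypothetical structure, for illustration only)."""
    text: str
    sentiment: Sentiment
    emotion: Emotion


# Invented example post, not taken from real data.
example = LabeledPost(
    text="my uber driver waited for me, what a hero",
    sentiment=Sentiment.POSITIVE,
    emotion=Emotion.JOY,
)
```

Using enums rather than free-text strings means a misspelled label fails immediately instead of silently creating a tenth emotion class.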
● Open up your favorite [social] media app (Twitter / Instagram / Reddit / the Internet broadly conceived)
● Search for “uber”
● Look at the headlines/previews of the first 3 posts
● Are they positive/negative/neutral?
● Turn to your neighbor(s), introduce yourself, share your results
Uber Data: A Group Activity
Positive:
Negative:
Neutral:
Impossible to classify:
Not really about Uber at all:
Other:
Uber Data: A Group Activity
Uber on Social Media: The easy cases
Uber on Social Media: The hard cases
Uber on Social Media: the Memes
Some Solutions
Converseon.AI
Most Uber data is on Twitter and other short-form platforms; with only 160 to 320 characters per post, every feature is essential
● Respect the stopwords
● Don’t strip any punctuation, etc.
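One way to follow the two bullets above is a tokenizer that deliberately skips the usual stopword-removal and punctuation-stripping steps. The sketch below is an assumption about how that could look, not the pipeline used in the talk; the `tokenize` and `ngrams` names and the regular expression are illustrative.

```python
import re

# Words (with internal apostrophes), @mentions and #hashtags are kept whole;
# every other non-space symbol becomes its own token, so "!!!" yields three
# tokens instead of disappearing.
TOKEN_PATTERN = re.compile(r"[@#]?\w+(?:'\w+)*|[^\w\s]")


def tokenize(text: str) -> list[str]:
    """Lowercase and split a post, keeping stopwords and punctuation."""
    return TOKEN_PATTERN.findall(text.lower())


def ngrams(tokens: list[str], n: int = 2) -> list[tuple[str, ...]]:
    """Contiguous n-grams, so short function words can act as features."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize("I am NOT ok with this!!!")
bigrams = ngrams(tokens)
```

With stopwords kept, a bigram such as `("not", "ok")` survives as a feature, and the repeated exclamation marks remain available as a signal of intensity.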
*Google Books, COCA, McSweeney, 2012
Problem Categories: Short Form Media
Writing   Speech   Twitter
the       the      the
be        be       i
to        and      to
of        of       a
and       a        and
Top 5 words in 3 English Modalities
There is a particular type of humor with Uber & other transportation companies:
● drinking
● getting lost
● random vehicles showing up
● music
Train the models to identify this language by building classifiers for one industry at a time
Problem Categories: Industry-Specific
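Building classifiers one industry at a time amounts to partitioning the labeled data by industry and training a separate model on each partition. The sketch below shows only that routing step; `train_per_industry` and `train_majority` are hypothetical names, and the majority-label trainer is a placeholder standing in for whatever real model would be used.

```python
from collections import Counter, defaultdict
from typing import Callable, Iterable

Example = tuple[str, str, str]          # (industry, text, label)
Classifier = Callable[[str], str]       # text -> predicted label
Trainer = Callable[[list[tuple[str, str]]], Classifier]


def train_per_industry(examples: Iterable[Example], train: Trainer) -> dict[str, Classifier]:
    """Group labeled posts by industry and fit one classifier per group."""
    by_industry = defaultdict(list)
    for industry, text, label in examples:
        by_industry[industry].append((text, label))
    return {industry: train(rows) for industry, rows in by_industry.items()}


def train_majority(rows: list[tuple[str, str]]) -> Classifier:
    """Placeholder trainer: always predicts the most common label it saw."""
    majority = Counter(label for _, label in rows).most_common(1)[0][0]
    return lambda text: majority
```

Calling `train_per_industry(examples, train_majority)` returns a dictionary keyed by industry, so a rideshare post is only ever scored by the model that has seen rideshare humor.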
People complain more than they praise on Twitter and similar platforms. How do we teach our models positivity?
● Sample ruthlessly
● Create training data as needed
● Oversample likely-positive data for coding
Problem Categories: Negativity
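Oversampling likely-positive data for coding (read here as manual labeling, which is an assumption) can be sketched as a biased draw from the unlabeled pool. Everything below is illustrative: the cue list, the function names, and the 50% default share are invented, and the cue match only decides what gets sent to annotators, not the final label.

```python
import random

# Hypothetical seed list of positive cues; a real list would be tuned per industry.
POSITIVE_CUES = {"love", "thanks", "thank", "great", "awesome", "best"}


def looks_positive(text: str) -> bool:
    """Cheap heuristic: does any word match a positive cue?"""
    return any(word.strip(".,!?") in POSITIVE_CUES for word in text.lower().split())


def annotation_batch(pool: list[str], size: int, positive_share: float = 0.5, seed: int = 0) -> list[str]:
    """Draw a batch for annotators in which likely-positive posts are over-represented."""
    rng = random.Random(seed)
    likely = [t for t in pool if looks_positive(t)]
    rest = [t for t in pool if not looks_positive(t)]
    n_likely = min(len(likely), round(size * positive_share))
    n_rest = min(len(rest), size - n_likely)
    batch = rng.sample(likely, n_likely) + rng.sample(rest, n_rest)
    rng.shuffle(batch)  # avoid showing annotators all likely-positive posts first
    return batch
```

Because the heuristic is crude, the batch still goes to human coders; the point is only to raise the share of positive examples they encounter above the share found in a uniform sample.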
Memes change more rapidly than any other language form
● Constantly train classifiers to adapt to the most recent linguistic features
● Remove old training data once it is no longer relevant
● Monitor performance and enhance as needed
Problem Categories: Constantly Evolving
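The last two bullets can be sketched as a recency filter on the training set plus a rolling accuracy check on freshly labeled posts. This is an illustration under stated assumptions: age is used as a stand-in for "no longer relevant", and the 90-day window, 500-post window, and 0.8 threshold are arbitrary placeholders rather than values from the talk.

```python
from collections import deque
from datetime import datetime, timedelta


def recent_examples(examples, now: datetime, max_age_days: int = 90):
    """Keep (timestamp, text, label) rows newer than the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    return [row for row in examples if row[0] >= cutoff]


class RollingAccuracy:
    """Track accuracy over the most recent predictions that have gold labels."""

    def __init__(self, window: int = 500, threshold: float = 0.8):
        self.outcomes = deque(maxlen=window)  # oldest outcomes fall off automatically
        self.threshold = threshold

    def record(self, predicted: str, actual: str) -> None:
        self.outcomes.append(predicted == actual)

    @property
    def accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self) -> bool:
        acc = self.accuracy
        return acc is not None and acc < self.threshold
```

When `needs_retraining()` turns true, the classifier would be refit on `recent_examples(...)`, which ties the monitoring bullet back to the data-retirement bullet.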
● Constantly evolving our technology to meet new demands: experimenting and iterating to ensure that we stay in front of problems before they arise.
● Rethinking common practices to get the most out of our data based on the problem at hand.
Conclusion: Looking Back to Look Forward
Thank You