Clustering, Analysis And Visualization Of Social Media Big Data by Braulio Medina of Vortio - Presented at Insight Innovation eXchange LATAM 2013


In the last couple of years, a completely new business area has arisen to help brands deal with the new communication paradigm in which each individual is a potential mass communicator and influencer through social media. This area is called social media monitoring and analytics. While it is changing the way market research has been done so far, it poses new problems: the high volume of information, the difficulty of analysing subjective and dispersed data, and the need for better visualization techniques.

Even though hundreds of tools and processes have been created to monitor and analyse social media data, arguably none of them truly allows the mass analysis of high volumes of data in a reasonable amount of time.

That's why we decided to focus our efforts on developing sophisticated algorithms that automatically detect the most important conversations using clustering techniques, together with a methodology to score those conversations, providing more precise metrics and better reasoning on social media data. In addition, we noticed that it is equally important to display the insights gathered in an intuitive manner, using visualization techniques for a clearer translation of data into context.
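To make the idea of detecting conversations by clustering concrete, here is a minimal sketch in Python. It is not Vortio's algorithm, which the talk does not describe. It assumes a simple greedy grouping based on Jaccard similarity between word sets, and the names `tokens`, `jaccard`, `cluster_mentions` and the `threshold` value are hypothetical. The sample mentions are quoted from slides 8 and 15 of the transcript.

```python
import re


def tokens(text):
    """Lowercase a mention and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets: |intersection| / |union|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_mentions(mentions, threshold=0.5):
    """Greedy single-pass clustering (an assumed, illustrative method).

    Each mention joins the first cluster whose seed mention is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters = []  # list of (seed_tokens, members)
    for mention in mentions:
        current = tokens(mention)
        for seed, members in clusters:
            if jaccard(seed, current) >= threshold:
                members.append(mention)
                break
        else:
            clusters.append((current, [mention]))
    return [members for _, members in clusters]


mentions = [
    'Love Is Like ADIDAS And NIKE . " Nothing Is ImpossibLe , JUST DO IT',
    'Adidas said " Nothing is impossible" so Nike came up and said " Just do it',
    'Life=Adidas+Nike> "nothing is impossible" so "just do it"',
    'Beer lovers accuse Anheuser-Busch of watering down brews',
]
clusters = cluster_mentions(mentions)
for members in clusters:
    print(len(members), "mention(s), e.g.:", members[0])
```

The three Adidas/Nike variants share most of their words and fall into one cluster, while the beer headline starts its own. A production system would likely need something more robust than a single pass over seeds, but the sketch shows why near-duplicate memes can be classified in bulk rather than one mention at a time.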

It will be shown how clustering techniques considerably reduce the time required for analysis while delivering better metrics and contextualization. Metrics like sentiment, reach and influence will be presented in an innovative way, in which clusters are analyzed and carry different weights. It will also be shown how these clustering techniques, along with other mathematical tools for processing social media data, should lay the groundwork for the next generation of social media analysis, brand valuation and market research.
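The weighted metric mentioned here (slide 11 calls it a "weighted average of clusters sentiment score") can be sketched as follows. The talk does not say how clusters are weighted, so this example assumes the weight is simply the number of mentions in each cluster. The `Cluster` class, the function name, and all sample values are hypothetical and purely illustrative; the sentiment scale of -1.0 to +1.0 mirrors the -100% to +100% range shown on slide 14.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    label: str        # short description of the conversation
    mentions: int     # number of mentions in the cluster (assumed weight)
    sentiment: float  # cluster-level score from -1.0 to +1.0


def weighted_sentiment(clusters):
    """Mention-weighted average of cluster sentiment scores."""
    total = sum(c.mentions for c in clusters)
    if total == 0:
        raise ValueError("no mentions to average over")
    return sum(c.mentions * c.sentiment for c in clusters) / total


# Illustrative values only; they are not taken from the presentation.
sample = [
    Cluster("positive ad campaign", mentions=300, sentiment=0.5),
    Cluster("product complaint", mentions=100, sentiment=-0.5),
    Cluster("neutral news item", mentions=100, sentiment=0.0),
]
score = weighted_sentiment(sample)
print(f"weighted sentiment: {score:+.2f}")
```

Because each cluster is scored once and then weighted, a large conversation moves the overall metric more than a small one, and the analyst only has to classify clusters rather than individual mentions.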

  • Good morning, ladies and gentlemen. My name is Braulio Medina. I am a mathematician and the co-founder of Vortio, a company focused on social media monitoring, analysis and research. I am going to show you some recent challenges.
  • The urge to turn conversations into numbers. Here we are talking about human behaviour, and here we are talking about mathematics.
  • If you can listen to what social customers are saying, you will have an enormous competitive advantage. There has never been so much information available about ourselves and about other players in the marketplace. Social media gave us the power.
  • 13 days, 168k mentions. How many people would be necessary to compare ... How many things can be inferred but can be wrong? And what is worse, we don't know what those conversations are about.
  • 13 days 168k mentions
  • Transcript

    • 1. Advances in Social Media Listening. Braulio Medina, Mathematician (PUC-RIO - TU Darmstadt), Co-founder of Vortio. @bmedinadias
    • 2. How can we compare brands? Cold metrics (measure numbers): market value, market share, revenue, profit, price. Warm metrics (measure opinions): sentiment, emotional bonding, target relevance, sustainability, awareness.
    • 3. What changed with social media? It allows us to understand the opinion of the crowd almost instantaneously. What is wrong with social media measurement? We are approaching it as a cold metric problem.
    • 4. Heineken X Budweiser (15-28 February). Distribution of Twitter mentions about these brands. 91K mentions: USA 15%, Brazil 13%, Turkey 7%, Spain 6%, Netherlands 5%. 77K mentions: USA 58%, Canada 11%, UK 5%, Brazil 4%, France 1%. Chart values as extracted, labels lost: 26.5K, 55K, 12.4K, 1.9K, 8.5K, 1.7K, 1.8K, 1.6K, 0.9K, 0.3K. Data obtained from
    • 5. How do we get true insights without having to read hundreds of thousands of conversations? To answer this question, we think that basic statistics and naïve sentiment analysis are not enough.
    • 6. We decided to study how information spreads. 1976: Richard Dawkins, The Selfish Gene. 1985: Douglas Hofstadter (mathematician & physicist), Metamagical Themas. 1996: Aaron Lynch (mathematician & philosopher), Thought Contagion. 1996: Richard Brodie (Microsoft executive), Virus of the Mind: The New Science of the Meme. 1997: Journal of Memetics. 1999: Susan Blackmore (psychologist), The Meme Machine. And we found out that memes reveal a lot.
    • 7. A meme is a unit of information that spreads. It can be a video, an image or a textual idea.
    • 8. This is a text meme that spread on Twitter: [Love Is Like ADIDAS And NIKE . " Nothing Is ImpossibLe , JUST DO IT] [Life is like Adidas and Nike. "Nothing Is Impossible, so Just Do It. √ :) True] [Life = Adidas + Nike → "Nothing is impossible" so "Just Do It"] [Adidas said " Nothing is impossible" so Nike came up and said " Just do it] [Life=Adidas+Nike> "nothing is impossible" so "just do it"] How can we spot clusters automatically?
    • 9. Raw data → Clustering Engine → Insights
    • 10. We chose the stream graph visualization technique to represent each cluster/meme. Cluster 1, Cluster 2, Cluster 3, Cluster 4. Sum of the conversations in a stream graph.
    • 11. Benefits of analysing social media clusters: • More precise and much faster • Classify in bulk: classify whole clusters instead of single mentions • Straightforward disambiguation (e.g. Visa is a brand but it also means work permit) • Powerful sentiment metric (weighted average of cluster sentiment scores) • Visualization of dominant topics in a timeline
    • 12. Most important benefit of analysing social media clusters: quickly obtain insights for decision making.
    • 13. Same approach can be used. Heineken, Feb 2013: [The Worst Job Interview Ever, Brought to You by Heineken] (1,032 mentions); [I liked a @YouTube video Heineken - The Candidate] (175 mentions); [Proof that @heineken_uk is totally out of touch.] (92 mentions); [The best job interview process ever by #Heineken] (73 mentions); [Why Heineken Understands the Importance of Employee Culture] (53 mentions). Budweiser, Feb 2013: [I just wanted to apologize to all the people who had to replace their Budweiser Red Light bulb after yesterdays debacle] (4,985 mentions); [Budweiser being sued for watering down its beer] (> 4,000 mentions); [The Budweiser Clydesdale commercial gets me teary every time] (1,500 mentions); [Beer lovers in class-action lawsuits accuse Anheuser-Busch of watering down its Budweiser, Michelob and other brands] (1,066 mentions).
    • 14. Meme dimensions analysis. Analysis of 22,321 tweets about Heineken and 51,004 tweets about Budweiser in English. Scale: -100% to +100%. Dimensions: Emotions (Hatred to Love), Prestige (Low to High), Sincerity (Dishonest to Honest).
    • 15. Similarly, we can analyse all conversations that mention beer: [The average human walks 900 miles per year and drinks 22 gallons of beer. That means that the average human gets 41 miles per gallon. Not bad.] [Beer is like sex, when its good its good, when its bad its still good.] [God is great, beer is good, people are crazy] [Got my toes in the water, ass in the sand, not a worry in the world, cold beer in my! Life is good today!] [Beer lovers accuse Anheuser-Busch of watering down brews] [You know I like my chicken fried, a cold beer on a Friday night] Tracked keyword: BEER. Language: ENGLISH. Number of mentions: 98,343. Period: February 15th to 28th. Associations found: • Fun • Pleasure • Lifestyle
    • 16. Different cultures, different behaviours. Based on our 2-week analysis, it seems that most: Americans drink beer to relax and feel good; Italians drink beer because of relationship problems; Brazilians drink beer to party; Spanish speakers consider beer drinking healthy; Germans drink beer to fuel their bodies; French drink beer because it tastes good.
    • 17. These insights can help us: calculate and compare brand metrics; understand the consumer; understand the performance on media channels; minimize reputational risk; build meaningful conversations.
    • 18. Multi-language / country analysis. An extended version of this presentation can be found ... It contains the analysis in multiple languages.
    • 19. Q&A
    • 20. Thank you. @bmedinadias .com/webmining Facebook t/brauliomedina ina@vor brauliomed