Advanced Analytics with Social Media Data
  • 1. Please tweet! #RNWebinars @LoveStats
    Advanced Analytics for Social Media Research: Examples from the automotive industry
    January 2013
    Social media listening data by researchers, for researchers
  • 2. Standard Social Media Research Uses
    1. Track brand mentions
    2. Identify positive and negative brand attributes
    3. Identify sources of negativity
    4. Monitor an ad campaign
    5. Measure category norms
  • 3. Advanced Social Media Research Uses
    1. Correlations: How does gender correlate with brand choice? Which brands and features are preferred by men and by women?
    2. Regression: Which features best predict purchase of specific brands? How do combinations of variables work together to predict an overarching variable?
    3. Factor analysis: How do brands or features cluster together as being similar in consumers' minds? What clusters "appear"? What is the best "package"?
  • 4. Data + Category Experts = Insights
    • Data: Expert methodologists collecting, cleaning, coding, and calibrating data specific to your research objectives
    • Category experts: Industry analysts using category and normative expertise to analyze and interpret data
    • Insights: Relevant, valid, and reliable conclusions, insights, and recommendations
  • 5. Research Method
    Datasets
    1. Branded: Random sample of verbatims mentioning a brand name (e.g., GMC, Honda, Lexus). To measure correlations.
       • N>250 000
    2. Branded purchasing: Random sample of verbatims mentioning a brand and purchase. To predict purchase.
       • N>100 000
    3. Branded pairs: Random sample of verbatims mentioning at least TWO brand names. To run brand factor analysis.
       • N>100 000
    Data Collection Criteria
    • Consumer focus
    • Dealership messaging removed
    • Viral games and jokes removed
    Collect, Clean, Categorize, Calibrate
    • Collect: Scour the internet for thousands of messages related to the brand
    • Clean: Clean out spam and non-relevant chatter (e.g., fun engagement conversations on Facebook)
    • Categorize: Categorize verbatims into relevant content areas, e.g., pricing, recommendations, commercials, celebrities
    • Calibrate: Calibrate the sentiment into 5-point Likert scale buckets specific to the brand and category
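
The calibration step on this slide can be illustrated with a minimal Python sketch. The slides do not say how raw sentiment is scored or where the bucket boundaries sit, so the [-1, 1] score range, the `CUT_POINTS` values, and the `to_likert` name are all assumptions made for illustration; in practice the boundaries would be tuned per brand and category.

```python
from bisect import bisect_right

# Hypothetical cut points on a raw sentiment score in [-1, 1].
# These values are illustrative only; they are not taken from the slides.
CUT_POINTS = [-0.6, -0.2, 0.2, 0.6]


def to_likert(score: float) -> int:
    """Map a raw sentiment score to a 1-5 bucket (1 = very negative, 5 = very positive)."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must lie in [-1, 1]")
    # bisect_right counts how many cut points are <= score, giving 0..4.
    return bisect_right(CUT_POINTS, score) + 1


if __name__ == "__main__":
    for raw in (-0.9, -0.3, 0.0, 0.3, 0.9):
        print(raw, to_likert(raw))
```

Keeping the cut points in one list makes it easy to swap in a different set for each brand or category.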
  • 6. What is a correlation?
    A statistical process for identifying how two variables relate to each other.
    • E.g., there exists a positive correlation between education and price paid for vehicles
      – Expensive cars tend to be owned by people with higher education
      – Budget cars tend to be owned by people with lower education
    • A correlation does not mean one variable causes the other. Sending an uneducated person to school will not cause them to buy an expensive car, nor vice versa. The more likely scenario is that higher education leads to higher income, which enables one to purchase a more expensive vehicle, if desired.
    [Figure: example scatterplots for r = 0.3, r = 0.15, and r = 0.01]
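
As a rough illustration of the statistic described on this slide, the sketch below computes a Pearson correlation coefficient in plain Python. The function name `pearson_r` and the toy education and price figures are assumptions for illustration, not data from the webinar.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up toy data: years of education and vehicle price paid (in $1000s).
    education = [10, 12, 12, 14, 16, 18]
    price = [15, 14, 22, 20, 31, 28]
    print(round(pearson_r(education, price), 2))
```

The result always lies between -1 and 1, and it describes association only, which is the causation caveat the slide makes.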
  • 7. Correlations: Women's Brand Preferences
    Women are more likely than men to speak positively about midsize vehicles and base-level SUVs.
    • Lexus (r=0.34)
    • Nissan Pathfinder (r=0.34)
    • Nissan Maxima (r=0.31)
    • Peugeot (r=0.28)
    • BMW X5 (r=0.27)
    • Chevrolet Impala (r=0.25)
    • Mitsubishi Eclipse (r=0.25)
    E.g., about 12% of the variance in positive opinions about Lexus can be attributed to gender (r=0.34, so r squared is about 0.12)
    Analysis: Gender must be specified (n=56 000), Brand non-mention treated as pair-wise missing, Minimum sample size per brand n>=30
  • 8. Correlations: Men's Brand Preferences
    Men are more likely to speak positively about sporty cars and adventure trucks.
    • Jeep Safari (r=0.32)
    • GMC Yukon (r=0.22)
    • Ford Fiesta (r=0.17)
    • Mazda Miata (r=0.11)
    • Toyota Tacoma (r=0.10)
    • Ford Mustang (r=0.10)
    E.g., about 10% of the variance in positive opinions about Jeep Safari can be attributed to gender (r=0.32, so r squared is about 0.10)
    Analysis: Gender must be specified (n=56 000), Brand non-mention treated as pair-wise missing, Minimum sample size per brand n>=30
  • 9. Correlations: Women's Feature Preferences
    Stereotypes abound as women chat more positively about easy driving (e.g., suspension) and appearance (e.g., dashboard) features.
    • Grill (r = 0.38)
    • Suspension (r = 0.36)
    • Dashboard (r = 0.35)
    • Interior (r = 0.33)
    • Steering (r = 0.32)
    (High correlation with automatic transmission but sample size was only 17)
    Analysis: Gender specified (n=56 000), Feature non-mention treated as pair-wise missing, Minimum sample size per feature n>=30
  • 10. Correlations: Men's Feature Preferences
    Stereotypes continue as men chat positively about blasting their tunes (i.e., radio) and speeding (i.e., accelerator).
    • Car Radio (r=0.38)
    • Accelerator (r=0.11)
    • Headlight (r=0.10)
    (High correlation with manual transmission but sample size was only 25)
    Analysis: Gender specified (n=56 000), Feature non-mention treated as pair-wise missing, Minimum sample size per feature n>=30
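
The analysis notes on the correlation slides (gender must be specified, non-mentions treated as pair-wise missing, minimum n of 30) could be applied roughly as in the sketch below. The record layout, the 0/1 gender coding (female = 1), and the function names are assumptions for illustration; the slides do not describe the actual pipeline or its sign convention.

```python
import math

MIN_N = 30  # minimum sample size per brand or feature, as stated on the slides


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def gender_correlation(records, item, min_n=MIN_N):
    """Correlate gender (hypothetical coding: F=1, M=0) with sentiment toward `item`.

    Pair-wise deletion: a record is used only if it has a known gender AND
    mentions `item`. Returns (r, n), or None when n is below `min_n`.
    """
    pairs = [
        (1.0 if rec["gender"] == "F" else 0.0, float(rec["sentiment"][item]))
        for rec in records
        if rec.get("gender") in ("F", "M") and item in rec.get("sentiment", {})
    ]
    if len(pairs) < min_n:
        return None
    xs, ys = zip(*pairs)
    return pearson_r(xs, ys), len(pairs)


if __name__ == "__main__":
    # Made-up records: sentiment is a 1-5 score per mentioned brand.
    records = (
        [{"gender": "F", "sentiment": {"Lexus": 4}}] * 20
        + [{"gender": "M", "sentiment": {"Lexus": 3}}] * 15
        + [{"gender": "M", "sentiment": {"Lexus": 4}}] * 5
        + [{"gender": None, "sentiment": {"Lexus": 1}}] * 10  # dropped: no gender
    )
    print(gender_correlation(records, "Lexus"))
```

Squaring the returned r gives the share of variance associated with gender. With this coding a positive r means women are the more positive group, so the men's slides presumably use the reverse coding.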
  • 11. What is Regression?
    A statistical method for estimating relationships among variables, used to determine whether, and by how much, a change in the value of one variable affects the value of another variable.
    Purchase = 2 × VariableA + 1 × VariableB + 0.5 × VariableC
    Can we determine which variables influence purchase opinions?
    • Is it a simple or complex relationship with few or many variables?
    • Do these relationships differ based on the brand?
    We can then focus our marketing attention on these areas with the appropriate level of importance.
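
The illustrative equation on this slide can be read as a weighted sum, as in the sketch below. The weights come from the slide's example equation; the function name `predict_purchase` and the input values are assumptions for illustration only.

```python
# Weights taken from the slide's illustrative equation:
# Purchase = 2 x VariableA + 1 x VariableB + 0.5 x VariableC
WEIGHTS = {"VariableA": 2.0, "VariableB": 1.0, "VariableC": 0.5}


def predict_purchase(values, weights=WEIGHTS):
    """Weighted sum of predictor values; predictors not supplied count as 0."""
    return sum(weight * values.get(name, 0.0) for name, weight in weights.items())


if __name__ == "__main__":
    baseline = {"VariableA": 1.0, "VariableB": 1.0, "VariableC": 1.0}
    bumped = dict(baseline, VariableA=2.0)  # raise VariableA by one unit
    print(predict_purchase(baseline))
    # The difference equals VariableA's weight: a one-unit change in
    # VariableA moves the prediction by 2, four times the effect of VariableC.
    print(predict_purchase(bumped) - predict_purchase(baseline))
```

Each weight says how much the predicted value moves for a one-unit change in that variable with the others held fixed, which is how relative importance is read off the equation.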
  • 12. Explaining Past Purchase
    People who have purchased a vehicle focus on quality (e.g., servicing, errors), personality characteristics (e.g., honesty, pride), and features (e.g., color, size, fuel economy)
    • Variables to account for 30% of variance: 17
    • Variables to account for total variance (40%): 118
    • Variables excluded from total: 200
    • Key Variables: Color, Servicing, Errors, Functionality, Size, Recommend, Engine, Intelligence, Honesty, Pride, Fast, Fuel Economy, Ease, Doors, Wheels
    Positive Purchase Opinion = Servicing × 0.12 + Recommend × 0.11 + Honesty × 0.08 + Fuel Economy × 0.08
    Analysis: n>36 000, Exploratory stepwise, Feature non-mention recoded as neutral opinion, Subsample required mention of past purchase
  • 13. Explaining Purchases of Jeep
    People who have purchased a Jeep talk more positively about their vehicle being highly functional, requiring few repairs, and being sexy in appearance.
    • Number of variables: 23
    • % of Variance accounted for: 30%
    • Positive Variables: Truck types, Functionality, Intelligence, Doors, Error, Size, Engine, Servicing, Tires, Repairs, Exciting, Wheels, Sexy, Transmission, Different
    Positive Purchase Opinion = Types × 0.13 + Doors × 0.11 + Engine × 0.10 + Sexy × 0.07
    Analysis: n>4600, Exploratory stepwise, Feature non-mention treated as neutral opinion, Subsample required mention of both purchase and Jeep brand
  • 14. Explaining Women's Purchases of Jeep
    Women who have purchased a Jeep talk more positively about their vehicle in terms of pride, reliability (e.g., errors, servicing), and appearance (e.g., hubcaps, fashionable)
    • Number of variables: 15
    • % of Variance accounted for: 27%
    • Key Variables: Pride, Error, Truck Types, Size, Honesty, Cleanliness, Servicing, Doors, Brakes, Warranty, Hubcaps, Fashionable, Intelligence
    Positive Purchase Opinion = Pride × 0.19 + Error × 0.13 + Honesty × 0.10 + Fashion × 0.09
    Analysis: n>460, Exploratory stepwise, Feature non-mention treated as neutral opinion, Subsample required mention of purchase, Jeep brand, and female author
  • 15. What is Factor Analysis?
    A statistical method for determining which variables, brand names, or product features are commonly associated with each other. The reader's task is to determine why the statistics put those items together and to "name" the overarching concept.
    • Medium, X-small, Small, X-large, Large. What is Factor #1? Sizes
    • Polyester, Velvet, Leather, Cotton, Nylon, Silk. What is Factor #2? Fabric
  • 16. Factor Analysis Data
    To run a factor analysis, each piece of data must incorporate at least two brand (or feature) mentions
    • "In a few years, I want a red or black Range Rover and a sports car. Maybe a BMW or Mercedes."
    • "I need to know if I should get the 2 door bmw or 4 door mazda 3. Help me guys!"
    • "Toyota Land Cruiser is way better than jeep in every way. With that price, it had better be."
    • "Would you buy a Mercury Mountaineer with lower miles or a Lexus with higher miles? Thanks for your help."
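
The "at least two brand mentions" requirement could be enforced with a filter like the sketch below. The `BRANDS` list, the case-insensitive whole-word matching, and the function names are assumptions for illustration; the slides do not describe how brand mentions were actually detected.

```python
import re

# Hypothetical brand dictionary, limited to brands in the example verbatims.
BRANDS = ["Range Rover", "BMW", "Mercedes", "Mazda", "Toyota", "Jeep", "Mercury", "Lexus"]


def brands_mentioned(text, brands=BRANDS):
    """Return the brands found in `text` (case-insensitive, whole words only)."""
    return [
        brand
        for brand in brands
        if re.search(r"\b" + re.escape(brand) + r"\b", text, flags=re.IGNORECASE)
    ]


def branded_pairs(verbatims, brands=BRANDS):
    """Keep only verbatims mentioning at least two brands, with the brands found."""
    rows = []
    for text in verbatims:
        found = brands_mentioned(text, brands)
        if len(found) >= 2:
            rows.append((text, found))
    return rows


if __name__ == "__main__":
    sample = [
        "I need to know if I should get the 2 door bmw or 4 door mazda 3. Help me guys!",
        "I love my Lexus.",  # only one brand, so it is excluded
    ]
    for text, found in branded_pairs(sample):
        print(found, "<-", text)
```

Each kept verbatim then contributes one row to the brand-by-verbatim matrix that a factor analysis would take as input.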
  • 17. How to Use Factor Analysis
    • Identify the real competitive set, not what researchers or brand managers assume or assign
    • Better understand consumer perceptions of your brand
    • Discover new ways that consumers think about your brand
    • Market against the most relevant competitors
  • 18. Results: Automotive Brands
    Consumers categorize vehicles by size, adventurousness, and luxuriousness.
    • Luxury: Ferrari, Porsche, Audi R8, BMW M3, Ford Mustang
    • Trucks: Chrysler, Jeep, Dodge, Cherokee, Explorer, Mustang
    • Subcompact: Peugeot, Kia, VW Golf, Peugeot 206, VW Passat
    • Midsize: Pontiac, Oldsmobile Cutlass, Buick, Taurus
    • Fashionably Friendly: Toyota Yaris, Prius, Kia, Miata, Nissan Maxima
    Callouts: Your real competitors; How consumers categorize you
    Analysis: n=75 000, Equimax rotation, Nonresponse recoded as neutral, Minimum sample size per brand n>=30, 11 factors based on scree plot
  • 19. Results: Automotive Features
    Consumers categorize features into many buckets, some focused on the interior or exterior appearance, while others are focused on specific systems, such as fuel or drive system.
    • Safety: ABS, Traction control, Airbags, Tire Pressure
    • Drive Systems: RWD, FWD, AWD, 4WD, Turbo, Horsepower
    • Fuel System: Fuel supply, Fuel tank, Air intake, Spark plug
    • Colors: Black, White, Red, Blue, Green, Pink, Yellow
    • Exterior Appearance: Hubcaps, Chrome, Bumper, Grill, Headlight
    • Interior Appearance: Dashboard, Beige, Pink, Mirrors, Cupholder
    • Power: Engine, Horsepower, Turbo, Torque, Manual
    • Fuel Economy: Hybrid, Electric cars, Coupe, Fuel economy
    Analysis: n=100 000, Equimax rotation, Nonresponse coded as neutral, Minimum sample size per feature n>=30, 17 factors based on scree plot
  • 20. What about conjoint?
    Unfortunately, social media research is not ideal for running conjoint analyses. Surveys are much better suited to this need.
    • Frequency of direct comparisons of one product feature in one social media sentiment: Extremely rare
    • Ability to isolate two distinct opinions and apply the appropriate sentiment to each: Extremely difficult
    "It pains me to see a price of $22k but if they offer $18k, I'll take it."
    "I can't afford $25k so I'm pumped for when the price comes down to $23k."
  • 21. Watchouts
    Irrelevant data, spam, and viral jokes create false correlations between brands. If this data is not removed prior to the analysis, the statistics will erroneously identify those correlations as real associations.
    • Irrelevant data
      – Come test drive this 2010 Chevrolet Malibu LT. We also have the Impala, Toyota Camry, Honda Accord, Nissan Altima, and Ford Fusion.
    • Spam
      – free perscription volvo bieber gaga nike honda adidas free fedex saturday delivery toyota britney
    • Viral Jokes
      – Boyfriend: see that new, red mercedes benz parked beside our neighbour's ferrari? Girlfriend: whoooa! its gorgeous! Boyfriend: yeah... I bought you a toothbrush of that colour!!
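
Two simple heuristics give a feel for the kind of cleaning these watchouts call for. This is only a sketch under stated assumptions: the brand list, the "more than three brands" rule for dealership-style listings, the "more than two near-identical copies" rule for viral jokes, and all function names are hypothetical, and the slides do not describe the filters actually used. Keyword spam like the example above would likely need a dedicated filter.

```python
import re
from collections import Counter

# Hypothetical single-word brand list for illustration.
BRANDS = {"chevrolet", "toyota", "honda", "nissan", "ford", "volvo",
          "mercedes", "ferrari", "jeep"}


def normalize(text):
    """Lowercase, replace non-alphanumerics with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def count_brands(text):
    """Number of distinct known brands appearing as whole words in `text`."""
    return len(BRANDS & set(normalize(text).split()))


def clean_verbatims(verbatims, max_brands=3, max_copies=2):
    """Drop likely listings (too many brands) and viral copies (repeated text)."""
    copies = Counter(normalize(v) for v in verbatims)
    return [
        v for v in verbatims
        if count_brands(v) <= max_brands and copies[normalize(v)] <= max_copies
    ]


if __name__ == "__main__":
    dealer = ("Come test drive this 2010 Chevrolet Malibu LT. We also have the Impala, "
              "Toyota Camry, Honda Accord, Nissan Altima, and Ford Fusion.")
    genuine = "Toyota Land Cruiser is way better than jeep in every way."
    print(clean_verbatims([dealer, genuine]))
```

Both thresholds trade recall for precision: a genuine comparison of four brands would also be dropped, so in practice the cutoffs would need checking against hand-labelled samples.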
  • 22. Thank you
    hello@conversition.com
    www.conversition.com