Pydata influencer validation

Novel set of indices to validate social media influencers for marketing campaigns

Published in: Data & Analytics

  1. Influencer Validation. 5th September 2017. Dr Ed Cannon
  2. Who's the one? (Copyright © Capgemini 2015. All Rights Reserved. Insights & Data: Data Science | Version 1.0)
  3. Objective
     • Create a plugin to validate social media influencers for marketing
     • Key advantages:
     • Can be used by analysts
     • Can be extended by developers & data scientists
     • Can be part of a workflow: simple plug & play
     • Works on top of Hadoop using PySpark, so it can scale to millions
     • Can get raw metrics & analyse further
     • Option to output to ppt
  4. Influencer Validation: a way to measure the effectiveness of an influential social media entity prior to using them as a service
     • Who needs influencers to be validated? Brand managers; PR agencies
     • Why do we need it? To quickly identify potential influencers for a marketing campaign; to target the correct audience; to understand the potential value of an influencer / ROI
     • What does the service provide? Breakdown of metrics across several social media channels; what age group they target; what the audience is interested in; gender of the audience; type of account
     • How is it run and how often? Written in Python and packaged as a plugin; run on demand
  5. What data sources are used? Primary Data Sources
  6. Metrics that Matter
     • Engagement, e.g. YouTube comments/likes; Twitter retweets
     • Reach: YouTube views, Twitter followers etc.
     • Target demographic & interest, e.g. males aged 25-34
     • Channel(s), e.g. YouTube for a video campaign
     • Indices: H-index, M-index, G-index
  7. Indices: H, M, G
  8. H-index
     • Definition: “A scholar with an index of h has published h papers each of which has been cited in other papers at least h times. Thus, the h-index reflects both the number of publications and the number of citations per publication.”
     • H-index = 116
  9. The H-Index is a means of measuring influence
     • Twitter author-level metric that measures productivity (tweets) & engagement (retweets)
     • Calculated over the last 200 tweets
     • H-index = the largest number H such that H tweets have each been retweeted H or more times

     Tweet   Retweeted   Retweeted
     1       10          25
     2       8           8
     3       5           5
     4       4           3
     5       3           3
     (rows continue to tweet 200)

     H-index = 4 (first column); H-index = 3 (second column)
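As a minimal sketch of the H-index rule on slide 9, the function below ranks retweet counts in decreasing order and returns the largest rank whose tweet still has at least that many retweets. The name `h_index` and the plain-list input are illustrative assumptions, not the plugin's actual code.

```python
def h_index(retweet_counts):
    """Largest h such that h tweets each have at least h retweets.

    `retweet_counts` is assumed to be an iterable of non-negative integers,
    one per tweet (for example, the last 200 tweets of an account).
    """
    ranked = sorted(retweet_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the top `rank` tweets all have >= rank retweets
        else:
            break      # counts only decrease from here, so stop early
    return h


# The two columns of the slide's example table:
first_column = [10, 8, 5, 4, 3]
second_column = [25, 8, 5, 3, 3]
print(h_index(first_column), h_index(second_column))  # prints: 4 3
```

Sorting first keeps the check to a single pass, which matters little for 200 tweets per account but keeps the per-account cost predictable when many accounts are scored.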
  10. Twitter H-index • Twitter H-index = 10 • Twitter H-index = 5
  11. M-Index
     • The m-index is defined as h/n, where h is the h-index and n is the time delta in years between the latest tweet and the 200th tweet (the oldest of the last 200)
     • M-indices tend to be higher than H- or G-indices, because the time taken to tweet 200 times can be as short as a few days, which makes n a small fraction of a year
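The h/n definition on slide 11 can be sketched as below. This is an illustration under stated assumptions: the function name `m_index`, the use of `datetime` objects for the two tweet timestamps, and the 365.25-day year are choices made here, not details given in the slides.

```python
from datetime import datetime

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length


def m_index(h, newest_tweet_at, oldest_tweet_at):
    """m = h / n, with n the span in years between the two timestamps.

    `newest_tweet_at` is the time of the latest tweet and `oldest_tweet_at`
    the time of the 200th tweet back, both as datetime objects.
    """
    span_years = (newest_tweet_at - oldest_tweet_at).total_seconds() / SECONDS_PER_YEAR
    if span_years <= 0:
        raise ValueError("newest tweet must be later than oldest tweet")
    return h / span_years


# An h-index of 159 earned over a one-day window gives an m-index near 58,000.
print(round(m_index(159, datetime(2017, 9, 2), datetime(2017, 9, 1))))  # prints: 58075
```

A window of about one day would be consistent with the H-index of 159 and M-index of 58K quoted for one account on slide 19, although the slides do not state the actual window for that account.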
  12. G-Index
     • The g-index can be seen as the h-index for an averaged retweet count: the largest number n of highly engaged tweets for which the average number of retweets is at least n
     • The index is calculated from the distribution of retweets received by a given author's tweets: given a set of tweets ranked in decreasing order of the number of retweets they received, the g-index is the unique largest number g such that the top g tweets together received at least g² retweets
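A minimal sketch of the g-index rule on slide 12, assuming the same plain-list input as before. The name `g_index` is hypothetical, and capping g at the number of tweets supplied is a choice made here; the slides do not say how the plugin handles that case.

```python
def g_index(retweet_counts):
    """Largest g such that the top g tweets together have >= g*g retweets.

    `retweet_counts` is assumed to be an iterable of non-negative integers.
    g is capped at the number of tweets supplied (an assumption).
    """
    ranked = sorted(retweet_counts, reverse=True)
    running_total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        running_total += count
        if running_total >= rank * rank:
            g = rank
    return g


# A single heavily retweeted tweet lifts g above h:
counts = [3, 1, 1, 0]
print(g_index(counts))  # prints: 2
```

For the same counts the h-index rule gives 1, which illustrates why the g-index rewards accounts with a few very highly retweeted tweets: it compares a cumulative total against g², whereas the h-index checks each tweet individually.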
  13. Case Study - Foodies
  14. Objective: validate a set of influencers that can be used to market a food brand on YouTube or Twitter, targeting a mixed audience, primarily females in the 25-34 age group, interested in food & parenting.
  15. Profile: jamieoliver / jamieoliver / JamieOliver / jamieoliver (London & Essex)
     • Followers 6527902; Location London & Essex; Created @ Tue Jan 06 14:21:45 2009; Average Reach 728.26; + Sentiment (%) 44.60; - Sentiment (%) 4.50; Average Impact 38.24; H-Index 61
     • Audience Information: Women 55%, Men 45%; Individual 77%, Organisation 23%
     • Audience Age (chart): bands 0-9, 10-17, 18-24, 25-34, 35-44, 45-54, 55-64, 65+
     • Audience Interest (chart): food & drinks, family & parenting, sports, beauty/health & fitness, music
     • Followers 5765451; Posts 5090; Videos (last 6m) 250; Average Likes 6414; Average Dislikes 186; Average Views 476999; Average Comments 421
  16. Profile: CandiceBrown / candicebrown / candicebrown (United Kingdom)
     • Followers 79178; Location United Kingdom; Created @ Tue Feb 22 22:19:13 2011; Average Reach 943.12; + Sentiment (%) 54.17; - Sentiment (%) 0.00; Average Impact 46.96; H-Index 15
     • Audience Information: Women 38%, Men 63%; Individual 54%, Organisation 46%
     • Audience Age (chart): bands 0-9, 10-17, 18-24, 25-34, 35-44, 45-54, 55-64, 65+
     • Audience Interest (chart): animals & pets, tv, automotive, food & drinks, photo & video
     • Followers 139316; Posts 332; Videos (last 6m) 0; Average Likes, Average Dislikes, Average Views and Average Comments: not available
  17. H-index Foodie Distribution (sample 1K)
  18. Foodie H-index By Gender (chart: user percentage against Twitter h-index, female vs male)
     • Set threshold acceptance criteria
     • Female foodie influencers have more engagement
  19. Foodie M-index By Gender (chart: user percentage against Twitter m-index, female vs male)
     • Highly influential female influencers who have tweeted a lot over a short period of time
     • Example: @DeniseCop1; 74K tweets; joined July 2013; H-index 159; M-index 58K
  20. Foodie G-Index By Gender (chart: user percentage against Twitter g-index, female vs male)
     • Skewed towards female foodies, who have more sets of tweets with high retweet counts
  21. Comparing Foodies To Beauticians
  22. Beauty H-index By Gender (charts: user percentage against Twitter h-index, female vs male; panels Beauty and Foodie)
     • H-indices are lower for beauticians and follow an exponential distribution
  23. Beauty M-Index (charts: user percentage against Twitter m-index, female vs male; panels Beauty and Foodie)
     • Both follow exponential distributions
  24. Beauty G-Index (charts: user percentage against Twitter g-index, female vs male; panels Beauty and Foodie)
     • Foodies & females across these 2 categories have higher G-indices
  25. Conclusions
     • Developed software that can validate influencers for social media marketing campaigns across different channels
     • Introduced three novel indices to measure an influencer's engagement (H, M & G)
     • The indices are quick to calculate, can be incorporated into workflows, scale easily in a distributed fashion, and can be used by a variety of audiences & across categories
     • Analysis is automated to provide both metrics & ppt
  26. APIs Leveraged
     • Twitter API
     • Instagram API
     • YouTube API
     • Faceplusplus
     • Pandas
     • Requests
     • Rpy2
     • Python-pptx
  27. Questions