Social Media and Predictive Analytics


Social Media in Financial Services conference (Social media ve finančních službách)


  1. Social Media and Predictive Analytics. 15 June 2011, Josef Šlerka, Prague. Social Media in Financial Services conference
  2. Predictive analytics: Predictive analytics encompasses a variety of statistical techniques from modeling, data mining and game theory that analyze current and historical facts to make predictions about future events. (Wikipedia)
  3. Predictive analytics: In business, predictive models exploit patterns found in historical and transactional data to identify risks and opportunities. Models capture relationships among many factors to allow assessment of risk or potential associated with a particular set of conditions, guiding decision making for candidate transactions. (Wikipedia)
  4. Search as a signal. Hyunyoung Choi, Hal Varian: Predicting the Present with Google Trends
  5. How is it possible? Life is searching ... (too), and before we make a decision, we search ... (too)
  6. Google Insights: a service that Google provides for free; it can also be used for predictive analysis. Nikolaos Askitas, Klaus F. Zimmermann: Google Econometrics and Unemployment Forecasting
  7. [Slide shows a page from a paper; figures omitted, recoverable text follows.]
     Fig. 1. Search volume for the movie Transformers 2 (A) and the video game Tom Clancy's H.A.W.X. (B) prior to and after their release, and search and Billboard rank for the song "Right Round" by Flo Rida (C). Axes: search volume against time to release in days (A, B); search rank and Billboard rank by week, Mar-09 to Aug-09 (C).
     ... of the song "Right Round" in terms of search volume closely tracks its rank on the Billboard Hot 100 chart. Thus motivated, we now investigate whether search activity is a systematic leading indicator of consumer activity by forecasting (i) opening weekend box-office revenue for 119 feature films released in the United States between October 2008 and September 2009; (ii) first-month sales of video games across all gaming platforms (e.g., Xbox, PlayStation, etc.) for 106 games released between September 2008 and September 2009; and (iii) the weekly rank of 307 songs that appeared on the Billboard Hot 100 list between March and September 2009. Search data for movies and video games come from Yahoo!'s Web search query logs for the US market. Predictions in these domains are based on linear models with Gaussian error of the form
     $\log(\mathrm{revenue}) = \beta_0 + \beta_1 \log(\mathrm{search}) + \epsilon$,
     where, in order to account for the highly skewed distributions of popularity, both revenue and search volume are log-transformed. For songs, search data were collected from Yahoo!'s dedicated music site, music.yahoo.com. We predict the weekly Billboard rank using search rank from the current and previous weeks:
     $\mathrm{billboard}_{t+1} = \beta_0 + \beta_1 \mathrm{search}_t + \beta_2 \mathrm{search}_{t-1} + \epsilon$.
     Fig. 2 A–C shows that search-based predictions are strongly correlated with realized outcomes for movies (0.85) and video games (0.76) and moderately correlated for music (0.56), where in each case revenue or rank is predicted on the day immediately preceding the event of interest. Moreover, Fig. 2 D–F shows that the predictive power of search persists as far out as several weeks in advance; for example, four weeks prior to a movie's release ...
     Fig. 2. Search-based predictions for box-office movie revenue (A), first-month video game sales (B), and the Billboard rank of songs (C), where predictions are made immediately prior to the event of interest; correlation between predicted and actual outcomes when predictions are based on query data t weeks prior ... Panels A–C: actual against predicted revenue (dollars) or Billboard rank, with sequels and non-sequels marked; panels D–F: model fit against time to release (weeks) for movies, video games and music.
  8. Does it work here in the Czech Republic too? There are no rigorous studies, but there is no reason why it should not work.
  9. Social media as a signal. Life is NOT just searching ... Fans, followers, pages. "What's on your mind?" (Facebook). "What's happening?" (Twitter)
  10. Predicting the stock market
  11. Predicting the stock market: To put it in simple words, when the emotions on Twitter fly high, that is when people express a lot of hope, fear, and worry, the Dow goes down the next day. When people have less hope, fear, and worry, the Dow goes up. It therefore seems that just checking on Twitter for emotional outbursts of any kind gives a predictor of how the stock market will be doing the next day. Zhang, Fuehres, Peter A. Gloor: Predicting Stock Market Indicators Through Twitter "I hope it is not as bad as I fear"
  12. Predicting stock prices. Stocks tracked: Starbucks, Coca-Cola and Nike. Signals used: Facebook fans, Twitter followers, YouTube views
  13. Predicting elections. Elections to the US Senate; the signal was the number of followers on Twitter; the correlation between winning and follower count was 71%; in the comparison of Facebook fans it was as high as 80%
  14. Does it work here too? It seems so :-) Research on data from www.ataxosocialinsider.cz
  15. Ataxo Social Insider: a tool for analyzing data from social networks, discussion forums, blogs and news servers
  16. Ataxo Social Insider
  17. And what about prediction? Case study: the number of mentions on Facebook and film attendance
  18. Mentions of Inception on Czech Facebook in 2010 and audience response
  19. Harry Potter on Czech Facebook in 2010 and audience response
  20. FB mentions as a signal: The correlation shows an ability to predict the dynamics of film revenue, because people mostly do what they say....
  21. The future? Let's connect the data and watch...
  22. Client profiling: linking users' status updates with their financial behavior; predicting solvency; the degree of reliability of their network; reality check
  23. Finding products: tailoring products to the customer; discovering patterns in behavior
  24. Will it work? It already does! In the USA, the company RapLeaf. Here there is no demand yet. The data is there.
  25. Thank you for your attention. josef.slerka@ataxo.com, www.ataxointeractive.com, twitter.com/josefslerka
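
The log-log linear model quoted on slide 7, log(revenue) = β0 + β1 log(search) + ε, and the predicted-versus-actual correlation reported there can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the data are synthetic (no real search or box-office figures are used), the coefficients `TRUE_B0` and `TRUE_B1` are invented, and the helper names `fit_line` and `pearson` are hypothetical. It only shows the general technique, not the procedure used in the studies cited in the slides.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Synthetic data (assumption): 200 "films" whose revenue follows the
# log-log model with invented coefficients and Gaussian noise.
rng = random.Random(0)
TRUE_B0, TRUE_B1 = 2.0, 0.9
search = [10 ** rng.uniform(3, 7) for _ in range(200)]
revenue = [
    math.exp(TRUE_B0 + TRUE_B1 * math.log(s) + rng.gauss(0, 0.5))
    for s in search
]

# Log-transform both variables, as the quoted model does, then fit.
log_search = [math.log(s) for s in search]
log_revenue = [math.log(r) for r in revenue]
b0, b1 = fit_line(log_search, log_revenue)

# Correlation between predicted and actual log revenue (computed in log
# space here; that choice is an assumption of this sketch).
predicted = [b0 + b1 * ls for ls in log_search]
r = pearson(predicted, log_revenue)

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, correlation = {r:.3f}")
```

The same two helpers could be applied to a weekly series of Facebook mentions and film attendance, as in the case study on slides 17 to 20, provided the two series are aligned by week.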
