Detecting Trends

Stanislav Nikolov (MIT, Twitter)
Devavrat Shah (MIT)

Interdisciplinary Workshop on Information and Decision in Social Networks 2012

  1. Detecting Trends
      Stanislav Nikolov (MIT, Twitter), Devavrat Shah (MIT)
  2-3. Source: http://twoinformcanada.ca/wp-content/uploads/2012/07/barclays.jpg
  4. The Barclays Libor scandal
      12:49: “#Barclays” is listed as a trending topic on Twitter.
  5-6. Questions
      •  Is there enough information before the “jump”?
      •  Can we predict which topics will trend in advance?
  7. Yes.
  8. Headline results (best parameter setting)
      •  79% early detection
      •  1.43 hours mean early detection
      •  95% true positive rate (TPR), 4% false positive rate (FPR)
  9-12. What are Trending Topics?
      •  Twitter: a global communication network.
      •  Tweet: a short, public message.
      •  Topic: a phrase in a tweet.
      •  Trending topic (a “trend”): a topic that becomes popular.
  13-18. A Parametric Model
      •  Expect a certain type of pattern (e.g. constant + jumps).
      •  Fit parameters to data (e.g. how much of a jump).
      •  Decide if the jump is big enough.
      [Figure: activity vs. time. The fitted parameter reads p = 0.1, then p = 0.6, then p = 4.1 on successive slides; “trend detected!” appears at p = 4.1.]
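
The three steps on the parametric-model slides (expect a pattern, fit a parameter, threshold the jump) can be sketched roughly as follows. This is only an illustration, not the authors' model: the definition of p as the relative rise over a constant baseline, the names `jump_size` and `detects_trend`, the baseline length, and the threshold of 4.0 are all assumptions, chosen so the example echoes the p = 0.6 and p = 4.1 values shown on the slides.

```python
def jump_size(activity, baseline_len):
    """Relative rise of the latest sample over a constant baseline.

    Hypothetical definition of the slide's parameter p.
    Assumes the baseline window has a nonzero mean.
    """
    baseline = sum(activity[:baseline_len]) / baseline_len
    return (activity[-1] - baseline) / baseline


def detects_trend(activity, baseline_len=4, threshold=4.0):
    """Declare a trend when the fitted jump exceeds an assumed threshold."""
    return jump_size(activity, baseline_len) > threshold


# Made-up activity counts: flat at 10, then a small rise or a large jump.
small_rise = [10, 10, 10, 10, 16]
large_jump = [10, 10, 10, 10, 51]
print(jump_size(small_rise, 4), detects_trend(small_rise))
print(jump_size(large_jump, 4), detects_trend(large_jump))
```

A detector of this kind only fires on the one pattern it was built to expect, which is the limitation the next slides point to.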
  19-22. Parametric Models are Inadequate
      [Figure: activity vs. time, marked “trend detected!”]
  23-29. A Data-Driven Approach
      •  All of the information is in the data.
      •  Hypothesis
         –  Tweets are written by people.
         –  People are simple.
            •  In how they spread information.
            •  In how they connect to one another.
         –  Small number of distinct “ways” in which a topic can become trending.
  30-39. Classification by Experts
      [Figure labels: observations, r, vote]
  40. Properties
      •  Simple (just compute distances)
      •  Scalable (can compute distances in parallel)
      •  Non-parametric – model “parameters” scale with the data
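
One way to read "classification by experts" together with "just compute distances" is a distance-weighted vote: each labeled reference signal r votes for its own class, and closer references count for more. The sketch below makes that concrete under stated assumptions. The squared Euclidean distance, the exponential weighting, the parameter `gamma`, and the function names are illustrative choices that do not appear on the slides.

```python
import math


def distance(a, b):
    """Squared Euclidean distance between two equal-length signals (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vote(observation, references, gamma):
    """Total vote from one class: closer reference signals contribute more."""
    return sum(math.exp(-gamma * distance(observation, r)) for r in references)


def classify(observation, trend_refs, non_trend_refs, gamma=1.0):
    """Return True if the trend references out-vote the non-trend references."""
    return vote(observation, trend_refs, gamma) > vote(observation, non_trend_refs, gamma)


# Made-up reference signals: one rising (trend), one flat (non-trend).
trend_refs = [[1, 2, 5]]
non_trend_refs = [[1, 1, 1]]
print(classify([1, 2, 4], trend_refs, non_trend_refs))
print(classify([1, 1, 2], trend_refs, non_trend_refs))
```

Each distance is computed independently of the others, which is what makes the vote easy to parallelize, and the "model" is just the stored reference signals, so it grows with the data.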
  41. Experimental Results
  42. Experiment
      •  500 trends.
      •  500 non-trends.
      •  Do trend detection on a 50% hold-out set.
      •  Online signal classification.
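
"Online signal classification" can be pictured as re-classifying the most recent window of a signal each time a new sample arrives and reporting the first time the window is labeled a trend. The loop below is a minimal sketch of that idea only. The fixed window length, the name `first_detection`, and the toy rule passed in as `classify_window` are assumptions, not details from the slides.

```python
def first_detection(signal, window, classify_window):
    """Return the index of the first sample at which the trailing window
    is classified as a trend, or None if that never happens."""
    for end in range(window, len(signal) + 1):
        if classify_window(signal[end - window:end]):
            return end - 1
    return None


def rises_by_four(window):
    """Toy stand-in classifier: fires when the window rises by at least 4."""
    return window[-1] - window[0] >= 4


# Made-up activity stream that is flat and then jumps.
stream = [1, 1, 1, 1, 5, 9]
print(first_detection(stream, window=2, classify_window=rises_by_four))
```

Comparing the returned index with the time a topic was actually listed as trending gives the notion of detecting "early" or "late" used in the results.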
  43. Results – Early Detection (best parameter setting)
  44. Results – FPR / TPR Tradeoff
  45. Results – Early / Late Tradeoff
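
For reference, the TPR and FPR figures quoted in the results are the standard true positive and false positive rates. The snippet below shows how they are computed from hold-out labels and predictions. The labels and predictions are made up for illustration and are not the experiment's data, and the function name `rates` is hypothetical.

```python
def rates(labels, predictions):
    """Return (TPR, FPR) for boolean labels and predictions.

    Assumes at least one positive and one negative label.
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)
    fn = sum(1 for y, p in pairs if y and not p)
    fp = sum(1 for y, p in pairs if not y and p)
    tn = sum(1 for y, p in pairs if not y and not p)
    return tp / (tp + fn), fp / (fp + tn)


# Made-up example: 4 trends and 4 non-trends.
labels = [True, True, True, True, False, False, False, False]
predictions = [True, True, True, False, True, False, False, False]
tpr, fpr = rates(labels, predictions)
print(tpr, fpr)
```

Loosening the detection rule generally raises both rates together, which is the tradeoff the FPR / TPR slide plots.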
  46-50. Concluding Remarks
      •  Algorithm to detect trends early
      •  Scalable nonparametric time series analysis: classification, anomaly detection, prediction