Gimme More! Supporting User Growth in a Performant and Efficient Fashion

Published in: Technology, Economy & Finance

  1. Gimme More! Supporting User Growth in a Performant and Efficient Fashion
     Arun Kejariwal (@arun_kejariwal), Winston Lee (@winstl)
     Capacity Engineering @ Twitter, November 2013
  2. User Experience
     •  Anytime, anywhere, any device
        o  5.2 billion mobile users by 2017 [1]
        o  More than 10 billion mobile devices/connections by 2017 [1]
        o  Worldwide mobile data traffic will reach 11.2 exabytes/month by 2017 (13x increase) [1]
     •  Real-time performance
     [1] http://newsroom.cisco.com/release/1135354 (Feb. 5, 2013)
  3. Capacity Planning: Why bother?
     •  Organic growth
        o  Over 230M monthly active users [1]
     •  User engagement
     •  Evolving product landscape
        o  Cards, Photos, Vines
           -  Mobile video will increase 16-fold between 2012 and 2017 [2]
     •  Events, planned or unplanned
     [1] http://www.sec.gov/Archives/edgar/data/1418091/000119312513400028/d564001ds1a.htm
     [2] http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-520862.html
  4. Approaches to Capacity Planning
     •  Throw hardware at the problem
        o  How much?
        o  What kind? (Inventory management etc.)
        o  Operationally inefficient!
     •  Reactive approach
     Callouts: Bottom line, Poor UX
  5. Systematic Capacity Planning
     •  Objectives
        o  Check under-allocation
           -  Performance
           -  Availability
              *  Adversely impacts user experience
        o  Check over-allocation
           -  Operational efficiency
              *  Adversely impacts bottom line
     •  Determine capacity needed proactively via forecasting
        o  Business metrics
        o  System resource usage
  6. Systematic Capacity Planning: Forecasting
     •  Key questions
        o  Which data?
           -  Raw
           -  Periodic max
           -  Moving average
        o  Data granularity
           -  Minutely
           -  Daily
              *  Depends
        o  Which model?
           -  Linear
           -  Spline
           -  Holt-Winters
           -  ARIMA
     Callout: Non-trivial!
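The "Which data?" options can be illustrated with a minimal sketch. The helper names, window sizes, and sample values below are assumptions made for illustration; the slides do not say how the periodic maxima or moving averages were actually computed.

```python
from statistics import mean

def periodic_max(values, period):
    """Collapse a series into one maximum per consecutive window of `period` points."""
    return [max(values[i:i + period]) for i in range(0, len(values), period)]

def moving_average(values, window):
    """Trailing moving average; the output has len(values) - window + 1 points."""
    return [mean(values[i - window:i]) for i in range(window, len(values) + 1)]

# Hypothetical raw samples (made-up numbers): two periods of four points each.
raw = [10, 14, 12, 11, 13, 18, 15, 14]

per_period_max = periodic_max(raw, period=4)   # one peak per period
smoothed = moving_average(raw, window=4)       # smoothed view of the raw series
```

Periodic maxima keep the peaks that capacity has to absorb, while a moving average hides them, so the choice of input changes what the forecast means.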
  7. Good old Linear Regression
     Adjusted R-squared: 0.6062
     [Figure: Linear Regression based Forecast; legend: Raw Data, Forecast]
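A minimal sketch of a linear-regression forecast with the adjusted R-squared reported on the slide. The data and function names are hypothetical, and this is ordinary least squares on a single time index, which may differ from the tooling used for the talk.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def adjusted_r_squared(xs, ys, intercept, slope, n_predictors=1):
    """R-squared penalised for the number of predictors in the model."""
    n = len(ys)
    y_bar = sum(ys) / n
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

# Hypothetical daily usage values (made-up numbers).
days = list(range(10))
usage = [52, 55, 53, 58, 61, 59, 64, 66, 65, 70]

intercept, slope = fit_line(days, usage)
adj_r2 = adjusted_r_squared(days, usage, intercept, slope)
forecast_day_30 = intercept + slope * 30  # extrapolate the fitted line
```

A low adjusted R-squared, as on the slide, signals that a straight line explains only part of the variation, so the extrapolation should be treated with caution.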
  8. Linear Regression using periodic max
     Adjusted R-squared: 0.5673
     Standard error: 2.45x
     [Figure: Linear Regression Using Maxes based Forecast; legend: Raw Data, Forecast]
  9. Splines
     •  Smooth Spline
        o  λ: penalty for “wiggliness”
     [Figure: Spline based Fitting; legend: Raw Data, Fitted]
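For reference, the role of λ can be written out using the standard cubic smoothing-spline objective. This is the textbook formulation, assumed here; the slide does not show the exact criterion used. The symbols $t_i$, $y_i$ and $f$ are introduced for illustration.

```latex
\hat{f} \;=\; \arg\min_{f} \; \sum_{i=1}^{n} \bigl( y_i - f(t_i) \bigr)^2 \;+\; \lambda \int f''(t)^2 \, dt
```

The first term rewards closeness to the raw data and the second penalises curvature. As λ approaches 0 the fit interpolates the data, and as λ grows it approaches a straight-line least-squares fit.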
  10. Splines
     [Figure: Spline based Forecast; legend: Raw Data, Forecast]
  11. Splines
     •  Sensitive to the nature of the time series at the boundary
     [Figure labels: Boundary 1, Boundary 2]
  12. Splines – Take 2
     Forecast is 8.31x higher than the end of the time series
     [Figure: Spline based Forecast (Boundary 1); legend: Raw Data, Forecast]
  13. Splines – Take 3
     Forecast is 3.77x higher than the end of the time series
     [Figure: Spline based Forecast (Boundary 2); legend: Raw Data, Forecast]
  14. Holt-Winters
     •  Triple exponential smoothing
        o  Estimate of linear trend
        o  Seasonal correction factors
     [Figure: Holt-Winters based Fitting; legend: Raw Data, Fitted]
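A minimal sketch of additive triple exponential smoothing, showing where the trend estimate and the seasonal correction factors enter. The initialisation scheme, the smoothing parameters, and the data are assumptions made for illustration; the slides do not state which variant or parameter values were used.

```python
def holt_winters_additive(series, season_len, alpha, beta, gamma, horizon):
    """Additive Holt-Winters: returns `horizon` forecasts past the end of `series`."""
    if len(series) < 2 * season_len:
        raise ValueError("need at least two full seasons to initialise")

    first = series[:season_len]
    second = series[season_len:2 * season_len]
    level = sum(first) / season_len                       # initial level
    trend = (sum(second) - sum(first)) / season_len ** 2  # initial per-step trend
    seasonals = [y - level for y in first]                # initial seasonal factors

    for t, y in enumerate(series):
        s = seasonals[t % season_len]
        prev_level = level
        # Level: deseasonalised observation blended with the previous projection.
        level = alpha * (y - s) + (1 - alpha) * (prev_level + trend)
        # Trend: estimate of the linear trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # Seasonal correction factor for this position in the season.
        seasonals[t % season_len] = gamma * (y - level) + (1 - gamma) * s

    n = len(series)
    return [level + h * trend + seasonals[(n + h - 1) % season_len]
            for h in range(1, horizon + 1)]

# Hypothetical daily peaks: a weekly pattern growing by 2 units per week.
weekly_pattern = [50, 52, 55, 54, 53, 40, 38]
daily_peaks = [p + 2 * week for week in range(3) for p in weekly_pattern]

forecast = holt_winters_additive(daily_peaks, season_len=7,
                                 alpha=0.3, beta=0.1, gamma=0.2, horizon=7)
```

In practice the three smoothing parameters are usually chosen by minimising the one-step-ahead error rather than fixed by hand as they are here.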
  15. Holt-Winters
     [Figure: Holt-Winters based Forecast; legend: Raw Data, Upper 95% CI, Forecast]
  16. ARIMA
     •  Auto-Regressive Integrated Moving Average
        o  (p, d, q)
           -  p: autoregressive order
           -  d: integrated order
           -  q: moving average order
     Formula annotations: autoregressive component, moving average component
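The formula on the slide did not survive extraction. For reference, one common textbook way to write an ARIMA(p, d, q) model uses the backshift operator $B$ (where $B y_t = y_{t-1}$). The notation below is an assumption and may differ from the slide's.

```latex
\underbrace{\Bigl( 1 - \sum_{i=1}^{p} \phi_i B^i \Bigr)}_{\text{autoregressive component}}
(1 - B)^d \, y_t
\;=\; c \;+\;
\underbrace{\Bigl( 1 + \sum_{j=1}^{q} \theta_j B^j \Bigr)}_{\text{moving average component}}
\varepsilon_t
```

Here $(1 - B)^d$ differences the series $d$ times, $\phi_i$ and $\theta_j$ are the fitted coefficients, $c$ is a constant, and $\varepsilon_t$ is white noise.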
  17. ARIMA
     •  Fitting
     [Figure: Auto ARIMA based Fitting; legend: Raw Data, Fitted]
  18. ARIMA – Take 1
     (p, d, q): (0,1,1)(0,1,1)[7]
     [Figure: ARIMA based Forecast; legend: Raw Data, Upper 95% CI, Forecast]
  19. ARIMA – Take 2
     (p, d, q): (1,1,1)(2,0,0)[7]
     [Figure: Auto ARIMA based Forecast; legend: Raw Data, Upper 95% CI, Forecast]
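The two specifications read as seasonal models of the form (p, d, q)(P, D, Q)[m] with a weekly period m = 7. Under the usual textbook convention, with $B$ the backshift operator ($B y_t = y_{t-1}$) and any constant or drift term omitted, they expand as follows. The coefficient symbols are introduced here for illustration.

```latex
% Take 1: (0,1,1)(0,1,1)[7]
(1 - B)(1 - B^{7}) \, y_t \;=\; (1 + \theta_1 B)(1 + \Theta_1 B^{7}) \, \varepsilon_t

% Take 2: (1,1,1)(2,0,0)[7]
(1 - \phi_1 B)(1 - \Phi_1 B^{7} - \Phi_2 B^{14})(1 - B) \, y_t \;=\; (1 + \theta_1 B) \, \varepsilon_t
```

Take 1 differences the series both day-over-day and week-over-week, while Take 2 differences only day-over-day and captures the weekly pattern through two seasonal autoregressive terms.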
  20. Impact of Outliers
  21. Forecast without outlier
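The slides do not say how outliers were identified or removed. As one possible approach, the sketch below flags points with a large modified z-score (based on the median absolute deviation) and replaces them with the median. The function name, threshold, and data are assumptions for illustration only.

```python
from statistics import median

def replace_outliers(values, threshold=3.5):
    """Replace points whose modified z-score exceeds `threshold` with the median."""
    med = median(values)
    mad = median([abs(v - med) for v in values])  # median absolute deviation
    if mad == 0:
        return list(values)  # no spread to measure against; leave data as-is
    cleaned = []
    for v in values:
        modified_z = 0.6745 * (v - med) / mad
        cleaned.append(med if abs(modified_z) > threshold else v)
    return cleaned

# Hypothetical series with a single spike (made-up numbers).
observed = [10, 11, 9, 10, 12, 95, 11, 10]
cleaned = replace_outliers(observed)
```

A global median is a crude replacement for a series with trend or seasonality; a rolling window or a seasonally adjusted residual would be a more careful choice there.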
  22. Good “enough”?
  23. Impact of “Corrections”
  24. Implications of data characteristics
     [Figure: ARIMA based forecast; legend: Raw Data, Upper 95% CI, Forecast]
  25. Forecast without the boundary case
     [Figure: ARIMA based Forecast - Without initial spike; legend: Raw Data, Upper 95% CI, Forecast]
  26. Forecast with truncation
     [Figure: ARIMA based Forecast - Truncated and Without initial spike; legend: Raw Data, Upper 95% CI, Forecast]
  27. Lessons learned
     •  Data fidelity
        o  Anomalies
        o  Absence of seasonality
     •  Modeling
        o  Never perfect
           -  Assess forecasting error
        o  Continuous refinement
           -  Incoming data stream is dynamic
              *  Organic growth
              *  New products
              *  Behavioral aspect
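"Assess forecasting error" can be sketched as a comparison of forecasts against held-out actuals. The two metrics and the sample values below are assumptions; the slides do not name the error measure that was used.

```python
from math import sqrt

def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root mean squared error, in the units of the series."""
    return sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

# Hypothetical held-out observations and the forecasts made for them.
held_out = [100, 200]
predicted = [110, 180]

error_pct = mape(held_out, predicted)
error_abs = rmse(held_out, predicted)
```

Recomputing these errors as new data arrives is one way to act on the "continuous refinement" point: a rising error suggests the model or its inputs need revisiting.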
  28. Acknowledgements
     •  Capacity Engineering Team
     •  Management team
  29. Join the Flock
     Like problem solving? Like challenges? Be at the cutting edge. Make an impact.
     •  We are hiring!!
        o  https://twitter.com/JoinTheFlock
        o  https://twitter.com/jobs
        o  Contact us: @arun_kejariwal, @winstl