Tableau @ Spil Games

  1. Reporting, Analytics, and Tableau at Spil Games: 0-100 KPH in One Year. Presented by: Rob Winters, 30 October 2012
  2. Who is Spil Games?
     • 200 million UVs per month (Google Analytics)
     • 2-3 billion pageviews per month, >1 billion game plays
     • Local portals in 19 languages with traffic from 219 countries a month
     • Developer, publisher, and platform
     • Target audience: girls, boys, adult females
     • Image labels: Titles, Portals
  3. Why Reporting and Analytics Became Important
     • Slowing growth in core markets plus increased competition equaled less rapid EBITDA growth
     • Focus on personalization and user-centricity (changing market expectations)
     • Change in revenue streams and products (Advertising to End User Monetization)
     • Image caption: These guys showed that data mattered
  4. August 2011: Starting from Ground Zero
     • Data “Piggy Bank”: 5 GB of data; unindexed/unusable; direct copies of some production data
     • Reporting: two dashboards, manually generated weekly
     • “Analytics”
  5. Today we have a different landscape (layers: Analytics, Reporting, Data Platform)
     • Data Warehouse
       • >700 GB compressed / >2.5 TB uncompressed
       • Largest tables load >50M records/day
     • MapReduce
       • >150M events/day
       • >500M events/day by Q1 2013
     • Tableau Server
       • Daily, weekly, and monthly push reports
       • >192 views, 5.5 GB of extracts and data sources
       • 100+ sessions per day
     • Tableau Desktop plus R
       • Five desktop users of Tableau
       • Driving company forecast process and planning
       • Analysis forms backbone of new strategies
  6. The Team that Built It
     • Two Analytics Specialists
     • One Reporting Specialist
     • Two DBAs
     • One Hadoop Developer
     • One Freelance Python Developer
  7. Why we chose Tableau
     • Speed of development
     • Flexibility in dealing with messy data
     • Combined analytics and reporting
     • From the best city in the world
  8. Reporting
  9. Push Reporting
     • Reports are scheduled and pushed daily, weekly, and monthly from the Tableau Server
     • Tools used:
       • Tableau Server + tabcmd: pull reports
       • sqlRun: ODBC command line tool used to build message bodies
       • BLAT: command line email client
     • Process:
       • SQL processes: grab key values from the DWH; log into the Tableau Server Postgres DB and confirm that the extract update ran
       • Build message body: modify DB outputs to match the expected format (ex. “.” -> “,”); echo text into the body text file
       • Pull reports from Tableau Server:
         • start /wait "Logoff" "Tabcmd" logout
         • start /wait "Login" "Tabcmd" login -s https://reporting.spilgames.com -u USERNAME -p PASSWORD
         • start /wait "Pull Report" "Tabcmd" get views/DailyEUMRevenueReport/DailyEUMRevenueReportPush.pdf -f "X:\Reporting data\Todays Reports\Daily EUM Report for %AWSDT%.pdf"
       • Send emails and archive: email list managed via ActiveDirectory, emails pushed via SMTP; every push report is archived to a share drive to keep a permanent record
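The message-body and report-pull steps on the slide above can be sketched in Python. This is a minimal illustration, not the team's batch script: the function names and the KPI dictionary are hypothetical, and the tabcmd calls are only built as argument lists, never executed.

```python
from datetime import date


def to_decimal_comma(value, places=2):
    """Format a number with a decimal comma (the "." -> "," step)."""
    return f"{value:.{places}f}".replace(".", ",")


def build_message_body(report_date, kpis):
    """Build a plain-text email body from key values pulled from the DWH.

    `kpis` is a hypothetical {name: number} mapping.
    """
    lines = [f"Daily report for {report_date.isoformat()}"]
    for name, value in kpis.items():
        lines.append(f"{name}: {to_decimal_comma(value)}")
    return "\n".join(lines)


def tabcmd_commands(server, user, password, view_path, out_file):
    """Return the tabcmd calls in the order shown on the slide (not run)."""
    return [
        ["tabcmd", "logout"],
        ["tabcmd", "login", "-s", server, "-u", user, "-p", password],
        ["tabcmd", "get", view_path, "-f", out_file],
    ]
```

Each returned list could be handed to `subprocess.run` on a machine where tabcmd is installed; keeping command construction separate from execution makes the sequence easy to check first.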
  10. Sample Push Report
  11. Reporting Platform: web reporting platform
      • 25 power users (10% of local office)
      • 5 sites:
        • Reporting
        • Development
        • Three sites for business partners
      • 87 workbooks, 192 views
      • >100 sessions per day
      • One dashboard accounts for 50% of all views
  12. What we have done that DIDN’T work and other issues
      • Change to the site: Added dashboards to various pages to show report update status, primary KPIs on the landing page, etc.
        Issue: Slow load speed of reports caused issues; different permissions between sites led to partners having issues
      • Change to the site: Custom HTML on the page to link to documentation, report request forms, and email the reporting team
        Issue: Due to Tableau’s “update” process, HTML would have to be manually replaced with each version change
      • Change to the site: Custom CSS to match branding
        Issue: Same issue as custom HTML
  13. Recommendations for Reporting
      • Tableau butchers custom SQL. When possible, use views, tables, or projections
      • Huge amounts of usage data are available on the server back-end; use it to your advantage
      • Tabcmd can handle custom variables easily, opening the potential for users to request highly personalized reports (or batch produce reports with a loop)
      • Balance flexibility with data size, and use extracts for reports which require significant dimensionality
      • If using the server for multiple functions (ex. reporting AND analysis-sharing), make separate sites to avoid confusion on data quality
      • Use parameters and actions to make your report dynamic
      • You can lead a horse to water, but you can’t make it drink
      • Make it easy to search with tags
      • Provide easy access to documentation and contact forms
      • Resist the urge to make duplications of data for users wanting slightly modified reports
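The recommendation about batch producing reports with a loop can be sketched as follows. This is a hedged illustration: it assumes the view exposes a filter that can be set through a URL query parameter, and the filter name, view path, and output directory are hypothetical. The commands are built as argument lists and not executed.

```python
from urllib.parse import quote


def personalized_get_commands(view_path, filter_field, values, out_dir):
    """Build one `tabcmd get` call per filter value (commands are not run).

    Assumes the view accepts `?<filter_field>=<value>` in its URL.
    """
    commands = []
    for value in values:
        url = f"{view_path}?{quote(filter_field)}={quote(value)}"
        out_file = f"{out_dir}/{value}.pdf"
        commands.append(["tabcmd", "get", url, "-f", out_file])
    return commands
```

For example, calling it with the markets `["France", "United Kingdom"]` yields two commands, one PDF per market, with the space in "United Kingdom" percent-encoded in the URL.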
  14. Analytics
  15. Analytics at Spil: many tools make effective work
      • R plus Tableau combine to form a well-trained athlete
      • Process: explore data in SQL and Tableau -> build models in R and evaluate -> test via A/B testing -> implement reporting with Tableau
      • Both form a critical part of Spil’s analytics
      • But for simple problems, Tableau is sufficient
  16. We have found each tool is optimal for different purposes
      • When we use Tableau:
        • Multidimensional trending analysis (including comparing trends)
        • Distribution analysis
        • Visualization of small multiples
        • Exploratory analysis
      • When we use R:
        • Modeling/forecasts (ARIMA, regression, etc.)
        • Seasonal decomposition
        • Tree-based analysis
        • Statistical analysis (correlations, t-tests, ANOVA)
        • Data mining
  17. Analysis Advice
      • Structuring your data BEFORE Tableau forces you to consider dimensions/attributes. SQL and Hive are your friends.
      • We are TOO good at seeing patterns, so “trust but verify” what you learn from Tableau with more robust tools like SAS or R.
      • Remember Occam’s Razor: use the simplest possible visualization that can accurately convey the information, but no simpler
  18. Case: Content Recommendation A/B Test
  19. Most content on the home page was geared to the under-12 audience, yet analysis showed older users were more valuable
      • Can we ensure that the content interests for the most valuable users are met?
  20. Tableau and R were used simultaneously to accelerate analysis and modeling process
      Work flow:
      1. A variety of base and calculated variables were created in SQL and loaded into a reference table
      2. Data was loaded into Tableau and explored to find “natural”/visual relationships and break points
      3. New variables were created or added based on visual exploration
      4. Revised data set was loaded into R for modeling
         • Step 1: Stepwise logistic models predicting probability of game play based on variables from step 3
         • Step 2: Build behavioral clustering models and compare to demographic segmentation
         • Step 3: Model other industry standard approaches (ex. slope one, cosine similarity) in R and measure reduction in AIC
      5. Users were assigned to appropriate clusters and distributions of various variables explored in R
      6. Models were tuned and made ready for production
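Slope one, named in the modeling step above, predicts a user's rate for one item from the average difference between that item and the items the user already has rates for. The sketch below is a generic weighted slope one in Python, not the team's R implementation; the data layout (`{user: {game: play_rate}}`) and function names are assumptions for illustration.

```python
from collections import defaultdict


def fit_slope_one(rates):
    """Compute average pairwise differences and co-occurrence counts.

    `rates` maps user -> {item: rate}. Returns (dev, count), both keyed
    by (item_j, item_i), where dev is the mean of rate_j - rate_i.
    """
    diff_sum = defaultdict(float)
    count = defaultdict(int)
    for user_rates in rates.values():
        for j, rate_j in user_rates.items():
            for i, rate_i in user_rates.items():
                if i != j:
                    diff_sum[(j, i)] += rate_j - rate_i
                    count[(j, i)] += 1
    dev = {pair: diff_sum[pair] / count[pair] for pair in count}
    return dev, dict(count)


def predict(user_rates, target, dev, count):
    """Weighted slope one prediction of `target` for one user, or None."""
    numerator = 0.0
    weight = 0
    for i, rate_i in user_rates.items():
        pair = (target, i)
        if i != target and pair in count:
            # Weight each item's estimate by how many users share the pair.
            numerator += (dev[pair] + rate_i) * count[pair]
            weight += count[pair]
    return numerator / weight if weight else None
```

The weighting means that item pairs observed together by many users contribute more to the prediction than pairs seen only once.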
  21. Ultimately, an ensemble of models was built to recommend content
      • Segmentation: K-means clustering on 30+ factors, dividing the user base into 30 different behavioral segments plus demographic boosting
      • Content selection: ensemble model based on drivers predictive of play
        • Cosine similarity of player bases and probability of play
        • Weighted slope one modeling of relative play rates
        • General user feedback from user ratings and relative time on page
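Cosine similarity of player bases can be illustrated by treating each game's players as a set, which is equivalent to a binary user vector per game. This is a minimal sketch under that assumption; the slides do not say how the player bases were encoded, and the function names are hypothetical.

```python
import math


def cosine_similarity(players_a, players_b):
    """Cosine similarity of two games' player sets (binary vectors)."""
    if not players_a or not players_b:
        return 0.0
    shared = len(players_a & players_b)
    return shared / math.sqrt(len(players_a) * len(players_b))


def most_similar(game, player_bases, top_n=3):
    """Rank the other games by similarity of their player base to `game`."""
    scores = [
        (other, cosine_similarity(player_bases[game], players))
        for other, players in player_bases.items()
        if other != game
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:top_n]
```

Two games with four players each and two players in common score 2 / sqrt(4 * 4) = 0.5.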
  22. Case: Monthly Forecasting Process
  23. Step One: Initial forecast is built using R
      A bottom-up ARIMA forecast is generated for each core market/business channel/traffic source split (approximately 500 forecasts)
      1. Traffic (visits) is forecast using R’s auto-ARIMA functionality
         • Multiple ARIMA models plus time series linear regressions are built and compared based on AIC/AICc, with the best-fit model selected
         • Forecasts are then rolled up to market/channel level (approx. 120 forecasts)
      2. Primary interactions (casual and social gameplays) are forecast on a per-visit basis based on historical patterns and known seasonality matrices
      3. Secondary interactions (navigational pageviews) are forecast based on primary interaction forecasts, historical data, and other regressors
      4. Advertising impressions are loaded into the model on a market/channel/page type basis to generate total impressions by type and location
      5. eCPMs are forecast on a market/channel/page type basis
      6. Forecast data is aggregated and loaded into the data warehouse for tracking
      Completely parallel: using a quad-core machine, total forecasting time is under one hour per month
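The selection and roll-up logic in step 1 can be sketched in Python. The slides do this in R; the code below only illustrates the AIC/AICc comparison and the summing of split-level forecasts, and assumes each candidate model already comes with a log-likelihood and a parameter count. All names and the data layout are hypothetical.

```python
def aic(log_likelihood, k):
    """Akaike information criterion for a model with k parameters."""
    return 2 * k - 2 * log_likelihood


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)


def select_best(candidates, n):
    """Pick the (name, log_likelihood, k) candidate with the lowest AICc."""
    return min(candidates, key=lambda c: aicc(c[1], c[2], n))


def roll_up(split_forecasts):
    """Sum {(market, channel, source): [monthly values]} to (market, channel)."""
    rolled = {}
    for (market, channel, _source), values in split_forecasts.items():
        key = (market, channel)
        if key not in rolled:
            rolled[key] = list(values)
        else:
            rolled[key] = [a + b for a, b in zip(rolled[key], values)]
    return rolled
```

Because each split is fitted independently, the per-split selection is easy to distribute across cores, which fits the "completely parallel" remark on the slide.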
  24. Step Two: Exploratory variance analysis using Tableau
      Why Tableau:
      • Faster than R for rapid exploration
      • Flexible adjustment of plot structure while exploring data with leadership
      • Clear visualization without planning
  25. Step Three: Modify forecast with business and load adjusted forecast to data warehouse
      1. Forecast data is loaded into Excel template (the table below) to load in adjustments
      2. Channel/market leaders provide feedback on initiatives and expected impact, along with non-initiative adjustments (if needed)
      3. Revised forecast data is committed and uploaded; primary outputs (visits, pageviews, gameplays, advertising revenues) are recalculated
      4. Final forecast is shared with Management Team

      Excel template, channel: Family (all columns). The template shows three blocks per market: the values below, a block in which every entry is .0, and a third block that repeats the first.

      Market          Jan 12 Feb 12 Mar 12 Apr 12 May 12 Jun 12 Jul 12 Aug 12 Sep 12 Oct 12 Nov 12 Dec 12  Grand Total
      Austria            1.0    1.1    1.0    1.0    1.0    1.0    1.0    0.9    0.8    1.0    1.0    1.0         11.8
      Belgium            2.5    2.7    2.4    2.5    2.3    2.5    2.3    2.4    2.2    2.3    2.5    2.7         29.2
      France            17.1   18.0   16.8   17.2   16.6   16.9   16.1   15.5   14.1   15.9   16.4   18.5        199.2
      Germany           12.9   12.8   12.8   12.4   12.1   12.8   12.7   11.3   10.5   11.2   11.6   12.6        145.6
      Italy             18.7   20.2   19.0   19.3   19.1   21.0   17.8   16.6   17.1   16.7   16.9   17.6        220.0
      Netherlands        5.6    5.4    5.2    5.0    4.9    5.2    4.7    4.2    4.0    4.6    4.6    4.9         58.3
      Poland            34.1   34.8   33.2   30.9   28.7   31.4   28.8   29.5   25.2   26.7   29.2   33.8        366.2
      Portugal           2.0    2.0    2.3    2.3    2.2    2.6    2.6    2.5    2.1    1.9    1.9    2.3         26.9
      Russia             5.8    5.7    6.2    5.4    5.1    4.2    3.5    3.7    3.4    4.1    4.5    4.7         56.2
      Spain              5.2    5.4    5.4    5.6    5.3    5.8    5.1    5.1    4.9    4.6    4.4    5.3         62.1
      Sweden             3.1    3.0    2.9    2.8    2.6    2.6    2.1    2.3    2.2    2.6    2.7    3.0         31.9
      Switzerland        0.9    0.9    0.9    0.9    0.9    0.9    0.8    0.8    0.7    0.9    0.9    0.9         10.5
      Ukraine            1.5    1.6    1.7    1.5    1.3    1.1    0.9    0.9    0.8    0.9    0.9    1.0         14.1
      United Kingdom     2.5    2.7    2.6    2.7    2.6    2.6    2.7    2.4    2.0    2.4    2.4    2.8         30.6
      United States      5.1    5.3    5.4    5.0    5.1    5.4    5.4    4.8    4.3    4.4    4.8    5.2         60.3
      Canada             1.1    1.1    1.2    1.1    1.0    1.0    1.0    1.0    0.9    1.0    1.0    1.1         12.5
      Turkey            29.9   25.0   25.8   23.5   24.0   24.7   23.6   23.8   22.0   20.7   20.9   21.1        285.1
      India              0.5    0.5    0.7    0.8    1.0    0.9    0.6    0.6    0.6    0.6    0.6    0.6          8.0
      Indonesia          0.7    0.5    0.6    0.7    0.8    0.9    1.0    1.0    0.7    0.6    0.7    0.8          9.0
      Argentina          9.7   10.1    9.9    9.4   10.0   10.6   11.5   10.8    9.5    9.7    8.2    9.5        119.0
      Brazil            27.2   23.7   22.8   21.7   22.8   24.0   24.7   22.7   20.5   20.6   19.2   22.5        272.6
      Mexico            12.0   12.0   13.2   13.4   13.8   14.4   14.4   13.3   10.2   10.0    9.6   11.3        147.6
      LATAM             17.6   16.6   16.8   16.4   16.4   18.2   18.6   18.0   15.3   15.3   14.2   16.5        199.8
      ROW               15.7   13.8   14.5   13.5   13.7   14.8   14.4   13.2   11.2   10.9   11.2   12.6        159.6
      Grand Total      232.6  225.1  223.3  214.9  213.5  225.8  216.4  207.3  185.2  189.4  190.3  212.4       2536.1
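The adjust-then-recalculate step on this slide can be sketched as follows. The slides do not say whether adjustments are additive or multiplicative, nor how the primary outputs are derived, so everything here is an assumption: adjustments are added per month, and outputs are chained from visits through hypothetical per-visit and per-pageview ratios to revenue via eCPM (revenue per 1,000 impressions).

```python
def apply_adjustments(baseline, adjustments):
    """Add per-month adjustments to the baseline forecast.

    Both arguments map market -> list of monthly values. Markets with
    no entry in `adjustments` are left unchanged (assumed additive).
    """
    adjusted = {}
    for market, values in baseline.items():
        deltas = adjustments.get(market, [0.0] * len(values))
        adjusted[market] = [b + d for b, d in zip(values, deltas)]
    return adjusted


def recalculate_outputs(visits, plays_per_visit, pageviews_per_visit,
                        impressions_per_pageview, ecpm):
    """Recompute primary outputs from visits (all ratios are hypothetical)."""
    gameplays = visits * plays_per_visit
    pageviews = visits * pageviews_per_visit
    impressions = pageviews * impressions_per_pageview
    revenue = impressions * ecpm / 1000.0  # eCPM is per 1,000 impressions
    return {
        "visits": visits,
        "gameplays": gameplays,
        "pageviews": pageviews,
        "impressions": impressions,
        "revenue": revenue,
    }
```

With an empty adjustments mapping the adjusted forecast equals the baseline, which matches the template state shown on the slide, where every adjustment entry is .0.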
  26. Step Four: Activity is monitored within Tableau
