Google Statistics 2012


Published in: Business, Technology

  1. Google Confidential and Proprietary. Statistics at Google. Royal Statistical Society, 5 Sept 2012. Hal Varian.
  2. Statistics and me. SB, MIT (Economics); MA, Berkeley (Math); PhD, Berkeley (Economics: mathematical economics, econometrics). Stat courses from Bickel, Lehmann, and others. Worked as a computer programmer for Dan McFadden on logit. Assistant professor at MIT; taught an undergrad stat class. Most of my published work is in economic theory. Started working on the economics of the internet at UMich in 1992. Became dean of the School of Information at Berkeley in 1995. 1998: co-authored Information Rules. 2000-2007: wrote the monthly column Economic Scene for the New York Times. 2002: stepped down at Berkeley, spent a year at Google. 2009: New York Times quote.
  3. Google and me. How I started working at Google: ad auction; forecasting; advertiser churn and lifetime value; etc. Ad Stats team. Evolution of statistics at Google: evolution of organizational model; current model. 650 people on statistics mailing list, 350 on bi-annual meeting list. Analyst newsletters produced monthly. Twice-a-year meeting, about 100 attend.
  4. What does the Chief Economist do? Answer the questions management is going to ask next month... Team roles: revenue analysis (trends for vertical, country, product); program evaluation (adoption rates, attrition rates, impact); predictive modeling (advertisers, user behavior); experiments; policy (IP, privacy, antitrust, telecom); auction design; computational infrastructure; ad effectiveness, marketing; internal consulting, etc. Now Microsoft, Amazon, eBay, Yahoo, Intel all have Chief Economists.
  5. What do statisticians do at Google? Teams: Economics, Hardware Operations, Quantitative Marketing, Video and TV Insights, Content Ads, Ads Quality, Conversions and Attribution, Quantitative User Experience, Travel Analytics, Sales Finance, Affiliate Ops, People Analytics, Engineering Statistics, Machine Learning, Search Quality. Projects: forecasting and planning, project evaluation, testing new features, modeling behavior of advertisers, publishers and users, auction design, tool building, survey analysis, etc. Examples of projects...
  6. How Google makes money...
  7. Simplified rules of ad auction: 1. Order ads by bid per click. 2. Highest bid gets best position, etc. 3. Each ad pays the bid of the advertiser below it. Advertiser wants to maximize profit. English: value of click * number of clicks - cost of clicks. Math: max v x - c(x). So at the optimum: v = c'(x), "value = marginal cost". How can we estimate marginal cost at the current operating position?
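
The three simplified rules and the profit expression on this slide can be sketched in a few lines of Python. This is only an illustration of the rules as stated: the function names, the `floor` price paid by the last ad when nobody is ranked below it, and the example bids are assumptions, not details from the talk.

```python
def run_auction(bids, num_slots, floor=0.0):
    """Rank ads by bid per click; each shown ad pays the bid just below it.

    bids: dict mapping advertiser name -> bid per click.
    floor: hypothetical price for the last ad when nobody is ranked below it.
    Returns a list of (advertiser, price_per_click), best position first.
    """
    # Rule 1: order ads by bid per click (highest first).
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    outcome = []
    # Rule 2: the highest bid gets the best position, and so on down the page.
    for position, (advertiser, _bid) in enumerate(ranked[:num_slots]):
        below = position + 1
        # Rule 3: each ad pays the bid of the advertiser ranked below it.
        price = ranked[below][1] if below < len(ranked) else floor
        outcome.append((advertiser, price))
    return outcome


def profit(value_per_click, clicks, cost):
    """Advertiser objective from the slide: v * x - c(x)."""
    return value_per_click * clicks - cost


# Made-up bids: A and B win the two slots and pay B's and C's bids.
example = run_auction({"A": 3.00, "B": 2.00, "C": 1.00}, num_slots=2)
```

With these made-up bids, `example` lists A in the top position paying B's bid and B in the second position paying C's bid.
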
  8. Bid simulator. The only real unknown is the clicks the advertiser would get in other positions.
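
Given simulated (clicks, cost) pairs for alternative bids, a discrete stand-in for the marginal cost c'(x) is the incremental cost per click between neighbouring points. The sketch below assumes that interpretation; the function name and the numbers are hypothetical and are not taken from the bid simulator itself.

```python
def incremental_cost_per_click(points):
    """Discrete marginal cost between neighbouring simulated operating points.

    points: iterable of (clicks, cost) pairs, one per simulated bid.
    Returns a list of ((clicks_low, clicks_high), extra_cost_per_extra_click).
    """
    ordered = sorted(points)
    result = []
    for (x0, c0), (x1, c1) in zip(ordered, ordered[1:]):
        if x1 > x0:  # skip pairs that add no clicks
            result.append(((x0, x1), (c1 - c0) / (x1 - x0)))
    return result


# Made-up simulator output: (clicks, total cost) at three different bids.
simulated = [(100, 50.0), (150, 90.0), (180, 126.0)]
steps = incremental_cost_per_click(simulated)
```

An advertiser operating at the middle point has revealed that a click is worth at least the incremental cost of getting there and, under the same reasoning, no more than the incremental cost of moving up to the next point.
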
  9. Uses of bid value estimate: 1. Can be used to evaluate impact of system changes on advertisers. 2. Can compare changes in advertiser value over time and geo. 3. Can compare advertiser value in different auction configurations. Example: suppose all ad slots are identical, all bidders have identical value v, and there is a minimum reserve price r. With 3 slots and 3 bidders, everyone pays the minimum price r; with 3 slots and 4 bidders, everyone pays value v. Demand and supply...
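
The identical-slots example can be reproduced with a uniform-price rule: when slots are scarce the price is set by the highest bidder who does not get a slot, otherwise by the reserve. That clearing rule is an assumption made here to match the slide's two outcomes; the slide does not spell it out.

```python
def clearing_price(values, num_slots, reserve):
    """Uniform price for identical slots under an assumed clearing rule.

    If there are more eligible bidders than slots, the price is the highest
    losing value; otherwise every winner pays the reserve.
    """
    eligible = sorted((v for v in values if v >= reserve), reverse=True)
    if len(eligible) <= num_slots:
        return reserve
    return eligible[num_slots]  # value of the first bidder left without a slot


v, r = 5.0, 1.0  # hypothetical value and reserve
price_three_bidders = clearing_price([v, v, v], num_slots=3, reserve=r)
price_four_bidders = clearing_price([v, v, v, v], num_slots=3, reserve=r)
```

Adding a single extra bidder moves the price from r to v, which is the demand-and-supply point of the example.
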
  10. More complex model of auction. Google has impressions to sell, advertisers want to buy clicks. Need an exchange rate: bid per impression = bid per click x clicks per impression. Need to estimate clicks per impression. Logistic regression: Prob(click | impr) ~ X b. Estimate the world's largest logistic regression and update it in real time.
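
A toy version of this idea: a logistic regression updated one impression at a time by stochastic gradient ascent, with its predicted click probability used as the exchange rate. The class name, learning rate, and features are illustrative assumptions; the production system described in the talk is far larger and its details are not given.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class OnlineLogit:
    """Logistic regression Prob(click | impr) = sigmoid(x . b), updated per impression."""

    def __init__(self, n_features, learning_rate=0.1):
        self.b = [0.0] * n_features
        self.learning_rate = learning_rate

    def predict(self, x):
        return sigmoid(sum(bj * xj for bj, xj in zip(self.b, x)))

    def update(self, x, clicked):
        # Gradient of the log-likelihood for one observation: (y - p) * x.
        error = clicked - self.predict(x)
        self.b = [bj + self.learning_rate * error * xj for bj, xj in zip(self.b, x)]


def bid_per_impression(bid_per_click, clicks_per_impression):
    """Exchange rate from the slide: bid per click x clicks per impression."""
    return bid_per_click * clicks_per_impression


# Hypothetical stream: feature vector [intercept, ad_is_relevant], click outcome.
model = OnlineLogit(n_features=2)
for _ in range(200):
    model.update([1.0, 1.0], 1)
    model.update([1.0, 0.0], 0)
```
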
  11. Multi-armed bandits. Website Optimizer: allowed for A-B testing of web page design for users of Google Analytics. Optimize some objective, e.g., conversions. Experiments are expensive! Could not easily model features (font, colors, images, layout). Google Analytics Content Experiments: multi-armed bandit; far more cost-effective testing; more natural interpretation; can model features easily.
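
The slide does not say which bandit algorithm is used. As one common choice, the sketch below assumes Thompson sampling with Beta posteriors over each page variant's conversion rate; the conversion rates and visitor count are made up for the simulation.

```python
import random


def thompson_sampling(true_rates, n_visitors, seed=0):
    """Simulate a Beta-Bernoulli Thompson sampling experiment.

    true_rates: hypothetical conversion rate of each page variant.
    Returns (successes, failures) per variant after n_visitors.
    """
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(n_visitors):
        # Draw one plausible conversion rate per variant from its posterior
        # (uniform Beta(1, 1) prior) and show the variant with the best draw.
        draws = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
        arm = max(range(len(draws)), key=draws.__getitem__)
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


wins, losses = thompson_sampling([0.1, 0.5], n_visitors=2000)
visitors_per_variant = [w + l for w, l in zip(wins, losses)]
```

Traffic shifts toward the better variant as evidence accumulates, which is why a bandit wastes fewer visitors on losing designs than a fixed 50/50 split.
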
  12. Publisher quality. Quality score for each publisher based on observed performance. But what do you do for new publishers? Empirical Bayes model: publisher is a coin with prob p of heads (p = quality). Known distribution of coins in country A and B is the prior. Draw a coin from the distribution and flip it. Refine estimate of p according to standard Bayesian update. Can add other predictors (vertical). Problems: survivorship bias in the original distribution; potentially asymmetric loss function.
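
The coin analogy maps naturally onto a Beta-Binomial model. The sketch below assumes a Beta prior fitted by the method of moments to the quality rates of existing publishers, followed by the standard conjugate update for a new publisher; the fitting method and the example rates are assumptions, not details from the talk.

```python
from statistics import mean, pvariance


def fit_beta_prior(rates):
    """Method-of-moments Beta(alpha, beta) fit to observed publisher rates."""
    m = mean(rates)
    v = pvariance(rates)
    if not 0 < v < m * (1 - m):
        raise ValueError("rates are too concentrated or too dispersed for a Beta fit")
    scale = m * (1 - m) / v - 1
    return m * scale, (1 - m) * scale


def posterior_mean(alpha, beta, heads, flips):
    """Bayesian update of p after observing `heads` successes in `flips` trials."""
    return (alpha + heads) / (alpha + beta + flips)


# Hypothetical quality rates of established publishers in one country.
alpha, beta = fit_beta_prior([0.2, 0.4, 0.6, 0.8])
brand_new = posterior_mean(alpha, beta, heads=0, flips=0)   # prior mean only
after_data = posterior_mean(alpha, beta, heads=3, flips=4)  # shrunk toward prior
```

A new publisher starts at the prior mean and moves toward its own observed rate as data arrives.
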
  13. Impact of recession on Google. How would the US recession affect our revenue? Problem: only had 1 observation! But really had 50 observations. Could look at revenue by state and economic indicators such as personal income or unemployment rate. Estimate a longitudinal model to identify the response of revenue by state to an economic metric by state (unemployment, personal income). Can estimate how revenue responds to various scenarios about severity of recession.
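
One simple longitudinal specification is a regression with state fixed effects, estimated by demeaning within each state. The sketch below assumes that specification with a single economic metric; the slide does not give the actual model, and the panel data here is invented.

```python
def within_slope(panel):
    """Fixed-effects (within) estimate of the revenue response to a metric.

    panel: dict mapping state -> list of (metric, revenue) observations over time.
    Each state's own averages are removed, so only changes within a state
    identify the slope.
    """
    numerator = 0.0
    denominator = 0.0
    for observations in panel.values():
        x_bar = sum(x for x, _ in observations) / len(observations)
        y_bar = sum(y for _, y in observations) / len(observations)
        for x, y in observations:
            numerator += (x - x_bar) * (y - y_bar)
            denominator += (x - x_bar) ** 2
    return numerator / denominator


def scenario_change(slope, metric_change):
    """Predicted revenue change under a hypothetical change in the metric."""
    return slope * metric_change


# Invented panel: (unemployment rate, revenue index) per state over three periods.
panel = {
    "CA": [(5.0, 90.0), (8.0, 84.0), (10.0, 80.0)],
    "TX": [(4.0, 42.0), (6.0, 38.0), (9.0, 32.0)],
}
slope = within_slope(panel)
severe = scenario_change(slope, metric_change=3.0)
```
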
  14. Incrementality of ad clicks. Web search has organic results and sometimes ads. Question: How many incremental clicks for an advertiser does an ad generate? Solution: investigate natural ad spending variation: (1) ad spend stops or declines dramatically; (2) estimate how many clicks would have occurred; (3) compare this counterfactual to actual clicks. Engineering challenge: built a system to search for such cases and automatically apply the model.
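
The three steps can be illustrated with a deliberately naive counterfactual: assume that, without the pause, total clicks would have stayed at their pre-pause daily average. The real model is not described on the slide, so this stand-in and its numbers are assumptions.

```python
from statistics import mean


def incremental_click_share(pre, post):
    """Share of lost paid clicks that organic results did not recover.

    pre, post: lists of (paid_clicks, organic_clicks) per day before and after
    ad spend stops. The counterfactual is simply the pre-period average.
    """
    pre_paid = mean([paid for paid, _ in pre])
    pre_organic = mean([organic for _, organic in pre])
    post_paid = mean([paid for paid, _ in post])
    post_organic = mean([organic for _, organic in post])

    counterfactual_total = pre_paid + pre_organic  # clicks expected without the pause
    actual_total = post_paid + post_organic
    lost_clicks = counterfactual_total - actual_total
    paid_drop = pre_paid - post_paid
    return lost_clicks / paid_drop


# Invented example: paid clicks drop from 100 to 0 per day, organic rises by 15.
share = incremental_click_share(pre=[(100, 200)] * 7, post=[(0, 215)] * 7)
```

In this invented case 85 of every 100 paid clicks were incremental, because organic results recovered only 15 of them.
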
  15. Incrementality of mobile queries. How many mobile queries would have been issued on a desktop/laptop device? Obviously can't look at the difference between those with mobile devices and those without. Difference-in-differences analysis: look at the change of behavior in those who acquire a mobile device. $u_i$ = individual fixed effects, $s_t$ = seasonal fixed effects, $x_{it}$ = treatment (0-1): $q_{it} \sim u_i + s_t + b\,x_{it}$. This is the impact of treatment on those who choose to be treated, which is often what we want.
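
On a balanced panel, the coefficient b in a model with individual and period fixed effects can be obtained by removing user means and period means from both queries and treatment, then regressing one on the other. The sketch below assumes a balanced panel and uses invented data with a known effect.

```python
def two_way_fe_effect(q, x):
    """Estimate b in q_it = u_i + s_t + b * x_it on a balanced panel.

    q, x: dicts keyed by (user, period) holding queries and the 0-1 treatment.
    """
    users = sorted({i for i, _ in q})
    periods = sorted({t for _, t in q})

    def demean(z):
        grand = sum(z.values()) / len(z)
        user_mean = {i: sum(z[i, t] for t in periods) / len(periods) for i in users}
        period_mean = {t: sum(z[i, t] for i in users) / len(users) for t in periods}
        # Double demeaning removes both sets of fixed effects.
        return {
            (i, t): z[i, t] - user_mean[i] - period_mean[t] + grand
            for i in users
            for t in periods
        }

    q_tilde, x_tilde = demean(q), demean(x)
    numerator = sum(x_tilde[key] * q_tilde[key] for key in q_tilde)
    denominator = sum(value * value for value in x_tilde.values())
    return numerator / denominator


# Invented panel: desktop queries fall by 4 once a user acquires a mobile device.
user_effect = {"a": 10.0, "b": 20.0, "c": 5.0}
season_effect = {0: 0.0, 1: 1.0, 2: 3.0, 3: 2.0}
acquired_in = {"a": 2, "b": 1, "c": 99}  # user c never acquires a device
x = {(i, t): 1.0 if t >= acquired_in[i] else 0.0
     for i in user_effect for t in season_effect}
q = {key: user_effect[key[0]] + season_effect[key[1]] - 4.0 * x[key] for key in x}
effect = two_way_fe_effect(q, x)
```
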
  16. Nowcasting using Google Insights for Search
  18. Initial Claims for Unemployment Benefits in US
  19. Bayesian Structural Time Series. Combine 3 techniques: Kalman filter for trend and seasonality; spike-and-slab regression for variable choice; model averaging for final prediction. Example of BSTS model.
  20. BSTS modeling. Estimation: use MCMC for efficient simulation of the posterior. Can decompose the time series into structural components: trend, seasonal, regression. Other capabilities: variable selection using the spike-and-slab technique; multiple seasonalities; dynamic regression; holidays; AR(p) trends.
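
To make the Kalman filter ingredient concrete, here is a minimal filter for a local-level (random-walk trend) model. It covers only the trend component, treats both variances as known, and leaves out the seasonal terms, spike-and-slab regression, and MCMC that BSTS adds; all parameter values are illustrative.

```python
def local_level_filter(y, obs_var, level_var, init_mean=0.0, init_var=1e6):
    """Kalman filter for y_t = level_t + noise, level_t = level_{t-1} + drift.

    obs_var: variance of the observation noise (assumed known).
    level_var: variance of the random-walk step in the level (assumed known).
    Returns the filtered estimate of the level after each observation.
    """
    level, variance = init_mean, init_var
    filtered = []
    for observation in y:
        # Predict: the level follows a random walk, so uncertainty grows.
        variance += level_var
        # Update: move toward the observation in proportion to the Kalman gain.
        gain = variance / (variance + obs_var)
        level += gain * (observation - level)
        variance *= 1.0 - gain
        filtered.append(level)
    return filtered


# Invented series that is flat at 5: the filtered level settles at 5.
trend = local_level_filter([5.0] * 20, obs_var=1.0, level_var=0.1)
```
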
  21. Ex: Forecast UM Consumer Sentiment using query category data. White: positive predictor. Black: negative predictor.
  22. Short term prediction of UM Consumer Sentiment
  23. Google Consumer Surveys. Example of In-Line Prompt
  24. How it Works
  25. Consumer Sentiment
  26. Experimentation. Data mining is all the rage. Experimentation is what matters. Experimentation is far easier online. In 2010 Google ran about 10,000 experiments: 5,000 in search and 5,000 in ads. Implemented 400 improvements in search and a similar number in ads. At any one time on Google you are in a dozen or more experiments. Scope of experiments: queries (one-shot); cookies (persistent); geographic; temporal.
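
One way a unit (a query or a cookie) can sit in many experiments at once is to hash its identifier together with the experiment name, so that assignments are stable for a given unit and independent across experiments. This is a generic sketch of that idea, not a description of Google's experiment infrastructure; every name and parameter is hypothetical.

```python
import hashlib


def bucket(unit_id, experiment, num_buckets=1000):
    """Deterministically map a unit (cookie or query id) to a bucket."""
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


def in_treatment(unit_id, experiment, treatment_fraction=0.01, num_buckets=1000):
    """True if the unit falls into the treated slice of this experiment."""
    return bucket(unit_id, experiment, num_buckets) < treatment_fraction * num_buckets


# Cookie-scoped diversion: the same cookie always gets the same assignment.
cookies = [f"cookie-{n}" for n in range(10000)]
treated = [c for c in cookies if in_treatment(c, "hypothetical-ui-test")]
```

Hashing a cookie gives a persistent assignment, while hashing a per-query identifier gives the one-shot scope listed on the slide.
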
  27. ...and many, many more projects. What are we looking for in new hires? Broad knowledge of statistics (compare to academia); computer coding skills (Python); database and data manipulation skills (SQL); machine learning knowledge; visualization skills; communication skills (teaching is important!). More generally: Understand the domain. Ask the right questions. Use the right tools. Get the answer quickly.
  28. Computer mediated transactions. There is now a computer in the middle of almost every transaction. It captures data about that transaction, which can be very valuable for understanding behavior. Google, Amazon, eBay, Microsoft, Yahoo, etc. are in the vanguard. But soon everyone will need statisticians. Businesses have spent billions installing "data warehouses". The problem now: what to do with them? Data mining for fun and profit: "improving almost everything". But many companies don't know where to start. That's where statistics comes in...
  29. Other economic data on web. Potential: Wal-Mart, Target, K-Mart retail sales; price indices from retail data; package delivery data from UPS, FedEx. Existing: Billion Prices Project (MIT); MasterCard SpendingPulse; Monster Employment Index; Intuit Small Business Employment Index; Zillow Real Estate Market Reports.
  30. Statistician/Engineering Analyst. Location: London. Team: IT & Data Management. This position is based in London, UK. The area: Engineering & Operations. Google is and always will be an engineering company. We hire people with a broad set of technical skills who are ready to tackle some of technology's greatest challenges and make an impact on millions, if not billions, of users. At Google, engineers not only revolutionize search, they routinely work on massive scalability and storage solutions, large-scale applications and entirely new platforms for developers around the world. From AdWords to Chrome, Android to YouTube, Social to Local, Google engineers are changing the world one technological achievement after another. The role: Statistician/Engineering Analyst. At Google, data drives all of our decision-making. Quantitative Analysts work all across the organization to help shape Google's business and technical strategies by processing, analyzing and interpreting huge data sets. Using analytical rigor and statistical methods, you mine through data to identify opportunities for Google and our clients to operate more efficiently, from enhancing advertising efficacy to network infrastructure optimization to studying user behavior. As an analyst, you do more than just crunch the numbers. You work with Engineers, Product Managers, Sales Associates and Marketing teams to adjust Google's practices according to your findings. Identifying the problem is only half the job; you also figure out the solution. Google is hiring statisticians in UK.
  31. The role: Quantitative Marketing Manager - London. As a Quantitative Analyst, you will be responsible for analyzing large data sets and building expert systems that improve our understanding of the Web and improve the performance of our products. This effort includes performing complex statistical analysis on non-routine problems and working with engineers to embed models into production systems. Managing fast-changing business priorities and interfacing with product managers and engineers are required for success. Responsibilities: apply advanced statistical methods; work with large, complex data sets; solve difficult, non-routine problems; clearly communicate highly technical results and methods; interact cross-functionally with a wide variety of people and teams. Minimum qualifications: PhD in Statistics or Econometrics; experience with R/S-PLUS; coursework in Bayesian methods, longitudinal analysis and experimental design. Preferred qualifications: experience with Python, Perl and SQL.