Artificial Intelligence and Antitrust (Hal Varian)

Keynote lecture on 'Artificial Intelligence and Antitrust' delivered during the FSR C&M, CMPF and FCP Annual Scientific Seminar on 'Competition, Regulation and Pluralism in the Online World'

Published in: Data & Analytics

  1. 1. Artificial Intelligence and Antitrust Hal Varian Florence, Italy, March 22-23, 2018 Competition, Regulation and Pluralism in the Online World The views in this presentation are those of the author and do not represent the views of his employer or any other party.
  2. 2. Outline ● Machine learning and AI ● What ML can do ● Data and other factors of production ● Online competition ● Leaders, laggards, superstars ● Algorithmic collusion
  3. 3. Machine learning and AI
  4. 4. Data pyramid
  5. 5. Machine learning and Artificial Intelligence ● AI has been around for a long time (in the lab) ● ML has been around for a long time, both in the classroom and in practice ● The last 5 years saw tremendous advances in AI methods for ML ○ Deep learning: layered neural networks ○ Applications in image recognition, voice recognition, translation ○ Progress is due to better algorithms, hardware, data, expertise ● What does it take to use ML and AI? ○ Hardware and software are easy via cloud computing ○ Data is often easy to acquire ○ Expertise is scarce but growing rapidly due to online courses
  6. 6. What can ML do? Examples from Kaggle ● Home Prices: Improve accuracy of Zillow’s home price prediction: $1,200,000 ● Traffic to Wikipedia Pages: Forecast traffic to Wikipedia pages: $25,000 ● Personalized Medicine: Predict effect of genetic variants: $15,000 ● Taxi Trip Duration: Predict total ride duration of taxi trips in New York: $30,000 ● Product Search Relevance: Relevance of search results: $40,000 ● Clustering Questions: Question pairs that have the same intent: $25,000 ● Cervical Cancer Screening: Effectiveness of treatments: $100,000 ● Click Prediction: Can you predict where each user will click: $25,000 ● Inventory Demand: Minimize returns of bakery goods: $25,000 ● Video Understanding: Classify what is happening in YT videos: $100,000
  7. 7. Kaggle users by (overlapping) region
  8. 8. Data
  9. 9. Economic characteristics of data Is data the new oil? ● Like oil, data must be refined before it becomes useful ○ It has to be turned into information, knowledge, action ● Unlike oil, data is non-rival ○ Partly excludable (intellectual property, privacy regulation) ○ Excludable, non-rival goods are “club goods” (tennis club, swimming club) ○ Ownership for private goods, access for club goods ● Rights, permissions, licensing, regulation, contracts ○ Example: who has access to autonomous vehicle data? ○ Pluralism is generally good, see airlines ● Data portability ○ Google Takeout ○ Facebook account ○ But where are the Take-Ins?
  10. 10. Advice to students (Feb 2008) “If you are looking for a career where your services will be in high demand, you should find something where you provide a scarce, complementary service to something that is getting ubiquitous and cheap. So what’s getting ubiquitous and cheap? Data. And what is complementary to data? Analysis. So my recommendation is to take lots of courses about how to manipulate and analyze data: databases, machine learning, econometrics, statistics, visualization, and so on.” A noted pundit
  11. 11. Advice to students (Feb 2008) “If you are looking for a career where your services will be in high demand, you should find something where you provide a scarce, complementary service to something that is getting ubiquitous and cheap. So what’s getting ubiquitous and cheap? Data. And what is complementary to data? Analysis. So my recommendation is to take lots of courses about how to manipulate and analyze data: databases, machine learning, econometrics, statistics, visualization, and so on.” Hal Varian
  12. 12. Where does the data come from? ● A by-product of operations. Data warehouses. ● Web scraping. Bots, spiders, Common Crawl. ● Offering a service. GOOG411, ReCaptcha, QuickDraw. ● Hiring humans to label data. Mechanical Turk, India, Philippines, etc. ● Buying data from a provider. Mail lists, credit reports, data brokers. ● Sharing data. 9.5 million labeled images, 4.5 million labeled YouTube videos. ● Data from governments and NGOs. Economic, transport, agricultural, scientific. ● Data from cloud providers. Google Patents Public Dataset, AWS Public Datasets. ● Computer-generated data. Synthetic images, AlphaGo Zero, back translation.
  13. 13. How important is data? ● ImageNet challenge ● 20M labeled images ● 1.2M images for training ● 150,000 images for testing ● 1000 object categories ● Roughly constant training size since 2010 ● Dramatic improvement due to hardware, algorithms, expertise ● Now beats humans ImageNet 2017
  14. 14. Diminishing returns to data: Stanford dog dataset ● Of course, more data is better ● But there are diminishing returns, like any other factor of production ● Stanford dog breed images ○ 120 breeds of dogs ○ 20,580 images ● Goal: recognize breeds ● Accuracy goes up but at a decreasing rate Stanford Dog Dataset
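The diminishing-returns point on the Stanford dog slide can be illustrated with a toy learning curve. This is a minimal sketch, not the slide's actual data: it assumes that error falls as a power law in the number of training images, and the function name `error_rate` and the parameters `a` and `b` are hypothetical values chosen only for illustration.

```python
def error_rate(n_images, a=2.0, b=0.35):
    """Hypothetical power-law learning curve: error = a * n^(-b), capped at 1."""
    return min(1.0, a * n_images ** (-b))


# Double the (assumed) training set size at each step.
sizes = [1_000, 2_000, 4_000, 8_000, 16_000]
accuracies = [1.0 - error_rate(n) for n in sizes]

# Accuracy gained from each doubling of the data.
gains = [later - earlier for earlier, later in zip(accuracies, accuracies[1:])]

for n, acc in zip(sizes, accuracies):
    print(f"{n:>6} images -> accuracy {acc:.3f}")
for n, gain in zip(sizes[1:], gains):
    print(f"doubling to {n:>6} images adds {gain:.3f}")
```

Under this assumed curve every doubling of the data still raises accuracy, but each doubling adds less than the one before, which is the pattern the slide describes.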
  15. 15. Cognitive assistance It used to be that being a ... ● ...cashier required knowing how to make change ● ...writer required knowing how to spell ● ...taxi driver required knowing city streets ● ...hospitality worker in an international setting required knowing a bit of foreign languages ● ...gardener required recognizing plants ● ...veterinarian required recognizing dog breeds Where there is a skills gap, you can bring the worker’s skills up to the requirement, or bring the job down to the worker’s competencies. Cognitive assistance helps people get jobs by reducing the tasks they need to master. ● In 1880 machines offered physical assistance to workers ● In 2018 machines offer cognitive assistance to workers
  16. 16. Examples
  17. 17. cloud.google.com/vision
  18. 18. cloud.google.com/vision
  19. 19. Ragdoll cat from Wikipedia Our ragdoll cat
  20. 20. Online competition
  21. 21. Competition in the cloud Providers: Amazon Web Services, Microsoft Azure, Google Cloud Platform, Adobe, VMware, IBM Cloud, Rackspace, Red Hat, Oracle, SAP, etc. (60 more)
  22. 22. Dimensions of competition ● Pricing: posted and negotiated ○ Bertrand pricing due to low marginal cost and rapid technological progress ○ Reserved or interruptible use ● Standard services: images, transcripts, voice, translation (see demos) ● Software: databases, prediction API, statistical tools (patents database) ● Consulting: direct and 3rd party ● Portability: containers (e.g., Docker) avoid lock-in ○ Multihoming is common ○ For example, Snap uses both Google and Amazon ● Make or buy decision? ○ Expertise is in demand ○ New grads want to work with great teams (as with any new technology) ○ Skills will diffuse, specializations develop, domain knowledge is valuable ○ Cost of maintaining infrastructure vs. specialization in a specific domain
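The Bertrand pricing bullet can be made concrete with a small undercutting loop. This is a simplified sketch under stated assumptions, not a model taken from the talk: two providers sell an identical service, share the same marginal cost, take turns undercutting each other by one price tick, and never price below cost. The function name `bertrand_undercutting` and the numbers (prices in cents) are hypothetical.

```python
def bertrand_undercutting(start_price, marginal_cost, tick=1):
    """Two identical firms alternately undercut the rival by one tick.

    Prices are integers (e.g., cents). A firm stops undercutting once
    doing so would mean pricing below marginal cost.
    """
    prices = [start_price, start_price]
    mover = 0  # index of the firm whose turn it is to set a price
    while True:
        rival_price = prices[1 - mover]
        undercut = rival_price - tick
        if undercut < marginal_cost:
            break  # no profitable undercut remains
        prices[mover] = undercut
        mover = 1 - mover
    return prices


final_prices = bertrand_undercutting(start_price=100, marginal_cost=20)
print(final_prices)
```

In this sketch the lowest posted price ends up at marginal cost and the rival sits at most one tick above it, which is the sense in which low marginal cost pushes posted prices down.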
  23. 23. Leaders, laggards, and superstars OECD
  24. 24. Online competition
  25. 25. Online competition
  26. 26. Example: virtual assistants Apple Siri, Google Assistant, Amazon Alexa, Microsoft Cortana, Facebook M, Samsung Bixby, etc
  27. 27. VC funding is robust ● But fewer firms are going public. ● The US stock markets have half the number of firms listed as they had 20 years ago. ● In VC there are four times as many acquisitions as IPOs. ● The average listed firm spends more on R&D than on capital. Kathleen Kahle and René Stulz, “Is the US Public Corporation in Trouble?” Sand Hill Econometrics
  28. 28. Algorithmic collusion
  29. 29. Algorithmic collusion ● Law ○ OECD on Algorithms and Collusion ○ Stucke and Ezrachi (2016), Virtual Competition ○ Gal and Elkin-Koren (2017), “Algorithmic Consumers”, HJLT ● Economics ○ Repeated games (1970s) and the folk theorem ○ Axelrod (1981): Evolution of cooperation in prisoner's dilemma ■ Tournament of strategies and tit-for-tat ■ Evolution of cooperation ■ Noisy prisoners’ dilemma ○ Rapid response equilibrium ■ Gas stations with posted prices ■ Airline price signaling
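The tit-for-tat and noisy prisoner's dilemma bullets can be sketched in a few lines. This is an illustrative toy, not Axelrod's tournament code: the payoff values (3, 0, 5, 1) are the conventional textbook ones, and the names `play`, `tit_for_tat`, `always_defect` and the `noise` parameter (the probability that an intended move is flipped) are assumptions made for the example.

```python
import random

# Conventional payoffs: (row player, column player) for each pair of moves.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous observed move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    return "D"


def flip(move):
    return "D" if move == "C" else "C"


def play(strategy_a, strategy_b, rounds=200, noise=0.0, seed=0):
    """Play a repeated prisoner's dilemma and return both total scores."""
    rng = random.Random(seed)
    seen_by_a, seen_by_b = [], []  # moves each player saw the other make
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(seen_by_a)
        move_b = strategy_b(seen_by_b)
        # With probability `noise`, a move is executed incorrectly.
        if rng.random() < noise:
            move_a = flip(move_a)
        if rng.random() < noise:
            move_b = flip(move_b)
        payoff_a, payoff_b = PAYOFFS[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        seen_by_a.append(move_b)
        seen_by_b.append(move_a)
    return score_a, score_b


print("TFT vs TFT:          ", play(tit_for_tat, tit_for_tat))
print("TFT vs always defect:", play(tit_for_tat, always_defect))
print("TFT vs TFT, 5% noise:", play(tit_for_tat, tit_for_tat, noise=0.05))
```

Without noise, two tit-for-tat players cooperate in every round, while tit-for-tat loses only the first round to a constant defector. With noise, a single flipped move can set off a chain of retaliation, which is why the noisy version of the game is treated separately.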
  30. 30. Computation? Or communication? Strategies for collusion are simple; communication is the key issue ● NASDAQ odd-eighth pricing (1994) ● Spectrum auction signalling by bids (1996) [483] bidding Considerations ● Is communication open or closed? ● Is it implicit or explicit? ● Facebook AI “own language” ● Google AI neural cryptography
  31. 31. Adversarial AI Machine learning algorithms can be fooled ● Spam filtering ● Image manipulation ● 3-D printed turtle Quartz
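The claim that machine learning algorithms can be fooled can be illustrated on the simplest possible case, a linear spam score. This is a minimal sketch, assuming a hand-written linear classifier with made-up weights. It moves each feature by a small amount against the sign of its weight, in the spirit of gradient-sign attacks, and it is not the method behind any of the examples on the slide. All names and numbers are hypothetical.

```python
def score(weights, bias, features):
    """Linear classifier score; a positive score is read as 'spam'."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def sign(value):
    return (value > 0) - (value < 0)


def adversarial_shift(weights, features, epsilon):
    """Move every feature by at most epsilon in the direction that lowers the score."""
    return [x - epsilon * sign(w) for w, x in zip(weights, features)]


# Hypothetical model and message features.
weights = [2.0, -1.0, 1.5]
bias = -1.0
message = [0.6, 0.2, 0.4]

perturbed = adversarial_shift(weights, message, epsilon=0.2)

original_score = score(weights, bias, message)
perturbed_score = score(weights, bias, perturbed)
print(f"original score:  {original_score:+.2f}")
print(f"perturbed score: {perturbed_score:+.2f}")
```

In this sketch no feature moves by more than `epsilon`, yet the score changes sign, so the message that was flagged as spam now passes the filter.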
  32. 32. Summary ● AI is here to stay ● Hardware, software, data, expertise are inputs to production and all have diminishing returns. ● Variety of applications ● Online competition is robust ● Algorithmic collusion has more to do with communication than computation ● Adversarial AI
