Big Data Day LA 2016 Keynote - Andy Feng/ Yahoo

Big Data Day LA 2016 Keynote - Andy Feng, VP-Architecture at Yahoo, talks about Hadoop and Big Data Innovation

Published in: Technology

  1. Big-Data Innovation: Hadoop, Real-time, and Machine Learning. Andy Feng, Yahoo
  2. Hadoop Clusters at Yahoo
  3. How does Yahoo use its Big Data Tech? (Yahoo Confidential & Proprietary)
  4. Personalized Content: content augmentation, user profiling, recommendation
  5. Mail Smart Views
  6. Flickr: flickr.com/cameraroll
  7. Weather. Beauty › Computationally assessed. Relevant › Location › Time › Cloudy › Shower › … (Weather App, Yahoo Weather App)
  8. Search: query intention, page ranking, ads matching
  9. Search2Vec: Cosine similarity between ads and queries
  10. Big-Data Impact: Search2Vec (http://bit.ly/28SLdjU)
      Bucket Tests                     Query Coverage   Auction Depth   Revenue per Search
      Simple model vs. Baseline        +1.14%           +2.13%          +7.07%
      Advanced model vs. Small model   +2.44%           +2.39%          +9.39% (= +17.12% vs. Baseline)
  11. Open Source
  12. CaffeOnSpark: Distributed Deep Learning (github.com/yahoo/caffeonspark). Powerful DL platform, fully distributed, high-level API, incremental learning, existing clusters
  13. Interactive Analytics. Data Sketches algorithms library (datasketches.github.io); sub-second user-facing analytics (druid.io)
  14. Apache Storm: Real-time Processing (https://storm.apache.org). MT & RA scheduler, Dist. Cache API, 8x throughput, improved debuggability, Pacemaker server, streaming benchmark (github.com/yahoo/streaming-benchmarks)
  15. Apache Omid: Transactions for NoSQL DB (http://omid.incubator.apache.org/). ACID transactions: multi-row/multi-table transactions, snapshot isolation, lock-free
  16. Yahoo Hadoop Stack
  17. (no text)
  18. Thanks! yahoohadoop.tumblr.com bigdata@yahoo-inc.com
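
Slide 9 names cosine similarity between ads and queries as the Search2Vec matching signal. The following is a minimal sketch of that scoring step, assuming queries and ads have already been embedded in the same vector space. The vectors, ad names, and the helper names `cosine_similarity` and `rank_ads` are illustrative assumptions and are not taken from the talk.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_ads(query_vec, ad_vecs):
    """Return ad names ordered from most to least similar to the query vector."""
    return sorted(
        ad_vecs,
        key=lambda name: cosine_similarity(query_vec, ad_vecs[name]),
        reverse=True,
    )


# Hypothetical 3-dimensional embeddings, made up for illustration only.
query_vec = [0.9, 0.1, 0.3]
ad_vecs = {
    "car insurance": [0.1, 0.9, 0.2],
    "running shoes": [0.8, 0.2, 0.4],
}

print(rank_ads(query_vec, ad_vecs))  # most similar ad first
```

Production embeddings typically have hundreds of dimensions, and candidate ads would come from an approximate nearest-neighbor index rather than a full sort; the scoring formula itself stays the same.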

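
The "+17.12% vs. Baseline" revenue figure on slide 10 is consistent with compounding the two bucket-test lifts multiplicatively (+7.07%, then +9.39% on top of it). The slide does not state how the figure was derived, so the compounding rule below is an assumption, and `compound_lift` is a hypothetical helper name.

```python
def compound_lift(*lifts_percent):
    """Combine successive percentage lifts multiplicatively and return the total lift in percent."""
    total = 1.0
    for lift in lifts_percent:
        total *= 1.0 + lift / 100.0
    return (total - 1.0) * 100.0


# Revenue per Search: first model vs. baseline, then advanced model vs. the first model.
revenue_lift = compound_lift(7.07, 9.39)
print(f"{revenue_lift:+.2f}%")  # +17.12%
```

Simply adding the two lifts would give +16.46%, so the slide's figure matches the multiplicative reading rather than the additive one.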