
[QCon.ai 2019] People You May Know: Fast Recommendations Over Massive Data

The “People You May Know” (PYMK) recommendation service helps LinkedIn’s members identify other members they might want to connect with, and is the major driver for growing LinkedIn’s social network. The principal challenge in developing a service like PYMK is the sheer scale of computation needed to make precise recommendations with high recall. The PYMK service at LinkedIn has been operational for over a decade, during which it has evolved from an Oracle-backed system that took weeks to compute recommendations, to a Hadoop-backed system that took a few days, to its most modern embodiment, which can compute recommendations in near real time.

This talk will present the evolution of PYMK to its current architecture. We will focus on the various systems we built along the way, with an emphasis on those behind our most recent architecture, namely Gaia, our real-time graph computing capability, and Venice, our online feature store with scoring capability, and on how we integrate these individual systems to generate recommendations in a timely and agile manner while still being cost-efficient. We will also briefly cover the lessons learned about the scalability limits of our past and current design choices, and how we plan to tackle the scalability challenges of the next phase of growth.
https://qcon.ai/qconai2019/presentation/people-you-may-know-fast-recommendations-over-massive-data

Published in: Engineering

  1. People You May Know: Fast Recommendations Over Massive Data. Sumit Rangwala, Artificial Intelligence; Felix GV, Data Infrastructure
  2. My Professional Network. Professional network in the real world (Sumit, Felix, Peter, Amol, Gaojie)
  3. My Professional Network. Professional network in the real world vs. professional network on LinkedIn
  4. My Professional Network. Predicting real-world connections from the LinkedIn network
  5. People You May Know. Recommends people that one might know; helps grow members’ professional networks; enables many other LinkedIn services
  6. Talk Outline. People You May Know; PYMK: Generating Recommendations; PYMK Architecture Evolution; PYMK Rebirth; Insights and Road Ahead
  7. PYMK: Generating Recommendations
  8. PYMK: Prediction Strategy. Data mining over LinkedIn’s Economic Graph and members’ activities and profiles
  9. PYMK: Prediction Strategy. The Economic Graph links members to each other and to organizations (e.g., Microsoft, USC)
  10. Recommendation System. Candidate Generation → Feature Generation → Scoring
  11. PYMK: Candidate Generation. Using commonalities in the economic graph: friends of my friends (triangle closing)
  12. PYMK: Candidate Generation. Using commonalities in the economic graph: friends of my friends (triangle closing); coworkers; Personalized PageRank
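Triangle closing is simple to sketch. The following is a toy illustration under assumed structures, not LinkedIn’s implementation: `graph` is a hypothetical adjacency map from member id to the set of direct connections, and candidates are friends-of-friends ranked by how many mutual connections they share with the member.

```python
from collections import Counter

def triangle_closing_candidates(graph, member, max_candidates=100):
    """Candidate generation via triangle closing: rank friends-of-friends
    by mutual connection count. `graph` maps a member id to the set of
    that member's direct connections (hypothetical structure)."""
    friends = graph.get(member, set())
    counts = Counter()
    for friend in friends:
        for fof in graph.get(friend, set()):
            if fof != member and fof not in friends:
                counts[fof] += 1  # one more mutual connection
    return [m for m, _ in counts.most_common(max_candidates)]

graph = {
    "Sumit": {"Amol", "Peter"},
    "Amol": {"Sumit", "Felix"},
    "Peter": {"Sumit", "Felix"},
    "Felix": {"Amol", "Peter"},
}
print(triangle_closing_candidates(graph, "Sumit"))  # -> ['Felix']
```

Felix is surfaced because he is reachable through two of Sumit’s connections (Amol and Peter) without being a connection himself.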
  13. PYMK: Feature Generation. Using economic graph characteristics (e.g., number of common friends) and member activities/profile (e.g., common work location)
  14. PYMK: Recommendation System. Candidate Generation → Feature Generation
  15. PYMK: Recommendation System. Candidate Generation: Sumit might know Amol’s friend Felix. Feature Generation: Sumit and Felix have one common friend; Sumit and Felix both work in the Bay Area
  16. PYMK: Recommendation System. Candidate generation is graph processing; feature generation is data processing
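Feature generation for one (source, candidate) pair can be sketched as below: a graph feature (mutual connection count, the “graph processing” side) plus a profile feature (shared work location, the “data processing” side). The input structures are hypothetical stand-ins, not LinkedIn’s schemas.

```python
def pair_features(graph, profiles, source, candidate):
    """Feature generation sketch for one (source, candidate) pair.
    `graph` maps member -> set of connections; `profiles` maps
    member -> profile dict (both hypothetical)."""
    common = graph.get(source, set()) & graph.get(candidate, set())
    same_loc = profiles[source]["location"] == profiles[candidate]["location"]
    return {
        "common_friends": len(common),       # graph-derived feature
        "same_location": int(same_loc),      # profile-derived feature
    }

graph = {"sumit": {"amol"}, "felix": {"amol"}}
profiles = {"sumit": {"location": "Bay Area"},
            "felix": {"location": "Bay Area"}}
print(pair_features(graph, profiles, "sumit", "felix"))
# -> {'common_friends': 1, 'same_location': 1}
```

This matches the slide’s example: one common friend, both in the Bay Area. A downstream scorer would consume such feature vectors.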
  17. PYMK Architecture Evolution
  18. Approach: pre-compute recommendations
  19. PYMK: The Beginning. Problem space: tens of millions of members. Architecture: pre-compute using SQL (Oracle). Shortcomings: staleness of 6 weeks to 6 months; extraneous computation
  20. PYMK: The Beginning. The PYMK service answers online requests from the pre-computed Oracle tables
  21. PYMK: Keeping up with Growth. Problem space: low hundreds of millions of members. Architecture: pre-compute using Hadoop MapReduce; push to a key-value store (Voldemort). Shortcomings: staleness of 2-3 days; extraneous computation
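The pre-compute-and-push pattern of this era can be sketched as follows. This is a toy illustration under assumed structures (a dict stands in for the Voldemort key-value store, and friends-of-friends stand in for the real scoring pipeline), not the actual Hadoop jobs: an offline pass materializes every member’s recommendations, and the online path is a single lookup, so staleness equals the batch cadence.

```python
def batch_precompute_all(graph):
    """Offline job sketch: materialize friends-of-friends recommendations
    for every member, keyed by member id (the 'push to a key-value
    store' step; a dict stands in for Voldemort)."""
    store = {}
    for member, friends in graph.items():
        fofs = set()
        for friend in friends:
            fofs |= graph.get(friend, set())
        store[member] = sorted(fofs - friends - {member})
    return store

def serve_pymk(store, member):
    """Online path: one key-value lookup, no computation at request time."""
    return store.get(member, [])

graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
store = batch_precompute_all(graph)
print(serve_pymk(store, "a"))  # -> ['c']
```

The shortcoming called out on the slide falls out of the structure: the whole store must be recomputed for everyone on every run, whether or not a member’s neighborhood changed.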
  22. PYMK: Pushing the Technology Limits. Problem space: mid hundreds of millions of members. Architecture: pre-compute using Spark [1]; push to a key-value store (Venice). Shortcomings: staleness of 1-2 days; excessive computation cost. [1] Managing Exploding Big Data
  23. PYMK: Exploring Data Freshness. Problem space: use up-to-date member data. Architecture: hybrid offline-online approach, with real-time signals flowing into Venice. Shortcomings: split-brain design; didn’t scale
  24. Key Realization. Freshness matters; pre-computation is costly
  25. PYMK Rebirth
  26. Approach: compute recommendations on demand
  27. PYMK: Recommendation System. Candidate Generation (online graph traversal): Sumit might know Amol’s friend Felix. Feature Generation (fast data access): Sumit and Felix have one common friend; Sumit and Felix both work in the Bay Area
  28. An online graph processing system
  29. Gaia: a generic service for executing complex graph algorithms with low latency on massive graphs
  30. Gaia: Overview
  31. Gaia: Overview. Input: any kind of graph, as a snapshot on HDFS
  32. Gaia: Overview. Plus updates to the graph, via Kafka, etc.
  33. Gaia: Overview. Plus graph algorithm code (e.g., triangle closing, random graph walks), written against Gaia’s compute framework
  34. Design Choice. Single-server architecture with replicas; full in-memory graph for fast execution
  35. Gaia: Architecture. A fleet of replica servers
  36. Gaia: Architecture. Each server runs the algorithm code locally
  37. Gaia: Architecture. Servers bootstrap from a graph snapshot on disk
  38. Gaia: Architecture. Graph updates arrive via Kafka, etc.
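The architecture above can be sketched as a single replica. All names and interfaces here are hypothetical, not Gaia’s actual API: each server holds the full graph in memory, bootstraps from a snapshot, applies streaming edge updates, and runs registered algorithm code directly against the in-memory adjacency structure.

```python
class GraphServer:
    """Sketch of one Gaia-style replica (interfaces hypothetical)."""

    def __init__(self, snapshot_edges):
        self.adj = {}
        for u, v in snapshot_edges:   # bootstrap from the on-disk snapshot
            self.add_edge(u, v)
        self.algorithms = {}

    def add_edge(self, u, v):
        """Apply one streaming update (e.g., a new connection via Kafka)."""
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(v, set()).add(u)

    def register(self, name, fn):
        """Deploy graph algorithm code to the server."""
        self.algorithms[name] = fn

    def run(self, name, *args):
        """Execute an algorithm against the in-memory graph, no I/O."""
        return self.algorithms[name](self.adj, *args)

def common_connections(adj, a, b):
    return len(adj.get(a, set()) & adj.get(b, set()))

server = GraphServer([("sumit", "amol"), ("amol", "felix")])
server.add_edge("sumit", "peter")     # streaming updates after bootstrap
server.add_edge("peter", "felix")
server.register("common", common_connections)
print(server.run("common", "sumit", "felix"))  # -> 2
```

Keeping the whole graph in one process’s memory is what makes tens-of-milliseconds traversals feasible; replicas provide throughput and availability rather than partitioning the data.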
  39. PYMK on Gaia. Candidate generation using triangle closing and common connection count, in tens of milliseconds (p90)
  40. A key-value store with scoring capability
  41. Venice at a Glance. Tailored for serving ML jobs’ output; high-throughput ingestion; fast lookups; self-service onboarding
  42. Supported Ingestion Modes in Venice. Batch: Hadoop push job
  43. Supported Ingestion Modes in Venice. Batch: Hadoop push job. Incremental: Samza streaming job
  44. Supported Ingestion Modes in Venice. Batch: Hadoop push job. Incremental: Samza streaming job. Reprocessing: Samza reprocessing job + push job (Kappa Architecture)
  45. Supported Ingestion Modes in Venice. Batch: Hadoop push job. Incremental: Samza streaming job. Reprocessing: Samza reprocessing job + push job (Kappa Architecture). Hybrid: any batch job + streaming job (Lambda Architecture)
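The hybrid (Lambda) mode can be sketched with read semantics only. This is a deliberately simplified toy, not Venice’s actual implementation: a batch push replaces the whole dataset, streaming writes overlay it per key, and reads prefer the fresher streaming value.

```python
class HybridStore:
    """Simplified sketch of a hybrid batch + streaming store."""

    def __init__(self):
        self.batch = {}      # last full batch push
        self.overlay = {}    # real-time (streaming) writes on top

    def batch_push(self, records):
        """Swap in a full new dataset from a batch job."""
        self.batch = dict(records)

    def stream_write(self, key, value):
        """Apply one real-time update from a streaming job."""
        self.overlay[key] = value

    def get(self, key):
        # The streaming write wins over the batch value for the same key.
        return self.overlay.get(key, self.batch.get(key))

store = HybridStore()
store.batch_push({"felix": [0.1, 0.2], "peter": [0.3, 0.4]})
store.stream_write("felix", [0.5, 0.6])   # fresher value via streaming
print(store.get("felix"))  # -> [0.5, 0.6]
print(store.get("peter"))  # -> [0.3, 0.4]
```

A real store must also decide what happens to buffered streaming writes when a new batch version lands; that reconciliation is omitted here.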
  46. First PYMK Use Case: Online Feature Retrieval
  47. Requirements (online feature retrieval). Millions of lookups/sec at peak; ~1000 keys per query; thousands of queries/sec; ~80 B per value
  48. Before/After (online feature retrieval). Base latency: 4 seconds (p99). After changing the storage engine to RocksDB: 60 ms (p99)
  49. Second PYMK Use Case: Embeddings
  50. Requirements (embeddings). Millions of lookups/sec at peak; ~1000 keys per query; thousands of queries/sec; ~800 B per value (10x the previous size)
  51. Before/After (embeddings). Base latency: 275 ms (p99). With server-side computation: 60 ms (p99)
  52. Server-Side Computation at a Glance. Simple vector operations; smaller response size (big input vector, small output scalar); declarative API; no arbitrary code
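The response-size win can be sketched as below; function and store names are hypothetical, not Venice’s API. Instead of returning each stored ~800 B embedding, the server computes a dot product against the query vector shipped with the request and returns one scalar per key.

```python
def server_side_score(store, keys, query_vector):
    """Sketch of server-side computation: return a scalar score per key
    instead of the full stored embedding. `store` is a hypothetical
    key -> embedding map living inside the server."""
    return {
        k: sum(a * b for a, b in zip(store[k], query_vector))
        for k in keys if k in store
    }

store = {"felix": [1.0, 2.0], "peter": [0.5, 0.0]}
print(server_side_score(store, ["felix", "peter"], [2.0, 1.0]))
# -> {'felix': 4.0, 'peter': 1.0}
```

With ~1000 keys per query, shrinking each value from an embedding to a scalar cuts the response payload by orders of magnitude, which is consistent with the latency drop on the previous slide.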
  53. More Tuning: Fast Avro. Online feature retrieval: 60 to 40 ms (p99). Embeddings with computation: 60 to 35 ms (p99). Now open source: github.com/linkedin/avro-util
  54. PYMK Today: Putting It All Together
  55. PYMK: Recommendation System. Candidate Generation (Gaia): Sumit might know Amol’s friend, Felix. Feature Generation (Venice): Sumit and Felix have one common friend; Sumit and Felix both work in the Bay Area. Scoring (PYMK service): Sumit and Felix likely know each other
  56. PYMK: Today. 1) Ingest into Gaia and Venice. 2) Candidate generation and graph features from Gaia. 3) Member features and partial scoring from Venice. 4) Final scoring by the PYMK service. Staleness: seconds to minutes
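The request flow above can be sketched as an orchestration function. All interfaces here are hypothetical stand-ins for the real services: Gaia returns candidates plus graph features, Venice returns partial scores from member features, and the PYMK service combines both into a final ranking.

```python
def pymk_request(member, gaia, venice, final_model):
    """Sketch of the on-demand PYMK request flow (interfaces hypothetical)."""
    # Step 2: candidate generation and graph features from Gaia.
    candidates, graph_features = gaia.candidates_with_features(member)
    # Step 3: member features and partial scoring from Venice.
    partial_scores = venice.partial_scores(member, candidates)
    # Step 4: final scoring by the PYMK service itself.
    scored = [(c, final_model(graph_features[c], partial_scores[c]))
              for c in candidates]
    return [c for c, _ in sorted(scored, key=lambda x: -x[1])]

class StubGaia:
    def candidates_with_features(self, member):
        return ["felix", "peter"], {"felix": 2.0, "peter": 1.0}

class StubVenice:
    def partial_scores(self, member, candidates):
        return {c: 0.5 for c in candidates}

print(pymk_request("sumit", StubGaia(), StubVenice(),
                   lambda g, p: g + p))  # -> ['felix', 'peter']
```

Because both Gaia and Venice ingest updates continuously (step 1), the staleness of a recommendation produced this way is bounded by ingestion lag rather than by a batch job’s cadence.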
  57. Key Learnings. Pre-computation is viable for many products; scaling real-time computation requires moving compute close to the data; infra-aware machine learning
  58. Looking Ahead
  59. ML-Aware Infra. Further scale Gaia and Venice: more candidates, more features, larger features, more complex computations
  60. Productive ML. Continue democratizing access: easier onboarding to Venice and Gaia; multi-tenancy for Venice Compute; integration with other frameworks
  61. Contributors. Amol Ghoting, Gaojie Liu, Kevinjeet Gill, Peter Chng, Min Huang, Yao Chen, Hema Raghavan, Ashish Singhai, and many others
  62. Thank You
