The “People You May Know” (PYMK) recommendation service helps LinkedIn’s members identify other members they might want to connect with, and it is the major driver of growth in LinkedIn's social network. The principal challenge in developing a service like PYMK is the sheer scale of computation needed to make precise recommendations with high recall. The PYMK service at LinkedIn has been operational for over a decade. During that time it has evolved from an Oracle-backed system that took weeks to compute recommendations, to a Hadoop-backed system that took a few days, to its most recent form, which computes recommendations in near real time.
This talk presents the evolution of PYMK to its current architecture. We will focus on the various systems we built along the way, with an emphasis on those built for our most recent architecture: Gaia, our real-time graph computing capability, and Venice, our online feature store with scoring capability. We will also show how we integrate these individual systems to generate recommendations in a timely and agile manner while remaining cost-efficient. Finally, we will briefly discuss the lessons learned about the scalability limits of our past and current design choices, and how we plan to tackle the scalability challenges of the next phase of growth.