Quasar and ReMix
An introduction to LinkedIn's Ranking and Federation libraries
Andris Birkmanis & Lance Wall
Relevance: Verticals & Infrastructure
[Diagram labels: Relevance; Relevance Verticals; Relevance Infra; Isolated ML models; Integrated ML models; Deployed ML services; ML algos; Scoring and Ranking; Tools; Relevance service platform; Quasar; ReMix]
Quasar
Quick Scoring and Ranking
Our mission is to make efficient feature transformation, scoring, and ranking simple.
Scoring
• Scoring
• Scorables
• Features
• Feature Transformation
Ranking
• Sorting
• TopK
• Filtering
• Group by
• Distinct
• Union
• Custom
Relevance Models: DAGs of Computations
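[Diagram: a relevance model drawn as a DAG.
Inputs: member (interest, skill); news feed (category, content); All Documents.
Computed features: interest-match-score, skill-match-score, content-match-score.
Ranking operators: Filter BY interest-match > 0.5; Filter BY skill-match > 0.7; TOP 50 BY content-match.
Document counts shown on the edges: 10,000; 500; 500; 50.]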
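The DAG above can be read as a small pipeline of ranking operators. The following is a minimal Python sketch of that reading, not Quasar code: the `Doc` class, the `rank` function, and the synthetic scores are all hypothetical, and it assumes the two filters are applied one after the other (the slide does not make clear how the two filter branches are combined).

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """A candidate document with precomputed match scores (hypothetical)."""
    doc_id: int
    interest_match: float
    skill_match: float
    content_match: float


def rank(docs, k=50):
    """Apply the three operators from the slide in sequence."""
    # Filter BY interest-match > 0.5
    by_interest = (d for d in docs if d.interest_match > 0.5)
    # Filter BY skill-match > 0.7
    by_skill = (d for d in by_interest if d.skill_match > 0.7)
    # TOP k BY content-match (returned in descending order)
    return heapq.nlargest(k, by_skill, key=lambda d: d.content_match)


# Synthetic candidates standing in for "All Documents".
docs = [Doc(i, (i % 10) / 10, (i % 7) / 7, (i % 13) / 13) for i in range(10_000)]
top = rank(docs)
```

Because the filters are generators, no intermediate list is built before the top-k step; only the final `k` documents are materialized.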
ML and Training
• Tracking training dependencies between ML models
• Integrating with training engines via Training API
• Automatic type conversion for features and model parameters
• Reuse of feature transformations between training and prediction
Quasar Components
• Domain Specific Language (DSL)
▪ Oriented towards scoring and ranking concepts
▪ Supports various machine learning models
▪ Supports various ranking operators
▪ Supports pluggable feature transformers
▪ Supports arithmetical and logical expressions
• Library
▪ Includes out-of-box feature transformers tuned for performance (dense/sparse vectors, bags of words, etc.)
▪ Extensible with custom transformers and ranking operators
• Execution engine
▪ Supports multiple evaluation strategies for different objectives (lazy/eager/batching/etc.)
▪ Debuggability, logging, and other cross-cutting concerns
▪ API for scoring, ranking, read/write access to features, training
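To illustrate what "multiple evaluation strategies" can mean, here is a minimal Python sketch contrasting lazy and eager evaluation of feature definitions. It is not Quasar's engine: the `LazyFeatures` class, the `evaluate_eagerly` helper, and the feature names are hypothetical and only show the general idea that a lazy strategy computes a feature (and its dependencies) on first request, while an eager strategy computes everything up front.

```python
class LazyFeatures:
    """Computes each feature on first access and caches the result."""

    def __init__(self, definitions):
        # name -> function taking this object and returning the feature value
        self._definitions = definitions
        self._cache = {}
        self.evaluated = []  # order in which features were actually computed

    def get(self, name):
        if name not in self._cache:
            self._cache[name] = self._definitions[name](self)
            self.evaluated.append(name)
        return self._cache[name]


def evaluate_eagerly(definitions):
    """Eager strategy: compute every declared feature, used or not."""
    features = LazyFeatures(definitions)
    for name in definitions:
        features.get(name)
    return features


definitions = {
    "raw_age_minutes": lambda f: 90,
    "is_fresh": lambda f: f.get("raw_age_minutes") < 180,
    "expensive_unused": lambda f: sum(range(1000)),
}

lazy = LazyFeatures(definitions)
lazy.get("is_fresh")          # pulls in raw_age_minutes, skips expensive_unused
eager = evaluate_eagerly(definitions)
```

Under the lazy strategy `expensive_unused` is never computed, which is the kind of trade-off an execution engine can make per objective (latency versus predictable batch throughput).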
Project Status
• LinkedIn Relevance Products
▪ Feed, Recommendation, Search
• Adoption
▪ 1000+ Quasar models
Future directions
• Better training support for external models (XGBoost, TensorFlow)
• Making feature transformers and operators more reusable
• Better type information
• Standardized storage formats for features and model parameters
• See the upcoming LinkedIn engineering blog for technical details
Lance Wall
ReMix
Example relevance workflows at LinkedIn
[Diagram: two workflows.
Member ID → Fetch Member Profile → Compute Job Recommendations → Format Results
Member ID → Fetch Member Profile → Compute People Recommendations → Format Results]
Motivation
• Multitenant relevance workflow services with tens of engineers on multiple teams contributing
• Each relevance workflow service has different APIs and conventions
• Lack of abstraction of system-level concerns from application logic
• Diminished productivity, operability, and leverage
ReMix’s Mission
Provide an easy to use platform for building relevance services with a focus on optimizing leverage and automating common operability concerns.
Design Goals
• Consolidation of various relevance service stacks
• Ease of support
• Ease of development
• Ease of operation
Features of ReMix
• Leverages ParSeq for easy asynchronous I/O
• Exposes declarative API for composing workflows
• Provides automated monitoring instrumentation and tooling
• Provides robust, extensible solutions for common workflow functionality
• Provides isolation and robustness to downstream instability
How does ReMix work?
Operator → (is assembled into) → Workflow → (is submitted to) → WorkflowEngine
Operator
• Modular functional component of a Workflow
• ReMix provides Operators for common functionality
• ReMix provides decorative interfaces for common optimizations
• ReMix provides generic support for asynchronous execution
Workflow
• Declaration of deferred execution
• Easy to understand declarative language
• Leverages ParSeq and exposes a simpler API
• Abstraction of execution behavior & optimizations
• Independent of environment or service (i.e. portable)
WorkflowEngine
• Executor of Workflows
• Translates Workflows to ParSeq Tasks
• Provides special considerations for async/RPC operations
• Provides common operability functionality
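The Operator, Workflow, and WorkflowEngine roles described above can be sketched in a few lines. This is a hypothetical Python illustration, not the ReMix API: ReMix builds on ParSeq (a Java library), whereas this sketch uses Python's `asyncio` as a stand-in, and every class, method, and operator name here is invented for the example. It mirrors the job-recommendation workflow from the earlier diagram.

```python
import asyncio


class Workflow:
    """Declaration of deferred execution: named operators plus dependencies."""

    def __init__(self):
        self._steps = {}  # step name -> (operator, names of its inputs)

    def add(self, name, operator, depends_on=()):
        self._steps[name] = (operator, tuple(depends_on))
        return self

    @property
    def steps(self):
        return dict(self._steps)


class WorkflowEngine:
    """Executes a Workflow, running each operator once its inputs are ready."""

    async def run(self, workflow, inputs):
        steps = workflow.steps
        tasks = {}

        async def resolve(name):
            if name in inputs:            # externally supplied value
                return inputs[name]
            operator, deps = steps[name]
            args = await asyncio.gather(*(get_task(d) for d in deps))
            return await operator(*args)

        def get_task(name):
            # Memoize so a shared dependency is computed only once.
            if name not in tasks:
                tasks[name] = asyncio.ensure_future(resolve(name))
            return tasks[name]

        return {name: await get_task(name) for name in steps}


# Hypothetical operators: modular async functions.
async def fetch_member_profile(member_id):
    await asyncio.sleep(0)  # stands in for a remote call
    return {"member_id": member_id, "skills": ["java", "scala"]}


async def compute_job_recommendations(profile):
    return [f"job-for-{skill}" for skill in profile["skills"]]


async def format_results(recommendations):
    return {"count": len(recommendations), "items": recommendations}


workflow = (
    Workflow()
    .add("profile", fetch_member_profile, depends_on=["member_id"])
    .add("jobs", compute_job_recommendations, depends_on=["profile"])
    .add("formatted", format_results, depends_on=["jobs"])
)

results = asyncio.run(WorkflowEngine().run(workflow, {"member_id": 42}))
```

The workflow object itself never runs anything; it only records what should happen. That separation is what lets an engine add cross-cutting behavior (timeouts, monitoring, isolation) without touching operator code.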
Project Status & Planned Work
• ReMix adopters include job recommendations and blended search
• Working on integration with Quasar
▪ Complete solution for model serving from offline to online
• ReMix Cloud
▪ Simple toolkit/UI for creating a Workflow and deploying it to production
▪ Hosts Workflows in a managed service, with little to no operational cost to Workflow developers
▪ Increased leverage due to reuse of common components in multitenant platform
Thanks!
Backup Slides
Quasar Model
MODELID "feed_quasar";
DOCPARAM com.linkedin.feed.FeedItem feedItem;
REQUEST PARAM Profile member;
REQUEST FEATURE VECTOR interests = GetInterests(member);
DOCUMENT FEATURE VECTOR categories = GetCategories(feedItem);
DOCUMENT FEATURE LONG publishedTime = GetPublishedTime(feedItem);
MODEL PARAM timeBuckets = { "1hr" : 60, "3hr" : 180 };
DOCUMENT FEATURE VECTOR normalizedTime = Bucketize(diffTime, timeBuckets);
DOCUMENT FEATURE VECTOR interestMatch = Similarity(interests, categories);
MODEL PARAM MAP<STRING, OBJECT> modelWeights = {
"normalizedTime": { "1hr": 0.234, "3hr": 0.456, "Other": 0.21 }, "interestMatch": 0.823 };
DOCUMENT FEATURE FLOAT score = LinearScore(modelWeights, "sigmoid");
DOCUMENT FEATURE BOOLEAN aboveThreshold = score > 0.5;
filteredFeed = FILTER DOCUMENTS BY aboveThreshold;
rerankedFeed = ORDER filteredFeed BY score WITH DESC;
RETURN rerankedFeed;
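For readers unfamiliar with the DSL, the following Python sketch approximates what the model above computes. It is an interpretation under stated assumptions, not Quasar's semantics: `Bucketize` is assumed to produce a one-hot vector keyed by the first bucket whose bound covers the item's age in minutes (falling back to "Other"), `Similarity` is assumed to be a sparse dot product, and `LinearScore` with "sigmoid" is assumed to be a weighted sum passed through the logistic function. The function names, the `age_minutes` field, and the sample data are hypothetical.

```python
import math

TIME_BUCKETS = {"1hr": 60, "3hr": 180}
MODEL_WEIGHTS = {
    "normalizedTime": {"1hr": 0.234, "3hr": 0.456, "Other": 0.21},
    "interestMatch": 0.823,
}


def bucketize(age_minutes, buckets):
    """Assumed behavior: one-hot vector for the first bucket covering the age."""
    for name, upper in sorted(buckets.items(), key=lambda kv: kv[1]):
        if age_minutes <= upper:
            return {name: 1.0}
    return {"Other": 1.0}


def similarity(a, b):
    """Assumed behavior: dot product of two sparse vectors stored as dicts."""
    return sum(w * b[k] for k, w in a.items() if k in b)


def linear_score(features, weights):
    """Weighted sum of scalar and vector features, squashed by a sigmoid."""
    z = 0.0
    for name, value in features.items():
        w = weights[name]
        if isinstance(value, dict):
            z += sum(v * w[k] for k, v in value.items())
        else:
            z += value * w
    return 1.0 / (1.0 + math.exp(-z))


def rank_feed(interests, feed_items, threshold=0.5):
    scored = []
    for item in feed_items:
        features = {
            "normalizedTime": bucketize(item["age_minutes"], TIME_BUCKETS),
            "interestMatch": similarity(interests, item["categories"]),
        }
        scored.append((linear_score(features, MODEL_WEIGHTS), item["id"]))
    kept = [s for s in scored if s[0] > threshold]         # FILTER ... BY aboveThreshold
    return sorted(kept, key=lambda s: s[0], reverse=True)  # ORDER ... BY score WITH DESC


interests = {"ml": 1.0, "sports": -2.0}
feed = [
    {"id": "a", "age_minutes": 30, "categories": {"ml": 1.0}},
    {"id": "b", "age_minutes": 600, "categories": {"sports": 1.0}},
    {"id": "c", "age_minutes": 120, "categories": {"ml": 0.5}},
]
ranked = rank_feed(interests, feed)
```

In this sample, item "b" gets a negative weighted sum, so its sigmoid score falls below 0.5 and it is filtered out; the remaining items come back ordered by score.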
[Figure: the multipass ensemble model at runtime.
Request level: getInterests.
Pass 1, per document in the candidate list: getCategories, getPublishedTime, getSimilarity, Bucketize, LinearScore; then Filter Documents and Order Documents. The slide shows documents 1, 3, 4 passing the filter and being reordered as 3, 1, 4.
Pass 2 labels: getViewTimes, Bucketize, DecisionTree, LinearScore.]
Vector Math and Expression Support
• Vector as first class citizen in DSL
• State-of-art Java Vector implementation
▪ Compact and efficient data structure
▪ Efficient Vector math computation
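[Figure: Member/Job Similarity Score = member.skill (dot product) job.required_skill.
member.skill entries: C++, Java, Network, Linux, with weights 1.0, 1.0, 3.0, 1.0 as listed on the slide.
job.required_skill entries: Hadoop, Scala, Gradle, with weights 2.0, 1.0, 2.0 as listed on the slide.]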
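The member/job similarity score on this slide is a dot product of two sparse skill vectors. A minimal Python sketch of that computation follows; the slide refers to a Java vector implementation, so this is only an illustration of the math, and the `sparse_dot` function and the sample skill weights are hypothetical rather than taken from the slide.

```python
def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {term: weight} dicts."""
    # Iterate over the smaller vector so cost scales with min(len(a), len(b)).
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b[term] for term, weight in a.items() if term in b)


# Hypothetical skill vectors; only "Java" appears in both.
member_skill = {"C++": 1.0, "Java": 1.0, "Linux": 1.0}
job_required_skill = {"Java": 2.0, "Scala": 1.0}

score = sparse_dot(member_skill, job_required_skill)
```

Only terms present in both vectors contribute, which is why a compact sparse representation keeps this cheap even when the skill vocabulary is large.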

ML Platform Q1 Meetup: An introduction to LinkedIn's Ranking and Federation Libraries

  • 1. Quasar and ReMix: An introduction to LinkedIn's Ranking and Federation libraries. Andris Birkmanis & Lance Wall
  • 2. Relevance: Verticals & Infrastructure [Diagram labels: Relevance Infra, Relevance Verticals; isolated ML models, integrated ML models, deployed ML services; ML algos, scoring and ranking tools, relevance service platform; Quasar, ReMix]
  • 3. Quasar: Quick Scoring and Ranking. Our mission is to make efficient feature transformation, scoring, and ranking simple.
  • 4. Scoring • Scoring • Scorables • Features • Feature Transformation
  • 5. Ranking • Sorting • TopK • Filtering • Group by • Distinct • Union • Custom
  • 6. Relevance Models: DAGs of Computations [Diagram: All Documents -> Filter BY interest-match > 0.5 -> Filter BY skill-match > 0.7 -> TOP 50 BY content-match. Inputs: member interest and news feed category feed interest-match-score; skill and content feed skill-match-score and content-match-score. Candidate counts shown along the pipeline: 10,000, 500, 500, 50]
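
The pipeline on slide 6 can be illustrated with a small Python sketch. This is not Quasar code: the dictionary field names are hypothetical, the match scores are assumed to be precomputed, and the stages are assumed to run in the order the slide lists them.

```python
import heapq

def rank(documents, k=50):
    """Filter by two match scores, then keep the k best by content match."""
    # Filters run first so the final top-k step sees fewer candidates.
    by_interest = [d for d in documents if d["interest_match"] > 0.5]
    by_skill = [d for d in by_interest if d["skill_match"] > 0.7]
    # heapq.nlargest returns the k largest items in descending order.
    return heapq.nlargest(k, by_skill, key=lambda d: d["content_match"])

# Hypothetical documents with precomputed scores.
docs = [
    {"id": "a", "interest_match": 0.9, "skill_match": 0.8, "content_match": 0.3},
    {"id": "b", "interest_match": 0.4, "skill_match": 0.9, "content_match": 0.9},
    {"id": "c", "interest_match": 0.7, "skill_match": 0.75, "content_match": 0.6},
    {"id": "d", "interest_match": 0.8, "skill_match": 0.2, "content_match": 0.95},
]
top = rank(docs, k=2)
```

Each stage only narrows the candidate set, which is why the counts on the slide shrink from left to right.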
  • 7. ML and Training • Tracking training dependencies between ML models • Integrating with training engines via Training API • Automatic type conversion for features and model parameters • Reuse of feature transformations between training and prediction
  • 8. Quasar Components • Domain Specific Language (DSL) ▪ Oriented towards scoring and ranking concepts ▪ Supports various machine learning models ▪ Supports various ranking operators ▪ Supports pluggable feature transformers ▪ Supports arithmetical and logical expressions • Library ▪ Includes out-of-box feature transformers tuned for performance (dense/sparse vectors, bags of words, etc.) ▪ Extensible with custom transformers and ranking operators • Execution engine ▪ Supports multiple evaluation strategies for different objectives (lazy/eager/batching/etc.) ▪ Debuggability, logging, and other cross-cutting concerns ▪ API for scoring, ranking, read/write access to features, training
  • 9. Project Status • LinkedIn Relevance Products ▪ Feed, Recommendation, Search • Adoption ▪ 1000+ Quasar models
  • 10. Future directions • Better training support for external models (XGBoost, TensorFlow) • Making feature transformers and operators more reusable • Better type information • Standardized storage formats for features and model parameters • See the upcoming LinkedIn engineering blog for technical details
  • 12. Example relevance workflows at LinkedIn [Diagram, two workflows: Member ID -> Fetch Member Profile -> Compute Job Recommendations -> Format Results; Member ID -> Fetch Member Profile -> Compute People Recommendations -> Format Results]
  • 13. Motivation • Multitenant relevance workflow services with tens of engineers on multiple teams contributing • Each relevance workflow service has different APIs and conventions • Lack of abstraction of system-level concerns from application logic • Diminished productivity, operability, and leverage
  • 14. ReMix’s Mission Provide an easy to use platform for building relevance services with a focus on optimizing leverage and automating common operability concerns.
  • 15. Design Goals • Consolidation of various relevance service stacks • Ease of support • Ease of development • Ease of operation
  • 16. Features of ReMix • Leverages ParSeq for easy asynchronous I/O • Exposes declarative API for composing workflows • Provides automated monitoring instrumentation and tooling • Provides robust, extensible solutions for common workflow functionality • Provides isolation and robustness to downstream instability
  • 17. How does ReMix work? Operators are assembled into a Workflow, which is submitted to a WorkflowEngine.
  • 18. Operator • Modular functional component of a Workflow • ReMix provides Operators for common functionality • ReMix provides decorative interfaces for common optimizations • ReMix provides generic support for asynchronous execution
  • 19. Example relevance workflows at LinkedIn [same diagram as slide 12]
  • 20. Workflow • Declaration of deferred execution • Easy to understand declarative language • Leverages ParSeq and exposes a simpler API • Abstraction of execution behavior & optimizations • Independent of environment or service (i.e. portable)
  • 21. Example relevance workflows at LinkedIn [same diagram as slide 12]
  • 22. WorkflowEngine • Executor of Workflows • Translates Workflows to ParSeq Tasks • Provides special considerations for async/RPC operations • Provides common operability functionality
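
The Operator, Workflow, and WorkflowEngine relationship described in slides 17 to 22 can be sketched roughly as follows. The slides do not show ReMix's actual API, so every class and function name here is hypothetical, and Python's `asyncio` stands in for ParSeq purely for illustration.

```python
import asyncio

# Operators: small async functions, each doing one step of the workflow.
async def fetch_member_profile(member_id):
    await asyncio.sleep(0)  # stands in for a remote call
    return {"id": member_id, "skills": ["java", "scala"]}

async def compute_job_recommendations(profile):
    await asyncio.sleep(0)  # stands in for a call to a recommendation backend
    return [f"job-for-{skill}" for skill in profile["skills"]]

async def format_results(recommendations):
    return {"results": recommendations}

class Workflow:
    """A declaration of deferred execution: an ordered list of operators."""
    def __init__(self, *operators):
        self.operators = operators

class WorkflowEngine:
    """Runs a workflow by feeding each operator's output to the next one."""
    async def run(self, workflow, value):
        for operator in workflow.operators:
            value = await operator(value)
        return value

# Declaring the workflow runs nothing; execution starts when it is submitted.
job_workflow = Workflow(fetch_member_profile, compute_job_recommendations, format_results)
result = asyncio.run(WorkflowEngine().run(job_workflow, 42))
```

Keeping the declaration separate from the engine is what lets the engine add cross-cutting behavior such as monitoring or timeouts without touching operator code.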
  • 23. Project Status & Planned Work • ReMix adopters include job recommendations and blended search • Working on integration with Quasar ▪ Complete solution for model serving from offline to online • ReMix Cloud ▪ Simple toolkit/UI for creating a Workflow and deploying it to production ▪ Hosts Workflows in a managed service, with little to no operational cost to Workflow developers ▪ Increased leverage due to reuse of common components in multitenant platform
  • 26. Quasar Model
      MODELID "feed_quasar";
      DOCPARAM com.linkedin.feed.FeedItem feedItem;
      REQUEST PARAM Profile member;
      REQUEST FEATURE VECTOR interests = GetInterests(member);
      DOCUMENT FEATURE VECTOR categories = GetCategories(feedItem);
      DOCUMENT FEATURE LONG publishedTime = GetPublishedTime(feedItem);
      MODEL PARAM timeBuckets = { "1hr" : 60, "3hr" : 180 };
      DOCUMENT FEATURE VECTOR normalizedTime = Bucketize(diffTime, timeBuckets);
      DOCUMENT FEATURE VECTOR interestMatch = Similarity(interests, categories);
      MODEL PARAM MAP<STRING, OBJECT> modelWeights = {
        "normalizedTime": { "1hr": 0.234, "3hr": 0.456, "Other": 0.21 },
        "interestMatch": 0.823
      };
      DOCUMENT FEATURE FLOAT score = LinearScore(modelWeights, "sigmoid");
      DOCUMENT FEATURE BOOLEAN aboveThreshold = score > 0.5
      filteredFeed = FILTER DOCUMENTS BY aboveThreshold;
      rerankedFeed = ORDER filteredFeed BY score WITH DESC;
      RETURN rerankedFeed;
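
To make the data flow of the slide 26 model concrete, here is a rough Python approximation. The semantics are assumptions, not Quasar's documented behavior: `diffTime` is not defined on the slide and is treated as an item's age in minutes, `Bucketize` is assumed to pick the first bucket whose bound exceeds the value, `Similarity` is assumed to be a dot product, and `LinearScore` is assumed to be a weighted sum passed through a sigmoid.

```python
import math

TIME_BUCKETS = {"1hr": 60, "3hr": 180}  # upper bounds, assumed to be minutes
WEIGHTS = {
    "normalizedTime": {"1hr": 0.234, "3hr": 0.456, "Other": 0.21},
    "interestMatch": 0.823,
}

def bucketize(age_minutes, buckets):
    """Return the first bucket whose bound exceeds the value (assumed semantics)."""
    for name, bound in sorted(buckets.items(), key=lambda kv: kv[1]):
        if age_minutes < bound:
            return name
    return "Other"

def similarity(a, b):
    """Dot product of two sparse vectors stored as dicts (assumed semantics)."""
    return sum(value * b.get(key, 0.0) for key, value in a.items())

def score(interests, item):
    bucket = bucketize(item["age_minutes"], TIME_BUCKETS)
    linear = (WEIGHTS["normalizedTime"][bucket]
              + WEIGHTS["interestMatch"] * similarity(interests, item["categories"]))
    return 1.0 / (1.0 + math.exp(-linear))  # sigmoid

def rerank(interests, items):
    scored = [(score(interests, item), item["id"]) for item in items]
    kept = [pair for pair in scored if pair[0] > 0.5]  # FILTER DOCUMENTS BY aboveThreshold
    return [item_id for _, item_id in sorted(kept, reverse=True)]  # ORDER ... WITH DESC

# Hypothetical request and documents; a negative interest marks a disliked topic.
interests = {"ml": 1.0, "sports": -1.0}
items = [
    {"id": "x", "age_minutes": 30, "categories": {"ml": 1.0}},
    {"id": "y", "age_minutes": 500, "categories": {"sports": 1.0}},
    {"id": "z", "age_minutes": 120, "categories": {}},
]
feed = rerank(interests, items)
```

Under these assumptions, item `y` scores below 0.5 and is filtered out, while the remaining items are returned in descending score order.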
  • 27. The multipass ensemble model at runtime [Figure labels: Request; Candidate list of documents; Pass 1: getInterests, then per document getCategories, getPublishedTime, getSimilarity, Bucketize, LinearScore, followed by Filter Documents; Pass 2: getViewTimes, Bucketize, Decision Tree, LinearScore, followed by Order Documents]
  • 28. Vector Math and Expression Support • Vector as first class citizen in DSL • State-of-the-art Java Vector implementation ▪ Compact and efficient data structure ▪ Efficient Vector math computation [Figure: Member/Job Similarity Score = member.skill dot product job.required_skill, shown with sparse skill vectors over terms such as C++, Java, Network, Linux, Hadoop, Scala, and Gradle]
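
As an illustration of the member/job similarity on slide 28, the sketch below computes a dot product over two sparse vectors, each stored as a sorted term list plus a parallel value list. Both the representation and the skill weights are assumptions made for the example; the slide does not describe how the Java vector implementation is laid out.

```python
def sparse_dot(a, b):
    """Dot product of two sparse vectors given as (sorted terms, values)."""
    a_terms, a_values = a
    b_terms, b_values = b
    i = j = 0
    total = 0.0
    # Walk both sorted term lists in step; only shared terms contribute.
    while i < len(a_terms) and j < len(b_terms):
        if a_terms[i] == b_terms[j]:
            total += a_values[i] * b_values[j]
            i += 1
            j += 1
        elif a_terms[i] < b_terms[j]:
            i += 1
        else:
            j += 1
    return total

# Hypothetical weights; terms must be sorted for the merge to work.
member_skill = (["C++", "Java", "Linux", "Scala"], [1.0, 1.0, 1.0, 3.0])
job_required_skill = (["Hadoop", "Java", "Scala"], [2.0, 1.0, 2.0])
similarity = sparse_dot(member_skill, job_required_skill)
```

Storing only the non-zero terms keeps memory proportional to the number of skills present, and the merge visits each entry at most once.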