Computing Professional Identity for the Economic Graph
Agenda 
1 Introduction 
2 LinkedIn’s Vision 
3 Computing Professional Identity 
4 Selected Topics 
5 Summary 
6 Final Words
1 Introduction
About me 
©2013 LinkedIn Corporation. All Rights Reserved. 
Vitaly Gordon
@bigdatasc /in/vitalygordon 
What’s in it for you? 
1. You will learn how LinkedIn takes a massive vision and breaks it down into small data problems
2. You will learn how hard cleaning data can be
3. You will learn why LinkedIn needs endorsements
What’s in it for me? 
@bigdatasc
2 LinkedIn’s Vision
Create economic opportunity for every professional in the world
• CEO
• Chief Executive Officer
• CEO and Founder
• CEO & Co-founder
• President and CEO
• Owner

• IBM
• International Business Machines
• International Bus. Machines
• IBM Research
• IBM T.J. Watson Research Center
• IBM Canada
• IBM India

• UCLA
• University of California, Los Angeles
• UC Los Angeles
• The Anderson School of Management
3 Computing Professional Identity
Why Do We Need Identity Standardization? 
Text Based Solution 
• Applies acronym expansion (e.g. vp -> vice president)
• Applies abbreviation expansion (e.g. sr. -> senior)
• Selects the most common standard titles
• Selects standard substrings (e.g. software engineer and tech lead in search -> [software engineer, tech lead])
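The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LinkedIn's implementation: the lookup tables, the function names (`expand`, `most_common_titles`, `standardize`) and the sample titles are all hypothetical, and a production system would need far larger dictionaries.

```python
import re
from collections import Counter
from typing import Iterable, List

# Hypothetical lookup tables, chosen only to mirror the slide's examples.
ACRONYMS = {"vp": "vice president", "ceo": "chief executive officer"}
ABBREVIATIONS = {"sr": "senior", "jr": "junior"}


def expand(raw_title: str) -> str:
    """Lowercase a title and expand known acronyms and abbreviations."""
    tokens = re.findall(r"[a-z0-9&]+", raw_title.lower())
    return " ".join(ACRONYMS.get(t, ABBREVIATIONS.get(t, t)) for t in tokens)


def most_common_titles(raw_titles: Iterable[str], n: int) -> List[str]:
    """Pick the n most frequent expanded titles as the 'standard' titles."""
    counts = Counter(expand(t) for t in raw_titles)
    return [title for title, _ in counts.most_common(n)]


def standardize(raw_title: str, standard_titles: List[str]) -> List[str]:
    """Return every standard title found as a whole-word substring."""
    text = expand(raw_title)
    return [
        title
        for title in standard_titles
        if re.search(rf"\b{re.escape(title)}\b", text)
    ]


if __name__ == "__main__":
    # Hypothetical sample of member-entered titles.
    sample = ["Software Engineer", "software engineer", "Tech Lead", "tech lead", "Sr. VP"]
    standards = most_common_titles(sample, 2)
    print(standardize("Software Engineer and Tech Lead in Search", standards))
```

Matching on whole-word substrings keeps the sketch close to the slide's example, where one free-text title yields two standard titles.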
Problems with a Text Based Approach 
Senior Software Engineer, Software Engineer
Software Engineer, Software Developer, Programmer
Architect
Profile Inferred Skills Endorsements Skill Vectors
[Chart: axis from 0% to 30%; skills shown: Data Mining, Hadoop, Machine Learning, Java, Algorithms, Python, MapReduce, Data Science]
http://www.slideshare.net/s_shah/strata-endorsements
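One way to read the skill-vector chart is as each skill's share of a member's endorsements. The sketch below builds such a vector and compares two of them with cosine similarity. Both the normalization and the use of cosine similarity are assumptions made for illustration; the slides do not say how the vectors are computed or compared, and the endorsement lists are invented.

```python
import math
from collections import Counter
from typing import Dict, List


def skill_vector(endorsed_skills: List[str]) -> Dict[str, float]:
    """Map each skill to its share of all endorsements (shares sum to 1)."""
    counts = Counter(endorsed_skills)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {skill: n / total for skill, n in counts.items()}


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse skill vectors (assumed metric)."""
    dot = sum(weight * b.get(skill, 0.0) for skill, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical endorsement lists for two members.
    member_a = skill_vector(["Hadoop", "Hadoop", "Java", "Python"])
    member_b = skill_vector(["Hadoop", "Java", "MapReduce"])
    print(member_a)
    print(cosine_similarity(member_a, member_b))
```

Comparing members by what they are endorsed for, rather than by the text of their titles, is one way to sidestep the title-wording problems listed earlier.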
Ontology Creation 
Classification 
4 Selected Topics
Normalization 
Clustering 
• Each topic is a distribution over words 
• Each document is a mixture of corpus-wide topics 
• Each word is drawn from one of those topics
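The three statements above read like the generative story of a topic model. The wording matches the usual description of latent Dirichlet allocation, though the slide does not name a specific model. The sketch below only simulates that story, it does not fit a model to data, and the topics, words and probabilities are invented.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical corpus-wide topics: each topic is a distribution over words.
TOPICS: Dict[str, Dict[str, float]] = {
    "engineering": {"software": 0.5, "engineer": 0.3, "developer": 0.2},
    "management": {"manager": 0.5, "director": 0.3, "lead": 0.2},
}


def generate_document(
    topic_mixture: Dict[str, float], length: int, rng: random.Random
) -> List[Tuple[str, str]]:
    """Generate (topic, word) pairs for one document.

    topic_mixture is the document's mixture over the corpus-wide topics.
    For each word position, a topic is drawn from the mixture and then
    a word is drawn from that topic's word distribution.
    """
    topic_names = list(topic_mixture)
    topic_weights = [topic_mixture[name] for name in topic_names]
    document = []
    for _ in range(length):
        topic = rng.choices(topic_names, weights=topic_weights)[0]
        vocabulary = TOPICS[topic]
        word = rng.choices(list(vocabulary), weights=list(vocabulary.values()))[0]
        document.append((topic, word))
    return document


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for topic, word in generate_document({"engineering": 0.7, "management": 0.3}, 8, rng):
        print(f"{word} (from topic: {topic})")
```

Fitting such a model runs this story in reverse: given only the words, it infers the topics and each document's mixture.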
Anomaly Detection 
http://www.slideshare.net/tdunning/strata-2014-anomaly-detection
5 Summary
1. User-generated content from 300M members creates 300M problems
2. Data cleaning is so much more than filtering out empty values
3. Try to be creative and work around difficult language problems
6 Final Words
We’re building the next big thing 
Join Us! 
DJ Patil, Gary Flake, Beau Cronin
@bigdatasc /in/vitalygordon

Editor's Notes

  1. Computing Professional Identity for the Economic Graph
  2. How do you evaluate that people do the same thing?
  3. The crowdsourcing Turkers were left very confused
  4. Clustering titles is like clustering geography: it depends on the context
  5. Elephants can run, but that’s not what you should use them for