Computing Professional Identity for the Economic Graph
Vitaly Gordon
Strata New York 2014
Data & Analytics
Computing Professional Identity for the Economic Graph
1.
Computing Professional Identity for the Economic Graph
2.
Agenda
1 Introduction
2 LinkedIn’s Vision
3 Computing Professional Identity
4 Selected Topics
5 Summary
6 Final Words
3.
1 Introduction
4.
About me
Vitaly Gordon
©2013 LinkedIn Corporation. All Rights Reserved.
10.
About me
@bigdatasc
/in/vitalygordon
14.
What’s in it for you?
1. You will learn how LinkedIn takes a massive vision and breaks it down into small data problems
2. You will learn how hard cleaning data can be
3. You will learn why LinkedIn needs endorsements
17.
What’s in it for me?
@bigdatasc
18.
2 LinkedIn’s Vision
19.
Create economic opportunity for every professional in the world
21.
• CEO
• Chief Executive Officer
• CEO and Founder
• CEO & Co-founder
• President and CEO
• Owner
22.
• IBM
• International Business Machines
• International Bus. Machines
• IBM Research
• IBM T.J. Watson Research Center
• IBM Canada
• IBM India
23.
• UCLA
• University of California, Los Angeles
• UC Los Angeles
• The Anderson School of Management
25.
3 Computing Professional Identity
26.
Why Do We Need Identity Standardization?
30.
Text Based Solution
• Applies acronym expansion (e.g. vp -> vice president)
• Applies abbreviation expansion (e.g. sr. -> senior)
• Selects the most common standard titles
• Selects standard substrings (e.g. software engineer and tech lead in search -> [software engineer, tech lead])
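The four text-based steps on this slide can be sketched roughly as follows. This is a minimal illustration, not LinkedIn's implementation: the lookup tables, the list of standard titles, and all function names are hypothetical, and only the `vp`, `sr.`, and "software engineer and tech lead in search" examples come from the slide.

```python
import re
from collections import Counter

# Hypothetical lookup tables; only "vp" and "sr." appear on the slide.
ACRONYMS = {"vp": "vice president"}
ABBREVIATIONS = {"sr.": "senior"}

# Hypothetical standard titles; in practice these would be chosen from
# the most common cleaned titles (see most_common_titles below).
STANDARD_TITLES = ["software engineer", "tech lead"]


def expand(raw_title: str) -> str:
    """Lowercase a title and expand known acronyms and abbreviations."""
    tokens = raw_title.lower().split()
    expanded = [ABBREVIATIONS.get(t, ACRONYMS.get(t, t)) for t in tokens]
    return " ".join(expanded)


def most_common_titles(raw_titles: list[str], k: int) -> list[str]:
    """Return the k most frequent expanded titles as candidate standards."""
    counts = Counter(expand(t) for t in raw_titles)
    return [title for title, _ in counts.most_common(k)]


def standard_substrings(raw_title: str) -> list[str]:
    """Return every standard title found as a whole-word substring."""
    text = expand(raw_title)
    return [
        title
        for title in STANDARD_TITLES
        if re.search(r"\b" + re.escape(title) + r"\b", text)
    ]
```

For example, `standard_substrings("software engineer and tech lead in search")` yields both standard titles, matching the slide's example. A sketch like this treats titles purely as strings, which is the limitation the next slides point out.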
31.
Problems with a Text Based Approach
Senior Software Engineer
Software Engineer
32.
Problems with a Text Based Approach
Software Engineer
Software Developer
Programmer
33.
Problems with a Text Based Approach
Architect
40.
Profile, Inferred Skills, Endorsements, Skill Vectors
[Bar chart, y-axis 0% to 30%; skills shown: Data Mining, Hadoop, Machine Learning, Java, Algorithms, Python, MapReduce, Data Science]
http://www.slideshare.net/s_shah/strata-endorsements
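One way to picture a skill vector like the one charted here is as per-skill weights normalized to sum to 100%. The sketch below is an assumption-laden illustration: the slides do not say how the weights are derived or how vectors are compared, so the input counts, the function names, and the use of cosine similarity are all hypothetical choices.

```python
import math


def skill_vector(weights: dict[str, float]) -> dict[str, float]:
    """Normalize raw per-skill weights so that the shares sum to 1."""
    total = sum(weights.values())
    if total == 0:
        return {}
    return {skill: w / total for skill, w in weights.items()}


def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse skill vectors (assumed metric)."""
    dot = sum(a[s] * b[s] for s in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical raw weights for two members (e.g. endorsement counts).
member_a = skill_vector({"Hadoop": 3, "Java": 1})
member_b = skill_vector({"Hadoop": 1, "Python": 1})
similarity = cosine_similarity(member_a, member_b)
```

Comparing vectors rather than title strings is one possible way to tell that two differently titled members do similar work.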
41.
Ontology Creation
43.
Classification
44.
4 Selected Topics
45.
Normalization
48.
Clustering
• Each topic is a distribution over words
• Each document is a mixture of corpus-wide topics
• Each word is drawn from one of those topics
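The three bullets read like the generative story of a topic model such as latent Dirichlet allocation, although the slide does not name a specific model. The sketch below only simulates that story with hand-written distributions; it does not fit a model, and the topics, words, and probabilities are invented for illustration.

```python
import random

# Each topic is a distribution over words (hypothetical values).
TOPICS = {
    "engineering": {"software": 0.5, "engineer": 0.3, "developer": 0.2},
    "management": {"manager": 0.6, "director": 0.4},
}


def generate_document(
    topic_mixture: dict[str, float], n_words: int, rng: random.Random
) -> list[str]:
    """Generate words by picking a topic per word, then a word from it."""
    topic_names = list(topic_mixture)
    topic_weights = [topic_mixture[t] for t in topic_names]
    words = []
    for _ in range(n_words):
        # Each document is a mixture of corpus-wide topics.
        topic = rng.choices(topic_names, weights=topic_weights, k=1)[0]
        # Each word is drawn from one of those topics.
        vocab = list(TOPICS[topic])
        word_weights = [TOPICS[topic][w] for w in vocab]
        words.append(rng.choices(vocab, weights=word_weights, k=1)[0])
    return words


# A hypothetical document that is 70% engineering and 30% management.
doc = generate_document(
    {"engineering": 0.7, "management": 0.3}, 10, random.Random(0)
)
```

Fitting a topic model runs this story in reverse: given only the documents, it infers the topics and each document's mixture.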
51.
Anomaly Detection
http://www.slideshare.net/tdunning/strata-2014-anomaly-detection
52.
5 Summary
55.
Summary
1. User-generated content from 300M members creates 300M problems
2. Data cleaning is so much more than filtering out empty values
3. Try to be creative and work around difficult language problems
56.
6 Final Words
60.
We’re building the next big thing
Join Us!
DJ Patil
Gary Flake
Beau Cronin
61.
@bigdatasc
/in/vitalygordon
Editor's Notes
Computing Professional Identity for the Economic Graph
How do you evaluate that people do the same thing?
The crowdsourcing turkers left very confused
Clustering titles is like clustering geography: it depends on the context
Elephants can run, but that’s not what you should use them for