Data Products @LinkedIn – Culture, People and Tools
Before we begin …




                 A dog teaches a boy fidelity,
              perseverance, and to turn around
               three times before lying down.
                                                   Robert Benchley – American Humorist




©2013 LinkedIn Corporation. All Rights Reserved.
What qualifies as a Data Product?



[Diagram labels: Big Data, Data Product, Machine Learning, AI, Consumer Facade]


Shift in Metrics




[Timeline figure with year markers: 1998, 2003, 2008, 2012]

Data Products Timeline


[Timeline figure. Milestones in left-to-right order: Crawl Search, Personalization, Recommendation, PYMK, Netflix Challenge. Year markers: 1998, 2004, 2006, 2007, 2009]


What I’ll talk about




            Culture                                People   Tools


Culture                                People   Tools


Culture




1. Everything is a Data Product
2. If you can’t measure it, you can’t fix it
3. Fewer things are done better


Data Products on your LinkedIn homepage




Measurement




1. Have core metrics
2. Define measure of success
3. Rinse and repeat
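
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LinkedIn's actual tooling: the core metric is assumed to be click-through rate, the measure of success is assumed to be a minimum relative lift over the current version, and the names `Variant`, `meets_success`, `iterate` and `min_lift` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Variant:
    """One version of a feature and the traffic it received (hypothetical type)."""
    name: str
    impressions: int
    clicks: int

    @property
    def ctr(self) -> float:
        # Step 1, core metric: click-through rate (assumed here); 0.0 if never shown.
        return self.clicks / self.impressions if self.impressions else 0.0


def meets_success(control: Variant, candidate: Variant, min_lift: float) -> bool:
    # Step 2, measure of success: relative CTR lift over control of at least min_lift.
    if control.ctr == 0.0:
        return candidate.ctr > 0.0
    lift = (candidate.ctr - control.ctr) / control.ctr
    return lift >= min_lift


def iterate(control: Variant, candidates: Iterable[Variant], min_lift: float = 0.05) -> Variant:
    # Step 3, rinse and repeat: each candidate that clears the bar becomes the new control.
    for candidate in candidates:
        if meets_success(control, candidate, min_lift):
            control = candidate
    return control
```

For example, `iterate(Variant("v0", 1000, 50), [Variant("v1", 1000, 51), Variant("v2", 1000, 60)])` keeps `v0` against `v1` (the lift is below the assumed 5% threshold) and then promotes `v2`. A real setup would also need a statistical significance check, which this sketch leaves out.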



Measurement




Culture




Culture                                People   Tools


People




1. World class talent is the number one priority
2. Data scientists are unicorns
3. Let people be


People




Some of LinkedIn’s Data Scientists




Joseph Adler, Gloria Lau, Monica Rogati
Daniel Tunkelang, Daria Sorokina, Mathieu Bastian
Letting people be




Culture                                People   Tools


Tools




1. For real data products, you need real data
2. Invest in infrastructure
3. Open Source = Happiness


Real Data




LinkedIn Infrastructure



Project Voldemort
Espresso
Apache Kafka
DataBus
Azkaban
Zoie / Bobo
Avatara
DataFu



Apache Kafka




Project Voldemort




Azkaban




DataBus




The Technologist’s Hierarchy of Needs


[Pyramid, top to bottom: BDFL, Fame, Recognition, Salary, Functionality; side label: External Validation]
If you need to remember just 3 things




1. People are everything
2. Data Products drive business
3. Easier life => More productivity




Editor's Notes

  • #3 Please take my words with a grain of salt. Not everything you learn today is something you should rush to implement, and not everything you do that doesn’t appear here is wrong. I just came to tell you about some of the things I found interesting that made LinkedIn successful in developing data products.
  • #4 What are data products (in the context of this talk)? Something that involves algorithms and some consumer web facade. For example, dashboards and visualizations are not data products because they don’t have algorithms, and HFT algorithms and missile guidance systems are not because they don’t have a consumer web facade.
  • #5, #6 Metrics – Explain the shift in metrics, from eyeballs (bottom-of-page counters) to segmented eyeballs (e.g. Google Analytics) to web funnels (i.e. looking at more metrics than just views) to multi-dimensional engagement (don’t know if it is that different from the previous one). Timeline – Explain the evolution of data products from Amazon’s “People who viewed this …” through Google Ads (don’t know if it falls under the previous definition) through PYMK to Endorsements. Also explain that data products are not just cool, they are very valuable to your business.
  • #7 The power of 3 
  • #9 Almost every element on the LinkedIn pages is a data product. Everything should be data driven, from idea conception through iteration to success. Focus is really important: the attention span of your customers is limited, so for every feature you add, think about what you need to remove.
  • #11, #12 Everything should be data driven, from idea conception through iteration to success.
  • #13 If you could only do one thing, what would it be? -- Steve Jobs. Shortly after Jerry Yang became the CEO of Yahoo, he invited Steve Jobs to address the company's leadership. Among many insightful things that Steve shared that day, the one that continues to have the most profound influence on me was his discussion regarding prioritization. Jobs said that after he returned to Apple in 1997, he recognized there were far too many products and SKUs in development, so he asked his team one simple question: If you could only do one thing, what would it be? He said that many of the answers rationalized the need to do more than one thing, or sought to substantiate bundling one priority with another. However, all he wanted to know was what "the one thing" was. As he explained it, if they got that one thing right, they could then move on to the next thing, and the next thing after that, and so on. Turned out the answer to his question was the reinvention of the iMac. After that, it was the iPod, the iPhone, and the iPad, and the rest, as they say, is history. Interestingly enough, years later I heard Jobs speak at All Things D and he explained that the company had actually been working on the iPad before the iPhone, as he had long written off pursuit of the phone as being prohibitively challenging given the carrier landscape. However, once a window of opportunity opened up to successfully bring a phone to market, he hit the pause button on the tablet, and only returned to it once Apple got the iPhone right. Pretty mind-blowing to think that a company as large and successful as Apple, and someone as prodigiously talented as Steve Jobs, would temporarily shelve something as important as the iPad for the sake of focus, but that's exactly what he did.
  • #14 The power of 3 
  • #15 Really, it is. Give examples that show that this stuff is taken seriously (like Jeff’s all-hands, and Jim getting sent back to do his homework for not putting it first on his roadmap). The famous Conway diagram. Explain the day-to-day of a data scientist at LinkedIn and why it is important to have those skills. Give some examples of people’s backgrounds. How to hire good data scientists: by using real examples you both test for truly needed skills and do some selling while interviewing; maybe give examples of interview questions.
  • #16 Really, it is. Give examples that show that this stuff is taken seriously (like Jeff’s all hands)
  • #17 Joe Adler – Author of R in a Nutshell. Gloria Lau – Associate professor at Stanford. Monica Rogati – Wall Street Journal & The Economist to NPR & CNN to Real Simple & (yes!) Howard Stern. Daniel Tunkelang – Chief Scientist of Endeca, which was sold to Oracle for > $1B. Daria Sorokina – Creator of Additive Groves and competitor for the national Heritage Health Prize. Mathieu Bastian – Co-founder and technical lead at Gephi. Talk about the day-to-day work of those people.
  • #20 The power of 3 
  • #21 Most companies don’t have an exact replica of their production cluster in their development environment; explain why it is crucial. Infrastructure increases productivity and lets data scientists focus more on actual data science. Open sourcing may belong more under culture, but people are really curious why LinkedIn open sources so much. Explain the benefits in hiring and retention and mention a few of those projects.
  • #22 Developing data products without real data is like learning to swim from a book.
  • #23 Infrastructure increases productivity and lets data scientists focus more on actual data science.
  • #28 BDFL – Benevolent Dictator for Life. Benevolent – generous (Hebrew: נדיב).
  • #29 I don’t know if those are the main 3 takeaways from the talk