LinkedIn's Segmentation & Targeting Platform (Hadoop Summit 2013)

This presentation was given at Hadoop Summit 2013 on June 26, 2013, by Sid Anand and Hien Luu of LinkedIn.

Published in: Technology, Business

  1. LinkedIn Segmentation & Targeting Platform: A Big Data Application
     Hadoop Summit, June 2013
     Hien Luu, Sid Anand
     ©2013 LinkedIn Corporation. All Rights Reserved.
  2. About Us
     Hien Luu, Sid Anand
  3. Our mission
     Connect the world’s professionals to make them more productive and successful
  4. Over 200M members and counting
     The world’s largest professional network
     Growing at more than 2 members/sec
     [Chart: LinkedIn Members (Millions), 2004 to 2012: 2, 4, 8, 17, 32, 55, 90, 145, 200+]
     Source: http://press.linkedin.com/about
  5. The world’s largest professional network
     • >88% of Fortune 100 companies use LinkedIn Talent Solutions to hire
     • Company Pages: >2.9M
     • Professional searches in 2012: >5.7B
     • Languages: 19
     • Fastest growing demographic: students and NCGs (>30M)
     • Over 64% of members are now international
     Source: http://press.linkedin.com/about
  6. Other Company Facts
     • Headquartered in Mountain View, Calif., with offices around the world
     • As of June 1, 2013, LinkedIn has ~3,700 full-time employees located around the world
     Source: http://press.linkedin.com/about
  7. Agenda
     • Company Overview
     • Big Data @ LinkedIn
     • The Segmentation & Targeting Problem
     • Solution: LinkedIn Segmentation & Targeting Platform
     • Q & A
  8. Big Data @ LinkedIn
  9. LinkedIn: Big Data Story
     Our Big Data Story depends on infrastructure:
     • On-line Data Infrastructure
     • Near-line Data Infrastructure
     • Off-line Data Infrastructure
     [Diagram labels: Oracle or Espresso, Updates, Web Serving, Teradata, Data Streams; tiers: On-line, Near-line, Off-line]
  10. Big Data Story: On-line Data
      On-line Data Infrastructure
      • Supports typical OLTP requirements
        • Highly concurrent R/W access
        • Transactional guarantees
        • Back-up & Recovery
      • Supports a central LinkedIn Data Principle
        • “All data everywhere”
        • All OLTP databases need to provide a time-line consistent change stream
        • For this, we developed and open-sourced Databus
      [Diagram labels: Oracle or Espresso, Updates, Web Serving, On-line]
  11. Big Data Story: On-line Data
      A user updates the company, title, & school on his profile. He also accepts a connection.
      The write is made to an Oracle or Espresso master, and Databus replicates it:
      • The profile change is applied to the Standardization service.
        E.g. the many forms of IBM are canonicalized for search-friendliness.
      • ... and to the Search Index.
        Recruiters can find you immediately by new keywords.
      • The connection change is applied to the Graph Index service.
        The user can now start receiving feed updates from his new connections.
      [Diagram labels: Oracle or Espresso, Updates, Data Change Events, Standardization, Search Index, Graph Index, Read Replicas]
  12. Big Data Story: On-line Data
      Databus streams also update Hadoop.
      [Diagram labels: Oracle or Espresso, Updates, Data Change Events, Standardization, Search Index, Graph Index, Read Replica]
  13. Big Data Story: Near-line & Off-line Data
      2 Main Sources of Data @ LinkedIn
      • User-provided data
        • e.g. Member Profile data (employment, education history, endorsements)
      • Tracking data via web site instrumentation
        • e.g. pages viewed, email opened/sent, social gestures: posts/likes/shares
      [Diagram labels: Oracle or Espresso, Updates, Databus, Web Servers, Teradata]
  14. The Segmentation & Targeting Problem
  15. Segmentation & Targeting
  16. Segmentation & Targeting
      Attribute types
      Bhaskar Ghosh
  17. Segmentation & Targeting
      Step 1: Take some information about users

      Member ID   Join Date    Country   Responded to Promotion X1
      1           01/01/2013   FR        F
      2           01/02/2013   BE        F
      3           01/03/2013   FR        F
      4           02/01/2013   FR        T

      Step 2: Provide some targeting criteria for a new promotion
      Pick members where
      • Join Date between("01/01/2013", "01/31/2013") and
      • Country="FR" and
      • Responded to Promotion X1="F"
      Result: Members 1 & 3

      Step 3: Target them for a different email campaign (promotion_X2)
  18. Segmentation & Targeting
      The slide repeats the example from slide 17 and labels its parts:
      • Attributes: the user information taken in Step 1
      • Segment Definition: the targeting criteria provided in Step 2
      • Segment: the resulting members (Members 1 & 3)
  19. Segmentation & Targeting
      Problem Definition
      • The business wants to launch new campaigns often
      • The business wants to specify targeting criteria (segment definitions) using an arbitrary set of attributes
      • The attributes often need to be computed to fulfill the targeting criteria
      • This data resides on Hadoop or TD (Teradata)
      • The business is most comfortable with SQL-like languages
  20. Segmentation & Targeting Solution
  21. Segmentation & Targeting
      • Attribute Computation Engine
      • Attribute Serving Engine
  22. Segmentation & Targeting
      Attribute Computation Engine
      • Self-service
      • Support various data sources
      • Attribute consolidation
      • Attribute availability
  23. Segmentation & Targeting
      Attribute computation
      [Diagram labels: ~225M, PB, TB, TB, ~240]
  24. LinkedIn Segmentation & Targeting Platform
      [Diagram labels: Attribute Portal Web Application, Attribute & Definition Metadata]
  25. LinkedIn Segmentation & Targeting Platform
      [Diagram labels: Attribute & Definition Metadata; TD Executor, Hive Executor, Pig Executor, each reached over REST]
  26. LinkedIn Segmentation & Targeting Platform
      Attribute consolidation & availability
      [Diagram labels: /path/dataset1, /path/dataset2, /path/dataset3, /path/dataset4, M/R Stitcher, /path/lnkd_big_table, Data Loader]
  27. LinkedIn Segmentation & Targeting Platform
      LinkedIn big table, the most sought-after data
      [Diagram labels: LinkedIn big table, Segmentation, Propensity Model, Ad hoc analysis]
  28. Segmentation & Targeting
      Attribute Serving Engine
      • Self-service
      • Attribute predicate expression
      • Build segments
      • Build lists
  29. Segmentation & Targeting
      Serving Engine
      [Diagram labels: $, count, filter, sum, complex expressions, Σ, 1, 2, 3, 4, LinkedIn big table, ~225M, ~240]
  30. LinkedIn Segmentation & Targeting Platform
      [Diagram labels: LinkedIn big table, Attribute & Definition Metadata, M/R Indexer, Inverted Index (x3)]
  31. LinkedIn Segmentation & Targeting Platform
      • Who are the North American recruiters that don’t work for a competitor?
      • Who are the LinkedIn Talent Solution prospects in Europe?
      • Who are the job seekers?
  32. LinkedIn Segmentation & Targeting Platform
      [Diagram labels: JSON Predicate Expression, JSON Lucene Query Parser, Inverted Index (x3), Segment & List]
  33. LinkedIn Segmentation & Targeting Platform
      Complex tree-like attribute predicate expressions
  34. LinkedIn Segmentation & Targeting Platform
      A marketing campaign is represented by a list
  35. Conclusion
      Move at business speed and scale at LinkedIn scale
      Segmentation & Targeting Platform
      • Self-service
      • Multiple data sources & massive data volume
      • Support complex expression evaluation in seconds
      • Attribute availability at business speed
  36. Engineering Team
      Jessica Ho, Swetha Karthik, Raj Rangaswamy, Tony Tong, Ajinkya Harkare, Hien Luu, Sid Anand
  37. Questions?
      More info: data.linkedin.com
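
The worked example on slides 17 and 18 can be sketched in a few lines of Python. This is only an illustration of the three steps (attributes, segment definition, segment), not the platform's implementation. The field names `member_id`, `join_date`, `country`, and `responded_x1` and the helper `in_segment` are hypothetical, and the slide's dates are read as MM/DD/YYYY.

```python
from datetime import date

# Step 1: attributes, copied from the table on slide 17.
# Field names are hypothetical; the slide only shows column headings.
members = [
    {"member_id": 1, "join_date": date(2013, 1, 1), "country": "FR", "responded_x1": False},
    {"member_id": 2, "join_date": date(2013, 1, 2), "country": "BE", "responded_x1": False},
    {"member_id": 3, "join_date": date(2013, 1, 3), "country": "FR", "responded_x1": False},
    {"member_id": 4, "join_date": date(2013, 2, 1), "country": "FR", "responded_x1": True},
]


def in_segment(member):
    """Step 2: the segment definition (targeting criteria) from the slide."""
    return (
        date(2013, 1, 1) <= member["join_date"] <= date(2013, 1, 31)
        and member["country"] == "FR"
        and not member["responded_x1"]
    )


# Step 3: the segment, i.e. the members to target with promotion_X2.
segment = [m["member_id"] for m in members if in_segment(m)]
print(segment)  # [1, 3], matching "Members 1 & 3" on the slide
```

A row-by-row scan like this is fine for four members. The deck's serving engine relies on inverted indexes instead, presumably because scanning a table with roughly 225M rows per query would be too slow for interactive use.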

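Slides 30 to 33 describe an M/R indexer that builds inverted indexes over the LinkedIn big table, and tree-like JSON predicate expressions that are parsed into Lucene queries. The sketch below mimics that flow with plain Python sets standing in for Lucene. The JSON shape (`term`, `and`, `or`, `not`) and the function names `build_index` and `evaluate` are assumptions made for illustration, since the slides do not show the actual expression format.

```python
import json
from collections import defaultdict

# A tiny stand-in for the "LinkedIn big table": member ID -> attribute values.
# Attribute names are hypothetical.
ROWS = {
    1: {"country": "FR", "responded_x1": "F"},
    2: {"country": "BE", "responded_x1": "F"},
    3: {"country": "FR", "responded_x1": "F"},
    4: {"country": "FR", "responded_x1": "T"},
}


def build_index(rows):
    """Build an inverted index: (attribute, value) -> set of member IDs."""
    index = defaultdict(set)
    for member_id, attrs in rows.items():
        for name, value in attrs.items():
            index[(name, value)].add(member_id)
    return index


def evaluate(node, index, universe):
    """Recursively evaluate a tree-like predicate against the index."""
    op, arg = next(iter(node.items()))
    if op == "term":
        name, value = arg
        return set(index.get((name, value), set()))
    if op == "and":
        result = set(universe)
        for child in arg:
            result &= evaluate(child, index, universe)
        return result
    if op == "or":
        result = set()
        for child in arg:
            result |= evaluate(child, index, universe)
        return result
    if op == "not":
        return set(universe) - evaluate(arg, index, universe)
    raise ValueError(f"unknown operator: {op}")


# Hypothetical JSON predicate: members in FR who did not respond to promotion X1.
expression = json.loads(
    '{"and": [{"term": ["country", "FR"]},'
    ' {"not": {"term": ["responded_x1", "T"]}}]}'
)

index = build_index(ROWS)
matches = evaluate(expression, index, set(ROWS))
print(sorted(matches), len(matches))  # [1, 3] 2
```

Each leaf is a single index lookup, and the boolean operators reduce to set intersection, union, and difference, so counting or listing the members of a segment never requires scanning the full table.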