Bringing OLTP with OLAP: Lumos on Hadoop
  • Today we will talk about scaling ETL in order to consolidate and democratize data and analytics on Hadoop at LinkedIn.
  • Let’s start with the overall data ecosystem, then focus on the specific problem of integrating online data stores with Hadoop, and go over the solution.
  • Members interact with the site apps and generate actions and data mutations, which get persisted in LOG stores and ONLINE data stores.
    Espresso, MySQL and Oracle are the primary online data stores.
    Espresso is a home-grown, document-oriented, partitioned data store with transactional support.
    Kafka is used as the LOG store.
    Online data sources are periodically replicated to Hadoop for creating cubes and enrichments.
    Cubes are used externally on the site as well as internally in reports/insights for analysts (e.g. “Who viewed your profile”, campaign performance reports, member sign-up reports).
    Cubes are delivered via cube-serving engines; there are primarily three cube-serving stacks.
    Voldemort is a key-value store used to deliver static reports with pre-computed metrics.
    Pinot is a search technology used to deliver somewhat dynamic reports with pre-computed metrics (drill-down).
    Finally, the traditional BI stack of TD + Tableau + MSTR delivers insights to business users.
  • Explain interactively what action generated what data → real use case.
    Tracking: user activity at the site turns into tracking data.
    Example → Tracking → PageView, AdClick.
    Append → each user activity generates new data.
    Immutable → once generated, it does not change but grows over time.
    Usually organized by time and accessed over a time range.
    Database: user-provided data stored in online stores.
    This data is mutable over time.
    Example → Member Profile, Education.
    Organized as a full table as of some time and accessed in full.
  • The problem is simply replicating the data from ONLINE to HADOOP.
    But LinkedIn has 300M members and generates lots of data, so this is a humongous amount of data.
    Fresh data directly impacts member engagement and business decision making.
  • PROD is the data center that is accessible from outside; HADOOP is in the CORP data center.
  • Deletes are needed for compliance.
    Moving the data entirely works, but it puts load on the source system, the network and Hadoop resources.
  • Commit time or
    Since tracking data is append-only, it is easier to handle and arrange in time windows.
    DB data can have updates or deletes, and reflecting those on HDFS with low latency and optimal resource usage is a challenge.
  • Talk about schema evolution.
  • This is neither an HDFS snapshot nor an HBase snapshot.
  • Schema changes require rewriting the complete data.
    Sqoop: cross-colo database connections are not allowed.
    Sqoop: may put load on the production databases.
    HBase: write the change logs and periodically take a snapshot and replicate; not all companies run HBase as part of the standard deployment; not clear if this would meet the low-latency requirement.
    Hive Streaming: looks similar to what we do; caveat: it only supports ORC.
  • Change to Data Extract
  • Bottom right
  • TODO: cluster of databases and Relay; reading off of Databus; with a picture; checkpoint; SCN-to-time mapping; backup slides towards the end.
  • DB dump format to Avro; Oracle data types; map-only job; field-level transformation; eliminate recursive schema.
    Avro schema attribute JSON; meta info; key and delta column; begin_date, end_date, drop_date, full_drop date; row counts.
  • Change to Data Extract

Bringing OLTP with OLAP: Lumos on Hadoop (Presentation Transcript)

  • Scaling ETL on Hadoop: Bridging OLTP with OLAP
  • Agenda (slide 2)
    – Data Ecosystem @ LinkedIn
    – Problem: Bridging OLTP with OLAP
    – Solution
    – Details
    – Conclusion and Future Work
  • Data Ecosystem @ LinkedIn (slide 3)
  • Data Ecosystem – Overview (slide 4)
    (diagram labels: Serving App; Online Stores: Espresso, Oracle, MySQL; Logs; Analytics Infra; Business; OLAP Serving Engines)
  • Data Ecosystem – Data (slide 5)
    – Tracking Data
      – Tracks user activity at the web site
      – Append only
      – Example: Page View
    – Database Data
      – Member-provided data in online stores
      – Inserts, updates and deletes
      – Example: Member Profiles, Likes, Comments
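    As a rough illustration of the two shapes of data described above, here is a minimal sketch in Python. The paths and field names are hypothetical, not LinkedIn's actual layout: tracking data accumulates in time-partitioned, append-only directories, while database data is a mutable keyed table that has to be re-materialized as a snapshot.

        # Illustrative sketch only: hypothetical paths and records, not the actual layout.
        from datetime import datetime, timezone

        # Tracking data: append-only events, organized by time and read over time ranges.
        page_view_event = {
            "memberId": 12345,
            "pageKey": "profile",
            "timestamp": datetime(2014, 6, 3, 10, 15, tzinfo=timezone.utc).isoformat(),
        }
        tracking_path = "/data/tracking/PageViewEvent/2014/06/03/"  # hypothetical path

        # Database data: keyed rows that receive inserts, updates and deletes,
        # so Hadoop needs a consistent snapshot of the full table as of some time.
        member_profile_row = {
            "member_id": 12345,                       # primary key
            "headline": "Data Engineer",
            "last_modified": "2014-06-03T10:20:00Z",  # later used as the "delta column"
        }
        database_snapshot_path = "/db/MEMBER_PROFILE/snapshot-0/"  # hypothetical path

        print(tracking_path, page_view_event)
        print(database_snapshot_path, member_profile_row)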
  • Problem – Scaling ETL on Hadoop (slide 6)
  • Bridging OLTP to OLAP (slide 7)
    – Integrating site-serving data stores with Hadoop at scale with low latency
    – Critical to LinkedIn’s
      – Member engagement
      – Business decision making
    (diagram labels: OLTP; OLAP; Tracking Data → Kafka; Databases: Espresso, Oracle, MySQL; OLAP Serving Engines)
  • Challenge – Scalable ETL (slide 8)
    – 600+ tracking topics
    – 500+ database tables
    – XXX TB of data at rest
    – X TB of new data generated per day
    – 5000 nodes, several Hadoop clusters
    (diagram: OLTP-to-OLAP data flow, as on slide 7)
  • Challenge – Consistent Snapshot with SLA (slide 9)
    – Apply updates, deletes
    – Copy full tables
      – But: resource overheads
      – Only a small fraction of data changes
    (diagram: OLTP-to-OLAP data flow, as on slide 7)
  • Requirements (slide 10)
    – Refresh data on HDFS frequently
    – Seamless handling of schema evolution
    – Optimal resource usage
    – Handle multiple data centers
    – Efficient change capture on the source
    – Ensure last-update semantics
    – Handle deletes
    (diagram labels: OLTP; Oracle; Espresso; OLAP; Serving Engines; Database Data; Tracking Data)
  • Solution (slide 11)
  • Lumos (slide 12)
    – Can use commit logs
    – Delta processing
    – Latencies in minutes
    – Schema-agnostic framework
    (diagram labels: Data Capture; Databus; Others; DB Extract Files; Colo-1 Databases; Colo-2 Databases; Hadoop data center; Lumos; databases (HDFS); dbchanges (HDFS))
  • Lumos – Multi-Datacenter (slide 13)
    – Handle multi-datacenter stores
    – Resolve updates via commit order
    (diagram: same data-capture flow as slide 12)
  • Lumos – Data Organization (slide 14)
    – Database Snapshot
      – Entire database on HDFS
      – With added latency
    – Database Virtual Snapshot
      – Previous Snapshot + Delta
      – Enables faster refresh
    (diagram labels: Virtual Snapshot; HDFS Layout; InputFormat; Pig & Hive Loaders; HDFS layout: /db/table/snapshot-0 with _delta/dir-1, dir-2, dir-3)
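    A minimal sketch of the virtual-snapshot idea in plain Python, with hypothetical field names (key, delta column, deleted flag) rather than Lumos's actual schema: instead of rewriting the full snapshot, readers see the previous snapshot logically combined with the accumulated deltas, where the newest version of each key wins and deleted keys are masked.

        # Sketch: virtual snapshot = previous snapshot + deltas, latest row per key wins.
        def virtual_snapshot(snapshot_rows, delta_batches, key="member_id", delta_col="last_modified"):
            """snapshot_rows: rows of the published snapshot.
            delta_batches: increment batches, oldest first; a row may carry deleted=True."""
            merged = {row[key]: row for row in snapshot_rows}
            for batch in delta_batches:
                for row in batch:
                    current = merged.get(row[key])
                    # Keep the row with the larger delta column (last-update semantics).
                    if current is None or row[delta_col] >= current[delta_col]:
                        merged[row[key]] = row
            # Mask rows whose latest version is a delete.
            return [r for r in merged.values() if not r.get("deleted", False)]

        snapshot = [{"member_id": 1, "headline": "Engineer", "last_modified": 100},
                    {"member_id": 2, "headline": "Analyst",  "last_modified": 100}]
        deltas = [[{"member_id": 2, "headline": "Sr. Analyst", "last_modified": 200}],
                  [{"member_id": 1, "deleted": True, "last_modified": 300}]]
        print(virtual_snapshot(snapshot, deltas))  # only member 2 remains, with the updated headline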
  • Lumos – High-Level Architecture (slide 15)
    (diagram labels: Change Capture; Increments; Full Drops; Pre-Process; Lazy Snapshot Builder; Virtual Snapshot Builder; Staging (internal); Published Virtual Snapshot; Compactor; MR/Pig/Hive Loaders; User Jobs; ETL Hadoop Cluster; HDFS)
  • Alternative Approaches (slide 16)
    – Sqoop
    – HBase
    – Hive Streaming
  • Details (slide 17)
  • Change Capture – File Based (slide 18)
    – File format
      – Compressed CSV
      – Metadata
    – Full Drop
      – Via Fast Reader (Oracle, MySQL)
      – Via MySQL backups (Espresso)
      – Runs for hours with dirty reads
    – Increments
      – Via SQL
      – Transactional
    (diagram labels: Full Drop at 1am; Inc h-1, Inc h-2, Inc h-3, Inc h-4 at 2am, 3am, 4am; Prev. HW; New high-water mark; DB; Files; Web Service; HTTPS pulls; HDFS)
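    A hedged sketch of the increment pull via SQL with a high-water mark; the table, columns, and the use of sqlite3 below are assumptions standing in for the real Oracle/MySQL extractor. Each run selects rows modified since the previous high-water mark, writes them as compressed CSV, and advances the mark, while a full drop periodically rebaselines the whole table.

        # Sketch of high-water-mark incremental extraction; table/column names are hypothetical.
        import csv, gzip, sqlite3  # sqlite3 stands in for Oracle/MySQL in this sketch

        def pull_increment(conn, prev_hwm, out_path):
            """Select rows modified after the previous high-water mark, write a compressed CSV,
            and return the new high-water mark."""
            cur = conn.execute(
                "SELECT member_id, headline, last_modified FROM member_profile "
                "WHERE last_modified > ? ORDER BY last_modified", (prev_hwm,))
            rows = cur.fetchall()
            new_hwm = rows[-1][2] if rows else prev_hwm
            with gzip.open(out_path, "wt", newline="") as f:
                writer = csv.writer(f)
                writer.writerow(["member_id", "headline", "last_modified"])
                writer.writerows(rows)
            return new_hwm

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE member_profile (member_id INT, headline TEXT, last_modified INT)")
        conn.executemany("INSERT INTO member_profile VALUES (?, ?, ?)",
                         [(1, "Engineer", 100), (2, "Analyst", 250)])
        hwm = pull_increment(conn, prev_hwm=200, out_path="inc-h1.csv.gz")  # picks up member 2 only
        print("new high-water mark:", hwm)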
  • Change Capture – Databus Based (slide 19)
    – Reads database commit logs
    – Multi-datacenter via Databus Relay
    – Runs as an MR job
    – Output: date-time partitioned, with multiple versions
    – True change capture (including hard deletes)
    (diagram labels: Database; Databus Relay; Mapper with Databus Consumer; Reducer; dbchanges (HDFS); Hadoop)
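    The Databus consumer API itself is not shown here; the following is only a generic sketch of the output side described on the slide, i.e. change events routed to date-time partitioned paths under dbchanges (HDFS) while keeping multiple versions of a key distinguishable. The event fields and the path scheme are hypothetical.

        # Generic sketch: route change events to date-time partitioned output with versions.
        # This is NOT the Databus API; event fields and paths are hypothetical.
        from collections import namedtuple
        from datetime import datetime, timezone

        ChangeEvent = namedtuple("ChangeEvent", "table key scn timestamp op payload")

        def output_path(event, root="/data/dbchanges"):
            ts = datetime.fromtimestamp(event.timestamp, tz=timezone.utc)
            # Date-time partitioned; the SCN keeps multiple versions of the same key apart.
            return (f"{root}/{event.table}/{ts:%Y/%m/%d/%H}/"
                    f"key={event.key}-scn={event.scn}-op={event.op}.avro")

        ev = ChangeEvent("member_profile", 42, scn=98765, timestamp=1401790500, op="UPDATE",
                         payload={"headline": "Sr. Engineer"})
        print(output_path(ev))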
  • Pre-Processing (slide 20)
    – Data format conversion
    – Field-level transformations
      – Privacy
      – Cleansing, e.g. remove recursive schema
    – Metadata annotation
      – Add row counts for data validation
    (diagram: high-level architecture, as on slide 15)
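    A minimal sketch of this step under assumed field names and an assumed masking rule: convert dump rows into Avro-style records, apply a field-level privacy transform, and annotate the batch with a row count for downstream validation.

        # Sketch of pre-processing; field names and the masking rule are assumptions.
        import hashlib

        def mask_email(value):
            # Field-level privacy transform: replace the raw value with a stable hash.
            return hashlib.sha256(value.encode("utf-8")).hexdigest()

        def pre_process(csv_rows):
            records, count = [], 0
            for member_id, email, headline, last_modified in csv_rows:
                records.append({
                    "member_id": int(member_id),
                    "email": mask_email(email),
                    "headline": headline,
                    "last_modified": int(last_modified),
                })
                count += 1
            # Metadata annotation: the row count travels with the batch for data validation.
            metadata = {"row_count": count, "key_column": "member_id", "delta_column": "last_modified"}
            return records, metadata

        rows = [("1", "a@example.com", "Engineer", "100"), ("2", "b@example.com", "Analyst", "250")]
        print(pre_process(rows)[1])  # {'row_count': 2, ...}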
  • Snapshotting – Lazy Materializer (slide 21)
    – One MR job per table, consumes full drops
    – Supports dirty reads
    – Hash-partitions on the primary key
      – Number of partitions based on data size
    – Sorts on the primary key
    – Results published into a staging directory
    (diagram: high-level architecture, as on slide 15)
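    A toy sketch of the materializer's partitioning scheme: hash-partition on the primary key, pick the partition count from the data size, and sort rows by key within each partition. The sizing heuristic below is made up for illustration.

        # Sketch of hash-partitioning and sorting a full drop; the sizing heuristic is illustrative.
        def choose_partition_count(total_bytes, target_partition_bytes=1 << 30):
            return max(1, -(-total_bytes // target_partition_bytes))  # ceiling division

        def materialize(rows, key, num_partitions):
            partitions = [[] for _ in range(num_partitions)]
            for row in rows:
                partitions[hash(row[key]) % num_partitions].append(row)
            for part in partitions:
                part.sort(key=lambda r: r[key])  # sorted on primary key within each partition
            return partitions

        rows = [{"member_id": i, "headline": f"h{i}"} for i in (5, 3, 9, 1)]
        for i, part in enumerate(materialize(rows, "member_id", num_partitions=2)):
            print(f"partition-{i}:", [r["member_id"] for r in part])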
  • Snapshotting – Virtual Snapshot Builder (slide 22)
    – One MR job for all tables
    – Identifies all existing snapshots, both published and staged
    – Creates appropriate delta partitions for every snapshot
      – Delta partition count equals the snapshot partition count
      – Clubs multiple partitions into one file
    – Outputs the latest row using the delta column
    – Publishes staged snapshots with new deltas
    – Previously published snapshots are updated with new deltas
    (diagram: high-level architecture, as on slide 15)
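    A sketch of the builder's core step, using the same assumed field names as the earlier examples: increments are re-partitioned with the snapshot's partition function and count, and within each delta partition only the latest version of each key (by the delta column) is kept.

        # Sketch: build equi-partitioned deltas from increments, keeping the latest row per key.
        # Assumes the same hash-partitioning as the snapshot; field names are hypothetical.
        def build_delta(increment_rows, key, delta_col, num_snapshot_partitions):
            delta_parts = [dict() for _ in range(num_snapshot_partitions)]
            for row in increment_rows:
                part = delta_parts[hash(row[key]) % num_snapshot_partitions]
                current = part.get(row[key])
                if current is None or row[delta_col] >= current[delta_col]:
                    part[row[key]] = row  # latest update wins
            # Emit each partition sorted by key, ready to be merged against the snapshot partition.
            return [sorted(p.values(), key=lambda r: r[key]) for p in delta_parts]

        increments = [{"member_id": 2, "headline": "Analyst", "last_modified": 200},
                      {"member_id": 2, "headline": "Sr. Analyst", "last_modified": 300},
                      {"member_id": 7, "deleted": True, "last_modified": 250}]
        print(build_delta(increments, "member_id", "last_modified", num_snapshot_partitions=2))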
  • Snapshotting – Virtual Snapshot Builder (slide 23)
    – Incremental data is small
    – Rolls increments
    – Avoids creating small files
    – Equi-partitions increments like the snapshot
    – Seek and read a partition
    (diagram: /db/table/snapshot-0 with 10 partitions in 10 Avro files; _delta/inc-1 and _delta/inc-2 each hold 10 partitions packed into 2 Avro files (Part-0, Part-5) plus index files; the index files map partitions 0–4 to Part-0.avro and partitions 5–9 to Part-5.avro)
  • Snapshotting – Loaders (slide 24)
    – Custom InputFormat (MR)
      – Uses the index file to create splits
      – RecordReader merges partition-0 of the snapshot and the delta
        – Returns the latest row from the delta if present
        – Masks the row if deleted
        – Otherwise returns the row from the snapshot
    – Pig Loader enables reading the virtual snapshot via Pig
    – Storage handler enables reading the virtual snapshot via Hive
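    A simplified sketch of the per-partition merge the RecordReader performs, with assumed field names: both the snapshot partition and the matching delta partition are sorted on the primary key, so they can be merged in a single pass, with the delta row winning and deleted rows masked out.

        # Sketch: merge one sorted snapshot partition with its sorted delta partition.
        # Delta wins over snapshot; rows marked deleted are masked. Field names are hypothetical.
        def merge_partition(snapshot_part, delta_part, key="member_id"):
            out, i, j = [], 0, 0
            while i < len(snapshot_part) or j < len(delta_part):
                if j >= len(delta_part):
                    out.append(snapshot_part[i]); i += 1
                elif i >= len(snapshot_part) or snapshot_part[i][key] > delta_part[j][key]:
                    row = delta_part[j]; j += 1
                    if not row.get("deleted", False):
                        out.append(row)
                elif snapshot_part[i][key] < delta_part[j][key]:
                    out.append(snapshot_part[i]); i += 1
                else:  # same key: return the latest row from the delta (or mask it if deleted)
                    row = delta_part[j]; i += 1; j += 1
                    if not row.get("deleted", False):
                        out.append(row)
            return out

        snap = [{"member_id": 1, "headline": "Engineer"}, {"member_id": 2, "headline": "Analyst"}]
        delta = [{"member_id": 2, "headline": "Sr. Analyst"}, {"member_id": 3, "deleted": True}]
        print(merge_partition(snap, delta))  # member 1 unchanged, member 2 updated, member 3 masked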
  • Snapshotting – Loaders (2) (slide 25)
    – Delta-1.Part-0 contains partitions 0 to 4
    – Delta-2.Part-5 contains partitions 5 to 9
    – Snapshot-0.Part-0 contains partition 0
    – Both sorted on the primary key
    (diagram: the custom InputFormat assigns Mapper-0 through Mapper-9 to snapshot parts Part-0 to Part-9, each merged with the matching delta part via the index files)
  • Snapshotting – Compactor (slide 26)
    – Required when partition size exceeds a threshold
    – Materializes the virtual snapshot into a snapshot
      – With more partitions
    – MR job with a reducer
    (diagram: high-level architecture, as on slide 15)
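    A tiny sketch of the compaction trigger and rewrite, reusing the partitioning idea above; the threshold and helper names are illustrative assumptions: when a partition's accumulated delta grows past a threshold, the virtual snapshot is materialized into a fresh physical snapshot with a larger partition count.

        # Sketch of the compaction decision; threshold and sizes are illustrative.
        def needs_compaction(delta_bytes_per_partition, threshold_bytes=256 * 1024 * 1024):
            return any(size > threshold_bytes for size in delta_bytes_per_partition)

        def compact(virtual_snapshot_rows, key, new_partition_count):
            # Materialize the merged virtual snapshot into a fresh snapshot with more partitions.
            partitions = [[] for _ in range(new_partition_count)]
            for row in virtual_snapshot_rows:
                partitions[hash(row[key]) % new_partition_count].append(row)
            for part in partitions:
                part.sort(key=lambda r: r[key])
            return partitions

        print(needs_compaction([10 * 1024 ** 2, 300 * 1024 ** 2]))  # True: one partition is too big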
  • Operating billions of rows per day (slide 27)
    – Dude, where’s my row?
      – Automatic data validation
    – When data misses the bus
      – Handling late data
      – Look-back window
    – Cluster downtime
      – Restart-ability
      – Active-active
      – Idempotent processing
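    One way to picture the automatic data-validation step ("Dude, where's my row?") is a simple check of the row counts annotated during pre-processing against what actually landed; the tolerance and names below are assumptions.

        # Sketch of row-count validation between source metadata and what landed on HDFS.
        def validate_row_counts(source_metadata, loaded_count, tolerance=0):
            expected = source_metadata["row_count"]
            if abs(expected - loaded_count) > tolerance:
                raise ValueError(f"row count mismatch: expected {expected}, loaded {loaded_count}")
            return True

        metadata = {"row_count": 1_000_000}
        print(validate_row_counts(metadata, loaded_count=1_000_000))  # True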
  • Conclusion and Future Work (slide 28)
    – Conclusion
      – Lumos: a scalable ETL framework
      – Battle-tested in production
    – Future Work
      – Unify internal and external data
      – Open source
  • Q & A (slide 29) – Questions?
  • Appendix (slide 30)