
Patterns and Operational Insights from the First Users of Delta Lake


Cyber threat detection and response requires running demanding workloads over large volumes of log and telemetry data. A few years ago I came to Apple after building such a system at another FAANG company, and my boss asked me to do it again.


  1. 1. Delta Lake Patterns and Insights. Dominique Brezinski, Distinguished Engineer, Apple Information Security
  2. 2. “Sometimes the greatest innovations come from solving operational problems. Removing the Metastore as a dependency was my operational problem.”
  3. 3. Michael Armbrust. As I described our problem domain’s scale, that was the moment Michael got inspired: “Oh…we’ll never be able to handle that scale with our current architecture.”
  4. 4. Agenda. Dominique Brezinski: what we learned running Delta Lake since it was a baby named Tahoe. We started big (tens of TB a day) and went much bigger (hundreds of TB a day).
  5. 5. Delta Lake
       Patterns
       ▪ Extract Load Transform
       ▪ Merge Logic Debug
       ▪ Stateful Merge
       ▪ Aggregation Funnel
       ▪ Merge, Dedup => SCD Updates
       Insights
       ▪ Storage isolation for high-scale tables
       ▪ Delta tables and Spark streams are composable
       ▪ Schema ordering
       ▪ Don’t over partition
       ▪ Handling conflicting transactions
       ▪ Large table metadata optimization
  6. 6. Patterns
  7. 7. Extract Load Transform S3 -> Streaming -> Staging Table -> Streaming -> Data Tables
  8. 8. Parsing
       abstract class Parser extends ParserTrait {
         val validConditionExpr: Column
         val additionalDetailsExpr: Column

         def prepare(source: DataFrame): DataFrame = source
         def parse(source: DataFrame): DataFrame
         def complete(source: DataFrame): DataFrame = source

         final def apply(source: DataFrame): DataFrame = {
           source
             .transform(prepare)
             .select(struct("*") as 'origin)
             .transform(parse)
             .transform(setParseDetails)
             .transform(complete)
         }

         private final def setParseDetails(parsed: DataFrame): DataFrame = {
           parsed
             .withColumn("parse_details", struct(
               when(validConditionExpr, lit("OK")).otherwise(lit("NOT_VALID")) as 'status,
               additionalDetailsExpr as 'info,
               current_timestamp() as 'at))
         }
       }
  9. 9. Extract ▪ Extract system wraps input with metadata including event/data type
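       For concreteness, a hypothetical example of such a wrapped record; the field names (event_type, ts, raw) are invented here to illustrate the envelope idea, not the actual format:

       // Hypothetical envelope produced by the extract layer; one JSON object per line.
       val sampleWrappedLine: String =
         """{"event_type": "dns_query", "ts": "2020-06-24T12:00:00Z", "raw": "<original event payload>"}"""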
  10. 10. Load ▪ s3-sqs Spark Stream Source ▪ Input files are JSON object per line ▪ fileFormat ‘text’
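       A minimal sketch of what such a load stream might look like; the queue URL, region, and exact option names are assumptions for illustration (on newer Databricks runtimes Auto Loader fills the same role as the s3-sqs source).

       // fileFormat 'text' keeps one JSON object per row in a single 'value' string column.
       import org.apache.spark.sql.types.{StringType, StructField, StructType}

       val rawStream = spark.readStream
         .format("s3-sqs")
         .option("fileFormat", "text")
         .option("queueUrl", "https://sqs.us-west-2.amazonaws.com/123456789012/ingest-queue")
         .option("region", "us-west-2")
         .schema(StructType(Seq(StructField("value", StringType))))
         .load()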
  11. 11. Load
       class StagingParser extends Parser {
         final val validConditionExpr: Column = col("ts").isNotNull
         final val additionalDetailsExpr: Column =
           concat_ws(":", lit("Input File Name"), input_file_name)

         def parse(source: DataFrame): DataFrame = {
           source
             .withColumn("extractedFields", apple_from_json(col("origin.value"), extractSchema))
             .select("extractedFields.*", "origin")
         }

         override def complete(source: DataFrame): DataFrame = {
           source
             .select(
               when(col("ts").isNull, col("parse_details.at")).otherwise(col("ts")).as("ts"),
               col("raw"),
               …,
               col("origin"),
               col("parse_details"))
         }
       }
  12. 12. Transform
       class DatasetParser extends Parser {
         final val validConditionExpr: Column = col("ts").isNotNull
         final val additionalDetailsExpr: Column = …

         override def prepare(source: DataFrame): DataFrame = {
           source.drop("origin")
         }

         def parse(source: DataFrame): DataFrame = {
           source
             .withColumn("extractedFields", apple_from_json(col("raw"), extractSchema))
             .select("extractedFields.*", "origin")
         }
       }
  13. 13. Merge Logic Debug
       def upsertIntoSessions(microBatchDF: DataFrame, batchId: Long) {
         microBatchDF.sparkSession.sql("set spark.sql.shuffle.partitions = 96")
         microBatchDF.createOrReplaceTempView("session_updates")

         // Snapshot the micro-batch to its own Delta path so the merge input
         // can be inspected after the fact.
         microBatchDF
           .write
           .format("delta")
           .mode("overwrite")
           .save("/mnt/somebucket/session_update.delta")

         microBatchDF.sparkSession.sql(
           s"""
           MERGE INTO delta.`/mnt/somebucket/sessions.delta` sessions
           USING session_updates updates
           ON …
           """)
       }
  14. 14. Stateful Merge
       val sessions = dhcpEventStream
         .groupByKey(_.mac)
         .flatMapGroupsWithState(OutputMode.Append, GroupStateTimeout.ProcessingTimeTimeout)(
           DhcpSessionGenerator.mapDhcpSessionsByMac _)
         .withColumn("dt", to_date($"start_ts"))
         .writeStream
         .option("checkpointLocation", checkpointPath)
         .foreachBatch(sessionWriter.upsertIntoDhcpSessions _)
         .start()
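       The slide calls DhcpSessionGenerator.mapDhcpSessionsByMac without showing it; below is a hypothetical sketch of the shape such a flatMapGroupsWithState callback takes. The case classes and the elided sessionization logic are invented for illustration only.

       import java.sql.Timestamp
       import org.apache.spark.sql.streaming.GroupState

       // Hypothetical event/session/state types, for illustration only.
       case class DhcpEvent(mac: String, ip: String, ts: Timestamp)
       case class DhcpSession(mac: String, ip: String, start_ts: Timestamp, end_ts: Timestamp)
       case class SessionState(open: Option[DhcpSession])

       // flatMapGroupsWithState callback shape: (key, new events, mutable state) => emitted rows.
       def mapDhcpSessionsByMac(
           mac: String,
           events: Iterator[DhcpEvent],
           state: GroupState[SessionState]): Iterator[DhcpSession] = {
         // Fold the new events into any open session held in `state`, emit sessions
         // that have closed, keep the rest via state.update(...), and with
         // ProcessingTimeTimeout arm state.setTimeoutDuration(...) so idle MACs flush.
         Iterator.empty // real sessionization logic elided
       }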
  15. 15. Aggregation Funnel (diagram). Source Datasets 1..n flow through Step 1 -> Step 2 -> Step 3 into Output Tables 1..n, an Intermediate Aggregate Table, and a Final Index. Steps shown: groupBy(src_ip, dst_ip, dt); groupBy(src_ip, dt) + groupBy(dst_ip, dt); window aggregations; upsert. Result fields: original_value, extracted_value, dt, source (array of datasets and columns), total_row_count, IPv4, IPv6.
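       A rough sketch of what one funnel stage could look like; the table path, column names, and aggregation are placeholders, not the production job.

       // Hypothetical step-1 funnel stage (paths and column names are placeholders).
       import io.delta.tables.DeltaTable
       import org.apache.spark.sql.DataFrame
       import org.apache.spark.sql.functions.{col, count}

       def upsertStepOne(microBatch: DataFrame, batchId: Long): Unit = {
         // Roll the micro-batch up to one row per (src_ip, dst_ip, dt)...
         val daily = microBatch
           .groupBy("src_ip", "dst_ip", "dt")
           .agg(count("*").as("total_row_count"))

         // ...and fold the partial counts into the intermediate aggregate table.
         DeltaTable.forPath("/mnt/somebucket/intermediate_aggregate.delta").as("agg")
           .merge(
             daily.as("u"),
             col("u.src_ip") === col("agg.src_ip") &&
               col("u.dst_ip") === col("agg.dst_ip") &&
               col("u.dt") === col("agg.dt"))
           .whenMatched.update(Map(
             "total_row_count" -> (col("agg.total_row_count") + col("u.total_row_count"))))
           .whenNotMatched.insertAll
           .execute()
       }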
  16. 16. Merge, Dedup => SCD Updates
       object DedupWriter extends Serializable {
         def upsertIntoDeduped(microBatchOutput: DataFrame, batchId: Long): Unit = {
           DeltaTable.forPath("/mnt/somebucket/ip_index_deduped_updates.delta").as("out")
             .merge(
               microBatchOutput.as("in"),
               // all columns match
             )
             .whenNotMatched.insertAll.execute
         }
       }

       spark
         .readStream
         .format("delta")
         .option("ignoreChanges", true)
         .load("/mnt/aggregates/prod/some_table_that_receives_upserts.delta")
         .writeStream
         .outputMode("append")
         .option("checkpointLocation", CHECKPOINT_PATH)
         .foreachBatch(DedupWriter.upsertIntoDeduped _)
         .start()
  17. 17. Insights
  18. 18. Storage isolation for high-scale tables ▪ Put each large table and corresponding checkpoint in its own bucket ▪ Enable random prefixes [delta.randomizeFilePrefixes=true]
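       One way to set the property, sketched with a placeholder path (it can also be set when the table is created):

       // Placeholder path; delta.randomizeFilePrefixes makes new data files land
       // under randomized key prefixes instead of one hot prefix.
       spark.sql("""
         ALTER TABLE delta.`/mnt/somebucket/big_table.delta`
         SET TBLPROPERTIES ('delta.randomizeFilePrefixes' = 'true')
       """)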
  19. 19. Delta tables and Spark streams are composable
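       A sketch of what that composability means in practice, with placeholder paths: because a Delta table is both a streaming source and a streaming sink, each pipeline stage is just a stream that reads one table and writes the next.

       // Placeholder paths. Each Delta table is the sink of the previous stage and
       // the source of the next, so stages chain as (table -> stream -> table) links.
       import org.apache.spark.sql.functions.col

       spark.readStream
         .format("delta")
         .load("/mnt/somebucket/staging.delta")            // written by the load stream
         .where(col("parse_details.status") === "OK")      // any per-stage transformation
         .writeStream
         .format("delta")
         .option("checkpointLocation", "/mnt/somebucket/_checkpoints/dataset")
         .start("/mnt/somebucket/dataset.delta")           // read by the next stage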
  20. 20. Schema ordering ▪ By default, only the first 32 fields (not just top-level columns!) have stats collected ▪ Dynamic File Pruning uses min-max column stats ▪ Z-Ordering maximizes the utility of min-max stats for the ordered columns ▪ Make sure your Z-Ordered/sorted columns have stats collection! ▪ Move long string columns after the dataSkippingNumIndexedCols cutoff
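       A sketch under assumed table and column names of how these pieces fit together: bound the stats-collection window with the delta.dataSkippingNumIndexedCols table property and Z-Order on columns that fall inside it.

       // Placeholder table and columns. Only columns inside the
       // dataSkippingNumIndexedCols window get min/max stats, so keep
       // Z-Order/sort columns inside it and long strings after it.
       spark.sql("""
         ALTER TABLE delta.`/mnt/somebucket/events.delta`
         SET TBLPROPERTIES ('delta.dataSkippingNumIndexedCols' = '12')
       """)

       spark.sql("""
         OPTIMIZE delta.`/mnt/somebucket/events.delta`
         ZORDER BY (src_ip, dst_ip)
       """)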
  21. 21. Don’t over partition
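       The usual shape of this advice, sketched with hypothetical names: partition only on a coarse, low-cardinality column such as the event date, and let Z-Ordering and data skipping handle anything finer.

       // Hypothetical stream and paths. One partition per day is coarse enough;
       // also partitioning on a high-cardinality column (e.g. src_ip) would
       // explode the number of small files and the table metadata.
       import org.apache.spark.sql.functions.{col, to_date}

       parsedStream
         .withColumn("dt", to_date(col("ts")))
         .writeStream
         .format("delta")
         .partitionBy("dt")
         .option("checkpointLocation", "/mnt/somebucket/_checkpoints/events")
         .start("/mnt/somebucket/events.delta")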
  22. 22. Handling conflicting transactions
       def upsertIntoDeduped(microBatchOutput: DataFrame, batchId: Long): Unit = {
         DeltaTable.forPath("/mnt/somebucket/ip_index_deduped_updates.delta").as("out")
           .merge(
             microBatchOutput.as("in"),
             col("in.extracted_value") === col("out.extracted_value") && ...
           )
           .whenNotMatched.insertAll
           .execute

         if (batchId % 31 == 0) {
           microBatchOutput
             .sparkSession
             .sql("OPTIMIZE delta.`/mnt/somebucket/ip_index_deduped_updates.delta` ZORDER BY (extracted_value)")
         }
       }
  23. 23. Large table metadata optimization ▪ We have tables with 2M+ objects and larger than 1.5PB ▪ spark.databricks.delta.snapshotPartitions 1024 ▪ "delta.logRetentionDuration":"interval 10 days"
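       Where those two knobs live, sketched with a placeholder path: the snapshot-partitions setting is a Spark conf, while log retention is a table property.

       // Spark conf: spread Delta snapshot/log reconstruction across more partitions
       // for tables with millions of objects.
       spark.conf.set("spark.databricks.delta.snapshotPartitions", 1024)

       // Table property: retain less transaction-log history (placeholder path).
       spark.sql("""
         ALTER TABLE delta.`/mnt/somebucket/big_table.delta`
         SET TBLPROPERTIES ('delta.logRetentionDuration' = 'interval 10 days')
       """)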
  24. 24. Feedback Your feedback is important to us. Don’t forget to rate and review the sessions.
