Delta Lakehouse to Scale
Dan Ferrante
Director of Platform and Data Engineering
Agenda
Overview
Digital Turbine Background:
▪ Who we are.
▪ What we do.
▪ Where we’ve been.
Spark and Delta Lake
▪ Where we are.
In the Works
▪ Where we’re going.
At Digital Turbine, we:
Simplify – A User’s Mobile Experience
Discover – Recommended Applications
Deliver – Direct to Device
Where We’ve Been
REDSHIFT
STORM
Kafka
Spark
Microservices
S3
Sources and Sinks
Digital Turbine
Data Marts
Delta
Redshift
Datadog
Unified Mobile Events
Legacy Mobile Events
Campaign Events
Slowly Changing Dimensions
Event Flow Rates
Unified Mobile Events: 134M (74%)
Legacy Mobile Events: 41M (23%)
Campaign Events + SCDs: 7M (3%)
Delta Lake and Spark Streaming Objectives
Real-Time – Using Data Directly from Kafka
Use Best Practices – Offload Semi-Structured Data to S3
Standardizing ETL Technologies – Accuracy, Precision, Reliability
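
The first two objectives (reading directly from Kafka and landing semi-structured data on S3) can be illustrated with a minimal PySpark Structured Streaming sketch. This is not Digital Turbine's actual job: the function names, topic, broker address, and paths are hypothetical, and it assumes a Spark session that already has the Kafka source and Delta Lake packages available.

```python
def kafka_source_options(bootstrap_servers, topic, starting_offsets="latest"):
    """Build the option map for Spark's built-in Kafka source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": starting_offsets,
    }


def start_kafka_to_delta(spark, source_options, table_path, checkpoint_path):
    """Append raw Kafka records to a Delta table (hypothetical layout).

    `spark` is an existing SparkSession; `table_path` and `checkpoint_path`
    would typically be S3 locations.
    """
    raw = spark.readStream.format("kafka").options(**source_options).load()

    # Kafka delivers the message value as binary, so keep the
    # semi-structured payload as a string and parse it downstream.
    events = raw.selectExpr(
        "CAST(value AS STRING) AS payload",
        "timestamp AS kafka_timestamp",
    )

    return (
        events.writeStream.format("delta")
        .outputMode("append")
        .option("checkpointLocation", checkpoint_path)
        .start(table_path)
    )


# Example option map for an assumed broker and topic name.
options = kafka_source_options("broker-1:9092", "unified-mobile-events")
```

Keeping the option map separate from the stream definition makes the source settings easy to inspect and test without starting a cluster.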
Design And Architecture
Partitions
Production System Measurements
System Availability – 99.99%.
Data Time To Market – From Hours to Seconds.
Communication – Alerting the Business.
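
To put the 99.99% availability figure in concrete terms, the short calculation below converts an availability target into an allowed-downtime budget. The measurement windows (365 and 30 days) are assumptions chosen for illustration; the slides do not state the window used.

```python
def downtime_budget_minutes(availability_pct, period_days):
    """Minutes of downtime allowed in a period at a given availability target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - availability_pct) / 100.0


per_year = downtime_budget_minutes(99.99, 365)
per_30_days = downtime_budget_minutes(99.99, 30)

print(f"{per_year:.2f} minutes per 365 days")    # 52.56
print(f"{per_30_days:.2f} minutes per 30 days")  # 4.32
```

At this target, a streaming pipeline can be down for under an hour across an entire year.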
Delta In Use
In The Works
• Redshift Optimization – Creating Data Marts in Place of Raw Data Repos.
• Business Metrics and Data Wellness Reports – Internal Communication.
• Cost Optimization – Reducing Major Cost Centers.
• Machine Learning Algorithms – Improving Our Audience Targeting.
• Data Generation Tools – For Development and Testing.
Come Join Us
Feel free to reach out to me at: daniel.ferrante@digitalturbine.com.
Seek out available positions at https://www.digitalturbine.com/careers.
Thank You
Feedback
Your feedback is important to us.
Don’t forget to rate and review the sessions.

Digital Turbine Adopts A Lakehouse to Scale to Their Analytics Needs
