SlideShare a Scribd company logo
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Building a Streaming
Microservices
Architecture
With Apache Spark Structured Streaming & Friends
Scott Haines
Senior Principal Software Engineer, Twilio
@newfront
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
A little about me.
• I work at Twilio building massive data systems
• I run a bi-weekly internal Spark Office Hours where I offer
training and guidance to teams at the company
• >12 years working on large distributed analytics systems
@newfront
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
A little about me.
• I work at Twilio building massive data systems
• I run a bi-weekly internal Spark Office Hours where I offer
training and guidance to teams at the company
• >12 years working on large distributed analytics systems
• Published work on Distributed Analytics Systems
@newfront
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Voice Insights
Accountable observability data and
interactive analytics and insights for
the voice business and customers.
VIRGINIA, USA
DUBLIN, IRELAND
SINGAPORE
SYDNEY, AUSTRALIA
TOKYO, JAPAN
SAO PAULO, BRAZIL
DATA CENT ERS
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Events per Second
>1MIL
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Let’s Build a Reliable Data Architecture
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Goal: Reliable E2E Streaming Data Pipeline
GRPC Client
GRPC Server GRPC Server GRPC Server
1
2
3
Kafka Broker
4
Kafka Broker
5
6
Spark Application
7 8
HDFS
S39
HTTP /2
@newfront
Strong and Reliable Data starts at Ingest
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
@newfront
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Of people love
JSON*.
>95%
{
"type": "CallEvent",
“call_sid”: "CA123",
"attributes": [
{
“account_sid”: “AC123”
“start_ms": 123,
“end_ms": "435"
}
]
}
• JSON has Structure.
• But JSON isn't strictly Structured Data.
Structured Data
@newfront @twilio
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Of people saddened by
bad data*
100%
{
"type": "CallEvent",
“call_sid": “CA234”,
“attributes":"oops"
}
• JSON has poor runtime guarantees due to
its flexible nature. Optimize for compile time
guarantees.
• Debugging corrupt data in a large
distributed system ruins hopes and dreams.
Structured Data
@newfront @twilio
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Protocol Buffers
message CallEvent {
uint64 created_ms = 1;
string call_sid = 2;
uint64 account_sid = 3;
EventType event_type = 4;
Region region = 5;
}
• Well Defined Events tell their own Story
• Type-Safety Rules
• Versioning your API / Pipeline / Data
now just means sticking to a version of
your schema
• Rely on Releases for versioning
• Interoperable with most major languages
(java/scala/c++/go/obj-c/node-js/
python/...)
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Protocol Buffers
message CallEvent {
uint64 created_ms = 1;
string call_sid = 2;
uint64 account_sid = 3;
EventType event_type = 4;
Region region = 5;
}
• Data Accountability
• Lightning Fast Serialization /
Deserialization
• Plays with nicely gRPC
• Interoperable with Spark SQL
• Like “JSON with Guard Rails”
Of people like when
things work between
releases!
100%
Take Aways
Data Engineers love gRPC
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
*technically not universally accepted
@newfront
GRPC Client
GRPC Server GRPC Server GRPC Server
HTTP /2
gRPC | saves time
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
GRPC Client
GRPC Server GRPC Server GRPC Server
HTTP /2
gRPC | saves time
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
val time = “$$$”
GRPC Client
GRPC Server GRPC Server GRPC Server
HTTP /2
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
GRPC.
GRPC // nutshell
• RPC = remote procedure call. “G” stands
for generic or Google
• Great for Internal Services
• High Performance
• Compact Binary Exchange Format
• Compile Idiomatic API Definitions
• Capable of Bi-Directional Streaming
• Pluggable HTTP/2 transport
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
GRPC.
• Building a CallEvent Service.
• 1: Define your messages (call.proto)
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
GRPC.
• Building a CallEvent Service.
• 1: Define your messages (call.proto)
• 2: Define your services (service.proto)
@newfront
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
GRPC.
• Building a CallEvent Service.
• 1: Define your messages (call.proto)
• 2: Define your services (service.proto)
• 3: Compile your messages and service
stubs.
sbt clean compile package publishLocal
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
GRPC.
• Building a CallEvent Service.
• 1: Define your messages (call.proto)
• 2: Define your services (service.proto)
• 3: Compile your messages and service
stubs.
• 4: Implement Traits and Run!
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
GRPC.
• Building a CallEvent Service.
• Client SDKs essentially write themselves.
• JSON <-> Protobuf is still possible for
maintaining customer facing APIs
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Protocol Gatewaymessage T<:Event { … }
protobuf @ version
Kafka Broker
gRPC
Common Pattern Emerges
Protocol Streams
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Still Building…
GRPC Client
GRPC Server GRPC Server GRPC Server
1
2
3
Kafka Broker
4
Kafka Broker
5
6
Spark Application
7 8
HDFS
S39
HTTP /2
@newfront
GRPC Client
GRPC Server GRPC Server GRPC Server
1
2
3
Kafka Broker
4
Kafka Broker
5
6
Spark Application
7 8
9
HTTP /2
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Kafka + Protobuf + Spark
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Kafka.
• Solve for common Data Pipeline Problems
• Partition Keys are important. Use them to
your advantage
• Spark AQE - adaptive query execution can
handle hot spots. Non-spark services not-
so-much.
Topic: CallEvents
key: call_sid
partitions*: n
* number of partitions: factor of producer records/s * avg(record.bytesize)
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Spark.
• Use ScalaPB’s ExpressionEncoders to
natively convert protobuf to Catalyst
Optimizable DataFrames
• Marry this with Sparks Kafka
DataSource Reader/Writer
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Spark.
• Behind the Scenes…
• Spark is doing magic
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Spark.
• End to End Tests ensure you can press
the release button anywhere in the
pipeline.
• Can be automated to ensure your Spark
Apps can continue working with any
changes to the upstream gRPC
• Can use for Canary testing updates
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Spark.
• Use ScalaPB to read local streams
• Ensure your Spark Apps can continue
working with any new protobuf
dependencies
• Test simple reads and complex
aggregations locally, deploy globally
@newfront
GRPC Client
GRPC Server GRPC Server GRPC Server
1
Kafka Broker
4
Kafka Broker
5
6
Spark Application
7 8
HDFS
S39
TP /2
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Spark & Beyond
@newfront
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Spark + Friends.
• Proto to Catalyst DataFrame
• Conversion to Parquet in Delta
Table
• Partitioned by Date
@newfront
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Spark + Friends.
• Now it is easy for Downstream
applications to pick up using
Parquet Protocol Streams
@newfront
© 2019 TWILIO INC. ALL RIGHTS RESERVED.
Rinse and Repeat
@newfront
Just keep adding new Flows.
1. Define your Structured Data
2. Define your service definitions
3. Emit Data / Enqueue to Kafka
4. Read and Drop in HDFS (delta) or
pass along to a new Topic
Kafka Topic Kafka Topic
Spark Application Spark Application Spark Application
Kafka Topic
Data Table Data Table
Spark Application
GRPC Server
THANK YOU
@newfront

More Related Content

What's hot

Qlik and Confluent Success Stories with Kafka - How Generali and Skechers Kee...
Qlik and Confluent Success Stories with Kafka - How Generali and Skechers Kee...Qlik and Confluent Success Stories with Kafka - How Generali and Skechers Kee...
Qlik and Confluent Success Stories with Kafka - How Generali and Skechers Kee...
HostedbyConfluent
 
Democratizing data science Using spark, hive and druid
Democratizing data science Using spark, hive and druidDemocratizing data science Using spark, hive and druid
Democratizing data science Using spark, hive and druid
DataWorks Summit
 
Delta Lake: Open Source Reliability w/ Apache Spark
Delta Lake: Open Source Reliability w/ Apache SparkDelta Lake: Open Source Reliability w/ Apache Spark
Delta Lake: Open Source Reliability w/ Apache Spark
George Chow
 
Intuit Analytics Cloud 101
Intuit Analytics Cloud 101Intuit Analytics Cloud 101
Intuit Analytics Cloud 101
DataWorks Summit/Hadoop Summit
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...
Data Con LA
 
Building Sessionization Pipeline at Scale with Databricks Delta
Building Sessionization Pipeline at Scale with Databricks DeltaBuilding Sessionization Pipeline at Scale with Databricks Delta
Building Sessionization Pipeline at Scale with Databricks Delta
Databricks
 
Data Privacy with Apache Spark: Defensive and Offensive Approaches
Data Privacy with Apache Spark: Defensive and Offensive ApproachesData Privacy with Apache Spark: Defensive and Offensive Approaches
Data Privacy with Apache Spark: Defensive and Offensive Approaches
Databricks
 
Unifying Streaming and Historical Telemetry Data For Real-time Performance Re...
Unifying Streaming and Historical Telemetry Data For Real-time Performance Re...Unifying Streaming and Historical Telemetry Data For Real-time Performance Re...
Unifying Streaming and Historical Telemetry Data For Real-time Performance Re...
Databricks
 
Successful AI/ML Projects with End-to-End Cloud Data Engineering
Successful AI/ML Projects with End-to-End Cloud Data EngineeringSuccessful AI/ML Projects with End-to-End Cloud Data Engineering
Successful AI/ML Projects with End-to-End Cloud Data Engineering
Databricks
 
Reliable Data Intestion in BigData / IoT
Reliable Data Intestion in BigData / IoTReliable Data Intestion in BigData / IoT
Reliable Data Intestion in BigData / IoT
Guido Schmutz
 
Apache Spark in Scientific Applciations
Apache Spark in Scientific ApplciationsApache Spark in Scientific Applciations
Apache Spark in Scientific Applciations
Dr. Mirko Kämpf
 
How Workato creates robust data pipelines and automations for you?
How Workato creates robust data pipelines and automations for you?How Workato creates robust data pipelines and automations for you?
How Workato creates robust data pipelines and automations for you?
Jeraldine Phneah
 
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...
Databricks
 
Redash: Open Source SQL Analytics on Data Lakes
Redash: Open Source SQL Analytics on Data LakesRedash: Open Source SQL Analytics on Data Lakes
Redash: Open Source SQL Analytics on Data Lakes
Databricks
 
Reltio: Powering Enterprise Data-driven Applications with Cassandra
Reltio: Powering Enterprise Data-driven Applications with CassandraReltio: Powering Enterprise Data-driven Applications with Cassandra
Reltio: Powering Enterprise Data-driven Applications with Cassandra
DataStax Academy
 
How to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcpHow to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcp
Joseph Arriola
 
Instrumenting your Instruments
Instrumenting your Instruments Instrumenting your Instruments
Instrumenting your Instruments
DataWorks Summit/Hadoop Summit
 
Northwestern Mutual Journey – Transform BI Space to Cloud
Northwestern Mutual Journey – Transform BI Space to CloudNorthwestern Mutual Journey – Transform BI Space to Cloud
Northwestern Mutual Journey – Transform BI Space to Cloud
Databricks
 
Building Custom Big Data Integrations
Building Custom Big Data IntegrationsBuilding Custom Big Data Integrations
Building Custom Big Data Integrations
Pat Patterson
 

What's hot (20)

Qlik and Confluent Success Stories with Kafka - How Generali and Skechers Kee...
Qlik and Confluent Success Stories with Kafka - How Generali and Skechers Kee...Qlik and Confluent Success Stories with Kafka - How Generali and Skechers Kee...
Qlik and Confluent Success Stories with Kafka - How Generali and Skechers Kee...
 
Democratizing data science Using spark, hive and druid
Democratizing data science Using spark, hive and druidDemocratizing data science Using spark, hive and druid
Democratizing data science Using spark, hive and druid
 
Delta Lake: Open Source Reliability w/ Apache Spark
Delta Lake: Open Source Reliability w/ Apache SparkDelta Lake: Open Source Reliability w/ Apache Spark
Delta Lake: Open Source Reliability w/ Apache Spark
 
Intuit Analytics Cloud 101
Intuit Analytics Cloud 101Intuit Analytics Cloud 101
Intuit Analytics Cloud 101
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...
 
Building Sessionization Pipeline at Scale with Databricks Delta
Building Sessionization Pipeline at Scale with Databricks DeltaBuilding Sessionization Pipeline at Scale with Databricks Delta
Building Sessionization Pipeline at Scale with Databricks Delta
 
Data Privacy with Apache Spark: Defensive and Offensive Approaches
Data Privacy with Apache Spark: Defensive and Offensive ApproachesData Privacy with Apache Spark: Defensive and Offensive Approaches
Data Privacy with Apache Spark: Defensive and Offensive Approaches
 
Unifying Streaming and Historical Telemetry Data For Real-time Performance Re...
Unifying Streaming and Historical Telemetry Data For Real-time Performance Re...Unifying Streaming and Historical Telemetry Data For Real-time Performance Re...
Unifying Streaming and Historical Telemetry Data For Real-time Performance Re...
 
Successful AI/ML Projects with End-to-End Cloud Data Engineering
Successful AI/ML Projects with End-to-End Cloud Data EngineeringSuccessful AI/ML Projects with End-to-End Cloud Data Engineering
Successful AI/ML Projects with End-to-End Cloud Data Engineering
 
Reliable Data Intestion in BigData / IoT
Reliable Data Intestion in BigData / IoTReliable Data Intestion in BigData / IoT
Reliable Data Intestion in BigData / IoT
 
Apache Spark in Scientific Applciations
Apache Spark in Scientific ApplciationsApache Spark in Scientific Applciations
Apache Spark in Scientific Applciations
 
How Workato creates robust data pipelines and automations for you?
How Workato creates robust data pipelines and automations for you?How Workato creates robust data pipelines and automations for you?
How Workato creates robust data pipelines and automations for you?
 
LinkedIn2
LinkedIn2LinkedIn2
LinkedIn2
 
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...
 
Redash: Open Source SQL Analytics on Data Lakes
Redash: Open Source SQL Analytics on Data LakesRedash: Open Source SQL Analytics on Data Lakes
Redash: Open Source SQL Analytics on Data Lakes
 
Reltio: Powering Enterprise Data-driven Applications with Cassandra
Reltio: Powering Enterprise Data-driven Applications with CassandraReltio: Powering Enterprise Data-driven Applications with Cassandra
Reltio: Powering Enterprise Data-driven Applications with Cassandra
 
How to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcpHow to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcp
 
Instrumenting your Instruments
Instrumenting your Instruments Instrumenting your Instruments
Instrumenting your Instruments
 
Northwestern Mutual Journey – Transform BI Space to Cloud
Northwestern Mutual Journey – Transform BI Space to CloudNorthwestern Mutual Journey – Transform BI Space to Cloud
Northwestern Mutual Journey – Transform BI Space to Cloud
 
Building Custom Big Data Integrations
Building Custom Big Data IntegrationsBuilding Custom Big Data Integrations
Building Custom Big Data Integrations
 

Similar to Building a Streaming Microservices Architecture - Data + AI Summit EU 2020

Apache Pulsar @Splunk
Apache Pulsar @SplunkApache Pulsar @Splunk
Apache Pulsar @Splunk
Karthik Ramasamy
 
Integrating Postgres with ActiveMQ and Camel
Integrating Postgres with ActiveMQ and CamelIntegrating Postgres with ActiveMQ and Camel
Integrating Postgres with ActiveMQ and Camel
Justin Reock
 
Oracle Modern AppDev Approach to Cloud & Container Native App
Oracle Modern AppDev Approach to Cloud & Container Native AppOracle Modern AppDev Approach to Cloud & Container Native App
Oracle Modern AppDev Approach to Cloud & Container Native App
Paulo Alberto Simoes ∴
 
Why Splunk Chose Pulsar_Karthik Ramasamy
Why Splunk Chose Pulsar_Karthik RamasamyWhy Splunk Chose Pulsar_Karthik Ramasamy
Why Splunk Chose Pulsar_Karthik Ramasamy
StreamNative
 
Pulsar summit-keynote-final
Pulsar summit-keynote-finalPulsar summit-keynote-final
Pulsar summit-keynote-final
Karthik Ramasamy
 
CICDforModernApplications-Oslo.pdf
CICDforModernApplications-Oslo.pdfCICDforModernApplications-Oslo.pdf
CICDforModernApplications-Oslo.pdf
Amazon Web Services
 
Netflix MSA and Pivotal
Netflix MSA and PivotalNetflix MSA and Pivotal
Netflix MSA and Pivotal
VMware Tanzu Korea
 
CI/CD for Containers: A Way Forward for Your DevOps Pipeline
CI/CD for Containers: A Way Forward for Your DevOps PipelineCI/CD for Containers: A Way Forward for Your DevOps Pipeline
CI/CD for Containers: A Way Forward for Your DevOps Pipeline
Amazon Web Services
 
CI/CD for Modern Applications
CI/CD for Modern ApplicationsCI/CD for Modern Applications
CI/CD for Modern Applications
Amazon Web Services
 
[2015-11월 정기 세미나] Cloud Native Platform - Pivotal
[2015-11월 정기 세미나] Cloud Native Platform - Pivotal[2015-11월 정기 세미나] Cloud Native Platform - Pivotal
[2015-11월 정기 세미나] Cloud Native Platform - Pivotal
OpenStack Korea Community
 
Kamailio practice Quobis-University of Vigo Laboratory of Commutation 2012-2...
Kamailio practice Quobis-University of Vigo Laboratory of Commutation  2012-2...Kamailio practice Quobis-University of Vigo Laboratory of Commutation  2012-2...
Kamailio practice Quobis-University of Vigo Laboratory of Commutation 2012-2...
Quobis
 
The DevOps Promise: Helping Management Realise the Quality, Velocity & Effici...
The DevOps Promise: Helping Management Realise the Quality, Velocity & Effici...The DevOps Promise: Helping Management Realise the Quality, Velocity & Effici...
The DevOps Promise: Helping Management Realise the Quality, Velocity & Effici...
Splunk
 
SplDevOps: Making Splunk Development a Breeze With a Deep Dive on DevOps' Con...
SplDevOps: Making Splunk Development a Breeze With a Deep Dive on DevOps' Con...SplDevOps: Making Splunk Development a Breeze With a Deep Dive on DevOps' Con...
SplDevOps: Making Splunk Development a Breeze With a Deep Dive on DevOps' Con...
Harry McLaren
 
Cisco Connect Ottawa 2018 dev net
Cisco Connect Ottawa 2018 dev netCisco Connect Ottawa 2018 dev net
Cisco Connect Ottawa 2018 dev net
Cisco Canada
 
Continuous Integration and Continuous Delivery Best Practices for Building Mo...
Continuous Integration and Continuous Delivery Best Practices for Building Mo...Continuous Integration and Continuous Delivery Best Practices for Building Mo...
Continuous Integration and Continuous Delivery Best Practices for Building Mo...
Amazon Web Services
 
Analytics im DevOps Lebenszyklus
Analytics im DevOps LebenszyklusAnalytics im DevOps Lebenszyklus
Analytics im DevOps Lebenszyklus
Splunk
 
Emulators as an Emerging Best Practice for API Providers
Emulators as an Emerging Best Practice for API ProvidersEmulators as an Emerging Best Practice for API Providers
Emulators as an Emerging Best Practice for API Providers
Cisco DevNet
 
AWS DevDay Cologne - CI/CD for modern applications
AWS DevDay Cologne - CI/CD for modern applicationsAWS DevDay Cologne - CI/CD for modern applications
AWS DevDay Cologne - CI/CD for modern applications
Cobus Bernard
 
Leveraging Splunk Enterprise Security with the MITRE’s ATT&CK Framework
Leveraging Splunk Enterprise Security with the MITRE’s ATT&CK FrameworkLeveraging Splunk Enterprise Security with the MITRE’s ATT&CK Framework
Leveraging Splunk Enterprise Security with the MITRE’s ATT&CK Framework
Splunk
 
TechWiseTV Workshop: Cisco Hybrid Cloud Platform for Google Cloud
TechWiseTV Workshop:  Cisco Hybrid Cloud Platform for Google CloudTechWiseTV Workshop:  Cisco Hybrid Cloud Platform for Google Cloud
TechWiseTV Workshop: Cisco Hybrid Cloud Platform for Google Cloud
Robb Boyd
 

Similar to Building a Streaming Microservices Architecture - Data + AI Summit EU 2020 (20)

Apache Pulsar @Splunk
Apache Pulsar @SplunkApache Pulsar @Splunk
Apache Pulsar @Splunk
 
Integrating Postgres with ActiveMQ and Camel
Integrating Postgres with ActiveMQ and CamelIntegrating Postgres with ActiveMQ and Camel
Integrating Postgres with ActiveMQ and Camel
 
Oracle Modern AppDev Approach to Cloud & Container Native App
Oracle Modern AppDev Approach to Cloud & Container Native AppOracle Modern AppDev Approach to Cloud & Container Native App
Oracle Modern AppDev Approach to Cloud & Container Native App
 
Why Splunk Chose Pulsar_Karthik Ramasamy
Why Splunk Chose Pulsar_Karthik RamasamyWhy Splunk Chose Pulsar_Karthik Ramasamy
Why Splunk Chose Pulsar_Karthik Ramasamy
 
Pulsar summit-keynote-final
Pulsar summit-keynote-finalPulsar summit-keynote-final
Pulsar summit-keynote-final
 
CICDforModernApplications-Oslo.pdf
CICDforModernApplications-Oslo.pdfCICDforModernApplications-Oslo.pdf
CICDforModernApplications-Oslo.pdf
 
Netflix MSA and Pivotal
Netflix MSA and PivotalNetflix MSA and Pivotal
Netflix MSA and Pivotal
 
CI/CD for Containers: A Way Forward for Your DevOps Pipeline
CI/CD for Containers: A Way Forward for Your DevOps PipelineCI/CD for Containers: A Way Forward for Your DevOps Pipeline
CI/CD for Containers: A Way Forward for Your DevOps Pipeline
 
CI/CD for Modern Applications
CI/CD for Modern ApplicationsCI/CD for Modern Applications
CI/CD for Modern Applications
 
[2015-11월 정기 세미나] Cloud Native Platform - Pivotal
[2015-11월 정기 세미나] Cloud Native Platform - Pivotal[2015-11월 정기 세미나] Cloud Native Platform - Pivotal
[2015-11월 정기 세미나] Cloud Native Platform - Pivotal
 
Kamailio practice Quobis-University of Vigo Laboratory of Commutation 2012-2...
Kamailio practice Quobis-University of Vigo Laboratory of Commutation  2012-2...Kamailio practice Quobis-University of Vigo Laboratory of Commutation  2012-2...
Kamailio practice Quobis-University of Vigo Laboratory of Commutation 2012-2...
 
The DevOps Promise: Helping Management Realise the Quality, Velocity & Effici...
The DevOps Promise: Helping Management Realise the Quality, Velocity & Effici...The DevOps Promise: Helping Management Realise the Quality, Velocity & Effici...
The DevOps Promise: Helping Management Realise the Quality, Velocity & Effici...
 
SplDevOps: Making Splunk Development a Breeze With a Deep Dive on DevOps' Con...
SplDevOps: Making Splunk Development a Breeze With a Deep Dive on DevOps' Con...SplDevOps: Making Splunk Development a Breeze With a Deep Dive on DevOps' Con...
SplDevOps: Making Splunk Development a Breeze With a Deep Dive on DevOps' Con...
 
Cisco Connect Ottawa 2018 dev net
Cisco Connect Ottawa 2018 dev netCisco Connect Ottawa 2018 dev net
Cisco Connect Ottawa 2018 dev net
 
Continuous Integration and Continuous Delivery Best Practices for Building Mo...
Continuous Integration and Continuous Delivery Best Practices for Building Mo...Continuous Integration and Continuous Delivery Best Practices for Building Mo...
Continuous Integration and Continuous Delivery Best Practices for Building Mo...
 
Analytics im DevOps Lebenszyklus
Analytics im DevOps LebenszyklusAnalytics im DevOps Lebenszyklus
Analytics im DevOps Lebenszyklus
 
Emulators as an Emerging Best Practice for API Providers
Emulators as an Emerging Best Practice for API ProvidersEmulators as an Emerging Best Practice for API Providers
Emulators as an Emerging Best Practice for API Providers
 
AWS DevDay Cologne - CI/CD for modern applications
AWS DevDay Cologne - CI/CD for modern applicationsAWS DevDay Cologne - CI/CD for modern applications
AWS DevDay Cologne - CI/CD for modern applications
 
Leveraging Splunk Enterprise Security with the MITRE’s ATT&CK Framework
Leveraging Splunk Enterprise Security with the MITRE’s ATT&CK FrameworkLeveraging Splunk Enterprise Security with the MITRE’s ATT&CK Framework
Leveraging Splunk Enterprise Security with the MITRE’s ATT&CK Framework
 
TechWiseTV Workshop: Cisco Hybrid Cloud Platform for Google Cloud
TechWiseTV Workshop:  Cisco Hybrid Cloud Platform for Google CloudTechWiseTV Workshop:  Cisco Hybrid Cloud Platform for Google Cloud
TechWiseTV Workshop: Cisco Hybrid Cloud Platform for Google Cloud
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
Databricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
Databricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
Databricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
Databricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Databricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
Databricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Databricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
Databricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Recently uploaded

一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
ewymefz
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Linda486226
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
ocavb
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
vcaxypu
 

Recently uploaded (20)

一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 

Building a Streaming Microservices Architecture - Data + AI Summit EU 2020

  • 1. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Building a Streaming Microservices Architecture With Apache Spark Structured Streaming & Friends Scott Haines Senior Principal Software Engineer, Twilio @newfront
  • 2. © 2019 TWILIO INC. ALL RIGHTS RESERVED. A little about me. • I work at Twilio building massive data systems • I run a bi-weekly internal Spark Office Hours where I offer training and guidance to teams at the company • >12 years working on large distributed analytics systems @newfront
  • 3. © 2019 TWILIO INC. ALL RIGHTS RESERVED. A little about me. • I work at Twilio building massive data systems • I run a bi-weekly internal Spark Office Hours where I offer training and guidance to teams at the company • >12 years working on large distributed analytics systems • Published work on Distributed Analytics Systems @newfront
  • 4. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Voice Insights Accountable observability data and interactive analytics and insights for the voice business and customers.
  • 5. VIRGINIA, USA DUBLIN, IRELAND SINGAPORE SYDNEY, AUSTRALIA TOKYO, JAPAN SAO PAULO, BRAZIL DATA CENT ERS © 2019 TWILIO INC. ALL RIGHTS RESERVED. Events per Second >1MIL
  • 6. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Let’s Build a Reliable Data Architecture
  • 7. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Goal: Reliable E2E Streaming Data Pipeline GRPC Client GRPC Server GRPC Server GRPC Server 1 2 3 Kafka Broker 4 Kafka Broker 5 6 Spark Application 7 8 HDFS S39 HTTP /2 @newfront
  • 8. Strong and Reliable Data starts at Ingest © 2019 TWILIO INC. ALL RIGHTS RESERVED. @newfront
  • 9. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Of people love JSON*. >95% { "type": "CallEvent", “call_sid”: "CA123", "attributes": [ { “account_sid”: “AC123” “start_ms": 123, “end_ms": "435" } ] } • JSON has Structure. • But JSON isn't strictly Structured Data. Structured Data @newfront @twilio
  • 10. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Of people saddened by bad data* 100% { "type": "CallEvent", “call_sid": “CA234”, “attributes":"oops" } • JSON has poor runtime guarantees due to its flexible nature. Optimize for compile time guarantees. • Debugging corrupt data in a large distributed system ruins hopes and dreams. Structured Data @newfront @twilio
  • 11. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Protocol Buffers message CallEvent { uint64 created_ms = 1; string call_sid = 2; uint64 account_sid = 3; EventType event_type = 4; Region region = 5; } • Well Defined Events tell their own Story • Type-Safety Rules • Versioning your API / Pipeline / Data now just means sticking to a version of your schema • Rely on Releases for versioning • Interoperable with most major languages (java/scala/c++/go/obj-c/node-js/ python/...)
  • 12. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Protocol Buffers message CallEvent { uint64 created_ms = 1; string call_sid = 2; uint64 account_sid = 3; EventType event_type = 4; Region region = 5; } • Data Accountability • Lightning Fast Serialization / Deserialization • Plays with nicely gRPC • Interoperable with Spark SQL • Like “JSON with Guard Rails” Of people like when things work between releases! 100% Take Aways
  • 13. Data Engineers love gRPC © 2019 TWILIO INC. ALL RIGHTS RESERVED. *technically not universally accepted @newfront
  • 14. GRPC Client GRPC Server GRPC Server GRPC Server HTTP /2 gRPC | saves time © 2019 TWILIO INC. ALL RIGHTS RESERVED.
  • 15. GRPC Client GRPC Server GRPC Server GRPC Server HTTP /2 gRPC | saves time © 2019 TWILIO INC. ALL RIGHTS RESERVED. val time = “$$$”
  • 16. GRPC Client GRPC Server GRPC Server GRPC Server HTTP /2 © 2019 TWILIO INC. ALL RIGHTS RESERVED. GRPC. GRPC // nutshell • RPC = remote procedure call. “G” stands for generic or Google • Great for Internal Services • High Performance • Compact Binary Exchange Format • Compile Idiomatic API Definitions • Capable of Bi-Directional Streaming • Pluggable HTTP/2 transport
  • 17. © 2019 TWILIO INC. ALL RIGHTS RESERVED. GRPC. • Building a CallEvent Service. • 1: Define your messages (call.proto)
  • 18. © 2019 TWILIO INC. ALL RIGHTS RESERVED. GRPC. • Building a CallEvent Service. • 1: Define your messages (call.proto) • 2: Define your services (service.proto) @newfront
  • 19. © 2019 TWILIO INC. ALL RIGHTS RESERVED. GRPC. • Building a CallEvent Service. • 1: Define your messages (call.proto) • 2: Define your services (service.proto) • 3: Compile your messages and service stubs. sbt clean compile package publishLocal
  • 20. © 2019 TWILIO INC. ALL RIGHTS RESERVED. GRPC. • Building a CallEvent Service. • 1: Define your messages (call.proto) • 2: Define your services (service.proto) • 3: Compile your messages and service stubs. • 4: Implement Traits and Run!
  • 21. © 2019 TWILIO INC. ALL RIGHTS RESERVED. GRPC. • Building a CallEvent Service. • Client SDKs essentially write themselves. • JSON <-> Protobuf is still possible for maintaining customer facing APIs
  • 22. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Protocol Gatewaymessage T<:Event { … } protobuf @ version Kafka Broker gRPC Common Pattern Emerges Protocol Streams
  • 23. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Still Building… GRPC Client GRPC Server GRPC Server GRPC Server 1 2 3 Kafka Broker 4 Kafka Broker 5 6 Spark Application 7 8 HDFS S39 HTTP /2 @newfront
  • 24. GRPC Client GRPC Server GRPC Server GRPC Server 1 2 3 Kafka Broker 4 Kafka Broker 5 6 Spark Application 7 8 9 HTTP /2 © 2019 TWILIO INC. ALL RIGHTS RESERVED. Kafka + Protobuf + Spark
  • 25. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Kafka. • Solve for common Data Pipeline Problems • Partition Keys are important. Use them to your advantage • Spark AQE - adaptive query execution can handle hot spots. Non-spark services not- so-much. Topic: CallEvents key: call_sid partitions*: n * number of partitions: factor of producer records/s * avg(record.bytesize)
  • 26. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Spark. • Use ScalaPB’s ExpressionEncoders to natively convert protobuf to Catalyst Optimizable DataFrames • Marry this with Sparks Kafka DataSource Reader/Writer
  • 27. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Spark. • Behind the Scenes… • Spark is doing magic
  • 28. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Spark. • End to End Tests ensure you can press the release button anywhere in the pipeline. • Can be automated to ensure your Spark Apps can continue working with any changes to the upstream gRPC • Can use for Canary testing updates
  • 29. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Spark. • Use ScalaPB to read local streams • Ensure your Spark Apps can continue working with any new protobuf dependencies • Test simple reads and complex aggregations locally, deploy globally @newfront
  • 30. GRPC Client GRPC Server GRPC Server GRPC Server 1 Kafka Broker 4 Kafka Broker 5 6 Spark Application 7 8 HDFS S39 TP /2 © 2019 TWILIO INC. ALL RIGHTS RESERVED. Spark & Beyond @newfront
  • 31. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Spark + Friends. • Proto to Catalyst DataFrame • Conversion to Parquet in Delta Table • Partitioned by Date @newfront
  • 32. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Spark + Friends. • Now it is easy for Downstream applications to pick up using Parquet Protocol Streams @newfront
  • 33. © 2019 TWILIO INC. ALL RIGHTS RESERVED. Rinse and Repeat @newfront Just keep adding new Flows. 1. Define your Structured Data 2. Define your service definitions 3. Emit Data / Enqueue to Kafka 4. Read and Drop in HDFS (delta) or pass along to a new Topic Kafka Topic Kafka Topic Spark Application Spark Application Spark Application Kafka Topic Data Table Data Table Spark Application GRPC Server