SlideShare a Scribd company logo
1
Stream All Things
Patterns of Modern Data Integration
2
Evolution of Data Integration
Data
Warehouse
Data Lake Data Stream
• Difficult modeling
• Difficult ingest
• Missing data
• Limited throughput
• Batch only
• Still batch only
• Data quality is suspect
• Data model left as
exercise for each
developer
• Where is my data?
• Kept the scale
• Reduced latency
• More involved ecosystem
that includes not just E+T+L
but also applications and
microservices.
• The tooling isn’t all there yet
3
Evolution of Data Integration
DWH Data Lake
Stream
Data
ETL
Engineers
Software
Engineers
4
Evolution of IT
2000
Software engineers
Data modelers
DBAs
Sysadmins
Storage admins
Network admins
ETL engineers
QA
….
2017
Software engineers
5
Software engineers do things a bit differently…
1. Write code, rather than be ”Informatica Guy”
2. Build apps, rather than help other people explore data
But it still looks a lot like ETL to me.
6
Few Patterns
1. Streams of events
2. Keep Events Compatible
3. Ridiculously Parallel Transformations
4. Streaming Data Enrichment
7
About Me
• Working @confluent
• Moving data around for 2 decades
• Was DBA, Consultant, engineer, Product
Manager…
• Wrote a book or two
• @gwenshap
• Github.com/gwenshap
8
Example: Large Hotel Chain
9
Streams of Events
10
Main
pain-points
of data
integration?
11
12
Since everything is an Event
1. Data integration is to collect, process and store events
2. Stream processing is kinda the same
3. Event Driven Architectures is pretty close too
14
PageViewEvent
{
sessionId: 676fc8983gu563,
timestamp: 1413215458,
viewType: "propertyView",
propertyId: 7879,
loyaltyId: 6764532
origin: "promotion",
...... lots of metadata....
}
16
17
Put all streams in One Kafka
18
Recap:
1. Everything is a stream of events
2. We need to move, process and integrate those streams
3. Kafka was made to store streams of events
4. Kafka brings those streams together –
which makes integrating them easier
19
20
21
22
Keep Events Compatible
23
Schemas are about how teams work together
Booking
service
{user_id: 53, timestamp: 1497842472}
create table (
use_id number,
timestamp number)
new Date(timestamp)
Attribution
service
Booking
DB
24
Booking
service
{user_id: 53, timestamp: “June 28, 2017 4:00pm”}
Attribution
service
Booking
DB
25
Moving fast and breaking things
Booking
service
{user_id: 53, timestamp: “June 28, 2017 4:00pm”}
create table (
use_id number,
timestamp number)
new Date(timestamp)
Attribution
service
Booking
DB
26
Schema Manifesto…
APIs between services are Contracts
In Stream Processing World – Event Schemas ARE the API
…except they stick around for a lot longer
27
Schema Registry
Elastic
Cassandra
Streams
App
Example Consumers
Serialize
r
App
!
Kafka Topic
Schema Registry
Define the expected fields for each Kafka topic
Automatically handle schema changes (e.g. new fields)
Verify Schema Compatibility at every stage of dev cycle
Supports multi-datacenter environments
28
Make those contracts binding
SerializationException
29
So the flow is…
Dev
Nightly build /
merge
Prod
Test Registry
Or
Mvn Plugin
Prod Registry
Test
MVN Plugin
30
Ridiculously Parallel Transformations
31
Sometimes a transformation is really trivial – but you do LOTS of them
{
sessionId: 676fc8983gu563,
timestamp: 1413215458,
viewType: "propertyView",
propertyId: 7879,
loyaltyId: 6764532,
origin: "promotion",
cc: “4444—3333-2222-1111”
}
{
sessionId: 676fc8983gu563,
timestamp: 1413215458,
viewType: "propertyView",
propertyId: 7879,
loyaltyId: 6764532,
origin: "promotion",
cc: “xxxx—xxxx-xxxx-1111”
}
32
33
Hipster Stream Processing
34
This applies surprisingly often
1. “L”oading events from Kafka to something (with Connect)
2. Event routing, time indexing
3. Field masking
4. Filtering
5. Unit conversions
6. Even “E”xtracting from something to Kafka (with Connect)
35
One soundbit about connect
36
Ecosystem of Connectors
Databases Datastore/File Store
Analytics Applications / Other
37
Careful:
Don’t take this pattern too far
Too far is when you manage state
38
Streaming Data Enrichment
39
Promotion!
We want to send Platinum members
Who are looking at beach properties
An email about
Discount package in new Florida hotel
40
Enriching events
From Event
sessionID
timestamp
loyaltyID
propertyID
To Enriched Event
sessionID
timestamp
loyaltyID
Level
isPlatium
propertyID
Location
isBeach
41
44
46
47
We use Kafka, Connect and CDC Connectors
To turn state (in DB)
Into stream of events (in Kafka)
And back into state (in application cache)
48
49
In reality…
• Maintaining the distributed cache isn’t trivial
• How do we persist state? Failover? Recover?
• How do we co-partition for joins?
KafkaStreams makes it easy:
Ktable loyaltyTbl = builder.table(…, “loyalty-cdc-topic”)
PageViewStream.leftJoin(loyaltyTbl, <operation>)
Example:
https://www.confluent.io/blog/distributed-real-time-joins-and-aggregations-on-user-activity-events-
using-kafka-streams/
50
Bonus Content – Joining Streams
51
Search Relevance
User searched for… “Hotel in Philly”
Goal:
• 80% of users clicked 1st result
• 15% of users clicked 2nd result
• 5% of users clicked results 3-
10
Reality:
?
52
PageViewEvent
{
sessionId: 676fc8983gu563,
timestamp: 1413215458,
viewType: "propertyView",
propertyId: 7879,
loyaltyId: 6764532,
origin: "promotion"}
SearchEvent
{
sessionId: 676fc8983gu563,
timestamp: 1413215458,
loyaltyId: 6764532,
SearchTerm: ”Philladelphia cozy"}
53
54
55
56
When Did Event?
Browser
time
Web App
created
event
Producer
sent event
Kafka got
event
Consumer
App got
event
58
In Reality…
• Windowed join can be challenging two implement – two buffers, state, failover, recovery…
KafkaStreams makes it easy:
PageViewStream.leftJoin(searchStream, <operation>, JoinWindows.of(1000)…)
Full example: https://github.com/gwenshap/kafka-clickstream-enrich
59
Few Patterns
1. Streams of events
2. Keep Events Compatible
3. Ridiculously Parallel Transformations
4. Streaming Data Enrichment
60
Thank You!

More Related Content

What's hot

Bank of China Tech Talk 2: Introduction to Streaming Data and Stream Processi...
Bank of China Tech Talk 2: Introduction to Streaming Data and Stream Processi...Bank of China Tech Talk 2: Introduction to Streaming Data and Stream Processi...
Bank of China Tech Talk 2: Introduction to Streaming Data and Stream Processi...
confluent
 
Time series-analysis-using-an-event-streaming-platform -_v3_final
Time series-analysis-using-an-event-streaming-platform -_v3_finalTime series-analysis-using-an-event-streaming-platform -_v3_final
Time series-analysis-using-an-event-streaming-platform -_v3_final
confluent
 
What every software engineer should know about streams and tables in kafka ...
What every software engineer should know about streams and tables in kafka   ...What every software engineer should know about streams and tables in kafka   ...
What every software engineer should know about streams and tables in kafka ...
confluent
 
Observability for developer ( Inny So & Andrew Jones, ThoughtWorks) Kafka Su...
Observability for developer ( Inny So & Andrew Jones, ThoughtWorks)  Kafka Su...Observability for developer ( Inny So & Andrew Jones, ThoughtWorks)  Kafka Su...
Observability for developer ( Inny So & Andrew Jones, ThoughtWorks) Kafka Su...
confluent
 
Kafka Summit SF 2017 - Keynote - Managing Data at Scale: The Unreasonable Eff...
Kafka Summit SF 2017 - Keynote - Managing Data at Scale: The Unreasonable Eff...Kafka Summit SF 2017 - Keynote - Managing Data at Scale: The Unreasonable Eff...
Kafka Summit SF 2017 - Keynote - Managing Data at Scale: The Unreasonable Eff...
confluent
 
Extending the Stream/Table Duality into a Trinity, with Graphs (David Allen &...
Extending the Stream/Table Duality into a Trinity, with Graphs (David Allen &...Extending the Stream/Table Duality into a Trinity, with Graphs (David Allen &...
Extending the Stream/Table Duality into a Trinity, with Graphs (David Allen &...
confluent
 
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
KafkaZone
 
Devoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful StreamsDevoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful Streams
Ben Stopford
 
Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...
Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...
Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...
confluent
 
Putting the Micro into Microservices with Stateful Stream Processing
Putting the Micro into Microservices with Stateful Stream ProcessingPutting the Micro into Microservices with Stateful Stream Processing
Putting the Micro into Microservices with Stateful Stream Processing
confluent
 
Kafka Summit SF 2017 - From Scaling Nightmare to Stream Dream : Real-time Str...
Kafka Summit SF 2017 - From Scaling Nightmare to Stream Dream : Real-time Str...Kafka Summit SF 2017 - From Scaling Nightmare to Stream Dream : Real-time Str...
Kafka Summit SF 2017 - From Scaling Nightmare to Stream Dream : Real-time Str...
confluent
 
Kafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern Programming
Kafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern ProgrammingKafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern Programming
Kafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern Programming
confluent
 
Launching the Expedia Conversations Platform: From Zero to Production in Four...
Launching the Expedia Conversations Platform: From Zero to Production in Four...Launching the Expedia Conversations Platform: From Zero to Production in Four...
Launching the Expedia Conversations Platform: From Zero to Production in Four...
HostedbyConfluent
 
How to mutate your immutable log | Andrey Falko, Stripe
How to mutate your immutable log | Andrey Falko, StripeHow to mutate your immutable log | Andrey Falko, Stripe
How to mutate your immutable log | Andrey Falko, Stripe
HostedbyConfluent
 
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2:  Building Event-Driven Services with Apache KafkaEvent Driven Services Part 2:  Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
Ben Stopford
 
Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...
Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...
Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...
HostedbyConfluent
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
confluent
 
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams APIuser Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
confluent
 
Jay Kreps, Confluent | Kafka Summit SF 2019 Keynote ft. Dev Tagare, Lyft + Pr...
Jay Kreps, Confluent | Kafka Summit SF 2019 Keynote ft. Dev Tagare, Lyft + Pr...Jay Kreps, Confluent | Kafka Summit SF 2019 Keynote ft. Dev Tagare, Lyft + Pr...
Jay Kreps, Confluent | Kafka Summit SF 2019 Keynote ft. Dev Tagare, Lyft + Pr...
confluent
 
James Watters, Pivotal | Kafka Summit NYC 2019 Keynote (Spring Boot+Kafka: Th...
James Watters, Pivotal | Kafka Summit NYC 2019 Keynote (Spring Boot+Kafka: Th...James Watters, Pivotal | Kafka Summit NYC 2019 Keynote (Spring Boot+Kafka: Th...
James Watters, Pivotal | Kafka Summit NYC 2019 Keynote (Spring Boot+Kafka: Th...
confluent
 

What's hot (20)

Bank of China Tech Talk 2: Introduction to Streaming Data and Stream Processi...
Bank of China Tech Talk 2: Introduction to Streaming Data and Stream Processi...Bank of China Tech Talk 2: Introduction to Streaming Data and Stream Processi...
Bank of China Tech Talk 2: Introduction to Streaming Data and Stream Processi...
 
Time series-analysis-using-an-event-streaming-platform -_v3_final
Time series-analysis-using-an-event-streaming-platform -_v3_finalTime series-analysis-using-an-event-streaming-platform -_v3_final
Time series-analysis-using-an-event-streaming-platform -_v3_final
 
What every software engineer should know about streams and tables in kafka ...
What every software engineer should know about streams and tables in kafka   ...What every software engineer should know about streams and tables in kafka   ...
What every software engineer should know about streams and tables in kafka ...
 
Observability for developer ( Inny So & Andrew Jones, ThoughtWorks) Kafka Su...
Observability for developer ( Inny So & Andrew Jones, ThoughtWorks)  Kafka Su...Observability for developer ( Inny So & Andrew Jones, ThoughtWorks)  Kafka Su...
Observability for developer ( Inny So & Andrew Jones, ThoughtWorks) Kafka Su...
 
Kafka Summit SF 2017 - Keynote - Managing Data at Scale: The Unreasonable Eff...
Kafka Summit SF 2017 - Keynote - Managing Data at Scale: The Unreasonable Eff...Kafka Summit SF 2017 - Keynote - Managing Data at Scale: The Unreasonable Eff...
Kafka Summit SF 2017 - Keynote - Managing Data at Scale: The Unreasonable Eff...
 
Extending the Stream/Table Duality into a Trinity, with Graphs (David Allen &...
Extending the Stream/Table Duality into a Trinity, with Graphs (David Allen &...Extending the Stream/Table Duality into a Trinity, with Graphs (David Allen &...
Extending the Stream/Table Duality into a Trinity, with Graphs (David Allen &...
 
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
 
Devoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful StreamsDevoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful Streams
 
Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...
Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...
Neha Narkhede | Kafka Summit London 2019 Keynote | Event Streaming: Our Cloud...
 
Putting the Micro into Microservices with Stateful Stream Processing
Putting the Micro into Microservices with Stateful Stream ProcessingPutting the Micro into Microservices with Stateful Stream Processing
Putting the Micro into Microservices with Stateful Stream Processing
 
Kafka Summit SF 2017 - From Scaling Nightmare to Stream Dream : Real-time Str...
Kafka Summit SF 2017 - From Scaling Nightmare to Stream Dream : Real-time Str...Kafka Summit SF 2017 - From Scaling Nightmare to Stream Dream : Real-time Str...
Kafka Summit SF 2017 - From Scaling Nightmare to Stream Dream : Real-time Str...
 
Kafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern Programming
Kafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern ProgrammingKafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern Programming
Kafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern Programming
 
Launching the Expedia Conversations Platform: From Zero to Production in Four...
Launching the Expedia Conversations Platform: From Zero to Production in Four...Launching the Expedia Conversations Platform: From Zero to Production in Four...
Launching the Expedia Conversations Platform: From Zero to Production in Four...
 
How to mutate your immutable log | Andrey Falko, Stripe
How to mutate your immutable log | Andrey Falko, StripeHow to mutate your immutable log | Andrey Falko, Stripe
How to mutate your immutable log | Andrey Falko, Stripe
 
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2:  Building Event-Driven Services with Apache KafkaEvent Driven Services Part 2:  Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
 
Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...
Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...
Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
 
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams APIuser Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
 
Jay Kreps, Confluent | Kafka Summit SF 2019 Keynote ft. Dev Tagare, Lyft + Pr...
Jay Kreps, Confluent | Kafka Summit SF 2019 Keynote ft. Dev Tagare, Lyft + Pr...Jay Kreps, Confluent | Kafka Summit SF 2019 Keynote ft. Dev Tagare, Lyft + Pr...
Jay Kreps, Confluent | Kafka Summit SF 2019 Keynote ft. Dev Tagare, Lyft + Pr...
 
James Watters, Pivotal | Kafka Summit NYC 2019 Keynote (Spring Boot+Kafka: Th...
James Watters, Pivotal | Kafka Summit NYC 2019 Keynote (Spring Boot+Kafka: Th...James Watters, Pivotal | Kafka Summit NYC 2019 Keynote (Spring Boot+Kafka: Th...
James Watters, Pivotal | Kafka Summit NYC 2019 Keynote (Spring Boot+Kafka: Th...
 

Similar to SF big Analytics : Stream all things by Gwen Shapira @ Lyft 2018

batbern43 Stream all Things: Patterns of Data Integration in Event Driven Sys...
batbern43 Stream all Things: Patterns of Data Integration in Event Driven Sys...batbern43 Stream all Things: Patterns of Data Integration in Event Driven Sys...
batbern43 Stream all Things: Patterns of Data Integration in Event Driven Sys...
BATbern
 
Stream All Things—Patterns of Modern Data Integration with Gwen Shapira
Stream All Things—Patterns of Modern Data Integration with Gwen ShapiraStream All Things—Patterns of Modern Data Integration with Gwen Shapira
Stream All Things—Patterns of Modern Data Integration with Gwen Shapira
Databricks
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
C4Media
 
Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...
Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...
Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...
Gary Arora
 
Google Cloud Dataflow Two Worlds Become a Much Better One
Google Cloud Dataflow Two Worlds Become a Much Better OneGoogle Cloud Dataflow Two Worlds Become a Much Better One
Google Cloud Dataflow Two Worlds Become a Much Better One
DataWorks Summit
 
Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017
Nitin Kumar
 
Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...
Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...
Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...
Tammy Bednar
 
James Watters Kafka Summit NYC 2019 Keynote
James Watters Kafka Summit NYC 2019 KeynoteJames Watters Kafka Summit NYC 2019 Keynote
James Watters Kafka Summit NYC 2019 Keynote
James Watters
 
Big Data LDN 2017: Look Ma, No Code! Building Streaming Data Pipelines With A...
Big Data LDN 2017: Look Ma, No Code! Building Streaming Data Pipelines With A...Big Data LDN 2017: Look Ma, No Code! Building Streaming Data Pipelines With A...
Big Data LDN 2017: Look Ma, No Code! Building Streaming Data Pipelines With A...
Matt Stubbs
 
Scylla Summit 2022: An Odyssey to ScyllaDB and Apache Kafka
Scylla Summit 2022: An Odyssey to ScyllaDB and Apache KafkaScylla Summit 2022: An Odyssey to ScyllaDB and Apache Kafka
Scylla Summit 2022: An Odyssey to ScyllaDB and Apache Kafka
ScyllaDB
 
Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...
Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...
Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...
confluent
 
ETL Is Dead, Long-live Streams
ETL Is Dead, Long-live StreamsETL Is Dead, Long-live Streams
ETL Is Dead, Long-live Streams
C4Media
 
Stream Analytics with SQL on Apache Flink
 Stream Analytics with SQL on Apache Flink Stream Analytics with SQL on Apache Flink
Stream Analytics with SQL on Apache Flink
Fabian Hueske
 
Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...
confluent
 
How we evolved data pipeline at Celtra and what we learned along the way
How we evolved data pipeline at Celtra and what we learned along the wayHow we evolved data pipeline at Celtra and what we learned along the way
How we evolved data pipeline at Celtra and what we learned along the way
Grega Kespret
 
Real-Time Event Processing
Real-Time Event ProcessingReal-Time Event Processing
Real-Time Event Processing
Amazon Web Services
 
Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017
Kristi Lewandowski
 
Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017
SingleStore
 
Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017
Kristi Lewandowski
 
Leveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern AnalyticsLeveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern Analytics
confluent
 

Similar to SF big Analytics : Stream all things by Gwen Shapira @ Lyft 2018 (20)

batbern43 Stream all Things: Patterns of Data Integration in Event Driven Sys...
batbern43 Stream all Things: Patterns of Data Integration in Event Driven Sys...batbern43 Stream all Things: Patterns of Data Integration in Event Driven Sys...
batbern43 Stream all Things: Patterns of Data Integration in Event Driven Sys...
 
Stream All Things—Patterns of Modern Data Integration with Gwen Shapira
Stream All Things—Patterns of Modern Data Integration with Gwen ShapiraStream All Things—Patterns of Modern Data Integration with Gwen Shapira
Stream All Things—Patterns of Modern Data Integration with Gwen Shapira
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...
Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...
Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...
 
Google Cloud Dataflow Two Worlds Become a Much Better One
Google Cloud Dataflow Two Worlds Become a Much Better OneGoogle Cloud Dataflow Two Worlds Become a Much Better One
Google Cloud Dataflow Two Worlds Become a Much Better One
 
Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017
 
Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...
Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...
Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...
 
James Watters Kafka Summit NYC 2019 Keynote
James Watters Kafka Summit NYC 2019 KeynoteJames Watters Kafka Summit NYC 2019 Keynote
James Watters Kafka Summit NYC 2019 Keynote
 
Big Data LDN 2017: Look Ma, No Code! Building Streaming Data Pipelines With A...
Big Data LDN 2017: Look Ma, No Code! Building Streaming Data Pipelines With A...Big Data LDN 2017: Look Ma, No Code! Building Streaming Data Pipelines With A...
Big Data LDN 2017: Look Ma, No Code! Building Streaming Data Pipelines With A...
 
Scylla Summit 2022: An Odyssey to ScyllaDB and Apache Kafka
Scylla Summit 2022: An Odyssey to ScyllaDB and Apache KafkaScylla Summit 2022: An Odyssey to ScyllaDB and Apache Kafka
Scylla Summit 2022: An Odyssey to ScyllaDB and Apache Kafka
 
Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...
Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...
Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...
 
ETL Is Dead, Long-live Streams
ETL Is Dead, Long-live StreamsETL Is Dead, Long-live Streams
ETL Is Dead, Long-live Streams
 
Stream Analytics with SQL on Apache Flink
 Stream Analytics with SQL on Apache Flink Stream Analytics with SQL on Apache Flink
Stream Analytics with SQL on Apache Flink
 
Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...
 
How we evolved data pipeline at Celtra and what we learned along the way
How we evolved data pipeline at Celtra and what we learned along the wayHow we evolved data pipeline at Celtra and what we learned along the way
How we evolved data pipeline at Celtra and what we learned along the way
 
Real-Time Event Processing
Real-Time Event ProcessingReal-Time Event Processing
Real-Time Event Processing
 
Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017
 
Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017
 
Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017
 
Leveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern AnalyticsLeveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern Analytics
 

More from Chester Chen

SFBigAnalytics_SparkRapid_20220622.pdf
SFBigAnalytics_SparkRapid_20220622.pdfSFBigAnalytics_SparkRapid_20220622.pdf
SFBigAnalytics_SparkRapid_20220622.pdf
Chester Chen
 
zookeeer+raft-2.pdf
zookeeer+raft-2.pdfzookeeer+raft-2.pdf
zookeeer+raft-2.pdf
Chester Chen
 
SF Big Analytics 2022-03-15: Persia: Scaling DL Based Recommenders up to 100 ...
SF Big Analytics 2022-03-15: Persia: Scaling DL Based Recommenders up to 100 ...SF Big Analytics 2022-03-15: Persia: Scaling DL Based Recommenders up to 100 ...
SF Big Analytics 2022-03-15: Persia: Scaling DL Based Recommenders up to 100 ...
Chester Chen
 
SF Big Analytics talk: NVIDIA FLARE: Federated Learning Application Runtime E...
SF Big Analytics talk: NVIDIA FLARE: Federated Learning Application Runtime E...SF Big Analytics talk: NVIDIA FLARE: Federated Learning Application Runtime E...
SF Big Analytics talk: NVIDIA FLARE: Federated Learning Application Runtime E...
Chester Chen
 
A missing link in the ML infrastructure stack?
A missing link in the ML infrastructure stack?A missing link in the ML infrastructure stack?
A missing link in the ML infrastructure stack?
Chester Chen
 
Shopify datadiscoverysf bigdata
Shopify datadiscoverysf bigdataShopify datadiscoverysf bigdata
Shopify datadiscoverysf bigdata
Chester Chen
 
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
Chester Chen
 
SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK...
 SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK... SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK...
SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK...
Chester Chen
 
SFBigAnalytics_20190724: Monitor kafka like a Pro
SFBigAnalytics_20190724: Monitor kafka like a ProSFBigAnalytics_20190724: Monitor kafka like a Pro
SFBigAnalytics_20190724: Monitor kafka like a Pro
Chester Chen
 
SF Big Analytics 2019-06-12: Managing uber's data workflows at scale
SF Big Analytics 2019-06-12: Managing uber's data workflows at scaleSF Big Analytics 2019-06-12: Managing uber's data workflows at scale
SF Big Analytics 2019-06-12: Managing uber's data workflows at scale
Chester Chen
 
SF Big Analytics 20190612: Building highly efficient data lakes using Apache ...
SF Big Analytics 20190612: Building highly efficient data lakes using Apache ...SF Big Analytics 20190612: Building highly efficient data lakes using Apache ...
SF Big Analytics 20190612: Building highly efficient data lakes using Apache ...
Chester Chen
 
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftSF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
Chester Chen
 
SFBigAnalytics- hybrid data management using cdap
SFBigAnalytics- hybrid data management using cdapSFBigAnalytics- hybrid data management using cdap
SFBigAnalytics- hybrid data management using cdap
Chester Chen
 
Sf big analytics: bighead
Sf big analytics: bigheadSf big analytics: bighead
Sf big analytics: bighead
Chester Chen
 
Sf big analytics_2018_04_18: Evolution of the GoPro's data platform
Sf big analytics_2018_04_18: Evolution of the GoPro's data platformSf big analytics_2018_04_18: Evolution of the GoPro's data platform
Sf big analytics_2018_04_18: Evolution of the GoPro's data platform
Chester Chen
 
Analytics Metrics delivery and ML Feature visualization: Evolution of Data Pl...
Analytics Metrics delivery and ML Feature visualization: Evolution of Data Pl...Analytics Metrics delivery and ML Feature visualization: Evolution of Data Pl...
Analytics Metrics delivery and ML Feature visualization: Evolution of Data Pl...
Chester Chen
 
2018 data warehouse features in spark
2018   data warehouse features in spark2018   data warehouse features in spark
2018 data warehouse features in spark
Chester Chen
 
2018 02-08-what's-new-in-apache-spark-2.3
2018 02-08-what's-new-in-apache-spark-2.3 2018 02-08-what's-new-in-apache-spark-2.3
2018 02-08-what's-new-in-apache-spark-2.3
Chester Chen
 
2018 02 20-jeg_index
2018 02 20-jeg_index2018 02 20-jeg_index
2018 02 20-jeg_index
Chester Chen
 
Index conf sparkml-feb20-n-pentreath
Index conf sparkml-feb20-n-pentreathIndex conf sparkml-feb20-n-pentreath
Index conf sparkml-feb20-n-pentreath
Chester Chen
 

More from Chester Chen (20)

SFBigAnalytics_SparkRapid_20220622.pdf
SFBigAnalytics_SparkRapid_20220622.pdfSFBigAnalytics_SparkRapid_20220622.pdf
SFBigAnalytics_SparkRapid_20220622.pdf
 
zookeeer+raft-2.pdf
zookeeer+raft-2.pdfzookeeer+raft-2.pdf
zookeeer+raft-2.pdf
 
SF Big Analytics 2022-03-15: Persia: Scaling DL Based Recommenders up to 100 ...
SF Big Analytics 2022-03-15: Persia: Scaling DL Based Recommenders up to 100 ...SF Big Analytics 2022-03-15: Persia: Scaling DL Based Recommenders up to 100 ...
SF Big Analytics 2022-03-15: Persia: Scaling DL Based Recommenders up to 100 ...
 
SF Big Analytics talk: NVIDIA FLARE: Federated Learning Application Runtime E...
SF Big Analytics talk: NVIDIA FLARE: Federated Learning Application Runtime E...SF Big Analytics talk: NVIDIA FLARE: Federated Learning Application Runtime E...
SF Big Analytics talk: NVIDIA FLARE: Federated Learning Application Runtime E...
 
A missing link in the ML infrastructure stack?
A missing link in the ML infrastructure stack?A missing link in the ML infrastructure stack?
A missing link in the ML infrastructure stack?
 
Shopify datadiscoverysf bigdata
Shopify datadiscoverysf bigdataShopify datadiscoverysf bigdata
Shopify datadiscoverysf bigdata
 
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
 
SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK...
 SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK... SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK...
SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK...
 
SFBigAnalytics_20190724: Monitor kafka like a Pro
SFBigAnalytics_20190724: Monitor kafka like a ProSFBigAnalytics_20190724: Monitor kafka like a Pro
SFBigAnalytics_20190724: Monitor kafka like a Pro
 
SF Big Analytics 2019-06-12: Managing uber's data workflows at scale
SF Big Analytics 2019-06-12: Managing uber's data workflows at scaleSF Big Analytics 2019-06-12: Managing uber's data workflows at scale
SF Big Analytics 2019-06-12: Managing uber's data workflows at scale
 
SF Big Analytics 20190612: Building highly efficient data lakes using Apache ...
SF Big Analytics 20190612: Building highly efficient data lakes using Apache ...SF Big Analytics 20190612: Building highly efficient data lakes using Apache ...
SF Big Analytics 20190612: Building highly efficient data lakes using Apache ...
 
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftSF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
 
SFBigAnalytics- hybrid data management using cdap
SFBigAnalytics- hybrid data management using cdapSFBigAnalytics- hybrid data management using cdap
SFBigAnalytics- hybrid data management using cdap
 
Sf big analytics: bighead
Sf big analytics: bigheadSf big analytics: bighead
Sf big analytics: bighead
 
Sf big analytics_2018_04_18: Evolution of the GoPro's data platform
Sf big analytics_2018_04_18: Evolution of the GoPro's data platformSf big analytics_2018_04_18: Evolution of the GoPro's data platform
Sf big analytics_2018_04_18: Evolution of the GoPro's data platform
 
Analytics Metrics delivery and ML Feature visualization: Evolution of Data Pl...
Analytics Metrics delivery and ML Feature visualization: Evolution of Data Pl...Analytics Metrics delivery and ML Feature visualization: Evolution of Data Pl...
Analytics Metrics delivery and ML Feature visualization: Evolution of Data Pl...
 
2018 data warehouse features in spark
2018   data warehouse features in spark2018   data warehouse features in spark
2018 data warehouse features in spark
 
2018 02-08-what's-new-in-apache-spark-2.3
2018 02-08-what's-new-in-apache-spark-2.3 2018 02-08-what's-new-in-apache-spark-2.3
2018 02-08-what's-new-in-apache-spark-2.3
 
2018 02 20-jeg_index
2018 02 20-jeg_index2018 02 20-jeg_index
2018 02 20-jeg_index
 
Index conf sparkml-feb20-n-pentreath
Index conf sparkml-feb20-n-pentreathIndex conf sparkml-feb20-n-pentreath
Index conf sparkml-feb20-n-pentreath
 

Recently uploaded

一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
2023240532
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
ewymefz
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
ocavb
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 

Recently uploaded (20)

一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 

SF big Analytics : Stream all things by Gwen Shapira @ Lyft 2018

Editor's Notes

  1. Data Integration is one of those domains that keep changing. I’ll start by exploring a bit how data integration changed over the course of my career, give a quick rundown of what I think is the current state of data integration and then I’ll dive into 4 architecture patterns that I see as central to modern data integration.
  2. Page views – Properties, rooms, deals, etc Searches Booking Everything that happens to a database Loyal customer profiles Changes to properties Price changes Sensors: Room entry TV viewing Snack bar
  3. Generalizes pub/sub and queuing Basis for kafka connect, kafka streams
  4. As a result of adopting events-based ideas into the data integration world, we inherited one of the biggest problems of the event world: Compatibility
  5. This wouldn’t happen in a database – because there is 4 month change process to apply any DDL This wouldn’t happen to a public API, because you’d have unit tests and code reviews that catch this But this keeps happening to anything that touches semi-structured data. Enough is enough.
  6. So they deserve as much love and as much testing as you give other APIs.
  7. This diagram shows what schema registry is for. From left to right: Schema registry is specific to a Kafka cluster, so you can have one schema registry per cluster. Applications can be third party apps like Twitter or SFDC or a custom application. Schema registry is relevant to every producer who can feed messages to your cluster. The schema registry exists outside your applications as a separate process. Within an application, there is a function called serializer that serializes messages for delivery to the kafka data pipeline. Confluent’s schema registry and integrated into the serializer. Some other apex developer companies have created their own schema registries outside the serializer; argument is that’s more complexity for the same functionality. The process goes like this: The serializer places a call to the schema registry, to see if it has a format for the data the application wants to publish. If it does, then schema registry passes that format to the application’s serializer, which uses it to filter out incorrectly formatted messages. This keeps your data pipeline clean. The format confluent uses for its schema registry is called Avro. Avro is the only known generally published schema format. i.e. everything else is custom. When you use the confluent schema registry, it’s automatically integrated into the serializer and there’s no effort you need to put into it. Your data pipeline will just be clean. You simply need to have all applications call the schema registry when publishing. The ultimate benefit of schema registry is that the consumer applications of your kafka topic can successfully read the messages they receive.
  8. There are two hard problems in stream processing
  9. Draw the rest of the fucking owl