The evolution of data movement.
Check out our API Conferences
www.apidays.global
Want to talk at one of our conferences?
apidays.typeform.com/to/ILJeAaV8
Airbyte
Open-Source data integration
30,000 Deployments
7,900 Slack members
7,000 GitHub stars
Hello!
I am Michel Tricot
Co-Founder & CEO of Airbyte
@MichelTricot
michel-tricot
/in/micheltricot
How Data Movement has changed…
Before Today
The rise of the Cloud compute era
1. Exponential growth in the amount of data sources and data
2. Plummeting cost of cloud-based computation and storage
➡ Data consumption model has changed
APIs are ubiquitous
➡ Data access model has changed
1. APIs are both a product and a datastore
2. Data is siloed and access has become a key challenge
Extract - Load - Transform
A new paradigm for modern teams
ELT is replacing ETL
Extract
Source-specific routines to pull selected data from an external system.
Transform
Business logic specific to your organization to serve an analytics or operational use case.
Load
Destination-specific routines to push data where it is going to be consumed.
ETL doesn’t work in today’s world
Inflexible
● Friction when changing an existing pipeline.
● Hard to add new data.
● Most issues force data to be re-extracted.
Lack of Autonomy
● Warehouses made data consumers more autonomous.
● Changes require engineering involvement.
Complex
● Custom DSL.
● Forces adoption of a data stack.
● Addresses 70% of the needs; the remaining 30% is still built and maintained in-house.
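The inflexibility point can be illustrated with a minimal sketch. Everything in it is hypothetical (the `Source` class, the field names, and the two transform functions are assumptions made for illustration, not part of any real tool). It shows that in an ETL pipeline the destination only receives transformed output, so a change in business logic means going back to the source.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


class Source:
    """Hypothetical external system that counts how often it is extracted."""

    def __init__(self, rows: List[Record]) -> None:
        self._rows = rows
        self.extract_calls = 0

    def extract(self) -> List[Record]:
        self.extract_calls += 1
        return [dict(row) for row in self._rows]


def transform_v1(rows: List[Record]) -> List[Record]:
    # First version of the business logic: keep only the email domain.
    return [{"id": r["id"], "domain": r["email"].split("@")[1]} for r in rows]


def transform_v2(rows: List[Record]) -> List[Record]:
    # New requirement: the country is needed as well.
    return [
        {"id": r["id"], "domain": r["email"].split("@")[1], "country": r["country"]}
        for r in rows
    ]


def run_etl(source: Source, transform: Callable[[List[Record]], List[Record]]) -> List[Record]:
    """Extract, transform, then 'load': only transformed rows reach the destination."""
    return transform(source.extract())


source = Source([
    {"id": "1", "email": "ana@example.com", "country": "FR"},
    {"id": "2", "email": "bo@example.org", "country": "US"},
])

destination = run_etl(source, transform_v1)
# The country column never reached the destination, so the new
# requirement cannot be served from it: the source is extracted again.
destination = run_etl(source, transform_v2)
```

After the second run, `source.extract_calls` is 2: the change in transformation logic cost a full re-extraction.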
Extract
General-purpose routines to pull selected data from a source.
Load
General-purpose routines to push raw data where it is going to be consumed.
Transform
Business logic specific to your organization to serve an analytics or operational use case with SQL / dbt / ...
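The three ELT steps can be sketched as follows. This is a minimal illustration under stated assumptions: an in-memory SQLite database stands in for the warehouse, and the records, table name (`raw_orders`), and view name (`revenue_by_customer`) are made up for the example. Raw data is loaded untouched, and the business logic lives in SQL on the destination.

```python
import sqlite3
from typing import Dict, List


def extract() -> List[Dict[str, object]]:
    # Hypothetical records standing in for what a general-purpose
    # extract routine pulled from a source.
    return [
        {"id": 1, "customer": "acme", "amount_cents": 1250},
        {"id": 2, "customer": "acme", "amount_cents": 800},
        {"id": 3, "customer": "globex", "amount_cents": 4300},
    ]


def load(conn: sqlite3.Connection, records: List[Dict[str, object]]) -> None:
    # Load the raw data as-is; no business logic is applied here.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO raw_orders (id, customer, amount_cents) "
        "VALUES (:id, :customer, :amount_cents)",
        records,
    )
    conn.commit()


def transform(conn: sqlite3.Connection) -> None:
    # Business logic expressed in SQL, running where the data already is.
    conn.execute("DROP VIEW IF EXISTS revenue_by_customer")
    conn.execute(
        "CREATE VIEW revenue_by_customer AS "
        "SELECT customer, SUM(amount_cents) / 100.0 AS revenue "
        "FROM raw_orders GROUP BY customer"
    )


def run_elt() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    load(conn, extract())
    transform(conn)
    return conn
```

Because `raw_orders` keeps every extracted field, a data consumer can add or change a view later without touching the extract or load steps.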
ELT fixes the ETL-related issues
Flexibility
● All the data available on the destination.
● Data consumers are free to use what they need for the insights they want.
Autonomy
● Data consumers can leverage SQL queries to transform the data the way they want.
● No need to involve the engineering team.
Future-proof
● Issues during transformation don’t prevent access to the data.
● Easy to update transformation schemas.
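The future-proof point can be illustrated with a small sketch, assuming an in-memory SQLite database as the destination and hypothetical names (`raw_events`, `users_per_plan`, `apply_transformation`). A broken transformation fails on its own, the raw data stays queryable, and the fix is applied without extracting anything again.

```python
import sqlite3


def load_raw(conn: sqlite3.Connection) -> None:
    # Raw data lands on the destination before any transformation runs.
    conn.execute("CREATE TABLE raw_events (user_id TEXT, plan TEXT)")
    conn.executemany(
        "INSERT INTO raw_events VALUES (?, ?)",
        [("u1", "free"), ("u2", "pro"), ("u3", "pro")],
    )
    conn.commit()


def apply_transformation(conn: sqlite3.Connection, select_sql: str) -> bool:
    """(Re)create the users_per_plan view; return False if the SQL is invalid."""
    try:
        conn.execute("DROP VIEW IF EXISTS users_per_plan")
        conn.execute(f"CREATE VIEW users_per_plan AS {select_sql}")
        # SQLite validates a view lazily, so query it once to surface errors.
        conn.execute("SELECT * FROM users_per_plan").fetchall()
        return True
    except sqlite3.OperationalError:
        conn.execute("DROP VIEW IF EXISTS users_per_plan")
        return False


conn = sqlite3.connect(":memory:")
load_raw(conn)

# A transformation with a typo in the column name fails...
broken_ok = apply_transformation(
    conn, "SELECT plan_name, COUNT(*) AS users FROM raw_events GROUP BY plan_name"
)
# ...but the raw data is still there and can be queried directly.
raw_count = conn.execute("SELECT COUNT(*) FROM raw_events").fetchone()[0]

# Fixing the transformation does not require re-extracting anything.
fixed_ok = apply_transformation(
    conn, "SELECT plan, COUNT(*) AS users FROM raw_events GROUP BY plan"
)
```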
What about the long tail of APIs?
Thousands of new apps/APIs emerge every year
➡ Data is more and more fragmented
➡ Rising need to break down data silos
Open-source communities solve the long tail of APIs
1. Don’t reinvent the wheel, leverage existing connectors
2. Share the work of maintenance across a community
OSS is the only way to solve data integration
Developer tooling is crucial
We empower people to build good connectors with the Airbyte CDK
1. Offer developers tools
2. Build developer leverage
Predictions for APIs
An API is not just about exposing data; it is the programmatic version of a product, with all the business logic tied to it.
Because of this, there will always be fragmentation in the API world and a need to cover the long tail to break down these silos.
Any questions?
@MichelTricot
slack.airbyte.com (@Michel)
airbytehq/airbyte
Thanks!
Predictive analytics uses historical data to predict future events.
This only works with good data in, good analytics out.
Empower your data teams.
Limitations of current ELT explain the growing need for data engineers.
Only the most popular connectors
They plateau at ~170 connectors and can’t cover the long tail because of maintenance costs and ROI considerations.
Can’t handle custom use cases
Customers can't customize pre-built connectors or create new ones.
Counter-productive row-based pricing
Charging on active rows prevents mid- and high-scale replications (APIs, databases...) and is unpredictable.
Data Engineers need a scalable way to cover all data pipelines
Covers the long tail of connectors
Extensible and non-opinionated to address your exact needs
Fair compute-based pricing
www.airbyte.io
Data infrastructure is huge and growing, but movement is still immature.
CDK to increase developer productivity
Enabling the long tail
○ Connectors as configuration
○ Speed-ups & usability improvements to Connector Acceptance Tests
○ Reducing effort required to specify connector output schemas
○ CDK-level speed-ups in connectors via multi-threading
Developer happiness & reducing friction
○ Seamless M1 support
○ Connector config migrations
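The idea of "connectors as configuration" can be sketched in a few lines. This is not the Airbyte CDK's actual format or API; the config keys (`streams`, `path`, `record_field`), the `read_stream` function, and the `fake_fetch` stand-in for an HTTP call are all hypothetical. The sketch only shows the principle: one generic routine reads any source, and a new connector is mostly a new piece of configuration.

```python
from typing import Any, Callable, Dict, Iterator

# Hypothetical declarative description of a source.
CONNECTOR_CONFIG: Dict[str, Any] = {
    "name": "example-source",
    "streams": [
        {"name": "users", "path": "/users", "record_field": "data"},
        {"name": "invoices", "path": "/invoices", "record_field": "items"},
    ],
}

# A fetch function takes an API path and returns the decoded response body.
Fetch = Callable[[str], Dict[str, Any]]


def read_stream(
    config: Dict[str, Any], stream_name: str, fetch: Fetch
) -> Iterator[Dict[str, Any]]:
    """Generic reader: all source-specific details come from the config."""
    stream = next(s for s in config["streams"] if s["name"] == stream_name)
    response = fetch(stream["path"])
    for record in response.get(stream["record_field"], []):
        yield {"stream": stream["name"], "data": record}


def fake_fetch(path: str) -> Dict[str, Any]:
    # Stand-in for a real HTTP request, so the sketch runs offline.
    responses = {
        "/users": {"data": [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Bo"}]},
        "/invoices": {"items": [{"invoice_id": "inv-1", "total": 99}]},
    }
    return responses[path]


for message in read_stream(CONNECTOR_CONFIG, "users", fake_fetch):
    print(message)
```

Keeping the source-specific parts in data rather than code is one way to lower the maintenance cost per connector, which matters when the goal is to cover the long tail.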
The hardest part of
By enabling the community with the best tooling (CDK)
Nailing maintenance at scale
Appendix
Reverse ETL
[Diagram: Extract, Load, Transform, and Activate stages around a Data Warehouse; outputs include BI/Visualization, ...]
We grew the biggest community around data integration. [updated]
[Chart: GitHub stars, Slack members, and code contributors over time (Oct. to Nov.) for Airbyte compared with Grouparoo, Rudderstack, and Meltano]
“We are past the golden age of Hadoop and Spark”
Topics (notes from our call with event organizers)
*they do want Michel to talk about whatever he thinks is important*
20 min talk + 5 min Q&A
Talking at 10:40am PST on 6/8
Need a slide deck
Michel will be speaking directly after the Keynote speaker (author of Platform Revolution)
Some ideas for the talk:
1. APIs
2. OSS connectors
3. The whole vision
a. Why it makes sense to have OSS connectors
b. Why it makes sense to maintain certain APIs
c. “Airbyte has the community and platform to rule them all”
4. Integration is fragmented
a. History of integrations and types of integrations overview
5. He can do a plug for the maintainer program and ask people to contribute to Airbyte
a. This is the best community/audience to give a call to action to contribute to Airbyte
They really want to hear about Airbyte’s VISION
● Moving data from A to B
● Community led growth
● Long-tail of APIs
● How we see APIs changing and evolving
● Fragmentation in integrations today is a “trillion dollar issue” and Airbyte aims to be the platform to solve it all
Title for the talk: The Evolution of Data Movement
Potential agenda (in order)
*This is the airbyte vision + our thoughts on evolution of data movement
1. API Evolution 1990 → 2000 → Today (cheaper storage, move all data)
2. And now ETL → ELT
3. To solve the long-tail of APIs, you need a Community based approach
4. OSS - why it’s critical for the future of API integrations (and the scalability of it)
5. CDK: Why developer tooling is important (API Specific)
6. Future predictions for APIs?
1890s Data Movement and Analytics
In 1880, prior to computers, it took over seven years for the U.S. Census Bureau to process the collected information and complete a final report. In response, inventor Herman Hollerith produced the “tabulating machine,” which was used in the 1890 census. The tabulating machine could systematically process data recorded on punch cards. With this device, the 1890 census was finished in 18 months.
Interesting read: https://www.dataversity.net/brief-history-analytics/#
What Data Movement looked like in…
1990 2000
“We are years past the golden age of Hadoop and Spark”
Cloud, Warehouses and Lakehouses are taking over the data world.
How Data Movement has changed…
Before Today
[Diagram: sources (Database, Files, API, Spreadsheet) feed Extract & Load into a Warehouse, followed by Transform, serving Database, Files, API, BI, …]

More Related Content

Similar to INTERFACE, by apidays - The Evolution of Data Movement.pdf

Why Docker, Why Now?
Why Docker, Why Now?Why Docker, Why Now?
Why Docker, Why Now?Bret Fisher
 
Comparison of Open Source Frameworks for Integrating the Internet of Things
Comparison of Open Source Frameworks for Integrating the Internet of ThingsComparison of Open Source Frameworks for Integrating the Internet of Things
Comparison of Open Source Frameworks for Integrating the Internet of ThingsKai Wähner
 
The Three Pillars of Agile Integration: Connector, Container & API
The Three Pillars of Agile Integration: Connector, Container & APIThe Three Pillars of Agile Integration: Connector, Container & API
The Three Pillars of Agile Integration: Connector, Container & APIJudy Breedlove
 
Running containers in production, the ING story
Running containers in production, the ING storyRunning containers in production, the ING story
Running containers in production, the ING storyThijs Ebbers
 
IoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStream
IoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStreamIoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStream
IoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStreamgogo6
 
Apache AGE and the synergy effect in the combination of Postgres and NoSQL
 Apache AGE and the synergy effect in the combination of Postgres and NoSQL Apache AGE and the synergy effect in the combination of Postgres and NoSQL
Apache AGE and the synergy effect in the combination of Postgres and NoSQLEDB
 
Data Acquisition Automation for NiFi in a Hybrid Cloud environment – the Path...
Data Acquisition Automation for NiFi in a Hybrid Cloud environment – the Path...Data Acquisition Automation for NiFi in a Hybrid Cloud environment – the Path...
Data Acquisition Automation for NiFi in a Hybrid Cloud environment – the Path...DataWorks Summit
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Getting Multi-chain Web3 data with One Unified API
Getting Multi-chain Web3 data with One Unified APIGetting Multi-chain Web3 data with One Unified API
Getting Multi-chain Web3 data with One Unified APITinaBregovi
 
Alfresco Day Roma 2015: Digital Renaissance
Alfresco Day Roma 2015: Digital RenaissanceAlfresco Day Roma 2015: Digital Renaissance
Alfresco Day Roma 2015: Digital RenaissanceAlfresco Software
 
Platform Requirements for CI/CD Success—and the Enterprises Leading the Way
Platform Requirements for CI/CD Success—and the Enterprises Leading the WayPlatform Requirements for CI/CD Success—and the Enterprises Leading the Way
Platform Requirements for CI/CD Success—and the Enterprises Leading the WayVMware Tanzu
 
2 pc enterprise summit cronin newfinal aug 18
2 pc enterprise summit cronin newfinal aug 182 pc enterprise summit cronin newfinal aug 18
2 pc enterprise summit cronin newfinal aug 18IntelAPAC
 
Axway's Journey to the Cloud
Axway's Journey to the CloudAxway's Journey to the Cloud
Axway's Journey to the CloudAxway
 
TiConf Australia 2013
TiConf Australia 2013TiConf Australia 2013
TiConf Australia 2013Jeff Haynie
 
Airbyte - Series-B deck
Airbyte - Series-B deckAirbyte - Series-B deck
Airbyte - Series-B deckAirbyte
 
Cisco Connect Toronto 2018 DevNet Overview
Cisco Connect Toronto 2018  DevNet OverviewCisco Connect Toronto 2018  DevNet Overview
Cisco Connect Toronto 2018 DevNet OverviewCisco Canada
 
The Environment for Innovation: Tristan Goode, Aptira
The Environment for Innovation: Tristan Goode, AptiraThe Environment for Innovation: Tristan Goode, Aptira
The Environment for Innovation: Tristan Goode, AptiraOpenStack
 
Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Denodo
 
Why Automate the Network?
Why Automate the Network?Why Automate the Network?
Why Automate the Network?Hank Preston
 

Similar to INTERFACE, by apidays - The Evolution of Data Movement.pdf (20)

Why Docker, Why Now?
Why Docker, Why Now?Why Docker, Why Now?
Why Docker, Why Now?
 
Comparison of Open Source Frameworks for Integrating the Internet of Things
Comparison of Open Source Frameworks for Integrating the Internet of ThingsComparison of Open Source Frameworks for Integrating the Internet of Things
Comparison of Open Source Frameworks for Integrating the Internet of Things
 
The Three Pillars of Agile Integration: Connector, Container & API
The Three Pillars of Agile Integration: Connector, Container & APIThe Three Pillars of Agile Integration: Connector, Container & API
The Three Pillars of Agile Integration: Connector, Container & API
 
Running containers in production, the ING story
Running containers in production, the ING storyRunning containers in production, the ING story
Running containers in production, the ING story
 
IoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStream
IoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStreamIoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStream
IoT Meets Big Data: The Opportunities and Challenges by Syed Hoda of ParStream
 
Apache AGE and the synergy effect in the combination of Postgres and NoSQL
 Apache AGE and the synergy effect in the combination of Postgres and NoSQL Apache AGE and the synergy effect in the combination of Postgres and NoSQL
Apache AGE and the synergy effect in the combination of Postgres and NoSQL
 
Data Acquisition Automation for NiFi in a Hybrid Cloud environment – the Path...
Data Acquisition Automation for NiFi in a Hybrid Cloud environment – the Path...Data Acquisition Automation for NiFi in a Hybrid Cloud environment – the Path...
Data Acquisition Automation for NiFi in a Hybrid Cloud environment – the Path...
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Getting Multi-chain Web3 data with One Unified API
Getting Multi-chain Web3 data with One Unified APIGetting Multi-chain Web3 data with One Unified API
Getting Multi-chain Web3 data with One Unified API
 
Netflix MSA and Pivotal
Netflix MSA and PivotalNetflix MSA and Pivotal
Netflix MSA and Pivotal
 
Alfresco Day Roma 2015: Digital Renaissance
Alfresco Day Roma 2015: Digital RenaissanceAlfresco Day Roma 2015: Digital Renaissance
Alfresco Day Roma 2015: Digital Renaissance
 
Platform Requirements for CI/CD Success—and the Enterprises Leading the Way
Platform Requirements for CI/CD Success—and the Enterprises Leading the WayPlatform Requirements for CI/CD Success—and the Enterprises Leading the Way
Platform Requirements for CI/CD Success—and the Enterprises Leading the Way
 
2 pc enterprise summit cronin newfinal aug 18
2 pc enterprise summit cronin newfinal aug 182 pc enterprise summit cronin newfinal aug 18
2 pc enterprise summit cronin newfinal aug 18
 
Axway's Journey to the Cloud
Axway's Journey to the CloudAxway's Journey to the Cloud
Axway's Journey to the Cloud
 
TiConf Australia 2013
TiConf Australia 2013TiConf Australia 2013
TiConf Australia 2013
 
Airbyte - Series-B deck
Airbyte - Series-B deckAirbyte - Series-B deck
Airbyte - Series-B deck
 
Cisco Connect Toronto 2018 DevNet Overview
Cisco Connect Toronto 2018  DevNet OverviewCisco Connect Toronto 2018  DevNet Overview
Cisco Connect Toronto 2018 DevNet Overview
 
The Environment for Innovation: Tristan Goode, Aptira
The Environment for Innovation: Tristan Goode, AptiraThe Environment for Innovation: Tristan Goode, Aptira
The Environment for Innovation: Tristan Goode, Aptira
 
Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)
 
Why Automate the Network?
Why Automate the Network?Why Automate the Network?
Why Automate the Network?
 

More from apidays

apidays Australia 2023 - A programmatic approach to API success including Ope...
apidays Australia 2023 - A programmatic approach to API success including Ope...apidays Australia 2023 - A programmatic approach to API success including Ope...
apidays Australia 2023 - A programmatic approach to API success including Ope...apidays
 
apidays Singapore 2023 - Addressing the Data Gap, Jerome Eger, Smile API
apidays Singapore 2023 - Addressing the Data Gap, Jerome Eger, Smile APIapidays Singapore 2023 - Addressing the Data Gap, Jerome Eger, Smile API
apidays Singapore 2023 - Addressing the Data Gap, Jerome Eger, Smile APIapidays
 
apidays Singapore 2023 - Iterate Faster with Dynamic Flows, Yee Hui Poh, Wise
apidays Singapore 2023 - Iterate Faster with Dynamic Flows, Yee Hui Poh, Wiseapidays Singapore 2023 - Iterate Faster with Dynamic Flows, Yee Hui Poh, Wise
apidays Singapore 2023 - Iterate Faster with Dynamic Flows, Yee Hui Poh, Wiseapidays
 
apidays Singapore 2023 - Banking the Ecosystem, Apurv Suri, SC Ventures
apidays Singapore 2023 - Banking the Ecosystem, Apurv Suri, SC Venturesapidays Singapore 2023 - Banking the Ecosystem, Apurv Suri, SC Ventures
apidays Singapore 2023 - Banking the Ecosystem, Apurv Suri, SC Venturesapidays
 
apidays Singapore 2023 - Digitalising agreements with data, design & technolo...
apidays Singapore 2023 - Digitalising agreements with data, design & technolo...apidays Singapore 2023 - Digitalising agreements with data, design & technolo...
apidays Singapore 2023 - Digitalising agreements with data, design & technolo...apidays
 
apidays Singapore 2023 - Building a digital-first investment management model...
apidays Singapore 2023 - Building a digital-first investment management model...apidays Singapore 2023 - Building a digital-first investment management model...
apidays Singapore 2023 - Building a digital-first investment management model...apidays
 
apidays Singapore 2023 - Changing the culture of building software, Aman Dham...
apidays Singapore 2023 - Changing the culture of building software, Aman Dham...apidays Singapore 2023 - Changing the culture of building software, Aman Dham...
apidays Singapore 2023 - Changing the culture of building software, Aman Dham...apidays
 
apidays Singapore 2023 - Connecting the trade ecosystem, CHOO Wai Yee, Singap...
apidays Singapore 2023 - Connecting the trade ecosystem, CHOO Wai Yee, Singap...apidays Singapore 2023 - Connecting the trade ecosystem, CHOO Wai Yee, Singap...
apidays Singapore 2023 - Connecting the trade ecosystem, CHOO Wai Yee, Singap...apidays
 
apidays Singapore 2023 - Beyond REST, Claudio Tag, IBM
apidays Singapore 2023 - Beyond REST, Claudio Tag, IBMapidays Singapore 2023 - Beyond REST, Claudio Tag, IBM
apidays Singapore 2023 - Beyond REST, Claudio Tag, IBMapidays
 
apidays Singapore 2023 - Securing and protecting our digital way of life, Ver...
apidays Singapore 2023 - Securing and protecting our digital way of life, Ver...apidays Singapore 2023 - Securing and protecting our digital way of life, Ver...
apidays Singapore 2023 - Securing and protecting our digital way of life, Ver...apidays
 
apidays Singapore 2023 - State of the API Industry, Manjunath Bhat, Gartner
apidays Singapore 2023 - State of the API Industry, Manjunath Bhat, Gartnerapidays Singapore 2023 - State of the API Industry, Manjunath Bhat, Gartner
apidays Singapore 2023 - State of the API Industry, Manjunath Bhat, Gartnerapidays
 
apidays Australia 2023 - Curb your Enthusiasm:Sustainable Scaling of APIs, Sa...
apidays Australia 2023 - Curb your Enthusiasm:Sustainable Scaling of APIs, Sa...apidays Australia 2023 - Curb your Enthusiasm:Sustainable Scaling of APIs, Sa...
apidays Australia 2023 - Curb your Enthusiasm:Sustainable Scaling of APIs, Sa...apidays
 
Apidays Paris 2023 - API Security Challenges for Cloud-native Software Archit...
Apidays Paris 2023 - API Security Challenges for Cloud-native Software Archit...Apidays Paris 2023 - API Security Challenges for Cloud-native Software Archit...
Apidays Paris 2023 - API Security Challenges for Cloud-native Software Archit...apidays
 
Apidays Paris 2023 - State of Tech Sustainability 2023, Gaël Duez, Green IO
Apidays Paris 2023 - State of Tech Sustainability 2023, Gaël Duez, Green IOApidays Paris 2023 - State of Tech Sustainability 2023, Gaël Duez, Green IO
Apidays Paris 2023 - State of Tech Sustainability 2023, Gaël Duez, Green IOapidays
 
Apidays Paris 2023 - 7 Mistakes When Putting In Place An API Program, Francoi...
Apidays Paris 2023 - 7 Mistakes When Putting In Place An API Program, Francoi...Apidays Paris 2023 - 7 Mistakes When Putting In Place An API Program, Francoi...
Apidays Paris 2023 - 7 Mistakes When Putting In Place An API Program, Francoi...apidays
 
Apidays Paris 2023 - Building APIs That Developers Love: Feedback Collection ...
Apidays Paris 2023 - Building APIs That Developers Love: Feedback Collection ...Apidays Paris 2023 - Building APIs That Developers Love: Feedback Collection ...
Apidays Paris 2023 - Building APIs That Developers Love: Feedback Collection ...apidays
 
Apidays Paris 2023 - Product Managers and API Documentation, Gareth Faull, Lo...
Apidays Paris 2023 - Product Managers and API Documentation, Gareth Faull, Lo...Apidays Paris 2023 - Product Managers and API Documentation, Gareth Faull, Lo...
Apidays Paris 2023 - Product Managers and API Documentation, Gareth Faull, Lo...apidays
 
Apidays Paris 2023 - How to use NoCode as a Microservice, Benjamin Buléon and...
Apidays Paris 2023 - How to use NoCode as a Microservice, Benjamin Buléon and...Apidays Paris 2023 - How to use NoCode as a Microservice, Benjamin Buléon and...
Apidays Paris 2023 - How to use NoCode as a Microservice, Benjamin Buléon and...apidays
 
Apidays Paris 2023 - Boosting Event-Driven Development with AsyncAPI and Micr...
Apidays Paris 2023 - Boosting Event-Driven Development with AsyncAPI and Micr...Apidays Paris 2023 - Boosting Event-Driven Development with AsyncAPI and Micr...
Apidays Paris 2023 - Boosting Event-Driven Development with AsyncAPI and Micr...apidays
 
Apidays Paris 2023 - API Observability: Improving Governance, Security and Op...
Apidays Paris 2023 - API Observability: Improving Governance, Security and Op...Apidays Paris 2023 - API Observability: Improving Governance, Security and Op...
Apidays Paris 2023 - API Observability: Improving Governance, Security and Op...apidays
 

More from apidays (20)

apidays Australia 2023 - A programmatic approach to API success including Ope...
apidays Australia 2023 - A programmatic approach to API success including Ope...apidays Australia 2023 - A programmatic approach to API success including Ope...
apidays Australia 2023 - A programmatic approach to API success including Ope...
 
apidays Singapore 2023 - Addressing the Data Gap, Jerome Eger, Smile API
apidays Singapore 2023 - Addressing the Data Gap, Jerome Eger, Smile APIapidays Singapore 2023 - Addressing the Data Gap, Jerome Eger, Smile API
apidays Singapore 2023 - Addressing the Data Gap, Jerome Eger, Smile API
 
apidays Singapore 2023 - Iterate Faster with Dynamic Flows, Yee Hui Poh, Wise
apidays Singapore 2023 - Iterate Faster with Dynamic Flows, Yee Hui Poh, Wiseapidays Singapore 2023 - Iterate Faster with Dynamic Flows, Yee Hui Poh, Wise
apidays Singapore 2023 - Iterate Faster with Dynamic Flows, Yee Hui Poh, Wise
 
apidays Singapore 2023 - Banking the Ecosystem, Apurv Suri, SC Ventures
apidays Singapore 2023 - Banking the Ecosystem, Apurv Suri, SC Venturesapidays Singapore 2023 - Banking the Ecosystem, Apurv Suri, SC Ventures
apidays Singapore 2023 - Banking the Ecosystem, Apurv Suri, SC Ventures
 
apidays Singapore 2023 - Digitalising agreements with data, design & technolo...
apidays Singapore 2023 - Digitalising agreements with data, design & technolo...apidays Singapore 2023 - Digitalising agreements with data, design & technolo...
apidays Singapore 2023 - Digitalising agreements with data, design & technolo...
 
apidays Singapore 2023 - Building a digital-first investment management model...
apidays Singapore 2023 - Building a digital-first investment management model...apidays Singapore 2023 - Building a digital-first investment management model...
apidays Singapore 2023 - Building a digital-first investment management model...
 
apidays Singapore 2023 - Changing the culture of building software, Aman Dham...
apidays Singapore 2023 - Changing the culture of building software, Aman Dham...apidays Singapore 2023 - Changing the culture of building software, Aman Dham...
apidays Singapore 2023 - Changing the culture of building software, Aman Dham...
 
apidays Singapore 2023 - Connecting the trade ecosystem, CHOO Wai Yee, Singap...
apidays Singapore 2023 - Connecting the trade ecosystem, CHOO Wai Yee, Singap...apidays Singapore 2023 - Connecting the trade ecosystem, CHOO Wai Yee, Singap...
apidays Singapore 2023 - Connecting the trade ecosystem, CHOO Wai Yee, Singap...
 
apidays Singapore 2023 - Beyond REST, Claudio Tag, IBM
apidays Singapore 2023 - Beyond REST, Claudio Tag, IBMapidays Singapore 2023 - Beyond REST, Claudio Tag, IBM
apidays Singapore 2023 - Beyond REST, Claudio Tag, IBM
 
apidays Singapore 2023 - Securing and protecting our digital way of life, Ver...
apidays Singapore 2023 - Securing and protecting our digital way of life, Ver...apidays Singapore 2023 - Securing and protecting our digital way of life, Ver...
apidays Singapore 2023 - Securing and protecting our digital way of life, Ver...
 
apidays Singapore 2023 - State of the API Industry, Manjunath Bhat, Gartner
apidays Singapore 2023 - State of the API Industry, Manjunath Bhat, Gartnerapidays Singapore 2023 - State of the API Industry, Manjunath Bhat, Gartner
apidays Singapore 2023 - State of the API Industry, Manjunath Bhat, Gartner
 
apidays Australia 2023 - Curb your Enthusiasm:Sustainable Scaling of APIs, Sa...
apidays Australia 2023 - Curb your Enthusiasm:Sustainable Scaling of APIs, Sa...apidays Australia 2023 - Curb your Enthusiasm:Sustainable Scaling of APIs, Sa...
apidays Australia 2023 - Curb your Enthusiasm:Sustainable Scaling of APIs, Sa...
 
Apidays Paris 2023 - API Security Challenges for Cloud-native Software Archit...
Apidays Paris 2023 - API Security Challenges for Cloud-native Software Archit...Apidays Paris 2023 - API Security Challenges for Cloud-native Software Archit...
Apidays Paris 2023 - API Security Challenges for Cloud-native Software Archit...
 
Apidays Paris 2023 - State of Tech Sustainability 2023, Gaël Duez, Green IO
Apidays Paris 2023 - State of Tech Sustainability 2023, Gaël Duez, Green IOApidays Paris 2023 - State of Tech Sustainability 2023, Gaël Duez, Green IO
Apidays Paris 2023 - State of Tech Sustainability 2023, Gaël Duez, Green IO
 
Apidays Paris 2023 - 7 Mistakes When Putting In Place An API Program, Francoi...
Apidays Paris 2023 - 7 Mistakes When Putting In Place An API Program, Francoi...Apidays Paris 2023 - 7 Mistakes When Putting In Place An API Program, Francoi...
Apidays Paris 2023 - 7 Mistakes When Putting In Place An API Program, Francoi...
 
Apidays Paris 2023 - Building APIs That Developers Love: Feedback Collection ...
Apidays Paris 2023 - Building APIs That Developers Love: Feedback Collection ...Apidays Paris 2023 - Building APIs That Developers Love: Feedback Collection ...
Apidays Paris 2023 - Building APIs That Developers Love: Feedback Collection ...
 
Apidays Paris 2023 - Product Managers and API Documentation, Gareth Faull, Lo...
Apidays Paris 2023 - Product Managers and API Documentation, Gareth Faull, Lo...Apidays Paris 2023 - Product Managers and API Documentation, Gareth Faull, Lo...
Apidays Paris 2023 - Product Managers and API Documentation, Gareth Faull, Lo...
 
Apidays Paris 2023 - How to use NoCode as a Microservice, Benjamin Buléon and...
Apidays Paris 2023 - How to use NoCode as a Microservice, Benjamin Buléon and...Apidays Paris 2023 - How to use NoCode as a Microservice, Benjamin Buléon and...
Apidays Paris 2023 - How to use NoCode as a Microservice, Benjamin Buléon and...
 
Apidays Paris 2023 - Boosting Event-Driven Development with AsyncAPI and Micr...
Apidays Paris 2023 - Boosting Event-Driven Development with AsyncAPI and Micr...Apidays Paris 2023 - Boosting Event-Driven Development with AsyncAPI and Micr...
Apidays Paris 2023 - Boosting Event-Driven Development with AsyncAPI and Micr...
 
Apidays Paris 2023 - API Observability: Improving Governance, Security and Op...
Apidays Paris 2023 - API Observability: Improving Governance, Security and Op...Apidays Paris 2023 - API Observability: Improving Governance, Security and Op...
Apidays Paris 2023 - API Observability: Improving Governance, Security and Op...
 

Recently uploaded

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Unraveling Multimodality with Large Language Models.pdf

INTERFACE, by apidays - The Evolution of Data Movement.pdf

  • 1. The evolution of data movement.
  • 2. 2022 SERIES OF EVENTS New York JULY (HYBRID) Australia SEPTEMBER (HYBRID) Singapore APRIL (VIRTUAL) Helsinki & North MARCH (VIRTUAL) Paris DECEMBER (HYBRID) London OCTOBER (HYBRID) Hong Kong AUGUST (VIRTUAL) JUNE (VIRTUAL) India MAY (VIRTUAL) APRIL (VIRTUAL) Dubai & Middle East JUNE (VIRTUAL) Check out our API Conferences www.apidays.global Want to talk at one of our conferences? apidays.typeform.com/to/ILJeAaV8
  • 3. Airbyte Open-Source data integration 30,000 Deployments 7,900 Slack members 7,000 GitHub stars Hello! I am Michel Tricot Co-Founder & CEO of Airbyte @MichelTricot michel-tricot /in/micheltricot
  • 4. How Data Movement has changed… Before Today
  • 5. The rise of the Cloud compute era 1. Exponential growth in the amount of data sources and data 2. Plummeting cost of cloud-based computation and storage ➡ Data consumption model has changed
  • 6. APIs are ubiquitous ➡ Data access model has changed 1. APIs are both a product and a datastore 2. Data is siloed and access has become a key challenge
  • 7. Extract - Load - Transform A new paradigm for modern teams ELT is replacing ETL
  • 8. Extract Source-specific routines to pull selected data from an external system. Transform Business logic specific to your organization to serve an analytics or operational use case. Load Destination specific routines to push data where it is going to be consumed.
  • 9. ETL doesn’t work in today’s world Inflexible ● Friction when changing an existing pipeline. ● Hard to add new data. ● Most issues force data to be re-extracted. Lack of Autonomy ● Warehouses made data consumers more autonomous. ● Changes require engineering involvement. Complex ● Custom DSL. ● Force adoption of a data stack. ● Address 70% of the needs, 30% still built and maintained in-house.
  • 10. Extract General-purpose routines to pull selected data from a source. Load General-purpose routines to push raw data where it is going to be consumed. Transform Business logic specific to your organization to serve an analytics or operational use case with SQL / dbt / ...
  • 11. ELT fixes the ETL-related issues Flexibility ● All the data available on the destination. ● Data consumers are free to use what they need for the insights they want. Autonomy ● Data consumers can leverage SQL queries to transform the data the way they want. ● No need to involve the engineering team. Future proof ● Issues during transformation don’t prevent access to the data. ● Easy to update transformation schemas.
  • 12. What about the long-tail of APIs? 1,000's of new apps/APIs emerging every year ➡ Data is more and more fragmented ➡ Rising need to break down data silos
  • 13. Open-source communities solve the long-tail of APIs 1. Don’t reinvent the wheel, leverage existing connectors 2. Share the work of maintenance across a community OSS is the only way to solve data integration
  • 14. Developer tooling is crucial We empower people to build good connectors with the Airbyte CDK 1. Offer developers tools 2. Build developer leverage
  • 15. Predictions for APIs An API is not just about exposing data, it is the programmatic version of a product with all the business logic that ties to it. Because of it, there will always be fragmentation in the API world and the need to cover the long tail to break down these silos.
  • 17. Predictive analytics uses historical data to predict future events. The only way this works is good data in, good analytics out.
  • 19. Limitations of current ELT explain the growing need for data engineers. Only the most popular connectors: they plateau at ~170 connectors and can’t cover the long tail because of maintenance costs and ROI considerations. Can’t handle custom use cases: customers can't customize pre-built connectors, nor create new ones. Counter-productive row-based pricing: charging on active rows prevents mid- and high-scale replications (APIs, databases...) and is unpredictable.
  • 20. Data Engineers need a scalable way to cover all data pipelines Covers the long tail of connectors Extensible and non-opinionated to address your exact needs A fair compute-based pricing
  • 21. www.airbyte.io Data infrastructure is huge and growing, but movement is still immature.
  • 22. www.airbyte.io CDK to increase developer productivity Enabling the long tail ○ Connectors as configuration ○ Speed ups & usability improvements to Connector Acceptance Tests ○ Reducing effort required to specify connector output schemas ○ CDK-level speed ups in connectors via multi-threading Developer happiness & reducing friction ○ Seamless M1 support ○ Connector config migrations
  • 23. The hardest part of By enabling the community with the best tooling (CDK) Nailing maintenance at scale
  • 25. Reverse ETL [Diagram labels: Data Warehouse, Extract, Load, Transform, Activate, BI/Visualization]
  • 26. We grew the biggest community around data integration. [Charts: GitHub stars, Slack members, and code contributors over time (Oct. to Nov.), comparing Airbyte with Grouparoo, Rudderstack, and Meltano]
  • 27. “We are past the golden age of Hadoop and Spark”
  • 28. Topics (notes from our call with event organizers). They do want Michel to talk about whatever he thinks is important. 20 min talk + 5 min Q&A. Talking at 10:40am PST on 6/8. Need a slide deck. Michel will be speaking directly after the keynote speaker (author of Platform Revolution). Some ideas for the talk: 1. APIs 2. OSS connectors 3. The whole vision a. Why it makes sense to have OSS connectors b. Why it makes sense to maintain certain APIs c. “Airbyte has the community and platform to rule them all” 4. Integration is fragmented a. History of integrations and overview of the types of integrations 5. He can do a plug for the maintainer program and ask people to contribute to Airbyte a. This is the best community/audience to give a call to action to contribute to Airbyte. They really want to hear about Airbyte’s vision: ● Moving data from A to B ● Community-led growth ● Long tail of APIs ● How we see APIs changing and evolving ● Fragmentation in integrations today is a “trillion dollar issue” and Airbyte aims to be the platform to solve it all. Title for the talk: The Evolution of Data Movement
  • 29. Potential agenda (in order). This is the Airbyte vision + our thoughts on the evolution of data movement. 1. API evolution: 1990 → 2000 → Today (cheaper storage, move all data) 2. And now ETL → ELT 3. To solve the long tail of APIs, you need a community-based approach 4. OSS: why it’s critical for the future of API integrations (and their scalability) 5. CDK: why developer tooling is important (API-specific) 6. Future predictions for APIs?
  • 30. 1890s Data Movement and Analytics. In 1880, prior to computers, it took over seven years for the U.S. Census Bureau to process the collected information and complete a final report. In response, inventor Herman Hollerith produced the “tabulating machine,” which was used in the 1890 census. The tabulating machine could systematically process data recorded on punch cards. With this device, the 1890 census was finished in 18 months. Interesting read: https://www.dataversity.net/brief-history-analytics/#
  • 31. What Data Movement looked like in… 1990, 2000
  • 32. “We are years past the golden age of Hadoop and Spark” Cloud, Warehouses and Lakehouses are taking over the data world.
  • 33. How Data Movement has changed… Before, Today [Diagram labels: Database, Files, API, Spreadsheet, Extract & Load, Warehouse, Transform, Database, Files, API, BI, …]
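
The Extract, Load, Transform split described in slides 10 and 11 can be illustrated with a minimal sketch. It is not Airbyte code and is not taken from the talk. It assumes an in-memory SQLite database standing in for the warehouse, and the records, table name, and view name (`RAW_RECORDS`, `raw_invoices`, `paid_by_customer`) are hypothetical. The point it tries to show is that raw data lands in the destination untouched, and the business logic lives in SQL that can be changed later without re-extracting.

```python
import sqlite3

# Hypothetical records standing in for an API response (illustration only).
RAW_RECORDS = [
    {"id": 1, "customer": "acme", "amount_cents": 1250, "status": "paid"},
    {"id": 2, "customer": "acme", "amount_cents": 300, "status": "refunded"},
    {"id": 3, "customer": "globex", "amount_cents": 990, "status": "paid"},
]


def extract():
    """Extract: pull records from the source without reshaping them."""
    return list(RAW_RECORDS)


def load(conn, records):
    """Load: push the raw records into the destination as they are."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_invoices "
        "(id INTEGER, customer TEXT, amount_cents INTEGER, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_invoices VALUES (:id, :customer, :amount_cents, :status)",
        records,
    )


def transform(conn):
    """Transform: business logic expressed in SQL inside the destination."""
    conn.execute(
        "CREATE VIEW IF NOT EXISTS paid_by_customer AS "
        "SELECT customer, SUM(amount_cents) AS paid_cents "
        "FROM raw_invoices WHERE status = 'paid' GROUP BY customer"
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for a warehouse
    load(conn, extract())
    transform(conn)
    query = "SELECT customer, paid_cents FROM paid_by_customer ORDER BY customer"
    for row in conn.execute(query):
        print(row)
```

Because the refunded invoice is still present in `raw_invoices`, a data consumer could later define a different view over the same table (for example, one counting refunds) without touching the extract or load steps.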