ING was one of the early adopters of the DevOps movement. Today there is a lot of expertise in the organization: the way of working, the tools, and HR are all geared to DevOps. In the Analytics area, these best practices became the basis of a modern and stable architecture in which data engineers, operations, and data scientists work together with business people on a daily basis. The technology stack includes Hadoop, Spark, Flink, Kafka, Cassandra, and several IBM tools. In this talk I will share how the tools evolved and which skills and processes are in place. The second part covers two use cases.
DevOps at ING Analytics: combining data engineering with data operations - Giuseppe D'alessio and Taco Bakker - Codemotion Amsterdam 2018
4. Bio
• Italian living in the Netherlands
• SW Engineering, Machine Learning & Pattern Recognition
Work
• Chapter Lead Fast Data 2
• Engineer @ Squad PIE – 1:1 Analytics
Giuseppe d’Alessio
https://nl.linkedin.com/in/giuseppedalessio
@peppeweb
5. Taco Bakker
Find me on: www.tsbakker.nl Twitter: @tsbakker65
LinkedIn: https://www.linkedin.com/in/taco-bakker-9846b12/
Taco Bakker
a.i. Area Lead
Continuous Delivery
ING
Me
Master of Computer Science
University of Amsterdam
Lean Six Sigma Black Belt
>5 years of experience in Agile Scrum, DevOps and Continuous Delivery
6. ING is a top financial enterprise, operating since 1881
Customers
33 Million
Private, Corporate and
Institutional Customers
Countries
41
In Europe, Asia,
Australia, North and
South America
Employees
52,000
Market leaders Benelux
Growth markets
Commercial Banking
Challengers
9. Use your data to improve your customer experience
Or someone else will do it …and take your customers
Data is eating the World
10. The best way to use your data
is to apply Streaming Analytics
11. Streaming Analytics enables us to detect patterns in real time, and
respond to events for customers’ benefit
Secure and reliable
Relevant
Personal
Omnichannel
Predictive
Actionable
17. But we quickly found the way to do it
Dev and Ops working as one team. Everybody has the same goals
18. DevOps is a new way to look at IT
IT is a cost centre
IT delivers customer value
19. What does it mean to adopt DevOps?
Tools
Culture
Collaboration
Organization
20. In ING’s one way of working, ‘business’ and ‘IT’ go hand-in-hand
[Organization diagram: Tribe, Squad, Chapter, Guild and Agile Coach; squad member roles labelled DS, DA, CJE, Dev and Ops]
30. Streaming Data Platform
[Architecture diagram; labels recoverable from the slide:]
• Components: CEP Engine, Machine Learning Engine, Post-Processor
• Business Flow: “detect pattern”, “determine relevant notifications”, “produce notification”
• Kafka Events: Raw Event, Business Event, Intermediate Event, Notification Event
• Application Flow: Get Raw Events, Detect Pattern, Create Business Event, Get Business Event, Get Customer Profile, Apply Selection Criteria, Score Notifications, Create Intermed Event, Get Intermed Event, Format Event, Send Output
• Data storage: Customer Profiles, Notification Definitions, Models
• System configuration: Business Users, Configuration GUI (future scope), Data Scientists, Machine Learning Environment
• Data Lake, Models, Feedback loop
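The flow on this slide can be read as three stages chained by event topics. The following is a minimal sketch in plain Python, not the platform's actual code: Python iterables stand in for Kafka topics, dictionaries stand in for the data stores, and every name, field, threshold and scoring rule (`cep_engine`, `ml_engine`, `post_processor`, `score`, the 1000 EUR rule) is a hypothetical placeholder chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the data stores named on the slide.
CUSTOMER_PROFILES = {"c1": {"opted_in": True}, "c2": {"opted_in": False}}
NOTIFICATION_DEFINITIONS = {"large_withdrawal": "You withdrew {amount} EUR."}


@dataclass
class Event:
    kind: str        # "raw", "business", "intermediate" or "notification"
    customer: str
    payload: dict


def cep_engine(raw_events):
    """'Detect pattern': turn matching raw events into business events."""
    for ev in raw_events:
        # Assumed rule: a withdrawal of 1000 or more is worth flagging.
        if ev.payload.get("type") == "withdrawal" and ev.payload["amount"] >= 1000:
            yield Event("business", ev.customer,
                        {"pattern": "large_withdrawal", **ev.payload})


def score(profile, business_event):
    """Placeholder for a trained model: 1.0 if the customer opted in, else 0.0."""
    return 1.0 if profile["opted_in"] else 0.0


def ml_engine(business_events, threshold=0.5):
    """'Determine relevant notifications' using the customer profile."""
    for ev in business_events:
        profile = CUSTOMER_PROFILES[ev.customer]
        if score(profile, ev) >= threshold:
            yield Event("intermediate", ev.customer, ev.payload)


def post_processor(intermediate_events):
    """'Produce notification': format the event for output."""
    for ev in intermediate_events:
        template = NOTIFICATION_DEFINITIONS[ev.payload["pattern"]]
        yield Event("notification", ev.customer,
                    {"text": template.format(**ev.payload)})


raw = [
    Event("raw", "c1", {"type": "withdrawal", "amount": 1500}),
    Event("raw", "c1", {"type": "withdrawal", "amount": 20}),
    Event("raw", "c2", {"type": "withdrawal", "amount": 5000}),
]
notifications = list(post_processor(ml_engine(cep_engine(raw))))
```

Only the first event survives all three stages: the second does not match the pattern, and the third is filtered out by the placeholder score. Chaining generators keeps per-event processing visible, which is the property the slide attributes to the streaming setup.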
31. What is Flink, and why use it?
Flink:
• is an open source framework for distributed, in-memory (big) data analytics
• supports Java, Scala, and (to a lesser extent) Python
• offers several APIs: for streams (DataStream), batch (DataSet), and relational data (Table API)
Benefits:
• true streaming: per-event processing, no micro-batching
• high volumes, low latency
Features:
• state management and fault tolerance: savepoints, checkpoints, exactly-once semantics
• time windowing: event time, flexible windows (e.g. sum all transactions in the past minute)
• Complex Event Processing, Machine Learning, SQL (at various levels of maturity)
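To make the event-time windowing example ("sum all transactions in the past minute") concrete, here is a minimal sketch in plain Python. It is not the Flink DataStream API: it uses fixed (tumbling) one-minute windows as a simplification, ignores watermarks and late data, and the record layout and the names `window_start` and `sum_per_window` are assumptions made for illustration.

```python
from collections import defaultdict

WINDOW_MS = 60_000  # one-minute tumbling windows


def window_start(event_time_ms, size_ms=WINDOW_MS):
    """Start of the tumbling window that contains the given event time."""
    return event_time_ms - (event_time_ms % size_ms)


def sum_per_window(transactions):
    """Sum amounts per (account, window), keyed by event time, not arrival order."""
    totals = defaultdict(float)
    for account, event_time_ms, amount in transactions:
        totals[(account, window_start(event_time_ms))] += amount
    return dict(totals)


# (account, event time in ms, amount); the third record arrives out of order.
transactions = [
    ("acc-1", 5_000, 10.0),
    ("acc-1", 65_000, 7.5),
    ("acc-1", 59_000, 2.5),
    ("acc-2", 61_000, 4.0),
]
totals = sum_per_window(transactions)
```

Because the window is derived from the timestamp carried by the event, the out-of-order record still lands in the first minute of `acc-1`, which is the point of event time as opposed to processing time.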
32. Complex Event Processing with FlinkCEP
• Allows detection of event patterns in data streams
• In contrast to a traditional DBMS, where a query is executed on stored data, CEP executes data on a stored query.
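The "stored query" idea can be illustrated with a small sketch in plain Python. This is not FlinkCEP, whose Pattern API is offered in Java and Scala. The class `FollowedBy`, the event kinds and the 60-second bound are hypothetical, and the sketch assumes events arrive in time order.

```python
class FollowedBy:
    """Stored query: `first` followed by `second` for the same key within `within_ms`."""

    def __init__(self, first, second, within_ms):
        self.first, self.second, self.within_ms = first, second, within_ms
        self.pending = {}  # key -> event time of the latest `first` event

    def on_event(self, key, kind, time_ms):
        """Feed one event through the query; return a match tuple or None."""
        if kind == self.first:
            self.pending[key] = time_ms
            return None
        if kind == self.second and key in self.pending:
            started = self.pending.pop(key)
            if time_ms - started <= self.within_ms:
                return (key, started, time_ms)
        return None


# The query is registered once; the data then flows through it event by event.
query = FollowedBy("password_reset", "large_transfer", within_ms=60_000)

stream = [
    ("c1", "password_reset", 1_000),
    ("c2", "large_transfer", 2_000),    # no preceding reset: no match
    ("c1", "large_transfer", 30_000),   # within 60 s of the reset: match
    ("c3", "password_reset", 40_000),
    ("c3", "large_transfer", 200_000),  # too late: no match
]
matches = [m for m in (query.on_event(*ev) for ev in stream) if m is not None]
```

The query object holds only the partial-match state it needs (one timestamp per key), while the events themselves are never stored, which mirrors the contrast with a DBMS drawn on the slide.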
34. Working the DevOps way really makes the difference in Data Analytics
• Smarter and Faster solutions for customers
• Better Knowledge sharing & learning
• More IT Quality
Conclusions