© Hortonworks Inc. 2011 – 2016. All Rights Reserved
Empowering Self-Organising Teams with Apache NiFi
Sebastian Carroll
Sr Consultant @ HWX
What is NiFi?
The Data-Movement Production Line
Data Movement
Production Line
Pop Quiz!
The Self-Organising Team
Intra-team Improvements
The Problem
⬢ Traditional data movement frameworks are usually large, expensive and strictly controlled by change management
⬢ This leaves teams unable to control their own environment
⬢ Teams cannot change quickly
[Diagram labels: Core, Data Science, Supply Chain, Website, Online, In Store]
S2S
Direct Connection and Intuitive UI
⬢ Team to Team - No Core
⬢ Individual to Individual
⬢ Not just techies
⬢ Productionable
⬢ Immediate
Not just S2S!
Many choices!
Quickly work with live data
⬢ Test integration points, before developing
⬢ Quickly profile data
– Encrypted?
– Compressed?
– CSV/JSON/XML?
– Seasonal?
– Volume?
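The first three profiling questions can often be answered from a single sample payload downloaded from a queue. The sketch below is one rough way to do that in plain Python. It is not part of NiFi, and the function names `shannon_entropy` and `profile_payload`, the gzip-only compression check and the 7.5 bits-per-byte threshold are all assumptions made for illustration. Seasonality and volume cannot be read from one sample and need observation over time.

```python
import gzip
import json
import math
import zlib
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return bits per byte; values close to 8 suggest encrypted or compressed bytes."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())


def profile_payload(data: bytes) -> dict:
    """Make a rough guess at compression, encryption and text format (hypothetical helper)."""
    report = {"bytes": len(data), "compressed": False,
              "maybe_encrypted": False, "format": "unknown"}

    # gzip streams start with the magic bytes 0x1f 0x8b.
    if data[:2] == b"\x1f\x8b":
        report["compressed"] = True
        try:
            data = gzip.decompress(data)
        except (OSError, EOFError, zlib.error):
            return report  # looked like gzip but could not be unpacked

    try:
        text = data.decode("utf-8").strip()
    except UnicodeDecodeError:
        # Undecodable, high-entropy bytes are a hint (not proof) of encryption.
        report["maybe_encrypted"] = shannon_entropy(data) > 7.5
        return report

    if text.startswith("<"):
        report["format"] = "xml"
    elif text.startswith(("{", "[")):
        try:
            json.loads(text)
            report["format"] = "json"
        except json.JSONDecodeError:
            pass
    elif text and "," in text.splitlines()[0]:
        report["format"] = "csv"
    return report


if __name__ == "__main__":
    print(profile_payload(b"id,name\n1,x\n"))
```

The checks are deliberately cheap heuristics: they are meant to answer "what am I looking at?" in minutes, before any development starts, and not to replace proper schema detection.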
Adobe Clickstream - 0 to HDFS
I’ve done this for a customer:
⬢ Got only the necessary details - transport, location and credentials
⬢ Connected up to NiFi
⬢ Saw what data was there
⬢ Landed in HDFS
⬢ How long? 1 Day from 0 to HDFS
Next piece of value - make it queryable in Hive
Incremental Improvement of Pipelines
Inter-team Improvements
How do I get here?
All very nice! But...
⬢ Multi-team changes are difficult
⬢ Core changes are difficult
⬢ Too much risk to change everyone at once
So, what do we do?
Using NiFi we can deliver small changes regularly!
[Diagram, built up over successive slides. Teams: DB OPs, Warehouse, Supply Chain, Buyers, Wider Business]
Starting pipeline: Shell, To File, SAN1, To Excel, Merge Multiple Excels, sFTP, SAN2, Analysis, Email, Publishing
⬢ sFTP becomes S2S
⬢ SAN1 becomes S2S
⬢ Shell becomes ExecuteSQL, and To File is no longer needed
⬢ To Excel is no longer needed
⬢ Email becomes PutEmail, and SAN2 becomes S2S
Final pipeline: ExecuteSQL, S2S, Merge Multiple Excels, S2S, S2S, Analysis, PutEmail, Publishing
Why NiFi?
What features make it a good fit?
Easy UI
⬢ Drag and drop integration
⬢ Needs only a web browser
⬢ Flow-based programming
⬢ Changes take effect immediately
⬢ Nice visual cues on queues
Live Data Inspection
⬢ See content in the queue
⬢ See attributes
⬢ Can download for further inspection or transformation
Provenance - lineage
⬢ Can track end to end
⬢ Gives timestamped events
⬢ Trace and record the history
Provenance - Changes and Replay
⬢ You can replay this data back!
⬢ Can change quickly and then replay to correct mistakes
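To make the replay idea concrete, here is a toy model in plain Python. It is not NiFi's provenance repository or its API. The class names `ProvenanceEvent` and `ProvenanceLog`, the step names and the sample content are all hypothetical. The sketch only shows the principle: if each event records what entered a step and when, a corrected step can be re-run on the same input.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List


@dataclass(frozen=True)
class ProvenanceEvent:
    flowfile_id: str      # identifier of the piece of data being tracked
    step: str             # name of the processing step the data entered
    content: bytes        # content as it entered that step
    timestamp: datetime   # when the event was recorded


@dataclass
class ProvenanceLog:
    events: List[ProvenanceEvent] = field(default_factory=list)

    def record(self, flowfile_id: str, step: str, content: bytes) -> None:
        """Append a timestamped event for one item entering one step."""
        self.events.append(
            ProvenanceEvent(flowfile_id, step, content,
                            datetime.now(timezone.utc)))

    def lineage(self, flowfile_id: str) -> List[str]:
        """Return the steps an item passed through, in recorded order."""
        return [e.step for e in self.events if e.flowfile_id == flowfile_id]

    def replay(self, flowfile_id: str, step: str,
               processor: Callable[[bytes], bytes]) -> bytes:
        """Re-run `processor` on the content recorded when the item entered `step`."""
        for e in self.events:
            if e.flowfile_id == flowfile_id and e.step == step:
                return processor(e.content)
        raise KeyError(f"no event for {flowfile_id!r} at step {step!r}")


if __name__ == "__main__":
    log = ProvenanceLog()
    log.record("ff-1", "receive", b"hello")
    log.record("ff-1", "uppercase", b"hello")
    # Suppose the original "uppercase" step was misconfigured; after fixing it,
    # replay the recorded input through the corrected logic.
    print(log.lineage("ff-1"))
    print(log.replay("ff-1", "uppercase", bytes.upper))
```

Keeping the input content alongside each event is what makes replay possible; a log of step names and timestamps alone would give lineage but nothing to re-run.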
S2S + Vast Number of Integration Points
⬢ S2S (Site-to-Site) is a very easy-to-use NiFi-to-NiFi protocol
⬢ Also integrates with almost everything out of the box (OOTB)
⬢ SFTP, Kafka, MQTT, RELP, Email, Disk, HDFS, JMS, etc.
Security
⬢ Change tracking in the UI
⬢ Very granular authorisation model
⬢ Can secure down to the process group level
⬢ Integrates with Ranger
Productionable
Can start as a PoC, then be:
⬢ Secured
⬢ Integrated with AD
⬢ Scaled up
⬢ Scaled out
Re-cap
⬢ Data-Movement Production Line
⬢ Reduce Time-To-Production
⬢ Decrease feedback loop
⬢ Move responsibility to decision makers
⬢ Empower teams!
Questions?

Using Apache® NiFi to Empower Self-Organising Teams

  • 1. 1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Empowering Self- Organising Teams with Apache NiFi Sebastian Carroll Sr Consultant @ HWX
  • 2. 2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved What is NiFi? The Data-Movement Production Line
  • 3. 3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Data Movement
  • 4. 4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Production Line
  • 5. 5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved 5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Pop Quiz!
  • 6. 6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved The Self-Organising Team Intra-team Improvements
  • 7. 7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved The Problem ⬢ Traditional data movement frameworks are usually large, expensive and strictly controlled by change management ⬢ Leading to teams that cannot control their own environment ⬢ Inability to change quickly
  • 8. 8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Core Data Science Supply Chain Website Online In Store
  • 9. 9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI
  • 10. 1 0 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI ⬢ Team to Team - No Core
  • 11. 1 1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI ⬢ Team to Team - No Core ⬢ Individual to Individual
  • 12. 1 2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI ⬢ Team to Team - No Core ⬢ Individual to Individual ⬢ Not just techies
  • 13. 1 3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI ⬢ Team to Team - No Core ⬢ Individual to Individual ⬢ Not just techies ⬢ Productionable
  • 14. 1 4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI ⬢ Team to Team - No Core ⬢ Individual to Individual ⬢ Not just techies ⬢ Productionable ⬢ Immediate
  • 15. 1 5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI ⬢ Team to Team - No Core ⬢ Individual to Individual ⬢ Not just techies ⬢ Productionable ⬢ Immediate Not just S2S!
  • 16. 1 6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI ⬢ Team to Team - No Core ⬢ Individual to Individual ⬢ Not just techies ⬢ Productionable ⬢ Immediate Not just S2S! Many choices!
  • 17. 1 7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Quickly work with live data ⬢ Test integration points, before developing
  • 18. 1 8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Quickly work with live data ⬢ Test integration points, before developing ⬢ Quickly profile data – Encrypted? – Compressed? – CSV/JSON/XML? – Seasonal? – Volume?
  • 19. 1 9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Adobe Clickstream - 0 to Hive I’ve done this for a customer: ⬢ Got only the necessary details - transport, location and credentials
  • 20. 2 0 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Adobe Clickstream - 0 to Hive I’ve done this for a customer: ⬢ Got only the necessary details - transport, location and credentials ⬢ Connected up to NiFi
  • 21. 2 1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Adobe Clickstream - 0 to Hive I’ve done this for a customer: ⬢ Got only the necessary details - transport, location and credentials ⬢ Connected up to NiFi ⬢ Saw what data was there
  • 22. 2 2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Adobe Clickstream - 0 to Hive I’ve done this for a customer: ⬢ Got only the necessary details - transport, location and credentials ⬢ Connected up to NiFi ⬢ Saw what data was there ⬢ Landed in HDFS
  • 23. 2 3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Adobe Clickstream - 0 to HDFS I’ve done this for a customer: ⬢ Got only the necessary details - transport, location and credentials ⬢ Connected up to NiFi ⬢ Saw what data was there ⬢ Landed in HDFS ⬢ How long? 1 Day from 0 to HDFS
  • 24. 2 4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Adobe Clickstream - 0 to HDFS I’ve done this for a customer: ⬢ Got only the necessary details - transport, location and credentials ⬢ Connected up to NiFi ⬢ Saw what data was there ⬢ Landed in HDFS ⬢ How long? 1 Day from 0 to HDFS Next piece of value - make it queryable in Hive
  • 25. 2 5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Incremental Improvement of Pipelines Inter-team Improvements
  • 26. 2 6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved How do I get here? All very nice ! but ... ⬢ Multi-team changes are difficult ⬢ Core changes are difficult ⬢ Too much risk to change everyone at once So, what do we do?
  • 27. 2 7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved How do I get here? All very nice ! but ... ⬢ Multi-team changes are difficult ⬢ Core changes are difficult ⬢ Too much risk to change everyone at once So, what do we do? Using NiFi we can deliver small changes regularly!
  • 28. 2 8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DB OPs Warehouse Supply Chain BuyersWider Business Shell To File
  • 29. 2 9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DB OPs Warehouse Supply Chain BuyersWider Business Shell SAN1 To File To Excel
  • 30. 3 0 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DB OPs Warehouse Supply Chain BuyersWider Business Shell SAN1 To File To Excel Merge Multiple Excels sFTP
  • 31. 3 1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DB OPs Warehouse Supply Chain BuyersWider Business Shell SAN1 To File To Excel Merge Multiple Excels sFTP SAN2 Analysis
  • 32. 3 2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DB OPs Warehouse Supply Chain BuyersWider Business Shell SAN1 Email To File To Excel Merge Multiple Excels sFTP SAN2 Analysis Publishing
  • 33. 3 3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DB OPs Warehouse Supply Chain BuyersWider Business Shell SAN1 Email To File To Excel Merge Multiple Excels sFTP SAN2 Analysis Publishing
  • 34. 3 4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DB OPs Warehouse Supply Chain BuyersWider Business Shell SAN1 Email To File To Excel Merge Multiple Excels sFTP SAN2 Analysis Publishing
  • 35. 3 5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DB OPs Warehouse Supply Chain BuyersWider Business Shell SAN1 Email To File To Excel Merge Multiple Excels S2S SAN2 Analysis Publishing
  • 36. 3 6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DB OPs Warehouse Supply Chain BuyersWider Business Shell SAN1 Email To File To Excel Merge Multiple Excels S2S SAN2 Analysis Publishing
  • 37. 3 7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DB OPs Warehouse Supply Chain BuyersWider Business Shell S2S Email To File To Excel Merge Multiple Excels S2S SAN2 Analysis Publishing
  • 38. 3 8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DB OPs Warehouse Supply Chain BuyersWider Business ExecuteSQL S2S Email To Excel Merge Multiple Excels S2S SAN2 Analysis Publishing
  • 39. 3 9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DB OPs Warehouse Supply Chain BuyersWider Business ExecuteSQL S2S Email Merge Multiple Excels S2S SAN2 Analysis Publishing
  • 40. 4 0 © Hortonworks Inc. 2011 – 2016. All Rights Reserved DB OPs Warehouse Supply Chain BuyersWider Business ExecuteSQL S2S PutEmail Merge Multiple Excels S2S S2S Analysis Publishing
  • 41. Why NiFi? What features make it a good fit?
  • 42. Easy UI ⬢ Drag and drop integration ⬢ Only Web-Browser ⬢ Flow based programming ⬢ Changes take effect immediately ⬢ Nice visual cues on queues
  • 43. Easy UI
  • 44. Live Data Inspection ⬢ See content in the queue ⬢ See attributes ⬢ Can download for further inspection or transformation
  • 45. Live Data Inspection
  • 46. Live Data Inspection
  • 47. Live Data Inspection
  • 48. Provenance - lineage ⬢ Can track end to end ⬢ Gives timestamped events ⬢ Trace and record the history
  • 49. Provenance - Changes and Replay ⬢ You can replay this data back! ⬢ Can change quickly and then replay to correct mistakes
  • 50. Provenance - Changes and Replay
  • 51. S2S + Vast number of integration Points ⬢ S2S is a very easy to use NiFi to NiFi protocol ⬢ Also integrates with almost everything OOTB ⬢ SFTP, Kafka, MQTT, RELP, Email, Disk, HDFS, JMS, etc
  • 52. S2S + Vast number of integration Points
  • 53. Security ⬢ Change Tracking On the UI
  • 54. Security ⬢ Change Tracking On the UI ⬢ Very granular authorisation model
  • 55. Security ⬢ Change Tracking On the UI ⬢ Very granular authorisation model ⬢ Can secure down to the process group level
  • 56. Security ⬢ Change Tracking On the UI ⬢ Very granular authorisation model ⬢ Can secure down to the process group level ⬢ Integrates with Ranger
  • 57. Productionable Can start as PoC then be ⬢ Secured ⬢ Integrated with AD ⬢ Scaled up ⬢ Scaled out
  • 58. Re-Cap
  • 59. Re-cap ⬢ Data-Movement Production Line ⬢ Reduce Time-To-Production ⬢ Decrease feedback loop ⬢ Move responsibility to decision makers ⬢ Empower teams!
  • 60. Questions?
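
The provenance slides (48 to 50) describe timestamped events, end-to-end lineage and replay. As a rough illustration of that idea only, here is a minimal Python sketch. It is not NiFi's API: the class names, the event-type labels and the `replay` helper are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List


@dataclass(frozen=True)
class ProvenanceEvent:
    """One timestamped record of something that happened to a piece of data."""
    flowfile_id: str
    event_type: str   # illustrative labels such as "RECEIVE" or "REPLAY"
    component: str    # the (hypothetical) step that produced the event
    content: bytes    # snapshot of the content at the time of the event
    timestamp: datetime


@dataclass
class ProvenanceLog:
    """Append-only history of events; a toy stand-in for a provenance store."""
    events: List[ProvenanceEvent] = field(default_factory=list)

    def record(self, flowfile_id: str, event_type: str,
               component: str, content: bytes) -> ProvenanceEvent:
        event = ProvenanceEvent(flowfile_id, event_type, component,
                                content, datetime.now(timezone.utc))
        self.events.append(event)
        return event

    def lineage(self, flowfile_id: str) -> List[ProvenanceEvent]:
        """Return every recorded event for one piece of data, oldest first."""
        return [e for e in self.events if e.flowfile_id == flowfile_id]

    def replay(self, event: ProvenanceEvent,
               step: Callable[[bytes], bytes]) -> bytes:
        """Push the content captured by an earlier event through a step again."""
        self.record(event.flowfile_id, "REPLAY", event.component, event.content)
        return step(event.content)


def demo() -> bytes:
    log = ProvenanceLog()
    received = log.record("ff-1", "RECEIVE", "ingest", b"apples,10")
    # First attempt at the step has a mistake: it drops the quantity.
    wrong = received.content.split(b",")[0]
    log.record("ff-1", "CONTENT_MODIFIED", "reformat", wrong)
    # Correct the step, then replay the originally received content through it.
    return log.replay(received, lambda content: content.replace(b",", b";"))
```

Because the original content is kept alongside each event, a mistake in a step can be corrected and the same data pushed through again, which is the "change quickly and then replay" point on slide 49.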

Editor's Notes

  1. [ 1 min ] [ 1.12 min ] Hi there! I’m Sebastian. I’m a senior consultant here in EMEA and I focus on HDF technologies and Apache NiFi. I started using NiFi about three years ago in Australia, in the pre-open-source days. It was very different to now, but that early learning has definitely helped. I have been using NiFi ever since those early days and have really enjoyed the NiFi journey. It is a fantastic product that solves a lot of problems unlike anything else out there. One of the most transformational NiFi applications I have seen
  2. [ 1 min ] [ 1 min ] So first a quick poll: who has heard of NiFi? Keep them up if you know what NiFi does. Keep them up if you have ever used NiFi. I summarise NiFi as the Data-Movement Production Line.
  3. [ 1 min ] Data movement: moving data around. See on image: business units, machines, bandwidth links, availability times. Sounds easy, but it isn’t. There are some tools that specialise in some areas, but *in general*, if you’re moving data, NiFi is probably what you want.
  4. [ 1:50 min ] Production line: flow-based programming model. FlowFiles: the actual data (the product). Processors: the work units. Queues: the connectors.
  5. [ 30 sec ] Why would I use NiFi? Moving data. What does it resemble? The production line. Don’t worry, you’ll see more later.
  6. [ 1 min ] [ 2:40 ] So, armed with that ridiculously brief overview, I want to talk about how NiFi can be used to facilitate the agile philosophy of the self-organising team. What is a self-organising team? One where the team defines the how and management defines the what. Why is this good? Well, let’s take an example. Management says: at 3pm, take the red truck out the back, load all the apples and deliver them across the bridge to the supermarket. For this to go well, everything has to work and all the steps have to be known in advance. What happens if the truck is blue? Do you take it? What if there is no fuel, or it breaks down? Putting the decisions into the hands of the people who are most qualified to make them speeds up delivery and increases quality (often solving the problems in novel ways to boot). How does this work with data ingestion pipelines?
  7. [ 1 min ] Using NiFi can help to speed up the process: less risk of losing perishable insights, and reduced costs.
  8. [ 2 min ] You might have a very simplistic representation of the organisation like so. We’re agile, so we have cross-functional teams (maybe consisting of developers, sysadmins, data professionals and subject matter experts). This depicts the information flow through the teams. This isn’t that special at the moment; it looks like a normal ‘change managed, data backbone’ where everything goes through core, and any time one of the teams needs a change, it goes through the core team. For example, if the In Store team wants information from the Supply Chain team, they have to coordinate with core, who talk to Supply Chain, with all the complexities there. It’s difficult to get core to prioritise your work because their requests are coming in thick and fast. This often leads to prioritisation by volume: those who yell the loudest win. You often have to do testing in test infrastructure to verify that changes work, then submit a change request, only to find out that test isn’t actually like production, so you exceed the change window and have to revert, starting the change management cycle over again. Even companies that don’t have these specific procedures are often bogged down in slow change cycles. BUT let’s assume these are all NiFi instances. Website, core, in store etc. all have their own NiFi instance. We haven’t changed the organisational structure at all, just the tool that passes data around, so right now it looks the same. But let’s replay the scenario from above: In Store wants Supply Chain data.
  9. [ 1 min ] Direct connection: S2S is easy to use once set up. Intuitive UI: flow based. For simple data movement, anyone can use it.
  10. [ 1 min ] Team to Team - No Core. Direct connect, straight to the team in question. No core, no change requests, minimal process. The team affected by the change are the ones who implement it. Decisions are made by the people best positioned to make them.
  11. [ 1 min ] Individual to Individual: just 2 people!
  12. [ 1 min ] Not just techies. Theoretically it could be anyone on the team, not just the NiFi guru. Whoever is using the data can get it.
  13. [ 1 min ] Productionable. NiFi itself and these features are not gimmicks. They are the same robust and secure features used in all our deployments.
  14. [ 1 min ] Immediate. Changes take effect immediately. Can quickly see and debug issues.
  15. [ 1 min ] Not Just S2S!
  16. [ 4 min ] Flexible: almost any endpoint. An integrator’s dream! Options include files, Hadoop, Kafka, plain TCP, HTTP, JMS, CDC, WebSockets, email, HBase, MongoDB, SNMP, Solr, Splunk and even Twitter!
  17. [ 1 min ] Now I want to look at improvement over the organisation as a whole. You’ve seen the improvements that NiFi can bring if all teams have their own NiFi instance that is under their control. But this requires NiFi everywhere, which is the case for only a handful of organisations. So how do we get there? Well, we could just change overnight, right ...? Mandate that all teams use NiFi? Pump huge amounts of money into looking at the potential risks, mitigating those, rolling out changes and all the traditional stuff? But ...
  18. [ 2 min ]
  19. [ .5 min ]
  20. Here we have a traditional data movement pipeline. The Buyers are looking at historical sales trends and trying to see if their predictions were correct. For this to happen we need to go all the way back to the warehouse database. Start at the database: the warehouse team wants a report on all the items in the database. Ops probably does some sort of manual process: logging in to a firewalled machine, bringing up a shell and executing the required report.
  21. Here we have a traditional data movement pipeline. The Buyers are looking at historical sales trends and trying to see if their predictions were correct. For this to happen we need to go all the way back to the warehouse database. Start at the database: the warehouse team wants a report on all the items in the database. Ops probably does some sort of manual process: logging in to a firewalled machine, bringing up a shell and executing the required report. The report is then placed on the shared drive. The warehouse team picks this up and loads it into Excel, checks that things look OK and passes it to the Supply Chain team. However, Supply Chain don’t sit in the warehouse and the shared drive is different, so the report is placed on an SFTP server. It’s joined with other reports from other warehouses, reconciled and placed on the second HQ internal SAN, where it is picked up by the Buyers. They don’t need to modify the report, simply ingest the data for analysis. They do so and pass the results to the business by email. All the markings in
  22. Here we have a traditional data movement pipeline. The Buyers are looking at historical sales trends and trying to see if their predictions were correct. For this to happen we need to go all the way back to the warehouse database. Start at the database: the warehouse team wants a report on all the items in the database. Ops probably does some sort of manual process: logging in to a firewalled machine, bringing up a shell and executing the required report. The report is then placed on the shared drive. The warehouse team picks this up and loads it into Excel, checks that things look OK and passes it to the Supply Chain team. However, Supply Chain don’t sit in the warehouse and the shared drive is different, so the report is placed on an SFTP server. It’s joined with other reports from other warehouses, reconciled and placed on the second HQ internal SAN, where it is picked up by the Buyers. They don’t need to modify the report, simply ingest the data for analysis. They do so and pass the results to the business by email. All the markings in
  23. Here we have a traditional data movement pipeline. The Buyers are looking at historical sales trends and trying to see if their predictions were correct. For this to happen we need to go all the way back to the warehouse database. Start at the database: the warehouse team wants a report on all the items in the database. Ops probably does some sort of manual process: logging in to a firewalled machine, bringing up a shell and executing the required report. The report is then placed on the shared drive. The warehouse team picks this up and loads it into Excel, checks that things look OK and passes it to the Supply Chain team. However, Supply Chain don’t sit in the warehouse and the shared drive is different, so the report is placed on an SFTP server. It’s joined with other reports from other warehouses, reconciled and placed on the second HQ internal SAN, where it is picked up by the Buyers. They don’t need to modify the report, simply ingest the data for analysis. They do so and pass the results to the business by email.
  24. Here we have a traditional data movement pipeline. The Buyers are looking at historical sales trends and trying to see if their predictions were correct. For this to happen we need to go all the way back to the warehouse database. Start at the database: the warehouse team wants a report on all the items in the database. Ops probably does some sort of manual process: logging in to a firewalled machine, bringing up a shell and executing the required report. The report is then placed on the shared drive. The warehouse team picks this up and loads it into Excel, checks that things look OK and passes it to the Supply Chain team. However, Supply Chain don’t sit in the warehouse and the shared drive is different, so the report is placed on an SFTP server. It’s joined with other reports from other warehouses, reconciled and placed on the second HQ internal SAN, where it is picked up by the Buyers. They don’t need to modify the report, simply ingest the data for analysis. They do so and pass the results to the business by email. While obviously fictional, these sorts of ingestion pipelines are the norm, not the exception. They are especially hard to root out because each team is generally siloed and doesn’t have visibility of the process as a whole. Let’s pick a team that has decided to try out NiFi for automating some of this movement: the Supply Chain team.
  25. We go ahead and install NiFi inside the Supply Chain team. We change just one task at first: using the ListSFTP and FetchSFTP processors to watch the server, pick up the warehouse report and place it in a location where it can be worked on. So we have just automated one step. The employee looking after this step now knows that the file will appear, ready to be worked on. To simplify another step, on receiving this file we can send an email to the SC team notifying them of its arrival. Now the warehouse employee is freed from doing that step. Great! That’s two down. There’s still the manual analysis, but that’s what humans are good at, so let them do that. Now the warehouse guy comments that this is a cool thing. You say: well, if you had your own NiFi, you could do the same thing! So you convince them to try setting up their own NiFi.
  26. At first, it just automates one step: moving the file to the FTP server. That’s great for the warehouse guys, one less thing. But now we can eliminate the clunky file - FTP - pick-up - put-down chain with NiFi S2S.
  27. At first, it just automates one step: moving the file to the FTP server. That’s great for the warehouse guys, one less thing. But now we can eliminate the clunky file - FTP - pick-up - put-down chain with NiFi S2S.
  28. [ 10 sec ] We’ve seen what benefits can come from the NiFi web. But why NiFi? What features make it a good fit?
  29. [ 2 min ] Cross-functional teams have a wide skill set. We want anyone to come in with minimal training and be in control of their own data. Most people are familiar with flow charts and graphical web interfaces. No special software is needed, just a web browser. Changes take effect immediately: this allows faster feedback cycles and avoids long or specialised deployment cycles (e.g. must know how to use git, Jenkins or pull requests). Lots of good visual cues help show where issues are and how to resolve them.
  30. [ 30 sec ] Call out the queue limit. Really helpful dialog showing the issue.
  31. [ 1:30 min ] In many systems it’s difficult to get a snapshot of the data at each stage. You would have to get a sample, transport it to somewhere you could access and then probably download it. This is often just more work for the developer.
  32. Fast feedback
  33. Again fast feedback! Can see exactly what has changed (12:20 to here from slide 41)
  34. So, as I’ve mentioned previously, S2S is a very nice way to communicate with NiFi clusters. Only have to open one port. Easy to configure. Ties in nicely with the UI. Maintains attributes! Balances across clusters. But, as previously mentioned, NiFi integrates with bloody everything!
  35. So, as I’ve mentioned previously, S2S is a very nice way to communicate with NiFi clusters. Only have to open one port. Easy to configure. Ties in nicely with the UI. Maintains attributes! Balances across clusters. But, as previously mentioned, NiFi integrates with bloody everything!
  36. Can see anyone who has made changes. Strong authorisation: you can be sure that the people you have authorised are the ones making the change. Can then reverse changes if required (there is no undo, but you can just re-apply the old setting).
  37. Can very tightly control who can do what on the system. E.g. have someone who can see the data but can’t move it, and someone who can move it but not see it.
  38. Can now use process groups to group flows and secure those. Could have one group with access to one and another group with access to the other, with no cross talk. Similar model to the NiFi web, but on one cluster. Has pros and cons, but is possible.
  39. Makes managing these things easier.
  40. Can get started in 10 minutes, but will also handle GBs of data and thousands of events/second. Can be secured.
  41. [ 10 sec ]
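
Notes 4 and 30 describe the production-line model (FlowFiles as the product, processors as the work units, queues as the connectors) and the queue limit. The sketch below is a toy Python illustration of that model under stated assumptions. It does not use or mirror NiFi's real classes: `FlowFile`, `Connection`, `Processor`, `max_objects` and `run_once` are hypothetical names chosen for this example, and the tiny queue limit exists only to make the effect visible.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, Optional


@dataclass
class FlowFile:
    """The product on the line: some content plus descriptive attributes."""
    content: bytes
    attributes: Dict[str, str] = field(default_factory=dict)


class Connection:
    """A bounded queue between two work units (hypothetical, not NiFi's class)."""

    def __init__(self, max_objects: int) -> None:
        self.max_objects = max_objects
        self._items: Deque[FlowFile] = deque()

    def is_full(self) -> bool:
        return len(self._items) >= self.max_objects

    def offer(self, flowfile: FlowFile) -> bool:
        """Enqueue a flowfile; refuse it if the queue limit has been reached."""
        if self.is_full():
            return False
        self._items.append(flowfile)
        return True

    def poll(self) -> Optional[FlowFile]:
        """Dequeue the oldest flowfile, or return None if the queue is empty."""
        return self._items.popleft() if self._items else None

    def __len__(self) -> int:
        return len(self._items)


class Processor:
    """A work unit that takes one flowfile in and passes one flowfile on."""

    def __init__(self, name: str, work: Callable[[FlowFile], FlowFile],
                 inbound: Connection, outbound: Connection) -> None:
        self.name = name
        self.work = work
        self.inbound = inbound
        self.outbound = outbound

    def run_once(self) -> bool:
        """Process one flowfile; return False if idle or blocked downstream."""
        if self.outbound.is_full():
            return False  # queue limit reached: leave the work where it is
        flowfile = self.inbound.poll()
        if flowfile is None:
            return False  # nothing waiting on the inbound queue
        self.outbound.offer(self.work(flowfile))
        return True
```

With five flowfiles waiting and an outbound limit of two, `run_once` succeeds twice and then returns `False` until something downstream drains the queue. That stalled, full queue is the kind of condition the UI calls out visually in note 30.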