© Hortonworks Inc. 2011 – 2016. All Rights Reserved
Empowering Self-Organising Teams with Apache NiFi
Sebastian Carroll
Sr Consultant @ HWX
What is NiFi?
The Data-Movement Production Line
Data Movement
Production Line
Pop Quiz!
The Self-Organising Team
Intra-team Improvements
The Problem
⬢ Traditional data-movement frameworks are usually large, expensive and strictly controlled by change management
⬢ This leads to teams that cannot control their own environment
⬢ Teams cannot change quickly
[Diagram: Core, Data Science, Supply Chain, Website, Online, In Store, linked by S2S]
Direct Connection and Intuitive UI
⬢ Team to Team - No Core
⬢ Individual to Individual
⬢ Not just techies
⬢ Productionable
⬢ Immediate
Not just S2S!
Many choices!
Quickly work with live data
⬢ Test integration points before developing
⬢ Quickly profile data
– Encrypted?
– Compressed?
– CSV/JSON/XML?
– Seasonal?
– Volume?
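The first few profiling questions above can be sketched in code. This is only a rough illustration, assuming a single payload has been downloaded as raw bytes; `profile_payload` and its heuristics are hypothetical and not part of NiFi. Seasonality and volume need observation over time, so they are not covered here.

```python
import gzip
import json
import xml.etree.ElementTree as ET
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of any gzip stream


def profile_payload(raw: bytes) -> dict:
    """Return a rough profile of one payload: size, compression, guessed format."""
    compressed = raw[:2] == GZIP_MAGIC
    body = raw
    if compressed:
        try:
            body = gzip.decompress(raw)
        except (OSError, EOFError, zlib.error):
            # Magic bytes matched but the stream is damaged; profile the raw bytes.
            compressed = False
    profile = {"bytes": len(raw), "compressed": compressed, "format": "unknown"}

    try:
        text = body.decode("utf-8").strip()
    except UnicodeDecodeError:
        # Undecodable bytes may be encrypted or simply binary; we cannot tell which.
        profile["format"] = "binary-or-encrypted"
        return profile
    if not text:
        return profile

    try:
        json.loads(text)
        profile["format"] = "json"
        return profile
    except ValueError:
        pass

    if text.startswith("<"):
        try:
            ET.fromstring(body)
            profile["format"] = "xml"
            return profile
        except ET.ParseError:
            pass

    # Crude CSV heuristic: every line has the same non-zero number of commas.
    comma_counts = {line.count(",") for line in text.splitlines()}
    if len(comma_counts) == 1 and comma_counts.pop() > 0:
        profile["format"] = "csv"
    return profile


if __name__ == "__main__":
    print(profile_payload(b"a,b\n1,2\n"))
```

The checks are deliberately cheap: the point is to get a first answer in minutes, then refine once the shape of the data is known.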
Adobe Clickstream - 0 to HDFS
I’ve done this for a customer:
⬢ Got only the necessary details - transport, location and credentials
⬢ Connected up to NiFi
⬢ Saw what data was there
⬢ Landed in HDFS
⬢ How long? 1 Day from 0 to HDFS
Next piece of value - make it queryable in Hive
Incremental Improvement of Pipelines
Inter-team Improvements
How do I get here?
All very nice, but...
⬢ Multi-team changes are difficult
⬢ Core changes are difficult
⬢ Too much risk to change everyone at once
So, what do we do?
Using NiFi we can deliver small changes regularly!
[Diagram, built up over slides 28 to 40. Lanes: DB OPs Warehouse, Supply Chain, Buyers, Wider Business]
⬢ Start: Shell, To File
⬢ Add SAN1, To Excel
⬢ Add Merge Multiple Excels, sFTP
⬢ Add SAN2, Analysis
⬢ Add Email, Publishing
⬢ sFTP becomes S2S
⬢ SAN1 becomes S2S
⬢ Shell becomes ExecuteSQL; To File is dropped
⬢ To Excel is dropped
⬢ Email becomes PutEmail; SAN2 becomes S2S
Why NiFi?
What features make it a good fit?
Easy UI
⬢ Drag and drop integration
⬢ Only needs a web browser
⬢ Flow-based programming
⬢ Changes take effect immediately
⬢ Nice visual cues on queues
Live Data Inspection
⬢ See content in the queue
⬢ See attributes
⬢ Can download for further inspection or transformation
Provenance - lineage
⬢ Can track end to end
⬢ Gives timestamped events
⬢ Trace and record the history
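To make the lineage idea concrete, here is a toy in-memory model of timestamped provenance events. It is only a sketch: the `ProvenanceEvent` class and the `lineage` function are hypothetical and do not represent NiFi's API, and the component names in the example are illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceEvent:
    """One timestamped step in the life of a piece of data (illustrative only)."""
    timestamp: datetime
    flowfile_id: str
    event_type: str   # e.g. RECEIVE, CONTENT_MODIFIED, SEND
    component: str    # the step that produced the event


def lineage(events, flowfile_id):
    """Return the recorded history of one item, oldest event first."""
    matching = (e for e in events if e.flowfile_id == flowfile_id)
    return sorted(matching, key=lambda e: e.timestamp)


if __name__ == "__main__":
    def at(minute):
        return datetime(2016, 1, 1, 9, minute, tzinfo=timezone.utc)

    # Events arrive in no particular order and cover more than one item.
    log = [
        ProvenanceEvent(at(2), "ff-1", "SEND", "PutHDFS"),
        ProvenanceEvent(at(0), "ff-1", "RECEIVE", "GetSFTP"),
        ProvenanceEvent(at(1), "ff-2", "RECEIVE", "GetSFTP"),
        ProvenanceEvent(at(1), "ff-1", "CONTENT_MODIFIED", "CompressContent"),
    ]
    for event in lineage(log, "ff-1"):
        print(event.timestamp.isoformat(), event.event_type, event.component)
```

Filtering by identifier and ordering by timestamp is all that "end to end" tracking needs in this simplified model; a real deployment would query the provenance data NiFi already records.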
Provenance - Changes and Replay
⬢ You can replay this data back!
⬢ Can change quickly and then replay to correct mistakes
S2S + Vast number of integration Points
⬢ S2S (Site-to-Site) is a very easy-to-use NiFi-to-NiFi protocol
⬢ NiFi also integrates with almost everything out of the box
⬢ SFTP, Kafka, MQTT, RELP, Email, Disk, HDFS, JMS, etc.
Security
⬢ Change tracking in the UI
⬢ Very granular authorisation model
⬢ Can secure down to the process group level
⬢ Integrates with Apache Ranger
Productionable
Can start as a PoC, then be:
⬢ Secured
⬢ Integrated with AD
⬢ Scaled up
⬢ Scaled out
Re-cap
⬢ Data-Movement Production Line
⬢ Reduce Time-To-Production
⬢ Shorten the feedback loop
⬢ Move responsibility to decision makers
⬢ Empower teams!
Questions?

Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptxOperational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
 
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
 
Microsoft-Power-Platform-Adoption-Planning.pptx
Microsoft-Power-Platform-Adoption-Planning.pptxMicrosoft-Power-Platform-Adoption-Planning.pptx
Microsoft-Power-Platform-Adoption-Planning.pptx
 
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
 
ppt on the brain chip neuralink.pptx
ppt  on   the brain  chip neuralink.pptxppt  on   the brain  chip neuralink.pptx
ppt on the brain chip neuralink.pptx
 
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom KittEnhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
 
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
 

Using Apache® NiFi to Empower Self-Organising Teams

  • 1. © Hortonworks Inc. 2011 – 2016. All Rights Reserved. Empowering Self-Organising Teams with Apache NiFi. Sebastian Carroll, Sr Consultant @ HWX
  • 2. What is NiFi? The Data-Movement Production Line
  • 3. Data Movement
  • 4. Production Line
  • 5. Pop Quiz!
  • 6. The Self-Organising Team: Intra-team Improvements
  • 7. The Problem ⬢ Traditional data movement frameworks are usually large, expensive and strictly controlled by change management ⬢ Leading to teams that cannot control their own environment ⬢ Inability to change quickly
  • 8. Core Data Science Supply Chain Website Online In Store
  • 9. Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI
  • 10. Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI ⬢ Team to Team - No Core
  • 11. Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI ⬢ Team to Team - No Core ⬢ Individual to Individual
  • 12. Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI ⬢ Team to Team - No Core ⬢ Individual to Individual ⬢ Not just techies
  • 13. Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI ⬢ Team to Team - No Core ⬢ Individual to Individual ⬢ Not just techies ⬢ Productionable
  • 14. Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI ⬢ Team to Team - No Core ⬢ Individual to Individual ⬢ Not just techies ⬢ Productionable ⬢ Immediate
  • 15. Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI ⬢ Team to Team - No Core ⬢ Individual to Individual ⬢ Not just techies ⬢ Productionable ⬢ Immediate Not just S2S!
  • 16. Core Data Science Supply Chain Website Online In Store S2S Direct Connection and Intuitive UI ⬢ Team to Team - No Core ⬢ Individual to Individual ⬢ Not just techies ⬢ Productionable ⬢ Immediate Not just S2S! Many choices!
  • 17. Quickly work with live data ⬢ Test integration points, before developing
  • 18. Quickly work with live data ⬢ Test integration points, before developing ⬢ Quickly profile data – Encrypted? – Compressed? – CSV/JSON/XML? – Seasonal? – Volume?
  • 19. Adobe Clickstream - 0 to Hive I’ve done this for a customer: ⬢ Got only the necessary details - transport, location and credentials
  • 20. Adobe Clickstream - 0 to Hive I’ve done this for a customer: ⬢ Got only the necessary details - transport, location and credentials ⬢ Connected up to NiFi
  • 21. Adobe Clickstream - 0 to Hive I’ve done this for a customer: ⬢ Got only the necessary details - transport, location and credentials ⬢ Connected up to NiFi ⬢ Saw what data was there
  • 22. Adobe Clickstream - 0 to Hive I’ve done this for a customer: ⬢ Got only the necessary details - transport, location and credentials ⬢ Connected up to NiFi ⬢ Saw what data was there ⬢ Landed in HDFS
  • 23. Adobe Clickstream - 0 to HDFS I’ve done this for a customer: ⬢ Got only the necessary details - transport, location and credentials ⬢ Connected up to NiFi ⬢ Saw what data was there ⬢ Landed in HDFS ⬢ How long? 1 Day from 0 to HDFS
  • 24. Adobe Clickstream - 0 to HDFS I’ve done this for a customer: ⬢ Got only the necessary details - transport, location and credentials ⬢ Connected up to NiFi ⬢ Saw what data was there ⬢ Landed in HDFS ⬢ How long? 1 Day from 0 to HDFS Next piece of value - make it queryable in Hive
  • 25. Incremental Improvement of Pipelines: Inter-team Improvements
  • 26. How do I get here? All very nice! But ... ⬢ Multi-team changes are difficult ⬢ Core changes are difficult ⬢ Too much risk to change everyone at once So, what do we do?
  • 27. How do I get here? All very nice! But ... ⬢ Multi-team changes are difficult ⬢ Core changes are difficult ⬢ Too much risk to change everyone at once So, what do we do? Using NiFi we can deliver small changes regularly!
  • 28. DB OPs Warehouse Supply Chain Buyers Wider Business Shell To File
  • 29. DB OPs Warehouse Supply Chain Buyers Wider Business Shell SAN1 To File To Excel
  • 30. DB OPs Warehouse Supply Chain Buyers Wider Business Shell SAN1 To File To Excel Merge Multiple Excels sFTP
  • 31. DB OPs Warehouse Supply Chain Buyers Wider Business Shell SAN1 To File To Excel Merge Multiple Excels sFTP SAN2 Analysis
  • 32. DB OPs Warehouse Supply Chain Buyers Wider Business Shell SAN1 Email To File To Excel Merge Multiple Excels sFTP SAN2 Analysis Publishing
  • 33. DB OPs Warehouse Supply Chain Buyers Wider Business Shell SAN1 Email To File To Excel Merge Multiple Excels sFTP SAN2 Analysis Publishing
  • 34. DB OPs Warehouse Supply Chain Buyers Wider Business Shell SAN1 Email To File To Excel Merge Multiple Excels sFTP SAN2 Analysis Publishing
  • 35. DB OPs Warehouse Supply Chain Buyers Wider Business Shell SAN1 Email To File To Excel Merge Multiple Excels S2S SAN2 Analysis Publishing
  • 36. DB OPs Warehouse Supply Chain Buyers Wider Business Shell SAN1 Email To File To Excel Merge Multiple Excels S2S SAN2 Analysis Publishing
  • 37. DB OPs Warehouse Supply Chain Buyers Wider Business Shell S2S Email To File To Excel Merge Multiple Excels S2S SAN2 Analysis Publishing
  • 38. DB OPs Warehouse Supply Chain Buyers Wider Business ExecuteSQL S2S Email To Excel Merge Multiple Excels S2S SAN2 Analysis Publishing
  • 39. DB OPs Warehouse Supply Chain Buyers Wider Business ExecuteSQL S2S Email Merge Multiple Excels S2S SAN2 Analysis Publishing
  • 40. DB OPs Warehouse Supply Chain Buyers Wider Business ExecuteSQL S2S PutEmail Merge Multiple Excels S2S S2S Analysis Publishing
  • 41. Why NiFi? What features make it a good fit?
  • 42. Easy UI ⬢ Drag and drop integration ⬢ Only Web-Browser ⬢ Flow based programming ⬢ Changes take effect immediately ⬢ Nice Visual cues on queues
  • 43. Easy UI
  • 44. Live Data Inspection ⬢ See content in the queue ⬢ See attributes ⬢ Can download for further inspection or transformation
  • 45. Live Data Inspection
  • 46. Live Data Inspection
  • 47. Live Data Inspection
  • 48. Provenance - lineage ⬢ Can track end to end ⬢ Gives timestamped events ⬢ Trace and record the history
  • 49. Provenance - Changes and Replay ⬢ You can replay this data back! ⬢ Can change quickly and then replay to correct mistakes
  • 50. Provenance - Changes and Replay
  • 51. S2S + Vast number of integration Points ⬢ S2S is a very easy to use NiFi to NiFi protocol ⬢ Also, integrates with almost everything OOTB ⬢ SFTP, Kafka, MQTT, RELP, Email, Disk, HDFS, JMS, etc
  • 52. S2S + Vast number of integration Points
  • 53. Security ⬢ Change Tracking On the UI
  • 54. Security ⬢ Change Tracking On the UI ⬢ Very granular authorisation model
  • 55. Security ⬢ Change Tracking On the UI ⬢ Very granular authorisation model ⬢ Can secure down to the processor group level
  • 56. Security ⬢ Change Tracking On the UI ⬢ Very granular authorisation model ⬢ Can secure down to the processor group level ⬢ Integrates with Ranger
  • 57. Productionable Can start as PoC then be ⬢ Secured ⬢ Integrated with AD ⬢ Scaled up ⬢ Scaled out
  • 58. Re-Cap
  • 59. Re-cap ⬢ Data-Movement Production Line ⬢ Reduce Time-To-Production ⬢ Decrease feedback loop ⬢ Move responsibility to decision makers ⬢ Empower teams!
  • 60. Questions?
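
The "quickly profile data" questions on slide 18 (Encrypted? Compressed? CSV/JSON/XML?) can be illustrated outside NiFi with a small script. This is only a rough sketch using standard-library heuristics; the function names `shannon_entropy` and `profile` are made up for illustration and are not part of NiFi, and it assumes the sample is a complete payload rather than a truncated one.

```python
import gzip
import json
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Bits per byte; values close to 8 hint at encrypted or compressed content."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def profile(sample: bytes) -> dict:
    """Make a rough guess about a sample payload. Heuristics only, not a parser."""
    result = {"size_bytes": len(sample), "compressed": False, "format": "unknown"}
    if sample[:2] == b"\x1f\x8b":  # gzip magic number
        result["compressed"] = True
        sample = gzip.decompress(sample)  # assumes a complete gzip stream
    result["entropy_bits_per_byte"] = round(shannon_entropy(sample), 2)
    try:
        text = sample.decode("utf-8").strip()
    except UnicodeDecodeError:
        # Not valid UTF-8: binary, possibly encrypted
        result["format"] = "binary (possibly encrypted)"
        return result
    if text.startswith("<"):
        result["format"] = "xml"
    elif text.startswith(("{", "[")):
        try:
            json.loads(text)
            result["format"] = "json"
        except json.JSONDecodeError:
            pass  # looked like JSON but did not parse; leave as unknown
    elif text and "," in text.splitlines()[0]:
        result["format"] = "csv"
    return result
```

Calling `profile(...)` on a few bytes pulled from each integration point answers the format questions; the "Seasonal?" and "Volume?" questions need samples collected over time, which this sketch does not attempt.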

Editor's Notes

  1. [ 1 min ] [ 1.12 min ] Hi there! I’m Sebastian. I’m a senior consultant here in EMEA and I focus on HDF technologies and Apache NiFi. I started using NiFi about three years ago in Australia, in the pre-open-source days. It was very different to now, but that early learning has definitely helped. I’ve been using NiFi ever since those early days and have really enjoyed the NiFi journey. It’s a fantastic product that really solves a lot of problems, unlike anything else out there. One of the most transformational NiFi applications I have seen
  2. [ 1 min ] [ 1 min ] So first, a quick poll: Who has heard of NiFi? Keep them up if you know what NiFi does. Keep them up if you have ever used NiFi. I summarise NiFi as the Data-Movement Production Line.
  3. [ 1 min ] Data movement: moving data around. See on the image: business units, machines, bandwidth links, availability times. It sounds easy but it isn’t. There are some tools that specialise in some areas, but *in general*, if you’re moving data, NiFi is probably what you want.
  4. [ 1:50 min ] Production line: the flow-based programming model. FlowFiles are the actual data (the product). Processors are the work units. Queues are the connectors.
  5. [ 30 sec ] Why would I use NiFi? Moving data. What does it resemble? The production line. Don’t worry, you’ll see more later.
  6. [ 1 min ] [ 2:40 ] So, armed with that ridiculously brief overview, I want to talk about how NiFi can be used to facilitate the agile philosophy of the self-organising team. What is a self-organising team? One where the team defines the how and management defines the what. Why is this good? Well, let’s take an example. Management says: at 3pm, take the red truck out the back, load all the apples and deliver them across the bridge to the supermarket. For this to go well, everything has to work and all the steps have to be known in advance. What happens if the truck is blue? Take it? No fuel? A breakdown? Putting the decisions into the hands of the people who are most qualified to make them speeds up delivery and increases quality (often solving the problems in novel ways to boot). Generally it increases quality and decreases time to completion. How does this work with data ingestion pipelines?
  7. [ 1 min ] Using NiFi can help to speed up the process: less risk of losing perishable insights, and reduced costs.
  8. [ 2 min ] You might have a very simplistic representation of the organisation, like so. We’re agile, so we have cross-functional teams (maybe consisting of developers, sysadmins, data professionals and subject matter experts). This depicts the information flow through the teams. This isn’t that special at the moment; it looks like a normal ‘change managed, data backbone’ where everything goes through core, and any time one of the teams needs a change, it goes through the core team. For example, if the in-store team wants information from the supply chain team, they have to coordinate with core, who talk to supply chain, with all the complexities there. It’s difficult to get core to prioritise your work because their requests are coming in thick and fast. This often leads to prioritisation by volume: those who yell the loudest win. You often have to do testing in test infrastructure to verify that changes work. Then you can submit a change request, only to find out that test isn’t actually like production, you exceed the change window and have to revert, starting the change management cycle over again. Even companies that don’t have these specific procedures are often bogged down in slow change cycles. BUT ... let’s assume these are all NiFi instances. Website, core, in store etc. all have their own NiFi instance. We haven’t changed the organisational structure at all, just changed the tool that passes data around, so right now it looks the same. But let’s replay the scenario from above: In Store wants supply chain data.
  9. [ 1 min ] Direct connect: S2S is easy to use once set up. Intuitive UI: flow based. For simple data movement, anyone can use it.
  10. [ 1 min ] Team to Team - No Core. Direct connect, straight to the team in question. No core, no change requests, minimal process. The team affected by the change are the ones who implement it. Decisions are made by the people best positioned to make them.
  11. [ 1 min ] Individual to Individual. Just 2 people!
  12. [ 1 min ] Not just techies. It could theoretically be anyone on the team, not just the NiFi guru. Whoever is using the data can get it.
  13. [ 1 min ] Productionable. NiFi itself and these features are not gimmicks. They are the same robust and secure features used in all our deployments.
  14. [ 1 min ] Immediate. Changes take effect immediately. You can quickly see and debug issues.
  15. [ 1 min ] Not just S2S!
  16. [ 4 min ] Flexible - almost any endpoint. An integrator’s dream! Options include: files, Hadoop, Kafka, plain TCP, HTTP, JMS, CDC, WebSockets, emails, HBase, Mongo, SNMP, Solr, Splunk and even Twitter!
  17. [ 1 min ] Now I want to look at improvement over the organisation as a whole. You’ve seen the improvements that NiFi can bring to a team if all teams have their own NiFi instance that is under their control. But this requires NiFi everywhere, which is the case for only a handful of organisations. So how do we get there? Well, we could just change overnight, right ...? Mandate that all teams use NiFi? Pump huge amounts of money into looking at the potential risks, mitigating those, rolling out changes and all the traditional stuff? But ...
  18. [ 2 min ]
  19. [ .5 min ]
  20. Here we have a traditional data movement pipeline. The Buyers are looking at historical sales trends and trying to see if their predictions were correct. For this to happen we need to go all the way back to the warehouse database. Starting at the database, the warehouse team want to get a report on all the items in the database. Ops probably does some sort of manual process: logging in to a firewalled machine, bringing up a shell and executing the required report.
  24. Here we have a traditional data movement pipeline. The Buyers are looking at historical sales trends and trying to see if their predictions were correct. For this to happen we need to go all the way back to the warehouse database. Starting at the database, the warehouse team want to get a report on all the items in the database. Ops probably does some sort of manual process: logging in to a firewalled machine, bringing up a shell and executing the required report. The report is then placed on the Shared Drive. The warehouse team pick this up and load it into Excel, check that things look OK and pass it to the Supply Chain team. However, supply chain don’t sit in the warehouse and their Shared Drive is different, so it’s placed on an SFTP server. It’s joined with other reports from other warehouses, reconciled and placed on the second HQ internal SAN. There it is picked up by the buyers. They don’t need to modify the report, simply ingest the data for analysis. They do so and pass the results to the business by email. While obviously fictional, these sorts of ingestion pipelines are the norm, not the exception. They are especially hard to root out, as each team is generally siloed and doesn’t have visibility of the process as a whole. Let’s pick a team that has decided to try out NiFi for automating some of this movement: the supply chain team.
  25. We go ahead and install NiFi inside the supply chain team. We change just one task at first: using the FetchSFTP processor to watch the server, pick up the warehouse report and place it in a location where it can be worked on. So we have just automated one step. The employee looking after this step now knows that the file will appear, ready to be worked on. To simplify another simple step, on receiving this file we can send an email to the SC team notifying them of its arrival. Now the warehouse employee is freed from doing that step. Great! That’s two down. There’s still the manual analysis, but that’s what humans are good at, so let them do that. Now the warehouse guy makes a comment that that is a cool thing. You say, well, if you had your own NiFi, you could do the same thing! So you convince them to try setting up their own NiFi.
  26. At first, it just automates one step: moving the file to the FTP server. That’s great for the warehouse guys, one less thing. But now we can eliminate the clunky file - FTP - pickup - putdown chain with NiFi S2S.
  28. [ 10 sec ] We’ve seen what benefits can come from the NiFi web. But why NiFi? What features make it a good fit?
  29. [ 2 min ] Cross-functional teams have a wide skill set. We want anyone to come in with minimal training and be in control of their own data. Most people are aware of flow charts and graphical web interfaces. You don’t need special software, just a web browser. Changes take effect immediately: this allows faster feedback cycles and avoids long or specialised deployment cycles (e.g. must know how to use git, jenkins or pull requests). There are lots of good visual cues to help show where issues are and how to resolve them.
  30. [ 30 sec ] Call out the queue limit. Really helpful dialog showing the issue.
  31. [ 1:30 min ] In many systems it’s difficult to get a snapshot of the stages. You would have to get a sample, transport it to somewhere you could access and then probably download it. This is often just more work for the developer.
  32. Fast feedback.
  33. Again, fast feedback! Can see exactly what has changed. (12:20 to here from slide 41)
  34. So, as I’ve mentioned previously, S2S is a very nice way to communicate with NiFi clusters. Only have to open one port. Easy to configure. Ties in nicely with the UI. Maintains attributes! Balances across clusters. But, as previously mentioned, it integrates with bloody everything!
  36. Can see anyone who has made changes. Strong authorisation: you can be sure that the people you have authorised are the ones making the change. Can then reverse changes if required (no undo, but can just re-apply the old setting).
  37. Can very tightly control who can do what on the system. E.g. have someone who can see the data but can’t move it, and someone who can move it but not see it.
  38. Can now use processor groups to group flows and secure those. Could have one group with access to one and another group with access to the other, but no cross talk. Similar model to the NiFi web, but on one cluster. Has pros and cons, but is possible.
  39. Makes managing these things easier.
  40. Can get started in 10 minutes, but will also do GB and thousands of events/second. Can be secured.
  41. [ 10 sec ]
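
The production-line picture from note 4 (FlowFiles as the product, processors as the work units, queues as the connectors) can be sketched as a toy model. This is not NiFi's API or implementation; the names `FlowFile`, `run_line`, `tag_source` and `to_upper` are invented here purely to illustrate the idea of content plus attributes moving through a chain of work units.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List


@dataclass
class FlowFile:
    """The product on the line: some content plus key/value attributes."""
    content: bytes
    attributes: Dict[str, str] = field(default_factory=dict)


# A processor is modelled as a function from one FlowFile to another.
Processor = Callable[[FlowFile], FlowFile]


def run_line(source: List[FlowFile], processors: List[Processor]) -> List[FlowFile]:
    """Push every FlowFile through each processor, with a queue between stages."""
    queue: Deque[FlowFile] = deque(source)
    for process in processors:
        next_queue: Deque[FlowFile] = deque()
        while queue:
            next_queue.append(process(queue.popleft()))
        queue = next_queue
    return list(queue)


def tag_source(ff: FlowFile) -> FlowFile:
    # Hypothetical work unit: record where the data came from as an attribute.
    return FlowFile(ff.content, {**ff.attributes, "source": "warehouse"})


def to_upper(ff: FlowFile) -> FlowFile:
    # Hypothetical work unit: transform the content, carry the attributes along.
    return FlowFile(ff.content.upper(), dict(ff.attributes))


if __name__ == "__main__":
    for ff in run_line([FlowFile(b"apples")], [tag_source, to_upper]):
        print(ff.content, ff.attributes)
```

The point of the sketch is only the shape: each stage sees one FlowFile at a time, and the attributes travel with the content from stage to stage, which matches the note that S2S maintains attributes.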