
Design a Dataflow in 7 minutes with Apache NiFi/HDF


How to create a real-time dataflow in 7 minutes with Hortonworks DataFlow, powered by Apache NiFi.



  1. Create a live dataflow in minutes. How would that change your business? (© Hortonworks Inc. 2011 – 2016. All Rights Reserved)
  2. Add processor for data intake. Time: 1 minute. Step 1: Drag and drop processor from top menu.
  3. Choose the specific processor. Step 2: Choose one of the processors (currently 170+ available).
  4. Example: Pick Twitter Processor.
  5. Configure the processor. Time: 2 minutes. Steps 3 and 4: Select processor and choose option to Configure; adjust parameters as required.
  6. Another processor for data output. Time: 1 minute. Steps 5 and 6: Filter for and select a “Put” processor; drag and drop processor from top menu.
  7. Configure second processor. Time: 1 minute. Step 7: Configure 2nd processor.
  8. Connect processors, configure connection. Time: 2 minutes. Step 8: Configure connection. Note: Sample flow is different from previous example of PutHDFS. This dataflow is PutFile. Same concepts apply.
  9. Click Start to Begin Processing. Time total: 7 minutes. Step 9: Click start (“play”) to begin processing (will run continuously until you select stop).
  10. See Processors Update with Real-Time Changes. Step 10: As data flows, the GUI updates in real time. Step 11: If destination is stopped or unable to receive, queue builds.
  11. Dynamically adjust and tune data flow as needed. Step 12: Dynamically configure, start, stop, tune, reroute, change, or pause dataflows as needed.
  12. Powerful Tools to Quickly Replicate, Group, Repurpose, Tune and Test in Real-Time. Steps 13 and 14: Create a new template; group multiple processes together to create a process group.
  13. Provenance Means Real-Time Traceability of: Data Flow, Data Content, Data Context.
  14. Watch Real-Time Flow of Data: Data Provenance. Step 15: Select Data Provenance.
  15. Trace Lineage of a Particular Piece of Data. Step 16: Icon for Data Lineage.
  16. Every Change to Data is Tracked in Real-Time: processing, views. Step 17: Every event is traceable.
  17. Real-Time Updates of Dataflow: Traceable Context & Content. Step 18: Know immediately both context and content.
  18. Easily access and trace changes to dataflow.
  19. Audit trail of Hortonworks DataFlow User Actions.
  20. Questions? Hortonworks Community Connection: Data Ingestion and Streaming https://community.hortonworks.com/
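
The slides note that when the destination is stopped or unable to receive, the queue on the connection builds, and that a flow can be stopped and restarted as needed. The following is a minimal Python sketch of that idea only. It is a toy model, not NiFi code and not the NiFi API: the names `Connection`, `Sink`, and `run_ticks` are hypothetical and were chosen just for this illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Connection:
    """Toy stand-in for the queue between two processors (hypothetical name)."""
    items: deque = field(default_factory=deque)


@dataclass
class Sink:
    """Toy output processor: it only drains the queue while it is running."""
    running: bool = True
    written: list = field(default_factory=list)

    def drain(self, conn: Connection) -> None:
        # Move queued items to the destination, oldest first, while running.
        while self.running and conn.items:
            self.written.append(conn.items.popleft())


def run_ticks(conn: Connection, sink: Sink, ticks: int, start_id: int = 0) -> int:
    """Each tick the source emits one item, then the sink drains if running."""
    for i in range(start_id, start_id + ticks):
        conn.items.append(f"item-{i}")
        sink.drain(conn)
    return start_id + ticks


conn, sink = Connection(), Sink()

next_id = run_ticks(conn, sink, 3)           # destination running: nothing queues up
queued_while_running = len(conn.items)

sink.running = False                         # stop the destination
next_id = run_ticks(conn, sink, 4, next_id)  # the source keeps producing
queued_while_stopped = len(conn.items)

sink.running = True                          # restart: the backlog is delivered in order
sink.drain(conn)

print(queued_while_running, queued_while_stopped, len(sink.written))  # prints: 0 4 7
```

In this sketch nothing is lost while the destination is stopped: the four items emitted during the pause wait in the queue and are delivered in their original order once the sink is restarted.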
