
Apache NiFi 1.0 in Nutshell

These are the slides we presented at Hadoop Summit Tokyo 2016.

Published in: Technology

  1. Apache NiFi 1.0 in Nutshell. Koji Kawamura, Software Engineer. Arti Wadhwani, Technical Support Engineer. 2016 October 27
  2. © Hortonworks Inc. 2011–2016. All Rights Reserved. Agenda: What’s NiFi, NiFi 1.0 Enhancements, NiFi on the edge, Common issues, What’s Next?
  3. Agenda: What’s NiFi, NiFi 1.0 Enhancements, NiFi on the edge, Common issues, What’s Next?
  4. A Brief History. 2006: NiagaraFiles (NiFi) was first conceived at the National Security Agency (NSA). November 2014: NiFi is donated to the Apache Software Foundation (ASF) through NSA’s Technology Transfer Program and enters ASF’s incubator. July 2015: NiFi reaches ASF top-level project status.
  5. What’s Apache NiFi? “NiFi is like digging irrigation ditches as the water flows, rather than building out a sprinkler system in advance.” https://mail-archives.apache.org/mod_mbox/nifi-users/201604.mbox/%3C2FCCBD60-0A79-42F1-9F9B-A121591C826E@apache.org%3E
  6. NiFi is a tool for Data Flow Management
  7. Simplistic View of Dataflows: Easy, Definitive. Dataflow: Acquire Data, Process and Analyze Data, Store Data
  8. Realistic View of Dataflows: Complex, Convoluted. Dataflow: Acquire Data (many sources), Process and Analyze Data, Store Data (many destinations)
  9. NiFi 1.0 has 170+ Processors, a 30% increase from NiFi 0.7. Examples: Hash, Extract, Merge, Duplicate, Scan, GeoEnrich, Replace, Convert, Split, Translate, Route Content, Route Context, Route Text, Control Rate, Distribute Load, Generate Table Fetch, Jolt Transform JSON, Prioritized Delivery, Encrypt, Tail, Evaluate, Execute, Fetch, HL7, FTP, UDP, XML, SFTP, HTTP, Syslog, Email, HTML, Image, AMQP, MQTT. All Apache project logos are trademarks of the ASF and the respective projects.
  10. Deeper Ecosystem Integration: New Processors. Publish/ConsumeKafka: two NARs, with Kafka 0.9 and 0.10 client libraries, respectively. JoltTransformJson: manipulate JSON data on the fly, with a preview function. GenerateTableFetch: incremental fetch plus parallel fetch against source table partitions. PutHiveQL: ingest into Hive tables. SelectHiveQL: select from Hive tables. PutHiveStreaming: ingest streaming data into Hive, leveraging the Hive Streaming API. ConvertAvroToORC: format conversion, Avro to ORC. Publish/ConsumeMQTT: MQTT is a popular protocol in the IoT world.
  11. Data Movement Management across SOURCES, REGIONAL INFRASTRUCTURE, and CORE INFRASTRUCTURE. Near the sources: constrained, high-latency, localized context. Toward the core: hybrid (cloud/on-premise), low-latency, global context.
  12. Hortonworks DataFlow (HDF) across SOURCES, REGIONAL INFRASTRUCTURE, and CORE INFRASTRUCTURE. Near the sources: constrained, high-latency, localized context. Toward the core: hybrid (cloud/on-premises), low-latency, global context.
  13. Detailed Breakdown of Requirements (spanning Flow Management and Stream Processing & Analytics). Req 1: Acquire data from various wearable devices’ cloud instances. Req 2: Move data from customer cloud instances to the on-premise instance. Req 3: Perform intelligent routing and filtering of data; the routing and filtering rules will often change at run-time. Req 4: Deliver the data to various downstream systems; new downstream apps will keep appearing, and the data should be fed to each one when it comes online. Req 5: Parse the device data into a standardized format that downstream systems can understand. Req 6: Enrich the data with contextual information, including patient/customer info (age, gender, etc.). Req 7: Recognize the pattern when the resting heart rate exceeds a certain threshold (the insight), and then create an alert/notification. Req 8: Run an outlier detection model on the incoming heart rate stream; if the score is above a certain threshold, alert on the heart rate.
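
As a rough illustration of Req 7 on slide 13, the following sketch flags readings whose resting heart rate exceeds a threshold. It is plain Python rather than a NiFi flow, and the `Reading` type, the `find_alerts` function, and the 100 bpm threshold are all hypothetical names and values chosen for the example, not anything taken from the slides.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Reading:
    """One parsed device reading (hypothetical shape)."""
    patient_id: str
    resting_heart_rate: int  # beats per minute


def find_alerts(readings: Iterable[Reading], threshold_bpm: int) -> List[str]:
    """Return one alert message per reading that exceeds the threshold."""
    return [
        f"ALERT {r.patient_id}: resting heart rate "
        f"{r.resting_heart_rate} bpm > {threshold_bpm} bpm"
        for r in readings
        if r.resting_heart_rate > threshold_bpm
    ]


if __name__ == "__main__":
    sample = [Reading("p1", 62), Reading("p2", 105)]
    for message in find_alerts(sample, threshold_bpm=100):
        print(message)
```

In a real deployment the threshold would presumably be configurable at run-time, in line with Req 3.
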
  14. Agenda: What’s NiFi, NiFi 1.0 Enhancements, NiFi on the edge, Common issues, What’s Next?
  15. NiFi 1.0: Modernized UI
  16. Modernized UI: Complete Interface Redesign
  17. Connect Components to design your data flow (components listed from left to right). Processor: purpose-built processing unit, e.g. GetXXX, PutXXX. Input Port: endpoint for receiving data between Process Groups (local/remote). Output Port: endpoint for exposing data between Process Groups (local/remote). Process Group: a must-have for designing a well-structured data flow. Remote Process Group: enables data transfer between NiFi deployments via Site-to-Site. Funnel: bundles multiple relationships into one. Template: shares part of a data flow. Label: useful for visually grouping processors and adding descriptions.
  18. Data Provenance
  19. NiFi 1.0: Multitenant Authorization
  20. NiFi 0.x: Authorization Model. Previously had role-based authorization, with the roles Dataflow Manager (DFM), Monitor, Provenance, Admin, Proxy, and NiFi. Limitation: an all-or-nothing model. DFM can change everything, Monitor can change nothing. You can’t give a user the ability to view or modify only certain components; that would require standing up multiple NiFi instances.
  21. NiFi 1.0: Authorization Model. NiFi 1.0 introduces a new delegated authorization model. Each request is authorized based on user identity, action, and resource. Example for user1 modifying properties on processor1: User Identity: user1; Action: WRITE; Resource: processor1 (uuid). If the authorizer says the resource is not found, the parent is checked; if the parent isn’t found, the parent’s parent is checked, and so on.
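
The parent lookup described on slide 21 can be sketched as a walk up the component hierarchy until some level has a policy for the requested action. This is a simplified model, not NiFi's authorizer API: the `PARENTS` and `POLICIES` tables, the resource names, and the `is_authorized` function are hypothetical, and denying by default when no level has a policy is an assumption of the sketch.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical hierarchy: each resource maps to its parent (None at the root).
PARENTS: Dict[str, Optional[str]] = {
    "processor1": "process-group-a",
    "process-group-a": "root-group",
    "root-group": None,
}

# Hypothetical policies: (resource, action) -> users allowed to perform it.
POLICIES: Dict[Tuple[str, str], Set[str]] = {
    ("root-group", "READ"): {"user1", "user2"},
    ("process-group-a", "WRITE"): {"user1"},
}


def is_authorized(user: str, action: str, resource: str) -> bool:
    """Check the resource, then its ancestors, for the nearest policy."""
    current: Optional[str] = resource
    while current is not None:
        allowed = POLICIES.get((current, action))
        if allowed is not None:
            # The closest level that defines a policy decides the outcome.
            return user in allowed
        current = PARENTS.get(current)
    # Assumption: no policy anywhere in the chain means the request is denied.
    return False


if __name__ == "__main__":
    print(is_authorized("user1", "WRITE", "processor1"))
    print(is_authorized("user2", "WRITE", "processor1"))
```

Here processor1 has no policy of its own, so a WRITE request falls through to process-group-a, and a READ request falls through to root-group.
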
  22. NiFi 1.0: NiFi Managed Authorizer vs. External Authorizer. Managed Authorizer: file-based persistence (could be extended to other persistence mechanisms); NiFi UI to manage policies; NiFi controls authorization logic. External Authorizer: Ranger integration; Ranger UI to manage policies; Ranger controls authorization logic.
  23. NiFi 1.0: Managing Users. Clicking the new user icon allows the admin to create Users and Groups: individual Users can be grouped, and Groups can be assigned members. Clicking the edit user icon allows the admin to update a specific User/Group.
  24. NiFi 1.0: UI Overview. The Users icon in the Global Menu is used to access Users/Groups. The Lock icon in the Global Menu is used to access Global policies.
  25. NiFi 1.0: UI Overview. The Lock icon in the palette is used to access policies for the currently selected component (the selection context).
  26. NiFi 1.0: Overriding Component Policies. Components inherit policies from the closest ancestor Process Group with policies defined. View and Modify policies are handled independently. Click Override to define a new policy, then add Users and Groups. The new Users and Groups override the inherited policies (whitelisting).
  27. NiFi 1.0: Multi-Tenancy Example. Create a Group for Team 1 and a Group for Team 2. Give Team 1 view & modify for Process Group 1. Give Team 2 view & modify for Process Group 2. What a user from Team 1 would see: for a group they have no access to, they can’t see the name of the group and can’t right-click to configure it, but they can enter the group.
  28. NiFi 1.0: Revisions. One revision per component. Supports concurrent editing of different components without the need for refreshing.
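
One way to picture per-component revisions (slide 28) is optimistic concurrency: each component carries its own version number, and an update succeeds only if the client's version matches. The sketch below is an illustration under that assumption, not NiFi's implementation; `Component`, `update`, and `StaleRevisionError` are hypothetical names.

```python
from typing import Dict


class StaleRevisionError(Exception):
    """Raised when a client edits a component using an outdated revision."""


class Component:
    def __init__(self, name: str) -> None:
        self.name = name
        self.version = 0  # revision tracked per component, not per flow
        self.properties: Dict[str, str] = {}

    def update(self, client_version: int, changes: Dict[str, str]) -> int:
        """Apply changes if the client's revision is current; return the new one."""
        if client_version != self.version:
            raise StaleRevisionError(
                f"{self.name}: client has revision {client_version}, "
                f"current is {self.version}"
            )
        self.properties.update(changes)
        self.version += 1
        return self.version


if __name__ == "__main__":
    a, b = Component("processor-a"), Component("processor-b")
    a.update(0, {"Batch Size": "10"})   # one user edits processor-a
    b.update(0, {"Batch Size": "20"})   # another edits processor-b, no conflict
    try:
        a.update(0, {"Batch Size": "30"})  # stale revision for processor-a
    except StaleRevisionError as err:
        print(err)
```

Because the revisions are independent, an edit to one component never invalidates a pending edit to another; only two edits to the same component can conflict.
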
  29. NiFi 1.0: Zero Master Clustering
  30. NiFi 0.x: NCM (NiFi Cluster Manager). Cluster: NCM, Node1, Node2, with one node acting as Primary. External Data Source: data arrives in chunks; the distribution mechanism depends on the data source. Web UI: interacts with the NCM. Other NiFi (Site-to-Site): gets the topology from the NCM, then transfers data p2p.
  31. NiFi 1.0: ZMC (Zero Master Clustering). Cluster: Node1, Node2, Node3, plus ZooKeeper. ZooKeeper elects the Cluster Coordinator and the Primary node; any node can fail. External Data Source: data arrives in chunks; the distribution mechanism depends on the data source. Web UI: interacts with any node. Other NiFi (Site-to-Site): gets the topology from one of the nodes, then transfers data p2p.
  32. NiFi 1.0: And More!
  33. Foundational Work for SDLC. Deterministic template export: deterministic ordering in the template XML file; version control of the template; collaborative SDLC effort. Variable registry: phase one implementation; in-memory variable registry; the same key referenced in a template is mapped to different environment-specific values.
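
The variable registry idea on slide 33, where one key in a template resolves to a different value per environment, can be sketched with a plain dictionary per environment. The `resolve` function, the `${key}` placeholder pattern as handled here, and the `kafka.brokers` key with its values are assumptions made for illustration; this is not the NiFi variable registry API.

```python
import re
from typing import Dict

# Matches placeholders of the form ${some.key}.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve(template_value: str, registry: Dict[str, str]) -> str:
    """Replace each ${key} with its value from the given environment registry."""
    def lookup(match: "re.Match[str]") -> str:
        key = match.group(1)
        # Unknown keys are left untouched so the gap stays visible.
        return registry.get(key, match.group(0))

    return _PLACEHOLDER.sub(lookup, template_value)


# Hypothetical in-memory registries, one per environment.
DEV = {"kafka.brokers": "dev-broker:9092"}
PROD = {"kafka.brokers": "prod-broker-1:9092,prod-broker-2:9092"}

if __name__ == "__main__":
    template_property = "${kafka.brokers}"
    print(resolve(template_property, DEV))
    print(resolve(template_property, PROD))
```

The template itself stays identical across environments, which is what makes it practical to keep under version control.
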
  34. Enter the TLS Toolkit. Command-line tool to automate certificate generation and configuration. Self-contained certificate authority (CA) for certificate signing. Keystore and truststore generation. Client certificate generation. Automatically updates nifi.properties. Underpins Ambari TLS integration.
  35. Site-to-Site Refactoring: S2S HTTP(S) Protocol through Proxy Server. Diagram: a NiFi JVM (REST API, NiFi Framework, Extension API with Proc, CS, and Report Task, S2S API) and a client JVM using the S2S Client Libraries. Socket protocol: TCP. HDF 2.0: HTTP(S) protocol, via an HTTP proxy.
  36. Agenda: What’s NiFi, NiFi 1.0 Enhancements, NiFi on the edge, Common issues, What’s Next?
  37. Edge Intelligence with Apache MiNiFi. Key Features: guaranteed delivery; data buffering (backpressure, pressure release); prioritized queuing; flow-specific QoS (latency vs. throughput, loss tolerance); data provenance; recovery / recording a rolling log of fine-grained history; designed for extension. Different from Apache NiFi: design and deploy; warm re-deploys.
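
Two of the features on slide 37, backpressure and prioritized queuing, can be illustrated together with a bounded priority queue: when the queue is full it refuses new items so the producer has to slow down, and items leave in priority order. This is a conceptual sketch only; `PrioritizedQueue`, `offer`, and `poll` are hypothetical names, and pressure release (discarding aged data) is deliberately left out.

```python
import heapq
import itertools
from typing import Any, List, Optional, Tuple


class PrioritizedQueue:
    """Bounded queue: lower priority number is delivered first."""

    def __init__(self, max_items: int) -> None:
        self.max_items = max_items
        self._heap: List[Tuple[int, int, Any]] = []
        # Tie-breaker keeps insertion order among equal priorities.
        self._counter = itertools.count()

    def offer(self, priority: int, item: Any) -> bool:
        """Try to enqueue; False signals backpressure to the producer."""
        if len(self._heap) >= self.max_items:
            return False
        heapq.heappush(self._heap, (priority, next(self._counter), item))
        return True

    def poll(self) -> Optional[Any]:
        """Remove and return the highest-priority item, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


if __name__ == "__main__":
    queue = PrioritizedQueue(max_items=2)
    queue.offer(5, "routine telemetry")
    queue.offer(1, "urgent alert")
    accepted = queue.offer(3, "status update")  # queue is full
    print("accepted:", accepted)
    print("first out:", queue.poll())
```

Returning `False` instead of blocking keeps the sketch simple; a producer could equally wait until space frees up.
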
  38. NiFi vs. MiNiFi. NiFi: NiFi Framework, User Interface, Components. MiNiFi: NiFi Framework, Components (Java Processor, smaller footprint, ~40 MB).
  39. Agenda: What’s NiFi, NiFi 1.0 Enhancements, NiFi on the edge, Common issues, What’s Next?
  40. Common issues: NiFi repository configuration issues; NiFi SSL configuration or certificate issues; ExecuteStreamCommand processor getting stuck; OutOfMemory issues with the NCM or processors.
  41. Best Practices: enable debug logging when investigating processor issues; for core properties and JVM tuning, see https://community.hortonworks.com/articles/7882/hdfnifi-best-practices-for-setting-up-a-high-perfo.html
  42. Understanding health via the NiFi UI: Status Bar, Processor Details
  43. NiFi Summary Page
  44. System Information
  45. Agenda: What’s NiFi, NiFi 1.0 Enhancements, NiFi on the edge, Common issues, What’s Next?
  46. What’s Next (NiFi and MiNiFi). Framework extension: distributed data durability (HA data); configuration management flows (SDLC). Enhanced user experience: Template/Extension Registry; Variable Registry. Deeper ecosystem integration. Central Command and Control. Native Agent (GA). Details: https://cwiki.apache.org/confluence/display/NIFI/Product+requirements (or search for “NiFi product requirements”).
  47. Thank You
