Database ingest with Apache NiFi and MiNiFi


The talk covers a real use case: database ingest from over 100 servers running MiNiFi, which push data to a NiFi cluster that routes it to HDFS and/or Kafka.

This is a keynote from the series "by Developer for Developers" powered by eSolutions Grup.

Published in: Technology

  1. Data Ingestion with Apache NiFi and MiNiFi. Lucian Neghina, Big Data Architect. by Developer for Developers
  2. Challenges
     ➔ Periodic batch ingestion
     ➔ Incremental ingestion
     ➔ Over 100 servers
     ➔ No access to the servers
     ➔ Push access only
     ➔ Network failures
     ➔ Secure communication
  3. Data Flow
  4. Simplistic view of Data Flow
  5. Realistic view of Data Flow
  6. Apache NiFi. NiFi is based on Flow-Based Programming
     ➔ Data agnostic
     ➔ Guaranteed data delivery
     ➔ Data buffering / back-pressure
     ➔ Prioritized queuing
     ➔ Secure (SSL, HTTPS, encrypted content...)
     ➔ Data provenance
     ➔ Supports push and pull models
     ➔ Designed for extension
     ➔ Clustering
  7. Apache NiFi
     What is NiFi used for?
     ➔ Reliable and secure transfer of data between systems
     ➔ Delivery of data from sources to analytic platforms
     ➔ Enrichment and preparation of data:
       ◆ Conversion between formats
       ◆ Extraction / parsing
       ◆ Routing decisions
     What is NiFi NOT used for?
     ➔ Distributed computation
     ➔ Complex Event Processing
     ➔ Long-term data storage
  8. Apache MiNiFi. Subproject of Apache NiFi
     ➔ Agent model to collect data at the edge
     ➔ Smaller footprint
     ➔ Design & Deploy
     ➔ Warm re-deploys
     NiFi lives in the data center; MiNiFi lives as close as possible to where data is born
  9. System Architecture (diagram labels: Physical Store ... Data Center Server Cluster)
  10. Data Flow - Server-side (NiFi)
  11. Data Flow - Agent-side (MiNiFi)
  12. Thank you ● NiFi Template ● twitter: @nflucian ● github:
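
The slides list incremental ingestion and network failures among the challenges but do not show how the flow handles them. As an illustration only, here is a minimal Python sketch of one common approach, a high-water mark on a monotonically increasing column. It is not the speaker's flow: the function names, the `orders` table, and the `push` callback are all hypothetical, and SQLite stands in for the real source database. NiFi's QueryDatabaseTable processor keeps comparable state through its Maximum-value Columns property; the slides do not say whether the talk's flow uses it.

```python
import sqlite3


def fetch_increment(conn, table, column, last_value, batch_size=1000):
    """Return rows whose `column` is greater than `last_value`, oldest first."""
    # Table and column names cannot be bound as SQL parameters, so they are
    # interpolated here; they must come from trusted configuration only.
    query = (
        f"SELECT * FROM {table} WHERE {column} > ? "
        f"ORDER BY {column} LIMIT ?"
    )
    cursor = conn.execute(query, (last_value, batch_size))
    names = [d[0] for d in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


def ingest_once(conn, table, column, last_value, push):
    """Push one batch and return the new watermark.

    `push` is a hypothetical callback standing in for the transfer to the
    central cluster. The watermark advances only if the push succeeds.
    """
    rows = fetch_increment(conn, table, column, last_value)
    if not rows:
        return last_value
    try:
        push(rows)
    except ConnectionError:
        # Network failure: keep the old watermark so the batch is retried.
        return last_value
    return rows[-1][column]


if __name__ == "__main__":
    # Hypothetical demo data in an in-memory database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
    conn.executemany(
        "INSERT INTO orders (item) VALUES (?)", [("a",), ("b",), ("c",)]
    )
    received = []
    mark = ingest_once(conn, "orders", "id", 0, received.extend)
    print(mark, len(received))
```

Because the watermark moves only after a successful push, a failed transfer causes the same rows to be read again on the next run. That gives at-least-once delivery, so the receiving side may need to tolerate duplicates.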