
Unlock Value from Big Data with Apache NiFi and Streaming CDC


Apache NiFi is an easy-to-use, powerful, and reliable system to process and distribute data. It provides an end-to-end platform that can collect, curate, analyze, and act on data in real time, on-premises or in the cloud, with a drag-and-drop visual interface. It is being used across industries on large amounts of data that had been stored in isolation, which made collaboration and analysis difficult.
Join industry experts from Hortonworks and Attunity as they explain how Apache NiFi and streaming CDC technology provide a distributed, resilient platform for unlocking the value of data in new ways.

Published in: Technology


  1. Unlock Value from Big Data with Apache NiFi and Streaming CDC. Mark Payne, Sr. Member of Technical Staff, Hortonworks; Jordan Martz, Director, Technology Solutions, Attunity. © Hortonworks Inc. 2011–2018. All rights reserved.
  2. Jordan Martz, Director, Technology Solutions, Attunity. Mark Payne, Sr. Member Technical Staff, NiFi PMC, Hortonworks.
  3. Agenda • Apache NiFi – What and Why • Features of Apache NiFi • Demo • Use Cases of Apache NiFi • Streaming CDC with NiFi and Attunity
  4. Apache NiFi – What and Why
  5. Apache NiFi – What and Why: What Is It? • Tool for getting the right data to the right place(s), in the right format(s), at the right time • Listen, Fetch Data • Split, Aggregate Data • Route, Transform Data • Push Data • Drag & Drop Dataflow
  6. Apache NiFi – What and Why: Why – A Brief History • Even with a large budget, nothing on the market • Typical Approaches to DataFlow • Messaging Frameworks (e.g., Kafka, JMS) • Scripts • ESBs • Shortcomings of These Approaches • Visualization • Maintainability • Monitoring and Operations • Data Traceability
  7. Features of Apache NiFi
  8. Features of NiFi: Drag & Drop UI • Quickly build out DataFlow by dragging components • Real-Time Command & Control • Real-Time Monitoring
  9. Features of NiFi: Operations First • Visualize and Monitor performance, behavior in flow • Bulletins provide immediate insight • Inline documentation • Start and Stop components individually or at group level • Visualize DataFlow at the enterprise level, not only the “pipeline” level
  10. Features of NiFi: Data Provenance • Fine-grained data lineage • Immutable data stores • See data before and after each event • Enables compliance use cases • Enables debugging, understanding
  11. Features of NiFi: Data Agnostic • Data of any format, any schema • Or with no schema • Structured, unstructured, or semi-structured • Data of any size
  12. Demo Time
  13. NiFi Use Cases
  14. Insurance Industry - Actionable Intelligence with Attunity Replicate. Use cases: RISK & UNDERWRITING ANALYSIS; USAGE-BASED INSURANCE; CLAIMS ANALYTICS; NEW PRODUCT DEVELOPMENT; CYBER RISK ANALYTICS. Data sources: Catastrophic Event Data; Customer Onboarding Data; Seismic Data; Biometrics Data; Usage-Based Driver Data; Cyber Threat Metadata; Drones & Aerial Imagery; Claims Docs, Notes & Diaries; Weather & Environment; Underwriting Analysis; Policy Histories; Photos
  15. Insurance Industry - Business use cases: Emerging Tech, Real-time data and the Connected World Smart Cities & Buildings Smart Factories / Commercial Connected Life / Health / Medicine IoT / Robotics Telematics Shared Economy Smart Homes Cyber / AI / Analytics
  16. Attunity Enables Transformation in Insurance. Insurance: Risk & underwriting analysis; Usage-based insurance; Claims analytics; New product development; Cyber-risk analytics • NiFi: Usage-Based Driver Data, Weather & Environment, Drones & Aerial Imagery, Seismic Data, Biometrics Data, Cyber Threat Metadata, Catastrophic Event Data, Photos • Replicate: Underwriting Analysis, Customer Onboarding Data, Policy Histories, Claims Docs, Notes & Diaries
  17. Actionable Intelligence Makes Healthcare Precise and Personal. Use cases: SINGLE VIEW OF PATIENT; REAL-TIME VITAL SIGN MONITORING; BILLING & REIMBURSEMENTS; EMR OPTIMIZATION; SUPPLY CHAIN OPTIMIZATION. Data sources: Patient Records; Lab Data; Pharmacy Data; Patient Locations; Wearables; Intra-Network Data; Sensor Data; Claims Data; Social Media; Physician Notes; Patient Satisfaction Data; Clinical (EMR) Data
  18. Attunity Enables Transformational Hadoop Analytics in Healthcare. Healthcare: SINGLE VIEW OF PATIENT; REAL-TIME VITAL SIGN MONITORING; BILLING & REIMBURSEMENTS; EMR OPTIMIZATION; SUPPLY CHAIN OPTIMIZATION • NiFi: Wearables, Social Media, Sensor Data, Pharmacy Data, Physician Notes, Patient Locations, Intra-Network Data • Replicate: Claims Data, Clinical (EMR) Data, Patient Records, Patient Satisfaction Data, Lab Data
  19. Telecommunications
  20. Attunity Replicate Completes the 360-degree Views of Telecom Customers. Telecommunications: SINGLE VIEW OF THE CUSTOMER; CHURN REDUCTION; CDR ANALYSIS; NETWORK OPTIMIZATION; DYNAMIC BANDWIDTH ALLOCATION • NiFi: Server Logs, Social Media, Clickstream, Cyber Threat Metadata, Sensor Data, Voice-to-Text • Replicate: ERP System Data, CRM Records, Call Detail Records, Billing Data, Subscriber Profiles, Product Catalogs
  21. Actionable Intelligence Powers Modern Manufacturing. Use cases: PREVENTATIVE MAINTENANCE; SUPPLY CHAIN OPTIMIZATION; YIELD MAXIMIZATION; QUALITY CONTROL; RECALL AVOIDANCE. Data sources: Defect Testing Data; Product Designs; MES Systems; RFID Streams; SCADA Systems; Shop Floor Sensors; ERP Systems; Supplier Receipts; Machine Data; Assembly Line Sensors; Data Historians; Work Orders
  22. Data Drives the Connected Car – Must include SAP SCM, etc. Insurance Premiums Warranties Recalls Pricing Models Design Innovation Autonomous Driving Connected City Infotainment Sensors Scheduled Maintenance Predictive Maintenance Route Optimization INSURANCE COMPANIES GOVERNMENT AGENCIES INFOTAINMENT PROVIDERS SOFTWARE COMPANIES AUTO MAKERS
  23. Attunity Replicate Enables Global Manufacturing and Automotive Transformation. Manufacturing: Preventative Maintenance, Supply Chain Optimization, Yield Maximization, Quality Control, and Recall Avoidance Development • Apache NiFi: Machine Data, Testing Data, Assembly Line Sensors, RFID Streams, Shop Floors • Attunity Replicate: SCADA Systems, Work Orders, Supplier Receipts, ERP Systems, MES, Data Historians, Product Designs
  24. Oil & Gas - Industry Data Trends. Use cases: REAL-TIME MONITORING; SINGLE VIEW OF OPERATIONS; PREDICTIVE MAINTENANCE; ARCHIVE & ANALYTICS; UNSTRUCTURED DATA CLASSIFICATION. Data sources: ERP Data; Engineering Notes; IoT Gateway Data; Video; WITSML Data; Weather & Environment; Vehicle GPS Data; GIS Data; SCADA Systems; Field Comments; Production Histories; G&G Data
  25. Attunity Replicate Pumps More Relevant Data for Oil & Gas. Oil & Gas: REAL-TIME MONITORING; SINGLE VIEW OF OPERATIONS; PREDICTIVE MAINTENANCE; ARCHIVE & ANALYTICS; UNSTRUCTURED DATA CLASSIFICATION • NiFi: WITSML Data, SCADA Systems, Vehicle GPS Data, Video, Production Histories, Weather & Environment • Replicate: G&G Data, GIS Data, ERP Data, Field Comments, SCADA (Engineering Notes)
  26. Actionable Intelligence Transforms Energy and Utilities. Use cases: REVENUE PROTECTION; SINGLE VIEW OF CUSTOMER; PREDICTIVE EQUIPMENT MAINTENANCE; CONSERVATION VOLTAGE REDUCTION; COMMODITY TRADING. Data sources: Asset Data; Customer Surveys; Weather & Environmental; Service Fleet GPS Data; Smart Meter Streams; Commodity Prices; Social Media; GIS Data; SCADA; Outage Histories; CIS Records; EDW
  27. Attunity Replicate Pumps More Relevant Data for Energy and Utilities. Utilities: REVENUE PROTECTION; SINGLE VIEW OF CUSTOMER; PREDICTIVE EQUIPMENT MAINTENANCE; CONSERVATION VOLTAGE REDUCTION; COMMODITY TRADING • Apache NiFi: smart meter streams, GIS data, social media, weather & environmental, CIS records, customer surveys, commodity prices • Attunity Replicate: SCADA, EDW, asset data, outage histories
  28. Actionable Intelligence Powers Today’s Financial Services. Use cases: DIGITAL CUSTOMER 360; RISK DATA AGGREGATION; ANTI-MONEY LAUNDERING; FRAUD DETECTION; TRADE SURVEILLANCE. Data sources: OFAC Lists; Credit Records; ATM Streams; Transactions & Wires; Stock Tickers; Trade Settlements; Mobile App Data; Trade Data; Web Logs; Banker Notes; Demographic Data; Customer Transaction Data
  29. Attunity Replicate Connects You to Your Money and Integrates with Key Market Resources. Financial Services: DIGITAL CUSTOMER 360; RISK DATA AGGREGATION; ANTI-MONEY LAUNDERING; FRAUD DETECTION; TRADE SURVEILLANCE • Apache NiFi: web logs, trade data, mobile app data, ATM streams, OFAC lists, transactions & wires, demographic data, trade settlements • Attunity Replicate: customer transaction data, OFAC lists, banker notes, credit records, stock tickers
  30. Streaming CDC with NiFi and Attunity Replicate
  31. ATTUNITY REPLICATE: Automated, real-time data delivery software
  32. Attunity Replicate Architecture. Diagram labels: TRANSFER, IN-MEMORY, FILTER, HADOOP, RDBMS, DATA WAREHOUSE, FILES, MAINFRAME, TRANSFORM, FILE CHANNEL, PERSISTENT STORE, CDC, BATCH, INCREMENTAL BATCH, HADOOP, RDBMS, DATA WAREHOUSE, STREAMING, FILES. © 2018 Attunity
  33. Universal Platform Coverage – Sources: DATABASE EDW HADOOP CLOUD MAINFRAME SAP FLAT FILES OTHER LEGACY Oracle SQL Server DB2 iSeries DB2 z/OS DB2 LUW MySQL PostgreSQL Sybase ASE Informix Exadata Teradata Netezza Vertica Pivotal Hortonworks Cloudera MapR DB2 for z/OS IMS/DB VSAM ECC on Oracle ECC on SQL ECC on DB2 ECC on HANA S4 HANA AWS RDS Amazon Aurora Salesforce SQL/MP Enscribe RMS Delimited (e.g., CSV, TSV)
  34. Universal Platform Coverage – Targets: DATABASE EDW STREAMING CLOUD HADOOP Oracle SQL Server DB2 LUW MySQL PostgreSQL Sybase ASE Informix Microsoft PDW Exadata Teradata Netezza Vertica Sybase IQ Amazon Redshift Actian Vector SAP HANA Amazon RDS Amazon Redshift Amazon EMR Amazon S3 Amazon Aurora Google Cloud SQL Azure SQL DW Azure SQL DB Snowflake Hortonworks Cloudera MapR Amazon EMR HDInsight • Azure Event Hubs • MapR-ES • Kafka FLAT FILES SAP HANA Delimited (e.g., CSV, TSV)
  35. MODERN DATA INGEST • CDC (log-based) for high performance, low latency and low impact • Single platform for all key enterprise systems • Hive-optimized for HDP and Stream-optimized for HDF • Point-and-Click with NO coding and NO agents. Diagram labels: METADATA, HIVE OPTIMIZED, STREAM OPTIMIZED, CHANGE DATA CAPTURE, CLOUD, ON PREM, WAREHOUSE, MAINFRAME, RDBMS, SAP
  36. SAP DATA INGEST • Unlock and decode SAP application data • Real-time and continuous ingest with CDC • Native agent, SAP certified • All core and industry-specific SAP ECC modules. All the standard SAP ECC modules (FI, CO, MM, PM, SD, PM, HR, …). All industry-specific solutions (e.g., IS-Utilities, IS-OIL, …). Systems shown: SAP SRM, SAP ERP, SAP BW, SAP HR, SAP GTS, SAP CRM, SAP EWM, SAP TM, SAP SCM, SAP EM, ANY INDUSTRY SOLUTION. Diagram labels: METADATA, HIVE OPTIMIZED, STREAM OPTIMIZED, CHANGE DATA CAPTURE, SAP NATIVE AGENT
  37. RAPID ODS WITH HIVE LLAP • Automates creation of analytics-ready Hive dataset • Reconciles source data and metadata updates • Transformation processing pushed down to Hive. Diagram labels: METADATA, HIVE OPTIMIZED, STREAM OPTIMIZED, CHANGE DATA CAPTURE, CLOUD, ON PREM, WAREHOUSE, MAINFRAME, RDBMS, SAP, HIVE HQL TRANSFORM & UPDATE
  38. EDW OFFLOAD WITH USAGE PROFILING • Identify cold data to be moved from EDW • Perform impact analysis based on user activity • Automatically generate & execute replication tasks. Diagram labels: CHANGE DATA CAPTURE, CLOUD, ON PREM, TERADATA, EXADATA, NETEZZA, DB2, OFFLOAD TASKS, EDW USAGE & ANALYTICS, METADATA, HIVE OPTIMIZED, STREAM OPTIMIZED
  39. DATA PLANE SERVICES WITH ATTUNITY • Simple batch ingest (easy, +metadata) • Streaming CDC ingest (for HDF, cloud) • High volume offload from EDW (e.g. Teradata) • Metadata replication (with DDL capture). Packaging: • ISV service • Co-branded service • HWX service. Diagram labels: EDW ETL & UPDATE, Ingest & Stream with CDC, Data Heat Map
  40. BIG DATA INTEGRATION MATURITY MODEL. Level 1 Sandbox; Level 2 Opportunistic; Level 3 Workgroup; Level 4 Enterprise; Level 5 Transformative. Bulk data transfer; Manual change data capture; Non-invasive CDC via change logs; Automatically generate target schemas, process DML, and respond to source DDL changes; Hybrid deployments; publish to multiple streams; Microservices API. Programmatic, resource intensive; System resource intensive; inflexible and brittle; people intensive change management; Non-invasive, agentless, automated movement, flexible; Real-time analytic availability; Lambda architecture; fully automated; Resilient; high-availability; single console management for global deployments. Rows: Style, Capabilities, Product Examples. Product examples: Sqoop; Sqoop with database time stamps, triggers and Change Tables, or Query-based CDC; Attunity Replicate; Attunity Enterprise Manager; Attunity Visibility; Attunity Compose for Hive. Axis: Manual to Automated
  41. Download Apache NiFi for Dummies Today! www.Attunity.com/nifibook
  42. Questions?
  43. Thank you. For more information, go to: www.Attunity.com/nifibook
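
Slides 5 and 10 describe NiFi as a tool that fetches, transforms, routes, and pushes data while recording what each item looked like before and after every event. The following is a minimal plain-Python sketch of that idea, not NiFi's API: the `FlowFile` class here is a hypothetical stand-in, and the step names, the `type` attribute, and the `alerts`/`archive` destinations are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FlowFile:
    """Hypothetical stand-in for a unit of data: content plus key/value attributes."""
    content: bytes
    attributes: Dict[str, str] = field(default_factory=dict)


# Append-only log of (step, content before, content after); entries are never edited.
ProvenanceLog = List[Tuple[str, bytes, bytes]]


def transform_upper(ff: FlowFile, log: ProvenanceLog) -> FlowFile:
    """Transform step: upper-case the content and record before/after."""
    out = FlowFile(ff.content.upper(), dict(ff.attributes))
    log.append(("transform_upper", ff.content, out.content))
    return out


def route_by_type(ff: FlowFile, log: ProvenanceLog) -> str:
    """Route step: pick a destination name from the (assumed) 'type' attribute."""
    dest = "alerts" if ff.attributes.get("type") == "alert" else "archive"
    log.append((f"route:{dest}", ff.content, ff.content))
    return dest


def run_flow(inputs: List[FlowFile]) -> Tuple[Dict[str, List[FlowFile]], ProvenanceLog]:
    """Fetch -> transform -> route -> push into per-destination lists."""
    log: ProvenanceLog = []
    destinations: Dict[str, List[FlowFile]] = {"alerts": [], "archive": []}
    for ff in inputs:
        out = transform_upper(ff, log)
        destinations[route_by_type(out, log)].append(out)
    return destinations, log
```

Calling `run_flow([FlowFile(b"disk full", {"type": "alert"}), FlowFile(b"ok")])` returns the routed items together with a log holding two entries per input, one for the transform and one for the routing decision, which is the kind of record a lineage or debugging view could be built on.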

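Slide 35 describes log-based change data capture (CDC) as the ingest mechanism. As a rough sketch of what a downstream consumer does with such a stream, the code below applies an ordered list of change events to an in-memory target keyed by primary key. The event shape (`op`, `key`, `row`) is an assumption for illustration and is not Attunity Replicate's actual message format.

```python
from typing import Any, Dict, Iterable

# Hypothetical change-event shape, for illustration only:
# {"op": "insert" | "update" | "delete", "key": <primary key>, "row": {...} or None}
ChangeEvent = Dict[str, Any]
Table = Dict[Any, Dict[str, Any]]


def apply_changes(target: Table, events: Iterable[ChangeEvent]) -> Table:
    """Apply change events, in order, to an in-memory target keyed by primary key."""
    for event in events:
        op, key = event["op"], event["key"]
        if op == "insert":
            target[key] = dict(event["row"])
        elif op == "update":
            # Merge the changed columns into the existing row (upsert if missing).
            target.setdefault(key, {}).update(event["row"])
        elif op == "delete":
            target.pop(key, None)
        else:
            raise ValueError(f"unknown op: {op!r}")
    return target
```

Order matters: replaying the same events in a different sequence can leave the target in a different state, which is one reason a stream built from the source's change log, where commit order is preserved, is attractive compared with periodic timestamp queries.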