Social Media Monitoring with NiFi, Druid and Superset

This is the presentation for the demo "The Social Stalker", an application for social media monitoring built with NiFi, Druid and Superset.

Published in: Technology

  1. Thiago Santiago, Solution Engineer – Latam. © Hortonworks Inc. 2011–2018. All rights reserved.
  2. Why Customers Choose Hortonworks
     • Global Data Management: hybrid; multi-cloud; end-to-end security and governance
     • 100% Open Source (“We are the Linux of Big Data”): innovation; interoperability; no vendor lock-in; rapid community innovation
     • Proven Business Model: 1,300 enterprise customers; first to IPO; fastest to $100M; first to profitability
     • Most Comprehensive Platform: data at rest and data in motion; any style of workload; centralized management, security, governance
  3. Powering the Modern Data Architecture
     [Diagram labels: Data at Rest; Data in Motion; Actionable Intelligence; Complete Data Lifecycle Management; Run Containerized Applications Concurrently; Edge; Cloud; On-Premises; Holistic Management, Governance and Security; Multi-Workloads; Multi-Type; Multi-Tier; Data Science; SQL Query Engine]
  4. The Data Lake
     [Diagram labels: Data Science; IT Systems & Ops; HDP; HDF]
  5. [Diagram labels: Data at Rest; Data in Motion; Actionable Intelligence; Perishable Insights; Historical Insights]
     • Capture streaming data
     • Deliver perishable insights
     • Combine new & old data
     • Store data forever
     • Access a multi-tenant data lake
     • Model with artificial intelligence
  6. Hortonworks Data Flow 3.1: Ongoing Innovation in Open Source
     [Chart: component versions per HDF release. Releases shown: HDF 1.0 (Dec 2014), HDF 1.2 (Oct 2015), HDF 2.0 (Mar 2016), HDF 2.1 (Aug 2016), HDF 3.0 (Jul 2017), HDF 3.1 (Jan 2018). Components shown: NiFi, NiFi Registry, MiNiFi, Ranger, Ambari, Kafka, Zookeeper, Storm, SAM, Schema Registry. Groups: Security; Streaming & Integration; Operations. The per-release version numbers are not recoverable from the extracted text.]
     * HDF 3.1: shows current Apache branches being used. Final component version subject to change based on Apache release process.
  7. Hortonworks Data Platform 2.6: Ongoing Innovation in Open Source
     [Chart: component versions per HDP release. Releases shown: HDP 2.0 (Oct 2013), HDP 2.1 (April 2014), HDP 2.2 (Dec 2014), HDP 2.3 (Oct 2015), HDP 2.4 (Mar 2016), HDP 2.5 (Aug 2016), HDP 2.6 (2017). Components shown: Hadoop & YARN, Sqoop, Druid (0.9.2), Knox, Ranger, Ambari, Kafka, Zookeeper, Flume, Solr, Slider, Atlas, Accumulo, Phoenix, Storm, Falcon, Tez, Hive, Pig, Oozie, Spark, HBase, Zeppelin. Groups: Data Mgmt; Data Access; Governance & Integration; Operations; Security. The remaining per-release version numbers are not recoverable from the extracted text.]
     * HDP 2.6: shows current Apache branches being used. Final component version subject to change based on Apache release process.
     ** Spark 1.6.3 + Spark 2.1: HDP 2.6 supports both Spark 1.6.3 and Spark 2.1 as GA.
     *** Hive 2.1 is GA within HDP 2.6.
     **** Apache Solr is available as an add-on product, HDP Search.
  8. Lambda Architecture in Hortonworks complementing IBM investments
     [Architecture diagram. Layers: Ingest; Batch Layer; Speed Layer; Serving Layer. Data sources: Relational Bases; Social Networks; WebSites; Mobile Apps; CDR - Network; OOT; Adwords/adserver; Beacon; TWW/Smart Focus; CRM; Marketing; … Components: Kafka; SAM; HDFS (All Data); Hive (Batch Views); Druid (Real Time Views); SuperSet; Zeppelin; Atlas/Ranger; IBM DB2 Big SQL; IBM Spectrum Scale; IBM Stream Computing. Other labels: Clients Applications; Legacy On Premises; Tooling; Data Science, Machine Learning Model; Model Building; Pre-processing; Complex Event Processing; Data Exploration; Analytics, BI, Ad-hoc Exploration; Visualization & Reporting; Custom Applications; Dashboards]
  9. [No text on this slide]
  10. [Diagram labels: Source; JSON Parsing; NiFi Druid Processor]
     Social media analysis is a good use case for showing how to build a streaming analytics dashboard with NiFi, Druid, and Superset. The processing flow has three steps:
     1) Tweet ingestion using Apache NiFi
     2) OLAP database storage using Druid
     3) Visualization using Apache Superset
  11. Lambda Architecture in Hortonworks complementing IBM investments
     [Same architecture diagram as slide 8]
  12. The Social Stalker
     [Architecture diagram labels: Social Networks; Ingest; Speed Layer; Druid; SuperSet; HDFS; Atlas/Ranger; Tooling; Data Science, Machine Learning Model; Pre-processing]
  13. NiFi makes data ingestion fast, easy and secure. Druid is a data store designed for business intelligence (OLAP) queries on event data. Superset's main goal is to make it easy to slice, dice and visualize data. https://community.hortonworks.com/
  14. Article and demo: https://community.hortonworks.com/articles/177561/streaming-tweets-with-nifi-kafka-tranquility-druid.html
  15. Thank you
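
The JSON parsing step named on slide 10 can be illustrated outside NiFi with a short Python sketch. In the demo itself this work happens inside the NiFi flow; the code below is only an approximation of the idea, not the presenter's implementation. It assumes tweets arrive in the Twitter v1.1 JSON shape (`created_at`, `text`, `lang`, `user.screen_name`, `entities.hashtags`), and the flat output field names (`timestamp`, `user`, `hashtags`, and so on) are a hypothetical schema chosen for illustration, not one taken from the slides.

```python
import json
from datetime import datetime, timezone

# Assumption: Twitter v1.1 "created_at" layout, e.g. "Wed Oct 10 20:19:24 +0000 2018".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def flatten_tweet(raw_json: str) -> dict:
    """Turn one raw tweet JSON string into a flat event (hypothetical schema)."""
    tweet = json.loads(raw_json)
    created = datetime.strptime(tweet["created_at"], TWITTER_TIME_FORMAT)
    hashtags = tweet.get("entities", {}).get("hashtags", [])
    return {
        # An OLAP event store needs a timestamp column; emit ISO 8601 in UTC.
        "timestamp": created.astimezone(timezone.utc).isoformat(),
        "user": tweet.get("user", {}).get("screen_name"),
        "lang": tweet.get("lang"),
        "text": tweet.get("text", ""),
        "hashtags": [tag["text"] for tag in hashtags],
    }


# Made-up sample tweet, for illustration only.
sample = json.dumps({
    "created_at": "Wed Oct 10 20:19:24 +0000 2018",
    "text": "Streaming analytics demo #nifi #druid",
    "lang": "en",
    "user": {"screen_name": "example_user"},
    "entities": {"hashtags": [{"text": "nifi"}, {"text": "druid"}]},
})

event = flatten_tweet(sample)
print(json.dumps(event, indent=2))
```

Flattening nested fields into top-level columns keeps each event easy to treat as a row of dimensions plus a timestamp, which is the shape an OLAP store and a dashboard tool generally expect.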
