Real-Time Ingesting and Transforming Sensor Data and Social Data with NiFi and TensorFlow

In this talk I will show data engineers and architects how to run real-time TensorFlow Inception image recognition on images captured by remote sensors and on images in tweets and Facebook posts.
In the same flow I will also demonstrate how to apply real-time sentiment analysis and intelligent routing of data to Phoenix, Email and Slack.
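
The routing idea above can be pictured with a minimal sketch. It assumes each message already carries a compound sentiment score between -1 and 1 (as VADER produces), uses the commonly cited ±0.05 cut-offs, and invents the rule that negative messages trigger Slack and Email alerts while everything is stored in Phoenix. The function names, thresholds and routing rule are illustrative assumptions, not the flow shown in the talk.

```python
from typing import List

# Assumed cut-offs; +/-0.05 is a common convention for VADER compound scores.
NEGATIVE_THRESHOLD = -0.05
POSITIVE_THRESHOLD = 0.05


def sentiment_label(compound: float) -> str:
    """Map a compound score in [-1, 1] to a coarse label."""
    if compound <= NEGATIVE_THRESHOLD:
        return "negative"
    if compound >= POSITIVE_THRESHOLD:
        return "positive"
    return "neutral"


def route(compound: float) -> List[str]:
    """Return hypothetical destinations for one scored message."""
    destinations = ["phoenix"]  # assumption: every record is stored
    if sentiment_label(compound) == "negative":
        # assumption: only negative messages raise alerts
        destinations += ["slack", "email"]
    return destinations


if __name__ == "__main__":
    for score in (-0.6, 0.0, 0.4):
        print(score, sentiment_label(score), route(score))
```

In NiFi itself this kind of decision would typically be expressed as attribute-based routing rather than Python; the sketch only shows the decision logic.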

I will cover several sentiment analysis frameworks that can be used within Apache NiFi, including Python NLTK, Stanford CoreNLP, Python spaCy and Python TextBlob.

This talk will be a deep dive into how to manage complex dataflow pipelines ingesting from multiple streaming sources including social, public open data feeds, logs, drones, RDBMS and IoT with transformations, deep learning, machine learning and business rules.

Data engineers will be shown the power of Apache NiFi for loading diverse sources of data, applying transformations in-stream, routing based on attributes, adding sentiment data to workflows, running deep learning algorithms in-stream and storing data into Apache Phoenix on HBase and Apache Hive as ORC tables.

In this talk, I will walk through each step in the process: ingesting each source, applying filters, performing transformations, converting types, picking and converting fields, and finally storing data to Apache Phoenix on HBase.
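
As a rough illustration of those steps outside NiFi, the sketch below filters a raw record, picks and converts a few fields, and builds a parameterized Phoenix `UPSERT` statement. The field names, the table name and both helper functions are hypothetical; in the talk these steps are performed by NiFi processors, not by hand-written Python.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical fields to keep from one sensor reading, with target types.
FIELDS = {"sensor_id": str, "temperature": float, "humidity": float}


def transform(raw: Dict[str, str]) -> Optional[Dict[str, Any]]:
    """Filter, pick and convert fields; return None to drop the record."""
    if any(name not in raw for name in FIELDS):
        return None  # filter: drop incomplete records
    try:
        # pick only the wanted fields and convert each to its target type
        return {name: convert(raw[name]) for name, convert in FIELDS.items()}
    except ValueError:
        return None  # filter: drop records with unparseable values


def to_upsert(table: str, record: Dict[str, Any]) -> Tuple[str, List[Any]]:
    """Build a parameterized Phoenix UPSERT statement and its values."""
    columns = ", ".join(record)
    placeholders = ", ".join("?" for _ in record)
    sql = f"UPSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return sql, list(record.values())


if __name__ == "__main__":
    raw = {"sensor_id": "a1", "temperature": "21.5", "humidity": "40", "extra": "x"}
    record = transform(raw)
    if record is not None:
        print(to_upsert("sensor_readings", record))
```

Phoenix uses `UPSERT` rather than `INSERT`, so replaying the same keyed record overwrites the earlier row instead of failing.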

I will finish with a quick analysis in Apache Zeppelin, running on HDP 2.x, to show the data updating as it streams in.

This is based on a few talks I have given at the Future of Data - Princeton meetup on various ingestion and processing patterns with Apache NiFi.

Speaker:
Timothy Spann, Solutions Engineer, Hortonworks

Published in: Technology

  1. Real-Time Ingesting and Transforming Sensor Data and Social Data with NiFi and TensorFlow
     Timothy Spann, Hortonworks, @PaaSDev
     © Hortonworks Inc. 2011 – 2017. All Rights Reserved
  2. Agenda
     • What do we want to do?
     • Why?
     • How?
     • Apache NiFi
     • TensorFlow
     • Natural Language Processing
     • Demo
     • Questions
  3. What do we want to do?
     • MiNiFi ingests camera images and sensor data
     • Run TensorFlow Inception v3 to recognize objects in images
     • NiFi stores images, metadata and enriched data in Hadoop
     • NiFi ingests social data and feeds
     • NiFi analyzes sentiment of textual data
  4. Why Gather and Analyze Social Media Streams?
     - Automate processes to maximize the Social Media team's time
     - Improve response time to requests, complaints and emergencies in social media
     - Use predictive analytics to know when and where problems will happen
     - Learn where unhappy customers are and address them instantly
  5. Collect, Conduct, Curate
     • Collect: Bring Together. Aggregate all data from sensors, geo-location devices, machines and social feeds.
     • Conduct: Mediate the Data Flow. Mediate point-to-point and bi-directional data flows, delivering data reliably to HBase, Hive, Slack and Email.
     • Curate: Gain Insights. Parse, filter, join, transform, fork, query, sort, dissect; enrich with weather, location, NLP and TensorFlow.
  6. Why Apache NiFi?
     • Guaranteed delivery
     • Data buffering
       - Backpressure
       - Pressure release
     • Prioritized queuing
     • Flow specific QoS
       - Latency vs. throughput
       - Loss tolerance
     • Data provenance
     • Supports push and pull models
     • Hundreds of processors
     • Visual command and control
     • Over fifty sources
     • Flow templates
     • Pluggable/multi-role security
     • Designed for extension
     • Clustering
  7. [Diagram labels: Data Enrichment; Data Discovery; Inception v3; Predictive Analytics; Sentiment Analysis]
  8. Why TensorFlow?
     • Google
     • Multiple platform support
     • Hadoop integration
     • Spark integration
     • Keras
     • Large Community
     • Python and Java APIs
     • GPU Support
     • Mobile Support
     • Inception v3
     • Clustering
     • Fully functional demos
     • Open Source
     • Apache Licensed
     • Large Model Library
     • Buzz
     • Extensive Documentation
     • Raspberry Pi Support
  9. Apache NiFi Integration with TensorFlow Options
     • TensorFlow (C++, Python, Java) via ExecuteStreamCommand
     • TensorFlow NiFi Java Custom Processor
     • TensorFlow Running on Edge Nodes (MiNiFi)
  10. Apache NiFi Integration with TensorFlow Options
     • TensorFlow Mobile (iOS, Android, RPi)
     • TensorFlow on Spark (Yahoo) via Livy, S2S, Kafka
     • TensorFlow Running in Containers in YARN 3.0 on Hadoop
     • gRPC Call to TensorFlow Serving
  11. ExecuteStreamCommand To TensorFlow
     https://community.hortonworks.com/articles/58265/analyzing-images-in-hdf-20-using-tensorflow.html
  12. TensorFlow via Python
     python classify_image.py --image_file /dir/solarroofpanel.jpg
     solar dish, solar collector, solar furnace (score = 0.98316)
     window screen (score = 0.00196)
     manhole cover (score = 0.00070)
     radiator (score = 0.00041)
     doormat, welcome mat (score = 0.00041)
  13. TensorFlow Java Processor in NiFi
     https://community.hortonworks.com/content/kbentry/116803/building-a-custom-processor-in-apache-nifi-12-for.html
     https://github.com/tspannhw/nifi-tensorflow-processor
  14. TensorFlow Running on Edge Nodes (MiNiFi)
  15. Installing TextBlob for Python
     pip install -U textblob
     python -m textblob.download_corpora
     Installing spaCy for Python
     https://community.hortonworks.com/articles/76935/using-sentiment-analysis-and-nlp-tools-with-hdp-25.html
     pip install -U spacy
     python -m spacy.en.download all
     Installing NLTK for Python 2.7
     http://www.nltk.org/install.html
     pip install -U nltk
     pip install -U numpy
  16. Local Sentiment Analysis via Python
     run.sh
       python sentiment.py "$@"
     sentiment.py
       from nltk.sentiment.vader import SentimentIntensityAnalyzer
       import sys
       sid = SentimentIntensityAnalyzer()
       ss = sid.polarity_scores(sys.argv[1])
       print('Compound {0} Negative {1} Neutral {2} Positive {3} '.format(
           ss['compound'],ss['neg'],ss['neu'],ss['pos']))
  17. Installing Apache OpenNLP NiFi Processor
     Apache OpenNLP for Entity Resolution Processor
     https://github.com/tspannhw/nifi-nlp-processor
     Requires installation of NAR and Apache OpenNLP BINs
     This is a non-supported processor that I wrote and put into the community.
     https://community.hortonworks.com/articles/80418/open-nlp-example-apache-nifi-processor.html
  18. Installing Stanford CoreNLP Processor
     Stanford CoreNLP Processor
     https://github.com/tspannhw/nifi-corenlp-processor
     Requires install of NAR and Stanford English Models
     http://nlp.stanford.edu/software/stanford-english-corenlp-2017-06-09-models.jar
     This is a non-supported processor that I wrote and put into the community.
     https://community.hortonworks.com/articles/81270/adding-stanford-corenlp-to-big-data-pipelines-apac-1.html
  19. Code and Demo
  20. Contact: Timothy Spann @PaaSDeV
     http://www.meetup.com/futureofdata-princeton
     https://dzone.com/users/297029/bunkertor.html
     http://community.hortonworks.com/users/9304/tspann.html
  21. Hortonworks Community Connection
     Read access for everyone, join to participate and be recognized
     • Full Q&A Platform (like StackOverflow)
     • Knowledge Base Articles
     • Code Samples and Repositories
  22. Community Engagement
     Participate now at: community.hortonworks.com
     4,000+ Registered Users
     10,000+ Answers
     15,000+ Technical Assets
     One Website!
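
The `classify_image.py` output shown on slide 12 is plain text, so a flow that wants labels and scores as separate values has to parse it. The sketch below is one possible way to do that, assuming every prediction line has the form `labels (score = N)` as on the slide. The `parse_predictions` helper is hypothetical and is not part of the talk's code.

```python
import re
from typing import List, Tuple

# Matches lines such as "window screen (score = 0.00196)".
LINE_PATTERN = re.compile(r"^(?P<labels>.+?) \(score = (?P<score>[0-9.]+)\)$")


def parse_predictions(output: str) -> List[Tuple[str, float]]:
    """Return (labels, score) pairs for each prediction line in the output."""
    results = []
    for line in output.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:  # silently skip anything that is not a prediction line
            results.append((match.group("labels"), float(match.group("score"))))
    return results


if __name__ == "__main__":
    sample = (
        "solar dish, solar collector, solar furnace (score = 0.98316)\n"
        "window screen (score = 0.00196)\n"
    )
    for labels, score in parse_predictions(sample):
        print(f"{score:.5f}  {labels}")
```

The labels are kept as one comma-separated string because Inception class names can themselves contain several synonyms.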
