Your SlideShare is downloading. ×
0
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Apache Storm - Real Time Analytics
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Apache Storm - Real Time Analytics

7,792

Published on

Published in: Education
0 Comments
4 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
7,792
On Slideshare
0
From Embeds
0
Number of Embeds
10
Actions
Shares
0
Downloads
2
Comments
0
Likes
4
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  1. Introduction to Real Time Analytics using Apache Storm www.edureka.in/apache-storm Buy Complete Course at : www.edureka.in/apache-storm Post your Questions on Twitter on @edurekaIN: #askEdureka
  2. Objectives of this Session • Un • The need for Real Time Analytics - Usecases • How does Storm come to rescue? • Where does Storm fit in Hadoop Framework? • Storm Architecture – Components of Storm • Quiz to reinforce your learning For Queries during the session and class recording: Post on Twitter @edurekaIN: #askEdureka Post on Facebook /edurekaIN www.edureka.in/apache-storm
  3. Need of Real Time Analytics Ret • Banking - Fraud Transaction Detection • Telecommunication – Silent Roamers Detection • Retail- Inventory Dynamic Pricing • Social Networking- Trending Topics www.edureka.in/apache-stormTwitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions
  4. Growing Interest in Apache Storm www.edureka.in/apache-stormTwitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions
  5. Storm Usecases – Need for Real Time Analytics Twitter Trends Responsive Logs Source: https://github.com/nathanmarz/storm/wiki/Powered-By Custom Magazine Feeds Real Time Video Analytics Enable Clinicians to Make Medical Decisions Compare and Display Real Time Prices www.edureka.in/apache-stormTwitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions
  6. What is Storm ?  Apache Storm is a free and open source distributed real-time computation system.  Storm makes it easy to reliably process unbounded streams of data.  Storm does for real-time processing what Hadoop did for batch processing.  Simple, can be used with any programming language. www.edureka.in/apache-stormTwitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions
  7. Understanding the Storm Architecture Nimbus Zookeeper Supervisor Zookeeper Zookeeper Supervisor Supervisor Supervisor Supervisor www.edureka.in/apache-storm *Covered in module 2 in the course Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions
  8. ZooKeeper Nimbus ZooKeeper ZooKeeper Supervisor Supervisor Supervisor Supervisor Supervisor Nimbus node (master node, similar to the Hadoop JobTracker): » Uploads computations for execution » Distributes code across the cluster » Launches workers across the cluster » Monitors computation and reallocates workers as needed ZooKeeper nodes: » Coordinates the Storm cluster Supervisor nodes : » Communicates with Nimbus through Zookeeper, starts and stops workers according to signals from Nimbus Storm Components A Storm cluster has 3 sets of nodes 1. Nimbus node 2. Zookeeper nodes 3. Supervisor nodes www.edureka.in/apache-stormTwitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions
  9. The work is delegated to different types of components that are each responsible for a simple specific processing task. The input stream of a Storm cluster is handled by a component called a spout. The spout passes the data to a component called a bolt, which transforms it in some way. A bolt either persists the data in some sort of storage, or passes it to some other bolt. Storm Topology www.edureka.in/apache-stormTwitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions spout spout bolt bolt bolt bolt passes data passes data transforms data data storage Input Data Source
  10. Why Storm is ideal for Real Time Processing Fast – benchmarked as processing one million, 100 byte messages, per second per node. Scalable – with parallel calculations that run across a cluster of machines. Fault-tolerant – when workers die, Storm will automatically restart them. If a node dies, the worker will be restarted on another node. Reliable – Storm guarantees that each unit of data (tuple) will be processed at least once or exactly once. Messages are only replayed when there are failures. Easy to operate – standard configurations are suitable for production on day one. Once deployed, Storm is easy to operate. http://hortonworks.com/hadoop/storm/ www.edureka.in/apache-stormTwitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions
  11. MapReduce (Batch) INTERACTIVE (Text) ONLINE (HBase) STORM (Streaming) GRAPH (Giraph) IN-MEMORY (Spark) HPC MPI (OpenMPI) OTHER (Search) (Weave..) http://hadoop.apache.org/docs/stable2/hadoop-yarn/hadoop-yarn-site/YARN.html Storm in the Hadoop Framework www.edureka.in/apache-stormTwitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions
  12. Upcoming Batch for Storm Start Date: 16th Aug (08:30 PM – 11:30 PM, India Time) / 16th Aug (08:00 AM – 11:00 AM, Pacific Time) 13th Sep (7:00 AM – 10:00 AM, India Time) / 12th Sep (06:30 PM – 09:30 PM, Pacific Time) Curriculum: Module 1: Introduction of Big Data and Storm Module 2: Getting Started with Storm Module 3: Spouts and Bolts Module 4: Trident Topologies Module 5: Real Life Storm Project – 1 Module 6: Real Life Storm Project – 2 www.edureka.in/apache-stormTwitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions
  13. www.edureka.in/apache-storm Annie’s Question Storm can be used in: - Real-time Processing - Batch Processing - Both
  14. www.edureka.in/apache-storm Annie’s Answer Real-time Processing
  15. www.edureka.in/apache-storm Annie’s Question Which of them can be a source of Stream? - Spout - Bolt - Both
  16. www.edureka.in/apache-storm Annie’s Answer Both
  17. www.edureka.in/apache-storm Annie’s Question It is not possible to run Storm process along with MapReduce jobs inside a Hadoop Cluster. - True - False
  18. www.edureka.in/apache-storm Annie’s Answer False. With Hadoop 2.0, it is possible.
  19. www.edureka.in/apache-storm Annie’s Question A Nimbus Node is similar to TaskTracker Node in Hadoop Cluster. - True - False
  20. www.edureka.in/apache-storm Annie’s Answer No. A Nimbus Node is more like a JobTracker Node in Hadoop
  21. www.edureka.in/apache-storm Annie’s Question A Storm topology is defined in terms of - Nimbus, Zookeeper, Supervisor nodes - Spout, Bolt - Spout, Bolt, Nimbus, Zookeeper, Supervisor nodes - Spout, Bolt, Zookeeper node
  22. www.edureka.in/apache-storm Annie’s Answer Spout and Bolt
  23. Questions? www.edureka.in/apache-stormTwitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions Buy Complete Course at : www.edureka.in/apache-storm Batch Starts On: 16th Aug 08:30 PM , IST / 16th Aug 08:00 AM, PDT 13th Sep 7:00 AM, IST/ 12th Sep 06:30 PM, PDT Course Fee: USD 339 / INR (17795 + 12.36% Service tax)** For Existing edureka Customers (20% OFF) Price : USD 271/ INR 14326

×