MODULE 6: Data Analytics for IoE
CO5: Design and develop a smart city in IoT.
CO6: Analyse and evaluate the data received through sensors in IoT.
CONTENTS
• Introduction
• Apache Hadoop: Using Hadoop MapReduce for Batch Data Analysis
• Apache Oozie
• Apache Spark
• Apache Storm, Using Apache Storm for Real-time Data
Analysis
• Structural Health Monitoring Case Study
• Tools for IoT:
– Chef, Chef Case Studies,
– Puppet, Puppet Case Study
– Multi-tier Deployment
• NETCONF-YANG Case Studies
• IoT Code Generator.
INTRODUCTION
HADOOP
MAJOR GOALS OF HADOOP
HADOOP ECOSYSTEM
LAYERED DIAGRAM FOR HADOOP
HDFS
YARN
MAP-REDUCE
PIG AND HIVE
GIRAPH
STORM, SPARK AND FLINK
HBASE
ZOOKEEPER
OPEN SOURCE
Map Reduce
• https://words.sdsc.edu/words-data-science/mapreduce
Map Reduce
• Map-Reduce is a scalable programming model
that simplifies distributed processing of data.
Map-Reduce consists of three main steps:
Mapping, Shuffling and Reducing.
• An easy way to think about a Map-Reduce job is to compare it with the act of ‘delegating’ a large task to a group of people, and then combining the result of each person’s effort to produce the final outcome.
Map Reduce
• Let’s take an example to bring the point across. You just heard some great news at your office and are throwing a party for all your colleagues!
• You decide to cook pasta for the dinner. Four of your friends, who like cooking, volunteer to join you in the preparation.
• The task of preparing pasta broadly involves chopping the vegetables, cooking, and garnishing.
• Let’s take the job of chopping the vegetables and see how it is analogous to a map-reduce task.
• Here the raw vegetables are symbolic of the input data, your friends are equivalent to compute nodes, and the final chopped vegetables are analogous to the desired outcome.
• Each friend is allotted onions, tomatoes and peppers to chop and weigh.
• You would also like to know how much of each vegetable type you have in the kitchen, and you would like to chop the vegetables while this calculation is occurring. In the end, the onions should be in one large bowl with a label that displays their weight in pounds, the tomatoes in a separate one, and so on.
Map Reduce
MAP: To start with, you assign each of your four friends a random mix
of different types of vegetables. They are required to use their
‘compute’ powers to chop them and measure the weight of each type
of veggie. They need to ensure not to mix different types of veggies. So
each friend will generate a mapping of <key, value> pairs that looks
like:
Friend X:
• <tomatoes, 5 lbs>
<onions, 10 lbs>
<garlic, 2 lbs>
Friend Y:
• <onions, 22 lbs>
<green peppers, 5 lbs>
…
• Now that your friends have chopped the vegetables, and labeled
each bowl with the weight and type of vegetable, we move to the
next stage: Shuffling.
Map Reduce
SHUFFLE: This stage is also called Grouping. Here you want to group the veggies by their types. You
assign different parts of your kitchen to each type of veggie, and your friends are supposed to group the
bowls, so that like items are placed together:
North End of Kitchen:
• <tomatoes, 5 lbs>
<tomatoes, 11 lbs>
West End of Kitchen:
• <onions, 10 lbs>
<onions, 22 lbs>
<onions, 1.4 lbs>
East End of Kitchen:
• <green peppers, 3 lbs>
<green peppers, 10 lbs>
• The party starts in a couple of hours, but you are impressed by what your friends have
accomplished by Mapping and Grouping so far! The kitchen looks much more organized now and
the raw material is chopped. The final stage of this task is to measure how much of each veggie you
actually have. This brings us to the Reduce stage.
Map Reduce
• REDUCE: In this stage, you ask each of your friends to collect items of the same type, put them in a large bowl, and label this large bowl with the sum of the individual bowl weights. Your friends cannot wait for the party to start, and immediately start ‘reducing’ the small bowls. In the end, you have nice large bowls with the total weight of each vegetable labeled on them.
Map Reduce
• The number represents the total weight of that vegetable after reducing the smaller bowls.
• Your friends (‘compute nodes’) just performed a Map-Reduce task to help you get started with cooking the pasta. Since you were coordinating the entire exercise, you are “The Master” node of this Map-Reduce task. Each of your friends took the roles of Mappers, Groupers and Reducers at different times. This example demonstrates the power of the technique.
• This simple and powerful technique scales very easily if more of your friends decide to join you, and many open-source tools make it easy to implement Map-Reduce for your own computational problems.
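The whole exercise also fits in a few lines of ordinary code. Below is a toy, single-process Python sketch of the three steps for the vegetable example; the pairs reuse sample weights from the slides above.

from collections import defaultdict

# MAP: each "friend" (mapper) emits <vegetable, weight> pairs.
mapped = [
    ("tomatoes", 5), ("onions", 10), ("garlic", 2),   # Friend X
    ("onions", 22), ("green peppers", 5),             # Friend Y
]

# SHUFFLE (grouping): bowls of the same vegetable go to the same spot.
groups = defaultdict(list)
for veg, lbs in mapped:
    groups[veg].append(lbs)

# REDUCE: sum the bowl weights for each vegetable.
totals = {veg: sum(weights) for veg, weights in groups.items()}
print(totals)  # {'tomatoes': 5, 'onions': 32, 'garlic': 2, 'green peppers': 5}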
Introduction to batch processing –
MapReduce
• Today, the volume of data is often too big for a
single server – node – to process.
• Therefore, there was a need to develop code that
runs on multiple nodes.
• Writing distributed systems is an endless array of
problems, so people developed multiple
frameworks to make our lives easier.
• MapReduce is a framework that allows the user
to write code that is executed on multiple nodes
without having to worry about fault tolerance,
reliability, synchronization or availability.
Batch processing
• There are many use cases for the kind of system described in the introduction, but the focus of this section is data processing – more specifically, batch processing.
• Batch processing is an automated job that does some computation, usually run as a periodic job.
• It runs the processing code on a set of inputs, called a batch. Usually, the job reads the batch data from a database and stores the result in the same or a different database.
• An example of a batch processing job could be reading all the sale logs from an online shop for a single day and aggregating them into statistics for that day (number of users per country, the average amount spent, etc.). Doing this as a daily job can give insights into customer trends; a minimal sketch follows below.
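As a concrete illustration, here is a minimal single-machine Python sketch of such a daily job. The file name and the record fields (user_id, country, amount) are assumptions made for illustration only.

import csv
from collections import defaultdict

users_per_country = defaultdict(set)
spend_per_country = defaultdict(float)

# One batch = one day of sale logs (hypothetical file and fields).
with open("sales-2024-01-01.csv") as f:
    for sale in csv.DictReader(f):
        users_per_country[sale["country"]].add(sale["user_id"])
        spend_per_country[sale["country"]] += float(sale["amount"])

# Aggregate into per-country statistics for the day.
for country, users in users_per_country.items():
    avg = spend_per_country[country] / len(users)
    print(country, len(users), round(avg, 2))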
MapReduce
• MapReduce is a programming model that was introduced
in a white paper by Google in 2004.
• Today, it is implemented in various data processing and storage systems (Hadoop, Spark, MongoDB, …) and it is a
foundational building block of most big data batch
processing systems.
• For MapReduce to be able to do computation on large
amounts of data, it has to be a distributed model that
executes its code on multiple nodes. This allows the
computation to handle larger amounts of data by adding
more machines – horizontal scaling.
• This is different from vertical scaling, which implies
increasing the performance of a single machine.
MapReduce
Execution
• In order to decrease the duration of our distributed
computation, MapReduce tries to reduce
shuffling (moving) the data from one node to another by
distributing the computation so that it is done on the same
node where the data is stored.
• This way, the data stays on the same node, but the code is
moved via the network. This is ideal because the code is
much smaller than the data.
• To run a MapReduce job, the user has to implement two
functions, map and reduce, and those implemented
functions are distributed to nodes that contain the data by
the MapReduce framework.
• Each node runs (executes) the given functions on the data it has in order to minimize network traffic (data shuffling).
MapReduce
• The computation performance of MapReduce
comes at the cost of its expressivity.
• When writing a MapReduce job, we have to follow the strict interface (the input and return data structures) of the map and the reduce functions.
• The map phase generates key-value data pairs
from the input data (partitions), which are then
grouped by key and used in the reduce phase by
the reduce task.
• Everything except the interface of the functions is
programmable by the user.
MapReduce
Map
• Hadoop, along with its many other features, had the
first open-source implementation of MapReduce. It
also has its own distributed file storage called HDFS.
• In Hadoop, the typical input into a MapReduce job is a
directory in HDFS.
• In order to increase parallelization, each directory is
made up of smaller units called partitions and each
partition can be processed separately by a map task
(the process that executes the map function).
• This is hidden from the user, but it is important to be
aware of it because the number of partitions can affect
the speed of execution.
MapReduce
• The map task (mapper) is called once for every
input partition and its job is to extract key-value
pairs from the input partition. The mapper can
generate any number of key-value pairs from a
single input (including zero, see the figure above).
• The user only needs to define the code inside the
mapper. Below, we see an example of a simple
mapper that takes the input partition and
outputs each word as a key with value 1.
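The slide’s code figure is not reproduced here; as a stand-in, the following is a minimal mapper written in the Hadoop Streaming style, where the mapper reads lines of the input partition from stdin and emits tab-separated key-value pairs. Streaming is one of several ways to run MapReduce code, and the file name is illustrative.

#!/usr/bin/env python3
# mapper.py - reads lines of the input partition from stdin and
# emits one tab-separated <word, 1> pair per word.
import sys

for line in sys.stdin:
    for word in line.split():
        print(f"{word}\t1")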
MapReduce
Reduce
• The MapReduce framework collects all the key-
value pairs produced by the mappers, arranges
them into groups with the same key and applies
the reduce function.
• All the grouped values entering the reducers are
sorted by the framework.
• The reducer can produce output files which can
serve as input into another MapReduce job, thus
enabling multiple MapReduce jobs to chain into a
more complex data processing pipeline.
MapReduce
• The mapper yielded key-value pairs with the word as
the key and the number 1 as the value.
• The reducer can be called on all the values with the
same key (word), to create a distributed word counting
pipeline.
• Not every sorted group necessarily has its own reduce task.
• This happens because the user needs to define the
number of reducers, which is 3 in our case.
• After a reducer is done with its task, it takes another
group if there is one that was not processed.
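A matching Hadoop Streaming style reducer sketch completes the word-count pipeline (again an illustrative stand-in for the slide’s figure). Because the framework sorts the mapper output by key, all pairs for one word arrive consecutively:

#!/usr/bin/env python3
# reducer.py - sums the counts for each word; input arrives sorted by key.
import sys

current_word, current_count = None, 0
for line in sys.stdin:
    word, count = line.rstrip("\n").rsplit("\t", 1)
    if word != current_word:
        if current_word is not None:
            print(f"{current_word}\t{current_count}")
        current_word, current_count = word, 0
    current_count += int(count)
if current_word is not None:
    print(f"{current_word}\t{current_count}")

The pair can be tested locally, outside Hadoop, with a shell pipeline such as: cat input.txt | python3 mapper.py | sort | python3 reducer.py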
MapReduce
• MapReduce is a programming model that allows the user to
write batch processing jobs with a small amount of code.
• It is flexible in the sense that you, the user, can write code
to modify the behavior, but making complex data
processing pipelines becomes cumbersome because every
MapReduce job has to be managed and scheduled on its
own.
• The intermediate output of map tasks is written to a file
which allows the framework to recover easily if a node has
a failure.
• This stability comes at the cost of performance, as the data could instead have been forwarded to reduce tasks through a small buffer, creating a stream.
Apache Oozie
• Oozie is a workflow scheduler system to manage
Apache Hadoop jobs.
• Oozie Workflow jobs are Directed Acyclic Graphs (DAGs) of actions.
• Oozie Coordinator jobs are recurrent Oozie Workflow
jobs triggered by time (frequency) and data availability.
• Oozie is integrated with the rest of the Hadoop stack
supporting several types of Hadoop jobs out of the box
(such as Java map-reduce, Streaming map-reduce, Pig,
Hive, Sqoop and Distcp) as well as system specific jobs
(such as Java programs and shell scripts).
• Oozie is a scalable, reliable and extensible system.
WORKFLOW EXECUTION IN OOZIE
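The workflow-execution figure from the original slide is not reproduced here. Real Oozie workflows are defined in XML, so the Python sketch below is only a conceptual stand-in: it runs a toy DAG of invented action names in dependency order, the way an Oozie Workflow job orders its actions.

from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical workflow: each action lists the actions it depends on.
workflow = {
    "ingest": [],
    "map-reduce": ["ingest"],
    "hive-report": ["map-reduce"],
    "email-result": ["hive-report"],
}

# Execute actions only after all of their predecessors have finished.
for action in TopologicalSorter(workflow).static_order():
    print(f"running action: {action}")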
Apache Spark
• Apache Spark is a lightning-fast unified analytics engine
for big data and machine learning. It was originally
developed at UC Berkeley in 2009.
• Since its release, Apache Spark, the unified analytics
engine, has seen rapid adoption by enterprises across a
wide range of industries.
• Internet powerhouses such as Netflix, Yahoo, and eBay
have deployed Spark at massive scale, collectively
processing multiple petabytes of data on clusters of
over 8,000 nodes.
• It has quickly become the largest open source
community in big data, with over 1000 contributors
from 250+ organizations.
Apache Spark
• Spark can be 100x faster than Hadoop for large-scale data processing by exploiting in-memory computing and other optimizations.
• Spark is also fast when data is stored on disk, and currently
holds the world record for large-scale on-disk sorting.
• Spark has easy-to-use APIs for operating on large datasets.
• Spark comes packaged with higher-level libraries, including
support for SQL queries, streaming data, machine learning
and graph processing.
• These standard libraries increase developer productivity
and can be seamlessly combined to create complex
workflows.
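As a small taste of those APIs, here is a minimal PySpark word count. The input path is a placeholder, and the sketch assumes a working Spark installation.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("wordcount").getOrCreate()

# Read text, split into words, and count them with the core RDD API.
lines = spark.sparkContext.textFile("hdfs:///data/input")  # placeholder path
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))

print(counts.take(10))  # a sample of (word, count) pairs
spark.stop()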
Apache Spark
Apache Storm
• Apache Storm is a distributed real-time big data processing system.
• Storm is designed to process vast amounts of data in a fault-tolerant and horizontally scalable manner.
• It is a streaming data framework capable of very high ingestion rates.
• Though Storm is stateless, it manages the distributed environment and cluster state via Apache ZooKeeper.
• It is simple, and you can execute all kinds of manipulations on real-time data in parallel.
• Apache Storm continues to be a leader in real-time data analytics. Storm is easy to set up and operate, and it guarantees that every message will be processed through the topology at least once.
• An Apache Storm cluster is built from a few specific components that work together to form the Storm architecture. There are two types of nodes in the architecture:
• Master node (Nimbus)
• Worker node (Supervisor)
• The master node runs Nimbus, which acts as the daemon for the master node. Nimbus examines and administers the tasks across the cluster of worker nodes, allots tasks to machines, and supervises failures. Nimbus accepts code in any programming language, so anyone can utilize Apache Storm without learning another language.
• Each worker node runs a Supervisor, which acts as the daemon for that worker node. The Supervisor attends to the tasks given to its machine and starts or stops worker processes as required, based on what Nimbus has assigned to it. Each worker process executes a part of the topology in the form of spouts and bolts. The Nimbus daemon communicates with the Supervisor daemons via ZooKeeper.
Components of Apache Storm
• A topology is the real-time computation, represented as a graph-shaped data structure. The topology consists of spouts and bolts, where the spouts determine how the stream is fed into the inputs of bolts and the output of a single bolt can be linked to the inputs of other bolts. A Storm cluster takes a topology as its input; the Nimbus daemon on the master node exchanges information with the Supervisor daemons on the worker nodes and accepts the topology.
• https://medium.com/analytics-steps/apache-storm-architecture-real-time-big-data-analysis-engine-for-streaming-data-4fc34ce0adae
• A spout acts as the initial step in a topology; data from various sources is acquired by the spout. It ingests the data as a stream of tuples and sends it to bolts for processing. A single spout can generate multiple output streams of tuples, and these streams are further consumed by one or more bolts. Spouts get data from databases, distributed file systems, or messaging systems like Kafka, convert it into streams of tuples, and send the tuples to bolts for processing.
• Bolts are responsible for the processing of the data; their work includes filtering, applying functions, aggregations, joins, and talking to databases. Bolts consume multiple streams as input, process them, and generate new streams for further processing.
• Consider the case of Twitter, an online social platform for communicating with tweets, where user tweets can be sent and received. Subscribed users read and post tweets, while unsubscribed users can only read them.
• A hashtag classifies a tweet by keyword: a # is placed before the relevant keyword. Apache Storm can act here as a real-time pipeline for detecting the most-used hashtags in the tweet stream, as in the sketch below.
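Storm topologies are normally written in Java (or via Storm’s multi-language protocol), so the Python sketch below is only a single-process simulation of the spout-to-bolt dataflow for this hashtag example, not the Storm API; the sample tweets are invented.

from collections import Counter

def tweet_spout():
    # Spout: acquires data from a source and emits a stream of tuples.
    for tweet in ["#iot sensors everywhere", "storm is fast #bigdata #iot"]:
        yield (tweet,)

def hashtag_bolt(stream):
    # Bolt: consumes tweet tuples and emits one tuple per hashtag.
    for (tweet,) in stream:
        for token in tweet.split():
            if token.startswith("#"):
                yield (token,)

# A counting "bolt" at the end of the topology.
counts = Counter(tag for (tag,) in hashtag_bolt(tweet_spout()))
print(counts.most_common())  # [('#iot', 2), ('#bigdata', 1)]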
Apache Storm vs Hadoop
• Basically, the Hadoop and Storm frameworks are both used for analyzing big data. They complement each other but differ in some aspects. Apache Storm does all the operations except persistence, while Hadoop is good at everything else but lags in real-time computation. The following table compares the attributes of Storm and Hadoop.
Apache Storm vs Hadoop
PUPPET
NETCONF/YANG
• Network Configuration Protocol, known as
NETCONF, gives access to the native capabilities
of a device within a network, defining methods to
manipulate its configuration database,
retrieve operational data, and invoke specific
operations.
• YANG provides the means to define
the content carried via NETCONF, for both data
and operations. Together, they help users build
network management applications that meet the
needs of network operators.
NETCONF/YANG
The motivation behind NETCONF and YANG was to have a network management system that manages the network at the service level, which includes:
– Standardized data model (YANG)
– Network-wide configuration transactions
– Validation and roll-back of configuration
– Centralized backup and restore of configuration
Businesses have used the Simple Network Management Protocol (SNMP) for a long time, but it has been used more for reading device states than for configuring devices. NETCONF and YANG address the shortcomings of SNMP and add more functionality to network management, such as:
1. Configuration transactions
2. Network-wide orchestrated activation
3. Network-level validation and roll-back
4. Save and restore configurations
NETCONF/YANG
Configuration transactions:
• NETCONF configuration works in atomic transactions consisting of the multiple configuration commands required to move a network from state A to state B.
• The order of the configuration snippets within a transaction does not matter, and the success of a transaction depends on the success of all the command snippets.
• If any single command fails, the entire transaction fails.
• So there is no intermediate erroneous state: the network is either at state A (if any one command of the transaction fails) or at state B (if the transaction succeeds as a whole).
NETCONF/YANG
Network-wide orchestrated activation:
• There is a distinction between the distribution of a
configuration to all the networking devices
and the activation of it.
• For example, if the operator wants to configure a VPN
over a network of devices all at one time, NETCONF
provides the flexibility to distribute the configuration,
validate it, lock all device configurations, commit the
configuration, and unlock.
• This set of actions will result in enabling a VPN over the
entire network at the same time, in an orchestrated,
synchronized way.
NETCONF/YANG
Network-level validation and roll-back:
• Each NETCONF server keeps a “Candidate database” (in parallel to the “Running config database”).
• Using this candidate data store, a NETCONF manager
can implement a network-wide transaction by sending
a configuration to the candidate of each device,
validating the candidate, and if all participants are fine,
telling them to commit the changes.
• If the results are not satisfactory, the manager can ask
to roll-back all devices.
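A hedged sketch of this candidate/validate/commit cycle against a single device, using the open-source ncclient Python library; the host, credentials, and configuration payload are placeholders.

from ncclient import manager

CONFIG = """<config xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <!-- device-specific configuration subtree (placeholder) -->
</config>"""

with manager.connect(host="192.0.2.1", port=830, username="admin",
                     password="admin", hostkey_verify=False) as m:
    m.edit_config(target="candidate", config=CONFIG)  # stage the change
    m.validate(source="candidate")                    # device validates it
    try:
        m.commit()                # activate: candidate becomes running
    except Exception:
        m.discard_changes()       # roll back the candidate datastore
        raise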
NETCONF/YANG
Save and restore configurations:
• A NETCONF manager can take a backup of a networking device’s configuration whenever needed and restore it by sending the saved configuration back to the device.
Protocol Stack
• The NETCONF protocol can be broken down into four layers:
1. Content: NETCONF data models and protocol operations use the YANG modelling language. A data model outlines the structure, semantics and syntax of the data.
2. Operations: A set of base protocol operations, initiated via RPC methods using XML encoding, to perform operations upon the device, such as <get-config>, <edit-config> and <get>.
3. Messages: A set of RPC messages and notifications are defined for use, including <rpc>, <rpc-reply> and <rpc-error>.
4. Transport: The transport layer provides a communication path between the client and server (manager and agent). NETCONF is agnostic to the protocol used, but SSH is typically used.
Communication
NETCONF is based upon a client/server model. Within the communication flow of a NETCONF session there are 3 main parts. These are:
1. Session Establishment - Each side sends a <hello> along with its <capabilities>, announcing what operations (capabilities) it supports.
2. Operation Request - The client then sends its request (operation) to the server via the <rpc> message. The response is then sent back to the client within <rpc-reply>.
3. Session Close - The session is then closed by the client via <close-session>.
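A minimal sketch of this flow with the ncclient Python library (host and credentials are placeholders): connect() performs the <hello>/<capabilities> exchange, and leaving the with block sends <close-session>.

from ncclient import manager

with manager.connect(host="192.0.2.1", port=830, username="admin",
                     password="admin", hostkey_verify=False) as m:
    # Session establishment: capabilities announced by the server.
    for capability in m.server_capabilities:
        print(capability)
    # Operation request: a <get-config> carried inside <rpc>,
    # answered inside <rpc-reply>.
    reply = m.get_config(source="running")
    print(reply.xml)
# Session close: the context manager issues <close-session>.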
OPERATIONS
Netconf Manager Test Scenario
The important aspects of NETCONF Server validation can be classified into the following categories:
• Validate the YANG model encoding of NETCONF operations (e.g., <get>, <get-config>, <edit-config>) received in Request XML messages (e.g., ietf, openconfig, or proprietary models)
• Stress the management plane with many concurrent NETCONF sessions and assess the impact on the regular control plane and data plane operation of a network device
• Measure the device response time of NETCONF transactions
NETCONF Server test scenario
• Test Objective: Measure the efficiency of the NETCONF Server in terms of the time it takes to respond to NETCONF Request XMLs when multiple concurrent NETCONF Client sessions are active.
• Test Topology: There are multiple NETCONF Clients, all of them connected to a single NETCONF Server (DUT), as shown in the figure:
NETCONF Server test scenario
Steps for testing are as follows:
1. The NETCONF Clients are preconfigured with a set of NETCONF Request XMLs as
per the YANG model supported by the DUT. The XMLs have different types of
command snippets like edit-config, get, get-config.
2. Once sessions are established, NETCONF clients will start sending NETCONF
Request messages in the form of XML files and the Server is supposed to
respond with NETCONF Reply messages with the same XML format.
3. Let’s assume the Clients have sent some Request messages and stopped sending
thereafter.
4. For each session, measure how much time (min/max/average) the Server takes
to send a Reply message after receiving a Request message.
5. Now have the Clients resume sending Request messages at a higher transmission rate for a certain duration and measure how that affects the DUT’s response time under stress conditions; a timing sketch of step 4 follows below.
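A sketch of step 4 for a single client, timing request/reply round trips with the ncclient library; all connection parameters are placeholders.

import time
from ncclient import manager

samples = []
with manager.connect(host="192.0.2.1", port=830, username="admin",
                     password="admin", hostkey_verify=False) as m:
    for _ in range(100):                    # send 100 Request XMLs
        start = time.perf_counter()
        m.get_config(source="running")      # Request -> Reply round trip
        samples.append(time.perf_counter() - start)

print(f"min={min(samples):.4f}s max={max(samples):.4f}s "
      f"avg={sum(samples) / len(samples):.4f}s")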
YANG (Yet Another Next Generation)
• YANG is a language used to model data for the NETCONF protocol. A YANG module defines a hierarchy of nodes which can be used for NETCONF-based operations, including configuration, state data, remote procedure calls (RPCs), and notifications.
• This allows a complete description of all data sent between a NETCONF client and server.
• YANG models the hierarchical organization of data as a tree, and provides clear and concise descriptions of the nodes as well as the interactions between those nodes.
YANG (Yet Another Next Generation)
• YANG structures data models into modules and submodules.
• A module can import data from other external modules and include data from submodules.
• The hierarchy can be extended, allowing one module to add data nodes to the hierarchy defined in another module.
• This augmentation can be conditional, with new nodes appearing only if certain conditions are met.
• YANG models can describe constraints to be enforced on the data, restricting the appearance or value of nodes based on the presence or value of other nodes in the hierarchy.
• These constraints are enforceable by either the client or the server, and valid content must abide by them.
YANG (Yet Another Next Generation)
• YANG defines a set of built-in types, and has a type mechanism
through which additional types may be defined.
• Derived types can restrict their base type's set of valid values using
mechanisms like range or pattern restrictions that can be enforced
by clients or servers.
• They can also define usage conventions for the derived type, such as a string-based type that contains a host name.
• YANG permits the definition of complex types using reusable groupings of nodes.
• The instantiation of these groupings can refine or augment the nodes, allowing a module to tailor the nodes to its particular needs.
• Derived types and groupings can be defined in one module or submodule and used either in that location or in another module or submodule that imports or includes it.
YANG (Yet Another Next Generation)
• YANG organizational constructs include defining lists of
nodes with the same names and identifying the keys which
distinguish list members from each other.
• Such lists may be defined as either sorted by user or
automatically sorted by the system.
• For user-sorted lists, operations are defined for
manipulating the order of the nodes.
• YANG modules can be translated into an XML format called
YIN, allowing applications using XML parsers and XSLT
scripts to operate on the models.
• XML Schema [XSD] files can be generated from YANG modules, giving a precise description of the XML representation of the data modeled in YANG modules.
YANG (Yet Another Next Generation)
• YANG strikes a balance between high-level object-oriented modelling and low-level bits-on-the-wire encoding.
• The reader of a YANG module can easily see the high-level view of the data model while seeing how the objects will be encoded in NETCONF operations.
• YANG is an extensible language, allowing extension statements to be defined by standards bodies, vendors, and individuals.
• The statement syntax allows these extensions to coexist with standard YANG statements in a natural way, while making extensions stand out sufficiently for the reader to notice them.
• YANG resists the tendency to solve all possible problems, limiting the problem space to the expression of NETCONF data models, not arbitrary XML documents or arbitrary data models.
• The data models described by YANG are designed to be easily operated upon by NETCONF operations.

More Related Content

What's hot

WSN Routing Protocols
WSN Routing ProtocolsWSN Routing Protocols
WSN Routing Protocols
Murtadha Alsabbagh
 
RFID Technology - Electronics and Communication Seminar Topic
RFID Technology - Electronics and Communication Seminar TopicRFID Technology - Electronics and Communication Seminar Topic
RFID Technology - Electronics and Communication Seminar Topic
HimanshiSingh71
 
RFID with INTERNET OF THINGS
RFID with INTERNET OF THINGSRFID with INTERNET OF THINGS
RFID with INTERNET OF THINGS
Bino Mathew Varghese
 
Infrastructure Establishment
Infrastructure EstablishmentInfrastructure Establishment
Infrastructure Establishment
juno susi
 
Wi-Fi based indoor positioning
Wi-Fi based indoor positioningWi-Fi based indoor positioning
Wi-Fi based indoor positioning
Sherwin Rodrigues
 
IoT material revised edition
IoT material revised editionIoT material revised edition
IoT material revised edition
pavan penugonda
 
WLAN and Bluetooth Indoor Positioning System
WLAN and Bluetooth Indoor Positioning SystemWLAN and Bluetooth Indoor Positioning System
WLAN and Bluetooth Indoor Positioning System
ProjectENhANCE
 
Big data-analytics-cpe8035
Big data-analytics-cpe8035Big data-analytics-cpe8035
Big data-analytics-cpe8035
Neelam Rawat
 
DASH7 Alliance Protocol Technical Presentation
DASH7 Alliance Protocol Technical PresentationDASH7 Alliance Protocol Technical Presentation
DASH7 Alliance Protocol Technical Presentation
Maarten Weyn
 
Mitigating GNSS jamming and spoofing using ML and AI
Mitigating GNSS jamming and spoofing using ML and AIMitigating GNSS jamming and spoofing using ML and AI
Mitigating GNSS jamming and spoofing using ML and AI
ADVA
 
Brief introduction to satellite communications
Brief introduction to satellite communicationsBrief introduction to satellite communications
Brief introduction to satellite communications
Sally Sheridan
 
Lidar
LidarLidar
Lidar
aksh rana
 
Wireless Sensor Networks UNIT-1
Wireless Sensor Networks UNIT-1Wireless Sensor Networks UNIT-1
Wireless Sensor Networks UNIT-1
Easy n Inspire L
 
MANET.pptx
MANET.pptxMANET.pptx
MANET.pptx
ssuser476e50
 
Sensor Protocols for Information via Negotiation (SPIN)
Sensor Protocols for Information via Negotiation (SPIN)Sensor Protocols for Information via Negotiation (SPIN)
Sensor Protocols for Information via Negotiation (SPIN)rajivagarwal23dei
 
Lecture 1 module 1 - radar
Lecture 1   module 1 - radarLecture 1   module 1 - radar
Lecture 1 module 1 - radar
Arnab Sarkar
 

What's hot (20)

WSN Routing Protocols
WSN Routing ProtocolsWSN Routing Protocols
WSN Routing Protocols
 
RFID Technology - Electronics and Communication Seminar Topic
RFID Technology - Electronics and Communication Seminar TopicRFID Technology - Electronics and Communication Seminar Topic
RFID Technology - Electronics and Communication Seminar Topic
 
RFID with INTERNET OF THINGS
RFID with INTERNET OF THINGSRFID with INTERNET OF THINGS
RFID with INTERNET OF THINGS
 
Infrastructure Establishment
Infrastructure EstablishmentInfrastructure Establishment
Infrastructure Establishment
 
Indoor positioning
Indoor positioningIndoor positioning
Indoor positioning
 
Wi-Fi based indoor positioning
Wi-Fi based indoor positioningWi-Fi based indoor positioning
Wi-Fi based indoor positioning
 
IoT material revised edition
IoT material revised editionIoT material revised edition
IoT material revised edition
 
WLAN and Bluetooth Indoor Positioning System
WLAN and Bluetooth Indoor Positioning SystemWLAN and Bluetooth Indoor Positioning System
WLAN and Bluetooth Indoor Positioning System
 
Big data-analytics-cpe8035
Big data-analytics-cpe8035Big data-analytics-cpe8035
Big data-analytics-cpe8035
 
DASH7 Alliance Protocol Technical Presentation
DASH7 Alliance Protocol Technical PresentationDASH7 Alliance Protocol Technical Presentation
DASH7 Alliance Protocol Technical Presentation
 
Mitigating GNSS jamming and spoofing using ML and AI
Mitigating GNSS jamming and spoofing using ML and AIMitigating GNSS jamming and spoofing using ML and AI
Mitigating GNSS jamming and spoofing using ML and AI
 
Brief introduction to satellite communications
Brief introduction to satellite communicationsBrief introduction to satellite communications
Brief introduction to satellite communications
 
Sdr seminar
Sdr seminarSdr seminar
Sdr seminar
 
Lidar
LidarLidar
Lidar
 
Wireless Sensor Networks UNIT-1
Wireless Sensor Networks UNIT-1Wireless Sensor Networks UNIT-1
Wireless Sensor Networks UNIT-1
 
MANET.pptx
MANET.pptxMANET.pptx
MANET.pptx
 
Distributed Antenna System
Distributed Antenna SystemDistributed Antenna System
Distributed Antenna System
 
802 15-4 tutorial
802 15-4 tutorial802 15-4 tutorial
802 15-4 tutorial
 
Sensor Protocols for Information via Negotiation (SPIN)
Sensor Protocols for Information via Negotiation (SPIN)Sensor Protocols for Information via Negotiation (SPIN)
Sensor Protocols for Information via Negotiation (SPIN)
 
Lecture 1 module 1 - radar
Lecture 1   module 1 - radarLecture 1   module 1 - radar
Lecture 1 module 1 - radar
 

Similar to IOE MODULE 6.pptx

Hadoop and Mapreduce for .NET User Group
Hadoop and Mapreduce for .NET User GroupHadoop and Mapreduce for .NET User Group
Hadoop and Mapreduce for .NET User GroupCsaba Toth
 
MapReduce.pptx
MapReduce.pptxMapReduce.pptx
MapReduce.pptx
AtulYadav218546
 
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdfmodule3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
TSANKARARAO
 
Big Data.pptx
Big Data.pptxBig Data.pptx
Big Data.pptx
NelakurthyVasanthRed1
 
Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015
Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015
Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015
Codemotion
 
Map reduce
Map reduceMap reduce
Map reduce
Somesh Maliye
 
Big Data Analytics Chapter3-6@2021.pdf
Big Data Analytics Chapter3-6@2021.pdfBig Data Analytics Chapter3-6@2021.pdf
Big Data Analytics Chapter3-6@2021.pdf
WasyihunSema2
 
Map reduce presentation
Map reduce presentationMap reduce presentation
Map reduce presentation
Ahmad El Tawil
 
Hadoop combiner and partitioner
Hadoop combiner and partitionerHadoop combiner and partitioner
Hadoop combiner and partitioner
Subhas Kumar Ghosh
 
Join Algorithms in MapReduce
Join Algorithms in MapReduceJoin Algorithms in MapReduce
Join Algorithms in MapReduce
Shrihari Rathod
 
Mapreduce script
Mapreduce scriptMapreduce script
Mapreduce script
Haripritha
 
Hadoop training-in-hyderabad
Hadoop training-in-hyderabadHadoop training-in-hyderabad
Hadoop training-in-hyderabad
sreehari orienit
 
Introduction to the Map-Reduce framework.pdf
Introduction to the Map-Reduce framework.pdfIntroduction to the Map-Reduce framework.pdf
Introduction to the Map-Reduce framework.pdf
BikalAdhikari4
 
Hadoop eco system with mapreduce hive and pig
Hadoop eco system with mapreduce hive and pigHadoop eco system with mapreduce hive and pig
Hadoop eco system with mapreduce hive and pig
KhanKhaja1
 
Hadoop Architecture
Hadoop ArchitectureHadoop Architecture
Hadoop Architecture
Dr. C.V. Suresh Babu
 
try
trytry

Similar to IOE MODULE 6.pptx (20)

Hadoop and Mapreduce for .NET User Group
Hadoop and Mapreduce for .NET User GroupHadoop and Mapreduce for .NET User Group
Hadoop and Mapreduce for .NET User Group
 
MapReduce.pptx
MapReduce.pptxMapReduce.pptx
MapReduce.pptx
 
ENAR short course
ENAR short courseENAR short course
ENAR short course
 
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdfmodule3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
 
Big Data.pptx
Big Data.pptxBig Data.pptx
Big Data.pptx
 
Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015
Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015
Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015
 
Map reduce
Map reduceMap reduce
Map reduce
 
Big Data Analytics Chapter3-6@2021.pdf
Big Data Analytics Chapter3-6@2021.pdfBig Data Analytics Chapter3-6@2021.pdf
Big Data Analytics Chapter3-6@2021.pdf
 
Map reduce presentation
Map reduce presentationMap reduce presentation
Map reduce presentation
 
Hadoop
HadoopHadoop
Hadoop
 
Hadoop combiner and partitioner
Hadoop combiner and partitionerHadoop combiner and partitioner
Hadoop combiner and partitioner
 
Join Algorithms in MapReduce
Join Algorithms in MapReduceJoin Algorithms in MapReduce
Join Algorithms in MapReduce
 
Mapreduce script
Mapreduce scriptMapreduce script
Mapreduce script
 
Hadoop training-in-hyderabad
Hadoop training-in-hyderabadHadoop training-in-hyderabad
Hadoop training-in-hyderabad
 
2 mapreduce-model-principles
2 mapreduce-model-principles2 mapreduce-model-principles
2 mapreduce-model-principles
 
Map reducecloudtech
Map reducecloudtechMap reducecloudtech
Map reducecloudtech
 
Introduction to the Map-Reduce framework.pdf
Introduction to the Map-Reduce framework.pdfIntroduction to the Map-Reduce framework.pdf
Introduction to the Map-Reduce framework.pdf
 
Hadoop eco system with mapreduce hive and pig
Hadoop eco system with mapreduce hive and pigHadoop eco system with mapreduce hive and pig
Hadoop eco system with mapreduce hive and pig
 
Hadoop Architecture
Hadoop ArchitectureHadoop Architecture
Hadoop Architecture
 
try
trytry
try
 

More from nikshaikh786

Module 2_ Divide and Conquer Approach.pptx
Module 2_ Divide and Conquer Approach.pptxModule 2_ Divide and Conquer Approach.pptx
Module 2_ Divide and Conquer Approach.pptx
nikshaikh786
 
Module 1_ Introduction.pptx
Module 1_ Introduction.pptxModule 1_ Introduction.pptx
Module 1_ Introduction.pptx
nikshaikh786
 
Module 1_ Introduction to Mobile Computing.pptx
Module 1_  Introduction to Mobile Computing.pptxModule 1_  Introduction to Mobile Computing.pptx
Module 1_ Introduction to Mobile Computing.pptx
nikshaikh786
 
Module 2_ GSM Mobile services.pptx
Module 2_  GSM Mobile services.pptxModule 2_  GSM Mobile services.pptx
Module 2_ GSM Mobile services.pptx
nikshaikh786
 
MODULE 4_ CLUSTERING.pptx
MODULE 4_ CLUSTERING.pptxMODULE 4_ CLUSTERING.pptx
MODULE 4_ CLUSTERING.pptx
nikshaikh786
 
MODULE 5 _ Mining frequent patterns and associations.pptx
MODULE 5 _ Mining frequent patterns and associations.pptxMODULE 5 _ Mining frequent patterns and associations.pptx
MODULE 5 _ Mining frequent patterns and associations.pptx
nikshaikh786
 
DWM-MODULE 6.pdf
DWM-MODULE 6.pdfDWM-MODULE 6.pdf
DWM-MODULE 6.pdf
nikshaikh786
 
TCS MODULE 6.pdf
TCS MODULE 6.pdfTCS MODULE 6.pdf
TCS MODULE 6.pdf
nikshaikh786
 
Module 3_ Classification.pptx
Module 3_ Classification.pptxModule 3_ Classification.pptx
Module 3_ Classification.pptx
nikshaikh786
 
Module 2_ Introduction to Data Mining, Data Exploration and Data Pre-processi...
Module 2_ Introduction to Data Mining, Data Exploration and Data Pre-processi...Module 2_ Introduction to Data Mining, Data Exploration and Data Pre-processi...
Module 2_ Introduction to Data Mining, Data Exploration and Data Pre-processi...
nikshaikh786
 
Module 1_Data Warehousing Fundamentals.pptx
Module 1_Data Warehousing Fundamentals.pptxModule 1_Data Warehousing Fundamentals.pptx
Module 1_Data Warehousing Fundamentals.pptx
nikshaikh786
 
Module 2_ Cyber offenses & Cybercrime.pptx
Module 2_ Cyber offenses & Cybercrime.pptxModule 2_ Cyber offenses & Cybercrime.pptx
Module 2_ Cyber offenses & Cybercrime.pptx
nikshaikh786
 
Module 1- Introduction to Cybercrime.pptx
Module 1- Introduction to Cybercrime.pptxModule 1- Introduction to Cybercrime.pptx
Module 1- Introduction to Cybercrime.pptx
nikshaikh786
 
MODULE 5- EDA.pptx
MODULE 5- EDA.pptxMODULE 5- EDA.pptx
MODULE 5- EDA.pptx
nikshaikh786
 
MODULE 4-Text Analytics.pptx
MODULE 4-Text Analytics.pptxMODULE 4-Text Analytics.pptx
MODULE 4-Text Analytics.pptx
nikshaikh786
 
Module 3 - Time Series.pptx
Module 3 - Time Series.pptxModule 3 - Time Series.pptx
Module 3 - Time Series.pptx
nikshaikh786
 
Module 2_ Regression Models..pptx
Module 2_ Regression Models..pptxModule 2_ Regression Models..pptx
Module 2_ Regression Models..pptx
nikshaikh786
 
MODULE 1_Introduction to Data analytics and life cycle..pptx
MODULE 1_Introduction to Data analytics and life cycle..pptxMODULE 1_Introduction to Data analytics and life cycle..pptx
MODULE 1_Introduction to Data analytics and life cycle..pptx
nikshaikh786
 
MAD&PWA VIVA QUESTIONS.pdf
MAD&PWA VIVA QUESTIONS.pdfMAD&PWA VIVA QUESTIONS.pdf
MAD&PWA VIVA QUESTIONS.pdf
nikshaikh786
 
VIVA QUESTIONS FOR DEVOPS.pdf
VIVA QUESTIONS FOR DEVOPS.pdfVIVA QUESTIONS FOR DEVOPS.pdf
VIVA QUESTIONS FOR DEVOPS.pdf
nikshaikh786
 

More from nikshaikh786 (20)

Module 2_ Divide and Conquer Approach.pptx
Module 2_ Divide and Conquer Approach.pptxModule 2_ Divide and Conquer Approach.pptx
Module 2_ Divide and Conquer Approach.pptx
 
Module 1_ Introduction.pptx
Module 1_ Introduction.pptxModule 1_ Introduction.pptx
Module 1_ Introduction.pptx
 
Module 1_ Introduction to Mobile Computing.pptx
Module 1_  Introduction to Mobile Computing.pptxModule 1_  Introduction to Mobile Computing.pptx
Module 1_ Introduction to Mobile Computing.pptx
 
Module 2_ GSM Mobile services.pptx
Module 2_  GSM Mobile services.pptxModule 2_  GSM Mobile services.pptx
Module 2_ GSM Mobile services.pptx
 
MODULE 4_ CLUSTERING.pptx
MODULE 4_ CLUSTERING.pptxMODULE 4_ CLUSTERING.pptx
MODULE 4_ CLUSTERING.pptx
 
MODULE 5 _ Mining frequent patterns and associations.pptx
MODULE 5 _ Mining frequent patterns and associations.pptxMODULE 5 _ Mining frequent patterns and associations.pptx
MODULE 5 _ Mining frequent patterns and associations.pptx
 
DWM-MODULE 6.pdf
DWM-MODULE 6.pdfDWM-MODULE 6.pdf
DWM-MODULE 6.pdf
 
TCS MODULE 6.pdf
TCS MODULE 6.pdfTCS MODULE 6.pdf
TCS MODULE 6.pdf
 
Module 3_ Classification.pptx
Module 3_ Classification.pptxModule 3_ Classification.pptx
Module 3_ Classification.pptx
 
Module 2_ Introduction to Data Mining, Data Exploration and Data Pre-processi...
Module 2_ Introduction to Data Mining, Data Exploration and Data Pre-processi...Module 2_ Introduction to Data Mining, Data Exploration and Data Pre-processi...
Module 2_ Introduction to Data Mining, Data Exploration and Data Pre-processi...
 
Module 1_Data Warehousing Fundamentals.pptx
Module 1_Data Warehousing Fundamentals.pptxModule 1_Data Warehousing Fundamentals.pptx
Module 1_Data Warehousing Fundamentals.pptx
 
Module 2_ Cyber offenses & Cybercrime.pptx
Module 2_ Cyber offenses & Cybercrime.pptxModule 2_ Cyber offenses & Cybercrime.pptx
Module 2_ Cyber offenses & Cybercrime.pptx
 
Module 1- Introduction to Cybercrime.pptx
Module 1- Introduction to Cybercrime.pptxModule 1- Introduction to Cybercrime.pptx
Module 1- Introduction to Cybercrime.pptx
 
MODULE 5- EDA.pptx
MODULE 5- EDA.pptxMODULE 5- EDA.pptx
MODULE 5- EDA.pptx
 
MODULE 4-Text Analytics.pptx
MODULE 4-Text Analytics.pptxMODULE 4-Text Analytics.pptx
MODULE 4-Text Analytics.pptx
 
Module 3 - Time Series.pptx
Module 3 - Time Series.pptxModule 3 - Time Series.pptx
Module 3 - Time Series.pptx
 
Module 2_ Regression Models..pptx
Module 2_ Regression Models..pptxModule 2_ Regression Models..pptx
Module 2_ Regression Models..pptx
 
MODULE 1_Introduction to Data analytics and life cycle..pptx
MODULE 1_Introduction to Data analytics and life cycle..pptxMODULE 1_Introduction to Data analytics and life cycle..pptx
MODULE 1_Introduction to Data analytics and life cycle..pptx
 
MAD&PWA VIVA QUESTIONS.pdf
MAD&PWA VIVA QUESTIONS.pdfMAD&PWA VIVA QUESTIONS.pdf
MAD&PWA VIVA QUESTIONS.pdf
 
VIVA QUESTIONS FOR DEVOPS.pdf
VIVA QUESTIONS FOR DEVOPS.pdfVIVA QUESTIONS FOR DEVOPS.pdf
VIVA QUESTIONS FOR DEVOPS.pdf
 

Recently uploaded

road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
VENKATESHvenky89705
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
The role of big data in decision making.
The role of big data in decision making.The role of big data in decision making.
The role of big data in decision making.
ankuprajapati0525
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
JoytuBarua2
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
Amil Baba Dawood bangali
 
Halogenation process of chemical process industries
Halogenation process of chemical process industriesHalogenation process of chemical process industries
Halogenation process of chemical process industries
MuhammadTufail242431
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
seandesed
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
SamSarthak3
 
Courier management system project report.pdf
Courier management system project report.pdfCourier management system project report.pdf
Courier management system project report.pdf
Kamal Acharya
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
MdTanvirMahtab2
 
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdfCOLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
Kamal Acharya
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Teleport Manpower Consultant
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
karthi keyan
 
Forklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella PartsForklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella Parts
Intella Parts
 
LIGA(E)11111111111111111111111111111111111111111.ppt
LIGA(E)11111111111111111111111111111111111111111.pptLIGA(E)11111111111111111111111111111111111111111.ppt
LIGA(E)11111111111111111111111111111111111111111.ppt
ssuser9bd3ba
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
gdsczhcet
 
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSETECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
DuvanRamosGarzon1
 
Vaccine management system project report documentation..pdf
Vaccine management system project report documentation..pdfVaccine management system project report documentation..pdf
Vaccine management system project report documentation..pdf
Kamal Acharya
 
Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
TeeVichai
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
obonagu
 

Recently uploaded (20)

road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
 
The role of big data in decision making.
The role of big data in decision making.The role of big data in decision making.
The role of big data in decision making.
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 
Halogenation process of chemical process industries
Halogenation process of chemical process industriesHalogenation process of chemical process industries
Halogenation process of chemical process industries
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
 
Courier management system project report.pdf
Courier management system project report.pdfCourier management system project report.pdf
Courier management system project report.pdf
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
 
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdfCOLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
 
Forklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella PartsForklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella Parts
 
LIGA(E)11111111111111111111111111111111111111111.ppt
LIGA(E)11111111111111111111111111111111111111111.pptLIGA(E)11111111111111111111111111111111111111111.ppt
LIGA(E)11111111111111111111111111111111111111111.ppt
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
 
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSETECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
 
Vaccine management system project report documentation..pdf
Vaccine management system project report documentation..pdfVaccine management system project report documentation..pdf
Vaccine management system project report documentation..pdf
 
Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
 

IOE MODULE 6.pptx

  • 1. MODULE 6: Data Analytics for IoE CO5-Design and develop smart city in IOT. CO6-Analysis and evaluate the data received through sensors in IOT.
  • 2. CONTENTS • Introduction • Apache Hadoop Using Hadoop Map Reduce for Batch Data Analysis • Apache Oozie • Apache Spark • Apache Storm, Using Apache Storm for Real-time Data Analysis • Structural Health Monitoring Case Study • Tools for IoT:- – Chef, Chef Case Studies, – Puppet, Puppet Case Study – Multi-tier Deployment • NETCONF-YANG Case Studies • IoT Code Generator.
  • 5. MAJOR GOALS OF HADOOP
  • 6. MAJOR GOALS OF HADOOP
  • 7. MAJOR GOALS OF HADOOP
  • 8. MAJOR GOALS OF HADOOP
  • 9. MAJOR GOALS OF HADOOP
  • 12.
  • 15. HDFS
  • 16. YARN
  • 21. HBASE
  • 25. Map Reduce • Map-Reduce is a scalable programming model that simplifies distributed processing of data. Map-Reduce consists of three main steps: Mapping, Shuffling and Reducing. • An easy way to think about a Map-Reduce job is to compare it with act of ‘delegating’ a large task to a group of people, and then combining the result of each person’s effort, to produce the final outcome.
  • 26. Map Reduce • Let’s take an example to bring the point across. You just heard about this great news at your office, and are throwing a party for all your colleagues! • You decide to cook Pasta for the dinner. Four of your friends, who like cooking, also volunteer to join you in preparation. • The task of preparing Pasta broadly involves chopping the vegetables, cooking, and garnishing. • Let’s take the job of chopping the vegetables and see how it is analogous to a map-reduce task. • Here the raw vegetables are symbolic of the input data, your friends are equivalent to compute nodes, and final chopped vegetables are analogous to desired outcome. • Each friend is allotted onions, tomatoes and peppers to chop and weigh. • You would also like to know how much of each vegetable types you have in the kitchen. You would also like to chop these vegetables while this calculation is occurring. In the end, the onions should be in one large bowl with a label that displays its weight in pounds, tomatoes in a separate one, and so on.
  • 27. Map Reduce MAP: To start with, you assign each of your four friends a random mix of different types of vegetables. They are required to use their ‘compute’ powers to chop them and measure the weight of each type of veggie. They need to ensure not to mix different types of veggies. So each friend will generate a mapping of <key, value> pairs that looks like: Friend X: • <tomatoes, 5 lbs> <onions, 10 lbs> <garlic, 2 lbs> Friend Y: • <onions, 22 lbs> <green peppers, 5 lbs> … • Now that your friends have chopped the vegetables, and labeled each bowl with the weight and type of vegetable, we move to the next stage: Shuffling.
  • 28. Map Reduce SHUFFLE: This stage is also called Grouping. Here you want to group the veggies by their types. You assign different parts of your kitchen to each type of veggie, and your friends are supposed to group the bowls, so that like items are placed together: North End of Kitchen: • <tomatoes, 5 lbs> <tomatoes, 11 lbs> West End of Kitchen: • <onions, 10 lbs> <onions, 22 lbs> <onions, 1.4 lbs> East End of Kitchen: • <green peppers, 3 lbs> <green peppers, 10 lbs> • The party starts in a couple of hours, but you are impressed by what your friends have accomplished by Mapping and Grouping so far! The kitchen looks much more organized now and the raw material is chopped. The final stage of this task is to measure how much of each veggie you actually have. This brings us to the Reduce stage.
  • 29. Map Reduce • REDUCE: In this stage, you ask each of your friend to collect items of same type, put them in a large bowl, and label this large bowl with sum of individual bowl weights. Your friends cannot wait for the party to start, and immediately start ‘reducing’ small bowls. In the end, you have nice large bowls, with total weight of each vegetable labeled on it.
  • 31. Map Reduce • The number represents total weight of that vegetable after reducing from smaller bowls • Your friends (‘compute nodes’) just performed a Map- Reduce task to help you get started with cooking the Pasta. Since you were coordinating the entire exercise, you are “The Master” node of this Map-Reduce task. Each of your friends took roles of Mappers, Groupers and Reducers at different times. This example demonstrates the power of this technique. • This simple and powerful technique can be scaled very easily if more of your friends decide to join you. In future, we will continue to add more articles on different open source tools that will help you easily implement Map- Reduce to solve your computational problems.
  • 32. Introduction to batch processing – MapReduce • Today, the volume of data is often too big for a single server – node – to process. • Therefore, there was a need to develop code that runs on multiple nodes. • Writing distributed systems is an endless array of problems, so people developed multiple frameworks to make our lives easier. • MapReduce is a framework that allows the user to write code that is executed on multiple nodes without having to worry about fault tolerance, reliability, synchronization or availability.
  • 33. Batch processing • There are a lot of use cases for a system described in the introduction, but the focus of this post will be on data processing – more specifically, batch processing. • Batch processing is an automated job that does some computation, usually done as a periodical job. • It runs the processing code on a set of inputs, called a batch. Usually, the job will read the batch data from a database and store the result in the same or different database. • An example of a batch processing job could be reading all the sale logs from an online shop for a single day and aggregating it into statistics for that day (number of users per country, the average spent amount, etc.). Doing this as a daily job could give insights into customer trends.
  • 35. MapReduce MapReduce • MapReduce is a programming model that was introduced in a white paper by Google in 2004. • Today, it is implemented in various data processing and storing systems (Hadoop, Spark, MongoDB, …) and it is a foundational building block of most big data batch processing systems. • For MapReduce to be able to do computation on large amounts of data, it has to be a distributed model that executes its code on multiple nodes. This allows the computation to handle larger amounts of data by adding more machines – horizontal scaling. • This is different from vertical scaling, which implies increasing the performance of a single machine.
• 36. MapReduce Execution
• In order to decrease the duration of our distributed computation, MapReduce tries to reduce shuffling (moving) the data from one node to another by distributing the computation so that it is done on the same node where the data is stored.
• This way, the data stays on the same node and the code is moved via the network instead. This is ideal because the code is much smaller than the data.
• To run a MapReduce job, the user has to implement two functions, map and reduce, and those implemented functions are distributed to the nodes that contain the data by the MapReduce framework.
• Each node runs (executes) the given functions on the data it has in order to minimize network traffic (shuffling data).
• 38. MapReduce
• The computation performance of MapReduce comes at the cost of its expressivity.
• When writing a MapReduce job we have to follow the strict interface (input and return data structures) of the map and the reduce functions.
• The map phase generates key-value data pairs from the input data (partitions), which are then grouped by key and used in the reduce phase by the reduce task.
• Everything except the interface of the functions is programmable by the user.
• 39. MapReduce Map
• Hadoop, along with its many other features, had the first open-source implementation of MapReduce. It also has its own distributed file storage called HDFS.
• In Hadoop, the typical input into a MapReduce job is a directory in HDFS.
• In order to increase parallelization, each directory is made up of smaller units called partitions, and each partition can be processed separately by a map task (the process that executes the map function).
• This is hidden from the user, but it is important to be aware of it because the number of partitions can affect the speed of execution.
• 41. MapReduce
• The map task (mapper) is called once for every input partition and its job is to extract key-value pairs from the input partition. The mapper can generate any number of key-value pairs from a single input (including zero).
• The user only needs to define the code inside the mapper. Below is a sketch of a simple mapper that takes an input partition and outputs each word as a key with the value 1.
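Hadoop mappers are normally written in Java; the plain-Python sketch below only illustrates the contract just described (one call per partition, any number of emitted pairs).

```python
# A sketch of the word-count mapper: for every word in the input
# partition, emit a (word, 1) key-value pair.
def mapper(partition):
    for line in partition:
        for word in line.split():
            yield (word.lower(), 1)

partition = ["the quick brown fox", "the lazy dog"]
print(list(mapper(partition)))
# [('the', 1), ('quick', 1), ('brown', 1), ('fox', 1),
#  ('the', 1), ('lazy', 1), ('dog', 1)]
```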
• 43. MapReduce Reduce
• The MapReduce framework collects all the key-value pairs produced by the mappers, arranges them into groups with the same key, and applies the reduce function to each group.
• All the grouped values entering the reducers are sorted by the framework.
• The reducer can produce output files which can serve as input into another MapReduce job, thus enabling multiple MapReduce jobs to chain into a more complex data processing pipeline.
• 44. MapReduce
• The mapper yielded key-value pairs with the word as the key and the number 1 as the value.
• The reducer is then called on all the values with the same key (word), which creates a distributed word counting pipeline, as sketched below.
• Not every sorted group gets its own reduce task: the user defines the number of reducers (3 in our case), and each group is assigned to one of them.
• After a reducer is done with its task, it takes another group if there is one that was not processed.
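Continuing the sketch, the word-count reducer receives one key together with all of its grouped values and emits their sum; again plain Python stands in for the framework.

```python
# A sketch of the word-count reducer: the framework calls it once per
# key with all of that key's grouped (and sorted) values.
def reducer(word, counts):
    yield (word, sum(counts))

# What the framework would hand one reducer after the shuffle:
print(list(reducer("the", [1, 1, 1])))  # [('the', 3)]
```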
• 46. MapReduce
• MapReduce is a programming model that allows the user to write batch processing jobs with a small amount of code.
• It is flexible in the sense that you, the user, can write code to modify the behavior, but building complex data processing pipelines becomes cumbersome because every MapReduce job has to be managed and scheduled on its own.
• The intermediate output of the map tasks is written to a file, which allows the framework to recover easily if a node fails.
• This stability comes at the cost of performance, as the data could instead have been forwarded to the reduce tasks through a small buffer, creating a stream.
• 47. Apache Oozie
• Oozie is a workflow scheduler system to manage Apache Hadoop jobs.
• Oozie Workflow jobs are Directed Acyclic Graphs (DAGs) of actions.
• Oozie Coordinator jobs are recurrent Oozie Workflow jobs triggered by time (frequency) and data availability.
• Oozie is integrated with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as Java map-reduce, Streaming map-reduce, Pig, Hive, Sqoop and Distcp) as well as system-specific jobs (such as Java programs and shell scripts).
• Oozie is a scalable, reliable and extensible system.
• 52. Apache Spark
• Apache Spark is a lightning-fast unified analytics engine for big data and machine learning. It was originally developed at UC Berkeley in 2009.
• Since its release, Apache Spark has seen rapid adoption by enterprises across a wide range of industries.
• Internet powerhouses such as Netflix, Yahoo, and eBay have deployed Spark at massive scale, collectively processing multiple petabytes of data on clusters of over 8,000 nodes.
• It has quickly become the largest open source community in big data, with over 1,000 contributors from 250+ organizations.
• 53. Apache Spark
• Spark can be 100x faster than Hadoop for large-scale data processing by exploiting in-memory computing and other optimizations.
• Spark is also fast when data is stored on disk, and it has held the world record for large-scale on-disk sorting.
• Spark has easy-to-use APIs for operating on large datasets.
• Spark comes packaged with higher-level libraries, including support for SQL queries, streaming data, machine learning and graph processing.
• These standard libraries increase developer productivity and can be seamlessly combined to create complex workflows.
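As a taste of those APIs, here is a minimal word count in PySpark; the input path is a placeholder and a local Spark installation is assumed.

```python
# Minimal PySpark word count. The HDFS path below is a hypothetical
# placeholder for whatever input the job actually reads.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("WordCount").getOrCreate()
lines = spark.sparkContext.textFile("hdfs:///data/input.txt")

counts = (lines.flatMap(lambda line: line.split())   # words
               .map(lambda word: (word, 1))          # (word, 1) pairs
               .reduceByKey(lambda a, b: a + b))     # sum per word

for word, count in counts.take(10):
    print(word, count)

spark.stop()
```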
• 55. Apache Storm
• Apache Storm is a distributed real-time big data processing system.
• Storm is designed to process vast amounts of data in a fault-tolerant and horizontally scalable manner.
• It is a streaming data framework capable of very high ingestion rates.
• Though Storm is stateless, it manages the distributed environment and cluster state via Apache ZooKeeper.
• It is simple, and you can execute all kinds of manipulations on real-time data in parallel.
• Apache Storm continues to be a leader in real-time data analytics. Storm is easy to set up and operate, and it guarantees that every message will be processed through the topology at least once.
• 56. An Apache Storm cluster consists of a few specific components that work together. There are two types of nodes in the architecture:
• Master node (Nimbus)
• Worker node (Supervisor)
• 57. The Master node runs the Nimbus daemon. Nimbus examines and administers the tasks in the cluster: it distributes code to worker nodes, allots tasks to machines, and monitors for failures. Nimbus accepts code in any programming language, so anyone can use Apache Storm without knowing any other language.
• The Worker node runs the Supervisor daemon. The supervisor concentrates on the tasks given to its machine and starts or stops worker processes as required, based on what Nimbus has assigned to it. Each worker process executes a part of the topology in the form of spouts and bolts. The Nimbus daemon communicates with the Supervisor daemons via ZooKeeper.
• 58. Components of Apache Storm
• Topology: a topology is the real-time computation, represented as a graph-shaped data structure. It consists of spouts and bolts, where a spout's output feeds the inputs of bolts and the output of a single bolt can be linked to the inputs of other bolts. A Storm cluster receives a topology as input; the Nimbus daemon on the master node coordinates with the Supervisor daemons on the worker nodes and deploys the topology.
• 60. Spout: a spout is the entry point of a topology; it acquires data from various sources. It ingests the data as a stream of tuples and sends it to bolts for processing. A single spout can generate multiple output streams of tuples, and these streams are consumed by one or many bolts. Spouts continuously pull data from databases, distributed file systems, or messaging systems such as Kafka, convert it into streams of tuples, and send those to bolts for processing.
• Bolt: bolts are responsible for the actual processing of the data; their work includes filtering, applying functions, aggregations, interacting with databases, etc. Bolts consume one or more streams as input, process them, and may emit new streams for further processing.
• 61. Consider the case of Twitter, an online social platform where users communicate with tweets. Subscribed users can read and post tweets, while unsubscribed users can only read them.
• A hashtag classifies a tweet by a keyword, formed by prefixing the keyword with #. Apache Storm can act here as a real-time pipeline that detects the most used hashtags in the tweet stream, as sketched below.
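Real Storm topologies are usually written against the Java API; the following single-process Python sketch only mirrors the spout-to-bolt data flow for the hashtag example. The tweets are made up.

```python
# A single-process simulation of the hashtag-counting topology: a spout
# emits tweet tuples, one bolt extracts hashtags, another aggregates
# counts. Real Storm runs these distributed across worker processes.
from collections import Counter

def tweet_spout():
    """Spout: emits a stream of tweets from a (hypothetical) source."""
    tweets = ["#IoT is growing", "analytics with #Storm and #IoT"]
    yield from tweets

def hashtag_bolt(tweet_stream):
    """Bolt: extracts hashtag tuples from incoming tweet tuples."""
    for tweet in tweet_stream:
        for token in tweet.split():
            if token.startswith("#"):
                yield token.lower()

def counter_bolt(hashtag_stream):
    """Bolt: aggregates hashtag counts (kept in memory here)."""
    return Counter(hashtag_stream)

print(counter_bolt(hashtag_bolt(tweet_spout())))
# Counter({'#iot': 2, '#storm': 1})
```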
• 62. Apache Storm vs Hadoop
• Both the Hadoop and Storm frameworks are used for analyzing big data; they complement each other and differ in some aspects. Apache Storm does all of its operations except persistency, while Hadoop is good at everything but lags in real-time computation. The following points compare the attributes of Storm and Hadoop.
• 63. Apache Storm vs Hadoop
• Processing model: Storm performs real-time stream processing; Hadoop (MapReduce) performs batch processing.
• State: Storm itself is stateless; Hadoop is stateful.
• Job lifetime: a Storm topology runs until it is shut down by the user, while a MapReduce job eventually completes once its input batch is processed.
• Architecture: Storm uses a master/worker design (Nimbus/Supervisor) coordinated through ZooKeeper; Hadoop likewise uses a master/worker design for resource management (e.g., ResourceManager/NodeManager in YARN).
• Both frameworks are distributed and fault-tolerant.
• 65. NETCONF/YANG
• Network Configuration Protocol, known as NETCONF, gives access to the native capabilities of a device within a network, defining methods to manipulate its configuration database, retrieve operational data, and invoke specific operations.
• YANG provides the means to define the content carried via NETCONF, for both data and operations. Together, they help users build network management applications that meet the needs of network operators.
• 67. NETCONF/YANG
The motivation behind NETCONF and YANG was to have a network management system that manages the network at the service level, which includes:
– Standardized data model (YANG)
– Network-wide configuration transactions
– Validation and roll-back of configuration
– Centralized backup and restore of configuration
Businesses have used the Simple Network Management Protocol (SNMP) for a long time, but it was used more for reading device states than for configuring devices. NETCONF and YANG address the shortcomings of SNMP and add more network management functionality, such as:
1. Configuration transactions
2. Network-wide orchestrated activation
3. Network-level validation and roll-back
4. Save and restore of configurations
• 68. NETCONF/YANG
Configuration transactions:
• NETCONF configuration works with atomic transactions consisting of the multiple configuration commands required to move a network from state A to state B.
• The order of the configuration snippets within a transaction does not matter, and the success of a transaction depends on the success of all of its command snippets.
• If any single command fails, the entire transaction fails.
• So there is no intermediate erroneous state: the network is either at state A (if any one command of the transaction fails) or at state B (if the transaction succeeds as a whole).
• 69. NETCONF/YANG
Network-wide orchestrated activation:
• There is a distinction between distributing a configuration to all the networking devices and activating it.
• For example, if the operator wants to configure a VPN over a network of devices all at one time, NETCONF provides the flexibility to distribute the configuration, validate it, lock all device configurations, commit the configuration, and unlock.
• This set of actions results in enabling a VPN over the entire network at the same time, in an orchestrated, synchronized way.
• 70. NETCONF/YANG
Network-level validation and roll-back:
• Each NETCONF server keeps a “candidate database” (in parallel to the “running config database”).
• Using this candidate datastore, a NETCONF manager can implement a network-wide transaction by sending a configuration to the candidate of each device, validating the candidate, and, if all participants validate successfully, telling them to commit the changes, as sketched below.
• If the results are not satisfactory, the manager can ask all devices to roll back.
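A hedged sketch of this candidate/validate/commit cycle for one device, using the open-source ncclient Python library. The host, credentials, and configuration snippet are placeholders; a real manager would repeat this over every device in the network and commit only if all validations succeed. The device is assumed to advertise the :candidate and :validate capabilities.

```python
# Candidate-datastore transaction against one device via ncclient.
from ncclient import manager

CONFIG = """<config xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <!-- device-specific configuration subtree goes here -->
</config>"""

with manager.connect(host="192.0.2.1", port=830, username="admin",
                     password="admin", hostkey_verify=False) as m:
    with m.locked(target="candidate"):           # lock the candidate datastore
        m.edit_config(target="candidate", config=CONFIG)
        try:
            m.validate(source="candidate")       # network-level validation
            m.commit()                           # move the device to state B
        except Exception:
            m.discard_changes()                  # roll back to state A
```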
• 71. NETCONF/YANG
Save and restore configurations:
• A NETCONF manager can take a backup of a networking device's configuration whenever needed and restore it later by sending the saved configuration back to the device.
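A minimal backup sketch with the same (placeholder) ncclient session; restoring would send the saved XML back to a device via an <edit-config> or <copy-config> operation.

```python
# Backup the running configuration of a device to a local file.
from ncclient import manager

with manager.connect(host="192.0.2.1", port=830, username="admin",
                     password="admin", hostkey_verify=False) as m:
    running = m.get_config(source="running")   # <get-config> on "running"
    with open("backup.xml", "w") as f:
        f.write(running.data_xml)              # reply payload as XML text
```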
• 72. Protocol Stack
• The NETCONF protocol can be broken down into four layers, described below.
• 73.
1. Content: NETCONF data models and protocol operations use the YANG modelling language. A data model outlines the structure, semantics and syntax of the data.
2. Operations: a set of base protocol operations, invoked via RPC methods using XML encoding, to perform operations upon the device, such as <get-config>, <edit-config> and <get>.
3. Messages: a set of RPC messages and notifications defined for use, including <rpc>, <rpc-reply> and <rpc-error>.
4. Transport: the transport layer provides a communication path between the client and server (manager and agent). NETCONF is agnostic to the transport protocol used, but SSH is typically used.
• 74. Communication
NETCONF is based upon a client/server model. The communication flow of a NETCONF session has three main parts, as sketched below:
1. Session Establishment – each side sends a <hello> along with its <capabilities>, announcing which operations (capabilities) it supports.
2. Operation Request – the client then sends its request (operation) to the server in an <rpc> message. The response is sent back to the client within an <rpc-reply>.
3. Session Close – the session is then closed by the client via <close-session>.
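The same three-part flow, sketched with ncclient; the <hello>/capability exchange happens implicitly on connect, and the host and credentials are placeholders.

```python
# Establish a session, issue one operation, close the session.
from ncclient import manager

m = manager.connect(host="192.0.2.1", port=830, username="admin",
                    password="admin", hostkey_verify=False)

# Capabilities the server announced in its <hello>:
for cap in list(m.server_capabilities)[:5]:
    print(cap)

reply = m.get()              # sent as an <rpc>, answered in an <rpc-reply>
print(reply.data_xml[:200])  # first part of the returned state data

m.close_session()            # sends <close-session>
```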
• 76. NETCONF Manager Test Scenario
The important aspects of NETCONF server validation can be classified into the following categories:
• Validate the YANG model encoding of NETCONF operations (e.g., <get>, <get-config>, <edit-config>) received in Request XML messages (e.g., IETF, OpenConfig, or proprietary models)
• Stress the management plane with many concurrent NETCONF sessions and assess the impact on regular control plane and data plane operation of a network device
• Measure the device's response time for NETCONF transactions
• 77. NETCONF Server test scenario
• Test Objective: measure the efficiency of the NETCONF server in terms of the time it takes to respond to NETCONF Request XMLs when multiple concurrent NETCONF client sessions are active.
• Test Topology: multiple NETCONF clients are all connected to a single NETCONF server (the DUT).
• 79. NETCONF Server test scenario
The steps for testing are as follows:
1. The NETCONF clients are preconfigured with a set of NETCONF Request XMLs as per the YANG model supported by the DUT. The XMLs contain different types of command snippets, such as edit-config, get and get-config.
2. Once sessions are established, the NETCONF clients start sending NETCONF Request messages in the form of XML files, and the server is supposed to respond with NETCONF Reply messages in the same XML format.
3. Assume the clients have sent some Request messages and then stopped sending.
4. For each session, measure how much time (min/max/average) the server takes to send a Reply message after receiving a Request message (see the sketch below).
5. Now have the clients resume sending Request messages at a higher transmission rate for a certain duration and measure how that affects the DUT's response time under stress.
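A hedged sketch of the measurement in step 4 for a single client session, again using ncclient with placeholder host and credentials; a real test harness would run many such sessions concurrently.

```python
# Time how long the server takes to answer each request in one session.
import time
from ncclient import manager

with manager.connect(host="192.0.2.1", port=830, username="admin",
                     password="admin", hostkey_verify=False) as m:
    samples = []
    for _ in range(100):                  # 100 <get-config> requests
        start = time.monotonic()
        m.get_config(source="running")    # blocks until the <rpc-reply>
        samples.append(time.monotonic() - start)

print(f"min={min(samples):.3f}s max={max(samples):.3f}s "
      f"avg={sum(samples)/len(samples):.3f}s")
```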
• 80. YANG (Yet Another Next Generation)
• YANG is a language used to model data for the NETCONF protocol. A YANG module defines a hierarchy of nodes which can be used for NETCONF-based operations, including configuration, state data, remote procedure calls (RPCs), and notifications.
• This allows a complete description of all data sent between a NETCONF client and server.
• YANG models the hierarchical organization of data as a tree, and provides clear and concise descriptions of the nodes as well as the interactions between those nodes.
• 81. YANG (Yet Another Next Generation)
• YANG structures data models into modules and submodules.
• A module can import data from other external modules, and include data from submodules.
• The hierarchy can be extended, allowing one module to add data nodes to the hierarchy defined in another module.
• This augmentation can be conditional, with new nodes appearing only if certain conditions are met.
• YANG models can describe constraints to be enforced on the data, restricting the appearance or value of nodes based on the presence or value of other nodes in the hierarchy.
• These constraints are enforceable by either the client or the server, and valid content must abide by them.
• 82. YANG (Yet Another Next Generation)
• YANG defines a set of built-in types, and has a type mechanism through which additional types may be defined.
• Derived types can restrict their base type's set of valid values using mechanisms like range or pattern restrictions that can be enforced by clients or servers.
• They can also define usage conventions for the derived type, such as a string-based type that contains a host name.
• YANG permits the definition of complex types using reusable groupings of nodes.
• The instantiation of these groupings can refine or augment the nodes, allowing it to tailor the nodes to its particular needs.
• Derived types and groupings can be defined in one module or submodule and used either in that location or in another module or submodule that imports or includes it.
• 83. YANG (Yet Another Next Generation)
• YANG's organizational constructs include defining lists of nodes with the same names and identifying the keys which distinguish list members from each other.
• Such lists may be defined as either sorted by the user or automatically sorted by the system.
• For user-sorted lists, operations are defined for manipulating the order of the nodes.
• YANG modules can be translated into an XML format called YIN, allowing applications using XML parsers and XSLT scripts to operate on the models.
• XML Schema (XSD) files can be generated from YANG modules, giving a precise description of the XML representation of the data modeled in YANG modules.
• 84. YANG (Yet Another Next Generation)
• YANG strikes a balance between high-level object-oriented modelling and low-level bits-on-the-wire encoding.
• The reader of a YANG module can easily see the high-level view of the data model while seeing how the objects will be encoded in NETCONF operations.
• YANG is an extensible language, allowing extension statements to be defined by standards bodies, vendors, and individuals.
• The statement syntax allows these extensions to coexist with standard YANG statements in a natural way, while making extensions stand out sufficiently for the reader to notice them.
• YANG resists the tendency to solve all possible problems, limiting the problem space to allow the expression of NETCONF data models, not arbitrary XML documents or arbitrary data models.
• The data models described by YANG are designed to be easily operated upon by NETCONF operations.