Data Analytics
Shivam Singh
Emergence of Data Analytics
• Traditionally, business managers made decisions based on past
experience, rules of thumb, or other qualitative aspects of
decision making.
• Analytics began to command more attention in the late 1960s, when
computers started to play a dominant role as organizations’
decision support systems.
• This was followed by the development of data warehouses and
enterprise resource planning (ERP) systems.
• Business managers and leaders then used data, relying on ad hoc
analysis to confirm their experience- and knowledge-based assumptions
for daily and critical business decisions.
Data Analytics
• It is a process of inspecting, cleaning, transforming, and modeling
data with the goal of discovering useful information, suggesting
conclusions, and supporting decision making.
• Also known as Business Analytics
• Widely used in many industries, where companies and organizations
examine raw data in order to draw conclusions from it and make
better business decisions.
Types of Data Analytics
1. Descriptive Analytics
2. Diagnostic Analytics
3. Predictive Analytics
4. Prescriptive Analytics
Descriptive Analytics
• Any activity or method that helps us to describe or summarize raw
data into something interpretable by humans can be termed
‘Descriptive Analytics’.
• These are useful because they allow us to learn from past behaviors,
and understand how they might influence future outcomes.
• E.g., a company’s business intelligence reports.
• Statistics such as count, min, max, sum, average, percentage, and
percent change fall into this category.
• They provide historical hindsight regarding the company’s
production, operations, sales, revenue, financials, inventory,
customers, and market share.
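The summary statistics listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the monthly sales figures and the `describe` helper are hypothetical and do not come from the slides.

```python
from statistics import mean

# Hypothetical monthly sales figures (units sold); not from the source.
monthly_sales = [120, 135, 128, 150, 162, 158]

def describe(values):
    """Summarize a list of numbers with basic descriptive statistics."""
    first, last = values[0], values[-1]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "average": mean(values),
        # Percent change from the first to the last observation.
        "percent_change": (last - first) / first * 100,
    }

summary = describe(monthly_sales)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In practice the same summary would usually come from a library such as Pandas; the plain-Python version is used here only to keep the arithmetic visible.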
Diagnostic Analytics
• It focuses on determining the factors and events that contributed to
the outcome.
• Characterized by techniques such as drilldown, data discovery, data
mining, correlations, and causation.
• E.g., assume a retail company’s hardlines sales performance is not up
to the mark in certain stores and the product line manager would like
to understand the root cause.
• There is no clearly defined, ordered set of steps for this; the
approach depends on the experience level and thinking style of
the person carrying out the analysis.
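One possible drill-down for the retail example might look like the sketch below: summarize target attainment by store, pick the weakest store, then break that store down by product category. The records, the column layout, and the `attainment_by` helper are assumptions made for illustration; a real diagnosis would follow whatever path the analyst finds most informative.

```python
from collections import defaultdict

# Hypothetical sales records: (store, category, actual sales, sales target).
records = [
    ("Store A", "Tools", 90, 100),
    ("Store A", "Appliances", 105, 100),
    ("Store B", "Tools", 60, 100),
    ("Store B", "Appliances", 98, 100),
]

def attainment_by(rows, key_index):
    """Return the actual/target ratio aggregated by the chosen column."""
    actual = defaultdict(float)
    target = defaultdict(float)
    for row in rows:
        key = row[key_index]
        actual[key] += row[2]
        target[key] += row[3]
    return {key: actual[key] / target[key] for key in actual}

# Step 1: summarize by store to find the weakest performer.
by_store = attainment_by(records, key_index=0)
worst_store = min(by_store, key=by_store.get)

# Step 2: drill down into that store by product category.
store_rows = [row for row in records if row[0] == worst_store]
by_category = attainment_by(store_rows, key_index=1)
worst_category = min(by_category, key=by_category.get)

print(worst_store, worst_category, round(by_category[worst_category], 2))
```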
Predictive Analytics
• It is the ability to make predictions or estimations of likelihoods about
unknown future events based on the past or historic patterns.
• It uses many techniques from data mining, statistics, modeling,
machine learning, and artificial intelligence to analyze current data to
make predictions about the future.
• Machine learning is heavily focused on predictive analytics: we
combine historical data from different sources, such as organizational
ERP, CRM, POS, employee data, and market research data, to identify
patterns, then apply statistical models and algorithms to capture the
relationships between the data sets and predict the
likelihood of an event.
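As a minimal sketch of the idea, the snippet below fits a straight line to historical data with ordinary least squares and uses it to estimate an unseen value. The data (advertising spend versus units sold) is hypothetical and deliberately noise-free so the arithmetic is easy to follow; real predictive models are usually far richer.

```python
# Hypothetical history: advertising spend (x) versus units sold (y).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units = [12.0, 14.0, 16.0, 18.0, 20.0]

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(spend, units)

# Predict units sold for a spend level that has not been observed yet.
predicted = slope * 6.0 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```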
Prescriptive Analytics
• It is the area of data or business analytics dedicated to finding the
best course of action for a given situation.
• Prescriptive analytics aims to measure the effect of a future
decision so that decision makers can foresee the possible
outcomes before the decision is actually made.
• E.g., using simulation in design situations to help users identify system
behaviors under different configurations, and ensuring that all key
performance metrics are met.
• E.g., using linear or nonlinear programming to identify the best outcome
for a business, given constraints and an objective function.
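The optimization example can be illustrated with a tiny production-planning problem. The sketch below does not use a linear programming solver; it simply tries every integer plan in a small grid, which is enough to show an objective function being maximized under constraints. The products, coefficients, and limits are all hypothetical.

```python
# Hypothetical production-planning problem (all numbers are illustrative):
#   maximize   profit = 30*x + 50*y
#   subject to x + 2*y <= 40     (labor hours)
#              3*x + 2*y <= 60   (material units)
#              x, y are non-negative integers

def best_plan(max_units=20):
    """Exhaustively search integer plans and return the most profitable one."""
    best = None
    for x in range(max_units + 1):
        for y in range(max_units + 1):
            feasible = (x + 2 * y <= 40) and (3 * x + 2 * y <= 60)
            if not feasible:
                continue
            profit = 30 * x + 50 * y
            if best is None or profit > best[0]:
                best = (profit, x, y)
    return best

profit, x, y = best_plan()
print(f"Produce x={x}, y={y} for a profit of {profit}")
```

Exhaustive search only works for toy problems; for realistic sizes a dedicated solver (for example, the optimization routines in SciPy) would be the usual choice.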
Data Analysis Packages
• NumPy
• SciPy
• Matplotlib
• Pandas
Natural Language Processing
• The majority of activities performed by humans are done through
language, whether communicated directly or reported using natural
language.
• By combining the power of artificial intelligence, computational
linguistics and computer science, Natural Language Processing (NLP)
helps machines “read” text by simulating the human ability to
understand language.
Steps
Level 1 – Speech sound (Phonetics & Phonology)
Level 2 – Words & their forms (Morphology, Lexicon)
Level 3 – Structure of sentences (Syntax, Parsing)
Level 4 – Meaning of sentences (Semantics)
Level 5 – Meaning in context & for a purpose (Pragmatics)
Level 6 – Connected sentence processing in a larger body of text
(Discourse)
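To make Level 2 (words and their forms) concrete, here is a toy sketch that splits a sentence into word tokens and strips a few common English suffixes. The suffix list and both helper functions are assumptions made for this illustration; real morphological analysis relies on much richer rules and lexicons.

```python
import re

# Toy suffix list for illustration only; real morphology is far richer.
SUFFIXES = ("ing", "ed", "s")

def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def crude_stem(word):
    """Strip one known suffix, keeping at least three characters of the word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

tokens = tokenize("The sensors reported falling temperatures.")
stems = [crude_stem(token) for token in tokens]
print(tokens)
print(stems)
```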
Applications
• Machine Translation
• Fighting Spam
• Information Extraction
• Summarization
• Question Answering
Big Data
• Big data analytics examines large amounts of data to uncover
hidden patterns, correlations and other insights.
• With today’s technology, it’s possible to analyze your data and get
answers from it almost immediately – an effort that’s slower and
less efficient with more traditional business intelligence solutions.
Big Data Analytics vs Data Mining
• Data mining relates to the process of going through large sets of
data to identify relevant or pertinent information.
• The term Big Data can be defined simply as large data sets that
outgrow simple databases and data handling architectures.
• However, decision makers need access to smaller, more specific
pieces of data and use data mining to identify specific data that may
help their businesses make better leadership and management
decisions.
Benefits of data analytics in IoT
• Smart Metering
• A smart meter is a device that electronically records electric energy
consumption and exchanges that data between the meter and the control system.
• Smart Transportation
• Improves existing traffic systems so that vehicles can communicate
effectively with one another in a systematic manner without human
intervention.
• Smart Supply Chains
• Embedded sensor technologies can communicate bidirectionally and
provide remote accessibility
• The captured data are used by on- and off-site technicians to run
diagnostics and evaluate repair options so they can make appropriate decisions.
Benefits of data analytics in IoT
• Smart Grid
• The smart grid is a new generation of power grid in which the management and
distribution of electricity between suppliers and consumers is upgraded using
two-way communication technologies and computing capabilities to
improve reliability, safety, and efficiency through real-time control and monitoring.
• Smart Traffic Light System
• The smart traffic light system consists of nodes that locally interact with IoT
sensors and devices to detect the presence of vehicles, bikers, and
pedestrians.
• These nodes communicate with neighboring traffic lights to measure the
speed and distance of approaching traffic and to manage
green traffic signals.
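A rough sketch of how a node might use speed and distance readings is shown below: it estimates each road user's arrival time as distance divided by speed and extends the green phase if someone would arrive shortly after the light is due to change. The readings, the decision rule, and the 5-second extension window are all assumptions made for this sketch, not a description of any deployed system.

```python
# Hypothetical readings from a neighboring node:
# (road user, distance in meters, speed in meters per second).
approaching = [
    ("car", 120.0, 15.0),
    ("bike", 40.0, 5.0),
    ("pedestrian", 10.0, 0.0),  # stationary, so no arrival time is estimated
]

def arrival_times(readings):
    """Estimate seconds until each moving road user reaches the intersection."""
    return {
        name: distance / speed
        for name, distance, speed in readings
        if speed > 0
    }

def should_extend_green(readings, remaining_green_s, max_extension_s=5.0):
    """Extend green if someone would arrive just after the light changes.

    The rule and the default 5-second window are assumptions for this sketch.
    """
    return any(
        remaining_green_s < eta <= remaining_green_s + max_extension_s
        for eta in arrival_times(readings).values()
    )

etas = arrival_times(approaching)
extend = should_extend_green(approaching, remaining_green_s=6.0)
print(etas, extend)
```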
Big Data Mining Issues in IoT
• Increasingly Large Volumes of Data
• By 2020 the digital universe – the data we create and copy annually – will
reach 44 zettabytes
• Data Sets Aren’t Homogeneous
• Data is curated from many different sources in multiple formats, such as
web documents, CSV sheets, and SQL tables.
• Integrity of Different Sources
• Each system may use its own methodology to develop data, which will
always introduce some level of uncertainty.
• Need for Real-Time Analysis
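The heterogeneity issue can be illustrated with two small sources in different formats and units. The sketch below reads a CSV export and an SQL table (an in-memory SQLite database), converts both to one record shape, and keeps the origin of each record so its reliability can be judged later. The sensor names, column names, and values are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical CSV export from one system (temperature in Celsius).
csv_text = "sensor_id,temp_c\ns1,21.5\ns2,19.0\n"

# Hypothetical SQL table from another system (temperature in Fahrenheit).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, temp_f REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)", [("s3", 68.0), ("s4", 77.0)]
)

def from_csv(text):
    """Yield (sensor, celsius, source) records from CSV text."""
    for row in csv.DictReader(io.StringIO(text)):
        yield row["sensor_id"], float(row["temp_c"]), "csv"

def from_sql(connection):
    """Yield (sensor, celsius, source) records, converting from Fahrenheit."""
    query = "SELECT sensor, temp_f FROM readings ORDER BY sensor"
    for sensor, temp_f in connection.execute(query):
        yield sensor, (temp_f - 32.0) * 5.0 / 9.0, "sql"

# Merge both sources into one homogeneous list, keeping the origin for auditing.
unified = list(from_csv(csv_text)) + list(from_sql(conn))
conn.close()

for record in unified:
    print(record)
```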
Smart Agriculture
• A variety of external parameters belonging to different domains
(e.g. weather conditions, regulations etc.) have a major influence
over the food supply chain
• Agri-IoT can integrate multiple cross-domain data streams,
providing a complete semantic processing pipeline, offering a
common framework for smart farming applications.
• Agri-IoT supports large-scale data analytics and event detection,
ensuring seamless interoperability among sensors, services,
processes, operations, farmers and other relevant actors, including
online information sources and linked open datasets and streams
available on the Web.
IoT Architecture for Big Data Analytics

Data Analytics and Big Data on IoT
