Fundamentals of Data Analytics
CSC1212 IOT DATA ANALYTICS
Learning objectives
At the end of this chapter, students should be able to:
1. Define IoT analytics
2. Identify how data is categorized
3. Explain the challenges of IoT analytics
IMAGINE:
WHAT IF THINGS START TO THINK???
An introduction to IoT Data Analytics
▪ In the IoT world, sensors routinely create massive amounts of data. Handling it is one of the biggest challenges, both for transport across the network and for data management.
▪ Example: Modern jet engines are fitted with thousands of sensors that generate a whopping 10 GB of data per second.
▪ Analysing this amount of data as efficiently as possible falls under the umbrella of data analytics.
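To get a feel for the scale of the jet engine example, the short calculation below converts the quoted rate into a volume per hour. It is only a back-of-the-envelope sketch: the decimal unit conversion (1 TB = 1000 GB) and the 2-hour flight are assumptions made for illustration, not figures from the slides.

```python
# Assumption: decimal units, 1 TB = 1000 GB.
RATE_GB_PER_S = 10  # rate quoted in the jet engine example


def volume_tb(seconds: float, rate_gb_per_s: float = RATE_GB_PER_S) -> float:
    """Return the data volume in terabytes produced over `seconds`."""
    return rate_gb_per_s * seconds / 1000


one_hour_tb = volume_tb(3600)      # one hour of operation
flight_tb = volume_tb(2 * 3600)    # hypothetical 2-hour flight
print(one_hour_tb, flight_tb)      # 36.0 72.0
```

At tens of terabytes per hour, sending every raw reading to a central location is rarely practical, which is why the transport and data management challenges are mentioned together.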
An introduction to IoT Data Analytics
▪ Not all data is the same; it can be categorized and thus analyzed in different ways.
▪ Depending on how data is categorized, different data analytics tools and processing methods can be applied.
▪ Two categorizations are important from an IoT perspective: whether the data is structured or unstructured, and whether it is in motion or at rest.
Structured vs Unstructured Data
❑ Structured and unstructured data are important classifications because they typically require different toolsets from a data analytics perspective.
❑ Structured data follows a model or schema that defines how the data is represented or organized, so it fits well with a traditional relational database management system (RDBMS).
❑ Structured data often takes a simple tabular form, for example a spreadsheet, where each value occupies a specific cell and can be explicitly defined and referenced.
Structured vs Unstructured Data
❑ Structured data is easily formatted, stored, queried and processed.
❑ Because structured data is so highly organized, a wide array of data analytics tools is readily available for processing it, for example:
❑ Microsoft Excel
❑ Tableau
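As a small illustration of why structured data is easy to query, the sketch below stores sensor readings in a relational table and asks for the average per sensor. The table name `readings`, its columns and the sample values are all hypothetical; Python's built-in `sqlite3` module stands in for a full RDBMS.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The schema defines exactly how each reading is represented (hypothetical columns).
conn.execute(
    "CREATE TABLE readings ("
    "sensor_id TEXT NOT NULL, ts INTEGER NOT NULL, temp_c REAL NOT NULL)"
)

rows = [("s1", 1, 21.5), ("s1", 2, 22.5), ("s2", 1, 30.0)]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)

# Because every row follows the schema, a single query answers the question.
avg_by_sensor = dict(
    conn.execute("SELECT sensor_id, AVG(temp_c) FROM readings GROUP BY sensor_id")
)
print(avg_by_sensor)
conn.close()
```

The same question asked of free text or images would first require extracting the values, which is the extra work unstructured data brings.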
Structured vs Unstructured Data
❑ Unstructured data lacks a logical schema that a program could use to understand and decode it.
❑ Examples: text, speech, images and videos.
❑ As a general rule, any data that does not fit neatly into a predefined data model is classified as unstructured data.
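The sketch below shows what the missing schema means in practice: before free-text notes can be analysed, a program has to impose some structure on them. The maintenance notes and the regular expression are invented for illustration, and a pattern like this is fragile compared with reading a typed column.

```python
import re

# Hypothetical free-text maintenance notes (unstructured data).
notes = [
    "Tech visit 3 May: pump housing felt hot, measured 78.5 C near the bearing.",
    "Operator reports rattling noise, no measurement taken.",
]

# Assumed convention: a temperature is a number followed by a capital C.
TEMP_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*C\b")


def extract_temp_c(note: str):
    """Return the first temperature found in the note, or None if there is none."""
    match = TEMP_PATTERN.search(note)
    return float(match.group(1)) if match else None


temps = [extract_temp_c(n) for n in notes]
print(temps)  # [78.5, None]
```

The second note carries useful information (a rattling noise) that no simple pattern captures, which hints at why more advanced methods are applied to unstructured data.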
Structured vs Unstructured Data
❑ According to some estimates, around 80% of a business’s data is unstructured.
❑ Because so much data is unstructured, analytics methods that can handle it, such as cognitive computing and machine learning, are receiving a lot of attention.
[Cognitive computing: hardware and/or software that mimics the functioning of the human brain and helps to improve human decision-making]
Data in Motion vs Data at Rest
❑ Data in IoT networks is either in transit (“data in motion”) or being held or stored (“data at rest”).
❑ Examples of data in motion include traditional client/server exchanges, such as web browsing, file transfers and email.
❑ Data saved to a hard drive, storage array or USB drive is data at rest.
Data in Motion vs Data at Rest
❑ From an IoT perspective, the data from smart objects is considered data in motion as it passes through the network en route to its final destination.
❑ This data is often processed at the edge using fog computing.
❑ At the edge, data may be filtered and deleted, or forwarded on for further processing and possible storage at a fog node or in the data center.
❑ Data at rest in IoT networks is typically found in IoT brokers or in some sort of storage array at the data center.
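The filter-or-forward decision at the edge can be sketched as follows. This is one possible rule (a simple deadband: forward a reading only when it has changed enough since the last forwarded value), chosen for illustration; the function name `edge_filter`, the threshold and the sample stream are assumptions, not something prescribed by the slides.

```python
from typing import Iterable, Iterator


def edge_filter(readings: Iterable[float], threshold: float = 0.5) -> Iterator[float]:
    """Forward a reading only if it differs from the last forwarded
    value by at least `threshold`; otherwise drop it at the edge."""
    last = None
    for value in readings:
        if last is None or abs(value - last) >= threshold:
            last = value
            yield value  # forwarded towards the fog node or data center


# Hypothetical temperature stream arriving from a smart object.
stream = [20.0, 20.1, 20.2, 21.0, 21.1, 25.0]
forwarded = list(edge_filter(stream))
print(forwarded)  # [20.0, 21.0, 25.0]
```

Half of the readings never leave the edge, yet the values that are forwarded still show every significant change.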
What is Data Analytics?
Data analytics is both a science and an art: it applies statistical techniques to large data sets to obtain actionable insights for making smart decisions.
It is the process of uncovering hidden patterns, unknown correlations, trends and other useful business information.
Types of Data Analysis
• Descriptive: What is happening?
• Diagnostic: Why did it happen?
• Predictive: What is likely to happen?
• Prescriptive: What should I do about it?
Predictive and prescriptive analyses are more resource-intensive and more complex than descriptive and diagnostic analyses, but the value they provide is much greater.
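The four questions can be walked through on a toy data set. The sketch below is deliberately naive: the readings, the load values, the safety limit and the straight-line forecast are all hypothetical, and real diagnostic and predictive analysis would use far richer models.

```python
from statistics import mean

temps = [70.0, 72.0, 74.0, 76.0, 78.0]  # hypothetical hourly temperatures (deg C)
load = [50, 55, 60, 65, 70]             # hypothetical machine load (%) for the same hours
LIMIT = 80.0                            # hypothetical safe operating limit (deg C)

# Descriptive: what is happening?
average = mean(temps)

# Diagnostic: why did it happen? Crude check: did load rise at every step too?
load_rising = all(b > a for a, b in zip(load, load[1:]))

# Predictive: what is likely to happen? Extend the average hourly change by one hour.
trend = mean(b - a for a, b in zip(temps, temps[1:]))
forecast = temps[-1] + trend

# Prescriptive: what should I do about it?
action = "reduce load" if forecast >= LIMIT and load_rising else "no action"

print(average, load_rising, forecast, action)
```

Each step builds on the previous one, which is one way to see why the later kinds of analysis cost more but also deliver more value.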
IoT Analytics Challenges
• Too much data
• Security
• Misbehaving devices
IoT Analytics Challenges
Too much data: The total amount of data being collected may be so large that it is not possible to move it over the network to a central location.
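One common way to cope with the volume is to summarize readings close to where they are produced and send only the summaries. The sketch below collapses each window of raw values into a (min, mean, max) tuple; the window size and sample values are assumptions for illustration, and summarizing always discards some detail.

```python
from statistics import mean


def summarize(readings, window):
    """Collapse each window of readings into a (min, mean, max) tuple."""
    summaries = []
    for start in range(0, len(readings), window):
        chunk = readings[start:start + window]
        summaries.append((min(chunk), mean(chunk), max(chunk)))
    return summaries


# Hypothetical raw readings collected locally.
raw = [20.0, 21.0, 22.0, 23.0, 30.0, 31.0, 32.0, 33.0]
summary = summarize(raw, window=4)
print(summary)  # [(20.0, 21.5, 23.0), (30.0, 31.5, 33.0)]
```

Eight raw values become two tuples here; with larger windows the reduction in traffic to the central location grows accordingly.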
IoT Analytics Challenges
Security: If the security on a specific vendor’s outdoor sensor is weak, and the sensor is connected to other devices, the likelihood of an ‘indirect’ critical impact is high. Attackers can compromise the sensor and modify its data, or exploit its connection to other devices to cause damage.
IoT Analytics Challenges
Misbehaving devices: These are devices or sensors that go bad and begin sending false readings to the system. A low battery, a software bug or a hardware failure, for example, could cause such readings. In a warehouse that relies on its sensors, acting on false readings could ruin the inventory.
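A simple way to spot a misbehaving device is to flag readings that sit far from the typical value. The sketch below uses the median and the median absolute deviation (MAD), which are not thrown off by the bad reading itself. The cold-storage temperatures and the cut-off `k` are hypothetical, and this is one possible check rather than a complete fault-detection method.

```python
from statistics import median


def flag_suspect(readings, k=5.0):
    """Return the indices of readings farther than k * MAD from the median."""
    med = median(readings)
    mad = median(abs(r - med) for r in readings)
    if mad == 0:
        return []  # all typical readings identical; this simple rule cannot judge
    return [i for i, r in enumerate(readings) if abs(r - med) > k * mad]


# Hypothetical cold-storage temperatures (deg C); index 5 looks like a sensor fault.
readings = [4.1, 4.0, 4.2, 3.9, 4.1, -40.0, 4.0]
suspects = flag_suspect(readings)
print(suspects)  # [5]
```

Flagged readings can then be checked or discarded before they trigger an action that affects the inventory.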
Conclusion
Class Activities
1. Using your phone camera, walk around the campus and take a picture of anything you think data could be captured from.
2. Present your findings in class. The slide presentation should state which data categories your finding falls into and what challenges its analysis would face.
