Multi-resolution Data Communication in Wireless Sensor Networks
1. Multi-resolution Data Communication in Wireless Sensor Networks
Frieder Ganz, Payam Barnaghi, Francois Carrez
Centre for Communication Systems Research (CCSR)
University of Surrey
Guildford, United Kingdom
Seoul, Korea, March 2014
3. Wireless Sensor Networks (WSN)
[Figure: WSN architecture. Labels: End-user; Core network (e.g. Internet); Gateway; Sink node; Computer services]
- The networks typically run on low-power devices.
- Each node consists of one or more sensors, possibly of different types (or actuators).
5. Data Processing
[Figure: data collection and processing within the networks. Labels: WSN; Gateway; Network-enabled devices; Network services/storage and processing units]
6. Data aggregation and reduction methods
− The Symbolic Aggregate Approximation (SAX) is a widely used
dimensionality reduction mechanism for time-series data.
− However, not all time series are alike: they come from a variety of
different application domains. SAX was first developed for static
databases; in this work we extend it for use in sensor
domain applications.
− SAX consists of two steps:
− the aggregation phase, using Piecewise Aggregate Approximation
(PAA) and
− the discretisation of the aggregated data.
− This work limits the extension to the PAA phase.
7. Data aggregation and reduction methods
1. SAX first applies z-normalisation (left: original data in blue,
normalised data in green).
2. It then reduces the data to a vector of smaller length
by taking the mean of each window (left below: mean
values).
3. Finally, it discretises the data, based on the Gaussian
distribution, into SAX words represented as strings
according to the quartiles of the data (right below).
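The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the four-letter alphabet and the use of equal-probability Gaussian breakpoints (which are the quartiles for four letters) are assumptions made for the example.

```python
from statistics import NormalDist, mean, pstdev


def z_normalise(series):
    """Step 1: shift to zero mean and scale to unit standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series has no spread; map it to all zeros.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Step 2: Piecewise Aggregate Approximation, the mean of each window.

    Assumes 1 <= n_segments <= len(series) so that no window is empty.
    """
    n = len(series)
    return [
        mean(series[i * n // n_segments:(i + 1) * n // n_segments])
        for i in range(n_segments)
    ]


def sax_word(series, n_segments, alphabet="abcd"):
    """Step 3: map each window mean to a letter via Gaussian breakpoints."""
    k = len(alphabet)
    # Breakpoints split the standard normal into k equally likely regions.
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    word = ""
    for value in paa(z_normalise(series), n_segments):
        index = sum(value > b for b in breakpoints)
        word += alphabet[index]
    return word


print(sax_word([1, 2, 3, 4, 5, 6, 7, 8], 4))
```

A steadily rising series of eight samples reduced to four windows yields one letter per window, ordered from the lowest to the highest region.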
11. Multi Resolution Data Communication
− A variable granularity selection is required that selects
the right window length based on the data activity.
− How can data activity be measured and quantified?
− To measure the activity in the data, we pre-selected four
statistical measures that can give insight into it:
the variance (as a measure of variability), the
maximum, the minimum and the mean.
− Each of these has advantages and disadvantages that
can lead to different interpretations.
12. Multi Granularity
− Using SAX we can define different window/string sizes;
but which is the best choice?
[Figure: windows W1, W2, W3, … of sizes m1, m2, m3, …]
13. Window Selection
− Maximum:
− An upper bound of the historical data is identified. If the observed
data in the current frame is close to or higher than the maximum m,
the data is sent at high granularity.
− However, this method is only useful for
data with interesting outliers whose magnitude is higher
than a certain threshold; for example, it could be applied to
presence data, where presence can be identified from local
maxima.
− Minimum:
− Selecting m based on the minimum has the same applications
as choosing the maximum value discussed above;
− however, it applies where a higher granularity should be
achieved for small values.
14. Window Selection
− Mean:
− Taking the average to select the granularity will result in a higher
granularity for data values that are stationary around a certain
value. This reduces the granularity in cases where there are
many outliers.
− Variance:
− The variability measure defines how far values are spread out.
This can be used to create a higher granularity in values that are
more distant to the mean of the data.
− This includes the features of the min and max approaches.
However, it does not favour values that are around the mean.
− In this work, we assume that values away from the mean are
more interesting and that those values should be represented with a
higher granularity than data that is close to the mean.
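As an illustration of variance-based selection, the sketch below maps the variance of the current frame to a window length. The threshold values, the candidate window lengths and the function name `select_window` are hypothetical; the slides do not give concrete values, only that thresholds come from existing samples.

```python
from statistics import pvariance


def select_window(frame, thresholds=(0.5, 2.0), windows=(16, 8, 4)):
    """Pick a window length for one frame from its variance.

    thresholds: ascending variance thresholds (hypothetical values).
    windows: candidate window lengths, longest first; one more entry
             than thresholds. Higher variance selects a shorter window,
             i.e. a higher granularity.
    """
    variance = pvariance(frame)
    # Count how many thresholds the frame's variance exceeds.
    level = sum(variance > t for t in thresholds)
    return windows[level]


quiet = [1.0] * 16       # no activity: variance 0
busy = [0.0, 4.0] * 8    # strong oscillation: variance 4
print(select_window(quiet), select_window(busy))
```

A quiet frame is summarised with the longest window, while an active frame gets the shortest one and is therefore sent in more detail.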
15. Multi Resolution Data Communication
− Which method suits sensor data?
− To select a method we compare the similarity of the original and
reconstructed dataset by using Pearson correlation and also
compare the size of the original and reconstructed datasets.
− By choosing the variance as the selection method, the
dataset is reduced by 36% with a correlation coefficient of
0.94.
− For the mean: 27% and 0.95;
− For the maximum: 0.68% and 0.92;
− And for the minimum: 29% and 0.99.
− Reduction and reconstruction strongly depend on the
underlying dataset.
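The two evaluation criteria can be written down directly. The sketch below computes the Pearson correlation between an original and a reconstructed series and the relative size reduction; the function names and the way size is counted (number of transmitted values) are assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def size_reduction(original_count, transmitted_count):
    """Fraction of values saved, e.g. 0.36 for a 36% reduction."""
    return 1 - transmitted_count / original_count


original = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
reconstructed = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5]  # window means repeated
print(pearson(original, reconstructed), size_reduction(6, 3))
```

A high correlation together with a large reduction indicates that the selected window lengths preserve the shape of the data while sending fewer values.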
17. Deciding on the window length
− How to represent the different window lengths?
− To reconstruct the data, the window length of each segment
has to be known, as there is no longer a constant window length.
Therefore we introduce a multi-resolution message
that reflects the different window lengths.
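One possible shape for such a message is sketched below. The layout (a list of segments, each carrying its window length, its sample count and its window means) is purely hypothetical, since the slides do not specify the message format; it only shows why the window length must travel with the data.

```python
from statistics import mean


def encode_frame(frame, window):
    """Summarise one frame as window means plus the metadata to undo it."""
    means = [mean(frame[i:i + window]) for i in range(0, len(frame), window)]
    return {"window": window, "length": len(frame), "means": means}


def encode(frames, windows):
    """Build a multi-resolution message: one segment per frame."""
    return [encode_frame(f, w) for f, w in zip(frames, windows)]


def reconstruct(message):
    """Expand every mean back to its window length (step-wise signal)."""
    samples = []
    for segment in message:
        expanded = []
        for value in segment["means"]:
            expanded.extend([value] * segment["window"])
        # The last window of a frame may be shorter than the others.
        samples.extend(expanded[:segment["length"]])
    return samples


frames = [[1, 1, 3, 3], [0, 2, 4, 6]]
message = encode(frames, windows=[2, 1])  # calm frame coarse, busy frame fine
print(message)
print(reconstruct(message))
```

Without the per-segment window length, the receiver could not tell how many samples each mean stands for.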
18. Implementation results
− We run our method on a data set consisting of 55000 samples.
− Based on the variance a different window size is chosen as shown
below:
20. Conclusions
− We use a SAX based technique to reduce the size of data
communication from WSN nodes to the gateways.
− The method uses a variance function and variable set of window
sizes.
− For data with higher activity, smaller window sizes are chosen
(assuming the SAX pattern size is fixed).
− For data with less activity, a larger window size is chosen.
− The initial thresholds are defined by processing a set of existing
samples.
− We have presented evaluation results, in terms of size and
correlation, on a sample streaming sensor data set.
− Limitations and future work:
− Changing the size of SAX patterns (variable string size)
− Adjusting the thresholds over time
− Deciding on the number and size of the windows based on the
characteristics of the data.