As graduate coursework, I practiced Raspberry Pi programming and the use of Amazon Web Services. The DynamoDB, IoT, EC2, and SES services were used in this project.
The project was to build a sound detection device, using Kalman Filter and Moving Average methods for analysis.
Internet of Things Application: Soundsense
Soundsense: An IoT Application using Raspberry Pi and Amazon Web
Services
(Peter) Donghyeok Shin, Cameron Sherr
University of the District of Columbia: CSIT Department
CS 538 Physical Computing: Fall 2016
Abstract— Soundsense is a detection device which measures
volume levels continuously using a Raspberry Pi and Amazon
services, and alerts the user when it detects an abnormal
change in volume. Equipped with a simple microphone
connected through the GPIO ports, the device sat stationary
in a specific location and recorded the surrounding sound.
A Kalman Filter was used to smooth extreme changes, and the
measured data was sent to DynamoDB through Amazon IoT.
Alongside data recording, an EC2 instance ran an analysis
application that constantly read DynamoDB items to look for
sudden changes. All data from DynamoDB was graphed
automatically through Plot.ly, a real-time charting
interface. Using moving-average-based evaluation, abnormal
volume changes could be detected, and the responsible user
was alerted through Amazon SES. For future improvement,
different statistical analyses can be used, and a
neural-network-based approach can be considered to minimize
the effort of finding the best Kalman Filter configuration. (Shin)
Index Terms— Internet of Things, Raspberry Pi, Kalman Fil-
ter, Moving Average, Sound detection, DynamoDB, Streaming
Data
I. INTRODUCTION
This report describes our experience building a Raspberry
Pi based IoT device, using it to measure sound volume, and
reacting when the volume changes in an extreme way.
Under the title Soundsense, our application used a Raspberry
Pi, a sound sensor, and several Amazon services:
Internet of Things [1], DynamoDB [2], EC2 [3], and SES [4].
Additionally, the Plot.ly graphing service [5] was used for
real-time charting for monitoring purposes. Even though we
used only one device, the intended scope of usage was a
massive deployment of up to about 1,000 devices running
simultaneously. Many monitoring instances were also
envisioned, requiring high-velocity, high-volume writes to
and reads from the database.
As in many measurement scenarios, noisy data had to be
considered, and it was smoothed using a Kalman Filter [6]
in the conventional way. As usual, finding the optimal R and
Q values required many trials in this project as well.
The program for measuring sound and transmitting data to
Amazon IoT and DynamoDB was written in Python and ran on
the Soundsense Raspberry Pi computer. The other program,
which read the collected data in DynamoDB for analysis and
sent an email if an anomaly was detected, was also written
in Python and ran on an EC2 instance.
As the simplest method, Simple Moving Average [7] based
evaluation was used to determine whether newly collected
sound data was an extreme change that should trigger an
alert to the responsible administrator-like user. In our own
experiment, we confirmed that the alert email was sent
successfully when a detection happened.
The moving average approach was useful for distinguishing
between normal ambient sound and an unexpected sudden rise
in noise. However, more sophisticated statistical evaluation
approaches could be used to improve the project, along with
an advanced neural-network-based way to find optimal R and
Q values without excessive trials. (Shin, Sherr)
II. RELATED WORK
1) Glass Break Detectors: These systems are pre-programmed
to listen for a certain frequency of sound (one matching
the brand of glass near them being shattered). The system
featured in this project works on a similar principle,
constantly listening for an anomaly in sound and reporting
it the moment it is detected. Our system might improve upon
this by providing an opportunity to expand the alarm,
potentially allowing it to be more than a one-trick pony;
for example, the system could be tuned to detect multiple
sound-based anomalies. Shattered glass would of course
remain a feature, but initially we might tune it to detect
volumes and frequencies such as a car being damaged or
rammed into (e.g. crunched or distorted metal), or someone
pounding against glass (thus providing the option for the
alarm to sound before the glass is actually shattered) [8].
2) Access-Based Systems: This is the model followed by many
home security systems. Motion-based sensors are placed at
various entryways throughout the home; when motion is
detected at an entryway, a signal is sent to the security
system, which checks whether it is valid (based on whether
the system is currently armed). Traditionally, when a sensor
is set off while the system is set to ‘armed’, it provides
a brief grace period for someone to set it to ‘unarmed’
(via a password input, an app, or a key fob). Including the
sound system might add a bit more functionality to these
existing systems. The addition of a sound detector might
provide homeowners with an additional layer of relief while
they are away from home: monitoring whether rowdy children
might be throwing a house party, whether house-sitters are
actually doing their duties, and (as stated earlier)
providing an additional layer of security to detect
potential intruders [9]. (Durant, Osamor: Initially, it was
understood that this was a joint project that could share
the same contents. While this section was written by members
of another course, their intention was to provide material
that is more relevant to this course’s report.)
III. EXPERIMENT
The volume data was collected using a Raspberry Pi and a
volume sensor module connected through the GPIO interface.
A Python program measured the sound, applied a Kalman
Filter [6], and transmitted both the actual and the filtered
volume data to Amazon IoT over the MQTT protocol. For record
keeping, a database created on Amazon DynamoDB received and
wrote formatted items from Amazon IoT. An analyzer
application residing on Amazon EC2 read items from DynamoDB
to drive a real-time streaming chart using the Plot.ly
service [5]. While reading and visualizing, the application
used a recent range of data to evaluate whether the latest
measurement was unexpected and should be considered an
abnormal volume. When the evaluation detected an unusual
spike in volume, it used Amazon SES to send an email
notification to the administrator-like personnel
(Figure 1). (Shin)
A. Imagined Scale of Application
This project imagined the use of a massive number of
Soundsense devices. We considered more than 1,000 devices,
each sending messages at an interval of about 0.05 second
at the slowest, i.e. at least 20 messages per second or
1,200 per minute per device. Since the devices were meant
to operate continuously for hours and days, we expected to
receive at least 1,200 * 1,000 = 1,200,000 messages per
minute (about 20,000 per second) from all Soundsense
devices.
In addition, the analysis application was imagined to run
as multiple instances, to accommodate multiple users with
administrative roles. The analysis application ran at an
interval of about 1.0 second at the slowest to drive the
real-time streaming chart that was to be monitored. While
doing so, it also evaluated the data to detect unusual
volume spikes. (Shin)
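The load estimate can be checked with a few lines of arithmetic. This is only a sketch of the calculation: the device count and send interval come from the text, while the function and constant names are illustrative.

```python
# Figures taken from the text: 1,000 devices, one message every 0.05 s.
DEVICE_COUNT = 1_000
SEND_INTERVAL_SECONDS = 0.05


def message_rates(device_count, interval_seconds):
    """Return (per-second, per-minute) message totals across all devices."""
    per_device_per_second = round(1 / interval_seconds)
    per_second = per_device_per_second * device_count
    return per_second, per_second * 60


per_second, per_minute = message_rates(DEVICE_COUNT, SEND_INTERVAL_SECONDS)
print(per_second, per_minute)  # prints: 20000 1200000
```

At a 0.05-second interval each device sends 20 messages per second, so 1,000 devices produce 20,000 messages per second, or 1,200,000 per minute.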
B. Programming
1) Outlier (Anomaly) Detection: Our collected volume
measurements are time-series data. Using Kalman Filtering,
collected data could be graphed like EXCERPTED: Figure 2.
Once we collected our own data, the actual volume graph
looked like Figure 3.
What we were looking for as an outlier or anomaly was a
suddenly appearing peak like t2 in the example EXCERPTED:
Figure 2. A similar spike can also be found at the right
side of Figure 3.
As the simplest method, we thought moving-average-based
evaluation could be sufficient [7]. Of course, this moving
average has to be recalculated continuously while the
Soundsense device is stationed in its operating environment.
At every analysis interval, about N hours’ worth of volume
data can be traced back and collected. From this
collection, the maximum, minimum, average, median, and
deviation can be computed. From these statistics, the
interquartile distribution [10] can be graphed as in
EXCERPTED: Figure 4.
Using the minimum and the maximum of the interquartile
range of the distribution as thresholds, an anomaly can be
detected when a volume measurement is outside of the
range. Such inference can be done regularly to build a
series of boxplots [11] for evaluating the next volume
measurements. For example, at each interval, when a new
volume is measured, the current boxplot constructed from
the previous N amount of data can be used to evaluate
whether the new measurement is an outlier. A measurement
inside the interquartile range of the boxplot of N amount
of data is defined as Normal, and one outside that range as
an Anomaly. Since the average, minimum, maximum, and median
of a one-dimensional volume array can be calculated
quickly, we could use a sliding dataset like EXCERPTED:
Figure 5 at each measuring interval instead of disconnected
datasets. (Shin)
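The sliding-window evaluation described above can be sketched as follows. This is not the project's actual analyzer code. The function names, the window size, and the multiplier k are assumptions made for illustration: k = 0 uses the quartiles Q1 and Q3 themselves as thresholds, while k = 1.5 gives the usual boxplot whiskers.

```python
from collections import deque
from statistics import quantiles


def iqr_bounds(values, k=0.0):
    """Return (low, high) thresholds derived from the quartiles of `values`.

    k=0.0 uses the interquartile range itself (Q1..Q3); k=1.5 gives the
    usual boxplot whiskers. The choice of k is an assumption.
    """
    q1, _median, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def detect_anomalies(measurements, window_size=20, k=1.5):
    """Yield (index, value) for readings outside the previous window's bounds."""
    window = deque(maxlen=window_size)  # sliding dataset of the last N readings
    for index, value in enumerate(measurements):
        if len(window) == window_size:
            low, high = iqr_bounds(window, k)
            if value < low or value > high:
                yield index, value
        window.append(value)


# Made-up readings: steady values around 145, then one sudden drop to 45
readings = [145, 146, 144, 147, 145, 146, 144, 145, 146, 147] * 2 + [45]
print(list(detect_anomalies(readings)))  # prints: [(20, 45)]
```

With k = 0 the thresholds are strict and ordinary readings in the outer quartiles would also be flagged, so a nonzero k is likely the more practical choice.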
2) Using Kalman Filtering: In our project, Kalman Filtering
[6] was used to smooth noisy measurements, so that the time
frame in which an outlier volume value occurred could be
identified correctly.
Initially in our project, R (estimated measurement
variance) and Q (process variance) had the values suggested
by Greg Welch and Gary Bishop [6]. For the initial values,
the paper recommended R = (0.1)^2 = 0.01, because this is
the true measurement error variance and could provide the
best performance in balancing responsiveness and estimate
variance. The paper also recommended Q = 1e-5, because it
could provide more flexibility in tuning while looking for
the best R and Q constants.
Throughout our research, R and Q were calibrated by
comparing graphs of the actual and the filtered volume.
This empirical comparison was suggested by Scott Lobdell
[12]. His tutorial was chosen for its relevance to sensor
measurement, which was very similar to what our project was
about. The tutorial also provided a quality snippet for
implementing a Kalman Filter in Python. (Shin)
# R: estimate of measurement variance, change to see effect
estimated_measurement_variance = 0.1 ** 2
# Q: process variance
process_variance = 1e-5


def filter_volume(measurements):
    """Yield the filtered volume for each reading from the sound sensor.

    `measurements` is any iterable of volume readings; filtering stops
    at the first negative reading.
    """
    # Initialization
    volume_filtered = 0.0         # X: filtered estimate
    current_error_estimate = 1.0  # P: estimation error covariance
    for volume_actual in measurements:
        if volume_actual < 0.0:
            break
        # Reset with previous values
        previous_volume = volume_filtered                                   # X'
        previous_error_estimate = current_error_estimate + process_variance  # P'
        # Apply filtering
        kalman_gain = previous_error_estimate / (
            previous_error_estimate + estimated_measurement_variance)
        # Both variables will have new values
        volume_filtered = previous_volume + kalman_gain * (
            volume_actual - previous_volume)
        current_error_estimate = (1 - kalman_gain) * previous_error_estimate
        # USE: volume_filtered
        yield volume_filtered

[Fig. 1 diagram. Soundsense IoT Application: Raspberry Pi with GPIO Volume Module and Amazon IoT SDK, publishing over MQTT to the Amazon IoT Service, which writes items to Amazon DynamoDB. Soundsense Analyzer Application: Amazon EC2 with the Amazon DynamoDB SDK (read item), analysis for volume spikes, the Plot.ly SDK (streaming the actual and filtered volume data line chart), and the Amazon SES SDK (enqueue notification to Amazon SES, which sends the email).]
Fig. 1. Describing main components and actions for Soundsense IoT Application and Analysis
Fig. 2. EXCERPTED: Sample Contextual Outlier [17]
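The empirical calibration of R can be illustrated with a small comparison, complementing the listing above. This is a sketch only: the readings are synthetic, the roughness measure (mean absolute step between consecutive filtered values) is an assumption standing in for the visual graph comparison, and the function names are illustrative.

```python
def kalman_smooth(measurements, r=0.1 ** 2, q=1e-5,
                  initial_estimate=0.0, initial_error=1.0):
    """Apply the scalar Kalman update to a sequence; return the filtered values."""
    estimate, error = initial_estimate, initial_error
    filtered = []
    for measurement in measurements:
        prior_error = error + q                    # P' = P + Q
        gain = prior_error / (prior_error + r)     # K = P' / (P' + R)
        estimate = estimate + gain * (measurement - estimate)
        error = (1 - gain) * prior_error
        filtered.append(estimate)
    return filtered


def roughness(values):
    """Mean absolute step between consecutive values; lower means smoother."""
    steps = [abs(b - a) for a, b in zip(values, values[1:])]
    return sum(steps) / len(steps)


# Synthetic readings around 145 (made-up data, not from the project)
noisy = [145 + offset for offset in (3, -2, 4, -3, 1, -4, 2, -1, 3, -3)] * 5
for r in (0.01, 0.1, 1.0):
    smoothed = kalman_smooth(noisy, r=r, initial_estimate=noisy[0])
    print(f"R={r}: roughness {roughness(smoothed):.3f}")
```

A larger R makes the filter trust each new measurement less, so the filtered line becomes smoother but reacts more slowly to a genuine spike. This is the trade-off being tuned when comparing the actual and filtered graphs.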
C. Equipment
Our goal was to create a small, easily transportable device
that can collect sound data from any surrounding
environment. Figure 6 shows the device we built, following
the tutorial on the Sunfounder website [13].
Fig. 3. Case of detecting a sudden spike, even while there was already some level of sound
The GPIO-connectable Sound Sensor (EXCERPTED: Figure 7)
actually produces its measurement as analog data. Because
of this, an additional PCF8591 Analog-to-Digital Converter
had to be used, according to the Sunfounder tutorial [13].
The device was (virtually) deployed to a location with an
identifier and ran constantly to measure volume data. While
connected to the internet, the device could transmit
high-velocity data tuples consisting of identifier,
timestamp, actual volume, and (Kalman) filtered volume to
Amazon IoT (Figure 1). (Shin, Sherr)
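The data tuple described above could be serialized as a JSON message before being handed to an MQTT client. The sketch below covers only the payload. The field names and the device identifier are hypothetical, since the report does not give the actual item format.

```python
import json
import time


def build_payload(device_id, volume_actual, volume_filtered, timestamp=None):
    """Serialize one measurement tuple as a JSON string.

    Field names are illustrative assumptions, not the project's schema.
    """
    if timestamp is None:
        timestamp = time.time()
    return json.dumps({
        "device_id": device_id,
        "timestamp": timestamp,
        "volume_actual": volume_actual,
        "volume_filtered": volume_filtered,
    })


message = build_payload("soundsense-001", 142, 144.7, timestamp=1481800000.0)
print(message)
```

Keeping both the actual and the filtered volume in the same item lets the analyzer chart the two lines side by side without recomputing the filter.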
D. Result
Once data had been collected into DynamoDB, a real-time
chart of the volume data could be generated as in
Figure 3, using a separate analysis application that read
the database constantly.
When the level of sound was high, the sensor reported a low
integer value, which could be as low as 40 to 50. A value
between 140 and 150 was considered a normal sound
situation. In keeping with the purpose of the Soundsense
device, no alert was raised until an extreme change in
volume was detected. From left to right in Figure 3, there
were some spikes with values going down to about 80 to 100.
However, this level could be considered normal, since the
measurements were taken during normal human talking, such
as giving the presentation for this project. Because moving
average methods were used, if the volume stayed at a
consistent level, the situation was interpreted as normal.
For example, if Soundsense were deployed in a restaurant
setting, it should not react to every guest conversation,
nor to music being played. Rather than being compared with
total silence, as long as the sound stayed at a steady
level, Soundsense could remain unresponsive. But if there
were screaming, fire, or an explosion, a noticeable change
in sound volume would be inevitable, and it would be
measured as an extreme change, as depicted at the right end
of Figure 3. Such a suddenly appearing extreme change was
difficult to miss, as the moving-average-based evaluation
clearly recognized that the new measurement was outside its
current average and minimum thresholds of volume values.
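Because louder sound produced lower sensor values, a spike appears as a sudden drop below the recent window. A minimal sketch of such a check follows. The class name, window size, and margin are assumptions for illustration, not the thresholds used in the experiment.

```python
from collections import deque


class SpikeDetector:
    """Flag readings that fall well below the minimum of the recent window."""

    def __init__(self, window_size=60, margin=20):
        self.window = deque(maxlen=window_size)
        self.margin = margin

    def update(self, reading):
        """Record `reading`; return True if it is an extreme drop (loud spike)."""
        is_spike = False
        if len(self.window) == self.window.maxlen:
            # Lower sensor values mean louder sound, so compare with the minimum
            threshold = min(self.window) - self.margin
            is_spike = reading < threshold
        self.window.append(reading)
        return is_spike


detector = SpikeDetector(window_size=5, margin=20)
# Quiet readings, then talking (90 to 95), then a sudden loud event (45)
results = [detector.update(value) for value in (145, 146, 144, 90, 95, 92, 45)]
print(results)
```

In this run the talking-level readings around 90 become part of the window, so only the final drop to 45 is reported. This mirrors how a steady level of sound is treated as normal.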
When a detection happened, the responsible
administrator-like user received an automatically sent
alert email containing the device identifier and the time
at which the detection occurred.
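The alert content can be sketched as a small helper that assembles the message. The report does not show the actual SES call, so this is an assumption: the dictionary follows the keyword arguments of the boto3 SES `send_email` operation, and the addresses, identifier, and function name are placeholders.

```python
def build_alert_email(device_id, detected_at, sender, recipient):
    """Return keyword arguments shaped for boto3's SES `send_email` call."""
    return {
        "Source": sender,
        "Destination": {"ToAddresses": [recipient]},
        "Message": {
            "Subject": {"Data": f"Soundsense alert: {device_id}"},
            "Body": {
                "Text": {
                    "Data": (
                        f"Unusual volume change detected by {device_id} "
                        f"at {detected_at}."
                    )
                }
            },
        },
    }


params = build_alert_email(
    "soundsense-001", "2016-12-15T10:30:00Z",
    "alerts@example.com", "admin@example.com",
)
# With boto3 installed and SES configured, sending would look like:
#   import boto3
#   boto3.client("ses").send_email(**params)
print(params["Message"]["Subject"]["Data"])
```

Separating message construction from the network call keeps the alert content easy to check without sending real email.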
IV. FUTURE WORKS
A. Other Statistical Approaches
While Simple Moving Average was used in this project,
there are other statistical methods in the Moving Average
category that could be tested [7].
Simple Moving Average may have some drawbacks, such as
sudden fluctuation of the average value when a small sample
of measurements is too extreme, affecting the resulting
average. Even though our project expected to find anomalies
that are far from the normal average, we tested different
shifting windows to minimize the occurrence of extreme
changes in the calculated average.
Cumulative Moving Average could be useful for minimizing
such sudden fluctuation, since it combines the previously
calculated average with the new measurement. It could also
minimize data scanning once the first average has been
calculated.
Since we expected that various thresholds may be needed for
finding an anomaly over the course of a day, a Weighted
Moving Average approach could be useful. For example,
instead of using the same criteria, different weights at
different times of the day could be used to determine
whether a new measurement is outside the normal range.
(Shin)
Fig. 4. EXCERPTED: Sample Interquartile [10]
Fig. 5. EXCERPTED: Series of Boxplots [11]
B. Neural Network as replacement for Kalman Filter
The Kalman Filter has been used for smoothing noisy
measurements in many different projects. However, it is
always difficult to find optimal R and Q values early.
Likewise, this project went through many trials to obtain
good R and Q values for charting with the right smoothness.
There have been several attempts to improve the Kalman
Filter, or replace it, using neural networks [14] [15]
[16]. The essential but most difficult process of finding
optimal R and Q values could be done with a neural network
approach, minimizing repetitive work. (Shin)
V. CONCLUSION
In our experiment under a controlled environment, our
Soundsense device did detect unusual changes in volume.
Moreover, such detection was done while distinguishing the
normal level of ambient sound, which should be intentionally
ignored. For the purpose of smoothing out noisy
measurements, a Kalman Filter was used in the conventional
way. However, finding R and Q values was not a trivial
task, and it motivated us to look for better ways. Testing
Amazon’s services was not the topic of this project;
however, the standardized infrastructure was highly useful
for saving precious time.
Fig. 6. Soundsense Raspberry Pi device using Sound Sensor
Fig. 7. EXCERPTED: Sound Sensor [13]
REFERENCES
[1] AWS IoT Platform - Amazon Web Services, Amazon Web Services,
Inc. [Online]. Available: https://aws.amazon.com/iot-platform/. [Ac-
cessed: 15-Dec-2016].
[2] Amazon DynamoDB NoSQL Cloud Database Ser-
vice, Amazon Web Services, Inc. [Online]. Available:
https://aws.amazon.com/dynamodb/. [Accessed: 15-Dec-2016].
[3] Elastic Compute Cloud (EC2) Cloud Server & Hosting AWS, Amazon
Web Services, Inc. [Online]. Available: https://aws.amazon.com/ec2/.
[Accessed: 15-Dec-2016].
[4] AWS — Amazon Simple Email Service (SES) - Cloud Based
Email Services, Amazon Web Services, Inc. [Online]. Available:
https://aws.amazon.com/ses/. [Accessed: 15-Dec-2016].
[5] plotly, Python Graphing Library, Plotly. [Online]. Available:
https://plot.ly/python/. [Accessed: 15-Dec-2016].
[6] Welch, G., & Bishop, G. (2006). An Introduction to the Kalman Filter.
In Practice, 7(1), 1–16. https://doi.org/10.1.1.117.6808
[7] Moving average, Wikipedia. [Online]. Available:
https://en.wikipedia.org/wiki/Moving_average. [Accessed: 15-Dec-
2016].
[8] How Motion Sensors Work with a Security System,
theHomeSecurityAdvisercom, May-2016. [Online]. Available:
http://thehomesecurityadviser.com/how-motion-sensors-work-with-a-
security-system/. [Accessed: 13-Dec-2016].
[9] C. Harrelson, Audio Verification Equals More Ap-
prehensions, Intrusion RSS, 2013. [Online]. Available:
http://www.securitysales.com/article/audio-verification-equals-more-
apprehensions. [Accessed: 13-Dec-2016].
[10] Interquartile range, Wikipedia. [Online]. Available:
https://en.wikipedia.org/wiki/Interquartile_range. [Accessed: 15-
Dec-2016].
[11] Time-series boxplot in pandas, python - Time-series
boxplot in pandas - Stack Overflow. [Online]. Available:
http://stackoverflow.com/questions/26507404/time-series-boxplot-
in-pandas. [Accessed: 15-Dec-2016].
[12] Kalman Filtering in Python for Reading Sensor Input -
Scott Lobdell, Scott Lobdell, 2014. [Online]. Available:
http://scottlobdell.me/2014/08/kalman-filtering-python-reading-
sensor-input/. [Accessed: 15-Dec-2016].
[13] Lesson 19 Sound Sensor, Lesson 19 Sound Sensor. [Online].
Available: https://www.sunfounder.com/learn/sensor-kit-v2-0-for-
raspberry-pi-b-plus/lesson-19-sound-sensor-sensor-kit-v2-0-for-b-
plus.html. [Accessed: 13-Dec-2016].
[14] Belhajem, I., Maissa, Y. Ben, & Tamtaoui, A. (2016). A hybrid low
cost approach using Extended Kalman Filter and Neural Networks for
real time positioning. https://doi.org/10.1109/IT4OD.2016.7479298
[15] Deb, A. K. (2016). Estimation of States of a Nonlinear Plant using
Dynamic Neural Network and Kalman Filter, 497–502.
[16] Xu, L., & Xu, H. Y. (2009). Performance evaluation of innova-
tive enterprises based on Neural network-Kalman filter model. 2009
International Conference on Management Science and Engineer-
ing - 16th Annual Conference Proceedings, ICMSE 2009, 450–455.
https://doi.org/10.1109/ICMSE.2009.5317386
[17] Outlier, Wikipedia. [Online]. Available:
https://en.wikipedia.org/wiki/Outlier. [Accessed: 15-Dec-2016].