Indoor Localization using Wi-Fi Fingerprinting
IoT Analytics Report based on Deep Analytics and Visualization with R
IoT Analytics (The Internet of Things)
From Connection to Benefit
• 01 Device Connection and Connectivity: IoT Devices, IoT Connectivity, Embedded Intelligence
• 02 Data Sensing and Collecting: Capture Data, Sensors and Tags, Storage
• 03 Data Transport and Access: Focus on Access; Networks, Cloud, Edge; Data Transport
• 04 Data Analytics: Big Data Analysis, AI and Cognitive, Analysis at the Edge
• 05 Data Value (defined by action): Analysis to Action, APIs and Processes, Actionable Intelligence
• 06 Human Value (apps and experiences): Smart Applications, Stakeholder Benefits, Tangible Benefits
Introduction
Our client intends to develop an Indoor Positioning System (IPS) to be
deployed on large industrial campuses, in shopping malls, and similar sites,
to help people navigate a complex, unfamiliar interior space without getting
lost. While GPS works fairly reliably outdoors thanks to the GPS sensors
built into mobile devices, it generally does not work indoors because the
GPS signal is lost. The increasing demand for indoor location-based
services has made indoor positioning a significant research topic. Indoor
localization studies have grown rapidly during the last decade, and WLAN
fingerprinting (also known as Wi-Fi fingerprinting) is the basis for many
indoor localization approaches. This is mainly due to the proliferation of
both wireless local area networks (WLANs) and mobile devices. WLANs can
now be found almost anywhere, and mobile phones have become an
indispensable part of daily life, so we can reasonably assume that the user
is at the same location as the mobile device.
Wi-Fi fingerprinting uses the Wireless Access Points (WAPs) detected within
the building and their Received Signal Strength Indicator (RSSI) values to
determine a person's physical location, analogously to how GPS uses
satellite signals. This report evaluates multiple machine learning models
to see which produces the best result, so that we can recommend a model for
the client to incorporate into the smartphone app they will develop to
determine a person's location in indoor spaces.
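The fingerprinting idea described in the introduction can be illustrated with a minimal sketch: a new RSSI reading is compared against stored reference fingerprints, and the label of the closest matches wins. This is an illustrative nearest-neighbour matcher in Python, not the models evaluated in this report. The function names, the room labels, the sample readings and the FLOOR_DBM substitute value are all assumptions made for the example; only the sentinel value 100 for "not detected" comes from the dataset description.

```python
import math
from collections import Counter

NOT_DETECTED = 100   # sentinel used in the dataset for "WAP not detected"
FLOOR_DBM = -105     # assumed substitute, just below the weakest recorded signal


def normalise(fingerprint):
    """Replace the 'not detected' sentinel with a very weak signal value."""
    return [FLOOR_DBM if v == NOT_DETECTED else v for v in fingerprint]


def predict_location(reference, query, k=3):
    """Return the majority label among the k reference fingerprints
    closest (Euclidean distance) to the query fingerprint."""
    q = normalise(query)
    scored = sorted((math.dist(normalise(fp), q), label) for fp, label in reference)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical reference fingerprints: one RSSI value per WAP, plus a label.
reference = [
    ([-40, -70, 100], "room_A"),
    ([-42, -68, 100], "room_A"),
    ([-80, -45, -60], "room_B"),
    ([-78, -50, -62], "room_B"),
]

print(predict_location(reference, [-41, -69, 100]))   # closest to the room_A readings
```

Replacing the sentinel before computing distances matters: left as +100, an undetected WAP would look like an extremely strong signal and distort every distance.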
Analysis Goal
• To identify a person's physical position in a multi-building indoor facility using Wi-Fi fingerprinting.
• To evaluate candidate models and identify one that is sufficiently accurate for the client.
Data Description
For the feasibility study requested by the client, the large UJIIndoorLoc database was used to compare different models.
• The database, created in 2013, covers three buildings of Universitat Jaume I with 4 or 5 floors and almost 110,000 m². It was collected by more than 20 different users with 25 Android devices.
• The database consists of 19937 training/reference records (trainingData.csv file).
• A total of 933 different places (reference points) appear in the database.
• The number of different wireless access points (WAPs) appearing in the database is 520.
• WAP intensity values are negative integers ranging from -104 dBm (extremely poor signal) to 0 dBm. The positive value 100 denotes that a WAP was not detected.
• The coordinates (latitude, longitude, floor) and Building ID are provided as the attributes to be predicted.
• The database also records the particular space (offices, labs, etc.) and the relative position (inside/outside the space) where the capture was taken. Outside means that the capture was taken in front of the door of the space.
• Timestamp records when the Wi-Fi capture was taken, in UNIX time format.
• No missing data.
• No unique ID to identify a location.
• 232 WAPs with no signal.
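The "WAPs with no signal" observation above can be checked mechanically: a WAP carries no information when every record holds the "not detected" value 100 for it. The following Python sketch shows one way to find and drop such columns. The helper names and the tiny sample matrix are assumptions for illustration; the report itself performed this step in R.

```python
NOT_DETECTED = 100  # value the dataset uses when a WAP was not detected


def undetected_waps(rows):
    """Return the column indices of WAPs that were never detected in any record."""
    n_cols = len(rows[0])
    return [j for j in range(n_cols) if all(r[j] == NOT_DETECTED for r in rows)]


def drop_columns(rows, to_drop):
    """Return a copy of the records without the given column indices."""
    drop = set(to_drop)
    return [[v for j, v in enumerate(r) if j not in drop] for r in rows]


# Hypothetical sample: 3 records x 4 WAPs (RSSI in dBm, 100 = not detected).
rows = [
    [-60, 100, -85, 100],
    [-62, 100, 100, 100],
    [100, 100, -90, 100],
]

dead = undetected_waps(rows)        # columns 1 and 3 are never detected
cleaned = drop_columns(rows, dead)  # keeps only informative WAP columns
print(dead, cleaned)
```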
Process Framework
• Create a unique identifier for each location.
• Sample the data (Building 2, Floor 2) to reduce the data size, for predicting the target class (i.e. location).
• Compare models.
Data Preparation
To evaluate machine learning techniques for locating a person indoors using fingerprinting, the following steps were applied to the database.
• Reduced the size of the database because of the limited processor speed and memory available. In a production setting, better computing facilities will be available.
• Restricted the database sample to the 2nd floor of Building 2.
• Converted TimeStamp from UNIX time to calendar dates and times using the as.POSIXct function.
• Assigned a unique identifier, MasterId, by concatenating the attributes Building Id, Floor, Space Id and Relative Position (inside or outside).
• Converted MasterId to a factor so that it can be used as the class label when building models.
• Removed the redundant attributes Longitude, Latitude, Building Id, Floor, Space Id and Relative Position, which are already encoded in MasterId, to avoid overfitting.
• The sample database of Building 2 Floor 2 consists of 1577 observations with 524 attributes.
• Partitioned the sample database into 80% training and 20% testing sets using the createDataPartition function in R. The training set consists of 1277 observations and the testing set of 300 observations.
• A total of 523 predictors and 73 classes appear in the training database of the 2nd floor of Building 2.
• No missing data.
• Created and evaluated models for determining location.
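The preparation steps above can be sketched in a few lines. The report used R (as.POSIXct and createDataPartition); the Python below is a simplified stand-in, not a reproduction of those functions. The underscore separator in MasterId, the UTC time zone, the fixed seed and the per-class rounding up in the split are all assumptions made for this illustration.

```python
import math
import random
from collections import defaultdict
from datetime import datetime, timezone


def master_id(building_id, floor, space_id, relative_position):
    """Concatenate the location attributes into one class label.
    The '_' separator is an assumption; the report only says 'concatenating'."""
    return f"{building_id}_{floor}_{space_id}_{relative_position}"


def to_datetime(unix_seconds):
    """Convert a UNIX timestamp to a calendar date and time (UTC assumed)."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


def stratified_split(labels, p=0.8, seed=42):
    """Split record indices into train/test so that each class contributes
    roughly a fraction p of its records to the training set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train = []
    for indices in by_class.values():
        n_train = math.ceil(p * len(indices))  # round up within each class
        train.extend(rng.sample(indices, n_train))
    train_set = set(train)
    test = [i for i in range(len(labels)) if i not in train_set]
    return sorted(train), test


# Hypothetical labels: two locations with 10 and 5 captures.
labels = [master_id(2, 2, 101, 1)] * 10 + [master_id(2, 2, 102, 2)] * 5
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting within each class keeps every location represented in the training set. If the share is rounded up per class, the training set ends up slightly larger than 80% when there are many small classes, which would be consistent with the 1277 of 1577 observations reported above.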
Model Algorithms to determine location
• Random Forest (mtry = 523)
  Testing Set: Accuracy 0.9400000, Kappa 0.9390037
  Bldg 2 Floor 2: Accuracy 0.9885859, Kappa 0.9883962
• K-Nearest Neighbor (knn) (k = 5)
  Testing Set: Accuracy 0.7266667, Kappa 0.7219931
  Bldg 2 Floor 2: Accuracy 0.8319594, Kappa 0.8291356
• CART
  Testing Set: Accuracy 0.2833333, Kappa 0.2689976
  Bldg 2 Floor 2: Accuracy 0.3367153, Kappa 0.3231254
• K-Nearest Neighbor (kknn) (kmax = 5, distance = 2, kernel = optimal)
  Testing Set: Accuracy 0.6333333, Kappa 0.6273208
  Bldg 2 Floor 2: Accuracy 0.9302473, Kappa 0.9290903
• C5.0 Decision Trees (trials = 20, model = rules, winnow = FALSE)
  Testing Set: Accuracy 0.8733333, Kappa 0.8712694
  Bldg 2 Floor 2: Accuracy 0.9759036, Kappa 0.9759044
• Conditional Inference Trees (ctree) (mincriterion = 0.01)
  Testing Set: Accuracy 0.1500000, Kappa 0.1289992
  Bldg 2 Floor 2: Accuracy 0.1838935, Kappa 0.1634314
• Bagged CART
  Testing Set: Accuracy 0.9363602, Kappa 0.9330285
  Bldg 2 Floor 2: Accuracy 0.9879518, Kappa 0.9877516
Both the Random Forest and Bagged CART models perform very well, with Random Forest performing slightly better. With mtry = 523, every predictor is a candidate at each split, so this Random Forest behaves much like bagged trees, which is consistent with the near-identical results.
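The two metrics reported for each model can be computed directly from the actual and predicted class labels. Accuracy is the share of correct predictions; Cohen's kappa corrects that share for the agreement expected by chance given the class frequencies. The sketch below is a plain Python illustration with hypothetical labels, not the R code used to produce the figures above.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Share of records whose predicted class equals the actual class."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def cohen_kappa(actual, predicted):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(actual)
    p_observed = accuracy(actual, predicted)
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Chance agreement: probability that both label the same class independently.
    p_chance = sum(actual_counts[c] * predicted_counts[c] for c in actual_counts) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical example with two locations.
actual = ["room_A", "room_A", "room_B", "room_B"]
predicted = ["room_A", "room_A", "room_B", "room_A"]

print(accuracy(actual, predicted))     # 3 of 4 correct
print(cohen_kappa(actual, predicted))  # agreement beyond chance
```

With 73 classes of similar size, chance agreement is small, which is why kappa sits only slightly below accuracy for every model in this report.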
Model Snapshots
Model performance after removal of WAPs with no signal: no notable change.
Model Comparison
                              Testing Set             Building 2 Floor 2
Model Name                    Accuracy    Kappa       Accuracy    Kappa
C5.0                          0.873333    0.871269    0.975904    0.975904
k-Nearest Neighbor (knn)      0.726667    0.721993    0.831959    0.829136
Random Forest                 0.940000    0.939004    0.988586    0.988396
CART                          0.283333    0.268998    0.336715    0.323125
Bagged CART                   0.936360    0.933029    0.987952    0.987752
CTree                         0.150000    0.128999    0.183894    0.163431
kknn                          0.633333    0.627321    0.930247    0.929090
Model Comparison Plot
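The comparison can also be read programmatically. The short Python sketch below ranks the models by testing-set accuracy and computes the gap between the two reported accuracy columns, using only the figures from the table above. The variable names are illustrative, and the gap is offered as a rough indicator only, since the report does not state exactly which records the "Building 2 Floor 2" column covers.

```python
# (testing-set accuracy, Building 2 Floor 2 accuracy) taken from the comparison table
results = {
    "C5.0": (0.873333, 0.975904),
    "knn": (0.726667, 0.831959),
    "Random Forest": (0.940000, 0.988586),
    "CART": (0.283333, 0.336715),
    "Bagged CART": (0.936360, 0.987952),
    "CTree": (0.150000, 0.183894),
    "kknn": (0.633333, 0.930247),
}

# Rank models from best to worst on the held-out testing set.
ranked = sorted(results, key=lambda m: results[m][0], reverse=True)

# Difference between the two columns; a large gap suggests the model
# generalises less well than its second-column score implies.
gaps = {m: round(full - test, 6) for m, (test, full) in results.items()}

for name in ranked:
    test_acc, full_acc = results[name]
    print(f"{name:15s} test={test_acc:.4f} gap={gaps[name]:.4f}")
```

On these figures Random Forest and Bagged CART lead the ranking, while kknn shows the widest gap between the two columns.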
Conclusion
The location of a user or a device is meaningful and significant information for many applications.
• Wi-Fi indoor positioning can successfully distinguish different rooms/areas.
• Both the Random Forest and Bagged CART models provided location information with above 93% accuracy on the testing set, a good result.
• The approach provides only room-level localization. A fingerprint-based system of this kind does not provide geographic coordinates; it provides symbolic identifiers such as the number or name of a room.
• The Wi-Fi fingerprinting approach is cost effective because it uses existing infrastructure.
• The model treats each location as a distinct class based on building, floor and room information. A predicted location can be mapped using pre-built information about that location.
• However, the received signal strength (RSS) used to fingerprint a location is strongly affected by absorption by humans and by reflection off walls.
• The model cannot provide the direction in which a person is moving. Some providers determine location by triangulation, while others combine both approaches for positioning.
• Devices running iOS 4.3 and higher do not support client-based positioning via Wi-Fi, so relying exclusively on Wi-Fi would exclude a lot of devices. Using Wi-Fi as the only positioning technology makes sense only when the operator supplies its own (Android) devices, for example visitor guides in museums or smartphones for employees. With server-based positioning via Wi-Fi it is possible to detect all devices, but special hardware such as infsoft Locator Nodes, Cisco MSE, Meraki or Xirrus is required.