Editorial Board Members
Joaquim Filipe
Polytechnic Institute of Setúbal, Setúbal, Portugal
Ashish Ghosh
Indian Statistical Institute, Kolkata, India
Raquel Oliveira Prates
Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
Lizhu Zhou
Tsinghua University, Beijing, China
More information about this series at https://link.springer.com/bookseries/7899
Ana I. Pereira · Florbela P. Fernandes ·
João P. Coelho · João P. Teixeira ·
Maria F. Pacheco · Paulo Alves ·
Rui P. Lopes (Eds.)
Optimization, Learning
Algorithms and Applications
First International Conference, OL2A 2021
Bragança, Portugal, July 19–21, 2021
Revised Selected Papers
Editors
Ana I. Pereira
Instituto Politécnico de Bragança
Bragança, Portugal
João P. Coelho
Instituto Politécnico de Bragança
Bragança, Portugal
Maria F. Pacheco
Instituto Politécnico de Bragança
Bragança, Portugal
Rui P. Lopes
Instituto Politécnico de Bragança
Bragança, Portugal
Florbela P. Fernandes
Instituto Politécnico de Bragança
Bragança, Portugal
João P. Teixeira
Instituto Politécnico de Bragança
Bragança, Portugal
Paulo Alves
Instituto Politécnico de Bragança
Bragança, Portugal
ISSN 1865-0929 ISSN 1865-0937 (electronic)
Communications in Computer and Information Science
ISBN 978-3-030-91884-2 ISBN 978-3-030-91885-9 (eBook)
https://doi.org/10.1007/978-3-030-91885-9
© Springer Nature Switzerland AG 2021
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are
believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors
give a warranty, expressed or implied, with respect to the material contained herein or for any errors or
omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
The volume CCIS 1488 contains the refereed proceedings of the International
Conference on Optimization, Learning Algorithms and Applications (OL2A 2021), an
event that, due to the COVID-19 pandemic, was held online.
OL2A 2021 provided a space for the research community on optimization and
learning to get together and share the latest developments, trends, and techniques
as well as develop new paths and collaborations. OL2A 2021 had more than 400
participants in an online environment throughout the three days of the conference
(July 19–21, 2021), discussing topics associated with areas such as optimization
and learning and state-of-the-art applications related to multi-objective optimization,
optimization for machine learning, robotics, health informatics, data analysis,
optimization and learning under uncertainty, and the Fourth Industrial Revolution.
Four special sessions were organized under the following topics: Trends in
Engineering Education, Optimization in Control Systems Design, Data Visualization
and Virtual Reality, and Measurements with the Internet of Things. The event had 52
accepted papers, among which 39 were full papers. All papers were carefully reviewed
and selected from 134 submissions. All the reviews were carefully carried out by a
Scientific Committee of 61 PhD researchers from 18 countries.
July 2021 Ana I. Pereira
Organization
General Chair
Ana Isabel Pereira Polytechnic Institute of Bragança, Portugal
Organizing Committee Chairs
Florbela P. Fernandes Polytechnic Institute of Bragança, Portugal
João Paulo Coelho Polytechnic Institute of Bragança, Portugal
João Paulo Teixeira Polytechnic Institute of Bragança, Portugal
M. Fátima Pacheco Polytechnic Institute of Bragança, Portugal
Paulo Alves Polytechnic Institute of Bragança, Portugal
Rui Pedro Lopes Polytechnic Institute of Bragança, Portugal
Scientific Committee
Ana Maria A. C. Rocha University of Minho, Portugal
Ana Paula Teixeira University of Trás-os-Montes and Alto Douro, Portugal
André Pinz Borges Federal University of Technology – Paraná, Brazil
Andrej Košir University of Ljubljana, Slovenia
Arnaldo Cândido Júnior Federal University of Technology – Paraná, Brazil
Bruno Bispo Federal University of Santa Catarina, Brazil
Carmen Galé University of Zaragoza, Spain
B. Rajesh Kanna Vellore Institute of Technology, India
C. Sweetlin Hemalatha Vellore Institute of Technology, India
Damir Vrančić Jozef Stefan Institute, Slovenia
Daiva Petkeviciute Kaunas University of Technology, Lithuania
Diamantino Silva Freitas University of Porto, Portugal
Esteban Clua Federal Fluminense University, Brazil
Eric Rogers University of Southampton, UK
Felipe Nascimento Martins Hanze University of Applied Sciences,
The Netherlands
Gaukhar Muratova Dulaty University, Kazakhstan
Gediminas Daukšys Kauno Technikos Kolegija, Lithuania
Glaucia Maria Bressan Federal University of Technology – Paraná, Brazil
Humberto Rocha University of Coimbra, Portugal
José Boaventura-Cunha University of Trás-os-Montes and Alto Douro, Portugal
José Lima Polytechnic Institute of Bragança, Portugal
Joseane Pontes Federal University of Technology – Ponta Grossa,
Brazil
Juani Lopéz Redondo University of Almeria, Spain
Jorge Ribeiro Polytechnic Institute of Viana do Castelo, Portugal
José Ramos NOVA University Lisbon, Portugal
Kristina Sutiene Kaunas University of Technology, Lithuania
Lidia Sánchez University of León, Spain
Lino Costa University of Minho, Portugal
Luís Coelho Polytechnic Institute of Porto, Portugal
Luca Spalazzi Marche Polytechnic University, Italy
Manuel Castejón Limas University of León, Spain
Marc Jungers Université de Lorraine, France
Maria do Rosário de Pinho University of Porto, Portugal
Marco Aurélio Wehrmeister Federal University of Technology – Paraná, Brazil
Mikulas Huba Slovak University of Technology in Bratislava,
Slovakia
Michał Podpora Opole University of Technology, Poland
Miguel Ángel Prada University of León, Spain
Nicolae Cleju Technical University of Iasi, Romania
Paulo Lopes dos Santos University of Porto, Portugal
Paulo Moura Oliveira University of Trás-os-Montes and Alto Douro, Portugal
Pavel Pakshin Nizhny Novgorod State Technical University, Russia
Pedro Luiz de Paula Filho Federal University of Technology – Paraná, Brazil
Pedro Miguel Rodrigues Catholic University of Portugal, Portugal
Pedro Morais Polytechnic Institute of Cávado e Ave, Portugal
Pedro Pinto Polytechnic Institute of Viana do Castelo, Portugal
Rudolf Rabenstein Friedrich-Alexander-University of Erlangen-Nürnberg,
Germany
Sani Rutz da Silva Federal University of Technology – Paraná, Brazil
Sara Paiva Polytechnic Institute of Viana do Castelo, Portugal
Sofia Rodrigues Polytechnic Institute of Viana do Castelo, Portugal
Sławomir Stępień Poznan University of Technology, Poland
Teresa Paula Perdicoulis University of Trás-os-Montes and Alto Douro, Portugal
Toma Roncevic University of Split, Croatia
Vitor Duarte dos Santos NOVA University Lisbon, Portugal
Wojciech Paszke University of Zielona Gora, Poland
Wojciech Giernacki Poznan University of Technology, Poland
Contents
Optimization Theory
Dynamic Response Surface Method Combined with Genetic Algorithm
to Optimize Extraction Process Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Laires A. Lima, Ana I. Pereira, Clara B. Vaz, Olga Ferreira,
Márcio Carocho, and Lillian Barros
Towards a High-Performance Implementation of the MCSFilter
Optimization Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Leonardo Araújo, Maria F. Pacheco, José Rufino,
and Florbela P. Fernandes
On the Performance of the OrthoMads Algorithm on Continuous
and Mixed-Integer Optimization Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Marie-Ange Dahito, Laurent Genest, Alessandro Maddaloni, and José Neto
A Look-Ahead Based Meta-heuristics for Optimizing Continuous
Optimization Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Thomas Nordli and Noureddine Bouhmala
Inverse Optimization for Warehouse Management . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Hannu Rummukainen
Model-Agnostic Multi-objective Approach for the Evolutionary Discovery
of Mathematical Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Alexander Hvatov, Mikhail Maslyaev, Iana S. Polonskaya,
Mikhail Sarafanov, Mark Merezhnikov, and Nikolay O. Nikitin
A Simple Clustering Algorithm Based on Weighted Expected Distances . . . . . . . 86
Ana Maria A. C. Rocha, M. Fernanda P. Costa,
and Edite M. G. P. Fernandes
Optimization of Wind Turbines Placement in Offshore Wind Farms: Wake
Effects Concerns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
José Baptista, Filipe Lima, and Adelaide Cerveira
A Simulation Tool for Optimizing a 3D Spray Painting System . . . . . . . . . . . . . . . 110
João Casanova, José Lima, and Paulo Costa
Optimization of Glottal Onset Peak Detection Algorithm for Accurate
Jitter Measurement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
Joana Fernandes, Pedro Henrique Borghi, Diamantino Silva Freitas,
and João Paulo Teixeira
Searching the Optimal Parameters of a 3D Scanner Through Particle
Swarm Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
João Braun, José Lima, Ana I. Pereira, Cláudia Rocha, and Paulo Costa
Optimal Sizing of a Hybrid Energy System Based on Renewable Energy
Using Evolutionary Optimization Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Yahia Amoura, Ângela P. Ferreira, José Lima, and Ana I. Pereira
Robotics
Human Detector Smart Sensor for Autonomous Disinfection Mobile Robot . . . . 171
Hugo Mendonça, José Lima, Paulo Costa, António Paulo Moreira,
and Filipe Santos
Multiple Mobile Robots Scheduling Based on Simulated Annealing
Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
Diogo Matos, Pedro Costa, José Lima, and António Valente
Multi AGV Industrial Supervisory System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
Ana Cruz, Diogo Matos, José Lima, Paulo Costa, and Pedro Costa
Dual Coulomb Counting Extended Kalman Filter for Battery SOC
Determination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
Arezki A. Chellal, José Lima, José Gonçalves, and Hicham Megnafi
Sensor Fusion for Mobile Robot Localization Using Extended Kalman
Filter, UWB ToF and ArUco Markers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
Sílvia Faria, José Lima, and Paulo Costa
Deep Reinforcement Learning Applied to a Robotic Pick-and-Place
Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
Natanael Magno Gomes, Felipe N. Martins, José Lima,
and Heinrich Wörtche
Measurements with the Internet of Things
An IoT Approach for Animals Tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
Matheus Zorawski, Thadeu Brito, José Castro, João Paulo Castro,
Marina Castro, and José Lima
Optimizing Data Transmission in a Wireless Sensor Network Based
on LoRaWAN Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
Thadeu Brito, Matheus Zorawski, João Mendes,
Beatriz Flamia Azevedo, Ana I. Pereira, José Lima, and Paulo Costa
Indoor Location Estimation Based on Diffused Beacon Network . . . . . . . . . . . . . 294
André Mendes and Miguel Diaz-Cacho
SMACovid-19 – Autonomous Monitoring System for Covid-19 . . . . . . . . . . . . . . 309
Rui Fernandes and José Barbosa
Optimization in Control Systems Design
Economic Burden of Personal Protective Strategies for Dengue Disease:
an Optimal Control Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
Artur M. C. Brito da Cruz and Helena Sofia Rodrigues
ERP Business Speed – A Measuring Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
Zornitsa Yordanova
BELBIC Based Step-Down Controller Design Using PSO . . . . . . . . . . . . . . . . . . . 345
João Paulo Coelho, Manuel Braz-César, and José Gonçalves
Robotic Welding Optimization Using A* Parallel Path Planning . . . . . . . . . . . . . . 357
Tiago Couto, Pedro Costa, Pedro Malaca, Daniel Marques,
and Pedro Tavares
Deep Learning
Leaf-Based Species Recognition Using Convolutional Neural Networks . . . . . . . 367
Willian Oliveira Pires, Ricardo Corso Fernandes Jr.,
Pedro Luiz de Paula Filho, Arnaldo Candido Junior,
and João Paulo Teixeira
Deep Learning Recognition of a Large Number of Pollen Grain Types . . . . . . . . 381
Fernando C. Monteiro, Cristina M. Pinto, and José Rufino
Predicting Canine Hip Dysplasia in X-Ray Images Using Deep Learning . . . . . . 393
Daniel Adorno Gomes, Maria Sofia Alves-Pimenta, Mário Ginja,
and Vitor Filipe
Convergence of the Reinforcement Learning Mechanism Applied
to the Channel Detection Sequence Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401
André Mendes
Approaches to Classify Knee Osteoarthritis Using Biomechanical Data . . . . . . . 417
Tiago Franco, P. R. Henriques, P. Alves, and M. J. Varanda Pereira
Artificial Intelligence Architecture Based on Planar LiDAR Scan Data
to Detect Energy Pylon Structures in a UAV Autonomous Detailed
Inspection Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
Matheus F. Ferraz, Luciano B. Júnior, Aroldo S. K. Komori,
Lucas C. Rech, Guilherme H. T. Schneider, Guido S. Berger,
Álvaro R. Cantieri, José Lima, and Marco A. Wehrmeister
Data Visualization and Virtual Reality
Machine Vision to Empower an Intelligent Personal Assistant for Assembly
Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
Matheus Talacio, Gustavo Funchal, Victória Melo, Luis Piardi,
Marcos Vallim, and Paulo Leitao
Smart River Platform - River Quality Monitoring and Environmental
Awareness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
Kenedy P. Cabanga, Edmilson V. Soares, Lucas C. Viveiros,
Estefânia Gonçalves, Ivone Fachada, José Lima, and Ana I. Pereira
Health Informatics
Analysis of the Middle and Long Latency ERP Components
in Schizophrenia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
Miguel Rocha e Costa, Felipe Teixeira, and João Paulo Teixeira
Feature Selection Optimization for Breast Cancer Diagnosis . . . . . . . . . . . . . . . . . 492
Ana Rita Antunes, Marina A. Matos, Lino A. Costa,
Ana Maria A. C. Rocha, and Ana Cristina Braga
Cluster Analysis for Breast Cancer Patterns Identification . . . . . . . . . . . . . . . . . . . 507
Beatriz Flamia Azevedo, Filipe Alves, Ana Maria A. C. Rocha,
and Ana I. Pereira
Overview of Robotic Based System for Rehabilitation and Healthcare . . . . . . . . . 515
Arezki A. Chellal, José Lima, Florbela P. Fernandes, José Gonçalves,
Maria F. Pacheco, and Fernando C. Monteiro
Understanding Health Care Access in Higher Education Students . . . . . . . . . . . . . 531
Filipe J. A. Vaz, Clara B. Vaz, and Luís C. D. Cadinha
Using Natural Language Processing for Phishing Detection . . . . . . . . . . . . . . . . . . 540
Richard Adolph Aires Jonker, Roshan Poudel, Tiago Pedrosa,
and Rui Pedro Lopes
Data Analysis
A Panel Data Analysis of the Electric Mobility Deployment in the European
Union . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555
Sarah B. Gruetzmacher, Clara B. Vaz, and Ângela P. Ferreira
Data Analysis of Workplace Accidents - A Case Study . . . . . . . . . . . . . . . . . . . . . . 571
Inês P. Sena, João Braun, and Ana I. Pereira
Application of Benford’s Law to the Tourism Demand: The Case
of the Island of Sal, Cape Verde . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587
Gilberto A. Neves, Catarina S. Nunes, and Paula Odete Fernandes
Volunteering Motivations in Humanitarian Logistics: A Case Study
in the Food Bank of Viana do Castelo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 599
Ana Rita Vasconcelos, Ângela Silva, and Helena Sofia Rodrigues
Occupational Behaviour Study in the Retail Sector . . . . . . . . . . . . . . . . . . . . . . . . . 617
Inês P. Sena, Florbela P. Fernandes, Maria F. Pacheco,
Abel A. C. Pires, Jaime P. Maia, and Ana I. Pereira
A Scalable, Real-Time Packet Capturing Solution . . . . . . . . . . . . . . . . . . . . . . . . . . 630
Rafael Oliveira, João P. Almeida, Isabel Praça, Rui Pedro Lopes,
and Tiago Pedrosa
Trends in Engineering Education
Assessing Gamification Effectiveness in Education Through Analytics . . . . . . . . 641
Zornitsa Yordanova
Real Airplane Cockpit Development Applied to Engineering Education:
A Project Based Learning Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649
José Carvalho, André Mendes, Thadeu Brito, and José Lima
Azbot-1C: An Educational Robot Prototype for Learning Mathematical
Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 657
Francisco Pedro, José Cascalho, Paulo Medeiros, Paulo Novo,
Matthias Funk, Alberto Ramos, Armando Mendes, and José Lima
Towards Distance Teaching: A Remote Laboratory Approach for Modbus
and IoT Experiencing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 670
José Carvalho, André Mendes, Thadeu Brito, and José Lima
Evaluation of Soft Skills Through Educational Testbed 4.0 . . . . . . . . . . . . . . . . . . 678
Leonardo Breno Pessoa da Silva, Bernado Perrota Barreto,
Joseane Pontes, Fernanda Tavares Treinta,
Luis Mauricio Martins de Resende, and Rui Tadashi Yoshino
Collaborative Learning Platform Using Learning Optimized Algorithms . . . . . . . 691
Beatriz Flamia Azevedo, Yahia Amoura, Gauhar Kantayeva,
Maria F. Pacheco, Ana I. Pereira, and Florbela P. Fernandes
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 703
Optimization Theory
Dynamic Response Surface Method
Combined with Genetic Algorithm
to Optimize Extraction Process Problem
Laires A. Lima¹,²(✉), Ana I. Pereira¹, Clara B. Vaz¹, Olga Ferreira², Márcio Carocho², and Lillian Barros²

¹ Research Center in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253 Bragança, Portugal
{laireslima,apereira,clvaz}@ipb.pt
² Centro de Investigação de Montanha (CIMO), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253 Bragança, Portugal
{oferreira,mcarocho,lillian}@ipb.pt
Abstract. This study aims to develop an optimization approach that reduces the time and labor required by a given chemical process, which can be decisive for quality management. In this context, this work presents a comparative study of two optimization approaches using real experimental data from the chemical engineering area, reported in a previous study [4]. The first approach is based on the traditional response surface method, and the second combines the response surface method with a genetic algorithm and data mining. The main objective is to optimize the surface function of three variables using hybrid genetic algorithms combined with cluster analysis, in order to reduce the number of experiments and to find the closest value to the optimum within the established restrictions. The proposed strategy has proven to be promising, since the optimal value was reached without resorting to derivatives, unlike conventional methods, and fewer experiments were required to find the optimal solution than in the previous work using the traditional response surface method.
Keywords: Optimization · Genetic algorithm · Cluster analysis
1 Introduction
Search and optimization methods have several principles, being the most rele-
vant: the search space, where the possibilities for solving the problem in question
are considered; the objective function (or cost function); and the codification of
the problem, that is, the way to evaluate an objective in the search space [1].
Conventional optimization techniques start with an initial value or vector
that, iteratively, is manipulated using some heuristic or deterministic process
directly associated with the problem to be solved. The great difficulty when solving a problem with a stochastic method is that the number of possible solutions grows at factorial speed, making it impossible to enumerate all possible solutions of the problem [12]. Evolutionary computing techniques operate on a population that changes in each iteration. Thus, they can search different regions of the feasible space, allocating an appropriate number of members to search in different areas [12].
Considering the importance of predicting the behavior of analytical processes
and avoiding expensive procedures, this study aims to propose an alternative
for the optimization of multivariate problems, e.g. extraction processes of high-
value compounds from plant matrices. In the standard analytical approach, the
identification and quantification of phenolic compounds require expensive and
complex laboratory assays [6]. An alternative approach can be applied using
forecasting models from Response Surface Method (RSM). This approach can
maximize the extraction yield of the target compounds while decreasing the cost
of the extraction process.
In this study, a comparative analysis between two optimization methodologies (traditional RSM and dynamic RSM), developed in MATLAB® software (version R2019a 9.6), that aim to maximize the heat-assisted extraction yield and phenolic compounds content in chestnut flower extracts is presented.
This paper is organized as follows. Section 2 describes the methods used to evaluate multivariate problems involving optimization processes: Response Surface Method (RSM), Hybrid Genetic Algorithm, Cluster Analysis and Bootstrap Analysis. Sections 3 and 4 introduce the case study, consisting of the optimization of the extraction yield and content of phenolic compounds in extracts of chestnut flower by two different approaches: Traditional RSM and Dynamic RSM. Section 5 includes the numerical results obtained by both methods and their comparative evaluation. Finally, Sect. 6 presents the conclusions and future work.
2 Methods Approaches and Techniques
Regarding optimization problems, some methods are used more frequently (traditional RSM, for example) due to their applicability and suitability to different cases. For the design of the dynamic RSM, conventional approaches based on the Genetic Algorithm combined with clustering and bootstrap analysis were examined to evaluate which aspects could be incorporated into the algorithm developed in this work. The key concepts for the dynamic RSM are presented below.
2.1 Response Surface Method
The Response Surface Method is a tool introduced in the early 1950s by Box
and Wilson, which covers a collection of mathematical and statistical techniques
useful for approximating and optimizing stochastic models [11]. It is a widely
used optimization method, which applies statistical techniques based on special
factorial designs [2,3]. Its scientific approach estimates the ideal conditions for
achieving the highest or lowest required response value, through the design of the
response surface from the Taylor series [8]. RSM provides a large amount of information about the experiments, such as the experiment time and the influence of each dependent variable, one of its main advantages being that it supplies the general information needed for planning the process and the experimental design [8].
2.2 Hybrid Genetic Algorithm
The genetic algorithm is a stochastic optimization method based on the evolu-
tionary process of natural selection and genetic dynamics. The method seeks to
combine the survival of the fittest among the string structures with an exchange
of random, but structured, information to form an ideal solution [7]. Although
they are randomized, GA search strategies are able to explore several regions
of the feasible search space at a time. In this way, along with the iterations,
a unique search path is built, as new solutions are obtained through the com-
bination of previous solutions [1]. Optimization problems with restrictions can
influence the sampling capacity of a genetic algorithm due to the population
limits considered. Incorporating a local optimization method into GA can help
overcome most of the obstacles that arise as a result of finite population sizes,
for example, the accumulation of stochastic errors that generate genetic drift
problems [1,7].
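As a minimal illustration of this hybridization idea in MATLAB (not the code used in this study; the objective below is a placeholder and the option values are arbitrary), the built-in ga solver can be coupled with a derivative-free local refinement such as patternsearch:

    f    = @(x) 0.5 * sum(x.^4 - 16*x.^2 + 5*x);        % placeholder multimodal objective (3 variables)
    lb   = -5 * ones(1, 3);
    ub   =  5 * ones(1, 3);
    opts = optimoptions('ga', 'PopulationSize', 50, ...
                        'HybridFcn', @patternsearch);     % derivative-free local refinement of the GA result
    [xbest, fbest] = ga(f, 3, [], [], [], [], lb, ub, [], opts);

The hybrid function runs only after the GA terminates, so the stochastic global exploration and the deterministic local exploitation remain cleanly separated.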
2.3 Cluster Analysis
Cluster algorithms are often used to group large data sets and play an important role in pattern recognition and in mining large arrays. The k-means and k-medoids strategies work by partitioning the data into k mutually exclusive clusters, as demonstrated in Fig. 1. These techniques assign each observation to a cluster, minimizing the distance from the data point to the average (k-means) or median (k-medoids) location of its assigned cluster [10].
Fig. 1. Mean and medoid in 2D space representation. In both figures, the data are represented by blue dots, the rightmost point being an outlier, and the red point represents the centroid point found by the k-means or k-medoids method. Adapted from Jin and Han (2011) (Color figure online).
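For instance, grouping a set of candidate optimal points with MATLAB's Statistics and Machine Learning Toolbox could look as follows (a sketch only; X is an assumed matrix of candidate solutions, not the study's data):

    X = [randn(50,2); randn(50,2) + 6];   % assumed sample of candidate optima in 2D
    k = 3;                                % number of mutually exclusive clusters
    [idxMean, Cmean] = kmeans(X, k);      % centroids minimize squared Euclidean distances
    [idxMed,  Cmed]  = kmedoids(X, k);    % medoids are actual data points, more robust to outliers
    % Cmean(j,:) and Cmed(j,:) give the representative point of cluster j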
2.4 Bootstrap Analysis
The idea of bootstrap analysis is to mimic the sampling distribution of the statistic of interest through the use of many resamples drawn with replacement from the original sample elements [5]. In this work, the bootstrap analysis enables the handling of the variability of the optimal solutions derived from the cluster analysis. Thus, the bootstrap analysis is used to estimate the confidence interval of the statistic of interest and, subsequently, to compare the results with those obtained by traditional methods.
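A minimal sketch of this resampling idea in MATLAB (with illustrative data, not the experimental values) is:

    y = [38.1 42.8 35.9 32.8 42.6 41.0];            % assumed sample of optimal responses
    bmeans = bootstrp(1000, @mean, y);              % 1000 bootstrap estimates of the mean
    ci = bootci(1000, {@mean, y}, 'Alpha', 0.05);   % two-sided 95% confidence interval for the mean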
3 Case Study
This work presents a comparative analysis between two methodologies for optimizing the total phenolic content in extracts of chestnut flower, developed in MATLAB® software. The natural values of the dependent variables in the extraction - t, time in minutes; T, temperature in °C; and S, organic solvent content in %v/v of ethanol - were coded based on the Central Composite Circumscribed Design (CCCD), and the result was based on extraction yield (Y, expressed in percentage of dry extract) and total phenolic content (Phe, expressed in mg/g of dry extract), as shown in Table 1.
The experimental data presented were cordially provided by the Mountain
Research Center - CIMO (Bragança, Portugal) [4].
The CCCD design selected for the original experimental study [4] is based
on a cube circumscribed to a sphere in which the vertices are at α distance from
the center, with 5 levels for each factor (t, T, and S). In this case, the α values
vary between −1.68 and 1.68, and correspond to each factor level, as described
in Table 2.
4 Data Analysis
In this section, the two RSM optimization methods (traditional and dynamic)
will be discussed in detail, along with the results obtained from both methods.
4.1 Traditional RSM
In the original experiment, a five-level Central Composite Circumscribed Design (CCCD) coupled with RSM was built to optimize the variables for the male chestnut flowers. For the optimization, a simplex method developed ad hoc was used to optimize the nonlinear solutions obtained by a regression model in order to maximize the response, as described in the flowchart in Fig. 2.
Through the traditional RSM, the authors approximated the surface response
to a second-order polynomial function [4]:
$$Y = b_0 + \sum_{i=1}^{n} b_i X_i + \sum_{i=1}^{n-1}\,\sum_{\substack{j=2\\ j>i}}^{n} b_{ij} X_i X_j + \sum_{i=1}^{n} b_{ii} X_i^{2} \qquad (1)$$
Table 1. Variables and natural values of the process parameters for the extraction of chestnut flowers [4].

t (min)   T (°C)   S (%EtOH)   Yield (%R)   Phenolic cont. (mg·g⁻¹ dry weight)
 40.30     37.20     20.30       38.12        36.18
 40.30     37.20     79.70       26.73        11.05
 40.30     72.80     20.30       42.83        36.66
 40.30     72.80     79.70       35.94        22.09
 99.70     37.20     20.30       32.77        35.55
 99.70     37.20     79.70       32.99         8.85
 99.70     72.80     20.30       42.55        29.61
 99.70     72.80     79.70       35.52        11.10
120.00     55.00     50.00       42.41        14.56
 20.00     55.00     50.00       35.45        24.08
 70.00     25.00     50.00       38.82        12.64
 70.00     85.00     50.00       42.06        17.41
 70.00     55.00      0.00       35.24        34.58
 70.00     55.00    100.00       15.61        12.01
 20.00     25.00      0.00       22.30        59.56
 20.00     25.00    100.00        8.02        15.57
 20.00     85.00      0.00       34.81        42.49
 20.00     85.00    100.00       18.71        50.93
120.00     25.00      0.00       31.44        40.82
120.00     25.00    100.00       15.33         8.79
120.00     85.00      0.00       34.96        45.61
120.00     85.00    100.00       32.70        21.89
 70.00     55.00     50.00       41.03        14.62
Table 2. Natural and coded values of the extraction variables [4].

Natural variables                  Coded value
t (min)   T (°C)   S (%)
 20.0      25.0      0.0           -1.68
 40.3      37.2     20.0           -1.00
 70        55       50.0            0.00
 99.7      72.8     80.0            1.00
120        85      100.0            1.68
Fig. 2. Flowchart of traditional RSM modeling approach for optimal design.
where, for i = 1, ..., n and j = 1, ..., n, b_0 is the constant term, b_i stand for the linear coefficients, b_ij correspond to the interaction coefficients, and b_ii are the quadratic coefficients; finally, X_i are the independent variables, associated with t, T and S, with n being the total number of variables.
In the previous study using the traditional RSM, Eq. (1) coherently represents the behaviour of the extraction process of the target compounds from chestnut flowers [4]. In order to compare the optimization methods and to avoid data conflicts, the cost function was estimated based on a multivariate regression model.
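A least-squares fit of the full quadratic model in Eq. (1) can be sketched in MATLAB as follows (not the authors' code; X stands for an m-by-3 matrix with the coded t, T, S levels and y for the corresponding measured response from Table 1):

    D = x2fx(X, 'quadratic');               % design matrix: constant, linear, interaction and squared terms
    b = regress(y, D);                      % least-squares estimates of b0, bi, bij and bii in Eq. (1)
    Yhat = @(x) x2fx(x, 'quadratic') * b;   % fitted response surface, evaluable at any coded point x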
4.2 Dynamic RSM
For the proposed optimization method, briefly described in the flowchart shown in Fig. 3, the structure of the design of the experiment was maintained, as well as the restrictions imposed on the responses and variables to avoid awkward solutions.
Fig. 3. Flowchart of dynamic RSM integrating the genetic algorithm and cluster analysis into the process.
The dynamic RSM method was built in MATLAB® using programming code developed by the authors coupled with pre-existing functions from the statistical and optimization toolboxes of the software. The algorithm starts by generating a set of 15 random combinations between the levels of the combinatorial analysis. From this initial experimental data, a multivariate regression model is calculated, and this model becomes the objective function of the problem. Thereafter, a built-in GA-based solver is used to solve the optimization problem. The optimal combination is identified and is then used to redefine the objective function. The process stops when no new optimal solution is identified.
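The loop just described might be sketched as follows (a simplified illustration under the description above, not the authors' implementation; runExperiment stands for the laboratory measurement of a set of coded points, and the tolerance used to decide whether an optimum is new is an assumption):

    levels = [-1.68 -1.00 0 1.00 1.68];            % coded CCCD levels
    X = levels(randi(5, 15, 3));                   % 15 random level combinations of (t, T, S)
    y = runExperiment(X);                          % assumed: measured responses for the initial design
    newPoint = true;
    while newPoint
        b    = regress(y, x2fx(X, 'quadratic'));             % refit the surrogate model of Eq. (1)
        obj  = @(x) -(x2fx(x, 'quadratic') * b);              % GA minimizes, so negate to maximize the response
        xopt = ga(obj, 3, [], [], [], [], ...
                  -1.68*ones(1,3), 1.68*ones(1,3));            % built-in GA-based solver over the coded region
        newPoint = all(sqrt(sum((X - xopt).^2, 2)) > 1e-3);   % assumed tolerance: is this optimum new?
        if newPoint
            X = [X; xopt];                                     % append the new combination ...
            y = [y; runExperiment(xopt)];                      % ... and its measured response
        end
    end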
Considering the stochastic nature of this case study, clustering analysis is
used to identify the best candidate optimal solution. In order to handle the
variability of the achieved optimal solution, the bootstrap method is used to
estimate the confidence interval at 95%.
5 Numerical Results
The study using the traditional RSM returned the following optimal conditions for maximum yield: 120.0 min, 85.0 °C, and 44.5% of ethanol in the solvent, producing 48.87% of dry extract. For the total phenolic content, the optimal conditions were: 20.0 min, 25.0 °C and S = 0.0% of ethanol in the solvent, producing 55.37 mg/g of dry extract. These data are displayed in Table 3.
Table 3. Optimal responses and respective conditions using traditional and dynamic RSM, based on confidence intervals at 95%.

                        Method            t (min)        T (°C)         S (%)          Response
Extraction yield (%)    Traditional RSM   120.0 ± 12.4   85.0 ± 6.7     44.5 ± 9.7     48.87
                        Dynamic RSM       118.5 ± 1.4    84.07 ± 0.9    46.1 ± 0.85    45.87
Total phenolics (Phe)   Traditional RSM    20.0 ± 3.7    25.0 ± 5.7      0.0 ± 8.7     55.37
                        Dynamic RSM        20.4 ± 1.5    25.1 ± 1.97     0.05 ± 0.05   55.64
For the implementation of the dynamic RSM in this case study, 100 runs were carried out to evaluate the effectiveness of the method. For the yield, the estimated optimal conditions were: 118.5 min, 84.1 °C, and 46.1% of ethanol in the solvent, producing 45.87% of dry extract. In this case, the obtained optimal conditions for time and temperature were in accordance with approximately 80% of the tests.
For the total phenolic content, the optimal conditions were: 20.4 min, 25.1 °C, and 0.05% of ethanol in the solvent, producing 55.64 mg/g of dry extract.
The results were very similar to the previous report with the same data [4].
The clustering analysis for each response variable was performed considering the means (Figs. 4a and 5a) and the medoids (Figs. 4b and 5b) of the output population (optimal responses). The bootstrap analysis makes the inference concerning the results achieved, which are represented graphically in terms of means in Figs. 4c and 5c, and in terms of medoids in Figs. 4d and 5d.
The box plots of the groups of optimal responses from the dynamic RSM, displayed in Fig. 6, show that the variance within each group is small, given that the differences between the sets of responses are very narrow. The histograms concerning the set of dynamic RSM responses and the bootstrap distribution of the mean (1000 resamples) are shown in Figs. 7 and 8.
Fig. 4. Clustering analysis of the outputs from the extraction yield optimization using dynamic RSM; the responses are clustered in 3 distinct groups. Panels: (a) extraction yield responses and k-means; (b) extraction yield responses and k-medoids; (c) extraction yield bootstrap output and k-means; (d) extraction yield bootstrap output and k-medoids.
Fig. 5. Clustering analysis of the outputs from the total phenolic content optimization using dynamic RSM; the responses are clustered in 3 distinct groups. Panels: (a) total phenolic content responses and k-means; (b) total phenolic content responses and k-medoids; (c) total phenolic content bootstrap output and k-means; (d) total phenolic content bootstrap output and k-medoids.
Fig. 6. Box plot of the dynamic RSM outputs for the extraction yield and total phenolic
content before bootstrap analysis, respectively.
Fig. 7. Histograms of the extraction data (extraction yield) and the bootstrap means,
respectively.
Fig. 8. Histograms of the extraction data (total phenolic content) and the bootstrap
means, respectively.
The results obtained in this work are satisfactory, since they were analogous for both methods, although the dynamic RSM took 15 to 18 experimental points to find the optimal coordinates. Some authors use designs of experiments involving traditional RSM containing 20 different combinations, including the repetition of the centroid [4]. However, in studies involving recent data or the absence of complementary data, evaluations of the influence of the parameters and their ranges are essential to obtain consistent results, making it necessary to run about 30 experimental points for the optimization. Considering these cases, the dynamic RSM method proposes a different, competitive, and economical approach, in which fewer points are evaluated to obtain the maximum response.
Genetic algorithms have been proving their efficiency in the search for optimal solutions in a wide variety of problems, given that they do not have some of the limitations found in traditional search methodologies, such as the requirement of a derivative function [9]. GA is attractive for identifying the global solution of the problem. Considering the stochastic problem presented in this work, the association of the genetic algorithm with the k-methods as clustering algorithms obtained satisfactory results. This solution can be used for problems involving small-scale data, since GA manages to gather the best data for optimization through its evolutionary method, while k-means or k-medoids perform the grouping of the optimum points.
In addition to the clustering analysis, bootstrapping was also applied, in which the sampling distribution of the statistic of interest is simulated through the use of many resamples drawn with replacement from the original sample, thus enabling statistical inference. Bootstrapping was used to calculate the confidence intervals in order to obtain unbiased estimates from the proposed method. In this case, the confidence interval was calculated at the 95% level (two-tailed), since the same percentage was adopted by Caleja et al. (2019). It was observed that the dynamic RSM approach also enables the estimation of confidence intervals with a smaller margin of error than the traditional RSM approach, leading to a more precise definition of the optimum conditions for the experiment.
6 Conclusion and Future Work
For the presented case study, applying the dynamic RSM using the Genetic Algorithm coupled with clustering analysis returned positive results, in accordance with previously published data [4]. Both methods seem attractive for the resolution of this particular case concerning the optimization of the extraction of target compounds from plant matrices. Therefore, the smaller number of experiments required by the dynamic RSM can make it an interesting approach for future studies. In brief, a smaller set of points was obtained that represents the best optimization domain, thus eliminating the need for a large number of costly laboratory experiments. The next steps involve the improvement of the dynamic RSM algorithm and the application of the proposed method in other areas of study.
Acknowledgments. The authors are grateful to FCT for financial support through
national funds FCT/MCTES UIDB/00690/2020 to CIMO and UIDB/05757/2020.
M. Carocho also thanks FCT through the individual scientific employment program-
contract (CEECIND/00831/2018).
References
1. Beasley, D., Bull, D.R., Martin, R.R.: An overview of genetic algorithms: Part 1,
fundamentals. Univ. Comput. 2(15), 1–16 (1993)
2. Box, G.E.P., Behnken, D.W.: Simplex-sum designs: a class of second order rotatable
designs derivable from those of first order. Ann. Math. Stat. 31(4), 838–864 (1960)
3. Box, G.E.P., Wilson, K.B.: On the experimental attainment of optimum conditions.
J. Roy. Stat. Soc. Ser. B (Methodol.) 13(1), 1–38 (1951)
4. Caleja C., Barros L., Prieto M. A., Bento A., Oliveira M.B.P., Ferreira, I.C.F.R.:
Development of a natural preservative obtained from male chestnut flowers: opti-
mization of a heat-assisted extraction technique. In: Food and Function, vol. 10,
pp. 1352–1363 (2019)
5. Efron, B., Tibshirani, R.J.: An introduction to the Bootstrap, 1st edn. Wiley, New
York (1994)
6. Eftekhari, M., Yadollahi, A., Ahmadi, H., Shojaeiyan, A., Ayyari, M.: Development
of an artificial neural network as a tool for predicting the targeted phenolic profile
of grapevine (Vitis vinifera) foliar wastes. Front. Plant Sci. 9, 837 (2018)
7. El-Mihoub, T.A., Hopgood, A.A., Nolle, L., Battersby, A.: Hybrid genetic algo-
rithms: a review. Eng. Lett. 11, 124–137 (2006)
8. Geiger, E.: Statistical methods for fermentation optimization. In: Vogel H.C.,
Todaro C.M., (eds.) Fermentation and Biochemical Engineering Handbook: Prin-
ciples, Process Design, and Equipment, 3rd edn, pp. 415–422. Elsevier Inc. (2014)
9. Härdle, W.K., Simar, L.: Applied Multivariate Statistical Analysis, 4th edn.
Springer, Heidelberg (2019)
10. Jin, X., Han, J.: K-medoids clustering. In: Sammut, C., Webb, G.I. (eds.) Ency-
clopedia of Machine Learning, pp. 564–565. Springer, Boston (2011)
11. Şenaras, A.E.: Parameter optimization using the surface response technique in
automated guided vehicles. In: Sustainable Engineering Products and Manufac-
turing Technologies, pp. 187–197. Academic Press (2019)
12. Schneider, J., Kirkpatrick, S.: Genetic algorithms and evolution strategies. In:
Stochastic Optimization, vol. 1, pp. 157–168, Springer-Verlag, Heidelberg (2006)
Towards a High-Performance
Implementation of the MCSFilter
Optimization Algorithm
Leonardo Araújo¹,², Maria F. Pacheco², José Rufino², and Florbela P. Fernandes²(✉)

¹ Universidade Tecnológica Federal do Paraná, Campus de Ponta Grossa, Ponta Grossa 84017-220, Brazil
² Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, 5300-252 Bragança, Portugal
a46677@alunos.ipb.pt, {pacheco,rufino,fflor}@ipb.pt
Abstract. Multistart Coordinate Search Filter (MCSFilter) is an optimization method suitable for finding all minimizers - both local and global - of a nonconvex problem with simple bounds or more generic constraints. Like many other optimization algorithms, it may be used in industrial contexts, where execution time may be critical in order to keep a production process within safe and expected bounds. MCSFilter was first implemented in MATLAB and later in Java (which introduced a significant performance gain). In this work, a comparison is made between these two implementations and a novel one in C that aims at further performance improvements. For the comparison, the problems addressed are bound constrained, with small dimension (between 2 and 10) and multiple local and global solutions. It is possible to conclude that the average execution time for each problem is considerably smaller when using the Java and C implementations, and that the current C implementation, though not yet fully optimized, already exhibits a significant speedup.
Keywords: Optimization · MCSFilter method · MATLAB · C · Java · Performance
1 Introduction
The set of techniques and principles for solving quantitative problems known as optimization has become increasingly important in a broad range of applications in areas of research as diverse as engineering, biology, economics, statistics or physics. The application of the techniques and laws of optimization in these (and other) areas not only provides resources to describe and solve the specific problems that appear within the framework of each area, but also provides the opportunity for new advances and achievements in optimization theory and its techniques [1,2,6,7].
In order to apply optimization techniques, problems can be formulated in terms of an objective function that is to be maximized or minimized, a set of variables, and a set of constraints (restrictions on the values that the variables can assume). The structure of these three items - objective function, variables and constraints - determines different subfields of optimization theory: linear, integer, stochastic, etc.; within each of these subfields, several lines of research can be pursued. The size and complexity of optimization problems that can be dealt with have increased enormously with the improvement of the overall performance of computers. As such, advances in optimization techniques have been following progress in computer science as well as in combinatorics, operations research, control theory, approximation theory, routing in telecommunication networks, image reconstruction and facility location, among other areas [14].
The need to keep up with the challenges of our rapidly changing society and
its digitalization means a continuing need to increase innovation and productiv-
ity and improve the performance of the industry sector, and places very high
expectations for the progress and adaptation of sophisticated optimization tech-
niques that are applied in the industrial context. Many of those problems can be
modelled as nonlinear programming problems [10–12] or mixed integer nonlinear
programming problems [5,8]. The urgency to quickly output solutions to difficult
multivariable problems, leads to an increasing need to develop robust and fast
optimization algorithms. Considering that, for many problems, reliable informa-
tion about the derivative of the objective function is unavailable, it is important
to use a method that allows to solve the problem without this information.
Algorithms that do not use derivatives are called derivative-free. The MCSFilter method is such a method, being able to deal with discontinuous or non-differentiable functions that often appear in many applications. It is also a multilocal method, meaning that it finds all the minimizers, both local and global, and it exhibits good results [9,10]. Moreover, a Java implementation was already used to solve process engineering problems [4]. Considering that, from an industrial point of view, execution time is of utmost importance, a novel C reimplementation, aimed at increased performance, is currently under way, having reached a stage at which it is already able to solve a broad set of problems with measurable performance gains over the previous Java version. This paper presents the results of a preliminary evaluation of the new C implementation of the MCSFilter method against the previously developed versions (in MATLAB and Java).
rithm is briefly described; in Sect. 3 the set of problems that are used to compare
the three implementations and the corresponding results are presented and ana-
lyzed. Finally, in Sect. 4, conclusions and future work are addressed.
2 The Multistart Coordinate Search Filter Method
The MCSFilter algorithm was initially developed in [10], with the aim of finding
multiple solutions of nonconvex and nonlinear constrained optimization problems
of the following type:
$$\begin{array}{ll}
\min & f(x)\\
\text{subject to} & g_j(x) \le 0, \quad j = 1, \ldots, m\\
& l_i \le x_i \le u_i, \quad i = 1, \ldots, n
\end{array} \qquad (1)$$

where $f$ is the objective function, $g_j(x)$, $j = 1, \ldots, m$, are the constraint functions and at least one of the functions $f, g_j : \mathbb{R}^n \rightarrow \mathbb{R}$ is nonlinear; also, $l$ and $u$ are the bounds and $\Omega = \{x \in \mathbb{R}^n : g(x) \le 0,\; l \le x \le u\}$ is the feasible region.
This method has two main different parts: i) the multistart part, related with
the exploration feature of the method, and ii) the coordinate search filter local
search, related with the exploitation of promising regions.
The MCSFilter method does not require any information about the deriva-
tives and is able to obtain all the solutions, both local and global, of a given non-
convex optimization problem. This is an important asset of the method since,
in industry problems, it is often not possible to know the derivative functions;
moreover, a large number of real-life problems are nonlinear and nonconvex.
As already stated, the MCSFilter algorithm relies on a multistart strategy
and a local search repeatedly called inside the multistart. Briefly, the multistart
strategy is a stochastic algorithm that applies more than once a local search to
sample points aiming to converge to all the minimizers, local and global, of a
multimodal problem. When the local search is repeatedly applied, some of the
minimizers can be reached more than once. This leads to a waste of time since
these minimizers have already been determined. To avoid these situations, a
clustering technique based on computing the regions of attraction of previously
identified minimizers is used. Thus, if the initial point belongs to the region of
attraction of a previously detected minimizer, the local search procedure may
not be performed, since it would converge to this known minimizer.
Figure 1 illustrates the influence of the regions of attraction. The red/magenta
lines between the initial approximation and the minimizer represent a local
search that has been performed; the red line represents the first local search that
converged to a given minimizer; the white dashed line between the two points
represents a discarded local search, using the regions of attraction. Therefore,
this representation intends to show the regions of attraction of each minimizer
and the corresponding points around each one. These regions are dynamic in the
sense that they may change every time a new initial point is used [3].
The local search uses a derivative-free strategy that consists of a coordinate
search combined with a filter methodology in order to generate a sequence of
approximate solutions that improve either the constraint violation or the objec-
tive function relatively to the previous approximation; this strategy is called
Coordinate Search Filter algorithm (CSFilter). In this way, the initial problem
is previously rewritten as a bi-objective problem (2):
min (θ(x), f(x))
x ∈ Ω
(2)
Fig. 1. Illustration of the multistart strategy with regions of attraction [3].
aiming to minimize, simultaneously, the objective function f(x) and the non-negative continuous aggregate constraint violation function θ(x) defined in (3):

$$\theta(x) = \|g(x)^{+}\|^{2} + \|(l - x)^{+}\|^{2} + \|(x - u)^{+}\|^{2} \qquad (3)$$

where $v^{+} = \max\{0, v\}$. For more details about this method see [9,10].
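To make (3) concrete, a direct MATLAB transcription could look as follows (a sketch only, not the implementation being benchmarked; g is assumed to return a vector of constraint values):

    function th = theta(x, g, l, u)
    % Aggregate constraint violation of Eq. (3): squared Euclidean norms of the
    % positive parts of g(x), (l - x) and (x - u), with v+ = max{0, v} componentwise.
    pos = @(v) max(v, 0);
    th  = norm(pos(g(x)))^2 + norm(pos(l - x))^2 + norm(pos(x - u))^2;
    end

For example, theta([3 3], @(x) x(1)+x(2)-5, [-5 -5], [5 5]) returns 1, since only the constraint g is violated (by one unit) while the bounds are satisfied.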
Algorithm 1 displays the steps of the MCSFilter method. The stopping condition of CSFilter is related to the step size α of the method (see condition (4)):

$$\alpha < \alpha_{\min} \qquad (4)$$

with $\alpha_{\min} \ll 1$ and close to zero.
The main steps of the MCSFilter algorithm for finding global (as well as
local) solutions to problem (1) are shown in Algorithm 2.
The stopping condition of the MCSFilter algorithm is related to the number of minimizers found and to the number of local searches applied in the multistart strategy. Considering $n_l$ as the number of local searches used and $n_m$ as the number of minimizers obtained, then $P_{\min} = \dfrac{n_m(n_m + 1)}{n_l(n_l - 1)}$. The MCSFilter algorithm stops when condition (5) is reached:

$$P_{\min} \le \epsilon \qquad (5)$$

where $\epsilon \ll 1$.
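As an illustrative example of this rule (the numbers are not taken from the experiments), suppose $n_l = 50$ local searches have been launched and $n_m = 4$ distinct minimizers have been found: then $P_{\min} = (4 \times 5)/(50 \times 49) \approx 0.0082$, so with $\epsilon = 10^{-2}$ condition (5) holds and the multistart loop terminates; with only $n_l = 20$ local searches, $P_{\min} \approx 0.053$ and the search would continue.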
In this preliminary work, the main goal is to compare the performance of MCSFilter when bound constrained problems are addressed, using different
Algorithm 1. CSFilter algorithm
Require: x and parameter values, αmin; set x̃ = x, x_F^inf = x, z = x̃;
1: Initialize the filter; set α = min{1, 0.05 (Σ_{i=1}^{n} (u_i − l_i))/n};
2: repeat
3:    Compute the trial approximations z_a^i = x̃ + α e_i, for all e_i ∈ D⊕;
4:    repeat
5:       Check acceptability of trial points z_a^i;
6:       if there are some z_a^i acceptable by the filter then
7:          Update the filter;
8:          Choose z_a^best; set z = x̃, x̃ = z_a^best; update x_F^inf if appropriate;
9:       else
10:         Compute the trial approximations z_a^i = x_F^inf + α e_i, for all e_i ∈ D⊕;
11:         Check acceptability of trial points z_a^i;
12:         if there are some z_a^i acceptable by the filter then
13:            Update the filter;
14:            Choose z_a^best; set z = x̃, x̃ = z_a^best; update x_F^inf if appropriate;
15:         else
16:            Set α = α/2;
17:         end if
18:      end if
19:   until a new trial z_a^best is acceptable
20: until α < αmin
implementations of the algorithm: the original implementation in MATLAB [10],
a follow up implementation in Java (already used to solve problems from the
Chemical Engineering area [3,4,13]), and a new implementation in C (evaluated
for the first time in this paper).
3 Computational Results
In order to compare the performance of the three implementations of the MCSFilter optimization algorithm, a set of problems was chosen. The definition of each problem (a total of 15 bound constrained problems) is given below, along with the experimental conditions under which they were evaluated, as well as the obtained results (both numerical and performance-related).
3.1 Benchmark Problems
The collection of problems was taken from [9] (and the references therein) and
all the fifteen problems in study are listed below. The problems were chosen in
such a way that different characteristics were addressed: they are multimodal
problems with more than one minimizer (actually, the number of minimizers
varies from 2 to 1024); they can have just one global minimizer or more than
one global minimizer; the dimension of the problems varies between 2 and 10.
Algorithm 2. MCSFilter algorithm
Require: Parameter values; set M* = ∅, k = 1, t = 1;
1: Randomly generate x ∈ [l, u]; compute Bmin = min_{i=1,...,n}{u_i − l_i};
2: Compute m_1 = CSFilter(x), R_1 = ‖x − m_1‖; set r_1 = 1, M* = M* ∪ m_1;
3: while the stopping rule is not satisfied do
4:    Randomly generate x ∈ [l, u];
5:    Set o = arg min_{j=1,...,k} d_j ≡ ‖x − m_j‖;
6:    if d_o < R_o then
7:       if the direction from x to m_o is ascent then
8:          Set prob = 1;
9:       else
10:         Compute prob = φ(d_o/R_o, r_o);
11:      end if
12:   else
13:      Set prob = 1;
14:   end if
15:   if ζ < prob then
16:      Compute m = CSFilter(x); set t = t + 1;
17:      if ‖m − m_j‖ > γ* Bmin, for all j = 1, ..., k then
18:         Set k = k + 1, m_k = m, r_k = 1, M* = M* ∪ m_k; compute R_k = ‖x − m_k‖;
19:      else
20:         Set R_l = max{R_l, ‖x − m_l‖}; r_l = r_l + 1;
21:      end if
22:   else
23:      Set R_o = max{R_o, ‖x − m_o‖}; r_o = r_o + 1;
24:   end if
25: end while
– Problem (P1)
  min f(x) ≡ (x₂ − (5.1/(4π²)) x₁² + (5/π) x₁ − 6)² + 10 (1 − 1/(8π)) cos(x₁) + 10
  s.t. −5 ≤ x₁ ≤ 10, 0 ≤ x₂ ≤ 15
  • known global minimum: f∗ = 0.39789.
– Problem (P2)
  min f(x) ≡ (4 − 2.1x₁² + x₁⁴/3) x₁² + x₁x₂ − 4 (1 − x₂²) x₂²
  s.t. −2 ≤ xᵢ ≤ 2, i = 1, 2
  • known global minimum: f∗ = −1.03160.
– Problem (P3)
  min f(x) ≡ Σᵢ₌₁ⁿ [sin(xᵢ) + sin(2xᵢ/3)]
  s.t. 3 ≤ xᵢ ≤ 13, i = 1, 2
  • known global minimum: f∗ = −2.4319.
– Problem (P4)
  min f(x) ≡ (1/2) Σᵢ₌₁² (xᵢ⁴ − 16xᵢ² + 5xᵢ)
  s.t. −5 ≤ xᵢ ≤ 5, i = 1, 2
  • known global minimum: f∗ = −78.3323.
– Problem (P5)
  min f(x) ≡ (1/2) Σᵢ₌₁³ (xᵢ⁴ − 16xᵢ² + 5xᵢ)
  s.t. −5 ≤ xᵢ ≤ 5, i = 1, . . . , 3
  • known global minimum: f∗ = −117.4983.
– Problem (P6)
  min f(x) ≡ (1/2) Σᵢ₌₁⁴ (xᵢ⁴ − 16xᵢ² + 5xᵢ)
  s.t. −5 ≤ xᵢ ≤ 5, i = 1, . . . , 4
  • known global minimum: f∗ = −156.665.
– Problem (P7)
  min f(x) ≡ (1/2) Σᵢ₌₁⁵ (xᵢ⁴ − 16xᵢ² + 5xᵢ)
  s.t. −5 ≤ xᵢ ≤ 5, i = 1, . . . , 5
  • known global minimum: f∗ = −195.839.
– Problem (P8)
  min f(x) ≡ (1/2) Σᵢ₌₁⁶ (xᵢ⁴ − 16xᵢ² + 5xᵢ)
  s.t. −5 ≤ xᵢ ≤ 5, i = 1, . . . , 6
  • known global minimum: f∗ = −234.997.
– Problem (P9)
  min f(x) ≡ (1/2) Σᵢ₌₁⁸ (xᵢ⁴ − 16xᵢ² + 5xᵢ)
  s.t. −5 ≤ xᵢ ≤ 5, i = 1, . . . , 8
  • known global minimum: f∗ = −313.3287.
– Problem (P10)
  min f(x) ≡ (1/2) Σᵢ₌₁¹⁰ (xᵢ⁴ − 16xᵢ² + 5xᵢ)
  s.t. −5 ≤ xᵢ ≤ 5, i = 1, . . . , 10
  • known global minimum: f∗ = −391.658.
– Problem (P11)
  min f(x) ≡ (1 + (x₁ + x₂ + 1)² (19 − 14x₁ + 3x₁² − 14x₂ + 6x₁x₂ + 3x₂²))
             × (30 + (2x₁ − 3x₂)² (18 − 32x₁ + 12x₁² + 48x₂ − 36x₁x₂ + 27x₂²))
  s.t. −2 ≤ xᵢ ≤ 2, i = 1, 2
  • known global minimum: f∗ = 3.
– Problem (P12)
  min f(x) ≡ (Σᵢ₌₁⁵ i cos((i + 1)x₁ + i)) × (Σᵢ₌₁⁵ i cos((i + 1)x₂ + i))
  s.t. −10 ≤ xᵢ ≤ 10, i = 1, 2
  • known global minimum: f∗ = −186.731.
– Problem (P13)
  min f(x) ≡ cos(x₁) sin(x₂) − x₁/(x₂² + 1)
  s.t. −1 ≤ x₁ ≤ 2, −1 ≤ x₂ ≤ 1
  • known global minimum: f∗ = −2.0.
– Problem (P14)
  min f(x) ≡ (1.5 − x₁(1 − x₂))² + (2.25 − x₁(1 − x₂²))² + (2.625 − x₁(1 − x₂³))²
  s.t. −4.5 ≤ xᵢ ≤ 4.5, i = 1, 2
  • known global minimum: f∗ = 0.
– Problem (P15)
  min f(x) ≡ 0.25x₁⁴ − 0.5x₁² + 0.1x₁ + 0.5x₂²
  s.t. −2 ≤ xᵢ ≤ 2, i = 1, 2
  • known global minimum: f∗ = −0.352386.
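To give an idea of how compact these objectives are once coded, the listing below is one possible C encoding (not the code evaluated in this paper) of the function family shared by problems P4–P10; evaluating it with every coordinate set to roughly −2.9035, the minimizer of each one-dimensional term, reproduces values close to the minima listed above.

#include <stdio.h>

/* Objective of problems P4-P10: f(x) = 0.5 * sum_i (x_i^4 - 16 x_i^2 + 5 x_i),
 * with -5 <= x_i <= 5.  The dimension n is 2, 3, 4, 5, 6, 8 or 10 in the paper. */
double f_p4_to_p10(const double *x, int n) {
    double s = 0.0;
    for (int i = 0; i < n; i++) {
        double xi = x[i];
        s += xi * xi * xi * xi - 16.0 * xi * xi + 5.0 * xi;
    }
    return 0.5 * s;
}

int main(void) {
    /* each coordinate of the global minimizer is approximately -2.9035 */
    double x[10];
    for (int n = 2; n <= 10; n += 2) {
        for (int i = 0; i < n; i++) x[i] = -2.9035;
        printf("n = %2d  f = %.4f\n", n, f_p4_to_p10(x, n));
    }
    return 0;
}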
3.2 Experimental Conditions
The problems were evaluated on a computational system with the following
relevant characteristics: CPU - 2.3/4.3 GHz 18-core Intel Xeon W-2195, RAM -
32 GB DDR4 2666 MHz ECC, OS - Linux Ubuntu 20.04.2 LTS, MATLAB -
version R2020a, Java - OpenJDK 8, C compiler - gcc version 9.3.0 (-O2 option).
Since the MCSFilter algorithm has a stochastic component, 10 runs were
performed for each problem. The execution time of the first run was ignored,
so the reported execution times are averages over the remaining 9 executions.
All three implementations (MATLAB, Java and C) ran with the same
parameters, namely the ones related to the stopping conditions. For the local
search CSFilter, αmin = 10⁻⁵ was used in condition (4), as in previous works.
For the stopping condition of MCSFilter, ε = 10⁻² was used in condition (5).
3.3 Analysis of the Results
Tables 1, 2 and 3 show the results obtained with each implementation. In all
the tables, the first column (Prob) gives the name of each problem; the second
column (minavg) presents the average number of minimizers found in the 9
executions; the third column (nfavg) gives the average number of objective
function evaluations; the fourth column (tavg) shows the average execution time
(in seconds) over the 9 runs; and the last column (best f∗) shows the best value
achieved for the global minimum. One important feature, visible in the results
of all three implementations, is that the global minimum is always reached, in
all problems.
Table 1. Results obtained using MATLAB.
Prob minavg nfavg tavg(s) best f∗
P1 3 5684,82 0,216 0,398
P2 6 8678,36 0,289 −1,032
P3 4 4265,55 0,178 −2,432
P4 4 4663,27 0,162 −78,332
P5 8 15877,09 0,480 −117,499
P6 16 51534,64 1,438 −156,665
P7 32 145749,64 3,898 −195,831
P8 64 391584,00 10,452 −234,997
P9 256 2646434,55 71,556 −313,329
P10 1023,63 15824590,64 551,614 −391,662
P11 4 39264,73 1,591 3,000
P12 64,36 239005,18 6,731 −186,731
P13 2,45 4653,18 0,160 −2,022
P14 2,27 52964,09 2,157 0,000
P15 2 2374,91 0,135 −0,352
For the MATLAB implementation, looking at the second column of Table 1, it
is possible to state that all the minimizers are found for all the problems except
P10, P12, P13 and P14. Nevertheless, it is important to note that P10 has
1024 minimizers and an average of 1023,63 were found, and that P12 has 760
minimizers and an average of 64,36 were discovered. P13 is the problem where
MCSFilter exhibits its worst behaviour, a behaviour shared by other algorithms.
It is also worth remarking that problem P10 has 10 variables and, given the
structure of the CSFilter algorithm, this leads to a large number of function
evaluations. This, of course, impacts the execution time: problem P10 therefore
has the highest execution time of all the problems, taking an average of
551,614 s per run.
Table 2. Results obtained using JAVA.
Prob minavg nfavg tavg(s) best f∗
P1 3 9412,64 0,003 0,3980
P2 6 13461,82 0,005 −1,0320
P3 4 10118,73 0,006 −2,4320
P4 4 10011,91 0,005 −78,3320
P5 8 32990,73 0,013 −117,4980
P6 16 98368,73 0,038 −156,6650
P7 32 274812,36 0,118 −195,8310
P8 64 730754,82 0,363 −234,9970
P9 256 4701470,36 2,868 −313,3290
P10 1024 27608805,73 20,304 −391,6620
P11 4 59438,18 0,009 3,0000
P12 46,45 99022,91 0,035 −186,7310
P13 2,36 6189,09 0,003 −2,0220
P14 2,54 62806,64 0,019 0,0000
P15 2 5439,18 0,002 −0,3520
Considering now the results produced by the Java implementation (Table 2),
a behaviour similar to that of the MATLAB version can be observed regarding
the best value of the global minimum: it is always achieved, in all runs, and the
known number of minimizers is also found, except in problems P12, P13 and
P14. It is noteworthy that, using Java, all the 1024 minimizers of P10 were
obtained. If the fourth column of Tables 1 and 2 is compared, it is clear that in
Java the algorithm takes considerably less time to obtain the same solutions.
Finally, Table 3 shows the results obtained by the new C-based implementa-
tion. It can be observed that the numerical behaviour of the algorithm is similar
to that observed in the Java version: both implementations find, approximately,
the same number of minimizers. However, comparing the execution times of the
C version with those of the Java version (Table 2), the C version is clearly faster.
Table 3. Results obtained using C.
Prob minavg nfavg tavg(s) best f∗
P1 3 3434,64 0,001 0,3979
P2 6 7809,73 0,002 −1,0316
P3 4 5733,36 0,002 −2,4320
P4 4 6557,54 0,002 −78,3323
P5 8 17093,64 0,007 −117,4985
P6 16 51359,82 0,023 −156,6647
P7 32 162219,63 0,073 −195,8308
P8 64 403745,91 0,198 −234,9970
P9 255,18 2625482,73 1,600 −313,3293
P10 1020,81 15565608,53 14,724 −391,6617
P11 4 8723,34 0,004 3,0000
P12 38,81 43335,73 0,027 −186,7309
P13 2 2345,62 0,001 −2,0218
P14 2,36 24101,18 0,014 0,0000
P15 2 3334,53 0,002 −0,3524
To make the differences in the computational performance of the three MCSFilter
implementations discernible, the execution times of all problems considered
in this study are represented, side by side, in two charts, grouped according to
the order of magnitude of the execution times in the MATLAB version (the
slowest one). The average execution times for problems P1 − P5, P13 and P15
are represented in Fig. 2; in MATLAB, all these problems executed in less than
half a second. The execution times of the remaining problems are represented in
Fig. 3: problems P6 − P8, P11, P12 and P14, which took between ≈1,5 s and
≈10,5 s to execute in MATLAB, are plotted against the primary (left) vertical
axis, while problems P9 and P10, whose execution times in MATLAB were
≈71 s and ≈551 s, respectively, are plotted against the secondary (right) axis.
Fig. 2. Average execution time (s) for problems P1 − P5, P13, P15.
Fig. 3. Average execution time (s) for problems P6 − P12, P14.
A quick overview of Fig. 2 and Fig. 3 is enough to conclude that the new
C implementation is faster than the previous Java implementation, and much
faster than the original MATLAB version of the MCSFilter algorithm (note that
logarithmic scales were used in both figures due to the different orders of
magnitude of the execution times; also, in Fig. 3, different textures were used for
problems P9 and P10 since their execution times are plotted against the right
vertical axis). It is also possible to observe that, in general, the execution times
of the three code bases vary consistently: when changing the optimization
problem, if the execution time increases or decreases in one version, the same
happens in the other versions.
To quantify the performance improvement of one version of the algorithm over
a preceding implementation, the achieved speedups (accelerations) can be
calculated. The speedup of version X of the algorithm against version Y of the
same algorithm is simply given by S(X, Y) = T(Y)/T(X), where T(Y) and T(X)
are the average execution times of the Y and X implementations.
The relevant speedups in the context of this study are presented in Table 4.
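As a quick sanity check of this formula, the snippet below recomputes the S(X, Y) entries for problem P1 directly from the average times of Tables 1–3; since those times are rounded to three decimals, the recomputed ratios only approximate the exact entries of Table 4.

#include <stdio.h>

/* Speedup of implementation X over implementation Y: S(X, Y) = T(Y) / T(X). */
static double speedup(double t_x, double t_y) { return t_y / t_x; }

int main(void) {
    /* average times (s) for problem P1, copied from Tables 1-3 */
    double t_matlab = 0.216, t_java = 0.003, t_c = 0.001;

    /* Table 4 reports 71.8, 2.7 and 197.4 for P1; the small mismatches
     * come from the rounding of the times published in Tables 1-3. */
    printf("P1: S(Java, MATLAB) = %.1f\n", speedup(t_java, t_matlab));
    printf("P1: S(C, Java)      = %.1f\n", speedup(t_c, t_java));
    printf("P1: S(C, MATLAB)    = %.1f\n", speedup(t_c, t_matlab));
    return 0;
}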
Another perspective on the capabilities of the three MCSFilter implementations
considered here builds on the comparison of their efficiency (or effectiveness)
in discovering all known optimizers of the optimization problems at stake.
A simple metric that captures this efficiency is E(X, Y) = minavg(X)/minavg(Y).
Table 5 shows the relative optima search efficiency for several pairs of MCSFilter
implementations. For all but three problems, the MATLAB, Java and C
implementations find exactly the same number of optimizers (and so their
relative efficiency is 1, or 100%). For problems P12, P13 and P14, however,
the search efficiency varies considerably. Compared to the MATLAB version, both
the Java and C versions are unable to find as many optimizers for problems
P12 and P13; for problem P14, however, the Java version finds 12%
more optimizers than the MATLAB version, while the C version still lags behind
Table 4. Speedups of the execution time.
Problem S(Java, MATLAB) S(C, Java) S(C, MATLAB)
P1 71,8 2,7 197,4
P2 57,9 2,1 122,3
P3 29,6 3,2 94,5
P4 32,3 2,0 64,7
P5 36,9 1,7 64,5
P6 37,8 1,6 61,2
P7 33,0 1,6 53,2
P8 28,8 1,8 52,7
P9 24,9 1,8 44,7
P10 27,2 1,4 37,5
P11 176,8 2,4 417,9
P12 192,3 1,3 252,3
P13 53,3 2,4 126,3
P14 113,5 1,3 150,4
P15 67,6 1,2 84,4
Table 5. Optima search efficiency.
Problem E(Java, MATLAB) E(C, Java) E(C, MATLAB)
P1 1 1 1
P2 1 1 1
P3 1 1 1
P4 1 1 1
P5 1 1 1
P6 1 1 1
P7 1 1 1
P8 1 1 1
P9 1 1 1
P10 1 1 1
P11 1 1 1
P12 0,72 0,75 0,54
P13 0,96 0,85 0,81
P14 1,12 0,79 0,88
P15 1 1 1
(finding only 88% of the optimizers found by MATLAB). Also, compared to the
Java version, the C version currently shows a lower search efficiency on problems
P12, P13 and P14, something to be tackled in future work.
A final analysis is based on the data of Table 6. For each problem Y and each
MCSFilter implementation X, this table presents the precision achieved by that
implementation, P(X, Y) = |f∗(Y) − best f∗(X)|, that is, the absolute difference
between the known global minimum of problem Y and the best value achieved
for the global minimum by implementation X. The following conclusions may be
drawn: in all the problems, the best f∗ is close to the global minimum known in
the literature, since the measure is close to zero; moreover, for six problems the
new C implementation outperforms the previous two, and in five other problems
all the implementations obtained the same precision for the best f∗.
Table 6. Global optima precision.
Problem P(MATLAB) P(Java) P(C)
P1 1,1E − 04 1,1E − 04 1,0E − 05
P2 4,0E − 04 4,0E − 04 0,0E + 00
P3 1,0E − 04 1,0E − 04 1,0E − 04
P4 3,0E − 04 3,0E − 04 0,0E + 00
P5 7,0E − 04 3,0E − 04 2,0E − 04
P6 0,0E + 00 0,0E + 00 3,0E − 04
P7 8,0E − 03 8,0E − 03 8,2E − 03
P8 0,0E + 00 0,0E + 00 0,0E + 00
P9 3,0E − 04 3,0E − 04 6,0E − 04
P10 4,0E − 03 4,0E − 03 3,7E − 03
P11 0,0E + 00 0,0E + 00 0,0E + 00
P12 0,0E + 00 0,0E + 00 1,0E − 04
P13 2,2E − 02 2,2E − 02 2,2E − 02
P14 0,0E + 00 0,0E+00 0,0E + 00
P15 3,9E − 04 3,9E − 04 1,4E − 05
4 Conclusions and Future Work
The MCSFilter algorithm was used to solve bound constrained problems with
different dimensions, from two to ten. The algorithm was originally implemented
in MATLAB and the results initially obtained were considered very promising.
A second implementation was later developed in Java, which increased the
performance considerably. In this work, the MCSFilter algorithm was re-coded in
the C language and a comparison was made between all three implementations,
both performance-wise and regarding search efficiency and precision. The
evaluation results show that, for the set of problems considered, the new C version,
even though it is still preliminary, already surpasses the performance of the Java
implementation. The search efficiency of the C version, however, must be
improved. Regarding precision, out of the 15 problems, the C version matched the
previous implementations in five problems and brought improvements in six others.
Besides tackling the numerical efficiency and precision issues that still persist,
future work will include testing the C code with other problems (including
higher-dimensional and harder ones) and refining the code in order to further
improve its performance. In particular, and most relevant for the problems that
still take a considerable amount of execution time, parallelization strategies will
be explored as a way to further accelerate the execution of the MCSFilter algorithm.
Acknowledgements. This work has been supported by FCT - Fundação para a
Ciência e Tecnologia within the Project Scope: UIDB/05757/2020.
References
1. Abhishek, K., Leyffer, S., Linderoth, J.: FilMINT: an outer-approximation-based
solver for convex mixed-integer nonlinear programs. INFORMS J. Comput. 22(4),
555–567 (2010)
2. Abramson, M., Audet, C., Chrissis, J., Walston, J.: Mesh adaptive direct search
algorithms for mixed variable optimization. Optim. Lett. 3(1), 35–47 (2009).
https://doi.org/10.1007/s11590-008-0089-2
3. Amador, A., Fernandes, F.P., Santos, L.O., Romanenko, A., Rocha, A.M.A.C.:
Parameter estimation of the kinetic α-Pinene isomerization model using the MCS-
Filter algorithm. In: Gervasi, O., et al. (eds.) ICCSA 2018, Part II. LNCS, vol.
10961, pp. 624–636. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-95165-2_44
4. Amador, A., Fernandes, F.P., Santos, L.O., Romanenko, A.: Application of MCS-
Filter to estimate stiction control valve parameters. In: International Conference of
Numerical Analysis and Applied Mathematics, AIP Conference Proceedings, vol.
1863, pp. 270005 (2017)
5. Belotti, P., Kirches, C., Leyffer, S., Linderoth, J., Mahajan, A.: Mixed-Integer
Nonlinear Optimization. Acta Numer. 22, 1–131 (2013)
6. Bonami, P., et al.: An algorithmic framework for convex mixed integer nonlinear
programs. Discrete Optim. 5(2), 186–204 (2008)
7. Bonami, P., Gonçalves, J.: Heuristics for convex mixed integer nonlinear programs.
Comput. Optim. Appl. 51(2), 729–747 (2012)
8. D’Ambrosio, C., Lodi, A.: Mixed integer nonlinear programming tools: an updated
practical overview. Ann. Oper. Res. 24, 301–320 (2013). https://doi.org/10.1007/
s10479-012-1272-5
9. Fernandes, F.P.: Programação não linear inteira mista e não convexa sem derivadas
[Mixed integer nonlinear and nonconvex programming without derivatives].
PhD thesis, University of Minho, Braga (2014)
10. Fernandes, F.P., Costa, M.F.P., Fernandes, E.M.G.P., et al.: Multilocal program-
ming: a derivative-free filter multistart algorithm. In: Murgante, B. (ed.) ICCSA
2013, Part I. LNCS, vol. 7971, pp. 333–346. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39637-3_27
11. Floudas, C., et al.: Handbook of Test Problems in Local and Global Optimization.
Kluwer Academic Publishers, Boston (1999)
12. Hendrix, E.M.T., Tóth, B.G.: Introduction to Nonlinear and Global Optimization.
Springer, New York (2010). https://doi.org/10.1007/978-0-387-88670-1
13. Romanenko, A., Fernandes, F.P., Fernandes, N.C. P.: PID controllers tuning with
MCSFilter. In: AIP Conference Proceedings, vol. 2116, pp. 220003 (2019)
14. Yang, X.-S.: Optimization Techniques and Applications with Examples. Wiley,
Hoboken (2018)
On the Performance of the ORTHOMADS
Algorithm on Continuous
and Mixed-Integer Optimization
Problems
Marie-Ange Dahito1,2(B), Laurent Genest1, Alessandro Maddaloni2, and José Neto2
1 Stellantis, Route de Gisy, 78140 Vélizy-Villacoublay, France
{marieange.dahito,laurent.genest}@stellantis.com
2 Samovar, Telecom SudParis, Institut Polytechnique de Paris,
19 place Marguerite Perey, 91120 Palaiseau, France
{alessandro.maddaloni,jose.neto}@telecom-sudparis.eu
Abstract. ORTHOMADS is an instantiation of the Mesh Adaptive
Direct Search (MADS) algorithm used in derivative-free and black-
box optimization. We investigate the performance of the variants of
ORTHOMADS on the bbob and bbob-mixint, respectively continuous
and mixed-integer, testbeds of the COmparing Continuous Optimizers
(COCO) platform and compare the considered best variants with heuris-
tic and non-heuristic techniques. The results show a favourable perfor-
mance of ORTHOMADS on the low-dimensional continuous problems
used and advantages on the considered mixed-integer problems. Besides,
a generally faster convergence is observed on all types of problems when
the search phase of ORTHOMADS is enabled.
Keywords: Derivative-free optimization · Blackbox optimization ·
Benchmarking · Mesh Adaptive Direct Search · Mixed-integer blackbox
1 Introduction
Derivative-free optimization (DFO) and blackbox optimization (BBO) are
branches of numerical optimization that have grown rapidly in the past years,
especially with the growing need to solve real-world application problems
but also with the development of methods to deal with unavailable or
numerically costly derivatives. DFO focuses on optimization techniques that make no
use of derivatives, while BBO deals with problems where the objective function
is not analytically known, that is, it is a blackbox. A typical blackbox objective
is the output of a computer simulation: for instance, at Stellantis, the crash or
acoustic outputs computed by the finite element simulation of a vehicle. The
problems addressed in this paper are of the form:
minimize_{x ∈ X} f(x),   (1)
where X is a bounded domain of either ℝⁿ or ℝᶜ × ℤⁱ, with c and i respectively
the number of continuous and integer variables; n = c + i is the dimension of the
problem and f is a blackbox function. Heuristic and non-heuristic techniques
can tackle this kind of problem. Among the main approaches used in DFO
are direct local search methods. The latter are iterative methods that, at each
iteration, evaluate a set of points in a certain radius that can be increased if a
better solution is found or decreased if the incumbent remains the best point at
the current iteration.
The Mesh Adaptive Direct Search (MADS) [1,4,5] is a well-known direct local
search method used in DFO and BBO, and an extension of the Generalized
Pattern Search (GPS) introduced in [28]. MADS evolves on a mesh: it first performs
a global exploration called the search phase and then, if a solution better than
the current iterate is not found, a local poll. The points evaluated
in the poll are defined by a finite set of poll directions that is updated at each
iteration. The algorithm has several instantiations, available in the Nonlinear
Optimization with the MADS algorithm (NOMAD) software [7,19], and
its performance has been evaluated in several papers. For example, a broad
comparison of DFO optimizers is performed on 502 problems in [25], and NOMAD is
used in [24] with a DACE surrogate and compared with other local and global
surrogate-based approaches in the context of constrained blackbox optimization,
on an automotive optimization problem and twenty-two test problems.
Given the growing number of algorithms for BBO problems, choosing the
most suitable method for a specific problem remains complex. To help with
this decision, tools have been developed to compare the performance of
algorithms. In particular, data profiles [20] are frequently used in DFO and BBO
to benchmark algorithms: they show, for a given precision or target value, the
fraction of problems solved by an algorithm as a function of the number of
function evaluations. There are also suites of academic test problems: although
these are treated as blackbox functions, they are analytically known, which is an
advantage when trying to understand the behaviour of an algorithm. Industrial
applications are also available, but they are rare.
Twenty-two implementations of derivative-free algorithms for solving box-
constrained optimization problems are benchmarked in [25] and compared with
each other according to different criteria. They use a set of 502 problems
categorized according to their convexity (convex or nonconvex), smoothness
(smooth or non-smooth) and dimensions between 1 and 300. The tested algorithms
include local-search methods such as MADS, through NOMAD version
3.3, and global-search methods such as the NEW Unconstrained Optimization
Algorithm (NEWUOA) [23], which uses trust regions, and the Covariance Matrix
Adaptation - Evolution Strategy (CMA-ES) [16], which is an evolutionary algorithm.
Simulation optimization deals with problems where at least some of the objective
or constraint values come from stochastic simulations. A review of algorithms for
simulation optimization is presented in [2], among which the NOMAD
software. However, that review does not compare them, due to a lack of standard
comparison tools and large-enough testbeds in this optimization branch.
In [3], the MADS algorithm is used to optimize the treatment process of
spent potliners in the production of aluminum. The problem is formalized as
a 7–dimensional non-linear blackbox problem with 4 inequality constraints. In
particular, three strategies are compared using absolute displacements, relative
displacements and the latter with a global Latin hypercube sampling search.
They show that the use of scaling is particularly beneficial on the considered
chemical application.
The instantiation ORTHOMADS is introduced in [1] and consists in using
orthogonal directions in the poll step of MADS. It is compared to the initial
LTMADS, where the poll directions are generated from a random lower
triangular matrix, and to the GPS algorithm on 45 problems from the literature.
The authors show that MADS outperforms GPS and that the instantiation
ORTHOMADS competes with LTMADS, with the advantage that its poll
directions cover the variable space better.
The ORTHOMADS algorithm, which is the default MADS instantiation used
in NOMAD, has several variants that differ in the poll directions of the method.
To our knowledge, the performance of these different variants has not been
discussed in the literature. The purpose of this paper is to explore this aspect by
performing experiments with the ORTHOMADS variants. This work is part of
a project conducted with the automotive group Stellantis to develop new
approaches for solving its blackbox optimization problems. Our contributions are,
first, the evaluation of the ORTHOMADS variants on continuous and mixed-integer
optimization problems. Besides, the contribution of the search phase is studied,
showing a general deterioration of the performance when the search is turned off;
the effect, however, decreases with increasing dimension. Two of the best variants
of ORTHOMADS are identified on each of the testbeds used, and their
performance is compared with that of other algorithms, including heuristic and
non-heuristic techniques. Our experiments exhibit particular variants of ORTHOMADS
performing best depending on problem features. Plots for the analyses are available at
the following link: https://github.com/DahitoMA/ResultsOrthoMADS.
The paper is organized as follows. Section 2 gives an overview of the
MADS algorithm and its ORTHOMADS variants. In Sect. 3, the variants of
ORTHOMADS are evaluated on the bbob and bbob-mixint suites, which
consist respectively of continuous and mixed-integer functions. Then, two of the
best variants of ORTHOMADS are compared with other algorithms in Sect. 4.
Finally, Sect. 5 discusses the results of the paper.
2 MADS and the Variants of ORTHOMADS
This section gives an overview of the MADS algorithm and explains the differ-
ences among the ORTHOMADS variants.
2.1 The MADS Algorithm
MADS is an iterative direct local search method used for DFO and BBO
problems. The method relies on a mesh Mk, updated at each iteration and determined
by the current iterate xk, a mesh size parameter δk > 0 and a matrix D whose
columns consist of p positive spanning directions. The mesh is defined as follows:
Mk := {xk + δk D y : y ∈ ℕᵖ},   (2)
where the columns of D form a positive spanning set {D1, D2, . . . , Dp} and ℕ
stands for the natural numbers.
The algorithm proceeds in two phases at each iteration: the search and the
poll. The search phase is optional and similar to a design of experiments: a finite
set of points Sk, stemming generally from a surrogate model prediction and a
Nelder-Mead (N-M) search [21], is evaluated anywhere on the mesh. If the
search fails at finding a better point, then a poll is performed. During the poll
phase, a finite set of points is evaluated on the mesh in the neighbourhood of
the incumbent. This neighbourhood is called the frame Fk and has a radius
Δk > 0, called the poll size parameter. The frame is defined as follows:
Fk := {x ∈ Mk : ‖x − xk‖∞ ≤ Δk b},   (3)
where b = max{‖d‖∞, d ∈ D} and D ⊂ {D1, D2, . . . , Dp} is a finite set of poll
directions. The latter are such that their union over the iterations grows dense on
the unit sphere.
The two size parameters satisfy δk ≤ Δk and evolve after each iteration:
if a better solution is found, they are increased, and otherwise they are decreased. As
the mesh size decreases more drastically than the poll size after an unsuccessful
iteration, the set of candidate points for the poll becomes richer with unsuccessful
iterations. Usually, δk = min{Δk, Δk²}. The MADS algorithm is described in
Algorithm 1, inspired from [6].
Algorithm 1: Mesh Adaptive Direct Search (MADS)
Initialize k = 0, x0 ∈ ℝⁿ, D ∈ ℝⁿˣᵖ, Δ0 > 0, τ ∈ (0, 1) ∩ ℚ, εstop > 0
1. Update δk = min{Δk, Δk²}
2. Search
   If f(x) < f(xk) for some x ∈ Sk then xk+1 ← x, Δk+1 ← τ⁻¹ Δk and go to 4
   Else go to 3
3. Poll
   Select Dk,Δk such that Pk := {xk + δk d : d ∈ Dk,Δk} ⊂ Fk
   If f(x) < f(xk) for some x ∈ Pk then xk+1 ← x, Δk+1 ← τ⁻¹ Δk and go to 4
   Else xk+1 ← xk and Δk+1 ← τ Δk
4. Termination
   If Δk+1 ≥ εstop then k ← k + 1 and go to 1
   Else stop
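For intuition on steps 1–4, the listing below is a heavily simplified, self-contained C sketch of the poll/update loop: it polls the 2n coordinate directions ±e_i scaled by the current poll size, enlarges the step after a success and shrinks it otherwise (with τ = 1/2), and stops once the step falls below a tolerance. The mesh, the search step and the dense direction sets that give MADS its convergence guarantees are deliberately omitted, so this is an illustration only, not the algorithm as implemented in NOMAD.

#include <stdio.h>
#include <string.h>

#define N 2            /* dimension of the toy problem */

/* toy smooth objective: f(x) = (x0 - 1)^2 + 4*(x1 + 2)^2 */
static double f(const double *x) {
    return (x[0] - 1.0) * (x[0] - 1.0) + 4.0 * (x[1] + 2.0) * (x[1] + 2.0);
}

int main(void) {
    double x[N] = {5.0, 5.0};          /* current iterate x_k     */
    double delta = 1.0;                /* poll size parameter     */
    const double tau = 0.5;            /* contraction factor      */
    const double eps_stop = 1e-6;      /* stopping tolerance      */
    double fx = f(x);

    while (delta >= eps_stop) {
        int success = 0;
        /* poll the 2N directions ±e_i (a crude stand-in for the MADS poll set) */
        for (int i = 0; i < N && !success; i++) {
            for (int s = -1; s <= 1; s += 2) {
                double trial[N];
                memcpy(trial, x, sizeof trial);
                trial[i] += s * delta;
                double ft = f(trial);
                if (ft < fx) {                 /* improvement found */
                    memcpy(x, trial, sizeof x);
                    fx = ft;
                    success = 1;
                    break;
                }
            }
        }
        delta = success ? delta / tau : delta * tau;   /* expand or shrink */
    }
    printf("x = (%.4f, %.4f), f = %.6f\n", x[0], x[1], fx);
    return 0;
}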
2.2 ORTHOMADS Variants
MADS has two main instantiations, ORTHOMADS and LTMADS, the latter
being the first developed. Both variants are implemented in the NOMAD
software; since ORTHOMADS is to be preferred for its coverage property in the
variable space, it was used for the experiments of this paper, with NOMAD
version 3.9.1.
The NOMAD implementation of ORTHOMADS provides 6 variants of the
algorithm, which differ in the number of directions used in the poll or in the
way the last poll direction is computed. They are listed below.
ORTHO N + 1 NEG computes n + 1 directions, among which n are orthogonal
and the (n + 1)-th direction is the negative of the sum of the first n.
ORTHO N + 1 UNI computes n + 1 directions, among which n are orthogonal
and the (n + 1)-th direction is generated from a uniform distribution.
ORTHO N + 1 QUAD computes n + 1 directions, among which n are orthogonal
and the (n + 1)-th direction is obtained from the minimization of a local quadratic
model of the objective.
ORTHO 2N computes 2n orthogonal directions. More precisely, each direction
is orthogonal to 2n − 2 of the directions and collinear with the remaining one.
ORTHO 1 uses only one direction in the poll.
ORTHO 2 uses two opposite directions in the poll.
In the plots, these variants are denoted by Neg, Uni, Quad, 2N, 1 and 2,
respectively.
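One simple way to see where 2n mutually orthogonal poll directions can come from is a Householder-type construction: for a unit vector v, the columns h_j of H = I − 2vvᵀ are orthonormal, and the set {±h_1, . . . , ±h_n} has exactly the ORTHO 2N structure described above. The sketch below illustrates only this linear-algebra idea; the actual ORTHOMADS construction (choice of seed directions, scaling to the mesh) is more involved and is not reproduced here.

#include <stdio.h>
#include <math.h>

#define N 3

int main(void) {
    /* any nonzero seed direction; normalize it to obtain the unit vector v */
    double v[N] = {1.0, -2.0, 0.5};
    double norm = 0.0;
    for (int i = 0; i < N; i++) norm += v[i] * v[i];
    norm = sqrt(norm);
    for (int i = 0; i < N; i++) v[i] /= norm;

    /* columns of H = I - 2 v v^T are mutually orthonormal */
    double H[N][N];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            H[i][j] = (i == j ? 1.0 : 0.0) - 2.0 * v[i] * v[j];

    /* the 2N poll directions are +h_j and -h_j for each column j */
    for (int j = 0; j < N; j++) {
        printf("+h%d = (% .4f, % .4f, % .4f)\n", j, H[0][j], H[1][j], H[2][j]);
        printf("-h%d = (% .4f, % .4f, % .4f)\n", j, -H[0][j], -H[1][j], -H[2][j]);
    }

    /* check orthogonality of two distinct columns */
    double dot = 0.0;
    for (int i = 0; i < N; i++) dot += H[i][0] * H[i][1];
    printf("h0 . h1 = %.2e (should be ~0)\n", dot);
    return 0;
}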
3 Test of the Variants of ORTHOMADS
In this section, we try to identify potentially better direction types of
ORTHOMADS and investigate the contribution of the search phase.
3.1 The COCO Platform and the Used Testbeds
The COmparing Continuous Optimizers (COCO) platform [17] is a benchmarking
framework for blackbox optimization. It provides several suites of standard test
problems, each available in several variants, also called instances, obtained from
transformations in variable and objective space that make the functions less regular.
In particular, the bbob testbed [13] provides 24 continuous problems for
blackbox optimization, each of them available in 15 instances and in dimensions
2, 3, 5, 10, 20 and 40. The problems are categorized into five subgroups: separable
functions, functions with low or moderate conditioning, ill-conditioned functions,
multi-modal functions with global structure and multi-modal weakly structured
functions. All problems are known to have their global optima in [−5, 5]ⁿ, where
n is the dimension of the problem.
The mixed-integer suite bbob-mixint [29] is derived from the bbob and
bbob-largescale [30] problems by imposing integer constraints on some of the
variables. It consists of the 24 functions of bbob, available in 15 instances and in
dimensions 5, 10, 20, 40, 80 and 160.
COCO also provides various tools for algorithm comparison, notably the
Empirical Cumulative Distribution Function (ECDF) plots (or data profiles) used
in this paper. They show the empirical runtimes, computed as the number of
function evaluations needed to reach given function target values, divided by the
dimension. A function target value is defined as ft = f∗ + Δft, where f∗ is the
minimum value of a function f and Δft is a target precision.
For the bbob and bbob-mixint testbeds, the target precisions are 51 values
between 10⁻⁸ and 10². Thus, if a method reaches 1 on the ordinate axis of an
ECDF plot, it means that 100% of the function target values have been reached,
including the smallest one, f∗ + 10⁻⁸. A cross on an ECDF curve indicates
when the maximal budget of function evaluations is reached; after the cross,
COCO estimates the runtimes using so-called simulated restarts.
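To make the target mechanism concrete, the snippet below generates 51 precisions, assuming they are log-uniformly spaced between 10² and 10⁻⁸ (an assumption, since the exact spacing is not restated here), and counts how many targets f∗ + Δft a hypothetical best-so-far value reaches; that fraction is what a single point of an ECDF curve reports for one problem.

#include <stdio.h>
#include <math.h>

int main(void) {
    const int n_targets = 51;
    const double fstar = 0.0;          /* known optimum of a hypothetical problem */
    const double best_so_far = 3e-4;   /* best f value found by some run          */

    int hit = 0;
    for (int k = 0; k < n_targets; k++) {
        /* assumed log-uniform spacing of the precisions: 10^2 down to 10^-8 */
        double exponent = 2.0 - 10.0 * k / (double)(n_targets - 1);
        double delta_ft = pow(10.0, exponent);
        double target = fstar + delta_ft;
        if (best_so_far <= target) hit++;
    }
    printf("targets reached: %d of %d (%.2f on the ECDF ordinate)\n",
           hit, n_targets, (double)hit / n_targets);
    return 0;
}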
For bbob, an artificial solver called best 2009 is present on the plots and is
used as the reference solver. Its data comes from the BBOB-2009 workshop1,
which compared 31 solvers.
1 https://coco.gforge.inria.fr/doku.php?id=bbob-2009-results.
The statistical significance of the results is evaluated in COCO using the
rank-sum test.
3.2 Parameter Setting
In order to test the performance of the different variants of ORTHOMADS, the
bbob and bbob-mixint suites of COCO were used, restricted to the problems
whose dimension is lower than or equal to 20. This limit on the dimensions has
two main reasons: the first is the computational cost required for the experiments;
furthermore, with the perspective of solving real-world problems, 20 is already
a high dimension in this expensive blackbox context. Only the first 5 instances
of each function were used, that is, a total of respectively 600 and 360 problems
from bbob and bbob-mixint. A maximal function evaluation budget of
2 × 10³ × n was set, with n being the dimension of the considered problem.
To see the contribution of the search phase, the experiments on the variants
were divided into two subgroups: the first using the default search
of ORTHOMADS and the second with the search phase disabled.
The latter is obtained by setting the four NOMAD parameters NM_SEARCH,
VNS_SEARCH, SPECULATIVE_SEARCH and MODEL_SEARCH to the value no. In the
plots, the label NoSrch is used when the search is turned off. The search notably
includes the use of a quadratic model and of the N-M method. The minimal mesh
size was set to 10⁻¹¹.
Experiments were run with restarts allowed for unsolved problems when the
evaluation budget had not been reached, which may happen due to internal
stopping criteria of the solvers. The initial points are those suggested by COCO
through the method initial_solution_proposal().
3.3 Results
Continuous Problems. As said previously, the contribution of the search
phase was studied. The results aggregated over all functions in dimensions 5, 10
and 20 on the bbob suite are depicted in Fig. 1. They show that enabling the
search step in NOMAD generally leads to an equivalent or higher performance
of the variants and this improvement can be important. Besides, using one or
two directions with or without search is often far from being competitive with
the other variants. In particular, 1 NoSrch is often the worst or among the
worsts, except on Discus which is an ill-conditioned quadratic function, where it
competes with the variants that do not use the search. As mentioned in Sect. 1,
the plots depicting the results described in the paper are available online.
Looking at the results aggregated on all functions for ORTHO 2N, ORTHO N + 1
NEG, ORTHO N + 1 QUAD and ORTHO N + 1 UNI, the search increases the success
rate from nearly 70%, 55% and 40% up to 90%, 80% and 65% respectively in
dimensions 2, 3 and 5, as shown in Fig. 1a for dimension 5. From dimension
10, the advantage of the search decreases and the performance of ORTHO N + 1
UNI visibly departs from that of the other three variants mentioned above, since
it decreases with or without the search, as illustrated in Figs. 1b and 1c.
Focusing on some families of functions, Neg NoSrch seems slightly less
impacted than the other NoSrch variants by the increase of the dimension.
On ill-conditioned problems, the variants using search are more sensitive to
the increase of the dimension.
Considering multi-modal functions with adequate global structure, 2N NoSrch
solves 15% more problems than the other NoSrch variants in 2D. In this dimen-
sion, the variants using search have a better success rate than the best 2009
up to a budget of 200 function evaluations. From 10D, all curves are rather flat:
all ORTHOMADS variants tend to a local optimum.
With increasing dimension, Neg is competitive or better than the others on
multi-modal problems without global structure, followed by 2N. In particular,
in dimension 20 both variants are competitive and outperform the remaining
variants that use search on the Gallagher’s Gaussian 101–me peaks function,
and Neg outperforms them with a gap of more than 20% in their success rate on
the Gallagher’s Gaussian 21–hi peaks function which is also ill-conditioned.
Since Neg and 2N are often among the best variants on the considered prob-
lems and have an advantage on some multi-modal weakly structured functions,
they are chosen for comparison with other solvers.
Mixed-Integer Problems. The experiments performed on the mixed-integer
problems also show a similar or improved performance of the ORTHOMADS
variants when the search step is enabled in NOMAD, as illustrated in Fig. 2 in
dimensions 5, 10 and 20. Looking at Fig. 2a, for instance, within the given budget
of 2 × 10³ × n, the variant denoted as 2 solves 75% of the problems in dimension
5, against 42% for 2 NoSrch.
However, this is not always the case: relying only on the poll directions is
sometimes favourable. This is notably the case on the Schwefel function in
dimension 20, where Neg NoSrch solves 43% of the problems, the highest
success rate when the search and non-search settings are compared together.
(a) 5D (b) 10D (c) 20D
Fig. 1. ECDF plots: the variants of ORTHOMADS with and without the search step
on the bbob problems. Results aggregated on all functions in dimensions 5, 10 and 20.
When the search is disabled, ORTHO 2N seems preferable in small dimension,
namely here in 5D as presented in Fig. 2a. In this dimension, it is sometimes
the only variant that solves all the instances of a function in the given budget:
it is the case for the step-ellipsoidal function, the two Rosenbrock functions
(original and rotated), the Schaffer functions, and the Schwefel function. It also
solves all the separable functions in 5D and can therefore solve the different
types of problems. Although the difference is less noticeable with the search step
enabled, this variant is still a good choice, especially on multi-modal problems
with adequate global structure.
On the whole, looking at Fig. 2, ORTHO 1 and ORTHO 2 solve fewer problems than
the other variants, and the performance gap with the other direction types
increases with the dimension, whether the search phase is used or not. Although
the use of the search helps solve some functions in low dimension, such as the
sphere or linear slope functions in 5D, both variants perform poorly in dimension
20 on second-order separable functions, even if the search enables the solution
of linear slope, which is a linear function. Between these two variants, using 2
poll directions also seems better than only one, especially in dimension 10, where
ORTHO 2 solves more than 23% and 40% of the problems respectively without and
with the search, against 16% and 31% for ORTHO 1, as presented in Fig. 2b.
Among the four remaining variants, ORTHO N + 1 UNI reaches equivalent or
fewer targets than the others, whether the search is available or only the poll
directions are used, as depicted in Fig. 2. In particular, in dimension 5, the four
variants using at least n + 1 poll directions solve more than 85% of the separable
problems with or without search. But when the dimension increases, ORTHO N + 1
UNI is at a disadvantage on the Rastrigin functions, where the use of the search
does not noticeably help the convergence of the algorithm.
Focusing on the different function types, none of the variants ORTHO 2N,
ORTHO N + 1 NEG and ORTHO N + 1 QUAD seems to particularly outperform
the others in dimensions 10 and 20. A higher success rate is however noticeable,
on multimodal weakly structured problems with the search available, for ORTHO N + 1
NEG in comparison with ORTHO N + 1 QUAD, and for the latter in comparison with
ORTHO 2N. Besides, Neg reaches more targets on problems with low or moderate
conditioning. For these reasons, ORTHO N + 1 NEG was chosen for comparison with
other solvers. Moreover, the mentioned slight advantage of ORTHO N + 1 QUAD over
ORTHO 2N, together with its equivalent or better performance on separable and
ill-conditioned functions compared with the latter variant, makes it a good second
choice to represent ORTHOMADS.
(a) 5D (b) 10D (c) 20D
Fig. 2. ECDF plots: the variants of ORTHOMADS with and without the search step
on the bbob-mixint problems. Results aggregated on all functions in dimensions 5, 10
and 20.
4 Comparison of ORTHOMADS with Other Solvers
The previous experiments showed the advantage of using the search step in
ORTHOMADS to speed up convergence. They also revealed the effectiveness of
some variants that are used here for comparisons with other algorithms on the
continuous and mixed-integer suites.
4.1 Compared Algorithms
Apart from ORTHOMADS, the algorithms used for comparison on bbob
are, first, three deterministic algorithms: the quasi-Newton Broyden-Fletcher-
Goldfarb-Shanno (BFGS) method [22], the quadratic model-based NEWUOA
and the adaptive N-M [14], which is a simplicial search. Stochastic methods are also
used, among which a Random Search (RS) algorithm [10] and three population-
based algorithms: a surrogate-assisted CMA-ES, Differential Evolution (DE) [27]
and Particle Swarm Optimization (PSO) [11,18].
In order to perform algorithm comparisons on bbob-mixint, data from four
stochastic methods were collected: RS, the mixed-integer variant of CMA-ES,
DE, and the Tree-structured Parzen Estimator (TPE) [8], which is a stochastic
model-based technique.
BFGS is an iterative quasi-Newton linesearch method that uses approximations
of the Hessian matrix of the objective. At iteration k, the search direction pk
solves the linear system Bk pk = −∇f(xk), where xk is the iterate, f the
objective function and Bk ≈ ∇²f(xk). The matrix Bk is then updated according
to a formula. In the context of BBO, the derivatives are approximated with finite
differences.
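Since the derivatives needed by BFGS are replaced by finite differences in the blackbox setting, the short sketch below shows one common forward-difference approximation of the gradient; the step size h and the forward (rather than central) scheme are illustrative choices, not necessarily those used by the benchmarked implementation.

#include <stdio.h>

typedef double (*objfun)(const double *x, int n);

/* forward-difference approximation of the gradient of f at x */
void fd_gradient(objfun f, const double *x, int n, double h, double *grad) {
    double fx = f(x, n);
    double xh[64];                      /* assumes n <= 64 for this sketch */
    for (int i = 0; i < n; i++) xh[i] = x[i];
    for (int i = 0; i < n; i++) {
        xh[i] = x[i] + h;
        grad[i] = (f(xh, n) - fx) / h;  /* (f(x + h e_i) - f(x)) / h */
        xh[i] = x[i];
    }
}

/* example objective: f(x) = x0^2 + 3 x1^2 */
static double quad(const double *x, int n) {
    (void)n;
    return x[0] * x[0] + 3.0 * x[1] * x[1];
}

int main(void) {
    double x[2] = {1.0, -1.0}, g[2];
    fd_gradient(quad, x, 2, 1e-6, g);
    printf("grad = (%.4f, %.4f)  (exact: 2, -6)\n", g[0], g[1]);
    return 0;
}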
NEWUOA is Powell’s model-based algorithm for DFO. It is a trust-region
method that uses sequential quadratic interpolation models to solve uncon-
strained derivative-free problems.
The N-M method is a heuristic DFO method that uses simplices. It begins
with a non-degenerate simplex. The algorithm identifies the worst point among
the vertices of the simplex and tries to replace it by reflection, expansion or
contraction. If none of these geometric transformations of the worst point yields
a better point, a contraction preserving the best point is performed. The
adaptive N-M method uses the N-M technique with parameters adapted to the
dimension, which is notably useful in high dimensions.
RS is a stochastic iterative method that performs a random selection of
candidates: at each iteration, a random point is sampled and the best between
this trial point and the incumbent is kept.
CMA-ES is a state-of-the-art evolutionary algorithm used in DFO. Let
N(m, C) denote a normal distribution of mean m and covariance matrix C.
It can be represented by the ellipsoid xᵀC⁻¹x = 1. The principal axes of the
ellipsoid are along the eigenvectors of C, and the squared lengths of its semi-axes
are the associated eigenvalues. CMA-ES iteratively samples its populations from
multivariate normal distributions. The method updates the covariance
matrices to learn a quadratic model of the objective.
DE is a meta-heuristic that creates a mutant vector by combining randomly
chosen individuals from the population. A trial vector is then sequentially filled
with parameters taken either from the mutant or from the incumbent. Finally,
the better of the incumbent and the trial vector is kept.
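The classical DE/rand/1/bin scheme of [27] makes this trial-vector construction concrete: a mutant is formed from three randomly chosen population members and the trial vector then inherits each coordinate either from the mutant or from the target vector. The sketch below follows this textbook variant, with F and CR as the usual control parameters; it is not meant to match the exact configuration benchmarked here.

#include <stdio.h>
#include <stdlib.h>

#define NP  10   /* population size     */
#define DIM 4    /* problem dimension   */

static double urand(void) { return (double)rand() / ((double)RAND_MAX + 1.0); }

/* Build one DE/rand/1/bin trial vector for target index t. */
void de_trial(double pop[NP][DIM], int t, double F, double CR, double *trial) {
    int r1, r2, r3;
    do { r1 = rand() % NP; } while (r1 == t);
    do { r2 = rand() % NP; } while (r2 == t || r2 == r1);
    do { r3 = rand() % NP; } while (r3 == t || r3 == r1 || r3 == r2);

    int jrand = rand() % DIM;               /* at least one mutated coordinate */
    for (int j = 0; j < DIM; j++) {
        if (j == jrand || urand() < CR)
            trial[j] = pop[r1][j] + F * (pop[r2][j] - pop[r3][j]);  /* mutation */
        else
            trial[j] = pop[t][j];                                   /* inherit  */
    }
    /* selection (not shown): keep trial only if f(trial) <= f(pop[t]) */
}

int main(void) {
    double pop[NP][DIM], trial[DIM];
    for (int i = 0; i < NP; i++)
        for (int j = 0; j < DIM; j++)
            pop[i][j] = -5.0 + 10.0 * urand();     /* random initial population */
    de_trial(pop, 0, 0.8, 0.9, trial);
    printf("trial = (%.3f, %.3f, %.3f, %.3f)\n",
           trial[0], trial[1], trial[2], trial[3]);
    return 0;
}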
PSO is an archive-based evolutionary algorithm where candidate solutions
are called particles and the population is a swarm. The particles evolve according
to the global best solution encountered but also according to their local best
points.
TPE is an iterative model-based method for hyperparameter optimization.
It sequentially builds a probabilistic model from already evaluated hyperparameter
sets in order to suggest a new set of hyperparameters to evaluate on a score
function that is to be minimized.
4.2 Parameter Setting
To compare the considered best variants of ORTHOMADS with other methods,
the 15 instances of each function were used and the maximal function evaluation
budget was increased to 10⁵ × n, with n being the dimension.
For the bbob problems, the data used for BFGS, DE and the adaptive N-M
method comes from the experiments of [31]. CMA-ES was tested in [15], the
data of NEWUOA is from [26], that of PSO is from [12] and the RS results
come from [9]. The comparison data of CMA-ES, DE, RS and TPE used on
the bbob-mixint suite comes from the experiments of [29]. All are accessible
from the data archives of COCO via the cocopp.archives.bbob and
cocopp.archives.bbob_mixint methods.
4.3 Results
Continuous Problems. Figures 3 and 4 show the ECDF plots comparing the
methods on the different function types and on all functions, respectively in
dimensions 5 and 20 on the continuous suite. Compared with BFGS, CMA-ES,
DE, the adaptive N-M method, NEWUOA, PSO and RS, ORTHOMADS often
performs around the average for medium and high dimensions. For the small
dimensions 2 and 3, it is, however, among the most competitive.
Considering the results aggregated over all functions and all targets as a
function of the number of evaluations, three regimes can be distinguished. The
first one consists of very limited budgets (about 20 × n), where NEWUOA
competes with or outperforms the others. After that, BFGS becomes the best
for an average budget, and CMA-ES outperforms the latter for high evaluation
budgets (above the order of 10² × n), as shown in Figs. 3f and 4f. Performance
under a low budget is an important feature for many applications in which each
function evaluation may last hours or even days.
On multi-modal problems with adequate structure, there is a noticeable gap
between the performance of CMA-ES, which is the best algorithm on this kind of
problem, and the other algorithms, as shown in Figs. 3d and 4d. ORTHOMADS
performs best among the remaining methods and competes with CMA-ES for
low budgets. It is even the best method up to a budget of 10³ × n in 2D and 3D,
while it competes with CMA-ES in higher dimensions for budgets lower than
the order of 10² × n.
RS is often the worst algorithm to use on the considered problems.
Mixed-Integer Problems. Figures 5 and 6 show the ECDF plots comparing
the methods on the different function types and on all functions, respectively
in dimensions 5 and 20 on the mixed-integer suite. The comparisons of NEG
and QUAD with CMA-ES, DE, RS and TPE show an overall advantage of these
ORTHOMADS variants over the other methods. A gap is especially visible on
separable and ill-conditioned problems, respectively depicted in Figs. 5a and 6a
and Figs. 5c and 6c in dimensions 5 and 20, but also on moderately conditioned
problems, as shown in Figs. 5b and 6b in 5D and 20D. On multi-modal problems
with global structure, ORTHOMADS is preferable only in small dimensions:
from 10D its performance deteriorates sharply and CMA-ES and DE seem to be
better choices. On multi-modal weakly structured functions, the advantages of
ORTHOMADS over the others emerge when the dimension increases.
Besides, although the performance of all algorithms decreases with increasing
dimension, ORTHOMADS seems less sensitive to this. For instance, for a budget
of 10² × n, ORTHOMADS reaches 15% more targets than CMA-ES and TPE,
which are the second best algorithms up to this budget; in dimension 20, this
gap increases to 18% for CMA-ES and 25% for TPE.
(a) Separable (b) Moderately conditioned (c) Ill-conditioned
(d) Multi-modal (e) Weakly structured (f) All functions
Fig. 3. ECDF plots: comparison of the two variants ORTHO 2N and ORTHO N + 1 NEG of
ORTHOMADS with BFGS, NEWUOA, adaptive N-M, RS, CMA-ES, DE and PSO
on the bbob problems. Results aggregated on the function types and on all functions
in dimension 5.
(a) Separable (b) Moderately conditioned (c) Ill-conditioned
(d) Multi-modal (e) Weakly structured (f) All functions
Fig. 4. ECDF plots: comparison of the two variants ORTHO 2N and ORTHO N + 1 NEG of
ORTHOMADS with BFGS, NEWUOA, adaptive N-M, RS, CMA-ES, DE and PSO
on the bbob problems. Results aggregated on the function types and on all functions
in dimension 20.
On the overall picture, presented in Figs. 5f and 6f, RS performs poorly.
The budget allocated to TPE, only 10² × n, is much smaller than the ones
allocated to the other methods. Within this limited budget, TPE competes with
CMA-ES in 5D and is better than or competitive with DE in 10D and 20D. The
latter competes with ORTHOMADS after a budget in the order of 10³ × n. Thus,
after 5 × 10³ function evaluations, only DE competes with ORTHOMADS in 5D,
where both methods reach 70% of the function-target pairs. Finally, CMA-ES
competes with ORTHOMADS when the budget approaches 10⁴ × n function
evaluations. Hence, restricted budgets seem to favour the direct local search
method, while expensive budgets favour the evolutionary algorithms CMA-ES
and DE.
(a) Separable (b) Moderately conditioned (c) Ill-conditioned
(d) Multi-modal (e) Weakly structured (f) All functions
Fig. 5. ECDF plots: comparison of the two variants ORTHO N + 1 NEG and ORTHO N + 1
QUAD of ORTHOMADS with RS, CMA-ES, DE and TPE on the bbob-mixint problems.
Results aggregated on the function types and on all functions in dimension 5.
(a) Separable (b) Moderately conditioned (c) Ill-conditioned
(d) Multi-modal (e) Weakly structured (f) All functions
Fig. 6. ECDF plots: comparison of the two variants ORTHO N + 1 NEG and ORTHO N + 1
QUAD of ORTHOMADS with RS, CMA-ES, DE and TPE on the bbob-mixint problems.
Results aggregated on the function types and on all functions in dimension 20.
5 Conclusion
This paper investigates the performance of the different poll direction types
available in ORTHOMADS on continuous and mixed-integer problems from the
literature, in a blackbox context. On these two types of problems, ORTHO N + 1
NEG competes with or outperforms the other variants of the algorithm, whereas
using only 1 or 2 directions is often far from being competitive.
On the continuous functions considered, the best poll direction types identified
are ORTHO N + 1 NEG and ORTHO 2N, especially on multi-modal weakly structured
problems. ORTHOMADS is advantageous in small dimensions and achieves
average results in medium and high dimensions compared to the other algorithms.
It also performs well on multi-modal problems with global structure, where it
competes with CMA-ES for limited budgets.
For very limited budgets, the trust-region method NEWUOA is favourable
on continuous problems, followed by the linesearch method BFGS for a medium
budget and finally the evolutionary algorithm CMA-ES for a high budget.
The results on the mixed-integer suite show that, among the poll direction
types, ORTHO 2N is preferable in small dimensions. Otherwise, ORTHO N + 1 NEG and
ORTHO N + 1 QUAD are among the best direction types. Comparing them to other
methods shows that ORTHOMADS often outperforms the compared algorithms
and seems more resilient to the increase of the dimension. For limited budgets,
ORTHOMADS seems a good choice among the considered algorithms for
solving unconstrained mixed-integer blackbox problems. This is notably interesting
for real-world application problems and, in particular, for the mixed-integer
optimization problems of Stellantis, where the number of allowed blackbox
evaluations is often limited to a few hundred. In the latter case, the variables are
typically the thicknesses of the sheet metals, considered as continuous, and the
materials, which are categorical variables encoded as integers.
Finally, studying the contribution of the search step of ORTHOMADS shows
that disabling it generally leads to deteriorated performance of the algorithm.
Indeed, the default search sequentially executes an N-M search and a quadratic
model search, which enable a global exploration and accelerate the convergence.
However, this effect softens when the dimension increases.
References
1. Abramson, M.A., Audet, C., Dennis, J.E., Jr., Le Digabel, S.: OrthoMADS: a
deterministic MADS instance with orthogonal directions. SIAM J. Optim. 20(2),
948–966 (2009). https://doi.org/10.1137/080716980
2. Amaran, S., Sahinidis, N.V., Sharda, B., Bury, S.J.: Simulation optimization: a
review of algorithms and applications. 4OR 12(4), 301–333 (2014). https://doi.
org/10.1007/s10288-014-0275-2
3. Audet, C., Béchard, V., Chaouki, J.: Spent potliner treatment process optimization
using a MADS algorithm. Optim. Eng. 9(2), 143–160 (2008). https://doi.org/10.
1007/s11081-007-9030-2
4. Audet, C., Dennis, J.E., Jr.: Mesh adaptive direct search algorithms for constrained
optimization. SIAM J. Optim. 17(1), 188–217 (2006). https://doi.org/10.1137/
040603371
5. Audet, C., Dennis, J.E., Jr.: A progressive barrier for derivative-free nonlinear
programming. SIAM J. Optim. 20(1), 445–472 (2009). https://doi.org/10.1137/
070692662
6. Audet, C., Hare, W.: Derivative-Free and Blackbox Optimization. SSORFE,
Springer, Cham (2017). https://doi.org/10.1007/978-3-319-68913-5
7. Audet, C., Le Digabel, S., Tribes, C.: NOMAD user guide. Technical Report
G-2009-37, Les cahiers du GERAD (2009). https://www.gerad.ca/nomad/Downloads/user_guide.pdf
8. Bergstra, J., Bardenet, R., Bengio, Y., Kégl, B.: Algorithms for hyper-
parameter optimization. In: Advances in Neural Information Processing
Systems, vol. 24 (2011). https://proceedings.neurips.cc/paper/2011/file/
86e8f7ab32cfd12577bc2619bc635690-Paper.pdf
9. Brockhoff, D., Hansen, N.: The impact of sample volume in random search on the
bbob test suite. In: Proceedings of the Genetic and Evolutionary Computation
Conference Companion, GECCO 2019, pp. 1912–1919. Association for Computing
Machinery, New York (2019). https://doi.org/10.1145/3319619.3326894
10. Brooks, S.H.: A discussion of random methods for seeking maxima. Oper. Res.
6(2), 244–251 (1958). https://doi.org/10.1287/opre.6.2.244
11. Eberhart, R., Kennedy, J.: A new optimizer using particle swarm theory. In:
MHS 1995. Proceedings of the Sixth International Symposium on Micro Machine
and Human Science, pp. 39–43. IEEE (1995). https://doi.org/10.1109/MHS.1995.
494215
12. El-Abd, M., Kamel, M.S.: Black-box optimization benchmarking for noiseless func-
tion testbed using particle swarm optimization. In: Proceedings of the 11th Annual
Conference Companion on Genetic and Evolutionary Computation Conference:
Late Breaking Papers, GECCO 2009, pp. 2269–2274. Association for Computing
Machinery, New York (2009). https://doi.org/10.1145/1570256.1570316
13. Finck, S., Hansen, N., Ros, R., Auger, A.: Real-parameter black-box optimiza-
tion benchmarking 2009: presentation of the noiseless functions. Technical Report
2009/20, Research Center PPE (2009)
14. Gao, F., Han, L.: Implementing the Nelder-Mead simplex algorithm with adaptive
parameters. Comp. Optim. Appl. 51, 259–277 (2012). https://doi.org/10.1007/
s10589-010-9329-3
15. Hansen, N.: A global surrogate assisted CMA-ES. In: Proceedings of the Genetic
and Evolutionary Computation Conference, GECCO 2019, pp. 664–672. Asso-
ciation for Computing Machinery, New York (2019). https://doi.org/10.1145/
3321707.3321842
16. Hansen, N., Auger, A.: Principled design of continuous stochastic search: from
theory to practice. In: Borenstein, Y., Moraglio, A. (eds.) Theory and Principled
Methods for the Design of Metaheuristics. NCS, pp. 145–180. Springer, Heidelberg
(2014). https://doi.org/10.1007/978-3-642-33206-7_8
17. Hansen, N., Auger, A., Ros, R., Mersmann, O., Tušar, T., Brockhoff, D.: COCO: a
platform for comparing continuous optimizers in a black-box setting. Optim. Meth.
Softw. 36(1), 114–144 (2021). https://doi.org/10.1080/10556788.2020.1808977
18. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of the
IEEE International Conference on Neural Networks, vol. 4, pp. 1942–1948. Citeseer
(1995)
19. Le Digabel, S.: Algorithm 909: NOMAD: nonlinear optimization with the MADS
algorithm. ACM Trans. Math. Softw. 37(4), 44:1–44:15 (2011). https://doi.org/
10.1145/1916461.1916468
20. Moré, J.J., Wild, S.M.: Benchmarking derivative-free optimization algorithms.
SIAM J. Optim. 20(1), 172–191 (2009). https://doi.org/10.1137/080724083
21. Nelder, J.A., Mead, R.: A simplex method for function minimization. Comput. J.
7(4), 308–313 (1965). https://doi.org/10.1093/comjnl/7.4.308
22. Nocedal, J., Wright, S.J.: Numerical Optimization. Springer, New York (2006).
https://doi.org/10.1007/978-0-387-40065-5
23. Powell, M.J.D.: The NEWUOA software for unconstrained optimization without
derivatives. In: Di Pillo, G., Roma, M. (eds.) Large-Scale Nonlinear Optimization,
pp. 255–297. Springer, Boston (2006). https://doi.org/10.1007/0-387-30065-1_16
24. Regis, R.G.: Constrained optimization by radial basis function interpolation for
high-dimensional expensive black-box problems with infeasible initial points. Eng.
Optim. 46(2), 218–243 (2014). https://doi.org/10.1080/0305215X.2013.765000
25. Rios, L.M., Sahinidis, N.V.: Derivative-free optimization: a review of algorithms
and comparison of software implementations. J. Glob. Optim. 56(3), 1247–1293
(2013). https://doi.org/10.1007/s10898-012-9951-y
26. Ros, R.: Benchmarking the NEWUOA on the BBOB-2009 function testbed. In:
Proceedings of the 11th Annual Conference Companion on Genetic and Evolution-
ary Computation Conference: Late Breaking Papers, GECCO 2009, pp. 2421–2428.
Association for Computing Machinery, New York (2009). https://doi.org/10.1145/
1570256.1570338
27. Storn, R., Price, K.: Differential evolution-A simple and efficient heuristic for
global optimization over continuous spaces. J. Glob. Optim. 11(4), 341–359 (1997).
https://doi.org/10.1023/A:1008202821328
28. Torczon, V.: On the convergence of pattern search algorithms. SIAM J. Optim.
7(1), 1–25 (1997). https://doi.org/10.1137/S1052623493250780
29. Tušar, T., Brockhoff, D., Hansen, N.: Mixed-integer benchmark problems for single-
and bi-objective optimization. In: Proceedings of the Genetic and Evolutionary
Computation Conference, GECCO 2019, pp. 718–726. Association for Computing
Machinery, New York (2019). https://doi.org/10.1145/3321707.3321868
30. Varelas, K., et al.: A comparative study of large-scale variants of CMA-ES. In:
Auger, A., Fonseca, C.M., Lourenço, N., Machado, P., Paquete, L., Whitley, D.
(eds.) PPSN 2018. LNCS, vol. 11101, pp. 3–15. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-99253-2_1
31. Varelas, K., Dahito, M.A.: Benchmarking multivariate solvers of SciPy on the
noiseless testbed. In: Proceedings of the Genetic and Evolutionary Computation
Conference, GECCO 2019, pp. 1946–1954. Association for Computing Machinery,
New York (2019). https://doi.org/10.1145/3319619.3326891
A Look-Ahead Based Meta-heuristics
for Optimizing Continuous Optimization
Problems
Thomas Nordli and Noureddine Bouhmala(B)
University of South-Eastern Norway, Kongsberg, Norway
{thomas.nordli,noureddine.bouhmala}@usn.no
http://www.usn.no
Abstract. In this paper, the well-known Kernighan-Lin algorithm is adjusted and embedded into the simulated annealing algorithm and the genetic algorithm for continuous optimization problems. The performance of the different algorithms is evaluated using a set of well-known optimization test functions.
Keywords: Continuous optimization problems · Simulated annealing ·
Genetic algorithm · Local search
1 Introduction
Several types of meta-heuristic methods have been designed for solving continuous optimization problems. Examples include genetic algorithms [8,9], artificial immune systems [7], and tabu search [5]. Meta-heuristics can be divided into two different classes. The first class refers to single-solution search algorithms. A notable example that belongs to this class is the popular simulated annealing algorithm (SA) [12], which is a random search that avoids getting stuck in local minima. In addition to solutions corresponding to an improvement in objective function value, SA also accepts those corresponding to a worse objective function value, using a probabilistic acceptance strategy.
The second class refers to population-based algorithms. Algorithms of this class apply the principle of survival of the fittest to a population of potential solutions, iteratively improving the population. During each generation, pairs of solutions (individuals) are selected and recombined to breed a new generation using operators borrowed from natural genetics. This process is repeated until a stopping criterion has been reached. The genetic algorithm is one of many algorithms that belong to this class. The papers [1,2,6] provide a review of the literature covering the use of evolutionary algorithms for solving continuous optimization problems. In spite of the advantages that meta-heuristics offer, they still suffer from the phenomenon of premature convergence.
Recently, several studies have combined meta-heuristics with local search methods, resulting in more efficient methods with relatively faster convergence compared to pure meta-heuristics. Such hybrid approaches offer a balance between diversification (covering more regions of the search space) and intensification (finding better solutions within those regions). The reader may refer to [3,14] for further reading on hybrid optimization methods.
This paper introduces a hybridization of the genetic algorithm and simulated annealing with the variable depth search (VDS) Kernighan-Lin (KL) algorithm, which was first presented for the graph partitioning problem in [11]. Compared to simple local search methods, KL allows steps that worsen the quality of the solution in the short term, as long as they lead to an improvement in the longer run.
In this work, the search for one favorable move in SA, and for one two-point crossover in GA, is replaced by a search for a favorable sequence of moves in SA and a series of two-point crossovers in GA, using the objective function to guide the search.
The rest of this paper is organized as follows. Section 2 describes the combined simulated annealing and KL heuristic, while Sect. 3 explains the hybridization of KL with the genetic algorithm. Section 4 lists the functions used in the benchmark, while Sect. 5 shows the experimental results. Finally, Sect. 6 concludes the paper with some future work.
2 Combining Simulated Annealing with Local Search
Previously, SA in combination with KL was applied to the Max-SAT problem in [4]. SA iteratively improves a solution by making random perturbations (moves) to the current solution, thereby exploring the neighborhood in the space of possible solutions. It uses a parameter called temperature to control the decision whether to accept bad moves or not. A bad move is a move that decreases the value of the objective function.
The algorithm starts with a high temperature, at which almost all moves are accepted. As the iterations proceed, the temperature decreases and the algorithm becomes more selective, giving higher preference to better solutions.
Assume an objective function f is to be maximized. The algorithm starts by computing the initial temperature T, using a procedure similar to the one described in [12]. The temperature is computed such that the probability of accepting a bad move is approximately equal to a given acceptance probability Pr. First, a low value of T is chosen as the initial temperature. This temperature is used during a number of moves. If the ratio of accepted bad moves is less than Pr, the temperature is multiplied by two. This continues until the observed acceptance ratio exceeds Pr. A random starting solution is generated and its objective value is calculated. An iteration of the algorithm starts by performing a series of so-called KL perturbations or moves to the solution $S_{old}$, leading to a new solution $S^i_{new}$, where $i$ denotes the number of consecutive moves. The change in the objective function, called the gain, is computed for each move. The goal of the KL perturbation is to generate a sequence of objective function scores together with their corresponding moves. KL is considered to have converged if the scores of five consecutive moves are bad moves. The subset of moves having the best cumulative score $BCS^k_{(SA+KL)}$ is then identified. The identification of this subset is equivalent to choosing $k$ so that $BCS^k_{(SA+KL)}$ in Eq. 1 is maximal,

$$BCS^k_{(SA+KL)} = \sum_{i=1}^{k} gain(S^i_{new}) \quad (1)$$

where $i$ represents the $i$-th move performed, $k$ the number of moves, and $gain(S^i_{new}) = f(S^i_{new}) - f(S^{i-1}_{new})$ denotes the resulting change of the objective function when the $i$-th move has been performed. If $BCS^k_{(SA+KL)} > 0$, the solution is updated by taking all the perturbations up to index $k$, and the best solution found is always recorded. If $BCS^k_{(SA+KL)} \le 0$, the simulated annealing acceptance test is applied only to the change produced by the first perturbation. A number from the interval (0,1) is drawn by a random number generator, and the move is accepted if the drawn number is less than $\exp(-\delta f / T)$, where $\delta f$ is the magnitude of the change in the objective value. The process of proposing a series of perturbations and selecting the best subset of moves is repeated for a number of iterations before the temperature is updated. A temperature reduction function is used to lower the temperature; the update uses geometric cooling, as shown in Eq. 2:

$$T_{new} = \alpha \times T_{old}, \quad \text{where } \alpha = 0.9. \quad (2)$$
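To make the move-selection step concrete, the sketch below shows one possible implementation of the KL-style look-ahead used inside SA: a sequence of random perturbations is proposed, the gain of each is recorded, and the prefix with the best cumulative gain is kept, falling back to the Metropolis test on the first move otherwise. This is a minimal illustration under our own assumptions, not the authors' code; the perturbation operator and names such as `perturb` and `kl_look_ahead_step` are ours, while the five-bad-moves stopping rule and the cooling schedule follow the description above.

```python
import math
import random

def perturb(x, step=0.1):
    """Random move: jitter one coordinate (illustrative choice of neighborhood)."""
    y = list(x)
    j = random.randrange(len(y))
    y[j] += random.uniform(-step, step)
    return y

def kl_look_ahead_step(x, f, T, max_moves=20, bad_streak=5):
    """One SA-KL iteration for maximizing f, following Eqs. (1)-(2) in spirit."""
    seq, gains, bad = [x], [], 0
    while len(gains) < max_moves and bad < bad_streak:
        cand = perturb(seq[-1])
        g = f(cand) - f(seq[-1])          # gain of this move
        seq.append(cand)
        gains.append(g)
        bad = bad + 1 if g <= 0 else 0    # stop after five consecutive bad moves
    # best cumulative score over prefixes (Eq. 1)
    best_k, best_score, running = 0, float("-inf"), 0.0
    for k, g in enumerate(gains, start=1):
        running += g
        if running > best_score:
            best_k, best_score = k, running
    if best_score > 0:
        return seq[best_k]                # accept the whole favorable prefix
    # otherwise: Metropolis test on the first move only
    delta = gains[0] if gains else 0.0
    if delta > 0 or random.random() < math.exp(delta / T):
        return seq[1] if len(seq) > 1 else x
    return x
```

A full SA-KL run would wrap this step in the usual loop: repeat it at a fixed temperature until equilibrium is reached, then apply Eq. 2 and continue until the stopping criterion is met.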
3 Combining Genetic Algorithm with Local Search
The genetic algorithm (GA) belongs to the group of evolutionary algorithms. It works on a set of solutions called a population. Each of its members, called chromosomes or individuals, is given a score (fitness) that allows its quality to be assessed. The individuals of the initial population are in most cases generated randomly. A reproduction operator selects individuals as parents and generates offspring by combining information from the parent chromosomes. The new population may be subject to a mutation operator that introduces diversity into the population. A selection scheme is then used to update the population, resulting in a new generation. This is repeated until convergence is reached, giving an optimal or near-optimal solution.
The simple GA as described in [8] is used here. It starts by generating an initial population represented by floating-point numbers. Solutions are temporarily converted to integers when bit manipulation is needed, and the resulting integers are converted back to the floating-point representation for storage. A roulette-wheel function is used for selection. The implementation is based on the one described in Section IV of [10], where more details can be found.
The purpose of KL-Crossover is to apply the crossover operator a number of times, generating a sequence of fitness function scores together with their corresponding crossovers. Thereafter, the subset of consecutive crossovers having the best cumulative score $BCS^k_{(GA+KL)}$ is determined. The identification of this subset is the same as described for SA-KL: GA-KL chooses $k$ so that $BCS^k_{(GA+KL)}$ in Eq. 4 is maximal, where $CR_i$ represents the $i$-th crossover performed on two individuals $I_l$ and $I_m$, $k$ the number of allowed crossovers, and $gain(I_l, I_m)_{CR_i}$ denotes the resulting change in the fitness function when the $i$-th crossover $CR_i$ has been performed, calculated as shown in Eq. 3:

$$gain(I_l, I_m)_{CR_i} = f(I_l, I_m)_{CR_i} - f(I_l, I_m)_{CR_{i-1}}, \quad (3)$$

where $CR_0$ refers to the chosen pair of parents before applying the crossover operator. KL-Crossover is considered to have converged if the gain of five consecutive crossovers is negative. Finally, the individuals that are best fit are allowed to move to the next generation, while the other half is removed and a new round is performed.

$$BCS^k_{(GA+KL)} = \max_k \left\{ \sum_{i=1}^{k} gain(I_l, I_m)_{CR_i} \right\}. \quad (4)$$
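The crossover look-ahead can be sketched in the same way as the SA variant: starting from a pair of parents, two-point crossovers are chained, the fitness gain of each is recorded, and the offspring pair after the prefix with the best cumulative gain is kept. This is an illustrative sketch under our own naming (`two_point_crossover`, `kl_crossover`); the encoding details (float-to-integer conversion for bit manipulation) described above are abstracted away, and `fitness` is assumed to map a parent pair to a scalar score.

```python
import random

def two_point_crossover(a, b):
    """Two-point crossover on two equal-length sequences of genes."""
    n = len(a)
    i, j = sorted(random.sample(range(n), 2))
    child_a = a[:i] + b[i:j] + a[j:]
    child_b = b[:i] + a[i:j] + b[j:]
    return child_a, child_b

def kl_crossover(parents, fitness, max_crossovers=10, bad_streak=5):
    """Chain crossovers and keep the prefix with the best cumulative fitness gain (Eq. 4)."""
    history = [parents]      # parents is a (individual_l, individual_m) pair
    gains, bad = [], 0
    while len(gains) < max_crossovers and bad < bad_streak:
        new_pair = two_point_crossover(*history[-1])
        g = fitness(new_pair) - fitness(history[-1])
        history.append(new_pair)
        gains.append(g)
        bad = bad + 1 if g < 0 else 0
    best_k, best_score, running = 0, 0.0, 0.0
    for k, g in enumerate(gains, start=1):
        running += g
        if running > best_score:
            best_k, best_score = k, running
    return history[best_k]   # best_k == 0 returns the original parents unchanged
```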
4 Benchmark Functions and Parameter Setting
The stopping criterion for all four algorithms (SA, SA-Look-Ahead, GA, GA-Look-Ahead) is considered reached if the best solution has not been improved during 100 consecutive iterations for (SA, SA-Look-Ahead) or 10 generations for (GA, GA-Look-Ahead). The starting temperature for (SA, SA-Look-Ahead) is set to 0.8 (i.e., a bad move is accepted with a probability of 80%). In the inner loop of (SA, SA-Look-Ahead), equilibrium is considered reached if fewer than 10% of the proposed moves are accepted.
The following ten benchmark functions, retrieved from [13], were tested; a few reference implementations are sketched after the list.
1: Drop-Wave: $f(x, y) = -\dfrac{1 + \cos\!\left(12\sqrt{x^2 + y^2}\right)}{\tfrac{1}{2}(x^2 + y^2) + 2}$
2: Griewank: $f(\mathbf{x}) = \dfrac{1}{4000}\sum_{i=1}^{n} x_i^2 - \prod_{i=1}^{n} \cos\!\left(\dfrac{x_i}{\sqrt{i}}\right) + 1$
3: Levy function: $f(x, y) = \sin^2(3\pi x) + (x - 1)^2\left[1 + \sin^2(3\pi y)\right] + (y - 1)^2\left[1 + \sin^2(2\pi y)\right]$
4: Rastrigin: $f(\mathbf{x}) = 10n + \sum_{i=1}^{n}\left(x_i^2 - 10\cos(2\pi x_i)\right)$
5: Sphere function: $f(\mathbf{x}) = \sum_{i=1}^{n} x_i^2$
6: Weighted sphere function: $f(x, y) = x^2 + 2y^2$
7: Sum of different power functions: $f(\mathbf{x}) = \sum_{i=1}^{n} |x_i|^{\,i+1}$
8: Rotated hyper-ellipsoid: $f(\mathbf{x}) = \sum_{i=1}^{n}\sum_{j=1}^{i} x_j^2$
9: Rosenbrock's valley: $f(\mathbf{x}) = \sum_{i=1}^{n-1}\left[100(x_{i+1} - x_i^2)^2 + (1 - x_i)^2\right]$
10: Three-hump camel function: $f(x, y) = 2x^2 - 1.05x^4 + \dfrac{x^6}{6} + xy + y^2$
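For reference, a few of these test functions are easy to implement directly; the snippet below gives possible NumPy implementations of the Rastrigin, Rosenbrock, and Drop-Wave functions following the formulas above (our own code, not the authors').

```python
import numpy as np

def rastrigin(x):
    x = np.asarray(x, dtype=float)
    return 10 * x.size + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

def rosenbrock(x):
    x = np.asarray(x, dtype=float)
    return np.sum(100 * (x[1:] - x[:-1]**2)**2 + (1 - x[:-1])**2)

def drop_wave(x, y):
    r2 = x**2 + y**2
    return -(1 + np.cos(12 * np.sqrt(r2))) / (0.5 * r2 + 2)
```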
Fig. 1. Functions 1–6: GA vs. GA-KL
5 Experimental Results
The results are visualized in Figs. 1, 2 and 3, where both the mean and the best solution over 100 runs are plotted. The X-axis shows the number of generations (for GA and GA-KL, Figs. 1 and 2) or iterations (for SA and SA-KL, Fig. 3), while the Y-axis gives the absolute error (i.e., the deviation from the optimal solution).
Figures 1 and 2 compare GA against GA-KL (GA-Look-Ahead). Looking at the mean solution, GA delivers on average a lower absolute error in 9 cases out of 10. The average percentage error reduction in favor of GA is 5% for function 1, 12% for function 2, 43% for function 4, within 1% for functions 5, 6, 7 and 8, 14% for function 9, and finally 2% for function 10. Function 3 was the only test case where GA-Look-Ahead wins, with an average percentage error reduction of 26%. On the other hand, comparing the curves representing the evolution of the fittest individual (best solution) produced by the two algorithms, GA-Look-Ahead is capable of reaching solutions of higher precision than GA ($10^{-8}$ versus $10^{-6}$ for function 2, $10^{-11}$ versus $10^{-6}$ for function 3, $10^{-19}$ versus $10^{-10}$ for function 5, $10^{-15}$ versus $10^{-9}$ for function 6, $10^{-21}$ versus $10^{-12}$ for function 7, $10^{-16}$ versus $10^{-6}$ for function 8, and $10^{-14}$ versus $10^{-10}$ for function 10). The diversity of the population produced by GA-Look-Ahead prevents GA from falling into premature convergence, letting it continue for more generations before convergence is reached. GA-Look-Ahead performed 69% more generations than GA for function 3, 82% for function 5, 57% for function 6, and 25% for function 10.
Fig. 2. Functions 7–10: GA vs. GA-KL
Figure 3 shows the results for SA and SA-KL (SA-Look-Ahead). Looking at the evolution of the mean, both methods produce results of similar quality, with SA showing an insignificantly lower absolute error in a few cases. However, when comparing the best solutions, SA-Look-Ahead delivers solutions of better precision than SA: $10^{-10}$ versus $10^{-8}$ for function 1, $10^{-11}$ versus $10^{-6}$ for function 4, $10^{-15}$ versus $10^{-9}$ for function 7, and $10^{-7}$ versus $10^{-4}$ for function 9.
Fig. 3. Functions 1, 4, 7, 9: SA vs. SA-KL
6 Conclusion
Meta-heuristic algorithms have become widely popular for solving global optimization problems. In this paper, both SA and GA have been combined for the first time with the popular Kernighan-Lin heuristic used for the graph partitioning problem. The main idea is to replace the search for one possible perturbation in SA, or one typical crossover in GA, by a search for a favorable sequence of perturbations or crossovers, using the objective function to be optimized to guide the search. The results presented in this paper show that the proposed scheme enables both SA and GA to reach solutions of higher accuracy, which is the main result of the present study. In addition, the proposed scheme enables GA to maintain the diversity of the population for a longer period, preventing GA from reaching premature convergence, which remains a serious defect of GA when applied to both continuous and discrete optimization. In future work, we believe that the performance of the strategy could be further improved by introducing an acceptance probability at the level of the Kernighan-Lin heuristic whenever a bad move is to be selected, thereby controlling the type of perturbations that are processed.
References
1. Ardia, D., Boudt, K., Carl, P., Mullen, K., Peterson, B.G.: Differential evolution
with DEoptim: an application to non-convex portfolio optimization. R J. 3(1),
27–34 (2011)
2. Ardia, D., David, J., Arango, O., Gómez, N.D.G.: Jump-diffusion calibration using
differential evolution. Wilmott 2011(55), 76–79 (2011)
3. Arun, N., Ravi, V.: ACONM: A Hybrid of Ant Colony Optimization and Nelder-
Mead Simplex Search. Institute for Development and Research in Banking Tech-
nology (IDRBT), India (2009)
4. Bouhmala, N.: Combining simulated annealing with local search heuristic for MAX-
SAT. J. Heuristics 25(1), 47–69 (2019). https://doi.org/10.1007/s10732-018-9386-
9
5. Chelouah, R., Siarry, P.: Tabu search applied to global optimization. Eur. J. Oper.
Res. 123(2), 256–270 (2000)
6. Chelouah, R., Siarry, P.: Genetic and Nelder-Mead algorithms hybridized for a more accurate global optimization of continuous multiminima functions. Eur. J. Oper. Res. 148(2), 335–348 (2003)
7. De Castro, L.N., Von Zuben, F.J.: Learning and optimization using the clonal
selection principle. IEEE Trans. Evol. Comput. 6(3), 239–251 (2002)
8. Goldberg, D.E.: Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley, Reading (1989)
9. Holland, J.H.: Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor (1975)
10. Jensen, B., Bouhmala, N., Nordli, T.: A novel tangent based framework for opti-
mizing continuous functions. J. Emerg. Trends Comput. Inf. Sci. 4(2), 239–247
(2013)
11. Kernighan, B.W., Lin, S.: An efficient heuristic procedure for partitioning graphs.
Bell Syst. Tech. J. 49(2), 291–307 (1970)
12. Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing.
Science 220(4598), 671–680 (1983)
13. Surjanovic, S., Bingham, D.: Virtual library of simulation experiments: test functions and datasets. http://www.sfu.ca/~ssurjano
14. Tank, M.: An ant colony optimization and Nelder-Mead simplex search hybrid algorithm for unconstrained optimization (2009)
Inverse Optimization for Warehouse
Management
Hannu Rummukainen(B)
VTT Technical Research Centre of Finland, P.O. Box 1000, 02044 Espoo, Finland
Hannu.Rummukainen@vtt.fi
Abstract. Day-to-day operations in industry are often planned in an
ad-hoc manner by managers, instead of being automated with the aid of
mathematical optimization. To develop operational optimization tools,
it would be useful to automatically learn management policies from data
about the actual decisions made in production. The goal of this study
was to investigate the suitability of inverse optimization for automating
warehouse management on the basis of demonstration data. The man-
agement decisions concerned the location assignment of incoming pack-
ages, considering transport mode, classification of goods, and congestion
in warehouse stocking and picking activities. A mixed-integer optimiza-
tion model and a column generation procedure were formulated, and an
inverse optimization method was applied to estimate an objective func-
tion from demonstration data. The estimated objective function was used
in a practical rolling horizon procedure. The method was implemented
and tested on real-world data from an export goods warehouse of a con-
tainer port. The computational experiments indicated that the inverse
optimization method, combined with the rolling horizon procedure, was
able to mimic the demonstrated policy at a coarse level on the train-
ing data set and on a separate test data set, but there were substantial
differences in the details of the location assignment decisions.
Keywords: Multi-period storage location assignment problem ·
Class-based storage · Inverse optimization · Mixed-integer linear
programming
1 Introduction
Many industrial management and planning activities involve effective use of
available resources, and mathematical optimization algorithms could in princi-
ple be applied to make better decisions. However, in practice such activities are
often planned manually, using computers only to track and communicate infor-
mation without substantially automating the decision making process itself. The
reasons for the limited use of optimization include the expense of customising
optimization models, algorithms and software systems for individual business
needs, and the complexity of real-world business processes and operating envi-
ronments. The larger the variety of issues one needs to consider in the planning,
the more complex an optimization model is needed, which can make it very
challenging to find a satisfactory solution in reasonable time, whether by generic
optimization algorithms or case-specific customized heuristics. Moreover, the
dynamic nature of business often necessitates regular updates to any decision
support tools as business processes and objectives change.
To make it easier to develop industrial decision support tools, one potential
approach is to apply machine learning methods to data from existing planning
processes. The data on actual decisions can be assumed to indicate the pref-
erences of competent decision makers, who have considered the effects of all
relevant factors and operational constraints in each decision. If a general control
policy can be derived from example decisions, the policy can then be replicated
in an automated decision support tool to reduce the manual planning effort,
or even completely automate the planning process. There would be no need to
explicitly model the preferences of the operational management at a particular
company. Ideally the control policy learned from data is generalisable to new
planning situations that differ from the training data, and can also be explained
in an understandable manner to business managers. In addition to implement-
ing tools for operational management, machine learning of management policies
could also be a useful tool when developing business simulation models for strate-
gic decision support.
The goal of this case study was to investigate whether inverse optimization
would be a suitable machine learning method to automate warehouse manage-
ment on the basis of demonstration data. In the study, inverse optimization
was applied to the problem of storage location assignment, i.e. where to place
incoming packages in a large warehouse. A mixed-integer model and a column
generation procedure were formulated for dynamic class-based storage location
assignment, with explicit consideration for the arrival and departure points of
goods, and possible congestion in warehouse stocking and picking activities.
The main contributions of the study are 1) combining the inverse optimiza-
tion approach of Ahuja and Orlin [1] with a cutting plane algorithm in the
first known application of inverse optimization to a storage location assignment
problem, and 2) computational experiments on applying the estimated objective
function in a practical rolling horizon procedure with real-world data.
2 Related Work
A large variety of analytical approaches have been published for warehouse
design and management [6,8,12,21]. Hausman et al. [10] presented an early
analysis of basic storage location assignment methods including random stor-
age, turnover-based storage in which highest-turnover products are assigned to
locations with the shortest transport distance, and class-based storage in which
products are grouped into classes that are associated with areas of the warehouse.
The location allocation decisions of class-based storage policies may be based on
simple rules such as turnover-based ordering, or an explicit optimization model
as in [18].
Goetschalckx and Ratliff [7] showed that information about storage durations
of arriving units can be used in a duration-of-stay-based (DOS-based) shared
storage policy that is more efficient than an optimal static dedicated (class-
based or product-based) storage policy. Kim and Park [13] used a subgradient
optimization algorithm for location allocation, assuming full knowledge of stor-
age times and amounts over the planning interval. Chen et al. [5] presented a
mixed-integer model and heuristic algorithms to minimize the peak transport
load of a DOS-based policy. In the present study, more complex DOS-based
policies are considered, with both dynamic class-based location assignment and
congestion costs.
Updating class-based assignments regularly was proposed by Pierre et al. [20],
and further studied by Kofler et al. [14], who called the problem multi-period
storage location assignment. Their problem was quite similar to the present
study, but both Pierre et al. and Kofler et al. applied heuristic rules without an
explicit mathematical programming model, and also included reshuffling moves.
The problem of yard planning in container terminals is closely related to warehouse management, and a number of detailed optimization models have been published [17,24]. Congestion constraints for lanes in the yard were already considered by Lee et al. [9,15]. Moccia et al. [16] considered storage location
assignment as a dynamic generalized assignment problem, in which goods could
be relocated over time, but each batch of goods would have to be treated as an
unsplittable unit. The present study does not address relocation, but is differ-
entiated by class-based assignment, as well as the use of inverse optimization.
Pang and Chan [19] applied a data mining algorithm as a component of
a dynamic storage location assignment algorithm, but they did not use any
external demonstration policy in their approach.
Learning to replicate a control policy demonstrated by an expert has been
studied by several computational methods, in particular as inverse optimiza-
tion [1,3,22] and inverse reinforcement learning [2]. Compared to more generic
supervised machine learning methods, inverse optimization and reinforcement
learning methods have the advantage that they can take into account the future
consequences of present-time actions. The author is not aware of any prior work
in applying inverse optimization or inverse reinforcement learning to warehouse
management problems.
In a rolling horizon procedure, the plan is reoptimized regularly so that con-
secutive planning periods overlap. Troutt et al. [23] considered the problem of
estimating the objective function of a rolling horizon optimization procedure
from demonstrated decisions. They assumed that the initial state for each plan-
ning period could be fixed according to the demonstrated decisions, and then
minimized the sum of decisional regret over the planning periods: the decisional
regret of a planning period was defined as a function of objective function coeffi-
cients, and equals the gap between the optimal objective value and the objective
value of the demonstrated decisions over the planning period. In the present
study, the inverse optimization was performed on a model covering the full
demonstration period, and the rolling horizon procedure was not explicitly con-
sidered in the inverse optimization at all.
3 Storage Location Assignment Model
Fig. 1. Schematics of the warehouse model structure. Left: small-scale example with
all connections shown. Right: larger example, with cut down connections for clarity.
Goods flow from arrival nodes (solid circles) to storage locations (outlined boxes) and
later on to departure nodes (open circles). Locations are grouped into areas (shaded)
for congestion modelling.
The basic structure of the storage location assignment model is illustrated in
Fig. 1. The model covers a number of discrete storage locations with limited
capacity. Goods arrive to storage in discrete packages from one or more arrival
nodes, and leave through departure nodes: the nodes may represent e.g. load-
ing/unloading points of different transport modes, or manufacturing cells at a
factory. The storage locations are grouped into areas for the purpose of mod-
elling congestion: simultaneous stocking and picking operations on the same area
may incur extra cost. In a typical warehouse layout an area would represent a
single aisle, as illustrated on the right in Fig. 1.
Goods are divided into distinct classes, and one location can only store one
class of packages simultaneously. For each arriving package, the decision maker
is assumed to know a probability distribution on arrival nodes and departure
nodes, as well as the expected storage duration in discrete time units. The deci-
sion maker must choose a storage location for each package. The parametric
objective function is based on distance parameters between arrival and depar-
ture nodes and locations, per-area congestion levels per time step, and cost of
storage location allocation. Each location has a fixed capacity, representing avail-
able storage volume. Packages are assumed to stay in one location for their entire
stay, i.e. relocation actions are not considered.
In this study, a mixed-integer linear programming model was developed for
storage location assignment over a set of discrete time steps T. The packages to
be stored over the planning period were assumed to be known deterministically
in advance. Although the model was intended to be applicable in a relatively
short-term rolling-horizon procedure, in the inverse optimization method a linear
relaxation of the model was used on historical data over a longer period.
3.1 Definitions
Let $K$ be the index set of packages to be stored. There are $q_k$ capacity units of packages of kind $k \in K$, stored over the time interval $T_k \subseteq T$. The begin and end times of the interval $T_k$ together form the set $T'_k \subseteq T_k$. Denoting the set of arrival and departure nodes by $N$, packages of kind $k \in K$ are assumed to arrive via node $n \in N$ with probability $\pi_{nk}$ and depart via node $n$ with probability $\pi_{kn}$. The amount $q_k$ is split proportionally between the nodes, so that the amounts of arrivals and departures via $n \in N$ are given by $\pi_{nk} q_k$ and $\pi_{kn} q_k$; effectively, the optimization objective is the expected cost of the location assignment.
The set of storage locations $L$ is grouped into distinct areas $A$, which are identified with subsets of $L$. Location $l \in L$ can hold $C_l$ capacity units. In order to define piecewise linear area congestion costs, flows to/from areas are specified in $R$ pieces of $D$ capacity units, for a total maximum flow of $RD$ capacity units.
For the purposes of class-based location assignment, the package kinds $K$ are grouped into distinct classes $B$, identified with subsets of $K$. At each time step, the subset of locations occupied by packages of one class is called a pattern. Let $P$ denote the set of patterns, i.e. feasible subsets of $L$; although the set $P$ is in principle exponential in the number of locations, it is constructed incrementally in a column generation procedure.
In rolling horizon planning, the warehouse already contains initial stock at the beginning of the planning period. Let $Q_{tl}$ denote the amount of initial stock remaining at time $t \in T$ in location $l \in L$, and $Q'_{ta}$ the amount of initial stock transported at time $t \in T$ from area $a \in A$.
3.2 Location Assignment Model
The location assignment mixed-integer program includes the following decision variables:
$x_{kl} \ge 0$: fraction of kind $k \in K$ assigned to location $l \in L$,
$z_{bpt} \in \{0, 1\}$: assignment of class $b \in B$ to location pattern $p \in P$ at time $t \in T$,
$y_{tar} \ge 0$: flow at time $t \in T$ from/to area $a \in A$ at discretisation level $r \in \{1, \ldots, R\}$.
The corresponding objective function coefficients are independent of time:
$c_{kl}$: cost of assigning packages of kind $k \in K$ to location $l \in L$,
$c_{bp}$: cost per time step of using pattern $p \in P$ for class $b \in B$,
$c_{ar}$: transfer cost from/to area $a \in A$ at flow discretisation level $r \in \{1, \ldots, R\}$.
The piecewise linear flow costs must be convex so that the model can be formulated without any additional integer variables. Specifically, the $c_{ar}$ coefficients must be increasing in $r$, that is $c_{ar} \le c_{ar'}$ for all $a \in A$ and $r < r'$.
The location assignment mixed-integer program is:

$$\min \sum_{b \in B,\, p \in P,\, t \in T} c_{bp} z_{bpt} + \sum_{k \in K,\, l \in L} c_{kl} x_{kl} + \sum_{t \in T,\, a \in A,\, r = 1, \ldots, R} c_{ar} y_{tar} \quad (1)$$

$$\sum_{p \in P} z_{bpt} = 1 \quad \forall b \in B,\ t \in T \quad (2)$$

$$\sum_{b \in B,\, p \ni l} z_{bpt} \le 1 \quad \forall t \in T,\ l \in L \quad (3)$$

$$x_{kl} - \sum_{p \ni l} z_{bpt} \le 0 \quad \forall b \in B,\ k \in b,\ t \in T_k,\ l \in L \quad (4)$$

$$\sum_{l \in L} x_{kl} = 1 \quad \forall k \in K \quad (5)$$

$$\sum_{k \in K:\, T_k \ni t} q_k x_{kl} \le C_l - Q_{tl} \quad \forall t \in T,\ l \in L \quad (6)$$

$$\sum_{r = 1, \ldots, R} y_{tar} - \sum_{k \in K:\, T'_k \ni t,\ l \in a} q_k x_{kl} = Q'_{ta} \quad \forall t \in T,\ a \in A \quad (7)$$

$$y_{tar} \le D \quad \forall t \in T,\ a \in A,\ r = 1, \ldots, R \quad (8)$$

The objective function (1) is the sum of class-pattern assignment costs, kind-location assignment costs, and piecewise linear area congestion costs. The constraints (2) require each class of goods to be assigned to a unique location pattern on each time step, and the constraints (3) require that two different classes of goods cannot be at the same location at the same time. The constraints (4) disallow kind-location choices $x_{kl}$ that are inconsistent with the class-pattern choices $z_{bpt}$. The constraints (5) require all goods to be assigned to some location. The constraints (6) bound storage volumes by storage capacity. The constraints (7) link the area flows $y_{tar}$ to the kind-location choices $x_{kl}$ and the flows of initial stocks $Q'_{ta}$, and the constraints (8) bound feasible flow volumes.
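As an illustration of how the core of this model can be written down with an off-the-shelf modelling library, the sketch below builds constraints (2)-(6) in PuLP on a tiny hypothetical data set (two kinds, three locations, a hand-picked pattern set). It omits the congestion terms (7)-(8), the initial stock, and the column generation of patterns, and all data values are invented for the example; it is not the authors' implementation.

```python
import pulp

# Toy data (invented for illustration; not from the paper).
T = range(3)                                   # time steps
K = ["k1", "k2"]                               # package kinds
L = ["l1", "l2", "l3"]                         # storage locations
B = {"b1": ["k1"], "b2": ["k2"]}               # classes -> kinds
P = [("l1",), ("l2",), ("l3",), ("l1", "l2")]  # candidate patterns (normally column-generated)
Tk = {"k1": [0, 1], "k2": [1, 2]}              # storage intervals of each kind
q = {"k1": 2.0, "k2": 1.0}                     # capacity units per kind
C = {"l1": 2.0, "l2": 2.0, "l3": 1.0}          # location capacities
c_kl = {(k, l): 1.0 for k in K for l in L}     # kind-location assignment costs
c_bp = {(b, p): 0.1 * len(p) for b in B for p in P}  # class-pattern costs

m = pulp.LpProblem("location_assignment", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", [(k, l) for k in K for l in L], lowBound=0)
z = pulp.LpVariable.dicts("z", [(b, p, t) for b in B for p in P for t in T], cat="Binary")

# Objective (1) without the congestion term.
m += (pulp.lpSum(c_bp[b, p] * z[b, p, t] for b in B for p in P for t in T)
      + pulp.lpSum(c_kl[k, l] * x[k, l] for k in K for l in L))

for b in B:
    for t in T:
        m += pulp.lpSum(z[b, p, t] for p in P) == 1                          # (2)
for t in T:
    for l in L:
        m += pulp.lpSum(z[b, p, t] for b in B for p in P if l in p) <= 1     # (3)
for b in B:
    for k in B[b]:
        for t in Tk[k]:
            for l in L:
                m += x[k, l] <= pulp.lpSum(z[b, p, t] for p in P if l in p)  # (4)
for k in K:
    m += pulp.lpSum(x[k, l] for l in L) == 1                                 # (5)
for t in T:
    for l in L:
        m += pulp.lpSum(q[k] * x[k, l] for k in K if t in Tk[k]) <= C[l]     # (6)

m.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[m.status])
```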
3.3 Cost Model
The cost coefficients $c_{bp}$, $c_{kl}$, $c_{lm}$, $c_{ar}$ are defined in terms of the following cost parameters, which are treated as variables in the inverse optimization:
$d_{nl} \ge 0$: transfer cost per capacity unit from node $n \in N$ to location $l \in L$,
$d_{ln} \ge 0$: transfer cost per capacity unit from location $l \in L$ to node $n \in N$,
$\gamma_l \ge 0$: cost per time step of allocating location $l \in L$,
$\delta \ge 0$: cost per distance unit per time step, based on the diameter of an allocation pattern (the diameter of a pattern is the maximum distance between two locations in the pattern),
$\alpha_r \ge 0$: marginal cost per capacity unit for area flow level $r$.
The above parameters have nominal values, indicated by superscript 0, which are set a priori before the inverse optimization. For the nominal transfer costs $d^0_{nl}$ and $d^0_{ln}$, the physical distance between node $n$ and location $l$ was used. The nominal cost $\gamma^0_l$ was set proportionally to the capacity of location $l$, and the nominal pattern diameter cost $\delta^0$ was set to 0. The nominal marginal flow costs $\alpha^0_r$ were set to 0 below an estimated flow capacity bound, and to a high value above it. Additionally, $d^0_{ll'}$ is used to denote the physical distance between locations $l, l' \in L$. The cost model can now be stated as:

$$c_{bp} = \sum_{l \in p} \gamma_l + \delta \max_{l, l' \in p} d^0_{ll'} \quad (9)$$

$$c_{kl} = q_k \sum_{n \in N} \left( \pi_{nk} d_{nl} + \pi_{kn} d_{ln} \right) \quad (10)$$

$$c_{ar} = \sum_{j=1}^{r} \alpha_j \quad (11)$$

Note that (11) is defined as a sum to ensure that the cost coefficients $c_{ar}$ are increasing in $r$.
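A direct transcription of the cost model (9)-(11) is straightforward; the helper functions below are a sketch under our own naming, showing how the coefficients could be recomputed from parameter dictionaries, for example when re-evaluating the objective after the inverse optimization has updated $d$, $\gamma$, $\delta$ and $\alpha$.

```python
def pattern_cost(pattern, gamma, delta, d0):
    """c_bp of Eq. (9): allocation costs plus a diameter penalty for a pattern (set of locations)."""
    locs = list(pattern)
    diameter = max((d0[l1, l2] for l1 in locs for l2 in locs), default=0.0)
    return sum(gamma[l] for l in locs) + delta * diameter

def kind_location_cost(k, l, q, pi_in, pi_out, d_in, d_out, nodes):
    """c_kl of Eq. (10): expected transfer cost of kind k if stored at location l."""
    return q[k] * sum(pi_in[n, k] * d_in[n, l] + pi_out[k, n] * d_out[l, n] for n in nodes)

def area_flow_cost(r, alpha):
    """c_ar of Eq. (11): cumulative marginal costs up to flow level r (convex by construction)."""
    return sum(alpha[j] for j in range(1, r + 1))
```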
3.4 Dual Model
Let us associate dual variables with the constraints of the location assignment model as follows: $\psi_{bt} \in \mathbb{R}$ for (2), $\phi_{tl} \ge 0$ for (3), $\xi_{ktl} \ge 0$ for (4), $\mu_k \in \mathbb{R}$ for (5), $\lambda_{tl} \ge 0$ for (6), $\eta_{ta} \in \mathbb{R}$ for (7), and $\theta_{tar} \ge 0$ for (8). The dual of the linear relaxation of the location assignment model (1)–(8) can now be formulated:

$$\max \sum_{b \in B,\, t \in T} \psi_{bt} - \sum_{t \in T,\, l \in L} \phi_{tl} + \sum_{k \in K} \mu_k - \sum_{t \in T,\, l \in L} (C_l - Q_{tl}) \lambda_{tl} + \sum_{t \in T,\, a \in A} Q'_{ta} \eta_{ta} - \sum_{t \in T,\, a \in A,\, r = 1, \ldots, R} D \theta_{tar} \quad (12)$$

$$\psi_{bt} - \sum_{l \in p} \phi_{tl} + \sum_{k \in b:\, T_k \ni t,\ l \in p} \xi_{ktl} \le c_{bp} \quad \forall b \in B,\ p \in P,\ t \in T \quad (13)$$

$$- \sum_{t \in T_k} \xi_{ktl} + \mu_k - \sum_{t \in T_k} q_k \lambda_{tl} - \sum_{t \in T'_k,\ a \in A:\, l \in a} q_k \eta_{ta} \le c_{kl} \quad \forall k \in K,\ l \in L \quad (14)$$

$$\eta_{ta} - \theta_{tar} \le c_{ar} \quad \forall t \in T,\ a \in A,\ r = 1, \ldots, R \quad (15)$$
3.5 Pattern Generation
To find a solution to the location assignment model (1)–(8), a column generation
procedure was applied to its linear relaxation, then the set of patterns P was
fixed, and the problem was finally resolved as a mixed-integer program. This is
a heuristic that does not guarantee optimality for the mixed-integer problem.
Potentially profitable patterns can be found by detecting violations of the dual constraint (13). Specifically, the procedure seeks $p \subseteq L$, $b \in B$ and $t \in T$ that maximize, and reach a positive value for, the expression

$$\psi_{bt} - \sum_{l \in p} \phi_{tl} + \sum_{k \in b:\, T_k \ni t,\ l \in p} \xi_{ktl} - c_{bp} = \psi_{bt} - \sum_{l \in p} \Big( \phi_{tl} + \gamma_l - \sum_{k \in b:\, T_k \ni t} \xi_{ktl} \Big) - \delta \max_{l, l' \in p} d^0_{ll'},$$

where the definition (9) of $c_{bp}$ has been expanded. Let us denote by $V_{lbt}$ the parenthesized expression on the right-hand side, and by $q_{bt} = \sum_{k \in b:\, T_k \ni t} q_k$ the amount of class $b$ goods at time $t$. The column generation subproblem for given $b$ and $t$ is defined in terms of the following decision variables: $u_l \in \{0, 1\}$ indicates whether location $l \in L$ belongs in the pattern, $v_{lk} \ge 0$ indicates ordered pairs of locations $l, k \in L$ that both belong in the pattern, and $s \ge 0$ represents the diameter of the pattern. The subproblem is formulated as a mixed-integer program:

$$\min \sum_{l \in L} V_{lbt} u_l + \delta s \quad (16)$$

$$\sum_{l \in L} C^M_l u_l \ge q_{bt} \quad (17)$$

$$s - d^0_{lk} v_{lk} \ge 0 \quad \forall l, k \in L,\ l \ne k \quad (18)$$

$$v_{lk} - u_l - u_k \ge -1 \quad \forall l, k \in L,\ l \ne k \quad (19)$$

The constraint (17) ensures that the selected locations together have sufficient capacity to be feasible in the location assignment problem. The constraints (18) bound $s$ to match the pattern diameter, and (19) together with the minimization objective ensure that $v_{lk} = u_l u_k$ for all $l, k \in L$.
In the column generation procedure, the problem (16)–(19) is solved itera-
tively by a mixed-integer solver for all classes b ∈ B and time steps t ∈ T, and
whenever the objective is smaller than ψbt, the pattern defined by the corre-
sponding solution is added to the set of patterns P. After all b and t have been
checked, the linear program (1)–(8) is resolved with the updated P, and the
column generation procedure is repeated with newly obtained values of (ψ, φ, ξ).
If no more profitable patterns are found or the algorithm runs out of time, the
current patterns P are fixed and finally the location assignment problem (1)–(8)
is resolved as a mixed-integer program.
Note that cbp is a submodular function of p. Kamiyama [11] presented a
polynomial-time approximation algorithm for minimizing a submodular function
with covering-type constraints (which (17) is), but in this study it was considered
simpler to use a mixed-integer solver.
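The overall pattern generation loop can be summarized as the skeleton below. It is a schematic sketch only: `solve_lp_relaxation`, `solve_pricing_subproblem` and the container `model_data` are hypothetical placeholders standing in for the linear relaxation of (1)-(8) and the pricing problem (16)-(19); they are not part of the paper's implementation.

```python
def generate_patterns(model_data, patterns, classes, time_steps, max_iterations=50):
    """Column generation: grow the pattern set P until no profitable pattern remains."""
    for _ in range(max_iterations):
        # Solve the LP relaxation of (1)-(8) with the current pattern set
        # and read off the duals psi, phi, xi of constraints (2)-(4).
        duals = solve_lp_relaxation(model_data, patterns)
        found_new = False
        for b in classes:
            for t in time_steps:
                # Pricing problem (16)-(19): cheapest feasible pattern for (b, t).
                objective, pattern = solve_pricing_subproblem(model_data, duals, b, t)
                if objective < duals.psi[b, t] and pattern not in patterns:
                    patterns.append(pattern)   # the pattern violates dual constraint (13)
                    found_new = True
        if not found_new:
            break
    return patterns
```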
4 Inverse Optimization Method
The location assignment decisions demonstrated by the human decision makers
are assumed to form a feasible solution (z, x, y) to the location assignment model
(1)–(8). The goal of the presented inverse optimization method is to estimate
the parameters of the cost model in the form (9)–(11), so that the demonstrated
solution (z, x, y) is optimal or near-optimal for the costs c.
The demonstrated solution is not assured to be an extreme point, and may
be in the relative interior of the feasible region. Moreover, with costs restricted
to the form (9)–(11), non-trivial (unique) optimality may be impossible for even
some boundary points of the primal feasible region. Because of these issues, in
contrast with the inverse linear optimization method of Ahuja and Orlin [1],
complementary slackness conditions are not fully enforced, and the duality gap
between the primal (1) and dual (12) objectives is minimized as an objective.
To make sure that the inverse problem is feasible with nontrivial c, none of
the dual constraints (13)–(15) is tightened to strict equality by complementary
slackness. Nevertheless, the following additional constraints on the dual solution
(ψ, φ, ξ, μ, λ, η, θ) are enforced on the basis of complementary slackness with the
fixed primal solution (z, x, y):
$$\phi_{tl} = 0 \quad \forall t \in T,\ l \in L:\ \textstyle\sum_{b \in B,\, p \ni l} z_{bpt} < 1 \quad (20)$$

$$\xi_{ktl} = 0 \quad \forall b \in B,\ k \in b,\ t \in T,\ l \in L:\ x_{kl} - \textstyle\sum_{p \ni l} z_{bpt} < 0 \quad (21)$$

$$\lambda_{tlm} = 0 \quad \forall t \in T,\ l \in L,\ m = 1, \ldots, M_l:\ w_{tlm} < C \quad (22)$$

$$\theta_{tar} = 0 \quad \forall t \in T,\ a \in A,\ r = 1, \ldots, R:\ y_{tar} < D \quad (23)$$
The duality gap that is to be minimized is given by

$$\Gamma = \sum_{b \in B,\, p \in P,\, t \in T} c_{bp} z_{bpt} + \sum_{k \in K,\, l \in L} c_{kl} x_{kl} + \sum_{t \in T,\, l \in L,\, m = 1, \ldots, M_l} c_{lm} w_{tlm} + \sum_{t \in T,\, a \in A,\, r = 1, \ldots, R} c_{ar} y_{tar}$$
$$\quad - \sum_{b \in B,\, t \in T} \psi_{bt} + \sum_{t \in T,\, l \in L} \phi_{tl} - \sum_{k \in K} \mu_k + \sum_{t \in T,\, l \in L} (C_l - Q_{tl}) \lambda_{tl} - \sum_{t \in T,\, a \in A} Q'_{ta} \eta_{ta} + \sum_{t \in T,\, a \in A,\, r = 1, \ldots, R} D \theta_{tar}. \quad (24)$$

Note that $\Gamma \ge 0$ because $(z, x, y)$ is primal feasible and $(\psi, \phi, \xi, \mu, \lambda, \eta, \theta)$ is dual feasible.
Similarly to Ahuja and Orlin [1], the cost estimation is regularized by penalis-
ing deviations from nominal cost values, but in a more specific formulation. The
deviation objective term is defined as follows, weighting cost model parameters
approximately in proportion to their contribution to the primal objective:
$$\Delta = \sum_{t \in T} \sum_{b \in B} \frac{\sum_{k \in b:\, T_k \ni t} q_k}{\sum_{l \in L} C^M_l} \sum_{l \in L} \left( \left| \gamma_l - \gamma^0_l \right| + \frac{\max_{l, l' \in L} d^0_{ll'}}{2} \left| \delta - \delta^0 \right| \right)$$
$$\quad + \sum_{k \in K} \frac{q_k}{|L|} \sum_{l \in L} \sum_{n \in N} \left( \pi_{nk} \left| d_{nl} - d^0_{nl} \right| + \pi_{kn} \left| d_{ln} - d^0_{ln} \right| \right)$$
$$\quad + \sum_{k \in K} 2 q_k \sum_{r=1}^{R} \sum_{j=1}^{r} \left| \alpha_j - \alpha^0_j \right| \quad (25)$$
As this is a minimization objective, the standard transformation of absolute
value terms to linear programming constraints and variables applies [4, p. 17].
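For readers unfamiliar with that transformation, the idea is simply to introduce one auxiliary variable per absolute-value term and bound it from both sides. The fragment below sketches it in PuLP for a single term $|\gamma_l - \gamma^0_l|$; all variable and parameter names, and the example value 3.0, are ours and used purely for illustration.

```python
import pulp

# One |gamma_l - gamma_l^0| term of the deviation objective (25), written as an LP.
inv = pulp.LpProblem("inverse_lp_fragment", pulp.LpMinimize)
gamma_l = pulp.LpVariable("gamma_l", lowBound=0)    # cost parameter being estimated
gamma_l_nominal = 3.0                               # illustrative nominal value
dev = pulp.LpVariable("dev_gamma_l", lowBound=0)    # will equal |gamma_l - gamma_l_nominal|

inv += 1.0 * dev                                    # objective: minimize the deviation term
inv += dev >= gamma_l - gamma_l_nominal             # dev bounds the deviation from above ...
inv += dev >= gamma_l_nominal - gamma_l             # ... in both directions
```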
The inverse problem is obtained by substituting the cost model (9)–(11) in
the dual constraints (13)–(15), with the additional constraints (20)–(23), and
the objective of minimizing gΔ+Γ, where g is a weighting coefficient in units of
storage capacity. The decision variables are the dual variables (ψ, φ, ξ, μ, λ, η, θ)
and the cost model parameters (d, γ, δ, α).
The inverse problem is a linear program with an exponential number of con-
straints (13). The problem can be solved iteratively by first solving the linear
program with a limited set of patterns P, then applying the algorithm of Sect. 3.5
to detect violations of constraints (13) and to update the set P accordingly, and
repeating. In other words, the primal column generation procedure of Sect. 3.5
can be applied as a dual cutting plane algorithm. The algorithm can be stopped
on timeout with a suboptimal solution.
5 Computational Experiments
A warehouse management case was kindly provided by the company Steveco Oy,
a container port operator in Finland. One of the services provided by Steveco
is warehousing of export goods, allowing manufacturers to reduce lead time on
deliveries to their foreign customers. The goods in the port warehouses can be
shipped either as break-bulk or stuffed in containers at short notice. The case
involves the management of inventory in one such port warehouse.
Goods are stored in the warehouse for a period ranging from a day or two
up to several weeks. Goods may arrive by either rail or truck, which are usually
received on opposite sides of the warehouse. The warehouse is split into grid
squares of different sizes. Different goods are classified based on manufacturer-
provided order identifiers, such that two classes cannot be stored in the same grid
square simultaneously. The warehouse management typically receives advance
information of deliveries 12–24 h in advance of arrival in the warehouse. A ware-
house foreman allocates grid squares for each class of goods, taking into account
the transport mode (rail or truck), whether the goods are likely to be stuffed in
a container or transported on a rolltrailer directly to a quay, and importantly,
simultaneously transported goods must be split between aisles of the warehouse
so that multiple loaders can access the goods at different aisles without interfer-
ing with each other. Currently the management of the warehouse is a manual
procedure, which requires time and effort from experienced foremen.
Computational experiments were performed using a 92-day time series of
actual storage decisions provided by Steveco. The data described the arrival and
departure times, storage locations and classification of each individual package.
The data was split into a 46-day training data set, and a 46-day test data set.
The model time step was 1 h. There were arrival or departure events on 942 h of
the 2208 h period, and a total of 66000 packages in 99 orders (classes of goods).
The warehouse had 8 aisles (areas) with a total of 83 grid squares (locations).
Arrivals and departures were associated with 8 arrival nodes (6 locations along
a spur rail, and truck unloading on 2 sides of the warehouse) and 3 departure
nodes (1 quay and 2 truck).
To simulate usage of the model in operational planning, the storage loca-
tion assignment model of Sect. 3.2 was applied in a rolling horizon procedure:
For every hour of simulation time, the mixed-integer model was solved for the
following 24-hour period, the decisions of the first hour were fixed, and the pro-
cedure was repeated with the planning period shifted by one hour forward.
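The rolling horizon procedure itself is a simple loop; the sketch below shows its structure with hypothetical helpers (`build_model`, `solve`, `fix_first_hour`) standing in for the mixed-integer model of Sect. 3.2 and the solver call. It is a schematic illustration, not the implementation used in the experiments.

```python
def rolling_horizon(events, horizon_hours=24, start=0, end=2208):
    """Plan hour by hour: solve a 24 h model, commit the first hour, shift by one hour."""
    committed = []
    for t in range(start, end):
        window = [e for e in events if t <= e.time < t + horizon_hours]
        model = build_model(window, initial_stock_at=t)   # MIP of Sect. 3.2 (placeholder)
        plan = solve(model)                               # mixed-integer solver (placeholder)
        committed.extend(fix_first_hour(plan, t))         # fix only the decisions of hour t
    return committed
```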
The inverse optimization was computed for the entire 46-day training period
at once, without regard to rolling horizon planning. Thus there was no guarantee
that the cost parameters computed by the inverse optimization would allow the
rolling horizon procedure to accurately replicate the demonstrated decisions.
The following cases are reported in the results:
Example. Demonstrated decisions of warehouse foremen.
Learned/post. A posteriori optimization over the entire data period, using the
estimated objective function in a linear relaxation of the model of Sect. 3.2.
This result is what the inverse optimization method actually aims to match
with the demonstration solution, and is only included for comparison. This
method cannot be used in operational planning.
Nominal/rolling. Rolling horizon planning using the nominal objective func-
tion, which corresponds to pure transport distance minimization, along with
high aisle congestion cost above estimated loader capacity.
Learned/rolling. Rolling horizon planning using the estimated objective func-
tion, with the aim of replicating the example behaviour.
The experiments were run on a server with two 18-core 2.3 GHz Intel Xeon
E5-2699v3 processors and 256 GB RAM, using the IBM ILOG CPLEX 20.1
mixed-integer programming solver. The inverse optimization time was limited
to 12 h, and each rolling horizon step took 0–5 min.
To compare the optimization results with the demonstrated decisions, two numerical measures were defined at different aggregation levels. Let $x_{kl}$ represent the solution of one of the four cases above, and $x^{\circ}_{kl}$ the demonstrated solution of the Example case. The measure DIFFA indicates how closely the volumes assigned to each aisle match the demonstrated solution over time:

$$\mathrm{DIFFA} = 100\% \cdot \frac{\sum_{a \in A,\, t \in T} \Big| \sum_{k \in K:\, T_k \ni t,\ l \in a} q_k x_{kl} - \sum_{k \in K:\, T_k \ni t,\ l \in a} q_k x^{\circ}_{kl} \Big|}{\sum_{k \in K} 2 q_k |T_k|}. \quad (26)$$

The measure DIFFB indicates how closely the volumes of each class of goods assigned to each aisle match the demonstrated solution over time:

$$\mathrm{DIFFB} = 100\% \cdot \frac{\sum_{b \in B,\, a \in A,\, t \in T} \Big| \sum_{k \in b:\, T_k \ni t,\ l \in a} q_k x_{kl} - \sum_{k \in b:\, T_k \ni t,\ l \in a} q_k x^{\circ}_{kl} \Big|}{\sum_{k \in K} 2 q_k |T_k|}. \quad (27)$$
These measures range from 0% indicating perfect replication, to 100% indicating
that no goods are assigned to the same aisles in the two solutions.
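Given per-aisle, per-hour volume arrays, DIFFA reduces to a one-line computation; the snippet below is a sketch of that calculation (array names are ours) that also makes the 0–100% normalization explicit.

```python
import numpy as np

def diff_a(volumes, volumes_demo, total_stored_volume):
    """DIFFA of Eq. (26): volumes and volumes_demo have shape (n_aisles, n_hours)."""
    # total_stored_volume corresponds to sum_k 2 * q_k * |T_k| in the denominator.
    return 100.0 * np.abs(volumes - volumes_demo).sum() / total_stored_volume
```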
Fig. 2. Training results: capacity usage over time for each aisle.
Fig. 3. Test results: capacity usage over time for each aisle.
Table 1. Differences in capacity usage between the optimization results and demonstrated decisions. DIFFA indicates differences in per-aisle capacity usage, and DIFFB differences in per-aisle capacity usage of each class of goods; both are relative to the total volume of goods in storage.

Case            | DIFFA Training | DIFFA Test | DIFFB Training | DIFFB Test
Example         | 0.0%           | 0.0%       | 0.0%           | 0.0%
Learned/post    | 6.6%           | 15.8%      | 9.9%           | 55.6%
Nominal/rolling | 32.3%          | 26.2%      | 43.3%          | 68.1%
Learned/rolling | 18.5%          | 17.2%      | 32.7%          | 62.5%
Table 2. Total transport distance of the packages, relative to the demonstrated example solution, on the training and test periods. The comparison includes distances from arrival node to storage location to departure node.

Case            | Training | Test
Example         | 100.0%   | 100.0%
Learned/post    | 99.7%    | 91.6%
Nominal/rolling | 86.9%    | 84.2%
Learned/rolling | 97.2%    | 93.0%
5.1 Results
Graphs of capacity usage over time in different aisles are shown in Figs. 2 and
3. The graphs give an overview of how well the demonstrated aisle choices could
be matched. Note that the differences between these graphs and the Example
case are precisely summarized by the DIFFA measure.
The numerical values of the measures DIFFA and DIFFB are shown in
Table 1. The total transport distance of the packages in the different cases, com-
pared to the demonstrated solution, is shown in Table 2.
Within the 12 h limit, the inverse optimization method ran for 21 iterations, over which it generated a total of 2007 patterns. In the last 3 h, there were only negligible improvements to the objective value of the inverse problem.
5.2 Discussion
On the training data set the inverse optimization method worked, as can be
seen in Fig. 2: the continuous optimization results (Learned/post) match the
demonstrated decisions (Example) relatively well. However, the rolling horizon
procedure with learned costs (Learned/rolling) diverged substantially from the
results of the continuous optimization (Learned/post), possibly due to the short
24 h planning horizon. Nevertheless, as indicated by the numerical measures in
Table 1, the results of the rolling horizon procedure were closer to the demon-
strated decisions after the inverse optimization (Learned/rolling) than before
(Nominal/rolling).
On the test data set, as seen in Fig. 3 and Table 1, the continuous optimiza-
tion results (Learned/post) diverged further from the demonstrated decisions
(Example) than in the training data set. The results of the rolling horizon pro-
cedure however appeared to follow the demonstrated decisions somewhat better
than on the training data set, both visually and by the DIFFA measure. Again,
the results of the rolling horizon procedure were closer to the demonstrated
decisions after the inverse optimization (Learned/rolling) than before (Nomi-
nal/rolling).
Replicating the demonstrated decisions in more detail than at the level of per-
aisle capacity usage was unsuccessful so far. As shown by the DIFFB measure
in Table 1, classes of goods were largely assigned to different aisles than in the
demonstrated decisions. A particular difficulty in the case study was that there
was no machine-readable data on the properties of the classes beyond an opaque
order identifier.
Besides the difficulties at the level of classes of goods, comparing the assign-
ments at the level of individual locations was also fruitless. The preliminary
conclusion was that since most of the goods would flow through the warehouse
aisles from one side to another, choosing one grid square or another in the same
aisle would not make a substantial difference in the total work required.
Although matching the demonstrated decisions by inverse optimization had
limited success so far, the storage location assignment model of Sect. 3 was found
to be valuable. In the Nominal/rolling case, in which the objective was to mini-
mize transport distances with congestion penalties, the total transport distance
of the packages was reduced by 13–15% compared to manual planning, as can
be seen in Table 2. Based on the computational experiments, the model has the
potential to be useful for warehouse management on realistic scale.
While the algorithm appeared to converge in 12 h with the 46-day training
data set, this was at the limit of the current capabilities of the method. In
an abandoned attempt to use a 50% larger training data set, the algorithm
progressed over 10 times slower than in the presented runs. In particular the size
of the location assignment model is sensitive to the size of the set K, i.e. the
number of kinds of packages.
6 Conclusion
The inverse optimization approach of Ahuja and Orlin [1] was applied to the
linear relaxation of a mixed-integer storage location assignment problem, and a
solution method based on a cutting plane algorithm was presented. The method
was tested on real-world data, and the estimated objective function was used in
the mixed-integer model in a practical short-term rolling-horizon procedure.
The inverse optimization method and the rolling-horizon procedure were able
to coarsely mimic a demonstrated storage location assignment policy on the
training data set, as well as on a separate test data set. However, the demon-
strated assignments of specific classes of goods could not be replicated well,
possibly due to the limitations of the inverse optimization method or due to the
nature of the case study. Further research is needed to more accurately follow
the demonstrated location assignment policy.
References
1. Ahuja, R.K., Orlin, J.B.: Inverse optimization. Oper. Res. 49(5), 771–783 (2001)
2. Arora, S., Doshi, P.: A survey of inverse reinforcement learning: challenges, meth-
ods and progress (2020). arXiv:1806.06877
3. Aswani, A., Shen, Z.J., Siddiq, A.: Inverse optimization with noisy data. Oper.
Res. 66(3), 870–892 (2018)
4. Bertsimas, D., Tsitsiklis, J.N.: Introduction to Linear Optimization. Athena Sci-
entific, Belmont (1997)
5. Chen, L., Riopel, D., Langevin, A.: Minimising the peak load in a shared storage
system based on the duration-of-stay of unit loads. Int. J. Shipp. Transp. Logist.
1(1), 20–36 (2009)
6. De Koster, R., Le-Duc, T., Roodbergen, K.J.: Design and control of warehouse
order picking: a literature review. Eur. J. Oper. Res. 182(2), 481–501 (2007)
7. Goetschalckx, M., Ratliff, H.D.: Shared storage policies based on the duration stay
of unit loads. Manage. Sci. 36(9), 1120–1132 (1990)
8. Gu, J., Goetschalckx, M., McGinnis, L.F.: Research on warehouse operation: a
comprehensive review. Eur. J. Oper. Res. 177, 1–21 (2007). https://doi.org/10.
1016/j.ejor.2006.02.025
9. Han, Y., Lee, L.H., Chew, E.P., Tan, K.C.: A yard storage strategy for minimizing
traffic congestion in a marine container transshipment hub. OR Spectr. 30, 697–720
(2008). https://doi.org/10.1007/s00291-008-0127-6
10. Hausman, W.H., Schwarz, L.B., Graves, S.C.: Optimal storage assignment in auto-
matic warehousing systems. Manage. Sci. 22(6), 629–638 (1976)
11. Kamiyama, N.: A note on submodular function minimization with covering type
linear constraints. Algorithmica 80, 2957–2971 (2018)
12. Karásek, J.: An overview of warehouse optimization. Int. J. Adv. Telecommun.
Electrotech. Sig. Syst. 2(3), 111–117 (2013)
13. Kim, K.H., Park, K.T.: Dynamic space allocation for temporary storage. Int. J.
Syst. Sci. 34(1), 11–20 (2003)
14. Kofler, M., Beham, A., Wagner, S., Affenzeller, M.: Robust storage assignment in warehouses with correlated demand. In: Borowik, G., Chaczko, Z., Jacak, W., Łuba, T. (eds.) Computational Intelligence and Efficiency in Engineering Systems. SCI, vol. 595, pp. 415–428. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-15720-7_29
15. Lee, L.H., Chew, E.P., Tan, K.C., Han, Y.: An optimization model for storage yard
management in transshipment hubs. OR Spectr. 28, 539–561 (2006)
16. Moccia, L., Cordeau, J.F., Monaco, M.F., Sammarra, M.: A column generation
heuristic for dynamic generalized assignment problem. Comput. Oper. Res. 36(9),
2670–2681 (2009). https://doi.org/10.1016/j.cor.2008.11.022
17. Monaco, M.F., Sammarra, M., Sorrentino, G.: The terminal-oriented ship stowage
planning problem. Eur. J. Oper. Res. 239, 256–265 (2014)
18. Muppani, V.R., Adil, G.K.: Efficient formation of storage classes for warehouse
storage location assignment: a simulated annealing approach. Omega 36, 609–618
(2008)
19. Pang, K.W., Chan, H.L.: Data mining-based algorithm for storage location assign-
ment in a randomised warehouse. Int. J. Prod. Res. 55(14), 4035–4052 (2017).
https://doi.org/10.1080/00207543.2016.1244615
20. Pierre, B., Vannieuwenhuyse, B., Domnianta, D., Dessel, H.V.: Dynamic ABC
storage policy in erratic demand environments. Jurnal Teknik Industri 5(1), 1–12
(2003)
21. Rouwenhorst, B., Reuter, B., Stockrahm, V., van Houtum, G.J., Mantel, R.J.,
Zijm, W.H.M.: Warehouse design and control: framework and literature review.
Eur. J. Oper. Res. 122, 515–533 (2000)
22. Schaefer, A.J.: Inverse integer programming. Optim. Lett. 3, 483–489 (2009)
23. Troutt, M.D., Pang, W.K., Hou, S.H.: Behavioral estimation of mathematical pro-
gramming objective function coefficients. Manage. Sci. 52(3), 422–434 (2006)
24. Zhen, L., Xu, Z., Wang, K., Ding, Y.: Multi-period yard template planning in
container terminals. Transp. Res. Part B 93, 700–719 (2016)
Model-Agnostic Multi-objective
Approach for the Evolutionary Discovery
of Mathematical Models
Alexander Hvatov(B)
, Mikhail Maslyaev, Iana S. Polonskaya,
Mikhail Sarafanov, Mark Merezhnikov, and Nikolay O. Nikitin
NSS (Nature Systems Simulation) Lab, ITMO University,
Saint-Petersburg, Russia
{alex hvatov,mikemaslyaev,ispolonskaia,
mik sar,mark.merezhnikov,nnikitin}@itmo.ru
Abstract. In modern data science, it is often not enough to obtain only a data-driven model with good prediction quality. On the contrary, it is more interesting to understand the properties of the model and which parts could be replaced to obtain better results. Such questions are unified under the topic of machine learning interpretability, which can be considered one of the rising topics of the area. In this paper, we use multi-objective evolutionary optimization for composite data-driven model learning to obtain the desired properties of the algorithm. This means that whereas one of the apparent objectives is precision, the other could be chosen as the complexity of the model, robustness, and many others. The application of the method is shown on examples of multi-objective learning of composite models, differential equations, and closed-form algebraic expressions, which are unified into an approach for model-agnostic learning of interpretable models.
Keywords: Model discovery · Multi-objective optimization ·
Composite models · Data-driven models
1 Introduction
The increasing precision of machine learning models indicates that the most precise model is either overfitted or very complex. Thus, it is used as a black box without understanding the principle of the model's work. This means that we cannot say whether the model can be applied to another sample without a direct experiment. Related questions, such as applicability to a given class of problems, sensitivity, and the meaning of the model parameters and hyperparameters, give rise to the interpretability problem [7].
In machine learning, two approaches exist to obtain a model that describes the data. The first is to fit the learning algorithm hyperparameters and the parameters of a given model to obtain the minimum possible error. The second one is to obtain the model structure (as an example, it is done in neural
architecture search [1]) or the sequence of models that describe the data with the
minimal possible error [12]. We refer to obtaining a set of models or a composite
model as a composite model discovery.
After the model is obtained, both approaches may require additional model
interpretation since the obtained model is still a black box. For model interpre-
tation, many different approaches exist. One group of approaches is the model
sensitivity analysis [13]. Sensitivity analysis allows mapping the input variability
to the output variability. Algorithms from the sensitivity analysis group usually
require multiple model runs. As a result, the model behavior is explained rela-
tively, meaning how the output changes with respect to the input changing.
The second group is the explaining surrogate models that usually have less
precision than the parent model but are less complex. For example, linear regres-
sion models are used to explain deep neural network models [14]. Additionally,
for convolutional neural networks that are often applied to the image classi-
fication/regression task, visual interpretation may be used [5]. However, this
approach cannot be generalized, and thus we do not put it into the classification
of explanation methods.
All approaches described above require another model to explain the results.
Multi-objective optimization may be used for obtaining a model with desired
properties that are defined by the objectives. The Pareto frontier obtained during
the optimization can explain how the objectives affect the resulting model’s
form. Therefore, both the “fitted” model and the model’s initial interpretation
are achieved.
We propose a model-agnostic data-driven modeling method that can be used
for an arbitrary composite model with directed acyclic graph (DAG) represen-
tation. We assume that the composite model graph’s vertices (or nodes) are the
models and the vertices define data flow to the final output node. Genetic pro-
gramming is the most general approach for the DAG generation using evolution-
ary operators. However, it cannot be applied to the composite model discovery
directly since the models tend to grow unexpectedly during the optimization
process.
The classical solution is to restrict the model length [2]. Nevertheless, the
model length restriction may not give the best result. Overall, an excessive
amount of models in the graph drastically increases the fitting time of a given
composite model. Moreover, genetic programming requires that the resulting
model is computed recursively, which is not always possible in the composite
model case.
We refine the cross-over and mutation operators that are usual for genetic
programming to overcome the excessive growth without resorting to explicit
model-length restrictions. Additionally, a regularization operator is used to retain
the compactness of the resulting model graph. The evolutionary operators, in
general, are defined independently of the model type. However, objectives must be computed separately for every
type of model. The advantage of the approach is that it can obtain a data-
driven model of a given class. Moreover, the model class change does not require
significant changes in the algorithm.
We introduce several objectives to obtain additional control over the model
discovery. The multi-objective formulation allows one to understand how
the objectives affect the resulting model. Also, the Pareto frontier provides a set
of models that an expert could assess and refine, which significantly reduces the
time to obtain an expert solution compared with building it “from scratch”. Moreover,
the multi-objective optimization improves the overall quality, since the
population’s diversity naturally increases.
Whereas the multi-objective formulation [15] and genetic programming [8]
are used for the various types of model discovery separately, we combine both
approaches and use them to obtain a composite model for the comprehensive
class of atomic models. In contrast to the composite model, the atomic model
has single-vector input, single-vector output, and a single set of hyperparameters.
As an atomic model, we may consider a single model or a composite model that
undergoes the “atomization” procedure. Additionally, we consider the Pareto
frontier as the additional tool for the resulting model interpretability.
The paper is structured as follows: Sect. 2 describes the evolutionary oper-
ators and multi-objective optimization algorithms used throughout the paper,
Sect. 3 describes discovery done on the same real-world application dataset for
particular model classes and different objectives. In particular, Sect. 3.2 describes
the composite model discovery for a class of the machine learning models with
two different sets of objective functions; Sect. 3.3 describes the closed-form alge-
braic expression discovery; Sect. 3.4 describes the differential equation discovery.
Section 4 concludes the paper.
2 Problem Statement for Model-Agnostic Approach
The developed model-agnostic approach can be applied to an arbitrary com-
posite model represented as a directed acyclic graph. We assume that the graph’s
vertices (or nodes) are the atomic models with parameters fitted to the given
data. It is also assumed that every model has input data and output data. The
edges (or connections) define which models participate in generating the given
model’s input data.
Before describing the approach in detail, we outline in Fig. 1 the scheme that
illustrates the step-by-step questions that should be answered to obtain the
composite model discovery algorithm.
For our approach, we aim to make Step 1 as flexible as possible. The application
of the approach to different classes of functional blocks is shown in Sect. 3. The
realization of Step 2 is described below. Moreover, different classes of models
require different qualitative measures (Step 3), which is also shown in Sect. 3.
The cross-over and mutation schemes used in the described approach do not
differ in general from typical symbolic optimization schemes. However, in con-
trast to the usual genetic programming workflow, we have to add regularization
operators to restrict the model growth. Moreover, the regularization operator
allows controlling the model discovery and obtaining models with specific prop-
erties. In this section, we describe the generalized concepts. However, we note
that every model type has its realization details for the specific class of the
atomic models. Below we describe the general scheme of the three evolutionary
operators used in the composite model discovery: cross-over, mutation, and
regularization, shown in Figs. 2 and 3.

Fig. 1. Illustration of the modules of the composite model discovery system and the
particular possible choices for the realization strategy.
Fig. 2. (Left) The generalized composite model individual: a) an individual with models
as nodes (green), the dedicated output node (blue), and nodes with blue and yellow
frames that are subjected to the two types of mutation b1) and b2); b1) mutation that
replaces one atomic model with another atomic model (yellow); b2) mutation that
replaces one atomic model with a composite model (orange). (Right) The scheme of
the cross-over operator: green and yellow nodes belong to different individuals; the
frames mark subtrees subjected to the cross-over (left), and the two models after the
cross-over operator is applied are shown (right). (Color figure online)
The mutation operator has two variations: node replacement with another
node and node replacement with a sub-tree. Schemes for both types of
mutation are shown in Fig. 2 (left). The two types of mutation can be applied
simultaneously. The probabilities of the appearance of a given type of mutation
are the algorithm’s hyperparameters. We note that, for convenience, some nodes
and model types may be “immutable”. This trick is used, for example, in Sect. 3.4
to preserve the differential operator form and thus reduce the optimization space
(and consequently the optimization time) without losing generality.
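As an illustration of the two mutation variants over a composite model individual with immutable nodes, a minimal sketch is given below. The dictionary-based node structure, the function names and the pool of atomic models are illustrative assumptions and not the authors' implementation.

import random

def make_node(model, children=None, immutable=False):
    # A composite model node: an atomic model plus the nodes feeding it.
    return {"model": model, "children": children or [], "immutable": immutable}

def collect_mutable(node, acc):
    # Gather all nodes that evolutionary operators are allowed to alter.
    if not node["immutable"]:
        acc.append(node)
    for child in node["children"]:
        collect_mutable(child, acc)
    return acc

def mutate(root, atomic_pool, p_subtree=0.3, rng=random):
    # Type b1: swap the atomic model of a node; type b2: replace the node
    # with a small composite sub-tree built from the atomic pool.
    candidates = collect_mutable(root, [])
    if not candidates:
        return root
    node = rng.choice(candidates)
    node["model"] = rng.choice(atomic_pool)
    if rng.random() < p_subtree:
        node["children"] = [make_node(rng.choice(atomic_pool)) for _ in range(2)]
    return root

# Example: an immutable output node with two mutable inputs.
tree = make_node("ridge", [make_node("lagged"), make_node("dtreg")], immutable=True)
mutate(tree, atomic_pool=["ridge", "knnreg", "dtreg", "lagged"])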
In general, the cross-over operator can be represented as a subgraph
exchange between two models, as shown in Fig. 2 (right). In the most general
(genetic programming) case, subgraphs are chosen arbitrarily. However,
since not all models have the same input and output, the subgraphs are chosen
such that the inputs and outputs of the offspring models are valid for all atomic
models.
In order to restrict the excessive growth of the model, we introduce an addi-
tional regularization operator shown in Fig. 3. The amount of described disper-
sion (as an example, using the R2 metric) is assessed for each graph’s depth
level. The models below the threshold are removed from the tree iteratively with
metric recalculation after each removal. Also, the applied implementation of the
regularization operator can be task-specific (e.g., custom regularization for com-
posite machine learning models [11] and the LASSO regression for the partial
differential equations [9]).
Fig. 3. The scheme of the regularization operator. The numbers are the dispersion
ratios described by the child nodes at the given depth level. In the example, the node
with dispersion ratio 0.1 is removed from the left model to obtain the simpler model
on the right.
Unlike the genetic operators defined in general, the objectives are defined for
the given class of the atomic models and the given problem. The class of the
models defines the way in which the objective function is computed. For example, we
consider the objective referred to as “quality”, i.e., the ability of the given com-
posite model to predict the data in the test part of the dataset. Machine learning
models may be additionally “fitted” to the data for a given composite model
structure. Fitting may be applied to all models simultaneously or consecutively
for every model. Also, the parameters of the machine learning models may be
left without additional optimization, since such fitting increases the optimization
time. The atomic models that realize differential equations and algebraic
expressions are more straightforward than the machine learning models: the only
way to change the quality objective is through the mutation, cross-over, and
regularization operators. We note that there may also be a “fitting” procedure for
the differential equations. However, we do not introduce variable parameters for
the differential terms in the current realization.
In the initial stage of the evolution, according to the standard workflow
of the MOEA/DD [6] algorithm, we have to evaluate the best possible value
for each of the objective functions. The selection of parents for the cross-over
is held for each objective function space region. With a specified probability
of maintaining the parents’ selection, we can select an individual outside the
processed subregion to partake in the recombination. In other cases, if there are
candidate solutions in the region associated with the weights vector, we make
a selection among them. The final element of MOEA/DD is population update
after creating new solutions, which is held without significant modifications. The
resulting algorithm is shown in Algorithm 1.
Data: Class of atomic models T = {T1, T2, ..., Tn}; (optional) subclass of
      immutable models; objective functions
Result: Pareto frontier
Create a set of weight vectors w = (w^1, ..., w^{n_weights}), where
w^i = (w^i_1, ..., w^i_{n_eq+1});
for each weight vector in weights do
    Select the K nearest weight vectors to the weight vector;
Randomly generate a set of candidate models and divide them into
non-dominated levels;
Divide the initial population into groups by the subregion to which they belong;
for epoch = 1 to epoch number do
    for each weight vector in weights do
        Parent selection;
        Apply recombination to the parent pool and mutation to individuals inside
        the region of weights (Fig. 2);
        for each offspring in the new solutions do
            Apply the regularization operator (Fig. 3);
            Get the values of the objective functions for the offspring;
        Update the population;
Algorithm 1: The pseudo-code of model-agnostic Pareto frontier construction
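For illustration, the subregion-aware parent selection step can be sketched as below. The population encoding, the region mapping and the probability delta are simplified assumptions for illustration, not the exact MOEA/DD implementation used in the paper.

import random

def select_parents(population, region_of, current_region, delta=0.9, rng=random):
    # With probability delta both parents come from the subregion associated
    # with the current weight vector; otherwise one parent may be taken from
    # the whole population, which preserves diversity.
    def from_region(region):
        pool = [ind for ind in population if region_of(ind) == region]
        return rng.choice(pool) if pool else rng.choice(population)
    first = from_region(current_region)
    second = from_region(current_region) if rng.random() < delta else rng.choice(population)
    return first, second

# Toy usage: nine individuals spread over three weight-vector subregions.
population = [{"id": i, "region": i % 3} for i in range(9)]
parents = select_parents(population, region_of=lambda ind: ind["region"], current_region=1)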
To sum up, the proposed approach combines classical genetic programming
operators with the regularization operator and the immutability property of
selected nodes. The refined MOEA/DD and refined genetic programming oper-
ators obtain composite models for different atomic model classes.
3 Examples
In this section, several applications of the described approach are shown. We use
a common dataset for all experiments, which is described in Sect. 3.1.
While the main idea is to show the advantages of the multi-objective app-
roach, the particular experiments show different aspects of the approach realiza-
tion for different models’ classes. Namely, we want to show how different choices
of the objectives reflect the expert modeling.
For the machine learning models in Sect. 3.2, we try to mimic the expert’s
approach to the model choice that allows one to transfer models to a set of
related problems and use a robust regularization operator.
Almost the same idea is pursued in mathematical models. Algebraic expres-
sion models in Sect. 3.3 are discovered with the model complexity objective. More
complex models tend to reproduce particular dataset variability and thus may
not be generalized. To reduce the optimization space, we introduce immutable
nodes to restrict the model form without generality loss. The regularization oper-
ator is also changed to assess the dispersion relation between the model residuals
and data, which has a better agreement with the model class chosen.
While the main algebraic-expression workflow remains valid for the partial
differential equation discovery in Sect. 3.4, such models have specifics, such as
the impossibility of solving intermediate models “on the fly”. Therefore, a LASSO
regularization operator is used.
3.1 Experimental Setup
The validation of the proposed approach was conducted for the same dataset
for all types of models: composite machine learning models, models based on
closed-form algebraic expression, and models in differential equations.
The multi-scale environmental process was selected as a benchmark. As the
dataset for the examples, we take the time series of sea surface height obtained
from a numerical simulation using the high-resolution setup of the NEMO
model for the Arctic Ocean [3]. The simulation covers one year with an hourly
time resolution. The visualization of the experimental data is presented in Fig. 4.
It is seen from Fig. 4 that the dataset has several variability scales. The com-
posite models, due to their nature, can reproduce multiple scales of variability.
In this paper, the comparison between single and composite model performance
is left out of scope. We show only that, with a single approach, one may
obtain composite models of different classes.
3.2 Composite Machine Learning Models
The machine learning pipelines’ discovery methods are usually referred to as
automated machine learning (AutoML). For machine learning model design,
a promising task is to control the properties of the obtained model. Quality and
robustness could be considered as examples of the model’s properties. The
Fig. 4. The multi-scale time series of sea surface height used as a dataset for all exper-
iments.
proposed model-agnostic approach can be used to discover the robust compos-
ite machine learning models with the structure described as a directed acyclic
graph (as described in [11]). In this case, the building blocks are regression-based
machine learning models, algorithms for feature selection, and feature transfor-
mation. The specialized lagged transformation is applied to the input data to
adapt the regression models for the time series forecasting. This transformation
is also represented as a building block for the composite graph [4].
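For illustration, the lagged transformation that turns a univariate series into a supervised regression problem can be sketched as follows; the window size and the function name are illustrative assumptions.

import numpy as np

def lagged_transform(series, window):
    # Each row of X contains `window` consecutive observations; the target y
    # is the value that immediately follows that window.
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

ts = np.sin(np.linspace(0.0, 20.0, 300))   # stand-in for the sea surface height series
X, y = lagged_transform(ts, window=24)     # X: (276, 24), y: (276,)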
The quality of the composite machine learning models can be analyzed in
different ways. The simplest way is to estimate the quality of the prediction on
the test sample directly. However, the uncertainty in both models and datasets
makes it necessary to apply the robust quality evaluation approaches for the
effective analysis of modeling quality [16].
The stochastic ensemble-based approach can be applied to obtain the set of
predictions Yens for different modeling scenarios using stochastic perturbation of
the input data for the model. In this case, the robust objective function
f̃_i(Y_ens) can be evaluated as follows:

μ_ens = (1/k) Σ_{j=1}^{k} f_i(Y_ens^j) + 1,
f̃_i(Y_ens) = μ_ens √( (1/(k−1)) Σ_{j=1}^{k} ( f_i(Y_ens^j) − μ_ens )^2 + 1 )    (1)

In Eq. (1), k is the number of models in the ensemble, f_i is the function for the
modelling error, and Y_ens is the ensemble of the modelling results for a specific
configuration of the composite model.
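A possible sketch of this ensemble-based evaluation is shown below. The perturbation scheme, the noise scale and the exact combination of mean and spread are assumptions made for illustration in the spirit of Eq. (1), not the implementation used in the paper.

import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def robust_objective(model_predict, x, y_true, k=10, noise_scale=0.01, seed=0):
    # Stochastic ensemble: perturb the model input k times, collect the error of
    # each scenario, and combine the mean and the spread of the errors into a
    # single robust score (illustrative weighting, in the spirit of Eq. (1)).
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(k):
        x_pert = x + rng.normal(scale=noise_scale * np.std(x), size=x.shape)
        errors.append(rmse(y_true, model_predict(x_pert)))
    errors = np.asarray(errors)
    mu = errors.mean() + 1.0
    return mu * np.sqrt(errors.var(ddof=1) + 1.0)

# Usage with a trivial "model" that simply returns its input.
x = np.linspace(0.0, 1.0, 100)
score = robust_objective(lambda inp: inp, x, y_true=x, k=5)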
The robust and non-robust error measures of the model cannot be minimized
together. In this case, the multi-objective method proposed in the paper can be
used to build the predictive model. We implement the described approach as a
part of the FEDOT framework (https://github.com/nccr-itmo/FEDOT), which
allows building various ML-based composite models. The previous
implementation of FEDOT allowed us to use multi-objective
optimization only for regression and classification tasks with a limited set of
objective functions. After the conducted changes, it can be used for custom
tasks and objectives.
The generative evolutionary approach is used during the experimental studies
to discover the composite machine learning model for the sea surface height
dataset. The obtained Pareto frontier is presented in Fig. 5.
Fig. 5. The Pareto frontier for the evolutionary multi-objective design of the compos-
ite machine learning model. The root mean squared error (RMSE) and the mean variance
for the RMSE are used as objective functions. The ridge and linear regressions, lagged
transformation (referred to as lag), k-nearest neighbours regression (knnreg) and decision
tree regression (dtreg) models are used as parts of the optimal composite models for time
series forecasting.
From Fig. 5 we obtain an interpretation that agrees with practical guide-
lines. Namely, the structures with a single ridge regression (M1) are the most
robust, meaning that the dataset partition affects the coefficients less. The single
decision tree model is, on the contrary, the model most dependent on the dataset
partition.
3.3 Closed-Form Algebraic Expressions
The class of the models may include algebraic expressions to obtain better inter-
pretability of the model. As the first example, we present multi-objective
algebraic expression discovery.
As the algebraic expression, we understand a sum of products of atomic functions,
which we call tokens. Basically, a token is an algebraic expression with
free parameters (as an example, T(t; α1, α2, α3) = α3 sin(α1 t + α2) with
the free parameter set α1, α2, α3), which are subject to optimization. In the present
paper, we use pulses, polynomials, and trigonometric functions as the token set.
For the mathematical expression models overall, it is helpful to introduce two
groups of objectives. The first group of the objectives we refer to as “quality”.
For a given equation M, the quality metric ||·|| is the data D reproduction norm
that is represented as
Q(M) = ||M − D|| (2)
The second group of objectives we refer to as “complexity”. For a given
equation M, the complexity metric is bound to the length of the equation that
is denoted as #(M)
C(M) = #(M) (3)
As an example of objectives, we use the root mean squared error (RMSE) as
the quality metric and the number of tokens present in the resulting model as the
complexity metric. First, the model’s structure is obtained with a separate evo-
lutionary algorithm to compute the mean squared error. It is described in detail
in [10].
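A compact sketch of how such token-based candidates and the two objectives of Eqs. (2)–(3) can be evaluated is given below; the token set, its parameterization and the candidate structure are illustrative assumptions.

import numpy as np

# Parametric tokens: each token maps (t, a1, a2, a3) to a signal component.
TOKENS = {
    "sin":   lambda t, a1, a2, a3: a3 * np.sin(a1 * t + a2),
    "poly":  lambda t, a1, a2, a3: a3 * (a1 * t + a2) ** 2,
    "pulse": lambda t, a1, a2, a3: a3 * ((t > a1) & (t < a2)).astype(float),
}

def evaluate(model, t):
    # A candidate model is a list of (token_name, (a1, a2, a3)) terms summed up.
    return sum(TOKENS[name](t, *params) for name, params in model)

def quality(model, t, data):
    # Eq. (2): RMSE between the model output and the observed data.
    return float(np.sqrt(np.mean((evaluate(model, t) - data) ** 2)))

def complexity(model):
    # Eq. (3): number of tokens in the expression.
    return len(model)

t = np.linspace(0.0, 10.0, 200)
data = 1.5 * np.sin(2.0 * t + 0.3)
candidate = [("sin", (2.0, 0.3, 1.5)), ("poly", (0.1, 0.0, 0.01))]
q, c = quality(candidate, t, data), complexity(candidate)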
To perform the single model evolutionary optimization in this case, we make
the arithmetic operations immutable. The resulting directed acyclic graph is
shown in Fig. 6. Thus, a third type of node appears – the immutable ones. This
step is not necessary, and the general approach described above may be used
instead. However, it reduces the search space and thus reduces the optimization
time without losing generality.
Fig. 6. The scheme of the composite model, generalizing the discovered differential
equation, where red nodes are the nodes unaffected by the mutation or cross-over operators
of the evolutionary algorithm. The blue nodes represent the tokens that the evolutionary
operators can alter. (Color figure online)
The resulting Pareto frontier for the described class of closed-form algebraic
expressions is shown in Fig. 7.
Fig. 7. The Pareto frontier for the evolutionary multi-objective design of the closed-
form algebraic expressions. The root mean squared error (RMSE) and model complex-
ity are used as objective functions.
Since the time series originates from the sea surface height in the ocean, it is
natural to expect that the closed-form algebraic expression is a spectrum-like
decomposition, which is seen in Fig. 7. It is also seen that, as the com-
plexity rises, each additional term only adds information to the model without
significant changes to the terms that are present in the less complex model.
3.4 Differential Equations
The development of a differential equation-based model of a dynamic system can
be viewed from the composite model construction point of view. A tree graph rep-
resents the equation, with input functions encoded as leaves and branches repre-
senting various mathematical operations between these functions. The specifics
of a single equation’s development process were discussed in the article [9].
The evaluation of equation-based model quality is done in a pattern similar to
that of the previously introduced composite models. Each equation represents a
trade-off between its complexity, which we estimate by the number of its terms,
and the quality of the process representation. Here, we will measure this process
representation quality by comparing the left and right parts of the equation.
Thus, the algorithm aims to obtain the Pareto frontier with the quality and
complexity taken as the objective functions.
We cannot use standard error measures such as RMSE since the partial
differential equation with the arbitrary operator cannot be solved automatically.
Therefore, the results from previous sections could not be compared using the
quality metric.
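The quality evaluation via the left/right-hand-side discrepancy can be sketched as below for an ordinary differential equation on a single time series; the candidate terms, the coefficients and the finite-difference derivative are illustrative assumptions, not the exact procedure of the EPDE framework.

import numpy as np

def equation_quality(u, t, terms, coeffs):
    # Quality of a candidate equation du/dt = sum_j c_j * term_j(u, t):
    # the discrepancy between the left and right sides evaluated on the data
    # (a sketch; real discovery also handles spatial derivatives and noise).
    lhs = np.gradient(u, t)
    rhs = sum(c * term(u, t) for c, term in zip(coeffs, terms))
    return float(np.sqrt(np.mean((lhs - rhs) ** 2)))

t = np.linspace(0.0, 5.0, 500)
u = np.exp(-0.5 * t)
terms = [lambda u, t: u, lambda u, t: u ** 2]
error = equation_quality(u, t, terms, coeffs=[-0.5, 0.0])   # candidate: u' = -0.5 u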
Fig. 8. The Pareto frontier for the evolutionary multi-objective discovery of differential
equations, where the complexity objective function is the number of terms in the left part
of the equation, and the quality is the approximation error (the difference between the left
and right parts of the equation).
Despite the achieved quality of the equations describing the process, pre-
sented in Fig. 8, their predictive properties may be lacking. The most appropriate
physics-based equations to describe this class of problems (e.g., shallow-water
equations) include spatial partial derivatives that are not available in processing
a single time series.
4 Conclusion
The paper describes a multi-objective composite models discovery approach
intended for data-driven modeling and initial interpretation.
Genetic programming is a powerful tool for DAG model generation and opti-
mization. However, it requires refinement to be applied to composite model
discovery. We show that the number of changes required is relatively high; there-
fore, the result can no longer be considered a plain genetic programming algorithm.
Moreover, the multi-objective formulation may be used to understand how the
human-formulated objectives affect the optimization, and through this a basic
interpretation is achieved.
As the main advantages we note:
– The model and basic interpretation are obtained simultaneously during the
optimization
– The approach can be applied to the different classes of the models without
significant changes
– Obtained models could have better quality since the multi-objective problem
statement increases diversity which is vital for evolutionary algorithms
As future work, we plan to work on the unification of the approaches, which
will allow obtaining combinations of algebraic-form models and machine learn-
ing models, taking the best from each of the classes: the better interpretability
of mathematical models and the flexibility of machine learning models.
Acknowledgements. This research is financially supported by The Russian Science
Foundation, Agreement #17-71-30029 with cofinancing of Bank Saint Petersburg.
References
1. Elsken, T., Metzen, J.H., Hutter, F., et al.: Neural architecture search: a survey.
J. Mach. Learn. Res. 20(55), 1–21 (2019)
2. Grosan, C.: Evolving mathematical expressions using genetic algorithms. In:
Genetic and Evolutionary Computation Conference (GECCO). Citeseer (2004)
3. Hvatov, A., Nikitin, N.O., Kalyuzhnaya, A.V., Kosukhin, S.S.: Adaptation of NEMO-
LIM3 model for multigrid high resolution Arctic simulation. Ocean Model. 141,
101427 (2019)
4. Kalyuzhnaya, A.V., Nikitin, N.O., Vychuzhanin, P., Hvatov, A., Boukhanovsky, A.:
Automatic evolutionary learning of composite models with knowledge enrichment.
In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference
Companion, pp. 43–44 (2020)
5. Konforti, Y., Shpigler, A., Lerner, B., Bar-Hillel, A.: Inference graphs for CNN
interpretation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV
2020, Part XXV. LNCS, vol. 12370, pp. 69–84. Springer, Cham (2020). https://
doi.org/10.1007/978-3-030-58595-2_5
6. Li, K., Deb, K., Zhang, Q., Kwong, S.: An evolutionary many-objective optimiza-
tion algorithm based on dominance and decomposition. IEEE Trans. Evol. Com-
put. 19(5), 694–716 (2014)
7. Lipton, Z.C.: The mythos of model interpretability: in machine learning, the con-
cept of interpretability is both important and slippery. Queue 16(3), 31–57 (2018)
8. Lu, Q., Ren, J., Wang, Z.: Using genetic programming with prior formula knowl-
edge to solve symbolic regression problem. Comput. Intell. Neurosci. 2016, 1–17
(2016)
9. Maslyaev, M., Hvatov, A., Kalyuzhnaya, A.V.: Partial differential equations dis-
covery with EPDE framework: application for real and synthetic data. J. Comput.
Sci. 53, 101345 (2021). https://doi.org/10.1016/j.jocs.2021.101345, https://www.
sciencedirect.com/science/article/pii/S1877750321000429
10. Merezhnikov, M., Hvatov, A.: Closed-form algebraic expressions discovery using
combined evolutionary optimization and sparse regression approach. Procedia
Comput. Sci. 178, 424–433 (2020)
11. Nikitin, N.O., Polonskaia, I.S., Vychuzhanin, P., Barabanova, I.V., Kalyuzhnaya,
A.V.: Structural evolutionary learning for composite classification models. Procedia
Comput. Sci. 178, 414–423 (2020)
12. Olson, R.S., Moore, J.H.: TPOT: a tree-based pipeline optimization tool for
automating machine learning. In: Workshop on Automatic Machine Learning,
pp. 66–74. PMLR (2016)
13. Saltelli, A., Annoni, P., Azzini, I., Campolongo, F., Ratto, M., Tarantola, S.: Vari-
ance based sensitivity analysis of model output. Design and estimator for the total
sensitivity index. Comput. Phys. Commun. 181(2), 259–270 (2010)
14. Tsakiri, K., Marsellos, A., Kapetanakis, S.: Artificial neural network and multiple
linear regression for flood prediction in Mohawk River, New York. Water 10(9),
1158 (2018)
15. Vu, T.M., Probst, C., Epstein, J.M., Brennan, A., Strong, M., Purshouse, R.C.:
Toward inverse generative social science using multi-objective genetic program-
ming. In: Proceedings of the Genetic and Evolutionary Computation Conference,
pp. 1356–1363 (2019)
16. Vychuzhanin, P., Nikitin, N.O., Kalyuzhnaya, A.V., et al.: Robust ensemble-based
evolutionary calibration of the numerical wind wave model. In: Rodrigues, J.M.F.
(ed.) ICCS 2019, Part I. LNCS, vol. 11536, pp. 614–627. Springer, Cham (2019).
https://doi.org/10.1007/978-3-030-22734-0_45
A Simple Clustering Algorithm Based
on Weighted Expected Distances
Ana Maria A. C. Rocha1(B), M. Fernanda P. Costa2, and Edite M. G. P. Fernandes1

1 ALGORITMI Center, University of Minho, Campus de Gualtar,
4710-057 Braga, Portugal
{arocha,emgpf}@dps.uminho.pt
2 Centre of Mathematics, University of Minho, Campus de Gualtar,
4710-057 Braga, Portugal
mfc@math.uminho.pt
Abstract. This paper contains a proposal to assign points to clusters,
represented by their centers, based on weighted expected distances in a
cluster analysis context. The proposed clustering algorithm has mecha-
nisms to create new clusters, to merge two nearby clusters and remove
very small clusters, and to identify points ‘noise’ when they are beyond
a reasonable neighborhood of a center or belong to a cluster with very
few points. The presented clustering algorithm is evaluated using four
randomly generated and two well-known data sets. The obtained cluster-
ing is compared to other clustering algorithms through the visualization
of the clustering, the value of the DB validity measure and the value
of the sum of within-cluster distances. The preliminary comparison of
results shows that the proposed clustering algorithm is very efficient and
effective.
Keywords: Clustering analysis · Partitioning algorithms · Weighted
distance
1 Introduction
Clustering is an unsupervised machine learning task that is considered one of
the most important data analysis techniques in data mining. In a clustering
problem, unlabeled data objects are to be partitioned into a certain number of
groups (also called clusters) based on their attribute values. The objective is that
objects in a cluster are more similar to each other than to objects in another
cluster [1–4]. In geometrical terms, the objects can be viewed as points in an
a-dimensional space, where a is the number of attributes. Clustering partitions
these points into groups, where points in a group are located near one another
in space.
This work has been supported by FCT – Fundação para a Ciência e Tecnologia within
the R&D Unit Project Scope UIDB/00319/2020.
There are a variety of categories of clustering algorithms. The most tradi-
tional include the clustering algorithms based on partition, the algorithms based
on hierarchy, algorithms based on fuzzy theory, algorithms based on distribution,
those based on density, the ones based on graph theory, based on grid, on fractal
theory and based on model. The basic and core ideas of a variety of commonly
used clustering algorithms, a comprehensive comparison and an analysis of their
advantages and disadvantages are summarized in an interesting review of cluster-
ing algorithms [5]. The most used are the partitioning clustering methods, with
K-means clustering being the most popular [6]. K-means clustering (and other K-
means combinations) subdivide the data set into K clusters, where K is specified
in advance. Each cluster is represented by the centroid (mean) of the data points
belonging to that cluster and the clustering is based on the minimization of the
overall sum of the squared errors between the points and their corresponding
cluster center. While partition clustering constructs various partitions of the
data set and uses some criteria to evaluate them, hierarchical clustering creates
a hierarchy of clusters by combining the data points into clusters, and after that
these clusters are combined into larger clusters and so on. It does not require the
number of clusters in advance. However, once a merge or a split decision has
been made, pure hierarchical clustering does not make adjustments. To overcome
this limitation, hierarchical clustering can be integrated with other techniques
for multiple-phase clustering. Linkage algorithms are agglomerative hierarchical
methods that consider merging clusters based on the distance between clusters.
For example, the single-link is a type of linkage algorithm that merges the two
clusters with the smallest minimum pairwise distance [7]. K-means clustering
easily fails when the cluster forms depart from the hyperspherical shape. On
the other hand, the single-linkage clustering method is affected by the presence
of outliers and the differences in the density of clusters. However, it is not sensi-
tive to shape and size of clusters. Model-based clustering assumes that the data
set comes from a distribution that is a mixture of two or more clusters [8]. Unlike
K-means, the model-based clustering uses a soft assignment, where each point
has a probability of belonging to each cluster. Since clustering can be seen as
an optimization problem, well-known optimization algorithms may be applied in
the cluster analysis, and a variety of them have been combined with the K-means
clustering, e.g. [9–11].
Unlike K-means and pure hierarchical clustering, the herein proposed parti-
tioning clustering algorithm has mechanisms that dynamically adjust the num-
ber of clusters. The mechanisms are able to add, remove and merge clusters
depending on some pre-defined conditions. Although for the initialization of the
algorithm, an initial number of clusters to start the points assigning process
is required, the proposed clustering algorithm has mechanisms to create new
clusters. Points that were not assigned to a cluster (using a weighted expected
distance of a particular center to the data points) are gathered in a new cluster.
The clustering is also able to merge two nearby clusters and remove the clus-
ters that have very few points. Although these mechanisms are in a certain
way similar to others in the literature, this paper gives a relevant contribution
to the object similarity issue. Similarity is usually tackled indirectly by using
a distance measure to quantify the degree of dissimilarity among objects, in a
way that more similar objects have lower dissimilarity values. A probabilistic
approach is proposed to define the weighted expected distances from the cluster
centers to the data points. These weighted distances rely on the average variabil-
ity of the distances of the points to each center with respect to their estimated
expected distances. The larger the variability, the smaller the weighted distance.
Therefore, weighted expected distances are used to assign points to clusters (that
are represented by their centers).
The paper is organized as follows. Section 2 briefly describes the ideas of a
cluster analysis and Sect. 3 presents the details of the proposed weighted dis-
tances clustering algorithm. In Sect. 4, the results of clustering four sets of data
points with two attributes, one set with four attributes, and another with thir-
teen attributes are shown. Finally, Sect. 5 contains the conclusions of this work.
2 Cluster Analysis
Assume that n objects, each with a attributes, are given. These objects can also
be represented by a data matrix X with n vectors/points of dimension a. Each
element Xi,j corresponds to the jth attribute of the ith point. Thus, given X, a
partitioning clustering algorithm tries to find a partition C = {C1, C2, . . . , CK}
of K clusters (or groups), in a way that similarity of the points in the same
cluster is maximum and points from different clusters differ as much as possible.
It is required that the partition satisfies three conditions:
1. each cluster should have at least one point, i.e., |Ck| ≠ 0, k = 1, . . . , K;
2. a point should not belong to two different clusters, i.e., Ck ∩ Cl = ∅, for
   k, l = 1, . . . , K, k ≠ l;
3. each point should belong to a cluster, i.e., Σ_{k=1}^{K} |Ck| = n;
where |Ck| is the number of points in cluster Ck. Since there are a number of ways
to partition the points and maintain these properties, a fitness function should
be provided so that the adequacy of the partitioning is evaluated. Therefore, the
clustering problem could be stated as finding an optimal solution, i.e., a partition
C∗, that gives the optimal (or near-optimal) adequacy when compared to all the
other feasible solutions.
3 Weighted Distances Clustering Algorithm
Distance based clustering is a very popular technique to partition data points
into clusters, and can be used in a large variety of applications. In this section, a
probabilistic approach is described that relies on the weighted distances (WD) of
each data point Xi, i = 1, . . . , n relative to the cluster centers mk, k = 1, . . . , K.
These WD are used to assign the data points to clusters.
3.1 Weighted Expected Distances
To compute the WD between the data point Xi and center mk, we resort to the
distance vectors DVk, k = 1, . . . , K, and to the variations VARk, k = 1, . . . , K,
the variability in cluster center mk relative to all data points. Thus, let DVk =
(DVk,1, DVk,2, . . . , DVk,n) be an n-dimensional distance vector that contains the
distances of the data points X1, X2, . . . , Xn (a-dimensional vectors) to the cluster
center mk ∈ R^a, where DVk,i = ‖Xi − mk‖_2 (i = 1, . . . , n). The componentwise
total of the distance vectors, relative to a particular data point Xi, is given by:

T_i = Σ_{k=1}^{K} DV_{k,i}.

The probability of mk being surrounded and having a good centrality relative
to all points in the data set is given by

p_k = ( Σ_{i=1}^{n} DV_{k,i} ) / ( Σ_{i=1}^{n} T_i ).    (1)

Therefore, the expected value for the distance vector component i (relative to
point Xi) of the vector DVk is E[DVk,i] = pk Ti. Then, the WD is defined for
each component i of the vector DVk, WD(k, i), using

WD(k, i) = DV_{k,i} / VAR_k    (2)

(corresponding to the weighted distance assigned to data point Xi by the cluster
center mk), where the variation VARk is the average amount of variability of
the distances of the points to the center mk, relative to the expected distances:

VAR_k = ( (1/n) Σ_{i=1}^{n} ( DV_{k,i} − E[DV_{k,i}] )^2 )^{1/2}.

The larger the VARk, the lower the weighted distances WD(k, i), i = 1, . . . , n,
of the points relative to that center mk, when compared to the other centers.
A larger VARk means that, on average, the difference between the estimated
expected distances and the real distances to that center mk is larger than for
the other centers.
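A compact NumPy sketch of these quantities is given below; the array shapes and the function name are illustrative, and the computation simply follows the definitions above.

import numpy as np

def weighted_distances(X, centers):
    # DV[k, i] = ||X_i - m_k||_2, T_i = sum_k DV[k, i], p_k as in (1),
    # E[DV_{k,i}] = p_k * T_i, VAR_k = sqrt(mean_i (DV[k, i] - E)^2),
    # and finally WD[k, i] = DV[k, i] / VAR_k as in (2).
    DV = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)   # (K, n)
    T = DV.sum(axis=0)                                                 # (n,)
    p = DV.sum(axis=1) / T.sum()                                       # (K,)
    E = p[:, None] * T[None, :]                                        # (K, n)
    VAR = np.sqrt(np.mean((DV - E) ** 2, axis=1))                      # (K,)
    return DV / VAR[:, None]

# Toy usage: 6 two-dimensional points and 2 tentative centers.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [4.0, 4.0], [4.1, 3.9], [3.9, 4.2]])
centers = np.array([[0.1, 0.1], [4.0, 4.0]])
WD = weighted_distances(X, centers)   # shape (2, 6); lower means closer/better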
3.2 The WD-Based Algorithm
Algorithm 1 presents the main steps of the proposed weighted expected distances
clustering algorithm. For initialization, a problem dependent number of clusters
is provided, e.g. K, so that the algorithm randomly selects K points from the
data set to initialize the cluster centers. To determine which data points are
assigned to a cluster, the algorithm uses the WD as a fitness function. Each
point Xi has a WD relative to a cluster center mk, the above defined WD(k, i).
The lower the better. The minimization of WD(k, i) relative to k takes care of
the local property of the algorithm, assigning Xi to cluster Ck.
Algorithm 1. Clustering Algorithm
Require: data set X ≡ Xi,j, i = 1, . . . , n, j = 1, . . . , a; Itmax; Nmin
1: Set initial K; It = 1
2: Randomly select a set of K points from the data set X to initialize the K cluster
   centers mk, k = 1, . . . , K
3: repeat
4:   Based on mk, k = 1, . . . , K, assign the data points to clusters and eventually
     add a new cluster, using Algorithm 2
5:   Compute ηD using (4)
6:   Based on ηD, eventually remove clusters by merging two at a time, and remove
     clusters with less than Nmin points, using Algorithm 3
7:   Set m̄k ← mk, k = 1, . . . , K
8:   Compute improved values for the cluster centers mk, k = 1, . . . , K, using Algo-
     rithm 4
9:   Set It = It + 1
10: until It > Itmax or max_{k=1,...,K} ‖m̄k − mk‖_2 ≤ 1e−3
11: return K∗, mk, k = 1, . . . , K∗, and the optimal C∗ = {C∗_1, C∗_2, . . . , C∗_K∗}.
Thus, to assign a point Xi to a cluster, the algorithm identifies the center
mk that has the lowest weighted distance value to that point, and if WD(k, i)
is inferior to the quantity 1.5ηk, where ηk is a threshold value for the center mk,
the point is assigned to that cluster. On the other hand, if WD(k, i) is between
1.5ηk and 2.5ηk a new cluster CK+1 is created and the point is assigned to CK+1;
otherwise, the WD(k, i) value exceeds 2.5ηk and the point is considered ‘noise’.
Thus, all points that have a minimum weighted distance to a specific center
mk between 1.5ηk and 2.5ηk are assigned to the same cluster CK+1, and if the
minimum weighted distances exceed 2.5ηk, the points are ‘noise’. The goal of
the threshold ηk is to define a bound that delimits the proximity to a cluster center.
The limit 1.5ηk defines the region of a good neighborhood, and beyond 2.5ηk the
region of ‘noise’ begins. Similarity in the weighted distances and proximity
to a cluster center mk are decided using the threshold ηk. For each cluster k, the
ηk is found by using the average of the WD(k, ·) values of the points which are in the
neighborhood of mk and have magnitudes similar to each other. The similarity
between the magnitudes of the WD is measured by sorting their values and
checking if the consecutive differences remain lower than a threshold Δ_WD. This
value depends on the number of data points and the median of the matrix WD
values (see Algorithm 2 for the details):

Δ_WD = ((n − 1)/n) median[ WD(k, i) ]_{k=1,...,K, i=1,...,n}.    (3)
During the iterative process, clusters that are close to each other, measured
by the distance between their centers, may be merged if the distance between
their centers is below a threshold, herein denoted as ηD. This parameter value
is defined as a function of the search region of the data points and also depends
on the current number of clusters and the number of attributes of the data set:

η_D = min_{j=1,...,a}{ max_i X_{i,j} − min_i X_{i,j} } / ( K (a − 1) ).    (4)
Furthermore, all clusters that have a relatively small number of points, i.e.,
all clusters that verify |Ck| < Nmin, where Nmin = min{50, max{2, 0.05n}},
are removed, their centers are deleted, and the points are labeled as ‘noise’ (for
the current iteration), see Algorithm 3.
Algorithm 2. Assigning Points Algorithm
Require: mk for k = 1, . . . , K; data set Xi,j, i = 1, . . . , n, j = 1, . . . , a
1: Compute each component i of the vector DVk, i = 1, . . . , n (k = 1, . . . , K)
2: Compute the matrix of WD as shown in (2)
3: Compute Δ_WD using (3)
4: for k = 1, . . . , K do
5:   WDsort(k) ← sort(WD(k, :))
6: end for
7: for k = 1, . . . , K do
8:   for i = 1, . . . , n − 1 do
9:     if (WDsort(k, i + 1) − WDsort(k, i)) > Δ_WD then
10:      break
11:    end if
12:  end for
13:  Compute ηk = ( Σ_{l=1}^{i} WDsort(k, l) ) / i
14: end for
15: for i = 1, . . . , n do
16:  Set k = arg min_{k=1,...,K} WD(k, i)
17:  if WD(k, i) < 1.5 ηk then
18:    Assign point Xi to the cluster Ck
19:  else
20:    if WD(k, i) < 2.5 ηk then
21:      Assign point Xi to the new cluster C_{K+1}
22:    else
23:      Define point Xi as ‘noise’ (class Xnoise)
24:    end if
25:  end if
26: end for
27: Compute the center m_{K+1} as the centroid of all points assigned to C_{K+1}
28: Update K
29: return K, mk, Ck, for k = 1, . . . , K, and Xnoise.
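For illustration, the core of Algorithm 2 can be rendered in a few lines of NumPy as sketched below, assuming the weighted-distance matrix WD of Sect. 3.1 is already available; the labels K and −1 mark the new cluster and the ‘noise’ points, and the handling of the no-jump case is a simplification.

import numpy as np

def assign_points(WD):
    # eta_k averages the sorted WD(k, :) values up to the first jump larger than
    # Delta_WD; points then go to the closest cluster, to a new cluster, or to
    # 'noise' depending on the 1.5*eta_k and 2.5*eta_k thresholds.
    K, n = WD.shape
    delta = (n - 1) / n * np.median(WD)
    eta = np.empty(K)
    for k in range(K):
        s = np.sort(WD[k])
        jumps = np.where(np.diff(s) > delta)[0]
        i = (jumps[0] + 1) if jumps.size else n
        eta[k] = s[:i].mean()
    labels = np.empty(n, dtype=int)      # 0..K-1: cluster; K: new cluster; -1: noise
    for i in range(n):
        k = int(np.argmin(WD[:, i]))
        if WD[k, i] < 1.5 * eta[k]:
            labels[i] = k
        elif WD[k, i] < 2.5 * eta[k]:
            labels[i] = K                # gathered into the new cluster C_{K+1}
        else:
            labels[i] = -1               # point 'noise'
    return labels, eta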
Summarizing, to achieve the optimal number of clusters, Algorithm 1
proceeds iteratively by:
– adding a new cluster whose center is represented by the centroid of all points
that were not assigned to any cluster according to the rules previously defined;
– defining points ‘noise’ if they are not in an appropriate neighborhood of the
center that is closest to those points, or if they belong to a cluster with very
few points;
– merging clusters that are sufficiently close to each other, as well as removing
clusters with very few points.
On the other hand, the position of each cluster center can be iteratively
improved, by using the centroid of the data points that have been assigned to
that cluster and the previous cluster center [12,13], as shown in Algorithm 4.
Algorithm 3. Merging and Removing Clusters Algorithm
Require: K, mk, Ck, k = 1, . . . , K, ηD, Nmin, Xnoise
1: for k = 1 to K − 1 do
2:   for l = k + 1 to K do
3:     Compute Dk,l = ‖mk − ml‖_2
4:     if Dk,l ≤ ηD then
5:       Compute mk = ( |Ck| mk + |Cl| ml ) / ( |Ck| + |Cl| )
6:       Assign all points X ∈ Cl to the cluster Ck
7:       Delete ml (and remove Cl)
8:       Update K
9:     end if
10:  end for
11: end for
12: for k = 1, . . . , K do
13:  if |Ck| < Nmin then
14:    Define all X ∈ Ck as points ‘noise’ (add to class Xnoise)
15:    Delete mk (remove Ck)
16:    Update K
17:  end if
18: end for
19: return K, mk, Ck for k = 1, . . . , K, and Xnoise.
Algorithm 4. Update Cluster Centers Algorithm
Require: mk, Ck, k = 1, . . . , K
1: Update all cluster centers:

   mk = ( 1 / (|Ck| + 1) ) ( Σ_{X_l ∈ Ck} X_l + mk ), for k = 1, . . . , K

2: return mk, k = 1, . . . , K
4 Computational Results
In this section, some preliminary results are shown. Four sets of data points
with two attributes, one set with four attributes (known as ‘Iris’) and one
set with thirteen attributes (known as ‘Wine’) are used to compute and visu-
alize the partitioning clustering. The algorithms are coded in MATLAB R

.
The performance of the Algorithm 1, based on Algorithm 2, Algorithm 3
and Algorithm 4, depends on some parameter values that are problem depen-
dent and dynamically defined: initial K = min{10, max{2, 0.01n}}, Nmin =
min{50, max{2, 0.05n}}, ΔW D  0, ηk  0, k = 1, . . . , K, ηD  0. The clus-
tering results were obtained for Itmax = 5, except for ‘Iris’ and ‘Wine’ where
Itmax = 20 was used.
4.1 Generated Data Sets
The generator code for the first four data sets is the following:
Problem 1. ‘mydata’ (available at Mostapha Kalami Heris, Evolutionary Data
Clustering in MATLAB, https://yarpiz.com/64/ypml101-evolutionary-clustering,
Yarpiz, 2015) with 300 data points and two attributes.
Problem 2. 2000 data points and two attributes.
mu1 = [1 2]; Sigma1 = [2 0; 0 0.5];
mu2 = [-3 -5]; Sigma2 = [1 0; 0 1];
X = [mvnrnd(mu1,Sigma1,1000); mvnrnd(mu2,Sigma2,1000)];
Problem 3. 200 data points and two attributes.
X = [randn(100,2)*0.75+ones(100,2); randn(100,2)*0.5-ones(100,2)];
Problem 4. 300 data points and two attributes.
mu1 = [1 2]; sigma1 = [3 .2; .2 2];
mu2 = [-1 -2]; sigma2 = [2 0; 0 1];
X = [mvnrnd(mu1,sigma1,200); mvnrnd(mu2,sigma2,100)];
In Figs. 1, 2, 3 and 4, the results obtained for Problems 1–4 are depicted:
– the plot (a) is the result of assigning the points (applying Algorithm 2) to
the clusters based on the initialized centers (randomly selected K points of
the data set);
– the plot (a) may contain a point, represented by a dedicated symbol, which iden-
tifies the position of the center of the cluster that is added in Algorithm 2
with the points that were not assigned (according to the previously referred
rule) to a cluster;
– plots (b) and (c) are obtained after the application of Algorithm 4 to update
the centers that result from the clustering obtained by Algorithm 2 and Algo-
rithm 3, at an iteration;
– the final clustering is visualized in plot (d);
– plots (a), (b), (c) or (d) may contain points ‘noise’ identified by the sym-
bol ‘×’;
– plot (e) shows the values of the Davies-Bouldin (DB) index, at each iteration,
obtained after assigning the points to clusters (according to the previously
referred rule) and after the update of the cluster centers;
– plot (f) contains the K-means clustering when K = K∗
is provided to the
algorithm.
The code that implements the K-means clustering [14] is used for comparative
purposes, although with K-means the number of clusters K had to be specified
in advance. The performances of the tested clustering algorithms are measured
in terms of a cluster validity measure, the DB index [15]. The DB index aims to
evaluate intra-cluster similarity and inter-cluster differences by computing
DB = (1/K) Σ_{k=1}^{K} max_{l=1,...,K, l≠k} ( (S_k + S_l) / d_{k,l} )    (5)
where Sk (resp. Sl) represents the average of all the distances between the center
mk (resp. ml) and the points in cluster Ck (resp. Cl) and dk,l is the distance
between mk and ml. The smallest DB index indicates a valid optimal partition.
The computation of the DB index assumes that a clustering has been done.
The values presented in plots (a) (on the x-axis), (b), (c) and (d) (on the y-axis),
and (e) are obtained out of the herein proposed clustering (recall Algorithm 2,
Algorithm 3 and Algorithm 4). The values of the ‘DB index(m,X)’ and ‘CS
index(m,X)’ (on the y-axis) of plot (a) come from assigning the points to clusters
based on the usual minimum distances of points to centers. The term CS index
refers to the CS validity measure and is a function of the ratio of the sum of
within-cluster scatter to between-cluster separation [13]. The quantity ‘Sum of
Within-Cluster Distance (WCD)’ (in plots (b), (c) and (d) (on the x-axis)) was
obtained out of our proposed clustering process.
In conclusion, the proposed clustering is effective, very efficient and robust.
As it can be seen, the DB index values that result from our clustering are slightly
lower than (or equal to) those registered by the K-means clustering.
4.2 ‘Iris’ and ‘Wine’ Data Sets
The results of our clustering algorithm, when solving Problems 5 and 6 are now
shown. To compare the performance, our algorithm was run 30 times for each
data set.
Problem 5. ‘Iris’ with 150 data points. It contains three categories (types of
iris plant) with 4 attributes (sepal length, sepal width, petal length and petal
width) [16].
Problem 6. ‘Wine’ with 178 data points. It contains chemical analysis of 178
wines derived from 3 different regions, with 13 attributes [16].
Fig. 1. Clustering process of Problem 1: (a) assign after initialization; (b) 1st iteration;
(c) 2nd iteration; (d) final clustering (5th iteration); (e) DB index plot; (f) K-means
with K = 3.
The results are compared to those of [17], which uses a particle swarm opti-
mization (PSO) approach to the clustering, see Table 1. When solving the ‘Iris’
problem, our algorithm finds the 3 clusters in 77% of the runs (23 successful
runs out of 30). When solving the problem ‘Wine’, 27 out of 30 runs identified
the 3 clusters.

Fig. 2. Clustering process of Problem 2: (a) assign after initialization; (b) 1st iteration;
(c) 2nd iteration; (d) final clustering (5th iteration); (e) DB index plot; (f) K-means
with K = 2.
Table 1 shows the final value of a fitness function, known as Sum of Within-
Cluster Distance (WCD) (the best, the average (avg.), the worst and the stan-
A Simple Clustering Algorithm Based on Weighted Expected Distances 97
-2 -1 0 1 2 3
DB index = 7.2298
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
2.5
DB
index(m,X)
=
4.6524;
CS
index(m,X)
=
1.6856
Random selection of centers+Assign
Cluster 1
Cluster 2
Cluster 3
Noise
(a) Assign after initialization
-2 -1 0 1 2 3
Sum of Within-Cluster Distance (WCD) = 211.878
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
2.5
BD
index
after
updating
centroids
=
0.8258
Assign+Delete+Update centroids, at iteration 1
Cluster 1
Cluster 2
Noise
(b) 1st. iteration
-2 -1 0 1 2 3
Sum of Within-Cluster Distance (WCD) = 165.2711
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
2.5
BD
index
after
updating
centroids
=
0.68398
Assign+Delete+Update centroids, at iteration 2
Cluster 1
Cluster 2
(c) 2nd. iteration
-2 -1 0 1 2 3
Sum of Within-Cluster Distance (WCD) = 157.7905
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
2.5
BD
index
after
updating
centroids
=
0.62417
Assign+Delete+Update centroids, at iteration 5
Cluster 1
Cluster 2
(d) final clustering (5th iteration)
1 1.5 2 2.5 3 3.5 4 4.5 5
Iterations
0
1
2
3
4
5
6
7
8
Evaluation of new Updated Centers by DB index
DB index (After Update Centroids)
DB index (of Assign)
(e) DB index plot
-2 -1 0 1 2 3
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
2.5
Kmeans, # clusters = 2 / DB index = 0.62418 / 5 iterations
Cluster 1
Cluster 2
(f) K-means with K = 2
Fig. 3. Clustering process of Problem 3
dard deviation (St. Dev.) values over the successful runs). The third column of
the table has the WCD that results from our clustering, i.e., after assigning the
points to clusters (according to the above described rules based on the WD –
Algorithm 2), merging and removing clusters if appropriate (Algorithm 3) and
98 A. M. A. C. Rocha et al.
-6 -4 -2 0 2 4 6 8
DB index = 1.2733
-4
-2
0
2
4
6
DB
index(m,X)
=
1.2146;
CS
index(m,X)
=
1.7295
Random selection of centers+Assign
Cluster 1
Cluster 2
Cluster 3
Cluster 4
(a) Assign after initialization
-6 -4 -2 0 2 4 6 8
Sum of Within-Cluster Distance (WCD) = 546.8942
-4
-2
0
2
4
6
BD
index
after
updating
centroids
=
0.98561
Assign+Delete+Update centroids, at iteration 1
Cluster 1
Cluster 2
Noise
(b) 1st. iteration
-6 -4 -2 0 2 4 6 8
Sum of Within-Cluster Distance (WCD) = 528.8877
-4
-2
0
2
4
6
BD
index
after
updating
centroids
=
0.91049
Assign+Delete+Update centroids, at iteration 2
Cluster 1
Cluster 2
Noise
(c) 2nd. iteration
-6 -4 -2 0 2 4 6 8
Sum of Within-Cluster Distance (WCD) = 507.6508
-4
-2
0
2
4
6
BD
index
after
updating
centroids
=
0.84801
Assign+Delete+Update centroids, at iteration 5
Cluster 1
Cluster 2
Noise
(d) final clustering (5th iteration)
1 1.5 2 2.5 3 3.5 4 4.5 5
Iterations
0.8
0.85
0.9
0.95
1
1.05
1.1
1.15
1.2
1.25
1.3
Evaluation of new Updated Centers by DB index
DB index (After Update Centroids)
DB index (of Assign)
(e) DB index plot
-6 -4 -2 0 2 4 6 8
-4
-2
0
2
4
6
Kmeans, # clusters = 2 / DB index = 0.86305 / 5 iterations
Cluster 1
Cluster 2
(f) K-means with K = 2
Fig. 4. Clustering process of Problem 4
updating the centers (Algorithm 4), ‘WCDW D’. The number of points ‘noise’
at the final clustering (of the best run, i.e., the run with the minimum value of
‘WCDW D’), ‘# noise’, the number of iterations, It, the time in seconds, ‘time’,
and the percentage of successful runs, ‘suc.’, are also shown in the table. The
A Simple Clustering Algorithm Based on Weighted Expected Distances 99
Table 1. Clustering results for Problems 5 and 6
Algorithm 1 Results in [17]
Problem ‘WCDW D’ # noise It Time suc. ‘WCD’ ‘WCD’ Time
‘Iris’ Best 107.89 0 5 0.051 77% 100.07 97.22 0.343
Avg. 108.21 8.7 0.101 100.17 97.22 0.359
Worst 109.13 12 0.156 100.47 97.22 0.375
St. Dev. 0.56 0.18
‘Wine’ Best 17621.45 0 6 0.167 90% 16330.19 16530.54 2.922
Avg. 19138.86 9.1 0.335 17621.12 16530.54 2.944
Worst 20637.04 13 1.132 19274.45 16530.54 3.000
St. Dev. 1249.68 1204.04
Table 2. Results comparison for Problems 5 and 6

                       Algorithm 1                           K-means    K-NM-PSO
Problem                'WCDD'     # noise  It  Time   suc.   'WCD'      'WCD'
'Iris'    Best         97.20      0        4   0.096  63%    97.33      96.66
          Avg.         97.21      8.7          0.204         106.05     96.67
          St. Dev.     0.01                                  14.11      0.008
'Wine'    Best         16555.68   0        6   0.159  90%    16555.68   16292.00
          Avg.         17031.79   10.3         0.319         18061.00   16293.00
          St. Dev.     822.08                                793.21     0.46
column identified with ‘WCD’ contains values of WCD (at the final clustering)
using X and the obtained final optimal cluster centers to assign the points to
the clusters/centers based on the minimum usual distances from each point to
the centers.
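For illustration, the 'WCD' reported in this column can be recomputed from the data matrix X and a set of final cluster centers by assigning each point to its nearest center under the usual Euclidean distance and summing those distances. The short sketch below is only an illustrative NumPy version of this computation (variable names are assumptions), not the authors' implementation.

import numpy as np

def wcd(X, centers):
    """Sum of within-cluster distances: each point of X is assigned to its
    nearest center (usual Euclidean distance) and those distances are summed."""
    # dists[i, k] = ||X[i] - centers[k]||
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.min(axis=1).sum()

# e.g. wcd(data, final_centers) would reproduce an entry of the 'WCD' column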
For the same optimal clustering, the 'WCD' always has a lower value than
'WCDWD'. When comparing the 'WCD' of our clustering with the 'WCD' registered
in [17], our values are higher, in particular the average and the worst
values, except for the best 'WCD' result of problem 'Wine'. The variability
of the results registered in [17] and [18] (see also Table 2) is rather small, but
this is due to the type of clustering based on the PSO algorithm. The clustering
algorithm K-NM-PSO [18] is a combination of K-means, the local optimization
method Nelder-Mead (NM) and the PSO. In contrast, our algorithm finds the
optimal clustering much faster than the PSO clustering approach in [17]. This
is a point in favor of our algorithm, an essential requirement when solving large
data sets.
The results of our algorithm shown in Table 2 (third column identified with
‘WCDD’) are based on the designed and previously described Algorithms 1,
2, 3 and 4, but using the usual distances between points and centers for the
assigning process (over the iterative process). These values are slightly better
than those obtained by the K-means (available in [18]), but not as interesting as
those obtained by the K-NM-PSO (reported in [18]).
5 Conclusions
A probabilistic approach to define the weighted expected distances from the
cluster centers to the data points is proposed. These weighted distances are used
to assign points to clusters, represented by their centers. The new proposed
clustering algorithm relies on mechanisms to create new clusters. Points that are
not assigned to a cluster, based on the weighted expected distances, are gathered
in a new cluster. The clustering is also able to merge two nearby clusters and
remove the clusters that have very few points. Points are also identified as ‘noise’
when they are beyond a limit of a reasonable neighbourhood (as far as weighted
distances are concerned) and when they belong to a cluster with very few points.
The preliminary clustering results, and the comparisons with the K-means and
other K-means combinations with stochastic algorithms, show that the proposed
algorithm is very efficient and effective.
In the future, data sets with non-convex clusters and clusters of different
shapes and sizes, in particular non-spherical blobs, will be addressed.
Our proposal is to integrate a kernel function into our clustering approach. We
aim to further investigate the dependence of the parameter values of the algorithm
on the number of attributes, in particular when high-dimensional data
sets are to be partitioned. The set of tested problems will also be enlarged to
include large data sets with clusters of different shapes, including non-convex ones.
Acknowledgments. The authors wish to thank three anonymous referees for their
comments and suggestions to improve the paper.
References
1. Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: a review. ACM Comput.
Surv. 31(3), 264–323 (1999)
2. Greenlaw, R., Kantabutra, S.: Survey of clustering: algorithms and applications.
Int. J. Inf. Retr. Res. 3(2) (2013). 29 pages
3. Ezugwu, A.E.: Nature-inspired metaheuristics techniques for automatic clustering:
a survey and performance study. SN Appl. Sci. 2, 273–329 (2020)
4. Mohammed, J.Z., Meira, W., Jr.: Data Mining and Machine Learning: Fundamen-
tal Concepts and Algorithms, 2nd edn. Cambridge University Press, Cambridge
(2020)
5. Xu, D., Tian, Y.: A comprehensive survey of clustering algorithms. Ann. Data Sci.
2(2), 165–193 (2015)
6. MacQueen, J.B.: Some methods for classification and analysis of multivariate obser-
vations. In: Le Cam, L.M., Neyman, J. (eds.) Proceedings of the Fifth Berkeley
Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297. Uni-
versity of California Press (1967)
7. Manning, C.D., Raghavan, P., Schütze, H.: Introduction to Information Retrieval.
Cambridge University Press, Cambridge (2008)
8. Fraley, C., Raftery, A.E.: Model-based clustering, discriminant analysis and density
estimation. J. Am. Stat. Assoc. 97(458), 611–631 (2002)
9. Kwedlo, W.: A clustering method combining differential evolution with K-means
algorithm. Pattern Recogn. Lett. 32, 1613–1621 (2011)
10. Patel, K.G.K., Dabhi, V.K., Prajapati, H.B.: Clustering using a combination of
particle swarm optimization and K-means. J. Intell. Syst. 26(3), 457–469 (2017)
11. He, Z., Yu, C.: Clustering stability-based evolutionary K-means. Soft. Comput. 23,
305–321 (2019)
12. Sarkar, M., Yegnanarayana, B., Khemani, D.: A clustering algorithm using evolu-
tionary programming-based approach. Pattern Recogn. Lett. 18, 975–986 (1997)
13. Chou, C.-H., Su, M.-C., Lai, E.: A new cluster validity measure and its application
to image compression. Pattern Anal. Appl. 7, 205–220 (2004)
14. Asvadi, A.: K-means Clustering Code. Department of ECE, SPR Lab., Babol
(Noshirvani) University of Technology (2013). http://www.a-asvadi.ir/
15. Davies, D.L., Bouldin, D.W.: A cluster separation measure. IEEE Trans. Pattern
Anal. Mach. Intell. PAMI-1(2), 224–227 (1979)
16. Dua, D., Graff, C.: UCI Machine Learning Repository. University of California,
School of Information and Computer Science, Irvine, CA (2019). http://archive.ics.uci.edu/ml
17. Cura, T.: A particle swarm optimization approach to clustering. Expert Syst. Appl.
39, 1582–1588 (2012)
18. Kao, Y.-T., Zahara, E., Kao, I.-W.: A hybridized approach to data clustering.
Expert Syst. Appl. 34, 1754–1762 (2008)
Optimization of Wind Turbines
Placement in Offshore Wind Farms:
Wake Effects Concerns
José Baptista¹,²(B), Filipe Lima¹, and Adelaide Cerveira¹,²
¹ School of Science and Technology, University of Trás-os-Montes and Alto Douro, 5000-801 Vila Real, Portugal
{baptista,cerveira}@utad.pt, al64391@utad.eu
² UTAD's Pole, INESC-TEC, 5000-801 Vila Real, Portugal
Abstract. In the coming years, many countries are going to bet on
the exploitation of offshore wind energy. This is the case of southern
European countries, where there is great wind potential for offshore
exploitation. Although the conditions for energy production are more
advantageous, all the costs involved are substantially higher when com-
pared to onshore. It is, therefore, crucial to maximize system efficiency.
In this paper, an optimization model based on a Mixed-Integer Linear
Programming model is proposed to find the best wind turbines loca-
tion in offshore wind farms taking into account the wake effect. A case
study, concerning the design of an offshore wind farm, was carried out
and several performance indicators were calculated and compared. The
results show that the placement of the wind turbines diagonally presents
better results for all performance indicators and corresponds to a lower
percentage of energy production losses.
Keywords: Offshore wind farm · Wake effect · Optimization
1 Introduction
In the last two decades, wind energy has been a huge investment for almost
all developed countries, with a strong bet on onshore wind farms (WF). It is
expected that investments will be directed towards the energy exploration in
the sea, where offshore wind farms (OWFs) are a priority. Recently, the Euro-
pean Union has highlighted the enormous unexplored potential in Europe’s seas,
intending to multiply offshore wind energy by 20 to reach 450 GW, to meet the
objective of decarbonizing energy and achieving carbon neutrality by 2050 [6].
An OWF has enormous advantages over onshore, with reduced visual impact
and higher efficiency, where the absence of obstacles allows the wind to reach a
higher and more constant speed. However, the installation costs are substan-
tially higher [10]. So, it is very important to optimize the efficiency of these
© Springer Nature Switzerland AG 2021. A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 102–109, 2021. https://doi.org/10.1007/978-3-030-91885-9_8
WFs, wherein the location of wind turbines (WTs) is one of the most important
factors to consider, taking into account the wake effect impact on the energy
production of each one. To achieve this goal new models related to WF layout
optimization should be investigated. Several authors addressed the topic using
models based on genetic algorithms (GA), as is the case of Grady et al. [7]. In
[12], a mathematical model based on a particle swarm optimization (PSO) algorithm,
which includes the variation of both wind direction and wake deficit, is
proposed. A different approach is addressed in [13], where a modular framework
for the optimization of an OWF using a genetic algorithm is presented. In [14]
a sequential procedure is used for global optimization consisting of two steps:
i) a heuristic method to set an initial random layout configuration, and ii) the
use of nonlinear mathematical programming techniques for local optimization,
which use the random layout as an initial solution. Other authors address the
topic of WF optimization, focusing on optimizing the interconnection between
the turbines, known as cable routing optimization. Cerveira et al. [5] propose
a flow formulation to find the optimal cable network layout considering
only the cable infrastructure costs. In [3,4,11] Mixed-Integer Linear Programming
(MILP) models are proposed to optimize the wind farm layout in order
to obtain the greatest profit from wind energy while minimizing the layout costs. Assuming
that the ideal position of the turbines has already been identified, the most
effective way of connecting the turbines to each other and to the substations,
minimizing the infrastructure cost and the costs of power losses during the wind
farm lifetime, is considered.
The main aim of this paper is to optimize the location of the WTs to minimize
the single wake effect and so to maximize energy production. In order to achieve
this target, a MILP optimization model is proposed. A case study is considered
and performance indicators were computed and analyzed, obtaining guidelines
for the optimal relative position of the WTs. The paper is organized as follows.
In Sect. 2 the wind power model is addressed and in Sect. 3 the MILP model
to optimize the WTs placement is presented. Section 4 presents the case study
results. Finally, Sect. 5 highlights some concluding remarks.
2 Wind Farm Power Model
A correct evaluation of the wind potential for power generation must be based
on the potential of the wind resource and various technical and physical aspects
that can affect energy production. Among them, the wind speed probability
distribution, the effect of height, and the wake effect are highlighted.
2.1 Power in the Wind
The energy available for a WT is the kinetic energy associated with an air column
traveling at a uniform and constant speed v, in m/s. The power available in the
wind, in W, is proportional to the cube of the wind speed, given by Eq. (1),
P = \frac{1}{2}(\rho A v)\,v^2 = \frac{1}{2}\rho A v^3    (1)
where A is the area swept by the rotor blades, in m², v is the wind speed, in
m/s, and ρ is the air density, usually considered constant during the year, with
a standard value of 1.225 kg/m³. The power availability significantly
depends on the wind speed, so, it is very important to place the turbines where
the wind speed is high. Unfortunately, the total wind power cannot be recovered
in a WT. According to Betz's law [2], at most 59.3% of the kinetic energy can be
converted into mechanical energy to be used through the turbine.
Typically, turbines' technical data are provided by manufacturers. This is the case of
the power curve, which gives the electrical power output, Pe(v), as a function of
the wind speed, v. Knowing the wind profile and the turbine power curve, it is
possible to determine the energy Ea produced by the conversion system,
E_a = 8760 \int_{v_0}^{v_{max}} f(v)\, P_e(v)\, dv.    (2)
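As an illustration of Eq. (2), the integral can be approximated numerically once a wind-speed distribution f(v) and a power curve Pe(v) are available. The sketch below uses a Rayleigh distribution (the one adopted later in the case study) together with a purely illustrative power curve; the curve shape, the integration step and the function names are assumptions and are not the Vestas V164-8.0 data.

import numpy as np

def rayleigh_pdf(v, v_mean):
    # Rayleigh probability density parameterized by the mean wind speed v_mean
    return (np.pi * v / (2.0 * v_mean**2)) * np.exp(-np.pi * v**2 / (4.0 * v_mean**2))

def annual_energy(power_curve, v_mean, v0=4.0, vmax=25.0, dv=0.5):
    """Discrete approximation of E_a = 8760 * integral of f(v) * Pe(v) dv (MWh/year)."""
    v = np.arange(v0, vmax + dv, dv)
    f = rayleigh_pdf(v, v_mean)
    pe = power_curve(v)                      # electrical power output, in MW
    return 8760.0 * np.sum(f * pe) * dv

# Illustrative curve only: cubic ramp up to an 8 MW rated power at 13 m/s
toy_curve = lambda v: np.clip(8.0 * ((v - 4.0) / 9.0) ** 3, 0.0, 8.0)
print(annual_energy(toy_curve, v_mean=8.0))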
2.2 The Jensen Model for Wake Effect
The power extraction of turbines from the wind depends on the wind energy
content, which differs in its directions. The downstream wind (the one that
comes out of the turbine) has less energy content than the upstream wind. The
wake effect is the effect of upstream wind turbines on the speed of the wind that
reaches downstream turbines. In a WF with several WTs, it is very
likely that the swept area of a turbine is influenced by one or more upstream
turbines; this effect can be described by the Jensen model, proposed in
1983 by N. O. Jensen [8] and developed in 1986 by Katic et al. [9]. The initial
diameter of the wake is equal to the diameter of the turbine and expands linearly
with the downstream distance. Similarly, the downstream wind deficit declines linearly
with the distance to the turbine. The simple wake model consists of a linear
expression of the wake diameter and the decay of the speed deficit inside the
wake. The downstream wind speed (vd) of the turbine can be obtained by (3),
where v0 represents the upstream wind speed.
v_d = v_0 \left( 1 - \sqrt{ \sum_{i=1}^{N} \left( \frac{2a}{\left(1 + \alpha x / r_1\right)^2} \right)^{2} } \right)    (3)
where the wake radius r_1 is given by r_1 = \alpha x + r_r, in which r_r is the radius of
the upstream turbine; x is the downstream distance at which the wake meets the obstacle;
\alpha is a scalar that determines how quickly the wake expands, given by
\alpha = \frac{1}{2 \ln(z/z_0)}, where z is the hub height and z_0 is the surface roughness,
which over water is around 0.0002 m, although it may increase with sea conditions;
a is the axial induction factor, given by a = (1 - \sqrt{1 - C_T})/2, and the turbine
thrust coefficient C_T is provided by the turbine manufacturer according to Fig. 1.
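A minimal numerical sketch of Eq. (3) and of the auxiliary quantities defined above is given below. It assumes NumPy, an illustrative thrust coefficient value, and the V164 rotor radius of 82 m; it is not the implementation used by the authors.

import numpy as np

def wake_decay(z, z0=0.0002):
    # alpha = 1 / (2 ln(z / z0)), with hub height z and surface roughness z0
    return 1.0 / (2.0 * np.log(z / z0))

def downstream_speed(v0, distances, rotor_radius, ct, z):
    """Wind speed behind one or more upstream turbines, following Eq. (3).
    distances: downstream distances x to each upstream turbine."""
    a = (1.0 - np.sqrt(1.0 - ct)) / 2.0           # axial induction factor
    alpha = wake_decay(z)
    x = np.asarray(distances, dtype=float)
    r1 = alpha * x + rotor_radius                 # wake radius at distance x
    deficits = 2.0 * a / (1.0 + alpha * x / r1) ** 2
    return v0 * (1.0 - np.sqrt(np.sum(deficits ** 2)))

# e.g. two upstream turbines at 850 m and 1700 m (the grid spacing of the case study);
# ct = 0.8 is only an illustrative thrust coefficient value
print(downstream_speed(v0=8.0, distances=[850.0, 1700.0],
                       rotor_radius=82.0, ct=0.8, z=140.0))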
3 Optimization Model
This section presents a MILP model to determine the optimal location of WT
to maximize energy production, taking into account the wake effect. It is given
a set of possible locations for WTs and a maximum number of WTs to install.
Consider the set N = {1, . . . , n} of all possible locations to install WTs and,
for each i ∈ N, let Ei be the energy generated by the WT installed at the location
i without considering the wake effect, Eq. (2). Let Iij be the interference (loss
of energy produced) at a WT installed on location j by the WT installed at
location i, with i, j ∈ N. Those values are computed using the Jensen’s model,
Sect. 2.2. It is considered that Ijj = 0.
The following decision variables are considered: binary variables x_i that
take the value 1 if a WT is installed at location i and 0 otherwise; and
non-negative variables w_i that represent the interference at a WT installed at
location i caused by all WTs installed in the WF, for each i ∈ N.
If a WT is installed at location i, w_i is the sum of the interference
caused by all WTs. Therefore, w_i = \sum_{j \in N} I_{ij} x_j if x_i = 1 and w_i = 0 if x_i = 0. So,
if a WT is installed at location i, the produced energy is given by E_i - E_i w_i, and
the objective is to maximize \sum_{i \in N} (E_i x_i - w_i E_i). To set variable w as just
described, the following constraint is included, where M is a big number:

\sum_{j \in N} I_{ij} x_j \le w_i + M (1 - x_i)    (4)
In fact, if x_i = 0, by inequality (4) it holds that \sum_{j \in N} I_{ij} x_j \le w_i + M, and so
there is no effective lower bound on the value of w_i. However, as the coefficient of w_i is
negative in a maximization model, w_i will assume the smallest possible value in
the optimal solution which, in this case, is zero. Furthermore, if x_i = 1, it holds that
\sum_{j \in N} I_{ij} x_j \le w_i and w_i will assume the smallest possible value which, in this
case, is \sum_{j \in N} I_{ij} x_j, as expected. The optimization model to determine the WTs
location is denoted the WTL model and can be written as follows:
location is denoted by WTL model and can be written as follows:
max

i∈N (Eixi − wiEi) (5)
subject to

i∈N xi ≤ U (6)

j∈N Iijxj ≤ wi + M (1 − xi) , i ∈ N (7)
xi ∈ {0, 1}, wi ≥ 0 i ∈ N (8)
The objective function (5) corresponds to the maximization of the energy produced,
taking into account the losses due to the wake effect. Constraint (6)
imposes a limit U on the number of turbines (this bound is often associated
with the available capital). Constraints (7) relate variables x and w, assuring
that w_i = 0 if x_i = 0 and w_i = \sum_{j \in N} I_{ij} x_j if x_i = 1. Finally, constraints (8) are
sign and integrality constraints on the variables.
To strengthen constraints (7), the value considered for the big constant was
M = U − 1.
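For illustration only, the WTL model (5)–(8) can be written almost verbatim with the open-source PuLP library, as sketched below. This is not the FICO Xpress Mosel implementation used by the authors; the energy vector E, the interference matrix I and the function name are assumptions introduced for the example.

import pulp

def solve_wtl(E, I, U):
    """WTL model (5)-(8). E[i]: energy of a WT at location i; I[i][j]: the
    coefficient I_ij used in constraint (7); U: maximum number of turbines.
    Illustrative sketch, not the authors' Xpress Mosel implementation."""
    N = range(len(E))
    M = U - 1                                    # strengthened big-M, as in the paper
    prob = pulp.LpProblem("WTL", pulp.LpMaximize)
    x = pulp.LpVariable.dicts("x", N, cat="Binary")
    w = pulp.LpVariable.dicts("w", N, lowBound=0)
    prob += pulp.lpSum(E[i] * x[i] - E[i] * w[i] for i in N)                        # (5)
    prob += pulp.lpSum(x[i] for i in N) <= U                                        # (6)
    for i in N:
        prob += pulp.lpSum(I[i][j] * x[j] for j in N) <= w[i] + M * (1 - x[i])      # (7)
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [i for i in N if x[i].value() > 0.5]   # selected locations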
4 Case Study and Results
In this section, a case study will be described and the results are presented and
discussed. The WT used was Vestas v164-8.0 [15] with the power curve provided
by the manufacturer shown in Fig. 1. The site chosen for the WT installation
is approximately 20 km off the coast of Viana do Castelo, Portugal, a location
currently used by the first Portuguese OWF. The Global Wind Atlas [1] data were
used to obtain the average annual wind speed. The average annual
wind speed considered at the site was 8 m/s, collected at a height of 150 m. Prandtl's
law was used to extrapolate the wind speed to the turbine hub height, around
140 m. Subsequently, the annual frequency of occurrence is calculated for various
wind speeds, in the range between 4 m/s and 25 m/s (according to the WT power
curve), using the Rayleigh distribution. The occurrence of these wind speeds is
shown in the bar chart of Fig. 2a. Figure 2b shows the annual energy production
of a WT at each wind speed considered. Adding all energy values, the total
yearly energy production for this WT will be 29.1 GWh.
Fig. 1. Power and thrust coefficient curve for WT Vestas V164-8.0.
Fig. 2. Wind distribution and annual energy production: (a) number of hours each wind speed occurs; (b) annual energy produced by a WT.
There are 36 possible locations for WT, distributed over six rows, each one of
them having six possible locations, as shown in Fig. 3a. The horizontal and vertical
distance between two neighboring locations is 850 m and the predominant
wind direction is marked with an arrow.
Fig. 3. Wind farm layout and annual energy production: (a) possible locations of WTs; (b) energy production in each row with wake effect.
According to Sect. 2.2, the Jensen model for the wake effect was used to
calculate the speed behind each WT. Note that not all WTs receive the same
amount of wind and, for this reason, will not produce the same amount of
energy. Figure 3b shows the total energy produced in each row. The wake effect
leads to a reduction in energy production of around 3.5% in each row.
Two situations were considered, in which it is intended to select a maximum
of U = 22 and U = 24 WTs. The WTL model was constructed using FICO
Xpress Mosel (Version 4.8.0) and then solved using the interfaces provided
by this package to the Optimizer. The CPU time for all tested instances was
less than 1 s, and the model has 31 constraints and 66 variables.
When not considering the wake effect, for U = 22 the expected electricity
produced annually is 636.3 GWh and for U = 24 it is 692.8 GWh. Figures 4a
and b show the optimal solutions for the distribution of the WTs, by solving
WTL model for both situations. It turns out that when U = 22 the maximum
expected production of energy for the 22 turbines is 639.6 GWh, per year. When
U = 24 the maximum expected energy production by the 24 WTs is 698.3 GWh.
It is observed that when considering the wake effect, with 22 turbines, there
is a decrease of 3.6 GWh of energy produced, which corresponds to a reduction of
0.56%. With 24 WTs, there was a decrease of approximately 5.3 GWh of energy
produced, which corresponds to a reduction of 0.72%.
For a more complete analysis, some performance indicators were calculated,
\text{Equivalent full load (h)} = \frac{\text{Energy Production}}{\text{Total installed power}}    (9)

\text{Capacity Factor (\%)} = \frac{\text{Energy Production}}{8760 \cdot \text{Total installed power}}    (10)

\text{Specific energy production (MWh/m}^2\text{)} = \frac{\text{Energy Production}}{\text{Total blades swept area}}    (11)
The values of these indicators, for both situations, are presented in Table 1.
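As a quick check, the three indicators can be computed directly from the energy production, installed power and swept area; the snippet below reproduces the 22-turbine, with-wake-effect column of Table 1 (the function name is chosen only for this example).

def indicators(energy_mwh, installed_mw, swept_area_m2):
    """Performance indicators (9)-(11)."""
    return {
        "equivalent_full_load_h": energy_mwh / installed_mw,                  # (9)
        "capacity_factor_pct": 100.0 * energy_mwh / (8760.0 * installed_mw),  # (10)
        "specific_energy_mwh_per_m2": energy_mwh / swept_area_m2,             # (11)
    }

print(indicators(636000, 176, 464728))
# ~3613.6 h, ~41.25 %, ~1.37 MWh/m2, matching the first column of Table 1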
Fig. 4. Optimal WTs location: (a) case U = 22; (b) case U = 24.
Table 1. Energy calculation and performance indicators, equations (9)–(11).

                                        22 turbines                   24 turbines
                                        With wake     Without wake    With wake     Without wake
Energy production (MWh)                 636000        639600          693000        698300
Total installed power (MW)              176           176             192           192
Area swept by the blades (m²)           464728        464728          506976        506976
Equivalent full load (h)                3613.64       3634.09         3609.38       3635.42
Capacity factor (%)                     41.25         41.49           41.20         41.50
Specific energy production (MWh/m²)     1.37          1.38            1.37          1.38
Analyzing the indicators in Table 1, the solution with the WTs placed diagonally
(22 turbines) presents better results, with an Equivalent Full Load of around
3614 h, a Capacity Factor of 41.25% and a Specific energy production of 1.37 MWh/m²,
which confirms that placing the turbines on diagonal lines results in
greater gains for OWFs.
5 Conclusions
In this work, after having addressed mathematical models for wind characteri-
zation, a MILP model is proposed to obtain the optimal distribution of a given
number of turbines in an OWF location, with the objective of maximizing energy
production.
Two very important conclusions can be drawn from the obtained results.
The first conclusion is that the relative position of the WTs has an influence on
the amount of energy produced by each of them. In fact, the placement of the
WTs diagonally presents better results for all performance indicators and has
a lower percentage of energy production losses. The second conclusion refers to
the size of the wind farm: as the results show, larger OWFs result in a
lower percentage of energy production losses. In the case of 22 turbines, there is
an annual decrease of 3.6 GWh of energy produced, representing 0.56%.
The results obtained show that the optimization model based on MILP is
able to achieve exact optimal solutions with low processing times, allowing
the efficiency of OWFs to be significantly increased.
For future work, it is possible to complement this method by carrying out a
more detailed study with regard to energy losses due to the wake effect taking
into account also the non-predominant wind directions, and to incorporate in the
optimization model the cost of the connection network between turbines taking
into account the construction costs and energy losses over its lifetime.
References
1. Global Wind Atlas. https://globalwindatlas.info. Accessed May 2021
2. Bergey, K.H.: The Lanchester-Betz limit. J. Energy 3, 382–384 (1979)
3. Cerveira, A., Baptista, J., Pires, E.J.S.: Wind farm distribution network optimiza-
tion. Integr. Comput.-Aided Eng. 23(1), 69–79 (2015)
4. Cerveira, A., de Sousa, A., Solteiro Pires, E.J., Baptista, J.: Optimal cable design
of wind farms: the infrastructure and losses cost minimization case. IEEE Trans.
Power Syst. 31(6), 4319–4329 (2016)
5. Cerveira, A., Baptista, J., Pires, E.J.S.: Optimization design in wind farm dis-
tribution network. In: Herrero, Á., et al. (eds.) International Joint Conference
SOCO’13-CISIS’13-ICEUTE’13. AISC, vol. 239, pp. 109–119. Springer, Cham
(2014). https://doi.org/10.1007/978-3-319-01854-6 12
6. European Commission: An EU Strategy to harness the potential of offshore renewable
energy for a climate neutral future. COM(2020) 741 final (2020)
7. Grady, S., Hussaini, M., Abdullah, M.M.: Placement of wind turbines using genetic
algorithms. Renew. Energy 30(2), 259–270 (2005)
8. Jensen, N.O.: A Note on Wind Generator Interaction, pp. 1–16. Riso National
Laboratory, Roskilde (1983)
9. Katic, I., Hojstrup, J., Jensen, N.O.: A simple model for cluster efficiency. In: Wind
Energy Association Conference EWEC 1986, vol. 1, pp.407–410 (1986)
10. Manzano-Agugliaro, F., Sánchez-Calero, M., Alcayde, A., San-Antonio-Gómez, C.,
Perea-Moreno, A., Salmeron-Manzano, E.: Wind turbines offshore foundations and
connections to grid. Inventions 5, 1–24 (2020)
11. Fischetti, M., Pisinger, D.: Optimizing wind farm cable routing considering power
losses. Eur. J. Oper. Res. 270, 917–930 (2017). https://doi.org/10.1016/j.ejor.2017.
07.061
12. Hou, P., Hu, W., Soltani, M., Chen, Z.: Optimized placement of wind turbines in
large-scale offshore wind farm using particle swarm optimization algorithm. IEEE
Trans. Sustain. Energy 6(4), 1272–1282 (2015)
13. Pillai, A.C., Chick, J., Johanning, L., Khorasanchi, M., Barbouchi, S.: Optimisation
of offshore wind farms using a genetic algorithm. Int. J. Offshore Polar Eng. 26(3),
1272–1282 (2016)
14. Pérez, B., Mı́nguez, R., Guanche, R.: Offshore wind farm layout optimization using
mathematical programming techniques. Renew. Energy 53, 389–399 (2013)
15. Vestas: Wind-turbine-models. https://en.wind-turbine-models.com/turbines/318-vestas-v164-8.0. Accessed May 2021
A Simulation Tool for Optimizing
a 3D Spray Painting System
João Casanova¹(B), José Lima²,³(B), and Paulo Costa¹,³(B)
¹ Faculty of Engineering, University of Porto, Porto, Portugal
{up201605317,paco}@fe.up.pt
² Research Centre of Digitalization and Intelligent Robotics, Instituto Politécnico de Bragança, Braganza, Portugal
jllima@ipb.pt
³ INESC-TEC - INESC Technology and Science, Porto, Portugal
Abstract. The lack of accurate, general-purpose open source robotics
simulators is a major setback that limits optimized trajectory generation
research and the general evolution of the robotics field. Spray painting
is a particular case with multiple advantages in using a simulator for
exploring new algorithms, mainly avoiding the waste of materials and the
dangers associated with a robotic manipulator. This paper demonstrates
an implementation of spray painting on a previously existing simula-
tor, SimTwo. Several metrics for optimization that evaluate the painted
result are also proposed. In order to validate the implementation, we
conducted a real world experiment that serves both as proof that the
chosen spray distribution model translates to reality and as a way to
calibrate the model parameters.
Keywords: Spray painting · Simulator · Trajectory validation
1 Introduction
Spray painting systems have been widely used in industry, becoming increasingly
more competitive. Throughout the years several technologies have been devel-
oped and adopted in order to improve optimization and quality of the painting
process. In this work we address the problem of simulating spray paint optimiza-
tion. This is a vital tool since it allows the user to analyse the quality of the
paint produced by different trajectories without additional costs or dangers.
In this work a 3D spray painting simulation system is proposed. This system
has realistic spray simulation with sufficient accuracy to mimic real spray paint-
ing. The simulation has 3D CAD or 3D scanned input pieces and produces a
realistic visual effect that allows qualitative analyses of the painted product. It
is also presented an evaluation metric that scores the painting trajectory based
on thickness, uniformity, time and degree of coverage. This new simulation sys-
tem, provides an open source implementation that is capable of real time spray
simulation with validated experimental results.
© Springer Nature Switzerland AG 2021. A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 110–122, 2021. https://doi.org/10.1007/978-3-030-91885-9_9
2 Literature Review
Spray painting simulation is an important part of automated spray painting sys-
tems since it provides the possibility to predict the end result without any cost.
Furthermore, virtual simulations allow quality metrics such as paint thickness
uniformity or paint coverage to be easily measured and optimized. This section
analyses the basis for simulating spray painting, including painting methods and
several approaches to spray simulation, such as computational fluid dynamics
(CFD), spray simulation applied to robotics, and commercial simulators.
2.1 Painting Methods
Paint application has been used for centuries; for most of this time, the main
methods consisted of spreading paint along the desired surface with tools like
a brush or a roller. Nowadays, however, their popularity and use are decreasing
since they are slow methods with low efficiency, and the resulting quality highly
depends on the quality of the materials and tools. Consequently, new techniques
were developed, such as dipping and flow coating, which are best applied to small
components and protective coatings, and spray painting, which is more suitable
for industry.
The main industrial painting methods are: air atomized spray application,
high volume low pressure (HVLP) spray application, airless spray application,
air-assisted airless spray application, heated spray application and electrostatic
spray application [19].
The conventional spray application method is air atomized spray. In this
method the compressed air is mixed with the paint in the nozzle and the atomized
paint is propelled out of the gun. HVLP is an improvement on the conventional
system and is best suited for finishing operations. The paint is under pressure and
the atomization occurs when it contacts with the high volume of low pressure air
resulting in the same amount of paint being propelled at the same rate but at
lower speeds. This way a high-precision, less wasteful spray is produced, since
the paint doesn't miss the target and the droplets don't bounce back. Airless
spray application is a different method that uses a fluid pump to generate
high-pressure paint. The small orifice in the tip atomizes the pressurized paint. The
air resistance slows down the droplets and allows a good amount to adhere to
the surface, but still to a lower extent than HVLP [14]. Air-assisted airless spray
application is a mixture of the airless and conventional methods with the purpose
of facilitating the atomization and decreasing the waste of paint. Heated spray
application can be combined with the other methods and consists of pre-heating
the paint, making it less viscous and requiring lower pressures. This also reduces
drying time and increases efficiency. Electrostatic spray application can also be
applied with the other techniques and consists in charging the paint droplets as
they pass through an electrostatic field. The surface to be painted is grounded
in order to attract the charged particles [18].
Lastly, the conditions under which the paint is applied have a direct impact on
the paint job quality and success. This way, painting booths are usually used,
because it is easier to control aspects such as temperature, relative humidity
and ventilation. Lower temperatures increase drying time, while high moisture
in the air interferes with adhesion and can cause some types of coating not to
cure. Also the ventilation decreases initial drying time and allows the workspace
to be safer for the workers.
2.2 Computational Fluid Dynamics (CFD)
In order to simulate the spray painting process as accurately as possible, it is
necessary to understand the physical properties that dictate droplets trajecto-
ries and deposition. Computational fluid dynamics (CFD) allows for a numerical
approach that can predict flow fields using the Navier-Stokes equations [15]. An
important step that CFD still struggles to solve is the atomization process. It
is possible to skip the atomization zone by using initial characteristics that can
be approximated by measures of the spray [16]. Another alternative is to allow
some inaccuracies of the model in the atomization zone by simulating completed
formed droplets near the nozzle. This allows for simpler initialization while main-
taining accurate results near the painted surface [21]. Figure 1 shows a CFD
result that, compared with the experimental process, reveals similar thickness
[22].
Despite these advantages, CFD isn’t used in industrial applications [11]. This
is due to the fact that CFD requires a high computation cost and the precision
that it produces isn’t a requirement for many applications. On the other hand,
with the current technological advances this seems to be a promising technique
that enables high precision even on irregular surfaces [20].
Fig. 1. CFD results with the calculated velocity contours colored by magnitude (m/s)
in the plane z = 0 [22].
Fig. 2. On the left side is presented a beta distribution for varying β [11]. On the right
side is presented an asymmetrical model (ellipse) [11].
2.3 Spray Simulation Applied to Robotics
In robotics, it is common to use explicit functions to describe the rate of paint
deposition at a point based on the position and orientation of the spray gun.
Depending on the functions used, it is conceivable to create finite and infinite
range models [11]. In infinite range models the output approaches zero as the
distance approaches infinity. The most usually used distributions are Cauchy and
Gaussian. Cauchy is simpler, allowing for faster computation while Gaussian is
rounder and closer to reality [10,17]. It is also possible to use multiple Gaussian
functions in order to obtain more complex and accurate gun models [24]. If the
gun model is considerably asymmetrical, a 1D Gaussian with an offset revolved
around an axis, summed with a 2D Gaussian, provides a 2D deposition model [13].
In finite range models the value of deposition outside a certain distance is actu-
ally zero. Several models have been developed such as trigonometric, parabolic,
ellipsoid and beta [11]. The most used model is the beta distribution model that,
as the beta increases, changes from parabolic to ellipsoid and even the Gaussian
distribution can be modelled, as shown in Fig. 2 [9]. It is also useful to use two
beta distributions that best describe the gun model, as was done with infinite
range models, Fig. 2 exemplifies an asymmetrical model [23].
2.4 Commercial Simulators
The growing need for user-friendly software that provides a fast and
risk-free way of programming industrial manipulators led to the development of
dedicated software by the manipulators manufacturers and by external software
companies. Most of them include essential features such as support for several
robots, automatic detection of axis limits, collisions and singularities and cell
components positioning.
RoboDK is a software developed for industrial robots that works as an offline
programming and simulation tool [3]. With some simple configurations it’s pos-
sible to simulate and program a robotic painting system, however the simulation
is a simple cone model and the programming is either entered manually or a
program must be implemented [4]. This is a non trivial task for the end user
but a favorable feature for research in paint trajectory generation. The advan-
tages of this system are mainly the possibility to create a manual program and
experiment it without any danger of collisions or without any waste of paint.
RobCad paint, by Siemens, also works as an offline programming and simulation
tool and also presents features as paint coverage analysis and paint databases
[7]. It also includes some predefined paths that only require the user to set a
few parameters. This simulator is one of the oldest in the market being used by
many researchers to assess their algorithms. OLP Automatic by Inropa is a sys-
tem that uses 3D scanning to automatically produce robot programs [1]. A laser
scanner is used to scan parts that are placed on a conveyor and the system can
paint every piece without human intervention. Although there are no available
information about the created paths, the available demonstrations only paint
simple pieces like windows and doors. Delfoi Paint is an offline programming
software that besides the simulation and thickness analysis tools has distinctive
features such as pattern tools that create paths based on surface topology and
manual use of the 3D CAD features [2]. RoboGuide PaintPRO by FANUC, on
the other hand is specifically designed to generate paths for FANUC robots [8].
The user selects the area to be painted and chooses the painting method and
the path is generated automatically. Lastly, RobotStudio® Paint PowerPac by
ABB is purposely developed for ABB robots and doesn't have automatically
generated paths. Its main benefit is the fast development of manual programs
for multiple simultaneous robots.
3 Simulator Painting Add-On
The simulation was developed as an additional feature on an already existing
robotics simulator, SimTwo. SimTwo is a robot simulation environment that was
designed to enable researchers to accurately test their algorithms in a way that
reflects the real world. SimTwo uses the Open Dynamics Engine for simulating
rigid body dynamics, and the virtual world is represented using GLScene
components [5], which provide a simple implementation of OpenGL [6].
In order to simulate the spray painting, it is assumed that the
target model has regularly sized triangles. As this is rarely the case, a preprocessing
phase is required, where the CAD model is remeshed using an Isotropic Explicit
Remeshing algorithm that repeatedly applies edge flip, collapse, relax and refine
Fig. 3. On top is presented the car hood model produced by the modeling software.
On the bottom is presented the same model after the Isotropic Explicit Remeshing.
to improve aspect ratio and topological regularity. This step can be carried out with the
free software MeshLab [12], and the transformation can be observed in Fig. 3 with
a CAD model of a car hood as an example.
The importance of the preprocessing lies in the fact that, thereafter, the
triangle center describes a triangle's location, since the triangles are approximately
equilateral. Another important result emerges from the fact that every triangle
area is identical and small enough that each triangle constitutes an entity
of the mesh, supporting a pixel/voxel-like algorithm.
Fig. 4. Simulation example with marked Spray gun origin (S), piece Center (C) and
example point (P).
In order to achieve a fast runtime without compromising on the realism of the
simulation, the paint distribution model adopted is a Gaussian function limited
to the area of a cone. The parameters that describe the paint distribution as
well as the cone’s angle are dependent on the spray gun and its settings. The
general equation that calculates the paint quantity per angle, pq_\alpha, based on the
painting rate p_r, the simulation time step \Delta t, the standard deviation \sigma and the angle
\alpha is presented in Eq. 1.

pq_\alpha = p_r \cdot \Delta t \cdot \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{\alpha}{\sigma}\right)^2}    (1)
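A direct transcription of Eq. 1 is sketched below; it is only an illustrative Python version, not the simulator's own code, and the parameter names are chosen for the example.

import math

def paint_per_angle(alpha, paint_rate, dt, sigma):
    """Paint quantity per angle, pq_alpha (Eq. 1): a Gaussian in the angle alpha
    between the spray direction and the direction to the triangle center."""
    return paint_rate * dt * (1.0 / (sigma * math.sqrt(2.0 * math.pi))) \
        * math.exp(-0.5 * (alpha / sigma) ** 2)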
The principle of the simulation consists in iterating every triangle in the
target model, and for each one calculating the respective paint thickness that’s
being added. For this, let us define the following variables that are available
from the scene: C, the piece center; S_p, the spray gun position; S_d \equiv \overrightarrow{S_p C}, the
spray gun direction; P_p, the center position of an example triangle; and P_n, the
triangle normal. With these, the angle between the spray gun direction and the
vector from the spray gun position to the triangle center can be calculated as
expressed in Eq. 2.

\alpha = \arccos\left( \frac{\overrightarrow{S_p P_p} \cdot S_d}{\left\|\overrightarrow{S_p P_p}\right\| \left\|S_d\right\|} \right)    (2)
This angle, \alpha, determines the limit of the cone that defines the area that is
painted, while also allowing the amount of paint that the triangle should receive
to be calculated, as modeled by the Gaussian function. In order to obtain the paint
quantity per triangle, pq, the fraction between the triangle solid angle \Omega and the
spray cone solid angle is used. The triangle solid angle calculation is presented in
Eq. 3 and the paint quantity per triangle in Eq. 4, considering that the angle of
the cone is \delta, the vectors A, B, C are the three vertices of the triangle, and
a = \overrightarrow{S_p A}, b = \overrightarrow{S_p B} and c = \overrightarrow{S_p C}.

\Omega = 2 \arctan\left( \frac{\left| a \cdot (b \times c) \right|}{\|a\|\|b\|\|c\| + (a \cdot b)\|c\| + (a \cdot c)\|b\| + (b \cdot c)\|a\|} \right)    (3)

pq = pq_\alpha \, \frac{\Omega}{2\pi\left(1 - \cos\frac{\delta}{2}\right)}    (4)
Finally, the added paint thickness P_t can be calculated by dividing the paint
quantity by the area of the triangle, A_t (Eq. 5).

P_t = \frac{pq}{A_t}    (5)
As this analysis does not avoid painting the back part of the piece, an extra
condition can be added based on the angle \beta (Eq. 6) between the spray gun
direction and the triangle normal, since its absolute value must always be larger
than \pi/2 to guarantee that the considered triangle is on the upper side of the
piece.

\beta = \arccos\left( \frac{P_n \cdot S_d}{\|P_n\| \, \|S_d\|} \right)    (6)
Some calculations, such as the area and center of the triangles and which
vertices correspond to each triangle, would otherwise occur every iteration (as long
as the triangle is accumulating paint). As these values aren't dynamic, they can be
calculated only once, when the paint target CAD is loaded, and stored in arrays of
records, reducing the complexity of the algorithm to O(N).
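Putting Eqs. 2–6 together, the per-triangle update can be sketched as follows. This is an illustrative NumPy version of the procedure described above, not the SimTwo code; the argument names, the use of arctan2 for numerical robustness in Eq. 3, and the cone-limit test alpha ≤ δ/2 are assumptions made for the example.

import numpy as np

def triangle_solid_angle(a, b, c):
    # Van Oosterom-Strackee form of Eq. 3 (a, b, c: triangle vertices relative to Sp)
    na, nb, nc = np.linalg.norm(a), np.linalg.norm(b), np.linalg.norm(c)
    num = abs(np.dot(a, np.cross(b, c)))
    den = na * nb * nc + np.dot(a, b) * nc + np.dot(a, c) * nb + np.dot(b, c) * na
    return 2.0 * np.arctan2(num, den)

def added_thickness(Sp, Sd, tri, tri_normal, tri_area, pq_alpha_fn, delta):
    """Thickness added to one triangle in a time step (Eqs. 2-6).
    tri: 3x3 array of vertex coordinates; pq_alpha_fn: Eq. 1 with rate, dt, sigma fixed."""
    Pp = tri.mean(axis=0)                                   # triangle center
    v = Pp - Sp
    alpha = np.arccos(np.dot(v, Sd) / (np.linalg.norm(v) * np.linalg.norm(Sd)))   # Eq. 2
    beta = np.arccos(np.dot(tri_normal, Sd) /
                     (np.linalg.norm(tri_normal) * np.linalg.norm(Sd)))           # Eq. 6
    if alpha > delta / 2.0 or abs(beta) <= np.pi / 2.0:     # outside the cone or back face
        return 0.0
    omega = triangle_solid_angle(tri[0] - Sp, tri[1] - Sp, tri[2] - Sp)
    pq = pq_alpha_fn(alpha) * omega / (2.0 * np.pi * (1.0 - np.cos(delta / 2.0)))  # Eq. 4
    return pq / tri_area                                    # Eq. 5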
4 Results
In order to determine the paint quality several metrics were implemented. Estab-
lishing which part of the paint target model is indeed supposed to be painted
is a problem that is outside of the scope of this work as it would either involve
a development of a graphical tool to select the parts to be painted or resort to
simple plane intersections or even some kind of CAD model coloring that would
limit the format abstraction. For these reasons, a compromise solution is to calculate
a dual version of each metric that only considers the painted triangles
as triangles to be painted. These measurements aren’t as accurate as the others
and should only be used when the painting algorithm is clearly doing what it
was meant to do and paying close attention to the coverage ratio.
In this regard, the average spray thickness, the standard deviation of the
spray thickness, the positive average of the spray thickness, the positive standard
deviation of the spray thickness, the maximum spray thickness, the minimum
spray thickness and the positive minimum spray thickness have been
made available as functions for any trajectory or set of trajectories. In order to validate
the simulation results, a real life experiment was made, using a regular spray
painting gun with a servo controlling the trigger position in order to guaran-
tee coherence among the results. The inside of the assembly can be observed in
Fig. 5.
Fig. 5. Spray gun used during experiments, without the front cover (with a servo motor
to control the trigger).
The spray gun was mounted in a Yaskawa MH50 machining robot, alongside
the spindle and several loose paper sheets were used as sample painting targets.
The full system is presented in Fig. 6.
For the sake of reproducibility and meaningful analysis, an image processing
pipeline was developed to extract the paint distribution function from the
scans of the painted sheets. As the images were obtained under controlled conditions,
no extensive preprocessing is necessary to clean them. This way, the first step is
Fig. 6. Robotic painting system used during experiments.
converting the image to grayscale. Then a simple threshold followed by a single
step erosion segments the majority of the paint. Since the spray pattern is cir-
cular, the center of the segmentation is the same as the center of the paint, the
region of interest is thus defined by a circle, with this center and a slightly larger
radius than half the segmentation bounding box. From this region the image
intensity along several lines is averaged and the standard deviation is calculated
to validate the assumption that the pattern is indeed circularly symmetric. The
size of the paper sheet used is known, as is the amount of paint spent during
the experiment; considering that the paint did not saturate, it is reasonable to
assume that the amount of paint is proportional to the pixel intensity.
The last step is to fit the curve to the Gaussian equation used in the simulation
(Eq. 1). Least squares is used for the optimization, obtaining the optimal values
for the Gaussian function parameters (the sum of the squared residuals is minimized).
The pipeline and the calculated plots can be observed in Fig. 7. This tool
allows us not only to validate the proposed simulation but also to easily calibrate
the spray parameters. This is a great advantage, since every spray painting gun
has different parameters, and even the same gun with different configurations can
have a variety of spray patterns.
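A rough sketch of such a pipeline is given below, using OpenCV and SciPy as stand-ins for whatever tools were actually used (the original software is not specified here); the Otsu threshold, the number of sampled circles and the radius margin are assumptions made for the example.

import cv2
import numpy as np
from scipy.optimize import curve_fit

def fit_spray_gaussian(path):
    """Rough sketch of the described pipeline: grayscale, threshold, erode,
    radial average around the spot center, then least-squares Gaussian fit."""
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    mask = cv2.erode(mask, np.ones((3, 3), np.uint8), iterations=1)
    ys, xs = np.nonzero(mask)
    cx, cy = xs.mean(), ys.mean()                         # spot center
    r_max = int(0.6 * max(np.ptp(xs), np.ptp(ys)))        # slightly beyond half the bbox
    radii = np.arange(r_max)
    profile = []
    for r in radii:                                       # average intensity on circles
        theta = np.linspace(0, 2 * np.pi, 180, endpoint=False)
        px = np.clip((cx + r * np.cos(theta)).astype(int), 0, gray.shape[1] - 1)
        py = np.clip((cy + r * np.sin(theta)).astype(int), 0, gray.shape[0] - 1)
        profile.append(255.0 - gray[py, px].mean())       # darker paint -> higher value
    gauss = lambda r, A, s: A * np.exp(-0.5 * (r / s) ** 2)
    (A, s), _ = curve_fit(gauss, radii, np.array(profile), p0=(profile[0], r_max / 3))
    return A, s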
Fig. 7. Imaging processing pipeline: original image, converted to grayscale image,
thresholded image, eroded image, image with the circle delimiting the region of interest,
and an example of the selected lines. Imaging processing plots: the 3D plot of the paint
distribution across the image, average paint intensity of the lines and the respective
standard deviation and the fitting curve of the Gaussian function with the obtained
paint distribution.
5 Conclusion and Future Work
This work developed a simulation tool that enables optimized trajectory genera-
tion algorithms for industrial spray painting applications. The advantages of this
simulation tool include fast and easy testing of algorithms under development,
optimization without wasting paint, energy or workpieces, and the availability of
qualitative and quantitative metrics without real-world hassles. The simulation's
accuracy and validity were tested with several experiments, and an image processing
pipeline that facilitates the tuning of the spray parameters was developed. Future work includes
occlusion algorithms in order to allow the simulation to work with more complex
parts that have cavities.
Acknowledgements. This work is financed by National Funds through the Por-
tuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia within project
UIDB/50014/2020.
References
1. Automatic scanning and programming of robots. www.inropa.com/fileadmin/
Arkiv/Dokumenter/Produktblade/OLP automatic.pdf
2. Delfoi PAINT - Software for painting and coating. https://www.delfoi.com/delfoi-
robotics/delfoi-paint/
3. Examples - RoboDK. https://robodk.com/examples#examples-painting
4. Getting Started - RoboDK Documentation. https://robodk.com/doc/en/Getting-
Started.html#Station
5. GLScene. http://glscene.sourceforge.net/wikka/
6. OpenGL. https://www.opengl.org/
7. Robcad Robotics and automation workcell simulation, validation and off-line pro-
gramming. www.siemens.com/tecnomatix
8. Robust ROBOGUIDE Simulation Software. FANUC America. https://
www.fanucamerica.com/products/robots/robot-simulation-software-FANUC-
ROBOGUIDE
9. Andulkar, M.V., Chiddarwar, S.S.: Incremental approach for trajectory generation
of spray painting robot. Ind. Robot. (2015). https://doi.org/10.1108/IR-10-2014-
0405
10. Antonio, J.K.: Optimal trajectory planning for spray coating. In: Proceedings of
the IEEE International Conference on Robotics and Automation (1994). https://
doi.org/10.1109/robot.1994.351125
11. Chen, Y., Chen, W., Li, B., Zhang, G., Zhang, W.: Paint thickness simulation for
painting robot trajectory planning: a review (2017). https://doi.org/10.1108/IR-
07-2016-0205
12. Cignoni, P., Callieri, M., Corsini, M., Dellepiane, M., Ganovelli, F., Ranzuglia,
G.: MeshLab: an open-source mesh processing tool. In: Scarano, V., Chiara,
R.D., Erra, U. (eds.) Eurographics Italian Chapter Conference. The Eurograph-
ics Association (2008). https://doi.org/10.2312/LocalChapterEvents/ItalChap/
ItalianChapConf2008/129-136
13. Conner, D.C., Greenfield, A., Atkar, P.N., Rizzi, A.A., Choset, H.: Paint deposition
modeling for trajectory planning on automotive surfaces. IEEE Trans. Autom. Sci.
Eng. (2005). https://doi.org/10.1109/TASE.2005.851631
14. Fleming, D.: Airless spray-practical technique for maintenance painting. Plant Eng.
(Barrington, Illinois) 31(20), 83–86 (1977)
15. Fogliati, M., Fontana, D., Garbero, M., Vanni, M., Baldi, G., Dondè, R.: CFD
simulation of paint deposition in an air spray process. J. Coat. Technol. Res. 3(2),
117–125 (2006)
16. Hicks, P.G., Senser, D.W.: Simulation of paint transfer in an air spray process.
J. Fluids Eng. Trans. ASME 117(4), 713–719 (1995). https://doi.org/10.1115/1.
2817327
17. Persoons, W., Van Brussel, H.: CAD-based robotic coating of highly curved sur-
faces. In: 24th International Symposium on Industrial Robots, Tokyo, pp. 611–618
(November 1993). https://doi.org/10.1109/robot.1994.351125
18. Rupp, J., Guffey, E., Jacobsen, G.: Electrostatic spray processes. Met. Finish.
108(11–12), 150–163 (2010). https://doi.org/10.1016/S0026-0576(10)80225-9
19. Whitehouse, N.R.: Paint application. In: Shreir’s Corrosion, pp. 2637–2642. Else-
vier (January 2010). https://doi.org/10.1016/B978-044452787-5.00142-6
20. Ye, Q.: Using dynamic mesh models to simulate electrostatic spray-painting. In:
High Performance Computing in Science and Engineering 2005 - Transactions of
the High Performance Computing Center Stuttgart, HLRS 2005 (2006). https://
doi.org/10.1007/3-540-29064-8-13
21. Ye, Q., Domnick, J., Khalifa, E.: Simulation of the spray coating process using a
pneumatic atomizer. Institute for Liquid Atomization and Spray Systems (2002)
22. Ye, Q., Pulli, K.: Numerical and experimental investigation on the spray coating
process using a pneumatic atomizer: influences of operating conditions and target
geometries. Coatings (2017). https://doi.org/10.3390/coatings7010013
23. Zhang, Y., Huang, Y., Gao, F., Wang, W.: New model for air spray gun of robotic
spray-painting. Jixie Gongcheng Xuebao/Chin. J. Mech. Eng. (2006). https://doi.
org/10.3901/JME.2006.11.226
24. Zhou, B., Zhang, X., Meng, Z., Dai, X.: Off-line programming system of industrial
robot for spraying manufacturing optimization. In: Proceedings of the 33rd Chi-
nese Control Conference, CCC 2014 (2014). https://doi.org/10.1109/ChiCC.2014.
6896426
Optimization of Glottal Onset Peak
Detection Algorithm for Accurate Jitter
Measurement
Joana Fernandes¹,², Pedro Henrique Borghi³,⁴, Diamantino Silva Freitas², and João Paulo Teixeira⁵(B)
¹ Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politecnico de Braganca (IPB), 5300 Braganca, Portugal
joana.fernandes@ipb.pt
² Faculdade de Engenharia da Universidade do Porto (FEUP), 4200-465 Porto, Portugal
dfreitas@ipb.pt
³ Instituto Politecnico de Braganca (IPB), 5300 Braganca, Portugal
⁴ Federal University of Technology - Parana (UTFPR), Cornelio Procopio 86300-000, Brazil
pedromelo@alunos.utfpr.edu.br
⁵ Research Centre in Digitalization and Intelligent Robotics (CeDRI), Applied Management Research Unit (UNIAG), Instituto Politecnico de Braganca (IPB), 5300 Braganca, Portugal
joaopt@ipb.pt
Abstract. Jitter is an acoustic parameter used as input for intelligent
systems for the diagnosis of speech-related pathologies. This work aims
to improve an algorithm that extracts vocal parameters, and thus to
improve the measurement accuracy of the absolute jitter parameter.
Several signals were analyzed and compared one by one in order to
understand why, for some signals, the values differ between the original
algorithm and the reference software. In this way, some problems were
found whose correction allowed the algorithm to be adjusted and its
measurement accuracy improved for those signals. Subsequently,
a comparative analysis was performed between the values of the original
algorithm, the adjusted algorithm and the Praat software (assumed as
reference). Comparing the results, it was concluded that the adjusted
algorithm extracts the absolute jitter with values closer to the reference
for several speech signals. For the analysis, sustained vowels of control
and pathological subjects were used.
Keywords: Jitter · Algorithm · Optimization · Speech pathologies ·
Acoustic analysis
This work was supported by Fundação para a Ciência e Tecnologia within the Project
Scope: UIDB/05757/2020.
© Springer Nature Switzerland AG 2021. A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 123–137, 2021. https://doi.org/10.1007/978-3-030-91885-9_10
1 Introduction
Speech pathologies are relatively common and can be found in different stages of
evolution and severity, affecting approximately 10% of the population [1]. These
pathologies directly affect vocal quality, as they alter the phonation process and
have increased dramatically in recent times, mainly due to harmful habits, such
as smoking, excessive consumption of alcoholic beverages, persistent inhalation
of dust-contaminated air and abuse of voice [2].
There are a variety of tests that can be performed to detect pathologies
associated with the voice; however, they are invasive, which makes them uncomfortable
for patients, and they are time-intensive [3].
Auditory acoustic analyses, performed by professionals, lack objectivity and
depend on the experience of the physician who makes the assessment. Acoustic
analysis allows the individual's vocal quality to be determined non-invasively. It is a
technique widely used in the detection and study of voice pathologies, since it
allows properties of the acoustic signal of a recorded voice to be measured, where
speech or vowels are produced in a sustained way [4,5]. This analysis is able to provide
the sound wave format allowing the evaluation of certain characteristics such as
frequency disturbance measurements, amplitude disturbance measurements and
noise parameters.
Jitter is one of the most used parameters as part of a voice exam and is used
by several authors (Kadiri and Alku 2020 [6], Teixeira et al. 2018 [7], Sripriya
et al. 2017 [8], Teixeira and Fernandes 2015 [9]) to determine voice pathologies.
Jitter is the measure of the cycle-to-cycle variations of successive glottal
cycles, and can be measured in absolute or relative values. Jitter is mainly
affected by the lack of control of the vibration of the vocal folds. Normally, the
voices of patients with pathologies tend to have higher values of jitter [10].
This work intends to improve the algorithm developed by Teixeira and
Gonçalves [10,11] to obtain the jitter parameter, to later be used in a
complementary diagnostic system with reliable jitter values. To ensure
reliable values from the algorithm, Praat is used as the reference for comparison,
as this software is accepted by the scientific community as
an accurate measure and is open software. This software is developed by Paul
Boersma and David Weenink [12], from the Institute of Phonetic Sciences at the
University of Amsterdam.
This article is organized as follows: Sect. 2 describes the determination of the
absolute jitter, the pathologies and the database used, as well as the number
of subjects used for the study; Sect. 3 describes some of the problems found
in some signals in the Teixeira and Gonçalves algorithms [10,11], as well as
the description and flowchart of the adjusted algorithm; Sect. 4 describes the
studies carried out, the results obtained and the discussion. Finally, in Sect. 5
the conclusions are presented.
2 Methodology
In Fig. 1, the jitter concept is illustrated, where it is possible to perceive that
jitter corresponds to the measure of the variation of the duration of the glottal
periods.
Fig. 1. Jitter representation in a sustained vowel /a/.
2.1 Jitter
Jitter is defined as a measure of glottal variation between cycles of vibration of
the vocal cords. Subjects who cannot control the vibration of the vocal cords
tend to have higher values of jitter [11].
Absolute jitter (jitta) is the variation of the glottal period between cycles,
that is, the average absolute difference between consecutive periods, expressed
by Eq. 1.
\text{jitta} = \frac{1}{N-1} \sum_{i=2}^{N} \left| T_i - T_{i-1} \right|    (1)

In Eq. 1, T_i is the size of the glottal period i and N is the total number of glottal
periods.
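For reference, Eq. 1 translates directly into a few lines of code; the sketch below (an illustration, not the algorithm under study) takes a sequence of glottal period durations in seconds and returns jitta in microseconds.

import numpy as np

def absolute_jitter(periods_s):
    """Absolute jitter (Eq. 1): mean absolute difference between consecutive
    glottal periods, here returned in microseconds."""
    T = np.asarray(periods_s, dtype=float)
    return 1e6 * np.mean(np.abs(np.diff(T)))

# The periods could come from detected glottal onset peaks: T_i = t_peak[i+1] - t_peak[i]
print(absolute_jitter([0.00800, 0.00803, 0.00799, 0.00805]))  # about 43.3 microseconds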
2.2 Database
The German Saarbrucken Voice Database (SVD) was used. This database is
available online by the Institute of Phonetics at the University of Saarland [13].
The database consists of voice signals of more than 2000 subjects with several
diseases, as well as control/healthy subjects. Each person has recordings of the phonemes
/a/, /i/ and /u/ in low, normal and high tones, a sweep along the tones, and the
German phrase “Guten Morgen, wie geht es Ihnen?” (“Good morning, how are
you?”). The sound files are between 1 and 3 s long and have a sampling
frequency of 50 kHz.
For the analysis it was used a sub-sample of 10 control subjects (5 male and
5 female) and 5 patient subjects (2 male and 3 female), of 3 diseases. Of these,
2 subjects had chronic laryngitis, one of each gender, 2 subjects with vocal cord
paralysis, one of each gender and 1 female with vocal cord polyps.
3 Development
3.1 Comparative Analysis Between the Signals Themselves
Teixeira and Gonçalves in [11] reported the results of the algorithm using groups of 9 healthy and 9 pathological male and female subjects. The authors compared the average measures of the algorithm and of the Praat software over the groups of 9 subjects. The comparison shows generally slightly higher values measured by the algorithm, but with differences lower than 20 µs except for the female pathological group; 20 µs corresponds to only one speech signal sample at a 50 kHz sampling frequency, so the authors claim a difference lower than 1 speech sample on average. Nevertheless, for particular speech files very large differences were found between the algorithm measures and the Praat measures.
In order to understand why the jitter measures of the Teixeira and Gonçalves 2016 algorithm [11] are so different from the reference values in some particular speech files, a visual comparison of the signals in the reference software and in the Teixeira and Gonçalves 2016 algorithm was carried out. The signals in which the values for absolute jitter were very different were examined, and it was noticed that for particular shapes of the signal the algorithm does not find the peaks in the same way as Praat does.
In a female control signal, the reference value for absolute jitter was 10.7 µs and the algorithm value was 478.6 µs. When the signal was observed, it was realized that one of the reasons for this difference was that the algorithm was selecting the maximum peaks to make the measurement, while the reference software was using the minimum peaks. This led us to investigate why the algorithm selects the maximum peaks instead of the minimums; the conclusion is illustrated in Fig. 2.
As can be seen in Fig. 2, the algorithm finds two maximum values for the same peak. This first search for peaks is made in an initial phase of the algorithm to decide whether to use the maximum or the minimum peaks. These two close peaks lead the algorithm to erroneously choose the peaks it will use to make the measurement, since the condition of the algorithm (number of maximum peaks = 10, in a 10 glottal period length part of speech) is not satisfied. This problem occurs in more signals. Correcting this error, for this signal, the absolute jitter value went from 478.6 µs to 39.3 µs.
Another signal, also from a control subject, had a reference value for the absolute jitter of 9.6 µs, while the value of the algorithm was 203.1 µs.
Fig. 2. Problem 1 - finding peaks in a short window for the decision on using the maximum or minimum peaks.
When observing the signal, it was also noticed that the algorithm was using the maximum peaks to make the measurement, while the reference software used the minimum peaks. Looking at Fig. 3 (presenting the speech signal and the identification of the correct minimum peaks), it is possible to see why the algorithm selected the maximum peaks instead of the minimum peaks.
Fig. 3. Problem 2 - selection of maximum/minimum peaks (in red the position of the
minimum peaks). (Color figure online)
As can be observed in Fig. 3, the length of the window used, defined in glottal periods, comprises 11 minimum peaks. Taking into account the conditions of the algorithm for selecting the maximum or minimum peaks for the later measurement, the fact of having more than 10 minimum peaks leads the algorithm to select the maximum. When the window correction was made, for this signal, only 10 minimum peaks were selected and the algorithm then selects the minimum peaks for the measurement. Thus, the absolute jitter value went from 203.1 µs to 32.3 µs.
Two pathological signals, one female and one male, were also analyzed. The reference value for absolute jitter is 8.0 µs and the algorithm value is 68.6 µs for the female signal; for the male signal, the absolute jitter in the reference software is 28.7 µs and that of the algorithm is 108.0 µs. When the signal of each of them was observed, it was noticed that the choice between selecting the maximum/minimum peaks was well made; however, this type of signal has two very close peaks in the same period that alternately take the minimum/maximum value. As the algorithm always takes the minimum/maximum value, the selection of the first or second peak changes along the signal, making the jitter measure inconsistent.
In Figs. 4 and 5 it is possible to observe the problems found in these two signals. These figures present the speech signal as a blue line, the moving average as a red line and the marked peaks with red circles.
Fig. 4. Problem 3 - variation along the signal of the moving average minimum peak.
(Color figure online)
Fig. 5. Problem 4 - variation along the signal of the signal minimum peak. (Color
figure online)
As can be observed in Figs. 4 and 5, since there are 2 peaks (minimum peaks in these examples) in the same glottal period, sometimes the extreme value falls on the first peak and other times on the second, which increases the absolute jitter determined.
3.2 Algorithm Adjustments
The main purpose of the present work is to present the optimizations applied to the algorithm proposed in [11], since an inability to deal with some recordings, both from the control group and from subjects with some type of pathology, was observed. Thus, a recap of the steps in [11] is given in order to contextualize each improvement developed. The importance of the exact identification of the position of each onset time of the glottal periods must be emphasised, since it is essential for an accurate measure of the jitter values. The onset times of the glottal periods are considered to be the positions of the peaks in the signal.
According to [11], the process of choosing the references for calculating the jitter parameter is better performed by analyzing the signal from its central region, where high stationarity is expected. In addition, it was concluded that a window of 10 fundamental periods contains enough information to determine the reference length of the fundamental period and to select between the minimums or maximums. In this window, the instants at which the positive and negative peaks of each glottal period occur are determined, as well as the amplitude module of the first cycle. The step between adjacent cycles is based on the peak position of the previous cycle plus one fundamental period. Around this point, the position of the next peak is determined by searching for the maximum (or minimum) over a window of two thirds of the fundamental period. Finally, in an interval of two fifths of the fundamental period around each peak, a search is carried out for other peaks corresponding to at least 70% of its amplitude. The reference determination is made considering the positive and negative amplitudes of the first cycle and the count of positive and negative peaks in the interval of 10 cycles. Basically, the amplitude measure directs the decision to the polarity with the greatest prominence, since isolated peaks are expected for the one with the greatest amplitude. On the other hand, the count of cycles indicates the number of points and oscillations close to the peaks, which may eventually prevail over them. The occurrence of these behaviors may or may not be characteristic of the signal, and it was observed that for cases in which this rarely occurred, a slight smoothing should be sufficient to readjust the signal. In [11], when the peak count exceeded the limit of 10 for both searches, a strong smoothing was applied to the signal, which sometimes de-characterized it.
Based on the mentioned amplitude parameters and the peak counts, a decision-making process is carried out, which concludes with the use of the maximums or minimums of the original signal or of its moving average to determine all glottal periods. In Fig. 6 the flowchart of the proposed algorithm is presented.
Fig. 6. Algorithm flowchart.
The rules of the decision process are described below, where the limit parameter n = 13 is defined as the maximum peak count accepted for the use of the original signal or for the application of light smoothing.
• If the maximum amplitude is greater than the minimum amplitude in module
and the maximum peak count is in the range [10, n], or, the minimum peak
count is greater than n and the maximum peak count is in [10, n]:
◦ If the maximum count is equal to 10, the search for peaks is per-
formed on the original signal using the maximum as a cycle-by-cycle
search method;
◦ Else, a moving average with a Hanning window of equal length to a
ninth of a fundamental period is applied to the signal and the search for
peaks in this signal is carried out through the maximum of each cycle.
Figure 7 shows the comparison between the original signal, the moving average MA1 and the moving average MA2. The peak marking was performed within this rule, that is, starting from MA1 and using the maximum method. It can be noted that the slight smoothing applied keeps the peak marking close to the real one and preserves, to some extent, the most significant undulations of the signal, whereas MA2, in addition to shifting the position of the peaks more considerably, produces undulations composed of the contribution of several adjacent waves. This can reduce the reliability of the marking for signals that have side components of the peaks with high amplitude in modulus.
Fig. 7. Comparison between the original signal, the signal after moving average with a Hanning window of length N0/9 (MA1) and the signal after moving average with a Hanning window of length N0/3 (MA2). The peaks of the glottal periods are shown in black and were detected through the maximum method over MA1. Recording excerpt from a healthy subject performing /i/ low. Normalized representation.
• Else, if the minimum amplitude in module is greater than the maximum
amplitude and the minimum peak count is in the range [10, n], or, the max-
imum peak count is greater than n and the minimum peak count is in [10,
n]:
◦ If the minimum count is equal to 10, the search for peaks is per-
formed on the original signal using the minimum as a cycle-by-cycle search
method;
◦ Else, a moving average with a Hanning window of equal length to a
ninth of a fundamental period is applied to the signal and the search for
peaks in this signal is carried out through the minimum of each cycle.
Figure 8 shows an example of the path taken through this rule. Here, the generated MA1 signal proves to be much less sensitive than MA2 (generated for comparison), since in the latter the wave that contains the real peak is strongly attenuated by the contribution of the previous positive amplitude to the averaging. In this way, the application of smoothing helps to eliminate rapid oscillations in the samples close to the minimum, but preserves the amplitude behavior of signals that do not need strong smoothing.
Fig. 8. Comparison between the original signal, the signal after moving average with a Hanning window of length N0/9 (MA1) and the signal after moving average with a Hanning window of length N0/3 (MA2). The peaks of the glottal periods are shown in black and were detected through the minimum method over MA1. Recording excerpt from a pathological subject performing /a/ high. Normalized representation.
• Else, a moving average with a Hanning window of equal length to a third of
the fundamental period is applied to the signal. On the result, linear trends
are removed and a new analysis is performed in the central region of the
signal. In this analysis, an interval of 10 fundamental periods is evaluated
cycle by cycle about the maximum, minimum peaks and their adjacent peaks
greater than 75% of their amplitude:
◦ If, in the first cycle, the maximum amplitude is greater than the minimum amplitude in module, and the maximum count is equal to 10, or the minimum count is greater than 10 and the maximum count is equal to 10, the search for peaks is made on the moving average signal using the maximum as the method of cycle-by-cycle inspection. Figure 9 demonstrates the case in which this rule is followed. It is possible to observe, by comparing the original signal, MA1 and MA2, that the determination of the peaks of the glottal periods is only possible when a heavy smoothing is applied, since the region with the greatest positive amplitude and energy can only be highlighted with a single peak by averaging over a context long enough to avoid the two adjacent peaks in the original signal. It is also noted that MA1 reduces the high frequencies but keeps the maximum point undefined, varying between the left and right peak along the signal;
Fig. 9. Comparison between the original signal, the signal after moving average with a Hanning window of length N0/9 (MA1) and the signal after moving average with a Hanning window of length N0/3 (MA2). The peaks of the glottal periods are shown in black and were detected through the maximum method over MA2. Recording excerpt from a pathological subject performing /a/ normal. Normalized representation.
◦ Else, the search for peaks is done on the moving average signal using
the minimum as a method of inspection cycle by cycle.
In all cases, the cycle-by-cycle inspection of the peaks of every signal is carried out with steps of one fundamental period from the reference point (maximum or minimum), after which an interval of one third of the fundamental period is analyzed. The reduction in the length of this interval from two thirds (used in [11]) to one third restricts the region of analysis to samples close to one step from the previous peak. With this, it is expected to guarantee that the analysis is always carried out in the same region of the cycles, avoiding the exchange between adjacent major peaks. However, in cases where there is a shift in the fundamental frequency over time, the use of a fixed parameter as the step between cycles, as done in [11] and in this work, can cause inevitable errors in the definition of the search interval. The maximum or minimum points of each cycle are used for determining the jitter parameter according to Eq. 1.
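To make the preceding rules more concrete, the sketch below shows a simplified version of the two ingredients they rely on: a Hanning-window moving average (the light N0/9 and heavy N0/3 smoothings) and a peak-count-based choice between maximum and minimum search. It is a loose illustration under assumed helper names and thresholds, not the authors' exact implementation.

```python
import numpy as np
from scipy.signal import find_peaks

def hanning_moving_average(x, window_len):
    """Smooth the signal with a normalized Hanning window (e.g. N0/9 or N0/3)."""
    w = np.hanning(max(int(window_len), 3))
    return np.convolve(x, w / w.sum(), mode="same")

def choose_search_signal(segment, n0, n_limit=13):
    """Very simplified decision over a 10-period window: returns the polarity
    ('max' or 'min') and the signal (original or lightly/strongly smoothed)
    on which the cycle-by-cycle peak search would run."""
    max_peaks, _ = find_peaks(segment, distance=int(0.6 * n0))
    min_peaks, _ = find_peaks(-segment, distance=int(0.6 * n0))
    n_max, n_min = len(max_peaks), len(min_peaks)
    if segment.max() >= abs(segment.min()) and 10 <= n_max <= n_limit:
        sig = segment if n_max == 10 else hanning_moving_average(segment, n0 / 9)
        return "max", sig
    if abs(segment.min()) > segment.max() and 10 <= n_min <= n_limit:
        sig = segment if n_min == 10 else hanning_moving_average(segment, n0 / 9)
        return "min", sig
    # Otherwise: heavy smoothing (window N0/3) before re-analysing the window
    return "min", hanning_moving_average(segment, n0 / 3)
```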
4 Results and Discussion
In this section, the results of the analyses performed are reported, together with their discussion.
A comparative analysis of the absolute jitter values was made between the values obtained by the algorithm developed by Teixeira and Gonçalves 2016 [11], the reference software [12] and the adjusted algorithm, in order to understand whether the adjusted algorithm values are closer to the values of the reference software. The Praat software was used as a reference, although the exact values of jitter cannot be known.
For 10 control subjects and 5 pathological subjects, the absolute jitter was extracted and averaged by tone and vowel. Table 1 shows the results of this analysis for the control subjects and Table 2 for the pathological subjects.
Table 1. Average absolute jitter (in microseconds) for each vowel and tone for control
subjects.
Vowel Tone Praat Teixeira and Gonçalves 2016 Adjusted
a High 14.673 26.795 25.890
Low 27.614 80.084 34.105
Normal 24.916 50.091 32.016
i High 16.092 23.621 23.617
Low 23.805 26.246 26.378
Normal 18.593 23.909 23.900
u High 13.361 23.422 23.857
Low 29.077 60.089 37.140
Normal 18.376 23.759 23.755
Table 2. Average absolute jitter (in microseconds) for each vowel and tone for patho-
logical subjects.
Vowel Tone Praat Teixeira and Gonçalves 2016 Adjusted
a High 32.817 56.447 56.379
Low 39.341 82.316 82.316
Normal 36.462 139.866 103.326
i High 39.774 62.280 62.284
Low 35.512 48.347 48.347
Normal 25.611 33.921 33.916
u High 33.606 45.489 45.489
Low 40.751 65.752 65.745
Normal 52.168 323.279 103.322
The data in Tables 1 and 2 show that the adjusted algorithm obtains jitter measures closer to the reference than the Teixeira and Gonçalves 2016 algorithm; for example, for the vowel /a/ in the low tone, for the control subjects, the adjusted algorithm improved the result by about 46 µs.
After the average evaluation, an analysis of the individual values per subject was made. The vowel /a/ in the normal tone was selected for the control and pathological subjects for further analysis. Figure 10 presents the results of absolute jitter for the 10 control subjects and Fig. 11 the results of absolute jitter for the 5 pathological subjects.
Fig. 10. Comparison of the absolute jitter values for the 10 control subjects.
Fig. 11. Comparison of absolute jitter values for the 5 pathological subjects.
Figures 10 and 11 show that, comparing the values of the Teixeira and Gonçalves 2016 algorithm with those of the adjusted algorithm, the adjusted algorithm has values closer to the reference for more subjects. For the control subjects, comparing the reference values with the Teixeira and Gonçalves 2016 algorithm, the difference is less than 16.8 µs, whereas comparing the reference values with the adjusted algorithm, the difference is less than 7.1 µs. For the pathological subjects, the difference between the reference values and the Teixeira and Gonçalves 2016 algorithm is less than 57.9 µs, while the difference with the corrected algorithm is less than 29.4 µs.
Therefore, it can be concluded that the adjusted algorithm obtains values
closer to the reference values.
It should also be mentioned that an interpolation procedure of the speech signal was also experimented with, using interpolation of several orders between 2 and 10, with the objective of increasing the resolution of the peak position. The interpolation procedure did not improve the accuracy of the absolute jitter determination and was discarded.
5 Conclusion
The objective of this work was to improve the algorithm developed by Teixeira and Gonçalves 2016, which showed inconsistencies in finding the onset of the glottal periods in some particular signals. In order to increase the accuracy of the absolute jitter, some adjustments were made to the algorithm to improve the detection of the peaks. To understand whether these corrections improved the detection of the absolute jitter, an analysis was made of the averages of the absolute jitter values obtained for 10 control subjects and 5 pathological ones. In this analysis it was noticed that the values obtained with the corrections were closer to the reference values. To understand how the adjusted algorithm behaves compared to the reference values, another analysis was carried out comparing, for one vowel, the 10 control subjects and the 5 pathological subjects. It was noticed that the adjusted algorithm measures the absolute jitter with values closer to the reference values for more subjects, although it still measures absolute jitter with slightly higher values than Praat. For control subjects, the difference in the absolute jitter measurements was reduced from 16.8 to 7.1 µs, and for pathological subjects, the difference was reduced from 57.9 to 29.4 µs.
References
1. Martins, A.l.H.G., Santana, M.F., Tavares, E.L.M.: Vocal cysts: clinical, endo-
scopic, and surgical aspects. J. Voice 25(1), 107–110 (2011)
2. Godino-Llorente, J., Gomez-Vilda, P., Blanco-Velasco, M.: Dimensionality reduc-
tion of a pathological voice quality assessment system based on gaussian mixture
models and short-term cepstral parameters. IEEE Trans. Biomed. Eng. 53(10),
1943–1953 (2006)
3. Teixeira, J.P., Alves, N., Fernandes, P.O.: Vocal acoustic analysis: ANN versus SVM in classification of dysphonic voices and vocal cords paralysis. Int. J. E-Health Med. Commun. (IJEHMC) 11(1), 37–51 (2020)
4. Sataloff, R.T., Hawkshawe, M.J., Sataloff, J.B.: Common medical diagnoses and
treatments in patients with voice disorders: an introduction and overview. In: Vocal
Health and Pedagogy: Science, Assessment and Treatment, p. 295 (2017)
5. Godino-Llorente, J., Gómez-Vilda, P.: Automatic detection of voice impairments
by means of short-term cepstral parameters and neural network based detectors.
IEEE Trans. Biomed. Eng. 51(2), 380–384 (2004)
6. Kadiri, S., Alku, P.: Analysis and detection of pathological voice using glottal
source features. IEEE J. Sel. Top. Signal Process. 14(2), 367–379 (2020)
7. Teixeira, J.P., Teixeira, F., Fernandes, J., Fernandes, P.O.: Acoustic analysis of
chronic laryngitis - statistical analysis of sustained speech parameters. In: 11th
International Conference on Bio-Inspired Systems and Signal Processing, BIOSIG-
NALS 2018, vol. 4, pp. 168–175 (2018)
8. Sripriya, N., Poornima, S., Shivaranjani, R., Thangaraju, P.: Non-intrusive tech-
nique for pathological voice classification using jitter and shimmer. In: International
Conference on Computer, Communication and Signal Processing (ICCCSP), pp.
1–6 (2017)
9. Teixeira, J.P., Fernandes, P.: Acoustic analysis of vocal dysphonia. Procedia Com-
put. Sci. 64, 466–473 (2015)
10. Teixeira, J.P., Gonçalves, A.: Accuracy of jitter and shimmer measurements. Pro-
cedia Technol. 16, 1190–1199 (2014)
11. Teixeira, J.P., Gonçalves, A.: Algorithm for jitter and shimmer measurement in
pathologic voices. Procedia Comput. Sci. 100, 271–279 (2016)
12. Boersma, P., Weenink, D.: Praat: doing phonetics by computer [Computer pro-
gram]. Version 6.0.48, 15 April 2019 (1992–2019). http://www.praat.org/
13. Barry, W., Pützer, M.: Saarbruecken Voice Database. Institute of Phonetics at the
University of Saarland (2007). http://www.stimmdatenbank.coli.uni-saarland.de.
Accessed 15 Apr 2021
Searching the Optimal Parameters
of a 3D Scanner Through Particle
Swarm Optimization
João Braun1,2(B), José Lima2,3, Ana I. Pereira2, Cláudia Rocha3, and Paulo Costa1,3
1 Faculty of Engineering, University of Porto, Porto, Portugal
paco@fe.up.pt
2 Research Centre in Digitalization and Intelligent Robotics, Instituto Politécnico de Bragança, Bragança, Portugal
{jbneto,jllima,apereira}@ipb.pt
3 INESC-TEC - INESC Technology and Science, Porto, Portugal
claudia.d.rocha@inesctec.pt
Abstract. The recent growth in the use of 3D printers by independent
users has contributed to a rise in interest in 3D scanners. Current 3D
scanning solutions are commonly expensive due to the inherent complex-
ity of the process. A previously proposed low-cost scanner disregarded
uncertainties intrinsic to the system, associated with the measurements,
such as angles and offsets. This work considers an approach to estimate
these optimal values that minimize the error during the acquisition. The Particle Swarm Optimization algorithm was used to obtain the parameters that optimally fit the final point cloud to the surfaces. Three tests were performed in which the Particle Swarm Optimization successfully converged to zero, generating the optimal parameters and validating the proposed methodology.
Keywords: Nonlinear optimization · Particle swarm optimization · 3D scan · IR sensor
1 Introduction
The demand for 3D scans of small objects has increased over the last few years
due to the availability of 3D printers for regular users. However, the solutions are
usually expensive due to the complexity of the process, especially for sporadic
users. There are several approaches to 3D scanning, with the application, in
general, dictating the scanning system’s requirements. Thus, each of them can
differ concerning different traits such as acquisition technology, the structure
of the system, range of operation, cost, and accuracy. In a more general way,
these systems can be classified as contact or non-contact, even though there are
several sub-classifications inside these two categories, which the reader can find
with more detail in [2].
© Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 138–152, 2021.
https://doi.org/10.1007/978-3-030-91885-9_11
Two common approaches for reflective-optical scanning systems are triangu-
lation and time-of-flight (ToF). According to [6], the range and depth variation
in triangulation systems are limited, however, they have greater precision. In
contrast, ToF systems have a large range and depth variation at the cost of
decreased precision. The triangulation approach works, basically, by projecting
a light/laser beam over the object and capturing the reflected wave with a digi-
tal camera. After that, the distance between the object and the scanning system
can be computed with trigonometry, as the distance between the camera and
the scanning system is known [6]. On the other hand, the accuracy of ToF-
based systems is mainly determined by the sensor’s ability to precisely measure
the round-trip time of a pulse of light. In other words, by emitting a pulse of
light and measuring the time that the reflected light takes to reach the sensor’s
detector, the distance between the sensor and the object is measured. Besides,
still regarding ToF-based systems, there is another approach to measure the
distance between the sensor and the object based on the Phase-Shift Method,
which essentially compares the phase shift between the emitted and reflected
electromagnetic waves.
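As a simple numeric illustration of the time-of-flight principle just described (not from the paper), the distance follows from half the round-trip time multiplied by the speed of light:

```python
C = 299_792_458.0  # speed of light in m/s

def tof_distance(round_trip_time_s):
    """Distance implied by a time-of-flight measurement: d = c * t / 2."""
    return C * round_trip_time_s / 2.0

# A round trip of about 3.3 ns corresponds to roughly 0.5 m
print(tof_distance(3.3e-9))
```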
Each approach has its trade-offs; a common one in these types of scanning systems is between speed and accuracy. Increasing the scanning speed of the system decreases the accuracy, and vice-versa. This can be mitigated with expensive rangefinders with higher sampling frequencies. In addition, accuracy during acquisition can be heavily affected by light interference, noise, and the angle of incidence of the projected beam on the object being too oblique. Therefore, a controlled environment and a high-quality sensor and circuit must be used to perform quality scans. However, the angle of incidence is less controllable since it depends on the scanning system and the object (a trade-off of the system's structure). In general, these types of 3D scanning systems are very expensive, and even the cheaper ones are considered costly by the regular user. A particular example is a low-cost imagery-based 3D scanner for big objects [18].
A low-cost 3D scanning system that uses a triangulation approach with an infra-red distance sensor, targeting small objects and therefore having limited range and requiring high accuracy, was proposed and validated both in simulation [5] and in real scenarios, where the advantages and disadvantages of this architecture were also addressed. Although the proposed system already has good accuracy, it is possible to make some improvements with a multivariable optimization approach. This is possible because the system makes some assumptions to perform the scan that, although true in simulation, do not necessarily hold in a real scenario. One example is the object not being exactly centered on the scanning platform. The assumptions, as well as the problem definition, are described in Subsect. 4.1. The reasons why PSO was chosen instead of other approaches are addressed in Subsect. 4.2.
The paper is structured as follows: a brief state of the art is presented in Sect. 2.
Section 3 describes the 3D scanning system and how it works. The optimization
problem is defined, and a solution is proposed in Sect. 4. The results are described
in Sect. 5. Finally, the last section presents the conclusion and future works.
2 State of the Art
As the field of optimization is vast, there are works in several fields of knowledge.
Therefore, in this section, the objective is to give a brief literature review on
multivariable optimization. Thus, some recent examples follow below.
Regarding optimization for shape fitting, [9] proposed a 3D finite-element (FE) modeling approach to obtain models that represent the shape of grains adaptively. According to the authors, the results show that the proposed approach supports adaptive FE modeling of aggregates with controllable accuracy, quality, and quantity, which indicates reliable and efficient results. Moreover, the authors in [11] proposed an approach that combined an optimization-based method with a regression-based method to get the advantages of both. According to the authors, model-based human pose estimation is solved by either of these methods, where the first leads to accurate results but is slow and sensitive to parameter initialization, and the second is fast but not as accurate as the first, requiring huge amounts of supervision. They demonstrated the effectiveness of their approach in different settings in comparison to other state-of-the-art model-based human pose estimation approaches. Also, a global optimization framework for 3D shape reconstruction from sparse noisy 3D measurements, as typically occur in range scanning, sparse feature-based stereo, and shape-from-X, was proposed in [13]. Their results showed that their approach is robust to noise, outliers, missing parts, and varying sampling density for sparse or incomplete 3D data from laser scanning and passive multi-view stereo.
In addition, there are several works on multivariable optimization for mechanical systems. For instance, the static and dynamic characteristics of hydrostatic bearings were improved with controllers designed for the system with the help of a multivariable optimization approach for parameter tuning [16]. The authors proposed a control approach with two inputs, a PID (Proportional Integral Derivative) sliding mode control and a feed-forward control, where the parameters of the first input were optimized with Particle Swarm Optimization (PSO). Their results performed better compared with results from the literature based on PID control. In addition, a system consisting of multi-objective particle swarm optimized neural networks (MOPSONNS) was proposed in [8] to compute the optimal cutting conditions of 7075 aluminum alloys. The system uses multi-objective PSO and neural networks to determine the machining variables. The authors concluded that MOPSONNS is an excellent optimization system for metal cutting operations and that it can be used for other difficult-to-machine materials. The authors in [20] proposed a multi-variable Extremum Seeking Control as a model-free real-time optimizing control approach for the operation of a cascade air source heat pump. They used it for the scenarios of power minimization and coefficient of performance maximization. Their approach was validated through simulation, and the results showed good convergence performance for both steady-state and transient characteristics. The authors in [7] applied two algorithms, the continuous ant colony algorithm (CACA) and multivariable regression, to calculate optimized values in the back analysis of several geomechanical parameters in an underground hydroelectric power plant cavern. The authors found that the CACA algorithm showed better performance. To improve the thermal efficiency of a parabolic trough collector, the authors in [1] presented a multivariate inverse artificial neural network (ANNim) to determine several optimal input variables. The objective function was solved using two meta-heuristic algorithms, the Genetic Algorithm (GA) and PSO. The authors found that although ANNim-GA runs faster, ANNim-PSO has better precision. The authors stated that the two presented models have the potential to be used in intelligent systems.
Finally, there are optimization approaches in other areas such as control theory, chemistry, and telecommunications; some examples follow. In control theory, by employing online optimization, a filtering adaptive tracking control for multivariable nonlinear systems subject to constraints was proposed in [15]. The proposed control system maintains the tracking mode if the constraints are not violated; otherwise, it switches to a constraint violation avoidance control mode by solving an online constrained optimization problem. The author showed that the switching logic avoids constraint violations at the cost of tracking performance. In chemistry, artificial neural networks were trained under 5 different learning algorithms to evaluate the influence of several input variables in predicting the specific surface area of synthesized nanocrystalline NiO-ICG composites, where the model trained with the Levenberg-Marquardt algorithm was chosen as the optimum algorithm for having the lowest root mean square error [17]. It was found that the most influential variable was the calcination temperature. At last, in telecommunications, Circular Antenna Array synthesis with novel metaheuristic methods, namely the Social Group Optimization Algorithm (SGOA) and the Modified SGOA (MSGOA), was presented in [19]. The authors performed a comparative study to assess the performance of several synthesis techniques, where SGOA and MSGOA had good results.
As there are no related works similar to our approach, the contribution of this work is to propose a procedure to find the optimal parameters (deviations) of a 3D scanner system through PSO, guaranteeing a good fit of the point clouds over their respective scanned objects.
3 3D Scanning System
The scanning system works basically in two steps. First, the object is scanned
and the data, in spherical coordinates, is saved in a point cloud file. After this,
a software program performs the necessary transformations to cartesian coordi-
nates and generates STL files that correspond to the point clouds. The flowchart
of the overall STL file generation process is presented in Fig. 1a. A prototype
of the low-cost 3D scanner, illustrated in Fig. 1b, was developed to validate the
simulated concept in small objects [5].
The architecture regarding the components and the respective communica-
tion protocols of the 3D scanning system can be seen in Fig. 2.
Fig. 1. Generation of an object’s STL file: (a) flowchart of the overall process, and (b)
prototype of the low cost 3D scanner.
The system is composed of a rotating structure supporting the object that is going to be scanned and an articulated structure supporting the sensor that performs the data acquisition. The rotating structure is actuated by a stepper motor that precisely rotates it continuously until the end of one cycle, corresponding to 360°. An articulated part, whose rotating axis is aligned with the center of the rotating structure and which has attached an optical sensor that measures the distance to the object, is also actuated by a stepper motor that moves in steps of 2° (configurable). The proposed optical sensor was selected since it combines a complementary metal-oxide-semiconductor image sensor and an infrared light-emitting diode (IR-LED), which allows a reduced influence of the environment temperature and of the object's reflectivity. Both stepper motors used in this approach are the well-known NEMA17, controlled by the DRV8825 driver. Moreover, a limit switch sensor was added as a reference for the 0° position. Over a cycle, the sensor obtains 129 measurements, each one taking 50 ms. The rotating plate's speed is approximately 55°/s (configurable). The system is controlled by a low-cost ESP32 microcontroller, which transfers to the PC, by serial or Wi-Fi, the measured distance to the object, the sensor's angle and the rotating structure's angle.
Fig. 2. Diagram illustrating the components of the system and the respective commu-
nication protocols used.
The dynamics of the 3D scanner is illustrated through the flowchart in Fig. 3. The rotating structure keeps rotating until it finishes one cycle of 360°. After this, the system increments the sensor's angle by 2°. This dynamic keeps repeating until the sensor's angle reaches 90°. Afterward, the data, in spherical coordinates, is saved to a point cloud file. The data conversion to Cartesian coordinates is explained in detail in Sect. 4. The algorithm that generates the STL files is outside the scope of this work; for further information the reader is referred to [5].
Fig. 3. 3D scanner dynamics flowchart.
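The acquisition loop described above can be sketched as follows; this is an illustrative reconstruction, where read_distance_m is a hypothetical callback standing in for the IR sensor reading, and the 129 samples per plate revolution and the 2° arm step come from the text.

```python
import math

def scan(read_distance_m, samples_per_cycle=129, theta_step_deg=2.0, theta_max_deg=90.0):
    """Collect (d, theta, phi) samples in spherical coordinates: for each arm
    angle theta, rotate the plate through a full 360-degree cycle."""
    cloud = []
    phi_step_deg = 360.0 / samples_per_cycle
    theta_deg = 0.0
    while theta_deg <= theta_max_deg:
        for k in range(samples_per_cycle):
            phi = math.radians(k * phi_step_deg)
            theta = math.radians(theta_deg)
            cloud.append((read_distance_m(theta, phi), theta, phi))
        theta_deg += theta_step_deg
    return cloud
```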
4 Methodology
The notation used in this work to explain the system considers spherical coordi-
nates, (l, θ, ϕ). The scanning process starts by placing the object on the center
of the rotating plate (P = (0, 0, hp)), which is directly above the global reference
frame. The system is represented in Fig. 4, from an XZ plane view. First, the
slope’s angle measured from the X axis to the Z axis, with range [0, π/2] rad, is
θ. Also, ϕ is the signed azimuthal angle measured from the X axis, in the XY
plane, with range [0, 2π]. The parameter hp represents the distance from the
rotating plate to the arm rotation point, and hs is the perpendicular distance
from the arm to the sensor. The parameter l defines the length of the mechanical
arm and d represents the distance from the sensor to the boundary of the object
which is located on the rotating plate.
The position of the sensor s (red circle of Fig. 4) Xs = (xs, ys, zs), in terms of
the parameters (l, θ, ϕ), can be defined as follows, assuming the reference origin
at the center of the rotating base:
\begin{cases}
x_s = (l\cos\theta - h_s\sin\theta)\cos\varphi \\
y_s = (l\cos\theta - h_s\sin\theta)\sin\varphi \\
z_s = l\sin\theta + h_s\cos\theta - h_p
\end{cases}    (1)
Therefore, Xp = (xp, yp, zp), which is the position of a scanned point at a given time in relation to the rotating plate reference frame, is obtained by a translation considering ||Xs|| and d. Thus, it is possible to calculate Xp as:
Fig. 4. Mechanical design of the proposed system. XZ plane view. Red circle represents
the sensor and the dotted line represents its beam.
X_p = \left(1 - \frac{d}{\|X_s\|}\right) X_s    (2)
where \|X_s\| = \sqrt{x_s^2 + y_s^2 + z_s^2}, which represents the Euclidean norm of (x_s, y_s, z_s).
4.1 Problem Definition
The calculations described above to obtain the position of a given scanned point Xp made several assumptions that, although always true in a simulated scenario, do not necessarily hold in real environments. If these assumptions are not met, the coordinate transformations do not hold and, therefore, the point cloud and its mesh become distorted. These assumptions, and consequently the optimization problem, are described in this subsection.
First, there may exist an offset in θ (arm’s rotation angle), denoted by θoffset,
because the resting position of the arm can be displaced from the initial intended
position. Following the same reasoning, an offset in ϕ, represented by ϕoffset, is
also possible, as the rotating plate can be displaced at the beginning of the scan-
ning procedure. Finally, the scanned object may be displaced from the center of
the rotating plate. This offset can be taken into account as an offset in the x and
y coordinates constituting the sensor’s position, xoffset and yoffset respectively,
as the effect would be the same as applying an offset to the object’s position.
Thus, Eq. (1) is updated to Eq. (3) according to these offsets:
\begin{cases}
x_s = (l\cos\theta' - h_s\sin\theta')\cos\varphi' + x_{\mathrm{offset}} \\
y_s = (l\cos\theta' - h_s\sin\theta')\sin\varphi' + y_{\mathrm{offset}} \\
z_s = l\sin\theta' + h_s\cos\theta' - h_p
\end{cases}    (3)
where \theta' = \theta + \theta_{\mathrm{offset}} and \varphi' = \varphi + \varphi_{\mathrm{offset}}.
Finally, in theory, the sensor must be orthogonal to the rotating arm. However, the sensor may become rotated by γoffset degrees during the assembly of the real system. Thus, θ'' = θ' + γoffset. This behaviour is better understood in Fig. 5, where it is possible to see that if there is no offset in the sensor's rotation, θ'' = θ'.
Fig. 5. Updated mechanical design of the proposed system with offsets. XZ plane view.
Finally, to take into account θ'' and the object's offset in the coordinate transformations, it is necessary to make some modifications to the calculations that compute Xp.
Therefore, two 2D rotations are made in θ and ϕ to direct a unit vector, X_t = (x_t, y_t, z_t), along the sensor beam direction:
\begin{cases}
x_t = \cos(\varphi')\,x_u' - \sin(\varphi')\,z_t \\
y_t = \sin(\varphi')\,x_u' + \cos(\varphi')\,z_t \\
z_t = -\sin(\theta'')
\end{cases}    (4)
where x_u' = -\cos(\theta''). After the rotations, the unit vector is scaled by d, the distance from the sensor to the boundary of the object, so that the boundary point can be obtained in relation to the center of the plate.
At last, Xp is given by the sum of the sensor’s position, Xs, and the scaled
rotated unit vector:
Xp = Xs + dXt (5)
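Putting Eqs. (3)-(5) together, a scanned point can be computed from one sensor reading as in the sketch below. This is a reconstruction under the prime notation assumed above (θ' = θ + θoffset, ϕ' = ϕ + ϕoffset, θ'' = θ' + γoffset), not the authors' code.

```python
import numpy as np

def scanned_point(l, theta, phi, d, hs, hp,
                  x_off=0.0, y_off=0.0, phi_off=0.0, theta_off=0.0, gamma_off=0.0):
    """Forward model of Eqs. (3)-(5): sensor position with offsets plus the
    beam direction scaled by the measured distance d (angles in radians)."""
    th1 = theta + theta_off      # theta'  (arm offset)
    ph1 = phi + phi_off          # phi'    (plate offset)
    th2 = th1 + gamma_off        # theta'' (sensor mounting offset)

    # Sensor position Xs (Eq. 3)
    r = l * np.cos(th1) - hs * np.sin(th1)
    xs = r * np.cos(ph1) + x_off
    ys = r * np.sin(ph1) + y_off
    zs = l * np.sin(th1) + hs * np.cos(th1) - hp

    # Unit vector Xt along the sensor beam (Eq. 4, as printed)
    xu = -np.cos(th2)
    zt = -np.sin(th2)
    xt = np.cos(ph1) * xu - np.sin(ph1) * zt
    yt = np.sin(ph1) * xu + np.cos(ph1) * zt

    # Scanned point Xp = Xs + d * Xt (Eq. 5)
    return np.array([xs + d * xt, ys + d * yt, zs + d * zt])
```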
So, the intent is to solve the following optimization problem:
\min \sum_{i=1}^{n} \|X_{p_i} - X_i\|^2    (6)
where X_{p_i}, as previously mentioned, represents a scanned point in relation to the center of the support plate, X_i = (x, y, z) is the nearest point of the original object boundary, and n is the number of points in the cloud.
4.2 Optimization
To solve the optimization problem defined in (6), the Particle Swarm Optimization (PSO) algorithm was used. This algorithm belongs to the Swarm Intelligence class of algorithms inspired by natural social intelligent behaviors. Developed by Kennedy and Eberhart (1995) [10], this method consists in finding the solution through the exchange of information between individuals (particles) that belong to a swarm. The PSO is based on a set of operations that promote the movement of each particle to promising regions in the search space [3,4,10].
This approach was chosen because it has several advantages, according to [12], such as:
– It is easy in concept and implementation in comparison to other approaches.
– It is less sensitive to the nature of the objective function than other methods.
– It has a limited number of hyperparameters in relation to other optimization techniques.
– It is less dependent on the initial parameters, i.e., the convergence of the algorithm is robust.
– It has fast calculation speed and stable convergence characteristics with high-quality solutions.
However, every approach has its limitations and disadvantages. For instance, the authors in [14] stated that one of the disadvantages of PSO is that the algorithm easily falls into a local optimum in high-dimensional spaces and, therefore, has a low convergence rate in the iterative process. Nevertheless, the approach showed good results for our system.
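For illustration only (not the authors' implementation), a minimal PSO that could minimize the cost of Eq. (6) over the five offsets looks as follows; the bounds, hyperparameters and the objective callback are assumptions.

```python
import numpy as np

def pso(objective, bounds, n_particles=30, n_iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm minimizer over box constraints `bounds` = [(lo, hi), ...]."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    x = rng.uniform(lo, hi, size=(n_particles, len(bounds)))   # positions
    v = np.zeros_like(x)                                       # velocities
    pbest = x.copy()
    pbest_val = np.array([objective(p) for p in x])
    gbest = pbest[np.argmin(pbest_val)].copy()

    for _ in range(n_iters):
        r1, r2 = rng.random((2, *x.shape))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, pbest_val.min()

# Hypothetical usage: p = (x_off, y_off, phi_off, theta_off, gamma_off) and
# objective(p) = sum of squared distances between the transformed cloud and the object surface.
```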
5 Results
To test the approach, three case objects were analyzed: a centered sphere, a sphere with offset, and a cylinder. The first and second cases were spheres with a radius of 0.05 m. The first sphere was centered on the rotating plate, while the other had an offset of (X, Y) = [0.045, 0.045] m. Finally, the third object was a cylinder with h = 0.08 m and r = 0.06 m.
Since PSO is a stochastic method, 20 independent runs were performed to obtain more reliable results. Table 1 presents the results obtained by PSO, describing the best minimum value obtained over the executed runs, the average of the minima over all runs (Av) and the respective standard deviation (SD). To analyze the behavior of the PSO method, the results considering all runs (100%) and the 90% best runs are also presented.
First, the results were very consistent for all the cases, as can be verified by the standard deviations, average values, and minima being practically zero. This is also corroborated by comparing the results from all runs (100%) with the 90% best runs, where the averages from 100% matched the averages from the 90% best runs and the standard deviation values maintained the same order of magnitude. Another way to look at the consistency is by comparing the best minimum value of the executed runs with the respective averages, which match for all cases.
Table 1. PSO results for the three objects
Object   Minimum   Av (100%)   SD (100%)   Av (90%)   SD (90%)
Centered Sphere 7.117 × 10−8 7.117 × 10−8 5.8 × 10−14 7.117 × 10−8 6.06 × 10−14
Sphere 4.039 × 10−5 4.039 × 10−5 1.3 × 10−13 4.039 × 10−5 1.18 × 10−13
Cylinder 3.325 × 10−2 3.325 × 10−2 3.2 × 10−7 3.325 × 10−2 2.4 × 10−7
Still, the only case where the PSO algorithm struggled a little was the cylinder: although the minimum tended to 0, it did not reach the same order of magnitude as in the other cases. Therefore, we can confirm that the PSO achieved the global optimum in the sphere cases, while the cylinder case is dubious. There is also the possibility that the objective function for the cylinder case needs some improvements.
In terms of minimizers, they are described in Table 2. It is possible to see that for the centered sphere all the parameters are practically 0. This is expected, since the centered sphere did not have any offsets; the same can be said for the cylinder. Finally, for the sphere, although it had offsets of (X, Y) = [0.045, 0.045], the minimizers of X and Y are practically zero. This is also expected: since the sphere was displaced on the rotating plate, the sensor acquired data points from the plate itself (at the center), and therefore the portion of the points representing the sphere was already "displaced" during the acquisition, i.e., the offsets in X and Y were already accounted for during acquisition and, by consequence, during the transformation to Cartesian coordinates. Therefore, the only offset with a strong impact on the transformation was the ϕoffset minimizer, which accounts for a rotation about the Z axis (to rotate the point cloud to match the object). Of course, the other minimizers also had their impact during the transformation.
Table 2. Minimizers obtained by PSO
Object   xoffset [m]   yoffset [m]   ϕoffset [rad]   θoffset [rad]   γoffset [rad]
Centered Sphere   −4.704 × 10−7   −5.472 × 10−8   −5.031 × 10−6   +4.953 × 10−5   +2.175 × 10−5
Sphere   −7.579 × 10−3   8.087 × 10−2   1.566 × 10−1   −8.885 × 10−5   −1.661 × 10−4
Cylinder   5.055 × 10−4   −3.371 × 10−4   −2.862 × 10−1   5.046 × 10−2   1.057 × 10−1
The behavior of the PSO approach for the centered sphere, its representation in simulation, and its optimized point cloud over the original object can be seen, respectively, in Figs. 6, 7a, and 7b.
As can be seen from the illustrations, and from Tables 1 and 2, the PSO managed to find the global optimum of the system's parameters. Note that, in Fig. 6, the algorithm converged to the global minimum near iteration 15, although it ran for more than 70 iterations (these iterations are from the PSO algorithm itself, and therefore belong to just one of the 20 executed runs).
Fig. 6. PSO behavior regarding the centered sphere case.
(a) Simulation system with centered
sphere.
(b) Optimized point cloud of centered
sphere plotted over the object’s original di-
mensions.
Fig. 7. Simulated system (left). Optimized point cloud (right).
Thus, the generated point cloud fits perfectly over the original object, as can be seen in Fig. 7b.
Following the same reasoning, the behavior of the PSO approach for the cylinder, its system simulation, and its optimized point cloud over the original object can be seen, respectively, in Figs. 8, 9a, and 9b.
It is possible to see in Fig. 8 that after 10 iterations the PSO algorithm converged to the minimum value displayed in Table 1, generating the parameters described in Table 2. The optimized point cloud, however, suffered a tiny distortion on its sides, as can be seen in Fig. 9b. This probably happened because of numerical rounding; the distortion is mainly caused by the small values of θoffset and γoffset.
Fig. 8. PSO behavior regarding the cylinder case.
(a) Simulation system with cylin-
der.
(b) Optimized point cloud of cylinder
plotted over the object’s original di-
mensions.
Fig. 9. Simulated system (left). Optimized point cloud (right).
Finally, the behavior of the PSO approach for the sphere with offset, its system simulation, and its optimized point cloud over the original object can be seen, respectively, in Figs. 10, 11a, and 11b.
As in the centered sphere case, the point cloud fits perfectly over the object with the dimensions of the original object. This is expected, since the PSO managed to find the global optimum of the objective function near iteration 10, generating the optimal parameters for the system, as can be seen in Tables 1 and 2.
Fig. 10. PSO behavior regarding the sphere with offset case.
(a) Simulation system with sphere
with offset.
(b) Optimized point cloud of sphere
with offset plotted over the object’s
original dimensions.
Fig. 11. Simulated system (left). Optimized point cloud (right).
6 Conclusions and Future Work
This work proposed a solution to the imperfections that are introduced when
assembling a 3D scanner, due to assumptions that do not typical hold in a real
scenario. Some uncertainties associated with angles and offsets, which may arise
during the assembly procedures, can be solved resorting to optimization tech-
niques. The Particle Swarm Optimization algorithm was used in this work to
estimate the aforementioned uncertainties, whose outputs allowed to minimize
the error between the position and orientation of the object and the resulting
Optimal Parameters of a 3D Scanner Through PSO 151
final point cloud. The cost function considered the minimization of the quadratic
error of the distances which were minimized close to zero in all three cases, val-
idating the proposed methodology. As future work, this optimization procedure
will be implemented on the real 3D scanner to reduce the distortions on the
scanning process.
Acknowledgements. The project that gave rise to these results received the support
of a fellowship from ”la Caixa” Foundation (ID 100010434). The fellowship code is
LCF/BQ/DI20/11780028. This work has also been supported by FCT - Fundação
para a Ciência e Tecnologia within the Project Scope: UIDB/05757/2020.
References
1. Ajbar, W., et al.: The multivariable inverse artificial neural network combined with
ga and pso to improve the performance of solar parabolic trough collector. Appl.
Thermal Eng. 189, 116651 (2021)
2. Arbutina, M., Dragan, D., Mihic, S., Anisic, Z.: Review of 3D body scanning
systems. Acta Tech. Corviniensis Bulletin Eng. 10(1), 17 (2017)
3. Bento, D., Pinho, D., Pereira, A.I., Lima, R.: Genetic algorithm and particle swarm
optimization combined with powell method. Numer. Anal. Appl. Math. 1558, 578–
581 (2013)
4. Bratton, D., Kennedy, J.: Defining a standard for particle swarm optimization.
IEEE Swarm Intell. Sympo. (2007)
5. Braun, J., Lima, J., Pereira, A., Costa, P.: Low-cost 3d lidar-based scanning system for small objects. In: 22nd International Conference on Industrial Technology 2021, IEEE Proceedings (2021)
6. Franca, J.G.D.M., Gazziro, M.A., Ide, A.N., Saito, J.H.: A 3d scanning system
based on laser triangulation and variable field of view. In: IEEE International
Conference on Image Processing 2005. vol. 1, pp. I-425 (2005). https://doi.org/10.
1109/ICIP.2005.1529778
7. Ghorbani, E., Moosavi, M., Hossaini, M.F., Assary, M., Golabchi, Y.: Determina-
tion of initial stress state and rock mass deformation modulus at lavarak hepp by
back analysis using ant colony optimization and multivariable regression analysis.
Bulletin Eng. Geol. Environ. 80(1), 429–442 (2021)
8. He, Z., Shi, T., Xuan, J., Jiang, S., Wang, Y.: A study on multivariable optimization
in precision manufacturing using mopsonns. Int. J. Precis. Eng. Manuf. 21(11),
2011–2026 (2020)
9. Jin, C., Li, S., Yang, X.: Adaptive three-dimensional aggregate shape fitting
and mesh optimization for finite-element modeling. J. Comput. Civil Eng. 34(4),
04020020 (2020)
10. Kennedy, J., Eberhart, R.: Particle swarm optimization. IEEE International Con-
ference on Neural Network, pp. 1942–1948 (1995)
11. Kolotouros, N., Pavlakos, G., Black, M.J., Daniilidis, K.: Learning to reconstruct
3d human pose and shape via model-fitting in the loop. In: Proceedings of the
IEEE/CVF International Conference on Computer Vision, pp. 2252–2261 (2019)
12. Lee, K.Y., Park, J.B.: Application of particle swarm optimization to economic dis-
patch problem: advantages and disadvantages. In: 2006 IEEE PES Power Systems
Conference and Exposition, pp. 188–192 (2006). https://doi.org/10.1109/PSCE.
2006.296295
13. Lempitsky, V., Boykov, Y.: Global optimization for shape fitting. In: 2007 IEEE
Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2007). https://
doi.org/10.1109/CVPR.2007.383293
14. Li, M., Du, W., Nian, F.: An adaptive particle swarm optimization algorithm based
on directed weighted complex network. Math. Probl. Eng. 2014 (2014)
15. Ma, T.: Filtering adaptive tracking controller for multivariable nonlinear systems
subject to constraints using online optimization method. Automatica 113, 108689
(2020)
16. Rehman, W.U., et al.: Model-based design approach to improve performance char-
acteristics of hydrostatic bearing using multivariable optimization. Mathematics
9(4), 388 (2021)
17. Soltani, S., et al.: The implementation of artificial neural networks for the mul-
tivariable optimization of mesoporous nio nanocrystalline: biodiesel application.
RSC Advances 10(22), 13302–13315 (2020)
18. Straub, J., Kading, B., Mohammad, A., Kerlin, S.: Characterization of a large,
low-cost 3d scanner. Technologies 3(1), 19–36 (2015)
19. Swathi, A.V.S., Chakravarthy, V.V.S.S.S., Krishna, M.V.: Circular antenna array
optimization using modified social group optimization algorithm. Soft Comput.
25(15), 10467–10475 (2021). https://doi.org/10.1007/s00500-021-05778-2
20. Wang, W., Li, Y., Hu, B.: Real-time efficiency optimization of a cascade heat pump
system via multivariable extremum seeking. Appl. Thermal Eng. 176, 115399
(2020)
Optimal Sizing of a Hybrid Energy
System Based on Renewable Energy
Using Evolutionary Optimization
Algorithms
Yahia Amoura1(B), Ângela P. Ferreira1, José Lima1,3, and Ana I. Pereira1,2
1 Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, Bragança, Portugal
{yahia,apf,jllima,apereira}@ipb.pt
2 Algoritmi Center, University of Minho, Braga, Portugal
3 INESC-TEC - INESC Technology and Science, Porto, Portugal
Abstract. The current trend in energy sustainability and the growing energy demand have given rise to distributed hybrid energy systems based on renewable energy sources. This study proposes a strategy for the optimal sizing of an autonomous hybrid energy system integrating a photovoltaic park, a wind energy conversion system, a diesel group, and a storage system. The problem is formulated as a single-objective function subject to economical and technical constraints, combined with evolutionary approaches, mainly the particle swarm optimization algorithm and the genetic algorithm, to determine the number of installation elements for a reduced system cost. The computational results revealed an optimal configuration for the hybrid energy system.
Keywords: Renewable energy · Hybrid energy system · Optimal
sizing · Particle swarm optimisation · Genetic algorithm
1 Introduction
The depletion of fossil resources, impacts of global warming and the global aware-
ness of energy-related issues have led in recent years to a regain of interest in
Renewable Energy Sources (RES), these latter benefit from several advantages
which enable them to be a key for the major energy problems. On the other hand,
a remarkable part of the world’s population does not have access to electricity
(approximately 13% of the population worldwide in 2016) [1], which creates an
important limitation for their development. Nevertheless, these needs can be
covered by a distributed generation provided by renewable energy systems with
the implementation of insulated microgrids based on Hybrid Energy Systems
(HES) offer a potential solution for sustainable, energy-efficient power supply,
providing a solution for increasing load growth and they are a viable solution to
the electrification of remote areas. These microgrids can be operated connected to the main grid, with the latter acting as an energy buffer, or in islanded mode. One of the most important benefits of RES is their contribution to mitigating global warming by avoiding damaging effects such as greenhouse gas (GHG) emissions, the stratospheric ozone hole, etc.
Despite technological advances and global development, renewable energy sources do not represent a universal solution for all current electricity supply problems. There are several reasons for this, for instance, the low efficiency compared to conventional sources. Economically, RES involve extra costs and their payback takes longer [2]. In addition, RES are to a large extent unpredictable and intermittent due to the stochastic nature of, for instance, solar and wind sources. To overcome the mismatch between demand and the availability of renewable energy sources, microgrids are typically based on the hybridization of two or more sources and/or the exploitation of storage systems [3]. In this way, the complementarity between different renewable energy sources, together with dispatchable conventional sources, gives rise to hybrid energy systems, which represent an ideal solution for the indicated problems.
Under the proposed concept of HES, in the event that renewable energy production exceeds the demand, the energy surplus is stored in storage devices, such as electrochemical or hydrogen-based systems, or, alternatively, delivered to the grid [4].
To increase the interest in and improve the reliability of hybrid energy systems, it is important to provide new solutions and address an essential problem: their optimal sizing. To deal with this type of problem, researchers have often resorted to optimization methods. For instance, in [5] the Sequential Quadratic Programming (SQP) method was used, Navid et al. [6] used the Mixed Integer Non-Linear Programming (MINLP) method, and stochastic discrete dynamic programming was used in [7]. Kusakana et al. [8] developed the optimal sizing of a hybrid renewable energy plant using the Linear Programming (LP) method. Among other approaches, in [9] metaheuristics proved to be an ideal optimization tool for the resolution of sizing problems.
This paper presents a real case study addressing the optimal sizing of a hybrid energy system on the campus of Santa Apolónia, located in Bragança, Portugal. The objective of the work is to determine the optimal number of renewable energy-based units, photovoltaic modules and wind conversion devices, on the one hand, and the optimal number of batteries for storage on the other hand. The contribution of this article consists in the development of optimal sizing software based on optimization approaches. Because of the stochastic nature of the problem, especially in terms of the power produced by the renewable energy sources, it is proposed to deal with the optimization problem using metaheuristic algorithms, mainly the Particle Swarm Optimisation (PSO) algorithm and the Genetic Algorithm (GA). PSO is a modern heuristic search approach based on the principles of swarming and collective behavior in biological populations. PSO and GA belong to
the same category of optimization methods: they are both population-based search approaches that depend on knowledge sharing among population members to improve their search processes. Throughout this work, the computational performance of the two approaches is compared, i.e., which algorithm achieves the best configuration concerning the size of the hybrid energy system while satisfying the economic aspects and the technical constraints related to the problem.
The remainder of the paper is organized as follows: Sect. 2 describes the configuration of the HES. In Sect. 3, the meteorological data used are presented. Section 4 presents the modeling of the proposed hybrid energy system. Section 5 defines the sizing methodology adopted. The sizing problem is then formulated as a uni-objective optimization problem in Sect. 6. The achieved results are presented and discussed in Sect. 7. Finally, Sect. 8 concludes the study and proposes guidelines for future work.
2 HES Configuration Architecture
The HES to be developed comprises two main renewable energy sources: photovoltaic and wind. In order to ensure the balance between the demand and the energy produced, and given the non-dispatchable behavior of those sources, the microgrid also includes a diesel generator and a battery bank. The HES is developed to operate off-grid. Figure 1 shows the architecture of the chosen configuration.
Fig. 1. Architecture of the hybrid energy system.
The battery system is connected to the DC bus through a bidirectional link, allowing it to operate in discharge when there is a shortage of energy and in charge when there is a surplus of energy. An AC bus connects the AC loads, the diesel generator, the wind turbine generators and the photovoltaic system. The energy circulates through a bidirectional converter: when there is an excess of energy on the AC bus, the converter acts as a rectifier to recharge the batteries, and when there is an energy deficit, the converter acts as an inverter to transfer energy from the DC bus to the load. In cases when the demand is higher than the renewable energy production and the batteries are below the minimum charging level, the bidirectional inverter can automatically start a backup diesel group. A control and energy management system guarantees the energy flow and the monitoring of the microgrid power quality.
3 Study Data
Meteorological data are an essential input for the optimal design of the hybrid energy system, in order to avoid under- or oversizing. The solar and wind potential, characterized by the irradiation and the wind speed respectively, are the key variables to identify the sources that constitute the HES.
The sampling data were taken at the laboratory of the Polytechnic Institute of Bragança (IPB), Portugal, at latitude 41.8072 (41° 48' 26" North) and longitude −6.75919 (6° 45' 33" West), with an elevation of 690 m. This section describes the average solar and wind potential data for one year (from January 1, 2019, to December 31, 2019) as well as the average load data. The study data are presented in Fig. 2.
In this work, the solar irradiation data were acquired on-site using a pyranometer mounted on a 30° south-facing support. Figure 2a shows the daily average solar irradiation data.
The values of the ambient temperature are indispensable for the prediction of photovoltaic power: the more the temperature rises, the more the cell voltage decreases, which leads to a decrease in the power of the module. In this study, the temperature is measured every hour during the day using a K-type thermocouple sensor. Figure 2b shows the daily average temperature data for the year 2019.
The sizing of a wind turbine system depends essentially on the knowledge of wind speeds. For this reason, a series of precise local measurements was performed at the study site. In this work, the wind speed is measured with an anemometer placed 10 m above the ground surface. Figure 2c shows the average wind speed data for the time span under consideration.
The average wind speed values are highly random and stochastic, which reflects the unpredictable nature of the wind. A considerable amount of wind circulation is observed in the afternoon, from 12:00 to 20:00, during the measurement period. On average, the wind speed reaches its maximum at 15:00, with a value of 2.93 m/s. From the collected data, the wind turbine characteristics (cut-in speed, cut-off speed, and rated speed) are evaluated in order to select, from the technical point of view, the optimal type of wind turbine for the project.

Fig. 2. Study data: (a) average solar irradiation; (b) average temperature; (c) average wind speed; (d) average load profile.
Besides the environmental data, necessary to select the renewable source units, the dimensioning of the backup systems (diesel group and battery bank) requires the characterization of the demand profile. The average load profile is measured in kWh. There is significant load activity during working hours from 08:00, reaching a peak consumption of 7.68 kWh at 11:00. On the other hand, during off-peak hours, there is a baseline consumption considered as uninterruptible load. Figure 2d illustrates the average load profile.
4 System Modeling
The optimization of the synergy between the different sources in a HES, while maintaining their best operational characteristics, requires the mathematical modeling of the energy sources' behavior, which must reflect the system response during operation when exposed to all types of external environments.
According to the potential present at the study site, the hybrid energy system uses a combination of renewable devices (photovoltaic and wind turbines), a conventional diesel group as a backup system, and an energy storage system consisting of a set of electrochemical batteries. This section presents the mathematical modeling of the different hybrid energy system sources as well
as the characteristics of the device units adopted, taking into consideration the
environmental data collected and presented in the previous section.
4.1 Photovoltaic Energy System
Considering the objectives of this work, the most suitable model for the photovoltaic energy conversion system is the power (input/output) model [10], which has the advantage of estimating the energy produced by the photovoltaic apparatus from the global irradiation on an inclined plane and the ambient temperature, as well as from the manufacturer's data for the selected photovoltaic module.
The power output of a photovoltaic module can be calculated according to
the following equation:
Ppv(t) = ηSG (1)
being η the instantaneous efficiency of the photovoltaic module, S the surface of
the photovoltaic module and G the global irradiation on an inclined plane.
The instantaneous efficiency is given by,
η = ηr(1 − γ(Tc − T0)) (2)
where ηr is the reference efficiency of the photovoltaic module under standard conditions (T0 = 25 °C, G0 = 1000 W/m², and AM = 1.5), γ is the temperature coefficient (°C⁻¹), determined experimentally as the variation in module efficiency for a 1 °C variation in cell temperature (it typically varies between 0.004 and 0.006 °C⁻¹), and Tc is the temperature of the module, which varies with the irradiance and the ambient temperature as follows:
$T_c = T_a + \left(\dfrac{NOCT - 20}{800}\right) G \qquad (3)$
being NOCT the nominal operating cell temperature and Ta the ambient temperature (°C).
Thus, the instantaneous total power of the photovoltaic array, $P^t_{pv}$, is written as follows:

$P^t_{pv}(t) = P_{pv}(t)\, N_{pv} \qquad (4)$

where Npv is the number of modules to be used in the photovoltaic array.
The main characteristics of the selected photovoltaic module are presented
in [11].
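To make the photovoltaic model concrete, the short Python sketch below evaluates Eqs. (1)–(4) for a single time step. The module parameters used here (reference efficiency, surface area, temperature coefficient, NOCT) are illustrative placeholders and not the datasheet values of the module selected in [11].

```python
def pv_array_power(G, Ta, n_pv, eta_r=0.17, S=1.6, gamma=0.0045, NOCT=45.0, T0=25.0):
    """Instantaneous PV array power (W), following Eqs. (1)-(4).

    G    : global irradiation on the inclined plane (W/m^2)
    Ta   : ambient temperature (degrees C)
    n_pv : number of modules in the array
    The module parameters are assumed values, for illustration only.
    """
    Tc = Ta + (NOCT - 20.0) / 800.0 * G          # cell temperature, Eq. (3)
    eta = eta_r * (1.0 - gamma * (Tc - T0))      # instantaneous efficiency, Eq. (2)
    p_module = eta * S * G                       # single-module power, Eq. (1)
    return p_module * n_pv                       # total array power, Eq. (4)

# Example: 800 W/m^2, 20 degrees C ambient, 50 modules
print(pv_array_power(G=800.0, Ta=20.0, n_pv=50))
```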
4.2 Wind Energy Conversion System
The power produced by a wind energy system at a given site depends on the wind speed at hub height, the air density, and the specific characteristics of the turbine. The wind speed (in m/s) at any hub height can be estimated using the following law:

$v = v_i \left(\dfrac{h}{h_i}\right)^{\lambda} \qquad (5)$
being hi the height of the measurement point, generally taken at 10 m, vi is the
speed measured at height hi (m/s) and λ is typically taken as 1/7.
The modeling of the power produced by a wind energy system unit Pwt is
selected as a linear model [12], as follows:
$P_{wt} = \begin{cases} 0, & v < v_d \ \text{or} \ v \ge v_c \\ P_n \, \dfrac{v - v_d}{v_n - v_d}, & v_d \le v < v_n \\ P_n, & v_n \le v < v_c \end{cases} \qquad (6)$
where Pn is the rated electrical power of the wind turbine unit, v is the value of
the wind speed, vn is the rated speed of the wind turbine, vc is the cut-off speed
and vd is the cut-in speed of the wind turbine.
The instantaneous total power produced by the wind farm, $P^t_{wt}$, is given by the number of wind energy units, Nwt, times the power of each unit, as follows:

$P^t_{wt}(t) = P_{wt}(t)\, N_{wt} \qquad (7)$
The features of the selected module of the wind energy conversion system are
presented in [13].
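The following Python sketch illustrates Eqs. (5)–(7): the hub-height extrapolation of the measured wind speed and the piecewise linear power curve. The turbine parameters (rated power, cut-in, rated and cut-off speeds, hub height) are assumed for illustration and are not the values of the unit referenced in [13].

```python
def wind_speed_at_hub(v_i, h, h_i=10.0, lam=1.0 / 7.0):
    """Extrapolate the wind speed measured at height h_i to hub height h, Eq. (5)."""
    return v_i * (h / h_i) ** lam

def wind_farm_power(v, n_wt, P_n=3000.0, v_d=2.5, v_n=11.0, v_c=25.0):
    """Total wind power (W) from the linear model of Eqs. (6)-(7).

    P_n, v_d (cut-in), v_n (rated) and v_c (cut-off) are placeholder values.
    """
    if v < v_d or v >= v_c:
        p_unit = 0.0                              # outside the operating range
    elif v < v_n:
        p_unit = P_n * (v - v_d) / (v_n - v_d)    # linear ramp between cut-in and rated speed
    else:
        p_unit = P_n                              # rated power up to the cut-off speed
    return p_unit * n_wt

v_hub = wind_speed_at_hub(v_i=2.93, h=24.0)       # 24 m hub height is an assumption
print(v_hub, wind_farm_power(v_hub, n_wt=3))
```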
4.3 Energy Storage System
The modeling of the energy storage system (ESS) is divided into two steps, according to two operations. The first step consists of evaluating the total capacity of the battery bank for an autonomy duration determined by the required specifications. In this way, it is possible to determine the capacity of a battery unit and therefore the number of batteries needed for the required autonomy, without the intervention of other production sources. The second step is the study of the batteries' behavior in the chosen sizing configuration, to assess the feasibility of the first step. Therefore, it is possible to know the exact number of batteries that will operate in parallel with the other production sources, mainly the photovoltaic and wind turbine systems, without any system interruption.
Sizing the Capacity of the Battery Bank. The total rated capacity of the
batteries, C, should satisfy the following formulation:
$C = \dfrac{N_j\, E_{need}}{P_d\, K_t} \qquad (8)$
being Nj the required autonomy in days, Eneed the daily energy, Pd the allowed
depth of discharge and Kt the temperature coefficient of the capacity.
Battery Energy Modelling. The battery energy modeling establishes an optimal energy storage management. This operation depends on the previous state of charge and on the difference between the energy produced by the different types of generators, Epr, and the energy required by the load, Ed, defined by:

$E_d(t) = E^t_{load}(t) - E_{pr}(t) \qquad (9)$

where

$E_{pr}(t) = E^t_{pv}(t) + E^t_{wt}(t) \qquad (10)$
and

$E^t_{pv}(t) = P^t_{pv}(t)\,\Delta t, \qquad E^t_{wt}(t) = P^t_{wt}(t)\,\Delta t, \qquad E^t_{load}(t) = P^t_{load}(t)\,\Delta t \qquad (11)$

knowing that Δt is the simulation step, defined as a one-hour interval.
The battery state of charge can be evaluated according to two operating modes, described as follows.
• Charging operating mode: if Epr ≥ Ed, the instantaneous energy storage capacity, E(t), is given by the following equation:

$E(t) = E(t-1) + \eta_{bat} \left( E_{pr}(t)\,\eta_{conv} - \dfrac{E_d(t)}{\eta_{conv}} \right) \qquad (12)$
being ηbat the battery efficiency and ηconv the converter efficiency.
• Discharging operating mode: if Epr < Ed, the instantaneous storage energy capacity can be expressed as follows:

$E(t) = E(t-1) + \dfrac{1}{\eta_{bat}} \left( E_{pr}(t)\,\eta_{conv} - E_d(t) \right) \qquad (13)$
The instantaneous state of charge is given by:

$SOC(t) = \dfrac{E(t)}{E_{total}} \qquad (14)$
In addition, in both charging and discharging modes, the state of charge (SOC) of the energy storage system must satisfy the following condition:

$SOC_{min} \le SOC(t) \le SOC_{max} \qquad (15)$
being SOCmin and SOCmax the minimum and maximum state of charge of the
energy storage system respectively.
The specifications of the proposed battery unit are described in [14].
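A minimal sketch of one simulation step of the storage model, following Eqs. (9)–(15), is given below in Python. The efficiencies and SOC limits are assumed values, and clipping the stored energy to the SOC window is simply one way of enforcing the constraint of Eq. (15).

```python
def battery_step(E_prev, E_pv, E_wt, E_load, E_total,
                 eta_bat=0.85, eta_conv=0.95, soc_min=0.2, soc_max=1.0):
    """One-hour update of the stored energy and SOC (energies in kWh).

    Efficiencies and SOC limits are illustrative assumptions.
    Returns the new stored energy (kept inside the SOC window) and the SOC.
    """
    E_pr = E_pv + E_wt                 # renewable production, Eq. (10)
    E_d = E_load - E_pr                # energy required from the storage, Eq. (9)

    if E_pr >= E_d:                    # charging mode, Eq. (12)
        E = E_prev + eta_bat * (E_pr * eta_conv - E_d / eta_conv)
    else:                              # discharging mode, Eq. (13)
        E = E_prev + (E_pr * eta_conv - E_d) / eta_bat

    E = min(max(E, soc_min * E_total), soc_max * E_total)   # enforce Eq. (15)
    soc = E / E_total                  # Eq. (14)
    return E, soc

# Example: 5 kWh PV + 1 kWh wind against a 4 kWh load, 40 kWh bank at 60% SOC
print(battery_step(E_prev=24.0, E_pv=5.0, E_wt=1.0, E_load=4.0, E_total=40.0))
```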
4.4 Diesel Generator
The rated power of the diesel generator, Pdg, is chosen to be at least the maximum power required by the load, i.e.,

$P_{dg} \ge P^{max}_{load}(t) \qquad (16)$
The operating mode of the diesel group is based on two possible scenarios:
• The first scenario occurs when the power verifies the following relation:

$P^t_{pv}(t) + P^t_{wt}(t) + \dfrac{E(t)}{\Delta t} \ge P_{load}(t) \qquad (17)$

In this case, it is not necessary to start the generator as the consumption is satisfied.
• The second scenario occurs when the following relationship is verified:

$P^t_{pv}(t) + P^t_{wt}(t) + \dfrac{E(t)}{\Delta t} < P_{load}(t) \qquad (18)$
In this case, the generator starts up to cover all or part of the energy demand
in the absence of enough power from the other sources.
The technical specifications of the chosen diesel group are mentioned in [15].
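The dispatch rule of Eqs. (17)–(18) reduces to a simple comparison, sketched below in Python; the example values are arbitrary and only illustrate when the backup group would be started.

```python
def diesel_needed(P_pv, P_wt, E_batt, P_load, dt=1.0):
    """True when the diesel group must start (Eq. (18)): PV, wind and the
    energy still available in the battery bank cannot cover the load."""
    available = P_pv + P_wt + E_batt / dt      # left-hand side of Eqs. (17)-(18)
    return available < P_load

# Example: 4 kW PV, 1 kW wind and 2 kWh of usable storage against a 9 kW load
print(diesel_needed(P_pv=4.0, P_wt=1.0, E_batt=2.0, P_load=9.0))   # True -> start the generator
```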
5 Adopted Sizing Strategy
After performing the load balance and evaluating the solar and wind potential presented earlier in Sect. 3, the renewable sources are designed to provide continuous power to the required demand, taking into account a load autonomy of 24 h ensured by the storage system. The storage system will therefore be able to supply the energy necessary for the load, without the presence of the renewable sources, for 24 h. The procedure to be followed is shown in the flowchart of Fig. 3. First, the capacity of the battery bank is defined according to the overall load and the desired autonomy, using Eq. (8) presented in Sect. 4.3.
This step gives the necessary number of batteries, if there are no renewable sources available, for a specific period of autonomy (24 h in this case). The diesel generator is the last-resort backup system, able to satisfy the demand if there is no renewable energy available and the battery bank reaches its minimum state of charge.
The intermittent nature of the renewables makes the problem subject to stochastic treatment. To solve the optimization problem, two metaheuristic methods were used: the Genetic Algorithm (GA) and the Particle Swarm Optimisation (PSO) algorithm. After providing the main initial data (meteorological data, constraints, and initial parameters), the values of the decision variables obtained by the algorithmic computation represent the optimal number of renewable generators. The algorithm evaluates them recursively until the installation cost reaches its minimum value.
Fig. 3. Flowchart of the sizing process
6 Problem Formulation
The sizing of the hybrid energy system is formulated as an optimisation problem. The objective function mainly contains the different power equations described in Sect. 4, in order to consider the technical constraints of the problem. On the other hand, the objective function also considers the economic criteria, i.e., the system purchase costs, maintenance costs and component replacement costs. The objective function aims at obtaining the optimal number of hybrid system components while satisfying the technical-economic constraints. The overall cost, in euros, for the system lifespan, T, considered equal to 20 years, is given by:
$C_t(N_{pv}, N_{wt}, N_b) = N_{pv}(C_{pv} + C_{pv}K_{pv} + T M_{pv}) + N_{wt}(C_{wt} + C_{wt}K_{wt} + T M_{wt}) + N_b(C_b + C_b K_b + (T - K_b - 1) M_b) + C_{die} + C_{conv}$
where Npv, Nwt, Nb are, respectively, the number of units of photovoltaic modules, wind turbines and batteries; Cpv, Cwt, Cb, Cdie, Cconv are, respectively, the purchase costs of the renewable units (photovoltaic and wind), a battery unit, the diesel group and the overall converters; Mpv, Mwt, Mb are the maintenance costs of the photovoltaic system, the wind system and the battery bank, respectively; and Kpv, Kwt, Kb are, respectively, the numbers of equipment replacements during the system lifetime.
The sizing optimisation problem can be written as follows,

$\begin{aligned} \min_{(N_{pv},\, N_{wt},\, N_b)} \quad & C_t(N_{pv}, N_{wt}, N_b) \\ \text{s.t.} \quad & P_{pv}(t)\,N_{pv} + P_{wt}(t)\,N_{wt} + P_b(t)\,N_b \ge P_{load}(t) \\ & 0 \le N_{pv} \le N^{max}_{pv} \\ & 0 \le N_{wt} \le N^{max}_{wt} \\ & 0 \le N_b \le N^{max}_{b} \end{aligned} \qquad (19)$
The optimization problem is solved by two metaheuristic algorithms, namely the Particle Swarm Optimisation (PSO) algorithm and the Genetic Algorithm (GA). PSO and GA are population-based methods, i.e., they depend on knowledge sharing among population members to improve their search processes.
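As an illustration of how Eq. (19) can be attacked with PSO, the sketch below implements a generic, penalty-based particle swarm search in Python. It is not the authors' Matlab implementation: the unit prices, maintenance costs, replacement counts, worst-hour powers and bounds are placeholder values, and handling the power-balance constraint with a penalty term is one possible design choice.

```python
import random

def total_cost(n):
    """Illustrative instance of the cost function C_t (all prices in EUR are assumed)."""
    n_pv, n_wt, n_b = n
    T = 20                                          # system lifespan in years
    C_pv, M_pv, K_pv = 250.0, 5.0, 0
    C_wt, M_wt, K_wt = 4000.0, 80.0, 0
    C_b,  M_b,  K_b  = 600.0, 10.0, 4
    C_die, C_conv = 3000.0, 1500.0
    return (n_pv * (C_pv + C_pv * K_pv + T * M_pv)
            + n_wt * (C_wt + C_wt * K_wt + T * M_wt)
            + n_b * (C_b + C_b * K_b + (T - K_b - 1) * M_b)
            + C_die + C_conv)

def penalised(n, p_pv=0.2, p_wt=1.5, p_b=1.2, p_load=12.0):
    """Objective of Eq. (19) with the load-coverage constraint as a penalty
    (worst-hour unit powers in kW are assumed values)."""
    supply = n[0] * p_pv + n[1] * p_wt + n[2] * p_b
    return total_cost(n) + 1e6 * max(0.0, p_load - supply)

def pso(bounds, swarm=60, iters=100, w=0.9, c1=2.0, c2=2.0):
    dim = len(bounds)
    x = [[random.uniform(*bounds[d]) for d in range(dim)] for _ in range(swarm)]
    v = [[0.0] * dim for _ in range(swarm)]
    pbest = [p[:] for p in x]
    gbest = min(pbest, key=penalised)
    for _ in range(iters):
        for i in range(swarm):
            for d in range(dim):
                v[i][d] = (w * v[i][d]
                           + c1 * random.random() * (pbest[i][d] - x[i][d])
                           + c2 * random.random() * (gbest[d] - x[i][d]))
                x[i][d] = min(max(x[i][d] + v[i][d], bounds[d][0]), bounds[d][1])
            if penalised(x[i]) < penalised(pbest[i]):
                pbest[i] = x[i][:]
        gbest = min(pbest, key=penalised)
    return gbest, penalised(gbest)

best, cost = pso(bounds=[(0, 100), (0, 10), (0, 30)])
print([round(b) for b in best], round(cost, 2))     # rounded numbers of units and cost
```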
7 Results and Discussions
This section presents the results obtained with the optimization approach. After the optimal number of units is reached, a technical study is performed to analyze the reliability of the installation with the obtained number of units (photovoltaic modules, wind turbines, batteries, and diesel group). Furthermore, an economic evaluation is presented for the overall system lifetime.
7.1 Optimisation Results
The sizing problem is implemented using the Matlab programming platform. The optimisation procedure is solved with the two evolutionary algorithms introduced above, PSO and GA. The parameters of the selected PSO are: search dimension = 1, population size = 60, number of iterations = 100, c1 = 2, c2 = 2 and w = 0.9. The GA uses the same population size and number of iterations, with mutation and crossover probabilities of 1% and 75%, respectively. Considering the devices' specifications previously introduced in Sect. 4, the optimisation results are presented in Table 1 for both algorithms.
The optimal configuration, which corresponds to the lowest overall cost while satisfying the technical constraints, consists of 50 photovoltaic modules, 3 wind turbines, and a battery bank with 13 units. For the 20-year HES lifespan, the total purchase, maintenance and replacement costs are estimated at 157 k€.
Figure 4 illustrates the convergence diagram for the two optimisation algorithms, PSO and GA, used to solve the optimization problem defined in Eq. (19).
According to Table 1 and Fig. 4, it is notable that PSO presents a superior efficiency over GA, under the same stopping condition. The path to the optimum using the Genetic Algorithm is much slower than the one with PSO, i.e., the computational effort required by PSO to reach global optimality is lower than the effort required by the GA to arrive at the same high-quality results. It should be noted that the PSO search mechanism uses a lower number of function evaluations than the GA, which explains this outcome. Besides, the effects of velocity and implicit transition rules on particle movement led to a faster convergence of the PSO. In terms of accuracy, PSO provided more accurate values than GA, as shown by its smaller relative error.

Table 1. Optimal configuration results.

                            | NP    | NW   | Nb | Ndg | Iterations | CPU time (s) | Error (%) | Total price (€)
Genetic algorithm           | 49.68 | 2.62 | 13 | 1   | 38         | 0.47         | 1         | 156310.38
Particle swarm optimisation | 49.89 | 2.73 | 13 | 1   | 27         | 0.23         | 0.07      | 156670.61
Optimal configuration       | 50    | 3    | 13 | 1   | -          | -            | -         | 157755.87

Fig. 4. Convergence performances: (a) PSO; (b) GA.
7.2 Technical Validation
This section aims at verifying the optimality of the results obtained from the optimization algorithms, by considering the optimal number of devices obtained during the optimization procedure. With this in mind, the overall system is evaluated, from the technical point of view, over a time span of 24 h. Figure 5a shows the power output setpoints of the various HES generators. The balance constraint between production and demand is satisfied throughout the day, since the sum of the power from the sources, including the battery bank, is equal to or higher than the load for every time step. The photovoltaic system operates during the day to supply the total demand, including the charging of the energy storage system. When the solar irradiation decreases, and the wind reaches the cut-in speed of the wind turbine system, the latter compensates for the energy deficit. In this case, the diesel generator does not start, because it is used only in the absence of the two renewable sources and when the storage system has reached its minimum state of charge (SOCmin).
To analyse the performance of the chosen diesel generator set in this microgrid, the system performance is analysed for 48 h, considering the hypothetical scenario of null production from the renewable sources in the proposed time span. According to Fig. 5b, the diesel generator becomes operational (working as a backup system) after the minimal state of charge of the energy storage system has been reached, which confirms that Ndg = 1 is sufficient and optimal to feed the installation in the absence of the other supplies.
Fig. 5. Optimal power set-points describing the hybrid energy system behavior: (a) presence of all sources; (b) absence of renewable energies.
Figure 6a shows the evolution of the state of charge (SOC) of the battery bank when operating in parallel with the other HES sources, for a time span of 24 h. The energy of the batteries is used (discharged) mainly during the night to supply the loads, while the charging process is ensured during the day by the surplus of the renewable sources over the demand.
Figure 6b represents the capacity of the battery bank to maintain a 24-hour autonomy adopting the Cycle Charge Strategy [16], while satisfying the needs of the load, i.e., if the renewable sources (photovoltaic and wind systems) are not available, the storage system is able to supply the demand for 24 h without any interruption of the power supply.
The total discharge of the batteries contributes enormously to the reduction of their life span. For this reason, the SOC of the batteries should never go below its minimum value. This constraint is verified since, in this case study, the SOC of the energy storage system after 24 h of usage did not go below the minimum state, defined as 20% in the battery specification. Indeed, the number of storage batteries delivered by the optimization algorithm computations discussed above represents an optimal value.
Fig. 6. Storage system behavior: (a) SOC of the battery bank in parallel operation; (b) SOC of the battery bank in autonomous operation.
7.3 Economical Evaluation
In order to evaluate the economic benefit of the overall installation, its profitability during the life cycle is analysed. The lifetime of the HES is 20 years and the total purchase, maintenance and replacement costs are 157755.87 €. The average power consumption during 24 h is 122.915 kWh, which is equivalent to 898508.65 kWh over 20 years, without taking into account load growth. Portugal has one of the most expensive electricity systems in Europe: according to Eurostat data for 2020, the price of electricity averaged 0.2159 €/kWh including taxes [17]. The estimated cost of the energy bill is therefore 195964.74 €. These values allow highlighting the amount saved by the customer, as well as the payback period.
Figure 7 shows an economic estimation for the lifetime under consideration. The black line shows the evolution of the energy bill according to conventional energy from the public grid. The red line represents the evolution of the investment cost in the HES: it starts with an initial purchase cost that increases each year with the maintenance of the different elements, and every 4 years the batteries are replaced, without taking into account the maintenance cost of the batteries in the year of their replacement.
The intersection of the final value of the investment cost with the conventional energy bill on the time axis represents the moment when the customer recovers the amount invested in the HES; from that moment until the end of the life of the HES, all the money that would have been spent on the energy bill of the installation is saved. It can be seen that the cost of the investment in the HES will be repaid at the end of the 16th year, more precisely in October 2037, if it is considered that the system is installed on 1 January 2021. In this case, the HES is more profitable than conventional energy and allows the customer to save 35423.49 €.
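A simplified Python sketch of this payback reasoning is shown below. It assumes a flat tariff and constant load and spreads no costs over time, so it only approximates the values read from Fig. 7 rather than reproducing the paper's exact figures.

```python
def payback_estimate(daily_load_kwh=122.915, price_per_kwh=0.2159,
                     hes_total_cost=157755.87, lifetime_years=20):
    """Find the year in which the cumulative grid bill reaches the total
    20-year HES cost, and the saving accumulated at the end of life.
    Flat tariff and constant load are simplifying assumptions."""
    yearly_bill = daily_load_kwh * 365.0 * price_per_kwh
    cumulative_bill = 0.0
    payback_year = None
    for year in range(1, lifetime_years + 1):
        cumulative_bill += yearly_bill
        if payback_year is None and cumulative_bill >= hes_total_cost:
            payback_year = year
    return payback_year, cumulative_bill - hes_total_cost

print(payback_estimate())   # roughly the 17th year and about 36 kEUR saved under these assumptions
```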
Fig. 7. Economic evaluation during the system life-cycle.
8 Conclusions and Future Work
In this paper, the problem of the optimal sizing of a hybrid energy system (HES) configuration based on renewable energies was formulated as a uni-objective optimization problem under constraints. Two optimization approaches based on evolutionary algorithms were considered: Particle Swarm Optimisation (PSO) and the Genetic Algorithm (GA). PSO showed better performance in terms of accuracy and convergence speed. The obtained sizing results were tested and, from the technical point of view, the performance of the system guarantees the energy balance and the remaining constraints. In addition, a simplified economic approach indicates that the return on investment is possible within the lifetime of the energy system. As future work, it is proposed to treat the problem in a multi-objective optimization formulation while introducing environmental and economic constraints, whose results would provide a set of scenarios for the optimal configuration of a given HES. On the other hand, it is intended to improve this software by providing an interface suitable for all case studies, allowing the identification of the best sizing of a given HES through a technical and economic analysis of the computational results.
Acknowledgements. This work has been supported by FCT - Fundação para a
Ciência e Tecnologia within the Project Scope UIDB/05757/2020.
References
1. Ritchie, H., Roser, M.: Access to energy. Our World in Data (2019). http://www.ourworldindata.org/energy-access
2. Ma, W., Xue, X., Liu, G.: Techno-economic evaluation for hybrid renewable energy
system: application and merits, Energy. 385–409 (2018). https://doi.org/10.1016/
j.energy.2018.06.101
3. Mamen, A., Supatti, U.: A survey of hybrid energy storage systems applied for
intermittent renewable energy systems. In: 14th International Conference on Elec-
trical Engineering/Electronics, Computer, Telecommunications and Information
Technology (2017). https://doi.org/10.1109/ecticon.2017.8096342
4. Gupta, A., Saini, R., Sharma, M.P.: Modelling of hybrid energy system-Part I:
problem formulation and model development, Renew. Energy, pp. 459–465 (2011).
https://doi.org/10.1016/j.renene.2010.06.035
5. AlHajri, M.F., El-Hawary, M.E.: Optimal distribution generation sizing via fast
sequential quadratic programming. IEEE Large Eng. Syst. Conf. Power Eng.
(2007). https://doi.org/10.1109/LESCPE.2007.4437354
6. Ghaffarzadeh, N., Zolfaghari, M., Ardakani, F.J., Ardakan, A.J.: Optimal sizing of
energy storage system in a micro grid using the mixed integer linear programming.
Int. J. Renew. Energy Res. (IJRER). 2004–2016 (2017). https://doi.org/10.1037/
0003-066X.59.1.29
7. Bakirtzis, A.G., Gavanidou, E.S.: Optimum operation of a small autonomous sys-
tem with unconventional energy sources. Electr. Power Syst. Res., 93–102 (1992).
https://doi.org/10.1016/0378-7796(92)90056-7
8. Kusakana, K., Vermaak, H.J., Numbi, B.P.: Optimal sizing of a hybrid renewable
energy plant using linear programming. IEEE Power Energy Soc. Conf. Expos.
Africa Intell. Grid Integr. Renew. Energy Resourc. PowerAfrica (2012). https://
doi.org/10.1109/PowerAfrica.2012.6498608
9. Sharafi, M., Tarek, E.Y.: Multi-objective optimal design of hybrid renewable energy
systems using PSO-simulation based approach. Renew. Energy, 67–79 (2014).
https://doi.org/10.1016/j.renene.2014.01.011
10. Adamo, F., Attivissimo, F., Nisio, D., Lanzolla, A., Spadavecchia, M.: Parameters
estimation for a model of photovoltaic panels. In: XIX IMEKO World Congress
Fundamental and Applied Metrology, pp. 6–11. Lisbon, Portugal (2009). https://
doi.org/10.1068/978-963-88410-0-1
11. Zytech Solar Modules Catalogue: Maximum Quality, Efficiency and Reliability.
www.Zytech-solar.com
12. Hongxing, Y., Lu, L., Zhou, W.: A novel optimization sizing model for hybrid
solar-wind power generation system. Solar Energy 76–84 (2007). https://doi.org/
10.1016/j.solener.2006.06.010
13. Flex Pro: Series Eol Generator (EOL/3000) www.flexpro-industry.com
14. Ultracell: Solar Series (UCG250-48) ultracell.com
15. Leroy merlin: Hyndai (Hyundai Diesel Generator 12kva Mono And Tri -
Dhy12000xse-t) leroymerlin.fr
16. Banguero, E., Correcher, A., Pérez-Navarro, Á., Morant, F., Aristizabal, A.: A
review on battery charging and discharging control strategies: application to renew-
able energy systems. Energies 1–15 (2018). https://doi.org/10.3390/en11041021
17. Eurostat: Electricity prices (including taxes) for household consumers, first half
2020. http://www.eurostat/statistics-explained, Accessed 18 Aug 2020
Robotics
Human Detector Smart Sensor for
Autonomous Disinfection Mobile Robot
Hugo Mendonça1,2(B), José Lima2,3, Paulo Costa1,2, António Paulo Moreira1,2, and Filipe Santos2

1 Faculty of Engineering, University of Porto, Porto, Portugal
{up201606204,paco,amoreira}@fe.up.pt
2 INESC-TEC - INESC Technology and Science, Porto, Portugal
{hugo.l.mendonca,jose.lima,filipe.n.santos}@inesctec.pt
3 Research Centre of Digitalization and Intelligent Robotics, Instituto Politécnico de Bragança, Bragança, Portugal
jllima@ipb.pt
Abstract. The COVID-19 virus outbreak led to the need to develop smart disinfection systems, not only to protect the people that usually frequent public spaces but also to protect those who have to enter the contaminated areas. In this paper, a human detector smart sensor is developed for an autonomous disinfection mobile robot that uses ultraviolet C (UVC) light for the disinfection task; the sensor stops the disinfection system when a human is detected around the robot, in any direction. UVC light is dangerous for humans, hence the need for a human detection system that protects them by disabling the disinfection process as soon as a person is detected. This system uses a Raspberry Pi Camera with a Single Shot Detector (SSD) MobileNet neural network to identify and detect persons. It also has a FLIR Lepton 3.5 thermal camera whose temperature measurements are used to detect humans within a certain temperature range, with the normal human skin temperature as the reference value for the range definition. The results show that fusing the data of both sensors improves the system performance compared to using the sensors individually. One of the tests performed proves that the system is able to distinguish a picture of a person from a real person by fusing the thermal camera and the visible light camera data. The detection results validate the proposed system.
Keywords: Smart sensor · Human detection · Neural network
1 Introduction
The current pandemic situation we live in, caused by the COVID-19 virus outbreak, led to the need to provide safe conditions to people that share crowded spaces, especially closed environments, where the propagation of the virus is substantially easier and therefore represents a more dangerous situation for society. Places such as hospitals, medical centers, airports, or supermarkets
are among those that fit the previous description and, even though regular testing of the users may be one way to prevent the spread of the virus, the disinfection of public places is of great importance.
Disinfection using chemical products is one of the most popular approaches, but it also implies a lot of effort and risk for the user, who must be exposed to the dangerous proximity of possibly infected places. One way to solve this problem is using an Automated Guided Vehicle (AGV) that can cover large spaces without the supervision of human operators. Disinfection with ultraviolet radiation is an alternative to the use of chemical products without losing efficiency. This method works by exposing the infected area for a few seconds to the UV radiation, reducing the viral load that is present there.
This method is particularly interesting since it allows the disinfection of a larger area in less time and, by using an AGV, this task can be programmed to be performed during periods without the presence of people (for example in airports and commercial areas). However, due to its danger for living organisms, it is necessary to develop a system that detects persons that may be close to the actuation zone.
This paper presents a system that uses multiple sensors to achieve a higher robustness in detecting humans, without false positives and false negatives, by fusing the individually captured sensor signals in a single algorithm that then communicates with the disinfection robot. False negatives have a bigger impact on the system performance, since the disinfection lights are not stopped when persons are near the robot. The results are very promising, reaching 97% accuracy.
The remainder of the paper is organized as follows. After the introduction presented in Sect. 1, Sect. 2 addresses the related work regarding human detection approaches. Then, in Sect. 3, the system architecture is presented, where the thermal camera, the Raspberry Pi camera and the LiDAR sensor used to estimate the disinfection time based on the room area are stressed. Section 4 presents the results and their discussion. Finally, Sect. 5 presents the conclusions and points out future work directions.
2 Related Work
When trying to detect people in a closed environment, a multi-sensor approach can be a good solution. Since it relies on the information of more than one sensor to give a final, more accurate result, this method is the focus of this article. The data gathered by the different cameras and other sensors are fused and analyzed, allowing the system to have one single output. The system uses two cameras, one visible light camera and one thermal camera, to run an active detection algorithm. On the other hand, there are also two other sensor types: a LiDAR that gives distance information in the direction where the cameras are pointing, and a Passive Infrared (PIR) sensor of the kind used in home surveillance systems for motion detection [9,10,13]. This last type of sensor is also commonly used to locate the motion target in space when using multiple sensors pointing
in different directions, allowing the system to pinpoint its position to a certain degree [5,15,17]. The most common LiDARs cannot distinguish what type of objects are close to the system where they are used, which means they cannot be used as active detection sensors. However, the distance data can be used to give the system a better comprehension of what is happening around it, by mapping the environment [1] and working together with the cameras to improve the system performance [4,12].
In regards to active human detection applications using cameras, a thermal camera is a less informative sensor in terms of visual information, since it only captures temperatures based on infrared sensors. However, in situations of low light, as described in [3] and [16], this type of camera achieves more accurate detection results when compared to visible light cameras without infrared sensors. In [7], a comparison is made between two cameras, one with and the other without infrared modules, and the first outperforms the second by far, thus suggesting that in low light conditions thermal cameras are more informative than visible light cameras.
Since some thermal cameras are usually expensive, it is necessary to find one that meets our requirements without compromising our budget. The FLIR Lepton represents such a case, as shown in [11], with the capability to detect specific temperatures of objects smaller than humans from long distances [3], in a compact, low-weight sensor compatible with embedded devices such as, in our case, the Raspberry Pi. This model has two versions, 2.x and 3.x, where the first has an 80 × 60 resolution and the second a 160 × 120 resolution, among other differences.
The visible light camera detection relies on previously trained models to identify persons. This is commonly used in Internet of Things applications and in more complex surveillance systems [7] that, instead of or in addition to passive sensors such as PIR sensors, also use computer vision and machine learning to identify persons more accurately. PIR sensors are a good option in simple motion detection systems since they are cheap and easy to use; however, they sometimes trigger false positives, which lead to unnecessary concern, and false negatives when a person is present but not moving. Human detection can be achieved by using image processing through background subtraction plus face detection or head and shoulder detection [7], but also using machine learning through networks prepared for the task of identification, such as Convolutional Neural Networks (CNN) [14], the DarkNET Deep Learning Neural Network [8], and others such as YOLO and its variants, and SSD and CNN variants [2].
3 System Architecture
The goal of this paper is to develop a human detection system that can be used with an autonomous disinfection robot; it was developed to be a low-cost, low-weight, and efficient solution. The system is mounted on top of a robot equipped with a UVC light tower responsible for the disinfection procedure. The human detector is a stand-alone piece of equipment that is powered by 24 V and can
communicate by WiFi or serial port. In the presented case study, a serial port connection is used to obtain information from the developed human detector module. Figure 1 presents the main architecture of the human detector and its integration.
Fig. 1. System architecture scheme (ESP32 control board with stepper motor driver, PIR sensors, sonar and auxiliary I/O, linked by serial/USB to the robot PC; rotating module with Raspberry Pi, LiDAR, RGB camera and thermal camera).
The human detection system consists of two major parts: the upper part, which is a rotating module and contains a Raspberry Pi, a LiDAR, a FLIR Lepton thermal camera, and a Raspberry Pi Camera Module v2; and the lower part, which has 4 PIR sensors to detect movement (used only when the robot is not moving) and a control circuit board that contains an ESP32 micro-controller. The ESP32 is used to control the stepper motor that drives the upper part, general-purpose I/O, and a sonar to measure the distance to the building roof. This embedded system also
sends information to the main system (Robot PC) by an USB serial connection.
To minimize the number of cameras used by the system, the upper part is a rotating module, developed so that a single pair of cameras is able to sweep a full 360° rotation. A decision was made to use a non-continuous motion, stopping for a brief moment at each position, instead of a continuous rotation. After measuring both vertical apertures, and considering that the vertical resolution is used to enhance the field of view captured by the system, the stops are made every 36°, resulting in a total of 10 stops per rotation and taking an average time of 10 s per rotation.
A two-pulley approach is used to achieve the rotation motion, with one pulley attached to a stepper motor that is controlled by the micro-controller located in the lower part of the system. The other pulley, which also holds the upper part of the system to the lower part, has a bearing fixed to the box and rotates according to the stepper rotation with a certain transmission ratio.
3.1 Thermal Camera
A FLIR Lepton 3.5 with a FLIR Breakout Board v2 is used due to its 160 × 120 resolution, which is 4 times larger than the 80 × 60 of its predecessor. It uses two types of communication: I2C, to send the SDK commands provided by FLIR that can be used for initialization, reboot, and/or system diagnosis; and a Serial Peripheral Interface (SPI), to send the data from the camera to the listening device, in this case the Raspberry Pi.
By default, the camera sends 164-byte packets which contain a 4-byte header and 160 bytes of payload. Since the raw value of each pixel is a 16-bit value, the data of each pixel is sent in 2 bytes, which means each packet only carries information for half a line of a frame. A frame is divided into 4 segments and each segment consists of 60 packets, identified in the packet header. The 20th packet also identifies the current segment: if the segment ID is 1, 2, 3, or 4 the packet is valid; if the ID is 0, it is a discard packet.
The raw 16-bit values per pixel can represent values between 0 and 65535. However, the values do not span this whole range; instead, the raw values are temperatures in Kelvin multiplied by a factor of 100. Since it is desired to detect human-range temperatures, we are interested in finding values that are around 306.15 K (equal to 33 °C), or the raw value 30615, which is the normal temperature of the human skin. In the end, we did not find major differences between using the human body temperature, which is around 37 °C, and the human skin temperature as the reference value for the detection range. However, we found that for persons that are further away from the system it is preferable to use the skin temperature as the reference value, or an even lower value, since the measured human temperature becomes an average between the human skin temperature and the temperature of the person's clothes.
A range is set in the algorithm to consider a group of temperatures near this ideal human temperature value, considering two factors: human temperature fluctuation and the camera precision of ±5 °C. If the read value falls within this range, we consider it a positive human detection, even though we need to consider other inputs, since there can be other objects in range that have the same temperature as the human body.
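A minimal Python sketch of this temperature-range test on the Lepton raw values (Kelvin × 100) is given below. The tolerance and the minimum pixel count used here are assumed parameters, not the values used on the robot.

```python
import numpy as np

def human_temperature_mask(raw_frame, target_c=33.0, tolerance_c=6.0):
    """Boolean mask of pixels whose temperature falls in the assumed human range.

    raw_frame   : 120x160 array of 16-bit Lepton values (centikelvin, i.e. K * 100)
    target_c    : reference skin temperature in degrees C
    tolerance_c : allowance for temperature fluctuation and the +/-5 C camera
                  precision (illustrative value)
    """
    celsius = raw_frame.astype(np.float32) / 100.0 - 273.15
    return np.abs(celsius - target_c) <= tolerance_c

def thermal_detection(raw_frame, min_pixels=500):
    """Positive detection when enough pixels are in range (threshold assumed)."""
    return int(human_temperature_mask(raw_frame).sum()) >= min_pixels

# Example with a synthetic frame: background at 22 C, a warm 40x40 blob at 33 C
frame = np.full((120, 160), round((22.0 + 273.15) * 100), dtype=np.uint16)
frame[40:80, 60:100] = round((33.0 + 273.15) * 100)
print(thermal_detection(frame))   # True: 1600 pixels are inside the range
```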
3.2 RGB Camera
The RGB camera used is a Raspberry Pi Camera Module v2, due to its low cost and ease of use with the Raspberry Pi, without compromising performance and image quality.
The system uses an SSD MobileNet model trained on the COCO dataset to detect and identify persons in real time through the Raspberry Pi camera. The MobileNet was quantized to a lower precision (the internal neural network weights are moved from floats to integers) using the TensorFlow framework, to reduce the computational power requirement and enable the use of this neural network on a Raspberry Pi. As stated in [6], a Single Shot multibox Detector (SSD) is a rather different detection model from Region-CNN or Faster R-CNN: instead of generating the regions of interest and classifying the regions separately, SSD does both simultaneously, in a "single shot".
The SSD framework is designed to be independent of the base network, and so it can run on top of other Convolutional Neural Networks (CNNs), including MobileNet.
The MobileNet on the Raspberry Pi is capable of detecting persons in the chosen vertical resolution at about 2.5 frames per second. In this case, we defined a threshold of 50% on the MobileNet detection confidence to reduce false positive detections. The network is quite robust, having the capacity to detect persons in a number of different situations, for example through a partial body image or simply an image of a single hand.
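A hedged sketch of how such a quantized SSD MobileNet can be queried with TensorFlow Lite on a Raspberry Pi is shown below. The model file name is hypothetical, and the output tensor ordering (boxes, classes, scores, count) follows the common SSD post-processing layout, which may differ for other exported models.

```python
import numpy as np
from tflite_runtime.interpreter import Interpreter

MODEL_PATH = "ssd_mobilenet_v2_coco_quant.tflite"   # hypothetical file name
PERSON_CLASS_ID = 0                                  # "person" index in the label map assumed here
CONF_THRESHOLD = 0.5                                 # 50% confidence threshold

interpreter = Interpreter(model_path=MODEL_PATH)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

def detect_person(frame_rgb):
    """Return True when at least one 'person' box scores above the threshold.

    frame_rgb: HxWx3 uint8 image already resized to the model input size.
    """
    interpreter.set_tensor(input_details[0]["index"], np.expand_dims(frame_rgb, axis=0))
    interpreter.invoke()
    classes = interpreter.get_tensor(output_details[1]["index"])[0]
    scores = interpreter.get_tensor(output_details[2]["index"])[0]
    return any(s >= CONF_THRESHOLD and int(c) == PERSON_CLASS_ID
               for c, s in zip(classes, scores))
```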
3.3 LiDAR
The detection system also includes a LiDAR sensor pointing in the same direction as the cameras, which continuously sends distance measurements to the Raspberry Pi. This information is useful for mapping applications, recreating the surrounding environment of the sensor and giving yet another means of detection to the system, although a much less informative one, since it cannot distinguish different types of objects simply by using distance measurements. The disinfection time can be tuned based on this information.
The LiDAR used is a TFMini Plus UART module, a low-cost, small-size, and low-power-consumption solution that promotes cost-effective LiDARs. The TFmini Plus LiDAR module emits modulation waves of near-infrared rays on a periodic basis, which are reflected after contacting an object. The time of flight is then measured through the round-trip phase difference, and the relative range between the device and the detected object is calculated.
Distances are sent from the LiDAR to the Raspberry Pi at a constant pace via UART and are later sent from the Raspberry Pi to the ESP32 micro-controller, which is responsible for the communication between the system and the robot for the mapping tasks. The connections and buses from the cameras and LiDAR to the Raspberry Pi CPU are presented in Fig. 2.
Fig. 2. Human detection module architecture (FLIR Lepton 3.5 over SPI/I2C, Picamera over the RPi flat cable, LiDAR over UART, DC/DC converter from 24 V to 5 V/3V3, and serial link to the ESP32/PCB).
3.4 Messaging Protocol
Since the two cameras have separate algorithms running in individual programs, there is a need to aggregate and compare the detections from each algorithm in a single program, which then sends that information along with the LiDAR data, which runs in yet another program.
ZMQ, an asynchronous messaging library, is used for this purpose. It offers a publisher/subscriber pattern in which the detection algorithms and the LiDAR program act as publishers, continuously sending data to specified ports, which are then subscribed to by a master control program acting as the subscriber.
This messaging method works by creating a socket on the publisher side, binding it to the chosen IP address and port, and then sending the data. On the subscriber side, a socket is also created and connected to the same IP address and port as the publisher, and the data received on the socket are read. It is important to mention that each publisher needs to have a different port; however, the subscriber is able to read from different ports at the same time.
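A minimal pyzmq sketch of this publisher/subscriber arrangement is shown below; the port numbers and the message fields are illustrative, and the two functions are intended to run in separate processes, as the detection programs and the master control program do on the robot.

```python
import time
import zmq

def run_publisher(port=5556):
    """Detection-program side: continuously publish its latest result."""
    ctx = zmq.Context()
    pub = ctx.socket(zmq.PUB)
    pub.bind(f"tcp://127.0.0.1:{port}")          # each publisher binds its own port
    while True:
        pub.send_json({"sensor": "thermal", "detected": True})
        time.sleep(0.1)

def run_subscriber(ports=(5555, 5556, 5557)):
    """Master-control side: one SUB socket connected to every publisher port."""
    ctx = zmq.Context()
    sub = ctx.socket(zmq.SUB)
    for port in ports:                           # e.g. RGB, thermal and LiDAR publishers
        sub.connect(f"tcp://127.0.0.1:{port}")
    sub.setsockopt_string(zmq.SUBSCRIBE, "")     # no topic filter: receive everything
    while True:
        msg = sub.recv_json()                    # blocks until any publisher sends
        print(msg)
```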
After gathering all the relevant data, this subscriber program also communicates via UART with the ESP32 micro-controller, sending information about the detection state of each camera as well as the sensor fusion result. The distance read by the LiDAR is also sent, for other purposes of the robot.
Finally, Fig. 3 presents the developed Module Prototype.
Fig. 3. Human detection module prototype
4 Results
The detection algorithms were tested by placing the system in a populated environment. Figure 4 shows an example of the sequential images collected from the thermal camera in one 360° rotation. We can see from those images that there are people in one part of the testing area and no persons in the other. This gives us a better understanding of how the system reacts to populated areas versus non-populated areas. The goal was to assess the individual performance of the cameras in person detection tasks and to test their robustness in different scenarios where we knew a certain camera would fail individually. Some examples of this
are the detection of pictures instead of real persons by the visible light camera algorithm, and the detection of devices at human-like temperatures when no human is present. Finally, we tested the algorithm that combines both data streams to see if the performance was better when compared to the previous tests using the cameras individually. A total of 1200 images were taken by each of the cameras. About 80% of the images are supposed to be classified as positive detections; the rest are images where no person was in the field of view of the system and therefore nothing should be detected.
Fig. 4. Example of images captured from one rotation
An object detection classification method was used where a True Positive (TP) means a situation where a person was present in the image and the algorithm detected it positively, whereas a False Positive (FP) means a person was not present in the image but the algorithm falsely detected one. In regards to the negative detections, a True Negative (TN) means that no person was present in the image and the algorithm correctly detected nothing, and a False Negative (FN) means a person was not detected even though it was present. These results are presented in Table 1.
With the True/False Positive/Negative information we can then calculate the algorithm Precision, Recall, and F1 score based on the equations below. Precision is a good measure when the cost of a False Positive is high, whereas Recall measures how many of the actual positives our model captures by labeling them as positive (True Positive). Following the same reasoning, Recall shall be the metric used to select the best model when there is a high cost associated with a False Negative. The F1 score is needed when a balance between Precision and Recall is sought. The results are presented in Table 2.

Table 1. Detection classification

Detection method        | TP  | TN  | FP  | FN
Thermal camera          | 923 | 145 | 306 | 51
Visible light camera    | 897 | 175 | 127 | 89
Thermal + visible light | 981 | 194 | 88  | 32
$\text{Precision} = \dfrac{\text{TruePositive}}{\text{TruePositive} + \text{FalsePositive}}, \qquad \text{Recall} = \dfrac{\text{TruePositive}}{\text{TruePositive} + \text{FalseNegative}}, \qquad F1 = 2 \times \dfrac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$
Table 2. Detection classification

Detection method        | Precision (%) | Recall (%) | F1
Thermal camera          | 75.18         | 94.76      | 0.8379
Visible light camera    | 87.59         | 90.97      | 0.8925
Thermal + visible light | 91.77         | 96.84      | 0.9423
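As a quick check of how these metrics follow from the raw counts, the short Python sketch below plugs the fused-detection row of Table 1 into the formulas; small differences with respect to Table 2 are only due to rounding.

```python
def metrics(tp, fp, fn):
    """Precision, recall and F1 score from raw detection counts (TN is not needed)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Counts taken from Table 1 (thermal + visible light fusion)
p, r, f1 = metrics(tp=981, fp=88, fn=32)
print(f"precision={p:.2%} recall={r:.2%} F1={f1:.4f}")   # close to the last row of Table 2
```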
Figures 5 and 6 show an example of a True Positive for each of the cameras. The visible light detection algorithm detects two persons with more than 50% confidence, which classifies as a positive detection; it is important to mention that the use of masks does not have a big impact on the identification of persons. The thermal camera detection algorithm finds a total of 3545 pixels in the range of acceptable temperatures, which means that the system also considers this image a positive detection. Since both cameras consider these images True Positives, the result is a True Positive for the detection system.
In Figs. 7 and 8, we tested a scenario where we knew it was possible that the visible light detection algorithm would falsely signal a detection. The idea was to place a picture of a person in front of the camera even though there is no real person there. As we can see, the visible light algorithm detects the picture as a positive, but the thermal detection algorithm does not, since no temperatures in the image are within the predefined range of human temperatures. Individually, one algorithm classifies the scenario as positive and the other as negative, resulting in a negative for the overall system. This proves that the fusion of data from both sensors can improve the system's performance.

Fig. 5. Example of true positive - visible light camera

Fig. 6. Example of true positive - thermal camera
Fig. 7. Example of false positive - visible light camera
There are also some cases where the detection could not be achieved by either of the cameras. As we can see in Fig. 9, there is a person in the image; however, the algorithm is not capable of detecting it because of the distance from the system to the person, the low light conditions, and the low resolution of the neural network. The thermal sensor also has some problems when trying to detect persons that are too far from the sensor or when the environmental temperatures are too similar to the person's temperature. As we see in Fig. 10, even though we can distinguish the person in the image, the chosen range of temperatures does not yield a positive detection. We could expand the detection range, but that would make the sensor too sensitive to other objects and/or environmental temperatures. Our measurements indicate that the system has a detection range of at least 8 m in non-optimal conditions. In perfect light and temperature conditions, the system's range increases even further.
Fig. 8. Thermal image in visible camera false positive example
Fig. 9. Example of false negative - visible light camera
184 H. Mendonça et al.
Fig. 10. Example of false negative - thermal camera
5 Conclusions and Future Work
The developed system presented a high accuracy rate when detecting people in different light and weather conditions, using both the thermal and visible light sensors as well as the PIR sensors. The data fusion algorithm allows the system to overcome situations where a given sensor would not be able to perform as expected. The PIR sensor, the simplest sensor used in our system, is good at detecting persons in situations where there is movement; however, this method fails when a person is present but not moving. In this situation, we use the thermal and visible light cameras to enhance the detection of people with the developed algorithms, using computer vision and temperature measurements. The thermal detection algorithm is susceptible to failure in situations where the environment temperature is similar to the human body temperature or when there are objects with similar temperatures. The visible light detection algorithm shows a lower performance in low light conditions and is not able to distinguish a real person from a picture of a human. Even though these two sensors are more capable than the PIR sensor, they all work together to improve the system performance, as shown in our tests. Since the cameras are spinning around the central axis of the system, they sometimes fail to detect persons moving in the same direction as the camera. In these situations, the system relies on the data of the PIR sensors to accomplish the detection, even if it is not as reliable as the camera data. Further work will consist of implementing the developed system on a disinfection AGV to test its reliability in a real-world scenario.
The thermal detection algorithm can be improved by training a neural network
to identify human bodies in a frame. This can be done in two ways: one is
overlaying the RGB and the thermal image to create a new image in which each
pixel contains colour plus temperature information (RGB-T). The other is using
the thermal images to create a dataset to train the network, the same approach
used in the visible light detection algorithm.
Acknowledgements. This work has been supported by FCT - Fundação para a
Ciência e Tecnologia within the Project Scope: UIDB/05757/2020.
Multiple Mobile Robots Scheduling Based
on Simulated Annealing Algorithm
Diogo Matos1(B), Pedro Costa1,2, José Lima1,3, and António Valente1,4
1 INESC-TEC - INESC Technology and Science, Porto, Portugal
diogo.m.matos@inesctec.pt
2 Faculty of Engineering, University of Porto, Porto, Portugal
pedrogc@fe.up.pt
3 Research Centre of Digitalization and Intelligent Robotics, Instituto Politécnico de Bragança, Bragança, Portugal
jllima@ipb.pt
4 Engineering Department, School of Sciences and Technology, UTAD, Vila Real, Portugal
avalente@utad.pt
Abstract. Task scheduling is an integral topic in the efficiency of multiple
mobile robot systems and is a key part of most modern manufacturing systems.
Advances in the field of combinatorial optimisation have allowed the
implementation of algorithms capable of solving the different variants of the
vehicle routing problem with respect to different objectives. However, few of
these approaches are capable of taking into account the nuances associated
with coordinated path planning in multi-AGV systems. This paper presents a
new study on the implementation of the Simulated Annealing algorithm to
minimise the time and distance cost of executing a task set while taking into
account possible pathing conflicts that may occur during the execution of the
referred tasks. This implementation uses an estimation of the planned paths for
the robots, provided by the Time Enhanced A* (TEA*), to determine where
possible pathing conflicts occur, and uses the Simulated Annealing algorithm
to optimise the attribution of tasks to each robot in order to minimise the
pathing conflicts. Results are presented that validate the efficiency of this
algorithm and compare it to an approach that does not take into account the
estimation of the robots' paths.
Keywords: Task scheduling · Multiple AGV · Time Enhanced A*
1 Introduction
Automated Guided Vehicles (AGV) have been increasingly adopted, not only
by industry for moving products on the shop floor, but also in hospitals and
distribution centres, for example. They can be an important contribution to the
competitiveness of a company, since 24/7 operation can be adopted, as well as a
reduction of costs.
When the solution points to more than one AGV to transport the material,
a new problem must be tackled: the scheduling of multiple AGV. The coordination
of a fleet of autonomous vehicles is a complex task, and currently most available
multi-robot systems rely on static and pre-configured movements [1].
The solution can be composed of a cooperative movement between all robots
that are part of the system while avoiding mutual blocking. In fact, this scheduling
problem for multiple robots can be compared to the well-known Travelling
Salesman Problem with multiple salesmen. However, the problem of task
scheduling in multiple-robot systems has the added complexity that the path of
one robot can interfere with the paths of the other robots, since they all must
share the same resources. Based on this, an optimisation methodology is proposed
in this paper to solve the multi-robot scheduling while taking into account the
impact of pathing conflicts on the overall efficiency of the system. This paper
addresses an optimisation approach based on the estimation of the robots' paths
via the TEA* algorithm and the subsequent optimisation of the scheduling of
multiple AGV via the Simulated Annealing algorithm.
1.1 Problem Description
The order in which the tasks are attributed to each of the agents of the system can
have a great impact on the overall efficiency of the system. Therefore, discovering
the ideal order of tasks to assign to each of the robots that are part of a multi-AGV
system is a very important task.
The main goal of this article is to create a module that is capable of optimising
the task order that will be assigned to each of the robots. This optimisation
process must be time sensitive, since it must generate a good solution in the time
interval it takes the control system of the multi-AGV fleet to execute its currently
assigned tasks. This module uses a path estimation generated by the TEA* path
planning algorithm to calculate the cost of each solution, taking into account the
possible resource conflicts.
Most task scheduling implementations do not take into account possible
pathing conflicts that can occur during the planning of the paths for the
robots that comprise the system. These conflicts normally cause unpredicted
increases in the time it takes the system to execute the tasks, since in order to
solve them one or more of the robots must either delay their movement or
use a less efficient path to reach the target. Therefore, the overall objective of this
optimisation is to make the system more efficient by using a path planning
algorithm in the optimisation phase. The path planning algorithm will generate an
estimation of the real paths that the robots will take. The main idea behind
this system is to use these estimations to calculate the value of each solution,
so that minimising this value is directly correlated with a decrease in the
number of pathing conflicts. In most real industrial applications of multi-AGV
systems there are some restrictions associated with how the transport of cargo
can be made; for example, some tasks can only be executed by certain
robots (due to the AGV characteristics) and some tasks have to be executed as
soon as possible, having priority over others; these restrictions were also taken into
account during the design of this module.
This article represents the first step towards developing a system capable of such
optimisation. It presents the results of a study of the performance of one of the
most commonly used optimisation methods when it is applied in the scenario
described above. This article also compares the performance of this method with
the results obtained from the minimisation of the travelled distance without
taking into consideration the path estimation. Since the purpose of this article is
just to study the performance of the Simulated Annealing algorithm when it is
applied to this case, the estimation of the AGV battery and the dynamic execution
of tasks were not taken into account; however, such features are planned for future
iterations of this work.
For the purposes of this study, a task is considered as the movement of the
robot between two workstations. This implies that the robot must first move from
its current position towards the starting workstation, in order to load the cargo,
and then drop it at the final workstation. To execute these movements the robot
must access shared resources; due to the physical dimensions of the robot and the
operation space, it is considered that only one robot can travel on each edge of
the graph at a time, in order to avoid possible deadlock situations.
These types of studies are of paramount importance, as a possible solution to
this optimisation problem may be based on a multi-method approach.
The paper is organised as follows. After this brief introduction, Sect. 2
presents some related work on the scheduling problem. Section 3 addresses the
system architecture, whereas Sect. 4 presents the optimisation methodology.
In Sect. 5, the results of the proposed method are presented, and Sect. 6 concludes
the paper and points out future work directions.
2 State of the Art
Task scheduling defines and assigns the sequence of operations to be done (while
optimising it), in order to execute a set of tasks. Examples of scheduling can be
found in a huge number of processes, such as air traffic control, airport
gate management, allocation of plant and machinery resources, and production
planning [2,4], among several others. Task scheduling can be applied to the
presented problem by assigning the tasks to each agent of a multi-robot system.
This problem can be considered a variant of the well-known Travelling Salesman
Problem (TSP) [3].
According to [12], algorithms between centralised and decoupled planning
belong to three problem classes: 1) coordination along fixed, independent paths;
2) coordination along independent roadmaps; 3) general, unconstrained motion
planning for multiple robots. Following 3), there are several approaches to solve
scheduling problems, from deterministic methods to heuristic procedures.
Regarding heuristic procedures there are several options, such as Artificial
Intelligence, Simulated Annealing (SA) and Evolutionary algorithms [5]. Classical
optimisation methods can also be applied to the scheduling problem, such as
the Genetic Algorithm (GA) or Tabu Search [6]. An enhanced Dijkstra algorithm
is applied to find the conflict-free routing task in the work proposed by [7].
The scheduling problem has been studied extensively in the literature. In
the robotics domain, when considering each robot as an agent, some of these
algorithms can be directly adapted. However, most of the existing algorithms
can only handle single-robot tasks, or multi-robot tasks that can be divided into
single-robot tasks. The authors of [8] propose a heuristic to address the multi-robot
task scheduling problem, but at the coalition level, which hides the details
of robot specifications.
The scheduling problem can be solved resorting to several approaches, such as
mixed integer programming, which is a standard approach in the scheduling
community [15], or resorting to optimisation techniques: Hybrid Genetic Algorithm-
Particle Swarm Optimisation (HGA-PSO) for multiple AGV [16], or finding the
near-optimum schedule for two AGVs based on the balanced workload and the
minimum travelling time for maximum utilisation, applying a Genetic Algorithm
and an Ant Colony Optimisation algorithm [14].
Other approaches provide a methodology for converting a set of robot trajectories
and supervision requests into a policy that guides the robots so that operators
can oversee critical sections of robot plans without being over-allocated,
as a human-machine-interface proposal that indicates how to decide the re-planning
[9]. Optimisation models can also help operators with scheduling [10].
A similar approach is presented in [11], where operator scheduling strategies
are used in supervisory control of multiple UAVs. Scheduling for heterogeneous
robots can also be found in the literature, but addressing two agents only [13].
Although there exist some algorithms that support complex cases, they do not
represent efficient and optimised solutions that can be adapted for several (more
than two) multi-robot systems in a convenient manner. This paper proposes a
methodology, based on the Simulated Annealing optimisation algorithm, to schedule
a fleet of four or eight robots, having in mind the simultaneous reduction of
the distance travelled by the robots and the time it takes to accomplish all the
assigned tasks.
3 System Architecture
The system used to test the performance of the optimisation methods, as shown
in Fig. 1, is comprised of two modules:
– The Optimisation module;
– The Path Estimation module.
The idea behind this study is, at a later date, to use its results to help design
and implement a task scheduling software that will work as an interface between
the ERP used for industrial management and the Multi AGV Control System.
The Optimisation module uses the implemented methodologies, described
in Sect. 4, to generate optimised sets of tasks for each of the robots. These sets are
then sent to the Path Estimation module, which estimates the paths that
Fig. 1. Top level view of the Control system.
each of the robots will have to take in order to execute these tasks. These paths
are planned using the TEA* path planning algorithm, therefore taking into
consideration possible resource conflicts that can occur. The Path Estimation
module is used to calculate the value of each of the task sets; this value
is later used by the Optimisation module to generate better task sets for the
robots. This cycle continues to be executed until the optimisation method
reaches its goal. In future work this cycle will be restricted to the time available
until the completion of the current tasks by the robots; however, since
the aim of this study was to analyse the performance of the simulated annealing
method when it is applied to this scenario, this restriction was not implemented.
4 Implemented Optimisation Methodology
In this study the Simulated Annealing (SA) metaheuristic was chosen in order to
demonstrate its results when it comes to optimising the solutions to the task
attribution problem. However, some small modifications needed to be made
to the classic version of this metaheuristic, in order to allow its application to
the proposed problem.
Most of these modifications were implemented with the objective of making
the solutions generated by the algorithm compatible with the restrictions that
most real-life multi-AGV systems are subject to. The three main restrictions
taken into consideration during this study were:
– The possibility of having different priorities for each of the tasks;
– The possibility of having tasks that can only be executed by specific robots;
– The existence of charging stations that the robots must move to when they
have no tasks assigned.
The implemented algorithm, as well as the modifications made to it, are
described in the following sections.
4.1 Calculating the Value of a Solution
In task scheduling for the multiple vehicle routing problem, the solutions can
be classified by two major parameters:
– The time it takes the robots to complete all of their tasks;
– The distance travelled by the robots during the execution of the referred tasks.
In order to classify and compare the solutions obtained by the implemented
optimisation methodologies, a mathematical formula was used to obtain a cost
value for each of the solutions. This formula uses as inputs a boolean that indicates
whether the TEA* is capable of planning a valid path for the robot (Vi), the
number of steps that comprise each robot's planned path (Si) and the total
distance that each robot is estimated to travel (Di). This formula is represented
below:
bellow:
NumberofRobots

i=0
(c1 ∗ V i + c2 ∗ Si + c3 ∗ Di)
Each of these inputs is multiplied by a distinct constant; these constants
define the weight that each input will have in the final result of the function.
For the purposes of this article, these values were set to $1 \times 10^{8}$, 5 and 1
for c1, c2 and c3, respectively. The intention behind these values is to force the
optimisation process to converge to solutions where the paths of all the robots
can be planned, by attributing a large constant cost to solutions that do not
generate a feasible path, and then to prioritise the paths that lead to the fastest
execution of all the tasks.
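For illustration, a minimal Python sketch of this cost function is given below. The class name, the field names and the exact encoding of Vi (taken here as 1 for an infeasible path, so that the large constant penalises infeasibility, consistent with the explanation above) are assumptions made for the sketch; the weights follow the values reported in the text.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class PathEstimate:
    """Hypothetical container for the TEA* estimate of one robot's path."""
    valid: bool       # True when TEA* found a feasible path for the robot
    steps: int        # number of temporal steps in the planned path (Si)
    distance: float   # estimated travelled distance (Di)

# Weights as reported in the text: c1 penalises infeasible paths heavily,
# c2 weighs the number of steps (time) and c3 the travelled distance.
C1, C2, C3 = 1e8, 5.0, 1.0

def solution_cost(estimates: List[PathEstimate]) -> float:
    """Cost of a task assignment, summed over all robots."""
    cost = 0.0
    for est in estimates:
        violation = 0.0 if est.valid else 1.0   # Vi assumed 1 only for infeasible paths
        cost += C1 * violation + C2 * est.steps + C3 * est.distance
    return cost
```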
It is of special importance, when using the TEA* algorithm for path planning,
that the optimisation process gives special attention to the overall time taken
by the system to execute the tasks. The TEA* algorithm has the possibility
of solving pathing conflicts by inserting wait conditions into the robots' paths;
therefore, in the paths generated by this algorithm there is not always a direct
relation between travelling less distance and finishing the tasks quicker.
There are some scenarios, as implied above, where the TEA* algorithm cannot
plan viable paths for all the robots; these scenarios are normally associated
with the fact that the maximum number of steps allocated in the algorithm
is reached before all the targets are reached. Such paths would be prohibitively
costly for the robots to execute, since the more steps a path has the more time
it takes to finish, and the computational cost of planning also increases.
4.2 Initial Clustering of the Tasks
Since Simulated Annealing is a metaheuristic that generates random changes
in the solution in an attempt to move it towards a global minimum, the results
obtained with this probabilistic technique tend to be dependent on the starting
solution. To generalise this methodology, and also to have a comparison point
between an optimised solution and one that was not exposed to any optimisation
methodology, the tasks were initially clustered using a k-means spectral clustering
algorithm. This algorithm was based on the
algorithm presented by [17,18] to solve the traditional Multiple Travelling Sales-
man Problem (mTSP).
This initial clustering uses a matrix with the distance between the final point
of each task and the beginning point of the other tasks, in order to create a
number of clusters equal to the number of robots available in the system. Due
to the fact that the beginning point and final point of each task are usually
different, the distance matrix computed is not symmetric. This asymmetry serves
to model the fact that the order in which the tasks are executed by each robot
is not commutative and does have a large impact on the value of the solution.
These clusters were then assigned to the robot that was closest to the start
point of each cluster. This solution will serve as the baseline for the optimisation
executed by the Simulated Annealing metaheuristic.
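A minimal sketch of this kind of clustering, under stated assumptions, is shown below: the Euclidean stand-in for the travel distance, the Gaussian conversion from distances to similarities, the symmetrisation of the affinity matrix and the scikit-learn SpectralClustering call are all illustrative choices, not the authors' exact implementation.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Each task is assumed to be a pair (start_point, end_point) in map coordinates.
tasks = [((0.0, 0.0), (10.0, 5.0)),
         ((12.0, 6.0), (2.0, 1.0)),
         ((3.0, 8.0), (9.0, 9.0)),
         ((8.0, 2.0), (1.0, 7.0))]
n_robots = 2

def euclidean(a, b):
    return float(np.hypot(a[0] - b[0], a[1] - b[1]))

# Asymmetric distance matrix: D[i, j] is the distance from the end point of
# task i to the start point of task j (i.e. executing j right after i).
n = len(tasks)
D = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        D[i, j] = euclidean(tasks[i][1], tasks[j][0])

# Spectral clustering expects a similarity matrix; converting distances with a
# Gaussian kernel and symmetrising by averaging is a common simplification.
sigma = D.mean() if D.mean() > 0 else 1.0
A = np.exp(-D / sigma)
A = 0.5 * (A + A.T)

labels = SpectralClustering(n_clusters=n_robots,
                            affinity="precomputed",
                            random_state=0).fit_predict(A)
print(labels)   # cluster index (candidate robot group) for each task
```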
4.3 Simulated Annealing Algorithm
The implementation of the Simulated Annealing followed an approach that is
very similar to the generalised version of the algorithm found in the literature.
This implementation is shown in Fig. 2 via a flowchart.
Fig. 2. Flowchart representation of the implemented simulated annealing algorithm
The Simulated Annealing algorithm implemented uses the same cooling function
as the classical implementation of this algorithm. It uses an initial temperature
of 100° and a cooling rate of 0.05. If given an infinite time window to execute,
this algorithm will deliver a solution once the final temperature reaches 0.01°.
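The main loop can be sketched as below. The text gives the initial temperature, the cooling rate and the stopping temperature, but not the exact update rule, so a geometric cooling schedule is assumed; `cost` and `random_change` stand for the solution evaluation and perturbation routines described in this section and are placeholders.

```python
import copy
import math
import random

def simulated_annealing(initial_solution, cost, random_change,
                        t_init=100.0, cooling_rate=0.05, t_final=0.01):
    """Generic SA loop; cost() maps a solution to a float and random_change()
    returns a perturbed copy of a solution (e.g. swapping two tasks)."""
    current = copy.deepcopy(initial_solution)
    current_cost = cost(current)
    best, best_cost = copy.deepcopy(current), current_cost
    temperature = t_init

    while temperature > t_final:
        candidate = random_change(current)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Accept improvements always, worse solutions with Boltzmann probability.
        if delta < 0 or random.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = copy.deepcopy(current), current_cost
        # Geometric cooling; the "rate of cooling of 0.05" is assumed to mean
        # that the temperature loses 5% per iteration.
        temperature *= (1.0 - cooling_rate)

    return best, best_cost
```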
Dealing with Tasks with Different Priority Levels
As stated previously, several modifications needed to be made to the
classic approach of this algorithm, in order to deal with the restrictions
imposed by real industrial environments.
In order to allow the implemented Simulated Annealing algorithm to generate
solutions that respect the priorities of the tasks, the optimisation problem
was divided into several smaller problems. Each of these smaller problems
represents the optimisation of one of the priority groups. The algorithm generates
solutions to each of these problems, starting with the group that has the highest
priority and descending priority-wise. After generating a solution for one of these
problems, the system calculates the final position of each of the robots; this
position is characterised not only by the physical location of the robot but also
by the estimated time it will take the robot to finish its tasks. Afterwards,
this position is used as the initial position of the robots for the next priority
group. This allows the algorithm to take into consideration that the solutions
found for a priority group will also affect the priority groups with lower priorities.
This methodology is represented in Fig. 3 via a flowchart.
Fig. 3. Flowchart representation of an implementation of a simulated annealing algo-
rithm while taking into account the possibility of existing tasks with different priorities.
The TEA* algorithm used to estimate the paths for the robots will also mark
the nodes belonging to the paths estimated for higher priority tasks as occupied
in the corresponding step. This modification, in conjunction with the
updating of the robots' initial positions, allows the algorithm to divide the path
planning problem into smaller chunks without sacrificing the effect that higher
priority solutions may have on the paths estimated for lower priority ones.
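This decomposition by priority can be outlined as in the sketch below. The helpers `optimise_group` (the SA step described above) and `simulate_final_poses` (which queries the TEA* estimate for where and when each robot finishes its group) are hypothetical names, and a higher numeric priority is assumed to mean a more urgent group.

```python
def schedule_by_priority(tasks_by_priority, robot_poses,
                         optimise_group, simulate_final_poses):
    """Optimise one priority group at a time, highest priority first.

    tasks_by_priority: dict mapping a priority level to its list of tasks.
    robot_poses: current (position, finish time) of each robot.
    """
    full_plan = {}
    for priority in sorted(tasks_by_priority, reverse=True):
        group = tasks_by_priority[priority]
        # SA over this group only; the paths already planned for higher
        # priority groups act as occupied nodes in the TEA* layers.
        assignment = optimise_group(group, robot_poses)
        for robot_id, task_list in assignment.items():
            full_plan.setdefault(robot_id, []).extend(task_list)
        # Each robot's final position and finish time seed the next group.
        robot_poses = simulate_final_poses(assignment, robot_poses)
    return full_plan
```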
Robot Exclusivity Problem
Another restriction that the system will have to face is the fact that some
tasks might only be executable by a specific robot or group of robots.
This restriction implies that the changes made to the solution by the Simulated
Annealing algorithm cannot be truly random. To respect this restriction, another
modification must be added to the classical implementation of the Simulated
Annealing algorithm.
To prevent the attribution of tasks to robots that are not capable of executing
them, a taboo table was implemented. This taboo table holds information on
which tasks have this restriction and to which robots those tasks can be assigned.
By checking the generated random change against this table, it is possible to
determine whether a change to the solution can be accepted, before calculating
the value associated with that change. The use of a taboo table prevents the
algorithm from spending time calculating the value of solutions that are impossible
for the robots to execute. The functioning of this taboo table is shown via a
flowchart in Fig. 4.
The implemented random change algorithm executes two different types of
changes. It either swaps the positions of two different tasks (this was the example
used to show the functioning of the taboo table in Fig. 4), or it changes the
position of one randomly selected task. In the latter case, if the task is affected
by this restriction, the algorithm uses the taboo table to randomly select a new
position, from the group of positions that respect the restriction associated with
the referred task, to attribute to it.
In the case where there are no other valid candidates, either other tasks
or other positions that respect the restriction, to exchange the selected task
with, the algorithm randomly selects another task to change and repeats the
process.
This taboo table is also used in the initial clustering, in order to guarantee
that the initial clustering of the tasks also respects this restriction. The taboo
table is analysed when assigning a task to a cluster: tasks that are subject
to this restriction can only be inserted into a cluster assigned to a robot that is
capable of executing said tasks. All sets of tasks have approximately 10%
of their total tasks affected by robot restrictions and are divided into three
different priority levels (Table 1).
Fig. 4. Flowchart representation of the modifications implemented in the process of
generating a random change in the task set.
Table 1. Example of a Taboo Table used to solve the robot exclusivity problem.
Task ID Possible robot ID
1 2
1 4
2 2
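A minimal sketch of such a taboo-table check, mirroring Table 1, is shown below; the dictionary layout and the function name are assumptions made for illustration.

```python
# Taboo table mirroring Table 1: task id -> set of robot ids allowed to
# execute it. Tasks absent from the table can be executed by any robot.
taboo_table = {1: {2, 4}, 2: {2}}

def is_change_allowed(task_id, target_robot_id, table=taboo_table):
    """Accept a random change only if the task may run on the target robot."""
    allowed = table.get(task_id)
    return allowed is None or target_robot_id in allowed

# Example: moving task 1 onto robot 3 is rejected before its cost is ever
# evaluated, saving a call to the TEA* path estimation.
print(is_change_allowed(1, 3))   # False
print(is_change_allowed(1, 2))   # True
print(is_change_allowed(5, 1))   # True (unrestricted task)
```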
Dealing with Idle Robots
In real multi-AGV implementations it is necessary to have specific spots where
the robots can recharge; these spots are usually located in places where they will
not affect the normal functioning of the system.
This restriction was also taken into consideration by the implemented algorithm.
At the end of each task group assigned to a robot, the task scheduler
module sends the robot to its charging station. This action is considered
as an extra task and is taken into consideration when the algorithm is estimating
the cost of each solution, since even though the robot has finished its tasks it can
still interfere with the normal functioning of the other robots and cause pathing
conflicts.
To prevent the robots from moving to their charging stations before finishing
all their tasks, this task is permanently fixed as the last task to be executed
by each of the robots. It is also outside the scope of the normal random
change process that occurs in the Simulated Annealing algorithm, therefore
preventing any change induced by the algorithm into the task set from altering
its position in the task set.
5 Tests and Results
In this section, the test descriptions and the results obtained during this study
are presented. As stated before, the objective of this study is to determine the
efficiency and suitability of the Simulated Annealing algorithm when it is used
to optimise the task scheduling of a multi-AGV system.
The Simulated Annealing algorithm will be analysed based on two key factors:
the overall cost of the final path generated by the algorithm and the time it took
the algorithm to reach said solution. The performance of this algorithm will be
compared to the case where the tasks are randomly distributed and to the case
where the tasks are optimised to reduce the distance travelled between each task
without taking into account possible path planning conflicts.
5.1 Test Parameters and Environment
In order to test the efficiency of the different methodologies, the map used in
[19] was chosen as the test environment. This map corresponds to a real-life
industrial implementation of a multi-AGV system and measures 110 × 80 m.
It is represented in Fig. 5.
Fig. 5. Map of the environment used for the tests [19]. The possible paths that the
robots can use are represented in the figure via the red coloured lines.
Using this map, a graph was generated that represents all the possible trajectories
that the robots can take. This graph is composed of 253 vertexes and 279 edges;
of the 253 vertexes, 41 are either workstations or robot charging stations.
Two different versions of the system were tested: one version uses only
four robots while the other uses eight. Each version of the system was tested
with different sets of tasks; these sets vary not only in the number of tasks but
also in the restrictions associated with each one.
These tests were executed on a laptop with an Intel Core i7-6820HK 2.70 GHz
CPU, 32 GB of RAM and an NVIDIA GeForce GTX 980M graphics card.
For the purpose of these tests, a task is characterised by the movement of
the robot to the “Starting Point” of said task and then by the movement of the
robot from the “Starting Point” to the “End Point”. In this study it is assumed that
the loading and unloading of the robot happens instantly; in future iterations of
this work this topic will be tackled in detail.
5.2 Four Robots and Twenty Tasks
The first scenario to be tested was the four-robot scenario; in this test, 20 tasks
were assigned to the system, three of which can only be assigned to a specific
robot. After the test was executed 16 times (enough to obtain a trustworthy
average), the results were analysed and are summarised in Table 2.
Table 2. Table with the summarised results from the first scenario
Method Best cost Worst cost Mean cost Mean efficiency (SA)
SA 13030,4 14089,7 13574,6 ———-
Kmeans 16691,9 17693,2 17139,5 20,79%
Random 16498,5 16925,7 16739,3 18,91%
As expected, the implemented Simulated Annealing algorithm generated
an average improvement of 20,79% when compared to the Kmeans clustering
method and an 18,91% improvement when compared to a totally random
distribution of tasks. The results of the random distribution method were surprising,
although they do show that, even with only four robots, when the distribution is
optimised considering only the shortest distance between tasks it tends to create
more planning conflicts and consequently more costly paths: even though the
overall distance travelled by the robots tends to be minimised, the robots tend
to take longer to execute their tasks due to the necessity of waiting for the
shared resources to be freed. All three attribution methods tested were capable
of generating solutions that respected all the system restrictions. In this scenario
the Simulated Annealing took on average 442,5 s (slightly less than seven and a
half minutes) to generate a solution.
The Simulated Annealing algorithm was also tested in terms of its convergence
rate when applied to this scenario; the results of this test are represented in
Fig. 6. These results were generated by changing the cooling rate value in order
to increase or decrease the time it took the algorithm to generate a viable solution.
From the results of this test it is possible to conclude that a time interval
greater than 400 s is more than adequate for this scenario.
Fig. 6. Graph representation of the results obtained from the convergence rate tests
5.3 Eight Robots and Forty Tasks
The second scenario was the eight-robot scenario; in this test, 40 tasks were
assigned to the system, four of which can only be assigned to a specific robot.
Similarly to the previous one, this test was also executed 16 times to ensure the
consistency of the results. The results were analysed and are presented in
Table 3.
Table 3. Table with the summarised results from the second scenario
Method Best cost Worst cost Mean cost Mean efficiency (SA)
SA 14112 15642,3 14743,3 ———-
Kmeans 18303,1 19810,9 19072,7 22,68%
Random 18575,6 19424,1 18894,9 21,97%
In this scenario the Simulated Annealing algorithm took on average 881,4 s
(around fifteen minutes), roughly double the time taken in the previous
test; this was expected since the complexity of the problem was also significantly
increased in this scenario. The overall improvement added by the Simulated
Annealing algorithm to the solution also increased, which was also expected
given the higher probability of pathing conflicts occurring in this scenario.
Fig. 7. Graph representation of the results obtained from the convergence rate tests in
the eight robot scenario
Similarly to the previous scenario, the convergence rate of the Simulated
Annealing algorithm was also analysed; the results of this analysis are represented
in Fig. 7.
In this scenario the algorithm requires a larger time interval until the solution
cost reaches a stable value. Given the greater complexity of this scenario, this
increase was expected. Therefore, in order to obtain good results when using this
algorithm in an eight-robot system, the time interval available for the algorithm
to optimise the task attribution should be larger than 1000 s.
Analysing the graphs in Figs. 6 and 7, it is possible to ascertain that the
convergence time does not behave in a linear fashion when the problem is scaled.
6 Conclusions
In this work, a study was presented on the implementation of the Simulated
Annealing algorithm to optimise the assignment of tasks in a multi-AGV system.
This implementation uses an estimate of the paths planned for the robots,
provided by the Time Enhanced A* (TEA*), to minimise the time and distance
cost of performing the tasks assigned to the system, taking into account possible
path conflicts that may occur during task execution. This article also compares
the performance of this method with the results obtained from the minimisation
of the travelled distance without taking into consideration the path estimation.
The results show, in the two scenarios implemented (four robots with twenty
tasks, three of which are restricted; eight robots with forty tasks, four of which
are restricted), that the implemented Simulated Annealing algorithm achieves
an average improvement of at least 20.79% (22.68% in the eight-robot scenario)
when compared to kmeans, and an average improvement of at least 18.91%
(21.97% in the eight-robot scenario) when compared with a totally random
algorithm.
In this work, the loading and unloading of the robot is assumed to happen
instantly; in future work this topic will be addressed in more detail. In future
iterations of this work the efficiency of other optimisation methodologies will also
be studied. These studies will be used as the foundation for the creation of a task
scheduling module capable of optimising the task attribution while being sensitive
to the available time.
Acknowledgements. This work is financed by National Funds through the Por-
tuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia within project
UIDB/50014/2020.
References
1. Siefke, L., Sommer, V., Wudka, B., Thomas, C.: Robotic systems of systems based
on a decentralized service oriented architecture. Robotics 9, 78 (2020). https://doi.
org/10.3390/robotics9040078
2. Kuhn, K., Loth, S.: Airport Service Vehicle Scheduling, Eighth USA/Europe Air
Traffic Management Research and Development Seminar (ATM2009) (2009)
3. Lawler, E.L., Lenstra, J.K., Rinnooy Kan, A.H.G., Shmoys, D.B.: The Travelling
Salesman Problem. Wiley, Chichester (1985)
4. Lindholm, P., Giselsson, N.-H., Quttineh, H., Lidestam, C., Johnsson, C., Fors-
man, K.: Production scheduling in the process industry. In: 22nd International
Conference on Production Research (2013)
5. Wall, M.B.: Genetic Algorithm for Resource-Constrained Scheduling. Ph.D., Mas-
sachusetts Institute of Technology (1996)
6. Baar, T., Brucker, P., Knust, S.: Meta-heuristics: Advances and Trends in Local
Search Paradigms for Optimisation, vol. 18 (1998)
7. Vivaldini, K., Rocha, L.F., Martarelli, N.J., et al.: Integrated tasks assignment and
routing for the estimation of the optimal number of AGVS. Int. J. Adv. Manuf.
Technol. 82, 719–736 (2016). https://doi.org/10.1007/s00170-015-7343-4
8. Zhang, Y., Parker, L.E.: Multi-robot task scheduling. In: 2013 IEEE International
Conference on Robotics and Automation, Karlsruhe, Germany, pp. 2992–2998
(2013). https://doi.org/10.1109/ICRA.2013.6630992
9. Zanlongo, S.A., Abodo, F., Long, P., Padir, T., Bobadilla, L.: Multi-robot schedul-
ing and path-planning for non-overlapping operator attention. In: 2018 Second
IEEE International Conference on Robotic Computing (IRC), Laguna Hills, CA,
USA, pp. 87–94 (2018). https://doi.org/10.1109/IRC.2018.00021
10. Crandall, J.W., Cummings, M.L., Della Penna, M., de Jong, P.M.A.: Computing
the effects of operator attention allocation in human control of multiple robots.
IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 41(3), 385–397 (2011). https://
doi.org/10.1109/TSMCA.2010.2084082
11. Cummings, M.L., Mitchell, P.J.: Operator scheduling strategies in supervisory con-
trol of multiple UAVs. Aerospace Sci. Technol. 11(4), 339–348 (2007)
12. LaValle, S.M., Hutchinson, S.A.: Optimal motion planning for multiple robots hav-
ing independent goals. IEEE Trans. Robot. Autom. 14(6), 912–925 (1998). https://
doi.org/10.1109/70.736775
13. Wang, H., Chen, W., Wang, J.: Coupled task scheduling for heterogeneous multi-
robot system of two robot types performing complex-schedule order fulfillment
tasks. Robot. Auton. Syst. 131, 103560 (2020)
14. Udhayakumar, P., Kumanan, S.: Task scheduling of AGV in FMS using non-traditional
optimization techniques. Int. J. Simul. Model. 9(1), 28–39 (2010)
15. Mudrova, L., Hawes, N.: Task scheduling for mobile robots using interval alge-
bra. In: 2015 IEEE International Conference on Robotics and Automation
(ICRA), Seattle, WA, USA, pp. 383–388 (2015). https://doi.org/10.1109/ICRA.
2015.7139027
16. Zhong, M., Yang, Y., Dessouky, Y., Postolache, O.: Multi-AGV scheduling for
conflict-free path planning in automated container terminals. Comput. Ind. Eng.
142, 106371 (2020)
17. Zelnik-Manor, L., Perona, P.: Self-tuning spectral clustering. In: Advances in
Neural Information Processing Systems 17 (NIPS 2004)
18. Rani, S., Kholidah, K.N., Huda, S.N.: A development of travel itinerary planning
application using traveling salesman problem and k-means clustering approach. In:
Proceedings of the 2018 7th International Conference on Software and Computer
Applications (ICSCA 2018), New York, NY, USA, pp. 327–331. Association for
Computing Machinery (2018). https://doi.org/10.1145/3185089.3185142
19. Olmi, R., Secchi, C., Fantuzzi, C.: Coordination of multiple AGVs in an industrial
application, pp. 1916–1921 (2008). https://doi.org/10.1109/ROBOT.2008.4543487
Multi AGV Industrial Supervisory
System
Ana Cruz1(B), Diogo Matos2, José Lima2,3, Paulo Costa1,2, and Pedro Costa1,2
1 Faculty of Engineering, University of Porto, Porto, Portugal
up201606324@edu.fe.up.pt, {paco,pedrogc}@fe.up.pt
2 INESC-TEC - INESC Technology and Science, Porto, Portugal
diogo.m.matos@inesctec.pt
3 Research Centre of Digitalization and Intelligent Robotics, Instituto Politécnico de Bragança, Bragança, Portugal
jllima@ipb.pt
Abstract. Automated guided vehicles (AGV) represent a key element in
industries' intralogistics and the use of AGV fleets brings multiple advantages.
Nevertheless, coordinating a fleet of AGV is already a complex task,
and when exposed to delays in the trajectory and communication faults it
can represent a threat, compromising the safety, productivity and efficiency
of these systems. Concerning this matter, trajectory planning algorithms
allied with supervisory systems have been studied and developed. Based on
work developed previously, this article aims to implement and test a Multi
AGV Supervisory System on real robots and analyse how the system responds
to the dynamics of a real environment, analysing its intervention, what
influences it and how the execution time is affected.
Keywords: Multi AGV coordination · Time enhanced A* · Real
implementation
1 Introduction
Automated guided vehicle systems (AGVS), a flexible automatic means of
conveyance, have become a key element in industries' intralogistics, with more
industries making use of AGVS since the mid-1990s [13]. These systems can assist
in moving and transporting items in manufacturing facilities, warehouses, and
distribution centres without any permanent conveying system or manual intervention.
They follow configurable guide paths for the optimisation of storage, picking, and
transport in environments where space is at a premium [1].
The use of AGV represents a significant reduction of labour cost, an increase
in safety and the sought-after increase in efficiency. By replacing a human worker
with an AGV, a single expense for the equipment is paid (the initial investment),
avoiding the ongoing costs that would come with a new hire [2]. These systems
are also programmed to take over repetitive and fatiguing tasks that could
diminish the human worker's attention and lead to possible accidents. Therefore,
industries that use AGV can significantly reduce these accidents and, with
24 hours per day, 7 days per week operability and high production output, AGV
ensure worker safety while maximizing production [3].
Multiple robot systems can accomplish tasks that no single robot can accomplish,
since a single robot, no matter how capable it is, is spatially limited [5].
Nevertheless, a multi-AGV environment requires special attention. Coordinating
a fleet of AGV is already a complex task, and restricted environments, with the
possibility of exposing the AGV to delays in the trajectory and even communication
faults, can represent a threat, compromising the safety, productivity and
efficiency of these systems. To solve this, trajectory planning algorithms allied
with supervisory systems have been studied and developed. Based on the work
developed by [8], it is intended to implement and test a Multi AGV Supervisory
System on real robots and analyse how the system responds to the dynamics of
a real environment.
This paper aims to analyse the intervention of the implemented Supervisory
Module, what influences it and how the execution time is affected, considering
other variables such as the robots' velocity.
2 Literature Review
The system performance is highly connected with path planning and trajectory
planning. Path planning describes geometrically and mathematically the way
from the starting point to the destination point, avoiding collisions with obstacles.
On the other hand, trajectory planning is that path as a function of time:
for each time instance, it defines where the robot must be positioned [7].
When choosing a path planning method, several aspects have to be considered,
such as the type of intended optimisation (path length, execution time),
the computational complexity and even whether the method is complete (it always
finds a solution when one exists), resolution complete (there is a solution for a
particular discretisation of the environment) or probabilistically complete (the
probability of finding a solution converges to 1 as time tends to infinity) [7].
When it comes to coordinating a multi-AGV environment, the Time
Enhanced A* (TEA*) Algorithm revealed itself to be well suited, as it is described
as a multi-mission algorithm that incrementally builds the path of each vehicle
considering the movements of the others [10].
2.1 TEA* Algorithm
The TEA* Algorithm is a graph search algorithm, based on the A* algorithm,
developed by [10]. This approach aims to fulfil industrial needs by creating routes
that minimise the time allocated for each task, avoid collisions between AGV
and prevent the occurrence of deadlocks.
Considering a graph G with a set of vertexes V and edges E (links between
the vertexes), with a representation of the time k = [0; TMax], each AGV can
Fig. 1. TEA* algorithm: input map and analysed neighbour cells focusing on the cell
with the AGV’s position [11]
only start and stop in vertexes, and a vertex can only be occupied by one vehicle
at a given temporal layer [11]. As can be seen in Fig. 1, the analysed neighbour cells
belong to the next temporal layer and include the cell containing the AGV's
current position.
To find the minimal path, the algorithm starts by calculating the position
of each robot in each temporal layer, and the choice of the next analysed neighbour
cell depends on a cost function. In the TEA* approach, the heuristic function
used is the Euclidean distance. Hence, possible future collisions can be
identified and avoided from the beginning (k = 0) of the paths' calculation [10].
Since the coordination between robots is essential to avoid collisions and to
guarantee the correct execution of the missions, this approach ensures it, since
the previously calculated paths become moving obstacles [10].
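To make the temporal-layer idea concrete, the sketch below shows one simplified expansion step; the adjacency-list graph, the vertex coordinates and the `occupied` set of (vertex, layer) pairs built from already-planned paths are assumptions, and the code is an illustration of the mechanism rather than the TEA* implementation of [10].

```python
import math

def euclidean(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def expand(vertex, layer, graph, coords, goal, occupied):
    """Candidate moves for the next temporal layer (layer + 1).

    graph: dict vertex -> list of adjacent vertices
    coords: dict vertex -> (x, y) position of the vertex
    occupied: set of (vertex, layer) pairs reserved by already-planned robots
    Returns (neighbour, heuristic) pairs, including waiting in place, while
    skipping any vertex reserved at the next layer (a moving obstacle).
    """
    candidates = []
    for nxt in graph[vertex] + [vertex]:          # neighbours plus the "wait" move
        if (nxt, layer + 1) in occupied:
            continue                               # occupied at the next layer: skip
        h = euclidean(coords[nxt], coords[goal])   # Euclidean distance heuristic
        candidates.append((nxt, h))
    return candidates
```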
Combining this approach with a supervisory system revealed to be an even more
efficient way to avoid collisions and deadlocks [8]. Instead of executing
the TEA* methodology as an online method (the paths re-calculated every
execution cycle, a computationally heavy procedure), the supervisory system
detects delays in the communication, deviations in the routes of the robots
and communication faults, and triggers the re-calculation of the paths if needed.
Therefore, [8] proposed a supervisory system consisting of two modules: the
Planning Supervision Sub-Module and the Communication Supervision Sub-Module.
3 System Overall Structure
To implement and test the work developed by [8], and further modifications, a
shop floor map was set up and a fleet of three robots was developed, together
with their control module, a localisation system based on the pose estimation of
fiducial markers, and a central control module.
The architecture of the developed system is represented in Fig. 2.
The Central Control Module of the system is composed of three hierarchical
modules: the Path Planning Module (TEA* Algorithm) and the two parts of the
Supervisory Module, the Planning Supervision and the Communication Supervision.
Fig. 2. System architecture and relations between the different modules
The Robot Localisation Module locates and estimates the robots' coordinates
and communicates simultaneously with the Central Control Module and the
Robot Control Module.
Lastly, the Robot Control Module is responsible for calculating and delivering,
through UDP packets, the most suitable velocities for each robot's wheels,
depending on the destination point/node indicated by the Central Control Module.
The communication protocol used, UDP (User Datagram Protocol), is
responsible for establishing low-latency and loss-tolerating connections between
applications on the internet. Because it enables the transfer of data before an
agreement is provided by the receiving party, the transmissions are faster. As a
consequence, UDP is beneficial in time-sensitive communications [12].
3.1 Robot Localisation Module
To run the system, the current coordinates, X and Y, and orientation, Theta,
of the robots were crucial. Suitable approaches would be odometry or even an
Extended Kalman Filter localisation using beacons [4]. However, for pose estimation,
the Robot Localisation Module applies a computer vision algorithm. For
this purpose, a camera was set above the centre of the shop floor map and the
robots were identified with ArUco markers [6,9], as represented in Fig. 3.
The ArUco marker is a square marker characterised by a wide black border
and an inner binary matrix that determines its identifier. Using the ArUco library
developed by [9] and [6] and associating an ArUco tag id with each robot, it was
possible to obtain the coordinates of the robots' positions with less than 2 cm
of error. However, the vision algorithm revealed itself to be very sensitive to poor
Fig. 3. Robot localisation module: computer vision algorithm for pose estimation
through the detection and identification of ArUco markers
illumination conditions and unstable under non-homogeneous lighting. The
size of the tag used was 15 by 15 cm with a white border of 3.5 cm.
The Robot Localisation Module was able to communicate with the Central
Control Module and the Robot Control Module through UDP messages. Each
UDP message carried the positions of all robots detected and one message was
sent every 170 ms (camera frame rate plus vision algorithm processing time).
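A minimal sketch of this kind of marker-based pose estimation is shown below. It assumes OpenCV 4.7 or later (older versions expose a slightly different ArUco API), and the dictionary choice and pixel-to-centimetre scale are placeholders, not the values used by the authors.

```python
import cv2
import numpy as np

# Assumed OpenCV >= 4.7 ArUco API; the dictionary is a placeholder choice.
dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())

PIXELS_PER_CM = 5.0   # placeholder scale between image pixels and floor centimetres

def robot_poses(frame):
    """Return {marker_id: (x_cm, y_cm, theta_rad)} for every detected marker."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    corners, ids, _ = detector.detectMarkers(gray)
    poses = {}
    if ids is None:
        return poses
    for marker_corners, marker_id in zip(corners, ids.flatten()):
        pts = marker_corners.reshape(4, 2)          # corner order: TL, TR, BR, BL
        centre = pts.mean(axis=0)
        direction = pts[1] - pts[0]                 # orientation from the top edge
        theta = float(np.arctan2(direction[1], direction[0]))
        poses[int(marker_id)] = (centre[0] / PIXELS_PER_CM,
                                 centre[1] / PIXELS_PER_CM,
                                 theta)
    return poses
```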
3.2 Robot Control Module
The communication between the Central Control Module, constituted by the
Path Planning Module (TEA* Algorithm) and the Supervisory Module, and the
Robot Control Module is established through UDP messages. After the paths
are planned, the information is sent to the Robot Control Module through UDP
packets. Each packet has the structure represented in Fig. 4.
Fig. 4. Communication between central control module and robot control module:
UDP packet structure
The heading of the packet, highlighted in yellow, carries the robot id (N),
the priority of the robot (P) and the number of steps that the packet contains
(S). Consequently, the body of the packet, highlighted in green, contains the
information of the S steps. This information is organised by the number of the
step (I), the coordinates X and Y and the direction D (which is translated into
an angle in radians inside the trajectory control function). The ending of the
packet, highlighted in blue, indicates whether the robot has reached the destination
and carries a termination character. The indicator T0 or T1 informs whether the
robot's current position is the final destination (T1 if yes, T0 if not), as a form
of reassurance. When a robot finishes its mission, the UDP packet sent does
not contain any steps (S0) and therefore does not carry any information in its
body. In this way, the indicator T0 or T1 assures the ending of the mission.
Lastly, the character F is the termination character chosen to indicate the end
of the packet.
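To make the packet layout concrete, the sketch below serialises the fields described above into a UDP message. Only the field order follows the paper; the exact byte layout, the delimiters, the helper name and the example address and port are assumptions.

```python
import socket

def build_packet(robot_id, priority, steps, finished):
    """Serialise a path segment following the N/P/S ... T/F layout above.

    steps: list of (index, x, y, direction) tuples; the textual encoding and
    delimiters are an assumption made for this sketch.
    """
    header = f"N{robot_id}P{priority}S{len(steps)}"
    body = "".join(f"I{i}X{x:.2f}Y{y:.2f}D{d}" for i, x, y, d in steps)
    tail = f"T{1 if finished else 0}F"
    return (header + body + tail).encode("ascii")

# Example: send two steps of a planned path to a hypothetical robot endpoint.
packet = build_packet(robot_id=1, priority=0,
                      steps=[(0, 1.20, 0.45, 2), (1, 1.35, 0.45, 2)],
                      finished=False)
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(packet, ("192.168.1.50", 5000))   # placeholder address and port
```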
It is important to note that, even if the communication between the Central
Control Module and the Robot Control Module is shut down, the robots are
capable of continuing their paths until reaching the last step of the last UDP
packet received. This property is important when communication faults may
exist; thus, the mission is not compromised if the Central Control Module loses
communication with the robot.
After decoding the received packet, the robot's control is done through the
calculation of the most suitable velocity for each wheel, given each step's X and
Y coordinates and direction/Theta angle. That operation is done through the
function represented by the diagram in Fig. 5.
(Figure 5 depicts a state machine with the states Rotate, Go_Foward, De_Accel, Final_Rot and Stop; the transitions between states are governed by thresholds on the heading errors erro_theta and erro_theta_f and on the distance error erro_dist.)
Fig. 5. Robot trajectory control: diagram of the velocity control function
In each stage of the diagram, the linear and angular velocities are calculated
accordingly. The velocity for each wheel/motor is calculated through Eqs. 1 and
2, where b represents the distance between both wheels.
$$M0_{\text{Speed}} := \text{linear velocity} + \frac{\text{angular velocity} \cdot b}{2} \quad (1)$$

$$M1_{\text{Speed}} := \text{linear velocity} - \frac{\text{angular velocity} \cdot b}{2} \quad (2)$$
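Equations (1) and (2) translate directly into a small helper, sketched below; the wheel-base value is a placeholder, not the one used on the real robots.

```python
WHEEL_BASE_CM = 7.5   # placeholder value for b, the distance between the two wheels

def wheel_speeds(linear_velocity, angular_velocity, b=WHEEL_BASE_CM):
    """Differential-drive set-points following Eqs. (1) and (2)."""
    m0 = linear_velocity + (angular_velocity * b) / 2.0   # M0Speed
    m1 = linear_velocity - (angular_velocity * b) / 2.0   # M1Speed
    return m0, m1

# Example: 4.09 cm/s forward combined with a gentle 0.2 rad/s rotation.
print(wheel_speeds(4.09, 0.2))
```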
3.3 Robots’ Architecture
To validate the proposed robot orchestration algorithm, three differential mobile
robots were developed as small-scale versions of an industrial Automated Guided
Vehicle (AGV). Each one measures 11 cm in length by 9 cm in width, as presented
in Fig. 6. The robots were produced through additive manufacturing using a 3D printer.
Fig. 6. Example of the robot developed to perform tests: Small scale AGV, powered by
three Li-Ion batteries that supply an ESP32 microcontroller and stepper motor drivers
Each robot is powered by three Li-Ion MR18650 batteries, placed on the top
of the robot, that supply the main controller board, based on an ESP32
microcontroller, and the DRV8825 stepper motor drivers. The ESP32 provides the WiFi
connection that allows it to communicate with the central system. The main
architecture of the small-scale AGV is presented in Fig. 7. A push-button power
switch controller is used to turn the system off when the batteries are discharged,
avoiding damaging them. A voltage divider is applied so that the microcontroller
can measure the battery voltage.
The communication between the Robot Control Module and each robot is
also done using UDP packets; the module sends commands to the motors while
receiving information about the odometry (steps). To avoid slippage, a circular piece of rubber
(Figure 7 depicts the following blocks: an ESP32 microcontroller, two DRV8825 drivers, two NEMA stepper motors driving the right and left wheels, a 3 × 18650 battery pack with a push-button power switch, and the battery voltage (Vbat) measurement line.)
Fig. 7. Robot architecture’s diagram: Black lines represent control signals and red lines
represent power supply. (Color figure online)
is placed on each wheel. In addition to the two wheels, a low-friction Teflon
contact support is used to support the robot.
3.4 Central Control Module
As explained previously, the Central Control Module is the main module and the
one responsible for running the TEA* algorithm with the help of its supervisory
system.
Supervisory System. The supervisory system proposed by [8] is responsible
for deciding when to re-plan the robots' paths and for detecting and handling
communication faults through two sub-modules hierarchically related to each
other: these modules are referred to as the Planning Supervision Sub-Module and
the Communication Supervision Sub-Module.
In environments where no communication faults are present, the Planning
Supervision Sub-Module is responsible for determining when one of the robots is
delayed or ahead of time (according to the path planned by the TEA* algorithm)
and also for detecting when a robot has completed the current step of its path.
On the other hand, the Communication Supervision Sub-Module is only
responsible for detecting communication faults. When a communication fault
is detected, the Planning Supervision Sub-Module is overruled and the re-
calculation of the robots’ paths is controlled by the Communication Supervi-
sion Sub-Module. Once the communication is reestablished, the supervision is
returned to the Planning Supervision Sub-Module.
In the Central Control Module, after the initial calculation of the robot’s
paths, done by the Path Planning Module, the Planning Supervision Module
is responsible for checking three critical situations that can lead to collisions
and/or deadlocks:
– If a robot is too distant from its planned position;
– If the maximum difference between steps is greater than 1;
– If there is a robot moving into a position currently occupied by another robot.
All these situations could be covered in only one: verifying when the robots
become unsynchronised. If the robots are not synchronised, it means that each
robot is at a different step of the planned path, which can lead to collisions and
deadlocks. However, triggering the supervisory system by this criterion may lead
to an ineffective system where the paths are constantly being recalculated.
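A minimal sketch of such a synchronisation check is shown below; the data structure holding each robot's planned and measured state is hypothetical, and the 10 cm distance threshold anticipates the value reported in Sect. 4.

```python
import math

def needs_replanning(robots, dist_threshold=0.10, max_step_diff=1):
    """Check the critical situations that trigger a path re-plan.

    robots -- list of dicts with keys (hypothetical layout, for illustration):
              'pose'    : (x, y) measured position
              'planned' : (x, y) position planned for the current step
              'step'    : index of the current step in the planned path
              'target'  : (x, y) node the robot is moving into
    """
    steps = [r["step"] for r in robots]
    # 1) a robot too distant from its planned position
    too_far = any(math.dist(r["pose"], r["planned"]) > dist_threshold for r in robots)
    # 2) maximum difference between steps greater than the allowed value
    out_of_step = (max(steps) - min(steps)) > max_step_diff
    # 3) a robot moving into a position currently occupied by another robot
    occupied = {tuple(r["pose"]) for r in robots}
    into_occupied = any(tuple(r["target"]) in occupied - {tuple(r["pose"])}
                        for r in robots)
    return too_far or out_of_step or into_occupied
```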
The next section describes the tests performed to study the impact of the supervisor: how the re-planning of the robots' paths affects the execution time and the number of tasks completed, and what triggers the supervisor.
4 Experiments and Results
The shop floor map designed is presented in Fig. 8. This map was turned into a graph with links around the size of the robot plus a safety margin; in this case, the links measure approximately 15 cm.
Fig. 8. Shop floor map designed to validate the work developed
The decomposition of the map was done using the Map Decomposition Mod-
ule, developed by [8], and incorporated in the Central Control Module through
an XML file.
To test the Planning Supervision Sub-Module and its intervention, a workstation was assigned to each robot, as listed in Table 1. After reaching the workstation, each robot moves towards the nearest rest station, which is any other station not chosen as a workstation. In Fig. 9, it is possible to observe the chosen workstations in blue and the available rest stations in red.
Table 1. Assigned tasks for Test A

Robot ID  Robot station  Workstation  Rest station
1         No. 8          No. 6        No. 7
2         No. 2          No. 4        No. 3
3         No. 7          No. 9        No. 10
Fig. 9. Shop floor skeleton graph with the workstations selected for the following tests
in blue (Color figure online)
The following tests aimed to evaluate how many times the supervisor had to intervene and which situations triggered it the most. It was expected that the paths would be re-calculated more times when the Supervisory System was triggered upon every delay (Test B) than when re-calculating the paths only in those critical situations (Test A). It was also intended to evaluate whether the nominal velocity could influence the re-planning of the paths (Test C); a decrease in the number of supervisor interventions was expected, due to a more similar step execution time between the robots (fewer delays).
Initially, the nominal linear velocity and the nominal angular velocity set in the Robot Control Module were 800 steps/s (corresponding to 4.09 cm/s) and 100 rad/s, respectively.
Before running the algorithm, it was already possible to predict a critical situation. Since robots number 1 and number 3 have part of their paths in common, the TEA* algorithm defined waiting steps to prevent the collision of robot number 1 with robot number 3. However, if robot number 3 suffered any delay, robot number 1 could still collide with it.
4.1 Test A
During this test, the Supervisory Module only searched for the three previously mentioned situations to re-plan the paths.
This mission was executed five times and the results of each sample and the
average results are registered in Table 2 and Table 3, respectively.
The situations that triggered the Path Supervisory Sub-Module the most were a robot being a step ahead of time (verified by obtaining the lowest current step among the robots, comparing the advanced robot's coordinates with the ones it would have if it were executing that lower step, and noticing that the difference was greater than 10 cm) and the maximum step difference between all robots being greater than one.
Table 2. Execution of Test A: supervisory module followed the criteria defined previously

#  Supervisor interventions  Exec. time (min)  Deadlocks/Collisions  Tasks completed
1  22                        1:50              0                     100%
2  23                        1:51              0                     100%
3  23                        1:54              0                     100%
4  32                        1:47              0                     100%
5  21                        2:10              0                     100%
Table 3. Average values of Test A: supervisory module followed the criteria defined previously

Samples  Supervisor interventions (avg.)  Exec. time (min, avg.)  Deadlocks/Collisions  Tasks completed
5        24.2                             2:02                    0                     100%
4.2 Test B
The mission previously described was also tested, with the same velocity con-
ditions, with the Supervisory Sub-Module analysing and acting whenever any
robot became out of sync, instead of waiting for any of the other situations to
happen. The results of each sample and the average results are registered in
Table 4 and Table 5, respectively.
Comparing with the previous results, the supervisor was executed an average of 36.6 times, 12.4 times more. However, since the paths were planned again at the slightest delay, robots number 1 and number 3 never got as close as in the first performance. Therefore, this criterion was able to correct the issues before the other situations triggered the supervisor.
4.3 Test C
Focusing on the supervisor tested in Test A (the one that searches for the three
critical situations), the nominal velocities of the robots, in the Robot Control
Module, were increased by 25%. Therefore, the nominal linear velocity was set
at 1000 steps/s and the nominal angular velocity at 125 rad/s. The results are
described in Table 6 and Table 7.
Table 4. Execution of Test B: Supervisory Module acts upon any delay

#  Supervisor interventions  Exec. time (min)  Deadlocks/Collisions  Tasks completed
1  34                        1:47              0                     100%
2  37                        1:52              0                     100%
3  37                        1:49              0                     100%
4  39                        1:45              0                     100%
5  36                        1:50              0                     100%
Table 5. Average values of Test B: Supervisory Module acts upon any delay

Samples  Supervisor interventions (avg.)  Exec. time (min, avg.)  Deadlocks/Collisions  Tasks completed
5        36.6                             1:49                    0                     100%
Table 6. Execution of Test C: nominal velocities increased by 25%

#  Supervisor interventions  Exec. time (min)  Deadlocks/Collisions  Tasks completed
1  25                        1:47              0                     100%
2  26                        1:42              0                     100%
3  30                        1:40              0                     100%
4  25                        1:44              0                     100%
5  23                        1:40              0                     100%
Table 7. Average values of Test C: nominal velocities increased by 25%

Samples  Supervisor interventions (avg.)  Exec. time (min, avg.)  Deadlocks/Collisions  Tasks completed
5        25.8                             1:43                    0                     100%
Comparing the results with the ones in Test A (Table 2 and Table 3), only the execution time varied. This means that the delay between the robots may be caused either by the time that it takes for a robot to rotate not corresponding to the time that it takes to cross a link (since one step corresponds to crossing from one node to the other, but also to a rotation of 90°) or by the links (the distance between two nodes) not having a similar size.
The same test was then repeated with the nominal linear velocity at 800 steps/s and the nominal angular velocity at 125 rad/s; the results are in Table 8 and Table 9.
The average number of times that the supervisor recalculated the path is lower when the angular velocity is 25% higher. However, increasing this value too much would cause new delays (because the time taken by one rotation step would become shorter than the time of one forward step). The average execution time is also lower, as expected.
Table 8. Execution of Test C': nominal linear velocity at 800 steps/s and nominal angular velocity at 125 rad/s

#  Supervisor interventions  Exec. time (min)  Deadlocks/Collisions  Tasks completed
1  19                        1:47              0                     100%
2  21                        1:44              0                     100%
3  24                        1:43              0                     100%
4  24                        1:42              0                     100%
5  24                        1:46              0                     100%
Table 9. Average values of Test C': nominal linear velocity at 800 steps/s and nominal angular velocity at 125 rad/s

Samples  Supervisor interventions (avg.)  Exec. time (min, avg.)  Deadlocks/Collisions  Tasks completed
5        22.4                             1:44                    0                     100%
4.4 Results Discussion
The time taken by the robots to perform their tasks is slightly longer in Test A. This can be justified by the fact that the situations in which the supervisor acts are more critical, resulting in a different path instead of extra steps on the same node (the robot waiting in the same position). For example, in Table 2,
the execution time of sample number 5 stands out from the others. This longer
performance is justified by the alteration of the path of robot number 1. The
addition of extra steps in station number 7, for robot number 1 to wait until
robot number 3 has cleared the section, was a consequence of the re-planning
done by the supervisor. Even though, in the other samples, it was possible to
observe robot number 1 starting to rotate in the direction of station number 7,
the robot never moved into that node/station, resulting in a shorter execution
time.
In sample number 4 of Test A, it was also possible to observe robot number 2
taking an alternative path to reach its rest station. The execution time was not
significantly affected but the supervisor had to recalculate the paths 32 times.
When it comes to the results of Test B, sample number 4 is also distinct. Even
though the supervisor recalculates the paths more frequently, the execution time
is a little shorter than on the other samples. This can be a result of the supervisor
being called on simpler situations that do not involve a change of the path and
are quickly sorted.
In conclusion, it is possible to verify that there is a trade-off between the
number of times that the path is calculated (re-calculating the paths frequently
can be a computationally heavy procedure) and the mission’s execution time. If
the supervisor is only called on critical situations, the algorithm will be executed
faster and it will not be as computationally heavy but the robots’ may take
longer to perform their missions due to alternative and longer paths. Contrarily,
if the paths are re-calculated every time a robot becomes unsynchronised, the
most likely situation to happen is the TEA* algorithm adding extra steps on the
current node, making the robots wait for each other and, consequently, avoiding
longer paths.
Test C evaluated the impact of the robots' velocity on the number of times the paths were re-calculated. Even though in experiment Test C' the angular velocity seems better fitted, the results are not very different. Therefore, the size of the links should be tested to evaluate its impact on the synchronisation of the robots.
For this mission, the trajectory that each robot should have followed, according to the initial calculation done by the TEA* algorithm, is represented in Fig. 10 (rotations are not represented). In most cases, the paths were mainly maintained because the re-calculation only forced the robots to stop and wait until all robots were in sync. However, as described before, in samples number 4 and number 5 of Test A the trajectories performed were different and can be visualised in the recordings of Test A sample number 4¹, Test A sample number 5², and Test B sample number 4³.
(a) Steps 1 to 6 (b) Steps 6 and 7 (c) Steps 7 to 10
(d) Steps 10 to 14 (e) Steps 14 to 17 (f) Steps 17 and 18
Fig. 10. Initial path schematic calculated by the Path Planning Module (rotations are
not represented)
¹ https://www.youtube.com/watch?v=XeMGC1BlOg8
² https://www.youtube.com/watch?v=7tiBd8hPfKE
³ https://www.youtube.com/watch?v=NmdK5b0vj64
5 Conclusions
This article presented the implementation and evaluation of a supervisor under different situations. Through the development of a small fleet of small-scale industrial Automated Guided Vehicles and their control and localisation modules, it was possible to test the paths planned for them, provided by the Time Enhanced A* (TEA*) algorithm, and how the supervisor had to intervene and recalculate the paths in different situations, always concerning the delays of the robots (which could consequently evolve into collisions and deadlocks).
Throughout the executed tests, it was possible to notice that there is a trade-off between the number of times that the path is calculated and the mission's execution time. Recalculating the robots' paths frequently makes the Central Control Module computationally heavy, but if the paths are recalculated every time a robot becomes unsynchronised, the most likely outcome is the TEA* algorithm adding extra steps on the current node, making the robots wait for each other and, consequently, avoiding longer paths. This results in a shorter execution time for the robots.
Meanwhile, only triggering the Supervisor Sub-Module on critical situations might lead to the robots taking a longer time to perform their missions due to alternative and longer paths, but the Central Control Module would not be as computationally heavy.
It was also studied whether the robots' velocity could be the reason for the delays detected by the supervisor. Since the time taken for a robot to go from one step to another should be equal in all situations for all robots, the possibility that the steps involving rotations take longer than the steps that only include going forward had to be considered. However, the results did not show a significant difference, meaning that the delays could be caused by links of different sizes. This hypothesis should be addressed in future work.
Acknowledgements. This work is financed by National Funds through the Por-
tuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia within project
UIDB/50014/2020.
References
1. Automated Guided Vehicle Market Size, Share & Trends Analysis Report
By Vehicle Type, by Navigation Technology, by Application, by End-use
Industry, by Component, by Battery Type, by Region, and Segment Fore-
casts, https://www.grandviewresearch.com/industry-analysis/automated-guided-
vehicle-agv-market. Accessed 21 Jan 2021
2. The Advantages and Disadvantages of Automated Guided Vehicles (AGVs).
https://www.conveyco.com/advantages-disadvantages-automated-guided-
vehicles-agvs. Accessed 21 Jan 2021
3. Benefits of Industrial AGVs in Manufacturing. https://blog.pepperl-fuchs.us/4-
benefits-of-industrial-agvs-in-manufacturing. Accessed 21 Jan 2021
4. Bittel, O., Blaich, M.: Mobile robot localization using beacons and the Kalman
filter technique for the Eurobot competition. In: Obdržálek, D., Gottscheber, A.
(eds.) EUROBOT 2011. CCIS, vol. 161, pp. 55–67. Springer, Heidelberg (2011).
https://doi.org/10.1007/978-3-642-21975-7 6
5. Cao, Y.U., Fukunaga, A.S., Kahng, A.: Cooperative mobile robotics: antecedents
and directions. Autonom. Robot. 4, 7–24 (1997)
6. Garrido-Jurado, S., Muñoz Salinas, R., Madrid-Cuevas, F.J., Medina-Carnicer, R.:
Generation of fiducial marker dictionaries using mixed integer linear programming.
Patt. Recogn. 51, 481–491 (2016)
7. Gomes da Costa, P.L.: Planeamento Cooperativo de Tarefas e Trajectórias em
Múltiplos Robôs. PhD thesis, Faculdade de Engenharia da Universidade do Porto
(2011)
8. Matos, D., Costa, P., Lima, J., Costa, P.: Multi AGV coordination tolerant to
communication failures. Robotics 10(2), 55 (2021)
9. Romero-Ramirez, F.J., Muñoz-Salinas, R., Medina-Carnicer, R.: Speeded up detec-
tion of squared fiducial markers. Image Vis. Comput. 76, 38–47 (2018)
10. Santos, J., Costa, P., Rocha, L. F., Moreira, A. P., Veiga, G.: Time Enhanced A*:
Towards the Development of a New Approach for Multi-Robot Coordination. In:
Proceedings of the IEEE International Conference on Industrial Technology, pp.
3314–3319. Springer, Heidelberg (2015)
11. Santos J., Costa P., Rocha L., Vivaldini K., Moreira A.P., Veiga G.: Validation of
a time based routing algorithm using a realistic automatic warehouse scenario. In:
Reis, L., Moreira, A., Lima, P., Montano, L., Muñoz-Martinez, V. (eds.) ROBOT
2015: Second Iberian Robotics Conference: Advances in Robotics, vol. 2, pp. 81–
92 (2016)
12. UDP (User Datagram Protocol). https://searchnetworking.techtarget.com.
Accessed 13 June 2021
13. Ullrich, G.: The history of automated guided vehicle systems. In: Automated
Guided Vehicle Systems, pp. 1–14. Springer, Heidelberg (2015). https://doi.org/
10.1007/978-3-662-44814-4 1
Dual Coulomb Counting Extended
Kalman Filter for Battery SOC
Determination
Arezki A. Chellal1,2(B), José Lima2,3, José Gonçalves2,3, and Hicham Megnafi1,4
1 Higher School of Applied Sciences, BP165, 13000 Tlemcen, Algeria
2 Research Centre of Digitalization and Intelligent Robotics CeDRI,
Instituto Politécnico de Bragança, 5300-252 Bragança, Portugal
{arezki,jllima,goncalves}@ipb.pt
3 Robotics and Intelligent Systems Research Group, INESC TEC,
4200-465 Porto, Portugal
4 Telecommunication Laboratory of Tlemcen LTT, University of Abou Bakr Belkaid,
BP119, 13000 Tlemcen, Algeria
h.megnafi@essa-tlemcen.dz
Abstract. The importance of energy storage continues to grow, whether
in power generation, consumer electronics, aviation, or other systems.
Therefore, energy management in batteries is becoming an increasingly
crucial aspect of optimizing the overall system and must be done prop-
erly. Very few works have been found in the literature proposing the
implementation of algorithms such as Extended Kalman Filter (EKF)
to predict the State of Charge (SOC) in small systems such as mobile
robots, where in some applications the computational power is severely
lacking. To this end, this work proposes an implementation of the two
algorithms mainly reported in the literature for SOC estimation, in an
ATMEGA328P microcontroller-based BMS. This embedded system is designed taking into consideration the criteria already defined for such a system, while adding flexibility and ease of implementation, and achieves an average error of 5% and an energy efficiency of 94%. One of the implemented algorithms performs the prediction while the other is responsible for the monitoring.
Keywords: Prediction algorithm · Battery management system ·
Extended kalman filter · Coulomb counting algorithm · Engineering
applications
1 Introduction
Embedded systems are ubiquitous today, but because these systems are barely
perceptible, their importance and impact are often underestimated. They are
used as sub-systems in a wide variety of applications for an ever-increasing
diversity of functions [1]. Whether it is a hybrid vehicle, a solar power plant, or any other everyday electrical device (PC, smartphone, drone...), the key element remains the ability to monitor, control and optimise the performance of one or more battery modules; the device responsible for this is often referred to as a Battery Management System (BMS). A BMS is one of the basic units of electrical energy storage systems, and a variety of already developed algorithms can be applied to determine the main states of the battery, among others: the SOC, the state of health (SOH) and the state of function (SOF), which allow real-time management of the batteries.
For the BMS to provide optimal monitoring, it must operate in a noisy envi-
ronment, it must be able to electrically disconnect the battery at any time, it
must be cell-based and perform uniform charging and discharging across all cells
in the battery [2], and the components used must be able to withstand at least
the total current drawn by the load [3]. In addition, it must continuously monitor
various parameters that can greatly influence the battery, such as cell tempera-
ture, cell terminal voltage and cell current. This embedded system must be able
to notify the robot using the battery to either stop drawing energy from it, or
to go to the nearest charging station.
However, in the field of mobile robotics and small consumer devices, such as smartphones or laptops, there are no requirements regarding the accuracy to which a BMS must be held, so standard approaches such as the Open Circuit Voltage (OCV) and Coulomb Counting (CC) methods are generally applied. This is mainly due to the fact that the use of more complicated estimation algorithms such as the EKF, sliding mode observers and machine learning [4,5] requires higher computational power; thus, the most advanced battery management system algorithms reported in the literature are developed and verified by laboratory experiments using PC-based software such as MATLAB and controllers such as dSPACE. As additional information, the most widely used battery systems in robotics today are based on electrochemical batteries, particularly lithium-ion technologies with a polymer electrolyte [6].
This document is divided into six sections; the rest of the paper is structured as follows. Section 2 reviews the work already done in this field and highlights the objectives intended through this work. Section 3 offers a brief description of the algorithms implemented on the prototype. Section 4 describes the proposed solution for the system, where a block diagram and several other diagrams defining the operating principle are offered. Section 5 provides the results and offers some discussion. Finally, Sect. 6 draws together the main ideas described in this article and outlines future prospects for the development of this prototype.
2 State of the Art Review
Several research teams around the world have proposed different solutions to
design an efficient BMS system for lithium-ion batteries. Taborelli et al. proposed in [7] a design of EKF and Ascending Extended Kalman Filter (AEKF) algorithms specifically developed for light vehicle categories, such as electric bicycles, in which a capacity prediction algorithm is also implemented, tackling SOH estimation. The design has been validated by using simulation software and real data acquisition. In [8], Mouna et al. implemented EKF and sliding mode algorithms on an Arduino board for SOC estimation, using an equivalent first-order battery model. Validation is performed by data acquisition from Matlab/Simulink. Sanguino et al. proposed in [9] an alternative design for the battery system, where two batteries work alternately: one of the batteries is charged through solar panels installed on top of the VANTER mobile robot, while the other one provides the energy to the device. The battery SOC selection is performed following an OCV check only, and the SOC monitoring is done by a Coulomb counting method.
In addition, several devices are widely marketed, and these products are used in different systems, from the smallest to the largest. The Battery Management System 4–15S is a BMS marketed by REC; it has the usual battery protections (temperature, overcurrent, overvoltage) [10]. A cell internal DC resistance measurement technique is applied, suggesting the use of a simple resistor equivalent model and an open circuit voltage technique for SOC prediction, and a Coulomb counting technique for SOC monitoring. The device can be operated as a stand-alone unit and offers the possibility of being connected to a computer through RS-485 for data export. This device is intended for use in solar systems. The BMS10x0 is a BMS and protection system for robotic devices developed by Roboteq; it uses a 32-bit ARM Cortex processor and offers the typical features of a BMS, along with Bluetooth compatibility for wireless status checking. It can monitor from 6 to 15 batteries at the same time. Voltage and temperature thresholds are assigned based on the chemical composition of the battery. The SOC is calculated based on OCV and CC techniques [11].
To the best of the authors' knowledge, all research in the field of EKF-based BMS is based on bench-scale experiments using powerful software, such as MATLAB, for data processing. So far, the constraint of limited computational power has not really been addressed in the majority of scientific papers dealing with this subject. This paper focuses on the implementation of an Extended Kalman Filter assisted by a Coulomb Counting technique, called DCC-EKF, as an SOC and SOH estimator in ATMEGA328P microcontrollers. The proposed system is self-powered, works with all types of lithium cells, is easy to connect to other systems and takes into consideration most of the BMS criteria reported in the literature.
3 Algorithm Description
There are many methods reported in the literature that can give a representation of the actual battery charge [4,5,7,8]. However, these methods vary in the complexity of the implementation and in the accuracy of the predicted results over long-term use. There is a correlation between these two parameters: as the complexity of an algorithm increases, so does the accuracy of the results.
Also, simulation is a very common technique for evaluating and validating
approaches and allows for rapid prototyping [1]; these simulations are based on
models that are approximations of reality. MATLAB is a modeling and sim-
ulation tool based on mathematical models, provided with various tools and
libraries. In this section, the two algorithms applied in the development of this
prototype will be described and the results of the simulation performed for the
extended Kalman filter algorithm will be given.
3.1 Coulomb Counting Method
The Coulomb counting method consists of measuring the current and integrating it over a specific period of time. The evolution of the state of charge for this method can be described by the following expression:
$$SOC(t_n) = SOC(t_{n-1}) + \frac{n_f}{C_{actual}} \int_{t_{n-1}}^{t_n} I \, dt \qquad (1)$$
where I is the current flowing through the battery, Cactual is the actual total storable energy in the battery and nf represents the faradic efficiency.
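A minimal discrete-time sketch of Eq. 1 is shown below; the sampling period, cell capacity and faradic efficiency values are placeholders, not the ones identified for the prototype.

```python
def coulomb_counting_step(soc_prev: float, current_a: float,
                          dt_s: float = 0.085,       # placeholder sampling period (s)
                          capacity_ah: float = 2.2,  # placeholder cell capacity (Ah)
                          n_f: float = 1.0):         # placeholder faradic efficiency
    """One update of Eq. 1, with SOC in percent and charging current positive."""
    capacity_as = capacity_ah * 3600.0               # Ah converted to Coulombs (A*s)
    delta_soc = 100.0 * n_f * current_a * dt_s / capacity_as
    return soc_prev + delta_soc

soc = 50.0
for _ in range(1000):                                # 1000 samples of a 1 A discharge
    soc = coulomb_counting_step(soc, -1.0)
print(round(soc, 2))
```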
Although this method is one of the most widely used methods for monitor-
ing the condition of batteries and offers the most accurate results if properly
initialized, it has several drawbacks that lead to errors. The coulomb counting
method is an open-loop method and its accuracy is therefore strongly influenced
by accumulated errors, which are produced by the incorrect initial determina-
tion of the faradic efficiency, the battery capacity and the SOC estimation error
[12], making it very rarely applied on its own. In addition, the sampling time
is critical and should be kept as low as possible, making applications where the
current varies rapidly unadvisable with this method.
3.2 Extended Kalman Filter Method
Since 1960, the Kalman Filter (KF) has been the subject of extensive research and applications. The KF algorithm estimates the states of the system from indirect and uncertain measurements of the system's input and output. Its general discrete-time representation is as follows:
$$\begin{cases} x_k = A_{k-1} \cdot x_{k-1} + B_{k-1} \cdot u_{k-1} + \omega_k \\ y_k = C_k \cdot x_k + D_k \cdot u_k + \upsilon_k \end{cases} \qquad (2)$$
where x_k ∈ R^{n×1} is the state vector, A_k ∈ R^{n×n} is the system matrix in discrete time, y_k ∈ R^{m×1} is the output, u_k ∈ R^{m×1} is the input, B_k ∈ R^{n×m} is the input matrix in discrete time, and C_k ∈ R^{m×n} is the output matrix in discrete time; ω and υ represent the Gaussian distributed process and measurement noise.
However, Eq. 2 is only valid for linear systems; in the case of nonlinear systems such as batteries, the EKF is applied to include the non-linear behaviour and to determine the states of the system [14]. The EKF equations have the following form:
$$\begin{cases} x_{k+1} = f(x_k, u_k) + \omega_k \\ y_k = h(x_k, u_k) + \upsilon_k \end{cases} \qquad (3)$$
where
$$A_k = \frac{\partial f(x_k, u_k)}{\partial x_k}, \qquad C_k = \frac{\partial h(x_k, u_k)}{\partial x_k} \qquad (4)$$
After initialization, the EKF algorithm always goes through two steps, prediction and correction. The KF is extensively detailed in many references [13–15]; we will therefore not focus on its detailed description. Algorithm 1 summarizes the EKF function implemented in each slave ATMEGA328P microcontroller; the matrices Ak, Bk, Ck, Dk and the state xk are discussed in Sect. 3.3.
Algorithm 1: Extended Kalman Filter Function
  initialise A0, B0, C0, D0, P0, I0, x0, Q and R;
  while EKF called do
      measure Ik and Vtk;
      x̂k = Ak−1 · xk−1 + Bk−1 · Ik−1;
      P̂k = Ak−1 · Pk−1 · Aᵀk−1 + Q;
      set the off-diagonal entries of P̂k to 0;
      if SÔCk ∈ [SOC interval] then
          update Rt, Rp and Cp;
      end
      Ck = Voc(SÔCk) + V̂Pk;
      Dk = Rt;
      V̂tk = Ck + Dk · Ik;
      L = Ck · P̂k · Cᵀk + R;
      if L ≠ 0 then
          Kk = P̂k · Cᵀk / L;
      end
      xk = x̂k + Kk · (Vtk − V̂tk);
      Pk = (I − Kk · Ck) · P̂k;
      set the off-diagonal entries of Pk to 0;
      k = k + 1;
      Result: send xk
  end
The overall performance of the filter is set by the covariance matrix P, the process noise matrix Q and the measurement noise R. These parameters were defined through experience and empirical experiments and are given by:
$$Q = \begin{bmatrix} 0.25 & 0 \\ 0 & 0 \end{bmatrix}, \qquad P = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \qquad \text{and} \qquad R = 0.00001$$
Data obtained from simulated discharge tests of different cells were used to verify the accuracy of this estimator with known parameters and are reported in Sect. 3.4. The EKF algorithm implemented in the ATMEGA328P microcontrollers has a sampling time of 0.04 s.
3.3 Battery Modelling
Two battery models are most often applied: the electrochemistry model, which gives a good representation of the internal dynamics of the batteries, and the electrical circuit model, which allows the behaviour of the batteries to be translated into an electrical model that can easily be formulated mathematically. For this study, the second model, considered more suitable for use with a microcontroller, is chosen. The most commonly applied electrical model uses a simple RC equivalent battery circuit, with two resistors (Rp and Rt), a capacitor (Cp), a voltage source (Voc) and a current flowing through it (I), as shown in Fig. 1. It represents the best trade-off between accuracy and complexity.
Fig. 1. First order equivalent battery model
The terminal voltage, denoted as Vt, represents the output voltage of the cell and is given by Eq. 5:
$$V_t = V_{oc}(SOC) + I \cdot R_t + V_p \qquad (5)$$
A non-linear relationship exists between the SOC and Voc; a representation employing a seventh-order polynomial to fit the overall curve can be expressed with the following relation:
$$V_{oc}(SOC) = 3.4624\times10^{-12}\,SOC^7 - 1.3014\times10^{-9}\,SOC^6 + 1.9811\times10^{-7}\,SOC^5 - 1.5726\times10^{-5}\,SOC^4 + 6.9733\times10^{-4}\,SOC^3 - 0.017\,SOC^2 + 0.21\,SOC + 2.7066 \qquad (6)$$
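For reference, the polynomial of Eq. 6 can be evaluated directly with NumPy; the coefficients below are the ones printed in the equation, and the SOC is assumed to be expressed in percent (consistent with the 0–100 range of the fitted curve).

```python
import numpy as np

# Coefficients of Eq. 6, highest order first (SOC expressed in percent).
OCV_COEFFS = [3.4624e-12, -1.3014e-9, 1.9811e-7, -1.5726e-5,
              6.9733e-4, -0.017, 0.21, 2.7066]

def open_circuit_voltage(soc_percent: float) -> float:
    """Voc(SOC) from the seventh-order fit of Eq. 6."""
    return float(np.polyval(OCV_COEFFS, soc_percent))

print(round(open_circuit_voltage(50.0), 3))   # OCV around mid charge
print(round(open_circuit_voltage(100.0), 3))  # OCV of a fully charged cell
```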
The time derivative of Eq. 1 can be formulated as given in Eq. 7,
$$\dot{SOC} = \frac{I}{C_{actual}} \qquad (7)$$
The polarization voltage is given as
$$\dot{V_p} = -\frac{1}{R_p C_p} V_p + \frac{1}{C_p} I \qquad (8)$$
From Eqs. 5, 7 and 8, it is possible to deduce the equations characterising the behaviour of the battery, which are expressed as the following system:
$$\begin{cases} \begin{bmatrix} \dot{SOC} \\ \dot{V_p} \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & -\frac{1}{R_p C_p} \end{bmatrix} \cdot \begin{bmatrix} SOC \\ V_p \end{bmatrix} + \begin{bmatrix} \frac{1}{C_{actual}} \\ \frac{1}{C_p} \end{bmatrix} \cdot I \\ V_t = \begin{bmatrix} poly & 1 \end{bmatrix} \cdot \begin{bmatrix} SOC \\ V_p \end{bmatrix} + R_t \cdot I \end{cases} \qquad (9)$$
Here, poly is the polynomial formula described in Eq. 6. Since the microcontroller processes actions in discrete events, the previous system must be transcribed into its discrete form, which is written as follows:
$$\begin{cases} \begin{bmatrix} SOC \\ V_p \end{bmatrix}_k = \begin{bmatrix} 1 & 0 \\ 0 & \exp\left(\frac{-\Delta t}{R_p C_p}\right) \end{bmatrix} \cdot \begin{bmatrix} SOC \\ V_p \end{bmatrix}_{k-1} + \begin{bmatrix} \frac{1}{C_{actual}} \\ \frac{1}{C_p} \end{bmatrix} \cdot I_{k-1} \\ V_{t_k} = \begin{bmatrix} poly & 1 \end{bmatrix} \cdot \begin{bmatrix} SOC \\ V_p \end{bmatrix}_k + R_t \cdot I_k \end{cases} \qquad (10)$$
From both Eqs. 2 and 10, it is deduced that the current I is the input of the system, the voltage Vt is the output, and the matrices Ak, Bk, Ck and Dk are given by:
$$A_k = \begin{bmatrix} 1 & 0 \\ 0 & \exp\left(\frac{-\Delta t}{R_p C_p}\right) \end{bmatrix}, \quad B_k = \begin{bmatrix} \frac{1}{C_{actual}} \\ \frac{1}{C_p} \end{bmatrix}, \quad x_k = \begin{bmatrix} SOC_k \\ V_{P_k} \end{bmatrix}, \quad C_k^t = \begin{bmatrix} poly \\ 1 \end{bmatrix}, \quad D_k = R_t$$
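Putting Sects. 3.2 and 3.3 together, the sketch below runs one EKF predict/correct cycle on the two-state battery model. It is only an illustration under stated assumptions: the RC parameters are placeholders, the input matrix B here includes the sampling period and a factor that keeps the SOC in percent (a discretisation choice that differs from the notation of Eq. 10), the output matrix is linearised as [dVoc/dSOC, 1] following Eq. 4, and Q, P and R are the values quoted in Sect. 3.2.

```python
import numpy as np

# Placeholder first-order model parameters (not the prototype's identified values).
R_T, R_P, C_P = 0.05, 0.03, 2000.0       # ohm, ohm, farad
C_ACTUAL = 2.2 * 3600.0                   # 2200 mAh cell expressed in A*s
DT = 0.04                                 # sampling time quoted for the EKF (s)

Q = np.diag([0.25, 0.0])                  # process noise (Sect. 3.2)
R = 1e-5                                  # measurement noise (Sect. 3.2)
P = np.eye(2)                             # initial covariance (Sect. 3.2)

OCV_COEFFS = [3.4624e-12, -1.3014e-9, 1.9811e-7, -1.5726e-5,
              6.9733e-4, -0.017, 0.21, 2.7066]
D_OCV = np.polyder(np.poly1d(OCV_COEFFS))  # dVoc/dSOC used in the C Jacobian

A = np.array([[1.0, 0.0],
              [0.0, np.exp(-DT / (R_P * C_P))]])
B = np.array([[100.0 * DT / C_ACTUAL],     # SOC kept in percent, hence the factor 100
              [DT / C_P]])                 # discretisation assumption for B

def ekf_step(x, P, current, v_measured):
    """One predict/correct cycle of the EKF on the state x = [SOC, Vp]."""
    # Prediction
    x_pred = A @ x + B.flatten() * current
    P_pred = A @ P @ A.T + Q
    # Measurement model: Vt = Voc(SOC) + Vp + Rt*I (the sign convention of Eq. 5)
    v_pred = np.polyval(OCV_COEFFS, x_pred[0]) + x_pred[1] + R_T * current
    C = np.array([D_OCV(x_pred[0]), 1.0])
    # Correction
    S = C @ P_pred @ C + R
    K = P_pred @ C / S
    x_new = x_pred + K * (v_measured - v_pred)
    P_new = (np.eye(2) - np.outer(K, C)) @ P_pred
    return x_new, P_new

x = np.array([50.0, 0.0])                  # start the estimate at 50% SOC, as in Fig. 3
x, P = ekf_step(x, P, current=-1.0, v_measured=4.05)
print(x)
```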
3.4 Simulation Results
Given the large number of articles dealing with the validation of these algorithms [13,16,17], it is not necessary to dwell on this subject. Nevertheless, a simulation of the algorithm is proposed in this section, in order to confirm that the algorithm tracks the reference correctly with the chosen parameters.
Fig. 2. (a) overall Simulink model highlighting the Input/Output of the simulation (b)
Battery sub-system model
The graphical representation of MATLAB/Simulink, which uses graphical blocks to represent mathematical and logical constructs and process flow, is intuitive and provides a clear focus on control and observation functions. Although Simulink has been the tool of choice for much of the control industry for many years [18], it is best to use Simulink only for modelling the battery behaviour and not for the prediction algorithm. The battery model presented before is applied as a base model: a more complex model would not be able to simulate all possible cases and would therefore increase the time needed for development. Figure 2 represents the battery simulation used in MATLAB/Simulink.
Figure 3 shows a comparison between the simulated state of charge of the battery and the Extended Kalman Filter estimate, together with a plot of the error between the real and the predicted values. The test consists of a 180 s discharge period followed by a one-hour rest period; the cell starts fully charged and the EKF state of charge indicator is initialised at 50% SOC.
Fig. 3. Estimated and reference SOC comparison with the absolute error of the SOC
for Extended Kalman Filter with a constant discharge test and a rest period.
The continuous updating of the correction gain (K), which is due to the
continuous re-transcription of the error covariance matrices, limits the divergence
as long as the SOC remains within the predefined operating range (80%-40%).
The response of the observer to the determination of the SOC can be very fast,
on the order of a few seconds. Also, from the previous figure it is clear that, for known parameters, the SOC can be tracked accurately with less than 5% error, and a perfect prediction is recorded when the battery is at rest. Figure 4 shows the simulation results of a constant current discharge test.
Fig. 4. Estimated and reference SOC comparison with the absolute error of the SOC for the Extended Kalman Filter with a constant discharge test.
Although the EKF provides good SOC monitoring, it still requires a good determination of the battery parameters. Some reported works [19,20] have proposed the use of a simple algebraic method or of a Dual Extended Kalman Filter (DEKF) as an online parameter estimator, to estimate them on the fly. Unfortunately, these methods require considerable computational power to achieve a low sampling time and reach convergence, which is difficult with a simple ATMEGA328P. For this reason, and given the speed at which the EKF is able to determine the battery SOC at rest, the EKF algorithm is applied to determine the battery SOC at start-up and then the CC algorithm is applied to carry on the estimation.
4 Proposed Solution
The system is designed to be “Plug and Play”, with the two energy connections (input and output) located on one side only, so future users of the product can quickly install and initialize the BMS, which works for all types of 18650 lithium batteries; the initialization part is described in more depth in the next section. The connecting principle of the electronic components is illustrated in Fig. 5.
4.1 Module Connection
The master microcontroller is the central element of the system. ATMEL's ATMEGA328P was chosen for its operational character and low purchase cost; it collects the state of charge predicted by each slave ATMEGA328P microcontroller and controls the energy flow according to the data collected. This microcontroller is connected to push buttons and an OLED screen to facilitate communication with the user. Power is supplied directly from the batteries, and the voltage is stabilised and regulated for the BMS electronics by a step-down voltage regulator.
The Inter-Integrated Circuit (I2C) is a bus interface that uses only 2 pins for data transfer. It is incorporated into many devices and is ideal for attaching low-speed peripherals to a motherboard or embedded system over a short distance, providing connection-oriented communication with acknowledgement [21]. Figure 6 represents the electronic circuit schematic diagram.
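The actual master in this design is an ATMEGA328P; purely as an illustration of the same request/response pattern, the sketch below shows a Linux host polling two hypothetical I2C slave addresses for a 2-byte SOC value (the addresses and register layout are made up for the example).

```python
from smbus2 import SMBus

SLAVE_ADDRESSES = [0x08, 0x09]   # hypothetical I2C addresses of the two slaves
SOC_REGISTER = 0x00              # hypothetical register holding SOC * 100 (2 bytes)

def read_soc(bus: SMBus, address: int) -> float:
    """Read a 2-byte, big-endian SOC value expressed in hundredths of a percent."""
    raw = bus.read_i2c_block_data(address, SOC_REGISTER, 2)
    return ((raw[0] << 8) | raw[1]) / 100.0

with SMBus(1) as bus:            # I2C bus 1 on a typical Linux host
    for addr in SLAVE_ADDRESSES:
        print(f"slave 0x{addr:02x}: SOC = {read_soc(bus, addr):.2f} %")
```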
[Figure 5 shows the block diagram of the system: the master microcontroller connected to the OLED display and the four push buttons (up, down, enter, exit); two slave microcontrollers connected to the temperature sensor, the current measurement (through an operational amplifier) and the battery pack (cells 1–4 with S-8254A cell protection); a step-down regulator; and the energy input (charging station) and energy output (robot) of the electronic board.]
Fig. 5. Block diagram of the system
The 18650 lithium cells are placed in the two cell holders (1BH, 2BH). The OLED display (OLED1) is connected via I2C through the Serial Clock and Serial Data lines, called SCL and SDA respectively; the ATMEGA328P microcontrollers (MASTER, SLAVE1, SLAVE2) are connected to the I2C bus through pins PC5 and PD4. The current is measured by the ACS712 (U1) and read by the slave microcontrollers (SLAVE1, SLAVE2) from the analog pin PC0. A quadruple precision amplifier LM324 (U5) is used to measure the voltage at each cell terminal, with the use of a simple voltage divider. The push buttons are linked to the master microcontroller via the digital pins (PD2, PD3, PB4, PB5). An IRF1405 N-channel MOSFET is used as a power switch to the load (mobile robot), while an IRF9610 P-channel MOSFET is used as a power switch from the power supply; they are controlled from the master ATMEGA328P pins PB3 and PB2, respectively. A step-down voltage regulator is applied to regulate the voltage supplied to the electronics to 5 V; this power is drawn either from the battery pack or from the power supply.
4.2 Operating Principle
This system is mainly characterised by two modes, the initialisation mode and the on-mode. These two modes are very dependent on each other and it is possible to switch from one mode to the other at any time using a single button. Figure 7 briefly summarizes the operating principle for both modes.
Fig. 6. Schematic diagram of the electronic circuit
Because the characteristics of the batteries are not uniform and can vary greatly, and in order to achieve high accuracy of the SOC prediction, the initialization mode is introduced into the system; it is possible to skip and ignore this mode, but at the cost of reduced accuracy. Through this mode it is possible to enter the capacity of the installed batteries and the actual SOC to ensure a more accurate result, as well as to activate an additional protection to further limit the discharges and to enable or disable the EKF algorithm.
As for the on-mode, according to what was initialized, the BMS begins by carrying out the assigned tasks for each battery cell in parallel (a simplified sketch of this decision logic is given after the list below).
– If the EKF prediction was set on during the initialization mode, the device shall begin the prediction of the actual SOC value for each cell and prohibit the passage of any current other than that which powers the BMS for a period of approximately 3 min (Fig. 7).
– If the EKF prediction was set off during the initialization mode and an SOC value was introduced, the device allows the current to flow directly and starts the SOC monitoring according to the chosen value.
– If the protection was set off, the battery will be discharged and charged even deeper, taking the risk of shortening the life span of the batteries.
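The sketch below condenses that decision logic for a single cell; the configuration keys and helper functions are hypothetical, while the roughly three-minute EKF settling window and the SOC limits follow the description above.

```python
import time

def run_on_mode(cfg, cell, ekf_estimate, coulomb_count, allow_current):
    """Simplified on-mode logic for one cell.

    cfg is a dict produced by the initialisation mode, with hypothetical keys
    'use_ekf', 'manual_soc', 'protection', 'soc_min' and 'soc_max'; the callables
    ekf_estimate, coulomb_count and allow_current stand in for firmware routines.
    """
    if cfg["use_ekf"]:
        allow_current(False)               # block (dis)charge while the EKF settles
        t_end = time.time() + 3 * 60       # ~3 min prediction window from the text
        soc = None
        while time.time() < t_end:
            soc = ekf_estimate(cell)
    else:
        soc = cfg["manual_soc"]            # user-provided SOC, monitoring only
    allow_current(True)                    # afterwards Coulomb counting takes over
    while True:
        soc = coulomb_count(cell, soc)
        if cfg["protection"]:
            allow_current(cfg["soc_min"] < soc < cfg["soc_max"])
```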
[Figure 7(a): initialisation-mode flow chart with a menu (navigated with the up/down buttons) for (1) initialising the cell capacity, (2) (de)activating the EKF, (3) entering the SOC manually, (4) (de)activating the cell protection and (5) finishing the initialisation. Figure 7(b): on-mode flow chart in which the EKF function runs for about 3 minutes without allowing (dis)charge, after which (dis)charge is allowed and the Coulomb counting function takes over; further branches compare the SOC against SOCmin/SOCmax, calculate R0, block charge or discharge accordingly and periodically send data to the slave µ-controller.]
Fig. 7. Flow chart of the system algorithm. (a) Initialisation mode flow chart. (b) On-mode flow chart with EKF algorithm activated.
The battery states are determined for each cell independently of the others by a specific microcontroller assigned to it, working on the principle of Master/Slave communication. In order to keep a low sampling time, one slave microcontroller is assigned to two batteries, reaching a sampling time of 0.085 s.
5 Results and Discussions
The result is the design of a BMS that takes into account the already established requirements for this type of system and defines new ones, such as flexibility and size, that must be met for mobile robots. A breadboard version was first built to perform the first tests. After power-up, the initialisation mode starts first, as expected. After that, while the on-mode is running, the voltage, current and SOC data are displayed on the screen; an accuracy of 100 mV and 40 mA is achieved. When the cell is at rest and for a SOC ∈ [80%–40%], the EKF prediction reaches an error of 5% in the worst case; for cells above 80% SOC the accuracy drops to 8%, while for SOC below 30% the algorithm diverges completely (Fig. 10). These results are quite understandable, as the linearization of these curves is done in only 10 points. Cell protection is performed by an external circuit based on an S-8254A; it includes high-accuracy voltage detection for lithium batteries, avoiding over-voltage while charging the batteries. These circuits are widely applied in rechargeable battery packs due to their low cost and good characteristics [22].
The total storable energy offered by this model varies according to the capacity of the installed cells. The use of 4 lithium cells of 2200 mAh gives the system a battery capacity of 8800 mAh, which is sufficient for the majority of small-robot applications; it is also possible to install 5500 mAh cells, giving the developed product great flexibility by offering users the possibility of choosing the battery capacity. The maximum current that can be delivered is 2 A, which allows a power of up to 32 W, sufficient to power the electronics of the robot; beyond this value, and in order to offer optimal safety to the battery, a shutdown protocol will stop the discharge of the battery. As for the charging process, the maximum current supplied by an external source is currently 0.5 A, in order to have good protection when charging the batteries. With such a current, the charging process is slow: it takes eight and a half hours to reach the 80% SOC advised for 4 × 2200 mAh cells. Figure 8 shows the breadboard circuit.
Fig. 8. Breadboard circuit
The BMS electronics consume on average 1.15 W, giving the product an efficiency of 96.15%. While charging the battery, and because a constant charging power is set, the IRF9610 MOSFET leads to a constant power loss of 0.52 W, and the efficiency then reaches 90.38%. For discharging the battery, as the regulation of the voltage supplied to the robot is left to the user and is done with a variable current, it is difficult to define the efficiency of this stage with precision; the losses can however reach 2.5%, which makes the total efficiency 93.60%. Thus, in general, the average efficiency of this product is 94.37%. Figure 9 shows a printed circuit representation of the proposed prototype.
Fig. 9. Printed circuit board of the prototype. (a) Components view. (b) Top view.
Fig. 10. On-mode display of cells 1, 2 and 3 with a SOC value of 9.51%, 77.25% and
71.10% respectively. (a) Average SOC of the first 50 iteration. (b) Few seconds after.
(c) approximately 1 min after. For cell 1, the algorithm diverges for the SOC estimate.
For cell 2, the estimate has reached the SOC value with an error of 1.75%. For cell 3,
after 1 min, the prediction has not yet reached the estimated value with an error of
6.90%.
The SOC values, voltage and current are displayed directly on the Oled screen
as shown in Fig. 10. It is possible to display the overall battery status, or cell by
cell for a more detailed view. Taking into account that this approach and the
algorithm are constantly being improved, not all features such as the SOH and
temperature are fully integrated.
6 Conclusion and Future Work
This paper proposes the DCC-EKF approach, where the state of charge is first predicted using an EKF algorithm, after which the SOC is handed to the Coulomb counting algorithm to monitor this quantity. Most commercial products only use the OCV to initially predict the SOC, a technique that is widely implemented but severely inaccurate, especially when the BMS is powered directly from the batteries, so that the batteries never reach a resting state. The balancing and overvoltage protection of the BMS is provided by an external circuit based on the S-8254A module. The proposed BMS is small (10 cm × 15 cm), easy to use, and has shown accurate SOC prediction with an error of 5% and good performance in noisy operation. The prototype also offers a good efficiency of about 93%.
Future developments will focus on implementing a more accurate state of charge prediction algorithm, such as the DEKF or AEKF, as well as a more accurate health prediction algorithm, while aiming to make the device more user-friendly and improve the user experience. The temperature effect, which is currently neglected, will also be addressed in future versions, as well as an increase of the product efficiency. In addition, the implementation of a learning algorithm is being considered, to further enhance the prediction accuracy, together with a fast-charging feature.
References
1. Marwedel, P.: Embedded Systems Foundations of Cyber-Physical Systems, and the
Internet of Things. Springer Nature (2021)
2. Park, K.-H., Kim, C.-H., Cho, H.-K., Seo, J.-K.: Design considerations of a lithium
ion battery management system (BMS) for the STSAT-3 satellite. J. Power Elec-
tron. 10(2), 210–217 (2010)
3. Megnafi, H., Chellal, A.-A., Benhanifia, A.: Flexible and automated watering
system using solar energy. In: International Conference in Artificial Intelligence
in Renewable Energetic Systems, pp. 747–755. Springer, Cham, Tipaza (2020).
https://doi.org/10.1007/978-3-030-63846-7 71
4. Xia, B., Lao, Z., Zhang, R., et al.: Online parameter identification and state of
charge estimation of lithium-ion batteries based on forgetting factor recursive least
squares and nonlinear Kalman filter. Energies 11(1), 3 (2018)
5. Hannan, M.A., Lipu, M.H., Hussain, A., et al.: Toward enhanced State of charge
estimation of Lithium-ion Batteries Using optimized Machine Learning techniques.
Sci. Rep. 10(1), 1–15 (2020)
6. Thomas, B.-R.: Linden’s Handbook of Batteries, 4th edn. McGraw-Hill Education,
New York (2011)
7. Taborelli, C., Onori, S., Maes, S., et al.: Advanced battery management system
design for SOC/SOH estimation for e-bikes applications. Int. J. Powertrains 5(4),
325–357 (2016)
8. Mouna, A., Abdelilah, B., M’Sirdi, N.-K.: Estimation of the state of charge of the
battery using EKF and sliding mode observer in Matlab-Arduino/LabView. In:
4th International Conference on Optimization and Applications, pp. 1–6. IEEE,
Morocco (2018)
9. Sanguino, T.-D.-J.-M., Ramos, J.-E.-G.: Smart host microcontroller for optimal
battery charging in a solar-powered robotic vehicle. IEEE/ASME Trans. Mecha-
tron. 18(3), 1039–1049 (2012)
10. REC-BMS.: Battery Management System 4–15S. REC, Control your power, Slove-
nia (2017)
11. Roboteq: BMS10x0, B40/60V, 100 Amps Management System for Lithium Ion Batteries. RoboteQ, USA (2018)
12. Kim, I.-S.: A technique for estimating the state of health of lithium batteries
through a dual-sliding-mode observer. IEEE Trans. Power Electron. 25(4), 1013–
1022 (2009)
13. Mastali, M., Vazquez-Arenas, J., Fraser, R.: Battery state of the charge estimation
using Kalman filtering. J. Power Sources 239, 294–307 (2013)
14. Campestrini, C., Heil, T., Kosch, S., Jossen, A.: A comparative study and review
of different Kalman filters by applying an enhanced validation method. J. Energ.
Storage 8, 142–159 (2016)
15. Bishop, G., Welch, G.: An introduction to the kalman filter. In: Proceedings of
SIGGRAPH, Course. Proceedings of SIGGRAPH, vol. 41, pp. 27599–23175 (2001)
16. Campestrini, C., Horsche, M.-F., Zilberman, I.: Validation and benchmark meth-
ods for battery management system functionalities: state of charge estimation algo-
rithms. J. Energ. Storage 7, 38–51 (2016)
17. Yuan, S., Wu, H., Yin, C.: State of charge estimation using the extended Kalman
filter for battery management systems based on the ARX battery model. Energies
6(1), 444–470 (2013)
18. Marian, N., Ma, Y.: Translation of simulink models to component-based software
models. In: 8th International Workshop on Research and Education in Mechatron-
ics, pp. 262–267. Citeseer, Location (2007)
19. Hu, T., Zanchi, B., Zhao, J.: Determining battery parameters by simple algebraic
method. In: Proceedings of the 2011 American Control Conference, pp. 3090–3095.
IEEE, San Francisco (2011)
20. Xu, Y., Hu, M., Zhou, A., et al.: State of charge estimation for lithium-ion batteries
based on adaptive dual Kalman filter. Appl. Math. Modell. 77(5), 1255–1272 (2020)
21. Mazidi, M.-A., Naimi, S., Naimi, S.: AVR Microcontroller and Embedded Systems.
Pearson, India (2010)
22. ABLIC INC.: S-8254A Series Battery Protection IC for 3-Serial- or 4-Serial-CELL
Pack REV.5.2. ABLIC, Japan (2016)
23. Chatzakis, J., Kalaitzakis, K., Voulgaris, C., Manias, N.-S.: Designing a new gen-
eralized battery management system. IEEE Trans. Ind. Electron. 50(5), 990–999
(2003)
24. Chen, L., Xu, L., Wang, R.: State of charge estimation for lithium-ion battery by
using dual square root cubature kalman filter. Mathematical Problems in Engi-
neering 2017 (2017)
Sensor Fusion for Mobile Robot
Localization Using Extended Kalman
Filter, UWB ToF and ArUco Markers
Sílvia Faria1(B), José Lima2,3, and Paulo Costa1,3
1 Faculty of Engineering, University of Porto, Porto, Portugal
{up201603368,paco}@fe.up.pt
2 Research Centre of Digitalization and Intelligent Robotics, Instituto Politécnico de Bragança, Bragança, Portugal
jllima@ipb.pt
3 INESC-TEC - INESC Technology and Science, Porto, Portugal
Abstract. The ability to locate a robot is one of the main features to
be truly autonomous. Different methodologies can be used to determine the robot's location as accurately as possible; however, these methodologies present several problems in some circumstances. One of these problems is the existence of uncertainty in the sensing of the robot. To solve this problem, it is necessary to combine the uncertain information correctly.
In this way, it is possible to have a system that allows a more robust
localization of the robot, more tolerant to failures and disturbances. This
paper evaluates an Extended Kalman Filter (EKF) that fuses odometry
information with Ultra-WideBand Time-of-Flight (UWB ToF) measure-
ments and camera measurements from the detection of ArUco markers
in the environment. The proposed system is validated in a real envi-
ronment with a differential robot developed for this purpose, and the
achieved results are promising.
Keywords: ArUco markers · Autonomous mobile robot · Extended
kalman filter · Localization · Ultra-WideBand · Vision based system
1 Introduction
In a semi-structured environment a mobile robot needs to be able to locate itself
without human intervention, i.e. autonomously. It can be assumed that if the
robot knows its own pose (position and orientation), it can be truly autonomous
to execute given tasks. Since the first mobile robot was invented, the problem of pose determination has been addressed by many researchers and developers around the world due to its complexity and the multitude of possible approaches.
The localization methods can be classified into two main categories: relative methods and absolute methods [1]. Relative localization methods give the robot's pose relative to the initial one; for this purpose, dead reckoning methods such as odometry and inertial navigation are used. Odometry is a technique
in which it is common to use encoders connected to the rotation axes of the robot
wheels. Absolute localization methods give the global pose of the robot and do not need previously calculated poses. In these methods, data from odometry and data from external sensors can be combined to estimate the robot's pose.
In an indoor environment, a mobile robot needs to localize itself accurately due to the limited space and the obstacles present in it. Today, the number of applications which rely on indoor localization is rapidly increasing and, because of this, localization of a robot in an indoor environment has become an active research area.
The Global Positioning System (GPS) is usually used in outdoor environments; nevertheless, the use of GPS in indoor environments is difficult. Due to this, different alternative technologies have been proposed and developed. Among the proposed technologies, Ultra-WideBand (UWB) is known for its low cost, high precision and easy deployment. A UWB-based localization system can be characterized by a UWB tag present on the robot that can be located by measuring the distance to UWB anchors, whose positions are known. Since UWB-based localization systems cannot provide robot orientation information, another way to achieve this information is required. To solve this problem, it is possible to use odometry information to obtain the orientation of the robot. Other solutions to locate the robot are based on the visual field, whose aim is to find the correspondence between the real world and the image projection. To achieve this goal, methods based on binary squared fiducial markers, such as ArUco markers, have been widely used.
This paper presents the data fusion of odometry information with UWB ToF distance measurements and with angle measurements of the ArUco markers relative to the robot's reference frame. The latter measurements are obtained through a camera installed on the robot that can detect the ArUco markers present in the environment, whose positions are known. This data fusion is accomplished through the use of an Extended Kalman Filter (EKF), which is used since a non-linear system is being handled. The developed system increases the robustness, the scalability and the accuracy of robot localization.
2 Related Work
Mobile robot localization is one of the key functionalities of a truly autonomous
system. This is a complex and challenging problem in robotics field and has been
subject of a great research attention over the years. Depending on the type of
application and its environment (outdoor or indoor), the choice of technologies
and methods used must be made carefully in order to meet the requirements of
the application. Due to the existence of various obstacles, indoor environments
usually depend on non-line-of-sight (NLOS) propagation in which signals can-
not travel directly in straight path from an emitter to a receiver, which causes
inconsistent time delays at the receiver.
Nowadays, the main approaches to the robot localization problem in indoor
environment include:
– Dead Reckoning (DR): DR can approximately determine the robot's pose using odometry or an Inertial Measurement Unit (IMU) by integration of the pose from encoders and inertial sensors. This is one of the first methodologies used to calculate the robot position. Dead reckoning can give accurate information on position but is subject to cumulative errors over long distances, since it needs to use the previous position to estimate the next relative one, during which drift, wheel slippage, uneven floors and uncertainty about the structure of the robot together cause errors. In order to improve accuracy and reduce error, DR needs to use other methods to adjust the position after each interval [2].
– Range-Based: Range-based localization measures the distance and/or angle to natural or artificial landmarks. High accuracy measurements are critical for in-building applications such as autonomous robot navigation. Typically, reference points, i.e. anchors, and their distance and/or angle to one or more nodes of interest are used in order to estimate a given pose. These reference points are normally static or provided with an a priori map. Several methods are used to perform the distance and angle calculations: Angle of Arrival (AoA), Time of Flight (ToF), Time Difference of Arrival (TDoA) and Received Signal Strength Indication (RSSI) [4]. The robot's position can be calculated with various technologies such as Ultra-WideBand (UWB) tags, Radio Frequency Identification (RFID) tags and Laser Range Finders. This type of localization is widely used in Wireless Sensor Network localization. It is important to mention that UWB is a wireless radio technology primarily used in communications that has lately received more attention in localization applications and that promises to outperform many of the indoor localization methods currently available [2]. In [5], the authors proposed a method that uses a Particle Filter to fuse odometry with UWB range measurements, which lack orientation, to obtain a real-time estimate of the pose of a wheeled robot in an indoor environment.
– Visual-Based Systems: These systems can locate a robot by extracting images of its environment using a camera. They can be fiducial marker systems that detect markers in the environment using computer vision tools. Such systems are useful for Augmented Reality (AR), robot navigation and applications where the relative pose between camera and object is required, due to their high accuracy, robustness and speed [7]. Several fiducial marker systems have been proposed in the literature, such as ARStudio, ARToolkit and ARTag. However, currently the most popular marker system in the academic literature is probably ArUco, which is based on ARTag and comes with an open-source OpenCV library of functions for detecting and locating markers, developed by Rafael Muñoz and Sergio Garrido [6]. In addition, several types of research have been successfully conducted based on this type of system, such as [6,8,9].
All of the aforementioned technologies can be used independently. However, they can also be used at the same time in order to combine the advantages of each of them. By the proper fusion of these technologies in a multi-sensor environment, a more accurate and robust system can be achieved. For nonlinear systems this fusion can be achieved through algorithms such as the Extended Kalman Filter (EKF), the Unscented Kalman Filter (UKF) and the Particle Filter, which are among the most widely used [3].
The presented work uses a combination of odometry, UWB ToF measured distances and ArUco measured angles to compute an accurate pose of the robot, i.e. position and orientation. The combination of these measurements is achieved by an EKF, since it is the least complex and most computationally efficient of these algorithms.
3 System Architecture
The system architecture proposed for this work and its data flow are represented in Fig. 1. The system consists of two main blocks: the Remote Computer and the Real Robot. For all tests a real robot was used, adapted from a platform previously developed for other projects.
Fig. 1. System architecture proposed
The robot has a differential wheel drive system and therefore has two drive wheels, coupled to independent stepper motors powered by Allegro MicroSystems A4988 drivers, and a castor wheel with a support function. The robot measures 0.31 m × 0.22 m and is powered by an on-board 12 V battery and a DC/DC step-down converter that supplies the electronic modules, i.e. a Raspberry Pi 3 Model B and Arduino microcontroller boards. The top level consists of the Raspberry Pi, which runs the Raspbian operating system and is responsible for receiving data from the Pozyx system through Arduino 2 and data from Arduino 1 corresponding to the speed and voltage of the motors and the battery current. All the data is then sent over Wi-Fi to the Remote Computer, where the Kalman Filter is executed and the appropriate speeds for each wheel are calculated. A Raspberry Pi Camera Module is also connected to the Raspberry Pi and is tasked with acquiring positioning information from the ArUco markers placed in the environment. This information is acquired by a script that runs on the Raspberry Pi and is sent to the Remote Computer via Wi-Fi. A block diagram of the robot architecture is presented in Fig. 2.
Fig. 2. Mobile Robot architecture
To determine the ground truth of the robot, a camera (PlayStation Eye) was placed on the ceiling, in the center of the square where the robot moves, looking down as shown in Fig. 3 (reference frame Xc, Yc, Zc). This camera is connected to a Raspberry Pi 4 that runs a script to obtain the robot's pose. In addition, a 13 cm ArUco marker was placed on top of the robot, centered on its center of rotation. In this way, the camera can detect the marker and determine its pose using the aforementioned ArUco library. The pose extracted from this camera is used to assess the accuracy of the developed system by comparison.
Fig. 3. Referential frames
4 Problem Formulation
The problem of Robot Beacon-Based Localization (Fig. 4) can be defined as the estimation of the robot's pose $X = \begin{bmatrix} x_r & y_r & \theta_r \end{bmatrix}$ by measuring the distances and/or angles to a given number of beacons, $N_B$, placed in the environment at known positions $M_{B,i} = \begin{bmatrix} x_{B,i} & y_{B,i} \end{bmatrix}$, $i \in 1 \ldots N_B$. $X$ and $M_{B,i}$ are both defined in the global reference frame $^{W}X\,^{W}Y$.
For the specific problem, the following assumptions were made:
– Since the robot only navigates on the ground, without any inclinations, it is
assumed that the robot navigates on a two-dimensional plane.
– The beacon positions, $M_{B,i}$, are stationary and previously known.
– Every 40 ms the odometry values are available. These values are used to provide the system input $u_k = \begin{bmatrix} v_k & \omega_k \end{bmatrix}^T$, which refers to the odometry variation between the instants $k$ and $k-1$.
– $Z_{B,i}(k) = [r_{B,i}]$ and $Z_{B,i}(k) = [\phi_{B,i}]$ identify, respectively, the distance measurement between the Pozyx tag and the Pozyx anchor $i$, and the angle measurement between the robot camera and the ArUco marker $i$, at instant $k$.
Fig. 4. 2D localization given measurements to beacons with a known position
4.1 ArUco Markers
As already mentioned, the system is capable of extracting angle measurements from the ArUco markers arranged in the environment, relative to the reference frame of the robot. Each marker has an associated identification number that can be extracted when the marker is detected in an image, which distinguishes each reference point. Using the ArUco library, it is possible to obtain the rotation and translation $(x, y, z)$ of the ArUco reference frame with respect to the camera reference frame (embedded in the robot). This allows the angle $\phi$ from the marker to the robot to be extracted through Eq. 1.

$\phi = \operatorname{atan2}(x, z)$ (1)

Since the movement of the robot is only in 2D, the height ($y$) of the marker is ignored in the calculations, so the angle expresses exclusively a horizontal difference. Note that the camera has been calibrated with a chessboard so that the measurements taken are reliable [12].
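For illustration, a minimal Python sketch of this angle extraction is given below. It assumes the classic cv2.aruco API from opencv-contrib (whose return values vary slightly between OpenCV versions); the marker dictionary, marker size and calibration file names are placeholders, not values taken from the paper.

import math
import cv2
import numpy as np

camera_matrix = np.load("camera_matrix.npy")   # placeholder: result of the chessboard calibration [12]
dist_coeffs = np.load("dist_coeffs.npy")       # placeholder: distortion coefficients from the same calibration
dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)  # assumed marker dictionary
MARKER_SIZE = 0.13                             # marker side length in meters (placeholder value)

def marker_angles(image):
    # Detect the ArUco markers in the image and return {marker id: phi angle}.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    corners, ids, _ = cv2.aruco.detectMarkers(gray, dictionary)
    angles = {}
    if ids is None:
        return angles
    # Pose of each marker in the camera reference frame (tvec = (x, y, z))
    _, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
        corners, MARKER_SIZE, camera_matrix, dist_coeffs)
    for marker_id, tvec in zip(ids.flatten(), tvecs):
        x, _, z = tvec.ravel()                      # the height (y) is ignored, as in the text
        angles[int(marker_id)] = math.atan2(x, z)   # Eq. 1
    return angles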
4.2 Extended Kalman Filter Localization
The beacon measurements alone cannot return the location of the robot, so it is necessary to implement an algorithm to obtain this information. For this purpose the well-known EKF is used, which is widely applied to this type of problem. The EKF continuously updates a linearization around the previous state estimate and approximates the state densities by Gaussian densities. It thus allows the fusion between the odometry model and the Pozyx and ArUco measurements, and is divided into two phases: the Prediction phase, which estimates the current pose through the state transition model, and the Correction phase, which corrects the predicted state with the Pozyx distance and ArUco angle measurements through the observation model. The EKF algorithm is presented in Algorithm 1.
State Transition Model. The state transition model, presented in Eq. 3, is a probabilistic model that describes the state distribution according to the inputs, in this case the linear and angular velocities of the robot. In addition to the transition function $f(X_k, u_k)$, the noise model $N$ is represented by a Gaussian distribution with zero mean and covariance $Q$ (Eq. 2), assumed constant.
$X_{k+1} = f(X_k, u_k) + N(0, Q), \qquad Q = \begin{bmatrix} \sigma^2_{v_k} & 0 \\ 0 & \sigma^2_{\omega_k} \end{bmatrix}$ (2)

$f(X_k, u_k) = X_k + \begin{bmatrix} v_k \, \Delta t \, \cos\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) \\ v_k \, \Delta t \, \sin\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) \\ \omega_k \, \Delta t \end{bmatrix}$ (3)
In order to linearize the process, the EKF uses a Taylor expansion, propagating the uncertainty by calculating the Jacobians of $f$ with respect to $X_k$ (Eq. 4) and $u_k$ (Eq. 5).
$\nabla f_x = \frac{\partial f}{\partial X_k} = \begin{bmatrix} 1 & 0 & -v_k \, \Delta t \, \sin\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) \\ 0 & 1 & v_k \, \Delta t \, \cos\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) \\ 0 & 0 & 1 \end{bmatrix}$ (4)

$\nabla f_u = \frac{\partial f}{\partial u_k} = \begin{bmatrix} \Delta t \, \cos\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) & -\frac{1}{2} v_k \, \Delta t^2 \, \sin\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) \\ \Delta t \, \sin\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) & \frac{1}{2} v_k \, \Delta t^2 \, \cos\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) \\ 0 & \Delta t \end{bmatrix}$ (5)
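A minimal numpy sketch of the prediction step defined by Eqs. 3-5 (and by line 4 of Algorithm 1) is shown next; variable names are illustrative and the 40 ms odometry period is used as the default time step.

import numpy as np

def ekf_predict(X, P, u, Q, dt=0.04):
    # X = [x, y, theta], P = state covariance, u = [v, w], Q = input noise covariance (Eq. 2)
    x, y, theta = X
    v, w = u
    a = theta + w * dt / 2.0

    # State transition f(X_k, u_k), Eq. 3
    X_pred = X + np.array([v * dt * np.cos(a),
                           v * dt * np.sin(a),
                           w * dt])

    # Jacobians with respect to the state (Eq. 4) and the input (Eq. 5)
    Fx = np.array([[1.0, 0.0, -v * dt * np.sin(a)],
                   [0.0, 1.0,  v * dt * np.cos(a)],
                   [0.0, 0.0,  1.0]])
    Fu = np.array([[dt * np.cos(a), -0.5 * v * dt**2 * np.sin(a)],
                   [dt * np.sin(a),  0.5 * v * dt**2 * np.cos(a)],
                   [0.0,             dt]])

    # Covariance propagation (line 4 of Algorithm 1)
    P_pred = Fx @ P @ Fx.T + Fu @ Q @ Fu.T
    return X_pred, P_pred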
Observation Model. The filter has two types of information inputs: the distance measurements made by Pozyx to each of the anchors and the angle measurements made by the robot's camera to each of the ArUco markers. The filter therefore has two observation models, one for each type of measurement. Both models share the covariance matrix of the current state estimate of the filter. The model corresponding to the Pozyx measurements is presented in Eqs. 6 and 7 and characterizes the Euclidean distance between the estimated position of the robot and the position of the anchor. The model corresponding to the camera measurements is presented in Eqs. 9 and 10 and characterizes the angle between the robot's reference frame and the ArUco marker. Both measurements are affected by an additive Gaussian noise with zero mean and constant covariance ($R$), a parameter related to the error characterization of the Pozyx system and of the robot camera, respectively.
It should also be noted that, as for the state transition model, the EKF algorithm also needs the Jacobian of $h$ with respect to $X_k$ (Eqs. 8 and 11).
$Z_{B,i} = r_{B,i} = h(M_{B,i}, X_k) + N(0, R), \qquad R = [\sigma^2_r]$ (6)

$\hat{Z}_{B,i} = h(M_{B,i}, \hat{X}_k) = \sqrt{(x_{B,i} - \hat{x}_{r,k})^2 + (y_{B,i} - \hat{y}_{r,k})^2}$ (7)

$\nabla h_x = \frac{\partial h}{\partial X_k} = \begin{bmatrix} -\frac{x_{B,i} - \hat{x}_{r,k}}{\sqrt{(x_{B,i} - \hat{x}_{r,k})^2 + (y_{B,i} - \hat{y}_{r,k})^2}} & -\frac{y_{B,i} - \hat{y}_{r,k}}{\sqrt{(x_{B,i} - \hat{x}_{r,k})^2 + (y_{B,i} - \hat{y}_{r,k})^2}} & 0 \end{bmatrix}$ (8)

$Z_{B,i} = \phi_{B,i} = h(M_{B,i}, X_k) + N(0, R), \qquad R = [\sigma^2_\phi]$ (9)

$\hat{Z}_{B,i} = h(M_{B,i}, \hat{X}_k) = \operatorname{atan2}(y_{B,i} - \hat{y}_{r,k},\; x_{B,i} - \hat{x}_{r,k}) - \hat{\theta}_{r,k}$ (10)

$\nabla h_x = \frac{\partial h}{\partial X_k} = \begin{bmatrix} \frac{y_{B,i} - \hat{y}_{r,k}}{(x_{B,i} - \hat{x}_{r,k})^2 + (y_{B,i} - \hat{y}_{r,k})^2} & -\frac{x_{B,i} - \hat{x}_{r,k}}{(x_{B,i} - \hat{x}_{r,k})^2 + (y_{B,i} - \hat{y}_{r,k})^2} & -1 \end{bmatrix}$ (11)
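The two observation models can be written compactly as follows; this is a minimal numpy sketch with illustrative function names, not the authors' code.

import numpy as np

def range_observation(X_hat, anchor):
    # Expected Pozyx distance (Eq. 7) and its Jacobian (Eq. 8) for one anchor.
    xr, yr, _ = X_hat
    xb, yb = anchor
    dx, dy = xb - xr, yb - yr
    d = np.hypot(dx, dy)
    H = np.array([[-dx / d, -dy / d, 0.0]])
    return d, H

def bearing_observation(X_hat, marker):
    # Expected ArUco angle (Eq. 10) and its Jacobian (Eq. 11) for one marker.
    xr, yr, theta = X_hat
    xb, yb = marker
    dx, dy = xb - xr, yb - yr
    q = dx**2 + dy**2
    phi = np.arctan2(dy, dx) - theta
    H = np.array([[dy / q, -dx / q, -1.0]])
    return phi, H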
Outlier Detection. Measurements taken with sensors are prone to errors that can be generated by hardware failures or unexpected changes in the environment. To make proper use of the measurements obtained, the system must be able to detect and deal with the less reliable ones, i.e., those that deviate from the normal pattern, called outliers. Although there are several approaches to this problem, in the presence of multivariate data one of the common methods is to use the Mahalanobis Distance (MD) (Eq. 12) as a statistical measure of the probability of an observation belonging to a given data set. In this approach, outliers are removed whenever they lie outside a specific ellipse that corresponds to a specific probability in the distribution (Algorithm 2). In [10,11] it can be seen that this approach, which consists of calculating the normalized distance between a point and the entire population, has been used successfully in different applications.
$MD(x) = (x - \mu)^T \cdot S^{-1} \cdot (x - \mu)$ (12)
In this case $\hat{Z}_{B,i}$ is the population mean $\mu$ and $S$ is a factor representing a combination of the state uncertainty and the actual sensor measurement, given in line 8 of Algorithm 1. The choice of the threshold value can be made by a simple analysis of the data or by determining a probabilistic statistical parameter. In this case the threshold was established according to the χ² probability table, so that observations with less than 5% probability were cut off in the case of the Pozyx system. For the ArUco system a threshold was chosen so that observations with less than 2.5% probability were cut off. These values were chosen through several iterations of the algorithm, keeping the values for which the algorithm behaved best.
Algorithm 1: EKF Localization Algorithm with known measurements
Input:  X̂_k, Z_{B,i}, u_k, P_k, R, Q
Output: X̂_{k+1}, P_{k+1}
PREDICTION
1:  X̂_{k+1} = f(X̂_k, u_k)
2:  ∇f_x = ∂f/∂X_k
3:  ∇f_u = ∂f/∂u_k
4:  P_{k+1} = ∇f_x · P_k · ∇f_x^T + ∇f_u · Q · ∇f_u^T
CORRECTION
5:  for i = 1 to N_B do
6:      Ẑ_{B,i} = h(M_{B,i}, X̂_k)
7:      ∇h_x = ∂h/∂X_k
8:      S_{k,i} = ∇h_x · P_k · ∇h_x^T + R
9:      OutlierDetected = OutlierDetection(S_{k,i}, Ẑ_{B,i}, Z_{B,i})
10:     if not OutlierDetected then
11:         K_{k,i} = P_{k+1} · ∇h_x^T · S_{k,i}^{-1}
12:         X̂_{k+1} = X̂_k + K_{k,i} · [Z_{B,i} − Ẑ_{B,i}]
13:         P_{k+1} = [I − K_{k,i} · ∇h_x] · P_{k+1}
14:     end
15: end
Algorithm 2: Outlier Detection Algorithm
Input:  S_{k,i}, Ẑ_{B,i}, Z_{B,i}
Output: OutlierDetected
1: MD(Z_{B,i}) = (Z_{B,i} − Ẑ_{B,i})^T · S_{k,i}^{-1} · (Z_{B,i} − Ẑ_{B,i})
2: if MD(Z_{B,i}) > Threshold then
3:     OutlierDetected = True
4: else
5:     OutlierDetected = False
6: end
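A minimal Python sketch of Algorithm 2 follows; the chi-square threshold values are assumptions derived from the probabilities quoted above for a one-dimensional measurement.

import numpy as np

def outlier_detection(S, z_hat, z, threshold):
    # Squared Mahalanobis distance between the measurement and its prediction (Eq. 12)
    innovation = np.atleast_1d(z - z_hat)
    S = np.atleast_2d(S)
    md = float(innovation @ np.linalg.inv(S) @ innovation)
    return md > threshold

# Assumed thresholds for a 1-D measurement (chi-square, 1 degree of freedom):
# about 3.84 rejects observations with less than 5% probability (Pozyx),
# about 5.02 rejects observations with less than 2.5% probability (ArUco).
POZYX_THRESHOLD = 3.84
ARUCO_THRESHOLD = 5.02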
Pozyx Offset Correction. The Pozyx distance measurements suffer from different offsets, i.e. differences between the actual value and the measured value, depending on how far the Pozyx tag is from a given anchor. These offsets also vary with the anchor to which the distance is measured, as verified in Sect. 5.1. In order to correct this offset, Algorithm 3 was implemented. After calculating the estimated distance from the anchor to the robot tag, the algorithm checks between which predefined values this distance lies, and the correction is then made through a linear interpolation. The variable RealDist is an array with a set of predefined distances for which the offsets have been determined. The variable Offset is a 4 × length(RealDist) matrix containing the offsets corresponding to each distance for each of the four anchors placed in the environment.
Algorithm 3: Offset Correction Algorithm
Input:  Ẑ_{B,i}
Output: Ẑ_{B,i}
1:  if Ẑ_{B,i} ≤ RealDist[0] then
2:      Ẑ_{B,i} = Ẑ_{B,i} + Offset[i, 0]
3:  else if Ẑ_{B,i} ≥ RealDist[7] then
4:      Ẑ_{B,i} = Ẑ_{B,i} + Offset[i, 7]
5:  else
6:      for j = 0 to length(RealDist) − 1 do
7:          if Ẑ_{B,i} ≥ RealDist[j] and Ẑ_{B,i} ≤ RealDist[j + 1] then
8:              OffsetLin = Offset[i, j] + (Offset[i, j+1] − Offset[i, j]) / (RealDist[j+1] − RealDist[j]) · (Ẑ_{B,i} − RealDist[j])
9:              Ẑ_{B,i} = Ẑ_{B,i} + OffsetLin
10:         end
11:     end
12: end
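Since Algorithm 3 is a clamped linear interpolation over the calibration points, it can be sketched in a few lines of Python; the calibration arrays below are placeholders to be filled with the values determined in Sect. 5.1.

import numpy as np

RealDist = np.array([600, 800, 1000, 1200, 1400, 1600, 1800, 2000])  # mm, characterization distances
Offset = np.zeros((4, len(RealDist)))  # measured offsets per anchor (placeholder values)

def correct_offset(z_hat, anchor_index):
    # np.interp clamps to the first/last offset outside the calibrated range,
    # which matches lines 1-4 of Algorithm 3.
    return z_hat + np.interp(z_hat, RealDist, Offset[anchor_index])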
5 Results and Discussion
The developed localization system was tested on the robot in a 2 m × 2 m square area (Fig. 5). The output state X̂_{k+1} of the Kalman Filter is used to control the robot so that it follows the desired path. Pozyx anchors were placed at the corners of the square and ArUco markers at the following (x, y) positions, in meters: (0, 0.5), (0, 1.5), (0.5, 2), (1.5, 2), (2, 1.5), (2, 0.5), (1.5, 0) and (0.5, 0). The initial (x, y, θ) pose of the robot is (0.4, 0.5, 90°) and the robot moves in a clockwise direction.
Before proceeding to the localization tests it was necessary to characterize the error of both the Pozyx system and the ArUco system in order to correctly implement the algorithms discussed. When characterizing the performance of the systems it is important to realize that an error can have different sources and can be of different types. The experiments performed in the characterization of the systems focus mainly on the analysis of accuracy and precision, which are typically associated with systematic and random errors, respectively. In this particular work the characterization of sensor accuracy is particularly important, since it is used in the EKF to characterize the measurement noise. Furthermore, accuracy measures how close the observations are to the real values, thus characterizing the existing offset.
Fig. 5. Robot map
5.1 Pozyx Characterization
In order to obtain a complete characterization of the Pozyx system and its error, several experiments were performed in which distance measurements were taken between the static anchors and the Pozyx tag. These experiments were performed for different distances: [0.6, 0.8, 1, 1.2, 1.4, 1.6, 1.8, 2] m. It should also be noted that both the anchors and the tag were at the same height from the ground and that the real distance was measured with a laser distance meter. Thus it was possible to determine the error/offset and standard deviation of the measurements and verify how they evolve with the distance between anchor and tag.
Figure 6 shows the distribution of measurements for a distance of 1.6 m between an anchor and the tag. It can be concluded that the error of the Pozyx system follows a Gaussian distribution.
Fig. 6. Distribution of Pozyx measurements for a distance of 1.6 m
Table 1 shows the results obtained for the offset and the standard deviation for different distances between a given anchor and the Pozyx tag. The same experiments were performed for the four different anchors used in the robot localization. From the analysis of Table 1 it is possible to conclude that, independently of the distance, the standard deviation is approximately constant, i.e. it stays within the same range of values, and therefore an average of these values was used as the sensor noise in the EKF. Regarding the offset, it was concluded that it varies considerably with the distance and the anchor. Therefore, the values estimated by the EKF were adjusted according to the distance between a given anchor and the tag, as mentioned in Sect. 4.2.
Table 1. Offset and SD as a function of measurement Pozyx distance
Real Distance (mm) Mean (mm) Offset (mm) SD (mm)
600 612 12 22.7
800 825 25 29.1
1000 1029 29 29.3
1200 1217 17 33.9
1400 1422 22 26.7
1600 1612 12 29.1
1800 1787 −13 27.2
2000 2020 20 33.9
5.2 ArUco Characterization
For the characterization of the ArUco system a procedure similar to the one used for Pozyx was followed; however, instead of distances, angle measurements were taken, more specifically of the φ angle. In particular, tests were performed with the robot stationary and its built-in camera looking at an ArUco marker. As with the Pozyx system, the center of the camera and the ArUco marker were at the same distance from the ground. Five different experiments were performed, represented in Fig. 7: robot perpendicular and just in front of the marker (1), in front of the marker but rotated by a given angle to the right (2), in front of the marker but rotated by a given angle to the left (3), perpendicular to the marker but translated to the right (4) and, finally, perpendicular to the marker but translated to the left (5).
The histogram resulting from experiment 1 is shown in Fig. 8. Table 2 shows the results obtained from all the experiments.
Fig. 7. ArUco φ angle experiments
Fig. 8. Experiment 1 for ArUco characterization
Table 2. Results of ArUco characterization experiments

Experiment   Real φ Angle (°)   Mean (°)    Error (°)   SD (°)
1             0                  0.0309      0.0309     1.1e-3
2             16.33              16.3359     0.0059     1.0e-3
3            −12.94             −12.9653    −0.0233     1.3e-3
4             11.43              11.4486     0.0186     1.2e-3
5            −15.33             −15.3583    −0.0283     9.0e-4
An average of the standard deviations in Table 2 was used to characterize the noise of the measurements obtained by the camera. During the localization tests it was found that when the robot moves in a straight line, the measurements estimated by the EKF and those actually measured by the camera were close. However, when the robot rotates to change direction, these measurements were far apart, as can be seen in Fig. 9, and for this reason the EKF did not behave as expected. The φ angle measurements from the ArUco markers when the robot is rotating can therefore be considered outliers and are not used for the estimation of the robot's pose. To this end, the same outlier filter used with the Pozyx system and explained in Sect. 4.2 was applied.
Fig. 9. Comparison of φ estimation and ArUco φ measurements
5.3 Localization Results
For the localization tests it was important to tune the EKF, because changing its parameters affects the convergence of the algorithm as well as the smoothness of the resulting trajectory estimate. Based on the noise characterization of the systems discussed above, the R and Q parameters were adjusted so that the estimation results were neither too sensitive to noise nor too slow.
To verify the effectiveness of the proposed system four different tests were performed: using only odometry (Fig. 10a), a fusion of odometry with the Pozyx system (Fig. 10c), a fusion of odometry with the ArUco system (Fig. 10b) and, finally, a fusion of odometry with both the Pozyx and ArUco systems (Fig. 10d). Accuracy was determined by comparison with the ground truth discussed in Sect. 3. The errors shown in the figures refer to the errors in the last position of the robot.
Using only the robot's odometry, localization is clearly affected by wheel slippage around curves, and the error increases as the robot travels longer distances. By fusing the odometry with the angle information from the ArUco markers, a position error of 6.186 cm and an orientation error of 0.237° are obtained; the orientation of the robot is thus significantly improved. By fusing the odometry with the Pozyx system, an orientation error of 1.653° and a position error of 1.626 cm are obtained, improving the robot's localization both in position and in orientation. Finally, the fusion of all systems combines the benefits of each of them individually, yielding a position error of 1.987 cm and an orientation error of 0.393°.
(a) Only Odometry measurements
(b) Fusion of odometry and ArUco measurements
(c) Fusion of odometry and Pozyx measurements
(d) Fusion of all measurements
Fig. 10. EKF estimate results
6 Conclusions and Future Work
Overall, the proposed system proved to be effective in improving the robot's localization. The use of the Pozyx system and the ArUco markers improved the estimation of the robot's pose compared to using only the robot's odometry, which accumulates errors over time. However, a careful tuning of the Kalman Filter is necessary to ensure a correct convergence of the robot's pose. The results obtained showed an error in the order of centimeters, unlike other existing technologies, such as GPS and Wi-Fi, whose error is in the order of meters. As future work, additional inputs could be added to the Kalman Filter, such as other measurements extracted from the ArUco markers, i.e. the distance to them and their inclination.
Acknowledgements. This work is financed by National Funds through the Por-
tuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia within project
UIDB/50014/2020.
References
1. Chen, L., Hu, H., McDonald-Maier, K.: EKF based Mobile Robot Localization. In:
Third International Conference on Emerging Security Technologies (2012)
2. Alarifi, A., et al.: Ultra wideband indoor positioning technologies: analysis and
recent advances. In: Sensors (2016)
3. Thrun, S., Fox, D., Burgard, W.: Probabilistic Robotics. Open University Press
(2002)
4. Sakpere, W., Adeyeye-Oshin, M., Mlitwa, N.: A state-of-the-art survey of indoor positioning and navigation systems and technologies. South African Comput. J. 29, 145–197 (2017)
5. Liu, Y., Song, Y.: A robust method of fusing ultra-wideband range measurements with odometry for wheeled robot state estimation in indoor environment. In: 2018 Chinese Control And Decision Conference (CCDC) (2018)
6. Babinec, A., Jurišica, L., Hubinský, P., Duchoň, F.: Visual localization of mobile robot using artificial markers. Procedia Eng. 96, 1–9 (2014)
7. Fiala, M.: ARTag, a fiducial marker system using digital techniques. In: 2005 IEEE
Computer Society Conference on Computer Vision and Pattern Recognition (2005)
8. Zheng, J., Bi, S., Cao, B., Yang, D.: Visual localization of inspection robot using
extended Kalman filter and aruco markers. In: IEEE International Conference on
Robotics and Biomimetics (2018)
9. Oliveira, D., Simões, L., Martins, M., Paulino, M.: EKF-SLAM using visual mark-
ers for ITER’s Remote Handling Transport Casks. In: Instituto Superior Técnico,
Autonomous Systems (2019)
10. Sobreira, H.: Fiabilidade e robustez da localização de robôs móveis (2016)
11. Yenké, B., Aboubakar, M., Titouna, C., Ari, A., Gueroui, A.: Adaptive scheme for
outliers detection in wireless sensor networks. Int. J. Comput. Networks Commun.
Secur. (2017)
12. Camera Calibration. https://docs.opencv.org/master/dc/dbb/tutorial_py_calibration.html
Deep Reinforcement Learning Applied
to a Robotic Pick-and-Place Application
Natanael Magno Gomes¹(B), Felipe N. Martins¹, José Lima²,³, and Heinrich Wörtche¹,⁴
¹ Sensors and Smart Systems Group, Institute of Engineering, Hanze University of Applied Sciences, Groningen, The Netherlands
natanael gomes@msn.com
² The Research Centre in Digitalization and Intelligent Robotics (CeDRI), Polytechnic Institute of Bragança, Bragança, Portugal
³ Centre for Robotics in Industry and Intelligent Systems - INESC TEC, Porto, Portugal
⁴ Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
Abstract. Industrial robot manipulators are widely used for repetitive applications that require high precision, like pick-and-place. In many cases, the movements of industrial robot manipulators are hard-coded or manually defined, and need to be adjusted if the objects being manipulated change position. To increase flexibility, an industrial robot should be able to adjust its configuration in order to grasp objects in variable/unknown positions. This can be achieved by off-the-shelf vision-based solutions, but most require prior knowledge about each object to be manipulated. To address this issue, this work presents a ROS-based deep reinforcement learning solution to robotic grasping for a Collaborative Robot (Cobot) using a depth camera. The solution uses deep Q-learning to process the color and depth images and generate an ε-greedy policy used to define the robot action. The Q-values are estimated using a Convolutional Neural Network (CNN) based on pre-trained models for feature extraction. Experiments were carried out in a simulated environment to compare the performance of four different pre-trained CNN models (ResNext, MobileNet, MNASNet and DenseNet). Results show that the best performance in our application was reached by MobileNet, with an average of 84% accuracy after training in a simulated environment.
Keywords: Cobots · Reinforcement learning · Computer vision ·
Pick-and-place · Grasping
1 Introduction
The usage of robots in industry has been increasing for the past 50 years [1], especially in repetitive tasks. Recently, industrial robots are being deployed
in applications in which they share (part of) their working environment with people. These robots are often referred to as Cobots and are equipped with safety systems according to ISO/TS 15066:2016 [2]. Although Cobots are easy to set up and program, their programs are usually written manually. If there is a change in the position of objects in their workspace, which is common when humans also interact with the scene, the program needs to be adjusted. Therefore, to increase flexibility and to facilitate the implementation of robotic automation, the robot should be able to adjust its configuration in order to interact with objects in variable positions.
A robot manipulator consists of a series of joints and links forming the arm; the end-effector is placed at its far end. The purpose of an end-effector is to act on the environment, for example by manipulating objects in the scene. The most common end-effector for grasping is the simple parallel gripper, consisting of a two-jaw design.
Grasping is a difficult task when objects are not always in the same position. Several techniques have been applied to obtain a grasping position of an object. In [3] a vision technique is used to define candidate points on the object and then triangulate one point where the object can be grasped.
With the evolution of processing power, Computer Vision (CV) has also played an important role in industrial automation for the last 30 years, including depth image processing [4]. CV has been applied from food inspection [5,6] to smartphone parts inspection [7]. Red Green Blue Depth (RGBD) cameras are composed of a sensor capable of acquiring color and depth information and have been used in robotics to increase flexibility and bring new possibilities. Several models are available, e.g. Asus Xtion, Stereolabs ZED, Intel RealSense and the well-known Microsoft Kinect. One approach to grasping different types of objects using RGBD cameras is to create 3D templates of the objects and a database of possible grasping positions. The authors in [8] used a dual Machine Learning (ML) approach, one part to identify familiar objects with spin-images and the other to recognize an appropriate grasping pose. This work also used interactive object labelling and kinesthetic grasp teaching. The success rate varies according to the number of known objects and goes from 45% up to 79% [8].
Deep Convolutional Neural Networks (DCNNs) have been used to identify robotic grasp positions in [9]. The method takes an RGBD image as input and gives a five-dimensional grasp representation, with position (x, y), a grasp rectangle (h, w) and the orientation θ of the grasp rectangle with respect to the horizontal axis. Two Residual Neural Networks (ResNets) with 50 layers each are used to analyse the image and generate the features that feed a shallow CNN which estimates the grasp position. The networks are trained on a large dataset of known objects and their grasp positions.
A Generative Grasping Convolutional Neural Network (GG-CNN) is proposed in [10], a solution that is fast to compute and capable of running in real time at 50 Hz. It uses a DCNN with just 10 to 20 layers to analyse the images and depth information and control the robot in real time to grasp objects, even when they change position in the scene.
In this paper we investigate the use of Reinforcement Learning (RL) to train an Artificial Intelligence (AI) agent to control a Cobot to perform a given pick-and-place task, estimating the grasping position without previous knowledge about the objects. To enable the agent to execute the task, an RGBD camera is used to generate the inputs for the system. An adaptive learning system was implemented to adapt to new situations such as new configurations of robot manipulators and unexpected changes in the environment.
2 Theoretical Background
In this section we present a summary of relevant concepts used in the develop-
ment of our system.
2.1 Convolutional Neural Networks
CNNs are a class of algorithms which use Artificial Neural Networks in combination with convolutional kernels to extract information from a dataset. The convolutional kernel scans the feature space and the result is stored in an array to be used in the next step of the CNN.
CNNs have been applied in different machine learning solutions, such as object detection algorithms, natural language processing, anomaly detection and deep reinforcement learning, among others. The majority of CNN applications are in the computer vision field, with a highlight on object detection and classification algorithms. The next section explores some of these algorithms.
2.2 Object Detection and Classification Algorithms
In the field of artificial intelligence, image processing for object detection and recognition is highly advanced. The increase in Central Processing Unit (CPU) processing power and the increased use of Graphics Processing Units (GPUs) have played an important role in the progress of image processing [11].
The problems of object detection are to detect whether there are objects in the image, to estimate the position of each object in the image and to predict its class. In robotics the orientation of the object can also be very important to determine the correct grasp position. A set of object detection and recognition algorithms is reviewed in this section.
Several feature arrays are extracted from the image and form the base for the next convolutional layer, and so on, to refine and reduce the dimensionality of the features; the last step is a classification Artificial Neural Network (ANN) which gives the output as a degree of certainty over a number of classes. See Fig. 1, where a complete CNN is shown.
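As a purely illustrative example (not taken from the paper), the structure just described can be written in a few lines of PyTorch: stacked convolution and pooling layers followed by a fully connected classifier.

import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        # Feature extraction: convolution kernels alternating with pooling
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Classification: a fully connected ANN over the flattened feature maps
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 56 * 56, n_classes),  # sized for 224 x 224 inputs
        )

    def forward(self, x):
        return self.classifier(self.features(x))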
The learning process of a CNN consists of determining the values of the kernels to be used during the multiple convolution steps. The learning process can take up to hours of processing a labeled data set to estimate the best weights for the specific object. The advantage is that, once the model weights have been determined, they can be stored for future applications.
Fig. 1. CNN complete process: several convolutional layers alternate with pooling and, in the final classification step, a fully connected ANN [12].
In [13] a Regions with Convolutional Neural Networks (R-CNN) algorithm is proposed to solve the problem of object detection. The principle is to propose around 2000 areas of the image with possible objects and, for each one of these, extract features and analyze them with a CNN in order to classify the objects in the image.
The problem of R-CNN is the high processing power needed to perform this task. A modern laptop takes about 40 s to analyze a high definition image using this technique, making real-time video analysis impossible. It can still be used in applications where time is not important or where multiple processors can be used to perform the task, since each processor can analyze one proposed region.
An alternative to R-CNN is called Fast R-CNN [14], where the features are extracted before the region proposition is done; this saves processing time but loses some of the ability for parallel processing. The main difference with respect to R-CNN is the single convolutional feature map computed from the image.
Fast R-CNN is capable of near real-time video analysis on a modern laptop. For real-time applications there is a variation of this algorithm proposed in [15] called Faster R-CNN. It exploits the synergy between steps to reduce the number of proposed objects, resulting in an algorithm capable of analyzing an image in 198 ms, sufficient for video analysis. Faster R-CNN achieves on average over 70% correct identifications.
Extending Faster R-CNN, Mask R-CNN [16,17] creates a pixel segmentation around the object, giving more information about its orientation and, in the case of robotics, a first hint of where to pick the object up. There are also efforts to use depth images with object detection and recognition algorithms, as shown in [18], where the positioning accuracy of the object is higher than with RGB images.
2.3 Deep Reinforcement Learning
Together with Supervised Learning and Unsupervised Learning, RL forms the base of ML algorithms. RL is the area of ML based on rewards, in which the learning process occurs via interaction with the environment. The basic setup includes the agent being trained, the environment, the possible actions the agent can take and the reward the agent receives [19]. The reward can be associated with the action taken or with the new state.
Some problems in RL can be too large to have exact solutions and demand approximate solutions. The use of deep learning to tackle this problem in combination with RL is called Deep Reinforcement Learning (deep RL). Some problems can require more memory than available; for example, a Q-table to store all possible solutions for an input color image of 250 × 250 pixels would require 250 × 250 × 255 × 255 × 255 = 1,036,335,937,500 bytes, or about 1 TB. For such large problems the complete solution can be prohibitive in terms of required memory and processing time.
2.4 Deep Q Learning
For large problems, the Q-table can be approximated using ANNs and CNNs to estimate the Q-values. The Deep Q-Learning Network (DQN) was proposed by [20] to play Atari games at a high level, and later this technique was also used in robotics [21,22]. A self-balancing robot was controlled using a DQN in a simulated environment with better performance than Linear-Quadratic Regulator (LQR) and Fuzzy controllers [23]. Several DQNs have been tested for ultrasound-guided robotic navigation in the human spine to locate the sacrum in [24].
3 Proposed System
The proposed system consists of a collaborative robot equipped with a two-finger gripper and a fixed RGBD camera pointing to the working area. The control architecture was designed considering the use of a DQN to estimate the Q-values in the Q-Estimator. RL demands multiple episodes to obtain the necessary experience. Acquiring experience can be accelerated in a simulated environment, which can also be enriched with data not available in the real world. The proposed architecture, shown in Fig. 2, was designed to work in both simulated and real environments to allow experimentation on a real robot in the future.
The proposed architecture uses Robot Operating System (ROS) topics and services to transmit data between the learning side and the execution side. The boxes shown in blue in Fig. 2 are the ROS drivers, necessary to bring the functionalities of the hardware into the ROS environment. The execution side can be simulated, to easily collect data, or real hardware for fine tuning and evaluation. As in [22], the action space is defined as motor control and the Q-values correspond to the probability of grasp success.
The chosen policy for the RL algorithm is ε-greedy, i.e., pursue the maximum reward, with probability ε of taking a random action instead. The R-Estimator estimates the reward based on the success of the grasp and the distance reached to the objects, following Eq. 1.
$R_t = \begin{cases} \dfrac{1}{d_{t+1}}, & \text{if } 0 \le d_t \le 0.02 \\ 0, & \text{otherwise} \end{cases}$ (1)

where $d_t$ is in meters.
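A minimal sketch of the ε-greedy action selection is given below. The exponential decay of ε between its initial and final values is an assumption (Table 1 later only lists a decay constant), and the default values are the ones from that table.

import math
import random

def select_action(q_values, step, eps0=0.90, eps_f=0.05, eps_decay=200):
    # Exploration probability decays from eps0 towards eps_f with constant eps_decay
    eps = eps_f + (eps0 - eps_f) * math.exp(-step / eps_decay)
    if random.random() < eps:
        return random.randrange(len(q_values))                   # explore: random action
    return max(range(len(q_values)), key=lambda i: q_values[i])  # exploit: highest Q-value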
Fig. 2. Proposed architecture for grasp learning, divided into the execution side (left) and the learning side (right). The modules in blue are ROS drivers and the modules in yellow are Python scripts.
3.1 Action Space
RL gives freedom in choosing the possible actions of the agent; in this work, actions are defined as the possible positions at which to attempt to grasp an object inside the work area, defined as:

$S_a = \{v, w\}$, (2)

where $v$ is the proportional position inside the working area along the x axis and $w$ is the proportional position inside the working area along the y axis. The values are discretized by the output of the CNN.
3.2 Convolutional Neural Network
To estimate the Q-values a CNN is used. For the action space Sa the network consists of two blocks that extract features from the images, a concatenation of the features and another CNN that produces the Q-values. The feature extraction blocks are pre-trained PyTorch models with the final classification network removed. The layer to be removed is different for each model; in general, the fully connected layers are removed. Four models were selected to compose the network: DenseNet, MobileNet, ResNext and MNASNet. The selection criteria considered the feature space and the performance of the models.
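A possible PyTorch sketch of this network is shown below, using the DenseNet variant of Fig. 3. The exact layer ordering of the head, the use of separate branches for the color and depth images and the bilinear upsampling are assumptions for illustration, not details confirmed by the paper.

import torch
import torch.nn as nn
import torchvision.models as models

class QEstimator(nn.Module):
    def __init__(self, n_actions=1):
        super().__init__()
        # Pre-trained backbones with the fully connected classifier removed (1024 x 7 x 7 features each)
        self.rgb_features = models.densenet121(pretrained=True).features
        self.depth_features = models.densenet121(pretrained=True).features
        # Small convolutional head mapping the concatenated features to a grid of Q-values
        self.head = nn.Sequential(
            nn.BatchNorm2d(2048), nn.ReLU(inplace=True),
            nn.Conv2d(2048, 64, kernel_size=1), nn.ReLU(inplace=True),
            nn.BatchNorm2d(64),
            nn.Conv2d(64, n_actions, kernel_size=1),
            nn.Upsample(size=(112, 112), mode="bilinear", align_corners=False),
        )

    def forward(self, rgb, depth):
        # Both inputs are 224 x 224 images normalized with the original dataset statistics [26];
        # the depth image is assumed to be replicated to three channels beforehand.
        feats = torch.cat([self.rgb_features(rgb), self.depth_features(depth)], dim=1)
        return self.head(feats)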
The use of pre-trained PyTorch models reduces the overall training time. However, it brings limitations to the system: the size of the input image must be 224 by 224 pixels and the image must be normalized following the original dataset mean and standard deviation [26]. In general this limits the working area of the algorithm to an approximately square area (Fig. 3).
Fig. 3. The CNN architecture for the action space Sa: the two main blocks are a simplified representation of the pre-trained DenseNet model [25], where only the feature sizes are represented. The features from the DenseNet model are concatenated and used to feed the next CNN; the result is an array of Q-values used to determine the action.
3.3 Simulation Environment
The simulation environment was built on Webots, an open-source robotics simulator [27]. This choice was made considering the usability of the software and its use of computational resources [28]. To enclose the simulation in the ROS environment some modules were implemented: Gripper Control, Camera Control and a Supervisor to control the simulation. The simulated UR3e robot is connected to ROS using the ROS driver provided by the manufacturer and is controlled with the Kinematics module. Figure 4 shows the simulation environment, in which the camera is located in front of the robot, pointing to the working area. A feature of the simulated environment is having control over all object positions and colors. The positions were used as information for the reward, and the color of the table was changed randomly at each episode to increase robustness during training. For each attempt the table color, the number of objects and the positions of the objects were randomly changed.
Webots Gripper Control. The Gripper Control is responsible for reading and controlling the position of the joints of the simulated gripper. It controls all joints, motors and sensors of the simulated gripper. Touch sensors were also added at the tip of the fingers to emulate the feedback signal produced when an object is grasped.
Fig. 4. The virtual environment built on Webots: it consists of a table, a UR3e collab-
orative robot, a camera and the objects used in the training.
The Robotiq 2F-85 is the gripper we are going to use in future experiments with the real robot. It consists of 6 rotational joints intertwined to form the 2 fingers. During tests, the simulation of the closed kinematic chain of this gripper in Webots was not stable. To regain stability in simulation we used a gripper with a simpler mechanical structure but with dimensions similar to those of the Robotiq 2F-85. The gripper used in simulation is shown in detail in Fig. 5.
Fig. 5. Detail of the gripper used in the simulation: its appearance is based on the
Kuka Youbot gripper and its bounding objects are simplified to blocks.
Webots Supervisor. The Supervisor is responsible for resetting the simulation, preparing the position of the objects at the beginning of each episode, changing the color of the table and publishing the position of the objects to the reward estimator. To estimate the distance between the center of the end-effector and the objects, a GPS position sensor is placed at the gripper's center to report its position to the Supervisor. The position of the objects is used to shape the reward proportionally to the distance between the end-effector and the object. Although this information is not available in the real world, it is used to speed up the simulation training sessions.
Webots Camera. The camera simulated in Webots has the same resolution as the Intel RealSense camera. To avoid the need to calibrate the depth camera, the RGB and depth cameras have coincident positions and fields of view in simulation. The field of view is the same as that of the Intel RealSense RGB camera: 69.4° or 1.21 rad.
3.4 Integrator
The Integrator is responsible for connecting all modules, simulated or real. It controls the Webots simulation using the Supervisor API and feeds the RGBD images to the neural network.
Kinematics Module. The kinematics module controls the UR3e robot, simulated or real. It contains several methods to execute the calculations needed for the movement of the Cobot.
Although RL has been used to solve the kinematics in other works [22,29], this is not the case in our system. Instead, we make use of the analytical solution of the forward and inverse kinematics of the UR3e [30]. The Denavit-Hartenberg parameters are used to calculate the forward and inverse kinematics of the robot [31]. Considering that the UR3e has 6 joints, the combination of 3 of these can give 2³ = 8 different configurations for the same pose of the end-effector (elbow up and down, wrist up and down, shoulder forward and back). On top of that, the UR3e joints have a range from −2π to +2π rad, increasing the possible solution space to 2⁶ = 64 different configurations for the same pose of the end-effector. To reduce the problem, the range of the joints is limited via software to −π to +π rad, which still gives 8 possible solutions, from which the solution nearest to the current position is selected.
The kinematics module is capable of moving the robot to any position in the work space while avoiding unreachable positions. To increase the usability of the module, functions with the same behavior as the original Universal Robots "MOVEL" and "MOVEJ" commands have been implemented.
To estimate the cobot joint angles needed to position the end-effector in space, the Tool Center Point (TCP) must be considered in the model. The TCP is the position of the end-effector in relation to the robot flange. The real robot that will be used for future experiments has a Robotiq wrist camera and a 2F-85 gripper, which means that the TCP is 175.5 mm from the robot flange along the z axis [32].
4 Results and Discussion
This section shows the results and discussion of two training sessions with different methods. The tests were performed on a laptop with an i7-9750H CPU, 32 GB RAM and a GTX 1650 4 GB GPU, running Ubuntu 18.04. Although the GPU was not used for the CNN training, the simulation environment made use of it.
4.1 Modules
All modules were tested individually to ensure proper functioning. The ROS communication was tested using the built-in tool rqt to check the connections between nodes via topics or services. The UR3e joint positions are always published in a topic and controlled via an action client. In the simulation environment, the camera images, the gripper control and the supervisor commands are made available via ROS services. Differently from ROS topics, ROS services only transmit data when queried, decreasing the processing demanded from Webots. Figure 6 shows the nodes connected via topics in the simulated environment; services are not represented in this diagram. The diagram was created with rqt.
Fig. 6. Diagram of the nodes running during the testing phase. In the simulated environment most of the data is transmitted via ROS services. In the picture, the topics /follow_joint_trajectory and /gripper_status are responsible for the robot movement and gripper status information exchange, respectively.
CNN. From the four models tested, DenseNet and ResNext demanded more memory than available on the GPU, while MobileNet and MNASNet were capable of running on the GPU. To keep the evaluation fair, all timing tests were performed on the CPU.
4.2 Training
For training the CNN, a Huber loss function [33] and an Adam optimizer [34] with weight decay regularization [35] were used; the hyperparameters used for the RL and CNN training are shown in Table 1.
Table 1. Hyperparameters used in training.

Parameter name               Symbol   Value
Learning rate for CNN        α_CNN    1 × 10⁻³
Weight decay for CNN         λ_w      8 × 10⁻⁴
Learning rate for RL         α_RL     0.7
Discount factor              γ        0.90
Initial exploration factor   ε_0      0.90
Final exploration factor     ε_f      5 × 10⁻²
Exploration factor decay     λ_ε      200
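A simplified sketch of one update with these choices is shown below: Huber loss [33] and Adam with decoupled weight decay [34,35], using the CNN hyperparameters of Table 1. The use of the same network to compute the bootstrap target and the flat indexing of actions are assumptions made only for illustration.

import torch
import torch.nn as nn

def make_optimizer(q_net):
    # Adam with weight decay regularization, using the values of Table 1
    return torch.optim.AdamW(q_net.parameters(), lr=1e-3, weight_decay=8e-4)

def dqn_step(q_net, optimizer, state, action, reward, next_state, gamma=0.90):
    # state / next_state are (rgb, depth) batches; action holds flat indices into the Q-value grid
    q_pred = q_net(*state).flatten(1).gather(1, action.view(-1, 1)).squeeze(1)
    with torch.no_grad():
        q_next = q_net(*next_state).flatten(1).max(dim=1).values
    target = reward + gamma * q_next
    loss = nn.SmoothL1Loss()(q_pred, target)   # Huber loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()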
To avoid color bias in the algorithm, the color of the simulated table was changed for every episode.
Each training session was divided in four parts: collecting data, deciding the action to take based on the estimated Q-values, taking the action and receiving a reward, and training the CNN. Several training sessions were performed and the experience of the previous rounds was used to improve the training process.
The training cycle times are shown in Table 2. Forward is the process following the direction from input to output of the CNN; backward is the process of evaluating the gradient from the difference at the output back to the input. In the backward process the weights of the network are updated with the learning rate α_CNN.
Table 2. Mean time and standard deviation of forward and backward time during
training.
Base model name Forward time (s) Backward time (s)
DenseNet 0.408 ± 0.113 0.676 ± 0.193
ResNext 0.366 ± 0.097 0.760 ± 0.173
MobileNet 0.141 ± 0.036 0.217 ± 0.053
MNASNet 0.156 ± 0.044 0.257 ± 0.074
First Training Session. In the first training round no previous experience is used and the algorithm learns from scratch. The main goal is to obtain information about the training process, namely its cycle time, and to acquire experience to be used in future training sessions. The algorithm was trained according to the most recent experience with a batch size of 1.
During the training session, the accuracy was estimated based on 10 attempts every 10 epochs to verify how well the algorithm was performing at the time. The results are shown in Fig. 7. The training session took from 1:43 to 2:05 h to complete.
In Fig. 7 a training problem is observed where the loss reaches zero and there is no gradient for learning. The algorithm cannot learn and the accuracy shows that the estimated Q-values are poor. There are several possible causes, including CNN weights that are too small and accumulated experience containing mostly errors. The solutions are complex and include fine-tuning the hyperparameters and selecting the best experiences for the algorithm, as shown in [36]. Another solution is to use demonstration through shaping [37], where the reward function is used to generate training data based on demonstrations of the correct action to take. The training data for the second session was generated using the reward function to map all possible rewards of the input.
Fig. 7. The loss and accuracy of the 1000-epoch training session; loss data were smoothed using a third order filter, raw data is shown in light colors.
Fig. 8. The loss and accuracy during the 250-epoch training session; data were smoothed using a third order filter, raw data is shown in light colors.
Second Training Session. The second training session used demonstration through shaping. This was possible because in the simulation environment the information about the position of the objects is available. The training process received experiences generated from the simulation; these experiences contain the best possible action for each episode.
The batch size used in this training session was 10. The increase in batch size combined with the new experience replay caused a larger loss at the beginning of the training session, as seen in Fig. 8. The training session took from 3:43 to 4:18 h to complete. The accuracy was estimated for every epoch based on 10 attempts.
5 Conclusion
This paper presented the use of RL to train an AI agent to control a collaborative robot to perform a pick-and-place task while estimating the grasping position without previous knowledge about the object. An RGBD camera was used to generate the inputs for the system. An adaptive learning system was implemented to adapt to new situations such as new configurations of robot manipulators and unexpected changes in the environment. The results obtained in simulation validate the proposed approach. As future work, an implementation with a real manipulator will be addressed.
Acknowledgements. This work has been supported by FCT - Fundação para a
Ciência e Tecnologia within the Project Scope: UIDB/05757/2020 and by the Inno-
vation Cluster Dracten (ICD), project Collaborative Connected Robots (Cobots) 2.0.
The authors also thank the support from the Research Centre Biobased Economy from
the Hanze University of Applied Sciences.
References
1. Siciliano, B., Khatib, O. (eds.): Springer Handbook of Robotics, pp. 1–2227.
Springer, Cham (2016). https://doi.org/10.1007/978-3-319-32552-1
2. ISO/TS 15066 Robots and robotic devices - Collaborative robots. International
Organization for Standardization, Geneva, CH, Standard, February 2016
3. Saxena, A., Driemeyer, J., Kearns, J., Ng, A.Y.: Robotic grasping of novel objects.
In: Advances in Neural Information Processing Systems, pp. 1209–1216 (2007).
https://doi.org/10.7551/mitpress/7503.003.0156. ISBN: 9780262195683
4. Torras, C.: Computer Vision: Theory and Industrial Applications, p. 455. Springer,
Heidelberg (1992). https://doi.org/10.1007/978-3-642-48675-3. ISBN: 3642486754
5. Gomes, J.F.S., Leta, F.R.: Applications of computer vision techniques in the agri-
culture and food industry: a review. Eur. Food Res. Technol. 235, 989–1000 (2012).
https://doi.org/10.1007/s00217-012-1844-2
6. Arakeri, M.P., Lakshmana: Computer vision based fruit grading system for quality
evaluation of tomato in agriculture industry. Procedia Comput. Sci. 79, 426–433
(2016). https://doi.org/10.1016/j.procs.2016.03.055
7. Bhutta, M.U.M., Aslam, S., Yun, P., Jiao, J., Liu, M.: Smart-inspect: micro scale
localization and classification of smartphone glass defects for industrial automa-
tion. arXiv: 2010.00741, October 2020
8. Shafii, N., Kasaei, S.H., Lopes, L.S.: Learning to grasp familiar objects using
object view recognition and template matching. In: IEEE International Conference
on Intelligent Robots and Systems, vol. 2016-November, pp. 2895–2900. Institute
of Electrical and Electronics Engineers Inc., November 2016. https://doi.org/10.
1109/IROS.2016.7759448. ISBN: 9781509037629
9. Kumra, S., Kanan, C.: Robotic grasp detection using deep convolutional neural net-
works. In: IEEE International Conference on Intelligent Robots and Systems, vol.
2017-September, pp. 769–776. Institute of Electrical and Electronics Engineers Inc.,
November 2017. https://doi.org/10.1109/IROS.2017.8202237. arXiv: 1611.08036.
ISBN: 9781538626825
10. Morrison, D., Corke, P., Leitner, J.: Learning robust, real-time, reactive robotic
grasping. Int. J. Robot. Res. 39(2–3), 183–201 (2020). https://doi.org/10.1177/
0278364919859066. ISSN: 0278-3649
11. Mittal, S., Vaishay, S.: A survey of techniques for optimizing deep
learning on GPUs. J. Syst. Archit. 99, 101635 (2019). https://doi.org/
10.1016/j.sysarc.2019.101635. http://www.sciencedirect.com/science/article/pii/
S1383762119302656. ISSN: 1383-7621
12. Saha, S.: A comprehensive guide to convolutional neural networks - the ELI5 way - by Sumit Saha - Towards Data Science (2018). https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53. Accessed 20 June 2020
13. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accu-
rate object detection and semantic segmentation. In: Proceedings of the IEEE
Computer Society Conference on Computer Vision and Pattern Recognition, pp.
580–587 (2014). https://doi.org/10.1109/CVPR.2014.81. arXiv: 1311.2524. ISBN:
9781479951178
14. Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference
on Computer Vision, vol. 2015 Inter, pp. 1440–1448 (2015). https://doi.org/10.
1109/ICCV.2015.169. arXiv: 1504.08083. ISSN: 15505499
15. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object
detection with region proposal networks. IEEE Trans. Pattern Anal. Mach.
Intell. 39(6), 1137–1149 (2017). https://doi.org/10.1109/TPAMI.2016.2577031.
arXiv: 1506.01497. ISSN: 01628828
16. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. IEEE Trans. Pattern
Anal. Mach. Intell. 42(2), 386–397 (2020). https://doi.org/10.1109/TPAMI.2018.
2844175. arXiv: 1703.06870. ISSN: 19393539
17. Girshick, R., Radosavovic, I., Gkioxari, G., Dollár, P., He, K.: Detectron (2018).
https://github.com/facebookresearch/detectron
18. Debkowski, D.: SuperBadCode/Depth-Mask-RCNN: using Kinect2 depth sensors
to train neural network for object detection  interaction. https://github.com/
SuperBadCode/Depth-Mask-RCNN. Accessed 20 June 2020
19. Sutton, R.S. Barto, A.G.: Reinforcement Learning: An Introduction, 2nd edn, p.
552. The MIT Press, Cambridge (2018). ISBN: 978-0-262-03924-6
20. Zanuttigh, P., Marin, G., Dal Mutto, C., Dominio, F., Minto, L., Cortelazzo, G.M.:
Time-of-Flight and Structured Light Depth Cameras, pp. 1–355. Springer, Cham
(2016). https://doi.org/10.1007/978-3-319-30973-6. ISBN: 9783319309736
21. Zhang, F., Leitner, J., Milford, M., Upcroft, B., Corke, P.: Towards vision-based
deep reinforcement learning for robotic motion control. arXiv: 1511.03791, Novem-
ber 2015
22. Joshi, S., Kumra, S., Sahin, F.: Robotic grasping using deep reinforcement learn-
ing. In: 2020 IEEE 16th International Conference on Automation Science and
Engineering (CASE), pp. 1461–1466. IEEE, August 2020. https://doi.org/10.1109/
CASE48305.2020.9216986. ISBN: 978-1-7281-6904-0
Deep RL Applied to a Robotic Pick-and-Place Application 265
23. Rahman, M.D.M., Rashid, S.M.H., Hossain, M.M.: Implementation of Q learning
and deep Q network for controlling a self balancing robot model. Robot. Biomim.
5(1), 1–6 (2018). https://doi.org/10.1186/s40638-018-0091-9. arXiv: 1807.08272.
ISSN: 2197-3768
24. Hase, H., Azampour, M.F., Tirindelli, M., et al.: Ultrasound-guided robotic navi-
gation with deep reinforcement learning. arXiv: 2003.13321, March 2020
25. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected con-
volutional networks. Technical Report. arXiv: 1608.06993v5. https://github.com/
liuzhuang13/DenseNet
26. Torchvision.models (2019). https://pytorch.org/docs/stable/torchvision/models.
html. Accessed 17 Jan 2021
27. Webots. Commercial Mobile Robot Simulation Software. Cyberbotics Ltd., Ed.
https://www.cyberbotics.com
28. Ayala, A., Cruz, F., Campos, D., Rubio, R., Fernandes, B., Dazeley, R.: A
comparison of humanoid robot simulators: a quantitative approach, pp. 1–10.
arXiv: 2008.04627 (2020)
29. Rajeswaran, A., Kumar, V., Gupta, A., et al.: Learning complex dexterous
manipulation with deep reinforcement learning and demonstrations. Techni-
cal Report. arXiv: 1709.10087v2. http://sites.google.com/view/deeprl-dexterous-
manipulation
30. Hawkins, K.P.: Analytic inverse kinematics for the universal robots UR-5/UR-
10 arms. Technical Report, December 2013. https://smartech.gatech.edu/handle/
1853/50782
31. Universal Robots - Parameters for calculations of kinematics and dynamics.
https://www.universal-robots.com/articles/ur/parameters-for-calculations-of-
kinematics-anddynamics/. Accessed 31 Dec 2020
32. Manual Robotiq 2F-85  2F-140 for e-series universal robots, Robotic, 145 pp.,
November 2018
33. SmoothL1Loss – PyTorch 1.7.0 documentation. https://pytorch.org/docs/stable/
generated/torch.nn.SmoothL1Loss.html. Accessed 15 Jan 2021
34. Kingma, D.P., Ba, J.L.: Adam: a method for stochastic optimization. In: 3rd Inter-
national Conference on Learning Representations, ICLR 2015 - Conference Track
Proceedings, December 2015. arXiv: 1412.6980
35. Loshchilov, I., Hutter, F.: Decoupled weight decay regularization, November 2017.
arXiv: 1711.05101. http://arxiv.org/abs/1711.05101
36. De Bruin, T., Kober, J., Tuyls, K., Babuška, R.: Experience selection
in deep reinforcement learning for control. J. Mach. Learn. Res. 19, 1–
56 (2018). https://doi.org/10.5555/3291125.3291134. http://jmlr.org/papers/v19/
17-131.html. ISSN: 15337928
37. Brys, T., Harutyunyan, A., Suay, H.B., Chernova, S., Taylor, M.E., Nowé, A.:
Reinforcement learning from demonstration through shaping. In: IJCAI Interna-
tional Joint Conference on Artificial Intelligence, vol. 2015-January, pp. 3352–3358
(2015). ISBN: 9781577357384
Measurements with the Internet
of Things
An IoT Approach for Animals Tracking
Matheus Zorawski1(B), Thadeu Brito1,4,5, José Castro2, João Paulo Castro3, Marina Castro3, and José Lima1,4

1 Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253 Bragança, Portugal
{matheuszorawski,brito,jllima}@ipb.pt
2 Instituto Politécnico de Bragança, Bragança, Portugal
mzecast@ipb.pt
3 Centro de Investigação de Montanha, Instituto Politécnico de Bragança, Campus de Santa Apolónia, Bragança, Portugal
{jpmc,marina.castro}@ipb.pt
4 INESC-TEC - INESC Technology and Science, Porto, Portugal
5 Faculty of Engineering of University of Porto, Porto, Portugal
Abstract. Pastoral activities bring several benefits to the ecosystem and rural communities. These activities are already carried out daily with goats, cows and sheep in Portugal; still, they could be better exploited to take advantage of their benefits. Most of these pastoral ecosystem services are not remunerated, which points to the need to make these activities more attractive so that they bring returns to shepherds, breeders and landowners. Monitoring these activities provides data to value these services and can also indicate directly to shepherds the routes along which to drive their flocks, together with the respective return. There are devices on the market that perform this monitoring, but they are not adaptable to the circumstances and challenges found in the Northeast of Portugal. This work addresses a system to perform animal tracking, and the development of a test platform, using long-range transmission technologies based on the LoRaWAN architecture. The results demonstrate the use of LoRaWAN for tracking services, supporting conclusions about the viability of the proposed methodology and the direction of future work.
Keywords: Animals tracking · IoT · GPS · Grazing monitoring ·
LoRaWAN
1 Introduction
According to breeder organizations, nearly 100,000 indigenous cows, sheep, and
goats graze daily on Portugal’s northeast rangelands. Pastoralism is widely con-
sidered to have a critical role in strengthening rural communities’ resilience and
sustainability to face depopulation and climate change [1]. Every day, hundreds
Supported by FEDER (Fundo Europeu de Desenvolvimento Regional).
of shepherds drive their flocks of 200 or fewer through multiple-ownership, small-
patched, and intricate rangelands around the farmstead [2]. Thanks to them,
natural vegetation and agricultural remnants yield excellent ecosystem services
like providing high-quality and protein-rich food, reducing fire hazards, recy-
cling organic matter, and strengthening rural communities’ identities and sense
of belonging [3,4]. Although herders are among the most impoverished peo-
ple, most of these pastoral ecosystem services are not remunerated. How can
shepherds, breeders, and landowners receive fair returns from such complicated
structures? Due to the difficulty of assessment, decision-makers struggle to regulate and implement a payment system, which requires information on multiple vegetation changes, various grazing pressures, different species, flock sizes, durations, and seasons.
The IPB team addresses this challenge using 30 years of joint research on tra-
ditional grazing systems in Trás-os-Montes [2]. They have demonstrated exper-
tise in updating and providing high-precision remote sensing techniques and
interpretation modeling methodologies adapted to handling large amounts of
dynamic data [4,5]. As part of its innovation and demonstration activity, the
IPB team’s projects run ten experimental sites grazed by sheep, goats, cows,
and donkeys monitored by Global Navigation Satellite System (GNSS) collars.
The team has learned the herds’ routes and routines and can accurately describe
their land and vegetation pressure. Thus, how different herds seasonally modify
the vegetation through unique pressures is the focus [2,4]. However, the IPB team's developments have encountered constraints with the devices and systems available on the market, which are designed for different contexts and purposes and are not aimed at the circumstances of roaming pastoralism in the Northeast of Portugal. As an example, the approach adopted by the IPB team uses collars with data transmission via the Global System for Mobile Communications (GSM) network to track these animals. Unfortunately, this technology has a high energy consumption (causing low battery autonomy and a reduced data rate), requires a contract with telecommunications companies (which increases the project costs), and the collars are not open devices that allow modifications or the addition of new tools (such as sensors) according to the intended use.
A grazing monitoring system tailored to local circumstances requires increased capabilities to record and transmit geolocations from remote sites at high frequency, while being self-powered and low priced. Based on continuous feedback from the stakeholders supporting the IPB team (breeder organizations, the Ministry of Agriculture, and the National Agency for Forest Fires), this paper is part of the development of such a system: a tracking and monitoring device that can be attached to the animal and report its behavior, i.e., perform the digitalization process. The aim is to keep the movement history of each animal so that this information can later be analyzed and used, for example, by a correlation or estimation algorithm. The GPS position, temperature and humidity are the variables currently measured, but the proposed system is open and ready to include new ones. This paper presents the architecture of an Internet of Things (IoT) approach to track animals (based on their position) and acquire environmental conditions. As a result, the
acquisition, transmission, storage and visualization of the data allow the proposed approach to be validated.
The rest of the paper is organized as follows: after this introduction, Sect. 2 compiles related work in which similar approaches are applied. Then, Sect. 3 presents the system description, stressing data acquisition, transmission, storage and visualization. Section 4 presents the developments and results. The last section concludes the paper and points out future work directions.
2 Related Work
Tracking animals is a process that brings several advantages, and it has been addressed for several years. On the one hand, different groups of researchers are developing personalized applications, from large-scale animals to the smallest ones [6], such as bees [7]. On the other hand, there are several commercial solutions that fit this methodology but bring some disadvantages, such as price, closed architecture and vendor dependency. Animal tracking is vital in several ways, from territory planning to the understanding of predators [8]. There is also the socio-cultural and economic value of the ecosystems [3]. Another point to monitor is the movement of flocks, which produces different grazing pressures over the landscape [4], as well as the seasonality of grazing itineraries [2]. Another justification for animal tracking is related to the equilibrium of ecological dynamics on rangelands [1]. So, in recent years, several technologies have been applied to animal tracking and behavior monitoring. Tracking animals over the cellular network is expensive (a company must be contracted to support the communications) and, in some cases, impractical, because the high energy consumption results in recurrent maintenance, in addition to the dependency on network coverage.
The fourth industrial revolution brought methodologies that can be used for this purpose; one of them is the Internet of Things (IoT). The authors of [5] present an animal behavior monitoring platform based on IoT technologies that includes an IoT local network to gather data from animals in order to autonomously shepherd ovines within vineyard areas. In this case, the data is stored on a cloud platform with processing capability based on a machine learning algorithm that allows relevant information to be extracted. Regarding network connections, IoT also brought new low-power network approaches to be applied in this field, such as LoRa and Sigfox, among others. The authors of [9] propose a novel design of a mesh network formed by LoRa-based animal collars or tags for tracking location and monitoring other factors via sensors over very wide areas. Different communication technologies can be found in [10], where collars are connected to a Low Power Wide Area (LPWA) Sigfox network and low-cost Bluetooth Low Energy (BLE) tags are connected to those collars. On the commercial side, a large number of solutions have appeared that may fit animal tracking; some examples are Milsar™ [11] and Digitanimal™ [12]. There are also organizations that freely share databases of worldwide
animal tracking data, such as Movebank™ [13], hosted by the Max Planck Institute of Animal Behavior, where researchers can manage, share, protect, analyze and archive animal data.
Based on this study, the present paper describes a solution developed to track animals while monitoring their behavior. The data is collected and stored in InfluxDB, where adaptive algorithms can be used to obtain precise information. Developing a personalized application has the advantage of including the characteristics that the team desires; as a modular approach, other sensors can be added to the system.
3 System Description
Since the main objective of this work is the monitoring of pastoral activities, the basic architecture of the system can be simplified as shown in Fig. 1, in which each step (first, the acquisition of coordinates in the monitoring module; second, the transmission of these data; and third, the visualization on a platform) can be approached in several ways.
Fig. 1. System description.
Before defining which methods and tools will be used to achieve the desired system, one must verify the main needs of the project and which architecture should be used. As this project aims at automatic monitoring (not depending on the action of the shepherd) with a real-time display of data on multiple devices, it was decided to use a collar as the monitoring device and to connect the signal receiver to the Internet, so that storage and display are not exclusive to the local node. Thus, Fig. 2 describes how the system architecture should be, adding the requirements of this project.
With the system requirements defined, the feasibility and restrictions of this architecture must be taken into consideration to choose which method and strategy will be adopted and, consequently, which tools will be used. As the Open2 project works with animal monitoring, the device must be wireless, which implies the use of batteries and, consequently, low power consumption. Another critical point for the project is the low cost of the entire system, which motivates building the tools in-house and using free applications.
Fig. 2. Animals tracking system description.
Considering these requirements, an alternative technology is LoRa (Long Range), a long-range, low-power wireless modulation technique that is ideal for applications with small data transfers that need a longer range than WiFi, Bluetooth or ZigBee can offer. With LoRa modulation, it is possible to create an LPWAN in which the LoRaWAN protocol (a protocol for accessing cloud-based services) manages the communication between the LoRa devices and the LoRaWAN gateway, which is connected by cable to the Internet, thus establishing real-time communication between End Node devices and applications connected to the Internet (delayed only by the airtime of the message, which is normally less than 100 ms), as depicted by the architecture in Fig. 3.
Fig. 3. LoRaWAN architecture.
With the LoRaWAN architecture, the data that arrives at the gateway must pass through a network server in order to be sent to the applications; in this project, The Things Network (TTN) was used. This free and open-source service provides a worldwide LoRaWAN network, only requiring compliance with TTN's fair use policy. The architecture used in this project is detailed below in two parts, the first presenting the LoRa device (data acquisition and transmission) and the second the data application (data storage and visualization).
3.1 Data Acquisition and Transmission
The elements responsible for the initial stages of a LoRaWAN network are the LoRa devices, called End Nodes, as described in Fig. 3. The End Nodes send the monitored data, usually sensor values, to the gateway. As the objective of this project was the wireless monitoring of the End Node coordinates (which represents the animal tracking), the architecture depicted in Fig. 4 was adopted.
Fig. 4. Data acquisition and transmission system.
The End Device thus requires as main elements: a wireless power supply system, a microcontroller, a LoRa transceiver module, a battery level reader, and the sensor that acquires parameters from the real world. As this project was developed in a research centre, it was possible to use a device with these characteristics that was available from another project (the SAFe project). In addition, other points considered at this stage were battery saving and safety: code was developed to turn off, or put into sleep mode, the components that were not being used and to limit operation to a minimum battery level, as sketched below.
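The listing below is a minimal sketch of this power-saving logic, not the actual SAFe firmware: the helper functions (read_battery_adc, gps_wake, gps_get_fix, gps_sleep, lora_send_position, deep_sleep_seconds), the battery cutoff and the sending period are hypothetical placeholders for whatever the embedded code provides.

#include <stdbool.h>
#include <stdint.h>

#define BATTERY_MIN_ADC 600u         /* assumed cutoff on the 10-bit battery ADC reading */
#define SEND_PERIOD_S   (3u * 60u)   /* one position uplink every three minutes */

/* Hypothetical hardware wrappers assumed to be provided by the firmware. */
uint16_t read_battery_adc(void);
void gps_wake(void);
bool gps_get_fix(float *lat, float *lon);
void gps_sleep(void);
void lora_send_position(float lat, float lon, uint16_t battery);
void deep_sleep_seconds(uint32_t s);

void tracking_loop(void)
{
    for (;;) {
        uint16_t battery = read_battery_adc();
        if (battery >= BATTERY_MIN_ADC) {      /* operate only above the minimum battery level */
            float lat, lon;
            gps_wake();                        /* wake the GPS only when a position update is needed */
            if (gps_get_fix(&lat, &lon))
                lora_send_position(lat, lon, battery);
            gps_sleep();                       /* put the GPS back to sleep immediately */
        }
        deep_sleep_seconds(SEND_PERIOD_S);     /* keep the rest of the node asleep between uplinks */
    }
}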
3.2 Data Storage and Visualization
With the data properly transmitted to the gateway and to a server (TTN), the payload can be used in the applications. Node-RED was used to facilitate data visualization and storage, as depicted in Fig. 5.
Fig. 5. Data storage and visualization system.
Node-RED facilitates the use of the tools existing on the platform, such as local or database storage of the received data and visualization of the data, among others. In this project, storage in .log format, InfluxDB and map trajectory visualization (mapper) were used. Grafana helps in the visualization of data stored in a database, representing these data in several graph formats or even on a map. In this way, Grafana can support data visualization from more than one device, for instance by creating multiple visualizations on this platform to see the data from many devices.
4 Development and Results
The initial tests in this work were performed to verify the system components (the GPS and the LoRa device) and later to interconnect both, applying the LoRa transmission to send the GPS data to the gateway. Figure 6 represents the components used in this step: an Arduino UNO, an Arduino shield (containing the GPS module and the LoRa RFM95 device) and the gateway used to receive the data from the node.
With the LoRa device and the gateway registered in TTN, the GPS data was imported into Node-RED to create, in a more straightforward way, the connection between the device and its applications. Figure 7 represents the flow created in
Fig. 6. Components used for the initial tests.
Node-RED to track the LoRa device in real time, show the path travelled, and store data locally in a .log file and in the InfluxDB database. This flow can be applied to multiple devices, allowing the storage and display of data from the registered TTN devices.
Fig. 7. Flow used to track the LoRa device.
After implementing the Node-RED and verifying the system’s operation to
better acquire the coordinates and trajectories, the End Node was replaced by
a device capable of wireless data transmission. For this it was used the Printed
Circuit Board (PCB) of the SAFe project [14–17] (shown in Fig. 8), which already
contained part of the architecture presented in Fig. 4, with a DHT sensor module,
powered by a battery and protected by a 3D printed box, only needing to add the
GPS module. The temperature and humidity sensor (DTH11) was used when
sent the payload, but has not been applied in the following visualization steps.
Fig. 8. PCB adapted from SAFe project and 3D printed box.
With this End Node device, the tracking tests were performed, sending the signal every three minutes, in the 863–870 MHz frequency band, with a Class A LoRaWAN device, which allowed an autonomy of two weeks thanks to the GPS sleep mode, in which the module is only woken up when a position update is needed. Display results were obtained using Node-RED's own World Map, as represented in Fig. 9, implemented to display the trajectory and the last received point, which made it possible to validate the data obtained by the End Device; an illustrative payload layout for these uplinks is sketched after Fig. 9.
Fig. 9. Node-RED's World Map displaying the device tracking.
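The paper does not detail the byte layout of the uplink payload; purely as an illustration, the sketch below packs the transmitted variables (GPS position, DHT11 temperature and humidity, and battery level) into 12 Bytes, with latitude and longitude scaled by 1e5 into signed 32-bit integers. Both the scaling and the big-endian byte order are assumptions, not the prototype's actual format.

#include <stdint.h>

/* Illustrative 12-Byte uplink payload: 4 Bytes each for latitude and
 * longitude (degrees scaled by 1e5), one Byte each for temperature (deg C)
 * and relative humidity (%), and 2 Bytes for the battery ADC value. */
static void build_payload(uint8_t out[12], double lat, double lon,
                          uint8_t temp_c, uint8_t rh, uint16_t batt)
{
    uint32_t ulat = (uint32_t)(int32_t)(lat * 100000.0);
    uint32_t ulon = (uint32_t)(int32_t)(lon * 100000.0);
    for (int i = 0; i < 4; i++) {              /* big-endian byte order */
        out[i]     = (uint8_t)(ulat >> (24 - 8 * i));
        out[4 + i] = (uint8_t)(ulon >> (24 - 8 * i));
    }
    out[8]  = temp_c;
    out[9]  = rh;
    out[10] = (uint8_t)(batt >> 8);
    out[11] = (uint8_t)(batt & 0xFF);
}

Such a compact payload keeps the airtime short and fits comfortably within LoRaWAN payload limits, which also helps to respect TTN's fair use policy.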
Using InfluxDB, a database was created in which, through Node-RED, the received device position data is appended, together with metadata such as the time the data was received and other information related to the LoRaWAN signal. Grafana was used as a second platform to test other ways of applying the data acquired by the End Node: it imported data from the database, representing the stored points together with the quality of the LoRa signal captured at each point (Fig. 10).
Fig. 10. Grafana’s world map displaying the geolocation points acquired.
Tests on Grafana to verify the application’s capabilities for interactive data
visualization for this project made it possible to verify tools already implemented
on the platform, in addition to the World Map visualization, also the possibility
to display data according to stipulated schedules. Thus, the system has only
hardware costs, the Gateway being the most expensive, and the value of each
End Node is a low-cost product compared to other devices available on the
market.
5 Conclusion and Future Work
A large number of cows, goats and sheep perform grazing activities every day in Portugal, which brings several benefits to society and the ecosystem. This work proposes a way to implement a grazing monitoring system that fits the requirements (wireless device, storage and transmission of geolocation, high data transmission rate, and low price) to support the IPB team in their research on a fair return for pastoral services. It is also open to the inclusion of new functionalities for geolocation and environmental conditions. The project went beyond the stages of research and solution proposals for the requirements of this work, developing and testing prototypes to validate the adopted methodology. The tools adopted to visualize the test data met the intended objectives and
showed the potential to be further exploited to improve the application in future projects.
With the tests of the prototype and the applications, several points were observed where the system can be developed in future work. One possible development would be the implementation of a "server machine", hosting the tools used (backup in .log format and port control for Node-RED, InfluxDB and Grafana) on a Raspberry Pi or a virtual machine. Regarding application improvements, other functions could be added to the dashboard, such as displaying the battery percentage, temperature and humidity, and selecting the data to be displayed, among other features. It would also be necessary for the application to include the grazing value according to the Agrarian Plot Identification System (SIGPAC), linking it with the monitored points on the map, and a method to send this information to the shepherd in real time.
To ensure that messages are delivered, a way to map the areas with LoRaWAN signal coverage could be implemented. Also, data storage on the device enables confirmation of the data sent, allowing data to be retransmitted while the animal rests, when a good signal can be ensured, or to be collected locally after a period of device operation. Another option would be payload optimization: a dynamic way of sending GPS data, transmitting the complete GPS data only at certain intervals and reduced versions in between, which would allow a higher data transmission rate. The autonomy of the system could be improved by implementing a separate power supply for the GPS, so that while the GPS is in sleep mode the rest of the system can be turned off, or the whole system can be turned off and turned on again early enough before sending data for the GPS to obtain a fix. In addition, a solar charging system would reduce the physical interaction needed to charge the batteries.
References
1. Roe, E., Huntsinger, L., Labnow, K.: High reliability pastoralism. J. Arid Environ.
39(1), 39–55 (1998)
2. Castro, M., Castro, J., Gómez Sal, A.: L’utilisation du territoire par les petits
ruminants dans la région de montagne de trás-os-montes, au portugal. Options
Méditerranéennes. Série A, A Séminaires Méditerranéens, no 61, 249–254 (2004)
3. Bernues, A., Rodríguez-Ortega, T., Ripoll-Bosch, R., Alfnes, F.: Socio-cultural and
economic valuation of ecosystem services provided by mediterranean mountain
agroecosystems. PloS One 9(7), e102479 (2014)
4. Castro, M., Ameray, A., Castro, J.P.: A new approach to quantify grazing pres-
sure under mediterranean pastoral systems using GIS and remote sensing. Int. J.
Remote Sens. 41(14), 5371–5387 (2020)
5. Nóbrega, L., Tavares, A., Cardoso, A., Gonçalves, P.: Animal monitoring based
on IoT technologies. In: 2018 IoT Vertical and Topical Summit on Agriculture-
Tuscany (IOT Tuscany), pp. 1–5. IEEE (2018)
6. Guide to animal tracking. https://outdooraction.princeton.edu/nature/guide-
animal-tracking. Accessed 15 May 2021
7. Bozek, K., Hebert, L., Portugal, Y., Stephens, G.J.: Markerless tracking of an
entire honey bee colony. Nat. Commun. 12(1), 1–13 (2021)
8. Ryan, P., Petersen, S., Peters, G., Grémillet, D.: GPS tracking a marine predator:
the effects of precision, resolution and sampling rate on foraging tracks of African
penguins. Mar. Biol. 145(2), 215–223 (2004)
9. Panicker, J.G., Azman, M., Kashyap, R.: A LoRa wireless mesh network for wide-
area animal tracking. In: 2019 IEEE International Conference on Electrical, Com-
puter and Communication Technologies (ICECCT), pp. 1–5. IEEE (2019)
10. Maroto-Molina, F., et al.: A low-cost IoT-based system to monitor the location of
a whole herd. Sensors 19(10), 2298 (2019)
11. Milsar. https://milsar.com/. Accessed 15 May 2021
12. Digital animal. https://digitanimal.pt/. Accessed 15 May 2021
13. Movebank. https://www.movebank.org/cms/movebank-main. Accessed 15 May
2021
14. Brito, T., Pereira, A.I., Lima, J., Castro, J.P., Valente, A.: Optimal sensors posi-
tioning to detect forest fire ignitions. In: Proceedings of the 9th International Con-
ference on Operations Research and Enterprise Systems, pp. 411–418 (2020)
15. Brito, T., Pereira, A.I., Lima, J., Valente, A.: Wireless sensor network for ignitions
detection: an IoT approach. Electronics 9(6), 893 (2020)
16. Azevedo, B.F., Brito, T., Lima, J., Pereira, A.I.: Optimum sensors allocation for a
forest fires monitoring system. Forests 12(4), 453 (2021)
17. Brito, T., Azevedo, B.F., Valente, A., Pereira, A.I., Lima, J., Costa, P.: Environ-
ment monitoring modules with fire detection capability based on IoT methodol-
ogy. In: Paiva, S., Lopes, S.I., Zitouni, R., Gupta, N., Lopes, S.F., Yonezawa, T.
(eds.) SmartCity360° 2020. LNICST, vol. 372, pp. 211–227. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-76063-2_16
Optimizing Data Transmission
in a Wireless Sensor Network Based
on LoRaWAN Protocol
Thadeu Brito1,2,3, Matheus Zorawski1, João Mendes1(B), Beatriz Flamia Azevedo1,4, Ana I. Pereira1,4, José Lima1,3, and Paulo Costa2,3

1 Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253 Bragança, Portugal
{brito,matheuszorawski,joao.cmendes,beatrizflamia,apereira,jllima}@ipb.pt
2 Faculty of Engineering of University of Porto, Porto, Portugal
3 INESC TEC - INESC Technology and Science, Porto, Portugal
4 Algoritmi Research Centre, University of Minho, Campus de Gualtar, Braga, Portugal
Abstract. The Internet of Things (IoT) is a promising methodology that has been growing over the last years. It can be used to connect devices and systems and to exchange data with them over the Internet. One of the IoT connection protocols is LoRaWAN, which has several advantages but also low bandwidth and limited data transfer, so there is a need to optimise the data transfer between devices. Some sensors have a 10- or 12-bit resolution, while LoRaWAN payloads are organised in 8-bit slots, so that multiple transmission slots are used and bits remain unused. This paper addresses a communication optimisation for wireless sensors resorting to encoding and decoding procedures. The approach is applied and validated in the real scenario of a wildfire detection system.
Keywords: Internet of Things · LoRaWAN · Wireless sensor
network · Fire detection · Transmission optimisation
1 Introduction
The project "SAFe: Forest Monitoring and Alert System" proposes an intelligent system for monitoring situations of potential forest risk. This system combines sensor nodes that collect various parameters, namely temperature, humidity and data from infrared sensors that identify the presence of flame. This information, combined with a system based on artificial intelligence and other collected data (such as weather forecasts), will allow efficient and intelligent data analysis, creating alerts of dangerous situations for the different actors (for example, firefighters, civil protection or the city
council). These alerts will be parameterized and presented in a personalised way, tailored to each actor. Thus, the aim is to minimise the occurrence of ignitions, monitor fauna and flora, and contribute to the environmental development of the region of Trás-os-Montes, particularly the district of Bragança. In addition to the capacity for early detection of forest fires, the accurate estimation of the fire hazard is of utmost importance, using fire risk indices computed in real time, on a sub-daily and local scale, from meteorological data, fuel availability and vegetation moisture content.
SAFe's implementation is localised to the Serra da Nogueira area, belonging to the Natura 2000 Network, with unique characteristics in the Bragança region and an extension of approximately 13 km. Considering the size of the region to be monitored, data transmission is carried out by LoRaWAN, which is based on a low power wide area network and guarantees full coverage of the area in question [1–4]. This type of communication is increasingly used, and it is estimated that by 2024 the IoT industry will generate revenue of 4.3 trillion dollars [5]. LoRaWAN networks are composed of end devices, gateways and network servers. The data is acquired by the modules (end devices), which send it to the gateway; the gateway in turn relays the messages to the servers through non-LoRaWAN networks, such as Ethernet or IP over cellular [6].
The LoRaWAN communication itself uses Chirp Spread Spectrum (CSS) modulation, where chirp pulses modulate the signal. It uses an unlicensed sub-gigahertz ISM band whose limits depend on the region of application; for Europe, it is 863 MHz to 870 MHz [7]. Within the communication settings, three parameters allow the network to trade off bit rate against transmission robustness: Bandwidth (BW), Spreading Factor (SF) and Coding Rate (CR) [8]. The Long Range (LoRa) message format is subdivided into several layers. The physical layer is composed of five fields: the preamble, the physical header (PHDR), the physical header cyclic redundancy check (PHDR_CRC), the physical payload (PHY Payload) and an error detection tail (CRC). The Medium Access Control (MAC) layer, in turn, is a subdivision of the PHY Payload field and consists of three fields: the MAC Header (MHDR), the MAC Payload and the Message Integrity Code (MIC). Within the MAC Payload is the message itself, which consists of three fields: the frame header (FHDR), the port and the frame payload (FRM Payload) [9].
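For reference, the sizes of the fields named above, as defined in the LoRaWAN 1.0.x specification (they are not restated in this paper), can be summarised in code form as follows; FOpts and FRM Payload are variable length.

/* Field sizes of a LoRaWAN MAC frame (inside the PHY Payload), per the
 * LoRaWAN 1.0.x specification.  Listed as constants only for reference. */
enum lorawan_field_size {
    MHDR_LEN    = 1,   /* MAC header                                */
    DEVADDR_LEN = 4,   /* FHDR: device address                      */
    FCTRL_LEN   = 1,   /* FHDR: frame control                       */
    FCNT_LEN    = 2,   /* FHDR: frame counter                       */
    FOPTS_MAX   = 15,  /* FHDR: optional MAC commands (0..15 Bytes) */
    FPORT_LEN   = 1,   /* port field (present when FRM Payload is)  */
    MIC_LEN     = 4    /* message integrity code                    */
};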
At this stage of the SAFe project, while the testing period is still underway in an urban environment, the configuration used is a 125 kHz bandwidth, SF7 and a coding rate of 4/5, which allows the transmission of about 242 Bytes. However, when the project passes the test phase and moves to a rural environment, it will be necessary to cover a larger area, which will probably require a larger SF to improve the communication range, thus reducing the transmission capacity. Using SF12 and maintaining the BW and CR, the LoRaWAN communication will allow the transmission of about 51 Bytes. Therefore, it is necessary to optimise the use of every bit of the packets as much as possible and not lose data essential for the project. But some sensors have
a 10- or 12-bit resolution while LoRaWAN payload slots are 8 bits wide, which means that, in some cases, multiple transmission slots are used and bits remain unused. In this way, this paper addresses a communication optimisation for Wireless Sensor Networks (WSN) resorting to encoding and decoding procedures.
The rest of the paper is organized as follows. After the introduction, Sect. 2 presents the related work. In Sect. 3, the system architecture is addressed and the focus of this paper is highlighted. Section 4 presents the algorithm used to optimise the data sent through the network, and Sect. 5 discusses the results of the proposed encoding/decoding. The last section concludes the paper and points out future work.
2 Related Work
Real-time data collection and analysis gained new possibilities with the emergence of the Internet of Things (IoT). Typically, IoT devices are manufactured to use short-range protocols, such as Bluetooth, WiFi and ZigBee, among others, which may have limitations, such as the maximum number of devices connected to the gateway and high power consumption. On the other hand, mobile technologies that have an extensive range (LTE, 3G and 4G) are expensive, consume a lot of power and have a continuing cost [10]. The gap left by these technologies has been filled in part by Low Power Wide Area Network (LPWAN) technologies, which cannot cover critical applications (e.g. remote health monitoring) but fit very well the requirements of IoT applications. Although low data rates limit LPWAN systems, there are reasons for their use, especially after the emergence of technologies such as SigFox, NB-IoT and LoRa [10,11].
The LoRa network is widely used in applications requiring low power consumption and long range, for which the LoRa Alliance proposed LoRaWAN, defining the network protocol at the MAC and network layers [10,12]. Different scenarios of typical LoRa network applications are pointed out in [12], which also provides documentation on LoRa network design and implementation, searching for a better cost-benefit ratio for LPWAN applications. Also, Baiji [13] showed in 2019 an application using LoRaWAN in an IoT monitoring system aimed at minimising the non-productive time of oil and gas companies through leak detection.
Research is being carried out seeking improvements to the LoRaWAN system, much of it aiming at the optimisation of data rate, airtime and energy consumption through an Adaptive Data Rate (ADR), as in [14], where a review of research on ADR algorithms for LoRaWAN technology is presented. Also, in [15] a method for the use of ADR is implemented, seeking to provide optimal throughput with multiple devices connected to LoRaWAN. Furthermore, a study looking to maximise battery lifetime across different LoRaWAN nodes and scenarios was carried out in [16], demonstrating the relationship between payload size, power
consumption and airtime for various SF and CR, and indicating an increase in the power consumed per bit of payload used.
As cited in [11], the advantages of the LoRa network (long transmission range, low power and enabling low-cost IoT applications) come with the downside of a low data throughput rate. The LoRaWAN data transfer must respect a protocol, and it is possible to send the payload in several forms, as long as it is within the limit on the number of Bytes. As described in [9], the LoRa message format is composed of several fields that have specific or limited sizes. Therefore, it is necessary to optimise the use of the Bytes in the payload as much as possible. Some techniques have been used over the years, such as selecting the best packet size by adjusting the bit rate to achieve maximum throughput, as demonstrated by the authors of [17] to optimise payload sizes in voice and video applications. Still within this line, the authors of [18] used throughput as the optimisation metric and based their study on wireless ATM networks to maximise the transfer efficiency by adapting the packet size to the variation of the channel conditions. Also, Eytan Modiano in [19] presented a system to dynamically optimise the packet size based on estimates of the channel's bit error rate, which is particularly useful for wireless and satellite channels.
The importance of packet size optimisation in WSN is debated in [20], which presents several packet size optimisation techniques and concludes that there is no agreement on whether the packet should be fixed or dynamic. A packet size optimisation technique for WSN is defended in [21], which proposes an energy-efficiency evaluation metric and adopts a fixed packet size for the case studied. Furthermore, the authors of [22] present a variation of the packet size depending on the network conditions to increase the network throughput and efficiency. The payload method recommended by The Things Network (TTN) when using LoRaWAN, described in [23], only specifies how to send each type of data, whereas the user may need to use the minimum number of Bytes possible for a shorter transmission time while staying within the 51-Byte limit. One way to implement data encoding for an LPWAN network such as LoRaWAN is through the Cayenne Low Power Payload (LPP), which allows the payload to be sent dynamically, according to the LoRa frequency and data rate used [24]. Nevertheless, besides the various optimisations mentioned above, it remains crucial to find a way to optimise the data to be transferred in order to make better use of the LoRa network.
3 System Architecture
The main goal of the presented system is to monitor a given area at a reduced application cost. In this way, it is expected to obtain constant surveillance able to identify any ignition that may occur. This surveillance is achieved using the various sensors present in the modules (end devices). Each module consists of six sensors, with three distinct value types: five flame sensors and a sensor that provides data on relative humidity and air temperature. The
entire operation of the SAFe project relies on the collection of data from the modules present in the forest, and this data collection must not fail. Consequently, the proper functioning of the data transmission must be guaranteed, because if the data do not reach the competent entities, they will be of no use. This transmission is handled using LoRaWAN technology, which allows several modules to be connected to different gateways, thus guaranteeing low-cost and efficient coverage of the area. Figure 1 exemplifies the whole system architecture of the SAFe project.
Fig. 1. System architecture of SAFe project [2].
Since each of these elements is defined and developed with its own specificities (Fig. 1), this work concentrates only on the LoRaWAN transmission and the proposed bit-arrangement protocol. Other SAFe approaches can be seen in [1–4]. Considering the high number of modules (end devices) used in this project, there is a requirement to avoid collisions between their messages; the full description of the modules is given in [2,4]. Thereupon, it is necessary to optimise the size of the data package sent, thus guaranteeing the minimum necessary occupation of Bytes to reduce the chances of these collisions and the consequent data losses. The normal sending process (recommended by the TTN service) is exemplified below, using the information from the six sensors present in the modules as well as the battery information, resulting in:
– Flame sensors: as mentioned before, each node has five flame sensors. These sensors provide a 10-bit reading, which means they generate values between 0 and 1023. Sending just one of these values demands 2 Bytes (16 bits); consequently, sending all of them requires a total of 10 Bytes.
– Relative humidity sensor: this sensor produces a reading between 0 and 99, but 2 Bytes are required to transmit it.
– Temperature sensor: gives a reading between 0 and 50 °C, which results in the use of 2 Bytes to transmit it.
– Battery level: the battery level is converted from a 10-bit ADC, and a total of 2 Bytes is required to transmit it.
Analysing the sending process recommended by the TTN service, it is possible to observe that the total package has 16 Bytes per sending interval (Fig. 2). However, the resulting messages have unused bits, because 16 Bytes correspond to 128 bits while the data produced by all the sensors in each module totals 73 bits. Thus, this work aims to optimise the use of these unfilled bits. Taking the flame sensors as an example, the 2 Bytes sent correspond to a 16-bit message, of which only 10 bits are used to transmit a value in the range 0 to 1023. In this case, 6 bits are left that can be filled with data from another sensor. This example can be seen by looking at B0 and B1 in Fig. 2; a sketch of this plain packing is given after the figure.
Fig. 2. 16 Bytes (B) of data necessary to send. The unused bits (b) are red coloured. Here i ∈ {0, . . . , 4}.
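A minimal sketch of this plain packing is shown below. It is not taken from the SAFe firmware; the function name, the ordering of the values and the big-endian byte order are illustrative assumptions.

#include <stdint.h>

/* Pack five 10-bit flame readings, relative humidity, temperature and the
 * battery ADC value as 2 Bytes each (16 Bytes in total), as in the plain
 * TTN-style payload of Fig. 2. */
static void pack_plain(uint8_t out[16], const uint16_t S[5],
                       uint8_t rh, uint8_t temp, uint16_t batt)
{
    int k = 0;
    for (int i = 0; i < 5; i++) {            /* B0..B9: flame sensors */
        out[k++] = (uint8_t)(S[i] >> 8);
        out[k++] = (uint8_t)(S[i] & 0xFF);
    }
    out[k++] = 0; out[k++] = rh;             /* B10..B11: relative humidity */
    out[k++] = 0; out[k++] = temp;           /* B12..B13: temperature */
    out[k++] = (uint8_t)(batt >> 8);         /* B14..B15: battery level */
    out[k++] = (uint8_t)(batt & 0xFF);
}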
This methodology of taking advantage of the unused bits will be explained
in the following section in more detail.
4 Algorithm
From the previous section, it is possible to notice that, for each flame sensor word (S[i]), there are some unused bits (x) available to carry additional data. In this approach, the humidity (RH) and temperature (T) readings are stored as one Byte each, and the battery (B) level is stored with 10-bit resolution. Table 1 presents the transmission package without the optimised encoding procedure (using the recommendation in [23]).
Table 1. Transmission without the optimised encoding. S[i] are the flame sensor words; RH, T and B are the relative humidity, temperature and battery voltage values, respectively. The 'x' are the unused bits.

S[0]: x x x x x x s9[0] s8[0] ... s0[0]
S[1]: x x x x x x s9[1] s8[1] ... s0[1]
S[2]: x x x x x x s9[2] s8[2] ... s0[2]
S[3]: x x x x x x s9[3] s8[3] ... s0[3]
S[4]: x x x x x x s9[4] s8[4] ... s0[4]
RH:   x x x x x x x x RH7 ... RH0
T:    x x x x x x x x T7 ... T0
B:    x x x x x x B9 B8 ... B0

Legend: s9..s0 denote the bits used by the 10-bit flame sensors; RH the relative humidity; T the temperature; B the battery level.
This approach intends to arrange the bits so that as few Bytes as possible are used in the protocol that transmits the data over LoRaWAN. The main idea is to distribute the bits of each sensor during the encoding process in a manner that uses the minimum number of Bytes. In order to use the unused bits (b15, b14, b13, b12, b11 and b10, see Fig. 2) of each sensor word S[i] (where i ∈ {0, . . . , 4}), the values of humidity (RH), temperature (T) and battery (B) are split and combined with these unused bits in the encoding procedure. A bit manipulation operation is required to do so. Table 2 presents the previous 'x' bits now holding bits from the other sensors. Moreover, even after including the RH, T and B values, 4 bits remain unused; they are appropriated to hold four auxiliary bits (AUX) for future applications.
Table 2. Proposed encoding map.

S[0]: T5 T4 T3 T2 T1 T0 s9[0] s8[0] ... s0[0]
S[1]: RH3 RH2 RH1 RH0 T7 T6 s9[1] s8[1] ... s0[1]
S[2]: B1 B0 RH7 RH6 RH5 RH4 s9[2] s8[2] ... s0[2]
S[3]: B7 B6 B5 B4 B3 B2 s9[3] s8[3] ... s0[3]
S[4]: AUX3 AUX2 AUX1 AUX0 B9 B8 s9[4] s8[4] ... s0[4]

Legend: s9..s0 denote the bits used by the 10-bit flame sensors; RH the relative humidity; T the temperature; B the battery level; AUX the auxiliary bits. The separate RH, T and B words of Table 1 are no longer transmitted.
This encoding operation is carried out by an embedded system programmed in the C language. The proposed C code for the encoding is detailed in Listing 1.1. The message travels through the network and is decoded by the code in Listing 1.2.
Listing 1.1. Data encoder
S[0] |= (T & 0x3F) << 10;                            /* T5..T0  -> S[0] b15..b10            */
S[1] |= ((T & 0xC0) << 4)  | ((H & 0x0F) << 12);     /* T7,T6 -> b11,b10; RH3..RH0 -> b15..b12 */
S[2] |= ((H & 0xF0) << 6)  | ((B & 0x003) << 14);    /* RH7..RH4 -> b13..b10; B1,B0 -> b15,b14 */
S[3] |= (B & 0x0FC) << 8;                            /* B7..B2  -> S[3] b15..b10            */
S[4] |= ((B & 0x300) << 2) | ((AUX & 0x0F) << 12);   /* B9,B8 -> b11,b10; AUX3..AUX0 -> b15..b12 */
// TRANSMIT array S
The opposite operation, decoding, allows the flame values, humidity, temperature and battery voltage to be extracted from the message. The C code of the decoding procedure can be found in Listing 1.2.
Listing 1.2. Data decoder
// RECEIVE array S
T   = ((S[0] & 0xFC00) >> 10) | ((S[1] & 0x0C00) >> 4);   /* T5..T0 from S[0], T7,T6 from S[1]   */
H   = ((S[1] & 0xF000) >> 12) | ((S[2] & 0x3C00) >> 6);   /* RH3..RH0 from S[1], RH7..RH4 from S[2] */
B   = ((S[2] & 0xC000) >> 14) | ((S[3] & 0xFC00) >> 8)
    | ((S[4] & 0x0C00) >> 2);                             /* B1,B0; B7..B2; B9,B8               */
AUX = (S[4] & 0xF000) >> 12;                              /* AUX3..AUX0                         */
S[0] &= 0x03FF;   /* keep only the 10-bit flame-sensor values */
S[1] &= 0x03FF;
S[2] &= 0x03FF;
S[3] &= 0x03FF;
S[4] &= 0x03FF;
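As a quick round-trip check of the two listings, the following self-contained program (not part of the original paper; the sample values are arbitrary) encodes a set of readings, decodes them again and verifies that the original values are recovered.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Sample readings: five 10-bit flame sensors, 8-bit T and RH,
     * a 10-bit battery level and a 4-bit auxiliary value. */
    uint16_t S[5] = {1023, 512, 300, 7, 0};
    uint16_t T = 37, H = 85, B = 1000, AUX = 0xA;

    /* Encode (same operations as Listing 1.1). */
    S[0] |= (T & 0x3F) << 10;
    S[1] |= ((T & 0xC0) << 4) | ((H & 0x0F) << 12);
    S[2] |= ((H & 0xF0) << 6) | ((B & 0x003) << 14);
    S[3] |= (B & 0x0FC) << 8;
    S[4] |= ((B & 0x300) << 2) | ((AUX & 0x0F) << 12);

    /* Decode (same operations as Listing 1.2). */
    uint16_t t = ((S[0] & 0xFC00) >> 10) | ((S[1] & 0x0C00) >> 4);
    uint16_t h = ((S[1] & 0xF000) >> 12) | ((S[2] & 0x3C00) >> 6);
    uint16_t b = ((S[2] & 0xC000) >> 14) | ((S[3] & 0xFC00) >> 8)
               | ((S[4] & 0x0C00) >> 2);
    uint16_t aux = (S[4] & 0xF000) >> 12;
    for (int i = 0; i < 5; i++)
        S[i] &= 0x03FF;

    assert(t == 37 && h == 85 && b == 1000 && aux == 0xA);
    assert(S[0] == 1023 && S[1] == 512 && S[2] == 300 && S[3] == 7 && S[4] == 0);
    printf("round trip OK\n");
    return 0;
}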
This optimised message can be sent over the LoRaWAN network and, at the destination, a decoding procedure reconstructs the original message. Figure 3 presents the encoding, transmission and decoding procedures. As can be seen, the packet to be sent keeps the size of the words produced by the flame sensors S[i] (where i ∈ {0, . . . , 4}):
Fig. 3. Encoding and decoding procedures. Size of S[i] is 16 bits whereas T and H are
8 bits. AUX is an auxiliary variable that was created with the 4 bits that remain free.
In this way, the transmission is reduced from 8 words (16 Bytes) to 5 words (10 Bytes). The following section demonstrates the difference between the procedure recommended by TTN and this approach.
5 Results
To demonstrate the difference during transmission over the LoRaWAN network with the algorithm proposed in this work, a comparison test was performed against the method recommended by TTN. Five sensor modules were configured for each transmission method: five nodes with the TTN transmission technique and another five nodes with the algorithm shown in the previous section. Figure 4a shows the ten modules distributed on the laboratory bench; they are configured according to the mentioned algorithms and simulate a small WSN. All of these modules are configured to transmit at specific frequencies within the range in which LoRaWAN works. In this way, it is possible to probe the behaviour of the duty cycle in the settings of the gateway used. The gateway used is a RAK 7249, which is attached to the laboratory's roof, as shown in Fig. 4b.
(a) Five modules configured for each algorithm. (b) LoRaWAN gateway used for the test.
Fig. 4. Structure used for the comparison between the two algorithms.
The five modules transmitting the sensor data with TTN's recommendations were configured to send on the frequencies 864.9 MHz, 865.5 MHz and 863 MHz (the latter exclusive for FSK). The other five modules, configured with the algorithm proposed in this work, carried out transmissions on 863.5 MHz, 863.9 MHz and 860.1 MHz (the latter exclusive for FSK use). To ensure that no other frequencies were used, the RAK 7249 was configured to work only on these six frequencies. In addition, a modulation concentrator was enabled for each frequency range used.
All ten modules send the same set of values; that is, all of them send the five flame sensor values, the battery level, the relative humidity and the air temperature. They have also been configured to communicate at an interval of 60 s. After all the necessary configurations for the test, the modules were placed on the laboratory bench. Since they are under the same climatic circumstances, it is possible to guarantee that they send corresponding temperature and humidity values. Also, note that all batteries were fully charged.
After 24 h, the generated duty-cycle graph was inspected in the gateway system, and a screenshot is shown in Fig. 5. Through this graph, it is possible to notice the difference between the two algorithms from the lines generated for the transmissions at each specific frequency. It is observed that the primary frequencies are the most used (864.9 MHz and 863.5 MHz), as the firmware always chooses them for the first sending attempt. However, when analysing the secondary frequencies (865.5 MHz and 863.9 MHz), a difference is noted in the amount of duty-cycle occupation. This graphical difference demonstrates the optimisation of Bytes during data transmission through the LoRaWAN network. Thus, using the algorithm proposed in this work, it is possible to install more modules under the same infrastructure.
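The duty-cycle difference can be related to packet airtime through the standard LoRa time-on-air expression found, for example, in Semtech's SX127x documentation. The helper below, included only as supporting background (it is not part of the SAFe firmware), evaluates that expression; with a 125 kHz bandwidth, SF7 and CR 4/5 it gives a shorter airtime for the 10-Byte payload than for the 16-Byte one, which is consistent with the lower duty-cycle occupation observed in Fig. 5.

#include <math.h>

/* LoRa time-on-air in seconds for an uplink with explicit header and CRC
 * enabled, following the expression in Semtech's SX127x documentation.
 * cr is 1..4 for coding rates 4/5..4/8, preamble_len is typically 8, and
 * low_dr_opt is 1 when low-data-rate optimisation is enabled. */
static double lora_time_on_air(int payload_bytes, int sf, double bw_hz,
                               int cr, int preamble_len, int low_dr_opt)
{
    double t_sym = pow(2.0, sf) / bw_hz;                        /* symbol duration */
    double num = 8.0 * payload_bytes - 4.0 * sf + 28.0 + 16.0;  /* +16 for the CRC */
    double den = 4.0 * (sf - 2.0 * low_dr_opt);
    double n_payload = 8.0 + fmax(ceil(num / den) * (cr + 4), 0.0);
    return (preamble_len + 4.25) * t_sym + n_payload * t_sym;
}

For example, comparing lora_time_on_air(10, 7, 125000.0, 1, 8, 0) with lora_time_on_air(16, 7, 125000.0, 1, 8, 0) quantifies the per-packet airtime saving obtained by the proposed encoding.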
Fig. 5. Screenshot obtained from the RAK 7249 gateway system after 24 h of use.
6 Conclusion and Future Works
The limited resources of LoRaWAN require a reduction of the transmitted data. In this paper, an optimised transmission is proposed that takes advantage of the unused bits of the 16-bit values. It is a lossless bit manipulation that fits the temperature, humidity and battery level bits into the unused bits of the sensor data, where each sensor occupies 10 bits and the remaining 6 bits are used for this purpose. Also, 4 free bits allow an auxiliary value from 0 to 0xF to be used, which can help with synchronisation and packet sequence verification. The results showed a reduction of the LoRaWAN duty cycle while maintaining data integrity, which indicates that the proposed methodology is ready to be installed on the wireless sensor network as future work.
Acknowledgements. This work has been supported by Fundação La Caixa
and FCT—Fundação para a Ciência e Tecnologia within the Project Scope:
UIDB/5757/2020.
References
1. Brito, T., Pereira, A.I., Lima, J., Castro, J.P., Valente, A.: Optimal sensors posi-
tioning to detect forest fire ignitions. In: Proceedings of the 9th International Con-
ference on Operations Research and Enterprise Systems, pp. 411–418 (2020)
2. Brito, T., Pereira, A.I., Lima, J., Valente, A.: Wireless sensor network for ignitions
detection: an IoT approach. Electronics 9(6), 893 (2020)
3. Azevedo, B.F., Brito, T., Lima, J., Pereira, A.I.: Optimum sensors allocation for a
forest fires monitoring system. Forests 12(4), 453 (2021)
4. Brito, T., Azevedo, B.F., Valente, A., Pereira, A.I., Lima, J., Costa, P.: Environ-
ment monitoring modules with fire detection capability based on IoT methodol-
ogy. In: Paiva, S., Lopes, S.I., Zitouni, R., Gupta, N., Lopes, S.F., Yonezawa, T.
(eds.) SmartCity360° 2020. LNICST, vol. 372, pp. 211–227. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-76063-2_16
5. Seye, M.R., Ngom, B., Gueye, B., Diallo, M.: A study of LoRa coverage: range
evaluation and channel attenuation model. In: 2018 1st International Conference
on Smart Cities and Communities (SCCIC), pp. 1–4 (2018). https://doi.org/10.
1109/SCCIC.2018.8584548
6. Phung, K.H., Tran, H., Nguyen, Q., Huong, T.T., Nguyen, T.L.: Analysis and
assessment of LoRaWAN. In: 2018 2nd International Conference on Recent
Advances in Signal Processing, Telecommunications Computing (SigTelCom), pp.
241–246 (2018). https://doi.org/10.1109/SIGTELCOM.2018.8325799
7. Kufakunesu, R., Hancke, G.P., Abu-Mahfouz, A.M.: A survey on adaptive data
rate optimization in LoRaWAN: recent solutions and major challenges. Sensors
20(18), 5044 (2020). https://doi.org/10.3390/s20185044. https://www.mdpi.com/
1424-8220/20/18/5044
8. Jovalekic, N., Drndarevic, V., Pietrosemoli, E., Darby, I., Zennaro, M.: Experimen-
tal study of lora transmission over seawater. Sensors 18(9), 2853 (2018). https://
doi.org/10.3390/s18092853. https://www.mdpi.com/1424-8220/18/9/2853
9. Casals, L., Mir, B., Vidal, R., Gomez, C.: Modeling the energy performance of
LoRaWAN. Sensors 17(10) (2017). https://doi.org/10.3390/s17102364. https://
www.mdpi.com/1424-8220/17/10/2364
10. Tsakos, K., Petrakis, E.G.M.: Service oriented architecture for interconnecting
LoRa devices with the cloud. In: Barolli, L., Takizawa, M., Xhafa, F., Enokido, T.
(eds.) AINA 2019. AISC, vol. 926, pp. 1082–1093. Springer, Cham (2020). https://
doi.org/10.1007/978-3-030-15032-7 91
11. Raza, U., Kulkarni, P., Sooriyabandara, M.: Low power wide area networks: an
overview. IEEE Commun. Surv. Tutor. 19(2), 855–873 (2017). https://doi.org/10.
1109/COMST.2017.2652320
12. Zhou, Q., Zheng, K., Hou, L., Xing, J., Xu, R.: Design and implementation of open
LoRa for IoT. IEEE Access 7, 100649–100657 (2019)
13. Baiji, Y., Sundaravadivel, P.: iloleak-detect: an IoT-based LoRaWAN-enabled oil
leak detection system for smart cities. In: 2019 IEEE International Symposium on
Smart Electronic Systems (iSES) (Formerly iNiS), pp. 262–267. IEEE (2019)
14. Kufakunesu, R., Hancke, G.P., Abu-Mahfouz, A.M.: A survey on adaptive data
rate optimization in LoRaWAN: recent solutions and major challenges. Sensors
20(18), 5044 (2020)
15. Kim, S., Yoo, Y.: Contention-aware adaptive data rate for throughput optimization
in LoRaWAN. Sensors 18(6), 1716 (2018)
16. Bouguera, T., Diouris, J.F., Chaillout, J.J., Jaouadi, R., Andrieux, G.: Energy
consumption model for sensor nodes based on LoRa and LoRaWAN. Sensors 18(7),
2104 (2018)
17. Choudhury, S., Gibson, J.D.: Payload length and rate adaptation for multimedia
communications in wireless LANs. IEEE J. Sel. Areas Commun. 25(4), 796–807
(2007). https://doi.org/10.1109/JSAC.2007.070515
18. Akyildiz, I., Joe, I.: A new ARQ protocol for wireless ATM networks. In: ICC 1998.
1998 IEEE International Conference on Communications. Conference Record.
Affiliated with SUPERCOMM 1998 (Cat. No.98CH36220), vol. 2, pp. 1109–1113
(1998). https://doi.org/10.1109/ICC.1998.685182
19. Modiano, E.: An adaptive algorithm for optimizing packet size used in wire-
less ARQ protocols. Wirel. Netw. 5, 279–286 (2000). https://doi.org/10.1023/A:
1019111430288
20. Leghari, M., Abbasi, S., Dhomeja, L.D.: Survey on packet size optimization tech-
niques in wireless sensor networks. In: Proceedings of the International Conference
on Wireless Sensor Networks (WSN4DC) (2013)
21. Sankarasubramaniam, Y., Akyildiz, I.F., McLaughlin, S.: Energy efficiency based
packet size optimization in wireless sensor networks. In: Proceedings of the First
IEEE International Workshop on Sensor Network Protocols and Applications, pp.
1–8. IEEE (2003)
22. Dong, W., et al.: DPLC: dynamic packet length control in wireless sensor networks.
In: 2010 Proceedings IEEE INFOCOM, pp. 1–9. IEEE (2010)
23. Working with bytes. https://www.thethingsnetwork.org/docs/devices/bytes/.
Accessed 15 May 2021
24. Cayenne low power payload. https://developers.mydevices.com/cayenne/docs/
lora/#lora-cayenne-low-power-payload. Accessed 15 May 2021
Indoor Location Estimation Based
on Diffused Beacon Network
André Mendes1,2(B) and Miguel Diaz-Cacho2
1 Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, 5300-253 Bragança, Portugal
a.chaves@ipb.pt
2 Departamento de Ingeniería de Sistemas y Automática, Universidade de Vigo, 36.310 Vigo, Spain
mcacho@uvigo.es
Abstract. This work investigates the problem of location estimation in indoor Wireless Sensor Networks (WSN), where precise, discrete and low-cost independent self-location is a critical requirement. The indoor scenario makes explicit measurements based on specialised location hardware, such as the Global Navigation Satellite System (GNSS), difficult and impractical, because RF signals are subject to many propagation issues (reflections, absorption, etc.). In this paper, we propose a low-cost, effective WSN location solution. Its design uses received signal strength for ranging, lightweight distributed algorithms for location computation, and a collaborative approach to deliver accurate location estimations with a low number of nodes in predefined locations. Through real experiments, our proposal was evaluated and its performance compared with other related mechanisms from the literature, which shows its suitability and its lower average location error most of the time.
Keywords: Indoor location · Beacon · ESP8266 · Mobile application
1 Introduction
Location is a very appreciated information. Fields of application are uncount-
able, from transportation systems, factory automation, domotics, robotics, sen-
sor networks or autonomous vehicles. Position sensors for location in an outdoor
environment are widely developed based on the Global Navigation Satellite Sys-
tem (GNSS). Despite people, mobile factory robots or even drones spent most
of their time indoors, there is not a globally accepted solution for positioning in
indoor environments. Even for vehicles, indoor vehicle positioning expands the
study of vehicular-control techniques for soft-manoeuvres in vehicles placed in
garages or parking.
In general, the research approaches for the indoor positioning problem are
grouped in two main solutions: beacon-based and beacon-free.
The beacon-based solution is based on the existence of a network of receivers
or transmitters (positioning-infrastructure), some of them placed at known loca-
tions. In this solution, the location is estimated by multilateration [1] (or tri-
angulation for angles) from a set of measured ranges. These methods are also
named Indoor Positioning Systems (IPS), depending on the sensor configura-
tion and processing approach, using technologies such as ultrasound, middle to
short-range radio (Wi-Fi, Bluetooth, UWB, RFID, Zigbee, etc.) or vision [2]. The
main weaknesses of these solutions are the need for infrastructure and their limited absolute accuracy and coverage.
The beacon-free solution, mainly relying on dead-reckoning methods with
sensors installed on the target to be located, uses Inertial Measuring Units (IMU)
and odometry to estimate the position of the target [3]. They integrate step
lengths and heading angles at each detected step [4] or, alternatively, data from
accelerometers and gyroscopes [5,6]. IMU- and odometry-based location has the inconvenience of accumulating errors that make the estimated path drift.
Beacon-based solutions using short-range radio signals deduce the distance by
measuring the received signal strength (RSS), mainly based on two methods: i)
the use of a radio map (previously established or online measured) as a reference;
or, ii) the use of probabilistic methods such as Bayesian inference to estimate
the location of the target.
The implementation of beacon-based solutions requires the existence of a
positioning-infrastructure. While it is possible to achieve extremely high accu-
racy with a specific ad-hoc positioning-infrastructure, the cost of installing and
maintaining the required hardware is a significant drawback of such systems.
On the contrary, the use of an existing beacon-based network (such as Wi-Fi or Bluetooth) as a non-dedicated positioning-infrastructure may require no extra hardware installation and only minimal software adjustments to implement an IPS, but it typically results in lower accuracy [7–9]. Therefore, most of the research
lines for non-dedicated positioning-infrastructures are focused on increasing the
accuracy in location estimation by using curve fitting and location search in
subareas [9], along with different heuristic-based or mathematical techniques
[10–12].
In this work, a low-complexity positioning solution based on an existing Wi-Fi infrastructure is proposed. Such technology is extremely widespread and affordable; therefore, it is easily scaled to fill coverage gaps with inexpensive devices. One of them is the ESP8266 board, made by Espressif Systems, which has a low-cost Wi-Fi microchip with a meandered inverted-F PCB antenna, a full TCP/IP stack and a microcontroller.
This solution includes a collaborative node strategy that processes fixed and dynamic information depending on the environment status, following the line started by other authors [13,14] and improving the broadcasting method with new results [15]. In addition, it assumes no prior knowledge of the statistical parameters that represent the availability of the RF channel and can dynamically adapt to variations of these parameters.
Additionally, the solution has low computational complexity and low energy
consumption, which makes it attractive to be implemented in vehicles, mobile
robots, embedded and portable devices and, especially, in smartphones.
However, there are several issues with this type of location mechanism. Inherent in all signal measurements is a degree of uncertainty with many sources (interference, RF propagation, location of the receiver, obstacles in the path between transmitter and receiver, etc.). This uncertainty can have a significant effect on the measured RSS values, and therefore on the accuracy of the resulting estimates. In addition, because of RF wave propagation characteristics within buildings, the RSS values are not a uniform square-law function of distance. This means that a particular RSS value may be a close match for more than one RF transmission model curve, indicating that some dynamic adjustment may be necessary from time to time.
The main contributions can be stated as follows: early simulations yield average location errors below 2 m in the presence of both range and interference inaccuracies, a result confirmed experimentally afterwards; the proposal performs well in real experiments when compared with related mechanisms from the literature; it uses inexpensive, easy-to-use COTS hardware as beacon devices; and it verifies the hardly intuitive idea that there is a trade-off between the number of beacon devices and the location accuracy.
The paper is organized as follows. This section presented the concept, moti-
vation and common problems of indoor positioning technologies, and in addition
an introduction to the proposed solution. Section 2 provides the basic concepts
and problems encountered when implementing positioning algorithms in non-
dedicated positioning-infrastructures and describes the system model. A novel
positioning-infrastructure based on collaborative beacon nodes is presented in
Sect. 3. Simulation and real implementation results are presented and discussed
in Sect. 4, and finally, Sect. 5 concludes the paper and lists future works.
2 System Model
In this section, we describe the system model, which serves as the basis for the
design and implementation of our proposal. This model allows determining the location of a client device by using management frames of the Wi-Fi standard [16,17] and a Received Signal Strength (RSS) algorithm combined with the lateration technique [18,19].
Many characteristics make positioning systems work differently indoors than outdoors [19]. Typically, indoor positioning systems require higher precision and accuracy than those designed for outdoor use, in order to deal with relatively small areas and existing obstacles.
In a strict comparison, indoor environments are more complex: there are various objects (such as walls, equipment and people) that reflect, refract and diffract signals and lead to multipath, high attenuation and signal scattering problems. Moreover, such systems typically rely on non-line-of-sight (NLoS) propagation, where the signal cannot travel in a straight path from emitter to receiver, which causes inconsistent time delays at the receiver.
On the other hand, there are some favourable aspects [19]: the small coverage area is relatively under control in terms of infrastructure, corridors, entries and exits, temperature and humidity gradients, and air circulation. Also, the indoor environment is less dynamic, since movement inside is slower.
Lateration (trilateration or multilateration, depending on the number of ref-
erence points), also called range measurement, computes the location of a target
by measuring its distance from multiple reference points [1,16].
Fig. 1. Basic principle of the lateration technique for 2-D coordinates.
Figure 1 shows the basic principle of the lateration method for a 2-D location measurement. If the geographical coordinates (xi, yi) of at least three reference elements A, B and C are known (to avoid ambiguous solutions), together with the measured distances dA, dB and dC, then the estimated position P(x̂, ŷ), considering the measurement error (grey area in the figure), can be calculated as the solution of a system of equations of the form of Eq. 1, one equation per reference element.
di = √((x̂ − xi)² + (ŷ − yi)²)    (1)
for i = 1, . . . , n, where n is the number of reference elements.
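As an illustration of this lateration step, the following minimal Python sketch (not the authors' implementation; the function name and the sample values are ours) solves the over-determined system of Eq. 1 by subtracting the equation of the last reference element from the others and applying least squares.

```python
import numpy as np

def laterate_2d(anchors, distances):
    """Estimate (x, y) from anchor coordinates and measured distances.

    Subtracting the circle equation of the last anchor from the others
    yields a linear system A p = b, solved here by least squares.
    """
    anchors = np.asarray(anchors, dtype=float)
    d = np.asarray(distances, dtype=float)
    xn, yn = anchors[-1]
    A = 2 * (anchors[:-1] - anchors[-1])              # shape (n-1, 2)
    b = (d[-1] ** 2 - d[:-1] ** 2
         + np.sum(anchors[:-1] ** 2, axis=1)
         - (xn ** 2 + yn ** 2))
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p  # estimated (x_hat, y_hat)

# Example with three reference elements A, B, C and slightly noisy ranges
print(laterate_2d([(0, 0), (10, 0), (0, 10)], [7.1, 7.0, 7.2]))
```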
In our model, we consider a static environment with random and grid deployments of S sensor nodes (throughout this work simply called nodes), which can be of two different types:
– Reference, which can be an available Wi-Fi compliant access point (AP);
and,
– Beacon, which is made with the inexpensive Espressif ESP8266 board with
a built-in Wi-Fi microchip and meandered inverted-F PCB antenna, a full
TCP/IP stack and a microcontroller.
Both types are at fixed, well-known locations; in addition, there is a mobile client device. All these nodes are capable of wireless communication.
The client device runs a piece of software that gathers information from those nodes and computes its own location, using an RF propagation model and the lateration technique. Therefore, based on an RSS algorithm, it can estimate distances. The RSS algorithm is developed in two steps:
i. Converting RSS to estimated distance by the RF propagation model; and,
ii. Computing location by using estimated distances.
Remark that the RF propagation model depends on frequency, antenna ori-
entation, penetration losses through walls and floors, the effect of multipath
propagation, the interference from other signals, among many other factors [20].
In our approach, we use the Wall Attenuation Factor (WAF) model [8] for
distance estimation, described by Eq. 2:
RSS(d) = P(d0) − 10 α log10(d / d0) − WAF × { nW, if nW < C;  C, if nW ≥ C }    (2)
where RSS is the received signal strength in dBm, P(d0) is the signal power
in dBm at some reference distance d0 that can either be derived empirically or
obtained from the device specifications, and d is the transmitter-receiver sepa-
ration distance. Additionally, nW is the number of obstructions (walls) between the transmitter and the receiver, and C is the maximum number of walls up to which the attenuation factor makes a difference; α indicates the rate at which the path loss increases with distance, and WAF is the wall attenuation factor. In general, these last two values depend on the building layout and construction materials and are derived empirically following the procedures described in [8].
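For concreteness, a small Python sketch of Eq. 2 and of its inversion for distance estimation is given below; the parameter values (P(d0), α, WAF, C) are illustrative defaults, not the calibrated values used by the authors.

```python
import math

def rss_waf(d, p_d0=-40.0, d0=1.0, alpha=2.0, waf=3.0, n_walls=0, c_max=4):
    """Predicted RSS (dBm) at distance d using the WAF model of Eq. 2."""
    walls = n_walls if n_walls < c_max else c_max
    return p_d0 - 10 * alpha * math.log10(d / d0) - waf * walls

def distance_from_rss(rss, p_d0=-40.0, d0=1.0, alpha=2.0, waf=3.0,
                      n_walls=0, c_max=4):
    """Invert Eq. 2 to estimate the transmitter-receiver distance (m)."""
    walls = n_walls if n_walls < c_max else c_max
    return d0 * 10 ** ((p_d0 - waf * walls - rss) / (10 * alpha))

# A reading of -67 dBm with one wall between transmitter and receiver
print(distance_from_rss(-67.0, n_walls=1))
```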
Since the environment varies significantly from place to place, the simplest way to find the relationship between RSS and the distance between a pair of nodes is to collect signal data at some points with known coordinates. This is a learning-mode procedure that can improve the lateration process by incorporating the particularities of the real environment, and it also helps to determine the path loss exponent α empirically.
Therefore, we determine α by reducing the path loss model to a form that presents a linear relationship between the predicted RSS and the logarithm of the transmitter-receiver separation distance d, and then we apply a linear regression to obtain that model parameter. Observe that C and nW are determined according to the type of facility.
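The described linearization can be sketched in Python as follows: with x = log10(d/d0) the model is a straight line whose slope is −10α, so a simple least-squares fit recovers α. The calibration readings below are invented for illustration and assume a negligible wall term.

```python
import numpy as np

def estimate_alpha(distances, rss_values, d0=1.0):
    """Fit RSS = P(d0) - 10*alpha*log10(d/d0) by linear regression.

    With x = log10(d/d0) the model is linear in x, and the slope of the
    fitted line equals -10*alpha.
    """
    x = np.log10(np.asarray(distances, dtype=float) / d0)
    y = np.asarray(rss_values, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    return -slope / 10.0, intercept  # (alpha, estimated P(d0))

# Calibration readings taken at known distances (illustrative values)
alpha, p_d0 = estimate_alpha([1, 2, 4, 8, 16], [-40, -47, -53, -60, -66])
print(alpha, p_d0)
```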
3 Distributed Collaborative Beacon Network
In this section, we introduce the distributed collaborative beacon network for
indoor environments.
Our goal is to improve the client location accuracy by enhancing the infor-
mation broadcasted by nodes.
3.1 Proposal
Unlike traditional Wi-Fi-based location techniques [17], which map out an area of signal strengths and store it in a database, beacon nodes are used to transmit encoded information within a management frame (beacon frame), which is used for passive position calculation through the lateration technique. Observe that we focus on the client side, without the need to send the client device's position anywhere, thus avoiding privacy concerns. Moreover, since this is a purely client-based system, location latency is reduced compared to a server-side implementation.
To improve location accuracy, we propose a collaborative strategy where nodes exchange information with each other. Every 5 s, during the learning mode, a beacon node randomly (with probability depending on a threshold ε) enters scanning mode [16,17]. The node remains in this mode for at least one beacon interval (approximately 100 ms) to guarantee that at least one beacon is received from any reference node (access point) in range.
When in learning mode, a beacon node receives RSS and location information from reference nodes in range and can therefore calculate a new α value using Eq. 2. Afterwards, the new propagation parameter (α) and the node's own location coordinates are encoded and stuffed into the SSID, then broadcast to the clients. Reference nodes broadcast only their own location.
It is noted that the path loss model plays an important role in improving the quality and reliability of ranging accuracy. Therefore, it is necessary to investigate it in the actual propagation environment in order to mitigate obstructions of beacon node coverage (e.g. people, furniture, etc.), tackling the problem of NLoS communication.
Once in place, this coded information allows a client device to estimate distances and then determine its own location through lateration. All of this is performed without the need for data connectivity, since the location is determined purely from the information received in beacon frames from nodes in range.
The coding scheme in the standard beacon frame (also known as beacon stuffing [13]) works in nearly all standard-compliant devices, and devices that support multiple SSIDs can simultaneously keep using an existing SSID.
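A possible encoding of the broadcast information into the 32-byte SSID field is sketched below in Python; the exact packing format (a short prefix plus three base64-encoded 32-bit floats for x, y and α) is our assumption and not necessarily the scheme used by the authors.

```python
import base64
import struct

def encode_ssid(x, y, alpha):
    """Pack node coordinates and alpha into a string that fits the
    32-byte SSID limit (3 floats -> 12 bytes -> 16 base64 characters)."""
    payload = struct.pack("<fff", x, y, alpha)
    ssid = "LOC:" + base64.b64encode(payload).decode("ascii")
    assert len(ssid) <= 32
    return ssid

def decode_ssid(ssid):
    """Recover (x, y, alpha) from a stuffed SSID, or None if not ours."""
    if not ssid.startswith("LOC:"):
        return None
    return struct.unpack("<fff", base64.b64decode(ssid[4:]))

ssid = encode_ssid(12.5, 3.75, 2.1)
print(ssid, decode_ssid(ssid))
```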
Beacon frames must be sent at the minimum data rate of 1 Mbps (standard amendment b [16,17]) to normalize the RSS across all devices. Since the RSS would change if the data rate of management frames changed, fixing this rate ensures that the client device observes no RSS variation other than that caused by range.
Consideration should be given to the optimal placement of the beacon nodes, but this will generally be dictated by the existing infrastructure (e.g., if an existing standard-compliant access point is used as a reference node), building limitations or power source requirements.
Algorithm 1: BEACON node
Initialization of variables;
while (1) do
    /* zero the information array and fill in the node's own location */
    Initialization of I[1..nodes];
    /* check time counter */
    if (counter % 5) then
        draw a random number x, x ∈ [0, 1];
        /* learning mode */
        if (x < ε) then
            WiFi Scanning;
            get SSID from beacon frame;
            foreach ref node in SSID do
                read coded information;
                Decode;
                get {location, RSS};
                I[ref node] ← {location, RSS};
                /* adjust α if needed */
                if (α does not fit default curve) then
                    ComputeNewAlpha;
                    I[ref node] ← {α};
    /* beacon frame coding */
    PackAndTx ← I;
    Broadcast coded SSID;
3.2 Algorithms and Complexity Analysis
The proposed strategy is described in Algorithms 1 and 2.
In the beginning, after initialization, the mechanism enters an infinite loop for its entire period of operation. From a beacon node's point of view, it decides, according to a threshold (ε), between entering learning mode, as described earlier, or jumping directly to encoding its own location coordinates into the SSID and transmitting them. In Algorithm 2, a client device verifies whether a received beacon frame comes from a beacon or a reference node, then runs the routines to decode the information and adjust α, if needed, and finally computes its location.
Finally, observing Algorithm 1, it is noted that it depends on three routines, named Decode, ComputeNewAlpha and PackAndTx, which are finite-time operations with constant time complexity, O(1). However, everything depends on the number of reference nodes in range, which raises the overall complexity to T(n) = O(n). In a similar fashion, Algorithm 2 depends on the Decode and AdjustAlpha routines, which are finite-time operations, but the Positioning routine takes linear time, also dependent on the number of nodes (reference and beacon) in range; thus, the complexity becomes linear, T(n) = O(n).
Algorithm 2: CLIENT device
Initialization of variables;
while (1) do
    WiFi Scanning;
    get SSID from beacon frame;
    foreach (ref node | beacon node in SSID) do
        read coded information;
        Decode;
        AdjustAlpha;
    Positioning;
Additionally, the largest resource consumption remains related to the storage of each coded SSID (32 bytes), which also depends on the number of nodes (reference and beacon) in range. Thus, S(n) = O(n).
4 Evaluation
In this section, the proposed collaborative strategy is evaluated using numerical simulations and a real implementation experiment with the inexpensive ESP8266 board [21], a Wi-Fi standard-compliant access point and a smartphone application.
Location estimation is the problem of estimating the location of a client
device from a set of noisy measurements, acquired in a distributed manner by
a set of nodes. The ranging error can be defined as the difference between the
estimated distance and the actual distance.
The location accuracy of each client device is evaluated by using the root
mean square error (Eq. 3):
ei = √((x − xe)² + (y − ye)²)    (3)
where x and y denote the actual coordinates, and xe and ye represent the esti-
mated coordinates of the client.
We define the average location error (e) as the arithmetic mean of the location
accuracy, as seen in Eq. 4:
e = (1/N) · Σ_{i=1}^{N} ei    (4)
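Equations 3 and 4 translate directly into a few lines of Python, shown here only to make the evaluation metric explicit (the coordinate pairs are illustrative):

```python
import math

def location_error(actual, estimated):
    """Error of a single location estimate (Eq. 3)."""
    (x, y), (xe, ye) = actual, estimated
    return math.hypot(x - xe, y - ye)

def average_location_error(actuals, estimates):
    """Arithmetic mean of the per-client errors (Eq. 4)."""
    errors = [location_error(a, e) for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)

print(average_location_error([(0, 0), (5, 5)], [(1, 0), (5, 7)]))  # 1.5
```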
To assist in the evaluation, the strategies are analyzed in terms of average
location error (e), and additionally, some other metrics are used and discussed
occasionally, such as accuracy, technology, scalability, cost, privacy, and robust-
ness [16,19,22].
Finally, the proposed strategy should not shorten the network lifetime; therefore, we have also evaluated the performance of the strategies in terms of energy consumption.
4.1 Simulations
Based on the system model described in Sect. 2, and to assess the behaviour of the proposed strategy (Subsect. 3.1) in solving the indoor positioning problem, we use the ns-2 simulator environment running under GNU/Linux, which emulates a network with both types of nodes and clients and adopts a specific data link layer where contention and possible collisions between them take place.
In particular, 20 nodes and 10 clients have been deployed in an area of 200 m², each of them with an initial budget of 100 energy units that were consumed during the simulation on a per-operation basis. We use a fixed energy model, meaning that every receive, decode, code and transmit operation consumes a fixed amount of energy, as used by ns-2. The α parameter (Eq. 2) has an initial value of 2.0 and the threshold ε is set to 0.1. The values of the remaining simulation parameters were chosen as the most representative ones within their validity intervals, after a round of tests.
For this analysis, we set up two different scenarios for placing the sensor nodes (beacon and reference), aiming to reproduce the physical features of the Department of Systems Engineering and Automatics (DESA) facility. Clients are placed randomly in both cases.
– Grid: sensor nodes are placed over an equally spaced reference grid; and,
– Random: sensor nodes are placed randomly into the area.
The performance of the following strategies is evaluated through simulations:
– WELO: our proposal that was named WEmos (development board based on
ESP8266) LOcalization;
– GENERAL: the single position approach, meaning that only one location is
computed after decoding the data within beacon frames when the node is
placed; and,
– CF: the continuous positioning, like GENERAL, but multiple locations are com-
puted to reduce the minimum square error.
A total of 400 simulation rounds are performed for each set of parameters, each corresponding to 2,000 s of simulation time, for both scenarios described earlier, where we compare the performance of the described strategies.
In the results, we present the average location error (e) obtained at every round, with error bars corresponding to the 95% confidence interval, as well as the number of beacon nodes used for positioning, the cumulative distribution function (CDF) of e, and the average energy consumption per client.
Fig. 2. Average location error (e) against clients deployed, per beacons available (by simulations): (a) #BEACONS = 4; (b) #BEACONS = 10; (c) #BEACONS = 20. Each panel plots the average location error (m) against the number of clients for the GENERAL, WELO and CF strategies.
Fig. 3. WELO behaviour in simulations: (a) beacons used versus available beacons; (b) energy consumption against clients deployed; (c) average location error (e) CDF in simulations.
Figure 2 compares the obtained e between strategies. This performance comparison shows that our proposal, WELO, improves the accuracy of client positioning by reducing e. We have observed that when the number of beacon nodes increases, WELO, as a passive mechanism, reduces e on average, independently of the number of clients deployed. Compared with the other strategies, this result shows that WELO uses more location information, obtained through our collaborative approach, thus leveraging the lateration process independently of the number of clients, which becomes an important performance characteristic.
Another relevant observation is that there is a balance between the number of beacon nodes placed and the accuracy. Looking at Figs. 2b and 2c, one can observe that e is only marginally reduced when the number of deployed beacons is raised. From another point of view, Fig. 3a shows this balance through the number of beacon nodes used to obtain a location. Further studies in different scenarios are needed to determine whether there is a clear relationship between accuracy and the number of nodes, which might go against the statement that accuracy (with lateration) improves as more nodes are installed.
The CDF of e is depicted in Fig. 3c, showing that the WELO curve lies on top, which demonstrates that its cumulative probability is larger for the same error. As can be seen, WELO achieves an accuracy equal to or better than 2 m for 80% of the location estimations throughout the scenario, a reduction of at least 80% over the other strategies.
Apart from that, looking at Fig. 3b, the average energy consumption of our strategy is stable in comparison with the other strategies. As expected, with clients running a simplified mechanism and only beacon frames being broadcast by beacon nodes, our proposed approach significantly improves the residual energy of the overall network, for both scenarios.
4.2 Real Implementation Analysis
Our real implementation analysis has been made with nodes placed around the border of a corridor that connects offices at the Department of Systems Engineering and Automatics (DESA) facility, as shown in Fig. 4.
Fig. 4. DESA’s facility real scenario and heatmap overlay (beacon and reference
nodes—B and R respectively).
We deploy at least one beacon node (made with the ESP8266 board) on each corner, up to 8 nodes in total (including one Wi-Fi standard-compliant access point), plus a client device (smartphone), in an area of 200 m², where the maximum distance between nodes lies between 2 m and 3 m. To avoid strong reflections and multipath effects, we kept all devices away from the floor and ceiling.
The α parameter (Eq. 2) has an initial value of 2.0 and the threshold ε is set to 0.1.
Before running the experiment, we measured the received signal strength from each node (beacon or reference) of our network. These measurements were made at distances ranging from 1 m to 20 m from each node, at 10 points within the experiment area, totalling 200 measurements, aided by a smartphone application. Figure 4 shows a heat map overlay made with the average of these measurements.
In addition, we added functions to the base firmware of the ESP8266 board (esp_iot_sdk v1.3.0 [21]) to execute the learning process described in Subsect. 3.1.
During the experiment, the trilateration process in the smartphone application uses an RSS-based path loss algorithm to determine the distances (Eq. 2).
In the results, the average location error (e) obtained is presented and dis-
cussed, and for purposes of comparison, the cumulative distribution function
(CDF) of e is also investigated.
Fig. 5. Real experiment: our strategy vs GNSS. (a) Measurements for the location estimation error; (b) real behaviour during the measurements.
The average location error e obtained by WELO, running on a smartphone, and by a GNSS (GPS and GLONASS) receiver application, running on another smartphone, is presented in Fig. 5a. Both smartphones are the same model and the measurements were taken in the same way to avoid any bias.
This comparison shows that WELO achieves the better results. The performance of GNSS is worse because of the instability of the received signal, even though the measurements were taken at ground level, separated from the outside only by large glass windows and drywall walls on the inside.
Another interesting observation comes from Fig. 5b, where we can see the real behaviour of both systems, GNSS and WELO. At some points near the glass windows, GNSS signals become strong and their error is reduced; in contrast, WELO becomes worse because the client is far from the nodes (reference and beacons). This happens due to the dynamic nature of the RF environment.
Even so, GNSS can gather much more information during the evaluation instants, but in indoor scenarios any temporary change in the RF environment causes significant performance degradation for GNSS. In contrast, WELO achieves better performance not only because of the higher availability of its signals but also because of the higher confidence of the location information obtained.
Moreover, this behaviour demonstrates that GNSS errors in this scenario become out of control, mainly because the multiple reflections at surfaces cause multipath propagation and the signals lose significant power indoors, compromising the coverage required by receivers (at least four satellites). However, with the introduction of "dual-frequency" support on smartphones, position accuracy will probably experience promising enhancements. Further analysis will determine its performance in this same indoor scenario.
Fig. 6. Average location error (e) CDF comparison (by real experiment): (a) our strategy, simulated vs real; (b) benchmarking between mechanisms.
By comparing the results in Fig. 5a to the ones in Fig. 2, one can observe that the performance of WELO is positively impacted by the growth in the number of available beacon nodes, but only up to a limit. Additionally, the performance degradation is small, which shows the robustness of the proposed strategy against the larger heterogeneity of a real indoor scenario without any prior knowledge.
Based on the above observations, we compare the average location error (e) CDF obtained in the real experiment in Fig. 6. Figure 6a contrasts the performance in simulations with the capabilities observed in the real experiments; there, we can observe the accuracy suffering from the toughness of the indoor scenario.
Figure 6b shows the performance evaluation of the proposed strategy against some classical algorithms, such as RADAR [8] and Curve Fitting with RSS [9], in the office indoor scenario, together with GNSS (worse performance, related to the upper x-axis only). As can be seen, WELO shows its suitability and its lower average location error most of the time within the scenario. From these figures, the WELO results are fairly encouraging and we can observe performance improvements.
5 Conclusions
The Wi-Fi standard has established itself as a key positioning technology that
together with GNSS is widely used by devices.
Our contribution is the development of a simple and low-cost indoor loca-
tion infrastructure, plus a low complexity and passive positioning strategy that
does not require specialized hardware (instead exploiting hardware commonly
available—ESP8266 board—with enhanced functionality), arduous area polling
or reliance on data connectivity to a location database.
These last two features help to reduce its latency, all while remaining backwards compatible with existing Wi-Fi standards and able to work side by side with IEEE 802.11mc Fine Time Measurement (Wi-Fi RTT) when it becomes broadly available.
We ran simulations with ns-2 and conducted an experiment under real indoor conditions, at DESA's facility, to evaluate the performance of the proposed strategy. The results show that this solution achieves better performance than other pre-established strategies (GNSS included) for the scenarios evaluated. Further research is required to guarantee the stability of our approach.
As future work, we intend to extend the evaluation to client devices located by dynamically deployed beacon nodes.
Acknowledgements. This work has been conducted under the project “BIOMA
– Bioeconomy integrated solutions for the mobilization of the Agri-food market”
(POCI-01-0247-FEDER-046112), by “BIOMA” Consortium, and financed by European
Regional Development Fund (FEDER), through the Incentive System to Research and
Technological development, within the Portugal2020 Competitiveness and Internation-
alization Operational Program.
This work has also been supported by FCT - Fundação para a Ciência e Tecnologia
within the Project Scope: UIDB/05757/2020.
References
1. Savvides, A., Park, H., Srivastava, M.B.: The bits and flops of the n-hop mul-
tilateration primitive for node localization problems. In: Proceedings of the 1st
ACM International Workshop on Wireless Sensor Networks and Applications, pp.
112–121 (2002)
2. Hightower, J., Borriello, G.: Location systems for ubiquitous computing. Computer
34(8), 57–66 (2001)
3. Collin, J.: Investigations of self-contained sensors for personal navigation. Tampere
University of Technology (2006)
4. Jimenez, A.R., Seco, F., Prieto, C., Guevara, J.: A comparison of pedestrian dead-
reckoning algorithms using a low-cost MEMS IMU. In: 2009 IEEE International
Symposium on Intelligent Signal Processing, pp. 37–42. IEEE (2009)
5. Jiménez, A.R., Seco, F., Prieto, J.C., Guevara, J.: Indoor pedestrian navigation
using an INS/EKF framework for yaw drift reduction and a foot-mounted IMU. In:
2010 7th Workshop on Positioning, Navigation and Communication, pp. 135–143.
IEEE (2010)
6. Chatfield, A.B.: Fundamentals of High Accuracy Inertial Navigation. American
Institute of Aeronautics and Astronautics, Inc. (1997)
7. Deasy, T.P., Scanlon, W.G.: Stepwise refinement algorithms for prediction of user
location using receive signal strength indication in infrastructure WLANs. In: 2003
High Frequency Postgraduate Student Colloquium (Cat. No. 03TH8707), pp. 116–
119. IEEE (2003)
8. Bahl, P., Padmanabhan, V.N.: RADAR: an in-building RF-based user location and
tracking system. In: Proceedings IEEE INFOCOM 2000. Conference on Computer
Communications. Nineteenth Annual Joint Conference of the IEEE Computer and
Communications Societies (Cat. No. 00CH37064), vol. 2, pp. 775–784. IEEE (2000)
9. Wang, B., Zhou, S., Liu, W., Mo, Y.: Indoor localization based on curve fitting
and location search using received signal strength. IEEE Trans. Industr. Electron.
62(1), 572–582 (2014)
10. Ji, X., Zha, H.: Sensor positioning in wireless ad-hoc sensor networks using multi-
dimensional scaling. In: IEEE INFOCOM 2004, vol. 4, pp. 2652–2661. IEEE (2004)
11. Doherty, L., El Ghaoui, L., et al.: Convex position estimation in wireless sensor
networks. In: Proceedings IEEE INFOCOM 2001. Conference on Computer Com-
munications. Twentieth Annual Joint Conference of the IEEE Computer and Com-
munications Society (Cat. No. 01CH37213), vol. 3, pp. 1655–1663. IEEE (2001)
12. Shang, Y., Ruml, W.: Improved MDS-based localization. In: IEEE INFOCOM
2004, vol. 4, pp. 2640–2651. IEEE (2004)
13. Chandra, R., Padhye, J., Ravindranath, L., Wolman, A.: Beacon-stuffing: wi-fi
without associations. In: Eighth IEEE Workshop on Mobile Computing Systems
and Applications, pp. 53–57. IEEE (2007)
14. Sahoo, P.K., Hwang, I., et al.: Collaborative localization algorithms for wireless
sensor networks with reduced localization error. Sensors 11(10), 9989–10009 (2011)
15. Bal, M., Shen, W., Ghenniwa, H.: Collaborative signal and information processing
in wireless sensor networks: a review. In: 2009 IEEE International Conference on
Systems, Man and Cybernetics, pp. 3151–3156. IEEE (2009)
16. Liu, H., Darabi, H., Banerjee, P., Liu, J.: Survey of wireless indoor positioning
techniques and systems. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.)
37(6), 1067–1080 (2007)
17. 802.11b: Wireless LAN MAC and PHY Specifications: Higher-Speed Physical Layer
Extension in the 2.4 GHz Band (1999). IEEE Standard
18. Al Nuaimi, K., Kamel, H.: A survey of indoor positioning systems and algorithms.
In: 2011 International Conference on Innovations in Information Technology, pp.
185–190. IEEE (2011)
19. Gu, Y., Lo, A., Niemegeers, I.: A survey of indoor positioning systems for wireless
personal networks. IEEE Commun. Surv. Tutor. 11(1), 13–32 (2009)
20. Rappaport, T.: Wireless Communications: Principles and Practice. Prentice-Hall,
Hoboken (2001)
21. Espressif: Datasheet ESP8266 (2016)
22. Huang, H., Gartner, G.: A survey of mobile indoor navigation systems. In: Gartner,
G., Ortag, F. (eds.) Cartography in Central and Eastern Europe. Lecture Notes
in Geoinformation and Cartography, pp. 305–319. Springer, Heidelberg (2009).
https://doi.org/10.1007/978-3-642-03294-3_20
SMACovid-19 – Autonomous Monitoring
System for Covid-19
Rui Fernandes(B)
and José Barbosa
MORE – Laboratório Colaborativo Montanhas de Investigação – Associação,
Bragança, Portugal
{rfernandes,jbarbosa}@morecolab.pt
Abstract. The SMACovid-19 project aims to develop an innovative
solution for users to monitor their health status, alerting health pro-
fessionals to potential deviations from the normal pattern of each user.
For that, data is collected, from wearable devices and through manual
input, to be processed by predictive and analytical algorithms, in order
to forecast their temporal evolution and identify possible deviations, pre-
dicting, for instance, the potential worsening of the clinical situation of
the patient.
Keywords: COVID-19 · FIWARE · Docker · Forecasting
1 Introduction
Wearable devices nowadays support many sensing capabilities that can be used to collect data related to the body of the person wearing them. This has allowed the development of health-related solutions based on data collected from wearable devices for different purposes: for instance, in [1] the authors present a system for automated rehabilitation, and in [2] big data analysis is used to define healthcare services. We are currently facing a pandemic scenario, and this work intends to use the characteristics of wearable devices to develop an innovative solution that can be used to diagnose possibly infected COVID-19 patients at an early stage, by monitoring biological/physical parameters (through wearable devices) and by conducting medical surveys that allow an initial screening.
2 Docker and FIWARE
The system was developed using Docker compose, to containerize the architec-
ture modules, and FIWARE, to deal with the context information.
Docker is an open source platform that allows the containerization of appli-
cations and uses images and containers to do so. An image is a lightweight,
standalone, executable package of software that includes everything needed to
run an application: code, runtime, system tools, system libraries and settings [3].
A container is a standardized unit of software that includes the code as well as the necessary dependencies for it to run efficiently, regardless of the computing environment where it is being run [3]. Images turn into containers at runtime. The advantage of applications developed using this framework is that they will always run the same way, regardless of the infrastructure used to run them.
The system was developed using the FIWARE platform [4]. FIWARE defines
a set of standards for context data management and the Context Broker (CB)
is the key element of the platform, enabling the gathering and management of
context data: the current context state can be read and the data can be updated.
FIWARE uses the FIWARE NGSI RESTful API to support data transmission between components as well as context information updates and consumption. This open-source RESTful API simplifies the process of adding extra functionality through additional FIWARE or third-party components. As an example, ready-to-use extra FIWARE components cover areas such as interfacing with the Internet of Things (IoT), robots and third-party systems; context data/API management; and processing, analysis and visualization of context information. In this project, the version used was NGSI-v2.
3 System Architecture
The system’s architecture is the one shown in Fig. 1, in which the used/
implemented modules communicate through the ports identified on the edges:
– Orion: FIWARE context broker;
– Cygnus: responsible for the data persistence in the MongoDB;
– Context Provider: performs data acquisition from the Fitbit API;
– Data analysis: processes the available data to generate forecasts;
– Fitbit API: data source for the wearable device.
In the system, data from the wearable device and from the mobile app
is posted into the CB. This data is automatically persisted in the MongoDB
through the cygnus module and is also used to trigger the data analysis module,
to execute a new forecast operation. Data persistence is implemented through
the creation of subscriptions in the CB that notify cygnus about data updates.
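As a sketch of how such a persistence subscription could be created with the NGSI-v2 API, the following Python snippet posts a subscription to Orion that notifies Cygnus whenever a Measurement entity is updated; the host names and ports are assumptions based on the architecture in Fig. 1 and the defaults of the official Docker images.

```python
import requests

# Assumed endpoints (adjust to the actual deployment)
ORION = "http://orion:1026"
CYGNUS_NOTIFY = "http://cygnus:5055/notify"

subscription = {
    "description": "Persist Measurement updates through Cygnus",
    "subject": {
        "entities": [{"idPattern": ".*", "type": "Measurement"}],
        "condition": {"attrs": ["value"]},
    },
    "notification": {
        "http": {"url": CYGNUS_NOTIFY},
        # notification format accepted by Cygnus (adjust if needed)
        "attrsFormat": "legacy",
    },
}

resp = requests.post(f"{ORION}/v2/subscriptions", json=subscription)
resp.raise_for_status()
print("Subscription created:", resp.headers.get("Location"))
```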
There are already docker images ready to use for the standard modules,
such as, the MongoDB, the Orion context broker and the Cygnus persistence
module. The context provider and data analysis modules were developed and
containerized to generate the full system architecture.
3.1 Data Model
The NGSI-v2 API establishes rules for entity definition. Per those rules, an entity has mandatory attributes, namely the id and the type. The id is a unique uniform resource name (urn) created according to the format
“urn:ngsi-ld:entity-type:entity-id”.
In this work, the entity’s type is identified in each entity definition, as can
be seen in the data model presented in Fig. 2.
Fig. 1. Architecture devised to create the SMACovid-19 system.
Given that each id has to be unique, the entity-id part of the urn was created based on version 4 of the universally unique identifier (UUID). Using these rules, a Person entity id may be, for instance:
urn:ngsi-ld:Person:4bc3262c-e8bd-4b33-b1ed-431da932fa5
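A minimal Python sketch of creating such an entity in the context broker is shown below; the name and fitbitId attributes are illustrative stand-ins for the actual attribute set of the data model, and the Orion endpoint is assumed.

```python
import uuid
import requests

ORION = "http://orion:1026"  # assumed context broker endpoint

def create_person(name, fitbit_id=None):
    """Create a Person entity whose id follows urn:ngsi-ld:<type>:<uuid4>."""
    entity = {
        "id": f"urn:ngsi-ld:Person:{uuid.uuid4()}",
        "type": "Person",
        "name": {"type": "Text", "value": name},  # illustrative attribute
    }
    if fitbit_id is not None:
        entity["fitbitId"] = {"type": "Text", "value": fitbit_id}
    resp = requests.post(f"{ORION}/v2/entities", json=entity)
    resp.raise_for_status()
    return entity["id"]

print(create_person("Jane Doe", fitbit_id="ABC123"))
```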
To define the data model, extra attributes were used in each entity to convey the necessary information between modules. Thus, the Person entity has attributes to specify the user characteristics and to associate the person with his or her Fitbit account. The Measurement entity represents the acquired data values and their corresponding time of observation. This entity is used to specify information regarding the temperature, oxygen saturation, blood pressure, heart rate and daily number of diarrhea occurrences. The measurementType attribute indicates what type of variable is being used in that entity. To ensure the maximum interoperability possible, the Fast Healthcare Interoperability Resources (FHIR) specification of Health Level Seven International (HL7) [5] was used to define terms whenever possible. This resulted in the following terms being used: bodytemp for temperature, oxygensat for oxygen saturation, bp for blood pressure, heartrate for heart rate and diarrhea (FHIR non-compliant) for the daily number of diarrhea occurrences.
The Questionnaire entity specifies data that has to be acquired through a questionnaire answered by the user in the mobile application. The Organization entity identifies the medical facility that the person is associated with. The Forecast entity provides the forecast mark value, calculated according to the classification table defined by the Hospital Terra Quente (HTQ) personnel, and the corresponding time of forecast calculation.
Fig. 2. Data model
3.2 Data Acquisition
Data is acquired from two different sources: mobile application and Fitbit API.
The mobile application is responsible for manual data gathering, sending
this data to the CB through POST requests. For instance, data related to the
attributes of the Questionnaire entity is sent to the CB using this methodology.
Fitbit provides an API for data acquisition that, when new data is uploaded
to the Fitbit servers, sends notifications to a configured/verified subscriber end-
point. These notifications are similar to the one represented in Fig. 3.
Fig. 3. Fitbit notification identifying new data availability [6].
The ownerId attribute identifies the fitbitId of the person that has new data available, thus allowing the system to fetch data from the identified user.
This results in the following workflow, to get the data into the system:
1. create a Person entity;
2. associate Fitbit account with that person;
3. make a oneshot subscription (it is only triggered one time—oneshot) for that
Person entity, notifying the Fitbit subscription endpoint;
4. update the Person entity with the Fitbit details;
5. create the fitbit subscription upon the oneshot notification reception;
6. get new data from Fitbit web API after receiving a notification from Fitbit:
collect the fitbitId from the notification and use it to get the data from the
Fitbit servers, from the last data update instant up until the present time.
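A hypothetical sketch of the notification-handling part of this workflow (step 6) is shown below, using Flask; fetch_fitbit_data and push_measurements are placeholder helpers standing in for the Fitbit Web API calls and the NGSI-v2 updates performed by the context provider.

```python
from flask import Flask, request

app = Flask(__name__)

def fetch_fitbit_data(owner_id):
    """Placeholder: call the Fitbit Web API for the user's new samples."""
    return []

def push_measurements(owner_id, samples):
    """Placeholder: send the samples to the context broker as Measurements."""
    pass

@app.route("/fitbit-notifications", methods=["POST"])
def fitbit_notifications():
    # Each notification in the posted array carries the ownerId of the
    # user with new data available (see Fig. 3).
    for notification in request.get_json(force=True):
        owner_id = notification["ownerId"]
        samples = fetch_fitbit_data(owner_id)
        push_measurements(owner_id, samples)
    # Fitbit expects a prompt 204 acknowledgement
    return "", 204

if __name__ == "__main__":
    app.run(port=8080)
```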
3.3 Data Analysis and Forecast Execution
When new data arrives into the system, the data analysis module is called to
perform a forecast, based on the new available data. Given the data sources at
hand, there are two methods to call this module: (1) subscription based from
the context broker or (2) direct call from the context provider.
In the first method, subscriptions are created within the CB to notify the data
analysis module about data change. This method is used on the Questionnaire
and Measurement entities. For the latter, individual subscriptions have to be
created for each possible measurementType value. With this methodology, every
time new data arrives at the CB about any of these variables, the subscription is
triggered and the CB sends the notification to the forecast endpoint, identifying
the entity that has new data available.
The second method is used when there is new data from the Fitbit servers. Upon data collection from the Fitbit servers, the data is sent to the CB, being persisted by cygnus in the database. Afterwards, the data analysis endpoint is called with the subscription ID set to fitbit.
In order to perform the forecast, data is retrieved from the CB and the database, based on the subscriptionId, for a 15-day interval ending at the present instant. Data is collected for all variables: (1) anosmia, (2) contactCovid, (3) cough, (4) dyspnoea, (5) blood pressure, (6) numberDailyDiarrhea, (7) oxygen saturation, (8) temperature and (9) heart rate.
The data history of a certain variable is used to forecast that variable:
– variables 1–4 are boolean, so the implemented forecast consists of the last value observed for each variable, read from the context broker;
– variables 5–8 show slow variation and a small dynamic range, so the forecast used was a trend estimator;
– variable 9 has a seasonal variation that can be observed on a daily basis, so a forecast algorithm that takes seasonal effects into consideration was used, namely the additive Holt-Winters forecast method.
The Holt-Winters method needs to be fed with periodic data in order to make the forecast and requires data covering at least two seasons (two days). This means that data pre-processing has to be executed to ensure that this requirement is met. For instance, when the wearable device is being charged, it cannot, of course, read the patient's data, creating gaps that have to be processed before passing the data to the forecast algorithm. Hence, when a new forecast is executed, data is collected from the database, the data is checked for gaps and, if any are found, forward and backward interpolation are used to fill them. The final data, without gaps, is afterwards passed to the forecast method. Holt-Winters parameter optimization is executed first, using the last known hour of data coupled with a Mean Absolute Error (MAE) metric. The optimized parameters are then used to make the forecast for the defined forecast horizon.
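A simplified Python sketch of this pipeline is shown below, using pandas for the gap filling and statsmodels for the additive Holt-Winters model; here the smoothing parameters are optimized by the library's built-in fit rather than by the MAE-based search described above, and the sampling frequency is an assumption.

```python
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

def forecast_heart_rate(series, freq="5min", horizon_hours=2):
    """Forecast heart rate with additive Holt-Winters.

    `series` is a pandas Series indexed by timestamp. Gaps (e.g. while the
    wearable is charging) are filled by forward/backward interpolation
    before fitting, and the seasonal period corresponds to one day.
    """
    regular = series.resample(freq).mean()
    regular = regular.interpolate(limit_direction="both")  # fill gaps
    periods_per_day = int(pd.Timedelta("1D") / pd.Timedelta(freq))
    model = ExponentialSmoothing(
        regular,
        trend="add",
        seasonal="add",
        seasonal_periods=periods_per_day,
    ).fit()  # smoothing parameters optimized internally
    steps = int(pd.Timedelta(hours=horizon_hours) / pd.Timedelta(freq))
    return model.forecast(steps)
```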
Table 1. Classification metrics for Covid-19 level identification provided by HTQ.
Variable | 0 points | 1 point | 2 points
Systolic blood pressure (mmHg) | 120 ≤ x ≤ 140 | 100 ≤ x ≤ 119 | x ≤ 99
Heart rate (bpm) | 56 ≤ x ≤ 85 | 86 ≤ x ≤ 100 or 50 ≤ x ≤ 55 | x ≥ 101 or x ≤ 49
Oxygen saturation (%) | x ≥ 94 | 90 ≤ x ≤ 93 | x ≤ 89
Body temperature (°C) | x ≤ 36.9 | 37 ≤ x ≤ 39.9 | x > 39.9
Dyspnea | No | N/A | Yes
Cough | No | N/A | Yes
Anosmia | No | N/A | Yes
Contact COVID+ | No | N/A | Yes
Diarrhea | No | N/A | Yes (minimum of 5 times/day)
The final forecast mark is computed by adding all individual marks, calculated according to the rules specified in Table 1, defined by the medical personnel of Hospital Terra Quente. This value is used with the following classification to generate alerts:
– value ≤ 4 points → unlikely Covid-19 → 24 h vigilance;
– value ≥ 5 points → likely Covid-19 → the patient should be rushed to the emergency room (ER).
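A small Python sketch of this scoring and alerting rule is given below; the function signature is ours, and values not covered by the ranges listed in Table 1 (e.g. a systolic pressure above 140 mmHg) are scored conservatively as 2 points, which the table does not specify.

```python
def covid_points(sbp, heart_rate, spo2, temp,
                 dyspnea, cough, anosmia, contact_covid, diarrhea_per_day):
    """Total forecast mark following the rules of Table 1 (sketch only)."""
    points = 0
    # Systolic blood pressure (mmHg)
    if 120 <= sbp <= 140:
        points += 0
    elif 100 <= sbp <= 119:
        points += 1
    else:
        points += 2
    # Heart rate (bpm)
    if 56 <= heart_rate <= 85:
        points += 0
    elif 86 <= heart_rate <= 100 or 50 <= heart_rate <= 55:
        points += 1
    else:
        points += 2
    # Oxygen saturation (%)
    if spo2 >= 94:
        points += 0
    elif 90 <= spo2 <= 93:
        points += 1
    else:
        points += 2
    # Body temperature (degrees Celsius)
    if temp <= 36.9:
        points += 0
    elif temp <= 39.9:
        points += 1
    else:
        points += 2
    # Boolean symptoms score 2 points each when present
    points += 2 * sum([dyspnea, cough, anosmia, contact_covid])
    # Diarrhea counts only from 5 occurrences per day
    points += 2 if diarrhea_per_day >= 5 else 0
    return points

def alert(points):
    """<= 4 points: unlikely Covid-19 (24 h vigilance); >= 5: send to the ER."""
    return "24 h vigilance" if points <= 4 else "emergency room"

mark = covid_points(sbp=115, heart_rate=92, spo2=93, temp=37.4,
                    dyspnea=False, cough=True, anosmia=False,
                    contact_covid=False, diarrhea_per_day=0)
print(mark, "->", alert(mark))
```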
4 Results
As discussed in Sect. 3.3, some of the variables used in this work are boolean, and the forecast performed is simply the last value of those variables, read from the context broker.
Another set of variables uses the trend estimator to forecast the next-day value. One such case can be seen in Fig. 4(a), which showcases the number of daily diarrhea occurrences. In this case, the forecast is fewer than 5 times per day; thus, the attributed forecast points are 0, per Table 1.
Fig. 4. Trend estimator forecasting: (a) daily diarrhea example; (b) oxygen saturation example.
Fig. 5. Heart rate forecasting example, using the Holt-Winters method.
Another variable that also uses the trend estimation predictor is the oxygen saturation. An example for this variable is presented in Fig. 4(b). For this variable, the forecast horizon is 2 h and the worst-case scenario is considered for the forecast points calculation. In this example, the minimum of the forecast values is 90.78, yielding 1 forecast point.
The heart rate, given its seasonality characteristics, uses the additive Holt-Winters method, fed with data covering at least two days and a forecast horizon of two hours. Similarly to the other variables, the worst-case scenario is used to calculate the forecast points. Since the optimal case (0 points) is in the middle of the dynamic range of the heart rate value, both the maximum and minimum forecast values are used to determine the associated forecast points. Figure 5 shows an example of heart rate forecasting, in which forecasts can only be determined starting from the second season, when this season's data is already
available. The initial smoothing parameters α, β and γ fed into the Holt-Winters method [7] were [0.5, 0.5, 0.5], respectively. The optimization yielded the values [0.61136, 0.00189, 0.65050], with an associated MAE of 4.12.
Considering the forecast values, the minimum is 74.4983 and the maximum is 84.2434; thus, the forecast points associated with this example, according to Table 1, are 0 points.
5 Conclusions
In this work, we present a solution using mobile applications and wearable devices for health data collection and analysis, with the purpose of generating health status forecasts for patients who may, or may not, be infected with the COVID-19 virus. To implement this solution, Docker containerization was used together with elements from the FIWARE platform to create the system's architecture. A health context provider was implemented to fetch data from the wearable device, as well as a data analysis module to make the forecasts.
Three types of forecasts were devised, considering the variables' characteristics, namely last observation, trend estimation and the Holt-Winters methodology.
The results obtained so far are encouraging, and future work can be done on the optimization of the forecast methods and on the inclusion of more data sources.
Acknowledgements. This work has been supported by SMACovid-19 – Autonomous Monitoring System for Covid-19 (70078 – SMACovid-19).
References
1. Candelieri, A., Zhang, W., Messina, E., Archetti, F.: Automated rehabilitation exer-
cises assessment in wearable sensor data streams. In: 2018 IEEE International Con-
ference on Big Data (Big Data), pp. 5302–5304 (2018). https://doi.org/10.1109/
BigData.2018.8621958
2. Hahanov, V., Miz, V.: Big data driven healthcare services and wearables. In: the
Experience of Designing and Application of CAD Systems in Microelectronics, pp.
310–312 (2015). https://doi.org/10.1109/CADSM.2015.7230864
3. Docker. https://www.docker.com/resources/what-container. Accessed 5 May 2021
4. FIWARE. https://www.fiware.org/developers/. Accessed 5 May 2021
5. HL7 FHIR Standard. https://www.hl7.org/fhir/R4/. Accessed 5 May 2021
6. Fitbit Subscriptions Homepage. https://dev.fitbit.com/build/reference/web-api/
subscriptions/. Accessed 5 May 2021
7. Holt-Winters Definition. https://otexts.com/fpp2/holt-winters.html. Accessed 5 May 2021
Optimization in Control Systems Design
Economic Burden of Personal Protective
Strategies for Dengue Disease:
an Optimal Control Approach
Artur M. C. Brito da Cruz1,3 and Helena Sofia Rodrigues2,3(B)
1 Escola Superior de Tecnologia de Setúbal, Instituto Politécnico de Setúbal, Setúbal, Portugal
artur.cruz@estsetubal.ips.pt
2 Escola Superior de Ciências Empresariais, Instituto Politécnico de Viana do Castelo, Valença, Portugal
sofiarodrigues@esce.ipvc.pt
3 CIDMA - Centro de Investigação e Desenvolvimento em Matemática e Aplicações, Departamento de Matemática, Universidade de Aveiro, Aveiro, Portugal
Abstract. Dengue fever is a vector-borne disease that is widely spread. It has a vast impact on the economy of countries, especially where the disease is endemic. The costs associated with the disease comprise prevention and treatment. This study focuses on the impact of adopting individual behaviors to reduce mosquito bites, thereby avoiding the disease's transmission, and on their associated costs. An epidemiological model with human and mosquito compartments is presented, modeling the interaction of dengue disease. The model assumes some self-protection measures, namely the use of repellent on human skin, wearing clothes treated with repellent, and sleeping under treated bed nets. The household costs of these protections are taken into account to study their use. We conclude that personal protection could have an impact on reducing the number of infected individuals and the outbreak duration. The costs associated with personal protection could represent a burden on the household budget, and their purchase could influence the shape of the infection curve.
Keywords: Dengue · Economic burden · Personal protection ·
Household costs · Optimal control
1 Introduction
Dengue is a mosquito-borne disease dispersed almost all over the world. The
infection occurs when a female Aedes mosquito bites an infected person and
then bites another healthy individual to complete its feed [10].
According to the World Health Organization (WHO) [27], each year there are about 50 million to 100 million cases of dengue fever and 500,000 cases of severe dengue, resulting in hundreds of thousands of hospitalizations and over 20,000 deaths, mostly among children and young adults.
(Supported by FCT - Fundação para a Ciência e a Tecnologia.)
Several factors contribute to this public health problem, such as uncontrolled urbanization and increasing population growth. Problems like the increased use of non-biodegradable packaging, coupled with nonexistent or ineffective trash collection services, or the growing number of airplane trips, which allows the constant exchange of dengue viruses between countries, reflect such factors. Besides, financial and human resources are limited, leading to programs that emphasize emergency control in response to epidemics rather than integrated vector management related to prevention [27].
These days, dengue is one of the most important vector-borne diseases and, globally, has heavy repercussions in terms of morbidity and economic impact [1,28].
1.1 Dengue Economic Burden
Stanaway et al. [25] estimated dengue mortality, incidence, and burden for the Global Burden of Disease Study. In 2013, there were almost 60 million symptomatic dengue infections, resulting in about 10,000 deaths; moreover, the number of symptomatic dengue infections has more than doubled every ten years. Shepard et al. [23] estimated a total annual global cost of dengue illness of US$8.9 billion, with a distribution of dengue cases of 18% admitted to hospital, 48% ambulatory, and 34% non-medical.
The costs of dengue are twofold, comprising prevention and treatment. The increase in dengue cases leads to a rise in health expenditures, such as outpatient care, hospitalization, and drug administration, called direct costs. At the same time, there are indirect costs to the economy, such as loss of productivity, a decrease in tourism, and a reduction in foreign investment flows [14]. Most countries bet only on prevention programs to control outbreaks. They focus on surveillance programs, with ovitraps or advertising campaigns aimed at decreasing the breeding sites of the mosquito.
1.2 Personal Protective Measures (PPM)
Another perspective on dengue prevention is related to household prevention, where the individual has a central role in self-protection. PPM can be physical barriers (e.g., bed nets or clothing) or chemical barriers (e.g., skin repellents).
Mosquito nets are helpful in temporary camps or in houses near biting insect
breeding areas. Care is needed not to leave exposed parts of the body in contact
with the net, as mosquitoes bite through the net. However, insecticide-treated
mosquito nets have limited utility in dengue control programs, since the vector
species bite during the day, but treated nets can be effectively utilized to protect
infants and night workers who sleep by day [18].
For particular risk areas or occupations, protective clothing should be impregnated with approved insecticides and used during biting insect risk periods.
Repellents with the chemical diethyl-toluamide (DEET) give good protection. Nevertheless, repellents should act as a supplement to protective clothing.
The population also has personal expenses to prevent mosquito bites. The market offers cans of repellent to apply to the body, clothing already impregnated with insect repellent, and mosquito bed nets to protect individuals while they are sleeping. All of these items have costs and are considered the household cost for dengue.
The number of research papers studying the impact of dengue on a country's economy is considerable [9,11,13,23,24,26]. However, there is a lack of research related to household expenditures, namely what each individual could spend to protect themselves and their family from the mosquito. There is some work available about insecticide-treated bed nets [19] or house spraying operations [8], but studies using these three PPM are lacking in the scientific literature.
Each person has to decide between spending money on their protection or risking catching the disease.
This cost assessment can provide insight to policy-makers about the economic impact of dengue infection to guide and prioritize control strategies. Reducing the price of, or attributing subsidies to, some personal protection items could lead to an increase in their consumption and, therefore, help prevent mosquito bites and the disease.
This paper studies, from the individual perspective, the influence of the price of personal protection to prevent dengue disease. Section 2 introduces the optimal control problem and the parameters and variables that compose the epidemiological model. Section 3 analyzes the numerical results, and Sect. 4 presents the conclusions of the work.
2 The Epidemiological Model
2.1 Dengue Epidemiological Model
This research is based on the model proposed by Rodrigues et al. [21], which studies the human and mosquito populations through a mutually exclusive compartmental model. The human population is divided according to the classic SIR-type epidemiological framework (Susceptible - Infected - Recovered), while the mosquito population only has two compartments, SI (Susceptible - Infected). Several assumptions are made: both humans and mosquitoes are born susceptible; there is homogeneity between hosts and vectors; the human population is constant (N = S + I + R), disregarding migration processes; and seasonality, which could influence the mosquito population, was not considered.
The epidemic model for dengue transmission is given by the following sys-
tems of ordinary differential equations for human and mosquito populations,
respectively.
\[
\begin{cases}
\dfrac{dS(t)}{dt} = \mu_h N_h - \left( B\beta_{mh}\dfrac{I_m(t)}{N_h} + \mu_h \right) S(t)\\[6pt]
\dfrac{dI(t)}{dt} = B\beta_{mh}\dfrac{I_m(t)}{N_h}\,S(t) - \left( \eta_h + \mu_h \right) I(t)\\[6pt]
\dfrac{dR(t)}{dt} = \eta_h I(t) - \mu_h R(t)
\end{cases}
\quad (1)
\]
\[
\begin{cases}
\dfrac{dS_m(t)}{dt} = \mu_m N_m - \left( B\beta_{hm}\dfrac{I(t)}{N_h} + \mu_m \right) S_m(t)\\[6pt]
\dfrac{dI_m(t)}{dt} = B\beta_{hm}\dfrac{I(t)}{N_h}\,S_m(t) - \mu_m I_m(t)
\end{cases}
\quad (2)
\]
As expected, these systems depict the interactions of the disease between humans and mosquitoes and vice versa. The parameters of the model are presented in Table 1.
Table 1. Parameters of the epidemiological model

Parameter | Description | Range | Used value | Source
Nh | Human population | - | 112000 | [12]
1/μh | Average lifespan of humans (in days) | - | 79 × 365 | [12]
B | Average number of bites on an unprotected person (per day) | - | 1/3 | [20,21]
βmh | Transmission probability from Im (per bite) | [0.25, 0.33] | 0.25 | [6]
1/ηh | Average infection period on humans (in days) | [4, 15] | 7 | [4]
1/μm | Average lifespan of adult mosquitoes (in days) | [8, 45] | 15 | [7,10,16]
Nm | Mosquito population | - | 6 × Nh | [22]
βhm | Transmission probability from Ih (per bite) | [0.25, 0.33] | 0.25 | [6]
This model does not incorporate any PPM. Therefore, another compartment must be added to the human population: the Protected (P) compartment. It comprises humans that are using personal protection and therefore cannot be bitten by mosquitoes and become infected.
A control variable u(·) is introduced in (1), representing the effort of adopting PPM. Additionally, a new parameter is required, ρ, depicting the protection duration per day. Depending on the PPM used, this value reflects the individual protection capacity [5]. To reduce computational errors, the differential equations were normalized, meaning that the proportions of each compartment of individuals in the population were considered, namely
\[
s = \frac{S}{N_h},\quad p = \frac{P}{N_h},\quad i = \frac{I}{N_h},\quad r = \frac{R}{N_h}
\qquad \text{and} \qquad
s_m = \frac{S_m}{N_m},\quad i_m = \frac{I_m}{N_m}.
\]
The model with control is defined by:
\[
\begin{cases}
\dfrac{ds(t)}{dt} = \mu_h - \left( 6B\beta_{mh} i_m(t) + u(t) + \mu_h \right) s(t) + (1-\rho)\,p(t)\\[6pt]
\dfrac{dp(t)}{dt} = u(t)\,s(t) - \left( (1-\rho) + \mu_h \right) p(t)\\[6pt]
\dfrac{di(t)}{dt} = 6B\beta_{mh} i_m(t)\,s(t) - \left( \eta_h + \mu_h \right) i(t)\\[6pt]
\dfrac{dr(t)}{dt} = \eta_h i(t) - \mu_h r(t)
\end{cases}
\quad (3)
\]
and
\[
\begin{cases}
\dfrac{ds_m(t)}{dt} = \mu_m - \left( B\beta_{hm} i(t) + \mu_m \right) s_m(t)\\[6pt]
\dfrac{di_m(t)}{dt} = B\beta_{hm} i(t)\,s_m(t) - \mu_m i_m(t)
\end{cases}
\quad (4)
\]
This set of equations is subject to the following initial conditions [21]:
\[
s(0) = \frac{11191}{N_h},\quad p(0) = 0,\quad i(0) = \frac{9}{N_h},\quad r(0) = 0,\quad
s_m(0) = \frac{66200}{N_m},\quad i_m(0) = \frac{1000}{N_m}. \quad (5)
\]
The mathematical model is represented by the epidemiological scheme in
Fig. 1.
Due to the impossibility of educating everyone to use PPM, namely because of budget constraints, time restrictions, or even low individual willingness to use PPM, the control u(·) is considered to be bounded between 0 and umax = 0.7.
2.2 Optimal Control Problem
In this section, an optimal control problem is proposed that portrays the costs associated with PPM and the restrictions of the epidemic.
The objective functional is given by
\[
J(u(\cdot)) = R(T) + \int_0^T \gamma\, u^2(t)\, dt \quad (6)
\]
Fig. 1. Flow diagram of Dengue model with personal protection strategies
where γ is a constant that represents the cost of taking personal prevention measures per day and per person. At the same time, it is relevant that this functional also has a payoff term: the number of humans recovered from the disease at the final time, R(T). It is expected that the outbreak dies out, and therefore the total number of recovered individuals gives the cumulative number of persons infected by the disease. We want to minimize the total number of infected persons, and therefore the number of recovered persons at the final time, as well as the expenses associated with buying protective measures.
Over time, we want to find the optimal value u* of the control u such that the associated state trajectories s*, p*, i*, r*, s*_m, i*_m are solutions of Eqs. (3) and (4) with the following initial conditions:
\[
s(0) \geqslant 0,\quad p(0) \geqslant 0,\quad i(0) \geqslant 0,\quad r(0) \geqslant 0,\quad s_m(0) \geqslant 0,\quad i_m(0) \geqslant 0. \quad (7)
\]
The set of admissible controls is
\[
\Omega = \left\{ u(\cdot) \in L^{\infty}[0, T] : 0 \leqslant u(t) \leqslant u_{\max},\ \forall t \in [0, T] \right\}.
\]
The optimal control problem consists of finding (s*(·), p*(·), i*(·), r*(·), s*_m(·), i*_m(·)), associated with an admissible control u*(·) ∈ Ω on the time interval [0, T], that minimizes the objective functional.
Note that the cost function is L² and the integrand is convex with respect to u. Furthermore, the control system is Lipschitz with respect to the state variables and, therefore, an optimal control exists [3].
The Hamiltonian function is
\[
\begin{aligned}
H &= H\left( s(t), p(t), i(t), r(t), s_m(t), i_m(t), \Lambda, u(t) \right) = \gamma u^2(t)\\
&\quad + \lambda_1 \left( \mu_h - \left( 6B\beta_{mh} i_m(t) + u(t) + \mu_h \right) s(t) + (1-\rho)\,p(t) \right)\\
&\quad + \lambda_2 \left( u(t)\,s(t) - \left( (1-\rho) + \mu_h \right) p(t) \right)\\
&\quad + \lambda_3 \left( 6B\beta_{mh} i_m(t)\,s(t) - \left( \eta_h + \mu_h \right) i(t) \right)\\
&\quad + \lambda_4 \left( \eta_h i(t) - \mu_h r(t) \right)\\
&\quad + \lambda_5 \left( \mu_m - \left( B\beta_{hm} i(t) + \mu_m \right) s_m(t) \right)\\
&\quad + \lambda_6 \left( B\beta_{hm} i(t)\,s_m(t) - \mu_m i_m(t) \right)
\end{aligned}
\]
Pontryagin's Maximum Principle [17] states that if u*(·) is an optimal control for Eqs. (3)–(6) with fixed final time, then there exists a nontrivial absolutely continuous mapping, the adjoint vector
\[
\Lambda : [0, T] \to \mathbb{R}^6, \quad \Lambda(t) = \left( \lambda_1(t), \lambda_2(t), \lambda_3(t), \lambda_4(t), \lambda_5(t), \lambda_6(t) \right),
\]
such that
\[
s' = \frac{\partial H}{\partial \lambda_1},\quad p' = \frac{\partial H}{\partial \lambda_2},\quad i' = \frac{\partial H}{\partial \lambda_3},\quad r' = \frac{\partial H}{\partial \lambda_4},\quad s_m' = \frac{\partial H}{\partial \lambda_5} \quad \text{and} \quad i_m' = \frac{\partial H}{\partial \lambda_6},
\]
and where the optimality condition
\[
H\left( s^*(t), p^*(t), i^*(t), r^*(t), s_m^*(t), i_m^*(t), \Lambda(t), u^*(t) \right)
= \min_{0 \leqslant u \leqslant u_{\max}} H\left( s^*(t), p^*(t), i^*(t), r^*(t), s_m^*(t), i_m^*(t), \Lambda(t), u(t) \right)
\]
and the transversality conditions
\[
\lambda_i(T) = 0,\ i = 1, 2, 3, 5, 6 \quad \text{and} \quad \lambda_4(T) = 1 \quad (8)
\]
hold almost everywhere in [0, T].
The following theorem follows directly from applying Pontryagin's maximum principle to the optimal control problem.
Theorem 1. The optimal control problem with fixed final time T defined by Eqs. (3)–(6) has a unique solution (s*(t), p*(t), i*(t), r*(t), s*_m(t), i*_m(t)) associated with the optimal control u*(·) on [0, T] given by
\[
u^*(t) = \max\left\{ 0,\ \min\left\{ \frac{\left( \lambda_1(t) - \lambda_2(t) \right) s^*(t)}{2\gamma},\ u_{\max} \right\} \right\} \quad (9)
\]
with the adjoint functions satisfying
\[
\begin{cases}
\lambda_1'(t) = \lambda_1 \left( 6B\beta_{mh} i_m^*(t) + u^*(t) + \mu_h \right) - \lambda_2 u^*(t) - \lambda_3\, 6B\beta_{mh} i_m^*(t)\\[4pt]
\lambda_2'(t) = -\lambda_1 (1-\rho) + \lambda_2 \left( (1-\rho) + \mu_h \right)\\[4pt]
\lambda_3'(t) = \lambda_3 \left( \eta_h + \mu_h \right) - \lambda_4 \eta_h + \left( \lambda_5 - \lambda_6 \right) B\beta_{hm} s_m^*(t)\\[4pt]
\lambda_4'(t) = \lambda_4 \mu_h\\[4pt]
\lambda_5'(t) = \lambda_5 \left( B\beta_{hm} i^*(t) + \mu_m \right) - \lambda_6 B\beta_{hm} i^*(t)\\[4pt]
\lambda_6'(t) = \left( \lambda_1 - \lambda_3 \right) 6B\beta_{mh} s^*(t) + \lambda_6 \mu_m
\end{cases}
\quad (10)
\]
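As a brief sanity check of the characterization (9) (a standard derivation step that the paper does not spell out), the stationarity of the Hamiltonian with respect to u on the interior of the admissible interval gives
\[
\frac{\partial H}{\partial u} = 2\gamma u(t) - \lambda_1(t)\, s(t) + \lambda_2(t)\, s(t) = 0
\quad \Longrightarrow \quad
u(t) = \frac{\left( \lambda_1(t) - \lambda_2(t) \right) s(t)}{2\gamma},
\]
and projecting this stationary value onto the interval [0, umax] yields exactly the max-min expression in (9).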
3 Numerical Results
This section presents the results of the numerical implementation of control
strategies for PPM for dengue disease.
For the problem resolution, the time frame considered was one year, and the parameters used are available in Table 1.
To obtain the computational results, Theorem 1 was implemented numerically in MATLAB version R2017b, and the extremal found, u*, was evaluated using a forward-backward fourth-order Runge-Kutta method with a variable time step for efficient computation (see [15] for more details). The differential equations for the state variables, (3)–(4), were integrated forward using the initial conditions (5), while the differential equations for the adjoint variables were integrated backward using the transversality conditions (8).
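As a concrete illustration of this forward-backward scheme, the sketch below implements it in Python with a fixed-step fourth-order Runge-Kutta integrator, a simplification of the variable-step MATLAB implementation reported above. The parameter values follow Tables 1 and 2 for scenario A; the number of sweeps, the relaxation factor, and the stopping tolerance are illustrative choices, not values taken from the paper.

```python
import numpy as np

# Parameters reported in the paper (Table 1, and Table 2 for scenario A)
Nh = 112000.0
mu_h = 1.0 / (79 * 365)       # human mortality rate (1/day)
B = 1.0 / 3                   # bites per unprotected person per day
beta_mh = 0.25                # transmission probability mosquito -> human
beta_hm = 0.25                # transmission probability human -> mosquito
eta_h = 1.0 / 7               # human recovery rate (1/day)
mu_m = 1.0 / 15               # mosquito mortality rate (1/day)
rho = 1.0 / 6                 # protection duration, scenario A (skin repellent)
gamma = 10 * 12 / (365 * Nh)  # daily per-person cost, scenario A
u_max = 0.7
T, N = 365.0, 5000
h = T / N

def state_rhs(x, u, _):
    # normalized controlled dynamics (3)-(4); state = [s, p, i, r, sm, im]
    s, p, i, r, sm, im = x
    ds = mu_h - (6 * B * beta_mh * im + u + mu_h) * s + (1 - rho) * p
    dp = u * s - ((1 - rho) + mu_h) * p
    di = 6 * B * beta_mh * im * s - (eta_h + mu_h) * i
    dr = eta_h * i - mu_h * r
    dsm = mu_m - (B * beta_hm * i + mu_m) * sm
    dim = B * beta_hm * i * sm - mu_m * im
    return np.array([ds, dp, di, dr, dsm, dim])

def adjoint_rhs(lam, x, u):
    # adjoint system (10)
    s, p, i, r, sm, im = x
    l1, l2, l3, l4, l5, l6 = lam
    dl1 = l1 * (6 * B * beta_mh * im + u + mu_h) - l2 * u - l3 * 6 * B * beta_mh * im
    dl2 = -l1 * (1 - rho) + l2 * ((1 - rho) + mu_h)
    dl3 = l3 * (eta_h + mu_h) - l4 * eta_h + (l5 - l6) * B * beta_hm * sm
    dl4 = l4 * mu_h
    dl5 = l5 * (B * beta_hm * i + mu_m) - l6 * B * beta_hm * i
    dl6 = (l1 - l3) * 6 * B * beta_mh * s + l6 * mu_m
    return np.array([dl1, dl2, dl3, dl4, dl5, dl6])

def rk4_step(f, y, a, b, dt):
    # one fixed-step RK4 stage; a and b are held constant over the step
    k1 = f(y, a, b)
    k2 = f(y + 0.5 * dt * k1, a, b)
    k3 = f(y + 0.5 * dt * k2, a, b)
    k4 = f(y + dt * k3, a, b)
    return y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# Initial conditions (5)
x0 = np.array([11191 / Nh, 0.0, 9 / Nh, 0.0, 66200 / (6 * Nh), 1000 / (6 * Nh)])
u = np.zeros(N + 1)
x = np.tile(x0, (N + 1, 1))
lam = np.zeros((N + 1, 6))

for _ in range(200):                        # forward-backward sweeps
    for k in range(N):                      # forward sweep of the states (3)-(4)
        x[k + 1] = rk4_step(state_rhs, x[k], u[k], None, h)
    lam[N] = np.array([0, 0, 0, 1, 0, 0])   # transversality conditions (8)
    for k in range(N, 0, -1):               # backward sweep of the adjoints (10)
        lam[k - 1] = rk4_step(adjoint_rhs, lam[k], x[k], u[k], -h)
    u_new = np.clip((lam[:, 0] - lam[:, 1]) * x[:, 0] / (2 * gamma), 0.0, u_max)
    if np.max(np.abs(u_new - u)) < 1e-6:    # stop when the control stabilizes
        break
    u = 0.5 * u + 0.5 * u_new               # relaxation to help convergence

print("R(T) in persons:", x[-1, 3] * Nh)
```

Setting u to zero throughout the sweep reproduces the no-control baseline discussed next (Fig. 2).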
Figure 2 shows the evolution of the human state variables during the whole year without any control. This analysis is a meaningful starting point for a fair comparison with the application of the PPM. It should also be mentioned that this model does not take into account deaths by dengue because, in the region considered in the study [21], fortunately nobody died and there were no cases of hemorrhagic fever.
Fig. 2. Evolution of the human state variables when the human population does not take any protective measures
For this research, three protective measures were considered: insect repellent, bed nets, and clothes impregnated with insect repellent. The average market price of each product was obtained, as well as how long each product stays active.
The following simulations are divided into two subsections: the first one using only one PPM, and the second one combining several PPM. Each PPM has two associated factors: cost and durability. In our model, the parameter γ is the cost that each individual spends to protect themselves during the whole year, and the parameter ρ determines how long the protection stays active. The individual is responsible for deciding the best strategy and how much he/she will spend on protective measures. Both constants are listed in Table 2. For example, the cost of a spray can of insect repellent is 10 €, and it only lasts a month, which is why γ = (10×12)/(365×Nh). Furthermore, each application of insect repellent only lasts four hours, hence ρ has the value 1/6.
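To make the construction of the combined scenarios in Table 2 concrete, the following worked example (an interpretation inferred from the table values, not an explicit statement of the paper) adds the yearly costs and the daily protection fractions of the individual measures for scenario D:
\[
\gamma_D = \frac{10 \times 12 + 20}{365 \times N_h}, \qquad
\rho_D = \frac{1}{6} + \frac{1}{3} = \frac{1}{2}.
\]
In other words, roughly four hours of repellent protection plus eight hours under a bed net cover twelve of the twenty-four hours of the day, matching the ρ = 1/2 of insecticide-treated clothes at a lower yearly cost.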
Therefore, this analysis has a twofold purpose: to understand the impact on the curve of infected individuals when distinct personal protective measures are used, and to find the economic burden of these measures on each person.
Table 2. Parameters associated with the control

Scenario | Control | Cost (γ) | Protection (ρ)
No control | None | 0 | 0
Single control A | Skin repellent | (10×12)/(365×Nh) | 1/6
Single control B | Bed net | 20/(365×Nh) | 1/3
Single control C | Insecticide-treated clothes | (30×6)/(365×Nh) | 1/2
Combined control D | Skin repellent + Bed net | (10×12+20)/(365×Nh) | 1/2
Combined control E | Skin repellent + Insecticide-treated clothes | (10×12+30×6)/(365×Nh) | 2/3
Combined control F | Bed net + Insecticide-treated clothes | (20+30×6)/(365×Nh) | 5/6
Combined control G | All | (10×12+20+30×6)/(365×Nh) | 1
3.1 Single Control
This section presents the results for the application of one single control. Only one of the protective measures, skin repellent, bed nets, or clothes impregnated with insecticide, is adopted and only used once per day.
Scenario A (see Fig. 3), associated with the use of skin repellent, is the case in which most people get infected. About 85% of the population gets infected, which is probably why, among the single control scenarios, this is the one where the control starts to decrease soonest. A reasonable explanation is that this protective measure is not very effective; note that this is the case where people are least protected.
Scenarios B (Fig. 4) and C (Fig. 5) show similar behavior of the control function; however, scenario C is the first case where the number of protected persons is larger than the number of susceptible persons, at least for most of the year.
Note that the cost of using treated clothes is much higher than that of using bed nets. However, there is not much difference in the total number of infected persons between using bed nets and using treated clothes, about 500 more infected persons.
Fig. 3. Scenario A - Control used: skin repellent
Fig. 4. Scenario B - Control used: bed nets
Fig. 5. Scenario C - Control used: insecticide-treated clothes
3.2 Combined Control
Another approach to household protection is the combination of more than one protective measure. Wearing or using more personal protection significantly reduces the number of infected persons. The combined use of insect repellent and bed nets, scenario D (Fig. 6), has the same value of ρ as scenario C and, therefore, the human state variables and the control function of these cases have similar behavior. The big difference between scenarios C and D is the cost of the protections, which will be discussed in the following subsection.
Fig. 6. Scenario D - Combined control (insect repellent and bed net)
In scenarios E (Fig. 7), F (Fig. 8), and G (Fig. 9), the number of infected people is considerably lower than in the previous cases. The final number of recovered persons drops to almost half the population or less. The number of days during which the control stays at its maximum keeps increasing up to scenario F. This seems to happen because the number of infected persons in these scenarios reduces significantly.
Fig. 7. Scenario E - Combined control (skin repellent and insecticide-treated clothes)
In scenario F, a large number of persons have personal protection. The maximum number of infected persons on a single day is 99. In this scenario, the human state variables have completely different behavior from the previous scenarios. This amount of protection has a clear impact on the epidemic.
Fig. 8. Scenario F - Combined control (bed net and insecticide-treated clothes)
Scenario G is almost utopian, since the individual has complete protection every hour of the day. Right from the start, 70% of the population is protected, and after 41 days there are no more dengue infections. Overall costs decrease because people do not have to apply personal protection for most of the year.
Fig. 9. Scenario G - Combined control (skin repellent, bed net, and insecticide-treated clothes)
3.3 Cost of the Control
Control measures associated with personal protection were analyzed, separately and combined, to find the best strategy. Table 3 illustrates the differences among all scenarios, and several relevant values were scrutinized to understand the dynamics of the human state variables. The peak of the infection, that is, the maximum number of infected persons in a day, and the day when it occurs are pointed out, together with the total number of infected individuals; this information is crucial to prepare human and medical resources in advance of an outbreak. The epidemic's end is considered to be the day when the number of infected persons becomes smaller than 9, which corresponds to the initial number of infected persons in these simulations. Finally, the last two columns concern the functional cost. R(T) represents the number of recovered persons on the last day of the year considered, which also gives the total number of people infected during the whole year. The cost of personal protection measures stands for the amount that each person would have to spend during the whole year.
The maximum number of infected people on a single day occurs, as expected, when there are no personal protection measures. The minimum value, 81, occurs when people are fully protected, but scenario F, with ρ = 5/6, has only a few more infected persons at its peak, 99. Strategies C and D have similar results regarding the peak of the infection, in line with the respective figures presented in the previous subsections. However, the cost of using only insecticide-treated clothes (scenario C) is 26% higher than that of the combined control with insect repellent and bed net (scenario D).
In Table 3, one can see that with more protection measures, the maximum number of infected people in a single day decreases. However, the day when this peak is reached is surprising. The peak is reached on the 35th day when there is no control, and it keeps being reached later up to scenario E, on the 81st day. In these scenarios, from no control up to scenario E, flattening the curve of infected people makes the epidemic last longer. Both columns, Epidemic's end and R(T), illustrate this fact. In the last two scenarios, because a large share of the population uses protection for almost all of the day, the total number of infected people drastically decreases. In these two cases, the human state variables have different behavior from the other scenarios.
Table 3. Parameters associated with the control

Scenario | ρ | Peak of infected persons | Peak's day | Epidemic's end (day) | R(T) | Cost
No control | 0 | 1912 | 35 | 120 | 10878 | 0
A | 1/6 | 863 | 58 | 202 | 9571 | 62.5
B | 1/3 | 710 | 63 | 223 | 9083 | 13
C | 1/2 | 510 | 72 | 260 | 8177 | 115.7
D | 1/2 | 510 | 72 | 261 | 8176 | 91.3
E | 2/3 | 257 | 81 | 336 | 6057 | 206.1
F | 5/6 | 99 | 6 | 221 | 1349 | 135.3
G | 1 | 81 | 3 | 20 | 120 | 8.7
Another perspective on the outbreak is to present all the curves of infected humans (Fig. 10). The adoption of any PPM decreases the peak of infections, which could influence the medical care of the patients. Higher values of ρ mean fewer infected people. Conversely, the epidemic's end and its peak happen later for higher values of ρ (except in the last two cases), which could be explained by the flattening of the curve of infected persons.
Fig. 10. Infected people - all scenarios (single control scenarios and combined control scenarios)
The control functions in all scenarios tend to stay at the maximum control value for most of the year under study. However, in scenario A the control function starts to drop from the maximum value much sooner than in all other scenarios. A possible explanation is that, by the 200th day, most people had already been infected by dengue, so there is no need for further protection. Likewise, in scenarios F and G, because the end of the epidemic is reached sooner, the control function also starts to decrease sooner than in the remaining scenarios (see Fig. 11).
Fig. 11. Control variable - all scenarios (single control scenarios and combined control scenarios)
Protective measures have varied prices, and the duration of each one is different. Scenarios F and G suggest that using very efficient protection from the start of the outbreak ends it sooner. Furthermore, and perhaps unexpectedly, each person in these scenarios does not spend as much money as in other cases. For instance, scenario F appears to be a much better solution than scenario E, not only in terms of personal cost but also in fighting the disease.
4 Conclusions
Effective dengue prevention requires a coordinated and sustainable approach. In this way, it is possible to address the environmental, behavioral, and social contexts of disease transmission. Determining household expenditure related to bite prevention is a crucial step for acting in advance of an outbreak. With an efficient cost analysis, each individual could make a trade-off decision among the available resources.
This work follows up on the work the authors started in [2]. Here, the research tried to answer the following question: what is the best strategy to fight dengue disease while, at the same time, spending less money?
The results corroborated that if a person is more protected, then the disease spreads less. Long protection flattens the curve of infected humans, which is positive from the medical point of view. Hence, the medical staff could have more time to prepare the outbreak response, and the Health Authorities could acquire more supplies to treat the patients.
However, it was found that this does not mean that the person should spend more. As an example, the combination of skin repellent and bed net has the same impact on the number of infected individuals as the insecticide-treated clothes, but it is much cheaper.
These results also reflect the importance of using personal protective measures for most of the day. Half or even more of the population will not be infected if a person stays protected for at least 16 hours a day.
As future work, it will be interesting to study the economic impact of strategies related to prevention (including self-protection measures) and treatment (which could include the costs of drug administration, hospitalization, or even losses from sick days), focusing only on the individual perspective. Another perspective to be explored is the economic evaluation of prevention. The adoption of some prevention activities, from the individual point of view, could have a profound impact on the government budget allocated to the disease's treatment. Thus, it could be critical to investigate the financial support of preventive measures.
Acknowledgement. This work is supported by The Center for Research and Develop-
ment in Mathematics and Applications (CIDMA) through the Portuguese Foundation
for Science and Technology (FCT - Fundação para a Ciência e a Tecnologia), references
UIDB/04106/2020 and UIDP/04106/2020.
References
1. Bhatt, S., et al.: The global distribution and burden of dengue. Nature 496, 504–
507 (2013)
2. Brito da Cruz, A.M., Rodrigues, H.S.: Personal protective strategies for dengue
disease: simulations in two coexisting virus serotypes scenarios. Math. Comput.
Simul. 188, 254–267 (2021)
3. Cesari, L.: Optimization-Theory and Applications. Springer, New York (1983)
4. Chan, M., Johansson, M.A.: The incubation periods of dengue viruses. PLoS One
7(11), e50972 (2012)
5. Demers, J., Bewick, S., Calabrese, J., Fagan, W.F.: Dynamic modelling of personal
protection control strategies for vector-borne disease limits the role of diversity
amplification. J. R. Soc. Interface 15, 20180166 (2018)
6. Focks, D.A., Brenner, R.J., Hayes, J., Daniels, E.: Transmission thresholds for
dengue in terms of Aedes aegypti pupae per person with discussion of their utility
in source reduction efforts. Am. J. Trop. Med. Hyg. 62, 11–18 (2000)
7. Focks, D.A., Haile, D.G., Daniels, E., Mount, G.A.: Dynamic life table model for
Aedes aegypti (Diptera: Culicidae): analysis of the literature and model develop-
ment. J. Med. Entomol. 30, 1003–1017 (1993)
8. Goodman, C.A., Mnzava, A.E.P., Dlamini, S.S., Sharp, B.L., Mthembu, D.J.,
Gumede, J.K.: Comparison of the cost and cost-effectiveness of insecticide-treated
bednets and residual house-spraying in KwaZulu-Natal, South Africa. Trop. Med.
Int. Health 6(4), 280–95 (2001)
9. Hariharana, D., Dasb, M.K., Sheparda, D.S., Arorab, N.K.: Economic burden of
dengue illness in India from 2013 to 2016: a systematic analysis. Int. J. Infect. Dis.
84S, S68–S73 (2019)
10. Harrington, L.C., et al.: Analysis of survival of young and old Aedes aegypti
(Diptera: Culicidae) from Puerto Rico and Thailand. J. Med. Entomol. 38, 537–547
(2001)
11. Hung, T.M., et al.: The estimates of the health and economic burden of dengue in
Vietnam. Trends Parasitol. 34(10), 904–918 (2018)
12. INE. Statistics Portugal. http://censos.ine.pt. Accessed 5 Apr 2020
13. Laserna, A., Barahona-Correa, J., Baquero, L., Castañeda-Cardona, C., Rosselli,
D.: Economic impact of dengue fever in Latin America and the Caribbean: a sys-
tematic review. Rev. Panam Salud Publica 42(e111) (2018)
14. Lee, J.S., et al.: A multi-country study of the economic burden of dengue fever
based on patient-specific field surveys in Burkina Faso, Kenya, and Cambodia.
PLoS Negl. Trop. Dis. 13(2), e0007164 (2019)
15. Lenhart, C.J., Workman, J.T.: Optimal Control Applied to Biological Models. Chapman & Hall/CRC, Boca Raton (2017)
16. Maciel-de-Freitas, R., Marques, W.A., Peres, R.C., Cunha, S.P., Lourenço-de-
Oliveira, R.: Variation in Aedes aegypti (Diptera: Culicidae) container productivity
in a slum and a suburban district of Rio de Janeiro during dry and wet seasons.
Mem. Inst. Oswaldo Cruz 102, 489–496 (2007)
17. Pontryagin, L.S., Boltyanskii, V.G., Gamkrelidze, R.V., Mishechenko, E.F.: The
Mathematical Theory of Optimal Processes VIII + 360. Wiley, New York/London
(1962)
18. Public Health Agency of Canada: Statement on Personal Protective Measures to
Prevent Arthropod Bites. Canada Communicable Disease Report 38 (2012)
19. Pulkki-Brännström, A.M., Wolff, C., Brännström, N., Skordis-Worrall, J.: Cost
and cost effectiveness of long-lasting insecticide-treated bed nets - a model-based
analysis. Cost Effectiveness Resour. Allocat. 10(5), 1–13 (2012)
20. Rocha, F.P., Rodrigues, H.S., Monteiro, M.T.T., Torres, D.F.M.: Coexistence of
two dengue virus serotypes and forecasting for Madeira Island. Oper. Res. Health
Care 7, 122–131 (2015)
21. Rodrigues, H.S., Monteiro, M.T.T., Torres, D.F.M., Silva, A.C., Sousa, C., Conceição, C.: Dengue in Madeira Island. In: Bourguignon, J.-P., Jeltsch, R., Pinto, A.A., Viana, M. (eds.) Dynamics, Games and Science. CSMS, vol. 1, pp. 593–605. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-16118-1_32
22. Rodrigues, H.S., Monteiro, M.T.T., Torres, D.F.M., Zinober, A.: Dengue disease,
basic reproduction number and control. Int. J. Comput. Math. 89(3), 334–346
(2012)
23. Shepard, D., Undurraga, E.A., Halasa, Y., Stanaway, J.D.: The global economic
burden of dengue: a systematic analysis. Lancet - Infect. Dis. 16(8), 935–941 (2016)
24. Sher, C.Y., Wong, H.T., Lin, Y.C.: The impact of dengue on economic growth: the
case of Southern Taiwan. Int. J. Environ. Res. Public Health 17(3), 750 (2020)
25. Stanaway, J.D., et al.: The global burden of dengue: an analysis from the global
burden of disease study 2013. Lancet Infect. Dis. 16(6), 712–23 (2016)
26. Suaya, J.A., et al.: Cost of dengue cases in eight countries in the Americas and
Asia: a prospective study. Am. J. Trop. Med. Hyg. 80, 846–855 (2009)
27. World Health Organization: Managing regional public goods for health:
Community-based dengue vector control. Asian Development Bank and World
Health Organization, Philippines (2013)
28. World Health Organization: A toolkit for national dengue burden estimation.
WHO, Geneva (2018)
ERP Business Speed – A Measuring Framework
Zornitsa Yordanova(B)
University of National and World Economy, 8mi dekemvri, Sofia, Bulgaria
zornitsayordanova@unwe.bg
Abstract. A major business problem nowadays is how adequate and appropriate the ERP system used is for growing business needs, the ever-changing complexity of the business environment, the growing global-scale competition, and, as a result, the growing speed of business. The purpose of this research is to establish a benchmark for ERP system business speed (performance from a business operations point of view), i.e., how fast business operations are executed with the ERP in use. A literature analysis is conducted to frame and define the concept of ERP business speed as an important and crucial factor for business success. Then a measurement framework for testing ERP systems in terms of business operations and ERP business performance is proposed and tested. Metrics for measuring ERP business speed are defined by conducting focused interviews with experts in ERP implementation and maintenance. The measurement framework has been empirically tested in six business organizations, first to validate the selected KPIs and second to formulate some average business speed indications of the KPIs as part of the business speed of ERP. The research contributes by providing a measurement framework tool for testing the business speed of ERP systems. The study can also serve as a benchmark in further measurement of ERP business speed. From a theoretical perspective, this study provides a definition and an explanation of the term ERP business speed for the first time in the literature.
Keywords: ERP · Enterprise information systems · Enterprise software ·
Business speed · MIS · Measurement tool
1 Introduction
Enterprise resource planning (ERP) is an integrated management approach for organizations that aims at encompassing all enterprise activities into a system process moving together [1]. Nowadays these systems are usually integrated into an information system because of the large scope of business, the huge number of transactions, complex internal processes, competitiveness, and optimization efforts to achieve higher margins [2]. Enterprise resource planning systems are integrated software packages, including all necessary ingredients to support a smooth workflow and process with a common database [3]. They are used for the integration and automation of processes, performance improvements, and cost reduction across all enterprise activities [4]. They are considered a powerful tool for enterprise management and for information and data management, reducing human error, speeding the flow of information, and improving overall
decision making throughout the organization [5]. Over the last two decades, ERP systems have become one of the most significant and expensive implementations in the information technology business [6]. Because of their importance for businesses and their wide spread amongst enterprises, any kind of new technology or business issue has been quickly incorporated within ERPs [4], and they have become an integral part of enterprises addressing the changing environment, the increasing market requirements, and the increased data needs of enterprises. Yet, ERP requires a large amount of investment, time, resources, and effort, and a potential failure is a risk factor [7] that many companies consider frightening and still possible [8].
However, at the beginning of the third decade of the 21st century, the question of the benefit of implementing an ERP system is not relevant anymore. Now the question is not how fast, easily, and effectively to implement an ERP, but rather how the already implemented ERP serves the organization's needs and how fast it answers to its business speed [9]. The research question of this study is what ERP business speed is and how it can be measured.
2 Theoretical Background
2.1 ERP Systems and Their Significance for Businesses
ERP systems have been designed to address the fragmentation and decomposition of information, transaction by transaction, across an enterprise's business, and also to integrate intra- and inter-enterprise information [10]. They are usually organized in complex software packages which integrate data, information, and business processes across the functional areas of a business [11]. ERP systems are customer-tailored and usually customized into a unique system that fulfils individual and specific needs [12]. ERP systems aim at covering the full range of functionality needed by organizations, and this is considered their main benefit [13]. Implementation of an ERP system across an organization takes time, money, and a lot of effort, including internal resources [14]. That is why organizations have been trying hard to find and come up with factors that can help them succeed in their implementation of ERP systems [15]. There is a lot of research examining ERP implementation failure factors [16] and project management approaches [17] in order to optimize these implementations and ERP utilization. Nowadays, ERP research is focused more on ERP transformation according to such changes [18]. ERP is closely tied to business, and [19] defines ERP as an institutionalized factor and component of enterprise infrastructure and development. Davenport [20] stated even 20 years ago that the business world had embraced ERP and that it might be the most important development in the corporate use of information technology to come from the 1990s. But the scientific literature provides many proofs that it is still important in 2020 [21]. ERP systems are crucial for business from the point of view of operations and functional capabilities [22], but they are also considered critically needed by top management, who regard them as an important tool for both strategic and operative purposes [23].
For some organizations, ERP is a mandatory factor for operating [24]. ERP systems benefit organizations by improving the quality of service and increasing efficiency [25]. ERP systems have transformed the functioning of firms with regard to higher efficiency and accuracy in their operations [7]. Poston and Grabski [21] examined enterprises before and after adoption and found a significant cost reduction. Hunton et al. [22] carried out an experiment with financial analysts, researching whether investors consider that ERP implementation enhances firm value. The results showed positive effects from such implementations, even though that positive assumption was not measurable. According to Hedman and Kalling [23], the main benefits from ERP and enterprise information systems in general, as well as the factors that generate profit, are the activation of resources and activities and the quality and cost of the offering in the light of competition.
Generally, from a business point of view, computing and ERP provide great additional value to business [26], and most of the analysed studies give arguments in support of this cliché statement. Therefore, the focus of the current research falls on a relatively new aspect of business and ERP that extends these systems' significance from a business point of view. This new aspect is business speed, and ERP business speed in particular.
2.2 Business Speed Requirements
Esteves and Pastor [25] researched the ERP lifecycle and distinguish six phases of usage and utilization: decision, acquisition, implementation, use and maintenance, evolution, and retirement. Nowadays, most implemented ERP systems have already reached the retirement phase because of new technologies [26] or because the currently used ERP system or processes have become inadequate to the business's needs [27]. As a result, the ERP implementation is not a factor, a risk, a question, or a doubt anymore. After moving through all of these phases, acquisition, deployment, use, maintenance, and evolution, the business's interest in these already proven information systems is directed at their business speed in addressing real business goals. ERP business speed is a novel term, first presented in this research. It should not be confused with economic speed or business speed in general, nor with technology or technology adoption speed. ERP business speed will be discussed in the next sections of the study. It is based on the concept of business speed, which is increasingly important for businesses.
Business speed has already appeared in the literature, but its meaning was never fully defined and it was used as a word combination rather than as a term [28]. The word combination has been used in different contexts. Bill Gates [29] used business speed in his first book with the meaning of the speed of business thought, of how business, leaders, and technology should act together, and of where business is going. His usage of business speed is related to business direction and to the acceleration of business cycles and changes in general. Pulendran, Speed, and Widing [30] use the word combination to explain business performance. Meyer and Davis [31] used business speed from the perspective of the speed of change in the connected economy. For many researchers, business speed is about modelling a business approach that boosts business [32]. For Gregory and Rawling [33], business speed is related to time-based strategy, time compression, and business performance. Miller [34] was amongst the first to link business speed to information systems and addressed business speed as the acceleration of business via implementing SAP. Kaufmann and Wei [35] examine business speed as commerce speed. Some authors have tried to connect business speed with the startup ecosystem and with methods for startup management such as the Lean startup method [36]. However, the linkage between ERP and business speed has not yet been researched.
3 Research Design
To address the challenging agenda of setting a new framework between business and information technologies, namely the business speed of ERPs, the methodology used is simple and calls for further exploration. In this research, a questionnaire for assessing ERP business speed was formulated after a literature analysis of the scientific advancements to date. The questionnaire was formulated using key performance indicators (KPIs) that are important for businesses and predominantly executed in ERPs. The questionnaire was created with the support of subject-matter experts in ERP implementation who took part in the research as consultants. Three consultants of different systems joined the design of the framework, having experience in implementing ERPs from brands such as SAP, Microsoft, Infor, Epicor, and Oracle. After the questionnaire was formulated and all the experts fully agreed on the top KPIs, it was empirically tested for relevance in six firms in order to be validated. A second goal of the empirical testing was to obtain a benchmark for the formulated KPIs.
The companies that tested the ERP business speed framework were from different sectors and sizes and do not represent a homogeneous group for quantitative results. This approach was chosen for the purpose of making the framework universal. The results from the tests are summarized with their average values and are presented in the results section in table format for clearer use by readers. The testers were key users from all departments of the organizations where the implementation takes place.
4 Results and Discussion: Benchmark of ERP Business Speed
by Operations
After testing the ERP business speed framework within the daily business operations of six companies, the results were first summarized, then averaged, and are presented here according to their business operation affiliation in separate table sections (Table 1).
Table 1. ERP business speed benchmark

Business operation | Time (avg.) | Time (max) | Time (min) | Time required now

Account receivable
Creating a Sales order with 5 lines | 2 min | 3 min | 2 min | 1 min
Confirmation of a Sales order with 5 lines | 15 s | 40 s | 10 s | < 5 s (automatically generated based on sales order)
Creating a Picking list with 5 lines | 30 s | 1 min | 20 s | < 5 s (automatically generated based on sales order)
Physical picking of goods with a mobile device integrated with the ERP | 10 min | 20 min | 5 min | 2 min
Confirmation of a Picking list with 5 lines | 15 s | 40 s | 10 s | < 2 s (automatic process integrated with mobile devices)
Creating a Packing list with 5 lines | 30 s | 1 min | 20 s | < 5 s (automatically generated based on Picking list)
Creating an invoice with 5 lines | 30 s | 1 min | 20 s | < 5 s (automatically generated based on Picking list)
Creating a customer | 30 s | 1 min | 25 s | 30 s

Account payable
Creating a Purchase order with 5 lines | 2 min | 3 min | 2 min | 1 min
Confirmation of a Purchase order with 5 lines | 15 s | 40 s | 10 s | < 5 s (automatically generated based on Purchase order)
Creating a Product receipt note with 5 lines | 30 s | 1 min | 20 s | < 5 s (automatically generated based on sales order)
Registering a Vendor invoice with 5 lines and with additional cost | 5 min | 10 min | 4 min | 2 min

General Ledger
Creating automatic cost allocations based on predefined rules | - (none) | - (none) | - (none) | < 5 s (automatic process)
Generating a Trial balance based on ledger account for a period of 1 year | 10 s | 1 min | 5 s | < 5 s
Generating a P&L (profit and loss) report for a period of 1 year | 1 day | 3 days | < 1 day | 1 min
Generating a depreciation register for 1 month and 100 assets | 1 min | 2 min | 40 s | 30 s
Creating a Fixed asset | 1 min | 2 min | 40 s | 40 s

Bank management
Creating a payment journal for 5 client invoices | 2 min | 5 min | 1 min | < 10 s (automatic process based on invoice due day)
Creating a payment journal for 5 vendor invoices | 2 min | 5 min | 1 min | < 10 s (automatic process based on invoice due day)
Generating a Cash Flow report for a period of 1 week | 2 min | 5 min | 1 min | 50 s

Inventory management
Creating a Counting register for one warehouse and 10 items | 1 min | 2 min | 40 s | 30 s
Creating an Inventory transfer register for 5 items | 3 min | 5 min | 2 min | 1 min
Generating an On hand report for 1 warehouse and 100 items | < 10 s | < 10 s | < 10 s | < 10 s

CRM (Customer relationship management)
Creating a Lead | 1 min | 2 min | 40 s | 10 s
Creating a quotation with 5 lines | 2 min | 3 min | 2 min | 1 min
Confirming a quotation with 5 lines | 15 s | 40 s | 10 s | < 5 s (automatically generated based on quotation)
Converting a Lead to a real customer | 1 min | 2 min | 40 s | < 5 s (automatically generated based on lead)

Marketing
Generating a marketing activity list with 100 customers | - (none) | - (none) | - (none) | < 5 s (automatic process)

HR (Human resources)
Creating an employee | 2 min | 5 min | 1 min | 40 s
Creating a payroll journal for 10 employees for 1 month | 2 min | 3 min | 1 min | 30 s
Generating a payment statement report for 1 month and 10 people | 1 min | 2 min | 30 s | < 10 s
The results provided give clear and specific average benchmark indicators for the speed of the major business operations executed in ERPs. The framework is open to adjustments with the advancement of technology and with operational improvements, automation, or optimization.
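To illustrate how the benchmark in Table 1 could be applied in practice, the hypothetical Python sketch below encodes a small subset of the KPIs (average values taken from the table, converted to seconds) and flags the operations in which a measured ERP falls behind the reported average. The function and variable names are illustrative only and are not part of the proposed framework.

```python
# Hypothetical usage sketch: compare measured ERP operation times (in seconds)
# against a small subset of the benchmark averages reported in Table 1.
BENCHMARK_AVG_S = {
    "Creating a Sales order with 5 lines": 120,        # 2 min
    "Confirmation of a Sales order with 5 lines": 15,  # 15 s
    "Creating a Purchase order with 5 lines": 120,     # 2 min
    "Generating a Cash Flow report for 1 week": 120,   # 2 min
    "Creating an employee": 120,                       # 2 min
}

def slower_than_benchmark(measured_s):
    """Return the operations whose measured time exceeds the benchmark average."""
    return [op for op, avg in BENCHMARK_AVG_S.items()
            if op in measured_s and measured_s[op] > avg]

if __name__ == "__main__":
    measured = {
        "Creating a Sales order with 5 lines": 95,
        "Confirmation of a Sales order with 5 lines": 42,
        "Creating an employee": 180,
    }
    for op in slower_than_benchmark(measured):
        print(f"{op}: slower than the benchmark average")
```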
5 Conclusion
The paper provides a measurement framework tool for testing the business speed of ERP systems. To test the method, empirical research was undertaken. The paper could serve as a benchmark or a starting point for measuring ERP business speed or for optimizing this speed. From a scientific perspective, the paper provides a definition and an explanation of the term ERP business speed, which is introduced here for the first time.
The results might be of interest to scholars from both the management and computer science areas, but mostly they could be used by businesses.
Acknowledgement. Supported by the UNWE under the project No NID NI-14/2018.
References
1. Hong, K.K., Kim, Y.G.: The critical success factors for ERP implementation: an organizational fit perspective. Inf. Manage. 40, 25–40 (2003)
2. Moon, Y.B.: Enterprise resource planning (ERP): a review of the literature. Int. J. Manage.
Enterp. Dev. 4(3), 235–264 (2007)
3. Staehr, L.: Understanding the role of managerial agency in achieving business benefits from
ERP systems. Inf. Syst. J. 20(3), 213–238 (2010)
4. Bahssas, D., AlBar, A., Hoque, R.: Enterprise resource planning (ERP) systems: design,
trends and deployment. Int. Technol. Manage. Rev. 5(2), 72–81 (2015)
5. Guimaraes, T., et al.: Assessing the impact of ERP on end-user jobs. Int. J. Acad. Bus. World
9(1), 11–21 (2015)
6. Jayawickrama, U., Liu, S., Smith, M.H.: Knowledge prioritisation for ERP implementation
success: perspectives of clients and implementation partners in UK industries. Ind. Manage.
Data Syst. 117(7), 1521–1546 (2017). https://doi.org/10.1108/IMDS-09-2016-0390
7. Shatat, A.: Critical success factors in enterprise resource planning (ERP) system imple-
mentation: an exploratory study in Oman. Electron. J. Inf. Syst. Eval. 18(1), 36–45
(2015)
8. Huang, Z., Palvia, P.: ERP implementation issues in advanced and developing countries. Bus.
Process Manage. J. 7(3), 276–284 (2001)
9. Sharif, A.M., Irani, Z., Love, P.E.: Integrating ERP using EAI: a model for post hoc evaluation.
Eur. J. Inf. Syst. 14(2), 162–174 (2005)
10. Lee, N.C.A., Chang, J.: Adapting ERP systems in the post-implementation stage: dynamic
IT capabilities for ERP. Pac. Asia J. Assoc. Inf. Syst. 12(1), Article 2 (2020). https://doi.org/
10.17705/1pais.12102
11. Davenport, T.H.: The future of enterprise system-enabled organizations. Inf. Syst. Front. 2(2),
163–180 (2000)
12. Chen, C.S., Liangb, W.Y., Hsu, H.: A cloud computing platform for ERP applications. Appl.
Soft Comput. 27, 127–136 (2015)
13. Momoh, A., Roy, R., Shehab, E.: Challenges in enterprise resource planning implementation:
state-of-the-art. Bus. Process Manag. J. 16, 537–565 (2010)
14. Mahindroo, A., Singh, H., Samalia, H.V.: Factors impacting intention to use of ERP systems
in Indian context: an empirical analysis. In: 3rd World Conference on Information Technology
(WCIT-2012), vol. 03, pp. 1626–1635 (2013)
15. Gunasekaran, A.: Modelling and Analysis of Enterprise Information Systems, p. 179. IGI
Publishing, New York (2007)
16. Lutfi, A.: Investigating the moderating role of environmental uncertainty between institutional
pressures and ERP adoption in Jordanian SMEs. J. Open Innov. Technol. Mark. Complex.
6(3), 91 (2020). https://doi.org/10.3390/joitmc6030091
17. Kirmizi, M., Kocaoglu, B.: The key for success in enterprise information systems projects:
development of a novel ERP readiness assessment method and a case study. Enterp. Inf. Syst.
14(1), 1–37 (2020). https://doi.org/10.1080/17517575.2019.1686656
18. Karimi, J., Somers, T., Bhattacherjee, A.: The role of information systems resources in ERP
capability building and business process outcomes. J. Manage. Inf. Syst. 24(2), 221–260
(2007)
19. Moe C.E., Fosser E., Leister O.H., Newman M.: How can organizations achieve competitive
advantages using ERP systems through managerial processes? In: Magya, G., Knapp, G.,
Wojtkowski, W., Wojtkowski, W.G., Zupančič, J. (eds.) Advances in Information Systems
Development. Springer, Boston (2007)
20. Al-hadi, M.A., Al-Shaibany, N.A.: An extended ERP model for Yemeni universities using
TAM model. Int. J. Eng. Comput. Sci. 6(7), 22084–22096 (2017)
21. Poston, R., Grabski, S.: The impact of enterprise resource planning systems on firm per-
formance. In: International Conference on Information Systems, pp. 479−493. Brisbane
(2000)
22. Hunton, J., McEwen, A., Wier, B.: The reaction of financial analysts to enterprise resource
planning implementation plans. J. Inf. Syst. (Spring) 3140 (2002)
23. Hedman, J., Kalling, T.: The business model concept: theoretical underpinnings and empirical
illustrations. Eur. J. Inf. Syst. 12(1), 49–59 (2003)
24. Ali, O., et al.: Anticipated benefits of cloud computing adoption in Australian regional
municipal governments: an exploratory study. In: Conference or Workshop Item 209, (2015)
25. Esteves, J., Pastor, J.: Enterprise resource planning systems research: an annotated bibliogra-
phy. Commun. Assoc. Inf. Syst. 7, Article 8 (2001)
26. Demi, S., Haddara, M.: Do cloud ERP systems retire? an ERP lifecycle perspective. Procedia
Comput. Sci. 138(2018), 587–594 (2018)
27. Muhamet, G.: A maturity model for implementation and application of Enterprise Resource
Planning systems and ERP utilization to Industry 4.0. PhD thesis, Budapesti Corvinus
Egyetem, Közgazdasági és Gazdaságinformatikai Doktori Iskola (2021)
28. Raman, K.A.: Speed Matters: Why Best in Class Business Leaders Prioritize Workforce Time
to Proficiency Metrics, Speed To Proficiency Research: S2Pro©, 2021 (2021)
29. Gates, B.: The Road Ahead, 1st edn. Viking Press, ISBN-10: 0670859133 (1999)
30. Pulendran, S., Speed, R., Widing, R., II.: Marketing planning, market orientation and business
performance. Eur. J. Mark. 37(3/4), 476–497 (2003). https://doi.org/10.1108/030905603104
59050
31. Meyer, C., Davis, S.: Blur: the speed of change in the connected economy. Work Study 49(4)
(2000). https://doi.org/10.1108/ws.2000.07949dae.003
32. Cosenz, F., Bivona, E.: Fostering growth patterns of SMEs through business model innovation.
A tailored dynamic business modelling approach. J. Bus. Res. (in press, 2021). https://doi.
org/10.1016/j.jbusres.2020.03.003
33. Gregory, C., Rawling, S.: Profit from time: speed up business improvement by implementing
time compression, Springer, ISBN 1349145912 (2016)
34. Miller, S.: Accelerated Sap (ASAP): Implementation at the Speed of Business. McGraw-Hill,
Inc., New York (1998)
35. Kaufmann, D., Wei, S.: Does 'Grease Money' Speed Up the Wheels of Commerce?
International monetary fund working paper (2000)
36. Yordanova, Z.: Knowledge transfer from lean startup method to project management for
boosting innovation projects performance. Int. J. Technol. Learn. Innovation Dev. 9(4) (2017)
BELBIC Based Step-Down Controller
Design Using PSO
João Paulo Coelho1,2(B), Manuel Braz-César1,3, and José Gonçalves1,2
1 Instituto Politécnico de Bragança, Escola Superior de Tecnologia e Gestão, Campus de Sta. Apolónia, 5300-253 Bragança, Portugal
{jpcoelho,brazcesar,goncalves}@ipb.pt
2 CeDRI - Centro de Investigação em Digitalização e Robótica Inteligente, Campus de Sta. Apolónia, 5300-253 Bragança, Portugal
3 CONSTRUCT Institute of R&D in Structures and Construction, Campus da FEUP, 4200-465 Porto, Portugal
Abstract. This article presents a comparison between a common type III controller and one based on a brain emotional learning paradigm (BELBIC) parameterized using a particle swarm optimization (PSO) algorithm. Both strategies were evaluated regarding the set-point accuracy, the disturbance rejection ability, and the control effort of a DC-DC buck converter. The simulation results suggest that, when compared to the common controller, the BELBIC leads to an increase in both set-point tracking and disturbance rejection ability while reducing the dynamics of the control signal.
Keywords: Optimisation · BELBIC · Buck converter · PSO
1 Introduction
Conversion between different voltage values is amongst the most common operations found in electronics. For example, many battery-operated devices, such as laptops and mobile devices, are capable of switching between different voltage values in order to optimize the use of the battery. The constant 5 V core voltage found in 1970s microprocessors has evolved, for today's processors, into a scalable core supply voltage that can reach values lower than one volt. This voltage scaling task can be performed dynamically at the software or firmware level by either the operating system or the BIOS. Moreover, the point-of-load approach used in the motherboards of modern microprocessor devices has led to the inclusion of a large number of small power supplies scattered along the main board. Reducing the power dissipated in the form of heat is an important goal, which leads to an increase in efficiency, smaller form factors by discarding the use of large heat sinks, and an extension of battery life, which is a key factor for all mobile devices. This can be attained by resorting to a class of circuits known as switch-mode power supplies, where the voltage conversion takes place by periodically switching transistors, embedded in RLC networks, between their on and off states. The
input-to-output voltage ratio depends on the duty cycle imposed on those switching devices by a controller. This controller operates in closed loop by sampling the output voltage and comparing it with the desired output voltage value; the difference between those two values is used to establish the switching duty cycle. The block diagram presented in Fig. 1 illustrates this methodology.
Fig. 1. Typical feedback control architecture used in DC/DC converters.
Often, in practice, a current loop is also added in order to enable current-mode control. This additional control layer enables overcurrent protection and reduces the sensitivity of the voltage controller to the capacitor's ESR. However, in this paper, only voltage-mode control is taken into consideration. Voltage-mode control resorts to feedback to keep the output voltage constant despite unwanted disturbances. The loop compensation network associated with the error amplifier can be of type I, II, or III. Type I is a simple pole at the origin, and type II expands it by including a zero and a high-frequency pole, which can lead to a phase increase of 90°. Finally, type III adds two poles and two zeros to the pole at the origin, which promotes an increase in phase margin.
Those loop compensation circuits are tuned to perform well at a given nominal system operating point. However, if the system deviates from this point, the controller performance can become very poor, for example, when the power supply shifts between continuous and discontinuous conduction mode. Hence, adaptiveness and learning ability must be included in the controller in order for it to perform well over a large dynamic range and in the presence of system changes.
This work proposes an alternative controller structure applied to regulate the operation of a DC-DC buck converter. In particular, it relies on the use of a control paradigm inspired by the brain emotional learning ability (BELBIC) to promote adaptation to operating point changes. Conceptually, this controller is inspired by the brain's limbic system and, when compared to the typical buck converter controller, its most notable property is the ability to keep learning while in operation. BELBIC has already been applied within the power electronics context. In [1], a brain emotional learning approach was employed in the context of a maximum power-point tracking algorithm applied to solar energy conversion. Additionally, in [2], a BELBIC controller was applied to control a buck DC-DC converter. However, the controller parameters were obtained empirically and no comparison with other techniques was carried out. In this paper, the buck converter is also addressed, but the BELBIC parameters were computed using the particle swarm optimization (PSO) algorithm. Moreover, a comparison of the closed-loop response with typical loop compensators was performed.
This document is divided into four sections. After this first introductory
section, the mathematical formulation of a step down converter is described
in Sect. 2. Then, the generic BELBIC structure is presented in Sect. 3 and a
general overview of the PSO algorithm is presented in Sect. 4. Details and results
regarding the controller implementation is the aim of Sect. 5 where a performance
comparison was carried out having a Type II controller as benchmark. Finally,
Sect. 6 presents both the conclusions and final remarks.
2 The Step-Down (Buck) Converter
The DC-DC buck converter used in this work assumes the asynchronous archi-
tecture depicted in Fig. 2. The two switching elements are a MOSFET and a
diode. The MOSFET gate will be driven by a pulse-width modulation (PWM)
circuit that, for simplicity, is not shown.
Fig. 2. General schematic of a DC-DC step-down converter built around a MOSFET
and a diode as switching elements.
In this figure, H(s) denotes the voltage sensor transfer function and GC(s)
the controller transfer function. The MOSFET gate is driven by a pulse width
modulation circuit that generates a square wave whose duty-cycle is proportional
to a voltage signal applied to its input.
Considering both steady state and converter’s continuous conduction oper-
ating mode, the average voltage across the inductor assumes the value zero. At
the same time, the average current value across the capacitor, over one switching
period, is also zero.
Assuming small magnitudes of the disturbances, compared to the DC quies-
cent values, it is possible to obtain the following differential equations [3]:
L \frac{d}{dt}\hat{i}_L(t) = D\,\hat{v}_g(t) + \hat{d}(t)\,V_g - \hat{v}_o(t)    (1a)
C \frac{d}{dt}\hat{v}_o(t) = \hat{i}_L(t) - \frac{\hat{v}_o(t)}{R_o}    (1b)
\hat{i}_g(t) = D\,\hat{i}_L(t) + i_L(t)\,\hat{d}(t)    (1c)
where the hat symbol over the variables denotes a small disturbance around the
variable’s operating point and D is the PWM duty cycle whose value is within
the interval [0, 1].
After applying the Laplace transform to the previous set of differential equa-
tions, the transfer function between the output and the command signal, denoted
by G_{vod}(s), is:
G_{vod}(s) = \frac{\hat{v}_o(s)}{\hat{d}(s)} = \frac{R_o V_g}{L C R_o s^2 + L s + R_o}    (2)
and from the input voltage to the output voltage, G_{vov_g}(s), defined as:
G_{vov_g}(s) = \frac{\hat{v}_o(s)}{\hat{v}_g(s)} = \frac{R_o D}{L C R_o s^2 + L s + R_o}    (3)
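As a side illustration, the small-signal transfer functions (2) and (3) can be evaluated numerically. The sketch below (Python/SciPy) uses the nominal values given later in Sect. 5 for V_g, L and C; the load resistance R_o is an assumed, illustrative value, since it is not stated explicitly in the paper.

# Minimal sketch of the small-signal transfer functions (2) and (3).
import numpy as np
from scipy import signal

Vg, D = 12.0, 0.42            # input voltage and nominal duty cycle (Sect. 5)
L, C, Ro = 20e-3, 50e-6, 6.0  # inductance, capacitance and an assumed load resistance

# Gvod(s) = Ro*Vg / (L*C*Ro*s^2 + L*s + Ro)   -- control-to-output, eq. (2)
Gvod = signal.TransferFunction([Ro * Vg], [L * C * Ro, L, Ro])
# Gvovg(s) = Ro*D / (L*C*Ro*s^2 + L*s + Ro)   -- line-to-output, eq. (3)
Gvovg = signal.TransferFunction([Ro * D], [L * C * Ro, L, Ro])

t, y = signal.step(Gvod)      # open-loop response of v_o to a unit duty-cycle step
print(f"DC gain of Gvod: {Vg:.1f} V/unit duty, step-response peak: {max(y):.2f}")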
Without input voltage disturbances or load changes, the converter is able to operate in open loop. However, the output value can fluctuate in the presence of load changes or other disturbances, such as input voltage drops/rises or shifts in the components' nominal values due to several factors such as aging. Thus, a closed-loop controller must be added in order to regulate the switched converter voltage output.
Typically, type I, II or III loop compensator structures are chosen to carry out this task and can be designed using the previously defined transfer functions. However, it is worth noticing that those transfer functions are only approximations and assume small disturbances around a given operating point. For this reason, the behaviour of the switching converter can be very different outside the defined zone, especially if, due to small loads, the converter settles into discontinuous conduction mode. Fixed pole-zero controllers are unable to achieve good performance in the presence of severe changes in the system's dynamic behaviour. Another approach is to enable the controller to learn and to use this knowledge to self-adjust its behaviour in order to increase the overall performance within a broader range of operating points. In this work, this feature is attained by resorting to a soft-computing paradigm known as BELBIC, briefly described in the following section.
3 The BELBIC Controller
From an engineering point of view, the analysis of the solutions produced by nature to overcome the adaptation problems of animal species has led to an increasing tendency to introduce bio-morphism and bio-mimicry into many different computational tools [4]. For example, biological inspiration can be traced in applications such as machine cooperation, speech recognition, text recognition and self-assembly nanotechnology, just to name a few. All those examples have in common the fact that they rely on algorithms with the capability of adaptation and learning. Indeed, in the biological realm, learning is one of the most important factors for the endurance of all species. Learning allows organisms to adjust themselves to cope with changes in environmental and operating conditions. This robustness is a desired characteristic in engineering applications since the operating conditions of a product are never static. For this reason, simplified approaches to some natural learning processes have been adapted to serve in many engineering problems.
In the particular case of humans, learning takes place within generations and
across generations at many different levels. We are not talking only about intel-
lectual learning but also about the learning that is passed through our genome.
All this information shapes the actions of an individual when subject to a set
of environmental stimuli. At the mental level, reasoning is not the only driving
action when decision-making is concerned. Human reactions strongly depend
also on emotions and they play an important role in our everyday life and have
been a valuable asset in our survival and adaptation.
It is generally considered that emotions were incorporated during the evolutionary process as a way to reduce the human reaction time. That is, rather than using the intellect to process information and generate actions, which would take time, reacting by emotion would be much faster. Emotions can then be
viewed as an automatic behaviour that seeks to improve survival by increasing
the ability to react fast in the presence of threats. It seems that the overall set
of possible emotions are predefined in our genome. However, they can then be
utterly modified based on the person’s individual experience.
Psychology and neural sciences circumscribe emotional activity to a set of
distinct brain regions gathered in what is known as the limbic region. Besides
emotions, the limbic system manages a number of other functions, such as behaviour and motivation, and has an important role in memory formation tasks. At present, there is still no consensus in the scientific community about which brain areas should be included in the limbic system. However, it is commonly
accepted that the thalamus, hypothalamus, hippocampus and amygdala are the
main brain structures in the limbic system. Details regarding the role played by
each one of those cortical areas are outside the scope of this work. Instead, the
objective is to convert the limbic system behaviour, from a high-level abstraction
angle, and frame it in the context of computational intelligence. The first step
toward this approach was carried out by [5] by presenting a first mathematical
approach to describe the behaviour of the brain’s emotional learning (BEL) pro-
cess. The idea of applying this learning algorithm in the automatic control area
was provided, a couple of years later, by [6]. The junction of BEL with control
systems design has led to the concept known as “brain emotional learning-based
intelligent control” (BELBIC). Further details regarding the operation of this control method can be found in [7–10].
The major pitfall of BELBIC regards the choice of both the emotional and sensory signals in order to maximize the control system performance. Besides that, several tuning parameters, such as the learning rates of the amygdala and orbitofrontal processing units, must be found to achieve an acceptable controller behaviour. The values of such parameters are commonly found by trial and error, which can be cumbersome and lead to suboptimal solutions. For those reasons, other tuning methods have been presented [11–13], among them the use of evolutionary-based algorithms [14–18].
Due to their ability to provide good results for non-convex problems, evolutionary algorithms have been employed in a myriad of different applications. For this reason, in this work the BELBIC controller is tuned by resorting to the particle swarm optimization algorithm. A short overview of this method is provided in the following section and further details on the methodology are described in Sect. 5.
4 The PSO Algorithm
The particle swarm optimisation (PSO) algorithm is fundamentally based on the social behaviour of animals that move in herds or flocks [19]. In this algorithm, a set of particles, representing potential problem solutions, moves through the search space according to a given position vector x_i(t) = \{x_{i1}(t), x_{i2}(t), \dots, x_{in}(t)\} and velocity v_i(t) = \{v_{i1}(t), v_{i2}(t), \dots, v_{in}(t)\}, where t denotes the current evolutionary iteration and n the dimension of the search space.
The PSO dynamics is governed by individual and social knowledge. That is, a given particle's movement results from its own experience and from social information sharing. In [19], this concept was mathematically expressed by the following set of equations:
co_{id}(t) = p_{id}(t) - x_{id}(t)    (4a)
so_{id}(t) = p_{gd}(t) - x_{id}(t)    (4b)
where co_{id}(t) is the cognition-only value associated with the d-th dimension of the i-th particle and so_{id}(t) is the social-only component of the same individual.
The momentum and position of a given particle are computed by:
v_{id}(t+1) = v_{id}(t) + \varphi_1\,co_{id}(t) + \varphi_2\,so_{id}(t)    (5a)
x_{id}(t+1) = x_{id}(t) + v_{id}(t+1)    (5b)
where pid(t) concerns the best previous position of particle i in the current itera-
tion t and pgd(t) denotes the global best particle within a given neighbourhood.
The coefficient ϕ1 is the cognitive constant and ϕ2 is the social coefficient. Gen-
erally they are assumed to be uniformly distributed random numbers between
zero and two.
To guarantee admissibility and stability, both the particle position and veloc-
ity are bounded to maximum values. For this reason, the search space is always
circumscribed and the maximum step that each particle can undergo during one
iteration is constrained. The values for the maximum or minimum particle posi-
tion are problem dependent. Moreover, the maximum velocity should not be too
high or too low in order to avoid oscillations and local minima [20].
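A minimal sketch of the update rules (4)-(5), including the velocity and position clamping discussed above, is given below in Python/NumPy. The objective function, bounds and swarm size are illustrative assumptions only.

# Minimal PSO sketch implementing (4a)-(4b) and (5a)-(5b) with clamping.
import numpy as np

def pso(objective, dim, n_particles=30, iters=100, lo=-1.0, hi=1.0, v_max=0.2):
    rng = np.random.default_rng(0)
    x = rng.uniform(lo, hi, (n_particles, dim))          # particle positions
    v = np.zeros((n_particles, dim))                     # particle velocities
    p_best = x.copy()                                    # personal best positions
    p_val = np.array([objective(xi) for xi in x])
    g_best = p_best[np.argmin(p_val)].copy()             # global best position
    for _ in range(iters):
        phi1 = rng.uniform(0, 2, (n_particles, dim))     # cognitive coefficient
        phi2 = rng.uniform(0, 2, (n_particles, dim))     # social coefficient
        co = p_best - x                                  # cognition-only term, (4a)
        so = g_best - x                                  # social-only term, (4b)
        v = np.clip(v + phi1 * co + phi2 * so, -v_max, v_max)   # (5a) + velocity clamping
        x = np.clip(x + v, lo, hi)                       # (5b) + bounded search space
        val = np.array([objective(xi) for xi in x])
        improved = val < p_val
        p_best[improved], p_val[improved] = x[improved], val[improved]
        g_best = p_best[np.argmin(p_val)].copy()
    return g_best, p_val.min()

# Placeholder objective (sphere function) with a 6-dimensional search space.
best, best_val = pso(lambda z: float(np.sum(z**2)), dim=6)
print(best_val)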
5 Step-Down Control with BELBIC
This section presents the procedure behind the design of a BELBIC controller for a DC-DC buck converter capable of generating a 5 V regulated output voltage from a 12 V unregulated input voltage. The electrical components' nominal values are L = 20 mH and C = 50 µF, and a 50 kHz switching frequency was considered. To reach the above referred nominal output voltage, the duty cycle D must be roughly equal to 42%. For this buck converter, a type III controller was designed in order to have zero steady-state error, around 60° of phase margin, 10 dB of gain margin, and an open-loop frequency response with a slope of −20 dB/decade at the crossover frequency. Also, an overshoot lower than 0.5 V and a rise time smaller than or equal to 2 ms were required. Using the Bode plot reshaping technique, all those figures of merit could be attained by means of a regulator with the
following transfer function:
G_c(s) = \frac{0.0317\,s^2 + 63.4\,s + 5\times 10^4}{s\,(s^2 + 20\,s + 100)}    (6)
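For illustration, the loop gain formed by (2) and (6) can be checked numerically. The sketch below (Python/SciPy) estimates the crossover frequency and phase margin; the load resistance and a unit sensor gain H(s) = 1 are assumptions of this sketch, so the printed figures are only indicative of the method, not of the exact design above.

# Minimal sketch: estimate crossover and phase margin of Gc(s)*Gvod(s).
import numpy as np
from scipy import signal

Vg, L, C, Ro = 12.0, 20e-3, 50e-6, 6.0                  # Ro is an assumed load value
Gvod = ([Ro * Vg], [L * C * Ro, L, Ro])                 # plant, eq. (2)
Gc = ([0.0317, 63.4, 5e4], [1.0, 20.0, 100.0, 0.0])     # compensator, eq. (6)

# Loop gain T(s) = Gc(s) * Gvod(s): multiply numerator and denominator polynomials.
num = np.polymul(Gvod[0], Gc[0])
den = np.polymul(Gvod[1], Gc[1])

w = np.logspace(0, 6, 20000)
w, H = signal.freqresp((num, den), w=w)
mag_db = 20 * np.log10(np.abs(H))
phase = np.unwrap(np.angle(H)) * 180 / np.pi

ic = np.argmin(np.abs(mag_db))                          # index closest to the 0 dB crossover
print(f"crossover ~ {w[ic]:.0f} rad/s, phase margin ~ {180 + phase[ic]:.1f} deg")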
Figure 3 presents the circuit implemented using Simulink®'s Simscape® toolbox. Using this framework, the buck converter's non-linear nature is fully represented, since each electronic and electrical component is accurately modelled taking into consideration its non-ideal characteristics.
Fig. 3. Closed-loop implementation of the buck converter using Simscape® and a type III compensator.
The simulation was carried out considering a 5 V sinusoidal disturbance, with a 100 Hz frequency, superimposed on the 12 V supply voltage. Moreover, a 20% step load disturbance was applied at time instant 0.005 s. The simulation covered a time frame of 15 ms and the observed results are shown in Fig. 4.
Fig. 4. Buck converter performance using a PID type controller: top - input voltage,
middle - output voltage, bottom - control signal.
From the simulation results, it is possible to observe a performance degradation in both overshoot and bandwidth. Moreover, the set-point accuracy was compromised, as can be seen by the low-frequency signal superimposed on the output voltage. This closed-loop mismatch is due to several reasons: component losses, non-linearities, and model mismatches. For this reason, it is possible to conclude that this controller is unable to perform well over a broad range of changes in the system dynamics. Adaptation is required which, in this work, is attained by means of the BELBIC control strategy.
One of the major handicaps when dealing with a BELBIC controller concerns the appropriate definition of both the emotional and sensory signals. In this work, the BELBIC Simulink® toolbox [10] was utilized with the structure depicted in Fig. 5, where the stimulus signal was defined as:
s(t) = w_1 \cdot e(t) + w_2 \cdot \int e(t)\,dt    (7)
where e(t) denotes the voltage tracking error.
The reward signal is defined by:
r(t) = w3 · e(t) + w4 · u(t) (8)
where u(t) denotes the control signal and w_i, for i = 1, \dots, 4, are weight factors that can be used to define the relative importance of each component.
Fig. 5. Closed-loop implementation of the buck converter using Simscape® and a BELBIC controller.
Besides the weights w_i, for i = 1, \dots, 4, presented in (7) and (8), the BELBIC design process also requires the definition of a set of parameters in order to operate adequately, in particular the amygdala and orbitofrontal learning rates α and β, respectively. Managing such a number of parameters using a trial-
and-error approach will be cumbersome at best. For this reason, in this work
a PSO algorithm will be in charge of deriving the best controller parameters
according to a given performance index.
In the current context, the performance is calculated by:
f(\theta) = \phi(\theta) \cdot \int_0^{t_S} e^2(\theta, \tau)\,d\tau    (9)
where \theta = [\alpha, \beta, w_1, w_2, w_3, w_4] denotes the controller parameters and t_S the simulation time. The function \phi(\theta) is used to penalize solutions that result in control signals with amplitudes outside the actuator range. In the present case, \phi(\theta) is defined as:
\phi(\theta) = \begin{cases} 1, & \min(u(t)) \ge 0 \wedge \max(u(t)) \le 1 \\ e^{\min(u(t))^2 + \max(u(t))^2} + 1, & \text{otherwise} \end{cases}    (10)
for t ∈ [0, tS].
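A minimal sketch of the performance index (9) with the penalty (10) is shown below in Python/NumPy. The simulate() function is a hypothetical stand-in for the Simscape closed-loop simulation and is assumed to return the sampled error e(t) and control signal u(t).

# Minimal sketch of the PSO fitness function defined by (9) and (10).
import numpy as np

def fitness(theta, simulate, dt):
    # theta = [alpha, beta, w1, w2, w3, w4]
    e, u = simulate(theta)                            # hypothetical simulation call
    ise = np.sum(e**2) * dt                           # integral of squared error, eq. (9)
    if u.min() >= 0.0 and u.max() <= 1.0:             # duty cycle within actuator range
        phi = 1.0
    else:
        phi = np.exp(u.min()**2 + u.max()**2) + 1.0   # penalty term, eq. (10)
    return phi * ise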
The PSO algorithm was run several times to search for a suitable solution θ that minimizes the objective function f(θ). During the simulation, the system was excited using a random input voltage signal with values between 4 V and 15 V, changing with a periodicity of 5 ms. A swarm size of 30 particles was used and the simulation time was set to 100 ms.
The best solution found, in this case α = 0.0241, β = 0.00985, w1 = 6.54,
w2 = 21.3, w3 = 0.0016 and w4 = 0.21, was used to parameterize the BEL
controller which was then subjected to the same simulation conditions as the
type III controller. The results can be seen in Fig. 6.
As can be seen from the plots in Fig. 6, the BELBIC controller was able to achieve a smaller settling time and a lower overshoot. Moreover, the steady-state error and the disturbance rejection ability were also enhanced when compared to the PID controller. However, this improved response comes at the expense of
a more complex and demanding tuning process.
Fig. 6. Buck converter performance using a BELBIC type controller: top - input volt-
age, middle - output voltage, bottom - control signal.
A quantitative comparison between the PID and the BELBIC controllers, in
the context of the addressed problem, can be made after computing the following
two figures-of-merit:
RMS = \sqrt{\frac{1}{T}\int_0^T \varepsilon(t)^2\,dt}    (11)
\mu_{RMS} = \sqrt{\frac{1}{T}\int_0^T \left(\frac{du(t)}{dt}\right)^2 dt}    (12)
The first is the root-mean-square value of the error signal ε(t) taken along
the simulation interval [0, T] and the second, also the root-mean-square, but now
of the control effort.
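For clarity, the two figures of merit can be computed from uniformly sampled signals as in the following Python/NumPy sketch (the sampling period dt is an input assumption).

# Minimal sketch of the figures of merit (11) and (12) for sampled signals.
import numpy as np

def rms_error(eps, dt):
    T = len(eps) * dt
    return np.sqrt(np.sum(eps**2) * dt / T)           # eq. (11)

def rms_control_effort(u, dt):
    du = np.diff(u) / dt                              # finite-difference approximation of du/dt
    T = len(u) * dt
    return np.sqrt(np.sum(du**2) * dt / T)            # eq. (12)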
Regarding the PID controller, RMS = 1.35 and \mu_{RMS} = 8.19 \times 10^{-3}. For the BELBIC those values come down to RMS = 1.131 and \mu_{RMS} = 7.12 \times 10^{-3}. Those values reflect a 16% decrease in RMS and an improvement of 13% in the control signal variability.
6 Conclusion
This paper compared the performance of an ordinary type III controller and a BELBIC controller applied to a DC-DC buck converter. Both strategies were evaluated regarding their ability to maintain a stable output voltage in the presence of both input and load disturbances.
The obtained results suggest that, when compared to the classical controller, the BELBIC controller is superior when considering both set-point tracking accuracy and disturbance rejection ability. Furthermore, these results are attained by means of a lower control effort.
Future work will consider the controller behaviour if the buck converter enters
discontinuous conduction mode. A physical implementation of this solution is
also an ongoing project and the controller performance will be compared with
the one achieved by commercial devices such as the UC3845A chip.
References
1. Sankarganesh, R., Thangavel, S.: Performance analysis of various DC-DC convert-
ers with optimum controllers for PV applications. Res. J. Appl. Sci. Eng. Technol.
8, 929–941 (2014)
2. Khorashadizadeh, S., Mahdian, M.: Voltage tracking control of DC-DC boost con-
verter using brain emotional learning. In: 4th International Conference on Control,
Instrumentation, and Automation (ICCIA), pp. 268–272 (2016)
3. Erickson, R.W., Maksimovic, D.: Fundamentals of Power Electronics, 2nd edn.
Springer, Boston (2001). https://doi.org/10.1007/b100747
4. Sarpeshkar, R.: Neuromorphic and Biomorphic Engineering Systems. McGraw-Hill
Yearbook of Science and Technology. McGraw-Hill, New York (2009)
5. Balkenius, C., Morén, J.: A computational model of emotional learning in the
amygdala. Cybern. Syst. 32(6), 611–636 (2001)
6. Lucas, C., Shahmirzadi, D., Sheikholeslami, N.: Introducing BELBIC: brain emo-
tional learning based intelligent controller. Intell. Autom. Soft Comput. 10, 11–22
(2004)
7. Rouhani, H., Jalili, M., Araabi, B.N., Eppler, W., Lucas, C.: Brain emotional
learning based intelligent controller applied to neurofuzzy model of micro-heat
exchanger. Expert Syst. Appl. 32(3), 911–918 (2007)
8. Rahman, M.A., Milasi, R.M., Lucas, C., Araabi, B.N., Radwan, T.S.: Implemen-
tation of emotional controller for interior permanent-magnet synchronous motor
drive. IEEE Trans. Ind. Appl. 44(5), 1466–1476 (2008)
9. Nahian, S.A., Truong, D.Q., Ahn, K.K.: A self-tuning brain emotional learning
based intelligent controller for trajectory tracking of electrohydraulic actuator. J.
Syst. Control Eng. 228, 461–475 (2014)
10. Coelho, J.P., Pinho, T.M., Boaventura-Cunha, J., de Oliveira, J.B.: A new brain emotional learning Simulink® toolbox for control systems design. IFAC-PapersOnLine 50, 16009–16014 (2017)
11. Jafarzadeh, S., Jahed Motlagh, M.R., Barkhordari, M., Mirheidari, R.: A new
Lyapunov based algorithm for tuning BELBIC controllers for a group of linear
systems. In: 2008 16th Mediterranean Conference on Control and Automation.
IEEE, June 2008
12. Garmsiri, N., Najafi, F.: Fuzzy tuning of brain emotional learning based intelligent
controllers. In: 2010 8th World Congress on Intelligent Control and Automation.
IEEE, July 2010
13. Jafari, M., Mohammad Shahri, A., Hamid Elyas, S.: Optimal tuning of brain emo-
tional learning based intelligent controller using clonal selection algorithm. In:
ICCKE 2013. IEEE, October 2013
14. Valizadeh, S., Jamali, M.-R., Lucas, C.: A particle-swarm-based approach for opti-
mum design of BELBIC controller in AVR system. In: International Conference on
Control, Automation and Systems, COEX, Seoul, Korea, pp. 2679–2684, October
2008
15. Valipour, M.H., Maleki, K.N., Ghidary, S.S.: Optimization of emotional learning
approach to control systems with unstable equilibrium. In: Lee, R. (ed.) Software
Engineering, Artificial Intelligence, Networking and Parallel/Distributed Comput-
ing. SCI, vol. 569, pp. 45–56. Springer, Cham (2015). https://doi.org/10.1007/978-
3-319-10389-1 4
16. El-Saify, M.H., El-Garhy, A.M., El-Sheikh, G.A.: Brain emotional learning based
intelligent decoupler for nonlinear multi-input multi-output distillation columns.
Math. Probl. Eng. 1–13, 2017 (2017)
17. Mei, Y., Tan, G., Liu, Z.: An improved brain-inspired emotional learning algorithm
for fast classification. Algorithms 10(2), 70 (2017)
18. César, M.B., Coelho, J.P., Gonçalves, J.: Evolutionary-based BEL controller applied to a magneto-rheological structural system. Actuators 7(2), 29 (2018)
19. Kennedy, J., Eberhart, R.C.: Particle swarm optimization. In: Proceedings of the
1995 IEEE International Conference on Neural Network, pp. 1942–1948 (1995)
20. Shi, Y., Eberhart, R.C.: Parameter selection in particle swarm optimization. In:
Porto, V.W., Saravanan, N., Waagen, D., Eiben, A.E. (eds.) EP 1998. LNCS,
vol. 1447, pp. 591–600. Springer, Heidelberg (1998). https://doi.org/10.1007/
BFb0040810
Robotic Welding Optimization
Using A* Parallel Path Planning
Tiago Couto1,2(B), Pedro Costa1,3, Pedro Malaca2, Daniel Marques2, and Pedro Tavares2
1 Faculty of Engineering, University of Porto, Porto, Portugal
2 SARKKIS Robotics, Porto, Portugal
tiago.couto@sarkkis.com
3 INESC-TEC - INESC Technology and Science, Porto, Portugal
Abstract. The world of robotics is in constant evolution, trying to find
new solutions to improve on top of the current technology and to over-
come the current industrial pitfalls. To date, one of the key intelligent robotics components, path planning algorithms, lacks flexibility when considering dynamic constraints in the surrounding work cell. This is mainly related to the large amount of time required to generate safe collision-free paths for high-redundancy systems.
Furthermore, and despite the already known benefits, the adoption of CPU/GPU parallel solutions is still lacking in the robotics field. This work presents a software solution capable of connecting path planning algorithms with parallel computing tools, reducing the time needed to generate a safe path. The output of this work is the validation of the introduction of intelligent parallel solutions in the robotic sector.
Keywords: Optimization · A* algorithm · CPU parallelism · GPU
parallelism
1 Introduction
Trajectory planning is all about generating the rules for the movement of a manipulator to reach a determined goal. Constraints such as collision avoidance and limits on joint angles, velocities, accelerations, or torques further restrict the robot movement. Once every constraint and every viable trajectory is taken into account, optimization is needed to choose the best trajectory based on a defined objective, such as energy consumption or execution time [5].
Focusing on path planner optimization, the complexity is high, as industrial applications usually integrate robot manipulators, the additional external axes to which they are associated (tracks, rings, orbitals) and, lastly, the precision that they need to achieve.
Due to this high complexity, and to achieve the optimal solution for the
movement of a manipulator, path planning becomes a time-consuming task. This
work aims to improve an implementation of the A* algorithm in Cartesian space,
using parallel computing (CPU and GPU), for an advanced robotic work cell,
combining advanced (collision-free) offline programming and advanced sensing,
minimizing the path length by using a graph algorithm, and minimizing the
movements performed by the manipulator, namely the joints’ efforts, since the
capabilities for flexible path planning are reduced.
2 Background
The industrial community is in constant search of new approaches to reduce the
time required to complete a given task. The main optimization for robotics, in
particular, is the reduction of the trajectory setup and execution time, which
can be split into two important aspects: minimizing path length and minimiz-
ing trajectory duration. This solution aims to improve both aspects in the A*
algorithm, focusing on achieving an optimal path to reach the solution, while
minimizing the movement performed by the joints. However, to better under-
stand the implementation, there is a need to understand the basic concepts about
the A* algorithm and the parallelism that can be used in robotics.
2.1 Parallel Computing in Robotics
The path planning for a manipulator, as defined earlier, is a very complex task,
and the complexity increases with the introduction of external axes. The parallel
processing allows the reduction of the time required to achieve a specified goal
and/or improve the quality of the solution. Parallel processing has different lev-
els, however, the one focused on this work is the algorithm level, which helps to
manage the heavy computational complexity of these solutions, allowing faster
Cspace evaluation algorithms [8]. Regarding parallel computing, there are two
different approaches, cloud computing, and grid computing [2].
CPU Parallelism. Cloud computing, as defined by [4], is all about sharing software: reducing the hardware load by using the cloud to process complex calculations and only retrieving the results when needed, and storing data that can later be queried.
Multithreading introduces some challenges, arising from the way the CPU handles threads. Shared memory is one of the problems, as is coordinating the threads. Since the CPU processes the threads in an undefined order and can switch between them, there is a need to ensure that the tasks each one is performing do not affect the others or, if they do, the resources need to be locked to avoid data corruption. Regarding threads, there are three approaches in C#: Thread, ThreadPool and BackgroundWorkers.
GPU Parallelism. Grid computing, as defined by [3], is a distributed system that uses multiple computing capacities to solve large computational problems. The tool used is Nvidia's CUDA [1], which allows computation to be sped up by using the power of the GPU for the parallelizable part of the computation.
The GPU is designed for computer graphics and image processing, being more powerful than the CPU when performing small tasks due to its massively parallel computation, and CUDA allows the GPU to be used for general-purpose programming [7]. To use CUDA, the host (CPU) calls the kernels (global functions) and the device (GPU) executes and parallelizes the tasks. Since the kernels use data in the GPU memory, the data needs to be transferred between the CPU and the GPU using dedicated CUDA variables. The workflow starts with allocating memory on the device, transferring data from the host to the device, executing the kernel(s), transferring results from the device back to the host, and freeing the memory used on the device.
3 Implementation
To understand the proposed solution, the implementation created for the A*
algorithm needs to be explained, from the heuristics used, to how the configu-
ration space was built, and how the calculations for the manipulator movement
were defined.
The current work considers a work cell composed of a Yaskawa MA1440
manipulator equipped with a Binzel Abirob W500 welding torch. The goal is to
incorporate the planning solution into the SARKKIS CAM software. This is an offline programming solution for high-redundancy robotic systems; this work focuses on the CoopWeld cell (https://youtu.be/3L0JBA9ozFA) at an initial stage, scaling later to H-tables or higher-complexity work cells.
3.1 A* Algorithm
The A* algorithm works based on two heuristics, the g cost and the h cost, and its performance is only as good as these heuristics. The g cost represents the cost of moving from the current position to the neighbour being explored. This cost is defined by the difference between the values of each joint; to minimise the movement of the manipulator, the difference between the neighbour joints and the destination joints was also added, weighted by a ratio proportional to the Euclidean distance to the destination compared with the Euclidean distance from the origin to the destination. The arm position is chosen based on the joint angles that minimise the g cost.
The h cost is calculated using the Euclidean distance, in Cartesian coordinates, between the current neighbour node and the destination. This fits the problem since it is always lower than the g cost, which uses the joint movements; the heuristic therefore does not overestimate the remaining cost, which keeps A* complete, and it does not underestimate by a large margin, which reduces the number of searched nodes and optimizes the running time.
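A minimal sketch of these two cost terms, under the above description, could look as follows in Python/NumPy. The function and variable names are hypothetical, and the weighting ratio is a direct transcription of the textual description rather than the exact production code.

# Minimal sketch of the g and h cost terms described above (names are hypothetical).
import numpy as np

def h_cost(p_neigh, p_dest):
    """Euclidean distance in Cartesian space between neighbour and destination."""
    return float(np.linalg.norm(np.asarray(p_dest) - np.asarray(p_neigh)))

def g_cost(q_curr, q_neigh, q_dest, p_origin, p_neigh, p_dest):
    """Joint-space movement cost plus a distance-weighted pull towards the goal joints."""
    move = np.sum(np.abs(np.asarray(q_neigh) - np.asarray(q_curr)))   # movement to neighbour
    pull = np.sum(np.abs(np.asarray(q_dest) - np.asarray(q_neigh)))   # joint distance to goal
    ratio = h_cost(p_neigh, p_dest) / max(h_cost(p_origin, p_dest), 1e-9)
    return float(move + ratio * pull)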
The A* algorithm starts in a defined starting configuration, with a specific arm position, and tries to find the optimal path to the final configuration, which also has a defined arm position. In each iteration, the algorithm chooses the node with the lowest f cost (the sum of the g cost and the h cost), finds all its neighbours that are not obstacles, and adds them to the list of nodes to evaluate. The algorithm stops if the final node is reached or if all nodes have been searched and there is no solution.
Configuration Space. The CSpace was implemented with Cartesian coordinates; the x and y axes were divided into 301 subdivisions and the z axis into 151 subdivisions, corresponding to 12 mm increments along the range of motion. These subdivisions correspond to a total of 301 × 301 × 151 = 13,680,751 configurations.
To create a faster implementation of the algorithm, some adjustments were made, starting with the creation of a less precise CSpace, with 72 mm increments instead of 12 mm for the x and y axes and 36 mm for the z axis, decreasing the number of nodes to search when finding the path. To obtain this reduction, the number of subdivisions was decreased to 51, resulting in a total of 51 × 51 × 51 = 132,651 configurations, reducing the search space by more than 100 times; the average number of searched nodes is reduced by almost 6 times.
This less precise CSpace improves the algorithm's running time since, along most of the path, if there are no obstacles around, the intermediate configurations that would be defined with the complete CSpace are redundant; the only important ones are those close to obstacles, where the complete CSpace would be iterated, but almost instantaneously, since those nodes are close together.
After defining the full path from the source node to the destination node, and since the CSpace used was less precise, an approximation iteration is performed. Around each obstacle, an approximation is computed to avoid collisions due to the lack of precision. After this, an approximation is also computed for the final position, between the final node of the less precise CSpace and the final node in the more precise CSpace.
Associated with the CSpace is also the CObs, where the configurations that
the robot itself is occupying can be defined.
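As an illustration of the coarse CSpace, the sketch below (Python/NumPy) builds a 51 × 51 × 51 boolean occupancy grid together with a helper that maps Cartesian coordinates to grid indexes; the workspace limits are assumed values, not taken from the paper.

# Minimal sketch of the coarse occupancy grid (CSpace) with index conversion.
import numpy as np

SUBDIV = 51
X_MIN, X_MAX = -1.8, 1.8                                   # assumed workspace limits (m)
cspace = np.zeros((SUBDIV, SUBDIV, SUBDIV), dtype=bool)    # True marks an obstacle cell

def to_index(x, y, z):
    step = (X_MAX - X_MIN) / (SUBDIV - 1)
    idx = np.round((np.array([x, y, z]) - X_MIN) / step).astype(int)
    return tuple(np.clip(idx, 0, SUBDIV - 1))

cspace[to_index(0.5, 0.0, 0.2)] = True                     # mark an example obstacle cell
print(cspace.size)                                         # 51*51*51 = 132,651 configurations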
Arm Position. The arm position of each configuration needs to be determined
using kinematics, which leads to multiple solutions for each Cartesian position.
SARKKIS software has integrated kinematics, which automatically calculates
the possible solutions, and each solution has the manipulator links and the end-
effector dimensions.
The first step is to define if there is a solution that does not cause any
collision with the environment. Using the solutions where there are no collisions,
the one that results in the minimum value for the g cost (cost of movement) is
the optimal one.
3.2 Parallel Computing
After identifying possible tasks in which to implement parallelism, there was a need to evaluate whether the parallelism could be performed on the CPU or on the GPU. For the CPU, every task that is executed multiple times is worth parallelising; for the GPU, the selection was more specific, since the tasks need to consist of dense, small calculations that can be subdivided into many threads, such as the calculations for the neighbours, elements, and heuristics. These CPU and GPU changes can be seen in Fig. 1.
Fig. 1. CPU and GPU parallelism
CPU Parallelism. The first aspect to consider is that this type of parallel work is only suitable for work that is performed over many iterations and that involves a reasonable amount of computation; otherwise it causes the code to run more slowly.
Due to this workload limitation, threading was the solution implemented, and the place with a reasonable amount of work was the node calculations.
The first shared memory structures are the OpenList and the ClosedList; to avoid simultaneous editing, mutexes were introduced, using Microsoft threading.
The second memory problem was due to the editing of the nodes, because different nodes can have common neighbours. To avoid this, the variables that are edited were converted to lists: instead of editing the variable, a new value is added to the list, and when performing the calculations the minimum value for the h cost is used, while the other values can be tracked in the list through the h cost index.
To complete the threading research, BackgroundWorkers were implemented. To define a BackgroundWorker, an event handler is added for the DoWork event, which contains the method itself; another handler is added to the RunWorkerCompleted event, a method executed when the BackgroundWorker completes its work; and WorkerSupportsCancellation needs to be enabled in order to cancel ongoing BackgroundWorkers once the solution is found. To send data to the method, an argument needs to be created, consisting of a Tuple with all the information needed (configurations and CSpace). To start the operation, RunWorkerAsync is simply called with the argument as input.
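The paper's implementation uses C# BackgroundWorkers; purely as an analogue, the same pattern (dispatching node expansions to a worker pool and cancelling pending work once a solution is found) can be sketched in Python with concurrent.futures, as below. All names here are illustrative and do not correspond to the actual implementation.

# Illustrative Python analogue of the BackgroundWorker pattern described above.
from concurrent.futures import ThreadPoolExecutor, as_completed

def expand_node(args):
    node, cspace = args                      # tuple argument, as in the text
    # neighbour generation and cost evaluation would go here
    return node

def parallel_expand(open_nodes, cspace, goal, n_threads=8):
    """Expand nodes in parallel and stop early once the goal node shows up."""
    results = []
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [pool.submit(expand_node, (n, cspace)) for n in open_nodes]
        for fut in as_completed(futures):
            results.append(fut.result())
            if goal in results:              # analogue of cancelling the remaining workers
                for other in futures:
                    other.cancel()
                break
    return results

print(parallel_expand(list(range(20)), cspace=None, goal=7))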
GPU Parallelism. One of the main concerns when using CUDA is that, to perform the computations, the data needs to be moved to and from the device, so the complexity of the operations needs to justify this data transfer in order to obtain an improvement in performance [6]. The ideal application would run many data elements simultaneously in parallel, which involves arithmetic operations on a large amount of data, where the operations can be performed across thousands or millions of elements at the same time.
Since g cost, h cost, and find element are functions with simple calculations, a CUDA kernel for each of them resulted in worse performance of the algorithm. Using a kernel for find neighbours worked a little better, since each node has 26 possible neighbours, but it still did not help the performance. The function that could improve the most was find collisions, because it has to verify each configuration that the manipulator is occupying and check whether there is a collision with the environment obstacles. With this implementation, there was the need to create a boolean array with the indexes of the obstacles in order to send the information to the kernel, since the device cannot access information from the host. For this, when the CSpace was created, a list was filled with those indexes.
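As a CPU-side analogue of this parallel collision check (not the CUDA kernel itself), the test can be expressed as an element-wise AND between two boolean arrays indexed over the flattened CSpace, as in the following Python/NumPy sketch; the obstacle indexes are assumed example values.

# Illustrative CPU analogue of the parallelised collision check.
import numpy as np

n_cells = 51 * 51 * 51
obstacles = np.zeros(n_cells, dtype=bool)        # filled while building the CSpace
obstacles[[1200, 5000, 70000]] = True            # example obstacle indexes (assumed)

def collides(robot_cells, obstacles):
    """robot_cells: indexes of cells currently occupied by the manipulator."""
    occupied = np.zeros_like(obstacles)
    occupied[robot_cells] = True
    return bool(np.any(occupied & obstacles))    # element-wise AND over all cells

print(collides([10, 5000, 42], obstacles))       # True: cell 5000 is an obstacle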
4 Integration
The goal of this solution is to improve current solutions in the path planning
industry, and for this, the solution was integrated with the SARKKIS software.
This enables industrial validation of the solution to be performed, trying to make the most out of the proposed improvements.
To implement the solution, a DLL was created and added to the SARKKIS software. The MetroID software family, proprietary to SARKKIS, is CAM software for cutting and welding, focused on generating collision-free operation paths for robotic work cells. Furthermore, upon sequencing the operation, the planning between cuts or welds can be rapidly prepared considering this parallel A* approach. Therefore, the work is relevant both for the setup stage (vector generation) and for the production/control stage (path planning between operations, avoiding dynamic obstacles as well as fixtures added to the scene).
5 Experimental Validation
To validate the implemented functions and verify the optimization that can be obtained with them, some tests were performed in order to better understand which tool to use and whether there was a benefit in combining both. The computer used for this preliminary testing was equipped with an AMD Ryzen 7 2700X eight-core processor (@3.70 GHz), 16 GB of RAM and an additional Nvidia 1050 Ti GPU.
5.1 CPU Parallelism
To evaluate the implemented functions that resulted in an improvement, and to compare them, the A* solution using ThreadPool and BackgroundWorkers was run across multiple scenarios, with and without obstacles, in more than 200 tests, in order to find the optimal number of threads. These tests were performed on a computer with 8 cores.
In these tests, using 16 threads for the ThreadPool implementation and 8 or 16 threads for the BackgroundWorkers implementation resulted in the best performance. Following these tests, to verify the difference between these implementations and to understand the improvement, 250 more tests were performed comparing them with the sequential (traditional) implementation of the A*, in multiple scenarios and with different levels of complexity. After these tests, the utilization of BackgroundWorkers with 8 threads resulted in the best performance for our setup, with an improvement of almost 10 times compared to the traditional implementation of the A* (Fig. 2).
Fig. 2. Average time of different implementations
Since the creation of the CSpace also takes a lot of time, an implementation
of that function was created using BackgroundWorkers to understand if there
would be any benefit, and the results can be seen below, with an improvement
of more than 20%. Also, a solution using a kernel was tried, but it only resulted
in a very small improvement, around 3%.
5.2 GPU Parallelism
For the GPU parallelism, some functions were created, but since g cost, h cost and find element were too simple and resulted in a huge loss of performance, they were not extensively tested. For the check collisions function, there was a large increase in time, since there were too many variables to be copied to the device, which caused a loss of performance. For the find neighbours function, since it would create at most 26 neighbours, and therefore 26 threads, it did not result in an improvement for the implementation.
6 Conclusion and Future Work
The presented solution, an optimized A* algorithm, proved to improve the path planning time with the introduction of parallel computing, since it reaches the optimal, collision-free solution in reduced time. This work provides a flexible solution that can be integrated by a wide range of offline programming software.
After the experimental validation, and to validate the results obtained, an industrial validation was performed in the CoopWeld work cell. The solution worked in the industrial environment, as can be seen at https://tiago27couto.wixsite.com/dissertation, and validated the robustness of the algorithm, with some changes still to be made to refine the solution; it proved to have the potential to improve current path planning solutions.
As the next step, inserting external axes will improve the reachability of
the solution, even though it will also increase the complexity, which will enable
further improvements and possible integration of GPU computing.
Acknowledgment. The research and development work leading to these results has received funding from the TRINITY Robotics project, under the European Union's Horizon 2020 research and innovation programme, grant agreement 825196.
References
1. CUDA zone—NVIDIA developer. https://developer.nvidia.com/cuda-zone. Acces-
sed 15 May 2021
2. Henrich, D., Honiger, T.: Parallel processing approaches in robotics. In: Proceeding
of the IEEE International Symposium on Industrial Electronics, ISIE 1997, vol. 2,
pp. 702–707 (1997). https://doi.org/10.1109/ISIE.1997.649079
3. Jiang, Y.S., Chen, W.M.: Task scheduling in grid computing environments. In: Pan,
J.S., Krömer, P., Snášel, V. (eds.) Genetic and Evolutionary Computing, vol. 238,
pp. 23–32. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-01796-9 3
4. Kar, A., Dutta, A.K., Debnath, S.K.: Application of cloud computing for optimiza-
tion of tasks scheduling by multiple robots operating in a co-operative environment.
In: Hemanth, J., Fernando, X., Lafata, P., Baig, Z. (eds.) ICICI 2018. LNDECT,
vol. 26, pp. 118–125. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-
03146-6 11
5. Pham, Q.-C.: Trajectory planning. In: Nee, A.Y.C. (ed.) Handbook of Manufactur-
ing Engineering and Technology, pp. 1873–1887. Springer, London (2015). https://
doi.org/10.1007/978-1-4471-4670-4 92
6. Rahmani, V., Pelechano, N.: Multi-agent parallel hierarchical path finding in navi-
gation meshes (MA-HNA*). Comput. Graph. 86, 1–14 (2020). https://doi.org/10.
1016/j.cag.2019.10.006
7. Ruetsch, G., Oster, B.: Getting started with CUDA (2008). https://www.nvidia.com/content/cudazone/download/Getting Started w CUDA Training NVISION08.pdf. Accessed 15 May 2021
8. Therón, R., Blanco Rodríguez, F.J., Curto, B., Moreno, V., García-Peñalvo, F.: Parallelism and robotics: the perfect marriage. ACM Crossroads 8, 1–11 (2002)
Deep Learning
Leaf-Based Species Recognition Using
Convolutional Neural Networks
Willian Oliveira Pires1, Ricardo Corso Fernandes Jr.1, Pedro Luiz de Paula Filho1, Arnaldo Candido Junior1(B), and João Paulo Teixeira2
1 Federal University of Technology - Paraná, Medianeira Campus, Curitiba, Brazil
{willianpires,ricjun}@alunos.utfpr.edu.br, {plpf,arnaldoc}@utfpr.edu.br
2 Research Centre in Digitalization and Intelligent Robotics (CEDRI) – Instituto Politecnico de Braganca, Braganca, Portugal
joaopt@ipb.pt
http://www.utfpr.edu.br/english, https://cedri.ipb.pt/
Abstract. Identifying plant species is an important activity in species control and preservation. The identification process is carried out mainly by botanists, and consists of a comparison with already known specimens or uses the aid of books, manuals or identification keys. Artificial Neural Networks have been shown to perform well in classification problems and are a suitable approach for species identification. This work uses Convolutional Neural Networks to classify tree species from leaf images. In total, 29 species were collected. This work analyzed two network models, Darknet-19 and GoogLeNet (Inception-v3), presenting a comparison between them. The Darknet and GoogLeNet models achieved recognition rates of 86.2% and 90.3%, respectively.
Keywords: Deep learning · Leaf recognition · Tree classification
1 Introduction
Sustainability is an important concern in the context of business and govern-
ments in view of nature preservation. According to Shrivastava [12], both busi-
nesses and governments play an important role in nature’s preservation. The
identification process of forest species is important in this context, specially in
the case of endangered species. Flora identification is currently made by botanists
by comparing with already known species or with book guidance, manuals and
identification keys. This comprises simple tasks as identifying whether the plant
have flowers and fruits to more complex tasks, such as identifying the plant
species by observing morphological attributes. For non-professionals, this pro-
cess can be long and error prone, so an automated tool would save time and,
possibly, plant species. The advancements made in computation, image process-
ing techniques and pattern recognition unveiled new ways of specie identification.
Deep learning based system are promising in this field, being helpful both for
the professionals and non-professionals.
In this work, we used leaf images to train Convolutional Neural Networks (CNNs) for the classification task. Two models were trained: GoogLeNet (Inception-v3) and Darknet-19, both implemented using Tensorflow. These models were chosen due to their light requirements when compared to other models, allowing their use in low-cost equipment during field research. The models can be used to identify species in natura.
This work is organized as follows. Section 2 presents an overview of CNNs, species identification and related works. Section 3 presents the materials and methods used to train our models. Section 4 discusses the results. Finally, Sect. 5 contains the work's conclusions.
2 Background
2.1 Species Identification
In order to perform plant species classification, botanists rely on plant taxonomy, analyzing and grouping species by morphological similarities and genetic kinship links [10]. This process generated a field in botany called dendrology, which investigates the identification, distribution and classification of woody plants [5]. The scope of dendrology includes root types, tree sizes, pilosity and the shaft, as well as diagnostic elements (color, texture and structure). The shaft is the part of the tree trunk free of ramifications, which can be of different types, shapes and bases.
Table 1 presents leaf components. The basis of leaf variety is its division type: there are simple leaves, which present a single leaf lamina (Fig. 1), and there are also compound leaves, which present more than one leaflet, as shown in Fig. 2.
Table 1. Leaf characteristics (term: description)
Leaf venation: pattern of veins in the leaf
Hairiness: hair-like structures on the leaf surface
Leaf arrangement: how leaves are arranged on a twig
Stem: plant structure that supports leaves, flowers and fruits
Stipule: a pair of small organs that may be attached to the twig on either side of the petiole
Leaf base: part of the leaf nearest to the petiole
Leaf apex: part of the leaf farthest from the petiole
Simple leaf: leaf that has a single blade
Compound leaf: leaf that has two or more blades, which are called leaflets
Leaflets: leaf subdivisions of a compound leaf
Stalk: a thin stem that supports a leaf and joins it to another part of the plant
Rachis: principal vein in the compound leaf, an extension of the stalk
Bud: a small lump from which a flower, leaf or stem develops
Fig. 1. Simple leaf [5]
Fig. 2. Compound leaf [5]
There are other important leaf elements that generate a wide species variety, such as the leaf shape, tip (or apex), base and margin attributes, which can be used to differentiate plant species. Leaves can also be identified according to their phyllotaxis, which is divided into four types: alternate, spiral, opposite and whorled, shown in Fig. 3. This attribute is used before leaf extraction, as it describes how leaves are organized. Another attribute is leaf venation, which is divided into pinnate, reticulate, parallel, palmate and dichotomous (Fig. 4).
All those characteristics are taken into account during the species identification process and are the foundation of a widely used identification method among botanists: the dichotomous key, which is based on the observation of plant characteristics. Researchers compare the characteristics of field-extracted specimens with the characteristics in the dichotomous key, one by one, until they match one of the registered species [10]. Table 2 presents a simple example of using a dichotomous key to classify a leaf according to its venation.
Fig. 3. Leaf phyllotaxis [5]
Fig. 4. Leaf venation [5]
Table 2. Simple dichotomous key [5]
1. Leaves with a single vein and not ramified: single main vein
   Leaves with more than a single vein: go to 2
2. Leaves with more than a single vein, all parallel to each other: parallel
   Leaves with non-parallel veins: go to 3
3. Secondary veins originate from a main vein: pinnate
   Leaves with several main veins originating from the petiole: palmate
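The key in Table 2 maps naturally onto nested conditionals. The following Python sketch is only an illustration of that reading of the table.

# Minimal sketch of the dichotomous key in Table 2 as nested conditionals.
def venation_type(n_veins, parallel, secondary_from_main):
    if n_veins == 1:
        return "single main vein"
    if parallel:
        return "parallel"
    return "pinnate" if secondary_from_main else "palmate"

print(venation_type(n_veins=5, parallel=False, secondary_from_main=True))  # pinnate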
This is a fairly simple example, but in many cases characteristic selection might not be trivial, and plant species identification will usually involve more than a single characteristic. An example is Xanthosoma taioba, which is suitable for consumption and hard to classify, as it is very similar to Xanthosoma violaceum, which is not suitable for consumption. Figure 5 makes it clear that differentiating between two species is not always a simple task.
Fig. 5. The real Xanthosoma taioba (right), and Xanthosoma violaceum (left) [5]
2.2 Convolutional Neural Networks
In the context of machine learning, specifically neural networks, Convolutional Neural Networks (CNNs) are a powerful model for analyzing images. CNNs are popular in image processing since they are inspired by the visual cortex. These networks are based on the idea of specialized components inside a system, each with specific tasks, in a similar fashion to the visual cortex observed by [16]. This architecture is composed of a sequence of layers that tries to capture a hierarchy of increasingly sophisticated representations. Besides the input layer, which normally consists of an image with width, height and color depth (RGB – red, green and blue channels), there are three typical layers: the convolutional layer, the pooling layer and the densely connected layer [15].
The first hidden layer in a CNN is usually a convolutional layer, which is composed of many feature maps (filters) capable of learning patterns as the training progresses [3]. Convolutional layers usually receive a two-dimensional input, or a collection of two-dimensional inputs, and are widely used to process images. The layers are then submitted to a convolution operation whose parameters are learned during the network training phase.
Each layer usually also applies a non-linear operation, which greatly increases the model's generalization capacity. This is done using an activation function. A popular choice is the Rectified Linear Unit (ReLU), a non-linear function that is fast to compute and that replaces negative input values by zero [15]. An example can be observed in (1), where z represents the function input (a neuron input).
\phi(z) = \begin{cases} 0, & z \le 0 \\ z, & z > 0 \end{cases}    (1)
After a convolution and activation function, it is common to use a pooling layer. This technique aims to reduce the size of the resulting matrix, which diminishes the number of neural network parameters to learn, helping to avoid overfitting [15]. Max pooling is a pooling technique in which several inputs close to each other are replaced by a single value, the highest value in their neighbourhood.
Each consecutive layer is capable of representing more complex concepts than the previous one. The last layers of a CNN are usually dense (or fully connected) layers, built on top of the convolutional layers. In the case of CNNs for classification, the last layer outputs an n-dimensional vector, where n is the total number of classes and each vector element is the predicted probability of one of the available classes [3]. For classification, it is common to use the Softmax function, which compares the responses of the output neurons and returns the results in the form of probabilities. This function is presented in (2), for the stimulus z_j received by the j-th output neuron.
\phi(z_j) = \frac{e^{z_j}}{\sum_k e^{z_k}}    (2)
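Both activation functions can be written directly in Python/NumPy, as in the short sketch below (the max-shift in the softmax is a standard numerical-stability detail, not part of (2)).

# Minimal sketch of the ReLU (1) and Softmax (2) functions.
import numpy as np

def relu(z):
    return np.maximum(0.0, z)                     # eq. (1): negative inputs become zero

def softmax(z):
    e = np.exp(z - np.max(z))                     # shift for numerical stability
    return e / np.sum(e)                          # eq. (2): probabilities over classes

scores = np.array([1.5, -0.3, 2.2])
print(relu(scores), softmax(scores))              # softmax output sums to 1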
These models are normally trained using the Backpropagation and Gradient Descent algorithms. To avoid overfitting, the dropout technique can be used, which consists of randomly removing neurons during the training process [1].
2.3 Darknet
Darknet is a neural network topology usually implemented in the YOLO framework. This framework allows real-time object detection and is able to identify objects in images and videos. In this work, we investigated Darknet-19, an architecture composed of 19 convolutional layers interspersed with 5 max-pooling layers, as presented in Table 3 [3].
This network segments the input image into S×S frames, known as grids. To do this, it uses a divide-and-conquer strategy, making use of image segments to identify object positions in addition to identifying the objects themselves [2].
2.4 GoogLeNet
GoogLeNet is a CNN that became well known after winning the 2014 ImageNet competition. The objective of GoogLeNet's engineers was to enhance the network's computational efficiency while making it deeper and wider. The main feature of GoogLeNet is the inception module, which is based on the idea of using multiple convolutional filters with different kernel sizes in the same convolutional layer [13]. The enhanced inception module is presented in Fig. 6.
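As an illustration of the idea, an inception-style block with parallel convolutions of different kernel sizes can be sketched in Python/Keras as follows; the filter counts are arbitrary assumptions and not the exact GoogLeNet configuration.

# Illustrative inception-style block: parallel convolutions concatenated along channels.
from tensorflow.keras import layers

def inception_block(x, f1=64, f3=96, f5=32, fp=32):
    b1 = layers.Conv2D(f1, 1, padding="same", activation="relu")(x)      # 1x1 branch
    b3 = layers.Conv2D(f3, 1, padding="same", activation="relu")(x)      # 1x1 reduction
    b3 = layers.Conv2D(f3, 3, padding="same", activation="relu")(b3)     # 3x3 branch
    b5 = layers.Conv2D(f5, 1, padding="same", activation="relu")(x)      # 1x1 reduction
    b5 = layers.Conv2D(f5, 5, padding="same", activation="relu")(b5)     # 5x5 branch
    bp = layers.MaxPooling2D(3, strides=1, padding="same")(x)            # pooling branch
    bp = layers.Conv2D(fp, 1, padding="same", activation="relu")(bp)
    return layers.Concatenate()([b1, b3, b5, bp])                        # stack feature maps

inp = layers.Input(shape=(224, 224, 3))
out = inception_block(inp)                                               # builds the block graph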
Table 3. Darknet-19 [13]
Type Filters Stride/Size Output Dimension
Convolutional 32 3 × 3 224 × 224
Maxpool 2 × 2/2 112 × 112
Convolutional 64 3 × 3 112 × 112
Maxpool 2 × 2/2 56 × 56
Convolutional 128 3 × 3 56 × 56
Convolutional 64 1 × 1 56 × 56
Convolutional 128 3 × 3 56 × 56
Maxpool 2 × 2/2 28 × 28
Convolutional 256 3 × 3 28 × 28
Convolutional 128 1 × 1 28 × 28
Convolutional 256 3 × 3 28 × 28
Maxpool 2 × 2/2 14 × 14
Convolutional 512 3 × 3 14 × 14
Convolutional 256 1 × 1 14 × 14
Convolutional 512 3 × 3 14 × 14
Convolutional 512 3 × 3 14 × 14
Convolutional 256 1 × 1 14 × 14
Maxpool 2 × 2/2 7 × 7
Convolutional 1024 3 × 3 7 × 7
Convolutional 512 1 × 1 7 × 7
Convolutional 1024 3 × 3 7 × 7
Convolutional 512 1 × 1 7 × 7
Convolutional 1024 3 × 3 7 × 7
Convolutional 1000 1 × 1 7 × 7
Avgpool Global 1000
Softmax
GoogLeNet is a 22 layer network, considering only convolutional layers, as
shown in Table 4. The last layer uses the Softmax function to perform classifica-
tion [15].
2.5 Related Work
Many solutions based on deep learning have been used in specie identification
problems, due to recent results using this technique. Other strategies can also be
used, based on image processing and pattern recognition. These techniques use
macro and microscopic characteristics of the image. For example, [8] classifies
wood applying the following image characteristics extraction techniques: color
374 W. O. Pires et al.
Fig. 6. Inception module [14]
Table 4. GoogLeNet’s structure [13]
Type Filters/Stride Output Dimension
Convolution 7 × 7/2 112 × 112 × 64
Max Pool 3 × 3/2 56 × 56 × 64
Convolution 3 × 3/1 56 × 56 × 192
Max Pool 3 × 3/2 28 × 28 × 192
Inception (3a) 28 × 28 × 256
Inception (3b) 28 × 28 × 480
Max Pool 3 × 3/2 14 × 14 × 480
Inception (4a) 14 × 14 × 512
Inception (4b) 14 × 14 × 512
Inception (4c) 14 × 14 × 512
Inception (4d) 14 × 14 × 512
Inception (4e) 14 × 14 × 832
Max Pool 3 × 3/2 7 × 7 × 832
Inception (5a) 7 × 7 × 832
Inception (5b) 7 × 7 × 1024
Avg Pool 7 × 7/1 1 × 1 × 1024
Dropout (40%) 1 × 1 × 1024
Linear 1 × 1 × 1000
Softmax 1 × 1 × 1000
These include color analysis, GLCM (Gray-Level Co-occurrence Matrix), border histograms, fractals, LBP (Local Binary Pattern), LPQ (Local Phase Quantization) and Gabor filters. This approach resulted in a recognition rate of 99.49% among 42 species.
Another work with a related theme proposes the analysis and identification of plant species based on texture feature extraction from microscopic leaf epidermis images [7]. Texture extraction techniques were used to analyze 32 species, and this approach had a 96% success rate. Also using leaves, [11] applied image segmentation techniques for feature extraction; GLCM-based feature vectors were extracted and the achieved recognition rate was around 75.4%, using techniques such as MLP (Multilayer Perceptron), SMO (Sequential Minimal Optimization) and LibSVM (Library for Support Vector Machines) as classifiers.
When deep learning is the main identification method, the strategy is essentially
to use CNNs to learn the most discriminative leaf characteristics for recognizing a
species. Following this strategy, [6] built CNNs for weed control, aiming to detect
a given species in a lawn. Images of 256 × 256 pixels were used with the AlexNet
architecture, and the result was 75% precision [9].
Among other CNN approaches, [17] proposes the analysis of pictures taken from
above farm fields, which required an additional preprocessing step to enhance
illumination before feeding the images to the network. The network is composed of
five convolutional layers with ReLU activation functions and a Softmax function at
the end. Their experiment obtained 97.47% precision.
3 Method
3.1 Data Collection
Initially, leaves from 29 different species were collected; the species are listed in
Sect. 4 (Table 5). For each species, 100 photos were taken, covering both sides of
the leaves. The images were obtained using the photo booth proposed by [11]. It
measures 40 square centimeters and its interior contains LED strips, which produce
high luminosity at a low energy cost. The LEDs can be fed by batteries or a 12 V
power supply, and the booth is easily transported. To avoid reflections in the images,
the internal walls were painted black, except for the bottom, which is white. The
leaf is placed on the bottom of the photo booth, where it is pressed by a glass
pane to keep it fixed and flat. This dataset is in the process of being publicly
released.
After gathering enough photos, data augmentation techniques were used to
artificially increase the dataset size, since many samples are needed to train a
deep neural network. Data augmentation was performed with Python 3.5.2 and
Keras scripts that alter the original images to create new ones. The Keras
ImageDataGenerator class was used to generate new image samples from the
original data, using the default values for the rotation range, width shift, shear
range, zoom range, horizontal and vertical flip, and fill mode operations.
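The augmentation step described above could be scripted roughly as follows. This is a sketch only: the parameter values, file name and output folder are illustrative rather than the authors' exact settings, and the modern tensorflow.keras import path is used instead of the original Keras 2 / Python 3.5 setup.

import os
import numpy as np
from tensorflow.keras.preprocessing.image import (ImageDataGenerator,
                                                  load_img, img_to_array)

# Augmentation operations named in the text (values here are illustrative).
datagen = ImageDataGenerator(
    rotation_range=20,       # rotation range
    width_shift_range=0.1,   # width shift
    shear_range=0.1,         # shear range
    zoom_range=0.1,          # zoom range
    horizontal_flip=True,    # horizontal flip
    vertical_flip=True,      # vertical flip
    fill_mode="nearest",     # fill mode
)

# Generate a handful of augmented copies of one leaf photo (hypothetical file name).
os.makedirs("augmented", exist_ok=True)
image = img_to_array(load_img("leaf_sample.jpg"))
flow = datagen.flow(np.expand_dims(image, axis=0), batch_size=1,
                    save_to_dir="augmented", save_prefix="leaf", save_format="jpeg")
for _ in range(10):
    next(flow)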
3.2 Training
The models were then trained with the augmented data and their results were
evaluated in order to improve performance. Both models, GoogLeNet and
Darknet-19, were trained four times.
For GoogLeNet, we used the work of [13] as a reference to develop and train
our network. Originally, the network was configured with a batch size of 50, a
Reduce 1 × 1 of 104, and a dropout rate of 0.5.
Regarding Darknet-19, we used Darkflow, a Python implementation that allows
the Darknet framework to be used with TensorFlow [4]. Tiny-YOLO version 2 was
used instead of YOLO version 3, as it was faster to train. Since 29 classes were
used, the number of feature maps in the last layer was 170, with a stride of 1 and
a batch size of 64, following the default configuration for Tiny-YOLO version 2.
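Assuming the standard YOLOv2 output-layer rule with five anchor boxes per grid cell (the anchor count is not stated explicitly in the paper), the 170 feature maps follow directly from the number of classes:

filters = B × (classes + 5) = 5 × (29 + 5) = 170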
4 Tests and Results
The original dataset was divided into 5,800 images for training and 580 for testing.
In order to improve results, both sets were subjected to augmentation, yielding
34,800 images for training and 2,900 for testing. The augmented dataset keeps the
training and test sets independent: no original training image has an augmented
version in the test set and, likewise, no augmented training image has a counterpart
in the test set.
The Darknet model was first trained on the original dataset for 5,000 iterations,
presenting True Positives (TP) = 414, False Positives (FP) and False Negatives
(FN) of 166, and a harmonic mean of 71.3%. After this first experiment, the model
was then trained on the augmented dataset for 18,000 iterations. The values
presented by the network were TP = 2,502, FP and FN of 398, and a harmonic
mean of 86.2%. Precision and recall were calculated for each class; the results
are presented in Table 5 and Fig. 7.
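As a check on these figures (our notation, not the paper's), and since FP and FN are reported as equal, the harmonic mean of precision and recall can be recomputed, up to rounding, directly from the counts:

F1 = 2 TP / (2 TP + FP + FN) = (2 × 2502) / (2 × 2502 + 398 + 398) ≈ 86.3%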
The GoogLeNet experiments were analogous to those of Darknet-19. First, the model
was trained on the original dataset (5,800 images for training and 580 for testing)
for 2,000 iterations. The network presented TP = 461, FP and FN of 119, and a
harmonic mean of 79.4%. For the second experiment (34,800 images for training and
2,900 for validation), the iteration count was raised to 4,000. The network presented
TP = 2,633, FP and FN of 267, and a harmonic mean of 90.7%. Per-class values of
precision, recall and harmonic mean are presented in Table 5 and Fig. 8.
In the performed experiments, GoogLeNet obtained better results than Darknet.
The differences between the two networks in precision, recall and harmonic mean
were, respectively, 4.5%, 4.8% and 4.7%. The GoogLeNet results are good even with
a broad and challenging dataset, since 29 classes were classified, many of which
are similar.
Analyzing the results with and without data augmentation makes clear the
importance of a sufficient amount of data when working with CNNs. The differences
between training Darknet and GoogLeNet with the small and with the sufficient
amount of data were, respectively, 14.9% and 11.3%. The developed data augmen-
tation procedure achieved its objective, expanding the dataset without damaging
the samples.
Table 5. Precision, recall and harmonic mean in details - YOLO and GoogLeNet
No. Scientific Name  YOLO Precision  YOLO Recall  YOLO F1  GoogLeNet Precision  GoogLeNet Recall  GoogLeNet F1
1 Persea Americana 63,0% 64,2% 72,6% 73,0% 100,0% 84,3%
2 Eriobotrya Japonica Lindl. 57,0% 100% 72,6% 75,0% 100% 85,7%
3 Psidium Rufum 100% 80,0% 88,8% 100% 74,0% 85,1%
4 Annona Montana 96,0% 56,1% 70,8% 98,0% 64,0% 77,4%
5 Annona Squamosa 97% 100% 98,4% 100% 100% 100%
6 Cojoba Arborea 100% 100% 100% 100% 100% 100%
7 Coffea 55,0% 55,0% 55,0% 72,0% 75,0% 73,4%
8 Pera Heteranthera 95,0% 100% 97,4% 98,0% 82,3% 89,4%
9 Anacardium Occidentale 100% 100% 100% 100% 100% 100%
10 Peltophorum dubium 100% 84,0% 91,3% 100% 83,3% 90,9%
11 Nectandra Megapotamica 98% 100% 98,9% 94% 100% 96,9%
12 Cerasus 83,0% 89,2% 86,0% 90,0% 95,7% 92,7%
13 Prunus Serrulata 78,0% 100% 87,6% 91,0% 100% 95,2%
14 Salix Babylonica 100,0% 100% 100% 82,0% 100% 90,1%
15 Ilex Paraguariensis 81,0% 100% 89,5% 80,0% 100% 88,8%
16 Annona Coriácea 94% 80,3% 86,6% 100% 94,3% 97,0%
17 Psidium Guajava 83,0% 76,8% 79,8% 94,0% 88,6% 91,2%
18 Annona Muricata 98% 91,5% 94,6% 100% 98,0% 99,0%
19 Syzygium Cumini 84,0% 85,7% 84,8% 97,0% 100% 98,4%
20 Leucaena Leucocephala 100% 100% 100% 100% 98% 99%
21 Citrus Limon 72% 76,5% 74,2% 76% 100% 86,3%
22 Tibouchina Mutabilis 100% 83,3% 90,9% 100% 90,0% 94,7%
23 Brunfelsia Uniflora 81,0% 81,0% 81,0% 90,0% 100,0% 94,7%
24 Mangifera Indica 97% 100% 98,4% 98% 79,6% 87,8%
25 Licania Tomentosa 61,0% 100% 75,7% 78,0% 81,2% 79,5%
26 Dypsis Lutescens 100% 100% 100% 100% 100% 100%
27 Paubrasilia Echinata 86,0% 100% 92,4% 89,0% 100% 94,1%
28 Aspidosperma Polyneuron 78,0% 72,2% 75,0% 82,0% 90,1% 85,8%
29 Eugenia uniflora 65,0% 100% 88,8% 76,0% 77,5% 78,7%
Fig. 7. Confusion Matrix Heatmap - Darknet19
Fig. 8. Confusion Matrix Heatmap - GoogLeNet
5 Conclusions
This work presented a comparison of Darknet-19 and GoogLeNet for tree species
recognition using a dataset composed of leaf images from 29 different species,
reaching recognition rates of 86.2% and 90.3%, respectively. The obtained results
demonstrate the viability of the GoogLeNet and Darknet networks for this classi-
fication task. The models can be applied in field research, especially for identifying
species in natura.
As future work, we plan to test the models on images of leaves that were not
removed from the tree. We also plan to deploy pre-trained Darknet networks on
other platforms, such as smartphones, aiming at practical uses of the model and
comparing it with similar systems. YOLO supports object detection in video, and
Android Studio allows TensorFlow to be used together with a trained YOLOv2
model. Using smartphone cameras, it is therefore possible to develop an app that
recognizes plant species in video. Another possibility is to use the model in drones,
in order to explore areas of limited access, since a huge number of plant species
remain unregistered.
Acknowledgements. We gratefully acknowledge the support of NVIDIA Corpora-
tion with the donation of the GPU used in part of the experiments presented in this
research.
References
1. Baldi, P., Sadowski, P.J.: Understanding dropout. In: 2013 Neural Information
Processing Systems (2013). https://papers.nips.cc/paper/4878-understanding-
dropout
2. Farhadi, A., Girshick, R., Redmon, J.: You only look once: unified, real-time object
detection. In: University of Washington. IEEE (2015)
3. Goodfellow, I., Bengio, Y., Courville, A., Bengio, Y.: Deep Learning, vol. 1. MIT
press, Cambridge (2016)
4. Jones, R.P.: Darkflow (2018). https://medium.com/@richardpricejones/darkflow-
9bdc9f9b818e
5. Marchiori, J.N.C.: Elementos de Dendrologia. UFSM, Santa Maria (1995)
6. Mayo, S.J., Remagnino, P.: How deep learning extracts and learns leaf features for
plant classification. Auton. Robot. 1–13 (2016)
7. Odemir, B., et al.: Leaf epidermis images for robust identification of plants. In:
Instituto de Fı́sica de São Carlos, pp. 2–10 (2017)
8. de Paulo Filho, P.L.: Reconhecimento de espécies florestais através de imagens
macroscópica. In: UFPR, pp. 9–48 (2012)
9. Pearlstein, L., Kim, M., Seto, M.: Convolutional neural network application to
plant detection, based on synthetic imagery. In: Proceedings - Applied Imagery
Pattern Recognition Workshop (2017). www.scopus.com
10. Pinheiro, A.L.: Fundamentos de taxonomia e dendrologia tropical. SIF, Santa
Maria (2000)
11. Pires, W.O.: Reconhecimento de espécies florestais utilizando técnicas de proces-
samento de imagem. In: 7º Seminário de Extensão e Inovação, Londrina (2017)
12. Shrivastava, P.: The role of corporations in achieving ecological sustainability.
Acad. Manag. Rev. 20(4), 936–960 (1995)
13. Szegedy, C., Liu, W., Jia, Y., et al.: Going deeper with convolutions. In: IEEE Conference
on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA. IEEE (2015)
14. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the incep-
tion architecture for computer vision. In: Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, pp. 2818–2826 (2016)
15. Vargas, A.C.G., Paes, A., Vasconcelos, N.: Um estudo sobre redes neurais convolu-
cionais e sua aplicação em detecção de pedestres. In: Universidade Federal
Fluminense, Niterói. IEEE (2015)
16. Wurtz, R.H.: Recounting the impact of Hubel and Wiesel. Pattern Recogn. 32, 1–20
(2009). https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718241/
17. Yalcin, H., Razavi, S.: Plant classification using convolutional neural networks. In:
2016 5th International Conference on Agro-Geoinformatics, Agro-Geoinformatics
2016 (2016). www.scopus.com
Deep Learning Recognition of a Large
Number of Pollen Grain Types
Fernando C. Monteiro(B), Cristina M. Pinto, and José Rufino
Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto
Politécnico de Bragança, Campus de Santa Apolónia, 5300-253 Bragança, Portugal
{monteiro,rufino}@ipb.pt
https://cedri.ipb.pt
Abstract. Pollen in honey reflects its botanical origin and melissopa-
lynology is used to identify origin, type and quantities of pollen grains
of the botanical species visited by bees. Automatic pollen counting and
classification can alleviate the problems of manual categorisation such as
subjectivity and time constraints. Despite the efforts made during the
last decades, the manual classification process is still predominant. One of
the reasons for that is the small number of types usually used in previous
studies. In this paper, we present a large study to automatically identify
pollen grains using nine state-of-the-art CNN techniques applied to the
recently published POLLEN73S image dataset. We observe that existing
published approaches used the original images without studying the possible
recognition bias due to the pollen's background colour or the use of prepro-
cessing techniques. Our proposal manages to classify up to 97.4% of the
samples from the dataset with 73 different types of pollen. This result,
which surpasses previous attempts in number and difficulty of pollen
types under consideration, is an important step towards fully automatic
pollen recognition, even with a large number of pollen grain types.
Keywords: Pollen recognition · Convolutional neural network · Deep
learning · Image segmentation
1 Introduction
Food fraud has devastating consequences, particularly in the field of honey pro-
duction, which the U.S. Pharmacopeia Fraud Database1
has classified as the
third largest area of adulteration, only behind milk and olive oil. Our aim is to
find solutions that help solve this problem and prevent its recurrence. The deter-
mination of the botanical origin can be used to label honey and the knowledge
of the geographic origin is a factor that influences considerably the commercial
value of the product and can be used for quality control and to avoid fraud [6].
Although demanding, pollen grain identification and certification are crucial
tasks, relevant to a variety of questions such as pollination or palaeobotany, but
1 https://decernis.com/solutions/food-fraud-database/.
also for other fields of research, including crime scene investigation [13], aller-
gology studies [7] as well as the botanical and geographical studies concerning
origins of honey to prevent honey labelling fraud [15]. However, most of the
pollen classification is a time consuming, laborious and a highly skilled work,
visually done by human operators using microscopes, trying to identify differ-
ences and similarities between pollen grains. These differences are, frequently,
imperceptible among pollen grains and may lead to identification errors.
Despite the efforts to develop approaches that allow the automatic iden-
tification of pollen grains [11,22], the discrimination of features performed by
qualified experts is still predominant [4]. Many industries, including medical,
pharmaceutical and honey marketing, depend on the accuracy of this manual
classification process, which is reported to be around 67% [19]. A notable
paper [22] from 1996 published a brief summary of the state of the art up to then
and, more importantly, of the demands and needs of palynology to elevate the field
to a higher level, thus making it a more powerful and useful tool.
Pattern recognition from images has a long and successful history. In recent
years, deep learning, and in particular Convolutional Neural Networks (CNNs),
has become the dominant machine learning approach in the computer vision
field, specifically, in image classification and recognition tasks. Since the number
of annotated pollen images in the publicly available datasets is too small to train
a CNN from scratch, transfer learning can be employed. In this paper we pro-
pose an automatic pollen recognition approach divided into three steps: initially,
the regions which contain pollen are segmented from the background; then, the
colour is preprocessed; finally, the pollen is recognized using deep learning.
Most object recognition algorithms focus on recognizing the visual patterns
in the foreground region of the images. Some studies indicate that convolutional
neural networks (CNN) are biased towards textures [3], whereas another set of
studies suggests a shape bias for classification tasks [2]. However, little atten-
tion has been given to analyzing how the recognition process is influenced by
background information present during training.
Considering that there are certain similarities between the layers of a trained
artificial network and the recognition task in the human visual cortex, in this
study, we hypothesize that if the collected images of a pollen type have a unique
background colour, different from all the other pollen types, this may bias the
recognition task, since recognition could be based only on the background colour.
In order to study this influence, we trained the CNNs with several datasets:
one composed of the original images, another composed of segmented images
(where the background colour was eliminated), and others composed of segmented
images preprocessed with histogram equalization and with contrast-limited
adaptive histogram equalization (CLAHE).
Image acquisition usually involves different sources, resulting in images with
different backgrounds, as shown in Fig. 1 for the POLLEN73S2 dataset [1] used
in this study. Deep learning based pollen recognition methods focus on learning
visual features to distinguish different pollen grains. We observed that existing
published approaches used the entire image, with the original background, in
the training process. Background and foreground pixels in each image therefore
exert the same influence on the learning algorithm. Since each pollen type has a
background different from the other types, networks trained in this way may
capture relevance from the pollen's background when classifying, which may
result in biased recognition.
2 https://doi.org/10.6084/m9.figshare.12536573.v1.
Fig. 1. Pollen dataset samples acquired with different background colours.
In this paper, we investigate the influence of the background and of colour
preprocessing by training nine state-of-the-art deep learning convolutional neural
networks for pollen recognition. We used the recently published POLLEN73S image
dataset, which includes more than three times as many pollen types and images as
the POLEN23E dataset used in recent studies. Our approach manages to classify
up to 97.4% of the samples from this dataset with 73 different types of pollen.
The remainder of this paper is organized as follows: Previous related works
are presented and reviewed in Sect. 2. In Sect. 3, we describe the used materials
and the proposed method. Section 4 presents the results and the discussion of
the findings. Finally, some conclusions are drawn in Sect. 5.
2 Related Works
Automatic and semi-automatic systems for pollen recognition based on image
features, in particular neural networks and support vector machines, have been
proposed for a long time [9,11,17,21]. In general terms, those approaches extract
some feature characteristics to identify each pollen type.
Although the classification remains based on a combination of image features,
the deep learning CNN approach builds a model that determines and extracts
the features itself, instead of having them predefined by human experts. Several
CNN learning techniques have been developed for classifying pollen grain images
[1,8,18,19]. In [8], Daood et al. present an approach that learns from the image
features and defines the model classifier from a deep learning neural network.
This method achieved a 94% classification rate on a dataset of 30 pollen types.
Sevillano and Aznarte in [18] and [19] proposed pollen classification methods
that applied transfer learning to the POLEN23E dataset and to a dataset with 46
different pollen types, achieving accuracies of over 95% and 98%, respectively. In
[1], Astolfi et al. presented the POLLEN73S dataset and made an extensive study
with several CNNs, achieving an accuracy of 95.8%. Despite the importance of
their study, we identify two drawbacks in their approach that influenced the
performance: they used a different number of samples for each pollen type and,
for each pollen type, used an image background that differs from the image
backgrounds of the other pollen types.
3 Experimental Setup
3.1 Pollen Dataset
The automation of pollen grain recognition depends on large image datasets with
many samples categorized by palynologists. The results obtained depend on the
number of pollen types and the number of samples used. Too few samples may
result in poor learning models that are not sufficient to properly train the CNN;
on the other hand, a small number of pollen types simplifies the identification
process, making the resulting system impractical for recognizing the large number
of pollens usually found in a honey sample.
While a number of earlier datasets have been used for pollen grain classifi-
cation, such as the POLEN23E3 dataset [9] or the Pollen Grain Classification
Challenge dataset4, which contain 805 (23 pollen types) and 11,279 (4 pollen
types) pollen images, respectively, in this paper we use POLLEN73S, which is
one of the largest publicly available datasets in terms of number of pollen types.
POLLEN73S is an annotated public image dataset for Brazilian Savannah
pollen types. According to its description in [1], the dataset includes pollen
grain images taken with a digital microscope at different angles and manually
classified into 73 pollen types, with 35 sample images per pollen type, except
for gomphrena sp, trema micrantha and zea mays, which have 10, 34 and 29
samples, respectively. From the results presented in [1], we observed that these
smaller numbers of samples biased the results: since the CNNs were trained with
fewer samples for those pollen types, they obtained the worst classification
scores relative to the other pollens. To overcome this problem, in our study
several images were generated by rotating and scaling the original images of
these pollen types, ensuring the same number of samples for each pollen type,
giving a total of 2,555 pollen images. Although the images in the dataset
3 https://academictorrents.com/details/ee51ec7708b35b023caba4230c871ae1fa254ab3.
4 https://iplab.dmi.unict.it/pollenclassificationchallenge/.
have different widths and heights, they were resized according to the input
image size of each CNN architecture.
Additional datasets were constructed by removing the pollen's background colour
(see Fig. 2). Since the image background has medium contrast with the pollen
grains, the segmentation process uses only automatic thresholding and morpho-
logical operations. We also applied histogram equalization and contrast-limited
adaptive histogram equalization (CLAHE) to those segmented images. These
new datasets make the training and testing processes independent of the
background colour differences among the pollen types.
Fig. 2. First column: original images; second column: segmented images with back-
ground colour removed; third column: segmented equalized images; fourth column:
segmented CLAHE images.
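The paper does not list the exact operations used; the sketch below shows one plausible OpenCV implementation of the described pipeline (Otsu thresholding plus morphological closing, then CLAHE on the luminance channel), assuming the grain is darker than its background. All parameter values are chosen purely for illustration.

import cv2

def segment_and_clahe(image_bgr):
    """Remove the background by automatic thresholding and morphology,
    then apply CLAHE to the segmented grain (illustrative parameters)."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # Automatic (Otsu) thresholding separates the grain from the background.
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # Morphological closing fills small holes inside the grain mask.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Keep only the foreground pixels; the background becomes black.
    segmented = cv2.bitwise_and(image_bgr, image_bgr, mask=mask)
    # CLAHE applied to the luminance channel of the segmented image.
    lab = cv2.cvtColor(segmented, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_LAB2BGR)
    return segmented, enhanced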
3.2 Convolutional Neural Networks Architectures
A CNN is a type of deep learning model for processing images that is inspired by
the organization of the human visual cortex and is designed to automatically
create and learn feature hierarchies through back-propagation, using multiple
layer blocks, such as convolution layers, pooling layers, and fully connected
layers, from low- to high-level patterns [2]. This technology is especially suited
for image processing, as it uses hidden layers to convolve learned features with
the input data. The automatic extraction of the most discriminant features from a
set of training images, eliminating the need for handcrafted feature extraction,
became the main strength of CNN approaches.
In this section, we present an overview of the main characteristics of the
CNNs used in this study for the recognition of pollen grain types. We chose
nine popular CNN architectures due to their performance in previous classifica-
tion tasks. Table 1 contains a chronologically sorted list of these state-of-the-art
CNN architectures, along with a high-level description of how their building blocks
are combined and how information flows through each architecture.
3.3 Transfer Learning
Practical constraints, such as the limited size of the training data, degrade
the performance of CNNs trained from scratch [18]. Since a large amount of work
has already been done on image recognition and classification [10,12,20], in
this study we used transfer learning to solve our problem. With transfer learning,
instead of starting the learning process from scratch, with a large number of
samples, we can reuse patterns previously learned when solving a similar
classification problem.
Transfer learning is a technique whereby a CNN model is first trained on
a large image dataset with a goal similar to the problem being solved.
Several layers from the trained model, usually the lower layers, are then used in
a new CNN, trained with sample images from the current task. This way, the
learned features in the re-used layers are the starting points for the training
process and are adapted to classify new types of objects. Transfer learning has
the benefit of reducing the training time of a CNN model and can mitigate the
generalization error caused by the small number of images that would otherwise
be available when training a network from scratch.
The previously obtained weights in each layer may be used as starting values
and adapted in response to the new problem. This usage treats transfer learning
as a type of weight initialization scheme, which is useful when the first, related
problem has much more labelled data than the problem of interest and the two
problems share a similar structure.
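As a concrete illustration of this scheme (the authors used the MatConvNet package for MATLAB; the Keras sketch below is ours, and the backbone choice, input size and layer values are illustrative), a pre-trained backbone is loaded without its classification head, its lower layers are kept frozen, and a new 73-way classifier is trained on the pollen images.

import tensorflow as tf

base = tf.keras.applications.DenseNet201(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # reuse the lower layers as fixed feature extractors

inputs = tf.keras.Input(shape=(224, 224, 3))
x = base(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.5)(x)                      # dropout rate from the paper
outputs = tf.keras.layers.Dense(73, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=1e-4, momentum=0.9),  # SGDM
    loss="categorical_crossentropy",
    metrics=["accuracy"])
# Illustrative training call, matching the batch size and epochs reported below:
# model.fit(train_images, train_labels, batch_size=12, epochs=30, validation_split=0.3)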
3.4 Training Process
In the training process, the CNNs use a fine-tuning strategy with the stochastic
gradient descent with momentum (SGDM) optimizer at its default values, a
dropout rate of 0.5 and early stopping to prevent over-fitting, and a learning rate
of 0.0001. SGDM is used to accelerate the gradient vectors in the right directions:
not all training examples need to be examined to estimate the direction of steepest
descent, which leads to faster convergence. Additionally, to consume less memory
and train the CNNs faster, we used a batch size of 12, which updates the network
weights more often, and trained the networks for 30 epochs. All images go through
heavy data augmentation
Table 1. Chronological list and descriptions of CNN architectures used in this paper.
VGG 16/19 [20] Introduced the idea of using smaller filter kernels allowing
for deeper networks, and training these networks using pre-
training outputs of superficial layers. They have five convolu-
tional blocks where the first two have convolution layers and
one max-pooling layer in each block. The remaining three
blocks have three fully-connected layers equipped with the
rectification non-linearity (ReLU) and the final softmax layer
ResNet 50/101 [10] Shares some design similarities with the VGG architectures.
The batch normalization is used after each convolution layer
and before activation. These architectures introduce the
residual block that aims to solve the degradation problem
observed during network training. In the residual block, the
identity mapping is performed, creating the input for the next
non-linear layer, from the output of the previous layer
Inception-V3 [24] This network has three inception modules where the resulting
output of each module is the concatenation of the outputs
of three convolutional filters with different sizes. The goal
of these modules is to capture different visual patterns of
different sizes and approximate the optimal sparse structure.
Finally, before the final softmax layer, an auxiliary classifier
acts as a regularization layer
Inception-ResNet [23] Uses the combination of residual connections and the Incep-
tion architecture. In Inception networks the gradient is back-
propagated to earlier layers, and repeated multiplication may
make the gradient indefinitely small, so they replaced filter
concatenation stage with residual connections as in ResNet
Xception [5] The architecture is composed of three blocks, in a sequence,
where convolution, batch normalization, ReLU, and max
pooling operations are made. Besides, the residual connec-
tions between layers are made as in Resnet architecture
DenseNet201 [12] Is based on the ideas of ResNet, but built from dense blocks
and pooling operations, where each dense block is an iterative
concatenation from all previous layers. In the main blocks,
the layers are densely connected to each other. Massive reuse
of residual information allows for deep supervision as each
layer receives more information from the previous layer and
therefore the loss function will react accordingly, which makes
it a more powerful network
DarkNet53 [16] It is 53 layers deep and acts as the backbone for the YOLOv3
object detection approach. This network uses successive con-
volutional layers with some shortcut connections (introduced
by ResNet to help activations propagate through deeper
layers without gradient vanishing) to improve the learning
ability. Batch normalization is used to stabilize training,
speed up convergence, and regularize the model
which includes horizontal and vertical flipping, random rotation up to 360°, a
rescaling factor between 70% and 130%, and horizontal and vertical translations
between −20 and +20 pixels. The CNNs were trained using the MatConvNet
package for MATLAB on a node of the CeDRI cluster with two NVIDIA RTX
2080 Ti GPUs.
As in [1], we used 5-set cross-validation, where in each set the images were
split into two subsets, 70% (1,825 images) for training and 30% (730 images) for
testing, allowing the CNNs to be independently trained and tested on different
sets. Since each test set is built with images not seen by the trained model, this
allows us to anticipate the CNN behaviour on new images. The four datasets
(original, segmented, segmented with equalization and segmented with CLAHE)
were trained and tested independently.
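The repeated stratified 70/30 splits can be reproduced roughly as follows (a sketch only; the random seed and resulting split indices are not those used by the authors):

import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

# Placeholder data: 2,555 image indices and their 73-type labels (35 per type).
n_types, samples_per_type = 73, 35
image_ids = np.arange(n_types * samples_per_type).reshape(-1, 1)
labels = np.repeat(np.arange(n_types), samples_per_type)

# Five independent 70%/30% stratified splits (730 test images, 10 per type).
splitter = StratifiedShuffleSplit(n_splits=5, test_size=730, random_state=0)
for fold, (train_idx, test_idx) in enumerate(splitter.split(image_ids, labels)):
    print(f"set {fold + 1}: {len(train_idx)} training / {len(test_idx)} test images")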
4 Results and Discussion
Other works use different evaluation metrics, such as precision, recall and F1-score
[1] or the correct classification rate (CCR) [18]. However, those metrics rely on the
concepts of true negatives and false negatives. Since in this type of experiment we
only obtain true positives or false positives, we evaluate the results with accuracy
(precision gives the same score), which relates the true positives to all classified
samples.
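In this single-label setting every test image yields exactly one prediction, so the metric used here (our notation) reduces to the fraction of correctly classified test images:

Accuracy = TP/(TP + FP) = (correctly classified test images)/(total test images)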
The evaluation results for the nine CNN architectures considered, with dif-
ferent colour pre-processing techniques, are presented in Table 2. The numbers
exhibited in bold indicate the best Accuracy result obtained for each network.
Table 2. Classification results (in percentage) on the test set for the different CNNs
and preprocessing techniques considered.
CNN Original Segmented Seg. Equal. Seg. CLAHE
VGG16 94.2 94.3 88.7 92.6
VGG19 91.9 92.4 92.1 92.1
ResNet50 95.1 95.2 93.2 94.8
ResNet101 94.5 94.6 93.0 95.2
Inception V3 92.3 92.8 94.0 90.8
Inception-ResNet 92.3 89.9 92.4 90.3
Xception 89.7 87.8 87.5 88.1
DenseNet201 96.7 97.4 96.6 95.9
DarkNet53 95.8 95.9 95.1 94.9
Based on the results in Table 2, we can conclude that segmenting the pollen
grain images improves the classification performance for the majority of the CNN
models, allowing DenseNet201 to achieve an accuracy of 97.4%. Only the Xception
network produces better results with the original images. The Inception
architectures achieve their best performance with segmented, histogram-equalized
images. The remaining architectures achieved their highest performance when
using segmented images without any colour processing.
DenseNet201 correctly classified all the images of 65 of the 73 pollen grain types
in the dataset. For the remaining types, it misclassified at most two images each,
with a total of eleven false positives among the 730 tested images. The lowest
accuracy of DenseNet201 was obtained for the pollen types dipteryx alata and
myrcia guianensis. These pollen types have predominantly rounded shapes and
strong texture, characteristics that are normally learned in the first CNN layers.
Since the transfer learning process changes only a set of the last CNN layers, it
does not alter those previously learned features during training with our images,
producing some misclassifications.
The accuracy rates achieved by the DenseNet201 network are relevant given the
number of pollen types in the POLLEN73S dataset, since Sevillano et al. [19]
obtained a higher accuracy on a dataset containing only 46 pollen types. This
shows that DenseNet201 performs remarkably well on POLLEN73S.
The network trained and tested with the segmented images produced false
positives that were misclassified as pollens highly similar to the tested ones.
Figure 3 shows some of those false positive examples.
Fig. 3. First row: segmented tested pollens (magnolia champaca, myrcia guianen-
sis, dipteryx alata, arachis sp); second row: misclassified pollens (ricinus communis,
schizolobium parahyba, zea mays, myracroduon urundeuva).
In the networks trained and tested with segmented images, the background colour
bias was removed, so the pollen is classified using only the pollen grain infor-
mation. This corrects some of the false positives of the network trained with the
original images, where the background colour was used as a feature in the classi-
fication process.
The high values of the evaluation metric for all CNNs show that the number
of correctly identified pollens is high compared to the number of tested
Table 3. Comparison with previous attempts at pollen classification of more than 20
pollen types using a CNN classifier, with number of types and the highest reported
accuracy.
Authors #Types Accuracy (%)
Sevillano et al. [18] 23 97.2
Menad et al. [14] 23 95.0
Daood et al. [8] 30 94.0
Sevillano et al. [19] 46 97.8
Astolfi et al. [1] 73 95.8
Our approach 73 97.4
images. We believe that an accuracy above 97% is enough to build an automatic
pollen grain classification system, since the visual classification performed by
human operators is a hard and time-consuming task with lower performance.
4.1 Comparison with Other Studies
We compared our results with other automatic approaches from the current
literature that used a CNN classifier. Previous deep learning approaches have
shown accuracy rates similar to or higher than ours, but those studies were
conducted with a smaller number of pollen types. Table 3 summarizes previous
studies, including the number of pollen types and the highest reported accuracy,
against our result. All the reviewed literature, except [1], used an image dataset
significantly smaller, in terms of pollen types, than the one used in this paper.
Although the work of Sevillano et al. [19], with forty-six pollen types, achieved a
slightly higher performance than our study, the number of pollen types is directly
related to the classification performance of the CNNs, so the results must be
evaluated taking into account this difference in the number of pollen types between
the work presented in [19] and ours.
In short, it can be concluded that training a network with the attention focused
on the object itself, by removing the background dissimilarities, can improve the
performance of a CNN model on the pollen classification problem.
5 Conclusion
The usual method for pollen grain identification is a qualitative approach, based
on the discrimination of pollen grain characteristics by a human operator. Even
though this manual method is quite effective, the whole process is time-consuming,
laborious and sometimes subjective. Creating an automatic approach that identifies
the grains precisely is therefore a task of utmost interest.
In this study, an automated pollen grain recognition approach is proposed.
We investigated the influence of background colours and colour preprocessing on
the recognition task using nine state-of-the-art CNN topologies. Using a combi-
nation of an image-processing workflow and a sufficiently trained deep learning
model, we were able to recognize pollen grains from seventy-three pollen types,
one of the largest numbers of pollen types studied so far, achieving an accuracy
of 97.4%, which represents one of the best success rates reported to date (when
weighted by the number of pollen types used in this work).
This study shows that deep learning CNN architectures, combined with a transfer
learning approach, achieve good classification results on the pollen grain recog-
nition task. In the future, we plan to combine the features of several CNNs to
further enhance the effectiveness of deep learning approaches in pollen grain
recognition.
References
1. Astolfi, G., et al.: POLLEN73S: an image dataset for pollen grains classification.
Ecol. Inform. 60, 101165 (2020)
2. Baker, N., Lu, H., Erlikhman, G., Kellman, P.J.: Local features and global shape
information in object classification by deep convolutional neural networks. Vis.
Res. 172, 46–61 (2020)
3. Bianco, S., Cusano, C., Napoletano, P., Schettini, R.: Improving CNN-based tex-
ture classification by color balancing. J. Imaging 3(33) (2017)
4. Buters, J., et al.: Pollen and spore monitoring in the world. Clin. Transl. Allergy
8(9) (2018)
5. Chollet, F.: Xception: deep learning with depthwise separable convolutions. In:
2017 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1800–
1807 (2017)
6. Corvucci, F., Nobili, L., Melucci, D., Grillenzoni, F.V.: The discrimination of honey
origin using melissopalynology and raman spectroscopy techniques coupled with
multivariate analysis. Food Chem. 169, 297–304 (2015)
7. D’Amato, G., et al.: Allergenic pollen and pollen allergy in Europe. Allergy 62(9),
976–990 (2007)
8. Daood, A., Ribeiro, E., Bush, M., et al.: Pollen grain recognition using deep learn-
ing. In: Bebis, G. (ed.) ISVC 2016. LNCS, vol. 10072, pp. 321–330. Springer, Cham
(2016). https://doi.org/10.1007/978-3-319-50835-1 30
9. Gonçalves, A.B., et al.: Feature extraction and machine learning for the classifica-
tion of Brazilian savannah pollen grains. PLoS ONE 11(6), e0157044 (2016)
10. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition.
In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
pp. 770–778 (2016)
11. Holt, K.A., Bennet, K.D.: Principles and methods for automated palynology. New
Phytol. 203(3), 735–742 (2014)
12. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected
convolutional networks. In: IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), pp. 2261–2269 (2017)
13. Laurence, A.R., Bryant, V.M.: Forensic Palynology, pp. 1741–1754. Springer, New
York, NY (2014)
14. Menad, H., Ben-Naoum, F., Amine, A.: Deep convolutional neural network for
pollen grains classification. In: JERI (2019)
15. Ponnuchamy, R., et al.: Honey pollen: using melissopalynology to understand for-
aging preferences of bees in tropical south India. PLoS ONE 9(7), e101618 (2014)
16. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv preprint
arXiv:1804.02767 (2018)
17. Rodriguez-Damian, M., Cernadas, E., Formella, A., Fernandez-Delgado, M., Pilar
De Sa-Otero: Automatic detection and classification of grains of pollen based on
shape and texture. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 36(4),
531–542 (2006)
18. Sevillano, V., Aznarte, J.L.: Improving classification of pollen grain images of the
POLEN23E dataset through three different applications of deep learning convolu-
tional neural networks. PLoS ONE 13(9), e0201807 (2018)
19. Sevillano, V., Holt, K., Aznarte, J.L.: Precise automatic classification of 46 differ-
ent pollen types with convolutional neural networks. PLoS ONE 15(6), e0229751
(2020)
20. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale
image recognition. In: International Conference on Learning Representations, pp.
1–14 (2015)
21. Sobol, M.K., Finkelstein, S.A.: Predictive pollen-based biome modeling using
machine learning. PLoS ONE 13(8), e0202214 (2018)
22. Stillman, E., Flenley, J.: The needs and prospects for automation in palynology.
Quat. Sci. Rev. 15(1), 1–5 (1996)
23. Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, Inception-ResNet and
the impact of residual connections on learning. In: Proceedings of the Thirty-First
AAAI Conference on Artificial Intelligence, pp. 4278–4284 (2017)
24. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the incep-
tion architecture for computer vision. In: 2016 IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), pp. 2818–2826 (2016)
Predicting Canine Hip Dysplasia
in X-Ray Images Using Deep Learning
Daniel Adorno Gomes1, Maria Sofia Alves-Pimenta1,2, Mário Ginja1,2(B),
and Vitor Filipe1,3
1 University of Trás-os-Montes and Alto Douro, 5000-801 Vila Real, Portugal
mginja@utad.pt
2 CITAB - Centre for the Research and Technology of Agro-Environmental
and Biological Sciences, Vila Real, Portugal
3 INESC TEC - INESC Technology and Science, 4200-465 Porto, Portugal
Abstract. Convolutional neural networks (CNN) and transfer learning
are receiving a lot of attention because of the positive results achieved
on image recognition and classification. Hip dysplasia is the most preva-
lent hereditary orthopedic disease in the dog. The definitive diagnosis is
using the hip radiographic image. This article compares the results of the
conventional canine hip dysplasia (CHD) classification by a radiologist
using the Fédération Cynologique Internationale criteria and the com-
puter image classification using the Inception-V3, Google’s pre-trained
CNN, combined with the transfer learning technique. The experiment’s
goal was to measure the accuracy of the model in classifying normal and
abnormal images, using a small dataset to train the model. The results
were satisfactory, considering that the developed model classified 75% of
the analyzed images correctly. However, some improvements are desirable
and could be achieved in future work by developing software to select
regions of interest around the hip joints and evaluating each hip individually.
Keywords: Canine hip dysplasia · CHD · Image recognition · CNN ·
Convolutional neural network · Artificial neural network ·
Inception-V3 · Artificial intelligence · Machine learning
1 Introduction
Canine hip dysplasia (CHD) is a hereditary disease that mainly affects dogs of
large and giant breeds [1]. The disease begins with pain, and often a dog that used
to run and jump can no longer get up [2]. Genetic and environmental factors, such
as an excessive growth rate, obesity, and excessive or inadequate exercise, can
contribute to the development of hip dysplasia in dogs [3]. The hip joint works like
a ball and socket. Dogs with hip dysplasia develop osteoarthritis and the ball and
socket no longer fit properly [4]. Instead of sliding smoothly, they rub and grind,
causing discomfort and a gradual loss of function. Diagnostic radiography is the
main method used worldwide for screening hip dysplasia in dogs for
breeding or management purposes [5]. Currently, radiographic analysis of CHD
is performed by veterinary radiologists using different scoring schemes, of which
we highlight: the Fédération Cynologique Internationale (FCI) system, most
commonly used in continental European countries, with five grades (normal,
borderline and three dysplastic grades); the Orthopaedic Foundation for Animals
(OFA) guidelines, commonly used in the United States, with seven grades (three
normal, borderline and three dysplastic); and the British Veterinary Association/
Kennel Club scoring scheme, commonly used in the United Kingdom, which
attributes disease points, with the final score ranging from 0 to 106 [4]. In general,
all these assessment schemes are considered subjective, and radiographic evaluation
by more than one examiner is sometimes recommended, with agreement between
the categories of different evaluators in the order of 50–60% [6]. Figure 1 shows
examples of normal (a) and CHD (b) X-ray images.
Fig. 1. Examples of normal (a) and CHD (b) X-ray images.
Advances in artificial intelligence, machine learning, and image recognition and
classification, combined with radiologic equipment, can be useful and help to
improve the diagnosis of this kind of disease. Convolutional Neural Networks
(CNN) are the architecture behind the most recent advances in computer vision.
They are currently providing solutions in many areas, such as image recognition,
speech recognition, and natural language processing. Through this program-
ming paradigm, a computer can learn from observational data. In practice, this
resource is allowing companies to create a wide range of applications, such as
autonomous vehicle vision, automatic inspection, anomaly detection in medical
imaging, and facial recognition, used for example to unlock a mobile phone [7].
A CNN, also known as a ConvNet, is a specific type of artificial neural network
used mainly in image recognition and natural language processing. CNNs combine
deep learning algorithms with a multi-layer perceptron network that consists of an
input layer, an output layer, and hidden layers that include convolutional,
normalization, pooling and fully connected layers, using mathematical models to
pass the results on to successive layers. The complexity of the objects that a CNN
can recognize increases with the filters learned by each layer.
The first layers learn basic feature detection filters, such as corners and edges. The
middle layers learn filters that identify parts of objects; for dogs, for example,
they learn to recognize eyes, snouts, and noses. The last layers learn to identify
complete objects, for instance differentiating a dog from a cat or identifying a
dog's breed [8]. Pre-trained CNNs such as AlexNet, VGGNet, Inception, and
Xception, combined with a technique called transfer learning, have been used to
build accurate models in a time-saving way using small datasets [9]. Transfer
learning is a deep learning approach in which knowledge is transferred from one
model to another. By applying this technique, it is possible to solve a given
problem using all or part of a pre-trained model developed to solve a different
problem [10].
In this paper, the CHD scores attributed by a radiologist using the FCI criteria
were compared with the computer image classification obtained with Inception-V3,
Google's pre-trained CNN [11], combined with the transfer learning technique.
The main aim of the work was to measure the accuracy of the computer model in
classifying normal and CHD images. Section 2 presents the details of the materials
and methods applied in the experiment. Section 3 shows the obtained results.
Section 4 discusses the results and the performance of the model, and makes a
few final remarks.
2 Methods
In this study, a pre-trained network, Google's Inception-v3, was used; it was
trained on more than a million images from the ImageNet database [12–14].
This CNN is 48 layers deep and can classify images into 1000 object categories,
such as chairs, pens, keyboards, animals, and cars. The network has an input
image size of 299 × 299 pixels. Figure 2 shows how the architecture of the
Inception-V3 is structured.
The chosen pre-trained CNN, Inception-v3, was further trained to identify charac-
teristics of normal and CHD hips in X-ray images. This added a new layer to the
network, capable of carrying out this type of identification. This very popular
technique, called transfer learning, permits training a CNN on a new task using a
smaller number of training images. It consists of reusing a pre-trained network as
a starting point to learn a new domain. The rich feature representations already
learned from a wide range of images and the pre-trained layers are used to
accelerate the process, while the network learns the new features that make up
the new layer [15]. In practice, this new layer is the last layer, and only it is
trained on the new domain to classify which X-ray images are CHD positive or
negative.
The dataset used in our experiments was provided by the Department of
Veterinary Science of the University of Trás-os-Montes and Alto Douro, Portugal.
It was used for training and testing the new layer of the CNN. The dataset
Fig. 2. The architecture of the Inception V3. Extracted from [11].
contains a total of 225 digital X-ray images (1760 × 2140 pixels, 1-channel gray
scale, 8-bit depth per channel, JPG format) obtained using the conventional
ventrodorsal hip-extended view. They were classified based on the FCI criteria
by a veterinary radiologist: 125 with hip dysplasia signs and 100 as normal.
The dataset was divided into 73% for the training set (165 images) and 27%
(60 images) for the test set, based on the protocol proposed in [16], which
suggests 70% for training and 30% for testing.
Table 1 shows how the dataset was structured to execute the training and
the test phases of the transfer learning process.
Table 1. Dataset split.
Type of image Training Test Total
Hip dysplasia 94 31 125
Normal 71 29 100
Total of images 165 60 225
The transfer learning technique was applied to the CNN using the Tensorflow [17]
deep learning framework and the Python programming language, version 3.6. The
model was trained and tested on a virtual machine created with Oracle VirtualBox
5.2.2 and configured with the Linux Mint 19.1 operating system, an Intel(R)
Core(TM) i7-5500U CPU @ 2.40 GHz processor, 8 GB of RAM and an NVIDIA
GeForce 920M GPU.
The training process ran for a total of 4,000 iterations, the default value defined
by the Tensorflow framework. During the training process, three metrics were
used: training accuracy, validation accuracy, and cross-entropy. The training
accuracy achieved was 100%; it represents the percentage of the images used in
the training process that were labeled with the correct category. The cross-entropy
is a loss function that represents the total loss to be minimized
during the training process. The total loss reached a minimum of 0.038072. The
validation accuracy represents the percentage of images, randomly selected and
not used to train the model, that were correctly labeled; the validation accuracy
achieved was 83.8%. The final test accuracy was 78.2%, meaning that the model
makes a correct prediction roughly 7 to 8 times out of 10.
After training the model with 165 images, it was evaluated on the test set,
which includes 60 images unseen in the training phase. This is the classification
phase, in which new samples are evaluated against the new model and a prediction
is generated. In practice, a script was executed for each of the 60 X-ray sample
images to classify it as normal or CHD based on the new model. This script is a
small Python program that carries out the classification in five steps: it first loads
the image to be classified; it then loads the defined labels used to classify the
images, in this case normal and CHD; next, it retrieves the model generated in
the training phase, runs the new image through it and, finally, produces a
prediction.
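A minimal sketch of such a script is shown below. It assumes the retrained model was exported in Keras format and that the two labels are stored one per line in a text file; the file names, paths and preprocessing are illustrative, not the authors' exact TensorFlow 1.x setup.

import sys
import numpy as np
import tensorflow as tf

IMG_SIZE = (299, 299)  # Inception-v3 input size

def classify(image_path, model_path="chd_model.h5", labels_path="labels.txt"):
    # Step 1: load the image to classify and resize it to the network input.
    img = tf.keras.preprocessing.image.load_img(image_path, target_size=IMG_SIZE)
    x = tf.keras.preprocessing.image.img_to_array(img)[np.newaxis, ...] / 255.0
    # Step 2: load the class labels (e.g. "normal", "chd").
    with open(labels_path) as f:
        labels = [line.strip() for line in f]
    # Step 3: retrieve the model produced in the training phase.
    model = tf.keras.models.load_model(model_path)
    # Steps 4-5: run the image through the model and report the prediction.
    probs = model.predict(x)[0]
    best = int(np.argmax(probs))
    return labels[best], float(probs[best])

if __name__ == "__main__":
    label, score = classify(sys.argv[1])
    print(f"{label} ({score:.2%})")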
The performance of the classification algorithm can be visualized in a confu-
sion matrix. It is also possible to calculate the accuracy of the algorithm through
Eq. 1, based on the values of the confusion matrix [18]. Both the performance
and the accuracy data are shown in the results section.
Accuracy = (TP + TN)/(TP + TN + FP + FN) (1)
3 Results
The confusion matrix in Fig. 3 was built from the model’s predictions obtained
on the test set.
Fig. 3. Obtained results applied in a confusion matrix.
The correct predictions lie on the diagonal of the table, highlighted in yellow.
The model correctly classified 26 images as CHD and 19 as normal. Of the 31
actual CHD X-rays, the algorithm predicted that 5 were normal, and of the 29
normal images, it predicted that 10 were CHD. According to the numbers in
Fig. 3, the accuracy of the proposed model is (26 + 19)/(26 + 10 + 5 + 19) = 75%,
considering the traditional default threshold of 0.5, due to the binary nature of
the experiment [19]. In total, 45 X-ray images were assigned the correct category:
26 representing animals with hip dysplasia and 19 with normal status.
The other 15 X-ray images evaluated in the test phase were incorrectly classified
by the model, corresponding to 10 normal animals and 5 with canine hip dysplasia.
Figure 4 shows two X-ray image samples correctly classified by the model.
Fig. 4. Samples correctly classified by the model in the test phase.
4 Discussion and Final Remarks
This paper presented a classification of canine X-ray images for hip dysplasia
using a pre-trained convolutional neural network combined with the transfer
learning technique. For part of the sample images, the algorithm used in this
experiment was able to achieve classification results similar to those obtained
by a veterinary radiologist using the FCI criteria.
Figure 3 shows, as a confusion matrix, the results obtained by the new layer
added to the pre-trained CNN. The new model achieved a diagnostic accuracy of
75% relative to the diagnosis provided by the veterinary radiologist.
Our results show that binary image classification performed by a pre-trained
CNN using the transfer learning technique works well on small X-ray image
datasets. The model was able to learn how to identify CHD in X-ray images
using just 165 samples.
It is important to highlight that only 40 images were obtained by digital means;
the remaining 185 images were taken by traditional means, using film. In the
experiment, digitized versions of these X-ray films were used to reach the minimum
number of samples needed to train the model.
The results demonstrate that the adopted approach works properly for CHD
classification, considering that the dataset was small. The achieved performance
can be improved by using larger datasets to train the model. The model can also
be improved to classify the images into sub-classes of canine hip dysplasia, by
selecting regions of interest around the hip joints and evaluating each hip
individually. This work can be extended in the future by running the same
experiment with more recent CNN architectures, such as ResNet, ResNeXt, and
DenseNet, to compare their performance.
Acknowledgments. This work was supported by National Funds by FCT - Por-
tuguese Foundation for Science and Technology, under the projects UIDB/04033/2020
and Scientific Employment Stimulus - Institutional Call - CEECINST/00127/2018
UTAD.
References
1. Ginja, M.M., et al.: Early hip laxity examination in predicting moderate and severe
hip dysplasia in Estrela mountain dog. J. Small Anim. Pract. 49(12), 641–646
(2008)
2. Loder, R.T., Todhunter, R.J.: The demographics of canine hip dysplasia in the
United States and Canada. J. Vet. Med. 2017, 5723476 (2017). https://doi.org/
10.1155/2017/5723476. PMID: 28386583
3. Kimeli, P., et al.: A retrospective study on findings of canine hip dysplasia screening
in Kenya. Vet. World 8(11), 1326–1330 (2015). https://doi.org/10.14202/vetworld.
2015.1326-1330
4. Ginja, M.M.D., Silvestre, A.M., Gonzalo-Orden, J.M., Ferreira, A.J.A.: Diagnosis,
genetic control and preventive management of canine hip dysplasia: a review. Vet.
J. 184(3), 269–276 (2010). https://doi.org/10.1016/j.tvjl.2009.04.009
5. Butler, R., Gambino, J.: Canine hip dysplasia: diagnostic imaging. Vet. Clin. North
Am. Small Anim. Pract. 47(4), 777–793 (2017)
6. Smith, G.K., Gregor, T.P., McKelvie, P.J., O’Neill, S.M., Fordyce, H., Pressler,
C.R.K.: PennHIP Training Seminar and Reference Material. Synbiotics Corpora-
tion, San Diego (2002)
7. Tian, Y., Jana, S., Pei, K., Ray, B.: DeepTest: automated testing of deep-neural-
network-driven autonomous cars. In: IEEE/ACM 40th International Conference
on Software Engineering (ICSE), pp. 303–314 (2018). https://doi.org/10.1145/
3180155.3180220
8. Rawat, W., Wang, Z.: Deep convolutional neural networks for image classification:
a comprehensive review. Neural Comput. 29(9), 2352–2449. MIT Press Journals
(2017). https://doi.org/10.1162/NECO a 00990
9. Aloysius, N., Geetha, M.: A review on deep convolutional neural networks. In:
International Conference on Communication and Signal Processing, pp. 588–592
(2017). https://doi.org/10.1109/ICCSP.2017.8286426
10. Sarkar, D., Bali, R., Ghosh, T.: Hands-On Transfer Learning with Python. Packt
Publishing, Birmingham (2018)
11. Advanced Guide to Inception v3 on Cloud TPU Homepage. https://cloud.google.
com/tpu/docs/inception-v3-advanced. Accessed 25 April 2021
12. ImageNet Homepage. http://www.image-net.org. Accessed 25 March 2021
13. Deng, J., Dong, W., Socher, R., Li, L., Li, K., Fei-Fei, L.: ImageNet: a large-scale
hierarchical image database. In: IEEE Conference on Computer Vision and Pattern
Recognition, pp. 248–255 (2009). https://doi.org/10.1109/CVPR.2009.5206848
14. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep con-
volutional neural networks. In: Proceedings of the 25th International Conference
on Neural Information Processing Systems - Volume 1 (NIPS 2012), pp. 1097–1105.
Curran Associates Inc., Red Hook, NY, USA (2012)
15. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the incep-
tion architecture for computer vision. In: Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, pp. 2818–2826 (2016). http://arxiv.
org/abs/1512.00567
16. Spanhol, F.A., Oliveira, L.S., Petitjean, C., Heutte, L.: A dataset for breast cancer
histopathological image classification. IEEE Trans. Biomed. Eng. 63(7), 1455–1462
(2016). https://doi.org/10.1109/TBME.2015.2496264
17. TensorFlow Homepage. https://www.tensorflow.org/. Accessed 25 March 2021
18. Kotu, V., Deshpande, B.: Chapter 8 - Model Evaluation. Data Science (Second
Edition), pp. 263–79. Morgan Kaufmann, Burlington (2019). https://doi.org/10.
1016/B978-0-12-814761-0.00008-3
19. Freeman, E.A., Moisen, G.G.: A comparison of the performance of threshold cri-
teria for binary classification in terms of predicted prevalence and kappa. Ecol.
Modell. 217(1–2), 48–58 (2008). https://doi.org/10.1016/j.ecolmodel.2008.05.015
Convergence of the Reinforcement
Learning Mechanism Applied to the
Channel Detection Sequence Problem
André Mendes(B)
Research Centre in Digitalization and Intelligent Robotics (CeDRI),
Polytechnic Institute of Bragança, 5300-253 Bragança, Portugal
a.chaves@ipb.pt
Abstract. The use of mechanisms based on artificial intelligence tech-
niques to perform dynamic learning has received much attention recently
and has been applied in solving many problems. However, the conver-
gence analysis of these mechanisms does not always receive the same
attention. In this paper, the convergence of the mechanism using rein-
forcement learning to determine the channel detection sequence in a
multi-channel, multi-user radio network is discussed and, through sim-
ulations, recommendations are presented for the proper choice of the
learning parameter set to improve the overall reward. Then, applying the
related set of parameters to the problem, the mechanism is compared to
other intuitive sorting mechanisms.
Keywords: Artificial intelligence · Reinforcement learning ·
Convergence analysis · Channel detection sequence problem
1 Introduction
The increasing use of devices that rely on wireless connections in unlicensed
frequency bands has caused huge competition for available spectrum. However,
studies conducted by regulatory agencies and universities show that the static
allocation currently practised through the sale of spectrum bands promotes, in
practice, the under-use of frequency bands, even in urban areas [1].
With this, the idea of dynamic spectrum access [2] emerged as a solution
to improve the use of this scarce resource and thus provide alternative connectivity
for the growing demand driven, for example, by Internet of Things devices.
This dynamic access to the spectrum relies on devices whose wireless network
interface is reconfigurable, being able to dynamically adapt their parameters
and operating modes to the existing spectrum occupation.
Consider, then, a scenario with multiple frequency bands (called channels)
and users equipped with a single transceiver, where only one channel can be
scanned at a time to detect possible “opportunities” for use and, within that same
time interval, the channel can be effectively used. In this scenario, the channel detection
sequence can have a major impact on user performance by minimising the time
to search and access a free channel.
When there is prior, accurate knowledge of the channel statistics, the intuitive
detection sequence, i.e., the one that follows the decreasing order of the channel
availability probabilities, is admittedly the optimal sequence [3]. However, in many
practical scenarios, the channel availability probability is unknown in advance,
and therefore the optimal detection sequence requires a huge effort to obtain. A
first intuitive approach to solve this problem would be the historical observation
of these statistics; however, this approach would greatly increase the channel access
time due to the analysis required to obtain accurate estimates.
Although optimal stopping theory [4] combined with a brute-force mechanism
can solve the problem, there is a strong dependency on prior, accurate
knowledge of the channel statistics, and the computational cost of this solution
is high, growing even more with the number of channels and users. Both problems
directly impact the use of this solution embedded in a platform with few resources.
In [5], the problem of choosing the channel detection sequence for a single
user was investigated, and an approach with low computational complexity, based
on a reinforcement learning mechanism, was presented for the dynamic search of
the optimal detection sequence, requiring no prior knowledge of the availability
probability and capacity of the channels.
In this paper, the evolution of the mechanism for searching the detection
sequence in a multi-channel, multi-user radio network using reinforcement learning,
presented in [5], is discussed for the multi-user case, where the reward model
follows optimal stopping theory [4]. The implementation of another strategy
for balancing the exploration-exploitation dilemma is also included, and
recommendations are presented for the choice of parameter ranges to improve the
overall reward of the mechanism. Then, the convergence analysis of the mechanism
is performed by applying the aforementioned set of parameters to the
scenario. Finally, the results obtained are compared to those of other intuitive
sorting mechanisms.
This paper is organized as follows. Section 2 deals with related work.
Section 3 summarizes reinforcement learning, presents the adopted system
model, and describes the proposal that uses this technique for the search of
the optimal channel detection sequence. In Sect. 4, the simulation environment
is described, the convergence analysis of the mechanism is examined, and the
results obtained are discussed. Finally, Sect. 5 concludes the paper and
comments on future work.
2 Related Work
The works in [6,7] obtain some empirical evidence describing the convergence
properties of reinforcement learning methods in multi-agent systems consider-
ing that in many practical applications it is not reasonable to assume that the
actions of other agents can be observed. Most agents interact with the surround-
ing environment relying on information from “sensors” because, without any
knowledge of the actions (or rewards) of other agents, the problem becomes
even more complex.
In the work in [8], the authors studied the multi-agent class of independent
learners, including the convergence of the mechanism, in a deterministic
environment for some scenarios. In subsequent work [9,10], the same class is studied
in a stochastic environment.
An important concept is that of regret (regret theory), defined as the
requirement that an agent obtains a reward at least as good as the one
obtained through any stationary strategy, independently of the strategies adopted
by the other agents [11]. For certain types of problems, the learning algorithms
that follow this concept converge to the optimal (Nash) equilibrium [12,13].
In the work presented in [14], the convergence of the independent learners
class is established when applied to the fictitious play procedure [15] in
competitive games. Taking a different approach, the work in [16] proposed an
independent learners algorithm for repeated games (where the player
is allowed to have a memory of past moves) that converges to a guaranteed fair
policy but periodically alternates between some equilibrium points.
3 Dynamic Channel Detection Sequence Using
Reinforcement Learning
The proposal of this work is a solution based on multi-agent reinforcement
learning of the independent learners class, using the Q-learning algorithm applied to
the problem of finding the optimal channel detection sequence in a network of
wireless devices. With this approach, no prior knowledge is required regarding
the probability of each channel being available or the estimated quality of
each channel given by its average SNR (signal-to-noise ratio). Another important
advantage of this proposal is its adaptability to changes in channel
characteristics, guaranteed by learning from the actions taken.
Therefore, the mechanism becomes immune to possible changes in the avail-
ability probabilities of the channels, which may occur due to changes in the pat-
terns of channel usage and possible changes in channel quality (average SNRs),
which may occur due to mobility and large-scale fading effects.
In the following, the reinforcement learning technique is summarized and,
subsequently, the modelling used in the search for the optimal channel detection
sequence is described.
Reinforcement Learning
Reinforcement learning is one of the three machine learning paradigms,
alongside supervised and unsupervised learning, in which an agent
learns by observing the results of its actions in order to maximise
a scalar value called reward (or reinforcement signal) [17].
When an agent acts, the obtained response (reward) does not contain infor-
mation about the correct action that should be taken. Rather, the agent needs
to discover for itself which actions lead to a greater reward by testing them. In
some interesting situations, actions may affect not only the immediate reward
but also those obtained in the future. The goal of the agent is to maximize the
sum of collected rewards.
The use of algorithms based on reinforcement learning, in particular Q-learning
[18], has received great attention lately. As an algorithm that does
not need a model of the problem and can be used directly at runtime,
Q-learning is very suitable for application in systems with multiple agents, where
each agent has little knowledge of the others and where the scenario changes
during the learning period.
Q-learning adopts a simple approach, leading also to a low computational com-
plexity, which is important for platforms with few resources. In contrast, this
method can exhibit slow convergence [19].
The Q-learning algorithm works by estimating the values of (state, action) pairs
through a value function Q, hence the name. The value Q(s, a), called the Q-value,
is defined as the expected value of the sum of future rewards obtained by the
agent by taking the action a from the state s, according to an optimal policy
[18]. The agent initializes its Q-table with an arbitrary value and, at the instant
of choosing the action to be taken, a strategy is generally chosen in a way that
guarantees sufficient exploration while favouring actions with higher Q-values.
One commonly used strategy is ε-greedy [17], which attempts to make
all actions, and their effects, equally experienced. In ε-greedy, the agent explores
randomly chosen actions with probability ε (exploration) and otherwise exploits the
action with the highest Q-value (exploitation).
Another strategy is called softmax [17], where the probability of choosing a
particular action varies according to its Q-value. In this strategy, a probability
distribution is used to choose an action a from a state s. A commonly used
distribution for this purpose is the Boltzmann distribution. This distribution has
a parameter, called temperature, which controls the amount of exploration that
will be practised. High values of temperature make the choice of actions almost
equiprobable. Low values, on the contrary, cause a great difference in the
probability of choosing each action. In the limit, with the temperature
t → 0, the softmax strategy behaves like a purely greedy selection.
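As a minimal sketch of these two action-selection strategies (illustrative Python only; the paper's simulator was written in Tcl and its code is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy(q_values, epsilon):
    """Explore a random action with probability epsilon, otherwise exploit
    the action with the highest estimated Q-value."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))   # exploration
    return int(np.argmax(q_values))               # exploitation

def softmax_choice(q_values, temperature):
    """Boltzmann exploration: a high temperature makes the choices more
    uniform, a low temperature concentrates them on the best actions."""
    z = np.asarray(q_values, dtype=float) / temperature
    z -= z.max()                                  # numerical stability
    probs = np.exp(z)
    probs /= probs.sum()
    return int(rng.choice(len(q_values), p=probs))
```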
$$Q_{t+1}(s, a) = Q(s, a) + \alpha\left[r(s, a) + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s, a)\right] \quad (1)$$
In Q-learning, from the current state s, an action a is selected and subsequently
a reward r is received, proceeding to the next state s_{t+1}. With this, the value
Q(s, a) is updated according to Eq. 1, where α is the parameter called the learning
rate and 0 ≤ γ ≤ 1 is the parameter called the discount factor. Higher values of α
give greater importance to present experience over history, while higher values of γ
indicate that the agent relies more on future reward than on
immediate reward [20].
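The update of Eq. 1 can be sketched as follows, assuming the Q-table is stored as a nested dictionary Q[state][action]:

```python
def q_update(Q, s, a, r, s_next, alpha, gamma):
    """One Q-learning step (Eq. 1): move Q(s, a) towards the bootstrapped
    target r + gamma * max_a' Q(s_next, a')."""
    best_next = max(Q[s_next].values()) if Q[s_next] else 0.0
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
```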
The Q-learning algorithm converges to the correct Q-values with probability
1 if and only if: the environment is stationary and Markovian, a table is used
to store the Q-values (usually named the Q-table), no state-action pair is
neglected (as time tends to infinity), and the learning rate is reduced
appropriately over time [18]. Q-learning does not specify which action should be
chosen at each step, but requires that no action should fail to be tested.
System Model
In this work, each user was modelled as a learning agent following an independent
strategy, according to the characteristics of the independent learners
multi-agent class [6]. This type of agent does not know the other agents,
interacting with the environment as if it were alone.
The agent, in this case, is unable to observe the rewards and actions of
the other agents and, consequently, can apply the Q-learning technique in its
traditional form. The choice of the multi-agent class, in particular, was motivated
by the possibility of future scalability of the mechanism by increasing the number
of agents and their autonomy, besides the enormous difficulty of satisfying, in
practice, the requirement of observation of the agents’ joint actions necessary
for the application of mechanisms of the joint learners class.
The system model adopted for the development and implementation of the
proposal allows the determination of the optimal detection sequence through the
application of the optimal stopping theory [5]. In this way, the goal is to provide
a decision about the moment to finalize the detection of new channels in such a
way that the reward obtained in the choice of a channel is maximized. This theory
then allows the definition of the stopping rule that maximizes the reward. The
optimal stopping theory provides a criterion for the choice of stopping (staying
on the free channel) or proceeding.
Consider a network with multiple users and a finite number of channels,
N. Each user is equipped with a transceiver for data exchange that can be
tuned to one of the N channels in the network. Moreover, each user follows an
operating model whose access time, which we will call a slot, devoted to channel
observation and data transmission, is constant with duration T.
In each slot, each channel i has a probability p_i of being free or busy. It is
assumed that this probability is independent of its previous state and of the state
of other channels within each slot, and i.i.d. between slots.
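Under this model, the channel states of a slot can be drawn independently, as in the following sketch (the number of channels and the random seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 9                                   # number of channels (assumed)
p_free = rng.uniform(0.0, 1.0, size=N)  # availability probability of each channel

def draw_slot_states(p_free):
    """Each channel is free (True) with probability p_i, independently of the
    other channels and of the previous slots."""
    return rng.random(len(p_free)) < p_free
```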
Figure 1 exemplifies the activity of a user in a slot, which has two phases: a
detection phase and a data transmission phase. Before deciding to use a channel
in a given slot, the user must perform channel detection during a time equal to
τ to determine whether it is free or busy, minimizing the probabilities of missed
detection and false alarm.
In addition, a time equal to σ is reserved for channel probing, which has a
much shorter duration than the transmission phase and depends on the channel
bandwidth, modulation, and other factors; its result provides the transmission
rate that will be obtained, as a function of the momentary SNR of that user
on that channel. In the model, it is assumed that both channel detection and
channel probing are accurate and error-free tasks.
Fig. 1. Model of a slot.
$$
Q_{t+1}(s, a) =
\begin{cases}
(1-\alpha)\,Q(s, a) + \alpha\left[r(s, a) + \gamma \max_{a'} Q(s_{t+1}, a')\right] & \text{if the channel is free}\\[2pt]
(1-\alpha)\,Q(s, a) + \alpha\,\gamma \max_{a'} Q(s_{t+1}, a') & \text{if the channel is busy}
\end{cases}
\quad (2)
$$
Some constraints must be taken into account when taking actions and updating
the Q-table, which is performed according to Eq. 2. One is that any action
taken in a state (o_k, ∗), with 1 ≤ k ≤ (N − 1), always leads to a state where
the position in the detection sequence is o_{k+1}. In a state where the position is
(o_N, ∗), which represents the last channel in the detection sequence, the actions
indicate the first channel to be detected in the next slot, that is, they lead to
a state where the position in the detection sequence is (o_1, ∗). When the user
decides to use a channel c_i at position o_k, the detection in that slot is terminated.
In that case, the first channel to be detected in the next slot will be determined
by the best action in state (o_N, c_i). Another important constraint concerns
preventing the return to a previously detected channel (no recall).
The operation of the mechanism can be described as follows: at the beginning,
all state-action pairs of the Q-table are initialized with zeros. Then, the learning
phase begins, which is repeated during the entire period of the mechanism’s
operation. In this phase, the exploration strategy is chosen, whether softmax
or ε-greedy, and from it the action that will be followed from a given state is
chosen. After the execution of the action, the mechanism analyses if the channel
is free and decides whether to use it or not, continuing the detection according
to the established sequence of the channels. The goal is to provide a decision
about the moment to end the detection of new channels in such a way that the
reward obtained in choosing a channel is maximized. This indicates that it will
not always be advantageous to use the first free channel found.
The stopping criterion consists of comparing the current reward, r_t, with
the best Q-value of the possible actions from that state. Thus, it is possible to
estimate whether the reward of the current free channel is higher than the expected
reward of the best existing action. Notice that even in the case where the free
channel is not used, the Q-value referring to that action is still updated.
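A simplified per-slot sketch of the mechanism is given below. It is only illustrative: the state is reduced to the position in the detection sequence (the paper uses (o_k, c_i) pairs), the reward of a free channel is taken as its instantaneous capacity, only the ε-greedy strategy is shown, and the choice of the first channel of the next slot is omitted.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4                                   # number of channels (assumed)
ALPHA, GAMMA, EPSILON = 0.1, 0.5, 0.7   # parameter set selected in Sect. 4.1

# Simplified state: only the position o_k in the detection sequence (1..N).
# Q[k, c] estimates the value of sensing channel c at position k.
Q = np.zeros((N + 1, N))

def run_slot(p_free, capacity):
    """One slot: sense channels in a learned order (no recall), stop when the
    reward of the current free channel beats the expected value of continuing
    (stopping rule), and update Q following Eq. 2."""
    visited = set()
    for k in range(1, N + 1):
        candidates = [c for c in range(N) if c not in visited]
        if rng.random() < EPSILON:                              # exploration
            c = int(rng.choice(candidates))
        else:                                                   # exploitation
            c = max(candidates, key=lambda ch: Q[k, ch])
        visited.add(c)
        is_free = rng.random() < p_free[c]
        next_best = Q[k + 1].max() if k < N else 0.0
        if is_free:
            r = capacity[c]                                     # assumed reward if the channel is used
            Q[k, c] = (1 - ALPHA) * Q[k, c] + ALPHA * (r + GAMMA * next_best)
            if r >= next_best:                                  # stopping rule
                return r                                        # transmit on channel c
        else:
            Q[k, c] = (1 - ALPHA) * Q[k, c] + ALPHA * GAMMA * next_best
    return 0.0                                                  # no channel used in this slot
```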
4 Parameter Evaluation
Before starting the evaluation of the proposal, some experiments were performed
to obtain the set of parameters of the reinforcement-learning-based mechanism
capable of maximizing the reward collected in the dynamic scenario
to which it is submitted, as well as of improving its convergence.
For this, a simulator was implemented in the Tcl language, responsible for
emulating the operation of a network in which each user follows an individual
detection sequence.
4.1 Parameters of the Q-learning Algorithm
The parameters referred to are those of Q-learning, γ and α, and those concerning
the ε-greedy and softmax strategies, respectively ε and the pair (temperature, β).
In this evaluation, a simulation is performed with 50,000 slots for each strategy,
softmax and ε-greedy, using a number of channels varying between 3 and
9 (x-axis) and 10 users. At the end of the simulation, the average value of the
reward is calculated (y-axis), using a 95% confidence interval.
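The reported averages can be accompanied by a standard normal-approximation confidence interval, for example:

```python
import numpy as np

def mean_ci95(samples):
    """Mean and half-width of a 95% confidence interval (normal approximation)."""
    x = np.asarray(samples, dtype=float)
    half = 1.96 * x.std(ddof=1) / np.sqrt(len(x))
    return x.mean(), half
```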
[Figure: average reward (y-axis, 70–100) versus number of channels (x-axis, 3–9), with curves for different values of γ; panels: (a) α = 0.9 and ε = 0.7; (b) α = 0.9 and ε = 0.7; (c) α = 0.3 and ε = 0.9.]
Fig. 2. Impact of varying the discount factor γ with the ε-greedy strategy, 10 users and an occupancy rate of 10%.
Initially, some values of the parameter γ, called the discount factor, were
tested with the mechanism using the ε-greedy strategy. This parameter directly
determines the degree of importance of the reward that will be collected in the
future as a result of an action taken in the present. Therefore, higher values of γ
indicate that the agent relies more on the future reward than on the immediate
reward.
The results of this evaluation are shown in Fig. 2. For this analysis, it was
necessary to choose an initial range for the other parameters, α and ε, and our
choice was based on the values chosen for these same parameters previously [5].
It can be seen that the three configurations chosen (Figs. 2(a), 2(b) and 2(c)) point
to a relationship between the growth of γ and the increase in reward. However, it
can be noted that there is a threshold for the value of this parameter, between 0.7
and 0.9, beyond which the trend reverses (Fig. 2(a)). An observation of the behaviour
of the parameter for values closer to 0.7 (Fig. 2(b)) demonstrates that the collected
reward values can be considered equal, given the confidence interval, if γ is
in the interval [0.5, 0.7]. We opted for the smaller value, as it favours fast
convergence of the mechanism [17].
[Figure: average reward versus number of channels (3–9); panels: (a) γ = 0.5 and ε = 0.7, curves for α; (b) γ = 0 and ε = 0.7, curves for α; (c) γ = 0.5 and α = 0.1, curves for ε; (d) γ = 0 and α = 0.3, curves for ε.]
Fig. 3. Impact of varying the learning rate α and the parameter ε with the ε-greedy strategy, 10 users and an occupancy rate of 10%.
Then, the impact of varying the parameters α and ε was verified (Fig. 3).
Initially, the parameter α, called the learning rate, was analysed. Briefly, larger
values of this parameter favour recent experience over the historical experience
(or acquired knowledge) of the mechanism. As in the previous analysis, it was
necessary to choose initial values for the other parameters, γ and ε. The γ value
was obtained previously and the ε value was chosen as 0.7.
As can be seen in Figs. 3(a) and 3(b), smaller values of α provide better
performance of the mechanism. It was expected that this result would demonstrate
the importance of adapting the mechanism to the dynamics of the environment
and how much this would influence the overall performance; however, it was
found that it is more important to value the acquired knowledge, keeping the
parameter α low.
The parameter ε, also called the exploration level, is linked to the ε-greedy
strategy and establishes the probability that directs the mechanism to explore new
states. The appropriate choice of the exploration level depends very much on the
problem to be addressed. If the problem requires intense learning, the value of
this parameter tends to be higher. On the other hand, if the problem involves the
execution of other tasks in parallel with the learning, it does not seem appropriate
for the mechanism to choose mostly random actions, and a compromise between
learning and performance needs to be considered in the choice.
Observing Fig. 3(c), we can notice that this parameter has an upper limit
of about 0.7, above which varying the parameter does not significantly
influence the reward.
According to the analyses performed, the parameters γ, α and ε assume the
values 0.5, 0.1 and 0.7, respectively, forming the parameter set for the
ε-greedy strategy applied to the problem with the best results. This set of
values will be adopted in the next experiments.
The softmax strategy is very sensitive to the adequate choice of the temperature
parameter, responsible for controlling the exploration of new actions. If
the choice falls on a very low value, the exploration becomes greedy; otherwise,
the exploration becomes very random. Besides the temperature, it is also
necessary to choose the parameter β, whose effect is inversely proportional to the
learning rate α.
For this analysis, it was necessary to assign an initial value to the temperature
so that values could be tested for the other parameter. The initial value for
the temperature was 0.7. Figure 4(a) shows the variation curves of β between 0.3,
which makes the learning rate high, and 1, which reduces it. Observing the result,
it was noticed that this parameter has little influence on the variation of
the collected reward value as the number of channels increases. Thus, the value of β
equal to 1 was chosen, keeping the learning rate reduced.
With this result, we return to the selection of the temperature parameter.
In Fig. 4(b) we can see the curves for this parameter given the chosen value for
β. The influence of the temperature also proved to be small, and we
decided to choose a value of 0.8, favouring the exploration of new actions.
In this way, the two parameters for the softmax strategy, temperature = 0.8
and β = 1, were established and will be used by the mechanism in the next
experiments.
[Figure: average reward versus number of channels (3–9); panels: (a) curves for β ∈ {0.3, 0.7, 1}; (b) curves for t ∈ {0.70, 0.80, 0.90}.]
Fig. 4. Impact of varying the parameters temperature (t) and β with the softmax strategy, 10 users and an occupancy rate of 10%.
4.2 Number of Users and Channel Occupancy
In this part, an experiment is carried out to determine the impact of varying
the number of users simultaneously with the variation of the channel occupancy.
The purpose is to evaluate the importance of using a mechanism that adapts
to the increase in the number of users, independently of the channel occupancy.
Moreover, as the number of slots increases, the values of the collected reward
(y-axis) can be observed and conclusions about the convergence of the mechanism
can be drawn.
In this experiment, a simulation is performed with 100,000 slots for each
strategy, softmax and ε-greedy, using 9 channels. This number of channels was
chosen assuming that there will be greater “opportunities” for users when there
are more channels. The plotted figures were cropped at 20% of the total number
of slots, to take advantage of the part that contains the most information.
[Figure: collected reward versus slot number (×100), with curves for 1, 10, 15 and 20 users; four panels combining channel occupancy rates of 10% and 90% with the softmax and ε-greedy strategies.]
Fig. 5. Impact of varying the number of users and the channel occupancy parameter for both strategies and with 9 channels.
The results of this examination are shown in Fig. 5. As expected, it is possible
to observe that with low channel occupancy the obtained reward values
are higher (Figs. 5(b) and 5(d)), a direct consequence of the higher number of
“opportunities” for the users. In this result, it is also possible to observe that
the mechanism performs better for a certain number of users, in both strategies.
This phenomenon can be understood as a result of the users' accommodation,
a consequence of the greater competition for the available opportunities, which
are not enough for the demand, causing a reduction in the reward when the
number of users increases too much.
When the channel occupancy parameter increases, a reduction in the reward
values is expected, which can be seen in Figs. 5(a) and 5(c). In this scenario, it
is also noted that the mechanism performs better for a certain number of users,
although this number differs from the one observed when there are more
“opportunities” available.
Another important observation is that the ε-greedy strategy outperforms softmax
when applied to this problem, regardless of the channel occupancy rate. This
somewhat contradicts the expectation of a predominance of the softmax
technique, demonstrating that in certain scenarios and applications the ε-greedy
technique can indeed have superior performance [17].
The results obtained with this experiment also point to a need to adapt
the mechanism to the number of users; it would be interesting for the mechanism
to be able to autonomously restrict this quantity, aiming to reach a higher
reward. This will be investigated in future work.
Regarding the convergence of the mechanism, even with the significant
increase in the number of channels, the mechanism proves to be robust, evolving
through a transitory period until converging to an equilibrium point as the
number of slots increases.
4.3 Number of Channels and Channel Occupancy
In this part, the effects on the reward (y-axis) of varying the number of channels
simultaneously with the channel occupancy were evaluated. For this purpose,
an experiment similar to the previous one (Sect. 4.2) was carried out, starting
from a simulation with the same number of slots for each strategy, softmax and
ε-greedy. A total of 10 users was used, according to the results obtained in
Sect. 4.2. The plotted figures were cropped at 20% of the total number of slots
for clarity, as in the previous analysis.
As expected, for both strategies and the two evaluated rates of the channel
occupancy parameter, the highest reward values were obtained with the
highest channel offer (Fig. 6). However, with low channel availability for the users
(Figs. 6(a) and 6(c)), the performance of the mechanism was similar for 3 or
5 channels. The cause of this behaviour is that a high channel occupancy rate
with a reduced number of channels means that there are few “opportunities”
to be taken advantage of by the mechanism, and a small increase in these
“opportunities” is enough for the mechanism's performance to rise significantly.
[Figure: collected reward versus slot number (×100), with curves for 3, 5 and 9 channels; four panels combining channel occupancy rates of 10% and 90% with the softmax and ε-greedy strategies.]
Fig. 6. Impact of varying the number of channels and the channel occupancy parameter with both strategies, for 10 users.
For all the configurations presented, it was shown that the mechanism is
immune to perturbations caused by varying either the number of users or the
number of channels, for both higher and lower values of the channel occupancy
parameter, and that as the number of slots grows the mechanism evolves to an
equilibrium point, which for our scenario and application was reached close to
5,000 slots.
4.4 Measurement of Fairness
Another evaluation can be observed in Fig. 7, where the measure of fairness
among users is presented. This experiment was performed with the same
configuration as the previous experiment evaluating the influence of the number of
users (Sect. 4.2). The importance of this measure lies in the fact that, until now,
we have analysed aggregate reward values, which could be masking some selfish
behaviour of the mechanism in the attribution of the “best” sequences, favouring
some users to the detriment of others.
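The paper reports the per-user rewards directly; one common way of condensing fairness into a single number, not necessarily the metric used by the authors, is Jain's fairness index:

```python
def jain_index(rewards):
    """Jain's fairness index: close to 1/n when one user takes everything,
    exactly 1.0 when all users receive the same reward."""
    total = sum(rewards)
    return total * total / (len(rewards) * sum(r * r for r in rewards))
```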
[Figure: per-user reward for 10 users; panels: (a) transient up to 100 slots, softmax; (b) transient up to 1,000 slots, softmax; (c) measurement up to 10,000 slots, softmax; (d) transient up to 100 slots; (e) transient up to 1,000 slots; (f) measurement up to 10,000 slots.]
Fig. 7. Measure of fairness between users with both strategies, for 10 users, 9 channels and a channel occupancy rate of 90%.
However, what could be observed, for both strategies, is that as the number of
slots increases, the final reward of each user tends to a homogeneous fraction of
the aggregate reward, equivalent to that of the other users, with no selfish
behaviour occurring.
This result also contributes to strengthening the hypothesis that the mechanism
converges to an equilibrium point, since at this point the individual strategies
do not lead to greedy behaviour by some users.
4.5 Results
To evaluate the performance of the proposal, the developed simulator was
extended and the following detection sequences were evaluated:
– the dynamic sequence of channels provided by the mechanism (RL);
– the sequence of channels in decreasing order of their availability probabilities (Prob);
– the sequence of channels in decreasing order of their average capacities (Cap) [3]; and
– a random sequence of channels (RND).
It is worth noting that all of the evaluated sequences, except the RL sequence,
are static, i.e., they do not change throughout the simulation. In the case of
the RL sequence, due to reinforcement learning itself, the sequence may vary
during the simulation. Moreover, all sequences, except RL and RND,
assume a priori knowledge of the average capacities of each channel and/or their
availability probabilities.
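Assuming the availability probabilities and average capacities are known, as stated above, the static baseline orderings can be generated as in this sketch:

```python
import numpy as np

def prob_order(p_free):
    """Prob: channel indices in decreasing order of availability probability."""
    return list(np.argsort(p_free)[::-1])

def cap_order(avg_capacity):
    """Cap: channel indices in decreasing order of average capacity."""
    return list(np.argsort(avg_capacity)[::-1])

def rnd_order(n_channels, rng):
    """RND: a random permutation of the channel indices."""
    return list(rng.permutation(n_channels))
```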
At the beginning of each simulation round, the average capacity and the
availability probability of each channel i, i ∈ {1, ..., N}, were drawn.
The availability probability of each channel is drawn uniformly within the
interval [0, 1]. The average capacity is drawn from a uniform distribution
within the interval [0.1 × CAP_MAX, CAP_MAX], where CAP_MAX is
the maximum average capacity of the channels. The instantaneous capacity
of each channel is drawn from a uniform distribution within the interval
[0.2 × CAP_MEDIA, CAP_MEDIA], where CAP_MEDIA is the average
capacity of that channel.
Channel occupancy was modelled according to an exponential on-off model
synchronised with the slots. With this, a given channel remained in the occupied
state for a period t_OFF, corresponding to the mean of an exponential distribution.
Thus, t_ON can be obtained as $t_{ON} = (1 - u)\,t_{OFF}/u$,
where u is the channel occupancy rate.
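A sketch of the random draws and of the on-off occupancy model described above is given below; the mean of the occupied period is an assumed free parameter, and CAP_MAX = 10 as configured in the simulations (stated just below).

```python
import numpy as np

rng = np.random.default_rng(3)
N, CAP_MAX = 9, 10.0
MEAN_T_OFF = 5.0   # mean occupied period; an assumed free parameter

p_free   = rng.uniform(0.0, 1.0, size=N)                # availability probabilities
cap_mean = rng.uniform(0.1 * CAP_MAX, CAP_MAX, size=N)  # average capacities
cap_inst = rng.uniform(0.2 * cap_mean, cap_mean)        # instantaneous capacities

def on_off_periods(u, mean_t_off=MEAN_T_OFF):
    """Exponential on-off occupancy: draw t_OFF and derive
    t_ON = (1 - u) * t_OFF / u, where u is the occupancy rate (0 < u < 1)."""
    t_off = rng.exponential(mean_t_off)
    return (1.0 - u) * t_off / u, t_off
```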
The value of W_m, referring to the collision probability, was set to 8 [21].
Thirty rounds of simulation were performed for each set of parameters. In
all simulations, CAP_MAX was configured with a fixed value of 10 and the slot
size was configured as variable, equal to twice the number of channels used in
the simulation multiplied by τ.
The metric used for evaluation was the reward obtained by each of the
implemented sequences using the same channel states and instantaneous channel
capacities (fairness criterion) in each slot, corresponding to the effective
transmission rate obtained by the user when using the channel (Sect. 3). It is
calculated by the simulator, at the end of each round, as the average reward
obtained by each of the sequences over all X slots for each user; the overall
reward is given by the sum of the individual rewards. A simulation is performed
with 200 rounds of 100,000 slots for each strategy, softmax and ε-greedy.
[Figure: panels (a), (b), (e) and (f) plot the reward versus the number of users (2–10) for the RL, Prob, Cap and RND sequences, with (a) occupancy 10%, softmax, (b) occupancy 90%, softmax, (e) occupancy 10% and (f) occupancy 90%; panels (c), (d), (g) and (h) plot the reward versus the occupancy rate (0.1–0.9) for 3, 5 and 9 channels, with (c) 2 users, softmax, (d) 10 users, softmax, (g) 2 users and (h) 10 users.]
Fig. 8. Results of the proposed RL compared to other intuitive sorting mechanisms.
The results of this evaluation are shown in Fig. 8. For both tested strategies,
softmax and ε-greedy, it can be observed that the proposed RL presents better
results. The performance of the other sequences is unfavourable because none of
them uses stopping rules based on the prediction of the estimated gain of
continuing to sense the next channels of the sequence, i.e., in these other solutions
the first channel sensed as free is always used. In this way, the RL, which uses
the past experience stored in the Q-table, can efficiently determine whether it
is advantageous to use a given channel sensed as free.
An interesting observation regarding the curves for 9 channels in Figs. 8(e),
8(f) and 8(a), 8(b) is that the performance of the Prob sequences, very close
to that of the random RND sequences, is lower than the performance of the
Cap sequences. This indicates that in this scenario the differentiation between
the average channel capacities (CAP_MEDIA) is more important than the
differentiation between their availability probabilities. Hence, it is better to
order the channels by decreasing average capacity, since this increases the
probability that the first channel sensed as free is a channel with a higher capacity.
Another detail is that the curve referring to the proposed RL presents smaller
growth as the number of users increases, indicating that there is a saturation
threshold of the available network capacity according to the number of available
channels.
From Figs. 8(c) and 8(g), it can be seen that with few users the reward
obtained varies little with the variation of occupancy (x-axis) and number of channels.
When the number of users increases (Figs. 8(d) and 8(h)), an increase in reward is
observed, although it can be noted that the variation with the occupancy parameter
remains small, maintaining a similar percentage growth of the reward with the
increase in the number of available channels.
Finally, the ε-greedy strategy had a better result than the softmax strategy
when applied to the problem. A possible explanation lies in the fact
that when Q-learning is used in a model with a high number of actions, a high
number of overestimations of some action values is expected, which impairs the
performance of the strategy if there is some noise in these values, for example,
due to the use of approximate value functions or too abrupt transitions between
states. Another explanation comes from the fact that, for some applications and
models, the softmax strategy keeps exploring for a long time, oscillating between
many actions that obtain rewards with similar values, also harming its
performance [22].
5 Conclusions and Future Work
In this paper, the evolution of the mechanism presented in [5] is discussed for
the multi-user case. The implementation of the softmax strategy to balance the
exploration-exploitation dilemma is also included, along with recommendations for
the choice of parameters to improve the overall reward. Then, the convergence
of the mechanism is verified by applying the aforementioned set of parameters.
Finally, the performance evaluation of the evolved mechanism is performed.
Indeed, the proper selection of a set of parameters for the mechanism allows
the reward (data throughput) to be maximized. However, its performance is strictly
related to the values chosen and, hence, the state of the RF environment could
degrade its performance.
It is worth noting that RF communication is widely used in practical
applications, e.g. for gathering data from sensors spread over a primary
production area (agriculture).
As future work, we intend to investigate optimal values for the parameters,
implement new strategies to balance the exploration-exploitation
dilemma, and evolve the mechanism towards autonomous user control in
order to achieve a higher reward, as discussed in Sect. 4.2.
Acknowledgements. This work has been conducted under the project “BIOMA
– Bioeconomy integrated solutions for the mobilization of the Agri-food market”
(POCI-01-0247-FEDER-046112), by “BIOMA” Consortium, and financed by European
Regional Development Fund (FEDER), through the Incentive System to Research and
Technological development, within the Portugal2020 Competitiveness and Internation-
alization Operational Program.
This work has also been supported by FCT - Fundação para a Ciência e Tecnologia
within the Project Scope: UIDB/05757/2020.
References
1. McHenry, M.A.: NSF Spectrum Occupancy Measurements Project (2005)
2. FCC: FCC-03-322 - Notice of Proposed Rule Making and Order. Technical report,
Federal Communications Commission, 30 December 2003
3. Cheng, H.T., Zhuang, W.: Simple channel sensing order in cognitive radio networks.
IEEE J. Sel. Areas Commun. (2011)
4. Chow, Y.S., Robbins, H., Siegmund, D.: Great Expectations: The Theory of Opti-
mal Stopping. Houghton Mifflin Company, Boston (1971)
5. Mendes, A.C., Augusto, C.H.P., Da Silva, M.W., Guedes, R.M., De Rezende, J.F.:
Channel sensing order for cognitive radio networks using reinforcement learning.
In: IEEE LCN (2011)
6. Claus, C., Boutilier, C.: The Dynamics of Reinforcement Learning in Cooperative
Multiagent Systems. National Conference on Artificial Intelligence (1998)
7. Tan, M.: Multi-agent Reinforcement Learning: Independent vs. Cooperative
Agents. In: Readings in Agents (1997)
8. Lauer, M., Riedmiller, M.: An algorithm for distributed reinforcement learning in
cooperative multi-agent systems. In: ICML (2000)
9. Kapetanakis, S., Kudenko, D.: Improving on the reinforcement learning of coordi-
nation in cooperative multi-agent systems. In: AAMAS (2002)
10. Lauer, M., Riedmiller, M.: Reinforcement learning for stochastic cooperative mul-
tiagent systems. In: AAMAS (2004)
11. Bowling, M.: Convergence and No-Regret in Multiagent Learning. In: Advances in
Neural Information Processing Systems 17. MIT Press, Cambridge (2005)
12. Jafari, A., Greenwald, A., Gondek, D., Ercal, G.: On no-regret learning, fictitious
play and nash equilibrium. In: Proceedings of the 18th International Conference
on Machine Learning (2001)
13. Zapechelnyuk, A.: Limit behavior of no-regret dynamics. Technical report, School
of Economics, Kyiv, Ukraine (2009)
14. Leslie, D., Collins, E.: Generalised weakened fictitious play. Games Econ. Behav.
56(2) (2006)
15. Brown, G.: Some notes on computation of games solutions. Research memoranda
rm-125-pr, RAND Corporation, Santa Monica, California (1949)
16. Verbeeck, K., Nowé, A., Parent, J., Tuyls, K.: Exploring selfish reinforcement learn-
ing in repeated games with stochastic rewards. In: JAAMAS (2006)
17. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press,
Cambridge (1998)
18. Watkins, C.J., Dayan, P.: Q-learning. Mach. Learn. 8, 279–292 (1992)
19. Yau, K.A., Komisarczuk, P., Teal, P.D.: Applications of reinforcement learning to
cognitive radio networks. In: IEEE International Conference in Communications
(ICC) (July 2010)
20. Yau, K.A., Komisarczuk, P., Teal, P.D.: Enhancing network performance in dis-
tributed cognitive radio networks using single-agent and multi-agent reinforcement
learning. In: IEEE Conference on Local Computer Networks (October 2010)
21. Vu, H.L., Sakurai, T.: Collision probability in saturated IEEE 802.11 networks. In:
Australian Telecommunication Networks and Applications Conference (2006)
22. van Hasselt, H.: Double Q-learning. In: NIPS (2010)
Approaches to Classify Knee
Osteoarthritis Using Biomechanical Data
Tiago Franco1(B)
, P. R. Henriques2
, P. Alves1
,
and M. J. Varanda Pereira1
1
Research Centre in Digitalization and Intelligent Robotics,
Polytechnic Institute of Bragança, Bragança, Portugal
{tiagofranco,palves,mjoao}@ipb.pt
2
ALGORITMI Centre, Department of Informatics,
University of Minho, Braga, Portugal
prh@di.uminho.pt
Abstract. Knee osteoarthritis (KOA) is a degenerative disease that
mainly affects the elderly. The development of this disease is associ-
ated with a complex set of factors that cause abnormalities in motor
functions. The purpose of this review is to understand the composition
of works that combine biomechanical data and machine learning tech-
niques to classify KOA progress. This study was based on research arti-
cles found in the search engines Scopus and PubMed between January
2010 and April 2021. The results were divided into data acquisition, fea-
ture engineering, and algorithms to synthesize the discovered content.
Several approaches have been found for KOA classification with signifi-
cant accuracy, with an average of 86% overall and three papers reaching
100%; that is, they did not fail once in their tests. Data acquisition
proved to be the task that diverged most between the works; the most notable
commonality at this stage was the use of the ground reaction force (GRF)
sensor. Although three studies reached 100% in the classification, two did
not use a gradual evaluation scale, classifying only between KOA and healthy
individuals. Thus, we can conclude from this work that machine learning
techniques are promising for identifying KOA using biomechanical data.
However, the classification of pathological stages is a complex problem
to discuss, mainly due to the difficult access and lack of standardization
in data acquisition.
Keywords: Knee osteoarthritis · Biomechanical · Data classification ·
Machine learning
1 Introduction
Osteoarthritis (OA), the most common form of arthritis, is a degenerative joint
disease caused by rupture and eventual loss of the cartilage that lines the bony
extremities [18]. This disease is directly related to aging and usually affects the
knee, hip, spine, big toe, and hands. Estimates show that about 10% of men and
18% of women over 60 years of age have some kind of OA, reinforcing its
worldwide importance for public health [17].
In particular, knee osteoarthritis (KOA) is the type of OA with the highest
perceived prevalence. The most characteristic symptoms of KOA are usually
changes in biomechanical behavior, such as joint pain, stiffness, cracking, and
locking of the joints. These symptoms, especially gait abnormalities, are typically
noticed early because they interfere with daily activities [21].
The development of KOA is associated with a combination of a complex set
of factors. These risk factors can be divided into non-modifiable and modifiable,
including joint integrity, genetic predisposition, local inflammation, mechanical
forces, and cellular and biochemical processes. Consequently, several treatments
have been created to improve motor function, relieve pain, and alleviate the
deficiencies caused [10].
One of the fundamental pieces for developing a sophisticated treatment is
the understanding and classification of the progression of KOA. As this
information becomes more practical and accessible, more realistic monitoring of each
case becomes possible, enabling the physiotherapist to prepare a more efficient,
specialized solution [2]. In addition, it also creates the possibility of expanding
the early detection of KOA, reducing the prevalence of the disease in the global
population [3].
The most common techniques used to classify KOA progression are X-rays
and magnetic resonance imaging (MRI). Unfortunately, both have limitations;
radiography is low-priced but generally has low resolution and is used only for
initial evaluation. MRI has better accuracy and is helpful for monitoring the stages
of the pathology, but it is expensive [2]. In addition, patients generally seek
these techniques only after feeling discomfort, making it challenging to detect
asymptomatic patients early.
Thus, researchers from different areas seek to propose innovative, low-cost,
and scalable solutions for OA. The application of computational techniques,
mainly machine learning, stands out, as it is constantly growing and cuts across
most adjacent areas. The evolution of data acquisition and analysis capacity
has aroused great interest from the scientific community, enabling new automated
solutions for pre- or post-treatment [5].
As a result, several clinical and observational databases are being created,
analyzed, and made available online. These data are highly dimensional, hetero-
geneous, and voluminous, opening up a new range of possibilities and challenges
for improving KOA diagnosis [14].
Studies show evidence that biomechanical data, including kinematic-kinetic
data and electromyography (EMG) signals, make it possible to recognize the
difference in the behavior of patients with KOA and to indicate the pathological
stage [1,17]. In addition, the review on the application of machine learning
to KOA developed by Kokkotis et al. [5] shows that ML algorithms for KOA
classification using biomechanical data perform similarly to models built on
image data (X-rays and MRI).
That said, there was a desire to understand in more detail the composition
of works that focus on biomechanical data, identifying and categorizing
the most common and promising forms of data acquisition, feature engineering,
and algorithms. Thus, the present work reviews the literature with the purpose
of systematizing and comparing the knowledge produced by articles that use
biomechanical data to classify KOA through machine learning algorithms.
The paper is organized in four more sections after this one. Section 2 refers to
the methodology used, such as keywords and selection criteria; Sect. 3 compiles
the results of this article, describing the relevant points found in the reviewed
articles on data acquisition, feature engineering, and algorithms; Sect. 4 discusses
the results and the points that we believe should be highlighted; Sect. 5
reports the main conclusions and scientific contributions that we have drawn
from this study.
2 Methods
The literature review presented summarizes the works found in the Scopus and
PubMed repositories. Several combinations of keywords were used in the search,
with the most recurrent being: biomechanical, gait, machine learning, deep learning,
classification, osteoarthritis. Searches were limited to publications between
January 2010 and April 2021.
The focus of the review is to find articles that use biomechanical data from
patients with KOA. Therefore, three rules were used to define the inclusion
criteria, namely:
1. Only articles that mention clinical evidence of the presence of KOA in the patients
included in the study;
2. Only articles that handle biomechanical data;
3. Studies that propose to classify KOA using machine learning algorithms.
For the exclusion criteria, only two rules were added, namely:
1. Studies that did not specify KOA as the pathology presented;
2. Studies that used analytical methods for KOA classification.
Two review papers with correlated objectives inspire this study. The review
[5] synthesizes machine learning techniques for diagnosing knee osteoarthritis
from 2006 to 2019. That study comprehensively explores the use of any data
applied to knee osteoarthritis, including MRI, X-ray, kinetic and kinematic
data, clinical data, and demographics. The second review mentioned [8] surveys
the evaluation of knee osteoarthritis based on gait from 2000 to 2018. It is the
most comprehensive study on the subject, discussing in detail the various
techniques applied to the understanding of human gait and its differences in
KOA patients.
The present study differs from the ones mentioned by being more specific,
including only works that use kinetic and kinematic data for the classification
of knee osteoarthritis through machine learning techniques. The focus of
this work is to detail the construction of the datasets and explore the different
models based on biomechanical data.
3 Results
Twelve articles were selected, according to the criteria described, to be explored.
A list with their meta-information can be seen in Table 1.
Table 1. Selected articles
Index Author Year Country Repository
1 Köktaş [22] 2010 Turkey Scopus
2 Moustakidis [15] 2010 Greece Scopus
3 McBride [12] 2011 USA Scopus
4 Kotti [6] 2014 UK PubMed
5 Phinyomark [19] 2016 Canada PubMed
6 Kotti [7] 2017 UK Scopus
7 Muñoz-Organero [16] 2017 China Scopus
8 Long [11] 2017 UK Scopus
9 Mezghani [13] 2017 Canada Scopus
10 Kobsar [4] 2017 Canada PubMed
11 Kwon [9] 2020 Korea PubMed
12 Vijayvargiya [20] 2020 India Scopus
Two-thirds of the articles found are available on Scopus, and the majority of
the articles were published after 2015, with the highest concentration in 2017 (5 of
the 12). In addition, it was also noted that there is an overlap of authors between
the articles by Kotti [6,7] and Long [11], as well as between Phinyomark [19],
Mezghani [13], and Kobsar [4].
N. Kour [8] describes that the pipeline commonly followed to classify KOA
using computable data has five main stages, illustrated in Fig. 1. The first step
is to acquire data from KOA patients and healthy individuals; the second step
is pre-processing, where techniques are applied to improve the quality of the
data; the third phase is the extraction of the most relevant characteristics and
reduction of the data volume; the fourth step is where the model is built and
the classifier algorithm is optimized; lastly, the fifth stage is the final result of
the process, the desired classification.
Fig. 1. The steps of the process followed in the diagnosis of KOA. Adapted [8].
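As a generic illustration of stages 2 to 4 of this pipeline, and not a reconstruction of any of the reviewed models, a scikit-learn sketch could look like the following, assuming a feature matrix X extracted from gait data and labels y (KOA vs. healthy):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC

# X: one row per subject/gait trial with biomechanical features; y: 1 = KOA, 0 = healthy.
X = np.random.default_rng(0).normal(size=(60, 40))    # placeholder data
y = np.random.default_rng(1).integers(0, 2, size=60)  # placeholder labels

pipeline = make_pipeline(
    StandardScaler(),          # stage 2: pre-processing
    PCA(n_components=10),      # stage 3: feature extraction / dimensionality reduction
    SVC(kernel="rbf"),         # stage 4: classifier
)
scores = cross_val_score(pipeline, X, y, cv=5)
print("cross-validated accuracy: %.2f +/- %.2f" % (scores.mean(), scores.std()))
```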
Analogously, the selected articles follow a structure similar to that shown
in Fig. 1, diverging in some cases in steps 2 and 3, which may appear together
under the name of feature engineering. The approaches adopted are the main
outcomes of this work, and the results are organized into three parts of the
classification process: (1) the description of the data collection scenario and its
specifications, (2) the path that the data take after collection until they become
the input/output of the classifier algorithm, and (3) the characteristics of the
machine learning models produced by the reviewed articles.
3.1 Data Acquisition
Different approaches were applied to acquire biomechanical data. Only one study
[4] did not include data from healthy individuals as a healthy control (HC) group.
The most used resource for acquisition was the force plate (FP), which was also the
only one adopted in combination with the other two resources found, kinematic
systems (KMC) and wearable devices. Consequently, the most applied sensor was
the ground reaction force (GRF) sensor, but inertial measurement unit (IMU),
electromyography (EMG), and goniometer (GO) sensors were also seen. The
detailed list of the characteristics of the datasets developed in the studies can be
seen in Table 2.
Table 2. Characteristics of the datasets
Author Year Size Sensors Resources Additional data
Köktaş [22] 2010 +150 (KOA/HC) GRF KMC+FP Yes
Moustakidis [15] 2010 24 KOA/12 HC GRF FP No
McBride [12] 2011 20 KOA/10 HC GRF KMC+FP No
Kotti [6] 2014 47 KOA/133 HC GRF FP No
Phinyomark [19] 2016 100 KOA/43 HC GRF KMC+FP No
Kotti [7] 2017 47 KOA/47 HC GRF FP No
Muñoz-Organero [16] 2017 14 KOA/14 HC GRF Wearable No
Long [11] 2017 92 KOA/84 HC GRF KMC+FP Yes
Mezghani [13] 2017 100 KOA/40 HC IMU Wearable Yes
Kobsar [4] 2017 39 KOA IMU Wearable No
Kwon [9] 2020 375 (KOA/HC) GRF KMC+FP Yes
Vijayvargiya [20] 2020 11 KOA/11 HC EMG+GO Wearable No
The collection of biomechanical data is not an easy task, much less a standardized
one. All studies that describe the creation of a dataset diverge at some
point. Even those that use the same resource and sensors diverge in the data
collection equipment. A good example is the data provided by kinematics, usually
applied for the acquisition of joint angles (knee, hip) and movement in the
anatomical planes (sagittal, frontal, horizontal); the number of cameras used in
[9,11,12,19,22] was 6, not specified, 8, 12, and 10, respectively. In addition, most of
the software used to acquire the data is proprietary and closed source, making it
difficult to understand the actual correspondence between the datasets.
Not many wearable devices were used, only four: two commercial devices
and two developed by the authors. The technology adopted in each one is quite
different. The article [16] applies an insole composed of eight force sensing
resistors (FSR). The study [13] also applies a commercial device, but in this case
the wearable uses IMU sensors in the knee region to transmit the movements to
sophisticated software. Similarly, [4] also uses IMU sensors; however, the study
proposes the creation of a wearable device, testing different positions for gait
identification. The study [20] also proposes the creation of a wearable for the
acquisition, using EMG sensors and a goniometer to measure the angle between the
thigh and the shin.
The number of patients included varies between the works, with a minimum
of 22 [20] and a maximum of 375 [9] individuals divided between KOA and HC.
All studies mention the clinical confirmation of the diagnosis of KOA in patients,
but not all register the degree of the pathology.
One-third of the chosen works included additional data in the datasets. These
data were mainly standardized patient-reported outcome measure (PROM) forms.
The articles [13] and [11] adopted the Knee Injury and Osteoarthritis Outcome
Score (KOOS) and [9] the Western Ontario and McMaster Universities Osteoarthritis
Index (WOMAC). Finally, [22] follows a different approach, describing a form with
fields for age, body mass index, and pain level. Despite this division, in practice
most of the selected works add some information; for example, the participant's
weight is used to standardize the ground forces across the entire dataset.
3.2 Feature Engineering
The present study classifies as feature engineering the process that the data
goes through, starting from the resources adopted by each study and becoming
the input/output of the classifier algorithm. Thus, according to Fig. 1, we divide
the applied techniques between pre-processing and feature extraction, and add
the evaluation scale to characterize each dataset in its entirety. Table 3
summarizes the main techniques mentioned by the authors.
As we can see, all articles apply techniques or a sequence of calculations
before feature extraction. It is essential to note that the authors who used more
sophisticated software and technology describe the pre-processing in less detail,
since their data had already been treated, differently from the others. The most
common techniques were normalization by weight and synchronization of gait
between individuals.
Three techniques stand out in the pre-processing: the first is the discrete
wavelet transform (DWT), applied to reduce the noise of the raw EMG signals
and provide information in the frequency and time domains; the second is the
overlapping window, applied to divide the signal into parts, both from article [20].
Lastly, [15] applied a derivation of the DWT called the wavelet packet transform,
which includes a more detailed decomposition of the signal.
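As an illustration of this kind of pre-processing, the sketch below denoises a synthetic EMG-like signal with a discrete wavelet transform using the PyWavelets library; the wavelet family, decomposition level, thresholding rule, and synthetic signal are assumptions and are not taken from [20] or [15].

```python
import numpy as np
import pywt  # PyWavelets

def dwt_denoise(signal, wavelet="db4", level=4):
    """Soft-threshold the detail coefficients of a DWT and reconstruct the signal."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745    # noise estimate from finest details
    thr = sigma * np.sqrt(2 * np.log(len(signal)))    # universal threshold (assumed rule)
    denoised = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(denoised, wavelet)[: len(signal)]

# Synthetic "EMG-like" burst with additive noise (placeholder, not real data).
t = np.linspace(0, 1, 1024)
raw = np.sin(40 * np.pi * t) * (t > 0.4) + 0.3 * np.random.default_rng(1).normal(size=t.size)
clean = dwt_denoise(raw)
```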
Table 3. Feature engineering
Author | Year | Pre-processing | Feature extraction | Evaluation scale
Köktaş [22] | 2010 | Calc. of joint angles | Mahalanobis | K-L
Moustakidis [15] | 2010 | Normal. by weight, Wavelet Packet | FuzCoC | K-L adap.
McBride [12] | 2011 | Calc. of joint angles | Analysis | K-L
Kotti [6] | 2014 | Normal. by weight, gait sync | PCA | KOA or HC
Phinyomark [19] | 2016 | Calc. of joint angles | PCA | KOA or HC
Kotti [7] | 2017 | Normal. by weight, gait sync | Analysis | 0–2 develop
Muñoz-Organero [16] | 2017 | Normal. by altitude and duration | Mahalanobis | KOA or HC
Long [11] | 2017 | Calc. of joint angles, normal. by weight, gait sync | Analysis | KOOS adap.
Mezghani [13] | 2017 | Calc. of joint angles | Analysis | K-L
Kobsar [4] | 2017 | Normal. by altitude, gait sync | PCA | KOOS adap.
Kwon [9] | 2020 | Calc. of joint angles | ANOVA, Student t-test | WOMAC adap.
Vijayvargiya [20] | 2020 | Filtered, normal., DWT, overlapping windows | Backward elimin. | KOA or HC
Four studies performed a manual analysis to extract features, studying the
variance of their fields and making them more informative. The other eight
chose standardized methods, with the greatest predominance (three articles) being
principal component analysis (PCA). PCA is an orthogonal linear transformation
technique used to convert a set of possibly correlated variables into a set of
linearly uncorrelated variables, called principal components (PC), that maximize
the variability in the original data [19].
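A minimal scikit-learn sketch of PCA-based feature reduction as just described; the data shape, the standardization step, and the 95% explained-variance target are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = np.random.default_rng(0).normal(size=(180, 60))  # e.g. 180 gait cycles x 60 features

X_std = StandardScaler().fit_transform(X)  # PCA is sensitive to the scale of the fields
pca = PCA(n_components=0.95)               # keep the PCs explaining 95% of the variance
X_pc = pca.fit_transform(X_std)            # linearly uncorrelated principal components

print(X_pc.shape, pca.explained_variance_ratio_[:3])
```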
Two articles applied the Mahalanobis distance in their feature extraction phase,
a measure that can be used as a selection criterion. In summary, the Mahalanobis
distance measures how far a sample lies from the mean of the data, taking the
covariance between fields into account. In this way, it is possible to reduce the
number of fields by distinguishing the highest and lowest variance under the
Mahalanobis distance.
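A minimal NumPy/SciPy sketch of the Mahalanobis distance of each sample from the data mean; the shapes and the 95th-percentile cut used here are assumptions, not the criterion of the reviewed articles.

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

X = np.random.default_rng(1).normal(size=(120, 8))  # 120 samples x 8 fields (placeholder)
mu = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))     # inverse covariance of the fields

d = np.array([mahalanobis(x, mu, cov_inv) for x in X])
# Distances can then be used to rank samples or fields; here we simply flag
# the samples farthest from the mean as an example.
flagged = np.where(d > np.quantile(d, 0.95))[0]
```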
The statistical methods one-way analysis of variance (ANOVA) and Student's
t-test were applied by [9] to determine which features were significantly related
to the desired evaluation scale.
The feature extraction performed by [15] followed a sequential selection according
to the fuzzy complementary criterion (FuzCoC), adjusting the nodes of the decision
tree algorithm.
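A minimal SciPy sketch of this kind of statistical screening: one-way ANOVA across severity groups and a t-test between the two extreme groups, keeping features with p < 0.05; the data, the group labels, and the significance threshold are illustrative assumptions.

```python
import numpy as np
from scipy.stats import f_oneway, ttest_ind

rng = np.random.default_rng(2)
X = rng.normal(size=(90, 20))        # 90 subjects x 20 gait features (placeholder)
y = rng.integers(0, 3, size=90)      # assumed groups: 0 = mild, 1 = moderate, 2 = severe

selected = []
for j in range(X.shape[1]):
    groups = [X[y == g, j] for g in np.unique(y)]
    _, p_anova = f_oneway(*groups)                 # difference across all groups
    _, p_ttest = ttest_ind(groups[0], groups[-1])  # mild vs. severe
    if p_anova < 0.05 and p_ttest < 0.05:
        selected.append(j)
print("features kept:", selected)
```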
For [20], 11 features were initially extracted in the time domain of the
EMG signal. Afterward, a backward elimination technique was applied, iteratively
removing the less significant fields. This work applies different pre-processing
and feature extraction techniques because it is the only one to use EMG data
for classification.
All the studies mentioned the Kellgren and Lawrence (K-L) classification
to evaluate OA progression. The K-L scale is summarized as a set of radiographic
evidence used to assign one of five grades, 0 being healthy and 4 being the most
critical pathological level. This method is commonly adopted for its practicality
and was accepted by the World Health Organization (WHO) in 1961. Despite this,
only four studies used the K-L scale as an evaluation scale. A special case is [15],
which grouped the classification of scores 1 or 2 as moderate and scores 3 or 4
as severe.
Two articles took advantage of patient-reported forms to produce a classification
adapted to their dataset. The authors of [4] used the variation of the KOOS forms
filled in by patients before and after a treatment: those who had a positive change
in the form were considered responders to treatment and, consequently, those who
did not change or worsened were considered non-responders. The second, [9], used
WOMAC to produce its scale, with an adaptation transforming the form values from
1 to 5 into three classes: mild, moderate, and severe.
Lastly, the study [7] developed an evaluation scale for KOA pathology, starting
at 0 for the healthiest individual and continuing linearly up to 2, equivalent
to both knees with severe KOA. In addition, a maximum value of 0.5 was set
for an individual to be considered free of KOA.
3.3 Algorithms and Results
Algorithms of different natures were applied for the classification of KOA:
neural networks, such as the multilayer perceptron (MLP); the support vector
machine (SVM); decision tree-based algorithms, such as random forest, regression
trees, and extra trees; statistical algorithms, such as the Bayes classifier and
linear discriminant analysis (LDA); and the instance-based k-nearest neighbors
(kNN) algorithm. Combinations of algorithms were also proposed: Köktaş [22] used
a decision tree over the discrete data to decide which MLP to use to classify the
biomechanical data, and the article [15] combined a fuzzy methodology with an
SVM-based decision tree, denoted fuzzy decision tree-based SVM (FDT-SVM).
Although some works have presented several algorithms and adaptations
to improve performance, we will consider only the components that produced
the best performance pointed out by the authors of the reviewed articles. The
detailed list can be found in Table 4.
The results obtained by the studies ranged from 72.61% up to classifiers that
did not fail once in their tests, reaching 100% accuracy. The studies that reported
100% accuracy used the SVM and kNN algorithms. The two papers that applied the
random forest algorithm and the neural networks achieved results below 76%.
Statistical algorithms obtained similar results, at around 82%.
Table 4. Algorithms with better performance
Author Year Algorithm Train/Test Validation Result
Köktaş [22] 2010 Decision tree + MLP 66-33 10F-CV 80%
Moustakidis [15] 2010 FDT-SVM 90-10 10F-CV 93.44%
McBride [12] 2011 Neural networks 66-33 – 75.3%
Kotti [6] 2014 Bayes classifier – 47F-CV 82.62%
Phinyomark [19] 2016 SVM – 10F-CV 98–100%
Kotti [7] 2017 Random forest 50-50 5F-CV 72.61%
Muñoz-Organero [16] 2017 SVM – – 100%
Long [11] 2017 KNN 70-30 CV 100%
Mezghani [13] 2017 Regression tree 90-10 10F-CV 85%
Kobsar [4] 2017 LDA – 10F-CV 81.7%
Kwon [9] 2020 Random forest 70-30 HCV 74.1%
Vijayvargiya [20] 2020 Extra tree – 10F-CV 91.3%
There was no agreement between studies on how to split the data between
training and test sets. Five of the 12 studies did not perform this split; that is,
the entire dataset was used in the training phase. The greatest co-occurrence was
among the papers [9,11,12], which used approximately 2/3 of the dataset for
training and 1/3 for testing.
Only two studies did not apply, or did not mention, a validation technique. All
others applied some variation of cross-validation (CV). This technique divides the
training data into X parts (XF-CV), where X-1 parts are used for training and 1
part for testing; the algorithm is then trained X times, circularly alternating the
test subset.
Half of the works discussed adopted the best-known form of this technique,
10-fold cross-validation. The study [9] used the variation called holdout (HCV),
which consists of separating the training dataset into two equal parts. Within this
set, article [6] stands out for dividing the data into 47 folds: the authors report
that they opted for this division because they had only 47 individuals with KOA;
thus, 46 rows were used for training and one for testing.
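A minimal scikit-learn sketch of the 10-fold variant (10F-CV); the classifier, the placeholder data, and the stratification choice are assumptions, not the exact setup of any reviewed study.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = rng.normal(size=(180, 15))      # placeholder gait features
y = rng.integers(0, 2, size=180)    # placeholder labels: 0 = HC, 1 = KOA

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)  # the "X" of XF-CV set to 10
scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=cv, scoring="accuracy")
print(f"mean accuracy over 10 folds: {scores.mean():.3f}")
```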
4 Discussion
It’s clear from the studies analyzed that each work has its characteristics and
limitations that must be analyzed separately. However, in a general context,
works that proposed the KOA classification reached significant precision, with an
average of 86% overall and three works reaching 100%. The variety of approaches
found shows that the classification of KOA is a topic that can be explored in
different ways, even when restricted to biomechanical data.
It is possible to highlight that 9 of the 12 articles used GRF sensors. Even
those that had all the joint data available through kinematic acquisition did not
discard the use of GRF. Kotti [6] describes that if a pathological factor is present
in a patient, the movements are expected to be systematically altered. Thus, as
shown in the works presented, the gait movement is altered in the case of KOA,
and it is possible to identify these changes with the ground reaction force alone.
Work [22] has an implementation that differs from the others due to two facts.
The first is that it uses additional data (age, body mass index, and pain level)
to build a decision tree. The second is that the work uses the decision tree to
select which MLP will be used to classify the grade of the KOA. This approach
is interesting since the discrete data have properties quite different from the
time series, allowing each to be optimized separately.
Only one article [4] does not consider healthy individuals in the composition
of the dataset. Consequently, it does not classify or capture the difference
between healthy and KOA individuals. Although there is no cure for KOA, this
information is still crucial to the classification problem.
The study [20] was innovative within the set studied by using EMG sensors.
With a good accuracy of 91.3%, the study shows that it is possible to distinguish
whether or not a person has KOA through multiple EMG sensors. The authors
comment that it is the combination of sensors that makes the classification
possible: EMG detects muscle contraction, and with only one muscle it is difficult
to identify the difference in a pathological patient; however, when several sensors
are added, it is possible to distinguish a different activation pattern of the leg
muscles when a movement is made.
It is known that the classification of multiple classes is usually more complex
than binary classification, as in the case of KOA or HC. Based on this, Table 5
adds the column of the adopted evaluation scale to the results of the algorithms.
As we can see, all the works that classified between KOA and HC had better
outcomes and did not divide their datasets between training and testing. It was
also noted that the SVM algorithm had the best accuracy, but it was not used
to classify the degree of the KOA pathology.
The studies that adopted the Kellgren and Lawrence scale for their models
obtained a maximum of 85% accuracy. The special case in this category was
Moustakidis [15], which grouped the four grades into two, reducing the number of
classes and achieving a performance of 93.44%. Analogously, with the exception of
study [11], the three works that adopted a different scale had similar results, not
above 82%.
Finally, Long [11] stands out in this review, mainly because it proves to
be the most complete. The study has one of the largest datasets (92 KOA/84 HC),
collected with kinematics and force plates, contributing to the reliability of the
work. Despite not using the Kellgren and Lawrence scale, the study also uses a
graded scale for the classification. Surprisingly, the algorithm that achieved 100%
accuracy for this study was the kNN, the only nearest-neighbor algorithm in this
review.
Table 5. Results ordered by the evaluation scale
Author Year Algorithm Evaluation Scale Train/Test Result
Kotti [6] 2014 Bayes classifier KOA or HC – 82.62%
Phinyomark [19] 2016 SVM KOA or HC – 98–100%
Muñoz-Organero [16] 2017 SVM KOA or HC – 100%
Vijayvargiya [20] 2020 Extra tree KOA or HC – 91.3%
Köktaş [22] 2010 Decision tree + MLP K-L 66–33 80%
Moustakidis [15] 2010 FDT-SVM K-L adap. 90–10 93.44%
McBride [12] 2011 Neural networks K-L 66–33 75.3%
Mezghani [13] 2017 Regression tree K-L 90–10 85%
Kotti [7] 2017 Random forest 0–2 develop 50–50 72.61%
Long [11] 2017 KNN KOOS adap. 70–30 100%
Kobsar [4] 2017 LDA KOOS adap. – 81.7%
Kwon [9] 2020 Random forest WOMAC adap. 70–30 74.1%
5 Conclusion
In this review, several approaches to classify KOA progression using
biomechanical data were examined. As seen, all studies that applied a binary
classification, that is, models capable of identifying whether or not an individual
has KOA, had good accuracy, even with small datasets and different sensors. In this
way, these works provide evidence that the differences in the biomechanics of a
patient with KOA are readily noticeable to machine learning algorithms.
However, this picture is not so clear for the classification of KOA
progression. Except for one article, all the works that opted for a graded KOA
scale had inferior results. In addition to the natural difficulty added by
multi-class classification, non-standardized data acquisition seems crucial in
this problem.
In fact, acquiring data to monitor pathologies is always a challenge, and KOA
is no different. The construction of the reported datasets requires the approval
of a committee, clinical evidence, multiple processes, and specialized personnel.
As a result, the datasets explored are quite distinct from each other, not only
in the number of patients and the equipment used but also in the evaluation scale.
Thus, it is not possible to affirm that the data acquired by one study could reach
a precision similar to that of the other models presented.
Despite this, studies such as Long [11] show that it is possible to classify
the progression of KOA efficiently. This work proved to be the most complete of
those reviewed and inspires confidence in its results. Although the study did not
apply a well-known technique in the feature engineering stage, the authors describe
a manual analysis for feature extraction.
This work contributes to the characterization of the machine learning models
developed for the classification of KOA. This review provides comparisons across
the main steps followed for the diagnosis of KOA using biomechanical data. Thus,
it is expected that the topics addressed will serve as a starting point for new
approaches. As future work, we would like to compare the different approaches
studied on the same dataset and thus promote more reliable comparisons for the
problem.
Acknowledgment. This work was supported by FCT - Fundação para a Ciência
e a Tecnologia under Projects UIDB/05757/2020, UIDB/00319/2020 and individual
research grant 2020.05704.BD, funded by Ministério da Ciência, Tecnologia e Ensino
Superior (MCTES) and Fundo Social Europeu (FSE) through The Programa Opera-
cional Regional Norte.
References
1. Amer, H.S.A., Sabbahi, M.A., Alrowayeh, H.N., Bryan, W.J., Olson, S.L.: Elec-
tromyographic activity of quadriceps muscle during sit-to-stand in patients with
unilateral knee osteoarthritis. BMC Res. Notes 11, 356 (2018). https://doi.org/10.
1186/s13104-018-3464-9
2. Bijlsma, J.W., Berenbaum, F., Lafeber, F.P.: Osteoarthritis: an update with rele-
vance for clinical practice. Lancet 377(9783), 2115–2126 (2011). https://doi.org/
10.1016/S0140-6736(11)60243-2
3. Chu, C.R., Williams, A.A., Coyle, C.H., Bowers, M.E.: Early diagnosis to enable
early treatment of pre-osteoarthritis. Arthritis Res. Ther. 14, 1–10 (2012). https://
doi.org/10.1186/ar3845
4. Kobsar, D., Osis, S.T., Boyd, J.E., Hettinga, B.A., Ferber, R.: Wearable sensors
to predict improvement following an exercise intervention in patients with knee
osteoarthritis. J. Neuroeng. Rehabil. 14(1), 1–10 (2017)
5. Kokkotis, C., Moustakidis, S., Papageorgiou, E., Giakas, G., Tsaopoulos, D.:
Machine learning in knee osteoarthritis: a review. Osteoarthr. Cartil. Open 2(3),
100069 (2020). https://doi.org/10.1016/j.ocarto.2020.100069
6. Kotti, M., Duffell, L., Faisal, A., Mcgregor, A.: The complexity of human walking:
a knee osteoarthritis study. PloS One 9, e107325 (2014). https://doi.org/10.1371/
journal.pone.0107325
7. Kotti, M., Duffell, L.D., Faisal, A.A., McGregor, A.H.: Detecting knee osteoarthri-
tis and its discriminating parameters using random forests. Med. Eng. Phys. 43,
19–29 (2017). https://doi.org/10.1016/j.medengphy.2017.02.004
8. Kour, N., Gupta, S., Arora, S.: A survey of knee osteoarthritis assessment based
on gait. Arch. Comput. Methods Eng. 28(2), 345–385 (2020). https://doi.org/10.
1007/s11831-019-09379-z
9. Kwon, S.B., Ku, Y., Lee, M.C., Kim, H.C., et al.: A machine learning-based diag-
nostic model associated with knee osteoarthritis severity. Sci. Rep. 10(1), 1–8
(2020)
10. Lespasio, M.J., Piuzzi, N.S., Husni, M.E., Muschler, G.F., Guarino, A., Mont,
M.A.: Knee osteoarthritis: a primer. Perm. J. 21, 16–183 (2017). https://doi.org/
10.7812/TPP/16-183
11. Long, M.J., Papi, E., Duffell, L.D., McGregor, A.H.: Predicting knee osteoarthritis
risk in injured populations. Clin. Biomech. 47, 87–95 (2017). https://doi.org/10.
1016/j.clinbiomech.2017.06.001
12. McBride, J., et al.: Neural network analysis of gait biomechanical data for classifi-
cation of knee osteoarthritis. In: Proceedings of the 2011 Biomedical Sciences and
Engineering Conference: Image Informatics and Analytics in Biomedicine, pp. 1–4
(2011). https://doi.org/10.1109/BSEC.2011.5872315
13. Mezghani, N., et al.: Mechanical biomarkers of medial compartment knee
osteoarthritis diagnosis and severity grading: discovery phase. J. Biomech. 52,
106–112 (2017). https://doi.org/10.1016/j.jbiomech.2016.12.022
14. Moustakidis, S., Christodoulou, E., Papageorgiou, E., Kokkotis, C., Papandrianos,
N., Tsaopoulos, D.: Application of machine intelligence for osteoarthritis classi-
fication: a classical implementation and a quantum perspective. Quantum Mach.
Intell. 1(3), 73–86 (2019). https://doi.org/10.1007/s42484-019-00008-3
15. Moustakidis, S., Theocharis, J., Giakas, G.: A fuzzy decision tree-based SVM classi-
fier for assessing osteoarthritis severity using ground reaction force measurements.
Med. Eng. Phys. 32(10), 1145–1160 (2010). https://doi.org/10.1016/j.medengphy.
2010.08.006
16. Muñoz-Organero, M., Littlewood, C., Parker, J., Powell, L., Grindell, C., Mawson,
S.: Identification of walking strategies of people with osteoarthritis of the knee
using insole pressure sensors. IEEE Sens. J. 17(12), 3909–3920 (2017). https://
doi.org/10.1109/JSEN.2017.2696303
17. Nelson, A.: Osteoarthritis year in review 2017: clinical. Osteoarthr. Cartil. 26(3),
319–325 (2018). https://doi.org/10.1016/j.joca.2017.11.014
18. Nelson, A.E., Jordan, J.M.: Osteoarthritis: epidemiology and classification. In:
Hochberg, M.C., Silman, A.J., Smolen, J.S., Weinblatt, M.E., Weisman, M.H.
(eds.) Rheumatology, 6th edn., pp. 1433–1440. Mosby, Philadelphia (2015).
https://doi.org/10.1016/B978-0-323-09138-1.00171-6
19. Phinyomark, A., Osis, S.T., Hettinga, B.A., Kobsar, D., Ferber, R.: Gender differ-
ences in gait kinematics for patients with knee osteoarthritis. BMC Musculoskelet.
Disord. 17(1), 1–12 (2016)
20. Vijayvargiya, A., Kumar, R., Dey, N., Tavares, J.M.R.S.: Comparative anal-
ysis of machine learning techniques for the classification of knee abnormal-
ity. In: 2020 IEEE 5th International Conference on Computing Communication
and Automation (ICCCA), pp. 1–6 (2020). https://doi.org/10.1109/ICCCA49541.
2020.9250799
21. Zhang, Y., Jordan, J.M.: Epidemiology of osteoarthritis. Clin. Geriatr. Med. 26(3),
355–369 (2010). https://doi.org/10.1016/j.cger.2010.03.001
22. Şen Köktaş, N., Yalabik, N., Yavuzer, G., Duin, R.P.: A multi-classifier for grad-
ing knee osteoarthritis using gait analysis. Pattern Recogn. Lett. 31(9), 898–904
(2010). https://doi.org/10.1016/j.patrec.2010.01.003
Artificial Intelligence Architecture Based
on Planar LiDAR Scan Data to Detect
Energy Pylon Structures in a UAV
Autonomous Detailed Inspection Process
Matheus F. Ferraz1(B), Luciano B. Júnior1, Aroldo S. K. Komori1, Lucas C. Rech1,
Guilherme H. T. Schneider1, Guido S. Berger1, Álvaro R. Cantieri2, José Lima3,
and Marco A. Wehrmeister1
1 The Federal University of Technology - Paraná, Curitiba, Brazil
{mferraz,lucjun,aroldok,lucasrech,ghideki}@alunos.utfpr.edu.br, wehrmeister@utfpr.edu.br
2 Federal Institute of Paraná, Curitiba, Brazil
alvaro.cantieri@ifpr.edu.br
3 Research Centre in Digitalization and Intelligent Robotics (CeDRI),
Instituto Politécnico de Bragança, Portugal and INESC TEC, Porto, Portugal
jllima@ipb.pt
http://www.ifpr.edu.br
Abstract. The technological advances in Unmanned Aerial Vehicles
(UAV) related to energy power structure inspection have gained visibility
in the past decade, due to the advantages of this technique compared
with traditional inspection methods. In the particular case of power pylon
structures and components, autonomous UAV inspection architectures
are able to increase the efficacy and safety of these tasks. This kind
of application presents technical challenges that must be faced to build
real-world solutions, especially the precise positioning and path following
of the UAV during a mission. This paper aims to evaluate a novel
architecture applied to a power line pylon inspection process, based on
machine learning techniques to process and identify the signal obtained
from a UAV-embedded planar Light Detection and Ranging (LiDAR) sensor.
A simulated environment built in the GAZEBO software provides a
first evaluation of the architecture. The results show a positive detection
accuracy level superior to 97% using the vertical scan data and
70% using the horizontal scan data. This accuracy level indicates that
the proposed architecture is suitable for the development of positioning
algorithms based on the LiDAR scan data of a power pylon.
Keywords: UAV LiDAR pylon detection · Detailed electric pylon
inspection · Machine learning pylon detection
© Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 430–443, 2021.
https://doi.org/10.1007/978-3-030-91885-9_32
1 Introduction
Power line pylon inspection is a regular task performed by energy enterprises
to ensure energy infrastructure systems’ operational security. Power line struc-
tures are robust constructions, but deterioration on their components demands
preventive maintenance to keep the quality service in power distribution. The
detailed pylon inspection is commonly executed by technicians that reach the
base of the energy pylon on foot and climb it to observe the details of its compo-
nents and structure while looking for defects that could compromise the system.
These tasks are hard to execute and risky because it is usually carried out with
the energy structure in regular operation.
Advances in the manufacture of small unmanned aircraft, specifically multi-rotor
types, allow the proposition of new inspection techniques, including detailed
pylon inspections based on this kind of vehicle equipped with regular and thermal
cameras. Unmanned Aerial Vehicles (UAV) started being used for this kind of
inspection in the last two decades, driven by the offer of cheap and robust small
aircraft on the global market. For detailed structure inspection, multi-rotor
aircraft present some advantages, such as static flight capability, vertical
landing, precise position reaching, and 3D path following, among others. This kind
of operation is commonly performed with the aircraft remotely piloted by a human
operator. To capture the detailed images properly, the aircraft must fly near the
pylon while performing a displacement around it, avoiding obstacles, and keeping a
safe distance from the structure and components. It is a hard process that relies
on the pilot's skills.
Recent research proposes the use of autonomous multi-rotor aircraft to
inspect the power line structure. The use of autonomous aircraft for detailed
pylon inspections poses some challenges for proper operation, such as precise
positioning and path following, effective obstacle detection and collision
avoidance, robust flight control and path planning for the operation, and component
defect identification, among others. Autonomous flight demands a high-accuracy
positioning system to be executed properly. The most common approach to providing
precise position data for UAV flight currently uses the Differential Global
Navigation Satellite System (DGNSS). This technique is based on the difference
between the satellite signals received by two antennas, the first placed at a
static reference point and the second placed on board the aircraft, to calculate
high-accuracy position data for the mobile module. The global market offers a
considerable number of small DGNSS hardware modules suitable for small aircraft.
The technique provides 1.0-cm level horizontal position accuracy but is sensitive
to the environment and operational conditions. To face this problem, additional
positioning algorithms for precise UAV flight are proposed in the literature,
mostly based on computer vision systems.
The precise positioning of the aircraft is essential to provide information to
the flight controller to assure flight safety in a detailed inspection process. In
the specific case of power line structure inspection, some specific challenges
are present, such as:
– Energy power pylons are complex structures, composed of thin metallic components
that are hard to detect by conventional sensors;
– High voltage energy that flows within the cables generates electromagnetic
interference in the navigation sensors of the aircraft during the inspection
operation;
– Presence of trees, civil constructions, and other similar objects around the
pylons and transmission corridors makes it hard to identify the correct posi-
tion of the UAV based on intelligent sensor data processing;
– The need for the aircraft to fly close to the tower to obtain adequate images
and data for analysis brings a high risk of collision, a serious problem for this
kind of application;
– Maintaining the aircraft’s position and orientation for the adequate acqui-
sition of images, due to the presence of gusts of wind and the uncertainty
present in the common positioning sensors, is a very demanding task for the
autonomous control system;
– DGNSS systems are sensitive to environmental characteristics, like coverage
of the receptor antenna, proper satellite signal reception, and data link WiFi
reception.
Considering these demands, the proposition of UAV positioning systems
based on intelligent computational techniques is an interesting field of
application-oriented research. The main objective of this kind of technique is to
assure that the aircraft control system keeps the correct position and orientation,
in addition to following the trajectory during the flight. The present work proposes
an Artificial Intelligence-based architecture specifically designed for power pylon
detection, focused on developing a LiDAR positioning system for autonomous
detailed inspection tasks. It is based on the processing of data from planar Light
Detection and Ranging (LiDAR) sensors embedded on the aircraft. This paper presents
an initial evaluation of an architecture to detect a pylon in the UAV flight area;
it also identifies its direction using only the embedded LiDAR scan data. The
simulations were run in the GAZEBO virtual robotic environment, providing an
overview of the proposed technique's performance.
The remainder of the paper is organized as follows: after this introductory
section, Sect. 2 discusses the related work and highlights this work's
contributions. Section 3 presents the general problem description and describes
the proposed architecture. Section 4 presents the experimental setup and simulation
results, and Sect. 5 discusses the obtained results. Finally, Sect. 6 presents the
conclusions and points out future work directions.
2 Related Works
The common technique to provide position data for UAV outdoor flight is the
use of DGNSS modules. This technique computes the phase difference between
two GNSS antennas to increase the accuracy of the positioning data output.
Nowadays, a considerable number of small-size DGNSS hardware modules are
commercially available, suitable for UAV applications. The most used commercial
flight controllers can work with the data from these GNSS modules to provide
high-accuracy positions to the aircraft, allowing autonomous mission programming.
This technique is one of the most common solutions to provide precise horizontal
positioning to UAVs but demands good operational conditions to work properly. Some
environmental conditions, such as the presence of obstacles, antenna shadowing,
magnetic field interference, and cloudy weather, may affect the system accuracy
significantly [19], justifying the development of complementary positioning
solutions to integrate into the UAV control system.
A common approach to providing UAV position and navigation data is using
computer vision algorithms that process images from regular or stereoscopic
cameras embedded on the UAV. Some review papers have been published on this
subject, presenting the main applications and challenges of the practical use
of vision-based UAV control algorithms. These include the influence of outdoor
light variation, the difficulty of identifying object cues in the images to provide
information to the vision algorithms, and the high variability of the flight site
composition and objects, making it hard to define a general image processing
algorithm for all kinds of applications [1,10,12].
Considering the specific area of autonomous energy power structure inspection,
two kinds of computer vision algorithm approaches are found in the literature:
(a) power line following applications, used for long-range and long-distance
visual inspection, and (b) pylon detection and localization for short-distance
detailed inspection operations.
The line following applications have the main objective of identifying the
position of the UAV relative to the energy power cables and transmission structure
during its displacement along the lines, while the aircraft captures images of the
energy cables and components for an overview of the structural conditions. In
this situation, the algorithm must keep the UAV navigating on the correct path
and at the correct distance from the structures during the aircraft displacement
(commonly 10.0 m to 50.0 m), using the captured images to provide information to
the distance and orientation estimators that feed the flight controller. Most
proposals use image processing techniques to extract the power line from an image
and calculate the position and orientation of the UAV relative to it
[2,6,9,11,13,14,18].
Another approach uses pylon detection and identification in images to provide
a direction for the UAV displacement. In these cases, the vision algorithms
extract the pylon features from the image and calculate its position, feeding the
flight control hardware to pursue the "target" and move toward the next pylon.
An example of this approach is presented in [8].
Some works in the literature apply vision algorithms to provide high-accuracy
position data for the UAV relative to the power pylon when it flies at short
distances (commonly 2.0 m to 6.0 m) to capture detailed images of the structure
components.
A pylon distance estimation algorithm based on monocular image processing
is presented in [3]. It uses the UAV displacement, based on GPS and the aircraft's
IMU (Inertial Measurement Unit) data, to calculate the pylon position using the
matched image points of two consecutive images. A position measurement error lower
than 1.0 m has been reported (for the best samples) for a 10.0 m distance flight in
real-world experiments using a pylon model.
Another similar work proposes a "Point-Line-Based SLAM" technique based
on a monocular camera to calculate the center of the pylon. A 3D point cloud is
estimated from the monocular images at each algorithm iteration, providing the
capability to calculate the distance between the UAV and the pylon. The results
present an average position error of 0.72 m [5].
Using LiDAR sensors to provide distance data for energy power inspection
is a good approach to solve problems related to the inspection tasks. The main
application of this kind of sensor in UAV energy power inspection is focused on
the mapping of the structures using the LiDAR point cloud data to reconstruct
the real conditions of the transmission lines [4,9,16,17].
Another approach proposes the distance estimation of a pylon for a detailed
inspection application based on a planar LiDAR sensor. In that work, a planar
LiDAR carried onboard a multi-rotor aircraft collects horizontal data from the
pylon structure and uses such information to calculate the geometric centroid of
the pylon. One issue of this technique is that it demands that the aircraft keep
its alignment to the pylon to obtain proper data to feed the position calculation.
Also, the inclination of the measurement plane, due to the movement of the
aircraft, has a significant influence on the position error calculated by the
algorithm, as described in [15].
This brief state-of-the-art review shows that intelligent computing algorithms
based on planar LiDAR sensor data can provide a significant contribution to
this specific area of application, especially when they are employed to identify
the pylon position/orientation and to calculate its distance from a UAV flying
close to it. The work described in the present paper is a first evaluation of the
application of machine learning algorithms to face the mentioned challenges.
3 Problem Description
The detailed power pylon inspection demands the UAV displacement around its
structure. In this operation, the technician's goal is to capture close-up images
of the pylon components to verify their integrity. This operation requires that the
aircraft hover close to the target points, typically within 4.0 m. The robustness
of the flight control depends on the correct estimation of the UAV position, in
this case, with high accuracy. Although the centimeter-level accuracy obtained
by a DGNSS system is suitable for this operation, a possible malfunction of this
system demands the proposal of additional positioning systems to assure the safety
of the flight, as explained earlier.
This work's main goal is, as described below, to run an initial evaluation of a
positioning architecture based on an Artificial Intelligence algorithm to detect
an electric power pylon close to a UAV at short distances, in order to assist
detailed pylon inspection tasks. Such an architecture is based on two planar
LiDAR sensors that scan, respectively, the horizontal and vertical planes. Each
sensor scan provides planar point signature data to intelligent algorithms, to
correctly identify the pylon within the flight area and to indicate its structure
orientation with regard to the UAV pose.
This research was based on the expertise of the local energy company technicians,
who provided essential pieces of information regarding the real-world UAV
inspection process. The local company's technicians perform the detailed pylon
inspection using a commercial remotely piloted aircraft that captures images of
the structure for an offboard post-processing evaluation.
The proposed UAV-based autonomous pylon inspection architecture must
assure the capability of reproducing the human-piloted behavior, keeping the
flight safe during the operation. An important demand in this regard is to
identify the presence of the pylon in front of the UAV and keep a safe distance
from the structure, to prevent a possible collision. The use of a LiDAR sensor
embedded on the UAV is a possible way to estimate the distance between the
aircraft and the pylon at close range. To work properly, the flight control system
must receive reliable information that the structure detected by the LiDAR
is the pylon, since there are several other possible objects in the flight area that
could be detected, generating a false distance estimation. Considering this, a
pylon detection and identification algorithm must be provided.
To face these challenges, this work proposes an architecture based on two
planar LiDAR sensors embedded on the UAV, to scan both the horizontal and
the vertical planes and feed AI algorithms. To allow the correct operation of the
algorithms, a predefined programmed behavior for the UAV was proposed.
Figure 1 shows the representation of the proposed UAV behavior for an inspection
process.
Two different AI algorithms have been evaluated to compare the pylon detection
performance in the simulated environment: a Neural Network (NN) and a
Support Vector Machine (SVM). Both algorithms were chosen for being
well documented in the literature and widely applied in classification problems,
and, therefore, they are a good baseline for this evaluation. A deep Feed Forward
Network (FFN) was designed in Python (v3.8.5) with the aid of the Keras
(v2.4.1) framework. The NN was built with three hidden dense layers, the
first two composed of 200 neurons each and the third composed of 5 neurons.
Each hidden layer uses the ReLU activation function, and the output layer
generates the final result using the sigmoid activation function. The network
was trained for 50 epochs with a batch size of 32. Accuracy was chosen as the
evaluation metric, and the loss function used was binary cross-entropy. Also, to
mitigate overfitting, the cross-validation technique was used, with the training
data being split into 10 sets. The network training was executed as a supervised
model, where each sample was tagged with a "pylon" or "not-pylon" boolean
label.
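A minimal Keras sketch following the layer sizes and training settings described above; the input dimension (720 ranges per horizontal scan), the Adam optimizer, the placeholder data, and the simple validation split (instead of the 10-fold cross-validation used by the authors) are assumptions.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

N_POINTS = 720  # assumed input size: 360 degrees / 0.5-degree steps

def build_ffn():
    model = keras.Sequential([
        layers.Input(shape=(N_POINTS,)),
        layers.Dense(200, activation="relu"),
        layers.Dense(200, activation="relu"),
        layers.Dense(5, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # "pylon" / "not-pylon"
    ])
    model.compile(optimizer="adam",             # optimizer not stated in the paper (assumed)
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Placeholder arrays standing in for the labelled LiDAR scans.
X = np.random.rand(3000, N_POINTS).astype("float32")
y = np.random.randint(0, 2, size=(3000, 1))

model = build_ffn()
model.fit(X, y, epochs=50, batch_size=32, validation_split=0.1, verbose=0)
```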
The Support Vector Machine (SVM) algorithm has been designed using Python
(v3.8.5) and the scikit-learn library. Two kernels for parameter estimation have
been used: the polynomial and the radial basis function (RBF). Similar to the NN
algorithm, the SVM training has been executed in the supervised mode, using the
binary label to classify the samples as "pylon" or "not-pylon".
Fig. 1. State diagram showing the proposed UAV behavior for an inspection process.
4 Experimental Setup
To validate the architecture, several simulated scenarios containing an energy
pylon and/or other objects were developed. In each of them, LiDAR samples were
collected and fed to both the SVM and the NN algorithms. They were evaluated
using the following metrics: overall performance, performance on each scenario,
and execution time. The overall performance was evaluated mainly using accuracy
as a metric. The performance on each scenario was measured to find the effect
different scenarios may have on the detection rate. Finally, a time comparison
between the algorithms is shown, as time is a crucial metric for applications
running on embedded systems.
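As an illustration of how the collected scan vectors could be fed to the SVM described in Sect. 3 (polynomial and RBF kernels) and scored by accuracy, a minimal scikit-learn sketch follows; the placeholder data, the 80/20 split, and the default hyperparameters are assumptions, not the authors' exact configuration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

N_POINTS = 720                          # assumed horizontal scan length
X = np.random.rand(600, N_POINTS)       # placeholder scan vectors
y = np.random.randint(0, 2, size=600)   # 1 = "pylon", 0 = "not-pylon"

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)            # 80% training / 20% test

for kernel in ("poly", "rbf"):                       # the two kernels evaluated
    clf = SVC(kernel=kernel).fit(X_train, y_train)
    print(kernel, "accuracy:", clf.score(X_test, y_test))
```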
The validation experiments were executed in a virtual environment built within
the Gazebo Multi-Robot Simulator, version 11.3.0, running on the Ubuntu 20.04
operating system with ROS Noetic Ninjemys. The Gazebo simulation software and
the AI algorithms run on a PC equipped with an Intel i5 10400 processor, 16 GB of
RAM, and an NVIDIA GeForce 1660 Super GPU with 6 GB of RAM. Two standard planar
LiDAR sensor models have been used to scan the horizontal and the vertical planes.
These sensors have a 40.0-m range, a sampling rate of 5 samples per second, and
angle steps of 0.5°. For the horizontal plane, the LiDAR sensor scans 360.0° around
the UAV; for the vertical plane, the sensor scans 90.0° on the front side of the UAV.
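These settings fix the length of the scan vectors fed to the algorithms; the small calculation below derives those lengths (they are not reported explicitly by the authors and assume one range reading per angular step).

```python
ANGLE_STEP = 0.5                              # degrees per step
horizontal_points = int(360.0 / ANGLE_STEP)   # 720 readings per horizontal scan
vertical_points = int(90.0 / ANGLE_STEP)      # 180 readings per vertical scan
print(horizontal_points, vertical_points)     # 720 180
```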
Eight distinct scenarios were built to allow the data collection for the training
and evaluation of the horizontal AI algorithms, as shown in Fig. 2.
Fig. 2. Images of the experiment scenarios. a) Only a pylon in the field. b) Several
buildings and one pylon. c) Several trees and one pylon in the field. d) A complex
building structure, a water tank and one pylon. e) Several buildings, one pylon and
one tree. f) Several buildings; no pylon. g) Several trees in the field close from each
other; no pylon. h) A complex building structure, a water tank; no pylon.
A 3D STL pylon model from the GRABCAD website [7] was imported into GAZEBO.
To execute the data collection for the horizontal LiDAR, the sensor was randomly
displaced around the pylon at a range from 2.0 m to 10.0 m and at different heights
between 2.0 m and 20.0 m. The pylon height was set to 30.0 m. For all the
experiments, a practical approach was considered for the sensor horizontal
stabilization: it was defined that the LiDAR sensor is attached to a stabilized
gimbal that keeps the sensor always scanning in a horizontal or vertical plane.
A random horizontal alignment noise with a 1.0-degree range was added to the
sensor orientation to simulate the stabilization error present in the gimbal
mechanism. Figure 3 shows an example of the horizontal LiDAR point map from a
simulation in a complex environment.
For each position, the sensor captures 360-degree planar point data and stores
it in a vector. All samples for each scenario were stored and post-processed,
indicating the presence or absence of the pylon in the scenario. The training set
was composed of 80% of the collected samples and the test set of the remaining 20%.
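A minimal ROS sketch of how each planar scan could be stored as a fixed-length vector and labelled per scenario, in line with the ROS Noetic setup described earlier; the topic name, the node name, and the handling of non-finite returns are assumptions.

```python
import numpy as np
import rospy
from sensor_msgs.msg import LaserScan

samples, labels = [], []
SCENARIO_HAS_PYLON = True          # set during post-processing, per scenario

def on_scan(msg):
    ranges = np.array(msg.ranges, dtype=np.float32)
    ranges[~np.isfinite(ranges)] = msg.range_max    # replace inf/NaN returns
    samples.append(ranges)                          # one row per scan
    labels.append(1 if SCENARIO_HAS_PYLON else 0)   # "pylon" / "not-pylon"

rospy.init_node("scan_collector")
rospy.Subscriber("/horizontal_scan", LaserScan, on_scan)   # assumed topic name
rospy.spin()
```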
Table 1 presents the results of the SVM algorithm versus the NN algorithm
for the horizontal LiDAR data experiment.
Table 1. SVM × NN comparison on the horizontal LiDAR data set (training set evaluation, 3000 samples)
Horizontal SVM | Predicted True | Predicted False
Actual True | 1830 | 166
Actual False | 179 | 825
True readings: 2655 | Accuracy: 88.50%
Horizontal NN | Predicted True | Predicted False
Actual True | 1562 | 434
Actual False | 416 | 588
True readings: 2150 | Accuracy: 71.67%
Fig. 3. a) Simulation scene on GAZEBO showing the pylon, objects in the area and
the sensor scan. b) The point map of the horizontal LiDAR detection.
For the vertical data collection, the LiDAR sensor was configured under the same
conditions as the horizontal sensor; however, it scans a 90.0-degree range. A total
of 17500 samples were captured and used as training and test data. The main
difference in this data collection is that, in the scenarios containing a pylon, the
sensor was kept pointed at the pylon to assure that the collected samples contain
the detection of the pylon segment. These samples were tagged as "pylon" to feed
the training set. In the other scenarios, the sensor was randomly placed within a
10.0-m radius from the center of the scene and also randomly oriented to collect
data from the structures and elements present in the environment. These samples
were tagged as "not-pylon" to feed the training set. From these datasets, 80% of
the samples were used for training and 20% for testing. Table 2 presents the
results of the SVM and NN algorithms for the vertical LiDAR data experiment.
Table 2. SVM × NN comparison on the vertical LiDAR data set (training set evaluation, 3500 samples)
Vertical SVM | Predicted True | Predicted False
Actual True | 1991 | 24
Actual False | 47 | 1438
True readings: 3429 | Accuracy: 97.97%
Vertical NN | Predicted True | Predicted False
Actual True | 1997 | 38
Actual False | 60 | 1425
True readings: 3420 | Accuracy: 97.20%
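The accuracy values in Tables 1 and 2 follow directly from the confusion-matrix entries (correct readings divided by the total number of samples), as the small check below reproduces for three of the cases.

```python
def accuracy(tp, fn, fp, tn):
    """Fraction of correct readings: (true positives + true negatives) / all samples."""
    return (tp + tn) / (tp + fn + fp + tn)

print(f"{accuracy(1830, 166, 179, 825):.4f}")   # horizontal SVM -> 0.8850
print(f"{accuracy(1562, 434, 416, 588):.4f}")   # horizontal NN  -> 0.7167
print(f"{accuracy(1991, 24, 47, 1438):.4f}")    # vertical SVM   -> 0.9797
```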
The performance of the NN and SVM algorithms for each kind of scenario is
shown in Fig. 4.
Fig. 4. Accuracy performance of the SVM algorithm versus NN algorithm for the
different experimentation scenarios. SVMs had a better overall performance in every
scenario.
4.1 SVM × NN Time Performance
The time performance of the AI algorithms is an important parameter for the
practical application of the architecture. To evaluate this performance, the same
number of predictions was performed for each algorithm on both the horizontal
and the vertical LiDAR datasets to calculate the average time for a single
prediction. As the real UAV will run previously trained models, the training times
will not influence the UAV performance, and thus only the prediction times
were measured. The results are presented in Fig. 5.
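A minimal sketch of how such an average single-prediction time could be measured; the trained models, the number of repetitions, and the use of wall-clock timing are assumptions.

```python
import time
import numpy as np

def mean_prediction_time(model, X, repeats=1000):
    """Average wall-clock time of one prediction over `repeats` calls."""
    start = time.perf_counter()
    for i in range(repeats):
        model.predict(X[i % len(X)].reshape(1, -1))
    return (time.perf_counter() - start) / repeats

# Usage (assuming a trained model `clf` and a scan matrix `X_test` already exist):
# print(f"{mean_prediction_time(clf, X_test) * 1e3:.2f} ms per prediction")
```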
5 Discussion
Considering the constructive characteristics of the energy pylon, which is
composed of thin metallic segments, the planar LiDAR data collected offers few
points for the AI algorithms to process. This is an important parameter to be
considered in the training phase. The LiDAR sensors can capture points not only
from the face of the pylon but also from the back of the structure, and thus
the point map is noticeably different from that of other objects or buildings
commonly found in the surroundings of an energy transmission line area. Although
LiDAR sensors are mainly used for distance measurements, the point signature
generated when scanning a pylon is distinctive in comparison with other objects.
The results obtained in our simulated experiments show the likelihood that the
proposed approach can provide suitable information to train an AI-based
classification algorithm for real-world applications.
As described earlier, the proposed architecture provides a way for the UAV to
detect a pylon present in the flight area and to rotate until it finds the pylon
direction.
Fig. 5. Time performance of the SVM algorithm versus NN algorithm for the horizontal
and vertical LiDAR data. The results show that the prediction times using the SVM
algorithm were much faster than their NN counterpart.
This allows the UAV to keep its front side pointed at the pylon face during the
entire inspection process. Moreover, once it is assured that the pylon exists in the
environment and that the UAV is pointed at it, a more reliable distance measurement
system based on the LiDAR data readings could be implemented. The proposal of new
distance measurement algorithms is planned for the next steps of this research.
Comparing the two AI algorithms fed with the horizontal data shows an accuracy
of 88.50% for the SVM against 71.67% for the NN. For this experiment's scenarios,
the results indicate that the SVM algorithm is better than the NN considering only
the accuracy metric. Also, the SVM algorithm's average processing time for each
prediction is significantly smaller than the NN processing time. It is important
to remember that, to be employed in real-world applications, this architecture is
intended to be deployed on the UAV's onboard embedded computing system, which may
present limited hardware resources and performance. Therefore, the choice of which
algorithm should compose the detection architecture must be considered carefully.
The vertical LiDAR data have shown a similar performance between the SVM and NN
algorithms, with accuracies of 97.97% and 97.20%, respectively. The vertical scan
of the pylon allows more segments of the structure to be detected, offering a more
detailed signal signature as input to the AI algorithms.
Different kinds of scenarios present different performances for the algorithms,
as expected. In the specific case where a significant number of trees is present in
the area, the accuracy of the algorithms is significantly reduced. However, in a
real-world application, the algorithm will operate at a distance of no more than
10.0 m from the pylon, where the presence of trees is not common. It is also
possible to observe in Fig. 4 that the SVM algorithm provides better performance
for all the evaluations.
Comparing the processing time performance, the NN processing takes about five
times longer than the SVM processing (Fig. 5). This impacts the detection algorithm
execution frequency, which, in turn, imposes a constraint on the aircraft
displacement speed.
6 Conclusions and Future Works
This work's goal was to evaluate the use of AI-based algorithms to compose an
architecture capable of detecting a power pylon structure in the flight area of a
UAV used for detailed inspection applications. Such an architecture must offer
a reliable response about the direction of the pylon, by detecting its structure
using a planar LiDAR sensor aligned to the vertical plane of the aircraft.
Although planar LiDAR scan data may provide less information to feed an
AI-based detection algorithm (in comparison with, e.g., an image-based dataset),
the results obtained in the simulated environment show good potential for a
real-world architecture implementation. The obtained accuracy ranged from 70% to
99%, depending on the arrangement and the applied AI algorithm. It is possible to
state intuitively that the processing time of a LiDAR sample is considerably
shorter than that of an image. Therefore, the proposed pylon detection architecture
based on LiDAR sensor data can be deployed on the UAV's onboard embedded computing
system.
The algorithm comparison has shown that the SVM-based architecture provides
better results than the NN-based architecture for this kind of application. These
results indicate that a real-world application architecture will probably obtain
better results by using this kind of algorithm. Also, the SVM processing time was
significantly smaller than that of the NN, which will probably also hold in a
real-world situation.
To the best of our knowledge, this work pioneers an approach based on LiDAR data
and AI-based classification algorithms to detect energy power pylons. Two popular
AI algorithms have been evaluated for this task. Although computer vision
algorithms are probably the most common approach to detecting a pylon in captured
images, those algorithms are used to calculate the distance between the pylon and
the UAV or to provide visual odometry. However, the amount of data provided by a
single image is huge compared to a planar LiDAR data sample, demanding not only a
larger amount of processing and memory resources but also an additional financial
cost to build the embedded computing system. Thus, it is possible to say that an
architecture such as the one proposed in this work has good potential to be
implemented in real-world applications. It demands less processing time while
providing reliable information about the presence of a pylon in the flight area
and its direction relative to the UAV pose. This work is the first step towards
implementing a positioning algorithm based on the measurements of the pylon
structure captured by the LiDAR sensors.
The results presented in this work represent a first evaluation of AI-based
algorithms to detect an electric pylon based on LiDAR point scans. Future work
will evaluate the real-world performance of the proposed architecture using data
collected from a scale-size pylon and LiDAR sensors carried onboard a small-size
quad-rotor aircraft. A distance evaluation algorithm based on the LiDAR readings
is also foreseen as future work. Such a system intends to provide relative
positioning data to the UAV flight controller.
Acknowledgements. This work has been supported by FCT - Fundação para a
Ciência e Tecnologia within the Project Scope: UIDB/05757/2020. This work has also
been supported by Fundação Araucária (grant 34/2019), and by CAPES and UTFPR
through student scholarships.
Data Visualization and Virtual Reality
Machine Vision to Empower an Intelligent
Personal Assistant for Assembly Tasks
Matheus Talacio1,2, Gustavo Funchal1(B), Victória Melo1, Luis Piardi1,2,
Marcos Vallim2, and Paulo Leitao1
1 Research Center in Digitalization and Intelligent Robotics (CeDRI),
Instituto Politécnico de Bragança, Campus de Santa Apolónia,
5300-253 Bragança, Portugal
{gustavofunchal,victoria,piardi,pleitao}@ipb.pt
2 Universidade Tecnológica Federal do Paraná (UTFPR),
Avenida 7 de Setembro 3165, Curitiba 80230-901, Paraná, Brazil
matheustalacio@alunos.utfpr.edu.br, mvallim@utfpr.edu.br
Abstract. In the context of the fourth industrial revolution, the inte-
gration of human operators in emergent cyber-physical systems assumes
a crucial relevance. In this context, humans and machines can not be
considered in an isolated manner but instead regarded as a collabora-
tive and symbiotic team. Methodologies based on the use of intelligent
assistants that guide human operators during the execution of their oper-
ations, taking advantage of user friendly interfaces, artificial intelligence
(AI) and virtual reality (VR) technologies, become an interesting app-
roach to industrial systems. This is particularly helpful in the execution
of customised and/or complex assembly and maintenance operations.
This paper presents the development of an intelligent personal assis-
tant that empowers operators to perform their assembly operations faster and
more cost-effectively. The developed approach considers ICT tech-
nologies, and particularly machine vision and image processing, to guide
operators during the execution of their tasks, and particularly to verify
the correctness of performed operations, contributing to increase produc-
tivity and efficiency, mainly in the assembly of complex products.
1 Introduction
The 4th industrial revolution is pushing the adoption of emergent technologies,
e.g., Internet of Things (IoT), Artificial Intelligence (AI), Big data, collaborative
robots and Virtual Reality (VR), aiming to transform the way factories operate
to increase their responsiveness and reconfigurability. In particular, Industry 4.0
enables the increasing level of automation and digitization in the factories of the
future [5], with Cyber-Physical Systems (CPS) acting as a backbone to develop
such emergent production systems and contributing to develop smart processes,
machines and products. CPS aims to connect the various physical components,
e.g., sensors and actuators, with cyber systems composed by controllers and
communication networks to achieve a common goal [9].
In this emergent CPS environment, humans have a very important role
since they are the most flexible elements in automated systems, which requires
their symbiotic integration. In particular, instead of performing repetitive and
monotonous tasks that can be fully automated, humans will be requested to
perform value-added tasks, e.g., assembly of complex and/or customized products
or performing maintenance interventions.
In this context, intelligent assistants and virtual interfaces can be used to
assist humans to realize their manual operations in a faster and more cost-
effective manner, taking advantage of the huge amount of data available at the
shop floor, as well as the emergent ICT technologies, e.g., AI and VR [3]. These
intelligent assistants can support the online monitoring of the equipment con-
dition during the execution of the assembly and maintenance operations, and
combine this information with diagnostic reports and their previous experience
in analogous situations, to determine the best action plans to be carried out. In
such systems, besides the intelligence behind the guidance system, it is important
to consider automatic systems that dynamically verify the correctness of per-
formed operations, warn about the need to correct an operation, and only allow
proceeding once the operation is complete and successfully performed.
Such automatic verification usually involves the use of machine vision tech-
niques, which in an industrial context present several constraints, namely related
to environmental illumination and shadowing, time response and the irregular
geometries of the pieces to be checked. These systems can be empowered with
the use of AI techniques and should be integrated with the other functionalities
of the intelligent assistants.
Having this in mind, this paper describes the development of a machine vision
solution, as part of an intelligent personal assistant (IPA), that supports human
operators in performing their assembly operations faster and more cost-effectively.
In particular, the IPA application uses ICT technologies and image processing
techniques to guide operators to correctly assemble customised and complex
products, checking the correctness of the performed operations. The activities
carried out by the operator are supervised through an intelligent system that
will verify the correctness of the assembly to ensure that the operation cycle
will only end when the system detects that the assembly is correct, even for the
scenarios considering 3D assemblies, in which the final product is assembled step
by step, and each step is understood as a complete activity. Since the algorithm
is executed in real-time, a check to verify if an operator is modifying the assembly
is included. The developed solution allows increasing the productivity of oper-
ators through the direct integration with emerging technologies, and strongly
contributes to the integration of humans with highly automated applications.
The rest of the paper is organized as follows. Section 2 contextualizes the
related work on developing intelligent assistants, as well as the use of machine
vision to verify the correctness of assembly operations. Section 3 presents the IPA
system architecture for the case study and Sect. 4 describes the development of
the machine vision system to verify the correctness of assembly operations for
the two scenarios considered in the case study. Section 5 discusses the achieved
experimental results, and particularly the user experience. Finally, Sect. 6 rounds
up the paper with the conclusions and points out the future work.
2 Related Work
Industry 4.0 relies on the use of CPS complemented with emergent digital tech-
nologies to achieve more intelligent, adaptive and reconfigurable production sys-
tems. Also important is the integration of the human operators into the produc-
tion processes for enhanced product value, achieving manufacturing flexibility in
complex human-machine systems, where humans and machines can not be con-
sidered in an isolated manner but instead regarded as a collaborative team [8].
This requires the presence of operators in the design of human-centred adaptive
manufacturing systems, allowing a symbiotic integration.
However, the execution of some operations, e.g., assembly and maintenance,
is normally complex and requires high-level expertise from the operators to exe-
cute them efficiently in a short time, with the required quality and with mini-
mal impact on the normal production cycles. This requires enabling human-
machine collaboration, through the use of innovative technologies, e.g., machine
vision, machine learning and VR, that will support the operators during the
realization of their tasks.
In this context, IPA is a guidance software system that can support the
execution of operations, facilitating the interaction between the operators and
machines or computers, based on data mining from images, video, and voice
commands captured from sensors distributed in the environment [6,14]. The use
of IPA, as an intelligent assistant to inform the operator with real-time and his-
torical data and guide them through the best action plan to perform the operation [7],
can improve the quality of executed operations, particularly in case of complex
and/or customised ones. As an example, maintenance technicians can use an
IPA system to obtain useful real-time information about the machine condition,
recommendations and actions to be taken, as well as useful information, e.g.,
documents, websites or videos [4]. These systems can provide real-time informa-
tion and instructions, replacing traditional methods that rely on printed manuals
and real machines that can be damaged [13].
The most common commercial IPAs are, e.g., Amazon Alexa, Microsoft
Cortana, Google Assistant and Apple Siri. In industrial environments, IPAs are
being used, e.g., for training as a means of learning through first-person
experience [1,10], to train operators in new machine maintenance pro-
cedures [16], and to support operators in performing complex and customised
assembly tasks and maintenance operations [15].
The use of IPA can include mechanisms to automatically verify the correct-
ness of the performed operations, namely assembly operations. In this context,
artificial vision, also known as machine vision, plays an important role in the
supervision of assembly operations since it allows the acquisition and analy-
sis of image data, which would not be possible only by inference of context in
software. In industrial environments, the verification of 3D product assemblies
is not possible by using simple 2D images, which requires considering alterna-
tive cameras that allow measuring the distance of points in the field of view,
providing a 3D image of the scene. Several technologies can be considered to
accomplish this objective, e.g., TOF (Time of Flight), used by Microsoft Kinect
2.0 and Microsoft Azure Kinect, and stereo vision, used by Intel Realsense and
Stereolabs ZED. The acquired images are analysed by using image processing
techniques to recognize shapes and objects, e.g., contour matching, morpholog-
ical operations and feature detection. In case of recognition of complex shapes,
these techniques can be combined with machine learning algorithms.
The aforementioned commercial intelligent assistant solutions usually provide
speech recognition functionalities but lack image recognition, which is crucial
in industrial environments. Additionally, the use of such intelligent assistants in
industrial environments is still rare, with the majority running as laboratory pro-
totypes, mainly due to the industrial constraints and complexity, e.g., lighting,
object geometry and object overlap.
Several aspects can influence the use and confidence of the IPA technology,
e.g., usability, security and privacy [2]. At the practical level, there are some
challenges to be faced regarding the adoption of IPA supporting technologies in
industrial environments. These solutions need to be matured to avoid creating
entropy within the maintenance process and operators need to be adequately
trained to operate the system in a proper manner. The IPA’s human-interaction
barrier models need to be improved, and AI systems need to prove reliabil-
ity. Another aspect is related to the ergonomic evolution of the head mounted
devices, which need to be more comfortable for operators to use during the entire
shift or even complete the maintenance intervention [12].
3 System Architecture for the Intelligent Personal
Assistant
This section presents the setup of the assembly workbench and the system archi-
tecture of the IPA that will mentor operators in the execution of assembly
operations based on LEGO pieces of different shapes and colours.
3.1 Description of the Case Study
In this work, an IPA is used to support operators to perform their customised
and/or complex assembly operations in a more efficient manner. For this pur-
pose, the IPA is combined with a workbench structure, illustrated in Fig. 1,
providing intuitive guidance information to the user regarding the task to be
performed, as well as the capability to verify the correctness of the operation
execution. As an example, the IPA informs the operator about the action plan
to perform the operation, namely which step to be executed, how to execute the
step, and which piece or tool should be used in this step.
Fig. 1. IPA workbench layout to support the human-machine interface.

The physical structure comprises several devices that support the human-
machine interface. In terms of data collection, the workbench has an Intel
Realsense camera responsible for capturing the image over the working space,
which will be processed by a machine vision algorithm to detect whether the
instructions given are correctly performed. This system is also able to acquire
the depth of the scene, thus enabling the verification of assemblies in a three-
dimensional space. A microphone is available to support the human-machine
interface by collecting the voice instructions provided by the operator during the
execution of the assembly procedure, e.g., the feedback related to the conclusion
of an operation step. Since the industrial environment can be noisy, the platform
can alternatively provide the feedback by using the touch screen interface.
In terms of output devices, the workbench comprises a monitor and a projec-
tor that are responsible for providing information to the operator, e.g., instruc-
tions to operator and feedback from operator. The image of the projector is
displayed over the working space. A LED tape is used to easily indicate to the
operator which piece placed in the several dispensers should be used in a certain
process step.
The IPA should consider the execution of two case study scenarios. In the
first case study scenario, the operator must assemble gift boxes according to the
orders received by clients, with each gift box product comprising four slots where
individual components (with different colour and shape) should be correctly
placed. In the second scenario, the system is used to support the assembly of
more complex three-dimensional products, but unlike the first, the instructions
are passed in incremental steps and the system checks if the current step has
been completed to allow to proceed to the next assembly step.
3.2 System Architecture
The IPA system architecture, illustrated in Fig. 2, comprises several modules,
interconnected via standard protocols, namely using REST services.
Fig. 2. System architecture for the IPA system.
The process plan management module is responsible for managing the execution
of the process plan, expressed as a JSON file, related to the assembly operation.
This process considers the assembly plan procedure, the feedback from the envi-
ronment, i.e. from the camera that allows checking the correctness of the assembly
step, and the feedback from the operator, e.g., using voice commands to ask
for more information to execute the instruction.
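The paper states that the process plan is expressed as a JSON file but does not give its schema; the snippet below is only a minimal sketch, with hypothetical field names (product, steps, piece, slot), of how such a plan could be loaded and stepped through.

```python
import json

# Hypothetical process plan for a gift-box order; the field names are
# assumptions for illustration, not the authors' actual schema.
PLAN_JSON = """
{
  "product": "gift-box",
  "steps": [
    {"id": 1, "piece": {"colour": "red",  "shape": "square"}, "slot": "top-left"},
    {"id": 2, "piece": {"colour": "blue", "shape": "round"},  "slot": "top-right"}
  ]
}
"""

class PlanManager:
    """Keeps track of the current step and advances only on a confirmed step."""

    def __init__(self, plan_text: str):
        self.plan = json.loads(plan_text)
        self.current = 0

    def current_step(self) -> dict:
        return self.plan["steps"][self.current]

    def on_feedback(self, step_ok: bool) -> bool:
        """Advance when the correctness verification confirms the step.
        Returns True while there are still steps left."""
        if step_ok:
            self.current += 1
        return self.current < len(self.plan["steps"])

manager = PlanManager(PLAN_JSON)
print(manager.current_step()["slot"])   # -> top-left
manager.on_feedback(True)               # camera confirmed the step
print(manager.current_step()["slot"])   # -> top-right
```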
The interaction with the operator is performed by the dashboard application,
where the step by step guidance instructions to be executed by the operator are
displayed through a monitor and a projector, complemented with supporting
documents and videos. Following the order to assemble a customised product,
the requested configuration is displayed to the operator through the human-
interface at the beginning of the cycle. The intelligent assistant is continuously
checking the assembly correctness and only finalizes the operation cycle when a
perfect match is achieved. The intelligent assistant system also indicates to the
operator which slots/pieces are correctly assembled and which are not.
An important functionality provided by the IPA is to verify the correctness
of the operation performed by the operator through the use of machine vision
techniques; in case of an assembly operation performed incorrectly, the intelligent
system will indicate the error and the assembly procedure can only proceed after
a correct assembly.
The correctness verification module is responsible for verifying the correctness
of the step execution by using machine vision algorithms. For this purpose,
the module is continuously receiving the image acquired by the Intel Realsense
camera and uses a machine vision algorithm to determine the important char-
acteristics of the image, being able to conclude if the operation was performed
correctly (see more details in the next section).
The results of the analysis of the correctness of the operation execution are
provided to the plan management engine by using a REST service. In order
to optimize the system performance and avoid the overload on the connection
with the server, this module only sends the detection status when there is any
change (comparing the last REST payload sent with the new generated one).
Also, the depth map of the scene is always checked, so it is possible to detect if
the operator is still busy performing the assembly and the recognition is paused
until it is concluded.
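A minimal sketch of the send-only-on-change behaviour described above, assuming a hypothetical REST endpoint on the plan management engine and using the requests library; the URL and payload fields are illustrative, not the authors' actual interface.

```python
import requests

# Hypothetical endpoint of the plan management engine (not given in the paper).
ENGINE_URL = "http://localhost:8080/api/verification"

_last_payload = None

def report_detection(status: dict) -> None:
    """POST the detection status only when it differs from the last one sent,
    avoiding overload of the connection with the server."""
    global _last_payload
    if status == _last_payload:
        return                      # nothing changed, skip the request
    requests.post(ENGINE_URL, json=status, timeout=2)
    _last_payload = status

# Example result produced by the correctness verification module.
report_detection({"step": 3, "correct": True, "wrong_slots": []})
report_detection({"step": 3, "correct": True, "wrong_slots": []})  # ignored, no change
```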
Finally, the LED strips are controlled by a C/C++ application running on
an Arduino Uno microcontroller that receives, through a UDP socket, the infor-
mation about which LED area should be turned on. Note that during the execution
of assembly tasks, the LED strip will indicate the container where the operator
should pick the pieces to execute the current step.
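On the IPA side, the LED command can be sent to the Arduino as a plain UDP datagram; the sketch below assumes a hypothetical address, port, and one-byte message format, since the paper does not specify the protocol details.

```python
import socket

# Address and message format of the Arduino LED controller are assumptions.
ARDUINO_ADDR = ("192.168.1.50", 8888)

def light_dispenser(area: int) -> None:
    """Tell the Arduino which LED area (piece dispenser) to turn on."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(bytes([area]), ARDUINO_ADDR)

light_dispenser(2)   # highlight the dispenser holding the piece for the current step
```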
The proposed system aims to contribute to the development of an intelligent
context-aware system, which helps human operators to perform their assembly
operations faster and more cost-effectively. In addition, it is expected that this solution
can contribute to increase the operator’s productivity through the integration
of emerging technologies, adding value to human operations that must compete
with highly automated industries.
4 Image Processing to Verify Assembly Correctness
Image processing techniques are used to verify the correctness of the assem-
bly operation performed by the operator in real time by comparing the image
acquired by the Intel RealSense camera with the execution instruction previ-
ously provided by the IPA. For this purpose, an algorithm for image recognition
was firstly implemented, which will later be used to verify the correctness of
the assembly operation in two different scenarios, namely the gift-box and the
complex assembly. The image processing algorithms were coded in Python,
ensuring fast run-time, easy development and camera compatibility, and sup-
porting the use of image manipulation libraries.
4.1 Image Recognition and Classification Algorithm
The first step in the correctness verification process is related to the image
recognition, which uses the OpenCV library to implement image processing,
e.g., filtering, morphology, contour detection and colour space conversion, on
real-time applications. The developed image recognition algorithm, illustrated
in Fig. 3, identifies the different objects in the image, in terms of shape, colour
and position.
Briefly, the algorithm converts the colour space of the images from Red-Green-
Blue (RGB) to Hue-Saturation-Value (HSV), since the latter presents better per-
formance in machine vision algorithms and is more adequate for colour detection.
In fact, this colour space allows representing the colours more easily for
the filter, since they are closer to the human perception of colours [11].

Fig. 3. Algorithm for the image recognition.

Considering the image after the colour space conversion, the
use of the OpenCV library, and particularly the inRange function, allows apply-
ing a threshold filter to isolate the pixels from the image by defining a range
in the colour space. As a result, a binary image is obtained, with the pixels that
are within the defined range for each colour assuming the white colour and the
pixels that are not within the range assuming the black colour. This process is
performed for all the available colours of objects to be identified and the range
is defined for each colour. From the binary image it is possible to find the con-
tour of the different objects by using the findContours method, which extracts
information about the area, center of mass and vertex position of the object.
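A minimal sketch of the recognition steps described above (HSV conversion, per-colour inRange thresholding, findContours), assuming OpenCV 4.x; the HSV ranges and the noise-area threshold are illustrative values, not the ones used by the authors.

```python
import cv2
import numpy as np

# Illustrative HSV ranges; the real system uses thresholds tuned per colour.
COLOUR_RANGES = {
    "red":  (np.array([0, 120, 70]),   np.array([10, 255, 255])),
    "blue": (np.array([100, 120, 70]), np.array([130, 255, 255])),
}

def recognise(frame_bgr: np.ndarray) -> list:
    """Return a list of (colour, area, centre, top_left) for every object found."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    objects = []
    for colour, (lo, hi) in COLOUR_RANGES.items():
        mask = cv2.inRange(hsv, lo, hi)                  # binary image for this colour
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        for cnt in contours:
            area = cv2.contourArea(cnt)
            if area < 200:                               # discard small noise blobs
                continue
            m = cv2.moments(cnt)
            centre = (int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"]))
            x, y, _, _ = cv2.boundingRect(cnt)           # top-left vertex
            objects.append((colour, area, centre, (x, y)))
    return objects
```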
This image recognition algorithm is performed for both case study scenarios.
However, the algorithm to verify the assembly correctness is dependent on the
scenario and will be described in the following sections.
4.2 Verification Algorithm for the Gift-Box Assembly
In this scenario, the application generates a configuration for the gift-box accord-
ing to the received orders, i.e. which pieces (defined in terms of shape and colour)
should be placed in each one of the four box slots. The operator receives the
information related to this configuration, through the projector and monitor,
and performs the assembly operation by placing the pieces in the target slots
of the gift-box. The requirements to confirm the correct assembly include veri-
fying that only one piece is placed in each slot and that the piece has the colour
requested in the configuration instructions. The algorithm to verify the cor-
rectness for the gift-box assembly is illustrated in Fig. 4.
Fig. 4. Verification algorithm for the gift-box assembly.

The image recognition algorithm described in the previous section is used to
list the objects present in the scene. Then, for the gift-box assembly verification
algorithm, the pointPolygonTest function from the OpenCV library is used to
select only the pieces that are inside the box. The algorithm identifies the con-
tour of the box, and then, knowing its position, it is possible to measure the
Euclidean distance between the other pieces and each corner of the box. Due to
the geometry of the box, the piece that is placed in a slot will always be at a
smaller distance from its respective corner, e.g., the piece that is in the upper
right slot always has a smaller distance from its upper right corner compared
to the others. This allows assigning the proper slot positions of the pieces by
comparing with the received instruction. If two pieces are detected in the same
position or if the colour of the piece in one of the slots does not match with the
colour presented in the instruction, an alert error is generated. In this way, it is
possible to have freedom during the assembly operation as the positions are all
relative to the current gift-box position.
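A sketch of the corner-distance rule described above, assuming the four box corners and each piece's top-left vertex come from the recognition step; the slot names and the data structures are illustrative.

```python
import math

SLOTS = ("top-left", "top-right", "bottom-right", "bottom-left")

def assign_slot(piece_vertex, box_corners):
    """Assign a piece to the slot whose box corner is closest to the piece's
    top-left vertex (box_corners given in the same order as SLOTS)."""
    dists = [math.dist(piece_vertex, corner) for corner in box_corners]
    return SLOTS[dists.index(min(dists))]

def check_gift_box(pieces, box_corners, instruction):
    """pieces: list of (colour, top_left_vertex); instruction: slot -> colour.
    Returns the list of slots that are wrongly assembled."""
    seen = {}
    for colour, vertex in pieces:
        slot = assign_slot(vertex, box_corners)
        if slot in seen:                 # two pieces detected in the same slot
            seen[slot] = "duplicate"
        else:
            seen[slot] = colour
    return [s for s, wanted in instruction.items() if seen.get(s) != wanted]

corners = [(100, 100), (300, 100), (300, 300), (100, 300)]
wrong = check_gift_box([("red", (110, 115)), ("blue", (290, 120))],
                       corners,
                       {"top-left": "red", "top-right": "blue"})
print(wrong)   # -> [] when the assembly matches the instruction
```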
After validating the correctness of the performed assembly, the information
is sent to the dashboard application that will show the results, and in case of
incorrectness, indicates which slots of the gift-box are wrongly assembled.
4.3 Verification Algorithm for the Complex Assembly
In this scenario, the dashboard application guides the operator during the assem-
bly of a complex product by showing the image of the complete assembly and
the visual instructions (i.e. action, image and video) for the current assembly
step. After concluding the verification of correctness, the system indicates if the
assembly step was completed or not with success.
Since the operation involves a three-dimensional assembly, besides the recog-
nition of the colour and shape of the pieces, it is also required to get information
about the height, which is possible by using the IR Stereo sensor of the Real
Sense camera. To mitigate inaccuracies caused by the camera’s mounting angle
and uneven surface, the application initially saves the scene depth as a height
map, so during the detection routine, the captured depth is subtracted from
the map to obtain the height of the pieces relative to the working table. The
verification algorithm for the complex assembly is illustrated in Fig. 5.
Fig. 5. Verification algorithm for the complex assembly.
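A minimal numpy sketch of the height-map idea: the reference depth of the empty table is stored once and subtracted from each new depth frame, and the resulting height selects the layer. The layer height, tolerance, and millimetre units are assumptions; the depth frames themselves would come from the RealSense SDK.

```python
import numpy as np

LAYER_HEIGHT_MM = 10.0   # assumed height of one assembly layer, in millimetres

def height_map(reference_depth_mm: np.ndarray, current_depth_mm: np.ndarray) -> np.ndarray:
    """Height of the scene above the (possibly uneven) table surface."""
    height = reference_depth_mm - current_depth_mm
    return np.clip(height, 0, None)      # negative values from sensor noise are excluded

def layer_mask(height_mm: np.ndarray, layer: int, tol_mm: float = 3.0) -> np.ndarray:
    """Boolean mask of the pixels belonging to a given assembly layer."""
    target = layer * LAYER_HEIGHT_MM
    return np.abs(height_mm - target) < tol_mm

# Synthetic example: an empty table at 800 mm and a one-layer brick placed on it.
ref = np.full((4, 4), 800.0)
cur = ref.copy()
cur[1:3, 1:3] -= LAYER_HEIGHT_MM          # the brick raises the surface by one layer
print(layer_mask(height_map(ref, cur), layer=1).sum())   # -> 4 pixels on layer 1
```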
To allow the assembly verification in this scenario, the first action is to fetch
the active assembly and search for the respective instruction file for the current
step. For all the instructions generated, the piece in the top left corner in relation
to the rotation of the mounting surface, as illustrated in Fig. 6, is considered as
a master piece, responsible for the basis of calculation to establish the position
of the other pieces in the scene. In scenarios that do not require using a box or a
base to accommodate the pieces, the first piece placed in the working area, e.g.,
corresponding to the first step, is considered as the master piece, and all the
other pieces placed later have their position related to the master piece.
Fig. 6. Possible assembly rotations. (Color figure online)
The colour, orientation and shape of the piece are obtained directly from
the image recognition algorithm. The assemblies were separated into layers of
heights, to ensure that all pieces in the scene are analyzed, even those that are
covered by another piece in a later step. Each layer has a master piece to serve
as reference for other pieces in the same plane and a new layer is considered if
any piece is mounted on top of another. It is important to remark a limitation
in the assembly freedom: the assembly area cannot be moved when a layer
changes, because this would change the stored reference position of the
master piece used to calculate the distance and the reference point for the newly
added piece.
In order to increase the stability of the correctness verification process, the
assembly verification algorithm considers the last five performed detections and
only confirms the assembly if they are equal. Note that this threshold value was
empirically defined, with a small number leading to a reduced certainty and a
high number increasing the response time.
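The five-detection stabilisation can be realised with a small fixed-length buffer; the sketch below is one possible implementation, with the window size kept as a parameter rather than hard-coded.

```python
from collections import deque

class DetectionStabiliser:
    """Confirms a detection only when the last N results are identical."""

    def __init__(self, window: int = 5):
        self.history = deque(maxlen=window)

    def update(self, detection) -> bool:
        """Add the newest detection; return True when it is stable."""
        self.history.append(detection)
        return (len(self.history) == self.history.maxlen
                and all(d == self.history[0] for d in self.history))

stab = DetectionStabiliser(window=5)
for frame_result in ["ok", "ok", "ok", "ok", "ok"]:
    confirmed = stab.update(frame_result)
print(confirmed)   # -> True only after five equal detections in a row
```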
By debugging the software, it was possible to observe that problems in detec-
tion may occur due to the positioning of the assembly in relation to the ambient
lighting. To solve this problem, three solutions have been introduced: the reduc-
tion of the lighting intensity, the demarcation of an optimal area for assembly
and the modification of the algorithm that returns the final detection.
5 Experimental Results
The developed IPA, and particularly the automatic system to verify the correct-
ness of the assembly operation, was implemented in the workbench case study,
and used by operators in their assembly operations’ routines.
5.1 Accuracy of Developed Methods
Multiple running tests were performed using the intelligent assistant by operators
performing assembly operations for the two case study scenarios.
Regarding the gift-box scenario, Fig. 7 illustrates an operator performing the
assembly operation with the support of the intelligent assistant, as well as the
output of the recognition algorithm to check the correctness of the performed
operation.
Fig. 7. Gift-box assembly in the IPA (right) with the recognition output (left).
The benefits of using the intelligent assistant to support the operator during
the execution of customised assembly tasks are clear, including the automatic
presentation of the instructions to perform the operation and the automatic ver-
ification of the correctness of the performed operation, mitigating the possibility
of human errors, e.g., due to fatigue, variability and error of judgement, thus
increasing the efficiency of the process and the comfort of the operator.
The checking algorithm was intensively tested, including incorrect assemblies
with the following situations to validate its accuracy: i) incorrect colour order
within the box, ii) two pieces in the same box slot, and iii) pieces placed outside
the box. No errors occurred during the tests, i.e. cases where the system output
is correct although the real assembly is not. However, when the box was placed
outside the detection area, some errors occurred due to light reflection. In
particular, incorrect detection of red pieces was identified when they are very
close to the box wall. This occurs since, in some situations of low brightness,
the HSV values for brown and red colours coincide, with the filter not being able
to separate the two objects. This situation can be avoided with the improvement
of the scene lighting and adjusting the filter threshold values to better classify
the pieces’ colours.
The response time to perform the recognition algorithm and to present the
verification decision to the operator is approximately 3 s, which means a waiting
time of 3 s to check the correctness of the performed operation’s step. This implies
the need to establish a cycle time for the assembly of gift-boxes that includes
this additional time.
Regarding the second scenario, Fig. 8 shows the operator performing the
assembly of a product and the output of the recognition algorithm. The black
points presented in the algorithm's output are due to the layer system: since
the distances are subtracted from the height map, oscillations in the camera
data can produce negative values, and the algorithm excludes these areas.
Fig. 8. Operator performing a complex product assembly (left) and the output of the
recognition algorithm (right).
Similarly to the previous scenario, erroneous situations, e.g., switched colour
or position, wrong orientation, pieces mounted in incorrect layers, movement of
the mounting base in instructions of the same layer and removal of previous
pieces, were included in the testing experiment to identify possible errors in the
recognition of the assembly steps. It was observed that, during the experimental
tests, no errors occurred. One problem found was the insertion of pieces of
the same colour as other pieces in lower layers. Although the performance of the
algorithm when ignoring the layers not being worked on was satisfactory, the
shape of lower pieces placed in the proximity of parts of the superior layer was
detected, as observed in Fig. 9. Thus, when the colours of these pieces coincide,
they are considered as a single piece, causing an incorrect calculation of their
edges; consequently, the step cannot be confirmed by the algorithm.
The problem theorized during the development of this application was also
confirmed: during a change of layers, if the pieces change their position on the
working area, it is not possible to obtain the relation of the new
piece and therefore the assembly step cannot be confirmed. This problem
can be solved by using another algorithm that can identify the position of the
pieces, independently of the relations between the rest of the assembly.
Finally, the classification of the pieces’ heights has been proven to be efficient
by using a height map, since, through the debugging of the algorithm, it is
possible to find the same height for pieces in the same plane. Since a height
map technique is used, it is assumed that the recognition algorithm starts with
no object above the table, otherwise it is necessary to clean it and restart the
detection procedure.

Fig. 9. Example of wrong inclusion of bottom layer pieces. (Color figure online)

Also, if the camera is misaligned perpendicularly to the working area, incorrect
heights are obtained, making it necessary to perform this verification beforehand.
5.2 Comparing with Traditional Methods
An experimental use case test was performed to compare the time and efficiency
of the IPA system against the traditional manual method, considering the sec-
ond use case scenario. The time elapsed and the errors made during the setup,
assembly and completion phases were analysed. For the conventional method,
the operator follows a printed manual that describes the steps to be executed
and fills in a report containing the confirmation of the correct performance of the
instructions. After the assembly through the printed manual, the participants
used the workbench to perform a different assembly from the one performed pre-
viously, but with the same difficulty and number of steps, and the same metrics
were calculated. Note that the order in which the methods are carried out does
not influence the result as the set of parts used and the assembly process itself
change, which does not allow the operator to become more skilled after executing
an assembly process. Table 1 summarises the obtained results with four opera-
tors performing the assembly using the two different methods (conventional and
IPA), each one performing a setup and 5 assembly steps.
Table 1. Average assembly time (seconds)

Phase    Conventional method    IPA method
Setup    84.25                  15.75
Steps    97.75                  62.00
Total    182.00                 77.75
It can be observed that the biggest difference between the two methods is
related to the setup time, which is composed of the selection of the assembly
procedure and the pre-filling of the report, where, in the IPA, the information is
filled in automatically and uploaded to the database accordingly. The time taken
to complete all the 5 steps when performed using the intelligent assistant shows
a reduction of 36.6% in relation to the conventional method. Through the use
of the assembly recognition associated with the IPA method, it is ensured that
the instruction was correctly executed, while in the conventional method this
function depends on the operator performance, thus introducing the possibility
of human errors. These benefits become greater as the assembly procedure gets
more complex and longer.
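For reference, the 36.6% figure follows from the Steps row of Table 1: (97.75 − 62.00) / 97.75 ≈ 0.366, i.e. a reduction of roughly 36.6% in the time needed to complete the five assembly steps.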
6 Conclusions and Future Work
In Industry 4.0, humans are considered the most flexible piece in an automated
system, their symbiotic integration being crucial. The use of intelligent assis-
tants contributes to this integration, particularly to support humans during the
execution of complex and/or customised operations.
This paper presents the development of an IPA to support the integration of
humans in cyber-physical systems, particularly acting as a mentor for operators
in the execution of their assembly operations. A key issue in this IPA system is
the use of machine vision techniques to automatically verify the correctness of
the performed operations, reducing operator errors and improving
the product quality.
The application of a workbench equipped with an intelligent personal assis-
tant demonstrates, through the use of two different scenarios, how emergent ICT
technologies can be applied to help operators to perform assembly tasks faster
and more efficiently. In fact, the developed solutions presented high levels of accuracy,
a reduction of operator errors and a reduction of the execution time, as well
as excellent levels of acceptance from operators.
In this sense, IPA directly contributes to increasing productivity, quality
and efficiency in the tasks performed, highlighting the importance of adopting
Industry 4.0 in traditional manufacturing companies through the interaction of
humans with intelligent assistants.
Future work includes the use of more robust detection algorithms to handle
more complex parts, and AI techniques to provide better action plans along
the assembly process. Also, it is planned to perform tests with more operators,
including a statistical study with the variability of the assembly execution time.
Acknowledgments. This work has been supported by FCT- Fundação para a Ciência
e Tecnologia within the Project Scope: UIDB/05757/2020.
References
1. Abidi, M., Al-Ahmari, A., Ahmad, A., Ameen, W., Alkhalefah, H.: Assessment of
virtual reality-based manufacturing assembly training system. Int. J. Adv. Manuf.
Technol. 105, 3743–3759 (2019)
2. de Barcelos Silva, A., et al.: Intelligent personal assistants: a systematic literature
review. Expert Syst. Appl. 147, 113193 (2020)
3. Fantini, P., et al.: Exploring the integration of the human as a flexibility factor in
CPS enabled manufacturing environments: methodology and results. In: Proceedings
of the 42nd Annual Conference of IEEE Industrial Electronics Society (IECON
2016), pp. 5711–5716 (2016)
4. Frigo, M.A., da Silva, E.C., Barbosa, G.F.: Augmented reality in aerospace man-
ufacturing: a review. J. Ind. Intell. Inf. 4(2), 125–130 (2016)
5. Gilchrist, A.: Industry 4.0: The Industrial Internet of Things. Springer, Heidelberg
(2016)
6. Hauswald, J., Laurenzano, M.A., Zhang, Y., et al.: Sirius: an open end-to-end
voice and vision personal assistant and its implications for future warehouse scale
computers. In: 20th International Conference on Architectural Support for Pro-
gramming Languages and Operating Systems, pp. 223–238 (2015)
7. Hoedt, S., Claeys, A., Landeghem, H.V., Cottyn, J.: The evaluation of an ele-
mentary virtual training system for manual assembly. Int. J. Prod. Res. 55(24),
7496–7508 (2017)
8. Krupitzer, C., et al.: A survey on human machine interaction in industry 4.0. CoRR
abs/2002.01025 (2020)
9. Leitão, P., Colombo, A.W., Karnouskos, S.: Industrial automation based on cyber-
physical systems technologies: prototype implementations and challenges. Comput.
Ind. 81, 11–25 (2016)
10. Mantovani, G.: VR Learning: Potential and Challenges for the Use of 3D. Towards
Cyberpsychology: Mind, Cognitions, and Society in the Internet Age, pp. 208–225
(2003)
11. Mohd Ali, N., Md Rashid, N.K.A., Mustafah, Y.M.: Performance comparison
between RGB and HSV color segmentations for road signs detection. In: Advances
in Manufacturing and Mechanical Engineering. Applied Mechanics and Materials,
vol. 393, pp. 550–555. Trans Tech Publications Ltd (2013)
12. Morgado, M., Miguel, L.: Ergonomics in the industry 4.0: virtual and augmented
reality. J. Ergon. 08 (2018)
13. Pierdicca, R., Frontoni, E., Pollini, R., Trani, M., Verdini, L.: The use of aug-
mented reality glasses for the application in industry 4.0. In: Proceedings of the
International Conference on Augmented Reality, Virtual Reality and Computer
Graphics, pp. 389–401 (2017)
14. Romero, D., et al.: Towards an operator 4.0 typology: a human-centric perspec-
tive on the fourth industrial revolution technologies. In: Proceedings of the Int’l
Conference on Computers and Industrial Engineering, pp. 1–11 (2016)
15. Webel, S., Bockholt, U., Engelke, T., Gavish, N., Olbrich, M., Preusche, C.: Aug-
mented reality training for assembly and maintenance skills. Robot. Auton. Syst.
61(4), 398–403 (2013)
16. Zhu, Z., et al.: AR-mentor: augmented reality based mentoring system. In: Pro-
ceedings of the IEEE International Symposium on Mixed and Augmented Reality
(ISMAR 2014), pp. 17–22 (2014)
Smart River Platform - River Quality
Monitoring and Environmental Awareness
Kenedy P. Cabanga1,2, Edmilson V. Soares1,2, Lucas C. Viveiros1,
Estefânia Gonçalves2, Ivone Fachada2, José Lima1, and Ana I. Pereira1(B)
1 Research Centre in Digitalization and Intelligent Robotics CeDRI,
Instituto Politécnico de Bragança, 5300-252 Bragança, Portugal
{a36074,a39189}@alunos.ipb.pt, {viveiros,jllima,apereira}@ipb.pt
2 Centro de Ciência Viva de Bragança, Bragança, Portugal
{egoncalves,ifachada}@braganca.cienciaviva.pt
Abstract. In the technology communication era, the use of the Inter-
net of Things (IoT) has become popular among other digital solutions,
since it offers the integration of information from several organisations
and sources. By means of this, we can access data from distant
locations and at any time. In the specific case of water monitoring, the
conventional, outdated measurement methods can lead to low efficiency
and complexity issues. Hence, smart systems arise as a solution for a
broad range of cases. Smart River is a smart system platform developed to
optimize resources and monitor the water quality parameters
of the Fervença river. The central solution is based at Centro Ciência
Viva de Bragança (CCVB), one of the 21 science centers in Portugal,
which aims to promote environmental preservation and awareness among
the population. By using IoT technologies, the system allows real-
time data collection with low cost and low energy consumption, being
a complement to existing projects that are being developed to promote
the ecological importance of natural resources. This paper covers sensor
module selection for data collection inside the river and data storage.
The parameters of the river are visualized using a program developed in
Unity engine to present data averages and comparison between weeks,
months, and years.
Keywords: Smart systems · Internet of Things (IoT) · Water
parameters
1 Introduction
Nowadays, the fourth industrial revolution is becoming increasingly popular
among researchers and employees because of its broad contributions in different
sectors. Among its different areas (virtual and augmented reality, artificial intel-
ligence, advanced robots, autonomous vehicles, cloud computing, big data and
block-chain, among others) [1,2], the Internet of Things (IoT) is standing out
and growing rapidly.
Overall, the main advantage of IoT technology is connectivity, since it
allows the communication between many devices to share information inside
a cloud-based data system. Many points can be connected at any place, any
distance, and any time through the internet. This enables better efficiency for
fast information delivery, with low cost and effort in building intelligent solutions
using interconnected software, sensors, vehicles, and electronic devices [2].
Whereas the conventional methods of water quality monitoring require trav-
elling to the site to monitor the water directly or analyzing samples in the labo-
ratory, the proposed system takes advantage of IoT technologies to
build a smart solution. The Smart River platform is a set of intelligent and intercon-
nected systems through the cloud to measure the water parameters and quality
of the Fervença river in Bragança, Portugal [5].
The Fervença river is a tributary of the Sabor river; its spring is located in the
Fontes Barrosas village and it crosses the city of Bragança, being one of the main
water resources for farming activities in the small communities of the region. It
is also an extensive source of biodiversity in its waters and the riverside, inside
and outside of the city’s territory.
The rivers and streams are a priceless resource, but pollution from urban and
agricultural areas poses a threat to the water quality. To understand the value
of water quality, and to effectively manage and protect the water resources, it’s
critical to know the current status of water-quality conditions, and how and why
those conditions have been changing over time.
The system aims to collect physical and chemical parameters from the Fer-
vença river, store all the data inside a cloud-based server, and display it on a
touchscreen monitor in one of the exhibits of the science center Centro de
Ciência Viva de Bragança (CCVB). The exhibit aims to offer an interactive,
scientific and educational experience regarding the importance of carrying out
environmental monitoring combined with the latest communication technologies.
It also aims to improve the environmental awareness of CCVB visitors.
Currently, there is a great need for the use of decentralized and automated
systems for data collection in environmental systems. The data can be used for
the development of techniques that allow the improvement of environmen-
tal conditions and, consequently, of the quality of life of the people who use
natural resources, as well as promoting sustainability.
Systems such as Smart River have been increasingly used in Smart Cities,
as they allow integration with other technologies and the development of solutions
with shared databases. Combined with science communication techniques, the
module presents a tool that contributes to the population's awareness of the
environmental value of the water.
This paper is organized as follows. Section 2 presents the literature review
associated with the sensor modules configuration, the network communica-
tion, Internet of Things (IoT) and LoRaWAN protocols. Section 3 addresses
the system architecture, where the hardware and the communication scheme
are explained. Section 4 details the Smart River system development and the
management of the sensors installed in the Bragança region. The water monitor-
ing through the Node-RED and Unity applications is described in Sect. 5. Finally,
Sect. 6 presents some conclusions and points for future works.
2 Theoretical Background
River quality monitoring systems seek to ensure high precision and accuracy of
data, timely reports, data averages, easy access to data and integrity. The river
is an important source of water for all cities and one of the main threats to its
sustainability is pollution. The existing methods for monitoring river water
quality are manual monitoring and continuous monitoring. These methods are
expensive and less efficient. Therefore, an intelligent river monitoring system
was proposed [12,13].
A distinctive feature of IoT applications is intelligent data control, which is
provided by the platform known as The Things Network (TTN). The LoRaWAN
protocol is used for sending data, and each device capable of com-
municating through LoRa technology is known as an End Node. The communication
with the Gateway occurs by sending the measurements taken by the sensors
and the relevant information produced by the developed application [4,6,7].
The Gateway is responsible for creating a connection between the devices and
the network server, receiving data from the various End Nodes, and inter-
preting, storing, and passing the data to the network server. The Gateway
is always connected to the Network Server via the TCP/IP protocols,
either by a wired or a wireless network. From the point of view of the End Nodes,
the Gateway is just a messenger that forwards messages to the Network Server.
Overall, TTN is an open server platform on which application servers
are built that interpret and present the data to the user. These servers are able to
format the data to be sent to the device and to respond, when programmed
to do so, based on the collected data, autonomously and without the constant
need for human observation [8,9].
2.1 End Nodes
The LoRaWAN End nodes usually have sensors to detect a change in the param-
eters (e.g. temperature, humidity, accelerometers, GPS), a LoRa transponder
(responsible for transmitting the signals over LoRaWAN radio), and optionally,
a micro-controller. Although the End Node is battery-powered and uses a low-
energy approach, it requires attention from external programs for checking the
battery charge. Depending on the system interface developed, it can show the
remaining battery percentage. Some LoRa embedded sensors can run on batteries
that last up to 5 years and transmit signals over distances of up to 10 km to
other devices connected through the network [3].
2.2 Gateways
One advantage of LoRaWAN technology is the worldwide connection between
gateways provided by the TTN community. These devices support the operation of the
sensors that transmit the data to the LoRa Gateways. The packet communication
process uses the LoRa gateways, which are responsible for using the
network with standard internet protocols (IP) to transmit and receive
data between the embedded sensors and the server [3].
3 Smart System Development
This section presents the Smart River monitoring system, describing
in detail the river sensor modules and the communication network.
3.1 Sensor Modules
To measure the water quality of the Fervença river, the sensor module presented
in Fig. 1 was developed, which has five independent sensors: (i) ambient tempera-
ture; (ii) water temperature; (iii) relative humidity; (iv) electrical conductivity;
and (v) pH value.
Fig. 1. Electrical circuit box on the river
The sensors were selected taking into account the cost of acquisition
(limiting losses from vandalism or external damage), easy access to data using
an external system (for example, a microcontroller), and robustness (as it is an
application that requires the exposure of sensors to environmental conditions all
year long). As a result, the sensors listed in Table 1 were chosen.
Table 1. Smart system components and specifications

DHT11 (humidity and ambient temperature): voltage of 3 V to 5.5 V; electric current of 0.3 mA (measurement) and 60 µA (standby); temperature range from 0 °C to 50 °C; humidity range of 20% to 90%; temperature and humidity resolution of 16 bits; accuracy of ±1 °C and ±1% [14].

DS18B20 (water temperature): working voltage of 3.2 V to 5.25 V DC; working current of 2 mA (max); resolution of 9 to 12 bits, programmable; measuring range of −55 °C to 110 °C; measurement accuracy of ±0.5 °C from −10 °C to +80 °C and ±2 °C from −55 °C to +110 °C; output cables: yellow (data), red (vcc), black (gnd); sensor output interface XH2.54-3P [15].

Analog pH pro V2 (pH water level): supply voltage from 3.3 V to 5.5 V; output voltage of 0 V to 3.0 V; BNC probe connector; PH2.0-3P signal connector; measurement accuracy of ±0.1 at 25 °C; dimensions of 42 mm × 32 mm (1.66 in × 1.26 in); industrial-grade probe; detection range of 0 to 14; temperature range of 0 °C to 60 °C; accuracy of ±0.1 pH (25 °C); response time under 1 min; probe life of 7 × 24 h operation for more than 0.5 years (depending on water quality) [16].

Electrical conductivity sensor V1 (water conductivity): operating voltage of +5.00 V; PCB size of 45 × 32 mm (1.77 in × 1.26 in); measuring range of 1 mS/cm to 20 mS/cm; operating temperature of 5 °C to 40 °C; accuracy within ±10% F.S. (using the Arduino 10-bit ADC); PH2.0 interface (3-pin SMD); conductivity electrode (constant K = 1, BNC connector); electrode cable length of about 60 cm (23.62 in); standard conductivity solutions (1413 µS/cm and 12.88 mS/cm) ×1 [17].

Transceiver LoRa RFM95 (communication with TTN): LoRa technology is available as radio frequency communication on the 433, 868 and 915 MHz frequencies; the operating ranges of LoRa are part of the Industrial, Scientific and Medical (ISM) band, whose intervals are reserved internationally for ISM development.

Atmega328p (microcontroller): memory; timer/counter; A/D converter; D/A converter; parallel and serial ports.
3.2 Communication Network
The Smart River system communication occurs through the different IoT plat-
forms connected to the network. By using these platforms it is possible to collect
data through LoRaWAN technology and the Gateways. TTN aims to enable
low-power devices to use long-range Gateways and connect to a decentralized
open-source network. The final application allows exchanging data with high
precision and an efficient connection. Figure 2 illustrates the sensor modules
registered on the platform [10].
Fig. 2. Devices
TTN uses the LoRaWAN protocol, which defines the system architecture and
the communication parameters of LoRa technologies. The devices use low-
power LoRa links to connect to the Gateway, which in turn uses a high-
bandwidth connection, such as Wi-Fi, Ethernet, or GSM. All Gateways that
reach a device will receive its messages and forward them to the TTN. The
network then deduplicates the messages and selects the best gateway to forward
the queued messages on the downlink.
Figure 3 presents the communication scheme for data collection.
Fig. 3. Communication scheme
After the data are collected by the TTN server, they are placed in the TTN cloud. A script was developed that accesses the data at a predefined time interval and saves them as .txt files in the CCVB database, making the Smart River application more independent from TTN's data storage system.
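The retrieval script itself is not listed in the paper; the following is a minimal sketch of how such a periodic download could look, assuming a TTN Storage-Integration-style HTTP endpoint. The URL, application name, API key, and output path below are placeholders, not the actual Smart River configuration.

```python
import time
import requests

# Placeholder values: the real endpoint, application ID, API key and output
# path used by the Smart River script are not given in the paper.
STORAGE_URL = ("https://eu1.cloud.thethings.network/api/v3/as/applications/"
               "smart-river/packages/storage/uplink_message")
API_KEY = "REPLACE_WITH_TTN_API_KEY"
OUTPUT_FILE = "ccvb_database/smart_river_readings.txt"
POLL_INTERVAL_S = 15 * 60  # same 15 min cadence as the sensor nodes


def fetch_and_store():
    """Download the most recent uplinks and append them, one JSON record per
    line, to a plain .txt file acting as the local (CCVB) copy of the data."""
    response = requests.get(
        STORAGE_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        params={"last": "15m"},  # only the messages since the previous poll
        timeout=30,
    )
    response.raise_for_status()
    with open(OUTPUT_FILE, "a", encoding="utf-8") as output:
        for line in response.text.splitlines():
            if line.strip():  # the endpoint returns one JSON object per line
                output.write(line + "\n")


if __name__ == "__main__":
    while True:
        try:
            fetch_and_store()
        except requests.RequestException as error:
            print(f"Fetch failed, will retry next cycle: {error}")
        time.sleep(POLL_INTERVAL_S)
```

Polling at the same interval as the nodes keeps the local copy current without querying the cloud more often than new data can arrive.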
4 Smart River Platform
The Smart River platform is an interactive module consisting of sensors connected through the network, which collect data on environmental parameters of the main river of Bragança, the Fervença River. Figure 4 illustrates the positioning of the sensors in the region. The sensors are placed at strategic locations that allow data collection at four (n = 4) different points:
i) before reaching the city of Bragança (near Castro de Avelãs village),
ii) in the city of Bragança at Instituto Politécnico de Bragança (IPB),
iii) in the city of Bragança at Casa-da-Seda, a building of Centro de Ciência
Viva de Bragança,
iv) after the city of Bragança, near the Municipality Wastewater Treatment Plant (Estação de Tratamento de Águas Residuais - ETAR).
Fig. 4. Location of sensors
The strategic positioning of the sensors makes it possible to evaluate and correlate the collected data with urban activity, and thus to propose solutions that improve the quality of the river water.
The system is based on an IoT application that implements the LoRaWAN protocol for the sensor communication with the TTN server, an IoT platform built on cloud servers that connect to gateways spread across the planet.
The TTN network uses the LoRaWAN protocol and is an open (open-source) and collaborative (crowdsourced) network, where anyone can install a gateway at home, in a company, in an educational institution, and so on, thereby increasing the network coverage area. In addition to installing gateways, anyone can install a node and connect to a working gateway at no cost to use the network [11].
Today TTN has thousands of gateways around the world and a very active community that uses the platform to discuss and exchange information. With the TTN network, IoT applications can be quickly prototyped, moving from the validation phase to production with guaranteed scalability and security [11].
After the data are collected by the sensors and transmitted, they are stored in a cloud database that can be accessed by external applications. The microcontroller is programmed to trigger the data collection by the sensors and to send the information in byte format to the TTN server.
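The paper does not specify the uplink payload layout; the sketch below only illustrates what sending "information in byte format" can look like, packing one set of readings into a compact frame with Python's struct module and decoding it again on the application side. The field order and the ×100 scaling are assumptions, and the real encoder would run in C on the ATmega328P node.

```python
import struct

# Assumed 10-byte frame: air temperature, humidity, water temperature, pH and
# conductivity, each scaled to a 16-bit integer to keep the LoRa payload small.
FRAME_FORMAT = ">hHhHH"  # big-endian: int16, uint16, int16, uint16, uint16


def encode_readings(air_temp_c, humidity_pct, water_temp_c, ph, cond_ms_cm):
    """Scale the readings and pack them into the fixed-size uplink frame."""
    return struct.pack(
        FRAME_FORMAT,
        int(air_temp_c * 100),
        int(humidity_pct * 100),
        int(water_temp_c * 100),
        int(ph * 100),
        int(cond_ms_cm * 100),
    )


def decode_readings(frame):
    """Reverse the scaling on the TTN/application side."""
    air, hum, water, ph, cond = struct.unpack(FRAME_FORMAT, frame)
    return air / 100, hum / 100, water / 100, ph / 100, cond / 100


payload = encode_readings(21.4, 63.0, 14.8, 7.2, 1.41)
print(len(payload), decode_readings(payload))  # 10 bytes, round-trips exactly
```

Keeping the frame this small matters because LoRaWAN payloads and airtime are tightly limited by regional duty-cycle rules.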
The microcontroller uses a "hibernate" mode for a defined interval of 15 min and, if the battery voltage drops below 3 V, it goes into "hibernate forever". In this way, energy consumption is reduced as much as possible, allowing a longer battery life [11].
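The node firmware is written for the ATmega328P and is not reproduced in the paper; purely to make the duty-cycle policy concrete, the sketch below expresses it in Python with hypothetical read_battery_voltage, read_sensors, transmit, and sleep helpers.

```python
SLEEP_INTERVAL_S = 15 * 60   # wake every 15 minutes
BATTERY_CUTOFF_V = 3.0       # below this the node "hibernates forever"


def duty_cycle(read_battery_voltage, read_sensors, transmit, sleep):
    """Conceptual main loop: measure, send, then sleep, unless the battery
    is too low, in which case the node stops waking up to protect it."""
    while True:
        if read_battery_voltage() < BATTERY_CUTOFF_V:
            break                 # "hibernate forever"
        transmit(read_sensors())  # one uplink per wake-up
        sleep(SLEEP_INTERVAL_S)   # low-power hibernation between readings
```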
To allow access to the data in an educational, scientific, and interactive way, two user interface applications were developed: i) an application developed in the Unity engine, which is part of the permanent exhibition of Casa da Seda; and ii) a Node.js server application, which can be accessed via linked mobile phones. One example is presented in Fig. 5.
The Unity application was designed to be educational and intuitive, facilitating access to the data averages and to information about the measured parameters, the collection points, and the use of LoRaWAN technology in the development of the module.
Fig. 5. Parameters description
5 Water Monitoring
The Smart River system involves several stages of development before its presentation to CCVB visitors. Before the output information is shown on a monitor, all the data from the sensor module are sent to the network using the LoRaWAN protocol. These measurements are stored on the Node-RED servers and saved in a local database. The data are then accessed by the Smart River application built in the Unity engine, which computes the averages of the river parameters and generates the 3D bar graphs shown on the interactive touchscreen interface used by the visitors.
The use of animated graphs in Unity helps attract the attention of CCVB visitors. Thus, the platform can translate scientific and technical data, as well as the physical and chemical concepts behind the river parameters, to the public, as can be observed in Fig. 6.
By using the system, visitors can learn about the conductivity, the temperature, and other river parameters, as well as the ideal values for water quality. Further, the system allows the data to be filtered by week, month, and year, as can be observed in Fig. 6. All the graphs, as well as the image descriptions of the region, are animated to draw the visitors' attention and raise interest in the topics addressed.
Fig. 6. Average graph filtered by week - Unity engine
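The averages themselves are computed inside the Unity application, which is not reproduced in the paper; purely to illustrate the week/month/year filtering idea, the sketch below aggregates a time-stamped series with pandas. The column names and sample values are hypothetical.

```python
import pandas as pd

# Hypothetical long-format readings, one row per measurement, as they might be
# read back from the local database: timestamp, collection point, parameter, value.
readings = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2021-06-01 10:00", "2021-06-01 10:15",
             "2021-06-08 10:00", "2021-07-01 10:00"]
        ),
        "point": ["IPB", "IPB", "IPB", "IPB"],
        "parameter": ["water_temperature"] * 4,
        "value": [14.2, 14.6, 15.1, 17.3],
    }
)


def average_by(data, frequency):
    """Average every parameter at every point per period:
    'W' for week, 'M' for month, 'Y' for year."""
    grouper = pd.Grouper(key="timestamp", freq=frequency)
    return data.groupby(["point", "parameter", grouper])["value"].mean()


print(average_by(readings, "W"))  # weekly averages, e.g. to feed the bar graphs
print(average_by(readings, "M"))  # monthly averages
```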
Besides the 3D application, the measurements can also be accessed from a technical perspective through the Node-RED interface, as shown in Fig. 7. The Node.js server application presents the data in a more condensed graphical form, allowing the raw variables to be correlated without an averaging filter, which is useful to observe the temporal variation of each parameter individually and to manage the devices across Bragança. The system also reports data transmission failures and battery levels, as well as data integrity and data storage.
Fig. 7. Node-RED server interface (Color figure online)
6 Conclusion and Future Work
The Smart River platform can be one of the best solutions for analyzing the water quality of the river using an IoT methodology for data collection. The proposed system works through the TTN server, which stores the data, and makes use of LoRaWAN technology and protocols for sending and receiving data over the internet.
Two independent systems were proposed: the Unity educational system is intended to display the data averages in an animated way with touchscreen interaction for the CCVB visitors, while the Node.js server was created for the control and management of the devices installed in the river and for future scientific research. All the data are being collected and evaluated in both systems, which have been shown to be coherent and consistent with the sensors' output.
It is expected that the river's water parameters will be observed and analyzed together with the competent authorities in order to create coordinated action plans aimed at improving the water quality, supporting environmental balance and the social well-being of the citizens of Bragança and of tourists, and ensuring that the Fervença river basin has a good-quality water source.
This work is expected to provide a better analysis of the consequences of human activities on the water quality of the Fervença River, provide reliable data for further research in the field, promote environmental awareness among residents and visitors of the Bragança Ciência Viva Science Center, and support actions by the competent authorities to improve the quality of the water resources and, consequently, the well-being of the environment and the population.
The aesthetic factor is also relevant among the expected results of the system, since improving the river water quality will also increase the value of real estate in the region, attracting more people to live close to the river and to the city's historic center. For future work, it is intended to optimize the module platform so that the battery life can be extended, and to replace the digital pH and electrical conductivity sensors with industry-standard ones that have a longer life in the water. Additional data analyses and environmental alerts can also be implemented in the Smart River system.
Acknowledgments. This work has also been supported by Fundação La Caixa within
the Project Scope: Natureza Virtual, and FCT - Fundação para a Ciência e Tecnologia
within the Project Scope: UIDB/05757/2020.
References
1. Rymarczyk, J.: Technologies, opportunities and challenges of the industrial revolu-
tion 4.0: theoretical considerations. Entrepreneurial Bus. Econ. Rev. 8(1), 185–198
(2020). https://doi.org/10.15678/EBER.2020.080110
2. Hasan, M.S., Yu, H.: Innovative developments in HCI and future trends. Int. J.
Autom. Comput. 14(1), 10–20 (2017). https://doi.org/10.1007/s11633-016-1039-6
3. Basford, P.J., Johnston, S.J., Apetroaie-Cristea, M., Bulot, F.M.J., Cox, S.J.:
LoRaWAN for city scale IoT deployments. Glob. IoT Summit (GIoTS) 2019, 1–6
(2019). https://doi.org/10.1109/GIOTS.2019.8766359
4. Jeong, H., Liu, Y.: Computational modeling of touchscreen drag gestures using a
cognitive architecture and motion tracking. Int. J. Hum.-Comput. Interact. 35(6),
510–520 (2019). https://doi.org/10.1080/10447318.2018.1466858
5. USGS Homepage. https://www.usgs.gov/mission-areas/water-resources/
science/water-quality-nation-s-streams-and-rivers-current-conditions?qt-
science_center_objects=0#qt-science_center_objects. Accessed 21 June 2021
6. Madakam, S., Lake, V., Lake, V., Lake, V.: Internet of things (IoT): a literature
review. J. Comput. Commun. 3(05), 164 (2015)
7. Caldas Filho, F.L., Martins, L., Araújo, I.P., Mendonça, F., Costa, J.: Gerencia-
mento de Serviços IoT com gateway Semântico. In: Atas das Conferências IADIS
Ibero-Americanas WWW/Internet 2017 e Computação Aplicada 2017, pp. 199–
206. IADIS Press (2017)
8. Santos, A.V.D.: Integração entre dispositivos LoRa e servidores de aplicação uti-
lizando o protocolo LoRaWAN. Technical report from Universidade Federal de
Santa Catarina, pp. 1–67 (2021)
9. Zyrianoff, I., Heideker, A., Silva, D.O., Kleinschmidt, J.H., Kamienski, C.A.:
Impacto de LoRaWAN no desempenho de plataformas de IoT baseadas em nuvem
e névoa computacional. In: Anais do XVII Workshop em Clouds e Aplicações, pp.
43–56 (2019)
10. Cabanga, P.K.: Monitorização de contagem eletrica inteligente hardware. Technical
report from Instituto Politécnico de Bragança, pp. 24–25 (2020)
11. Soares, V.E.: Monitorização de contagem eletrica inteligente software. Technical
report from Instituto Politécnico de Bragança, pp. 7–19 (2020)
12. Adu-Manu, K.S., Katsriku, F.A., Abdulai, J.D., Engmann, F.: Smart river moni-
toring using wireless sensor networks. Wirel. Commun. Mob. Comput. 2020, 1–2
(2020)
13. Elijah, O., et al.: A concept paper on an intelligent river monitoring system for
river sustainability. Int. J. Integr. Eng. 10, 1–5 (2018)
14. FILIPEFLOP Homepage. https://www.filipeflop.com/produto/sensor-de-
umidade-e-temperatura-dht11/. Accessed 20 Feb 2021
15. ELECTROFUN Homepage. https://www.electrofun.pt/sensores-arduino/sensor-
temperatura-ds18b20. Accessed 20 Feb 2021
16. OPENcircuit Homepage. https://opencircuit.shop/Product/Gravity-Analog-pH-
Sensor-Meter-Pro-Kit-V2. Accessed 20 Feb 2021
17. BOTNROLL Homepage. https://www.botnroll.com/pt/biometricos/2384-
gravity-analog-electrical-conductivity-sensor-meter-for-arduino-.html. Accessed
20 Feb 2021
Health Informatics
Analysis of the Middle and Long Latency ERP
Components in Schizophrenia
Miguel Rocha e Costa1, Felipe Teixeira1,2 , and João Paulo Teixeira1,3(B)
1 Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de
Bragança, 5300-253 Bragança, Portugal
a40308@alunos.ipb.pt, {felipe.laje,joaopt}@ipb.pt
2 Universidade de Trás-os-Montes e Alto Douro, 5000-801 Vila Real, Portugal
3 Applied Management Research Unit (UNIAG),
Instituto Politécnico de Bragança, Bragança, Portugal
Abstract. Schizophrenia is a complex and disabling mental disorder estimated to
affect 21 million people worldwide. Electroencephalography (EEG) has proven to
be an excellent tool to improve and aid the current diagnosis of mental disorders
such as schizophrenia. The illness is comprised of various disabilities associated
with sensory processing and perception. In this work, the first 10−200 ms of
brain activity after the self-generation via button presses (condition 1) and pas-
sive presentation (condition 2) of auditory stimuli was addressed. A time-domain
analysis of the event-related potentials (ERPs), specifically the MLAEP, N1, and
P2 components, was conducted on 49 schizophrenic patients (SZ) and 32 healthy
controls (HC), provided by a public dataset. The amplitudes, latencies, and scalp
distribution of the peaks were used to compare groups. Suppression, measured as
the difference between both conditions’ neural activity, was also evaluated. With
the exception of the N1 peak during condition (1), patients exhibited significantly
reduced amplitudes in all waveforms analyzed in both conditions. The SZ group
also demonstrated a peak delay in the MLAEP during condition (2) and a modestly
earlier P2 peak during condition (1). Furthermore, patients exhibited less and more
N1 and P2 suppression, respectively. Finally, the spatial distribution of activity in
the scalp during the MLAEP peak in both conditions, N1 peak in condition (1)
and N1 suppression differed considerably between groups. These findings and
measure
  • 1.
    Ana I. Pereira· Florbela P. Fernandes · João P. Coelho · João P. Teixeira · Maria F. Pacheco · Paulo Alves · Rui P. Lopes (Eds.) First International Conference, OL2A 2021 Bragança, Portugal, July 19–21, 2021 Revised Selected Papers Optimization, Learning Algorithms and Applications Communications in Computer and Information Science 1488
  • 2.
    Communications in Computer andInformation Science 1488 Editorial Board Members Joaquim Filipe Polytechnic Institute of Setúbal, Setúbal, Portugal Ashish Ghosh Indian Statistical Institute, Kolkata, India Raquel Oliveira Prates Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil Lizhu Zhou Tsinghua University, Beijing, China
  • 3.
    More information aboutthis series at https://link.springer.com/bookseries/7899
  • 4.
    Ana I. Pereira· Florbela P. Fernandes · João P. Coelho · João P. Teixeira · Maria F. Pacheco · Paulo Alves · Rui P. Lopes (Eds.) Optimization, Learning Algorithms and Applications First International Conference, OL2A 2021 Bragança, Portugal, July 19–21, 2021 Revised Selected Papers
  • 5.
    Editors Ana I. Pereira InstitutoPolitécnico de Bragança Bragança, Portugal João P. Coelho Instituto Politécnico de Bragança Bragança, Portugal Maria F. Pacheco Instituto Politécnico de Bragança Bragança, Portugal Rui P. Lopes Instituto Politécnico de Bragança Bragança, Portugal Florbela P. Fernandes Instituto Politécnico de Bragança Bragança, Portugal João P. Teixeira Instituto Politécnico de Bragança Bragança, Portugal Paulo Alves Instituto Politécnico de Bragança Bragança, Portugal ISSN 1865-0929 ISSN 1865-0937 (electronic) Communications in Computer and Information Science ISBN 978-3-030-91884-2 ISBN 978-3-030-91885-9 (eBook) https://doi.org/10.1007/978-3-030-91885-9 © Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
  • 6.
    Preface The volume CCIS1488 contains the refereed proceedings of the International Conference on Optimization, Learning Algorithms and Applications (OL2A 2021), an event that, due to the COVID-19 pandemic, was held online. OL2A 2021 provided a space for the research community on optimization and learning to get together and share the latest developments, trends, and techniques as well as develop new paths and collaborations. OL2A 2021 had more than 400 participants in an online environment throughout the three days of the conference (July 19–21, 2021), discussing topics associated to areas such as optimization and learning and state-of-the-art applications related to multi-objective optimization, optimization for machine learning, robotics, health informatics, data analysis, optimization and learning under uncertainty, and the Fourth Industrial Revolution. Four special sessions were organized under the following topics: Trends in Engineering Education, Optimization in Control Systems Design, Data Visualization and Virtual Reality, and Measurements with the Internet of Things. The event had 52 accepted papers, among which 39 were full papers. All papers were carefully reviewed and selected from 134 submissions. All the reviews were carefully carried out by a Scientific Committee of 61 PhD researchers from 18 countries. July 2021 Ana I. Pereira
  • 7.
    Organization General Chair Ana IsabelPereira Polytechnic Institute of Bragança, Portugal Organizing Committee Chairs Florbela P. Fernandes Polytechnic Institute of Bragança, Portugal João Paulo Coelho Polytechnic Institute of Bragança, Portugal João Paulo Teixeira Polytechnic Institute of Bragança, Portugal M. Fátima Pacheco Polytechnic Institute of Bragança, Portugal Paulo Alves Polytechnic Institute of Bragança, Portugal Rui Pedro Lopes Polytechnic Institute of Bragança, Portugal Scientific Committee Ana Maria A. C. Rocha University of Minho, Portugal Ana Paula Teixeira University of Trás-os-Montes and Alto Douro, Portugal André Pinz Borges Federal University of Technology – Paraná, Brazil Andrej Košir University of Ljubljana, Slovenia Arnaldo Cândido Júnior Federal University of Technology – Paraná, Brazil Bruno Bispo Federal University of Santa Catarina, Brazil Carmen Galé University of Zaragoza, Spain B. Rajesh Kanna Vellore Institute of Technology, India C. Sweetlin Hemalatha Vellore Institute of Technology, India Damir Vrančić Jozef Stefan Institute, Slovenia Daiva Petkeviciute Kaunas University of Technology, Lithuania Diamantino Silva Freitas University of Porto, Portugal Esteban Clua Federal Fluminense University, Brazil Eric Rogers University of Southampton, UK Felipe Nascimento Martins Hanze University of Applied Sciences, The Netherlands Gaukhar Muratova Dulaty University, Kazakhstan Gediminas Daukšys Kauno Technikos Kolegija, Lithuania Glaucia Maria Bressan Federal University of Technology – Paraná, Brazil Humberto Rocha University of Coimbra, Portugal José Boaventura-Cunha University of Trás-os-Montes and Alto Douro, Portugal José Lima Polytechnic Institute of Bragança, Portugal Joseane Pontes Federal University of Technology – Ponta Grossa, Brazil Juani Lopéz Redondo University of Almeria, Spain
  • 8.
    viii Organization Jorge RibeiroPolytechnic Institute of Viana do Castelo, Portugal José Ramos NOVA University Lisbon, Portugal Kristina Sutiene Kaunas University of Technology, Lithuania Lidia Sánchez University of León, Spain Lino Costa University of Minho, Portugal Luís Coelho Polytecnhic Institute of Porto, Portugal Luca Spalazzi Marche Polytechnic University, Italy Manuel Castejón Limas University of León, Spain Marc Jungers Université de Lorraine, France Maria do Rosário de Pinho University of Porto, Portugal Marco Aurélio Wehrmeister Federal University of Technology – Paraná, Brazil Mikulas Huba Slovak University of Technology in Bratislava, Slovakia Michał Podpora Opole University of Technology, Poland Miguel Ángel Prada University of León, Spain Nicolae Cleju Technical University of Iasi, Romania Paulo Lopes dos Santos University of Porto, Portugal Paulo Moura Oliveira University of Trás-os-Montes and Alto Douro, Portugal Pavel Pakshin Nizhny Novgorod State Technical University, Russia Pedro Luiz de Paula Filho Federal University of Technology – Paraná, Brazil Pedro Miguel Rodrigues Catholic University of Portugal, Portugal Pedro Morais Polytechnic Institute of Cávado e Ave, Portugal Pedro Pinto Polytechnic Institute of Viana do Castelo, Portugal Rudolf Rabenstein Friedrich-Alexander-University of Erlangen-Nürnberg, Germany Sani Rutz da Silva Federal University of Technology – Paraná, Brazil Sara Paiva Polytechnic Institute of Viana do Castelo, Portugal Sofia Rodrigues Polytechnic Institute of Viana do Castelo, Portugal Sławomir St˛ epień Poznan University of Technology, Poland Teresa Paula Perdicoulis University of Trás-os-Montes and Alto Douro, Portugal Toma Roncevic University of Split, Croatia Vitor Duarte dos Santos NOVA University Lisbon, Portugal Wojciech Paszke University of Zielona Gora, Poland Wojciech Giernacki Poznan University of Technology, Poland
  • 9.
    Contents Optimization Theory Dynamic ResponseSurface Method Combined with Genetic Algorithm to Optimize Extraction Process Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Laires A. Lima, Ana I. Pereira, Clara B. Vaz, Olga Ferreira, Márcio Carocho, and Lillian Barros Towards a High-Performance Implementation of the MCSFilter Optimization Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 Leonardo Araújo, Maria F. Pacheco, José Rufino, and Florbela P. Fernandes On the Performance of the OrthoMads Algorithm on Continuous and Mixed-Integer Optimization Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 Marie-Ange Dahito, Laurent Genest, Alessandro Maddaloni, and José Neto A Look-Ahead Based Meta-heuristics for Optimizing Continuous Optimization Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 Thomas Nordli and Noureddine Bouhmala Inverse Optimization for Warehouse Management . . . . . . . . . . . . . . . . . . . . . . . . . . 56 Hannu Rummukainen Model-Agnostic Multi-objective Approach for the Evolutionary Discovery of Mathematical Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 Alexander Hvatov, Mikhail Maslyaev, Iana S. Polonskaya, Mikhail Sarafanov, Mark Merezhnikov, and Nikolay O. Nikitin A Simple Clustering Algorithm Based on Weighted Expected Distances . . . . . . . 86 Ana Maria A. C. Rocha, M. Fernanda P. Costa, and Edite M. G. P. Fernandes Optimization of Wind Turbines Placement in Offshore Wind Farms: Wake Effects Concerns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 José Baptista, Filipe Lima, and Adelaide Cerveira A Simulation Tool for Optimizing a 3D Spray Painting System . . . . . . . . . . . . . . . 110 João Casanova, José Lima, and Paulo Costa
  • 10.
    x Contents Optimization ofGlottal Onset Peak Detection Algorithm for Accurate Jitter Measurement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 Joana Fernandes, Pedro Henrique Borghi, Diamantino Silva Freitas, and João Paulo Teixeira Searching the Optimal Parameters of a 3D Scanner Through Particle Swarm Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 João Braun, José Lima, Ana I. Pereira, Cláudia Rocha, and Paulo Costa Optimal Sizing of a Hybrid Energy System Based on Renewable Energy Using Evolutionary Optimization Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 Yahia Amoura, Ângela P. Ferreira, José Lima, and Ana I. Pereira Robotics Human Detector Smart Sensor for Autonomous Disinfection Mobile Robot . . . . 171 Hugo Mendonça, José Lima, Paulo Costa, António Paulo Moreira, and Filipe Santos Multiple Mobile Robots Scheduling Based on Simulated Annealing Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 Diogo Matos, Pedro Costa, José Lima, and António Valente Multi AGV Industrial Supervisory System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203 Ana Cruz, Diogo Matos, José Lima, Paulo Costa, and Pedro Costa Dual Coulomb Counting Extended Kalman Filter for Battery SOC Determination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219 Arezki A. Chellal, José Lima, José Gonçalves, and Hicham Megnafi Sensor Fusion for Mobile Robot Localization Using Extended Kalman Filter, UWB ToF and ArUco Markers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235 Sílvia Faria, José Lima, and Paulo Costa Deep Reinforcement Learning Applied to a Robotic Pick-and-Place Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251 Natanael Magno Gomes, Felipe N. Martins, José Lima, and Heinrich Wörtche Measurements with the Internet of Things An IoT Approach for Animals Tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269 Matheus Zorawski, Thadeu Brito, José Castro, João Paulo Castro, Marina Castro, and José Lima
  • 11.
    Contents xi Optimizing DataTransmission in a Wireless Sensor Network Based on LoRaWAN Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281 Thadeu Brito, Matheus Zorawski, João Mendes, Beatriz Flamia Azevedo, Ana I. Pereira, José Lima, and Paulo Costa Indoor Location Estimation Based on Diffused Beacon Network . . . . . . . . . . . . . 294 André Mendes and Miguel Diaz-Cacho SMACovid-19 – Autonomous Monitoring System for Covid-19 . . . . . . . . . . . . . . 309 Rui Fernandes and José Barbosa Optimization in Control Systems Design Economic Burden of Personal Protective Strategies for Dengue Disease: an Optimal Control Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319 Artur M. C. Brito da Cruz and Helena Sofia Rodrigues ERP Business Speed – A Measuring Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . 336 Zornitsa Yordanova BELBIC Based Step-Down Controller Design Using PSO . . . . . . . . . . . . . . . . . . . 345 João Paulo Coelho, Manuel Braz-César, and José Gonçalves Robotic Welding Optimization Using A* Parallel Path Planning . . . . . . . . . . . . . . 357 Tiago Couto, Pedro Costa, Pedro Malaca, Daniel Marques, and Pedro Tavares Deep Learning Leaf-Based Species Recognition Using Convolutional Neural Networks . . . . . . . 367 Willian Oliveira Pires, Ricardo Corso Fernandes Jr., Pedro Luiz de Paula Filho, Arnaldo Candido Junior, and João Paulo Teixeira Deep Learning Recognition of a Large Number of Pollen Grain Types . . . . . . . . 381 Fernando C. Monteiro, Cristina M. Pinto, and José Rufino Predicting Canine Hip Dysplasia in X-Ray Images Using Deep Learning . . . . . . 393 Daniel Adorno Gomes, Maria Sofia Alves-Pimenta, Mário Ginja, and Vitor Filipe Convergence of the Reinforcement Learning Mechanism Applied to the Channel Detection Sequence Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401 André Mendes
  • 12.
    xii Contents Approaches toClassify Knee Osteoarthritis Using Biomechanical Data . . . . . . . 417 Tiago Franco, P. R. Henriques, P. Alves, and M. J. Varanda Pereira Artificial Intelligence Architecture Based on Planar LiDAR Scan Data to Detect Energy Pylon Structures in a UAV Autonomous Detailed Inspection Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430 Matheus F. Ferraz, Luciano B. Júnior, Aroldo S. K. Komori, Lucas C. Rech, Guilherme H. T. Schneider, Guido S. Berger, Álvaro R. Cantieri, José Lima, and Marco A. Wehrmeister Data Visualization and Virtual Reality Machine Vision to Empower an Intelligent Personal Assistant for Assembly Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447 Matheus Talacio, Gustavo Funchal, Victória Melo, Luis Piardi, Marcos Vallim, and Paulo Leitao Smart River Platform - River Quality Monitoring and Environmental Awareness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463 Kenedy P. Cabanga, Edmilson V. Soares, Lucas C. Viveiros, Estefânia Gonçalves, Ivone Fachada, José Lima, and Ana I. Pereira Health Informatics Analysis of the Middle and Long Latency ERP Components in Schizophrenia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477 Miguel Rocha e Costa, Felipe Teixeira, and João Paulo Teixeira Feature Selection Optimization for Breast Cancer Diagnosis . . . . . . . . . . . . . . . . . 492 Ana Rita Antunes, Marina A. Matos, Lino A. Costa, Ana Maria A. C. Rocha, and Ana Cristina Braga Cluster Analysis for Breast Cancer Patterns Identification . . . . . . . . . . . . . . . . . . . 507 Beatriz Flamia Azevedo, Filipe Alves, Ana Maria A. C. Rocha, and Ana I. Pereira Overview of Robotic Based System for Rehabilitation and Healthcare . . . . . . . . . 515 Arezki A. Chellal, José Lima, Florbela P. Fernandes, José Gonçalves, Maria F. Pacheco, and Fernando C. Monteiro Understanding Health Care Access in Higher Education Students . . . . . . . . . . . . . 531 Filipe J. A. Vaz, Clara B. Vaz, and Luís C. D. Cadinha
  • 13.
    Contents xiii Using NaturalLanguage Processing for Phishing Detection . . . . . . . . . . . . . . . . . . 540 Richard Adolph Aires Jonker, Roshan Poudel, Tiago Pedrosa, and Rui Pedro Lopes Data Analysis A Panel Data Analysis of the Electric Mobility Deployment in the European Union . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555 Sarah B. Gruetzmacher, Clara B. Vaz, and Ângela P. Ferreira Data Analysis of Workplace Accidents - A Case Study . . . . . . . . . . . . . . . . . . . . . . 571 Inês P. Sena, João Braun, and Ana I. Pereira Application of Benford’s Law to the Tourism Demand: The Case of the Island of Sal, Cape Verde . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587 Gilberto A. Neves, Catarina S. Nunes, and Paula Odete Fernandes Volunteering Motivations in Humanitarian Logistics: A Case Study in the Food Bank of Viana do Castelo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 599 Ana Rita Vasconcelos, Ângela Silva, and Helena Sofia Rodrigues Occupational Behaviour Study in the Retail Sector . . . . . . . . . . . . . . . . . . . . . . . . . 617 Inês P. Sena, Florbela P. Fernandes, Maria F. Pacheco, Abel A. C. Pires, Jaime P. Maia, and Ana I. Pereira A Scalable, Real-Time Packet Capturing Solution . . . . . . . . . . . . . . . . . . . . . . . . . . 630 Rafael Oliveira, João P. Almeida, Isabel Praça, Rui Pedro Lopes, and Tiago Pedrosa Trends in Engineering Education Assessing Gamification Effectiveness in Education Through Analytics . . . . . . . . 641 Zornitsa Yordanova Real Airplane Cockpit Development Applied to Engineering Education: A Project Based Learning Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649 José Carvalho, André Mendes, Thadeu Brito, and José Lima Azbot-1C: An Educational Robot Prototype for Learning Mathematical Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 657 Francisco Pedro, José Cascalho, Paulo Medeiros, Paulo Novo, Matthias Funk, Albeto Ramos, Armando Mendes, and José Lima
  • 14.
    xiv Contents Towards DistanceTeaching: A Remote Laboratory Approach for Modbus and IoT Experiencing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 670 José Carvalho, André Mendes, Thadeu Brito, and José Lima Evaluation of Soft Skills Through Educational Testbed 4.0 . . . . . . . . . . . . . . . . . . 678 Leonardo Breno Pessoa da Silva, Bernado Perrota Barreto, Joseane Pontes, Fernanda Tavares Treinta, Luis Mauricio Martins de Resende, and Rui Tadashi Yoshino Collaborative Learning Platform Using Learning Optimized Algorithms . . . . . . . 691 Beatriz Flamia Azevedo, Yahia Amoura, Gauhar Kantayeva, Maria F. Pacheco, Ana I. Pereira, and Florbela P. Fernandes Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 703
  • 15.
  • 16.
    Dynamic Response SurfaceMethod Combined with Genetic Algorithm to Optimize Extraction Process Problem Laires A. Lima1,2(B) , Ana I. Pereira1 , Clara B. Vaz1 , Olga Ferreira2 , Márcio Carocho2 , and Lillian Barros2 1 Research Center in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253 Bragança, Portugal {laireslima,apereira,clvaz}@ipb.pt 2 Centro de Investigação de Montanha (CIMO), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253 Bragança, Portugal {oferreira,mcarocho,lillian}@ipb.pt Abstract. This study aims to find and develop an appropriate opti- mization approach to reduce the time and labor employed throughout a given chemical process and could be decisive for quality management. In this context, this work presents a comparative study of two optimization approaches using real experimental data from the chemical engineering area, reported in a previous study [4]. The first approach is based on the traditional response surface method and the second approach combines the response surface method with genetic algorithm and data mining. The main objective is to optimize the surface function based on three variables using hybrid genetic algorithms combined with cluster analysis to reduce the number of experiments and to find the closest value to the optimum within the established restrictions. The proposed strategy has proven to be promising since the optimal value was achieved with- out going through derivability unlike conventional methods, and fewer experiments were required to find the optimal solution in comparison to the previous work using the traditional response surface method. Keywords: Optimization · Genetic algorithm · Cluster analysis 1 Introduction Search and optimization methods have several principles, being the most rele- vant: the search space, where the possibilities for solving the problem in question are considered; the objective function (or cost function); and the codification of the problem, that is, the way to evaluate an objective in the search space [1]. Conventional optimization techniques start with an initial value or vector that, iteratively, is manipulated using some heuristic or deterministic process c Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 3–14, 2021. https://doi.org/10.1007/978-3-030-91885-9_1
  • 17.
    4 L. A.Lima et al. directly associated with the problem to be solved. The great difficulty to deal with when solving a problem using a stochastic method is the number of possible solutions growing with a factorial speed, being impossible to list all possible solutions of the problem [12]. Evolutionary computing techniques operate on a population that changes in each iteration. Thus, they can search in different regions on the feasible space, allocating an appropriate number of members to search in different areas [12]. Considering the importance of predicting the behavior of analytical processes and avoiding expensive procedures, this study aims to propose an alternative for the optimization of multivariate problems, e.g. extraction processes of high- value compounds from plant matrices. In the standard analytical approach, the identification and quantification of phenolic compounds require expensive and complex laboratory assays [6]. An alternative approach can be applied using forecasting models from Response Surface Method (RSM). This approach can maximize the extraction yield of the target compounds while decreasing the cost of the extraction process. In this study, a comparative analysis between two optimization methodolo- gies (traditional RSM and dynamic RSM), developed in MATLAB R software (version R2019a 9.6), that aim to maximize the heat-assisted extraction yield and phenolic compounds content in chestnut flower extracts is presented. This paper is organized as follows. Section 2 describes the methods used to evaluate multivariate problems involving optimization processes: Response Sur- face Method (RSM), Hybrid Genetic Algorithm, Cluster Analysis and Bootsrap Analysis. Sections 3 and 4 introduce the case study, consisting of the optimization of the extraction yield and content of phenolic compounds in extracts of chest- nut flower by two different approaches: Traditional RSM and Dynamic RSM. Section 5 includes the numerical results obtained by both methods and their comparative evaluation. Finally, Sect. 6 presents the conclusions and future work. 2 Methods Approaches and Techniques Regarding optimization problems, some methods are used more frequently (tra- ditional RSM, for example) due to their applicability and suitability to different cases. For the design of the dynamic RSM, the approach of conventional meth- ods based on Genetic Algorithm combined with clustering and bootstrap analysis was made to evaluate the aspects that could be incorporated into the algorithm developed in this work. The key concepts for dynamic RSM are presented below. 2.1 Response Surface Method The Response Surface Method is a tool introduced in the early 1950s by Box and Wilson, which covers a collection of mathematical and statistical techniques useful for approximating and optimizing stochastic models [11]. It is a widely used optimization method, which applies statistical techniques based on special factorial designs [2,3]. Its scientific approach estimates the ideal conditions for
  • 18.
    Dynamic RSM Combinedwith GA to Optimize Extraction Process Problem 5 achieving the highest or lowest required response value, through the design of the response surface from the Taylor series [8]. RSM promotes the greatest amount of information on experiments, such as the time of experiments and the influence of each dependent variable, being one of the largest advantages in obtaining the necessary general information on the planning of the process and the experience design [8]. 2.2 Hybrid Genetic Algorithm The genetic algorithm is a stochastic optimization method based on the evolu- tionary process of natural selection and genetic dynamics. The method seeks to combine the survival of the fittest among the string structures with an exchange of random, but structured, information to form an ideal solution [7]. Although they are randomized, GA search strategies are able to explore several regions of the feasible search space at a time. In this way, along with the iterations, a unique search path is built, as new solutions are obtained through the com- bination of previous solutions [1]. Optimization problems with restrictions can influence the sampling capacity of a genetic algorithm due to the population limits considered. Incorporating a local optimization method into GA can help overcome most of the obstacles that arise as a result of finite population sizes, for example, the accumulation of stochastic errors that generate genetic drift problems [1,7]. 2.3 Cluster Analysis Cluster algorithms are often used to group large data sets and play an impor- tant role in pattern recognition and mining large arrays. k-means and k-medoids strategies work by grouping partition data into a k number of mutually exclusive clusters, demonstrated in Fig. 1. These techniques assign each observation to a cluster, minimizing the distance from the data point to the average (k-means) or median (k-medoids) location of its assigned cluster [10]. Fig. 1. Mean and Medoid in 2D space representation. In both figures, the data are represented by blue dots, being the rightmost point an outlier and the red point rep- resents the centroid point found by k-mean or k-medoid methods. Adapted from Jin and Han (2011) (Color figure online).
  • 19.
    6 L. A.Lima et al. 2.4 Bootstrap Analysis The idea of bootstrap analysis is to mimic the sampling distribution of the statis- tic of interest through the use of many resamples replacing the original sample elements [5]. In this work, the bootstrap analysis enables the handling of the vari- ability of the optimal solutions derived from the cluster method analysis. Thus, the bootstrap analysis is used to estimate the confidence interval of the statis- tic of interest and subsequently, comparing the results obtained by traditional methods. 3 Case Study This work presents a comparative analysis between two methodologies for opti- mizing the total phenolic content in extracts of chestnut flower, developed in MATLAB R software. The natural values of the dependent variables in the extraction - t, time in minutes; T, temperature in ◦ C; and S, organic solvent con- tent in %v/v of ethanol - were coded based on Central Composite Circumscribed Design (CCCD) and the result was based on extraction yield (Y, expressed in percentage of dry extract) and total phenolic content (Phe, expressed in mg/g of dry extract) as shown in Table 1. The experimental data presented were cordially provided by the Mountain Research Center - CIMO (Bragança, Portugal) [4]. The CCCD design selected for the original experimental study [4] is based on a cube circumscribed to a sphere in which the vertices are at α distance from the center, with 5 levels for each factor (t, T, and S). In this case, the α values vary between −1.68 and 1.68, and correspond to each factor level, as described in Table 2. 4 Data Analysis In this section, the two RSM optimization methods (traditional and dynamic) will be discussed in detail, along with the results obtained from both methods. 4.1 Traditional RSM In the original experiment, a five-level Central Composite Circumscribed Design (CCCD) coupled with RSM was build to optimize the variables for the male chestnut flowers. For the optimization, a simplex method developed ad hoc was used to optimize nonlinear solutions obtained by a regression model to maximize the response, described in the flowchart in Fig. 2. Through the traditional RSM, the authors approximated the surface response to a second-order polynomial function [4]: Y = b0 + n i=1 biXi + n−1 i=1 j1 n j=2 bijXiXj + n i=1 biiX2 i (1)
  • 20.
    Dynamic RSM Combinedwith GA to Optimize Extraction Process Problem 7 Table 1. Variables and natural values of the process parameters for the extraction of chestnut flowers [4]. t (min) T (◦ C) S (%EtOH) Yield (%R) Phenolic cont. (mg.g−1 dry weight) 40.30 37.20 20.30 38.12 36.18 40.30 37.20 79.70 26.73 11.05 40.30 72.80 20.30 42.83 36.66 40.30 72.80 79.70 35.94 22.09 99.70 37.20 20.30 32.77 35.55 99.70 37.20 79.70 32.99 8.85 99.70 72.80 20.30 42.55 29.61 99.70 72.80 79.70 35.52 11.10 120.00 55.00 50.00 42.41 14.56 20.00 55.00 50.00 35.45 24.08 70.00 25.00 50.00 38.82 12.64 70.00 85.00 50.00 42.06 17.41 70.00 55.00 0.00 35.24 34.58 70.00 55.00 100.00 15.61 12.01 20.00 25.00 0.00 22.30 59.56 20.00 25.00 100.00 8.02 15.57 20.00 85.00 0.00 34.81 42.49 20.00 85.00 100.00 18.71 50.93 120.00 25.00 0.00 31.44 40.82 120.00 25.00 100.00 15.33 8.79 120.00 85.00 0.00 34.96 45.61 120.00 85.00 100.00 32.70 21.89 70.00 55.00 50.00 41.03 14.62 Table 2. Natural and coded values of the extraction variables [4]. Natural variables Coded value t (min) T (◦ C) S (%) 20.0 25.0 0.0 −1.68 40.3 37.2 20.0 −1.00 70 55 50.0 0.00 99.7 72.8 80.0 1.00 120 85 100.0 1.68
Fig. 2. Flowchart of the traditional RSM modeling approach for optimal design.

where, for i = 0, ..., n and j = 1, ..., n, the b_i stand for the linear coefficients (b_0 being the constant term); the b_{ij} correspond to the interaction coefficients, while the b_{ii} are the quadratic coefficients; finally, the X_i are the independent variables, associated with t, T and S, with n being the total number of variables.

In the previous study, based on the traditional RSM, Eq. 1 represented coherently the behaviour of the extraction process of the target compounds from chestnut flowers [4]. In order to compare the optimization methods and to avoid data conflicts, the estimation of the cost function was done based on a multivariate regression model.

4.2 Dynamic RSM

For the proposed optimization method, briefly described in the flowchart shown in Fig. 3, the structure of the design of experiments was maintained, as well as the restrictions imposed on the responses and variables to avoid undesirable solutions.

Fig. 3. Flowchart of the dynamic RSM integrating the genetic algorithm and cluster analysis into the process.

The dynamic RSM method was built in MATLAB using code developed by the authors coupled with pre-existing functions from the statistical and optimization toolboxes of the software. The algorithm starts by generating a set of 15 random combinations of the factor levels. From this initial experimental data, a multivariate regression model is calculated, and this model becomes the objective function of the problem. Thereafter, a built-in GA-based solver is used to solve the optimization problem. The optimal combination is identified and used to redefine the objective function. The process stops when no new optimal solution is identified.
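The loop just described can be sketched in Python as follows. This is an illustrative reimplementation, not the authors' MATLAB code: the laboratory response is replaced by a hypothetical run_experiment function, and SciPy's differential_evolution stands in for the GA-based solver of the MATLAB optimization toolbox; the bounds are the coded CCCD levels mentioned above.

```python
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(0)
bounds = [(-1.68, 1.68)] * 3                 # coded levels of t, T and S (CCCD range)

def quadratic_features(X):
    """Second-order model terms of Eq. (1): 1, Xi, XiXj (i<j) and Xi^2."""
    t, T, S = X[:, 0], X[:, 1], X[:, 2]
    return np.column_stack([np.ones(len(X)), t, T, S, t*T, t*S, T*S, t**2, T**2, S**2])

def run_experiment(x):
    """Hypothetical stand-in for the laboratory response (NOT the real data)."""
    return 55.0 - 3.0 * np.sum((x - np.array([-1.6, -1.6, -1.68]))**2)

# 1) initial design: 15 random combinations of the coded factor levels
X = rng.uniform(-1.68, 1.68, size=(15, 3))
y = np.array([run_experiment(x) for x in X])

best = -np.inf
for it in range(20):
    # 2) multivariate regression model fitted to all data gathered so far
    beta, *_ = np.linalg.lstsq(quadratic_features(X), y, rcond=None)
    objective = lambda x: -(quadratic_features(np.atleast_2d(x)) @ beta)[0]  # maximize

    # 3) evolutionary solver (stand-in for MATLAB's ga) optimizes the current model
    res = differential_evolution(objective, bounds, seed=it, tol=1e-8)

    # 4) run the suggested optimum and stop when it no longer improves
    X, y = np.vstack([X, res.x]), np.append(y, run_experiment(res.x))
    if y[-1] <= best + 1e-3:
        break
    best = y[-1]

print("estimated optimum (coded levels):", np.round(res.x, 2), "response:", round(y[-1], 2))
```

The stopping test mirrors the description above: the loop ends as soon as the newly evaluated optimum no longer improves on the best response found so far.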
Considering the stochastic nature of this case study, clustering analysis is used to identify the best candidate optimal solution. In order to handle the variability of the achieved optimal solutions, the bootstrap method is used to estimate the confidence interval at 95%.

5 Numerical Results

The study using the traditional RSM returned the following optimal conditions for maximum yield: 120.0 min, 85.0 °C, and 44.5% of ethanol in the solvent, producing 48.87% of dry extract. For the total phenolic content, the optimal conditions were: 20.0 min, 25.0 °C and S = 0.0% of ethanol in the solvent, producing 55.37 mg/g of dry extract. These data are displayed in Table 3.

Table 3. Optimal responses and respective conditions using traditional and dynamic RSM, based on confidence intervals at 95%.

Extraction yield (%)
Method            t (min)        T (°C)        S (%)          Response
Traditional RSM   120.0 ± 12.4   85.0 ± 6.7    44.5 ± 9.7     48.87
Dynamic RSM       118.5 ± 1.4    84.07 ± 0.9   46.1 ± 0.85    45.87

Total phenolics (Phe)
Method            t (min)        T (°C)        S (%)          Response
Traditional RSM   20.0 ± 3.7     25.0 ± 5.7    0.0 ± 8.7      55.37
Dynamic RSM       20.4 ± 1.5     25.1 ± 1.97   0.05 ± 0.05    55.64

For the implementation of the dynamic RSM in this case study, 100 runs were carried out to evaluate the effectiveness of the method. For the yield, the estimated optimal conditions were: 118.5 min, 84.1 °C, and 46.1% of ethanol in the solvent, producing 45.87% of dry extract. In this case, the obtained optimal conditions for time and temperature were in accordance with approximately 80% of the tests. For the total phenolic content, the optimal conditions were: 20.4 min, 25.1 °C, and 0.05% of ethanol in the solvent, producing 55.64 mg/g of dry extract. The results were very similar to the previous report with the same data [4]. The clustering analysis for each response variable was performed considering the means (Figs. 4a and 5a) and the medoids (Figs. 4b and 5b) of the output population (optimal responses). The bootstrap analysis provides the inference concerning the achieved results, represented graphically in terms of means in Figs. 4c and 5c, and in terms of medoids in Figs. 4d and 5d.
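As a concrete illustration of how such 95% intervals can be obtained, the following Python sketch implements a two-tailed percentile bootstrap with 1000 resamples (the number used for the bootstrap distributions shown later). The sample values are synthetic, drawn around the reported optimum of 118.5 ± 1.4 min, and only stand in for the real set of dynamic RSM outputs.

```python
import numpy as np

def bootstrap_ci(sample, stat=np.mean, n_resamples=1000, alpha=0.05, seed=0):
    """Two-tailed percentile bootstrap confidence interval for a statistic."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    stats = np.array([stat(rng.choice(sample, size=sample.size, replace=True))
                      for _ in range(n_resamples)])
    return np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])

# e.g. optimal extraction times returned by 100 dynamic-RSM runs (hypothetical values)
times = np.random.default_rng(1).normal(118.5, 1.4, size=100)
low, high = bootstrap_ci(times)
print(f"95% CI for the mean optimal time: [{low:.1f}, {high:.1f}] min")
```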
The box plots of the groups of optimal responses from the dynamic RSM, displayed in Fig. 6, show that the variance within each group is small, given that the difference between the sets of responses is very narrow. The histograms concerning the set of dynamic RSM responses and the bootstrap distribution of the mean (1000 resamples) are shown in Figs. 7 and 8.

Fig. 4. Clustering analysis of the outputs from the extraction yield optimization using dynamic RSM. The responses are clustered in 3 distinct groups. (a) Extraction yield responses and k-means; (b) extraction yield responses and k-medoids; (c) extraction yield bootstrap output and k-means; (d) extraction yield bootstrap output and k-medoids.
Fig. 5. Clustering analysis of the outputs from the total phenolic content optimization using dynamic RSM. The responses are clustered in 3 distinct groups. (a) Total phenolic content responses and k-means; (b) total phenolic content responses and k-medoids; (c) total phenolic content bootstrap output and k-means; (d) total phenolic content bootstrap output and k-medoids.

Fig. 6. Box plot of the dynamic RSM outputs for the extraction yield and total phenolic content before bootstrap analysis, respectively.
Fig. 7. Histograms of the extraction data (extraction yield) and of the bootstrap means, respectively.

Fig. 8. Histograms of the extraction data (total phenolic content) and of the bootstrap means, respectively.

The results obtained in this work are satisfactory, since they were analogous for both methods, although the dynamic RSM took 15 to 18 experimental points to find the optimal coordinates. Some authors use designs of experiments involving traditional RSM with 20 different combinations, including the repetition of the centroid [4]. However, in studies involving recent data or the absence of complementary data, evaluations of the influence of the parameters and of their ranges are essential to obtain consistent results, which may require about 30 experimental points for the optimization. Considering these cases, the dynamic RSM method proposes a different, competitive, and economical approach, in which fewer points are evaluated to obtain the maximum response.

Genetic algorithms have been proving their efficiency in the search for optimal solutions in a wide variety of problems, given that they do not share some limitations found in traditional search methodologies, such as the requirement of the derivative of the function [9]. GA is attractive for identifying the global solution of the problem. Considering the stochastic problem presented in this work, the association of the genetic algorithm with the k-methods as clustering algorithms obtained satisfactory results. This solution can be used for problems involving small-scale data, since the GA gathers the best data for
optimization through its evolutionary method, while k-means or k-medoids perform the grouping of the optimum points.

In addition to the clustering analysis, bootstrapping was also applied, in which the sampling distribution of the statistic of interest is simulated through the use of many resamples drawn with replacement from the original sample, thus enabling statistical inference. Bootstrapping was used to calculate the confidence intervals in order to obtain unbiased estimates from the proposed method. In this case, the confidence interval was calculated at the 95% level (two-tailed), since the same percentage was adopted by Caleja et al. (2019). It was observed that the dynamic RSM approach also enables the estimation of confidence intervals with a smaller margin of error than the traditional RSM approach, leading to a more precise definition of the optimum conditions for the experiment.

6 Conclusion and Future Work

For the presented case study, applying the dynamic RSM using a genetic algorithm coupled with clustering analysis returned positive results, in accordance with previously published data [4]. Both methods seem attractive for the resolution of this particular case concerning the optimization of the extraction of target compounds from plant matrices. Therefore, the smaller number of experiments required by the dynamic RSM can make it an interesting approach for future studies. In brief, a smaller set of points was obtained that represents the best domain of optimization, thus eliminating the need for a large number of costly laboratory experiments. The next steps involve the improvement of the dynamic RSM algorithm and the application of the proposed method in other areas of study.

Acknowledgments. The authors are grateful to FCT for financial support through national funds FCT/MCTES UIDB/00690/2020 to CIMO and UIDB/05757/2020. M. Carocho also thanks FCT through the individual scientific employment program-contract (CEECIND/00831/2018).

References

1. Beasley, D., Bull, D.R., Martin, R.R.: An overview of genetic algorithms: Part 1, fundamentals. Univ. Comput. 2(15), 1–16 (1993)
2. Box, G.E.P., Behnken, D.W.: Simplex-sum designs: a class of second order rotatable designs derivable from those of first order. Ann. Math. Stat. 31(4), 838–864 (1960)
3. Box, G.E.P., Wilson, K.B.: On the experimental attainment of optimum conditions. J. Roy. Stat. Soc. Ser. B (Methodol.) 13(1), 1–38 (1951)
4. Caleja, C., Barros, L., Prieto, M.A., Bento, A., Oliveira, M.B.P., Ferreira, I.C.F.R.: Development of a natural preservative obtained from male chestnut flowers: optimization of a heat-assisted extraction technique. Food Funct. 10, 1352–1363 (2019)
5. Efron, B., Tibshirani, R.J.: An Introduction to the Bootstrap, 1st edn. Wiley, New York (1994)
6. Eftekhari, M., Yadollahi, A., Ahmadi, H., Shojaeiyan, A., Ayyari, M.: Development of an artificial neural network as a tool for predicting the targeted phenolic profile of grapevine (Vitis vinifera) foliar wastes. Front. Plant Sci. 9, 837 (2018)
7. El-Mihoub, T.A., Hopgood, A.A., Nolle, L., Battersby, A.: Hybrid genetic algorithms: a review. Eng. Lett. 11, 124–137 (2006)
8. Geiger, E.: Statistical methods for fermentation optimization. In: Vogel, H.C., Todaro, C.M. (eds.) Fermentation and Biochemical Engineering Handbook: Principles, Process Design, and Equipment, 3rd edn., pp. 415–422. Elsevier Inc. (2014)
9. Härdle, W.K., Simar, L.: Applied Multivariate Statistical Analysis, 4th edn. Springer, Heidelberg (2019)
10. Jin, X., Han, J.: K-medoids clustering. In: Sammut, C., Webb, G.I. (eds.) Encyclopedia of Machine Learning, pp. 564–565. Springer, Boston (2011)
11. Şenaras, A.E.: Parameter optimization using the surface response technique in automated guided vehicles. In: Sustainable Engineering Products and Manufacturing Technologies, pp. 187–197. Academic Press (2019)
12. Schneider, J., Kirkpatrick, S.: Genetic algorithms and evolution strategies. In: Stochastic Optimization, vol. 1, pp. 157–168. Springer, Heidelberg (2006)
    Towards a High-Performance Implementationof the MCSFilter Optimization Algorithm Leonardo Araújo1,2 , Maria F. Pacheco2 , José Rufino2 , and Florbela P. Fernandes2(B) 1 Universidade Tecnológica Federal do Paraná, Campus de Ponta Grossa, Ponta Grossa 84017-220, Brazil 2 Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, 5300-252 Bragança, Portugal a46677@alunos.ipb.pt, {pacheco,rufino,fflor}@ipb.pt Abstract. Multistart Coordinate Search Filter (MCSFilter) is an opti- mization method suitable to find all minimizers – both local and global – of a non convex problem, with simple bounds or more generic constraints. Like many other optimization algorithms, it may be used in industrial contexts, where execution time may be critical in order to keep a pro- duction process within safe and expected bounds. MCSFilter was first implemented in MATLAB and later in Java (which introduced a signif- icant performance gain). In this work, a comparison is made between these two implementations and a novel one in C that aims at further performance improvements. For the comparison, the problems addressed are bound constraint, with small dimension (between 2 and 10) and mul- tiple local and global solutions. It is possible to conclude that the average time execution for each problem is considerable smaller when using the Java and C implementations, and that the current C implementation, though not yet fully optimized, already exhibits a significant speedup. Keywords: Optimization · MCSFilter method · MatLab · C · Java · Performance 1 Introduction The set of techniques and principals for solving quantitative problems known as optimization has become increasingly important in a broad range of applications in areas of research as diverse as engineering, biology, economics, statistics or physics. The application of the techniques and laws of optimization in these (and other) areas, not only provides resources to describe and solve the specific problems that appear within the framework of each area but it also provides the opportunity for new advances and achievements in optimization theory and its techniques [1,2,6,7]. c Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 15–30, 2021. https://doi.org/10.1007/978-3-030-91885-9_2
    16 L. Araújoet al. In order to apply optimization techniques, problems can be formulated in terms of an objective function that is to be maximized or minimized, a set of variables and a set of constraints (restrictions on the values that the variables can assume). The structure of the these three items – objective function, variables and constraints –, determines different subfields of optimization theory: linear, integer, stochastic, etc.; within each of these subfields, several lines of research can be pursued. The size and complexity of optimization problems that can be dealt with has increased enormously with the improvement of the overall per- formance of computers. As such, advances in optimization techniques have been following progresses in computer science as well as in combinatorics, operations research, control theory, approximation theory, routing in telecommunication networks, image reconstruction and facility location, among other areas [14]. The need to keep up with the challenges of our rapidly changing society and its digitalization means a continuing need to increase innovation and productiv- ity and improve the performance of the industry sector, and places very high expectations for the progress and adaptation of sophisticated optimization tech- niques that are applied in the industrial context. Many of those problems can be modelled as nonlinear programming problems [10–12] or mixed integer nonlinear programming problems [5,8]. The urgency to quickly output solutions to difficult multivariable problems, leads to an increasing need to develop robust and fast optimization algorithms. Considering that, for many problems, reliable informa- tion about the derivative of the objective function is unavailable, it is important to use a method that allows to solve the problem without this information. Algorithms that do not use derivatives are called derivative-free. The MCS- Filter method is such a method, being able to deal with discontinuous or non- differentiable functions that often appear in many applications. It is also a multi- local method, meaning that it finds all the minimizers, both local and global, and exhibits good results [9,10]. Moreover, a Java implementation was already used to solve processes engineering problems [4]. Considering that, from an industrial point of view, execution time is of utmost importance, a novel C reimplementa- tion, aimed at increased performance, is currently under way, having reached a stage at which it is already able to solve a broad set of problems with measur- able performance gains over the previous Java version. This paper presents the results of a preliminary evaluation of the new C implementation of the MCSFilter method, against the previously developed versions (in MATLAB and Java). The rest of this paper is organized as follows: in Sect. 2, the MCSFilter algo- rithm is briefly described; in Sect. 3 the set of problems that are used to compare the three implementations and the corresponding results are presented and ana- lyzed. Finally, in Sect. 4, conclusions and future work are addressed. 2 The Multistart Coordinate Search Filter Method The MCSFilter algorithm was initially developed in [10], with the aim of finding multiple solutions of nonconvex and nonlinear constrained optimization problems of the following type:
min f(x)
subject to g_j(x) ≤ 0, j = 1, ..., m
           l_i ≤ x_i ≤ u_i, i = 1, ..., n      (1)

where f is the objective function, g_j(x), j = 1, ..., m, are the constraint functions and at least one of the functions f, g_j : R^n → R is nonlinear; also, l and u are the bounds and Ω = {x ∈ R^n : g(x) ≤ 0, l ≤ x ≤ u} is the feasible region. This method has two main parts: i) the multistart part, related to the exploration feature of the method, and ii) the coordinate search filter local search, related to the exploitation of promising regions.

The MCSFilter method does not require any information about the derivatives and is able to obtain all the solutions, both local and global, of a given nonconvex optimization problem. This is an important asset of the method since, in industrial problems, it is often not possible to know the derivative functions; moreover, a large number of real-life problems are nonlinear and nonconvex.

As already stated, the MCSFilter algorithm relies on a multistart strategy with a local search repeatedly called inside the multistart. Briefly, the multistart strategy is a stochastic algorithm that applies a local search more than once to sampled points, aiming to converge to all the minimizers, local and global, of a multimodal problem. When the local search is repeatedly applied, some of the minimizers can be reached more than once. This leads to a waste of time, since these minimizers have already been determined. To avoid these situations, a clustering technique based on computing the regions of attraction of previously identified minimizers is used. Thus, if the initial point belongs to the region of attraction of a previously detected minimizer, the local search procedure may not be performed, since it would converge to this known minimizer.

Figure 1 illustrates the influence of the regions of attraction. The red/magenta lines between an initial approximation and a minimizer represent a local search that has been performed; the red line represents the first local search that converged to a given minimizer; the white dashed line between two points represents a local search discarded by using the regions of attraction. Therefore, this representation intends to show the regions of attraction of each minimizer and the corresponding points around each one. These regions are dynamic, in the sense that they may change every time a new initial point is used [3].

The local search uses a derivative-free strategy that consists of a coordinate search combined with a filter methodology, in order to generate a sequence of approximate solutions that improve either the constraint violation or the objective function relative to the previous approximation; this strategy is called the Coordinate Search Filter algorithm (CSFilter). For this purpose, the initial problem is first rewritten as the bi-objective problem (2):

min (θ(x), f(x)),  x ∈ Ω      (2)
Fig. 1. Illustration of the multistart strategy with regions of attraction [3].

aiming to minimize, simultaneously, the objective function f(x) and the non-negative continuous aggregate constraint violation function θ(x) defined in (3):

θ(x) = ‖g(x)⁺‖² + ‖(l − x)⁺‖² + ‖(x − u)⁺‖²      (3)

where v⁺ = max{0, v}. For more details about this method see [9,10].

Algorithm 1 displays the steps of the CSFilter method. The stopping condition of CSFilter is related to the step size α of the method (see condition (4)):

α < α_min      (4)

with α_min ≪ 1 and close to zero.

The main steps of the MCSFilter algorithm for finding global (as well as local) solutions to problem (1) are shown in Algorithm 2. The stopping condition of the MCSFilter algorithm is related to the number of minimizers found and to the number of local searches that were applied in the multistart strategy. Considering n_l as the number of local searches used and n_m as the number of minimizers obtained, then

P_min = n_m(n_m + 1) / (n_l(n_l − 1)).

The MCSFilter algorithm stops when condition (5) is reached:

P_min ≤ ε      (5)

where 0 < ε ≪ 1.

In this preliminary work, the main goal is to compare the performance of MCSFilter when bound constraint problems are addressed, using different implementations of the algorithm: the original implementation in MATLAB [10], a follow-up implementation in Java (already used to solve problems from the Chemical Engineering area [3,4,13]), and a new implementation in C (evaluated for the first time in this paper).
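To make the acceptance and step-size mechanics concrete, here is a deliberately simplified Python sketch in the spirit of CSFilter (Algorithm 1 below). It is not the authors' implementation: it keeps only the aggregate violation θ of (3) for bound constraints and uses a lexicographic acceptance rule in place of the full filter bookkeeping.

```python
import numpy as np

def theta(x, lower, upper):
    """Aggregate bound-constraint violation, cf. Eq. (3) with no g_j constraints."""
    pos = lambda v: np.maximum(v, 0.0)
    return float(np.sum(pos(lower - x)**2) + np.sum(pos(x - upper)**2))

def coordinate_search(f, x0, lower, upper, alpha_min=1e-5):
    """Simplified coordinate-search poll: accept a trial point that is
    lexicographically better in (violation, objective); halve the step on failure."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    directions = np.vstack([np.eye(n), -np.eye(n)])         # polling directions +-e_i
    alpha = min(1.0, 0.05 * float(np.mean(upper - lower)))  # initial step, as in Algorithm 1
    while alpha >= alpha_min:
        improved = False
        for d in directions:
            trial = x + alpha * d
            if (theta(trial, lower, upper), f(trial)) < (theta(x, lower, upper), f(x)):
                x, improved = trial, True
        if not improved:
            alpha /= 2.0                                     # unsuccessful poll: shrink the step
    return x

# toy usage on a bound-constrained quadratic (illustrative problem, not from the test set)
x_star = coordinate_search(lambda x: float(np.sum((x - 1.0)**2)),
                           x0=[4.0, -4.0],
                           lower=np.array([-5.0, -5.0]), upper=np.array([5.0, 5.0]))
print(x_star)   # approaches [1, 1]
```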
Algorithm 1. CSFilter algorithm
Require: x and parameter values, α_min; set x̃ = x, x_F^inf = x, z = x̃;
1: Initialize the filter; set α = min{1, 0.05 · (Σ_{i=1}^{n} (u_i − l_i))/n};
2: repeat
3:   Compute the trial approximations z_a^i = x̃ + α e_i, for all e_i ∈ D⊕;
4:   repeat
5:     Check acceptability of the trial points z_a^i;
6:     if there are some z_a^i acceptable by the filter then
7:       Update the filter;
8:       Choose z_a^best; set z = x̃, x̃ = z_a^best; update x_F^inf if appropriate;
9:     else
10:      Compute the trial approximations z_a^i = x_F^inf + α e_i, for all e_i ∈ D⊕;
11:      Check acceptability of the trial points z_a^i;
12:      if there are some z_a^i acceptable by the filter then
13:        Update the filter;
14:        Choose z_a^best; set z = x̃, x̃ = z_a^best; update x_F^inf if appropriate;
15:      else
16:        Set α = α/2;
17:      end if
18:    end if
19:  until a new trial z_a^best is acceptable
20: until α < α_min

3 Computational Results

In order to compare the performance of the three implementations of the MCSFilter optimization algorithm, a set of problems was chosen. The definition of each problem (a total of 15 bound constraint problems) is given below, along with the experimental conditions under which they were evaluated, as well as the obtained results (both numerical and performance-related).

3.1 Benchmark Problems

The collection of problems was taken from [9] (and the references therein), and all fifteen problems in this study are listed below. The problems were chosen in such a way that different characteristics are addressed: they are multimodal problems with more than one minimizer (the number of minimizers varies from 2 to 1024); they can have just one global minimizer or more than one; and the dimension of the problems varies between 2 and 10.
    20 L. Araújoet al. Algorithm 2. MCSFilter algorithm Require: Parameter values; set M∗ = ∅, k = 1, t = 1; 1: Randomly generate x ∈ [l, u]; compute Bmin = mini=1,...,n{ui − li}; 2: Compute m1 = CSFilter(x), R1 = x − m1; set r1 = 1, M∗ = M∗ ∪ m1; 3: while the stopping rule is not satisfied do 4: Randomly generate x ∈ [l, u]; 5: Set o = arg minj=1,...,k dj ≡ x − mj; 6: if do Ro then 7: if the direction from x to yo is ascent then 8: Set prob = 1; 9: else 10: Compute prob = φ( do Ro , ro); 11: end if 12: else 13: Set prob = 1; 14: end if 15: if ζ‡ prob then 16: Compute m = CSFilter(x); set t = t + 1; 17: if m − mj γ∗ Bmin, for all j = 1, . . . , k then 18: Set k = k + 1, mk = m, rk = 1, M∗ = M∗ ∪ mk; compute Rk = x − mk; 19: else 20: Set Rl = max{Rl, x − ml}; rl = rl + 1; 21: end if 22: else 23: Set Ro = max{Ro, x − mo}; ro = ro + 1; 24: end if 25: end while – Problem (P1) min f(x) ≡ x2 − 5.1 4π2 x2 1 + 5 π x1 − 6 2 + 10 1 − 1 8π cos(x1) + 10 s.t. −5 ≤ x1 ≤ 10, 0 ≤ x2 ≤ 15 • known global minimum f∗ = 0.39789. – Problem (P2) min f(x) ≡ 4 − 2.1x2 1 + x4 1 3 x2 1 + x1x2 − 4(1 − x2 2)x2 2 s.t. −2 ≤ xi ≤ 2, i = 1, 2 • known global minimum: f∗ = −1.03160. – Problem (P3) min f(x) ≡ n i=1 sin(xi) + sin 2xi 3 s.t. 3 ≤ xi ≤ 13, i = 1, 2
    Towards a High-PerformanceImplementation of the MCSFilter 21 • known global minimum: f∗ = −2.4319. – Problem (P4) min f(x) ≡ 1 2 2 i=1 x4 i − 16x2 i + 5xi s.t. −5 ≤ xi ≤ 5, i = 1, 2 • known global minimum: f∗ = −78.3323. – Problem (P5) min f(x) ≡ 1 2 3 i=1 x4 i − 16x2 i + 5xi s.t. −5 ≤ xi ≤ 5, i = 1, · · · , 3 • known global minimum: f∗ = −117, 4983. – Problem (P6) min f(x) ≡ 1 2 4 i=1 x4 i − 16x2 i + 5xi s.t. −5 ≤ xi ≤ 5, i = 1, · · · , 4 • known global minimum: f∗ = −156, 665. – Problem (P7) min f(x) ≡ 1 2 5 i=1 x4 i − 16x2 i + 5xi s.t. −5 ≤ xi ≤ 5, i = 1, · · · , 5 • known global minimum: f∗ = −195, 839. – Problem (P8) min f(x) ≡ 1 2 6 i=1 x4 i − 16x2 i + 5xi s.t. −5 ≤ xi ≤ 5, i = 1, · · · , 6 • known global minimum: f∗ = −234, 997. – Problem (P9) min f(x) ≡ 1 2 8 i=1 x4 i − 16x2 i + 5xi s.t. −5 ≤ xi ≤ 5, i = 1, · · · , 8
    22 L. Araújoet al. • known global minimum: f∗ = −313, 3287. – Problem (P10) min f(x) ≡ 1 2 10 i=1 x4 i − 16x2 i + 5xi s.t. −5 ≤ x1 ≤ 5, i = 1, · · · , 10 • known global minimum: f∗ = −391, 658. – Problem (P11) min f(x) ≡ (1 + (x1 + x2 + 1)2 (19 − 14x1 + 3x2 1 − 14x2 + 6x1x2 + 3x2 2)) × (30 + (2x1 − 3x2)2 (18 − 32x1 + 12x2 1 + 48x2 − 36x1x2 + 27x2 2)) s.t. −2 ≤ xi ≤ 2, i = 1, 2 • known global minimum: f∗ = 3. – Problem (P12) min f(x) ≡ 5 i=1 i cos((i + 1)x1 + i) 5 i=1 i cos((i + 1)x2 + i) s.t. −10 ≤ xi ≤ 10, i = 1, 2 • known global minimum: f∗ = −186, 731. – Problem (P13) min f(x) ≡ cos(x1) sin(x2) − x1 x2 2 + 1 s.t. −1 ≤ x1 ≤ 2, −1 ≤ x2 ≤ 1 • known global minimum: f∗ = −2, 0. – Problem (P14) min f(x) ≡ (1.5 − x1(1 − x2))2 + (2.25 − x1(1 − x2 2))2 + (2.625 − x1(1 − x3 2))2 s.t. −4.5 ≤ xi ≤ 4.5, i = 1, · · · , 2 • known global minimum: f∗ = 0. – Problem (P15) min f(x) ≡ 0.25x4 1 − 0.5x2 1 + 0.1x1 + 0.5x2 2 s.t. −2 ≤ xi ≤ 2, i = 1, · · · , 2 • known global minimum: f∗ = −0, 352386.
3.2 Experimental Conditions

The problems were evaluated in a computational system with the following relevant characteristics: CPU - 2.3/4.3 GHz 18-core Intel Xeon W-2195; RAM - 32 GB DDR4 2666 MHz ECC; OS - Linux Ubuntu 20.04.2 LTS; MATLAB - version R2020a; Java - OpenJDK 8; C compiler - gcc version 9.3.0 (-O2 option). Since the MCSFilter algorithm has a stochastic component, 10 runs were performed for each problem. The execution time of the first run was ignored, so the presented execution times are averages of the remaining 9 executions. All three implementations (MATLAB, Java and C) ran with the same parameters, namely the ones related to the stopping conditions. For the local search CSFilter, α_min = 10^−5 was considered in condition (4), as in previous works. For the stopping condition of MCSFilter, ε = 10^−2 was considered in condition (5).

3.3 Analysis of the Results

Tables 1, 2 and 3 show the results obtained with each implementation. In all the tables, the first column (Prob) gives the name of each problem; the second column (minavg) presents the average number of minimizers obtained in the 9 executions; the third column (nfavg) presents the average number of objective function evaluations; the fourth column (tavg) shows the average execution time (in seconds) of the 9 runs; and the last column (best f∗) shows the best value achieved for the global minimum. One important feature visible in the results of all three implementations is that the global minimum is always achieved, in all problems.

Table 1. Results obtained using MATLAB.

Prob  minavg   nfavg        tavg(s)  best f∗
P1    3        5684,82      0,216    0,398
P2    6        8678,36      0,289    −1,032
P3    4        4265,55      0,178    −2,432
P4    4        4663,27      0,162    −78,332
P5    8        15877,09     0,480    −117,499
P6    16       51534,64     1,438    −156,665
P7    32       145749,64    3,898    −195,831
P8    64       391584,00    10,452   −234,997
P9    256      2646434,55   71,556   −313,329
P10   1023,63  15824590,64  551,614  −391,662
P11   4        39264,73     1,591    3,000
P12   64,36    239005,18    6,731    −186,731
P13   2,45     4653,18      0,160    −2,022
P14   2,27     52964,09     2,157    0,000
P15   2        2374,91      0,135    −0,352
For the MATLAB implementation, looking at the second column of Table 1, it is possible to state that all the minimizers are found for all the problems except problems P10, P12, P13 and P14. Nevertheless, it is important to note that P10 has 1024 minimizers and an average of 1023,63 were found, while P12 has 760 minimizers and an average of 64,36 minimizers were discovered. P13 is the problem where MCSFilter exhibits the worst behaviour, shared by other algorithms. It is also worth remarking that problem P10 has 10 variables and, taking into account the structure of the CSFilter algorithm, this leads to a large number of function evaluations. This, of course, impacts the execution time and, therefore, problem P10 has the highest execution time of all the problems, taking on average 551,614 s per run.

Table 2. Results obtained using Java.

Prob  minavg  nfavg        tavg(s)  best f∗
P1    3       9412,64      0,003    0,3980
P2    6       13461,82     0,005    −1,0320
P3    4       10118,73     0,006    −2,4320
P4    4       10011,91     0,005    −78,3320
P5    8       32990,73     0,013    −117,4980
P6    16      98368,73     0,038    −156,6650
P7    32      274812,36    0,118    −195,8310
P8    64      730754,82    0,363    −234,9970
P9    256     4701470,36   2,868    −313,3290
P10   1024    27608805,73  20,304   −391,6620
P11   4       59438,18     0,009    3,0000
P12   46,45   99022,91     0,035    −186,7310
P13   2,36    6189,09      0,003    −2,0220
P14   2,54    62806,64     0,019    0,0000
P15   2       5439,18      0,002    −0,3520

Considering now the results produced by the Java implementation (Table 2), it is possible to observe a behaviour similar to the MATLAB version regarding the best value of the global minimum. This value is always achieved, in all runs, as well as the known number of minimizers, except in problems P12, P13 and P14. It is noteworthy that, using Java, all the 1024 minimizers of P10 were obtained. If the fourth columns of Tables 1 and 2 are compared, it is possible to point out that in Java the algorithm clearly takes less time to obtain the same solutions.
Finally, Table 3 shows the results obtained by the new C-based implementation. It can be observed that the numerical behaviour of the algorithm is similar to that observed in the Java version: both implementations find, approximately, the same number of minimizers. However, comparing the execution times of the C version with those of the Java version (Table 2), the C version is clearly faster.

Table 3. Results obtained using C.

Prob  minavg   nfavg        tavg(s)  best f∗
P1    3        3434,64      0,001    0,3979
P2    6        7809,73      0,002    −1,0316
P3    4        5733,36      0,002    −2,4320
P4    4        6557,54      0,002    −78,3323
P5    8        17093,64     0,007    −117,4985
P6    16       51359,82     0,023    −156,6647
P7    32       162219,63    0,073    −195,8308
P8    64       403745,91    0,198    −234,9970
P9    255,18   2625482,73   1,600    −313,3293
P10   1020,81  15565608,53  14,724   −391,6617
P11   4        8723,34      0,004    3,0000
P12   38,81    43335,73     0,027    −186,7309
P13   2        2345,62      0,001    −2,0218
P14   2,36     24101,18     0,014    0,0000
P15   2        3334,53      0,002    −0,3524
    26 L. Araújoet al. Fig. 2. Average execution time (s) for problems P1 − P5, P13, P15. Fig. 3. Average execution time (s) for problems P6 − P12, P14. A quick overview of Fig. 2 and Fig. 3 is enough to conclude that the new C implementation is faster than the previous Java implementation, and much faster than the original MATLAB version of the MCSFilter algorithm (note that logarithmic scales were used in both figures due the different order of magnitude of the various execution times; also, in Fig. 3 different textures were used for
problems P9 and P10, since their execution times are represented against the right vertical axis). It is also possible to observe that, in general, there is a direct proportionality between the execution times of the three code bases: when changing the optimization problem, if the execution time increases or decreases in one version, the same happens in the other versions.

To quantify the performance improvement of one version of the algorithm over a preceding implementation, one can calculate the speedups (accelerations) achieved. Thus, the speedup of the X version of the algorithm against the Y version of the same algorithm is simply given by S(X, Y) = T(Y)/T(X), where T(Y) and T(X) are the average execution times of the Y and X implementations. The relevant speedups in the context of this study are presented in Table 4.

Table 4. Speedups of the execution time.

Problem  S(Java, MATLAB)  S(C, Java)  S(C, MATLAB)
P1       71,8             2,7         197,4
P2       57,9             2,1         122,3
P3       29,6             3,2         94,5
P4       32,3             2,0         64,7
P5       36,9             1,7         64,5
P6       37,8             1,6         61,2
P7       33,0             1,6         53,2
P8       28,8             1,8         52,7
P9       24,9             1,8         44,7
P10      27,2             1,4         37,5
P11      176,8            2,4         417,9
P12      192,3            1,3         252,3
P13      53,3             2,4         126,3
P14      113,5            1,3         150,4
P15      67,6             1,2         84,4
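The speedup metric can be checked directly against the rounded averages of Tables 1, 2 and 3 (the rounding explains small deviations from Table 4, which was computed from the unrounded times), e.g. in Python:

```python
def speedup(t_reference: float, t_new: float) -> float:
    """S(new, reference) = T(reference) / T(new), as defined above."""
    return t_reference / t_new

# Problem P1, using the rounded average times of Tables 1-3
print(speedup(0.216, 0.003))   # S(Java, MATLAB) ~ 72  (Table 4 reports 71.8)
print(speedup(0.216, 0.001))   # S(C, MATLAB) ~ 216    (Table 4 reports 197.4)
```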
Another perspective on the capabilities of the three MCSFilter implementations considered herein builds on the comparison of their efficiency (or effectiveness) in discovering all the known optimizers of the optimization problems at stake. A simple metric that yields such efficiency is E(X, Y) = minavg(X)/minavg(Y). Table 5 shows the relative optima search efficiency for several pairs of MCSFilter implementations.

Table 5. Optima search efficiency.

Problem  E(Java, MATLAB)  E(C, Java)  E(C, MATLAB)
P1       1                1           1
P2       1                1           1
P3       1                1           1
P4       1                1           1
P5       1                1           1
P6       1                1           1
P7       1                1           1
P8       1                1           1
P9       1                1           1
P10      1                1           1
P11      1                1           1
P12      0,72             0,75        0,54
P13      0,96             0,85        0,81
P14      1,12             0,79        0,88
P15      1                1           1

For all but three problems, the MATLAB, Java and C implementations are able to find exactly the same number of optimizers (and so their relative efficiency is 1, or 100%). For problems P12, P13 and P14, however, the search efficiency may vary a lot. Compared to the MATLAB version, both the Java and C versions are unable to find as many optimizers for problems P12 and P13; for problem P14, however, the Java version is able to find 12% more optimizers than the MATLAB version, while the C version still lags behind (finding only 88% of the optimizers found by MATLAB). Also, compared to the Java version, the C version currently shows an inferior search efficiency regarding problems P12, P13 and P14, something to be tackled in future work.

A final analysis is provided based on the data of Table 6. This table presents, for each problem Y, and for each MCSFilter implementation X, the precision achieved by that implementation as P(X, Y) = |f∗(Y) − best f∗(X)|, that is, the modulus of the distance between the known global minimum of problem Y and the best value achieved for the global minimum by implementation X. The following conclusions may be derived: in all the problems, the best f∗ is close to the global minimum known in the literature, since the measure used is close to zero; moreover, there are six problems for which the new implementation (in C) outperforms the previous two; and in five other problems all the implementations obtained the same precision for the best f∗.
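As a quick sanity check of the precision metric, the values in Table 6 below can be reproduced from the known global minima listed in Sect. 3.1 and the best f∗ columns of Tables 1, 2 and 3; for example, in Python:

```python
def precision(f_star_known: float, f_star_best: float) -> float:
    """P(X, Y) = |f*(Y) - best f*(X)|, the global-optimum precision metric."""
    return abs(f_star_known - f_star_best)

# Problem P1: known global minimum f* = 0.39789 (Sect. 3.1)
print(precision(0.39789, 0.398))    # MATLAB/Java best f* -> 1.1e-04, as in Table 6
print(precision(0.39789, 0.3979))   # C best f*           -> 1.0e-05, as in Table 6
```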
Table 6. Global optima precision.

Problem  P(MATLAB)  P(Java)   P(C)
P1       1,1E−04    1,1E−04   1,0E−05
P2       4,0E−04    4,0E−04   0,0E+00
P3       1,0E−04    1,0E−04   1,0E−04
P4       3,0E−04    3,0E−04   0,0E+00
P5       7,0E−04    3,0E−04   2,0E−04
P6       0,0E+00    0,0E+00   3,0E−04
P7       8,0E−03    8,0E−03   8,2E−03
P8       0,0E+00    0,0E+00   0,0E+00
P9       3,0E−04    3,0E−04   6,0E−04
P10      4,0E−03    4,0E−03   3,7E−03
P11      0,0E+00    0,0E+00   0,0E+00
P12      0,0E+00    0,0E+00   1,0E−04
P13      2,2E−02    2,2E−02   2,2E−02
P14      0,0E+00    0,0E+00   0,0E+00
P15      3,9E−04    3,9E−04   1,4E−05

4 Conclusions and Future Work

The MCSFilter algorithm was used to solve bound constraint problems with different dimensions, from two to ten. The algorithm was originally implemented in MATLAB and the results initially obtained were considered very promising. A second implementation was later developed in Java, which increased the performance considerably. In this work, the MCSFilter algorithm was re-coded in the C language and a comparison was made between all three implementations, both performance-wise and regarding search efficiency and precision. The evaluation results show that, for the set of problems considered, the novel C version, even though it is still a preliminary version, already surpasses the performance of the Java implementation. The search efficiency of the C version, however, must be improved. Regarding precision, the C version brought improvements in 6 problems and matched the previous implementations in 5 others, out of a total of 15 problems.

Besides tackling the numerical efficiency and precision issues that still persist, future work will include testing the C code with other problems (including higher-dimensional and harder problems) and refining the code in order to improve its performance. In particular, and most relevant for the problems that still take a considerable amount of execution time, parallelization strategies will be exploited as a way to further accelerate the execution of the MCSFilter algorithm.

Acknowledgements. This work has been supported by FCT - Fundação para a Ciência e Tecnologia within the Project Scope: UIDB/05757/2020.
    30 L. Araújoet al. References 1. Abhishek, K., Leyffer, S., Linderoth, J.: FilMINT: an outer-approximation-based solver for convex mixed-integer nonlinear programs. INFORMS J. Comput. 22(4), 555–567 (2010) 2. Abramson, M., Audet, C., Chrissis, J., Walston, J.: Mesh adaptive direct search algorithms for mixed variable optimization. Optim. Lett. 3(1), 35–47 (2009). https://doi.org/10.1007/s11590-008-0089-2 3. Amador, A., Fernandes, F.P., Santos, L.O., Romanenko, A., Rocha, A.M.A.C.: Parameter estimation of the kinetic α-Pinene isomerization model using the MCS- Filter algorithm. In: Gervasi, O., et al. (eds.) ICCSA 2018, Part II. LNCS, vol. 10961, pp. 624–636. Springer, Cham (2018). https://doi.org/10.1007/978-3-319- 95165-2 44 4. Amador, A., Fernandes, F.P., Santos, L.O., Romanenko, A.: Application of MCS- Filter to estimate stiction control valve parameters. In: International Conference of Numerical Analysis and Applied Mathematics, AIP Conference Proceedings, vol. 1863, pp. 270005 (2017) 5. Belotti, P., Kirches, C., Leyffer, S., Linderoth, J., Mahajan, A.: Mixed-Integer Nonlinear Optimization. Acta Numer. 22, 1–131 (2013) 6. Bonami, P., et al.: An algorithmic framework for convex mixed integer nonlinear programs. Discrete Optim. 5(2), 186–204 (2008) 7. Bonami, P., Gonçalves, J.: Heuristics for convex mixed integer nonlinear programs. Comput. Optim. Appl. 51(2), 729–747 (2012) 8. D’Ambrosio, C., Lodi, A.: Mixed integer nonlinear programming tools: an updated practical overview. Ann. Oper. Res. 24, 301–320 (2013). https://doi.org/10.1007/ s10479-012-1272-5 9. Fernandes, F.P.: Programação não linear inteira mista e não convexa sem derivadas. PhD thesis, University of Minho, Braga (2014) 10. Fernandes, F.P., Costa, M.F.P., Fernandes, E.M.G.P., et al.: Multilocal program- ming: a derivative-free filter multistart algorithm. In: Murgante, B. (ed.) ICCSA 2013, Part I. LNCS, vol. 7971, pp. 333–346. Springer, Heidelberg (2013). https:// doi.org/10.1007/978-3-642-39637-3 27 11. Floudas, C., et al.: Handbook of Test Problems in Local and Global Optimization. Kluwer Academic Publishers, Boston (1999) 12. Hendrix, E.M.T., Tóth, B.G.: Introduction to Nonlinear and Global Optimization. Springer, New York (2010). https://doi.org/10.1007/978-0-387-88670-1 13. Romanenko, A., Fernandes, F.P., Fernandes, N.C. P.: PID controllers tuning with MCSFilter. In: AIP Conference Proceedings, vol. 2116, pp. 220003 (2019) 14. Yang, X.-S.: Optimization Techniques and Applications with Examples. Wiley, Hoboken (2018)
    On the Performanceof the ORTHOMADS Algorithm on Continuous and Mixed-Integer Optimization Problems Marie-Ange Dahito1,2(B) , Laurent Genest1 , Alessandro Maddaloni2 , and José Neto2 1 Stellantis, Route de Gisy, 78140 Vélizy-Villacoublay, France {marieange.dahito,laurent.genest}@stellantis.com 2 Samovar, Telecom SudParis, Institut Polytechnique de Paris, 19 place Marguerite Perey, 91120 Palaiseau, France {alessandro.maddaloni,jose.neto}@telecom-sudparis.eu Abstract. ORTHOMADS is an instantiation of the Mesh Adaptive Direct Search (MADS) algorithm used in derivative-free and black- box optimization. We investigate the performance of the variants of ORTHOMADS on the bbob and bbob-mixint, respectively continuous and mixed-integer, testbeds of the COmparing Continuous Optimizers (COCO) platform and compare the considered best variants with heuris- tic and non-heuristic techniques. The results show a favourable perfor- mance of ORTHOMADS on the low-dimensional continuous problems used and advantages on the considered mixed-integer problems. Besides, a generally faster convergence is observed on all types of problems when the search phase of ORTHOMADS is enabled. Keywords: Derivative-free optimization · Blackbox optimization · Benchmarking · Mesh Adaptive Direct Search · Mixed-integer blackbox 1 Introduction Derivative-free optimization (DFO) and blackbox optimization (BBO) are branches of numerical optimization that have known a fast growth in the past years, especially with the growing need to solve real-world application problems but also with the development of methods to deal with unavailable or numeri- cally costly derivatives. DFO focuses on optimization techniques that make no use of derivatives while BBO deals with problems where the objective function is not analytically known, that is it is a blackbox. A regular blackbox objective is the output of a computer simulation: for instance, at Stellantis, the crash or acoustic outputs computed by the finite element simulation of a vehicle. The problems addressed in this paper are of the form: c Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 31–47, 2021. https://doi.org/10.1007/978-3-030-91885-9_3
    32 M.-A. Dahitoet al. minimize x∈X f(x), (1) where X is a bounded domain of either Rn or Rc × Zi with c and i respectively the number of continuous and integer variables. n = c+i is the dimension of the problem and f is a blackbox function. Heuristic and non-heuristic techniques can tackle this kind of problems. Among the main approaches used in DFO are direct local search methods. The latter are iterative methods that, at each iteration, evaluate a set of points in a certain radius that can be increased if a better solution is found or decreased if the incumbent remains the best point at the current iteration. The Mesh Adaptive Direct Search (MADS) [1,4,5] is a famous direct local search method used in DFO and BBO that is an extension of the Generalized Pattern Search (GPS) introduced in [28]. MADS evolves on a mesh by first doing a global exploration called the search phase and then, if a better solution than the current iterate is not found, a local poll is performed. The points evaluated in the poll are defined by a finite set of poll directions that is updated at each iteration. The algorithm is derived in several instantiations available in the Non- linear Optimization with the MADS algorithm (NOMAD) software [7,19] and its performance is evaluated in several papers. As examples, a broad compar- ison of DFO optimizers is performed on 502 problems in [25] and NOMAD is used in [24] with a DACE surrogate and compared with other local and global surrogate-based approaches in the context of constrained blackbox optimization on an automotive optimization problem and twenty two test problems. Given the growing number of algorithms to deal with BBO problems, the choice of the most adapted method for solving a specific problem still remains complex. In order to help with this decision, some tools have been developed to compare the performance of algorithms. In particular, data profiles [20] are frequently used in DFO and BBO to benchmark algorithms: they show, given some precision or target value, the fraction of problems solved by an algorithm according to the number of function evaluations. There also exist suites of aca- demic test problems: although the latter are treated as blackbox functions, they are analytically known, which is an advantage to understand the behaviour of an algorithm. There are also available industrial applications but they are rare. Twenty two implementations of derivative-free algorithms for solving box- constrained optimization problems are benchmarked in [25] and compared with each other according to different criteria. They use a set of 502 problems that are categorized according to their convexity (convex or nonconvex), smoothness (smooth or non-smooth) and dimensions between 1 and 300. The algorithms tested include local-search methods such as MADS through NOMAD version 3.3 and global-search methods such as the NEW Unconstrained Optimization Algorithm (NEWUOA) [23] using trust regions and the Covariance Matrix Adap- tation - Evolution Strategy (CMA-ES) [16] which is an evolutionary algorithm. Simulation optimization deals with problems where at least some of the objec- tive or constraints come from stochastic simulations. A review of algorithms to solve simulation optimization is presented in [2], among which the NOMAD software. However, this paper does not compare them due to a lack of standard comparison tools and large-enough testbeds in this optimization branch.
    ORTHOMADS on Continuousand Mixed-Integer Optimization Problems 33 In [3], the MADS algorithm is used to optimize the treatment process of spent potliners in the production of aluminum. The problem is formalized as a 7–dimensional non-linear blackbox problem with 4 inequality constraints. In particular, three strategies are compared using absolute displacements, relative displacements and the latter with a global Latin hypercube sampling search. They show that the use of scaling is particularly beneficial on the considered chemical application. The instantiation ORTHOMADS is introduced in Ref. [1] and consists in using orthogonal directions in the poll step of MADS. It is compared to the initial LTMADS, where the poll directions are generated from a random lower trian- gular matrix, and to GPS algorithm on 45 problems from the literature. They show that MADS outperforms GPS and that the instantiation ORTHOMADS competes with LTMADS and has the advantage that its poll directions cover better the variable space. The ORTHOMADS algorithm, which is the default MADS instantiation used in NOMAD, presents variants in the poll directions of the method. To our knowl- edge, the performance of these different variants has not been discussed in the literature. The purpose of this paper is to explore this aspect by performing experiments with the ORTHOMADS variants. This work is part of a project conducted with the automotive group Stellantis to develop new approaches for solving their blackbox optimization problems. Our contributions are first the evaluations of the ORTHOMADS variants on continuous and mixed-integer opti- mization problems. Besides, the contribution of the search phase is studied and shows a general deterioration of the performance when the search is turned off. The effect however decreases with increasing dimension. Two from the best vari- ants of ORTHOMADS are identified on each of the used testbeds and their perfor- mance is compared with other algorithms including heuristic and non-heuristic techniques. Our experiments exhibit particular variants of ORTHOMADS per- forming best depending on problems features. Plots for analyses are available at the following link: https://github.com/DahitoMA/ResultsOrthoMADS. The paper is organized as follows. Section 2 gives an overview of the MADS algorithm and its ORTHOMADS variants. In Sect. 3, the variants of ORTHOMADS are evaluated on the bbob and bbob-mixint suites that con- sist respectively of continuous and mixed-integer functions. Then, two from the best variants of ORTHOMADS are compared with other algorithms in Sect. 4. Finally, Sect. 5 discusses the results of the paper. 2 MADS and the Variants of ORTHOMADS This section gives an overview of the MADS algorithm and explains the differ- ences among the ORTHOMADS variants. 2.1 The MADS Algorithm MADS is an iterative direct local search method used for DFO and BBO prob- lems. The method relies on a mesh Mk updated at each iteration and determined
by the current iterate x_k, a mesh size parameter δ_k > 0 and a matrix D whose columns consist of p positive spanning directions. The mesh is defined as follows:

M_k := {x_k + δ_k D y : y ∈ N^p},      (2)

where the columns of D form a positive spanning set {D_1, D_2, ..., D_p} and N stands for the natural numbers.

The algorithm proceeds in two phases at each iteration: the search and the poll. The search phase is optional and similar to a design of experiments: a finite set of points S_k, stemming generally from a surrogate model prediction and a Nelder-Mead (N-M) search [21], is evaluated anywhere on the mesh. If the search fails at finding a better point, then a poll is performed. During the poll phase, a finite set of points is evaluated on the mesh in the neighbourhood of the incumbent. This neighbourhood is called the frame F_k and has a radius Δ_k > 0 that is called the poll size parameter. The frame is defined as follows:

F_k := {x ∈ M_k : ‖x − x_k‖_∞ ≤ Δ_k b},      (3)

where b = max{‖d‖_∞, d ∈ D} and D_{k,Δ_k} ⊂ {D_1, D_2, ..., D_p} is a finite set of poll directions. The latter are such that their union over the iterations grows dense on the unit sphere.

The two size parameters are such that δ_k ≤ Δ_k and evolve after each iteration: if a better solution is found, they are increased, and otherwise they are decreased. As the mesh size decreases more drastically than the poll size in the case of an unsuccessful iteration, the set of points that can be evaluated during the poll becomes richer after unsuccessful iterations. Usually, δ_k = min{Δ_k, Δ_k²}. The description of the MADS algorithm is given in Algorithm 1, inspired from [6].

Algorithm 1: Mesh Adaptive Direct Search (MADS)
Initialize k = 0, x_0 ∈ R^n, D ∈ R^{n×p}, Δ_0 > 0, τ ∈ (0, 1) ∩ Q, ε_stop > 0
1. Update δ_k = min{Δ_k, Δ_k²}
2. Search
   If f(x) < f(x_k) for some x ∈ S_k then x_{k+1} ← x, Δ_{k+1} ← τ^{−1} Δ_k and go to 4
   Else go to 3
3. Poll
   Select D_{k,Δ_k} such that P_k := {x_k + δ_k d : d ∈ D_{k,Δ_k}} ⊂ F_k
   If f(x) < f(x_k) for some x ∈ P_k then x_{k+1} ← x, Δ_{k+1} ← τ^{−1} Δ_k and go to 4
   Else x_{k+1} ← x_k and Δ_{k+1} ← τ Δ_k
4. Termination
   If Δ_{k+1} ≥ ε_stop then k ← k + 1 and go to 1
   Else stop
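The following Python sketch (illustrative only, not the NOMAD implementation) mirrors the mesh and poll size updates of Algorithm 1 with the search step omitted. For simplicity it polls along the fixed orthogonal directions ±e_i, a 2n-direction positive spanning set, whereas a true MADS instance such as ORTHOMADS draws its orthogonal directions so that their union grows dense on the unit sphere.

```python
import numpy as np

def simple_mads_poll(f, x0, Delta0=1.0, tau=0.5, eps_stop=1e-9, max_iter=10_000):
    """Poll-only loop mirroring Algorithm 1: mesh size delta_k = min(Delta_k, Delta_k^2),
    expand the frame on success, shrink it on failure."""
    x = np.asarray(x0, dtype=float)
    Delta = Delta0
    n = x.size
    D = np.vstack([np.eye(n), -np.eye(n)])      # fixed 2n orthogonal poll directions
    fx = f(x)
    for _ in range(max_iter):
        delta = min(Delta, Delta**2)            # step 1: mesh size parameter
        trials = x + delta * D                  # step 3: poll points on the mesh
        values = np.array([f(t) for t in trials])
        if values.min() < fx:                   # successful poll: move and expand
            x, fx = trials[values.argmin()], values.min()
            Delta /= tau
        else:                                   # unsuccessful poll: shrink the frame
            Delta *= tau
        if Delta < eps_stop:                    # step 4: termination test
            break
    return x, fx

x_best, f_best = simple_mads_poll(lambda x: float(np.sum((x - 2.0) ** 2)), [0.0, 0.0])
print(np.round(x_best, 3), round(f_best, 6))
```

The six ORTHOMADS direction strategies described next differ precisely in how the set of poll directions used in step 3 is constructed at each iteration.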
2.2 ORTHOMADS Variants

MADS has two main instantiations, called ORTHOMADS and LTMADS, the latter being the first developed. Both variants are implemented in the NOMAD software but, as ORTHOMADS is to be preferred for its coverage property in the variable space, it was used for the experiments of this paper with NOMAD version 3.9.1.

The NOMAD implementation of ORTHOMADS provides 6 variants of the algorithm, according to the number of directions used in the poll or to the way the last poll direction is computed. They are listed below.

- ORTHO N + 1 NEG computes n + 1 directions, among which n are orthogonal and the (n + 1)th direction is the opposite of the sum of the first n.
- ORTHO N + 1 UNI computes n + 1 directions, among which n are orthogonal and the (n + 1)th direction is generated from a uniform distribution.
- ORTHO N + 1 QUAD computes n + 1 directions, among which n are orthogonal and the (n + 1)th direction is generated from the minimization of a local quadratic model of the objective.
- ORTHO 2N computes 2n directions that are orthogonal. More precisely, each direction is orthogonal to 2n − 2 directions and collinear with the remaining one.
- ORTHO 1 uses only one direction in the poll.
- ORTHO 2 uses two opposite directions in the poll.

In the plots, the variants are respectively denoted Neg, Uni, Quad, 2N, 1 and 2.

3 Test of the Variants of ORTHOMADS

In this section, we try to identify potentially better direction types of ORTHOMADS and investigate the contribution of the search phase.

3.1 The COCO Platform and the Used Testbeds

The COmparing Continuous Optimizers (COCO) platform [17] is a benchmarking framework for blackbox optimization. In this respect, several suites of standard test problems are provided, each available in several variants, also called instances. The latter are obtained from transformations in variable and objective space in order to make the functions less regular. In particular, the bbob testbed [13] provides 24 continuous problems for blackbox optimization, each of them available in 15 instances and in dimensions 2, 3, 5, 10, 20 and 40. The problems are categorized in five subgroups: separable functions, functions with low or moderate conditioning, ill-conditioned functions, multi-modal functions with global structure and multi-modal weakly structured functions. All problems are known to have their global optima in [−5, 5]^n, where n is the size of the problem.

The mixed-integer suite of problems bbob-mixint [29] is derived from the bbob and bbob-largescale [30] problems by imposing integer constraints on some variables. It consists of the 24 functions of bbob, available in 15 instances and in dimensions 5, 10, 20, 40, 80 and 160.

COCO also provides various tools for algorithm comparison, notably Empirical Cumulative Distribution Function (ECDF) plots (or data profiles) that are
    36 M.-A. Dahitoet al. used in this paper. They show the empirical runtimes, computed as the num- ber of function evaluations to reach given function target values, divided by the dimension. A function target value is defined as ft = f∗ + Δft, where f∗ is the minimum value of a function f and Δft is a target precision. For the bbob and bbob-mixint testbeds, the target precisions are 51 values between 10−8 and 102 . Thus, if a method reaches 1 in the ordinate axis of an ECDF plot, it means 100% of function target values have been reached, including the smallest one f∗ + 10−8 . The presence of a cross on an ECDF curve indicates when the maximal budget of function evaluations is reached. After the cross, COCO estimates the runtimes: it is called simulated restarts. For bbob, an artificial solver called best 2009 is present on the plots and is used as reference solver. Its data comes from the BBOB-2009 workshop1 com- paring 31 solvers. The statistical significance of the results is evaluated in COCO using the rank-sum test. 3.2 Parameter Setting In order to test the performance of the different variants of ORTHOMADS, the bbob and bbob-mixint suites of COCO were used, in particular the problems that have a dimension lower than or equal to 20. This limit in the dimensions has two main reasons: the first one is the computational cost required for the exper- iments and, with the perspective of solving real-world problems, 20 is already a high dimension in this expensive blackbox context. Only the first 5 instances of each function were used, that is a total of respectively 600 and 360 problems used from bbob and bbob-mixint. A maximal function evaluation budget of 2 × 103 × n was set, with n being the dimension of the considered problem. To see the contribution of the search phase, the experiments on the vari- ants were divided in two subgroups: the first one using the default search of ORTHOMADS and the second one where the search phase is disabled. The latter is obtained by setting the four parameters NM SEARCH, VNS SEARCH, SPECULATIVE SEARCH and MODEL SEARCH of NOMAD to the value no. In the plots, the label NoSrch is used when the search is turned off. The search notably includes the use of a quadratic model and of the N-M method. The minimal mesh size was set to 10−11 . Experiments were run with restarts allowed for unsolved problems when the evaluation budget is not reached. This may happen due to internal stopping criteria of the solvers. The initial points used are suggested by COCO through the method initial solution proposal(). 3.3 Results Continuous Problems. As said previously, the contribution of the search phase was studied. The results aggregated on all functions in dimensions 5, 10 1 https://coco.gforge.inria.fr/doku.php?id=bbob-2009-results.
    ORTHOMADS on Continuousand Mixed-Integer Optimization Problems 37 and 20 on the bbob suite are depicted on Fig. 1. They show that enabling the search step in NOMAD generally leads to an equivalent or higher performance of the variants and this improvement can be important. Besides, using one or two directions with or without search is often far from being competitive with the other variants. In particular, 1 NoSrch is often the worst or among the worsts, except on Discus which is an ill-conditioned quadratic function, where it competes with the variants that do not use the search. As mentioned in Sect. 1, the plots depicting the results described in the paper are available online. Looking at the results aggregated on all functions for ORTHO 2N, ORTHO N + 1 NEG, ORTHO N + 1 QUAD and ORTHO N + 1 UNI, the search increases the success rate from nearly 70%, 55% and 40% up to 90%, 80% and 65% respectively in dimensions 2, 3 and 5, as shown in Fig. 1a for dimension 5. From dimension 10, the advantage of the search decreases and the performance of ORTHO N + 1 UNI visibly stands out from the other three variants mentioned above since it decreases with or without the search, as illustrated in Figs. 1b and 1c. Focusing on some families of functions, Neg NoSrch seems slightly less impacted than the other NoSrch variants by the increase of the dimension. On ill-conditioned problems, the variants using search are more sensitive to the increase of the dimension. Considering multi-modal functions with adequate global structure, 2N NoSrch solves 15% more problems than the other NoSrch variants in 2D. In this dimen- sion, the variants using search have a better success rate than the best 2009 up to a budget of 200 function evaluations. From 10D, all curves are rather flat: all ORTHOMADS variants tend to a local optimum. With increasing dimension, Neg is competitive or better than the others on multi-modal problems without global structure, followed by 2N. In particular, in dimension 20 both variants are competitive and outperform the remaining variants that use search on the Gallagher’s Gaussian 101–me peaks function, and Neg outperforms them with a gap of more than 20% in their success rate on the Gallagher’s Gaussian 21–hi peaks function which is also ill-conditioned. Since Neg and 2N are often among the best variants on the considered prob- lems and have an advantage on some multi-modal weakly structured functions, they are chosen for comparison with other solvers. Mixed-Integer Problems. The experiments performed on the mixed-integer problems also show a similar or improved performance of the ORTHOMADS variants when the search step is enabled in NOMAD, as illustrated in Fig. 2 in dimensions 5, 10 and 20. Looking at Fig. 2a for instance, in the given budget of 2 × 103 × n, the variant denoted as 2 solves 75% of the problems in dimension 5 against 42% for 2 NoSrch. However, it is not always the case: the only use of the poll directions is some- times favourable. It is notably the case on the Schwefel function in dimension 20 where the curve Neg NoSrch solves 43% of the problems, which is the highest success rate when the search and non-search settings are compared together.
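As a side note for readers who want a concrete picture of the poll direction types compared throughout this section, the sketch below builds ORTHO 2N and ORTHO N + 1 NEG style direction sets from a Householder transformation, the construction underlying ORTHOMADS [1]. It is a simplified illustration with our own function names; the actual NOMAD code additionally derives the seed direction from a Halton sequence and scales directions to the mesh, which is omitted here.

```python
import numpy as np

def ortho_poll(v, variant="2N"):
    """Simplified ORTHOMADS-style poll directions built from a nonzero vector v.
    The Householder matrix H = I - 2 v v^T / ||v||^2 is orthogonal, so its rows
    form an orthonormal basis of R^n."""
    v = np.asarray(v, dtype=float)
    n = v.size
    H = np.eye(n) - 2.0 * np.outer(v, v) / np.dot(v, v)
    if variant == "2N":        # 2n directions: the basis and its negatives
        return np.vstack([H, -H])
    if variant == "N+1 NEG":   # n orthogonal directions plus their negative sum
        return np.vstack([H, -H.sum(axis=0, keepdims=True)])
    raise ValueError("only the 2N and N+1 NEG variants are sketched here")

# Example: a poll set around the incumbent in dimension 3.
directions = ortho_poll(np.array([1.0, 2.0, -1.0]), variant="N+1 NEG")
```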
    38 M.-A. Dahitoet al. (a) 5D (b) 10D (c) 20D Fig. 1. ECDF plots: the variants of ORTHOMADS with and without the search step on the bbob problems. Results aggregated on all functions in dimensions 5, 10 and 20. When the search is disabled, ORTHO 2N seems preferable in small dimension, namely here in 5D as presented in Fig. 2a. In this dimension, it is sometimes the only variant that solves all the instances of a function in the given budget: it is the case for the step-ellipsoidal function, the two Rosenbrock functions (original and rotated), the Schaffer functions, and the Schwefel function. It also solves all the separable functions in 5D and can therefore solve the different types of problems. Although the difference is less noticeable with the search step enabled, this variant is still a good choice, especially on multi-modal problems with adequate global structure. On the whole, looking at Fig. 2, ORTHO 1 and ORTHO 2 solve less problems than the other variants and the gap in performance with the other direction types increases with the dimension, whether using the search phase or not. Although the use of the search helps solving some functions in low dimension such as the sphere or linear slope functions in 5D, both variants perform poorly in dimension 20 on second-order separable functions, even if the search enables the solution of linear slope which is a linear function. Among these two variants, using 2 poll directions also seems better than only one, especially in dimension 10 where ORTHO 2 solves more than 23% and 40% of problems respectively without and with use of search, against 16% and 31% for ORTHO 1 as presented in Fig. 2b. Among the four remaining variants, ORTHO N + 1 UNI reaches equivalent or less targets than the others whether considering the setting where the search is available or when only the poll directions are used, as depicted in Fig. 2. In particular, in dimension 5, the four variants using more than n+1 poll directions solve more than 85% of the separable problems with or without search. But when the dimension increases, ORTHO N + 1 UNI has a disadvantage on the Rastrigin functions where the use of the search does not noticeably help the convergence of the algorithm. Focusing on the different function types, no algorithm among the variants ORTHO 2N, ORTHO N + 1 NEG and ORTHO N + 1 QUAD seem to particularly outperform the others in dimensions 10 and 20. A higher success rate is however noticeable on multimodal weakly structured problems with search available for ORTHO N + 1 NEG in comparison with ORTHO N + 1 QUAD and for the latter in comparison with
    ORTHOMADS on Continuousand Mixed-Integer Optimization Problems 39 ORTHO 2N. Besides, Neg reaches more targets on problems with low or moderate conditioning. For these reasons, ORTHO N + 1 NEG was chosen for comparison with other solvers. Besides, the mentioned slight advantage of ORTHO N + 1 QUAD over ORTHO 2N, its equivalent or better performance on separable and ill-conditioned functions compared with the latter variant, makes it a good second choice to represent ORTHOMADS. (a) 5D (b) 10D (c) 20D Fig. 2. ECDF plots: the variants of ORTHOMADS with and without the search step on the bbob-mixint problems. Results aggregated on all functions in dimensions 5, 10 and 20. 4 Comparison of ORTHOMADS with other solvers The previous experiments showed the advantage of using the search step in ORTHOMADS to speed up convergence. They also revealed the effectiveness of some variants that are used here for comparisons with other algorithms on the continuous and mixed-integer suites. 4.1 Compared Algorithms Apart from ORTHOMADS, the other algorithms used for comparison on bbob are first, three deterministic algorithms: the quasi-Newton Broyden-Fletcher- Goldfarb-Shanno (BFGS) method [22], the quadratic model-based NEWUOA and the adaptive N-M [14] that is a simplicial search. Stochastic methods are also used among which a Random Search (RS) algorithm [10] and three population- based algorithms: a surrogate-assisted CMA-ES, Differential Evolution (DE) [27] and Particle Swarm Optimization (PSO) [11,18]. In order to perform algorithm comparisons on bbob-mixint, data from four stochastic methods were collected: RS, the mixed-integer variant of CMA-ES, DE and the Tree-structured Parzen Estimator (TPE) [8] that is a stochastic model-based technique. BFGS is an iterative quasi-Newton linesearch method that uses approxima- tions of the Hessian matrix of the objective. At iteration k, the search direc- tion pk solves a linear system Bkpk = −∇f(xk), where xk is the iterate, f the
    40 M.-A. Dahitoet al. objective function and Bk ≈ ∇2 f(xk). The matrix Bk is then updated according to a formula. In the context of BBO, the derivatives are approximated with finite differences. NEWUOA is the Powell’s model-based algorithm for DFO. It is a trust-region method that uses sequential quadratic interpolation models to solve uncon- strained derivative-free problems. The N-M method is a heuristic DFO method that uses simplices. It begins with a non degenerated simplex. The algorithm identifies the worst point among the vertices of the simplex and tries to replace it by reflection, expansion or contraction. If none of these geometric transformations of the worst point enables to find a better point, a contraction preserving the best point is done. The adaptive N-M method uses the N-M technique with adaptation of parameters to the dimension, which is notably useful in high dimensions. RS is a stochastic iterative method that performs a random selection of candidates: at each iteration, a random point is sampled and the best between this trial point and the incumbent is kept. CMA-ES is a state-of-the art evolutionary algorithm used in DFO. Let N(m, C) denote a normal distribution of mean m and covariance matrix C. It can be represented by the ellipsoid x C−1 x = 1. The main axes of the ellip- soid are the eigenvectors of C and the square roots of their lengths correspond to the associated eigenvalues. CMA-ES iteratively samples its populations from multivariate normal distributions. The method uses updates of the covariance matrices to learn a quadratic model of the objective. DE is a meta-heuristic that creates a trial vector by combining the incumbent with randomly chosen individuals from a population. The trial vector is then sequentially filled with parameters from itself or the incumbent. Finally the best vector between the incumbent and the created vector is chosen. PSO is an archive-based evolutionary algorithm where candidate solutions are called particles and the population is a swarm. The particles evolve according to the global best solution encountered but also according to their local best points. TPE is an iterative model-based method for hyperparameter optimization. It sequentially builds a probabilistic model from already evaluated hyperparam- eters sets in order to suggest a new set of hyperparameters to evaluate on a score function that is to be minimized. 4.2 Parameter Setting To compare the considered best variants of ORTHOMADS with other methods, the 15 instances of each function were used and the maximal function evaluation budget was increased to 105 × n, with n being the dimension. For the bbob problems, the data used for BFGS, DE and the adaptive N-M method comes from the experiments of [31]. CMA-ES was tested in [15], the data of NEWUOA is from [26], the one of PSO is from [12] and RS results come from [9]. The comparison data of CMA-ES, DE, RS and TPE used on
    ORTHOMADS on Continuousand Mixed-Integer Optimization Problems 41 the bbob-mixint suite comes from the experiments of [29]. All are accessi- ble from the data archives of COCO with the cocopp.archives.bbob and cocopp.archives.bbob mixint methods. 4.3 Results Continuous Problems. Figures 3 and 4 show the ECDF plots comparing the methods on the different function types and on all functions, respectively in dimensions 5 and 20 on the continuous suite. Compared with BFGS, CMA-ES, DE, the adaptive N-M method, NEWUOA, PSO and RS, ORTHOMADS often performs in the average for medium and high dimensions. For small dimensions 2 and 3, it is however among the most competitive. Considering the results aggregated on all functions and splitting them over all targets according to the function evaluations, they can be divided in three parts. The first one consists of very limited budgets (about 20 × n) where NEWUOA competes with or outperforms the others. After that, BFGS becomes the best for an average budget and CMA-ES outperforms the latter for high evaluation budgets (above the order of 102 × n), as shown in Figs. 3f and 4f. The obtained performance restricted to a low budget is an important feature relevant to many applications for which each function evaluation may last hours or even days. On multi-modal problems with adequate structure, there is a noticeable gap between the performance of CMA-ES, which is the best algorithm on this kind of problems, and the other algorithms as shown by Figs. 3d and 4d. ORTHOMADS performs the best in the remaining methods and competes with CMA-ES for low budgets. It is even the best method up to a budget of 103 × n in 2D and 3D while it competes with CMA-ES in higher dimensions for budgets lower than the order of 102 × n. RS is often the worse algorithm to use on the considered problems. Mixed-Integer Problems. Figures 5 and 6 show the ECDF plots comparing the methods on the different function types and on all functions, respectively in dimensions 5 and 20 on the mixed-integer suite. The comparisons of NEG and QUAD with CMA-ES, DE, RS and TPE show an overall advantage of these ORTHOMADS variants over the other methods. A gap is especially visible on separable and ill-conditioned problems, respectively depicted in Figs. 5a and 6a and Figs. 5c and 6c in dimensions 5 and 20, but also on moderately conditioned problems as shown in Figs. 5b and 6b in 5D and 20D. On multi-modal prob- lems with global structure, ORTHOMADS is to prefer only in small dimensions: from 10D its performance highly deteriorates and CMA-ES and DE seem to be better choices. On multi-modal weakly structured functions, the advantages of ORTHOMADS compared to the others emerge when the dimension increases. Besides, although the performance of all algorithms decreases with increasing dimensions, ORTHOMADS seems less sensitive to that. For instance, for a budget of 102 × n, ORTHOMADS reaches 15% more targets than CMA-ES and TPE that are the second best algorithms until this budget, and in dimension 20 this gap increases to 18% for CMA-ES and 25% for TPE.
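Two of the simplest ingredients among the compared methods (Sect. 4.1) can be sketched in a few lines: the finite-difference gradient that replaces analytical derivatives when BFGS is run on a blackbox, and pure random search. These are illustrative reimplementations with our own names, not the code behind the published data sets used in Sect. 4.2.

```python
import numpy as np

def fd_gradient(f, x, h=1e-6):
    """Forward-difference gradient estimate (one extra evaluation per
    coordinate), used when derivatives of the blackbox are unavailable."""
    fx = f(x)
    g = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        e = np.zeros_like(x, dtype=float)
        e[i] = h
        g[i] = (f(x + e) - fx) / h
    return g

def random_search(f, lower, upper, budget, seed=0):
    """Pure random search: sample uniformly in the box, keep the best point."""
    rng = np.random.default_rng(seed)
    best_x = rng.uniform(lower, upper)
    best_f = f(best_x)
    for _ in range(budget - 1):
        x = rng.uniform(lower, upper)
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f
```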
    42 M.-A. Dahitoet al. (a) Separable (b) Moderately conditioned (c) Ill-conditioned (d) Multi-modal (e) Weakly structured (f) All functions Fig. 3. ECDF plots: comparison of the two variants ORTHO 2N and ORTHO N + 1 NEG of ORTHOMADS with BFGS, NEWUOA, adaptive N-M, RS, CMA-ES, DE and PSO on the bbob problems. Results aggregated on the function types and on all functions in dimension 5. (a) Separable (b) Moderately conditioned (c) Ill-conditioned (d) Multi-modal (e) Weakly structured (f) All functions Fig. 4. ECDF plots: comparison of the two variants ORTHO 2N and ORTHO N + 1 NEG of ORTHOMADS with BFGS, NEWUOA, adaptive N-M, RS, CMA-ES, DE and PSO on the bbob problems. Results aggregated on the function types and on all functions in dimension 20.
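All the comparisons in Figs. 3 and 4 (and the other ECDF plots) are read off curves that report, for each budget, the fraction of (function, target) pairs whose empirical runtime, in evaluations divided by the dimension, does not exceed that budget; the targets are $f^* + \Delta f_t$ with 51 precisions between $10^{-8}$ and $10^{2}$. The helper below reproduces this bookkeeping as a sketch; it is not COCO's own post-processing code.

```python
import numpy as np

# 51 log-spaced target precisions between 1e2 and 1e-8, as in bbob/bbob-mixint.
TARGET_PRECISIONS = np.logspace(2, -8, 51)

def ecdf_value(runtimes, dimension, budget_per_dim):
    """Fraction of (problem, target) pairs reached within the given budget.
    `runtimes` holds, for every (problem, target) pair, the number of function
    evaluations needed to reach the target, or np.inf if never reached."""
    rt = np.asarray(runtimes, dtype=float)
    return float(np.mean(rt / dimension <= budget_per_dim))
```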
    ORTHOMADS on Continuousand Mixed-Integer Optimization Problems 43 On the overall picture, presented in Figs. 5f and 6f, RS performs poorly. The budget allocated to TPE, which is only 102 × n, is way smaller than the ones allocated to the other methods. In this limited budget, TPE competes with CMA-ES in 5D and is better or competitive with DE in 10D and 20D. The latter competes with ORTHOMADS after a budget in the order of 103 × n. Thus, after 5×103 function evaluations, only DE competes with ORTHOMADS in 5D where both methods reach 70% of function-target pairs. Finally, CMA-ES competes with ORTHOMADS when the budget approaches 104 × n function evaluations. Hence, restricted budgets seem to favour the direct local search method while expensive budgets favour the evolutionary algorithms CMA-ES and DE. (a) Separable (b) Moderately conditioned (c) Ill-conditioned (d) Multi-modal (e) Weakly structured (f) All functions Fig. 5. ECDF plots: comparison of the two variants ORTHO N + 1 NEG and ORTHO N + 1 QUAD of ORTHOMADS with RS, CMA-ES, DE and TPE on the bbob-mixint problems. Results aggregated on the function types and on all functions in dimension 5.
    44 M.-A. Dahitoet al. (a) Separable (b) Moderately conditioned (c) Ill-conditioned (d) Multi-modal (e) Weakly structured (f) All functions Fig. 6. ECDF plots: comparison of the two variants ORTHO N + 1 NEG and ORTHO N + 1 QUAD of ORTHOMADS with RS, CMA-ES, DE and TPE on the bbob-mixint problems. Results aggregated on the function types and on all functions in dimension 20. 5 Conclusion This paper investigates the performance of the different poll direction types available in ORTHOMADS on continuous and mixed-integer problems from the literature in a blackbox context. On these two types of problems, ORTHO N + 1 NEG competes with or outperforms the other variants of the algorithm whereas using only 1 or 2 directions is often far from being competitive. On the continuous functions considered, the best poll direction types identi- fied are ORTHO N + 1 NEG and ORTHO 2N, especially on multi-modal weakly struc- tured problems. ORTHOMADS is advantageous in small dimensions and achieves mean results for medium and high dimensions compared to the other algorithms. It also performs well on multi-modal problems with global structure where it competes with CMA-ES for limited budgets. For very limited budgets, the trust-region method NEWUOA is favourable on continuous problems, followed by the linesearch method BFGS for a medium budget and finally the evolutionary algorithm CMA-ES for a high budget. The results on the mixed-integer suite show that, among the poll direction types, ORTHO 2N is preferable in small dimension. Otherwise, ORTHO N + 1 NEG and ORTHO N + 1 QUAD are among the best direction types. Comparing them to other methods show that ORTHOMADS often outperforms the compared algorithms and seems more resilient to the increase of the dimension. For limited budgets, ORTHOMADS seems a good choice among the other considered algorithms to solve unconstrained mixed-integer blackbox problems. This is notably interesting
    ORTHOMADS on Continuousand Mixed-Integer Optimization Problems 45 regarding real-world application problems and, in particular, the mixed-integer optimization problems of Stellantis, where the number of allowed blackbox eval- uations is often limited to a few hundreds. In the latter case, the variables are typically the thicknesses of the sheet metals, considered as continuous, and the materials that are categorical variables encoded as integers. Finally, studying the contribution of the search step of ORTHOMADS shows that disabling it generally leads to a deteriorated performance of the algorithm. Indeed, the default search sequentially executes a N-M search and a quadratic model search that enable a global exploration and accelerate the convergence. However, this effect softens when the dimension increases. References 1. Abramson, M.A., Audet, C., Dennis, J.E., Jr., Le Digabel, S.: OrthoMADS: a deterministic MADS instance with orthogonal directions. SIAM J. Optim. 20(2), 948–966 (2009). https://doi.org/10.1137/080716980 2. Amaran, S., Sahinidis, N.V., Sharda, B., Bury, S.J.: Simulation optimization: a review of algorithms and applications. 4OR 12(4), 301–333 (2014). https://doi. org/10.1007/s10288-014-0275-2 3. Audet, C., Béchard, V., Chaouki, J.: Spent potliner treatment process optimization using a MADS algorithm. Optim. Eng. 9(2), 143–160 (2008). https://doi.org/10. 1007/s11081-007-9030-2 4. Audet, C., Dennis, J.E., Jr.: Mesh adaptive direct search algorithms for constrained optimization. SIAM J. Optim. 17(1), 188–217 (2006). https://doi.org/10.1137/ 040603371 5. Audet, C., Dennis, J.E., Jr.: A progressive barrier for derivative-free nonlinear programming. SIAM J. Optim. 20(1), 445–472 (2009). https://doi.org/10.1137/ 070692662 6. Audet, C., Hare, W.: Derivative-Free and Blackbox Optimization. SSORFE, Springer, Cham (2017). https://doi.org/10.1007/978-3-319-68913-5 7. Audet, C., Le Digabel, S., Tribes, C.: NOMAD user guide. Technical Report G-2009-37, Les cahiers du GERAD (2009). https://www.gerad.ca/nomad/ Downloads/user guide.pdf 8. Bergstra, J., Bardenet, R., Bengio, Y., Kégl, B.: Algorithms for hyper- parameter optimization. In: Advances in Neural Information Processing Systems, vol. 24 (2011). https://proceedings.neurips.cc/paper/2011/file/ 86e8f7ab32cfd12577bc2619bc635690-Paper.pdf 9. Brockhoff, D., Hansen, N.: The impact of sample volume in random search on the bbob test suite. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion, GECCO 2019, pp. 1912–1919. Association for Computing Machinery, New York (2019). https://doi.org/10.1145/3319619.3326894 10. Brooks, S.H.: A discussion of random methods for seeking maxima. Oper. Res. 6(2), 244–251 (1958). https://doi.org/10.1287/opre.6.2.244 11. Eberhart, R., Kennedy, J.: A new optimizer using particle swarm theory. In: MHS 1995. Proceedings of the Sixth International Symposium on Micro Machine and Human Science, pp. 39–43. IEEE (1995). https://doi.org/10.1109/MHS.1995. 494215
    46 M.-A. Dahitoet al. 12. El-Abd, M., Kamel, M.S.: Black-box optimization benchmarking for noiseless func- tion testbed using particle swarm optimization. In: Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers, GECCO 2009, pp. 2269–2274. Association for Computing Machinery, New York (2009). https://doi.org/10.1145/1570256.1570316 13. Finck, S., Hansen, N., Ros, R., Auger, A.: Real-parameter black-box optimiza- tion benchmarking 2009: presentation of the noiseless functions. Technical Report 2009/20, Research Center PPE (2009) 14. Gao, F., Han, L.: Implementing the Nelder-Mead simplex algorithm with adaptive parameters. Comp. Optim. Appl. 51, 259–277 (2012). https://doi.org/10.1007/ s10589-010-9329-3 15. Hansen, N.: A global surrogate assisted CMA-ES. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2019, pp. 664–672. Asso- ciation for Computing Machinery, New York (2019). https://doi.org/10.1145/ 3321707.3321842 16. Hansen, N., Auger, A.: Principled design of continuous stochastic search: from theory to practice. In: Borenstein, Y., Moraglio, A. (eds.) Theory and Principled Methods for the Design of Metaheuristics. NCS, pp. 145–180. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-33206-7 8 17. Hansen, N., Auger, A., Ros, R., Mersmann, O., Tušar, T., Brockhoff, D.: COCO: a platform for comparing continuous optimizers in a black-box setting. Optim. Meth. Softw. 36(1), 114–144 (2021). https://doi.org/10.1080/10556788.2020.1808977 18. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of the IEEE International Conference on Neural Networks, vol. 4, pp. 1942–1948. Citeseer (1995) 19. Le Digabel, S.: Algorithm 909: NOMAD: nonlinear optimization with the MADS algorithm. ACM Trans. Math. Softw. 37(4), 44:1–44:15 (2011). https://doi.org/ 10.1145/1916461.1916468 20. Moré, J.J., Wild, S.M.: Benchmarking derivative-free optimization algorithms. SIAM J. Optim. 20(1), 172–191 (2009). https://doi.org/10.1137/080724083 21. Nelder, J.A., Mead, R.: A simplex method for function minimization. Comput. J. 7(4), 308–313 (1965). https://doi.org/10.1093/comjnl/7.4.308 22. Nocedal, J., Wright, S.J.: Numerical Optimization. Springer, New York (2006). https://doi.org/10.1007/978-0-387-40065-5 23. Powell, M.J.D.: The NEWUOA software for unconstrained optimization without derivatives. In: Di Pillo, G., Roma, M. (eds.) Large-Scale Nonlinear Optimization, pp. 255–297. Springer, Boston (2006). https://doi.org/10.1007/0-387-30065-1 16 24. Regis, R.G.: Constrained optimization by radial basis function interpolation for high-dimensional expensive black-box problems with infeasible initial points. Eng. Optim. 46(2), 218–243 (2014). https://doi.org/10.1080/0305215X.2013.765000 25. Rios, L.M., Sahinidis, N.V.: Derivative-free optimization: a review of algorithms and comparison of software implementations. J. Glob. Optim. 56(3), 1247–1293 (2013). https://doi.org/10.1007/s10898-012-9951-y 26. Ros, R.: Benchmarking the NEWUOA on the BBOB-2009 function testbed. In: Proceedings of the 11th Annual Conference Companion on Genetic and Evolution- ary Computation Conference: Late Breaking Papers, GECCO 2009, pp. 2421–2428. Association for Computing Machinery, New York (2009). https://doi.org/10.1145/ 1570256.1570338 27. Storn, R., Price, K.: Differential evolution-A simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 11(4), 341–359 (1997). 
https://doi.org/10.1023/A:1008202821328
    ORTHOMADS on Continuousand Mixed-Integer Optimization Problems 47 28. Torczon, V.: On the convergence of pattern search algorithms. SIAM J. Optim. 7(1), 1–25 (1997). https://doi.org/10.1137/S1052623493250780 29. Tušar, T., Brockhoff, D., Hansen, N.: Mixed-integer benchmark problems for single- and bi-objective optimization. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2019, pp. 718–726. Association for Computing Machinery, New York (2019). https://doi.org/10.1145/3321707.3321868 30. Varelas, K., et al.: A comparative study of large-scale variants of CMA-ES. In: Auger, A., Fonseca, C.M., Lourenço, N., Machado, P., Paquete, L., Whitley, D. (eds.) PPSN 2018. LNCS, vol. 11101, pp. 3–15. Springer, Cham (2018). https:// doi.org/10.1007/978-3-319-99253-2 1 31. Varelas, K., Dahito, M.A.: Benchmarking multivariate solvers of SciPy on the noiseless testbed. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2019, pp. 1946–1954. Association for Computing Machinery, New York (2019). https://doi.org/10.1145/3319619.3326891
A Look-Ahead Based Meta-heuristics for Optimizing Continuous Optimization Problems Thomas Nordli and Noureddine Bouhmala(B) University of South-Eastern Norway, Kongsberg, Norway {thomas.nordli,noureddine.bouhmala}@usn.no http://www.usn.no

Abstract. In this paper, the well-known Kernighan-Lin algorithm is adapted and embedded into the simulated annealing algorithm and the genetic algorithm for continuous optimization problems. The performance of the different algorithms is evaluated using a set of well-known optimization test functions.

Keywords: Continuous optimization problems · Simulated annealing · Genetic algorithm · Local search

1 Introduction
Several types of meta-heuristic methods have been designed for solving continuous optimization problems. Examples include genetic algorithms [8,9], artificial immune systems [7], and tabu search [5]. Meta-heuristics can be divided into two classes. The first class comprises single-solution search algorithms. A notable example in this class is the popular simulated annealing algorithm (SA) [12], a randomized search that avoids getting stuck in local minima. In addition to solutions that improve the objective function value, SA also accepts solutions that worsen it, using a probabilistic acceptance strategy. The second class comprises population-based algorithms. Algorithms of this class apply the principle of survival of the fittest to a population of potential solutions, iteratively improving the population. During each generation, pairs of solutions, called individuals, are selected to breed a new generation using operators borrowed from natural genetics. This process is repeated until a stopping criterion is reached. The genetic algorithm is one of many methods that belong to this class. The papers [1,2,6] review the literature on the use of evolutionary algorithms for solving continuous optimization problems. In spite of the advantages that meta-heuristics offer, they still suffer from the phenomenon of premature convergence.
© Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 48–55, 2021. https://doi.org/10.1007/978-3-030-91885-9_4
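As a concrete illustration of the probabilistic acceptance strategy mentioned above (detailed in Sect. 2, where the objective f is maximized, a worsening move with gain δf < 0 is accepted with probability exp(δf/T), and the temperature is cooled geometrically with α = 0.9), the following minimal sketch shows the two rules; the function names are ours and this is not the authors' implementation.

```python
import math
import random

def accept_move(gain, temperature):
    """Metropolis-style test for a maximized objective: improving moves are
    always accepted, worsening moves (gain < 0) with probability exp(gain/T)."""
    return gain >= 0 or random.random() < math.exp(gain / temperature)

def cool(temperature, alpha=0.9):
    """Geometric cooling: T_new = alpha * T_old, cf. Eq. 2 in Sect. 2."""
    return alpha * temperature
```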
    A Look-Ahead BasedMeta-heuristics for Continuous Optimization 49 Recently, several studies combined meta-heuristics with local search meth- ods, resulting in more efficient methods with relatively faster convergence, com- pared to pure meta-heuristics. Such hybrid approaches offer a balance between diversification—to cover more regions of a search space, and intensification – to find better solutions within those regions. The reader might refer to [3,14] for further reading on hybrid optimization methods. This paper introduces a hybridization of genetic algorithm and simulated annealing with the variable depth search (VDS) Kernighan-Lin algorithm (KL) (which was firstly presented for graph partitioning problem in [11]). Compared to simple local search methods, KL allows making steps that worsens the quality of the solution on a short term, as long as the result gives an improvement in a longer run. In this work, the search for one favorable move in SA, and one two-point cross-over in GA, are replaced by a search for a favorable sequence of moves in SA and a series of two-point crossovers using the objective function to guide the search. The rest of this paper is organized as follows. Section 2 describes the com- bined simulated annealing and KL heuristic, while Sect. 3 explains the hybridiza- tion of KL with the genetic algorithm. Section 4 lists the functions used in the benchmark while Section 5 shows the experimental results. Finally, Sect. 6 con- cludes the paper with some future work. 2 Combining Simulated Annealing with Local Search Previously, SA in combination with KL, was applied for the Max-SAT problem in [4]. SA iteratively improves a solution by making random perturbations (moves) to the current solution—exploring the neighborhood in the space of possible solutions. It uses a parameter called temperature to control the decision whether to accept bad moves or not. A bad move is a solution that decreases the value of the objective function. The algorithm starts of with a high temperature, when that almost all moves are accepted. For each iteration, the temperature decreases, the algorithm becomes selective, giving higher preference for better solutions. Assuming an objective function f is to be maximized. The algorithm starts computing the initial temperature T, using a procedure similar to the one described in [12]. The temperature is computed such that the probability of accepting a bad move is approximately equal to a given probability of acceptance Pr. First, a low value of is chosen as the initial temperature. This temperature is used during a number of moves. If the ratio of accepted bad moves is less than Pr, the temperature is multi- plied by two. This continues until the observed acceptance ratio exceeds Pr. A random starting solution is generated and its value is calculated. An iteration of the algorithm starts by performing a series of so-called KL perturbations or moves to the solution Sold leading to a new solution Si new where i denotes the number of consecutive moves. The change in the objective function called gain is
computed for each move. The goal of the KL perturbation is to generate a sequence of objective function scores together with their corresponding moves. KL is considered to have converged if the function scores of five consecutive moves are bad moves. The subset of moves having the best cumulative score $BCS_k^{(SA+KL)}$ is then identified. Identifying this subset is equivalent to choosing $k$ so that $BCS_k^{(SA+KL)}$ in Eq. 1 is maximum,

$BCS_k^{(SA+KL)} = \sum_{i=1}^{k} gain(S^i_{new})$   (1)

where $i$ represents the $i$th move performed, $k$ the number of moves, and $gain(S^i_{new}) = f(S^i_{new}) - f(S^{i-1}_{new})$ denotes the resultant change of the objective function when the $i$th move has been performed. If $BCS_k^{(SA+KL)} > 0$, the solution is updated by taking all the perturbations up to index $k$, and the best solution found is always recorded. If $BCS_k^{(SA+KL)} \le 0$, the simulated annealing acceptance test is applied only to the resultant change of the first perturbation: a number from the interval (0, 1) is drawn by a random number generator, and the move is accepted if the drawn number is less than $\exp(-\delta f / T)$. The process of proposing a series of perturbations and selecting the best subset of moves is repeated for a number of iterations before the temperature is updated. The temperature is lowered using geometric cooling, as shown in Eq. 2:

$T_{new} = \alpha \times T_{old}$, where $\alpha = 0.9$.   (2)

3 Combining Genetic Algorithm with Local Search
The Genetic Algorithm (GA) belongs to the group of evolutionary algorithms. It works on a set of solutions called a population. Each of these members, called chromosomes or individuals, is given a score (fitness) that assesses its quality. The individuals of the initial population are in most cases generated randomly. A reproduction operator selects individuals as parents and generates offspring by combining information from the parent chromosomes. The new population may be subject to a mutation operator that introduces diversity into the population. A selection scheme is then used to update the population, resulting in a new generation. This is repeated until convergence is reached, giving an optimal or near-optimal solution. The simple GA as described in [8] is used here. It starts by generating an initial population represented by floating-point numbers. Solutions are temporarily converted to integers when bit manipulation is needed, and the resulting integers are converted back to floating-point representation for storage. A roulette function is used for selection. The implementation is based on the one described in Section IV of [10], where more details can be found. The purpose of KL-Crossover is to perform the crossover operator a number of times, generating a sequence of fitness function scores together with their corresponding crossovers. Thereafter, the subset of consecutive crossovers having the
best cumulative score $BCS_k^{(GA+KL)}$ is determined. The identification of this subset is the same as described for SA-KL: GA-KL chooses $k$ so that $BCS_k^{(GA+KL)}$ in Eq. 4 is maximum, where $CR_i$ represents the $i$th crossover performed on two individuals $I_l$ and $I_m$, $k$ is the number of allowed crossovers, and $gain(I_l, I_m)_{CR_i}$ denotes the resulting change in the fitness function when the $i$th crossover $CR_i$ has been performed, computed as shown in Eq. 3:

$gain(I_l, I_m)_{CR_i} = f(I_l, I_m)_{CR_i} - f(I_l, I_m)_{CR_{i-1}}$,   (3)

where $CR_0$ refers to the chosen pair of parents before applying the crossover operator. KL-Crossover is considered to have converged if the gain of five consecutive crossovers is negative. Finally, the best-fit half of the individuals moves on to the next generation while the other half is removed, and a new round is performed.

$BCS_k^{(GA+KL)} = \max_k \sum_{i=1}^{k} gain(I_l, I_m)_{CR_i}$.   (4)

4 Benchmark Functions and Parameter Setting
The stopping criterion for all four algorithms (SA, SA-Look-Ahead, GA, GA-Look-Ahead) is reached if the best solution has not improved during 100 consecutive iterations for SA and SA-Look-Ahead, or 10 generations for GA and GA-Look-Ahead. The starting temperature for SA and SA-Look-Ahead is set so that a bad move has a probability of 80% of being accepted. In the inner loop of SA and SA-Look-Ahead, equilibrium is considered reached if the fraction of accepted moves is less than 10%. The following ten benchmark functions were retrieved from [13] and tested.
1: Drop-Wave  $f(x, y) = -\dfrac{1 + \cos\!\big(12\sqrt{x^2 + y^2}\big)}{\tfrac{1}{2}(x^2 + y^2) + 2}$
2: Griewank  $f(x) = \dfrac{1}{4000}\sum_{i=1}^{n} x_i^2 - \prod_{i=1}^{n} \cos\!\Big(\dfrac{x_i}{\sqrt{i}}\Big) + 1$
3: Levy  $f(x, y) = \sin^2(3\pi x) + (x-1)^2\big[1 + \sin^2(3\pi y)\big] + (y-1)^2\big[1 + \sin^2(2\pi y)\big]$
4: Rastrigin  $f(x) = 10n + \sum_{i=1}^{n}\big(x_i^2 - 10\cos(2\pi x_i)\big)$
5: Sphere  $f(x) = \sum_{i=1}^{n} x_i^2$
6: Weighted Sphere  $f(x, y) = x^2 + 2y^2$
7: Sum of Different Powers  $f(x) = \sum_{i=1}^{n} |x_i|^{\,i+1}$
8: Rotated Hyper-Ellipsoid  $f(x) = \sum_{i=1}^{n}\sum_{j=1}^{i} x_j^2$
9: Rosenbrock's Valley  $f(x) = \sum_{i=1}^{n-1}\big[100(x_{i+1} - x_i^2)^2 + (1 - x_i)^2\big]$
10: Three-Hump Camel  $f(x, y) = 2x^2 - 1.05x^4 + \dfrac{x^6}{6} + xy + y^2$
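The look-ahead bookkeeping shared by SA-KL and KL-Crossover, namely finding the prefix of a move or crossover sequence with the best cumulative gain as in Eqs. 1 and 4, amounts to a scan over prefix sums. The sketch below shows that scan together with one of the benchmark functions above used as a test objective; the function names are ours and this is an illustration, not the authors' code.

```python
import numpy as np

def best_prefix(gains):
    """Return (k, BCS_k): the number of leading moves (or crossovers) whose
    cumulative gain is largest, cf. Eqs. 1 and 4."""
    best_k, best_score, running = 0, float("-inf"), 0.0
    for i, gain in enumerate(gains, start=1):
        running += gain
        if running > best_score:
            best_k, best_score = i, running
    return best_k, best_score

def rastrigin(x):
    """Benchmark function 4: 10 n + sum_i (x_i^2 - 10 cos(2 pi x_i))."""
    x = np.asarray(x, dtype=float)
    return 10.0 * x.size + float(np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x)))

# Example: gains of five random perturbations when minimizing Rastrigin
# (a positive gain means the move improved, i.e. lowered, the objective).
rng = np.random.default_rng(1)
x = np.full(4, 2.0)
gains = []
for _ in range(5):
    step = rng.normal(scale=0.1, size=4)
    gains.append(rastrigin(x) - rastrigin(x + step))
    x = x + step
k, bcs = best_prefix(gains)   # keep only the first k moves if bcs > 0
```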
    52 T. Nordliand N. Bouhmala Fig. 1. Functions 1–6: GA vs. GA-KL 5 Experimental Results The results are visualized in Fig. 1, 2 and 3, where both the mean and the best solution over 100 runs are plotted. The X-axis shows the number of generations (for GA and GA-KL—Fig. 1 and 2) or the iterations (for SA and SA-KL—Fig. 3), while the Y-axis gives the absolute error (i.e. the excess of deviation from the optimal solution). Figures 1 and 2 compares GA against GA-KL (GA-Look-Ahead). Looking at the mean solution, GA delivers on average lower absolute error solutions on almost 9 cases out of 10. The average percentage change error reduction in favor of GA is 5% for function 1, 12% for function 2, 43% for function 4, within 1% for functions 5, 6, 7 and 8, 14% for function 9 and finally 2% for 10. Function
    A Look-Ahead BasedMeta-heuristics for Continuous Optimization 53 Fig. 2. Functions 7–10: GA vs. GA-KL 3 was the only test case where GA-Look-Ahead wins with an average percent- age change error reduction of 26%. On the other hand, comparing the curves representing the evolution of the (best solution) fittest individual produced by the two algorithms, GA-Look-Ahead is capable of reaching solutions of higher precision when compared to GA (10−8 versus 10−6 for function 2, 10−11 versus 10−6 for function 3, 10−19 versus 10−10 for function 5, 10−15 versus 10−9 for function 6, 10−21 versus 10−12 for function 7, 10−16 versus 10−6 for function 8, 10−14 versus 10−10 for function 10. The diversity of the population produced by GA-Look-Ahead enables GA from premature convergence phenomenon leading GA to continue for more generations before convergence is reached. GA-Look- Ahead performed 69% more generations compared to GA for function 3, 82% for function 5, 57% for function 6, and 25% for function 10. Figure 3 shows the results for SA and SA-KL (SA-Look-Ahead). Looking at the evolution of the mean, both methods produce similar quality of results while SA is showing insignificantly lower absolute error on few cases. However, when comparing the best versus the best, SA-Look-Ahead delivers solutions of better precision than SA. The precision rate is 10−10 versus 10−8 for function 1, 10−11 versus 10−6 for function 4, 10−15 versus 10−9 for function 7, and 10−7 versus 10−4 for function 9.
    54 T. Nordliand N. Bouhmala Fig. 3. Functions 1, 4, 7, 9: SA vs. SA-KL 6 Conclusion Meta-heuristics global optimization algorithms have become widely popular for solving global optimization problems. In this paper, both SA and GA have been combined for the first time with the popular Kernighan-Lin heuristic used for the graph partitioning problem. The main idea is to replace the search for one possible perturbation in SA or one typical crossover in GA by a search for favor- able sequence of perturbations or crossovers using the objective function to be optimized to guide the search. The results presented in this paper show that the proposed scheme enables both SA and GA to reach solutions of higher accuracy which is the significant result of the present study. In addition, the proposed scheme enables GA to maintain the diversity of the population for a longer period preventing GA from reaching premature convergence which still remains a serious defect in GA when applied to both continuous and discrete optimiza- tion. With regard to the future, we believe that the performance of the strategy could be further improved by selecting the type of perturbations to be processed by introducing a probability of acceptance at the level of the Kernighan-Lin heuristic whenever a bad move is to be selected.
    A Look-Ahead BasedMeta-heuristics for Continuous Optimization 55 References 1. Ardia, D., Boudt, K., Carl, P., Mullen, K., Peterson, B.G.: Differential evolution with DEoptim: an application to non-convex portfolio optimization. R J. 3(1), 27–34 (2011) 2. Ardia, D., David, J., Arango, O., Gómez, N.D.G.: Jump-diffusion calibration using differential evolution. Wilmott 2011(55), 76–79 (2011) 3. Arun, N., Ravi, V.: ACONM: A Hybrid of Ant Colony Optimization and Nelder- Mead Simplex Search. Institute for Development and Research in Banking Tech- nology (IDRBT), India (2009) 4. Bouhmala, N.: Combining simulated annealing with local search heuristic for MAX- SAT. J. Heuristics 25(1), 47–69 (2019). https://doi.org/10.1007/s10732-018-9386- 9 5. Chelouah, R., Siarry, P.: Tabu search applied to global optimization. Eur. J. Oper. Res. 123(2), 256–270 (2000) 6. Chelouah, R., Siarry, P.: Genetic and Nelder-mead algorithms hybridized for a more accurate global optimization of continuous multiminima functions. Eur. J. Oper. Res. 148(2), 335–348 (2003) 7. De Castro, L.N., Von Zuben, F.J.: Learning and optimization using the clonal selection principle. IEEE Trans. Evol. Comput. 6(3), 239–251 (2002) 8. Goldberg, D.E.: Genetic algorithms in search. Optimization, and Machine Learning (1989) 9. Holland John, H.: Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor (1975) 10. Jensen, B., Bouhmala, N., Nordli, T.: A novel tangent based framework for opti- mizing continuous functions. J. Emerg. Trends Comput. Inf. Sci. 4(2), 239–247 (2013) 11. Kernighan, B.W., Lin, S.: An efficient heuristic procedure for partitioning graphs. Bell Syst. Tech. J. 49(2), 291–307 (1970) 12. Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing. Science 220(4598), 671–680 (1983) 13. Surjanovic, S., Bingham, D.: Virtual library of simulation experiments: Test func- tions and datasets. http://www.sfu.ca/∼ssurjano 14. Tank, M.: An ant colony optimization and Nelder-mead simplex search hybrid algorithm for unconstrained optimization (2009)
    Inverse Optimization forWarehouse Management Hannu Rummukainen(B) VTT Technical Research Centre of Finland, P.O. Box 1000, 02044 Espoo, Finland Hannu.Rummukainen@vtt.fi Abstract. Day-to-day operations in industry are often planned in an ad-hoc manner by managers, instead of being automated with the aid of mathematical optimization. To develop operational optimization tools, it would be useful to automatically learn management policies from data about the actual decisions made in production. The goal of this study was to investigate the suitability of inverse optimization for automating warehouse management on the basis of demonstration data. The man- agement decisions concerned the location assignment of incoming pack- ages, considering transport mode, classification of goods, and congestion in warehouse stocking and picking activities. A mixed-integer optimiza- tion model and a column generation procedure were formulated, and an inverse optimization method was applied to estimate an objective func- tion from demonstration data. The estimated objective function was used in a practical rolling horizon procedure. The method was implemented and tested on real-world data from an export goods warehouse of a con- tainer port. The computational experiments indicated that the inverse optimization method, combined with the rolling horizon procedure, was able to mimic the demonstrated policy at a coarse level on the train- ing data set and on a separate test data set, but there were substantial differences in the details of the location assignment decisions. Keywords: Multi-period storage location assignment problem · Class-based storage · Inverse optimization · Mixed-integer linear programming 1 Introduction Many industrial management and planning activities involve effective use of available resources, and mathematical optimization algorithms could in princi- ple be applied to make better decisions. However, in practice such activities are often planned manually, using computers only to track and communicate infor- mation without substantially automating the decision making process itself. The reasons for the limited use of optimization include the expense of customising optimization models, algorithms and software systems for individual business needs, and the complexity of real-world business processes and operating envi- ronments. The larger the variety of issues one needs to consider in the planning, c Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 56–71, 2021. https://doi.org/10.1007/978-3-030-91885-9_5
    Inverse Optimization forWarehouse Management 57 the more complex an optimization model is needed, which can make it very challenging to find a satisfactory solution in reasonable time, whether by generic optimization algorithms or case-specific customized heuristics. Moreover, the dynamic nature of business often necessitates regular updates to any decision support tools as business processes and objectives change. To make it easier to develop industrial decision support tools, one potential approach is to apply machine learning methods to data from existing planning processes. The data on actual decisions can be assumed to indicate the pref- erences of competent decision makers, who have considered the effects of all relevant factors and operational constraints in each decision. If a general control policy can be derived from example decisions, the policy can then be replicated in an automated decision support tool to reduce the manual planning effort, or even completely automate the planning process. There would be no need to explicitly model the preferences of the operational management at a particular company. Ideally the control policy learned from data is generalisable to new planning situations that differ from the training data, and can also be explained in an understandable manner to business managers. In addition to implement- ing tools for operational management, machine learning of management policies could also be a useful tool when developing business simulation models for strate- gic decision support. The goal of this case study was to investigate whether inverse optimization would be a suitable machine learning method to automate warehouse manage- ment on the basis of demonstration data. In the study, inverse optimization was applied to the problem of storage location assignment, i.e. where to place incoming packages in a large warehouse. A mixed-integer model and a column generation procedure were formulated for dynamic class-based storage location assignment, with explicit consideration for the arrival and departure points of goods, and possible congestion in warehouse stocking and picking activities. The main contributions of the study are 1) combining the inverse optimiza- tion approach of Ahuja and Orlin [1] with a cutting plane algorithm in the first known application of inverse optimization to a storage location assignment problem, and 2) computational experiments on applying the estimated objective function in a practical rolling horizon procedure with real-world data. 2 Related Work A large variety of analytical approaches have been published for warehouse design and management [6,8,12,21]. Hausman et al. [10] presented an early analysis of basic storage location assignment methods including random stor- age, turnover-based storage in which highest-turnover products are assigned to locations with the shortest transport distance, and class-based storage in which products are grouped into classes that are associated with areas of the warehouse. The location allocation decisions of class-based storage policies may be based on simple rules such as turnover-based ordering, or an explicit optimization model as in [18].
    58 H. Rummukainen Goetschalckxand Ratliff [7] showed that information about storage durations of arriving units can be used in a duration-of-stay-based (DOS-based) shared storage policy that is more efficient than an optimal static dedicated (class- based or product-based) storage policy. Kim and Park [13] used a subgradient optimization algorithm for location allocation, assuming full knowledge of stor- age times and amounts over the planning interval. Chen et al. [5] presented a mixed-integer model and heuristic algorithms to minimize the peak transport load of a DOS-based policy. In the present study, more complex DOS-based policies are considered, with both dynamic class-based location assignment and congestion costs. Updating class-based assignments regularly was proposed by Pierre et al. [20], and further studied by Kofler et al. [14], who called the problem multi-period storage location assignment. Their problem was quite similar to the present study, but both Pierre et al. and Kofler et al. applied heuristic rules without an explicit mathematical programming model, and also included reshuffling moves. The problem of yard planning in container terminals is closely related to warehouse management, and a number of detailed optimization models have been published. [17,24] Congestion constraints for lanes in the yard were already considered by Lee et al. [9,15] Moccia et al. [16] considered storage location assignment as a dynamic generalized assignment problem, in which goods could be relocated over time, but each batch of goods would have to be treated as an unsplittable unit. The present study does not address relocation, but is differ- entiated by class-based assignment, as well as the use of inverse optimization. Pang and Chan [19] applied a data mining algorithm as a component of a dynamic storage location assignment algorithm, but they did not use any external demonstration policy in their approach. Learning to replicate a control policy demonstrated by an expert has been studied by several computational methods, in particular as inverse optimiza- tion [1,3,22] and inverse reinforcement learning [2]. Compared to more generic supervised machine learning methods, inverse optimization and reinforcement learning methods have the advantage that they can take into account the future consequences of present-time actions. The author is not aware of any prior work in applying inverse optimization or inverse reinforcement learning to warehouse management problems. In a rolling horizon procedure, the plan is reoptimized regularly so that con- secutive planning periods overlap. Troutt et al. [23] considered the problem of estimating the objective function of a rolling horizon optimization procedure from demonstrated decisions. They assumed that the initial state for each plan- ning period could be fixed according to the demonstrated decisions, and then minimized the sum of decisional regret over the planning periods: the decisional regret of a planning period was defined as a function of objective function coeffi- cients, and equals the gap between the optimal objective value and the objective value of the demonstrated decisions over the planning period. In the present study, the inverse optimization was performed on a model covering the full
    Inverse Optimization forWarehouse Management 59 demonstration period, and the rolling horizon procedure was not explicitly con- sidered in the inverse optimization at all. 3 Storage Location Assignment Model Fig. 1. Schematics of the warehouse model structure. Left: small-scale example with all connections shown. Right: larger example, with cut down connections for clarity. Goods flow from arrival nodes (solid circles) to storage locations (outlined boxes) and later on to departure nodes (open circles). Locations are grouped into areas (shaded) for congestion modelling. The basic structure of the storage location assignment model is illustrated in Fig. 1. The model covers a number of discrete storage locations with limited capacity. Goods arrive to storage in discrete packages from one or more arrival nodes, and leave through departure nodes: the nodes may represent e.g. load- ing/unloading points of different transport modes, or manufacturing cells at a factory. The storage locations are grouped into areas for the purpose of mod- elling congestion: simultaneous stocking and picking operations on the same area may incur extra cost. In a typical warehouse layout an area would represent a single aisle, as illustrated on the right in Fig. 1. Goods are divided into distinct classes, and one location can only store one class of packages simultaneously. For each arriving package, the decision maker is assumed to know a probability distribution on arrival nodes and departure nodes, as well as the expected storage duration in discrete time units. The deci- sion maker must choose a storage location for each package. The parametric objective function is based on distance parameters between arrival and depar- ture nodes and locations, per-area congestion levels per time step, and cost of storage location allocation. Each location has a fixed capacity, representing avail- able storage volume. Packages are assumed to stay in one location for their entire stay, i.e. relocation actions are not considered. In this study, a mixed-integer linear programming model was developed for storage location assignment over a set of discrete time steps T. The packages to be stored over the planning period were assumed to be known deterministically
    60 H. Rummukainen inadvance. Although the model was intended to be applicable in a relatively short-term rolling-horizon procedure, in the inverse optimization method a linear relaxation of the model was used on historical data over a longer period. 3.1 Definitions Let K be the index set of packages to be stored. There are qk capacity units of packages of kind k ∈ K, stored over time interval Tk ⊆ T. The begin and end time of interval Tk together form the set T k ⊆ Tk. Denoting the set of arrival and departure nodes by N, packages of kind k ∈ K are assumed to arrive via node n ∈ N by probability πnk and depart via node n by probability πkn. The amount qk is split proportionally between the nodes, so that the amounts of arrivals and departures via n ∈ N are given by πnkqk and πknqk; effectively, the optimization objective is the expected cost of the location assignment. The set of storage locations L is grouped into distinct areas A, which are identified with subsets of L. Location l ∈ L can hold Cl capacity units. In order to define piecewise linear area congestion costs, flows to/from areas are specified in R pieces of D capacity units, for a total maximum flow of RD capacity units. For purposes of class-based location assignment, the package kinds K are grouped into distinct classes B, identified with subsets of K. At each time step, the subset of locations occupied by packages of one class is called a pattern. Let P denote the set of patterns, i.e. feasible subsets of L; although the set P is in principle exponential in the number of locations, it is constructed incrementally in a column generation procedure. In rolling horizon planning, the warehouse already contains initial stock at the beginning of the planning period. Let Qtl denote the amount of initial stock remaining at time t ∈ T in location l ∈ L, and Q ta be the amount of initial stock transported at time t ∈ T from area a ∈ A. 3.2 Location Assignment Model The location assignment mixed-integer program includes the following decision variables: xkl ≥ 0 fraction of kind k ∈ K assigned to location l ∈ L, zbpt ∈ {0, 1} assignment of class b ∈ B to location pattern p ∈ P at time t ∈ T, ytar ≥ 0 flow at time t ∈ T from/to area a ∈ A at discretisation level r ∈ {1, . . . , R}. The corresponding objective function coefficients are independent of time: ckl cost of assigning packages of kind k ∈ K to location l ∈ L cbp cost per time step of using pattern p ∈ P for class b ∈ B car transfer cost from/to area a ∈ A at flow discretisation level r ∈ {1, . . . , R} The piecewise linear flow costs must be convex so that the model can be formu- lated without any additional integer variables. Specifically, the car coefficients must be increasing in r, that is car ≤ car for all a ∈ A and r r .
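The requirement that the coefficients $c_{ar}$ be non-decreasing in $r$ is what makes the flow discretisation work without extra binary variables: since the congestion cost is convex and piecewise linear, a minimizing LP fills the cheap pieces first. The small sketch below evaluates such a cost; the function and argument names are ours.

```python
def congestion_cost(flow, marginal_costs, piece_size):
    """Evaluate the convex piecewise-linear congestion cost of a flow split
    into R pieces of `piece_size` capacity units each, the r-th piece being
    charged the r-th marginal cost. With non-decreasing marginal costs a
    minimizing LP fills the cheap pieces first, so no binaries are required.
    Assumes flow <= len(marginal_costs) * piece_size."""
    cost, remaining = 0.0, float(flow)
    for c in sorted(marginal_costs):   # enforce non-decreasing marginal costs
        used = min(remaining, piece_size)
        cost += c * used
        remaining -= used
        if remaining <= 0.0:
            break
    return cost

# Example: three pieces of 10 units with marginal costs 0, 0 and 5.
# A flow of 25 units costs 0*10 + 0*10 + 5*5 = 25.
assert congestion_cost(25, [0.0, 0.0, 5.0], 10) == 25.0
```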
The location assignment mixed-integer program is:

$\min \;\sum_{b \in B,\, p \in P,\, t \in T} c_{bp} z_{bpt} \;+\; \sum_{k \in K,\, l \in L} c_{kl} x_{kl} \;+\; \sum_{t \in T,\, a \in A,\, r=1,\dots,R} c_{ar} y_{tar}$   (1)

$\sum_{p \in P} z_{bpt} = 1 \quad \forall b \in B,\ t \in T$   (2)

$\sum_{b \in B,\ p \in P:\, l \in p} z_{bpt} \le 1 \quad \forall t \in T,\ l \in L$   (3)

$x_{kl} - \sum_{p \in P:\, l \in p} z_{bpt} \le 0 \quad \forall b \in B,\ k \in b,\ t \in T_k,\ l \in L$   (4)

$\sum_{l \in L} x_{kl} = 1 \quad \forall k \in K$   (5)

$\sum_{k \in K:\, t \in T_k} q_k x_{kl} \le C_l - Q_{tl} \quad \forall t \in T,\ l \in L$   (6)

$\sum_{r=1}^{R} y_{tar} - \sum_{k \in K:\, t \in \bar{T}_k,\ l \in a} q_k x_{kl} = \bar{Q}_{ta} \quad \forall t \in T,\ a \in A$   (7)

$y_{tar} \le D \quad \forall t \in T,\ a \in A,\ r = 1, \dots, R$   (8)

The objective function (1) is the sum of class-pattern assignment costs, kind-location assignment costs, and piecewise linear area congestion costs. The constraints (2) require each class of goods to be assigned to a unique location pattern on each time step, and the constraints (3) require that two different classes of goods cannot be at the same location at the same time. The constraints (4) disallow kind-location choices $x_{kl}$ that are inconsistent with the class-pattern choices $z_{bpt}$. The constraints (5) require all goods to be assigned to some location. The constraints (6) bound storage volumes by storage capacity. The constraints (7) link the area flows $y_{tar}$ to the kind-location choices $x_{kl}$ and the flows of initial stocks $\bar{Q}_{ta}$, and the constraints (8) bound feasible flow volumes.

3.3 Cost Model
The cost coefficients $c_{bp}$, $c_{kl}$, $c_{ar}$ are defined in terms of the following cost parameters, which are treated as variables in the inverse optimization.
$d_{nl} \ge 0$ transfer cost per capacity unit from node $n \in N$ to location $l \in L$
$d_{ln} \ge 0$ transfer cost per capacity unit from location $l \in L$ to node $n \in N$
$\gamma_l \ge 0$ cost per time step of allocating location $l \in L$
$\delta \ge 0$ cost per distance unit per time step, based on the diameter of an allocation pattern (the diameter of a pattern is the maximum distance between two locations in the pattern)
$\alpha_r \ge 0$ marginal cost per capacity unit for area flow level $r$
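To illustrate how constraints such as (2), (3), (5) and (6) translate into a modelling language, the fragment below builds them with PuLP over assumed Python data containers; it is a sketch of the model structure, not the implementation used in the study.

```python
import pulp

def build_partial_model(B, T, L, K, P, q, C, Q, T_k):
    """Sketch of constraints (2), (3), (5) and (6) of the location assignment
    model. B, T, L, K: index sets; P: patterns as frozensets of locations;
    q[k], C[l], Q[t, l]: package amounts, capacities and initial stock;
    T_k[k]: set of time steps during which kind k is stored. The objective and
    constraints (1), (4), (7), (8) are omitted from this fragment."""
    prob = pulp.LpProblem("location_assignment", pulp.LpMinimize)
    x = pulp.LpVariable.dicts("x", [(k, l) for k in K for l in L], lowBound=0)
    z = pulp.LpVariable.dicts("z", [(b, p, t) for b in B for p in P for t in T],
                              cat="Binary")
    for b in B:                       # (2): one pattern per class and time step
        for t in T:
            prob += pulp.lpSum(z[b, p, t] for p in P) == 1
    for t in T:                       # (3): at most one class per location
        for l in L:
            prob += pulp.lpSum(z[b, p, t] for b in B for p in P if l in p) <= 1
    for k in K:                       # (5): every package kind fully assigned
        prob += pulp.lpSum(x[k, l] for l in L) == 1
    for t in T:                       # (6): respect remaining location capacity
        for l in L:
            prob += (pulp.lpSum(q[k] * x[k, l] for k in K if t in T_k[k])
                     <= C[l] - Q[t, l])
    return prob, x, z
```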
3.3 Cost Model

The cost coefficients c_{bp}, c_{kl}, c_{ar} are defined in terms of the following cost parameters, which are treated as variables in the inverse optimization:

d_{nl} ≥ 0: transfer cost per capacity unit from node n ∈ N to location l ∈ L,
d_{ln} ≥ 0: transfer cost per capacity unit from location l ∈ L to node n ∈ N,
γ_l ≥ 0: cost per time step of allocating location l ∈ L,
δ ≥ 0: cost per distance unit per time step, based on the diameter of an allocation pattern (the diameter of a pattern is the maximum distance between two locations in the pattern),
α_r ≥ 0: marginal cost per capacity unit for area flow level r.

The above parameters have nominal values, indicated by superscript 0, which are set a priori before the inverse optimization. For the nominal transfer costs d^0_{nl} and d^0_{ln}, the physical distance between node n and location l was used. The nominal cost γ^0_l was set proportionally to the capacity of location l, and the nominal pattern diameter cost δ^0 was set to 0. The nominal marginal flow costs α^0_r were set to 0 below an estimated flow capacity bound, and to a high value above that bound. Additionally, d^0_{ll'} is used to denote the physical distance between locations l, l' ∈ L.

The cost model can now be stated as:

c_{bp} = \sum_{l \in p} \gamma_l + \delta \max_{l, l' \in p} d^0_{ll'} \quad (9)

c_{kl} = q_k \sum_{n \in N} \left( \pi_{nk} d_{nl} + \pi_{kn} d_{ln} \right) \quad (10)

c_{ar} = \sum_{j=1}^{r} \alpha_j \quad (11)

Note that (11) is defined as a cumulative sum to ensure that the cost coefficients c_{ar} are nondecreasing in r.

3.4 Dual Model

Let us associate dual variables with the constraints of the location assignment model as follows: ψ_{bt} ∈ ℝ for (2), φ_{tl} ≥ 0 for (3), ξ_{ktl} ≥ 0 for (4), μ_k ∈ ℝ for (5), λ_{tl} ≥ 0 for (6), η_{ta} ∈ ℝ for (7), and θ_{tar} ≥ 0 for (8). The dual of the linear relaxation of the location assignment model (1)–(8) can now be formulated:

\max \sum_{b \in B,\, t \in T} \psi_{bt} - \sum_{t \in T,\, l \in L} \varphi_{tl} + \sum_{k \in K} \mu_k - \sum_{t \in T,\, l \in L} (C_l - Q_{tl}) \lambda_{tl} + \sum_{t \in T,\, a \in A} \bar{Q}_{ta} \eta_{ta} - \sum_{t \in T,\, a \in A,\, r=1,\dots,R} D \theta_{tar} \quad (12)

\psi_{bt} - \sum_{l \in p} \varphi_{tl} + \sum_{k \in b:\, T_k \ni t,\ l \in p} \xi_{ktl} \le c_{bp} \quad \forall b \in B,\ p \in P,\ t \in T \quad (13)

-\sum_{t \in T_k} \xi_{ktl} + \mu_k - \sum_{t \in T_k} q_k \lambda_{tl} - \sum_{t \in \bar{T}_k} \sum_{a \in A:\, l \in a} q_k \eta_{ta} \le c_{kl} \quad \forall k \in K,\ l \in L \quad (14)

\eta_{ta} - \theta_{tar} \le c_{ar} \quad \forall t \in T,\ a \in A,\ r = 1,\dots,R \quad (15)
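Before turning to pattern generation, note that the cost model (9)–(11) is straightforward to evaluate once the parameters (d, γ, δ, α) are given. The following sketch is illustrative only, with made-up container names, and simply mirrors the three coefficient definitions.

```python
# Hedged sketch: evaluate the cost coefficients (9)-(11) from the cost parameters.
import itertools

def pattern_cost(pattern, gamma, delta, dist0):
    """c_bp of (9): location allocation costs plus delta times the pattern diameter."""
    diameter = max((dist0[l1][l2] for l1, l2 in itertools.product(pattern, pattern)),
                   default=0.0)
    return sum(gamma[l] for l in pattern) + delta * diameter

def kind_location_cost(k, l, q, pi_in, pi_out, d_in, d_out, nodes):
    """c_kl of (10): expected arrival plus departure transfer cost of kind k at location l."""
    return q[k] * sum(pi_in[n][k] * d_in[n][l] + pi_out[k][n] * d_out[l][n] for n in nodes)

def area_flow_cost(alpha, r):
    """c_ar of (11): cumulative sum of marginal costs, nondecreasing in r."""
    return sum(alpha[:r])   # alpha is 0-indexed: alpha[0] is the level-1 marginal cost

# Toy usage (purely illustrative data):
dist0 = {"l1": {"l1": 0.0, "l2": 4.0}, "l2": {"l1": 4.0, "l2": 0.0}}
print(pattern_cost(["l1", "l2"], gamma={"l1": 1.0, "l2": 1.0}, delta=0.5, dist0=dist0))  # 4.0
print(area_flow_cost([1.0, 2.0, 5.0], r=2))                                              # 3.0
```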
3.5 Pattern Generation

To find a solution to the location assignment model (1)–(8), a column generation procedure was applied to its linear relaxation; then the set of patterns P was fixed, and the problem was finally resolved as a mixed-integer program. This is a heuristic that does not guarantee optimality for the mixed-integer problem.

Potentially profitable patterns can be found by detecting violations of the dual constraints (13). Specifically, the procedure seeks p ⊆ L, b ∈ B and t ∈ T that maximize, and reach a positive value for, the expression

\psi_{bt} - \sum_{l \in p} \varphi_{tl} + \sum_{k \in b:\, T_k \ni t,\ l \in p} \xi_{ktl} - c_{bp} = \psi_{bt} - \sum_{l \in p} \Big( \varphi_{tl} + \gamma_l - \sum_{k \in b:\, T_k \ni t} \xi_{ktl} \Big) - \delta \max_{l, l' \in p} d^0_{ll'},

where the definition (9) of c_{bp} has been expanded. Let us denote by V_{lbt} the parenthesized expression on the right-hand side, and by q_{bt} = \sum_{k \in b:\, T_k \ni t} q_k the amount of class b goods at time t.

The column generation subproblem for given b and t is defined in terms of the following decision variables: u_l ∈ {0, 1} indicates whether location l ∈ L belongs in the pattern, v_{lk} ≥ 0 indicates ordered pairs of locations l, k ∈ L that both belong in the pattern, and s ≥ 0 represents the diameter of the pattern. The subproblem is formulated as a mixed-integer program:

\min \sum_{l \in L} V_{lbt} u_l + \delta s \quad (16)

\sum_{l \in L} C_l u_l \ge q_{bt} \quad (17)

s - d^0_{lk} v_{lk} \ge 0 \quad \forall l, k \in L,\ l \ne k \quad (18)

v_{lk} - u_l - u_k \ge -1 \quad \forall l, k \in L,\ l \ne k \quad (19)

The constraint (17) ensures that the selected locations together have sufficient capacity to be feasible in the location assignment problem. The constraints (18) bound s to match the pattern diameter, and (19) together with the minimization objective ensures that v_{lk} = u_l u_k for all l, k ∈ L.

In the column generation procedure, the problem (16)–(19) is solved iteratively by a mixed-integer solver for all classes b ∈ B and time steps t ∈ T, and whenever the objective is smaller than ψ_{bt}, the pattern defined by the corresponding solution is added to the set of patterns P. After all b and t have been checked, the linear program (1)–(8) is resolved with the updated P, and the column generation procedure is repeated with the newly obtained values of (ψ, φ, ξ). If no more profitable patterns are found, or the algorithm runs out of time, the current patterns P are fixed and the location assignment problem (1)–(8) is finally resolved as a mixed-integer program.

Note that c_{bp} is a submodular function of p. Kamiyama [11] presented a polynomial-time approximation algorithm for minimizing a submodular function with covering-type constraints (which (17) is), but in this study it was considered simpler to use a mixed-integer solver.
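The overall pattern generation loop can be sketched as follows. This is an illustrative outline only: solve_master_lp and solve_pricing_mip are hypothetical stand-ins for solver calls to the linear relaxation of (1)–(8) and to the subproblem (16)–(19), respectively, and are supplied by the caller.

```python
# Hedged sketch of the pattern generation loop of Sect. 3.5.
import time

def generate_patterns(P, classes, timesteps, solve_master_lp, solve_pricing_mip,
                      time_limit_s=3600.0, tol=1e-6):
    """Grow the pattern set P until no violated dual constraint (13) remains or time runs out.
    solve_master_lp(P) -> dual values (e.g. duals["psi"][b][t]);
    solve_pricing_mip(b, t, duals) -> (objective value of (16), pattern found)."""
    deadline = time.monotonic() + time_limit_s
    while True:
        duals = solve_master_lp(P)
        new_patterns = []
        for b in classes:
            for t in timesteps:
                objective, pattern = solve_pricing_mip(b, t, duals)
                if objective < duals["psi"][b][t] - tol:   # dual constraint (13) is violated
                    new_patterns.append(pattern)
        for p in new_patterns:
            if p not in P:
                P.append(p)
        if not new_patterns or time.monotonic() > deadline:
            return P
```

After the loop terminates, the fixed pattern set would be reused to resolve (1)–(8) as a mixed-integer program, as described above.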
4 Inverse Optimization Method

The location assignment decisions demonstrated by the human decision makers are assumed to form a feasible solution (z, x, y) to the location assignment model (1)–(8). The goal of the presented inverse optimization method is to estimate the parameters of the cost model in the form (9)–(11), so that the demonstrated solution (z, x, y) is optimal or near-optimal for the costs c.

The demonstrated solution is not assured to be an extreme point, and may be in the relative interior of the feasible region. Moreover, with costs restricted to the form (9)–(11), non-trivial (unique) optimality may be impossible even for some boundary points of the primal feasible region. Because of these issues, in contrast with the inverse linear optimization method of Ahuja and Orlin [1], complementary slackness conditions are not fully enforced, and the duality gap between the primal (1) and dual (12) objectives is minimized as an objective. To make sure that the inverse problem is feasible with nontrivial c, none of the dual constraints (13)–(15) is tightened to strict equality by complementary slackness. Nevertheless, the following additional constraints on the dual solution (ψ, φ, ξ, μ, λ, η, θ) are enforced on the basis of complementary slackness with the fixed primal solution (z, x, y):

\varphi_{tl} = 0 \quad \forall t \in T,\ l \in L:\ \sum_{b \in B,\, p \ni l} z_{bpt} < 1 \quad (20)

\xi_{ktl} = 0 \quad \forall b \in B,\ k \in b,\ t \in T,\ l \in L:\ x_{kl} - \sum_{p \ni l} z_{bpt} < 0 \quad (21)

\lambda_{tl} = 0 \quad \forall t \in T,\ l \in L:\ \sum_{k \in K:\, T_k \ni t} q_k x_{kl} < C_l - Q_{tl} \quad (22)

\theta_{tar} = 0 \quad \forall t \in T,\ a \in A,\ r = 1,\dots,R:\ y_{tar} < D \quad (23)

The duality gap that is to be minimized is given by

\Gamma = \sum_{b \in B,\, p \in P,\, t \in T} c_{bp} z_{bpt} + \sum_{k \in K,\, l \in L} c_{kl} x_{kl} + \sum_{t \in T,\, a \in A,\, r=1,\dots,R} c_{ar} y_{tar} - \sum_{b \in B,\, t \in T} \psi_{bt} + \sum_{t \in T,\, l \in L} \varphi_{tl} - \sum_{k \in K} \mu_k + \sum_{t \in T,\, l \in L} (C_l - Q_{tl}) \lambda_{tl} - \sum_{t \in T,\, a \in A} \bar{Q}_{ta} \eta_{ta} + \sum_{t \in T,\, a \in A,\, r=1,\dots,R} D \theta_{tar}. \quad (24)

Note that Γ ≥ 0 because (z, x, y) is primal feasible and (ψ, φ, ξ, μ, λ, η, θ) is dual feasible.

Similarly to Ahuja and Orlin [1], the cost estimation is regularized by penalising deviations from the nominal cost values, but in a more specific formulation. The deviation objective term is defined as follows, weighting the cost model parameters approximately in proportion to their contribution to the primal objective:

\Delta = \frac{\sum_{t \in T} \sum_{b \in B} \sum_{k \in b:\, T_k \ni t} q_k}{\sum_{l \in L} C_l} \Bigg( \sum_{l \in L} \big| \gamma_l - \gamma^0_l \big| + \frac{\max_{l,l' \in L} d^0_{ll'}}{2} \big| \delta - \delta^0 \big| \Bigg) + \sum_{k \in K} \frac{q_k}{|L|} \sum_{l \in L} \sum_{n \in N} \Big( \pi_{nk} \big| d_{nl} - d^0_{nl} \big| + \pi_{kn} \big| d_{ln} - d^0_{ln} \big| \Big) + \sum_{k \in K} \frac{2 q_k}{R} \sum_{r=1}^{R} \Bigg| \sum_{j=1}^{r} \big( \alpha_j - \alpha^0_j \big) \Bigg| \quad (25)

As this is a minimization objective, the standard transformation of absolute value terms to linear programming constraints and variables applies [4, p. 17].
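For reference, the standard linearization mentioned above replaces each absolute deviation |p − p^0| in Δ by an auxiliary variable bounded from below by ±(p − p^0). The sketch below is illustrative only, uses PuLP and made-up weights, and shows the transformation for the γ_l deviations alone.

```python
# Hedged sketch: linearizing the |gamma_l - gamma0_l| terms of the deviation objective (25).
import pulp

locations = ["l1", "l2", "l3"]
gamma0 = {"l1": 2.0, "l2": 3.0, "l3": 1.0}   # nominal costs (illustrative values)
weight = 0.7                                  # stand-in for the capacity-based weight in (25)

prob = pulp.LpProblem("abs_value_linearization", pulp.LpMinimize)
gamma = pulp.LpVariable.dicts("gamma", locations, lowBound=0)
dev = pulp.LpVariable.dicts("dev", locations, lowBound=0)   # dev_l >= |gamma_l - gamma0_l|

for l in locations:
    prob += dev[l] >= gamma[l] - gamma0[l]
    prob += dev[l] >= gamma0[l] - gamma[l]

# In the full inverse problem this term is added to g*Delta + Gamma; here it stands alone.
prob += weight * pulp.lpSum(dev[l] for l in locations)
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print({l: gamma[l].value() for l in locations})   # with no other constraints, gamma == gamma0
```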
The inverse problem is obtained by substituting the cost model (9)–(11) into the dual constraints (13)–(15), adding the constraints (20)–(23), and minimizing the objective gΔ + Γ, where g is a weighting coefficient in units of storage capacity. The decision variables are the dual variables (ψ, φ, ξ, μ, λ, η, θ) and the cost model parameters (d, γ, δ, α).

The inverse problem is a linear program with an exponential number of constraints (13). The problem can be solved iteratively by first solving the linear program with a limited set of patterns P, then applying the algorithm of Sect. 3.5 to detect violations of the constraints (13) and to update the set P accordingly, and repeating. In other words, the primal column generation procedure of Sect. 3.5 can be applied as a dual cutting plane algorithm. The algorithm can be stopped on timeout with a suboptimal solution.

5 Computational Experiments

A warehouse management case was kindly provided by the company Steveco Oy, a container port operator in Finland. One of the services provided by Steveco is the warehousing of export goods, allowing manufacturers to reduce lead time on deliveries to their foreign customers. The goods in the port warehouses can be shipped either as break-bulk or stuffed in containers at short notice. The case involves the management of inventory in one such port warehouse.

Goods are stored in the warehouse for a period ranging from a day or two up to several weeks. Goods may arrive by either rail or truck, and the two transport modes are usually received on opposite sides of the warehouse. The warehouse is split into grid squares of different sizes. Different goods are classified based on manufacturer-provided order identifiers, such that two classes cannot be stored in the same grid square simultaneously. The warehouse management typically receives advance information of deliveries 12–24 h before arrival at the warehouse. A warehouse foreman allocates grid squares for each class of goods, taking into account the transport mode (rail or truck), whether the goods are likely to be stuffed in a container or transported on a rolltrailer directly to a quay, and, importantly, the requirement that simultaneously transported goods must be split between the aisles of the warehouse so that multiple loaders can access the goods at different aisles without interfering with each other. Currently the management of the warehouse is a manual procedure, which requires time and effort from experienced foremen.

Computational experiments were performed using a 92-day time series of actual storage decisions provided by Steveco. The data described the arrival and departure times, storage locations and classification of each individual package. The data was split into a 46-day training data set and a 46-day test data set. The model time step was 1 h. There were arrival or departure events on 942 h of the 2208-hour period, and a total of 66000 packages in 99 orders (classes of goods). The warehouse had 8 aisles (areas) with a total of 83 grid squares (locations). Arrivals and departures were associated with 8 arrival nodes (6 locations along a spur rail, and truck unloading on 2 sides of the warehouse) and 3 departure nodes (1 quay and 2 truck nodes).
To simulate usage of the model in operational planning, the storage location assignment model of Sect. 3.2 was applied in a rolling horizon procedure: for every hour of simulation time, the mixed-integer model was solved for the following 24-hour period, the decisions of the first hour were fixed, and the procedure was repeated with the planning period shifted forward by one hour. The inverse optimization was computed for the entire 46-day training period at once, without regard to rolling horizon planning. Thus there was no guarantee that the cost parameters computed by the inverse optimization would allow the rolling horizon procedure to accurately replicate the demonstrated decisions.

The following cases are reported in the results:

Example. Demonstrated decisions of the warehouse foremen.

Learned/post. A posteriori optimization over the entire data period, using the estimated objective function in a linear relaxation of the model of Sect. 3.2. This result is what the inverse optimization method actually aims to match with the demonstration solution, and is only included for comparison. This method cannot be used in operational planning.

Nominal/rolling. Rolling horizon planning using the nominal objective function, which corresponds to pure transport distance minimization, along with a high aisle congestion cost above the estimated loader capacity.

Learned/rolling. Rolling horizon planning using the estimated objective function, with the aim of replicating the example behaviour.

The experiments were run on a server with two 18-core 2.3 GHz Intel Xeon E5-2699v3 processors and 256 GB RAM, using the IBM ILOG CPLEX 20.1 mixed-integer programming solver. The inverse optimization time was limited to 12 h, and each rolling horizon step took 0–5 min.

To compare the optimization results with the demonstrated decisions, two numerical measures were defined at different aggregation levels. Let x_{kl} represent the solution of one of the four cases above, and x^{\circ}_{kl} the demonstrated solution of the Example case. The measure DIFFA indicates how closely the volumes assigned to each aisle match the demonstrated solution over time:

\mathrm{DIFFA} = 100\% \cdot \frac{\sum_{a \in A} \sum_{t \in T} \Big| \sum_{k \in K:\, T_k \ni t} \sum_{l \in a} q_k x_{kl} - \sum_{k \in K:\, T_k \ni t} \sum_{l \in a} q_k x^{\circ}_{kl} \Big|}{\sum_{k \in K} 2 q_k |T_k|}. \quad (26)

The measure DIFFB indicates how closely the volumes of each class of goods assigned to each aisle match the demonstrated solution over time:

\mathrm{DIFFB} = 100\% \cdot \frac{\sum_{b \in B,\, a \in A,\, t \in T} \Big| \sum_{k \in b:\, T_k \ni t} \sum_{l \in a} q_k x_{kl} - \sum_{k \in b:\, T_k \ni t} \sum_{l \in a} q_k x^{\circ}_{kl} \Big|}{\sum_{k \in K} 2 q_k |T_k|}. \quad (27)

These measures range from 0%, indicating perfect replication, to 100%, indicating that no goods are assigned to the same aisles in the two solutions.
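The two measures are easy to compute from per-aisle volume profiles; the following sketch (not the author's code; the array layout is an assumption) evaluates (26) and (27) with NumPy.

```python
# Hedged sketch: DIFFA (26) and DIFFB (27) from per-class, per-aisle volume profiles.
import numpy as np

def diff_measures(vol, vol_ref, q, Tk_len):
    """vol, vol_ref: arrays of shape (n_classes, n_aisles, n_timesteps) with the stored
    volume of each class in each aisle at each time step; q: per-kind volumes q_k;
    Tk_len: per-kind storage durations |T_k|. Returns (DIFFA, DIFFB) in percent."""
    denom = np.sum(2.0 * q * Tk_len)
    # DIFFB compares class-by-class aisle volumes.
    diffb = 100.0 * np.abs(vol - vol_ref).sum() / denom
    # DIFFA compares aisle volumes aggregated over classes.
    diffa = 100.0 * np.abs(vol.sum(axis=0) - vol_ref.sum(axis=0)).sum() / denom
    return diffa, diffb

# Tiny example: 2 classes, 2 aisles, 1 time step, two kinds of 1 unit stored for 1 h each.
vol = np.array([[[1.0], [0.0]], [[0.0], [1.0]]])
vol_ref = np.array([[[0.0], [1.0]], [[0.0], [1.0]]])
print(diff_measures(vol, vol_ref, q=np.array([1.0, 1.0]), Tk_len=np.array([1.0, 1.0])))
# -> (50.0, 50.0): the first class sits in a different aisle than in the reference solution.
```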
Fig. 2. Training results: capacity usage over time for each aisle.

Fig. 3. Test results: capacity usage over time for each aisle.
Table 1. Differences in capacity usage between the optimization results and demonstrated decisions. DIFFA indicates differences in per-aisle capacity usage, and DIFFB differences in per-aisle capacity usage of each class of goods; both are relative to the total volume of goods in storage.

Case             DIFFA (training)  DIFFA (test)  DIFFB (training)  DIFFB (test)
Example          0.0%              0.0%          0.0%              0.0%
Learned/post     6.6%              15.8%         9.9%              55.6%
Nominal/rolling  32.3%             26.2%         43.3%             68.1%
Learned/rolling  18.5%             17.2%         32.7%             62.5%

Table 2. Total transport distance of the packages, relative to the demonstrated example solution, on the training and test periods. The comparison includes distances from arrival node to storage location to departure node.

Case             Training  Test
Example          100.0%    100.0%
Learned/post     99.7%     91.6%
Nominal/rolling  86.9%     84.2%
Learned/rolling  97.2%     93.0%

5.1 Results

Graphs of capacity usage over time in the different aisles are shown in Figs. 2 and 3. The graphs give an overview of how well the demonstrated aisle choices could be matched. Note that the differences between these graphs and the Example case are precisely summarized by the DIFFA measure. The numerical values of the measures DIFFA and DIFFB are shown in Table 1. The total transport distance of the packages in the different cases, compared to the demonstrated solution, is shown in Table 2.

In 12 h the inverse optimization method ran for 21 iterations, in which it generated a total of 2007 patterns. In the last 3 h, there were only negligible improvements to the objective value of the inverse problem.

5.2 Discussion

On the training data set the inverse optimization method worked, as can be seen in Fig. 2: the continuous optimization results (Learned/post) match the demonstrated decisions (Example) relatively well. However, the rolling horizon procedure with learned costs (Learned/rolling) diverged substantially from the results of the continuous optimization (Learned/post), possibly due to the short 24 h planning horizon. Nevertheless, as indicated by the numerical measures in
Table 1, the results of the rolling horizon procedure were closer to the demonstrated decisions after the inverse optimization (Learned/rolling) than before (Nominal/rolling).

On the test data set, as seen in Fig. 3 and Table 1, the continuous optimization results (Learned/post) diverged further from the demonstrated decisions (Example) than on the training data set. The results of the rolling horizon procedure, however, appeared to follow the demonstrated decisions somewhat better than on the training data set, both visually and by the DIFFA measure. Again, the results of the rolling horizon procedure were closer to the demonstrated decisions after the inverse optimization (Learned/rolling) than before (Nominal/rolling).

Replicating the demonstrated decisions in more detail than at the level of per-aisle capacity usage was unsuccessful so far. As shown by the DIFFB measure in Table 1, classes of goods were largely assigned to different aisles than in the demonstrated decisions. A particular difficulty in the case study was that there was no machine-readable data on the properties of the classes beyond an opaque order identifier.

Besides the difficulties at the level of classes of goods, comparing the assignments at the level of individual locations was also fruitless. The preliminary conclusion was that since most of the goods would flow through the warehouse aisles from one side to another, choosing one grid square or another in the same aisle would not make a substantial difference in the total work required.

Although matching the demonstrated decisions by inverse optimization had limited success so far, the storage location assignment model of Sect. 3 was found to be valuable. In the Nominal/rolling case, in which the objective was to minimize transport distances with congestion penalties, the total transport distance of the packages was reduced by 13–15% compared to manual planning, as can be seen in Table 2. Based on the computational experiments, the model has the potential to be useful for warehouse management on a realistic scale.

While the algorithm appeared to converge in 12 h with the 46-day training data set, this was at the limit of the current capabilities of the method. In an abandoned attempt to use a 50% larger training data set, the algorithm progressed over 10 times slower than in the presented runs. In particular, the size of the location assignment model is sensitive to the size of the set K, i.e. the number of kinds of packages.

6 Conclusion

The inverse optimization approach of Ahuja and Orlin [1] was applied to the linear relaxation of a mixed-integer storage location assignment problem, and a solution method based on a cutting plane algorithm was presented. The method was tested on real-world data, and the estimated objective function was used in the mixed-integer model in a practical short-term rolling-horizon procedure. The inverse optimization method and the rolling-horizon procedure were able to coarsely mimic a demonstrated storage location assignment policy on the
training data set, as well as on a separate test data set. However, the demonstrated assignments of specific classes of goods could not be replicated well, possibly due to the limitations of the inverse optimization method or due to the nature of the case study. Further research is needed to more accurately follow the demonstrated location assignment policy.

References

1. Ahuja, R.K., Orlin, J.B.: Inverse optimization. Oper. Res. 49(5), 771–783 (2001)
2. Arora, S., Doshi, P.: A survey of inverse reinforcement learning: challenges, methods and progress (2020). arXiv:1806.06877
3. Aswani, A., Shen, Z.J., Siddiq, A.: Inverse optimization with noisy data. Oper. Res. 66(3), 870–892 (2018)
4. Bertsimas, D., Tsitsiklis, J.N.: Introduction to Linear Optimization. Athena Scientific, Belmont (1997)
5. Chen, L., Riopel, D., Langevin, A.: Minimising the peak load in a shared storage system based on the duration-of-stay of unit loads. Int. J. Shipp. Transp. Logist. 1(1), 20–36 (2009)
6. De Koster, R., Le-Duc, T., Roodbergen, K.J.: Design and control of warehouse order picking: a literature review. Eur. J. Oper. Res. 182(2), 481–501 (2007)
7. Goetschalckx, M., Ratliff, H.D.: Shared storage policies based on the duration stay of unit loads. Manage. Sci. 36(9), 1120–1132 (1990)
8. Gu, J., Goetschalckx, M., McGinnis, L.F.: Research on warehouse operation: a comprehensive review. Eur. J. Oper. Res. 177, 1–21 (2007). https://doi.org/10.1016/j.ejor.2006.02.025
9. Han, Y., Lee, L.H., Chew, E.P., Tan, K.C.: A yard storage strategy for minimizing traffic congestion in a marine container transshipment hub. OR Spectr. 30, 697–720 (2008). https://doi.org/10.1007/s00291-008-0127-6
10. Hausman, W.H., Schwarz, L.B., Graves, S.C.: Optimal storage assignment in automatic warehousing systems. Manage. Sci. 22(6), 629–638 (1976)
11. Kamiyama, N.: A note on submodular function minimization with covering type linear constraints. Algorithmica 80, 2957–2971 (2018)
12. Karásek, J.: An overview of warehouse optimization. Int. J. Adv. Telecommun. Electrotech. Sig. Syst. 2(3), 111–117 (2013)
13. Kim, K.H., Park, K.T.: Dynamic space allocation for temporary storage. Int. J. Syst. Sci. 34(1), 11–20 (2003)
14. Kofler, M., Beham, A., Wagner, S., Affenzeller, M.: Robust storage assignment in warehouses with correlated demand. In: Borowik, G., Chaczko, Z., Jacak, W., Luba, T. (eds.) Computational Intelligence and Efficiency in Engineering Systems. SCI, vol. 595, pp. 415–428. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-15720-7_29
15. Lee, L.H., Chew, E.P., Tan, K.C., Han, Y.: An optimization model for storage yard management in transshipment hubs. OR Spectr. 28, 539–561 (2006)
16. Moccia, L., Cordeau, J.F., Monaco, M.F., Sammarra, M.: A column generation heuristic for a dynamic generalized assignment problem. Comput. Oper. Res. 36(9), 2670–2681 (2009). https://doi.org/10.1016/j.cor.2008.11.022
17. Monaco, M.F., Sammarra, M., Sorrentino, G.: The terminal-oriented ship stowage planning problem. Eur. J. Oper. Res. 239, 256–265 (2014)
18. Muppani, V.R., Adil, G.K.: Efficient formation of storage classes for warehouse storage location assignment: a simulated annealing approach. Omega 36, 609–618 (2008)
19. Pang, K.W., Chan, H.L.: Data mining-based algorithm for storage location assignment in a randomised warehouse. Int. J. Prod. Res. 55(14), 4035–4052 (2017). https://doi.org/10.1080/00207543.2016.1244615
20. Pierre, B., Vannieuwenhuyse, B., Domnianta, D., Dessel, H.V.: Dynamic ABC storage policy in erratic demand environments. Jurnal Teknik Industri 5(1), 1–12 (2003)
21. Rouwenhorst, B., Reuter, B., Stockrahm, V., van Houtum, G.J., Mantel, R.J., Zijm, W.H.M.: Warehouse design and control: framework and literature review. Eur. J. Oper. Res. 122, 515–533 (2000)
22. Schaefer, A.J.: Inverse integer programming. Optim. Lett. 3, 483–489 (2009)
23. Troutt, M.D., Pang, W.K., Hou, S.H.: Behavioral estimation of mathematical programming objective function coefficients. Manage. Sci. 52(3), 422–434 (2006)
24. Zhen, L., Xu, Z., Wang, K., Ding, Y.: Multi-period yard template planning in container terminals. Transp. Res. Part B 93, 700–719 (2016)
Model-Agnostic Multi-objective Approach for the Evolutionary Discovery of Mathematical Models

Alexander Hvatov(B), Mikhail Maslyaev, Iana S. Polonskaya, Mikhail Sarafanov, Mark Merezhnikov, and Nikolay O. Nikitin

NSS (Nature Systems Simulation) Lab, ITMO University, Saint-Petersburg, Russia
{alex_hvatov,mikemaslyaev,ispolonskaia,mik_sar,mark.merezhnikov,nnikitin}@itmo.ru

c Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 72–85, 2021.
https://doi.org/10.1007/978-3-030-91885-9_6

Abstract. In modern data science, it is often not enough to obtain only a data-driven model with a good prediction quality. On the contrary, it is more interesting to understand the properties of the model and which parts could be replaced to obtain better results. Such questions are unified under the topic of machine learning interpretability, which can be considered one of the area's rising topics. In the paper, we use multi-objective evolutionary optimization for composite data-driven model learning to obtain models with the desired properties. It means that whereas one of the apparent objectives is precision, the other could be chosen as the complexity of the model, robustness, and many others. The application of the method is shown on examples of multi-objective learning of composite models, differential equations, and closed-form algebraic expressions, which are unified and form an approach for model-agnostic learning of interpretable models.

Keywords: Model discovery · Multi-objective optimization · Composite models · Data-driven models

1 Introduction

The increasing precision of machine learning models indicates that the best-precision model is either overfitted or very complex. Thus, it is used as a black box without an understanding of the principle of the model's work. This fact means that we could not say whether the model can be applied to another sample without a direct experiment. Related questions, such as applicability to a given class of problems, sensitivity, and the meaning of the model's parameters and hyperparameters, raise the interpretability problem [7].

In machine learning, there are two approaches to obtaining a model that describes the data. The first is to fit the learning algorithm hyperparameters and the parameters of a given model to obtain the minimum possible error. The second one is to obtain the model structure (as an example, it is done in neural
    Multi-objective Discovery ofMathematical Models 73 architecture search [1]) or the sequence of models that describe the data with the minimal possible error [12]. We refer to obtaining a set of models or a composite model as a composite model discovery. After the model is obtained, both approaches may require additional model interpretation since the obtained model is still a black box. For model interpre- tation, many different approaches exist. One group of approaches is the model sensitivity analysis [13]. Sensitivity analysis allows mapping the input variability to the output variability. Algorithms from the sensitivity analysis group usually require multiple model runs. As a result, the model behavior is explained rela- tively, meaning how the output changes with respect to the input changing. The second group is the explaining surrogate models that usually have less precision than the parent model but are less complex. For example, linear regres- sion models are used to explain deep neural network models [14]. Additionally, for convolutional neural networks that are often applied to the image classi- fication/regression task, visual interpretation may be used [5]. However, this approach cannot be generalized, and thus we do not put it into the classification of explanation methods. All approaches described above require another model to explain the results. Multi-objective optimization may be used for obtaining a model with desired properties that are defined by the objectives. The Pareto frontier obtained during the optimization can explain how the objectives affect the resulting model’s form. Therefore, both the “fitted” model and the model’s initial interpretation are achieved. We propose a model-agnostic data-driven modeling method that can be used for an arbitrary composite model with directed acyclic graph (DAG) represen- tation. We assume that the composite model graph’s vertices (or nodes) are the models and the vertices define data flow to the final output node. Genetic pro- gramming is the most general approach for the DAG generation using evolution- ary operators. However, it cannot be applied to the composite model discovery directly since the models tend to grow unexpectedly during the optimization process. The classical solution is to restrict the model length [2]. Nevertheless, the model length restriction may not give the best result. Overall, an excessive amount of models in the graph drastically increases the fitting time of a given composite model. Moreover, genetic programming requires that the resulting model is computed recursively, which is not always possible in the composite model case. We refine the usual for the genetic programming cross-over and mutation operators to overcome the extensive growth and the model restrictions. Addition- ally, a regularization operator is used to retain the compactness of the resulting model graph. The evolutionary operators, in general, are defined independently on a model type. However, objectives must be computed separately for every type of model. The advantage of the approach is that it can obtain a data- driven model of a given class. Moreover, the model class change does not require significant changes in the algorithm.
    74 A. Hvatovet al. We introduce several objectives to obtain additional control over the model discovery. The multi-objective formulation allows giving an understanding of how the objectives affect the resulting model. Also, the Pareto frontier provides a set of models that an expert could assess and refine, which significantly reduces time to obtain an expert solution when it is done “from scratch”. Moreover, the multi-objective optimization leads to the overall quality increasing since the population’s diversity naturally increases. Whereas the multi-objective formulation [15] and genetic programming [8] are used for the various types of model discovery separately, we combine both approaches and use them to obtain a composite model for the comprehensive class of atomic models. In contrast to the composite model, the atomic model has single-vector input, single-vector output, and a single set of hyperparameters. As an atomic model, we may consider a single model or a composite model that undergoes the “atomization” procedure. Additionally, we consider the Pareto frontier as the additional tool for the resulting model interpretability. The paper is structured as follows: Sect. 2 describes the evolutionary oper- ators and multi-objective optimization algorithms used throughout the paper, Sect. 3 describes discovery done on the same real-world application dataset for particular model classes and different objectives. In particular, Sect. 3.2 describes the composite model discovery for a class of the machine learning models with two different sets of objective functions; Sect. 3.3 describes the closed-form alge- braic expression discovery; Sect. 3.4 describes the differential equation discovery. Section 4 outlines the paper. 2 Problem Statement for Model-Agnostic Approach The developed model agnostic approach could be applied to an arbitrary com- posite model represented as a directed acyclic graph. We assume that the graph’s vertices (or nodes) are the atomic models with parameters fitted to the given data. It is also assumed that every model has input data and output data. The edges (or connections) define which models participate in generating the given model’s input data. Before the direct approach description, we outline the scheme Fig. 1 that illustrates the step-by-step questions that should be answered to obtain the composite model discovery algorithm. For our approach, we aim to make Step 1 as flexible as possible. Approach application to different classes of functional blocks is shown in Sect. 3. The realization of Step 2 is described below. Moreover, different classes of models require different qualitative measures (Step 3), which is also shown in Sect. 3. The cross-over and mutation schemes used in the described approach do not differ in general from typical symbolic optimization schemes. However, in con- trast to the usual genetic programming workflow, we have to add regularization operators to restrict the model growth. Moreover, the regularization operator allows to control the model discovery and obtain the models with specific prop- erties. In this section, we describe the generalized concepts. However, we note
    Multi-objective Discovery ofMathematical Models 75 Fig. 1. Illustration of the modules of the composite models discovery system and the particular possible choices for the realization strategy. that every model type has its realization details for the specific class of the atomic models. Below we describe the general scheme of the three evolution- ary operators used in the composite model discovery: cross-over, mutation, and regularization shown in Fig. 2. Fig. 2. (Left) The generalized composite model individual: a) individual with models as nodes (green) and the dedicated output node (blue) and nodes with blue and yellow frames that are subjected to the two types of the mutations b1) and b2); b1) mutation with change one atomic model with another atomic model (yellow); b2) mutation with one atomic model replaced with the composite model (orange) (right) The scheme of the cross-over operator green and yellow nodes are the different individuals. The frames are subtrees that are subjected to the cross-over (left), two models after the cross-over operator is applied (right) (Color figure online)
The mutation operator has two variations: replacement of a node with another node, and replacement of a node with a sub-tree. Schemes for both types of mutation are shown in Fig. 2 (left). The two types of mutation can be applied simultaneously. The probabilities of applying a given type of mutation are hyperparameters of the algorithm. We note that, for convenience, some nodes and model types may be made "immutable". This trick is used, for example, in Sect. 3.4 to preserve the differential operator form and thus reduce the optimization space (and consequently the optimization time) without losing generality.

In general, the cross-over operator can be represented as an exchange of subgraphs between two models, as shown in Fig. 2 (right). In the most general genetic programming case, the subgraphs are chosen arbitrarily. However, since not all models have the same input and output, the subgraphs are chosen such that the inputs and outputs of the offspring models are valid for all atomic models.

In order to restrict the excessive growth of the model, we introduce an additional regularization operator, shown in Fig. 3. The amount of described dispersion (for example, measured by the R2 metric) is assessed for each depth level of the graph. The models below the threshold are removed from the tree iteratively, with metric recalculation after each removal. The applied implementation of the regularization operator can also be task-specific (e.g., custom regularization for composite machine learning models [11] and LASSO regression for partial differential equations [9]).

Fig. 3. The scheme of the regularization operator. The numbers are the dispersion ratios described by the child nodes at the given depth level. In the example, the model with dispersion ratio 0.1 is removed from the left-hand model to obtain the simpler model on the right.

Unlike the genetic operators, which are defined in general, the objectives are defined for the given class of atomic models and the given problem. The class of models defines the way an objective function is computed. For example, we consider the objective referred to as "quality", i.e., the ability of the given composite model to predict the data in the test part of the dataset. Machine learning models may be additionally "fitted" to the data with a given composite model structure. Fitting may be applied to all models simultaneously or sequentially for every model. Also, the parameters of the machine learning models may be additionally optimized, which increases the optimization time.
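The regularization operator of Fig. 3 can be sketched as a simple iterative pruning routine. The Node structure and the dispersion-ratio function below are illustrative stand-ins (the real system uses, e.g., R2-based ratios or LASSO depending on the model class), not the authors' implementation.

```python
# Hedged sketch of the regularization operator: iteratively remove child subtrees whose
# dispersion ratio falls below a threshold, recomputing the ratios after each removal.
from dataclasses import dataclass, field

@dataclass
class Node:                              # illustrative composite-model node
    model: str
    children: list = field(default_factory=list)

def iter_nodes(root):
    yield root
    for child in root.children:
        yield from iter_nodes(child)

def regularize(root, ratio_fn, threshold=0.2):
    """ratio_fn(parent, child) -> share of described dispersion attributed to `child`;
    it is a stand-in for the task-specific metric used in the paper."""
    changed = True
    while changed:
        changed = False
        for parent in iter_nodes(root):
            for child in list(parent.children):
                if ratio_fn(parent, child) < threshold:
                    parent.children.remove(child)    # drop the whole subtree
                    changed = True                   # ratios are recomputed on the next pass
                    break
            if changed:
                break
    return root

# Toy usage with a fabricated ratio table, mirroring the 0.1 example of Fig. 3.
tree = Node("output", [Node("ridge", [Node("lagged")]), Node("dtreg")])
ratios = {("output", "ridge"): 0.7, ("output", "dtreg"): 0.1, ("ridge", "lagged"): 0.9}
regularize(tree, lambda p, c: ratios.get((p.model, c.model), 1.0))
print([c.model for c in tree.children])   # -> ['ridge']; the low-ratio 'dtreg' branch is pruned
```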
The atomic models realizing differential equations and algebraic expressions are more straightforward than the machine learning models: the only way to change the quality objective is through the mutation, cross-over, and regularization operators. We note that there may also be a "fitting" procedure for the differential equations; however, we do not introduce variable parameters for the differential terms in the current realization.

In the initial stage of the evolution, according to the standard workflow of the MOEA/DD [6] algorithm, we have to evaluate the best possible value for each of the objective functions. The selection of parents for the cross-over is held for each region of the objective function space. With a specified probability, we can select an individual outside the processed subregion to partake in the recombination; in other cases, if there are candidate solutions in the region associated with the weight vector, we make a selection among them. The final element of MOEA/DD is the population update after creating new solutions, which is held without significant modifications. The resulting algorithm is shown in Algorithm 1.

Algorithm 1: The pseudo-code of model-agnostic Pareto frontier construction

Data: class of atomic models T = {T1, T2, ..., Tn}; (optional) a subclass of immutable models; objective functions
Result: Pareto frontier

Create a set of weight vectors w = (w^1, ..., w^{n_weights}), w^i = (w^i_1, ..., w^i_{n_eq+1});
for each weight vector in weights do
    Select the K nearest weight vectors to the weight vector;
Randomly generate a set of candidate models and divide them into non-dominated levels;
Divide the initial population into groups by the subregion to which they belong;
for epoch = 1 to epoch_number do
    for each weight vector in weights do
        Parent selection;
        Apply recombination to the parents pool and mutation to the individuals inside the region of the weights (Fig. 2);
        for each offspring in the new solutions do
            Apply the regularization operator (Fig. 3);
            Get the values of the objective functions for the offspring;
            Update the population;

To sum up, the proposed approach combines classical genetic programming operators with the regularization operator and the immutability property of selected nodes. The refined MOEA/DD and the refined genetic programming operators obtain composite models for different classes of atomic models.
    78 A. Hvatovet al. 3 Examples In this section, several applications of the described approach are shown. We use a common dataset for all experiments that are described in the Sect. 3.1. While the main idea is to show the advantages of the multi-objective app- roach, the particular experiments show different aspects of the approach realiza- tion for different models’ classes. Namely, we want to show how different choices of the objectives reflect the expert modeling. For the machine learning models in Sect. 3.2, we try to mimic the expert’s approach to the model choice that allows one to transfer models to a set of related problems and use a robust regularization operator. Almost the same idea is pursued in mathematical models. Algebraic expres- sion models in Sect. 3.3 are discovered with the model complexity objective. More complex models tend to reproduce particular dataset variability and thus may not be generalized. To reduce the optimization space, we introduce immutable nodes to restrict the model form without generality loss. The regularization oper- ator is also changed to assess the dispersion relation between the model residuals and data, which has a better agreement with the model class chosen. While the main algebraic expressions flow is valid for partial differential equa- tions discovery in Sect. 3.4, they have specialties, such as the impossibility to solve intermediate models “on-fly”. Therefore LASSO regularization operator is used. 3.1 Experimental Setup The validation of the proposed approach was conducted for the same dataset for all types of models: composite machine learning models, models based on closed-form algebraic expression, and models in differential equations. The multi-scale environmental process was selected as a benchmark. As the dataset for the examples, we take the time series of sea surface height were obtained from numerical simulation using the high-resolution setup of the NEMO model for the Arctic ocean [3]. The simulation covers one year with the hourly time resolution. The visualization of the experimental data is presented in Fig. 4. It is seen from Fig. 4 that the dataset has several variability scales. The com- posite models, due to their nature, can reproduce multiple scales of variability. In the paper, the comparison between single and composite model performance is taken out of the scope. We show only that with a single approach, one may obtain composite models of different classes. 3.2 Composite Machine Learning Models The machine learning pipelines’ discovery methods are usually referred to as automated machine learning (AutoML). For the machine learning model design, the promising task is to control the properties of obtained model. Quality and robustness could be considered as an example of the model’s properties. The
proposed model-agnostic approach can be used to discover robust composite machine learning models with the structure described as a directed acyclic graph (as described in [11]). In this case, the building blocks are regression-based machine learning models, algorithms for feature selection, and feature transformations. A specialized lagged transformation is applied to the input data to adapt the regression models for time series forecasting. This transformation is also represented as a building block of the composite graph [4].

Fig. 4. The multi-scale time series of sea surface height used as a dataset for all experiments.

The quality of the composite machine learning models can be analyzed in different ways. The simplest way is to estimate the quality of the prediction on the test sample directly. However, the uncertainty in both models and datasets makes it necessary to apply robust quality evaluation approaches for an effective analysis of the modeling quality [16]. A stochastic ensemble-based approach can be applied to obtain a set of predictions Y_ens for different modeling scenarios using stochastic perturbation of the input data of the model. In this case, the robust objective function f_i(Y_ens) can be evaluated as follows:

\mu_{ens} = \frac{1}{k} \sum_{j=1}^{k} f_i(Y^{j}_{ens}) + 1, \qquad f_i(Y_{ens}) = \frac{\mu_{ens}}{\sqrt{\frac{1}{k-1} \sum_{j=1}^{k} \left( f_i(Y^{j}_{ens}) - \mu_{ens} \right)^2} + 1} \quad (1)

In Eq. 1, k is the number of models in the ensemble, f is the function for the modelling error, and Y_ens is the ensemble of modelling results for a specific configuration of the composite model. The robust and non-robust error measures of the model cannot be minimized together. In this case, the multi-objective method proposed in the paper can be used to build the predictive model.
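The following sketch is illustrative only: the exact grouping of Eq. (1) is reconstructed here from the flattened text, so the ratio form and the "+1" offsets are assumptions rather than the authors' exact formula.

```python
# Hedged sketch: ensemble-based robust quality measure in the spirit of Eq. (1).
import numpy as np

def robust_objective(errors):
    """errors: error values f_i(Y_ens^j) of one composite model configuration,
    evaluated on k stochastically perturbed versions of the input data."""
    errors = np.asarray(errors, dtype=float)
    k = errors.size
    mu_ens = errors.mean() + 1.0                           # +1 offset as in the printed formula
    spread = np.sqrt(np.sum((errors - mu_ens) ** 2) / (k - 1))
    return mu_ens / (spread + 1.0)                         # assumed ratio form of Eq. (1)

stable = robust_objective([0.50, 0.52, 0.48])
erratic = robust_objective([0.10, 0.90, 0.50])
print(stable, erratic)   # the two configurations trade off mean error against spread
```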
We implement the described approach as a part of the FEDOT framework¹, which allows building various ML-based composite models. The previous implementation of FEDOT allowed us to use multi-objective optimization only for regression and classification tasks with a limited set of objective functions. After the conducted changes, it can be used for custom tasks and objectives. The generative evolutionary approach is used during the experimental studies to discover the composite machine learning model for the sea surface height dataset. The obtained Pareto frontier is presented in Fig. 5.

¹ https://github.com/nccr-itmo/FEDOT.

Fig. 5. The Pareto frontier for the evolutionary multi-objective design of the composite machine learning model. The root mean squared error (RMSE) and the mean-variance measure for RMSE are used as objective functions. The ridge and linear regressions, the lagged transformation (referred to as lag), k-nearest neighbours regression (knnreg) and decision tree regression (dtreg) models are used as parts of the optimal composite models for time series forecasting.

From Fig. 5 we obtain an interpretation that agrees with practical guidelines. Namely, the structures with a single ridge regression (M1) are the most robust, meaning that the dataset partition has the least effect on the coefficients. The single decision tree model is, on the contrary, the model most dependent on the dataset partition.

3.3 Closed-Form Algebraic Expressions

The class of models may include algebraic expressions in order to obtain better interpretability of the model. As the first example, we present the multi-objective algebraic expression discovery example.

As the algebraic expression, we understand a sum of products of atomic functions, which we call tokens.
Basically, a token is an algebraic expression with free parameters (for example, T = T(t; α1, α2, α3) = α3 sin(α1 t + α2) with the free parameter set {α1, α2, α3}), which are subject to optimization. In the present paper, we use pulses, polynomials, and trigonometric functions as the token set.

For mathematical expression models in general, it is helpful to introduce two groups of objectives. The first group of objectives we refer to as "quality". For a given equation M, the quality metric with norm ||·|| is the data D reproduction error, represented as

Q(M) = \| M - D \| \quad (2)

The second group of objectives we refer to as "complexity". For a given equation M, the complexity metric is bound to the length of the equation, denoted as #(M):

C(M) = \#(M) \quad (3)

As an example of objectives, we use the root mean squared error (RMSE) as the quality metric and the number of tokens present in the resulting model as the complexity metric. First, the model's structure is obtained with a separate evolutionary algorithm in order to compute the mean squared error; the details are described in [10]. To perform the single-model evolutionary optimization in this case, we make the arithmetic operations immutable. The resulting directed acyclic graph is shown in Fig. 6. Thus, a third type of node appears: the immutable ones. This step is not necessary, and the general approach described above may be used instead. However, it reduces the search space and thus reduces the optimization time without losing generality.

Fig. 6. The scheme of the composite model generalizing the discovered differential equation, where the red nodes are the nodes unaffected by the mutation or cross-over operators of the evolutionary algorithm. The blue nodes represent the tokens that the evolutionary operators can alter. (Color figure online)

The resulting Pareto frontier for the described class of closed-form algebraic expressions is shown in Fig. 7.
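As an illustration (not the authors' code), the two objective groups can be evaluated for a token-based model as follows; the token form, data and names are made up for the example.

```python
# Hedged sketch: quality (2) and complexity (3) objectives for a sum-of-token-products model.
import numpy as np

def sin_token(a1, a2, a3):
    """Parameterized trigonometric token a3*sin(a1*t + a2), as in the example above."""
    return lambda t: a3 * np.sin(a1 * t + a2)

def model_prediction(token_products, t):
    """A model is a sum of products of tokens: sum_i prod_j token_ij(t)."""
    return sum(np.prod([tok(t) for tok in product], axis=0) for product in token_products)

def quality(token_products, t, data):
    """Q(M) = ||M - D||, here realized as RMSE."""
    return float(np.sqrt(np.mean((model_prediction(token_products, t) - data) ** 2)))

def complexity(token_products):
    """C(M) = #(M): the number of tokens in the expression."""
    return sum(len(product) for product in token_products)

t = np.linspace(0.0, 10.0, 200)
data = 1.5 * np.sin(0.8 * t) + 0.1                   # synthetic sea-level-like signal
model = [[sin_token(0.8, 0.0, 1.5)]]                 # one product with a single token
print(quality(model, t, data), complexity(model))    # RMSE ~0.1 (missing offset), 1 token
```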
    82 A. Hvatovet al. Fig. 7. The Pareto frontier for the evolutionary multi-objective design of the closed- form algebraic expressions. The root mean squared error (RMSE) and model complex- ity are used as objective functions. Since the origin of time series is the sea surface height in the ocean, it is natural to expect that the closed-form algebraic expression is the spectra-like decomposition, which is seen in Fig. 7. It is also seen that as soon as the com- plexity rises, the additional term only adds information to the model without significant changes to the terms that are present in the less complex model. 3.4 Differential Equations The development of a differential equation-based model of a dynamic system can be viewed from the composite model construction point of view. A tree graph rep- resents the equation with input functions, decoded as leaves, and branches repre- senting various mathematical operations between these functions. The specifics of a single equation’s development process were discussed in the article [9]. The evaluation of equation-based model quality is done in a pattern similar to one of the previously introduced composite models. Each equation represents a trade-off between its complexity, which we estimate by the number of terms in it and the quality of a process representation. Here, we will measure this process’s representation quality by comparing the left and right parts of the equation. Thus, the algorithm aims to obtain the Pareto frontier with the quality and complexity taken as the objective functions. We cannot use standard error measures such as RMSE since the partial differential equation with the arbitrary operator cannot be solved automatically. Therefore, the results from previous sections could not be compared using the quality metric.
    Multi-objective Discovery ofMathematical Models 83 Fig. 8. The Pareto frontier for the evolutionary multi-objective discovery of differential equations, where complexity objective function is the number of terms in the left part of the equation, and quality is the approximation error (difference between the left and right parts of the equation). Despite the achieved quality of the equations describing the process, pre- sented in Fig. 8, their predictive properties may be lacking. The most appropriate physics-based equations to describe this class of problems (e.g., shallow-water equations) include spatial partial derivatives that are not available in processing a single time series. 4 Conclusion The paper describes a multi-objective composite models discovery approach intended for data-driven modeling and initial interpretation. Genetic programming is a powerful tool for DAG model generation and opti- mization. However, it requires refinement to be applied to the composite model discovery. We show that the number of changes required is relatively high. There- fore, we are not talking about the genetic programming algorithm. Moreover, the multi-objective formulation may be used to understand how the human- formulated objectives affect the optimization, though this basic interpretation is achieved. As the main advantages we note: – The model and basic interpretation are obtained simultaneously during the optimization – The approach can be applied to the different classes of the models without significant changes – Obtained models could have better quality since the multi-objective problem statement increases diversity which is vital for evolutionary algorithms
    84 A. Hvatovet al. As future work, we plan to work on the unification of the approaches, which will allow obtaining the combination of algebraic-form models and machine learn- ing models, taking best from each of the classes: better interpretability of math- ematical and flexibility machine learning models. Acknowledgements. This research is financially supported by The Russian Science Foundation, Agreement #17-71-30029 with cofinancing of Bank Saint Petersburg. References 1. Elsken, T., Metzen, J.H., Hutter, F., et al.: Neural architecture search: a survey. J. Mach. Learn. Res. 20(55), 1–21 (2019) 2. Grosan, C.: Evolving mathematical expressions using genetic algorithms. In: Genetic and Evolutionary Computation Conference (GECCO). Citeseer (2004) 3. Hvatov, A., Nikitin, N.O., Kalyuzhnaya, A.V., Kosukhin, S.S.: Adaptation of nemo- lim3 model for multigrid high resolution arctic simulation. Ocean Model. 141, 101427 (2019) 4. Kalyuzhnaya, A.V., Nikitin, N.O., Vychuzhanin, P., Hvatov, A., Boukhanovsky, A.: Automatic evolutionary learning of composite models with knowledge enrichment. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion, pp. 43–44 (2020) 5. Konforti, Y., Shpigler, A., Lerner, B., Bar-Hillel, A.: Inference graphs for CNN interpretation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020, Part XXV. LNCS, vol. 12370, pp. 69–84. Springer, Cham (2020). https:// doi.org/10.1007/978-3-030-58595-2 5 6. Li, K., Deb, K., Zhang, Q., Kwong, S.: An evolutionary many-objective optimiza- tion algorithm based on dominance and decomposition. IEEE Trans. Evol. Com- put. 19(5), 694–716 (2014) 7. Lipton, Z.C.: The mythos of model interpretability: in machine learning, the con- cept of interpretability is both important and slippery. Queue 16(3), 31–57 (2018) 8. Lu, Q., Ren, J., Wang, Z.: Using genetic programming with prior formula knowl- edge to solve symbolic regression problem. Comput. Intell. Neurosci. 2016, 1–17 (2016) 9. Maslyaev, M., Hvatov, A., Kalyuzhnaya, A.V.: Partial differential equations dis- covery with EPDE framework: application for real and synthetic data. J. Comput. Sci. 53, 101345 (2021). https://doi.org/10.1016/j.jocs.2021.101345, https://www. sciencedirect.com/science/article/pii/S1877750321000429 10. Merezhnikov, M., Hvatov, A.: Closed-form algebraic expressions discovery using combined evolutionary optimization and sparse regression approach. Procedia Comput. Sci. 178, 424–433 (2020) 11. Nikitin, N.O., Polonskaia, I.S., Vychuzhanin, P., Barabanova, I.V., Kalyuzhnaya, A.V.: Structural evolutionary learning for composite classification models. Procedia Comput. Sci. 178, 414–423 (2020) 12. Olson, R.S., Moore, J.H.: TPOT: a tree-based pipeline optimization tool for automating machine learning. In: Workshop on Automatic Machine Learning, pp. 66–74. PMLR (2016) 13. Saltelli, A., Annoni, P., Azzini, I., Campolongo, F., Ratto, M., Tarantola, S.: Vari- ance based sensitivity analysis of model output. Design and estimator for the total sensitivity index. Comput. Phys. Commun. 181(2), 259–270 (2010)
    Multi-objective Discovery ofMathematical Models 85 14. Tsakiri, K., Marsellos, A., Kapetanakis, S.: Artificial neural network and multiple linear regression for flood prediction in mohawk river, New York. Water 10(9), 1158 (2018) 15. Vu, T.M., Probst, C., Epstein, J.M., Brennan, A., Strong, M., Purshouse, R.C.: Toward inverse generative social science using multi-objective genetic program- ming. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1356–1363 (2019) 16. Vychuzhanin, P., Nikitin, N.O., Kalyuzhnaya, A.V., et al.: Robust ensemble-based evolutionary calibration of the numerical wind wave model. In: Rodrigues, J.M.F. (ed.) ICCS 2019, Part I. LNCS, vol. 11536, pp. 614–627. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-22734-0 45
    A Simple ClusteringAlgorithm Based on Weighted Expected Distances Ana Maria A. C. Rocha1(B) , M. Fernanda P. Costa2 , and Edite M. G. P. Fernandes1 1 ALGORITMI Center, University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal {arocha,emgpf}@dps.uminho.pt 2 Centre of Mathematics, University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal mfc@math.uminho.pt Abstract. This paper contains a proposal to assign points to clusters, represented by their centers, based on weighted expected distances in a cluster analysis context. The proposed clustering algorithm has mecha- nisms to create new clusters, to merge two nearby clusters and remove very small clusters, and to identify points ‘noise’ when they are beyond a reasonable neighborhood of a center or belong to a cluster with very few points. The presented clustering algorithm is evaluated using four randomly generated and two well-known data sets. The obtained cluster- ing is compared to other clustering algorithms through the visualization of the clustering, the value of the DB validity measure and the value of the sum of within-cluster distances. The preliminary comparison of results shows that the proposed clustering algorithm is very efficient and effective. Keywords: Clustering analysis · Partitioning algorithms · Weighted distance 1 Introduction Clustering is an unsupervised machine learning task that is considered one of the most important data analysis techniques in data mining. In a clustering problem, unlabeled data objects are to be partitioned into a certain number of groups (also called clusters) based on their attribute values. The objective is that objects in a cluster are more similar to each other than to objects in another cluster [1–4]. In geometrical terms, the objects can be viewed as points in a a-dimensional space, where a is the number of attributes. Clustering partitions these points into groups, where points in a group are located near one another in space. This work has been supported by FCT – Fundação para a Ciência e Tecnologia within the RD Unit Project Scope UIDB/00319/2020. c Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 86–101, 2021. https://doi.org/10.1007/978-3-030-91885-9_7
    A Simple ClusteringAlgorithm Based on Weighted Expected Distances 87 There are a variety of categories of clustering algorithms. The most tradi- tional include the clustering algorithms based on partition, the algorithms based on hierarchy, algorithms based on fuzzy theory, algorithms based on distribution, those based on density, the ones based on graph theory, based on grid, on fractal theory and based on model. The basic and core ideas of a variety of commonly used clustering algorithms, a comprehensive comparison and an analysis of their advantages and disadvantages are summarized in an interesting review of cluster- ing algorithms [5]. The most used are the partitioning clustering methods, being the K-means clustering the most popular [6]. K-means clustering (and other K- means combinations) subdivide the data set into K clusters, where K is specified in advance. Each cluster is represented by the centroid (mean) of the data points belonging to that cluster and the clustering is based on the minimization of the overall sum of the squared errors between the points and their corresponding cluster center. While partition clustering constructs various partitions of the data set and uses some criteria to evaluate them, hierarchical clustering creates a hierarchy of clusters by combining the data points into clusters, and after that these clusters are combined into larger clusters and so on. It does not require the number of clusters in advance. However, once a merge or a split decision have been made, pure hierarchical clustering does not make adjustments. To overcome this limitation, hierarchical clustering can be integrated with other techniques for multiple phase clustering. Linkage algorithms are agglomerative hierarchical methods that considers merging clusters based on the distance between clusters. For example, the single-link is a type of linkage algorithm that merges the two clusters with the smallest minimum pairwise distance [7]. K-means clustering easily fails when the cluster forms depart from the hyper spherical shape. On the other hand, the single-linkage clustering method is affected by the presence of outliers and the differences in the density of clusters. However, it is not sensi- tive to shape and size of clusters. Model-based clustering assumes that the data set comes from a distribution that is a mixture of two or more clusters [8]. Unlike K-means, the model-based clustering uses a soft assignment, where each point has a probability of belonging to each cluster. Since clustering can be seen as an optimization problem, well-known optimization algorithms may be applied in the cluster analysis, and a variety of them have been combined with the K-means clustering, e.g. [9–11]. Unlike K-means and pure hierarchical clustering, the herein proposed parti- tioning clustering algorithm has mechanisms that dynamically adjusts the num- ber of clusters. The mechanisms are able to add, remove and merge clusters depending on some pre-defined conditions. Although for the initialization of the algorithm, an initial number of clusters to start the points assigning process is required, the proposed clustering algorithm has mechanisms to create new clusters. Points that were not assigned to a cluster (using a weighted expected distance of a particular center to the data points) are gathered in a new cluster. The clustering is also able to merge two nearby clusters and remove the clus- ters that have a very few points. 
Although these mechanisms are broadly similar to others in the literature, this paper makes a relevant contribution
to the object similarity issue. Similarity is usually tackled indirectly by using a distance measure to quantify the degree of dissimilarity among objects, in a way that more similar objects have lower dissimilarity values. A probabilistic approach is proposed to define the weighted expected distances from the cluster centers to the data points. These weighted distances rely on the average variability of the distances of the points to each center with respect to their estimated expected distances. The larger the variability, the smaller the weighted distance. Therefore, weighted expected distances are used to assign points to clusters (that are represented by their centers).
The paper is organized as follows. Section 2 briefly describes the ideas of a cluster analysis and Sect. 3 presents the details of the proposed weighted distances clustering algorithm. In Sect. 4, the results of clustering four sets of data points with two attributes, one set with four attributes, and another with thirteen attributes are shown. Finally, Sect. 5 contains the conclusions of this work.

2 Cluster Analysis

Assume that n objects, each with a attributes, are given. These objects can also be represented by a data matrix X with n vectors/points of dimension a. Each element X_{i,j} corresponds to the jth attribute of the ith point. Thus, given X, a partitioning clustering algorithm tries to find a partition C = {C_1, C_2, ..., C_K} of K clusters (or groups), in a way that the similarity of the points in the same cluster is maximum and points from different clusters differ as much as possible. It is required that the partition satisfies three conditions:
1. each cluster should have at least one point, i.e., |C_k| ≠ 0, k = 1, ..., K;
2. a point should not belong to two different clusters, i.e., C_k ∩ C_l = ∅, for k, l = 1, ..., K, k ≠ l;
3. each point should belong to a cluster, i.e., ∑_{k=1}^{K} |C_k| = n;
where |C_k| is the number of points in cluster C_k. Since there are a number of ways to partition the points and maintain these properties, a fitness function should be provided so that the adequacy of the partitioning is evaluated. Therefore, the clustering problem can be stated as finding an optimal solution, i.e., a partition C*, that gives the optimal (or near-optimal) adequacy when compared to all the other feasible solutions.

3 Weighted Distances Clustering Algorithm

Distance-based clustering is a very popular technique to partition data points into clusters, and can be used in a large variety of applications. In this section, a probabilistic approach is described that relies on the weighted distances (WD) of each data point X_i, i = 1, ..., n, relative to the cluster centers m_k, k = 1, ..., K. These WD are used to assign the data points to clusters.
3.1 Weighted Expected Distances

To compute the WD between the data point X_i and center m_k, we resort to the distance vectors DV_k, k = 1, ..., K, and to the variations VAR_k, k = 1, ..., K, the variability in cluster center m_k relative to all data points. Thus, let DV_k = (DV_{k,1}, DV_{k,2}, ..., DV_{k,n}) be an n-dimensional distance vector that contains the distances of data points X_1, X_2, ..., X_n (a-dimensional vectors) to the cluster center m_k ∈ R^a, where DV_{k,i} = ||X_i − m_k||_2 (i = 1, ..., n). The componentwise total of the distance vectors, relative to a particular data point X_i, is given by:

T_i = ∑_{k=1}^{K} DV_{k,i}.

The probability of m_k being surrounded and having a good centrality relative to all points in the data set is given by

p_k = (∑_{i=1}^{n} DV_{k,i}) / (∑_{i=1}^{n} T_i).   (1)

Therefore, the expected value for the distance vector component i (relative to point X_i) of the vector DV_k is E[DV_{k,i}] = p_k T_i. Then, the WD is defined for each component i of the vector DV_k, WD(k, i), using

WD(k, i) = DV_{k,i} / VAR_k   (2)

(corresponding to the weighted distance assigned to data point X_i by the cluster center m_k), where the variation VAR_k is the average amount of variability of the distances of the points to the center m_k, relative to the expected distances:

VAR_k = ( (1/n) ∑_{i=1}^{n} (DV_{k,i} − E[DV_{k,i}])^2 )^{1/2}.

The larger the VAR_k, the lower the weighted distances WD(k, i), i = 1, ..., n, of the points relative to that center m_k, when compared to the other centers. A larger VAR_k means that, on average, the difference between the estimated expected distances and the real distances to that center m_k is larger than to the other centers.

3.2 The WD-Based Algorithm

Algorithm 1 presents the main steps of the proposed weighted expected distances clustering algorithm. For initialization, a problem-dependent number of clusters is provided, e.g. K, so that the algorithm randomly selects K points from the data set to initialize the cluster centers. To determine which data points are assigned to a cluster, the algorithm uses the WD as a fitness function. Each point X_i has a WD relative to a cluster center m_k, the above defined WD(k, i).
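The quantities of Sect. 3.1 are easy to compute in vectorized form. The following is a minimal NumPy sketch of that computation; the paper's own implementation is in MATLAB, so the function and variable names here are illustrative only.

import numpy as np

def weighted_expected_distances(X, M):
    """Sketch of the WD computation of Sect. 3.1.
    X : (n, a) data points; M : (K, a) current cluster centers.
    Returns the (K, n) matrix WD and the raw distances DV."""
    # DV[k, i] = Euclidean distance from point X_i to center m_k
    DV = np.linalg.norm(X[None, :, :] - M[:, None, :], axis=2)   # (K, n)
    T = DV.sum(axis=0)                                           # componentwise totals T_i
    p = DV.sum(axis=1) / T.sum()                                 # p_k, Eq. (1)
    E = p[:, None] * T[None, :]                                  # E[DV_{k,i}] = p_k T_i
    VAR = np.sqrt(np.mean((DV - E) ** 2, axis=1))                # VAR_k
    WD = DV / VAR[:, None]                                       # Eq. (2)
    return WD, DV

A center with a large VAR_k divides its row of DV by a larger number, which is exactly the effect described in the text: its weighted distances become smaller relative to the other centers.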
The lower the better. The minimization of WD(k, i) relative to k takes care of the local property of the algorithm, assigning X_i to cluster C_k.

Algorithm 1. Clustering Algorithm
Require: data set X ≡ X_{i,j}, i = 1, ..., n, j = 1, ..., a; It_max; N_min
1: Set initial K; It = 1
2: Randomly select a set of K points from the data set X to initialize the K cluster centers m_k, k = 1, ..., K
3: repeat
4:   Based on m_k, k = 1, ..., K, assign the data points to clusters and eventually add a new cluster, using Algorithm 2
5:   Compute η_D using (4)
6:   Based on η_D, eventually remove clusters by merging two at a time, and remove clusters with fewer than N_min points, using Algorithm 3
7:   Set m̄_k ← m_k, k = 1, ..., K
8:   Compute improved values for the cluster centers m_k, k = 1, ..., K, using Algorithm 4
9:   Set It = It + 1
10: until It > It_max or max_{k=1,...,K} ||m̄_k − m_k||_2 ≤ 1e−3
11: return K*, m_k, k = 1, ..., K*, and the optimal C* = {C*_1, C*_2, ..., C*_{K*}}

Thus, to assign a point X_i to a cluster, the algorithm identifies the center m_k that has the lowest weighted distance value to that point, and if WD(k, i) is inferior to the quantity 1.5η_k, where η_k is a threshold value for the center m_k, the point is assigned to that cluster. On the other hand, if WD(k, i) is between 1.5η_k and 2.5η_k, a new cluster C_{K+1} is created and the point is assigned to C_{K+1}; otherwise, the WD(k, i) value exceeds 2.5η_k and the point is considered 'noise'. Thus, all points that have a minimum weighted distance to a specific center m_k between 1.5η_k and 2.5η_k are assigned to the same cluster C_{K+1}, and if the minimum weighted distances exceed 2.5η_k, the points are 'noise'. The goal of the threshold η_k is to define a bound delimiting the proximity to a cluster center. The limit 1.5η_k defines the region of a good neighborhood, and beyond 2.5η_k the region of 'noise' is defined. Similarity in the weighted distances and proximity to a cluster center m_k are decided using the threshold η_k. For each cluster k, η_k is found by using the average of the WD(k, ·) of the points which are in the neighborhood of m_k and have magnitudes similar to each other. The similarity between the magnitudes of the WD is measured by sorting their values and checking whether the consecutive differences remain lower than a threshold Δ_WD. This value depends on the number of data points and the median of the matrix of WD values (see Algorithm 2 for the details):

Δ_WD = ((n − 1)/n) · median[WD(k, i)]_{k=1,...,K, i=1,...,n}.   (3)

During the iterative process, clusters that are close to each other, measured by the distance between their centers, may be merged if the distance between
their centers is below a threshold, herein denoted as η_D. This parameter value is defined as a function of the search region of the data points and also depends on the current number of clusters and the number of attributes of the data set:

η_D = min_{j=1,...,a}{max_i X_{i,j} − min_i X_{i,j}} / (K(a − 1)).   (4)

Furthermore, all clusters that have a relatively small number of points, i.e., all clusters that verify |C_k| < N_min, where N_min = min{50, max{2, 0.05n}}, are removed, their centers are deleted, and the points are coined as 'noise' (for the current iteration); see Algorithm 3.

Algorithm 2. Assigning Points Algorithm
Require: m_k for k = 1, ..., K; data set X_{i,j}, i = 1, ..., n, j = 1, ..., a
1: Compute each component i of the vector DV_k, i = 1, ..., n (k = 1, ..., K)
2: Compute the matrix of WD as shown in (2)
3: Compute Δ_WD using (3)
4: for k = 1, ..., K do
5:   WD_sort(k) ← sort(WD(k, :))
6: end for
7: for k = 1, ..., K do
8:   for i = 1, ..., n − 1 do
9:     if (WD_sort(k, i + 1) − WD_sort(k, i)) > Δ_WD then
10:      break
11:    end if
12:  end for
13:  Compute η_k = (∑_{l=1}^{i} WD_sort(k, l)) / i
14: end for
15: for i = 1, ..., n do
16:  Set k = arg min_{k=1,...,K} WD(k, i)
17:  if WD(k, i) ≤ 1.5η_k then
18:    Assign point X_i to the cluster C_k
19:  else
20:    if WD(k, i) ≤ 2.5η_k then
21:      Assign point X_i to the new cluster C_{K+1}
22:    else
23:      Define point X_i as 'noise' (class X_noise)
24:    end if
25:  end if
26: end for
27: Compute center m_{K+1} as the centroid of all points assigned to C_{K+1}
28: Update K
29: return K, m_k, C_k, for k = 1, ..., K, and X_noise

Summarizing, to achieve the optimal number of clusters, Algorithm 1 proceeds iteratively by:
– adding a new cluster whose center is represented by the centroid of all points that were not assigned to any cluster according to the rules previously defined;
– defining points as 'noise' if they are not in an appropriate neighborhood of the center that is closest to those points, or if they belong to a cluster with very few points;
– merging clusters that are sufficiently close to each other, as well as removing clusters with very few points.
On the other hand, the position of each cluster center can be iteratively improved by using the centroid of the data points that have been assigned to that cluster and the previous cluster center [12,13], as shown in Algorithm 4.

Algorithm 3. Merging and Removing Clusters Algorithm
Require: K; m_k, C_k, k = 1, ..., K; η_D; N_min; X_noise
1: for k = 1 to K − 1 do
2:   for l = k + 1 to K do
3:     Compute D_{k,l} = ||m_k − m_l||_2
4:     if D_{k,l} ≤ η_D then
5:       Compute m_k = (|C_k| m_k + |C_l| m_l) / (|C_k| + |C_l|)
6:       Assign all points X ∈ C_l to the cluster C_k
7:       Delete m_l (and remove C_l)
8:       Update K
9:     end if
10:  end for
11: end for
12: for k = 1, ..., K do
13:  if |C_k| < N_min then
14:    Define all X ∈ C_k as 'noise' points (add to class X_noise)
15:    Delete m_k (remove C_k)
16:    Update K
17:  end if
18: end for
19: return K, m_k, C_k for k = 1, ..., K, and X_noise

Algorithm 4. Update Cluster Centers Algorithm
Require: m_k, C_k, k = 1, ..., K
1: Update all cluster centers:
   m_k = (1/(|C_k| + 1)) ( ∑_{X_l ∈ C_k} X_l + m_k ),  for k = 1, ..., K
2: return m_k, k = 1, ..., K
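To make the three thresholds concrete, the following Python sketch implements the assignment rule of Algorithm 2 and the center update of Algorithm 4, reusing the weighted_expected_distances helper from the earlier sketch. It is an illustration under those assumptions, not the authors' MATLAB code, and the merging/removal bookkeeping of Algorithm 3 is omitted for brevity.

import numpy as np

def eta_per_cluster(WD):
    """eta_k: mean of the sorted WD(k, .) values up to the first consecutive
    difference larger than Delta_WD (Eq. (3)), as in Algorithm 2."""
    K, n = WD.shape
    delta_wd = (n - 1) / n * np.median(WD)
    eta = np.empty(K)
    for k in range(K):
        s = np.sort(WD[k])
        jumps = np.diff(s) > delta_wd
        i = np.argmax(jumps) + 1 if jumps.any() else n   # values before the first jump
        eta[k] = s[:i].mean()
    return eta

def assign_points(X, WD, eta):
    """Assignment rule of Algorithm 2: existing cluster, new cluster, or noise."""
    K, n = WD.shape
    k_best = WD.argmin(axis=0)                    # closest center (weighted) per point
    wd_best = WD[k_best, np.arange(n)]
    labels = np.where(wd_best <= 1.5 * eta[k_best], k_best,
             np.where(wd_best <= 2.5 * eta[k_best], K, -1))
    return labels                                  # label K = new cluster C_{K+1}, -1 = noise

def update_centers(X, labels, M):
    """Algorithm 4: centroid of the assigned points plus the previous center."""
    M_new = M.copy()
    for k in range(M.shape[0]):
        pts = X[labels == k]
        if len(pts):
            M_new[k] = (pts.sum(axis=0) + M[k]) / (len(pts) + 1)
    return M_new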
4 Computational Results

In this section, some preliminary results are shown. Four sets of data points with two attributes, one set with four attributes (known as 'Iris') and one set with thirteen attributes (known as 'Wine') are used to compute and visualize the partitioning clustering. The algorithms are coded in MATLAB®. The performance of Algorithm 1, based on Algorithm 2, Algorithm 3 and Algorithm 4, depends on some parameter values that are problem dependent and dynamically defined: initial K = min{10, max{2, 0.01n}}, N_min = min{50, max{2, 0.05n}}, Δ_WD > 0, η_k > 0, k = 1, ..., K, η_D > 0. The clustering results were obtained for It_max = 5, except for 'Iris' and 'Wine', where It_max = 20 was used.

4.1 Generated Data Sets

The generator code for the first four data sets is the following:

Problem 1. 'mydata' with 300 data points and two attributes (available at Mostapha Kalami Heris, Evolutionary Data Clustering in MATLAB, https://yarpiz.com/64/ypml101-evolutionary-clustering, Yarpiz, 2015).

Problem 2. 2000 data points and two attributes.
mu1 = [1 2]; Sigma1 = [2 0; 0 0.5];
mu2 = [-3 -5]; Sigma2 = [1 0; 0 1];
X = [mvnrnd(mu1,Sigma1,1000); mvnrnd(mu2,Sigma2,1000)];

Problem 3. 200 data points and two attributes.
X = [randn(100,2)*0.75+ones(100,2); randn(100,2)*0.5-ones(100,2)];

Problem 4. 300 data points and two attributes.
mu1 = [1 2]; sigma1 = [3 .2; .2 2];
mu2 = [-1 -2]; sigma2 = [2 0; 0 1];
X = [mvnrnd(mu1,sigma1,200); mvnrnd(mu2,sigma2,100)];

In Figs. 1, 2, 3 and 4, the results obtained for Problems 1–4 are depicted:
– plot (a) is the result of assigning the points (applying Algorithm 2) to the clusters based on the initialized centers (randomly selected K points of the data set);
– plot (a) may contain a point, represented by a dedicated symbol, which identifies the position of the center of the cluster that is added in Algorithm 2 with the points that were not assigned (according to the previously referred rule) to a cluster;
– plots (b) and (c) are obtained after the application of Algorithm 4 to update the centers that result from the clustering obtained by Algorithm 2 and Algorithm 3, at an iteration;
– the final clustering is visualized in plot (d);
– plots (a), (b), (c) or (d) may contain 'noise' points, identified by the symbol '×';
– plot (e) shows the values of the Davies-Bouldin (DB) index, at each iteration, obtained after assigning the points to clusters (according to the previously referred rule) and after the update of the cluster centers;
– plot (f) contains the K-means clustering when K = K* is provided to the algorithm. The code that implements the K-means clustering [14] is used for comparative purposes, although with K-means the number of clusters K had to be specified in advance.

The performances of the tested clustering algorithms are measured in terms of a cluster validity measure, the DB index [15]. The DB index aims to evaluate intra-cluster similarity and inter-cluster differences by computing

DB = (1/K) ∑_{k=1}^{K} max_{l=1,...,K, l≠k} (S_k + S_l) / d_{k,l}   (5)

where S_k (resp. S_l) represents the average of all the distances between the center m_k (resp. m_l) and the points in cluster C_k (resp. C_l), and d_{k,l} is the distance between m_k and m_l. The smallest DB index indicates a valid optimal partition. The computation of the DB index assumes that a clustering has been done. The values presented in plots (a) (on the x-axis), (b), (c) and (d) (on the y-axis), and (e) are obtained out of the herein proposed clustering (recall Algorithm 2, Algorithm 3 and Algorithm 4). The values of the 'DB index(m,X)' and 'CS index(m,X)' (on the y-axis) of plot (a) come from assigning the points to clusters based on the usual minimum distances of points to centers. The term CS index refers to the CS validity measure and is a function of the ratio of the sum of within-cluster scatter to between-cluster separation [13]. The quantity 'Sum of Within-Cluster Distance (WCD)' (in plots (b), (c) and (d), on the x-axis) was obtained out of our proposed clustering process. In conclusion, the proposed clustering is effective, very efficient and robust. As can be seen, the DB index values that result from our clustering are slightly lower than (or equal to) those registered by the K-means clustering.

4.2 'Iris' and 'Wine' Data Sets

The results of our clustering algorithm when solving Problems 5 and 6 are now shown. To compare the performance, our algorithm was run 30 times for each data set.

Problem 5. 'Iris' with 150 data points. It contains three categories (types of iris plant) with 4 attributes (sepal length, sepal width, petal length and petal width) [16].

Problem 6. 'Wine' with 178 data points. It contains the chemical analysis of 178 wines derived from 3 different regions, with 13 attributes [16].
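Since both the proposed clustering and K-means are scored with the DB index of Eq. (5), a short sketch of that computation is given below; this is a hypothetical helper written for illustration, not the authors' MATLAB code, and it assumes every cluster is non-empty.

import numpy as np

def davies_bouldin(X, labels, M):
    """DB index of Eq. (5): average, over clusters, of the worst
    (S_k + S_l) / d_{k,l} ratio with any other cluster."""
    K = M.shape[0]
    # S_k: mean distance from the points of cluster k to its center m_k
    S = np.array([np.linalg.norm(X[labels == k] - M[k], axis=1).mean()
                  for k in range(K)])
    d = np.linalg.norm(M[:, None, :] - M[None, :, :], axis=2)   # d_{k,l}
    ratios = (S[:, None] + S[None, :]) / np.where(d > 0, d, np.inf)
    np.fill_diagonal(ratios, -np.inf)                            # exclude l = k
    return ratios.max(axis=1).mean()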
Fig. 1. Clustering process of Problem 1. Panels: (a) assignment after the random initialization of the centers; (b) 1st iteration; (c) 2nd iteration; (d) final clustering (5th iteration); (e) DB index plot; (f) K-means with K = 3. The subcaptions report the DB index, the CS index and the Sum of Within-Cluster Distance (WCD) at each stage.

The results are compared to those of [17], which uses a particle swarm optimization (PSO) approach to the clustering; see Table 1. When solving the 'Iris' problem, our algorithm finds the 3 clusters in 77% of the runs (23 successful runs out of 30).
Fig. 2. Clustering process of Problem 2. Panels: (a) assignment after initialization; (b) 1st iteration; (c) 2nd iteration; (d) final clustering (5th iteration); (e) DB index plot; (f) K-means with K = 2.

When solving the problem 'Wine', 27 out of 30 runs identified the 3 clusters. Table 1 shows the final value of a fitness function, known as the Sum of Within-Cluster Distance (WCD): the best, the average (avg.), the worst and the standard deviation (St. Dev.) values over the successful runs.
Fig. 3. Clustering process of Problem 3. Panels: (a) assignment after initialization; (b) 1st iteration; (c) 2nd iteration; (d) final clustering (5th iteration); (e) DB index plot; (f) K-means with K = 2.

The third column of the table has the WCD that results from our clustering, i.e., after assigning the points to clusters (according to the above described rules based on the WD – Algorithm 2), merging and removing clusters if appropriate (Algorithm 3) and
updating the centers (Algorithm 4), denoted 'WCD_WD'. The number of 'noise' points at the final clustering (of the best run, i.e., the run with the minimum value of 'WCD_WD'), '# noise', the number of iterations, It, the time in seconds, 'time', and the percentage of successful runs, 'suc.', are also shown in the table.

Fig. 4. Clustering process of Problem 4. Panels: (a) assignment after initialization; (b) 1st iteration; (c) 2nd iteration; (d) final clustering (5th iteration); (e) DB index plot; (f) K-means with K = 2.
Table 1. Clustering results for Problems 5 and 6.

                          Algorithm 1                                      Results in [17]
Problem            'WCD_WD'   # noise   It   Time    suc.    'WCD'        'WCD'      Time
'Iris'   Best        107.89      0       5   0.051   77%     100.07        97.22     0.343
         Avg.        108.21      8.7         0.101           100.17        97.22     0.359
         Worst       109.13     12          0.156            100.47        97.22     0.375
         St. Dev.      0.56                                     0.18
'Wine'   Best      17621.45      0       6   0.167   90%   16330.19     16530.54     2.922
         Avg.      19138.86      9.1         0.335          17621.12     16530.54     2.944
         Worst     20637.04     13          1.132           19274.45     16530.54     3.000
         St. Dev.   1249.68                                  1204.04

Table 2. Results comparison for Problems 5 and 6.

                          Algorithm 1                              K-means    K-NM-PSO
Problem            'WCD_D'    # noise   It   Time    suc.          'WCD'       'WCD'
'Iris'   Best         97.20      0       4   0.096   63%            97.33       96.66
         Avg.         97.21      8.7         0.204                 106.05       96.67
         St. Dev.      0.01                                          14.11       0.008
'Wine'   Best      16555.68      0       6   0.159   90%         16555.68    16292.00
         Avg.      17031.79     10.3         0.319               18061.00    16293.00
         St. Dev.    822.08                                        793.21        0.46

The column identified with 'WCD' contains values of WCD (at the final clustering) using X and the obtained final optimal cluster centers to assign the points to the clusters/centers based on the usual minimum distances from each point to the centers. For the same optimal clustering, the 'WCD' always has a lower value than 'WCD_WD'. When comparing the 'WCD' of our clustering with the 'WCD' registered in [17], our obtained values are higher, in particular the average and the worst values, except for the best 'WCD' result for problem 'Wine'. The variability of the results registered in [17] and [18] (see also Table 2) is rather small, but this is due to the type of clustering based on the PSO algorithm. The clustering algorithm K-NM-PSO [18] is a combination of K-means, the local optimization method Nelder-Mead (NM) and PSO. In contrast, our algorithm finds the optimal clustering much faster than the PSO clustering approach in [17]. This is a point in favor of our algorithm, an essential requirement when solving large data sets. The results of our algorithm shown in Table 2 (third column, identified with 'WCD_D') are based on the designed and previously described Algorithms 1, 2, 3 and 4, but using the usual distances between points and centers for the assigning process (over the iterative process). These values are slightly better
than those obtained by K-means (available in [18]), but not as interesting as those obtained by K-NM-PSO (reported in [18]).

5 Conclusions

A probabilistic approach to define the weighted expected distances from the cluster centers to the data points is proposed. These weighted distances are used to assign points to clusters, represented by their centers. The new proposed clustering algorithm relies on mechanisms to create new clusters. Points that are not assigned to a cluster, based on the weighted expected distances, are gathered in a new cluster. The clustering is also able to merge two nearby clusters and remove clusters that have very few points. Points are also identified as 'noise' when they are beyond the limit of a reasonable neighborhood (as far as weighted distances are concerned) and when they belong to a cluster with very few points. The preliminary clustering results, and the comparisons with K-means and other K-means combinations with stochastic algorithms, show that the proposed algorithm is very efficient and effective.
In the future, data sets with non-convex clusters and with clusters of different shapes and sizes, in particular non-spherical blobs, will be addressed. Our proposal is to integrate a kernel function into our clustering approach. We aim to further investigate the dependence of the parameter values of the algorithm on the number of attributes, in particular when highly dimensional data sets are to be partitioned. The set of tested problems will also be enlarged to include large data sets with non-convex clusters of different shapes.

Acknowledgments. The authors wish to thank three anonymous referees for their comments and suggestions to improve the paper.

References
1. Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: a review. ACM Comput. Surv. 31(3), 264–323 (1999)
2. Greenlaw, R., Kantabutra, S.: Survey of clustering: algorithms and applications. Int. J. Inf. Retr. Res. 3(2), 29 pages (2013)
3. Ezugwu, A.E.: Nature-inspired metaheuristics techniques for automatic clustering: a survey and performance study. SN Appl. Sci. 2, 273–329 (2020)
4. Mohammed, J.Z., Meira, W., Jr.: Data Mining and Machine Learning: Fundamental Concepts and Algorithms, 2nd edn. Cambridge University Press, Cambridge (2020)
5. Xu, D., Tian, Y.: A comprehensive survey of clustering algorithms. Ann. Data Sci. 2(2), 165–193 (2015)
6. MacQueen, J.B.: Some methods for classification and analysis of multivariate observations. In: Le Cam, L.M., Neyman, J. (eds.) Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297. University of California Press (1967)
7. Manning, C.D., Raghavan, P., Schütze, H.: Introduction to Information Retrieval. Cambridge University Press, Cambridge (2008)
8. Fraley, C., Raftery, A.E.: Model-based clustering, discriminant analysis and density estimation. J. Am. Stat. Assoc. 97(458), 611–631 (2002)
9. Kwedlo, W.: A clustering method combining differential evolution with K-means algorithm. Pattern Recogn. Lett. 32, 1613–1621 (2011)
10. Patel, K.G.K., Dabhi, V.K., Prajapati, H.B.: Clustering using a combination of particle swarm optimization and K-means. J. Intell. Syst. 26(3), 457–469 (2017)
11. He, Z., Yu, C.: Clustering stability-based evolutionary K-means. Soft. Comput. 23, 305–321 (2019)
12. Sarkar, M., Yegnanarayana, B., Khemani, D.: A clustering algorithm using evolutionary programming-based approach. Pattern Recogn. Lett. 18, 975–986 (1997)
13. Chou, C.-H., Su, M.-C., Lai, E.: A new cluster validity measure and its application to image compression. Pattern Anal. Appl. 7, 205–220 (2004)
14. Asvadi, A.: K-means Clustering Code. Department of ECE, SPR Lab., Babol (Noshirvani) University of Technology (2013). http://www.a-asvadi.ir/
15. Davies, D.L., Bouldin, D.W.: A cluster separation measure. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-1(2), 224–227 (1979)
16. Dua, D., Graff, C.: UCI Machine Learning Repository. University of California, School of Information and Computer Science, Irvine, CA (2019). http://archive.ics.uci.edu/ml
17. Cura, T.: A particle swarm optimization approach to clustering. Expert Syst. Appl. 39, 1582–1588 (2012)
18. Kao, Y.-T., Zahara, E., Kao, I.-W.: A hybridized approach to data clustering. Expert Syst. Appl. 34, 1754–1762 (2008)
Optimization of Wind Turbines Placement in Offshore Wind Farms: Wake Effects Concerns

José Baptista1,2(B), Filipe Lima1, and Adelaide Cerveira1,2
1 School of Science and Technology, University of Trás-os-Montes and Alto Douro, 5000-801 Vila Real, Portugal {baptista,cerveira}@utad.pt, al64391@utad.eu
2 UTAD's Pole, INESC-TEC, 5000-801 Vila Real, Portugal

Abstract. In the coming years, many countries are going to bet on the exploitation of offshore wind energy. This is the case of southern European countries, where there is great wind potential for offshore exploitation. Although the conditions for energy production are more advantageous, all the costs involved are substantially higher when compared to onshore. It is, therefore, crucial to maximize system efficiency. In this paper, an optimization model based on Mixed-Integer Linear Programming is proposed to find the best wind turbine locations in offshore wind farms, taking into account the wake effect. A case study concerning the design of an offshore wind farm was carried out, and several performance indicators were calculated and compared. The results show that placing the wind turbines diagonally yields better results for all performance indicators and corresponds to a lower percentage of energy production losses.

Keywords: Offshore wind farm · Wake effect · Optimization

1 Introduction

In the last two decades, wind energy has been a huge investment for almost all developed countries, with a strong bet on onshore wind farms (WF). It is expected that investments will be directed towards energy exploration at sea, where offshore wind farms (OWFs) are a priority. Recently, the European Union has highlighted the enormous unexplored potential in Europe's seas, intending to multiply offshore wind energy by 20 to reach 450 GW, to meet the objective of decarbonizing energy and achieving carbon neutrality by 2050 [6]. An OWF has enormous advantages over onshore, with reduced visual impact and higher efficiency, where the absence of obstacles allows the wind to reach a higher and more constant speed. However, the installation costs are substantially higher [10]. So, it is very important to optimize the efficiency of these
© Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 102–109, 2021. https://doi.org/10.1007/978-3-030-91885-9_8
WFs, wherein the location of the wind turbines (WTs) is one of the most important factors to consider, taking into account the wake effect impact on the energy production of each one. To achieve this goal, new models related to WF layout optimization should be investigated. Several authors addressed the topic using models based on genetic algorithms (GA), as is the case of Grady et al. [7]. In [12], a mathematical model based on a particle swarm optimization (PSO) algorithm which includes the variation of both wind direction and wake deficit is proposed. A different approach is addressed in [13], where a modular framework for the optimization of an OWF using a genetic algorithm is presented. In [14] a sequential procedure is used for global optimization, consisting of two steps: i) a heuristic method to set an initial random layout configuration, and ii) the use of nonlinear mathematical programming techniques for local optimization, which use the random layout as an initial solution. Other authors address the topic of WF optimization focusing on optimizing the interconnection between the turbines, known as cable routing optimization. Cerveira et al. [5] propose a flow formulation to find the optimal cable network layout considering only the cable infrastructure costs. In [3,4,11] Mixed-Integer Linear Programming (MILP) models are proposed to optimize the wind farm layout in order to obtain the greatest profit from wind energy, minimizing the layout costs. Assuming that the ideal position of the turbines has already been identified, the most effective way of connecting the turbines to each other and to the substations is considered, minimizing the infrastructure cost and the costs of power losses during the wind farm lifetime.
The main aim of this paper is to optimize the location of the WTs to minimize the single wake effect and so to maximize energy production. In order to achieve this target, a MILP optimization model is proposed. A case study is considered and performance indicators were computed and analyzed, obtaining guidelines for the optimal relative position of the WTs.
The paper is organized as follows. In Sect. 2 the wind power model is addressed and in Sect. 3 the MILP model to optimize the WTs placement is presented. Section 4 presents the case study results. Finally, Sect. 5 highlights some concluding remarks.

2 Wind Farm Power Model

A correct evaluation of the wind potential for power generation must be based on the potential of the wind resource and various technical and physical aspects that can affect energy production. Among them, the wind speed probability distribution, the effect of height, and the wake effect are highlighted.
2.1 Power in the Wind

The energy available for a WT is the kinetic energy associated with an air column traveling at a uniform and constant speed v, in m/s. The power available in the wind, in W, is proportional to the cube of the wind speed, given by Eq. (1),

P = (1/2)(ρAv)v² = (1/2)ρAv³   (1)

where A is the area swept by the rotor blades, in m², v is the wind speed, in m/s, and ρ is the air density, usually considered constant during the year, with standard value equal to 1.225 kg/m³. The power availability significantly depends on the wind speed, so it is very important to place the turbines where the wind speed is high. Unfortunately, the total wind power cannot be recovered by a WT. According to Betz's law [2], it is only possible to convert, at most, 59.3% of the kinetic energy into mechanical energy to be used through the turbine. Typically, the turbines' technical data is provided by the manufacturers. This is the case of the power curve, which gives the electrical power output, P_e(v), as a function of the wind speed, v. Knowing the wind profile and the turbine power curve, it is possible to determine the energy E_a produced by the conversion system,

E_a = 8760 ∫_{v_0}^{v_max} f(v) P_e(v) dv.   (2)

2.2 The Jensen Model for Wake Effect

The power extraction of turbines from the wind depends on the wind energy content, which differs in its directions. The downstream wind (the one that comes out of the turbine) has less energy content than the upstream wind. The wake effect is the effect of upstream wind turbines on the speed of the wind that reaches downstream turbines. In a WF where there are several WTs, it is very likely that the swept area of a turbine is influenced by one or more turbines placed before it; this effect can be approached by the Jensen model, proposed in 1983 by N. O. Jensen [8] and developed in 1986 by Katic et al. [9]. The initial diameter of the wake is equal to the diameter of the turbine and expands linearly with the downstream distance. Similarly, the downstream wind deficit declines linearly with the distance to the turbine. The simple wake model consists of a linear expression for the wake diameter and for the decay of the speed deficit inside the wake. The downstream wind speed (v_d) of the turbine can be obtained by (3), where v_0 represents the upstream wind speed:

v_d = v_0 [ 1 − √( ∑_{i=1}^{N} ( 2a / (1 + αx/r_1)² )² ) ]   (3)

where: the wake radius r_1 is given by r_1 = αx + r_r, in which r_r is the radius of the upstream turbine; x is the distance at which the wake meets the obstacle; α is a scalar that determines how quickly the wake expands, given by α = 1/(2 ln(z/z_0)), where z is the hub height and z_0 is the surface roughness, which on water is around 0.0002 m, although it may increase with sea conditions; a is the axial induction factor, given by a = (1 − √(1 − C_T))/2, and the turbine thrust coefficient C_T is determined by the turbine manufacturer according to Fig. 1.
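As a rough illustration of the Jensen/Katic wake calculation described above, the Python sketch below computes the single-wake deficit and combines several upstream wakes as the square root of the sum of squared deficits. Note one assumption: the denominator uses the rotor radius r_r, which is the most common form of the Jensen model; the example numbers (rotor radius, C_T) are placeholders, not values from the paper.

import numpy as np

def jensen_deficit(x, rotor_radius, hub_height, ct, z0=0.0002):
    """Single-wake velocity deficit a distance x downstream (Jensen model)."""
    alpha = 0.5 / np.log(hub_height / z0)          # wake decay constant
    a = (1.0 - np.sqrt(1.0 - ct)) / 2.0            # axial induction factor
    return 2.0 * a / (1.0 + alpha * x / rotor_radius) ** 2

def waked_speed(v0, deficits):
    """Katic et al. combination of several upstream wakes, cf. Eq. (3)."""
    return v0 * (1.0 - np.sqrt(np.sum(np.asarray(deficits) ** 2)))

# Example: wind speed 850 m behind one turbine (illustrative parameters only)
v = waked_speed(8.0, [jensen_deficit(850.0, 82.0, 140.0, 0.8)])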
3 Optimization Model

This section presents a MILP model to determine the optimal location of the WTs so as to maximize energy production, taking into account the wake effect. It is given a set of possible locations for WTs and a maximum number of WTs to install. Consider the set N = {1, ..., n} of all possible locations to install WTs and, for each i ∈ N, let E_i be the energy generated by a WT installed at location i without considering the wake effect, Eq. (2). Let I_{ij} be the interference (loss of energy produced) at a WT installed at location j caused by the WT installed at location i, with i, j ∈ N. Those values are computed using Jensen's model, Sect. 2.2. It is considered that I_{jj} = 0.
The following decision variables are considered: binary variables x_i, which assume the value 1 if a WT is installed at location i and 0 otherwise; non-negative variables w_i, which represent the interference at a WT installed at location i by all the WTs installed in the WF, for each i ∈ N. If a WT is installed at location i, w_i is the summation of the interference caused by all WTs. Therefore: w_i = ∑_{j∈N} I_{ij} x_j if x_i = 1 and w_i = 0 if x_i = 0. So, if a WT is installed at location i, the produced energy is given by E_i − E_i w_i, and the objective is to maximize ∑_{i∈N} (E_i x_i − w_i E_i). To set variable w as just described, the following constraint is included, where M is a big number,

∑_{j∈N} I_{ij} x_j ≤ w_i + M(1 − x_i).   (4)

In fact, if x_i = 0, by inequality (4) it holds that ∑_{j∈N} I_{ij} x_j ≤ w_i + M, and so there is no lower bound on the value of w_i. However, as the coefficient of w_i is negative in a maximization model, w_i will assume the smallest possible value in the optimal solution which, in this case, is zero. Furthermore, if x_i = 1, it holds that ∑_{j∈N} I_{ij} x_j ≤ w_i and w_i will assume the smallest possible value which, in this case, is ∑_{j∈N} I_{ij} x_j, as expected.
The optimization model to determine the WTs location is denoted by the WTL model and can be written as follows:

max ∑_{i∈N} (E_i x_i − w_i E_i)   (5)
subject to
∑_{i∈N} x_i ≤ U   (6)
∑_{j∈N} I_{ij} x_j ≤ w_i + M(1 − x_i),  i ∈ N   (7)
x_i ∈ {0, 1},  w_i ≥ 0,  i ∈ N   (8)

The objective function (5) corresponds to the maximization of the energy produced, taking into account the losses due to the wake effect. Constraint (6) imposes a limit U on the number of turbines (this bound is often associated with the available capital). Constraints (7) relate variables x and w, assuring that w_i = 0 if x_i = 0 and w_i = ∑_{j∈N} I_{ij} x_j if x_i = 1. Finally, constraints (8) are the sign and integrality constraints on the variables. To strengthen constraints (7), the value considered for the big constant was M = U − 1.
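The WTL model (5)–(8) is small enough to state directly in any MILP modelling layer. The authors used FICO Xpress Mosel; the sketch below rewrites the same formulation with the open-source PuLP package purely for illustration, so the function name and data layout are assumptions, not the authors' code.

import pulp

def solve_wtl(E, I, U):
    """E[i]: wake-free energy of site i; I[i][j]: interference of site i on site j."""
    N = range(len(E))
    M = U - 1                                        # big-M value used in the paper
    prob = pulp.LpProblem("WTL", pulp.LpMaximize)
    x = [pulp.LpVariable(f"x{i}", cat="Binary") for i in N]
    w = [pulp.LpVariable(f"w{i}", lowBound=0) for i in N]
    prob += pulp.lpSum(E[i] * x[i] - E[i] * w[i] for i in N)            # objective (5)
    prob += pulp.lpSum(x[i] for i in N) <= U                            # constraint (6)
    for i in N:
        prob += pulp.lpSum(I[i][j] * x[j] for j in N) <= w[i] + M * (1 - x[i])  # (7)
    prob.solve()
    return [i for i in N if x[i].value() > 0.5]      # indices of the selected sites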
4 Case Study and Results

In this section, a case study is described and the results are presented and discussed. The WT used was the Vestas V164-8.0 [15], with the power curve provided by the manufacturer shown in Fig. 1. The site chosen for the WT installation is approximately 20 km off the coast of Viana do Castelo, Portugal, the location currently used by the first Portuguese OWF. The average annual wind speed was obtained from the Global Wind Atlas [1] data. The average annual wind speed considered at the site was 8 m/s, collected at 150 m of height. The Prandtl law was used to extrapolate the wind speed to the turbine hub height, around 140 m. Subsequently, the annual frequency of occurrence is calculated for various wind speeds, in the range between 4 m/s and 25 m/s (according to the WT power curve), using the Rayleigh distribution. The occurrence of these wind speeds is shown in the bar chart of Fig. 2a. Figure 2b shows the annual energy production of a WT at each wind speed considered. Adding all energy values, the total yearly energy production for this WT will be 29.1 GWh.

Fig. 1. Power and thrust coefficient curve for the WT Vestas V164-8.0.

Fig. 2. Wind distribution and annual energy production: (a) number of hours each wind speed occurs; (b) annual energy produced by a WT.

There are 36 possible locations for WTs, distributed over six rows, each of them having six possible locations, as shown in Fig. 3a. The horizontal and vertical distance between two neighboring locations is 850 m and the predominant wind direction is marked with an arrow.
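The Rayleigh-based annual energy estimate described earlier in this section can be reproduced with a few lines of code. In the sketch below, the power curve values are placeholders, not the Vestas V164-8.0 data sheet; only the calculation pattern (hours per wind-speed bin times electrical power) is meant to match the text.

import numpy as np

def rayleigh_hours(v, v_mean, hours_per_year=8760):
    """Hours per year in 1 m/s bins centred on v, for a Rayleigh wind-speed law."""
    cdf = lambda x: 1.0 - np.exp(-np.pi / 4.0 * (x / v_mean) ** 2)
    return hours_per_year * (cdf(v + 0.5) - cdf(v - 0.5))

v = np.arange(4, 26)                                             # operating range 4..25 m/s
power_curve_kw = np.interp(v, [4, 12, 25], [100, 8000, 8000])    # placeholder P_e(v)
hours = rayleigh_hours(v, v_mean=8.0)
annual_energy_mwh = np.sum(hours * power_curve_kw) / 1000.0
print(f"Estimated annual energy: {annual_energy_mwh:.0f} MWh")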
Fig. 3. Wind farm layout and annual energy production: (a) possible locations of the WTs; (b) energy production in each row with the wake effect.

According to Sect. 2.2, the Jensen model for the wake effect was used to calculate the speed behind each WT. Note that not all WTs receive the same amount of wind and, for this reason, they do not produce the same amount of energy. Figure 3b shows the total energy produced in each row. The wake effect leads to a reduction in the energy production of around 3.5% in each row. Two situations were considered, in which it is intended to select a maximum of U = 22 and U = 24 WTs. The WTL model was implemented in FICO Xpress Mosel (Version 4.8.0) and then solved using the interface provided by this package to the Optimizer. The CPU time for all tested instances was less than 1 s, and the model has 31 constraints and 66 variables. When not considering the wake effect, for U = 22 the expected electricity produced annually is 639.6 GWh and for U = 24 it is 698.3 GWh. Figures 4a and b show the optimal solutions for the distribution of the WTs obtained by solving the WTL model for both situations. It turns out that, when U = 22, the maximum expected energy production for the 22 turbines, already accounting for the wake effect, is 636.3 GWh per year. When U = 24, the maximum expected energy production by the 24 WTs is 692.8 GWh. It is observed that, when considering the wake effect, with 22 turbines there is a decrease of 3.6 GWh of energy produced, which corresponds to a reduction of 0.56%. With 24 WTs, there was a decrease of approximately 5.3 GWh of energy produced, which corresponds to a reduction of 0.72%. For a more complete analysis, some performance indicators were calculated:

Equivalent full load (h) = Energy production / Total installed power   (9)

Capacity factor (%) = Energy production / (8760 · Total installed power)   (10)

Specific energy production (MWh/m²) = Energy production / Total blades swept area   (11)

The values of these indicators, for both situations, are presented in Table 1.
Fig. 4. Optimal WT locations: (a) case U = 22; (b) case U = 24.

Table 1. Energy calculation and performance indicators, Eqs. (9)–(11).

                                          22 turbines                   24 turbines
                                      With wake   Without wake     With wake   Without wake
Energy production (MWh)                 636000        639600          693000        698300
Total installed power (MW)                 176           176             192           192
Area swept by the blades (m²)           464728        464728          506976        506976
Equivalent full load (h)               3613.64       3634.09         3609.38       3635.42
Capacity factor (%)                      41.25         41.49           41.20         41.50
Specific energy production (MWh/m²)       1.37          1.38            1.37          1.38

Analyzing the indicators in Table 1, the solution with the WTs placed diagonally (22 turbines) presents better results, with an Equivalent Full Load of around 3614 h, a Capacity Factor of 41.25% and a Specific Energy Production of 1.37 MWh/m², which confirms that placing the turbines on diagonal lines results in greater gains for OWFs.
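As a quick sanity check of Eqs. (9)–(11), the snippet below recomputes the with-wake indicators for the U = 22 case from the Table 1 values; it is purely illustrative and uses only numbers taken from the table.

# U = 22, with-wake column of Table 1
energy_mwh = 636000.0
installed_mw = 176.0
swept_m2 = 464728.0

equivalent_full_load_h = energy_mwh / installed_mw        # ~ 3613.6 h, Eq. (9)
capacity_factor = energy_mwh / (8760.0 * installed_mw)    # ~ 0.4125, Eq. (10)
specific_energy = energy_mwh / swept_m2                   # ~ 1.37 MWh/m², Eq. (11)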
    Optimization of WTPlacement in OWF: Wake Effects Concerns 109 The results obtained show that the optimization model based on MILP is able to achieve, with low processing times, exact optimal solutions allowing to significantly increase the efficiency of OWFs. For future work, it is possible to complement this method by carrying out a more detailed study with regard to energy losses due to the wake effect taking into account also the non-predominant wind directions, and to incorporate in the optimization model the cost of the connection network between turbines taking into account the construction costs and energy losses over its lifetime. References 1. Global Wind Atlas. https://globalwindatlas.info. Accessed May 2021 2. Bergey, K.H.: The lanchester-betz limit. J. Energy 3, 382–384 (1979) 3. Cerveira, A., Baptista, J., Pires, E.J.S.: Wind farm distribution network optimiza- tion. Integr. Comput.-Aided Eng. 23(1), 69–79 (2015) 4. Cerveira, A., de Sousa, A., Solteiro Pires, E.J., Baptista, J.: Optimal cable design of wind farms: the infrastructure and losses cost minimization case. IEEE Trans. Power Syst. 31(6), 4319–4329 (2016) 5. Cerveira, A., Baptista, J., Pires, E.J.S.: Optimization design in wind farm dis- tribution network. In: Herrero, Á., et al. (eds.) International Joint Conference SOCO’13-CISIS’13-ICEUTE’13. AISC, vol. 239, pp. 109–119. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-01854-6 12 6. E.Commission: An EU Strategy to harness the potential of offshore renewable energy for a climate neutral future. COM(2020) 741 final (2020) 7. Grady, S., Hussaini, M., Abdullah, M.M.: Placement of wind turbines using genetic algorithms. Renew. Energy 30(2), 259–270 (2005) 8. Jensen, N.O.: A Note on Wind Generator Interaction, pp. 1–16. Riso National Laboratory, Roskilde (1983) 9. Katic, I., Hojstrup, J., Jensen, N.O.: A simple model for cluster efficiency. In: Wind Energy Association Conference EWEC 1986, vol. 1, pp.407–410 (1986) 10. Manzano-Agugliaro, F., Sánchez-Calero, M., Alcayde, A., San-Antonio-Gómez, C., Perea-Moreno, A., Salmeron-Manzano, E.: Wind turbines offshore foundations and connections to grid. Inventions 5, 1–24 (2020) 11. Fischetti, M., Pisinger, D.: Optimizing wind farm cable routing considering power losses. Eur. J. Oper. Res. 270, 917–930 (2017). https://doi.org/10.1016/j.ejor.2017. 07.061 12. Hou, P., Hu, W., Soltani, M., Chen, Z.: Optimized placement of wind turbines in large-scale offshore wind farm using particle swarm optimization algorithm. IEEE Trans. Sustain. Energy 6(4), 1272–1282 (2015) 13. Pillai, A.C., Chick, J., Johanning, L., Khorasanchi, M., Barbouchi, S.: Optimisation of offshore wind farms using a genetic algorithm. Int. J. Offshore Polar Eng. 26(3), 1272–1282 (2016) 14. Pérez, B., Mı́nguez, R., Guanche, R.: Offshore wind farm layout optimization using mathematical programming techniques. Renew. Energy 53, 389–399 (2013) 15. Vestas: Wind-turbine-models. https://en.wind-turbine-models.com/turbines/318- vestas-v164-8.0. Accessed May 2021
A Simulation Tool for Optimizing a 3D Spray Painting System

João Casanova1(B), José Lima2,3(B), and Paulo Costa1,3(B)
1 Faculty of Engineering, University of Porto, Porto, Portugal {up201605317,paco}@fe.up.pt
2 Research Centre of Digitalization and Intelligent Robotics, Instituto Politécnico de Bragança, Bragança, Portugal jllima@ipb.pt
3 INESC-TEC - INESC Technology and Science, Porto, Portugal

Abstract. The lack of accurate, open-source simulators aimed at general robotics is a major setback that limits optimized trajectory generation research and the general evolution of the robotics field. Spray painting is a particular case with multiple advantages in using a simulator for exploring new algorithms, mainly due to the waste of materials and the dangers associated with a robotic manipulator. This paper demonstrates an implementation of spray painting on a previously existing simulator, SimTwo. Several metrics for optimization that evaluate the painted result are also proposed. In order to validate the implementation, we conducted a real world experiment that serves both as proof that the chosen spray distribution model translates to reality and as a way to calibrate the model parameters.

Keywords: Spray painting · Simulator · Trajectory validation

1 Introduction

Spray painting systems have been widely used in industry, becoming increasingly more competitive. Throughout the years several technologies have been developed and adopted in order to improve the optimization and quality of the painting process. In this work we address the problem of simulating spray paint optimization. This is a vital tool since it allows the user to analyse the quality of the paint produced by different trajectories without additional costs or dangers. In this work a 3D spray painting simulation system is proposed. This system has realistic spray simulation with sufficient accuracy to mimic real spray painting. The simulation takes 3D CAD or 3D scanned input pieces and produces a realistic visual effect that allows qualitative analysis of the painted product. An evaluation metric is also presented that scores the painting trajectory based on thickness, uniformity, time and degree of coverage. This new simulation system provides an open source implementation that is capable of real time spray simulation with validated experimental results.
© Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 110–122, 2021. https://doi.org/10.1007/978-3-030-91885-9_9
2 Literature Review

Spray painting simulation is an important part of automated spray painting systems since it provides the possibility to predict the end result without any cost. Furthermore, virtual simulations allow quality metrics such as paint thickness uniformity or paint coverage to be easily measured and optimized. This section analyses the basis for simulating spray painting, including painting methods and several approaches to spray simulation, such as computational fluid dynamics (CFD), spray simulation applied to robotics, and commercial simulators.

2.1 Painting Methods

Paint application has been used for centuries; for most of this time the main methods consisted in spreading paint along the desired surface with tools like a brush or a roller. Nowadays, however, their popularity and use are decreasing since they are slow methods with low efficiency, and the resulting quality highly depends on the quality of the materials and tools. Consequently, new techniques were developed, such as dipping and flow coating, which are best applied to small components and protective coatings, and spray painting, which is more suitable for industry. The main industrial painting methods are: air atomized spray application, high volume low pressure (HVLP) spray application, airless spray application, air-assisted airless spray application, heated spray application and electrostatic spray application [19]. The conventional spray application method is air atomized spray. In this method the compressed air is mixed with the paint in the nozzle and the atomized paint is propelled out of the gun. HVLP is an improvement on the conventional system and is best suited for finishing operations. The paint is under pressure and the atomization occurs when it contacts the high volume of low pressure air, resulting in the same amount of paint being propelled at the same rate but at lower speeds. This way a high precision, less wasteful spray is produced, since the paint doesn't miss the target and the droplets don't bounce back. Airless spray application is a different method that uses a fluid pump to generate high pressure paint. The small orifice in the tip atomizes the pressurized paint. The air resistance slows down the droplets and allows a good amount to adhere to the surface, but still to a lower extent than HVLP [14]. Air-assisted airless spray application is a mixture of the airless and conventional methods, with the purpose of facilitating the atomization and decreasing the waste of paint. Heated spray application can be applied to the other methods and consists in pre-heating the paint, making it less viscous and requiring lower pressures. This also reduces drying time and increases efficiency. Electrostatic spray application can also be applied with the other techniques and consists in charging the paint droplets as they pass through an electrostatic field. The surface to be painted is grounded in order to attract the charged particles [18]. At last, the conditions at which the paint is applied have a direct impact on the paint job quality and success. This way, painting booths are usually used,
because it is easier to control aspects such as temperature, relative humidity and ventilation. Lower temperatures increase drying time, while high moisture in the air interferes with adhesion and can cause some types of coating not to cure. Ventilation also decreases the initial drying time and makes the workspace safer for the workers.

2.2 Computational Fluid Dynamics (CFD)

In order to simulate the spray painting process as accurately as possible, it is necessary to understand the physical properties that dictate droplet trajectories and deposition. Computational fluid dynamics (CFD) allows for a numerical approach that can predict flow fields using the Navier-Stokes equations [15]. An important step that CFD still struggles to solve is the atomization process. It is possible to skip the atomization zone by using initial characteristics that can be approximated by measures of the spray [16]. Another alternative is to allow some inaccuracies of the model in the atomization zone by simulating completely formed droplets near the nozzle. This allows for a simpler initialization while maintaining accurate results near the painted surface [21]. Figure 1 shows a CFD result that, compared with the experimental process, reveals similar thickness [22]. Despite these advantages, CFD isn't used in industrial applications [11]. This is due to the fact that CFD requires a high computation cost and the precision that it produces isn't a requirement for many applications. On the other hand, with the current technological advances this seems to be a promising technique that enables high precision even on irregular surfaces [20].

Fig. 1. CFD results with the calculated velocity contours colored by magnitude (m/s) in the plane z = 0 [22].
Fig. 2. On the left: a beta distribution for varying β [11]. On the right: an asymmetrical model (ellipse) [11].

2.3 Spray Simulation Applied to Robotics

In robotics, it is common to use explicit functions to describe the rate of paint deposition at a point based on the position and orientation of the spray gun. Depending on the functions used, it is possible to create finite and infinite range models [11]. In infinite range models the output approaches zero as the distance approaches infinity. The most commonly used distributions are Cauchy and Gaussian. Cauchy is simpler, allowing for faster computation, while Gaussian is rounder and closer to reality [10,17]. It is also possible to use multiple Gaussian functions in order to obtain more complex and accurate gun models [24]. If the gun model is considerably asymmetrical, a 1D Gaussian with an offset revolved around an axis, summed with a 2D Gaussian, provides a 2D deposition model [13]. In finite range models the value of deposition outside a certain distance is actually zero. Several models have been developed, such as trigonometric, parabolic, ellipsoid and beta [11]. The most used model is the beta distribution model that, as the beta increases, changes from parabolic to ellipsoid, and even the Gaussian distribution can be modelled, as shown in Fig. 2 [9]. It is also useful to use two beta distributions that best describe the gun model, as was done with infinite range models; Fig. 2 exemplifies an asymmetrical model [23].

2.4 Commercial Simulators

The growing necessity of user friendly software that provides a fast and risk free way of programming industrial manipulators led to the development of dedicated software by the manipulator manufacturers and by external software companies. Most of them include essential features such as support for several robots, automatic detection of axis limits, collisions and singularities, and cell component positioning.
RoboDK is a software package developed for industrial robots that works as an offline programming and simulation tool [3]. With some simple configurations it is possible to simulate and program a robotic painting system; however, the simulation is a simple cone model and the program is either entered manually or must be implemented [4]. This is a non-trivial task for the end user, but a favorable feature for research in paint trajectory generation. The advantages of this system are mainly the possibility to create a manual program and test it without any danger of collisions and without any waste of paint. RobCad Paint, by Siemens, also works as an offline programming and simulation tool and also offers features such as paint coverage analysis and paint databases [7]. It also includes some predefined paths that only require the user to set a few parameters. This simulator is one of the oldest on the market, being used by many researchers to assess their algorithms. OLP Automatic by Inropa is a system that uses 3D scanning to automatically produce robot programs [1]. A laser scanner is used to scan parts that are placed on a conveyor, and the system can paint every piece without human intervention. Although there is no available information about the created paths, the available demonstrations only paint simple pieces like windows and doors. Delfoi Paint is an offline programming software that, besides the simulation and thickness analysis tools, has distinctive features such as pattern tools that create paths based on surface topology and manual use of the 3D CAD features [2]. RoboGuide PaintPRO by FANUC, on the other hand, is specifically designed to generate paths for FANUC robots [8]: the user selects the area to be painted and chooses the painting method, and the path is generated automatically. Lastly, RobotStudio® Paint PowerPac by ABB is purposely developed for ABB robots and does not have automatically generated paths; its main benefit is the fast development of manual programs for multiple simultaneous robots.

3 Simulator Painting Add-On

The simulation was developed as an additional feature of an already existing robotics simulator, SimTwo. SimTwo is a robot simulation environment that was designed to enable researchers to accurately test their algorithms in a way that reflects the real world. SimTwo uses the Open Dynamics Engine for simulating rigid body dynamics, and the virtual world is represented using GLScene components [5], which provide a simple implementation of OpenGL [6]. In order to simulate the spray painting, it is assumed that the target model has regularly sized triangles. As this is rarely the case, a preprocessing phase is required, where the CAD model is remeshed using an Isotropic Explicit Remeshing algorithm that repeatedly applies edge flip, collapse, relax and refine
to improve aspect ratio and topological regularity. This step can be performed with the free software MeshLab [12], and the transformation can be observed in Fig. 3, with the CAD model of a car hood as an example.

Fig. 3. On top, the car hood model produced by the modeling software. On the bottom, the same model after the Isotropic Explicit Remeshing.

The importance of the preprocessing lies in the fact that, from then on, the triangle center describes the triangle's location, since the triangles are approximately equilateral. Another important result emerges from the fact that every triangle area is approximately identical and small enough that each triangle constitutes an identity of the mesh, supporting a pixel/voxel-like algorithm.
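Since the remeshed triangles are treated as fixed surface "pixels", their centers, areas and normals can be computed once when the model is loaded. A minimal sketch of that precomputation is shown below (pure NumPy; the vertex and face arrays are assumed to come from whatever mesh loader is used, and degenerate triangles are not handled):

```python
import numpy as np

def precompute_triangles(vertices, faces):
    """Precompute per-triangle centers, areas and unit normals for a triangle mesh.

    vertices: (V, 3) float array of vertex coordinates.
    faces:    (T, 3) int array of vertex indices per triangle.
    """
    a = vertices[faces[:, 0]]
    b = vertices[faces[:, 1]]
    c = vertices[faces[:, 2]]
    centers = (a + b + c) / 3.0                  # centroid of each triangle
    cross = np.cross(b - a, c - a)               # cross product of two edges
    areas = 0.5 * np.linalg.norm(cross, axis=1)  # triangle area = |cross| / 2
    normals = cross / (2.0 * areas[:, None])     # unit normals
    return centers, areas, normals

# Example with a single triangle in the XY plane.
verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
tris = np.array([[0, 1, 2]])
print(precompute_triangles(verts, tris))
```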
Fig. 4. Simulation example with the marked spray gun origin (S), piece center (C) and example point (P).

In order to achieve a fast runtime without compromising the realism of the simulation, the paint distribution model adopted is a Gaussian function limited to the area of a cone. The parameters that describe the paint distribution, as well as the cone's angle, depend on the spray gun and its settings. The general equation that calculates the paint quantity per angle, pq_α, based on the painting rate p_r, the simulation time step Δt, the standard deviation σ and the angle α, is presented in Eq. 1.

\[ pq_{\alpha} = p_r \, \Delta t \, \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{\alpha}{\sigma}\right)^{2}} \qquad (1) \]

The principle of the simulation consists of iterating over every triangle of the target model and, for each one, calculating the paint thickness being added. For this, let us define the following variables that are available from the scene: C, the piece center; S_p, the spray gun position; S_d ≡ \overrightarrow{S_pC}, the spray gun direction; P_p, the center position of an example triangle; and P_n, the triangle normal. With these, the angle between the spray gun direction and the vector from the spray gun position to the triangle center can be calculated as expressed in Eq. 2.

\[ \alpha = \arccos\!\left(\frac{\overrightarrow{S_pP_p} \cdot S_d}{\left\|\overrightarrow{S_pP_p}\right\| \left\|S_d\right\|}\right) \qquad (2) \]
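A minimal sketch of Eqs. 1 and 2 is given below: the Gaussian paint quantity per angle and the angle α between the spray direction and the vector to a triangle center. The parameter values in the example are illustrative, not taken from the paper's experiments.

```python
import numpy as np

def paint_quantity_per_angle(alpha, paint_rate, dt, sigma):
    """Eq. 1: Gaussian paint quantity per angle, pq_alpha."""
    return paint_rate * dt * np.exp(-0.5 * (alpha / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def spray_angle(spray_pos, spray_dir, tri_center):
    """Eq. 2: angle between the spray direction and the vector spray gun -> triangle center."""
    v = tri_center - spray_pos
    cos_a = np.dot(v, spray_dir) / (np.linalg.norm(v) * np.linalg.norm(spray_dir))
    return np.arccos(np.clip(cos_a, -1.0, 1.0))

# Illustrative values: spray gun at S_p pointing straight down onto a point P_p.
S_p = np.array([0.0, 0.0, 0.3])
S_d = np.array([0.0, 0.0, -1.0])
P_p = np.array([0.05, 0.0, 0.0])
alpha = spray_angle(S_p, S_d, P_p)
print(alpha, paint_quantity_per_angle(alpha, paint_rate=1e-6, dt=0.01, sigma=np.radians(10)))
```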
This angle, α, determines the limit of the cone that defines the area that is painted, while also allowing the calculation of the amount of paint that the triangle should receive, as modeled by the Gaussian function. In order to obtain the paint quantity per triangle, pq, the fraction between the triangle solid angle Ω and the spray cone solid angle is used. The triangle solid angle calculation is presented in Eq. 3 and the paint quantity per triangle is presented in Eq. 4, considering that the angle of the cone is δ, the vectors A, B and C are the 3 vertices of the triangle, and a = \overrightarrow{S_pA}, b = \overrightarrow{S_pB} and c = \overrightarrow{S_pC}.

\[ \Omega = 2\arctan\!\left(\frac{\left|\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})\right|}{\|\mathbf{a}\|\,\|\mathbf{b}\|\,\|\mathbf{c}\| + (\mathbf{a}\cdot\mathbf{b})\,\|\mathbf{c}\| + (\mathbf{a}\cdot\mathbf{c})\,\|\mathbf{b}\| + (\mathbf{b}\cdot\mathbf{c})\,\|\mathbf{a}\|}\right) \qquad (3) \]

\[ pq = pq_{\alpha}\,\frac{\Omega}{2\pi\left(1 - \cos\frac{\delta}{2}\right)} \qquad (4) \]

Finally, the added paint thickness P_t can be calculated by dividing the paint quantity by the area of the triangle, A_t (Eq. 5).

\[ P_t = \frac{pq}{A_t} \qquad (5) \]

As this analysis does not avoid painting the back part of the piece, an extra condition can be added based on the angle β (Eq. 6) between the spray gun direction and the triangle normal, whose value must always be larger than π/2 to guarantee that the considered triangle is on the upper side of the piece.

\[ \beta = \arccos\!\left(\frac{P_n \cdot S_d}{\left\|P_n\right\| \left\|S_d\right\|}\right) \qquad (6) \]

Some calculations, such as the area and center of the triangles and which vertices correspond to each triangle, would occur at every iteration (as long as the triangle is accumulating paint). As these values are not dynamic, they can be calculated only once, when the paint target CAD model is loaded, and stored in arrays of records, reducing the complexity of the algorithm to O(N).

4 Results

In order to determine the paint quality, several metrics were implemented. Establishing which part of the paint target model is indeed supposed to be painted is a problem that is outside of the scope of this work, as it would either involve the development of a graphical tool to select the parts to be painted, or resort to simple plane intersections, or even some kind of CAD model coloring that would limit the format abstraction. For these reasons, a compromise solution is to calculate a dual version of each metric that only considers the painted triangles
as triangles to be painted. These measurements are not as accurate as the others and should only be used when the painting algorithm is clearly doing what it was meant to do, while paying close attention to the coverage ratio. In this regard, the average spray thickness, the standard deviation of the spray thickness, the positive average of the spray thickness, the positive standard deviation of the spray thickness, the maximum spray thickness, the minimum spray thickness and the positive minimum spray thickness functions have been made available for any trajectory or set of trajectories. In order to validate the simulation results, a real-life experiment was carried out, using a regular spray painting gun with a servo controlling the trigger position in order to guarantee coherence among the results. The inside of the assembly can be observed in Fig. 5.

Fig. 5. Spray gun used during experiments, without the front cover (with a servo motor to control the trigger).

The spray gun was mounted on a Yaskawa MH50 machining robot, alongside the spindle, and several loose paper sheets were used as sample painting targets. The full system is presented in Fig. 6. For the sake of reproducibility and meaningful analysis, an image processing pipeline was developed to extract the paint distribution function from the scans of the painted sheets. As the images were obtained under controlled conditions, no extensive preprocessing is necessary to clean them.
Fig. 6. Robotic painting system used during experiments.

This way, the first step is converting the image to grayscale. Then a simple threshold followed by a single-step erosion segments the majority of the paint. Since the spray pattern is circular, the center of the segmentation is the same as the center of the paint; the region of interest is thus defined by a circle with this center and a radius slightly larger than half the segmentation bounding box. From this region, the image intensity along several lines is averaged and the standard deviation is calculated to validate the assumption that the pattern is indeed circularly symmetric. The size of the paper sheet used is known, as is the amount of paint spent during the experiment; considering that the paint did not saturate, it is reasonable to assume that the amount of paint is proportional to the pixel intensity. The last step is to fit the curve to the Gaussian equation used in the simulation (Eq. 1). Least squares is used for the optimization, obtaining the optimal values of the Gaussian function parameters (the sum of the squared residuals is minimized). The pipeline and the calculated plots can be observed in Fig. 7. This tool allows us not only to validate the proposed simulation but also to easily calibrate the spray parameters; this is a great advantage, since every spray painting gun has different parameters and even the same gun with different configurations can produce a variety of spray patterns.
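A sketch of this pipeline using OpenCV and SciPy is shown below. The file name, threshold value, kernel size and initial fit guess are assumptions, and the radial averaging is simplified to distance binning around the blob center rather than sampling along explicit lines.

```python
import cv2
import numpy as np
from scipy.optimize import curve_fit

# Load the scanned sheet and convert it to grayscale.
img = cv2.imread("scan.png")                                    # assumed file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Threshold (paint is darker than paper) and erode one step to clean the mask.
_, mask = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY_INV)  # threshold assumed
mask = cv2.erode(mask, np.ones((3, 3), np.uint8), iterations=1)

# Center of the segmented blob == center of the (circular) spray pattern.
m = cv2.moments(mask, binaryImage=True)
cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]

# Average "paint intensity" (darkness) per radial distance from that center.
ys, xs = np.indices(gray.shape)
r = np.hypot(xs - cx, ys - cy)
intensity = 255.0 - gray.astype(float)           # darker pixel -> more paint
rbin = (r.ravel() // 2).astype(int)              # 2-pixel-wide radial bins
counts = np.bincount(rbin)
sums = np.bincount(rbin, weights=intensity.ravel())
radii = 2.0 * np.arange(len(counts))[counts > 0]
profile = sums[counts > 0] / counts[counts > 0]

# Fit the radial profile to the Gaussian shape used in the simulation (Eq. 1).
gauss = lambda x, amp, sigma: amp * np.exp(-0.5 * (x / sigma) ** 2)
(amp, sigma), _ = curve_fit(gauss, radii, profile, p0=[profile.max(), 50.0])
print("fitted sigma (pixels):", sigma)
```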
    120 J. Casanovaet al. Fig. 7. Imaging processing pipeline: original image, converted to grayscale image, thresholded image, eroded image, image with the circle delimiting the region of interest, and an example of the selected lines. Imaging processing plots: the 3D plot of the paint distribution across the image, average paint intensity of the lines and the respective standard deviation and the fitting curve of the Gaussian function with the obtained paint distribution. 5 Conclusion and Future Work This work developed a simulation tool that enables optimized trajectory genera- tion algorithms for industrial spray painting applications. The advantages of this simulation tool include fast and easy testing of on development algorithms, opti- mization without paint, energy and pieces being wasted, availability of qualitative and quantitative metrics without real world hassles. The simulations accuracy and
    A Simulation Toolfor Optimizing a 3D Spray Painting System 121 validity was tested with several experiments and an image processing pipeline that facilitates the tuning of the spray parameters was developed. Future work includes occlusion algorithms in order to allow the simulation to work with more complex parts that have cavities. Acknowledgements. This work is financed by National Funds through the Por- tuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia within project UIDB/50014/2020. References 1. Automatic scanning and programming of robots. www.inropa.com/fileadmin/ Arkiv/Dokumenter/Produktblade/OLP automatic.pdf 2. Delfoi PAINT - Software for painting and coating. https://www.delfoi.com/delfoi- robotics/delfoi-paint/ 3. Examples - RoboDK. https://robodk.com/examples#examples-painting 4. Getting Started - RoboDK Documentation. https://robodk.com/doc/en/Getting- Started.html#Station 5. GLScene. http://glscene.sourceforge.net/wikka/ 6. OpenGL. https://www.opengl.org/ 7. Robcad Robotics and automation workcell simulation, validation and off-line pro- gramming. www.siemens.com/tecnomatix 8. Robust ROBOGUIDE Simulation Software. FANUC America. https:// www.fanucamerica.com/products/robots/robot-simulation-software-FANUC- ROBOGUIDE 9. Andulkar, M.V., Chiddarwar, S.S.: Incremental approach for trajectory generation of spray painting robot. Ind. Robot. (2015). https://doi.org/10.1108/IR-10-2014- 0405 10. Antonio, J.K.: Optimal trajectory planning for spray coating. In: Proceedings of the IEEE International Conference on Robotics and Automation (1994). https:// doi.org/10.1109/robot.1994.351125 11. Chen, Y., Chen, W., Li, B., Zhang, G., Zhang, W.: Paint thickness simulation for painting robot trajectory planning: a review (2017). https://doi.org/10.1108/IR- 07-2016-0205 12. Cignoni, P., Callieri, M., Corsini, M., Dellepiane, M., Ganovelli, F., Ranzuglia, G.: MeshLab: an open-source mesh processing tool. In: Scarano, V., Chiara, R.D., Erra, U. (eds.) Eurographics Italian Chapter Conference. The Eurograph- ics Association (2008). https://doi.org/10.2312/LocalChapterEvents/ItalChap/ ItalianChapConf2008/129-136 13. Conner, D.C., Greenfield, A., Atkar, P.N., Rizzi, A.A., Choset, H.: Paint deposition modeling for trajectory planning on automotive surfaces. IEEE Trans. Autom. Sci. Eng. (2005). https://doi.org/10.1109/TASE.2005.851631 14. Fleming, D.: Airless spray-practical technique for maintenance painting. Plant Eng. (Barrington, Illinois) 31(20), 83–86 (1977) 15. Fogliati, M., Fontana, D., Garbero, M., Vanni, M., Baldi, G., Dondè, R.: CFD simulation of paint deposition in an air spray process. J. Coat. Technol. Res. 3(2), 117–125 (2006) 16. Hicks, P.G., Senser, D.W.: Simulation of paint transfer in an air spray process. J. Fluids Eng. Trans. ASME 117(4), 713–719 (1995). https://doi.org/10.1115/1. 2817327
    122 J. Casanovaet al. 17. Persoons, W., Van Brussel, H.: CAD-based robotic coating of highly curved sur- faces. In: 24th International Symposium on Industrial Robots, Tokyo, pp. 611–618 (November 1993). https://doi.org/10.1109/robot.1994.351125 18. Rupp, J., Guffey, E., Jacobsen, G.: Electrostatic spray processes. Met. Finish. 108(11–12), 150–163 (2010). https://doi.org/10.1016/S0026-0576(10)80225-9 19. Whitehouse, N.R.: Paint application. In: Shreir’s Corrosion, pp. 2637–2642. Else- vier (January 2010). https://doi.org/10.1016/B978-044452787-5.00142-6 20. Ye, Q.: Using dynamic mesh models to simulate electrostatic spray-painting. In: High Performance Computing in Science and Engineering 2005 - Transactions of the High Performance Computing Center Stuttgart, HLRS 2005 (2006). https:// doi.org/10.1007/3-540-29064-8-13 21. Ye, Q., Domnick, J., Khalifa, E.: Simulation of the spray coating process using a pneumatic atomizer. Institute for Liquid Atomization and Spray Systems (2002) 22. Ye, Q., Pulli, K.: Numerical and experimental investigation on the spray coating process using a pneumatic atomizer: influences of operating conditions and target geometries. Coatings (2017). https://doi.org/10.3390/coatings7010013 23. Zhang, Y., Huang, Y., Gao, F., Wang, W.: New model for air spray gun of robotic spray-painting. Jixie Gongcheng Xuebao/Chin. J. Mech. Eng. (2006). https://doi. org/10.3901/JME.2006.11.226 24. Zhou, B., Zhang, X., Meng, Z., Dai, X.: Off-line programming system of industrial robot for spraying manufacturing optimization. In: Proceedings of the 33rd Chi- nese Control Conference, CCC 2014 (2014). https://doi.org/10.1109/ChiCC.2014. 6896426
Optimization of Glottal Onset Peak Detection Algorithm for Accurate Jitter Measurement

Joana Fernandes1,2, Pedro Henrique Borghi3,4, Diamantino Silva Freitas2, and João Paulo Teixeira5(B)

1 Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politecnico de Braganca (IPB), 5300 Braganca, Portugal joana.fernandes@ipb.pt
2 Faculdade de Engenharia da Universidade do Porto (FEUP), 4200-465 Porto, Portugal dfreitas@ipb.pt
3 Instituto Politecnico de Braganca (IPB), 5300 Braganca, Portugal
4 Federal University of Technology - Parana (UTFPR), Cornelio Procopio 86300-000, Brazil pedromelo@alunos.utfpr.edu.br
5 Research Centre in Digitalization and Intelligent Robotics (CeDRI), Applied Management Research Unit (UNIAG), Instituto Politecnico de Braganca (IPB), 5300 Braganca, Portugal joaopt@ipb.pt

Abstract. Jitter is an acoustic parameter used as input for intelligent systems for the diagnosis of speech-related pathologies. This work aims to improve an algorithm that extracts vocal parameters, and thus improve the measurement accuracy of the absolute jitter parameter. Several signals were analyzed and compared one by one in order to understand why, for some signals, the values of the original algorithm differ from those of the reference software. In this way, some problems were found that allowed the algorithm to be adjusted and the measurement accuracy for those signals to be improved. Subsequently, a comparative analysis was performed between the values of the original algorithm, the adjusted algorithm and the Praat software (assumed as reference). By comparing the results, it was concluded that the adjusted algorithm extracts the absolute jitter with values closer to the reference values for several speech signals. For the analysis, sustained vowels of control and pathological subjects were used.

Keywords: Jitter · Algorithm · Optimization · Speech pathologies · Acoustic analysis

This work was supported by Fundação para a Ciência e Tecnologia within the Project Scope: UIDB/05757/2020.

© Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 123–137, 2021.
https://doi.org/10.1007/978-3-030-91885-9_10
    124 J. Fernandeset al. 1 Introduction Speech pathologies are relatively common and can be found in different stages of evolution and severity, affecting approximately 10% of the population [1]. These pathologies directly affect vocal quality, as they alter the phonation process and have increased dramatically in recent times, mainly due to harmful habits, such as smoking, excessive consumption of alcoholic beverages, persistent inhalation of dust-contaminated air and abuse of voice [2]. There are a variety of tests that can be performed to detect pathologies associated with the voice, however, they are invasive becoming uncomfortable for patients and are time-intensive [3]. Auditory acoustic analyzes, performed by professionals, lack objectivity and depend on the experience of the physician who makes the assessment. Acoustic analysis allows non-invasively to determine the individual’s vocal quality. It is a technique widely used in the detection and study of voice pathologies, since it allows to measure properties of the acoustic signal of a recorded voice where a speech or vowels are said in a sustained way [4,5]. This analysis is able to provide the sound wave format allowing the evaluation of certain characteristics such as frequency disturbance measurements, amplitude disturbance measurements and noise parameters. Jitter is one of the most used parameters as part of a voice exam and is used by several authors Kadiri and Alku 2020 [6], Teixeira et al. 2018 [7], Sripriya et al. 2017 [8], Teixeira and Fernandes 2015 [9] to determine voice pathologies. Jitter is the measure of the cycle to cycle variations of the successive glottic cycles, being possible to measure in absolute or relative values. Jitter is mainly affected by the lack of control in the vibration of the vocal folds. Normally, the voices of patients with pathologies tend to have higher values of jitter [10]. This work intends to improve the algorithm developed by Teixeira and Gonçalves [10,11] to obtain the jitter parameter, to later be used in a com- plementary diagnostic system and obtain realiable jitter values. As a means of reference, to ensure reliable values of the algorithm, Praat is used as a refer- ence for comparison, as this software is accepted by the scientific community as an accurate measure and is open software. This software is developed by Paul Boersma and David Weenink [12], from the Institute of Phonetic Sciences at the University of Amsterdam. This article is organized as follows: Sect. 2 describes the determination of the absolute jitter, the pathologies and the database used, as well as the number of subjects used for the study; Sect. 3 describes some of the problems found in some signals in the Teixeira and Gonçalves algorithms [10,11], as well as the description and flowchart of the adjusted algorithm; Sect. 4 describes the studies carried out, the results obtained and the discussion. Finally, in Sect. 5 the conclusions are presented.
2 Methodology

In Fig. 1 the jitter concept is illustrated, where it is possible to perceive that jitter corresponds to the measure of the variation of the duration of the glottal periods.

Fig. 1. Jitter representation in a sustained vowel /a/.

2.1 Jitter

Jitter is defined as a measure of the glottal variation between cycles of vibration of the vocal cords. Subjects who cannot control the vibration of the vocal cords tend to have higher values of jitter [11]. Absolute jitter (jitta) is the variation of the glottal period between cycles, that is, the average absolute difference between consecutive periods, expressed by Eq. 1.

\[ \text{jitta} = \frac{1}{N-1} \sum_{i=2}^{N} \left| T_i - T_{i-1} \right| \qquad (1) \]

In Eq. 1, T_i is the duration of glottal period i and N is the total number of glottal periods.
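Equation 1 translates directly into code once the glottal periods have been extracted; a minimal sketch follows (the example period values are invented, and the period extraction itself is the subject of Sect. 3):

```python
import numpy as np

def absolute_jitter(periods_s):
    """Eq. 1: mean absolute difference between consecutive glottal periods (jitta)."""
    T = np.asarray(periods_s, dtype=float)
    return np.mean(np.abs(np.diff(T)))

# Example: periods around 5 ms (a 200 Hz voice) with small cycle-to-cycle variation.
periods = [5.00e-3, 5.02e-3, 4.99e-3, 5.01e-3, 5.00e-3]
print(absolute_jitter(periods) * 1e6, "microseconds")
```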
2.2 Database

The German Saarbrucken Voice Database (SVD) was used. This database is made available online by the Institute of Phonetics at the University of Saarland [13]. It consists of voice signals of more than 2000 subjects, with several diseases, as well as control/healthy subjects. Each person has recordings of the phonemes /a/, /i/ and /u/ in the low, normal and high tones, with a sweep along the tones, and of the German phrase "Guten Morgen, wie geht es Ihnen?" ("Good morning, how are you?"). The sound files are between 1 and 3 s long and have a sampling frequency of 50 kHz. For the analysis, a sub-sample of 10 control subjects (5 male and 5 female) and 5 pathological subjects (2 male and 3 female), covering 3 diseases, was used. Of these, 2 subjects had chronic laryngitis (one of each gender), 2 had vocal cord paralysis (one of each gender) and 1 female had vocal cord polyps.

3 Development

3.1 Comparative Analysis Between the Signals Themselves

Teixeira and Gonçalves [11] reported the results of the algorithm using groups of 9 healthy and 9 pathological male and female subjects. The authors compared the average measures of the algorithm and of the Praat software over the groups of 9 subjects. The comparison shows generally slightly higher values measured by the algorithm, but with differences lower than 20 µs, except for the female pathological group; 20 µs corresponds to only one speech signal sample at a 50 kHz sampling frequency. The authors thus claim a difference lower than 1 speech sample on average. Even so, in particular speech files, very high differences were found between the algorithm measures and the Praat measures. In order to understand why the jitter measures of the Teixeira and Gonçalves 2016 [11] algorithm are so different from the reference values in some particular speech files, we proceeded to a visual comparison of the signals in the reference software and in the Teixeira and Gonçalves 2016 algorithm. In this way, we took the signals for which the values of absolute jitter were very different, and it was noticed that for particular shapes of the signal the algorithm does not find the peaks in the same way Praat does. In a female control signal, the reference value for absolute jitter was 10.7 µs and the algorithm value was 478.6 µs. When the signal was observed, it was realized that one of the reasons for this difference was that the algorithm was selecting the maximum peaks to make the measurement, while the reference software was using the minimum peaks. This led us to investigate why the algorithm selects the maximum peaks instead of the minimums; the conclusion can be observed in Fig. 2. As can be seen in Fig. 2, the algorithm finds two maximum values for the same peak. This first search for peaks is performed in an initial phase of the algorithm, to decide whether to use the maximum or minimum peaks. These two close peaks lead the algorithm to erroneously choose the peaks it will use to make the measurement, since the condition of the algorithm (number of maximum peaks = 10, in a speech segment 10 glottal periods long) is not satisfied. This problem occurs in more signals. Correcting this error, for this signal, the absolute jitter value went from 478.6 µs to 39.3 µs. A similar situation was found in another signal, also from a control subject, for which the reference value of the absolute jitter is 9.6 µs while the algorithm gave 203.1 µs.
Fig. 2. Problem 1 – finding two close peaks in the short window used to decide between the maximum or minimum peaks.

When observing the signal, it was also noticed that the algorithm was using the maximum peaks to make the measurement, while the reference software used the minimum peaks. Looking at Fig. 3 (presenting the speech signal and the identification of the correct minimum peaks), it is possible to see why the algorithm selected the maximum peaks instead of the minimum peaks.

Fig. 3. Problem 2 – selection of maximum/minimum peaks (in red, the positions of the minimum peaks). (Color figure online)

As it is possible to observe in Fig. 3, the length of the window used, defined in glottal periods, comprises 11 minimum peaks. Taking into account the conditions of the algorithm for selecting the maximum or minimum peaks for the later measurement, the fact of having more than 10 minimum peaks leads the algorithm to select the maximums. When the window correction was made, for this signal, only 10 minimum peaks were selected and the algorithm started selecting the minimum peaks for the measurement. Thus, the absolute jitter value went from 203.1 µs to 32.3 µs.
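A sketch of this kind of polarity check is given below, using SciPy's peak finder: count the prominent maxima and minima in a central window of ten estimated fundamental periods and pick the polarity whose count is closest to ten. The 0.7·T0 minimum peak distance, the window placement and the tie-breaking rule are illustrative assumptions, not the exact conditions of the algorithm.

```python
import numpy as np
from scipy.signal import find_peaks

def choose_polarity(x, fs, f0_estimate):
    """Decide whether to track maximum or minimum peaks, based on how many
    peaks of each polarity fall inside a central window of ~10 periods."""
    T0 = int(fs / f0_estimate)                  # fundamental period in samples
    mid = len(x) // 2
    win = x[mid - 5 * T0: mid + 5 * T0]         # central window of ~10 periods
    # Require peaks to be at least ~0.7 T0 apart, so two close peaks of the
    # same glottal cycle are not counted twice (the problem shown in Fig. 2).
    max_peaks, _ = find_peaks(win, distance=int(0.7 * T0))
    min_peaks, _ = find_peaks(-win, distance=int(0.7 * T0))
    # Choose the polarity whose count is closest to the expected 10 peaks.
    return "max" if abs(len(max_peaks) - 10) <= abs(len(min_peaks) - 10) else "min"

# Example with a synthetic 200 Hz waveform sampled at 50 kHz.
fs = 50_000
t = np.arange(0, 0.2, 1 / fs)
signal = np.sin(2 * np.pi * 200 * t) - 0.4 * np.sin(2 * np.pi * 400 * t)
print(choose_polarity(signal, fs, f0_estimate=200))
```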
Two pathological signals, one female and one male, were also analyzed. For the female signal, the reference value of the absolute jitter is 8.0 µs while the algorithm gives 68.6 µs; for the male signal, the reference value is 28.7 µs and the algorithm gives 108.0 µs. When each of these signals was observed, it was noticed that the choice between selecting the maximum or minimum peaks was well made; however, this type of signal has two very close peaks in the same period that alternately take the minimum/maximum value. Since the algorithm always takes the minimum/maximum value, the selection switches between the first and the second peak along the signal, making the jitter measure inconsistent. In Figs. 4 and 5 it is possible to observe the problems found in these two signals. These figures present the speech signal as a blue line, the moving average as a red line and the marked peaks as red circles.

Fig. 4. Problem 3 – variation along the signal of the moving average minimum peak. (Color figure online)

Fig. 5. Problem 4 – variation along the signal of the signal minimum peak. (Color figure online)

As it is possible to observe in Figs. 4 and 5, because there are 2 peaks (minimum peaks in these examples) in the same glottal period, sometimes the selected peak is the first one and other times it is the second, which increases the absolute jitter that is determined.
3.2 Algorithm Adjustments

The main purpose of the present work is to present the optimizations applied to the algorithm proposed in [11], since an inability to deal with some recordings was observed, both in the control group and in subjects with some type of pathology. Thus, a recap of the steps in [11] is given in order to contextualize each improvement developed. The importance of the exact identification of the position of each onset time of the glottal periods must be emphasised, for an accurate measurement of the jitter values. The onset times of the glottal periods are taken as the positions of the peaks in the signal.

According to [11], the process of choosing the references for calculating the jitter parameter is best performed by analyzing the signal from its central region, where high stationarity is expected. In addition, it was concluded that a window of 10 fundamental periods contains enough information to determine the reference length of the fundamental period and the selection between the minimums or maximums. Over this window, the instants at which the positive and negative peaks of each glottal period occur are determined, as well as the amplitude modulus of the first cycle. The step between adjacent cycles is based on the peak position of the previous cycle plus one fundamental period. From this point, the position of the next peak is determined by searching for the maximum (or minimum) over a window of two thirds of the fundamental period. Finally, in an interval of two fifths of the fundamental period around each peak, a search is carried out for other peaks corresponding to at least 70% of its amplitude. The reference determination is made considering the positive and negative amplitudes (in modulus) of the first cycle and the count of positive and negative peaks in the interval of 10 cycles. Basically, the amplitude measure directs the decision to the parameter with the greatest prominence, since isolated peaks are expected for the one with the greatest amplitude. On the other hand, the count of cycles indicates the number of points and oscillations close to the peaks, which may eventually prevail over them. The occurrence of these behaviors may or may not be characteristic of the signal; it was observed that, for cases in which this rarely occurred, a slight smoothing should be sufficient to readjust the signal. In [11], when the peak count exceeded the limit of 10 for both searches, a strong smoothing was applied to the signal, which sometimes de-characterized it. Based on the mentioned amplitude parameters and the peak counts, a decision-making process is carried out, which concludes with the use of maximums or minimums of the original signal or of its moving average to determine all glottal periods. In Fig. 6 the flowchart of the proposed algorithm is presented.
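The cycle-by-cycle inspection just recapped can be sketched as follows: from a reference peak, step one estimated fundamental period ahead and search for the extremum only inside a window around the predicted position. The window fraction is a parameter (two thirds of the period in [11]); the function names, default values and the synthetic example are illustrative assumptions.

```python
import numpy as np

def track_peaks(x, start_idx, T0, n_cycles, polarity="max", win_frac=2/3):
    """Follow glottal peaks cycle by cycle.

    x        : 1-D signal array.
    start_idx: index of the reference peak found in the central window.
    T0       : estimated fundamental period, in samples.
    win_frac : width of the search window around the predicted position,
               as a fraction of T0 (2/3 in [11]).
    """
    sign = 1.0 if polarity == "max" else -1.0
    peaks = [start_idx]
    half = int(win_frac * T0 / 2)
    for _ in range(n_cycles):
        predicted = peaks[-1] + T0                   # one period ahead
        lo, hi = predicted - half, predicted + half  # search window
        if lo < 0 or hi >= len(x):
            break
        local = int(np.argmax(sign * x[lo:hi]))      # extremum inside the window
        peaks.append(lo + local)
    return np.array(peaks)

# Example: track 5 cycles of a synthetic 200 Hz tone sampled at 50 kHz.
fs, f0 = 50_000, 200
t = np.arange(0, 0.1, 1 / fs)
x = np.sin(2 * np.pi * f0 * t)
T0 = fs // f0
first = int(np.argmax(x[:T0]))                       # first maximum as reference
print(track_peaks(x, first, T0, n_cycles=5))
```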
    130 J. Fernandeset al. Fig. 6. Algorithm flowchart. The rules of the decision process are described below, where the limit param- eter n = 13 is defined as the maximum peak count accepted for the use of the original signal or for the application of light straightening. • If the maximum amplitude is greater than the minimum amplitude in module and the maximum peak count is in the range [10, n], or, the minimum peak count is greater than n and the maximum peak count is in [10, n]: ◦ If the maximum count is equal to 10, the search for peaks is per- formed on the original signal using the maximum as a cycle-by-cycle search method; ◦ Else, a moving average with a Hanning window of equal length to a ninth of a fundamental period is applied to the signal and the search for peaks in this signal is carried out through the maximum of each cycle. Figure 7 shows the comparison between the original signals, the moving
    Optimization of GlottalOnset Peak Detection 131 average MA1 and the moving average MA2. The peak marking was per- formed within this rule, that is, starting from MA1 and using maximum method. It is noted that the slight smoothing applied keeps the peak marking close to the real one and maintains, in a way, the most signifi- cant undulations of the signal, whereas MA2, in addition to shifting the position of the peaks more considerably, their undulations are composed of the contribution of several adjacent waves. This can reduce the reli- ability of marking signals that have peaks’ side components with high amplitude in modulus. Fig. 7. Comparison between the original signal, the signal after moving average with Hanning window of length N0/9 (MA1) and the signal after moving average with Hanning window of length N0/3 (MA2). The peaks of glotal periods are shown in black and were detected through maximum method over MA1. Recording excerpt from a healthy subject performing /i/ low. Normalized representation. • Else, if the minimum amplitude in module is greater than the maximum amplitude and the minimum peak count is in the range [10, n], or, the max- imum peak count is greater than n and the minimum peak count is in [10, n]: ◦ If the minimum count is equal to 10, the search for peaks is per- formed on the original signal using the minimum as a cycle-by-cycle search method; ◦ Else, a moving average with a Hanning window of equal length to a ninth of a fundamental period is applied to the signal and the search for peaks in this signal is carried out through the minimum of each cycle. Figure 8 shows an example of the path taken through this rule. Here, the generated MA1 signal proves to be much less sensitive than the MA2 (gen- erated for comparison) since in this one, the wave that contains the real peak is strongly attenuated by the contribution of the previous positive amplitude to the averaging. In this way, the application of smoothing
contributes to eliminating rapid oscillations in the samples close to the minimum, while preserving the amplitude behavior of signals that do not need strong smoothing.

Fig. 8. Comparison between the original signal, the signal after moving average with Hanning window of length N0/9 (MA1) and the signal after moving average with Hanning window of length N0/3 (MA2). The peaks of glottal periods are shown in black and were detected through the minimum method over MA1. Recording excerpt from a pathological subject performing /a/ high. Normalized representation.

• Else, a moving average with a Hanning window of length equal to a third of the fundamental period is applied to the signal. On the result, linear trends are removed and a new analysis is performed in the central region of the signal. In this analysis, an interval of 10 fundamental periods is evaluated cycle by cycle with respect to the maximum and minimum peaks and their adjacent peaks greater than 75% of their amplitude: ◦ If, in the first cycle, the maximum amplitude is greater than the minimum amplitude in modulus, and the maximum count is equal to 10, or the minimum count is greater than 10 and the maximum count is equal to 10, the search for peaks is made on the moving average signal using the maximum as the method of inspection cycle by cycle. Figure 9 demonstrates the case in which this rule is followed. It is possible to observe, by comparing the original signal, MA1 and MA2, that the determination of the peaks of the glottal periods is only possible when a heavy smoothing is applied, since the region with the greatest positive amplitude and energy can only be highlighted as a single peak by averaging over a context long enough to avoid the two adjacent peaks present in the original signal. It is also noted that MA1 reduces the high frequencies but keeps the maximum point undefined, varying between the left and right along the signal;
    Optimization of GlottalOnset Peak Detection 133 Fig. 9. Comparison between the original signal, the signal after moving average with Hanning window of length N0/9 (MA1) and the signal after moving average with Hanning window of length N0/3 (MA2). The peaks of glotal periods are shown in black and were detected through maximum method over MA2. Recording excerpt from a pathological subject performing /a/ normal. Normalized representation. ◦ Else, the search for peaks is done on the moving average signal using the minimum as a method of inspection cycle by cycle. In all cases, the inspection of the peaks cycle by cycle of every signal is carried out with steps of a fundamental period from the reference point (maximum or minimum) and then an interval of one third of the fundamental period is analyzed. The reduction in the length of this interval from two thirds (used in [11]) to one third makes the region of analysis more restricted to samples close to one step from the previous peak. With this it is expected to guarantee that the analysis is always carried out in the same region of the cycles, avoiding the exchange between adjacent major peaks. However, in the cases where there is a shift in the fundamental frequency over time, the use of a fixed parameter as a step between cycles, as done in [11] and in this work, can cause inevitable errors in the definition of the search interval. The maximum or minimum points of each cycle are used as a source for determining the parameters Jitter according to Eqs. 1. 4 Results and Discussion In this section, the results of some analyzes made will be reported, as well as their discussion. A comparative analysis was made for the values of absolute jitter between the values obtained by the algorithm developed by Teixeira and Gonçalves 2016 [11], the reference software [12] and the adjusted algorithm, in order to try to understand if the adjusted algorithm values are closer to the values of the
reference software. Praat software was used as a reference, although the exact values of jitter cannot be known. For 10 control subjects and 5 pathological subjects, the absolute jitter was extracted and averaged by tone and vowel. Table 1 shows the results of this analysis for the control subjects and Table 2 for the pathological subjects.

Table 1. Average absolute jitter (in microseconds) for each vowel and tone for control subjects.

Vowel  Tone    Praat    Teixeira and Gonçalves 2016   Adjusted
a      High    14.673   26.795                        25.890
a      Low     27.614   80.084                        34.105
a      Normal  24.916   50.091                        32.016
i      High    16.092   23.621                        23.617
i      Low     23.805   26.246                        26.378
i      Normal  18.593   23.909                        23.900
u      High    13.361   23.422                        23.857
u      Low     29.077   60.089                        37.140
u      Normal  18.376   23.759                        23.755

Table 2. Average absolute jitter (in microseconds) for each vowel and tone for pathological subjects.

Vowel  Tone    Praat    Teixeira and Gonçalves 2016   Adjusted
a      High    32.817   56.447                        56.379
a      Low     39.341   82.316                        82.316
a      Normal  36.462   139.866                       103.326
i      High    39.774   62.280                        62.284
i      Low     35.512   48.347                        48.347
i      Normal  25.611   33.921                        33.916
u      High    33.606   45.489                        45.489
u      Low     40.751   65.752                        65.745
u      Normal  52.168   323.279                       103.322

Through the data in Tables 1 and 2 it is possible to see that the adjusted algorithm obtains jitter measures closer to the reference than the Teixeira and Gonçalves 2016 algorithm; for example, for the vowel /a/ in the low tone, for the control subjects, the adjusted algorithm yields an improvement of about 46 µs. After the evaluation of the averages, an analysis of the individual values per subject was made. Thus, the vowel /a/ in the normal tone was selected, for control and pathological subjects, for further analysis. In Fig. 10 it is possible to observe the results
of absolute jitter for the 10 control subjects, and in Fig. 11 the results of absolute jitter for the 5 pathological subjects.

Fig. 10. Comparison of the absolute jitter values for the 10 control subjects.

Fig. 11. Comparison of the absolute jitter values for the 5 pathological subjects.

Through Figs. 10 and 11 it is possible to observe that, comparing the values of the Teixeira and Gonçalves 2016 algorithm with those of the adjusted algorithm, the adjusted algorithm has values closer to the reference for more subjects. In control subjects, comparing the reference values with the Teixeira and Gonçalves 2016 algorithm, the difference is less than 16.8 µs, whereas comparing the reference values with the adjusted algorithm, the difference is less than 7.1 µs. For pathological subjects, the difference between the reference values and the Teixeira and
    136 J. Fernandeset al. Gonçalves algorithm, 2016, is less than 57.9 µs and the difference with the cor- rected algorithm is less than 29.4 µs. Therefore, it can be concluded that the adjusted algorithm obtains values closer to the reference values. It should also be mentioned that an interpolation procedure of the speech sig- nal was also experimented with interpolations of several order between 2 and 10. The objective was to increase the resolution of the peak position. The interpola- tion procedure didn’t improve the accuracy of the absolute jitter determination, and was discarded. 5 Conclusion The objective of this work was to improve the algorithm developed by Teix- eira and Gonçalves 2016 in some particular signals subjected to inconsistencies finding the onset glottal periods. In order to try to increase the accuracy of the absolute jitter, some adjustments were made in the algorithm, where the detection of the peaks was improved. In order to understand if these corrections obtained improvements in the detection of the absolute jitter, an analysis was made of the averages of the values obtained from the absolute jitter of 10 control subjects and 5 pathological ones. In this analysis it was noticed that the values obtained through the corrections were closer to the reference values. In order to understand how the adjusted algorithm behaves compared to the reference value another analysis was carried out comparing, for one vowel, the 10 control subjects and the 5 pathological subjects. Thus, it was noticed that the adjusted algorithm measures the absolute jitter with values closer to the reference val- ues for more subjects. The algorithm still measure absolute jitter with slightly higher values than Praat. In control subjects, the difference in absolute jitter measurements was reduced from 16.8 to 7.1 µs, and for pathological subjects, the difference was reduced from 57.9 to 29.4 µs. References 1. Martins, A.l.H.G., Santana, M.F., Tavares, E.L.M.: Vocal cysts: clinical, endo- scopic, and surgical aspects. J. Voice 25(1), 107–110 (2011) 2. Godino-Llorente, J., Gomez-Vilda, P., Blanco-Velasco, M.: Dimensionality reduc- tion of a pathological voice quality assessment system based on gaussian mixture models and short-term cepstral parameters. IEEE Trans. Biomed. Eng. 53(10), 1943–1953 (2006) 3. Teixeira, J.P., Alves, N., Fernandes, P.O.: Vocal acoustic analysis: ANN versos SVM in classification of dysphonic voices and vocal cords paralysis. Int. J. E- Health Med. Commun. (IJEHMC) 11(1), 37–51 (2020) 4. Sataloff, R.T., Hawkshawe, M.J., Sataloff, J.B.: Common medical diagnoses and treatments in patients with voice disorders: an introduction and overview. In: Vocal Health and Pedagogy: Science, Assessment and Treatment, p. 295 (2017) 5. Godino-Llorente, J., Gómez-Vilda, P.: Automatic detection of voice impairments by means of short-term cepstral parameters and neural network based detectors. IEEE Trans. Biomed. Eng. 51(2), 380–384 (2004)
    Optimization of GlottalOnset Peak Detection 137 6. Kadiri, S., Alku, P.: Analysis and detection of pathological voice using glottal source features. IEEE J. Sel. Top. Signal Process. 14(2), 367–379 (2020) 7. Teixeira, J.P., Teixeira, F., Fernandes, J., Fernandes, P.O.: Acoustic analysis of chronic laryngitis - statistical analysis of sustained speech parameters. In: 11th International Conference on Bio-Inspired Systems and Signal Processing, BIOSIG- NALS 2018, vol. 4, pp. 168–175 (2018) 8. Sripriya, N., Poornima, S., Shivaranjani, R., Thangaraju, P.: Non-intrusive tech- nique for pathological voice classification using jitter and shimmer. In: International Conference on Computer, Communication and Signal Processing (ICCCSP), pp. 1–6 (2017) 9. Teixeira, J.P., Fernandes, P.: Acoustic analysis of vocal dysphonia. Procedia Com- put. Sci. 64, 466–473 (2015) 10. Teixeira, J.P., Gonçalves, A.: Accuracy of jitter and shimmer measurements. Pro- cedia Technol. 16, 1190–1199 (2014) 11. Teixeira, J.P., Gonçalves, A.: Algorithm for jitter and shimmer measurement in pathologic voices. Procedia Comput. Sci. 100, 271–279 (2016) 12. Boersma, P., Weenink, D.: Praat: doing phonetics by computer [Computer pro- gram]. Version 6.0.48, 15 April 2019 (1992–2019). http://www.praat.org/ 13. Barry, W., Pützer, M.: Saarbruecken Voice Database. Institute of Phonetics at the University of Saarland (2007). http://www.stimmdatenbank.coli.uni-saarland.de. Accessed 15 Apr 2021
    Searching the OptimalParameters of a 3D Scanner Through Particle Swarm Optimization João Braun1,2(B) , José Lima2,3 , Ana I. Pereira2 , Cláudia Rocha3 , and Paulo Costa1,3 1 Faculty of Engineering, University of Porto, Porto, Portugal paco@fe.up.pt 2 Research Centre in Digitalization and Intelligent Robotics, Instituto Politécnico de Bragança, Bragança, Portugal {jbneto,jllima,apereira}@ipb.pt 3 INESC-TEC - INESC Technology and Science, Porto, Portugal claudia.d.rocha@inesctec.pt Abstract. The recent growth in the use of 3D printers by independent users has contributed to a rise in interest in 3D scanners. Current 3D scanning solutions are commonly expensive due to the inherent complex- ity of the process. A previously proposed low-cost scanner disregarded uncertainties intrinsic to the system, associated with the measurements, such as angles and offsets. This work considers an approach to estimate these optimal values that minimize the error during the acquisition. The Particle Swarm Optimization algorithm was used to obtain the param- eters to optimally fit the final point cloud to the surfaces. Three tests were performed where the Particle Swarm Optimization successfully con- verged to zero, generating the optimal parameters, validating the pro- posed methodology. Keywords: Nonlinear optimization · Particle swam optimization · 3D scan · IR sensor 1 Introduction The demand for 3D scans of small objects has increased over the last few years due to the availability of 3D printers for regular users. However, the solutions are usually expensive due to the complexity of the process, especially for sporadic users. There are several approaches to 3D scanning, with the application, in general, dictating the scanning system’s requirements. Thus, each of them can differ concerning different traits such as acquisition technology, the structure of the system, range of operation, cost, and accuracy. In a more general way, these systems can be classified as contact or non-contact, even though there are several sub-classifications inside these two categories, which the reader can find with more detail in [2]. c Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 138–152, 2021. https://doi.org/10.1007/978-3-030-91885-9_11
    Optimal Parameters ofa 3D Scanner Through PSO 139 Two common approaches for reflective-optical scanning systems are triangu- lation and time-of-flight (ToF). According to [6], the range and depth variation in triangulation systems are limited, however, they have greater precision. In contrast, ToF systems have a large range and depth variation at the cost of decreased precision. The triangulation approach works, basically, by projecting a light/laser beam over the object and capturing the reflected wave with a digi- tal camera. After that, the distance between the object and the scanning system can be computed with trigonometry, as the distance between the camera and the scanning system is known [6]. On the other hand, the accuracy of ToF- based systems is mainly determined by the sensor’s ability to precisely measure the round-trip time of a pulse of light. In other words, by emitting a pulse of light and measuring the time that the reflected light takes to reach the sensor’s detector, the distance between the sensor and the object is measured. Besides, still regarding ToF-based systems, there is another approach to measure the distance between the sensor and the object based on the Phase-Shift Method, which essentially compares the phase shift between the emitted and reflected electromagnetic waves. Each approach has its trade-offs, which a common one is between speed and accuracy in these types of scanning systems. By increasing the scanning speed of the system, the accuracy will decrease, and vice-versa. This is mitigated by hav- ing expensive rangefinders with higher sampling frequencies. In addition, accu- racy during acquisition can be heavily affected by light interference, noise, and the angle of incidence of the projected beam in the object being much oblique. Therefore, a controlled environment, and a high-quality sensor and circuit must be used to perform quality scans. However, the angle of incidence is less con- trollable since it depends on the scanning system and the object (trade-off of the system’s structure). In general, these types of 3D scanning systems are very expensive, especially when even the cheaper ones are considered costly by the regular user. A particular example includes a low-cost imagery-based 3D scanner for big objects [18]. A low-cost 3D scanning system that uses a triangulation approach with an infra-red distance sensor targeting small objects, therefore having limited range and requiring large accuracy, was proposed and validated both in simulation [5] and real scenarios, where the advantages and disadvantages of this architecture were also addressed. Although the proposed system already has good accuracy, it is possible to make some improvements with a multivariable optimization app- roach. This is possible because the system makes some assumptions to perform the scan that, although are true in simulation, do not necessarily hold in a real scenario. One example is the object not being exactly centered on the scanning platform. The assumptions, as well as the problem definition, are described in Subsect. 4.1. The reasons why PSO was chosen instead of other approaches are addressed in Subsect. 4.2. The paper is structured as follows: a brief state of the art is presented in Sect. 2. Section 3 describes the 3D scanning system and how it works. The optimization problem is defined, and a solution is proposed in Sect. 4. The results are described in Sect. 5. Finally, the last section presents the conclusion and future works.
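Returning to the two ToF principles mentioned above, both reduce to simple relations that are easy to state numerically: distance from the round-trip time of a pulse, or from the phase shift of a modulated wave. The tiny sketch below is purely illustrative; the pulse time and modulation frequency are made-up values.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def distance_from_round_trip(t_seconds):
    """Pulsed ToF: the light travels to the object and back, so d = c * t / 2."""
    return C * t_seconds / 2.0

def distance_from_phase_shift(phase_rad, mod_freq_hz):
    """Phase-shift ToF: d = (c / (2 f)) * (phase / 2*pi), unambiguous up to c / (2 f)."""
    return (C / (2.0 * mod_freq_hz)) * (phase_rad / (2.0 * math.pi))

print(distance_from_round_trip(4e-9))            # ~0.6 m for a 4 ns round trip
print(distance_from_phase_shift(math.pi, 10e6))  # ~7.5 m at 10 MHz modulation
```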
    140 J. Braunet al. 2 State of the Art As the field of optimization is vast, there are works in several fields of knowledge. Therefore, in this section, the objective is to give a brief literature review on multivariable optimization. Thus, some recent examples follow below. Regarding optimization for shape fitting, [9] proposed a 3D finite-element (FE) modeling approach to obtain models that represent the shape of grains adaptively. According to the authors, the results show that the proposed app- roach supports the adaptive FE modeling for aggregates with controllable accu- racy, quality, and quantity, which indicates reliable and efficient results. More- over, authors in [11], proposed an approach that combined an optimization- based method with a regression-based method to get the advantages of both approaches. According to the authors, model-based human pose estimation is solved by either of these methods, where the first leads to accurate results but is slow and sensitive to parameter initialization, and the second is fast but not as accurate as of the first, requiring huge amounts of supervision. Therefore, they demonstrated the effectiveness of their approach in different settings in compar- ison to other state-of-the-art model-based pose human estimation approaches. Also, a global optimization framework for 3D shape reconstruction from sparse noisy 3D measurements, that usually happens in range scanning, sparse feature- based stereo, and shape-from-X, was proposed in [13]. Their results showed that their approach is robust to noise, outliers, missing parts, and varying sampling density for sparse or incomplete 3D data from laser scanning and passive multi- view stereo. In addition, there are several works in multivariable optimization for mechan- ical systems. For instance, the characteristics, static and dynamic, of hydrostatic bearings were improved with controllers that were designed for the system with the help of a multivariable optimization approach for parameter tuning [16]. The authors proposed a control approach that has two inputs, a PID (Proportional Integral Derivative) sliding mode control and feed-forward control, where the parameters of the first input, were optimized with Particle Swarm Optimization (PSO). Their results performed better compared with results from the literature which had PID control. In addition, a system that consists of multi-objective particle swarm optimized neural networks (MOPSONNS) was proposed in [8] to compute the optimal cutting conditions of 7075 aluminum alloys. The sys- tem uses multi-objective PSO and neural networks to determine the machining variables. The authors concluded that MOPSONNS is an excellent optimization system for metal cutting operations where it can be used for other difficult- to-machine materials. The authors in [20] proposed a multi-variable Extremum Seeking Control as a model-free real-time optimizing control approach for oper- ation of cascade air source heat pump. They used for the scenarios of power minimization and coefficient of performance maximization. Their approach was validated through simulation which their results showed good convergence per- formance for both steady-state and transient characteristics. The authors in [7] applied two algorithms, continuous ant colony algorithm (CACA) and mul- tivariable regression, to calculate optimized values in back analysis of several
    Optimal Parameters ofa 3D Scanner Through PSO 141 geomechanical parameters in an underground hydroelectric power plant cavern. The authors found that the CACA algorithm showed better performance. To improve the thermal efficiency of a parabolic trough collector, authors in [1] presented a multivariate inverse artificial neural network (ANNim) to determine several optimal input variables. The objective function was solved using two meta-heuristic algorithms which are Genetic Algorithm (GA) and PSO. The authors found that although ANNim-GA runs faster, ANNim-PSO has better precision. The authors stated that the two presented models have the potential to be used in intelligent systems. Finally, there are optimization approaches in other areas such as control the- ory, chemistry, and telecommunication areas. Here are some examples. In control theory, by employing online optimization, a filtering adaptive tracking control for multivariable nonlinear systems subject to constraints was proposed in [15]. The proposed control system maintains tracking mode if the constraints are not violated, if not, it switches to constraint violation avoidance control mode by solving an online constrained optimization problem. The author showed that the switching logic avoids the constraints violation at the cost of tracking per- formance. In chemistry, artificial neural networks were trained under 5 different learning algorithms to evaluate the influence of several input variables to pre- dict the specific surface area of synthesized nanocrystalline NiO-ICG composites, where the model trained with the Levenberg-Marquardt algorithm was chosen as the optimum algorithm for having the lowest root means square error [17]. It was found that the most influential variable was the calcination temperature. At last, in telecommunications, Circular Antenna Array synthesis with novel metaheuristic methods, namely Social Group Optimization Algorithm (SGOA) and Modified SGOA (MSGOA), was presented in [19]. The authors performed a comparative study to assess the performance of several synthesis techniques, where SGOA and MGSOA had good results. As there aren’t any related works similar to our approach, the contribution of this work is to propose a procedure to find the optimal parameters (deviations) of a 3D scanner system through PSO, guaranteeing a good fit for point clouds over their respective scanned objects. 3 3D Scanning System The scanning system works basically in two steps. First, the object is scanned and the data, in spherical coordinates, is saved in a point cloud file. After this, a software program performs the necessary transformations to cartesian coordi- nates and generates STL files that correspond to the point clouds. The flowchart of the overall STL file generation process is presented in Fig. 1a. A prototype of the low-cost 3D scanner, illustrated in Fig. 1b, was developed to validate the simulated concept in small objects [5]. The architecture regarding the components and the respective communica- tion protocols of the 3D scanning system can be seen in Fig. 2.
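Although the optimization formulation is only detailed in Sect. 4, a minimal, generic PSO loop is sketched here for reference. The inertia and acceleration coefficients, the bounds and the simple quadratic test objective are illustrative defaults, not the scanner's actual cost function or settings.

```python
import numpy as np

def pso(objective, bounds, n_particles=30, n_iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal Particle Swarm Optimization: minimize `objective` inside `bounds`.

    bounds: array of shape (dim, 2) with [min, max] per dimension.
    """
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    dim = bounds.shape[0]
    pos = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest = pos.copy()                                   # personal best positions
    pbest_val = np.array([objective(p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()           # global best position
    for _ in range(n_iters):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, bounds[:, 0], bounds[:, 1])
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, pbest_val.min()

# Illustrative use: recover two "deviation" parameters that minimize a quadratic error.
best, err = pso(lambda p: (p[0] - 0.5) ** 2 + (p[1] + 1.0) ** 2,
                bounds=[[-2, 2], [-2, 2]])
print(best, err)
```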
Fig. 1. Generation of an object's STL file: (a) flowchart of the overall process, and (b) prototype of the low-cost 3D scanner.

The system is composed of a rotating structure supporting the object that is going to be scanned and an articulated structure supporting the sensor that performs the data acquisition. The rotating structure is actuated by a stepper motor that precisely rotates it continuously until the end of one cycle, corresponding to 360°. An articulated part, whose rotating axis is aligned with the center of the rotating structure and which has attached an optical sensor that measures the distance to the object, is also actuated by a stepper motor that moves in steps of 2° (configurable). The proposed optical sensor was selected because it combines a complementary metal-oxide-semiconductor image sensor with an infrared light-emitting diode (IR-LED), which reduces the influence of the environment temperature and of the object's reflectivity. Both stepper motors used in this approach are the well-known NEMA17, controlled by the DRV8825 driver. Moreover, a limit switch sensor was added as a reference for the 0° position. Over a cycle, the sensor obtains 129 measurements, each one taking 50 ms. The rotating plate's speed is approximately 55°/s (configurable). The system is controlled by a low-cost ESP32 microcontroller, which transfers to the PC, by serial or Wi-Fi, the measured distance to the object, the sensor's angle, and the rotating structure's angle.

Fig. 2. Diagram illustrating the components of the system and the respective communication protocols used.
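As a quick sanity check of the figures quoted above (129 measurements of 50 ms per revolution, 2° arm steps up to 90°, a plate speed of roughly 55°/s), the short sketch below derives the plate speed, the number of points per scan, and the total scan time. Whether both end positions of the arm are actually sampled is an assumption of this sketch, so the counts are approximate.

```python
# Rough consistency check of the scan parameters quoted in the text.
measurements_per_cycle = 129      # distance readings per 360 deg plate revolution
time_per_measurement = 0.050      # s per reading
arm_step_deg = 2                  # arm increment after each full revolution
arm_range_deg = 90                # arm sweeps from 0 to 90 deg

cycle_time = measurements_per_cycle * time_per_measurement   # 6.45 s per revolution
plate_speed = 360.0 / cycle_time                             # ~55.8 deg/s, i.e. the "~55 deg/s" above
arm_positions = arm_range_deg // arm_step_deg + 1            # 46 positions (0 and 90 deg assumed included)
points_per_scan = arm_positions * measurements_per_cycle     # ~5934 points per object
scan_time_min = arm_positions * cycle_time / 60.0            # ~4.9 minutes per scan

print(round(plate_speed, 1), points_per_scan, round(scan_time_min, 1))
```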
The dynamics of the 3D scanner is illustrated through the flowchart in Fig. 3. The rotating structure keeps rotating until it finishes one cycle of 360°. After this, the system increments the sensor's angle by 2°. This dynamic keeps repeating until the sensor's angle reaches 90°. Afterward, the data, in spherical coordinates, is saved to a point cloud file. The data conversion to Cartesian coordinates is explained in detail in Sect. 4. The algorithm that generates the STL files is outside the scope of this work; for further information the reader is referred to [5].

Fig. 3. 3D scanner dynamics flowchart.

4 Methodology

The notation used in this work to explain the system considers spherical coordinates, (l, θ, ϕ). The scanning process starts by placing the object on the center of the rotating plate (P = (0, 0, hp)), which is directly above the global reference frame. The system is represented in Fig. 4, from an XZ plane view. First, θ is the slope's angle measured from the X axis to the Z axis, with range [0, π/2] rad. Also, ϕ is the signed azimuthal angle measured from the X axis, in the XY plane, with range [0, 2π]. The parameter hp represents the distance from the rotating plate to the arm rotation point, and hs is the perpendicular distance from the arm to the sensor. The parameter l defines the length of the mechanical arm and d represents the distance from the sensor to the boundary of the object, which is located on the rotating plate. The position of the sensor s (red circle in Fig. 4), Xs = (xs, ys, zs), in terms of the parameters (l, θ, ϕ), can be defined as follows, assuming the reference origin at the center of the rotating base:

\[
\begin{cases}
x_s = (l\cos(\theta) - h_s\sin(\theta))\cos(\varphi) \\
y_s = (l\cos(\theta) - h_s\sin(\theta))\sin(\varphi) \\
z_s = l\sin(\theta) + h_s\cos(\theta) - h_p
\end{cases}
\tag{1}
\]

Therefore, Xp = (xp, yp, zp), which is the position of a scanned point at a given time in relation to the rotating plate reference frame, is represented by a translation considering ||Xs|| and d. Thus, it is possible to calculate Xp as:
\[
X_p = \left(1 - \frac{d}{\lVert X_s \rVert}\right) X_s
\tag{2}
\]

where \(\lVert X_s \rVert = \sqrt{x_s^2 + y_s^2 + z_s^2}\) is the Euclidean norm of (xs, ys, zs).

Fig. 4. Mechanical design of the proposed system, XZ plane view. The red circle represents the sensor and the dotted line represents its beam.

4.1 Problem Definition

The described calculations to obtain the position of a given scanned point Xp made several assumptions that, although always true in a simulated scenario, do not necessarily hold in real environments. If these assumptions are not met, the coordinate transformations do not hold and, therefore, the point cloud and its mesh become distorted. These assumptions, and consequently the optimization problem, are described in this subsection.

First, there may exist an offset in θ (the arm's rotation angle), denoted by θoffset, because the resting position of the arm can be displaced from the initially intended position. Following the same reasoning, an offset in ϕ, represented by ϕoffset, is also possible, as the rotating plate can be displaced at the beginning of the scanning procedure. Finally, the scanned object may be displaced from the center of the rotating plate. This offset can be taken into account as an offset in the x and y coordinates of the sensor's position, xoffset and yoffset respectively, as the effect is the same as applying an offset to the object's position. Thus, Eq. (1) is updated to Eq. (3) according to these offsets:

\[
\begin{cases}
x_s = (l\cos(\theta') - h_s\sin(\theta'))\cos(\varphi') + x_{\text{offset}} \\
y_s = (l\cos(\theta') - h_s\sin(\theta'))\sin(\varphi') + y_{\text{offset}} \\
z_s = l\sin(\theta') + h_s\cos(\theta') - h_p
\end{cases}
\tag{3}
\]

where θ′ = θ + θoffset and ϕ′ = ϕ + ϕoffset.

Finally, in theory, the sensor must be orthogonal to the rotating arm. However, the sensor may become rotated by γoffset degrees during the assembly of the real system. Thus, θ′′ = θ′ + γoffset. This behaviour is better understood in Fig. 5, where it is possible to see that, if there is no offset in the sensor's rotation, θ′′ = θ′.
Fig. 5. Updated mechanical design of the proposed system with offsets, XZ plane view.

Finally, to take θ′′ and the object's offset into account in the coordinate transformations, it is necessary to make some modifications to the calculations computing Xp. Therefore, two 2D rotations, in θ′′ and ϕ′, are applied to direct a unit vector, Xt = (xt, yt, zt), along the sensor beam direction:

\[
\begin{cases}
x_t = \cos(\varphi')\,x_u - \sin(\varphi')\,z_t \\
y_t = \sin(\varphi')\,x_u + \cos(\varphi')\,z_t \\
z_t = -\sin(\theta'')
\end{cases}
\tag{4}
\]

where \(x_u = -\cos(\theta'')\). After the rotations, the unit vector is scaled by d to obtain the distance from the boundary of the object to the sensor in relation to the center of the plate. At last, Xp is given by the sum of the sensor's position, Xs, and the scaled rotated unit vector:

\[
X_p = X_s + d\,X_t
\tag{5}
\]

Thus, the intent is to optimize the following problem:

\[
\min \sum_{i=1}^{n} \lVert X_{p_i} - X_i \rVert^2
\tag{6}
\]

where Xp, as previously mentioned, represents the scanned points in relation to the center of the support plate, X = (x, y, z) is the nearest point of the original object boundary, and n is the number of points in the cloud.
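To make the model concrete, the sketch below is one possible implementation of the forward model and cost function of Eqs. (3)–(6), assuming the decision variables are the five offsets (xoffset, yoffset, ϕoffset, θoffset, γoffset). The function names are ours, and the k-d tree nearest-neighbour query is only one convenient way of finding the closest object boundary point; the paper does not state how that lookup was performed.

```python
import numpy as np
from scipy.spatial import cKDTree

def scanned_point(l, hs, hp, theta, phi, d, offsets):
    """Eqs. (3)-(5): map one measurement (theta, phi, d) to a Cartesian point."""
    x_off, y_off, phi_off, theta_off, gamma_off = offsets
    th1, ph1 = theta + theta_off, phi + phi_off        # theta', phi'
    th2 = th1 + gamma_off                              # theta''
    # Eq. (3): sensor position including the offsets
    xs = (l * np.cos(th1) - hs * np.sin(th1)) * np.cos(ph1) + x_off
    ys = (l * np.cos(th1) - hs * np.sin(th1)) * np.sin(ph1) + y_off
    zs = l * np.sin(th1) + hs * np.cos(th1) - hp
    # Eq. (4): unit vector along the sensor beam
    xu, zt = -np.cos(th2), -np.sin(th2)
    xt = np.cos(ph1) * xu - np.sin(ph1) * zt
    yt = np.sin(ph1) * xu + np.cos(ph1) * zt
    # Eq. (5): scanned point = sensor position + scaled beam vector
    return np.array([xs, ys, zs]) + d * np.array([xt, yt, zt])

def objective(offsets, scans, object_points, geometry):
    """Eq. (6): sum of squared distances from the cloud to the object boundary."""
    l, hs, hp = geometry
    tree = cKDTree(object_points)                      # boundary points of the original object
    cloud = np.array([scanned_point(l, hs, hp, th, ph, d, offsets)
                      for (th, ph, d) in scans])
    dist, _ = tree.query(cloud)                        # distance to the nearest boundary point
    return float(np.sum(dist ** 2))
```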
4.2 Optimization

To solve the optimization problem defined in (6), the Particle Swarm Optimization (PSO) algorithm was used. This algorithm belongs to the Swarm Intelligence class of algorithms, inspired by natural social intelligent behaviors. Developed by Kennedy and Eberhart (1995) [10], this method consists in finding the solution through the exchange of information between individuals (particles) that belong to a swarm. PSO is based on a set of operations that promote the movement of each particle towards promising regions of the search space [3,4,10]. This approach was chosen because it has several advantages, according to [12], such as:
– It is simple in concept and implementation in comparison to other approaches.
– It is less sensitive to the nature of the objective function than other methods.
– It has a limited number of hyperparameters in relation to other optimization techniques.
– It is less dependent on the initial parameters, i.e., the convergence of the algorithm is robust.
– It has a fast calculation speed and stable convergence characteristics with high-quality solutions.

However, every approach has its limitations and disadvantages. For instance, the authors in [14] state that one of the disadvantages of PSO is that the algorithm easily falls into local optima in high-dimensional spaces and, therefore, has a low convergence rate in the iterative process. Nevertheless, the approach showed good results for our system.
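For reference, a minimal PSO loop of the kind described above is sketched below, applied to the objective of Eq. (6). The swarm size, inertia and acceleration coefficients, and search bounds are illustrative choices, not the exact settings used by the authors.

```python
import numpy as np

def pso(cost, bounds, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimizer over box bounds [(lo, hi), ...]."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    dim = len(lo)
    x = rng.uniform(lo, hi, size=(n_particles, dim))       # particle positions
    v = np.zeros_like(x)                                    # particle velocities
    pbest = x.copy()                                        # personal best positions
    pbest_f = np.array([cost(p) for p in x])
    gbest = pbest[np.argmin(pbest_f)].copy()                # global best position
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        f = np.array([cost(p) for p in x])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        gbest = pbest[np.argmin(pbest_f)].copy()
    return gbest, float(pbest_f.min())

# Hypothetical usage with the objective sketched earlier (bounds are guesses):
# offsets, fmin = pso(lambda o: objective(o, scans, object_points, geometry),
#                     bounds=[(-0.1, 0.1)] * 2 + [(-np.pi, np.pi)] * 3)
```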
5 Results

To test the approach, three case objects were analyzed: a centered sphere, a sphere, and a cylinder. The first and second cases were spheres with a radius of 0.05 m. The first sphere was centered on the rotating plate, whereas the other had an offset of X, Y = [0.045, 0.045] m. Finally, the third object was a cylinder with h = 0.08 m and r = 0.06 m. Since PSO is a stochastic method, 20 independent runs were executed to obtain more reliable results.

Table 1 presents the results obtained by PSO, describing the best minimum value obtained over the executed runs, the average of the minima over the runs (Av) and the respective standard deviation (SD). To analyze the behavior of the PSO method, the results considering all runs (100%) and the 90% best runs are both presented.

Table 1. PSO results for the three objects.

                    Minimum         | 100%: Av        SD            | 90%: Av         SD
  Centered Sphere   7.117 × 10^-8   | 7.117 × 10^-8   5.8 × 10^-14  | 7.117 × 10^-8   6.06 × 10^-14
  Sphere            4.039 × 10^-5   | 4.039 × 10^-5   1.3 × 10^-13  | 4.039 × 10^-5   1.18 × 10^-13
  Cylinder          3.325 × 10^-2   | 3.325 × 10^-2   3.2 × 10^-7   | 3.325 × 10^-2   2.4 × 10^-7

First, the results were very consistent for all cases, as can be verified by the standard deviations, average values and minima being practically zero. This is also corroborated by comparing the results from all runs (100%) with the 90% best runs: the averages from 100% match the averages from the 90% best runs, and the standard deviations keep the same order of magnitude. Another way to look at the consistency is to compare the best minimum value of the executed runs with the respective averages, where all cases match. Still, the only case where the PSO algorithm suffered a little was the cylinder where, although the minimum tended to 0, it did not reach the same order of magnitude as the other cases. Therefore, we can confirm that the PSO achieved the global optimum in the sphere cases; the cylinder case, on the other hand, is dubious. There is also the possibility that the objective function for the cylinder case needs some improvements.

The minimizers are described in Table 2. It is possible to see that, for the centered sphere, all the parameters are practically 0. This is expected, since the centered sphere did not have any offsets. The same can be said for the cylinder. Finally, for the sphere, it is possible to see that, although the sphere had offsets of X, Y = [0.045, 0.045], the minimizers of X, Y are practically zero. This is also expected: as the sphere was displaced on the rotating plate, the sensor acquired data points from the plate itself (at the center). Therefore, the portion of the points representing the sphere was already "displaced" during the acquisition, i.e., the offsets in X, Y were already accounted for during the acquisition and, by consequence, during the transformation to Cartesian coordinates. Therefore, the only offset that had a real impact on the transformation was the ϕoffset minimizer, which accounts for a rotation about the Z axis (to rotate the point cloud to match the object). Of course, the other minimizers had their impact as well during the transformation.

Table 2. Minimizers obtained by PSO.

                    xoffset [m]       yoffset [m]       ϕoffset [rad]     θoffset [rad]     γoffset [rad]
  Centered Sphere   -4.704 × 10^-7    -5.472 × 10^-8    -5.031 × 10^-6    +4.953 × 10^-5    +2.175 × 10^-5
  Sphere            -7.579 × 10^-3     8.087 × 10^-2     1.566 × 10^-1    -8.885 × 10^-5    -1.661 × 10^-4
  Cylinder           5.055 × 10^-4    -3.371 × 10^-4    -2.862 × 10^-1     5.046 × 10^-2     1.057 × 10^-1

The behavior of the PSO approach for the centered sphere, its representation in simulation, and its optimized point cloud over the original object can be seen, respectively, in Figs. 6, 7a and 7b. As can be seen from the illustrations, and from Tables 1 and 2, the PSO managed to find the global optimum of the system's parameters. Note that, in Fig. 6, the algorithm converged to the global minimum near iteration 15; nevertheless, it ran for more than 70 iterations (note that these iterations are from the PSO algorithm itself, so they belong to just one of the 20 executed runs).
Fig. 6. PSO behavior regarding the centered sphere case.

Fig. 7. Simulated system (left) and optimized point cloud (right): (a) simulation system with the centered sphere; (b) optimized point cloud of the centered sphere plotted over the object's original dimensions.

Thus, the generated point cloud fit perfectly to the original object, as can be seen in Fig. 7b. Following the same reasoning, the behavior of the PSO approach for the cylinder, its system simulation, and its optimized point cloud over the original object can be seen, respectively, in Figs. 8, 9a and 9b. It is possible to see in Fig. 8 that, after 10 iterations, the PSO algorithm converged to the minimum value displayed in Table 1, generating the parameters described in Table 2. The optimized point cloud, however, suffered a tiny distortion on its sides, as can be seen in Fig. 9b. This probably happened because of numerical rounding; the distortion is mainly caused by the small values of θoffset and γoffset.
Fig. 8. PSO behavior regarding the cylinder case.

Fig. 9. Simulated system (left) and optimized point cloud (right): (a) simulation system with the cylinder; (b) optimized point cloud of the cylinder plotted over the object's original dimensions.

Finally, the behavior of the PSO approach for the sphere with offset, its system simulation, and its optimized point cloud over the original object can be seen, respectively, in Figs. 10, 11a and 11b. As in the centered sphere case, the point cloud fits perfectly over the object with the dimensions of the original object. This is expected, since the PSO managed to find the global optimum of the objective function near iteration 10, generating the optimal parameters for the system, as can be seen in Tables 1 and 2.
Fig. 10. PSO behavior regarding the sphere with offset case.

Fig. 11. Simulated system (left) and optimized point cloud (right): (a) simulation system with the sphere with offset; (b) optimized point cloud of the sphere with offset plotted over the object's original dimensions.

6 Conclusions and Future Work

This work proposed a solution to the imperfections that are introduced when assembling a 3D scanner, due to assumptions that do not typically hold in a real scenario. Some uncertainties associated with angles and offsets, which may arise during the assembly procedures, can be resolved by resorting to optimization techniques. The Particle Swarm Optimization algorithm was used in this work to estimate the aforementioned uncertainties, whose outputs allowed minimizing the error between the position and orientation of the object and the resulting
    Optimal Parameters ofa 3D Scanner Through PSO 151 final point cloud. The cost function considered the minimization of the quadratic error of the distances which were minimized close to zero in all three cases, val- idating the proposed methodology. As future work, this optimization procedure will be implemented on the real 3D scanner to reduce the distortions on the scanning process. Acknowledgements. The project that gave rise to these results received the support of a fellowship from ”la Caixa” Foundation (ID 100010434). The fellowship code is LCF/BQ/DI20/11780028. This work has also been supported by FCT - Fundação para a Ciência e Tecnologia within the Project Scope: UIDB/05757/2020. References 1. Ajbar, W., et al.: The multivariable inverse artificial neural network combined with ga and pso to improve the performance of solar parabolic trough collector. Appl. Thermal Eng. 189, 116651 (2021) 2. Arbutina, M., Dragan, D., Mihic, S., Anisic, Z.: Review of 3D body scanning systems. Acta Tech. Corviniensis Bulletin Eng. 10(1), 17 (2017) 3. Bento, D., Pinho, D., Pereira, A.I., Lima, R.: Genetic algorithm and particle swarm optimization combined with powell method. Numer. Anal. Appl. Math. 1558, 578– 581 (2013) 4. Bratton, D., Kennedy, J.: Defining a standard for particle swarm optimization. IEEE Swarm Intell. Sympo. (2007) 5. Braun, J., Lima, J., Pereira, A., Costa, P.: Low-cost 3d lidar-based scanning system for small objects. In: 22o ¯ International Conference on Industrial Technology 2021. IEEE proceedings (2021) 6. Franca, J.G.D.M., Gazziro, M.A., Ide, A.N., Saito, J.H.: A 3d scanning system based on laser triangulation and variable field of view. In: IEEE International Conference on Image Processing 2005. vol. 1, pp. I-425 (2005). https://doi.org/10. 1109/ICIP.2005.1529778 7. Ghorbani, E., Moosavi, M., Hossaini, M.F., Assary, M., Golabchi, Y.: Determina- tion of initial stress state and rock mass deformation modulus at lavarak hepp by back analysis using ant colony optimization and multivariable regression analysis. Bulletin Eng. Geol. Environ. 80(1), 429–442 (2021) 8. He, Z., Shi, T., Xuan, J., Jiang, S., Wang, Y.: A study on multivariable optimization in precision manufacturing using mopsonns. Int. J. Precis. Eng. Manuf. 21(11), 2011–2026 (2020) 9. Jin, C., Li, S., Yang, X.: Adaptive three-dimensional aggregate shape fitting and mesh optimization for finite-element modeling. J. Comput. Civil Eng. 34(4), 04020020 (2020) 10. Kennedy, J., Eberhart, R.: Particle swarm optimization. IEEE International Con- ference on Neural Network, pp. 1942–1948 (1995) 11. Kolotouros, N., Pavlakos, G., Black, M.J., Daniilidis, K.: Learning to reconstruct 3d human pose and shape via model-fitting in the loop. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2252–2261 (2019) 12. Lee, K.Y., Park, J.B.: Application of particle swarm optimization to economic dis- patch problem: advantages and disadvantages. In: 2006 IEEE PES Power Systems Conference and Exposition, pp. 188–192 (2006). https://doi.org/10.1109/PSCE. 2006.296295
    152 J. Braunet al. 13. Lempitsky, V., Boykov, Y.: Global optimization for shape fitting. In: 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2007). https:// doi.org/10.1109/CVPR.2007.383293 14. Li, M., Du, W., Nian, F.: An adaptive particle swarm optimization algorithm based on directed weighted complex network. Math. Probl. Eng. 2014 (2014) 15. Ma, T.: Filtering adaptive tracking controller for multivariable nonlinear systems subject to constraints using online optimization method. Automatica 113, 108689 (2020) 16. Rehman, W.U., et al.: Model-based design approach to improve performance char- acteristics of hydrostatic bearing using multivariable optimization. Mathematics 9(4), 388 (2021) 17. Soltani, S., et al.: The implementation of artificial neural networks for the mul- tivariable optimization of mesoporous nio nanocrystalline: biodiesel application. RSC Advances 10(22), 13302–13315 (2020) 18. Straub, J., Kading, B., Mohammad, A., Kerlin, S.: Characterization of a large, low-cost 3d scanner. Technologies 3(1), 19–36 (2015) 19. Swathi, A.V.S., Chakravarthy, V.V.S.S.S., Krishna, M.V.: Circular antenna array optimization using modified social group optimization algorithm. Soft Comput. 25(15), 10467–10475 (2021). https://doi.org/10.1007/s00500-021-05778-2 20. Wang, W., Li, Y., Hu, B.: Real-time efficiency optimization of a cascade heat pump system via multivariable extremum seeking. Appl. Thermal Eng. 176, 115399 (2020)
    Optimal Sizing ofa Hybrid Energy System Based on Renewable Energy Using Evolutionary Optimization Algorithms Yahia Amoura1(B) , Ângela P. Ferreira1 , José Lima1,3 , and Ana I. Pereira1,2 1 Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, Bragança, Portugal {yahia,apf,jllima,apereira}@ipb.pt 2 Algoritmi Center, University of Minho, Braga, Portugal 3 INESC-TEC - INESC Technology and Science, Porto, Portugal Abstract. The current trend in energy sustainability and the energy growing demand have given emergence to distributed hybrid energy sys- tems based on renewable energy sources. This study proposes a strategy for the optimal sizing of an autonomous hybrid energy system integrat- ing a photovoltaic park, a wind energy conversion, a diesel group, and a storage system. The problem is formulated as a uni-objective func- tion subjected to economical and technical constraints, combined with evolutionary approaches mainly particle swarm optimization algorithm and genetic algorithm to determine the number of installation elements for a reduced system cost. The computational results have revealed an optimal configuration for the hybrid energy system. Keywords: Renewable energy · Hybrid energy system · Optimal sizing · Particle swarm optimisation · Genetic algorithm 1 Introduction The depletion of fossil resources, impacts of global warming and the global aware- ness of energy-related issues have led in recent years to a regain of interest in Renewable Energy Sources (RES), these latter benefit from several advantages which enable them to be a key for the major energy problems. On the other hand, a remarkable part of the world’s population does not have access to electricity (approximately 13% of the population worldwide in 2016) [1], which creates an important limitation for their development. Nevertheless, these needs can be covered by a distributed generation provided by renewable energy systems with the implementation of insulated microgrids based on Hybrid Energy Systems (HES) offer a potential solution for sustainable, energy-efficient power supply, providing a solution for increasing load growth and they are a viable solution to c Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 153–168, 2021. https://doi.org/10.1007/978-3-030-91885-9_12
    154 Y. Amouraet al. the electrification of remote areas. These microgrids can be exploited connected to the main grid, with the latter acting as an energy buffer, or in islanded mode. One of the most important benefits of the RES remains in their contribution to the solution of global warming problems by avoiding damaging effects such as greenhouse gas emissions (GHG), stratospheric ozone hole, etc. Despite technological advances and global development, renewable energy sources do not represent a universal solution for all current electricity supply problems. There are several reasons for this, for instance, the low efficiency compared to conventional sources. Economically, RES provides extra costs and their payback takes a longer period [2]. In addition, RES are in a large extent, unpredictable and intermittent due to the stochastic nature of, for instance, solar and wind sources. To overcome the mismatch between the demand and the availability of renewable energy sources, typically microgrids are based on the hybridization of two or more sources and/or the exploitation of storage systems [3]. By this way, the complementarity between different renewable energy sources, together with dispatchable conventional sources, gives the emergence of hybrid energy systems (HES) which represent an ideal solution for the indicated problems. Under the proposed concept of HES, in the event that renewable energy production exceeds the demand, the energy surplus is stored in system devices as chemical ones or hydrogen-based storage systems, or, in alternative, delivered to the grid [4]. To increase the interest and improve the reliability of hybrid energy systems, it is important to provide new solutions and address an essential problem con- sisting of their optimal sizing. To deal with this type of problem, researchers have often referred to optimization methods. For instance, in [5] researchers have used the Sequential Quadratic Programming (SQP) method. Navid et al. [6] have used the Mixed Integer Non-Linear Programming (MINLP) method, the stochastic discrete dynamic programming has been used on [7]. Kusakana et al. [8] have developed an optimal sizing of a hybrid renewable energy plant using the Linear Programming (LP) method. Among other approaches, in [9] metaheuristics have proved to be an ideal optimization tool for the resolution of sizing problems. The work presented in this paper presents a real case study addressing the optimal sizing of a hybrid energy system in the campus of Santa Apolónia located in Bragança, Portugal. The objective of the work is to determine the optimal number of renewable energy-based units: photovoltaic modules and wind con- version devices on the one hand, and the optimal number of batteries for storage on the other hand. The contribution of this article consists in the development of an optimal sizing software based on optimization approaches. Because of the stochastic effect of the problem that represents a random effect especially in terms of power produced by the renewable energies sources, it is proposed to deal with the optimization problem using metaheuristic algorithms, mainly, Par- ticle Swarm Optimisation (PSO) algorithm and Genetic algorithm (GA). The PSO is a modern heuristic search approach focused on the principle of swarm- ing and collective behavior in biological populations. PSO and GA belong to
    Optimal Sizing ofa Hybrid Energy System 155 the same category of optimization methods and they are both community-based search approaches that depend on knowledge sharing among population mem- bers to improve their search processes. Throughout this work, it is performed a comparison of the computational performance of the two approaches, i.e., which algorithm will allow achieving the most optimal configuration concerning the size of the hybrid energy system while satisfying the economic aspects and the technical constraints related to the problem. The remaining parts of the paper are organized as follows: Sect. 2 describes the configuration of the HES. In Sect. 3, the meteorological data used is pre- sented. Section 4 presents the modeling of the proposed hybrid energy system. Section 5 define the sizing methodology adopted. The sizing problem is, then, for- mulated as a uni-objective optimization problem in Sect. 6. The achieved results are presented and discussed in Sect. 7. Finally, Sect. 8 concludes the study and proposes guidelines for future works. 2 HES Configuration Architecture The HES to be developed comprises two main renewable energy sources: Pho- tovoltaic and wind. In order to ensure the balance between the demand and the energy produced, and given that the non dispatchable behavior of those sources, the microgrid also includes a diesel generator and a battery bank. The HES is developed to operate off-grid. Figure 1 shows the architecture of the chosen configuration. Fig. 1. Architecture of the hybrid energy system.
    156 Y. Amouraet al. The battery system is connected to the DC bus which will be connected bidirectionally to allow it operating in discharge when there is a shortage of energy and in charge when there is a surplus of energy. An AC bus connects the AC loads, the diesel generator, the wind turbine generators and a photovoltaic system. The energy circulates through a bi-directional converter. When there is an excess of energy in the AC bus, the converter acts as a rectifier to recharge the batteries. In addition, when there is an energy deficit, the converter acts as an inverter to transfer energy from the DC bus to the load. In cases when the demand is higher than the renewable energy production and batteries are below the minimum charging level, the bi-directional inverter can automatically start a backup diesel group. A control and energy management system guarantees the energy flow and the monitoring of the microgrid power quality. 3 Study Data Meteorological data are an important point for the optimal conception of the hybrid energy system to avoid an under or oversizing. Nevertheless, the infor- mation of the solar and wind potential defined by the irradiation and the wind speed respectively are the key variables to identify the sources that constitute the HES. The sampling data were taken at the laboratory level of the Polytechnic Institute of Bragança (IPB), Portugal, Latitude: 41.8072, Longitude: −6.75919 41o 48’ 26” North, 6o 45’ 33” West with an elevation of 690 m. This section will describe the average solar and wind potential data for one year (from January 1, 2019, to December 31, 2019) as well as the average load data. The study data are presented in Fig. 2. In this work, the solar irradiation data were taken practically on-site using a pyranometer exposed on a 30◦ south-facing support. Figure 2a shows the daily average solar irradiation data. The values of the ambient temperature are indispensable for the prediction of photovoltaic power. The more the temperature rises, the more the cell voltage decreases, which leads to a decrease in the power of the module. In this study, the measurement process of the temperatures is evaluated every hour during the day by using a temperature sensor, thermocouple K-Type. Figure 2b shows the daily average temperature data for 2019 year. The sizing of a wind turbine system depends essentially on the knowledge of wind speeds. For this reason, a series of precise and local measurements at the study site were performed. In this work, the wind speed is measured through an anemometer placed at 10 m from the ground surface. Figure 2c shows the average wind speed data for the time span under consideration. All the average values of wind speed are very random and stochastic, this comes back to the nature of the wind which is unpredictable. A considerable amount of wind circulating is observed in the afternoon, from 12:00 to 20:00 during the measurement period. On average, it reaches its optimum at 15:00 for a value of 2.93 m/s. From the collected data, the wind turbine characteristics
    Optimal Sizing ofa Hybrid Energy System 157 (a) Average solar irradiation. (b) Average temperature. (c) Average wind speed. (d) Average load profile. Fig. 2. Study data. (cut in speed, cut-off speed, and rated speed) are evaluated, to select the optimal type of wind turbine for the project, from the technical point of view. Besides the environmental data, necessary to select the renewable source units, the dimensioning of the back up systems (diesel group and battery bank) requires the characterization of the demand profile. The average load profile is measured in kWh. There is a significant activity of the load during working hours from 08:00 am, which reaches a peak consumption of 7.68 kWh at 11:00 am. On the other hand, during off-peak hours, there is a certain consumption considered as uninterruptible load. Figure 2d illustrates the average load profile. 4 System Modeling The optimization of the synergy between different sources in a HES, while main- taining their best operational characteristics, requires the mathematical model- ing of the energy sources’ behavior, which must reflect the system response during its operation, when exposed to all types of external environments. According to the potential present in the study site, the hybrid energy sys- tem uses a combination of renewable devices (photovoltaic and wind turbines), a conventional diesel group, as a back-up system, and an energy storage sys- tem consisting of a set of electrochemical batteries. This section presents the mathematical modeling of the different hybrid energy system sources as well
as the characteristics of the device units adopted, taking into consideration the environmental data collected and presented in the previous section.

4.1 Photovoltaic Energy System

Considering the objectives of this work, the most suited model for the photovoltaic energy conversion system is the power (input/output) model [10], which has the advantage of estimating the energy produced by the photovoltaic apparatus from the global irradiation data and the ambient temperature on an inclined plane, as well as from the manufacturer's data for the selected photovoltaic module. The power output of a photovoltaic module can be calculated according to the following equation:

\[
P_{pv}(t) = \eta\, S\, G
\tag{1}
\]

being η the instantaneous efficiency of the photovoltaic module, S the surface of the photovoltaic module and G the global irradiation on an inclined plane. The instantaneous efficiency is given by

\[
\eta = \eta_r \left(1 - \gamma\,(T_c - T_0)\right)
\tag{2}
\]

where ηr is the reference efficiency of the photovoltaic module under standard conditions (T0 = 25 °C, G0 = 1000 W/m², and AM = 1.5), γ is the temperature coefficient, determined experimentally as the variation in module efficiency for a 1 °C variation in cell temperature (it typically varies between 0.004 and 0.006 per °C), and Tc is the temperature of the module, which varies with the irradiance and the ambient temperature as follows:

\[
T_c = T_a + \frac{NOCT - 20}{800}\, G
\tag{3}
\]

being NOCT the nominal operating cell temperature and Ta the ambient temperature in °C. Thus, the instantaneous total power of the photovoltaic array, P^t_pv, is written as follows:

\[
P_{pv}^{t}(t) = P_{pv}(t)\, N_{pv}
\tag{4}
\]

where Npv is the number of modules to be used in the photovoltaic array. The main characteristics of the selected photovoltaic module are presented in [11].
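A direct transcription of Eqs. (1)–(4) is sketched below. The default parameter values (module area, reference efficiency, temperature coefficient, NOCT) are illustrative placeholders, not the datasheet values of the module selected in [11].

```python
def pv_module_power(G, Ta, S=1.6, eta_r=0.17, gamma=0.0045, NOCT=45.0, T0=25.0):
    """Power of one PV module [W] from irradiance G [W/m^2] and ambient temperature Ta [degC]."""
    Tc = Ta + (NOCT - 20.0) / 800.0 * G        # Eq. (3): cell temperature
    eta = eta_r * (1.0 - gamma * (Tc - T0))    # Eq. (2): instantaneous efficiency
    return eta * S * G                         # Eq. (1): module output power

def pv_array_power(G, Ta, Npv, **module_params):
    """Eq. (4): total instantaneous power of an array of Npv identical modules."""
    return Npv * pv_module_power(G, Ta, **module_params)
```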
4.2 Wind Energy Conversion System

The power produced by a wind energy system at a given site depends on the wind speed at hub height, the air density, and the specific characteristics of the turbine. The wind speed (in m/s) at any hub height h can be estimated using the following law:

\[
v = v_i \left(\frac{h}{h_i}\right)^{\lambda}
\tag{5}
\]

being hi the height of the measurement point, generally taken as 10 m, vi the speed measured at height hi (m/s), and λ an exponent typically taken as 1/7. The power produced by a wind energy system unit, Pwt, is modeled with a linear model [12], as follows:

\[
P_{wt} =
\begin{cases}
0, & v \ge v_c \ \text{or}\ v < v_d \\[4pt]
P_n \dfrac{v - v_d}{v_n - v_d}, & v_d \le v < v_n \\[4pt]
P_n, & v_n \le v < v_c
\end{cases}
\tag{6}
\]

where Pn is the rated electrical power of the wind turbine unit, v is the wind speed, vn is the rated speed of the wind turbine, vc is the cut-off speed and vd is the cut-in speed of the wind turbine. The instantaneous total power produced by a wind farm, P^t_wt, is given by the number of wind energy units, Nwt, times the power of each unit, as follows:

\[
P_{wt}^{t}(t) = P_{wt}(t)\, N_{wt}
\tag{7}
\]

The features of the selected unit of the wind energy conversion system are presented in [13].

4.3 Energy Storage System

The modeling of the energy storage system (ESS) is divided into two steps, according to two operations. The first step consists of evaluating the total capacity of the battery bank for an autonomy duration determined by the required specifications. In this way it is possible to determine the capacity of a battery unit and, therefore, the number of batteries for the required autonomy, without the intervention of other production sources. The second step is the study of the batteries' behavior in the chosen sizing configuration, to assess the feasibility of the first step. Therefore, it is possible to know the exact number of batteries that will operate in parallel with the other production sources, mainly the photovoltaic and wind turbine systems, without any system interruption.

Sizing the Capacity of the Battery Bank. The total rated capacity of the batteries, C, should satisfy the following formulation:

\[
C = \frac{N_j\, E_{need}}{P_d\, K_t}
\tag{8}
\]

being Nj the required autonomy in days, Eneed the daily energy, Pd the allowed depth of discharge and Kt the temperature coefficient of the capacity.
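Eqs. (5)–(8) translate directly into the helper functions below; the turbine and battery parameters would come from the unit datasheets referenced in [13] and [14], so the arguments are left generic here.

```python
def wind_speed_at_hub(vi, h, hi=10.0, lam=1.0 / 7.0):
    """Eq. (5): extrapolate the speed vi measured at height hi to hub height h."""
    return vi * (h / hi) ** lam

def wind_turbine_power(v, Pn, vd, vn, vc):
    """Eq. (6): piecewise-linear power curve of a single turbine."""
    if v < vd or v >= vc:                  # below cut-in or above cut-off speed
        return 0.0
    if v < vn:                             # linear region between cut-in and rated speed
        return Pn * (v - vd) / (vn - vd)
    return Pn                              # rated region

def wind_farm_power(v, Nwt, Pn, vd, vn, vc):
    """Eq. (7): total power of Nwt identical units."""
    return Nwt * wind_turbine_power(v, Pn, vd, vn, vc)

def battery_bank_capacity(Nj, E_need, Pd, Kt):
    """Eq. (8): total rated capacity for Nj days of autonomy."""
    return Nj * E_need / (Pd * Kt)
```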
Battery Energy Modelling. The battery energy model establishes an optimal energy storage management. This operation depends on the previous state of charge and on the differential between the energy produced by the different types of generators, Epr, and the energy initially required by the load, Ed, defined by:

\[
E_d(t) = E_{load}^{t}(t) - E_{pr}(t)
\tag{9}
\]

where

\[
E_{pr}(t) = E_{pv}^{t}(t) + E_{wt}^{t}(t)
\tag{10}
\]

and

\[
\begin{cases}
E_{pv}^{t}(t) = P_{pv}^{t}(t)\,\Delta t \\
E_{wt}^{t}(t) = P_{wt}^{t}(t)\,\Delta t \\
E_{load}^{t}(t) = P_{load}^{t}(t)\,\Delta t
\end{cases}
\tag{11}
\]

knowing that Δt is the simulation step, defined as a one-hour interval. The batteries' state of charge can be evaluated according to two operation modes, described as follows.

• Charging operating mode, if Epr ≥ Ed. The instantaneous stored energy, E(t), is given by the following equation:

\[
E(t) = E(t-1) + \eta_{bat}\left(E_{pr}(t)\,\eta_{conv} - \frac{E_d(t)}{\eta_{conv}}\right)
\tag{12}
\]

being ηbat the battery efficiency and ηconv the converter efficiency.

• Discharging operating mode, if Epr < Ed. Now, the instantaneous stored energy can be expressed as follows:

\[
E(t) = E(t-1) + \frac{1}{\eta_{bat}}\left(E_{pr}(t)\,\eta_{conv} - E_d(t)\right)
\tag{13}
\]

The instantaneous state of charge is given by:

\[
SOC(t) = \frac{E(t)}{E_{total}}
\tag{14}
\]

In addition, the charging and discharging modes should guarantee that the state of charge (SOC) of the energy storage system satisfies the following condition:

\[
SOC_{min} \le SOC(t) \le SOC_{max}
\tag{15}
\]

being SOCmin and SOCmax the minimum and maximum state of charge of the energy storage system, respectively. The specifications of the proposed battery unit are described in [14].
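A one-step update implementing Eqs. (9)–(15) could look like the sketch below. The efficiency values are placeholders, the exact placement of ηconv in the charging equation follows our reading of Eq. (12), and saturating the SOC is just one simple way of enforcing Eq. (15).

```python
def battery_step(E_prev, E_pr, E_load, E_total,
                 eta_bat=0.90, eta_conv=0.95, soc_min=0.2, soc_max=1.0):
    """One simulation step (dt = 1 h) of the storage model, Eqs. (9)-(15)."""
    E_d = E_load - E_pr                                            # Eq. (9)
    if E_pr >= E_d:                                                # charging mode
        E = E_prev + eta_bat * (E_pr * eta_conv - E_d / eta_conv)  # Eq. (12)
    else:                                                          # discharging mode
        E = E_prev + (E_pr * eta_conv - E_d) / eta_bat             # Eq. (13)
    soc = E / E_total                                              # Eq. (14)
    soc = min(max(soc, soc_min), soc_max)                          # enforce Eq. (15) by saturation
    return soc * E_total, soc
```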
4.4 Diesel Generator

The rated power of the diesel generator, Pdg, is chosen to be at least the maximum power required by the load, i.e.,

\[
P_{dg} \ge P_{load}^{max}(t)
\tag{16}
\]

The operating mode of the diesel group is based on two possible scenarios:

• The first scenario occurs when the power satisfies the following relation:

\[
P_{pv}^{t}(t) + P_{wt}^{t}(t) + \frac{E(t)}{\Delta t} \ge P_{load}(t)
\tag{17}
\]

In this case, it is not necessary to start the generator, as the consumption is satisfied.

• The second scenario occurs when the following relation is verified:

\[
P_{pv}^{t}(t) + P_{wt}^{t}(t) + \frac{E(t)}{\Delta t} < P_{load}(t)
\tag{18}
\]

In this case, the generator starts up to cover all or part of the energy demand in the absence of enough power from the other sources. The technical specifications of the chosen diesel group are mentioned in [15].

5 Adopted Sizing Strategy

After performing the load balance and evaluating the solar and wind potential presented earlier in Sect. 3, the renewable sources are designed to continuously supply the required demand, taking into account the autonomy of the load, which is ensured by the storage system for up to 24 h. The storage system is therefore able to supply the energy necessary for the load, without the presence of the other renewable sources, for 24 h. The procedure to be followed is shown in the flowchart of Fig. 3. First, the capacity of the battery bank is defined according to the overall load and the desired autonomy, using Eq. (8) presented in Sect. 4.3. This step allows obtaining the necessary number of batteries, if there are no renewable sources available, for a specific period of autonomy (24 h in this case). The diesel generator is the last-resort backup system, able to satisfy the demand if there is no renewable energy available and the battery bank reaches its minimum state of charge. The intermittent nature of the renewables makes the problem subject to a stochastic treatment. To solve the optimization problem, two metaheuristic methods were used: the Genetic Algorithm (GA) and the Particle Swarm Optimisation (PSO) algorithm. So, after providing the main initial data (meteorological data, constraints and initial parameters), the values of the decision variables obtained by the algorithmic computation represent the optimal number of renewable generators. The algorithm evaluates them recursively until the cost of the installation reaches its minimum value.
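The dispatch rule in Eqs. (17)–(18), combined with the minimum-SOC condition of the sizing strategy, amounts to the simple check sketched below. Tying the battery contribution to the SOC threshold is our reading of the strategy described above, not a formula stated explicitly in the paper.

```python
def diesel_must_start(P_pv_t, P_wt_t, E_bat, P_load, soc, soc_min=0.2, dt=1.0):
    """True when Eq. (18) holds, i.e. renewables plus storage cannot cover the load.
    The battery only contributes while its state of charge is above soc_min."""
    storage_power = E_bat / dt if soc > soc_min else 0.0
    available = P_pv_t + P_wt_t + storage_power
    return available < P_load     # Eq. (17) satisfied -> False; Eq. (18) -> True (start generator)
```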
Fig. 3. Flowchart of the sizing process.

6 Problem Formulation

The sizing of the hybrid energy system is formulated as an optimisation problem. The objective function mainly contains the different power equations described in Sect. 4, in order to consider the technical constraints of the problem. On the other hand, the objective function also considers the economic criteria, i.e., the system purchase costs, the maintenance costs and the component replacement costs. The objective function aims at obtaining the optimal number of hybrid system components while satisfying the technical-economical constraints. The overall cost, in euros, for the system lifespan T, considered equal to 20 years, is given by:

\[
\begin{aligned}
C_t(N_{pv}, N_{wt}, N_b) ={}& N_{pv}\,(C_{pv} + C_{pv} K_{pv} + T M_{pv}) + N_{wt}\,(C_{wt} + C_{wt} K_{wt} + T M_{wt}) \\
&+ N_b\,(C_b + C_b K_b + (T - K_b - 1) M_b) + C_{die} + C_{conv}
\end{aligned}
\]

where Npv, Nwt, Nb are, respectively, the number of units of photovoltaic modules, wind turbines and batteries; Cpv, Cwt, Cb, Cdie, Cconv are, respectively, the purchase costs of the renewable units (photovoltaic and wind), battery unit, diesel group and overall converters; and Mpv, Mwt, Mb are, respectively, the maintenance costs of the renewable energy systems (photovoltaic and wind) and of the battery bank. Finally, Kpv, Kwt, Kb are, respectively, the number of equipment replacements during the system lifetime.
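The cost expression above maps one-to-one onto the function below. The second function shows one common way of handling the power-balance constraint of the sizing problem presented next, namely a penalty term; the paper does not state which constraint-handling mechanism was actually used, so this is only a sketch.

```python
def total_cost(Npv, Nwt, Nb, Cpv, Cwt, Cb, Cdie, Cconv,
               Mpv, Mwt, Mb, Kpv, Kwt, Kb, T=20):
    """Overall lifespan cost Ct(Npv, Nwt, Nb) as written above (euros)."""
    cost_pv = Npv * (Cpv + Cpv * Kpv + T * Mpv)
    cost_wt = Nwt * (Cwt + Cwt * Kwt + T * Mwt)
    cost_b = Nb * (Cb + Cb * Kb + (T - Kb - 1) * Mb)
    return cost_pv + cost_wt + cost_b + Cdie + Cconv

def penalized_cost(x, cost_params, P_pv, P_wt, P_b, P_load, penalty=1e6):
    """Cost plus a penalty on any power-balance deficit, evaluated per time step
    of the load profile (a hypothetical way of feeding the constraint to PSO/GA)."""
    Npv, Nwt, Nb = x
    deficit = sum(max(0.0, pl - (ppv * Npv + pwt * Nwt + pb * Nb))
                  for ppv, pwt, pb, pl in zip(P_pv, P_wt, P_b, P_load))
    return total_cost(Npv, Nwt, Nb, **cost_params) + penalty * deficit
```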
The sizing optimisation problem can then be written as follows:

\[
\begin{aligned}
\min_{(N_{pv},\, N_{wt},\, N_b)} \quad & C_t(N_{pv}, N_{wt}, N_b) \\
\text{s.t.} \quad & P_{pv}(t)\,N_{pv} + P_{wt}(t)\,N_{wt} + P_b(t)\,N_b \ge P_{load}(t) \\
& 0 \le N_{pv} \le N_{pv}^{max} \\
& 0 \le N_{wt} \le N_{wt}^{max} \\
& 0 \le N_b \le N_b^{max}
\end{aligned}
\tag{19}
\]

The optimization problem is solved by two metaheuristic algorithms, namely the Particle Swarm Optimisation (PSO) algorithm and the Genetic Algorithm (GA). PSO and GA are population-based methods, i.e., they depend on knowledge sharing among population members to improve their search processes.

7 Results and Discussions

This section presents the results obtained with the optimization approach. After the optimal number of units is reached, a technical study is performed to analyze the reliability of the installation with that number of units (photovoltaic modules, wind turbines, batteries, and diesel group). Furthermore, an economical evaluation is presented for the overall system lifetime.

7.1 Optimisation Results

The sizing problem is implemented using the Matlab programming platform. The optimisation procedure is solved with the two evolutionary algorithms, PSO and GA, introduced above. The parameters of the selected PSO are: search dimension = 1, population size = 60, number of iterations = 100, c1 = 2, c2 = 2 and w = 0.9. The GA uses the same population size and number of iterations, with mutation and crossover probabilities of 1% and 75%, respectively. Considering the devices' specifications previously introduced in Sect. 4, the optimisation results are presented in Table 1, for both algorithms. The optimal configuration, which corresponds to the lowest overall cost while satisfying the technical constraints, consists of 50 photovoltaic modules, 3 wind turbines, and a battery bank with 13 units. For the 20 years of the HES lifespan, the total purchase, maintenance and replacement costs are estimated at 157 k€. Figure 4 illustrates the convergence diagram of the two optimisation algorithms, PSO and GA, used to solve the optimization problem defined in Eq. (19). According to Table 1 and Fig. 4, it is remarkable that the PSO presented a superior efficiency over the GA, with the same stopping condition. The path to the optimum by using the Genetic Algorithm is much slower than the one
with PSO, i.e., the computational effort required by PSO for reaching the global optimum is smaller than the effort required to arrive at results of the same high quality with the GA. It should be noted that the PSO search mechanism uses a lower number of function evaluations than the GA, which justifies this outcome. Besides, the effects of velocity and implicit transition rules on particle movement led to a faster convergence of the PSO. In terms of accuracy, the PSO has shown more accurate values than the GA, as indicated by its reduced relative error value.

Table 1. Optimal configuration results.

                                NP      NW     Nb   Ndg   Iterations   CPU time (s)   Error (%)   Total price (€)
  Genetic algorithm             49.68   2.62   13   1     38           0.47           1           156310.38
  Particle swarm optimisation   49.89   2.73   13   1     27           0.23           0.07        156670.61
  Optimal configuration         50      3      13   1     -            -              -           157755.87

Fig. 4. Convergence performances: (a) PSO; (b) GA.

7.2 Technical Validation

This section aims at verifying the optimality of the results obtained from the optimization algorithms, considering the optimal number of devices obtained during the optimization procedure. With this in mind, the overall system is evaluated, from the technical point of view, over a time span of 24 h. Figure 5a shows the power output setpoints of the various HES generators. The balance constraint between production and demand is satisfied throughout the day, since the sum of the power from the sources, including the battery bank, is equal to or higher than the load at every time step. The photovoltaic system operates during the day to supply the total demand, including the charging of the energy storage system. When the solar irradiation decreases and the wind reaches the cut-in speed of the wind turbine system, the latter compensates the energy deficit. In this case, the diesel generator does not start, because it is used only in the absence of the two renewable sources and with the storage system having reached its minimum state of charge (SOCmin).
To analyse the performance of the chosen diesel generator set in this microgrid, the system performance is analysed for 48 h, considering the hypothetical scenario of null production of the renewable sources in the proposed time span. According to Fig. 5b, the diesel generator becomes operational (working as the backup system) after the minimal state of charge of the energy storage system has been reached, which justifies that the number Ndg = 1 is sufficient and optimal to feed the installation in case of a supply absence.

Fig. 5. Optimal power set-points describing the hybrid energy system behavior: (a) presence of all sources; (b) absence of renewable energies.

Figure 6a shows the evolution of the state of charge (SOC) of the battery bank when operating in parallel with the other HES sources, for a time span of 24 h. The energy of the batteries is used (discharged) mainly during the night to supply the loads, while the charging process is ensured during the day by the surplus of the renewable sources over the demand. Figure 6b represents the capacity of the battery bank to maintain a 24-hour autonomy adopting the Cycle Charge Strategy [16], while satisfying the needs of the load, i.e., if the renewable sources (photovoltaic and wind systems) are not available, the storage system is able to supply the demand for 24 h without any interruption of the power supply. The total discharge of the batteries contributes enormously to the reduction of their life span. For this reason, the SOC of the batteries should never go below its minimum value. This constraint is verified, since in this case study the SOC of the energy storage system after 24 h of usage did not sink below the minimum state of 20% defined in the battery specification. Indeed, the number of storage batteries delivered by the optimization algorithm computations discussed above represents an optimal value.
Fig. 6. Storage system behavior: (a) SOC of the battery bank in parallel operation; (b) SOC of the battery bank in autonomous operation.

7.3 Economical Evaluation

In order to evaluate the economic benefit of the overall installation, its profitability during the life cycle is analysed. The lifetime of the HES is 20 years, and the total purchase, maintenance and replacement costs are 157755.87 €. The average power consumption during 24 h is 122.915 kWh, which is equivalent to 898508.65 kWh within 20 years, without taking into account any growth of the load. Portugal has one of the most expensive electricity systems in Europe: according to Eurostat data for 2020, the price of electricity averaged 0.2159 €/kWh including taxes [17]. The estimated cost of the energy bill is 195964.74 €. These values allow highlighting the amount saved by the customer, as well as the payback period. Figure 7 shows an economic estimation for the lifetime under consideration. The black line shows the evolution of the energy bill when consuming conventional energy from the public grid. The red line represents the evolution of the investment cost in the HES: it starts with an initial purchase cost that increases each year with the maintenance of the different elements; every 4 years the batteries are replaced, without taking into account the maintenance cost of the batteries during the year of their replacement. The intersection of the investment cost curve with the conventional energy bill on the time axis represents the moment when the customer recovers the amount invested in the HES; from that moment until the end of the life of the HES, all the money that would otherwise be spent on the energy bill of the installation is saved. It can be seen that the cost of the investment in the HES will be repaid at the end of the 16th year, more precisely in October 2037, if it is considered that the system will be installed on 1 January 2021. In this case, the HES is more profitable than conventional energy and allows the customer to save 35423.49 €.
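The crossover in Fig. 7 can be reproduced schematically with the loop below. The yearly operation-and-maintenance figure and the battery replacement cost are left as placeholders, since the paper only reports aggregate costs; with the authors' actual breakdown the function should return a payback around the 16th year.

```python
def payback_year(initial_cost, yearly_om, battery_replacement, replace_every,
                 yearly_consumption_kwh, price_per_kwh, lifetime=20):
    """Year in which the cumulative grid bill overtakes the cumulative HES cost."""
    invested, bill = initial_cost, 0.0
    for year in range(1, lifetime + 1):
        invested += yearly_om                        # maintenance of all elements
        if year % replace_every == 0 and year < lifetime:
            invested += battery_replacement          # batteries replaced every 4 years
        bill += yearly_consumption_kwh * price_per_kwh
        if bill >= invested:
            return year
    return None                                      # no payback within the lifetime

# Hypothetical call with the aggregate figures quoted above (breakdown assumed):
# payback_year(initial_cost=..., yearly_om=..., battery_replacement=...,
#              replace_every=4, yearly_consumption_kwh=898508.65 / 20,
#              price_per_kwh=0.2159)
```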
    Optimal Sizing ofa Hybrid Energy System 167 Fig. 7. Economic evaluation during the system life-cycle. 8 Conclusions and Future Work In this paper, the problem of the optimal sizing for a hybrid energy system (HES) configuration based on renewable energies was formulated as a uni-objective optimization problem under constraints. Two optimization approaches based on evolutionary algorithms have been considered: Particle Swarm Optimisation (PSO) and the genetic algorithm (GA). The PSO shows better performances in terms of accuracy and convergence speed. The obtained sizing results have been tested and from the technical point of view, the performance of the sys- tem guarantees the energy balance and the remaining constraints. In addition, a simplified economical approach indicates that the return of the investment is possible within the life time of the energy system. As future work, it is proposed to treat the problem in a multi-objective optimization formulation while intro- ducing environmental and economical constraints, whose results would provide a set of scenarios for the optimal configuration of a given HES. On the other hand, it is intended to improve this software providing an interface suitable for all cases of study, allowing to identify the best sizing of a given HES, through a technical and economic analysis of the computational results. Acknowledgements. This work has been supported by FCT - Fundação para a Ciência e Tecnologia within the Project Scope UIDB/05757/2020. References 1. Ritchie, H., Roser, M.: Acces to energy. Our World in Data (2019). http://www. ourworldindata.org/energy-access 2. Ma, W., Xue, X., Liu, G.: Techno-economic evaluation for hybrid renewable energy system: application and merits, Energy. 385–409 (2018). https://doi.org/10.1016/ j.energy.2018.06.101
    168 Y. Amouraet al. 3. Mamen, A., Supatti, U.: A survey of hybrid energy storage systems applied for intermittent renewable energy systems. In: 14th International Conference on Elec- trical Engineering/Electronics, Computer, Telecommunications and Information Technology (2017). https://doi.org/10.1109/ecticon.2017.8096342 4. Gupta, A., Saini, R., Sharma, M.P.: Modelling of hybrid energy system-Part I: problem formulation and model development, Renew. Energy, pp. 459–465 (2011). https://doi.org/10.1016/j.renene.2010.06.035 5. AlHajri, M.F., El-Hawary, M.E.: Optimal distribution generation sizing via fast sequential quadratic programming. IEEE Large Eng. Syst. Conf. Power Eng. (2007). https://doi.org/10.1109/LESCPE.2007.4437354 6. Ghaffarzadeh, N., Zolfaghari, M., Ardakani, F.J., Ardakan, A.J.: Optimal sizing of energy storage system in a micro grid using the mixed integer linear programming. Int. J. Renew. Energy Res. (IJRER). 2004–2016 (2017). https://doi.org/10.1037/ 0003-066X.59.1.29 7. Bakirtzis, A.G., Gavanidou, E.S.: Optimum operation of a small autonomous sys- tem with unconventional energy sources. Electr. Power Syst. Res., 93–102 (1992). https://doi.org/10.1016/0378-7796(92)90056-7 8. Kusakana, K., Vermaak, H.J., Numbi, B.P.: Optimal sizing of a hybrid renewable energy plant using linear programming. IEEE Power Energy Soc. Conf. Expos. Africa Intell. Grid Integr. Renew. Energy Resourc. PowerAfrica (2012). https:// doi.org/10.1109/PowerAfrica.2012.6498608 9. Sharafi, M., Tarek, E.Y.: Multi-objective optimal design of hybrid renewable energy systems using PSO-simulation based approach. Renew. Energy, 67–79 (2014). https://doi.org/10.1016/j.renene.2014.01.011 10. Adamo, F., Attivissimo, F., Nisio, D., Lanzolla, A., Spadavecchia, M.: Parameters estimation for a model of photovoltaic panels. In: XIX IMEKO World Congress Fundamental and Applied Metrology, pp. 6–11. Lisbon, Portugal (2009). https:// doi.org/10.1068/978-963-88410-0-1 11. Zytech Solar Modules Catalogue: Maximum Quality, Efficiency and Reliability. www.Zytech-solar.com 12. Hongxing, Y., Lu, L., Zhou, W.: A novel optimization sizing model for hybrid solar-wind power generation system. Solar Energy 76–84 (2007). https://doi.org/ 10.1016/j.solener.2006.06.010 13. Flex Pro: Series Eol Generator (EOL/3000) www.flexpro-industry.com 14. Ultracell: Solar Series (UCG250-48) ultracell.com 15. Leroy merlin: Hyndai (Hyundai Diesel Generator 12kva Mono And Tri - Dhy12000xse-t) leroymerlin.fr 16. Banguero, E., Correcher, A., Pérez-Navarro, Á., Morant, F., Aristizabal, A.: A review on battery charging and discharging control strategies: application to renew- able energy systems. Energies 1–15 (2018). https://doi.org/10.3390/en11041021 17. Eurostat: Electricity prices (including taxes) for household consumers, first half 2020. http://www.eurostat/statistics-explained, Accessed 18 Aug 2020
    Human Detector SmartSensor for Autonomous Disinfection Mobile Robot Hugo Mendonça1,2(B) , José Lima2,3 , Paulo Costa1,2 , António Paulo Moreira1,2 , and Filipe Santos2 1 Faculty of Engineering, University of Porto, Porto, Portugal {up201606204,paco,amoreira}@fe.up.pt 2 INESC-TEC - INESC Technology and Science, Porto, Portugal {hugo.l.mendonca,jose.lima,filipe.n.santos}@inesctec.pt 3 Research Centre of Digitalization and Intelligent Robotics, Instituto Politécnico de Bragança, Bragança, Portugal jllima@ipb.pt Abstract. The COVID-19 virus outbreak led to the need of developing smart disinfection systems, not only to protect the people that usually frequent public spaces but also to protect those who have to subject themselves to the contaminated areas. In this paper it is developed a human detector smart sensor for autonomous disinfection mobile robot that use Ultra Violet C type light for the disinfection task and stops the disinfection system when a human is detected around the robot in all directions. UVC light is dangerous for humans and thus the need for a human detection system that will protect them by disabling the disinfection process, as soon as a person is detected. This system uses a Raspberry Pi Camera with a Single Shot Detector (SSD) Mobilenet neural network to identify and detect persons. It also has a FLIR 3.5 Thermal camera that measures temperatures that are used to detect humans when within a certain range of temperatures. The normal human skin temperature is the reference value for the range definition. The results show that the fusion of both sensors data improves the system performance, compared to when the sensors are used individually. One of the tests performed proves that the system is able to distinguish a person in a picture from a real person by fusing the thermal camera and the visible light camera data. The detection results validate the proposed system. Keywords: Smart sensor · Human detection · Neural network 1 Introduction The current pandemic situation that we live on, caused by the COVID-19 virus outbreak led to the need of giving safe conditions to people that share crowded spaces, especially closed environments, where the propagation of the virus is substantially more easy and therefore is considered a more dangerous situation to the society. Places such as hospitals, medical centers, airports, or supermarkets c Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 171–186, 2021. https://doi.org/10.1007/978-3-030-91885-9_13
    172 H. Mendonçaet al. are among those that fit in the previous description and even though regular testing to the users may be one way to prevent the spread of the virus, the disinfection of public places is of great importance. Disinfection using chemical products is one of the most popular but also implies a lot of effort and risk by the user since they must submit themselves to the dangerous conditions of proximity to possibly infected places. One way to solve this problem is using an Automated Guided Vehicles (AGV) that can cover large spaces without the supervision of human operators. Disinfection with Ultraviolet Radiation is an alternative to the use of chemical products without losing efficiency. This method works by exposing the infected area for a few seconds to the UV radiation reducing the viral load that is present there. This method is particularly interesting since it allows the disinfection of a larger area in less time, and by using an AGV, this task can be programmed to be performed during the periods without the presence of people (as example airports and commercial areas). However, due to its danger for living organisms, it is necessary to develop a system that detects persons that may be close to the actuation zone. This paper addresses a developed system that uses multiple sensors to have a higher robustness of detecting humans without false positives and false negatives by fusing the sensors signals that are captured individually in a single algorithm, that will then communicate with the disinfection robot. False negatives have a bigger impact in the system performance, since the disinfection lights are not stopped when persons are near the robot. Results are very promising reaching 97% of accuracy. The remaining of the paper is organized as follows. After an introduction presented in Sect. 1, the Sect. 2 addresses the related work regarding the human detection approaches. Then in Sect. 3, the system architecture is presented where the thermal camera, the Raspberry pi camera and the LiDAR sensor to estimate the disinfection time based on the room area are stressed. Section 4 presents the results and its discussion. Finally, Sect. 5 presents the conclusions and points out the future work directions. 2 Related Work When trying to detect people in a closed environment, a multi-sensor approach can be a good solution. Since it relies on the information of more than one sensor to give a final, more accurate result, this method is the focus of this article. The data gathered by the different cameras and other sensors are later fused and analyzed allowing the system to have one single output. The system uses two cameras, one visible light camera, and one thermal camera to run an active detection algorithm. On the other hand, there are also two other sensor types, a LiDAR that gives distance information on the direction where the cameras are pointing and a Passive Infrared (PIR) sensor that is used in home surveillance systems for motion detection [9,10,13]. This last type of sensor is also commonly used to locate the motion target in space when using multiple sensors pointing
    Human Detector SmartSensor for Autonomous Disinfection Mobile Robot 173 in different directions allowing the system to pinpoint to a certain degree its position [5,15,17]. The most common LiDAR cannot distinguish what type of objects are close to the system where is used, which means it cannot be used as an active detection sensor. However the distance data can be used to allow the system a better comprehension of what is happening around it by mapping the environment [1] and working together with the cameras to improve the system performance [4,12]. In regards to active human detection applications using cameras, thermal cameras are a less indicative sensor of visual information since it only captures temperatures based on infrared sensors. However in situations of low light, as described in [3] and [16], this type of camera has more accurate detection results when compared to visible light cameras without infrared sensors. In [7] a com- parison is made between two cameras, one with and the other without infrared modules and the first outperforms the second by far, thus making us believe that in low light conditions thermal cameras are more informative than visible light cameras. Since some thermal cameras are usually expensive, it is necessary to find one that meets our requirements but without compromising our budget. FLIR Lepton represents such case as proven in [11] showing a capability to detect specific temperatures of objects smaller than humans from long distances [3] in a compact, low weight sensor compatible with embedded devices such as the Raspberry Pi in our case. This model has two versions 2.x and 3.x where the first has an 80 × 60 resolution and the second a 160 × 120 resolution among other differences. The visible light camera detection relies on previously trained models to identify persons. This is commonly used in Internet of Things applications and more complex surveillance systems [7], that instead or in addition to the use of passive sensors such as PIR sensors also use computer vision and machine learning to more accurately identify persons. Even though PIR sensors are a good option in simple motion detection systems since they are cheap and easy to use. Sometimes it triggers false positives which lead to unnecessary preoccupation and false negatives when there is a person but it is not moving. The human detection can be achieved by using image processing through background subtraction plus face detection or head and shoulder detection [7], but also using machine learning through networks prepared for the task of identification such as Convolutional Neural Network (CNN) [14], DarkNET Deep Learning Neural Network [8] and others such as YOLO and its variants, SSD and CNN variants [2]. 3 System Architecture The goal of this paper is to develop a human detection system that can be used with an autonomous disinfection robot and it was developed to be a low-cost and low weight efficient solution. The system will be mounted on top of a robot equipped with a UVC light tower responsible for the disinfection procedure. The human detector is a stand-alone equipment that is powered by 24 V and can
communicate by WiFi or serial port. In the presented case study, a serial port connection is used to obtain information from the developed human detector module. Figure 1 presents the main architecture of the human detector and its integration.

Fig. 1. System architecture scheme (the ESP32 control board with stepper motor driver, PIR sensors, sonar and auxiliary I/O, powered at 24 V, connected over USB/serial to the robot PC and to the rotating Raspberry Pi module carrying the LiDAR, RGB camera and thermal camera)

The human detection system consists of two major parts: the upper part, a rotating module that contains a Raspberry Pi, a LiDAR, a FLIR Lepton thermal camera, and a Raspberry Pi Camera Module v2; and the lower part, which holds 4 PIR sensors to detect movement (used only when the robot is not moving) and a control circuit board with an ESP32 micro-controller. The ESP32 is used to control the stepper motor that drives the upper part, general purpose I/O and
a sonar to measure the distance to the building roof. This embedded system also sends information to the main system (Robot PC) through a USB serial connection.

To minimize the number of cameras used by the system, the upper part is a rotating module, developed so that a single pair of cameras is able to sweep a full 360° rotation. A non-continuous motion was chosen, with the module stopping for a brief moment at each position instead of sweeping continuously. After measuring both vertical apertures, and considering that the vertical resolution is used to enhance the field of view captured by the system, the stops are made every 36°, resulting in a total of 10 stops for each rotation and an average time of 10 s per rotation. A two-pulley approach is used to achieve the rotation, with one pulley attached to a stepper motor controlled by the micro-controller located in the lower part of the system. The other pulley, which also holds the upper part of the system to the lower part, has a bearing fixed to the box and rotates according to the stepper rotation with a certain transmission ratio.

3.1 Thermal Camera

A FLIR Lepton 3.5 with a FLIR Breakout Board v2 is used due to its 160 × 120 resolution, which is 4 times larger than the 80 × 60 of its predecessor. It uses two types of communication: I2C, to send the SDK commands provided by FLIR that can be used for initialization, reboot, and/or system diagnosis; and Serial Peripheral Interface (SPI) lines, to send the data from the camera to the listening device, in this case the Raspberry Pi. By default, the camera sends 164-byte packets that contain a 4-byte header and 160 bytes of payload. Since the raw values of each pixel are 16-bit values, the data of each pixel is sent in 2 bytes, which means each packet only carries information for half a line of a frame. A frame is divided into 4 segments and each segment consists of 60 packets, identified in the packet header. The 20th packet also identifies the current segment: if the segment ID is 1, 2, 3, or 4 the packet is valid; if the ID is 0 it is a discard packet.

The raw 16-bit values per pixel lie between 0 and 65535. However, they are not arbitrary values within this range: the raw values are temperatures in Kelvin multiplied by a factor of 100. Since it is desired to detect temperatures in the human range, we are interested in values around 306.15 K (equal to 33 °C), i.e. the raw value 30615, which is the normal temperature of the human skin. In the end, we did not find major differences between using the human body temperature, which is around 37 °C, and the human skin temperature as the reference value for the detection range. However, we found that for persons further away from the system it is preferable to use the skin temperature, or an even lower value, as the reference, since the measured human temperature becomes an average between the human skin temperature and the temperature of the person's clothes.
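For illustration, a minimal sketch of the raw-value conversion and temperature-range test described in this section is given below, assuming a reassembled 160 × 120 frame of raw Lepton values (Kelvin × 100) held in a NumPy array; the tolerance and the pixel-count threshold are illustrative choices, not the exact values used in the implementation.

```python
# A minimal sketch of the human-temperature test of Sect. 3.1 on one reassembled frame.
import numpy as np

SKIN_TEMP_C = 33.0          # reference value: normal human skin temperature (306.15 K)
TOLERANCE_C = 6.0           # illustrative: camera precision of +/-5 C plus some fluctuation
MIN_PIXELS_IN_RANGE = 500   # illustrative minimum number of in-range pixels for a detection

def human_in_frame(raw_frame: np.ndarray) -> bool:
    """raw_frame: uint16 array of shape (120, 160) holding radiometric values (Kelvin * 100)."""
    celsius = raw_frame.astype(np.float32) / 100.0 - 273.15   # raw value -> degrees Celsius
    in_range = np.logical_and(celsius >= SKIN_TEMP_C - TOLERANCE_C,
                              celsius <= SKIN_TEMP_C + TOLERANCE_C)
    return int(in_range.sum()) >= MIN_PIXELS_IN_RANGE
```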
A range is set in the algorithm to consider a group of temperatures near this ideal human temperature value, taking two factors into account: human temperature fluctuation and the camera precision of ±5 °C. If a read value falls within this range we consider a positive human detection, although other inputs still need to be considered, since there can be other objects in the scene with the same temperature as the human body.

3.2 RGB Camera

The RGB camera used is a Raspberry Pi Camera Module v2, due to its low cost and ease of use with the Raspberry Pi without compromising performance and image quality. The system uses an SSD MobileNet model trained on the COCO dataset to detect and identify persons in real time through the Raspberry Pi camera. The MobileNet was quantized to a lower resolution (the internal neural network weights are moved from floats to integers) using the TensorFlow framework, to reduce the computational power requirement and enable the use of this neural network on a Raspberry Pi.

As said in [6], a Single Shot multibox Detector (SSD) is a rather different detection model from Region-CNN or Faster R-CNN. Instead of generating the regions of interest and classifying the regions separately, SSD does it simultaneously, in a "single shot". The SSD framework is designed to be independent of the base network, and so it can run on top of other Convolutional Neural Networks (CNNs), including MobileNet. The MobileNet and Raspberry Pi are capable of detecting persons in the chosen vertical resolution at a rate of 2.5 frames per second. In this case we defined a threshold of 50% on the MobileNet detection confidence to reduce false positive detections. The network is quite robust, having the capacity to detect persons in a number of different situations, for example through a partial body image or simply a single hand image.

3.3 LiDAR

The detection system also includes a LiDAR sensor pointing in the same direction as the cameras, continuously sending distance measurements to the Raspberry Pi. This information is useful for mapping applications, recreating the surrounding environment of the sensor and giving the system yet another way of detection, although a much less informative one, since different types of objects cannot be distinguished simply by using distance measurements. The disinfection time can be tuned based on this information.

The LiDAR used is a TFMini Plus UART module, a low-cost, small-size, and low power consumption solution for cost-effective LiDAR applications. The TFMini Plus LiDAR modules periodically emit modulated near-infrared waves, which are reflected after contacting an object. The time of flight is then measured through the round-trip phase difference, and the relative range between the device and the detected object is calculated from it.
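As a complement to the RGB pipeline of Sect. 3.2, the following minimal sketch shows how a quantized COCO SSD-MobileNet model exported to TensorFlow Lite can be queried with the 50% confidence threshold used here. The model path, the zero-indexed "person" class id and the output tensor ordering are assumptions that must be checked against the specific .tflite file.

```python
# A minimal sketch of the person-detection step of Sect. 3.2 with a quantized TFLite model.
import numpy as np
from tflite_runtime.interpreter import Interpreter  # or tf.lite.Interpreter on a desktop

PERSON_CLASS_ID = 0      # "person" in the zero-indexed COCO label map of common TFLite SSD exports
CONF_THRESHOLD = 0.50    # 50% confidence threshold used to reject weak detections

interpreter = Interpreter(model_path="detect.tflite")   # placeholder model path
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
outs = interpreter.get_output_details()

def detect_person(frame_rgb: np.ndarray) -> bool:
    """Return True when at least one 'person' is detected above the confidence threshold."""
    h, w = int(inp["shape"][1]), int(inp["shape"][2])
    # Nearest-neighbour resize to the model input size (uint8 input for a quantized model).
    ys = np.linspace(0, frame_rgb.shape[0] - 1, h).astype(int)
    xs = np.linspace(0, frame_rgb.shape[1] - 1, w).astype(int)
    resized = frame_rgb[ys][:, xs].astype(np.uint8)[np.newaxis, ...]
    interpreter.set_tensor(inp["index"], resized)
    interpreter.invoke()
    # Output ordering (boxes, classes, scores, count) is the usual one for these models,
    # but it should be verified for the exported file being used.
    classes = interpreter.get_tensor(outs[1]["index"])[0]
    scores = interpreter.get_tensor(outs[2]["index"])[0]
    return bool(np.any((classes == PERSON_CLASS_ID) & (scores >= CONF_THRESHOLD)))
```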
Distances are sent from the LiDAR to the Raspberry Pi via UART at a constant pace, and are later forwarded from the Raspberry Pi to the ESP32 micro-controller, which is responsible for the communication between this system and the robot for the mapping tasks. The connections and buses from the cameras and LiDAR to the Raspberry Pi CPU are presented in Fig. 2.

Fig. 2. Human detection module architecture (wiring of the FLIR Lepton 3.5 over SPI/I2C, the Pi camera over the flat cable, the LiDAR over UART, the DC/DC converter supplying 5 V and 3V3, and the serial link to the ESP32/PCB powered at 24 V)

3.4 Messaging Protocol

Since the two cameras have separate algorithms running in individual programs, there is a need to aggregate and compare the detections from each algorithm in a single program, which then sends that information along with the data from the LiDAR program, itself running as yet another process. For this, ZMQ, an asynchronous messaging library, is used. It has a publisher/subscriber option, where the detection algorithms and the LiDAR program act as publishers by continuously sending data to a specified port, which is then subscribed to by a master control program acting as the subscriber. This messaging method works by creating a socket on the publisher side, binding it to the chosen IP address and port and then sending the data. On the subscriber side, a socket is also created, connected to the same IP address and port as the publisher, and the data received in the socket is read. It is important to mention that each publisher needs to have a different port; the subscriber, however, is able to read different ports at the same time.
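A minimal sketch of this publisher/subscriber layout with the pyzmq bindings is shown below; the port numbers and message fields are illustrative assumptions rather than the values used in the deployed system.

```python
# A minimal sketch of the ZMQ publisher/subscriber layout of Sect. 3.4 (pyzmq).
import json
import time
import zmq

def detection_publisher(port: int, source: str):
    """Run by each detection/LiDAR process: continuously publish its latest result."""
    sock = zmq.Context.instance().socket(zmq.PUB)
    sock.bind(f"tcp://127.0.0.1:{port}")           # each publisher binds its own port
    while True:
        sock.send_string(json.dumps({"source": source, "detected": True}))
        time.sleep(0.1)

def master_subscriber(ports=(5555, 5556, 5557)):   # e.g. RGB, thermal and LiDAR publishers
    """The master control program: one SUB socket reading several publisher ports at once."""
    sock = zmq.Context.instance().socket(zmq.SUB)
    for port in ports:
        sock.connect(f"tcp://127.0.0.1:{port}")
    sock.setsockopt_string(zmq.SUBSCRIBE, "")      # subscribe to every message
    while True:
        msg = json.loads(sock.recv_string())
        print(msg["source"], msg)                  # fusion / forwarding to the ESP32 goes here
```

The publisher/subscriber pattern keeps the camera, LiDAR and fusion processes fully decoupled: each publisher can crash or restart independently while the master simply keeps reading whatever arrives on the subscribed ports.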
After gathering all the relevant data of the system, this subscriber program also communicates via UART with the ESP32 micro-controller, sending information about the detection state of each camera and also about the sensor fusion. The distance read by the LiDAR is also sent, for other purposes of the robot. Finally, Fig. 3 presents the developed module prototype.

Fig. 3. Human detection module prototype

4 Results

The detection algorithms were tested by placing the system in a populated environment. Figure 4 is an example of the sequential images collected from the thermal camera in one 360° rotation. We can see from those images that there are people in one part of the testing area and no persons in the other. This gives us a better understanding of how the system reacts to populated areas versus non-populated areas. The goal was to observe the individual performance of the cameras in person detection tasks and test their robustness in different scenarios where we knew a certain camera would fail individually. Some examples of this
are the detection of pictures instead of real persons in the visible light camera algorithm and the detection of devices at human-like temperatures when no human is present. Finally, we tested the algorithm that combines both data sources to see whether the performance was better compared to the previous tests using the cameras individually.

A total of 1200 images were taken from each of the cameras. About 80% of the images are supposed to be classified as positive detections; the rest are images where no person was in the field of view of the system and where, therefore, nothing should be detected.

Fig. 4. Example of images captured from one rotation

An object detection classification method was used where a True Positive (TP) means that a person was present in the image and the algorithm detected it, whereas a False Positive (FP) means that a person was not present in the image but the algorithm falsely reported a detection. Regarding the negative detections, a True Negative (TN) means that no person was present in the image and the algorithm correctly detected nothing, and a False Negative (FN) means that a person was present but was not detected. These results are presented in Table 1.

With the TP, TN, FP and FN counts we can then calculate the algorithm's Precision, Recall, and F1 score based on the equations below. Precision is a good measure when the cost of a False Positive is high, whereas Recall measures how many of the actual positives the model captures by labeling them as positive (True Positive). Applying the same understanding, Recall is the metric to use to select the best model when there is
a high cost associated with False Negatives. The F1 score is needed when seeking a balance between Precision and Recall. The results are presented in Table 2.

Table 1. Detection classification

Detection method         TP    TN    FP    FN
Thermal camera           923   145   306   51
Visible Light camera     897   175   127   89
Thermal + Visible Light  981   194   88    32

\[
\mathrm{Precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}
\qquad
\mathrm{Recall} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}
\qquad
F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
\]

Table 2. Detection classification

Detection method         Precision (%)   Recall (%)   F1
Thermal camera           75.18           94.76        0.8379
Visible Light camera     87.59           90.97        0.8925
Thermal + Visible Light  91.77           96.84        0.9423

In Figs. 5 and 6 we have an example of a True Positive from each of the cameras. The visible light detection algorithm detects two persons with more than 50% confidence, which classifies as a positive detection; it is important to mention that the use of masks does not have a big impact on the identification of persons. The thermal camera detection algorithm finds a total of 3545 pixels in the range of temperatures considered acceptable, which means that the system also considers this image a positive detection. Since both cameras consider these images True Positives, the result is a True Positive for the detection system.

In Figs. 7 and 8 we tested a scenario where we knew the visible light detection algorithm could falsely signal a detection. The idea was to place a picture of a person in front of the camera even though there is no real person there. As we can see, the visible light algorithm detects the picture as a True Positive but the thermal detection algorithm does not, since no temperatures in the image were in the predefined range of human temperatures. Individually, one algorithm classifies the scenario as positive and the other as negative, resulting in a system negative.
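For reference, the metric values in Table 2 can be recomputed directly from the counts in Table 1; the short sketch below reproduces them up to small rounding differences in the reported figures.

```python
# Recomputing the Precision / Recall / F1 values of Table 2 from the counts of Table 1.
def prf1(tp, tn, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

counts = {"Thermal camera": (923, 145, 306, 51),
          "Visible Light camera": (897, 175, 127, 89),
          "Thermal + Visible Light": (981, 194, 88, 32)}

for name, c in counts.items():
    p, r, f1 = prf1(*c)
    print(f"{name}: precision={p:.2%} recall={r:.2%} F1={f1:.4f}")
```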
Fig. 5. Example of true positive - visible light camera

Fig. 6. Example of true positive - thermal camera
This proves that the fusion of data from both sensors can improve the system's performance.

Fig. 7. Example of false positive - visible light camera

There are also some cases where the detection could not be achieved by either of the cameras. As we can see in Fig. 9, there is a person in the image; however, the algorithm is not capable of detecting it because of the distance of the person from the system, the low light conditions and the low resolution of the neural network. The thermal sensor also has some problems when trying to detect persons that are too far from the sensor or when the environmental temperatures are too similar to the person's temperature. As we see in Fig. 10, even though we can distinguish the person in the image, the chosen range of temperatures does not classify it as a positive detection. We could expand the detection range, but that would make the sensor too sensitive to other objects and/or environmental temperatures. Our measurements indicate that the system has a detection range of at least 8 m in non-optimal conditions. In perfect light and temperature conditions, the system's range increases even further.
Fig. 8. Thermal image in visible camera false positive example

Fig. 9. Example of false negative - visible light camera
    184 H. Mendonçaet al. Fig. 10. Example of false negative - thermal camera 5 Conclusions and Future Work The developed system presented a high accuracy rate when detecting people in different light and weather conditions by using both thermal and visible light sensors as well as the PIR sensors. The data fusion algorithm allows the system to overcome situations where a determined sensor would not be able to perform as expected. The PIR sensor is the most simple sensor used in our system is good to detect persons in situations where there is movement, however, this method fails when a person is not moving but is still present. In this situation, we use the thermal and visible light camera to enhance the detection of people with the developed algorithms, using computer vision and temperature measurements. The thermal detection algorithm is susceptible to failure in situations where the environment temperature is similar to the human body temperature or when there are objects with similar temperatures. The visible light detection algorithm shows a lower performance in low light conditions and is not able to distinguish a real person from a picture of a human. Even though these two sensors are more capable than the PIR sensor, they still work together to improve the system performance as we showed in our tests. Since the cameras are spinning around central axis of the system, sometimes it fails to detect persons moving in the same direction as the camera. In this situations, the system relies on the data of the PIR sensors to accomplish the detection, even if it is not as reliable as the cameras data. Further work will consist of implementing the developed system into a disinfection AGV to test its reliability in a real-world scenario.
    Human Detector SmartSensor for Autonomous Disinfection Mobile Robot 185 The thermal detection algorithm can be improved by training a neural network to identify human bodies in a frame. This can be done in two ways, one of them is overlaying the RGB and the thermal image to create a new image where each pixel contains color plus temperature information thus being called RGB-T. The other way is using the thermal images to create a dataset of images to feed the network, the same approach used in the visible light detection algorithm. Acknowledgements. This work has been supported by FCT - Fundação para a Ciência e Tecnologia within the Project Scope: UIDB/05757/2020. References 1. Aijazi, A.K., Checchin, P., Trassoudaine, L.: Multi sensorial data fusion for effi- cient detection and tracking of road obstacles for inter-distance and anti-colision safety management. In: 2017 3rd International Conference on Control, Automation and Robotics, ICCAR 2017, pp. 617–621. Institute of Electrical and Electronics Engineers Inc., June 2017. https://doi.org/10.1109/ICCAR.2017.7942771 2. Alsing, O.: Mobile object detection using tensorflow lite and transfer learning (2018). http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-233775 3. Andraši, P., Radišić, T., Muštra, M., Ivošević, J.: Night-time detection of uavs using thermal infrared camera, vol. 28, pp. 183–190. Elsevier B.V., January 2017. https://doi.org/10.1016/j.trpro.2017.12.184 4. Chavez-Garcia, R.O., Aycard, O.: Multiple sensor fusion and classification for mov- ing object detection and tracking. IEEE Trans. Intell. Transp. Syst. 17, 525–534 (2016). https://doi.org/10.1109/TITS.2015.2479925 5. Ha, K.N., Lee, K.C., Lee, S.: Development of PIR sensor based indoor location detection system for smart home, pp. 2162–2167 (2006). https://doi.org/10.1109/ SICE.2006.315642 6. Hung, P.D., Kien, N.N.: SSD-mobilenet implementation for classifying fish species. In: Vasant, P., Zelinka, I., Weber, G.-W. (eds.) ICO 2019. AISC, vol. 1072, pp. 399– 408. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-33585-4 40 7. Ivašić-Kos, M., Pobar, M.: Human detection in thermal imaging using yolo. ACM (2019). https://doi.org/10.1145/3323933.3324076 8. Mitrea, C.A., Constantin, M.G., Stefan, L.D., Ghenescu, M., Ionescu, B.: Little- big deep neural networks for embedded video surveillance, pp. 493–496. Institute of Electrical and Electronics Engineers (IEEE), October 2018. https://doi.org/10. 1109/iccomm.2018.8484765 9. Patel, P.B., Student, M.T., Choksi, V.M., Jadhav, S., Potdar, M.B.: ISSN: 2249- 0868 Foundation of Computer Science FCS. Technical report 5 (2016). www.ijais. org 10. Pawar, Y., Chopde, A., Nandre, M.: Motion detection using PIR sensor. Int. Res. J. Eng. Technol. 1(5) (2018). www.irjet.net 11. Pestana, D.G., Mendonca, F., Morgado-Dias, F.: A low cost FPGA based thermal imaging camera for fault detection in PV panels. Institute of Electrical and Elec- tronics Engineers Inc., August 2017. https://doi.org/10.1109/IoTGC.2017.8008976 12. Premebida, C., Ludwig, O., Nunes, U.: LIDAR and vision-based pedestrian detec- tion system. J. Field Robot. 26(9), 696–711 (2009). https://doi.org/10.1002/rob. 20312, http://doi.wiley.com/10.1002/rob.20312
    186 H. Mendonçaet al. 13. Sahoo, K.C., Pati, U.C.: IoT based intrusion detection system using PIR sensor. In: RTEICT 2017–2nd IEEE International Conference on Recent Trends in Electron- ics, Information and Communication Technology, Proceedings, vol. 2018-January, pp. 1641–1645. Institute of Electrical and Electronics Engineers Inc., July 2017. https://doi.org/10.1109/RTEICT.2017.8256877 14. Wei, H., Laszewski, M., Kehtarnavaz, N.: Deep learning-based person detection and classification for far field video surveillance. Institute of Electrical and Electronics Engineers Inc., January 2019. https://doi.org/10.1109/DCAS.2018.8620111 15. Yun, J., Lee, S.S.: Human movement detection and identification using pyroelectric infrared sensors. Sensors (Switzerland) 14(5), 8057–8081 (2014). https://doi.org/ 10.3390/s140508057 16. Yuwono, W.S., Sudiharto, D.W., Wijiutomo, C.W.: Design and implementation of human detection feature on surveillance embedded IP camera, pp. 42–47. Institute of Electrical and Electronics Engineers Inc., July 2018. https://doi.org/10.1109/ SIET.2018.8693180 17. Zhang, Z., Gao, X., Biswas, J., Jian, K.W.: Moving targets detection and local- ization in passive infrared sensor networks (2007). https://doi.org/10.1109/ICIF. 2007.4408178
Multiple Mobile Robots Scheduling Based on Simulated Annealing Algorithm

Diogo Matos1(B), Pedro Costa1,2, José Lima1,3, and António Valente1,4

1 INESC-TEC - INESC Technology and Science, Porto, Portugal
diogo.m.matos@inesctec.pt
2 Faculty of Engineering, University of Porto, Porto, Portugal
pedrogc@fe.up.pt
3 Research Centre of Digitalization and Intelligent Robotics, Instituto Politécnico de Bragança, Bragança, Portugal
jllima@ipb.pt
4 Engineering Department, School of Sciences and Technology, UTAD, Vila Real, Portugal
avalente@utad.pt

© Springer Nature Switzerland AG 2021. A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 187–202, 2021. https://doi.org/10.1007/978-3-030-91885-9_14

Abstract. Task scheduling is an integral topic in the efficiency of multiple mobile robot systems and a key part of most modern manufacturing systems. Advances in the field of combinatorial optimisation have allowed the implementation of algorithms capable of solving the different variants of the vehicle routing problem with respect to different objectives. However, few of these approaches are capable of taking into account the nuances associated with coordinated path planning in multi-AGV systems. This paper presents a new study on the implementation of the Simulated Annealing algorithm to minimise the time and distance cost of executing a set of tasks while taking into account possible pathing conflicts that may occur during the execution of those tasks. This implementation uses an estimation of the planned paths for the robots, provided by the Time Enhanced A* (TEA*), to determine where possible pathing conflicts occur, and uses the Simulated Annealing algorithm to optimise the attribution of tasks to each robot in order to minimise the pathing conflicts. Results are presented that validate the efficiency of this algorithm and compare it to an approach that does not take the estimation of the robots' paths into account.

Keywords: Task scheduling · Multiple AGV · Time Enhanced A*

1 Introduction

Automated Guided Vehicles (AGV) have been increasingly adopted, not only by industry for moving products on the shop floor but also in hospitals and distribution centres, for example. They can be an important help to the competitiveness of a company, since 24/7 operation can be adopted, as well as a reduction of costs.
    188 D. Matoset al. When the solution points out more than one AGV to transport the material, a new problem must be tackled: scheduling of multiple AGV. The coordination of a fleet of autonomous vehicles is a complex task and actually most available multi-robot systems rely on static and pre-configured movements [1]. The solution can be composed by a cooperative movement between all robots that are part of the system while avoiding mutual block. In fact, this problem of scheduling for multiple-robots can be compared to the well-known problem of the Travelling salesman problem applied with multiple salesman. However the problem of task scheduling in multiple-robots system has the added complexity that the path of one robot can interfere with the path of the other robots since they all must share the same resources. Based on this, an optimisation method- ology is proposed in this paper so that it can solve the multi-robot scheduling, while taking into account the impact of the pathing conflicts in the overall effi- ciency of the system. This paper addresses an optimisation approach based on the estimation of the robots paths via the TEA* algorithm and the consequen- tial optimisation of the scheduling of multiple AGV via the Simulated Annealing algorithm. 1.1 Problem Description The order in which the tasks are attributed to each of the agents of the system can have a great impact in the overall efficiency of the system. Therefore, discovering the ideal order of tasks to assign to each of the robots, that are part of a multi AGV system, is a very important task. The main goal of this article is to create a module that is capable of optimising the task order that will be assigned to each of the robots. This optimisation process must be time sensible, since it must generate a good solution in the time interval that takes the control system of the multi AGV to execute its current assigned tasks. This module uses a path estimation generated by the TEA* path planning algorithm to calculate the cost of each solution taking into account the possible resource conflicts. Most task scheduling implementations do not take into account possible pathing conflicts, that can occur during the planning of the paths for the robots that comprise the system. These conflicts normally originate unpredicted increases in the time it takes the system so execute the tasks, since in order to solve these conflicts one more of the robots must either delay their movement or use a less efficient path to reach the target. Therefore, the overall objective of this optimisation is to make the system more efficient by using a path planning algo- rithm in the optimisation phase. The path planning algorithm will generate an estimation of the real paths of that the robots will take. The main idea behind this system is to use this estimations to calculate the value of each solution, therefore the minimisation of this value will be directly correlated to a decrease in the number of pathing conflicts. In most real industrial applications of multi AGV systems there are some restrictions associated with how the transport of cargo can be made, for example some tasks can only be executed by certain robots (due to the AGV characteristics) and some tasks have to be executed as
    Multiple Mobile RobotsScheduling Based on Simulated Annealing 189 soon as possible having priority over others, this restrictions were also taken into account during the design of this module. This article represents the first step into developing a system capable of such optimisation. It presents the results of the study of the performance of one of the most commonly used optimisation methods, when it is applied in the scenario described above. This article also compares the performance of this method with the results obtained from the minimisation of the travelled distance without taking into consideration the path estimation. Since the purpose of this article is just to study the performance of the Simulated Annealing algorithm when it is apply to this case, the estimation of the AGV battery and the dynamic execution of tasks were not taken into account, however such features are planned in future iterations of this work. For the purposes of this study a task is considered as the movement the movement of the robot between two workstation. This implies that the robot must first move from its current position towards the starting workstation, in order to load the cargo, and then drop it at the final workstation. To execute this movements the robot must access shared resources, due to the physical dimensions of the robot and the operation space, it is considered that it is only possible to have one robot travelling on each edge of the graph, in order to avoid possible deadlock situations. This types of studies are of paramount importance as a possible solution to this optimisation problem may be based in a multi method based approach. The paper is organised as follows. After this brief introduction, Sect. 2 presents some related work on the scheduling problem. Section 3 addresses the system architecture whereas in Sect. 4 presents the optimisation methodology. On Sect. 5, the results of the proposed method are stressed and Sect. 6 concludes the paper and points out the future work direction. 2 State Art Task scheduling defines and assigns the sequence of operations (while optimising) to be done, in order to execute a set of tasks. Examples of scheduling can be found in a huge number of processes, such as control air traffic control, airport gate management, allocate plant and machinery resources and plan production processes [2,4] among several others. The task scheduling can be applied to the presented problem on assigning the task for each agent on a multi-robot system. This problem can be considered a variant of the well known Travelling Salesman Problem (TSP) [3]. According to [12], algorithms between centralised and decoupled planning belong to three problem classes: 1) coordination along fixed, independent paths; 2) coordination along independent roadmaps; 3) general, unconstrained motion planning for multiple robots. Following 3), there are several approaches to solve the scheduling problems, from deterministic methods to heuristics procedures. Regarding the Heuristic procedures there are several ways such as Artificial Intelligence, Simulated Annealing (SA) and Evolutionary algorithms [5]. Classi- cal optimisation methods can also be applied to the scheduling problem such as
    190 D. Matoset al. the Genetic Algorithm (GA) or Tabu Search [6]. Enhanced Dijkstra algorithm is applied to find the conflict-free routing task on the work proposed by [7]. The scheduling problem has been studied extensively in the literature. In the robotics domain, when considering each robot as an agent, some of these algorithms can be directly adapted. However, most of the existing algorithms can only handle single-robot tasks, or multi-robot tasks that can be divided into single-robot tasks. Authors from [8] propose an heuristics to address the multi- robot task scheduling problem but at the coalition level, which hides the details of robot specifications. Scheduling problem can be solved resorting to several approaches, such as mixed integer program, which is a standard approach in the scheduling com- munity [15] or resorting to optimisation techniques: Hybrid Genetic Algorithm- Particle Swarm Optimisation (HGA-PSO) multiple AGV [16], find the near- optimum schedule for two AGVs based on the balanced workload and the min- imum travelling time for maximum utilisation applying Genetic Algorithm and Ant Colony Optimisation algorithm [14]. Other approaches provide a methodology for converting a set of robot trajec- tories and supervision requests into a policy that guides the robots so that oper- ators can oversee critical sections of robot plans without being over-allocated, as an human-machine-interface proposal that indicates how to decide the re- planning [9]. Optimisation models can also help operators on scheduling [10]. Similar approach is stressed on [11] where the operator scheduling strategies are used in supervisory control of multiple UAVs. Scheduling for heterogeneous robots can also be found on literature but addressing two agents only [13]. Although there exist some algorithms that support complex cases, they do not represent efficient and optimised solutions that can be adapted for several (more than two) multi-robot systems in a convenient manner. This paper proposes a methodology, based on Simulated Annealing optimisation algorithm, to schedule a fleet of four and eight robots having in mind the simultaneous reduction of the distance travelled by the robots and the time it takes to accomplish all the assigned tasks. 3 System Architecture The system used to test the performance of the optimisation methods, as shown in Fig. 1, is comprised of two modules: – The Optimisation module; – The Path Estimation module; The idea behind this study is to, at a later date, use its results to help design and implement a tasks scheduling software that will work as an interface between the ERP used for industrial management and the Multi AGV Control System. The Optimisation module will use the implemented methodologies, described in Sect. 4, to generate optimised sets of tasks for each of the robots. These sets are then sent to the Path Estimation module, this module estimates the paths that
    Multiple Mobile RobotsScheduling Based on Simulated Annealing 191 Fig. 1. Top level view of the Control system. each of the robot will have to take, in order to execute these tasks. This paths will be planned using the TEA* path planning algorithm, therefore taking into consideration possible resource conflicts that can occur. The Path Estimation module will be used to calculate the value of each of the task sets, this value is later used by the Optimisation module to generate better task sets for the robots. This cycle will continue to be executed until the optimisation method reaches its goal. In future works this cycle will be restricted to the available time interval until the completion of the current tasks by the robots, however since the aim of this study was to analyse the performance of the simulated annealing method when it is applied to this scenario, this restriction was not implemented. 4 Implemented Optimisation Methodology In this study the Simulated Annealing(SA) metaheuristic, was chosen in order to demonstrate its results when it comes to optimising the solutions to the task attri- bution problem. However, some small modifications needed to be implemented to the classic version of this metaheuristic, in order to allow the application of this algorithm to the proposed problem. Most of these modifications were implemented with the objective of making the solutions generated by these algorithms compatible with the restrictions that most real life multi AGV systems are subjected to. The three main restrictions taken into consideration during this study were: – The possibility of having different priorities for each of the tasks; – The possibility of having tasks that can only be executed by specific robots; – The existence of charging stations that the robots must move to when they have no tasks assigned; The implemented algorithm, as well as, the modifications implemented are described in the following sections. 4.1 Calculate the Value of a Solution In the tasks scheduling for multiple vehicle routing problem, the solutions can be classified by two major parameters:
– The time it takes the robots to complete all of their tasks;
– The distance travelled by the robots during the execution of those tasks.

In order to classify and compare the solutions obtained by the implemented optimisation methodologies, a mathematical formula was used to obtain a cost value for each solution. This formula uses as inputs a boolean that indicates whether TEA* is capable of planning a valid path for each robot (Vi), the number of steps that comprise each robot's planned path (Si) and the total distance that each robot is estimated to travel (Di). The formula is represented below:

\[
\sum_{i=0}^{\mathrm{NumberOfRobots}} \left( c_1 \cdot V_i + c_2 \cdot S_i + c_3 \cdot D_i \right)
\]

Each of these inputs is multiplied by a distinct constant; these constants define the weight that each input has in the final result of the function. For the purpose of this article, these values were set to 1 × 10^8, 5 and 1 for c1, c2 and c3, respectively. The intention behind these values is to force the optimisation process to converge to solutions in which the path for all the robots can be planned, by attributing a large constant cost to the solutions that do not generate feasible paths, and then to prioritise the paths which lead to the fastest execution of all the tasks. It is of special importance, when using the TEA* algorithm for path planning, that the optimisation process gives special attention to the overall time taken by the system to execute the tasks. The TEA* algorithm can solve pathing conflicts by inserting wait conditions into the robots' paths; therefore, in the paths generated by this algorithm there is not always a direct relation between travelling less distance and finishing the tasks sooner.

There are some scenarios, as implied above, where the TEA* algorithm cannot plan viable paths for all the robots; these scenarios are normally associated with the maximum number of steps allocated in the algorithm being reached before all the targets are reached. Such paths would be prohibitively costly for the robots to execute, since the more steps a path has, the more time it takes to finish and the higher the computational cost of planning becomes.
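A minimal sketch of this cost computation is given below, assuming each robot's TEA* estimate is summarised as a (valid, steps, distance) triple; note that the 1 × 10^8 term is applied here when the path could not be planned, which is how the role of Vi is described above.

```python
# A minimal sketch of the solution-cost function of Sect. 4.1, under the assumption that
# each robot's TEA* path estimate is reduced to (valid, steps, distance).
C1, C2, C3 = 1e8, 5.0, 1.0   # weights quoted in the text for c1, c2 and c3

def solution_cost(path_estimates):
    """path_estimates: one (valid, steps, distance) triple per robot.
    'valid' is False when TEA* could not plan a feasible path for that robot."""
    total = 0.0
    for valid, steps, distance in path_estimates:
        infeasible_penalty = 0.0 if valid else C1   # large constant cost for unplannable paths
        total += infeasible_penalty + C2 * steps + C3 * distance
    return total
```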
4.2 Initial Clustering of the Tasks

Since Simulated Annealing is a metaheuristic that generates random changes in the solution in an attempt to move it towards a global minimum, the results obtained with this probabilistic technique tend to depend on the starting solution. To generalise this methodology, and also to have a comparison point between an optimised solution and a solution that was not exposed to any optimisation, the tasks were initially clustered using a Kmeans Spectral clustering algorithm based on the algorithm presented in [17,18] to solve the traditional Multiple Travelling Salesman Problem (mTSP). This initial clustering uses a matrix with the distance between the final point of each task and the beginning point of the other tasks, in order to create a number of clusters equal to the number of robots available in the system. Due to the fact that the beginning point and final point of each task are usually different, the computed distance matrix is not symmetric. This asymmetry serves to model the fact that the order in which the tasks are executed by each robot is not commutative and does have a large impact on the value of the solution. These clusters were then assigned to the robot closest to the start point of each cluster. This solution serves as the baseline for the optimisation executed by the Simulated Annealing metaheuristic.

4.3 Simulated Annealing Algorithm

The implementation of the Simulated Annealing followed an approach that is very similar to the generalised version of the algorithm found in the literature. This implementation is shown in Fig. 2 as a flowchart.

Fig. 2. Flowchart representation of the implemented simulated annealing algorithm

The Simulated Annealing algorithm implemented uses the same cooling function as the classical implementation of this algorithm. It uses an initial temperature of 100° and a cooling rate of 0.05. If given an infinite time window to execute, this algorithm delivers a solution when the final temperature reaches 0.01°.
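A minimal sketch of this loop is given below, assuming the usual exponential acceptance criterion and a geometric cooling schedule (T ← (1 − rate) · T) consistent with the parameters quoted above; random_neighbour() and solution_cost() stand in for the random-change operator and the TEA*-based cost of Sect. 4.1.

```python
# A minimal sketch of the Simulated Annealing loop of Sect. 4.3 (geometric cooling assumed).
import math
import random

def simulated_annealing(initial_solution, solution_cost, random_neighbour,
                        t_start=100.0, t_end=0.01, cooling_rate=0.05):
    current, current_cost = initial_solution, solution_cost(initial_solution)
    best, best_cost = current, current_cost
    temperature = t_start
    while temperature > t_end:
        candidate = random_neighbour(current)
        candidate_cost = solution_cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse solutions with probability exp(-delta / T).
        if delta <= 0 or random.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature *= (1.0 - cooling_rate)         # cool down after every iteration
    return best, best_cost
```

The probabilistic acceptance of worse candidates at high temperature is what lets the method escape the local minima that a pure greedy reassignment of tasks would get stuck in.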
Dealing with Tasks with Different Priority Levels

As stated previously, several modifications needed to be implemented in the classic approach of this algorithm in order to deal with the restrictions imposed by real industrial environments. In order to allow the implemented Simulated Annealing algorithm to generate solutions that respect the priorities of the tasks, the optimisation problem was divided into several smaller problems. Each of these smaller problems represents the optimisation of one of the priority groups. The algorithm generates solutions to each of these problems starting with the group that has the highest priority and descending priority-wise. After generating a solution for one of these problems, the system calculates the final position of each of the robots; this position is characterised not only by the physical location of the robot but also by the estimated time it will take the robot to finish its tasks. Afterwards, this position is used as the initial position of the robots for the next priority group. This allows the algorithm to take into consideration that the solutions found for a priority group also affect the priority groups with lower priorities. This methodology is represented in Fig. 3 via a flowchart.

Fig. 3. Flowchart representation of an implementation of a simulated annealing algorithm while taking into account the possibility of existing tasks with different priorities.

The TEA* algorithm used to estimate the paths for the robots also marks the nodes belonging to the paths estimated for higher priority tasks as occupied in the corresponding step. This modification, in conjunction with the updating of the robots' initial positions, allows the algorithm to divide the path planning problem into smaller chunks without sacrificing the effect that higher priority solutions may have on the paths estimated for lower priority ones.
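A minimal sketch of this priority-group decomposition is given below; the helper names and the robot-state representation are illustrative assumptions, with optimise_group() standing for one Simulated Annealing run over a single group and simulate_plan() for the TEA*-based estimation that yields each robot's end position and finishing time.

```python
# A minimal sketch of optimising priority groups in descending order of priority.
def schedule_by_priority(task_groups, robots, optimise_group, simulate_plan):
    """task_groups: lists of tasks ordered from highest to lowest priority.
    robots: per-robot state (current position and the time at which the robot becomes free)."""
    full_plan = []
    for group in task_groups:                       # highest-priority group first
        assignment = optimise_group(group, robots)  # Simulated Annealing over this group only
        # The end pose and finishing time of each robot become the start state for the
        # next (lower-priority) group, so earlier solutions constrain the later ones.
        robots = simulate_plan(assignment, robots)
        full_plan.append(assignment)
    return full_plan
```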
Robot Exclusivity Problem

Another restriction that the system has to face is the fact that some tasks might only be executable by a specific robot or group of robots. This restriction implies that the changes made to the solution by the Simulated Annealing algorithm cannot be truly random. To respect this restriction, another modification must be added to the classical implementation of the Simulated Annealing algorithm.

To prevent the attribution of tasks to robots that are not capable of executing them, a taboo table was implemented. This taboo table holds information on which tasks have this restriction and to which robots those tasks can be assigned. By analysing the generated random change using this table, it is possible to check whether a change to the solution can be accepted or not before calculating the value associated with that change. The use of a taboo table prevents the algorithm from spending time calculating the value of solutions that are impossible for the robots to execute. The functioning of this taboo table is shown via a flowchart in Fig. 4.

The implemented random change algorithm executes two different types of changes. It either exchanges the positions of two different tasks (this is the example used to show the functioning of the taboo table in Fig. 4), or it changes the position of one randomly selected task. In this last case, if the task is affected by this restriction, the algorithm uses the taboo table to randomly select a new position to attribute to the task from the group of positions that respect the restriction associated with it. In the case where there are no other valid candidates, either other tasks or other positions that respect the restriction, to exchange the selected task with, the algorithm randomly selects another task to change and repeats the process.

This taboo table is also used in the initial clustering, in order to guarantee that the initial clustering of the tasks also respects this restriction. The taboo table is analysed when assigning a task to a cluster: tasks that are subject to this restriction can only be inserted into a cluster assigned to a robot capable of executing said tasks. All sets of tasks have approximately 10% of their total tasks affected by robot restrictions and are divided into three different priority levels (Table 1).

Fig. 4. Flowchart representation of how the modifications are incorporated into the process of generating a random change in the task set.

Table 1. Example of a Taboo Table used to solve the robot exclusivity problem.

Task ID   Possible robot ID
1         2
1         4
2         2
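A minimal sketch of the taboo-table check applied to a candidate swap is given below, loosely following the example of Table 1; the dictionary layout and the function signature are illustrative assumptions.

```python
# A minimal sketch of validating a random swap against the taboo table before costing it.
# A task absent from the table may run on any robot; a listed task only on the robots given.
TABOO = {1: {2, 4}, 2: {2}}   # task id -> set of robot ids allowed to execute it (cf. Table 1)

def swap_allowed(task_a, robot_a, task_b, robot_b, taboo=TABOO):
    """Check whether exchanging task_a (currently on robot_a) with task_b (on robot_b)
    respects the robot-exclusivity restriction, before any solution cost is computed."""
    ok_a = task_a not in taboo or robot_b in taboo[task_a]   # task_a would move to robot_b
    ok_b = task_b not in taboo or robot_a in taboo[task_b]   # task_b would move to robot_a
    return ok_a and ok_b
```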
Dealing with Idle Robots

In real multi AGV implementations it is necessary to have specific spots where the robots can recharge; these spots are usually located in places where they will not affect the normal functioning of the system. This restriction was also taken into consideration by the implemented algorithm. At the end of each task group assigned to the robot, the task scheduler module sends the robot to its charging station. This action is considered as an extra task and is taken into consideration when the algorithm is estimating the cost of each solution, since even though the robot has finished its tasks it can
still interfere with the normal functioning of the other robots and cause pathing conflicts. To prevent the robots from moving to their charging stations before finishing all their tasks, this task is permanently fixed as the last task to be executed by each of the robots. It is also kept outside the scope of the normal random change process that occurs in the Simulated Annealing algorithm, therefore preventing any change induced by the algorithm in the task set from altering its position in the task set.

5 Tests and Results

In this section the test description and the results obtained during this study are presented. As stated before, the objective of this study is to determine the efficiency and suitability of the Simulated Annealing algorithm when it is used to optimise the task scheduling of a multi AGV system. The Simulated Annealing algorithm is analysed based on two key factors: the overall cost of the final path generated by the algorithm and the time it took the algorithm to reach said solution. The performance of this algorithm is compared to the case where the tasks are randomly distributed and to the case where the tasks are optimised to reduce the distance travelled between each task, without taking possible path planning conflicts into account.

5.1 Test Parameters and Environment

In order to test the efficiency of the different methodologies, the map used in [19] was chosen as the test environment. This map is a real-life map of an industrial implementation of a multi AGV system and has dimensions of 110 × 80 m. It is represented in Fig. 5.

Fig. 5. Map of the environment used for the tests [19]. The possible paths that the robots can use are represented in the figure via the red coloured lines.
Using this map, a graph was generated that represents all the possible trajectories that the robots can take. This graph is composed of 253 vertexes and 279 edges; of the 253 vertexes, 41 are either workstations or robot charging stations. Two different versions of the system were tested: one version uses only four robots while the other uses eight. Each version of the system was tested with different sets of tasks; these sets vary not only in the number of tasks but also in the restrictions associated with each one. The tests were executed on a laptop with an Intel Core i7-6820HK 2.70 GHz CPU, 32 GB of available RAM and an NVIDIA GeForce GTX 980M graphics card.

For the purpose of these tests, a task is characterised by the movement of the robot to the "Starting Point" of said task and then by the movement of the robot from the "Starting Point" to the "End Point". In this study it is assumed that the loading and unloading of the robot happens instantly; in future iterations of this work this topic will be tackled in detail.

5.2 Four Robots and Twenty Tasks

The first scenario to be tested was the four robots scenario; in this test 20 tasks were assigned to the system, three of which can only be assigned to a specific robot. After the test was executed 16 times (enough to obtain a trustworthy average), the results were analysed and are summarised in Table 2.

Table 2. Table with the summarised results from the first scenario

Method   Best cost   Worst cost   Mean cost   Mean efficiency (SA)
SA       13030.4     14089.7      13574.6     -
Kmeans   16691.9     17693.2      17139.5     20.79%
Random   16498.5     16925.7      16739.3     18.91%

As expected, the implemented Simulated Annealing algorithm generated an average improvement of 20.79% when compared to the Kmeans clustering method and an 18.91% improvement when compared to a totally random distribution of tasks. The results of the random distribution method were surprising, although they do show that, even with only four robots, when the distribution is optimised considering only the shortest distance between tasks it tends to create more planning conflicts and consequently generate more costly paths: even though the overall distance travelled by the robots tends to be minimised, the robots tend to take longer to execute their tasks due to the need to wait for the shared resources to be freed. All three attribution methods tested were capable of generating solutions that respected all the system restrictions. In this scenario the Simulated Annealing took on average 442.5 s (slightly less than seven and a half minutes) to generate a solution.
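The "Mean efficiency (SA)" column can be read as the relative reduction of the mean cost achieved by Simulated Annealing with respect to each baseline; the short sketch below, based on that assumption about the column's definition, reproduces the values in Table 2 up to rounding of the reported means.

```python
# Deriving the "Mean efficiency (SA)" column of Table 2 from the mean costs (assumed definition).
mean_sa, mean_kmeans, mean_random = 13574.6, 17139.5, 16739.3

for name, baseline in (("Kmeans", mean_kmeans), ("Random", mean_random)):
    improvement = (baseline - mean_sa) / baseline   # relative cost reduction versus the baseline
    print(f"{name}: {improvement:.2%}")             # ~20.80% and ~18.91%
```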
The Simulated Annealing algorithm was also tested in terms of its conversion rate when applied to this scenario; the results of that test are represented in Fig. 6. These results were generated by changing the cooling rate value in order to increase or decrease the time it took the algorithm to generate a viable solution. From the results of this test it is possible to conclude that using a time interval greater than 400 s would be more than adequate for this scenario.

Fig. 6. Graph representation of the results obtained from the conversion rate tests

5.3 Eight Robots and Forty Tasks

The second scenario was the eight robots scenario; in this test 40 tasks were assigned to the system, four of which can only be assigned to a specific robot. Similarly to the previous one, this test was also executed 16 times to ensure the consistency of the results. The results were analysed and are presented in Table 3.

Table 3. Table with the summarised results from the second scenario

Method   Best cost   Worst cost   Mean cost   Mean efficiency (SA)
SA       14112       15642.3      14743.3     -
Kmeans   18303.1     19810.9      19072.7     22.68%
Random   18575.6     19424.1      18894.9     21.97%

In this scenario the Simulated Annealing algorithm took on average 881.4 s (around fifteen minutes), which is roughly double the time taken in the previous
test. This was expected, since the complexity of the problem was also significantly increased in this scenario. The overall efficiency of the improvement added by the Simulated Annealing algorithm to the solution also increased, which was also expected given the higher probability of pathing conflicts occurring in this scenario.

Fig. 7. Graph representation of the results obtained from the convergence rate tests in the eight-robot scenario

Similarly to the previous scenario, the convergence rate of the Simulated Annealing algorithm was also analysed; the results from this analysis are represented in Fig. 7. In this scenario the algorithm requires a larger time interval until the solution cost reaches a stable value. Given the larger complexity of this scenario, this increase was expected. Therefore, in order to obtain good results when using this algorithm in an eight-robot system, the time interval available for the algorithm to optimise the task attribution should be larger than 1000 s. Analysing both graphs, Figs. 6 and 7, it is possible to ascertain that the convergence time does not behave in a linear fashion when the problem is scaled.

6 Conclusions

In this work, a study was presented on the implementation of the Simulated Annealing algorithm to optimise the assignment of tasks in a multi-AGV system. This implementation uses an estimate of the paths planned for the robots, provided by Time Enhanced A* (TEA*), to minimise the cost of time and distance to perform the tasks assigned to the system, taking into account possible path conflicts that may occur during task execution. This article also compares
the performance of this method with the results obtained from the minimisation of the travelled distance without taking the path estimation into consideration. The results show, in the two scenarios implemented (four robots with twenty tasks, three of which are restricted; eight robots with forty tasks, four of which are restricted), that the implemented Simulated Annealing algorithm has an average improvement of 20.79% (four-robot scenario) and 22.68% (eight-robot scenario) when compared to kmeans, and an average improvement of 18.91% and 21.97%, respectively, when compared with a totally random algorithm. In this work the loading and unloading of the robot happens instantly; in future work this topic will be addressed in more detail. In future iterations of this work the efficiency of other optimisation methodologies will also be studied. These studies will be used as the foundation for the creation of a task scheduling module, capable of optimising the task attribution while being sensitive to the available time.

Acknowledgements. This work is financed by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia, within project UIDB/50014/2020.

References

1. Siefke, L., Sommer, V., Wudka, B., Thomas, C.: Robotic systems of systems based on a decentralized service oriented architecture. Robotics 9, 78 (2020). https://doi.org/10.3390/robotics9040078
2. Kuhn, K., Loth, S.: Airport service vehicle scheduling. In: Eighth USA/Europe Air Traffic Management Research and Development Seminar (ATM2009) (2009)
3. Lawler, E.L., Lenstra, J.K., Rinnooy Kan, A.H.G., Shmoys, D.B.: The Travelling Salesman Problem. Wiley, Chichester (1985)
4. Lindholm, P., Giselsson, N.-H., Quttineh, H., Lidestam, C., Johnsson, C., Forsman, K.: Production scheduling in the process industry. In: 22nd International Conference on Production Research (2013)
5. Wall, M.B.: Genetic Algorithm for Resource-Constrained Scheduling. Ph.D. thesis, Massachusetts Institute of Technology (1996)
6. Baar, T., Brucker, P., Knust, S.: Meta-heuristics: Advances and Trends in Local Search Paradigms for Optimisation, vol. 18 (1998)
7. Vivaldini, K., Rocha, L.F., Martarelli, N.J., et al.: Integrated tasks assignment and routing for the estimation of the optimal number of AGVs. Int. J. Adv. Manuf. Technol. 82, 719–736 (2016). https://doi.org/10.1007/s00170-015-7343-4
8. Zhang, Y., Parker, L.E.: Multi-robot task scheduling. In: 2013 IEEE International Conference on Robotics and Automation, Karlsruhe, Germany, pp. 2992–2998 (2013). https://doi.org/10.1109/ICRA.2013.6630992
9. Zanlongo, S.A., Abodo, F., Long, P., Padir, T., Bobadilla, L.: Multi-robot scheduling and path-planning for non-overlapping operator attention. In: 2018 Second IEEE International Conference on Robotic Computing (IRC), Laguna Hills, CA, USA, pp. 87–94 (2018). https://doi.org/10.1109/IRC.2018.00021
10. Crandall, J.W., Cummings, M.L., Della Penna, M., de Jong, P.M.A.: Computing the effects of operator attention allocation in human control of multiple robots. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 41(3), 385–397 (2011). https://doi.org/10.1109/TSMCA.2010.2084082
11. Cummings, M.L., Mitchell, P.J.: Operator scheduling strategies in supervisory control of multiple UAVs. Aerospace Sci. Technol. 11(4), 339–348 (2007)
12. LaValle, S.M., Hutchinson, S.A.: Optimal motion planning for multiple robots having independent goals. IEEE Trans. Robot. Autom. 14(6), 912–925 (1998). https://doi.org/10.1109/70.736775
13. Wang, H., Chen, W., Wang, J.: Coupled task scheduling for heterogeneous multi-robot system of two robot types performing complex-schedule order fulfillment tasks. Robot. Auton. Syst. 131, 103560 (2020)
14. Udhayakumar, P., Kumanan, S.: Task scheduling of AGV in FMS using non-traditional optimization techniques. Int. J. Simul. Model. 9(1), 28–39 (2010)
15. Mudrova, L., Hawes, N.: Task scheduling for mobile robots using interval algebra. In: 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, pp. 383–388 (2015). https://doi.org/10.1109/ICRA.2015.7139027
16. Zhong, M., Yang, Y., Dessouky, Y., Postolache, O.: Multi-AGV scheduling for conflict-free path planning in automated container terminals. Comput. Ind. Eng. 142, 106371 (2020)
17. Zelnik-Manor, L., Perona, P.: Self-tuning spectral clustering. In: Advances in Neural Information Processing Systems 17 (NIPS 2004) (2004)
18. Rani, S., Kholidah, K.N., Huda, S.N.: A development of travel itinerary planning application using traveling salesman problem and k-means clustering approach. In: Proceedings of the 2018 7th International Conference on Software and Computer Applications (ICSCA 2018), New York, NY, USA, pp. 327–331. Association for Computing Machinery (2018). https://doi.org/10.1145/3185089.3185142
19. Olmi, R., Secchi, C., Fantuzzi, C.: Coordination of multiple AGVs in an industrial application, pp. 1916–1921 (2008). https://doi.org/10.1109/ROBOT.2008.4543487
Multi AGV Industrial Supervisory System

Ana Cruz1(B), Diogo Matos2, José Lima2,3, Paulo Costa1,2, and Pedro Costa1,2

1 Faculty of Engineering, University of Porto, Porto, Portugal
up201606324@edu.fe.up.pt, {paco,pedrogc}@fe.up.pt
2 INESC-TEC - INESC Technology and Science, Porto, Portugal
diogo.m.matos@inesctec.pt
3 Research Centre of Digitalization and Intelligent Robotics, Instituto Politécnico de Bragança, Bragança, Portugal
jllima@ipb.pt

Abstract. Automated guided vehicles (AGV) represent a key element in industries' intralogistics, and the use of AGV fleets brings multiple advantages. Nevertheless, coordinating a fleet of AGV is already a complex task, and when the fleet is exposed to delays in the trajectory and communication faults this can represent a threat, compromising the safety, productivity and efficiency of these systems. Concerning this matter, trajectory planning algorithms allied with supervisory systems have been studied and developed. This article aims to, based on work developed previously, implement and test a Multi AGV Supervisory System on real robots and analyse how the system responds to the dynamics of a real environment, analysing its intervention, what influences it and how the execution time is affected.

Keywords: Multi AGV coordination · Time enhanced A* · Real implementation

1 Introduction

Automated guided vehicle systems (AGVS), a flexible automatic means of conveyance, have become a key element in industries' intralogistics, with more industries making use of AGVS since the mid-1990s [13]. These systems can assist in moving and transporting items in manufacturing facilities, warehouses, and distribution centres without any permanent conveying system or manual intervention. They follow configurable guide paths, allowing the optimisation of storage, picking, and transport in environments where space is at a premium [1]. The use of AGV represents a significant reduction of labour cost, an increase in safety and the sought-after increase of efficiency. By replacing a human worker with an AGV, a single expense for the equipment is paid - the initial investment - avoiding the ongoing costs that would come with a new hire [2]. These systems are also programmed to take over repetitive and fatiguing tasks that could
diminish the human worker's attention and lead to possible accidents. Therefore, industries that use AGV can significantly reduce these accidents and, with 24 hours per day, 7 days per week operability and high production output, AGV ensure worker safety while maximising production [3]. Multiple robot systems can accomplish tasks that no single robot can accomplish, since a single robot, no matter how capable it is, is spatially limited [5]. Nevertheless, a multi-AGV environment requires special attention. Coordinating a fleet of AGV is already a complex task, and restricted environments that may expose the AGV to delays in the trajectory and even communication faults can represent a threat, compromising the safety, productivity and efficiency of these systems. To solve this, trajectory planning algorithms allied with supervisory systems have been studied and developed. Based on the work developed by [8], it is intended to implement and test a Multi AGV Supervisory System on real robots and analyse how the system responds to the dynamics of a real environment. This paper aims to analyse the intervention of the implemented Supervisory Module, what influences it and how the execution time is affected considering other variables such as the robots' velocity.

2 Literature Review

The system performance is highly connected with path planning and trajectory planning. Path planning describes geometrically and mathematically the way from the starting point to the destination point, avoiding collisions with obstacles. On the other hand, trajectory planning is that path as a function of time: for each time instance, it defines where the robot must be positioned [7]. When choosing a path planning method, several aspects have to be considered, such as the type of intended optimisation (path length, execution time), computational complexity and even whether it is complete (always finds a solution when it exists), resolution complete (there is a solution for a particular discretisation of the environment) or probabilistically complete (the probability of finding a solution converges to 1 as the time tends to infinity) [7]. When it comes to coordinating a multi-AGV environment, the Time Enhanced A* (TEA*) Algorithm revealed itself to be well suited, as it is described as a multi-mission algorithm that incrementally builds the path of each vehicle considering the movements of the others [10].

2.1 TEA* Algorithm

The TEA* Algorithm is a graph search algorithm, based on the A* algorithm, developed by [10]. This approach aims to fulfil industrial needs by creating routes that minimise the time allocated for each task, avoid collisions between AGV and prevent the occurrence of deadlocks. Considering a graph G with a set of vertexes V and edges E (links between the vertexes), with a representation of the time k = [0; TMax], each AGV can
Fig. 1. TEA* algorithm: input map and analysed neighbour cells focusing on the cell with the AGV's position [11]

only start and stop in vertexes, and a vertex can only be occupied by one vehicle at each temporal layer [11]. As can be seen in Fig. 1, the analysed neighbour cells belong to the next temporal layer and include the cell containing the AGV's current position. To find the minimal path, the algorithm starts by calculating the position of each robot in each temporal layer, and the next analysed neighbour cell depends on a cost function. In the TEA* approach, the heuristic function used is the Euclidean distance. Hence, possible future collisions can be identified and avoided at the beginning (k = 0) of the paths' calculation [10]. Since the coordination between robots is essential to avoid collisions and to guarantee the correct execution of the missions, this approach ensures it, as the previously calculated paths become moving obstacles [10]. A short code sketch of this temporal-layer search is included after the system overview below. Combining this approach with a supervisory system revealed to be a more efficient way to avoid collisions and deadlocks [8]. Instead of executing the TEA* methodology as an online method (the paths are re-calculated every execution cycle, a computationally heavy procedure), the supervisory system detects delays in the communication, deviations in the routes of the robots and communication faults, and triggers the re-calculation of the paths, if needed. Therefore, [8] proposed a supervisory system consisting of two modules: the Planning Supervision Sub-Module and the Communication Supervision Sub-Module.

3 System Overall Structure

To implement and test the work developed by [8], and further modifications, a shop floor map was set up and a fleet of three robots was developed, together with their control module, a localisation system based on the pose estimation of fiducial markers and a central control module. The architecture of the system developed is represented in Fig. 2. The Central Control Module of the system is composed of three hierarchical modules: the Path Planning Module (TEA* Algorithm) and the Supervisory Module, comprising the Planning Supervision and the Communication Supervision.
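To make the temporal-layer expansion of TEA* described in Sect. 2.1 more concrete, the following minimal sketch performs an A*-style search over (vertex, time-layer) pairs, using the Euclidean heuristic and treating the other robots' already planned paths as moving obstacles. It is a simplified illustration, not the implementation of [10] or [8]: the data structures (`neighbours`, `pos`, `occupied`) and the explicit waiting-in-place move are assumptions made for brevity.

```python
import heapq
import math

def tea_star(neighbours, pos, start, goal, occupied, t_max):
    """TEA*-style search on a time-expanded graph (illustrative sketch).

    neighbours[v]   : list of vertexes adjacent to v
    pos[v]          : (x, y) coordinates of vertex v
    occupied[(v,k)] : True when another AGV's plan reserves v at layer k
    Returns one vertex per temporal layer (repeated vertexes = waiting).
    """
    def h(v):  # Euclidean distance heuristic
        return math.dist(pos[v], pos[goal])

    frontier = [(h(start), 0, start, [start])]   # (f, layer k, vertex, path)
    visited = set()
    while frontier:
        f, k, v, path = heapq.heappop(frontier)
        if v == goal:
            return path
        if (v, k) in visited or k >= t_max:
            continue
        visited.add((v, k))
        # Expand into the *next* temporal layer, including staying in place.
        for n in list(neighbours[v]) + [v]:
            if not occupied.get((n, k + 1), False):
                heapq.heappush(frontier, (k + 1 + h(n), k + 1, n, path + [n]))
    return None   # no conflict-free path found within t_max layers
```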
Fig. 2. System architecture and relations between the different modules (Robot Localisation Module, Robot Control Module and Central Control Module, the latter comprising Path Planning (TEA* Algorithm), Planning Supervision and Communication Supervision)

The Robot Localisation Module locates and estimates the robot's coordinates and communicates simultaneously with the Central Control Module and the Robot Control Module. Lastly, the Robot Control Module is responsible for calculating and delivering, through UDP packets, the most suitable velocities for each robot's wheels, depending on the destination point/node indicated by the Central Control Module. The communication protocol used, UDP (User Datagram Protocol), is responsible for establishing low-latency and loss-tolerating connections between applications on the internet. Because it enables the transfer of data before an agreement is provided by the receiving party, the transmissions are faster. As a consequence, UDP is beneficial in time-sensitive communications [12].

3.1 Robot Localisation Module

To run the system, the current coordinates, X and Y, and orientation, Theta, of the robots are crucial. Suitable approaches would be odometry or even an Extended Kalman Filter localisation using beacons [4]. However, for pose estimation, the Robot Localisation Module applies a computer vision algorithm. For this purpose, a camera was set above the centre of the shop floor map and the robots were identified with ArUco markers [6,9], as represented in Fig. 3. The ArUco marker is a square marker characterised by a wide black border and an inner binary matrix that determines its identifier. Using the ArUco library developed by [9] and [6] and associating an ArUco tag id to each robot, it was possible to obtain the coordinates of the robots' positions with less than 2 cm of error. However, the vision algorithm revealed to be very sensitive to poor
illumination conditions and unstable under non-homogeneous lighting. The size of the tag used was 15 by 15 cm with a white border of 3.5 cm.

Fig. 3. Robot localisation module: computer vision algorithm for pose estimation through the detection and identification of ArUco markers

The Robot Localisation Module communicates with the Central Control Module and the Robot Control Module through UDP messages. Each UDP message carries the positions of all robots detected, and one message is sent every 170 ms (camera frame rate plus vision algorithm processing time).

3.2 Robot Control Module

The communication between the Central Control Module, constituted by the Path Planning Module (TEA* Algorithm) and the Supervisory Module, and the Robot Control Module is established through UDP messages. After the paths are planned, the information is sent to the Robot Control Module through UDP packets. Each packet has the structure represented in Fig. 4.

Fig. 4. Communication between central control module and robot control module: UDP packet structure

The heading of the packet, highlighted in yellow, carries the robot id (N), the priority of the robot (P) and the number of steps that the packet contains (S). The body of the packet, highlighted in green, contains the information of the S steps. This information is organised by the number of the step (I), the coordinates X and Y and the direction D (which is translated into an angle in radians inside the trajectory control function). The ending of the packet, highlighted in blue, indicates whether the robot has reached the destination and carries a termination character. The indicator T0 or T1 informs whether the
robot's current position is the final destination (T1 if yes, T0 if not) as a form of reassurance. When a robot finishes its mission, the UDP packet sent does not contain any steps (S0) and therefore does not contain any information in its body. In this way, the indicator T0 or T1 assures the ending of the mission. Lastly, the character F is the termination character chosen to indicate the end of the packet. It is important to note that, even if the communication between the Central Control Module and the Robot Control Module is shut down, the robots are capable of continuing their paths until reaching the last step of the last UDP packet received. This property is important when communication faults may exist. Thus, the mission is not compromised if the Central Control Module loses communication with the robot. After decoding the received packet, the robot's control is done through the calculation of the most suitable velocity for each wheel for each step's X and Y coordinates and direction/Theta angle. That operation is done through the function represented by the diagram in Fig. 5.

Fig. 5. Robot trajectory control: diagram of the velocity control function (state machine with the states Rotate, Go_Forward, De_Accel, Final_Rot and Stop, whose transitions compare the angular error erro_theta and the distance error erro_dist against thresholds such as MAX_ETF, TOL_FINDIST, DIST_DA, THETA_DA, DIST_NEWPOSE and THETA_NEWPOSE)
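As an illustration of the packet layout described above (header N, P, S; one I, X, Y, D group per step; trailer T0/T1 and the terminator F), the sketch below encodes and decodes such a message as plain ASCII tokens. The byte-level encoding actually used in the system is not specified in the paper, so this textual format, the field widths and the function names are assumptions, not the authors' protocol.

```python
def encode_packet(robot_id, priority, steps, reached_destination):
    """steps: list of (index, x, y, direction) tuples."""
    parts = [f"N{robot_id}", f"P{priority}", f"S{len(steps)}"]
    for i, x, y, d in steps:
        parts += [f"I{i}", f"X{x:.2f}", f"Y{y:.2f}", f"D{d}"]
    parts.append("T1" if reached_destination else "T0")
    parts.append("F")                       # termination character
    return " ".join(parts).encode("ascii")

def decode_packet(data):
    tokens = data.decode("ascii").split()
    assert tokens[-1] == "F", "missing terminator"
    robot_id = int(tokens[0][1:])
    priority = int(tokens[1][1:])
    n_steps = int(tokens[2][1:])
    steps, cursor = [], 3
    for _ in range(n_steps):
        i, x, y, d = tokens[cursor:cursor + 4]
        steps.append((int(i[1:]), float(x[1:]), float(y[1:]), int(d[1:])))
        cursor += 4
    reached = tokens[cursor] == "T1"
    return robot_id, priority, steps, reached
```

The resulting bytes could then be sent through an ordinary UDP socket, matching the low-latency, connectionless behaviour described in Sect. 3.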
In each stage of the diagram, the linear and angular velocities are calculated accordingly. The velocity for each wheel/motor is calculated through Eqs. 1 and 2, where b represents the distance between both wheels.

M0Speed = linear velocity + (angular velocity · b)/2    (1)

M1Speed = linear velocity − (angular velocity · b)/2    (2)

3.3 Robots' Architecture

To validate the proposed robot orchestration algorithm, three differential mobile robots were developed as a small-scale version of an industrial Automated Guided Vehicle (AGV). Each one measures 11 cm in length by 9 cm in width, as presented in Fig. 6. The robots were produced through additive manufacturing using a 3D printer.

Fig. 6. Example of the robot developed to perform tests: small-scale AGV, powered by three Li-Ion batteries that supply an ESP32 microcontroller and stepper motor drivers

Each robot is powered by three Li-Ion MR18650 batteries, placed on the top of the robot, that supply the main controller board, based on an ESP32 microcontroller, and the DRV8825 stepper motor drivers. The ESP32 provides the WiFi connection that allows it to communicate with the central system. The main architecture of the small-scale AGV is presented in Fig. 7. A push-button power switch controller is used to turn the system off when the batteries are discharged, avoiding damaging them. A voltage divider is applied so that the microcontroller can measure the battery voltage. The communication between the Robot Control Module and each robot is also done using UDP packets: it sends commands to the motors while receiving information about the odometry (steps). To avoid slippage, a circular rubber
is placed on each wheel. Beyond the two wheels, a free contact support of Teflon with low friction is used to support the robot.

Fig. 7. Robot architecture diagram (ESP32, two DRV8825 drivers with NEMA stepper motors, 3 × 18650 battery pack, push-button power switch and left/right wheels): black lines represent control signals and red lines represent power supply. (Color figure online)

3.4 Central Control Module

As explained previously, the Central Control Module is the main module and the one responsible for running the TEA* algorithm with the help of its supervisory system.

Supervisory System. The supervisory system proposed by [8] is responsible for deciding when to re-plan the robots' paths and for detecting and handling communication faults through two sub-modules hierarchically related to each other: these modules are referred to as the Planning Supervision Sub-Module and the Communication Supervision Sub-Module. In environments where no communication faults are present, the Planning Supervision Sub-Module is responsible for determining when one of the robots is delayed or ahead of time (according to the path planned by the TEA* algorithm) and also for detecting when a robot has completed the current step of its path. On the other hand, the Communication Supervision Sub-Module is only responsible for detecting communication faults. When a communication fault is detected, the Planning Supervision Sub-Module is overruled and the re-calculation of the robots' paths is controlled by the Communication Supervision Sub-Module. Once the communication is reestablished, the supervision is returned to the Planning Supervision Sub-Module.
In the Central Control Module, after the initial calculation of the robots' paths, done by the Path Planning Module, the Planning Supervision Sub-Module is responsible for checking three critical situations that can lead to collisions and/or deadlocks:

– If a robot is too distant from its planned position;
– If the maximum difference between steps is greater than 1;
– If there is a robot moving into a position currently occupied by another.

All these situations could be covered by only one: verifying when the robots become unsynchronised. If the robots are not synchronised, it means that each robot is at a different step of the planned path, which can lead to collisions and deadlocks. However, triggering the supervisory system by this criterion may lead to an ineffective system where the paths are constantly being recalculated. A minimal sketch of these checks is given below, after the description of the test setup. The next section describes the tests performed to study the impact of the supervisor and, consequently, how the re-planning of the robots' paths affects the execution time and the number of tasks completed, and what influences the supervisor.

4 Experiments and Results

The shop floor map designed is presented in Fig. 8. This map was turned into a graph with links around the size of the robot plus a safety margin; in this case, the links measure approximately 15 cm.

Fig. 8. Shop floor map designed to validate the work developed

The decomposition of the map was done using the Map Decomposition Module, developed by [8], and incorporated in the Central Control Module through an XML file. To test the Planning Supervision Sub-Module and its intervention, a workstation was assigned to each robot as written in Table 1. After reaching the workstation, each robot moves towards the nearest rest station, which is any other station that is not chosen as a workstation. In Fig. 9, it is possible to observe the chosen workstations in blue and the available rest stations in red.
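Referring back to the three critical situations monitored by the Planning Supervision Sub-Module (Sect. 3.4), the following sketch shows one way such checks might be expressed. The robot attributes, the helper names and the 10 cm distance criterion (the value reported for Test A in Sect. 4.1) are assumptions for illustration only; this is not the module's actual code.

```python
MAX_POSITION_ERROR = 0.10   # metres; assumed from the 10 cm criterion below
MAX_STEP_DIFFERENCE = 1

def needs_replanning(robots):
    """Return True when any of the three critical situations is detected.

    Each robot is assumed to expose: pose (x, y), planned_pose(step),
    current_step, current_vertex and next_vertex.
    """
    import math
    # 1) A robot is too distant from its planned position.
    for r in robots:
        if math.dist(r.pose, r.planned_pose(r.current_step)) > MAX_POSITION_ERROR:
            return True
    # 2) The maximum difference between the robots' steps exceeds the limit.
    steps = [r.current_step for r in robots]
    if max(steps) - min(steps) > MAX_STEP_DIFFERENCE:
        return True
    # 3) A robot is moving into a vertex currently occupied by another robot.
    for r in robots:
        for other in robots:
            if other is not r and other.current_vertex == r.next_vertex:
                return True
    return False
```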
Table 1. Assigned tasks for Test A

Robot ID | Robot station | Workstation | Rest station
1        | No. 8         | No. 6       | No. 7
2        | No. 2         | No. 4       | No. 3
3        | No. 7         | No. 9       | No. 10

Fig. 9. Shop floor skeleton graph with the workstations selected for the following tests in blue (Color figure online)

The following tests aimed to evaluate how many times the supervisor had to intervene and which situations triggered it the most. It was expected that the paths would be re-calculated more times when the Supervisory System was triggered upon every delay (Test B), instead of re-calculating the paths only in those critical situations (Test A). It was also intended to evaluate whether the nominal velocity could influence the re-planning of the paths (Test C); a decrease in the number of times the supervisor had to intervene was expected, due to a more similar step execution time between the robots (fewer delays). Initially, the nominal linear velocity and the nominal angular velocity set in the Robot Control Module were 800 steps/s (corresponding to 4.09 cm/s) and 100 rad/s, respectively. Before running the algorithm, it was already possible to predict a critical situation. Since robot number 1 and number 3 have part of their paths in common, the TEA* algorithm defined waiting steps to prevent the collision of robot number 1 with robot number 3. However, if robot number 3 suffered any delay, robot number 1 could still collide with it.

4.1 Test A

During this test, the Supervisory Module only searched for the three situations previously mentioned to re-plan the paths. This mission was executed five times and the results of each sample and the average results are registered in Table 2 and Table 3, respectively. The situations that triggered the Path Supervisory Sub-Module the most were a robot being a step ahead of time (verified by obtaining the lowest
current step of the robots, comparing the advanced robot's coordinates with the ones it would have if it were executing that lower step, and noticing that the difference was superior to 10 cm) and the maximum step difference between all robots being superior to one.

Table 2. Execution of Test A: supervisory module followed the criteria defined previously

# | Supervisor intervention | Exec. time (min.) | Deadlocks/Collisions | Tasks completed
1 | 22 | 1:50 | 0 | 100%
2 | 23 | 1:51 | 0 | 100%
3 | 23 | 1:54 | 0 | 100%
4 | 32 | 1:47 | 0 | 100%
5 | 21 | 2:10 | 0 | 100%

Table 3. Average values of Test A: supervisory module followed the criteria defined previously

# | Supervisor intervention | Exec. time (min.) | Deadlocks/Collisions | Tasks completed
5 | 24.2 | 2:02 | 0 | 100%

4.2 Test B

The mission previously described was also tested, with the same velocity conditions, with the Supervisory Sub-Module analysing and acting whenever any robot became out of sync, instead of waiting for any of the other situations to happen. The results of each sample and the average results are registered in Table 4 and Table 5, respectively. Compared with the previous results, the supervisor was executed an average of 36.6 times, 12.4 times more. However, since the paths were planned again at the slightest delay, robot number 1 and number 3 never got as close as in the first performance. Therefore, this criterion was able to correct the issues before the other situations triggered the supervisor.

4.3 Test C

Focusing on the supervisor tested in Test A (the one that searches for the three critical situations), the nominal velocities of the robots, in the Robot Control Module, were increased by 25%. Therefore, the nominal linear velocity was set at 1000 steps/s and the nominal angular velocity at 125 rad/s. The results are described in Table 6 and Table 7.
Table 4. Execution of Test B: Supervisory Module acts upon any delay

# | Supervisor intervention | Exec. time (min.) | Deadlocks/Collisions | Tasks completed
1 | 34 | 1:47 | 0 | 100%
2 | 37 | 1:52 | 0 | 100%
3 | 37 | 1:49 | 0 | 100%
4 | 39 | 1:45 | 0 | 100%
5 | 36 | 1:50 | 0 | 100%

Table 5. Average values of Test B: Supervisory Module acts upon any delay

# | Supervisor intervention | Exec. time (min.) | Deadlocks/Collisions | Tasks completed
5 | 36.6 | 1:49 | 0 | 100%

Table 6. Execution of Test C: nominal velocities increased by 25%

# | Supervisor intervention | Exec. time (min.) | Deadlocks/Collisions | Tasks completed
1 | 25 | 1:47 | 0 | 100%
2 | 26 | 1:42 | 0 | 100%
3 | 30 | 1:40 | 0 | 100%
4 | 25 | 1:44 | 0 | 100%
5 | 23 | 1:40 | 0 | 100%

Table 7. Average values of Test C: nominal velocities increased by 25%

# | Supervisor intervention | Exec. time (min.) | Deadlocks/Collisions | Tasks completed
5 | 25.8 | 1:43 | 0 | 100%

Comparing these results with the ones in Test A (Table 2 and Table 3), only the execution time varied. This means that what may cause the delay between the robots is either the time that it takes for a robot to rotate not corresponding to the time that it takes to cross a link (since one step corresponds to crossing from one node to the other but also corresponds to a rotation of 90°), or the links (the distance between two nodes) not having a similar size. The same test was repeated, but with the nominal linear velocity at 800 steps/s and the nominal angular velocity at 125 rad/s; the results are in Table 8 and Table 9. The average number of times that the supervisor recalculated the path is lower when the angular velocity is 25% higher. However, increasing too much
this value would cause new delays (because the time taken by one rotation step would be less than the time of one forward step). The average execution time is also lower, as expected.

Table 8. Execution of Test C': nominal linear velocity at 800 steps/s and nominal angular velocity at 125 rad/s

# | Supervisor intervention | Exec. time (min.) | Deadlocks/Collisions | Tasks completed
1 | 19 | 1:47 | 0 | 100%
2 | 21 | 1:44 | 0 | 100%
3 | 24 | 1:43 | 0 | 100%
4 | 24 | 1:42 | 0 | 100%
5 | 24 | 1:46 | 0 | 100%

Table 9. Average values of Test C': nominal linear velocity at 800 steps/s and nominal angular velocity at 125 rad/s

# | Supervisor intervention | Exec. time (min.) | Deadlocks/Collisions | Tasks completed
5 | 22.4 | 1:44 | 0 | 100%

4.4 Results Discussion

The time taken by the robots to perform their tasks is slightly longer in Test A. This can be justified by the situations in which the supervisor acts being more critical, resulting in a different path instead of extra steps on the same node (the robot waiting in the same position). For example, in Table 2, the execution time of sample number 5 stands out from the others. This longer performance is justified by the alteration of the path of robot number 1. The addition of extra steps in station number 7, for robot number 1 to wait until robot number 3 had cleared the section, was a consequence of the re-planning done by the supervisor. Even though, in the other samples, it was possible to observe robot number 1 starting to rotate in the direction of station number 7, the robot never moved into that node/station, resulting in a shorter execution time. In sample number 4 of Test A, it was also possible to observe robot number 2 taking an alternative path to reach its rest station. The execution time was not significantly affected, but the supervisor had to recalculate the paths 32 times. When it comes to the results of Test B, sample number 4 is also distinct. Even though the supervisor recalculates the paths more frequently, the execution time is a little shorter than in the other samples. This can be a result of the supervisor
being called in simpler situations that do not involve a change of the path and are quickly sorted out. In conclusion, it is possible to verify that there is a trade-off between the number of times that the path is calculated (re-calculating the paths frequently can be a computationally heavy procedure) and the mission's execution time. If the supervisor is only called in critical situations, the algorithm will be executed faster and it will not be as computationally heavy, but the robots may take longer to perform their missions due to alternative and longer paths. Conversely, if the paths are re-calculated every time a robot becomes unsynchronised, the most likely outcome is the TEA* algorithm adding extra steps on the current node, making the robots wait for each other and, consequently, avoiding longer paths. In Test C, the impact of the robots' velocity on the number of times the paths were re-calculated was evaluated. Even though in experiment Test C' the angular velocity seems to be better fitted, the results are not very different. Therefore, the size of the links should be tested to evaluate their impact on the synchronisation of the robots. For this mission, the trajectory that each robot should have followed, according to the initial calculation done by the TEA* algorithm, is represented in Fig. 10 (rotations are not represented). In most cases, the paths were mainly maintained because the re-calculation only forced the robots to stop and wait until all robots were in sync. However, as described before, in sample number 4 and number 5 of Test A, the trajectories performed were different and can be visualised online: Test A sample number 4¹, Test A sample number 5², and Test B sample number 4³.

Fig. 10. Initial path schematic calculated by the Path Planning Module (rotations are not represented): (a) steps 1 to 6, (b) steps 6 and 7, (c) steps 7 to 10, (d) steps 10 to 14, (e) steps 14 to 17, (f) steps 17 and 18

1 https://www.youtube.com/watch?v=XeMGC1BlOg8.
2 https://www.youtube.com/watch?v=7tiBd8hPfKE.
3 https://www.youtube.com/watch?v=NmdK5b0vj64.
5 Conclusions

This article presented the implementation and evaluation of a supervisor under different situations. Through the development of a small fleet of small-scale industrial Automated Guided Vehicles and their control and localisation modules, it was possible to test the paths planned for them, provided by the Time Enhanced A* (TEA*) algorithm, and how the supervisor had to intervene and recalculate the paths in different situations, always concerning the delays of the robots (which could consequently evolve into collisions and deadlocks). Throughout the executed tests, it was possible to notice that there is a trade-off between the number of times that the path is calculated and the mission's execution time. Recalculating the robots' paths frequently makes the Central Control Module computationally heavy, but if the paths are recalculated every time a robot becomes unsynchronised, the most likely outcome is the TEA* algorithm adding extra steps on the current node, making the robots wait for each other and, consequently, avoiding longer paths. This results in a shorter execution time for the robots. Meanwhile, only triggering the Supervisor Sub-Module in critical situations might lead to the robots taking a longer time to perform their missions due to alternative and longer paths, but the Central Control Module would not be as computationally heavy. It was also studied whether the robots' velocity could be the reason for the delays detected by the supervisor. Since the time taken for a robot to go from one step to another should be equal in all situations for all robots, the possibility that steps which contemplate rotations take a longer time than steps that only include going forward had to be considered. However, the results did not show a significant difference, meaning that the delays could be caused by links of different sizes. This hypothesis should be addressed in future work.

Acknowledgements. This work is financed by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia, within project UIDB/50014/2020.

References

1. Automated Guided Vehicle Market Size, Share & Trends Analysis Report by Vehicle Type, by Navigation Technology, by Application, by End-use Industry, by Component, by Battery Type, by Region, and Segment Forecasts. https://www.grandviewresearch.com/industry-analysis/automated-guided-vehicle-agv-market. Accessed 21 Jan 2021
2. The Advantages and Disadvantages of Automated Guided Vehicles (AGVs). https://www.conveyco.com/advantages-disadvantages-automated-guided-vehicles-agvs. Accessed 21 Jan 2021
3. Benefits of Industrial AGVs in Manufacturing. https://blog.pepperl-fuchs.us/4-benefits-of-industrial-agvs-in-manufacturing. Accessed 21 Jan 2021
4. Bittel, O., Blaich, M.: Mobile robot localization using beacons and the Kalman filter technique for the Eurobot competition. In: Obdržálek, D., Gottscheber, A. (eds.) EUROBOT 2011. CCIS, vol. 161, pp. 55–67. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21975-7_6
5. Cao, Y.U., Fukunaga, A.S., Kahng, A.: Cooperative mobile robotics: antecedents and directions. Autonom. Robot. 4, 7–24 (1997)
6. Garrido-Jurado, S., Muñoz Salinas, R., Madrid-Cuevas, F.J., Medina-Carnicer, R.: Generation of fiducial marker dictionaries using mixed integer linear programming. Patt. Recogn. 51, 481–491 (2016)
7. Gomes da Costa, P.L.: Planeamento Cooperativo de Tarefas e Trajectórias em Múltiplos Robôs. PhD thesis, Faculdade de Engenharia da Universidade do Porto (2011)
8. Matos, D., Costa, P., Lima, J., Costa, P.: Multi AGV coordination tolerant to communication failures. Robotics 10(2), 55 (2021)
9. Romero-Ramirez, F.J., Muñoz-Salinas, R., Medina-Carnicer, R.: Speeded up detection of squared fiducial markers. Image Vis. Comput. 76, 38–47 (2018)
10. Santos, J., Costa, P., Rocha, L.F., Moreira, A.P., Veiga, G.: Time Enhanced A*: towards the development of a new approach for multi-robot coordination. In: Proceedings of the IEEE International Conference on Industrial Technology, pp. 3314–3319 (2015)
11. Santos, J., Costa, P., Rocha, L., Vivaldini, K., Moreira, A.P., Veiga, G.: Validation of a time based routing algorithm using a realistic automatic warehouse scenario. In: Reis, L., Moreira, A., Lima, P., Montano, L., Muñoz-Martinez, V. (eds.) ROBOT 2015: Second Iberian Robotics Conference: Advances in Robotics, vol. 2, pp. 81–92 (2016)
12. UDP (User Datagram Protocol). https://searchnetworking.techtarget.com. Accessed 13 June 2021
13. Ullrich, G.: The history of automated guided vehicle systems. In: Automated Guided Vehicle Systems, pp. 1–14. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-44814-4_1
Dual Coulomb Counting Extended Kalman Filter for Battery SOC Determination

Arezki A. Chellal1,2(B), José Lima2,3, José Gonçalves2,3, and Hicham Megnafi1,4

1 Higher School of Applied Sciences, BP165, 13000 Tlemcen, Algeria
2 Research Centre of Digitalization and Intelligent Robotics CeDRI, Instituto Politécnico de Bragança, 5300-252 Bragança, Portugal
{arezki,jllima,goncalves}@ipb.pt
3 Robotics and Intelligent Systems Research Group, INESC TEC, 4200-465 Porto, Portugal
4 Telecommunication Laboratory of Tlemcen LTT, University of Abou Bakr Belkaid, BP119, 13000 Tlemcen, Algeria
h.megnafi@essa-tlemcen.dz

Abstract. The importance of energy storage continues to grow, whether in power generation, consumer electronics, aviation, or other systems. Therefore, energy management in batteries is becoming an increasingly crucial aspect of optimizing the overall system and must be done properly. Very few works have been found in the literature proposing the implementation of algorithms such as the Extended Kalman Filter (EKF) to predict the State of Charge (SOC) in small systems such as mobile robots, where in some applications the computational power is severely lacking. To this end, this work proposes an implementation of the two algorithms most reported in the literature for SOC estimation in an ATMEGA328P microcontroller-based BMS. This embedded system is designed taking into consideration the criteria already defined for such a system, adding the aspects of flexibility and ease of implementation, with an average error of 5% and an energy efficiency of 94%. One of the implemented algorithms performs the prediction while the other is responsible for the monitoring.

Keywords: Prediction algorithm · Battery management system · Extended Kalman filter · Coulomb counting algorithm · Engineering applications

1 Introduction

Embedded systems are ubiquitous today, but because these systems are barely perceptible, their importance and impact are often underestimated. They are used as sub-systems in a wide variety of applications for an ever-increasing
diversity of functions [1]. Whether it is a hybrid vehicle, a solar power plant, or any other everyday electrical device (PC, smartphone, drone...), the key element remains the ability to monitor, control and optimise the performance of one or more modules of these batteries; this type of device is often referred to as a Battery Management System (BMS). A BMS is one of the basic units of electrical energy storage systems: a variety of already developed algorithms can be applied to define the main states of the battery, among others SOC, state of health (SOH) and state of function (SOF), which allow real-time management of the batteries. For the BMS to provide optimal monitoring, it must operate in a noisy environment, it must be able to electrically disconnect the battery at any time, it must be cell-based and perform uniform charging and discharging across all cells in the battery [2], and the components used must be able to withstand at least the total current drawn by the load [3]. In addition, it must continuously monitor various parameters that can greatly influence the battery, such as cell temperature, cell terminal voltage and cell current. This embedded system must be able to notify the robot using the battery to either stop drawing energy from it or to go to the nearest charging station. However, in the field of mobile robotics and small consumer devices, such as smartphones or laptops, as there are no requirements regarding the accuracy to which a BMS must be held, standard approaches such as the Open Circuit Voltage (OCV) and Coulomb Counting (CC) methods are generally applied. This is mainly due to the fact that the use of more complicated estimation algorithms such as EKF, Sliding Mode and Machine Learning [4,5] requires higher computational power; thus, the most advanced battery management system algorithms reported in the literature are developed and verified by laboratory experiments using PC-based software such as MATLAB and controllers such as dSPACE. As additional information, the most widely used battery systems in robotics today are based on electrochemical batteries, particularly lithium-ion technologies with polymer as an electrolyte [6].

This document is divided into six sections, and the rest of the paper is structured as follows. Section 2 reviews the work already done in this field and highlights the objectives intended through this work. Section 3 offers a brief description of the algorithms implemented on the prototype. Section 4 describes the proposed solution for the system, where a block diagram and several other diagrams defining the operating principle are offered. Section 5 provides the results and offers some discussion. Finally, Sect. 6 draws together the main ideas described in this article and outlines future prospects for the development of this prototype.
electric bicycles, in which a capacity prediction algorithm is also implemented to tackle SOH estimation. The design has been validated by using simulation software and real data acquisition. In [8], Mouna et al. implemented two algorithms, EKF and sliding mode, on an Arduino board for SOC estimation, using an equivalent first-order battery model. Validation is performed by data acquisition from Matlab/Simulink. Sanguino et al. proposed in [9] an alternative design for the battery system, where two batteries work alternately: one of the batteries is charged through solar panels installed on the top of the VANTER mobile robot, while the other one provides the energy to the device. The battery selection based on SOC is performed following an OCV check only, and the SOC monitoring is done by a Coulomb counting method.

In addition, several devices are massively marketed; these products are used in different systems, from the smallest to the largest. The Battery Management System 4–15S is a BMS marketed by REC; it has the usual battery protections (temperature, overcurrent, overvoltage) [10]. A cell internal DC resistance measurement technique is applied, suggesting the application of a simple resistor equivalent model and an open circuit voltage technique for SOC prediction, and a Coulomb counting technique for SOC monitoring. The device can be operated as a stand-alone unit and offers the possibility of being connected to a computer through RS-485 for data export. This device is intended for use in solar systems. Roboteq's BMS10x0 is a BMS and protection system for robotic devices developed by Roboteq; it uses a 32-bit ARM Cortex processor and offers the typical features of a BMS, along with Bluetooth compatibility for wireless state checking. It can monitor from 6 to 15 batteries at the same time. Voltage and temperature thresholds are assigned based on the chemical composition of the battery. The SOC is calculated based on OCV and CC techniques [11].

To the best of the authors' knowledge, all research in the field of EKF-based BMS is based on bench-scale experiments using powerful software, such as MATLAB, for data processing. So far, the constraint of computational power limitation is not really addressed in the majority of scientific papers dealing with this subject. This paper focuses on the implementation of an Extended Kalman Filter helped by a Coulomb Counting technique, called DCC-EKF, as a SOC and SOH estimator in ATMEGA328P microcontrollers. The proposed system is self-powered, polyvalent to all types of lithium cells, easy to plug into other systems, and takes into consideration most of the BMS criteria reported in the literature.
models that are approximations of reality. MATLAB is a modeling and simulation tool based on mathematical models, provided with various tools and libraries. In this section, the two algorithms applied in the development of this prototype will be described and the results of the simulation performed for the Extended Kalman Filter algorithm will be given.

3.1 Coulomb Counting Method

The Coulomb counting method consists of a current measurement and its integration over a specific period of time. The evolution of the state of charge for this method can be described by the following expression:

SOC(t_n) = SOC(t_{n-1}) + \frac{\eta_f}{C_{actual}} \int_{t_{n-1}}^{t_n} I \, dt \quad (1)

where I is the current flowing through the battery, C_{actual} is the actual total storable energy in the battery and \eta_f represents the faradic efficiency. Although this method is one of the most widely used methods for monitoring the condition of batteries and offers the most accurate results if properly initialized, it has several drawbacks that lead to errors. The Coulomb counting method is an open-loop method and its accuracy is therefore strongly influenced by accumulated errors, which are produced by the incorrect initial determination of the faradic efficiency, the battery capacity and the SOC estimation error [12], making it very rarely applied on its own. In addition, the sampling time is critical and should be kept as low as possible, making this method unadvisable for applications where the current varies rapidly.

3.2 Extended Kalman Filter Method

Since 1960, the Kalman Filter (KF) has been the subject of extensive research and applications. The KF algorithm estimates the states of the system from indirect and uncertain measurements of the system's input and output. Its general discrete-time representation is as follows:

x_k = A_{k-1} x_{k-1} + B_{k-1} u_{k-1} + \omega_k, \qquad y_k = C_k x_k + D_k u_k + \upsilon_k \quad (2)

where x_k \in R^{n \times 1} is the state vector, A_k \in R^{n \times n} is the system matrix in discrete time, y_k \in R^{m \times 1} is the output, u_k \in R^{m \times 1} is the input, B_k \in R^{n \times m} is the input matrix in discrete time, and C_k \in R^{m \times n} is the output matrix in discrete time. \omega and \upsilon represent the Gaussian distributed noise of the process and measurement, respectively. However, Eq. 2 is only valid for linear systems; in the case of nonlinear systems such as batteries, the EKF is applied to include the non-linear behaviour and to determine the states of the system [14]. The EKF equations have the following form:

x_{k+1} = f(x_k, u_k) + \omega_k, \qquad y_k = h(x_k, u_k) + \upsilon_k \quad (3)
where

A_k = \frac{\partial f(x_k, u_k)}{\partial x_k}, \qquad C_k = \frac{\partial h(x_k, u_k)}{\partial x_k} \quad (4)

After initialization, the EKF algorithm always goes through two steps, prediction and correction. The KF is extensively detailed in many references [13–15]; we will therefore not focus on its detailed description. Algorithm 1 summarizes the EKF function implemented in each slave ATMEGA328P microcontroller; the matrices A_k, B_k, C_k, D_k and the state x_k are discussed in Sect. 3.3.

Algorithm 1: Extended Kalman Filter Function
  initialise A_0, B_0, C_0, D_0, P_0, I_0, x_0, Q and R
  while EKF called do
    measure I_k and V_{t_k}
    x̂_k = A_{k-1} · x_{k-1} + B_{k-1} · I_{k-1}
    P̂_k = A_{k-1} · P_{k-1} · A_{k-1}^T + Q
    set the non-diagonal elements of P̂_k to 0
    if ŜOC_k ∈ [SOC interval] then update R_t, R_p and C_p
    C_k = V_{oc}(ŜOC_k) + V̂_{P_k};  D_k = R_t
    V̂_{t_k} = C_k + D_k · I_k
    L = C_k · P̂_k · C_k^T + R
    if L ≠ 0 then K_k = P̂_k · C_k^T / L
    x_k = x̂_k + K_k · (V_{t_k} − V̂_{t_k})
    P_k = (I − K_k · C_k) · P̂_k
    set the non-diagonal elements of P_k to 0
    k = k + 1
    Result: send x_k
  end

The overall performance of the filter is set by the covariance matrix P, the process noise matrix Q and the measurement noise R. The choice of these parameters was defined through experience and empirical experiment, and they are given by:

Q = \begin{bmatrix} 0.25 & 0 \\ 0 & 0 \end{bmatrix}, \quad P = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad R = 0.00001

Data obtained from discharge test simulations of different cells were applied to verify the accuracy of this estimator with known parameters and are reported in Sect. 3.4. The EKF algorithm implemented in the ATMEGA328P microcontrollers has a sampling time of 0.04 s.
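A compact way to visualise Algorithm 1 is the NumPy sketch below, which performs one prediction/correction cycle on the state x = [SOC, Vp]. It is a sketch under stated assumptions rather than the firmware running on the ATMEGA328P: the output Jacobian is computed here in the standard EKF form (derivative of Voc plus the Vp term), whereas Algorithm 1 stores intermediate voltage terms in C_k and D_k, and the on-line update of R_t, R_p and C_p is omitted.

```python
import numpy as np

def ekf_step(x, P, A, B, i_k, v_t_meas, voc, rt, Q, R):
    """One predict/correct cycle for the battery state x = [SOC, Vp].

    i_k: measured current; v_t_meas: measured terminal voltage;
    voc(soc): open-circuit-voltage function; rt: series resistance Rt.
    """
    # Prediction
    x_pred = A @ x + B * i_k
    P_pred = A @ P @ A.T + Q
    P_pred = np.diag(np.diag(P_pred))   # keep only the diagonal, as in Algorithm 1

    # Measurement model: Vt = Voc(SOC) + Vp + Rt * I
    soc_pred, vp_pred = x_pred
    v_t_pred = voc(soc_pred) + vp_pred + rt * i_k
    # Output Jacobian C = [dVoc/dSOC, 1], approximated by a finite difference
    # over a 1% SOC span.
    dvoc = voc(soc_pred + 0.5) - voc(soc_pred - 0.5)
    C = np.array([dvoc, 1.0])

    # Correction
    L = C @ P_pred @ C.T + R
    K = P_pred @ C.T / L if L != 0 else np.zeros(2)
    x_new = x_pred + K * (v_t_meas - v_t_pred)
    P_new = (np.eye(2) - np.outer(K, C)) @ P_pred
    return x_new, np.diag(np.diag(P_new))
```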
3.3 Battery Modelling

Two battery models are most often applied: the electrochemistry model, which gives a good representation of the internal dynamics of the batteries, and the electrical circuit model, which allows the behaviour of the batteries to be translated into an electrical model that can easily be formulated as mathematical formulas. For this study, the second model, considered more suitable for use with a microcontroller, is chosen. The most commonly applied electrical model uses a simple RC equivalent battery circuit, with two resistors (R_p and R_t), a capacitor (C_p), a voltage source (V_{oc}) and a current flowing through it (I), as shown in Fig. 1. It represents the best trade-off between accuracy and complexity.

Fig. 1. First order equivalent battery model

The terminal voltage, denoted as V_t, represents the output voltage of the cell and is given by Eq. 5.

V_t = V_{oc}(SOC) + I \cdot R_t + V_p \quad (5)

A non-linear relationship exists between SOC and V_{oc}; a representation employing a seventh-order polynomial to fit the overall curve can be expressed and is described by the following relation:

V_{oc}(SOC) = 3.4624 \cdot 10^{-12} \cdot SOC^7 - 1.3014 \cdot 10^{-9} \cdot SOC^6 + 1.9811 \cdot 10^{-7} \cdot SOC^5 - 1.5726 \cdot 10^{-5} \cdot SOC^4 + 6.9733 \cdot 10^{-4} \cdot SOC^3 - 0.017 \cdot SOC^2 + 0.21 \cdot SOC + 2.7066 \quad (6)

The time derivative of Eq. 1 can be formulated as given in Eq. 7,

\dot{SOC} = \frac{I}{C_{actual}} \quad (7)

The polarization voltage is given as

\dot{V}_p = -\frac{1}{R_p C_p} V_p + \frac{1}{C_p} I \quad (8)
From Eqs. 5, 7 and 8, it is possible to deduce the equation characterising the behaviour of the battery, which is expressed as the following system:

\begin{bmatrix} \dot{SOC} \\ \dot{V}_p \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & -\frac{1}{R_p C_p} \end{bmatrix} \cdot \begin{bmatrix} SOC \\ V_p \end{bmatrix} + \begin{bmatrix} \frac{1}{C_{actual}} \\ \frac{1}{C_p} \end{bmatrix} \cdot I, \qquad V_t = \begin{bmatrix} poly & 1 \end{bmatrix} \cdot \begin{bmatrix} SOC \\ V_p \end{bmatrix} + R_t \cdot I \quad (9)

where poly is the polynomial formula described in Eq. 6. Since the microcontroller processes actions in discrete events, the previous system must be transcribed into its discrete form, which is written as follows:

\begin{bmatrix} SOC \\ V_p \end{bmatrix}_k = \begin{bmatrix} 1 & 0 \\ 0 & \exp\left(\frac{-\Delta t}{R_p C_p}\right) \end{bmatrix} \cdot \begin{bmatrix} SOC \\ V_p \end{bmatrix}_{k-1} + \begin{bmatrix} \frac{1}{C_{actual}} \\ \frac{1}{C_p} \end{bmatrix} \cdot I_{k-1}, \qquad V_{t_k} = \begin{bmatrix} poly & 1 \end{bmatrix} \cdot \begin{bmatrix} SOC \\ V_p \end{bmatrix}_k + R_t \cdot I_k \quad (10)

From both Eqs. 2 and 10, it is deduced that the current I is the input of the system, the voltage V_t is the output, and the matrices A_k, B_k, C_k and D_k are given by:

A_k = \begin{bmatrix} 1 & 0 \\ 0 & \exp\left(\frac{-\Delta t}{R_p C_p}\right) \end{bmatrix}, \quad B_k = \begin{bmatrix} \frac{1}{C_{actual}} \\ \frac{1}{C_p} \end{bmatrix}, \quad x_k = \begin{bmatrix} SOC_k \\ V_{P_k} \end{bmatrix}, \quad C_k^T = \begin{bmatrix} poly \\ 1 \end{bmatrix}, \quad D_k = R_t

3.4 Simulation Results

Given the large number of articles dealing with the validation of these algorithms [13,16,17], it is not necessary to dwell on this subject. Nevertheless, a simulation of the algorithm is proposed in this section, in order to confirm the good tracking of the algorithm according to the chosen parameters.

Fig. 2. (a) Overall Simulink model highlighting the input/output of the simulation; (b) battery sub-system model
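For reference, the discrete model of Eq. 10 and the Voc polynomial of Eq. 6 can be exercised with a few lines of NumPy, as sketched below. The numerical values of R_p, C_p, R_t and C_{actual} are placeholders (the paper does not list them at this point), and the sketch transcribes B exactly as printed in Eq. 10; a practical discretisation would normally also scale the SOC update by the sampling period Δt.

```python
import numpy as np

# Illustrative placeholder parameters (ohm, farad, ohm, coulomb, second).
RP, CP, RT = 0.015, 2500.0, 0.02
C_ACTUAL = 2500.0 * 3.6          # e.g. a 2500 mAh cell expressed in coulombs
DT = 0.04                        # sampling time used on the ATMEGA328P

def voc(soc):
    """Seventh-order open-circuit-voltage polynomial of Eq. 6 (SOC in %)."""
    coeffs = [3.4624e-12, -1.3014e-9, 1.9811e-7, -1.5726e-5,
              6.9733e-4, -0.017, 0.21, 2.7066]
    return np.polyval(coeffs, soc)

# Discrete model of Eq. 10: x_k = A x_{k-1} + B I_{k-1}
A = np.array([[1.0, 0.0],
              [0.0, np.exp(-DT / (RP * CP))]])
B = np.array([1.0 / C_ACTUAL, 1.0 / CP])

def model_step(x, i_prev, i_now):
    """Advance the state [SOC, Vp] one step and return the terminal voltage."""
    x_next = A @ x + B * i_prev
    v_t = voc(x_next[0]) + x_next[1] + RT * i_now   # Eq. 5 output equation
    return x_next, v_t
```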
The graphical representation of MATLAB/Simulink, which uses graphical blocks to represent mathematical and logical constructs and process flow, is intuitive and provides a clear focus on control and observation functions. Although Simulink has been the tool of choice for much of the control industry for many years [18], it is best to use Simulink only for modelling the battery behaviour and not for the prediction algorithm. The battery model presented before is applied as a base model; a more complex model would not be able to simulate all possible cases and would therefore increase the time needed for development. Figure 2 represents the battery simulation used in MATLAB/Simulink. Figure 3 shows a comparison of the state of charge of the battery with the simulation for the Extended Kalman Filter, and a plot of the error between the real value and the predicted one. The test consists of a 180 s discharge period followed by a one hour rest period; the cell starts fully charged and the EKF state of charge indicator is set at 50% SOC.

Fig. 3. Estimated and reference SOC comparison, with the absolute error of the SOC, for the Extended Kalman Filter with a constant discharge test and a rest period.

The continuous updating of the correction gain (K), which is due to the continuous re-transcription of the error covariance matrices, limits the divergence as long as the SOC remains within the predefined operating range (80%–40%). The response of the observer in the determination of the SOC can be very fast, on the order of a few seconds. Also, from the previous figure it is clear that, for known parameters, the SOC can be tracked accurately with less than 5% error, and a perfect prediction is recorded when the battery is at rest. Figure 4 shows the simulation results of a constant current discharge test. Despite the good SOC monitoring by the EKF, it still requires a good determination of the battery parameters. Some reported works [19,20] have proposed the use of a simple algebraic method or a Dual Extended Kalman Filter (DEKF) as an online parameter estimator, to estimate them on the fly. Unfortunately, these methods require a lot of computational power to achieve a low sampling time
Fig. 4. Estimated and reference SOC comparison with the absolute error of the SOC for the Extended Kalman Filter with a constant discharge test.

and reach convergence, which is difficult with a simple ATMEGA328P. For this reason, and given the speed at which the EKF is able to determine the battery SOC at rest, the EKF algorithm is applied to determine the battery SOC at start-up and the CC algorithm is then applied to carry on the estimation.

4 Proposed Solution

The system is designed to be “Plug and Play”, with the energy input and output located on one side only, so that future users of the product can quickly install and initialise the BMS. It works with all 18650 lithium battery models; the initialisation procedure is described in more depth in the next section. The connection principle of the electronic components is illustrated in Fig. 5.

4.1 Module Connection

The master microcontroller is the central element of the system. ATMEL's ATMEGA328P was chosen for its operational robustness and low purchase cost. It collects the state of charge predicted by each ATMEGA328P slave microcontroller and controls the energy flow according to the collected data. This microcontroller is connected to push buttons and an OLED screen to facilitate communication with the user. Power is supplied directly from the batteries; the voltage is stabilised and regulated for the BMS electronics by a step-down voltage regulator.

The Inter-Integrated Circuit (I2C) bus is an interface that uses only two pins for data transfer. It is incorporated into many devices and is ideal for attaching low-speed peripherals to a motherboard or embedded system over a short distance, providing connection-oriented communication with acknowledgement [21]. Figure 6 represents the schematic diagram of the electronic circuit.
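Looking back at the start-up strategy described at the beginning of this section, the hand-over between the two estimators can be sketched as follows. This is a hedged Python illustration only: the `ekf` object, its `update()` method, its `capacity` attribute and the sign convention for the current are assumptions for readability, not the firmware implementation.

```python
def dcc_ekf_soc(ekf, current_sensor, dt=0.085, startup_window_s=180.0):
    """Generator yielding the SOC estimate: EKF during the start-up rest window,
    Coulomb counting afterwards (sketch under the assumptions stated above)."""
    t, soc = 0.0, 0.5
    while t < startup_window_s:          # phase 1: (dis)charge inhibited, EKF converges at rest
        soc = ekf.update(current_sensor())
        t += dt
        yield soc
    while True:                          # phase 2: Coulomb counting carries on from the EKF value
        i = current_sensor()             # assumption: discharge current taken as positive
        soc -= i * dt / ekf.capacity     # capacity expressed in A·s
        yield soc
```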
Fig. 5. Block diagram of the system

The 18650 lithium cells are placed in the two cell holders (1BH, 2BH). The OLED display (OLED1) is connected via I2C through the Serial Clock (SCL) and Serial Data (SDA) lines; the ATMEGA328P microcontrollers (MASTER, SLAVE1, SLAVE2) are connected to the I2C bus through pins PC5 and PD4. The current is measured with the ACS712 (U1) and read by the slave microcontrollers (SLAVE1, SLAVE2) through the analog pin PC0. A quad operational amplifier LM324 (U5), together with simple voltage dividers, is used to measure the voltage at each cell terminal. The push buttons are linked to the master microcontroller via the digital pins (PD2, PD3, PB4, PB5). An IRF1405 N-channel MOSFET is used as a power switch to the load (mobile robot), while an IRF9610 P-channel MOSFET is used as a power switch from the power supply; they are controlled from the master ATMEGA328P pins PB3 and PB2, respectively. A step-down voltage regulator supplies the electronics with a regulated 5 V, drawn either from the battery pack or from the power supply.

4.2 Operating Principle

The system is mainly characterised by two modes, the initialisation mode and the on-mode. The two modes are closely related, and it is possible to switch from one to the other at any time using a single button. Figure 7 briefly summarises the operating principle of both modes.
Fig. 6. Schematic diagram of the electronic circuit

Because the characteristics of the batteries are not uniform and can vary greatly, the initialisation mode was introduced to achieve a high SOC prediction accuracy. It is possible to skip this mode, but at the cost of reduced accuracy. In this mode the user can enter the capacity of the installed batteries and their actual SOC to ensure a more accurate result, activate an additional protection that further limits the discharge, and enable or disable the EKF algorithm.

In the on-mode, according to what was configured during initialisation, the BMS carries out the assigned tasks for each battery cell in parallel.

– If the EKF prediction was enabled during the initialisation mode, the device begins predicting the actual SOC value of each cell and blocks any current flow other than the one powering the BMS for a period of approximately 3 min (Fig. 7).
– If the EKF prediction was disabled during the initialisation mode and an SOC value was entered, the device allows the current to flow immediately and starts the SOC monitoring from the chosen value.
– If the protection was disabled, the battery will be charged and discharged over a wider range, at the risk of shortening the life span of the cells.
Fig. 7. Flow chart of the system algorithm. (a) Initialisation mode flow chart. (b) On-mode flow chart with the EKF algorithm activated.

The battery states are determined for each cell independently of the others by a dedicated microcontroller, following the master/slave communication principle. In order to keep a low sampling time, one slave microcontroller is assigned to two batteries, reaching a sampling time of 0.085 s.

5 Results and Discussions

The result is the design of a BMS that takes into account the requirements already established for this type of system and defines new ones, such as flexibility and size, that must be met for mobile robots. A breadboard prototype was first built to perform the initial tests. After power-up, the initialisation mode starts as expected. Then, while the on-mode is running, the voltage, current and SOC data are displayed on the screen, with an accuracy of 100 mV and 40 mA. When the cell is at rest and the SOC lies within [40%, 80%], the EKF prediction reaches an error of 5% in the worst case; for SOC values above 80% the error grows to 8%, while for SOC values below 30% the algorithm diverges completely (Fig. 10). These results are understandable, as the linearisation of the curves is performed with only 10 points. Cell protection is provided by an external circuit based on the S-8254A, which includes high-accuracy voltage detection for lithium batteries and avoids over-voltage while charging. These ICs are widely applied in rechargeable battery packs due to their low cost and good characteristics [22].
The total storable energy offered by this design varies according to the capacity of the installed cells. Using four 2200 mAh lithium cells gives the system a battery capacity of 8800 mAh, which is sufficient for the majority of small-robot applications. It is also possible to use 5500 mAh cells, which gives the developed product great flexibility by letting users choose the battery capacity. The maximum current that can be delivered is 2 A, allowing a power of up to 32 W, which is sufficient to power the electronics of the robot; beyond this value, and in order to keep the battery safe, a shutdown protocol stops the discharge. As for the charging process, the maximum current supplied by an external source is currently limited to 0.5 A, in order to protect the batteries while charging. With such a current the charging process is slow: it takes about eight and a half hours to reach the 80% SOC advised for 4 × 2200 mAh cells. Figure 8 shows the breadboard circuit.

Fig. 8. Breadboard circuit

The BMS electronics consume on average 1.15 W, giving the product an efficiency of 96.15%. While charging the battery, and because a constant charging power is set, the IRF9610 MOSFET causes a constant power loss of 0.52 W, and the efficiency then drops to 90.38%. For discharging, since the regulation of the voltage supplied to the robot is left to the user and is done with a variable current, it is difficult to define the efficiency of the device precisely; the additional loss can nevertheless reach 2.5%, which brings the total efficiency to 93.60%. Overall, the average efficiency of this product is 94.37%. Figure 9 shows a printed circuit representation of the proposed prototype.
Fig. 9. Printed circuit board of the prototype. (a) Components view. (b) Top view.

Fig. 10. On-mode display of cells 1, 2 and 3 with SOC values of 9.51%, 77.25% and 71.10%, respectively. (a) Average SOC of the first 50 iterations. (b) A few seconds after. (c) Approximately 1 min after. For cell 1, the algorithm diverges in the SOC estimate. For cell 2, the estimate has reached the SOC value with an error of 1.75%. For cell 3, after 1 min, the prediction has not yet reached the estimated value, the error being 6.90%.
The SOC values, voltage and current are displayed directly on the OLED screen, as shown in Fig. 10. It is possible to display the overall battery status, or a cell-by-cell view for more detail. Since this approach and the algorithm are still being improved, some features, such as the SOH and temperature monitoring, are not yet fully integrated.

6 Conclusion and Future Work

This paper proposes the DCC-EKF approach, in which the state of charge is first predicted using an EKF algorithm, after which the SOC is handed over to the Coulomb counting algorithm, which continues to monitor this quantity. Most commercial products only use the OCV to initially predict the SOC, a technique that is widely implemented but severely inaccurate, especially when the BMS is powered directly from the batteries, so that the batteries never reach a resting state. The balancing and overvoltage protection of the BMS is provided by an external circuit based on the S-8254A. The proposed BMS is small (10 cm × 15 cm), easy to use, and has shown accurate SOC prediction with an error of 5% and good performance in noisy operation. The prototype also offers a good efficiency of about 93%.

Future developments will focus on implementing a more accurate state of charge prediction algorithm, such as the DEKF or the AEKF, as well as a more accurate health prediction algorithm, while aiming to make the system more user-friendly and improve the user experience. The temperature effect, which is currently neglected, will also be addressed in future versions, together with an increase of the product efficiency. In addition, the implementation of a learning algorithm is being considered to further enhance the prediction accuracy, together with a fast-charging feature.

References

1. Marwedel, P.: Embedded Systems Foundations of Cyber-Physical Systems, and the Internet of Things. Springer Nature (2021)
2. Park, K.-H., Kim, C.-H., Cho, H.-K., Seo, J.-K.: Design considerations of a lithium ion battery management system (BMS) for the STSAT-3 satellite. J. Power Electron. 10(2), 210–217 (2010)
3. Megnafi, H., Chellal, A.-A., Benhanifia, A.: Flexible and automated watering system using solar energy. In: International Conference in Artificial Intelligence in Renewable Energetic Systems, pp. 747–755. Springer, Cham, Tipaza (2020). https://doi.org/10.1007/978-3-030-63846-7_71
4. Xia, B., Lao, Z., Zhang, R., et al.: Online parameter identification and state of charge estimation of lithium-ion batteries based on forgetting factor recursive least squares and nonlinear Kalman filter. Energies 11(1), 3 (2018)
5. Hannan, M.A., Lipu, M.H., Hussain, A., et al.: Toward enhanced state of charge estimation of lithium-ion batteries using optimized machine learning techniques. Sci. Rep. 10(1), 1–15 (2020)
6. Thomas, B.-R.: Linden's Handbook of Batteries, 4th edn. McGraw-Hill Education, New York (2011)
    234 A. A.Chellal et al. 7. Taborelli, C., Onori, S., Maes, S., et al.: Advanced battery management system design for SOC/SOH estimation for e-bikes applications. Int. J. Powertrains 5(4), 325–357 (2016) 8. Mouna, A., Abdelilah, B., M’Sirdi, N.-K.: Estimation of the state of charge of the battery using EKF and sliding mode observer in Matlab-Arduino/LabView. In: 4th International Conference on Optimization and Applications, pp. 1–6. IEEE, Morocco (2018) 9. Sanguino, T.-D.-J.-M., Ramos, J.-E.-G.: Smart host microcontroller for optimal battery charging in a solar-powered robotic vehicle. IEEE/ASME Trans. Mecha- tron. 18(3), 1039–1049 (2012) 10. REC-BMS.: Battery Management System 4–15S. REC, Control your power, Slove- nia (2017) 11. Robote, Q.: BMS10x0, B40/60V, 100 Amps Management System for Lithium Ion Batteries. RoboteQ, USA (2018) 12. Kim, I.-S.: A technique for estimating the state of health of lithium batteries through a dual-sliding-mode observer. IEEE Trans. Power Electron. 25(4), 1013– 1022 (2009) 13. Mastali, M., Vazquez-Arenas, J., Fraser, R.: Battery state of the charge estimation using Kalman filtering. J. Power Sources 239, 294–307 (2013) 14. Campestrini, C., Heil, T., Kosch, S., Jossen, A.: A comparative study and review of different Kalman filters by applying an enhanced validation method. J. Energ. Storage 8, 142–159 (2016) 15. Bishop, G., Welch, G.: An introduction to the kalman filter. In: Proceedings of SIGGRAPH, Course. Proceedings of SIGGRAPH, vol. 41, pp. 27599–23175 (2001) 16. Campestrini, C., Horsche, M.-F., Zilberman, I.: Validation and benchmark meth- ods for battery management system functionalities: state of charge estimation algo- rithms. J. Energ. Storage 7, 38–51 (2016) 17. Yuan, S., Wu, H., Yin, C.: State of charge estimation using the extended Kalman filter for battery management systems based on the ARX battery model. Energies 6(1), 444–470 (2013) 18. Marian, N., Ma, Y.: Translation of simulink models to component-based software models. In: 8th International Workshop on Research and Education in Mechatron- ics, pp. 262–267. Citeseer, Location (2007) 19. Hu, T., Zanchi, B., Zhao, J.: Determining battery parameters by simple algebraic method. In: Proceedings of the 2011 American Control Conference, pp. 3090–3095. IEEE, San Francisco (2011) 20. Xu, Y., Hu, M., Zhou, A., et al.: State of charge estimation for lithium-ion batteries based on adaptive dual Kalman filter. Appl. Math. Modell. 77(5), 1255–1272 (2020) 21. Mazidi, M.-A., Naimi, S., Naimi, S.: AVR Microcontroller and Embedded Systems. Pearson, India (2010) 22. ABLIC INC.: S-8254A Series Battery Protection IC for 3-Serial- or 4-Serial-CELL Pack REV.5.2. ABLIC, Japan (2016) 23. Chatzakis, J., Kalaitzakis, K., Voulgaris, C., Manias, N.-S.: Designing a new gen- eralized battery management system. IEEE Trans. Ind. Electron. 50(5), 990–999 (2003) 24. Chen, L., Xu, L., Wang, R.: State of charge estimation for lithium-ion battery by using dual square root cubature kalman filter. Mathematical Problems in Engi- neering 2017 (2017)
Sensor Fusion for Mobile Robot Localization Using Extended Kalman Filter, UWB ToF and ArUco Markers

Sílvia Faria1(B), José Lima2,3, and Paulo Costa1,3

1 Faculty of Engineering, University of Porto, Porto, Portugal
{up201603368,paco}@fe.up.pt
2 Research Centre of Digitalization and Intelligent Robotics, Instituto Politécnico de Bragança, Bragança, Portugal
jllima@ipb.pt
3 INESC-TEC - INESC Technology and Science, Porto, Portugal

Abstract. The ability to locate itself is one of the main features a robot needs to be truly autonomous. Different methodologies can be used to determine a robot's location as accurately as possible; however, these methodologies present several problems in some circumstances. One of these problems is the uncertainty in the robot's sensing. To solve it, the uncertain information must be combined correctly. In this way, it is possible to obtain a localization system that is more robust and more tolerant to failures and disturbances. This paper evaluates an Extended Kalman Filter (EKF) that fuses odometry information with Ultra-WideBand Time-of-Flight (UWB ToF) measurements and camera measurements from the detection of ArUco markers in the environment. The proposed system is validated in a real environment with a differential robot developed for this purpose, and the achieved results are promising.

Keywords: ArUco markers · Autonomous mobile robot · Extended Kalman filter · Localization · Ultra-WideBand · Vision based system

© Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 235–250, 2021.
https://doi.org/10.1007/978-3-030-91885-9_17

1 Introduction

In a semi-structured environment, a mobile robot needs to be able to locate itself without human intervention, i.e. autonomously. It can be assumed that if the robot knows its own pose (position and orientation), it can be truly autonomous in executing the given tasks. Since the first mobile robot was invented, the problem of pose determination has been addressed by many researchers and developers around the world, due to its complexity and the multitude of possible approaches.

Localization methods can be classified into two broad categories: relative methods and absolute methods [1]. Relative localization methods give the robot's pose relative to the initial one; for this purpose, dead reckoning methods such as odometry and inertial navigation are used. Odometry is a technique
in which it is common to use encoders connected to the rotation axes of the robot wheels. Absolute localization methods give the global pose of the robot and do not need previously calculated poses. In these methods, odometry data and data from external sensors can be combined to estimate the robot's pose.

In an indoor environment, a mobile robot needs to localize itself accurately due to the limited space and the obstacles present in it. Today, the number of applications that rely on indoor localization is rapidly increasing and, because of this, localization of a robot in an indoor environment has become an active research area.

The Global Positioning System (GPS) is commonly used in outdoor environments; however, using GPS indoors is difficult. For this reason, different alternative technologies have been proposed and developed. Among them, Ultra-WideBand (UWB) is known for its low cost, high precision and easy deployment. A UWB-based localization system can be characterized by a UWB tag mounted on the robot that is located by measuring the distance to UWB anchors whose positions are known. Since UWB-based localization systems cannot provide the robot orientation, another way to obtain this information is required; one possibility is to use odometry. Other solutions to locate the robot are based on vision, whose aim is to find the correspondence between the real world and the image projection. To achieve this goal, methods based on binary squared fiducial markers, such as ArUco markers, have been widely used.

This paper presents the fusion of odometry information with UWB ToF distance measurements and with angle measurements of the ArUco markers relative to the robot's reference frame. The latter measurements are obtained through a camera installed on the robot that detects the ArUco markers present in the environment, whose positions are known. This data fusion is accomplished through an Extended Kalman Filter (EKF), used because a non-linear system is being handled. The developed system increases the robustness, scalability and accuracy of the robot localization.

2 Related Work

Mobile robot localization is one of the key functionalities of a truly autonomous system. It is a complex and challenging problem in the robotics field and has received great research attention over the years. Depending on the type of application and its environment (outdoor or indoor), the technologies and methods used must be chosen carefully in order to meet the requirements of the application. Due to the existence of various obstacles, indoor environments usually suffer from non-line-of-sight (NLOS) propagation, in which signals cannot travel in a straight path from an emitter to a receiver, causing inconsistent time delays at the receiver. Nowadays, the main approaches to the robot localization problem in indoor environments include:
    Sensor Fusion forMobile Robot Localization 237 – Dead Reckoning (DR): DR can approximately determine robot’s pose using Odometry or Inertial Measurement Unit (IMU) by integration of the pose from encoders and inertial sensors. This is one of the first methodology used to calculate the robot position. Dead reckoning can give accurate infor- mation on position, but is subject to cumulative errors over a long distances, in that it needs to utilize the previous position to estimate the next relative one and during which, the drift, the wheel slippage, the uneven floor and the uncertainty about the structure of the robot will together cause errors. In order to improve accuracy and reduce error, DR need to use other methods to adjusts the position after each interval [2]. – Range-Based: Range-Based localization measures the distance and/or angle to a natural or artificial landmarks. High accuracy measurements are critical for in-building applications such as autonomous robotic navigation. Typically, reference points, i.e. anchors, and their distance and/or angle to one or more nodes of interest are used with in order to estimate a given pose. These reference points are normally static or with an a priori map. There are several methods that are used to perform the distances and angles calculations: Angle of Arrival (AoA), Time of Flight (ToF), Time Difference of Arrival (TDoA) and Received Signal Strength Indication (RSSI) [4]. The robot’s position can be calculated with various technologies such as Ultra-WideBand (UWB) tags, Radio Frequency Identification (RFID) tags, Laser Range Finders. This type of localization is widely used in Wireless Sensors Network Localization. It is important to mention that UWB is a wireless radio technology primarily used in communications that has lately received more attention in localization applications and that promises to outperform many of the indoor localization methods currently available [2]. In [5], the authors proposed a method that uses Particle Filter to fuse odometry with UWB range measurements, which lacks the orientation, to obtain an estimate of the real-time posture of a robot with wheels in an indoor environment. – Visual-Based Systems: These systems can locate a robot by extracting images of its environment using a camera. These systems can be Fiducial marker systems that detect some markers in the environment by using com- puter vision tools. These systems are useful for Augmented Reality (AR), robot navigation and applications where the relative pose between camera and object is required, due to their high accuracy, robustness and speed [7]. Several fiducial marker systems have been proposed in the literature such as ARSTudio, ARToolkit and ARTag. However, currently the most popular marker system in academic literature is probably the ArUco which is based on ARTag and comes with an OpenSource library OpenCV of functions for detecting and locating markers developed by Rafael Muñoz and Sergio Gar- rido [6]. In addition, several types of research have been successfully conducted based on this type of system, such as [6,8,9]. All of the aforementioned technologies can be used independently. However they can also be used at the same time in order to combine the advantages of each of them. By the proper fusion of these technologies in a multi-sensor
environment, a more accurate and robust system can be achieved. For nonlinear systems this fusion can be performed by well-established algorithms such as the Extended Kalman Filter (EKF), the Unscented Kalman Filter (UKF) and the Particle Filter, which are among the most widely used [3].

The presented work uses a combination of odometry, UWB ToF measured distances and ArUco measured angles to compute an accurate pose of a robot, i.e. its position and orientation. The combination of these measurements is achieved with an EKF, since it is the least complex and most computationally efficient of these algorithms.

3 System Architecture

The system architecture proposed for this work, and its data flow, is represented in Fig. 1. The system consists of two main blocks: the Remote Computer and the Real Robot. For all tests a real robot was used and adapted, which had previously been developed for other projects.

Fig. 1. System architecture proposed

The robot has a differential wheel drive system and therefore has two drive wheels, coupled to independent stepper motors powered by Allegro MicroSystems A4988 drivers, and a castor wheel, which has a support function. The robot measures 0.31 m × 0.22 m and is powered by an on-board 12 V battery and a DC/DC step-down converter that supplies the electronic modules, i.e. a Raspberry Pi 3 Model B and Arduino microcontroller boards. The top level consists of a Raspberry Pi running the Raspbian operating system, responsible for receiving the Pozyx data obtained through Arduino 2 and the data from Arduino 1, which corresponds to the motor speeds and voltages and the battery current. All the data is then sent over Wi-Fi to the Remote Computer, where the Kalman Filter is executed and the appropriate speed for each wheel is calculated. A Raspberry Pi Camera Module is also connected to the Raspberry Pi and is tasked with acquiring positioning information from the ArUco markers placed in the environment. This information is acquired by a script that runs on the Raspberry Pi and is sent to the Remote Computer via Wi-Fi. A block diagram of the robot architecture is presented in Fig. 2.
Fig. 2. Mobile robot architecture

To determine the Ground-Truth of the robot, a camera (PlayStation Eye) was placed on the ceiling, at the center of the square where the robot moves, looking down as shown in Fig. 3 (reference frame Xc, Yc, Zc). This camera is connected to a Raspberry Pi 4 that runs a script to obtain the robot's pose. In addition, a 13 cm ArUco marker was placed on top of the robot, centered on its center of rotation. In this way, the camera can detect the marker and consequently determine its pose using the aforementioned ArUco library. The pose extracted from this camera is used to assess the accuracy of the developed system by comparison.

Fig. 3. Referential frames

4 Problem Formulation

The problem of robot beacon-based localization (Fig. 4) can be defined as the estimation of the robot's pose $X = [x_r \; y_r \; \theta_r]^T$ by measuring the distances and/or angles to a given number of beacons, $N_B$, placed in the environment at known positions $M_{B,i} = [x_{B,i} \; y_{B,i}]^T$, $i \in 1 \ldots N_B$. $X$ and $M_{B,i}$ are both defined in the global reference frame $W_X W_Y$.
For the specific problem, the following assumptions were made:

– Since the robot only navigates on the ground, without any inclination, it is assumed that the robot moves on a two-dimensional plane.
– The beacon positions, $M_{B,i}$, are stationary and previously known.
– The odometry values are available every 40 ms. These values are used as the system input $u_k = [v_k \; \omega_k]^T$ and refer to the odometry variation between the instants $k$ and $k-1$.
– $Z_{B,i}(k) = [r_{B,i}]$ and $Z_{B,i}(k) = [\phi_{B,i}]$ denote, respectively, the distance measurement between the Pozyx tag and the Pozyx anchor $i$, and the angle measurement between the robot camera and the ArUco marker $i$, at instant $k$.

Fig. 4. 2D localization given measurements to beacons with a known position

4.1 ArUco Markers

As already mentioned, the system is capable of extracting angle measurements from the ArUco markers arranged in the environment, relative to the reference frame of the robot. Each marker has an associated identification number that can be extracted when the marker is detected in an image, which distinguishes each reference point. Using the provided ArUco library, it is possible to obtain the rotation and translation $(x, y, z)$ from the ArUco reference frame to the camera reference frame (embedded in the robot). This allows the angle $\phi$ from the marker to the robot to be extracted through Eq. 1:

\[
\phi = \mathrm{atan2}(x, z) \tag{1}
\]

Since the movement of the robot is only in 2D, the height $(y)$ of the marker is ignored in the calculations, so the angle is exclusively a horizontal difference. Note that the camera was calibrated with a chessboard so that the measurements taken are reliable [12].
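A hedged sketch of this angle extraction, using the classic OpenCV ArUco API, is shown below. The marker dictionary, the calibration inputs and the function/variable names are assumptions for illustration; only the use of the pose-estimation output and the atan2(x, z) computation follow the description above.

```python
import cv2
import numpy as np

# Assumed dictionary; the one used by the authors is not specified.
aruco_dict = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
params = cv2.aruco.DetectorParameters_create()

def marker_angles(frame, camera_matrix, dist_coeffs, marker_len=0.13):
    """Return {marker_id: phi} for all detected markers, with phi = atan2(x, z) (Eq. 1).
    camera_matrix / dist_coeffs come from the chessboard calibration [12]."""
    corners, ids, _ = cv2.aruco.detectMarkers(frame, aruco_dict, parameters=params)
    angles = {}
    if ids is not None:
        _, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
            corners, marker_len, camera_matrix, dist_coeffs)
        for marker_id, t in zip(ids.flatten(), tvecs):
            x, _, z = t[0]                       # the height (y) is ignored, as in the paper
            angles[int(marker_id)] = np.arctan2(x, z)
    return angles
```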
4.2 Extended Kalman Filter Localization

The beacon measurements alone cannot return the location of the robot, so an algorithm must be implemented to obtain this information. For this purpose, the well-known EKF is used, which is widely applied to this type of problem. The EKF continuously updates a linearization around the previous state estimate and approximates the state densities by Gaussian densities. It thus allows the fusion between the odometry model and the Pozyx and ArUco measurements, and is divided into two phases: the prediction phase, which estimates the current pose through the state transition model, and the correction phase, which corrects the predicted state with the Pozyx distance and ArUco angle measurements through the observation model. The EKF algorithm is presented in Algorithm 1.

State Transition Model. The state transition model, given in Eq. 3, is a probabilistic model that describes the state distribution according to the inputs, in this case the linear and angular velocities of the robot. In addition to the transition function $f(X_k, u_k)$, the noise model $N$ is represented by a Gaussian distribution with zero mean and covariance $Q$ (Eq. 2), assumed constant.

\[
X_{k+1} = f(X_k, u_k) + N(0, Q), \qquad
Q = \begin{bmatrix} \sigma_{v_k}^2 & 0 \\ 0 & \sigma_{\omega_k}^2 \end{bmatrix} \tag{2}
\]

\[
f(X_k, u_k) = X_k +
\begin{bmatrix}
v_k \,\Delta t \,\cos\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) \\
v_k \,\Delta t \,\sin\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) \\
\omega_k \,\Delta t
\end{bmatrix} \tag{3}
\]

In order to linearize the process, the EKF uses a Taylor expansion, which introduces uncertainty through the Jacobians of $f$ with respect to $X_k$ (Eq. 4) and to $u_k$ (Eq. 5).

\[
\nabla f_x = \frac{\partial f}{\partial X_k} =
\begin{bmatrix}
1 & 0 & -v_k \,\Delta t \,\sin\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) \\
0 & 1 & \;\;v_k \,\Delta t \,\cos\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) \\
0 & 0 & 1
\end{bmatrix} \tag{4}
\]

\[
\nabla f_u = \frac{\partial f}{\partial u_k} =
\begin{bmatrix}
\Delta t \,\cos\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) & -\frac{1}{2} v_k \,\Delta t^2 \,\sin\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) \\
\Delta t \,\sin\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) & \;\;\frac{1}{2} v_k \,\Delta t^2 \,\cos\!\left(\theta_k + \frac{\omega_k \Delta t}{2}\right) \\
0 & \Delta t
\end{bmatrix} \tag{5}
\]
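A minimal Python/NumPy sketch of this prediction phase (Eqs. 2–5, corresponding to lines 1–4 of Algorithm 1) is given below; the function and variable names are chosen for readability and are not from the authors' implementation.

```python
import numpy as np

def predict(X, P, u, Q, dt=0.04):
    """EKF prediction step: X = [x, y, theta], u = [v, w], P state covariance, Q input noise."""
    x, y, th = X
    v, w = u
    a = th + w * dt / 2.0
    X_pred = X + np.array([v * dt * np.cos(a), v * dt * np.sin(a), w * dt])   # Eq. (3)
    Fx = np.array([[1.0, 0.0, -v * dt * np.sin(a)],                           # Eq. (4)
                   [0.0, 1.0,  v * dt * np.cos(a)],
                   [0.0, 0.0, 1.0]])
    Fu = np.array([[dt * np.cos(a), -0.5 * v * dt**2 * np.sin(a)],            # Eq. (5)
                   [dt * np.sin(a),  0.5 * v * dt**2 * np.cos(a)],
                   [0.0, dt]])
    P_pred = Fx @ P @ Fx.T + Fu @ Q @ Fu.T
    return X_pred, P_pred
```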
Observation Model. The filter has two types of information inputs: the distance measured by the Pozyx system to each of the anchors, and the angle measured by the robot's camera to each of the ArUco markers. The filter therefore has two observation models, one for each type of measurement. Both models share the covariance matrix of the current state estimate of the filter. The model corresponding to the Pozyx measurements is presented in Eqs. 6 and 7 and characterizes the Euclidean distance between the estimated position of the robot and the position of the anchor. The model corresponding to the camera measurements, in turn, is presented in Eqs. 9 and 10 and characterizes the angle between the robot's reference frame and the ArUco marker. Both measurements are affected by additive Gaussian noise with zero mean and constant covariance ($R$), a parameter related to the characterization of the Pozyx error and of the robot camera, respectively. It should also be noted that, as for the state transition model, the EKF algorithm also needs the Jacobian of $h$ with respect to $X_k$ (Eq. 8 and Eq. 11).

\[
Z_{B,i} = [r_{B,i}] = h(M_{B,i}, X_k) + N(0, R), \qquad R = [\sigma_r^2] \tag{6}
\]

\[
\hat{Z}_{B,i} = h(M_{B,i}, \hat{X}_k) = \sqrt{(x_{B,i} - \hat{x}_{r,k})^2 + (y_{B,i} - \hat{y}_{r,k})^2} \tag{7}
\]

\[
\nabla h_x = \frac{\partial h}{\partial X_k} =
\begin{bmatrix}
-\dfrac{x_{B,i} - \hat{x}_{r,k}}{\sqrt{(x_{B,i} - \hat{x}_{r,k})^2 + (y_{B,i} - \hat{y}_{r,k})^2}} &
-\dfrac{y_{B,i} - \hat{y}_{r,k}}{\sqrt{(x_{B,i} - \hat{x}_{r,k})^2 + (y_{B,i} - \hat{y}_{r,k})^2}} &
0
\end{bmatrix} \tag{8}
\]

\[
Z_{B,i} = [\phi_{B,i}] = h(M_{B,i}, X_k) + N(0, R), \qquad R = [\sigma_\phi^2] \tag{9}
\]

\[
\hat{Z}_{B,i} = h(M_{B,i}, \hat{X}_k) = \mathrm{atan2}(y_{B,i} - \hat{y}_{r,k},\; x_{B,i} - \hat{x}_{r,k}) - \hat{\theta}_{r,k} \tag{10}
\]

\[
\nabla h_x = \frac{\partial h}{\partial X_k} =
\begin{bmatrix}
\dfrac{y_{B,i} - \hat{y}_{r,k}}{(x_{B,i} - \hat{x}_{r,k})^2 + (y_{B,i} - \hat{y}_{r,k})^2} &
-\dfrac{x_{B,i} - \hat{x}_{r,k}}{(x_{B,i} - \hat{x}_{r,k})^2 + (y_{B,i} - \hat{y}_{r,k})^2} &
-1
\end{bmatrix} \tag{11}
\]

Outlier Detection. Measurements taken with sensors are prone to errors, which can be caused by hardware failures or unexpected changes in the environment. To make proper use of the measurements, the system must be able to detect and handle the less reliable ones, i.e. those that deviate from the normal pattern, called outliers. There are several approaches to this problem, but in the presence of multivariate data a common method is to use the Mahalanobis Distance (MD) (Eq. 12) as a statistical measure of the probability of an observation belonging to a given data set. In this approach, outliers are removed whenever they lie outside a specific ellipse corresponding to a given probability of the distribution (Algorithm 2). In [10,11], this approach has been used successfully in different applications; it consists of calculating the normalized distance between a point and the entire population.

\[
MD(x) = (x - \mu)^T \, S^{-1} \, (x - \mu) \tag{12}
\]

In this case, $\hat{Z}_{B,i}$ is the population mean $\mu$ and $S$ is a factor representing a combination of the state uncertainty and the actual sensor measurement, present in line 8 of Algorithm 1. The choice of the threshold value can be made by a simple analysis of the data or by determining some probabilistic statistical parameter. In this case, the threshold was established according to the χ² probability table, so that observations with less than 5% probability were cut off in the case of the Pozyx system. For the ArUco system, a threshold was chosen so that observations with less than 2.5% probability were cut off. These values were selected over several iterations of the algorithm, keeping the ones for which the algorithm behaved best.
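Before presenting the full algorithm, the two observation models of Eqs. 6–11 can be sketched in a few lines of Python; variable names are chosen for readability and are not from the authors' code.

```python
import numpy as np

def h_range(beacon, X):
    """Predicted Pozyx range and its Jacobian (Eqs. 7-8); beacon = (x_B, y_B), X = [x, y, theta]."""
    dx, dy = beacon[0] - X[0], beacon[1] - X[1]
    r = np.hypot(dx, dy)
    H = np.array([[-dx / r, -dy / r, 0.0]])
    return r, H

def h_angle(beacon, X):
    """Predicted ArUco bearing and its Jacobian (Eqs. 10-11)."""
    dx, dy = beacon[0] - X[0], beacon[1] - X[1]
    q = dx**2 + dy**2
    phi = np.arctan2(dy, dx) - X[2]
    H = np.array([[dy / q, -dx / q, -1.0]])
    return phi, H
```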
Algorithm 1: EKF Localization Algorithm with known measurements
Input: X̂_k, Z_B,i, u_k, P_k, R, Q
Output: X̂_{k+1}, P_{k+1}
PREDICTION
1   X̂_{k+1} = f(X̂_k, u_k)
2   ∇f_x = ∂f/∂X_k
3   ∇f_u = ∂f/∂u_k
4   P_{k+1} = ∇f_x · P_k · ∇f_x^T + ∇f_u · Q · ∇f_u^T
CORRECTION
5   for i = 1 to N_B do
6       Ẑ_B,i = h(M_B,i, X̂_k)
7       ∇h_x = ∂h/∂X_k
8       S_k,i = ∇h_x · P_k · ∇h_x^T + R
9       OutlierDetected = OutlierDetection(S_k,i, Ẑ_B,i, Z_B,i)
10      if not OutlierDetected then
11          K_k,i = P_{k+1} · ∇h_x^T · S_k,i^{-1}
12          X̂_{k+1} = X̂_k + K_k,i · [Z_B,i − Ẑ_B,i]
13          P_{k+1} = [I − K_k,i · ∇h_x] · P_{k+1}
14      end
15  end

Algorithm 2: Outlier Detection Algorithm
Input: S_k,i, Ẑ_B,i, Z_B,i
Output: OutlierDetected
1   MD(Z_B,i) = (Z_B,i − Ẑ_B,i)^T · S_k,i^{-1} · (Z_B,i − Ẑ_B,i)
2   if MD(Z_B,i) > Threshold then
3       OutlierDetected = True
4   else
5       OutlierDetected = False
6   end
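A hedged Python sketch of the correction loop of Algorithm 1 with the Mahalanobis gate of Algorithm 2 is shown below. It reuses the h_range/h_angle helpers sketched earlier; the threshold value and the omission of angle wrapping are simplifications for illustration.

```python
import numpy as np

def correct(X, P, beacons, measurements, h_fn, R, threshold=3.84):
    """One EKF correction pass over a set of beacons (Algorithm 1, lines 5-15).
    threshold ~ 3.84 corresponds to a 5% chi-square cut-off with 1 degree of freedom.
    Angle innovations should be wrapped to [-pi, pi]; omitted here for brevity."""
    for beacon, z in zip(beacons, measurements):
        z_hat, H = h_fn(beacon, X)            # h_range or h_angle from the previous sketch
        S = H @ P @ H.T + R                   # innovation covariance (line 8)
        md = (z - z_hat) ** 2 / S[0, 0]       # scalar Mahalanobis distance (Algorithm 2)
        if md > threshold:
            continue                          # discard the outlier
        K = P @ H.T @ np.linalg.inv(S)        # Kalman gain (line 11)
        X = X + (K * (z - z_hat)).ravel()     # state update (line 12)
        P = (np.eye(3) - K @ H) @ P           # covariance update (line 13)
    return X, P
```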
Pozyx Offset Correction. The Pozyx distance measurements suffer from different offsets, i.e. differences between the actual value and the measured value, depending on how far the Pozyx tag is from a given anchor. These offsets also vary depending on the anchor to which the distance is measured, as verified in Sect. 5.1. In order to correct this offset, Algorithm 3 was implemented. After calculating the estimated distance from the anchor to the robot tag, the pair of predefined values that bracket this distance is identified, and the correction is then made through linear interpolation. The variable RealDist is an array with a set of predefined distances for which the offsets have been determined. The variable Offset is a 4 × length(RealDist) matrix containing the offsets corresponding to each distance for each of the four anchors placed in the environment.

Algorithm 3: Offset Correction Algorithm
Input: Ẑ_B,i
Output: Ẑ_B,i
1   if Ẑ_B,i ≤ RealDist[0] then
2       Ẑ_B,i = Ẑ_B,i + Offset[i, 0]
3   else if Ẑ_B,i ≥ RealDist[7] then
4       Ẑ_B,i = Ẑ_B,i + Offset[i, 7]
5   else
6       for j = 0 to length(RealDist) − 1 do
7           if Ẑ_B,i ≥ RealDist[j] and Ẑ_B,i ≤ RealDist[j+1] then
8               OffsetLin = Offset[i, j] + (Offset[i, j+1] − Offset[i, j]) / (RealDist[j+1] − RealDist[j]) · (Ẑ_B,i − RealDist[j])
9               Ẑ_B,i = Ẑ_B,i + OffsetLin
10          end
11      end
12  end
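The bracketing and interpolation of Algorithm 3 map directly onto a one-line interpolation in NumPy, as the sketch below illustrates. The offset row is taken from Table 1 for a single anchor and duplicated across the four anchors purely for illustration; the real calibration uses a different row per anchor.

```python
import numpy as np

# Predefined calibration distances (mm) and per-anchor offsets (mm).
real_dist = np.array([600, 800, 1000, 1200, 1400, 1600, 1800, 2000], dtype=float)
offset = np.array([[12, 25, 29, 17, 22, 12, -13, 20]] * 4, dtype=float)  # one row per anchor

def correct_offset(z_hat, anchor_idx):
    """Add the linearly interpolated offset for the given anchor; np.interp clamps at the
    endpoints, matching the first/last branches of Algorithm 3."""
    return z_hat + np.interp(z_hat, real_dist, offset[anchor_idx])
```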
5 Results and Discussion

The developed localization system was tested on the robot in a 2 m × 2 m square area (Fig. 5). The output state X̂_{k+1} of the Kalman Filter allows controlling the robot to follow the desired path. Pozyx anchors were placed at the corners of the square and ArUco markers at the following (x, y) positions, in metres: (0, 0.5), (0, 1.5), (0.5, 2), (1.5, 2), (2, 1.5), (2, 0.5), (1.5, 0) and (0.5, 0). The initial (x, y, θ) pose of the robot is (0.4, 0.5, 90°) and the robot moves in a clockwise direction.

Fig. 5. Robot map

Before proceeding to the localization tests, it was necessary to characterize the error of both the Pozyx system and the ArUco system in order to correctly implement the algorithms discussed. When characterizing the performance of the systems, it is important to realize that an error can have different sources and can be of different types. The experiments performed in the characterization of the systems focus mainly on the analysis of accuracy and precision, which are typically associated with systematic and random errors, respectively. In this particular work, the characterization of the sensor errors is particularly important, since it is used in the EKF to characterize the measurement noise. Furthermore, accuracy measures how close the observations are to the real values, thus characterizing the existing offset.

5.1 Pozyx Characterization

In order to obtain a complete characterization of the Pozyx system and its error, several experiments were performed in which distance measurements were taken between the static anchors and the Pozyx tag. These experiments were performed for different distances: [0.6, 0.8, 1, 1.2, 1.4, 1.6, 1.8, 2] m. It should also be noted that both the anchors and the tag were at the same height from the ground and that the real distance was measured with an appropriate laser. It was thus possible to determine the error/offset and the standard deviation of the measurements and to verify their evolution as a function of the distance between anchor and tag. Figure 6 shows the range of measurements for a distance of 1.6 m between an anchor and the tag. It can be concluded that the error of the Pozyx system follows a Gaussian distribution.

Fig. 6. Distribution of Pozyx measurements for a distance of 1.6 m
Table 1 shows the results obtained for the offset and for the standard deviation for different distances between a given anchor and the Pozyx tag. Note that the same experiments were performed for the four different anchors used in the robot localization. From the analysis of Table 1 it can be concluded that, independently of the distance, the standard deviation is approximately constant, i.e. it stays within the same range of values; therefore, an average of these values was used as the sensor noise in the EKF. Regarding the offset, it was concluded that it varies considerably with the distance and with the anchor. Therefore, the values estimated by the EKF were adjusted according to the distance between a given anchor and the tag, as mentioned in Sect. 4.2.

Table 1. Offset and SD as a function of the measured Pozyx distance

Real Distance (mm) | Mean (mm) | Offset (mm) | SD (mm)
600                | 612       | 12          | 22.7
800                | 825       | 25          | 29.1
1000               | 1029      | 29          | 29.3
1200               | 1217      | 17          | 33.9
1400               | 1422      | 22          | 26.7
1600               | 1612      | 12          | 29.1
1800               | 1787      | −13         | 27.2
2000               | 2020      | 20          | 33.9

5.2 ArUco Characterization

For the characterization of the ArUco system, a procedure similar to the Pozyx one was followed; however, instead of distance measurements, angle measurements were performed, more specifically of the φ angle. Tests were performed with the robot stationary and the on-board camera looking at an ArUco marker. As with the Pozyx system, the center of the camera and the ArUco marker were at the same height from the ground. Five different experiments were performed, represented in Fig. 7: robot perpendicular to and just in front of the marker (1), in front of the marker but rotated by a given angle to the right (2), in front of the marker but rotated by a given angle to the left (3), perpendicular to the marker but translated to the right (4) and, finally, perpendicular to the marker but translated to the left (5). The histogram resulting from experiment 1 is represented in Fig. 8. Table 2 shows the results obtained from all the experiments.
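The statistics reported in Tables 1 and 2 (mean, offset/error and standard deviation) can be computed from the raw samples in a few lines; the sketch below uses a synthetic sample array for illustration, not the measured data set.

```python
import numpy as np

def characterize(samples, real_value):
    """Mean, offset (mean - real) and sample standard deviation of a set of measurements."""
    mean = float(np.mean(samples))
    return mean, mean - real_value, float(np.std(samples, ddof=1))

# Example with synthetic Pozyx ranges around 1.6 m (values in mm)
mean, offset, sd = characterize(np.random.normal(1612, 29, size=500), 1600)
```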
Fig. 7. ArUco φ angle experiments

Fig. 8. Experiment 1 for ArUco characterization

Table 2. Results of the ArUco characterization experiments

Experiment | Real φ Angle (°) | Mean (°)  | Error (°) | SD (°)
1          | 0                | 0.0309    | 0.0309    | 1.1e-3
2          | 16.33            | 16.3359   | 0.0059    | 1.0e-3
3          | −12.94           | −12.9653  | −0.0233   | 1.3e-3
4          | 11.43            | 11.4486   | 0.0186    | 1.2e-3
5          | −15.33           | −15.3583  | −0.0283   | 9.0e-4

An average of the standard deviations in Table 2 was used to characterize the noise of the measurements obtained by the camera. During the localization tests it was found that, when the robot moves in a straight line, the measurements estimated by the EKF and those actually measured by the camera are close. However, when the robot rotates to change direction, these measurements drift far apart, as can be seen in Fig. 9, and the EKF did not behave as expected. The φ angle measurements obtained from the ArUco markers while the robot is rotating can therefore be considered outliers and are not used for the estimation of the robot's pose. For this purpose, an outlier filter was used, the same as the one used with the Pozyx system and explained in Sect. 4.2.
    248 S. Fariaet al. Fig. 9. Comparison of φ estimation and ArUco φ measurements 5.3 Localization Results For the localization tests it was important to tune the EKF, because changing its parameters affects the convergence of the algorithm, as well as the smoothness of the resulting trajectory estimate. Based on the noise characterization of the systems discussed above, the R and Q parameters were adjusted so that the estimation results presented were neither sensitive to noise nor too slow. To verify the effectiveness of the proposed system four different tests were performed: using only odometry (Fig. 10a), a fusion of odometry with the Pozyx system (Fig. 10c), a fusion of odometry with the ArUco system (Fig. 10b) and finally a fusion of odometry with the Pozyx and ArUco systems (Fig. 10d). Accu- racy was determined by a comparison with Ground-Truth discussed in Sect. 3. The errors shown in the figures refer to the errors in the last position of the robot. Using only the robot’s odometry, localization is clearly affected by wheel sliding around curves, and the error will increase as the robot travels longer dis- tances. By fusing the odometry with the angle information from the ArUco mark- ers a position error of 6.186 cm and an orientation error of 0.237◦ is obtained. It is thus concluded that the orientation of the robot is significantly improved. By fusing the odometry with the Pozyx system a 1.653◦ orientation error and 1.626 cm position error is obtained, thus improving the robot’s localization both in terms of position and orientation. Finally, through the fusion of all systems it is possible to acquire the benefits of each of the systems individually obtaining a position error of 1.987 cm and an orientation error of 0.393◦ .
    Sensor Fusion forMobile Robot Localization 249 (a) Only Odometry measurements (b) Fusion of odometry and ArUco measurements (c) Fusion of odometry and Pozyx measurements (d) Fusion of all measurements Fig. 10. EKF estimate results
    250 S. Fariaet al. 6 Conclusions and Future Work Overall the proposed system proved to be effective in improving the robot’s localization. The use of Pozyx system and ArUco markers allowed to improve the estimation of the robot’s pose in relation to using only the robot’s odometry that would induce more and more errors over time. However, a careful tuning of the Kalman Filter is necessary to ensure a correct convergence of the robot’s pose. The results obtained showed an error in the order of centimeters, unlike other existing technologies, such as GPS and Wi-Fi, whose error is in the order of meters. As future work, it could be possible to add inputs to the Kalman Filter, such as other possible measurements extracted from the ArUco markers, i.e. the distance to them and their inclination. Acknowledgements. This work is financed by National Funds through the Por- tuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia within project UIDB/50014/2020. References 1. Chen, L., Hu, H., McDonald-Maier, K.: EKF based Mobile Robot Localization. In: Third International Conference on Emerging Security Technologies (2012) 2. Alarifi, A., et al.: Ultra wideband indoor positioning technologies: analysis and recent advances. In: Sensors (2016) 3. Thrun, S., Fox, D., Burgard, W.: Probabilistic Robotics. Open University Press (2002) 4. Sakperea, W., Adeyeye-Oshinb, M., Mlitwa, N.: A state-of-the-art survey of indoor positioning and navigation systems and technologies. South African Comput. J. 29, 145–197 (2017) 5. Liu, Y., Yuchuan Song, Y.: A robust method of fusing ultra-Wideband range mea- surements with odometry for wheeled robot state estimation in indoor environ- ment. In: 2018 Chinese Control And Decision Conference (CCDC) (2018) 6. Babineca, A., Jurišicaa, L., Hubinskýa, P., Duchon, F.: Visual localization of mobile robot using artificial markers. Procedia Eng. 96, 1–9 (2014) 7. Fiala, M.: ARTag, a fiducial marker system using digital techniques. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2005) 8. Zheng, J., Bi, S., Cao, B., Yang, D.: Visual localization of inspection robot using extended Kalman filter and aruco markers. In: IEEE International Conference on Robotics and Biomimetics (2018) 9. Oliveira, D., Simões, L., Martins, M., Paulino, M.: EKF-SLAM using visual mark- ers for ITER’s Remote Handling Transport Casks. In: Instituto Superior Técnico, Autonomous Systems (2019) 10. Sobreira, H.: Fiabilidade e robustez da localização de robôs móveis (2016) 11. Yenké, B., Aboubakar, M., Titouna, C., Ari, A., Gueroui, A.: Adaptive scheme for outliers detection in wireless sensor networks. Int. J. Comput. Networks Commun. Secur. (2017) 12. Camera Calibration. https://docs.opencv.org/master/dc/dbb/tutorial py calibration.html
Deep Reinforcement Learning Applied to a Robotic Pick-and-Place Application

Natanael Magno Gomes1(B), Felipe N. Martins1, José Lima2,3, and Heinrich Wörtche1,4

1 Sensors and Smart Systems Group, Institute of Engineering, Hanze University of Applied Sciences, Groningen, The Netherlands
natanael gomes@msn.com
2 The Research Centre in Digitalization and Intelligent Robotics (CeDRI), Polytechnic Institute of Bragança, Bragança, Portugal
3 Centre for Robotics in Industry and Intelligent Systems — INESC TEC, Porto, Portugal
4 Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands

Abstract. Industrial robot manipulators are widely used for repetitive applications that require high precision, like pick-and-place. In many cases, the movements of industrial robot manipulators are hard-coded or manually defined, and need to be adjusted if the objects being manipulated change position. To increase flexibility, an industrial robot should be able to adjust its configuration in order to grasp objects in variable/unknown positions. This can be achieved by off-the-shelf vision-based solutions, but most require prior knowledge about each object to be manipulated. To address this issue, this work presents a ROS-based deep reinforcement learning solution to robotic grasping for a Collaborative Robot (Cobot) using a depth camera. The solution uses deep Q-learning to process the color and depth images and generate an ε-greedy policy used to define the robot action. The Q-values are estimated using a Convolutional Neural Network (CNN) based on pre-trained models for feature extraction. Experiments were carried out in a simulated environment to compare the performance of four different pre-trained CNN models (ResNext, MobileNet, MNASNet and DenseNet). Results show that the best performance in our application was reached by MobileNet, with an average of 84% accuracy after training in the simulated environment.

Keywords: Cobots · Reinforcement learning · Computer vision · Pick-and-place · Grasping

© Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 251–265, 2021.
https://doi.org/10.1007/978-3-030-91885-9_18

1 Introduction

The usage of robots has been increasing in industry for the past 50 years [1], especially in repetitive tasks. Recently, industrial robots are being deployed
in applications in which they share (part of) their working environment with people. These types of robots are often referred to as Cobots and are equipped with safety systems according to ISO/TS 15066:2016 [2]. Although Cobots are easy to set up and program, their programs are usually written manually. If there is a change in the position of objects in their workspace, which is common when humans also interact with the scene, their program needs to be adjusted. Therefore, to increase flexibility and to facilitate the implementation of robotic automation, the robot should be able to adjust its configuration in order to interact with objects in variable positions.

A robot manipulator consists of a series of joints and links forming the arm; the end-effector is placed at its far end. The purpose of an end-effector is to act on the environment, for example by manipulating objects in the scene. The most common end-effector for grasping is the simple parallel gripper, consisting of a two-jaw design. Grasping is a difficult task when objects are not always in the same position. Several techniques have been applied to obtain a grasping position on the object. In [3], a vision technique is used to define candidate points on the object and then triangulate one point where the object can be grasped.

With the evolution of processing power, Computer Vision (CV) has also played an important role in industrial automation over the last 30 years, including depth image processing [4]. CV has been applied from food inspection [5,6] to smartphone parts inspection [7]. Red Green Blue Depth (RGBD) cameras contain a sensor capable of acquiring color and depth information and have been used in robotics to increase flexibility and bring new possibilities. Several models are available, e.g. Asus Xtion, Stereolabs ZED, Intel RealSense and the well-known Microsoft Kinect.

One approach to grasping different types of objects using RGBD cameras is to create 3D templates of the objects and a database of possible grasping positions. The authors in [8] used a dual Machine Learning (ML) approach, one model to identify familiar objects with spin-images and a second to recognize an appropriate grasping pose. This work also used interactive object labelling and kinesthetic grasp teaching. The success rate varies according to the number of known objects and ranges from 45% up to 79% [8].

Deep Convolutional Neural Networks (DCNNs) have been used to identify robotic grasp positions in [9]. The method uses an RGBD image as input and gives a five-dimensional grasp representation, with position (x, y), a grasp rectangle (h, w) and orientation θ of the grasp rectangle with respect to the horizontal axis. Two residual DCNNs (ResNets) with 50 layers each are used to analyse the image and generate the features used by a shallow CNN to estimate the grasp position. The networks are trained on a large dataset of known objects and their grasp positions. The Generative Grasping Convolutional Neural Network (GG-CNN) proposed in [10] is a solution that is fast to compute, capable of running in real time at 50 Hz. It uses a DCNN with just 10 to 20 layers to analyse the images and depth information and to control the robot in real time to grasp objects, even when they change position in the scene.

In this paper we investigate the use of Reinforcement Learning (RL) to train an Artificial Intelligence (AI) agent to control a Cobot to perform a given pick-and-place task, estimating the grasping position without previous knowledge
    Deep RL Appliedto a Robotic Pick-and-Place Application 253 about the objects. To enable the agent to execute the task, an RGBD camera is used to generate the inputs for the system. An adaptive learning system was implemented to adapt to new situations such as new configurations of robot manipulators and unexpected changes in the environment. 2 Theoretical Background In this section we present a summary of relevant concepts used in the develop- ment of our system. 2.1 Convolutional Neural Networks CNN is a class of algorithms which use the Artificial Neural Network in com- bination with convolutional kernels to extract information from a dataset. The convolutional kernel scans the feature space and the result is stored in an array to be used in the next step of the CNN. CNN have been applied in different solutions in machine learning, such as object detection algorithms, natural language processing, anomaly detection, deep reinforcement learning among others. The majority of the CNN applica- tion is in the computer vision field with a highlight to object detection and classification algorithms. The next section explores some of these algorithms. 2.2 Object Detection and Classification Algorithms In the field of artificial intelligence, image processing for object detection and recognition is highly advanced. The increase of Central Processing Unit (CPU) processing power and the increased use of Graphics Processing Unit (GPU) have an important role in the progress of image processing [11]. The problems of object detection are to detect if there are objects in the image, to estimate the position of the object in the image and predict the class of the object. In robotics the orientation of the object can also be very important to determine the correct grasp position. A set of object detection and recognition algorithms are investigated in this section. Several features arrays are extracted from the image and form the base for the next layer of convolution and so on to refine and reduce dimensionality of the features, the last step is a classification Artificial Neural Network (ANN) which is giving the output in a form of certainty to a number of classes. See Fig. 1 where a complete CNN is shown. The learning process of a CNN is to determine the value of the kernels to be used during the multiple convolution steps. The learning process can take up to hours of processing a labeled data set to estimate the best weights for the specific object. The advantage is once the model weights have been determined they can be stored for future applications. In [13] a Regions with Convolutional Neural Networks (R-CNN) algorithm is proposed to solve the problem of object detection. The principle is to propose
    254 N. M.Gomes et al. Fig. 1. CNN complete process, several convolutional layers alternate with pooling and in the final classification step a fully connected ANN [12]. around 2000 areas on the image with possible objects and for each one of these extract features and analyze with a CNN in order to classify the objects in the image. The problem of R-CNN is the high processing power needed to perform this task. A modern laptop is able to analyze a high definition image using this technique in about 40 s, making it impossible to execute real time video analysis. But still capable of being used in some applications where time is not important or where it is possible to use multiple processors to perform the task, since each processor can analyze one proposed region. An alternative to R-CNN is called Fast R-CNN [14] where the features are extracted before the region proposition is done, so it saves processing time but loses some abilities to parallel processing. The main difference to R-CNN is the unique convolutional feature map from the image. The Fast R-CNN is capable of near real time video analysis in a modern laptop. For real time application there is a variation of this algorithm proposed in [15] called Faster R-CNN. It uses the synergy of between steps to reduce the number of proposed objects, resulting in an algorithm capable of analyzing an image in 198 ms, sufficient for video analysis. Faster R-CNN has an average result of over 70% of correct identifications. Extending Faster R-CNN the Mask R-CNN [16,17] creates a pixel segmen- tation around the object, giving more information about the orientation of the object, and in the case of robotics a first hint to where to pick the object. There are efforts to use depth images with object detection and recognition algorithms as shown in [18], where the positioning accuracy of the object is higher than RGB images. 2.3 Deep Reinforcement Learning Together with Supervised Learning and Unsupervised Learning, RL forms the base of ML algorithms. RL is the area of ML based on rewards and the learning process occurs via interaction with the environment. The basic setup includes the agent being trained, the environment, the possible actions the agent can take and the reward the agent receives [19]. The reward can be associated with the action taken or with the new state.
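The basic RL setup described above (agent, environment, actions and rewards) can be summarised by a small interaction loop. The sketch below assumes a gym-style environment interface and hypothetical agent methods (act, observe); it is an illustration of the concept, not the system implemented in this work.

```python
def run_episode(env, agent, max_steps=100):
    """Run one episode of agent-environment interaction and return the accumulated reward."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent.act(state)                    # the agent chooses one of the possible actions
        state, reward, done, _ = env.step(action)    # the environment returns the new state and reward
        agent.observe(state, reward)                 # reward may relate to the action or to the new state
        total_reward += reward
        if done:
            break
    return total_reward
```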
Some problems in RL can be too large to have exact solutions and demand approximate solutions. The use of deep learning to tackle this problem in combination with RL is called Deep Reinforcement Learning (deep RL). Some problems can require more memory than available: for example, a Q-table storing all possible solutions for an input color image of 250 × 250 pixels would require 250 × 250 × 255 × 255 × 255 = 1,036,335,937,500 bytes, i.e., about 1 TB. For such large problems the exact solution can be prohibitive in terms of required memory and processing time.

2.4 Deep Q Learning

For large problems, the Q-table can be approximated using ANNs and CNNs to estimate the Q-values. The Deep Q Learning Network (DQN) was proposed in [20] to play Atari games at a high level, and later this technique was also used in robotics [21,22]. A self-balancing robot was controlled using DQN in a simulated environment with better performance than Linear-Quadratic Regulator (LQR) and fuzzy controllers [23]. Several DQNs have also been tested for ultrasound-guided robotic navigation in the human spine to locate the sacrum [24].

3 Proposed System

The proposed system consists of a collaborative robot equipped with a two-finger gripper and a fixed RGBD camera pointing at the working area. The control architecture was designed around a DQN that estimates the Q-values in the Q-Estimator. RL demands multiple episodes to obtain the necessary experience. Acquiring experience can be accelerated in a simulated environment, which can also be enriched with data not available in the real world. The proposed architecture, shown in Fig. 2, was designed to work in both simulated and real environments to allow experimentation on a real robot in the future.

The proposed architecture uses Robot Operating System (ROS) topics and services to transmit data between the learning side and the execution side. The boxes shown in blue in Fig. 2 are the ROS drivers, necessary to expose the hardware functionalities to the ROS environment. The execution side can be simulated, to easily collect data, or real hardware for fine tuning and evaluation. As in [22], the action space is defined as motor control and the Q-values correspond to the probability of grasp success.

The chosen policy for the RL algorithm is ε-greedy, i.e., the action with the maximum estimated reward is selected, with probability ε of taking a random action instead. The R-Estimator estimates the reward based on the success of the grasp and the distance reached to the objects, following Eq. 1:

R_t = \begin{cases} \frac{1}{d_t + 1}, & \text{if } 0 \leq d_t \leq 0.02 \\ 0, & \text{otherwise} \end{cases}   (1)

where d_t is the distance in meters.
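A minimal sketch of the ε-greedy action selection and of the reward of Eq. 1 is shown below, assuming the Q-values arrive as a flat array and using the reconstructed form 1/(d + 1) inside the 2 cm threshold; names and data layout are illustrative, not the exact implementation.

```python
import numpy as np

_rng = np.random.default_rng()

def epsilon_greedy(q_values: np.ndarray, epsilon: float) -> int:
    """With probability epsilon take a random action, otherwise the action with the highest Q-value."""
    if _rng.random() < epsilon:
        return int(_rng.integers(len(q_values)))
    return int(np.argmax(q_values))

def reward(distance_m: float) -> float:
    """Reward following Eq. 1: non-zero only when the distance to the object is within 2 cm."""
    return 1.0 / (distance_m + 1.0) if 0.0 <= distance_m <= 0.02 else 0.0
```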
Fig. 2. Proposed architecture for grasp learning, divided into the execution side (left) and the learning side (right). The modules in blue are ROS drivers and the modules in yellow are Python scripts.

3.1 Action Space

RL gives freedom in defining the possible actions of the agent. In this work, actions are defined as the possible positions at which to attempt to grasp an object inside the work area:

Sa = {v, w}, (2)

where v is the proportional position inside the working area along the x axis and w is the proportional position along the y axis. The values are discretized by the output of the CNN.

3.2 Convolutional Neural Network

A CNN is used to estimate the Q-values. For the action space Sa, the network consists of two blocks that extract features from the images, a concatenation of the features, and another CNN that produces the Q-values. The feature extraction blocks are pre-trained PyTorch models with the final classification network removed. The layer to be removed is different for each model; in general, the fully connected layers are removed. Four models were selected to compose the network: DenseNet, MobileNet, ResNext and MNASNet. The selection criteria considered the feature space and the performance of the models. The use of pre-trained PyTorch models reduces the overall training time.
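The sketch below illustrates the general idea of reusing a pre-trained torchvision backbone with its classifier removed and normalizing inputs with the statistics expected by the pre-trained weights. Which two images feed the two branches, and how the features are merged downstream, are assumptions for illustration; the actual system may differ.

```python
import torch
from torchvision import models, transforms

# Normalization with the ImageNet statistics expected by torchvision pre-trained models
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Keep only the convolutional part of a pre-trained backbone (classifier removed);
# other backbones (MobileNet, ResNext, MNASNet) need their own classifier layers removed.
backbone = models.densenet121(pretrained=True).features

def extract_features(img_a: torch.Tensor, img_b: torch.Tensor) -> torch.Tensor:
    """Run two image tensors (e.g. RGB and a 3-channel depth map) through the same
    backbone and concatenate the resulting feature maps, loosely following Fig. 3."""
    with torch.no_grad():
        fa = backbone(img_a)   # (1, 1024, 7, 7) for 224x224 inputs
        fb = backbone(img_b)
    return torch.cat([fa, fb], dim=1)   # (1, 2048, 7, 7)
```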
However, it brings limitations to the system: the input image must be 224 by 224 pixels and must be normalized with the mean and standard deviation of the original training dataset [26]. In general, this limits the working area of the algorithm to an approximately square region (Fig. 3).

Fig. 3. The CNN architecture for the action space Sa. The two main blocks are a simplified representation of a pre-trained DenseNet model [25], of which only the feature sizes are shown: each branch reduces a 224 × 224 input to 1024 feature maps of 7 × 7. The features from the two DenseNet branches are concatenated (2048 × 7 × 7), passed through further convolutions and upsampled to n maps of 112 × 112; the result is an array of Q-values used to determine the action.

3.3 Simulation Environment

The simulation environment was built on Webots, an open-source robotics simulator [27]. This choice was made considering the usability of the software and its use of computational resources [28]. To enclose the simulation in the ROS environment, some modules were implemented: Gripper Control, Camera Control and a Supervisor to control the simulation. The simulated UR3e robot is connected to ROS using the ROS driver provided by the manufacturer and is controlled through the Kinematics module. Figure 4 shows the simulation environment, in which the camera is located in front of the robot, pointing at the working area.

A feature of the simulated environment is having control over the positions and colors of all objects. The positions were used as information for the reward, and the color of the table was changed randomly at each episode to increase robustness during training. For each attempt, the table color, the number of objects and the positions of the objects were randomly changed.

Webots Gripper Control. The Gripper Control reads and controls the positions of the joints of the simulated gripper. It controls all joints, motors and sensors of the simulated gripper. Touch sensors were also added at the tips of the fingers to emulate the feedback signal when an object is grasped.
Fig. 4. The virtual environment built on Webots: it consists of a table, a UR3e collaborative robot, a camera and the objects used in the training.

The Robotiq 2F-85 is the gripper we are going to use in future experiments with the real robot. It consists of six rotational joints intertwined to form the two fingers. During tests, the simulation of the closed kinematic chain of this gripper in Webots was not stable. To regain stability in simulation, we used a gripper with a simpler mechanical structure but with dimensions similar to those of the Robotiq 2F-85. The gripper used in simulation is shown in detail in Fig. 5.

Fig. 5. Detail of the gripper used in the simulation: its appearance is based on the Kuka Youbot gripper and its bounding objects are simplified to blocks.

Webots Supervisor. The Supervisor is responsible for resetting the simulation, preparing the positions of the objects at the beginning of each episode, changing the color of the table and publishing the positions of the objects to the reward estimator. To estimate the distance between the center of the end-effector and the objects, a GPS position sensor is placed at the gripper's center to inform the Supervisor of its position. The positions of the objects are used to shape the reward proportionally to the distance between the end-effector and the object.
Although this information is not available in the real world, it is used to speed up the training sessions in simulation.

Webots Camera. The camera simulated in Webots has the same resolution as the Intel RealSense camera. To avoid the need for calibration of the depth camera, the RGB and depth cameras have coincident positions and fields of view in simulation. The field of view is the same as that of the Intel RealSense RGB camera: 69.4°, or 1.21 rad.

3.4 Integrator

The Integrator is responsible for connecting all modules, simulated or real. It controls the Webots simulation using the Supervisor API and feeds the RGBD images to the neural network.

Kinematics Module. The kinematics module controls the UR3e robot, simulated or real. It contains several methods to execute the calculations needed for the movement of the cobot. Although RL has been used to solve the kinematics in other works [22,29], this is not the case in our system. Instead, we make use of the analytical solution of the forward and inverse kinematics of the UR3e [30]. The Denavit-Hartenberg parameters are used to calculate the forward and inverse kinematics of the robot [31]. Considering that the UR3e has 6 joints, the combination of 3 of them can give 2^3 = 8 different configurations that result in the same pose of the end-effector (elbow up and down, wrist up and down, shoulder forward and back). On top of that, the UR3e joints have a range from −2π to +2π rad, increasing the possible solution space to 2^6 = 64 different configurations for the same pose of the end-effector. To reduce the problem, the range of the joints is limited via software to −π to +π rad, which still gives 8 possible solutions, from which the one nearest to the current position is selected.

The kinematics module is capable of moving the robot to any position in the workspace while avoiding unreachable positions. To increase the usability of the module, functions with the same behavior as the original Universal Robots "MOVEL" and "MOVEJ" commands have been implemented. To estimate the cobot joint angles that position the end-effector in space, the Tool Center Point (TCP) must be considered in the model. The TCP is the position of the end-effector in relation to the robot flange. The real robot that will be used for future experiments has a Robotiq wrist camera and a 2F-85 gripper, which means that the TCP is 175.5 mm from the robot flange along the z axis [32].
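As an illustration of the last step described above (choosing, among the analytically computed configurations, the one closest to the current joint state), a possible selection rule is sketched below; the Euclidean joint-space metric and data layout are assumptions, not necessarily the exact implementation.

```python
import numpy as np

def nearest_solution(candidates: np.ndarray, current: np.ndarray) -> np.ndarray:
    """Pick, among the IK candidates (one row per 6-joint configuration),
    the configuration closest to the current joint angles."""
    distances = np.linalg.norm(candidates - current, axis=1)
    return candidates[int(np.argmin(distances))]

# Example with 8 hypothetical candidate configurations for the same end-effector pose
candidates = np.random.uniform(-np.pi, np.pi, size=(8, 6))
current = np.zeros(6)
best = nearest_solution(candidates, current)
```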
4 Results and Discussion

This section presents the results and discussion of two training sessions with different methods. The tests were performed on a laptop with an i7-9750H CPU, 32 GB of RAM and a GTX 1650 4 GB GPU, running Ubuntu 18.04. Although the GPU was not used for the CNN training, the simulation environment made use of it.

4.1 Modules

All modules were tested individually to ensure proper functioning. The ROS communication was tested using the built-in tool rqt to check the connections between nodes via topics or services. The UR3e joint positions are always published in a topic and controlled via an action client. In the simulation environment, the camera images, the gripper control and the supervisor commands are made available via ROS services. Unlike ROS topics, ROS services only transmit data when queried, decreasing the processing demanded from Webots. Figure 6 shows the nodes connected via topics in the simulated environment; services are not represented in this diagram. The diagram was created with rqt.

Fig. 6. Diagram of the nodes running during the testing phase. In the simulated environment most of the data is transmitted via ROS services. In the picture, the topics /follow_joint_trajectory and /gripper_status are responsible for the robot movement and gripper status information exchange, respectively.

CNN. Of the four models tested, DenseNet and ResNext demanded more memory than the GPU had available, while MobileNet and MNASNet were capable of running on the GPU. To keep the evaluation fair, all timing tests were performed on the CPU.

4.2 Training

The CNN was trained using a Huber loss function [33] and an Adam optimizer [34] with weight decay regularization [35]; the hyperparameters used for the RL and CNN training are shown in Table 1.
Table 1. Hyperparameters used in training.

Parameter name | Symbol | Value
Learning rate for CNN | αCNN | 1 × 10−3
Weight decay for CNN | λw | 8 × 10−4
Learning rate for RL | αRL | 0.7
Discount factor | γ | 0.90
Initial exploration factor | ε0 | 0.90
Final exploration factor | εf | 5 × 10−2
Exploration factor decay | λε | 200

To avoid color bias in the algorithm, the color of the simulated table was changed at every episode. Each training session was divided into four parts: collecting data, deciding the action to take based on the estimated Q-values, taking the action and receiving a reward, and training the CNN. Several training sessions were performed, and the experience from previous rounds was used to improve the training process.

The training cycle times are shown in Table 2. Forward is the pass from the input to the output of the CNN; backward is the pass that evaluates the gradient from the difference at the output back to the input. In the backward pass, the weights of the network are updated with the learning rate αCNN.

Table 2. Mean and standard deviation of the forward and backward times during training.

Base model name | Forward time (s) | Backward time (s)
DenseNet | 0.408 ± 0.113 | 0.676 ± 0.193
ResNext | 0.366 ± 0.097 | 0.760 ± 0.173
MobileNet | 0.141 ± 0.036 | 0.217 ± 0.053
MNASNet | 0.156 ± 0.044 | 0.257 ± 0.074

First Training Session. In the first training round no previous experience is used and the algorithm learns from scratch. The main goal is to obtain information about the training process, namely cycle times, and to acquire experience to be used in future training sessions. The algorithm was trained on the most recent experience with a batch size of 1. During training, the accuracy was estimated from 10 attempts every 10 epochs to verify how well the algorithm was performing at the time. The results are shown in Fig. 7. The training session took between 1 h 43 min and 2 h 05 min to complete.
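The training setup of Table 1 can be sketched as follows. The Huber loss corresponds to PyTorch's SmoothL1Loss [33] and the Adam optimizer with decoupled weight decay to AdamW [35]; the exponential ε-decay schedule (from ε0 to εf with time constant λε) is a common DQN pattern and an assumption here, as the exact decay formula is not spelled out in the paper.

```python
import math
import torch

model = torch.nn.Linear(10, 4)  # placeholder for the Q-value CNN of Sect. 3.2

# Huber loss and Adam with decoupled weight decay, using the values of Table 1
criterion = torch.nn.SmoothL1Loss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=8e-4)

EPS_START, EPS_END, EPS_DECAY = 0.90, 0.05, 200

def epsilon(episode: int) -> float:
    """Assumed exponential decay of the exploration factor."""
    return EPS_END + (EPS_START - EPS_END) * math.exp(-episode / EPS_DECAY)
```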
Figure 7 shows a training problem in which the loss reaches zero and there is no gradient left for learning. The algorithm cannot learn, and the accuracy shows that the estimated Q-values are poor. Several causes can explain this behavior, including CNN weights that are too small and accumulated experience that consists mostly of errors. The solutions are complex and include fine-tuning the hyperparameters and selecting the best experiences for the algorithm, as shown in [36]. Another solution is to use demonstration through shaping [37], where the reward function is used to generate training data based on demonstrations of the correct action to take. The training data for the second session was generated by using the reward function to map all possible rewards of the input.

Fig. 7. The loss and accuracy of a 1000-epoch training session; loss data were smoothed using a third-order filter, and the raw data are shown in light colors.

Fig. 8. The loss and accuracy during a 250-epoch training session; data were smoothed using a third-order filter, and the raw data are shown in light colors.

Second Training Session. The second training session used demonstration through shaping. This was possible because, in the simulation environment, the information about the positions of the objects is available. The training process received experiences generated from the simulation, and these experiences contain the best possible action for each episode.
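One possible reading of how such demonstration data can be generated (evaluating the reward for every cell of the discretized action grid, so that the best action is known for each episode) is sketched below. The 112 × 112 grid, the use of normalized working-area coordinates and the distance computation are illustrative assumptions, not the exact procedure of the paper.

```python
import numpy as np

def reward(distance_m: float) -> float:
    """Reward of Eq. 1 (see Sect. 3); distances must be expressed in meters."""
    return 1.0 / (distance_m + 1.0) if 0.0 <= distance_m <= 0.02 else 0.0

def demonstration_targets(object_xy: np.ndarray, grid: int = 112) -> np.ndarray:
    """Map every discretized action (v, w) in the unit working area to its reward.
    Distances are computed in normalized units here; in practice they would be
    scaled to meters before applying the 2 cm threshold."""
    v = np.linspace(0.0, 1.0, grid)
    w = np.linspace(0.0, 1.0, grid)
    vv, ww = np.meshgrid(v, w, indexing="ij")
    dist = np.hypot(vv - object_xy[0], ww - object_xy[1])
    return np.vectorize(reward)(dist)

targets = demonstration_targets(np.array([0.4, 0.6]))
```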
The batch size used in this training session was 10. The increase in batch size, combined with the new experience replay, caused a larger loss at the beginning of the training session, as seen in Fig. 8. The training session took between 3 h 43 min and 4 h 18 min to complete. The accuracy was estimated for every epoch based on 10 attempts.

5 Conclusion

This paper presented the use of RL to train an AI agent to control a collaborative robot to perform a pick-and-place task while estimating the grasping position without previous knowledge about the object. An RGBD camera was used to generate the inputs for the system. An adaptive learning system was implemented to cope with new situations such as new configurations of robot manipulators and unexpected changes in the environment. The results obtained in simulation validated the proposed approach. As future work, an implementation with a real manipulator will be addressed.

Acknowledgements. This work has been supported by FCT - Fundação para a Ciência e Tecnologia within the Project Scope: UIDB/05757/2020 and by the Innovation Cluster Dracten (ICD), project Collaborative Connected Robots (Cobots) 2.0. The authors also thank the support from the Research Centre Biobased Economy from the Hanze University of Applied Sciences.

References

1. Siciliano, B., Khatib, O. (eds.): Springer Handbook of Robotics, pp. 1–2227. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-32552-1
2. ISO/TS 15066 Robots and robotic devices - Collaborative robots. International Organization for Standardization, Geneva, CH, Standard, February 2016
3. Saxena, A., Driemeyer, J., Kearns, J., Ng, A.Y.: Robotic grasping of novel objects. In: Advances in Neural Information Processing Systems, pp. 1209–1216 (2007). https://doi.org/10.7551/mitpress/7503.003.0156. ISBN: 9780262195683
4. Torras, C.: Computer Vision: Theory and Industrial Applications, p. 455. Springer, Heidelberg (1992). https://doi.org/10.1007/978-3-642-48675-3. ISBN: 3642486754
5. Gomes, J.F.S., Leta, F.R.: Applications of computer vision techniques in the agriculture and food industry: a review. Eur. Food Res. Technol. 235, 989–1000 (2012). https://doi.org/10.1007/s00217-012-1844-2
6. Arakeri, M.P., Lakshmana: Computer vision based fruit grading system for quality evaluation of tomato in agriculture industry. Procedia Comput. Sci. 79, 426–433 (2016). https://doi.org/10.1016/j.procs.2016.03.055
7. Bhutta, M.U.M., Aslam, S., Yun, P., Jiao, J., Liu, M.: Smart-inspect: micro scale localization and classification of smartphone glass defects for industrial automation. arXiv: 2010.00741, October 2020
8. Shafii, N., Kasaei, S.H., Lopes, L.S.: Learning to grasp familiar objects using object view recognition and template matching. In: IEEE International Conference on Intelligent Robots and Systems, vol. 2016-November, pp. 2895–2900. Institute of Electrical and Electronics Engineers Inc., November 2016. https://doi.org/10.1109/IROS.2016.7759448. ISBN: 9781509037629
    264 N. M.Gomes et al. 9. Kumra, S., Kanan, C.: Robotic grasp detection using deep convolutional neural net- works. In: IEEE International Conference on Intelligent Robots and Systems, vol. 2017-September, pp. 769–776. Institute of Electrical and Electronics Engineers Inc., November 2017. https://doi.org/10.1109/IROS.2017.8202237. arXiv: 1611.08036. ISBN: 9781538626825 10. Morrison, D., Corke, P., Leitner, J.: Learning robust, real-time, reactive robotic grasping. Int. J. Robot. Res. 39(2–3), 183–201 (2020). https://doi.org/10.1177/ 0278364919859066. ISSN: 0278-3649 11. Mittal, S., Vaishay, S.: A survey of techniques for optimizing deep learning on GPUs. J. Syst. Archit. 99, 101635 (2019). https://doi.org/ 10.1016/j.sysarc.2019.101635. http://www.sciencedirect.com/science/article/pii/ S1383762119302656. ISSN: 1383-7621 12. Saha, S.: A comprehensive guide to convolutional neural networks - the ELI5 way - by Sumit Saha - towards data science (2018). https://towardsdatascience. com/a-comprehensiveguide-to-convolutional-neural-networks-the-eli5-way- 3bd2b1164a53. Accessed 20 June 2020 13. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accu- rate object detection and semantic segmentation. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 580–587 (2014). https://doi.org/10.1109/CVPR.2014.81. arXiv: 1311.2524. ISBN: 9781479951178 14. Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, vol. 2015 Inter, pp. 1440–1448 (2015). https://doi.org/10. 1109/ICCV.2015.169. arXiv: 1504.08083. ISSN: 15505499 15. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2017). https://doi.org/10.1109/TPAMI.2016.2577031. arXiv: 1506.01497. ISSN: 01628828 16. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. 42(2), 386–397 (2020). https://doi.org/10.1109/TPAMI.2018. 2844175. arXiv: 1703.06870. ISSN: 19393539 17. Girshick, R., Radosavovic, I., Gkioxari, G., Dollár, P., He, K.: Detectron (2018). https://github.com/facebookresearch/detectron 18. Debkowski, D.: SuperBadCode/Depth-Mask-RCNN: using Kinect2 depth sensors to train neural network for object detection interaction. https://github.com/ SuperBadCode/Depth-Mask-RCNN. Accessed 20 June 2020 19. Sutton, R.S. Barto, A.G.: Reinforcement Learning: An Introduction, 2nd edn, p. 552. The MIT Press, Cambridge (2018). ISBN: 978-0-262-03924-6 20. Zanuttigh, P., Marin, G., Dal Mutto, C., Dominio, F., Minto, L., Cortelazzo, G.M.: Time-of-Flight and Structured Light Depth Cameras, pp. 1–355. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-30973-6. ISBN: 9783319309736 21. Zhang, F., Leitner, J., Milford, M., Upcroft, B., Corke, P.: Towards vision-based deep reinforcement learning for robotic motion control. arXiv: 1511.03791, Novem- ber 2015 22. Joshi, S., Kumra, S., Sahin, F.: Robotic grasping using deep reinforcement learn- ing. In: 2020 IEEE 16th International Conference on Automation Science and Engineering (CASE), pp. 1461–1466. IEEE, August 2020. https://doi.org/10.1109/ CASE48305.2020.9216986. ISBN: 978-1-7281-6904-0
    Deep RL Appliedto a Robotic Pick-and-Place Application 265 23. Rahman, M.D.M., Rashid, S.M.H., Hossain, M.M.: Implementation of Q learning and deep Q network for controlling a self balancing robot model. Robot. Biomim. 5(1), 1–6 (2018). https://doi.org/10.1186/s40638-018-0091-9. arXiv: 1807.08272. ISSN: 2197-3768 24. Hase, H., Azampour, M.F., Tirindelli, M., et al.: Ultrasound-guided robotic navi- gation with deep reinforcement learning. arXiv: 2003.13321, March 2020 25. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected con- volutional networks. Technical Report. arXiv: 1608.06993v5. https://github.com/ liuzhuang13/DenseNet 26. Torchvision.models (2019). https://pytorch.org/docs/stable/torchvision/models. html. Accessed 17 Jan 2021 27. Webots. Commercial Mobile Robot Simulation Software. Cyberbotics Ltd., Ed. https://www.cyberbotics.com 28. Ayala, A., Cruz, F., Campos, D., Rubio, R., Fernandes, B., Dazeley, R.: A comparison of humanoid robot simulators: a quantitative approach, pp. 1–10. arXiv: 2008.04627 (2020) 29. Rajeswaran, A., Kumar, V., Gupta, A., et al.: Learning complex dexterous manipulation with deep reinforcement learning and demonstrations. Techni- cal Report. arXiv: 1709.10087v2. http://sites.google.com/view/deeprl-dexterous- manipulation 30. Hawkins, K.P.: Analytic inverse kinematics for the universal robots UR-5/UR- 10 arms. Technical Report, December 2013. https://smartech.gatech.edu/handle/ 1853/50782 31. Universal Robots - Parameters for calculations of kinematics and dynamics. https://www.universal-robots.com/articles/ur/parameters-for-calculations-of- kinematics-anddynamics/. Accessed 31 Dec 2020 32. Manual Robotiq 2F-85 2F-140 for e-series universal robots, Robotic, 145 pp., November 2018 33. SmoothL1Loss – PyTorch 1.7.0 documentation. https://pytorch.org/docs/stable/ generated/torch.nn.SmoothL1Loss.html. Accessed 15 Jan 2021 34. Kingma, D.P., Ba, J.L.: Adam: a method for stochastic optimization. In: 3rd Inter- national Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings, December 2015. arXiv: 1412.6980 35. Loshchilov, I., Hutter, F.: Decoupled weight decay regularization, November 2017. arXiv: 1711.05101. http://arxiv.org/abs/1711.05101 36. De Bruin, T., Kober, J., Tuyls, K., Babuška, R.: Experience selection in deep reinforcement learning for control. J. Mach. Learn. Res. 19, 1– 56 (2018). https://doi.org/10.5555/3291125.3291134. http://jmlr.org/papers/v19/ 17-131.html. ISSN: 15337928 37. Brys, T., Harutyunyan, A., Suay, H.B., Chernova, S., Taylor, M.E., Nowé, A.: Reinforcement learning from demonstration through shaping. In: IJCAI Interna- tional Joint Conference on Artificial Intelligence, vol. 2015-January, pp. 3352–3358 (2015). ISBN: 9781577357384
Measurements with the Internet of Things
An IoT Approach for Animals Tracking

Matheus Zorawski1(B), Thadeu Brito1,4,5, José Castro2, João Paulo Castro3, Marina Castro3, and José Lima1,4

1 Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253 Bragança, Portugal
{matheuszorawski,brito,jllima}@ipb.pt
2 Instituto Politécnico de Bragança, Bragança, Portugal
mzecast@ipb.pt
3 Centro de Investigação de Montanha, Instituto Politécnico de Bragança, Campus de Santa Apolónia, Bragança, Portugal
{jpmc,marina.castro}@ipb.pt
4 INESC-TEC - INESC Technology and Science, Porto, Portugal
5 Faculty of Engineering of University of Porto, Porto, Portugal

Abstract. Pastoral activities bring several benefits to the ecosystem and to rural communities. These activities are already carried out daily with goats, cows and sheep in Portugal, yet they could be better exploited to take advantage of their benefits. Most of these pastoral ecosystem services are not remunerated, which indicates the need to make these activities more attractive and to bring returns to shepherds, breeders and landowners. Monitoring these activities provides data to value these services and can also indicate to shepherds the routes along which to drive their flocks and the respective return. There are devices on the market that perform this monitoring, but they are not adaptable to the circumstances and challenges found in the Northeast of Portugal. This work addresses a system to perform animal tracking and the development of a test platform, using long-range transmission technologies based on the LoRaWAN architecture. The results demonstrate the use of LoRaWAN for tracking services, allowing conclusions about the viability of the proposed methodology and the direction of future work.

Keywords: Animals tracking · IoT · GPS · Grazing monitoring · LoRaWAN

1 Introduction

According to breeder organizations, nearly 100,000 indigenous cows, sheep, and goats graze daily on Portugal's northeast rangelands. Pastoralism is widely considered to have a critical role in strengthening rural communities' resilience and sustainability in the face of depopulation and climate change [1]. Every day, hundreds

Supported by FEDER (Fundo Europeu de Desenvolvimento Regional).
© Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 269–280, 2021.
https://doi.org/10.1007/978-3-030-91885-9_19
of shepherds drive their flocks of 200 or fewer animals through multiple-ownership, small-patched, and intricate rangelands around the farmstead [2]. Thanks to them, natural vegetation and agricultural remnants yield excellent ecosystem services, such as providing high-quality and protein-rich food, reducing fire hazards, recycling organic matter, and strengthening rural communities' identities and sense of belonging [3,4]. Although herders are among the most impoverished people, most of these pastoral ecosystem services are not remunerated. How can shepherds, breeders, and landowners receive fair returns from such complicated structures? Due to assessment difficulties, decision-makers struggle to regulate and implement a payment system, which requires information on multiple vegetation changes, various grazing pressures, different species, flock sizes, duration, and seasons.

The IPB team addresses this challenge drawing on 30 years of joint research on traditional grazing systems in Trás-os-Montes [2]. They have demonstrated expertise in updating and providing high-precision remote sensing techniques and interpretation modeling methodologies adapted to handling large amounts of dynamic data [4,5]. As part of its innovation and demonstration activity, the IPB team runs ten experimental sites grazed by sheep, goats, cows, and donkeys, monitored by Global Navigation Satellite System (GNSS) collars. The team has learned the herds' routes and routines and can accurately describe their pressure on land and vegetation. Thus, the focus is on how different herds seasonally modify the vegetation through their particular pressures [2,4]. However, the IPB team's developments have encountered constraints in the devices and systems available on the market, which are designed for other contexts and purposes and are not aimed at the circumstances of roaming pastoralism in the Northeast of Portugal. As an example, the approach adopted by the IPB team uses collars with data transmission over the Global System for Mobile Communications (GSM) network to track these animals. Unfortunately, this technology has high energy consumption (causing low battery autonomy and a reduced data rate), requires a contract with telecommunications companies (which increases the project costs), and is not an open device that allows modifications and the addition of new tools (such as sensors) according to the intended use.

A grazing monitoring system tailored to local circumstances requires increased capabilities to record and transmit geolocations from remote sites, with high frequency, self-supplied power, and low cost. Based on continuous feedback from the stakeholders supporting the IPB team (breeders' organizations, the Ministry of Agriculture, and the National Agency for Forest Fires), this paper is part of the development of such a system: a tracking and monitoring device that can be attached to the animal and report its behavior, i.e., perform the digitalization process. It is desired to keep the history of movements of each animal so that this information can be further analyzed and used, for example, with a correlation or estimation algorithm. The GPS position, temperature and humidity are the variables currently measured, but the proposed system is open and ready to include new ones. This paper presents the architecture of an Internet of Things (IoT) approach to track animals (based on their position) and acquire environmental conditions. As a result, the
acquisition, transmission, storage and visualization of the data validate the proposed approach.

The rest of the paper is organized as follows. After this introduction, the related work is reviewed, covering similar approaches. Then, Sect. 3 presents the system description, where the data acquisition, transmission, storage and visualization are addressed. Section 4 presents the developments and results. The last section concludes the paper and points out future work directions.

2 Related Work

Tracking animals is a process that brings several advantages and has been addressed for many years. On the one hand, different groups of researchers are developing customized applications, from large-scale animals to the smallest ones [6], such as bees [7]. On the other hand, there are several commercial solutions that fit this methodology but bring disadvantages such as price, closed architecture and vendor dependency. Animal tracking is valuable in several ways, from territory planning to the understanding of predators [8]. There is also the socio-cultural and economic value of the ecosystems [3]. Another point to monitor is the movement of flocks, which imposes different grazing pressures over the landscape [4], as well as the seasonality of grazing itineraries [2]. A further justification for animal tracking is related to the equilibrium of ecological dynamics on rangelands [1]. Hence, in recent years, several technologies have been applied to animal tracking and behavior monitoring. Tracking animals over the cellular network is expensive (it is necessary to contract a company to support the communications) and, in some cases, impractical, because the high energy consumption results in recurrent maintenance and a dependency on network coverage.

The fourth industrial revolution brought methodologies that can be used for this purpose. One of them is the IoT, the Internet of Things. The authors of [5] present an animal behavior monitoring platform based on IoT technologies that includes an IoT local network to gather data from animals in order to autonomously shepherd ovines within vineyard areas. In this case, the data is stored on a cloud platform with processing capability based on a machine learning algorithm that allows relevant information to be extracted. Regarding network connections, IoT also brought new approaches of low power networks to be applied in this field: examples are LoRa and Sigfox, among others. The authors of [9] propose a novel design of a mesh network formed by LoRa-based animal collars or tags for tracking location and monitoring other factors via sensors over very wide areas. Different communication technologies can be found in [10], where collars are connected to a Low Power Wide Area (LPWA) Sigfox network and low-cost Bluetooth Low Energy (BLE) tags are connected to those collars. On the commercial side, a large number of solutions that may fit animal tracking has appeared. Some examples are milsar™ [11] and digitanimal™ [12]. There are also organizations that freely share databases of worldwide
animal tracking data, such as movebank™ [13], hosted by the Max Planck Institute of Animal Behavior, where researchers can manage, share, protect, analyze and archive animal data.

Based on this study, the present paper describes a developed solution to track animals while monitoring their behavior. The data is collected and stored in InfluxDB, where adaptive algorithms can be used to extract precise information. Developing a customized application has the advantage of including the characteristics that the team desires. As a modular approach, other sensors can be added to the system.

3 System Description

Since the main objective of this work is the monitoring of pastoral activities, the basic architecture of the system can be simplified as shown in Fig. 1, in which each of the steps (first, the acquisition of coordinates in the monitoring module; second, the transmission of these data; and third, the visualization on a platform) can be approached in several ways.

Fig. 1. System description.

Before defining which method and tools will be used to achieve the desired system, one must verify the main needs of the project and which architecture should be used. As this project aims to monitor automatically (not depending on the action of the shepherd) and with a real-time display of data on multiple devices, it was decided to use a collar as the monitoring device and to connect the signal receiver to the Internet, so that storage and display are not exclusive to the local node. Thus, Fig. 2 describes how the system architecture should be, incorporating the requirements of this project.

With the system requirements defined, the following points must take into consideration the feasibility and the restrictions of this architecture in order to choose which method and strategy will be adopted and, consequently, which tools will be used. As the Open2 project works with animal monitoring, it is essential to have a wireless device, which implies the use of batteries and, consequently, low power consumption. Another critical point for the project is the low cost of the entire system, seeking to build the tools in-house and use free applications.
Fig. 2. Animals tracking system description.

Considering these requirements, an alternative technology is LoRa (Long Range), a long-range, low-power wireless modulation technique that is ideal for applications with small data transfers that need a greater range than WiFi, Bluetooth or ZigBee. With LoRa modulation it is possible to create an LPWAN, in which the LoRaWAN protocol (a protocol for accessing cloud-based services) manages the communication between the LoRa devices and the LoRaWAN gateway, which is connected by cable to the Internet. This establishes near real-time communication between End Node devices and applications connected to the Internet (delayed only by the airtime of the message, which is normally less than 100 ms), as depicted by the architecture in Fig. 3.

Fig. 3. LoRaWAN architecture.
With the LoRaWAN architecture, the data that arrives at the gateway must pass through a server in order to be sent to the applications; in this project, The Things Network (TTN) was used. This free and open-source LoRaWAN network provides worldwide coverage, only requiring compliance with TTN's fair use policy. The architecture used in this project is detailed below in two parts, the first presenting the LoRa device (data acquisition and transmission) and the second the data application (data storage and visualization).

3.1 Data Acquisition and Transmission

The elements responsible for the initial stages of a LoRaWAN network are the LoRa devices, called End Nodes, as described in Fig. 3. The End Nodes send the monitored data, usually sensor values, to the gateway. As the objective of this project was the wireless monitoring of the End Node coordinates (which represents the animal tracking), the architecture depicted in Fig. 4 was adopted.

Fig. 4. Data acquisition and transmission system.

The End Device thus required as main elements: a wireless power supply system, a microcontroller, a LoRa transceiver module, a battery level reader, and the sensor that acquires parameters from the real world. As this project was developed in a research centre, it was possible to use a device with these characteristics that was available from another project (the SAFe project). In addition, other points considered at this stage were battery saving and safety, developing code to turn off or put into sleep mode components that were not being used and limiting operation to a minimum battery level.
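To make the transmission step concrete, the sketch below shows one generic way an End Node could pack a GPS fix and the auxiliary readings into a compact binary payload before handing it to the LoRa radio. The field order, scaling factors and 12-byte layout are illustrative assumptions, not the exact format used by the SAFe hardware.

```python
import struct

def build_payload(lat: float, lon: float, temp_c: float, hum_pct: float, batt_v: float) -> bytes:
    """Pack a GPS fix plus sensor readings into 12 bytes:
    lat/lon as signed 32-bit integers in 1e-5 degrees, the rest as scaled 8/16-bit integers."""
    return struct.pack(
        ">iiBBh",
        int(lat * 1e5),          # latitude, ~1 m resolution
        int(lon * 1e5),          # longitude
        int(temp_c + 50),        # temperature with a +50 °C offset, 1 °C steps
        int(hum_pct),            # relative humidity, percent
        int(batt_v * 1000),      # battery voltage in millivolts
    )

payload = build_payload(41.797, -6.768, 21.4, 55.0, 3.71)
assert len(payload) == 12
```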
3.2 Data Storage and Visualization

With the data properly transmitted to the gateway and to a server (TTN), the payload can be used by the applications. Node-RED was used to facilitate data visualization and storage, as depicted in Fig. 5.

Fig. 5. Data storage and visualization system.

Node-RED facilitates the use of the tools existing on the platform, such as local or database storage of the received data and data visualization, among others. In this project, storage in .log format, InfluxDB and map trajectory visualization (mapper) were used. Grafana helps in the visualization of data stored in a database, representing these data in several graph formats or even on a map. In this way, Grafana can support data visualization for more than one device, for example by creating multiple visualizations on the platform to see the data from many devices.
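In this work the integration is done graphically in Node-RED; purely for illustration, a roughly equivalent Python script is sketched below, subscribing to device uplinks over MQTT and writing the decoded position to InfluxDB. The broker address, topic pattern, credentials, database name and JSON field names are placeholders, and the TTN topic/JSON layout differs between TTN versions.

```python
import json
import paho.mqtt.client as mqtt
from influxdb import InfluxDBClient

influx = InfluxDBClient(host="localhost", port=8086, database="tracking")  # placeholder database

def on_message(client, userdata, msg):
    uplink = json.loads(msg.payload)
    fields = uplink.get("payload_fields", {})            # assumed to be decoded by a TTN payload formatter
    point = {
        "measurement": "gps",
        "tags": {"device": uplink.get("dev_id", "unknown")},
        "fields": {"lat": fields.get("lat"), "lon": fields.get("lon")},
    }
    influx.write_points([point])

client = mqtt.Client()
client.username_pw_set("app-id", "app-access-key")       # placeholder credentials
client.on_message = on_message
client.connect("eu.thethings.network", 1883)             # placeholder broker address
client.subscribe("+/devices/+/up")
client.loop_forever()
```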
4 Development and Results

The initial tests in this work were performed to verify the system components (the GPS and the LoRa device) and later to interconnect both, applying the LoRa transmission to send the GPS data to the gateway. Figure 6 shows the components used in this step: an Arduino UNO, an Arduino shield (containing the GPS module and the LoRa RFM95 device) and the gateway used to receive the data from the node.

Fig. 6. Components used for the initial tests.

With the LoRa device and the gateway registered in TTN, the GPS data was imported into Node-RED to provide a more straightforward connection between the device and its applications. Figure 7 shows the flow created in Node-RED to track the LoRa device in real time, show the path travelled, and store the data locally in a .log file and in the InfluxDB database. This flow can be applied to multiple devices, allowing the storage and display of all registered TTN devices.

Fig. 7. Flow used to track the LoRa device.

After implementing the Node-RED flow and verifying the system's operation, in order to better acquire the coordinates and trajectories, the End Node was replaced by a device capable of wireless data transmission. For this, the Printed Circuit Board (PCB) of the SAFe project [14–17] was used (shown in Fig. 8), which already contained part of the architecture presented in Fig. 4, with a DHT sensor module, powered by a battery and protected by a 3D-printed box, only needing the addition of the GPS module. The temperature and humidity sensor (DHT11) was used when sending the payload, but it has not been applied in the following visualization steps.
Fig. 8. PCB adapted from the SAFe project and 3D printed box.

With this End Node device, the tracking tests were performed, sending the signal every three minutes, in the 863–870 MHz frequency band and as a Class A LoRaWAN device. This allowed an autonomy of two weeks, thanks to the GPS sleep mode, which only wakes up the module when a position update is needed. Display results were obtained using Node-RED's own World Map, as shown in Fig. 9, implemented to display the trajectory and the last received point, which made it possible to validate the data obtained by the End Device.

Fig. 9. Node-RED's world map displaying device tracking.
Using InfluxDB, a database was created in order to store, through Node-RED, the received device position data, in addition to metadata including the time the data was received and other information related to the LoRaWAN signal. Grafana was used as a second platform to test other ways of exploiting the data acquired by the End Node; it imported data from the database, representing the stored points together with the quality of the LoRa signal captured at each point (Fig. 10).

Fig. 10. Grafana's world map displaying the geolocation points acquired.

Tests on Grafana to verify the application's capabilities for interactive data visualization made it possible to verify tools already implemented on the platform; in addition to the World Map visualization, it is also possible to display data according to stipulated schedules. Thus, the system has only hardware costs, the gateway being the most expensive item, and each End Node is a low-cost product compared to other devices available on the market.

5 Conclusion and Future Work

A large number of cows, goats and sheep perform grazing activities every day in Portugal, which brings several benefits to society and the ecosystem. This work proposes a way to implement a grazing monitoring system that fits the requirements (wireless device, storage and transmission of geolocation, high data transmission rate, and low price) to support the IPB team in their research on the fair return of pastoral services. It is also open to the inclusion of new functionalities for geolocation and environmental conditions. The project went beyond the stages of research and proposing solutions to the requirements of this work, developing and testing prototypes to validate the proposed methodology.
The tools adopted to visualize the test data met the intended objectives and showed the potential to be further exploited to improve the application in future projects.

With the tests of the prototype and the applications, several points were identified where the system can be developed in future work. One possible development would be the implementation of a "Server Machine" to host the tools used (backup in .log and port control for Node-RED, InfluxDB and Grafana), using a Raspberry Pi or a virtual machine. Regarding application improvements, other functions could be added to the dashboard to encompass more functionalities, such as displaying battery percentage, temperature and humidity, and selecting the data to be displayed, among other features. It would also be useful for the application to include the grazing value according to the Agrarian Plot Identification System (SIGPAC), linking it with the monitored points on the map, and a method to send this information in real time to the shepherd. To help ensure that messages are delivered, a way to map the areas with LoRaWAN signal coverage could be implemented.

Also, data storage on the device enables confirmation of the data sent, allowing data to be transmitted again at animal rest times to ensure a good signal, or to be collected locally after a period of device operation. Another improvement would be payload optimisation: a dynamic way to send GPS data, sending the complete GPS data only at certain intervals and reduced data in between, which would allow a higher data transmission rate. The autonomy of the system could be improved by implementing a separate power supply for the GPS so that, while the GPS is in sleep mode, the rest of the system can be turned off, or the whole system can be turned off and turned on again ahead of time to send data and obtain a GPS fix. In addition, a solar charging system would reduce the physical interaction with the devices needed to charge the batteries.

References

1. Roe, E., Huntsinger, L., Labnow, K.: High reliability pastoralism. J. Arid Environ. 39(1), 39–55 (1998)
2. Castro, M., Castro, J., Gómez Sal, A.: L'utilisation du territoire par les petits ruminants dans la région de montagne de trás-os-montes, au portugal. Options Méditerranéennes. Série A, Séminaires Méditerranéens, no. 61, 249–254 (2004)
3. Bernues, A., Rodríguez-Ortega, T., Ripoll-Bosch, R., Alfnes, F.: Socio-cultural and economic valuation of ecosystem services provided by mediterranean mountain agroecosystems. PloS One 9(7), e102479 (2014)
4. Castro, M., Ameray, A., Castro, J.P.: A new approach to quantify grazing pressure under mediterranean pastoral systems using GIS and remote sensing. Int. J. Remote Sens. 41(14), 5371–5387 (2020)
5. Nóbrega, L., Tavares, A., Cardoso, A., Gonçalves, P.: Animal monitoring based on IoT technologies. In: 2018 IoT Vertical and Topical Summit on Agriculture - Tuscany (IOT Tuscany), pp. 1–5. IEEE (2018)
6. Guide to animal tracking. https://outdooraction.princeton.edu/nature/guide-animal-tracking. Accessed 15 May 2021
7. Bozek, K., Hebert, L., Portugal, Y., Stephens, G.J.: Markerless tracking of an entire honey bee colony. Nat. Commun. 12(1), 1–13 (2021)
    280 M. Zorawskiet al. 8. Ryan, P., Petersen, S., Peters, G., Grémillet, D.: GPS tracking a marine predator: the effects of precision, resolution and sampling rate on foraging tracks of African penguins. Mar. Biol. 145(2), 215–223 (2004) 9. Panicker, J.G., Azman, M., Kashyap, R.: A LoRa wireless mesh network for wide- area animal tracking. In: 2019 IEEE International Conference on Electrical, Com- puter and Communication Technologies (ICECCT), pp. 1–5. IEEE (2019) 10. Maroto-Molina, F., et al.: A low-cost IoT-based system to monitor the location of a whole herd. Sensors 19(10), 2298 (2019) 11. Milsar. https://milsar.com/. Accessed 15 May 2021 12. Digital animal. https://digitanimal.pt/. Accessed 15 May 2021 13. Movebank. https://www.movebank.org/cms/movebank-main. Accessed 15 May 2021 14. Brito, T., Pereira, A.I., Lima, J., Castro, J.P., Valente, A.: Optimal sensors posi- tioning to detect forest fire ignitions. In: Proceedings of the 9th International Con- ference on Operations Research and Enterprise Systems, pp. 411–418 (2020) 15. Brito, T., Pereira, A.I., Lima, J., Valente, A.: Wireless sensor network for ignitions detection: an IoT approach. Electronics 9(6), 893 (2020) 16. Azevedo, B.F., Brito, T., Lima, J., Pereira, A.I.: Optimum sensors allocation for a forest fires monitoring system. Forests 12(4), 453 (2021) 17. Brito, T., Azevedo, B.F., Valente, A., Pereira, A.I., Lima, J., Costa, P.: Environ- ment monitoring modules with fire detection capability based on IoT methodol- ogy. In: Paiva, S., Lopes, S.I., Zitouni, R., Gupta, N., Lopes, S.F., Yonezawa, T. (eds.) SmartCity360◦ 2020. LNICST, vol. 372, pp. 211–227. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-76063-2 16
Optimizing Data Transmission in a Wireless Sensor Network Based on LoRaWAN Protocol

Thadeu Brito1,2,3, Matheus Zorawski1, João Mendes1(B), Beatriz Flamia Azevedo1,4, Ana I. Pereira1,4, José Lima1,3, and Paulo Costa2,3

1 Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253 Bragança, Portugal
{brito,matheuszorawski,joao.cmendes,beatrizflamia,apereira,jllima}@ipb.pt
2 Faculty of Engineering of University of Porto, Porto, Portugal
3 INESC TEC - INESC Technology and Science, Porto, Portugal
4 Algoritmi Research Centre, University of Minho, Campus de Gualtar, Braga, Portugal

Abstract. The Internet of Things (IoT) is a promising methodology that has been growing over the last years. It can be used to connect devices and systems and exchange data between them over the Internet. One of the IoT connection protocols is LoRaWAN, which has several advantages but offers low bandwidth and limited data transfer, so there is a need to optimise the data transfer between devices. Some sensors have a 10- or 12-bit resolution, while LoRaWAN payloads are organized in 8-bit slots, leaving unused bits in one or more transmission slots. This paper addresses a communication optimisation for wireless sensors resorting to encoding and decoding procedures. The approach is applied and validated in the real scenario of a wildfire detection system.

Keywords: Internet of Things · LoRaWAN · Wireless sensor network · Fire detection · Transmission optimisation

1 Introduction

The project "SAFe: Forest Monitoring and Alert System" proposes an intelligent system for monitoring situations of potential forest risk. This system combines sensor nodes that collect various parameters, namely temperature, humidity and data from infrared sensors that identify the presence of flame. This collection of information, combined with a system based on artificial intelligence and other collected data (such as weather forecasts), will allow an efficient and intelligent data analysis, enabling the creation of alerts of dangerous situations for the different actors (for example, firefighters, civil protection or city

© Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 281–293, 2021.
https://doi.org/10.1007/978-3-030-91885-9_20
council). These alerts will be parameterized and presented in a personalised way, tailored to each actor. Thus, the intention is to minimise the occurrence of ignitions, monitor fauna and flora, and contribute to the environmental development of the region of Trás-os-Montes, particularly in the district of Bragança. In addition to the capacity for early detection of forest fires, the accurate estimation of the fire hazard is of the utmost importance, using real-time fire risk indices on a sub-daily and local scale, based on meteorological data, fuel availability and vegetation moisture content.

SAFe's implementation is localised to the Serra da Nogueira area, belonging to the Natura 2000 Network, with unique characteristics in the Bragança region and an extension of approximately 13 km. Considering the size of the region to be monitored, data transmission is carried out with LoRaWAN, which is based on a low power wide area network and will guarantee full coverage of the area in question [1–4]. This type of communication is increasingly used, and it is estimated that by 2024 the IoT industry will generate a revenue of 4.3 trillion dollars [5]. LoRaWAN networks, whose adoption keeps rising, are composed of end devices, gateways and a network server. The data is acquired by the modules (end devices), which send it to the gateway; the gateway in turn relays the messages to the servers through non-LoRaWAN networks, such as Ethernet or IP over cellular [6].

The LoRaWAN communication itself uses Chirp Spread Spectrum (CSS) modulation, in which chirp pulses modulate the signal. It uses an unlicensed sub-gigahertz ISM band that varies depending on the region of application; for Europe, it is 863 MHz to 870 MHz [7]. Within the communication settings, three parameters allow the network to trade off bit rate against robustness of the transmission: Bandwidth (BW), Spreading Factor (SF), and Coding Rate (CR) [8]. The format of the Long Range (LoRa) message is subdivided into several layers. The physical layer is composed of five fields: the preamble, the physical header (PHDR), the physical header CRC (PHDR CRC), the physical payload (PHY Payload) and an error detection tail (CRC). The Medium Access Control (MAC) layer, in turn, is a subdivision of the PHY Payload field and consists of three fields: the MAC Header (MHDR), the MAC Payload and the Message Integrity Code (MIC). Within the MAC Payload is the message itself, which consists of three fields: the Frame Header (FHDR), the port and the frame payload (FRM Payload) [9].

At this stage of the SAFe project, while the testing period is still underway in an urban environment, the configuration being used is a 125 kHz bandwidth, SF7 and a coding rate of 4/5, which allows the transmission of about 242 bytes. However, when the project passes the test phase and moves to a rural environment, it will be necessary to cover a larger area, which will probably require a larger SF to improve the communication range, thus reducing the transmission capacity. Using SF12 and maintaining the BW and CR, the LoRaWAN communication will allow the transmission of about 51 bytes. Therefore, there is a need to optimise the use of every bit of the packet as much as possible and not lose data essential for the project's realisation. However, some sensors have a 10- or 12-bit resolution while LoRaWAN payloads are organized in 8-bit slots, which means that, in some cases, transmission slots are left with unused bits.
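As a generic illustration of what such bit-level packing can save (not the SAFe project's actual encoding, which is presented in the following sections), the sketch below packs two 12-bit readings into 3 bytes instead of the 4 bytes they would occupy if each were sent in its own pair of 8-bit slots.

```python
def pack_two_12bit(a: int, b: int) -> bytes:
    """Pack two 12-bit values (0..4095) into 3 bytes instead of 4."""
    assert 0 <= a < 4096 and 0 <= b < 4096
    combined = (a << 12) | b                  # 24 bits in total
    return combined.to_bytes(3, "big")

def unpack_two_12bit(payload: bytes) -> tuple[int, int]:
    """Inverse operation, performed on the application side (decoding)."""
    combined = int.from_bytes(payload, "big")
    return (combined >> 12) & 0xFFF, combined & 0xFFF

packed = pack_two_12bit(2947, 103)            # e.g. two raw ADC readings
assert unpack_two_12bit(packed) == (2947, 103)
assert len(packed) == 3
```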
In this way, this paper addresses a communication optimisation for Wireless Sensor Networks (WSN) resorting to encoding and decoding procedures.

The rest of the paper is organized as follows. After the introduction, Sect. 2 presents the related work. In Sect. 3, the system architecture is addressed and the focus of this paper is highlighted. Section 4 presents the algorithm used to optimise the data sent through the network and Sect. 5 discusses the results of the proposed encoding/decoding. The last section concludes the paper and points out future work.

2 Related Work

Real-time data collection and analysis gained new possibilities with the emergence of the Internet of Things (IoT). Typically, IoT devices are manufactured to use short-range protocols, such as Bluetooth, WiFi and ZigBee, among others, which may have limitations, such as the maximum number of devices connected to the gateway and high power consumption. On the other hand, mobile technologies that have an extensive range (LTE, 3G and 4G) are expensive, consume a lot of power and have a continuing cost [10]. The gap left by these technologies has been filled in part by Low Power Wide Area Network (LPWAN) technologies, which cannot cover critical applications (e.g. remote health monitoring) but fit very well the requirements of IoT applications. Although LPWAN systems are limited by low data rates, there are good reasons for their use, especially after the emergence of technologies such as SigFox, NB-IoT and LoRa [10,11].

The LoRa network is widely used in applications requiring low power consumption and long range, for which the LoRa Alliance proposed LoRaWAN, defining the network protocol in the MAC and network layers [10,12]. Different scenarios of typical LoRa network applications are pointed out in [12], which also provides documentation on LoRa network design and implementation, searching for a better cost-benefit ratio for LPWAN applications. Also, Baiji [13] showed in 2019 an application using LoRaWAN in an IoT monitoring system aimed at minimising the non-productive time of Oil and Gas companies through the detection of leaks.

Research is being carried out seeking improvements to the LoRaWAN system, much of it aiming at the optimisation of data rate, airtime and energy consumption through an Adaptive Data Rate (ADR), as in [14], where a review of research on ADR algorithms for LoRaWAN technology is presented. Also, in [15] a method for the use of ADR is implemented, seeking to provide optimal throughput with multiple devices connected to LoRaWAN. Furthermore, a study looking to maximise the battery lifetime across different LoRaWAN nodes and scenarios was carried out in [16], demonstrating the relationship between different payload sizes and the power
consumption and air time using various SF and CR, indicating an increase in the power consumed per bit of payload used. As cited in [11], the advantages of the LoRa network (long transmission range, low power consumption, and enabling low-cost IoT applications) come with the downside of a low data throughput rate.

The LoRaWAN data transfer must respect a protocol, and it is possible to send the payload in several forms, as long as it is within the limit on the number of Bytes. According to [9], the LoRa message format is composed of several fields that have specific or limited sizes. Therefore, it is necessary to optimise the use of the payload bytes as much as possible. Some techniques have been used over the years, such as selecting the best packet size by adjusting the bit rate to achieve maximum throughput, as demonstrated by the authors of [17] to optimise payload sizes in voice and video applications. Still within this line, the authors of [18] used throughput as the optimisation metric and based their study on wireless ATM networks, maximising transfer efficiency by adapting the packet size to the variation of the channel conditions. Also, Eytan Modiano in [19] presented a system to dynamically optimise the packet size based on estimates of the channel's bit error rate, which is particularly useful for wireless and satellite channels. The importance of packet size optimisation in WSN is debated in [20], which presents several packet size optimisation techniques and concludes that there is no agreement on whether the packet should be fixed or dynamic. Article [21] defends a packet size optimisation technique for WSN, proposing an energy-efficiency evaluation metric and adopting a fixed packet size for the case studied. Furthermore, the authors of [22] present a variation of the packet size depending on the network conditions to increase the network throughput and efficiency.

The payload method recommended by The Things Network (TTN) when using LoRaWAN, described in [23], only specifies how to send each type of data. When the user needs to use the minimum number of Bytes possible, for a shorter transmission time and to stay within the 51-Byte limit, one way to implement data encoding for an LPWAN network such as LoRaWAN is the Cayenne Low Power Payload (LPP), which allows the payload to be sent dynamically, according to the LoRa frequency and data rate used [24]. Nevertheless, besides the various optimisations mentioned above, it remains crucial to find a way to optimise the data to be transferred in order to make better use of the LoRa network.

3 System Architecture

The main goal of the presented system is to monitor a given area at a reduced application cost. In this way, it is expected to obtain constant surveillance capable of identifying any ignition that may occur. This surveillance is achieved using the various sensors present in the modules (end devices). Each module consists of six sensors producing three distinct value types: five flame sensors and a sensor that provides data on relative humidity and air temperature. The
entire operation of the SAFe project relies on the collection of data from the modules deployed in the forest; this data collection must not fail. Consequently, the proper functioning of data transmission must be guaranteed because, if the data do not reach the competent entities, they are of no use. This transmission is handled using LoRaWAN technology, which allows several modules to be connected to different gateways, thus guaranteeing low-cost and efficient coverage of the area. Figure 1 exemplifies the whole system architecture of the SAFe project.

Fig. 1. System architecture of SAFe project [2].

Since each of the elements in Fig. 1 is defined and developed separately, this work concentrates only on the LoRaWAN transmission and the proposed bit-arrangement protocol. Other SAFe approaches can be seen in [1–4]. Considering the high number of modules (end devices) used in this project, it is necessary to ensure that their messages do not collide; the full description of the modules is given in [2,4]. Thereupon, it is necessary to optimise the size of the data package sent, thus guaranteeing the minimum necessary occupation of bytes to reduce the chances of these collisions and the consequent data losses. The normal sending process (recommended by the TTN service) is exemplified below; for this, the information from the six sensors present in the modules, as well as the battery information, is used, resulting in:
– Flame sensors: as mentioned before, each node has five flame sensors. These sensors produce 10-bit readings, which means they generate values between 0 and 1023. Sending a single value from one of them requires 2-Bytes (16 bits); consequently, sending all of them requires a total of 10-Bytes.
– Relative humidity sensor: this sensor produces a reading between 0 and 99, but 2-Bytes are required to transmit it.
– Temperature sensor: gives a reading between 0 and 50 ◦C, which also results in the use of 2-Bytes to transmit.
– Battery level: the battery level is read from a 10-bit ADC, and a total of 2-Bytes is required to transmit it.

Analysing the sending process recommended by the TTN service, it is possible to observe that the total package has 16-Bytes per data-sending interval (Fig. 2). However, the messages resulting from all sensors contain unused bits, because 16-Bytes correspond to 128 bits while the data produced by all sensors in each module total 73 bits. Thus, this work aims to optimise the use of these unfilled bits. Using the example of the flame sensors, it is possible to notice that the 2-Bytes sent correspond to a 16-bit message, of which only 10 bits are used to transmit the value in the range 0 to 1023. In this case, 6 bits are left that can be filled with data from another sensor's value. This example can be seen by looking at B0 and B1 in Fig. 2.

Fig. 2. 16-Bytes (B) of data necessary to send. The unused bits (b) are red coloured. Where i ∈ Z | 0 ≤ i ≤ 4.

This methodology of taking advantage of the unused bits will be explained in the following section in more detail.
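For reference, a minimal sketch in C of the straightforward TTN-style packing described above (variable names are illustrative and not taken from the paper) shows how every value ends up in its own 16-bit word, yielding the 16-Byte packet of Fig. 2:

#include <stdint.h>
#include <string.h>

/* Straightforward packing: every reading is stored in its own 16-bit word,
 * so 5 flame values + humidity + temperature + battery = 8 words = 16 Bytes. */
static size_t pack_naive(const uint16_t flame[5], uint8_t rh, uint8_t temp,
                         uint16_t batt, uint8_t out[16])
{
    uint16_t words[8];
    for (int i = 0; i < 5; i++)
        words[i] = flame[i] & 0x03FF;   /* 10-bit flame readings           */
    words[5] = rh;                      /* 0..99, padded to 16 bits        */
    words[6] = temp;                    /* 0..50 degC, padded to 16 bits   */
    words[7] = batt & 0x03FF;           /* 10-bit ADC battery level        */

    for (int i = 0; i < 8; i++) {       /* serialise big-endian into buffer */
        out[2 * i]     = (uint8_t)(words[i] >> 8);
        out[2 * i + 1] = (uint8_t)(words[i] & 0xFF);
    }
    return 16;                          /* Bytes occupied on air            */
}

Only 73 of those 128 bits carry information, which is precisely the waste the encoding of the next section removes.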
4 Algorithm

From the previous section, it is possible to notice that, for each flame sensor data word (S[i]), there are some unused bits (x) available to carry additional data. In this approach, the humidity sensor (RH) and the temperature sensor (T) are each stored in one Byte, and the battery level (B) is stored with a 10-bit resolution. Table 1 presents the transmission package without the optimised encoding procedure (using the recommendation in [23]).

Table 1. Transmission without the optimised encoding. S[i] are the flame sensors; RH, T and B are the relative humidity, temperature and battery voltage values, respectively. The 'x' are the unused bits (bit 15 on the left, bit 0 on the right).

S[0]: x x x x x x s9[0] s8[0] ... s0[0]
S[1]: x x x x x x s9[1] s8[1] ... s0[1]
S[2]: x x x x x x s9[2] s8[2] ... s0[2]
S[3]: x x x x x x s9[3] s8[3] ... s0[3]
S[4]: x x x x x x s9[4] s8[4] ... s0[4]
RH:   x x x x x x x x RH7 ... RH0
T:    x x x x x x x x T7 ... T0
B:    x x x x x x B9 B8 ... B0

This approach intends to arrange the bits so that the protocol used to transmit the data over LoRaWAN uses as few Bytes as possible. The main idea is to distribute the bits of each sensor during the encoding process in such a way that the minimum number of Bytes is needed.

In order to use the unused bits (b15, b14, b13, b12, b11 and b10, see Fig. 2) of each sensor word S[i] (where i ∈ Z | 0 ≤ i ≤ 4), the words (2-Bytes) of humidity (RH), temperature (T) and battery (B) are split and combined with the unused bits in the encoding procedure. A bit manipulation operation is required to fill the previously unused bits. Table 2 shows the previous 'x' positions now carrying bits from the other sensors. Moreover, even after including the RH, T and B values, 4 bits remain unused; they are used to hold four auxiliary bits (AUX) for future applications.
Table 2. Proposed encoding map (bit 15 on the left, bit 0 on the right). The RH, T and B words of Table 1 are no longer transmitted, since their bits are now carried in the upper bits of S[0]–S[4].

S[0]: T5 T4 T3 T2 T1 T0 s9[0] s8[0] ... s0[0]
S[1]: RH3 RH2 RH1 RH0 T7 T6 s9[1] s8[1] ... s0[1]
S[2]: B1 B0 RH7 RH6 RH5 RH4 s9[2] s8[2] ... s0[2]
S[3]: B7 B6 B5 B4 B3 B2 s9[3] s8[3] ... s0[3]
S[4]: AUX3 AUX2 AUX1 AUX0 B9 B8 s9[4] s8[4] ... s0[4]

This encoding operation is carried out by an embedded system programmed in the C language. The proposed encoding code is detailed in Listing 1.1. The message travels through the network and is decoded by Listing 1.2.

Listing 1.1. Data encoder

/* Pack T, H, B and AUX into the upper 6 bits of each flame word S[i],
 * assuming the 10-bit flame readings leave bits 15..10 at zero. */
S[0] = S[0] + ((T   & 0x3F)  << 10);   /* T5..T0     -> S[0] bits 15..10 */
S[1] = S[1] + ((T   & 0xC0)  << 4)     /* T7,T6      -> S[1] bits 11,10  */
            + ((H   & 0x0F)  << 12);   /* RH3..RH0   -> S[1] bits 15..12 */
S[2] = S[2] + ((H   & 0xF0)  << 6)     /* RH7..RH4   -> S[2] bits 13..10 */
            + ((B   & 0x003) << 14);   /* B1,B0      -> S[2] bits 15,14  */
S[3] = S[3] + ((B   & 0x0FC) << 8);    /* B7..B2     -> S[3] bits 15..10 */
S[4] = S[4] + ((B   & 0x300) << 2)     /* B9,B8      -> S[4] bits 11,10  */
            + ((AUX & 0x0F)  << 12);   /* AUX3..AUX0 -> S[4] bits 15..12 */
/* TRANSMIT array S */

The opposite decoding operation allows the flame values, humidity, temperature and battery voltage to be extracted from the message. The C code of the decoding procedure can be found in Listing 1.2.

Listing 1.2. Data decoder

/* RECEIVE array S */
T   = ((S[0] & 0xFC00) >> 10) + ((S[1] & 0x0C00) >> 4);   /* 8-bit temperature */
H   = ((S[1] & 0xF000) >> 12) + ((S[2] & 0x3C00) >> 6);   /* 8-bit humidity    */
B   = ((S[2] & 0xC000) >> 14) + ((S[3] & 0xFC00) >> 8)
    + ((S[4] & 0x0C00) >> 2);                             /* 10-bit battery    */
AUX = (S[4] & 0xF000) >> 12;                              /* 4 auxiliary bits  */
/* Clear the borrowed bits to recover the original 10-bit flame readings */
S[0] = S[0] & 0x03FF;
S[1] = S[1] & 0x03FF;
S[2] = S[2] & 0x03FF;
S[3] = S[3] & 0x03FF;
S[4] = S[4] & 0x03FF;
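A minimal, self-contained round-trip check of the two listings (the sample values are hypothetical; the variable names follow the listings) can be used to confirm that the encoding is lossless:

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint16_t S[5] = {1023, 512, 7, 300, 64};       /* 10-bit flame readings */
    uint16_t T = 37, H = 85, B = 789, AUX = 0xA;   /* sample sensor values  */
    const uint16_t S0[5] = {1023, 512, 7, 300, 64};

    /* encode (Listing 1.1) */
    S[0] += (uint16_t)((T & 0x3F) << 10);
    S[1] += (uint16_t)(((T & 0xC0) << 4) + ((H & 0x0F) << 12));
    S[2] += (uint16_t)(((H & 0xF0) << 6) + ((B & 0x003) << 14));
    S[3] += (uint16_t)((B & 0x0FC) << 8);
    S[4] += (uint16_t)(((B & 0x300) << 2) + ((AUX & 0x0F) << 12));

    /* decode (Listing 1.2) */
    uint16_t t = ((S[0] & 0xFC00) >> 10) + ((S[1] & 0x0C00) >> 4);
    uint16_t h = ((S[1] & 0xF000) >> 12) + ((S[2] & 0x3C00) >> 6);
    uint16_t b = ((S[2] & 0xC000) >> 14) + ((S[3] & 0xFC00) >> 8)
               + ((S[4] & 0x0C00) >> 2);
    uint16_t aux = (S[4] & 0xF000) >> 12;
    for (int i = 0; i < 5; i++) S[i] &= 0x03FF;

    assert(t == 37 && h == 85 && b == 789 && aux == 0xA);
    for (int i = 0; i < 5; i++) assert(S[i] == S0[i]);
    puts("round-trip OK: 10 Bytes carry all readings");
    return 0;
}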
This optimised message can be sent over the LoRaWAN network and, at the destination, a decoding procedure reconstructs the original message. Figure 3 presents the encoding, transmission and decoding procedures. As can be seen, the packet to be sent keeps the size produced by the flame sensors S[i] (where i ∈ Z | 0 ≤ i ≤ 4):

Fig. 3. Encoding and decoding procedures. The size of S[i] is 16 bits, whereas T and H are 8 bits. AUX is an auxiliary variable created with the 4 bits that remain free.

In this way, a reduction of the transmission from 8 words (16-Bytes) to 5 words (10-Bytes) is obtained. The following section demonstrates the difference between the procedure recommended by TTN and this approach.

5 Results

To demonstrate the difference during transmission over the LoRaWAN network with the algorithm proposed in this work, a comparison test was performed against the method recommended by TTN. Therefore, five sensor modules were configured for each transmission method: five nodes with the TTN transmission technique and another five nodes with the algorithm shown in the previous section. Figure 4a shows the ten modules distributed on the laboratory bench. They are configured according to the mentioned algorithms and simulate a small WSN. All of these modules are configured to transmit at specific frequencies within the range in which LoRaWAN operates. In this way, it is possible to probe the behaviour of the duty cycle in the settings of the gateway used. The gateway used is a RAK 7249, which is attached to the laboratory's roof, as shown in Fig. 4b.
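The duty-cycle difference reported below follows directly from each packet's time on air. As a rough, illustrative aid (not part of the paper's method), the sketch below applies the LoRa time-on-air formula from Semtech's SX127x documentation to compare a 16-Byte and a 10-Byte payload; the preamble length and header settings are assumptions.

#include <math.h>
#include <stdio.h>

/* LoRa time-on-air (seconds) for an explicit-header uplink with payload CRC,
 * following the SX127x datasheet formula. cr is 1..4 for coding rates 4/5..4/8,
 * lowdr is 1 when low data rate optimisation is on (e.g. SF11/SF12 at 125 kHz). */
static double lora_toa(int payload_bytes, int sf, double bw_hz, int cr,
                       int preamble_syms, int lowdr)
{
    double tsym = pow(2.0, sf) / bw_hz;                       /* symbol duration */
    double tpre = (preamble_syms + 4.25) * tsym;              /* preamble time   */
    double num  = 8.0 * payload_bytes - 4.0 * sf + 28 + 16;   /* CRC on, IH = 0  */
    double nsym = 8 + fmax(ceil(num / (4.0 * (sf - 2 * lowdr))) * (cr + 4), 0.0);
    return tpre + nsym * tsym;
}

int main(void)
{
    /* assumed settings: 8-symbol preamble, CR 4/5, 125 kHz bandwidth */
    printf("16 B @ SF7 : %.1f ms\n", 1e3 * lora_toa(16, 7, 125e3, 1, 8, 0));
    printf("10 B @ SF7 : %.1f ms\n", 1e3 * lora_toa(10, 7, 125e3, 1, 8, 0));
    printf("10 B @ SF12: %.1f ms\n", 1e3 * lora_toa(10, 12, 125e3, 1, 8, 1));
    return 0;
}

Under these assumptions, the 10-Byte packet spends roughly 20% less time on air than the 16-Byte one at SF7, which is consistent with the lower duty-cycle occupation observed in the results that follow.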
(a) Five modules configured for each algorithm. (b) LoRaWAN gateway used for the test.

Fig. 4. Structure used for the comparison between the two algorithms.

The five modules responsible for transmitting the sensor data with the TTN recommendations were configured to send on the frequencies 864.9 MHz, 865.5 MHz and 863 MHz (this last frequency is exclusive to FSK). The other five modules, those configured with the algorithm proposed in this work, carried out transmissions on 863.5 MHz, 863.9 MHz and 860.1 MHz (this last frequency is exclusive to FSK use). To ensure that no other frequencies were used, the RAK 7249 was configured to work only on these six frequencies. In addition, it was also chosen to enable a modulation concentrator for each frequency range used.

All ten modules send the same set of values; that is, all of them must send the five flame sensor values, the battery level, the relative humidity and the air temperature. They have also been configured to communicate at an interval of 60 s. Therefore, after all the necessary configurations for the test, the modules were placed on the laboratory bench. Since they are under the same climatic conditions, it is also possible to guarantee that they send corresponding temperature and humidity values. Also, note that all batteries were fully charged.

After 24 h, the generated duty-cycle graph was verified in the gateway system, and a screenshot is shown in Fig. 5. Through this graph, it is possible to notice the difference between the two algorithms based on the lines generated when the gateway handles transmissions at a specific frequency. In this sense, it is observed that the primary frequencies are the most used (864.9 MHz and
863.5 MHz), as the firmware always chooses them as the first attempt to send. However, when analysing the secondary frequencies (865.5 MHz and 863.9 MHz), a difference is noted in the amount of duty-cycle occupation. This graphical difference demonstrates the optimisation of bytes during data transmission through the LoRaWAN network. Thus, using the algorithm proposed in this work, it is possible to install more modules under the same facilities.

Fig. 5. Screenshot obtained from the RAK 7249 gateway system after 24 h of use.

6 Conclusion and Future Works

The limited resources of LoRaWAN require a reduction of the transmitted data. In this paper, an optimised transmission is proposed that takes advantage of the unused bits of the 16-bit values. It is a lossless bit manipulation that fits the temperature, humidity and battery level bits into the unused bits of the sensor data: each sensor occupies 10 bits, and the remaining 6 bits are used for this procedure. Also, 4 free bits allow the use of an auxiliary value from 0 to 0xF that can help with synchronisation and packet sequence verification. The results showed a reduction of the LoRaWAN duty cycle while maintaining data integrity, which indicates that the proposed methodology is suitable to be installed on the wireless sensor network as future work.

Acknowledgements. This work has been supported by Fundação La Caixa and FCT—Fundação para a Ciência e Tecnologia within the Project Scope: UIDB/5757/2020.
    292 T. Britoet al. References 1. Brito, T., Pereira, A.I., Lima, J., Castro, J.P., Valente, A.: Optimal sensors posi- tioning to detect forest fire ignitions. In: Proceedings of the 9th International Con- ference on Operations Research and Enterprise Systems, pp. 411–418 (2020) 2. Brito, T., Pereira, A.I., Lima, J., Valente, A.: Wireless sensor network for ignitions detection: an IoT approach. Electronics 9(6), 893 (2020) 3. Azevedo, B.F., Brito, T., Lima, J., Pereira, A.I.: Optimum sensors allocation for a forest fires monitoring system. Forests 12(4), 453 (2021) 4. Brito, T., Azevedo, B.F., Valente, A., Pereira, A.I., Lima, J., Costa, P.: Environ- ment monitoring modules with fire detection capability based on IoT methodol- ogy. In: Paiva, S., Lopes, S.I., Zitouni, R., Gupta, N., Lopes, S.F., Yonezawa, T. (eds.) SmartCity360◦ 2020. LNICST, vol. 372, pp. 211–227. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-76063-2 16 5. Seye, M.R., Ngom, B., Gueye, B., Diallo, M.: A study of LoRa coverage: range evaluation and channel attenuation model. In: 2018 1st International Conference on Smart Cities and Communities (SCCIC), pp. 1–4 (2018). https://doi.org/10. 1109/SCCIC.2018.8584548 6. Phung, K.H., Tran, H., Nguyen, Q., Huong, T.T., Nguyen, T.L.: Analysis and assessment of LoRaWAN. In: 2018 2nd International Conference on Recent Advances in Signal Processing, Telecommunications Computing (SigTelCom), pp. 241–246 (2018). https://doi.org/10.1109/SIGTELCOM.2018.8325799 7. Kufakunesu, R., Hancke, G.P., Abu-Mahfouz, A.M.: A survey on adaptive data rate optimization in LoRaWAN: recent solutions and major challenges. Sensors 20(18), 5044 (2020). https://doi.org/10.3390/s20185044. https://www.mdpi.com/ 1424-8220/20/18/5044 8. Jovalekic, N., Drndarevic, V., Pietrosemoli, E., Darby, I., Zennaro, M.: Experimen- tal study of lora transmission over seawater. Sensors 18(9), 2853 (2018). https:// doi.org/10.3390/s18092853. https://www.mdpi.com/1424-8220/18/9/2853 9. Casals, L., Mir, B., Vidal, R., Gomez, C.: Modeling the energy performance of LoRaWAN. Sensors 17(10) (2017). https://doi.org/10.3390/s17102364. https:// www.mdpi.com/1424-8220/17/10/2364 10. Tsakos, K., Petrakis, E.G.M.: Service oriented architecture for interconnecting LoRa devices with the cloud. In: Barolli, L., Takizawa, M., Xhafa, F., Enokido, T. (eds.) AINA 2019. AISC, vol. 926, pp. 1082–1093. Springer, Cham (2020). https:// doi.org/10.1007/978-3-030-15032-7 91 11. Raza, U., Kulkarni, P., Sooriyabandara, M.: Low power wide area networks: an overview. IEEE Commun. Surv. Tutor. 19(2), 855–873 (2017). https://doi.org/10. 1109/COMST.2017.2652320 12. Zhou, Q., Zheng, K., Hou, L., Xing, J., Xu, R.: Design and implementation of open LoRa for IoT. IEEE Access 7, 100649–100657 (2019) 13. Baiji, Y., Sundaravadivel, P.: iloleak-detect: an IoT-based LoRaWAN-enabled oil leak detection system for smart cities. In: 2019 IEEE International Symposium on Smart Electronic Systems (iSES) (Formerly iNiS), pp. 262–267. IEEE (2019) 14. Kufakunesu, R., Hancke, G.P., Abu-Mahfouz, A.M.: A survey on adaptive data rate optimization in LoRaWAN: recent solutions and major challenges. Sensors 20(18), 5044 (2020) 15. Kim, S., Yoo, Y.: Contention-aware adaptive data rate for throughput optimization in LoRaWAN. Sensors 18(6), 1716 (2018)
    Optimizing Data Transmissionin a Wireless Sensor Network 293 16. Bouguera, T., Diouris, J.F., Chaillout, J.J., Jaouadi, R., Andrieux, G.: Energy consumption model for sensor nodes based on LoRa and LoRaWAN. Sensors 18(7), 2104 (2018) 17. Choudhury, S., Gibson, J.D.: Payload length and rate adaptation for multimedia communications in wireless LANs. IEEE J. Sel. Areas Commun. 25(4), 796–807 (2007). https://doi.org/10.1109/JSAC.2007.070515 18. Akyildiz, I., Joe, I.: A new ARQ protocol for wireless ATM networks. In: ICC 1998. 1998 IEEE International Conference on Communications. Conference Record. Affiliated with SUPERCOMM 1998 (Cat. No.98CH36220), vol. 2, pp. 1109–1113 (1998). https://doi.org/10.1109/ICC.1998.685182 19. Modiano, E.: An adaptive algorithm for optimizing packet size used in wire- less ARQ protocols. Wirel. Netw. 5, 279–286 (2000). https://doi.org/10.1023/A: 1019111430288 20. Leghari, M., Abbasi, S., Dhomeja, L.D.: Survey on packet size optimization tech- niques in wireless sensor networks. In: Proceedings of the International Conference on Wireless Sensor Networks (WSN4DC) (2013) 21. Sankarasubramaniam, Y., Akyildiz, I.F., McLaughlin, S.: Energy efficiency based packet size optimization in wireless sensor networks. In: Proceedings of the First IEEE International Workshop on Sensor Network Protocols and Applications, pp. 1–8. IEEE (2003) 22. Dong, W., et al.: DPLC: dynamic packet length control in wireless sensor networks. In: 2010 Proceedings IEEE INFOCOM, pp. 1–9. IEEE (2010) 23. Working with bytes. https://www.thethingsnetwork.org/docs/devices/bytes/. Accessed 15 May 2021 24. Cayenne low power payload. https://developers.mydevices.com/cayenne/docs/ lora/#lora-cayenne-low-power-payload. Accessed 15 May 2021
Indoor Location Estimation Based on Diffused Beacon Network

André Mendes1,2(B) and Miguel Diaz-Cacho2

1 Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, 5300-253 Bragança, Portugal
a.chaves@ipb.pt
2 Departamento de Ingenierı́a de Sistemas y Automática, Universidade de Vigo, 36.310 Vigo, Spain
mcacho@uvigo.es

Abstract. This work investigates the problem of location estimation in indoor Wireless Sensor Networks (WSN), where precise, discrete and low-cost independent self-location is a critical requirement. The indoor scenario makes explicit measurements based on specialised location hardware, such as the Global Navigation Satellite System (GNSS), difficult and impractical, because RF signals are subject to many propagation issues (reflections, absorption, etc.). In this paper, we propose a low-cost, effective WSN location solution. Its design uses received signal strength for ranging, lightweight distributed algorithms for location computation, and a collaborative approach to deliver accurate location estimations with a low number of nodes in predefined locations. Through real experiments, our proposal was evaluated and its performance compared with other related mechanisms from the literature, which shows its suitability and its lower average location error most of the time.

Keywords: Indoor location · Beacon · ESP8266 · Mobile application

1 Introduction

Location is highly valuable information. Its fields of application are countless, from transportation systems, factory automation, domotics, robotics and sensor networks to autonomous vehicles. Position sensors for location in outdoor environments are widely developed, based on the Global Navigation Satellite System (GNSS). Although people, mobile factory robots and even drones spend most of their time indoors, there is no globally accepted solution for positioning in indoor environments. Even for vehicles, indoor vehicle positioning expands the study of vehicular-control techniques for soft manoeuvres in vehicles placed in garages or parking lots. In general, the research approaches for the indoor positioning problem are grouped into two main solutions: beacon-based and beacon-free.

c Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 294–308, 2021. https://doi.org/10.1007/978-3-030-91885-9_21
The beacon-based solution relies on the existence of a network of receivers or transmitters (a positioning-infrastructure), some of them placed at known locations. In this solution, the location is estimated by multilateration [1] (or triangulation, for angles) from a set of measured ranges. These methods are also named Indoor Positioning Systems (IPS), depending on the sensor configuration and processing approach, and use technologies such as ultrasound, middle- to short-range radio (Wi-Fi, Bluetooth, UWB, RFID, Zigbee, etc.) or vision [2]. The main weaknesses of these solutions are the need for infrastructure and their limited absolute accuracy and coverage.

The beacon-free solution, mainly relying on dead-reckoning methods with sensors installed on the target to be located, uses Inertial Measuring Units (IMU) and odometry to estimate the position of the target [3]. These methods integrate step lengths and heading angles at each detected step [4] or, alternatively, data from accelerometers and gyroscopes [5,6]. IMU- and odometry-based location has the inconvenience of accumulating errors that make the estimated path drift.

Beacon-based solutions using short-range radio signals deduce the distance by measuring the received signal strength (RSS), mainly based on two methods: i) the use of a radio map (previously established or measured online) as a reference; or ii) the use of probabilistic methods, such as Bayesian inference, to estimate the location of the target.

The implementation of beacon-based solutions requires the existence of a positioning-infrastructure. While it is possible to achieve extremely high accuracy with a specific ad-hoc positioning-infrastructure, the cost of installing and maintaining the required hardware is a significant drawback of such systems. On the contrary, the use of an existing beacon-based network (such as Wi-Fi or Bluetooth) as a non-dedicated positioning-infrastructure may require no extra hardware installation and only minimal software adjustments to implement an IPS, but it typically results in lower accuracy [7–9]. Therefore, most of the research lines for non-dedicated positioning-infrastructures focus on increasing the accuracy of location estimation by using curve fitting and location search in subareas [9], along with different heuristic-based or mathematical techniques [10–12].

In this work, a low-complexity positioning solution based on an existing Wi-Fi infrastructure is proposed. Such technology is extremely widespread and readily available, so it is easily scaled to fill coverage gaps with inexpensive devices. One of these devices is the ESP8266 board, made by Espressif Systems, which has a low-cost Wi-Fi microchip and a meandered inverted-F PCB antenna, with a full TCP/IP stack and a microcontroller. This solution includes a collaborative node strategy that processes fixed and dynamic information depending on the environment status, following the line started by other authors [13,14] and improving the broadcasting method with new results [15]. In addition, it assumes no prior knowledge of the statistical parameters that represent the availability of the RF channel, and it can also dynamically adapt to variations of these parameters.
Additionally, the solution has low computational complexity and low energy consumption, which makes it attractive to be implemented in vehicles, mobile robots, embedded and portable devices and, especially, in smartphones.

However, there are several issues with this type of location mechanism. Inherent in all signal measurements is a degree of uncertainty with many sources (interference, RF propagation, the location of the receiver, obstacles in the path between transmitter and receiver, etc.). This uncertainty can have a significant effect on the measured RSS values, and therefore on the accuracy of the received signal. In addition, because of the RF wave propagation characteristics within buildings, the RSS values themselves are not a uniform square-law function of distance. This means that a particular RSS value may be a close match for more than one RF transmission model curve, indicating that some dynamic adjustment may be necessary from time to time.

Moreover, some contributions can be stated: early simulations yield average location errors below 2 m in the presence of both range and interference location inaccuracies, experimentally confirmed afterwards; the proposal performed well in real experiments compared with related mechanisms from the literature; it uses inexpensive and easy-to-use COTS hardware as beacon devices; and it verifies the barely intuitive idea that there is a trade-off between the number of beacon devices and the location accuracy.

The paper is organized as follows. This section presented the concept, motivation and common problems of indoor positioning technologies, as well as an introduction to the proposed solution. Section 2 provides the basic concepts and problems encountered when implementing positioning algorithms in non-dedicated positioning-infrastructures and describes the system model. A novel positioning-infrastructure based on collaborative beacon nodes is presented in Sect. 3. Simulation and real implementation results are presented and discussed in Sect. 4, and finally, Sect. 5 concludes the paper and lists future works.

2 System Model

In this section, we describe the system model, which serves as the basis for the design and implementation of our proposal. This model allows determining the location of a client device by using management frames of the Wi-Fi standard [16,17] and a Received Signal Strength (RSS) algorithm combined with the lateration technique [18,19].

Many characteristics make positioning systems work differently outdoors and indoors [19]. Typically, indoor positioning systems require higher precision and accuracy than those designed for outdoors, to deal with relatively small areas and existing obstacles.

In a strict comparison, indoor environments are more complex, as there are various objects (such as walls, equipment and people) that reflect, refract and diffract signals, leading to multipath, high attenuation and signal scattering problems. Also, such systems typically rely on non-line-of-sight (NLoS) propagation, where the signal cannot travel directly in a straight path from an emitter to a receiver, which causes inconsistent time delays at the receiver.
On the other hand, there are some favourable aspects [19]: the small coverage area is relatively under control in terms of infrastructure, corridors, entries and exits, temperature and humidity gradients, and air circulation. Also, the indoor environment is less dynamic, due to the slower movement of targets inside.

Lateration (trilateration or multilateration, depending on the number of reference points), also called range measurement, computes the location of a target by measuring its distance from multiple reference points [1,16].

Fig. 1. Basic principle of the lateration technique for 2-D coordinates.

In Fig. 1, the basic principle of the lateration method for a 2-D location measurement is shown. If the geographical coordinates (xi, yi) of at least three reference elements A, B, C are known (to avoid ambiguous solutions), as well as the measured distances dA, dB and dC, then the estimated position P(x̂, ŷ), considering the measurement error (grey area in the figure), can be calculated as the solution of a system of equations of the form of Eq. 1:

di = √((x̂ − xi)² + (ŷ − yi)²)      (1)

for {i ∈ Z | i > 0}, where i indexes the reference elements. A worked numerical sketch of this step is given after the node description below.

In our model, we consider a static environment with random and grid deployments of S sensor nodes (throughout this work called only nodes), which can be of two different types:

– Reference, which can be an available Wi-Fi compliant access point (AP); and,
– Beacon, which is made with the inexpensive Espressif ESP8266 board with a built-in Wi-Fi microchip and meandered inverted-F PCB antenna, a full TCP/IP stack and a microcontroller.
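To make the lateration step concrete, the following is a minimal sketch (not code from the paper; coordinates and distances are illustrative) that solves the 2-D system of Eq. 1 for three reference elements by subtracting pairs of circle equations, which yields a linear system in x̂ and ŷ:

#include <math.h>
#include <stdio.h>

/* Estimate (x, y) from three reference points and their measured distances
 * by linearising the circle equations (subtracting them pairwise). */
static int trilaterate(const double xr[3], const double yr[3], const double d[3],
                       double *x, double *y)
{
    double a = 2.0 * (xr[1] - xr[0]), b = 2.0 * (yr[1] - yr[0]);
    double c = d[0]*d[0] - d[1]*d[1] - xr[0]*xr[0] + xr[1]*xr[1]
             - yr[0]*yr[0] + yr[1]*yr[1];
    double e = 2.0 * (xr[2] - xr[1]), f = 2.0 * (yr[2] - yr[1]);
    double g = d[1]*d[1] - d[2]*d[2] - xr[1]*xr[1] + xr[2]*xr[2]
             - yr[1]*yr[1] + yr[2]*yr[2];
    double det = a * f - b * e;
    if (fabs(det) < 1e-9) return -1;          /* collinear references: no fix */
    *x = (c * f - b * g) / det;
    *y = (a * g - c * e) / det;
    return 0;
}

int main(void)
{
    /* illustrative references (metres) and the exact distances to point (3, 2) */
    double xr[3] = {0.0, 10.0, 0.0}, yr[3] = {0.0, 0.0, 8.0};
    double d[3]  = {sqrt(13.0), sqrt(53.0), sqrt(45.0)};
    double x, y;
    if (trilaterate(xr, yr, d, &x, &y) == 0)
        printf("estimated position: (%.2f, %.2f)\n", x, y);   /* ~ (3.00, 2.00) */
    return 0;
}

With noisy measured distances, the same linear system can be solved in a least-squares sense using more than three references, which is what more references effectively provide.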
Both types are at fixed and well-known locations, but there is also a client device, which is mobile. All these nodes are capable of wireless communication.

The client device has a piece of software installed that gathers information from those nodes and computes its own location by using an RF propagation model and the lateration technique. Therefore, based on an RSS algorithm, it can estimate distances. The RSS algorithm is developed in two steps:

i. Converting RSS to estimated distance via the RF propagation model; and,
ii. Computing the location using the estimated distances.

Note that the RF propagation model depends on frequency, antenna orientation, penetration losses through walls and floors, the effect of multipath propagation, and interference from other signals, among many other factors [20]. In our approach, we use the Wall Attenuation Factor (WAF) model [8] for distance estimation, described by Eq. 2:

RSS(d) = P(d0) − 10 α log10(d/d0) − WAF × nW,  if nW < C
RSS(d) = P(d0) − 10 α log10(d/d0) − WAF × C,   if nW ⩾ C      (2)

where RSS is the received signal strength in dBm, P(d0) is the signal power in dBm at some reference distance d0 that can either be derived empirically or obtained from the device specifications, and d is the transmitter-receiver separation distance. Additionally, nW is the number of obstructions (walls) between the transmitter and the receiver, and C is the maximum number of walls up to which the attenuation factor makes a difference; α indicates the rate at which the path loss increases with distance, and WAF is the wall attenuation factor. In general, these last two values depend on the building layout and construction material and are derived empirically following the procedures described in [8].

Since the environment varies significantly from place to place, the simplest way to find the relationship between RSS and the distance between a pair of nodes is to collect signal data at some points with known coordinates. Moreover, this is a learning-mode procedure that can improve the lateration process by adopting the particularities of the real environment, and it also helps to determine the path loss exponent α empirically.

Therefore, we determine α by rewriting the path loss model in a form that presents a linear relationship between the predicted RSS and the logarithm of the transmitter-receiver separation distance d, and then we apply a linear regression to obtain that model parameter. Observe that C and nW are determined according to the type of facility.
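As an illustration of the two steps just described (calibrating α by linear regression and then inverting Eq. 2 to turn an RSS reading into a distance), the sketch below uses made-up calibration samples and parameter values; it is not code or data from the paper.

#include <math.h>
#include <stdio.h>

/* Fit the path-loss exponent alpha by least squares on the linearised model
 *   P(d0) - RSS = 10 * alpha * log10(d / d0)      (no walls on the path). */
static double fit_alpha(const double d[], const double rss[], int n,
                        double p_d0, double d0)
{
    double sxx = 0.0, sxy = 0.0;
    for (int i = 0; i < n; i++) {
        double x = 10.0 * log10(d[i] / d0);   /* regressor          */
        double y = p_d0 - rss[i];             /* measured path loss */
        sxx += x * x;
        sxy += x * y;
    }
    return sxy / sxx;                         /* slope through the origin */
}

/* Invert Eq. 2 to estimate distance from a measured RSS value. */
static double rss_to_distance(double rss, double p_d0, double d0,
                              double alpha, double waf, int n_walls, int c_max)
{
    double wall_loss = waf * (n_walls < c_max ? n_walls : c_max);
    return d0 * pow(10.0, (p_d0 - rss - wall_loss) / (10.0 * alpha));
}

int main(void)
{
    /* hypothetical calibration samples taken at known distances (metres, dBm) */
    double d[]   = {1.0, 2.0, 4.0, 8.0, 16.0};
    double rss[] = {-40.0, -46.5, -52.8, -58.9, -65.2};
    double p_d0 = -40.0, d0 = 1.0;

    double alpha = fit_alpha(d, rss, 5, p_d0, d0);
    printf("fitted alpha  : %.2f\n", alpha);
    printf("d for -60 dBm : %.1f m (one wall, WAF = 3 dB, C = 4)\n",
           rss_to_distance(-60.0, p_d0, d0, alpha, 3.0, 1, 4));
    return 0;
}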
3 Distributed Collaborative Beacon Network

In this section, we introduce the distributed collaborative beacon network for indoor environments. Our goal is to improve the client location accuracy by enhancing the information broadcasted by nodes.

3.1 Proposal

Unlike traditional Wi-Fi-based location techniques [17], which map out an area of signal strengths and store it in a database, beacon nodes are used to transmit encoded information within a management frame (beacon frame), which is used for passive position calculation through the lateration technique. Observe that we focus on the client side, without the need to send a client device's position anywhere, avoiding privacy concerns. Moreover, since this is a purely client-based system, location latency is reduced compared with a server-side implementation.

To improve location accuracy, we propose a collaborative strategy where nodes exchange information with each other. Every 5 s, during the learning mode, a beacon node randomly (with probability depending on a threshold ε) enters scanning mode [16,17]. The node remains in this mode during at least one beacon interval (approximately 100 ms) to guarantee that at least one beacon is received from any reference node (access point) in range.

A beacon node receives, when in learning mode, RSS and location information from reference nodes in range, and can therefore calculate a new α value using Eq. 2. Afterwards, the new propagation parameter (α) and its own location coordinates are coded and stuffed into the SSID, then broadcasted to the clients. Reference nodes only broadcast their own location.

It is noted that the path loss model plays an important role in improving the quality and reliability of the ranging accuracy. Therefore, it is necessary to investigate it in the actual propagation environment to mitigate the obstruction of beacon node coverage (e.g. by people, furniture, etc.), tackling the problem of NLoS communication.

Once in place, this coded information allows a client device to compute distances through lateration and then determine its own location. All of this is performed without the need for data connectivity, as the location is determined purely from the information received in beacon frames from nodes in range.

The coding schema in the standard beacon frame (also known as beacon stuffing [13]) works in nearly all standard-compliant devices and enhances them if multiple SSIDs are supported, thus allowing an existing SSID to be used simultaneously. Beacon frames must be sent at a minimum data rate of 1 Mbps (standard amendment b [16,17]) to normalise the RSS across all devices. In general, the RSS would change if the data rate of the management frames changed; in this way, no variation in RSS is observed on the client device, except due to range.

Consideration should be given to planning the optimal placement of the beacon nodes, but this will generally be dictated by the existing infrastructure (e.g., if using an existing standard-compliant access point as a reference node), building limitations or power source requirements.
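The paper does not spell out the exact coding schema used inside the SSID, so the following is only a hypothetical illustration of beacon stuffing: the node's coordinates and its current α estimate are packed into the 32-character SSID field and parsed back on the client side.

#include <stdio.h>
#include <string.h>

#define SSID_MAX 32   /* maximum SSID length allowed by the 802.11 standard */

/* Hypothetical schema: "WL:<x>,<y>,<alpha>" with fixed-point values
 * (centimetres and hundredths), kept within the 32-byte SSID limit. */
static int encode_ssid(char ssid[SSID_MAX + 1], double x_m, double y_m, double alpha)
{
    int n = snprintf(ssid, SSID_MAX + 1, "WL:%d,%d,%d",
                     (int)(x_m * 100.0), (int)(y_m * 100.0), (int)(alpha * 100.0));
    return (n > 0 && n <= SSID_MAX) ? 0 : -1;   /* fail if it would not fit */
}

static int decode_ssid(const char *ssid, double *x_m, double *y_m, double *alpha)
{
    int xc, yc, ac;
    if (sscanf(ssid, "WL:%d,%d,%d", &xc, &yc, &ac) != 3)
        return -1;                               /* not one of our beacons */
    *x_m = xc / 100.0;
    *y_m = yc / 100.0;
    *alpha = ac / 100.0;
    return 0;
}

int main(void)
{
    char ssid[SSID_MAX + 1];
    double x, y, a;
    if (encode_ssid(ssid, 12.34, 5.60, 2.17) == 0 &&
        decode_ssid(ssid, &x, &y, &a) == 0)
        printf("SSID \"%s\" -> (%.2f m, %.2f m), alpha = %.2f\n", ssid, x, y, a);
    return 0;
}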
Algorithm 1: BEACON node
 1  Initialization of variables;
 2  while (1) do
 3      /* zeroing the information array */
 4      /* and fill in a proper location */
 5      Initialization of I[1..nodes];
 6      /* check time counter */
 7      if (counter % 5) then
 8          draw a random number x, x ∈ [0, 1];
 9          /* learning mode */
10          if (x < ε) then
11              WiFi Scanning;
12              get SSID from beacon frame;
13              foreach ref node in SSID do
14                  read coded information;
15                  Decode;
16                  get {location, RSS};
17                  I[ref node] ← {location, RSS};
18              /* adjust α if needed */
19              if (α does not fit the default curve) then
20                  ComputeNewAlpha;
21                  I[ref node] ← {α};
22      /* beacon frame coding */
23      PackAndTx ← I;
24      Broadcast coded SSID;

3.2 Algorithms and Complexity Analysis

The proposed strategy is described in Algorithms 1 and 2. In the beginning, after initialization, the mechanism enters an infinite loop for its entire period of operation. From a beacon node's point of view, it decides, according to a threshold (ε), between entering learning mode, as described earlier, or jumping directly to the coding of its own location coordinates into the SSID and their transmission.

In Algorithm 2, a client device verifies whether a received beacon frame comes from a beacon or a reference node, then runs the routines to decode the information and adjust α, if needed, and finally computes its location.

Finally, observing Algorithm 1, it is noted that it depends on three routines, named Decode, ComputeNewAlpha and PackAndTx, which are finite-time operations with constant time complexity, O(1). However, everything depends on the number of reference nodes in range, which raises the whole complexity to T(n) = O(n). In a similar fashion, Algorithm 2 depends on the Decode and AdjustAlpha routines, which are finite-time operations, but the Positioning routine takes linear time, also dependent on the number of nodes (reference and beacon) in range; thus, its complexity is also linear, T(n) = O(n). Additionally, the largest resource consumption remains the storage of each coded SSID (32 bytes), which also depends on the number of nodes (reference and beacon) in range. Thus, S(n) = O(n).
Algorithm 2: CLIENT device
 1  Initialization of variables;
 2  while (1) do
 3      WiFi Scanning;
 4      get SSID from beacon frame;
 5      foreach (ref node | beacon node in SSID) do
 6          read coded information;
 7          Decode;
 8          AdjustAlpha;
 9          Positioning;

4 Evaluation

In this section, the proposed collaborative strategy is evaluated using numerical simulations and also through a real implementation experiment using the inexpensive ESP8266 board [21], a Wi-Fi standard-compliant access point and a smartphone application.

Location estimation is the problem of estimating the location of a client device from a set of noisy measurements acquired in a distributed manner by a set of nodes. The ranging error can be defined as the difference between the estimated distance and the actual distance. The location accuracy of each client device is evaluated using the root mean square error (Eq. 3):

ei = √((x − xe)² + (y − ye)²)      (3)

where x and y denote the actual coordinates, and xe and ye represent the estimated coordinates of the client. We define the average location error (e) as the arithmetic mean of the location errors, as seen in Eq. 4:

e = (1/N) · Σ_{i=1}^{N} ei      (4)

To assist in the evaluation, the strategies are analysed in terms of the average location error (e); additionally, some other metrics are used and discussed occasionally, such as accuracy, technology, scalability, cost, privacy and robustness [16,19,22].
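A small helper computing Eq. 3 and Eq. 4 for a batch of estimates (the positions below are hypothetical, purely for illustration) might look as follows:

#include <math.h>
#include <stdio.h>

/* Eq. 3: Euclidean location error of one estimate. */
static double location_error(double x, double y, double xe, double ye)
{
    return sqrt((x - xe) * (x - xe) + (y - ye) * (y - ye));
}

/* Eq. 4: average location error over N clients/estimates. */
static double average_location_error(const double x[], const double y[],
                                     const double xe[], const double ye[], int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += location_error(x[i], y[i], xe[i], ye[i]);
    return sum / n;
}

int main(void)
{
    /* hypothetical ground-truth and estimated positions (metres) */
    double x[]  = {1.0, 4.0, 7.5},  y[]  = {2.0, 3.0, 1.5};
    double xe[] = {1.4, 3.2, 8.1},  ye[] = {2.3, 3.5, 1.0};
    printf("average location error e = %.2f m\n",
           average_location_error(x, y, xe, ye, 3));
    return 0;
}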
Finally, the objective of the proposed strategy is not to shorten the network lifetime. Therefore, we have also evaluated the performance of the strategies in terms of energy consumption.

4.1 Simulations

Based on the system model described in Sect. 2, and to assess the behaviour of the proposed strategy (Subsect. 3.1) in solving the indoor positioning problem, we use the ns-2 simulator environment running under GNU/Linux, which emulates a network with both types of nodes and clients and adopts a specific data link layer where contention and possible collisions between them take place.

In particular, 20 nodes and 10 clients have been deployed in an area of 200 m², each of them with an initial 100 energy units that are consumed during the simulation on a per-operation basis. We use a fixed energy model, meaning that every receive, decode, code and transmit operation consumes a fixed amount of energy, as used by ns-2. The α parameter (Eq. 2) has an initial value of 2.0 and the threshold ε is set to 0.1. The values of the remaining simulation parameters were picked as the most representative ones within their validity intervals, after a round of tests.

For this analysis, we set 2 different scenarios for placing the sensor nodes (beacon and reference) with respect to the physical features of the network, aiming to reproduce the Department of Systems Engineering and Automatics' (DESA) facility. Clients are placed randomly in both cases.

– Grid: sensor nodes are placed over an equally spaced reference grid; and,
– Random: sensor nodes are placed randomly in the area.

The performance of the following strategies is evaluated through simulations in the laboratory:

– WELO: our proposal, named WEmos (a development board based on the ESP8266) LOcalization;
– GENERAL: the single-position approach, meaning that only one location is computed after decoding the data within beacon frames when the node is placed; and,
– CF: continuous positioning, like GENERAL, but multiple locations are computed to reduce the minimum square error.

A total of 400 simulation rounds is carried out for each set of parameters, each one corresponding to 2000 s of simulation time, for both scenarios described earlier, where we compare the performance of the described strategies.

In the results, the average location error (e) obtained at every round is presented, with error bars corresponding to the 95% confidence interval; also presented are the quantity of beacon nodes used for positioning, the cumulative distribution function (CDF) of e, and the average energy consumption, per client.
Fig. 2. Average location error (e) against clients deployed, per beacons available (by simulations): (a) #BEACONS = 4; (b) #BEACONS = 10; (c) #BEACONS = 20.

Fig. 3. WELO behaviour in simulations: (a) beacons deployed versus available beacons; (b) energy consumption against clients deployed; (c) average location error (e) CDF in simulations.

Figure 2 compares the e obtained by the strategies. This performance comparison shows that our proposal, WELO, improves the accuracy of client positioning by reducing e. We have observed that, when the number of beacon nodes increases, WELO, as a passive mechanism, reduces e on average, independently of the quantity of clients deployed. Compared with the other strategies, this result shows that WELO uses more location information obtained through our collaborative approach, thus leveraging the lateration process independently of the number of clients, which becomes an important performance metric.

Another relevant observation is that there is a balance between the quantity of beacon nodes placed and the accuracy. Looking at Figs. 2b and 2c, one can observe that e is only marginally reduced when raising the number of beacons deployed. From another point of view, Fig. 3a shows this balance with the number of beacon nodes used to obtain a location. Further studies in different scenarios have to determine whether there is a clear relationship between accuracy and the quantity of nodes, which might go against the statement that accuracy (with lateration) improves as more nodes are installed.

The CDF of e is depicted in Fig. 3c; the curve of WELO lies at the top, demonstrating that its cumulative probability is larger for the same error. As can be seen, WELO achieves an accuracy equal to or better than 2 m for 80% of the location estimations along the scenario, and a reduction of at least 80% over the other strategies.
Apart from that, looking at Fig. 3b, the average energy consumption of our strategy is stable in comparison with the other strategies. As expected, with clients running a simplified mechanism and only beacon frames being broadcasted by beacon nodes, our proposed approach significantly improves the residual energy of the overall network, for both scenarios.

4.2 Real Implementation Analysis

Our real implementation analysis has been made with placements around the border of a corridor that connects offices at the Department of Systems Engineering and Automatics' (DESA) facility, as shown in Fig. 4.

Fig. 4. DESA's facility real scenario and heatmap overlay (beacon and reference nodes, B and R respectively).

We deployed at least one beacon node (made with the ESP8266 board) on each corner, up to 8 nodes in total (including one Wi-Fi standard-compliant access point), and a client device (smartphone), in an area of 200 m², where the maximum distance between nodes lies between 2 m and 3 m. To avoid the potential for strong reflection and multipath effects, we kept all devices away from the floor and ceiling. The α parameter (Eq. 2) has an initial value of 2.0 and the threshold ε is set to 0.1.

Before running the experiment, we measured the received signal strength from each node (beacon or reference node) of our network. These measurements were made at distances ranging between 1 m and 20 m from each node, at 10 points within the area of the experiment, totalling 200 measurements, aided by a smartphone application. Figure 4 shows a heat map overlay made with the average of these measurements.

In addition, we added functions to the base firmware of the ESP8266 board (esp iot sdk v1.3.0 SDK [21]) to execute the learning process described in Subsect. 3.1.

During the experiment, on the smartphone application, the trilateration process uses an RSS-based path loss algorithm to determine the distance (Eq. 2).
In the results, the average location error (e) obtained is presented and discussed and, for purposes of comparison, the cumulative distribution function (CDF) of e is also investigated.

Fig. 5. Real experiment: our strategy vs GNSS: (a) measurements for the location estimation error; (b) real behaviour during measurements.

The average location error e obtained by WELO, running on a smartphone, and by a GNSS (GPS and GLONASS) receiver application, running on another smartphone, is presented in Fig. 5a. Both smartphones are the same model and the measurements were taken the same way to avoid any bias. This comparison shows that WELO achieves the better results. The performance of GNSS is worse because of the instability of the received signal, even though the measurements were taken at ground level, separated from the outside by generous glass windows and drywall-made walls on the inside.

Another interesting observation relies on Fig. 5b, where we can see the real behaviour of both systems, GNSS and WELO. At some points near the glass windows, GNSS signals become strong and their error is reduced; in contrast, WELO becomes worse because it is far from the nodes (reference and beacons). This happens due to the dynamic nature of the RF environment.

Even so, GNSS can get much more information during the evaluation instants, but in indoor scenarios any temporary change in the RF environment causes significant performance degradation for GNSS. In contrast, WELO achieves a better performance not only because of the higher availability of its signals but also because of the higher confidence of the location information obtained.

Moreover, this behaviour demonstrates that GNSS errors in this scenario become out of control, mainly because of the multiple reflections at surfaces that cause multipath propagation, and because the signals lose significant power indoors, compromising the required coverage for receivers (at least four satellites). However, with the introduction of "dual-frequency" support on smartphones, position accuracy will probably experience promising enhancements. Further analysis will determine its performance under this same indoor scenario.
Fig. 6. Average location error (e) CDF comparison (by real experiment): (a) our strategy: simulated vs real; (b) benchmarking between mechanisms.

By comparing the results in Fig. 5a to the ones in Fig. 2, one can observe that the performance of WELO is positively impacted by the growth of the number of beacon nodes available, but only up to a limit. Additionally, the performance degradation is small, which shows the robustness of the proposed strategy against the larger heterogeneity of a real indoor scenario without any prior knowledge.

Based on the above observations, we compare the average location error (e) CDF (from the real experiment) in Fig. 6. Figure 6a shows the strategy's performance in simulations in contrast with its capabilities in the real experiments. There, we can observe the accuracy suffering from the toughness of the indoor scenario. Figure 6b shows the performance evaluation of the proposed strategy against some classical algorithms, such as RADAR [8] and Curve Fitting with RSS [9], in the office indoor scenario, together with GNSS (worse performance, related to the upper x-axis only). As can be seen, WELO shows its suitability and its lower average location error most of the time within the scenario. From these figures, the WELO results are fairly encouraging and we can observe performance improvements.

5 Conclusions

The Wi-Fi standard has established itself as a key positioning technology that, together with GNSS, is widely used by devices.

Our contribution is the development of a simple and low-cost indoor location infrastructure, plus a low-complexity and passive positioning strategy that does not require specialized hardware (instead exploiting commonly available hardware, the ESP8266 board, with enhanced functionality), arduous area polling or reliance on data connectivity to a location database.

These last two features help to reduce its latency, all while remaining backwards compatible with existing Wi-Fi standards, and working side by side with
IEEE 802.11mc Fine Time Measurement (Wi-Fi RTT) when it becomes broadly available.

We ran simulations with ns-2 and made an experiment under real indoor conditions, at DESA's facility, to evaluate the performance of the proposed strategy. The results show that this solution achieves better performance than other pre-established strategies (GNSS system included) for the scenarios evaluated.

Further research is required to guarantee the stability of our approach. As future work, we intend to extend the evaluation to client devices being located by dynamically deployed beacon nodes.

Acknowledgements. This work has been conducted under the project "BIOMA – Bioeconomy integrated solutions for the mobilization of the Agri-food market" (POCI-01-0247-FEDER-046112), by "BIOMA" Consortium, and financed by European Regional Development Fund (FEDER), through the Incentive System to Research and Technological development, within the Portugal2020 Competitiveness and Internationalization Operational Program. This work has also been supported by FCT - Fundação para a Ciência e Tecnologia within the Project Scope: UIDB/05757/2020.

References

1. Savvides, A., Park, H., Srivastava, M.B.: The bits and flops of the n-hop multilateration primitive for node localization problems. In: Proceedings of the 1st ACM International Workshop on Wireless Sensor Networks and Applications, pp. 112–121 (2002)
2. Hightower, J., Borriello, G.: Location systems for ubiquitous computing. Computer 34(8), 57–66 (2001)
3. Collin, J.: Investigations of self-contained sensors for personal navigation. Tampere University of Technology (2006)
4. Jimenez, A.R., Seco, F., Prieto, C., Guevara, J.: A comparison of pedestrian dead-reckoning algorithms using a low-cost MEMS IMU. In: 2009 IEEE International Symposium on Intelligent Signal Processing, pp. 37–42. IEEE (2009)
5. Jiménez, A.R., Seco, F., Prieto, J.C., Guevara, J.: Indoor pedestrian navigation using an INS/EKF framework for yaw drift reduction and a foot-mounted IMU. In: 2010 7th Workshop on Positioning, Navigation and Communication, pp. 135–143. IEEE (2010)
6. Chatfield, A.B.: Fundamentals of High Accuracy Inertial Navigation. American Institute of Aeronautics and Astronautics, Inc. (1997)
7. Deasy, T.P., Scanlon, W.G.: Stepwise refinement algorithms for prediction of user location using receive signal strength indication in infrastructure WLANs. In: 2003 High Frequency Postgraduate Student Colloquium (Cat. No. 03TH8707), pp. 116–119. IEEE (2003)
8. Bahl, P., Padmanabhan, V.N.: RADAR: an in-building RF-based user location and tracking system. In: Proceedings IEEE INFOCOM 2000. Conference on Computer Communications. Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies (Cat. No. 00CH37064), vol. 2, pp. 775–784. IEEE (2000)
9. Wang, B., Zhou, S., Liu, W., Mo, Y.: Indoor localization based on curve fitting and location search using received signal strength. IEEE Trans. Industr. Electron. 62(1), 572–582 (2014)
    308 A. Mendesand M. Diaz-Cacho 10. Ji, X., Zha, H.: Sensor positioning in wireless ad-hoc sensor networks using multi- dimensional scaling. In: IEEE INFOCOM 2004, vol. 4, pp. 2652–2661. IEEE (2004) 11. Doherty, L., El Ghaoui, L., et al.: Convex position estimation in wireless sensor networks. In: Proceedings IEEE INFOCOM 2001. Conference on Computer Com- munications. Twentieth Annual Joint Conference of the IEEE Computer and Com- munications Society (Cat. No. 01CH37213), vol. 3, pp. 1655–1663. IEEE (2001) 12. Shang, Y., Ruml, W.: Improved MDS-based localization. In: IEEE INFOCOM 2004, vol. 4, pp. 2640–2651. IEEE (2004) 13. Chandra, R., Padhye, J., Ravindranath, L., Wolman, A.: Beacon-stuffing: wi-fi without associations. In: Eighth IEEE Workshop on Mobile Computing Systems and Applications, pp. 53–57. IEEE (2007) 14. Sahoo, P.K., Hwang, I., et al.: Collaborative localization algorithms for wireless sensor networks with reduced localization error. Sensors 11(10), 9989–10009 (2011) 15. Bal, M., Shen, W., Ghenniwa, H.: Collaborative signal and information processing in wireless sensor networks: a review. In: 2009 IEEE International Conference on Systems, Man and Cybernetics, pp. 3151–3156. IEEE (2009) 16. Liu, H., Darabi, H., Banerjee, P., Liu, J.: Survey of wireless indoor positioning techniques and systems. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 37(6), 1067–1080 (2007) 17. 802.11b: Wireless LAN MAC and PHY Specifications: Higher-Speed Physical Layer Extension in the 2.4 GHz Band (1999). IEEE Standard 18. Al Nuaimi, K., Kamel, H.: A survey of indoor positioning systems and algorithms. In: 2011 International Conference on Innovations in Information Technology, pp. 185–190. IEEE (2011) 19. Gu, Y., Lo, A., Niemegeers, I.: A survey of indoor positioning systems for wireless personal networks. IEEE Commun. Surv. Tutor. 11(1), 13–32 (2009) 20. Rappaport, T.: Wireless Communications: Principles and Practice. Prentice-Hall, Hoboken (2001) 21. Espressif: Datasheet ESP8266 (2016) 22. Huang, H., Gartner, G.: A survey of mobile indoor navigation systems. In: Gartner, G., Ortag, F. (eds.) Cartography in Central and Eastern Europe. Lecture Notes in Geoinformation and Cartography, pp. 305–319. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03294-3 20
    SMACovid-19 – AutonomousMonitoring System for Covid-19 Rui Fernandes(B) and José Barbosa MORE – Laboratório Colaborativo Montanhas de Investigação – Associação, Bragança, Portugal {rfernandes,jbarbosa}@morecolab.pt Abstract. The SMACovid-19 project aims to develop an innovative solution for users to monitor their health status, alerting health pro- fessionals to potential deviations from the normal pattern of each user. For that, data is collected, from wearable devices and through manual input, to be processed by predictive and analytical algorithms, in order to forecast their temporal evolution and identify possible deviations, pre- dicting, for instance, the potential worsening of the clinical situation of the patient. Keywords: COVID-19 · FIWARE · Docker · Forecasting 1 Introduction Wearable devices nowadays support many sensing capabilities that can be used to collect data, related to the body of the person using the wearable. This allowed for the development of health related solutions based on data collected from the wearable devices for different purposes: for instance, in [1], the authors present a system for automated rehabilitation, in [2] big data analysis is used to define healthcare services. Nowadays, we are facing a pandemic scenario and this work intends to use the characteristics of the wearable devices to develop an innovative solution that can be used to diagnose possible infected COVID-19 patients, at an early stage, by monitoring biological/physical parameters (through wearable devices) and by conducting medical surveys that will allow an initial screening. 2 Docker and FIWARE The system was developed using Docker compose, to containerize the architec- ture modules, and FIWARE, to deal with the context information. Docker is an open source platform that allows the containerization of appli- cations and uses images and containers to do so. An image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings [3]. c Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 309–316, 2021. https://doi.org/10.1007/978-3-030-91885-9_22
    310 R. Fernandesand J. Barbosa A container is a standardized unit of software, and includes the code as well as the necessary dependencies for it to run efficiently, regardless of the computing environment where is being run [3]. Images turn into containers at runtime. The advantage that applications developed using this framework have is that they will always run the same way, regardless of the used infrastructure to run them. The system was developed using the FIWARE platform [4]. FIWARE defines a set of standards for context data management and the Context Broker (CB) is the key element of the platform, enabling the gathering and management of context data: the current context state can be read and the data can be updated. FIWARE uses the FIWARE NGSI RESTful API to support data transmis- sion between components as well as context information update or consumption. The usage of this open source RESTful API simplifies the process of adding extra functionalities through the usage of extra FIWARE or third-party components. As an example, FIWARE extra components, ready to use, cover areas such as interfacing with the Internet of Things (IoT), robots and third-party systems; context data/API management; processing, analysis and visualization of context information. In this project, the version used was the NGSI-v2. 3 System Architecture The system’s architecture is the one shown in Fig. 1, in which the used/ implemented modules communicate through the ports identified on the edges: – Orion: FIWARE context broker; – Cygnus: responsible for the data persistence in the MongoDB; – Context Provider: performs data acquisition from the Fitbit API; – Data analysis: processes the available data to generate forecasts; – Fitbit API: data source for the wearable device. In the system, data from the wearable device and from the mobile app is posted into the CB. This data is automatically persisted in the MongoDB through the cygnus module and is also used to trigger the data analysis module, to execute a new forecast operation. Data persistence is implemented through the creation of subscriptions in the CB that notify cygnus about data update. There are already docker images ready to use for the standard modules, such as, the MongoDB, the Orion context broker and the Cygnus persistence module. The context provider and data analysis modules were developed and containerized to generate the full system architecture. 3.1 Data Model The NGSI-v2 API establishes rules for entity definition. Per those rules, an entity has to have mandatory attributes, namely, the id and the type attributes. The id is an unique uniform resource name (urn) created according with the format “urn:ngsi-ld:entity-type:entity-id”. In this work, the entity’s type is identified in each entity definition, as can be seen in the data model presented in Fig. 2.
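To make the entity rules above concrete, the following is a minimal sketch, not taken from the project code, of how a Person entity and a persistence subscription could be created against the Orion Context Broker through the NGSI-v2 REST API. The host names, ports and the extra attributes shown are illustrative assumptions.

```python
# Minimal sketch (assumed names/ports): create a Person entity following the
# urn convention above and register a subscription so that Cygnus is notified
# of every change to Person entities.
import uuid
import requests

ORION = "http://orion:1026"                   # Orion Context Broker (assumed host/port)
CYGNUS_NOTIFY = "http://cygnus:5050/notify"   # Cygnus notification endpoint (assumed)

person = {
    "id": f"urn:ngsi-ld:Person:{uuid.uuid4()}",      # urn:ngsi-ld:entity-type:entity-id
    "type": "Person",
    "name": {"type": "Text", "value": "Jane Doe"},   # illustrative attribute
    "fitbitId": {"type": "Text", "value": "ABCDEF"}, # illustrative attribute
}
requests.post(f"{ORION}/v2/entities", json=person).raise_for_status()

subscription = {
    "description": "Persist Person updates through Cygnus",
    "subject": {"entities": [{"idPattern": ".*", "type": "Person"}]},
    "notification": {"http": {"url": CYGNUS_NOTIFY}},
}
requests.post(f"{ORION}/v2/subscriptions", json=subscription).raise_for_status()
```

Subsequent updates to the entity's attributes then trigger the notification, which is how the subscription-based persistence described above is driven.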
    SMACovid-19 – AutonomousMonitoring System for Covid-19 311 Fig. 1. Architecture devised to create the SMACovid-19 system. Given the fact that each id has to be unique, the entity-id part of the urn was created based on the version 4 of the universally unique identifier (uuid). Using these rules, a person entity may be, for instance: urn:ngsi-ld:Person:4bc3262c-e8bd-4b33-b1ed-431da932fa5 To define the data model, extra attributes were used, in each entity, to con- vey the necessary information between modules. Thus, the Person entity has attributes to specify the user characteristics and to associate the person with his fitbit account. The Measurement entity represents the acquired data values and its corresponding time of observation. This entity is used to specify information regarding the temperature, oxygen saturation, blood pressure, heart rate and daily number of diarrhea occurrences. The measurementType attribute is used to indicate what type of variable is being used on that entity. To ensure max- imum interoperability possible, the Fast Healthcare Interoperability Resources (FHIR) specification of Health Level Seven International (HL7) [5] was used to define terms whenever possible. This resulted in the following terms being used: bodytemp for temperature, oxygensat for oxygen saturation, bp for blood pressure, heartrate for heart rate and diarrhea (FHIR non-compliant) for daily number of diarrhea occurrences. The Questionnaire entity specifies data that has to be acquired through the response of a questionnaire, by the user, through the mobile application. The Organization entity identifies the medical facility that the person is associated with. The Forecast entity provides the forecast mark value, calculated accord-
    312 R. Fernandesand J. Barbosa Fig. 2. Data model ing with the classification table defined by the Hospital Terra Quente (HTQ) personnel, and the corresponding time of forecast calculation. 3.2 Data Acquisition Data is acquired from two different sources: mobile application and Fitbit API. The mobile application is responsible for manual data gathering, sending this data to the CB through POST requests. For instance, data related to the attributes of the Questionnaire entity is sent to the CB using this methodology. Fitbit provides an API for data acquisition that, when new data is uploaded to the Fitbit servers, sends notifications to a configured/verified subscriber end- point. These notifications are similar to the one represented in Fig. 3. The Fig. 3. Fitbit notification identifying new data availability [6].
    SMACovid-19 – AutonomousMonitoring System for Covid-19 313 ownerId attribute identifies the fitbitId of the person that has new data available, thus, allowing the system to fetch data from the identified user. This results in the following workflow, to get the data into the system: 1. create a Person entity; 2. associate Fitbit account with that person; 3. make a oneshot subscription (it is only triggered one time—oneshot) for that Person entity, notifying the Fitbit subscription endpoint; 4. update the Person entity with the Fitbit details; 5. create the fitbit subscription upon the oneshot notification reception; 6. get new data from Fitbit web API after receiving a notification from Fitbit: collect the fitbitId from the notification and use it to get the data from the Fitbit servers, from the last data update instant up until the present time. 3.3 Data Analysis and Forecast Execution When new data arrives into the system, the data analysis module is called to perform a forecast, based on the new available data. Given the data sources at hand, there are two methods to call this module: (1) subscription based from the context broker or (2) direct call from the context provider. In the first method, subscriptions are created within the CB to notify the data analysis module about data change. This method is used on the Questionnaire and Measurement entities. For the latter, individual subscriptions have to be created for each possible measurementType value. With this methodology, every time new data arrives at the CB about any of these variables, the subscription is triggered and the CB sends the notification to the forecast endpoint, identifying the entity that has new data available. The second method is used when there’s new data from the Fitbit servers. Upon data collection from the Fitbit servers, the data is sent to the CB, being persisted by cygnus in the database. Afterwards, the data analysis endpoint is called with the subscription ID set to fitbit. In order to perform the forecast, data is retrieved from the CB and the database, based on the subscriptionId, for a 15 day interval, defined based on the present instant. Data is collected for all variables: (1) anosmia, (2) contactCovid, (3) cough, (4) dyspnoea, (5) blood pressure, (6) numberDailyDiarrhea, (7) oxy- gen saturation, (8) temperature and (9) heart rate. Data history of a certain variable is used to forecast that variable: – variables 1–4 are boolean variables, thus, the forecast implemented consists in the last value observed, for each variable, read from the context broker. – variables 5–8 are variables with slow variation and small dynamic range, thus, the forecast used was a trend estimator. – variable 9 has a seasonal variation that can be observed on a daily basis, thus, a forecast algorithm that can take seasonal effects into consideration was used, namely, the additive Holt-Winters forecast method. The Holt-Winters method needs to be fed with periodic data in order to make the forecast and has to have at least data related to two seasons (two
314 R. Fernandes and J. Barbosa

days). This means that data pre-processing has to be executed to ensure that this requirement is met. For instance, when the wearable device is being charged, it cannot, of course, read the patient's data, creating gaps that have to be processed before passing the data to the forecast algorithm. Hence, when a new forecast is executed, data is collected from the database, the data is checked for gaps and, if gaps are found, forward and backward interpolation are used to fill them. The final data, without gaps, is then passed to the forecast method. Holt-Winters parameter optimization is executed first, using, to that end, the last known hour of data, coupled with a Mean Absolute Error (MAE) metric. The optimized parameters are then used to make the forecast for the defined forecast horizon.

Table 1. Classification metrics for Covid-19 level identification provided by HTQ.

Variable | 0 points | 1 point | 2 points
Systolic blood pressure (mmHg) | 120 ≤ x ≤ 140 | 100 ≤ x ≤ 119 | x ≤ 99
Heart rate (bpm) | 56 ≤ x ≤ 85 | 86 ≤ x ≤ 100 or 50 ≤ x ≤ 55 | x ≥ 101 or x ≤ 49
Oxygen saturation (%) | x ≥ 94 | 90 ≤ x ≤ 93 | x ≤ 89
Body temperature (°C) | x ≤ 36.9 | 37 ≤ x ≤ 39.9 | x > 39.9
Dyspnea | No | N/A | Yes
Cough | No | N/A | Yes
Anosmia | No | N/A | Yes
Contact COVID+ | No | N/A | Yes
Diarrhea | No | N/A | Yes (minimum of 5 times/day)

The final forecast mark is computed by adding all individual marks, calculated according to the rules specified in Table 1, defined by the medical personnel of Hospital Terra Quente. This value is used with the following classification, to generate alerts:

– value ≤ 4 points → unlikely Covid-19 → 24 h vigilance;
– value ≥ 5 points → likely Covid-19 → patient should be rushed to the emergency room (ER).

4 Results

As discussed in Sect. 3.3, some of the variables used in this work are boolean and the forecast performed is the last value those variables have, which can be read from the context broker. Another set of variables uses the trend estimator to forecast the next day's value. One such case can be seen in Fig. 4(a), which showcases the number of daily diarrhea occurrences. In this case, the forecast is less than 5 times per day, thus the attributed forecast points are 0, per Table 1.
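As an illustration of the pre-processing and Holt-Winters forecasting step described in Sect. 3.3, the simplified Python sketch below fills measurement gaps by interpolation and fits an additive Holt-Winters model. It is not the project code; the function name, sampling rate and forecast horizon are assumptions.

```python
# Simplified sketch of the gap-filling and additive Holt-Winters forecast
# described above (assumed names, sampling rate and horizon).
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

def forecast_series(series: pd.Series, samples_per_day: int, horizon: int) -> pd.Series:
    # Fill gaps (e.g. periods when the wearable was charging) by interpolating
    # in both directions, so the model receives a gap-free series.
    clean = series.interpolate(limit_direction="both")

    # Additive Holt-Winters with a daily season; it needs at least two full
    # days (two seasons) of data, as noted above.
    model = ExponentialSmoothing(
        clean, trend="add", seasonal="add", seasonal_periods=samples_per_day
    )
    # statsmodels estimates the smoothing parameters internally; the paper
    # instead tunes them against an MAE metric on the last known hour of data,
    # which would require a custom search on top of this call.
    fitted = model.fit()
    return fitted.forecast(horizon)
```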
    SMACovid-19 – AutonomousMonitoring System for Covid-19 315 (a) Daily diarrhea example. (b) Oxygen Saturation example. Fig. 4. Trend estimator forecasting. Fig. 5. Heart rate forecasting example, using the Holt-Winters method. Other variable that also uses the trend estimation predictor is the oxygen saturation variable. An example for this variable is presented in Fig. 4(b). For this variable, the forecast horizon is 2 h and the worst case scenario is considered for the forecast points calculation. In this example, the minimum of the forecast values is 90.78, yielding the value of 1 for the forecast points. The heart rate, given its seasonality characteristics, uses the Holt-Winters additive method fed, with, at least, data about two days and a forecast hori- zon of two hours. Similarly to the other variables, the worst case scenario is used to calculate the forecast points. Since the optimal case (0 points) is in the middle of the dynamic range of the heart rate value, both maximum and minimum forecast values are used to determine the associated forecast points. Figure 5 shows an example of heart rate forecast, in which, forecasts can only be determined starting on the second season, when this season’s data is already
    316 R. Fernandesand J. Barbosa available. The smoothing parameters α, β and γ fed into the Holt-Winters method [7] were [0.5, 0.5, 0.5] respectively. The optimization yielded the values [0.61136, 0.00189, 0.65050] with an associated MAE of 4.12. Considering the forecast values, the minimum is 74.4983 and the maximum is 84.2434, thus, the forecast points associated with this example, according with Table 1, are 0 points. 5 Conclusions In this work, we present a solution using mobile applications and wearable devices for health data collection and analysis, with the purpose of generat- ing health status forecasting, for patients that may, or may not, be infected with the COVID-19 virus. To implement this solution, Docker containerization was used together with elements from the FIWARE platform to create the system’s architecture. A health context provider was implemented, to fetch data from the wearable device, as well as a data analysis module to make the forecasts. Three types of forecasts were devised, considering the variables characteris- tics, namely, last observation, trend estimation and Holt-Winters methodology. The obtained results so far are encouraging and future work can be done in the optimization of the forecast methods and in the inclusion of more data sources. Acknowledgements. This work was been supported by SMACovid-19 – Autonomous Monitoring System for Covid-19 (70078 – SMACovid-19). References 1. Candelieri, A., Zhang, W., Messina, E., Archetti, F.: Automated rehabilitation exer- cises assessment in wearable sensor data streams. In: 2018 IEEE International Con- ference on Big Data (Big Data), pp. 5302–5304 (2018). https://doi.org/10.1109/ BigData.2018.8621958 2. Hahanov, V., Miz, V.: Big data driven healthcare services and wearables. In: the Experience of Designing and Application of CAD Systems in Microelectronics, pp. 310–312 (2015). https://doi.org/10.1109/CADSM.2015.7230864 3. Docker. https://www.docker.com/resources/what-container. Accessed 5 May 2021 4. FIWARE. https://www.fiware.org/developers/. Accessed 5 May 2021 5. HL7 FHIR Standard. https://www.hl7.org/fhir/R4/. Accessed 5 May 2021 6. Fitbit Subscriptions Homepage. https://dev.fitbit.com/build/reference/web-api/ subscriptions/. Accessed 5 May 2021 7. Holt-Winters Definition. https://otexts.com/fpp2/holt-winters.html. Last accessed 5 May 2021. Accessed 5 May 2021
    Economic Burden ofPersonal Protective Strategies for Dengue Disease: an Optimal Control Approach Artur M. C. Brito da Cruz1,3 and Helena Sofia Rodrigues2,3(B) 1 Escola Superior de Tecnologia de Setúbal, Instituto Politécnico de Setúbal, Setúbal, Portugal artur.cruz@estsetubal.ips.pt 2 Escola Superior de Ciências Empresariais, Instituto Politécnico de Viana do Castelo, Valença, Portugal sofiarodrigues@esce.ipvc.pt 3 CIDMA - Centro de Investigação e Desenvolvimento em Matemática e Aplicações, Departamento de Matemática, Universidade de Aveiro, Aveiro, Portugal Abstract. Dengue fever is a vector-borne disease that is widely spread. It has a vast impact on the economy of countries, especially where the disease is endemic. The associated costs with the disease comprise pre- vention and treatment. This study focus on the impact of adopting individual behaviors to reduce mosquito bites - avoiding the disease’s transmission - and their associated costs. An epidemiological model is presented with human and mosquito compartments, modeling the inter- action of dengue disease. The model assumed some self-protection mea- sures, namely the use of repellent in human skin, wear treated clothes with repellent, and sleep with treated bed nets. The household costs for these protections are taking into account to study their use. We con- clude that personal protection could have an impact on the reduction of the infected individuals and the outbreak duration. The costs associated with the personal protection could represent a burden to the household budget, and its purchase could influence the shape of the infected’s curve. Keywords: Dengue · Economic burden · Personal protection · Household costs · Optimal control 1 Introduction Dengue is a mosquito-borne disease dispersed almost all over the world. The infection occurs when a female Aedes mosquito bites an infected person and then bites another healthy individual to complete its feed [10]. According to World Health Organization (WHO) [27], each year, there are about 50 million to 100 million cases of dengue fever and 500,000 cases of severe Supported by FCT - Fundação para a Ciência e a Tecnologia. c Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 319–335, 2021. https://doi.org/10.1007/978-3-030-91885-9_23
    320 A. M.C. Brito da Cruz and H. S. Rodrigues dengue, resulting in hundreds of thousands of hospitalizations and over 20,000 deaths, mostly among children and young adults. Several factors contribute to this public health problem, such as uncontrolled urbanization and increasing population growth. Problems like the increased use of non-biodegradable packaging coupled with nonexistent or ineffective trash collection services, or the growing up number of travel by airplane, allowing the constant exchange of dengue viruses between countries, reflect such factors. Besides, the financial and human resources are limited, leading to programs that emphasize emergency control in response to epidemics rather than on integrated vector management related to prevention [27]. These days, dengue is one of the most important vector-borne diseases and, globally, has heavy repercussions in morbidity and economic impact [1,28]. 1.1 Dengue Economic Burden Stanaway et al. [25] estimated dengue mortality, incidence, and burden for the Global Burden of Disease Study. In 2013, there were almost 60 million symp- tomatic dengue infections per year, resulting in about 10 000 deaths; besides, the number of symptomatic dengue infections more than doubled every ten years. Moreover, Shepard et al. [23] estimated a total annual global cost of dengue illness of US$8.9 billion, with a distribution of dengue cases of 18% admitted to hospital, 48% ambulatory, and 34% non-medical. Dengue cases have twofold costs, namely prevention and treatment. The increasing of dengue cases leads to the rise of health expenditures, such as outpa- tient, hospitalization, and drug administration called direct costs. At the same time, there are indirect costs for the economic operation, such as loss of pro- ductivity, a decrease in tourism, a reduction in foreign investment flows [14]. Most of the countries only bet on prevention programs to control the outbreaks. They focus on the surveillance program, with ovitraps or advertising campaigns regarding decrease the breeding sites of the mosquito. 1.2 Personal Protective Measures (PPM) Another perspective on dengue prevention is related to household prevention, where the individual has a central role in self-protection. PPM could be physical (e.g., bed nets or clothing) or chemical barriers (e.g., skin repellents). Mosquito nets are helpful in temporary camps or in houses near biting insect breeding areas. Care is needed not to leave exposed parts of the body in contact with the net, as mosquitoes bite through the net. However, insecticide-treated mosquito nets have limited utility in dengue control programs, since the vector species bite during the day, but treated nets can be effectively utilized to protect infants and night workers who sleep by day [18]. For particular risk areas or occupations, the protective clothing should be impregnated with allowed insecticides and used during biting insect risk periods.
    Economic Burden ofPersonal Protective Strategies for Dengue Disease 321 Repellents with the chemical diethyl-toluamide (DEET) give good protec- tion. Nevertheless, repellents should act as a supplementary to protective cloth- ing. The population also has personal expenses to prevent mosquito bites. There are on the market cans of repellent to apply on the body, clothing already impreg- nated with insect repellent, and mosquito bed nets to protect individuals when they are sleeping. All of these items have costs and are considered the household cost for dengue. The number of research papers studying the dengue impact dengue on the country’s economy is considerable [9,11,13,23,24,26]. However, there is a lack of research related to household expenditures, namely what each individual could spend to protect themselves and their family from the mosquito. There is some work available about insecticide bed nets [19] or house spraying operations [8], but there is a lack of studies using these three PPM in the scientific literature. Each person has to make decisions related to spending money for its protec- tion or risking catching the disease. This cost assessment can provide an insight to policy-makers about the eco- nomic impact of dengue infection to guide and prioritize control strategies. Reducing the price or attribute subsidies for some personal protection items could lead to an increase in the consumption of these items, and therefore, to prevent mosquito bites and the disease. This paper studies the influence of the price of personal protection to prevent dengue disease from the individual perspective. Section 2 introduces the optimal control problem, the parameter, and the variables that composed the epidemio- logical model. Section 3 analyzes the numerical results, and Sect. 4 presents the conclusions of the work. 2 The Epidemiological Model 2.1 Dengue Epidemiological Model This research is based on the model proposed by Rodrigues et al. [21], where it is studied the human and mosquito population through a mutually exclu- sive compartmental model. The human population is divided in the classic SIR- type epidemiological framework (Susceptible - Infected - Recovered), while the mosquito population only has two components SI (Susceptible - Infected). Some assumptions are assumed: both (humans and mosquitoes) are born susceptibles; there is homogeneity between host and vectors; the human population is con- stant (N = S + I + R), disregarding migrations processes; and seasonality was not considered, which could influence the mosquito population. The epidemic model for dengue transmission is given by the following sys- tems of ordinary differential equations for human and mosquito populations, respectively.
322 A. M. C. Brito da Cruz and H. S. Rodrigues

\[
\begin{cases}
\dfrac{dS(t)}{dt} = \mu_h N_h - \left( B\beta_{mh}\dfrac{I_m(t)}{N_h} + \mu_h \right) S(t)\\[6pt]
\dfrac{dI(t)}{dt} = B\beta_{mh}\dfrac{I_m(t)}{N_h}\, S(t) - (\eta_h + \mu_h)\, I(t)\\[6pt]
\dfrac{dR(t)}{dt} = \eta_h I(t) - \mu_h R(t)
\end{cases}
\tag{1}
\]

\[
\begin{cases}
\dfrac{dS_m(t)}{dt} = \mu_m N_m - \left( B\beta_{hm}\dfrac{I(t)}{N_h} + \mu_m \right) S_m(t)\\[6pt]
\dfrac{dI_m(t)}{dt} = B\beta_{hm}\dfrac{I(t)}{N_h}\, S_m(t) - \mu_m I_m(t)
\end{cases}
\tag{2}
\]

As expected, these systems depict the interactions of the disease between humans and mosquitoes and vice-versa. The parameters of the model are presented in Table 1.

Table 1. Parameters of the epidemiological model

Parameter | Description | Range | Used value | Source
N_h | Human population | – | 112000 | [12]
1/μ_h | Average lifespan of humans (in days) | – | 79 × 365 | [12]
B | Average number of bites on an unprotected person (per day) | – | 1/3 | [20,21]
β_mh | Transmission probability from I_m (per bite) | [0.25, 0.33] | 0.25 | [6]
1/η_h | Average infection period on humans (in days) | [4, 15] | 7 | [4]
1/μ_m | Average lifespan of adult mosquitoes (in days) | [8, 45] | 15 | [7,10,16]
N_m | Mosquito population | – | 6 × N_h | [22]
β_hm | Transmission probability from I_h (per bite) | [0.25, 0.33] | 0.25 | [6]

This model does not incorporate any PPM. Therefore, we need to add another compartment to the human population: the Protected (P) compartment. It comprises humans that are using personal protection and, therefore, cannot be bitten by mosquitoes and become infected.

A control variable u(·) is introduced in (1), representing the effort of taking PPM. Additionally, a new parameter is required, ρ, depicting the protection duration per day. Depending on the PPM used, this value is adequate
Economic Burden of Personal Protective Strategies for Dengue Disease 323

for the individual protection capacity [5]. To reduce computational errors, the differential equations were normalized, meaning that the proportions of each compartment of individuals in the population were considered, namely

\[
s = \frac{S}{N_h}, \quad p = \frac{P}{N_h}, \quad i = \frac{I}{N_h}, \quad r = \frac{R}{N_h} \quad \text{and} \quad s_m = \frac{S_m}{N_m}, \quad i_m = \frac{I_m}{N_m}.
\]

The model with control is defined by:

\[
\begin{cases}
\dfrac{ds(t)}{dt} = \mu_h - \left( 6B\beta_{mh}\, i_m(t) + u(t) + \mu_h \right) s(t) + (1-\rho)\, p(t)\\[6pt]
\dfrac{dp(t)}{dt} = u(t)\, s(t) - \left( (1-\rho) + \mu_h \right) p(t)\\[6pt]
\dfrac{di(t)}{dt} = 6B\beta_{mh}\, i_m(t)\, s(t) - (\eta_h + \mu_h)\, i(t)\\[6pt]
\dfrac{dr(t)}{dt} = \eta_h i(t) - \mu_h r(t)
\end{cases}
\tag{3}
\]

and

\[
\begin{cases}
\dfrac{ds_m(t)}{dt} = \mu_m - \left( B\beta_{hm}\, i(t) + \mu_m \right) s_m(t)\\[6pt]
\dfrac{di_m(t)}{dt} = B\beta_{hm}\, i(t)\, s_m(t) - \mu_m i_m(t)
\end{cases}
\tag{4}
\]

This set of equations is subject to the initial conditions [21]:

\[
s(0) = \frac{11191}{N_h}, \quad p(0) = 0, \quad i(0) = \frac{9}{N_h}, \quad r(0) = 0, \quad s_m(0) = \frac{66200}{N_m}, \quad i_m(0) = \frac{1000}{N_m}.
\tag{5}
\]

The mathematical model is represented by the epidemiological scheme in Fig. 1. Due to the impossibility of educating everyone to use PPM, namely because of budget constraints, time restrictions, or even low individual willingness to use PPM, it is considered that the control u(·) is bounded between 0 and u_max = 0.7.

2.2 Optimal Control Problem

In this section, an optimal control problem is proposed that portrays the costs associated with PPM and the restrictions of the epidemic. The objective functional is given by

\[
J(u(\cdot)) = R(T) + \int_0^T \gamma u^2(t)\, dt
\tag{6}
\]
324 A. M. C. Brito da Cruz and H. S. Rodrigues

Fig. 1. Flow diagram of the Dengue model with personal protection strategies

where γ is a constant that represents the cost of taking personal prevention measures per day and per person. At the same time, it is relevant that this functional also has a payoff term: the number of humans recovered from the disease at the final time, R(T). It is expected that the outbreak dies out and, therefore, the total of recovered individuals gives the cumulative number of persons infected by the disease. We want to minimize the total number of infected persons, and therefore the number of recovered persons at the final time, together with the expenses associated with buying protective measures.

We want to find the optimal value u* of the control u, such that the associated state trajectories s*, p*, i*, r*, s_m*, i_m* are solutions of Eqs. (3) and (4) with the following initial conditions:

\[
s(0) \geq 0, \quad p(0) \geq 0, \quad i(0) \geq 0, \quad r(0) \geq 0, \quad s_m(0) \geq 0, \quad i_m(0) \geq 0.
\tag{7}
\]

The set of admissible controls is

\[
\Omega = \left\{ u(\cdot) \in L^{\infty}[0,T] : 0 \leq u(\cdot) \leq u_{max}, \ \forall t \in [0,T] \right\}.
\]

The optimal control problem consists of finding (s*(·), p*(·), i*(·), r*(·), s_m*(·), i_m*(·)) associated with an admissible control u*(·) ∈ Ω on the time interval [0, T] that minimizes the objective functional. Note that the cost function is L² and the integrand is convex with respect to u. Furthermore, the control system is Lipschitz with respect to the state variables and, therefore, an optimal control exists [3].

The Hamiltonian function is

\[
\begin{aligned}
H &= H\left(s(t), p(t), i(t), r(t), s_m(t), i_m(t), \Lambda, u(t)\right)\\
&= \gamma u^2(t) + \lambda_1 \left( \mu_h - (6B\beta_{mh} i_m(t) + u(t) + \mu_h)\, s(t) + (1-\rho)\, p(t) \right)\\
&\quad + \lambda_2 \left( u(t)\, s(t) - ((1-\rho) + \mu_h)\, p(t) \right)
 + \lambda_3 \left( 6B\beta_{mh} i_m(t)\, s(t) - (\eta_h + \mu_h)\, i(t) \right)\\
&\quad + \lambda_4 \left( \eta_h i(t) - \mu_h r(t) \right)
 + \lambda_5 \left( \mu_m - (B\beta_{hm} i(t) + \mu_m)\, s_m(t) \right)\\
&\quad + \lambda_6 \left( B\beta_{hm} i(t)\, s_m(t) - \mu_m i_m(t) \right)
\end{aligned}
\]
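As a side note, the adjoint system stated in Theorem 1 below can be recovered mechanically from this Hamiltonian as λ_i′ = −∂H/∂(state). A small sketch with sympy, where the symbol names are our own choice and not taken from the paper's code:

```python
# Sketch: derive the adjoint equations lambda_i' = -dH/d(state) from the
# Hamiltonian above with sympy (assumed symbol names).
import sympy as sp

s, p, i, r, sm, im, u = sp.symbols("s p i r s_m i_m u")
l1, l2, l3, l4, l5, l6 = sp.symbols("lambda_1:7")
gamma, rho, B, bmh, bhm, mu_h, mu_m, eta_h = sp.symbols(
    "gamma rho B beta_mh beta_hm mu_h mu_m eta_h"
)

H = (gamma * u**2
     + l1 * (mu_h - (6 * B * bmh * im + u + mu_h) * s + (1 - rho) * p)
     + l2 * (u * s - ((1 - rho) + mu_h) * p)
     + l3 * (6 * B * bmh * im * s - (eta_h + mu_h) * i)
     + l4 * (eta_h * i - mu_h * r)
     + l5 * (mu_m - (B * bhm * i + mu_m) * sm)
     + l6 * (B * bhm * i * sm - mu_m * im))

# Adjoint equations: lambda_i' = -dH/d(state variable)
for state in (s, p, i, r, sm, im):
    print(f"lambda'_{state} =", sp.simplify(-sp.diff(H, state)))

# Stationarity of H in u gives the unconstrained candidate for u*(t)
print("u* candidate:", sp.solve(sp.diff(H, u), u))
```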
Economic Burden of Personal Protective Strategies for Dengue Disease 325

Pontryagin's Maximum Principle [17] states that if u*(·) is an optimal control for Eqs. (3)–(6) with fixed final time, then there exists a nontrivial absolutely continuous mapping, the adjoint vector

\[
\Lambda : [0,T] \to \mathbb{R}^6, \qquad \Lambda(t) = \left(\lambda_1(t), \lambda_2(t), \lambda_3(t), \lambda_4(t), \lambda_5(t), \lambda_6(t)\right),
\]

such that

\[
s' = \frac{\partial H}{\partial \lambda_1}, \quad p' = \frac{\partial H}{\partial \lambda_2}, \quad i' = \frac{\partial H}{\partial \lambda_3}, \quad r' = \frac{\partial H}{\partial \lambda_4}, \quad s_m' = \frac{\partial H}{\partial \lambda_5} \quad \text{and} \quad i_m' = \frac{\partial H}{\partial \lambda_6},
\]

and where the optimality condition

\[
H\left(s^*(t), p^*(t), i^*(t), r^*(t), s_m^*(t), i_m^*(t), \Lambda(t), u^*(t)\right) = \min_{0 \leq u \leq u_{max}} H\left(s^*(t), p^*(t), i^*(t), r^*(t), s_m^*(t), i_m^*(t), \Lambda(t), u(t)\right)
\]

and the transversality conditions

\[
\lambda_i(T) = 0, \ i = 1,2,3,5,6 \quad \text{and} \quad \lambda_4(T) = 1
\tag{8}
\]

hold almost everywhere in [0, T].

The following theorem follows directly from applying Pontryagin's maximum principle to the optimal control problem.

Theorem 1. The optimal control problem with fixed final time T defined by Eqs. (3)–(6) has a unique solution (s*(t), p*(t), i*(t), r*(t), s_m*(t), i_m*(t)) associated with the optimal control u*(·) on [0, T], given by

\[
u^*(t) = \max\left\{ 0, \ \min\left( \frac{(\lambda_1(t) - \lambda_2(t))\, s^*(t)}{2\gamma}, \ u_{max} \right) \right\}
\tag{9}
\]

with the adjoint functions satisfying

\[
\begin{cases}
\lambda_1'(t) = \lambda_1 \left( 6B\beta_{mh} i_m^*(t) + u^*(t) + \mu_h \right) - \lambda_2 u^*(t) - \lambda_3\, 6B\beta_{mh} i_m^*(t)\\[4pt]
\lambda_2'(t) = -\lambda_1 (1-\rho) + \lambda_2 \left( (1-\rho) + \mu_h \right)\\[4pt]
\lambda_3'(t) = \lambda_3 (\eta_h + \mu_h) - \lambda_4 \eta_h + (\lambda_5 - \lambda_6)\, B\beta_{hm} s_m^*(t)\\[4pt]
\lambda_4'(t) = \lambda_4 \mu_h\\[4pt]
\lambda_5'(t) = \lambda_5 \left( B\beta_{hm} i^*(t) + \mu_m \right) - \lambda_6 B\beta_{hm} i^*(t)\\[4pt]
\lambda_6'(t) = (\lambda_1 - \lambda_3)\, 6B\beta_{mh} s^*(t) + \lambda_6 \mu_m
\end{cases}
\tag{10}
\]

3 Numerical Results

This section presents the results of the numerical implementation of control strategies for PPM for dengue disease.
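Optimality systems of this kind are typically solved with a forward-backward sweep, which is the family of schemes described in the following paragraphs (the paper uses a forward-backward fourth-order Runge-Kutta method). The compact sketch below uses plain Euler steps for brevity; the step count, relaxation weight and tolerance are assumptions, and f_state and f_adjoint stand for user-supplied right-hand sides implementing (3)–(4) and (10).

```python
# Compact sketch of a forward-backward sweep for the optimality system:
# state forward, adjoint backward, control update via (9). Plain Euler steps
# and the numerical settings below are simplifications for illustration only.
import numpy as np

def sweep(f_state, f_adjoint, y0, gamma, u_max=0.7, T=365.0, n=8760, tol=1e-6):
    dt = T / n
    u = np.zeros(n + 1)                          # initial control guess u(t) = 0
    y = np.zeros((n + 1, len(y0)))               # states (s, p, i, r, s_m, i_m)
    lam = np.zeros((n + 1, 6))                   # adjoints lambda_1..lambda_6
    lam_T = np.array([0, 0, 0, 1, 0, 0], float)  # transversality conditions (8)
    for _ in range(200):                         # sweep iterations
        y[0] = y0                                # forward pass: state equations
        for k in range(n):
            y[k + 1] = y[k] + dt * np.asarray(f_state(y[k], u[k]))
        lam[-1] = lam_T                          # backward pass: adjoint equations
        for k in range(n, 0, -1):
            lam[k - 1] = lam[k] - dt * np.asarray(f_adjoint(y[k], lam[k], u[k]))
        # Control update from the optimality condition (9), with relaxation.
        u_new = np.clip((lam[:, 0] - lam[:, 1]) * y[:, 0] / (2 * gamma), 0.0, u_max)
        if np.max(np.abs(u_new - u)) < tol:
            break
        u = 0.5 * (u + u_new)
    return y, lam, u
```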
    326 A. M.C. Brito da Cruz and H. S. Rodrigues For the problem resolution, the time frame considered was one year, and the parameters used are available on Table 1. To obtain the computational results, Theorem 1 was implemented numeri- cally on MATLAB version R2017b and the extremal found, u∗ , was evaluated using a forward-backward fourth-order Runge-Kutta method with a variable time step for efficient computation (see [15] for more details). For the differen- tial equations related to the state variables, (3)–(4) it was performed a forward method using the initial conditions (5). For the differential equations related to adjoint variables, it was applied a backward system using the transversality conditions (8). Figure 2 shows the evolution of the human state variables during the whole year, without any control. This analysis is a meaningful kickoff point to make a fair comparison with the application of the PPM. It should also be mentioned that this model does not take into account deaths by dengue, because in the considered region of the study [21], fortunately, nobody died, and there were no cases of hemorrhagic fever. Fig. 2. Evolution of human state variables when human population did not take any protective measures For this research, three protective measures were considered: insect repellent, bed nets, and clothes impregnated with insect repellent. It was obtained the average price of each product on the market as well as how long each product stays active. The following simulations are divided into two subsections: the first one, using only one PPM, and the second one, where several PPM are combined. Each PPM has two factors associated: cost and durability. In our model, the parameter γ is the cost that each individual spends to protect themself during
    Economic Burden ofPersonal Protective Strategies for Dengue Disease 327 the whole year, and the parameter ρ determines how long the protection stays active. The individual is responsible for deciding the best strategy and how much he/she will spend on protective measures. Both constants are listed in Table 2. For example, the cost of a spray can of insect repellent is 10A C, and it only lasts a month, the reason why γ = 10×12 365×Nh . Furthermore, each application of insect repellent only lasts four hours, hence ρ as the value 1 6 . Therefore, this analysis is twofold purposes: to understand the impact in the curve of infected individuals when distinct personal protective measures are used; and to find the economic burden of these measures on each person. Table 2. Parameters associated with the control Scenario Control Cost (γ) Protection (ρ) No control None 0 0 Single control A Skin repellent 10×12 365×Nh 1 6 B Bed net 20 365×Nh 1 3 C Insecticide-treated clothes 30×6 365×Nh 1 2 Combined control D Skin repellent+Bed net 10×12+20 365×Nh 1 2 E Skin repellent+Insecticide-treated clothes 10×12+30×6 365×Nh 2 3 F Bed net+Insecticide-treated clothes 20+30×6 365×Nh 5 6 G All 10×12+20+30×6 365×Nh 1 3.1 Single Control In this section, it is presented the results relative to the application of one single control. Only one of the protective measures, skin repellent, bed nets, or clothes impregnated with insecticide, is taken and only used once per day. In scenario A (see Fig. 3), the case associated with the use of skin repellent, is the case where more people get infected. About 85% of the population gets infected, and that’s probably why, in the single control scenarios, this is the case where the control starts to decrease sooner. A reasonable argument why this happens is because the protective measure is not being very effective. Note that this is the case where people are less protected. Scenarios B (Fig. 4) and C (Fig. 5) have similar behavior of the control func- tion; however, scenario C is the first case where the number of protected persons is larger than the number of susceptible persons, at least most of the year. Note that the cost of using treated clothes is much higher than using bed nets. However, there is not much difference between the total number of infected persons using bed nets instead of treated clothes, about 500 more infected per- sons.
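Referring back to Table 2, the cost parameter γ and the protection fraction ρ for every scenario follow directly from the listed prices and durations. A small sketch of that arithmetic, with Nh = 112000 as in Table 1 (the scenario labels mirror Table 2; the output format is ours):

```python
# Evaluate gamma (annual spend per person divided by 365*Nh, as in Table 2)
# and rho for each scenario; prices and durations are those of Table 2.
Nh = 112000
scenarios = {
    #     annual spend per person (EUR),  rho
    "A": (10 * 12,                        1 / 6),   # skin repellent
    "B": (20,                             1 / 3),   # bed net
    "C": (30 * 6,                         1 / 2),   # insecticide-treated clothes
    "D": (10 * 12 + 20,                   1 / 2),   # repellent + bed net
    "E": (10 * 12 + 30 * 6,               2 / 3),   # repellent + treated clothes
    "F": (20 + 30 * 6,                    5 / 6),   # bed net + treated clothes
    "G": (10 * 12 + 20 + 30 * 6,          1.0),     # all three
}
for name, (annual_cost, rho) in scenarios.items():
    gamma = annual_cost / (365 * Nh)
    print(f"{name}: gamma = {gamma:.3e}, rho = {rho:.2f}")
```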
    328 A. M.C. Brito da Cruz and H. S. Rodrigues Fig. 3. Scenario A - Control used: skin repellent Fig. 4. Scenario B - Control used: bed nets Fig. 5. Scenario C - Control used: insecticide-treated clothes
    Economic Burden ofPersonal Protective Strategies for Dengue Disease 329 3.2 Combined Control Another approach to household protection is the combination of more than one protective measure. Wearing or using more personal protections significantly reduces the number of infected persons. The combined use of insect repellent and bed nets, scenario D (Fig. 6), have the same value of ρ as scenario C and, therefore, the human state variables and control function of these cases have similar behavior. The big difference between both scenarios, C and D, is the cost of the protections, which will be discussed in the following subsection. Fig. 6. Scenario D - Combined control (insect repellent and bed nest) In scenarios E (Fig. 7), F (Fig. 8), and G (Fig. 9), the number of infected people is quite inferior to previous cases. The final number of recovered persons drops down at least to almost half the population. The control function keeps increasing the number of days staying at maximum control until scenario F. This seems to happen because the number of infected persons in this scenario reduces significantly. Fig. 7. Scenario E - Combined control (skin repellent and Insecticide-treated clothes) In scenario F, a large number of persons have personal protection. The max- imum number of infected persons on a day is 99. In this scenario, the human
    330 A. M.C. Brito da Cruz and H. S. Rodrigues state variables have completely different behavior than the previous scenarios. This amount of protection has an impact on the epidemy. Fig. 8. Scenario F - Combined control (bed net and Insecticide-treated clothes) Scenario G is almost utopic since the individual has complete protection every hour of the day. Right from the start, 70% of the population is protected, and in 41 days, there will not be more infections with dengue disease. Overall costs decrease because people do not have to apply personal protections most of the year. Fig. 9. Scenario G - Combined control (skin repeelent, bed net, insecticide-treated clothes) 3.3 Cost of the Control Control measures associated with personal protection were analyzed, separately and combined, to achieve the best strategy. Table 3 illustrates the differences of all scenarios, and several relevant values were scrutinized to understand the dynamics of human state variables. The peak of the infection, the maximum
Economic Burden of Personal Protective Strategies for Dengue Disease 331

number of infected persons in a day, and the day when that happens are pointed out to perceive the total number of infected individuals; this information is crucial to prepare human and medical resources for an outbreak in advance. The epidemic's end is considered the day when the number of infected persons is smaller than 9, which corresponds to the initial value of infected persons in these simulations. Finally, the last two columns are concerned with the functional cost. R(T) represents the number of recovered persons on the last day of the contemplated year and thus gives the total number of infected people during the whole year. The cost of the personal protection measures stands for the value that each person would have to spend during the whole year.

The maximum value of infected people on a single day occurs, as expected, when there are no personal protection measures. The minimum value is 81, where people are fully protected, but scenario F, with ρ = 5/6, has only a few more infected persons at its peak, 99. Strategies C and D have similar results regarding the peak of the infection, according to the respective figures presented in the previous subsections. However, the cost of using only insecticide-treated clothes (scenario C) is 26% higher when compared with the combined control insect repellent and bed net (scenario D).

In Table 3, one can see that with more protection measures, the maximum number of infected people in a single day decreases. However, the day when this peak is achieved is surprising. This peak is reached on the 35th day when there isn't any control, and it keeps being reached later until scenario E, on the 81st day. In these scenarios, from no control until scenario E, flattening the curve of infected people makes the duration of the epidemic last longer. Both columns, Epidemic's end and R(T), illustrate this fact. In the last two scenarios, because a large part of the population is using protection almost all day, the total number of infected people drastically decreases. In these two cases, the human state variables have different behavior from the other scenarios.

Table 3. Parameters associated with the control

Scenario | ρ | Peak of infected persons | Peak's day | Epidemic's end (day) | R(T) | Cost
No control | 0 | 1912 | 35 | 120 | 10878 | 0
A | 1/6 | 863 | 58 | 202 | 9571 | 62.5
B | 1/3 | 710 | 63 | 223 | 9083 | 13
C | 1/2 | 510 | 72 | 260 | 8177 | 115.7
D | 1/2 | 510 | 72 | 261 | 8176 | 91.3
E | 2/3 | 257 | 81 | 336 | 6057 | 206.1
F | 5/6 | 99 | 6 | 221 | 1349 | 135.3
G | 1 | 81 | 3 | 20 | 120 | 8.7
    332 A. M.C. Brito da Cruz and H. S. Rodrigues Another perspective of the outbreak is to present all curves of infected humans (Fig. 10). The adoption of any PPM decreases the peak of the infected, which could influence the medical care of the patients. Higher values of ρ mean less number of infected people. Inversely, the epidemic’s end and its peak happen later for higher values of ρ (except the last two cases), which could be explained by the flattening of the curve of infected persons. Single control scenarios Combined control scenarios Fig. 10. Infected people - all scenarios Control functions on all scenarios tend to stay at maximum control value for most of the year at study. However, in scenario A, the control function starts to drop down from the maximum value much sooner than in all other scenarios. A possible explanation seems to be because, on the 200th day, most people already were infected by dengue. Therefore, there is no need for any more protection. Again on scenarios F and G, due to the end of epidemy’s is reached sooner, the control function also starts to decrease sooner when comparing it with the remaining scenarios (see Fig. 11). Single control scenarios Combined control scenarios Fig. 11. Control variable - all scenarios
    Economic Burden ofPersonal Protective Strategies for Dengue Disease 333 Protective measures have varied prices, and the duration that each one lasts is different. Scenarios F and G seem to prove that the use of very efficient protection from the start of the outbreak, ends it sooner. Furthermore, and possibly not expected, each person in these scenarios does not spend as much money as in other cases. For instance, scenario F appears to be a much better solution than scenario E, not only for personal cost but also to fight the disease. 4 Conclusions Effective dengue prevention requires a coordinated and sustainable approach. This way, it is possible to address environmental, behavioral, and social con- texts of disease transmission. Determining household expenditure related to bite prevention is a crucial purpose for acting in advance of an outbreak. With an efficient cost analysis, each individual could make a trade-off decision between the available resources. This work is the follow up work that the authors started in [2]. In here, the research tried to answer the following question: what is the best strategy to fight dengue disease and, at the same time, spending less money? The results corroborated that if one person is more protected, then the disease spreads less. Long protection flattens the infected human curve, which is a good perspective from the medical point of view. Hence, the medical staff could have more time to prepare the outbreak response, and the Heath Authorities could acquire more supplies to treat the patients. However, it was found that does not mean that the person should spend more. As an example, the protection used with skin repellent and bed net has the same impact on the number of infected individuals when compared with the insecticide-treated clothes, but it is much cheaper. These results also reflect the importance of using personal protective mea- sures for most part of the day. Half or even more of the population will not be infected if at least 16 h a day a person stays protected. As future work, it will be interesting to study the economic impact of strate- gies related to prevention - including self-protect measures - and treatment - that could including costs with drug administration, hospitalization, or even sick days losses, focusing only on the individual perspective. Another perspective to be implemented is to analyze the economic evaluation of prevention. The adop- tion of some prevention activities, from the individual point of view, could have a profound impact on the government budget allocated to the disease’s treat- ment. Thus could be critical to investigate the financial support of preventive measures. Acknowledgement. This work is supported by The Center for Research and Develop- ment in Mathematics and Applications (CIDMA) through the Portuguese Foundation for Science and Technology (FCT - Fundação para a Ciência e a Tecnologia), references UIDB/04106/2020 and UIDP/04106/2020.
    334 A. M.C. Brito da Cruz and H. S. Rodrigues References 1. Bhatt, S., et al.: The global distribution and burden of dengue. Nature 496, 504– 507 (2013) 2. Brito da Cruz, A.M., Rodrigues, H.S.: Personal protective strategies for dengue disease: simulations in two coexisting virus serotypes scenarios. Math. Comput. Simul. 188, 254–267 (2021) 3. Cesari, L.: Optimization-Theory and Applications. Springer, New York (1983) 4. Chan, M., Johansson, M.A.: The incubation periods of dengue viruses. PLoS One 7(11), e50972 (2012) 5. Demers, J., Bewick, S., Calabrese, J., Fagan, W.F.: Dynamic modelling of personal protection control strategies for vector-borne disease limits the role of diversity amplification. J. R. Soc. Interface 15, 20180166 (2018) 6. Focks, D.A., Brenner, R.J., Hayes, J., Daniels, E.: Transmission thresholds for dengue in terms of Aedes aegypti pupae per person with discussion of their utility in source reduction efforts. Am. J. Trop. Med. Hyg. 62, 11–18 (2000) 7. Focks, D.A., Haile, D.G., Daniels, E., Mount, G.A.: Dynamic life table model for Aedes aegypti (Diptera: Culicidae): analysis of the literature and model develop- ment. J. Med. Entomol. 30, 1003–1017 (1993) 8. Goodman, C.A., Mnzava, A.E.P., Dlamini, S.S., Sharp, B.L., Mthembu, D.J., Gumede, J.K.: Comparison of the cost and cost-effectiveness of insecticide-treated bednets and residual house-spraying in KwaZulu-Natal, South Africa. Trop. Med. Int. Health 6(4), 280–95 (2001) 9. Hariharana, D., Dasb, M.K., Sheparda, D.S., Arorab, N.K.: Economic burden of dengue illness in India from 2013 to 2016: a systematic analysis. Int. J. Infect. Dis. 84S, S68–S73 (2019) 10. Harrington, L.C., et al.: Analysis of survival of young and old Aedes aegypti (Diptera: Culicidae) from Puerto Rico and Thailand. J. Med. Entomol. 38, 537–547 (2001) 11. Hung, T.M., et al.: The estimates of the health and economic burden of dengue in Vietnam. Trends Parasitol. 34(10), 904–918 (2018) 12. INE. Statistics Portugal. http://censos.ine.pt. Accessed 5 Apr 2020 13. Laserna, A., Barahona-Correa, J., Baquero, L., Castañeda-Cardona, C., Rosselli, D.: Economic impact of dengue fever in Latin America and the Caribbean: a sys- tematic review. Rev. Panam Salud Publica 42(e111) (2018) 14. Lee, J.S., et al.: A multi-country study of the economic burden of dengue fever based on patient-specific field surveys in Burkina Faso, Kenya, and Cambodia. PLoS Negl. Trop. Dis. 13(2), e0007164 (2019) 15. Lenhart, C.J., Workman, J.T.: Optimal Control Applied to Biological Models. Chapman Hall/CRC, Boca Raton (2017) 16. Maciel-de-Freitas, R., Marques, W.A., Peres, R.C., Cunha, S.P., Lourenço-de- Oliveira, R.: Variation in Aedes aegypti (Diptera: Culicidae) container productivity in a slum and a suburban district of Rio de Janeiro during dry and wet seasons. Mem. Inst. Oswaldo Cruz 102, 489–496 (2007) 17. Pontryagin, L.S., Boltyanskii, V.G., Gamkrelidze, R.V., Mishechenko, E.F.: The Mathematical Theory of Optimal Processes VIII + 360. Wiley, New York/London (1962) 18. Public Health Agency of Canada: Statement on Personal Protective Measures to Prevent Arthropod Bites. Canada Communicable Disease Report 38 (2012)
    Economic Burden ofPersonal Protective Strategies for Dengue Disease 335 19. Pulkki-Brännström, A.M., Wolff, C., Brännström, N., Skordis-Worrall, J.: Cost and cost effectiveness of long-lasting insecticide-treated bed nets - a model-based analysis. Cost Effectiveness Resour. Allocat. 10(5), 1–13 (2012) 20. Rocha, F.P., Rodrigues, H.S., Monteiro, M.T.T., Torres, D.F.M.: Coexistence of two dengue virus serotypes and forecasting for Madeira Island. Oper. Res. Health Care 7, 122–131 (2015) 21. Rodrigues, H.S., Monteiro, M.T.T., Torres, D.F.M., Silva, A.C., Sousa, C., Con- ceição, C.: Dengue in madeira island. In: Bourguignon, J.-P., Jeltsch, R., Pinto, A.A., Viana, M. (eds.) Dynamics, Games and Science. CSMS, vol. 1, pp. 593–605. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-16118-1 32 22. Rodrigues, H.S., Monteiro, M.T.T., Torres, D.F.M., Zinober, A.: Dengue disease, basic reproduction number and control. Int. J. Comput. Math. 89(3), 334–346 (2012) 23. Shepard, D., Undurraga, E.A., Halasa, Y., Stanaway, J.D.: The global economic burden of dengue: a systematic analysis. Lancet - Infect. Dis. 16(8), 935–941 (2016) 24. Sher, C.Y., Wong, H.T., Lin, Y.C.: The impact of dengue on economic growth: the case of Southern Taiwan. Int. J. Environ. Res. Public Health 17(3), 750 (2020) 25. Stanaway, J.D., et al.: The global burden of dengue: an analysis from the global burden of disease study 2013. Lancet Infect. Dis. 16(6), 712–23 (2016) 26. Suaya, J.A., et al.: Cost of dengue cases in eight countries in the Americas and Asia: a prospective study. Am. J. Trop. Med. Hyg. 80, 846–855 (2009) 27. World Health Organization: Managing regional public goods for health: Community-based dengue vector control. Asian Development Bank and World Health Organization, Philippines (2013) 28. World Health Organization: A toolkit for national dengue burden estimation. WHO, Geneva (2018)
    ERP Business Speed– A Measuring Framework Zornitsa Yordanova(B) University of National and World Economy, 8mi dekemvri, Sofia, Bulgaria zornitsayordanova@unwe.bg Abstract. A major business problem nowadays is how adequate and appropriate the ERP system used is for growing business needs, the ever-changing complexity of the business environment, the growing global scale competition, and as a result - the growing speed of business. The purpose of this research is to make the bench- mark for an ERP system business speed (performance from a business operations point of view), i.e. how fast the business operations executed are with the used ERP. A literature analysis is conducted for framing and defining the concept of ERP business speed as an important and crucial factor for business success. Then a measurement framework for testing ERP systems in terms of business operations and ERP business performance is proposed and tested. Metrics for measuring ERP business speed are defined by conducting focused interviews with experts in ERP implementation and maintenance. The measurement framework has been empir- ically tested in 6 business organizations, first to validate the selected KPIs and second to formulate some average business speed indications of the KPIs as part of the business speed of ERP. The research contributes by providing a framework measurement tool for testing business speed of ERP systems. The study can also serve as a benchmark in further measuring of ERP business speed. From theory perspective, this study provides a definition and an explanation of the term ERP business speed for the first time in the literature. Keywords: ERP · Enterprise information systems · Enterprise software · Business speed · MIS · Measurement tool 1 Introduction Enterprise resource planning (ERP) is an integrated management approach for organi- zations that aims at encompassing all enterprise activities into a system process moving together [1]. Nowadays these systems are usually integrated into an information system because of the large scope of business, huge number of transactions, complex inter- nal processes, competiveness and optimization efforts for achieving higher margin [2]. Enterprise resource planning systems are integrated software packages, including all necessary ingredients that would support the smooth work flow and process with a common database [3]. They are used for integration and automation of the processes, performance improvements, and cost reduction of all enterprise activities [4]. They are considered as powerful tool for enterprise management and information and data man- agement for reducing human error, speed the flow of information, and improve overall © Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 336–344, 2021. https://doi.org/10.1007/978-3-030-91885-9_24
    ERP Business Speed– A Measuring Framework 337 decision making throughout the organization [5]. Over the last two decades, ERP sys- tems have become one of the most significant and expensive implementations in the information technology business [6]. Because of their impotence for businesses and wide-spreading amongst enterprises, any kind of new technology or business issue has been shortly incorporated within ERPs [4] and they have become an integral part of enterprises addressing changing environment, the increasing market requirements and increased data needs of enterprises. Yet, ERP requires large amount of investment, time, resources and efforts and a potential failure is a risky factor [7] that many companies consider as frightening and still feasible [8]. However, at the beginning of the third decade of 21st century, the question for the benefit from implementing ERP system is not relevant anymore. Now the question is not how fast, easy and effective to implement an ERP, but rather it is how the already implemented ERP serve the organizations’ needs and how fast it answers to their business speed [9]. The research question of this study is what is ERP business speed and how it could be measured. 2 Theoretical Background 2.1 ERP Systems and Their Significance for Businesses ERP systems have been designed to address the fragmentation and decomposition of information transaction by transaction across an enterprise’s business, and also to inte- grate with intra- and inter- enterprise information [10]. They are usually organized in complex software packages which integrate data, information and business process across functional areas of business [11]. ERP systems are customer-tailored and usually customized into a unique system that fulfil individual and specific needs [12]. ERP sys- tems aims at covering the full range of functionality needed by organizations and this is considered as their main benefit [13]. Implementation of an ERP system across an organization takes time, money and a lot of efforts, including internal resources [14]. That is why organizations have been trying hard to find out and come up with factors that can help them to succeed in their implementation of ERP systems [15]. There are a lot of researches examining ERP implementation failure factors [16] and project management approaches [17] so to optimize these implementations and ERP utilization. Nowadays, ERP researches are focused more on ERP transformation according to the changes [18]. ERP are closely tighten with business and [19] defines ERP is an institutionalized fac- tor and component of enterprise infrastructure and development. Davenport [20] stated even 20 years ago that the business world embraced ERP and that might be the most important development in the corporate use of information technology came from the 90s. But the science literature provides many proves that still it is important in 2020 [21]. ERP systems are crucial for business from operational point of view and functional capabilities [22] but also are considered as critically needed for top management who reckon them as an important tool for both strategic and operative purposes [23]. For some organizations, ERP is a mandatory factor for operating [24]. ERP systems benefit organizations by improving the quality of service and increasing efficiency [25]. ERP systems have transformed the functioning of firms with regard to higher efficiency and accuracy in their operations [7]. Poston and Grabski [21] examined enterprises
    338 Z. Yordanova beforeand after adoption and found out a significant cost reduction. Hunton et al. [22] did an experiment with financial analysts, researching whether investors consider ERP implementation enhances firm value. The results showed positive effects from such implementations even though that positive assumption was not measurable. According Hedman and Kalling [23] main benefit from ERP and enterprise information systems at all as well as factors that generate profit are the activation of the resource and activities, and the quality and cost of the offering in the light of competition. Generally, from business point of view, computing and ERP provides great additional business value to business [26] and most of the analysed studies give arguments about this cliché statement. Therefore, the focus of the current research falls on a relatively new aspect of business and ERP that extend these systems’ significance from business point of view. This new aspect is business speed and ERP business speed in particular. 2.2 Business Speed Requirements Esteves and Pastor [25] researched ERP lifecycle and distinguish 6 phases of their usage and utilization: decision, acquisition, implementation, use and maintenance, evolution, and retirement phase. Nowadays, most of the implemented ERP systems have already reached to the retirement phase because of new technologies [26] or because the currently used ERP system or processes have become inadequate to the business’ needs [27]. As a result, the ERP implementation is not a factor, a risk, a question or a doubt anymore. After moving through all of these phases, acquisitions, deployment, use, maintenance and evolution, the business’s interest in these already proven information systems is directed at their business speed addressing real business goals. ERP business speed is a novel term, first presented in this research. It should not be mixed up with economic speed or business speed in general not with technology or technology adoption speed. ERP business speed will be discussed in the next sections of the study. It is based on the concept of business speed as it is increasingly important for businesses. Business speed has already been met in the literature, but its meaning was never fully defined and it was used as word combination rather than as a term [28]. The word combination was used in different context. Bill Gates [29] used business speed in his first book with meaning of the speed of business thought, how business, leaders and technology should act together and where business goes to. His usage of business speed is related to business direction and increasing the business cycles and changes in general. Pulendran, Speed and Widing [30] use the word combination to explain business performance. Meyer and Davis [31] used business speed from the prospective of speed of change in the connected economy. For many researchers, business speed is modelling a business approach that boost business [32]. For Gregory and Rawling [33] business speed is related to time-based strategy, time compression and business performance. Miller [34] was amongst the first ones linking business speed to information systems and addressed business speed as acceleration of business via implementing SAP. Kaufmann and Wei [35] examine business speed as commerce speed. Some authors have tried to connect business speed with the startup ecosystem and some methods for startup management as Lean startup method [36]. 
However, the linkage between ERP and business speed has not yet been researched.
3 Research Design

To address the challenging agenda of establishing a new framework at the intersection of business and information technology, namely the business speed of ERPs, the methodology used is simple and calls for further exploration. In this research, a questionnaire for assessing ERP business speed was formulated after an analysis of the scientific advancements in the literature to date. The questionnaire was built around key performance indicators (KPIs) that are important for businesses and are predominantly performed within ERPs. It was created with the support of subject matter experts in ERP implementation who took part in the research as consultants. Three consultants of different systems joined the design of the framework, with experience in implementing ERPs from brands such as SAP, Microsoft, Infor, Epicor and Oracle. After the questionnaire was formulated and all the experts had fully agreed on the top KPIs, it was empirically tested for relevance in six firms in order to be validated. A second goal of the empirical testing was to obtain a benchmark for the formulated KPIs. The companies that tested the ERP business speed framework were of different sectors and sizes and do not represent a homogeneous group for quantitative results. This approach was chosen in order to make the framework universal. The results from the tests are summarized by their average values and are presented in the results section in table format for clearer use by readers. The testers were key users from all departments of the organizations where the implementations took place.

4 Results and Discussion: Benchmark of ERP Business Speed by Operations

After testing the ERP business speed framework within the daily business operations of six companies, the results were first summarized, then averaged, and are presented here according to their business operation affiliation in separate table sections (Table 1).
Table 1. ERP business speed benchmark

Business operation | Time (avg.) | Time (max) | Time (min) | Time required now

Account receivable
Creating a Sales order with 5 lines | 2 min | 3 min | 2 min | 1 min
Confirmation of a Sales order with 5 lines | 15 s | 40 s | 10 s | 5 s (Automatically generated based on sales order)
Creating a Picking list with 5 lines | 30 s | 1 min | 20 s | 5 s (Automatically generated based on sales order)
Physical picking of goods with Mobile device integrated with the ERP | 10 min | 20 min | 5 min | 2 min
Confirmation of a Picking list with 5 lines | 15 s | 40 s | 10 s | 2 s (Automatic process integrated with mobile devices)
Creating a Packing list with 5 lines | 30 s | 1 min | 20 s | 5 s (Automatically generated based on Picking list)
Creating an invoice with 5 lines | 30 s | 1 min | 20 s | 5 s (Automatically generated based on Picking list)
Creating a customer | 30 s | 1 min | 25 s | 30 s

Account payable
Creating a Purchase order with 5 lines | 2 min | 3 min | 2 min | 1 min
Confirmation of a Purchase order with 5 lines | 15 s | 40 s | 10 s | 5 s (Automatically generated based on Purchase order)
Creating a Product receipt Note with 5 lines | 30 s | 1 min | 20 s | 5 s (Automatically generated based on sales order)
Register a Vendor invoice with 5 lines and with additional cost | 5 min | 10 min | 4 min | 2 min

General Ledger
Creating an automatic cost allocations based on predefined rules | - (None) | - (None) | - (None) | 5 s (Automatic process)
Generating a Trial balance based on ledger account for period of 1 year | 10 s | 1 min | 5 s | 5 s
Generating a PL (profit and loss) report for period of 1 year | 1 day | 3 days | 1 day | 1 min
Generating depreciation register for 1 month and 100 assets | 1 min | 2 min | 40 s | 30 s
Creating a Fixed asset | 1 min | 2 min | 40 s | 40 s

(continued)
Table 1. (continued)

Business operation | Time (avg.) | Time (max) | Time (min) | Time required now

Bank management
Creating a payment journal for 5 client invoices | 2 min | 5 min | 1 min | 10 s (Automatic process based on Invoice due day)
Creating a payment journal for 5 vendor invoices | 2 min | 5 min | 1 min | 10 s (Automatic process based on Invoice due day)
Generating Cash Flow report for period of 1 week | 2 min | 5 min | 1 min | 50 s

Inventory management
Creating Counting register for one warehouse and 10 items | 1 min | 2 min | 40 s | 30 s
Creating Inventory transfer register for 5 items | 3 min | 5 min | 2 min | 1 min
Generating On hand report for 1 warehouse and 100 items | 10 s | 10 s | 10 s | 10 s

CRM (Customer relationship management)
Creating a Lead | 1 min | 2 min | 40 s | 10 s
Creating a quotation with 5 lines | 2 min | 3 min | 2 min | 1 min
Confirming a quotation with 5 lines | 15 s | 40 s | 10 s | 5 s (Automatically generated based on quotation)
Converting Lead to real customer | 1 min | 2 min | 40 s | 5 s (Automatically generated based on lead)

Marketing
Generating marketing activity list with 100 customers | - (None) | - (None) | - (None) | 5 s (Automatic process)

HR (Human resources)
Creating an employee | 2 min | 5 min | 1 min | 40 s
Creating payroll journal for 10 employees for 1 month | 2 min | 3 min | 1 min | 30 s
Generating payment statement report for 1 month and 10 people | 1 min | 2 min | 30 min | 10 s
The results provided give clear and specific average benchmark indicators for the speed of the major business operations executed in ERPs. The framework is open to adjustment as technology advances and as operational improvements, automations or optimizations appear.

5 Conclusion

The paper provides a measurement framework for testing the business speed of ERP systems. To test the method, empirical research was undertaken. The paper could serve as a benchmark or a starting point for measuring ERP business speed or for optimizing this speed. From a scientific perspective, the paper provides a definition and an explanation of the term ERP business speed, which is introduced here for the first time. The results might be of interest to scholars from both the management and computer science areas, but above all they can be used by businesses.

Acknowledgement. Supported by the UNWE under project No. NID NI-14/2018.

References

1. Hong, K.K., Kim, Y.G.: The critical success factors for ERP implementation: an organizational fit perspective. Inf. Manage. 40, 25–40 (2003)
2. Moon, Y.B.: Enterprise resource planning (ERP): a review of the literature. Int. J. Manage. Enterp. Dev. 4(3), 235–264 (2007)
3. Staehr, L.: Understanding the role of managerial agency in achieving business benefits from ERP systems. Inf. Syst. J. 20(3), 213–238 (2010)
4. Bahssas, D., AlBar, A., Hoque, R.: Enterprise resource planning (ERP) systems: design, trends and deployment. Int. Technol. Manage. Rev. 5(2), 72–81 (2015)
5. Guimaraes, T., et al.: Assessing the impact of ERP on end-user jobs. Int. J. Acad. Bus. World 9(1), 11–21 (2015)
6. Jayawickrama, U., Liu, S., Smith, M.H.: Knowledge prioritisation for ERP implementation success: perspectives of clients and implementation partners in UK industries. Ind. Manage. Data Syst. 117(7), 1521–1546 (2017). https://doi.org/10.1108/IMDS-09-2016-0390
7. Shatat, A.: Critical success factors in enterprise resource planning (ERP) system implementation: an exploratory study in Oman. Electron. J. Inf. Syst. Eval. 18(1), 36–45 (2015)
8. Huang, Z., Palvia, P.: ERP implementation issues in advanced and developing countries. Bus. Process Manage. J. 7(3), 276–284 (2001)
9. Sharif, A.M., Irani, Z., Love, P.E.: Integrating ERP using EAI: a model for post hoc evaluation. Eur. J. Inf. Syst. 14(2), 162–174 (2005)
10. Lee, N.C.A., Chang, J.: Adapting ERP systems in the post-implementation stage: dynamic IT capabilities for ERP. Pac. Asia J. Assoc. Inf. Syst. 12(1), Article 2 (2020). https://doi.org/10.17705/1pais.12102
11. Davenport, T.H.: The future of enterprise system-enabled organizations. Inf. Syst. Front. 2(2), 163–180 (2000)
12. Chen, C.S., Liang, W.Y., Hsu, H.: A cloud computing platform for ERP applications. Appl. Soft Comput. 27, 127–136 (2015)
    ERP Business Speed– A Measuring Framework 343 13. Momoh, A., Roy, R., Shehab, E.: Challenges in enterprise resource planning implementation: state-of-the-art. Bus. Process Manag. J. 16, 537–565 (2010) 14. Mahindroo, A., Singh, H., Samalia, H.V.: Factors impacting intention to use of ERP systems in Indian context: an empirical analysis. In: 3rd World Conference on Information Technology (WCIT-2012), vol. 03, pp. 1626–1635 (2013) 15. Gunasekaran, A.: Modelling and Analysis of Enterprise Information Systems, p. 179. IGI Publishing, New York (2007) 16. Lutfi, A.: Investigating the moderating role of environmental uncertainty between institutional pressures and ERP adoption in Jordanian SMEs. J. Open Innov. Technol. Mark. Complex. 6(3), 91 (2020). https://doi.org/10.3390/joitmc6030091 17. Kirmizi, M., Kocaoglu, B.: The key for success in enterprise information systems projects: development of a novel ERP readiness assessment method and a case study. Enterp. Inf. Syst. 14(1), 1–37 (2020). https://doi.org/10.1080/17517575.2019.1686656 18. Karimi, J., Somers, T., Bhattacherjee, A.: The role of information systems resources in ERP capability building and business process outcomes. J. Manage. Inf. Syst. 24(2), 221–260 (2007) 19. Moe C.E., Fosser E., Leister O.H., Newman M.: How can organizations achieve competitive advantages using ERP systems through managerial processes? In: Magya, G., Knapp, G., Wojtkowski, W., Wojtkowski, W.G., Zupančič, J. (eds.) Advances in Information Systems Development. Springer, Boston (2007) 20. Al-hadi, M.A., Al-Shaibany, N.A.: An extended ERP model for Yemeni universities using TAM model. Int. J. Eng. Comput. Sci. 6(7), 22084–22096 (2017) 21. Poston, R., Grabski, S.: The impact of enterprise resource planning systems on firm per- formance. In: International Conference on Information Systems, pp. 479−493. Brisbane (2000) 22. Hunton, J., McEwen, A., Wier, B.: The reaction of financial analysts to enterprise resource planning implementation plans. J. Inf. Syst. (Spring) 3140 (2002) 23. Hedman, J., Kalling, T.: The business model concept: theoretical underpinnings and empirical illustrations. Eur. J. Inf. Syst. 12(1), 49–59 (2003) 24. Ali, O., et al.: Anticipated benefits of cloud computing adoption in Australian regional municipal governments: an exploratory study. In: Conference or Workshop Item 209, (2015) 25. Esteves, J., Pastor, J.: Enterprise resource planning systems research: an annotated bibliogra- phy. Commun. Assoc. Inf. Syst. 7, Article 8 (2001) 26. Demi, S., Haddara, M.: Do cloud ERP systems retire? an ERP lifecycle perspective. Procedia Comput. Sci. 138(2018), 587–594 (2018) 27. Muhamet, G.: A maturity model for implementation and application of Enterprise Resource Planning systems and ERP utilization to Industry 4.0. PhD thesis, Budapesti Corvinus Egyetem, Közgazdasági és Gazdaságinformatikai Doktori Iskola (2021) 28. Raman, K.A.: Speed Matters: Why Best in Class Business Leaders Prioritize Workforce Time to Proficiency Metrics, Speed To Proficiency Research: S2Pro©, 2021 (2021) 29. Gates, B.: The Road Ahead, 1st edn. Viking Press, ISBN-10: 0670859133 (1999) 30. Pulendran, S., Speed, R., Widing, R., II.: Marketing planning, market orientation and business performance. Eur. J. Mark. 37(3/4), 476–497 (2003). https://doi.org/10.1108/030905603104 59050 31. Meyer, C., Davis, S.: Blur: the speed of change in the connected economy. Work Study 49(4) (2000). https://doi.org/10.1108/ws.2000.07949dae.003 32. 
Cosenz, F., Bivona, E.: Fostering growth patterns of SMEs through business model innovation. A tailored dynamic business modelling approach. J. Bus. Res. (in press, 2021). https://doi. org/10.1016/j.jbusres.2020.03.003 33. Gregory, C., Rawling, S.: Profit from time: speed up business improvement by implementing time compression, Springer, ISBN 1349145912 (2016)
    344 Z. Yordanova 34.Miller, S.: Accelerated Sap (ASAP): Implementation at the Speed of Business. McGraw-Hill, Inc., New York (1998) 35. Kaufmann, D., Wei, S.: Does’ Grease Money’ Speed Up the Wheels of Commerce? International monetary fund working paper (2000) 36. Yordanova, Z.: Knowledge transfer from lean startup method to project management for boosting innovation projects performance. Int. J. Technol. Learn. Innovation Dev. 9(4) (2017)
BELBIC Based Step-Down Controller Design Using PSO

João Paulo Coelho1,2(B), Manuel Braz-César1,3, and José Gonçalves1,2

1 Instituto Politécnico de Bragança, Escola Superior de Tecnologia e Gestão, Campus de Sta. Apolónia, 5300-253 Bragança, Portugal
{jpcoelho,brazcesar,goncalves}@ipb.pt
2 CeDRI - Centro de Investigação em Digitalização e Robótica Inteligente, Campus de Sta. Apolónia, 5300-253 Bragança, Portugal
3 CONSTRUCT - Institute of R&D in Structures and Construction, Campus da FEUP, 4200-465 Porto, Portugal

Abstract. This article presents a comparison between a common type III controller and one based on a brain emotional learning paradigm (BELBIC) parameterized using a particle swarm optimization (PSO) algorithm. Both strategies were evaluated regarding the set-point accuracy, disturbance rejection ability and control effort of a DC-DC buck converter. The simulation results suggest that, when compared to the common controller, the BELBIC leads to an improvement in both set-point tracking and disturbance rejection while reducing the dynamics of the control signal.

Keywords: Optimisation · BELBIC · Buck converter · PSO

1 Introduction

Conversion between different voltage values is amongst the most common operations found in electronics. For example, many battery-operated devices, such as laptops and mobile devices, are capable of switching between different voltage values in order to optimize the use of the battery. The 5 V constant core voltage found in 1970s microprocessors has evolved, for today's processors, into a scalable core supply voltage that can reach values lower than one volt. This voltage scaling task can be performed dynamically at the software or firmware level, by either the operating system or the BIOS. Moreover, the point-of-load approach used on the motherboards of modern microprocessor devices has led to the inclusion of a large number of small power supplies scattered along the main board. Reducing the power dissipated in the form of heat is an important goal which leads to an increase in efficiency, to smaller form factors by discarding the use of large heat sinks, and to an extension of battery life, which is a key factor for all mobile devices. This can be attained by resorting to a class of circuits known as switch-mode power supplies, where the voltage conversion takes place by periodically switching transistors, embedded in RLC networks, between their on and off states. The
© Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 345–356, 2021.
https://doi.org/10.1007/978-3-030-91885-9_25
    346 J. P.Coelho et al. input-to-output voltage ratio depends on the duty cycle imposed to those switch- ing devices by a controller. This controller operates in closed-loop by sampling the output voltage and comparing it with the desired output voltage value and the difference between those two values will be used to establish the switching duty-cycle. The block diagram presented in Fig. 1 illustrate this methodology. Fig. 1. Typical feedback control architecture used in DC/DC converters. Often, in practice, a current loop is also added in order to enable current- mode control. This additional control layer enables overcurrent protection and reduces the sensitivity of the voltage controller to the capacitor’s ESR. However, in this paper, only the voltage-mode control is taken into consideration. Voltage- mode control resort to feedback to keep constant the output voltage despite unwanted disturbances. The loop compensation network associated to the error amplifier can be of type I, II or III. Type I is a simple pole at the origin and type II expand it by including a zero and a high-frequency pole which can leading to a phase increase of 90o . Finally, type III adds two poles and two zeros to the pole at zero which promotes an increase in phase margin. Those loop compensation circuits are tuned to perform well in a given nom- inal system operating point. However, if the system deviates from this point, the controller performance can become very poor. For example, when the power supply shift between continuous to discontinuous conduction mode. Hence, adap- tiveness and learning ability must be included in the controller in order for it to be able to perform well in a large dynamic range and under the presence of system changes. This work proposes an alternative controller structure applied to regulate the operation of a DC to DC buck converter. In particular, it will rely on the use of control paradigm inspired on the brain emotional learning ability (BELBIC) to promote adaption to operating point changes. Conceptually, this controller is inspired by the brain’s limbic system and, when compared to the typical buck converter controller, its most notorious property is the ability to keep learning while in operation. The use of BELBIC was already applied within the power electronics context. In [1], a brain emotional learning approach was employed in the context of a maximum power-point tracking algorithm applied to solar energy conversion. Additionally, in [2], a BELBIC controller was applied to control a
buck DC-DC converter. However, the controller parameters were obtained empirically and no comparison with other techniques was carried out. In this paper, the buck converter is also addressed, but the BELBIC parameters were computed using the particle swarm optimization (PSO) algorithm. Moreover, a comparison of the closed-loop response with typical loop compensators was performed.

This document is divided into six sections. After this first introductory section, the mathematical formulation of a step-down converter is described in Sect. 2. Then, the generic BELBIC structure is presented in Sect. 3 and a general overview of the PSO algorithm is given in Sect. 4. Details and results regarding the controller implementation are the aim of Sect. 5, where a performance comparison was carried out with a type III controller as benchmark. Finally, Sect. 6 presents the conclusions and final remarks.

2 The Step-Down (Buck) Converter

The DC-DC buck converter used in this work assumes the asynchronous architecture depicted in Fig. 2. The two switching elements are a MOSFET and a diode. The MOSFET gate will be driven by a pulse-width modulation (PWM) circuit that, for simplicity, is not shown.

Fig. 2. General schematic of a DC-DC step-down converter built around a MOSFET and a diode as switching elements.

In this figure, H(s) denotes the voltage sensor transfer function and GC(s) the controller transfer function. The MOSFET gate is driven by a pulse-width modulation circuit that generates a square wave whose duty cycle is proportional to a voltage signal applied to its input.

Considering both steady state and the converter's continuous conduction operating mode, the average voltage across the inductor is zero. At the same time, the average current through the capacitor, over one switching period, is also zero.
Assuming small magnitudes of the disturbances, compared to the DC quiescent values, it is possible to obtain the following differential equations [3]:

$$L\,\frac{d}{dt}\hat{i}_L(t) = D\,\hat{v}_g(t) + \hat{d}(t)\,V_g - \hat{v}_o(t) \quad (1a)$$

$$C\,\frac{d}{dt}\hat{v}_o(t) = \hat{i}_L(t) - \frac{\hat{v}_o(t)}{R_o} \quad (1b)$$

$$\hat{i}_g(t) = D\,\hat{i}_L(t) + i_L(t)\,\hat{d}(t) \quad (1c)$$

where the hat symbol over a variable denotes a small disturbance around that variable's operating point and D is the PWM duty cycle, whose value lies within the interval [0, 1]. After applying the Laplace transform to the previous set of differential equations, the transfer function between the output and the command signal, denoted by $G_{vod}(s)$, is:

$$G_{vod}(s) = \frac{\hat{v}_o(s)}{\hat{d}(s)} = \frac{R_o V_g}{L C R_o s^2 + L s + R_o} \quad (2)$$

and the transfer function from input to output voltage, $G_{vov_g}(s)$, is defined as:

$$G_{vov_g}(s) = \frac{\hat{v}_o(s)}{\hat{v}_g(s)} = \frac{R_o D}{L C R_o s^2 + L s + R_o} \quad (3)$$

Without input voltage disturbances or load changes, the converter is able to operate in open loop. However, the output value can fluctuate in the presence of load changes or other disturbances, such as input voltage drops or rises and shifts in the components' nominal values due to factors such as aging. Thus, a closed-loop controller must be added in order to regulate the switched converter's output voltage. Typically, type I, II or III loop compensator structures are chosen to carry out this task, and they can be designed using the previously defined transfer functions.

However, it is worth noticing that those transfer functions are only approximations and assume small disturbances around a given operating point. For this reason, the behaviour of the switching converter can be very different outside the defined zone, especially if, due to small loads, the converter settles into discontinuous conduction mode. Fixed pole-zero controllers are unable to achieve good performance in the presence of severe changes in the system's dynamic behaviour. Another approach is to enable the controller to learn and to use this knowledge to self-adjust its behaviour in order to increase the overall performance within a broader range of operating points. In this work, this feature is attained by resorting to a soft-computing paradigm known as BELBIC, briefly described in the following section.
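To make the small-signal model above concrete, the sketch below builds $G_{vod}(s)$ from Eq. (2) and computes its open-loop step response with the component values used later in the paper (L = 20 mH, C = 50 µF, Vg = 12 V). The load resistance Ro is not stated in the text, so the 10 Ω used here is an assumption for illustration only.

```python
# Minimal sketch (not from the paper): step response of the buck converter's
# control-to-output transfer function Gvod(s), Eq. (2).
# Assumptions: L = 20 mH, C = 50 uF, Vg = 12 V are given in Sect. 5;
# Ro = 10 ohm is a placeholder value, not stated in the paper.
import numpy as np
from scipy import signal

L, C, Vg, Ro = 20e-3, 50e-6, 12.0, 10.0

# Gvod(s) = Ro*Vg / (L*C*Ro*s^2 + L*s + Ro)
num = [Ro * Vg]
den = [L * C * Ro, L, Ro]
Gvod = signal.TransferFunction(num, den)

t = np.linspace(0.0, 0.05, 2000)          # 50 ms window
t, y = signal.step(Gvod, T=t)             # response of vo to a unit step in d
print(f"DC gain: {y[-1]:.2f} V per unit duty cycle")   # approaches Vg = 12 V
```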
3 The BELBIC Controller

From an engineering point of view, the analysis of the solutions produced by nature to overcome animal species' adaptation problems has led to an increasing tendency to introduce bio-morphism and bio-mimicry in many different computational tools [4]. For example, biological inspiration can be tracked in applications such as machine cooperation, speech recognition, text recognition and self-assembly nanotechnology, just to name a few. All those examples have in common the fact that they rely on algorithms with the capability of adaptation and learning. Indeed, in the biological realm, learning is one of the most important factors for the endurance of all species. Learning allows organisms to adjust themselves to cope with changes in environmental and operating conditions. This robustness is a desired characteristic in engineering applications, since the operating conditions of a product are never static. For this reason, simplified versions of some natural learning processes have been adapted to serve in many engineering problems.

In the particular case of humans, learning takes place within and across generations at many different levels. We are not talking only about intellectual learning but also about the learning that is passed on through our genome. All this information shapes the actions of an individual when subject to a set of environmental stimuli. At the mental level, reasoning is not the only driving force where decision-making is concerned. Human reactions also depend strongly on emotions, which play an important role in our everyday life and have been a valuable asset in our survival and adaptation.

It is generally considered true that emotions appeared during evolution as a way to reduce the human reaction time. That is, rather than using the intellect to process information and generate actions, which would take time, the reaction by emotion would be much faster. Emotions can then be viewed as an automatic behaviour that seeks to improve survival by increasing the ability to react quickly in the presence of threats. It seems that the overall set of possible emotions is predefined in our genome; however, it can then be further modified based on the person's individual experience.

Psychology and the neural sciences circumscribe emotional activity to a set of distinct brain regions gathered in what is known as the limbic region. Besides emotions, the limbic system manages a number of other functions, such as behaviour and motivation, and has an important role in memory formation. At present, there is still no consensus in the scientific community about which brain areas should be included in the limbic system. However, it is commonly accepted that the thalamus, hypothalamus, hippocampus and amygdala are the main brain structures of the limbic system. Details regarding the role played by each one of those areas are outside the scope of this work. Instead, the objective is to capture the limbic system's behaviour, from a high-level abstraction angle, and frame it in the context of computational intelligence. The first step toward this approach was carried out by [5], presenting a first mathematical model describing the behaviour of the brain's emotional learning (BEL) process. The idea of applying this learning algorithm in the automatic control area was put forward, a couple of years later, by [6]. The junction of BEL with control systems design has led to the concept known as “brain emotional learning-based
intelligent control” (BELBIC). Further details regarding the operation of this control method can be found in [7–10].

The major pitfall of BELBIC regards the choice of both the emotional and the sensory signals in order to maximize the control system performance. Besides that, several tuning parameters, such as the learning rates of the amygdala and orbitofrontal processing units, must be found to achieve an acceptable controller behaviour. The values for such parameters are commonly found by trial and error, which can be cumbersome and lead to suboptimal solutions. For those reasons, other tuning methods have been presented [11–13], among them the use of evolutionary-based algorithms [14–18].

Due to their ability to provide good results for non-convex problems, evolutionary algorithms have been employed in a myriad of different applications. For this reason, in this work the BELBIC controller is tuned by resorting to the particle swarm optimization algorithm. A short overview of this method is provided in the following section, and further details on the methodology are described in Sect. 5.

4 The PSO Algorithm

The particle swarm optimisation (PSO) algorithm is fundamentally based on the social behaviour of animals that move in herds or flocks [19]. In this algorithm, a set of particles representing potential problem solutions moves through the search space, each particle $i$ having a position vector $\mathbf{x}_i(t) = \{x_{i1}(t), x_{i2}(t), \cdots, x_{in}(t)\}$ and a velocity $\mathbf{v}_i(t) = \{v_{i1}(t), v_{i2}(t), \cdots, v_{in}(t)\}$, where $t$ denotes the current evolutionary iteration and $n$ the dimension of the search space. The PSO dynamics is governed by individual and social knowledge. That is, a given particle's movement is due to its own experience and to social information sharing. In [19] this concept was mathematically expressed by the following set of equations:

$$co_{id}(t) = p_{id}(t) - x_{id}(t) \quad (4a)$$

$$so_{id}(t) = p_{gd}(t) - x_{id}(t) \quad (4b)$$

where $co_{id}(t)$ is the cognition-only value associated with the $d$th dimension of the $i$th particle and $so_{id}(t)$ is the social-only component of the same individual. The velocity and position of a given particle are computed by:

$$v_{id}(t+1) = v_{id}(t) + \varphi_1\,co_{id}(t) + \varphi_2\,so_{id}(t) \quad (5a)$$

$$x_{id}(t+1) = x_{id}(t) + v_{id}(t+1) \quad (5b)$$

where $p_{id}(t)$ is the best previous position of particle $i$ up to the current iteration $t$ and $p_{gd}(t)$ denotes the global best particle within a given neighbourhood. The coefficient $\varphi_1$ is the cognitive constant and $\varphi_2$ is the social coefficient. Generally they are taken as uniformly distributed random numbers between zero and two.
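The update rules in Eqs. (4)–(5) translate directly into a few lines of code. The sketch below is a generic, minimal PSO loop written for illustration; it is not the authors' implementation, and the objective function, bounds and swarm size are placeholders. It also clips positions and velocities, which anticipates the bounding discussed next.

```python
# Minimal PSO sketch implementing Eqs. (4)-(5); illustration only, not the
# authors' code. Objective, bounds and swarm size are placeholder assumptions.
import numpy as np

def pso(objective, bounds, n_particles=30, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = bounds                                    # each of shape (n_dims,)
    n_dims = lo.size
    x = rng.uniform(lo, hi, size=(n_particles, n_dims))    # positions
    v = np.zeros_like(x)                                    # velocities
    v_max = 0.2 * (hi - lo)                                 # bound the step size
    p_best = x.copy()                                       # personal bests
    p_cost = np.array([objective(p) for p in x])
    g_best = p_best[p_cost.argmin()].copy()                 # global best

    for _ in range(n_iter):
        phi1 = rng.uniform(0, 2, size=x.shape)              # cognitive weight
        phi2 = rng.uniform(0, 2, size=x.shape)              # social weight
        co = p_best - x                                     # Eq. (4a)
        so = g_best - x                                     # Eq. (4b)
        v = np.clip(v + phi1 * co + phi2 * so, -v_max, v_max)   # Eq. (5a)
        x = np.clip(x + v, lo, hi)                              # Eq. (5b)
        cost = np.array([objective(p) for p in x])
        improved = cost < p_cost
        p_best[improved], p_cost[improved] = x[improved], cost[improved]
        g_best = p_best[p_cost.argmin()].copy()
    return g_best, p_cost.min()

# Example usage on a toy 6-dimensional quadratic:
best, val = pso(lambda p: np.sum(p**2), (np.full(6, -1.0), np.full(6, 1.0)))
```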
To guarantee admissibility and stability, both the particle position and velocity are bounded to maximum values. For this reason, the search space is always circumscribed and the maximum step that each particle can undergo during one iteration is constrained. The values for the maximum or minimum particle position are problem dependent. Moreover, the maximum velocity should not be too high or too low, in order to avoid oscillations and local minima [20].

5 Step-Down Control with BELBIC

This section presents the procedure behind the design of a BELBIC controller for a DC-DC buck converter capable of generating a regulated 5 V output voltage from a 12 V unregulated input. The nominal component values are L = 20 mH and C = 50 µF, and a 50 kHz switching frequency was considered. To reach the above nominal output voltage, the duty cycle D must be roughly equal to 42%. For this buck converter, a type III controller was designed in order to have zero steady-state error, around 60° of phase margin, 10 dB of gain margin, and an open-loop frequency response with a slope of −20 dB/decade at the crossover frequency; in addition, an overshoot lower than 0.5 V and a rise time smaller than or equal to 2 ms were required. Using the Bode plot reshaping technique, all those figures of merit could be attained by means of a regulator with the following transfer function:

$$G_c(s) = 0.0317\,\frac{s^2 + 63.4\,s + 5\times 10^{4}}{s\,(s^2 + 20\,s + 100)} \quad (6)$$

Figure 3 presents the circuit implemented using Simulink®'s Simscape® toolbox. Using this framework, the buck converter's non-linear nature is fully represented, since each electronic and electrical component is accurately modelled taking into consideration its non-ideal characteristics.

Fig. 3. Closed-loop implementation of the buck converter using Simscape® and a type III compensator.
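For reference, the nominal operating point and the compensator of Eq. (6) can be written down directly. The snippet below is a small illustration using only the published coefficients and the 12 V-to-5 V conversion; the load resistance and sensor gain are not needed here, and the fraction layout of Eq. (6) is read with 0.0317 as a leading gain factor.

```python
# Small illustration of the nominal design numbers in this section: the ideal
# steady-state duty cycle D = Vo/Vg and the type III compensator of Eq. (6).
from scipy import signal

Vg, Vo = 12.0, 5.0
D = Vo / Vg                      # about 0.417, i.e. roughly the 42% quoted above

# Gc(s) = 0.0317 (s^2 + 63.4 s + 5e4) / (s (s^2 + 20 s + 100))
num = [0.0317 * c for c in (1.0, 63.4, 5e4)]
den = [1.0, 20.0, 100.0, 0.0]    # s^3 + 20 s^2 + 100 s
Gc = signal.TransferFunction(num, den)

w, mag, phase = signal.bode(Gc)  # magnitude (dB) and phase (deg) vs. frequency
print(f"D = {D:.3f}")
```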
The simulation was carried out considering a 5 V sinusoidal disturbance, with 100 Hz frequency, superimposed on the 12 V supply voltage. Moreover, a 20% step load disturbance was applied at time instant 0.005 s. The simulation was carried out over a time frame of 15 ms and the observed results are shown in Fig. 4.

Fig. 4. Buck converter performance using a PID type controller: top - input voltage, middle - output voltage, bottom - control signal.

From the simulation results, it is possible to observe a performance degradation in both overshoot and bandwidth. Moreover, the set-point accuracy was compromised, as can be seen from the low-frequency signal overlapped onto the output voltage. This closed-loop mismatch is due to several reasons: component losses, non-linearities and model mismatches. For this reason, it is possible to conclude that this controller is unable to perform well over a broad range of changes in the system dynamics. Adaptation is required, which, in this work, is attained by means of the BELBIC control strategy.

One of the major handicaps when dealing with a BELBIC controller concerns the appropriate definition of both the emotional and the sensory signals. In this work, the BELBIC Simulink® toolbox [10] was utilized with the structure depicted in Fig. 5, where the stimulus signal was defined as:

$$s(t) = w_1 \cdot e(t) + w_2 \cdot \int e(t)\,dt \quad (7)$$

where e(t) denotes the voltage tracking error. The reward signal is defined by:

$$r(t) = w_3 \cdot e(t) + w_4 \cdot u(t) \quad (8)$$

where u(t) is the control signal and $w_i$, for $i = 1, \ldots, 4$, are weight factors that can be used to define the relative importance of each component. Besides the weights $w_i$, $i = 1, \cdots, 4$, presented in (7) and (8), the BELBIC design process also requires the definition of a set of parameters for it to operate
adequately, in particular the values of the amygdala and orbitofrontal learning rates, α and β respectively.

Fig. 5. Closed-loop implementation of the buck converter using Simscape® and a BELBIC controller.

Managing such a number of parameters using a trial-and-error approach would be cumbersome at best. For this reason, in this work a PSO algorithm is in charge of deriving the best controller parameters according to a given performance index. In the current context, the performance is calculated by:

$$f(\theta) = \varphi(\theta) \cdot \int_{0}^{t_S} e^{2}(\theta, \tau)\,d\tau \quad (9)$$

where $\theta = [\alpha, \beta, w_1, w_2, w_3, w_4]$ denotes the controller parameters and $t_S$ the simulation time. The function $\varphi(\theta)$ is used to penalize solutions that result in control signals with amplitudes outside the actuator range. In the present case, $\varphi(\theta)$ is defined as:

$$\varphi(\theta) = \begin{cases} 1, & \min(u(t)) \geq 0 \;\wedge\; \max(u(t)) \leq 1 \\ e^{\left(\min(u(t))^2 + \max(u(t))^2\right)} + 1, & \text{otherwise} \end{cases} \quad (10)$$

for $t \in [0, t_S]$.

The PSO algorithm was run several times to search for a suitable solution θ that minimizes the objective function f(θ). During the simulation, the system was excited using a random input voltage signal with values between 4 V and 15 V, changing with a periodicity of 5 ms. A swarm size of 30 particles was used and the simulation time was set to 100 ms. The best solution found, in this case α = 0.0241, β = 0.00985, w1 = 6.54, w2 = 21.3, w3 = 0.0016 and w4 = 0.21, was used to parameterize the BEL controller, which was then subjected to the same simulation conditions as the type III controller. The result can be seen in Fig. 6.

As the plots of Fig. 6 show, the BELBIC controller was able to achieve a smaller settling time and a lower overshoot. Moreover, the steady-state error and the disturbance rejection ability were also enhanced when compared to the PID controller. However, this improved response comes at the expense of a more complex and demanding tuning process.
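Before turning to the quantitative comparison, a concrete illustration of the objective in Eqs. (9)–(10) may be useful. In the sketch below, `simulate` is a hypothetical stand-in for the Simscape closed-loop simulation described above; only the penalty and the integral-of-squared-error computation follow the equations.

```python
# Sketch of the PSO objective from Eqs. (9)-(10). The closed-loop simulation
# itself is not reproduced here: `simulate` is a hypothetical placeholder that
# would return the sampled error e(t) and control signal u(t) for parameters
# theta = [alpha, beta, w1, w2, w3, w4].
import numpy as np

def objective(theta, simulate, t_s=0.1, dt=1e-5):
    t = np.arange(0.0, t_s, dt)
    e, u = simulate(theta, t)              # assumed interface, not from the paper

    # Penalty phi(theta), Eq. (10): 1 inside the actuator range [0, 1],
    # exponentially large otherwise.
    if u.min() >= 0.0 and u.max() <= 1.0:
        phi = 1.0
    else:
        phi = np.exp(u.min() ** 2 + u.max() ** 2) + 1.0

    ise = np.trapz(e ** 2, t)              # integral of squared error, Eq. (9)
    return phi * ise
```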
Fig. 6. Buck converter performance using a BELBIC type controller: top - input voltage, middle - output voltage, bottom - control signal.

A quantitative comparison between the PID and the BELBIC controllers, in the context of the addressed problem, can be made by computing the following two figures of merit:

$$\mathrm{RMS} = \sqrt{\frac{1}{T}\int_{0}^{T} \varepsilon(t)^{2}\,dt} \quad (11)$$

$$\mu_{\mathrm{RMS}} = \sqrt{\frac{1}{T}\int_{0}^{T} \left(\frac{du(t)}{dt}\right)^{2} dt} \quad (12)$$

The first is the root-mean-square value of the error signal ε(t) taken over the simulation interval [0, T]; the second is also a root-mean-square value, but now of the control effort. For the PID controller, RMS = 1.35 and μRMS = 8.19 × 10−3. For the BELBIC those values come down to RMS = 1.131 and μRMS = 7.12 × 10−3. Those values reflect a 16% decrease in RMS and an improvement of 13% in the control signal variability.

6 Conclusion

This paper has compared the performance of an ordinary type III controller and a BELBIC controller applied to a DC-DC buck converter. Both strategies were evaluated regarding their ability to maintain a stable output voltage in the presence of both input and load disturbances. The obtained results suggest that, when compared to the classical controller, the BELBIC controller is superior when considering both set-point tracking accuracy and disturbance rejection ability. Furthermore, these results are attained with lower control effort.
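The two figures of merit in Eqs. (11)–(12) are straightforward to compute from sampled signals; the snippet below is a minimal illustration using synthetic arrays, not the paper's simulation data.

```python
# Computing the figures of merit of Eqs. (11)-(12) from sampled signals.
# The signals here are synthetic placeholders, not the paper's results.
import numpy as np

def rms_error(t, err):
    """Eq. (11): root-mean-square of the tracking error over [0, T]."""
    T = t[-1] - t[0]
    return np.sqrt(np.trapz(err ** 2, t) / T)

def rms_control_effort(t, u):
    """Eq. (12): root-mean-square of du/dt over [0, T]."""
    T = t[-1] - t[0]
    du_dt = np.gradient(u, t)
    return np.sqrt(np.trapz(du_dt ** 2, t) / T)

t = np.linspace(0.0, 15e-3, 1500)               # 15 ms window, as in the simulations
err = 0.2 * np.exp(-t / 2e-3)                   # toy decaying error signal
u = 0.42 + 0.01 * np.sin(2 * np.pi * 100 * t)   # toy control signal around D = 42%
print(rms_error(t, err), rms_control_effort(t, u))
```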
    BELBIC Based Step-DownController Design Using PSO 355 Future work will consider the controller behaviour if the buck converter enters discontinuous conduction mode. A physical implementation of this solution is also an ongoing project and the controller performance will be compared with the one achieved by commercial devices such as the UC3845A chip. References 1. Sankarganesh, R., Thangavel, S.: Performance analysis of various DC-DC convert- ers with optimum controllers for PV applications. Res. J. Appl. Sci. Eng. Technol. 8, 929–941 (2014) 2. Khorashadizadeh, S., Mahdian, M.: Voltage tracking control of DC-DC boost con- verter using brain emotional learning. In: 4th International Conference on Control, Instrumentation, and Automation (ICCIA), pp. 268–272 (2016) 3. Erickson, R.W., Maksimovic, D.: Fundamentals of Power Electronics, 2nd edn. Springer, Boston (2001). https://doi.org/10.1007/b100747 4. Sarpeshkar, R.: Neuromorphic and Biomorphic Engineering Systems. McGraw-Hill Yearbook of Science and Technology. McGraw-Hill, New York (2009) 5. Balkenius, C., Morén, J.: A computational model of emotional learning in the amygdala. Cybern. Syst. 32(6), 611–636 (2001) 6. Lucas, C., Shahmirzadi, D., Sheikholeslami, N.: Introducing BELBIC: brain emo- tional learning based intelligent controller. Intell. Autom. Soft Comput. 10, 11–22 (2004) 7. Rouhani, H., Jalili, M., Araabi, B.N., Eppler, W., Lucas, C.: Brain emotional learning based intelligent controller applied to neurofuzzy model of micro-heat exchanger. Expert Syst. Appl. 32(3), 911–918 (2007) 8. Rahman, M.A., Milasi, R.M., Lucas, C., Araabi, B.N., Radwan, T.S.: Implemen- tation of emotional controller for interior permanent-magnet synchronous motor drive. IEEE Trans. Ind. Appl. 44(5), 1466–1476 (2008) 9. Nahian, S.A., Truong, D.Q., Ahn, K.K.: A self-tuning brain emotional learning based intelligent controller for trajectory tracking of electrohydraulic actuator. J. Syst. Control Eng. 228, 461–475 (2014) 10. Coelho, J.P., Pinho, T.M., Boaventura-Cunha, J., de Oliveira, J.B.: A new brain emotional learning Simulink R toolbox for control systems design. IFAC- PapersOnLine 50, 16009–16014 (2017) 11. Jafarzadeh, S., Jahed Motlagh, M.R., Barkhordari, M., Mirheidari, R.: A new Lyapunov based algorithm for tuning BELBIC controllers for a group of linear systems. In: 2008 16th Mediterranean Conference on Control and Automation. IEEE, June 2008 12. Garmsiri, N., Najafi, F.: Fuzzy tuning of brain emotional learning based intelligent controllers. In: 2010 8th World Congress on Intelligent Control and Automation. IEEE, July 2010 13. Jafari, M., Mohammad Shahri, A., Hamid Elyas, S.: Optimal tuning of brain emo- tional learning based intelligent controller using clonal selection algorithm. In: ICCKE 2013. IEEE, October 2013 14. Valizadeh, S., Jamali, M.-R., Lucas, C.: A particle-swarm-based approach for opti- mum design of BELBIC controller in AVR system. In: International Conference on Control, Automation and Systems, COEX, Seoul, Korea, pp. 2679–2684, October 2008
    356 J. P.Coelho et al. 15. Valipour, M.H., Maleki, K.N., Ghidary, S.S.: Optimization of emotional learning approach to control systems with unstable equilibrium. In: Lee, R. (ed.) Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Comput- ing. SCI, vol. 569, pp. 45–56. Springer, Cham (2015). https://doi.org/10.1007/978- 3-319-10389-1 4 16. El-Saify, M.H., El-Garhy, A.M., El-Sheikh, G.A.: Brain emotional learning based intelligent decoupler for nonlinear multi-input multi-output distillation columns. Math. Probl. Eng. 1–13, 2017 (2017) 17. Mei, Y., Tan, G., Liu, Z.: An improved brain-inspired emotional learning algorithm for fast classification. Algorithms 10(2), 70 (2017) 18. César, M.B., Coelho, J.P., Gonalves, J.: Evolutionary-based bel controller applied to a magneto-rheological structural system. Actuators 7(2), 29 (2018) 19. Kennedy, J., Eberhart, R.C.: Particle swarm optimization. In: Proceedings of the 1995 IEEE International Conference on Neural Network, pp. 1942–1948 (1995) 20. Shi, Y., Eberhart, R.C.: Parameter selection in particle swarm optimization. In: Porto, V.W., Saravanan, N., Waagen, D., Eiben, A.E. (eds.) EP 1998. LNCS, vol. 1447, pp. 591–600. Springer, Heidelberg (1998). https://doi.org/10.1007/ BFb0040810
    Robotic Welding Optimization UsingA* Parallel Path Planning Tiago Couto1,2(B) , Pedro Costa1,3 , Pedro Malaca2 , Daniel Marques2 , and Pedro Tavares2 1 Faculty of Engineering, University of Porto, Porto, Portugal 2 SARKKIS Robotics, Porto, Portugal tiago.couto@sarkkis.com 3 INESC-TEC - INESC Technology and Science, Porto, Portugal Abstract. The world of robotics is in constant evolution, trying to find new solutions to improve on top of the current technology and to over- come the current industrial pitfalls. To date, one of the key intelligent robotics components, path planning algorithms, lack flexibility when con- sidering dynamic constraints on the surrounding work cell. This is mainly related to the large amount of time required to generate safe collision- free paths for high redundancy systems. Furthermore, and despite the already known benefits, the adoption of CPU/GPU parallel solutions is still lacking in the robotic field. This work presents a software solution able of connecting the path planning algorithms with parallel computing tools, reducing the time needed to generate a safe path. The output of this work is the validation for the introduction of intelligent parallel solutions in the robotic sector. Keywords: Optimization · A* algorithm · CPU parallelism · GPU parallelism 1 Introduction Trajectory planning is all about generating the rules for the movement of a manipulator to reach a determined goal. Adding to this some constraints will limit the robot movement, such as collision avoidance, limits for joint angles, velocities, accelerations, or torques. Once taking into account every constraint and every viable and possible trajectory, optimization is needed to choose the best trajectory based on defined objective, such as energy consumption or exe- cution time [5]. Focusing on the path planner optimization, the complexity is high as indus- trial application usually integrate robot manipulators, additional external axis to which they are associated (tracks, rings, orbitals), and lastly, the precision that they need to achieve. Due to this high complexity, and to achieve the optimal solution for the movement of a manipulator, path planning becomes a time-consuming task. This work aims to improve an implementation of the A* algorithm in Cartesian space, c Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 357–364, 2021. https://doi.org/10.1007/978-3-030-91885-9_26
    358 T. Coutoet al. using parallel computing (CPU and GPU), for an advanced robotic work cell, combining advanced (collision-free) offline programming and advanced sensing, minimizing the path length by using a graph algorithm, and minimizing the movements performed by the manipulator, namely the joints’ efforts, since the capabilities for flexible path planning are reduced. 2 Background The industrial community is in constant search of new approaches to reduce the time required to complete a given task. The main optimization for robotics, in particular, is the reduction of the trajectory setup and execution time, which can be split into two important aspects: minimizing path length and minimiz- ing trajectory duration. This solution aims to improve both aspects in the A* algorithm, focusing on achieving an optimal path to reach the solution, while minimizing the movement performed by the joints. However, to better under- stand the implementation, there is a need to understand the basic concepts about the A* algorithm and the parallelism that can be used in robotics. 2.1 Parallel Computing in Robotics The path planning for a manipulator, as defined earlier, is a very complex task, and the complexity increases with the introduction of external axes. The parallel processing allows the reduction of the time required to achieve a specified goal and/or improve the quality of the solution. Parallel processing has different lev- els, however, the one focused on this work is the algorithm level, which helps to manage the heavy computational complexity of these solutions, allowing faster Cspace evaluation algorithms [8]. Regarding parallel computing, there are two different approaches, cloud computing, and grid computing [2]. CPU Parallelism. Cloud computing, as defined by [4], is all about sharing software, reducing hardware load by using the cloud for processing complex calculations and only retrieving the results when needed, storing data, and query it to retrieve the data needed. Multithreading introduces some challenges, arising from the way the CPU handles them. Shared memory is one of the problems, as well as coordinating them. Since the CPU processes the Threads in an undefined order and can switch between them, there is a need to assure that the tasks that each one is performing do not affect the others, or if so, the resources need to be locked, to avoid corruption of data. Regarding threads, there are three approaches in C#: Thread, ThreadPool, BackgroundWorkers. GPU Parallelism. Grid computing, as defined by [3], is a distributed system, that uses multiple computing capacities to solve large computational problems. The tool used is Nvidia’s CUDA [1], allowing to speed up computation processes by using the power of GPU for the parallelizable part of the computation.
The GPU is designed for computer graphics and image processing and is more powerful than the CPU when performing many small tasks, due to its massively parallel computation; CUDA allows the GPU to be used for general-purpose programming [7]. To use CUDA, the host (CPU) calls the kernels (global functions) and the device (GPU) executes and parallelizes the tasks. Since the kernels use data in GPU memory, the data needs to be transferred between the CPU and the GPU by using dedicated CUDA variables. The workflow starts with allocating memory on the device, transferring data from the host to the device, executing the kernel(s), transferring the results from the device back to the host, and freeing the memory used on the device.

3 Implementation

To understand the proposed solution, the implementation created for the A* algorithm needs to be explained, from the heuristics used, to how the configuration space was built, to how the calculations for the manipulator movement were defined.

The current work considers a work cell composed of a Yaskawa MA1440 manipulator equipped with a Binzel Abirob W500 welding torch. The goal is to incorporate the planning solution into the SARKKIS CAM software solution. This is an offline programming solution for high-redundancy robotic systems, with this work focused on the CoopWeld cell (https://youtu.be/3L0JBA9ozFA) at an initial stage and scaling to H-Tables or higher-complexity work cells.

3.1 A* Algorithm

The A* algorithm works based on two heuristics, the g cost and the h cost, and its performance is only as good as the heuristics. The g cost represents the cost of moving from the current position to the neighbour being explored. This cost is defined by the difference between the values of each joint; to minimise the movement of the manipulator, the difference between the neighbour's joints and the destination joints is also added, weighted by the ratio of the Euclidean distance to the destination over the Euclidean distance from the origin to the destination. The arm position is chosen based on the joint angles that minimise the g cost. The h cost is calculated as the Euclidean distance between the current neighbour node and the destination, using the Cartesian coordinates. This fits the problem since it is always lower than the g cost, which uses the joint movement; this makes A* a complete algorithm, and because the heuristic does not underestimate by much, the number of searched nodes is reduced, optimizing the running time.

The A* algorithm starts in a defined starting configuration, with a specific arm position, and tries to find the optimal path to the final configuration, which also has a defined arm position. The search loop itself is spelled out right after the listing below.
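A compact, generic sketch of this search loop is given below. It is illustrative Python, not the authors' C# implementation: the `neighbours`, `is_obstacle`, `g_step` and `h` callables are placeholders standing in for the joint-difference g cost, the collision check and the Cartesian-distance h cost described above.

```python
# Generic A* sketch over a discretised configuration space; illustration only,
# not the SARKKIS C# implementation. `neighbours`, `is_obstacle`, `g_step`
# and `h` are placeholder callables.
import heapq

def a_star(start, goal, neighbours, is_obstacle, g_step, h):
    open_heap = [(h(start, goal), 0.0, start)]      # entries are (f, g, node)
    best_g = {start: 0.0}
    parent = {start: None}
    while open_heap:
        f, g, node = heapq.heappop(open_heap)       # node with the least f cost
        if node == goal:                            # goal reached: rebuild path
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        if g > best_g.get(node, float("inf")):      # stale heap entry, skip it
            continue
        for nb in neighbours(node):
            if is_obstacle(nb):
                continue
            g_new = g + g_step(node, nb, goal)      # joint-movement cost
            if g_new < best_g.get(nb, float("inf")):
                best_g[nb] = g_new
                parent[nb] = node
                heapq.heappush(open_heap, (g_new + h(nb, goal), g_new, nb))
    return None                                     # all nodes searched, no path
```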
In each iteration, the algorithm chooses the node with the least f cost, the sum of the g cost and the h cost, and finds all the neighbours that are not obstacles, adding them to the list of nodes to evaluate. The algorithm stops if the final node is reached or if all nodes have been searched and there is no solution.

Configuration Space. The CSpace was implemented in Cartesian coordinates; the x and y axes were divided into 301 subdivisions and the z axis into 151 subdivisions, corresponding to 12 mm increments along the range of motion. These subdivisions correspond to a total of 301 * 301 * 151 = 13,680,751 configurations.

To create a faster implementation of the algorithm, some adjustments were made, starting with the creation of a less precise CSpace, with 72 mm increments instead of 12 mm for the x and y axes and 36 mm for the z axis, decreasing the number of nodes to search when finding the path. To obtain this reduction, the number of subdivisions was decreased to 51, resulting in a total of 51 * 51 * 51 = 132,651 configurations, reducing the search space by more than 100 times; the average number of searched nodes is reduced by almost 6 times. This less precise CSpace improves the algorithm's running time, since along most of the path, if there are no obstacles around, the intermediate configurations that would be defined with the complete CSpace are redundant; the only important ones are those close to obstacles, where the complete CSpace would be iterated, but this is almost instantaneous since the nodes in question are close together. After defining the full path from the source node to the destination node, and since the CSpace used was less precise, an approximation iteration was performed. Around each obstacle, an approximation was computed to avoid collisions due to the lack of precision. After this, an approximation was also computed for the final position, between the final node of the less precise CSpace and the final node in the more precise CSpace. Associated with the CSpace is also the CObs, where the configurations that the robot itself occupies can be defined.

Arm Position. The arm position for each configuration needs to be determined using kinematics, which leads to multiple solutions for each Cartesian position. The SARKKIS software has integrated kinematics, which automatically calculates the possible solutions, and each solution takes into account the manipulator links and the end-effector dimensions. The first step is to determine whether there is a solution that does not cause any collision with the environment. Among the solutions where there are no collisions, the one that results in the minimum value for the g cost (cost of movement) is the optimal one.

3.2 Parallel Computing

After identifying possible tasks in which to implement parallelism, there was a need to evaluate whether the parallelism could be performed on the CPU or the GPU. For the CPU, every task that is executed multiple times is worth parallelizing; for the
    Robotic Welding OptimizationUsing A* Parallel Path Planning 361 GPU, it was more specific since the type of tasks needs to have dense small calculations that can be subdivided into many threads, like the calculations for the neighbours, elements, and heuristics. These CPU and GPU changes can be seen below in Image 1. Fig. 1. CPU and GPU parallelism CPU Parallelism. The first aspect to consider is that this type of parallel work is only suitable for work that is performed over many iterations and that requires some amount of work, otherwise it causes the code to run slowly. Due to the workload limitation, Threading was the solution to be imple- mented, and the place to achieve a reasonable amount of work was the nodes calculations. The first memory shared is the OpenList and ClosedList and to avoid simul- taneous editing Mutex were introduced, using Microsoft Threading. The second memory problem was due to the edition of the nodes because different nodes can have common neighbours. To avoid this, the variables that are edited were converted to lists, and instead of editing the variable, a new value is added to the list, and when performing the calculations the minimum value for the h cost is used, and the other values can be tracked in the list by using the h cost index. To complete the Threading research, BackgroundWorkers were implemented. To define a BackgroundWorker, an event handler is added for the DoWork event, that will contain the method itself, another event is added to the RunWorker- Completed, a method for when the BackgroundWorker completes is work, also there was a need to enable the WorkerSupportsCancellation to cancel ongoing BackgroundWorkers when the solution is found. To send data to the method an argument needs to be created, that consists of a Tuple with all the infor- mation needed (Configurations and CSpace). To start the operation, simply call RunWorkerAsync with the argument as an input.
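The authors implement this with C# Threads, ThreadPool and BackgroundWorkers; the snippet below is only a rough Python analogue of the same idea (parallel evaluation of a node's neighbours with a lock around the shared lists), included to make the pattern concrete. Names and interfaces are placeholders, not the paper's code.

```python
# Rough Python analogue of the parallel neighbour evaluation described above.
# This is NOT the authors' C# ThreadPool/BackgroundWorker implementation; it
# only mirrors the pattern: worker tasks evaluate neighbours in parallel while
# a lock protects the shared open/closed lists.
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

lock = Lock()
open_list, closed_list = [], set()

def evaluate(node, neighbours, cost_fn):
    results = []
    for nb in neighbours(node):
        results.append((nb, cost_fn(node, nb)))    # heavy per-neighbour work
    with lock:                                     # guard the shared lists
        for nb, cost in results:
            if nb not in closed_list:
                open_list.append((cost, nb))

def expand_in_parallel(nodes, neighbours, cost_fn, workers=8):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(evaluate, n, neighbours, cost_fn) for n in nodes]
        for f in futures:
            f.result()                             # propagate worker exceptions
```

Note that CPython's global interpreter lock limits the speed-up for pure-Python work; the authors' C# threads do not face this constraint.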
    362 T. Coutoet al. GPU Parallelism. One of the main concerns when using CUDA is that to perform the computations, the data needs to be moved to and from the device, so the complexity of the operations need to justify this data transfer to obtain an improvement in the performance [6]. The ideal application would be to run many data elements simultaneously in parallel, which involves arithmetic operations on a large amount of data, where the operations can be performed across thousands or millions of elements at the same time. Since g cost, h cost, and find element are functions with simple calculations, a CUDA kernel for each one of those resulted in the worst performance of the algorithm. When using a kernel for find neighbours, it worked a little better since each node has 26 possible neighbours, but still was not helping the performance. The one that could improve the most was the find collisions because it has to verify each configuration that the manipulator is occupying, and verify if there is a collision with the environment obstacles. With this implementation, there was the need to create a boolean array with the index’s of the obstacles, to send the information to the kernel, since the device can not access information from the host. For this, when the CSpace was created, a list was filled with those indexes. 4 Integration The goal of this solution is to improve current solutions in the path planning industry, and for this, the solution was integrated with the SARKKIS software. This enables to perform industrial validation of the solution, trying to make the most out of the improvements proposed. To implement the solution a DLL was created and added to SARKKIS soft- ware. MetroID software family proprietary from SARKKIS is a CAM software for cutting and welding focused on generating collision-free operation paths for robotic work cells. Furthermore, upon sequencing the operation the planning between cuts or welds can be rapidly prepared considering this parallel A* app- roach. Therefore, the work is relevant for both the setup stage (vector genera- tion) and production/control stage (path planner between operations, avoiding dynamics obstacles as well as added fixtures to the scene). 5 Experimental Validation To validate the implemented functions and verify the optimization that can be obtained with them, some tests were performed, to better understand the best tool to be used or if there was a benefit of combining both. The computer used for these preliminary testing was equipped with an AMD Ryzen 7 2700X Eight- Core Processor (@3.70 GHz), 16 GB of RAM and an additional GPU Nvidia 1050 Ti.
    Robotic Welding OptimizationUsing A* Parallel Path Planning 363 5.1 CPU Parallelism To understand the implemented functions that resulted in an improvement, and to compare them, the solution for the A* using ThreadPool and Background- Workers was run across multiple scenarios, with and without obstacles, in more than 200 tests, to understand what was the optimal number of threads. These tests were performed on a computer with 8 cores. With these tests, using 16 threads for the ThreadPool implementation and using 8 or 16 threads for the BackgroundWorkers implementation proved to be the implementation that resulted in the best performance. Following these tests, to verify the difference in these three implementations, and understanding the improvement, 250 more tests were performed comparing those implementations with the sequential implementation of the A*, the traditional one, in multiple scenarios and with different levels of complexity. After these tests, the utilization of BackgroundWorkers with 8 threads, resulted in the best performance, for our setup, with an improvement of almost 10 times compared to the traditional implementation of the A* (Fig. 2). Fig. 2. Average time of different implementations Since the creation of the CSpace also takes a lot of time, an implementation of that function was created using BackgroundWorkers to understand if there would be any benefit, and the results can be seen below, with an improvement of more than 20%. Also, a solution using a kernel was tried, but it only resulted in a very small improvement, around 3%. 5.2 GPU Parallelism For the GPU parallelism, some functions were created, but since g cost, h cost and find element were too simple and resulted in a huge loss of performance, they were not highly tested. For the check collisions function, there was a big increment in time, since there were too many variables to be copied for the device which cause a lack of performance. For the find neighbours function, since it would at the top create 26 neighbours, and therefore 26 threads, it did not result in an improvement for the implementation.
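For readers unfamiliar with the kernel pattern discussed in Sects. 3.2 and 5.2, the sketch below shows a collision-check kernel of the kind described there (flagging occupied configurations that fall on obstacle indices). It is written with Numba's CUDA interface rather than the authors' C#/CUDA code, and the array names and launch configuration are illustrative assumptions.

```python
# Illustrative collision-check kernel in the spirit of the one described in
# Sect. 3.2, written with Numba's CUDA API; it is not the authors' code, and
# the array names and launch configuration are assumptions.
import numpy as np
from numba import cuda

@cuda.jit
def check_collisions(occupied_idx, obstacle_mask, hit):
    i = cuda.grid(1)                       # one thread per occupied configuration
    if i < occupied_idx.size:
        if obstacle_mask[occupied_idx[i]]:
            hit[i] = True

def any_collision(occupied_idx, obstacle_mask):
    d_idx = cuda.to_device(occupied_idx)            # host -> device transfers
    d_mask = cuda.to_device(obstacle_mask)
    d_hit = cuda.to_device(np.zeros(occupied_idx.size, dtype=np.bool_))
    threads = 128
    blocks = (occupied_idx.size + threads - 1) // threads
    check_collisions[blocks, threads](d_idx, d_mask, d_hit)
    return bool(d_hit.copy_to_host().any())         # device -> host result
```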
6 Conclusion and Future Work

The solution presented, an optimized A* algorithm, proved to improve the path planning time with the introduction of grid computing, since it reaches the optimal solution in reduced time, with a collision-free path. This work provides a flexible solution that can be integrated by a wide range of offline programming software. After the experimental validation, and to validate the results obtained, an industrial validation was performed in the CoopWeld work cell. The solution worked in the industrial environment, as can be seen at https://tiago27couto.wixsite.com/dissertation, and validated the robustness of the algorithm, with some changes still to be made to refine the solution; it proved to have the potential to improve current path planning solutions. As the next step, inserting external axes will improve the reachability of the solution, even though it will also increase the complexity, which will enable further improvements and a possible integration of GPU computing.

Acknowledgment. The research and developed work leading to these results has received funding from the TRINITY Robotics project, under the European Union's Horizon 2020 research and innovation programme, grant agreement 825196.

References

1. CUDA zone - NVIDIA developer. https://developer.nvidia.com/cuda-zone. Accessed 15 May 2021
2. Henrich, D., Honiger, T.: Parallel processing approaches in robotics. In: Proceedings of the IEEE International Symposium on Industrial Electronics, ISIE 1997, vol. 2, pp. 702–707 (1997). https://doi.org/10.1109/ISIE.1997.649079
3. Jiang, Y.S., Chen, W.M.: Task scheduling in grid computing environments. In: Pan, J.S., Krömer, P., Snášel, V. (eds.) Genetic and Evolutionary Computing, vol. 238, pp. 23–32. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-01796-9_3
4. Kar, A., Dutta, A.K., Debnath, S.K.: Application of cloud computing for optimization of tasks scheduling by multiple robots operating in a co-operative environment. In: Hemanth, J., Fernando, X., Lafata, P., Baig, Z. (eds.) ICICI 2018. LNDECT, vol. 26, pp. 118–125. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-03146-6_11
5. Pham, Q.-C.: Trajectory planning. In: Nee, A.Y.C. (ed.) Handbook of Manufacturing Engineering and Technology, pp. 1873–1887. Springer, London (2015). https://doi.org/10.1007/978-1-4471-4670-4_92
6. Rahmani, V., Pelechano, N.: Multi-agent parallel hierarchical path finding in navigation meshes (MA-HNA*). Comput. Graph. 86, 1–14 (2020). https://doi.org/10.1016/j.cag.2019.10.006
7. Ruetsch, G., Oster, B.: Getting started with CUDA (2008). https://www.nvidia.com/content/cudazone/download/Getting_Started_w_CUDA_Training_NVISION08.pdf. Accessed 15 May 2021
8. Therón, R., Blanco Rodríguez, F.J., Curto, B., Moreno, V., García-Peñalvo, F.: Parallelism and robotics: the perfect marriage. ACM Crossroads 8, 1–11 (2002)
Leaf-Based Species Recognition Using Convolutional Neural Networks

Willian Oliveira Pires1, Ricardo Corso Fernandes Jr.1, Pedro Luiz de Paula Filho1, Arnaldo Candido Junior1(B), and João Paulo Teixeira2

1 Federal University of Technology - Paraná, Medianeira Campus, Curitiba, Brazil
{willianpires,ricjun}@alunos.utfpr.edu.br, {plpf,arnaldoc}@utfpr.edu.br
2 Research Centre in Digitalization and Intelligent Robotics (CEDRI) – Instituto Politecnico de Braganca, Braganca, Portugal
joaopt@ipb.pt
http://www.utfpr.edu.br/english, https://cedri.ipb.pt/

Abstract. Identifying plant species is an important activity in species control and preservation. The identification process is carried out mainly by botanists and consists of a comparison with already known specimens or the use of books, manuals or identification keys. Artificial Neural Networks have been shown to perform well in classification problems and are a suitable approach for species identification. This work uses Convolutional Neural Networks to classify tree species by leaf images. In total, 29 species were collected. This work analyzed two network models, Darknet-19 and GoogLeNet (Inception-v3), presenting a comparison between them. The Darknet and GoogLeNet models achieved recognition rates of 86.2% and 90.3%, respectively.

Keywords: Deep learning · Leaf recognition · Tree classification

1 Introduction

Sustainability is an important concern for businesses and governments in view of nature preservation. According to Shrivastava [12], both businesses and governments play an important role in nature's preservation. The identification of forest species is important in this context, especially in the case of endangered species. Flora identification is currently carried out by botanists by comparison with already known species or with the guidance of books, manuals and identification keys. This comprises tasks ranging from simple ones, such as identifying whether the plant has flowers and fruits, to more complex ones, such as identifying the plant species by observing morphological attributes. For non-professionals, this process can be long and error prone, so an automated tool would save time and, possibly, plant species. The advancements made in computation, image processing techniques and pattern recognition have unveiled new ways of species identification. Deep learning based systems are promising in this field, being helpful both for professionals and non-professionals.

© Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 367–380, 2021.
https://doi.org/10.1007/978-3-030-91885-9_27
In this work, we used leaf images to train Convolutional Neural Networks (CNNs) for the classification task. Two models were trained: GoogLeNet (Inception-v3) and Darknet-19, both implemented using Tensorflow. These models were chosen due to their light requirements when compared to other models, allowing for use in low-cost equipment during field research. The models can be used to identify species in natura.

This work is organized as follows. Section 2 presents an overview of CNNs, species identification and related works. Section 3 presents the materials and methods used to train our models. Section 4 discusses the results. Finally, Sect. 5 contains the work's conclusions.

2 Background

2.1 Species Identification

To classify plant species, botanists rely on plant taxonomy, analyzing characteristic species groups by morphological similarities and genetic kinship links [10]. This process generated a field in botany called dendrology, which investigates woody plant identification, distribution and classification [5]. The scope of dendrology includes root types, tree sizes, pilosity and shaft, as well as diagnostic elements (color, texture and structure). The shaft is the part of a tree trunk free of ramifications, which can be of different types, shapes and bases. Table 1 presents leaf components. The basis of leaf variety is its division type: there are simple leaves, which present a single leaf lamina (Fig. 1), and there are also compound leaves, which present more than one leaflet, as shown in Fig. 2.

Table 1. Leaf characteristics

Leaf venation: pattern of veins in the leaf
Hairiness: hair-like structures on the leaf surface
Leaf arrangement: how leaves are arranged on a twig
Stem: plant structure that supports leaves, flowers and fruits
Stipule: a pair of small organs that may be attached to the twig on either side of the petiole
Leaf base: part of the leaf nearest to the petiole
Leaf apex: part of the leaf farthest from the petiole
Simple leaf: leaf that has a single blade
Compound leaf: leaf that has two or more blades, called leaflets
Leaflets: leaf subdivisions related to a compound leaf
Stalk: a thin stem that supports a leaf and joins it to another part of the tree
Rachis: principal vein in the compound leaf, an extension of the stalk
Bud: a small lump from which a flower, leaf or stem develops
Fig. 1. Simple leaf [5]
Fig. 2. Compound leaf [5]

There are other important leaf elements, which generate a wide species variety, such as leaf shape, tip (or apex), base and margin attributes, which can be used to differentiate plant species. Leaves can also be identified according to their phyllotaxis, which is divided into four types: alternate, spiral, opposite and whorled, shown in Fig. 3. This attribute is used before leaf extraction, as it shows how leaves are organized. Another attribute is leaf venation, which is divided into pinnate, reticulate, parallel, palmate and dichotomous (Fig. 4). All those characteristics are taken into account during the species identification process and are the foundation of a widely used species identification method among botanists: the dichotomous key, which is based on the observation of plant characteristics. Researchers compare characteristics of field-extracted species with the characteristics of dichotomous keys, one by one, until matching any of the registered species [10]. Table 2 presents a simple example using a dichotomous key to classify a leaf according to its venation.
Fig. 3. Leaf phyllotaxis [5]
Fig. 4. Leaf venation [5]

Table 2. Simple dichotomous key [5]

1. Leaves with a single, unramified vein: single main vein
   Leaves with more than a single vein: go to 2
2. Leaves with more than a single vein, all parallel to each other: parallel
   Leaves with non-parallel veins: go to 3
3. Secondary veins originate from a main vein: pinnate
   Leaves with several main veins originating from the petiole: palmate

This is a fairly simple example, but in many cases characteristic selection might not be trivial and plant species identification will usually involve more than a single characteristic. An example is Xanthosoma taioba, which is suitable for consumption and hard to classify, as it is very similar to Xanthosoma violaceum, which is not suitable for consumption. In Fig. 5, it becomes clear that differentiating between two species is not always a simple task.
Fig. 5. The real Xanthosoma taioba (right), and Xanthosoma violaceum (left) [5]

2.2 Convolutional Neural Networks

In the context of machine learning, specifically in neural networks, Convolutional Neural Networks (CNNs) are a powerful model to analyze images. CNNs are popular in image processing since they are inspired by the visual cortex. These networks are based on the idea of specialized components inside a system with specific tasks, in a similar fashion to the visual cortex observed by [16]. This architecture is composed of a sequence of layers that tries to capture a hierarchy of increasingly sophisticated representations. Besides the input layer, which normally consists of an image with width, height and color depth (RGB - red, green and blue channels), there are three typical layers: the convolution layer, the pooling layer and the densely connected layer [15].

The first hidden layer in a CNN is usually a convolutional layer, which is composed of many feature maps (filters), capable of learning patterns as training progresses [3]. Convolutional layers usually receive a two-dimensional input, or a collection of two-dimensional inputs, and are widely used to process images. The layers are then submitted to a convolution operation subject to parameters learned during the network training phase.

Each layer usually also performs a non-linear operation, which greatly increases a model's generalization capacity. This is done using an activation function. A popular function is the Rectified Linear Unit (ReLU), as it is a fast-to-calculate non-linear function. It replaces negative input values by zero [15]. An example can be observed in (1), where z represents the function input (a neuron input).

\phi(z) = \begin{cases} 0, & z \le 0 \\ z, & z > 0 \end{cases}   (1)

After a convolution and activation function, it is common to use a pooling layer. This technique aims to reduce the resulting matrix size, which diminishes the amount of neural network parameters to learn, contributing to avoid
overfitting [15]. Max pooling is a pooling technique in which several inputs close to each other are replaced by a single value, the highest value in their neighbourhood. Each consecutive layer is capable of representing more complex concepts than the previous one.

The last layers of a CNN are usually dense (or fully connected) layers, built on top of the convolutional layers. In the case of CNNs for classification, the last layer outputs an n-dimensional vector, where n is the total number of classes and each vector element is the predicted class probability for one of the available classes [3]. For classification, it is common to use the Softmax function, which compares the responses of the output neurons and returns the results in the form of probabilities. This function is presented in (2), for the stimulus z_j received by the j-th output neuron.

\phi(z_j) = \frac{e^{z_j}}{\sum_k e^{z_k}}   (2)

These models are normally trained using the Backpropagation and Gradient Descent algorithms. To avoid overfitting, the dropout technique can be used. The dropout approach consists of randomly removing neurons during the training process [1].

2.3 Darknet

Darknet is a neural topology usually implemented in the YOLO framework. This framework allows real-time object detection and is able to identify objects in images and videos, as presented in Table 3. In this work, we investigated Darknet-19, an architecture composed of 19 convolutional layers interspersed with 5 more layers that apply max-pooling [3]. This network segments the input image in S×S frames, known as grids. To do this, it uses a divide and conquer strategy, making use of image segments to identify object positions in addition to identifying the objects themselves [2].

2.4 GoogLeNet

GoogLeNet is a CNN that became notorious after winning the 2014 Imagenet Competition. The objective of GoogLeNet's engineers was to enhance the network's computational efficiency while making it deeper and wider. The main feature in GoogLeNet is the inception module, which is based on the idea of using multiple convolutional filters [13], varying the kernel size used in the same convolutional layer. The enhanced inception module is presented in Fig. 6.
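As a rough illustration of that idea (parallel convolutions with different kernel sizes whose outputs are concatenated along the channel axis), a minimal Keras sketch of an inception-style block follows. The filter counts and input size are illustrative assumptions and do not reproduce the actual Inception-v3 module of Fig. 6.

```python
# Minimal inception-style block: several convolution branches with different
# kernel sizes run in parallel on the same input and are concatenated.
from tensorflow.keras import Input, Model, layers

inputs = Input(shape=(224, 224, 3))

b1 = layers.Conv2D(64, 1, padding="same", activation="relu")(inputs)      # 1x1 branch

b3 = layers.Conv2D(48, 1, padding="same", activation="relu")(inputs)      # 1x1 reduce
b3 = layers.Conv2D(96, 3, padding="same", activation="relu")(b3)          # 3x3 branch

b5 = layers.Conv2D(16, 1, padding="same", activation="relu")(inputs)      # 1x1 reduce
b5 = layers.Conv2D(32, 5, padding="same", activation="relu")(b5)          # 5x5 branch

bp = layers.MaxPooling2D(pool_size=3, strides=1, padding="same")(inputs)  # pooling branch
bp = layers.Conv2D(32, 1, padding="same", activation="relu")(bp)

outputs = layers.Concatenate()([b1, b3, b5, bp])  # concatenate along channels
block = Model(inputs, outputs)                    # real networks stack many such blocks
```

The 1×1 convolutions placed before the larger kernels are what keep the module computationally cheap, which is the efficiency goal attributed to GoogLeNet above.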
    Leaf-Based Species RecognitionUsing Convolutional Neural Networks 373 Table 3. Darknet-19 [13] Type Filters Stride/Size Output Dimension Convolutional 32 3 × 3 224 × 224 Maxpool 2 × 2/2 112 × 112 Convolutional 64 3 × 3 112 × 112 Maxpool 2 × 2/2 56 × 56 Convolutional 128 3 × 3 56 × 56 Convolutional 64 1 × 1 56 × 56 Convolutional 128 3 × 3 56 × 56 Maxpool 2 × 2/2 28 × 28 Convolutional 256 3 × 3 28 × 28 Convolutional 128 1 × 1 28 × 28 Convolutional 256 3 × 3 28 × 28 Maxpool 2 × 2/2 14 × 14 Convolutional 512 3 × 3 14 × 14 Convolutional 256 1 × 1 14 × 14 Convolutional 512 3 × 3 14 × 14 Convolutional 512 3 × 3 14 × 14 Convolutional 256 1 × 1 14 × 14 Maxpool 2 × 2/2 7 × 7 Convolutional 1024 3 × 3 7 × 7 Convolutional 512 1 × 1 7 × 7 Convolutional 1024 3 × 3 7 × 7 Convolutional 512 1 × 1 7 × 7 Convolutional 1024 3 × 3 7 × 7 Convolutional 1000 1 × 1 7 × 7 Avgpool Global 1000 Softmax GoogLeNet is a 22 layer network, considering only convolutional layers, as shown in Table 4. The last layer uses the Softmax function to perform classifica- tion [15]. 2.5 Related Work Many solutions based on deep learning have been used in specie identification problems, due to recent results using this technique. Other strategies can also be used, based on image processing and pattern recognition. These techniques use macro and microscopic characteristics of the image. For example, [8] classifies wood applying the following image characteristics extraction techniques: color
    374 W. O.Pires et al. Fig. 6. Inception module [14] Table 4. GoogLeNet’s structure [13] Type Filters/Stride Output Dimension Convolution 7 × 7/2 112 × 112 × 64 Max Pool 3 × 3/2 56 × 56 × 64 Convolution 3 × 3/1 56 × 56 × 192 Max Pool 3 × 3/2 28 × 28 × 192 Inception (3a) 28 × 28 × 256 Inception (3b) 28 × 28 × 480 Max Pool 3 × 3/2 14 × 14 × 480 Inception (4a) 14 × 14 × 512 Inception (4b) 14 × 14 × 512 Inception (4c) 14 × 14 × 512 Inception (4d) 14 × 14 × 512 Inception (4e) 14 × 14 × 832 Max Pool 3 × 3/2 7 × 7 × 832 Inception (5a) 7 × 7 × 832 Inception (5b) 7 × 7 × 1024 Avg Pool 7 × 7/1 1 × 1 × 1024 Dropout (40%) 1 × 1 × 1024 Linear 1 × 1 × 1000 Softmax 1 × 1 × 1000 analysis, GLCM (Gray-Leval Co-occurence Matrix), border histogram, fractals, LBP (Local Binary Pattern), LPQ (Local Phase Quantization) and Gabor filter. This approach resulted in a recognition rate of 99.49% among 42 species. Another work with related theme proposes analysis and identification of plant species based in texture characteristics extraction from microscopic leaf epider- mis images [7]. Texture extraction techniques were used to analyze 32 species. This approach had 96% success rate. By utilizing leaves, [11] applied image
segmentation techniques for feature extraction. This was performed using the GLCM technique and feature vectors were extracted. The achieved recognition rate was around 75.4%, using techniques like MLP (Multilayer Perceptron), SMO (Sequential Minimal Optimization) and LibSVM (Library for Support Vector Machines) as classifiers.

Using deep learning as the main identification method, the strategy is basically to use CNNs to identify the best characteristics from the leaf to recognize a species. Following this strategy, [6] built CNNs for weed control, which try to detect a species in the lawn, using 256 × 256 pixel images and the AlexNet architecture. The result was 75% precision [9]. Among other CNN approaches, [17] proposes the analysis of pictures taken from the top of farms, which demanded extra preprocessing to enhance the image illumination before sending it to the network, which is composed of 5 convolutional layers with the ReLU activation function and, at the end of the network, a Softmax function. Their experiment obtained 97.47% precision.

3 Method

3.1 Data Collection

Initially, leaves from 29 different species were collected. The species are listed in Sect. 4 (Table 4). For each species, 100 photos were taken on both sides. Images were obtained using a photobooth proposed by [11]. It measures 40 square centimeters and its interior contains LED strips, which produce high luminosity at low energy cost. These LEDs can be fed using batteries or a 12 V power supply, and can be easily transported. To avoid reflections on the images, the internal walls were painted black, except for the bottom, which is white. The leaf is positioned on the bottom of the photobooth, where it is compressed by a glass pane to keep it fixed and flat. This dataset is in the process of being publicly released.

After gathering enough photos, data augmentation techniques were used to artificially increase the dataset size, as many samples are necessary to train a Deep Neural Network. Data augmentation was done using Python 3.5.2 and Keras scripts to alter images, creating new ones. The Keras ImageDataGenerator class was used to generate new image samples from the original data, using the default values for the operations Rotation Range, Width Shift, Shear Range, Zoom Range, Horizontal and Vertical Flip and Fill Mode.

3.2 Training

The models were then trained with the augmented data and had their results evaluated, in order to improve performance. Both models, GoogLeNet and DarkNet19, were trained four times. For GoogLeNet, we used the work of [13] as a reference to develop and train our network. Originally, the network was developed with a batch size of 50, Reduce 1 × 1 of 104 and a dropout rate of 0.5.
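As a concrete illustration of the augmentation step described in Sect. 3.1, a hedged Keras sketch using the ImageDataGenerator class mentioned there is shown below. The parameter values and the directory layout are assumptions made for the example; the paper does not report the exact settings.

```python
# Hedged sketch of leaf-image augmentation with the Keras ImageDataGenerator.
# The numeric values are illustrative, not the authors' configuration.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=40,        # Rotation Range
    width_shift_range=0.2,    # Width Shift
    shear_range=0.2,          # Shear Range
    zoom_range=0.2,           # Zoom Range
    horizontal_flip=True,     # Horizontal Flip
    vertical_flip=True,       # Vertical Flip
    fill_mode="nearest",      # Fill Mode
)

# Hypothetical layout: one sub-directory per species under dataset/train.
train_batches = augmenter.flow_from_directory(
    "dataset/train", target_size=(224, 224), batch_size=32, class_mode="categorical"
)
```

Whether the augmented copies are generated offline with scripts, as the authors did, or on the fly as above, the effect is the same: the training set is enlarged without taking new photographs.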
Regarding Darknet-19, we used the Python implementation Darkflow, which allows the Darknet framework to be used with Tensorflow [4]. Tiny-YOLO version 2 was used instead of YOLO version 3, as it was faster to train. As 29 classes were used, the number of feature maps in the last layer was 170, with a stride of size 1 and a batch size of 64, following the default configuration for Tiny-YOLO version 2.

4 Tests and Results

The original dataset was divided into 5,800 images for training and 580 for testing. In order to improve results, both sets were subject to augmentation, generating 34,800 images for training and 2,900 for testing. The augmented dataset contains independent training and test sets, as no original image from training has an augmented version in the test set. Similarly, augmented images in training have no version in the test set.

The Darknet model was first trained on the original dataset for 5,000 iterations, presenting True Positives (TP) = 414, False Positives (FP) and False Negatives (FN) of 166, and a harmonic mean of 71.3%. After the first experiment, the model was then trained over the augmented dataset for 18,000 iterations. The values presented by the network were TP = 2502, FP and FN of 398, and a harmonic mean of 86.2%. Precision and recall were calculated for each class. Results are presented in Table 4 and Fig. 7.

The GoogLeNet experiments were analogous to the Darknet-19 ones. First the model was trained over the original dataset (5,800 images for training and 580 for testing). This process was performed for 2,000 iterations. The network presented TP = 461, FP and FN of 119, and a harmonic mean of 79.4%. For the second experiment (34,800 images for training and 2,900 for validation) the iteration count was raised to 4,000. The network presented the following values: TP = 2,633, FP and FN of 267, and a harmonic mean of 90.7%. Values for precision, recall and harmonic mean were also calculated per class and are presented in Table 4 and Fig. 8.

In the performed experiments, GoogLeNet had a better result than Darknet. The differences between both networks in precision, recall and harmonic mean were, respectively, 4.5, 4.8 and 4.7%. The GoogLeNet results are good, even with a broad and complicated dataset, as 29 classes were classified, many of which are similar. Analyzing the results with and without data augmentation, the importance of a sufficient amount of data to work with CNNs became clear. The differences between Darknet and GoogLeNet trained with small and sufficient amounts of data were, respectively, 14.9% and 11.3%. The developed data augmentation algorithm achieved its objective, expanding the dataset without damaging samples.
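The "harmonic mean" reported above is the F1 score computed from the TP/FP/FN counts; a short check of the arithmetic is given below (the printed values agree with the reported percentages to within rounding).

```python
# Recomputing precision, recall and their harmonic mean (F1) from the counts above.
def f1(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

print(f"Darknet, original:    {100 * f1(414, 166, 166):.1f}%")   # ~71.3% reported
print(f"Darknet, augmented:   {100 * f1(2502, 398, 398):.1f}%")  # ~86.2% reported
print(f"GoogLeNet, original:  {100 * f1(461, 119, 119):.1f}%")   # ~79.4% reported
print(f"GoogLeNet, augmented: {100 * f1(2633, 267, 267):.1f}%")  # ~90.7% reported
```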
    Leaf-Based Species RecognitionUsing Convolutional Neural Networks 377 Table 5. Precision, recall and harmonic mean in details - YOLO and GoogLeNet Scientific Name YOLO Precision YOLO Recall YOLO F1 GoogLeNet Precision GoogLeNet Recall GoogLeNet F1 1 Persea Americana 63,0% 64,2% 72,6% 73,0% 100,0% 84,3% 2 Eriobotrya Japnica Lind 57,0% 100% 72,6% 75,0% 100% 85,7% 3 Psidium Rufum 100% 80,0% 88,8% 100% 74,0% 85,1% 4 Annona Montana 96,0% 56,1% 70,8% 98,0% 64,0% 77,4% 5 Annona Squamosa 97% 100% 98,4% 100% 100% 100% 6 Cojoba Arborea 100% 100% 100% 100% 100% 100% 7 Coffea 55,0% 55,0% 55,0% 72,0% 75,0% 73,4% 8 Pera Heteranthera 95,0% 100% 97,4% 98,0% 82,3% 89,4% 9 Anacardium Occidentale 100% 100% 100% 100% 100% 100% 10 Peltophorum dubium 100% 84,0% 91,3% 100% 83,3% 90,9% 11 Nectandra Megapotamica 98% 100% 98,9% 94% 100% 96,9% 12 Cerasus 83,0% 89,2% 86,0% 90,0% 95,7% 92,7% 13 Prunus Serrulata 78,0% 100% 87,6% 91,0% 100% 95,2% 14 Salix Babylonica 100,0% 100% 100% 82,0% 100% 90,1% 15 Lle Paraguariensis 81,0% 100% 89,5% 80,0% 100% 88,8% 16 Annona Coriácea 94% 80,3% 86,6% 100% 94,3% 97,0% 17 Psidium Guajava 83,0% 76,8% 79,8% 94,0% 88,6% 91,2% 18 Annona Muricata 98% 91,5% 94,6% 100% 98,0% 99,0% 19 Syzygium Cumini 84,0% 85,7% 84,8% 97,0% 100% 98,4% 20 Leucaena Leucocephala 100% 100% 100% 100% 98% 99% 21 Citrus Limon 72% 76,5% 74,2% 76% 100% 86,3% 22 Tibouchina Mutabilis 100% 83,3% 90,9% 100% 90,0% 94,7% 23 Brunfelsia Uniflora 81,0% 81,0% 81,0% 90,0% 100,0% 94,7% 24 Mangifera Indica 97% 100% 98,4% 98% 79,6% 87,8% 25 Licania Tomentosa 61,0% 100% 75,7% 78,0% 81,2% 79,5% 26 Dypsis Lutescens 100% 100% 100% 100% 100% 100% 27 Paubrasilia Echinata 86,0% 100% 92,4% 89,0% 100% 94,1% 28 Aspidosperma Polyneuron 78,0% 72,2% 75,0% 82,0% 90,1% 85,8% 29 Eugenia uniflora 65,0% 100% 88,8% 76,0% 77,5% 78,7%
    378 W. O.Pires et al. Fig. 7. Confusion Matrix Heatmap - Darknet19 Fig. 8. Confusion Matrix Heatmap - GoogLeNet
5 Conclusions

This work presented a comparison of Darknet-19 and GoogLeNet for tree species recognition using a dataset composed of leaf images from 29 different species, reaching recognition rates of 86.2% and 90.3%, respectively. The obtained results demonstrate the viability of the GoogLeNet and Darknet networks for classification. The models can be applied in field research, especially for identifying species in natura.

For future work, we plan to test the models against images of leaves that were not removed from the tree. We also plan to use pre-trained Darknet networks on other platforms, such as smartphones, aiming at practical uses of the model and comparing it with similar systems. YOLO has functionality for detecting objects in videos, and Android Studio allows Tensorflow usage while associating a YOLOv2 training model. By using smartphone cameras, it is possible to develop an app to recognize plant species in video. Another suggestion would be to use the model in drones, as there is a huge amount of non-registered plant species, in order to explore areas of limited access.

Acknowledgements. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the GPU used in part of the experiments presented in this research.

References

1. Baldi, P., Sadowski, P.J.: Understanding dropout. In: 2013 Neural Information Processing Systems (2013). https://papers.nips.cc/paper/4878-understanding-dropout
2. Farhadi, A., Girshick, R., Redmon, J.: You only look once: unified, real-time object detection. In: University of Washington. IEEE (2015)
3. Goodfellow, I., Bengio, Y., Courville, A., Bengio, Y.: Deep Learning, vol. 1. MIT Press, Cambridge (2016)
4. Jones, R.P.: Darkflow (2018). https://medium.com/@richardpricejones/darkflow-9bdc9f9b818e
5. Marchiori, J.N.C.: Elementos de Dendrologia. UFSM, Santa Maria (1995)
6. Mayo, S.J., Remagnino, P.: How deep learning extracts and learns leaf features for plant classification. Auton. Robot. 1–13 (2016)
7. Odemir, B., et al.: Leaf epidermis images for robust identification of plants. In: Instituto de Física de São Carlos, pp. 2–10 (2017)
8. de Paulo Filho, P.L.: Reconhecimento de espécies florestais através de imagens macroscópica. In: UFPR, pp. 9–48 (2012)
9. Pearlstein, L., Kim, M., Seto, M.: Convolutional neural network application to plant detection, based on synthetic imagery. In: Proceedings - Applied Imagery Pattern Recognition Workshop (2017). www.scopus.com
10. Pinheiro, A.L.: Fundamentos de taxonomia e dendrologia tropical. SIF, Santa Maria (2000)
11. Pires, W.O.: Reconhecimento de espécies florestais utilizando técnicas de processamento de imagem. In: 7o Seminario de extenção e inovação, Londrina (2017)
    380 W. O.Pires et al. 12. Shrivastava, P.: The role of corporations in achieving ecological sustainability. Acad. Manag. Rev. 20(4), 936–960 (1995) 13. Szegedy, C., Liu, W., Jia, Y.: Going deeper with convolutions. In: IEEE/ Boston, MA, USA. IEEE (2015) 14. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the incep- tion architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016) 15. Vargas, A.C.G., Paes, A., Vasconcelos, N.: Um estudo sobre redes neurais convolu- cionais e sua aplicação em detecção de pedestres. In: IEEE/Conferencia Universi- dade Federal de Fluminence de Niterói. IEEE (2015) 16. Wurtz, R.H.: Recounting the impact of hubel and wiesel. Pattern Recogn. 32, 1–20 (2009). https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718241/ 17. Yalcin, H., Razavi, S.: Plant classification using convolutional neural networks. In: 2016 5th International Conference on Agro-Geoinformatics, Agro-Geoinformatics 2016 (2016). www.scopus.com
Deep Learning Recognition of a Large Number of Pollen Grain Types

Fernando C. Monteiro(B), Cristina M. Pinto, and José Rufino

Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253 Bragança, Portugal
{monteiro,rufino}@ipb.pt
https://cedri.ipb.pt

Abstract. Pollen in honey reflects its botanical origin, and melissopalynology is used to identify the origin, type and quantities of pollen grains of the botanical species visited by bees. Automatic pollen counting and classification can alleviate the problems of manual categorisation, such as subjectivity and time constraints. Despite the efforts made during the last decades, the manual classification process is still predominant. One of the reasons for that is the small number of types usually used in previous studies. In this paper, we present a large study to automatically identify pollen grains using nine state-of-the-art CNN techniques applied to the recently published POLEN73S image dataset. We observe that existing published approaches used original images without studying the possible recognition bias due to the pollen's background colour or the use of preprocessing techniques. Our proposal manages to classify up to 97.4% of the samples from the dataset with 73 different types of pollen. This result, which surpasses previous attempts in the number and difficulty of pollen types under consideration, is an important step towards fully automatic pollen recognition, even with a large number of pollen grain types.

Keywords: Pollen recognition · Convolutional neural network · Deep learning · Image segmentation

1 Introduction

Food fraud has devastating consequences, particularly in the field of honey production, which the U.S. Pharmacopeia Fraud Database1 has classified as the third largest area of adulteration, only behind milk and olive oil. Our aim is to find solutions that help solve this problem and prevent its recurrence. The determination of the botanical origin can be used to label honey, and knowledge of the geographic origin is a factor that considerably influences the commercial value of the product and can be used for quality control and to avoid fraud [6]. Although demanding, pollen grain identification and certification are crucial tasks, accounting for a variety of questions like pollination or palaeobotany, but

1 https://decernis.com/solutions/food-fraud-database/.

© Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 381–392, 2021.
https://doi.org/10.1007/978-3-030-91885-9_28
    382 F. C.Monteiro et al. also for other fields of research, including crime scene investigation [13], aller- gology studies [7] as well as the botanical and geographical studies concerning origins of honey to prevent honey labelling fraud [15]. However, most of the pollen classification is a time consuming, laborious and a highly skilled work, visually done by human operators using microscopes, trying to identify differ- ences and similarities between pollen grains. These differences are, frequently, imperceptible among pollen grains and may lead to identification errors. Despite the efforts to develop approaches that allow the automatic iden- tification of pollen grains [11,22], the discrimination of features performed by qualified experts is still predominant [4]. Many industries, including medical, pharmaceutical and honey marketing, depend on the accuracy of this manual classification process, which is reported to be around 67% [19]. A notorious paper [22] from 1996 published a brief summary of the state of the art until then and, more importantly, the demands and needs of palynology to elevate the field to a higher level, thus making it a more powerful and useful tool. Pattern recognition from images has a long and successful history. In recent years, deep learning, and in particular Convolutional Neural Networks (CNNs), has become the dominant machine learning approach in the computer vision field, specifically, in image classification and recognition tasks. Since the number of annotated pollen images in the publicly available datasets is too small to train a CNN from scratch, transfer learning can be employed. In this paper we pro- pose an automatic pollen recognition approach divided into three steps: initially, the regions which contain pollen are segmented from the background; then, the colour is preprocessed; finally, the pollen is recognized using deep learning. Most object recognition algorithms focus on recognizing the visual patterns in the foreground region of the images. Some studies indicate that convolutional neural networks (CNN) are biased towards textures [3], whereas another set of studies suggests shape bias for a classification task [2]. However, little atten- tion has been given to analyze how the recognition process is influenced by the background information in the training process. Considering that there are certain similarities between the layers of a trained artificial network and the recognition task in the human visual cortex, in this study, we hypothesize that if the collected images of a pollen have a unique background colour, different from all the other pollens, it may biases the recog- nition task, since the recognition could be based only in the background colour. In order to study such influence we trained the CNN with several datasets: one composed with original images, another composed with segmented images (where the background colour was eliminated), and with preprocessed images with histogram equalization and contrast limit adaptive histogram equalization (CLAHE) techniques. The acquisition of images usually has some different sources resulting in images with different background, as shown in Fig. 1 from the POLEN73S2 dataset [1] used in this study. Deep learning based pollen recognition meth- ods focus on learning visual features to distinguish different pollen grains. We 2 https://doi.org/10.6084/m9.figshare.12536573.v1.
    Deep Learning Recognitionof a Large Number of Pollen Grain Types 383 observed that existing published approaches used the entire image with the orig- inal background, in the training process. Background and foreground pixels in each image contribute with the same influence into the learning algorithms. As each pollen type has a different background from other types, when those trained networks are used to classify the pollens they may be biased by capture relevance from pollen’s background which may result in biased recognition. Fig. 1. Pollen dataset samples acquired with different background colours. In this paper, we investigate the background and colour preprocessing influ- ence by training nine state-of-the-art deep learning convolutional neural net- works for pollen recognition. We used a recently published POLEN73S image dataset that includes more than three times as many pollen types and images as the POLEN23E dataset used in recent studies. Our approach manages to classify up to 97.4% of the samples from the dataset with 73 different types of pollen. The remainder of this paper is organized as follows: Previous related works are presented and reviewed in Sect. 2. In Sect. 3, we describe the used materials and the proposed method. Section 4 presents the results and the discussion of the findings. Finally, some conclusions are drawn in Sect. 5. 2 Related Works Automatic and semi-automatic systems for pollen recognition based on image features, in particular neural networks and support vector machines, have been proposed for a long time [9,11,17,21]. In general terms, those approaches extract some feature characteristics to identify each pollen type. Although the classification remains based on a combination of image features, the deep learning CNN approach builds a model determining and extracting the features itself, in alternative of being predefined by human experts. Several
    384 F. C.Monteiro et al. CNN learning techniques have been developed for classifying pollen grain images [1,8,18,19]. In [8], Daood et al. present an approach that learns from the image features and defines the model classifier from a deep learning neural network. This method achieved a 94% classification rate on a dataset of 30 pollen types. Sevillano and Aznarte in [18] and [19] proposed a pollen classification method that applied transfer learning on the POLEN23E dataset and to a 46 different pollen types dataset, achieving accuracies of over 95% and 98%, respectively. In [1], Astolfi et al. presented the POLEN73S dataset and made an extensive study with several CNNs, achieving an accuracy of 95.8%. Despite the importance of their study, we identify two drawbacks in their approach, that influenced the performance: they used different number of samples for each pollen type, and used, for each pollen, an image background that is different from the image background of other pollen types. 3 Experimental Setup 3.1 Pollen Dataset The automation of pollen grain recognition depends on large image datasets with many samples categorized by palynologists. The results obtained depend on the number of pollen types and the number of samples used. Few samples may result in poor learning models, that are not sufficient to train conveniently the CNN; on the other hand, a small number of pollen types simplifies the identification process making it impractical to be used for recognizing large numbers of pollens usually found in a honey sample. While a number of earlier datasets have been used for pollen grain classifi- cation, such as the POLEN23E3 dataset [9] or the Pollen Grain Classification Challenge dataset4 , which contain 805 (23 pollinic types) and 11.279 (4 pollinic types) pollen images, respectively, in this paper we use POLLEN73S, which is one of the largest publicly available datasets in terms of pollen types number. POLLEN73S is an annotated public image dataset, for the Brazilian Savan- nah pollen types. According to its description in [1] the dataset includes pollen grain images taken with a digital microscope at different angles and manually classified in 73 pollen types, containing 35 sample images for each pollen type, except gomphrena sp, trema micrantha and zea mays, with 10, 34 and 29 sam- ples, respectively. From the results presented in [1], we observed that these small number of samples biased the results. Since CNNs were trained with a smaller number of samples for those types of pollens, this resulted in the worst classifica- tion scores relative to the other pollens. To overcome this problem, in our study, several images were generated through rotating and scaling the original images of these pollen types, ensuring the same number of samples for each pollen type, which gives a total of 2555 pollen images. Although the images in the dataset 3 https://academictorrents.com/details/ee51ec7708b35b023caba4230c871ae1fa25 4ab3. 4 https://iplab.dmi.unict.it/pollenclassificationchallenge/.
    Deep Learning Recognitionof a Large Number of Pollen Grain Types 385 have different width and height, they were resized accordingly with the image size of each CNN architecture input. More datasets were constructed by removing the pollen’s background colour (see Fig. 2). Since the images background has medium contrast with the pollen grains, the segmentation process uses just automatic thresholding and morpho- logical operations. We also applied histogram equalization and contrast limit adaptive histogram equalization (CLAHE) to those segmented images. These new datasets allow the independence of training and testing processes from the background colour among the pollen types. Fig. 2. First column: original images; second column: segmented images with back- ground colour removed; third column: segmented equalized images; fourth column: segmented CLAHE images. 3.2 Convolutional Neural Networks Architectures CNN is a type of deep learning model for processing images that is inspired by the organization of the human visual cortex and is designed to automatically create and learn feature hierarchies through back-propagation by using multiple layer blocks, such as convolution layers, pooling layers, and fully connected layers from low to high level patterns [2]. This technology is especially suited for image processing, as it makes use of hidden layers to convolve the features with the
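A hedged sketch of the preprocessing described above in Sect. 3.1 is given below using OpenCV. The concrete choices (Otsu thresholding, a morphological closing, equalization/CLAHE applied to the luminance channel) are plausible readings of the description, not necessarily the authors' implementation, and the file name is hypothetical.

```python
# Hedged sketch: remove the background colour by automatic thresholding plus
# morphology, then build the equalized and CLAHE variants of the segmented image.
import cv2

img = cv2.imread("pollen_sample.jpg")                 # hypothetical file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Otsu picks the threshold automatically; swap THRESH_BINARY/THRESH_BINARY_INV
# depending on whether the background is brighter or darker than the grain.
_, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
segmented = cv2.bitwise_and(img, img, mask=mask)      # background set to black

# Colour preprocessing variants, applied to the L channel of the segmented image.
l, a, b = cv2.split(cv2.cvtColor(segmented, cv2.COLOR_BGR2LAB))
equalized = cv2.cvtColor(cv2.merge([cv2.equalizeHist(l), a, b]), cv2.COLOR_LAB2BGR)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
clahe_img = cv2.cvtColor(cv2.merge([clahe.apply(l), a, b]), cv2.COLOR_LAB2BGR)
```

Whatever the exact operators, the property that matters for the bias study is that the background pixels become identical across pollen types, so the networks can no longer use the background colour as a cue.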
    386 F. C.Monteiro et al. input data. The automatic extraction of the most discriminant features from a set of training images, suppressing the need for preliminary feature extraction, became the main strength of CNN approaches. In this section, we present an overview of the main characteristics of the CNNs used in this study for the recognition of pollen grains types. We choose nine popular CNN architectures due to their performance on previous classifica- tion tasks. Table 1 contains a list (chronological sorted) of state-of-the-art CNN architectures, along with a high-level description of how the building blocks can be combined and how the information moves throughout the architecture. 3.3 Transfer Learning Constraints of practical nature, such as the limited size of training data, degrade the performance of CNNs trained from scratch [18]. Since there is so much work that has already been done on image recognition and classification [10,12,20], in this study we used transfer learning to solve our problem. With transfer learning, instead of starting the learning process from scratch, with a large number of samples, we can use previous patterns that have been learned when solving a similar classification problem. Transfer learning is a technique whereby a CNN model is first trained on a large image dataset with a similar goal to the problem that is being solved. Several layers from the trained model, usually the lower layers, are then used in a new CNN, trained with sampled images from the current task. This way, the learned features in re-used layers are the starting points for the training process and adapted to classify new types of objects. Transfer learning has the benefit of reducing the training time for a CNN model and can overcome the generalization error due to the small number of images used in the training process when using a network from scratch. The previous obtained weights, in each layer, may be used as the starting values for the next layers and adapted in response to the new problem. This usage treats transfer learning as a type of weight initialization scheme. This may be useful when the first related problem has a lot more labelled data than the problem of interest and the similarity in the structure of the problem may be useful in both contexts. 3.4 Training Process In the training process, the CNNs use the fine-tuning strategy, as well as the stochastic gradient descent with momentum optimizer (SGDM) at their default values, dropout rate set at 0.5 and early-stopping to prevent over-fitting, and the learning rate at 0.0001. SGDM is used to accelerate gradients vectors in the correct directions, as we do not need to check all the training examples to have knowledge of the direction of decreasing slope, thus leading to faster converging. Additionally, to consume less memory and train the CNNs faster, we used the CNNs batch size at 12 to update the network weights more often, and trained them in 30 epochs. All images go through a heavy data augmentation
    Deep Learning Recognitionof a Large Number of Pollen Grain Types 387 Table 1. Chronological list and descriptions of CNN architectures used in this paper. VGG 16/19 [20] Introduced the idea of using smaller filter kernels allowing for deeper networks, and training these networks using pre- training outputs of superficial layers. They have five convolu- tional blocks where the first two have convolution layers and one max-pooling layer in each block. The remaining three blocks have three fully-connected layers equipped with the rectification non-linearity (ReLU) and the final softmax layer ResNet 50/101 [10] Shares some design similarities with the VGG architectures. The batch normalization is used after each convolution layer and before activation. These architectures introduce the residual block that aims to solve the degradation problem observed during network training. In the residual block, the identity mapping is performed, creating the input for the next non-linear layer, from the output of the previous layer Inception-V3 [24] This network has three inception modules where the resulting output of each module is the concatenation of the outputs of three convolutional filters with different sizes. The goal of these modules is to capture different visual patterns of different sizes and approximate the optimal sparse structure. Finally, before the final softmax layer, an auxiliary classifier acts as a regularization layer Inception-ResNet [23] Uses the combination of residual connections and the Incep- tion architecture. In Inception networks the gradient is back- propagated to earlier layers, and repeated multiplication may make the gradient indefinitely small, so they replaced filter concatenation stage with residual connections as in ResNet Xception [5] The architecture is composed of three blocks, in a sequence, where convolution, batch normalization, ReLU, and max pooling operations are made. Besides, the residual connec- tions between layers are made as in Resnet architecture DenseNet201 [12] Is based on the ideas of ResNet, but built from dense blocks and pooling operations, where each dense block is an iterative concatenation from all previous layers. In the main blocks, the layers are densely connected to each other. Massive reuse of residual information allows for deep supervision as each layer receives more information from the previous layer and therefore the loss function will react accordingly, which makes it a more powerful network DarkNet53 [16] It has 53 layers deep and acts as a backbone for the YOLOv3 object detection approach. This network uses successive con- volutional layers with some shortcut connections (introduced by ResNet to help the activations propagate through deeper layers without gradient vanishing) to improve the learning ability. Batch Normalization is used to stabilize training, speed up convergence, and regularize the model batch
    388 F. C.Monteiro et al. which includes horizontal and vertical flipping, 360o random rotation, rescaling factor between 70% and 130%, and horizontal and vertical translations between −20 and +20 pixels. The CNNs were trained using the Matconvnet package for Matlab R on a node of the CeDRI cluster with two NVIDIA RTX 2080 Ti GPUs. As in [1], we used 5-Set Cross-Validation, where in each set the images were split on two subsets, 70% (1825 images) for training and 30% (730 images) for testing, allowing the CNN networks to be independently trained and tested on different sets. Since each testing set is build with images not seen by the training model, this allows us to anticipate the CNN behaviour against new images. The four datasets (original, segmented, segmented with equalization and segmented with CLAHE) were trained and tested in an independent way. 4 Results and Discussion Other works use different evaluation metrics like Precision, Recall, F1-score [1] or correct classification rate (CCR) [18]. However, those metrics use the concept of true negative and false negative. As in this type of experiments we only obtain true positives or false positives, we evaluate the results with Accuracy (Precision gives the same score), which relates true positive with all possible results. The evaluation results for the nine CNN architectures considered, with dif- ferent colour pre-processing techniques, are presented in Table 2. The numbers exhibited in bold indicate the best Accuracy result obtained for each network. Table 2. Classification results (in percentage) on the test set for the different CNNs and preprocessing techniques considered. CNN Original Segmented Seg. Equal. Seg. CLAHE VGG16 94.2 94.3 88.7 92.6 VGG19 91.9 92.4 92.1 92.1 ResNet50 95.1 95.2 93.2 94.8 ResNet101 94.5 94.6 93.0 95.2 Inception V3 92.3 92.8 94.0 90.8 Inception-ResNet 92.3 89.9 92.4 90.3 Xception 89.7 87.8 87.5 88.1 DenseNet201 96.7 97.4 96.6 95.9 DarkNet53 95.8 95.9 95.1 94.9 Based on the results of Table 2, we can conclude that segmenting pollen grains images improves the classification performance for the majority of CNN models allowing the DenseNet201 to achieve an accuracy of 97.4%. Only the Xception network produces better results for the original images. The Inception architectures achieve the best performance with segmented histogram equalized
    Deep Learning Recognitionof a Large Number of Pollen Grain Types 389 images. The remain architectures achieved the highest performance when using segmented images without any colour processing. The DenseNet201 classified correctly all the images for 65 pollen grain types out the 73 types of the dataset. For the other 9 types, it misclassified up to two images, with a total of eleven false positives in the 730 tested images. The lowest accuracy result of the DenseNet201 was achieved with the pollen types dipteryx alata and myrcia guianensis. These pollen types have predominantly rounded shapes and high texture, that are normally learned in the first CNN layers. Since the transfer learning process changes only a set of the last CNN layers it does not change those learned features during the training process with our images, producing some misclassified results. The accuracy rates achieved by the DenseNet201 network are relevant due to the amount of pollen types in the POLEN73S dataset, since Sevillano et al. [19] obtained a higher accuracy in a dataset containing only 46 pollen types. That shows that DenseNet201 presented an important performance on POLEN73S. The network trained and tested using the segmented images produced false positives results that are misclassified as pollens that have high similarity with the tested ones. Figure 3 shows some of those false positive examples. Fig. 3. First row: segmented tested pollens (magnolia champaca, myrcia guianen- sis, dipteryx alata, arachis sp); second row: misclassified pollens (ricinus communis, schizolobium parahyba, zea mays, myracroduon urundeuva). In networks trained and tested with segmented images the background colour bias information was removed, and so the pollen is classified using only the grain pollen information, correcting some of the false positives of the network trained with original images, where the background colour was used as a feature in the classification process. The high values for the evaluation metric in all CNNs show that the number of correctly identified pollens is high when compared to the number of tested
    390 F. C.Monteiro et al. Table 3. Comparison with previous attempts at pollen classification of more than 20 pollen types using a CNN classifier, with number of types and the highest reported accuracy. Authors #Types Accuracy Sevillano et al. [18] 23 97.2 Menad et al. [14] 23 95.0 Daood et al. [8] 30 94.0 Sevillano et al. [19] 46 97.8 Astolfi et al. [1] 73 95.8 Our approach 73 97.4 images. We believe that an accuracy over 97% is enough to build an automatic classification system of pollen grains, since the visual classification performed by human operators is a hard and time consuming task with a lower performance. 4.1 Comparison with Other Studies We compared our results with other automatic approaches, from the current literature, that used a CNN classifier. Previous deep learning approaches have shown similar or higher accuracy rates to ours, but these studies were conducted with a small number of pollen types. Table 3 provides a summary table of previ- ous studies, including class sizes and accuracy/success rates against our result. All the literature reviewed, except [1], used a significantly smaller image dataset, in terms of pollen types, than the one used in this paper. Although the work of Sevillano et al. [19], with forty six types of pollen, achieved a slightly higher performance than our study, as the number of pollen types is directly related to the classification performance of the CNNs, the results must be evaluated taking into account this difference in the number of pollen types between the work presented in [19] and ours. In short, it can thus be concluded that training a network with the atten- tion focused on the object itself by removing the background dissimilarities can improve the performance of CNN model in pollen classification problem. 5 Conclusion The usual method for pollen grains identification is a qualitative approach, based on the discrimination of pollen grain characteristics by an human operator. Even though this manual method is quite effective, the all process is time consuming, laborious and sometimes subjective. Creating an automatic approach to identify the grains, in a precise way, thus represents a task of utmost interest. In this study, an automated pollen grain recognition approach is proposed. We investigate the influence of background colours and colour pre-processing in
    Deep Learning Recognitionof a Large Number of Pollen Grain Types 391 the recognition task using nine state-of-the-art CNN topologies. Using a combi- nation of an image-processing workflow and a sufficiently trained deep learning model, we were able to recognize pollen grains from seventy three pollen types, one of the largest number of pollen types studied until now, achieving an accu- racy of 97.4% that represents one of the best success rate so far (when weighted for the number of pollen types used in this work). This study proves that using deep learning CNN architectures for the pollen grain recognition task allows good classification results when using a transfer learning approach. In the future, we plan to combine the features from several CNNs enhancing the effectiveness of deep learning approaches in pollen grain recognition. References 1. Astolfi, G., et al.: POLLEN73S: an image dataset for pollen grains classification. Ecol. Inform. 60, 101165 (2020) 2. Baker, N., Lu, H., Erlikhman, G., Kellman, P.J.: Local features and global shape information in object classification by deep convolutional neural networks. Vis. Res. 172, 46–61 (2020) 3. Bianco, S., Cusano, C., Napoletano, P., Schettini, R.: Improving CNN-based tex- ture classification by color balancing. J. Imaging 3(33) (2017) 4. Buters, J., et al.: Pollen and spore monitoring in the world. Clin. Transl. Allergy 8(9) (2018) 5. Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1800– 1807 (2017) 6. Corvucci, F., Nobili, L., Melucci, D., Grillenzoni, F.V.: The discrimination of honey origin using melissopalynology and raman spectroscopy techniques coupled with multivariate analysis. Food Chem. 169, 297–304 (2015) 7. D’Amato, G., et al.: Allergenic pollen and pollen allergy in Europe. Allergy 62(9), 976–990 (2007) 8. Daood, A., Ribeiro, E., Bush, M., et al.: Pollen grain recognition using deep learn- ing. In: Bebis, G. (ed.) ISVC 2016. LNCS, vol. 10072, pp. 321–330. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-50835-1 30 9. Gonçalves, A.B., et al.: Feature extraction and machine learning for the classifica- tion of Brazilian savannah pollen grains. PLoS ONE 11(6), e0157044 (2016) 10. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016) 11. Holt, K.A., Bennet, K.D.: Principles and methods for automated palynology. New Phytol. 203(3), 735–742 (2014) 12. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2261–2269 (2017) 13. Laurence, A.R., Bryant, V.M.: Forensic Palynology, pp. 1741–1754. Springer, New York, NY (2014) 14. Menad, H., Ben-Naoum, F., Amine, A.: Deep convolutional neural network for pollen grains classification. In: JERI (2019)
    392 F. C.Monteiro et al. 15. Ponnuchamy, R., et al.: Honey pollen: using melissopalynology to understand for- aging preferences of bees in tropical south India. PLoS ONE 9(7), e101618 (2014) 16. Redmon, J., Farhadi, A.: YOLOv3: An incremental improvement. ArXiv p. 1804.02767 (2018) 17. Rodriguez-Damian, M., Cernadas, E., Formella, A., Fernandez-Delgado, M., Pilar De Sa-Otero: Automatic detection and classification of grains of pollen based on shape and texture. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 36(4), 531–542 (2006) 18. Sevillano, V., Aznarte, J.L.: Improving classification of pollen grain images of the POLEN23E dataset through three different applications of deep learning convolu- tional neural networks. PLoS ONE 13(9), e0201807 (2018) 19. Sevillano, V., Holt, K., Aznarte, J.L.: Precise automatic classification of 46 differ- ent pollen types with convolutional neural networks. PLoS ONE 15(6), e0229751 (2020) 20. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations, pp. 1–14 (2015) 21. Sobol, M.K., Finkelstein, S.A.: Predictive pollen-based biome modeling using machine learning. PLoS ONE 13(8), e0202214 (2018) 22. Stillman, E., Flenley, J.: The needs and prospects for automation in palynology. Quat. Sci. Rev. 15(1), 1–5 (1996) 23. Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, inceptionresnet and the impact of residual connections on learning. In: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, pp. 4278–4284 (2017) 24. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the incep- tion architecture for computer vision. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2818–2826 (2016)
    Predicting Canine HipDysplasia in X-Ray Images Using Deep Learning Daniel Adorno Gomes1 , Maria Sofia Alves-Pimenta1,2 , Mário Ginja1,2(B) , and Vitor Filipe1,3 1 University of Trás-os-Montes and Alto Douro, 5000-801 Vila Real, Portugal mginja@utad.pt 2 CITAB - Centre for the Research and Technology of Agro-Environmental and Biological Sciences, Vila Real, Portugal 3 INESC TEC - INESC Technology and Science, 4200-465 Porto, Portugal Abstract. Convolutional neural networks (CNN) and transfer learning are receiving a lot of attention because of the positive results achieved on image recognition and classification. Hip dysplasia is the most preva- lent hereditary orthopedic disease in the dog. The definitive diagnosis is using the hip radiographic image. This article compares the results of the conventional canine hip dysplasia (CHD) classification by a radiologist using the Fédération Cynologique Internationale criteria and the com- puter image classification using the Inception-V3, Google’s pre-trained CNN, combined with the transfer learning technique. The experiment’s goal was to measure the accuracy of the model on classifying normal and abnormal images, using a small dataset to train the model. The results were satisfactory considering that, the developed model classified 75% of the analyzed images correctly. However, some improvements are desired and could be achieved in future works by developing a software to select areas of interest from the hip joints and evaluating each hip individually. Keywords: Canine hip dysplasia · CHD · Image recognition · CNN · Convolutional neural network · Artificial neural network · Inception-V3 · Artificial intelligence · Machine learning 1 Introduction Canine hip dysplasia (CHD) is a hereditary disease that mainly affects dogs of large and giant breeds [1]. The disease begins with pain and often the dog that ran and jumped, cannot get up [2]. Genetic and environmental factors, such as excessive growth rate, obesity, excessive and inadequate exercise can contribute to the development of hip dysplasia in dogs [3]. The hip joint works like a ball and a socket. Dogs with hip dysplasia develop osteoarthritis and the ball and socket do not adjust [4]. Instead of sliding in a smooth way, they will rub and grind, causing discomfort and gradually loss of function. Diagnostic radiography is the main method used worldwide for screening hip dysplasia in dogs with c Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 393–400, 2021. https://doi.org/10.1007/978-3-030-91885-9_29
    394 D. A.Gomes et al. breeding or management purposes [5]. Currently, radiographic analysis of CHD is performed by veterinary radiologists using for scoring different schemes, of which we highlight: the Fédération Cynologique Internationale’s (FCI) system is most commonly used in continental European countries and uses five grades (normal, borderline and three dysplastic grades); the Orthopaedic Foundation for Animals (OFA) guidelines commonly used in the United States that uses seven grades (three normal, borderline and three dysplastic); the British Veteri- nary Association/Kennel Club scoring scheme is commonly used in the United Kingdon and is a system that attributes disease points, the final dog’s score ranges from 0 to 106 [4]. In general, all the schemes of assessment are consid- ered subjective, and sometimes it is recommended the radiographic evaluation by more than one examiner, being the agreement between the categories of dif- ferent evaluators in the order of 50–60% [6]. Figure 1 shows examples of normal (a) and CHD (b) X-ray images. Fig. 1. Examples of normal (a) and CHD (b) X-ray images. Advances in artificial intelligence, machine learning, image recognition and classification, allied with the radiologic equipment, can be useful and help to improve the diagnosis of this kind of disease. Convolutional Neural Networks (CNN) are the architecture behind the most recent advances in computer vision. They are currently providing solutions in many areas like image recognition, speech recognition, and natural language processing. Through this program- ming paradigm, a computer can learn from observational data. Practically, this resource is allowing companies to create a wide range of applications like mecha- nisms of autonomous vehicle vision, automatic inspection, anomaly detection in medical imaging, facial recognition, for example, to unlock a mobile phone [7]. A CNN is a specific type of artificial neural network used mainly in image recog- nition and natural language processing, also known as a ConvNet. The CNNs combine deep learning algorithms and a multi-layer perceptron network that consists of an input layer, an output layer and hidden layers that include convo- lutional, normalization, pooling and fully connected layers, using mathematical
    Predicting Canine HipDysplasia in X-Ray Images Using Deep Learning 395 models to transmit the results to successive layers. The complexity of the objects that a CNN can recognize increases accordingly with the filters learned by each layer. The first layers learn basic feature detection filters like corners and edges. The middle layers learn filters that identify parts of objects—for dogs, for example, they learn to recognize eyes, snouts, and noses. And, the last layers, learn to identify complete objects, like to differentiate a dog from a cat or identifying a dog’s race [8]. Pre-trained CNN such as AlexNet, VGGNet, Inception, and Xception, allied with a technique called transfer learning, have been used to build accurate models in a time saving way using small datasets [9]. The transfer learning technique is a deep learning approach in which knowledge is transferred from one model to another. When applying this technique it is possible to solve a certain problem using all or part of a pre-trained model developed to solve a different problem [10]. In this paper, the results of CHD scores attributed by a radiologist using the FCI criteria were compared with the computer image classification using the Inception-V3, Google’s pre-trained CNN [11], combined with the transfer learning technique. The main aim of the work was to measure the accuracy of the computer model on classifying normal and CHD images. Section 2 presents the details on the materials and methods applied in the experiment. Section 3 shows the obtained results. Section 4 discusses the results and the performance of the model, and make a few final remarks. 2 Methods In this study, it was used a pre-trained network, the Google’s Inception-v3, that was trained from the ImageNet database on more than a million images [12–14]. This CNN is 48 layers deep and can classify images into 1000 object categories like chairs, pens, keyboards, animals, and cars. This network has an image input size of 299 × 299 pixels. Figure 2 shows how the architecture of the Inception-V3 is structured. The chosen pre-trained CNN, Inception-v3, was trained to identify charac- teristics of normal and CHD in X-ray images. This new feature added a new layer to the network capable of carrying out this type of identification. It is a very popular technique called transfer learning, that permits to train a CNN on a new task using a smaller number of training images. It consists of the reuse of a pre-trained network as a starting point to learn about a new domain. The rich feature representations already learned from a wide range of images and the pre-trained layers are used to accelerate the process as the network will learn new features creating the new layer [15]. In practice, this new layer will be the last layer, and it will only be trained on the new domain to classify which X-ray images are CHD positive or negative. The dataset used in our experiments was provided by the Department of Veterinary Science of the University of Trás-os-Montes and Alto Douro, Portugal. It was used for training and testing the new layer of the CNN. The dataset
    396 D. A.Gomes et al. Fig. 2. The architecture of the Inception V3. Extracted from [11]. contains a total of 225 X-ray digital images (1760X2140 pixels, 1-channel gray scale, 8-bit depth in each channel, JPG format) obtained using the convenctional ventro dorsal hip extended view. They were classified based on the FCI criteria by a veterinary radiologist, 125 with hip dysplasia signs and 100 as normal. The dataset was divided into 73% for the training (165 images) and 27% (60 images) for the test set, based on the protocol proposed in [16] that suggests 70% to train and 30% to test. Table 1 shows how the dataset was structured to execute the training and the test phases of the transfer learning process. Table 1. Dataset split. Type of image Training Test Total Hip dysplasia 94 31 125 Normal 71 29 100 Total of images 165 60 225 The transfer learning technique was applied on the CNN using the Ten- sorflow [17] deep learning framework and the Python programming language, version 3.6. The model was trained and tested using a virtual machine created on Oracle Virtual Box 5.2.2 and configured with Linux Mint 19.1 operating sys- tem, an Intel(R) Core(TM) i7-5500U CPU @ 2.40 GHz processor, 8 GB of RAM and an NVIDIA GeForce 920M GPU. The total of iterations of the training process was 4000. It is the default value defined by the Tensorflow framework. During the training process, three metrics were used: the training accuracy, validation accuracy, and cross-entropy. The training accuracy achieved was 100%. It represents the percentage of the images used in the training process that were labeled with the correct category. The cross-entropy is a loss function that represents the total loss minimization
during the training process. The total loss achieved a minimum of 0.038072. The validation accuracy represents the percentage of images, randomly selected and not used to train the model, that were correctly labeled. The validation accuracy achieved was 83.8%. The final test accuracy achieved was 78.2%, meaning that the model makes a correct prediction roughly 7 to 8 times out of 10. After training the model with 165 images, it was evaluated on the test set, which includes 60 images unseen in the training phase. This is the classification phase, in which new samples are evaluated against the new model and a prediction is generated. In practice, this means that a script was executed for each of the 60 X-ray sample images to classify them as normal or CHD, based on the new model. This script consists of a small program written in Python that executes the classification process in a sequence of steps: it first obtains the image to be classified, then loads the defined labels used to classify the images (in this case, normal and CHD), next retrieves the model generated in the training phase to compare against the new image, and finally produces a prediction. The performance of the classification algorithm can be visualized in a confusion matrix. It is also possible to calculate the accuracy of the algorithm through Eq. 1, based on the values of the confusion matrix [18]. Both the performance and the accuracy data are shown in the results section. Accuracy = (TP + TN)/(TP + TN + FP + FN) (1) 3 Results The confusion matrix in Fig. 3 was built from the model's predictions obtained on the test set. Fig. 3. Obtained results presented as a confusion matrix. The correct predictions are located on the diagonal of the table, highlighted in yellow. The model correctly classified 26 images as CHD and 19 as normal. Of the 31 actual CHD X-rays, the algorithm predicted that 5 were normal, and of the 29 normal images, it predicted that 10 were CHD. According to the numbers in Fig. 3, the accuracy of the proposed model is (26+19)/(26+10+5+19) = 75%, considering the traditional default threshold of 0.5, due to the binary nature of the experiment [19].
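To make the computation of Eq. 1 concrete, the short Python sketch below derives the reported accuracy from the confusion-matrix counts of Fig. 3. It is only an illustration with hypothetical names, not the authors' original script.

```python
# Minimal sketch: accuracy (Eq. 1) from the confusion-matrix counts of Fig. 3.
# The counts are the reported test-set results; the function and variable
# names are illustrative assumptions, not the authors' original code.

def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Eq. 1: proportion of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)

# 26 CHD images correctly flagged (TP), 19 normal images correctly flagged (TN),
# 10 normal images flagged as CHD (FP) and 5 CHD images flagged as normal (FN).
tp, tn, fp, fn = 26, 19, 10, 5

print(f"accuracy = {accuracy(tp, tn, fp, fn):.2%}")  # -> 75.00%
```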
In total, 45 X-ray images were assigned to the correct category, 26 representing animals with hip dysplasia and 19 with normal status. The other 15 X-ray images evaluated in the test phase were incorrectly classified by the model, corresponding to 10 normal animals and 5 with canine hip dysplasia. Figure 4 shows two X-ray image samples correctly classified by the model. Fig. 4. Samples correctly classified by the model in the test phase. 4 Discussion and Final Remarks This paper presents a classification of canine X-ray images for hip dysplasia using a pre-trained convolutional neural network combined with a transfer learning technique. For part of the sample images, the algorithm used in this experiment
    Predicting Canine HipDysplasia in X-Ray Images Using Deep Learning 399 was able to achieve similar classification results to the ones obtained from a veterinary radiologist, using the FCI criteria. Figure 3 shows the results obtained by the new layer added to the pre-trained CNN in a confusion matrix. The new model achieved a diagnostic accuracy of 75% compared to the diagnosis provided by the veterinary radiologist. Our results show that binary image classification performed by pre-trained CNN using the transfer learning technique works well on small X-ray images datasets. It was able to learn on how to identify CHD in X-ray images using just 165 samples. It is important highlight that only 40 images were obtained by digital means and the rest, 185 images, were taken by traditional means, using film. In the experiment, the digitized version of these X-ray films were used to achieve a minimum number of samples that would permit the model’s training. The results demonstrate the adopted approach works properly on the CHD classification, considering the dataset was small. The performance achieved can be improved using larger datasets to train the model. Also, the model can be improved to classify the images in sub-classes of the canine hip dysplasia, select- ing areas of interest from the hip joints and evaluating each hip individually. This work can be extended in the future, executing the same experiment with more recent CNN architectures like ResNet, ResNext, and DenseNet to compare the performance between them. Acknowledgments. This work was supported by National Funds by FCT - Por- tuguese Foundation for Science and Technology, under the projects UIDB/04033/2020 and Scientific Employment Stimulus - Institutional Call - CEECINST/00127/2018 UTAD. References 1. Ginja, M.M., et al.: Early hip laxity examination in predicting moderate and severe hip dysplasia in Estrela mountain dog. J. Small Anim. Pract. 49(12), 641–646 (2008) 2. Loder, R.T., Todhunter, R.J.: The demographics of canine hip dysplasia in the United States and Canada. J. Vet. Med. 2017, 5723476 (2017). https://doi.org/ 10.1155/2017/5723476. PMID: 28386583 3. Kimeli, P., et al.: A retrospective study on findings of canine hip dysplasia screening in Kenya. Vet. World 8(11), 1326–1330 (2015). https://doi.org/10.14202/vetworld. 2015.1326-1330 4. Ginja, M.M.D., Silvestre, A.M., Gonzalo-Orden, J.M., Ferreira, A.J.A.: Diagnosis, genetic control and preventive management of canine hip dysplasia: a review. Vet. J. 184(3), 269–276 (2010). https://doi.org/10.1016/j.tvjl.2009.04.009 5. Butler, R., Gambino, J.: Canine hip dysplasia: diagnostic imaging. Vet. Clin. North Am. Small Anim. Pract. 47(4), 777–793 (2017) 6. Smith, G.K., Gregor, T.P., McKelvie, P.J., O’Neill, S.M., Fordyce, H., Pressler, C.R.K.: PennHIP Training Seminar and Reference Material. Synbiotics Corpora- tion, San Diego (2002)
    400 D. A.Gomes et al. 7. Tian, Y., Jana, S., Pei, K., Ray, B.: DeepTest: automated testing of deep-neural- network-driven autonomous cars. In: IEEE/ACM 40th International Conference on Software Engineering (ICSE), pp. 303–314 (2018). https://doi.org/10.1145/ 3180155.3180220 8. Rawat, W., Wang, Z.: Deep convolutional neural networks for image classification: a comprehensive review. Neural Comput. 29(9), 2352–2449. MIT Press Journals (2017). https://doi.org/10.1162/NECO a 00990 9. Aloysius, N., Geetha, M.: A review on deep convolutional neural networks. In: International Conference on Communication and Signal Processing, pp. 588–592 (2017). https://doi.org/10.1109/ICCSP.2017.8286426 10. Sarkar, D., Bali, R., Ghosh, T.: Hands-On Transfer Learning with Python. Packt Publishing, Birmingham (2018) 11. Advanced Guide to Inception v3 on Cloud TPU Homepage. https://cloud.google. com/tpu/docs/inception-v3-advanced. Accessed 25 April 2021 12. ImageNet Homepage. http://www.image-net.org. Accessed 25 March 2021 13. Deng, J., Dong, W., Socher, R., Li, L., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (2009). https://doi.org/10.1109/CVPR.2009.5206848 14. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep con- volutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1 (NIPS 2012), pp. 1097–1105. Curran Associates Inc., Red Hook, NY, USA (2012) 15. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the incep- tion architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016). http://arxiv. org/abs/1512.00567 16. Spanhol, F.A., Oliveira, L.S., Petitjean, C., Heutte, L.: A dataset for breast cancer histopathological image classification. IEEE Trans. Biomed. Eng. 63(7), 1455–1462 (2016). https://doi.org/10.1109/TBME.2015.2496264 17. TensorFlow Homepage. https://www.tensorflow.org/. Accessed 25 March 2021 18. Kotu, V., Deshpande, B.: Chapter 8 - Model Evaluation. Data Science (Second Edition), pp. 263–79. Morgan Kaufmann, Burlington (2019). https://doi.org/10. 1016/B978-0-12-814761-0.00008-3 19. Freeman, E.A., Moisen, G.G.: A comparison of the performance of threshold cri- teria for binary classification in terms of predicted prevalence and kappa. Ecol. Modell. 217(1–2), 48–58 (2008). https://doi.org/10.1016/j.ecolmodel.2008.05.015
    Convergence of theReinforcement Learning Mechanism Applied to the Channel Detection Sequence Problem André Mendes(B) Research Centre in Digitalization and Intelligent Robotics (CeDRI), Polytechnic Institute of Bragança, 5300-253 Bragança, Portugal a.chaves@ipb.pt Abstract. The use of mechanisms based on artificial intelligence tech- niques to perform dynamic learning has received much attention recently and has been applied in solving many problems. However, the conver- gence analysis of these mechanisms does not always receive the same attention. In this paper, the convergence of the mechanism using rein- forcement learning to determine the channel detection sequence in a multi-channel, multi-user radio network is discussed and, through sim- ulations, recommendations are presented for the proper choice of the learning parameter set to improve the overall reward. Then, applying the related set of parameters to the problem, the mechanism is compared to other intuitive sorting mechanisms. Keywords: Artificial intelligence · Reinforcement learning · Convergence analysis · Channel detection sequence problem 1 Introduction The increasing use of devices that rely on wireless connections in unlicensed frequency bands has caused huge competition for available spectrum. However, studies conducted by regulatory agencies and universities show that the static allocation currently practised through the sale of spectrum bands promotes, in practice, the under-use of frequency bands, even in urban areas [1]. With this, the idea of dynamic access to spectrum [2] emerged as a solution to improve the use of this scarce resource and thus provide alternative connec- tivity for the growing demand-driven, for example, by the Internet of Things devices. This dynamic access to spectrum relies on the use of devices whose wireless network interface is reconfigurable, being able to dynamically adapt their parameters and operating modes to the existing spectrum occupation. Thus, considering a scenario of multiple frequency bands (called channels) and users provided with a single transceiver, where only one channel can be scanned at a time to detect possible “opportunities” for use and, in that same time interval, effectively use the channel. In this scenario, the channel detection c Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 401–416, 2021. https://doi.org/10.1007/978-3-030-91885-9_30
    402 A. Mendes sequencecan have a major impact on user performance by minimising the time to search and access a free channel. When there is prior accurate knowledge of the channel statistics, the intuitive detection sequence, the one that follows the decreasing sequence of channel avail- ability probabilities is admittedly the optimal sequence [3]. However, in many practical scenarios, the channel availability probability is unknown in advance, and therefore, the optimal detection sequence requires a huge effort to obtain. A first intuitive approach to solve this problem would be the historical observation of this statistic, however, this approach would greatly increase the channel access time due to the analysis required to obtain an accurate statistic. Although the optimal stopping theory [4] combined with a brute-force mech- anism can solve the problem, there is a strong dependency on previous and accurate knowledge of the statistics of the channel and the computational cost of this solution is high, growing even more with the increase of the number of channels and users. Both problems impact directly the use of this solution embedded in a platform with few resources. In [5], the problem of choosing the channel detection sequence for only one user was investigated and an approach was presented using a low computational complexity method based on a reinforcement learning machine for the dynamic search of the optimal detection sequence, where no prior knowledge of the avail- ability probability and channel capacity is required. In this paper, the evolution of the mechanism for searching the detection sequence in a multi-channel, multi-user radio network using reinforcement learn- ing, presented in [5], is discussed for the multi-user case, where the reward model follows the optimal stopping theory [4]. The implementation of another strategy for balancing the investigation-exploration dilemma is also included, and rec- ommendations are presented for the choice of parameter ranges to improve the overall reward of the mechanism. Then, the convergence analysis of the mech- anism is performed by applying the aforementioned set of parameters to the scenario. Finally, the results obtained are compared to those of other intuitive sorting mechanisms. This paper is organized as follows. Section 2 deals with related works. Section 3 performs the summary of reinforcement learning, the adopted system model and the proposal of the work that uses this technique for the search of the optimal channel detection sequence. In Sect. 4, the simulation environment is described, the convergence analysis of the mechanism is examined and the results obtained are discussed. Finally, Sect. 5 concludes the paper and com- ments on future work. 2 Related Work The work in [6,7] obtains some empirical evidence describing the convergence properties of reinforcement learning methods in multi-agent systems consider- ing that in many practical applications it is not reasonable to assume that the actions of other agents can be observed. Most agents interact with the surround- ing environment relying on information from “sensors” because, without any
knowledge of the actions (or rewards) of other agents, the problem becomes even more complex. The authors of [8] studied the independent learners multi-agent class, including the convergence of the mechanism, in a deterministic environment for some scenarios. In subsequent work [9,10], the same class is studied in a stochastic environment. An important concept is that of regret (regret theory), which is defined as the requirement that an agent obtains a reward at least as good as the one obtained through any stationary strategy, regardless of the strategies adopted by the other agents [11]. For certain types of problems, learning algorithms that follow this concept converge to the optimal (Nash) equilibrium [12,13]. In the work presented in [14], the convergence of the independent learners class is established when applied to the fictitious play procedure [15] in competitive games. Taking a different approach, the work in [16] proposed an algorithm of the independent learners class for repeated games (where the player is allowed to have a memory of past moves) that converges to a guaranteed fair policy but periodically alternates between some equilibrium points. 3 Dynamic Channel Detection Sequence Using Reinforcement Learning The proposal of this work is a solution based on multi-agent reinforcement learning of the independent learners class, using the Q-learning algorithm applied to the problem of finding the optimal channel detection sequence in a network of wireless devices. With this approach, no prior knowledge is required of either the probability of each channel being available or the estimated quality of each channel, given by its average SNR (signal-to-noise ratio). Another important advantage of this proposal is its adaptability to changes in channel characteristics, guaranteed by learning from the actions taken. Therefore, the mechanism becomes robust to changes in the availability probabilities of the channels, which may occur due to shifts in channel usage patterns, and to changes in channel quality (average SNR), which may occur due to mobility and large-scale fading effects. In the following, the reinforcement learning technique is summarised and, subsequently, the model used to search for the optimal channel detection sequence is described. Reinforcement Learning Reinforcement learning is one of the three main learning paradigms in the field of artificial intelligence, alongside supervised and unsupervised learning, in which an agent learns by observing the results of its actions in order to maximise a scalar value called the reward (or reinforcement signal) [17]. When an agent acts, the obtained response (reward) does not contain information about the correct action that should be taken. Rather, the agent needs
to discover for itself which actions lead to a greater reward by testing them. In some situations, actions may affect not only the immediate reward but also those obtained in the future. The goal of the agent is to maximize the sum of collected rewards. The use of algorithms based on reinforcement learning, in particular Q-learning [18], has received great attention lately. Being an algorithm that does not need a model of the problem and that can be used directly at runtime, Q-learning is well suited to systems with multiple agents, where each agent has little knowledge of the others and where the scenario changes during the learning period. Q-learning adopts a simple approach, which also leads to a low computational complexity, an important property for platforms with few resources. In contrast, this method can exhibit slow convergence [19]. The Q-learning algorithm works by estimating the value of (state, action) pairs through a value function Q, hence its name. The value Q(s, a), called the Q-value, is defined as the expected value of the sum of future rewards obtained by the agent by taking action a from state s and following an optimal policy thereafter [18]. The agent initializes its Q-table with an arbitrary value and, at the instant of choosing the action to be taken, a strategy is generally chosen in a way that guarantees sufficient exploration while favouring actions with higher Q-values. One commonly used strategy is ε-greedy [17], which attempts to make all actions, and their effects, equally experienced. In ε-greedy, the agent uses the probability given by ε to decide between exploiting the Q-table (exploitation) and randomly investigating new states (exploration). Another strategy is called softmax [17], where the probability of choosing a particular action varies according to its corresponding Q-value. In this strategy, a probability distribution is used for choosing an action a from a state s. A commonly used distribution for this purpose is the Boltzmann distribution. This distribution has a parameter, called the temperature, which controls the amount of exploration that will be practised. High temperature values make the choice of actions almost equiprobable. Low values, on the contrary, cause a great difference in the probability of choosing each action. In the limit, as the temperature tends to 0, the softmax strategy approaches greedy action selection.

Q_{t+1}(s, a) = Q(s, a) + α[r(s, a) + γ max_a Q(s_{t+1}, a) − Q(s, a)]   (1)

In Q-learning, from the current state s, an action a is selected and subsequently a reward r is received, proceeding to the next state s_{t+1}. With this, the value Q(s, a) is updated according to Eq. 1, where α is the parameter called the learning rate and 0 ≤ γ ≤ 1 is the parameter called the discount factor. Higher values of α give greater importance to present experience over history, while higher values of γ indicate that the agent relies more on future reward than on immediate reward [20]. The Q-learning algorithm converges to the correct Q-values with probability 1 if and only if: the environment is stationary and Markovian, a storage table is used to store the Q-values (usually named the Q-table), no state-action pair is
    Convergence of theRL Mechanism 405 neglected (with time tending to infinity) and the learning rate is reduced appro- priately with time [18]. The Q-learning does not specify which action should be chosen at each step but requires that no action should fail to be tested. System Model In this work, each user was modelled as a learning agent following an indepen- dent strategy according to the characteristics of the multi-agent class of type independent learners [6]. This type of agent does not know the other agents, interacting with the environment as if it were alone. The agent, in this case, is unable to observe the rewards and actions of the other agents and, consequently, can apply the Q-learning technique in its traditional form. The choice of the multi-agent class, in particular, was motivated by the possibility of future scalability of the mechanism by increasing the number of agents and their autonomy, besides the enormous difficulty of satisfying, in practice, the requirement of observation of the agents’ joint actions necessary for the application of mechanisms of the joint learners class. The system model adopted for the development and implementation of the proposal allows the determination of the optimal detection sequence through the application of the optimal stopping theory [5]. In this way, the goal is to provide a decision about the moment to finalize the detection of new channels in such a way that the reward obtained in the choice of a channel is maximized. This theory then allows the definition of the stopping rule that maximizes the reward. The optimal stopping theory provides a criterion for the choice of stopping (staying on the free channel) or proceeding. Consider a network with multiple users and a finite number of channels, N. Each user is equipped with a transceiver for data exchange that can be tuned to one of the N channels in the network. Moreover, each user has an operating model whose access time, which we will call slot, devoted to channelling observation and data transmission is constant with duration T. In each slot, each channel i has the probability pi of being free or busy. It is assumed that this probability is independent of its previous state and the state of other channels, within each slot, and i.i.d. between slots. Figure 1 exemplifies the activity of a user in a slot, which has two phases: a detection phase and a data transmission phase. Before deciding to use a channel in a given slot, the user must perform channel detection in the time equal to τ to determine whether it is free or busy, minimizing the probabilities of non- detection and false alarms. In addition, the time equal to σ is reserved for channel probing, which has a much shorter duration than the duration of the transmission phase, depends on the channel bandwidth, modulation, etc., and whose result provides the trans- mission rate that will be obtained, function of the momentary SNR of that user on that channel. In the model, it is assumed that both channel detection and channel probing are accurate and error-free tasks.
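As a rough illustration of this system model, the sketch below draws slot-by-slot channel states under the assumptions above (each channel free with its own probability, independently across channels and slots) and lets a user sense channels one at a time in a fixed order, taking the first free channel found. It deliberately ignores the detection and probing times τ and σ and uses illustrative names and values; it is a simplified reading of the model, not the authors' Tcl simulator, and the RL mechanism described next refines the naive "use the first free channel" rule.

```python
import random

# Simplified sketch of the slot/channel model: N channels, each free in a slot
# with probability p_i, i.i.d. across slots. Detection is assumed error-free.
# Names, capacities and probabilities are illustrative assumptions.

def draw_channel_states(p_free):
    """One slot: True means the channel is free."""
    return [random.random() < p for p in p_free]

def sense_in_order(order, states, capacities):
    """Sense channels following 'order'; return (channel, reward) for the
    first free channel, or (None, 0.0) if every sensed channel is busy."""
    for ch in order:
        if states[ch]:
            return ch, capacities[ch]   # probing returns the channel's rate
    return None, 0.0

p_free = [0.8, 0.5, 0.3]        # availability probability of each channel
capacities = [2.0, 5.0, 9.0]    # illustrative per-channel transmission rates
order = [0, 1, 2]               # a fixed detection sequence

total = 0.0
n_slots = 10_000
for _ in range(n_slots):
    states = draw_channel_states(p_free)
    _, reward = sense_in_order(order, states, capacities)
    total += reward
print(f"average reward per slot: {total / n_slots:.2f}")
```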
Fig. 1. Model of a slot.

Q_{t+1}(s, a) = (1 − α)Q(s, a) + α[r(s, a) + γ max_a Q(s_{t+1}, a)]   if the channel is free
Q_{t+1}(s, a) = (1 − α)Q(s, a) + α γ max_a Q(s_{t+1}, a)   if the channel is busy   (2)

Some constraints must be taken into account when taking actions and updating the Q-table, which is performed according to Eq. 2. One is that any action taken in a state (o_k, ∗), with 1 ≤ k ≤ (N − 1), always leads to a state where the position in the detection sequence is o_{k+1}. In a state where the position is (o_N, ∗), which represents the last channel in the detection sequence, the actions indicate the first channel to be detected in the next slot, that is, they lead to a state where the position in the detection sequence is (o_1, ∗). When the user decides to use a channel c_i at position o_k, the detection in that slot is terminated. In that case, the first channel to be detected in the next slot will be determined by the best action in state (o_N, c_i). Another important constraint concerns preventing the return to a previously detected channel (no recall). The operation of the mechanism can be described as follows: at the beginning, all state-action pairs of the Q-table are initialized with zeros. Then, the learning phase begins, which is repeated during the entire period of the mechanism's operation. In this phase, the exploration strategy is chosen, whether softmax or ε-greedy, and from it the action to be followed from a given state is selected. After the execution of the action, the mechanism checks whether the channel is free and decides whether to use it or not, continuing the detection according to the established sequence of channels. The goal is to provide a decision about the moment to end the detection of new channels in such a way that the reward obtained in choosing a channel is maximized. This indicates that it will not always be advantageous to use the first free channel found.
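A compact sketch of how the Q-table update of Eq. 2 and the stopping decision detailed just below (comparing the current reward with the best Q-value of the remaining actions) could look in code is given next. It is a simplified reading of the mechanism with hypothetical names and parameter values, not the authors' implementation.

```python
# Illustrative sketch of the Q-table update (Eq. 2) and of the stopping test.
# A state is assumed to be (position in the detection sequence, last sensed
# channel); names and parameter values are assumptions, not the authors' code.
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.5          # learning rate and discount factor
Q = defaultdict(float)           # Q[(state, action)], initialized to zero

def best_q(state, actions):
    """Best expected value among the actions available from 'state'."""
    return max(Q[(state, a)] for a in actions)

def update(state, action, next_state, next_actions, reward, channel_free):
    """Eq. 2: the immediate reward only enters the target when the sensed
    channel is free; otherwise only the discounted future value remains."""
    target = GAMMA * best_q(next_state, next_actions)
    if channel_free:
        target += reward
    Q[(state, action)] = (1 - ALPHA) * Q[(state, action)] + ALPHA * target

def should_stop(current_reward, state, remaining_actions):
    """Stop sensing (use the current free channel) when its reward is at
    least as large as the best expected value of continuing the sequence."""
    return current_reward >= best_q(state, remaining_actions)
```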
    Convergence of theRL Mechanism 407 This stopping criterion consists of comparing the current reward, rt, with the best Q-value of the possible actions from that state. Thus, it is possible to estimate if the reward of the current free channel is higher than the expected reward of the best existing action. Notice that even in the case where the free channel is not used, the Q-value referring to that action is also updated. 4 Parameters Evaluation Before starting the evaluation of the proposal, some experiments were performed to obtain the set of parameters of the mechanism based on reinforcement learning capable of maximizing the amount of reward collected in the dynamic scenario to which it is submitted, as well as to improve its convergence. For this, a simulator was implemented, in Tcl language, responsible for emu- lating the operation of a network, where each of its users uses individual detection sequences. 4.1 Parameters of the Q-learning Algorithm The parameters referred to are those of Q-learning: γ, α, and those concerning the strategies, softmax and ε-greedy, respectively, ε and (temperature, β). In this evaluation, a simulation is performed with 50,000 slots for each strat- egy, softmax and ε-greedy, using a number of channels varying between 3 and 9 (x-axis), and 10 users. At the end of the simulation, the average value of the reward is calculated (y-axis), using a confidence interval of 95%. 70 80 90 100 3 5 7 9 γ = 0.1 γ = 0.3 γ = 0.5 γ = 0.7 γ = 0.9 (a) α=0.9 e ε=0.7 70 80 90 100 3 5 7 9 γ = 0.3 γ = 0.5 γ = 0.6 γ = 0.7 (b) α =0.9 e ε=0.7 70 80 90 100 3 5 7 9 γ = 0 γ = 0.1 γ = 0.3 γ = 0.5 (c) α=0.3 e ε=0.9 Fig. 2. Impact of varying the discount factor γ with ε-greedy strategy, 10 users and occupancy rate of 10%. Initially, for ε-greedy strategy, some values for the parameter γ, called the discount factor, were tested with the mechanism using the ε-greedy strategy. This parameter directly determines the degree of importance of the reward that will be collected in the future from an action taken in the present. Therefore,
    408 A. Mendes highervalues of γ indicate that the agent relies more on the future reward than the immediate reward The results of this evaluation are shown in Fig. 2. For this analysis, it was necessary to choose an initial range for the other parameters, α and ε, and our choice was based on the values chosen for these same parameters previously [5]. It can be seen that the three approaches chosen (Figs. 2(a), 2(b) and 2(c)) point to a relationship between the growth of γ and the increase in reward. However, it can be noted that there is a threshold for the value of this parameter, between 0.7 and 0.9, which reverses the trend (Fig. 2(a)). An observation of the behaviour of the parameter for values closer to 0.7 (Fig. 2(b)) demonstrates that the collected reward values can be considered equal, due to the confidence interval, if γ is in the interval [0.5, 0.7]. We opted for the smaller value as it favours a fast convergence of the mechanism [17] . 70 80 90 100 3 5 7 9 α = 0.1 α = 0.3 α = 0.5 α = 0.7 α = 0.9 (a) γ = 0.5 e ε = 0.7 70 80 90 100 3 5 7 9 α = 0.1 α = 0.3 α = 0.5 α = 0.7 α = 0.9 (b) γ = 0 e ε = 0.7 84 86 88 90 92 94 96 98 100 3 5 7 9 ε = 0.1 ε = 0.5 ε = 0.7 ε = 0.9 ε = 1.0 (c) γ = 0.5 e α = 0.1 84 86 88 90 92 94 96 98 100 3 5 7 9 ε = 0.9 ε = 0.95 ε = 1 (d) γ = 0 e α = 0.3 Fig. 3. Impact of varying the learning rate α and the parameter ε with ε-greedy strat- egy, 10 users and occupancy rate of 10%. Then, the impact of varying the parameters α and ε was verified (Fig. 3). Initially, the parameter α, called learning rate, was analysed. Briefly, larger val- ues of this parameter benefit recent experience against the historical experiences (or acquired knowledge) of the mechanism. In the same way, as in the previous analysis, it was necessary to choose the initial value of the other parameters, γ and ε. The γ value was obtained previously and the ε value was chosen as 0.7. As can be seen in Figs. 3(a) and 3(b), smaller values of α provide better per- formance of the mechanism. It was expected that this result would demonstrate the importance of adapting the mechanism to the dynamics of the environment and how much this would influence the overall performance, however, it was found that it is more important to value the acquired knowledge, keeping the parameter α low. The parameter ε, also called exploration level, is linked to the ε-greedy strat- egy and establishes the probability that directs the mechanism to explore new states. The appropriate choice in the expose level depends very much on the problem to be addressed. If the purpose of the problem requires intense learn- ing, the value of this parameter tends to be higher. On the other hand, if in the problem there is the execution of other tasks parallel to the learning, it does not
    Convergence of theRL Mechanism 409 seem appropriate to assume that the mechanism should choose more random actions and, with that, a compromise between learning and performance needs to be considered in the choice. Observing Fig. 3(c), we can notice that this parameter has an upper limit worth 0.7, above which the variation of the parameter does not significantly influence the reward. According to the analyses performed, the parameters γ, α and ε assume values of, respectively, 0.5, 0.1 and 0.7 forming the set of parameters for the ε-greedy strategy applied to the problem and with better results. This set of values will be adopted in the next experiments. The softmax strategy is very sensitive to the adequate choice of the param- eter temperature, responsible for controlling the exploration of new actions. If the choice falls on a very low value, the exploration becomes greedy; and, other- wise, the exploration becomes very random. Besides the temperature, it is also necessary to choose the parameter β, which work is inversely proportional to the learning rate α. For this analysis, it was necessary to assign an initial value for temperature so that values could be tested for the other parameter. The initial value for temperature was 0.7. Figure 4(a) shows the variation curves of β between 0.3, which makes the learning rate high, and 1, which reduces it. Observing the result, it was noticed that there is little influence of this parameter on the variation of the collected reward value when increasing the channels. Thus, the value of β equal to 1 was chosen, keeping the learning rate reduced. With this result, we return to the selection of the parameter temperature. In Fig. 4(b) we can see the curves for the parameter given the chosen value for β. The influence of the parameter temperature also proved to be small and we decided to choose a value of 0.8, favouring the exploration of new actions. This way, the two parameters for the softmax strategy, temperature= 0.8 and β = 1, were established and will be used by the mechanism in the next experiments. 30 40 50 60 70 80 90 100 3 5 7 9 β = 0.3 β = 0.7 β = 1 (a) 30 40 50 60 70 80 90 100 3 5 7 9 t = 0.70 t = 0.80 t = 0.90 (b) Fig. 4. Impact of varying the parameters temperature (t) and β with softmax strategy, 10 users and occupancy rate of 10%.
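To make the two exploration strategies and the parameters analysed in this section concrete, the snippet below sketches standard ε-greedy and Boltzmann (softmax) action selection, with ε interpreted as the probability of exploring new actions and the temperature controlling how uniform the softmax choice is. It is a generic illustration with hypothetical names, not the simulator code used in the experiments.

```python
import math
import random

# Generic sketch of the two exploration strategies discussed above.
# q_values maps each candidate action to its current Q-value; epsilon and
# temperature correspond to the parameters analysed in this section.
# Helper names are illustrative assumptions.

def epsilon_greedy(q_values: dict, epsilon: float):
    """With probability epsilon pick a random action, otherwise the best one."""
    if random.random() < epsilon:
        return random.choice(list(q_values))
    return max(q_values, key=q_values.get)

def softmax(q_values: dict, temperature: float):
    """Boltzmann exploration: higher temperature -> more uniform choice."""
    actions = list(q_values)
    weights = [math.exp(q_values[a] / temperature) for a in actions]
    return random.choices(actions, weights=weights, k=1)[0]

q = {"ch1": 0.8, "ch2": 0.5, "ch3": 0.1}
print(epsilon_greedy(q, epsilon=0.7))
print(softmax(q, temperature=0.8))
```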
    410 A. Mendes 4.2Number of Users and Channel Occupancy In this part, an experiment is carried out to determine the impact of varying the number of users simultaneously with the variation of the channel occupancy. The purpose is to evaluate the importance of using a mechanism adaptable to the increase in the number of users, independent of the channel occupancy. Moreover, with the increase of the number of slots the values of the collected reward (y-axis) can be observed and conclusions about the convergence of the mechanism can be drawn. In this experiment, a simulation is performed with 100,000 slots for each strat- egy, softmax and ε-greedy, using a number of channels equal to 9. The number of channels was chosen assuming that there will be greater “opportunities” for users as there are more channels. The plotted figures have been harvested at 20% of the total number of slots to take advantage of the part that contains the most information. 0 20 40 60 80 100 10 20 30 50 100 Slot [x 100] 1 US 10 USs 15 USs 20 USs (a) Occupancy = 10%, softmax 0 20 40 60 80 100 10 20 30 50 100 Slot [x 100] 1 US 10 USs 15 USs 20 USs (b) Occupancy = 90%, softmax 0 20 40 60 80 100 10 20 30 50 100 Slot [x 100] 1 US 10 USs 15 USs 20 USs (c) Occupancy = 10% 0 20 40 60 80 100 10 20 30 50 100 Slot [x 100] 1 US 10 USs 15 USs 20 USs (d) Occupancy = 90% Fig. 5. Impact of varying the number of users and the channel occupancy parameter for both strategies and with 9 channels. The results of this examination are shown in Fig. 5. As expected, it is pos- sible to observe that with low channel occupancy the obtained reward values are higher (Figs. 5(b) and 5(d)), a direct consequence of the higher amount of “opportunities” for the users. In this result, it is also possible to observe that there is a better performance of the mechanism for a certain number of users, in both strategies. It is possible to understand this phenomenon as a result of the users’ accommodation, a consequence of the greater dispute for the available opportunities, which are not enough for the demand, causing a reduction in the reward when the number of users increases too much. When the channel occupancy parameter increases, a reduction in the reward values are expected, which can be seen in Figs. 5(a) and 5(c). In this scenario, it is also noted that there is a better performance of the mechanism for a certain number of users, although the value is different from that present when there are more “opportunities” available. Another important observation is that the ε-greedy strategy outperforms soft- max when applied to the problem, regardless of the channel occupancy rate. This
    Convergence of theRL Mechanism 411 somewhat contradicts the theory that there is a predominance of the softmax technique, demonstrating that in certain scenarios and applications, the ε-greedy technique can indeed have superior performance [17]. The results obtained with this experiment also point to a need for adaptation of the mechanism to the number of users, being interesting that should be a capability of autonomous restriction of this quantity by the mechanism, aiming to reach a bigger reward. This will be investigated in future works. Regarding the convergence of the mechanism, even with the significant increase in the number of channels, the mechanism shows to be robust, evolving through a transitory period until the convergence in an equilibrium point, as the number of slots increases. 4.3 Number of Channels and Channel Occupancy In this part, the effects on the reward (y-axis) due to the variation of the num- ber of channels simultaneously with the variation of the channel occupancy were evaluated. For this purpose, a similar experiment to the previous one (Sect. 4.2) was carried out, starting from a simulation with the same amount of slots for each strategy, softmax and ε-greedy. A total of 10 users were used, as per the results obtained in Sect. 4.2. Plotted figures were harvested at 20% of the total number of slots for clarity, as in the previous analysis. As expected, for both strategies and the two evaluated rates of the chan- nel occupancy parameter, the highest reward values were obtained with the highest channel offer (Fig. 6). However, with low channel availability for users (Figs. 6(a) and 6(c)), the performance of the mechanism was similar for 3 chan- nels or 5 channels. The cause of this behaviour is that a high rate of the channel occupancy parameter with a reduced number of channels means that there are fewer “opportunities” to be taken advantage of by the mechanism, and a small increase in these “opportunities” is enough for the mechanism performance to raise significantly. 0 20 40 60 80 100 10 20 30 50 100 Slot [x 100] 3 channels 5 channels 9 channels (a) Occupancy = 10%, softmax 0 20 40 60 80 100 10 20 30 50 100 Slot [x 100] 3 channels 5 channels 9 channels (b) Occupancy = 90%, softmax 0 20 40 60 80 100 10 20 30 50 100 Slot [x 100] 3 channels 5 channels 9 channels (c) Occupancy = 10% 0 20 40 60 80 100 10 20 30 50 100 Slot [x 100] 3 channels 5 channels 9 channels (d) Occupancy = 90% Fig. 6. Impact of varying the number of channels and the channel occupancy parameter with both strategies, for 10 users. For all the configurations presented, it was shown that the mechanism is immune to any perturbation caused by the variation, either of the number of
    412 A. Mendes usersor the number of channels, either for a higher value of the channel occu- pancy parameter or for a lower value, and that as the number of slots grows the mechanism evolves to an equilibrium point, which for our scenario and applica- tion, was close to 5,000 slots. 4.4 Measurement of Justice Another evaluation can be observed in Fig. 7, where the measure of fairness among users is presented. This experiment was performed with the same con- figuration as the previous experiment to evaluate the influence of the number of users (Sect. 4.2). The importance of this measure lies in the fact that we have analyzed until now the aggregate values of the reward, which could be mask- ing some selfish behaviour of the mechanism in the attribution of the “best” sequences, favouring some users to detriment of others. 0 3 5 7 9 12 1 2 6 10 Users (a) Transient up to 100 slots, softmax 0 3 5 7 9 12 1 2 6 10 Users (b) Transient up to 1,000 slots, softmax 0 3 5 7 9 12 1 2 6 10 Users (c) Measurement up to 10,000 slots, softmax 0 3 5 7 9 12 1 2 6 10 Users (d) Transient up to 100 slots 0 3 5 7 9 12 1 2 6 10 Users (e) Transient up to 1,000 slots 0 3 5 7 9 12 1 2 6 10 Users (f) Measurement up to 10,000 slots Fig. 7. Measure of fairness between users with both strategies, for 10 users, 9 channels and for the channel occupancy rate of 90%. However, what could be observed, in both strategies, is that as the number of slots increases, the final reward of each user tends to a homogeneous fraction of the aggregate reward equivalent to that of other users, with no selfish behaviour occurring.
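The fairness observed in Fig. 7 can be quantified from the per-user rewards, for instance with Jain's fairness index, which equals 1 when all users collect the same reward. The paper does not state which index, if any, was computed, so the sketch below is only one possible way to summarise how evenly the aggregate reward is shared.

```python
# One possible way to quantify the fairness shown in Fig. 7. The choice of
# Jain's fairness index is an assumption (the paper does not name an index);
# it is 1 when all users obtain the same reward and 1/n in the worst case.

def jain_index(rewards):
    n = len(rewards)
    total = sum(rewards)
    if total == 0:
        return 1.0  # no reward collected yet: trivially fair
    return total ** 2 / (n * sum(r * r for r in rewards))

per_user_rewards = [9.8, 10.1, 9.7, 10.3, 10.0]  # illustrative values
print(f"Jain's index: {jain_index(per_user_rewards):.3f}")
```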
    Convergence of theRL Mechanism 413 This result also contributes to strengthening the hypothesis that the mecha- nism converges to an equilibrium point since at this point the individual strate- gies do not lead to the greedy behaviour of some users. 4.5 Results To evaluate the performance of the proposal, the developed simulator was extended and the following detection sequences were evaluated: – the dynamic sequence of the channels provided by the mechanism (RL); – the sequence of channels in the decreasing sequence of their availability prob- abilities (Prob); – the sequence given by the descending sequence of the average capacities of each channel (CAP) [3]; and, – the random sequence of channels (RND). It is worth noting that all of the evaluated sequences, except the RL sequence, are static, i.e., they do not change throughout the simulation. In the case of the sequence RL, due to reinforcement learning itself, the sequence may vary during the simulation. Moreover, all sequences, except for the RL and the RND, assume a priori knowledge of the average capacities of each channel and/or their availability probabilities. At the beginning of each simulation round, the value of the average capac- ity and availability probabilities of each channel i, i ∈ 1, ..., N, were drawn. The availability probability of each channel is drawn uniformly within the interval [0, 1]. The average capacity is drawn following a uniform distribu- tion within the interval [0.1 × CAPMAX, CAPMAX], where CAPMAX is the maximum average capacity of the channels. And the instantaneous capac- ity of each channel is drawn using a uniform distribution within the interval [0.2×CAPMEDIA, CAPMEDIA], where CAPMEDIA is the average capac- ity of each channel. Channel occupancy modelling was established according to an exponential on-off model synchronised with the slots. With this, a given channel remained in the occupied state for a period equal to tOF F equivalent to the mean of an exponential distribution. Thus, tON can be obtained by tON = (1−u)×tOF F u , where u is the channel occupancy rate. The value of Wm, referring to the collision probability, was set to 8 [21]. Thirty rounds of the simulation were performed for each set of parameters. In all simulations, CAPMAX was configured with a fixed value of 10 and the size of the slot was configured as variable, with a value equal to twice the number of channels used in the simulation multiplied by τ. The metric used for evaluation was the reward obtained by each of the sequences implemented using the same states and instantaneous channel capac- ities (fairness criterion) at each slot, corresponding to the effective transmission rate obtained by the user when using the channel (Sect. 3) and calculated by the simulator, at the end of each round, as the average reward obtained by each of
the sequences in all X slots for each user. The overall reward is given by the sum of the individual rewards. A simulation is performed with 200 rounds of 100,000 slots for each strategy, softmax and ε-greedy. [Fig. 8 comprises eight panels of reward curves: (a) occupancy 10%, softmax; (b) occupancy 90%, softmax; (c) 2 users, softmax; (d) 10 users, softmax; (e) occupancy 10%; (f) occupancy 90%; (g) 2 users; (h) 10 users — with curves for RL, Prob, Cap and RND versus the number of users in panels (a), (b), (e) and (f), and curves for 3, 5 and 9 channels versus the occupancy rate in panels (c), (d), (g) and (h).] Fig. 8. Results of the proposal RL compared to other intuitive sorting mechanisms. The results of this evaluation are shown in Fig. 8. For both tested strategies, softmax and ε-greedy, it can be observed that the proposed RL presents better results. The performance of the other sequences is unfavourable because none of them uses stopping rules based on the prediction of the estimated performance of continuing sensing the next channels of the sequence, i.e., in these other solutions the first channel sensed as free is always used. In this way, the RL, which uses the past experiences stored in the Q-table, can efficiently determine whether it is advantageous to use a given channel sensed as free. An interesting observation regarding the curves for 9 channels in Figs. 8(e), 8(f) and 8(a), 8(b), is that the performance of the Prob sequences, very close to that of the random RND sequences, is lower than the performance of the Cap sequences. This indicates that in this scenario the differentiation between the average channel capacities (CAPMEDIA) is more important than the differentiation between their availability probabilities. With this, it is better to sequence the channels by the decreasing sequence of their average capacities, since it increases the probability that the first channel sensed as free is a channel with a higher capacity. Another detail is that the curve referring to the proposed RL presents a smaller growth as the number of users increases, indicating that there is a satu-
    Convergence of theRL Mechanism 415 ration threshold of the available capacity of the network according to the number of available channels. From Figs. 8(c) and 8(g), it can be seen that with few users, the reward obtained varies little with the variation of occupancy (x-axis) and channels. When the number of users increases (Figs. 8(d) and 8(h)), an increase in reward is observed, although it can be noted that the variation in the parameter occupancy is small, maintaining a similar percentage growth of reward with the increase in the number of available channels. Finally, the ε-greedy strategy had a superior result than the softmax strat- egy, when applied to the problem. A possible explanation for this lies in the fact that when Q-learning is used in modelling with a high number of actions, it is expected to observe a high number of overestimations, which impair the perfor- mance of the strategy, in some of the values of the actions if there is some noise in these values, for example, due to the use of approximate value functions or too abrupt transitions between states. Another explanation comes from the fact that for some applications/modelling, the softmax strategy keeps exploring for a long time, oscillating between many actions that obtain rewards with similar values, also harming its performance [22]. 5 Conclusions and Future Works In this paper, the evolution of the mechanism presented in [5] is discussed for the multi-user case. The implementation of the softmax strategy to balance the research-exploration dilemma is also included along with recommendations for the choice of parameters to improve the overall reward. Then, the convergence of the mechanism is verified by applying the aforementioned set of parameters. Finally, the performance evaluation of the evolved mechanism is performed. Actually, the proper selection of a set of parameters for the mechanism allows maximizing the reward (data throughput). However, its performance is strictly related to the values chosen and, hence, the state of the RF environment could degrade the performance. It is always worth noting that RF communication is widely used in practical applications, e.g. for the gathering of data from sensors spread over a primary production area (agriculture). As future works, we have the investigation of optimal values for the parame- ters, the implementation of new strategy to balance the exploitation-exploration dilemma and the evolution of the mechanism for autonomous user control in order to achieve a higher reward, as shown in Sect. 4.2. Acknowledgements. This work has been conducted under the project “BIOMA – Bioeconomy integrated solutions for the mobilization of the Agri-food market” (POCI-01-0247-FEDER-046112), by “BIOMA” Consortium, and financed by European Regional Development Fund (FEDER), through the Incentive System to Research and Technological development, within the Portugal2020 Competitiveness and Internation- alization Operational Program.
    416 A. Mendes Thiswork has also been supported by FCT - Fundação para a Ciência e Tecnologia within the Project Scope: UIDB/05757/2020. References 1. McHenry, M.A.: NSF Spectrum Occupancy Measurements Project (2005) 2. FCC: FCC-03-322 - NOTICE OF PROPOSED RULE MAKING AND ORDER. Technical report, Federal Communications Commission, 30 December 2003 3. Cheng, H.T., Zhuang, W.: Simple channel sensing order in cognitive radio networks. IEEE J. Sel. Areas Commun. (2011) 4. Chow, Y.S., Robbins, H., Siegmund, D.: Great Expectations: The Theory of Opti- mal Stopping. Houghton Mifflin Company, Boston (1971) 5. Mendes, A.C., Augusto, C.H.P., Da Silva, M.W., Guedes, R.M., De Rezende, J.F.: Channel sensing order for cognitive radio networks using reinforcement learning. In: IEEE LCN (2011) 6. Claus, C., Boutilier, C.: The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems. National Conference on Artificial Intelligence (1998) 7. Tan, M.: Multi-agent Reinforcement Learning: Independent vs. Cooperative Agents. In: Readings in Agents (1997) 8. Lauer, M., Riedmiller, M.: An algorithm for distributed reinforcement learning in cooperative multi-agent systems. In: ICML (2000) 9. Kapetanakis, S., Kudenko, D.: Improving on the reinforcement learning of coordi- nation in cooperative multi-agent systems. In: AAMAS (2002) 10. Lauer, M., Riedmiller, M.: Reinforcement learning for stochastic cooperative mul- tiagent systems. In: AAMAS (2004) 11. Bowling, M.: Convergence and No-Regret in Multiagent Learning. In: Advances in Neural Information Processing Systems 17. MIT Press, Cambridge (2005) 12. Jafari, A., Greenwald, A., Gondek, D., Ercal, G.: On no-regret learning, fictitious play and nash equilibrium. In: Proceedings of the 18th International Conference on Machine Learning (2001) 13. Zapechelnyuk, A.: Limit behavior of no-regret dynamics. Technical report, School of Economics, Kyiv, Ucraine (2009) 14. Leslie, D., Collins, E.: Generalised weakened fctitious play. Games Econ. Behav. 56(2) (2006) 15. Brown, G.: Some notes on computation of games solutions. Research memoranda rm-125-pr, RAND Corporation, Santa Monica, California (1949) 16. Verbeeck, K., Nowé, A., Parent, J., Tuyls, K.: Exploring selfish reinforcement learn- ing in repeated games with stochastic rewards. In: JAAMAS (2006) 17. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MP, Cam- bridge (1998) 18. Watkins, C.J., Dayan, P.: Q-learning. Mach. Learn. 8, 279–292 (1992) 19. Yau, K.A., Komisarczuk, P., Teal, P.D.: Applications of reinforcement learning to cognitive radio networks. In: IEEE International Conference in Communications (ICC) (July 2010) 20. Yau, K.A., Komisarczuk, P., Teal, P.D.: Enhancing network performance in dis- tributed cognitive radio networks using single-agent and multi-agent reinforcement learning. In: IEEE Conference on Local Computer Networks (October 2010) 21. Vu, H.L., Sakurai, T.: Collision probability in saturated IEEE 802.11 networks. In: Australian Telecommunication Networks and Applications Conference (2006) 22. Hasselt, H.: Double q-learning. In: NIPS (2010)
Approaches to Classify Knee Osteoarthritis Using Biomechanical Data

Tiago Franco1(B), P. R. Henriques2, P. Alves1, and M. J. Varanda Pereira1

1 Research Centre in Digitalization and Intelligent Robotics, Polytechnic Institute of Bragança, Bragança, Portugal
{tiagofranco,palves,mjoao}@ipb.pt
2 ALGORITMI Centre, Department of Informatics, University of Minho, Braga, Portugal
prh@di.uminho.pt

Abstract. Knee osteoarthritis (KOA) is a degenerative disease that mainly affects the elderly. The development of this disease is associated with a complex set of factors that cause abnormalities in motor functions. The purpose of this review is to understand the composition of works that combine biomechanical data and machine learning techniques to classify KOA progression. This study was based on research articles found in the Scopus and PubMed search engines between January 2010 and April 2021. The results were divided into data acquisition, feature engineering, and algorithms to synthesize the discovered content. Several approaches were found for KOA classification with significant accuracy, with an average of 86% overall and three papers reaching 100%, that is, they did not fail once in their tests. Data acquisition proved to be the task on which the works diverge most; the strongest commonality at this stage was the use of the ground reaction force (GRF) sensor. Although three studies reached 100% in the classification, two did not use a gradual evaluation scale, classifying only between KOA and healthy individuals. Thus, this work indicates that machine learning techniques are promising for identifying KOA using biomechanical data. However, the classification of pathological stages remains a complex problem, mainly due to difficult data access and the lack of standardization in data acquisition.

Keywords: Knee osteoarthritis · Biomechanical · Data classification · Machine learning

1 Introduction

Osteoarthritis (OA), the most common form of arthritis, is a degenerative joint disease caused by rupture and eventual loss of the cartilage that lines the bony extremities [18]. This disease is directly related to aging and usually affects the knee, hip, spine, big toe, and hands. Estimates show that about 10% of men and
© Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 417–429, 2021.
https://doi.org/10.1007/978-3-030-91885-9_31
18% of women over 60 years of age have some form of OA, reinforcing its worldwide importance for public health [17]. In particular, knee osteoarthritis (KOA) is the type of OA with the highest perceived prevalence.

The most characteristic symptoms of KOA are usually changes in biomechanical behavior, such as joint pain, stiffness, cracking, and locking of the joints. These symptoms are typically noticed early because they interfere with daily activities, especially gait abnormalities [21].

The development of KOA is associated with a combination of a complex set of factors. These risk factors can be divided into non-modifiable and modifiable, including joint integrity, genetic predisposition, local inflammation, mechanical forces, and cellular and biochemical processes. Consequently, several treatments have been developed to improve motor function, relieve pain, and alleviate the deficiencies caused [10].

One of the fundamental pieces for developing a sophisticated treatment is the understanding and classification of the progression of KOA. As this information becomes more practical and accessible, more realistic monitoring of each case becomes possible, enabling the physiotherapist to prepare a more efficient specialized solution [2]. In addition, it also creates the possibility of expanding the early detection of KOA, reducing the prevalence of the disease in the global population [3].

The most common techniques used to classify KOA progression are X-rays and magnetic resonance imaging (MRI). Unfortunately, both have limitations; radiography is low-priced but is generally low-resolution and used only for initial evaluation. MRI has better accuracy and is helpful for monitoring the stages of the pathology, but it is expensive [2]. In addition, patients generally seek these techniques only after feeling discomfort, making it challenging to detect asymptomatic patients early.

Thus, researchers from different areas seek to propose innovative, low-cost, and scalable solutions for OA. The application of computational techniques, mainly machine learning, stands out, as it is constantly growing and cuts across most adjacent areas. The evolution of data acquisition and analysis capacity has aroused great interest in the scientific community, enabling new automated solutions for pre- or post-treatment [5].

As a result, several clinical and observational databases are being created, analyzed, and made available online. These data are highly dimensional, heterogeneous, and voluminous, opening up a new range of possibilities and challenges for improving KOA diagnosis [14].

Studies show evidence that biomechanical data, including kinematic-kinetic data and electromyography (EMG) signals, make it possible to recognize differences in the behavior of patients with KOA and indications of the pathological stage [1,17]. In addition, the review on the application of machine learning to KOA developed by Kokkotis et al. [5] shows that ML algorithms for KOA classification using biomechanical data perform similarly to models built on image data (X-rays and MRI).
That said, there was a desire to understand in more detail the composition of works that focus on biomechanical data, identifying and categorizing the most common and promising forms of data acquisition, feature engineering and algorithms. Thus, the present work reviews the literature with the purpose of systematizing and comparing the knowledge produced by articles that use biomechanical data to classify KOA through machine learning algorithms.

The paper is organized in four more sections after this one. Section 2 describes the methodology used, such as keywords and selection criteria; Sect. 3 compiles the results of this article, describing the relevant points found in the reviewed articles on data acquisition, feature engineering and algorithms; Sect. 4 discusses the results and the points that we believe should be highlighted; Sect. 5 reports the main conclusions and scientific contributions drawn from this study.

2 Methods

The literature review presented summarizes the works found in the Scopus and PubMed repositories. Several combinations of keywords were used in the search, the most recurring being: biomechanical, gait, machine learning, deep learning, classification, osteoarthritis. Searches were limited to publications between January 2010 and April 2021. The focus of the review is to find articles that use biomechanical data from patients with KOA. Therefore, three rules were used to define the inclusion criteria, namely:

1. Only articles that mention clinical evidence of the presence of KOA in the patients included in the study;
2. Only articles that work with biomechanical data;
3. Studies that propose to classify KOA using machine learning algorithms.

For the exclusion criteria, only two rules were added, namely:

1. Studies that did not specify KOA as the presenting pathology;
2. Studies that used analytical methods for KOA classification.

Two review papers with correlated objectives inspire this study. The review [5] synthesizes machine learning techniques for diagnosing knee osteoarthritis between 2006 and 2019. That study comprehensively explores the use of any data applied to knee osteoarthritis, including MRI, X-ray, kinetic and kinematic data, clinical data, and demographics. The second review [8] surveys the evaluation of knee osteoarthritis based on gait between 2000 and 2018. It is the most comprehensive study on the subject, discussing in detail the various techniques applied to the understanding of human gait and how it differs for KOA patients.

The present study differs from the ones mentioned by being more specific, including only works that use kinetic and kinematic data for the classification of knee osteoarthritis through machine learning techniques. The focus of this work is to detail the construction of the datasets and to explore the different models based on the biomechanical data.
3 Results

Twelve articles were selected for analysis according to the criteria described above. A list with their meta-information can be seen in Table 1.

Table 1. Selected articles
Index  Author                Year  Country  Repository
1      Köktaş [22]           2010  Turkey   Scopus
2      Moustakidis [15]      2010  Greece   Scopus
3      McBride [12]          2011  USA      Scopus
4      Kotti [6]             2014  UK       PubMed
5      Phinyomark [19]       2016  Canada   PubMed
6      Kotti [7]             2017  UK       Scopus
7      Muñoz-Organero [16]   2017  China    Scopus
8      Long [11]             2017  UK       Scopus
9      Mezghani [13]         2017  Canada   Scopus
10     Kobsar [4]            2017  Canada   PubMed
11     Kwon [9]              2020  Korea    PubMed
12     Vijayvargiya [20]     2020  India    Scopus

Two-thirds of the articles found are available on Scopus, and the majority were published after 2015, with the highest concentration in 2017 (5 of the 12). It was also noted that there is author overlap between the articles by Kotti [6,7] and Long [11], as well as between Phinyomark [19], Mezghani [13], and Kobsar [4].

N. Kour [8] describes that the pipeline commonly followed to classify KOA using computable data has five main stages, illustrated in Fig. 1. The first step is to acquire data from KOA patients and healthy individuals; the second step is pre-processing, where techniques are applied to improve the quality of the data; the third phase is the extraction of the most relevant characteristics and the reduction of the data volume; the fourth step is where the model is built and the classifier algorithm is optimized; lastly, the fifth stage is the final result of the process, the desired classification.

Fig. 1. The steps of the process followed in the diagnosis of KOA. Adapted from [8].
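To make the five-stage pipeline of Fig. 1 concrete, the sketch below shows how such a workflow could be wired together with scikit-learn. It is only an illustration of the generic structure described above, not code from any of the reviewed studies: the feature matrix X and labels y are synthetic placeholders, and the choice of standardization, PCA and an SVM classifier is just one plausible instantiation of steps 2 to 4.

```python
# Illustrative sketch of the generic KOA classification pipeline of Fig. 1
# (pre-processing -> feature extraction -> model building -> classification).
# The data below is synthetic; in the reviewed studies X would hold
# biomechanical features (e.g. GRF or joint-angle curves) and y the labels.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(180, 60))      # 180 subjects, 60 gait features (synthetic)
y = rng.integers(0, 2, size=180)    # 0 = healthy control, 1 = KOA (synthetic)

pipeline = Pipeline([
    ("scale", StandardScaler()),    # step 2: pre-processing (normalization)
    ("pca", PCA(n_components=10)),  # step 3: feature extraction / reduction
    ("clf", SVC(kernel="rbf")),     # step 4: classifier construction
])

# step 5: evaluation, here with the 10-fold cross-validation most studies report
scores = cross_val_score(pipeline, X, y, cv=10, scoring="accuracy")
print(f"mean accuracy: {scores.mean():.3f}")
```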
    Approaches to ClassifyKnee Osteoarthritis Using Biomechanical Data 421 Analogously, the selected articles follow a structure similar to that shown in Fig. 1, diverging in some cases in steps 2 and 3, which may appear together with the name of feature engineering. The approaches adopted are the main outcomes of this work, the results of this work are organized in three parts that compete with the classification process, being: (1) the description of the data collection scenario and its specifications, (2) the path that the data takes after its collection until transform into input/output of the classifier algorithm and (3) the characteristics of the machine learning models produced by the reviewed articles. 3.1 Data Acquisition Different approaches were applied to acquire biomechanical data. Only one study [4] did not add data from healthy individuals to healthy control (HC). The most used resource for the acquisition was the force plates (FP) and the only one that was adopted in combination with the other two found kinematic (KMC) and wearable devices. Consequently, the most applied sensor was the Ground reac- tion force (GRF), but the Inertial measurement unit (IMU), electromyography (EMG), and Goniometer (GO) sensors were also seen. The detailed list of the characteristics of the datasets developed in the studies can be seen in Table 2. Table 2. Characteristics of the datasets Author Year Size Sensors Resources Additional data Köktaş [22] 2010 +150 (KOA/HC) GRF KMC+FP Yes Moustakidis [15] 2010 24 KOA/12 HC GRF FP No McBride [12] 2011 20 KOA/10 HC GRF KMC+FP No Kotti [6] 2014 47 KOA/133 HC GRF FP No Phinyomark [19] 2016 100 KOA/43 HC GRF KMC+FP No Kotti [7] 2017 47 KOA/47 HC GRF FP No Muñoz-Organero [16] 2017 14 KOA/14 HC GRF Wearable No Long [11] 2017 92 KOA/84 HC GRF KMC+FP Yes Mezghani [13] 2017 100 KOA/40 HC IMU Wearable Yes Kobsar [4] 2017 39 KOA IMU Wearable No Kwon [9] 2020 375 (KOA/HC) GRF KMC+FP Yes Vijayvargiya [20] 2020 11 KOA/11 HC EMG+GO Wearable No The collection of biomechanical data is not an easy task, much less stan- dardized. All studies that describe the creation of the dataset diverge at some point. Even those that use the same resource and sensors diverge in the data collection equipment. A good example is the data provided by kinematic, usu- ally applied for the acquisition of joint angles (knee, hip) and movement in the anatomical planes (sagittal, frontal, horizontal); the number of cameras used in [9,11,12,19,22], was 6, not specified, 8, 12, 10, respectively. In addition, most of
the software used to acquire the data is commercial and closed source, making it difficult to understand the actual correspondence between the datasets.

Only four studies used wearable devices: two commercial devices and two developed by the authors, and the technology adopted in each one is quite different. The article [16] applies an insole composed of 8 Force Sensing Resistors (FSR). Study [13] also applies a commercial device, but in this case the wearable uses IMU sensors in the knee region to capture the movements in dedicated software. Similarly, [4] also uses IMU sensors; however, that study proposes the creation of a wearable device, testing different positions for gait identification. Study [20] also proposes the creation of a wearable for the acquisition, using EMG sensors and a goniometer to measure the angle between the thigh and the shin.

The number of patients included varies between the works, with a minimum of 22 [20] and a maximum of 375 [9] individuals divided between KOA and HC. All studies mention clinical confirmation of the diagnosis of KOA in the patients, but not all register the degree of the pathology.

One-third of the selected works included additional data in their datasets, mainly standardized patient-reported outcome measures (PROMs) forms. The articles [13] and [11] adopted the Knee injury and Osteoarthritis Outcome Score (KOOS) and [9] the Western Ontario and McMaster University Osteoarthritis Index (WOMAC). Finally, [22] follows a different approach, describing a form with the fields of age, body mass index and pain level. Regardless of this division, in practice most of the selected works add some complementary information; for example, the participant's weight is used to standardize the ground forces across the entire dataset.

3.2 Feature Engineering

In the present study, feature engineering denotes the process the data go through from the resources adopted by each study until they become the input/output of the classifier algorithm. So, according to Fig. 1, we divide the applied techniques into pre-processing and feature extraction, and add the evaluation scale to give a complete picture of each dataset. Table 3 summarizes the main techniques mentioned by the authors.

As can be seen, all articles apply techniques or a sequence of calculations before feature extraction. It is important to note that the authors who used more sophisticated software and technology describe the pre-processing in less detail, since the data had already been treated, differently from the others. The most common techniques were normalization by weight and synchronization of gait between individuals.

Three techniques are highlighted in pre-processing: the first is the discrete wavelet transform (DWT), applied to reduce the noise of the raw EMG signals and provide information in the frequency and time domains; the second is the overlapping window, applied to divide the signal into parts, both from article [20]. Lastly, [15] applied a derivation of the DWT called the Wavelet Packet transform, which includes a more detailed decomposition of the signal.
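As an illustration of this pre-processing step (wavelet denoising of raw EMG followed by overlapping windows, as reported in [20]), the sketch below uses the PyWavelets library. The wavelet choice, decomposition level, threshold rule and window sizes are assumptions made for the example, not parameters reported by [15] or [20].

```python
# Hedged sketch of typical EMG pre-processing: DWT denoising followed by
# splitting the signal into overlapping windows. Wavelet, level, threshold
# and window sizes are illustrative assumptions only.
import numpy as np
import pywt

def dwt_denoise(signal, wavelet="db4", level=4):
    """Soft-threshold the detail coefficients of a 1-D signal."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745          # noise estimate
    thr = sigma * np.sqrt(2 * np.log(len(signal)))           # universal threshold
    coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(signal)]

def overlapping_windows(signal, width=256, step=64):
    """Split a signal into overlapping windows for feature extraction."""
    return np.array([signal[i:i + width]
                     for i in range(0, len(signal) - width + 1, step)])

emg = np.random.default_rng(1).normal(size=4096)              # synthetic raw EMG
clean = dwt_denoise(emg)
windows = overlapping_windows(clean)
print(windows.shape)                                          # (n_windows, 256)
```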
    Approaches to ClassifyKnee Osteoarthritis Using Biomechanical Data 423 Table 3. Feature engineering Author Year Pre-processing Feature Extraction Evaluation Scale Köktaş [22] 2010 Calc. of joint angles Mahalanobis K-L Moustakidis [15] 2010 Normal. by weight, Wavelet Packet FuzCoC K-L adap. McBride [12] 2011 Calc. of joint angles Analysis K-L Kotti [6] 2014 Normal. by weight, gait Sync PCA KOA or HC Phinyomark [19] 2016 Calc. of joint angles PCA KOA or HC Kotti [7] 2017 Normal. by weight, gait Sync Analysis 0–2 develop Muñoz-Organero [16] 2017 Normal. by altitude and duration Mahalanobis KOA or HC Long [11] 2017 Calc. of joint angles, normal. by weight, gait Sync Analysis KOOS adap. Mezghani [13] 2017 Calc. of joint angles Analysis K-L Kobsar [4] 2017 Normal. by altitude, gait Sync PCA KOOS adap. Kwon [9] 2020 Calc. of joint angles ANOVA, student t-test WOMAC adap. Vijayvargiya [20] 2020 Filtered, Normal., DWT, Overlapping Windows Backward Elimin. KOA or HC Four studies performed a manual analysis to extract characteristics, study- ing the variance of their fields and making them more powerful. The other 8 chose standardized methods, with a more significant predominance, 3 articles, the Principal component analysis (PCA). PCA is an orthogonal or linear trans- formation technique used to convert a set of possibly correlated variables into a set of linearly unrelated variables called Principal Components (PC) that max- imize the variability in the original data [19]. Two articles applied the Mahalanobis distance in its Extraction of Char- acteristics phase, an algorithm that can be applied as a selection criterion. In summary, the Mahalanobis algorithm calculates the distance between fields in the pattern due by the mean. In this way, it is possible to reduce the fields by distinguishing the highest and lowest variance under the Mahalanobis distance observation. The statistical methods A One-way Analysis of Variance (ANOVA) and Stu- dent’s T-Test were the methods applied by [9] to determine which features were significantly related to the desired evaluation scale. The feature extraction per-
    424 T. Francoet al. formed by [15] followed a sequential selection according to the Complementary Fuzzy Criteria (FuzCoC), adjusting the nodes of the decision tree algorithm. For [20], 11 characteristics were initially extracted in the time domain of the EMG signal. Afterward, a Backward Elimination technique was applied, remov- ing less significant fields by iteration. This work applies different pre-processing and feature extraction techniques because it is the only one to use EMG data for classification. All the studies mentioned about the Kellgren and Lawrence classification (K-L) to evaluate the OA progression. The K-L scale is summarized in a set of evidence that an x-ray needs to confirm one of the 5◦ , 0 being healthy and 4 being the most critical pathological level. This method is commonly adopted for its practicality and was accepted by the World Health Organization (WHO) in 1961. Despite this, only four studies used the k-L scale as an evaluation scale. The special addendum to [15], that grouped the classification between the scores of 1 or 2 as moderate and the score of 3 or 4 as severe. Two articles took advantage of patient reporting forms to produce a classifi- cation adapted to their dataset. Author [4] used the variations of KOOS forms made by patients before and after a treatment. In this way, those who had a posi- tive change in the form were considered as responding to treatment, consequently those who did not change or worsened were considered as not responding. The second [9] used WOMAC to produce its scale. There was an adaptation trans- forming the values from 1 to 5 of the form into three classes: mild, moderate and severe. Lastly, the study [7] developed an evaluation scale for KOA pathology, start- ing from 0 for the healthiest individual and continues linearly up to 2, equivalent to both knees with severe KOA. In addition, a maximum value of 0.5 was placed for the individual to be considered without the presence of KOA. 3.3 Algorithms and Results Algorithms of different natures were applied for the classification of KOA. These algorithms are Neural networks, Multilayer Perceptron (MLP), Support-vector machine (SVM), decision tree-based algorithms, Random Forest, Regression tree, Extra tree, statistical algorithms, Bayes classifier and, linear discriminant anal- ysis (LDA) and clustering algorithm k-nearest neighbors (kNN). A combination of algorithms was also proposed, Köktaş [22] used a decision tree under the dis- crete data to decide which MLP to use to classify the biomechanical data. The article [15] combined the Fuzzy methodology to improve a decision tree based on SVM, denoting Fuzzy decision tree-based SVM (FDT-SVM). Although some works have presented several algorithms and adaptations to improve performance, we will consider only the components that produced the best performance pointed out by the authors of the reviewed articles. The detailed list can be found in Table 4. The results obtained by the studies were 72.61% up to classifications that did not fail once in their tests with 100% accuracy. The studies that reported 100% of their results used the SVM and kNN algorithms. The two papers that
    Approaches to ClassifyKnee Osteoarthritis Using Biomechanical Data 425 Table 4. Algorithms with better performance Author Year Algorithm Train/Test Validation Result Köktaş [22] 2010 Decision tree + MLP 66-33 10F-CV 80% Moustakidis [15] 2010 FDT-SVM 90-10 10F-CV 93.44% McBride [12] 2011 Neural networks 66-33 – 75.3% Kotti [6] 2014 Bayes classifier – 47F-CV 82.62% Phinyomark [19] 2016 SVM – 10F-CV 98–100% Kotti [7] 2017 Random forest 50-50 5F-CV 72.61% Muñoz-Organero [16] 2017 SVM – – 100% Long [11] 2017 KNN 70-30 CV 100% Mezghani [13] 2017 Regression tree 90-10 10F-CV 85% Kobsar [4] 2017 LDA – 10F-CV 81.7% Kwon [9] 2020 Random forest 70-30 HCV 74.1% Vijayvargiya [20] 2020 Extra tree – 10F-CV 91.3% performed the Random Forest algorithm and Neural networks achieved results below 76%. Statistical algorithms obtained similar results at around 82%. There was no agreement between studies on the distribution of data between training data and test data. Five of the 12 studies did not perform the distri- bution. That is, the entire dataset was used in the training phase. The greatest co-occurrence was among the papers [9,11,12], with approximately 2/3 of the dataset for training and 1/3 for testing. Only two studies did not apply or did not mention the use of the validation technique. All others applied some variation of cross-validation (CV). This tech- nique is summarized in the division of the training data in X parts (XF-CV), where X-1 parts will be used in training and 1 part for testing. Then, the algo- rithm will be trained X times with the addition of circularly alternating the test subset. Half of the works discussed adopted the most known form of this technique, the 10-fold cross-validation. The study [9] used the variation of the technique called holdout (HCV), which consists of separating the training dataset into two equal parts. Within this set, Article [6] stands out for using the division by 47. The authors report that they opted for this division divided by having only 47 individuals with KOA; thus, 46 rows were used for training and one for testing. 4 Discussion It’s clear from the studies analyzed that each work has its characteristics and limitations that must be analyzed separately. However, in a general context, works that proposed the KOA classification reached significant precision, with an average of 86% overall and three works reaching 100%. The variety of approaches
    426 T. Francoet al. found shows how the classification of KOA is a topic that can be explored in different ways, even limiting under the biomechanical data. It is possible to highlight that 9 of the 12 articles used GRF sensors. Even those who had all the joints available through the acquisition of data via Kine- matics did not discard the use of GRF. Kotti [6] describes that if a pathological factor is present in a patient, the movements are expected to be systematically altered. Thus, as shown in the works presented, the gait movement is altered in the case of KOA and, it is possible with only the ground reaction to identify these changes. Work [22] has a different implementation from the others due to two facts. The first fact is that it uses additional data (age, body mass index, and pain level) to build a decision tree. The second fact is that the work uses the decision tree to select which MLP will be used to classify the grade of the KOA. This approach becomes interesting since the discrete data has properties quite different from the timeseries, allowing for optimization separately. Only one article [4] does not consider healthy individuals in the composition of the dataset. Consequently, it fails to classify or understand the difference between healthy and KOA. Although there is no cure for KOA, this information is still crucial to the classification problem. The study [20] was innovative within the set studied by using EMG sensors. With a good accuracy of 91.3%, the study shows that it is possible to distinguish a person who has KOA or not through multiple EMG sensors. The study authors comment that the composition of the sensors makes the classification happen. This is because EMG detects muscle contraction, and with only one muscle, it is difficult to identify the difference for a pathological patient. However, when sev- eral sensors are added, it is possible to distinguish a different activation pattern on the leg muscles when a movement is made. It is known that the classification of multiple classes is usually more complex than binary classification, as in the case of KOA or HC. Based on this, Table 5 adds the column of the adopted evaluation scale to the results of the algorithms. As we can see, all the works that classified between KOA or HC have better outcomes and, did not divide their datasets between training and test. It was also noted that the SVM algorithm had the best accuracy, but it was not used to classify the degree of the KOA pathology. The studies that adopted the Kellgren and Lawrence scale for their models obtained a maximum of 85% accuracy. The special case in this category was Moustakidis [15], which grouped 4◦ in 2, reducing the number of classes and achieving a performance of 93.44%. Analogously, with the exception of study [11], the 3 works that sought a different scale had a similar result not above 82%. Finally, Long [11] stands out from this review, mainly because it proves to be the most complete. The study has one of the largest datasets (92 KOA/84 HC) collected by kinematic and force plates, contributing to the reliability of the work. Despite not using the Kellgren and Lawrence scale, the study also uses a gradual scale for the classification. As a surprise, the algorithm that achieved
    Approaches to ClassifyKnee Osteoarthritis Using Biomechanical Data 427 Table 5. Results ordered by the evaluation scale Author Year Algorithm Evaluation Scale Train/Test Result Kotti [6] 2014 Bayes classifier KOA or HC – 82.62% Phinyomark [19] 2016 SVM KOA or HC – 98–100% Muñoz-Organero [16] 2017 SVM KOA or HC – 100% Vijayvargiya [20] 2020 Extra tree KOA or HC – 91.3% Köktaş [22] 2010 Decision tree + MLP K-L 66–33 80% Moustakidis [15] 2010 FDT-SVM K-L adap. 90–10 93.44% McBride [12] 2011 Neural networks K-L 66–33 75.3% Mezghani [13] 2017 Regression tree K-L 90–10 85% Kotti [7] 2017 Random forest 0–2 develop 50–50 72.61% Long [11] 2017 KNN KOOS adap. 70–30 100% Kobsar [4] 2017 LDA KOOS adap. – 81.7% Kwon [9] 2020 Random forest WOMAC adap. 70–30 74.1% 100% accuracy for this study was the KNN, the only case of clustering algorithm in this review. 5 Conclusion In this review, several approaches were seen to classify KOA progression using biomechanical data. As seen, all studies that applied a binary classification, that is, models capable of identifying whether an individual has KOA or not, had good accuracy, even with small datasets and different sensors. In this way, these works bring evidence that the differences in the biomechanics of a patient with KOA are considerably noticeable by the machine learning algorithms. However, this perception is not so clear for the classification of the KOA progression. Except for one article, all works found that opted for a KOA gradu- ated scale had inferior results. In addition to the natural difficulty added by the classification of multiple classes, non-standard data acquisition seems crucial in this problem. In fact, acquiring data to monitor pathologies is always a challenge, no dif- ferent for KOA. The construction of the reported data sets requires the approval of a committee, clinical evidence, multiple processes, and specialized personnel. In this way, the datasets explored are pretty distinct from each other. Not only in the number of patients and divergent equipment but also the evaluation scale. Thus, it is not possible to affirm that the data acquired by a study can reach similar precision to other models presented. Despite this, studies such as Long [11] prove that it is possible to classify the progression of KOA efficiently. This work proved to be the most complete of the reviewed and showed confidence in the results. Although the study did not apply a known technique in the feature engineering stage, the authors describe a manual analysis for feature extraction.
    428 T. Francoet al. This work contributes to the characterization of the machine learning models developed for the classification of KOA. This review provides comparisons under the main steps followed for the diagnosis of KOA using biomechanical data. Thus, it is expected that the topics addressed will serve as a starting point for new approaches. As future work, we would like to compare the different approaches studied with the same data set and promote better reliable comparisons to the problem. Acknowledgment. This work was supported by FCT - Fundação para a Ciência e a Tecnologia under Projects UIDB/05757/2020, UIDB/00319/2020 and individual research grant 2020.05704.BD, funded by Ministério da Ciência, Tecnologia e Ensino Superior (MCTES) and Fundo Social Europeu (FSE) through The Programa Opera- cional Regional Norte. References 1. Amer, H.S.A., Sabbahi, M.A., Alrowayeh, H.N., Bryan, W.J., Olson, S.L.: Elec- tromyographic activity of quadriceps muscle during sit-to-stand in patients with unilateral knee osteoarthritis. BMC Res. Notes 11, 356 (2018). https://doi.org/10. 1186/s13104-018-3464-9 2. Bijlsma, J.W., Berenbaum, F., Lafeber, F.P.: Osteoarthritis: an update with rele- vance for clinical practice. Lancet 377(9783), 2115–2126 (2011). https://doi.org/ 10.1016/S0140-6736(11)60243-2 3. Chu, C.R., Williams, A.A., Coyle, C.H., Bowers, M.E.: Early diagnosis to enable early treatment of pre-osteoarthritis. Arthritis Res. Ther. 14, 1–10 (2012). https:// doi.org/10.1186/ar3845 4. Kobsar, D., Osis, S.T., Boyd, J.E., Hettinga, B.A., Ferber, R.: Wearable sensors to predict improvement following an exercise intervention in patients with knee osteoarthritis. J. Neuroeng. Rehabil. 14(1), 1–10 (2017) 5. Kokkotis, C., Moustakidis, S., Papageorgiou, E., Giakas, G., Tsaopoulos, D.: Machine learning in knee osteoarthritis: a review. Osteoarthr. Cartil. Open 2(3), 100069 (2020). https://doi.org/10.1016/j.ocarto.2020.100069 6. Kotti, M., Duffell, L., Faisal, A., Mcgregor, A.: The complexity of human walking: a knee osteoarthritis study. PloS One 9, e107325 (2014). https://doi.org/10.1371/ journal.pone.0107325 7. Kotti, M., Duffell, L.D., Faisal, A.A., McGregor, A.H.: Detecting knee osteoarthri- tis and its discriminating parameters using random forests. Med. Eng. Phys. 43, 19–29 (2017). https://doi.org/10.1016/j.medengphy.2017.02.004 8. Kour, N., Gupta, S., Arora, S.: A survey of knee osteoarthritis assessment based on gait. Arch. Comput. Methods Eng. 28(2), 345–385 (2020). https://doi.org/10. 1007/s11831-019-09379-z 9. Kwon, S.B., Ku, Y., Lee, M.C., Kim, H.C., et al.: A machine learning-based diag- nostic model associated with knee osteoarthritis severity. Sci. Rep. 10(1), 1–8 (2020) 10. Lespasio, M.J., Piuzzi, N.S., Husni, M.E., Muschler, G.F., Guarino, A., Mont, M.A.: Knee osteoarthritis: a primer. Perm. J. 21, 16–183 (2017). https://doi.org/ 10.7812/TPP/16-183
    Approaches to ClassifyKnee Osteoarthritis Using Biomechanical Data 429 11. Long, M.J., Papi, E., Duffell, L.D., McGregor, A.H.: Predicting knee osteoarthritis risk in injured populations. Clin. Biomech. 47, 87–95 (2017). https://doi.org/10. 1016/j.clinbiomech.2017.06.001 12. McBride, J., et al.: Neural network analysis of gait biomechanical data for classifi- cation of knee osteoarthritis. In: Proceedings of the 2011 Biomedical Sciences and Engineering Conference: Image Informatics and Analytics in Biomedicine, pp. 1–4 (2011). https://doi.org/10.1109/BSEC.2011.5872315 13. Mezghani, N., et al.: Mechanical biomarkers of medial compartment knee osteoarthritis diagnosis and severity grading: discovery phase. J. Biomech. 52, 106–112 (2017). https://doi.org/10.1016/j.jbiomech.2016.12.022 14. Moustakidis, S., Christodoulou, E., Papageorgiou, E., Kokkotis, C., Papandrianos, N., Tsaopoulos, D.: Application of machine intelligence for osteoarthritis classi- fication: a classical implementation and a quantum perspective. Quantum Mach. Intell. 1(3), 73–86 (2019). https://doi.org/10.1007/s42484-019-00008-3 15. Moustakidis, S., Theocharis, J., Giakas, G.: A fuzzy decision tree-based SVM classi- fier for assessing osteoarthritis severity using ground reaction force measurements. Med. Eng. Phys. 32(10), 1145–1160 (2010). https://doi.org/10.1016/j.medengphy. 2010.08.006 16. Muñoz-Organero, M., Littlewood, C., Parker, J., Powell, L., Grindell, C., Mawson, S.: Identification of walking strategies of people with osteoarthritis of the knee using insole pressure sensors. IEEE Sens. J. 17(12), 3909–3920 (2017). https:// doi.org/10.1109/JSEN.2017.2696303 17. Nelson, A.: Osteoarthritis year in review 2017: clinical. Osteoarthr. Cartil. 26(3), 319–325 (2018). https://doi.org/10.1016/j.joca.2017.11.014 18. Nelson, A.E., Jordan, J.M.: Osteoarthritis: epidemiology and classification. In: Hochberg, M.C., Silman, A.J., Smolen, J.S., Weinblatt, M.E., Weisman, M.H. (eds.) Rheumatology, 6th edn., pp. 1433–1440. Mosby, Philadelphia (2015). https://doi.org/10.1016/B978-0-323-09138-1.00171-6 19. Phinyomark, A., Osis, S.T., Hettinga, B.A., Kobsar, D., Ferber, R.: Gender differ- ences in gait kinematics for patients with knee osteoarthritis. BMC Musculoskelet. Disord. 17(1), 1–12 (2016) 20. Vijayvargiya, A., Kumar, R., Dey, N., Tavares, J.M.R.S.: Comparative anal- ysis of machine learning techniques for the classification of knee abnormal- ity. In: 2020 IEEE 5th International Conference on Computing Communication and Automation (ICCCA), pp. 1–6 (2020). https://doi.org/10.1109/ICCCA49541. 2020.9250799 21. Zhang, Y., Jordan, J.M.: Epidemiology of osteoarthritis. Clin. Geriatr. Med. 26(3), 355–369 (2010). https://doi.org/10.1016/j.cger.2010.03.001 22. Şen Köktaş, N., Yalabik, N., Yavuzer, G., Duin, R.P.: A multi-classifier for grad- ing knee osteoarthritis using gait analysis. Pattern Recogn. Lett. 31(9), 898–904 (2010). https://doi.org/10.1016/j.patrec.2010.01.003
Artificial Intelligence Architecture Based on Planar LiDAR Scan Data to Detect Energy Pylon Structures in a UAV Autonomous Detailed Inspection Process

Matheus F. Ferraz1(B), Luciano B. Júnior1, Aroldo S. K. Komori1, Lucas C. Rech1, Guilherme H. T. Schneider1, Guido S. Berger1, Álvaro R. Cantieri2, José Lima3, and Marco A. Wehrmeister1

1 The Federal University of Technology - Paraná, Curitiba, Brazil
{mferraz,lucjun,aroldok,lucasrech,ghideki}@alunos.utfpr.edu.br, wehrmeister@utfpr.edu.br
2 Federal Institute of Paraná, Curitiba, Brazil
alvaro.cantieri@ifpr.edu.br
3 Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, Portugal and INESC TEC, Porto, Portugal
jllima@ipb.pt
http://www.ifpr.edu.br

Abstract. Technological advances in Unmanned Aerial Vehicles (UAV) related to energy power structure inspection have gained visibility over the past decade, due to the advantages of this technique compared with traditional inspection methods. In the particular case of power pylon structures and components, autonomous UAV inspection architectures are able to increase the efficacy and security of these tasks. This kind of application presents technical challenges that must be faced to build real-world solutions, especially the precise positioning and path following of the UAV during a mission. This paper evaluates a novel architecture applied to a power line pylon inspection process, based on machine learning techniques to process and identify the signal obtained from a UAV-embedded planar Light Detection and Ranging (LiDAR) sensor. A simulated environment built in the Gazebo software provides a first evaluation of the architecture. The results show a positive detection accuracy above 97% using the vertical scan data and 70% using the horizontal scan data. This accuracy level indicates that the proposed architecture is suitable for the development of positioning algorithms based on the LiDAR scan data of a power pylon.

Keywords: UAV LiDAR pylon detection · Detailed electric pylon inspection · Machine learning pylon detection

© Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 430–443, 2021.
https://doi.org/10.1007/978-3-030-91885-9_32
    LiDAR-Based Architecture forAutonomous UAV Pylon Inspection 431 1 Introduction Power line pylon inspection is a regular task performed by energy enterprises to ensure energy infrastructure systems’ operational security. Power line struc- tures are robust constructions, but deterioration on their components demands preventive maintenance to keep the quality service in power distribution. The detailed pylon inspection is commonly executed by technicians that reach the base of the energy pylon on foot and climb it to observe the details of its compo- nents and structure while looking for defects that could compromise the system. These tasks are hard to execute and risky because it is usually carried out with the energy structure in regular operation. The manufacture advances of small unmanned aircraft, specifically the multi- rotor types, allows the proposition of new inspection techniques, including the pylon detailed inspections based on this kind of vehicle embedded with regular and thermal cameras. Unmanned Aerial Vehicles (UAV) started being used on this kind of inspection in the last two decades, driven by the offer of cheap and robust small aircraft on the global market. For detailed structure inspection, multi-rotor aircraft presents some advantages, like static flight capability, vertical landing, precise position reaching, and 3D path following, among others. This kind of operation is commonly made with the aircraft remotely piloted by a human operator. To achieve the detailed images properly, the aircraft must fly near the pylon while performing a displacement around it, avoiding obstacles, and keeping a security distance from the structure and components. It is a hard process based on the pilot skills. Recent research proposes the use of autonomous multi-rotor aircraft to inspect the power line structure. The use of autonomous aircraft for detailed pylon inspections offers some challenges for the proper operation, like precise positioning and path following, effective obstacle detection and collision avoid- ance, robust flight control and path planning for the operation, component defect identification, among others. Autonomous flight demands a high accuracy posi- tion system to be executed properly. The most common approach to providing precise position data to UAV flight currently uses Differential Global Navigation Satellite System (DGNSS). This technique is based on using the difference of satellite signal received between two reception antennas, the first one is placed in a static reference point, and the second one is placed on-board the aircraft, to calculate a high accuracy position output data for the mobile module. The global market presents a considerable number of small DGNSS hardware proper to be used in small aircraft. The technique provides 1.0-cm level horizontal position accuracy but is sensitive to the environment and operational conditions. To face this problem, the proposition of additional position algorithms for UAV precise flight is presented in the literature, mostly based on computer vision systems. The precise positioning of the aircraft is essential to provide information to the flight controller to assure flight security in a detailed inspection process. In the specific case of the power line structure inspection, some specific challenges are presented, like:
    432 M. F.Ferraz et al. – Energy power pylons are complex structure, composed of thin metallic com- ponents, hard to detect by conventional sensors; – High voltage energy that flows within the cables generates electromagnetic interference in the navigation sensors of the aircraft during the inspection operation; – Presence of trees, civil constructions, and other similar objects around the pylons and transmission corridors makes it hard to identify the correct posi- tion of the UAV based on intelligent sensor data processing; – The need for the aircraft flying close to the tower to obtain adequate images and data for analysis brings high risky of collision, a serious problem to this kind of application; – Maintaining the aircraft’s position and orientation for the adequate acqui- sition of images, due to the presence of gusts of wind and the uncertainty present in the common positioning sensors, is a very demanding task for the autonomous control system; – DGNSS systems are sensitive to environmental characteristics, like coverage of the receptor antenna, proper satellite signal reception, and data link WiFi reception. Considering these demands, the proposition of UAV positioning sys- tems based on intelligent computational techniques is an interesting field of application-oriented research. The main objective of this kind of technique is to assure the aircraft control system keep the correct position and orientation, in addition to the trajectory following during the flight. The present work proposes an Artificial Intelligence-based architecture specifically designed for power pylon detection, focused on developing a LiDAR positioning system for autonomous detailed inspection Tasks. It is based on the processing of planar Light Detection And Range (LiDAR) sensors data embedded on the aircraft. This paper presents an initial evaluation of an architecture to detect a pylon in the UAV flight area; it also identifies its direction using only the embedded LiDAR scan data. The simulations were run in the virtual robotic environment GAZEBO, providing an overview of the proposed technique’s performance. The remaining of the paper is organized as follows: After this introduc- tory section, Sect. 2 discusses the related work and highlights this work’s con- tributions. Section 3 presents the general problem description is presented and describes the proposed architecture. Section 4 presents the experimental setup and simulation results, and Sect. 5 discusses the obtained results. Finally, Sect. 6 presents the conclusions and points out future works directions. 2 Related Works The common technique to provide position data to UAV outdoor flight is the use of DGNSS modules. This technique computes the phase difference between two GNSS antennas to increase the accuracy of the positioning data output. Nowadays, a considerable number of small-size DGNSS hardware is commer- cially available, proper for UAV applications. The most used commercial flight
    LiDAR-Based Architecture forAutonomous UAV Pylon Inspection 433 controllers can work with these GNSS modules data to provide high accuracy positions to the aircraft, allowing autonomous mission programming. This tech- nique is one of the most common solutions to provide precise horizontal position- ing to UAV but demands good operational conditions to work properly. Some environment conditions, like the presence of obstacles, antenna shadowing, mag- netic field interference, and cloudy weather conditions, may affect the system accuracy significantly [19], justifying the development of complementary posi- tioning solutions to integrate the UAV control system. A common approach to propose UAV position and navigation data is using computer vision algorithms that collect images based on regular or stereoscopic cameras embedded on the UAV. Some review papers were published about this subject, presenting the main applications and challenges about the practical use of vision-based UAV control algorithms. This includes the influence of outdoor light variation, difficult of identifying object clues in the images to provide data information for the vision algorithms, and the high variability of the flight sites composition and objects, making it hard to define a general image algorithm processing for all kind of application [1,10,12]. Considering the specific area of energy power structure autonomous inspec- tion, two kinds of computer vision algorithm approaches are found in the lit- erature: (a) the power line following applications, used for a long-range and long-distance visual inspection, and (b) the pylon detection and localization for small distance detailed inspection operation. The line following applications has the main objective of identifying the posi- tion of the UAV related to the energy power cables and transmission structure during its displacement along the lines when the aircraft capture images of the energy cables and components for an overview of the structural conditions. In this situation, the algorithm must keep the UAV navigation in the correct path and distance from the structures during the aircraft displacement (10.0 m to 50.0 m commonly), using the images captured to provide information to the dis- tance and orientation estimators, feeding the flight controller. Most proposals use image processing techniques to extract the power line from an image and calcu- late the position and orientation of the UAV related to it [2,6,9,11,13,14,18]. Another approach uses pylon detection and identification on images to pro- vide a direction to the UAV displacement. In these cases, the vision algorithms extract the pylon features from the image and calculate its position, feeding the flight control hardware to pursuit the “target” and displace it to the next pylon. An example of this approach is presented in the work [8]. Some works presented in the literature apply vision algorithms to provide a high accuracy position data to the UAV related to the power pylon when it executes the flight in small distances (2.0 m to 6.0 m commonly) to capture detailed images of the structure components. A pylon distance estimation algorithm based on monocular image processing is presented in [3]. It uses the UAV displacement based on the GPS and the air- craft’s IMU (Inertial Measurement Unity) data to calculate the pylon position using the image match points for two consecutive images. A position measure-
    434 M. F.Ferraz et al. ment error lower than 1.0 m has been reported as a result (for the best samples) of a 10.0 m distance flight in real-world experiments, using a pylon model. Another similar work proposes a “Point-Line-Based SLAM” technique based on a monocular camera to calculate the center of the pylon. A 3D point cloud is estimated from the monocular images at each algorithm interaction, providing the capability to calculate the distance from the UAV and the pylon. The results present an average position error of 0.72 m [5]. Using LiDAR sensors to provide distance data for energy power inspection is a good approach to solve problems related to the inspection tasks. The main application of this kind of sensor in UAV energy power inspection is focused on the mapping of the structures using the LiDAR point cloud data to reconstruct the real conditions of the transmission lines [4,9,16,17]. Another approach that proposes a distance estimation of a pylon for a detailed inspection application, based on a planar LiDAR sensor, is presented. In the work, a planar LiDAR carried onboard by a multi-rotor aircraft collects horizontal data from the pylon structure and uses such information to calcu- late the geometric centroid of the pylon. One issue of this technique is that it demands the aircraft to keep the alignment to the pylon to obtain proper data to feed the position calculation. Also, the inclination of the measurement plane, due to the movement of the aircraft, has a significant influence on the position error calculated by the algorithm, as described in [15]. This brief state-of-art review shows that intelligent computing algorithms based on a planar LiDAR sensor data can provide a significant contribution to this specific area of application, especially when they are employed to identify the pylon position/orientation and to calculate its distance from a UAV flying close to it. The work described in the present paper is a first evaluation of the application of machine learning algorithms to face the mentioned challenges. 3 Problem Description The power pylon detailed inspection demands the UAV displacement around its structure. In this operation, the technician goal is to achieve a close-up image of the pylon components to verify their integrity. This operation requires that the aircraft stays hovering close to the targets points, typically within 4.0 m. The robustness of the flight control depends on the correct evaluation of the UAV position, in this case, with high accuracy. Although a centimeter-level accuracy obtained by a DGNSS system is suitable for this operation, the malfunction of this system demands the proposal of additional positioning systems to assure the security of the flight, as explained earlier. This work main goal is, as described below, to run an initial evaluation of a positioning architecture based on an Artificial Intelligence algorithm to detect an electric power pylon close to a UAV in short distances, in order to assist the detailed pylon inspection tasks. Such an architecture is based on two planar LiDAR sensors to scan, respectively, the horizontal and vertical planes. Each
sensor scan provides planar point signature data to intelligent algorithms, in order to correctly identify the pylon within the flight area and to indicate the orientation of its structure with respect to the UAV pose.

This research was based on the expertise of local energy company technicians, who provided essential pieces of information regarding the real-world UAV inspection process. The local company's technicians perform the detailed pylon inspection using a commercial remotely piloted aircraft that captures images of the structure for an offboard post-processing evaluation.

A UAV-based autonomous pylon inspection architecture must be able to reproduce the human-piloted behavior while keeping the flight secure during the operation. An important demand in this regard is to identify the presence of the pylon in front of the UAV and to keep a secure distance from the structure, to prevent a possible collision. A LiDAR sensor embedded on the UAV is a possible way to estimate the distance between the aircraft and the pylon at close range. To work properly, the flight control system must receive reliable information that the structure detected by the LiDAR is the pylon, since there are several other objects in the flight area that could be detected, generating a false distance estimation. Considering this, a pylon detection and identification algorithm must be provided.

To face these challenges, this work proposes an architecture based on two planar LiDAR sensors embedded on the UAV, which scan the horizontal and the vertical planes and feed the AI algorithms. To allow the correct operation of the algorithms, a predefined programmed behavior for the UAV was proposed. Figure 1 shows the representation of the proposed UAV behavior for an inspection process.

Two different AI algorithms have been evaluated to compare the pylon detection performance in the simulated environment: a Neural Network (NN) and a Support Vector Machine (SVM). Both algorithms were chosen for being well documented in the literature and widely applied to classification problems, and therefore they are a good baseline for this evaluation. A deep Feed Forward Network (FFN) was designed in Python (v3.8.5) with the aid of the Keras (v2.4.1) framework. The NN was built with three hidden dense layers, the first two composed of 200 neurons each and the third composed of 5 neurons. Each hidden layer uses the ReLU activation function, and the output layer generates the final result using the sigmoid activation function. The network was trained for 50 epochs with a batch size of 32. Accuracy was chosen as the evaluation metric, and the loss function used was binary cross-entropy. Also, to mitigate overfitting, the cross-validation technique was used, with the training data being split into 10 sets. The network training was executed as a supervised model, where each sample was tagged with the "pylon" or "not-pylon" boolean label.

The Support Vector Machine (SVM) algorithm has been designed using Python (v3.8.5) and the scikit-learn library. Two kernels for parameter estimation have been used: the polynomial and the radial basis function (RBF). Similar to the NN, the SVM training was executed in a supervised manner, using the binary label to classify the samples as "pylon" or "not-pylon".
    436 M. F.Ferraz et al. Fig. 1. State diagram showing the proposed UAV behavior for a inspection process. in the supervised model using the binary parameter to classify the samples as “pylon” or “not-pylon”. 4 Experimental Setup To validate the architecture, several simulated scenarios containing an energy pylon and/or other objects were developed. In each of them, samples of the LiDAR were collected and fed to both an SVM and a NN algorithm. They were evaluated using the following metrics: overall performance, performance on each scenario, and execution time. The overall performance was evaluated using mainly accuracy as a metric. The performance on each scenario was measured to find the effects different scenarios may have on the detection ratio. And at last, a time comparison between the algorithms will be shown, as time is a crucial metric for applications running in embedded systems. The validation experiments have been executed on a virtual environment built within the Gazebo Multi-Robot Simulator, version 11.3.0 running on a Ubuntu 20.04 Operating System with ROS Noetic Ninjemys. The Gazebo sim- ulation software and the AI algorithms run on a PC equipped with an Intel i5 10400 processor, 16 Gb RAM and a GPU NVIDIA GeForce 1660 Super with 6 Gb RAM. Two standard planar LiDAR sensor models have been used to scan the horizontal and the vertical planes. These sensors have a 40.0-m range, sampling rate of 5 samples per-second, and angle steps of 0.5◦ . Also, for the horizontal plane, the LiDAR sensor scans 360.0◦ around the UAV; for the vertical plane, the sensor scans 90.0◦ on the front side of the UAV.
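Before moving on to the data-collection scenarios, the classifier configurations described in Sect. 3 can be written down concretely. The sketch below is a minimal illustration only: the layer sizes, activations, loss, epochs, batch size and the two SVM kernels follow the text, while the input dimensionality (720 range readings, i.e. a 360° scan at 0.5° steps), the Adam optimizer, the placeholder data and all names are our assumptions, not the authors' code.

```python
# Minimal sketch of the two classifiers evaluated in this work (assumptions
# noted above); not the authors' original implementation.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

N_BEAMS = 720  # assumed: one range reading per 0.5 deg over 360 deg

def build_ffn(input_dim: int = N_BEAMS) -> keras.Model:
    """Feed-forward network: hidden layers of 200, 200 and 5 ReLU units,
    one sigmoid output for the pylon / not-pylon decision."""
    model = keras.Sequential([
        layers.Dense(200, activation="relu", input_shape=(input_dim,)),
        layers.Dense(200, activation="relu"),
        layers.Dense(5, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

# Placeholder data standing in for the labelled LiDAR range vectors.
x_train = np.random.rand(200, N_BEAMS).astype("float32")
y_train = np.random.randint(0, 2, size=200)

ffn = build_ffn()
# The paper additionally used 10-fold cross-validation on the training data.
ffn.fit(x_train, y_train, epochs=50, batch_size=32, verbose=0)

# SVM counterparts with the two kernels mentioned in the text, scored here
# with a 10-fold split for comparability.
for kernel in ("poly", "rbf"):
    scores = cross_val_score(SVC(kernel=kernel), x_train, y_train, cv=10, scoring="accuracy")
    print(kernel, round(scores.mean(), 3))
```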
    LiDAR-Based Architecture forAutonomous UAV Pylon Inspection 437 Eight distinct scenarios were built to allow the data collection for the training and evaluation of the horizontal AI algorithms, as shown in Fig. 2. Fig. 2. Images of the experiment scenarios. a) Only a pylon in the field. b) Several buildings and one pylon. c) Several trees and one pylon in the field. d) A complex building structure, a water tank and one pylon. e) Several buildings, one pylon and one tree. f) Several buildings; no pylon. g) Several trees in the field close from each other; no pylon. h) A complex building structure, a water tank; no pylon. A 3D STL pylon model from GRABCAD [7] website was imported to the GAZEBO. To execute the data collection for the horizontal LiDAR, the sensor was randomly displaced around the pylon on a range from 2.0 m to 10.0 m in dif- ferent heights between 2.0 m and 20.0 m. The pylon height was set to 30.0 m. For all the experiments a practical approach was considered in the sensor horizontal stabilization. It was defined that the LiDAR sensor was attached to a stabilized gimbal that keeps the sensor always scanning in a horizontal or vertical plane. A random horizontal alignment noise with 1.0-degree range was added to the sensor position to simulate the stabilization error present on the gimbal mecha- nism. Figure 3 shows an example of the horizontal point map LiDAR detection on a simulation in a complex environment. For each position, the sensor captures a 360-degree planar point data and store it into a vector. All samples for each scenario were stored and pos-processed, indicating the presence or not presence of the pylon in the scenario. The training set was composed of 80% of the collected samples and the test set was composed of 20% of the collected samples. Table 1 presents the results of the SVM algorithm versus the NN algorithm for the horizontal LiDAR data experiment. For the vertical data collection, the LiDAR sensor was configured in the same conditions as the horizontal sensor, however, it scans a 90.0-degree range. A total of 17500 samples have been captured and used as the training and test data. The main difference for this data collection is that the sensor was keep pointed to the pylon in the scenario with a pylon present to assure that the collected samples contain the detection of the pylon segment. These samples have been tagged as “pylon” to feed the training set. In the other scenarios, the sensor was randomly placed in the space inside a 10.0-m radius from the center of the scene, and also
    438 M. F.Ferraz et al. Fig. 3. a) Simulation scene on GAZEBO showing the pylon, objects in the area and the sensor scan. b) The point map of the horizontal LiDAR detection. Table 1. SVM × NN comparing in horizontal LiDAR data set Horizontal SVM Training Set Evaluation Horizontal NN Training Set Evaluation Predicted Result Predicted Result True False True False True 1830 166 True 1562 434 Actual Result False 179 825 Actual Result False 416 588 Total of Samples: 3000 True Readings 2655 Percentual Accuracy 88.50% True Readings 2150 Percentual Accuracy 71.67% randomly directed to collect data from the structures and elements present in the environment. These samples were tagged as “no pylon” to feed the training set. From these datasets, 80% of the samples were used as the training dataset and 20% for the test dataset. Table 2 presents the results of the SVM and NN algorithms for the vertical LiDAR data experiment. Table 2. SVM × NN comparing the vertical LiDAR data set Vertical SVM Training Set Evaluation Vertical NN Training Set Evaluation Predicted Result Predicted Result True False True False True 1991 24 True 1997 38 Actual Result False 47 1438 Actual Result False 60 1425 Total of Samples: 3500 True Readings 3429 Percentual Accuracy 97.97% True Readings 3420 Percentual Accuracy 97.20%
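For reference, the percentual accuracies in Tables 1 and 2 follow directly from the confusion-matrix counts as (TP + TN) / total; the short check below reproduces them (the helper and the argument order are ours).

```python
# Accuracy reproduced from the confusion matrices in Tables 1 and 2.
def accuracy(tp: int, fn: int, fp: int, tn: int) -> float:
    return (tp + tn) / (tp + fn + fp + tn)

print(accuracy(1830, 166, 179, 825))   # horizontal SVM -> 0.8850
print(accuracy(1562, 434, 416, 588))   # horizontal NN  -> 0.7167
print(accuracy(1991, 24, 47, 1438))    # vertical SVM   -> 0.9797
print(accuracy(1997, 38, 60, 1425))    # vertical NN    -> 0.9722 (these counts sum to 3520, slightly above the 3500 listed)
```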
LiDAR-Based Architecture for Autonomous UAV Pylon Inspection 439 The performance of the NN and SVM algorithms for each kind of scenario is shown in Fig. 4. Fig. 4. Accuracy performance of the SVM algorithm versus the NN algorithm for the different experimentation scenarios. The SVM had a better overall performance in every scenario. 4.1 SVM × NN Time Performance The time performance of the AI algorithms is an important parameter for the practical application of the architecture. To evaluate this performance, the same number of predictions was executed with each algorithm on both the horizontal and vertical LiDAR datasets to calculate the average time of a single prediction. As the real UAV will run previously trained models, the training times do not influence the UAV performance, and thus only the prediction times were measured. The results are presented in Fig. 5. 5 Discussion Given the constructive characteristics of the energy pylon, which is composed of thin metallic segments, the collected planar LiDAR data offer few points for the AI algorithms to process. This is an important aspect to be considered in the training phase. The LiDAR sensors can capture points not only from the face of the pylon but also from the back of the structure, and thus the point map differs markedly from those of other objects or buildings commonly found in the surroundings of an energy transmission line. Although LiDAR sensors are mainly used for distance measurements, the point signature generated by scanning a pylon is clearly distinguishable from that of other objects. The results obtained in our simulated experiments indicate that the proposed approach can provide suitable information to train an AI-based classification algorithm for real-world applications. As described earlier, the proposed architecture provides a way for the UAV to detect a pylon present in the flight area and to rotate until it faces the pylon.
    440 M. F.Ferraz et al. Fig. 5. Time performance of the SVM algorithm versus NN algorithm for the horizontal and vertical LiDAR data. The results show that the prediction times using the SVM algorithm were much faster than their NN counterpart. This allows the UAV to keep its front-side pointed to the pylon face during all the inspection process. Moreover, assuring that the pylon exists in the environment and the UAV is pointed to it, a more reliable distance measurement system based on the LiDAR data readings could be implemented. The proposal of new distance measurement algorithms is planned for the next steps of this research. Comparing the two AI algorithms fed with the horizontal data has shown an accuracy of 88.50% for the SVM against 71.67% for the NN. For this exper- iment’s scenarios, the results indicate that the SVM algorithm is better than the NN considering only the accuracy metric. Also the SVM algorithm aver- age processing time for each prediction is about significantly smaller than the NN processing time. It is important to remember that, to be employed in real- world applications, this architecture intends to be deployed on UAV’s onboard embedded computing system, which may present limited hardware resources and performance. Therefore, the choice of which algorithm should compose the detection architecture must be considered carefully. The vertical LiDAR data have shown a similar performance between the SVM and NN algorithms, with an accuracy of 97.97% and 97.20%, respectively. The vertical scan of the pylon allows detecting more segments of the structure, offering a detailed signal signature as input to the AI algorithms. Different kind of scenarios presents different performances for the algorithms, as expected. It is possible to observe that in the specific case where a significant number of trees are present in the area, the accuracy of the algorithms has a significant reduction. Besides that, in a real-world application, the algorithm will operate in a distance not greater than 10.0-meter from the pylon, where the presence of trees is not common. Also is possible to observe in the Fig. 4 that the SVM algorithm provides better performance for all the evaluations. Comparing the processing time performance, the SVM processing is five times longer than the NN processing. This impacts the detection algorithm execution frequency, which, in turn, imposes a speed constraint to the aircraft displacement speed.
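A minimal way to measure the average single-prediction time described in Sect. 4.1 is sketched below; the helper, the repeat count and the variable names are our own, and predict_fn stands for either the trained SVM's predict method or a wrapper around the Keras model.

```python
# Sketch of an average single-prediction timing measurement (assumptions above).
import time
import numpy as np

def mean_prediction_time(predict_fn, samples: np.ndarray, repeats: int = 1000) -> float:
    """Average wall-clock time (seconds) of one single-sample prediction."""
    start = time.perf_counter()
    for i in range(repeats):
        predict_fn(samples[i % len(samples)][None, :])  # predict one sample at a time
    return (time.perf_counter() - start) / repeats
```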
    LiDAR-Based Architecture forAutonomous UAV Pylon Inspection 441 6 Conclusions and Future Works This work goal was to evaluate the use of AI-based algorithms to compose an architecture capable of detecting a power pylon structure in the flight area of a UAV used for detailed inspection applications. Such an architecture must offer a reliable response about the direction of the pylon, by detecting its structure using a planar LiDAR sensor aligned to the vertical plane of the aircraft. Although a planar LiDAR scan data may provide less information to feed an AI-based detection algorithm (in comparison with, e.g., an image-based dataset), the results obtained in the simulated environment present good potential for a real-world architecture implementation. The obtained accuracy ranged from 70% to 99%, depending on the arrangement and applied AI algorithm. It is possible to state intuitively that the processing time of LiDAR samples is considerably shorter than an image. Therefore, the proposed pylon detecting architecture based on LiDAR sensors data can be deployed to the UAV’s onboard embedded computing system. The algorithm comparison has shown that the SVM-based architecture pro- vides better results than the NN-based architecture for this kind of application. These results indicate that the construction of real-world application architec- ture will probably present better results by using this kind of algorithm. Also, the SVM processing time was significantly smaller than the NN one, which probably will be reproduced in a real-world situation. This work pioneers in proposing an approach based on LiDAR data and IA-based classification algorithms to detect energy power pylons to the best of our knowledge. Two popular AI algorithms have been evaluated for this task. Although computer vision algorithms are probably the most common approach to detecting a pylon in the captured images, those algorithms are used to cal- culate the distance between the pylon and the UAV or provide visual odometry. However, the amount of data provided by a single image is huge compared to a planar LiDAR data sample, demanding not only a higher amount of processing and memory resources but also an additional financial cost to build the embed- ded computing system. Thus, it is possible to say that using an architecture such as the one proposed in this work has a good potential to be implemented in real-world applications. It demands less processing time while providing reliable information about the presence of a pylon in the flight area and its direction related to the UAV pose. This work is the first step towards implementing a position algorithm based on the measurements of the pylon structure captured by the LiDAR sensors. The results presented in this work represent a first evaluation of the IA-based algorithms to detect a electric pylon based on LiDAR point scans. Future work will evaluate the real-world performance of the proposed architecture using data collected from a scale-size pylon and LiDAR sensors carried onboard of a small- size quad-rotor aircraft. A distance evaluation algorithm based on the LiDAR readings is also foreseen as future work. Such a system intends to provide relative positioning data to the UAV flight controller.
    442 M. F.Ferraz et al. Acknowledgements. This work has been supported by FCT - Fundação para a Ciência e Tecnologia within the Project Scope: UIDB/05757/2020. This work has also been supported by Fundação Araucária (grant 34/2019), and by CAPES and UTFPR through stundent scholarships. References 1. Al-Kaff, A., Martı́n, D., Garcı́a, F., de la Escalera, A., Marı́a Armingol, J.: Sur- vey of computer vision algorithms and applications for unmanned aerial vehicles. Expert Syst. Appl. 92, 447–463 (2018) 2. Araar, O., Aouf, N.: Visual servoing of a Quadrotor UAV for autonomous power lines inspection. In: 2014 22nd Mediterranean Conference on Control and Automa- tion, MED 2014 (June), pp. 1418–1424 (2014). https://doi.org/10.1109/MED. 2014.6961575 3. Araar, O., Aouf, N., Dietz, J.L.V.: Power pylon detection and monocular depth estimation from inspection UAVs. Ind. Robot. 42(3), 200–213 (2015). https://doi. org/10.1108/IR-11-2014-0419 4. Azevedo, F.: LiDAR-based real-time detection and modeling of power lines for unmanned aerial vehicles. Sensors (Switzerland) 19(8), 1–28 (2019). https://doi. org/10.3390/s19081812 5. Bian, J., Hui, X., Zhao, X., Tan, M.: A point-line-based SLAM framework for UAV close proximity transmission tower inspection. In: 2018 IEEE International Conference on Robotics and Biomimetics, ROBIO 2018, pp. 1016–1021 (2019). https://doi.org/10.1109/ROBIO.2018.8664716 6. Cerón, A., Mondragón, I., Prieto, F.: Onboard visual-based navigation system for power line following with UAV. Int. J. Adv. Rob. Syst. 15(2), 1–12 (2018). https:// doi.org/10.1177/1729881418763452 7. GRABCAD: GrabCAD (2021). https://grabcad.com/ 8. Hui, X., Bian, J., Yu, Y., Zhao, X., Tan, M.: A novel autonomous navigation approach for UAV power line inspection. In: 2017 IEEE International Conference on Robotics and Biomimetics, ROBIO 2017, 1–6 January 2018 (2018). https://doi. org/10.1109/ROBIO.2017.8324488 9. Li, X., Guo, Y.: Application of LiDAR technology in power line inspection. IOP Conf. Ser.: Mater. Sci. Eng. 382(5), 1–5 (2018). https://doi.org/10.1088/1757- 899X/382/5/052025 10. Máthé, K., Buşoniu, L.: Vision and control for UAVs: a survey of general methods and of inexpensive platforms for infrastructure inspection. Sensors (Switzerland) 15(7), 14887–14916 (2015) 11. Menéndez, O., Pérez, M., Cheein, F.A.: Visual-based positioning of aerial main- tenance platforms on overhead transmission lines. Appl. Sci. (Switz.) 9(1) (2019). https://doi.org/10.3390/app9010165 12. Nguyen, V.N., Jenssen, R., Roverso, D.: Automatic autonomous vision-based power line inspection: a review of current status and the potential role of deep learning. Int. J. Electr. Power Energy Syst. 99(January), 107–120 (2018) 13. Shuai, C., Wang, H., Zhang, G., Kou, Z., Zhang, W.: Power lines extraction and distance measurement from binocular aerial images for power lines inspection using UAV. In: Proceedings - 9th International Conference on Intelligent Human- Machine Systems and Cybernetics, IHMSC 2017, vol. 2, 69–74 (2017). https://doi. org/10.1109/IHMSC.2017.131
    LiDAR-Based Architecture forAutonomous UAV Pylon Inspection 443 14. Tian, F., Wang, Y., Zhu, L.: Power line recognition and tracking method for UAVs inspection. In: 2015 IEEE International Conference on Information and Automa- tion, ICIA 2015 - In conjunction with 2015 IEEE International Conference on Automation and Logistics (August), pp. 2136–2141 (2015). https://doi.org/10. 1109/ICInfA.2015.7279641 15. Viña, C., Morin, P.: Micro air vehicle local pose estimation with a two-dimensional laser scanner: a case study for electric tower inspection. Int. J. Micro Air Veh. 10(2), 127–156 (2018). https://doi.org/10.1177/1756829317745316 16. Wu, J., Fei, W., Li, Q.: An integrated measure and location method based on airborne 2D laser scanning sensor for UAV’s power line inspection. In: Proceedings - 2013 5th Conference on Measuring Technology and Mechatronics Automation, ICMTMA 2013, pp. 213–217 (2013). https://doi.org/10.1109/ICMTMA.2013.58 17. Zhang, W., et al.: The application research of UAV-based LiDAR system for power line inspection. In: Proceedings of the 2nd International Conference on Computer Engineering, Information Science Application Technology (ICCIA 2017), vol. 74, pp. 962–966 (2017). https://doi.org/10.2991/iccia-17.2017.174 18. Zhao, X., Tan, M., Hui, X., Bian, J.: Deep-learning-based autonomous navigation approach for UAV transmission line inspection. In: Proceedings - 2018 10th Inter- national Conference on Advanced Computational Intelligence, ICACI 2018, pp. 455–460 (2018). https://doi.org/10.1109/ICACI.2018.8377502 19. Zimmermann, F., Eling, C., Klingbeil, L., Kuhlmann, H.: Precise positioning of UAVs - dealing with challenging RTK-GPS measurement conditions during automated UAV flights. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 4(2W3), 95–102 (2017). https://doi.org/10.5194/isprs-annals-IV-2-W3-95-2017
Data Visualization and Virtual Reality
    Machine Vision toEmpower an Intelligent Personal Assistant for Assembly Tasks Matheus Talacio1,2 , Gustavo Funchal1(B) , Victória Melo1 , Luis Piardi1,2 , Marcos Vallim2 , and Paulo Leitao1 1 Research Center in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253 Bragança, Portugal {gustavofunchal,victoria,piardi,pleitao}@ipb.pt 2 Universidade Tecnológica Federal do Paraná (UTFPR), Avenida 7 de Setembro 3165, Curitiba 80230-901, Paraná, Brazil matheustalacio@alunos.utfpr.edu.br, mvallim@utfpr.edu.br Abstract. In the context of the fourth industrial revolution, the inte- gration of human operators in emergent cyber-physical systems assumes a crucial relevance. In this context, humans and machines can not be considered in an isolated manner but instead regarded as a collabora- tive and symbiotic team. Methodologies based on the use of intelligent assistants that guide human operators during the execution of their oper- ations, taking advantage of user friendly interfaces, artificial intelligence (AI) and virtual reality (VR) technologies, become an interesting app- roach to industrial systems. This is particularly helpful in the execution of customised and/or complex assembly and maintenance operations. This paper presents the development of an intelligent personal assis- tant that empowers operators to perform faster and more cost-effectively their assembly operations. The developed approach considers ICT tech- nologies, and particularly machine vision and image processing, to guide operators during the execution of their tasks, and particularly to verify the correctness of performed operations, contributing to increase produc- tivity and efficiency, mainly in the assembly of complex products. 1 Introduction The 4th industrial revolution is pushing the adoption of emergent technologies, e.g., Internet of Things (IoT), Artificial Intelligence (AI), Big data, collaborative robots and Virtual Reality (VR), aiming to transform the way factories operate to increase their responsiveness and reconfigurability. In particular, Industry 4.0 enables the increasing level of automation and digitization in the factories of the future [5], with Cyber-Physical Systems (CPS) acting as a backbone to develop such emergent production systems and contributing to develop smart processes, machines and products. CPS aims to connect the various physical components, e.g., sensors and actuators, with cyber systems composed by controllers and communication networks to achieve a common goal [9]. c Springer Nature Switzerland AG 2021 A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 447–462, 2021. https://doi.org/10.1007/978-3-030-91885-9_33
    448 M. Talacioet al. In this emergent CPS environment, humans have a very important role since they are the most flexible elements in automated systems, being required their symbiotic integration. In particular, instead of performing repetitive and monotonous tasks that can be fully automated, humans will be requested to perform add-value tasks, e.g., assembly of complex and/or customized products or performing maintenance interventions. In this context, intelligent assistants and virtual interfaces can be used to assist humans to realize their manual operations in a faster and more cost- effective manner, taking advantage of the huge amount of data available at the shop floor, as well as the emergent ICT technologies, e.g., AI and VR [3]. These intelligent assistants can support the online monitoring of the equipment con- dition during the execution of the assembly and maintenance operations, and combine this information with diagnostic reports and their previous experience in analogous situations, to determine the best action plans to be carried out. In such systems, besides the intelligence behind the guidance system, it is important to consider automatic systems that dynamically verifies the correctness of per- formed operations, warning the need to correct the operation and only allowing to proceed in case the operation is complete and successfully performed. Such automatic verification usually involves the use of machine vision tech- niques, which in industry context presents several constraints, namely related to environmental illumination and shadowing, time response and the irregular geometries of the pieces to be checked. These systems can be empowered with the use of AI techniques and should be integrated with the other functionalities of the intelligent assistants. Having this in mind, this paper describes the development of a machine vision solution, as part of an intelligent personal assistant (IPA), that supports human operators to perform faster and more cost-effectively their assembly operations. In particular, the IPA application uses ICT technologies and image processing techniques to guide operators to assembly correctly customized and complex products, checking the correctness of the performed operations. The activities carried out by the operator are supervised through an intelligent system that will verify the assertiveness of the assembly to ensure that the operation cycle will only end when the system detects that the assembly is correct, even for the scenarios considering 3D assemblies, in which the final product is assembled step by step, and each step is understood as a complete activity. Since the algorithm is executed in real-time, a check to verify if an operator is modifying the assembly is included. The developed solution allows to increase the productivity of oper- ators through the direct integration with emerging technologies, and strongly contributes for the human integration with highly automated applications. The rest of the paper is organized as follows. Section 2 contextualizes the related work on developing intelligent assistants, as well as the use of machine vision to verify the correctness of assembly operations. Section 3 presents the IPA system architecture for the case study and Sect. 4 describes the development of the machine vision system to verify the correctness of assembly operations for the two scenarios considered in the case study. Section 5 discusses the achieved
    Machine Vision toEmpower an Intelligent Personal Assistant 449 experimental results, and particularly the user experience. Finally, Sect. 6 rounds up the paper with the conclusions and points out the future work. 2 Related Work Industry 4.0 relies on the use of CPS complemented with emergent digital tech- nologies to achieve more intelligent, adaptive and reconfigurable production sys- tems. Also important is the integration of the human operators into the produc- tion processes for enhanced product value, achieving manufacturing flexibility in complex human-machine systems, where humans and machines can not be con- sidered in an isolated manner but instead regarded as a collaborative team [8]. This requires the presence of operators on the design of human-centred adaptive manufacturing systems, allowing a symbiotic integration. However, the execution of some operations, e.g., assembly and maintenance, is normally complex and requires high-level expertise from the operators to exe- cute them efficiently in short time, with the required quality and with the mini- mal impact on the normal production cycles. This requires to enable the human- machine collaboration, through the use of innovative technologies, e.g., machine vision, machine learning and VR, that will support the operators during the realization of their tasks. In this context, IPA is a guidance software system that can support the execution of operations, facilitating the interaction between the operators and machines or computers, based on data mining from images, video, and voice commands captured from sensors distributed in the environment [6,14]. The use of IPA, as an intelligent assistant to inform the operator with real-time and his- torical data and guide through the best action plan to perform the operation [7], can improve the quality of executed operations, particularly in case of complex and/or customised ones. As an example, maintenance technicians can use an IPA system to obtain useful real-time information about the machine condition, recommendations and actions to be taken, as well as useful information, e.g., documents, websites or videos [4]. These systems can provide real-time informa- tion and instructions, replacing traditional methods that rely on printed manuals and real machines that can be damaged [13]. The more common commercial versions of IPA are, e.g., Amazon Alexa, Microsoft Cortana, Google Assistant or Apple Siri. In industrial environments, IPA are, e.g., being used to training as a means of learning through the first- person experience [1,10], to train operators in new machine maintenance pro- cedures [16] and to support operators in performing complex and customised assembly tasks and maintenance operations [15]. The use of IPA can include mechanisms to automatically verify the correct- ness of the performed operations, namely assembly operations. In this context, artificial vision, also known as machine vision, plays an important role in the supervision of assembly operations since it allows the acquisition and analy- sis of image data, which would not be possible only by inference of context in software. In industrial environments, the verification of 3D product assemblies
    450 M. Talacioet al. is not possible by using simple 2D images, which requires to consider alterna- tive cameras that allows to measure the distance of points in the field of view, providing a 3D image of the scene. Several technologies can be considered to accomplish this objective, e.g., TOF (Time of Flight), used by Microsoft Kinect 2.0 and Microsoft Azure Kinect, and stereo vision, used by Intel Realsense and Stereolabs ZED. The acquired images are analysed by using image processing techniques to recognize shapes and objects, e.g., contour matching, morpholog- ical operations and feature detection. In case of recognition of complex shapes, these techniques can be combined with machine learning algorithms. The referred commercial intelligent assistants solutions usually provide speech recognition functionalities but miss the image recognition, which is crucial in industrial environments. Additionally, the use of such intelligent assistants in industrial environments is still rare, with the majority running as laboratory pro- totypes, mainly due to the industrial constraints and complexity, e.g., lighting, objects geometry and object overlay. Several aspects can influence the use and confidence of the IPA technology, e.g., usability, security and privacy [2]. At the practical level, there are some challenges to be faced regarding the adoption of IPA supporting technologies in industrial environments. These solutions need to be matured to avoid creating entropy within the maintenance process and operators need to be adequately trained to operate the system in a proper manner. The IPA’s human-interaction barrier models need to be improved, and AI systems need to prove reliabil- ity. Another aspect is related to the ergonomic evolution of the head mounted devices, which need to be more comfortable for operators to use during the entire shift or even complete the maintenance intervention [12]. 3 System Architecture for the Intelligent Personal Assistant This section presents the setup of the assembly workbench and the system archi- tecture of the IPA that will be mentoring operators to the execute assembly operations based on LEGO pieces from different shapes and colours. 3.1 Description of the Case Study In this work, an IPA is used to support operators to perform their customised and/or complex assembly operations in a more efficient manner. For this pur- pose, the IPA is combined with a workbench structure, illustrated in Fig. 1, providing an intuitive and guidance information to the user regarding the task to be performed, as well as the capability to verify the correctness of the opera- tion execution. As example, the IPA informs the operator about the action plan to perform the operation, namely which step to be executed, how to execute the step, and which piece or tool should be used in this step. The physical structure comprises several devices that support the human- machine interface. In terms of data collection, the workbench has an Intel
    Machine Vision toEmpower an Intelligent Personal Assistant 451 Working area tape Monitoring assembly instructions Fig. 1. IPA workbench layout to support the human-machine interface. Realsense camera responsible for capturing the image over the working space, which will be processed by a machine vision algorithm to detect whether the instructions given are correctly performed. This system is also able to acquire the depth of the scene, thus enabling the verification of assemblies in a three- dimensional space. A microphone is available to support the human-machine interface by collecting the voice instructions provided by the operator during the execution of the assembly procedure, e.g., the feedback related to the conclusion of an operation step. Since the industrial environment can be noisy, the platform can in alternative provide the feedback by using the touch screen interface. In terms of output devices, the workbench comprises a monitor and a projec- tor that are responsible for providing information to the operator, e.g., instruc- tions to operator and feedback from operator. The image of the projector is displayed over the working space. A LED tape is used to easily indicate to the operator which piece placed in the several dispensers should be used in a certain process step. The IPA should consider the execution of two case study scenarios. In the first case study scenario, the operator must assemble gift boxes according to the orders received by clients, with each gift box product comprising four slots where individual components (with different colour and shape) should be correctly placed. In the second scenario, the system is used to support the assembly of more complex three-dimensional products, but unlike the first, the instructions are passed on incremental steps and the system checks if the current step has been completed to allow to proceed to the next assembly step. 3.2 System Architecture The IPA system architecture, illustrated in Fig. 2, comprises several modules, interconnected via standard protocols, namely using REST services.
    452 M. Talacioet al. LED ribbon context (assembly sequences) workspace speech recognition correctness verification Arduino micro projector Intel RealSense monitor HMI back-end Media files assembly instructions LED updates BGR image + depth data plan management engine operator application dashboard Fig. 2. System architecture for the IPA system. The process plan management module is responsible to manage the execution of the process plan, expressed as a JSON file, related to the assembly operation. This process considers the assembly plan procedure, the feedback from the envi- ronment, i.e. from the camera that allows to check the correctness of the assembly step, and the feedback from the operator, e.g., using the voice commands to ask more information to execute the instruction. The interaction with the operator is performed by the dashboard application, where the step by step guidance instructions to be executed by the operator are displayed through a monitor and a projector, complemented with supporting documents and videos. Following the order to assembly a customised product, the requested configuration is displayed to the operator through the human- interface at the beginning of the cycle. The intelligent assistant is continuously checking the assembly correctness and only finalizes the operation cycle when a perfect match is achieved. The intelligent assistant system also indicates to the operator which slots/pieces are correctly assembled and which are not. An important functionality provided by the IPA is to verify the correctness of the operation performed by the operator through the use of machine vision techniques; in case of an assembly operation performed incorrectly, the intelligent system will indicate the error and the assembly procedure can only proceed after a correct assembly. The correctness verification module is responsible to verify the correctness of the step execution by using machine vision algorithms. For this purpose, the module is continuously receiving the image acquired by the Intel Realsense camera and uses a machine vision algorithm to determine the important char- acteristics of the image, being able to conclude if the operation was performed correctly (see more details in the next section).
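To make the interplay between the correctness verification module and the plan management engine more tangible, the sketch below shows one possible acquisition-and-reporting loop. Only the general flow (continuous RealSense image and depth acquisition, a vision check against the current instruction, and REST reporting of the result, sent only when it changes) follows the architecture described here; the endpoint URL, the payload layout and the check_assembly() stub are our assumptions, not the authors' API.

```python
# Minimal sketch of the correctness-verification loop (see assumptions above).
import numpy as np
import pyrealsense2 as rs
import requests

PLAN_ENGINE_URL = "http://localhost:5000/api/verification"  # assumed endpoint

def check_assembly(color_img, depth_img, instruction):
    """Placeholder for the machine vision check detailed in Sect. 4."""
    return {"step": instruction["step"], "correct": False, "slots": []}

pipeline = rs.pipeline()
pipeline.start()                       # default config: colour and depth streams
instruction = {"step": 1}              # would normally come from the JSON process plan
last_payload = None

try:
    while True:
        frames = pipeline.wait_for_frames()
        color = np.asanyarray(frames.get_color_frame().get_data())
        depth = np.asanyarray(frames.get_depth_frame().get_data())

        payload = check_assembly(color, depth, instruction)
        if payload != last_payload:    # report only when the status changes
            requests.post(PLAN_ENGINE_URL, json=payload, timeout=1.0)
            last_payload = payload
finally:
    pipeline.stop()
```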
    Machine Vision toEmpower an Intelligent Personal Assistant 453 The results of the analysis of the correctness of the operation execution is provided to the plan management engine by using a REST service. In order to optimize the system performance and avoid the overload on the connection with the server, this module only sends the detection status when there is any change (comparing the last REST payload sent with the new generated one). Also, the depth map of the scene is always checked, so it is possible to detect if the operator is still busy performing the assembly and the recognition is paused until it is concluded. Finally, the LED strips are controlled by an C/C++ application running in an Arduino Uno microcontroller that receives the information of which led area should be turned on through an UDP socket. Note that during the execution of assembly tasks, the LED strip will indicate the container where the operator should pick the pieces to execute the current step. The proposed system aims to contribute to develop an intelligent context- awareness system, which helps human operators to perform faster and more cost- effectively their assembly operations. In addition, it is expected that this solution can contribute to increase the operator’s productivity through the integration of emerging technologies, adding value to human operations that must compete with highly automated industries. 4 Image Processing to Verify Assembly Correctness Image processing techniques are used to verify the correctness of the assem- bly operation performed by the operator in real time by comparing the image acquired by the Intel RealSense camera with the execution instruction previ- ously provided by the IPA. For this purpose, an algorithm for image recognition was firstly implemented, which will later be used to verify the correctness of the assembly operation in two different scenarios, namely the gift-box and the complex assembly. The image processing algorithms were codified in Python, ensuring fast run-time, easy development and camera compatibility, and sup- port the use of image manipulation libraries. 4.1 Image Recognition and Classification Algorithm The first step in the correctness verification process is related to the image recognition, which uses the OpenCV library to implement image processing, e.g., filtering, morphology, contour detection and colour space conversion, on real-time applications. The developed image recognition algorithm, illustrated in Fig. 3, identifies the different objects in the image, in terms of shape, colour and position. Briefly, the algorithm changes the colour space of images from Red-Green- Blue (RGB) to Hue-Saturation-Value (HSV), since the last presents better per- formance in machine vision algorithms and it is more adequate for the detection of the colour. In fact, this colour space allows to represent the colors easily for the filter since they are closer to the perception of colours in relation to the
    454 M. Talacioet al. All colours of the object analysed? Yes No Image acquisition (RGB colorspace) Conversion to the HSV colorspace Colour identification filter (Thresholding operations) Add to list with properties of found objects Returns list with properties of all objects found in the image Find contours Fig. 3. Algorithm for the image recognition. human eyes [11]. Considering the image after the colour space conversion, the use of the OpenCV library, and particularly the inRange function, allows apply- ing a threshold filter to isolate the pixels from the image by defining a range in the colour space. As result, a binary image is obtained, with the pixels that are within the defined range for each colour assuming the white colour and the pixels that are not within the range assuming the black colour. This process is performed for all the available colours of objects to be identified and the range is defined for each colour. From the binary image it is possible to find the con- tour of the different objects by using the findContours method, which extracts information about the area, center of mass and vertex position of the object. This image recognition algorithm is performed for both case study scenarios. However, the algorithm to verify the assembly correctness is dependent of the scenario and will be described in the following sections. 4.2 Verification Algorithm for the Gift-Box Assembly In this scenario, the application generates a configuration for the gift-box accord- ing to the received orders, i.e. which pieces (defined in terms of shape and colour) should be placed in each one of the four box slots. The operator receives the information related to this configuration, through the projector and monitor, and performs the assembly operation by placing the pieces in the target slots of the gift-box. The requirements to confirm the correct assembly include the verification if only one piece is placed in each slot and if the piece has the colour as requested in the configuration instructions. The algorithm to verify the cor- rectness for the gift-box assembly is illustrated in Fig. 4. The image recognition algorithm described in the previous section is used to list the objects present at the scene. Then, for the gift-box assembly verification algorithm, the pointPolygonTest function from the OpenCV library, is used to
    Machine Vision toEmpower an Intelligent Personal Assistant 455 All pieces analysed? No Yes Is the gift-box among the recognized objects? No More than one piece or wrong colour in a slot? Is the validation string the same as the previous iteration? Fetch the current instruction from the data base Get list of recognized objects (from the image recognition algorithm) Calculate the distances between the top-left vertice of the piece and all vertices of the box Assign the lower distance box vertex as this piece position Check positions and colours Send error message to the user interface Generate the validation string containing the parts in slot information Send the validation string to the user interface Restart the iteration Yes No Yes Yes No Fig. 4. Verification algorithm for the gift-box assembly. select only the pieces that are inside the box. The algorithm identifies the con- tour of the box, and then, knowing its position, it is possible to measure the Euclidean distance between the other pieces and each corner of the box. Due to the geometry of the box, the piece that is placed in a slot will always be at a smaller distance from its respective corner, e.g., the piece that is in the upper right slot always has a smaller distance from its upper right corner compared to the others. This allows to assign the proper slot positions of the pieces by comparing with the received instruction. If two pieces are detected in the same position or if the colour of the piece in one of the slots does not match with the colour presented in the instruction, an alert error is generated. In this way, it is possible to have freedom during the assembly operation as the positions are all relative to the current gift-box position. After validating the correctness of the performed assembly, the information is sent to the dashboard application that will show the results, and in case of incorrectness, indicates which slots of the gift-box are wrongly assembled. 4.3 Verification Algorithm for the Complex Assembly In this scenario, the dashboard application guides the operator during the assem- bly of a complex product by showing the image of the complete assembly and the visual instructions (i.e. action, image and video) for the current assembly
    456 M. Talacioet al. step. After concluding the verification of correctness, the system indicates if the assembly step was completed or not with success. Since the operation involves a three-dimensional assembly, besides the recog- nition of the colour and shape of the pieces, it is also required to get information about the height, which is possible by using the IR Stereo sensor of the Real Sense camera. To mitigate inaccuracies caused by the camera’s mounting angle and uneven surface, the application initially saves the scene depth as a height map, so during the detection routine, the captured depth is subtracted with the map to obtain the height of the pieces relative to the working table. The verification algorithm for the complex assembly is illustrated in Fig. 5. current layer different from the instruction layer? number of expected objects equals to the list size? match found? All pieces in the instruction analysed? All expected pieces marked as found? Yes No Fetch the current instruction from the data base Change layer (adjusting height for the image recognition algorithm) Get list of recognized objects (from the image recognition algorithm) Loop through all the detected pieces for a match Send Incompleted Step to the user interface Send Succesful Step to the user interface Mark the expected instruction piece as not found Mark the expected instruction piece as found Yes No No Yes No Yes No Yes Fig. 5. Verification algorithm for the complex assembly. To allow the assembly verification in this scenario, the first action is to fetch the active assembly and search for the respective instruction file for the current step. For all the instructions generated, the piece in the top left corner in relation to the rotation of the mounting surface, as illustrated in Fig. 6, is considered as a master piece, responsible for the basis of calculation to establish the position of the other pieces in scene. In scenarios that not require to use a box or a
    Machine Vision toEmpower an Intelligent Personal Assistant 457 basis to accommodate the pieces, the first piece placed in the working area, e.g., corresponding to the first step, is considered as the master piece, and all the others pieces placed later have their position related to the master piece. Fig. 6. Possible assembly rotations. (Color figure online) The colour, orientation and shape of the piece are obtained directly from the image recognition algorithm. The assemblies were separated into layers of heights, to ensure that all pieces in the scene are analyzed, even those that are covered by another piece in a later step. Each layer has a master piece to serve as reference for other pieces in the same plane and a new layer is considered if any piece is mounted on top of another. It is important to remark a limitation in the assembly freedom since that the assembly area can not be moved when a layer change because this would change the stored reference position of the master piece used to calculate the distance and the reference point for the newly added piece. In order to increases the stability of the verification correctness process, the assembly verification algorithm considers the last five performed detections and only confirms the assembly if they are equal. Note that this threshold value was empirically defined, with a small number leading to a reduced certainty and a high number increasing the response time. By debugging the software, it was possible to observe that problems in detec- tion may occur due to the positioning of the assembly in relation to the ambient lighting. To solve this problem, three solutions have been introduced: the reduc- tion of the lighting intensity, the demarcation of an optimal area for assembly and the modification of the algorithm that returns the final detection. 5 Experimental Results The developed IPA, and particularly the automatic system to verify the correct- ness of the assembly operation, was implemented in the workbench case study, and used by operators in their assembly operations’ routines.
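Before turning to the results, two implementation details from Sect. 4.3 lend themselves to a short sketch: the height-map subtraction that yields piece heights above the table, and the rule of confirming a detection only when the last five results agree. The threshold of five comes from the text; all names, the deque-based buffer and the dtype handling are our own assumptions.

```python
# Helpers illustrating Sect. 4.3 (assumptions noted above).
from collections import deque
import numpy as np

def piece_heights(depth_frame: np.ndarray, height_map: np.ndarray) -> np.ndarray:
    """Height of objects above the (possibly uneven) table, in the depth units;
    negative values caused by sensor noise are masked to zero, mirroring the
    excluded (black) areas mentioned in Sect. 5.1."""
    diff = height_map.astype(np.float32) - depth_frame.astype(np.float32)
    return np.clip(diff, 0.0, None)

class DetectionStabiliser:
    """Accept a detection result only after `window` identical consecutive readings."""
    def __init__(self, window: int = 5):
        self.history = deque(maxlen=window)

    def update(self, detection) -> bool:
        self.history.append(detection)
        return (len(self.history) == self.history.maxlen
                and all(d == self.history[0] for d in self.history))
```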
    458 M. Talacioet al. 5.1 Accuracy of Developed Methods Multiple running tests were performed using the intelligent assistant by operators performing assembly operations for the 2 case study scenarios. Regarding the gift-box scenario, Fig. 7 illustrates an operator performing the assembly operation with the support of the intelligent assistant, as well as the output of the recognition algorithm to check the correctness of the performed operation. Fig. 7. Gift-box assembly in the IPA (right) with the recognition output (left). It is clear the benefits of using the intelligent assistant to support the opera- tor during the execution of customised assembly tasks, including the automatic presentation of the instructions to perform the operation and the automatic ver- ification of the correctness of the performed operation, mitigating the possibility of human errors, e.g., due to fatigue, variability and error of judgement, thus increasing the efficiency of the process and the comfort of the operator. The checking algorithm was intensively tested, including incorrect assemblies with the following situations to validate its accuracy: i) incorrect colour order within the box, ii) two pieces in the same box slot, and iii) pieces placed outside the box. No errors occurred during the tests, i.e. where the system output is correct although the real assembly is not. However, when the box was placed outside the detection area, some errors have occurred due to the light reflection. In particular, it was identified the incorrect detection of red pieces when they are very close to the box wall. This occurs since in some situations of low brightness, the HSV values for brown and red colours coincide, with the filter not being able to separate the two objects. This situation can be avoided with the improvement of the scene lighting and adjusting the filter threshold values to better classify the pieces’ colours. The response time to perform the recognition algorithm and to present the verification decision to the operator is approximately 3 s, which means a waiting time of 3 s to check the correctness of the performed operation’s step. This implies the need to establish a cycle time for the assembly of gift-boxes that includes this additional time.
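Returning to the colour-confusion issue noted above, the thresholding-and-contour step of the recognition algorithm (Sect. 4.1) can be sketched as follows. The numeric HSV ranges, the minimum contour area and all names are illustrative assumptions (OpenCV 4 or later is assumed for the findContours signature); in practice these ranges are exactly what would need tuning to separate close hues such as red and brown.

```python
# Sketch of the HSV threshold + contour extraction described in Sect. 4.1.
# HSV bounds, the area threshold and all names are illustrative assumptions.
import cv2
import numpy as np

# Illustrative HSV ranges per piece colour (OpenCV hue range is 0..179);
# red wraps around hue 0, hence two sub-ranges.
COLOUR_RANGES = {
    "red":   [((0, 120, 70), (10, 255, 255)), ((170, 120, 70), (179, 255, 255))],
    "green": [((40, 80, 60), (85, 255, 255))],
    "blue":  [((95, 80, 60), (130, 255, 255))],
}

def recognise(bgr_image: np.ndarray):
    """Return (colour, area, centre, contour) for each coloured piece found."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    objects = []
    for colour, ranges in COLOUR_RANGES.items():
        mask = np.zeros(hsv.shape[:2], dtype=np.uint8)
        for lo, hi in ranges:
            mask |= cv2.inRange(hsv, np.array(lo), np.array(hi))
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        for cnt in contours:
            area = cv2.contourArea(cnt)
            if area < 500:          # drop speckle noise
                continue
            m = cv2.moments(cnt)
            centre = (int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"]))
            objects.append((colour, area, centre, cnt))
    return objects
```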
    Machine Vision toEmpower an Intelligent Personal Assistant 459 Regarding the second scenario, Fig. 8 shows the operator performing the assembly of a product and the output of the recognition algorithm. The black points presented in the algorithm’s output are due to the layer system since as the distances are subtracted from the height map, it is possible to obtain neg- ative values by oscillating the camera data, with the algorithm excluding these areas. Fig. 8. Operator performing a complex product assembly (left) and the output of the recognition algorithm (right). Similarly to the previous scenario, erroneous situations, e.g., switched colour or position, wrong orientation, pieces mounted in incorrect layers, movement of the mounting base in instructions of the same layer and removal of previous pieces, were included to the testing experiment to identify possible errors in the recognition of the assembly steps. It was observed that during the experimental tests, no errors have occurred. One problem found was the insertion of pieces of the same colour as other pieces in lower layers. Although the performance of the algorithm when ignoring the layers not being worked on was satisfactory, the shape of lower pieces placed in the proximity of parts of the superior layer was detected, as observed in Fig. 9. Thus, when the colours of these pieces coincide, they are considered only one piece, causing the incorrect calculation of their edges and consequently unable to be confirmed by the algorithm. It was also proven the theorized problem during the development of this application in which during the change of layers, if the pieces change their posi- tion on the working area, it is not possible to obtain the relation of the new piece and therefore it is not possible to confirm the assembly step. This problem can be solved by using another algorithm that can identify the position of the pieces, independently of the relations between the rest of the assembly. Finally, the classification of the pieces’ heights has been proven to be efficient by using a height map, since, through the debugging of the algorithm, it is possible to find the same height for pieces in the same plane. Since a height map technique is used, it is assumed that the recognition algorithm starts with no object above the table, otherwise it is necessary to clean it and restart the
    460 M. Talacioet al. Fig. 9. Example of wrong inclusion of bottom layer pieces. (Color figure online) detection procedure. Also, if the camera is misaligned perpendicularly to the working area, incorrect heights are obtained, being necessary to perform this verification ahead. 5.2 Comparing with Traditional Methods An experimental use case test was performed to compare the time and efficiency of the IPA system against the traditional manual method, considering the sec- ond use case scenario. The time elapsed and the errors made during the setup, assembly and completion phases were analysed. For the conventional method, the operator follows a printed manual that describes the steps to be executed and fulfills a report containing the confirmation of the correct performance of the instructions. After the assembly through the printed manual, the participants used the workbench to perform a different assembly from the one performed pre- viously, but with the same difficulty and number of steps, and the same metrics were calculated. Note that the order in which the methods are carried out does not influence the result as the set of parts used and the assembly process itself change, which does not allow the operator to become more skilled after executing an assembly process. Table 1 summarises the obtained results with four opera- tors performing the assembly using the two different methods (conventional and IPA), each one performing a setup and 5 assembly steps. Table 1. Average assembly time (seconds) Conventional method IPA method Setup 84.25 15.75 Steps 97.75 62.00 Total 182.00 77.75 It can be observed that the biggest difference between the two methods is related to the setup time, which is composed by the selection of the assembly
    Machine Vision toEmpower an Intelligent Personal Assistant 461 procedure and the pre-filling of the report, where, in the IPA, the information is filled automatically and uploaded to the database accordingly. The time taken to complete all the 5 steps when performed using the intelligent assistant shows a reduction of 36,6% in relation to the conventional methods. Through the use of assembly recognition associated to the IPA method, it is ensured the certainty that the instruction was correctly executed, while in the conventional method this function depends of the operator performance, thus introducing the possi- bility of human errors. These benefits will be as higher as more complex and longer is the assembly procedure. 6 Conclusions and Future Work In Industry 4.0, humans are considered the most flexible piece in an automated system, being crucial their symbiotic integration. The use of intelligent assis- tants contribute for this integration, particularly to support humans during the execution of complex and/or customised operations. This paper presents the development of an IPA to support the integration of humans in cyber-physical systems, particularly acting as mentoring of operators in the execution of their assembly operations. A key issue in this IPA system is the use of machine vision techniques to automatically verify the correctness of the performed operations, allowing to reduce the operators errors and improve the product quality. The application of a workbench equipped with an intelligent personal assis- tant demonstrates, through the use of two different scenarios, how emergent ICT technologies can be applied to help operators to perform assembly tasks faster and efficiently. In fact, the developed solutions presented high levels of accuracy, a reduction of the operators errors and a reduction of the execution time, as well as excellent levels of acceptance from operators. In this sense, IPA directly contributes to increasing productivity, quality and efficiency in the tasks performed, highlighting the importance of adopting Industry 4.0 in traditional manufacturing companies through the interaction of humans with intelligent assistants. Future work includes the use of more robust detection algorithms to handle more complex parts, and AI techniques to provide better action plans along the assembly process. Also, it is planned to perform tests with more operators, including a statistical study with the variability of the assembly execution time. Acknowledgments. This work has been supported by FCT- Fundação para a Ciência e Tecnologia within the Project Scope: UIDB/05757/2020. References 1. Abidi, M., Al-Ahmari, A., Ahmad, A., Ameen, W., Alkhalefah, H.: Assessment of virtual reality-based manufacturing assembly training system. Int. J. Adv. Manuf. Technol. 105, 3743–3759 (2019)
    462 M. Talacioet al. 2. de Barcelos Silva, A., et al.: Intelligent personal assistants: a systematic literature review. Expert Syst. Appl. 147, 113193 (2020) 3. Fantini, P., et al.: Exploring the Integration of the human as a flexibility factor in cps enabled manufacturing environments: methodology and results. In: Proceedings of the 42nd Annual Conference of IEEE Industrial Electronics Society (IECON 2016), pp. 5711–5716 (2016) 4. Frigo, M.A., da Silva, E.C., Barbosa, G.F.: Augmented reality in aerospace man- ufacturing: a review. J. Ind. Intell. Inf. 4(2), 125–130 (2016) 5. Gilchrist, A.: Industry 4.0: The Industrial Internet of Things. Springer, Heidelberg (2016) 6. Hauswald, J., Laurenzano, M.A., Zhang, Y., et al.: Sirius: an open end-to-end voice and vision personal assistant and its implications for future warehouse scale computers. In: 20th International Conference on Architectural Support for Pro- gramming Languages and Operating Systems, pp. 223–238 (2015) 7. Hoedt, S., Claeys, A., Landeghem, H.V., Cottyn, J.: The evaluation of an ele- mentary virtual training system for manual assembly. Int. J. Prod. Res. 55(24), 7496–7508 (2017) 8. Krupitzer, C., et al.: A survey on human machine interaction in industry 4.0. CoRR abs/2002.01025 (2020) 9. Leitão, P., Colombo, A.W., Karnouskos, S.: Industrial automation based on cyber- physical systems technologies: prototype implementations and challenges. Comput. Ind. 81, 11–25 (2016) 10. Mantovani, G.: VR Learning: Potential and Challenges for the Use of 3D. Towards Cyberpsychology: Mind, Cognitions, and Society in the Internet Age, pp. 208–225 (2003) 11. Mohd Ali, N., Md Rashid, N.K.A., Mustafah, Y.M.: Performance comparison between RGB and HSV color segmentations for road signs detection. In: Advances in Manufacturing and Mechanical Engineering. Applied Mechanics and Materials, vol. 393, pp. 550–555. Trans Tech Publications Ltd (2013) 12. Morgado, M., Miguel, L.: Ergonomics in the industry 4.0: virtual and augmented reality. J. Ergon. 08 (2018) 13. Pierdicca, R., Frontoni, E., Pollini, R., Trani, M., Verdini, L.: The use of aug- mented reality glasses for the application in industry 4.0. In: Proceedings of the International Conference on Augmented Reality, Virtual Reality and Computer Graphics, pp. 389–401 (2017) 14. Romero, D., et al.: Towards an operator 4.0 typology: a human-centric perspec- tive on the fourth industrial revolution technologies. In: Proceedings of the Int’l Conference on Computers and Industrial Engineering, pp. 1–11 (2016) 15. Webel, S., Bockholt, U., Engelke, T., Gavish, N., Olbrich, M., Preusche, C.: Aug- mented reality training for assembly and maintenance skills. Robot. Auton. Syst. 61(4), 398–403 (2013) 16. Zhu, Z., et al.: AR-mentor: augmented reality based mentoring system. In: Pro- ceedings of the IEEE International Symposium on Mixed and Augmented Reality (ISMAR 2014), pp. 17–22 (2014)
Smart River Platform - River Quality Monitoring and Environmental Awareness

Kenedy P. Cabanga1,2, Edmilson V. Soares1,2, Lucas C. Viveiros1, Estefânia Gonçalves2, Ivone Fachada2, José Lima1, and Ana I. Pereira1(B)

1 Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, 5300-252 Bragança, Portugal
{a36074,a39189}@alunos.ipb.pt, {viveiros,jllima,apereira}@ipb.pt
2 Centro de Ciência Viva de Bragança, Bragança, Portugal
{egoncalves,ifachada}@braganca.cienciaviva.pt

Abstract. In the technology communication era, the use of the Internet of Things (IoT) has become popular among other digital solutions, since it offers the integration of information from several organizations and sources. This makes it possible to access data from distant locations at any time. In the specific case of water monitoring, the conventional, outdated measurement methods can lead to low efficiency and added complexity. Hence, smart systems arise as a solution for a broad range of cases. Smart River is a smart system platform developed to optimize resources and monitor the water quality parameters of the Fervença river. The central solution is based at Centro Ciência Viva de Bragança (CCVB), one of the 21 science centers in Portugal, which aims to promote preservation and environmental awareness among the population. By using IoT technologies, the system allows real-time data collection with low cost and low energy consumption, complementing existing projects being developed to promote the ecological importance of natural resources. This paper covers the selection of the sensor modules for data collection in the river and the data storage. The river parameters are visualized using a program developed in the Unity engine, which presents data averages and comparisons between weeks, months, and years.

Keywords: Smart systems · Internet of Things (IoT) · Water parameters

1 Introduction

Nowadays, the fourth industrial revolution is becoming increasingly popular among researchers and employees because of its broad contributions in different sectors. Among its different areas (virtual and augmented reality, artificial intelligence, advanced robots, autonomous vehicles, cloud computing, big data and blockchain, among others) [1,2], the Internet of Things (IoT) stands out and is growing rapidly.

© Springer Nature Switzerland AG 2021
A. I. Pereira et al. (Eds.): OL2A 2021, CCIS 1488, pp. 463–474, 2021. https://doi.org/10.1007/978-3-030-91885-9_34
Overall, the main advantage of IoT technology is connectivity, since it allows many devices to communicate and share information within a cloud-based data system. Many points can be connected from any place, at any distance, and at any time through the internet. This enables more efficient and faster information delivery, with low cost and effort, when building intelligent solutions from interconnected software, sensors, vehicles, and electronic devices [2].

Whereas conventional water quality monitoring methods require travelling to the site to measure the water directly or to collect samples for laboratory analysis, the proposed system takes advantage of current IoT technologies to build a smart solution. The Smart River platform is a set of intelligent systems, interconnected through the cloud, that measure the water parameters and quality of the Fervença river in Bragança, Portugal [5].

The Fervença River is a tributary of the Sabor river; its spring is located in the Fontes Barrosas village and it crosses the city of Bragança, being one of the main water resources for farming activities in the small communities of the region. It is also an extensive source of biodiversity, in its waters and along the riverside, inside and outside the city's territory. Rivers and streams are a priceless resource, but pollution from urban and agricultural areas poses a threat to water quality. To understand the value of water quality, and to effectively manage and protect water resources, it is critical to know the current status of water-quality conditions, and how and why those conditions have been changing over time.

The system aims to collect physical and chemical parameters from the Fervença river, store all the data in a cloud-based server, and display them on a touchscreen monitor in one of the exhibits of the Centro de Ciência Viva de Bragança (CCVB) science center. The exhibit aims to offer an interactive, scientific and educational experience regarding the importance of carrying out environmental monitoring combined with the latest communication technologies, and to improve the environmental awareness of CCVB visitors.

Currently, there is a great need for decentralized and automated systems for data collection in environmental monitoring. The data can be used to develop techniques that improve environmental conditions and, consequently, the quality of life of the people who use these natural resources, as well as to promote sustainability. Systems such as Smart River have been increasingly used in Smart Cities, as they allow integration with other technologies and the development of solutions with shared databases. Combined with science communication techniques, the module provides a tool that contributes to the population's awareness of the environmental value of water.

This paper is organized as follows. Section 2 presents the literature review associated with the sensor module configuration, the network communication, the Internet of Things (IoT) and the LoRaWAN protocol. Section 3 addresses the system architecture, where the hardware and the communication scheme are explained. Section 4 details the Smart River system development and the management of the sensors installed in the Bragança region. The water monitoring through the Node-RED and Unity applications is described in Sect. 5. Finally, Sect. 6 presents some conclusions and points for future work.
2 Theoretical Background

River quality monitoring systems seek to ensure high precision and accuracy of the data, timely reports, data averages, easy access to the data, and data integrity. The river is an important source of water for all cities, and one of the main threats to its sustainability is pollution. The existing methods for monitoring river water quality are manual monitoring and continuous monitoring; these methods are expensive and not very efficient. Therefore, an intelligent river monitoring system was proposed [12,13].

A distinctive feature of these IoT applications is the intelligent data management provided by the platform known as The Things Network (TTN). The LoRaWAN protocol is used for sending data, and each device capable of communicating through LoRa technology is known as an End Node. Communication with the Gateway occurs when the End Nodes send the measurements taken by the sensors and other relevant information produced by the developed application [4,6,7].

The Gateway is responsible for creating a connection between the devices and the network server, receiving data from the various End Nodes and interpreting, storing, and passing the data on to the network server. The Gateway is always connected to the Network Server via the TCP/IP protocols, either over a wired or a wireless network. From the point of view of the End Nodes, the Gateway is just a messenger that forwards their packets to the Network Server.

Overall, TTN is an open server platform on which the application servers that interpret and present the data to the user are built. These servers can format the data to be sent to the devices and, when programmed to do so, respond autonomously based on the collected data, without the constant need for human observation [8,9].

2.1 End Nodes

LoRaWAN End Nodes usually have sensors to detect changes in parameters (e.g. temperature, humidity, acceleration, GPS position), a LoRa transponder (responsible for transmitting the signals over the LoRaWAN radio) and, optionally, a microcontroller. Although an End Node is battery-powered and follows a low-energy approach, it requires attention from external programs to check the battery charge; depending on the system interface developed, it can show the remaining battery percentage. Some LoRa embedded sensors can run on batteries that last up to 5 years and transmit signals over distances of up to 10 km to other devices connected through the network [3].

2.2 Gateways

One advantage of LoRaWAN technology is the worldwide connection between gateways provided by the TTN community. These devices support the operation of the sensors by relaying their data. The packet communication process uses the LoRa gateways, which are responsible for using standard internet protocols (IP) to transmit the data from the embedded sensors to the server and to receive data from it [3].
3 Smart System Development

This section presents the Smart River monitoring system, describing in detail the river sensor modules and the communication network.

3.1 Sensor Modules

To measure the water quality of the Fervença river, the sensor module presented in Fig. 1 was developed. It has five independent sensors: (i) ambient temperature; (ii) water temperature; (iii) relative humidity; (iv) electrical conductivity; and (v) pH value.

Fig. 1. Electrical circuit box on the river

The sensors were selected taking into account the acquisition cost (to limit losses from vandalism or external damage), easy access to the data from an external system (for example, a microcontroller), and robustness (as the application requires exposing the sensors to environmental conditions all year long). Based on these criteria, the sensors listed in Table 1 were chosen.
Table 1. Smart system components and specifications

DHT11 (humidity and ambient temperature): voltage of 3 V to 5.5 V; electric current of 0.3 mA (measurement) and 60 µA (standby); temperature range from 0 °C to 50 °C; humidity range from 20% to 90%; temperature and humidity resolution of 16 bits; accuracy of ±1 °C and ±1% [14].

DS18B20 (water temperature): working voltage of 3.2 V to 5.25 V DC; working current of 2 mA (max); resolution of 9 to 12 bits, programmable; measuring range of −55 °C to 110 °C; measurement accuracy of ±0.5 °C from −10 °C to +80 °C and ±2 °C from −55 °C to +110 °C; output cables: yellow (data), red (vcc), black (gnd); sensor output interface XH2.54-3P [15].

Analog pH pro V2 (pH level of the water): supply voltage from 3.3 V to 5.5 V; output voltage of 0 V to 3.0 V; BNC probe connector; PH2.0-3P signal connector; measurement accuracy of ±0.1 pH at 25 °C; dimensions of 42 mm × 32 mm (1.66 in × 1.26 in); industrial-grade probe; detection range of 0 to 14; temperature range of 0 °C to 60 °C; response time of 1 min; probe life of 0.5 years in continuous (7 × 24 h) use, depending on water quality [16].

Electrical conductivity sensor V1 (water conductivity): operating voltage of +5.00 V; PCB size of 45 mm × 32 mm (1.77 in × 1.26 in); measuring range of 1 mS/cm to 20 mS/cm; operating temperature of 5 °C to 40 °C; accuracy of ±10% F.S. (using the Arduino 10-bit ADC); PH2.0 interface (3-pin SMD); conductivity electrode (constant K = 1, BNC connector); electrode cable length of about 60 cm (23.62 in); one standard conductivity solution (1413 µS/cm and 12.88 mS/cm) [17].

Transceiver LoRa RFM95 (communication with TTN): LoRa technology is available as radio-frequency communication on the 433, 868 and 915 MHz frequencies. The LoRa operating ranges are part of the Industrial, Scientific and Medical (ISM) band, whose intervals are reserved internationally for ISM use.

Atmega328p (microcontroller): memory; timer/counter; A/D converter; D/A converter; parallel and serial ports.
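The measurement ranges in Table 1 also make it straightforward to sanity-check incoming readings before they are stored. The following is a minimal sketch, assuming readings arrive as a dictionary keyed by parameter names; the range values come from the table, but the function and field names are illustrative and not part of the deployed system.

```python
# Minimal sketch: flag sensor readings that fall outside the ranges
# specified in Table 1. Field names and data structure are illustrative.

VALID_RANGES = {
    "ambient_temperature_c": (0.0, 50.0),    # DHT11: 0..50 degC
    "relative_humidity_pct": (20.0, 90.0),   # DHT11: 20..90 %
    "water_temperature_c": (-55.0, 110.0),   # DS18B20: -55..110 degC
    "ph": (0.0, 14.0),                       # Analog pH pro V2: 0..14
    "conductivity_ms_cm": (1.0, 20.0),       # EC sensor V1: 1..20 mS/cm
}


def out_of_range(reading: dict) -> dict:
    """Return the subset of readings outside their specified range."""
    bad = {}
    for name, value in reading.items():
        low, high = VALID_RANGES.get(name, (float("-inf"), float("inf")))
        if not (low <= value <= high):
            bad[name] = value
    return bad


if __name__ == "__main__":
    sample = {"ambient_temperature_c": 21.3, "relative_humidity_pct": 55.0,
              "water_temperature_c": 14.8, "ph": 15.2, "conductivity_ms_cm": 0.4}
    print(out_of_range(sample))  # {'ph': 15.2, 'conductivity_ms_cm': 0.4}
```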
3.2 Communication Network

Communication in the Smart River system occurs through the different IoT platforms connected to the network. By using these platforms it is possible to collect data through LoRaWAN technology and the Gateways. TTN aims to enable low-power devices to use long-range Gateways and to connect to a decentralized open-source network. The final application allows data to be exchanged with high precision and an efficient connection. Fig. 2 illustrates the sensor modules registered on the platform [10].

Fig. 2. Devices

TTN uses the LoRaWAN protocol, which defines the system architecture and the communication parameters for LoRa technologies. The devices use low-power LoRa links to connect to the Gateway, which in turn relies on a high-bandwidth connection, such as Wi-Fi, Ethernet, or GSM, to reach the network. All Gateways within reach of a device will receive its messages and forward them to TTN. The network then de-duplicates the messages and selects the best Gateway to forward any queued downlink messages (a minimal sketch of this step is given after Fig. 3). Fig. 3 presents the communication scheme for data collection.

Fig. 3. Communication scheme
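The de-duplication and gateway-selection step can be illustrated with a short sketch. The message structure and field names below are assumptions made for illustration only; they do not reflect the actual TTN implementation.

```python
# Minimal sketch of network-side de-duplication: several gateways may
# forward copies of the same uplink; keep one copy per (device, counter)
# and remember the gateway with the best RSSI for the downlink path.
# Field names are illustrative, not the real TTN message format.

best_copy = {}  # (device_id, frame_counter) -> message dict


def handle_uplink(msg: dict) -> None:
    key = (msg["device_id"], msg["frame_counter"])
    current = best_copy.get(key)
    if current is None or msg["rssi"] > current["rssi"]:
        best_copy[key] = msg  # first copy, or a stronger signal


def downlink_gateway(device_id: str, frame_counter: int) -> str:
    """Gateway chosen to carry the downlink answering this uplink."""
    return best_copy[(device_id, frame_counter)]["gateway_id"]


if __name__ == "__main__":
    handle_uplink({"device_id": "node-ipb", "frame_counter": 7,
                   "gateway_id": "gw-campus", "rssi": -97, "payload": b"\x01"})
    handle_uplink({"device_id": "node-ipb", "frame_counter": 7,
                   "gateway_id": "gw-center", "rssi": -82, "payload": b"\x01"})
    print(downlink_gateway("node-ipb", 7))  # gw-center (stronger RSSI)
```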
After the data is collected by the TTN server, it is placed in the TTN cloud. A script was developed that accesses the data at a predefined time interval and saves it in .txt format in the CCVB database, making the Smart River application more independent from the TTN data storage system (a minimal sketch of such a script is given later in this section).

4 Smart River Platform

The Smart River platform is an interactive module that consists of sensors connected through the network, which collect data on environmental parameters of the main river of Bragança, the Fervença River. Fig. 4 illustrates the positioning of the sensors in the region. The sensors are located at strategic points that allow data collection at four (n = 4) different locations: i) before the city of Bragança (near the Castro de Avelãs village); ii) in the city of Bragança, at the Instituto Politécnico de Bragança (IPB); iii) in the city of Bragança, at Casa da Seda, a building of the Centro de Ciência Viva de Bragança; iv) after the city of Bragança, near the municipal Wastewater Treatment Plant (Estação de Tratamento de Águas Residuais - ETAR).

Fig. 4. Location of sensors

The strategic positioning of the sensors makes it possible to evaluate and correlate the collected data with the urban activity, and thus propose solutions that improve the quality of the river water. The system is based on an IoT application, through the implementation of the LoRaWAN protocol, to carry out the sensor communication with the TTN server, which is an IoT platform and a network based on cloud servers that connect to Gateways spread across the planet. The TTN network uses the LoRaWAN protocol and is an open (open-source) and collaborative (crowdsourced) network, where anyone can install a Gateway at home, in a company, in an educational institution, etc., thus increasing the network coverage area.
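A minimal sketch of the periodic collection script mentioned above is shown below. The endpoint URL, polling interval, and file layout are placeholders chosen for illustration; they are not the actual TTN or CCVB configuration.

```python
# Minimal sketch of a periodic collector: fetch the latest measurements
# from a (hypothetical) HTTP endpoint and append them to a local .txt
# file, making the application independent of the TTN storage.
import time
import urllib.request
from datetime import datetime, timezone

DATA_URL = "https://example.org/smart-river/latest"  # placeholder endpoint
OUTPUT_FILE = "smart_river_data.txt"                 # local .txt archive
INTERVAL_S = 15 * 60                                 # fetch every 15 minutes


def fetch_once() -> None:
    with urllib.request.urlopen(DATA_URL, timeout=30) as response:
        payload = response.read().decode("utf-8").strip()
    stamp = datetime.now(timezone.utc).isoformat()
    with open(OUTPUT_FILE, "a", encoding="utf-8") as archive:
        for line in payload.splitlines():
            archive.write(f"{stamp}\t{line}\n")


if __name__ == "__main__":
    while True:
        try:
            fetch_once()
        except OSError as exc:  # network or file error: report and retry later
            print(f"fetch failed: {exc}")
        time.sleep(INTERVAL_S)
```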
In addition to installing Gateways, anyone can install a Node, connect to a working Gateway, and use the network at no cost [11]. Today TTN has thousands of Gateways around the world and a very active community using the platform, discussing and exchanging information. With the TTN network, IoT applications can be quickly prototyped, moving from the validation phase to production with guaranteed scalability and security [11].

After the data is collected by the sensors and transmitted, it is stored in a cloud database that can be accessed by external applications. The microcontroller is programmed to command the data collection by the sensors and to send the information in byte format to the TTN server. The microcontroller uses a "hibernate" mode between transmissions, waking at a defined interval of 15 min, and, if the battery drops below 3 V, it goes into "hibernate forever". In this way, energy consumption is reduced as much as possible, allowing a longer battery life [11] (this duty cycle is sketched after Fig. 5).

To allow access to the data in an educational, scientific, and interactive way, two user interface applications were developed: i) an application developed in the Unity engine, which is part of the permanent exhibition at Casa da Seda; ii) a Node.js server application, which can be accessed via linked mobile phones. An example is presented in Fig. 5. The application made in Unity was developed to be educational and intuitive, facilitating access to the data averages and to information about the measured parameters, the collection points, and the application of LoRaWAN technology in the development of the module.

Fig. 5. Parameters description
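The end-node duty cycle described above can be sketched as follows. The real firmware runs in C on the Atmega328p; this Python sketch only illustrates the logic (read the sensors, pack the values into a compact byte payload, transmit, and hibernate), and the read/transmit functions and payload layout are placeholders, not the deployed implementation.

```python
# Minimal sketch of the end-node duty cycle described in the text:
# read the sensors, pack the values into a compact byte payload,
# transmit it, and hibernate for 15 minutes; below 3 V, stop forever.
# read_sensors() and transmit() stand in for the hardware drivers
# and the LoRa radio; the 6x int16 payload layout is illustrative.
import struct
import time

HIBERNATE_S = 15 * 60   # wake every 15 minutes
LOW_BATTERY_V = 3.0     # "hibernate forever" threshold


def read_sensors() -> tuple:
    # Placeholder values: ambient temp, water temp, humidity, EC, pH, battery
    return 21.3, 14.8, 55.0, 1.9, 7.4, 3.7


def pack_payload(values: tuple) -> bytes:
    # Scale to integers and pack into 12 bytes (two bytes per value,
    # big-endian), an illustrative layout rather than the deployed one.
    scaled = [int(round(v * 100)) for v in values]
    return struct.pack(">6h", *scaled)


def transmit(payload: bytes) -> None:
    print(f"uplink: {payload.hex()}")  # stands in for the LoRa uplink


if __name__ == "__main__":
    while True:
        readings = read_sensors()
        transmit(pack_payload(readings))
        if readings[-1] < LOW_BATTERY_V:
            break               # "hibernate forever"
        time.sleep(HIBERNATE_S)
```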
5 Water Monitoring

The Smart River system involves several stages of development before its presentation to the CCVB visitors. Before the output information is shown on a monitor, all the sensor module data are sent to the network using the LoRaWAN protocol. These measurements are handled by the Node-RED server and saved in a local database. The data are then accessed by the Smart River application made in the Unity engine. Using the engine, the averages of the river parameters are calculated and the 3D bar graphs are generated, resulting in an interactive touchscreen interface used on a monitor by the visitors.

The use of animated graphs in Unity helps attract the attention of the CCVB visitors. Thus, the platform can translate scientific and technical data, as well as the physical and chemical descriptions of the river parameters, to the public, as can be observed in Fig. 6. By using the system, visitors can learn about the conductivity, the temperature, and other river parameters, along with descriptions of the ideal values of the water parameters. Further, the system allows filtering the data by week, month, and year, as can also be observed in Fig. 6 (a sketch of the underlying weekly averaging is given after the figure). All the graphs, as well as the image descriptions of the region, are animated to capture the visitors' attention and raise interest in the topics addressed.

Fig. 6. Average graph filtered by week - Unity engine
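The weekly averages behind graphs such as the one in Fig. 6 can be computed with a simple grouping step. The sketch below assumes timestamped records like those archived by the collection script; the record fields and the choice of ISO weeks as the grouping key are illustrative assumptions, not the platform's actual implementation.

```python
# Minimal sketch of the averaging used for the weekly view: group
# timestamped readings by ISO week and average each parameter.
# Record fields and the grouping key are illustrative assumptions.
from collections import defaultdict
from datetime import datetime
from statistics import mean

records = [
    {"time": "2021-06-01T10:00:00", "ph": 7.4, "water_temperature_c": 14.8},
    {"time": "2021-06-02T10:00:00", "ph": 7.6, "water_temperature_c": 15.1},
    {"time": "2021-06-09T10:00:00", "ph": 7.1, "water_temperature_c": 16.0},
]


def weekly_averages(rows: list, parameter: str) -> dict:
    """Average `parameter` per (year, ISO week)."""
    groups = defaultdict(list)
    for row in rows:
        stamp = datetime.fromisoformat(row["time"])
        year, week, _ = stamp.isocalendar()
        groups[(year, week)].append(row[parameter])
    return {key: mean(values) for key, values in sorted(groups.items())}


if __name__ == "__main__":
    print(weekly_averages(records, "ph"))
    # {(2021, 22): 7.5, (2021, 23): 7.1}
```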
Besides the 3D application, the measurements can also be accessed through the Node-RED interface from a technical perspective, as shown in Fig. 7. The Node.js server application presents more graphically condensed data, allowing the raw variables to be correlated without an averaging filter, which is useful to observe the specific temporal variation of each parameter individually and to manage the devices deployed around Bragança. The system shows data transmission failures and battery levels, as well as data integrity and data storage.

Fig. 7. Node-RED server interface (Color figure online)

6 Conclusion and Future Work

The Smart River platform can be one of the best solutions for analyzing the water quality of the river, using an IoT methodology for data collection. The proposed system works through the TTN server, which stores the data, and makes use of LoRaWAN technology and protocols for sending and receiving data through the internet.

Two independent systems were proposed: the Unity educational system, intended to display the data averages in an animated way with touchscreen interaction for the CCVB visitors, and the Node.js server, created for the control and management of the devices installed in the river and for future scientific research. All the data are being collected and evaluated in both systems, and have been shown to be coherent and accurate with respect to the sensors' output.

It is expected that the river's water parameters will be observed and analyzed together with the competent authorities in order to create coordinated action plans aimed at improving the water quality, with the expected results of supporting environmental balance and the social well-being of the citizens of Bragança and of tourists, and of ensuring that the Fervença river basin has a good quality water source.

With this work it is expected to better analyze the consequences of human activities on the quality of the waters of the Fervença River, provide reliable data for research in the field, promote environmental awareness among the
residents and the visitors of the Bragança Ciência Viva Science Center, and support actions by the competent authorities to improve the quality of the water resource and, consequently, the well-being of the environment and the population. The aesthetic factor is also relevant among the expected results of the system, since improving the river water quality will also add value to the real estate business in the region, attracting more people to live close to the river and to the city's historic center.

For future work, it is intended to optimize the module platform so that the battery life can be longer, and also to replace the digital pH and electrical conductivity sensors with industry-standard ones that have a longer life in the water. Other data analyses and environmental alerts can also be implemented in the Smart River system.

Acknowledgments. This work has also been supported by Fundação La Caixa within the Project Scope: Natureza Virtual, and FCT - Fundação para a Ciência e Tecnologia within the Project Scope: UIDB/05757/2020.

References

1. Rymarczyk, J.: Technologies, opportunities and challenges of the industrial revolution 4.0: theoretical considerations. Entrepreneurial Bus. Econ. Rev. 8(1), 185–198 (2020). https://doi.org/10.15678/EBER.2020.080110
2. Hasan, M.S., Yu, H.: Innovative developments in HCI and future trends. Int. J. Autom. Comput. 14(1), 10–20 (2017). https://doi.org/10.1007/s11633-016-1039-6
3. Basford, P.J., Johnston, S.J., Apetroaie-Cristea, M., Bulot, F.M.J., Cox, S.J.: LoRaWAN for city scale IoT deployments. In: Global IoT Summit (GIoTS 2019), pp. 1–6 (2019). https://doi.org/10.1109/GIOTS.2019.8766359
4. Jeong, H., Liu, Y.: Computational modeling of touchscreen drag gestures using a cognitive architecture and motion tracking. Int. J. Hum.-Comput. Interact. 35(6), 510–520 (2019). https://doi.org/10.1080/10447318.2018.1466858
5. https://www.usgs.gov/mission-areas/water-resources/science/water-quality-nation-s-streams-and-rivers-current-conditions?qt-science_center_objects=0#qt-science_center_objects. Accessed 21 June 2021
6. Madakam, S., Lake, V., Lake, V., Lake, V.: Internet of Things (IoT): a literature review. J. Comput. Commun. 3(05), 164 (2015)
7. Caldas Filho, F.L., Martins, L., Araújo, I.P., Mendonça, F., Costa, J.: Gerenciamento de Serviços IoT com gateway Semântico. In: Atas das Conferências IADIS Ibero-Americanas WWW/Internet 2017 e Computação Aplicada 2017, pp. 199–206. IADIS Press (2017)
8. Santos, A.V.D.: Integração entre dispositivos LoRa e servidores de aplicação utilizando o protocolo LoRaWAN. Technical report, Universidade Federal de Santa Catarina, pp. 1–67 (2021)
9. Zyrianoff, I., Heideker, A., Silva, D.O., Kleinschmidt, J.H., Kamienski, C.A.: Impacto de LoRaWAN no desempenho de plataformas de IoT baseadas em nuvem e névoa computacional. In: Anais do XVII Workshop em Clouds e Aplicações, pp. 43–56 (2019)
10. Cabanga, P.K.: Monitorização de contagem elétrica inteligente hardware. Technical report, Instituto Politécnico de Bragança, pp. 24–25 (2020)
11. Soares, V.E.: Monitorização de contagem elétrica inteligente software. Technical report, Instituto Politécnico de Bragança, pp. 7–19 (2020)
12. Adu-Manu, K.S., Katsriku, F.A., Abdulai, J.D., Engmann, F.: Smart river monitoring using wireless sensor networks. Wirel. Commun. Mob. Comput. 2020, 1–2 (2020)
13. Elijah, O., et al.: A concept paper on an intelligent river monitoring system for river sustainability. Int. J. Integr. Eng. 10, 1–5 (2018)
14. FILIPEFLOP Homepage. https://www.filipeflop.com/produto/sensor-de-umidade-e-temperatura-dht11/. Accessed 20 Feb 2021
15. ELECTROFUN Homepage. https://www.electrofun.pt/sensores-arduino/sensor-temperatura-ds18b20. Accessed 20 Feb 2021
16. OPENcircuit Homepage. https://opencircuit.shop/Product/Gravity-Analog-pH-Sensor-Meter-Pro-Kit-V2. Accessed 20 Feb 2021
17. BOTNROLL Homepage. https://www.botnroll.com/pt/biometricos/2384-gravity-analog-electrical-conductivity-sensor-meter-for-arduino-.html. Accessed 20 Feb 2021
Analysis of the Middle and Long Latency ERP Components in Schizophrenia

Miguel Rocha e Costa1, Felipe Teixeira1,2, and João Paulo Teixeira1,3(B)

1 Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, 5300-253 Bragança, Portugal
a40308@alunos.ipb.pt, {felipe.laje,joaopt}@ipb.pt
2 Universidade de Trás-os-Montes e Alto Douro, 5000-801 Vila Real, Portugal
3 Applied Management Research Unit (UNIAG), Instituto Politécnico de Bragança, Bragança, Portugal

Abstract. Schizophrenia is a complex and disabling mental disorder estimated to affect 21 million people worldwide. Electroencephalography (EEG) has proven to be an excellent tool to improve and aid the current diagnosis of mental disorders such as schizophrenia. The illness involves various disabilities associated with sensory processing and perception. In this work, the first 10–200 ms of brain activity after the self-generation via button presses (condition 1) and the passive presentation (condition 2) of auditory stimuli were addressed. A time-domain analysis of the event-related potentials (ERPs), specifically the MLAEP, N1, and P2 components, was conducted on 49 schizophrenic patients (SZ) and 32 healthy controls (HC), provided by a public dataset. The amplitudes, latencies, and scalp distributions of the peaks were used to compare the groups. Suppression, measured as the difference between the neural activity in both conditions, was also evaluated. With the exception of the N1 peak during condition (1), patients exhibited significantly reduced amplitudes in all waveforms analyzed in both conditions. The SZ group also demonstrated a peak delay in the MLAEP during condition (2) and a modestly earlier P2 peak during condition (1). Furthermore, patients exhibited less N1 suppression and more P2 suppression. Finally, the spatial distribution of scalp activity during the MLAEP peak in both conditions, the N1 peak in condition (1), and N1 suppression differed considerably between groups. These findings and measure