
Lecture Notes in Networks and Systems 96

Leonard Barolli
Peter Hellinckx
Juggapong Natwichai   Editors

Advances on P2P,
Parallel, Grid,
Cloud and Internet
Computing
Proceedings of the 14th International
Conference on P2P, Parallel, Grid, Cloud
and Internet Computing (3PGCIC-2019)
Lecture Notes in Networks and Systems

Volume 96

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences,
Warsaw, Poland

Advisory Editors
Fernando Gomide, Department of Computer Engineering and Automation—DCA,
School of Electrical and Computer Engineering—FEEC, University of Campinas—
UNICAMP, São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering,
Bogazici University, Istanbul, Turkey
Derong Liu, Department of Electrical and Computer Engineering, University
of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy
of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering,
University of Alberta, Alberta, Canada; Systems Research Institute,
Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, Department of Electrical and Computer Engineering,
KIOS Research Center for Intelligent Systems and Networks, University of Cyprus,
Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong,
Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest
developments in Networks and Systems—quickly, informally and with high quality.
Original research reported in proceedings and post-proceedings represents the core
of LNNS.
Volumes published in LNNS embrace all aspects and subfields of, as well as new
challenges in, Networks and Systems.
The series contains proceedings and edited volumes in systems and networks,
spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor
Networks, Control Systems, Energy Systems, Automotive Systems, Biological
Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems,
Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems,
Robotics, Social Systems, Economic Systems and other. Of particular value to both
the contributors and the readership are the short publication timeframe and the
world-wide distribution and exposure which enable both a wide and rapid
dissemination of research output.
The series covers the theory, applications, and perspectives on the state of the art
and future developments relevant to systems and networks, decision making, control,
complex processes and related areas, as embedded in the fields of interdisciplinary
and applied sciences, engineering, computer science, physics, economics, social, and
life sciences, as well as the paradigms and methodologies behind them.

** Indexing: The books of this series are submitted to ISI Proceedings, SCOPUS, Google Scholar and Springerlink **

More information about this series at http://www.springer.com/series/15179


Leonard Barolli • Peter Hellinckx • Juggapong Natwichai
Editors

Advances on P2P, Parallel, Grid, Cloud and Internet Computing

Proceedings of the 14th International Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC-2019)

Editors

Leonard Barolli
Department of Information and Communication Engineering
Fukuoka Institute of Technology
Fukuoka, Japan

Peter Hellinckx
University of Antwerp
Antwerp, Belgium

Juggapong Natwichai
Chiang Mai University
Chiang Mai, Thailand

ISSN 2367-3370 ISSN 2367-3389 (electronic)
Lecture Notes in Networks and Systems
ISBN 978-3-030-33508-3 ISBN 978-3-030-33509-0 (eBook)
https://doi.org/10.1007/978-3-030-33509-0
© Springer Nature Switzerland AG 2020
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Welcome Message from 3PGCIC-2019
Organizing Committee

Welcome to the 14th International Conference on P2P, Parallel, Grid, Cloud and
Internet Computing (3PGCIC-2019), which will be held in conjunction with the
BWCCA-2019 International Conference from November 7 to November 9, 2019, at
the University of Antwerp, Antwerp, Belgium.
P2P, grid, cloud and Internet computing technologies have been established as
breakthrough paradigms for solving complex problems by enabling large-scale
aggregation and sharing of computational, data and other geographically distributed
resources.
Grid computing originated as a paradigm for high-performance computing, as an
alternative to expensive supercomputers. Grid computing includes semantic and
service-oriented grid, pervasive grid, data grid, enterprise grid, autonomic grid,
knowledge and economy grid.
P2P computing appeared as the new paradigm after client–server and Web-based
computing. These systems are evolving beyond file sharing toward a platform for
large-scale distributed applications. P2P systems have also inspired the
emergence and development of social networking, B2B (business to business), B2C
(business to consumer), B2G (business to government), B2E (business to
employee) and so on.
Cloud computing has been defined as a “computing paradigm where the
boundaries of computing are determined by economic rationale rather than
technical limits.” It is a multi-purpose paradigm that enables efficient management
of data centers, time-sharing and virtualization of resources, with a special emphasis
on the business model. Cloud computing has quickly become a mainstream
computing paradigm, with applications in all domains, providing utility computing
at large scale.
Finally, Internet computing is the basis of all large-scale distributed computing
paradigms; it has rapidly developed into a vast and flourishing field with an
enormous impact on today’s information societies. Internet-based computing thus
serves as a universal platform comprising a large variety of computing forms.


The aim of the 3PGCIC Conference is to provide a research forum for presenting
innovative research results, methods and development techniques from both
theoretical and practical perspectives related to P2P, grid, cloud and Internet
computing.
In this edition, 102 papers were submitted, and based on the reviewers’ reports,
the Program Committee selected 31 papers (about a 30% acceptance rate) for
presentation at the conference and publication in the Springer Lecture Notes in
Networks and Systems proceedings.
Many people have helped and worked hard to produce a successful
3PGCIC-2019 technical program and conference proceedings. First, we would like
to thank all the authors for submitting their papers, the PC members and the
reviewers who carried out the most difficult work by carefully evaluating the
submitted papers.
The general chairs of the conference would like to thank the PC co-chairs for
their great efforts in organizing a successful conference and an interesting
conference program. We appreciate the work of the workshop co-chairs in
supporting the workshop organizers. Our appreciation also goes to all workshop
organizers for their hard work in successfully organizing these workshops.
We thank the Web administrators for their excellent work and support with the
Web submission and management system of the conference. We are grateful to the
honorary co-chairs for their support and encouragement. Our special thanks go to
the keynote speakers for delivering inspiring keynotes at the conference.
Finally, we would like to thank the Local Arrangement Co-chairs at the University
of Antwerp for making excellent local arrangements for the conference.
We hope you will enjoy the conference and have a great time in Antwerp,
Belgium.

Peter Hellinckx
Leonard Barolli
Flora Amato
3PGCIC-2019 General Co-chairs

Jan Broeckhove
Tomoki Yoshihisa
Juggapong Natwichai
3PGCIC-2019 Program Committee Co-chairs
Welcome Message from 3PGCIC-2019
Workshop Co-chairs

Welcome to the workshops of the 14th International Conference on P2P, Parallel,
Grid, Cloud and Internet Computing (3PGCIC-2019), which will be held from
November 7 to November 9, 2019, at the University of Antwerp, Antwerp, Belgium.
The objective of the workshops is to present research results and work in progress
and thus complement the main themes of 3PGCIC-2019 with specific topics of grid,
P2P, cloud and Internet computing.
The workshops cover research on simulation and modeling of emergent
computational systems, multimedia, Web, streaming media delivery, middleware of
large-scale distributed systems, network convergence, pervasive computing and
distributed systems and security.
The 3PGCIC-2019 workshops are as follows:
1. The 12th International Workshop on Simulation and Modelling of Emergent
Computational Systems (SMECS-2019)
2. The 10th International Workshop on Streaming Media Delivery and
Management Systems (SMDMS-2019)
3. The 9th International Workshop on Multimedia, Web and Virtual Reality
Technologies and Applications (MWVRTA-2019)
4. The 9th International Workshop on Adaptive Learning via Interactive,
Collaborative and Emotional approaches (ALICE-2019)
5. The 7th International Workshop on Cloud and Distributed System Applications
(CADSA-2019)
6. The 7th International Workshop on Cloud Computing Projects and Initiatives
(CCPI-2019)
7. The 6th International Workshop on Distributed Embedded Systems
(DEM-2019)
8. The 5th International Workshop on Signal Processing and Machine Learning
(SiPML-2019)
9. The 2nd International Workshop on Business Intelligence and Distributed
Systems (BIDS-2019)


We would like to thank all workshop organizers for their hard work in organizing
these workshops, selecting high-quality papers for presentation, preparing
interesting programs and making the workshop arrangements during the conference
days.
We hope you will enjoy the conference and have a great time in Antwerp,
Belgium.

Siegfried Mercelis
Santi Caballe
3PGCIC-2019 Workshop Chairs
3PGCIC-2019 Organizing Committee

Honorary Co-chairs
Makoto Takizawa Hosei University, Japan
Walter Sevenhans University of Antwerp, Belgium

General Co-chairs
Peter Hellinckx University of Antwerp, Belgium
Leonard Barolli Fukuoka Institute of Technology, Japan
Flora Amato University of Naples Federico II, Italy

Program Committee Co-chairs


Jan Broeckhove University of Antwerp, Belgium
Tomoki Yoshihisa Osaka University, Japan
Juggapong Natwichai Chiang Mai University, Thailand

Workshop Co-chairs
Siegfried Mercelis University of Antwerp, Belgium
Santi Caballe Open University of Catalonia, Spain

Finance Chair
Makoto Ikeda FIT, Japan


Web Administrator Co-chairs


Kevin Bylykbashi Fukuoka Institute of Technology, Japan
Donald Elmazi Fukuoka Institute of Technology, Japan
Miralda Cuka Fukuoka Institute of Technology, Japan

Local Organizing Co-chairs


Stig Bosmans University of Antwerp, Belgium
Simon Vanneste University of Antwerp, Belgium

Steering Committee Chair


Leonard Barolli Fukuoka Institute of Technology, Japan

Track Areas

1. Data Mining, Semantic Web and Information Retrieval


Co-chairs
Bowonsak Srisungsittisunti University of Phayao, Thailand
Francesco Piccialli University of Naples Federico II, Italy
Agnes Haryanto Monash University, Australia

PC Members

De-Nian Yang Academia Sinica, Taiwan


Nicola Cuomo ESET, Slovakia
Marco Cesarano Marvell Semiconductor, Santa Clara, California,
USA
Giuseppe Cicotti Definiens, The Tissue Phenomics Company,
Munich, Germany
Marco Giacalone Vrije Universiteit Brussel, Belgium
SeyedehSajedeh Saleh Vrije Universiteit Brussel, Belgium
Luca Sorrentino Brightstep AB, Stockholm, Sweden
Antonino Vespoli Centre for Intelligent Power at Eaton, Dublin,
Ireland
Wenny Rahayu La Trobe University, Australia
David Taniar Monash University, Australia
Eric Pardede La Trobe University, Australia
Kiki Adhinugraha La Trobe University, Australia

2. Cloud and Service-Oriented Computing


Co-chairs
Mario Dantas Federal University of Juiz de Fora (UFJF), Brazil
Francesco Orciuoli University of Salerno, Italy
Wang Xu An Engineering University of CAPF, China

PC Members

Douglas D. J. de Macedo Federal University of Santa Catarina, Brazil


Edelberto Franco Silva Federal University of Juiz de Fora, Brazil
Massimo Villari University of Messina, Italy
Stefano Chessa University of Pisa, Italy
Miriam Capretz University of Western Ontario, Canada
Jean-Francois Mehaut University of Grenoble Alpes, France
Giuseppe Fenza University of Salerno, Italy
Carmen De Maio University of Salerno, Italy
Angelo Gaeta University of Salerno, Italy
Sergio Miranda University of Salerno, Italy

3. Security and Privacy for Distributed Systems


Co-chairs
Aniello Castiglione University of Naples Parthenope, Italy
Michal Choras University of Bydgoszcz, Poland
Giovanni Mazzeo University of Naples Parthenope, Italy

PC Members
Silvio Barra University of Cagliari, Italy
Carmen Bisogni University of Salerno, Italy
Javier Garcia Blas Charles III University of Madrid, Spain
Han Jinguang University of Surrey, UK
Sokol Kosta University of Aalborg, Denmark
Gloria Ortega López University of Malaga, Spain
Raffaele Montella University of Naples Parthenope, Italy
Fabio Narducci University of Naples Parthenope, Italy
Rafal Kozik UTP Bydgoszcz, Poland
Joerg Keller FUH Hagen, Germany
Rafal Renk UAM Poznan, Poland
Salvatore D’Antonio University of Naples Parthenope, Italy
Lukasz Apiecionek UKW Bydgoszcz, Poland
Joao Campos University of Coimbra, Portugal
Gerhard Habiger Ulm University, Germany

Luigi Sgaglione University of Naples Parthenope, Italy


Valerio Formicola University of Naples Parthenope, Italy

4. P2P, Grid and Scalable Computing


Co-chairs
Nadeem Javaid COMSATS University Islamabad, Pakistan
Evjola Spaho Polytechnic University of Tirana, Albania

PC Members

Joan Arnedo-Moreno Open University of Catalonia, Spain


Santi Caballe Open University of Catalonia, Spain
Keita Matsuo Fukuoka Institute of Technology, Japan
Vladi Kolici Polytechnic University of Tirana, Albania
Yi Liu Fukuoka Institute of Technology, Japan
Yusuke Gotoh Okayama University, Japan
Akihiro Fujimoto Wakayama University, Japan
Kamran Munir University of the West of England, UK
Safdar Hussain Bouk Daegu Gyeongbuk Institute of Science
and Technology (DGIST), Korea
Muhammad Imran King Saud University, Saudi Arabia
Syed Hassan Ahmed Georgia Southern University, USA
Hina Nasir Air University Islamabad, Pakistan
Sakeena Javaid COMSATS University Islamabad, Pakistan
Rasool Bakhsh COMSATS University Islamabad, Pakistan
Asif Khan COMSATS University Islamabad, Pakistan
Adia Khalid COMSATS University Islamabad, Pakistan
Sana Mujeeb COMSATS University Islamabad, Pakistan

5. Bio-inspired Computing and Pattern Recognition


Co-chairs
Francesco Mercaldo Institute of Informatics and Telematics (IIT),
CNR, Italy
Salvatore Vitabile University of Palermo, Italy

PC Members

Andrea Saracino Institute of Informatics and Telematics (IIT),


CNR, Italy
Andrea De Lorenzo University of Trieste, Italy
Fabio Di Troia San Jose State University, USA

Jelena Milosevic TU Wien, Austria


Martina Lindorfer University of California, Santa Barbara, USA

6. Intelligent and Cognitive Systems


Co-chairs
Serena Pelosi University of Salerno, Italy
Alessandro Maisto University of Salerno, Italy
Nico Surantha Bina Nusantara University, Indonesia

PC Members

Lorenza Melillo University of Salerno, Italy


Francesca Esposito University of Salerno, Italy
Pierluigi Vitale University of Salerno, Italy
Chiara Galdi EURECOM, Sophia Antipolis, France
Marica Catone University of Salerno, Italy
Annibale Elia University of Salerno, Italy
Raffaele Guarasci Institute for High Performance Computing
and Networking (ICAR), CNR, Italy
Mario Monteleone University of Salerno, Italy
Azzurra Mancuso University of Salerno, Italy
Daniela Trotta University of Salerno, Italy

7. Web Application, Multimedia and Internet Computing


Co-chairs
Giovanni Cozzolino University of Naples Federico II, Italy
Yasuo Ebara Kyoto University, Japan

PC Members

Flora Amato University of Naples Federico II, Italy


Vincenzo Moscato University of Naples Federico II, Italy
Walter Balzano University of Naples Federico II, Italy
Francesco Moscato University of Campania Luigi Vanvitelli, Italy
Francesco Mercaldo National Research Council of Italy (CNR), Italy
Alessandra Amato University of Naples Federico II, Italy
Francesco Piccialli University of Naples Federico II, Italy
Tetsuro Ogi Keio University, Japan
Hideo Miyachi Tokyo City University, Japan
Kaoru Sugita Fukuoka Institute of Technology, Japan
Akio Doi Iwate Prefectural University, Japan
Tomoyuki Ishida Fukuoka Institute of Technology, Japan

8. Distributed Systems and Social Networks


Co-chairs
Masaki Kohana Chuo University, Japan
Jana Nowakova VSB - Technical University of Ostrava,
Czech Republic

PC Members

Jun Iio Chuo University, Japan


Shusuke Okamoto Seikei University, Japan
Hiroki Sakaji The University of Tokyo, Japan
Shinji Sakamoto Seikei University, Japan
Masaru Kamada Ibaraki University, Japan
Martin Hasal VSB - Technical University of Ostrava,
Czech Republic
Jakub Safarik VSB - Technical University of Ostrava,
Czech Republic
Michal Pluhacek Tomas Bata University in Zlin, Czech Republic

9. IoT Computing Systems


Co-chairs
Francesco Moscato University of Campania Luigi Vanvitelli, Italy
Paskorn Champrasert Chiang Mai University, Thailand
Lei Shu Nanjing Agricultural University, China

PC Members

Chonho Lee Cybermedia Center, Osaka University, Japan


Yuthapong Somchit Chiang Mai University, Thailand
Pruet Boonma Chiang Mai University, Thailand
Somrawee Aramkul Chiang Mai Rajabhat University, Thailand
Roselin Petagon Chiang Mai Rajabhat University, Thailand
Guisong Yang University of Shanghai for Science
and Technology, P.R. China
Baohua Zhang College of Engineering, Nanjing Agricultural
University, China
Ye Liu College of Engineering, Nanjing Agricultural
University, China
Kai Huang College of Engineering, Nanjing Agricultural
University, China
Jun Liu School of Automation, Guangdong Polytechnic
Normal University, China

Feng Wang Hubei University of Arts and Science, China


Alba Amato National Research Council of Italy (CNR), Italy
Salvatore Venticinque University of Campania Luigi Vanvitelli, Italy
Flora Amato University of Naples Federico II, Italy

10. Wireless Networks and Mobile Computing


Co-chairs
Akimitu Kanzaki Shimane University, Japan
Shinji Sakamoto Seikei University, Japan

PC Members

Teruaki Kitasuka Hiroshima University, Japan


Hiroyasu Obata Hiroshima City University, Japan
Tetsuya Shigeyasu Prefectural University of Hiroshima, Japan
Chisa Takano Hiroshima City University, Japan
Shigeru Tomisato Okayama University, Japan
Makoto Ikeda Fukuoka Institute of Technology, Japan
Keita Matsuo Fukuoka Institute of Technology, Japan
Donald Elmazi Fukuoka Institute of Technology, Japan
Admir Barolli Aleksander Moisiu University of Durres, Albania
Evjola Spaho Polytechnic University of Tirana, Albania
Elis Kulla Okayama University of Science, Japan
Tetsuya Oda Okayama University of Science, Japan

3PGCIC-2019 Reviewers

Amato Flora
Barolli Admir
Barolli Leonard
Barra Silvio
Boonma Pruet
Caballé Santi
Capretz Miriam
Capuano Nicola
Champrasert Paskorn
Choras Michal
Cozzolino Giovanni
Cui Baojiang
D’Antonio Salvatore
Di Martino Beniamino
Ebara Yasuo
Enokido Tomoya
Fenza Giuseppe
Ficco Massimo
Fiore Ugo
Fortino Giancarlo
Fun Li Kin
Funabiki Nobuo
Giacalone Marco
Goreti Marreiros
Gotoh Yusuke
Hasal Martin
Hayashibara Naohiro
Hellinckx Peter
Hussain Farookh
Hussain Omar
Jordi Conesa
Jorge Ricardo Rodríguez
Iio Jun
Ikeda Makoto
Ishida Tomoyuki
Kamada Masaru
Kanzaki Akimitsu
Kohana Masaki
Kolici Vladi
Koyama Akio
Kryvinska Natalia
Kulla Elis
Loia Vincenzo
Liu Yi
Ma Kun
Maisto Alessandro
Mizera-Pietraszko Jolanta
Macedo Douglas
Matsuo Keita
Mazzeo Giovanni
Messina Fabrizio
Moore Philip
Moreno Edward
Moscato Francesco
Natwichai Juggapong
Nishino Hiroaki
Nabuo Funabiki
Nowakova Jana
Oda Tetsuya
Ogiela Lidia
Ogiela Marek
Ogiela Ursula
Okada Yoshihiro
Orciuoli Francesco
Pace Pasquale
Palmieri Francesco
Pardede Eric
Rahayu Wenny
Rawat Danda
Ritrovato Pierluigi
Rodriguez Jorge Ricardo
Sakaji Hiroki
Shibata Yoshitaka
Shu Lei
Spaho Evjola
Somchit Yuthapong
Sugita Kaoru
Surantha Nico
Takizawa Makoto
Taniar David
Uchida Noriki
Uchiya Takahiro
Uehara Minoru
Venticinque Salvatore
Villari Massimo
Wang Xu An
Yoshihisa Tomoki
Welcome Message from SMECS-2019
Workshop Organizers

On behalf of the Organizing Committee of the 12th International Workshop on
Simulation and Modelling of Engineering & Computational Systems, we would
like to warmly welcome you to this workshop, which is held in conjunction with
the 14th International Conference on P2P, Parallel, Grid, Cloud and Internet
Computing (3PGCIC-2019) from November 7 to November 9, 2019, at the
University of Antwerp, Belgium.
Modeling and simulation have become the de facto approach for studying the
behavior of complex engineering and enterprise information and communication
systems before deployment in a real setting. The workshop is devoted to the
advances in modeling and simulation techniques in the fields of emergent
computational systems in complex biological and engineering systems, and real-life
applications.
Modeling and simulation are greatly benefiting from the fast development in
information technologies. The use of mathematical techniques in the development
of computational analysis together with the ever-greater computational processing
power is making possible the simulation of very large complex dynamic systems.
This workshop seeks relevant contributions to the modeling and simulation driven
by computational technology.
The papers were reviewed and give new insight into the latest innovations in the
different modeling and simulation techniques for emergent computational systems
in computing, networking, engineering systems and real-life applications. Special
attention is paid to modeling techniques for information security, encryption,
privacy and authentication.
We hope that you will find the workshop an interesting forum for discussion,
research cooperation, contacts and valuable resource of new ideas for your research
and academic activities.

Leonard Barolli
Wang Xu An
SMECS-2019 Workshop Organizers


SMECS-2019 Program Committee


Workshop Organizers
Leonard Barolli Fukuoka Institute of Technology, Japan
Wang Xu An Engineering University of CAPF, China

PC Members

Markus Aleksy ABB, Germany


Makoto Ikeda Fukuoka Institute of Technology, Japan
Kin Fun Li University of Victoria, Canada
Hiroaki Nishino University of Oita, Japan
Keita Matsuo Fukuoka Institute of Technology, Japan
Makoto Takizawa Hosei University, Japan
Tomoya Enokido Rissho University, Japan
Natalia Kryvinska University of Vienna, Austria
Evjola Spaho Polytechnic University of Tirana, Albania
Flora Amato University of Naples Federico II, Italy
Welcome Message from SMDMS-2019
Workshop Organizers

It is my great pleasure to welcome you to the 10th International Workshop on
Streaming Media Delivery and Management Systems (SMDMS-2019). This
workshop is held in conjunction with the 14th International Conference on P2P,
Parallel, Grid, Cloud and Internet Computing (3PGCIC-2019) from November 7 to
November 9, 2019, at the University of Antwerp, Belgium.
The tremendous advances in communication and computing technologies have
created large academic and industry fields for streaming media. Streaming media
have the interesting feature that the data stream continuously; they include many
types of data, such as sensor data, video/audio data, stock data and so on. It is
obvious that with the accelerating trend toward streaming media, information and
communication techniques will play an important role in future networks. In order
to accelerate this trend, further progress in research on streaming media delivery
and management systems is necessary. The aim of this workshop is to bring
together practitioners and researchers from both academia and industry in order to
have a forum for discussion and technical presentations on current research and
future research directions related to this hot research area.
I would like to express my gratitude to the authors of the submitted papers for
their excellent papers. I am very thankful to the Program Committee members who
devoted their time to preparing and supporting the workshop. Without their help,
this workshop would never have been successful. A list of all of them is given in
the program as well as on the workshop Web site. I would also like to thank the
3PGCIC-2019 Organizing Committee members for their tremendous support in
organizing the workshop.
Finally, I wish to thank all SMDMS-2019 attendees for supporting this
workshop. I hope that you have a memorable experience.

Tomoki Yoshihisa
SMDMS-2019 Workshop Chair


SMDMS-2019 Organizing Committee


Workshop Chair
Tomoki Yoshihisa Osaka University, Japan

International Liaison Chair

Lei Shu Nanjing Agricultural University, China

Program Committee Members

Akihiro Fujimoto Wakayama University, Japan


Akimitu Kanzaki Shimane University, Japan
Mithun Mukherjee Guangdong University of Petrochemical
Technology, China
Tomoya Kawakami Nara Institute of Science and Technology, Japan
Toshiro Nunome Nagoya Institute of Technology, Japan
Yusuke Gotoh Okayama University, Japan
Welcome Message from MWVRTA-2019
Workshop Organizers

Welcome to the 9th International Workshop on Multimedia, Web and Virtual
Reality Technologies and Applications (MWVRTA-2019), which will be held in
conjunction with the 14th International Conference on P2P, Parallel, Grid, Cloud
and Internet Computing (3PGCIC-2019) from November 7 to November 9, 2019,
at the University of Antwerp, Belgium.
With the appearance of multimedia, Web and virtual reality technologies, different
types of networks, paradigms and platforms of distributed computation are
emerging as new forms of computation in the new millennium. Among these
paradigms and technologies, Web computing, multimodal communication and
tele-immersion software are the most important. From the scientific perspective,
one of the main targets behind these technologies and paradigms is to enable the
solution of very complex problems, such as the e-science problems that arise in
different branches of science, engineering and industry. The aim of this workshop
is to present innovative research and technologies as well as methods and
techniques related to new concepts, services and application software in emergent
computational systems, multimedia, Web and virtual reality. It provides a forum
for sharing ideas and research work in all areas of multimedia technologies and
applications.
We would like to express our appreciation to the authors of the submitted papers
and to the Program Committee members, who provided timely and significant
reviews.
We hope that all of you will enjoy MWVRTA-2019 and find this a productive
opportunity to exchange ideas and research work with many researchers.

Kaoru Sugita
Leonard Barolli
MWVRTA-2019 Workshop Co-chairs


MWVRTA-2019 Program Committee


Workshop Co-chairs
Kaoru Sugita Fukuoka Institute of Technology, Japan
Leonard Barolli Fukuoka Institute of Technology, Japan

Program Committee Members

Tetsuro Ogi Keio University, Japan


Yasuo Ebara Osaka University, Japan
Nobuyoshi Satou Iwate Prefectural University, Japan
Makio Ishihara Fukuoka Institute of Technology, Japan
Akihiro Miyakawa Information Policy Division of Nanao-City,
Ishikawa, Japan
Akio Koyama Yamagata University, Japan
Keita Matsuo Fukuoka Institute of Technology, Japan
Fatos Xhafa Technical University of Catalonia, Spain
Vladi Kolici Polytechnic University of Tirana, Albania
Joan Arnedo-Moreno Open University of Catalonia, Spain
Hiroaki Nishino Oita University, Japan
Farookh Hussain University of Technology Sydney, Australia
Welcome Message from ALICE-2019
Workshop Organizers

It is our great pleasure to welcome you to the 9th International Workshop on
Adaptive Learning via Interactive, Collaborative and Emotional approaches
(ALICE-2019), which is held in conjunction with the 14th International Conference
on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC-2019) from
November 7 to November 9, 2019, at the University of Antwerp, Belgium.
The ALICE Workshop aims at providing a forum for innovative adaptive e-learning
combining personalization, collaboration and simulation aspects within an
affective/emotional-based approach able to contribute to overcoming the
well-known limitations of current e-learning systems and content. Special emphasis
is given to MOOC environments that are interactive, challenging and context-aware
while meeting learners’ demands for empowerment, e-assessment and an authentic
learning experience.
We accepted high-quality papers for ALICE-2019 that addressed the above
issues and have consistently highlighted e-assessment, artificial intelligence,
affective computing and learning analytics as hot topics in the current e-learning
arena in order to engage and motivate students in the learning process. A selection
of these papers will be considered after the workshop for publication in the
International Journal of Emerging Technologies in Learning (iJET) indexed in
Scopus.
ALICE-2019 is supported by the EU Project “colMOOC: Integrating
Conversational Agents and Learning Analytics in MOOCs”
(588438-EPP-1-2017-1-EL-EPPKA2-KA) (http://colmooc.eu/). People inside and
outside this project contributed to the success of this edition of the ALICE
Workshop, and we really appreciate their support. In addition, we would like to thank
the Organizing Committee of 3PGCIC-2019 for giving us the opportunity to
organize ALICE-2019 as well as the Local Arrangement chairs for the local
arrangement of the workshop. Finally, we would like to thank all the authors of the


workshop for submitting their research works and for their participation. We are
looking forward to meeting them again in the forthcoming editions of the
workshop. Last but not least, we would like to express our great appreciation to
ALICE-2019 reviewers who carefully evaluated the submitted papers.
We hope you enjoy the workshop and have a great time in Antwerp, Belgium!

Santi Caballé
Nicola Capuano
ALICE-2019 Workshop Organizers

ALICE-2019 Program Committee


Workshop Organizers
Santi Caballé Open University of Catalonia, Spain
Nicola Capuano University of Salerno, Italy

PC Members

Antonio Cerrato University of Naples Federico II, Italy


Jordi Conesa Open University of Catalonia, Spain
Thanasis Daradoumis University of the Aegean, Greece
Giuliana Dettori Italian National Research Council, Italy
Sara de Freitas Coventry University, UK
Angelo Gaeta University of Salerno, Italy
David Gañán Open University of Catalonia, Spain
Isabel Guitart Open University of Catalonia, Spain
Christian Gütl Graz University of Technology, Austria
Jorge Miguel Open University of Catalonia, Spain
Néstor Mora Open University of Catalonia, Spain
Anna Pierri University of Salerno, Italy
Krassen Stefanov Sofia University “St. Kliment Ohridski,” Bulgaria
Welcome Message from CADSA-2019
Workshop Organizer

Welcome to the 7th International Workshop on Cloud and Distributed System
Applications (CADSA-2019), which is held in conjunction with the 14th
International Conference on P2P, Parallel, Grid, Cloud and Internet Computing
(3PGCIC-2019) from November 7 to November 9, 2019, at the University of
Antwerp, Belgium.
This International Workshop on Cloud and Distributed System Applications
brings together scientists, engineers and students for sharing experiences, ideas and
research results about domain-specific applications relying on cloud computing or
distributed systems.
This workshop provides an international forum for researchers and participants
to share and exchange their experiences, discuss challenges and present original
ideas in all aspects related to the cloud and distributed system application design
and development.
We have encouraged innovative contributions about cloud and distributed
computing as follows:
– Distributed computing applications
– Cloud computing applications
– Collaborative platforms
– Topologies for distributed computing
– Semantic technologies for cloud
– Modeling and simulation of cloud computing
– Modeling and simulation of distributed system
– Distributed knowledge management
– Distributed computing for smart cities
– Distributed computing for e-health
– Quality evaluation of distributed services
Many people contributed to the success of CADSA-2019. I would like to thank
the Organizing Committee of the 3PGCIC-2019 International Conference for giving
us the opportunity to organize the workshop. I would also like to thank the Program
Committee members and the authors of the workshop for submitting their research
works and for their participation. Finally, I would like to thank the Local
Arrangement chairs of the 3PGCIC-2019 International Conference.
I hope you will enjoy the CADSA-2019 Workshop and the 3PGCIC-2019
International Conference, find them a productive opportunity for sharing
experiences, ideas and research results with many researchers, and have a great
time in Antwerp, Belgium.

Flora Amato
CADSA-2019 Workshop Chair

CADSA-2019 Program Committee


Workshop Chair
Flora Amato University of Naples Federico II, Italy

Program Committee Members

Antonino Mazzeo University of Naples Federico II, Italy


Nicola Mazzocca University of Naples Federico II, Italy
Carlo Sansone University of Naples Federico II, Italy
Beniamino di Martino Second University of Naples, Italy
Antonio Picariello University of Naples Federico II, Italy
Valeria Vittorini University of Naples Federico II, Italy
Anna Rita Fasolino University of Naples Federico II, Italy
Umberto Villano Università degli Studi del Sannio, Italy
Kami Makki Lamar University, Beaumont (Texas), USA
Valentina Casola University of Naples Federico II, Italy
Stefano Marrone Second University of Naples, Italy
Alessandro Cilardo University of Naples Federico II, Italy
Vincenzo Moscato University of Naples Federico II, Italy
Porfirio Tramontana University of Naples Federico II, Italy
Francesco Moscato Second University of Naples, Italy
Salvatore Venticinque Second University of Naples, Italy
Emanuela Marasco West Virginia University, USA
Massimiliano Albanese George Mason University, USA
Domenico Amalfitano University of Naples Federico II, Italy
Massimo Esposito Institute for High Performance Computing
and Networking (ICAR), Italy
Alessandra de Benedictis University of Naples Federico II, Italy
Roberto Nardone University of Naples Federico II, Italy
Mario Barbareschi University of Naples Federico II, Italy
Ermanno Battista University of Naples Federico II, Italy

Mario Sicuranza Institute for High Performance Computing


and Networking (ICAR), Italy
Natalia Kryvinska University of Vienna, Austria
Moujabbir Mohammed Université Hassan II Mohammedia-Casablanca,
Morocco
Welcome Message from CCPI-2019
Workshop Co-chairs

Welcome to the 7th International Workshop on Cloud Computing Projects and
Initiatives (CCPI-2019), which is held in conjunction with the 14th International
Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC-2019)
from November 7 to November 9, 2019, at the University of Antwerp, Belgium.
The workshop was initiated as a collaboration and dissemination workshop of the
European Commission-funded FP7-ICT project mOSAIC—Open-Source API and
Platform for Multiple Clouds (http://www.mosaic-cloud.eu).
Cloud computing is a recent computing paradigm for enabling convenient,
on-demand network access to a shared pool of configurable computing resources
(e.g., networks, servers, storage, applications and services) that can be rapidly
provisioned and released with minimal management effort or service provider
interaction. Clouds are currently used mainly in commercial settings and focus on
on-demand provision of IT infrastructure. Cloud computing can play a significant
role in a variety of areas including innovations, virtual worlds, e-business, social
networks or search engines.
CCPI-2019 has gathered together scientists, engineers and computer users both
from industry and academia to exchange and share experiences, new ideas and
research results from collaborative international and national projects and initiatives
on cloud computing. Each submitted paper was peer-reviewed by reviewers who
are experts in the cloud area. Based on the review results, we accepted high-quality
papers.
We are grateful to many people who contributed to the success of this event.
First of all, we would like to thank all the authors for contributing their papers and
talks. We also appreciate the support from Program Committee members and
reviewers who carried out the difficult work of carefully evaluating the submitted
papers.


We would like to give our special thanks to Prof. Leonard Barolli, Steering
Committee Chair of the 3PGCIC International Conference, for his encouragement
to organize the workshop with 3PGCIC-2019. We would like to thank the
3PGCIC-2019 general co-chairs, Program Committee co-chairs and workshop
co-chairs for their support in organizing the workshop.

Beniamino Di Martino
Antonio Esposito
CCPI-2019 Workshop Co-chairs

CCPI-2019 Organizing Committee


Workshop Co-chairs
Beniamino Di Martino University of Campania Luigi Vanvitelli, Italy
Antonio Esposito University of Campania Luigi Vanvitelli, Italy

Program Committee Members

Rocco Aversa University of Campania Luigi Vanvitelli, Italy


Raj Buyya University of Melbourne, Australia
Siegfred Benkner University of Vienna, Austria
Giuseppina Cretella University of Campania Luigi Vanvitelli, Italy
Salvatore D’Angelo University of Campania Luigi Vanvitelli, Italy
Marios Dikaiakos University of Cyprus, Cyprus
Jack Dongarra University of Tennessee, USA
Thomas Fahringer University of Innsbruck, Austria
Massimo Ficco University of Campania Luigi Vanvitelli, Italy
Ian Foster Argonne National Laboratory, USA
Dieter Kranzlmueller Ludwig Maximilian University of Munich,
Germany
Craig Lee Aerospace Corporation, USA
Ignacio Llorente Universidad Complutense de Madrid, Spain
Vincenzo Loia University of Salerno, Italy
Salvatore Augusto Maisto University of Campania Luigi Vanvitelli, Italy
Antonino Mazzeo University of Naples Federico II, Italy
Francesco Moscato University of Campania Luigi Vanvitelli, Italy
Stefania Nacchia University of Campania Luigi Vanvitelli, Italy
Thierry Priol INRIA, France
Dana Petcu West University of Timisoara, Romania
Omer Rana Cardiff University, UK
Domenico Talia University of Calabria, Italy

Luca Tasquier University of Campania Luigi Vanvitelli, Italy


Maria Tsakali European Commission, Belgium
Salvatore Venticinque University of Campania Luigi Vanvitelli, Italy
Laurence Yang St. Francis Xavier University, Canada
Hans Zima JPL, NASA, USA/University of Vienna, Austria
Welcome Message from DEM-2019
Workshop Organizers

Welcome to the 6th International Workshop on Distributed Embedded Systems
(DEM-2019), which is held in conjunction with the 14th International Conference
on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC-2019) from
November 7 to November 9, 2019, at the University of Antwerp, Belgium.
The tremendous advances in communication technologies and embedded
systems have created an entirely new research field in both academia and industry
for distributed embedded software development. This field introduces constrained
systems into distributed software development. The implementation of limitations
such as real-time requirements, power limitations and memory constraints within a
distributed environment requires the introduction of new software development
processes, software development techniques and software architectures. It is
obvious that these new methodologies will play a key role in future networked
embedded systems. In order to facilitate these processes, further progress in
research and engineering on distributed embedded systems is mandatory.
The International Workshop on Distributed Embedded Systems (DEM) aims to
bring together practitioners and researchers from both academia and industry in
order to have a forum for discussion and technical presentations on the current
research and future research directions related to this hot scientific area. Topics
include (but are not limited to) virtualization on embedded systems, model-based
embedded software development, real time in the cloud, Internet of things,
distributed safety concepts, embedded software (mechatronics, automotive, health
care, energy, telecom, etc.), sensor fusion, embedded multicore software,
distributed localization, distributed embedded software development and testing.
This workshop provides an international forum for researchers and participants to
share and exchange their experiences, discuss challenges and present original ideas
in all aspects of distributed and/or embedded systems.
I would like to thank the Organizing Committee of the 3PGCIC-2019
International Conference for giving us the opportunity to organize the workshop.
My sincere thanks go to the Program Committee members and to all the authors of
the workshop for submitting their research works and for their participation.
I hope you will enjoy the DEM Workshop and have a great time in Antwerp,
Belgium.

Peter Hellinckx
DEM-2019 Workshop Chair

DEM-2019 Program Committee


Workshop Chair
Peter Hellinckx University of Antwerp, Belgium

Program Committee Members

Paul Demeulenaere University of Antwerp, Belgium


Marijn Temmerman University of Antwerp, Belgium
Joachim Denil McGill University, Canada
Maarten Weyn University of Antwerp, Belgium
Welcome Message from SiPML-2019
Workshop Organizers

Welcome to the 5th International Workshop on Signal Processing and Machine
Learning (SiPML-2019), which is held in conjunction with the 14th International
Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC-2019)
from November 7 to November 9, 2019, at the University of Antwerp, Belgium.
The workshop brings together engineers, students, practitioners and researchers
from the fields of machine learning (ML) and signal processing (SP). The aim of the
workshop is to contribute to the cross-fertilization between research on ML
methods and their application to SP and to initiate collaboration between these
areas. ML usually plays an important role in the transition from data storage to
decision systems based on large databases of signals, such as those obtained from
sensor networks, Internet services or communication systems. These systems imply
developing both computational solutions and novel models. Signals from
real-world systems are usually complex, such as speech, music, biomedical and
multimedia signals, among others. Thus, SP techniques are very useful in these
types of systems to automate processing and analysis and to retrieve information
from data storage. Topics of the workshop range over foundations for real-world
systems and processing, such as speech and language analysis, biomedicine,
convergence and complexity analysis, machine learning, social networks, sparse
representations, visual analytics and robust statistical methods.
We would like to thank the 3PGCIC-2019 International Conference for giving us
the opportunity to organize the workshop. The time and efforts of the workshop PC
members, as well as of the authors in submitting their research works and
participating, are highly appreciated.
We hope you enjoy the workshop at the 3PGCIC-2019 Conference and have a
pleasant stay in Antwerp, Belgium.

Ricardo Rodriguez Jorge


Jolanta Mizera-Pietraszko
SiPML-2019 Workshop Co-chairs


SiPML-2019 Program Committee


Workshop Co-chairs
Ricardo Rodriguez Jorge Autonomous University of Ciudad Juarez,
Mexico
Jolanta Mizera-Pietraszko Opole University, Poland

Program Committee Members

Edgar Alonso Martínez García Autonomous University of Ciudad Juarez, Mexico
Ezendu Ariwa University of Bedfordshire, UK
Jiri Bila Czech Technical University in Prague,
Czech Republic
Jorge Enrique Rodas Osollo Autonomous University of Ciudad Juarez,
Mexico
Ke Liao University of Kansas, USA
Mohamed Elgendi University of British Columbia, Canada
Nghien N. B. Hanoi University of Industry, Vietnam
Pit Pichappan Al-Imam University, Saudi Arabia
Rafael Torres Córdoba Autonomous University of Ciudad Juarez,
Mexico
Yao-Liang Chung National Taipei University, Taiwan
Welcome Message from BIDS-2019
International Workshop Organizers

Welcome to the 2nd International Workshop on Business Intelligence and
Distributed Systems (BIDS-2019), which is held in conjunction with the 14th
International Conference on P2P, Parallel, Grid, Cloud and Internet Computing
(3PGCIC-2019) from November 7 to November 9, 2019, at the University of
Antwerp, Belgium.
As many large-scale enterprise information systems start to utilize P2P networks,
parallel, grid, cloud and Internet computing, they have become a major source of
business information. Techniques and methodologies to extract quality information
in distributed systems are of paramount importance for many applications and users
in the business community. Data mining and knowledge discovery play key roles in
many of today’s prominent business intelligence applications to uncover relevant
information of competitors, consumers, markets and products, so that appropriate
marketing and product development strategies can be devised. In addition, formal
methods and architectural infrastructures for related issues in distributed systems,
such as e-commerce and computer security, are being explored and investigated by
many researchers.
The BIDS Workshop aims to bring together scientists, engineers and practitioners
to discuss, exchange ideas and present their research findings on general
business intelligence techniques and methodologies, as well as their application in
distributed systems.
We would like to express our sincere gratitude to the members of the Program
Committee for their efforts, and to the University of Antwerp and the 3PGCIC-2019
Organizing Committee for co-hosting BIDS-2019. Most importantly, we thank all
the authors for their submissions and contributions to the workshop.

Kin Fun Li
Shengrui Wang
BIDS-2019 International Workshop Co-chairs


BIDS-2019 Program Committee


Workshop Co-chairs
Kin Fun Li University of Victoria, Canada
Shengrui Wang University of Sherbrooke, Canada

Program Committee Members

Watheq El-Kharashi Ain Shams University, Egypt


Wei Lu Keene University, USA
Rafael Parra-Hernandez Enquisite Software, Canada
Darshika G. Perera University of Colorado at Colorado Springs
(UCCS), USA
Kosuke Takano Kanagawa Institute of Technology, Japan
Martine Wedlake IBM, USA
3PGCIC-2019 Keynote Talks
Wireless Experimentation with SDR:
The Way to Drive Innovation

Ingrid Moerman

Ghent University, Ghent, Belgium

Abstract. There exist many ways of researching and developing innovative
solutions: from theoretical analysis, simulations and small-scale setups to large-scale
experimentation. The first part of this talk will discuss the benefits and pitfalls of
the different approaches and illustrate them with some concrete examples.
While experimentation seems to be the most challenging approach, the second part
of this talk will present how the software-defined radio (SDR) facility offered in the
H2020 ORCA Project is capable of accelerating wireless innovation. The advantage
of SDR over “off-the-shelf” technology is that it enables full and open
implementation of all network functionality, including the lower physical and
medium access control (MAC) layers. The ultimate goal of the ORCA Project is to enable
wireless experimenters to unlock the potential of reconfigurable radio technology
by setting up advanced experiments involving end-to-end applications that require
control of novel wireless technologies or cooperation between multiple networked
SDR platforms within extreme and/or diverging communication needs in terms of
latency, reliability or throughput, well before new radio technologies become
available on the market in commercial off-the-shelf products.
In the third and last part of the talk, the ORCA vision toward orchestrating
next-generation services through end-to-end network slicing will be presented.
Network slicing (also known as network virtualization) allows network resources to
be used in a flexible, dynamic and customized manner, and most crucially provides
isolation between different virtual networks. ORCA believes that each network
segment should have its own orchestrator, tailored to the segment’s particularities.
The use of a separate orchestrator for each network segment reduces complexity
and breaks down the larger E2E network orchestration problem into smaller parts.
In this way, each segment orchestrator can focus on a limited number of
well-defined tasks, reducing the software complexity, in terms of both design and
implementation. The ORCA vision is expected to foster innovation for everyone
(not only big industrial players, but also smaller companies and the research
community), to reduce development life cycle, to simplify standardization and to
stimulate multi-disciplinary experimentation.

2020: The AI Decade

Deevid De Meyer

Cronos Group, Leuven, Belgium

Abstract. By now it should be clear to everyone that AI has had a significant
impact over the past decade. Thanks to the rise of deep learning, applications are
being released almost every week that were previously deemed impossible.
Chatbots, deepfakes, self-driving cars, intelligent cameras, digital authors: these
technologies have been made feasible in the past 10 years thanks to machine
learning, and breakthroughs are still happening on almost a weekly basis. Where
the 2010s were the decade in which AI broke through, many people think that the
2020s will be the decade in which it reaches maturity and widespread adoption. In
this presentation, we will look at today’s frontier of artificial intelligence and
predict how the field of AI will evolve in the coming decade.

Contents

The 14th International Conference on P2P, Parallel, Grid, Cloud
and Internet Computing (3PGCIC-2019)
A Fuzzy-Based Peer Coordination Quality System in Mobile P2P
Networks: Effect of Time for Finishing Required Task
(TFRT) Parameter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Vladi Kolici, Yi Liu, and Leonard Barolli
A Fuzzy-Based System for Driving Risk Measurement (FSDRM)
in VANETs: A Comparison Study of Simulation
and Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Kevin Bylykbashi, Ermioni Qafzezi, Makoto Ikeda, Keita Matsuo,
and Leonard Barolli
A Comparison Study of Constriction and Linearly Decreasing
Vmax Replacement Methods for Wireless Mesh Networks
by WMN-PSOHC-DGA Simulation System . . . . . . . . . . . . . . . . . . . . . . 26
Admir Barolli, Shinji Sakamoto, Heidi Durresi, Seiji Ohara,
Leonard Barolli, and Makoto Takizawa
Effect of Degree of Centrality Parameter on Actor Selection
in WSANs: A Fuzzy-Based Simulation System and Its
Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Donald Elmazi, Miralda Cuka, Makoto Ikeda, Keita Matsuo,
and Leonard Barolli
Wind Power Forecasting Based on Efficient Deep Convolution
Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Sana Mujeeb, Nadeem Javaid, Hira Gul, Nazia Daood, Shaista Shabbir,
and Arooj Arif


One Step Forward: Towards a Blockchain Based Trust Model
for WSNs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Abdul Mateen, Jawad Tanveer, Ashrafullah, Nasir Ali Khan,
Mubariz Rehman, and Nadeem Javaid
Smart Contracts for Research Lab Sharing Scholars Data Rights
Management over the Ethereum Blockchain Network . . . . . . . . . . . . . . 70
Abdul Ghaffar, Muhammad Azeem, Zain Abubaker,
Muhammad Usman Gurmani, Tanzeela Sultana, Faisal Shehzad,
and Nadeem Javaid
Secure Service Provisioning Scheme for Lightweight Clients
with Incentive Mechanism Based on Blockchain . . . . . . . . . . . . . . . . . . 82
Ishtiaq Ali, Raja Jalees ul Hussen Khan, Zainib Noshad, Atia Javaid,
Maheen Zahid, and Nadeem Javaid
Node Recovery in Wireless Sensor Networks via Blockchain . . . . . . . . . 94
Raja Jalees ul Hussen Khan, Zainib Noshad, Atia Javaid, Maheen Zahid,
Ishtiaq Ali, and Nadeem Javaid
Towards Plug and Use Functionality for Autonomous Buildings . . . . . . 106
Markus Aleksy, Reuben Borrison, Christian Groß,
and Johannes Schmitt
An Evaluation of Pacemaker Cluster Resource Manager Reliability . . . 117
Davide Appierto and Vincenzo Giuliano
A Secure and Distributed Architecture for Vehicular Cloud . . . . . . . . . 127
Hassan Mistareehi, Tariqul Islam, Kiho Lim, and D. Manivannan
Introducing Connotation Similarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Marina Danchovsky Ibrishimova and Kin Fun Li
A Cotton and Flax Fiber Classification Model Based on Transfer
Learning and Spatial Fusion of Deep Features . . . . . . . . . . . . . . . . . . . . 152
Shangli Zhou, Song Cai, Chunyan Zeng, and Zhifeng Wang
An Automatic Text Summary Method Based on LDA Model . . . . . . . . 163
Caiquan Xiong, Li Shen, and Zhuang Wang
Investigating Performance and Cost
in Function-as-a-Service Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
Diogo Bortolini and Rafael R. Obelheiro
Optimal Bandwidth and Delay of Video Streaming Traffic
in Cloud Data Center Server Racks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
Nader F. Mir, Vincy Singh, Akshay Paranjpe, Abhilash Naredla,
Jahnavi Tejomurtula, Abhash Malviya, and Ashmita Chakraborty

Log-Based Intrusion Detection for Cloud Web Applications Using
Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Jaron Fontaine, Chris Kappler, Adnan Shahid, and Eli De Poorter
Automatic Text Classification Through Point of Cultural Interest
Digital Identifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
Maria Carmela Catone, Mariacristina Falco, Alessandro Maisto,
Serena Pelosi, and Alfonso Siano
An Industrial Multi-Agent System (MAS) Platform . . . . . . . . . . . . . . . . 221
Ariona Shashaj, Federico Mastrorilli, Michele Stingo,
and Massimiliano Polito
Museums’ Tales: Visualizing Instagram Users’ Experience . . . . . . . . . . 234
Pierluigi Vitale, Azzurra Mancuso, and Mariacristina Falco
Research Topics: A Multidisciplinary Analysis of Online Communities
to Detect Policy-Making Indicators . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
Iolanda Sara Iannotta and Pierluigi Vitale
Proposal of Transesophageal Echo Examination Support System
by CT Imaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
H. Takahashi, T. Katoh, A. Doi, M. Hozawa, and Y. Morino
A Study of 3D Shape Similarity Search in Point Representation
by Using Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
Hideo Miyachi and Koshiro Murakami
A Fog-Cloud Approach to Enhance Communication in a Distance
Learning Cloud Based System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
Lucas Larcher, Victor Ströele, and Mario Dantas
Research Characterization on I/O Improvements of Storage
Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
Laércio Pioli, Victor Ströele de Andrade Menezes,
and Mario Antonio Ribeiro Dantas
Home Fine Dust Monitoring Systems Using XBee . . . . . . . . . . . . . . . . . 299
Sung Woo Cho
Opinion Mining in Consumers Food Choice and Quality Perception . . . 310
Alessandra Amato, Giovanni Cozzolino, and Marco Giacalone
A Model for Human Activity Recognition in Ambient
Assisted Living . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
Wagner D. do Amaral, Mario A. R. Dantas, and Fernanda Campos
Omniconn: An Architecture for Heterogeneous Devices
Interoperability on Industrial Internet of Things . . . . . . . . . . . . . . . . . . 329
Bruno Machado Agostinho, Cauê Baasch de Souza,
Fernanda Oliveira Gomes, Alex Sandro Roschildt Pinto,
and Mario Antônio Ribeiro Dantas
A Framework for Allocation of IoT Devices to the Fog Service
Providers in Strategic Setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
Anjan Bandyopadhyay, Fatos Xhafa, Saurav Mallik, Paul Krause,
Sajal Mukhopadhyay, Vikash Kumar Singh, and Ujjwal Maulik

The 12th International Workshop on Simulation and Modelling
of Engineering and Computational Systems (SMECS-2019)
Blockchain Based Decentralized Authentication and Licensing
Process of Medicine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
Muhammad Azeem, Zain Abubaker, Muhammad Usman Gurmani,
Tanzeela Sultana, Abdul Ghaffar, Abdul Basit Majeed Khan,
and Nadeem Javaid
Detection of Malicious Code Variants Based on a Flexible
and Lightweight Net . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
Wang Bo, Wang Xu An, Su Yang, and Nie Jun Ke
Preprocessing of Correlation Power Analysis Based on Improved
Wavelet Packet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
Peng Ma, Ze-yu Wang, WeiDong Zhong, and Xu An Wang
A Method of Annotating Disease Names in TCM Patents
Based on Co-training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
Na Deng, Xu Chen, and Caiquan Xiong
Semantic Annotation in Maritime Legal Case Texts
Based on Co-training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
Jun Luo, Ziqi Hu, Qi Liu, Sizhuo Chen, Peiyong Wang, and Na Deng
Data Analytical Platform Deployment: A Case Study from
Automotive Industry in Thailand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
Chidchamaiporn Kanmai, Chartchai Doungsa-ard, Worachet Kanjanakuha,
and Juggapong Natwichai

The 10th International Workshop on Streaming Media Delivery
and Management Systems (SMDMS-2019)
The Structured Way of Dealing with Heterogeneous Live
Streaming Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
Andrea Tomassilli, Nicolas Huin, and Frédéric Giroire
A Rule Design for Trust-Oriented Internet Live Video
Distribution Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
Satoru Matsumoto, Tomoki Yoshihisa, Tomoya Kawakami,
and Yuuichi Teranishi
High-Performance Computing Environment with Cooperation
Between Supercomputer and Cloud . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433
Toshihiro Kotani and Yusuke Gotoh
Evaluation of a Distributed Sensor Data Stream Collection
Method Considering Phase Differences . . . . . . . . . . . . . . . . . . . . . . . . . . 444
Tomoya Kawakami, Tomoki Yoshihisa, and Yuuichi Teranishi
A Mathematical Analysis of 2-Tiered Hybrid Broadcasting
Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454
Satoru Matsumoto, Kenji Ohira, and Tomoki Yoshihisa

The 9th International Workshop on Multimedia, Web and Virtual
Reality Technologies (MWVRTA-2019)
Influence of Japanese Traditional Crafts on Kansei in Different
Interior Styles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
Ryo Nakai, Yangzhicheng Lu, Tomoyuki Ishida, Akihiro Miyakwa,
Kaoru Sugita, and Yoshitaka Shibata
Semantic Similarity Calculation of TCM Patents in Intelligent
Retrieval Based on Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
Na Deng, Xu Chen, and Caiquan Xiong
The Design and Development of Assistant Application for Maritime
Law Knowledge Built on Android . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482
Jun Luo, Ziqi Hu, Qi Liu, Sizhuo Chen, Peiyong Wang, and Na Deng
A Matrix Factorization Recommendation Method Based
on Multi-grained Cascade Forest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494
Shangli Zhou, Songnan Lv, Chunyan Zeng, and Zhifeng Wang

The 9th International Workshop on Adaptive Learning via
Interactive, Cognitive and Emotional approaches (ALICE-2019)
Multi-attribute Categorization of MOOC Forum Posts
and Applications to Conversational Agents . . . . . . . . . . . . . . . . . . . . . . 505
Nicola Capuano and Santi Caballé
A Tool for Creating Educational Resources Through
Content Aggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515
Antonio Sarasa-Cabezuelo and Santi Caballe
A Methodology Approach to Evaluate Cloud-Based Infrastructures
in Support for e-Assessment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525
Josep Prieto and David Gañán
Towards an Educational Model for Lifelong Learning . . . . . . . . . . . . . . 537
Jordi Conesa, Josep-Maria Batalla-Busquets, David Bañeres,
Carme Carrion, Israel Conejero-Arto, María del Carmen Cruz Gil,
Montserrat Garcia-Alsina, Beni Gómez-Zúñiga,
María J. Martinez-Argüelles, Xavier Mas, Tona Monjo, and Enric Mor

The 7th International Workshop on Cloud and Distributed System
Applications (CADSA-2019)
Optimization Algorithms and Tools Applied
in Agreements Negotiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 549
Alessandra Amato, Flora Amato, Giovanni Cozzolino, Marco Giacalone,
and Francesco Romeo
A Configurable Implementation of the SHA-256 Hash Function . . . . . . 558
Raffaele Martino and Alessandro Cilardo
A Blockchain Based Incentive Mechanism for Crowd
Sensing Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 568
Zainib Noshad, Atia Javaid, Maheen Zahid, Ishtiaq Ali,
Raja Jalees ul Hussen Khan, and Nadeem Javaid
Design of a Cloud-Oriented Web Application for Legal Conflict
Resolution Through Equitative Algorithms . . . . . . . . . . . . . . . . . . . . . . 579
Alessandra Amato, Flora Amato, Giovanni Cozzolino, Marco Giacalone,
and Francesco Romeo
Equitative Algorithms for Legal Conflict Resolution . . . . . . . . . . . . . . . 589
Alessandra Amato, Flora Amato, Giovanni Cozzolino,
and Marco Giacalone

The 7th International Workshop on Cloud Computing Projects
and Initiatives (CCPI-2019)
An Approach to Help in Cloud Model Choice for Academia
Services’ Supplying . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601
Pasquale Cantiello, Beniamino Di Martino, and Michele Mastroianni
Italian Cloud Tourism as Tool to Develop Local Tourist Districts
Economic Vitality and Reformulate Public Policies . . . . . . . . . . . . . . . . 609
Alfonso Marino and Paolo Pariso
Auto-scaling in the Cloud: Current Status and Perspectives . . . . . . . . . 616
Marta Catillo, Massimiliano Rak, and Umberto Villano
Dynamic Patterns for Cloud Application Life-Cycle Management . . . . . 626
Geir Horn, Leire Orue-Echevarria Arrieta, Beniamino Di Martino,
Paweł Skrzypek, and Dimosthenis Kyriazis
From Monolith to Cloud Architecture Using Semi-automated
Microservices Modernization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 638
Salvatore Augusto Maisto, Beniamino Di Martino,
and Stefania Nacchia
Reinforcement Learning for Resource Allocation
in Cloud Datacenter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 648
Salvatore Venticinque, Stefania Nacchia, and Salvatore Augusto Maisto

The 6th International Workshop on Distributed Embedded
Systems (DEM-2019)
DUST Initializr - CAD Drawing Platform for Designing Modules
and Applications in the DUST Framework . . . . . . . . . . . . . . . . . . . . . . 661
Thomas Huybrechts, Simon Vanneste, Reinout Eyckerman, Jens de Hoog,
Siegfried Mercelis, and Peter Hellinckx
Distributed Task Placement in the Fog: A Positioning Paper . . . . . . . . . 671
Reinout Eyckerman, Siegfried Mercelis, Johann Marquez-Barja,
and Peter Hellinckx
Using Neural Architecture Search to Optimize Neural Networks
for Embedded Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 684
Thomas Cassimon, Simon Vanneste, Stig Bosmans, Siegfried Mercelis,
and Peter Hellinckx
Spiking Neural Network Implementation on FPGA
for Robotic Behaviour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 694
Maximiliaan Walravens, Erik Verreyken, and Jan Steckel
A New Approach to Selectively Implement Control Flow Error
Detection Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 704
Jens Vankeirsbilck, Jonas Van Waes, Hans Hallez, and Jeroen Boydens
In-Air Imaging Sonar Sensor Network with Real-Time Processing
Using GPUs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 716
Wouter Jansen, Dennis Laurijssen, Robin Kerstens, Walter Daems,
and Jan Steckel
Comparing Machine Learning Algorithms for RSS-Based
Localization in LPWAN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 726
Thomas Janssen, Rafael Berkvens, and Maarten Weyn
Learning to Communicate with Multi-agent Reinforcement Learning
Using Value-Decomposition Networks . . . . . . . . . . . . . . . . . . . . . . . . . . 736
Simon Vanneste, Astrid Vanneste, Stig Bosmans, Siegfried Mercelis,
and Peter Hellinckx
AirLeakSlam: Automated Air Leak Detection . . . . . . . . . . . . . . . . . . . . 746
Anthony Schenck, Walter Daems, and Jan Steckel
Simulating a Combination of TDoA and AoA Localization
for LoRaWAN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756
Michiel Aernouts, Noori BniLam, Rafael Berkvens, and Maarten Weyn
Localization Accuracy Performance Comparison Between LTE-V
and IEEE 802.11p . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 766
Rreze Halili, Maarten Weyn, and Rafael Berkvens
Online Reverse Engineering of CAN Data . . . . . . . . . . . . . . . . . . . . . . . 776
Jens de Hoog, Nick Castermans, Siegfried Mercelis, and Peter Hellinckx
Time Synchronization with Channel Hopping Scheme
for LoRa Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 786
Ritesh Kumar Singh, Rafael Berkvens, and Maarten Weyn
LiDAR and Camera Sensor Fusion for 2D and 3D Object Detection . . . 798
Dieter Balemans, Simon Vanneste, Jens de Hoog, Siegfried Mercelis,
and Peter Hellinckx

The 5th International Workshop on Signal Processing
and Machine Learning (SiPML-2019)
Apple Brand Classification Using CNN Aiming at Automatic Apple
Texture Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 811
Shigeru Kato, Ryuji Ito, Takaya Shiozaki, Fuga Kitano, Naoki Wada,
Tomomichi Kagawa, Hajime Nobuhara, Takanori Hino, and Yukinori Sato
Fundamental Study on Evaluation System of Beginner’s Welding
Using CNN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 821
Shigeru Kato, Takanori Hino, and Naoki Yoshikawa
Building an Early Warning Model for Detecting Environmental
Pollution of Wastewater in Industrial Zones . . . . . . . . . . . . . . . . . . . . . 828
Nghien Nguyen Ba and Ricardo Rodriguez Jorge
A Robust Fully Correntropy–Based Sparse Modeling Alternative
to Dictionary Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 838
Carlos A. Loza
Labeling Activities Acquired by a Low-Accuracy EEG Device . . . . . . . 848
Ákos Rudas and Sándor Laki
The 2nd International Workshop on Business Intelligence
and Distributed Systems (BIDS-2019)
Data Sharing System Integrating Access Control Based on Smart
Contracts for IoT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 863
Tanzeela Sultana, Abdul Ghaffar, Muhammad Azeem, Zain Abubaker,
Muhammad Usman Gurmani, and Nadeem Javaid
Energy Trading Between Prosumer and Consumer in P2P Network
Using Blockchain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 875
Muhammad Usman Gurmani, Tanzeela Sultana, Abdul Ghaffar,
Muhammad Azeem, Zain Abubaker, Hassan Farooq, and Nadeem Javaid
Auto-Generating Examination Paper Based on Genetic Algorithms . . . . 887
Xu Chen, Deliang Zhong, Yutian Liu, Yipeng Li, Shudong Liu,
and Na Deng
The Data Scientist Job in Italy: What Companies Require . . . . . . . . . . 894
Maddalena della Volpe and Francesca Esposito
An Architecture for System Recovery Based on Solution Records
on Different Servers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 904
Takayuki Kasai and Kosuke Takano
A Novel Approach for Selecting Hybrid Features from Online
News Textual Metadata for Fake News Detection . . . . . . . . . . . . . . . . . 914
Mohamed K. Elhadad, Kin Fun Li, and Fayez Gebali

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 927


The 14th International Conference on
P2P, Parallel, Grid, Cloud and Internet
Computing (3PGCIC-2019)
A Fuzzy-Based Peer Coordination Quality
System in Mobile P2P Networks: Effect
of Time for Finishing Required Task
(TFRT) Parameter

Vladi Kolici1(B) , Yi Liu2 , and Leonard Barolli2


1
Faculty of Information Technology, Polytechnic University of Tirana,
Mother Theresa Square, No. 4, Tirana, Albania
vkolici@fti.edu.al
2
Department of Information and Communication Engineering,
Fukuoka Institute of Technology (FIT),
3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
ryuui1010@gmail.com, barolli@fit.ac.jp

Abstract. In this work, we present a distributed event-based awareness
approach for P2P groupware systems. The awareness of collaboration is
achieved by using primitive operations and services that are integrated
into the P2P middleware. We propose an abstract model for achieving
these requirements and we discuss how this model can support aware-
ness of collaboration in mobile teams. In this paper, we present a Fuzzy
Peer Coordination Quality System (FPCQS) for P2P networks accord-
ing to four parameters. We consider Time for Finishing the Required
Task (TFRT) as a new parameter. We evaluated the performance of
proposed system by computer simulations. The simulation results show
that when GS is increased, the PCQ is increased. But, by increasing
PCC, the PCQ is decreased. When the PM is 50 units, the PCQ is the
best. Considering the effect of TFRT parameter, we found that when
TFRT is increased, the PCQ is decreased.

1 Introduction
Peer-to-Peer (P2P) technologies have been among the most disruptive technologies
after the Internet. Indeed, the emergence of P2P technologies changed drasti-
cally the concepts, paradigms and protocols of sharing and communication in
large scale distributed systems. The nature of the sharing and the direct com-
munication among peers in the system, whether these are machines or people, makes
possible to overcome the limitations of the flat communications through email,
newsgroups and other forum-based communication forms [1–5].
The usefulness of P2P technologies on one hand has been shown for the
development of stand alone applications. On the other hand, P2P technolo-
gies, paradigms and protocols have penetrated other large scale distributed sys-
tems such as Mobile Ad hoc Networks (MANETs), Groupware systems, Mobile
Systems to achieve efficient sharing, communication, coordination, replication,
awareness and synchronization. In fact, for every new form of Internet-based
distributed systems, we are seeing how P2P concepts and paradigms again play
an important role to enhance the efficiency and effectiveness of such systems or
to enhance information sharing and online collaborative activities of groups of
people. We briefly introduce below some common application scenarios that can
benefit from P2P communications.
Awareness is a key feature of groupware systems. In its simplest terms, aware-
ness can be defined as the system’s ability to notify the members of a group of
changes occurring in the group’s workspace. Awareness systems for online col-
laborative work have been proposed since the early stages of Web technology.
Such proposals started by approaching workspace awareness, aiming to inform
users about changes occurring in the shared workspace. More recently, research
has focussed on using new paradigms, such as P2P systems, to achieve fully
decentralized, ubiquitous groupware systems and awareness in such systems. In
P2P groupware systems group processes may be more efficient because peers can
be aware of the status of other peers in the group, and can interact directly and
share resources with peers in order to provide additional scaffolding or social
support. Moreover, P2P systems are pervasive and ubiquitous in nature, thus
enabling contextualized awareness.
Fuzzy Logic (FL) is the logic underlying modes of reasoning which are approx-
imate rather than exact. The importance of FL derives from the fact that most
modes of human reasoning and especially common sense reasoning are approx-
imate in nature [6]. FL uses linguistic variables to describe the control param-
eters. By using relatively simple linguistic expressions it is possible to describe
and grasp very complex problems. A very important property of the linguistic
variables is the capability of describing imprecise parameters.
The concept of a fuzzy set deals with the representation of classes whose
boundaries are not determined. It uses a characteristic function, taking values
usually in the interval [0, 1]. The fuzzy sets are used for representing linguistic
labels. This can be viewed as expressing an uncertainty about the clear-cut
meaning of the label. But an important point is that the valuation set is supposed
to be common to the various linguistic labels that are involved in the given
problem.
The fuzzy set theory uses the membership function to encode a preference
among the possible interpretations of the corresponding label. A fuzzy set can be
defined by exemplification, ranking elements according to their typicality with
respect to the concept underlying the fuzzy set [7].
In this paper, we propose a Fuzzy Peer Coordination Quality Sys-
tem (FPCQS) considering four parameters: Time for Finishing the Required
Task (TFRT), Group Synchronization (GS), Peer Communication Cost (PCC)
and Peer Mobility (PM) to decide the Peer Coordination Quality (PCQ). We
evaluated the proposed system by simulations.
The structure of this paper is as follows. In Sect. 2, we introduce the group
activity awareness model. In Sect. 3, we introduce FL used for control. In Sect. 4,
we present the proposed fuzzy-based system. In Sect. 5, we discuss the simulation
results. Finally, conclusions and future work are given in Sect. 6.

Fig. 1. Super-peer P2P group network.

2 Group Activity Awareness Model


The awareness model considered here focuses on supporting group activities so
to accomplish a common group project, although it can also be used in a broader
scope of teamwork [8–14]. The main building blocks of our model (see also [15,16]
in the context of web-based groupware) are described below.
Activity awareness: Activity awareness refers to awareness information about
the project-related activities of group members. Project-based work is one of the
most common methods of group working. Activity awareness aims to provide
information about progress on the accomplishment of tasks by both individuals
and the group as a whole. It comprises knowing about actions taken by members
of the group according to the project schedule, and synchronization of activities
with the project schedule. Activity awareness should therefore enable members
to know about recent and past actions on the project’s work by the group. As
part of activity awareness, we also consider information on group artifacts such
as documents and actions upon them (uploads, downloads, modifications, read-
ing). Activity awareness is one of the most important, and most complex, types of
awareness. As well as the direct link to monitoring a group’s progress on the
work relating to a project, it also supports group communication and coordina-
tion processes.
Process awareness: In project-based work, a project typically requires the
enactment of a workflow. In such a case, the objective of the awareness is to
track the state of the workflow and to inform users accordingly. We term this
process awareness. The workflow is defined through a set of tasks and precedence
relationships relating to their order of completion. Process awareness targets
the information flow of the project, providing individuals and the group with a
partial view (what they are each doing individually) and a complete view (what
they are doing as a group), thus enabling the identification of past, current and
next states of the workflow in order to move the collaboration process forward.
Communication awareness: Another type of awareness considered in this
work is that of communication awareness. We consider awareness information
relating to message exchange, and synchronous and asynchronous discussion
forums. The first is intended to support awareness of peer-to-peer communica-
tion (when some peer wants to establish a direct communication with another
peer); the second is aimed at supporting awareness about chat room creation
and lifetime (so that other peers can be aware of, and possibly eventually join,
the chat room); the third refers to awareness of new messages posted at the
discussion forum, replies, etc.
Availability awareness: Availability awareness is useful for providing individuals
and the group with information on members’ and resources’ availability. The
former is necessary for establishing synchronous collaboration either in peer-to-
peer mode or (sub)group mode. The latter is useful for supporting members’ tasks
requiring available resources (e.g. a machine for running a software program).
Groupware applications usually monitor availability of group members by simply
looking at group workspaces. However, availability awareness encompasses not
only knowing who is in the workspace at any given moment but also who is
available when, via members’ profiles (which include also personal calendars) and
information explicitly provided by members. In the case of resources, awareness
is achieved via the schedules of resources. Thus, both explicit and implicit forms
of gathering availability awareness information should be supported.

3 Application of Fuzzy Logic for Control

The ability of fuzzy sets and possibility theory to model gradual properties or
soft constraints whose satisfaction is matter of degree, as well as information
pervaded with imprecision and uncertainty, makes them useful in a great variety
of applications [17–23].
The most popular area of application is Fuzzy Control (FC), since the appear-
ance, especially in Japan, of industrial applications in domestic appliances, pro-
cess control, and automotive systems, among many other fields.
In the FC systems, expert knowledge is encoded in the form of fuzzy rules,
which describe recommended actions for different classes of situations repre-
sented by fuzzy sets.
In fact, any kind of control law can be modeled by the FC methodology,
provided that this law is expressible in terms of “if ... then ...” rules, just like
in the case of expert systems. However, FL diverges from the standard expert
system approach by providing an interpolation mechanism from several rules. In
the context of complex processes, it may turn out to be more practical to get
knowledge from an expert operator than to calculate an optimal control, due to
modeling costs or because a model is out of reach.
A concept that plays a central role in the application of FL is that of a
linguistic variable. The linguistic variables may be viewed as a form of data
compression. One linguistic variable may represent many numerical variables. It
is suggestive to refer to this form of data compression as granulation.
The same effect can be achieved by conventional quantization, but in the
case of quantization, the values are intervals, whereas in the case of granula-
tion the values are overlapping fuzzy sets. The advantages of granulation over
quantization are as follows:

• it is more general;
• it mimics the way in which humans interpret linguistic values;
• the transition from one linguistic value to a contiguous linguistic value is
gradual rather than abrupt, resulting in continuity and robustness.

FC describes the algorithm for process control as a fuzzy relation between
information about the conditions of the process to be controlled, x and y, and
the output for the process z. The control algorithm is given in “if ... then ...”
expression, such as:

If x is small and y is big, then z is medium;


If x is big and y is medium, then z is big.

These rules are called FC rules. The “if” clause of the rules is called the
antecedent and the “then” clause is called consequent. In general, variables x
and y are called the input and z the output. The “small” and “big” are fuzzy
values for x and y, and they are expressed by fuzzy sets.
Fuzzy controllers are constructed of groups of these FC rules, and when an
actual input is given, the output is calculated by means of fuzzy inference.
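
To make the inference step concrete, the following minimal Python sketch (ours, not taken from the paper; the triangular membership breakpoints are illustrative assumptions) evaluates the two example FC rules above in the usual Mamdani style: the antecedents of each rule are combined with min, the fired rules are aggregated, and a crisp output is obtained as a weighted average of the consequent peaks.

# Minimal Mamdani-style evaluation of the two example FC rules.
# The triangular sets on [0, 100] are illustrative assumptions,
# not values taken from the paper.
def tri(x, a, b, c):
    """Triangular membership function with peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

small = lambda v: tri(v, -1, 0, 50)     # "small" peaks at 0
medium = lambda v: tri(v, 0, 50, 100)   # "medium" peaks at 50
big = lambda v: tri(v, 50, 100, 101)    # "big" peaks at 100

def infer(x, y):
    # Rule 1: IF x is small AND y is big    THEN z is medium
    w1 = min(small(x), big(y))
    # Rule 2: IF x is big   AND y is medium THEN z is big
    w2 = min(big(x), medium(y))
    if w1 + w2 == 0:
        return None  # no rule fires for this input
    # Weighted average of the consequent peaks (a simple centroid surrogate)
    return (w1 * 50 + w2 * 100) / (w1 + w2)

print(infer(x=20, y=80))  # output pulled towards "medium"
print(infer(x=90, y=55))  # output pulled towards "big"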

4 Proposed Fuzzy Peer Coordination Quality System

The P2P group-based model considered is that of a superpeer model as shown
in Fig. 1. In this model, the P2P network is fragmented into several disjoint
peergroups (see Fig. 2). The peers of each peergroup are connected to a single
superpeer. There is frequent local communication between peers in a peergroup,
and less frequent global communication between superpeers.
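
The topology can be pictured with a short toy model (a sketch only; the class names and the routing logic are our assumptions, not code from the paper): messages between peers of the same peergroup are delivered directly, while messages addressed to another peergroup are relayed through the two superpeers.

# Toy model of the super-peer P2P group-based topology (illustrative only).
class PeerGroup:
    def __init__(self, superpeer_name):
        self.peers = []
        self.superpeer = SuperPeer(superpeer_name, self)

class SuperPeer:
    def __init__(self, name, group):
        self.name, self.group = name, group

    def relay(self, dst, msg):
        # Less frequent global communication between superpeers
        remote = dst.group.superpeer
        print(f"{self.name} => {remote.name} => {dst.name} (global): {msg}")

class Peer:
    def __init__(self, name, group):
        self.name, self.group = name, group
        group.peers.append(self)

    def send(self, dst, msg):
        if dst.group is self.group:
            # Frequent local communication inside the peergroup
            print(f"{self.name} -> {dst.name} (local): {msg}")
        else:
            self.group.superpeer.relay(dst, f"{self.name}: {msg}")

g1, g2 = PeerGroup("SP1"), PeerGroup("SP2")
a, b, c = Peer("A", g1), Peer("B", g1), Peer("C", g2)
a.send(b, "task update")  # stays inside peergroup 1
a.send(c, "task update")  # crosses groups through SP1 and SP2
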
To complete a certain task in P2P mobile collaborative team work, peers
often have to interact with unknown peers. Thus, it is important that group
members must select reliable peers to interact.
In this work, we consider four parameters: Time for Finishing the Required
Task (TFRT), Group Synchronization (GS), Peer Communication Cost (PCC)
and Peer Mobility (PM) to decide the Peer Coordination Quality (PCQ). The
structure of FPCQS is shown in Fig. 3. These four parameters are fuzzified using
fuzzy system, and based on the decision of fuzzy system the peer coordination
quality is calculated. The membership functions for our system are shown in
Fig. 4. In Table 1, we show the Fuzzy Rule Base (FRB) of our proposed system,
which consists of 81 rules [24].
The input parameters for FPCQS are: TFRT, GS, PCC and PM. The output
linguistic parameter is PCQ. The term sets of TFRT, GS, PCC and PM are
defined respectively as:
TFRT = {Fast, Normal, Late} = {F, N, L};
GS = {Bad, Normal, Good} = {Ba, Nor, Go};
PCC = {Low, Middle, High} = {Lo, Mi, Hi};
PM = {Slow Speed, Middle Speed, Fast Speed} = {Ss, Ms, Fs},

and the term set for the output PCQ is defined as:

PCQ = {Extremely Bad, Bad, Minimally Good, Partially Good, Good, Very Good, Very Very Good}
    = {EB, BD, MG, PG, G, VG, VVG}.

Fig. 2. P2P group-based model.

Fig. 3. Proposed system structure.

Fig. 4. Membership functions.
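
A much simplified sketch of how FPCQS can be driven (ours, not the authors' implementation, which performs full fuzzy inference over the whole FRB; the membership breakpoints are assumptions loosely read off Fig. 4) maps each crisp input to its strongest linguistic term and looks the combination up in a small excerpt of Table 1:

# Simplified FPCQS lookup: fuzzify each crisp input (0-100 scale) to its
# strongest term and consult an excerpt of the FRB in Table 1.
# Breakpoints are assumptions, not the exact sets of Fig. 4.
def tri(x, a, b, c):
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def strongest_term(value, terms):
    return max(terms, key=lambda t: tri(value, *terms[t]))

TFRT = {"F": (-1, 0, 50), "N": (0, 50, 100), "L": (50, 100, 101)}
GS = {"Ba": (-1, 0, 50), "Nor": (0, 50, 100), "Go": (50, 100, 101)}
PCC = {"Lo": (-1, 0, 50), "Mi": (0, 50, 100), "Hi": (50, 100, 101)}
PM = {"Ss": (-1, 0, 50), "Ms": (0, 50, 100), "Fs": (50, 100, 101)}

# Excerpt of the FRB (rules 1, 2, 28 and 55 of Table 1).
FRB = {
    ("F", "Ba", "Lo", "Ss"): "VG",
    ("F", "Ba", "Lo", "Ms"): "VVG",
    ("N", "Ba", "Lo", "Ss"): "PG",
    ("L", "Ba", "Lo", "Ss"): "MG",
}

def pcq(tfrt, gs, pcc, pm):
    key = (strongest_term(tfrt, TFRT), strongest_term(gs, GS),
           strongest_term(pcc, PCC), strongest_term(pm, PM))
    return FRB.get(key, "rule not in this excerpt")

print(pcq(tfrt=10, gs=15, pcc=20, pm=10))  # -> VG  (rule 1)
print(pcq(tfrt=95, gs=15, pcc=20, pm=10))  # -> MG  (rule 55)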

Table 1. FRB.

Rule TFRT GS PCC PM PCQ Rule TFRT GS PCC PM PCQ Rule TFRT GS PCC PM PCQ
1 F Ba Lo Ss VG 28 N Ba Lo Ss PG 55 L Ba Lo Ss MG
2 F Ba Lo Ms VVG 29 N Ba Lo Ms VG 56 L Ba Lo Ms G
3 F Ba Lo Fs VG 30 N Ba Lo Fs PG 57 L Ba Lo Fs MG
4 F Ba Mi Ss PG 31 N Ba Mi Ss BD 58 L Ba Mi Ss EB
5 F Ba Mi Ms VG 32 N Ba Mi Ms PG 59 L Ba Mi Ms MG
6 F Ba Mi Fs PG 33 N Ba Mi Fs BD 60 L Ba Mi Fs EB
7 F Ba Hi Ss BD 34 N Ba Hi Ss EB 61 L Ba Hi Ss EB
8 F Ba Hi Ms PG 35 N Ba Hi Ms BD 62 L Ba Hi Ms EB
9 F Ba Hi Fs BD 36 N Ba Hi Fs EB 63 L Ba Hi Fs EB
10 F Nor Lo Ss VG 37 N Nor Lo Ss G 64 L Nor Lo Ss PG
11 F Nor Lo Ms VVG 38 N Nor Lo Ms VVG 65 L Nor Lo Ms VG
12 F Nor Lo Fs VG 39 N Nor Lo Fs G 66 L Nor Lo Fs PG
13 F Nor Mi Ss G 40 N Nor Mi Ss MG 67 L Nor Mi Ss BD
14 F Nor Mi Ms VG 41 N Nor Mi Ms G 68 L Nor Mi Ms PG
15 F Nor Mi Fs G 42 N Nor Mi Fs MG 69 L Nor Mi Fs BD
16 F Nor Hi Ss MG 43 N Nor Hi Ss BD 70 L Nor Hi Ss EB
17 F Nor Hi Ms G 44 N Nor Hi Ms MG 71 L Nor Hi Ms BD
18 F Nor Hi Fs MG 45 N Nor Hi Fs BD 72 L Nor Hi Fs EB
19 F Go Lo Ss VVG 46 N Go Lo Ss VG 73 L Go Lo Ss G
20 F Go Lo Ms VVG 47 N Go Lo Ms VVG 74 L Go Lo Ms VG
21 F Go Lo Fs VVG 48 N Go Lo Fs VG 75 L Go Lo Fs G
22 F Go Mi Ss VG 49 N Go Mi Ss G 76 L Go Mi Ss MG
23 F Go Mi Ms VVG 50 N Go Mi Ms VG 77 L Go Mi Ms G
24 F Go Mi Fs VG 51 N Go Mi Fs G 78 L Go Mi Fs MG
25 F Go Hi Ss PG 52 N Go Hi Ss MG 79 L Go Hi Ss EB
26 F Go Hi Ms VG 53 N Go Hi Ms G 80 L Go Hi Ms MG
27 F Go Hi Fs PG 54 N Go Hi Fs MG 81 L Go Hi Fs EB

Fig. 5. Relation of PCQ with TFRT and GS for different PCC when PM = 10: (a) PCC = 10, (b) PCC = 90.
Fig. 6. Relation of PCQ with TFRT and GS for different PCC when PM = 50: (a) PCC = 10, (b) PCC = 90.

Fig. 7. Relation of PCQ with TFRT and GS for different PCC when PM = 90: (a) PCC = 10, (b) PCC = 90.

5 Simulation Results
In this section, we present the simulation results for our FPCQS system. In our
system, we decided the number of term sets by carrying out many simulations.
In Figs. 5, 6 and 7, we show the relation of PCQ with TFRT, GS, PCC and
PM. In these simulations, we consider the PCC and PM as constant parameters.
In Fig. 5, we consider the PM value 10 units. We change the PCC value from 10
to 90 units. When the PCC increases, the PCQ is decreased. But, when the GS
is increased, the PCQ values are higher. Considering the effect of the TFRT parameter,
we see that when TFRT is increased, the PCQ is decreased. In Figs. 6 and 7, we
increase the PM values to 50 and 90 units, respectively. We see that, when the
PM is 50 units, the PCQ is the best.

6 Conclusions and Future Work


In this paper, we proposed a FPCQS to decide the PCQ in mobile P2P networks.
We took into consideration four parameters: TFRT, GS, PCC and PM. We
evaluated the performance of the proposed system by computer simulations. From
the simulation results, we conclude that when GS is increased, the PCQ is
increased. But, by increasing PCC and TFRT, the PCQ is decreased.
In the future, we would like to make extensive simulations to evaluate the
proposed system and compare its performance with other systems.

References
1. Oram, A. (ed.): Peer-to-Peer: Harnessing the Power of Disruptive Technologies.
O’Reilly and Associates, Sebastopol (2001)
2. Sula, A., Spaho, E., Matsuo, K., Barolli, L., Xhafa, F., Miho, R.: A new system
for supporting children with autism spectrum disorder based on IoT and P2P
technology. Int. J. Space-Based Situated Comput. 4(1), 55–64 (2014). https://doi.
org/10.1504/IJSSC.2014.060688
3. Di Stefano, A., Morana, G., Zito, D.: QoS-aware services composition in P2PGrid
environments. Int. J. Grid Util. Comput. 2(2), 139–147 (2011). https://doi.org/
10.1504/IJGUC.2011.040601
4. Sawamura, S., Barolli, A., Aikebaier, A., Takizawa, M., Enokido, T.: Design and
evaluation of algorithms for obtaining objective trustworthiness on acquaintances
in P2P overlay networks. Int. J. Grid Util. Comput. 2(3), 196–203 (2011). https://
doi.org/10.1504/IJGUC.2011.042042
5. Higashino, M., Hayakawa, T., Takahashi, K., Kawamura, T., Sugahara, K.: Man-
agement of streaming multimedia content using mobile agent technology on pure
P2P-based distributed e-learning system. Int. J. Grid Util. Comput. 5(3), 198–204
(2014). https://doi.org/10.1504/IJGUC.2014.062928
6. Inaba, T., Obukata, R., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: Performance
evaluation of a QoS-aware fuzzy-based CAC for LAN access. Int. J. Space-Based
Situated Comput. 1(1) (2011). https://doi.org/10.1504/IJSSC.2016.082768
7. Terano, T., Asai, K., Sugeno, M.: Fuzzy Systems Theory and Its Applications.
Academic Press, Harcourt Brace Jovanovich Inc. (1992)
8. Mori, T., Nakashima, M., Ito, T.: SpACCE: a sophisticated ad hoc cloud computing
environment built by server migration to facilitate distributed collaboration. Int. J.
Space-Based Situated Comput. 1(1) (2011). https://doi.org/10.1504/IJSSC.2012.
050000
9. Xhafa, F., Poulovassilis, A.: Requirements for distributed event-based awareness
in P2P groupware systems. In: Proceedings of AINA 2010, pp. 220–225 (2010)
10. Xhafa, F., Barolli, L., Caballé, S., Fernandez, R.: Supporting scenario-based online
learning with P2P group-based systems. In: Proceedings of NBiS 2010, pp. 173–180
(2010)
11. Gupta, S., Kaiser, G.: P2P video synchronization in a collaborative virtual envi-
ronment. In: Proceedings of the 4th International Conference on Advances in Web-
Based Learning (ICWL 2005), pp. 86–98 (2005)
12. Martínez-Alemán, A.M., Wartman, K.L.: Online Social Networking on Campus:
Understanding What Matters in Student Culture. Taylor and Francis, Routledge
(2008)
13. Puzar, M., Plagemann, T.: Data sharing in mobile ad-hoc networks – a study of
replication and performance in the MIDAS data space. Int. J. Space-Based Situated
Comput. 1(1) (2011). https://doi.org/10.1504/IJSSC.2011.040340
14. Spaho, E., Kulla, E., Xhafa, F., Barolli, L.: P2P solutions to efficient mobile peer
collaboration in MANETs. In: Proceedings of 3PGCIC 2012, pp. 379–383, Novem-
ber 2012
15. Gutwin, C., Greenberg, S., Roseman, M.: Workspace awareness in real-time dis-
tributed groupware: framework, widgets, and evaluation. In: BCS HCI, pp. 281–298
(1996)
16. You, Y., Pekkola, S.: Meeting others - supporting situation awareness on the
WWW. Decis. Support Syst. 32(1), 71–82 (2001)
17. Kandel, A.: Fuzzy Expert Systems. CRC Press, Boca Raton (1992)
18. Zimmermann, H.J.: Fuzzy Set Theory and Its Applications, 2nd Rev. edn. Kluwer
Academic Publishers (1991)
19. McNeill, F.M., Thro, E.: Fuzzy Logic: A Practical Approach. Academic Press Inc.,
Cambridge (1994)
20. Zadeh, L.A., Kacprzyk, J.: Fuzzy Logic for the Management of Uncertainty. Wiley,
Hoboken (1992)
21. Procyk, T.J., Mamdani, E.H.: A linguistic self-organizing process controller. Auto-
matica 15(1), 15–30 (1979)
22. Klir, G.J., Folger, T.A.: Fuzzy Sets, Uncertainty, and Information. Prentice Hall,
Englewood Cliffs (1988)
23. Munakata, T., Jani, Y.: Fuzzy systems: an overview. Commun. ACM 37(3), 69–76
(1994)
24. Yi, L., Kouseke, O., Keita, M., Makoto, I., Leonard, B.: A fuzzy-based approach
for improving peer coordination quality in MobilePeerDroid mobile system. In:
Proceedings of IMIS 2018, pp. 60–73 (2018)
A Fuzzy-Based System for Driving Risk
Measurement (FSDRM) in VANETs: A
Comparison Study of Simulation and
Experimental Results

Kevin Bylykbashi1(B) , Ermioni Qafzezi1 , Makoto Ikeda2 , Keita Matsuo2 ,
and Leonard Barolli2
1
Graduate School of Engineering, Fukuoka Institute of Technology (FIT),
3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
bylykbashi.kevin@gmail.com, eqafzezi@gmail.com
2
Department of Information and Communication Engineering,
Fukuoka Institute of Technology (FIT),
3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
makoto.ikd@acm.org, {kt-matsuo,barolli}@fit.ac.jp

Abstract. Vehicular Ad hoc Networks (VANETs) have gained great
attention due to the rapid development of mobile internet and Inter-
net of Things (IoT) applications. With the evolution of technology, it is
expected that VANETs will be massively deployed in upcoming vehicles.
In addition, ambitious efforts are being made to incorporate Ambient
Intelligence (AmI) technology in vehicles, as it will be an important
factor for VANETs to accomplish one of their main goals, road safety. In
this paper, we propose an intelligent system for safe driving in VANETs
using fuzzy logic. The proposed system considers in-car environment data
such as the ambient temperature and noise, vehicle speed, and driver’s
heart rate to assess the risk level. Then, it uses the smart box to inform
the driver and to provide better assistance. We aim to realize a new sys-
tem to support the driver for safe driving. We evaluated the performance
of the proposed system by computer simulations and experiments. From
the evaluation results, we conclude that the vehicle’s inside temperature,
noise level, vehicle speed, and driver’s heart rate have different effects on
the assessment of risk level.

1 Introduction
Traffic accidents, road congestion and environmental pollution are persistent
problems faced by both developed and developing countries, which have made
people live in difficult situations. Among these, the traffic incidents are the most
serious ones because they result in huge loss of life and property. For decades, we
have seen governments and car manufacturers struggle for safer roads and car
accident prevention. The development in wireless communications has allowed
companies, researchers and institutions to design communication systems that
provide new solutions for these issues. Therefore, new types of networks, such
as Vehicular Ad hoc Networks (VANETs) have been created. VANET consists
of a network of vehicles in which vehicles are capable of communicating among
themselves in order to deliver valuable information such as safety warnings and
traffic information.
Nowadays, every car is likely to be equipped with various forms of smart
sensors, wireless communication modules, storage and computational resources.
The sensors will gather information about the road and environment conditions
and share it with neighboring vehicles and adjacent roadside units (RSU) via
vehicle-to-vehicle (V2V) or vehicle-to-infrastructure (V2I) communication. How-
ever, the difficulty lies on how to understand the sensed data and how to make
intelligent decisions based on the provided information.
As a result, Ambient Intelligence (AmI) becomes a significant factor for
VANETs. Various intelligent systems and applications are now being deployed
and they are going to change the way manufacturers design vehicles. These sys-
tems include many intelligence computational technologies such as fuzzy logic,
neural networks, machine learning, adaptive computing, voice recognition, and
so on, and they are already announced or deployed [1]. The goal is to improve
both vehicle safety and performance by realizing a series of automatic driving
technologies based on the situation recognition. The car control relies on the
measurement and recognition of the outside environment and their reflection on
driving operation.
On the other hand, we are focused on the in-car information and driver’s vital
information to detect the danger or risk situation and inform the driver about the
risk or change his mood. Thus, our goal is to prevent the accidents by supporting
the drivers. In order to realize the proposed system, we use some Internet of
Things (IoT) devices equipped with various sensors for in-car monitoring.
In this paper, we propose a fuzzy-based system for safe driving considering
four parameters: Vehicle’s Inside Temperature (VIT), Noise Level (NL), Vehi-
cle Speed (VS) and Heart Rate (HR) to determine the Driving Risk Measure-
ment (DRM).
The structure of the paper is as follows. In Sect. 2, we present an overview
of VANETs. In Sect. 3, we present a short description of AmI. In Sect. 4, we
describe the proposed fuzzy-based system and its implementation. In Sect. 5, we
discuss the simulation and experimental results. Finally, conclusions and future
work are given in Sect. 6.

2 Vehicular Ad Hoc Networks (VANETs)

VANETs are a type of wireless networks that have emerged thanks to advances
in wireless communication technologies and the automotive industry. VANETs
are considered to have an enormous potential in enhancing road traffic safety
and traffic efficiency. Therefore, various governments have launched programs
dedicated to the development and consolidation of vehicular communications
and networking and both industrial and academic researchers are addressing
many related challenges, including socio-economic ones, which are among the
most important [2].
The VANET technology uses moving vehicle as nodes to form a wireless
mobile network. It aims to provide fast and cost-efficient data transfer for the
advantage of passenger safety and comfort. To improve road safety and travel
comfort of voyagers and drivers, Intelligent Transport Systems (ITS) are devel-
oped. The ITS manages the vehicle traffic, supports drivers with safety and other
information, and provides some services such as automated toll collection and
driver assist systems [3].
The VANETs provide new prospects to improve advanced solutions for mak-
ing reliable communication between vehicles. VANETs can be defined as a part
of ITS which aims to make transportation systems faster and smarter, in which
vehicles are equipped with some short-range and medium-range wireless com-
munication [4]. In a VANET, wireless vehicles are able to communicate directly
with each other (e.g., emergency vehicle warning, stationary vehicle warning) and
can also be served various services (e.g., video streaming, Internet access) from
access points (e.g., 3G or 4G) through roadside units.

3 Ambient Intelligence (AmI)

The AmI is the vision that technology will become invisible, embedded in our
natural surroundings, present whenever we need it, enabled by simple and effort-
less interactions, attuned to all our senses, adaptive to users and context and
autonomously acting [5]. High quality information and content must be available
to any user, anywhere, at any time, and on any device.
In order that AmI becomes a reality, it should completely envelope humans,
without constraining them. Distributed embedded systems for AmI are going to
change the way we design embedded systems, as well as the way we think about
such systems. But, more importantly, they will have a great impact on the way
we live. Applications ranging from safe driving systems, smart buildings and
home security, smart fabrics or e-textiles, to manufacturing systems and rescue
and recovery operations in hostile environments, are poised to become part of
society and human lives.
The AmI deals with a new world of ubiquitous computing devices, where
physical environments interact intelligently and unobtrusively with people. AmI
environments can be diverse, such as homes, offices, meeting rooms, hospitals,
control centers, vehicles, tourist attractions, stores, sports facilities, and music
devices.
In the future, small devices will monitor the health status in a continuous
manner, diagnose any possible health conditions, have conversation with people
to persuade them to change the lifestyle for maintaining better health, and com-
municates with the doctor, if needed [6]. The device might even be embedded
into the regular clothing fibers in the form of very tiny sensors and it might
communicate with other devices including the variety of sensors embedded into
the home to monitor the lifestyle. For example, people might be alarmed about
the lack of a healthy diet based on the items present in the fridge and based on
what they are eating outside regularly.

Fig. 1. Proposed system architecture.
The AmI paradigm represents the future vision of intelligent computing
where environments support the people inhabiting them [7–9]. In this new com-
puting paradigm, the conventional input and output media no longer exist, rather
the sensors and processors will be integrated into everyday objects, working
together in harmony in order to support the inhabitants [10]. By relying on
various artificial intelligence techniques, AmI promises the successful interpre-
tation of the wealth of contextual information obtained from such embedded
sensors and it will adapt the environment to the user needs in a transparent and
anticipatory manner.

4 Proposed System

In this work, we use fuzzy logic to implement the proposed system. Fuzzy sets
and fuzzy logic have been developed to manage vagueness and uncertainty in a
reasoning process of an intelligent system such as a knowledge based system, an
expert system or a logic control system [11–16]. In Fig. 1, we show the architec-
ture of our proposed system.

4.1 Proposed Fuzzy-Based Simulation System

The proposed system called Fuzzy-based System for Driving Risk Measure-
ment (FSDRM) is shown in Fig. 2. For the implementation of our system,
we consider four input parameters: Vehicle’s Inside Temperature (VIT), Noise
Level (NL), Vehicle Speed (VS) and Heart Rate (HR) to determine the Driv-
ing Risk Measurement (DRM). These four input parameters are not correlated
with each other; for this reason, we use a fuzzy system.

Fig. 2. Proposed system structure.

Table 1. Parameters and their term sets for FSDRM.

Parameters                           Term sets
Vehicle’s Inside Temperature (VIT)   Low (L), Medium (M), High (H)
Noise Level (NL)                     Quiet (Q), Noisy (N), Very Noisy (VN)
Vehicle Speed (VS)                   Low (Lo), Moderate (Mo), High (Hi)
Heart Rate (HR)                      Slow (S), Normal (No), Fast (F)
Driving Risk Measurement (DRM)       Safe (Sf), Low (Lw), Moderate (Md), High (Hg),
                                     Very High (VH), Severe (Sv), Danger (D)

The input parameters are
fuzzified using the membership functions shown in Fig. 3(a), (b), (c) and (d).
Fig. 3(e) shows the membership functions used for the output parameter. We
use triangular and trapezoidal membership functions because they are suitable
for real-time operation. The term sets for each linguistic parameter are shown in
Table 1. We decided the number of term sets by carrying out many simulations.
In Table 2, we show the Fuzzy Rule Base (FRB) of FSDRM, which consists of 81
rules. The control rules have the form: IF “conditions” THEN “control action”.
For instance, for Rule 1: “IF VIT is L, NL is Q, VS is Lo and HR is S, THEN
DRM is Hg” or for Rule 29: “IF VIT is M, NL is Q, VS is Lo and HR is No,
THEN DRM is Sf”.
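
As an illustration of this rule form (a sketch under assumed membership breakpoints, since the exact ranges behind Fig. 3 are not reproduced here; this is not the authors' FuzzyC implementation), the two quoted rules can be encoded and their firing strength computed with the min of the antecedent degrees:

# Sketch of the FSDRM rule form "IF conditions THEN control action".
# Trapezoidal membership functions are used as in the paper, but the
# breakpoints below (degrees Celsius and bpm) are assumptions.
def trap(x, a, b, c, d):
    """Trapezoid: rises on [a, b], flat on [b, c], falls on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

VIT = {"L": (-40, -40, 10, 18), "M": (10, 18, 24, 30), "H": (24, 30, 50, 50)}
HR = {"S": (0, 0, 50, 60), "No": (50, 60, 90, 100), "F": (90, 100, 220, 220)}

# Rule 1:  IF VIT is L, NL is Q, VS is Lo and HR is S  THEN DRM is Hg
# Rule 29: IF VIT is M, NL is Q, VS is Lo and HR is No THEN DRM is Sf
RULES = [
    # (antecedent terms for VIT and HR, consequent DRM);
    # NL ("Q") and VS ("Lo") degrees are passed in directly below.
    ({"VIT": "L", "HR": "S"}, "Hg"),   # Rule 1
    ({"VIT": "M", "HR": "No"}, "Sf"),  # Rule 29
]

def firing_strength(cond, vit, hr, nl_deg, vs_deg):
    """AND of the antecedents (min); NL and VS degrees are given directly."""
    return min(trap(vit, *VIT[cond["VIT"]]), nl_deg, vs_deg,
               trap(hr, *HR[cond["HR"]]))

# Example: a 15 degree cabin, quiet and slow (degrees 1.0), heart rate 75 bpm.
for cond, drm in RULES:
    print(drm, firing_strength(cond, vit=15, hr=75, nl_deg=1.0, vs_deg=1.0))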

4.2 Testbed Description

In order to evaluate the proposed system, we implemented a testbed and car-
ried out experiments in a real scenario [17,18]. A snapshot of the testbed is shown in
Fig. 3. Membership functions.

Fig. 4. The testbed is composed of sensing and processing components. The sens-
ing system consists of two parts. The first part is implemented in the Arduino
Platform while the second one consists of a Microwave Sensor Module (MSM)
called DC6M4JN3000. We set up sensors on an Arduino Uno to measure the envi-
ronment temperature and noise and used the MSM to measure the driver’s heart
rate. The vehicle speed is considered as a random value. Then, we implemented
a processing device to get the sensed data and to run our fuzzy system. The
sensing components are connected to the processing device via USB cable. We
used Arduino IDE and Processing language to get the sensed data from the
first module, whereas the MSM generates the sensed data in the appropriate
format itself. Then, we use FuzzyC to fuzzify these data and to determine the
degree of risk which is the output of our proposed system. Based on the DRM
an appropriate task can be performed.
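
The data path of the testbed can be sketched as follows (this is our illustration, not the authors' Processing/FuzzyC code; the port names, baud rates and line formats are assumptions): the processing device polls the two USB serial links, parses the sensed values, adds the randomly generated vehicle speed, and hands everything to the risk-evaluation routine.

# Sketch of the testbed data path (illustrative; port names, baud rates
# and the "temperature,noise" / "heartrate" line formats are assumptions).
import random
import serial  # pyserial

arduino = serial.Serial("/dev/ttyACM0", 9600, timeout=1)  # temperature + noise
msm = serial.Serial("/dev/ttyUSB0", 115200, timeout=1)    # heart rate (MSM)

def read_sample():
    vit, nl = (float(v) for v in arduino.readline().decode().strip().split(","))
    hr = float(msm.readline().decode().strip())
    vs = random.uniform(10, 150)  # vehicle speed is treated as a random value
    return vit, nl, vs, hr

def evaluate_drm(vit, nl, vs, hr):
    # Stand-in for the fuzzy evaluation (FuzzyC in the real testbed): a crude
    # normalised score in [0, 1] so that the sketch produces some output.
    return round(min(1.0, abs(vit - 20) / 30 * 0.3 + nl / 100 * 0.2
                     + vs / 150 * 0.3 + abs(hr - 70) / 80 * 0.2), 2)

while True:
    try:
        vit, nl, vs, hr = read_sample()
    except ValueError:
        continue  # skip empty or malformed serial lines
    drm = evaluate_drm(vit, nl, vs, hr)
    print(f"VIT={vit:.1f}C NL={nl:.1f}dB VS={vs:.0f}kmph HR={hr:.0f}bpm -> DRM={drm}")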

Table 2. FRB of FSDRM.

Nr. VIT NL VS HR DRM Nr. VIT NL VS HR DRM Nr. VIT NL VS HR DRM


1 L Q Lo S Hg 28 M Q Lo S Lw 55 H Q Lo S Hg
2 L Q Lo No Lw 29 M Q Lo No Sf 56 H Q Lo No Lw
3 L Q Lo F Hg 30 M Q Lo F Lw 57 H Q Lo F Hg
4 L Q Mo S Hg 31 M Q Mo S Md 58 H Q Mo S VH
5 L Q Mo No Md 32 M Q Mo No Sf 59 H Q Mo No Md
6 L Q Mo F Hg 33 M Q Mo F Md 60 H Q Mo F VH
7 L Q Hi S Sv 34 M Q Hi S Hg 61 H Q Hi S D
8 L Q Hi No Hg 35 M Q Hi No Md 62 H Q Hi No VH
9 L Q Hi F Sv 36 M Q Hi F Hg 63 H Q Hi F D
10 L N Lo S Hg 37 M N Lo S Lw 64 H N Lo S Hg
11 L N Lo No Md 38 M N Lo No Sf 65 H N Lo No Md
12 L N Lo F Hg 39 M N Lo F Lw 66 H N Lo F Hg
13 L N Mo S VH 40 M N Mo S Md 67 H N Mo S VH
14 L N Mo No Md 41 M N Mo No Sf 68 H N Mo No Md
15 L N Mo F VH 42 M N Mo F Md 69 H N Mo F VH
16 L N Hi S D 43 M N Hi S VH 70 H N Hi S D
17 L N Hi No VH 44 M N Hi No Md 71 H N Hi No VH
18 L N Hi F D 45 M N Hi F VH 72 H N Hi F D
19 L VN Lo S VH 46 M VN Lo S Md 73 H VN Lo S Sv
20 L VN Lo No Md 47 M VN Lo No Sf 74 H VN Lo No Hg
21 L VN Lo F VH 48 M VN Lo F Md 75 H VN Lo F Sv
22 L VN Mo S Sv 49 M VN Mo S Hg 76 H VN Mo S Sv
23 L VN Mo No Hg 50 M VN Mo No Lw 77 H VN Mo No Hg
24 L VN Mo F Sv 51 M VN Mo F Hg 78 H VN Mo F Sv
25 L VN Hi S D 52 M VN Hi S Sv 79 H VN Hi S D
26 L VN Hi No Sv 53 M VN Hi No Hg 80 H VN Hi No Sv
27 L VN Hi F D 54 M VN Hi F Sv 81 H VN Hi F D

Fig. 4. Snapshot of testbed.


Fig. 5. Simulation results for VIT = 10 °C.

Fig. 6. Simulation results for VIT = 20 °C.

Fig. 7. Simulation results for VIT = 30 °C.

5 Proposed System Evaluation


5.1 Simulation Results
In this subsection, we present the simulation results for our proposed system.
The simulation results are presented in Figs. 5, 6 and 7. We consider the VIT
and NL as constant parameters. The VS values considered for simulations are
from 10 to 100 kmph. We show the relation between DRM and HR for different
VS values. We vary the HR parameter from 30 to 150 bpm.
In Fig. 5, we consider the VIT value 10 °C and change the NL from 40 dB
to 65 dB. From Fig. 5(a) we can see that the DRM values are relatively high,
especially when the vehicle speed is over 40 kmph and the driver’s heart rate is
not a normal one. Fig. 5(b) considers the same scenario but with a noisy
environment. It can be seen that the DRM is increased compared with the first
scenario. However, when the vehicle speed is under 40 kmph and the driver’s
heart beats at a normal rate, the risk level is not a big concern.
In Fig. 6, we present the simulation results for VIT = 20 °C. Fig. 6(a) considers
the scenario with a quiet ambient. We can see that the DRM values are
lower than all the other considered scenarios. This is due to the good conditions
of the vehicle’s inside environment in which the driver feels comfortable and
can easily manage situations when he is driving at high speed. With noise
present (see Fig. 6(b)), it can be seen that the DRM is increased; however, the
risk is still at moderate levels.
In Fig. 7, we increase the value of VIT to 30 °C. If the driver’s heart beats
normally we can see that there are some cases when the degree of risk is not
high such as when the ambient is not noisy or the vehicle moves slowly. But,
when the inside environment becomes very noisy, we can see that the degree of
risk is assessed to be very high.
In the cases when the risk level is above the moderate level for a relatively
long time, the system can perform a certain action. For example, when the
DRM value is slightly above the moderate level the system may take an action
to change the driver’s mood, and when the DRM value is very high, the system
could limit the vehicle’s maximal speed to a speed that the risk level is decreased
significantly.
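
Such a policy layer on top of the DRM output could be as simple as a threshold table (purely illustrative; the output range, thresholds and actions below are our assumptions, not values from the paper):

# Illustrative mapping from the DRM output (assumed 0..1 range) to a
# smart-box action; thresholds and actions are assumptions.
def choose_action(drm):
    if drm < 0.4:   # up to roughly the moderate level: no intervention
        return "none"
    if drm < 0.6:   # slightly above moderate: try to change the driver's mood
        return "play relaxing music / ventilate the cabin"
    return "limit the vehicle's maximal speed"  # very high risk

for level in (0.25, 0.5, 0.8):
    print(level, "->", choose_action(level))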

5.2 Experimental Results

For the experiments, we considered vehicle speeds of up to 150 kmph. The
experimental results are presented in Figs. 8, 9 and 10. Fig. 8(a) shows
the results of DRM when VIT is “Low” and NL is “Quiet”. From Fig. 8(a), we
can see a number of DRM values that indicate a situation with a low risk. These
values are achieved when the ambient is quiet and the driver’s heart is beating
very normally. On the other hand, when the ambient is noisy, we can see that
there is not any DRM value that is decided as a safe or low risk situation by the
system (see Fig. 8(b)). All the situations are decided with moderate values, or
very risky if the vehicle moves at high speed.
The results of DRM for medium temperatures are presented in Fig. 9, with
Fig. 9(a) and Fig. 9(b) presenting the experimental results for quiet and noisy
ambient, respectively. Here the driver is in better conditions and when his heart
beats normally, the risk is mostly under the moderate level. Although the risk is
in low levels, there are cases such as when the driver is driving fast and simulta-

Fig. 8. Experimental results for low temperatures.

Fig. 9. Experimental results for medium temperatures.

Fig. 10. Experimental results for high temperatures.

neously his heart is beating at slow/high rates, where the risk level is determined
to be above the moderate level.
Fig. 10 shows the results of DRM for high temperatures. The results
are almost the same with that of Fig. 7 where the low values of DRM happen to
be only when the ambient is quiet, the driver’s heart rate is normal and he is
driving slowly. When the ambient is very noisy, the degree of risk is decided to
be even above the “Very High” level. In these situations, the driver should not
drive fast as his situation is a potential risk for him and for other vehicles on the
road. Therefore, the system decides to perform the appropriate action in order
to provide the driving safety.

6 Conclusions

In this paper, we proposed a fuzzy-based system to decide the driving risk mea-
surement. We took into consideration four parameters: vehicle’s inside tempera-
ture, noise level, vehicle speed and driver’s heart rate. We evaluated the perfor-
mance of the proposed system by simulations and experiments. From the evaluation
results, we conclude that the vehicle’s inside temperature, noise level, vehicle
speed and driver’s heart rate have different effects on the decision of the risk
level.
In the future, we would like to carry out extensive simulations and experiments
to evaluate the proposed system and compare its performance with other systems.

A Comparison Study of Constriction and
Linearly Decreasing Vmax Replacement
Methods for Wireless Mesh Networks by
WMN-PSOHC-DGA Simulation System

Admir Barolli1, Shinji Sakamoto2, Heidi Durresi3, Seiji Ohara4, Leonard Barolli3(B), and Makoto Takizawa5

1 Department of Information Technology, Aleksander Moisiu University of Durres, L.1, Rruga e Currilave, Durres, Albania
admir.barolli@gmail.com
2 Department of Computer and Information Science, Seikei University, 3-3-1 Kichijoji-Kitamachi, Musashino-shi, Tokyo 180-8633, Japan
shinji.sakamoto@ieee.org
3 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
hdurresi@gmail.com, barolli@fit.ac.jp
4 Graduate School of Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
seiji.ohara.19@gmail.com
5 Department of Advanced Sciences, Faculty of Science and Engineering, Hosei University, Kajino-Machi, Koganei-Shi, Tokyo 184-8584, Japan
makoto.takizawa@computer.org

Abstract. Wireless Mesh Networks (WMNs) are becoming an important
networking infrastructure because they have many advantages, such as low
cost and high-speed wireless Internet connectivity. In our previous work, we
implemented a Particle Swarm Optimization (PSO) and Hill Climbing (HC)
based hybrid simulation system, called WMN-PSOHC, and a simulation system
based on Genetic Algorithm (GA), called WMN-GA, for solving the node
placement problem in WMNs. Then, we implemented a hybrid simulation system
based on PSOHC and Distributed GA (DGA), called WMN-PSOHC-DGA. In this
paper, we analyze the performance of WMNs using the WMN-PSOHC-DGA
simulation system considering the Constriction Method (CM) and the Linearly
Decreasing Vmax Method (LDVM). Simulation results show that CM achieves
better performance than LDVM.

1 Introduction

Wireless networks and devices are becoming increasingly popular and they
provide users access to information and communication anytime and any-
where [3,6,10–12,17,18,22,27,28,31]. Wireless Mesh Networks (WMNs) are

gaining a lot of attention because of their low-cost nature, which makes them
attractive for providing wireless Internet connectivity. A WMN is dynamically
self-organized and self-configured, with the nodes in the network automatically
establishing and maintaining mesh connectivity among themselves (creating, in
effect, an ad hoc network). This feature brings many advantages to WMNs, such
as low up-front cost, easy network maintenance, robustness and reliable service
coverage [1]. Moreover, such an infrastructure can be used to deploy community
networks, metropolitan area networks, municipal and corporate networks, and
to support applications for urban areas, medical, transport and surveillance
systems.
Mesh node placement in WMNs can be seen as a family of problems, which are
shown (through graph-theoretic approaches or placement problems, e.g. [8,15])
to be computationally hard to solve for most of the formulations [34]. We consider
the version of the mesh router nodes placement problem in which we are given
a grid area where a number of mesh router nodes are to be deployed, together
with a number of mesh client nodes of fixed positions (of an arbitrary
distribution) in the grid area. The objective is to find a location assignment for
the mesh routers to the cells of the grid area that maximizes the network
connectivity and client coverage. Node placement problems are known to be
computationally hard to solve [13,35]. Intelligent algorithms have recently been
investigated for this problem [4,5,9,16,19,20,24,25,29,30].
In [28], we implemented a Particle Swarm Optimization (PSO) and Hill
Climbing (HC) based simulation system, called WMN-PSOHC. Also, we
implemented another simulation system based on Genetic Algorithm (GA),
called WMN-GA [4,14], for solving the node placement problem in WMNs.
Then, we designed a hybrid intelligent system based on PSO, HC and DGA,
called WMN-PSOHC-DGA [26].
In this paper, we evaluate the performance of WMNs using WMN-PSOHC-
DGA simulation system considering Constriction Method (CM) and Linearly
Decreasing Vmax Method (LDVM).
The rest of the paper is organized as follows. The mesh router nodes place-
ment problem is defined in Sect. 2. We present our designed and implemented
hybrid simulation system in Sect. 3. The simulation results are given in Sect. 4.
Finally, we give conclusions and future work in Sect. 5.

2 Node Placement Problem in WMNs


For this problem, we have a grid area arranged in cells, in which we want to find
where to distribute a number of mesh router nodes and a number of mesh client
nodes of fixed positions (of an arbitrary distribution) in the considered area. The
objective is to find a location assignment for the mesh routers to the area that
maximizes the network connectivity and client coverage. Network connectivity
is measured by the Size of Giant Component (SGC) of the resulting WMN graph,
while the user coverage is simply the number of mesh client nodes that fall within
the radio coverage of at least one mesh router node and is measured by the
Number of Covered Mesh Clients (NCMC).

An instance of the problem consists of the following.


• N mesh router nodes, each having its own radio coverage, defining thus a
vector of routers.
• An area W × H where to distribute the N mesh routers. Positions of mesh
routers are not pre-determined and are to be computed.
• M client mesh nodes located at arbitrary points of the considered area,
defining a matrix of clients.
It should be noted that network connectivity and user coverage are among the
most important metrics in WMNs and directly affect the network performance.
In this work, we have considered a bi-objective optimization in which we first
maximize the network connectivity of the WMN (through the maximization of
the SGC) and then the maximization of the NCMC.
In fact, we can formalize an instance of the problem by constructing an
adjacency matrix of the WMN graph, whose nodes are router nodes and client
nodes and whose edges are links between nodes in the mesh network. Each mesh
node in the graph is a triple v = <x, y, r>, where (x, y) is the 2D location point
and r is the radius of the transmission range. There is an arc between two nodes
u and v if v is within the circular transmission area of u.
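To make these two metrics concrete, the following is a minimal Python sketch (not taken from the WMN-PSOHC-DGA implementation) that computes SGC and NCMC for a candidate placement, assuming routers and clients are given as lists of (x, y) coordinates and a common transmission radius:

```python
import math
from collections import deque

def sgc_and_ncmc(routers, clients, radius):
    """routers, clients: lists of (x, y); radius: transmission radius.
    Returns (SGC, NCMC) for this placement (illustrative sketch only)."""
    n = len(routers)
    # Router i and j are linked if j lies within i's transmission circle.
    adj = [[j for j in range(n) if j != i and
            math.dist(routers[i], routers[j]) <= radius] for i in range(n)]

    # SGC: size of the largest connected component of the router graph.
    seen, sgc = set(), 0
    for s in range(n):
        if s in seen:
            continue
        comp, queue = 0, deque([s])
        seen.add(s)
        while queue:
            u = queue.popleft()
            comp += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        sgc = max(sgc, comp)

    # NCMC: number of clients covered by at least one router.
    ncmc = sum(1 for c in clients
               if any(math.dist(c, r) <= radius for r in routers))
    return sgc, ncmc
```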

3 Proposed and Implemented Simulation System


3.1 WMN-PSOHC-DGA Hybrid Simulation System
Distributed Genetic Algorithms (DGA) have received attention from various
fields of science. DGA has shown its usefulness for the resolution of many
computationally hard combinatorial optimization problems. Also, Particle Swarm
Optimization (PSO) has been investigated for solving NP-hard problems.
Velocities and Positions of Particles
WMN-PSOHC-DGA decides the velocity of particles by a random process
considering the area size. For instance, when the area size is W × H, the velocity
is decided randomly from −√(W² + H²) to √(W² + H²). Each particle’s velocity
is updated by a simple rule [21].
For the HC mechanism, the next position of each particle is used as the
neighbor solution s′. The fitness function f gives points to the current solution s.
If f(s′) is better than f(s), then s is updated to s′. However, if f(s′) is not better
than f(s), s is not updated. It should be noted that in this case the positions are
not updated, but the velocities are still updated even if f(s) is better than f(s′).
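A schematic outline of this acceptance step is given below; the particle fields and the helpers update_velocity() and move() are hypothetical names introduced only for illustration, and the velocity update itself is assumed to follow a standard PSO-style rule:

```python
def psohc_step(particle, fitness, update_velocity, move):
    """One PSOHC iteration for a particle-pattern (illustrative sketch).

    The position reached through the velocity is treated as a neighbor
    solution s'; positions are kept only if s' improves the fitness,
    while velocities are always updated."""
    s = particle.positions                      # current solution s
    particle.velocities = update_velocity(particle)
    s_prime = move(s, particle.velocities)      # neighbor solution s'
    if fitness(s_prime) > fitness(s):           # accept only improvements
        particle.positions = s_prime
    # otherwise keep s; the new velocities are retained either way
    return particle
```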
Routers Replacement Methods
A mesh router has x, y positions and velocity. Mesh routers are moved based on
velocities. There are many router replacement methods. In this paper, we use
CM and LDVM.
Constriction Method (CM)
CM is a method in which the PSO parameters are set to a weak stable region
(ω = 0.729, C1 = C2 = 1.4955), based on the analysis of PSO by M. Clerc
et al. [2,7,33].

Fig. 1. Model of WMN-PSOHC-DGA migration.

Fig. 2. Relationship among global solution, particle-patterns and mesh routers in PSOHC part.

Linearly Decreasing Vmax Method (LDVM)
In LDVM, the PSO parameters are set to the unstable region (ω = 0.9,
C1 = C2 = 2.0). A value Vmax, which is the maximum velocity of particles, is
considered. With increasing iterations of the computation, Vmax is decreased
linearly [23,32].
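The two router replacement methods can be viewed as two parameterizations of the velocity update. The following sketch assumes the standard PSO velocity update form with the parameter values quoted above; it is illustrative and not the simulator's actual code:

```python
import math
import random

def update_velocity(v, x, pbest, gbest, omega, c1, c2):
    """Assumed PSO velocity update: inertia plus cognitive and social
    components weighted by uniform random numbers."""
    return (omega * v
            + c1 * random.random() * (pbest - x)
            + c2 * random.random() * (gbest - x))

def velocity_cm(v, x, pbest, gbest):
    # Constriction Method: weak stable region (omega = 0.729, C1 = C2 = 1.4955).
    return update_velocity(v, x, pbest, gbest, 0.729, 1.4955, 1.4955)

def velocity_ldvm(v, x, pbest, gbest, t, t_max, W, H):
    # LDVM: unstable region (omega = 0.9, C1 = C2 = 2.0) with a maximum
    # velocity Vmax that decreases linearly with the iteration count t.
    v_new = update_velocity(v, x, pbest, gbest, 0.9, 2.0, 2.0)
    v_max = math.sqrt(W ** 2 + H ** 2) * (1.0 - t / t_max)
    return max(-v_max, min(v_max, v_new))
```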

DGA Operations
Population of individuals: Unlike local search techniques that construct a path
in the solution space by jumping from one solution to another through local
perturbations, DGA uses a population of individuals, thus giving the search a
larger scope and more chances to find better solutions. This feature is also known
as the “exploration” process, in contrast to the “exploitation” process of local
search methods.
Selection: The selection of individuals to be crossed is another important
aspect of DGA, as it impacts the convergence of the algorithm. Several selection
schemes have been proposed in the literature for selection operators, trying to
cope with the premature convergence of DGA. There are many selection methods
in GA. In our system, we implement two selection methods: the random method
and the roulette wheel method.
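As an illustration of the second option, a roulette wheel selection step can be sketched as follows (assuming non-negative fitness values; this is not the system's actual implementation):

```python
import random

def roulette_wheel_select(population, fitnesses):
    """Select one individual with probability proportional to its fitness
    (assumes non-negative fitness values; illustrative sketch)."""
    total = sum(fitnesses)
    if total == 0:                      # degenerate case: fall back to random
        return random.choice(population)
    pick = random.uniform(0, total)
    cumulative = 0.0
    for individual, fit in zip(population, fitnesses):
        cumulative += fit
        if cumulative >= pick:
            return individual
    return population[-1]
```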
Crossover operators: The use of crossover operators is one of the most important
characteristics. The crossover operator is the means by which DGA transmits the
best genetic features of parents to offspring during the generations of the
evolution process. Many crossover operators have been proposed, such as Blend
Crossover (BLX-α), Unimodal Normal Distribution Crossover (UNDX) and
Simplex Crossover (SPX).
Mutation operators: These operators intend to improve the individuals of a
population by small local perturbations. They aim to provide a component of
randomness in the neighborhood of the individuals of the population. In our
system, we implemented two mutation methods: uniformly random mutation
and boundary mutation.
Escaping from local optima: GA itself has the ability to avoid falling
prematurely into local optima and can eventually escape from them during the
search process. DGA has one more mechanism to escape from local optima by
considering several islands. Each island runs a GA for optimization and migrates
its genes, providing an additional ability to avoid local optima.
Convergence: The convergence of the algorithm is the mechanism by which
DGA reaches good solutions. A premature convergence of the algorithm would
cause all individuals of the population to be similar in their genetic features;
thus the search would become ineffective and the algorithm would get stuck in
local optima. Maintaining the diversity of the population is therefore very
important for this family of evolutionary algorithms.
In the following, we present our proposed and implemented simulation system
called WMN-PSOHC-DGA. We show the fitness function, migration function,
particle-pattern, gene coding and client distributions.
Fitness Function
The determination of an appropriate fitness function, together with the
chromosome encoding, is crucial to the performance. Therefore, one of the most
important things is to decide on an appropriate objective function and its
encoding. In our case, each particle-pattern and gene has its own fitness value,
which is compared with the other fitness values in order to share information on
the global solution. The fitness function follows a hierarchical approach in which
the main objective is to maximize the SGC in the WMN. Thus, the fitness
function of this scenario is defined as

Fitness = 0.7 × SGC(xij, yij) + 0.3 × NCMC(xij, yij).
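A direct reading of this weighted fitness could look like the sketch below, where compute_sgc and compute_ncmc are assumed helper routines (for example, along the lines of the sketch at the end of Sect. 2) that evaluate connectivity and coverage of a particle-pattern:

```python
def fitness(particle_pattern, clients, radius,
            compute_sgc, compute_ncmc, alpha=0.7, beta=0.3):
    """Weighted bi-objective fitness: 0.7 * SGC + 0.3 * NCMC.

    compute_sgc / compute_ncmc are assumed helpers evaluating the router
    graph connectivity and the client coverage of the placement."""
    sgc = compute_sgc(particle_pattern, radius)
    ncmc = compute_ncmc(particle_pattern, clients, radius)
    return alpha * sgc + beta * ncmc
```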

Migration Function
Our implemented simulation system uses the Migration function as shown in
Fig. 1. The Migration function swaps solutions between the PSOHC part and
the DGA part.
Particle-Pattern and Gene Coding
In order to swap solutions, we design the particle-patterns and gene coding
carefully. A particle is a mesh router. Each particle has a position in the
considered area and velocities. A fitness value of a particle-pattern is computed
from the combination of mesh router and mesh client positions. In other words,
each particle-pattern is a solution, as shown in Fig. 2.
A gene describes a WMN. Each individual has its own combination of mesh
nodes. In other words, each individual has a fitness value. Therefore, the
combination of mesh nodes is a solution.

Table 1. WMN-PSOHC-DGA parameters.

Parameters Values
Clients distribution Normal distribution
Area size 32.0 × 32.0
Number of mesh routers 16
Number of mesh clients 48
Number of migrations 200
Evolution steps 9
Number of GA islands 16
Radius of a mesh router 2.0
Selection method Roulette wheel method
Crossover method SPX
Mutation method Boundary mutation
Crossover rate 0.8
Mutation rate 0.2
Replacement method CM, LDVM

4 Simulation Results
In this section, we show simulation results using the WMN-PSOHC-DGA system.
In this work, we analyze the performance of WMNs considering the CM and LDVM
router replacement methods. The number of mesh routers is set to 16 and the
number of mesh clients to 48. We conducted each simulation 100 times in order
to avoid the effect of randomness and obtain a general view of the results. The
parameter settings for WMN-PSOHC-DGA are shown in Table 1.
We show simulation results in Figs. 3 and 4. We see that for both SGC and
NCMC, the performance of CM is better than LDVM.

Fig. 3. Simulation results of WMN-PSOHC-DGA for SGC.



Fig. 4. Simulation results of WMN-PSOHC-DGA for NCMC.

5 Conclusions
In this work, we evaluated the performance of WMNs using a hybrid simulation
system based on PSOHC and DGA (called WMN-PSOHC-DGA), considering the
CM and LDVM router replacement methods. Simulation results show that CM
performs better than LDVM.
In our future work, we would like to evaluate the performance of the proposed
system for different parameters and patterns.

References
1. Akyildiz, I.F., Wang, X., Wang, W.: Wireless mesh networks: a survey. Comput.
Netw. 47(4), 445–487 (2005)
2. Barolli, A., Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L., Takizawa, M.: Perfor-
mance evaluation of WMNs by WMN-PSOSA simulation system considering con-
striction and linearly decreasing Vmax methods. In: International Conference on
P2P, Parallel, Grid, Cloud and Internet Computing, pp. 111–121. Springer (2017)
3. Barolli, A., Sakamoto, S., Barolli, L., Takizawa, M.: Performance analysis of simula-
tion system based on particle swarm optimization and distributed genetic algorithm
for WMNs considering different distributions of mesh clients. In: International Con-
ference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp.
32–45. Springer (2018)
4. Barolli, A., Sakamoto, S., Ozera, K., Barolli, L., Kulla, E., Takizawa, M.: Design
and implementation of a hybrid intelligent system based on particle swarm opti-
mization and distributed genetic algorithm. In: International Conference on Emerg-
ing Internetworking, Data & Web Technologies, pp. 79–93. Springer (2018)
5. Barolli, A., Sakamoto, S., Barolli, L., Takizawa, M.: Performance analysis of WMNs
by WMN-PSODGA simulation system considering Weibull and chi-square client
distributions. In: International Conference on Advanced Information Networking
and Applications, pp. 366–375. Springer (2019)
6. Barolli, A., Sakamoto, S., Ohara, S., Barolli, L., Takizawa, M.: Performance analy-
sis of WMNs by WMN-PSOHC-DGA simulation system considering random iner-
tia weight and linearly decreasing Vmax router replacement methods. In: Inter-
national Conference on Complex, Intelligent, and Software Intensive Systems, pp.
13–21. Springer (2019)

7. Clerc, M., Kennedy, J.: The particle swarm-explosion, stability, and convergence in
a multidimensional complex space. IEEE Trans. Evol. Comput. 6(1), 58–73 (2002)
8. Franklin, A.A., Murthy, C.S.R.: Node placement algorithm for deployment of two-
tier wireless mesh networks. In: Proceedings of Global Telecommunications Con-
ference, pp. 4823–4827 (2007)
9. Girgis, M.R., Mahmoud, T.M., Abdullatif, B.A., Rabie, A.M.: Solving the wireless
mesh network design problem using genetic algorithm and simulated annealing
optimization methods. Int. J. Comput. Appl. 96(11), 1–10 (2014)
10. Inaba, T., Obukata, R., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: Performance
evaluation of a QoS-aware fuzzy-based CAC for LAN access. Int. J. Space-Based
Situated Comput. 6(4), 228–238 (2016)
11. Inaba, T., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: A testbed for admission
control in WLAN: a fuzzy approach and its performance evaluation. In: Interna-
tional Conference on Broadband and Wireless Computing, Communication and
Applications, pp. 559–571. Springer (2016)
12. Inaba, T., Ozera, K., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: A testbed
for admission control in WLANs: effects of RSSI on connection keep-alive time.
In: The 31st International Conference on Advanced Information Networking and
Applications Workshops (WAINA-2017), pp. 722–729. IEEE (2017)
13. Maolin, T., et al.: Gateways placement in backbone wireless mesh networks. Int.
J. Commun. Netw. Syst. Sci. 2(1), 44 (2009)
14. Matsuo, K., Sakamoto, S., Oda, T., Barolli, A., Ikeda, M., Barolli, L.: Performance
analysis of WMNs by WMN-GA simulation system for two WMN architectures
and different TCP congestion-avoidance algorithms and client distributions. Int.
J. Commun. Netw. Distrib. Syst. 20(3), 335–351 (2018)
15. Muthaiah, S.N., Rosenberg, C.P.: Single gateway placement in wireless mesh net-
works. In: Proceedings of 8th International IEEE Symposium on Computer Net-
works, pp. 4754–4759 (2008)
16. Naka, S., Genji, T., Yura, T., Fukuyama, Y.: A hybrid particle swarm optimization
for distribution state estimation. IEEE Trans. Power Syst. 18(1), 60–68 (2003)
17. Ohara, S., Barolli, A., Sakamoto, S., Barolli, L.: Performance analysis of WMNs
by WMN-PSODGA simulation system considering load balancing and client uni-
form distribution. In: International Conference on Innovative Mobile and Internet
Services in Ubiquitous Computing, pp. 25–38. Springer (2019)
18. Ozera, K., Inaba, T., Bylykbashi, K., Sakamoto, S., Ikeda, M., Barolli, L.: A WLAN
triage testbed based on fuzzy logic and its performance evaluation for different
number of clients and throughput parameter. Int. J. Grid Util. Comput. 10(2),
168–178 (2019)
19. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: An integrated simulation
system considering WMN-PSO simulation system and network simulator 3. In:
International Conference on Broadband and Wireless Computing, Communication
and Applications, pp. 187–198. Springer (2016)
20. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation and evalu-
ation of a simulation system based on particle swarm optimisation for node place-
ment problem in wireless mesh networks. Int. J. Commun. Netw. Distrib. Syst.
17(1), 1–13 (2016)
21. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation of a
new replacement method in WMN-PSO simulation system and its performance
evaluation. In: The 30th IEEE International Conference on Advanced Information
Networking and Applications (AINA-2016), pp. 206–211 (2016). https://doi.org/
10.1109/AINA.2016.42

22. Sakamoto, S., Obukata, R., Oda, T., Barolli, L., Ikeda, M., Barolli, A.: Performance
analysis of two wireless mesh network architectures by WMN-SA and WMN-TS
simulation systems. J. High Speed Netw. 23(4), 311–322 (2017)
23. Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L.: Performance evaluation of WMNs
by WMN-PSOSA simulation system considering constriction and linearly decreas-
ing inertia weight methods. In: International Conference on Network-Based Infor-
mation Systems, pp. 3–13. Springer (2017)
24. Sakamoto, S., Ozera, K., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation
of intelligent hybrid systems for node placement in wireless mesh networks: a com-
parison study of WMN-PSOHC and WMN-PSOSA. In: International Conference
on Innovative Mobile and Internet Services in Ubiquitous Computing, pp. 16–26.
Springer (2017)
25. Sakamoto, S., Ozera, K., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation of
WMN-PSOHC and WMN-PSO simulation systems for node placement in wireless
mesh networks: a comparison study. In: International Conference on Emerging
Internetworking, Data & Web Technologies, pp. 64–74. Springer (2017)
26. Sakamoto, S., Barolli, A., Barolli, L., Takizawa, M.: Design and implementation of
a hybrid intelligent system based on particle swarm optimization, hill climbing and
distributed genetic algorithm for node placement problem in WMNs: a compari-
son study. In: The 32nd IEEE International Conference on Advanced Information
Networking and Applications (AINA-2018), pp. 678–685. IEEE (2018)
27. Sakamoto, S., Ozera, K., Barolli, A., Barolli, L., Kolici, V., Takizawa, M.: Perfor-
mance evaluation of WMN-PSOSA considering four different replacement methods.
In: International Conference on Emerging Internetworking, Data & Web Technolo-
gies, pp. 51–64. Springer (2018)
28. Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L.: Implementation of intelligent hybrid
systems for node placement problem in WMNs considering particle swarm opti-
mization, hill climbing and simulated annealing. Mobile Netw. Appl. 23(1), 27–33
(2018)
29. Sakamoto, S., Barolli, L., Okamoto, S.: Performance evaluation of WMNs by
WMN-PSOSA system considering chi-square and exponential client distributions.
In: International Conference on Advanced Information Networking and Applica-
tions, pp. 397–406. Springer (2019)
30. Sakamoto, S., Ohara, S., Barolli, L., Okamoto, S.: Performance evaluation of
WMNs by WMN-PSOHC system considering random inertia weight and linearly
decreasing inertia weight replacement methods. In: International Conference on
Innovative Mobile and Internet Services in Ubiquitous Computing, pp. 39–48.
Springer (2019)
31. Sakamoto, S., Ozera, K., Barolli, A., Ikeda, M., Barolli, L., Takizawa, M.: Imple-
mentation of an intelligent hybrid simulation systems for WMNs based on particle
swarm optimization and simulated annealing: performance evaluation for different
replacement methods. Soft Comput. 23(9), 3029–3035 (2019)
32. Schutte, J.F., Groenwold, A.A.: A study of global optimization using particle
swarms. J. Global Optim. 31(1), 93–108 (2005)
33. Shi, Y.: Particle swarm optimization. IEEE Connect. 2(1), 8–13 (2004)
34. Vanhatupa, T., Hannikainen, M., Hamalainen, T.: Genetic algorithm to optimize
node placement and configuration for WLAN planning. In: The 4th IEEE Inter-
national Symposium on Wireless Communication Systems, pp. 612–616 (2007)
35. Wang, J., Xie, B., Cai, K., Agrawal, D.P.: Efficient mesh router placement in wire-
less mesh networks. In: Proceedings of IEEE International Conference on Mobile
Adhoc and Sensor Systems (MASS-2007), pp. 1–9 (2007)
Effect of Degree of Centrality Parameter
on Actor Selection in WSANs: A
Fuzzy-Based Simulation System and Its
Performance Evaluation

Donald Elmazi1(B), Miralda Cuka2, Makoto Ikeda1, Keita Matsuo1, and Leonard Barolli1

1 Department of Information and Communication Engineering, Fukuoka Institute of Technology (FIT), 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
donald.elmazi@gmail.com, makoto.ikd@acm.org, {kt-matsuo,barolli}@fit.ac.jp
2 Graduate School of Engineering, Fukuoka Institute of Technology (FIT), 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
mcuka91@gmail.com

Abstract. The growth of sensor networks and the importance of active
devices in the physical world have led to the development of Wireless Sensor
and Actor Networks (WSANs). WSANs consist of a large number of sensors
and a smaller number of actors. Whenever there is an emergency situation,
e.g., fire, earthquake, flood or enemy attack in the area, the sensor nodes have
the responsibility to sense it and send information towards an actor node.
Based on the gathered data, the actor nodes take prompt action. In this work,
we consider the actor node selection problem and propose a fuzzy-based
system (FBS) that, based on data provided by sensors and actors, selects an
appropriate actor node to carry out a task. We use four input parameters:
Degree of Centrality (DoC), Distance to Event (DE), Power Consumption (PC)
and Number of Sensors per Actor (NSA), and the output parameter is Actor
Selection Decision (ASD).

1 Introduction
Recent technological advances have led to the emergence of distributed Wireless
Sensor and Actor Networks (WSANs), which are capable of observing the physical
world, processing the data, making decisions based on the observations and
performing appropriate actions [1].
In WSANs, the devices deployed in the environment are sensors able to sense
environmental data, actors able to react by affecting the environment, or devices
that have both functions integrated. Actor nodes are equipped with two radio
transmitters: a low data rate transmitter to communicate with the sensors and a
high data rate interface for actor-actor communication. For example, in the case of a fire,

Fig. 1. Wireless Sensor Actor Network (WSAN).

sensors relay the exact origin and intensity of the fire to the actors so that they
can extinguish it before it spreads through the whole building or, in a more
complex scenario, save people who may be trapped by the fire [2–4].
To provide effective operation of a WSAN, it is very important that sensors and
actors coordinate in what are called sensor-actor and actor-actor coordination.
Coordination is not only important during task execution, but also during the
network’s self-improvement operations, e.g., connectivity restoration [5,6],
reliable service [7], Quality of Service (QoS) [8,9] and so on.
Degree of Centrality (DoC) indicates the number of connections of a node. A
higher value of centrality means that the node is more central and of higher
importance.
The role of load balancing in WSANs is to provide a reliable service in order
to achieve better performance with less effort. In this paper we consider the
load balancing issue. Load balancing identifies the optimal load on the nodes of
the network to increase the network efficiency.
In this paper, different from our previous research, we consider the DoC
parameter. The system is based on fuzzy logic and considers four input
parameters for actor selection. We show the simulation results for different
values of the parameters.
The remainder of the paper is organized as follows. In Sect. 2, we describe
the basics of WSANs including research challenges and architecture. In Sect. 3,
we describe the system model and its implementation. The simulation results
are shown in Sect. 4. Finally, conclusions and future work are given in Sect. 5.

2 WSANs

A WSAN is shown in Fig. 1. The main functionality of WSANs is to make
actors perform appropriate actions in the environment, based on the data sensed
by sensors and actors. When important data has to be transmitted (an event
has occurred), sensors may transmit their data back to the sink, which will
control the actors’ tasks from a distance, or transmit their data to the actors,
which can perform actions independently of the sink node. Here, the former
scheme is called the Semi-Automated Architecture and the latter the
Fully-Automated Architecture

Fig. 2. WSAN architectures.

Fig. 3. Proposed System.

(see Fig. 2). Obviously, both architectures can be used in different applications.
The Fully-Automated Architecture needs new sophisticated algorithms in order
to provide appropriate coordination between the nodes of the WSAN. On the
other hand, it has advantages such as low latency, low energy consumption,
long network lifetime [1], higher local position accuracy, higher reliability and
so on.

3 Proposed Fuzzy-Based System

Based on the WSAN characteristics and challenges, we consider the following
parameters for the implementation of our proposed system.
Degree of Centrality (DoC): Degree of Centrality (see Fig. 5) is a simple way
of measuring a node’s centrality. It counts how many neighbors a node has. A
node is important if it has many neighbors or many other nodes that link to it.
For example, if a node has 10 links, its DoC is 10. The higher the DoC, the more
connections the node has, which makes it more central.
Distance to Event (DE): The number of actors in a WSAN is smaller than
the number of sensors. Thus, when an actor is called for action near an event,
the distance from the actor to the event is important, because the longer the
distance, the more energy the actor will spend. Therefore, an actor which is
close to an event should be selected.

Fig. 4. FLC structure.

Fig. 5. Degree of Centrality example.

Fig. 6. Triangular and trapezoidal membership functions.

Power Consumption (PC): As actors are active in the monitored field, they
perform tasks and exchange data in different ways. Thus, they have to spend
energy (a limited resource) depending on the task and the application. It is
better that the actors which consume less power are selected to carry out a task,
so the network lifetime can be increased.
Number of Sensors per Actor (NSA): The number of sensors deployed in
an area for sensing any event may be in the order of hundreds or thousands.
So, in order to have a better coverage of these sensors, the number of sensors
covered by each actor node should be balanced.
Actor Selection Decision (ASD): Our system is able to decide the willingness
of an actor to be assigned a certain task at a certain time. The actors respond
in five different levels, which can be interpreted as:
• Very Low Selection Possibility (VLSP) - It is not worth assigning the task to
this actor.
• Low Selection Possibility (LSP) - There might be other actors which can do
the job better.
• Middle Selection Possibility (MSP) - The Actor is ready to be assigned a task,
but is not the “chosen” one.

• High Selection Possibility (HSP) - The actor takes responsibility for completing
the task.
• Very High Selection Possibility (VHSP) - The actor has almost all the required
information and potential and takes full responsibility.

Table 1. Parameters and their term sets for FLC.

Parameters Term sets
Degree of Centrality (DoC) Low (Lw), Middle (Mi), High (Hg)
Distance to Event (DE) Near (Ne), Moderate (Mo), Far (Fa)
Power Consumption (PC) Low (Lo), Medium (Mdm), High (Hi)
Number of Sensors per Actor (NSA) Few (Fw), Medium (Me), Many (My)
Actor Selection Decision (ASD) Very Low Selection Possibility (VLSP), Low Selection Possibility (LSP), Middle Selection Possibility (MSP), High Selection Possibility (HSP), Very High Selection Possibility (VHSP)
Fuzzy sets and fuzzy logic have been developed to manage vagueness and
uncertainty in the reasoning process of an intelligent system such as a
knowledge-based system, an expert system or a logic control system [10–22].
The structure of the proposed system is shown in Fig. 3. It consists of one
Fuzzy Logic Controller (FLC), which is the main part of our system; its basic
elements are shown in Fig. 4. They are the fuzzifier, the inference engine, the
Fuzzy Rule Base (FRB) and the defuzzifier.
As shown in Fig. 6, we use triangular and trapezoidal membership functions
for the FLC, because they are suitable for real-time operation [23]. The x0 in
f(x) is the center of the triangular function, x0 (x1) in g(x) is the left (right)
edge of the trapezoidal function, and a0 (a1) is the left (right) width of the
triangular or trapezoidal function. We explain the design of the FLC in detail
in the following.
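A minimal sketch of the triangular f(x) and trapezoidal g(x) membership functions with this parameterization is shown below; the actual parameter values used for each linguistic term are not reproduced here:

```python
def triangular(x, x0, a0, a1):
    """Triangular membership f(x): center x0, left width a0, right width a1."""
    if x0 - a0 < x <= x0:
        return (x - (x0 - a0)) / a0
    if x0 < x < x0 + a1:
        return ((x0 + a1) - x) / a1
    return 0.0

def trapezoidal(x, x0, x1, a0, a1):
    """Trapezoidal membership g(x): flat top between the left edge x0 and the
    right edge x1, with left width a0 and right width a1 on the slopes."""
    if x0 <= x <= x1:
        return 1.0
    if x0 - a0 < x < x0:
        return (x - (x0 - a0)) / a0
    if x1 < x < x1 + a1:
        return ((x1 + a1) - x) / a1
    return 0.0
```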
We use four input parameters for FLC:

• Degree of Centrality (DoC);


• Distance to Event (DE)
• Power Consumption (PC);
• Number of Sensors per Actor (NSA).

The term sets for each input linguistic parameter are defined respectively as
shown in Table 1.

T(DoC) = {Low (Lw), Middle (Mi), High (Hg)}
T(DE) = {Near (Ne), Moderate (Mo), Far (Fa)}
T(PC) = {Low (Lo), Medium (Mdm), High (Hi)}
T(NSA) = {Few (Fw), Medium (Me), Many (My)}

Fig. 7. Fuzzy membership functions.

The membership functions for the input parameters of the FLC are defined as:
μLw(DoC) = g(DoC; Lw0, Lw1, Lww0, Lww1)
μMi(DoC) = f(DoC; Mi0, Miw0, Miw1)
μHg(DoC) = g(DoC; Hg0, Hg1, Hgw0, Hgw1)
μNe(DE) = g(DE; Ne0, Ne1, New0, New1)
μMo(DE) = f(DE; Mo0, Mow0, Mow1)
μFa(DE) = g(DE; Fa0, Fa1, Faw0, Faw1)
μLo(PC) = g(PC; Lo0, Lo1, Low0, Low1)
μMdm(PC) = f(PC; Mdm0, Mdmw0, Mdmw1)
μHi(PC) = g(PC; Hi0, Hi1, Hiw0, Hiw1)
μFw(NSA) = g(NSA; Fw0, Fw1, Fww0, Fww1)
μMe(NSA) = f(NSA; Me0, Mew0, Mew1)
μMy(NSA) = g(NSA; My0, My1, Myw0, Myw1)

The subscripts w0 and w1 denote the left width and the right width, respectively.
The output linguistic parameter is the Actor Selection Decision (ASD). We
define the term set of ASD as:
{Very Low Selection Possibility (VLSP), Low Selection Possibility (LSP),
Middle Selection Possibility (MSP), High Selection Possibility (HSP),
Very High Selection Possibility (VHSP)}.

The membership functions for the output parameter ASD are defined as:
μVLSP(ASD) = g(ASD; VLSP0, VLSP1, VLSPw0)
μLSP(ASD) = f(ASD; LSP0, LSP1, LSPw0)
μMSP(ASD) = f(ASD; MSP0, MSP1, MSPw0)
μHSP(ASD) = f(ASD; HSP0, HSP1, HSPw0)
μVHSP(ASD) = g(ASD; VHSP0, VHSP1, VHSPw0)

The membership functions are shown in Fig. 7 and the Fuzzy Rule Base (FRB)
is shown in Table 2. The FRB forms a fuzzy set of dimensions |T(DoC)| ×
|T(DE)| × |T(PC)| × |T(NSA)|, where |T(x)| is the number of terms in T(x).
The FRB has 81 rules. The control rules have the form: IF “conditions” THEN
“control action”.
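To illustrate how the FLC can combine the FRB with these membership functions, the following is a schematic Mamdani-style evaluation, using the minimum for the rule firing strength and a weighted average of representative output values for defuzzification; the rule encoding and the ASD centers are assumptions for illustration rather than the paper's exact implementation:

```python
def infer_asd(doc, de, pc, nsa, input_mfs, rules, asd_centers):
    """Schematic fuzzy inference for the ASD output.

    input_mfs: dict mapping ('DoC', 'Lw') -> membership function, etc.
    rules: list of ((doc_term, de_term, pc_term, nsa_term), asd_term),
           i.e. the 81 rows of the FRB.
    asd_centers: representative crisp value for each ASD term,
                 e.g. {'VLSP': 0.1, ..., 'VHSP': 0.9} (assumed values)."""
    crisp = {'DoC': doc, 'DE': de, 'PC': pc, 'NSA': nsa}
    num, den = 0.0, 0.0
    for terms, asd_term in rules:
        # Firing strength: minimum of the four antecedent memberships.
        strength = min(
            input_mfs[(name, term)](crisp[name])
            for name, term in zip(('DoC', 'DE', 'PC', 'NSA'), terms))
        num += strength * asd_centers[asd_term]
        den += strength
    return num / den if den > 0 else 0.0
```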

4 Simulation Results

The simulation results for our system are shown in Figs. 8, 9 and 10. One of the
main issues in WSANs is power consumption: lower energy consumption of the
nodes improves the overall lifetime of the network. In Fig. 8, DoC is 0.1 and PC
is 0.9; for DE from 0.1 to 0.5, we see that ASD is decreased by 24%, and when
DE increases from 0.5 to 0.9, ASD is decreased by 13%. This is because actors
that are far from an event spend more energy.
In Fig. 9, DoC is increased from 0.1 to 0.5 and we see that the probability of
an actor being selected is also increased, because a higher DoC means that the
actor node has a higher number of links in the network. For PC = 0.9, the ASD
is decreased by 25% for DE from 0.1 to 0.5, and by 12% for DE from 0.5 to 0.9.
In Fig. 10, DoC is increased again, from 0.5 to 0.9, and the actor’s probability
of being selected is higher. From NSA 0.2 to 0.5, the ASD increases by 28%; on
the other hand, when NSA increases from 0.5 to 0.8, the ASD decreases by 40%.
In Fig. 10(c), for PC from 0.9 to 0.5 and from 0.5 to 0.1, ASD increases because
the power consumption decreases and the remaining energy of the nodes is higher.
If we compare the three graphs for DoC from 0.1 to 0.5 (Fig. 8(b) with
Fig. 9(b)) and from 0.5 to 0.9 (Fig. 9(b) with Fig. 10(b)), for PC = 0.9 and
DE = 0.5, the ASD is increased by 12% and 25%, respectively.

5 Conclusions and Future Work

In this paper, we proposed and implemented a fuzzy-based system for actor node
selection in WSANs, considering DoC as a new parameter. Considering that some
nodes are more reliable than others, our system decides which actor nodes are
better suited to carry out a task. From the simulation results, we conclude as
follows.

Table 2. FRB of proposed fuzzy-based system.

No. DoC DE PC NSA ASD No. DoC DE PC NSA ASD


1 Lw Ne Lo Fw VLSP 41 Mi Mo Mdm Me MSP
2 Lw Ne Lo Me LSP 42 Mi Mo Mdm My HSP
3 Lw Ne Lo My MSP 43 Mi Mo Hi Fw HSP
4 Lw Ne Mdm Fw LSP 44 Mi Mo Hi Me HSP
5 Lw Ne Mdm Me MSP 45 Mi Mo Hi My VHSP
6 Lw Ne Mdm My HSP 46 Mi Fa Lo Fw ELSP
7 Lw Ne Hi Fw MSP 47 Mi Fa Lo Me ELSP
8 Lw Ne Hi Me HSP 48 Mi Fa Lo My VLSP
9 Lw Ne Hi My VHSP 49 Mi Fa Mdm Fw VLSP
10 Lw Mo Lo Fw MSP 50 Mi Fa Mdm Me VLSP
11 Lw Mo Lo Me MSP 51 Mi Fa Mdm My VLSP
12 Lw Mo Lo My HSP 52 Mi Fa Hi Fw MSP
13 Lw Mo Mdm Fw HSP 53 Mi Fa Hi Me LSP
14 Lw Mo Mdm Me HSP 54 Mi Fa Hi My MSP
15 Lw Mo Mdm My VHSP 55 Hg Ne Lo Fw MSP
16 Lw Mo Hi Fw VHSP 56 Hg Ne Lo Me ELSP
17 Lw Mo Hi Me EHSP 57 Hg Ne Lo My ELSP
18 Lw Mo Hi My EHSP 58 Hg Ne Mdm Fw VLSP
19 Lw Fa Lo Fw ELSP 59 Hg Ne Mdm Me ELSP
20 Lw Fa Lo Me VLSP 60 Hg Ne Mdm My VLSP
21 Lw Fa Lo My LSP 61 Hg Ne Hi Fw LSP
22 Lw Fa Mdm Fw LSP 62 Hg Ne Hi Me VLSP
23 Lw Fa Mdm Me MSP 63 Hg Ne Hi My LSP
24 Lw Fa Mdm My MSP 64 Hg Mo Lo Fw MSP
25 Lw Fa Hi Fw MSP 65 Hg Mo Lo Me VLSP
26 Lw Fa Hi Me MSP 66 Hg Mo Lo My LSP
27 Lw Fa Hi My HSP 67 Hg Mo Mdm Fw MSP
28 Mi Ne Lo Fw ELSP 68 Hg Mo Mdm Me LSP
29 Mi Ne Lo Me VLSP 69 Hg Mo Mdm My MSP
30 Mi Ne Lo My VLSP 70 Hg Mo Hi Fw HSP
31 Mi Ne Mdm Fw VLSP 71 Hg Mo Hi Me MSP
32 Mi Ne Mdm Me LSP 72 Hg Mo Hi My HSP
33 Mi Ne Mdm My MSP 73 Hg Fa Lo Fw VHSP
34 Mi Ne Hi Fw LSP 74 Hg Fa Lo Me EHSP
35 Mi Ne Hi Me MSP 75 Hg Fa Lo My ELSP
36 Mi Ne Hi My HSP 76 Hg Fa Mdm Fw ELSP
37 Mi Mo Lo Fw VLSP 77 Hg Fa Mdm Me ELSP
38 Mi Mo Lo Me MSP 78 Hg Fa Mdm My VLSP
39 Mi Mo Lo My MSP 79 Hg Fa Hi Fw LSP
40 Mi Mo Mdm Fw MSP 80 Hg Fa Hi Me VLSP
81 Hg Fa Hi My MSP

DoC=0.1-DE=0.1 DoC=0.1-DE=0.5
1 1
PC=0.1 PC=0.1
0.9 PC=0.5 0.9 PC=0.5
PC=0.9 PC=0.9
0.8 0.8
0.7 0.7
0.6 0.6
ASD [unit]

ASD [unit]
0.5 0.5
0.4 0.4
0.3 0.3
0.2 0.2
0.1 0.1
0 0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
NSA NSA

(a) DE=0.1 (b) DE=0.5


DoC=0.1-DE=0.9
1
PC=0.1
0.9 PC=0.5
PC=0.9
0.8
0.7
0.6
ASD [unit]

0.5
0.4
0.3
0.2
0.1
0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
NSA

(c) DE=0.9

Fig. 8. Results for DoC = 0.1.

Fig. 9. Results for DoC = 0.5: ASD vs. NSA for PC = 0.1, 0.5, 0.9; (a) DE = 0.1, (b) DE = 0.5, (c) DE = 0.9.



Fig. 10. Results for DoC = 0.9: ASD vs. NSA for PC = 0.1, 0.5, 0.9; (a) DE = 0.1, (b) DE = 0.5, (c) DE = 0.9.

• When the DoC parameter is increased, the ASD parameter is increased, so the
probability that the system selects an actor node for the job is high.
• Actors are distributed in the network and they have different distances from
an event. Actors that are further away are less likely to be selected.
• When the DE parameter is increased, the ASD parameter is decreased, so the
probability that an actor node is selected for the required task is low.
• When PC decreases, the ASD increases, because the node’s energy is not being
consumed.
• If we compare the three graphs for DoC from 0.1 to 0.5 (Fig. 8(b) with Fig. 9(b))
and from 0.5 to 0.9 (Fig. 9(b) with Fig. 10(b)), for PC = 0.9 and DE = 0.5, the
ASD is increased by 12% and 25%, respectively.

In future work, we will also consider other parameters for actor selection and
perform extensive simulations to evaluate the proposed system. We will also
conduct experiments for other scenarios.

References
1. Akyildiz, I.F., Kasimoglu, I.H.: Wireless sensor and actor networks: research chal-
lenges. Ad Hoc Netw. J. 2(4), 351–367 (2004)
2. Akyildiz, I., Su, W., Sankarasubramaniam, Y., Cayirci, E.: Wireless sensor net-
works: a survey. Comput. Netw. 38(4), 393–422 (2002)

3. Boyinbode, O., Le, H., Takizawa, M.: A survey on clustering algorithms for wireless
sensor networks. Int. J. Space-Based Situated Comput. 1(2/3), 130–136 (2011)
4. Bahrepour, M., Meratnia, N., Poel, M., Taghikhaki, Z., Havinga, P.J.: Use of wire-
less sensor networks for distributed event detection in disaster management appli-
cations. Int. J. Space-Based Situated Comput. 2(1), 58–69 (2012)
5. Haider, N., Imran, M., Saad, N., Zakariya, M.: Performance analysis of reactive
connectivity restoration algorithms for wireless sensor and actor networks. In: IEEE
Malaysia International Conference on Communications (MICC-2013), pp. 490–495,
November 2013
6. Abbasi, A., Younis, M., Akkaya, K.: Movement-assisted connectivity restoration
in wireless sensor and actor networks. IEEE Trans. Parallel Distrib. Syst. 20(9),
1366–1379 (2009)
7. Li, X., Liang, X., Lu, R., He, S., Chen, J., Shen, X.: Toward reliable actor services
in wireless sensor and actor networks. In: 2011 IEEE 8th International Conference
on Mobile Ad-Hoc and Sensor Systems (MASS), pp. 351–360, October 2011
8. Akkaya, K., Younis, M.: COLA: a coverage and latency aware actor placement
for wireless sensor and actor networks. In: IEEE 64th Conference on Vehicular
Technology (VTC-2006) Fall, pp. 1–5, September 2006
9. Kakarla, J., Majhi, B.: A new optimal delay and energy efficient coordination algo-
rithm for WSAN. In: 2013 IEEE International Conference on Advanced Networks
and Telecommuncations Systems (ANTS), pp. 1–6, December 2013
10. Inaba, T., Sakamoto, S., Kolici, V., Mino, G., Barolli, L.: A CAC scheme based
on fuzzy logic for cellular networks considering security and priority parameters.
In: The 9-th International Conference on Broadband and Wireless Computing,
Communication and Applications (BWCCA-2014), pp. 340–346 (2014)
11. Spaho, E., Sakamoto, S., Barolli, L., Xhafa, F., Barolli, V., Iwashige, J.: A fuzzy-
based system for peer reliability in JXTA-overlay P2P considering number of inter-
actions. In: The 16th International Conference on Network-Based Information Sys-
tems (NBiS-2013), pp. 156–161 (2013)
12. Matsuo, K., Elmazi, D., Liu, Y., Sakamoto, S., Mino, G., Barolli, L.: FACS-MP: a
fuzzy admission control system with many priorities for wireless cellular networks
and its performance evaluation. J. High Speed Netw. 21(1), 1–14 (2015)
13. Grabisch, M.: The application of fuzzy integrals in multicriteria decision making.
Eur. J. Oper. Res. 89(3), 445–456 (1996)
14. Inaba, T., Elmazi, D., Liu, Y., Sakamoto, S., Barolli, L., Uchida, K.: Integrating
wireless cellular and ad-hoc networks using fuzzy logic considering node mobility
and security. In: The 29th IEEE International Conference on Advanced Information
Networking and Applications Workshops (WAINA-2015), pp. 54–60 (2015)
15. Kulla, E., Mino, G., Sakamoto, S., Ikeda, M., Caballé, S., Barolli, L.: FBMIS: a
fuzzy-based multi-interface system for cellular and ad hoc networks. In: Interna-
tional Conference on Advanced Information Networking and Applications (AINA-
2014), pp. 180–185 (2014)
16. Elmazi, D., Kulla, E., Oda, T., Spaho, E., Sakamoto, S., Barolli, L.: A comparison
study of two fuzzy-based systems for selection of actor node in wireless sensor actor
networks. J. Ambient Intell. Humaniz. Comput. 6(9), 635–645 (2015)
17. Zadeh, L.: Fuzzy logic, neural networks, and soft computing. Commun. ACM 37(3),
77–85 (1994)
18. Spaho, E., Sakamoto, S., Barolli, L., Xhafa, F., Ikeda, M.: Trustworthiness in P2P:
performance behaviour of two fuzzy-based systems for JXTA-overlay platform.
Soft. Comput. 18(9), 1783–1793 (2014)

19. Inaba, T., Sakamoto, S., Kulla, E., Caballe, S., Ikeda, M., Barolli, L.: An integrated
system for wireless cellular and ad-hoc networks using fuzzy logic. In: International
Conference on Intelligent Networking and Collaborative Systems (INCoS-2014),
pp. 157–162 (2014)
20. Matsuo, K., Elmazi, D., Liu, Y., Sakamoto, S., Barolli, L.: A multi-modal simula-
tion system for wireless sensor networks: a comparison study considering stationary
and mobile sink and event. J. Ambient Intell. Humaniz. Comput. 6(4), 519–529
(2015)
21. Kolici, V., Inaba, T., Lala, A., Mino, G., Sakamoto, S., Barolli, L.: A fuzzy-based
CAC scheme for cellular networks considering security. In: International Conference
on Network-Based Information Systems (NBiS-2014), pp. 368–373 (2014)
22. Matsuo, K., Elmazi, D., Liu, Y., Sakamoto, S., Mino, G., Barolli, L.: FACS-MP: a
fuzzy admission control system with many priorities for wireless cellular networks
and its performance evaluation. J. High Speed Netw. 21(1), 1–14 (2015)
23. Mendel, J.M.: Fuzzy logic systems for engineering: a tutorial. Proc. IEEE 83(3),
345–377 (1995)
Wind Power Forecasting Based
on Efficient Deep Convolution
Neural Networks

Sana Mujeeb1, Nadeem Javaid1(B), Hira Gul1, Nazia Daood1, Shaista Shabbir2, and Arooj Arif1

1 COMSATS University Islamabad, Islamabad 44000, Pakistan
nadeemjavaidqau@gmail.com
2 Virtual University, Kotli 11100, Pakistan
http://www.njavaid.com/

Abstract. Due to the depletion of fossil fuels and global warming, the
incorporation of alternative, low carbon emission energy generation has become
crucial for energy systems. Wind power is a popular energy source because of
its environmental and economic benefits. However, the uncertainty of wind
power makes its incorporation into energy systems difficult. To mitigate the risk
of demand-supply imbalance caused by wind power, an accurate estimation of
wind power is essential. Recognizing this challenging task, an efficient deep
learning based prediction model is proposed for wind power forecasting. In the
proposed model, the Wavelet Packet Transform (WPT) is used to decompose
the wind power signals. Along with the decomposed signals and lagged inputs,
multiple exogenous inputs (calendar variables, Numerical Weather Prediction
(NWP)) are used as input to forecast wind power. An Efficient Deep Convolution
Neural Network (EDCNN) is employed to forecast wind power. The proposed
model’s performance is evaluated on real data of the Maine wind farm of ISO
NE, USA.

Keywords: Data analytics · Wind power · Demand side management · Energy management · Forecasting · Deep learning

1 Introduction
Due to the continual decrease in fossil fuels, the energy crisis has become critical
[1]. To mitigate the energy crisis, regulatory acts that encourage the utilization
of renewable energy are promoted worldwide. Among the renewable energy
resources, wind energy, as an alternative to traditional generation, has attracted
a lot of attention. The reason for the popularity of wind power is its
environmentally friendly nature. Wind power has no carbon emission and
therefore helps in reducing environmental pollution [2]. It is introduced worldwide
as a way to reduce greenhouse gas emissions. According to the Global Wind
Energy Council [3], the cumulative capacity of wind power reached 486 GW
across the global

market in 2016. Wind power is expected to expand significantly, leading to an
overall zero-emission power system.
Wind power is not only environmentally friendly, but it also has a low
investment cost (due to the developing technology) [4]. In the USA, the U.S.
Department of Energy’s renewable integration target calls for providing 20% of
the total energy through wind by the year 2030 [5]. In this regard, the
Independent System Operators (ISOs) are producing significant wind power and
increasing their wind generation.
It is widely acknowledged that accurate Wind Power Forecasting (WPF)
significantly reduces the risks of incorporating wind power in power supply
systems [6]. Generally, the WPF results are in deterministic form (i.e., point
forecasts). Reducing the forecasting errors of WPF is the focus of many
researchers [7]. A point forecast is the estimated value of future wind energy.
However, wind power is a random variable having a Probability Density Function
(PDF), and a point forecast is unable to capture the uncertainty of this random
variable. This is the limitation of point forecasts. Therefore, point forecasts have
limited use in the stability and security analysis of power systems. To overcome
this limitation, deep learning methods are widely used in the field of WPF and
other electricity related forecasting tasks [8–10]. Deep Neural Networks (DNN)
have the inherent property of automatically modeling the wind power
characteristics [11].
Wind power has a chaotic nature. Therefore, the incorporation of wind power
in power supply systems is a risky task. To mitigate this risk, wind power
forecasting is the most popular method. Wind power is forecasted using classical
[9–17], statistical and artificial intelligence methods. In the literature, there are
two types of wind power forecasting techniques: time series [12] and multivariate
[13]. The accuracy of wind power forecasting is very important to avoid
demand-supply imbalance. Therefore, researchers are still competing to improve
the wind power forecasting accuracy.
Convolution Neural Network (CNN) is a state-of-the-art deep learning method.
A characteristic of CNN is that it can extract spatial features automatically.
CNN is the most popular method for extracting features from images and is
widely used in the field of computer vision. The efficient feature extraction
capability of CNN motivates us to exploit it for wind power forecasting. CNN
successfully extracts the spatio-temporal correlations in wind power data.

2 Contributions

In this paper, we are concerned with the problem of predicting wind power.
The contributions of this research work are listed below:

– For forecasting wind power, Numerical Weather Prediction (NWP) variables
are used along with lagged wind power and Wavelet Packet Transform (WPT)
decomposed past wind power.

Fig. 1. Overview of proposed system for wind power forecasting.

– A predictive Deep Convolution Neural Network (DCNN) model for accurate
wind power prediction is proposed, which employs an efficient activation
function and loss function in the output layer (Fig. 1).

3 Proposed Model
The proposed method for forecasting wind power generation and the power
management algorithm are discussed in this section.

3.1 Data Preprocess


The features and targets (wind power) are normalized using min-max
normalization, as shown in Eq. 1:

Xnor = (Xi − min(X)) / (max(X) − min(X))    (1)
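Eq. 1 corresponds to the usual column-wise min-max scaling; a short NumPy sketch (assuming each feature column has max(X) > min(X)):

```python
import numpy as np

def min_max_normalize(X):
    """Scale each feature column of X to [0, 1] as in Eq. 1."""
    X = np.asarray(X, dtype=float)
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    # Assumes max(X) > min(X) for every column.
    return (X - x_min) / (x_max - x_min)
```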

The inputs to the forecast model are shown in Table 1. Three types of inputs
are given to the forecasting model: (i) Numerical Weather Prediction (NWP)
variables: dew point temperature, dry bulb temperature and wind speed, (ii)
past lagged values of wind power, and (iii) wavelet packet decomposed wind
power. The wavelet decomposition is described in the next section.

3.2 Feature Engineering


The historical wind power signal is decomposed using the Wavelet Packet
Transform (WPT).

Table 1. Inputs to the forecast model.

Input Description
Dew point temperature Past NWP forecast
Dry bulb temperature Past NWP forecast
Wind speed Past NWP forecast
Lagged wind power 1 Wind power (t-24)
Lagged wind power 2 Wind power (t-25)
Decomposed wind power Wavelet decomposed past wind power
Hour Time of the day

Fig. 2. Wavelet packet tree with three levels.

The WPT is a general form of wavelet decomposition that performs a better
signal analysis. WPT was introduced in 1992 by Coifman and Wickerhauser [14].
Unlike the Discrete Wavelet Transform (DWT), the WPT waveforms, or packets,
are interpreted by three different parameters: frequency, position and scale (the
latter two similar to the DWT). For every orthogonal wavelet function, multiple
wavelet packets are generated, having different bases (as shown in Fig. 2). With
the help of these bases, the input signal can be encoded in such a way that the
global energy of the signal is preserved and the exact signal can be reconstructed
effectively. Multiple expansions of an input signal can be achieved using WPT.
The most suitable decomposition is selected by calculating the entropy (e.g.,
the Shannon entropy). The minimal representation of the relevant data based
on a cost function is calculated in WPT. The benefit of WPT is its characteristic
of analyzing a signal at different temporal as well as spatial positions. For a
highly nonlinear and oscillating signal like wind power, DWT does not guarantee
good results [15]. In WPT, both the approximation and detail coefficients are
further decomposed into approximation and detail coefficients as the level of
the tree goes deeper. The wavelet packet decomposition operation can be
expressed by Eqs. 2 and 3.

For a signal a to be decomposed, two filters of size 2N are applied to a. The
corresponding wavelets are h(n) and g(n):

W_{2n}(a) = √2 Σ_{k=0}^{2N−1} h(k) W_n(2a − k)    (2)

W_{2n+1}(a) = √2 Σ_{k=0}^{2N−1} g(k) W_n(2a − k)    (3)

where the scaling function is W_0(a) = φ(a) and the wavelet function is
W_1(a) = ψ(a).
After decomposing the past wind signals, the engineered features along with
NWP variables and lagged wind power are input to the proposed forecasting
model. The proposed forecasting model is discussed in the next section.
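One possible way to obtain such wavelet packet features is sketched below using the PyWavelets library; the choice of library, the db4 wavelet and the 3-level tree are assumptions made for illustration, since the paper does not specify them:

```python
import numpy as np
import pywt

def wpt_features(wind_power, wavelet="db4", level=3):
    """Decompose a historical wind power series with a wavelet packet
    transform and return the terminal-node coefficients as features.

    The 'db4' wavelet and the 3-level tree are illustrative assumptions,
    not taken from the paper."""
    wp = pywt.WaveletPacket(data=np.asarray(wind_power, dtype=float),
                            wavelet=wavelet, mode="symmetric",
                            maxlevel=level)
    nodes = wp.get_level(level, order="freq")   # leaf packets of the tree
    return {node.path: node.data for node in nodes}
```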

3.3 Efficient DCNN

The inputs are given to the Efficient Deep CNN (EDCNN) for predicting the
day-ahead hourly wind power (24 values). First, the functionality of a standard
CNN is discussed in this section; then, the proposed method EDCNN is explained.
CNN works on the principle of the visual system of the human brain. CNN has
an excellent capability of extracting deep underlying features from data. CNN
effectively identifies the spatially local correlations in data through the
convolution operation. In the convolution operation, a filter is applied to a block
of spatially adjacent neurons and the result is passed through an activation
function. The output of a convolution layer becomes the input to the neurons of
the next layer. Thus, the input to every neuron of a layer is the output of a
convolved block of the previous layer. Unlike ANN, CNN training is efficient due
to the weight sharing scheme, which improves the learning efficiency. CNN is
composed of three alternating layer types: (i) convolution layers, (ii) sampling
layers and (iii) fully connected layers. The convolution operation can be explained
by the following Eq. 4.
Suppose $X = [x_1, x_2, x_3, \ldots, x_n]$ are the training samples and $C = [c_1, c_2, c_3, \ldots, c_n]$ is the vector of corresponding targets, where $n$ is the number of training samples. The CNN attempts to learn the optimal filter weights and biases that minimize the forecasting error. The CNN can be defined as:

$Y_i^m = f(w^m \otimes X_i^m + b^m)$  (4)

where $i = 1, 2, \ldots, n$ and $m = 1, 2, \ldots, M$, with $M$ the number of layers to be learned. The filter weights of the $m$th layer are denoted by $w^m$, $b^m$ represents the corresponding biases, $\otimes$ refers to the convolution operation, $f(\cdot)$ is the nonlinear activation function and $Y_i^m$ is the feature map generated by sample $X_i$ at layer $m$.
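To make Eq. 4 concrete, the following toy sketch (illustrative only, not the authors' network) computes a 1-D feature map by sliding a shared filter over an input window and applying a nonlinearity:

```python
import numpy as np

def conv1d_feature_map(x, w, b, activation=np.tanh):
    """Y = f(w (conv) x + b): shared filter w and bias b slide over the input x (cf. Eq. 4)."""
    n, k = len(x), len(w)
    out = np.empty(n - k + 1)
    for i in range(n - k + 1):
        out[i] = activation(np.dot(w, x[i:i + k]) + b)
    return out

# Hypothetical normalized inputs and a 3-tap filter.
x = np.array([0.1, 0.4, 0.35, 0.8, 0.6, 0.2])
w = np.array([0.5, -0.2, 0.1])
print(conv1d_feature_map(x, w, b=0.05))
```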
The proposed forecasting method EDCNN has eleven layers: three convolution layers, three max pooling layers, two batch normalization layers, three ReLU (Rectified Linear Unit) layers, one modified fully connected layer and a

modified output layer (Enhanced Regression Output Layer, EROL). The functionality of two layers is modified in order to improve the forecasting performance of EDCNN.
According to the ANN literature, there is no standard way to choose an optimal activation function; however, it is well known that machine learning methods have an excellent capability for optimizing a model or function. On the basis of these facts, a modified activation function is employed in a hidden layer. The proposed activation function is an ensemble of three activation functions: hyperbolic tangent, sigmoid and radial basis function.
$TH = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$  (5)

$\sigma = \frac{e^{x}}{1 + e^{x}}$  (6)

$\phi(x, c) = \phi(\lVert x - c \rVert)$  (7)

$F = \frac{TH + \sigma + \phi}{3}$  (8)
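A minimal Python sketch of the ensemble activation in Eqs. 5–8 follows; the radial basis term is written as a Gaussian of the distance to a centre c, which is an assumption, since the paper only states that a radial basis function is used:

```python
import numpy as np

def ensemble_activation(x, c=0.0):
    """Average of tanh, sigmoid and a radial basis term (Eqs. 5-8)."""
    th = np.tanh(x)                      # Eq. 5
    sigma = 1.0 / (1.0 + np.exp(-x))     # Eq. 6, logistic sigmoid e^x / (1 + e^x)
    phi = np.exp(-np.abs(x - c) ** 2)    # Eq. 7, assumed Gaussian RBF centred at c
    return (th + sigma + phi) / 3.0      # Eq. 8

print(ensemble_activation(np.array([-2.0, 0.0, 2.0])))
```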
In the proposed output layer EROL, a modified objective function is embedded. The objective is to minimize the absolute percentage error between the forecasted values and the actual targets. The objective can be expressed as Eq. 9:

$\min \mathrm{Loss} = L(w, X_i, c_i) + F(Y^{k}, c)$  (9)

where $L(w, X_i, c_i)$ is the forecasting error or loss from sample $X_i$, and $F(Y^{k}, c)$ represents the objective. The inputs to the objective function $F(\cdot)$ are the feature maps generated at the $k$th layer and their respective targets $c$. The objective function is expressed as Eq. 10:

$F = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{Y_i - c_i}{Y_i} \right|$  (10)
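For reference, a small sketch of the absolute-percentage-error objective of Eq. 10 (the arrays are hypothetical; the actual EROL layer is part of the authors' MATLAB network):

```python
import numpy as np

def mape_objective(y_pred, y_true):
    """F = (100/n) * sum |(Y_i - c_i) / Y_i|, as in Eq. 10."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return 100.0 * np.mean(np.abs((y_pred - y_true) / y_pred))

# Hypothetical day-ahead forecast vs. observed wind power (MW).
forecast = np.array([420.0, 455.0, 510.0])
observed = np.array([400.0, 470.0, 500.0])
print(mape_objective(forecast, observed))
```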

Fig. 3. Wind power of all four seasons of a year (panels: Spring, Summer, Autumn, Winter; axes: Time (H) vs. Wind Power (MW)).



where $Y_i$ is the output of the output layer and $c_i$ is the desired (actual) target.

4 Results and Analysis

The proposed algorithms are implemented using MATLAB R2018a on a PC with a Core i3 processor and 4 GB RAM.

4.1 Data Description

Three years of hourly wind power data are taken from ISO New England's wind farm located in Maine [16]. The weather parameter, i.e., wind speed, is taken from the Maine weather station data repository.

4.2 Wind Power Analysis

The generated wind power depends strongly on the wind speed, which varies from season to season; in Maine, USA, the wind speed is affected by seasonality. Figure 3 shows one day of wind power for each of the four seasons. The wind power in autumn is higher compared to the other seasons, because of the fast winds in the coastal area of Maine where the wind turbines are installed.

4.3 Performance Evaluation


For performance evaluation of wind power forecasting, three evaluation indicators are used: MAE, NRMSE and MAPE (Fig. 4 and Table 2).

Fig. 4. All seasons predictions of wind power (panels: Spring, Summer, Autumn, Winter; observed vs. ECNN, SELU CNN and CNN; axes: Time (H) vs. Wind Power (MW)).



Table 2. MAPE, NRMSE and MAE of the proposed and compared methods.

Method Season MAPE NRMSE MAE


CNN Spring 8.42 2.34 3.34
Summer 8.23 2.27 3.24
Autumn 7.9 2.65 3.36
Winter 8.1 2.71 2.89
SELU CNN Spring 3.47 0.12 3.1
Summer 3.62 0.13 3.3
Autumn 3.45 0.12 3.4
Winter 3.27 0.17 3.2
EDCNN Spring 2.67 0.092 2.4
Summer 2.43 0.096 2.24
Autumn 2.56 0.085 2.67
Winter 2.62 0.094 2.18

Table 3. Diebold-Mariano test results at a 95% confidence level.

Method Season Diebold-Mariano


EDCNN vs. CNN Spring ✓
Summer ✓
Autumn ✓
Winter ✓
EDCNN vs. SELU CNN Spring ✓
Summer ✓
Autumn -
Winter ✓

4.4 Diebold-Mariano Test


The aforementioned error indicators are utilized for the accuracy comparison of forecasting models. However, lower error or higher accuracy of a model does not by itself guarantee its superiority over other models; a model is better than another model only if the difference between their accuracies is statistically significant. Different statistical tests are used to validate the significance of models, such as the Friedman test [17], error analysis [18] and the Diebold-Mariano (DM) test [19]. To validate the performance of the proposed forecasting model EDCNN, the well-known DM test, proposed by Diebold and Mariano in 1995 [19], is used. DM is widely used for the validation of wind power forecasting [20].
Let $[y_1, y_2, \ldots, y_n]$ be the vector of values to be forecasted, predicted by two forecasting models $M_1$ and $M_2$ with forecasting errors $[\varepsilon_1^{M_1}, \varepsilon_2^{M_1}, \ldots, \varepsilon_n^{M_1}]$ and $[\varepsilon_1^{M_2}, \varepsilon_2^{M_2}, \ldots, \varepsilon_n^{M_2}]$, respectively.

In DM, a loss function $L(\cdot)$ is applied to these errors and the loss differential is calculated as in Eq. 11 [21]:

$d_t^{M_1, M_2} = L(\varepsilon_t^{M_1}) - L(\varepsilon_t^{M_2})$  (11)

The DM test is applied to the forecasting results of EDCNN and the two compared methods, CNN and SELU CNN [13], for all four seasons of a year. The results of the DM test at a 95% confidence level are shown in Table 3. A check mark indicates that the performance of EDCNN is significantly better than that of the compared method; if the forecasting accuracy is not significantly improved, a hyphen is placed.
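A hedged sketch of the DM statistic computed from two forecast error series is given below, using the absolute-error loss and a simple large-sample normal approximation; the exact loss function and variance correction used by the authors are not specified in the paper:

```python
import numpy as np
from scipy import stats

def diebold_mariano(err1, err2, loss=np.abs):
    """DM statistic for the loss differential d_t = L(e1_t) - L(e2_t) (cf. Eq. 11)."""
    d = loss(np.asarray(err1)) - loss(np.asarray(err2))
    n = len(d)
    dm_stat = d.mean() / np.sqrt(d.var(ddof=1) / n)      # large-sample normal approximation
    p_value = 2.0 * (1.0 - stats.norm.cdf(abs(dm_stat)))
    return dm_stat, p_value

# Hypothetical hourly errors of EDCNN vs. a baseline CNN over one test day.
rng = np.random.default_rng(0)
stat, p = diebold_mariano(rng.normal(0, 10, 24), rng.normal(0, 20, 24))
print(f"DM = {stat:.2f}, p = {p:.3f}")  # p < 0.05 -> significant at the 95% level
```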

5 Conclusion
In this paper, the problem of predicting wind power generation is considered, in order to take part in the daily market that regulates supply and demand in the Maine electricity system. A deep-learning technique, EDCNN, is developed to accurately predict hourly day-ahead wind power on the Maine wind farm data. The numeric results validate the efficiency of the proposed model for wind power forecasting.

References
1. Zhao, Y.N., Ye, L., Li, Z., Song, X.R., Lang, Y.S., Su, J.: A novel bidirectional
mechanism based on time series model for wind power forecasting. Appl. Energy
177, 793–803 (2016)
2. Jong, P., Kiperstok, A., Sanchez, A.S., Dargaville, R., Torres, E.A.: Integrating
large scale wind power into the electricity grid in the Northeast of Brazil. Energy
100, 401–15 (2016)
3. Global Wind Energy Council. GWEC Global Wind Report (2016)
4. U.S. Department of Energy, Staff Report to the Secretary on Electricity Markets
and Reliability (2017)
5. U.S. Department of Energy, 20% Wind energy by 2030: increasing wind energy’s
contribution to US electricity supply, Energy Efficiency and Renewable Energy
(EERE) (2008)
6. Chen, Z.: Wind power in modern power systems. J. Mod. Power Syst. Clean Energy
1(1), 2–13 (2013)
7. Haque, A.U., Nehrir, M.H., Mandal, P.: A hybrid intelligent model for deterministic
and quantile regression approach for probabilistic wind power forecasting. IEEE
Trans. Power Syst. 29(4), 1663–1672 (2014)
8. Kazmi, H.S.Z., Javaid, N., Imran, M.: Towards energy efficiency and trustfulness
in complex networks using data science techniques and blockchain. MS thesis.
COMSATS University Islamabad (CUI), Islamabad, Pakistan, July 2019
9. Zahid, M., Javaid, N., Rasheed, M.B.: Balancing electricity demand and supply in
smart grids using blockchain. MS thesis. COMSATS University Islamabad (CUI),
Islamabad, Pakistan, July 2019

10. Bano, H., Javaid, N., Rasheed, M.B.: Electricity price and load forecasting using
enhanced machine learning techniques. MS thesis. COMSATS University Islam-
abad (CUI), Islamabad, Pakistan, July 2019
11. Juban, J., Siebert, N., Kariniotakis, G.N.: Probabilistic short-term wind power
forecasting for the optimal management of wind generation. In: Power Tech, 2007
IEEE Lausanne, pp. 683–688. IEEE (2007)
12. Wang, H.Z., Li, G.Q., Wang, G.B., Peng, J.C., Jiang, H., Liu, Y.T.: Deep learning
based ensemble approach for probabilistic wind power forecasting. Appl. Energy
15(188), 56–70 (2017)
13. Torres, J.M., Aguilar, R.M.: Using deep learning to predict complex systems: a
case study in wind farm generation. Complexity 2018, 10 (2018)
14. Coifman, R.R., Wickerhauser, M.V.: Entropy-based algorithms for best basis selection. IEEE Trans. Inf. Theory 38(2), 713–718 (1992)
15. Burrus, C.S., Gopinath, R., Guo, H.: Introduction to Wavelets and Wavelet Trans-
forms: A Primer. Prentice Hall, Upper Saddle River (1997)
16. ISO NE Market Operations Data. https://www.iso-ne.com. Accessed 20th Jan
2019
17. Derrac, J., Garcia, S., Molina, D., Herrera, F.: A practical tutorial on the use of
nonparametric statistical tests as a methodology for comparing evolutionary and
swarm intelligence algorithms. Swarm Evol. Comput. 1(1), 3–18 (2011)
18. Martin, P., Moreno, G., Rodriguez, F., Jimenez, J., Fernandez, I.: A hybrid app-
roach to short-term load forecasting aimed at bad data detection in secondary
substation monitoring equipment. Sensors 18(11), 3947 (2018)
19. Diebold, F.X., Mariano, R.S.: Comparing predictive accuracy. J. Bus. Econ. Stat. 13, 253–263 (1995)
20. Chen, H., Wan, Q., Wang, Y.: Refined Diebold-Mariano test methods for the eval-
uation of wind power forecasting models. Energies 7(7), 4185–4198 (2014)
21. Lago, J., De Ridder, F., De Schutter, B.: Forecasting spot electricity prices: deep
learning approaches and empirical comparison of traditional algorithms. Appl.
Energy 221, 386–405 (2018)
One Step Forward: Towards a Blockchain
Based Trust Model for WSNs

Abdul Mateen1 , Jawad Tanveer2 , Ashrafullah1 , Nasir Ali Khan1 ,


Mubariz Rehman1 , and Nadeem Javaid1(B)
1 COMSATS University Islamabad, Islamabad, Pakistan
ammateen49@gmail.com, ashrafullahmarwat12@gmail.com,
nasirkhan.online@gmail.com, mubarizrahman@gmail.com,
nadeemjavaidqau@gmail.com
2 Sejong University, Seoul, South Korea
comsats8@gmail.com

Abstract. Nowadays, Wireless Sensor Networks (WSNs) are facing various challenges. Cost efficiency, low energy consumption, reliable data communication between nodes and security are the major challenges in the field of WSNs. On the other hand, blockchain is also a very active domain in this era. Blockchain offers a remedy for some of the challenges faced by WSNs, e.g., secure data transactions and trustworthiness. Keeping the security issues in mind, we introduce blockchain into WSNs. In short, we propose a trust model to avoid malicious attacks and to keep the transacted data tamper-proof using the blockchain property of immutability. Moreover, an enhanced version of Proof of Stake (PoS), i.e., the Proof of Authority (PoA) consensus mechanism, is used to add new nodes to the network. Additionally, a smart contract is written to check the working status of nodes. Simulations are performed in order to record the transaction cost and execution cost.

Keywords: Blockchain · Trust model · Security · Proof of Authority

1 Introduction
Blockchain attracts great attention from researchers, who believe that this technology will bring remarkable changes and opportunities to industries. Blockchain is a very powerful technology for establishing trusted communication in a decentralized fashion. This technology was introduced back in 2008 and circulated by the cryptography mailing group. The main strength of blockchain lies in decentralization, which allows direct (peer to peer) transactions. This approach is also used in distributed systems, where trust between nodes is not needed to perform transactions. Blockchain adopts several mechanisms, such as time stamping, distributed consensus, data encryption and economic incentives. It is used to solve problems of inefficiency, high cost and insecure data storage. In recent years, research on blockchain technology has grown quickly with the rapid adoption and development of blockchain.
© Springer Nature Switzerland AG 2020
L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 57–69, 2020.
https://doi.org/10.1007/978-3-030-33509-0_6

A Wireless Sensor Network (WSN) comprises different types of small sensing devices, which are used to monitor physical conditions. Applications of WSNs include smart cities, military use, medicine and, most commonly, environmental monitoring [1]. Sensing nodes are deployed in the desired area in a desired fashion (random or static) for the purpose of monitoring, detecting and collecting information. WSNs face several challenges, e.g., throughput, routing, connectivity, void holes, small memory and, most importantly, security issues.
The threats faced by WSNs mainly come from two sources: firstly, external attacks on the network and, secondly, internal nodes of the network that become malicious [2]. Therefore, it is an essential security requirement for WSNs to have the capability to recognize and exclude internal malicious nodes, and how to solve this security issue for sensor nodes becomes a major challenge. The malicious node problem in WSNs can be addressed in one of two ways: (1) by proposing a secure model or (2) by proposing a secure WSN protocol. In this paper, we adopt the first approach and propose a trust model for WSNs using the concepts of blockchain.
In [8], Lin et al. proposed a blockchain based solution for Long Range Wide Area Networks (LoRaWAN). The authors integrate blockchain and LoRaWAN by considering crowd sensing and the sharing economy, and develop a LoRaWAN server to solve the problems of trust in private network operators and lack of network coverage. In this study, a mechanism is proposed to verify the existence of data at a specific time on a network. The authors in [14] propose a blockchain based location privacy protection incentive mechanism. A Confusion Mechanism Algorithm (CMA) is introduced to protect users' information by encrypting the information received from the sensors. Blockchain is also used to further secure users' information and to issue incentives based on the frequency of participation. Results show that the proposed mechanism increased user participation considerably, from 20% in the traditional mode to 80% in the proposed mode. However, the results obtained may be one-sided, as only a limited amount of data (100 pieces) was collected.
In this paper, we propose a blockchain based trust model for WSNs. This model is used to avoid malicious attacks and performs secure data transfer from the ordinary sensor nodes to the sink node. The main contributions of this paper are as follows:

• A trust model is proposed for avoiding malicious attacks.
• A smart contract is composed and simulations are performed using the following tools: Remix IDE, Ganache, MetaMask and MATLAB R2018a.
• A comparative analysis table for transaction and execution costs is presented.

The remainder of this paper is organized as follows: Sect. 2 covers the state of the art. In Sect. 3, the system model of this work is presented. Section 4 shows the simulation results. Finally, Sect. 5 concludes this work.

2 Literature Review and Problem Statement


The authors in [3–5] proposed a blockchain oriented secure service provisioning mechanism for lightweight Internet of Things (IoT) devices. They applied smart contracts to check the validity of acquired services, and achieved high throughput and low latency using a consortium blockchain with Proof of Authority (PoA). An analysis of packaging time, throughput and latency comparing PoA with Proof of Work (PoW) is also done in this work. Moreover, the results show that the proposed scheme guards lightweight devices against untrusted edge service providers and insecure services.
In [6], the researchers argue that conventional blockchain technology cannot be effortlessly applied to mobile devices, because PoW requires large computational ability and storage volume during the mining process. For this reason, the authors proposed a Mobile Edge Computing (MEC) enabled wireless blockchain framework in [7], in which they used stochastic geometry theory and an Alternating Direction Method of Multipliers (ADMM) based algorithm. The proposed algorithm is also compared with an existing centralized solution, and simulation results demonstrate that it is efficient.
A blockchain based incentive mechanism for data storage by the nodes of a WSN was proposed in [9]. The authors used the Provable Data Possession (PDP) technique instead of PoW to obtain better results. They also applied a preserving hash function to compare the existing data of nodes with the new data. The only problem with PDP is that it can identify damaged data on nodes but is unable to recover it.
The authors in [10] proposed a data transmission scheme based on a multi-link concurrent communication tree model to handle failed nodes in the blockchain. Results show that the proposed scheme works effectively when 15% of the nodes fail; however, if this number reaches 30%, communication time and delay increase. In [11], the problem of user access control for network optimization in data-intensive applications was identified. The proposed solution considers the authenticity of Channel State Information (CSI) using blockchain consensus and deep learning, and the analysis shows that the scheme increases spectral efficiency.
Branch based blockchain technology for Intelligent Vehicles (IVs) was proposed in [12]. Branching is done at a Locally Dynamic Blockchain (LDB) to handle the large amount of data generated by IVs, while the blockchain is used to keep track of the data generated by IVs and to verify it. Additionally, the concept of an Intelligent Vehicle Trust Point (IVTP) is introduced to build trust. The problem with branching is that duplicate state changes increase with increasing load.
Orchestration is the automated configuration, coordination and management of computer systems and software. A Blockchain-based Distributed Applications (DApps) framework for multi domain service orchestration was proposed in [13]. The authors used blockchain, smart contracts and DApps to solve the automation and distributed harmony issues in networking. Results show that it is essential for the Multi-domain Orchestrator (MdO) blockchain network to be secure, and that the transaction confirmation time should be well defined for better performance. At the same time, representing the smart contract and interpreting it in the proposed framework is still an open topic.
In [16], blockchain is integrated with the Internet of Vehicles (IoVs) to provide large and secure data storage. The authors designed a multi-blockchain architecture consisting of five blockchains according to the different data blocks to be stored. Results show that this integration provides large and secure data storage; high throughput is achieved with increasing data, but the delay also increases. In [17], the authors proposed a blockchain based vehicular network architecture for smart cities, using blockchain with smart contracts. However, the service providers are not rewarded, so they will not provide services effectively. Several other authors also employ blockchain in their research works [19–27] for multiple purposes.

2.1 Problem Statement


In conventional routing protocols, a central authority is required to facilitate the authentication and identification of every device. The research work in [18] implements blockchain in networks for avoiding malicious attacks; two types of attacks are considered in that paper: greyhole attacks and blackhole attacks. However, the performance of the network gradually decreased and some unnecessary computations were involved due to the Proof of Work (PoW) consensus algorithm. In the current work, a trust model is proposed to avoid malicious attacks and provide security to sensor networks using the concepts of blockchain. Moreover, PoW is replaced with PoA to avoid the unnecessary computations that were previously involved due to PoW.

3 Proposed System Model


In this section, the blockchain based system model for malicious node avoidance in WSNs is presented. The key components of this system model are classified as follows:

3.1 Ordinary Sensor Nodes


These sensor nodes only monitor the environment, collect real-time data and upload this data to the associated sink node.

3.2 Sink Nodes


These nodes perform three major responsibilities: first, data collection from the ordinary sensor nodes; second, new node addition using the PoA consensus mechanism; and last, execution of the smart contract published by the main server. Sink nodes differentiate data on the basis of the id and location of the ordinary node.

Fig. 1. Blockchain based system model

Each sink node has its own database consisting of hashes to keep a record of transactions. Every sink node has the ability to communicate with the ordinary sensor nodes, other sink nodes and the main server. A sink node uses private keys for accessing the data from the main server.

3.3 Main Server

The main server is also known as the endpoint or base station. The major tasks of the base station are publishing the smart contracts, issuing activities and processing the sensed data. The main server records each and every transaction along with the sink id and location in its immutable database. This database can only be accessed by the main server itself or by pre-authorized sink nodes.
It can be observed from Fig. 1 that the ordinary nodes are connected with the sink nodes. Every sink node gets data from its ordinary sensor nodes. Sink nodes can send their data to other sinks as well as to the main server, where a smart contract is implemented on the sink nodes and issued by the main server. Sink nodes can authenticate and blacklist any ordinary sensor node at any time on the detection of malicious activity. Each sink keeps a communication record of its own as well as of other nodes in its distributed ledger.
In this system model, the validity of data is checked at the sink nodes. Notably, access to the main server is only granted to the sink nodes. The main server checks the working status of sink nodes and ordinary nodes. It can also remove any node if (1) it is dead or (2) it is involved in any suspicious activity.

3.3.1 Hash Function


For each transaction, a hash is generated, which is called the transaction hash. A hash is simply a function that takes an input value and generates an output value; the output is deterministic for a given input. Mathematically, this is written as:

$f(a) = b$,  (1)

where $a$ is any input and $b$ is the associated output for $a$; e.g., the keccak-256 hash value for "hi" is "7624778dedc75f8b322b9fa1632a610d40b85e106c7d9bf0e743a9ce291b9c6f". Hash values are generally 'irreversible', which means that the input cannot be figured out from the output except by hit and trial.
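A minimal sketch of computing the same digest with web3.py's keccak helper (assuming the web3 package is available); the output should match the keccak-256 value quoted above:

```python
from web3 import Web3

# Deterministic: the same input always yields the same 32-byte digest.
digest = Web3.keccak(text="hi")
print(digest.hex())

# The mapping is effectively one-way: recovering "hi" from the digest is only
# feasible by hit and trial (brute-force or dictionary guessing).
```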

4 Simulation Results

In this section, we discuss the simulation results which are obtained using dif-
ferent tools and the reasons for these results. In Sect. 4.1, simulation tools are
explained. Moreover, results and their reasoning are discussed in Sect. 4.2.

4.1 Simulation Tools


In order to obtain the simulation results, we have used multiple tools. Four tools are used for developing and testing the smart contract. First, we write the smart contract in Remix IDE. Second, Ganache is used to show a clear visualization of the deployment of the smart contract. Third, MetaMask, an extension of the Chrome browser, is used to connect the Ethereum node with the browser. Fourth, the execution cost and transaction cost of different transactions are obtained from Remix IDE and later plotted with MATLAB.

Fig. 2. Network deployment cost (transaction vs. execution cost, in gas).



Table 1. Network deployment cost

Parameter Value
Status 0x1 Transaction mined and execution succeed
Transaction hash 0x98933, ..., 75f15
Contract address 0x08970, ..., 659fb
From 0xca35b, ..., a733c
To Clustering.(constructor)
Gas 300000000 gas
Transaction cost 5674465 gas
Execution cost 4272505 gas
Hash 0x98933, ..., 75f15
Input 0x608...40029
Decoded input {}
Decoded output -
Logs []
Value 0 wei
Fig. 3. State checking cost of sink nodes (transaction vs. execution cost, in gas, for the current states of sink nodes checked locally and on the main server).

Fig. 4. Individual state checking cost of sink nodes (transaction vs. execution cost, in gas, for Sink Nodes 1–5).



Fig. 5. Transaction cost of sink nodes (transaction vs. execution cost, in gas, for Sink Nodes 1–5).

4.1.1 Remix IDE


Remix Integrated Development Environment (IDE) is an open source tool that allows developers to write, compile, test and debug smart contracts from the browser. It helps programmers with designing the software and other tasks related to smart contract development.

Table 2. Current active/de-active state of nodes

Parameter Value
Transaction hash 0xfbaef, ..., 840f96
From 0xca35b7, ..., fa733c
To Clustering.StateOfSNs() 0xbbf28, ..., 732db
Transaction cost 27886 gas (Cost only applies when called by a contract)
Execution cost 6614 gas (Cost only applies when called by a contract)
Hash 0xfbaef, ..., 840f96
Input 0x38e, ..., fccad
Decoded input {}
Decoded output {“0”: “string: Current state of Sink Node 1 is: 1”
“1”: “string: Current state of Sink Node 2 is: 1”
“2”: “string: Current state of Sink Node 3 is: 0”
“3”: “string: Current state of Sink Node 4 is: 0”
“4”: “string: Current state of Sink Node 5 is: 1”
“5”: “string: State of selected Ordinary Node is: 1”}
Logs []

Table 3. Comparison of different costs

Function Transaction cost (gas) Execution cost (gas)


Network deployment cost 5674465 4272505
Per transaction cost
Sink Node 1 46325 23837
Sink Node 2 46019 23595
Sink Node 3 46237 23749
Sink Node 4 46193 23705
Sink Node 5 46457 23969
Current state (Active/De-active) 27886 6614
of sink nodes
Current state (Active/De-active) 27911 6639
of sink nodes according to main
server
Individual state checking cost
Sink Node 1 30427 9155
Sink Node 2 30493 9221
Sink Node 3 31964 10692
Sink Node 4 32118 10846
Sink Node 5 30515 9243

4.1.2 Ganache
Ganache provides a clear visualization of the transactions that deploy the smart contract. Ganache grants access to 10 accounts, each with 100 Ether, for testing purposes. When a transaction or a smart contract is deployed on the blockchain, Ganache immediately confirms it. As a result of any transaction or smart contract deployment, the transaction log grows and Ether is deducted. Each transaction has details of the fund transfer, contract creation, contract call, etc., along with the sender's address and the transaction hash.

4.1.3 MetaMask
MetaMask is a Google Chrome extension that connects the Ethereum node and the browser. It allows us to send and receive Ether from Ganache. For this, the Ganache wallet is connected with MetaMask by using a private key. MetaMask is also connected with Remix IDE. When Remix IDE and MetaMask are connected, Ether is deducted on each transaction.

4.1.4 MATLAB R2018a


MATLAB R2018a is a convenient tool for researchers to work with. It is a multi-dimensional mathematical computing environment that can, for example, solve linear programming problems within seconds. In this work, however, we have used this tool for generating plots from the values obtained with Remix IDE.

4.2 Simulations Reasoning


In this section, the simulation tools, results and reasoning of these results will be
discussed. Nevertheless, we provide an overview of some important terms which
require for understanding the term gas price and about its calculation.

4.2.1 Execution Cost


Execution cost is the cost, paid in gas as fuel, of executing the functions (code lines); it also includes the storage allocated for different variables as a result of executing operations. The transaction cost, on the other hand, is required for sending the data to the blockchain.

4.2.2 Estimating Transaction Cost


The total Ether cost of a transaction is based on the following two factors:

• Gas used: the total gas consumed by the transaction.
• Gas price: the price per unit of gas specified in the transaction.

This transaction cost is calculated by the following formula:

$Cost_{total} = U_{gas} \times P_{gas}$  (2)

where $U_{gas}$ and $P_{gas}$ represent the gas used in the transaction and the gas price specified for that transaction, respectively.
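As a worked example of Eq. 2 (the gas price is hypothetical; the gas figure is the deployment transaction cost reported in Table 1):

```python
# Eq. 2: total cost = gas used x gas price.
gas_used = 5674465          # network deployment transaction cost (Table 1)
gas_price_gwei = 20         # hypothetical gas price; 1 gwei = 1e-9 Ether

cost_ether = gas_used * gas_price_gwei * 1e-9
print(f"Deployment cost: {cost_ether:.6f} ETH")  # ~0.113 ETH at 20 gwei
```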

4.2.3 Why Ether Is Not Used Instead of Gas?


Gas is used to decouple the cost of an operation from the market price of Ether. Cryptocurrency prices are volatile, and Ether is no exception, whereas the gas limit for each operation is constant. This is the reason why we use gas instead of Ether.
Simulations for the calculation of execution and transaction costs are performed in Remix IDE. During the simulations, it is observed that the transaction cost is always higher than the execution cost. We perform a cost analysis for a smart contract using the aforementioned costs in terms of gas consumption. Figure 2 shows the transaction and execution costs required for network deployment; it can be seen that the network deployment operation is the most expensive. Table 1 also shows that the transaction cost for network deployment is 5674465 gas, which is higher than the execution cost (4272505 gas).
Figure 3 depicts the state checking cost of the sink nodes. The state checking cost is the cost of verifying the active or de-active status of the sink nodes. To check whether the sink nodes are active or not, their status is checked by the other sink nodes and then by the main server; the active or de-active state must be the same in both places. If the status is the same, the node is working perfectly; otherwise, the node is malicious and needs to be removed from the network. This state checking cost is presented in Fig. 3. Similarly, Fig. 4 shows the transaction and execution costs for each individual sink node's state, which is cross-verified by the main server. Note that the active status of a node is represented by "1" and the inactive status by "0".
The transaction and execution costs per transaction are shown in Fig. 5. The difference between the two costs is easy to observe: the transaction cost is always higher and the execution cost is lower, for the reasons discussed above.

4.2.4 Comparison Between Transaction and Execution Costs


We execute different functions and calculate the transaction and execution costs in terms of gas. These costs are then plotted using MATLAB. The aforementioned costs for different functions are provided in Tables 1 and 2, respectively, and a comprehensive comparison of the different costs is given in Table 3.

5 Conclusion

A WSN has the ability to monitor, collect and send data from one place to another under unorthodox conditions. However, this network has lots of security risks. In this paper, we address these security concerns by exploiting blockchain concepts and propose a blockchain based trust model that allows WSN nodes to communicate with other nodes without security risks. Moreover, simulations are performed to compare the transaction and execution costs. In future, we will implement blockchain in a state of the art routing protocol and compare its performance with the original one.

References
1. Mateen, A., Awais, M., Javaid, N., Ishmanov, F., Afzal, M.K., Kazmi, S.: Geo-
graphic and opportunistic recovery with depth and power transmission adjustment
for energy-efficiency and void hole alleviation in UWSNs. Sensors 19(3), 709 (2019)
2. She, W., Liu, Q., Tian, Z., Chen, J.-S., Wang, B., Liu, W.: Blockchain trust model
for malicious node detection in wireless sensor networks. IEEE Access 7, 38947–
38956 (2019)
3. Liu, M., Yu, F.R., Teng, Y., Leung, V.C.M., Song, M.: Computation offloading
and content caching in wireless blockchain networks with mobile edge computing.
IEEE Trans. Veh. Technol. 67(11), 11008–11021 (2018)
4. Awais, M., Javaid, N., Imran, M.: Energy efficient routing with void hole allevi-
ation in underwater wireless sensor networks. MS thesis. COMSATS University
Islamabad (CUI), Islamabad 44000, Pakistan, July 2019

5. Mateen, A., Javaid, N., Iqbal, S.: Towards energy efficient routing in blockchain
based underwater WSNs via recovering the void holes. MS thesis. COMSATS Uni-
versity Islamabad (CUI), Islamabad 44000, Pakistan, July 2019
6. Xu, Y., Wang, G., Jidian Yang, J., Ren, Y.Z., Zhang, C.: Towards secure network
computing services for lightweight clients using blockchain. Wirel. Commun. Mob.
Comput. 2018, 1–13 (2018)
7. Lin, J., Shen, Z., Miao, C., Liu, S.: Using blockchain to build trusted LoRaWAN
sharing server. Int. J. Crowd Sci. 1(3), 270–280 (2017)
8. Ren, Y., Liu, Y., Ji, S., Sangaiah, A.K., Wang, J.: Incentive mechanism of data
storage based on blockchain for wireless sensor networks. Mob. Inf. Syst. 2018, 10
(2018)
9. Li, J.: Data transmission scheme considering node failure for blockchain. Wireless
Pers. Commun. 103(1), 179–194 (2018)
10. Lin, D., Tang, Y.: Blockchain consensus based user access strategies in D2D net-
works for data-intensive applications. IEEE Access 6, 72683–72690 (2018)
11. Singh, M., Kim, S.: Branch based blockchain technology in intelligent vehicle. Com-
put. Netw. 145, 219–231 (2018)
12. Dai, M., Zhang, S., Wang, H., Jin, S.: A low storage room requirement framework
for distributed ledger in blockchain. IEEE Access 6, 22970–22975 (2018)
13. Jia, B., Zhou, T., Li, W., Liu, Z., Zhang, J.: A blockchain-based location privacy
protection incentive mechanism in crowd sensing networks. Sensors 18(11), 3894
(2018)
14. Zhang, Y., Wen, J.: The IoT electric business model: using blockchain technology
for the Internet of Things. Peer-to-Peer Netw. Appl. 10(4), 983–994 (2017)
15. Xu, C., Wang, K., Li, P., Guo, S., Luo, J., Ye, B., Guo, M.: Making big data open in
edges: a resource-efficient blockchain-based approach. IEEE Trans. Parallel Distrib.
Syst. 30(4), 870–882 (2019)
16. Sharma, P.K., Moon, S.Y., Park, J.H.: Block-VN: a distributed blockchain based
vehicular network architecture in smart city. JIPS 13(1), 184–195 (2017)
17. Zhang, G., Li, T., Li, Y., Hui, P., Jin, D.: Blockchain-based data sharing system
for AI-powered network operations. J. Commun. Inf. Netw. 3(3), 1–8 (2018)
18. Gholamreza, R., Leung, C.: A blockchain-based contractual routing protocol for the
Internet of Things using smart contracts. Wirel. Commun. Mob. Comput. 2018,
14 (2018)
19. Naz, M., Javaid, N., Iqbal, S.: Research based data rights management using
blockchain over ethereum network. MS thesis. COMSATS University Islamabad
(CUI), Islamabad 44000, Pakistan, July 2019
20. Javaid, A., Javaid, N., Imran, M.: Ensuring analyzing and monetization of data
using data science and blockchain in loT devices. MS thesis, COMSATS University
Islamabad (CUI), Islamabad 44000, Pakistan, July 2019
21. Kazmi, H.S.Z., Javaid, N., Imran, M.: Towards energy efficiency and trustfulness
in complex networks using data science techniques and blockchain. MS thesis,
COMSATS University Islamabad (CUI), Islamabad 44000, Pakistan, July 2019
22. Zahid, M., Javaid, N., Rasheed, M.B.: Balancing electricity demand and supply in
smart grids using blockchain. MS thesis, COMSATS University Islamabad (CUI),
Islamabad 44000, Pakistan, July 2019
23. Noshad, Z., Javaid, N., Imran, M.: Analyzing and securing data using data science
and blockchain in smart networks. MS thesis, COMSATS University Islamabad
(CUI), Islamabad 44000, Pakistan, July 2019

24. Ali, I., Javaid, N., Iqbal, S.: An incentive mechanism for secure service provision-
ing for lightweight clients based on blockchain. MS thesis, COMSATS University
Islamabad (CUI), Islamabad 44000, Pakistan, July 2019
25. ul Hussen Khan, R.J., Javaid, N., Iqbal, S.: Blockchain based node recovery scheme
for wireless sensor networks. MS Thesis, COMSATS University Islamabad (CUI),
Islamabad 44000, Pakistan, July 2019
26. Samuel, O., Javaid, N., Awais, M., Ahmed, Z., Imran, M., Guizani, M.: A
blockchain model for fair data sharing in deregulated smart grids. In: IEEE Global
Communications Conference (GLOBCOM 2019) (2019)
27. Rehman, M., Javaid, N., Awais, M., Imran, M., Naseer, N.: Cloud based secure
service providing for IoTs using blockchain. In: IEEE Global Communications Con-
ference (GLOBCOM 2019) (2019)
Smart Contracts for Research Lab
Sharing Scholars Data Rights
Management over the Ethereum
Blockchain Network

Abdul Ghaffar, Muhammad Azeem, Zain Abubaker,


Muhammad Usman Gurmani, Tanzeela Sultana, Faisal Shehzad,
and Nadeem Javaid(B)

COMSATS University Islamabad, Islamabad 44000, Pakistan


abdul7g7@gmail.com, nadeemjavaidqau@gmail.com

Abstract. Data sharing enables actual scholars' datasets to be shared and reused in the future across any domain. The rise of blockchain technology worldwide has increased interest in the sharing and reuse of scholarly datasets. There are a number of security management frameworks for sharing data securely; however, those frameworks are centralized, impose restrictions and are owned by third party authorities. The access and reuse of research datasets also raise a variety of issues, such as misinterpretation. For these reasons, researchers and publishers are reluctant to share data publicly, as they perceive risks in the data sharing environment, and preparing and storing data for content sharing is difficult. To overcome these limitations and restrictions, we propose distributed data sharing management based on a blockchain (peer to peer, P2P) network. To demonstrate it on the Ethereum framework, we present a case study of data sharing on the Ethereum smart contract platform to manage access.

Keywords: Repository technologies · Research data sharing · Smart


contract · Digital right management

1 Introduction
Data sharing is an important and vital mainly for public researcher to obtain
their tasks and update there work. Research datasets contain issues to publish.
e.g., reproduced the paper results and simulation that update and proof your
work to republish your task without adding or commitment of their own scheme
in the search cycle. The main difficulties of scholarly communication to main-
tain the preparing and storing the research datasets [1]. Recently, the research
datasets is given autonomy to a requester to update the publish research sharing.
The scholarly communication of research datasets is digitally right managed.
This article proposed the new scheme for the researcher to publish his work
typically its can examine by researchers to view his work on each step to follow
© Springer Nature Switzerland AG 2020
L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 70–81, 2020.
https://doi.org/10.1007/978-3-030-33509-0_7

These access conditions and policies can be enforced through a smart contract that encodes the conditions under which the research data is published.
This paper aims to evaluate and compare the decentralized architecture with existing management schemes for research data rights management, using dataset repositories and data sharing solutions for scholarly communication. Specifically, the performance of the existing architecture is evaluated on research data. In addition, our proposed technique meets the design goal of scalability, which is verified by measuring the system with different configurations.
Scholarly data communication in the research domain has many connected parts, including numerous activities for sharing research data. Authors may post their research datasets on their own websites [1], and services such as [2] facilitate free uploading of research content. From the publishing point of view, the sharing and reuse of research content by a reuser/requester can be permitted under certain conditions and terms [3]. When supporting research content is published, it must follow these access and reuse conditions, so that it remains protected in case a reuser attempts to reuse it in practice. Usually, the reuser takes permission from the author.

1.1 Problem Statement


Sharing scholarly datasets between parties is a necessary and essential activity for researchers/scholars, mainly due to the requirements of research organizations and research labs [1]. In practice, however, sharing scholarly datasets within the research cycle is not guaranteed: in a survey, less than half of the researchers responded that they share data, due to technical difficulties in storing and sharing it [2]. Researchers are also aware that publishing research datasets can have a negative impact on their desired academic aims [3]. In the survey, the main obstacles are the fear of misuse and misinterpretation of published research datasets and the fear of losing publishing opportunities.

1.2 Technical Contribution


The smart contract technique is used to solve digital rights management in the proposed system, with reference to the Ethereum blockchain network. It implements the reuse of scholarly contents/datasets from the relevant domain to enhance knowledge and understanding, so that scholars can share research datasets for further innovation and interpretation. To solve the problem, we arrive at a suitable solution using an innovative mechanism and technology called blockchain. This new design promotes flexibility and reliability by recording every term, condition and policy activity involved in the smart contract.
The blockchain is used to maintain and deploy the records of transactions in a decentralized, distributed public digital ledger; there is no need for a third party to maintain the transactions. Blockchain maintains the records of each entity separately and sequentially. It also records important information about the publisher and researcher workflow. Moreover, a smart contract for digital rights management represents a solid base for dataset sharing over the Internet.
Smart contract technology mediates the interaction between the authors' scholarly datasets and the requesters' access actions. The parameters assigned by the authors are as follows:
• A higher qualification degree.
• Association with a development and research lab that publishes research in the given domain.
• The author's account address, used to store the incentives received from requesters.
The remainder of this paper is structured in the following manner: Sect. 2 discusses the blockchain mechanism. Related work on scholarly data communication and our motivation are discussed in Sect. 3. Section 4 introduces our proposed technique. Section 5 analyzes the results. Moreover, Sect. 6 presents the simulation environment. Finally, Sect. 7 concludes the article and discusses the future limitations/scope of blockchain in data sharing.

2 Blockchain Mechanism
Blockchain is a revolutionary technology for handling data, based on a hash-linked data structure. Initially, blockchain was developed for the invention of Bitcoin, and most people think of it as the technology that powers Bitcoin. It is actually a chronological linked list of batches called blocks, which store data in a tamper-proof way. Blockchain technology is a growing list of blocks (records) that are linked cryptographically.

2.1 Blockchain Fundamental


In the blockchain, there is a hash function, a hash pointer to the previous block's hash, and a Merkle tree, also called a hash tree. The Merkle tree, or hash tree, combines the data hashes pairwise to build new hash strings; as a result, we obtain the root hash (the unique hash).
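An illustrative sketch of building a Merkle root by pairwise hashing (SHA-256 is used here for simplicity and is not specific to any particular blockchain):

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Hash the data items, then pairwise-hash upward until a single root remains."""
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:      # duplicate the last hash on odd-sized levels
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

transactions = [b"tx1", b"tx2", b"tx3", b"tx4"]
print(merkle_root(transactions).hex())
```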
The blockchain uses two cryptographic techniques, namely the digital signature and the hash function. The digital signature provides integrity, authentication and non-repudiation for Bitcoin transactions. The hash function is used to compute the hash value of the previous block and links the blocks into a chain. Besides this, blockchain is basically a decentralized distributed ledger, which is capable of recording all transaction information between different agents (sellers and buyers) in a certified and reliable manner [1,2].
Finally, blockchain technology is a completely distributed public database where any kind of data can be exchanged. For every transaction, the blockchain technology creates a new block, and the blockchain is formed when all the blocks are linked sequentially by hashes. In addition, the body of every block holds all the necessary transaction data of the preceding phase. Decentralized blockchain technology has many applications in data sharing, such as data access, data trading [6] and data management.

2.2 Blockchain Construction

When a new transaction or any data is to be inserted into the shared ledger, it is initially broadcast to all participants in the given network. Proof-of-Work (PoW) is the mechanism for authorizing what is added to the blockchain. In the Bitcoin network, each verified transaction is broadcast to all nodes in the network to be recorded in the public ledger; a single transaction must first be verified for validity before it is recorded. Tampering with data in a blockchain network requires the creation of a new block that stores the modified data together with by whom and when it was changed, because a blockchain is a long-term storage mechanism that lasts for decades. The most widely used verification techniques are as follows:

• Proof-of-Work (PoW): to add a block to the chain, miners must compete to solve a difficult mathematical puzzle using their computers' processing power (a toy sketch of this puzzle is given after this list). In order to add a malicious block to any transaction history, an attacker would need to control more than 51% of the network's computing power [9]. In PoW, the first miner to solve the puzzle is rewarded.
• Proof-of-Stake (PoS): in this technique, there is no competition for the block creator; it is chosen by an algorithm based on the users' stake. In order to add a malicious block, an attacker would have to own 51% of all the cryptocurrency on the network. There is no mining reward for creating a block; only a transaction fee is given to the creator.
• Proof-of-Authority (PoA): in PoA, transactions and blocks are validated by approved accounts called validators. These validators run software that puts the transactions into verified blocks. The process is automated and does not require validators to constantly monitor their computers.
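The toy sketch below illustrates the PoW puzzle from the first bullet above: a nonce is searched for until the block hash meets a difficulty target. Real networks such as Bitcoin use double SHA-256 and a far higher difficulty; this is only an illustration:

```python
import hashlib

def mine(block_data: str, difficulty: int = 4):
    """Search for a nonce whose SHA-256 hash starts with `difficulty` zero hex digits."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce, digest
        nonce += 1

nonce, digest = mine("block #1: tx-list-hash")
print(f"nonce={nonce}, hash={digest}")
```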

2.3 Blockchain Basic Components

• Transaction: a transaction is the information or data transmitted from one participant to another in a particular network. The transaction is kept track of by the blockchain, from its creation to the destination entity.
• Blocks: blocks are used to collect valid transactions. Each block collects the transactions that occur in a given period and holds a reference to the preceding block; therefore, a chain of blocks is built.
• Nodes: nodes are the participants (members) in a particular network. Each node stores a separate copy of the entire blockchain/ledger instead of relying on a centralized server or database.
• Majority Consensus: this is the decision authority. The centralized authority is discarded; therefore, decisions are taken by the majority of nodes in the specified network. Participants or nodes modify the stored transaction data only if the majority in the specified network approves the change [4].

3 Related Work
In this section, we discuss the literature and the motivation in the context of this article.
Sharing scholarly datasets between parties is a necessary and essential activity for researchers/scholars, mainly due to the requirements of research organizations and research labs [1]. In practice, however, sharing scholarly datasets within the research cycle is not guaranteed: in a survey, less than half of the researchers responded that they share data, due to technical difficulties in storing and sharing it [2].
Researchers are also aware that publishing research datasets can have a negative impact on their desired academic aims [3]. In the survey, the main obstacles are the fear of misuse and misinterpretation of published research data and the fear of losing publishing chances.
Scholarly communication in the research domain has many connected parts, including numerous activities for sharing research datasets. Authors may post their research datasets on their own websites [4,5].
ZENODO [6] facilitates free uploading of research content. From the publishing point of view, the sharing and reuse of research content by reusers/requesters can be permitted under certain conditions and terms [7]. When supporting research content is published, it must follow these access and reuse conditions, so that it remains protected in case a reuser attempts to reuse it in practice. Usually, reusers take permission from the author.
Mobile communication and networking have developed to the point where they are extremely difficult to manage. For this reason, [7] proposes Artificial Intelligence (AI)-powered network frameworks, a platform that operates a network automatically; to tackle data barriers, mutual data sharing frameworks are used.
With the invention of blockchain technology, much of the research on sharing focuses on blockchain mechanisms and has noticed the value of scholarly data sharing, particularly for medical data. In a medical scenario, the data owners need to keep ownership of how the data is stored and manipulated, and blockchain technology is the solution to this problem: access to medical data only needs to be verified against the permission granted by the data owner, and this permission is verified through smart contracts using blockchain technology.
Another problem is tackled through the aspect of data authority management, for which blockchain technology is also used, so that the raw data cannot be controlled by a single owner indefinitely. The system in [7] includes three levels: a user layer, a management layer and a data sharing layer; a storage layer is also used instead of a cloud-based storage layer. In [8–17], the authors make use of blockchain in wireless sensor networks, the Internet of Things, smart grids, vehicular networks, etc.

4 Proposed Scheme Model


In this section, the smart contracts technique is used to solve the digital right
management. We reference to Ethereum blockchain network. To implements the

Fig. 1. Smart contracts for research data rights management (Layer 1: authors/publisher domain with the data repository, scholar publisher profile accounts, data communication and data access; Layer 2: conditions/terms domain with the abstraction of publisher data and the smart contract; Layer 3: reuser/requester domain with the reuser account, requester and rework/publisher).

The scheme implements the reuse of scholarly contents/data from the relevant domain to enhance knowledge and understanding, so that scholars can share research contents or research datasets for further innovation and interpretation. To solve the problem, we arrive at a suitable solution using an innovative mechanism and technology, namely blockchain and smart contracts. This new design promotes flexibility and reliability by recording and verifying every term activity involved in the smart contracts.
In the proposed model, the given scenario is divided according to the functionality of the various objects and their environments. The data repository is the storage location where the published datasets are maintained, and the publisher accesses the datasets directly. In the dataset repository, each publisher has its own scholar profile, which usually consists of various fields of well-known information about the scholar; the required scholar datasets are accessed through the scholar profiles.
The blockchain is used to maintain and deploy the records of transactions in a decentralized, distributed public digital ledger; there is no need for a third party to maintain the transactions. Blockchain maintains the records of each entity separately and sequentially. It also records important information about the publishers and researcher workflow. Moreover, smart contracts for digital rights management represent a solid base for data sharing over the Internet.
As shown in Fig. 1, smart contract technology mediates the interaction between the authors' scholarly datasets and the requesters' access actions. The parameters assigned by the authors are as follows:

• A higher qualification degree.
• Association with a development and research lab that publishes research in the given domain.
• The author's account address, used to store the incentives received from requesters.
• Hashes of the author's datasets and of the conditions (used in the smart contracts) under which the research datasets may be accessed and reused.
• A permanent record, in a transaction, of every reuse of the published work (Table 1).

Table 1. Gas consumed while executing the transactions

Author qualification | Requester qualification | Author and requester output parameter (gas) | Author development research lab | Performance (gas) | Reuser development research lab | Performance (gas)
PhD | BS | 45871 | RMIT Lab | 45957 | RMIT Lab | 46023
PhD | BS | 26328 | Comsens Lab | 31406 | RMIT Lab | 26480
PhD | MS | 26328 | QU Lab | 31278 | Comsens Lab | 31472
MS | MS | 31064 | UOP Lab | 26542 | Comsens Lab | 26672
M-Phil | PhD | 26264 | NUST Lab | 31406 | QU Lab | 31344
BS | PhD | 26264 | USTB Lab | 26608 | UOP Lab | 31216

The performance of each transaction is measured in gas. The Ethereum Virtual Machine (EVM) and its network are used to execute the smart contracts that manage the research scholar datasets. In the proposed scenario, the performance of each output parameter is calculated in gas, while the transaction cost is estimated in Ether, as given in Fig. 2; the execution cost of a transaction is based on the Ether cryptocurrency.

Table 2. Total evaluation of the transactions

Transaction sequence (in gas)       | Evaluation of the transaction
Development research lab            | 99.98
Authors qualification description   | 99.93
Requester qualification description | 99.78
Reuser development research lab     | 99.78

The table above shows the evaluation of the composed parameters, i.e., how much gas, expressed in ether (the Ethereum coin), is consumed when a transaction is transmitted. The transaction value in Ethereum decreases gradually whenever one of these parameters occurs (Table 2).

5 Analysis of Results and Discussion


The model is deployed on the Ethereum platform in order to improve the decentralization of scholar dataset sharing using blockchain technology and smart contracts. The datasets are consumed and are characterized by input control and output performance. In the proposed scheme, we consider three layers, each with its own transaction performance. The performance evaluation is shown in the plotted graphs.

Fig. 2. Authors development research lab

Fig. 2 shows the output evaluation, i.e., the transaction cost on the y-axis, as a function of the input parameters (the number of research labs that publish research content or communicate scholarly data). The research lab is the domain from which the authors publish their scholar datasets, which helps maintain consistency and flexibility. The lab description determines which desired scholar datasets can be published: the published datasets depend on the authors' research lab description, and publishing from each desired lab description increases the amount of published data accordingly. The output evaluation is computed in transaction gas. The authors' research lab description indicates the level of the published material, i.e., which type of authors from a specified research lab can upload which level of scholar datasets.
Fig. 2 also shows that the rapid decrease in the transaction amount depends on the throughput of the authors' research labs. The contribution of our proposed technique is motivated by intelligent vehicle technology [5].
The lab description shows the various labs that are required to publish their respective scholar datasets. Numerous well-known scholar data are published from each lab, which encourages participation and generates more transactions for accessing the published papers. On the y-axis, we evaluate the transactions through which each lab publishes its work.

Fig. 3. Authors qualification description

Fig. 4. Requester qualification description

Fig. 3 shows the authors' qualification for publishing the desired scholar datasets. The published data depend on the authors' qualification description, and datasets from any desired description enhance data publishing. The output evaluation is computed as transaction cost. The authors' description indicates the level of the publisher, i.e., which type of authors can upload which level of scholar datasets.
Fig. 4 shows the requester qualification description. In both cases, the transaction output is measured in gas, and the delay is reflected in the gas performance: the faster the scholar data is accessed from the publisher, the more gradually the transaction is performed. In general, the experimental results show that the publishing and downloading gas transactions for each event are acceptable. We also ran an experiment to measure the relationship between the number of concurrent service requests and the transaction output gas in ether.
In the given scenario, we have three layers, of which the bottom layer is for the reuser of the scholar datasets. In Fig. 5, the reuser or requester accesses the published datasets through the blockchain. On each access to a desired paper, an incentive is shared with the publisher.

Fig. 5. Reuser development research lab

The reuser can access published material from any well-known lab for future work. Access to the required papers must be made through the authorized lab, because their reuse is governed by policies.

6 Simulation Environment

In the simulation scenarios, the experiment is run on a laptop, which represents the user's device. The laptop plays the role of a miner due to its relative compute and storage capability, as follows:

• Laptop with 8.00 GB RAM.
• 64-bit Windows 10 operating system.
• Intel Core m3-7Y30 CPU @ 1.00 GHz, 1.61 GHz.

The smart contract effectively establishes a runtime contract between the dataset publisher and the reuser for sharing datasets.
Remix is a web browser-based integrated development environment (IDE) for programming in Solidity, used to write the smart contracts that digitally enable the parties on both sides to trust the given rules. The Ethereum platform uses ether, which shows how much has been spent on a specific task; the amount of gas consumed indicates the total cost of that task. A more complex task requires a higher cost.
Ganache is a personal blockchain for creating smart contracts. It is available as a desktop application and as a command-line tool for Windows, and it is used for Ethereum development. It provides the functionality for deploying smart contracts; we used the desktop application to deploy the contract between the scholar data publisher and the reuser. Ganache allows you to create a private Ethereum blockchain on which to run tests and execute commands.
MetaMask is a bridge to the distributed web that allows Ethereum applications to run in your browser. It acts as a wallet to store, send and receive ether, and it allows you to control your funds.
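As a companion to the registry sketch given earlier, the fragment below illustrates what such a runtime agreement between the dataset publisher and the reuser could look like in Solidity, with the reuse incentive forwarded to the publisher's account. The names (ReuseAgreement, termsHash, reuse) are illustrative assumptions and not code taken from the paper.

pragma solidity ^0.5.0;

// Illustrative sketch of the publisher-reuser agreement; names are assumptions.
contract ReuseAgreement {
    address payable public publisher; // authors' account that collects incentives
    bytes32 public termsHash;         // hash of the reuse terms accepted by both parties

    event Reused(address indexed reuser, uint256 incentiveWei);

    constructor(address payable _publisher, bytes32 _termsHash) public {
        publisher = _publisher;
        termsHash = _termsHash;
    }

    // The requester attaches ether, which is forwarded to the publisher's account
    function reuse() public payable {
        publisher.transfer(msg.value);
        emit Reused(msg.sender, msg.value);
    }
}

In the workflow described above, such a contract would be compiled in Remix, deployed on the Ganache test chain and exercised through MetaMask.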

7 Conclusion and Limitation


The experimental results show the gas value consumed while executing the transactions of the smart contracts for research dataset rights management in the EVM. The transaction cost reflects the actual workflow performed in the blockchain network, given a limited number of computing resources. The publishing of scholar datasets is likewise estimated in ether incentive tokens in order to uphold the agreed terms.
The plots above show the throughput parameters of the total publishing transactions, the lab description and the authors' qualification. Each plot shows the transaction cost and the estimated gas value, which is calculated in ether tokens by the EVM. We gave a brief description of each newly added parameter in the research data rights management scheme, which allows various requesters to reuse the datasets and enhance their domain. In each plot, the datasets are taken randomly in order to draw conclusions about maintenance. The proposed system works as intended according to the experimental results. Each author has substantial scholarly information to publish in any field, but checks such as qualification and an associated development research lab are mandatory.
The total sequence of gas is generated when a researcher publishes datasets in response to the reusers' requests. As publishing increases, the amount of stored data increases accordingly. The authors must be attached to a development lab to access the published work and to enhance and develop their own research.
In future work, the scheme can be extended to enforce the policies and the terms and conditions for establishing the research data. It can also be extended from an economic perspective. We should also examine the research interest in integrating access to data from related fields.

References
1. Xu, Y., Wang, G., Yang, J., Ren, J., Zhang, Y., Zhang, C.: Towards secure network
computing services for lightweight clients using blockchain. Wirel. Commun. Mob.
Comput. 2018, 12 (2018)
2. Zhang, Y., Wen, J.: The IoT electric business model: using blockchain technology
for the Internet of Things. Peer-to-Peer Netw. Appl. 10(4), 983–994 (2017)
3. Novo, O.: Scalable access management in IoT using blockchain: a performance
evaluation. IEEE Internet Things J. 6, 4694–4701 (2018)
4. Jiang, T., Fang, H., Wang, H.: Blockchain-based internet of vehicles: distributed
network architecture and performance analysis. IEEE Internet Things J. 6, 4640–
4649 (2018)
5. Kushch, S., Prieto-Castrillo, F.: A rolling blockchain for a dynamic WSNs in a smart
city. arXiv preprint arXiv:1806.11399 (2018)
6. Zhang, G., Li, T., Li, Y., Hui, P., Jin, D.: Blockchain-based data sharing system
for AI-powered network operations. J. Commun. Inf. Netw. 3(3), 1–8 (2018)
7. Pănescu, A.-T., Manta, V.: Smart contracts for research data rights management
over the ethereum blockchain network. Sci. Technol. Librar. 37(3), 235–245 (2018)
8. Mateen, A., Javaid, N., Iqbal, S.: Towards energy efficient routing in blockchain
based underwater WSNs via recovering the void holes. MS thesis, COMSATS Uni-
versity Islamabad (CUI), Islamabad 44000, Pakistan, July 2019

9. Naz, M., Javaid, N., Iqbal, S.: Research based data rights management using
blockchain over ethereum network. MS thesis. COMSATS University Islamabad
(CUI), Islamabad 44000, Pakistan, July 2019
10. Javaid, A., Javaid, N., Imran, M.: Ensuring analyzing and monetization of data
using data science and blockchain in IoT devices. MS thesis. COMSATS University
Islamabad (CUI), Islamabad 44000, Pakistan, July 2019
11. Kazmi, H.S.Z., Javaid, N., Imran, M.: Towards energy efficiency and trustfulness
in complex networks using data science techniques and blockchain. MS thesis.
COMSATS University Islamabad (CUI), Islamabad 44000, Pakistan, July 2019
12. Zahid, M., Javaid, N., Rasheed, M.B.: Balancing electricity demand and supply in
smart grids using blockchain. MS thesis. COMSATS University Islamabad (CUI),
Islamabad 44000, Pakistan, July 2019
13. Noshad, Z., Javaid, N., Imran, M.: Analyzing and securing data using data science
and blockchain in smart networks. MS thesis. COMSATS University Islamabad
(CUI), Islamabad 44000, Pakistan, July 2019
14. Ali, I., Javaid, N., Iqbal, S.: An incentive mechanism for secure service provision-
ing for lightweight clients based on blockchain. MS thesis. COMSATS University
Islamabad (CUI), Islamabad 44000, Pakistan, July 2019
15. ul Hussen Khan, R.J., Javaid, N., Iqbal, S.: Blockchain based node recovery scheme
for wireless sensor networks. MS thesis. COMSATS University Islamabad (CUI),
Islamabad 44000, Pakistan, July 2019
16. Samuel, O., Javaid, N., Awais, M., Ahmed, Z., Imran, M., Guizani, M.: A
blockchain model for fair data sharing in deregulated smart grids. In: IEEE Global
Communications Conference (GLOBCOM 2019) (2019)
17. Rehman, M., Javaid, N., Awais, M., Imran, M., Naseer, N.: Cloud based secure
service providing for IoTs using blockchain. In: IEEE Global Communications Con-
ference (GLOBCOM 2019) (2019)
Secure Service Provisioning Scheme
for Lightweight Clients with Incentive
Mechanism Based on Blockchain

Ishtiaq Ali, Raja Jalees ul Hussen Khan, Zainib Noshad, Atia Javaid,
Maheen Zahid, and Nadeem Javaid(B)

COMSATS University, Islamabad 44000, Pakistan


nadeemjavaidqau@gmail.com
http://www.njavaid.com/

Abstract. The Internet of Things (IoT) industry is growing very fast


to transform factories, homes and farms to make them automatic and
efficient. In the past, IoT has been applied in different resilient scenarios and
applications. IoT faces many challenges due to the lack of computational
power, battery and storage resources. Fortunately, the rise of
blockchain technology facilitates IoT devices in security solutions. Nowa-
days, blockchain is used to make reliable and efficient communication
among IoT devices and emerging computing technologies. In this paper,
a blockchain-based secure service provisioning scheme is proposed for
Lightweight Clients (LCs). Furthermore, an incentive mechanism based
on reputation is proposed. We used consortium blockchain with the
Proof of Authority (PoA) consensus mechanism. Furthermore, we used
Smart Contracts (SCs) to validate the services provided by the Service
Providers (SPs) to the LCs, transfer cryptocurrency to the SPs and main-
tain the reputation of the SPs. Moreover, the keccak256 hashing algo-
rithm is used for converting the data of arbitrary size to the hash of fixed
size. The simulation results show that the LCs receive validated services
from the SPs at an affordable cost. The results also depict that the par-
ticipation rate of SPs is increased because of the incentive mechanism.

Keywords: Internet of Things · Blockchain · Service provisioning

1 Introduction
The Internet of Things (IoT) industry has evolved remarkably in the last ten years.
At present, 13 billion IoT devices are connected, and in the near future this number is
expected to increase to 30 billion [1]. Gartner in [2] says that the total amount of spending
on IoT devices and its services till 2017 is almost $2 trillion. The author also
predicted that the number of connected IoT devices all over the world will grow
up to 20 billion by 2020.
The IoT is optimizing and transforming hand-operated (manual) processes into automatic processes, making them part of the modern era. Centralized architectures like cloud computing have contributed enormously to IoT development. During the last decade, cloud computing technology has contributed

a great deal to providing the necessary functionalities to IoT devices. Using these functionalities, the IoT devices analyze and process information and convert it into real-time information [3].
Meanwhile, the abilities of resource-constrained IoT devices are extended by fog computing, edge computing and transparent computing, through service provisioning and sharing. Security issues arise unintentionally during service provisioning to IoT devices. The services provided by transparent computing technologies are not always accurate, so the services must be validated before execution. Researchers in some early studies worked on validating services, provided by cloud computing, before execution. The authors in [4] used block-stream techniques to encode the services and provide the encoded services to the IoT devices. The authors in [5] proposed a scheme that uses local trusted firmware and trusted platform modules to validate the services before executing them.
Nakamoto in 2008 presented a cryptocurrency based on blockchain [6].
Blockchain is a technology in which transactions are validated by untrusted
actors. Blockchain provides an immutable, distributed, secure, transparent and
auditable ledger. The information in the blockchain is structured in a chain of
blocks. The blocks contain the number of transactions and these blocks are linked
together through cryptographic hashes.
After the advent of blockchain, researchers used Smart Contracts (SCs) for the validation of services before execution. Due to the aforementioned features of
the blockchain, it is used as an underlying security fabric in the service provision-
ing systems. In this work, we proposed a secure service provisioning scheme for
Lightweight Clients (LCs) with an incentive mechanism based on the reputation
of Service Providers (SPs). In the proposed scheme, the LCs send requests to
the SPs through blockchain. The SP provides the service codes in an off-chain
method. The LC validates the service codes and pays the price of the service
to the SP. The incentive mechanism is based on the reputation of the SP. The reputation of the SP increases when it provides valid services; such an SP will receive more requests from the LCs and gather more incentives.
Consortium blockchain with Proof of Authority (PoA) consensus mechanism
is used. A consortium blockchain is used because it has both the features of
the public blockchain and private blockchain. PoA consensus mechanism is used
because it requires less computational power as compared to other consensus
mechanisms. Moreover, the Keccak256 hashing algorithm is used for finding the
hash of the service codes. Keccak256 is used because it consumes less cost as
compared to other hashing algorithms. The evaluation results show that the
LCs receive validated services at an affordable cost. Moreover, the results also
depict that the participation rate of the SPs increases because of the incentive mechanism.
The rest of the paper is organized as follows. Section 2 presents the literature review. In Sect. 3, the problem statement is presented. Section 4 presents the proposed scheme for secure service provisioning with fair payment and incen-

tive mechanism. In Sect. 5, the simulation and results are discussed. Finally, Sect. 6 concludes our work.

2 Related Work
Blockchain is an emerging technology that has attracted many researchers, and almost every field in the current era has leveraged some of its features. The following are the fields in which blockchain has been used by researchers in past years. Some researchers have integrated blockchain with IoT to overcome the issues of openness, scalability, data storage, security and channel reliability.
Lin et al. integrated blockchain in Long Range Wide Area Network
(LoRaWAN) server. Using blockchain in the LoRaWAN server, an open, trusted,
decentralized and tamper-proof system is developed. However, the scalability of
the network is ignored [7].
The author in [8] presented a proof of concept architecture to implement an
access management system for IoT using blockchain technology. The permissions
and credentials of different IoT resources are stored globally. The results show
that the proposed system performed well when the Wireless Sensor Networks
(WSNs) are connected to multiple management hubs. However, when the WSNs
are connected with a single management hub, the architecture performs as a
centralized IoT system. Furthermore, when the management hub fails the devices
connected to it disappear.
Sharma et al. proposed a novel hybrid network architecture by leverag-
ing Software Defined Network (SDN) and blockchain technology. The proposed
architecture is divided into two parts: core network and edge network. By divid-
ing the architecture into two parts, the architecture has both centralized and
distributed features and strengths. The proposed architecture is compared to
Ethereum blockchain and the difference of 16.1 sec is observed in latency. How-
ever, edge nodes are not deployed efficiently and have issues in enabling the
caching technique at edge nodes [9].
The authors used blockchain for research data rights management, monetization of data, and energy efficiency and truthfulness of networks in [10–12].
Sharma et al. proposed a distributed secure SDN architecture for IoT using
the blockchain technology. The results show that the proposed system performs
well in terms of scalability, accuracy, defense effects and efficiency. However, the
data storage issue is ignored in [13].
For data storage, Jiang et al. proposed a blockchain architecture and net-
work model by considering the participation of the Internet of Vehicles (IoVs).
The lag timestamp range function is used for block validation. The blockchain
architecture consists of five different blockchains. The results show that when
the traffic increases, the average number of retransmission is about 0.86 and the
mean throughput of the network also increases. However, the channel reliability
of the cellular network is ignored [14].
Kushch et al. proposed a rolling blockchain concept for IoT devices. The
IoT devices have less battery resources and computational power to carry out

Proof of Work (PoW). The results show that the blockchain remains stable with
an increasing number of attacks. The lost blocks depend on the density of the
sensors and the intensity of the attack. However, security issues and pollution
attacks are ignored in this work [15].
For security issues, Xu et al. proposed a novel blockchain approach for secure
service provisioning. Furthermore, the authors used consortium blockchain with
the PoA consensus algorithm. The authors also used SCs to validate the edge
servers and service codes. However, no reward is given to the SPs to motivate
them [16].
Sharma et al. proposed a secure Vehicular Network (VN) architecture based
on blockchain for a smart city. The authors used the blockchain with SC. How-
ever, the SPs are not rewarded so they will not provide services effectively [17].
In [18–20], the authors used blockchain for balancing electricity demand and
supply, analyzing and securing data and node recovery.
Zhang et al. proposed an IoT E-business model because the traditional E-
business model was not feasible for the IoT. The transactions of smart property
and paid data between IoT are carried out and stored in blockchain by using the
SC. The proposed IoT E-business model is used for a case study and observed
that it is working effectively as compared to the traditional E-business model.
However, a platform for data exchange is missing [21].
Singh et al. used blockchain to build a secure and trusted environment
for Intelligent Vehicles (IVs) communication. The authors also proposed local
dynamic blockchain branching and unbranching algorithms to automate the
branching process of IV communication. Furthermore, they introduce an IV
Trust Point (IVTP), which is used as a cryptocurrency during communication.
However, by using the branching algorithm, the duplicate state changes are increased [22].
The author in [23] proposed a data transmission scheme for block validation
in blockchain considering the node failure. The authors used a multilink con-
current communication tree model. The results show that the proposed scheme
works effectively until the failed nodes reach to 15%. However, the average link
stress, the concurrent communication time and the average end to end delay
increased when the failed nodes reach about 30%. Furthermore, failed nodes are
only detected, not recovered.
Lin et al. used the blockchain and Byzantine consensus mechanism to propose
a framework. The proposed framework authenticates the Channel State Informa-
tion (CSI) for Device-to-Device (D2D) underlying cellular networks. The scheme
of user access control among the users is studied in a data-intensive service appli-
cation. The results show that the proposed framework outperforms the Q-learning algo-
rithm and random search algorithm in terms of spectrum efficiency. However,
users with non-cooperative behavior are not considered [24].
In [25], the authors proposed a novel secure service provisioning scheme for LCs, in which the cloud server validates the services and the edge servers and maintains the record of the services. The experimental analysis shows that
the proposed system is suitable for resource constrained devices.

The authors in [26] proposed a decentralized system for sharing private data in smart grids. A new PoA consensus mechanism is also proposed, in which the reputation assigned to the participants is based on the PageRank mechanism.
In [27], the authors proposed a trust model for Underwater Wireless Sensor
Network (UWSN) based on blockchain.

3 Problem Statement
The services provided by SPs are not always accurate. The authors in [4] used
a block-stream code technique for service provisioning to IoT devices. However,
the SPs cannot be trusted that they provide the correct services. The authors in
[5] proposed a scheme in which service programs are validated before execution
by using techniques like local trusted firmware and trusted platform modules.
However, IoT devices have less space for spare firmware and no specific hardware
for trusted modules. After the advent of blockchain, researchers used blockchain for service validation and verification. In [16], the authors use SCs for the validation of the services before execution. However, no rewards are given to the SPs to motivate them, and the payment for the service provisioning is also not considered in that work. In this work, a secure service provisioning scheme is proposed. Payments are transferred in the form of cryptocurrency to eliminate the third party. Furthermore, an incentive mechanism based on the reputation of the SPs is proposed to motivate the SPs.

4 Proposed System Model


In this section, we elaborate the proposed system model, which is graphically presented in Fig. 1. The proposed model is motivated by the system model of [16].
A consortium blockchain is used in the proposed model because it is managed
by different authorized nodes. Furthermore, the PoA consensus mechanism is
used because it requires less computational power. Keccak256 hashing algorithm
is used because it requires low gas consumption. There are three main entities
of the proposed model.

SPs: SPs are the nodes that provide services to the LCs in the proposed system. When an LC requests a specific service, the SP encrypts the service codes, finds the hash of those service codes and publishes that hash on the blockchain. Then the SP sends the encrypted service codes to the LC in an off-chain manner.

LCs: LCs are the IoT devices in the proposed system, which have limited computational power, storage and battery resources. The LCs send requests to the SPs through the blockchain for service codes. These service codes are then validated through the SC before execution. The LCs also pay the SPs in the form of cryptocurrency for a specific service.

Fig. 1. Proposed system model

Maintainer Nodes (MNs): MNs are the validators in the blockchain. MNs
are responsible for evidence recording in the blockchain. MNs record transactions
in blockchain about the validity of the LCs, SPs, services and the reputation of
the SPs. The reputation of SPs is based on the number of validated transactions.

4.1 Steps of the Proposed System Model

There are five steps in our proposed system model, as follows:

1. Firstly, the LC sends a request transaction through blockchain to the SP,


which contains the service name and the SP id.
2. Secondly, the SP finds the hash of the service codes using the keccak256
hashing algorithm. The hash is published on the blockchain with the service
name.
3. In step three, the SP transmits the service codes to the LC in an off-chain
manner by using any transmission medium.
4. The LC calculates the hash of the received service codes from the SP using the keccak256 hashing algorithm. Then the LC sends a transaction containing the service name and the hash of the service codes to the blockchain for verification. Using the SC, the hash sent by the LC is compared with the hash published by the SP in step 2 (a sketch of this check is given after these steps). If the hashes match, the output is a valid service; otherwise, it is an invalid service.
5. The LC transfers cryptocurrency to the SP account after receiving the vali-
dated service codes.
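A minimal Solidity sketch of steps 2, 4 and 5 is given below. The identifiers (ServiceValidation, publishService, validateService, payProvider) are hypothetical, since the paper does not list its contract source; the sketch only outlines the described protocol.

pragma solidity ^0.5.0;

// Outline of the service publication, validation and payment steps; names are assumptions.
contract ServiceValidation {
    mapping(string => bytes32) public publishedHash;     // step 2: hash published by the SP
    mapping(string => address payable) public provider;  // SP offering each service

    function publishService(string memory name, bytes32 codeHash) public {
        publishedHash[name] = codeHash;
        provider[name] = msg.sender;
    }

    // Step 4: the LC recomputes keccak256 over the received service codes and
    // compares it with the hash the SP published on-chain.
    function validateService(string memory name, bytes memory serviceCodes)
        public view returns (bool valid)
    {
        valid = (keccak256(serviceCodes) == publishedHash[name]);
    }

    // Step 5: the LC pays the SP in cryptocurrency after receiving a valid service.
    function payProvider(string memory name) public payable {
        provider[name].transfer(msg.value);
    }
}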

4.2 Reputation Based Incentive Mechanism


We proposed an incentive mechanism, which is based on the number of valid
transactions and number of invalid transaction. The incentive mechanism is pro-
posed to motivate the SPs to actively take part in the service provisioning pro-
cess. Now, what is reputation, reputation in the proposed system is the number
of valid transactions. Valid transactions are the transactions in which valid ser-
vice codes are sent to the LCs by the SPs. When SP sends the service codes
to the LC and LC generates the hash of the service codes and sends this hash
for validation to SC. SC compares the hash sent by the LC with the hash pub-
lished by the SP and returns the result to the LC. The SC also updates the
reputation of the SP according to the result. When the service codes are valid
then the reputation is incremented by one and if invalid then one is decremented
from the reputation of the SP. When new LCs request for the services they will
first look at the reputation of the SPs and then send requests. The SPs with a
high reputation in the network receive more requests and gain more profit. The
scheme is validated through the participation rate of the SPs.

Tx_val = Tx_total − Tx_nonval                    (1)

Tx_nonval = Tx_total − Tx_val                    (2)

Tx_val denotes the validated transactions, Tx_total the total transactions and Tx_nonval the non-validated transactions.

Pr = (Tx_val / Tx_total) × 100                   (3)

Pr stands for the participation rate of the SPs.
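The bookkeeping behind Eqs. (1)-(3) could be maintained on-chain roughly as in the following Solidity sketch; the contract and function names are assumptions made for illustration only.

pragma solidity ^0.5.0;

// Illustrative reputation bookkeeping for Eqs. (1)-(3); names are assumptions.
contract ReputationLedger {
    mapping(address => uint256) public validTx;    // Tx_val per SP
    mapping(address => uint256) public totalTx;    // Tx_total per SP
    mapping(address => int256)  public reputation; // +1 per valid, -1 per invalid service

    // Called after each service validation result
    function record(address sp, bool valid) public {
        totalTx[sp]++;
        if (valid) {
            validTx[sp]++;
            reputation[sp]++;
        } else {
            reputation[sp]--;
        }
    }

    // Eq. (3): participation rate in percent, using integer arithmetic
    function participationRate(address sp) public view returns (uint256) {
        if (totalTx[sp] == 0) {
            return 0;
        }
        return (validTx[sp] * 100) / totalTx[sp];
    }
}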

5 Simulation and Results


For simulation, we used different tools, which work in a combined manner. For
SC development we used Remix IDE online, which uses solidity language to
develop SC. Remix IDE also provides tools for debugging, deploying and statis-
tical analysis within that online environment.
Ganache is used because it provides a private Ethereum blockchain. On that pri-
vate Ethereum blockchain users perform any operation, which can be performed
on the main Ethereum blockchain without any cost. We used Ganache to test
our SC during development.
MetaMask is a Google Chrome extension that allows developers to run their
DApps on browser without running a full Ethereum node.

We used MetaMask to run our SC in the browser and perform all transactions. MATLAB is used for plotting the results: the gas consumption values are recorded from Remix IDE and then plotted using MATLAB.
The results of the proposed scheme are discussed in this section. Furthermore, a comparison with other algorithms is performed to check the efficiency of the algorithms we used.

5.1 Gas Consumption


In the Ethereum blockchain, gas consumption corresponds to a small amount of cryptocurrency, which is deducted from the account of the user performing the transaction on the Ethereum blockchain. The cryptocurrency deducted from the user account is given as a reward to the miner. Figure 2 shows the gas consumption of every event in the overall process of service provisioning, measured in gas units. The gas consumption depends on the complexity of the code; events b and c in Fig. 2 involve more complex code, which is why they consume more gas than the other events.

Fig. 2. Gas consumption of events

Figure 3 depicts the total gas consumption of the overall process of service
provisioning. The results show that PoA outperforms PoW in terms of gas con-
sumption. Gas consumed by using the PoA consensus mechanism is 166550 gas
units, while the gas consumed by using PoW is 202668 gas units. PoW is computationally intensive, which is why it consumes more gas units as compared to PoA.

Fig. 3. Total gas consumption

The total gas consumption using PoA is reduced by almost 17% as compared to PoW.
Figure 4 depicts the total gas consumption of the overall process of service provisioning, observed using different hashing algorithms. Keccak256 consumes 30 gas units plus 6 gas units per word, SHA256 consumes 60 gas units plus 12 gas units per word and RIPEMD160 consumes 600 gas units plus 120 gas units per word. The results show that keccak256 outperforms SHA256 and RIPEMD160: the total gas consumption using keccak256 is 166550 gas units, while SHA256 and RIPEMD160 consume 168735 and 169925 gas units, respectively. Keccak256 reduces the gas consumption by 1.9% and 1.3% as compared to RIPEMD160 and SHA256, respectively.
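For reference, the three hashing algorithms compared here are available as built-in functions in Solidity, as the snippet below illustrates; the wrapper function names are ours, and the gas figures quoted in the text are the paper's measurements.

pragma solidity ^0.5.0;

// Thin wrappers around the hashing built-ins compared in Fig. 4.
contract HashComparison {
    function viaKeccak256(bytes memory data) public pure returns (bytes32) {
        return keccak256(data);   // cheapest of the three per word
    }

    function viaSha256(bytes memory data) public pure returns (bytes32) {
        return sha256(data);      // SHA-256 precompile
    }

    function viaRipemd160(bytes memory data) public pure returns (bytes20) {
        return ripemd160(data);   // RIPEMD-160 precompile, the costliest per word
    }
}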

5.2 Participation Rate

The participation rate is defined as the participation of SPs in the service provi-
sioning process. The participation rate can be found with the help of the number
of valid and invalid transactions. When the number of valid transactions is more,
then the participation rate will increase and vice versa. Figure 5 shows the par-
ticipation rate of the SPs with respect to the number of valid transactions. The
participation rate is directly proportional to the number of valid transactions.
When the number of valid transactions increases, the participation rate of the SP increases as well, as shown in Fig. 5.

Fig. 4. Total gas consumption by comparing hashing algorithms

Fig. 5. Participation rate with respect to valid transactions

6 Conclusion

In this paper, a secure service provisioning scheme for LCs using blockchain is
proposed. Furthermore, an incentive mechanism based on the reputation of SPs
is proposed. A consortium blockchain is used because it has both the features of
the public and private blockchain. The maintenance of the blockchain is done by
permissioned users. Other public users only read from the blockchain and verify
the service codes received from the SPs. For consensus in the blockchain, we
used the PoA consensus mechanism in which a group of validators is selected for
consensus and adding new blocks to the blockchain. The validators are selected

based on their reputation in the network. The keccak256 hashing algorithm is


used to convert the data of arbitrary size into fixed-size hash. Keccak256 is used
because of its less gas consumption as compared to SHA256 and RIPEMD160.
We also used SC for the validation of the services provided by SPs to the LCs.
The simulation results show that using PoA the total gas consumption is reduced
by almost 17% as compared to the PoW. The results also depict that by using
keccak256 the total gas consumption is reduced by 1.9% and 1.3% as compared
to RIPEMD160 and SHA256, respectively. The results also show that as the
reputation of the SP increases its participation rate increases.

References
1. MacGillivray, C., Gorman, P.: Connecting the IoT: The Road to Success. Interna-
tional Data Corporation (IDC) Report (2018)
2. Gartner (2017): Gartner says 8.4 billion connected ‘things’ will be in use in 2017,
Up 31 Percent From 2016. www.gartner.com/newsroom/id/3598917
3. Diaz, M., Martin, C., Rubio, B.: State-of-the-art, challenges, and open issues in the
integration of Internet of Things and cloud computing. J. Netw. Comput. Appl.
67, 99–117 (2016)
4. Peng, X., Ren, J., She, L., Zhang, D., Li, J., Zhang, Y.: BOAT: a block-streaming
app execution scheme for lightweight IoT devices. IEEE Internet Things J. 5(3),
1816–1829 (2018)
5. Kuang, W., Zhang, Y., Zhou, Y., Yang, H.: RBIS: Security enhancement for MRBP
and MRBP2 using integrity check. Minimicro Syst.-Shenyang 28(2), 251 (2007)
6. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system (2008). https://
bitcoin.org/bitcoin.pdf. Accessed 1 Feb 2018
7. Lin, J., Shen, Z., Miao, C., Liu, S.: Using blockchain to build trusted LoRaWAN
sharing server. Int. J. Crowd Sci. 1(3), 270–280 (2017)
8. Novo, O.: Scalable access management in IoT using blockchain: a performance
evaluation. IEEE Internet Things J. 6, 4694–4701 (2018)
9. Sharma, P.K., Park, J.H.: Blockchain based hybrid network architecture for the
smart city. Future Gener. Comput. Syst. 86, 650–655 (2018)
10. Naz, M., Javaid, N., Iqbal, S.: Research based data rights management using
blockchain over ethereum network. MS thesis. COMSATS University Islamabad
(CUI), Islamabad 44000, Pakistan, July 2019
11. Javaid, A., Javaid, N., Imran, M.: Ensuring analyzing and monetization of data
using data science and blockchain in IoT devices. MS thesis. COMSATS University
Islamabad (CUI), Islamabad 44000, Pakistan, July 2019
12. Kazmi, H.S.Z., Javaid, N., Imran, M.: Towards energy efficiency and trustfulness
in complex networks using data science techniques and blockchain. MS thesis.
COMSATS University Islamabad (CUI), Islamabad 44000, Pakistan, July 2019
13. Sharma, P.K., Singh, S., Jeong, Y.-S., Park, J.H.: DistBlockNet: a distributed
blockchains-based secure SDN architecture for IoT networks. IEEE Commun. Mag.
55(9), 78–85 (2017)
14. Jiang, T., Fang, H., Wang, H.: Blockchain-based Internet of vehicles: distributed
network architecture and performance analysis. IEEE Internet Things J. https://
doi.org/10.1109/JIOT.2018.2874398
15. Kushch, S., Prieto-Castrillo, F.: A rolling blockchain for a dynamic WSNs in a
smart city. arXiv preprint arXiv:1806.11399 (2018)

16. Xu, Y., Wang, G., Yang, J., Ren, J., Zhang, Y., Zhang, C.: Towards secure network
computing services for lightweight clients using blockchain. Wirel. Commun. Mob.
Comput. 2018, 12 (2018)
17. Sharma, P.K., Moon, S.Y., Park, J.H.: Block-VN: a distributed blockchain based
vehicular network architecture in smart city. JIPS 13(1), 184–195 (2017)
18. Zahid, M., Javaid, N., Rasheed, M.B.: Balancing electricity demand and supply in
smart grids using blockchain. MS thesis. COMSATS University Islamabad (CUI),
Islamabad 44000, Pakistan, July 2019
19. Noshad, Z., Javaid, N., Imran, M.: Analyzing and securing data using data science
and blockchain in smart networks. MS thesis. COMSATS University Islamabad
(CUI), Islamabad 44000, Pakistan, July 2019
20. ul Hussen Khan, R.J., Javaid, N., Iqbal, S.: Blockchain based node recovery scheme
for wireless sensor networks. MS thesis. COMSATS University Islamabad (CUI),
Islamabad 44000, Pakistan, July 2019
21. Zhang, Y., Wen, J.: The IoT electric business model: using blockchain technology
for the Internet of Things. Peer-to-Peer Netw. Appl. 10(4), 983–994 (2017)
22. Singh, M., Kim, S.: Branch based blockchain technology in intelligent vehicle. Com-
put. Netw. 145, 219–231 (2018)
23. Li, J.: Data transmission scheme considering node failure for blockchain. Wirel.
Pers. Commun. 103(1), 179–194 (2018)
24. Lin, D., Tang, Y.: Blockchain consensus based user access strategies in D2D net-
works for data-intensive applications. IEEE Access 6, 72683–72690 (2018)
25. Rehman, M., Javaid, N., Awais, M., Imran, M., Naseer, N.: Cloud based secure
service providing for IoTs using blockchain. In: IEEE Global Communications Con-
ference (GLOBCOM 2019) (2019)
26. Samuel, O., Javaid, N., Awais, M., Ahmed, Z., Imran, M., Guizani, M.: A
blockchain model for fair data sharing in deregulated smart grids. In: IEEE Global
Communications Conference (GLOBCOM 2019) (2019)
27. Mateen, A., Javaid, N., Iqbal, S.: Towards energy efficient routing in blockchain
based underwater WSNs via recovering the void holes. MS Thesis, COMSATS
University Islamabad (CUI), Islamabad 44000, Pakistan, July 2019
Node Recovery in Wireless Sensor
Networks via Blockchain

Raja Jalees ul Hussen Khan, Zainib Noshad, Atia Javaid, Maheen Zahid,
Ishtiaq Ali, and Nadeem Javaid(B)

COMSATS University, Islamabad 44000, Pakistan


nadeemjavaidqau@gmail.com, http://www.njavaid.com

Abstract. Wireless Sensor Network (WSN) is a network of nodes con-


nected through a wireless channel. The sensor nodes in the network
are resource constrained in terms of energy, storage and computational
power. Node failure is a common phenomenon, which occurs due to envi-
ronmental factors, adversary attacks, draining of battery power, etc.
After node failure, recovery is challenging that needs a strong mecha-
nism. In this paper, Blockchain-based Node Recovery (BNR) scheme for
WSNs is proposed. In BNR scheme, recovery of failed nodes is on the
basis of node degree. The working mechanism of the scheme is that firstly,
the failed nodes are detected using the state (active or inactive) of Clus-
ter Heads (CHs). In the second phase, the recovery process is initiated
for inactive nodes. The main purpose of this step is to recover the failed
CH, which ultimately results in restoring the active states of its member
nodes. NodeRecovery Smart Contract (SC) is written for the purpose.
The cost analysis for NodeRecovery is also performed in the proposed
work. Moreover, the security analysis is performed to ensure the security
of the proposed scheme. Effectiveness of the proposed model is shown by
the simulation results.

Keywords: Blockchain · Wireless Sensor Networks · Node failure ·


Node recovery

1 Background
Wireless Sensor Networks (WSNs) have attracted extensive attention of
researchers in recent times. These networks consist of several sensor nodes work-
ing collectively to monitor the environmental conditions: temperature, humidity,
sound level, pollution level, etc. The collected data is then stored at a central
location, which is termed as a sink or base station. Such nodes have a microcon-
troller, transceiver, wireless communicating devices and an energy source (bat-
tery). The nodes have limited energy, computational power and storage capacity.
Therefore, WSNs have limitations: low computational capability, small memory
and limited energy resources.
Blockchain is also currently being discussed in the news around the globe. It was proposed in 2008 by Satoshi Nakamoto and is simply an arrangement of blocks

connected through hashes. The core of this technology is a secure and distributed ledger [1]. The most renowned application of blockchain is Bitcoin [2]. However, it is
applicable to different fields, including WSNs, Internet of Things (IoT), Internet
of Vehicles (IoV), academia, smart grid [3], smart city, etc.
In WSNs, the major focus of authors is on security problems, computational
power, low data storage and node failure of the sensor nodes. There is a need for a blockchain framework for partially connected IoT devices. In [4], smart cars are considered as the nodes of the network; these cars can be either fully or partially connected to the network. A rolling blockchain is proposed to handle the large amount of data generated by the cars. This technique offers two options to accommodate the incoming data: (i) delete the first block from the memory of the sensors, or (ii) delete all blocks except the last block.
Monte Carlo simulations in case of broken links. Simulations show that in the
case of partially connected devices, the network remains stable and blockchain
can be built. However, security analysis, hacker’s attack and use of Merkle tree
in the proposed algorithm are still left open.
The authors in [5] proposed a data transmission scheme based on the multi-link concurrent communication tree model, in order to handle failed nodes in the blockchain. Results show that the proposed scheme works effectively for up to 15% failed nodes; however, if this number reaches 30%, the communication time and delay increase.
A blockchain-based incentive mechanism is proposed in [6] for data storage by the nodes. The researchers replaced the traditional Proof of Work (PoW) with Provable Data Possession (PDP) to achieve better results. For comparing the existing data of a node with new data, they used a preserving hash function. The damaged data is only identified by the proposed scheme; however, it is not recovered. In [7], an energy efficient routing scheme is
proposed for Underwater WSNs using blockchain.
A blockchain-based solution is proposed in [8] by the authors for long range wide area networks. Crowd sensing is used for the integration of long range wide area networks and blockchain. The proposed model deals with the issues of trust and lack of network coverage in private networks.
It is an era of IoT; therefore, the integration of blockchain and IoT is studied
by many researchers. They aimed at how to resolve the issues of trust, latency,
security, scalability, efficiency and massive data generated by IoT devices. In [9],
the authors proposed a blockchain-based secure service provisioning mechanism
for lightweight IoT devices. Smart Contract (SC) is used to check the validity
of acquired services and edge service providers. Results show that the proposed
scheme protects lightweight devices from untrusted edge service providers and
unsecure services. However, a reputation system for the service providers can be
built in future.
In [10], a reputation system is built for the edge servers. Also, incentives are
given. The authors proposed PoA instead of PoW to ensure high throughput and
low latency of the proposed system. However, security mechanism for lightweight

clients is missing. Also in [11], an incentive mechanism is proposed on the basis


of reputation of edge servers.
IoT is a network that connects smart objects together so that they can com-
municate with each other. If the smart objects are cars, it is known as IoV.
Branch-based blockchain technology for Intelligent Vehicles (IVs) was proposed
in [12] to handle the large amount of data generated by IVs, while blockchain is used to keep track of the data generated by the IVs and to verify it. Additionally, the concept of an IV Trust Point (IVTP) is introduced to build trust. The problem with branching is that duplicate state changes increase with increasing load. In
[13], an efficient PoA is proposed by the authors for fair data sharing in smart
grids. In the proposed model, a block is added by the customer with high rep-
utation instead of solving complex computational problems. Results show that
the proposed PoA outperforms state of the art PoW in terms of computational
cost and gas consumption. Also, the system is secure against attacks. In [14],
data rights are managed using blockchain.
In paper [15], the authors used blockchain for secure data storage. To resolve the storage issue, they proposed an architecture based on five blockchains, which store different types of data. The results show that the storage issue is tackled by providing secure and large data storage, and high throughput is achieved with increasing data; however, the delay also increases as the data grows. A Blockchain-based Vehicular Network (Block-VN) model is proposed in [16] for the smart city, in order to solve the issues of user comfort and security. There are three types
of nodes in the model: miner nodes, ordinary nodes and controller nodes. The
authors achieved real-time ride sharing, ride switching and security; however, they did not consider that there can be broken nodes that may transmit inaccurate
information. In [17–19], blockchain and data science are used for data moneti-
zation, energy efficiency and data security.
WSNs consist of resource-constrained sensor nodes, mostly deployed in harsh conditions, e.g., caves, deserts and forests. Such conditions, combined with the limited resources (battery, storage and computational power) of the sensors, make the sensor nodes vulnerable to faults. Node failure is thus a common phenomenon in WSNs. For the proper functioning of WSNs, the identification and recovery of failed nodes is important. Keeping the issue of node failure in mind, a Blockchain-based Node Recovery (BNR) scheme is presented in this paper. The scheme not only detects, but also recovers the failed nodes.

2 Proposed System Model


A consortium blockchain-based model is proposed in this paper. Figure 1 illustrates the proposed system model. In this work, a clustered WSN is considered, i.e., a group of sensor nodes organized into different clusters [20]. Each cluster has
sensor nodes and a Cluster Head (CH). The model is divided into sensor nodes,
CHs, sink node and blockchain technology. Sensor nodes are deployed to collect
data from the environment.
Nodes collect and forward the data to the CH. Non-CHs can communicate
to the sink via CH, which is then responsible for the transfer of data to the sink

node. The sink node receives the data from the different CHs and performs the necessary operations. The sink is also responsible for initiating the SC execution. Since the sensor nodes are connected to the sink through a CH, the failure of a CH can cause the failure of its sensor nodes: when a CH fails, its attached sensor nodes fail to communicate with the sink node.

Fig. 1. Proposed system model for BNR

Blockchain is a distributed peer-to-peer technology. In the proposed model,


blockchain is mainly implemented at CHs. Each CH is connected to every other
CH via a peer-to-peer network and also to the sink node. Every CH maintains a
neighbor list, which contains the ID, node degree, state and reputation. As the network is distributed, all the CHs have information about each other. When the state of a CH changes from working to non-working, i.e., state = 0, the other nodes become aware of the change. Once the failure of a CH is detected, the recovery process starts. Recovery is based on the node degree and reputation of the CHs: the CH with the least degree and highest reputation is selected as the best candidate, and the failed CH is replaced with it.
Figure 2 is the activity diagram of the BNR scheme. It shows the basic work-
flow of the scheme. Sensor nodes sense data from the field in which they are
deployed. After the data is acquired, it is sent from the source to the CHs. At
this step, the proposed system checks the state of CH. There are two possibili-
ties: either CH is in the working state, i.e., state = 1 or in the non-working state,
i.e., state = 0. A state change is decided after consensus among the CHs. If a CH is in state = 0, a communication gap is created between the sensor nodes and the sink node.

Fig. 2. Activity diagram of the BNR scheme

In the case of a working CH, consensus is performed among the CHs to assure that state = 1. If 51% consensus is achieved that the CH is in the working state, it receives the data from the sensor nodes and forwards it to the sink, where the sink node performs the necessary tasks. However, if a failed CH is detected, consensus about the failed CH is reached among the working CHs and then the recovery process starts. The first step is the selection of the best candidate for replacement, which is done on the basis of node degree.
Node degree is the number of edges or connections. The CH with the least node degree is selected as the best candidate. As the CHs are connected in a peer-to-peer network, the distance of all CHs from the failed CH is the same; therefore, we ignore the distance and only consider the node degree. The best candidate then replaces the failed CH. Once the CH is recovered, its neighbor sensor nodes are again able to communicate normally with the sink node.
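A simplified Solidity sketch of this detection and replacement logic is shown below. The struct fields and function names (ClusterHead, deployCH, setState, bestCandidate) are illustrative assumptions, since the actual NodeRecovery SC source is not listed in the paper.

pragma solidity ^0.5.0;

// Illustrative sketch of the BNR selection logic; identifiers are assumptions.
contract NodeRecoverySketch {
    struct ClusterHead {
        uint256 id;
        uint256 nodeDegree;  // number of connections (edges)
        uint256 reputation;
        bool active;         // true = working (state = 1), false = failed (state = 0)
    }

    ClusterHead[] public chs;

    function deployCH(uint256 id, uint256 degree, uint256 reputation) public {
        chs.push(ClusterHead(id, degree, reputation, true));
    }

    // In the full scheme this state change would follow 51% consensus among the CHs.
    function setState(uint256 index, bool active) public {
        chs[index].active = active;
    }

    // Select the working CH with the least node degree (and, on a tie,
    // the highest reputation) as the best candidate to replace a failed CH.
    function bestCandidate() public view returns (uint256 best) {
        bool found = false;
        for (uint256 i = 0; i < chs.length; i++) {
            if (!chs[i].active) continue;
            if (!found ||
                chs[i].nodeDegree < chs[best].nodeDegree ||
                (chs[i].nodeDegree == chs[best].nodeDegree &&
                 chs[i].reputation > chs[best].reputation)) {
                best = i;
                found = true;
            }
        }
        require(found, "no working CH available");
    }
}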

3 Simulation Results and Discussions

In this section, the experimental setup and simulation results are discussed, and the effectiveness of the model is shown by the results. A cost analysis for the NodeRecovery SC is performed, and a security analysis of the proposed scheme is also given in this section.

3.1 SC Implementation

The NodeRecovery SC is implemented using the Solidity language with the help of the following tools:

• Tools: Remix IDE, MetaMask, Ganache, Rinkeby test network and MAT-
LAB R2018a.

Remix is an Ethereum based IDE, which supports implementation and execution


of SCs using Solidity language. MetaMask is an Ethereum wallet connected to
Ganache, where Ganache provides Ethereum accounts with Ethers for testing.
Rinkeby test network is used, as it supports Proof of Authority (PoA). MATLAB
R2018a is used for the simulation purpose.

Fig. 3. NodeRecovery deployment cost

The deployment of the SC is an expensive operation. In Fig. 3, transaction


cost and execution cost for deploying the SC are shown. The transaction cost
for the NodeRecovery is high (1803382 gas units) as compared to execution cost
(1353610 gas units). This is due to the fact that the execution cost is included in the transaction cost.
As the execution cost is included in the transaction cost, only the transaction cost is considered in further simulations. Different operations are performed by the NodeRecovery SC, each having a different gas consumption. These operations include: Attached-Sensors, Deploying-CHs, CH-Neighbors, CH-Degree, State-Detection, Recovery-Needed and CH-Recovery. Gas is consumed when a transaction happens in the SC. The Deploying-CHs function involves transactions, so a cost is incurred. The other functions are only used to view the results; therefore, no cost is incurred for those functions.
For the simulations, 4 CHs and 30 sensor nodes are considered. These CHs are connected to each other via a peer-to-peer network and maintain a neighbor list. Once a CH fails, the CH with the least degree replaces it. Details of the CHs and sensor nodes are given in Table 1, along with their IDs and degrees. CH1 has ID = 01, sensor nodes = 6 and degree = 9; CH2 has ID = 02, sensor nodes = 9 and

Table 1. Parameters of CHs

CHs | ID | Attached sensor nodes | Node degree
CH1 | 01 | 6  | 9
CH2 | 02 | 9  | 12
CH3 | 03 | 12 | 15
CH4 | 04 | 3  | 6

Fig. 4. State checking cost when CH1 is not working

degree = 12; CH3 has ID = 03, sensor nodes = 12 and degree = 15; and CH4 has ID = 04, sensor nodes = 3 and degree = 6.
Four different scenarios are used to obtain the results depicted in Figs. 4, 5, 6 and 7. Figure 4 shows the gas consumption for the first scenario, where CH1 is in the non-working state (state = 0) while the other CHs are in the working state (state = 1). Similarly, Fig. 5 shows the gas consumption for the second scenario, where the state of CH2 is considered as non-working and the other CHs as working. In the third scenario, state = 0 is considered for CH3 and state = 1 for the others, while in the last scenario CH4 is considered to be in the non-working state.
Figures 4, 5, 6 and 7 show the gas consumption for the first, second, third and fourth scenario, respectively. It can be concluded from these results that the CH which is in the non-working state consumes less gas, while the CHs in the working state consume more gas. Note that the CH in the non-working state still consumes some gas, because gas is consumed for checking the state of a CH.

3.2 Outputs for SC


Outputs of NodeRecovery SC are given in this part. Each CH in the network
maintains a list of its neighbor CHs. Figure 8 is the output for neighbor list
maintenance. Any CH can be either in working or non-working state.
Fig. 5. State checking cost when CH2 is not working

Fig. 6. State checking cost when CH3 is not working

Fig. 7. State checking cost when CH4 is not working

Figure 9 shows the states of CHs. It can be seen that CH3 is in the non-
working state, i.e., state = 0, while CH1, CH2 and CH4 are in the working state.
After CH3 is detected as a failed node, the recovery process starts.
As recovery in the scheme is based on node degree, Fig. 10 shows the node degrees of the different CHs. It can be seen that CH4 has the least node degree, i.e., 6. Therefore, Fig. 11 shows that CH4 replaces the failed CH3. Ultimately, the adjacent nodes of CH3 are recovered and able to

Fig. 8. Neighbor list maintenance log

Fig. 9. State detection log

Fig. 10. CH degree log

communicate again. The results show that the proposed system recovers the failed nodes effectively.

3.3 Cost Analysis


Cost analysis is performed for the NodeRecovery SC. The cost can be calculated
by calculating the gas consumption. Table 2 shows the total gas consumption for
different scenarios for recovery of nodes. It can be seen that the gas consumption
for different scenarios is distinct. However, overall gas usage is stable (2113662)
for all scenarios. While performing simulations, price of one Ether is $269.08 in
May 2019. Where, 1 Ether is for 1 billion gas units or 1 gas unit = 0.000000001

Fig. 11. CH recovery log

Table 2. Gas consumption detail for node recovery

Scenarios  | SC deployment cost (gas) | CH1 (gas) | CH2 (gas) | CH3 (gas) | CH4 (gas) | Total (gas)
Failed CH1 | 1803382 | 62694 | 82536 | 82470 | 82580 | 2113662
Failed CH2 | 1803382 | 82558 | 62672 | 82470 | 82580 | 2113662
Failed CH3 | 1803382 | 82558 | 82536 | 62606 | 82580 | 2113662
Failed CH4 | 1803382 | 82558 | 82536 | 82470 | 62716 | 2113662

To calculate the price in USD, the consumed gas is first converted to Ethers using the formula adapted from [21]:

Ethers used = Gas consumed ∗ 0.000000001 Ether.

Using this formula, the total number of Ethers used is 0.002113662. The total cost required to execute a full scenario is then obtained by multiplying this value by the current price of Ether. Hence, the total cost of setting up and executing NodeRecovery is as low as $0.5687.
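For clarity, the arithmetic above can be reproduced with a few lines of Python; the constants are the values reported in this section.

```python
# Reproducing the cost computation above with the values reported in the paper.
GAS_PER_ETHER = 1_000_000_000        # 1 Ether assumed equivalent to 1 billion gas units
ETHER_PRICE_USD = 269.08             # Ether price in May 2019

total_gas = 2_113_662                # total gas for one full scenario (Table 2)
ethers_used = total_gas / GAS_PER_ETHER      # = total_gas * 0.000000001 Ether
cost_usd = ethers_used * ETHER_PRICE_USD

print(ethers_used)                   # 0.002113662
print(round(cost_usd, 4))            # 0.5687
```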

3.4 Security Analysis


WSNs are widely used in areas where security is important, such as military applications, healthcare and other fields. To address this security requirement, a consortium blockchain is used in the proposed scheme. In a consortium blockchain, the consensus mechanism is managed by a set of pre-defined participants; in the proposed BNR scheme, consensus is performed by the known CHs. Since unauthorized nodes cannot take part in consensus, BNR is protected against attacks from such nodes.

4 Conclusion
In recent years, researchers have incorporated blockchain into different fields, including WSNs. In this paper, the node failure issue in WSNs is addressed using blockchain.

The BNR scheme is proposed for the recovery of failed nodes. PoW is replaced with PoA to avoid computational overhead in resource-constrained WSNs. The NodeRecovery SC is developed for the recovery of failed nodes. A failed CH is successfully detected and recovered based on its state with the help of the BNR scheme. Once the CH is recovered, the failed nodes are also recovered and able to communicate with the sink again. Results show that the system is effective in recovering failed nodes. Since the cost of deploying and executing an SC matters, a cost analysis is also performed; it shows that the total cost for deployment and execution of NodeRecovery is $0.5687, which is very low. A security analysis of the proposed scheme is also carried out.

References
1. Chen, W., Xu, Z., Shi, S., Zhao, Y., Zhao, J.: A survey of blockchain applica-
tions in different domains. In: Proceedings of the 2018 International Conference on
Blockchain Technology and Application, pp. 17–21. ACM (2018)
2. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system (2008)
3. Zahid, M., Nadeem, J., Muhammad, B.R.: Balancing electricity demand and sup-
ply in smart grids using blockchain. MS thesis, COMSATS University Islamabad
(CUI), Islamabad, Pakistan, July 2019
4. Kushch, S., Prieto-Castrillo, F.A.: Rolling blockchain for a dynamic WSNs in a
smart city (2018). arXiv preprint arXiv:1806.11399
5. Li, J.: Data transmission scheme considering node failure for blockchain. Wirel.
Pers. Commun. 103(1), 179–194 (2018)
6. Ren, Y., Liu, Y., Ji, S., Sangaiah, A.K., Wang, J.: Incentive mechanism of data
storage based on blockchain for Wireless Sensor Networks. Mob. Inf. Syst. 2018,
10 (2018)
7. Mateen, A., Nadeem, J., Sohail, I.: Towards energy efficient routing in blockchain
based underwater WSNs via recovering the void holes. MS thesis, COMSATS Uni-
versity Islamabad (CUI), Islamabad, Pakistan, July 2019
8. Lin, J., Shen, Z., Miao, C., Liu, S.: Using blockchain to build trusted LoRaWAN
sharing server. Int. J. Crowd Sci. 1(3), 270–280 (2017)
9. Xu, Y., Wang, G., Yang, J., Ren, J., Zhang, Y., Zhang, C.: Towards secure network
computing services for lightweight clients using blockchain. Wirel. Commun. Mob.
Comput. 2018, 12 (2018)
10. Rehman, M., Nadeem, J., Muhammad A., Muhammad, I., Nidal, N.: Cloud based
secure service providing for IoTs using blockchain. In: The IEEE Global Commu-
nications Conference (GLOBECOM 2019) (2019)
11. Ali, I., Nadeem, J., Sohail, I.: An incentive mechanism for secure service provision-
ing for lightweight clients based on blockchain. MS thesis, COMSATS University
Islamabad (CUI), Islamabad, Pakistan, July 2019
12. Singh, M., Kim, S.: Branch based blockchain technology in intelligent vehicle. Com-
put. Netw. 145, 219–231 (2018)
13. Samuel, O., Nadeem, J., Muhammad, A., Zeeshan, A., Muhammad, I., Mohsen,
G.: A blockchain model for fair data sharing in deregulated smart grids. In: The
IEEE Global Communications Conference (GLOBECOM 2019) (2019)

14. Naz, M., Nadeem, J., Sohail, I.: Research based data rights management using
blockchain over ethereum network. MS thesis, COMSATS University Islamabad
(CUI), Islamabad, Pakistan, July 2019
15. Jiang, T., Fang, H., Wang, H.: Blockchain-based Internet of vehicles: distributed
network architecture and performance analysis. IEEE Internet of Things J. 6,
4640–4649 (2018)
16. Sharma, P.K., Moon, S.Y., Park, J.H.: Block-VN: a distributed blockchain based
vehicular network architecture in smart city. JIPS 13(1), 184–195 (2017)
17. Javaid, A., Nadeem, J., Muhammad, I.: Ensuring analyzing and monetization of
data using data science and blockchain in IoT devices. MS thesis, COMSATS
University Islamabad (CUI), Islamabad, Pakistan, July 2019
18. Kazmi, H.S.Z., Nadeem, J., Muhammad, I.: Towards energy efficiency and trustful-
ness in complex networks using data science techniques and blockchain. MS thesis,
COMSATS University Islamabad (CUI), Islamabad, Pakistan, July 2019
19. Noshad, Z., Nadeem, J., Muhammad, I.: Analyzing and securing data using data
science and blockchain in smart networks. MS thesis, COMSATS University Islam-
abad (CUI), Islamabad, Pakistan, July 2019
20. Jarwan, A., Sabbah, A., Ibnkahla, M.: Data transmission reduction schemes in
WSNs for efficient IoT systems. IEEE J. Sel. Areas Commun. 37(6), 1307–1324
(2019)
21. Pãnescu, A.-T., Manta, V.: Smart contracts for research data rights management
over the ethereum blockchain network. Sci. Technol. Libr. 37(3), 235–245 (2018)
Towards Plug and Use Functionality
for Autonomous Buildings

Markus Aleksy(B) , Reuben Borrison, Christian Groß, and Johannes Schmitt

ABB Corporate Research, Wallstadter Str. 59, 68526 Ladenburg, Germany


{markus.aleksy,reuben.borrison,christian.d.gross,
johannes.o.schmitt}@de.abb.com

Abstract. Existing building automation solutions suffer from the draw-


back of causing high engineering, commissioning, and installation efforts.
Furthermore, the existing approaches are error-prone and require a lot
of manual interaction between the involved parties. This article presents
the results of an analysis for moving towards autonomous plug&use func-
tionality by exploiting external data sources merged and maintained in
a central information model, e.g., device information obtained via the
scanning of NFC or QR tags, functional planning information, network
topology information, wiring plan information, etc. In this context, the
autonomous plug&use functionality aims at fully automating the com-
missioning process of devices of a building automation system by deriving
input parameters from external data sources thereby minimizing manual
user efforts.

1 Introduction
With the emergence of the Internet of Things, almost every aspect of our everyday life is influenced by smart sensors and actuators that try to optimize and simplify it. Building automation directly benefits from this: with the constantly increasing number of smart sensors and network-enabled actuators in buildings, various tasks can be automated and optimized, leading to increasingly autonomous buildings that provide increased comfort and safety while exhibiting a reduced cost footprint throughout their entire lifetime [4,12].
From a high-level perspective, this encompasses the following different life-
time phases [11]:
• The planning, installation and commissioning phase of a building, when the building architecture is defined, the electrical planning is done, and all electrical components of the building automation system are installed, configured, and put into operational mode.
• The maintenance and operation phase during which the building is used
as planned and where all electrical components are working normally. In case
of a failure of one or more components, a maintenance workflow is triggered
leading to a possible exchange of components putting the system back into
operational state.

• The demolition and tear down phase, where the building automation system is torn down permanently and all its components are discarded.
While a higher degree of automation leads to increased comfort, efficiency, and security for users in buildings, it requires a large number of sensors and actuators. Existing building automation solutions suffer from the
drawback of causing high engineering, commissioning, and installation efforts for
commercial buildings. Furthermore, the existing approaches are error-prone and
require a lot of manual interaction between the involved parties. To overcome the problems mentioned above, the following challenges have to be addressed:
(i) Identify all major stakeholders that play a major role in the planning and
commissioning of building automation components. (ii) Identify and describe
the main steps of the planning and commissioning workflow of building automa-
tion components. (iii) Identify and analyze the major pain points for the key
stakeholder in each step of the workflow. (iv) Develop improvements that can be
made regarding the current workflow of planning and commissioning of building
automation devices leading to a plug&use functionality. Due to space limitation,
we will focus on the last aspect in this paper.
Based on our analysis of the above aspects, we introduce a novel tag-enabled
commissioning approach aiming for an automated plug&use functionality that
reduces manual and faulty user inputs thereby increasing the efficiency of the
commissioning and configuration process. In this context, we define the plug&use
functionality as the ability of a building automation system consisting of differ-
ent platforms and technologies to (semi-)automatically identify, connect, locate,
and configure all its required components and services and to put them into an operational state providing the full targeted functionality without any limi-
tations. This comprises introducing and connecting the components to various
internal and external services.
The remainder of the paper is structured as follows: Sect. 2 highlights the
autonomy levels for a step-wise introduction of a plug&use functionality, lead-
ing to a more and more autonomous building. Subsequently, Sect. 3 presents
the general assumptions upon which we based our later design followed by
Sect. 4 describing the design considerations that guided our work. Further-
more, the section describes the steps of an abstract commissioning workflow.
Section 5 provides the definition and classification of use cases addressing a sim-
plified plug&use functionality for building automation. Section 6 gives a detailed
description of a selected use case focusing on adding already installed devices to
a functional building plan. Finally, Sect. 7 concludes our work.

2 Autonomy Levels for Plug and Use in Buildings


The discussion on levels of autonomy was started in the area of autonomous driv-
ing [13]. Gamer et al. [3] defined different levels of autonomy in industry ranging
from zero to five where autonomy level 0 comprises conventional industry exhibit-
ing no autonomous behavior and level 5 constituting the highest autonomy level
of industrial automation. When applying the same autonomy level definitions on

Fig. 1. Levels of autonomy for plug and use in buildings.

the plug&use functionality, we obtain the autonomy levels for plug&use shown in Fig. 1.
Thereby, the plug&use functionality is further broken down into the following
functional artifacts:

• Device connectivity ensuring that any given device is physically connected


to the system in a wired or wireless way, powered up, and communication
enabled.
• Device detection ensuring that any newly added device is detected by the
system.
• Device identification ensuring that each device can be identified uniquely.
• Device localization ensuring that the physical location (PL) and functional
location (FL) of the device is determined correctly.
• Device configuration ensuring that all relevant configuration parameters of
the device are set properly.
Based on the given functional artifacts, the autonomy levels for the plug&use
functionality can be defined as follows: Level 0 defines the conventional installa-
tion procedure for non-communicating electrical components such as traditional
lights and switches. If a component requires further configuration parameters, they must be set physically on the device. By introducing net-
work communicating automation components to buildings, we obtain an auto-
mated building as defined by the autonomy level 1 that represents the default
“automated building” of today. All functional artifacts need to be done manu-
ally by the system integrator. By introducing increased tool support via profiles
or wizards, the plug&use functionality at autonomy level 2 can be reached. At
autonomy level 3, installed devices are automatically detected and manually
configured based on the functionality they provide. By introducing autonomy
level 4, all devices are automatically connected, detected, and identified by the
building automation system. The device localization, however, still needs to be
done manually by the system integrator. At autonomy level 5, the full plug&use
functionality is reached, where all functional artifacts are executed fully auto-

matically without human intervention after power-up of the building automation


system.
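A compact way of keeping track of this classification is a simple lookup table. The Python sketch below is our reading of Fig. 1 and of the description above; the exact assignment of artifacts at the intermediate levels is partly an assumption and is not taken from the original work.

```python
# Our reading of the autonomy levels above: which functional artifacts are automated
# at each level. Assignments marked as assumptions are not spelled out explicitly.
ARTIFACTS = ("connectivity", "detection", "identification", "localization", "configuration")

AUTOMATED = {
    0: set(),                                            # conventional installation
    1: set(),                                            # today's "automated building", all manual
    2: set(),                                            # still manual, but with profiles/wizards support
    3: {"detection"},                                    # devices detected automatically
    4: {"connectivity", "detection", "identification"},  # localization manual (configuration too: assumption)
    5: set(ARTIFACTS),                                   # full plug&use, no human intervention
}

def still_manual(level):
    """Artifacts that a system integrator still has to handle at a given level."""
    return [a for a in ARTIFACTS if a not in AUTOMATED[level]]

print(still_manual(4))   # ['localization', 'configuration']
```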

3 General Assumptions
For the development of the plug&use functionality the following general assump-
tions have been made:
• Network communication is required by a device to provide its functionality
(i.e. not considering standalone devices).
• The focus is on commonly used, publicly available communication standards/technologies, e.g., KNX [8,9], Zigbee [1], Bluetooth [6]. Proprietary technologies and technologies without any kind of device or interface description (like Bluetooth profiles) are excluded in this report, as they are considered unsuitable for the targeted plug&use capabilities.
• Targeting larger use-cases like mid-size commercial buildings (not private
households, however, overlapping use-cases might occur) with a size of up to
10,000 m².
• Devices can have differing FL and PL (e.g. a device installed in a cabinet
switching mains that powers the lamp installed in a room).
• Building automation installations are typically heterogeneous, comprising various device types and technologies from different manufacturers.

4 Design Considerations
In the following, we present the design considerations that guided our work and that drive the need for plug&use in the area of building automation.

4.1 Main Function Blocks for Plug and Use


Plug&use targets simplified or, ideally, automatic commissioning of devices. Commissioning is mainly about the configuration of devices. The configuration of a device comprises (A) the activation and parameterization of the targeted functionality. If this functionality is based on interaction with other devices, communication between these devices must be established. To establish communication, the (new) device typically has to be (B) connected to the targeted communication technology, and (C) the interaction between the devices must be configured.
Figure 2 shows the abstract commissioning workflow consisting of the steps
for identifying, connecting, and configuring a component of a building automa-
tion system. Thereby, each step depends on a set of input parameters that need
to be given to execute and complete the step. For achieving fully autonomous
plug&use functionality each step of the workflow needs to be automated which in
turn requires all the input parameters to be derived automatically from different

input sources. In cases where a parameter cannot be derived automatically, the system requires manual user input (a sketch of this fallback logic follows Fig. 2). The commissioning process, as shown in Fig. 2, consists of the following basic steps:

• Start: Unconfigured devices that require installation and commissioning.


• Device detection, identification, and localization: The ability of the
system to detect, uniquely identify, and localize each entity of the system.
• Device connectivity: The ability of the system to address each entity and
to establish a connection to each entity for at least configuration purposes.
• Device configuration: The ability of the system to set all the input parame-
ters of a device to integrate and activate it such that it reaches an operational
state.
• End: Fully configured device that is integrated with the rest of the system
and in operational state.

Fig. 2. Basic commissioning steps.
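The following Python sketch illustrates the fallback logic mentioned above: parameters are derived automatically from external sources where possible and requested manually otherwise. The step and parameter names are illustrative only and are not taken from a concrete product.

```python
# Illustrative sketch of the commissioning workflow in Fig. 2: each step declares the
# input parameters it needs; values are derived from external sources where possible
# and requested from the user otherwise (manual fallback).

WORKFLOW = [
    ("device detection/identification/localization", ["device_id", "device_type", "location"]),
    ("device connectivity", ["network_address"]),
    ("device configuration", ["activated_functions", "interactions"]),
]

def run_step(name, needed, derived, ask_user):
    params = {}
    for p in needed:
        params[p] = derived[p] if p in derived else ask_user(name, p)
    print(f"{name}: done with {params}")
    return params

# Example: everything except the device interactions can be derived automatically.
derived_sources = {"device_id": "SN-4711", "device_type": "dimmer", "location": "room 1.03",
                   "network_address": "1.1.7", "activated_functions": ["dimming"]}
manual = lambda step, param: f"<manual input for {param}>"   # stands in for a user prompt
for step_name, needed_params in WORKFLOW:
    run_step(step_name, needed_params, derived_sources, manual)
```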

The order of execution may vary: e.g., in KNX the device configuration is typically created first using the commissioning tool ETS, where the device location is also defined manually. Then, after the device connectivity is established
on the KNX network, the device is identified (by the given physical address or
by a pressed push-button) and the device configuration is downloaded to the
device.

4.2 Support for Differing Workflows

In general, workflows can be differentiated based on the following starting scenarios:

• Greenfield: Complete new installation of a building automation system.


Assuming no configured system, devices or network infrastructure.
• Brownfield: Extension/partially replacement of an existing building
automation installation. Assuming existing network connectivity and working
system with configured devices.

However, in both cases, artifacts about the targeted installation might be available (e.g. functional, building, or wiring plans).

4.3 Including Workflow Artifacts and Mechanisms


The following aspects are also relevant to categorize the plug&use workflow. We
assume that the artifacts are available in machine-readable form (otherwise a
manual interpretation might be considered as an alternative). Unless mentioned otherwise, the artifacts can be available or relevant for both approaches (Greenfield and
Brownfield). These artifacts might be contained as part of a project database
(like in ETS for KNX):
• Building topology plan includes information about the building structure,
such as floors, rooms, areas, and cabinets.
• Functional plan typically makes use of (common/standardized) functional
blocks to describe targeted functions (e.g. VDI3813 [14]) and can contain
configuration details (e.g. dimming speed). Functional planning is based on existing information about the building topology and is usually used as a basis for device placement.
• Device placement plan contains information where the (physical) devices
are or shall be placed.
• Wiring plan contains planning of (at least) the endpoints of cables or cable channels related to an existing building topology. The level of detail may range from cable channels only up to which cable is connected to which device module, and the plan can also include technology-specific details (e.g. KNX addresses).
• Infrastructure plan provides additional information about building
automation technology specific aspects including the related infrastructure
(e.g. LAN/WiFi settings or KNX line-couplers).
• Device semantics are artifacts that allow the mapping of devices towards a
(common) set of semantic definitions (e.g. mapping a device towards a dim-
mer). Device Semantics are supposed to support the mapping from a Func-
tional Plan to certain devices and thus to determine the Device Placement.
• (Pre-)Configured devices can contain valuable information: In a Green-
field scenario the integrator might have pre-configured the devices to (at least)
make the devices available in the network once they are installed and powered.
If, in a Brownfield scenario, a device that is still running and available in the network shall be replaced, it might be possible to gather information about its configuration by reading it from the device.
• Device identification is a mechanism that provides an identifier to identify
the device within the used technology or network. The identifier can have
different properties, such as being human-readable (e.g. serial number printed
on the device), being machine-readable (e.g. NFC tags, QR tags), responding
on-request (blinking LED on selected device) or being triggered manually
(e.g. by pressing a push-button on the device). Moreover, sometimes the type
information can be part of the device identification, e.g. as an additional
property.
• Device localization is often related to an indoor localization technology, since the global positioning system (GPS) [7] often does not work reliably indoors. Gu et al. [5] and Liu et al. [10] provide an overview of indoor positioning
techniques.

5 Plug and Use Requirements for Selected Use Cases


In our analysis, we focused on the following use cases:

1. Traditional greenfield with ETS/KNX: This use case describes the tra-
ditional workflow for the commissioning of KNX devices using ETS.
2. Utilizing an integrated building automation environment [2] for
commissioning of KNX devices: This function-based planning and com-
missioning tool supports the generation of device configurations based on information modelling of the functional planning, applying the VDI3813 standard.
3. Adding a Bluetooth (BTLE) device to existing setup: The target of
this use case is to integrate a new Bluetooth device into a running building
automation system that makes use of the concepts supported by the function-
based planning and commissioning tool. It is assumed that the function provided by the device was already considered in the functional planning, but not yet assigned to a physical device.
4. Add installed devices to a new functional plan with same PL and
FL: The target is to populate a functional plan with existing (and potentially
pre-configured) devices to further on make use of functional planning concepts
for (re-)configuration of the system. In contrast to the use case before this
use case does not require an existing functional plan. Instead, the target is
to populate a functional plan with already installed devices. This approach
can be described as brownfield-scanning. The brownfield-scanning considers
unconfigured and pre-configured devices.
5. Add installed devices to a new functional plan with differing PL
and FL: This use case is representing a special case of the use case 4: It
takes devices into account that have differing PL and FL. The PL is required
to set up the device communication (e.g. based on the network topology)
while the FL is required to set up the device configuration – especially the
interaction with other devices that have the same FL.

6 Case Study
In the following section, we will describe how use case 4 can benefit from the
plug&use concept in detail as it represents the most common use case during
the commissioning workflow.
The main target of this use case is to populate a functional plan with existing
(and potentially pre-configured) devices to make use of functional planning con-
cepts for (re-)configuration of the system. The corresponding approach can be
described as brownfield-scanning. The brownfield-scanning considers the usage
of unconfigured as well as pre-configured devices.
In the context of the case study, we made our assumptions by considering the following workflow artifacts to be available, as shown in Fig. 3.

Fig. 3. In- and outputs for automating the commissioning process of devices with same
functional and physical address using a function-based planning and commissioning
tool.

• Network topology: We assume that when the devices are already connected
to the network, in some cases the network topology can be derived by scanning
the network.
• Building topology: The building topology plan can be available before or
created during the device identification task (e.g. using the mobile app for
device scanning also to create the location structure).
• Device placement: The information about device placement can be already
part of a building plan or building information model. Otherwise, if the
devices are already installed at their targeted location, additional concepts
can be utilized: e.g. by identifying the devices at their locations by scanning
their NFC or QR tags using a mobile application and assigning their position
information.
• Semantic device packages: Knowledge about the provided functions per
device exists.
• Pre-configured devices: Basically, we assume that new and unconfigured
devices will be scanned. However, the case in which the scanned devices already have a configuration will also be discussed.
• Device identification: Here, device identification is comparable to use case
3 requiring a mechanism to identify the device and its type.

6.1 Device Identification

In the case of pre-configured devices, and in addition to the other identification mechanisms, the devices can be identified by a physical trigger while monitoring the network or by the activation of a status LED. The required input includes the following aspects:

• An identification technology to identify the device: Here, machine-


readable identification technologies, such as NFC or QR tags can be utilized.
The interaction required from the user is reduced to scanning the tag.
• Placement of the devices within the building topology: In addition to
the reading of the tag, the location of the device must be identified. This can

be done manually within the mobile application by selecting the actual room
within the building topology plan, reading another tag that is assigned to the
room and providing the room identifier, or falling back on the functionality
of an indoor positioning system.

The corresponding output includes the device id, the device type, and device
placement. The device id could be a serial number or the MAC address (e.g. in
case of Bluetooth devices). The device type is required since sometimes it cannot
be derived from the device id. Semantic device packages providing the functions
per device type can be used together with the location of the device and its
type to determine the available functions. The available functions per room are
used as input for the functional planning. Finally, the device placement contains
information where the device was scanned (assuming the same PL and FL).
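As an illustration of this step, the following Python sketch maps a scanned tag to the outputs listed above (device id, type, placement, and the functions derived from a semantic device package). All identifiers and package contents are made up for the example.

```python
# Illustrative sketch of the device identification step: a scanned NFC/QR tag yields a
# device id (and possibly the type), the location comes from the building topology,
# and a semantic device package provides the functions per device type.
SEMANTIC_PACKAGES = {                       # assumed example content
    "switch_actuator": ["switching"],
    "dimmer_actuator": ["switching", "dimming"],
}

def identify_device(tag_payload, room, type_catalogue=None):
    device_id = tag_payload["serial"]                       # e.g. serial number or MAC address
    device_type = tag_payload.get("type") or (type_catalogue or {})[device_id]
    return {
        "device_id": device_id,
        "device_type": device_type,
        "placement": room,                                  # same PL and FL in this use case
        "functions": SEMANTIC_PACKAGES[device_type],        # input for the functional planning
    }

scan = {"serial": "ABB-000123", "type": "dimmer_actuator"}  # content of the scanned tag
print(identify_device(scan, room="1.03"))
```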

6.2 Device Connectivity

We can differentiate between the following alternatives concerning device connectivity:

• Supporting lookup for active devices: Here, broadcasting, a method of


transferring a message to all recipients, is used.
• Devices are publishing their availability: This approach is utilized in
Zigbee for example.
• Scanning all addresses for active devices: This technique is used in
KNX-based systems.

The corresponding input includes device IDs or pre-configured physical addresses, while the output is an extended/updated network topology that is required to support device configuration.

6.3 Device Configuration

The focus of device configuration is to (re-)configure the system based on the


functional planning. This use case assumes that the features of the installed
devices are known and assigned to the rooms where the functions are provided.
The main task that the user has to perform afterward within the functional planning is to select the features (i.e. the function blocks) that shall be activated and to set up the interaction between the function blocks (i.e. draw the lines between the I/O ports of the function blocks).
The required input is related to:

• Available functions per room: As determined by the device scanning in


combination with the semantic device packages.
• Network topology: As a given requirement, we assume a working network
environment. The network topology can be derived from the output of the
device scanning and the set up of the connectivity.

• Device placement: Output of the device scanning – including the assump-


tion that PL and FL are the same, the functions of the scanned device have
the same location as the device itself.
• Functional planning: The main tasks that the user must do after scanning,
is to select the functionalities (i.e. the function blocks) that shall be activated
and set up the interaction between the function blocks (i.e. draw the lines
between the I/O ports of the function blocks).

The corresponding output includes the device configuration with technology-


specific settings created by the function-based planning and commissioning tool.
The latter contains information regarding device-specific parameter settings (e.g.
sampling rate), activated functions (which aspects of a device are activated and
relevant for the system), and interactions between devices (e.g. KNX group com-
munication).
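A possible shape of such a configuration record is sketched below in Python; the field names and values are illustrative only and do not come from the original work.

```python
# Illustrative (made-up) shape of the device configuration produced by the
# function-based planning and commissioning tool for one device.
device_configuration = {
    "device_id": "ABB-000123",
    "parameters": {"sampling_rate_hz": 10},          # device-specific parameter settings
    "activated_functions": ["dimming"],              # which aspects of the device are relevant
    "interactions": [                                # e.g. KNX group communication
        {"peer": "ABB-000042", "mechanism": "KNX group address", "group": "1/2/3"},
    ],
}
```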

7 Conclusions and Future Work


Based on the analysis conducted above, the following steps are recommended to move towards an autonomous commissioning functionality enabling plug&use of newly installed components of a building automation system.
Concerning device detection and identification, the use of NFC and QR tags
can significantly reduce the commissioning overhead as well as the error rates
during commissioning. NFC and QR tags are well-known and widely available technologies that are supported by almost every state-of-the-art mobile device. Other
identification technologies such as Bluetooth-Beacons or visual light communi-
cation can be used as well but require more adjustments and more specialized
hardware.
For identifying the PL of devices, most techniques require additional hard-
ware, which causes additional costs or installation and configuration effort before
being able to localize the devices to be installed and integrated into the system.
Other non-hardware-based localization techniques suffer from a limited accuracy
or cannot be used within buildings. Hence, the localization remains a mainly
manual process that could be semi-automated via the scanning of NFC tags
attached to each room or location of relevance.
Looking at the automated determination of the functional location of devices,
as it is needed for devices with differing PL and FL, the most promising approach
would be to make use of wiring plan and network topology information, which in turn requires the availability of a digital, semantically annotated wiring plan format. At the time of this research work, however, corresponding formats were not available. To overcome this drawback, an integrated tool would be needed that allows for both the functional planning of the devices of a building and the subsequent wiring planning. This way, the mapping between devices and instances of the functional planning on the one hand and entities in the wiring plan on the other can be done on the fly. The development of such an integrated tool, however, represents a significant effort, which could be done as part of a separate research activity.

Regarding the automation of the connectivity of devices, DHCP-like addressing approaches could be developed that exploit the network topology of the
building automation network. However, similar to the wiring plans, additional
implementation effort would be needed to enable the exploitation of network
topology information for the automatic address assignment.

References
1. Farahani, S.: ZigBee Wireless Networks and Transceivers (2008). https://doi.org/
10.1016/B978-0-7506-8393-7.X0001-5
2. Gamer, T.: An integrated building automation solution environment. ABB
Rev. 4 (2018). https://new.abb.com/news/detail/11238/an-integrated-building-
automation-solution-environment
3. Gamer, T., Klöpper, B., Hoernicke, M., Subasic, M., Biagini, V., Groß, C.:
Industrie-sicht auf autonomie. atp magazin 61(6–7), 62–68 (2019)
4. Garnett Clark, G., Mehta, P.: Artificial intelligence and networking in integrated
building management systems. Autom. Constr. 6(5–6), 481–498 (1997). https://
doi.org/10.1016/S0926-5805(97)00026-5
5. Gu, Y., Lo, A., Niemegeers, I.: A survey of indoor positioning systems for wireless
personal networks. IEEE Commun. Surv. Tutorials 11(1) (2009)
6. Haartsen, J.C.: Bluetooth radio system. IEEE Pers. Commun. 7(1), 28–36 (2000).
https://doi.org/10.1109/98.824570
7. Kaplan, E., Hegarty, C.: Understanding GPS: Principles and Applications. Artech
House (2005)
8. KNX Association: KNX Specifications. https://www2.knx.org
9. Kriesel, W., Sokollik, F., Helm, P.: KNX/EIB für die Gebäudesystemtechnik in
Wohn- und Zweckbau (2009)
10. Liu, H., Darabi, H., Banerjee, P., Liu, J.: Survey of wireless indoor positioning
techniques and systems. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 37(6),
1067–1080 (2007). https://doi.org/10.1109/TSMCC.2007.905750
11. Ngwepe, L., Aigbavboa, C.: A theoretical review of building life cycle stages and
their related environmental impacts (2015)
12. Penya, Y.K.: Last-generation applied artificial intelligence for energy management
in building automation. IFAC Proc. 36(13), 73–77 (2003). https://doi.org/10.1016/
S1474-6670(17)32467-9
13. SAE: Taxonomy and Definitions for Terms Related to On-Road Motor Vehicle
Automated Driving Systems (2014)
14. VDI-Richtlinien: VDI 3813 - Building automation and control systems (BACS)
(2015)
An Evaluation of Pacemaker Cluster
Resource Manager Reliability

Davide Appierto and Vincenzo Giuliano(B)

University of Naples ‘Parthenope’, Naples, Italy


{davide.appierto,vincenzo.giuliano}@studenti.uniparthenope.it

Abstract. Assessing Mission-Critical systems is non-trivial, even more so when Commercial-Off-The-Shelf (COTS) software tools, which have not been developed following custom reliability rules, are adopted. A sat-
isfactory process of standard certification brings in a model estimation
of system reliability. However, its evaluation requires as input reliability data of the subsystem units, which are quite difficult to obtain. A practi-
cal issue, in fact, concerns a general lack of detailed statistical evalu-
ations coming from on-field experiences. While, on the hardware side,
the research community gave an effective contribution, on the software
side, there is still work to do. An example is represented by the Cluster
Resource Manager (CRM) software running on top of clustered systems,
which is responsible for orchestrating fail-over operations. To the best of
our knowledge, no reliability evaluation based on field experience exists for such a component.
In this work, a particular CRM, namely Pacemaker, was tested to estimate the fail-over success probability in the occurrence of different types of resource outages. Pacemaker is one of the most widely adopted CRMs and is
used in several Critical Infrastructure (CI) contexts to ensure high avail-
ability of their Industrial Control System (ICS). Our experiments have
been conducted on a real clustered ICS, the Train Management System
(TMS) of Hitachi Ansaldo STS.

Keywords: Critical Infrastructure · Industrial Control Systems ·


Pacemaker · Cluster Resource Manager · Dependability ·
Mission-Critical Systems

1 Introduction

Industrial Control Systems (ICS) are now in every sector like energy, telecommu-
nications, water, transport, finance, gas and oil [1,2]. Most of the time they are classified as Mission-Critical Systems since their failures could disrupt a specific
business operation or even lead to loss of life in case of Safety-Critical operations.
Due to ICS importance in everyday life, many mechanisms and strategies are
enforced to harden the reliability of such systems. A consolidated trend, in this
regard, is the usage of redundant computers organized in clusters that can ensure

uninterrupted service in the face of Hardware (HW)/Software (SW) component failures. On top of these computer groups, Cluster Resource Manager (CRM) soft-
ware must execute to manage resources of different nodes, monitor their status,
detect failures and enforce fail-over strategies by automatically restarting appli-
cations on healthy nodes. Usually companies adopt CRMs coming from a wider
set of Commercial-Off-The-Shelf (COTS) software tools, which encompass also
the Operating System (OS) and the Messaging System (MS). COTS software,
in fact, has the tremendous advantage of reducing costs and shortening develop-
ment/deployment time. However, at the same time, it poses serious problems in the proof of compliance with the stringent reliability requirements of certification standards, e.g. IEC61508 or EN50129, as COTS systems are the result of a non-custom development process and therefore cannot provide guarantees in terms of reliability.
Assessing COTS-based systems is therefore a difficult task, all the more so given the lack of literature providing a statistical analysis of their reliability with reusable numerical results. Only weak attempts have been made to prove that COTS
software can be used within systems with certain Safety Integrity Level (SIL)
requirements. As an example, Connelly et al. [3] tried to prove that a COTS OS
can be employed in SIL2 safety-related applications by using wrappers able to
protect the system against failures. The level of fault coverage of such a solution
is extremely low, as those really critical like transient faults cannot be addressed.
Or Pierce et al. [4], which report a detailed document of guidelines on how to
make more predictable Linux OS. They suggest proper tuning of most impor-
tant OS mechanisms to reach SIL2 requirements. Or furthermore, Jones et al. [5]
that provide a survey of assessment methods in the context of IEC61508 stan-
dard certification. Black-box and White-box testing techniques are presented by
the authors highlighting limits of both methods as the lack of numerical fail-
ure rates, the difficulty in enforcing tests without automated tools, the issue of
time-consuming tests, the impossibility of covering a wide range of SW faults.
What emerges is a lack of quantitative, rather than qualitative, evaluations based on field experience. Researchers, in fact, are often forced to use hypothetical reliability metrics in their models, which cannot provide accurate results. This is the case for CRM software, which is usually underestimated during the assessment process. The success of fail-over operations is fundamental to avoiding overly long service outages, and for this reason its probability of failure can strongly impact the overall reliability calculated through the cluster models defined in the assessment process.
To the best of our knowledge, no research paper has estimated the reliability of COTS CRMs through field evidence; this paper aims to bridge that gap. Such an evaluation was the result of a real need during the assessment of an Active/Standby cluster provided by Hitachi Ansaldo STS (ASTS), which aims at demonstrating the compliance of their Train Management System (TMS) application servers with SIL2 standards [6].
In this work, one of the most widely used CRMs, namely Pacemaker, has been tested to obtain a measure of fail-over reliability. Different types of outages were simulated to vary the failure conditions. Since the tuning of Pacemaker

can strongly affect the CRM behavior, the Benz et al. [7] work was leveraged to
define an optimal configuration.
The remainder of this paper is organized as follows: Sect. 2 describes the ASTS
use case by reporting the cluster architecture and the correspondent model. In
Sect. 3, an overview of Pacemaker and its configuration is provided. Then, Sect. 4
reports the approach conducted in the on-field experiments to test the CRM
and evaluate results. After that, Sect. 5 presents the results obtained after the
experimental campaign. Finally, Sect. 6 concludes the document.

2 ASTS Testbed Cluster

Performing accurate estimations of on-field experiments is challenging. The first


contributing element is the testbed that must be as close as possible to real sys-
tems. Our evaluation of Pacemaker fail-over availability was enforced on a real
Industrial Control System (ICS): the application servers of the Train Manage-
ment System (TMS) provided by Hitachi Ansaldo STS (ASTS). This cluster has
been leveraged to test Pacemaker in real conditions and, most importantly, to verify what impact the CRM configuration parameters can have on the overall availability of a two-node cluster. To do so, we delineated a Markov model of the ASTS
cluster that takes into account the Pacemaker CRM.

2.1 Cluster Architecture

The TMS application server is organized as a typical mission-critical system with high availability requirements: a redundant architecture makes it possible to perform Safety Integrity Functions (SIF) through two clustered server machines that provide
uninterrupted service even in front of node outages. The TMS system is config-
ured as an Active/Standby (1oo2) system. Such a configuration can guarantee redundancy and, more importantly, low system complexity and costs. Unlike
Active/Active (2oo2) where both nodes execute and a load balancer forwards
requests coming from a client, in the Active/Standby configuration one node
hosts the application while the other just exchanges the application state to
intervene in case of failures.
At the hardware level, the Client/Server TMS application runs on top of two identical redundant servers. Each node has (Fig. 3): 2 hot-swap power supplies, 2 hot-swap fans, 2 RAM modules with IBM ChipKill, 1 CPU, 2 HDDs in RAID1 configuration, 1 Network Interface Card (NIC), and finally the Operating System (OS) with all cluster service software.
As in typical clustered systems, the ASTS cluster needs software responsible for resource orchestration, failure diagnosis, and fail-over management. TMS
servers are therefore provided with Linux SUSE and its High-Availability (HA)
Extension, which comes with two service software responsible for nodes coor-
dination and fail-over operations, the Pacemaker Cluster Resource Manager

(CRM) 1 , and a messaging layer for reliable communications, the Corosync Clus-
ter Engine 2 . Their combination makes it possible to handle heartbeats between the nodes, hardware/software fault detection, and application context migration.
The reliability of the above-mentioned COTS software is crucial for the integrity of the whole mission-critical system. As the standards state, proving the availability of COTS SW requires on-field studies able to guarantee more accurate estimations. In this sense, Pacemaker lacks such works; this is the gap our paper aims to fill.

Fig. 1. Cluster architecture

2.2 Cluster Markov Model


Figure 2 reports the cluster Markov model that includes the Time to Switch
needed by Pacemaker to perform recovery operations.
In state 2 the system is executing properly. At a certain point the cluster can suffer a failure of the Active node with rate 2λ, which moves it to state T, where the system is unavailable waiting for the switch to the Standby node. Otherwise, the Standby server can fail with rate λs; in this case the cluster keeps working without downtime. In either case, the recovery of a failed node is completed with rate μ = 1/MTTR.
Like all the other units, the fail-over operation of Pacemaker can itself fail. If this happens, the system moves from state T to state M. Such a situation occurs with probability (1 − S)θ, where S is the probability of fail-over success and θ is the Time to Switch. A manual switch to the Standby node within time μ∗ brings the system UP again, into state 1. If a second failure occurs, the system goes to state 0, where it becomes unavailable.
1 http://clusterlabs.org/wiki/Pacemaker.
2 http://corosync.github.io/corosync/.

Fig. 2. Cluster Markov model without rejuvenation

As evidenced by the cluster model, the probability of fail-over operation success S may have an impact on the overall cluster reliability. Other works in the literature hypothesize this metric or simply neglect it, which is not acceptable. That is why we believe it is necessary to estimate the fail-over dependability from real evidence.
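To illustrate how S enters such an evaluation, the Python sketch below solves a continuous-time Markov chain numerically. The transition structure is our simplified reading of Fig. 2 and all rate values are placeholders, so this is not the model actually used in the ASTS assessment.

```python
# Simplified numerical sketch of the Markov model in Fig. 2. States:
# "2" = both nodes up, "T" = switching, "1" = one node up, "M" = manual switch
# needed, "0" = down. Rates are placeholders; some transitions (e.g. out of "1"
# and "0") are assumptions based on our reading of the description.
import numpy as np

lam, lam_s = 1e-4, 1e-4          # Active / Standby failure rates (placeholders)
mu, mu_star = 1 / 8.0, 1 / 4.0   # repair and manual-switch rates (placeholders)
switch_rate = 3600.0             # treated as 1 / Time-to-Switch (placeholder)
S = 0.999                        # fail-over success probability (the metric of interest)

states = ["2", "T", "1", "M", "0"]
Q = np.zeros((5, 5))
Q[0, 1] = 2 * lam                # 2 -> T : Active node fails
Q[0, 2] = lam_s                  # 2 -> 1 : Standby node fails (assumption)
Q[1, 2] = S * switch_rate        # T -> 1 : fail-over succeeds (assumption)
Q[1, 3] = (1 - S) * switch_rate  # T -> M : fail-over fails
Q[3, 2] = mu_star                # M -> 1 : manual switch brings the system up
Q[2, 0] = mu                     # 1 -> 2 : failed node repaired
Q[2, 4] = lam                    # 1 -> 0 : second failure (assumption)
Q[4, 2] = mu                     # 0 -> 1 : repair after double failure (assumption)
np.fill_diagonal(Q, -Q.sum(axis=1))

# Steady-state probabilities: solve pi Q = 0 with sum(pi) = 1.
A = np.vstack([Q.T, np.ones(5)])
b = np.append(np.zeros(5), 1.0)
pi = np.linalg.lstsq(A, b, rcond=None)[0]
availability = pi[0] + pi[2]     # system is up only in states "2" and "1"
print(dict(zip(states, pi.round(10))), availability)
```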

3 Pacemaker Architecture and Its Configurations

Pacemaker is a distributed Cluster Resource Manager (CRM) software, usually provided within the Linux-HA tools, capable of coordinating the startup and recovery of application services distributed across a set of machines. It allows flexible building of different redundancy configurations such as Active/Active or Active/Standby. A proper setup of a clustered Pacemaker-based application needs the following entities:

• The Messaging Layer (usually CoroSync or Heartbeat) is responsible for the reliable messaging, membership and quorum information needed by the cluster for node orchestration. This layer also carries the "I'm alive" messages, which are fundamental for knowing whether a specific node has failed. It should be noted that this component does not belong to Pacemaker, but it is essential for correct cluster operation.
• The Pacemaker CRM Core embraces all the facilities needed by Pacemaker
to react to cluster events and take decisions on resources management. The
CRM Core includes:
– CRM Abstraction Layer useful to abstract the underlying hardware
infrastructure and then allow the Pacemaker to be flexible.
– CRMD (Cluster Resource Management Daemon) is the coordinator for
activities like starting/stopping resources. It receives orders from the Policy Engine and gives commands to the LRMD.
– LRMD (Local Resource Management Daemons), also known as Resource
Agents, run on each node and are the ultimate performers of cluster opera-
tions like monitoring, starting or stopping application services. If required,
the LRMD makes use of the STONITH to force an action. LRMD behav-
ior is controlled by the CRMD based on CIB configurations.

– STONITHD (STONITH Daemon) is a fencing system able to force operations on the state of cluster nodes. The goal is to avoid situations in which a failed node could impact sensitive resources or block the whole cluster. Most of the time, STONITHD is configured to hard-reset the failed node. Such a feature requires hardware support in the server, that is, a unit independent from the rest of the system. In the ASTS IBM servers, the Integrated Management Module (IMM) is responsible for STONITH actions.
– CIB (Cluster Information Base), an XML-based configuration file, which allows configuring the cluster resource management and its behavior in anomalous/non-anomalous situations. The CIB is automatically synchro-
nized by the CRM among all nodes.
– The Policy Engine receives the CIB configuration file as input and instructs a CRMD instance to act as a master and carry out the engine's instructions by passing them to the LRMD.
• Distributed Replicated Block Device (DRBD): Most of the time an additional
entity, external to Pacemaker, is needed to synchronize the application state.
The ASTS cluster testbed, leveraged in this paper, follows the common trend
of using a DRBD in conjunction with the Pacemaker cluster stack to syn-
chronize the application state. DRBD, in fact, makes it possible to build a continuously synchronized, shared File System (FS) between the two nodes. The FS is mounted only on the Active server, which sends updates to the Standby one. As soon as a failure occurs, the FS is mounted on the recovery node and the application can be restarted from a recent state update.
The majority of the units presented above are configurable through the CIB. Its configuration sets the operations to be enforced on the resources controlled by the CRM. The CIB tuning can strongly influence the Time to React to events, the Time to Switch to secondary nodes, and therefore the overall cluster steady-state availability. In this paper, we identified the following noteworthy CIB parameters that affect the fail-over operation (a sketch of such a configuration follows the list):
• Timeout (T) - the time to wait before declaring a specific resource as failed.
• Interval Time (IT) - the resource monitoring period in which the CRM agents check the status of the resources.
• On-Fail (OF) - the action to enforce when a failure occurs: ignore the failure, block (B) operations on the failed resource, stop the resource in the whole cluster, restart (R) the resource, fence (F) through STONITH, or standby (S) and move the resources away from the failed node.
• Migration-Threshold (MT) - the number of times a resource failure may occur before the node is declared ineligible to host the resource.

4 The Testing Approach


Testing the Pacemaker fail-over requires an exhaustive simulation of outages of the resources needed (CoroSync) and supported (DRBD and TMS application) by the CRM. In this sense, we enforced fault conditions at different levels [8,9] of

Fig. 3. Cluster architecture

the cluster architecture presented in Fig. 1. The idea is to test the CRM fail-over
under several different failure circumstances as:

• Corosync Failure: the heartbeat message could fail to reach the other node.
Such a situation may happen for several reasons like an error condition in
Corosync, a network failure, or the total shutdown of a node. Therefore, in
this work, such anomalous behaviors were simulated by simply killing the Corosync service at random times.
• DRBD Failure: an error in DRBD operations is a plausible condition that is worth considering. Thus, we forced a failure into the Device Mapper (DM) unit of DRBD, which is a framework of the Linux kernel for mapping physical block devices onto higher-level virtual block devices.
• TMS Failure: a comprehensive analysis must also consider the possibility that the application itself fails. A broad simulation of software failures is non-trivial; many techniques exist for this purpose [10], but they are out of the scope of this paper. Our work, instead, focused only on the possibility of errors at the OS level. In this sense, we performed fault injection into POSIX APIs leveraging a well-known tool, namely libfiu 3. Such a tool builds a wrapper for the application that intercepts all system calls and allows the modification of their parameters.

3 https://blitiri.com.ar/p/libfiu/.

The above simulations were launched automatically through a script, which enforces the different failure conditions in random order. The elapsed time between two successive experiments was 15 min, which accounts for the time the cluster may need to restart a server node. Tests were conducted over a period of 1200 h, which means that we were able to run a total number of tests equal to NT = 4800.
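The shape of such a driver script is sketched below in Python; the actual ASTS script is not available, so the fault actions are placeholders that only print what would be injected.

```python
# Sketch of a randomized fault-injection driver of the kind described above (the real
# script is not public; the fault actions below are placeholders).
import random
import time

def kill_corosync():       # Corosync failure: heartbeat loss, node isolation, shutdown
    print("injecting Corosync failure (e.g. killing the corosync process)")

def fail_drbd_dm():        # DRBD failure: fault forced into the Device Mapper unit
    print("injecting DRBD device-mapper failure")

def fail_tms_posix():      # TMS failure: POSIX API fault injection via a libfiu wrapper
    print("injecting faults into the TMS application's system calls")

FAULTS = [kill_corosync, fail_drbd_dm, fail_tms_posix]
INTERVAL_S = 15 * 60       # 15 minutes between two successive experiments
N_TESTS = 4800             # 1200 h campaign / 15 min per test

for _ in range(N_TESTS):
    random.choice(FAULTS)()        # enforce one failure condition at random
    time.sleep(INTERVAL_S)         # leave time for recovery / possible node restart
```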
Pacemaker fail-over behavior was measured through a Field-Failure Data
Analysis (FFDA) conducted on system logs and Pacemaker logs. In particular,
the Black-Box (BB) recorder feature provided in Pacemaker was leveraged to
this end. The BB contains a rolling buffer of all logs and is written to disk
after a crash or assertion failure. This logging feature allows much more information to be obtained than during normal usage.

5 Estimation of Fail-Over Reliability


Software reliability is defined as the probability of failure-free software operation
for a specific interval of time in a specified environment [11,12]. With regard to
Pacemaker, we define the fail-over operation reliability as the probability of a successful switch from the Active to the Standby node; the fail-over is considered failed when the recovery action requires a time Ts > LA = 2T, where LA represents the Latency Admitted and T is the timeout configuration parameter described in Sect. 3.
Even if software reliability is not, strictly speaking, a function of time like that of hardware equipment, the research community has usually adopted two different software modeling techniques: prediction and estimation modeling. Both are based on observing, accumulating, and statistically analyzing on-field data, but they differ in their period of use. Prediction models, in fact, are mostly used in the early development phase, while estimation models are adopted to estimate the fault rate once the software application has already been released. In the latter case, the most used distributions are the exponential, Weibull, Thompson and Chelson models.
In this paper, the Pacemaker fail-over reliability has been estimated by leveraging a Weibull distribution with a decreasing hazard function, which makes sense since software, unlike hardware, tends to become more reliable over time [13]. We pursued an approach based on a mixture of model-based and experimental measurements; such an approach seems the most reasonable in the community [14].
A two-parameter Weibull distribution has been adopted, with the following pdf and CDF:
$f(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1} e^{-(t/\eta)^{\beta}}; \qquad F(t) = 1 - e^{-(t/\eta)^{\beta}}$
Here β is the shape parameter, which defines the type of distribution (e.g. β = 1 reduces the Weibull to the exponential distribution), and η is the scale parameter (or characteristic life), which changes the shape of the pdf for fixed values of β. In reliability engineering, f(t) represents the failure density function, while R(t) = 1 − F(t) represents the reliability (survival) function. The ratio between the two

quantities is the hazard function, that is, the instantaneous failure rate as a function of age [15]:

$h(t) = \frac{f(t)}{1 - F(t)}$
In the Weibull distribution, unlike the exponential one, the hazard function is not constant: depending on the value of β, it can increase (β > 1) or decrease (0 < β < 1). Therefore, this parameter must be chosen based on degradation assumptions. In this work, the shape parameter is assumed equal to β = 0.8, as the Pacemaker fail-over is assumed to improve over time, also thanks to more suitable parameter configurations. The value of η, instead, indicates the time at which 63.2% of the equipment has failed. The major findings from the experimental campaign are:

• The CRM did not fail to perform its recovery operations when CoroSync
failures were simulated
• The CRM did not fail when the DRBD resource was forced to fail
• The occurrence of faults at the application level caused Pacemaker to fail 3 times. These situations were due to hangs of the TMS application; the CRM was not able to detect the anomalous execution in time.

Based on the previous evidence, and knowing that the MTTF of the Weibull distribution is given by:

$MTTF = \eta\,\Gamma(1/\beta + 1)$

the Pacemaker fail-over failure rate (λ = 1/MTTF) is equal to λ = 1.54 × 10−3.
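The relation between β, η, the MTTF and λ can be checked numerically. In the sketch below, the scale parameter η is not taken from the paper but derived back from the reported failure rate, purely for illustration.

```python
# Numerical check of MTTF = eta * Gamma(1/beta + 1) and lambda = 1/MTTF.
# eta is not reported in the paper; here it is back-computed from the reported
# failure rate for illustration only.
from math import gamma

beta = 0.8            # shape parameter assumed in the paper
lam = 1.54e-3         # reported fail-over failure rate

mttf = 1.0 / lam                         # ~ 649.4 hours
eta = mttf / gamma(1.0 / beta + 1.0)     # scale parameter consistent with beta and lam
print(round(mttf, 1), round(eta, 1))     # approximately 649.4 and 573.1
```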

6 Conclusion
This paper presented the experimental campaign conducted on a real use case,
the Train Management System of Hitachi Ansaldo STS, to evaluate the fail-
over success probability of the Pacemaker Cluster Resource Manager. Results demonstrated that the CRM behaves abnormally only in the occurrence of application hangs; in the other cases, it provides a prompt recovery operation. Such evidence needs further analysis to be validated.
In this sense, future work can and should be done to better understand the
Pacemaker behavior. For example, an interesting extension to this research may
be focused on testing and measuring the effect of CRM configuration parameters
on the fail-over reliability, on the time to switch and, finally, on the overall cluster
reliability.

Acknowledgments. This project has received funding from the European Union’s
Horizon 2020 Framework Programme for Research and Innovation under g.a. No.
833088 (InfraStress).
Authors would like to thank Giovanni Mazzeo for his support in the research activ-
ity.

References
1. Campanile, F., Coppolino, L., D’Antonio, S., Lev, L., Mazzeo, G., Romano, L.,
Sgaglione, L., Tessitore, F.: Cloudifying critical applications: a use case from the
power grid domain. In: 2017 25th Euromicro International Conference on Parallel,
Distributed and Network-based Processing (PDP), pp. 363–370, March 2017
2. Cerullo, G., Mazzeo, G., Papale, G., Sgaglione, L., Cristaldi, R.: A secure cloud-based SCADA application: the use case of a water supply network. In: SoMeT (2016)
3. Connelly, S., Becht, H.: Developing a methodology for the use of COTS operating systems with safety-related software. In: Proceedings of the Australian System Safety Conference, ASSC 2011, Darlinghurst, Australia, vol. 133, pp. 27–36. Australian Computer Society, Inc. (2011)
4. Pierce, R.H.: Preliminary Assessment of Linux for Safety Related Systems.
Research Report Series, HSE Books (2002)
5. Jones, C., Bloomfield, R., Froome, P., Bishope, P.: Methods for assessing the safety
integrity of safety-related software of uncertain pedigree (SOUP). HSE Books
(2001)
6. Mazzeo, G., Coppolino, L., D’Antonio, S., Mazzariello, C., Romano, L.: SIL2 assessment of an active/standby COTS-based safety-related system. Reliab. Eng. Syst. Saf. 176, 125–134 (2018)
7. Benz, K., Bohnert, T.: Impact of pacemaker failover configuration on mean time
to recovery for small cloud clusters. In: IEEE International Conference on Cloud
Computing, CLOUD, July 2014
8. Cotroneo, D., Simone, L.D., Natella, R.: Dependability certification guidelines for
NFVIS through fault injection. In: 2018 IEEE International Symposium on Soft-
ware Reliability Engineering Workshops (ISSREW), Los Alamitos, CA, USA, pp.
321–328. IEEE Computer Society, October 2018
9. Cotroneo, D., De Simone, L., Iannillo, A.K., Lanzaro, A., Natella, R., Fan, J., Ping,
W.: Network function virtualization: challenges and directions for reliability assur-
ance. In: 2014 IEEE International Symposium on Software Reliability Engineering
Workshops, pp. 37–42, November 2014
10. Hsueh, M.-C., Tsai, T.K., Iyer, R.K.: Fault injection techniques and tools. Com-
puter 30(4), 75–82 (1997)
11. Lyu, M.R.: Handbook of Software Reliability Engineering. McGraw-Hill (1996)
12. Cotroneo, D., De Simone, L., Iannillo, A.K., Lanzaro, A., Natella, R.: Dependabil-
ity evaluation and benchmarking of network function virtualization infrastructures.
In: Proceedings of the 2015 1st IEEE Conference on Network Softwarization (Net-
Soft), pp. 1–9, April 2015
13. Kuo, S.Y., Huang, C.Y., Lyu, M.R.: Framework for modeling software reliability,
using various testing-efforts and fault-detection rates. IEEE Trans. Reliab. 50(3),
310–320 (2001)
14. Coppolino, L., D’Antonio, S., Mazzeo, G., Romano, L., Sgaglione, L.: Exploiting
new CPU extensions for secure exchange of ehealth data at the EU level. In: 2018
14th European Dependable Computing Conference (EDCC), pp. 17–24, September
2018
15. Nelson, W.: Weibull analysis of reliability data with few or no failures. J. Qual.
Technol. 173, 140–146 (1985)
A Secure and Distributed Architecture
for Vehicular Cloud

Hassan Mistareehi1, Tariqul Islam1, Kiho Lim2, and D. Manivannan1

1 University of Kentucky, 301 Rose Street, Lexington, KY 40508-3026, USA
{hassan.mistareehi,pavel.tariq}@uky.edu, mani@cs.uky.edu
2 William Paterson University of New Jersey, 300 Pompton Road, Wayne, NJ 07470, USA
limk2@upunj.edu

Abstract. Given the enormous interest in self-driving cars, Vehicular


Ad hoc NETworks (VANETs) are likely to be widely deployed in the
near future. Cloud computing is also gaining widespread deployment.
A marriage between cloud computing and VANETs would help meet many of the needs of drivers, law-enforcement agencies, traffic management,
etc. In this paper, we propose a secure and distributed architecture for
vehicular cloud which uses the capabilities of vehicles to provide various
services such as parking management, accident alert, traffic updates,
cooperative driving, etc. Our architecture ensures privacy of vehicles,
and supports scalable and secure message dissemination using vehicular
infrastructure.

1 Introduction
Vehicular Ad hoc NETworks (VANETs) are a type of Mobile Ad hoc NETworks
(MANETs) that allow vehicles on roads to communicate among themselves and
form a self-organized network. Although modern vehicles are equipped with com-
puting, communication and storage resources, they often remain underutilized.
The Vehicular Cloud (VC) architecture which combines VANETs with cloud,
was first proposed [1] to fully capitalize the resources in VANET. In their app-
roach, a VC is a collection of autonomous vehicles in VANET where vehicles
contribute their underutilized computing, sensing, and communication devices
to the cloud [2].
In many of the existing vehicular cloud architectures, vehicles either communicate with each other directly or through Road Side Units (RSUs) to form a vehicular cloud. In some schemes, such as [3,4], vehicles authenticate messages themselves, while other schemes use RSU-aided message authentication [5,6]. These schemes suffer from communication overhead and do not scale well. If network traffic becomes heavy, it might not be possible for vehicles to authenticate, process, and forward messages in a timely manner. This could also result in message loss and redundant message propagation.
Ensuring security and privacy is an important issue in vehicular cloud; if
information exchanged between entities is modified by a malicious vehicle, seri-
ous consequences such as traffic congestion and accidents can occur. In addition,

sensitive data could be lost, and human lives also could be in danger. Hence,
messages sent by vehicles must be authenticated and securely delivered to vehi-
cles in the appropriate regions. Furthermore, driver-related privacy information
such as driver name, position, and traveling route must be preserved. If vehi-
cles cannot communicate anonymously, an attacker could easily trace vehicles by
monitoring the messages sent by that vehicle. Several schemes [3,4,6–8] exist in
literature for solving authentication and privacy issues in communication; how-
ever, many of them do not guarantee confidentiality of exchanged messages and
therefore, are vulnerable to attacks.
In our proposed architecture, vehicles collect data and forward to the cloud
where this data can be verified, analyzed, organized, aggregated and then prop-
agated to the relevant vehicles. Multiple vehicles may observe the same phenom-
ena and forward these to the cloud which could result in propagation of redun-
dant messages. However, if information about observed phenomena is aggregated
and stored in a cloud, this redundant propagation of messages can be prevented.
Hence, data aggregation is another crucial requirement for building an efficient
vehicular cloud. Many of the existing schemes do not address this issue and
therefore, suffer from computation and communication overhead.

Objectives. The purpose of our work is to design a distributed, secure vehicu-


lar cloud architecture that ensures security and privacy in communication. Our
scheme should also be scalable and should have less communication overhead.
Furthermore, it should be capable of eliminating redundant messages through
aggregation. Last but not least, the architecture is expected to be attack-resilient and to handle RSU failures.

Contributions. Following are the main contributions of our work:


• Confidentiality. We propose a Vehicular Cloud architecture that ensures con-
fidentiality of the sensitive messages by encrypting messages using Symmetric
Key Cryptography.
• Authentication. We use digital signature based on Public Key Cryptography
to ensure authenticity and integrity of messages.
• Aggregation. Our architecture supports data aggregation based on the type
and location of the message to eliminate redundant messages.
• Scalability. In our scheme, vehicles do not exchange messages between them-
selves; they only forward them to the nearest Road Side Units, which in turn
propagate the messages further in a hierarchical manner. This reduces com-
munication overhead and makes the architecture scalable.
• Privacy. Our scheme provides anonymity and privacy of vehicles using pseudo
IDs. Any of the existing mechanisms for assigning and changing pseudo IDs
can be incorporated in our architecture to ensure privacy.
The rest of the paper is organized as follows. In Sect. 2, we describe our
proposed model. In Sect. 3, we present the security and overhead analysis of the
model. We present and compare some related work in Sect. 4. Finally, Sect. 5
concludes the paper.

2 Proposed Model
In this section, we present our system model and then describe the proposed
architecture in detail.

2.1 System Model


Figure 1 illustrates the proposed architecture which consists of vehicles, Road
Side Units (RSUs), Regional Clouds (RC) and a Central Cloud (CC).

Vehicle: Vehicles are assumed to be equipped with an on-board unit (OBU) for
computation and communication. Vehicles can communicate with RSUs through
the radio defined under the IEEE Standard 1609.2 [9], which is the proposed
standard for wireless access in vehicular environment (WAVE). We assume that
vehicles obtain their public/private key pairs and the public keys of the RSUs and
a set of pseudonyms when they register with their local RC, such as the Department of Motor Vehicles (DMV), which administers vehicle registration and driver licensing.

RSU: RSUs are distributed on the road sides. In our architecture, RSUs are
assumed to be not compromised. They collect the information sent by vehicles
as well as authenticate and aggregate the received messages and forward them
to the regional cloud.

RC: We assume the geographical area (for example, a country) is divided into
regions and each region is controlled by an RC. RC stores, analyzes, processes
and aggregates the relevant messages received from RSUs and forwards the infor-
mation to CC if necessary. It manages all private information about vehicles and
shares it securely with RSUs upon request. The RC and the RSUs within
its region are able to communicate with each other through a wired or wireless
network. When a vehicle sends a message, the RSU can verify the authenticity
of vehicle and the RC can also help RSUs to identify the real identity of vehi-
cles when investigations are required. RCs are assumed to be trustworthy and
not compromised and have a large computation and storage capacity. RCs are
assumed to be connected to all RSUs in their region, possibly through the Internet.

CC: The CC is assumed to have more storage and computational power than the
RCs. CC and RCs can communicate with each other securely via a wired or wire-
less network. RCs provide the CC with services which may be needed by other
vehicles in other regions. In addition, CC can provide services to other depart-
ments such as law enforcement, traffic management, etc. The CC is assumed to
be trustworthy and not compromised.

Fig. 1. Secure and distributed architecture for vehicular cloud.

Table 1. Notations.

Notation    Description
RCi         Regional Cloud i
RSUi        Road Side Unit i
CC          Central Cloud
IDA         Identity of Entity A
PIDA        Pseudo Identity of Entity A
M           A Message
Vx          Vehicle x
Type        Type of Message
Loc         Location of the Phenomena
ts          Timestamp
SKA         Private Key of Entity A
PKA         Public Key of Entity A
K           Secret Shared Key
SIGA(M)     Signature of M Signed Using A's Private Key
H()         Hash Function
E(M, K)     Encryption of M with Key K

2.2 Proposed Architecture

In this section, we describe our architecture in detail. The notations used in this
section are listed in Table 1.
In our scheme, information collected by vehicles in an area is sent to the RC which covers that area (e.g., a city or state) through the nearby RSUs, which
authenticate and aggregate the received messages and then forward them to the

RC for storage. RC analyzes, processes and further aggregates the relevant mes-
sages and sends them to vehicles in appropriate regions so the drivers can take
appropriate action. The RCs also communicate securely via wired/wireless net-
work with the CC which has more storage and computational power. The CC
provides services to the RCs and to different departments (e.g., police depart-
ment, traffic department, health department, etc.) and these departments can
also provide the CC with services that the drivers might need. For example, if
someone's car gets stolen, the owner will call the police department; the police will notify the CC, which will forward the report to the RCs, and the RCs will forward it to the vehicles in their regions through RSUs. When a vehicle spots the stolen car, using a camera that captures its plate number, it will send a message containing the location of the stolen car back to the RC, which will forward it to the police department.
Fixed infrastructure (e.g., RSUs) may not exist in some areas, or the nearby RSU
could have failed. In such cases, messages have to be routed to another nearby
RSU through intermediate vehicles. If there are no vehicles or RSUs within the
transmission range of a vehicle, the vehicle stores and carries the message until it
gets closer to the next RSU or a vehicle. RSU s are responsible for verifying the
authenticity and integrity of messages sent by vehicles before forwarding them
to RC. In addition, all driver information should be protected and attackers
should not be able to trace the routes of the vehicles. This could be achieved by
assigning pseudonyms to vehicles, using them in all communications instead of their real identities, and changing them frequently to maximize privacy. Various
solutions proposed in the literature such as [10,11] addressing privacy issue can
be applied in our architecture.

Key Generation and Distribution. We assume that the RSU s, RCs and
the CC are trusted and not compromised. When a vehicle v is registered or its registration is renewed with the RC, it gets a public/private key pair (PKv, SKv), the public keys of the RSUs are stored in the vehicle, and the vehicle is preloaded with a set of pseudonyms (PID1, PID2, ..., PIDn).
In our architecture, all messages are authenticated using digital signatures.
When a vehicle sends a message, the sender vehicle attaches its digital signature
to the message. The digital signature is computed by hashing the message and encrypting the hash with the vehicle's private key. Moreover, not all messages need to be encrypted; a vehicle can decide whether a message needs to be encrypted depending on its type. For example, if a vehicle has to notify about ice on the road, it need not encrypt the message. On the other hand, if a vehicle wants to send a message about a crime scene, the message
needs to be encrypted. If the vehicle decides to encrypt a message, it generates a secret key K and encrypts the message with K; the key K is in turn encrypted with the public key of the RSU, PKRSU, and sent along to the nearby RSU. Encrypting the secret key K with the RSU's public key ensures that only the RSU can decrypt it, which ensures confidentiality. When the RSU receives the message, it first recovers the secret key K by decrypting it with its private key SKRSU, then decrypts the message using K and authenticates it using the signature.

Fig. 2. Vehicle sending a message to RC.
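As an illustration, the following Python sketch builds a message in the spirit of M2 using the pyca/cryptography package: the payload is signed with the vehicle's private key (here with RSA-PSS, standing in for the paper's E(H(M), SKvx) construction), encrypted with a fresh symmetric key K (Fernet), and K itself is encrypted with the RSU's public key. All keys, field names and values are hypothetical and only illustrate the flow; they are not part of the architecture's specification.

import json
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# Hypothetical keys; in the architecture they are obtained at registration with the RC.
vehicle_sk = rsa.generate_private_key(public_exponent=65537, key_size=2048)
rsu_sk = rsa.generate_private_key(public_exponent=65537, key_size=2048)
rsu_pk = rsu_sk.public_key()

def build_m2(pid, msg_type, loc, ts):
    payload = json.dumps({"pid": pid, "type": msg_type, "loc": loc, "ts": ts}).encode()
    # Digital signature over the payload with the vehicle's private key (SIGvx).
    sig = vehicle_sk.sign(
        payload,
        padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH),
        hashes.SHA256(),
    )
    # Encrypt the payload with a fresh symmetric key K ...
    k = Fernet.generate_key()
    ciphertext = Fernet(k).encrypt(payload)
    # ... and encrypt K with the RSU's public key, so only the RSU can recover it.
    enc_k = rsu_pk.encrypt(
        k,
        padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None),
    )
    return {"rsu_id": "RSU_k", "payload": ciphertext, "key": enc_k, "sig": sig}

m2 = build_m2(pid="PID_123", msg_type="crime_scene", loc=(48.4, -123.3), ts=1571234567)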
The following two subsections describe how vehicles send messages about
observed phenomena and how they request a service from the RC.

Vehicles Sending Messages about Observed Phenomena. We assume


events are classified into various types such as congestion, ice on road, accidents,
etc. When vehicles sense events, they send the sensed events to the RC through
RSU s and the RC determines if the message needs to be forwarded to all the
vehicles in its region. At the same time the RC also forwards the messages to
the CC if necessary. The CC processes and stores the received messages and
forwards the messages to other RCs if they need them. For example, if a vehicle
observes an abnormal road condition (e.g., a work zone area), it will send a message to the RC, which will send it to the vehicles in its region and to the CC, which will forward it to the relevant RCs. Vehicles heading toward that work zone can then avoid the area, and drivers gain information about traffic conditions, which increases driver safety and reduces the number of traffic accidents. Figure 2 shows the flow chart of communication from vehicle to RC.
This scheme works as follows.
When a vehicle vx senses an event, it decides whether the information is sensitive or not. If it is not sensitive, vx computes the message M1 without encrypting
it, and then sends it to the nearby RSUk . If it is sensitive information, it com-
putes M2 and then sends it to the nearby RSUk , where M1 and M2 are defined
as follows:

M1 = IDRSUk, (PIDvx, Type, Loc, ts), SIGvx(M1)
where SIGvx(M1) = E(H(PIDvx, Type, Loc, ts), SKvx)

M2 = IDRSUk, E((PIDvx, Type, Loc, ts), K), E(K, PKRSUk), SIGvx(M2)
where SIGvx(M2) = E(H(PIDvx, Type, Loc, ts), SKvx)
The message M1 includes the pseudo ID of vehicle vx, the type of the message Type, the location of the phenomenon Loc, and the timestamp ts. vx attaches its digital signature, obtained by computing the hash of the message and encrypting it with its private key SKvx. In message M2, the symmetric key K generated by vx is used to encrypt the message, and this key is encrypted with the public key of RSUk. When the nearby RSUk receives the message, it verifies the authenticity and integrity of the message using the signature and processes it. If no nearby RSU exists, or the nearby RSU becomes unavailable due to failure, our architecture handles the situation as follows.

Case 1. When a vehicle finds that the RSU within its transmission range
has failed (or there is no RSU within its transmission range), it computes the
message using the ID and the public key of the nearest RSU and then forwards
to it through other vehicles using an underlying routing algorithm. We assume
that all vehicles know the location as well as the public keys of all RSUs.

Case 2. If there is no nearby RSU and no nearby vehicles, vx waits (stores


and carries the message) until it finds an RSU or vehicle within its transmission
range and then forwards the message to that RSU or to that vehicle.

Message Aggregation by RSU s and RC. When an RSU receives multiple


messages from different vehicles that describe the same event, it aggregates them (based on type, location, and time) into a single message Magg and sends it to the RC. When the RC receives Magg, it stores it and performs further aggregation if it has received similar messages from other RSUs. Magg is defined as follows:

Magg = IDRCi, (IDRSUi, M, ts), SIGRSUi(Magg)
where SIGRSUi(Magg) = E(H(IDRSUi, M, ts), SKRSUi)
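A minimal sketch of this aggregation step is given below; it assumes messages are simple dictionaries with "type", "loc" and "ts" fields and that reports of the same type and location falling in the same time window (here 60 s, a hypothetical value) describe the same event.

from collections import defaultdict

def aggregate(messages, window=60):
    # Group messages by (type, location, time window).
    buckets = defaultdict(list)
    for m in messages:
        buckets[(m["type"], m["loc"], m["ts"] // window)].append(m)
    # Emit one aggregated record per bucket, with a count of reporting vehicles.
    return [
        {"type": t, "loc": loc, "ts": min(x["ts"] for x in group), "reports": len(group)}
        for (t, loc, _), group in buckets.items()
    ]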

RC Sending Messages to CC or to RSUs. When an RC decides to send a


message to the RSU s in the area covered by its region (e.g., city or state), it com-
putes and disseminates M3 to the appropriate RSU s and the RSU s broadcast
it to vehicles within their transmission range. In addition, the RC decides whether or not to send M3 to the CC. If the RC decides that M3 is an important message, it sends M3 to the CC as well. M3 is given by

M3 = IDRC, (Type, Loc, ts), SIGRC(M3)
where SIGRC(M3) = E(H(Type, Loc, ts), SKRC)

Table 2. Vehicle vi sending messages to RC

Upon receiving the message M3 , CC stores and processes M3 . Then it com-


putes M4 which could be encrypted if necessary and then disseminates M4 to
all appropriate RCs if they need them. When RCi receives the message M4 ,
it broadcasts M4 to all vehicles within its region through the RSU s and the
intended vehicles consume it. M4 is given by,

M4 = IDRCi, IDCC, (Type, Loc, ts), SIGCC(M4)
where SIGCC(M4) = E(H(Type, Loc, ts), SKCC)

Table 2 shows the algorithm for vehicle sending message to RC.

Vehicles Requesting Service from an RSU. Figure 3 gives a flow chart


of actions taken when a vehicle requests a service from the RC. Next, we
present the algorithm for requesting service in detail.

Fig. 3. Vehicle requesting a service from RSU .

When vehicle vx wants to request a service from the RC, it computes and
sends the message M1 to the nearby RSU which then forwards M2 to the RC
where M1 and M2 are given by,

M1 = IDRSU, (PIDvx, Type, Loc, ts), SIGvx(M1)
where SIGvx(M1) = E(H(PIDvx, Type, Loc, ts), SKvx)

M2 = IDRC, (IDRSU, Type, Loc, ts), SIGRSU(M2)
where SIGRSU(M2) = E(H(IDRSU, Type, Loc, ts), SKRSU)

The message M1 includes the pseudo ID of the vehicle PIDvx, the type of event Type, the location of the event Loc, and the timestamp ts. The vehicle vx attaches its digital signature SIGvx, so the RSU can authenticate the request message. If the vehicle vx wants to request sensitive information (e.g., a police department vehicle), it generates a symmetric key K, encrypts the message using K, and then encrypts K with the RSU's public key PKRSU. In this case vx sends M3, where M3 is defined as follows:

M3 = IDRSU, E((PIDvx, Type, Loc, ts), K), E(K, PKRSU), SIGvx(M3)
where SIGvx(M3) = E(H(PIDvx, Type, Loc, ts), SKvx)

If no RSU is within the transmission range of vx, the request message is forwarded to the nearest RSU through intermediate vehicles using an underlying routing protocol.

When an RC receives the request message, it checks if it has the requested


information. If so, it computes and sends a service message M4 to the RSU
which sends M5 to vx . Note that if the vehicle is not within the transmission
range of RSU , M5 will be forwarded through intermediate vehicles, where,

M4 = IDRSU, (IDRC, Type, Loc, ts), SIGRC(M4)
where SIGRC(M4) = E(H(IDRC, Type, Loc, ts), SKRC)

M5 = PIDvx, (IDRSU, Type, Loc, ts), SIGRSU(M5)
where SIGRSU(M5) = E(H(IDRSU, Type, Loc, ts), SKRSU)

If RC does not have the requested information, it forwards the request to the
CC. When CC receives the request, it checks if it has the requested information.
If it has it, CC computes and sends the service message to the RC. When RC
gets the service message, it forwards the service message to the relevant RSU
which forwards it to the vehicle vx . Then vx authenticates the message and
consumes it. Table 3 gives the algorithm for requesting information.

3 Security and Overhead Analysis


3.1 Ensuring Security and Privacy

Confidentiality. All the sensitive messages sent by the vehicles, RSUs, RCs
and CC are encrypted using a symmetric key, which has low overhead.

Message Authentication and Non-repudiation. We used digital signature


based on public key cryptography to ensure the authenticity and the integrity of
messages. In our architecture, a digital signature is attached with every message,
whether it is encrypted or not. When the receiver receives a message, it decrypts the digital signature using the public key of the sender; the result of the decryption is the hash of the message. The receiver then hashes the message in the same way as the sender did and compares the two hashes. Since only the sender can generate the digital signature, the authenticity of the message and non-repudiation are guaranteed. In addition, if the hash in the digital signature matches the hash computed by the receiver, the integrity of the message is ensured.
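Continuing the earlier hypothetical sketch, this receiver-side check can be expressed as follows; verify() recomputes the hash of the message internally and compares it against the one recovered from the signature.

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding

def is_authentic(message, signature, sender_pk):
    # The sender's public key is used to check the signature over the message;
    # a mismatch between the two hashes raises InvalidSignature.
    try:
        sender_pk.verify(
            signature,
            message,
            padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH),
            hashes.SHA256(),
        )
        return True
    except InvalidSignature:
        return False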

Privacy Preservation. Vehicles are assigned pseudo IDs. A vehicle never uses
its real ID in any communication. So using the pseudo IDs, privacy of vehicles
is ensured. In order to maximize privacy, many researchers in [10,11] suggest
assigning pseudonyms to vehicles and changing them more frequently. These
schemes could be applied in our architecture. When a malicious node is detected,
the real ID of the malicious vehicle is revealed by the RC to the authorities for
legal investigation.

Table 3. Vehicle vi requesting information from RC

3.2 Attack Resilience

Man-in-the-Middle Attack. In this attack, the adversary wants to intercept


and alter the messages transferred between two vehicles or between a vehicle
and an RSU. All these communicating parties (vehicles and RSU s) believe that
they are directly communicating with each other, but actually they are not.
In our proposed scheme, a sender attaches to the message the digital signature SIG(M) = E(H(M), SK), which encrypts the hash of the message using its private key. It is not easy to forge a signature without knowing the sender's private key. Thus, our scheme prevents man-in-the-middle attacks.

Replay Attack. In this attack, the attacker resends or delays a previously


transmitted message. To prevent this attack, all messages carry time stamp
ts. We assume clocks are loosely synchronized. Vehicles can use GPS (Global
Positioning System) for synchronizing clocks.

Message Modification Attack. In order to protect the integrity of the mes-


sages from the attackers, each message carries the signature of the sender. More-
over, it also guarantees unforgeability under chosen-message attacks: an adversary who knows the public key and gets to see signatures on messages of his/her choice cannot modify a message and produce a verifiable signature on the modified message.

3.3 Communication and Computation Overhead

Our architecture is scalable, since vehicles do not store any keys of other vehicles
and do not authenticate any message sent by other vehicles. The RSUs which
have a large storage and computational power are responsible for authenticating
and aggregating messages. Vehicles only need to verify messages received from
RSUs, not from other vehicles; so communication and computation overhead is
low for vehicles. In addition, not all messages need to be encrypted in our archi-
tecture, just the sensitive messages need to be encrypted. If any vehicle decides
to encrypt a message, it generates a symmetric key to encrypt the message. Vehi-
cles do not have to request a symmetric key from RSUs, which further reduces
the communication and computation overhead.

4 Related Work
A cluster-based vehicular cloud architecture has been proposed in [12] and [13],
by grouping vehicles according to their location and speed. Both of these schemes
rely on a cluster head (CH) which is elected by the vehicles in the cluster, and
this CH performs the creation, maintenance, and deletion of all the vehicles
in that cluster. A similar approach is proposed by Chaqfeh et al. [14], where
vehicles in a specific region form a vehicular cloud and elect a broker among
them. The broker collects the desired data from the vehicles and then sends to
a cloud server if further processing is required. None of these schemes scale well
as the number of vehicles increases. When vehicles are moving fast, frequent CH
and broker elections occur that result in large message overhead. In addition, if
CH or broker fails, data aggregated by them may be lost.
Architectures combining VANETs and the cloud have been proposed by researchers in [15,16]. In these schemes, vehicles collect data and send them to the cloud through a mediator or an RSU. Vehicles in the same area could collect the same data, which leads to redundancy and results in large message overhead. Like [12,13], these schemes also suffer from a single point of failure: if the mediator or the RSU fails, the data they aggregated could be lost.
Several privacy-preserving authentication schemes such as cooperative
authentication [3], anonymous authentication [7], and dual authentication [8]
have been proposed. In these schemes, vehicles communicate not only with each other but also with the RSUs or the TA (Trusted Authority) to verify the authenticity of the messages. Although they ensure authentication and privacy, these schemes do not scale well when traffic becomes heavy: vehicles may not be able to verify all the messages sent by their neighbor vehicles in a timely manner, which could result in message loss.

5 Conclusion

In this paper, we proposed a secure and distributed architecture for vehicular


cloud. The architecture is hierarchical and consists of vehicles, road side units, regional clouds and a central cloud. Each regional cloud covers a region (e.g., a city or state) and processes the information collected from vehicles through the
RSU s and provides on demand services to vehicles in its region. These regional
clouds further communicate with the central cloud and exchange information
between themselves to provide a wide range of services to vehicles. Our archi-
tecture also copes with RSU failures. In addition, our scheme ensures confiden-
tiality, authentication and privacy in communication.

References
1. Olariu, S., Khalil, I., Abuelela, M.: Int. J. Pervasive Comput. Commun. 7(1), 7
(2011)
2. Eltoweissy, M., Olariu, S., Younis, M.: EAI Endorsed Trans. Mobile Commun.
Appl. 1(1), 1 (2011)
3. Jo, H., Kim, I., Lee, D.: IEEE Trans. Intell. Transp. Syst. 19(4), 1065 (2018)
4. Lin, X., Li, X.: IEEE Trans. Veh. Technol. 62(7), 3339 (2013)
5. Zang, C., Lin, X., Lu, R., Ho, P.: IEEE International Conference on Communica-
tions (ICC), pp. 1451–1457. IEEE (2008)
6. Mallissery, S., Pai, M., Pai, R., Smitha, A.: In IEEE International Conference on
Connected Vehicles and Expo (ICCVE), pp. 596–601. IEEE (2014)
7. Azees, M., Vijayakumar, P., Deboarh, L.: IEEE Trans. Intell. Transp. Syst. 18(9),
2467 (2017)
8. Liu, Y., Wang, Y., Chang, G.: IEEE Trans. Intell. Transp. Syst. 18(10), 2740
(2017)
9. I.S. 1609.2. IEEE Xplore, pp. 1–240 (2016)
10. Ying, B., Makrakis, D.: IEEE International Conference on Communications (ICC),
pp. 7292–7297. IEEE (2015)
11. Li, J., Lu, H., Guizani, M.: IEEE Trans. Parallel Distrib. Syst. 26(4), 938 (2015)
12. Arkian, H., Atani, R., Diyanat, A., Pourkhalili, A.: J. Supercomput. 71(4), 1401
(2015)

13. Dua, A., Kumar, N., Das, A., Susilo, W.: IEEE Trans. Veh. Technol. 67(5), 4359
(2018)
14. Chaqfeh, M., Mohamed, N., Jawhar, I., Wu, J.: Proceedings of IEEE Smart Cloud
Networks and Systems, pp. 1–6. IEEE (2016)
15. Hussain, R., Oh, H.: J. Inf. Process. Syst. 10(1), 103 (2014)
16. Xu, K., Wang, K., Amin, R., Martin, J., Izard, R.: IEEE Trans. Veh. Technol.
64(11), 5327 (2015)
Introducing Connotation Similarity

Marina Danchovsky Ibrishimova and Kin Fun Li

Department of Electrical and Computer Engineering, University of Victoria, Victoria, BC V8P5C2, Canada
{marinaibrishimova,kinli}@uvic.ca

Abstract. Various different measures of textual similarity exist including


string-based, corpus-based, knowledge-based, and hybrid-based measures. To
our knowledge, none of them examine the textual connotation of two different
sentences for the purpose of establishing whether they express a similar opinion.
Connotation, within the context of this work, is the negative, positive, or neutral
emotional meaning of words or phrases. In this paper we define a new type of a
similarity measure mathematically, namely connotation similarity. It evaluates
how similar the emotional meanings of two sentences are using opinion mining,
which is also known as sentiment analysis. We compare two different algorithms
of our definition of connotation similarity against various algorithms of cosine
similarity using one dataset of 100 pairs of sentences, and another dataset of 8
pairs. We show how connotation similarity can be used on its own to indicate
whether two sentences express a similar opinion, and also how it can be used to
improve other similarity measures such as cosine similarity.

1 Introduction

There are several different categories of textual similarity measures. Researchers have
identified 2 main categories, namely string-based similarity measures, and semantic
similarity measures such as corpus-based, knowledge-based, and hybrid-based mea-
sures [2–4]. To our knowledge, none of these measures explicitly assesses the simi-
larity of expressed opinions. Many different semantic similarity measures have been
introduced over the last 20 years, namely latent semantic analysis [5], latent relational
analysis [6], explicit semantic analysis [7], temporal semantic analysis [8], distributed
semantic analysis [9], among others [10–14].
String-based types such as cosine similarity assess the terms in two sentences that
are exactly the same [2–4]. Such measures deliver results fast but the conclusions are
not always accurate. In particular, cosine similarity fails when comparing synonyms.
Additionally, it also fails if one of the sentences expresses the opposite opinion but they
are otherwise exactly the same. For example, the sentence “I don’t like ice-cream” has
the opposite meaning of the sentence “I like ice-cream”. Their cosine similarity score is
0.8 out of 1 even though they express vastly different opinions. In cases where two
sentences express different opinions using the same terms, cosine similarity can still
return a high score. This is a major flaw with cosine similarity. Any method or algo-
rithm, which relies on cosine similarity in some capacity, is therefore vulnerable to this
flaw. The contributions of this paper are as follows:


1. We provide a mathematical definition of connotation similarity based on the range


D between the sentiment scores of two sentences divided by the full range of
sentiment S as defined by an arbitrary sentiment analysis system.
2. We provide two algorithms of the mathematical definition, namely Weighted
Connotation Similarity (WCS) and Maximum Connotation Similarity (MCS)
a. WCS generates the range between two sentences from the weighted sum of
sentiment scores of their respective verbs, adjectives, and nouns.
b. MCS generates the range between two sentences by finding the largest gap of
sentiment scores between equivalent parts of speech of the two sentences.
3. We compare WCS to MCS and to several algorithms of cosine similarity, and show
that when it comes to assessing the similarity of opinions MCS outperforms WCS
and the various algorithms of cosine similarity that we tested it against using a test
dataset of 100 examples, and another dataset of 8 examples.
The rest of the paper is structured as follows: In Sect. 2 we define connotation
similarity mathematically and linguistically. We provide two implementations of this
mathematical definition in Sect. 3. In Sect. 4 we compare the two implementations of
connotation similarity against several implementations of cosine similarity using the
two sets of examples. In Sect. 5 we draw conclusions and discuss future work.

2 Defining Connotation Similarity

From a linguistic perspective, connotation is the emotional meaning of a word or a


phrase, and it is “described as either positive or negative” or neutral depending on
whether it evokes a “pleasing or displeasing” emotional association [1] or whether it is
free of any emotional associations.
In this paper we define connotation similarity linguistically as the similarity
between the connotative meanings of two sentences or phrases, which can be either
positive, negative, or neutral. To establish whether the connotative meaning of a phrase
is positive, negative, or neutral, various state-of-the-art opinion mining algorithms can
be used. One such algorithm is described by Giatsoglou et al. [15], and it can be
adapted to different languages. While the research in opinion mining (also known as
sentiment analysis) is still ongoing, progress has been made in identifying negative
connotation [16, 17]. Unfortunately, many of the current state-of-the-art sentiment
analysis systems are proprietary, which means that a mathematical definition has to be
abstract and yet simple enough to be applicable to different kinds of systems. Generally
speaking, an arbitrary sentiment analysis system defines a full range of sentiment and
maps individual phrases to either negative, positive, or neutral depending on their
connotation meaning.
As defined previously, connotation similarity is the similarity between the emo-
tional meanings of two sentences or phrases, and here we have extended the entities
under investigation to words. We categorize the emotional meaning of a phrase as
either positive, negative, or neutral. In this section we provide a mathematical definition
of connotation similarity based on the numerical range between the sentiment score of
two phrases s1 and s2. The sentiment score of a phrase can be obtained in various

ways, which include lexical-based approaches and/or machine learning approaches


[18–20]. Typically the full range of sentiment as defined by an arbitrary sentiment
analysis system is between 1 and −1. The sentiment score of a phrase or a word in such
systems is a number greater than −1 and smaller than 1. The range between the
sentiment scores of s1 and s2 is smaller than or equal to the full range of sentiment.
In order to define connotation similarity we must first make a few assumptions.
Given a word or a phrase s1 and a word or a phrase s2 and their respective sentiment
scores min and max:
1. Axiom. If s1 or s2 has a negative sentiment score, then it most likely carries a
negative connotation. If it has a neutral or positive sentiment score, then it most
likely carries a neutral or positive connotation respectively.
2. Axiom. The connotation similarity of s1 and s2 “is a function of their common-
alities and differences”. This axiom is equivalent to assumption 3 stated by Lin in
[21].
3. Axiom. If min is equal to max then we define the connotation similarity of s1 and s2
as equal to 1. This is similar to assumption 4 stated by Lin in [21].
4. Axiom. If the sign of min is not equal to the sign of max, then the connotation
similarity of s1 and s2 is equal to the connotation difference of s1 and s2.

5. Axiom. The connotation difference of s1 and s2 is the range between the sentiment scores of s1 and s2 divided by the full range of sentiment.
6. Axiom. If min is not equal to max but their signs are the same, then their connotation similarity is equal to their connotation commonality.
7. Axiom. The connotation commonality of s1 and s2 is 1 minus the connotation difference of s1 and s2.

Fig. 1. Mathematical definition of connotation similarity

Using the axioms above we can define connotation similarity mathematically; Fig. 1 illustrates the formal definition. In the next section we implement this definition in two separate algorithms.
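A minimal Python sketch of this definition is given below. It assumes sentiment scores in [−1, 1] (so the full range S = 2) and returns the cross-sign case as a negated difference, which is how the negative MCS scores reported later in Tables 1 and 3 appear to arise; that sign convention is our reading of Fig. 1, not something stated explicitly in the text above.

def connotation_similarity(score_s1, score_s2, full_range=2.0):
    lo, hi = min(score_s1, score_s2), max(score_s1, score_s2)
    if lo == hi:
        return 1.0                       # Axiom 3: identical scores
    difference = (hi - lo) / full_range  # Axiom 5: range over the full range of sentiment
    if (lo < 0) != (hi < 0):             # Axiom 4: opposite signs -> connotation difference
        return -difference
    return 1.0 - difference              # Axioms 6-7: same sign -> connotation commonality

# e.g. connotation_similarity(0.1, -0.1) -> -0.1 and connotation_similarity(0.2, 0.3) -> 0.95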

3 WCS vs MCS

In this section we present and evaluate two different algorithms of the mathematical
definition of connotation similarity provided in Sect. 2, namely WCS and MCS. In
Sect. 4 we establish which one works best with respect to the linguistic definition of
connotation similarity provided in Sect. 1.
In order to assess the similarity of expressed opinions in two sentences, we first
extract the sets of words that are likely to carry an emotional charge in each sentence.
The sets of words that are likely to carry an emotional charge are nouns, verbs, and
adjectives. We then find the minimum and maximum sentiment scores in order to find
the range of sentiment between the two sentences. We then calculate the probability
that the two sentences express different opinions as defined in Fig. 1. We find the
minimum and maximum sentiment scores in two different algorithms:
1. By using a weighted average of the different word sets in each sentence
2. By comparing the word sets to find the highest and lowest scores
The first algorithm WCS generates the range D of sentiment between the sentences
from the weighted sum of sentiment scores of their respective sets of verbs, adjectives,
and nouns. This range is then divided by the full range of sentiment S of the sentiment
analysis system to obtain the probability that they express different opinions.
In particular, we parse sentences s1 and s2 and create a set of verbs from sentence
s1 and a set of verbs from sentence s2. We then get their respective sentiment scores,
denoted as verbs_s1 and verbs_s2 in Fig. 2. We repeat this procedure for all nouns and
adjectives as well. We compute min and max using the weighted sum of sentiment
scores of the verbs, nouns, and adjectives in the two sentences as described in Fig. 2.
The full WCS algorithm is described in Fig. 3.

Fig. 2. WCS formula for finding min and max
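Since Fig. 2 is not reproduced here, the following sketch illustrates one way the WCS min and max could be computed from weighted sums; the equal weights are an assumption for illustration only, not the weights actually used in the paper.

def wcs_min_max(s1, s2, w_verbs=1/3, w_nouns=1/3, w_adj=1/3):
    # s1 and s2 are dictionaries with the sentiment scores of the word sets
    # of each sentence, e.g. {"verbs": 0.1, "nouns": -0.2, "adj": 0.3}.
    weighted_s1 = w_verbs * s1["verbs"] + w_nouns * s1["nouns"] + w_adj * s1["adj"]
    weighted_s2 = w_verbs * s2["verbs"] + w_nouns * s2["nouns"] + w_adj * s2["adj"]
    return min(weighted_s1, weighted_s2), max(weighted_s1, weighted_s2)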



Fig. 3. The full WCS algorithm

We also implemented the second algorithm MCS for comparison. Instead of using a
weighted average of the sentiment scores of verbs, adjectives, and nouns in each
phrase, MCS finds the maximum difference between pairs of sentiment scores of the
two sentences. The function shown in Fig. 4 describes this process in detail. It takes as
input two dictionaries, namely scores[], which contains the sentiment scores of verbs,
nouns, and adjectives; and diffs[], which contains the absolute differences between the
scores in the two sentences. The variable cur_diff initially contains the difference
between the sentiment scores of the full sentences. The full MCS algorithm is described
in Fig. 5.
In the next section we evaluate these algorithms in terms of how well they capture
the connotation similarity of two types of phrases from two different datasets we built
using the dataset in [22].

def find_min_max(scores, diffs, cur_diff):
    # Start from the sentence-level sentiment scores.
    min_score = scores["sentiment_s1"]
    max_score = scores["sentiment_s2"]
    # Keep the pair of scores whose gap is the largest seen so far.
    if diffs["nouns"] > cur_diff:
        cur_diff = diffs["nouns"]
        min_score = scores["sentiment_nouns1"]
        max_score = scores["sentiment_nouns2"]
    if diffs["verbs"] > cur_diff:
        cur_diff = diffs["verbs"]
        min_score = scores["sentiment_verbs1"]
        max_score = scores["sentiment_verbs2"]
    if diffs["adj"] > cur_diff:
        cur_diff = diffs["adj"]
        min_score = scores["sentiment_adj1"]
        max_score = scores["sentiment_adj2"]
    return min_score, max_score

Fig. 4. A MCS function for finding the smallest min and the largest max

Fig. 5. The full MCS algorithm



4 Results Analysis

From a linguistic perspective, connotation similarity evaluates whether two phrases are
similar in their emotional meaning, which can be negative, positive, or neutral. In this
section we compare WCS to MCS with respect to this definition. In particular, we
discuss the results we obtained by running various examples of phrases with similar
and opposite emotional meaning through WCS, MCS, a standard algorithm of cosine
similarity, and a novel algorithm of cosine similarity we call Weighted Cosine Simi-
larity (WCosS). Similarly to WCS, for WCosS we parse each phrase into sets of verbs,
adjectives, and nouns. Then we run each set through the standard algorithm of cosine
similarity and obtain a score. We then take the average of these scores as the final
WCosS score. For parsing parts-of-speech tags and for obtaining the sentiment scores
we use Google NLP API [23].
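For concreteness, the following is a minimal WCosS sketch built on scikit-learn; it is not the implementation used in the paper. Part-of-speech extraction is assumed to have been done already (the paper uses the Google NLP API for this), and each argument is a list of space-joined word sets, one per part of speech.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def wcoss(pos_sets_s1, pos_sets_s2):
    scores = []
    for words1, words2 in zip(pos_sets_s1, pos_sets_s2):
        if not words1.strip() or not words2.strip():
            continue  # skip parts of speech missing from either sentence
        vectors = CountVectorizer().fit_transform([words1, words2])
        scores.append(cosine_similarity(vectors[0], vectors[1])[0, 0])
    # Final WCosS score: average of the per-part-of-speech cosine similarities.
    return sum(scores) / len(scores) if scores else 0.0

# e.g. wcoss(["continue believe", "people news", "fake"],
#            ["believe", "people news", "fake"])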
We constructed sentences from the first 100 phrases of the dataset in [22]. For each
of the phrases we looked for additional synonym phrases in the dataset and in a
thesaurus, we then negated the phrases, and we finally added identical fill words to
pairs of phrases to create full sentences. As shown in Table 1, for example, one of the
synonymous pairs in the dataset is “continue to believe, are continuing to believe”. An
additional synonymous phrase is “still believe”. The negated phrases include “don’t
continue to believe, aren’t continuing to believe”. We added the words “people, in,
fake, news” to complete the sentences in this example. The first pair of sentences is
“People continue to believe in fake news; People are continuing to believe in fake
news”. They are nearly identical and yet cosine similarity returns a score of only 0.67
out of 1. Our modified implementation of cosine similarity, namely WCosS, does not perform any better. This is because in this example one of the sentences contains the
word “are” but the other sentence does not.
Cosine similarity simply checks to see if terms appearing in one sentence also
appear in the other. The word “are” happens to be a verb so WCosS registers it as well.
For this reason, both cosine similarity algorithms return a score lower than expected.
On the other hand, both connotation similarity algorithms capture the similarity of the
expressed opinions as expected and return a score of 0.99/1 and 1/1 respectively.
Similarly, in the second and third example pairs, both connotation similarity algorithms
outperform the cosine similarity algorithms. WCosS performs slightly better than the
ordinary cosine similarity algorithm because in these two cases the extra word is “still”,
which is neither a verb, nor a noun or an adjective so WCosS does not register it.
The phrases “didn’t stop believing, didn’t discontinue to believe” also express
similar sentiment to “continue to believe, are continuing to believe” so their conno-
tation similarity score should be high as well, as it is in both WCS and MCS. Unfor-
tunately, both cosine similarity algorithms return a very low similarity score for these
examples.
In example pairs with opposite emotional meaning only MCS performs as
expected. Namely, for two sentences, in which opposite opinions are expressed, MCS
returns the lowest score. This is expected because by the mathematical definition of
connotation similarity, sentences with opposite sentiment are expected to have a
negative connotation similarity score. With respect to the linguistic definition of

Table 1. Combinations of sentences with similar and opposite meaning


Input Cosine WCosS WCS MCS
People continue to believe in fake news 0.67 0.66 0.99 1
People are continuing to believe in fake news
People continue to believe in fake news 0.63 0.86 0.99 1
People still believe in fake news
People are continuing to believe in fake news 0.57 0.77 0.99 1
People still believe in fake news
People are continuing to believe in fake news 0.36 0.5 1 0.9
People didn’t discontinue believing in fake news
People continue to believe in fake news 0.4 0.55 0.99 0.9
People didn’t discontinue believing in fake news
People don’t continue to believe in fake news 0.41 0.55 0.99 0.96
People discontinued believing in fake news
People didn’t stop believing in fake news 0.75 0.8 0.99 0.94
People didn’t discontinue believing in fake news
People are continuing to believe in fake news 0.78 0.88 0.96 −0.1
People aren’t continuing to believe in fake news
People continue to believe in fake news 0.45 0.59 0.98 −0.1
People discontinued believing in fake news
People continue to believe in fake news 0.88 0.81 0.99 −0.1
People don’t continue to believe in fake news
People are continuing to believe in fake news 0.41 0.55 0.98 −0.1
People discontinued believing in fake news
People continue to believe in fake news 0.45 0.59 0.94 −0.2
People stopped believing in fake news
People are continuing to believe in fake news 0.41 0.55 0.94 −0.2
People stopped believing in fake news

connotation similarity, MCS clearly outperforms all other algorithms we tested it


against as well.
We created a second dataset of examples to test WCS and MCS for other parts of
speech such as nouns and adjectives. Table 2 shows the words we used to create the

Table 2. Sentiment scores of words according to Google NLP API


Word Sentiment score
Destroyed −0.3
Protected +0.2
Secured +0.2
Fake −0.3
Real +0.1
Malware −0.4
Virus −0.1

Table 3. A test dataset for other parts of speech such as nouns and adjectives
Input Cosine WCosS WCS MCS
The firewall protected the real system 0.78 0.75 0 −0.4
The firewall protected the fake system
The firewall destroyed the real system 0.78 0.75 0 −0.4
The firewall protected the real system
The firewall protected the real system 0.78 0.75 0.99 0.75
The firewall secured the real system
The virus destroyed the fake system 0.78 0.75 0.97 0.9
The malware destroyed the fake system
The malware destroyed the fake system 0.78 0.67 0.95 −0.2
The firewall destroyed the fake system
The malware destroyed the fake system 0.6 0.41 0 −0.25
The firewall protected the fake system
The malware destroyed the fake system 0.5 0.2 −0.1 −0.65
The firewall protected the real system
The malware destroyed the real system 0.78 0.75 0.97 −0.2
The malware destroyed the fake system

examples in Table 3. Each one of these words is associated with a particular sentiment
score as shown in Table 2, which is why there are slight fluctuations in the sentiment of
different sentences. When all else is equal, MCS correctly identifies the connotation
nuances caused by the different sentiment scores of the different parts of speech.
Therefore, it can be used in conjunction with other similarity measures such as cosine
similarity to establish textual similarity of two sentences.

5 Conclusion

In this paper we define the notion of connotation similarity from a mathematical and
from a linguistic perspective for the purpose of establishing similarity of opinions using
sentiment analysis. We implement the mathematical definition into two different
algorithms, namely Weighted Connotation Similarity (WCS) and Maximum Conno-
tation Similarity (MCS), and compare their performance with respect to our mathe-
matical and linguistic definitions of connotation similarity. The two algorithms differ in
the way we extract and rate expressed opinions in two sentences.
We introduce a novel variation of cosine similarity, namely Weighted Cosine
Similarity (WCosS) and compare all algorithms against one another using two datasets.
We show that one of the algorithms for connotation similarity, namely MCS, outperforms all the others when it comes to determining the similarity of sentiment expressed in
two sentences. Both WCS and MCS require a sentiment analysis system, which is
capable of identifying implicit and explicit negation. We implement a state-of-the-art
sentiment analysis system, namely the Google NLP API, in order to test our algorithms using the
datasets described in the previous section. Unfortunately, the sentiment analysis
function of Google NLP API is proprietary and we could not provide in-depth analysis

of our results [23]. Future work will focus on designing an open source sentiment
analysis system, which is capable of identifying implicit and explicit negation.

References
1. Connotation. https://en.wikipedia.org/wiki/Connotation. Accessed 27 June 2019
2. Gomaa, W.H., Fahmy, A.A.: A survey of text similarity approaches. Int. J. Comput. Appl.
86(13), 13–18 (2013)
3. Vijaymeena, M., Kavitha, K.: A survey on similarity measures in text mining. Mach. Learn.
Appl. Int. J. 3(2), 19–28 (2016)
4. Rozeva, A., Zerkova, S.: Assessing semantic similarity of texts – methods and algorithms.
In: AIP Conference Proceedings, vol. 1910, p. 060012 (2017). https://doi.org/10.1063/1.
5014006
5. Spasić, I., Corcoran, P., Gagarin, A., Buerki, A.: Head to head: semantic similarity of multi-
word terms. IEEE Access 6, 20545–20557 (2018). https://doi.org/10.1109/ACCESS.2018.
2826224
6. Landauer, T.K., Foltz, P.W., Laham, D.: An introduction to latent semantic analysis.
Discourse Processes 25(2–3), 259–284 (1998). https://doi.org/10.1080/01638539809545028
7. Turney, P.D.: Measuring semantic similarity by latent relational analysis. In: Proceedings of
the Nineteenth International Joint Conference on Artificial Intelligence, IJCAI 2005,
Edinburgh, Scotland, pp. 1136–1141 (2005)
8. Gabrilovich, E., Markovitch, S.: Computing semantic relatedness using Wikipedia-based
explicit semantic analysis. In: Sangal, R., Mehta, H., Bagga, R.K. (eds.) Proceedings of the
20th International Joint Conference on Artificial Intelligence (IJCAI 2007), pp. 1606–1611.
Morgan Kaufmann Publishers Inc., San Francisco (2007)
9. Radinsky, K., Agichtein, E., Gabrilovich, E., Markovitch, S.: A word at a time: computing
word relatedness using temporal semantic analysis. In: Proceedings of the 20th International
Conference on World Wide Web (WWW 2011), pp. 337–346. ACM, New York (2011).
http://dx.doi.org/10.1145/1963405.1963455
10. Hieu, N.T., Di Francesco, M., Ylä-Jääski, A.: Extracting knowledge from Wikipedia articles
through distributed semantic analysis. In: Lindstaedt, S., Granitzer, M. (eds.) Proceedings of
the 13th International Conference on Knowledge Management and Knowledge Technolo-
gies (i-Know 2013), Article 6, 8 p. ACM, New York (2013). https://doi.org/10.1145/
2494188.2494195
11. Shirakawa, M., Nakayama, K., Hara, T., Nishio, S.: Wikipedia-based semantic similarity
measurements for noisy short texts using extended Naive Bayes. IEEE Trans. Emerg.
Top. Comput. 3(2), 205–219 (2015). https://doi.org/10.1109/TETC.2015.2418716
12. Pawar, A., Mago, V.: Challenging the boundaries of unsupervised learning for semantic
similarity. IEEE Access 7, 16291–16308 (2019). https://doi.org/10.1109/ACCESS.2019.
2891692
13. Zhu, G., Iglesias, C.A.: Computing semantic similarity of concepts in knowledge graphs.
IEEE Trans. Knowl. Data Eng. 29(1), 72–85 (2017). https://doi.org/10.1109/tkde.2016.
2610428
14. Xie, C., Li, G., Cai, H., Jiang, L., Xiong, N.N.: Dynamic weight-based individual similarity
calculation for information searching in social computing. IEEE Syst. J. 11(1), 333–344
(2017). https://doi.org/10.1109/JSYST.2015.2443806

15. Giatsoglou, M., Vozalis, M.G., Diamantaras, K., Vakali, A., Sarigiannidis, G., Chatzisavvas,
K.Ch.: Sentiment analysis leveraging emotions and word embeddings. Expert Syst. Appl.
69, 214–224 (2017). https://doi.org/10.1016/j.eswa.2016.10.043. ISSN 0957-4174
16. Councill, I.G., McDonald, R., Velikovich, L.: What’s great and what’s not: learning to
classify the scope of negation for improved sentiment analysis. In: Morante, R., Sporleder,
C. (eds.) Proceedings of the Workshop on Negation and Speculation in Natural Language
Processing (NeSp-NLP 2010), pp. 51–59. Association for Computational Linguistics,
Stroudsburg (2010)
17. Priyadarshana, Y.H.P.P., Ranathunga, L., Karunaratne, P.M.: Sentiment negation: a novel
approach in measuring negation score. In: 2016 Future Technologies Conference (FTC), San
Francisco, CA, pp. 689–695 (2016). https://doi.org/10.1109/ftc.2016.7821679
18. Ravi, K., Ravi, V.: A survey on opinion mining and sentiment analysis: tasks, approaches
and applications. Knowl. Based Syst. 89, 14–46 (2015). https://doi.org/10.1016/j.knosys.
2015.06.015. ISSN 0950-7051
19. Appel, O., Chiclana, F., Carter, J., Fujita, H.: A hybrid approach to the sentiment analysis
problem at the sentence level. Knowl. Based Syst. 108, 110–124 (2016). https://doi.org/10.
1016/j.knosys.2016.05.040. ISSN 0950-7051
20. Dragoni, M., Poria, S., Cambria, E.: OntoSenticNet: a commonsense ontology for sentiment
analysis. IEEE Intell. Syst. 33(3), 77–85 (2018). https://doi.org/10.1109/MIS.2018.
033001419
21. Lin, D.: An information-theoretic definition of similarity. In: Proceedings of the Fifteenth
International Conference on Machine Learning, ICML 1998, San Francisco, CA, USA,
pp. 296–304 (1998)
22. Paraphrase Database. http://paraphrase.org/#/download. Accessed 08 July 2019
23. Google NLP API. https://cloud.google.com/natural-language/. Accessed 08 July 2019
A Cotton and Flax Fiber Classification Model
Based on Transfer Learning and Spatial
Fusion of Deep Features

Shangli Zhou1, Song Cai1, Chunyan Zeng1, and Zhifeng Wang2


1 Hubei Key Laboratory for High-Efficiency Utilization of Solar Energy and Operation Control of Energy Storage System, Hubei University of Technology, Wuhan, China
cyzeng@hbut.edu.cn
2 Department of Digital Media Technology, Central China Normal University, Wuhan, China

Abstract. In order to make up for the disadvantages of transfer learning methods, a cotton and flax fiber classification model based on transfer learning with fusion of deep features is proposed. First, the proposed model utilizes VGG16 and InceptionV3 to extract deep features from cotton and flax fiber images. Next, using a spatial feature fusion algorithm, the model merges the deep features extracted from the different networks. Finally, the generalized deep feature is used to train a SoftMax classifier, thereby achieving accurate detection of cotton and flax fibers. On a testing dataset of 4008 images, the cotton and flax fiber classification accuracy, sensitivity and specificity of the proposed model are 0.978, 0.969 and 0.985, respectively. The experiments demonstrate that the proposed model outperforms the state-of-the-art model in the same hardware environment. The results show that the proposed model can detect cotton and flax fiber with high accuracy.

1 Introduction

Cotton and flax are both natural plant fibers and good textile materials. Cotton absorbs moisture well and feels soft, but it wrinkles, shrinks, and mildews easily. Flax has excellent breathability and heat dissipation; however, its texture is rough. With the development of the chemical and textile industries, a number of new textiles composed of cotton and flax have appeared on the market. For the textile industry, detecting fiber composition quickly and accurately, as one of the major indexes for assessing the value of textile goods, has become a key issue.
Although transfer learning has achieved better results in flax and cotton fiber classification than conventional machine learning and deep learning, transferring the deep features of a single CNN architecture is not enough. These deep features suffer from large intra-class variance and small inter-class variance [1]. Generally, a single feature descriptor is sensitive only to changes in some particular image characteristics, but not to changes in others [2]. For plant fiber images with similar color features, texture features and backgrounds, classifiers therefore usually fail to obtain accurate detection results.

© Springer Nature Switzerland AG 2020
L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 152–162, 2020.
https://doi.org/10.1007/978-3-030-33509-0_14

Aiming at the above problems and challenges, a cotton and flax fiber classification model with fusion of deep features [3] under transfer learning is proposed. Specifically, pretrained CNN models are used as feature extractors to extract the deep features of images [4], and a spatial feature fusion method is added to cascade the deep features extracted by the different networks [5]. This method uses only 20,000 fiber images and achieves a detection accuracy of 0.978 within 100 training epochs, higher than the detection accuracy reported in prior research.
There are three main contributions in this paper. Firstly, it applies the idea of feature fusion to the classification of plant fibers, making up for the shortcomings of a single deep feature. Secondly, it combines transfer learning with feature fusion, which further improves the classification performance of the proposed model. Thirdly, the proposed model facilitates the work of those involved in the detection and pricing of textiles.

2 Proposed Model

The proposed model includes two stages, deep feature extraction and feature fusion, as shown in Fig. 1.

Fig. 1. The proposed framework

In the first stage, two deep feature extractors are constructed [6]: the first consists of a VGG 16 network and the other of an Inception V3 network. The features extracted by the VGG 16 deep feature extractor are taken from the output of the seventh fully connected layer of the VGG 16 network, while for the Inception V3 deep feature extractor the features are obtained from the output of the last

fully connected layer of Inception V3. Compared with handcrafted features, the deep features are more robust [7]. The second stage is feature fusion, which includes three parts: (1) reducing the feature dimension to select the valuable features by which the variations in the samples are captured; (2) linearly concatenating the features extracted by the different extractors to form new generalized features; (3) creating a SoftMax classifier consisting of two fully connected layers and a SoftMax activation function.
The training process of the proposed model is as follows: first, the network parameters of the pre-trained CNNs are loaded into the feature extractors; next, image features are obtained through forward propagation of the feature extractors; finally, the SoftMax classifier, a shallow neural network, is trained until its results converge. Note that this feature extraction process avoids the complex backpropagation operations of convolutional neural networks; in other words, the only part of the proposed model that needs training is a shallow neural network. Therefore, the training time of the model is greatly shortened. At the same time, compared to a single feature, the fusion of different features improves the generalization ability and detection accuracy of the proposed model.
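To make the two-stage pipeline above more concrete, the following TensorFlow/Keras sketch illustrates how such a model could be assembled. It is not the authors' implementation: the layer names (the 4096-dimensional "fc2" output of VGG 16 and the 2048-dimensional global-average-pooling output of Inception V3), the classifier width and the omission of the dimension-reduction step are assumptions made only for illustration.

```python
# Illustrative sketch (not the authors' code) of the two-stage pipeline:
# frozen pretrained CNNs as deep feature extractors, cascade (concatenation)
# fusion, and a shallow SoftMax classifier trained on the fused features.
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import VGG16, InceptionV3
from tensorflow.keras.applications.vgg16 import preprocess_input as vgg_pre
from tensorflow.keras.applications.inception_v3 import preprocess_input as inc_pre

# Stage 1: deep feature extractors (no backpropagation through the CNNs).
vgg = VGG16(weights="imagenet", include_top=True)
vgg_extractor = Model(vgg.input, vgg.get_layer("fc2").output)      # 4096-d features
inc_extractor = InceptionV3(weights="imagenet", include_top=False,
                            pooling="avg")                          # 2048-d features

def extract_fused_features(images_224, images_299):
    """The same image batch resized to 224x224 for VGG 16 and 299x299 for Inception V3."""
    f_vgg = vgg_extractor.predict(vgg_pre(images_224.astype("float32")))
    f_inc = inc_extractor.predict(inc_pre(images_299.astype("float32")))
    # Stage 2: cascade fusion of the two feature sets (6144-d generalized feature).
    return np.concatenate([f_vgg, f_inc], axis=1)

# Shallow SoftMax classifier; only this part is trained.
classifier = tf.keras.Sequential([
    tf.keras.Input(shape=(4096 + 2048,)),
    layers.Dense(256, activation="relu"),
    layers.Dense(2, activation="softmax"),   # cotton vs. flax
])
classifier.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
                   loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# classifier.fit(extract_fused_features(x224, x299), labels, epochs=100, batch_size=8)
```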

3 Deep Feature Extractor

In this section, we adopt four pre-trained CNNs, AlexNet, VGG 16, Inception V3 and ResNet 50, as deep feature extractors. All CNNs are trained on the ImageNet dataset for an image classification problem. The details of these four CNNs are described as follows.
In 2012, AlexNet, proposed by Krizhevsky et al., significantly outperformed all competitors by reducing the top-5 error from 26% to 15.3% [8]. The structure of AlexNet is similar to that of LeNet by Yann LeCun et al., but deeper, with more filters per layer and with stacked convolutional layers. It attaches ReLU activations after every convolutional and fully connected layer. Because the hidden layers of AlexNet are relatively small, training can be completed quickly, and the network has good feature extraction ability with a regular structure; therefore, the first feature extractor designed in this paper is the AlexNet deep feature extractor, whose structure is shown in Fig. 2.
The second chosen network is VGG 16, the runner-up of the ILSVRC 2014 competition, invented by the Visual Geometry Group (VGG) [9] at the University of Oxford. VGGNet consists of 16 weight layers and is very appealing because of its uniform architecture. Similar to AlexNet, it uses only 3×3 convolutions, but with many filters. It is currently the most preferred choice in the community for extracting features from images. In this paper, the VGG 16 feature extractor takes the output of the seventh fully connected layer of VGG 16 as deep features, whose dimension is 4096, as can be seen in Fig. 3.
The third chosen network is Inception V3 [10], proposed by Szegedy et al. in 2015, which is designed by adding inception blocks to a typical CNN architecture. By successively adding RMSProp, label smoothing, factorized 7×7 convolutions and an auxiliary classifier with batch normalization to the Inception V2 network, a lower error rate is obtained, making it the first runner-up for image classification in ILSVRC 2015. In this paper, the Inception V3 feature extractor takes the output of the last fully connected layer of Inception V3 as deep features, whose dimension is 2048, as can be seen in Fig. 4.

Fig. 2. AlexNet deep feature extractor

Fig. 3. VGG-16 deep feature extractor
The last chosen network is ResNet 50 [11]. The ResNet architecture takes the VGG-19 network as a reference and adds residual units through a shortcut mechanism, as shown in Fig. 5. The main changes are that ResNet directly uses convolutions with stride 2 for downsampling and replaces the fully connected layers with a global average pooling layer. At the same time, an important design principle applied in the ResNet framework is that the number of feature maps is doubled when the feature map size is reduced by half, thereby preserving the complexity of each network layer. ResNet achieves a top-5 error rate of 3.57%, which beats human-level performance on this dataset. In this paper, the ResNet 50 feature extractor takes the 2048-dimensional bottleneck feature of ResNet 50 as the deep feature.

Fig. 4. Inception V3 deep feature extractor

Fig. 5. Residual block

4 Fusion of Deep Features

Feature fusion is the comprehensive processing of multi-source heterogeneous data to achieve joint decision-making. In this paper, the deep features extracted by multiple CNNs are mapped from multiple feature spaces into a generalized feature space through fusion. A feature space is defined as:

$$W = \{X_1, X_2, \ldots, X_M\} \qquad (1)$$

where $X_k$, $k \in [1, M]$, is the feature set of a single sample and $M$ is the total number of samples. The feature set is

$$X_k = \{X_k^1, X_k^2, X_k^3, \ldots, X_k^D\} \qquad (2)$$

where $X_k^i$ represents the $i$-th dimension feature of the $k$-th sample, $i \in [1, D]$, and $D$ is the feature dimension of a single sample. The features extracted by different CNNs form

individual feature spaces. The goal of this feature fusion (refer to Fig. 6) is finding a
feature subspace, which is more generalized than the individual feature space. The
fusion process is defined as:

$$f : (W_a, W_b) \rightarrow W \qquad (3)$$

where $W_a$ and $W_b$ are different individual feature spaces and $f$ is the fusion function.

Fig. 6. Concept diagram: feature mapping.

The deep features extracted by CNNs are usually affected by data imbalance and inherent noise. However, such a joint feature transformation plays a vital role in selecting the valuable features by which the variations in the samples are captured. As a result, the above factors do not degrade the classification accuracy.

4.1 Spatial Feature Fusion Algorithm


The spatial feature fusion algorithm aims to fuse the output features of two deep convolutional neural networks; in other words, the two networks are connected by the fusion algorithm, and the connection point is a fusion point. The next step is training the new SoftMax classifier to accomplish the classification task. The fusion function is defined as:

$$f : X^a + X^b \rightarrow y \qquad (4)$$

where $X^a$ and $X^b$ are the feature sets extracted by feature extractors A and B respectively, $y$ represents the feature set in the fusion space, and $X^a, X^b, y \in W^{M \times D \times C}$, where $M$, $D$ and $C$ are the length, width and number of channels of the feature set, respectively.
158 S. Zhou et al.

4.2 Cascade Feature Fusion Function


The fusion function used in this paper is a cascade fusion:

$$y^{cat} = f^{cat}(X^a, X^b) \qquad (5)$$

The process of cascade feature fusion can be seen in Fig. 7.

Fig. 7. Process of cascade feature fusion

Cascade fusion preserves the two sets of sub-features and expands the dimension of the fused feature set to twice that of the sub-feature sets:

$$y^{cat}_{i,2d} = X^a_{i,d}, \qquad y^{cat}_{i,2d-1} = X^b_{i,d} \qquad (6)$$

where $y^{cat} \in W^{M \times 2D \times C}$, $d \in [1, D]$ and $D \le l$, with $l = \min\{D_a, D_b\}$, where $D_a$ and $D_b$ are the dimensions of the features extracted by feature extractors A and B, respectively.
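Under one reading of Eq. (6), cascade fusion interleaves the two truncated sub-feature sets so that the fused vector has twice their dimension; a plain concatenation is equivalent up to a permutation of the dimensions. A minimal NumPy sketch of this interleaving (an illustration, not the authors' code) is:

```python
import numpy as np

def cascade_fuse(xa, xb):
    """Cascade fusion in the sense of Eqs. (5)-(6): xa and xb are (N, Da) and
    (N, Db) feature matrices; both are truncated to l = min(Da, Db) and interleaved."""
    l = min(xa.shape[1], xb.shape[1])
    xa, xb = xa[:, :l], xb[:, :l]
    fused = np.empty((xa.shape[0], 2 * l), dtype=xa.dtype)
    fused[:, 1::2] = xa   # positions 2d (1-based) receive X^a
    fused[:, 0::2] = xb   # positions 2d-1 (1-based) receive X^b
    return fused
```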

5 Experimental Results and Discussion

In order to verify the performance of the proposed method, this paper uses cotton and flax fiber images obtained in the laboratory as a data set and compares transfer learning models with the proposed model.
The proposed model still needs a SoftMax classifier to be built; however, the inputs to this classifier are the fused deep features.
All experiments are carried out on an Intel Core i7-8665UE CPU and a GeForce GTX 1080Ti GPU with 11 GB of memory, using Python 3.6 and the TensorFlow framework. For all chosen deep CNNs, the network is trained for 100 epochs. In each epoch, batches of 8 samples are selected in order, and the Adam optimizer with a learning rate of 0.001, beta1 of 0.9 and beta2 of 0.999 is used to update the network weights. An epoch ends when the whole dataset has been traversed.

5.1 Datasets
The data sets used in the experiment are fiber images of cotton and flax collected via
optical microscope, as can be seen in Fig. 8.

Fig. 8. Datasets image samples, (a) Cotton. (b) Flax.

The datasets are RGB images of size 1116×164 pixels, saved in PNG format. So far, a total of 20032 images have been collected, including 14445 flax images and 5598 cotton images; the distribution of the datasets can be seen in Fig. 9.

Fig. 9. Distribution of the datasets

In order to adapt to the input size requirements of the different networks, the images are resized with OpenCV to 224×224, 227×227 and 299×299, respectively.

5.2 Metrics
Performances of all methods are evaluated by accuracy, sensitivity and specificity,
computed as follows:

$$Acc = \frac{TP + TN}{TP + TN + FP + FN} \qquad (7)$$

$$Sen = \frac{TP}{TP + FN} \qquad (8)$$

$$Spe = \frac{TN}{TN + FP} \qquad (9)$$

where Acc, Sen and Spe are accuracy, sensitivity and specificity respectively; TP and TN are the numbers of true positives (cotton) and true negatives (flax) respectively; FP and FN are the numbers of false positives and false negatives respectively.
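For reference, Eqs. (7)–(9) can be computed directly from predicted and true labels; the following small sketch (with cotton as the positive class, as above) is only illustrative:

```python
import numpy as np

def acc_sen_spe(y_true, y_pred):
    """Accuracy, sensitivity and specificity per Eqs. (7)-(9);
    label 1 = cotton (positive), label 0 = flax (negative)."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    acc = (tp + tn) / (tp + tn + fp + fn)
    sen = tp / (tp + fn)
    spe = tn / (tn + fp)
    return acc, sen, spe
```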

5.3 Experiment
In this experiment, feature fusion is added to the transfer learning model to establish a new cotton and flax fiber classification model based on transfer learning and fusion of deep features. The classification results of the proposed model and the transfer learning models can be seen in Table 1.

Table 1. Results of transfer learning and proposed model


Model Feature extractor Acc Sen Spe
Proposed model Fusion of VGG-16 and Inception-V3 0.978 0.969 0.985
Transfer learning Inception-V3 0.975 0.957 0.982
VGG-16 0.971 0.949 0.979
AlexNet 0.959 0.921 0.974
ResNet-50 0.913 0.816 0.952

It can be seen from Table 1 that the accuracy, sensitivity and specificity of the proposed model are higher than those of the transfer learning models. In particular, the sensitivity of the proposed model is significantly improved. Therefore, comparing the Inception V3 model, which has the best performance among the four transfer learning methods, with the proposed model, we obtain the confusion matrices of the results shown in Fig. 10.

Fig. 10. Confusion matrix of Inception-V3 and proposed model, 1: Cotton. 0: Flax.

It can be clearly seen from Fig. 10 that the proposed model detects more cotton fibers than Inception V3 while its recognition ability for flax is unchanged, which improves the classification accuracy. In other words, feature fusion enhances the model's ability to learn cotton fibers, so making full use of spatial feature fusion reduces the impact of imbalanced data. We can conclude that spatial feature fusion helps classifiers improve detection accuracy; hence, the proposed model performs better in cotton and flax fiber classification than transfer learning with a single deep feature.

6 Conclusion

Combining the ideas of transfer learning and spatial fusion, we propose a cotton and flax fiber classification model based on transfer learning and spatial fusion of deep features. Due to the unbalanced numbers of cotton and flax images, transfer learning models lack recognition ability for cotton images. However, the experimental results show that the recall of the proposed model is increased; based on this finding, we think spatial feature fusion can, to some extent, avoid the influence of imbalanced data. The proposed model, which absorbs the advantages of transfer learning and spatial feature fusion, represents a clear gain in accuracy. As a future direction, more attention can be paid to different methods of feature fusion, which may yield feature spaces that are even more generalized than before.

Acknowledgments. This research was supported by National Natural Science Foundation of


China (No. 61901165, No. 61501199), Science and Technology Research Project of Hubei
Education Department (No. Q20191406), Excellent Young and Middle-aged Science and
Technology Innovation Team Project in Higher Education Institutions of Hubei Province
(No. T201805), Hubei Natural Science Foundation (No. 2017CFB683), and self-determined
research funds of CCNU from the colleges’ basic research and operation of MOE
(No. CCNU18QN021).

References
1. Ge, S., et al.: Deep and discriminative feature learning for fingerprint classification. In: IEEE
International Conference on Computer and Communications (2018)
2. Hassaballah, M., Abdelmgeid, A.A., Alshazly, H.A.: Image features detection, description
and matching. In: Awad, A., Hassaballah, M. (eds.) Image Feature Detectors and
Descriptors, vol. 630. Springer, Cham (2016)
3. Thangarajah, A., et al.: Fusion of transfer learning features and its application in image
classification. In: Electrical and Computer Engineering (2017)
4. Martel, J., et al.: A combination of hand-crafted and hierarchical high-level learnt feature
extraction for music genre classification. In: International Conference on Artificial Neural
Networks and Machine Learning ICANN (2013)
5. Liu, Y., Zou, Z., Xing, J.: Feature fusion method in pattern classification. J. Beijing Univ.
Posts Telecommun. 40(4), 1–8 (2017)
6. Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10),
1345–1359 (2010)
7. Wang, X., et al.: Matching user photos to online products with robust deep features (2016)
8. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional
neural networks. In: International Conference on Neural Information Processing Systems
(2012)
9. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image
recognition (2014). arXiv:1409.1556
10. Szegedy, C., et al.: Rethinking the inception architecture for computer vision. In: Computer
Vision and Pattern Recognition (2016)
11. He, K., et al.: Deep residual learning for image recognition. In: Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition (2016)
An Automatic Text Summary Method Based on LDA Model

Caiquan Xiong, Li Shen(&), and Zhuang Wang

School of Computer Science, Hubei University of Technology, Wuhan 430068, China
slchampion@163.com

Abstract. Automatic document summarization is a technique that condenses documents and generates summaries representing the whole document, helping people quickly extract important information. Aiming at the lack of semantic information in document abstracts, this paper proposes a weighted hybrid document summary model based on LDA. The model obtains the topic distribution probability by analysing the document. First, we use the FCNNM (Fine-grained Convolutional Neural Network Model) to extract semantic features; we then obtain surface information of the text from heuristic rules, including the length and position of the sentence and the TF-IDF of the words in the sentence, and compute a weighted sentence score. Finally, a greedy algorithm is used to select the sentences that form the abstract. Experiments show that the proposed model can effectively compensate for the lack of semantics between abstract sentences and the text in traditional algorithms, effectively reduce the high redundancy in document abstracts and improve the quality of abstracts.

1 Introduction

With the continuous development of Internet technology, the amount of data generated every day is explosive, and extracting effective information from massive data has become an urgent need. Automatic text summarization condenses a large amount of content and generates a concise summary to replace the entire document. Since Luhn [1] proposed it in 1958, it has been one of the hotspots in the field of natural language processing.
At the beginning of this research area, due to the limitations of computer technology, automatic summarization methods took only some heuristic features into consideration. Luhn, for example, directly considers the importance of word frequency in documents for sentences and ranks sentences accordingly to extract them. Edmundson [2] considers the position of sentences in the original text as well as word frequency, and then weighs these factors to calculate the importance of sentences.
In recent years, with the continuous improvement of computing power, large-scale deep learning models and centrality-based approaches have been applied to the field of natural language processing. For example, Zhong [3] designed a model containing three parts, concept extraction, summary generation and reconstruction validation; it was the first use of deep learning in query-oriented document summarization tasks. Xiong and Li [4] combined TextRank with a heuristic clustering algorithm for multi-document summarization and achieved good results on their online argumentation platform.

© Springer Nature Switzerland AG 2020
L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 163–173, 2020.
https://doi.org/10.1007/978-3-030-33509-0_15
When generating abstracts, people usually need to consider the deep meaning of the document. LDA (Latent Dirichlet Allocation) is a probabilistic generative model based on Bayesian theory, proposed by Blei [5] in 2003. It uses Gibbs sampling to transform the document into a three-tier "document-topic-word" model that represents potential topic information. Liu and Tang [6] combined the linear weighting method with the LDA model and proposed a document summarization algorithm based on topic sensitivity. Yang and Fan [7] considered two topic models (PLSA [8] and LDA), combined them with a centrality-based method, and achieved good results.
The focus of this paper is to use the topic model to mine deeper semantic information from the document. The core of extractive summarization is to determine the importance of sentences for the summary set, and a topic is an abstract summary of the document set. Sentences with high similarity to the topics should therefore be selected as the candidate summary set and given relatively large weights, which is also in line with the logic of manually written summaries. In view of the rapid development of deep neural networks in recent years and their powerful representation ability, this paper proposes an automatic text summarization method that combines the LDA topic model with a neural network: a Bayesian probabilistic generative model is used to find potential topic information, and a neural network is used to find semantic associations between words and sentences in documents. These are then combined with the surface information of the document to generate summaries. The experimental results show that the proposed model can effectively alleviate the lack of information between context semantics and improve the quality of document abstracts.

2 Extraction of Document Summary Based on LDA Model

2.1 Basic Idea

In extracting document summaries, the key step is to rank the sentences according to their importance. Traditional summarization algorithms only consider surface information; how to analyze the importance of sentences at the semantic level is a key issue for document summarization. This paper considers that whether a sentence belongs to a summary is closely related to its topic distribution. The LDA model assumes that a document is composed of topics that follow a Dirichlet distribution, each topic is composed of words that follow a Dirichlet distribution, and multinomial distributions govern the document generation process. It is a three-tier Bayesian model that uses an unsupervised machine learning method to find the probabilities of hidden topic information in documents. Based on this, this paper proposes a hybrid machine learning model based on LDA. First, the probability distribution of the topics is analyzed over the corpus. Sentences with high similarity to the topic words are considered to be associated with the summary, so they are given a high weight. Then the FCNNM is used to extract features from word vectors and sentence vectors on the LCSTS data set [9]. Finally, combined with the TF-IDF information, a greedy algorithm is used to sort the sentences and obtain the final summary. The specific process is shown in Fig. 1:

Fig. 1. Document summarization process based on LDA model.

2.2 LDA Theme Model


LDA topic model is an unsupervised learning method in machine learning. It is usually
used to find potential topic information in text data [10]. It is a three-tier Bayesian
structure of “document-topic” and “topic-word”, as shown in Fig. 2.

Fig. 2. Three-tier Bayesian hierarchy.

Unlike PLSA, LDA considers that the topics and words of a document follow Dirichlet prior distributions. The Dirichlet distribution is a generalization of the Beta distribution to higher dimensions. Its probability density function is:

$$\mathrm{Dir}(\vec{p} \mid \vec{\alpha}) = \frac{\Gamma\!\left(\sum_{k=1}^{K} \alpha_k\right)}{\prod_{k=1}^{K} \Gamma(\alpha_k)} \prod_{k=1}^{K} p_k^{\alpha_k - 1} = \frac{1}{\Delta(\vec{\alpha})} \prod_{k=1}^{K} p_k^{\alpha_k - 1} \qquad (1)$$

where

$$\Delta(\vec{\alpha}) = \frac{\prod_{k=1}^{K} \Gamma(\alpha_k)}{\Gamma\!\left(\sum_{k=1}^{K} \alpha_k\right)} = \int_{0}^{1} \prod_{k=1}^{K} p_k^{\alpha_k - 1} \, d\vec{p} \qquad (2)$$

The core idea of LDA is to describe the document-topic and topic-word probability distributions with multinomial distributions, using the Dirichlet distribution as their prior. Its formal representation is as follows:

$$p(\vec{x}, \vec{z} \mid \vec{\alpha}, \vec{\beta}) = p(\vec{x} \mid \vec{z}, \vec{\beta})\, p(\vec{z} \mid \vec{\alpha}) \qquad (3)$$

The process is shown in the following Fig. 3:

Fig. 3. LDA graph model.

2.3 Sentence Feature Selection Based on FCNNM


Convolutional Neural Networks (CNN) were originally used for image recognition in the field of artificial intelligence. With the further development of research, they have also been widely used in natural language processing [11]. In the document summarization task, the purpose of modeling words and sentences is to find the hidden semantic features between words and sentences. Based on this, a Fine-grained Convolutional Neural Network Model (FCNNM) is proposed. The basic idea of this method is to divide the document at a fine granularity into word level and sentence level, and then convolve these levels to extract semantic features. The flow chart is as follows (Fig. 4):

Fig. 4. FCNNM model.

For a document $F_S$, represent its sentence sequence as $C_S = \{S_1, S_2, \ldots, S_n\}$, where $n$ is the number of sentences, and its word sequence as $C_W = \{W_1, W_2, \ldots, W_p\}$, where $p$ is the number of words. A filter is used to convolve the sentence sequence and the word sequence, which is expressed mathematically as:

$$Con_S = f\!\left(\sum_{d=1}^{D} W^p * C_S + b_s\right), \qquad Con_W = f\!\left(\sum_{d=1}^{D} W^p * C_W + b_w\right) \qquad (4)$$

where $W^p \in \mathbb{R}^{m \times n \times D}$ is a three-dimensional convolutional filter, $b_s$ and $b_w$ are offsets, and $f(\cdot)$ is the activation function; in this paper, $\tanh(\cdot)$ is chosen as the activation function.
After convolving the word and sentence sequences, a max pooling operation is performed to obtain the final sentence-level and word-level semantic features, represented mathematically as:

$$F_{s,w} = \max_{i \in \mathbb{R}^{d \times m,n}} \,(Con_S, Con_W) \qquad (5)$$
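The FCNNM is not given in code in this paper; the following Keras sketch only illustrates the idea of Eqs. (4)–(5), using 1-D convolutions over the embedding sequences as a simplification of the three-dimensional filter, with tanh activation and max pooling. The sequence lengths, embedding dimension and number of filters are illustrative assumptions.

```python
# Rough illustration of the FCNNM idea: convolve the word-level and
# sentence-level embedding sequences separately (tanh activation, Eq. (4))
# and max-pool each branch (Eq. (5)), then combine the semantic features.
import tensorflow as tf
from tensorflow.keras import Model, layers

def fcnnm_branch(seq_len, emb_dim, name):
    inp = layers.Input(shape=(seq_len, emb_dim), name=name)
    conv = layers.Conv1D(filters=128, kernel_size=3, activation="tanh")(inp)
    feat = layers.GlobalMaxPooling1D()(conv)
    return inp, feat

word_in, word_feat = fcnnm_branch(seq_len=100, emb_dim=768, name="word_seq")
sent_in, sent_feat = fcnnm_branch(seq_len=30, emb_dim=768, name="sentence_seq")
semantic_features = layers.Concatenate()([word_feat, sent_feat])
fcnnm = Model([word_in, sent_in], semantic_features)
```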

2.4 Sentence Regression Model


Sentence regression is an important step in scoring sentences for document summaries. This paper designs a Mixed Regression Model (MRM) which combines the LDA model, the FCNNM and TF-IDF. By analysing the documents, the number of topics and the topic words are determined. Then, combining the FCNNM, TF-IDF and some heuristic features, the cosine similarity $\cos(\cdot)$ between sentences and topic words is calculated, and a similarity value is obtained through a given activation function. The activation function used in this paper is the sigmoid function, whose mathematical form is as follows:

$$f(x) = \frac{1}{1 + e^{-x}} \qquad (6)$$

The similarity value is compared with a given threshold, and if it is greater than or equal to the threshold, a corresponding score is assigned for the importance of the sentence. The heuristic features F used in the scoring model are shown in Table 1:

Table 1. Sentence features.

Feature                        Formula   Description
Significance characteristics   Len(S)    Length of the sentence
                               Pos(S)    Position of the sentence
                               First(S)  Whether it is the first sentence
                               Last(S)   Whether it is the last sentence
                               Stop(S)   Percentage of stop words
                               Num(S)    Percentage of numbers

After obtaining these features, the final scoring function for sentence importance is as follows:

$$\mathrm{FinalScore} = \sum_{i=1}^{n} f(x_i)\,\beta_i + \sum_{i=1}^{m} F_i\, p_i \qquad (7)$$
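As an illustration of Eq. (7), the final score can be computed as a weighted sum of the sigmoid-squashed topic-similarity terms and the heuristic surface features of Table 1; the weight vectors and feature values in the sketch below are purely illustrative assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))           # Eq. (6)

def final_score(topic_sims, betas, surface_feats, ps):
    """topic_sims: cosine similarities between the sentence and the topic words;
    surface_feats: heuristic features such as Len(S), Pos(S) and TF-IDF values."""
    semantic = sum(sigmoid(x) * b for x, b in zip(topic_sims, betas))
    surface = sum(f * p for f, p in zip(surface_feats, ps))
    return semantic + surface                    # Eq. (7)
```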

2.5 Redundant Control Summary Selection


In order to reduce the redundancy of the extracted sentences, a summary selection algorithm based on greedy redundancy control is proposed in this paper. The core idea of the greedy algorithm is to always make the best available choice when solving a problem. In the document summarization task, all sentences in the candidate cluster are first traversed to find the one with the highest score, which is put into a new cluster C; the remaining sentences are then compared with the sentences in cluster C. The redundancy of cluster C is controlled by a threshold t: if the similarity is greater than or equal to the threshold, the sentence is discarded, otherwise it is put into the final summary. The advantage of the algorithm is that no additional variables are introduced to solve the problem, which reduces the complexity of the solution, so it performs well. The flow of the algorithm is shown in Fig. 5:

Fig. 5. Greedy algorithm flow.
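The greedy, threshold-based selection just described can be sketched as follows; the similarity function, the threshold t and the summary length limit are illustrative assumptions rather than the exact settings used in the paper.

```python
def greedy_select(sentences, scores, similarity, t=0.7, max_sentences=3):
    """Greedy redundancy-controlled selection: take sentences in decreasing
    score order and discard any sentence whose similarity to an already
    selected sentence reaches the threshold t."""
    ranked = sorted(zip(sentences, scores), key=lambda x: x[1], reverse=True)
    summary = []
    for sent, _ in ranked:
        if len(summary) >= max_sentences:
            break
        if any(similarity(sent, chosen) >= t for chosen in summary):
            continue
        summary.append(sent)
    return summary
```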

3 Experimental Analysis

3.1 Experimental Data


The experimental data set is LCSTS, a large Chinese short text summarization data set constructed by the Intelligent Computing Research Center of Harbin Institute of Technology from Sina Weibo. The data set contains three parts. The first part is a large collection of short text and summary pairs; the second part is more than 10,000 documents, each containing a short text and summary pair with one manual annotation; the third part is more than 1,000 documents, each containing a short text and summary pair with three manual annotations.

3.2 Evaluation Index


ROUGE is a commonly used set of indicators in the field of automatic text summarization. It is divided into ROUGE-1, ROUGE-2 and ROUGE-L according to the granularity of the units considered; 1, 2 and L refer to the length of the lexical units. Its mathematical form is as follows:

$$\mathrm{ROUGE\text{-}N} = \frac{\sum_{S \in \{\mathrm{ReferenceSummaries}\}} \sum_{gram_n \in S} Count_{match}(gram_n)}{\sum_{S \in \{\mathrm{ReferenceSummaries}\}} \sum_{gram_n \in S} Count(gram_n)} \qquad (8)$$

where ReferenceSummaries denotes the standard summaries from the data set and $gram_n$ denotes an n-gram of length n.
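For clarity, a ROUGE-N recall value in the sense of Eq. (8) can be computed as in the following sketch (an illustration, not the official ROUGE toolkit):

```python
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def rouge_n(candidate_tokens, reference_token_lists, n=1):
    """Overlapping n-grams between a candidate summary and the reference
    summaries, divided by the total number of reference n-grams (Eq. (8))."""
    cand_counts = Counter(ngrams(candidate_tokens, n))
    matched, total = 0, 0
    for ref in reference_token_lists:
        ref_counts = Counter(ngrams(ref, n))
        matched += sum(min(c, cand_counts[g]) for g, c in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total if total else 0.0
```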

3.3 Experimental Detail


In order to construct the required data, the LDA step uses the gensim module: the text is first segmented with the Jieba word segmentation tool, and a stop-word list is then built to remove stop words. The results are as follows (Table 2):

Table 2. Document theme.


Number Probability Theme Probability Theme
Theme 1 0.013 Government 0.009 Enterprise
Theme 2 0.013 Internet 0.010 Market
Theme 3 0.014 Fund 0.012 Company
Theme 4 0.018 Home 0.012 Centre
Theme 5 0.009 Society 0.008 Reform
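A minimal sketch of this preprocessing and topic-extraction step (Jieba segmentation, stop-word filtering, gensim LDA with 2 topics and 4 words per topic) is shown below; the placeholder documents and stop-word list are assumptions for illustration only.

```python
import jieba
from gensim import corpora, models

docs = ["这里是一段待摘要的示例文本。", "另一段示例文本用于演示主题抽取。"]   # placeholder corpus
stopwords = {"的", "是", "。", "，"}                                          # placeholder stop-word list

def tokenize(text):
    return [w for w in jieba.cut(text) if w.strip() and w not in stopwords]

texts = [tokenize(doc) for doc in docs]
dictionary = corpora.Dictionary(texts)
bow_corpus = [dictionary.doc2bow(t) for t in texts]

# Two topics with four words each, as chosen in this section.
lda = models.LdaModel(bow_corpus, id2word=dictionary, num_topics=2, passes=10)
for topic_id, words in lda.show_topics(num_topics=2, num_words=4, formatted=False):
    print(topic_id, [(w, round(p, 3)) for w, p in words])
```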

According to the characteristics of LCSTS dataset, all of them are short texts
extracted from Sina Weibo. So at the time of choosing the number of topics and the
number of words, we tested the Rouge scores from 1 to 8 respectively. The results are
as follows (Fig. 6):

Fig. 6. The number of topics.

After eliminating several extreme values, we find that the Rouge value increases slightly as the number of topic words increases, but remains unchanged or even decreases slightly beyond a certain number. This is due to the limited length of the short texts, so we choose suitable intermediate values: the number of topics is set to 2 and the number of topic words to 4.

In the FCNNM, we choose the BERT model [12] to provide the pre-trained word vectors. Traditional models such as word2vec cannot distinguish the same words or sentences in different contexts, whereas BERT uses the Transformer architecture, which has deeper layers and better parallelism. In the training process, the learning rate is set to 0.1. In order to prevent over-fitting, a dropout layer is added and the probability of keeping neurons active is set to 0.7; the loss converges after about 50 iterations. The results are as follows (Fig. 7):

Fig. 7. The training loss.

After training, the FCNNM produces a 768-dimensional vector. We then compare the vector of each sentence with this result, calculate similarity using the Manhattan distance, and select several sentences with high similarity as candidate sentences. Finally, the sentence scores calculated from the LDA model and TF-IDF are combined, and the final summary is selected by the greedy algorithm. The Rouge scores are shown in Fig. 8:

Fig. 8. Rouge score.



For comparison, we choose the classical LexRank, TextRank, Cluster, CNN and TF-IDF methods. From the chart we can see that our model performs better than plain CNN and TF-IDF; the main reason is that the traditional methods lack semantic analysis, whereas the sentences of a document are related to each other and our model takes this into consideration.

4 Summary

This paper proposed a document summarization method based on LDA that analyzes the topics of the document and combines them with heuristic features to parse the semantics of the document. Experiments show that the method is effective. Compared with traditional methods, our model complements them effectively at the semantic level and makes the summary content more accurate.

Acknowledgments. This research is supported by National Key Research and Development


Scheme of China under grant number 2017YFC1405403, and National Natural Science Foun-
dation of China under grant number 61075059, and Green Industry Technology Leading Project
(product development category) of Hubei University of Technology under grant number
CPYF2017008, and Philosophical and Social Sciences Research Project of Hubei Education
Department under Grant 19Q054.

References
1. Luhn, H.P.: The automatic creation of literature abstracts. IBM J. Res. Dev. 2(4), 159–165
(1958)
2. Edmundson, H.P.: New methods in automatic extracting. J. ACM (JACM) 16(2), 264–285
(1969)
3. Zhong, S., Liu, Y., Li, B., et al.: Query-oriented unsupervised multi-document summariza-
tion via deep learning model. Expert Syst. Appl. 42(21), 8146–8155 (2015)
4. Xiong, C., Li, X., Li, Y., et al.: Multi-documents summarization based on TextRank and its
application in online argumentation platform. Int. J. Data Warehous. Min. (IJDWM) 14(3),
69–89 (2018)
5. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3(Jan),
993–1022 (2003)
6. Liu, N., Tang, X.J., Lu, Y., et al.: Topic-sensitive multi-document summarization algorithm.
In: 2014 Sixth International Symposium on Parallel Architectures, Algorithms and
Programming, pp. 69–74. IEEE (2014)
7. Yang, C.Z., Fan, J.S., Liu, Y.F.: Multi-document summarization using probabilistic topic-
based network models. J. Inf. Sci. Eng. 32(6), 1613–1634 (2016)
8. Hofmann, T.: Probabilistic latent semantic analysis. In: Proceedings of the Fifteenth
Conference on Uncertainty in Artificial Intelligence, pp. 289–296. Morgan Kaufmann
Publishers Inc. (1999)
9. Hu, B., Chen, Q., Zhu, F.: LCSTS: a large scale Chinese short text summarization dataset
(2015). arXiv preprint arXiv:1506.05865
10. Momtazi, S.: Unsupervised latent Dirichlet allocation for supervised question classification.
Inf. Process. Manag. 54(3), 380–393 (2018)

11. Agarwal, B., Ramampiaro, H., Langseth, H., et al.: A deep network model for paraphrase
detection in short text messages. Inf. Process. Manag. 54(6), 922–937 (2018)
12. Devlin, J., Chang, M.W., Lee, K., et al.: BERT: pre-training of deep bidirectional
transformers for language understanding (2018). arXiv preprint arXiv:1810.04805
Investigating Performance and Cost
in Function-as-a-Service Platforms

Diogo Bortolini and Rafael R. Obelheiro(B)

Graduate Program in Applied Computing (PPGCA),


Santa Catarina State University (UDESC), Joinville, Brazil
{diogo.bortolini,rafael.obelheiro}@udesc.br

Abstract. The function-as-a-service (FaaS) service model has been


gaining adoption at a fast pace. In this service model, cloud applications
are structured as self-contained code modules called functions that are
instantiated on-demand, and billing is based on the number of function
invocations and on function execution time. Developers are attracted to
FaaS because it promises to remove two drawbacks of the traditional
IaaS and PaaS service models, the need to provision and manage infras-
tructure, and the need to pay for unused resources. In practice, however,
things are a little less rosy: developers still have to choose the amount of
memory allocated to functions, and costs are less predictable, especially
because they are tied to function performance. This work investigates
performance and cost variations within and across FaaS providers. Our
results show that performance and cost can be significantly affected by
the choice of memory allocation, FaaS provider, and programming lan-
guage: we observed differences of up to 8.5× in performance and 67× in
cost between providers (with the same language and memory size), and
16.8× in performance and 67.2× in cost between programming languages
(with the same provider and memory).

1 Introduction
A growing trend in cloud computing has been the adoption of the Function-as-a-Service (FaaS) service model, where applications are built as sets of functions that execute on instances (containers or virtual machines) activated on-demand
in response to events [12]. These instances are stateless and temporary, pos-
sibly remaining active during the execution of a single invocation, and fully
managed by the provider, who is responsible for ensuring their scalability and
availability. The customer is charged on a per-invocation basis, without hav-
ing to pay for idle resources. Therefore, the FaaS model may offer significant
cost savings when compared to the more traditional infrastructure/platform-as-
a-service (IaaS/PaaS) models, while at the same time freeing developers from
having to worry about infrastructure management [1]. Several case studies in the
literature have already shown that meaningful cost reductions can be achieved:
Adzic and Chatley [1] report savings of 66% in a PaaS→FaaS migration and 95%
in a IaaS→FaaS migration, Villamizar et al. [14] show that a FaaS deployment
© Springer Nature Switzerland AG 2020
L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 174–185, 2020.
https://doi.org/10.1007/978-3-030-33509-0_16

can reduce costs by 50–70%, and Horovitz et al. [5] pointed to reductions of over
50% in costs using FaaS.
FaaS billing is based on the number of function invocations and on the prod-
uct between resource allocation size and function execution time. As such, costs
in FaaS are more variable than the costs in IaaS/PaaS, where customers are
billed by provisioned capacity [2]. In addition, several studies [2,7,9] point out
that cost visibility is low, i.e., it is hard for a cloud customer to anticipate how
much she will pay for the services.
There are a few choices a developer can make that influence function perfor-
mance and cost. The most often cited is the amount of memory allocated to the
function, since some FaaS platforms, such as AWS Lambda and Google Cloud
Functions, allocate CPU proportional to memory size. In this case, allocating
more memory than necessary in order to run faster might make sense not only
in terms of performance, but also in terms of cost. Another factor is the choice
of programming language, which may have implications not only in code speed
but also in runtime overhead and platform maturity [8].
There is a body of work that explores the costs in the FaaS model. Some
references [1,2,9] disregard how cost may be affected by variations in function
performance – which are often outside the customer’s control but occur in prac-
tice [7,10,12,15] –, while others address both performance and cost [4,5,14].
None of these works explore how the choice of programming language influences
performance and cost.
Our work aims to bridge this gap, exploring more widely how function per-
formance varies in different cloud platforms and what is the cost impact for
customers. We performed experiments with three programming languages (Go,
Node.js, and Python) on three FaaS providers: AWS Lambda, Google Cloud
Functions (GCF), and IBM Cloud Functions. The results show that performance
and cost are heavily dependent on the specific choice of cloud provider, function
size (amount of allocated memory), and programming language.
The remainder of this paper is organized as follows. Section 2 discusses related
work. Section 3 describes the main features of FaaS platforms and their billing
models. Section 4 presents our experimental results. Finally, Sect. 5 concludes
the paper.

2 Related Work

Related work can be broadly divided into three groups of references. The first
group has a focus on FaaS performance, the second group focuses on FaaS costs,
and the third group addresses both performance and cost.
The first group features studies that explored various performance aspects of
FaaS platforms: how concurrency affects performance [8,11], cold start latency
[11,15], expiration behavior of instances on FaaS providers [11], overhead of pro-
gramming language runtimes [8], effect of memory size on performance [10,15],
and performance variability [7]. These references largely disregard cost issues,
including the relationship between performance and cost.

In the second group, Eivy [2] highlights that the FaaS billing model is hard
to understand, variations in function execution time may induce variations in
costs, and, since functions are stateless, they represent just a fraction of the cost
of an application, which will likely have additional costs (storage, networking,
and other services). Adzic and Chatley [1] present a limited cost comparison,
considering a single provider (AWS Lambda) and a function with constant exe-
cution time and only two memory sizes (128 and 512 MB). Leitner et al. [9]
also present a cost model for microservice-based applications implemented using
both IaaS/PaaS and FaaS. The model is difficult to parameterize, and considers
a fixed cost per request, disregarding performance variability. Lastly, Rogers [13]
introduces an analytical model to evaluate how cost is affected by the number of
invocations, memory allocation, and execution time, and he shows that different
providers have the lowest cost depending on the other parameters.
In the third group, Villamizar et al. [14] did an experimental study com-
paring performance and cost of the same application with different architec-
tures: monolithic, microservices, and function-based. Although the FaaS version
had the lowest cost, it used functions on AWS Lambda with two memory sizes
(320 MB and 512 MB). Horovitz et al. [5] introduced FaaStest, a solution that
uses machine learning to dynamically select the lowest-cost platform (FaaS or
IaaS) according to its function call behavior. When FaaS is the chosen platform,
a prediction algorithm is used to estimate when functions will be called next, in
order to prevent cold starts. Figiela et al. [4] evaluated performance and cost on
four FaaS providers (AWS, Azure, GCF, and IBM). They evaluated execution
time for both integer and floating-point benchmarks, file transfer times, instance
lifetime, and infrastructure heterogeneity. They also evaluated the cost of their
integer benchmark (a Node.js function that calls a random number generator
written in C) with varying memory allocations. AWS and IBM provided consis-
tent performance, and only the former allocates CPU in proportion to memory.
GCF and Azure had wide variations in function execution time, with the lat-
ter being generally slower. In terms of cost, AWS is independent of memory
allocation size, while on GCF and IBM smaller functions are cheaper.
Our work compares more directly to the studies that measure performance
and cost [4,14], particularly the latter. Villamizar et al. [14] compared the per-
formance and cost of a specific cloud application, but did not evaluate how dif-
ferent configurations affect performance and/or cost. Figiela et al. [4] evaluated
performance and cost for a benchmark on different configurations, but did not
consider how the choice of programming language influences those aspects. Only
Lee et al. [8] assessed the impact of programming languages on performance, but
they measured the overhead of language runtimes, not their influence on exe-
cution time and/or cost. Therefore, this is the first study that investigates how
the choice of provider, memory allocation, and programming language affects
performance and cost in the FaaS model.

3 Function-as-a-Service Platforms
For our experiments we have used three of the most popular FaaS platforms:
AWS Lambda1 , Google Cloud Functions2 , and IBM Cloud Functions3 . This
section presents the most relevant features of each platform, including their
billing models. Table 1 summarizes these characteristics.

Table 1. Features and billing models for FaaS platforms.

Memory: AWS: 64k MB (k = 2, 3, ..., 47); Google: 128k MB (k = 1, 2, 4, 8, 16); IBM: 32k MB (k = 4, 5, ..., 64)
CPU: AWS: proportional to memory; Google: proportional to memory; IBM: not informed
Network (bytes transferred/month): AWS: 1 GB; Google: 5 GB; IBM: not informed
Local storage: AWS: 512 MB; Google: 500 MB; IBM: not informed
Supported languages: AWS: Node.js (JavaScript), Python, Java, Ruby, C# (.NET Core), Go; Google: Node.js, Python, Go; IBM: JavaScript, Swift, Python, PHP, Java, Go, Ruby, .NET Core and others
Maximum function size: AWS: 50 MB (compressed) and 250 MB (uncompressed); Google: 100 MB (compressed) and 500 MB (uncompressed); IBM: 48 MB
Maximum execution time: AWS: 15 min; Google: 9 min; IBM: 10 min
Billing granularity: AWS: 100 ms; Google: 100 ms; IBM: 100 ms
Monthly cost (USD): AWS: $0.20/1M executions, $0.0000166667/GB-s; Google: $0.40/1M executions, $0.0000025/GB-s, $0.0000100/GHz-s; IBM: $0.000017/GB-s
Free tier (per month): AWS: 400,000 GB-s, 1 M executions; Google: 400,000 GB-s, 2 M executions; IBM: 400,000 GB-s
A customer of a FaaS provider develops functions as code modules that imple-


ment a given application logic. These functions are packaged and uploaded to
a cloud storage service. When a function is invoked, its code is loaded and exe-
cuted on one or more instances (typically containers), according to the volume
1 https://aws.amazon.com/lambda/.
2 https://cloud.google.com/functions/.
3 https://www.ibm.com/cloud/functions.

of requests. After the function ends, its instance becomes idle. To avoid latency
in activating new instances, idle instances can be reused to service new requests,
but this is not guaranteed. Each instance has some space on the local disk for
temporary storage that is wiped when the instance is deactivated; for persistent
storage, the most common option is to use one of the various services offered by
the provider, which are billed separately. Each function has a maximum execu-
tion time ranging from 9 min (Google) to 15 min (AWS).
In all three platforms, the customer has to specify the amount of mem-
ory allocated for an instance, ranging from 128 MB to 2048 MB (Google, IBM)
or 3008 MB (AWS). The granularity of allocation is also variable, as shown
in Table 1. CPU allocation is proportional to memory allocation for AWS and
Google; there is no information on CPU allocation for IBM.
In general, FaaS billing has two components. The first is the number of
function invocations, considering all functions belonging to the same user. This
metric is easy to understand, even if it is dependent on application architecture:
an application with a large number of fine-grained functions induces more invo-
cations than an application with a smaller number of coarse-grained functions
[2]. AWS and Google use this billing component, while IBM does not.
The second billing component is resource consumption. This consumption is
measured in GB-s, which is the product of the allocated memory (in GB) and
the execution time (in seconds). For instance, a function with 512 MB of mem-
ory that executes for 200 ms consumes 0.5 GB × 0.2 s = 0.1 GB-s. An important
issue is that memory allocation and execution time may have different granular-
ities depending on the provider. For instance, a function that needs 140 MB of
memory will require 160 MB in IBM, 192 MB in AWS, but 256 MB in Google.
Execution time is rounded up to a multiple of 100 ms; therefore, a function that
executes in 99 ms will be billed for 100 ms, and a function that runs in 101 ms will
be billed for 200 ms. In Google, resource consumption has an additional compo-
nent, provisioned CPU, which is measured in GHz-s and given by the product
of the execution time and the allocation of CPU cycles, as shown in Table 2.

Table 2. CPU provisioning in Google Cloud Functions

Memory (MB) 128 256 512 1024 2048


CPU (GHz) 0.2 0.4 0.8 1.4 2.4
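The billing arithmetic described above can be sketched as follows; the allocation granularities and per-GB-s rates are taken from Table 1, while free tiers and the GCF GHz-s component are omitted for brevity.

```python
import math

GRANULARITY_MB = {"aws": 64, "ibm": 32}                    # GCF uses fixed sizes instead
GCF_SIZES = (128, 256, 512, 1024, 2048)
GBS_RATE = {"aws": 0.0000166667, "gcf": 0.0000025, "ibm": 0.000017}

def billed_gbs(memory_mb, exec_ms, provider):
    """GB-s consumed by one invocation after rounding memory and time."""
    if provider == "gcf":
        mem = next(m for m in GCF_SIZES if m >= memory_mb)
    else:
        step = GRANULARITY_MB[provider]
        mem = step * math.ceil(memory_mb / step)
    time_s = math.ceil(exec_ms / 100) * 0.1                 # round up to 100 ms
    return (mem / 1024) * time_s

print(billed_gbs(512, 200, "aws"))                          # 0.1 GB-s, as in the example above
print(billed_gbs(140, 99, "ibm") * GBS_RATE["ibm"])         # cost of one 140 MB, 99 ms invocation
```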

The three providers offer free usage tiers of 400,000 GB-s each month. AWS
and GCF also include a number of free executions (1 and 2 M, respectively) per
month.
An analysis of the billing models of the three FaaS providers underscores
the relevance of our work. Customers have difficulty in understanding and plan-
ning costs due to the multiple cost components (volume of requests, resource
consumption), free tiers, and differences among providers. In addition, the costs

associated with resource consumption are directly influenced by performance variations and billing granularity.

4 Results

4.1 Description of Experiments

As discussed in Sect. 3, an application in the FaaS service model combines func-


tions provided by the customer with services offered by the provider (such as
storage and messaging). Thus, application performance has two major compo-
nents, function performance and service performance, and in this work we focus
on the first. Most cloud services are not FaaS-specific, and so their performance
and cost can be assessed separately to get the full picture of application perfor-
mance/cost.
To measure function performance, we need one or more functions that do
not rely on external services and do not use the network. Given the lack of FaaS
benchmarks, we developed a simple function that returns the n-th prime number
using an unoptimized algorithm. The goal was to have a CPU-bound function
with a small memory footprint that could be easily ported to any programming
language. To minimize timing inaccuracies and to avoid large cost fluctuations
due to the rounding of execution times, n was tuned to get an execution time on
the order of a few seconds in the best case. The results presented in this section
are for n = 80, 000.
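The benchmark source is not included in the paper; a minimal Python sketch of this kind of unoptimized n-th prime function (trial division), with a generic FaaS-style entry point, could look as follows.

```python
def nth_prime(n):
    """Return the n-th prime using naive trial division (deliberately unoptimized)."""
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        if all(candidate % d != 0 for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return candidate

def handler(event=None, context=None):
    # Generic entry point; the actual handler signature differs per provider.
    return {"n": 80000, "prime": nth_prime(80000)}
```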
We evaluated the performance and cost impact of three factors: FaaS
provider, allocated memory, and programming language. Memory sizes were con-
strained by Google, which has the fewest available options: 128, 256, 512, 1024,
and 2048 MB. As for programming language, we implemented the algorithm in
languages supported by all three providers: Go (version 1.11), Python (version
3.7), and Node.js (version 8 on GCF and version 10 on AWS and IBM). With
five memory sizes and three programming languages, there are 15 configurations
for each of the three providers, or 45 configurations overall.
We performed our experiments between April and May 2019. The provider
locations we used were US East (N. Virginia) (AWS), us-central-1 (GCF),
and Dallas (CF Based) (IBM). Function invocations were triggered by a scheduler, with executions at 15 min intervals. Since execution time is measured by the function itself, it does not include the time needed to activate an instance; in other words, our execution time measurements are not affected by either cold or warm starts.

4.2 Performance Comparison

4.2.1 Performance Variation for Fixed Configurations


We first analyzed how function execution time varied for fixed configurations, i.e.,
for the same provider, memory allocation, and language. Figure 1(a) shows the
coefficient of variation (CV, the standard deviation divided by the mean) for the

execution time, with each point representing the CV for a given configuration.
AWS had 8 configurations with CVs between 1% and 4%, and 7 configurations
between 21% and 44%. Google had CVs well distributed between 8% and 57%.
Most of the configurations for IBM (14 out of 15) had a CV below or equal
to 28%, but there was one configuration (Node.js, 128 MB) with a CV of 125%.
Overall, there was significant performance variation in all providers: for one-third
of the configurations (15 out of 45), function execution time had a coefficient of
variation above 20%.

Fig. 1. Coefficient of variation of function execution time for fixed configurations: (a) CV for all configurations, by provider; (b) CV broken down by configuration (provider, memory size, and language).

Figure 1(b) breaks down the CV of execution time by configuration. On AWS,


variability was primarily affected by programming language, with Python having
the lowest CV and Node.js the highest one, while memory only had an influ-
ence with Python. GCF had no discernible trend apart from an inverse rela-
tion between memory size and variability with Python. On IBM, variability was
higher for Node.js than for the other languages, including the aforementioned
maximum of 125% with Node.js and 128 MB.
To illustrate the distribution of execution times, Fig. 2 shows the histograms
of configurations with 128 MB and Node.js, using 100 ms bins. These configu-
rations exhibited high variability on IBM and AWS (CVs of 125% and 44%,
respectively), and low variability on GCF (a CV of 9%). All providers had times
clustered around a single mode, but the distributions are skewed: IBM and AWS
had long tails to the right, while GCF had a tail to the left (a single observation
out of 96, actually). Given that the histograms of other configurations show simi-
lar distributions, we can say that, for a given configuration, while most execution
times remain within a definite range, there is a small probability that they will
vary widely.

4.2.2 Performance Variation Across Configurations


We next analyzed how execution time varied according to provider, allocated
memory, and programming language; the results are shown in Fig. 3. IBM had

Fig. 2. Histograms of execution time (ms) for the configurations with 128 MB and Node.js, one panel per provider (AWS, GCF, IBM).

better performance than AWS and GCF for all languages with 128 MB and
256 MB memory sizes, and for Go and Python with 512 MB. Performance was
similar for all languages and providers with 1024 MB, and AWS had better per-
formance for 2048 MB. The results show that IBM provides the same CPU power
irrespective of memory size, while AWS and GCF allocate CPU in proportion to
memory size (as expected). Thus, in terms of performance alone, IBM provides
the best choice for memory sizes up to 512 MB, while AWS is the best for larger
memory allocations.
Figure 3 also shows noticeable performance differences between programming
languages. Figure 4 provides a better visualization for this, depicting the ratios
of execution time of Go/Python to Node.js for all configurations. For comparison
purposes, we also performed measurements on a test machine with an Intel Core
i5 6600 processor at 3.3 GHz and 8 GB RAM, running Ubuntu 18.04.2 LTS; on
this machine we measured ratios of 1.1 (Go) and 6.3 (Python) to Node.js. The
results show that:
1. The ratios were different for each provider and memory size;
2. Node.js outperformed Go and Python on 14 out of 15 combinations of provider
and memory size (the exception was IBM with 128 MB, where Go was faster).
On the FaaS platforms, the average ratios of execution time of Go/Python
to Node.js were 1.9 and 12.7, respectively;
3. On average, IBM had the lowest ratios (1.4 and 10.4), while GCF had the
highest ones (2.5 and 14.8);
4. The differences between languages are larger on the FaaS platforms than on
bare metal (with the exception of IBM with 128 MB, where the Go version is
faster than the Node.js version).
As a whole, these results reveal that (i) it is unreasonable to expect that relative
performance differences observed in a test environment will remain the same
when the code is executed on an FaaS platform, and (ii) a judicious choice of
programming language may produce significant savings on provider costs.

The biggest difference between providers with the same language and mem-
ory size was 8.5×, for IBM/AWS with Go and 128 MB. The maximum difference
between programming languages on the same provider and with the same mem-
ory size was 16.8×, for Go/Python on Google Cloud Functions with 512 MB.

Fig. 3. Function execution time (ms) versus allocated memory (MB) for each provider (AWS, GCF, IBM): (a) Go, (b) Node.js, (c) Python. Notice the different scales on the y-axis.

Fig. 4. Ratio of average execution times of Go and Python to Node.js versus allocated memory (MB), for AWS, GCF, IBM, and bare metal: (a) Go to Node.js, (b) Python to Node.js.

Figures 3 and 4 show that performance is heavily dependent on the specific combination of provider, programming language, and memory allocation chosen
by the customer. To statistically analyze the contributions from each of these
aspects, we ran a Factorial ANOVA [3, Ch. 12] on our data and calculated the
allocation of variation. All the main factors (provider, language, and memory)
and their interactions are statistically significant (p < 0.001). The main factors
account for 52.8% of the observed variation in execution time, while the interactions account for 40.3% of the variation (the remaining 6.9% are residuals). The high percentage of variation due to interactions provides strong statistical evidence that the factors have to be considered together, not separately.
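As a rough illustration of this kind of analysis, the sketch below fits a three-factor factorial model and derives an allocation of variation with statsmodels. It is not the authors' analysis (they follow [3, Ch. 12]); the data frame is synthetic and only assumed to mirror the factor levels of the experiment.

```python
import itertools
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(0)
rows = []
# Hypothetical full factorial design with a few replicates per cell.
for prov, lang, mem in itertools.product(
        ["AWS", "GCF", "IBM"], ["go", "nodejs", "python"],
        [128, 256, 512, 1024, 2048]):
    for _ in range(3):  # replicates per configuration
        rows.append({"provider": prov, "language": lang, "memory": mem,
                     "time_ms": rng.normal(1000, 100)})
df = pd.DataFrame(rows)

# Full factorial model: main effects plus all interactions.
model = ols("time_ms ~ C(provider) * C(language) * C(memory)", data=df).fit()
table = anova_lm(model, typ=2)

# Allocation of variation: each term's sum of squares as a share of the total
# (main effects, interaction terms, and residuals sum to 100%).
table["variation_pct"] = 100 * table["sum_sq"] / table["sum_sq"].sum()
print(table[["sum_sq", "PR(>F)", "variation_pct"]].round(3))
```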

4.3 Cost Comparison


After analyzing performance, we investigated how cost varied according to
provider, memory allocation, and programming language. We considered the

monthly cost for a fixed request rate λ, expressed in requests per second (req/s).
This fixed rate can be interpreted as the average request rate over a month,
not necessarily as a constant rate. The request rate was multiplied by the aver-
age number of seconds in a month to obtain NRM, the number of requests per
month. The cost due to the number of function invocations per month was given
by NRM multiplied by the cost per invocation, while the cost due to resource
consumption was calculated using NRM, the average execution time (rounded
up to a multiple of 100 ms), and the memory allocation for each configuration; in
both cases, applicable free tiers were taken into consideration. Figure 5 depicts
the average monthly cost for a low request rate (λ = 1 req/s).
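The cost computation described above can be summarized by a short sketch. The prices and free-tier values below are placeholders, not any provider's actual tariff; the function only mirrors the methodology stated in the text (requests per month times the per-invocation price, plus GB-seconds times the resource price, with execution time rounded up to a multiple of 100 ms and free tiers subtracted).

```python
import math

SECONDS_PER_MONTH = 30.44 * 24 * 3600  # average number of seconds in a month

def monthly_cost(req_per_s, avg_exec_ms, memory_mb,
                 price_per_invocation, price_per_gb_s,
                 free_invocations=0, free_gb_s=0.0):
    """Monthly cost for a fixed average request rate (placeholder prices)."""
    nrm = req_per_s * SECONDS_PER_MONTH                 # requests per month
    billed_s = math.ceil(avg_exec_ms / 100.0) * 0.1     # round up to 100 ms
    gb_s = nrm * billed_s * (memory_mb / 1024.0)        # GB-seconds consumed
    invocation_cost = max(nrm - free_invocations, 0) * price_per_invocation
    resource_cost = max(gb_s - free_gb_s, 0) * price_per_gb_s
    return invocation_cost + resource_cost

# Example with made-up prices: lambda = 1 req/s, 450 ms average, 256 MB.
print(monthly_cost(1, 450, 256,
                   price_per_invocation=2e-7, price_per_gb_s=1.7e-5,
                   free_invocations=1_000_000, free_gb_s=400_000))
```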
With respect to the FaaS providers, the graphs show that:
1. On AWS, costs remain approximately constant, irrespective of memory size, since execution time is inversely proportional to allocated memory. This confirms that computing power is proportional to allocated memory (cf. Sect. 4.2.2).
2. On IBM, costs are proportional to memory size.
3. On GCF, costs are lower for 256 and 512 MB than for 128 MB (at least for
Go and Python), and increase for memory allocations larger than 512 MB. A
possible explanation is that, according to Table 2, increases in memory allo-
cation beyond 512 MB correspond to smaller increases in CPU provisioning
(e.g., going from 512 MB to 1024 MB gives double the memory but only 75%
more CPU power).
4. IBM has the lowest cost up to 512 MB, AWS and GCF share the lowest cost
for 1024 MB, and AWS has the lowest cost for 2048 MB.
The variations in cost are significant. The biggest difference in cost between FaaS
providers with the same language and memory size was 67.0× (USD 54.13), for
IBM/AWS with Go and 128 MB.
With regard to programming languages, Fig. 5 shows that Node.js has the
lowest cost, except on IBM with 128 MB, where Go is cheaper. The graphs also
show that the difference in cost between Go and Node.js (1.8× on average, with
a maximum of 2.8×) is much smaller than the difference between Python and
Node.js (11.8× on average, with a maximum of 17×). The maximum difference
between programming languages on the same provider and with the same mem-
ory size was 67.2× (USD 54.25), for Go/Python on IBM with 512 MB.
We next combined performance and cost data into a single cost/performance
metric. A cost-conscious FaaS customer will strive to minimize the
cost/performance ratio within the set of feasible configurations for a given func-
tion. Since the ratio is expressed in dollars per throughput [6, Ch. 3], we divided
the number of primes (80, 000) by the average execution time to obtain the
throughput of each configuration in thousand primes per second (kpps). The
cost/performance curves, shown in Fig. 6, allow us to conclude that: (i) the
FaaS platforms with the best cost-efficiency are IBM up to 512 MB (all lan-
guages), GCF for 1024 MB (Node.js and Python), and AWS for 1024 MB (Go)
and 2048 MB (all languages); (ii) GCF is less competitive for Go than for Node.js
or Python; and (iii) Node.js is the most cost-efficient programming language,
while Python is the least cost-efficient language.

Fig. 5. Average monthly cost (USD) versus allocated memory (MB) for each provider (AWS, GCF, IBM), at λ = 1 req/s: (a) Go, (b) Node.js, (c) Python. Notice the different scales on the y-axis.

The best (lowest) cost/performance ratio was USD 0.01/kpps (IBM/Go/128 MB), while the worst (highest) ratio was USD 372.71/kpps (AWS/Python/
128 MB). Such a large difference is striking evidence of how much performance
and cost vary across FaaS platforms.

Fig. 6. Cost/performance ratios (USD/kpps) versus allocated memory (MB) for each provider (AWS, GCF, IBM): (a) Go, (b) Node.js, (c) Python. Notice the different scales on the y-axis.

5 Conclusion
In this article we presented an experimental performance and cost evaluation of
three function-as-a-service (FaaS) platforms: AWS Lambda, Google Cloud Func-
tions, and IBM Cloud Functions. We have shown that these platforms alleviate
the burden of managing infrastructure but suffer from low cost visibility, since
function execution times vary within and across configurations, which induces
cost fluctuations. The choice of programming language has been identified as a
key factor, with differences in execution time superior to 15×. Memory alloca-
tion size may affect both performance and cost, depending on the platform. The
wide variation across configurations revealed by our cost/performance analysis
lead us to conclude that FaaS customers should use our experimental method-
ology and results as guidance, and evaluate the performance and cost of their
own functions before embarking on large-scale FaaS deployments. In future work
we intend to analyze other factors that influence FaaS costs, such as network
traffic and cloud storage, and perform experiments over an extended period of
time, which would enable us to analyze platform trends and to characterize the
distribution of function execution times.

Acknowledgments. This research was supported by FAPESC and UDESC.

References
1. Adzic, G., Chatley, R.: Serverless computing: Economic and architectural impact.
In: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engi-
neering, ESEC/FSE 2017, pp. 884–889. ACM, New York (2017)
2. Eivy, A.: Be wary of the economics of “serverless” cloud computing. IEEE Cloud
Comput. 4(2), 6–12 (2017)
3. Field, A., Miles, J., Field, Z.: Discovering Statistics Using R. SAGE Publications,
Thousand Oaks (2012)
4. Figiela, K., Gajek, A., Zima, A., Obrok, B., Malawski, M.: Performance evaluation
of heterogeneous cloud functions. Concurr. Comput. Pract. Exp. 30(23), e4792
(2018)
5. Horovitz, S., Amos, R., Baruch, O., Cohen, T., Oyar, T., Deri, A.: FaaStest –
machine learning based cost and performance FaaS optimization. In: 15th GECON,
Pisa, Italy, pp. 171–186. Springer, Heidelberg, September 2018
6. Jain, R.: The Art of Computer Systems Performance Analysis. Wiley, Hoboken
(1991)
7. Jonas, E., Pu, Q., Venkataraman, S., Stoica, I., Recht, B.: Occupy the cloud:
distributed computing for the 99%. In: SoCC 2017, pp. 445–451. ACM, New York
(2017)
8. Lee, H., Satyam, K., Fox, G.C.: Evaluation of production serverless computing
environments. In: IEEE 11th International Conference on Cloud Computing, pp.
442–450 (2018)
9. Leitner, P., Cito, J., Stöckli, E.: Modelling and managing deployment costs of
microservice-based cloud applications. In: IEEE/ACM 9th UCC, pp. 165–174,
December 2016
10. Lloyd, W., Ramesh, S., Chinthalapati, S., Ly, L., Pallickara, S.: Serverless com-
puting: an investigation of factors influencing microservice performance. In: 2018
IEEE International Conference on Cloud Engineering (IC2E), pp. 159–169. IEEE
(2018)
11. McGrath, G., Brenner, P.R.: Serverless computing: design, implementation, and
performance. In: 2017 IEEE 37th International Conference on Distributed Com-
puting Systems Workshops (ICDCSW) (2017)
12. Roberts, M.: Serverless architectures (2018). https://martinfowler.com/articles/
serverless.html
13. Rogers, O.: Economics of serverless cloud computing. Technical report, 451
Research, June 2017. https://bit.ly/2nB4m6t
14. Villamizar, M., et al.: Cost comparison of running web applications in the cloud
using monolithic, microservice, and AWS Lambda architectures. Serv. Oriented
Comput. Appl. 11(2), 233–247 (2017)
15. Wang, L., Li, M., Zhang, Y., Ristenpart, T., Swift, M.: Peeking behind the curtains
of serverless platforms. In: 2018 USENIX Annual Technical Conference (USENIX
ATC 18), USENIX Association, Boston, MA, pp. 133–146 (2018)
Optimal Bandwidth and Delay of Video
Streaming Traffic in Cloud Data Center
Server Racks

Nader F. Mir(&), Vincy Singh, Akshay Paranjpe, Abhilash Naredla,


Jahnavi Tejomurtula, Abhash Malviya, and Ashmita Chakraborty

Department of Electrical Engineering, San Jose State University, San Jose, CA 95195, USA
nader.mir@sjsu.edu

Abstract. The task of streaming video files or sources in data centers is


challenging since data streaming must be implemented at a constant rate without
any interruption. The task of streaming requires proper storage space, network
bandwidth and efficient algorithms that tackle the changing environmental
conditions. In this paper, we conduct research on the optimality of incurring
delay and consuming bandwidth when video streaming is established in data
center network (DCN) architectures. We first observe the behavior of such traffic
patterns in eight server rack and sixteen server rack architectures and compare
the simulation results. We then expand our study to observe the behavior of
a higher-complexity data center network. In our study, we attempted to include
realistic situations such as appropriate Proxy servers, adaptive bit rate streaming
(ABS) feature, and virtual private network (VPN) security feature. We also
conduct a study on high-resolution video and multicast traffic in such
environments.

1 Introduction

Clouds’ data center networks (DCNs) consist of layer 2 and 3 switches and routers connected to form a certain topology among the centers’ server racks. The servers of each rack are directly interconnected using one tier of top-of-rack (ToR) or edge switches that are layer 2 type switches [1, 2]. Edge technologies offer a viable solution to multiple challenges faced by interactive media and video streaming today. Figure 1 shows a snapshot of a distributed use of a certain type of cloud-based data centers. In this figure, there are three cloud centers 1, 2, and 3 deployed in three different locations. Mobile host 1 tries to use a certain service such as video streaming. The video to be streamed originally resides in cloud center 2; upon arrival of the inquiry from host 1 at this center, the inquiry is forwarded to cloud center 3, a location closer to host 1.

© Springer Nature Switzerland AG 2020


L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 186–196, 2020.
https://doi.org/10.1007/978-3-030-33509-0_17

Fig. 1. An overview of cloud computing and distributed cloud data centers: Host 1 connects through the Internet backbone network to Cloud Centers 1, 2, and 3.

A key factor for the success of cloud computing is the inclusion of virtualization in
its structure. Cloud computing is indeed a method of providing computing services
from a large, highly “virtualized” data center to many independent users utilizing
shared applications. In data centers, the services are provided by “physical” servers and
“virtual” servers. A virtual server is a machine that emulates a server through software
running on one or more real servers. Such virtual servers do not physically exist and
can therefore be moved around and scaled up or down without affecting the end user.
Cloud computing, on the other hand, is equivalent to “distributed computing over a
network” that runs programs or applications on numerous connected host servers at the
same time. A host server in a data center is typically called a blade. A blade includes a
CPU, memory, and disk storage. Blades are normally stacked in racks, with each rack
having an average of 30 blades. These racks of host blades are also known as server
racks.
Media streaming in data centers, with traffic passing through DCNs, remains a challenge for researchers [3]. Streaming of media is a technique for transferring data such that it can be processed as a steady and continuous stream. Streaming video requires compression and buffering techniques that allow one to transmit and view video in real time through the Internet. In other words, streaming media is the simul-
taneous transfer of digital media files such as video, audio, and data through a server
application that can be displayed in real-time by client applications. To play a media
file smoothly, video data needs to be available continuously on a server and in the
proper sequence without interruption [4]. Streaming provides a steady method of
delivery controlled by interaction between computer and the server. Audio and video
streamed content will be delivered from the cloud through centralized or regional data
centers across optimized terrestrial or wireless technologies.
In the rest of the paper, we first introduce a testbed for end-to-end system analysis,
and then focus on obtaining the system performance on delay and bandwidth provi-
sioning. At the end of this paper we present results and conclusion.

2 Testbed Systems for End-to-End Analysis

We start with obtaining a clear picture of streaming in cloud data centers by presenting
the testbed systems used in this paper. Figure 2 shows an abstract model of a DCN supporting 8 server racks that acts as the communication component of the data servers. There are 8
server racks in the testbed module. Typical DCNs are currently based on routers and
layer 2 and 3 switches connected in a certain topology. The servers of each rack are
directly interconnected using one tier of top-of-rack (ToR) or edge switches that are
layer 2 type switches. Each host (server) in a server rack has a network interface card
(NIC) that connects it to its ToR switch while each host is also assigned its own data-
center internal IP address. Each top-of-rack (ToR) layer 2 switch in a data center uses a
NIC for its connection.

Fig. 2. DCN architecture with 8 server racks

The destination is typically a host that attempts to receive the cloud’s service. The service typically passes through ToR, aggregate and core switches and is ultimately impacted by the gateway router. We studied the simulation graph for the 8-server-rack topology shown in Fig. 2.
Video traffic in this paper is set up using the adaptive bit rate streaming (ABS).
ABS is a management technique for streaming live or pre-recorded media over com-
puter networks. ABS can be included in most real-time media transmission protocols.
The most practical deployment of ABS is the one built for transmission over HTTP in
networks. Through the adaptive bit rate streaming, viewing experience for a diverse
range of devices over a broad set of connection speeds can be made possible. The

complete topology of Fig. 1 was built in the simulation testbed, and ABS video streaming from a server inside the data center to a destination outside it was simulated.
The basic testbed system also includes appropriate proxy servers for authentication purposes (not shown in the figure). Proxy servers may introduce some additional processing delay to forwarded packets (circuit-level filtering) depending on the application that the datagrams belong to.
Another feature incorporated into the testbed system is virtual private network
technique (VPN) that is done for security reasons. A VPN is an example of providing a
controlled connectivity over a public network such as the Internet. VPN utilizes a
concept called an IP tunnel, a virtual point-to-point link between a pair of nodes that are separated by an arbitrary number of networks. The simulation testbed allows comparing security configurations with VPN, with firewall and without firewall at the same time by
using different scenarios. VPN is then another source of packet delay.
The parameters defining the above features were set in the Application config and Profile config of OPNET, which helped create the application on the servers and the profile on the destination node. The results of simulation for a system with 8 server racks are shown in Fig. 3. The figure shows that the end-to-end delay starts at 0.0254 s and tends to decrease over the next 500 s. This decreasing delay is due to the time required for the streaming connection to stabilize while the adaptive bit rate streaming (ABS) settles.

Fig. 3. End-to-end delay in 8-server rack architecture using high-resolution traffic

The data center network is then expanded from the 8 server racks of the previous topology to 16 server racks, as shown in Fig. 4. Also, the effect of changing the video quality on video streaming over the network was significant.

Fig. 4. DCN architecture with 16 server racks

As we increased the number of server racks from 8 to 16, we observe in Fig. 5 that the end-to-end delay worsens: with two server racks per ToR switch, client requests cannot be resolved and responses returned as quickly as in the 8-server-rack topology.

Fig. 5. End-to-end delay in 16-server rack architecture using high-resolution traffic



We also implemented multicasting using the IGMP protocol. A multicast set-up can be enabled on clients who wish to receive traffic via it. With the multicast technique, we can send a single copy of data to all the clients in the multicasting group. For this, we also implemented a firewall and a QoS policy. A firewall in any architecture serves the important purpose of protecting the client from unwanted traffic. QoS implementation helps give priority to critical traffic. Firewalls are multi-homed servers with routing capabilities that are aimed at protecting the local networks against unauthorized accesses. Firewalls contain proxy servers which determine the firewall’s security policies for the corresponding applications. If a firewall does not have the proxy server of a certain application, then this application is not allowed through the firewall [6].
IP multicast implies that one sender sends data to multiple recipients. Much like
broadcast, there are special addresses designated for multicast data. The difference is
that some of these can be routed and used on the Internet. The multicast space reserved
is 224.0.0.0/4. Multicasting is more efficient than broadcast, because broadcast packets
must be received by everyone on the local link. Figure 6 shows the result of simulation for multicast traffic in an 8-server-rack system. We can conclude from this figure that implementing multicast in the data center network topology improves the delay of video streaming data.
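As a side illustration of the IGMP-based group membership discussed above (independent of the OPNET setup used in this paper), the following minimal sketch shows a receiver joining a multicast group in the 224.0.0.0/4 range; the group address and port are arbitrary examples.

```python
import socket
import struct

GROUP = "239.1.1.1"   # hypothetical group inside the 224.0.0.0/4 multicast range
PORT = 5004

# Receiver: joining the group causes the host's IP stack to send an IGMP
# membership report, so multicast traffic for the group is delivered to it.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", PORT))
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

data, sender = sock.recvfrom(2048)   # blocks until one datagram arrives
print(f"received {len(data)} bytes from {sender}")
```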

Fig. 6. IP multicast traffic received at destination node with high resolution video quality

3 Study on Larger Systems

For more redundancy in a data center network, increasing the number of switches that
handle operations scales the data center. When a data center is scaled, the major change
is contributed by server virtualization, which helps consolidating the load onto the newly
added, more capable switches. This method is cost and performance efficient. However,

there are major challenges in scaling the data center networks. One such challenge is resource allocation. Even when many servers are configured in the system, if the resources are not allocated properly, the performance is at stake. Allocating resources to each machine is as important as scaling the network. Failure of resource allocation leads to workload imbalance, loss of resources, a server unable to host the needed number of virtual machines, and can even cause the entire server to crash. Any scaled system becomes efficient only when the multiple dependent factors are satisfied.
Implementation of Fat-Tree Architectures
The Fat-tree topology is built using the following values. The architecture was built for k = 4, 5, 6, 7, i.e., 16, 35, 64 and 98 hosts in the network, as shown in Fig. 7. For all the above topologies the architecture was designed by scaling the layer of aggregate switches horizontally and vertically. The connections between any two layers of switches are made as one-to-many. In horizontal scaling there is only one layer of aggregate switches, while in vertical scaling there are 2 layers of aggregate switches (for 16, 35 and 64 hosts) and 3 layers of aggregate switches (for 98 hosts). Each aggregate switch layer in the vertical scaling is connected to the aggregate switch layers above and below it in the same pod. The topology diagram can be seen below.

Fig. 7. Fat-tree vertically scaled architectures for 16, 35, 64, and 98 hosts.

Implementation of Multi-tiered Architectures


The first item we have taken into consideration is that, to compare the Multi-Tiered and Fat-Tree architectures, the over-subscription ratio for both architectures should be 1:1; Fat-Tree, with its k-pod structure, makes 1:1 inherent. We design the multi-tiered architecture to have a 1:1 oversubscription ratio. For the ratio to be 1:1, the bandwidth over the links through the three levels of basic aggregate switches must be in the ratio e:2e, i.e., the uplink-to-downlink bandwidth provision must be in the ratio 1:2. Unlike the Fat-Tree architecture, the Multi-tiered architecture has a flexible algorithmic structure. In this discussion, k is the number of servers per rack, so that the number of Top-of-Rack switches = k/2, the number of Aggregate switches = k/4, and the number of Core switches = k/8. The edge layer consists of the ToR switches and can be scaled horizontally by increasing the number of switches which are directly connected to the hosts. Various sizes of such a network are shown in Fig. 8.

Fig. 8. Multi-tiered vertically scaled architectures for 16, 32, 64, and 98 hosts.

Iperf Test
Iperf (Internet performance test) is a tool which we used to test the limits of the network. It is used to measure the maximum bandwidth achievable on IP networks. With the iperf tool we have the ability to choose parameters such as protocol, timing and buffer sizes. When networks are tested with iperf, the results include parameters such as bandwidth and loss.
Ping Test
Ping is a function to test network performance. It is generally used to analyze parameters such as the total number of packets sent and received, the percentage of packets lost over different paths, and the minimum, maximum, and average response times, generally reported in milliseconds. A ping test helps check the connectivity as well as the response time for each packet. Ping can be done between two hosts or between all the nodes in the network, as unicast or broadcast.
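A minimal Mininet-based sketch of these two measurements is given below. It assumes Mininet is installed and run with root privileges, and it uses a toy two-rack topology rather than the exact fat-tree or multi-tiered architectures built in this paper; net.pingAll() corresponds to the ping test and net.iperf() to the iperf test.

```python
#!/usr/bin/env python
"""Toy two-rack topology with one core switch, two ToR switches and four hosts."""
from mininet.net import Mininet
from mininet.topo import Topo
from mininet.log import setLogLevel


class TwoRackTopo(Topo):
    def build(self):
        core = self.addSwitch("s1")                # stand-in core/aggregate switch
        for rack in (1, 2):
            tor = self.addSwitch(f"s1{rack}")      # top-of-rack switch
            self.addLink(tor, core)
            for h in (1, 2):
                host = self.addHost(f"h{rack}{h}")
                self.addLink(host, tor)


if __name__ == "__main__":
    setLogLevel("info")
    net = Mininet(topo=TwoRackTopo())
    net.start()
    net.pingAll()                                  # ping test: loss and RTT
    h11, h12, h21 = net.get("h11", "h12", "h21")
    intra = net.iperf((h11, h12))                  # intra-rack bandwidth
    inter = net.iperf((h11, h21))                  # inter-rack bandwidth
    print("intra-rack:", intra, "inter-rack:", inter)
    net.stop()
```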
Analysis of Layered and Non-layered Fat-Tree Architectures
It was observed from Figs. 9 and 10 that the average bandwidth and average delay of non-layered architectures are greater than those of layered architectures for both inter-rack and intra-rack communication. The only exception was the 35-host layered fat-tree architecture, in which the intra-rack average bandwidth was less than the inter-rack average bandwidth and the intra-rack average delay was higher than the inter-rack average delay. This indicates that the 35-host layered architecture is correctly tailored to provide higher bandwidth and lower delay for inter-rack communication than the non-layered architectures. The exception was due to the number of switches and the layering in the 35-host layered architecture, which provided efficient bandwidth allocation and response time for end-to-end packet transfer between different racks. For the 16-host layered and non-layered architectures the bandwidth was high due to the lower load on the network and the high performance of Mininet.

Fig. 9. Average bandwidth in fat tree topologies (Inter and Intra)

Fig. 10. Average delay in fat tree topologies (Inter and Intra)

It was observed that average bandwidth for communications between hosts in the
same rack is higher than the average bandwidth for communications between hosts in
different racks. Also, the average delay between the hosts in different rack is higher
than the average delay between hosts in the same rack. This is because for inter rack
data transfer, the data packet must go through layers of aggregate switches and one
layer of core switches which is indicated in the topology. The layers of aggregate
switches increase the number of times a packet must pass through switches to reach the end racks of the network topology, which in turn results in the packet having to contend for available bandwidth along with other inter-rack communication packets, switch configuration messages, and controller messages in the network. This reduces the bandwidth available for individual packets during inter-rack communication and increases the switching time and the number of times a packet must share the bandwidth. In intra-rack

communication, packets must only pass through a maximum of two ToR switches and sometimes through a single aggregate switch to reach the distant host connected to the
same rack. Since the ToR switches forward the packets directly to the end hosts and do
not have to deal with forwarding of other packets in inter rack data transfers, most of
their bandwidth is available for forwarding of packets towards destination hosts.
Analysis of Fat-Tree and Multi-tiered Architectures
We calculated and compared the overall average bandwidth and overall average delay
for all architectures in multi-tiered topology and fat-tree topology. The data was
compared in terms of the number of hosts in the network and the type of network
(layered or non-layered). For accurate results of comparison, we designed two
topologies with the same metrics.
The difference in bandwidths arises because the Fat-Tree architecture follows an algorithm for determining the number of switches in the network based on the number of hosts per rack, which results in an increase in the overall number of switches. Scaling this architecture doubles the switch connectivity, which in turn increases the number of times a packet must pass through switches before reaching its destination. The increased connectivity to multiple core and aggregate switches decreases the available bandwidth, as core and aggregate switches also forward controller messages and configuration messages from the switches across the same links through which data packets pass. The multi-tiered architecture is not limited by any such algorithm. It depends only on the size of the switches used for ToR and aggregate connections, which allows multiple hosts to be connected to a single switch and reduces the number of switches used at the ToR, aggregate, and core layers, providing better bandwidth from a source to a destination.

4 Conclusion

In this paper, we presented an evaluation of data center network schemes in terms of end-to-end delay and bandwidth. We discussed the proposed schemes from different perspectives, highlighting the trends researchers have been following when designing these architectures for performing video streaming. We were able to observe various aspects of video streaming with the use of the OPNET simulator. The results provide us with useful graphs to study the behavior of the DCN under different circumstances. The comparison between the 8-server-rack and 16-server-rack architectures gave us an insightful baseline result. The study was then followed by scaled-up architectures. The implementation of multicasting on the network helps realize the difference between multicasting and broadcasting. Analysis and evaluation of the models presented in this paper demonstrate that when multicast is implemented there are significant performance gains in throughput, start-up time, and playback lags, as well as a reduction in channel-switching delays.

References
1. Lascano, J.E., Clyde, S.W.: Improving computer science education through cloud computing:
an observational study. In: 2017 IEEE 30th Conference on Software Engineering Education
and Training (CSEE&T), Savannah, GA (2017)
2. Mir, N.F.: Computer and Communication Networks, 2nd edn (2015). ISBN 0-13-381474-2
3. Bilal, K., Erbad, A.: Edge computing for interactive media and video streaming. In: 2017
Second International Conference on Fog and Mobile Edge Computing (FMEC), Valencia
(2017)
4. Aceto, G., Botta, A., de Donato, W., Pescapè, A.: Cloud monitoring: definitions, issues and
future directions. In: 2012 IEEE 1st International Conference on Cloud Networking
(CLOUDNET), Paris, France (2012)
5. Barayuga, V.J.D., Yu, W.E.S.: Study of packet level UDP performance of NAT44, NAT64
and IPv6 using Iperf in the context of IPv6 migration. In: IEEE Conference, Beijing, INSPEC:
14882156, October 2014
6. Chee, L.L., Xuen, C.Y., Karuppiah, S.A.L., Fadzil Mohd Siam, M.: IEEE Paper Published in
Advanced Computer Science Applications and Technologies (ACSAT), 26–28 November
2016
Log-Based Intrusion Detection for Cloud
Web Applications Using Machine
Learning

Jaron Fontaine1(B), Chris Kappler2, Adnan Shahid1, and Eli De Poorter1

1 IDLab, Department of Information Technology, Ghent University - imec, Gent, Belgium
{jaron.fontaine,adnan.shahid,eli.depoorter}@ugent.be
2 PwC, Brussels, Belgium
chris.kappler@pwc.com

Abstract. With the ongoing rise and ease-of-use of services provided by


major cloud providers, many enterprises migrate their infrastructure and
applications to cloud platforms. However, the success of using many and
diverse services leads to more attack vectors for potential attackers to
exploit, which in turn leads to more difficult, complex and platform-specific
security architectures. This paper is an attempt to remedy this problem
by proposing simplified cloud security using machine learning techniques.
This leads to a more general architecture that uses classifiers such as
decision trees and neural networks, trained with data logged by cloud
applications. More specifically, we collected easy-to-interpret access logs
from multiple web applications which are often the only kind of security
information available on various services on such platforms. The results
show a more flexible approach that was experimentally validated on cloud platforms and improves detection speed, using neural networks and J48 decision trees, by up to 26–47 times while still maintaining accuracies of 98.47% and 97.71%, respectively.

1 Introduction
Typical enterprises use multiple cloud solution providers for hosting services.
Cloud platforms come with advantages such as scalability, faster development,
higher cost efficiency with minimal management effort or service provider inter-
action [3]. Cloud platforms offer a wide range of applications. These include a
range of simple web-based services such as web applications, storage, databases,
data analytics, etc. They also include platforms required by many computation-
ally intensive software such as virtual machines, container services, load bal-
ancers, etc. With the increasing popularity of cloud solutions, security becomes
more important than ever to ensure data integrity, availability and confiden-
tiality. Users can access private data on such services from everywhere, at any
time. This makes it possible for intruders to try accessing or modifying the same
resources [18]. Common intrusion detection systems (IDS) can detect very spe-
cific well-defined attacks that intrude certain platforms or networks. As more
c Springer Nature Switzerland AG 2020
L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 197–210, 2020.
https://doi.org/10.1007/978-3-030-33509-0_18

cloud service public endpoints are provided to customers, more attack vectors arise for potential intruders to attack these systems. Deploying multiple platform- and application-specific intrusion detection systems is challenging: it is not very flexible and demands domain expertise in how vulnerabilities are exploited and attacks are executed. Complex detection methods on many layers of the system will also degrade system performance while scanning every kind of packet, even safe ones. This paper provides a remedy for
this problem by aiming for a broader system to perform attack detection in
cloud services using web access logs and machine learning (ML) techniques. In
this paper we do not try to implement a full stack detection system, but rather
explore the possibilities of ML that can accelerate and increase the flexibility of
attack classification. We target detection on web applications, using data avail-
able in cloud security logs.
Our main contribution is an introduction of a flexible ML approach to
perform attack detection using logs from web applications. This contribution
includes information-rich features of these logs and proposes configurations of
ML algorithms with outstanding performance and minimal time overhead. In
contrast to previous detection methods, our solution does not require domain
expertise in a vast amount of exploits, making deployment almost effortless
for web applications in cloud environments. More specifically, feature selection
methods on web application logs are used to prove useful features needed for
such detection. Accuracy and computation time trade-offs of ML techniques
are demonstrated and compared to more traditional ruled-based systems. The
particular techniques include both simple and state-of-the-art models such as
decision trees, neural networks and ensemble meta-algorithms.
The remainder of this paper is structured as follows. Section 2 briefly
describes related work. In Sect. 3 a brief overview lists the available cloud secu-
rity logs, followed by datasets used for the training of a log-based intrusion
detection model. In Sect. 4, focusing on web application logs, feature extraction
is performed, followed by an extensive search for the optimal feature selection
method. ML algorithms we used, are examined in Sect. 5 for the use in attack
classification problems followed by their associated results in Sect. 6. Next, future
research is presented in 7 by proposing a cloud security dashboard architecture
and log aggregation system, together with a hybrid approach for attack and
anomaly detection. Finally, Sect. 8 concludes the paper.

2 Related Work

This section describes related work on web and network attack detection using
ML techniques. Table 1 compares the related works in terms of their supported
web attacks, input data, approach and support for multiple types of cloud ser-
vices such as web applications, databases, cloud storage, etc. The first two papers
focus on the detection of network-related attacks, while the other papers focus
on attacks set in the application layer. Compared to the works explained in
more detail below, this paper focuses on a much broader set of web attack types

Table 1. Overview of related work in the field of attack detection on web or cloud
applications.

Paper | Supported web attacks | Input data | Classification approach | Cloud services scaling
[19] | DoS & probe attacks | Network-layer header data | DTree, FNN & SVM |
[15] | Local access, DoS & probe | Network-layer header data | Autoencoder & FNN |
[17] | SQL injection & XSS | Web application requests | SVM, Naïve-Bayes & kNN |
[7] | Android data theft | Network & application data | SVM, FNN, DTree & kNN |
This | 10+ high risk web attacks | Cloud-based web app logs | DTree, meta-learning & FNN | ✓

and extensively compares accuracy and complexity of multiple state-of-the-art


types of ML-based classifiers. Moreover, our paper focuses on the ability to scale
towards multiple cloud services and supports multiple cloud platforms.
The authors of [19] perform real time detection of attacks using ML tech-
niques and stream processing. Although their method is robust against new
attacks and achieves high accuracy, they only validate their model’s performance
on two kinds of network-based threats: Denial of Service (DoS) attacks and a
probe. Similarly, the authors of [15] focus on the detection of three kinds of
network-based threats: DoS, probe and remote to local access attacks. They
use state-of-the-art ML techniques such as neural networks and autoencoders.
Whereas these solutions address host security from the cloud provider perspective, our approach focuses on a broad set of attacks, with a big impact on
security, targeting cloud customers running many cloud services and hosting web
applications.
The authors of [17] used ML-based detection methods (SVM, Naı̈ve-Bayes,
kNN) for cross-site scripting and SQL injection attacks. Although this paper
shows similar interest in web-based attacks, their focus is smaller and does not
cover all types of attacks that can be targeted towards web cloud services. The paper focuses on similar ML techniques, except neural networks, and achieves comparable accuracy; however, they used smaller datasets which cannot cover the most important attacks, and their solution cannot be directly applied to the datasets that we considered in this work. [7] combines both network and application (GET/POST request) features to detect malware on Android devices causing data theft, premium SMS features and downloading of extra malicious code and backdoors. They also used machine learning approaches such as
support vector machines (SVM), neural networks, decision trees and k-nearest

neighbours. More specifically, their model indicates transmissions to attackers of


user sensitive information coming from client-side devices. Our approach detects
attacks that attempt to retrieve such information and also detects attacks that
try to modify server-side information.

3 Cloud Web Application Security Logs


This paper focuses on web application logs produced by an Infrastructure as a Service (IaaS) or Platform as a Service (PaaS). Software as a Service (SaaS) applications are expected to be monitored and secured by the cloud provider. In addition, SaaS applications often do not provide the details needed for attack detection.

3.1 Azure, Amazon Web Services and Google Cloud


Microsoft Azure, Amazon Web Services (AWS) and Google Cloud Platform are
the most used cloud platforms by enterprises today [6]. These cloud platforms
offer services to collect and analyse cloud logs such as Azure Diagnostics, Azure
Monitor, Amazon CloudWatch and Google Stackdriver.
The structure of many of these logs is similar to web HyperText Transfer
Protocol Secure (HTTPS) request logs. Section 3.2 describes web application
logs more in detail, as this paper uses such logs to analyse machine learning
performance and accuracy. Other cloud-service-specific logs are not tested, since a cross-cloud-platform log aggregation system would be needed to connect all cloud service logs. However, the training and testing phases of such systems are similar to the methods proposed in this paper.

3.2 Web Application Logs


Web application logs, offered by many PaaS and IaaS platforms, are a good way to
demonstrate the possibility of log aggregation and conversion to a single format.
This is required when using attack detection using machine learning on multiple
platforms. To train the classifiers, various web application logs were collected
and converted to the World Wide Web Consortium (W3C) format [23] that is
common in Apache web servers. Section 3.3 describes the structure of W3C logs
as well as attack labels.

3.3 Dataset and Labeling


We constructed a dataset with open logs originating from honeypots running
web servers among other services. These logs are available at [1,2]. Honeypots
are heavily monitored systems that are intentionally vulnerable and try to collect
information upon exploitation in order to detect new attack trends [20].
These datasets contain raw Apache access log files in the W3C format. A
common problem in ML applications is the absence of labeled datasets. To

Fig. 1. Labeled logs used for training attack detection models.

allow supervised learning, a rule-based labeling approach was utilized. While


this restricts the absolute accuracy that can be achieved, it allows us to continue
the research on the accuracy and detection speed trade-off relative to rule-based
intrusion detection systems and improve attack detection in complex cloud sys-
tems. Figure 1 shows the structure of W3C logs together with attack classes as
labels. These labels were generated using Apache scalp [10], and detection rules
from the PHP-IDS project [5].

4 Feature Selection
We applied feature selection methods to optimize the attack detection speed
and accuracy of ML algorithms. First, we discuss features extracted from the
logs. Next, methods of analysing detection accuracy and speed of features are
discussed, followed by results of feature selection.

4.1 Web Security Features


The following attack classes are detected by the proposed classifiers: “Basic
directory traversal”, “SQL injection”, “Cross-site scripting (XSS)”, “Common
comment types”, “Half-/fullwidth encoded unicode HTML breaking”, “JS with(),
ternary operators and XML predicates”, “Nullbytes and other dangerous char-
acters”, “Obfuscated JavaScript script injections”, “Includes and typical script
methods”, “Find attribute, HTML injection and whitespace.”
A script was developed to extract features in the well-known Comma Separated Value (CSV) format. Figure 2 shows the 28 features extracted from web application logs. Some features are based on the value of parameters in the
HTTPS protocol. Most features are constructed using a series or combination of
special characters.
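As an illustration of this step, the sketch below extracts a handful of such features from Apache/W3C-style access log lines; it is not the authors' script (whose full 28 features appear only in Fig. 2), and the regular expression and field layout are assumptions. The selected features (request and query length, special characters, dots, percent signs) are those that Sect. 4.3 later identifies as most informative.

```python
import csv
import re
import sys

# Common/W3C-style Apache access log line (layout assumed for this sketch).
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) \S+" (?P<status>\d{3}) \S+'
)
SPECIAL = set("<>'\";()|&{}")

def extract_features(line):
    m = LOG_RE.match(line)
    if not m:
        return None
    url = m.group("url")
    path, _, query = url.partition("?")
    return {
        "path_length": len(path),                # length of requested file
        "query_length": len(query),              # length of query string
        "special_chars": sum(c in SPECIAL for c in url),
        "dots": url.count("."),
        "percent_signs": url.count("%"),
        "status": int(m.group("status")),
    }

if __name__ == "__main__":
    writer = None
    for line in sys.stdin:
        feats = extract_features(line)
        if feats is None:
            continue
        if writer is None:
            writer = csv.DictWriter(sys.stdout, fieldnames=feats.keys())
            writer.writeheader()
        writer.writerow(feats)
```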

4.2 Selection Methods


Feature selection is the process of determining which features are relevant for
ML algorithms. It can provide detection speed and accuracy improvements by
selecting a subset of the total amount of features [11]. We investigated these
claims using Weka [24]. Weka contains a collection of machine learning algo-
rithms and allows ranking features (by importance according to the feature selec-
tion method) to select a relevant subset, as described in Sect. 4.3. We selected
the top n ranked features to compare results from different selection methods,

Fig. 2. 28 web log features used for attack detection.

where n is the number of selected features. Training and testing performances are


measured using a J48 classifier. 10-fold cross validation is used to have a repre-
sentative accuracy result. The following feature selection methods were utilized
to score features:

No Selection: Using all the features and collecting their performance and accuracy.

Correlation Evaluation: This method evaluates Pearson correlation between


the values of features and their corresponding classes [24]. A feature is considered 'good', resulting in a high score, when there is a high correlation between the two variables (X, Y), with feature X and class Y. The correlation coefficient r is calculated using (1). Values of r lie between −1 and 1, with 0 indicating no correlation and 1 a high positive correlation [12].
$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x}_i)(y_i - \bar{y}_i)}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x}_i)^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y}_i)^2}}. \quad (1) $$

where $x_i$, $\bar{x}_i$, $y_i$ and $\bar{y}_i$ correspond to the ith feature value, the mean feature value, the ith class value and the mean class value, respectively.

Gain Ratio: Gain ratio uses entropy to evaluate the value of features [24]. Gain
ratio between feature X and class Y is defined by (2).

$$ \mathrm{GainR}(Y, X) = \frac{H(Y) - H(Y|X)}{H(X)}, \quad (2) $$
with
$$ H(X) = -\sum_{i=1}^{n} P(x_i)\,\ln P(x_i), \quad (3) $$
and
$$ H(Y|X) = H(Y, X) - H(X). \quad (4) $$
where $P(x_i)$ is the probability of feature X having the value $x_i$ out of all possible values, $H(Y, X)$ is the joint entropy and $H(Y|X)$ is the conditional entropy between class Y and feature X.

Table 2. Detection accuracy and training time for 18 selected features. OneR achieves
the highest accuracy and correlation evaluation offers a shorter training time.

Method Accuracy Training time


Correlation evaluation 82.88% 20 s
Gain ratio 78.01% 20 s
Info gain 97.24% 29 s
OneR evaluation 97.25% 32 s

Info Gain: Info gain is similar to gain ratio but, as shown in (5), does not divide by the entropy of feature X.

$$ \mathrm{InfoG}(Y, X) = H(Y) - H(Y|X). \quad (5) $$


OneR Evaluation: This method uses a simple classifier with only one feature
to train on. The score is determined by the classification accuracy [14].

A desirable metric that indicates the correlation between features was not
found in Weka and could not be tested. Section 4.3 presents results using the
discussed selection methods.
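For readers not using Weka, the following sketch shows an analogous feature ranking with scikit-learn's mutual information score, which plays the role of the info gain evaluator above; the data set is a synthetic stand-in for the 28 log features and their attack labels.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Stand-in data; in the paper X would hold the 28 log features and y the
# attack/safe class labels.
X, y = make_classification(n_samples=2000, n_features=28, n_informative=10,
                           n_classes=5, n_clusters_per_class=1, random_state=0)

# Rank all features by mutual information (an info-gain-style criterion).
scores = mutual_info_classif(X, y, random_state=0)
ranking = np.argsort(scores)[::-1]
print("top 5 feature indices:", ranking[:5])

# Keep only the top-18 ranked features, mirroring the n = 18 subset in Table 2.
selector = SelectKBest(score_func=mutual_info_classif, k=18)
X_reduced = selector.fit_transform(X, y)
print("reduced shape:", X_reduced.shape)
```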

4.3 Results

Table 2 shows the measured accuracy and training time with a selection of the
18 top ranked features (instead of all the 28 features). Correlation evaluation
and gain ratio have a much lower accuracy than info gain and OneR, but their training time is lower. This behaviour occurs because the classifier uses different, sometimes more valuable, features selected by the selection algorithms. When the model trains using all 28 features, the results show a higher accuracy of 97.71%, but a slower training time of 40 s. The time to classify instances remained the same. In addition, we confirmed this observation as the trained decision trees did not change much in size. These (deeper) trees contained fewer features but had more nodes. According to info gain and OneR, the features
which are most informative for ML algorithms are: the length of requested file
and query, the amount of special chars, the amount of dots and the amount of
percentage signs.
We conclude that info gain and OneR are the optimal feature selection meth-
ods, with a preference for info gain because of its faster training time and minimal
accuracy loss. Further results presented in the paper will use all the 28 features
and are preferred in the security context because of their high accuracy and
equal classification time compared to the selected subsets of features.

5 Machine Learning Methods


When enterprises make the decision to use an ML-based intrusion detection system over a rule-based system, they have to know what the advantages and trade-offs are. When constructing a trained model to detect attacks on multiple platforms, it is important to focus on generalization, because overfitting leads to bad results on unseen data. ML algorithms also promise execution time advantages which, if large enough, could be considered over a
rule-based system. Additionally, ML techniques, with deep learning in particu-
lar, offer online learning [22]. These techniques enable the support of new attacks
without the need to retrain with the full dataset and thus allows fast detection
of new attack types. Section 5.1 draws a short overview of the tested algorithms.
Next, the test method is explained in Sect. 5.2, followed by the results in Sect. 6.

5.1 Tested Algorithms


Reduced Error Pruning (REP) tree is a fast decision tree learner [24].
The tree is trained using information gain at each leaf. The feature having the
most information gain is used to make two new arcs and nodes. The result is a
tree, a simplified example of which is shown in Fig. 3. After the training process, pruning takes place, where leaves are removed to mitigate overfitting.
J48 is Weka’s implementation of the C4.5 algorithm. This algorithm uses
information gain or entropy to build the J48 decision tree [21]. A confidence
factor of 0.25 was used to ensure pruning operations.
Random trees are similar to REP trees, but do not apply pruning. At each
leaf, a random feature is used to split the tree further into new leaves. This
results in a much faster training process, but is usually sensitive to overfitting.
As a result, the decision tree is much larger.
Random forest consists of multiple trees, constructed with random subsets
of the total features [13]. When a trained forest is used to classify, the feature
vector is given to each tree. The class with the greatest occurrence is then chosen
by the classifier. In this paper 100 trees were trained.

Fig. 3. Simplified representation of a REP tree decision tree.



Table 3. Fully-connected neural network architecture for attack detection

Layer Activation function Output dimensions


Input – 28 neurons
Dense layer ReLU 1024 neurons
Dropout layer – 1024 neurons
Dense layer ReLU 256 neurons
Dropout layer – 256 neurons
Dense layer ReLU 64 neurons
Output layer Softmax 11 neurons

Boosting is an ML ensemble meta-algorithm that combines multiple weak classifiers into one strong classifier [8]. Weka’s implementation of boosting is AdaBoost [24]. Multiple iterations modify the weak classifiers to reduce the number of misclassified instances. The final boosted classifier uses a weighted sum, based on the accuracy of all previous weak classifiers, to determine the class. To test this algorithm, a J48 classifier with 15 iterations was used.
Bagging, like boosting, is a ML ensemble meta-algorithm. Multiple random
subsets of the total training set are used to train new classifiers. The bagging
classifier chooses the class that is predicted the most, given a feature vector,
by all the generated classifiers [4]. Again, a J48 classifier with 15 iterations was
used.
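The sketch below shows rough scikit-learn analogues of these classifiers (an entropy-based decision tree standing in for J48, bagging and AdaBoost with 15 tree estimators, and a 100-tree random forest), evaluated with the 10-fold cross validation described in Sect. 5.2. It is not the Weka setup used in the paper, and the data set is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Stand-in data; not the 73165 labeled web application logs.
X, y = make_classification(n_samples=5000, n_features=28, n_informative=12,
                           n_classes=5, n_clusters_per_class=1, random_state=0)

models = {
    # Rough analogue of Weka's J48 (C4.5); scikit-learn implements CART.
    "tree": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "bagging": BaggingClassifier(DecisionTreeClassifier(criterion="entropy"),
                                 n_estimators=15, random_state=0),
    "boosting": AdaBoostClassifier(DecisionTreeClassifier(criterion="entropy"),
                                   n_estimators=15, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=10).mean()   # 10-fold CV accuracy
    print(f"{name}: {acc:.3f}")
```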
Neural networks such as the multilayer perceptron consist of connected neurons and provide a non-linear mapping between input and output vectors [9]. The output of each neuron is the output of its activation function, applied to the sum of its inputs. We implemented a Fully-connected
Neural Network (FNN) in Tensorflow, using rectified linear units (ReLU) as
activation functions. Table 3 presents the proposed FNN architecture. It contains
five dense layers and includes dropout to improve generalisation. For simplicity
and equal comparison to the other proposed methods, in this paper we did not
consider more complex architectures e.g. Convolutional Neural Networks (CNN).
CNN has the advantage of automatically extracting features, a step which we
already performed before providing input data. The proposed neural network
was configured to perform 1000 iterations (epochs) and uses a batch size of 256.
In each iteration, the weights of each arc in the network are adapted to better
fit the training data. This can be achieved using back propagation and gradient
descent [9].
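A sketch of the FNN from Table 3 in the Keras API of TensorFlow is given below. The paper states a TensorFlow implementation, 1000 epochs and a batch size of 256, but does not give the dropout rate, optimizer or loss function, so those are assumptions; the training data is a random stand-in.

```python
import numpy as np
import tensorflow as tf

def build_fnn(num_features=28, num_classes=11, dropout_rate=0.5):
    """FNN from Table 3; dropout rate, optimizer and loss are assumptions."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(num_features,)),
        tf.keras.layers.Dense(1024, activation="relu"),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

if __name__ == "__main__":
    # Stand-in data; X would hold the 28 log features, y the 11 class labels.
    X = np.random.rand(2048, 28).astype("float32")
    y = np.random.randint(0, 11, size=2048)
    model = build_fnn()
    # 1000 epochs and batch size 256 as stated in the text (reduce for a quick test).
    model.fit(X, y, epochs=1000, batch_size=256, verbose=0)
    print(model.evaluate(X, y, verbose=0))
```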

5.2 Testing Method

Accuracy of the proposed ML algorithms is measured with 10-fold cross validation, which will detect overfitting by testing 10 times with a 10% random subset and training with the other 90%. A Pareto-efficiency chart is used to illustrate

these results. This chart gives a quick overview of whether an algorithm is Pareto-inefficient, meaning that there is another algorithm with a higher accuracy and
a lower classification time. Classification time is calculated based on the total
time needed to classify all 73165 logs. Numerical results are presented in terms
of accuracy, F-score, training and classification time.
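The classification-time measurement can be sketched as follows: train once, then time a single predict call over a data set of the same size as ours (73165 instances). The classifier and data below are stand-ins, not the Weka models actually benchmarked.

```python
import time

from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Stand-in dataset sized like the one in the paper (73165 logs, 28 features).
X, y = make_classification(n_samples=73165, n_features=28, n_informative=12,
                           n_classes=5, n_clusters_per_class=1, random_state=0)

clf = DecisionTreeClassifier(criterion="entropy", random_state=0)

t0 = time.perf_counter()
clf.fit(X, y)
train_time = time.perf_counter() - t0

t0 = time.perf_counter()
clf.predict(X)                              # classify all 73165 instances
classify_time = time.perf_counter() - t0

print(f"training: {train_time:.2f} s, classification: {classify_time:.2f} s")
```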

6 Performance and Accuracy Results

Figure 4 shows the pareto-efficiency of the random tree, REP tree, J48, bagging,
boosting, random forest and neural networks. The rule-based measurement is not shown on the graph because its classification time is too long. However, we can see which algorithm achieves the best results. Random tree is the fastest at classifying instances, while the neural network is the most accurate.
We can also see that there should be no reason to choose random forest in this
case. Random forest is pareto-inefficient, because it is slower and less accurate
than boosting and the neural network. Table 4 shows the accuracy and F-score
together with training and testing times of the tested ML algorithms. Based
on Table 4 and Fig. 4, J48 should be the choice for cases where classification
time is important. On the other hand, if accuracy is critical, neural networks
should be used. The (relative) 100% rule-based method completes classification
in 115.74 s. This is 47 times slower than J48. The neural network is 26 times
faster than a rule-based method, while only losing 1.69% accuracy. Taking into
account the training time, decision trees show good scaling when using large
datasets. The authors of [25] show that training decision trees has a complexity of $O(mn\log(n)) + O(n(\log(n))^2)$, where m and n correspond to the number of features and training instances, respectively. Decision trees such as J48 and REP
tree introduce pruning, which increases training time but decreases the total height of the tree. In addition, a less complex model is produced. Boost-
ing and bagging are more complex because of their nature of building multiple
decision trees. However, using weak classifiers, that have an accuracy just above

Acc. (%) 100

99
Neural network
Boosting
98 J48
Bagging Random forest
REP tree
97 Random tree

96
2 3 4 5 6
Classification time (s)

Fig. 4. Accuracy (Acc.) and classification time on our dataset using the proposed
machine learning techniques.

Table 4. Performance of rule-based versus machine learning attack detection systems.

Metric | NN | Rand. tree | REP tree | J48 | Bagging | Boosting | Rand. forest | Rule-based
Accuracy | 98.47% | 96.91% | 97.24% | 97.71% | 97.90% | 98.08% | 97.98% | 100%
F-score | 98.47% | 96.90% | 96.55% | 97.30% | 97.90% | 98.10% | 98.00% | 100%
Train time | 110.05 s | 2.69 s | 4.64 s | 11.56 s | 31.59 s | 98.14 s | 12.71 s | n/a
Classification time | 4.38 s | 2.35 s | 2.36 s | 2.45 s | 3.39 s | 3.46 s | 5.43 s | 115.74 s

Fig. 5. Confusion matrix of the fully-connected neural network.

50%, these algorithms will produce less complex models. Unfortunately, we were
not able to achieve good results with the weak classifiers.
To briefly go into more detail, the confusion matrix in Fig. 5 shows the clas-
sification score for each of the 11 classified classes (10 attack classes + safe
class). These results are derived from the best classifier, in our case the fully-
connected neural network. For better readability, the categorical attack names
are not displayed. The ‘safe’ attack class (nr. 10) is classified 99% correctly.
Except class nr. 4 ‘Half-/fullwidth encoded unicode HTML breaking’ and class
nr. 7 ‘Includes and typical script methods’, all other attack classes are classified
near perfectly. However, class nr. 4 and nr. 7 only contain 34 and 130 training
examples respectively, making generalisation harder for these classes. Wrongly detected attack classes are only infrequently classified as the ‘safe’ class, which would be unwanted behaviour in the security context. This accuracy could be strengthened further by training the classifier only on the attack/safe distinction, increasing the generalisation for attack detection.

As a conclusion, we can verify that neural networks in combination with the proposed selected features enable near-perfect accuracy and are able to differentiate all the attacks from the safe logs. Moreover, J48 scores better when considering classification time. This classifier achieves very good accuracy, while requiring 44% less classification time than the neural network. Using ML techniques can be a viable option if web applications are constrained by the performance of an intrusion prevention system (IPS): we observe a 26–47 times higher detection speed, which can lead to the same improvement when used in an IPS.

7 Future Research

Future work should build on more real-world data, coming from many deployed cloud services. At the time of writing, hardly any open data is available concerning this. We urge companies to start publishing data in an open, preferably anonymized, format in order to help future research collect more results and design up-to-date systems. Extending this research consists of implementing ML-trained models in the real world and preprocessing the combined logs of multiple log-generating cloud services. These logs can then be transferred to a central system that processes them, e.g. an Azure WebJob. An Azure WebJob runs in an infinite loop, constantly checking whether new data is available to be processed. We used Weka for .NET in a similar configuration to confirm successful deployment of the model proposed in this paper. The classified output of such a model is then saved to a central database. These logs can then be visualized on a security cloud dashboard. This future work requires building a platform that combines logs from multiple cloud services (other than web applications) while simulating attacks on the platform in order to extend the available training data.
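As an illustration only, a minimal Java analogue of such a polling worker is sketched below; the authors report using Weka for .NET, so the model file name, the input ARFF file and the polling interval here are hypothetical.

import weka.classifiers.Classifier;
import weka.core.Instances;
import weka.core.SerializationHelper;
import weka.core.converters.ConverterUtils.DataSource;

public class LogClassificationWorker {
    public static void main(String[] args) throws Exception {
        // Load a previously trained and serialized model (hypothetical file name).
        Classifier model = (Classifier) SerializationHelper.read("attack_model.model");

        while (true) {                                           // WebJob-style infinite polling loop
            Instances batch = new DataSource("incoming_logs.arff").getDataSet();
            batch.setClassIndex(batch.numAttributes() - 1);

            for (int i = 0; i < batch.numInstances(); i++) {
                double label = model.classifyInstance(batch.instance(i));
                String attackClass = batch.classAttribute().value((int) label);
                // In a real deployment the result would be written to the central database here.
                System.out.println("log " + i + " -> " + attackClass);
            }
            Thread.sleep(10_000);                                // wait before checking for new data
        }
    }
}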
Another addition can be made by introducing an online self-learning system. Such a system receives input from experts on the proposed cloud security platform. Experts can e.g. indicate which attacks are misclassified. Periodically the model is updated, taking these corrected labels or new attacks into account. A major advantage, equivalent to the techniques proposed in this paper, is that we only need to know which attack occurred, without knowing how the attack is constructed.
Another contribution can be made by investigating a hybrid approach that combines anomaly detection, based on the ML techniques in the proposed model, with a rule-based system for classifying specific attacks. As shown in [16], attack detection by only classifying attack or no-attack can perform better than signature-based methods. More complex rule-based systems can investigate which kind of attack occurred, while the anomaly detection method would retain the same performance advantages as the proposed model in this paper, allowing more real-time detection and mitigation of attacks on cloud-based applications. Such a contribution can also investigate whether the anomaly detection is able to detect zero-day attacks that the rule-based system cannot classify yet.

8 Conclusion
The increasing complexity in the landscape of cloud applications leads to the need for a more flexible security system. Such a system should allow less complex and easier-to-update detection models. In this work, contributions are made by presenting web cloud security logs that show similarities with traditional on-premises security logs. Extra metrics allow the classification of attacks that influence the performance of a system, e.g. a distributed denial-of-service attack. Machine learning techniques are proposed, with the advantage of simpler training of detection models: newly discovered attacks can be added quickly without knowledge of their details. These models are applied on web application logs derived from multiple servers. Such logs are converted into a single format, which makes feature extraction easier. Feature selection test scenarios show that information gain offers the highest accuracy compared to other feature selection methods, while also being the fastest method. However, using all the features still delivers the highest accuracy. The J48 decision tree offers the overall best results, considering both accuracy and performance. This classifier achieves a 47x performance improvement over traditional rule-based systems, while only losing 2.29% accuracy. The neural network offers the highest accuracy of 98.47% and enables efficient and accurate attack detection on multiple web services in cloud environments.

Acknowledgments. This work was funded by the Fund for Scientific Research Flan-
ders (Belgium, FWO-Vlaanderen, FWO-SB grant number 1SB7619N).

References
1. Honeypot project. http://old.honeynet.org/. Accessed 22 Mar 2018
2. Public security log sharing site (2006). http://log-sharing.dreamhosters.com/.
Accessed 22 Mar 2018
3. Almorsy, M., Grundy, J.C., Müller, I.: An analysis of the cloud computing security
problem. CoRR abs/1609.01107 (2016). http://arxiv.org/abs/1609.01107
4. Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996). https://doi.
org/10.1023/A:1018054314350
5. Christian Matthies, M.H.: Phpids (2014). https://github.com/PHPIDS/PHPIDS.
Accessed 11 May 2017
6. Clutch: 2016 enterprise cloud computing survey (2016). https://clutch.co/cloud#
survey. Accessed 29 May 2017
7. Feizollah, A., Anuar, N.B., Salleh, R., Amalina, F., Ma’arof, R.R., Shamshir-
band, S.: A study of machine learning classifiers for anomaly-based
mobile botnet detection. Malays. J. Comput. Sci. 26(4), 251–265 (2013).
https://mjes.um.edu.my/index.php/MJCS/article/view/6785
8. Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learn-
ing and an application to boosting. J. Comput. Syst. Sci. 55(1), 119–139 (1997).
https://doi.org/10.1006/jcss.1997.1504. http://www.sciencedirect.com/science/ar
ticle/pii/S002200009791504X

9. Gardner, M., Dorling, S.: Artificial neural networks (the multilayer percep-
tron) - a review of applications in the atmospheric sciences. Atmos. Envi-
ron. 32(14–15), 2627–2636 (1998). https://doi.org/10.1016/S1352-2310(97)00447-
0. http://www.sciencedirect.com/science/article/pii/S1352231097004470
10. Gaucher, R.: Apache-scalp (2008). https://code.google.com/archive/p/apache-
scalp/. Accessed 11 May 2017
11. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach.
Learn. Res. 3, 1157–1182 (2003). http://dl.acm.org/citation.cfm?id=944919.
944968
12. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.:
The weka data mining software: an update. SIGKDD Explor. Newsl. 11(1), 10–18
(2009). https://doi.org/10.1145/1656274.1656278
13. Ho, T.K.: The random subspace method for constructing decision forests. IEEE
Trans. Pattern Anal. Mach. Intell. 20(8), 832–844 (1998). https://doi.org/10.1109/
34.709601
14. Holte, R.: Very simple classification rules perform well on most commonly used
datasets. Mach. Learn. 11, 63–91 (1993)
15. Ieracitano, C., Adeel, A., Gogate, M., Dashtipour, K., Morabito, F., Larijani, H.,
Raza, A., Hussain, A.: Statistical analysis driven optimized deep learning system
for intrusion detection. In: International Conference on Brain Inspired Cognitive
Systems (2018)
16. Kaur, J.: Wired lan and wireless lan attack detection using signature based and
machine learning tools. In: Perez, G.M., Mishra, K.K., Tiwari, S., Trivedi, M.C.
(eds.) Networking Communication and Data Knowledge Engineering, pp. 15–24.
Springer, Singapore (2018)
17. Komiya, R., Paik, I., Hisada, M.: Classification of malicious web code by machine
learning. In: 2011 3rd International Conference on Awareness Science and Technol-
ogy (iCAST), pp. 406–411 (2011). https://doi.org/10.1109/ICAwST.2011.6163109
18. Liu, W.: Research on cloud computing security problem and strategy. In: 2012 2nd
International Conference on Consumer Electronics, Communications and Networks
(CECNet), pp. 1216–1219 (2012). https://doi.org/10.1109/CECNet.2012.6202020
19. Lobato, A., Lopez, M.A., Duarte, O.: An accurate threat detection system through
real-time stream processing (2016)
20. Provos, N., et al.: A virtual honeypot framework. USENIX Secur. Symp. 173, 1–14
(2004)
21. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann Publish-
ers Inc., San Francisco (1993)
22. Sahoo, D., Pham, Q., Lu, J., Hoi, S.C.: Online deep learning: learning deep neural
networks on the fly. arXiv preprint arXiv:1711.03705 (2017)
23. W3: Logging in w3c httpd (1995). https://www.w3.org/Daemon/User/Config/
Logging.html#common-logfile-format. Accessed 27 May 2017
24. Weka: Weka documentation (2016). http://weka.sourceforge.net/doc.stable/.
Accessed 15 May 2017
25. Witten, I.H., Frank, E., Hall, M.A., Pal, C.J.: Data Mining: Practical Machine
Learning Tools and Techniques. Morgan Kaufmann, Burlington (2016)
Automatic Text Classification
Through Point of Cultural Interest
Digital Identifiers

Maria Carmela Catone, Mariacristina Falco, Alessandro Maisto(B),
Serena Pelosi, and Alfonso Siano

University of Salerno, via Giovanni Paolo II, Fisciano, SA, Italy
{mcatone,mfalco,amaisto,spelosi,sianoalf}@unisa.it

Abstract. The present work addresses the problem of the automatic classification and representation of unstructured texts in the Cultural Heritage domain. The research is carried out through a methodology based on the exploitation of machine-readable dictionaries of terminological simple words and multiword expressions. In the paper we will discuss the design and the population of a domain ontology, which enters into a complex interaction with the electronic dictionaries and a network of local grammars. A MaxEnt classifier, based on the ontology schema, aims to assign to each analyzed text an object identifier which is related to the semantic dimension of the text. In this activity, the unstructured texts are processed through the use of the semantically annotated dictionaries in order to discover the underlying structure which facilitates the classification. The final purpose is the automatic attribution of POIds to texts on the basis of the semantic features extracted from the texts through NLP strategies.

1 Introduction
The present research has been carried out in the context of the Encore project
(ENgaging Content Object for Reuse and Exploitation of cultural resources),
which makes available a specialized Registry system (the POIr), able to identify
content objects through a unique and persistent Point of (cultural) Interest dig-
ital Identifier (POId). The aim of the whole project is to promote a new model
of publishing in order to favor production and sharing of authors, publishers and
users. Tourism and cultural heritage are the two main areas of interest of the
project, whose corpus consists of unstructured data1 , shared on the web.
Techniques and applications that prepare non-structured or semi-structured
texts for automatic processing are within the scope of NLP (Natural Language
Processing).
The general architecture of the system includes a Domain Ontology whose
Schema is used as structure for all the classifications performed in the project.
1 By “non-structured data” we mean raw texts, characterized by a structure that machines cannot manage without human intervention.
c Springer Nature Switzerland AG 2020
L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 211–220, 2020.
https://doi.org/10.1007/978-3-030-33509-0_19

The Linguistic Module, as an example, makes use of the Ontology Schema in
order to add tags to simple and compound words. It uses the dictionary and a
set of Finite State Automata in order to extract and tag the ontology class labels
into the texts. This way, the classification algorithm operates on semantically
structured texts and gains a higher accuracy (see the results in Sect. 4).
In the following paragraphs we will describe the work performed through the
use of advanced linguistic devices, namely a large-scale collection of termino-
logical MultiWord Expressions (MWE) and a set of local grammars formalized
as Finite State Automata (FSA) and Regular Expressions, able to compute the
shape and the meaning of the MWE lemmatized in the lexical databases in a
context of high lexical and syntactic variability. The present paper is structured
as follow: in Sect. 1.1 we will explain the structure of the POId codes; Sect. 2 will
present a review of state of art classification techniques. The adopted methodol-
ogy will be discussed in Sect. 3. In Sect. 4 we will show the result of a preliminary
experiment.

1.1 POId

The aim of POIds is to identify, certify and facilitate the identification of digital
cultural objects, which are unstructured or semi-structured texts where authors
describe or just mention specific cultural facts or objects.
As we said in Sect. 1, the idea of this work is to define a linguistic-based
methodology to automatically attribute the POId codes to texts, relying the
classification of this cultural objects on a domain ontology schema previously
defined.
The POId can be split into two parts: a) a prefix which allows the immediate recognition of the code as a POId and which contains the description of the semantics of the text; b) a suffix which contains information about the date of entry and a disambiguation code. The POId schema is shown in Table 1.
An example of POId is:
A02.4042115006/301117.00001

in which the A represents the superclass of the ontology (in this case A repre-
sents “Archeological Heritage”), the numerical code 02 represents the subcate-
gory “archaeological areas”, 40421.15006 represents the latitude and longitude
of the place described, calculated by a geo-localization algorithm and rounded to 3 decimals, 301117 is the entry date and 00001 is a progressive disambiguation
number. In this paper we will deepen the problem around the automatic attribu-
tion of the first two elements of the POId: the superclasses and the subcategories
from the domain ontology (see Sect. 3.1).

Table 1. Structure of the POId

Prefix
  Superclass letter            Alphabetic code
  Class number                 Numeric code
  Geographical coordinates     Three-decimal approximation
Suffix
  Document insertion date      ddmmyy
  Unique disambiguation code   5+ digits
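As an illustration of the structure in Table 1, the following sketch composes a POId from its components; the exact zero-padding and rounding conventions are assumptions inferred from the single example given above.

public class PoidBuilder {
    /**
     * Builds a POId string following the example A02.4042115006/301117.00001:
     * superclass letter + two-digit class, latitude and longitude rounded to three
     * decimals and concatenated, entry date (ddmmyy) and a progressive counter.
     */
    public static String build(char superclass, int subclass, double lat, double lon,
                               String dateDdmmyy, int counter) {
        long latCode = Math.round(lat * 1000);   // 40.421 -> 40421
        long lonCode = Math.round(lon * 1000);   // 15.006 -> 15006
        return String.format("%c%02d.%d%d/%s.%05d",
                superclass, subclass, latCode, lonCode, dateDdmmyy, counter);
    }

    public static void main(String[] args) {
        // Reproduces the example given in the text: A02.4042115006/301117.00001
        System.out.println(build('A', 2, 40.421, 15.006, "301117", 1));
    }
}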

2 State of the Art

Text classification is “the activity of labeling natural language texts with thematic categories from a predefined set” [13]. In classification systems, groups of words or terms are collected together and organized. Each of these terms will be associated with a particular concept [18].
According to [8], text classification offers some advantages: differently from the dictionary-based approaches, which are mainly used to provide information on the magnitude, classification approaches make available “information about type and likelihood of being of a type and researchers can go a step further by understanding what words or patterns lead to being classified as a type” [8]. In addition, the classification models, as they do not prescribe a word list a priori, can reveal insights and surprising combinations of words or patterns.
While until the late ’80s text classification was mainly based on knowledge engineering systems, characterized by a manual identification of text classification rules for specific classes, the ’90s were marked by the adoption of machine learning (ML) developments that, using a general inductive process, automatically build a text classifier by learning, from a set of preclassified documents, the features of the classes of interest [13]. As indicated by [17], ML algorithms can be classified into supervised, unsupervised and semi-supervised according to specific learning rules. The former refer to the classic way of building a text classifier, as they require human intervention for assigning labels to a set of training documents and then apply a learning algorithm to build the classifier. The most used supervised parametric algorithms are logistic regression, adopted to predict binary classes, and Naïve Bayes [10], which assumes that all attributes of the examples are independent given the category [11]. Among the non-parametric classifiers, the most applied are: support vector machines, k-nearest neighbor, decision trees and neural networks [18].
The unsupervised algorithms aim at learning underlying characteristics of text without previously determining specific categories of interest. For this reason, they allow researchers to identify the structure of unlabeled data and to explore text patterns that are perhaps understudied or unknown [6]. Training of the unsupervised approaches is usually based on inferences carried out by grouping data into different clusters without labeled responses: k-means and hierarchical clustering are typical statistical techniques performed in order to deal with high-dimensional data [17]. Finally, a combined use of both supervised and unsupervised techniques is provided by semi-supervised approaches, which use both labeled and unlabeled data to improve the classification accuracy [1].
Starting from an in-depth analysis of the main automatic classification algorithms, a hybrid approach that uses dictionaries and a supervised automatic classifier, the Maximum Entropy (MaxEnt) classifier, has been chosen, according to the project research question and its specific aims. The MaxEnt classifier has been tested by [12] on three different corpora, obtaining comparable or better performance than the Naive Bayes classifier, usually exploited for text classification tasks.

3 Methodology and Resources


The methodology includes three main phases: a first one concerns the creation of
the Domain Ontology Schema, whose classes have been selected as base for every
future classification. The second step includes the formalization of the linguistic
resources, in the shape of dictionaries and grammars. The last step concerns
the automatic classification of texts. The linguistic module locates in the texts the terminological (open) compounds that work as the document keywords and also extracts the 20 words with the highest frequency values for each text. Finally, the accuracy of the system is tested by using a collection of 1000 processed texts
with a MaxEnt Classifier.
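As a rough illustration of the frequency-based part of this step, the sketch below extracts the 20 most frequent tokens of a text; it deliberately ignores the dictionary look-up and MWE extraction, and the regex tokenization is a simplification.

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TopTerms {
    /** Returns the 20 most frequent tokens of a text (simplified: no stop-word list, no lemmatization). */
    public static List<String> top20(String text) {
        Map<String, Integer> freq = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                freq.merge(token, 1, Integer::sum);      // count token occurrences
            }
        }
        return freq.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(20)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(top20("La statua di marmo di Carrara e la statua in bronzo del museo"));
    }
}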

3.1 Domain Ontology

The first step of the present methodology consists in the definition of a Cultural
Heritage Ontology Schema which includes two semantic levels: the first level,
characterized by three superclasses, is related to the first character of the POId
code. Then, for each superclass, we define a number of subclasses with a total
of 9 final classes. In this project the words of the dictionary have been system-
atically labeled with the codes designed for ontology. The Ontology Schema is
the following:

• Historical-artistic (A);
– Archeological Areas (01): Parks, sites, excavations and any type of
area of historical and artistic interest;
– Museum (02): Museums, archives, art galleries;
– Architecture Elements (03): churches, villas, monuments, and every-
thing related to buildings of artistic interest;
– Artwork (04): cultural assets related to music, painting, sculpture and
humanistic and scientific knowledge in general;

• Natural (B);
– Naturalistic Itinerary (01): footpaths, mountains paths, biking trails;
– Parks (02): urban and extra-urban parks, gardens, fields;
– Natural Areas (03): natural protected areas, marine protected areas;
• Cultural (C);
– Objects (01): historical texts, traditional receipts, stage clothes and cos-
tumes, traditional tools and instruments;
– Events (02): manifestation, demonstrations, shows, exhibitions, country
or religious festivals;

3.2 Lexical and Grammatical Resources

The formalization of electronic dictionaries2 opens the problem of the definition of word boundaries and the necessity to refer to Atomic Linguistic Units (ALU)3 rather than to word forms. By means of ALUs, it becomes possible to
(ALU)3 rather than to word forms. By means of ALUs, it becomes possible to
lexicalize, through the same formalism, word forms, e.g. casa “home”; parts of
words, e.g. del “of the”; and multiword expression 4 , e.g. consiglio di amminis-
trazione “governing body” [14,15].
The presence of these MWEs is truly significant in the semantic analysis of documents. In technical languages the presence of compound words is extremely relevant: sometimes these expressions exceed 90% of the words characteristic of a specialized jargon [5]. The merit of these compounds lies in the opportunity of being tagged in an unambiguous way. Moreover, they are capable of summarizing the meaning of a text if they occur in it. They represent, indeed, depending on their frequency, a text's summary sheet made up of keywords [2].
In this project, we also focused on a particularly productive kind of semi-fixed
compounds. We make reference to those nominal groups characterized by a fixed
or semi-fixed head able to select a more or less wide range of variable elements
from specific grammatical categories. According to [4], in this research we call

2
Electronic dictionaries, usable with all the computerized documents for the text
recognition, for the information retrieval and for the machine translation, should be
conceived as lexical databases intended to be used only by computer applications
rather than by a wide audience, which probably wouldn’t be able to interpret data
codes formalized in a complex way.
3
According to [15], ALUs “refers to the smallest elements of a given language that
are associated with linguistic information. By definition, these ALUs constitute the
vocabulary of the language. They can and must be systematically described in exten-
sion, because some of, or all their properties cannot be computed from their compo-
nents”.
4
A phrase composed of several words is considered to be a multiword unit, or expres-
sion, “if some or all of its elements are frozen together, that is, if their combination
does not obey productive rules of syntactic and semantic compositionality” [9].
In order to assume if a sequence of simple words separated by blanks can be lex-
icalized as a multiword expression it must be verified if it presents the following
properties: semantic atomicity, distributional constraints, and an institutionalized
and shared use in the linguistic community.

these kinds of structures “open series compounds”, which are “lists of compound
ALUs having the first two or three items in common”. Here, the fixed or semi-
fixed head defines the grammatical and semantic features of all variable elements
that constitute the compounds. A typical example is (1).
statua di [bronzo + sale + Michelangelo + David]    (1)

“statue of (bronze + salt + Michelangelo + David)”


In (1), statua represents the fixed constituent (C ) of the open series com-
pound: the constituent that allows the automatic recognition of the whole mul-
tiword expression. The presence of the preposition di is not always mandatory,
but contributes to define the part of speech of the variable constituent (statua
selects a set of adjectives if the preposition di doesn’t occur, e.g. statua (bronzea
+ marmorea + ecc..) “(bronzy + marble) statue”, but doesn’t select nouns, e.g.
*statua sale “*statue salt”). The nouns in (1) in square brackets represent the
variable constituents of the compound, which are into a paradigmatic relation
in so far they can be substituted for each other.
Anyway, variability, as we anticipated, does not mean arbitrariness: elements do not combine with one another without rules. Instead, the elements that can occur
together must respect different kinds of constraints, always established by the
fixed constituent(s).

• Syntactic constraints: e.g. the parts of speech selected by C+PREP do not


include adjectives (*statua di dorico “*statue of doric”), which are, instead,
selected by C without PREP ;
• Distributional constraints: e.g. the lexico-syntactic traits5 [3] or, more specif-
ically, the classes of words6 [7] accepted as variable element by C+PREP
include, among others, concrete words (Conc) indicating materials (Nmat),
proper nouns (Npr ) indicating the author of the sculpture or its subject, but
not abstract nouns: *statua di stanchezza “*statue of tiredness”.

Due to syntactic and lexical variation, open series compounds need to be for-
malized into an electronic context able to take into account their peculiarities.
5
We make reference to the lexico-syntactic traits identified by [3], which coincide with
nouns connected to wide semantic classes, such as human nouns, concrete nouns,
abstract nouns, among others. See the property “trait” in Fig. 1.
6
We make reference to less extensive semantically homogeneous classes of nouns, pro-
posed by [7], which block the interpretation of a given predicate (that selects those
nouns as arguments) by specifying its use. Examples are the nouns from the class
“materials” which are selected as direct object by specific predicates such as lavo-
rare “to work”, martellare “to hammer””, polverizzare “to pulverize”, scolpire “to
chisel”, levigare “to smooth”. When occurring together, these subsets of arguments
and predicates reciprocally delineate and circumscribe their interpretation. See the
property “type” in Fig. 1.

One of the best solutions is to link the dictionaries, which contain the lists of Atomic Linguistic Units and their syntactic and semantic properties, with syntactic grammars, which can make such words and properties interact in an FSA context [16]7. That is why semi-frozen compounds need both dictionaries and grammars to be recognized and annotated in a correct way. The dictionaries allow the recognition of the multiword units thanks to the characteristic components C, which are the word forms or sequences that occur every time the expressions occur and originate the recognition of the frozen expressions in texts [16]. The grammars, in the form of regular expressions or finite state automata8, let the machine compute the compounds despite the many different forms that they can assume in real texts. An example of the formalization of the electronic
dictionary is reported in Fig. 1. An extract from the grammar net is reported in
Fig. 2.

Fig. 1. Extract of the JSON dictionary of open series compounds' characteristic components

In this work we generalized the idea that each word (both simple and
compound ones) included in the Cultural Heritage lexical resources could
become characteristic or variable components of bigger multiword expressions.
7
Local grammars are algorithms that, thanks to grammatical, morphological and
lexical instructions, help us in the formalization of linguistic phenomena and in the
parsing of texts.
8
FSAs (finite-state automata) are abstract devices characterized by a finite set of
nodes and transitions that allow us to locate patterns related to a particular path.
Usually we use them when we want to extract sequences from texts and build concor-
dances. A finite state transducer (FST) produces linguistic information as outputs.
FSA and FST can be morphological or syntactical. An RTN (Recursive Transi-
tion Network) is a grammar that contains embedded graphs. An ERTN (Enhanced
Recursive Transition Network) is a more complex grammar that includes variables
and constraints.

Fig. 2. Extract of a FSA for the recognition of open series compounds

As described in the example of Fig. 1, each C is described through syntactic and distributional details which specify the nature of each element that the
grammars can identify as constituent of the compound that has C as head. This
way, as exemplified below, both simple words and multiwords can be combined
together in order to match longer domain expressions that properly work as
keyword of the texts in which they occur. In the example (2) we show that the
open series compounds that share at least one constituent, such as statua PREP
Nmat (CPN); Nmat PREP marmo (NPC); marmo A (CA); marmo PREP Npr
(CPA), can generate complex expressions that deserve to be correctly located
and processed by a classifier like the one described in this research.
statua [di + di polvere di + in granulato di] marmo [rosso + nero + bianco] di [Sicilia + Marquina + Carrara + Thassos]    (2)

“statue of (red + black + white) marble from (Sicily + Marquina + Carrara + Thassos)”
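Purely as an illustration of how such patterns can be matched, the sketch below approximates one branch of the local grammar with a Java regular expression; the word lists standing in for the dictionary classes Nmat, A and Npr are tiny hypothetical samples, and the real resources use NooJ-style FSAs rather than plain regexes.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OpenSeriesCompoundMatcher {
    public static void main(String[] args) {
        // Tiny stand-ins for the dictionary classes: materials (Nmat), adjectives (A), proper nouns (Npr).
        String nmat = "(marmo|bronzo|granito)";
        String adj  = "(rosso|nero|bianco|marmorea|bronzea)";
        String npr  = "(Carrara|Sicilia|Thassos|Michelangelo)";

        // statua [di Nmat | A] [A] [di Npr] -- a simplified analogue of one path of the grammar net.
        Pattern compound = Pattern.compile(
                "statua(\\s+di\\s+" + nmat + "|\\s+" + adj + ")?(\\s+" + adj + ")?(\\s+di\\s+" + npr + ")?",
                Pattern.CASE_INSENSITIVE);

        Matcher m = compound.matcher("Nel cortile si trova una statua di marmo bianco di Carrara.");
        while (m.find()) {
            System.out.println("matched compound: " + m.group());
        }
    }
}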

4 Experiment
We tested the effectiveness of the proposed methodology by setting up an experiment that includes the following steps: Training set building and pre-processing (collection of 1000 texts from the Cultural Heritage domain, related to each one of the ontology classes); Text structuring (extraction of MWEs, Open Series Compounds and words with a higher term frequency, and conversion of the processed texts into a JSON structure composed of the extracted information); and, in the end, Classifier Training and Testing (training of the MaxEnt classifier included in the Mallet package9 over 90% of the mentioned collection of texts and testing on the remaining 10%). In our experiment, we compared the results of the MaxEnt classifier, trained on the same corpus, before and after the text structuring phase. Table 2 shows the values of Precision, Recall and F1 for the two trained classifiers.
9
Mallet includes a series of text classifier algorithms such as the already mentioned
MaxEnt and Naive Bayes. Mallet also allows the automatic evaluation of the trained
classifier.
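For orientation only, a minimal sketch of training and evaluating a MaxEnt classifier with the Mallet API on a 90/10 split is given below; the pipeline, the toy instances and their labels are illustrative assumptions, not the project's actual feature structure.

import java.util.Random;
import java.util.regex.Pattern;
import cc.mallet.classify.Classifier;
import cc.mallet.classify.MaxEntTrainer;
import cc.mallet.classify.Trial;
import cc.mallet.pipe.CharSequence2TokenSequence;
import cc.mallet.pipe.FeatureSequence2FeatureVector;
import cc.mallet.pipe.Pipe;
import cc.mallet.pipe.SerialPipes;
import cc.mallet.pipe.Target2Label;
import cc.mallet.pipe.TokenSequence2FeatureSequence;
import cc.mallet.types.Instance;
import cc.mallet.types.InstanceList;

public class MaxEntExperiment {
    public static void main(String[] args) {
        // Standard Mallet text pipeline: map labels to indices, tokenize, build feature vectors.
        Pipe pipe = new SerialPipes(new Pipe[]{
                new Target2Label(),
                new CharSequence2TokenSequence(Pattern.compile("\\p{L}+")),
                new TokenSequence2FeatureSequence(),
                new FeatureSequence2FeatureVector()});

        InstanceList all = new InstanceList(pipe);
        // In the real experiment the 1000 structured texts are loaded here; two toy instances
        // with hypothetical keyword strings and ontology labels only show the API.
        all.addThruPipe(new Instance("statua di marmo scavo area archeologica", "A01", "doc1", null));
        all.addThruPipe(new Instance("sentiero naturalistico parco area marina protetta", "B01", "doc2", null));

        InstanceList[] split = all.split(new Random(42), new double[]{0.9, 0.1});
        Classifier maxEnt = new MaxEntTrainer().train(split[0]);   // 90% training

        Trial trial = new Trial(maxEnt, split[1]);                 // 10% testing
        System.out.println("Accuracy: " + trial.getAccuracy());
    }
}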

The total accuracy for the classifier trained with structured texts reaches a value of 0.78, while the accuracy of the classifier trained on unstructured texts reaches 0.69. We can affirm that the linguistic processing increased the results by about 0.09 points.

Table 2. Comparison between the MaxEnt classifier trained on structured texts and on unstructured texts

Class  Structured texts            Unstructured texts
       Precision  Recall  F1       Precision  Recall  F1
A01    0.82       0.64    0.72     0.87       0.70    0.78
A02    0.90       0.75    0.82     0.60       1.00    0.75
A03    0.44       0.80    0.57     0.62       0.45    0.52
A04    0.75       0.67    0.70     0.71       0.71    0.71
B01    0.70       0.70    0.70     0.73       0.69    0.71
B02    0.89       1.00    0.94     1.00       0.89    0.94
B03    0.85       0.78    0.81     0.60       0.54    0.57
C01    0.91       1.00    0.95     0.69       0.75    0.72
C02    0.67       0.75    0.70     0.50       0.67    0.57

5 Conclusion
In the paper we described a linguistically based strategy to address the problem
of automatic classification and representation of unstructured texts. Thanks to
NLP, information about people, time, places, events or even emotions, contained
in raw texts can be extracted and processed by the machine just like structured
data recorded in traditional databases. In this work we anchored the identification of keywords in the Cultural Heritage domain to a subset of terminological fixed and semi-fixed multiword expressions, demonstrating that higher levels of accuracy can be reached thanks to a proper linguistic processing of texts.

References
1. Altınel, B., Ganiz, M.C.: A new hybrid semi-supervised algorithm for text classi-
fication with class-based semantics. Knowl.-Based Syst. 108, 50–64 (2016)
2. Bolasco, S., et al.: Statistica testuale e text mining: alcuni paradigmi applicativi.
Quaderni di statistica 7, 17–53 (2005)
3. Chomsky, N.: Aspects of the Theory of Syntax, 11th edn. MIT press, Cambridge
(1964)
4. di Buono, M.P., Monteleone, M., Elia, A.: Terminology and knowledge represen-
tation. Italian linguistic resources for the archaeological domain. In: Proceedings
of Workshop on Lexical and Grammatical Resources for Language Processing, pp.
24–29 (2014)

5. Elia, A., Cardona, G.R.: Discorso scientifico e linguaggio settoriale. un esempio di


analisi lessico-grammaticale di un testo neuro-biologico. Quaderni del Dipartimento
di Scienze della Comunicazione–Università di Salerno”, Cicalese A., Landi A., a
cura di,“Simboli, linguaggi e contesti, (2) (2002)
6. Grimmer, J., Stewart, B.M.: Text as data: the promise and pitfalls of automatic
content analysis methods for political texts. Polit. Anal. 21(3), 267–297 (2013)
7. Gross, G.: Les classes d’objets. 28, 111–165 (2008)
8. Humphreys, A., Wang, R.J.H.: Automated text analysis for consumer research. J.
Consum. Res. 44(6), 1274–1306 (2017)
9. Laporte, E., Voyatzi, S.: An electronic dictionary of French multiword adverbs.
In: Language Resources and Evaluation Conference. Workshop Towards a Shared
Task for Multiword Expressions, pp. 31–34 (2008)
10. Lewis, D.D.: Naive (Bayes) at forty: the independence assumption in information
retrieval. In: European Conference on Machine Learning, pp. 4–15. Springer, Hei-
delberg (1998)
11. McCallum, A., Nigam, K., et al.: A comparison of event models for naive bayes
text classification. In: AAAI-1998 Workshop on Learning for Text Categorization,
vol. 752, pp. 41–48. Citeseer (1998)
12. Nigam, K., Lafferty, J., McCallum, A.: Using maximum entropy for text classifi-
cation. In: IJCAI-1999 Workshop on Machine Learning for Information Filtering,
vol. 1, pp. 61–67 (1999)
13. Sebastiani, F.: Machine learning in automated text categorization. ACM Comput.
Surv. (CSUR) 34(1), 1–47 (2002)
14. Silberztein, M.: NooJ computational devices. Formalising Natural Languages with
NooJ, 1–13 (2013)
15. Silberztein, M.: An alternative approach to tagging. In: International Conference
on Application of Natural Language to Information Systems, pp. 1–11. Springer,
Heidelberg (2007)
16. Silberztein, M.: Complex annotations with NooJ. In: Proceedings of the 2007 International NooJ Conference, pp. p–214. Cambridge Scholars Publishing, Cambridge (2007)
17. Thangaraj, M., Sivakami, M.: Text classification techniques: a literature review.
Interdisc. J. Inf. Knowl. Manag. 13 (2018)
18. Vasa, K.: Text classification through statistical and machine learning methods: a
survey. Int. J. Eng. Dev. Res. 4, 655–658 (2016)
An Industrial Multi-Agent System
(MAS) Platform

Ariona Shashaj1(B), Federico Mastrorilli1, Michele Stingo2, and Massimiliano Polito1

1 Network Contacts, Molfetta, Italy
{ariona.shashaj,federico.mastrorilli,massimiliano.polito}@network-contacts.it
2 DSPC - Università degli studi di Salerno, Salerno, Italy
mstingo@unisa.it

Abstract. When it comes to addressing challenges in the area of distributed and parallel computing, Multi-Agent Systems (MAS) are emerg-
ing as a key architecture. Characterized as a collection of autonomous
software (agents) which are able to cooperate in a distributed envi-
ronment, MAS-based applications have proven capabilities when using
cognitive processes, reasoning and knowledge representation in order to
develop functionality related to complex and dynamic scenarios where
the contribution of a single agent is computationally limited. In this
paper, we propose an industrial platform which fully supports the devel-
opment, deployment and maintenance cycle of MAS-based applications.

1 Introduction

A Multi-Agent System (MAS) is a collection of autonomous software compo-
nents (agents) which cooperate in a distributed and dynamic environment in
order to achieve a common goal. A modern vision of Artificial Intelligence (AI)
is the one of an intelligent agent capable of reasoning, planning, interacting
and learning [17]. In this context, it is natural to focus the interest on MAS
distributed architectures which enable the collaboration of intelligent agents in
order to develop modular and computationally efficient applications related to
Distributed AI, where the capabilities of a single agent are limited due to a local
perception of the environment and scarce computational resources. In order
to completely support the development and maintenance cycle of multi-agent
applications involved in the automation of Telco industrial processes, we pro-
vide Multi-Agent Specialized system (MASs), a distributed MAS environment.
In this paper, we present an overview of MASs environment and the followed
design principles. The remainder of the paper is organized as follows. Section 2
reviews and discusses design principles and related works. An overview of MASs
environment is presented in Sect. 3, while Sect. 4 shows an industrial use case.
Final remarks are given in Sect. 5.

c Springer Nature Switzerland AG 2020


L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 221–233, 2020.
https://doi.org/10.1007/978-3-030-33509-0_20

2 Background and Related Works


A vast amount of research has been conducted on MAS techniques, and many
modeling approaches, communication standards, tools and environments have
been proposed in the past two decades. In this section, we are going to review
main results in literature focusing on: design and modeling MAS methodologies,
agent interaction and communication and popular MAS platforms.
MAS Modeling and Standards
Agent-based methodologies, like Agent-Oriented Model (AOM) [19], Agent-
Oriented Programming (AOP) and Agent-Oriented Programming Language
(AOPL) [25] have been suggested and investigated as new modeling and design
approaches for MAS environments. In 2011 it was reported that more than 100 MAS platforms had been developed or were about to be developed, and most of them were built on top of an Object Oriented (OO) language (Java). Agents acting in these platforms can be represented as software coded in the same programming language, whereas their design can be tackled through well-known practices used in OO programming like UML-based design and modeling.
Agent UML (AUML) [3,4] has been proposed as an UML extension of Sequence
Diagram and Class Diagram, in order to represent agents and their commu-
nication protocol stack. AUML can model static behaviours of MAS environ-
ments where relations between agents do not change over time, but it lacks on
modeling dynamic aspects. Multi-Agent System Modeling Language (MAS-ML)
is another modeling approach, proposed in [7,8], which is able to model both static and dynamic aspects of a MAS environment. MAS-ML extends UML through the definition of: the Organization Diagram, which represents the social structures of a MAS environment and the elements acting in it, like agents, objects and agent/object roles; and the Role Diagram, which models the agent-object relation type. In order to model dynamic behaviours of MAS environments, MAS-ML extends the UML Sequence Diagram, which is used to represent element creation
and destruction processes and their interactions. As our final goal is the devel-
opment of a whole new industrial MAS environment, in order to support the design process we have developed a dedicated MAS model, based on a UML extension, which closely represents the components of MASs. The Foundation
for Intelligent Physical Agent (FIPA) IEEE committee [1], with the final goal
to achieve interoperability between different MAS platforms, has proposed a
set of standards specification for heterogeneous and interacting agent-based sys-
tems. In order to guarantee the FIPA-compliance, MASs implements FIPA-AM,
FIPA-ACL and FIPA-IP standards respectively related to platform management
system, agent communication language and interaction protocols.
Agent Communication
Agent interactions have an important role in MAS platforms, as a clear and
efficient interaction framework impacts the overall performance of the system.
Agent Communication Language (ACL) provides a semantic communication
interface between agents. In literature, there have been proposed different seman-
tic schemes for agent communication. Knowledge Interchange Format (KIF) [11]
has been proposed to command the execution of a set of logic rules expressed in the form of first-order predicates. Knowledge Query Manipulation Language (KQML) [9] is another ACL schema, where messages are composed of: Contents, which includes the message contents; Communication, which defines a set of parameters used for message transmission, such as source/destination addresses and the unique identifier of the communication thread; and Message, the fundamental part of the KQML message, as it represents the communicative act correlated to the message type. The semantic set of communicative acts in KQML is vast, which makes it too generic and prone to scenarios where agents supporting different KQML implementations cannot communicate. In order to define a unique communication
standard between heterogeneous agents, FIPA committee has proposed FIPA-
ACL [1]. FIPA-ACL messages are composed of three parts: Communicative act
defines the message type and it is composed by the only attribute performative;
Message parameters defines attributes used for message transmission and inter-
pretation; Contents contains the message body. Considering that FIPA-ACL has
become the de facto standard for agent communication [14], and that it is sup-
ported by almost all the most popular and available MAS platforms, we have
implemented FIPA-ACL as agent communication language in MASs.
In order to enrich the communication framework with interaction mecha-
nisms, FIPA has proposed a set of protocols which rules the messages exchange
flow during agent interactions. FIPA interaction protocols can be classified
in three main categories: protocols which involve a single interaction (FIPA-
Request, FIPA-Request-When, FIPA-Propose), multiple-participant protocols like those designed to support cooperation and negotiation processes (FIPA-Contract-Net, FIPA-English-Auction, FIPA-Dutch-Auction, FIPA-Brokerage and FIPA-Recruiting), and multiple-interaction protocols where there is a continuous stream of exchanged messages (FIPA-Subscribe, FIPA-Iterated-Contract-Net). Currently the MASs environment supports FIPA-Request, FIPA-Subscribe and
FIPA-Contract-Net, which represent all the categories of FIPA interaction pro-
tocols. Further, considering the natural relationship that has emerged between
distributed agent applications and the Internet of Things (IoT) [13,24] and Edge
Computing [15], in MASs, FIPA interaction protocols are built on top of message
transport standards commonly used in IoT applications, such as REST [18] and
MQTT [20].
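To illustrate the shape of such a FIPA-ACL interaction, the sketch below uses the JADE API (discussed in the next paragraph) to send a FIPA-Request message; it is not MASs code, whose API is not public, and the agent and content names are hypothetical.

import jade.core.AID;
import jade.core.Agent;
import jade.domain.FIPANames;
import jade.lang.acl.ACLMessage;

// Illustrative JADE agent sending a FIPA-Request message.
public class RequestSenderAgent extends Agent {
    @Override
    protected void setup() {
        ACLMessage request = new ACLMessage(ACLMessage.REQUEST);         // communicative act (performative)
        request.addReceiver(new AID("classifier", AID.ISLOCALNAME));     // message parameter: receiver
        request.setProtocol(FIPANames.InteractionProtocol.FIPA_REQUEST); // message parameter: interaction protocol
        request.setLanguage("JSON");                                     // message parameter: content language
        request.setContent("{\"text\": \"come ricarico il mio cellulare?\"}"); // content
        send(request);
    }
}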
MAS Platforms and Tools
Design principles which lead the development of MAS environments can be
classified as agent-centric and middleware-centric. Agent-centric approaches are
focused on reasoning architectures, like the belief-desire-intention (BDI) prac-
tical reasoning model [23], where agent’s actions are selected considering a set
of goals (desires) which are subsequently selected according to agent’s beliefs.
Jason [6], is a Java-based interpreter for AgentSpeak [21], one of the first AOP
languages based on the BDI model. In AgentSpeak, agents are instances of the
agent family element. Agent’s behaviors are expressed in terms of: services and
plans. A service corresponds to a set of actions executed by agents in order to
fulfill desires. The Jason implementation of AgentSpeak strengthens the original
framework, allowing a quite immediate integration of Jason agents into a Java-based MAS environment. JADEX [16] implements the BDI model, and is an add-on of a popular Java-based MAS platform (JADE [5]). JADEX's goals, which represent beliefs and desires, are classified as perform, achieve, query and maintain goals, while JADEX's plans represent agents' intentions as a procedural process, through the Java language. Middleware-centric approaches concern MAS environments considered as a collection of software functionalities which
guarantee the agent life cycle management (development, creation, destruc-
tion, mobility), agent communication (interaction flow and message transport),
multi-agent application development processes, etc. JACKTM Intelligent Agents
(JACK) [22] is a commercial Java-based MAS environment, which supports the
development processes of distributed multi-agent applications. It provides a BDI
agent language, JAL (JACK Agent Language), which is based on Java API.
Agent coded in JAL are executed upon the JACK platform. Java Agent DEvel-
opment framework JADE [5] is another popular FIPA-compliant MAS environ-
ment developed in Java. The core elements of JADE are platform, Container
and agent. A JADE platform consists of multiple distributed Containers. Each
Container hosts the execution of JADE agents. A Container runs in a single
machine, while a platform can be distributed across multiple machines. FIPA-compliance assures the interoperability between agents hosted in different JADE Containers/platforms. JADE agents are executed through a dedicated thread. JADE provides neither an agent development tool nor monitoring and maintenance tools. Multi-agent Development Kit MadKit [12] is an open source
multi-agent environment built on top of the Agent/Group/Role organization
architecture and based on Java language. Agents in MadKit play roles and are
grouped in artificial societies. The platform supports the development of agent applications by providing a Java API library. There are no dedicated development and maintenance tools. In our work we considered a middleware-centric approach, designing a distributed MAS environment which supports the complete development and deployment cycle of multi-agent applications.

3 MASs Platform

When we started thinking about creating our MAS, we had to question ourselves
about what we were aiming for. The goal was a robust, powerful, innovative prod-
uct, flexible enough to easily adjust to the business context it was going to be
used in. We really had to create something resembling a real, natural ecosystem,
where every agent contributed to the general balance and enrichment of the
community, the overall result being a framework with an enhanced user expe-
rience. First and foremost we stressed the importance of a hybrid component: our software agents needed to be able to interact in the most natural way possible with human “agents”.
To this purpose we designed the MASs Ecosystem: a Multi Agent Specialized system based on the HMAS-UP (Hybrid Multi Agent System Unified Process) methodology, namely the whole process that describes the UML model design of
systems of agents to which specific operations are delegated. In order to allow human
and virtual agents coexistence and cooperation within the ecosystem, the imple-
mentation of appropriate Agent to Human interfaces represents a mandatory
step. A second main feature of our ecosystem is the configuration of the same
as a distributed system, so as to provide it with robustness, scalability and
computational power. A preliminary survey of the existing MAS environments found on the market (Sect. 2) proved to be a critical study, not only to check whether any of these systems were hybrid-oriented products, but also to determine the most appropriate scientific standards to consider while developing original systems from scratch. As anticipated, we adopted
the FIPA [1] specifications, as de facto standards shared by the international
scientific community. We enriched the FIPA models by including the Behaviour concept (inherited from JADE) as the actual container of the computational logic of an agent; fundamentally, a Behaviour is a method which describes how an agent reacts to an event [2]. Figures 1 and 2 show the architecture of the FIPA Agent Man-
agement (FIPA-AM) model and the distributed FIPA-compliant Agent Manage-
ment implemented in MASs.

Fig. 1. FIPA AM: every Platform contains many agents, including system agents
AMS (Agent Management System, the primary management entity) and DF (Direc-
tory Facilitator, an optional entity that maintains a list of services offered by active
agents); each Agent communicates with the others via a Message transport System
and may interact with external software and tools.

We defined five Agent and two Behaviour basic archetypes:


• System Agent: the logic entity which provides services to the Container and
grants coordination between all nodes of the system; its life cycle corresponds
to that of the Container
• Dynamic Agent: its life cycle lies within its operativity: the Container will
instantiate this agent when an execution request comes; the agent will die as
soon as it fulfills the request

Fig. 2. MASs hierarchical model: every Container (one of the system nodes) contains
Agents, each encapsulating one or more Behaviour (the real computational logic opera-
tors). The Main Container is the coordinator, and contains the AMS and (eventually)
the DF.

• Static Agent: it is instantiated by the Container when an execution request comes or even without a request; differently from a Dynamic Agent, a Static one will die only when explicitly requested;
• Session Agent: shares the same logic of a Static Agent with the only difference
that it will die after an idle time (defined in its development);
• Instanceable Agent: an archetype defined in order to allow the instantiation
of a single agent directly from his correlated agents and not exclusively from
the system;
• Synchronous Behaviour: allows the execution of a task in a synchronous way;
suitable for quick tasks which promptly reply to the calling agent;
• Asynchronous Behaviour: performs tasks asynchronously entrusting them
to a third process; dedicated to long standing processes in order to avoid
behaviour’s deadlocks for the entire duration of the task.

Every entity of the ecosystem is defined as an extension of one of these abstract
archetypes, basically mirroring the ideas behind OOP: currently in terms of
implementation all agents and behaviours are described as Java classes, exten-
sions of abstract superclasses or interfaces which define the basic archetypes.
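As a purely hypothetical sketch of this idea (the real MASs class names are not published), the archetypes could be expressed as abstract Java classes and interfaces along these lines:

// Hypothetical sketch of the archetype hierarchy; the real MASs class names are not published.
public abstract class MassAgent {
    private final String name;

    protected MassAgent(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    /** Called by the Container when the agent is instantiated. */
    public abstract void onStart();

    /** Called by the Container when the agent is destroyed. */
    public abstract void onDestroy();
}

/** A Behaviour encapsulates the computational logic triggered by an event. */
interface SynchronousBehaviour<I, O> {
    O execute(I request);              // runs the task and replies directly to the calling agent
}

/** Dynamic Agent archetype: lives only for the duration of a single request. */
abstract class DynamicAgent extends MassAgent {
    protected DynamicAgent(String name) {
        super(name);
    }

    public final void serve(Object request) {
        onStart();
        handle(request);               // fulfil the request...
        onDestroy();                   // ...then die, as required by the archetype
    }

    protected abstract void handle(Object request);
}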
In order to design the exchange of messages between agents according to
the FIPA-ACL standards, three different communication protocols have been
implemented: FIPA-Request, FIPA-Subscribe and FIPA-ContractNet [1].
The MASs core has been realized considering it as a real Application Server, namely containing all the logic necessary to create, administer and develop cooperating agents, thus providing the core with the capability to satisfy every requirement and specific need of the operating context. One of the challenges faced by our team during the development of a distributed MAS environment was the achievement of complete fault tolerance in case of a Main Container node failure. The automatic re-election of a coordinator follows the policies of an algorithm developed by our software engineering department, called the Algoritmo del Despota.
The aforementioned algorithm was born as an adjustment of the well-known Bully Algorithm [10], but with a reduced computational cost. Whereas the Bully Algorithm has a computational cost of O(n²), the Algoritmo del Despota has a computational cost of O(n). The lower cost derives from the fact that, unlike the Bully Algorithm, the Algoritmo del Despota does not require the assignment of a unique identifier to every single Container, but just an identification flag for the Main Container. In case the Main Container goes down, any Container may call an election for a new coordinator by contacting every other Container in order to understand whether, amongst them, another Main Container is already present and, if not, by sending a proclamation of self-election to all other nodes (see Fig. 3). In order to avoid deadlocks (that is to say, a situation where several Containers compete and thus block each other while electing a leader) and starvation (a penalized Container that never has the chance to be elected), we propose the use of a shared database which keeps track of the identity of the Main Container. We assume that each Container in the platform has both AMS and DF in an Initiated state, ready to answer incoming requests in case it turns into the Main Container.

Fig. 3. A simple scheme to show how the Algoritmo del Despota works in the event
of failure of the Main Container. The self election message is sent ONLY in case no
other Container answers affirmatively to the question “Are you the coordinator?”
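The actual implementation is proprietary; the following simplified sketch only illustrates the election flow described above, with hypothetical interfaces standing in for the remote containers and the shared coordinator registry.

import java.util.List;

// Simplified, hypothetical sketch of the election flow; the actual MASs implementation is not public.
public class DespotaElection {

    interface ContainerClient {
        boolean isCoordinator();                  // remote question: "Are you the coordinator?"
        void notifySelfElection(String newMain);  // proclamation of self-election
    }

    interface CoordinatorRegistry {               // shared database holding the Main Container identity
        boolean tryClaimMainContainer(String containerId);  // atomic claim, avoids deadlock and starvation
    }

    private final String myId;
    private final List<ContainerClient> otherContainers;
    private final CoordinatorRegistry registry;

    public DespotaElection(String myId, List<ContainerClient> others, CoordinatorRegistry registry) {
        this.myId = myId;
        this.otherContainers = others;
        this.registry = registry;
    }

    /** Called by any Container that detects the Main Container failure; O(n) messages. */
    public void electCoordinator() {
        for (ContainerClient other : otherContainers) {
            if (other.isCoordinator()) {
                return;                            // a Main Container already exists, nothing to do
            }
        }
        if (registry.tryClaimMainContainer(myId)) {  // only one claimant wins the shared record
            otherContainers.forEach(c -> c.notifySelfElection(myId));
        }
    }
}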

The system is equipped with two web applications, composed of multiple user interfaces, each granting the developer access to various functionalities.
These web applications are dedicated to the development and deployment of
agents and their behaviours as well as to the maintenance of the ecosystem itself,
in order to support the MAS developer users at every stage (design, development,
test and deployment).

3.1 Development and Deployment


The DAD (Distributed Agent Developer), as shown in Fig. 4, is the web application dedicated to the design processes related to the implementation of agents and behaviours as extensions of the pertinent Java classes (archetypes). It provides the series of modules listed in Fig. 4.

Fig. 4. The various modules for the DAD interface (from left to right, top to bottom):
1. agents navigator, dependencies & history 2. Java classes implementation (Coder ) 3.
Error, save & deploy (Console) 4. Test (Debugger )

The Designer is a graphical wizard available in DAD, dedicated to the guided/simplified creation and configuration of new agents. It provides an easy-to-use graphical interface, which guides developers through the choice of archetypes and protocols to implement and of behaviours and related agents to associate. Further, the Designer provides a chat-shaped input box that can be used to inspect the agents for each interaction set off, including the possible content of the messages that different cooperating agents may exchange. In Fig. 5 we show an overview of the Designer interface dedicated to the configuration and testing of agent communication protocols.

Fig. 5. Designer overview: a single request to the Classifier agent triggers an exchange of messages between four cooperating agents. Developers can follow and keep track of the communication through the dashed lines (blue lines indicate request messages, green lines correspond to responses).

3.2 Maintenance and Monitoring


What we have simply named Console is a complex web application, composed of four different modules, which provide a comprehensive and continuous monitoring of the entire ecosystem. Easily navigable through a sidebar, Console's views are dedicated to the monitoring of different entities: the Container view shows a general summary of the platform containers (IP address, alias, status, agents running on it, memory usage); the Agent view monitors the status (e.g. Deployed, Suspended), archetype, location and set of associated behaviours of each Agent of the environment; the Behaviour view monitors the execution state of Behaviours, the Agents they are associated to and, if running, the Container they are on; the Log view lists log files related to the system and provides a comprehensive history of agent development processes (Fig. 6).

Fig. 6. Console overview: an example of the Container view

4 Case Study: A MASs Linguistic Pipeline for NLP, Virtual Agents Interaction and Operator Support
One of the main targets of companies operating within the Telco domain is the automation of processes involved in customer operations. In order to validate the MASs environment, we designed a multi-agent application composed of a set of cooperating agents which support human operators through conversational interfaces. Natural Language Processing (NLP) is one of the core processes playing an important role in conversational interfaces. In order to interact both with customers (autonomously) and with human operators, a Virtual Operator (e.g. a new generation chatbot) has to understand what it is being told or asked and must be capable of answering accordingly. In order to instruct this NLP interface we had to define a digital Corpus for both written and spoken Italian restricted to the TelCo domain, acquiring the linguistic data from a wide variety of sources (social media, call transcriptions, etc.), and further defining a
pipeline of agents, each assigned a part of the linguistic analysis: Lemmatization (which groups together the inflected forms of a word to analyze it as a single item, the lemma or dictionary form), Grammar and Vocabulary check (CleanPhrase agent), comparison with the Corpus, identification of the intents and entities of every sentence, and categorization (see Fig. 7 for an example of our linguistic agents' pipeline). All these agents need specific training.
We defined three steps for producing an efficient training model:

1. Creation of a dataset containing a broad range of sentences to compare with the requests, so that most of them will pass the comparison test and go on to the categorization phase
2. Widening of the vertical domain dictionary, so as to achieve adequate results in terms of classification accuracy
3. Once a good number of correctly categorized sentences has been obtained, refreshing the entire dataset with a new training run

Fig. 7. The pipeline we defined as a prototype for a MASs-based chatbot

This pool of agents, correctly instructed and trained, basically performs Information Retrieval (IR) on the specific Knowledge Base. Different software components, external to our MASs, are necessary: Apache Spark and Spark Streaming to build and interrogate the Knowledge Base, and Apache Kafka for the data stream interaction between the MASs and the Big Data Framework (a collection of open-source software utilities that facilitate using a distributed system like the MASs' to solve problems involving massive amounts of heterogeneous data and computations, including distributed storage and a low-latency MapReduce-like implementation). We therefore had to create agents able to interact with these external software components: a Scheduler to launch a Spark job for the creation of the collections which constitute the Knowledge Base; a Spark agent whose job is to interact with Spark Streaming to load the models and train the semantic agents, and subsequently to handle the user requests operating on the Kafka
environment; and an OperationInfo agent which receives the feedback from the IR operation.
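As an illustration of the data-stream interaction described above, the following minimal Python sketch shows how an agent in the role of OperationInfo could receive IR feedback over Kafka. It is not the authors' implementation: it assumes the kafka-python client, and the topic name "ir-feedback", the broker address and the message shape are illustrative choices only.

import json
from kafka import KafkaConsumer

# subscribe to the (hypothetical) topic on which IR feedback is published
consumer = KafkaConsumer(
    "ir-feedback",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)

for message in consumer:
    feedback = message.value   # e.g. {"request_id": ..., "documents": [...]}
    print("IR feedback received:", feedback)
    # here the agent would forward the result to the requesting agent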
We consider the results obtained so far to be still highly dependent on the completion of the Knowledge Base, as expected and as shown in Table 1.

Table 1. Examples of how small changes in the analyzed phrases may alter the results of the Similarity agent (which evaluates how well a phrase fits into the Knowledge Base) and of the Classification agent (which associates the phrase in question with a topic from a list of available categories). Phrases are reported in Italian as the processing pipeline has been developed and trained on an Italian Knowledge Base. Each phrase is translated for this paper's purpose only.

Phrase (Italian) Translation Similarity Classification


Come ricarico mio cell? How to top up my cell? 63% RIC: 49%
Come si ricarica il mio cell? How to top up my cell? 87% RIC: 99%
Come attivo un’offerta? How do you activate a promo? 87% OFF: 62%
Come si attiva un’offerta? How do you activate a promo? 82% OFF: 95%

In the first two phrases the different verb inflection of “ricaricare” (which means to top up credit on a prepaid phone) generates very different results both in terms of similarity and of classification; the same happened with the verb “attivare”, although in this case the similarity scores were closer. When the domain vocabulary lacks a specific term, synonym or abbreviation, the results are strongly negatively conditioned and, in some cases, the outcomes have a highly erratic component. Having the MASs interact with the Big Data Framework proves very efficient for a number of reasons: agents provide versatility in analyzing (even in real time) data of heterogeneous sources and compositions, and guarantee simple integration of new software technologies (instead of rewriting entire portions of a classic application, a new agent in charge of the new features can easily be integrated within the MASs); and access to the Big Data framework guarantees deep and continuous training of the agents over a knowledge base of wide dimension.

5 Conclusion
We have presented a distributed FIPA-compliant multi-agent system, the MASs environment, which supports the whole development cycle of a multi-agent application characterized by a set of cooperating autonomous software agents. This kind of application is considered a worthwhile architecture in domains where the capacities of a single agent are limited (e.g. Distributed AI, Edge Computing, etc.). We also described a case study of a distributed application implementing NLP tasks. In the future, we will investigate improvements to the usability of the MASs environment's components as well as interoperability with other FIPA-compliant MAS environments.

Acknowledgement. Funding/Support: This work was supported by the Horizon


2020-PON 2014/2020 project B.4.M.A.S.S “Big Data for Multi-Agent Specialized Sys-
tem”. Contribution: The MASs environment has been developed by engineers of
Ingegneria dei Sistemi Department, Network Contacts, Molfetta, Italy.

Museums’ Tales: Visualizing Instagram Users’ Experience

Pierluigi Vitale(&), Azzurra Mancuso, and Mariacristina Falco

Department of Political and Communication Science, University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano, Salerno, Italy
{pvitale,amancuso,mfalco}@unisa.it

Abstract. Social networks have renewed the ways audiences experience art
and its spaces. The phenomenon concerns visitors, communicating their art
experience through social media, and the artistic institutions, communicating
their spaces and events. Sharing contents is a practice of fruition that allows the
experience to be textualized. Our research focuses on how Igers represent
themselves and their experience at museums, through a qualitative and quanti-
tative description of the data collected. Our approach can support art institu-
tions’ communication strategies.

1 Introduction

Considering the museum as a semiotic space with its own specific organization of meaning [17], the article highlights the way users live their experience in an exhibition place and narrate it on Instagram. Considered as a novel medium, the social network allows users and institutions to talk about their cultural experience using images, hashtags and comments, also describing feelings and emotions. These forms of textualization allow analysts to approach the phenomenon on multiple levels, organizing visual and linguistic data. The research underlines how users, sharing their tourist-cultural experience on the social network, classify as “museum” places that are not specifically museums. The use of this linguistic label suggests the permeability of this semiotic space, clearly highlighted by the Visual Data Analysis and supported by a final psycholinguistic analysis.

2 Literature Review and Research Design

Instagram posts offer different types of information: visual content, comments, hashtags [2] and geolocation tags (hereinafter geotags)1. Both visual and textual, this information has communicative [8] and meta-communicative functions [7], including
1 In order to hyperlink the image’s geographical location.

denotation and connotation related to the topic of the posts and the attributes of the
picture2.
Starting from these assumptions, we built a dataset collecting Instagram posts tagged “#museum”, considering “museum” both as a topic and as an attribute. Then we worked on displaying all the images containing geolocation coordinates in their metadata. We built a list of words referring to an exhibit space, enriching it by processing the labels through Natural Language Processing techniques [12]. As a result, we developed a list of single words, bigrams and trigrams [3] for identifying the posts clearly geotagged in an exhibit space. In the end, with a pattern matching process, we derived an automatic binary coding of all geolocated pictures, with the opportunity to clearly visualize and define the position and relevance of the pictures in an exhibit or non-exhibit space (Fig. 1).
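The pattern matching step can be illustrated with a minimal Python sketch of our own; the term list and place names below are made-up examples, not the actual list used in the study.

import re

exhibit_terms = ["museum", "museo", "art gallery", "modern art museum"]  # illustrative unigrams/bigrams/trigrams
pattern = re.compile("|".join(re.escape(t) for t in exhibit_terms), re.IGNORECASE)

def code_geotag(place_name):
    # 1 if the geotag name matches an exhibit-space term, 0 otherwise,
    # None if the post carries no geolocation at all
    if not place_name:
        return None
    return 1 if pattern.search(place_name) else 0

posts = [{"id": 1, "place": "Museo Nazionale"},
         {"id": 2, "place": "Central Park"},
         {"id": 3, "place": None}]
for p in posts:
    p["exhibit_space"] = code_geotag(p["place"])
print(posts)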

Fig. 1. Exhibition spaces or other spaces - interactive map - geographic visualization of the
population

The interactive map in Fig. 1 is an overview [19] of our work, allowing users to search and explore the data in each country (or city) of the world. Moreover, the interactive stacked bar at the bottom has a highlight function: by clicking on the bar, it is possible to focus attention on the two categories, posts geolocated in an exhibit space or not3.

2 Hashtags and geotags present an important difference. In the first case the lexical choices to describe the image belong to the user; in the second case the choice is instead dictated by a list made available by the platform itself.
3 The size of each circle is proportional to the average presence of places in the population.

Geolocated pictures account for 67% of the total. Only 44% of the entire dataset (56,000 posts) was geotagged in an exhibit space, while 23% was geolocated outside an exhibit space; 33% of the collected posts are not geolocated at all. This means that more than half of the images do not clearly refer to any art institution.

3 Instagram Post as Vector of Data

In order to pursue the analysis of Igers’ representation, we randomly collected a large amount of Instagram posts containing the tag “#museum” in the caption or in the first comment. For every post, we gathered the visual content and the text-based communications. The second step was to randomly build a subset (1,100 posts) of our dataset. In order to manage the visual content without any leverage from hashtags or geotags, we separated it into two different subsets (images and text-based information), which were related to each other through their hyperlink to the post URL. We then defined several variables to identify in the pictures the represented Igers’ relationship with exhibition spaces and objects, in order to convert the qualitative visual information into data through a coding process. The process assigned to each variable a binary value (0–1). Three different judges were selected by a test, based on the rules described below, with the aim of reaching an efficient level of coding of the visual information by using the variables. As a result of the coding process, every image was defined by a sequence of binary codes, which corresponded to different categories of relationship between Igers, exhibition spaces and exhibited objects. Then, we validated the coding with an established and efficient method that works as an agreement index between different coders by measuring the reliability of the variables. In the end, we reconnected every piece of visual information coded with a binary sequence to the corresponding text-based information within a unique dataset, in order to proceed to the Visual Data Analysis. Below, we explain every phase of our work in detail.

4 Data Collection and Refinement

By using a custom python script simulating the human navigation of the Instagram web interface, we tracked the hashtag “museum” on Instagram and harvested 56,000 posts published during the last year (January 4, 2017 – January 4, 2018). The collected data included image/video files, captions and comments published on each post, and geolocation coordinates including the name of the place tagged.
The different kinds of data have been collected in several different files, described below:
– Single file for each posted picture in jpg format
– Single file for comments published to each post in txt format
– Single file for geolocation coordinates and name of places of each post in json
format
– Single file for metadata of each post in json format.

The last file, and the most important, included: the url of the picture; the number of likes; the number of comments; the date and time of publishing; the owner_id; the media_id; the caption of the post; the hashtags mentioned in the caption; the user_id of the users tagged in the picture; the user_id of the users tagged in the caption; and the type of media (picture or video).
This kind of collection allows the analysis of the phenomenon in different ways and segments. As the first stage of the data refinement process for the content analysis of the visual data, we merged all the json metadata files into a tabular dataset. From this dataset we removed all the information not relevant to the purpose of our investigation, such as: users mentioned in hashtags and tagged in pictures, media and author IDs, and all the video files. In the second stage, we built two different subsets starting from the tabular dataset. First, we randomly prepared a subset containing 100 images as a test to select the most efficient judges. Then we set up a larger random subset of 1,000 images for the designated ones. At the same time, we prepared a relational subset in which each image from the larger subset is related to all the hashtags contained in its caption; it contains the information retained from the metadata files, the geolocation coordinates, and the names of places. The aim was to relate each hashtag to a category determined and substantiated by the final psycholinguistic analysis4.
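A minimal sketch of this refinement stage is shown below. It is our own reconstruction: the field names follow the metadata described above where possible, the toy records stand in for the actual json files, and the subset sizes are reduced so that the snippet runs as-is.

import pandas as pd

# in practice, records would be loaded from the per-post json metadata files
records = [
    {"url": "https://example.com/a.jpg", "likes": 12, "comments": 3,
     "caption": "day at the #museum", "media_type": "picture",
     "owner_id": 1, "media_id": 10},
    {"url": "https://example.com/b.jpg", "likes": 5, "comments": 1,
     "caption": "#art video", "media_type": "video",
     "owner_id": 2, "media_id": 11},
]
df = pd.DataFrame(records)

# drop information not relevant to the visual content analysis
df = df.drop(columns=["owner_id", "media_id"])
df = df[df["media_type"] != "video"]          # keep pictures only

# random subsets as in the paper: a test subset for judge selection and a
# larger subset for the designated judges (sizes reduced here for the toy data)
test_subset = df.sample(n=1, random_state=1)
coding_subset = df.drop(test_subset.index)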

5 Data Validation

Because of the complexity of the coding process, we designed a first phase of selection of the judges. We submitted to the candidates the first subset, made up of 100 pictures, in order to select the best performing ones according to the data coding criteria defined by the research team. In the second phase, we submitted the above-mentioned subset containing 1,000 pictures to the three designated judges. After this phase we calculated the intercoder reliability, adopting Krippendorff’s [11] Alpha formula:

\[ \mathrm{KALPHA} = 1 - \frac{\text{Observed disagreement}}{\text{Expected disagreement}} \]

Krippendorff’s Alpha is an efficient method to measure the reliability of the vari-


ables, but it is also a good index to understand if there is a good agreement between the
coders.

4 During the process, trying to establish the quantitative relevance of the hashtags in the dataset, we found a gap between the number of the images and the number of occurrences of the main hashtag “museum”. This gap pertains to the publishing habits of users, who often place the hashtags in the first comment rather than in the caption. According to Instagram’s system, only the hashtags published in the caption and in the first comment are involved in the indexing of the images. So, we needed to enrich the edgelist with all the hashtags present in all the textual comment files, relating them to the images under which they were published.

In this phase we found that two coders (Coder 1 and Coder 3) reached a kalpha of 0.9, while the third (Coder 2) had a low level (around 4.1). A minimum level of 0.6 is generally considered acceptable, but this depends on the complexity of the evaluation: if a variable is very simple to define, even a kalpha of 0.9 could be considered low. In our case, the multitude of variables and the fine line between many of them make the process very difficult. On this basis we selected Coder 1 and Coder 3 as the most reliable and relevant for the analysis.
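For reproducibility, the agreement measure can be computed with a short script. The sketch below is our own implementation of Krippendorff's alpha for nominal data applied to binary codings; the toy data and function name are illustrative, not the study's actual judgements.

from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(units):
    # `units` is a list of lists: the codes assigned by each judge to one unit
    coincidences = Counter()
    for values in units:
        m = len(values)
        if m < 2:
            continue                       # units coded by a single judge are not pairable
        for c, k in permutations(values, 2):
            coincidences[(c, k)] += 1.0 / (m - 1)
    n = sum(coincidences.values())         # total number of pairable values
    marginals = Counter()
    for (c, _), w in coincidences.items():
        marginals[c] += w
    observed = sum(w for (c, k), w in coincidences.items() if c != k)
    expected = sum(marginals[c] * marginals[k]
                   for c in marginals for k in marginals if c != k) / (n - 1)
    return 1.0 - observed / expected

# toy example: three judges coding five pictures with binary values
data = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 0], [1, 1, 1]]
print(round(krippendorff_alpha_nominal(data), 3))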

6 Visual Information Coding: Definitions and Rules

Before defining the rules of the coding scheme [18], we must define the two main concepts: exhibition space and exhibited object. We decided to interpret the query keyword “museum” in the broadest sense of Exhibition space, a place, indoor or outdoor, where “objects of lasting interest or value” (Museum, n.d.) are displayed and shown to a public. As instances of Exhibition space, we intend a museum, a cloister, the garden of a castle, a park, etc. We defined as Exhibited objects works of art, scientific specimens, or other objects of cultural value that are gathered in a space for the purpose of being displayed. Therefore, we also consider as Exhibited objects architectural elements, performances, and some natural objects. On the basis of these definitions, we describe the rules for the coding process followed by the judges. Each variable is dichotomous, and even variables that seem very closely related are not always mutually exclusive. This approach helps us identify some particular and ambiguous cases such as off-topic content, screenshots, uploaded collages, etc.
The two designated and most efficient judges interpreted the pictures through an inductive-deductive coding process [5], developing codes from the data itself using a bottom-up method.
Every picture was examined with the following questions:
– Is the Iger represented in the picture? Is she/he indoors? Is she/he outdoors?
– Is the object represented in the picture? Is it indoors? Is it outdoors?
– Is the object represented in the picture as a figure? Or is it on the background?
After they had looked at the picture on a PC monitor, the judges put the relative
binary value for YES (1) or NO (0) on a spreadsheet.
Even if the values of variables were considered mutually exclusive, sometimes,
looking at the picture, it was not possible to identify any spatial element. In these cases,
inside and outside values were 0 for both Iger and object.
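A compact way to picture the coding scheme is the following sketch, in which one picture's answers are turned into the binary sequence used later to group pictures into categories. The variable names are our own shorthand for the questions above, not the exact labels used by the research team.

from dataclasses import dataclass, asdict

@dataclass
class PictureCoding:
    iger_present: int = 0      # Is the Iger represented in the picture?
    iger_inside: int = 0       # Is she/he indoors?
    iger_outside: int = 0      # Is she/he outdoors?
    object_present: int = 0    # Is the object represented in the picture?
    object_inside: int = 0
    object_outside: int = 0
    object_is_figure: int = 0  # figure (1) vs. background (0)

    def sequence(self):
        # binary sequence that identifies the category of the picture
        return "".join(str(v) for v in asdict(self).values())

# e.g. an object shot indoors as the figure, with no Iger in the frame
coding = PictureCoding(object_present=1, object_inside=1, object_is_figure=1)
print(coding.sequence())   # -> 0001101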

7 Results

The Visual Data Analysis led to the identification of 21 categories of images representing different relationships between Igers, the exhibition space and the exhibited objects5. The hierarchy shows that Igers prefer to post pictures of single objects displayed together with elements of their exhibition places (Objects in the Space). Similarly, the third category (Figure Objects in Public Space) identifies posts where the subject of the picture is an object in a public space, such as public artworks, architectural elements, etc. When the object is not the subject of the picture but part of its background, we identified a counterpart category we termed Ground Objects in Public Space.
The second most relevant category is composed of images depicting exhibition
spaces, that is, the overall place where several objects are displayed. The subject of
these images is not a specific object, but rather the relationship between the exhibition
space and objects contained therein. Also, for this category we could find people in the
pictures, usually shot during their visit to an exhibition.
We found the score for the Close-up pictures and the outstanding number of Off-
Topic images to be considerable. On the one hand, Close-up pictures show just the
objects without any visual element that grounds them to an exhibition space. In these
cases, the user’s point of view in the picture appears to be close to the object or focused
on some details. On the other hand, all the images with no presence of objects in a
specific exhibition space were considered Off-Topic. At the end of the process we
coded 15 similar and residual categories, which show other representations of the
relationship between Igers, exhibition spaces, and objects. Because of their low scores
and their similarity, they could be aggregated into three meaningful groups: Uploaded,
Selfie, and No Shared Space (between Iger and object) that accurately describe the last
few residual categories. Even if some categories Uploaded could be identified as a
subcategory of the main categories we described above, they are actually different from
them with respect to the different binary sequence of codes assigned to them.
The No Shared Space category is for pictures shot by Igers situated in a place that is
different and separate from the one where the object is exhibited.
Selfie pictures are unique cases where the Iger is clearly identifiable in the picture.
The images in the Uploaded categories could be regarded as subcategories of the main categories we described above, but they have been grouped in a different category because of the presence in the pictures of several graphic elements, such as labels, frames, signatures, stickers etc., indicating that the image has been edited before it was

5 Categories are: Objects in the Space (36.29%); Spaces of Exhibition (13.71%); Figure Objects in Public Space (12.49%); Close-up (12.44%); Off Topic (11.81%); Upload Objects in Space (3.62%); Ground Objects in Public Space (3.30%); Selfie with Ground Objects (1.67%); Uploaded Figure Objects in Public Space (1.13%); Selfie with Objects in the Space (0.72%); Uploaded Spaces of Exhibition (0.59%); Outside Figure Objects Shot from Inside (0.50%); Selfie with Ground Objects in Public Space (0.36%); Uploaded Selfie with Objects in the Space (0.27%); Uploaded Ground Objects in Public Space (0.27%); Inside Objects Shot from Outside (0.23%); Selfie with Objects with no Exhibit Context (0.18%); Outside Ground Objects Shot from Inside (0.18%); Inside Selfie with Figure Objects in Public Space (0.09%); Uploaded Selfie Interaction with Objects (0.09%); Objects with No Exhibit Context (0.05%).

posted on Instagram. These pictures are the result of post-production processes and
probably they don’t belong to the Iger who posted it. So, we consider them outside of
the space of exhibition, coding “Igers presence” variables as neither inside nor outside.

8 Discussion

In order to understand how many and which kind of images Igers use for a specific exhibition, we interrelated each coded picture with the relative geotag. Among all the geotagged pictures (44% of the dataset), the Selfie categories are more often geotagged in an exhibition space (60%) than all the other pictures representing just objects or spaces of exhibition (45%). In particular, we noticed that Close-up and Selfie with Ground Object posts are the least geotagged (28% and 38%), and only about half of the pictures clearly representing the exhibition spaces (Objects in the Space and Spaces of Exhibition) are not geolocated. Consequently, we moved our attention to observing the
hashtags used by Igers in their posts, in order to verify if places or locations were
indicated in the caption. Considering the 20 most shared hashtags used (about 19,000
hashtags in total, average 17 hashtags per post), we noticed that only a few of them
were words referring to a place or a location (#paris, #gallery, #exhibition, #museo6).
On the contrary, besides the large amount of words related to artistic experience, we
found many hashtags for feelings (#love, #beautiful), acts of shooting (#photography,
#photo), and the merging of these two typologies (#instagood, #photooftheday, #picoftheday)7. In order to understand the relationships hashtags create among categories,
we built a two-mode undirected graph (Fig. 2). Starting from the edgelist8, we sub-
stituted each picture with the associated category, as coded through the visual analysis.
Therefore, the network represents the relationships between two kinds of attributes:
their visual attributes (categories) and their metadata (hashtags). For our purpose, we
decided to assign fixed sizes. Hashtags are represented with nodes with the minimum
size in the graph, and the top 6 categories are represented with the bigger nodes. So, the
medium-size nodes represent the residual categories. Hashtags are colored in gray,
while visual categories are in light blue. The layout we have adopted to visualize this
network is Fruchterman-Reingold [6], in which the nodes with a significant proportion
of non-shared hashtags tend to move their link outwards, whereas the nodes related to
hashtags mainly shared tend to move with their link to the middle of the graph. Each
category is a network related to the others through the inner hashtags and has clusters
of distinctive hashtags positioned around the graph.
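The construction of the two-mode graph can be sketched as follows with networkx, whose spring layout implements the Fruchterman-Reingold algorithm. The edge list below is a made-up toy example, not the study's data; node sizes and colors follow the description above.

import networkx as nx
import matplotlib.pyplot as plt

edges = [("Objects in the Space", "#art"),
         ("Objects in the Space", "#museum"),
         ("Spaces of Exhibition", "#museum"),
         ("Selfie", "#love")]
categories = {c for c, _ in edges}
hashtags = {h for _, h in edges}

G = nx.Graph()
G.add_nodes_from(categories, bipartite=0)   # visual categories
G.add_nodes_from(hashtags, bipartite=1)     # hashtags
G.add_edges_from(edges)

pos = nx.spring_layout(G, seed=42)          # Fruchterman-Reingold force-directed layout
sizes = [600 if n in categories else 60 for n in G]
colors = ["lightblue" if n in categories else "gray" for n in G]
nx.draw_networkx(G, pos, node_size=sizes, node_color=colors, font_size=7)
plt.axis("off")
plt.show()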

6 “Museo” is the Italian and Spanish word for “museum”.
7 Tags and percentages are: museum (100.00%); art (40.55%); travel (15.45%); photography (15.45%); artist (11.73%); architecture (9.91%); gallery (9.09%); painting (8.91%); love (8.64%); contemporaryart (8.64%); history (8.55%); instagood (8.73%); beautiful (6.91%); photooftheday (6.73%); paris (6.73%); exhibition (6.64%); arte (6.18%); design (6.18%); artwork (6.18%); museo (6.00%); picoftheday (5.91%); photo (5.36%).
8 In which we related the pictures to their hashtags published in comments and captions.

Fig. 2. Network visualization - overview and zoom of a two modes graph about visual
categories and their hashtags

On the right side (Fig. 2), the main categories share very few hashtags compared with the non-shared ones, whereas most of the residual categories (Selfie, Uploaded, No Shared Space) have many shared hashtags, both with each other and with the main categories.
Through the Visual Data Analysis, we saw that both Uploaded and Selfie pictures show the experience of museums, art galleries, etc., from inside an exhibition place. But there is a difference between the two categories: the exhibition elements are generally the figure of Uploaded pictures, whereas they are just secondary elements, the ground, for the self-representation in Selfie pictures.
The Visual Data Analysis makes possible a further linguistic reflection. Although Instagram is a photo-sharing application, captions and hashtags, together with images, are a rich source of information about users’ behaviour, intentions and thoughts. For this reason, we decided to apply a psycholinguistic analysis to the written texts extracted from our image corpus. In the next section we describe the analysis and report the results.

9 Psycholinguistic Analysis

Written texts (both captions and hashtags) were entered into a psycholinguistic analysis. Our corpus comprised 1,585,504 tokens. First of all, the textual data were split into two datasets:
Museum, which comprises all texts referring to images post-geolocated in an exhibit space;
Not-museum, which includes all texts related to images that were not post-geolocated in an exhibit space.

The aim of the analysis was to capture people’s social and psychological states by means of their language; specifically, we were interested in comparing the two datasets, in order to evaluate whether they differ from each other in terms of behaviors, needs, thinking styles, or other psychological states that language might reflect. We generated an interactive interface made up of a word cloud with the 70 most frequent words in each dataset (with a minimum term frequency of 10) and their related categories, according to the following analysis process.
The following phase of the work was to automatically annotate the textual data with the LIWC program (http://liwc.wpengine.com/). This tool reads a given text and counts the percentage of words that reflect different emotions, thinking styles, social concerns, and even parts of speech. The tool provides a set of summary variables and a categorization of the words adopted in the text, which can be explored in the dashboard shown in Fig. 3.

Fig. 3. Interactive dashboard: words and categorization in the psycholinguistic analysis

The interface is designed to explore the most frequent words in the textual part of the posts (hashtags and captions), adopting a filter that lets the user observe the results depending on whether the pictures were shot in a museum or not. On the right side, each color associated with the words shows the categories in which the words have been recognized by the algorithm. The order of the words is related to the number of categories in which they are found.
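The selection of the words feeding the word cloud can be reproduced with a short script. The sketch below is our own, not the authors' pipeline: the stop-word list and the toy captions are illustrative, while the thresholds (top 70 words, minimum term frequency 10) follow the description above.

import re
from collections import Counter

STOPWORDS = {"the", "and", "a", "of", "in", "to", "il", "la", "di", "e"}  # illustrative

def top_terms(texts, top_n=70, min_freq=10):
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[#\w']+", text.lower())
        counts.update(t.lstrip("#") for t in tokens
                      if t.lstrip("#") not in STOPWORDS)
    return [(w, c) for w, c in counts.most_common(top_n) if c >= min_freq]

museum_texts = ["Amazing #museum day #art #love", "Modern art exhibition #museum"]
print(top_terms(museum_texts, top_n=5, min_freq=1))   # toy data, hence min_freq=1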

Summary variables. Analytical thinking, clout, authenticity, and emotional tone are
research-based composites that have been converted to 100-point scales where
0 = very low along the dimension and 100 = very high.
Analytical thinking (Analytic) refers to analytical or formal thinking. It is a factor-analytically derived dimension which captures the degree to which people use words that suggest formal, logical, and hierarchical thinking patterns. People lower in analytical thinking tend to write and think using language in more narrative ways, focusing on the here-and-now and on personal experiences [16].
Clout refers to the relative social status, confidence, or leadership that people
display through their writing or talking. The algorithm was developed based on the
results from a series of studies where people were interacting with one another [9].
Authenticity (Authentic). When people reveal themselves in an authentic or honest
way, they are more personal, humble, and vulnerable. The algorithm for Authenticity
was derived from a series of studies where people were induced to be honest or
deceptive [14] as well as a summary of deception studies published in the years
afterwards [15].
Emotional tone (Tone). Although LIWC2015 includes both positive emotion and
negative emotion dimensions, the Tone variable puts the two dimensions into a single
summary variable [4]. The algorithm is built so that the higher the number, the more
positive the tone. Numbers below 50 suggest a more negative emotional tone. The data reveal that in the dataset Museum Igers are more likely to adopt a language which reflects analytic or formal thinking than in the dataset Not-Museum. This finding could be interpreted as evidence that, when posting images referring to exhibition places, people tend to be more objective, to describe what they see and to report details. Conversely, the textual data related to images not geotagged in an exhibition space (dataset Not-Museum) seem to use a language which is more authentic and honest: people are more inclined to speak about their personal experiences and exhibit more self-confidence or leadership (Clout values). As regards the emotional tone, both datasets show a very positive tone (higher than 80%), without relevant differences between Museum and Not-Museum.
Word categories. Among all psychological dimensions available in LIWC2015,
such as linguistic categories (verbs, prepositions, future tense, past tense, swears, etc.),
psychological processes (anxiety, anger, feeling, cognitive mechanisms, etc.), and
personal concerns (money, religion, leisure, TV, achievement, home, sleep, etc.), we
selected only those categories which reached a threshold of 4% of total words in both
corpora.
Specifically, we focused on the following categories:
Affect: this variable is calculated on the basis of the sum of positive (posemo) and
negative (negemo) emotion words. The degree to which people express emotion, how
they express emotion, and the valence of that emotion can tell us how people are
experiencing the world. Research suggests that LIWC accurately identifies emotion in
language use. For example, positive emotion words (e.g., love, nice, sweet) are used in
writing about a positive event, and more negative emotion words (e.g., hurt, ugly,
nasty) are used in writing about a negative event [10]. LIWC ratings of positive and
negative emotion words correspond with human ratings of the writing excerpts [1].

Social: it includes words referring to social relationships, such as family, friends,


etc. Language at its most basic function is to communicate. Words provide information
about social processes—who has more status, whether a group is working well toge-
ther, the quality of a close relationship [20].
Cognitive processes (cogproc): this category aggregated words referring to a series
of cognitive processes (insight, cause, discrepancies, tentative, certainty, differentia-
tion). Depth of thinking and cognitive complexity can vary between people and situ-
ations; certain words can reveal these differences [21]. The use of causal words (e.g.,
because, effect, hence) and insight words (e.g., think, know, consider), two subcategories of cognitive mechanisms, is indicative of a complex language.
Perceptual processes (percept): it includes words referring to experiences which
involve sense organs (to see, to hear, to feel).
Present Focus (focuspresent): it is a subcategory of Time Orientation and it includes
words referring to experiences that users are living while they post images (e.g., now,
present tense verbs, currentmood, etc.)
Leisure: it is a subcategory of Personal Concerns and it pertains to words that express leisure activities.
Analysis of the linguistic cues indicates the dominance of personal and social aspects on Instagram, as found by Manikonda, Meduri, and Kambhampati [13].

10 Conclusions

Together with the Visual Data Analysis, the psycholinguistic analysis has shown how different geolocalization and tagging choices change the way the museum experience is described at the lexical level, while maintaining a positive approach to it. Igers tend to give a positive and euphoric valorization to their “museum” visit and tale, confirming the idea that many kinds of exhibition places are experienced and conceived as such. The communicative functions related to hashtags and geotags explain the relationship that is established between the place and the user, or the relationship narrated through uploaded images. It is not just a referential function, through which the contexts are told, but the set of expressive, poetic and metalinguistic functions that run through the museum’s story, made up of images, hashtags and geotags. All these functions, and the elements to which they belong, make the text able to seduce and attract the attention of other followers and other visitors. Finding and analyzing these elements and functions also makes our approach able to support art institutions in organizing events and communication strategies that follow Igers’ narrations.

References
1. Alpers, G.W., Winzelberg, A.J., Classen, C., Roberts, H., Dev, P., Koopman, C., et al.:
Evaluation of computerized text analysis in an Internet breast cancer support group. Comput.
Hum. Behav. 21, 361–376 (2005)
2. Bruns, A., Burgess, J.E.: The use of Twitter hashtags in the formation of ad hoc publics. In:
Proceedings of the 6th European Consortium for Political Research (ECPR) General
Conference (2011)

3. Cavnar, W.B., Trenkle, J.M.: N-gram-based text categorization. In: Proceedings of SDAIR-
94, 3rd Annual Symposium on Document Analysis and Information Retrieval, pp. 161–175
(1994)
4. Cohn, M.A., Mehl, M.R., Pennebaker, J.W.: Linguistic indicators of psychological change
after September 11, 2001. Psychol. Sci. 15(10), 687–693 (2004)
5. Corbin, J., Strauss, A.: Basics of Qualitative Research: Techniques and Procedures for
Developing Grounded Theory. Sage, Thousand Oaks (2008)
6. Fruchterman, T.M., Reingold, E.M.: Graph drawing by force-directed placement. Softw.:
Pract. Exp. 21(11), 1129–1164 (1991)
7. Giannoulakis, S., Tsapatsoulis, N.: Evaluating the descriptive power of Instagram hashtags.
J. Innov. Digit. Ecosyst. 3(2), 114–129 (2016)
8. Jakobson, R.: Essais de linguistique générale. Éditions de Minuit, Paris (1963)
9. Kacewicz, E., Pennebaker, J.W., Davis, M., Jeon, M., Graesser, A.C.: Pronoun use reflects
standings in social hierarchies. J. Lang. Soc. Psychol. 33(2), 125–143 (2013)
10. Kahn, J.H., Tobin, R.M., Massey, A.E., Anderson, J.A.: Measuring emotional expression
with the linguistic inquiry and word count. Am. J. Psychol. 120(2), 263–286 (2007)
11. Krippendorff, K.: Content analysis: An Introduction to its Methodology. Sage, Thousand
Oaks (2004)
12. Kumar, E.: Natural Language Processing. I.K International Publishing House, New Delhi
(2011)
13. Manikonda, L., Meduri, V., Kambhampati, S.: Tweeting the mind and instagramming the
heart: exploring differentiated content sharing on social media (2016)
14. Newman, M.L., Pennebaker, J.W., Berry, D.S., Richards, J.M.: Lying words: predicting
deception from linguistic styles. Pers. Soc. Psychol. Bull. 29(5), 665–675 (2003)
15. Pennebaker, J.W.: The Secret Life of Pronouns: What Our Words Say About Us.
Bloomsbury, New York (2011)
16. Pennebaker, J.W., Chung, C.K., Frazee, J., Lavergne, G.M., Beaver, D.I.: When small words
foretell academic success: the case of college admissions essays. PLoS One 9(12), e115844
(2014)
17. Pezzini, I.: Semiotica dei nuovi musei. Laterza, Roma-Bari (2011)
18. Potter, W.J., Levine-Donnerstein, D.: Rethinking validity and reliability in content analysis.
J. Appl. Commun. Res. 27(3), 258–284 (1999)
19. Shneiderman, B.: The eyes have it: a task by data type taxonomy for information
visualizations. In: Proceedings of IEEE Symposium on Visual Languages, pp. 336–343
(1996)
20. Semin, G.R., Fiedler, K.: The cognitive functions of linguistic categories in describing
persons: social cognition and language. J. Pers. Soc. Psychol. 54(4), 558 (1988)
21. Tetlock, P.E.: The influence of self-presentation goals on attributional reports. Soc. Psychol.
Q. 44, 300–311 (1981)
Research Topics: A Multidisciplinary Analysis of Online Communities to Detect Policy-Making Indicators

Iolanda Sara Iannotta1 and Pierluigi Vitale2(&)


1 Department of Human, Philosophic and Education Science, Fisciano, Italy
iiannotta@unisa.it
2 Department of Political and Communication Science, Fisciano, Italy
pvitale@unisa.it

Abstract. This study follows a multidisciplinary approach which combines text mining techniques, data visualization, and educational research to focus on online knowledge communities. Recently, social media have shifted information from “official” sources to user-generated content, changing knowledge-building practices. Starting from a large textual dataset created using mining techniques, the authors build an interactive visual tool to explore the conversations of an Italian Facebook group. The aim of this research is to build a web-based tool, which allows exploration and navigation through conversations and topics, to understand the significance of interactions in the research community. Results show
allows exploration and navigation through conversations and topics, to under-
stand the significance of interactions in the research community. Results show
that members’ participation in the community conversations has grown in the
course of time. Comments published from 1 January 2012 to 31 December 2016
allow us to identify seven topics for different fields of interest in the ROARS
community. Despite positive results, further investigations are required to give
weight to the empirical evidence.

1 Introduction

Scientific progress has changed intersubjective dynamics and media pervasiveness


weaves through everyone’s everyday life. Contemporary humans are experiencing an
enlargement of the space and time dedicated to social communication and the rapid
growth of social media, such as Facebook and Twitter; these have become a crucial
source of information for a wide range of tasks [1–3]. This anthropological evolution
entails numerous advantages, but also some recognizable problematic aspects; indeed,
the limit between what is real and what is “virtual” is not immediately distinguishable
due to constant and consistent media exposure. The media experience has seen sig-
nificant change in ways of conceiving and realizing interpersonal relationships. Par-
ticularly, the diffusion of social media has reformed the manner in which information,
feelings, and opinions are produced and consumed. Social platforms promote a more
active debate between users [4], turning users from passive consumer readers to pro-
sumer figures, able to create and modify content. As is well known, the word prosumer
derives from the crasis of the terms “producer” and “consumer”. Previously used by
McLuhan [5], the dissemination of the term can be attributed to Alvin Toffler, who


used the term for the first time in his text, The Third Wave [6], to indicate the improved shape of consumers in Generation X. The improvement of technology devices causes an unceasing redefinition of the modalities of sharing one’s own experiences, which are configured as a composite and intricate relationship network.
Lévy [7] identified cyberspace as a place where individuals can generate new
knowledge and, for that reason, introduced the concept of “collective intelligence”.
To determine the value of the conversations that take place in a digital environment,
we can refer to Pea [8] who distinguishes between ritual communication and trans-
formative communication. Ritual communication gives emphasis to participation,
sharing, and continuous interaction between the members of the community, as well as
a sense of belonging. Transformative communication, on the other hand, focuses on the
transmission of learning messages. Online virtual communities have some distinctive
features: members interact with each other through the network; policies exist to
manage users’ interactions; services are used to support interaction among members
and to create a sense of belonging in the community [9].
As previously mentioned, Lévy [7] deemed that virtual space gives rise to differ-
entiated forms of intelligence and he identified cyberspace as a place where individuals
could generate new knowledge, thanks to collective intelligence. To constitute this
particular form of intelligence, the significant element (the fourth element) is the result
of the negotiated construction of individual knowledge, called the knowledge space.
Since intelligence is distributed where there are human “settlements”, then, according
to the author, information communication technologies allow democratic and global
dissemination of knowledge and, consequently, a renewed structure of social systems.
In fact, in the last few decades, there have been more and more terms to define the
phenomenon of virtual communities: discourse community, community of practices,
knowledge-building community, learning community [10]. As suggested in the spe-
cialist literature, each form of community has detailed characteristics, but there are
natural overlaps between one category and another. There are communities centered on
conversation; this is the case in discourse communities in which participants find
themselves because they share interests and needs. The massive diffusion of tech-
nologies and the Internet have increased possibilities for conversation and the differ-
entiation of issues of common interest. In the knowledge-building community, on the
other hand, the focus is on individual members’ knowledge, which contributes
knowledge for collective growth. Learning, therefore, is built in social relations and not
determined a priori.
Chism [11] believes that online conversations can carry out various educational
functions: they allow the building of a cohesive group (learning community), each
actor is responsible for achieving a common goal, they favor the elaboration of ideas,
they offer support or tutoring, and they generate positive feedback for individual
problem-solving strategies. More generally, Jenlink and Carr [12] identified three main
functions of online conversation: transactional function, if the conversation is directed
at negotiating and exchanging and/or sharing of knowledge and points of view;
transformative function, if interlocutors suspend their own opinions and judgment on
what is said by others; transcend function, if the conversation leads to the overcoming
of personal convictions, in order to advance collective knowledge.

Communities of practice hint at their constructivist heritage [13]; the main principle
is to shift control from instructors to learners and to the learning path. Siemens [14]
believes that the Internet has heralded the inception of a new educational paradigm
called connectivism. Social constructivist views recognize that learners produce
knowledge as they attempt to understand their shared experiences and actively create
meaning. Learners regularly select and follow their own learning (they do this differ-
ently, depending on age).
The interaction between members of the knowledge building community is
therefore the essential element of the social process of knowledge building, and allows
every participant to become a knowledge builder rather than a passive interpreter of the
learning process [15]. A key concept of communities of practice is community
knowledge, in which the sum of this community knowledge is greater than the sum of
individual participant knowledge [16].

2 Research Description

Starting from the results and the methodology of a preliminary study [17], we propose a
multidisciplinary approach: beginning with a large textual dataset created using web
mining techniques, we have built an interactive visual tool to explore and navigate
topics extracted from data mined from Facebook conversations. Due to its extreme
popularity, Facebook offers the possibility to collect heterogeneous data. In addition,
the language on Facebook is more emotional and interpersonal compared with other
social media [4]. To analyze topics, we borrowed techniques from natural language
processing (NLP); in particular, we address the problem by proposing an approach
based on topic modeling.
Topic modeling offers an interesting machine-learning-based solution [18] to
manage large textual data and to discover a hidden thematic structure in a collection of
documents; it allows one to find the salient and most debated themes using latent
semantic analysis and, in a second step, perform an interactive time analysis of a
particular topic. We propose an interdisciplinary approach, combining NLP with data
visualization methods to translate data into a visual system to explore the collection.
We designed a set of views about the relevance of topics in Facebook conversations in
order to examine them dynamically, basing their value on time spent on each topic. The
tool is composed of a set of “views” and navigation filters which allow exploration of
the dataset for different ranks and trends, and through various levels of detail, following
the classical pattern overview first, zoom and filter, then details-on-demand [19].
Visualization is primarily designed as an exploratory tool which allows a visual
interactive analysis of Facebook conversations at different levels of depth and
granularity.

3 Data Collection

The analysis was carried out on large text corpora, pertaining to seven years of con-
versation on the largest and longest-running Italian Facebook group about research
topics, with more than 14000 members: ROARS (www.roars.it). The group is public
and all the data that was processed for this analysis are available and accessible via
queries to the Facebook Public API.
The dataset contains 183762 rows, starting on the first day, 30 September 2011,
until the final day of data collection, 5 April, 2017.
The posts that were published by the members inside the group number 18891 and,
for each of these, we collected the typology (i.e., status, video, link, photos, etc.) and
the number of interactions generated, such as shares, comments, likes, and reactions
(i.e., like, love, wow, sad, angry, etc.). The latter interactions could represent a scale of
base emotions, in which like and love have a positive polarity, wow has a neutral
polarity, and sad and angry are clearly identifiable as negative.
The large remaining number of rows (164871) are the comments published, with their relevance (comments, likes) and, if any, the user_id and name of the users to whom the response was sent.
Our purpose was to focus only on the comments.

4 Data Pre-processing

For our purposes, we carried out two processes of data refining, building two different
datasets for micro and macro exploration. First, we decided to focus our attention on an analysis of the evolving relevance of the topics in this community and to detect them in separate corpora for each year. Moreover, to detect critical topics, we filtered our dataset, focusing attention on all the statuses (the first level of publications inside the group) marked by a percentage of negative emotions greater than or equal to 50%. Once we detected the potentially critical statuses, we collected all their comments and merged all the textual contents (critical statuses and their comments) into a unique dataset.
Considering that reactions were introduced on 24 February 2016 (and their slow rise in usage), the statuses that we defined as “critical” interestingly amount to 23% of a total of 4521 posts.
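The filtering of critical statuses can be illustrated with a minimal pandas sketch. It is our own reconstruction: the column names and the toy rows are assumptions, while the 50% negative-reaction threshold and the merging of statuses with their comments follow the description above.

import pandas as pd

posts = pd.DataFrame({
    "post_id": [1, 2],
    "text": ["status about VQR", "status about funding"],
    "like": [3, 40], "love": [2, 10], "wow": [0, 2], "sad": [5, 0], "angry": [10, 1],
})
comments = pd.DataFrame({"post_id": [1, 1, 2], "text": ["c1", "c2", "c3"]})

reactions = ["like", "love", "wow", "sad", "angry"]
negative_share = posts[["sad", "angry"]].sum(axis=1) / posts[reactions].sum(axis=1)
critical = posts[negative_share >= 0.5]

# merge the critical statuses with their comments into a single textual dataset
corpus = pd.concat([critical[["post_id", "text"]],
                    comments[comments.post_id.isin(critical.post_id)]])
print(corpus)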
For an efficient topic modeling analysis, it was necessary to convert the dataset into a plain text format, removing links, stop words, and all the non-textual formats used by the users, such as gifs, images, videos, memes, and emoji. Our goal was to explore latent topics and relations, so we chose to analyze the data by borrowing techniques from natural language processing (NLP). In particular, as a first step, we chose to pre-process the data, following the entire NLP pipeline widely discussed in the literature [20]. We addressed the problem by proposing an approach based on topic modeling. As shown by several recent studies, topic modeling offers an interesting machine-learning-based solution [18] to manage large textual data and to discover a hidden thematic structure in a collection of documents: it allows one to find the salient and most debated themes using latent semantic analysis and, in a second step, to perform an
interactive time analysis about a particular topic. We propose an interdisciplinary


approach, combining NLP with data visualization methods to translate data into an
interactive visual system to explore the collection.
Due to the extreme variety of the texts extracted from Facebook, in order to organize the corpus into clusters of thematically related sections, we used LDA [21]. LDA represents documents as random mixtures over latent topics, where each topic is characterized by a distribution over words. For each of these we have a Dirichlet parameter [22] value, which defines the weight of the topics and their hierarchy. These random mixtures express a document's semantic content, and document similarity can be estimated by looking at how similar the corresponding topic mixtures are. For topic identification, we used the MALLET software. As required by MALLET, we separated and organized the whole dataset into seven different files, in a machine-readable format, to better analyze the topics pertaining to each year.
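As an illustration of this step, the following sketch extracts topics from one year's worth of comments. The authors used MALLET; scikit-learn's LDA is shown here only as a stand-in, the toy comments are not from the dataset, and the number of topics (five per year) follows the paper.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

comments_2014 = ["valutazione VQR ANVUR concorso ricercatori",
                 "scuola MIUR inglese studenti",
                 "tesi articolo citazioni dottorato"]   # toy data

vectorizer = CountVectorizer()                  # stop words are removed upstream
X = vectorizer.fit_transform(comments_2014)

lda = LatentDirichletAllocation(n_components=5, random_state=0)
lda.fit(X)

# print the top keywords of each topic, to be labelled manually as in Table 2
terms = vectorizer.get_feature_names_out()
for i, weights in enumerate(lda.components_):
    top = [terms[j] for j in weights.argsort()[-5:][::-1]]
    print(f"Topic {i}: {'; '.join(top)}")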
As shown in Table 1, one can observe the visible growth over the years of par-
ticipation in the community.

Table 1. Number of authors/comments per year on ROARS


Year        n° Authors  n° Comments
Sept. 2011  72          949
2012        377         13000
2013        983         25815
2014        1452        32677
2015        1926        33261
2016        2506        48430
2017        1141        10739

5 Data Visualization

In the following visualizations, with regard to the macro exploration of the thematic
trends in the conversation, we focused our attention on the comments published from 1
January 2012 to 31 December 2016 in order to avoid partial results in the first and last
years.
After the data processing carried out using the MALLET algorithm, we detected the top five topics for each year examined. By interpreting the results, seven topics for different fields were identified. However, not all of these topics were present in every year. For example, the topic of research funding starts to appear in the top five from 2014, while the topic of the school and the Italian Ministry of Education disappears in 2015.
The topic concerning political arguments, labelled policy making, remains stable in second place, after dropping to third in 2012. The topic evaluation/National Academic Qualification (NAQ) (explained below) is always present in the top five topics. Table 2 shows some examples of the labelled topics:

Table 2. Sample of labelling of automatic topics detected using MALLET


Topic                 | Keyword
Evaluation NAQ        | Evaluation; VQR (a); Researchers; ANVUR; Contest
Schooling/MIUR        | English; Language; School; Students
Scientific Production | Thesis; Article; Quoted; Doctoral; Quotations
(a) VQR is an acronym for the Evaluation of Research Quality.

Many studies deal with the visualization of data from large text corpora or topical
hierarchies, particularly focusing attention on the words that comprise the topics and
applying visual formatting such as networks [23]. In order to better understand the
evolving relevance of the topics in a text, there are many visual strategies. These start
with the adoption of the “stacked graph” by ThemeRiver [24] through to the approach
of TIARA [25], which includes in the graph a summarization of the words, and
ultimately to the flow graph of TextFlow [26] which allows observation of the rela-
tionship (“splitting and merging”) between the topics.
In order to achieve the research goal, we decided to adopt a hierarchical alluvial
diagram [27] to clearly understand in an overview diagram (Fig. 1) which one of the
seven topics detected is relevant for each year. In the following visualization, on the
upper side we present all the topics in ascending order, from left to right. Each tick
represents a year, with a fixed size [19].

Fig. 1. Presence or absence of topics from 2012 to 2016



To better understand the evolving ranking of the topics over time, we used a
Sankey diagram [28], with a timeline on the abscissae. On the ordinates, there are no
quantitative values; instead, there is an inverted top-down scale from 1 to 6. Therefore,
the line in the first position for a specific year represents the main topic of
the year. Each topic has a different color that defines its position until the last year.
When a topic could not be detected in the top five topics of the year, its path does
not start from the tick. For instance, the topic regulatory framework only appears in
2016.
In the upper right side, there is a dropdown menu allowing one to switch from the
hierarchy to trend visualization (Fig. 3).
All the topics detected may be defined as “educational topics”, except for chit-chat.
We labelled this topic in this way because it does not represent any particular theme;
however, it includes all the daily conversations among the members.
The topic scientific production pertains to all conversations about journals, reviews,
citations, and other words about academic publishing. Evaluation NAQ (National
Academic Qualification) is the topic in which people talk about evaluation criteria for
promotion, in order to achieve the title of academic professor in Italy or to improve
one’s own academic role. Policy making is the cluster of conversation about the work
of Italian politicians, specifically, presidents and ministers. The topic regulatory
framework is a little different. Here, people talk more specifically about the laws in the
field of school and research. In schooling/MIUR members generally talk about school
and relationships between one’s academic career and the attainability of earning the
title of professor in compulsory education. Finally, the topic funding pertains to all the
comments about provision and management of funds.

Fig. 2. Hierarchical distribution of topics from 2012 to 2016



By selecting the “trends” button in the dropdown menu positioned in the upper
right corner of the former visualization, it is possible to change the view. This
visualization changes the reading perspective of the data, focusing attention not on the
weight of each topic but on the evolution of its relevance. For
this purpose, we decided to adopt a bump chart, in which each topic, colored as before,
is positioned on a timeline. Each path has a standard maximum size, so it is not possible
to make comparisons among different topics.
For instance, for the topic schooling/MIUR, it is possible to observe a different
degree of relevance between 2012 and 2014, even though it remains in third
position. Moreover, the topic funding seems to be decreasing, albeit having a stable
position (fourth) in the last two years analyzed.
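As a rough sketch of how such a rank-over-time bump chart can be drawn, the Python/matplotlib fragment below plots one line per topic on an inverted rank axis; the rank values are illustrative placeholders, not the study's data, and the interactive dropdown of the actual tool is not reproduced.

# Sketch of a rank-over-time "bump chart": one line per topic, y = rank per year.
# The rank values below are illustrative placeholders, not the study's results.
import matplotlib.pyplot as plt

years = [2012, 2013, 2014, 2015, 2016]
nan = float("nan")
ranks = {                                   # 1 = main topic of the year
    "evaluation NAQ": [1, 1, 2, 1, 1],
    "policy making":  [3, 2, 1, 2, 2],
    "funding":        [nan, nan, 5, 4, 4],  # absent from the top five before 2014
}

fig, ax = plt.subplots()
for topic, r in ranks.items():
    ax.plot(years, r, marker="o", label=topic)
ax.invert_yaxis()                           # rank 1 at the top (inverted top-down scale)
ax.set_xticks(years)
ax.set_ylabel("rank")
ax.legend()
plt.show()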

Fig. 3. Trend distribution of each topic from 2012 to 2016

6 Critical Topics

Using the same approach, we also analyzed the content for critical topics; however, in
this case, the observed period is between the launch of the reactions interaction and the
day of the data collection process: 23 February 2016 to 5 April 2017.

Topic                                         Keywords
Evaluation NAQ                                Anni valutazione commissione ricercatori concorso vqr
Issues about scientific career of a Minister  Articolo scritto tesi plagio scrivere ministro
University inequality North/South             Studenti scuola sud nord università lavoro laurea

As carried out for topic detection in posts, for the critical topics we performed manual
decoding and labelling. We re-identified evaluation NAQ: the topic in which people talk
about the evaluation criteria required to attain an academic title or to improve one’s own
academic career. Critical topic keywords are “years, evalu-
ation, commission, researcher, selection”. The second critical topic concerns facts and
discussions related to a recent news item which has long been covered by the media:
the authors of this study define this as the “issue about the scientific career of a
Minister”. The news reported the supposed plagiarism of the doctoral thesis of the ex-
Minister of Public Administration; the case closed with a full acquittal. In this case, as
reported below, the keywords are: “article, written, thesis, plagiarism, writer, Minister”.
In our analysis, the label “University inequality North/South” reveals the unfortunately
evident disparity that exists between universities in Southern Italy and those in the
North in terms of structures and possibilities. Keywords are “students, school, north,
south, universities, degree”. The critical topic of financing distinctly emerges and, in
this case, we labelled it “funding and comparison”, because it was noted in posts that
members of the community talked about comparison with other European countries.
We infer this from the keywords: “research, Italy, euro, Germany, university, financing”.
There is also a comments area that is not possible to classify clearly, because the
main words are “problem, part, system, right, always, research”; we called this chit-
chat. A sixth critical topic identified in this research is “person to person call”; in other
words, this discussion theme concerns a direct call from the school head. Keywords
that identified this theme are: “Government, Natta, teaching chair, Minister, reform,
President”. A final critical topic is the subject of “research approach missing in
decision-making”; the list of keywords is “research, university, xylella, Italian, scien-
tist, bacterium”. This topic refers to a news report which gave rise to a debate about the
value of news that can be recognized as “scientific” and, for this reason, can usefully
inform decision-making. To visualize these results, we decided to adopt a treemap
(Fig. 4) [29] in which it is possible to show the hierarchy of the critical topics and their
most relevant keywords, used to decode their meaning. The sizes and the color scale refer to
the Dirichlet parameter, so the ranking of each topic in the top seven is clearly readable.

Fig. 4. Hierarchical distribution of critical topics

7 Discussion

In this paper, the authors first focused their attention on the comments published in the
ROARS Facebook group from 1 January 2012 to 31 December 2016. This choice was
made in order to avoid potentially partial results in the first year, 2011, and in the last
year, 2017. The MALLET algorithm allowed the detection of the top five topics for each
year from 2012 to 2016. As already mentioned, not all topics are present in every year of
analysis. For example, the topic labelled research funding appears in 2014. It is pos-
sible to infer that this argument, or similar arguments, was the subject of Facebook
conversation even before 2014. Nevertheless, in 2014, European researchers wrote
an open letter opposing severe budget cuts that were damaging scientific
research and threatening Europe’s economic future. Clearly, the interest of the com-
munity of researchers in ROARS rose in that period. Visualization of data shows
(Fig. 2) that only the topic labelled Evaluation NAQ is represented in every year of
analysis, but even this had its ups and downs. It is important to point out that the topic
chit-chat is present every year, but this topic does not signify any specific theme. It
follows that the evaluation of the scientific path and relative NAQ are arguments
particularly significant for Italian research.
The second aim of this research study concerns critical topics in the observed
period between the launch of the reactions interaction and the day of the data collection
process (23 February 2016 to 5 April 2017). Data processing led to the detection of
seven critical topics, as reported in the previous section. We can summarize the results
of the analyses carried out on critical topics in two macro-categories: the first macro-
category we call “inequality”, the second “research value”. The first macro-category is
easily explained: inequality refers to all those discriminations, more or less evident,
concerning the field of education or research. Among these inequalities are, for
example, the difference in the conditions of possibility between north and south, the
quality of the services offered by formal educational contexts, and inequality which
also affects job opportunities at the end of the course of study. With regard to the
second macro-category, which we have called “research value”, in this case the label
identifies a double meaning. On the one hand, it identifies all the evaluation methods
that involve the world of education and research in our country, such as, for example
VQR or the National Academic Qualification. On the other hand, it concerns the value,
in the proper meaning of the term, given to the concept of research. What value does
research have? Is it possible to use research data to address the political decision-
maker? Do we believe in the value of research?
The results of the analysis carried out on critical topics leave ample space for
debate; certainly the connotations of the reactions ought to be taken into consideration.

8 Conclusions

This study follows a multidisciplinary approach: starting from a large textual dataset
produced using web mining techniques, the authors built an interactive visual tool. It
was possible to identify and analyze topics extracted from data mined from Facebook
conversations and to understand the connotation of the interactions in the ROARS
community. Individual knowledge and collective knowledge should support each
other. Rather than creating performance goals, learning communities produce artifacts
and histories that aid in the transfer of knowledge and the increase of understanding of
common problems or tasks. In fact, at any moment and from any place, each member
of the online community can access and mediate within a single exchange. Such
collaborative construction directly affects the actions involved in the interaction and,
indirectly, the whole community. Despite evidence of such constructive collabora-
tion in the ROARS community, it is necessary to further investigate the relationship
between posts and comments in order to understand the actual value of the practice
community from an educational point of view.

References
1. Ahmad, A.N.: Is Twitter a useful tool for journalists? J. Media Pract. 11(2), 145–155 (2010)
2. Sheffer, M.L., Schultz, B.: Paradigm shift or passing fad? Twitter and sports journalism. Int.
J. Sport Commun. 3(4), 472–484 (2010)
3. Hermida, A.: # Journalism: reconfiguring journalism research about Twitter, one tweet at a
time. Digit. J. 1(3), 295–313 (2013)
4. Shah, D.V., Cho, J., Eveland, W.P., Kwak, N.: Information and expression in a digital age:
modeling Internet effects on civic participation. Commun. Res. 32(5), 531–565 (2005)
5. McLuhan, M., Nevitt, B.: Take Today; The Executive as Dropout. Harcourt Brace
Jovanovich, New York (1972)
6. Toffler, A.: The Third Wave. Bantam Books, New York (1980)
7. Lévy, P.: L’intelligenza collettiva. Per un’antropologia del cyberspazio. Feltrinelli, Milano
(1996)
8. Pea, R.D.: Seeing what we build together: distributed multimedia learning environments for
transformative communication. J. Learn. Sci. 13(3), 285–299 (1994)

9. Preece, J.J.: Online Communities: Designing Usability, Supporting Sociability. Wiley, England (2000)
10. Jonassen, D.H., Peak, K.L., Wilson, B.J.: Learning with Technology: A Constructivist
Perspective. Paperback, New York (1999)
11. Chism, N.: Handbook for instructor on the use of electronic class discussion. Office of
Faculty and TA Development, Ohio State University, Columbus, OH (1998)
12. Jenlink, P., Carr, A.A.: Conversation as a medium for change in education. Educ. Technol.
36(1), 31–38 (1996)
13. Knowles, M., Holton, E., Swanson, R.: The Adult Learner: The Definitive Classic in Adult
Education and Human Resource Development, 5th edn. Gulf Publishing, Houston (1998)
14. Siemens, G.: Connectivism: a learning theory for the digital age. Int. J. Instr. Technol. Dist.
Learn. 2(1) (2005) http://www.itdl.org/journal/jan_05/article01.htm
15. Scardamalia, M., Bereiter, C.: Computer support for knowledge-building communities.
J. Learn. Sci. 3(3), 265–283 (1994)
16. Gherardi, S., Nicolini, D.: The organizational learning of safety in communities of practice.
J. Manage. Inquiry 9(1), 7–18 (2000)
17. Vitale, P., Guarasci, R., Iannotta, I.S.: Visualizing research topics in Facebook conversa-
tions. Multidisc. Digit. Publ. Inst. Proc. 1(9), 895 (2017)
18. Wallach, H.M.: Topic modeling: beyond bag-of-words. In: Proceedings of the 23rd
International Conference on Machine Learning, pp. 977–984. ACM, New York (2006)
19. Shneiderman, B.: The eyes have it: a task by data type taxonomy for information
visualizations. In: Proceedings of IEEE Symposium on Visual Languages, pp. 336–343.
IEEE (1996)
20. Jurafsky, D., Martin, J.H.: Speech & Language Processing. Pearson Education, London
(2008)
21. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent dirichlet allocation. J. Mach. Learn. Res. 3, 993–
1022 (2003)
22. Blei, D.M., Lafferty, J.D.: Dynamic topic models. In: Proceedings of the 23rd International
Conference on Machine Learning, pp. 113–120. ACM, New York (2006)
23. Gretarsson, B., O’Donovan, J., Bostandjiev, S., Höllerer, T., Asuncion, A., Newman D.,
et al.: TopicNets: visual analysis of large text corpora with topic modeling. ACM Trans.
Intell. Syst. Technol. (TIST) 3(2), 1–26 (2012). http://www.datalab.uci.edu/papers/topicnets.
pdf
24. Havre, S., Hetzler, E., Whitney, P., Nowell, L.: Themeriver: visualizing thematic changes in
large document collections. IEEE Trans. Visual Comput. Graph. 8(1), 9–20 (2002)
25. Wei, F., Liu, S., Song, Y., Pan, S., Zhou, M. X., Qian, W., et al: TIARA: a visual
exploratory text analytic system. In: Proceedings of the 16th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, pp. 153–162. ACM, New York
(2010)
26. Cui, W., Liu, S., Tan, L., Shi, C., Song, Y., Gao, Z.J., et al.: Textflow: Towards better
understanding of evolving topic in text. IEEE Trans. Vis. Comput. Graph. 17(12), 2412–
2421 (2011)
27. Liu, X., Derudder, B., Csomós, G., Taylor, P.: Featured graphic: mapping shifting
hierarchical and regional tendencies in an urban network through alluvial diagrams. Environ.
Plann. A 45(5), 1005–1007 (2013)
28. Riehmann, P., Hanfler, M., Froehlich, B.: Interactive Sankey diagrams. In: Proceedings of
IEEE Symposium on Information Visualization, INFOVIS, 233–240 IEEE (2005)
29. Shneiderman, B., Wattenberg, M.: Ordered treemap layouts. In: INFOVIS. IEEE (2001)
Proposal of Transesophageal Echo
Examination Support System by CT Imaging

H. Takahashi1(&), T. Katoh1(&), A. Doi1(&), M. Hozawa2, and Y. Morino2

1 Graduate School of Software and Information Science, Iwate Prefectural University,
152-52 Sugo, Takizawa, Iwate 020-0611, Japan
{t-hiroki,kato_t,doia}@iwate-pu.ac.jp
2 Internal Medicine Course, Iwate Medical University, 19-1 Uchimaru,
Morioka, Iwate 020-8505, Japan
maiyan117@gmail.com, ymorino@smile.email.ne.jp

Abstract. Transesophageal echocardiography and CT imaging have been used


to provide definite diagnoses of cardiac diseases, such as angina and myocardial
infarction. Transesophageal echocardiography has been performed by manually
adjusting probe depth and ultrasound irradiation angle while referring to the
echo image. However, it is difficult to grasp the three-dimensional (3D) position
of the heart with echo images alone. Moreover, this takes a long time and puts a
heavy burden on patients and doctors. Therefore, we propose a new method in
order to smoothly create a preoperative plan. The proposed method replaces
conventional transesophageal echocardiography with CT images. The proposed
system can inspect CT images interactively and with a shorter examination time.
Moreover, unlike transesophageal echocardiography, there is no burden on the
patient when using the proposed method.

1 Introduction

Transesophageal echocardiography is a method of providing definite diagnoses of


cardiac diseases, such as angina and myocardial infarction. Transesophageal echocar-
diography has been performed by manually adjusting the probe depth and the ultra-
sound irradiation angle while referring to an echo image. The doctor operates the probe
while checking the condition of the patient. The probe can be rotated “back and forth”
and “left and right” by turning the curved knobs clockwise and counterclockwise. The
echo image in transesophageal echocardiography is generated using the ultrasonic
echo from the tip of the probe inserted in the esophagus.
However, echo images have low image quality, and it is difficult to capture the
position and shape of parts accurately. Moreover, conventional transesophageal
echocardiography takes a long time and places a heavy burden on patients and doctors.
Therefore, we herein propose a new method in order to smoothly create a preoperative
plan. A more accurate diagnosis can be realized by creating a pseudo echocardiogram
from a CT image. Moreover, transesophageal echocardiography is invasive to the
patient. If transesophageal echocardiography can be replaced by CT imaging, then the


invasiveness with respect to the patient and the burden on the doctor can be reduced,
which is a great advantage.
The remainder of the present paper is organized as follows. Section 2 introduces
the proposed virtual transesophageal echo examination support system. The image
filtering algorithm for echo cardiogram display is explained in Sect. 3. We explain how
to operate the pseudo probe in the proposed system in Sect. 4. Section 5 details our
evaluation and its results. Section 6 concludes the paper with a summary and a view of
future research.

2 Transesophageal Echo Examination Support System

This section introduces the proposed transesophageal echo examination support sys-
tem. The proposed system has the following functions: (1) display of CT section
imitating the echocardiogram echo image, (2) presentation of echo display location on
a volume rendering display, and (3) measurement on a pseudo echocardiogram from
the CT image. Figure 1 shows the user interface.
Display of a CT Section Imitating an Echocardiogram Image
The echocardiogram image is a fan-shaped image centered on the probe position shown
in Fig. 1. The proposed system imitates an echocardiogram and generates a CT cross-
section (hereinafter referred to as a CT echo image) of the specified range and angle
from the virtual probe position. Using this function, it is possible to facilitate the
correspondence between echocardiography and CT echo images.
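The paper does not detail the geometry handling of this function, but the idea of sampling a fan of rays from the virtual probe position can be sketched as follows; the NumPy volume indexing, the nearest-voxel sampling and the default angles and ranges are assumptions rather than the system's actual implementation.

# Sketch: sample a fan-shaped CT cross-section (pseudo echo image) from a CT volume.
# Assumes `volume` is a 3D NumPy array, and that the probe position, a unit view
# direction and a unit in-plane axis are given in voxel coordinates.
import numpy as np

def fan_section(volume, probe, view_dir, in_plane,
                fan_angle_deg=90.0, n_rays=256, n_samples=256, max_depth=200.0):
    nx, ny, nz = volume.shape
    angles = np.deg2rad(np.linspace(-fan_angle_deg / 2, fan_angle_deg / 2, n_rays))
    depths = np.linspace(0.0, max_depth, n_samples)
    image = np.zeros((n_samples, n_rays), dtype=volume.dtype)
    for j, a in enumerate(angles):
        # Ray direction rotated within the plane spanned by view_dir and in_plane.
        ray = np.cos(a) * view_dir + np.sin(a) * in_plane
        for i, t in enumerate(depths):
            p = np.rint(probe + t * ray).astype(int)   # nearest-voxel sampling
            if 0 <= p[0] < nx and 0 <= p[1] < ny and 0 <= p[2] < nz:
                image[i, j] = volume[p[0], p[1], p[2]]
    return image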
Presentation of Echo Display Location on a Volume Rendering Display
This function displays the fan area generated by the beam simulation in the same 3D
CT image space as the volume rendering display, as shown in Fig. 2. Using this
system simultaneously with actual echocardiography, it becomes possible for the
doctor to accurately grasp the current position of the probe being operated in three
dimensions. However, there are cases in which CT echo images become difficult to
read depending on the state of volume rendering and the position of the probe.
Therefore, we extended the function whereby the doctor can optionally set the trans-
parency and transfer function of volume rendering.

Fig. 1. User interface of the proposed system
Fig. 2. Echo display location on a volume rendering display

Measurement on a Pseudo Echocardiogram from the CT Image


This function can measure the distance and the area on a virtual echo image (i.e., a CT
image cut at an arbitrary plane), as shown in Fig. 3. The start and end points for
distance measurement can be displayed on virtual echo images in volume rendering. In
this system, the user designates an area with line segments on the virtual echo image,
and the system automatically calculates the enclosed area.
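The exact measurement method is not specified in the paper; as a hedged sketch, the distance can be taken as the Euclidean length of the drawn segment and the enclosed area computed with the shoelace formula over the outline vertices (coordinates in mm).

# Sketch: distance and polygon-area measurement on a 2D slice (coordinates in mm).
# The shoelace formula is one standard way to obtain the area enclosed by the
# user-drawn outline; the paper does not state the exact method used.
import math

def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def polygon_area(points):
    # points: list of (x, y) vertices of the closed outline, in drawing order.
    area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

print(distance((0, 0), (30, 40)))                          # 50.0 mm
print(polygon_area([(0, 0), (40, 0), (40, 30), (0, 30)]))  # 1200.0 mm^2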

Fig. 3. Virtual echo image and measurement (distance and area) (unit: mm)

3 Image Filtering Algorithm for the Echo Cardiogram


Display

Even if the same region is displayed, echocardiography and CT images look com-
pletely different. By filtering the CT image and making it look similar to the
echocardiographic image, the echocardiographic image and the CT image can be easily
contrasted. The structure of this image filter is as follows. Let k be the position of the
starting pixel of the beam in the pseudo echo image generated on the two-dimensional
image. In addition, let (x, y) be the position of an arbitrary pixel, and let c(x, y) be the
luminance value of this pixel. The distance between k and (x, y) can be calculated as
follows:

l = \sqrt{(x_k - x_c)^2 + (y_k - y_c)^2}    (1)

The sampling range n is calculated as follows:

n = \begin{cases}
  10\left(\frac{l}{w}\right)^2 & \text{if } 10\left(\frac{l}{w}\right)^2 \bmod 2 = 1 \\
  3 & \text{if } 10\left(\frac{l}{w}\right)^2 < 3 \\
  10\left(\frac{l}{w}\right)^2 + 1 & \text{if } 10\left(\frac{l}{w}\right)^2 \bmod 2 = 0
\end{cases}    (2)

where n is proportional to l. Therefore, a wider range of sampling is performed for


pixels far from the beam center. In addition, w is the width of the pseudo echo image,
and the unit is the pixel. The sampling interval s of this filter is then calculated as:

s = \begin{cases}
  \frac{n}{r} & \text{if } \frac{n}{r} \geq 1 \\
  1 & \text{if } \frac{n}{r} < 1
\end{cases}    (3)

where r is a constant. As s approaches 1, a smoother image is obtained. Then, P is


defined as follows:

P = \frac{n - 1}{2s}    (4)
The pixel c' is calculated as follows:

c'(x, y) = \frac{\sum_{i=-P}^{P} \sum_{j=-P}^{P} c(x + si,\, y + sj)}{(2P + 1)^2}    (5)

This is a smoothing filter with sampling interval s, in which the range in the x and
y directions is [− Ps, Ps]. The difference in the results when the sampling interval is
changed is shown in Figs. 4 and 5. This filter reproduces the state in which the intensity
has extended away from the probe position, as in the original echo image shown in
Fig. 6. This smoothing filter is fast and suitable for interactive operation.
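A direct, hedged reading of Eqs. (1)–(5) can be sketched in Python as follows; variable names follow the equations, (x_k, y_k) is the beam start pixel, and the clamping of samples to the image border is our assumption, since boundary handling is not specified in the paper.

# Sketch of the distance-dependent smoothing filter of Eqs. (1)-(5).
# img is a 2D NumPy array, (xk, yk) the beam start pixel, w the image width,
# r the constant of Eq. (3). Clamping at the border is an assumption.
import numpy as np

def pseudo_echo_filter(img, xk, yk, r=4.0):
    h, w = img.shape
    out = np.empty_like(img, dtype=float)
    for y in range(h):
        for x in range(w):
            l = np.hypot(xk - x, yk - y)                    # Eq. (1)
            base = 10.0 * (l / w) ** 2
            if base < 3:                                    # Eq. (2)
                n = 3
            elif int(base) % 2 == 1:
                n = int(base)
            else:
                n = int(base) + 1
            s = max(n / r, 1.0)                             # Eq. (3)
            P = int((n - 1) / (2 * s))                      # Eq. (4)
            # Eq. (5): average over samples spaced s apart in x and y.
            total = 0.0
            for i in range(-P, P + 1):
                for j in range(-P, P + 1):
                    xi = min(max(int(round(x + s * i)), 0), w - 1)
                    yj = min(max(int(round(y + s * j)), 0), h - 1)
                    total += img[yj, xi]
            out[y, x] = total / ((2 * P + 1) ** 2)
    return out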

Fig. 4. Case of sampling with a high density
Fig. 5. Case of sampling with low density

Fig. 6. Original echo image



4 Pseudo Probe Operation


4.1 Position Specification
Figure 7 shows an image of actual probe movement and how it is handled in the
proposed system. Probe operation includes movement in the depth direction, move-
ment on the CT slice, and rotation of the beam irradiation direction. Since trans-
esophageal echocardiography is basically performed by inserting a probe into the
esophagus, movement in the depth direction is the only positional parameter.
However, it is difficult to recognize the esophagus in the CT image, so the
pseudo probe cannot be moved along the esophagus in the CT image. Therefore,
we implemented the function so that parallel movement can be performed on the slice
where the probe exists, with the depth corresponding to movement in the slice direction
of the CT image. Thereby, fine adjustment can be performed easily. Only the initial
position of the virtual probe is calculated automatically, by detecting the vena cava
in the CT image [1].

Fig. 7. Probe movement path

4.2 Rotation Matrix


In this system, two rotation methods have been implemented for the pseudo probe.
The first is (1) Euler angle rotation, and the second is (2) local coordinate rotation of
the pseudo echocardiography. In an actual examination, the degree of rotation is
specified according to the “sense” of the doctor. Therefore, it is desirable to use local
coordinate rotation (Function 2), where rotation from the current position can be
performed intuitively. On the other hand, when reproducibility of the user operation is
required, it is also possible to switch to Euler angle rotation (Function 1), which is
easier to calculate. For Euler angle rotation, the rotation matrix R, calculated as
follows, is applied to the initial state of the pseudo probe:

R = R_x R_y R_z    (6)

R_x(\theta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}    (7)

R_y(\theta) = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix}    (8)

R_z(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}    (9)

In the case of relative rotation, the following calculation is performed. Here, n is a
rotation axis of length 1, and R(n) is the Rodrigues rotation matrix for a rotation about
axis n by θ, where

n = (n_1, n_2, n_3)    (10)

R(n) = \begin{pmatrix}
\cos\theta + n_1^2(1 - \cos\theta) & n_1 n_2 (1 - \cos\theta) - n_3 \sin\theta & n_1 n_3 (1 - \cos\theta) + n_2 \sin\theta \\
n_2 n_1 (1 - \cos\theta) + n_3 \sin\theta & \cos\theta + n_2^2 (1 - \cos\theta) & n_2 n_3 (1 - \cos\theta) - n_1 \sin\theta \\
n_3 n_1 (1 - \cos\theta) - n_2 \sin\theta & n_3 n_2 (1 - \cos\theta) + n_1 \sin\theta & \cos\theta + n_3^2 (1 - \cos\theta)
\end{pmatrix}    (11)

When the current rotation axes are x_a, y_a, and z_a (where x_a \perp y_a \perp z_a and
x_a \neq y_a \neq z_a), the new rotation axes after a rotation about x_a are expressed as follows:

x_a' = x_a    (12)

y_a' = R(x_a) y_a    (13)

z_a' = R(x_a) z_a    (14)
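The two rotation modes can be sketched with NumPy as follows; the Rodrigues matrix of Eq. (11) is written in its equivalent skew-symmetric form, and the example axes and angle are illustrative values, not the system's defaults.

# Sketch of the two rotation modes of Eqs. (6)-(14) using NumPy.
import numpy as np

def euler_rotation(tx, ty, tz):
    # R = Rx(tx) Ry(ty) Rz(tz), Eqs. (6)-(9); angles in radians.
    cx, sx = np.cos(tx), np.sin(tx)
    cy, sy = np.cos(ty), np.sin(ty)
    cz, sz = np.cos(tz), np.sin(tz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rx @ Ry @ Rz

def rodrigues(axis, theta):
    # Rotation about a unit axis n by theta, equivalent to Eq. (11).
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    K = np.array([[0, -n[2], n[1]], [n[2], 0, -n[0]], [-n[1], n[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

# Relative rotation of the probe's local frame about its own x axis, Eqs. (12)-(14).
xa, ya, za = np.eye(3)                  # current local axes (illustrative values)
R = rodrigues(xa, np.deg2rad(15))
xa2, ya2, za2 = xa, R @ ya, R @ za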

5 Evaluation

Existing transesophageal echo apparatus does not offer simultaneous display of the
beam range with volume rendering, nor the distance measurement function.
A subjective evaluation of this system was performed by one doctor. The doctor
performed the three functions shown in Sect. 2. The functions are (1) display of the CT
section imitating the echocardiogram echo image, (2) presentation of echo display
location on a volume rendering display, and (3) measurement on a pseudo echocar-
diography from a CT image.

Smooth real-time operation was realized for all functions, even though the user interface
is different from actual probe operation. Transesophageal echocardiography places a
heavy burden on the patient, and the doctor performing the examination requires a lot of
time and attention. The doctor using this system reported its effectiveness.
On the other hand, the echocardiographic filter was not faithfully reproduced. This
requires further improvement of the reproduction method.

6 Conclusion

A transesophageal echo examination support system is proposed in the present study.


The proposed system replaces conventional transesophageal echocardiography with CT
images. The proposed system can inspect CT images interactively. Moreover, the
examination time is shorter, and there is no burden on the patient compared to trans-
esophageal echocardiography.
In order to improve the quality of medical examination by transesophageal
echocardiography, simulated transesophageal echocardiography was created using the
special smooth filter from CT images. The system has the following three functions:
(1) display of CT section imitating echocardiogram echo image, (2) presentation of
echo display location on a volume rendering display, and (3) measurement on a pseudo
echocardiogram from a CT image.
Existing transesophageal echo apparatus does not offer simultaneous display of the
beam range with volume rendering, nor the distance measurement function. In order to
evaluate the practicality of this system, a few doctors actually used its functions and
made a subjective evaluation. Smooth operation was realized for all
functions, even if the user interface was different from the actual probe operation. The
echocardiographic filter was not faithfully reproduced. Further improvement of the
reproduction method will be performed in the future. In addition, we intend to proceed
with the implementation of a more intuitive interface reflecting the opinions of doctors
who actually use the system.

Acknowledgments. The present study was supported by JSPS KAKENHI Grant Number
26350541 (Grant-in-Aid for Scientific Research C) “Basic research on pre-operative and intra-
operative support system for tailor-made medical treatment” and the Public Interest Foundation
of the Japan Keirin Association.

References
1. Sekimura, S., Doi, A., Kato, K., Hozawa, M., Morino, Y.: An extraction method of coronary
artery and left ventricle from heart CT images based on hough transformation and region
growing. In: The Twelfth 2017 International Conference on Knowledge, Information and
Creativity Support Systems (KICCS 2017), November 2017
2. Ishizuka, N., Ashihara, K.: Transesophageal echocardiography and diagnosis Atlas Series
ultrasonic knitting vector core, March 2012. (in Japanese)
3. Okamoto, H., Sugimura, S.: Transesophageal echocardiography - basic method and training
of how to take, how to diagnose Yodosha, 1 June 2011. (in Japanese)
A Study of 3D Shape Similarity Search
in Point Representation by Using
Machine Learning

Hideo Miyachi(&) and Koshiro Murakami

Department of Information Systems, Tokyo City University,


3-3-1 Ushikubo-nishi, Tsuzuki-ku, Yokohama City, Japan
{miyachi,g1672095}@tcu.ac.jp

Abstract. 3D shape similarity search has been studied for detecting or finding
a specified 3D model in a 3D CAD model database. We propose to use 3D
point data for the search, because it has become easier to obtain 3D point data
from photographs. CAD data is converted into 3D point data in advance. Then, using
machine learning, we attempt to match those data with the 3D point data
acquired in the field. It can be expected that the accuracy of matching is
improved by directly handling 3D data. As a preliminary trial, we have tried to
classify 10 kinds of chair models represented as 3D point data with a machine
learning approach. The results suggest that 3D shape matching between 3D point
data sets is possible with the proposed method.

1 Introduction

In the manufacturing maintenance field, it is often required to identify the exact product
name of a real object in front of you. Currently, this identification is carried out using
two-dimensional images. A photograph taken at the product operation site is sent to a
service center, and a person who is familiar with the product recognizes its features and
identifies the product identification number, as shown in Fig. 1.
So far, much image processing research and development has been performed
to support this work, and it has achieved certain results. However, since the
acquisition of 3D shapes has become easier, we have examined whether the
matching work can be performed effectively using 3D point data. Therefore, by
adding point cloud data to conventional three-dimensional CAD data, we have started
to study a fundamental system that handles both design data and measurement data
efficiently.
As a first step, we investigated the possibility of 3D shape similarity search
using 3D point cloud data with machine learning. In this paper, we report the
concept and the preliminary test results.


Fig. 1. Current product identification method

2 Goal of This Study

The final goal is to develop a new data management framework for PLM (Product
Lifecycle Management) which can handle both surface and point data. High-speed,
high-precision laser measurement and 3D shape measurement from photographs are
generally used, and a large amount of point data is being acquired [1–3]. However, at
present, point data is transient data for obtaining surface data for reverse engineering
[4], and it is rare to use the points as they are. Therefore, we are considering how to use
the point data as it is.
We have developed a technology to convert surface data to point data at high speed
[5–7]. Using this technology, we plan to convert CAD data to point data and use the
point data as a subset of CAD data. If the product can be identified by using the point
data measured in the field and the point data that is a subset of CAD data, it is not
necessary to convert point data back to surface data, as shown in Fig. 2. In the future,
when 3D measurement becomes easier, 3D shape search based on point data has the
potential to make maintenance work in the manufacturing industry more effective.

Fig. 2. Product search by point data



3 Related Works

3D shape recognition has long been studied, and various methods have been proposed,
such as a histogram-based method [8] in Fig. 3(a), a visual-similarity-based method [9]
in Fig. 3(b), and others [10, 11]. In this research, we propose 3D shape similarity
search by introducing machine learning into the histogram-based method. In the
histogram-based method, as the first stage, points are generated randomly on the object
surface. Thereafter, as the key for the 3D shape similarity search, a histogram of the
distances from a certain point (for example, the center of gravity), or a histogram of the
distances between all pairs of points, is calculated [12].


Fig. 3. Typical methods of 3D shape similarity search (a) Histogram-based method (b) visual-
similarity-based method: the picture indicates the method by using images viewed from all
directions in 360°

In this study, we apply a fast point generation technology we developed to the stage of
point generation on the object surface. Our point generation method can generate point
data in various resolutions for the object while changing two parameters. Since the
proposed method is fast, it is easy to generate a lot of point data sets quickly for one
object. Taking advantage of this characteristic, a large number of point cloud data sets can
be generated and used to learn the histograms for machine learning. We used two kinds
of distance histograms. The first is the histogram of distances from the center of gravity
of all points to all points, and the second is the histogram of distances from the point
farthest from the center of gravity to all points. In this paper, the former is called the H1
histogram and the latter the H2 histogram. The next section describes our point generation method.

4 Proposed Point Generation Method

In this section, we introduce our method [5–7] for point generation from a surface
object. It uses the rendering process of computer graphics. The output of rendering with
the Z-buffer algorithm is stored separately in memory areas called the frame buffer and
the Z buffer: the color information is stored in the frame buffer and the depth
information in the Z buffer, as shown in Fig. 4.

Fig. 4. Color and depth information in graphics memory after Z buffer rendering.

Combining these two kinds of data, 2.5 dimensional point data can be obtained.
By rendering the object from various viewpoints, it is possible to acquire many
kinds of 2.5 dimensional point data viewed from various directions as shown in Fig. 5.
Furthermore, combining all of 2.5 dimensional point data, 3D point data to represent
entire 3D object can be obtained. In the processes of Figs. 4 and 5, there is one
parameter each which can be determined arbitrarily. In Fig. 4, the display resolution,
that is, the product of the vertical and horizontal image pixel sizes of the frame buffer or
Z buffer, becomes a parameter for determining the density of the point data generated. In
Fig. 5, the number of directions from which 2.5-dimensional data is to be acquired, that
is, the number of viewpoints, is another parameter for determining the density of the
point data generated.

Fig. 5. Reconstructing 3D point data by combining of multiple 2.5D point data sets.
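As a simplified illustration of this idea (the actual converter reads the renderer's frame and Z buffers directly), the sketch below unprojects a depth image into 2.5D points under an assumed pinhole camera model and concatenates several views; the camera parameters and the NaN marking of background pixels are assumptions.

# Sketch: turn a depth image (Z-buffer contents) into 2.5D points and merge views.
# Assumes a pinhole model with focal length f and principal point (cx, cy), and that
# background pixels are marked NaN/inf; the real system works on the graphics buffers.
import numpy as np

def depth_to_points(depth, f, cx, cy, cam_to_world):
    ys, xs = np.nonzero(np.isfinite(depth))          # keep pixels that hit the object
    z = depth[ys, xs]
    x = (xs - cx) * z / f
    y = (ys - cy) * z / f
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)
    return (pts_cam @ cam_to_world.T)[:, :3]         # 2.5D points in world coordinates

def merge_views(depth_maps, intrinsics, poses):
    # One 2.5D set per viewpoint; concatenating them approximates the full 3D object.
    clouds = [depth_to_points(d, *k, pose)
              for d, k, pose in zip(depth_maps, intrinsics, poses)]
    return np.concatenate(clouds, axis=0)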

5 Experiment
5.1 Experiment Procedure
The outline of the experiment procedure is illustrated in Fig. 6. Our goal is to search a
CAD database of 3D shapes represented as point data for shapes similar to the point
data measured in the field, as shown in Fig. 2; here, however, we study the feasibility of
searching 3D point data using a histogram approach with machine learning.

Fig. 6. Outline of experiment procedure

We prepared 10 kinds of chair data in STL format by downloading them from free
model-providing sites. Those STL data were converted to point data by our point
generation method. At that time, 100 resolution-parameter settings (the display
resolution in Fig. 4) and 3 viewpoint settings (Fig. 5) were applied to each STL file.
Thus, 300 point data files were generated for each chair model, and 3000 point data
files were obtained in total.
Then the H1 and H2 histograms were calculated as the key features of each object.
Finally, using these histograms, the 3D shape search ability with machine learning
was tested.

5.2 Experiment Data


The appearance of ten types of chairs used for the experiment is shown in Fig. 7.
Table 1 shows the model name and download site information for each chair. The
chair models are provided in STL format and were converted to point data by our
conversion technology described in Sect. 4. The resolution parameter was set to
N × N pixels, with N changed from 200 to 299 in steps of 1. The other parameter, the
number of viewpoints, was set to 6, 18 or 26 directions.

Fig. 7. Appearance of ten types of chairs

Table 1. Name and download site (Database) for each chair. Database 1: 3Delicious [13], 2:
Free3D [14]
Model ID Name Database
1 13494_Folding_Chairs_v1_L3.stl 2
2 ArmchiairHollyHunt130518.stl 1
3 ArmchairclassicN120518.stl 1
4 Chair.stl 2
5 ChairMulanCattelanItaliaN040519.stl 1
6 ChairbarN230918.stl 1
7 ChairditamoderrnN220419.stl 1
8 OldChair.stl 2
9 SeatINARTN040519.stl 1
10 obensandalye.stl 2

The H1 and H2 histograms were calculated from those point data. Each histogram was
divided into 1000 bins from 0 to a maximum value, defined as the average of the
maximum values over the data set, and the frequencies were normalized as percentages.
All the H1 histograms of the chairs are shown in Fig. 8. Each graph contains 900
histograms. From this figure, it can be seen that even when the two point-generation
parameters are changed, the general tendency remains the same, although the details differ.
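A minimal sketch of the H1/H2 computation, assuming the points are an N x 3 NumPy array and that the common maximum distance has already been determined for the data set:

# Sketch of the H1/H2 keys: distance histograms over a point set `pts` (N x 3).
# 1000 bins from 0 to a fixed maximum, frequencies normalized to percentages.
import numpy as np

def distance_histogram(pts, origin, max_dist, bins=1000):
    d = np.linalg.norm(pts - origin, axis=1)
    hist, _ = np.histogram(d, bins=bins, range=(0.0, max_dist))
    return 100.0 * hist / hist.sum()                  # percentage frequencies

def h1_h2(pts, max_dist):
    g = pts.mean(axis=0)                              # center of gravity
    h1 = distance_histogram(pts, g, max_dist)         # H1: distances from the center of gravity
    far = pts[np.argmax(np.linalg.norm(pts - g, axis=1))]
    h2 = distance_histogram(pts, far, max_dist)       # H2: distances from the farthest point
    return h1, h2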

Fig. 8. H1 histograms for each chair type

5.3 Machine Learning


Deep learning [15] was used for the classification of the chairs. Since the distribution of
the histograms handled here changes continuously, a one-dimensional CNN, as shown
in Fig. 9, was used [16–18].

Fig. 9. Machine learning model



The CNN consists of two convolution blocks and one fully connected block. Each
convolution block consists of two convolution layers (filter size = 3, stride = 1) and
one max-pooling layer. The fully connected block reduces the number of parameters
from 1024 to 10. ReLU (Rectified Linear Unit) is used as the activation function.
Batch Normalization [19] and Dropout [20] (50%) are used to avoid overfitting. The
program is built on Keras in Python. Of the 3000 experiment data sets, 2700 were used
for training and 300 for evaluation.
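A hedged sketch of such a network in Keras is given below; the paper fixes the filter size, stride, pooling, Batch Normalization and 50% Dropout, while the filter counts and the dense-layer width used here are assumptions.

# Sketch of the 1D CNN described above, in Keras; filter counts are assumptions,
# since the paper specifies only the filter size, stride and block structure.
from tensorflow import keras
from tensorflow.keras import layers

def build_model(input_len=1000, n_classes=10):
    m = keras.Sequential([
        keras.Input(shape=(input_len, 1)),
        # First convolution block
        layers.Conv1D(32, 3, strides=1, padding="same", activation="relu"),
        layers.Conv1D(32, 3, strides=1, padding="same", activation="relu"),
        layers.MaxPooling1D(),
        layers.BatchNormalization(),
        # Second convolution block
        layers.Conv1D(64, 3, strides=1, padding="same", activation="relu"),
        layers.Conv1D(64, 3, strides=1, padding="same", activation="relu"),
        layers.MaxPooling1D(),
        layers.BatchNormalization(),
        # Fully connected block
        layers.Flatten(),
        layers.Dense(1024, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(n_classes, activation="softmax"),
    ])
    m.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
    return m

model = build_model()
# model.fit(x_train, y_train, epochs=50, validation_data=(x_val, y_val))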

5.4 Experimental Result


In the experiment, the correct answer rate over the 300 evaluation trials was 100% and
98.3% for the H1 and H2 histograms, respectively. Table 2 shows the response matrix
for the H2 histograms, presenting the pairs of answer and response as a matrix. The
answer indicates the correct model Id for the histogram, and the response indicates the
model Id returned by the machine learning model. Thus, the numbers on the diagonal
of the matrix indicate the number of correct answers, and the off-diagonal numbers
indicate the number of incorrect answers. For example, the cell at “Answer” = 4 and
“Response” = 10 contains 1, meaning that the program answered 10 where the correct
answer was 4.

Table 2. Response matrix diagram for the H2 histogram.


Model Id Response
1 2 3 4 5 6 7 8 9 10
Answer 1 31 0 0 0 0 0 0 0 0 0
2 0 36 0 0 0 0 0 0 0 0
3 0 0 31 0 0 0 0 0 0 0
4 0 0 0 23 0 0 0 0 0 1
5 0 0 0 0 32 0 0 0 0 1
6 0 0 0 0 0 27 0 0 1 0
7 0 0 0 0 0 0 25 0 0 0
8 0 0 0 0 0 0 0 28 0 0
9 0 1 0 0 0 0 0 0 27 1
10 0 0 0 0 0 0 0 0 0 35

6 Conclusion

This paper described a preliminary trial in which chairs were identified, with a machine
learning approach, from data sets of 3D point data generated from 3D geometry data.
We used two types of histograms as the key for the identification: the histogram of
distances from the center of gravity to all points, and the histogram of distances from
the point farthest from the center of gravity to all points. As a result, it was suggested
that 3D shape matching between 3D point data sets is possible by machine learning.
In this test, only 10 types of chair data were used. However, in order to show the
possibility of general shape recognition, it is necessary to identify various data. As the
next step, we will confirm the identification performance of simple primitives such as
spheres, cones, and cubes, and then investigate the identification performance for
complex and diverse 3D shapes such as animals, ships, and cars.

Acknowledgments. This research was financially supported by KAKENHI 17K00162. We


would like to thank ARK Information Systems, INC. for their work on sample programs for
machine learning.

References
1. Buonamici, F., Carfagni, M., Furferi, R., Governi, L., Lapini, A., Volpe, Y.: Reverse
engineering modeling methods and tools: a survey. Comput.-Aided Design Appl. 15(3),
443–464 (2018)
2. Agarwal, S., Furukawa, Y., Snavely, N., Simon, I., Curless, B., Seitz, S.M., Szeliski, R.:
Building rome in a day. Commun. ACM 54(10), 105–112 (2011)
3. Mizoguchi, T., Kuma, T., Kobayashi, Y., Shirai, K.: Manhattan-world assumption for as-
built modeling industrial plant. Key Eng. Mater. 523, 350 (2012)
4. Kashiwara, K., Kanai, S., Date, S., Kim, T.: Automatic recognition and modeling of piping
system from large-scale terrestrial laser scanned point cloud. In: ACDDE (2012)
5. Miyachi, H.: Data reduction by points extraction in consideration of viewpoint positions of
observer. Trans. VSJ 36(8), 40–45 (2016). (in Japanese)
6. Miyachi, H.: Quality evaluation of the data reduction method by point rendering. In: The 3rd
International Symposium on BioComplexity, pp. 986–990 (2018)
7. Miyachi, H.: Quality evaluation of 3D image represented by points. In: The 21st
International Symposium on Artificial Life and Robotics, pp. 686–689 (2016)
8. Ankerst, M., Kastenmüller, G., Kriegel, H.-P., Seidl, T.: 3D shape histograms for similarity
search and classification. In: Proceedings of 6th International Symposium on Spatial
Databases. Lecture Notes in Computer Science (1999)
9. Chen, D.-Y., Tian, X.-P., Shen, Y.-T., Ouhyoung, M.: On visual similarity based 3D model
retrieval. In: Computer Graphics Forum (EUROGRAPHICS 2003), vol. 22, no. 3, pp. 223–
232 (2003)
10. Li, Y., Li, W.: A survey of sketch-based image retrieval. Mach. Vis. Appl. 29(7), 1083–1100
(2018)
11. Tabia, H., Laga, H.: Covariance-based descriptors for efficient 3D shape matching, retrieval,
and classification. IEEE Trans. Multimed. 17(9), 1591–1603 (2015)
12. Osada, R., Funkhouser, T., Chazelle, B., Dobkin, D.: Shape distributions. ACM TOG 21(4),
807–832 (2002)
13. Download site 3 Delicious (2019). https://3delicious.net/
14. Download site Free3D (2019). https://free3d.com/ja/3d-model/folding-chairs-v1–612720.
html
15. Glorot, X., Bordes, A., Bengio, Y.: Domain adaptation for large-scale sentiment
classification: a deep learning approach. In: Proceedings of the Twenty-eight International
Conference on Machine Learning (ICML 2011), June 2011

16. Fukushima, K., Miyake, S.: Neocognitron: a new algorithm for pattern recognition tolerant
of deformations and shifts in position. Pattern Recognition 15(6), 455–469 (1982)
17. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document
recognition. Proc. IEEE 86(11), 2278–2324 (1998)
18. Krizhevsky, A., Sutskever, I., Hinton, G.E: ImageNet classification with deep convolutional
neural networks. In: Proceedings of NIPS (2012)
19. Hinton, G., Srivastava, N., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.R.: Improving
neural networks by preventing co-adaptation of feature detectors (2012)
20. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing
internal covariate shift (2015)
A Fog-Cloud Approach to Enhance
Communication in a Distance Learning Cloud
Based System

Lucas Larcher(&), Victor Ströele, and Mario Dantas

Federal University of Juiz de Fora (UFJF), Juiz de Fora, MG, Brazil


{lucas.larcher,victor.stroele,mario.dantas}@ice.ufjf.br

Abstract. IT infrastructures have changed to cloud computing since this model


allows omnipresent and on-demand access to share computing resources. Such
resources can be quickly provisioned and released with minimal management
effort. However, cloud computing has been challenged to scale as regards the
infrastructure of the systems. These challenges are reinforced in distance-learning
environments where a cloud infrastructure is adopted and the supporting off-campus
units, located in precarious regions, suffer from poor infrastructure. Consequently,
fog-cloud models have emerged to improve application performance using local
preprocessing of data and analysis of communication requirements.
Brazilian university that offers distance education. A low cost technological
solution is proposed using the Fog-Cloud computing paradigm. In order to
evaluate the proposal, an experiment using real data was carried out, and the
results obtained point to the viability of the proposal.

1 Introduction

Although still regarded as a solution to storage and processing problems, the use of a
simple cloud-computing infrastructure has become questionable. One problem faced by
this type of technological solution is the scalability and performance of the applications. With
this, other paradigms of computing systems, such as Fog Computing, have been
researched and employed in both industry and the academia. Examples of such an
interest are the papers and research presented in [1–3].
The interest in new paradigms has been growing, especially in the academic milieu
and companies with a large amount of geographically distributed data. The use of
Cloud Computing has shown different bottlenecks, such as latency and flow, which are
limited by the network. There are also security issues, large-volume data processing
and time constraints, as can be seen in research projects [4] and [5].
While Cloud Computing-based architectures focus on centralization, Fog Computing
approaches intend to decentralize the network by bringing data preprocessing
and communication requirements to the local network. Therefore, Fog Computing is a
decentralized computing architecture where data, communications, storage, applica-
tions, and management are distributed between the data source and the cloud [6].


Distance education courses in Brazil aim to reach students who are in regions
located far from the educational institution. In this way, remote centers or units, lit-
erally translated from Portuguese as “educational poles”, are located in small towns so
that students have access to school laboratories and local didactic support, without the
need to attend the main institution’s campus. Institutions that offer distance-learning
courses adopt platforms with a Cloud Computing architecture in which data storage and
processing are centralized in the main educational institution. In this process, teachers
and students interact with each other using an online platform.
The educational institution here under study has the system of “educational poles”
(henceforth referred to as ‘supporting off-campus units’ or ‘remote units’). These
remote units are geographically spread and are used for performing tasks in the
computational cloud, whose source platform is physically located in the main campus.
A problem arises when the quality of the Internet connection does not support the
volume of data in the network, which also negatively impacts the students’ assessment
of the off-campus unit, and therefore the user’s Quality of Experience (QoE).
With respect to this technological model based on cloud, distance education courses
in Brazil face difficulties in view of the precarious technological infrastructure of some
schools. In this context, the problem tackled in the present study is: How to improve the
infrastructure of the supporting off-campus units with the purpose of enabling the full
use of online education platforms, reducing communication failures and deploying a
low-cost computational solution?
In this context, this paper aims to provide educational institutions that offer courses in
the distance modality with a financially interesting solution to communication prob-
lems. The proposal was evaluated through an experiment in which fog-cloud
cooperation proves to be a distinguishing feature towards scalability and better
application performance.
The paper is organized in seven sections, the first one being the introduction and
presentation of the scope of the problem. The second section presents the theoretical
foundation, introducing the concepts of Cloud and Fog Computing. Section three
presents related work that aims to elucidate the state of the art. The fourth section
addresses the proposed solution, revealing the initial motivation of the study.
Subsequently, the experimental results are presented, and finally conclusions are drawn
and future work is suggested.

2 Theoretical Foundation

This section presents the theoretical basis necessary to understand the concepts
addressed in this study, considering the main research areas: Fog and Cloud computing
and Distance Learning.

2.1 Cloud and Fog Computing


The cloud is the idea of making every kind of service available on the internet without
any effort on the user’s part. One of the definitions found in the literature is as follows:
“Cloud Computing is a model for enabling ubiquitous, convenient, on-demand network
access to a shared pool of configurable computing resources (e.g., networks, servers,


storage, applications, and services) that can be quickly provisioned and released with
minimal management effort or service provider interaction.” [13].
The concept of Fog Computing is defined by Cisco in [1] as an extension of the
cloud to maximize the potential of the Internet of Things (IoT). Fog physically brings
the cloud closer to the applications, decreasing the number of hops and giving greater
availability. In practice, fog nodes are added geographically close to devices
that use the cloud with the intention of performing some type of activity before the data
is passed to the cloud.
A practical example in the IoT environment is sensors that perform measurements
that matter only when there is a significant change in the data, such as a person’s
heartbeat in an assisted living environment. There is no need to send information to the
cloud when the heart rate is normal, but if there is a change then a doctor should be
contacted right away.
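A minimal sketch of this fog-side filtering idea in Python; the heart-rate bounds and the send_to_cloud callback are illustrative assumptions, not part of the cited architecture.

# Sketch of the fog-side filtering described above: readings are processed locally
# and only significant changes are forwarded to the cloud. The thresholds and the
# send_to_cloud() callback are illustrative assumptions.
NORMAL_RANGE = (50, 110)          # assumed resting heart-rate bounds (bpm)

def handle_reading(bpm, send_to_cloud):
    if NORMAL_RANGE[0] <= bpm <= NORMAL_RANGE[1]:
        return False              # normal value: keep it at the fog node
    send_to_cloud({"event": "abnormal_heart_rate", "bpm": bpm})
    return True                   # forwarded so that a doctor can be alerted

# Example: only the second reading leaves the local network.
for value in (72, 150):
    handle_reading(value, print)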
In the literature there are several definitions, but for a clearer understanding,
“Clouds are distributed technology platforms that leverage sophisticated technology
innovations to provide highly scalable and resilient environments that can be remotely
utilized by organizations in a multitude of powerful ways” [15].

2.2 Distance Learning


Distance Learning is a non-face-to-face teaching-learning modality that uses technol-
ogy to reach people who cannot be in the same place where the teacher or tutor is, due
to lack of affordability or sheer choice. This modality comprises a main educational
institution (e.g. a university) and supporting off-campus units or centers, which are
geographically distributed with the purpose of reaching the greatest possible number of
people.
These off-campus units offer an infrastructure to aid communication between
teachers (physically present at the main institution) and students (physically present at
the off-campus units), provided with basic technological equipment and internet con-
nection. In addition, Virtual Learning Environments (VLE), such as Moodle [16], are
used so that teachers, tutors and students can interact in the teaching and learning
process, profiting from the technological infrastructure available in the main educa-
tional institution and the units.
The present study will analyze the communication between the off-campus units
and the main institution, with the purpose of proposing a solution to the communication
problems thereof.

3 Related Works

This section presents studies that relate to the main themes of the work developed in
this paper, that is: Cloud Computing, Fog Computing and Distance Education.
The first important point to be made is that cloud-based architectures are no longer
sufficient to support the demands of the Internet, especially when it comes to data
volume allied to delivery time [7]. In the classic configuration of Cloud Computing
(Fig. 1), all data processing is the responsibility of the cloud, and clients are only
passive elements.
In this context, in [8] the authors present Fog Computing as a solution to deal with
the demands found on the Internet, being a powerful solution and an extension of the
cloud. It is a solution mainly used in the Internet of Things (IoT), which uses the
resources at the edge of the cloud, i.e. it performs preprocessing closer to the data
source. In this way, their study proposes a task-scheduling approach for the fog-cloud
configuration.
According to [5], fog is a powerful tool to help cloud computing. The authors
characterize it as an extension of the cloud. Their study proposes an approach, looking
for a relationship between energy consumption and system delay, exploring the
interaction between cloud and fog.
The fog-cloud cooperation is presented in [9] from a mobile perspective, suggesting
fog as a solution for offloading services for smart technologies. In this way, the authors
present a hybrid proposal of cloud and fog to solve the problem.
[14] show that bandwidth, throughput and latency are directly affected by Fog-
Cloud cooperation. In this way, the work performs a case study verifying the behavior
of these measures. Latency is identified as the main problem, pointing to the offloading
approach as an interesting computational solution.
The authors in [10] claim that cloud infrastructures no longer meet the current
demands of IoT applications, with network bandwidth and latency being the two main
causes. Therefore, they suggest fog and edge computing as solutions to alleviate these
limitations and give the cloud a chance to survive. The authors proposed four forms of
workload, besides challenges concerning machine learning, security and communica-
tion networks.
In [15], Fog Computing is considered a solution to the classic cloud architecture
problems generated by flow and delivery time. The study applies fog concepts to a
scanner system for which, due to the type of files generated, the cloud infrastructure was
not sufficient to handle the processing and storage of the data. The solution was to
perform preprocessing in the fog before the data were sent to the cloud.
In the educational context, the authors in [11] draw attention to the need for
educational content exchanges between universities, and suggest a synchronization
system, since instability in connection, according to the authors, impairs the teaching
and learning process. In this way, the authors expect savings in technological
resources, mainly because connection resources are scarce.
According to [12], in an evolved society education is fundamental, and learning
monitoring systems grow to aid it. The best-known system used by the authors is
Moodle, a platform to support Distance Education. According to the authors, due to
poorer learning conditions, some places cannot always have an Internet connection,
which is a bare necessity for the use of the platform. In the article, the authors propose
that Moodle be autonomous and work offline as a solution to the connection problems.

The studies mentioned in this section indicate that the alternative of processing in
fog is a viable solution to solve the problems identified in the classic architectures
based on Cloud Computing, where a stable connection with the Internet is a necessary
factor. In this sense, the importance of studies related to communication issues, and of
designing applications for those environments, is clear. From an educational perspective,
the Fog Computing paradigm presents itself as a solution to communication and
scalability problems. In this way, the present study can be understood as a contribution
to the educational area, trying to increase the reach of Distance Education in regions
with poor communication infrastructure.

4 Fog-Cloud Cooperation for Virtual Learning Environments

This section presents the proposal of this study, the way it is structured and how it
is inserted in the context of cloud and fog computing. The proposal aims to solve the
problem previously exposed, transforming a Classical Cloud architecture into an
architecture based on fog-cloud cooperation.
Distance-learning environments, the subject of research in this study, are classic
cloud computing systems where students in different geographical locations have
access to the system through the Internet. This system is the online Distance Learning
platform available from the main university so that students can interact with teachers
and tutors. In addition, all evaluative activities are developed in this platform and
students submit their assignments, meeting the deadlines established by the teacher.
The computational architecture of the platform is the traditional model of
client/server applications, as shown in Fig. 1, where each user stays in an off-campus
unit. These off-campus units provide machines and an Internet connection, and the uni-
versity offers a Web service with database features. Students perform the tasks and
submit them to the server through this Web interface.

Fig. 1. An ordinary cloud architecture.
Fig. 2. A cloud-fog architecture.

This scenario presents some problems for the users of the platform (students,
teachers and tutors) due to the poor infrastructure in the off-campus units: (i) connec-
tion failure; (ii) delay in submitting the assignments; (iii) difficulties in downloading
didactic material; and, mainly, (iv) difficulties in uploading documents.

Fig. 3. Geographic distribution of educational units

We can see the geographic distribution of all this structure in Fig. 3, where the
‘graduation cap’ markers are the off-campus units and the ‘construction’ marker is the
educational institution (UFJF). In this map it is possible to see the distance of some
off-campus units to UFJF, which contributes to enlarging the connection problem.
In this context, the solution proposed in this paper seeks to solve the problems
concerning the connectivity between the supporting off-campus units and the central
server of the main educational institution, allocated in the cloud. The major problem
relates to students uploading their completed activities. A second objective is to solve
this problem in a scalable way and at the lowest possible cost, facilitating the creation of
new off-campus units and making it easier to implement the solution in the existing units.
Based on the knowledge presented in this paper and the findings of similar studies,
e.g. [11, 12], one solution that emerges is to apply the fog paradigm. With a fog node,
the off-campus units can have autonomy from the cloud and the possibility of pre-
processing the data locally, now tackling the file upload problem differently.
The solution is to install a software application that manages the files submitted by
students, keeping them on a local server in the remote unit, and when convenient the
system can upload these files. That way, the proposed solution meets the requirements
of being feasible and scalable, while at the same time it solves the Quality of Expe-
rience issue.
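As a rough illustration of this deferred-upload idea (our sketch, not the authors' implementation), the fragment below shows a fog-node worker that keeps student submissions in local storage, compresses them, and forwards them to the cloud only during a configurable off-peak window; the directory paths, the upload window and the upload_to_cloud helper are hypothetical placeholders.

import gzip
import shutil
import time
from pathlib import Path

SPOOL_DIR = Path("/var/fog/submissions")    # assumption: where the local web app drops uploads
OUTBOX_DIR = Path("/var/fog/outbox")        # assumption: staging area for compressed files
UPLOAD_WINDOW_HOURS = range(22, 24)         # assumption: off-peak hours for cloud synchronization


def compress_submission(src: Path, dst_dir: Path) -> Path:
    """Gzip one student file into the outbox so it can be sent later."""
    dst = dst_dir / (src.name + ".gz")
    with src.open("rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def upload_to_cloud(path: Path) -> bool:
    """Hypothetical placeholder: push one compressed file to the university's cloud server."""
    return True  # e.g. an authenticated HTTP POST would go here


def sync_loop(poll_seconds: int = 60) -> None:
    """Keep submissions local; forward them only during the upload window."""
    SPOOL_DIR.mkdir(parents=True, exist_ok=True)
    OUTBOX_DIR.mkdir(parents=True, exist_ok=True)
    while True:
        for submission in SPOOL_DIR.glob("*"):
            compress_submission(submission, OUTBOX_DIR)
            submission.unlink()             # the file is now safely staged on the fog node
        if time.localtime().tm_hour in UPLOAD_WINDOW_HOURS:
            for staged in OUTBOX_DIR.glob("*.gz"):
                if upload_to_cloud(staged):
                    staged.unlink()
        time.sleep(poll_seconds)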
In this way, differently from the current architecture based on a classic Cloud
Computing approach, the use of the Fog paradigm requires a local structure to keep the
system up and running. Figure 2 presents this situation. Fog nodes, which are located
in the remote supporting units, assist the cloud located at the university.
In this paper, the architecture developed considers fog-cloud cooperation. Thus, the
system still works with the cloud server (located in the university), but the great
innovation is the adoption of fog nodes (located in the off-campus units) next to the
data source, so the data can be preprocessed there. Under this configuration, there is a low
number of hops in the network and, in principle, hardware with greater availability to
process the data produced by the students.
With this new architecture, it is expected that the documents produced by students
be quickly processed in the fog, without the need to send them to the cloud. When the
student completes the assignment, the Fog node stores the document locally, sending it
to the cloud at a later time. This avoids overloading the network in the remote units,
thus improving response time.

5 Experimental Results

In order to evaluate the proposal, an experiment was carried out by simulating the
network in operation, identifying its weaknesses in order to determine where to act and
whether the solution would have the expected effect.
Data on the remote units’ infrastructure were then collected for the experiment,
checking the number of machines and the internet package in each unit, so that the
solution simulation environment would reflect the real environment of the units.
As reported by teachers, students and tutors, near the time scheduled for the
beginning of the online activities, there is a large amount of access, as students need to
access the platform to download the activities and related-content files. In this time
range, many requests are directed to the centralized server of the university.
During the period of the students’ activities, the remaining sporadic requests for
documents are disregarded, because there is no clear concentration of package
submissions in that interval.
Near the assignment upload deadline, there is another sharp curve of package
submissions, which is when students send their completed tasks to the server of the
distance-learning platform.
In this way, the distribution of packet uploads and downloads of a certain remote
unit is given by a distribution with packet concentration at the beginning and at the end
of the available time. This distribution draws on teachers’ experience and data analysis
regarding the activities saved in the Moodle platform. This scenario was implemented,
simulating the current situation of the system.
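Purely to illustrate that shape (the exact distribution is not specified in the paper), one simple way to draw event times concentrated at the beginning and at the end of the two-hour window is sketched below; the mixture weights and spreads are assumptions.

import random

WINDOW_MIN = 120  # two hours available for the online activity


def submission_time() -> float:
    """Sample one event time in minutes: early burst of downloads, late burst of uploads."""
    if random.random() < 0.5:
        t = random.gauss(5, 3)                # downloads clustered near the start
    else:
        t = random.gauss(WINDOW_MIN - 5, 3)   # uploads clustered near the deadline
    return min(max(t, 0.0), float(WINDOW_MIN))


event_times = sorted(submission_time() for _ in range(200))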
To perform the evaluation, data were collected from the off-campus units of the
Federal University of Juiz de Fora (UFJF), in the city of Juiz de Fora, Brazil. Data
from five remote units were used: Barroso, Boa Esperança, UAB of Juiz de Fora, Santa
Rita de Caldas and Sete Alagoas. The numbers of machines per remote unit are 30, 20,
47, 50 and 30, respectively, and the network flow values are 20 Mb/s, 4 Mb/s, 4 Mb/s,
4 Mb/s and 10 Mb/s, as shown in Table 1. The UFJF server has a 100 Mb/s connection, which
is the value used in the simulation.

Table 1. Remote units configuration

Remote unit   Number of machines   Flow (Mb/s)
1 - Orange    30                   20
2 - Yellow    20                   4
3 - Green     47                   4
4 - Blue      50                   4
5 - Red       30                   10

In general, students can take up to 2 h to complete an activity. After that time, students
are no longer allowed to send the assignment through the Moodle platform. In this situation,
students have to rely on the willingness and availability of the face-to-face tutors and
the teachers in charge of the assignment.
One point that was also considered during the simulation is the fact that some
Internet providers in Brazil offer plans with an upload rate limit lower than the
download rate limit. The first indication of that fact is the information provided by
SPEEDTEST, a website that performs speed tests around the world and compiles the
information. According to that website, the upload rate in Brazil represents on average
46% of the download speed [15].
Another widely adopted upload rate in Brazil is 10% of the download rate. Thus,
during the simulations, three factors were deployed: 10%, 46% and 100% of the initial
value of download rate.
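To make these factors concrete, the short calculation below (our illustration, using the flow values of Table 1) gives the effective upload rate of each remote unit under the three factors.

# Effective upload rate (Mb/s) per remote unit for each simulated factor.
download_mbps = {"orange": 20, "yellow": 4, "green": 4, "blue": 4, "red": 10}  # from Table 1
factors = (0.10, 0.46, 1.00)

for unit, down in download_mbps.items():
    rates = ", ".join(f"{int(f * 100)}%: {down * f:.1f}" for f in factors)
    print(f"{unit:>6} -> upload {rates} Mb/s")
# e.g. the yellow unit drops from 4 Mb/s (100%) to about 1.8 Mb/s (46%) and 0.4 Mb/s (10%).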
Even though the idea was to get as close as possible to a real scenario, no network
traffic other than that of the remote units and the distance-learning server was
considered. This excludes external problems, making this scenario as close to the best
case as the system could offer.
For the present experiment, the class assignments of the off-campus students were
done simultaneously in order to obtain the greatest amount of access possible in the
distance-learning server of the main university. The intent was to determine whether the
bottleneck originated in the distance-learning server. However, this hypothesis
was discarded through an initial analysis of the data after simulation, showing that the
problem stemmed from the remote units.

5.1 Simulation of the Real Environment


The simulation of a real environment drew on the three previously presented
factors: 10%, 46% and 100% of the initial download rate value. These
factors are presented separately, verifying the limits that the cloud can reach in this
environment.

To carry out the experiment and generate the data, it was necessary to undergo a
series of processes, from the generation of the scripts run in the simulator, to a graph
analysis of the simulation results. To streamline this process and to be able to perform
multiple experiments with greater ease, a Framework was developed to support these
process steps.
A shell script was developed to run Network Simulator 2 (ns-2) and generate the data,
which are then processed by a Python program that produces output for Gnuplot and
Python’s matplotlib, yielding graphs for better visualization and analysis of the
obtained results. The source codes of this
framework were made available in GitHub [17] in order to help other researchers.
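As a hedged sketch of the post-processing step (this is our illustration, not the code published in the repository [17]), a Python fragment of the following kind can turn per-file send/receive timestamps extracted from an ns-2 trace into the delivery-time plots of Figs. 4, 5, 6 and 7; the simplified one-line-per-file trace format is an assumption.

import matplotlib.pyplot as plt


def load_delivery_times(trace_path):
    """Return {unit: [(send_time_s, delivery_time_s), ...]} from a pre-filtered trace file
    whose lines read '<unit> <file_id> <send_time_s> <recv_time_s>' (assumed format)."""
    per_unit = {}
    with open(trace_path) as trace:
        for line in trace:
            unit, _file_id, sent, received = line.split()
            per_unit.setdefault(unit, []).append(
                (float(sent), float(received) - float(sent)))
    return per_unit


def plot_delivery(per_unit, out_png="delivery_times.png"):
    """One point per file, colored by remote unit, in the style of Figs. 4-7."""
    for unit, points in sorted(per_unit.items()):
        send_min = [s / 60 for s, _ in points]
        delivery_min = [d / 60 for _, d in points]
        plt.scatter(send_min, delivery_min, label=unit, s=10)
    plt.xlabel("Assignment completion time (min)")
    plt.ylabel("File delivery time (min)")
    plt.legend()
    plt.savefig(out_png)


if __name__ == "__main__":
    plot_delivery(load_delivery_times("uploads.trace"))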
(1) The 100% Factor: When the upload and download flow are equal, the simulation
presents results that, although within the limit of the remote infrastructure, would not
cause the identified problems for the remote units. The delivery is fast enough for the
situation in the remote units, which does not match the situation reported by teachers,
students and tutors.
Figure 4 refers to the time each file takes to be sent from its start until delivery to the
server that will store the data, showing the off-campus units in different colors and
point shapes. The points indicate the moment of file delivery to the cloud server. The
file upload takes approximately 2 min, which is satisfactory for the problem. In
this case, where the upload rate is the same as that of download, the cloud service
presents no problems, being sufficient for the application.
(2) The 46% Factor: In this simulation, where the upload rate is 46% of the download
rate, delivery begins to take longer than in the previous situation, which now reflects
the situation mentioned by teachers, students and tutors – the cloud bottleneck begins
to become evident.
Figure 5 refers to the time each file takes to be sent. The time for all off-campus units
to send the files in the two worst cases (yellow and green units) is 8 min – a con-
siderably long time for file delivery. In this situation, some students could not send
their activities in due time (120 min).
In this case, these students need to start sending their tasks 8 min earlier, i.e., they
have less time to solve the tasks than students from other off-campus units.
(3) The 10% Factor: The first result obtained for the 10% upload rate refers to the
overall amount of packet traffic in the network during the entire period of an
assignment. In this case, the delivery time becomes critically large.
We measured the time that all remote units need to send all the files until the end of the
transmission. The deadline established for sending the files is 120 min (the two hours
available to carry out the online activities). There is an evident delay of approximately
70 min in the delivery of part of the files, which points to the fact that the flow is not
enough.

Fig. 4. Files sent in relation to assignment completion time (100% factor)
Fig. 5. Files sent in relation to assignment completion time (46% factor)

Fig. 6. Files sent in relation to assignment completion time (10% factor)
Fig. 7. Files sent in relation to assignment completion time (Fog-cloud cooperation)

Complementing the previous result, Fig. 6 shows the sending time of each class
assignment performed, with each remote unit represented by a different color, showing
only the end of the period, which concentrates the submissions. The characteristics of
the off-campus units follow the same order as in Table 1, where the yellow and green
remote units have more machines and an insufficient internet connection, hence causing
a much longer delay. The 70 min previously mentioned are attributable to these
remote units. The figure points to the problem of the off-campus units, revealing what
was already expected – that the flow is not enough and the structure needs to be
improved. This case goes beyond the previous one and shows the urgency of adopting
another type of structure.

5.2 Simulation of the Proposal


The proposal investigated here allocates all students’ assignments to a local
server in the off-campus unit, which relieves the network, as the task is considered a
simple one for a local network. The files will only be sent to the server at a later time,
compressed and in a controlled manner.
Naturally, if assignments are sent to the local server, which in the case is a fog
node, the time is much shorter. Figure 7 only reinforces this statement, with the same
graph for the file sending time. In this case, the times are so short that they are less than
a minute. Under these conditions, fog-cloud cooperation is a viable alternative for solving the
problem of connecting the off-campus units located in remote regions very far from the
major centers. With this strategy, the sending problem is solved, making the system
scalable and viable.
These results are the average of those obtained over several executions
of the framework. However, it is worth noting that, even though the experiments were
performed with different seeds, there were no outliers, i.e., the results of each execution
follow a very similar distribution.

6 Conclusions and Future Works

This study reviews the state of the art of cloud computing and points to a cloud
limitation. Then, fog is presented as a solution, which can extend cloud boundaries in
certain situations. The situations in which the fog structure is most used are mainly
those with demands on flow or delivery time. Fog also appears as a solution to other
problems such as energy consumption and Quality of Experience (QoE).
To validate this information and solve a real problem in the educational context, an
experiment was carried out considering poor Internet connection as a debilitating factor
for the adoption of a classical cloud architecture. At first, considering economic,
infrastructural and geographic issues, it was found that some supporting off-campus
units did not have a connection good enough to carry out the tasks they were supposed
to.
The experiment makes the problem evident and confirms the previous
hypothesis. It is clear that the system is at its limit, especially when it concerns structure
expansion and number of off-campus units. To solve this problem, fog-cloud coop-
eration presents itself as a viable solution, since its premise is based on flow and
delivery time.
When the proposed architecture was evaluated through the experiments, it became
clear that it is a viable, scalable and functional solution to the problem and that it can
even extend the life of the structure. In this way, the experiment reveals the power of
fog in helping the cloud, and how these paradigms should work together to avoid
overloading the cloud.
As future work, we intend to implement this new architecture at the Federal
University of Juiz de Fora for an evaluation of the proposal in real-time execution.

Acknowledgment. The authors would like to thank FAPEMIG, CAPES, CNPq and PTI-Lasse
for partially funding the research presented in this article.

References
1. Cisco: Fog Computing and the Internet of Things: Extend the Cloud to Where the Things
Are. White Paper, Cisco (2015)
2. IBM. https://www.ibm.com/blogs/cloud-computing/2014/08/25/fog-computing/. Accessed
Feb 2019
3. Openfog. https://www.openfogconsortium.org/. Accessed Feb 2019
4. Gomes, E.H.A., Plentz, P.D.M., De Rolt, C.R., Dantas, M.A.R.: A survey on data stream,
big data and real-time. Int. J. Netw. Virtual Organ. 20(2), 143–167 (2019)
5. Deng, R., Lu, R., Lai, C., Luan, T.H.: Towards power consumption-delay tradeoff by
workload allocation in cloud-fog computing. In: 2015 IEEE International Conference on
Communications (ICC), pp. 3909–3914, June 2015, https://doi.org/10.1109/icc.2015.
7248934
6. Stojmenovic, I., Wen, S., Huang, X., Luan, H.: An overview of Fog computing and its
security issues. Concurr. Comput.: Pract. Exp. 28(10), 2991–3005 (2016). https://doi.org/10.
1002/cpe.3485
7. Pepper, R., Garrity, J.: The internet of everything: how the network unleashes the benefits of
big data. In: Bilbao-Osorio, B., Dutta, S., Lanvin, B. (eds.) The Global Information
Technology Report 2014, Rewards and Risks of Big Data, pp. 35–43. World Economic
Forum, Geneva (2014)
8. Pham, X.-Q., Huh, E.-N.: Towards task scheduling in a cloud-fog computing system. In:
2016 18th Asia-Pacific Network Operations and Management Symposium (APNOMS),
pp. 1–4, October 2016. https://doi.org/10.1109/apnoms.2016.7737240
9. Du, J., Zhao, L., Feng, J., Chu, X.: Computation offloading and resource allocation in mixed
fog/cloud computing systems with min-max fairness guarantee. IEEE Trans. Commun. 66
(4), 1594–1608 (2017). https://doi.org/10.1109/tcomm.2017.2787700
10. Bierzynski, K., Escobar, A., Eberl, M.: Cloud, fog and edge: Cooperation for the future? In:
2017 Second International Conference on Fog and Mobile Edge Computing (FMEC),
pp. 62–67, May 2017. https://doi.org/10.1109/fmec.2017.7946409
11. Ijtihadie, R.M., Hidayanto, B.C., Affandi, A., Chisaki, Y., Usagawa, T.: Dynamic content
synchronization between learning management systems over limited bandwidth network.
Hum.-Centric Comput. Inf. Sci. 2(1), 17 (2012). https://doi.org/10.1186/2192-1962-2-17
12. Ngom, B., Guillermet, H., Niang, I.: Enhancing Moodle for offline learning in a degraded
connectivity environment. In: 2012 International Conference on Multimedia Computing and
Systems, pp. 858–863, May 2012. https://doi.org/10.1109/icmcs.2012.6320168
13. Mell, P.M., Grance, T.: The NIST Definition of Cloud Computing Recommendations of the
National Institute of Standards and Technology (2011)
14. Dantas, M., Bogoni, A.R., Eduardo, P., Filho, P., Eduardo, J.F.: An application case study on the
tradeoff between throughput and latency on fog-cloud cooperation. IJNVO (in press)
15. SPEEDTEST. https://www.speedtest.net/reports/brazil/. Accessed Feb 2019
16. MOODLE. https://moodle.org/. Accessed Feb 2019
17. GitHub. https://github.com/lucaslarcher/Framework_Analise_Saida_NS2. Accessed Feb
2019
Research Characterization on I/O
Improvements of Storage Environments

Laércio Pioli1,2(B) , Victor Ströele de Andrade Menezes1 ,


and Mario Antonio Ribeiro Dantas1,2
1 Federal University of Juiz de Fora (UFJF), Juiz de Fora, Minas Gerais, Brazil
{laerciopioli,victor.stroele,mario.dantas}@ice.ufjf.br
2 INESC P&D, Juiz de Fora, Minas Gerais, Brazil

Abstract. Nowadays, some interesting improvements in I/O architectures
can be observed. This is an essential point for complex and
data-intensive scalable applications. In the scientific and industrial fields,
the storage component is a key element, because usually those applica-
tions employ a huge amount of data. Therefore, the performance of these
applications commonly depends on some factors related to time spent in
execution of the I/O operations. In this paper we present a research
characterization on I/O improvements related to the storage targeting
high-performance computing (HPC) and data-intensive scalable comput-
ing (DISC) applications. We also evaluated some of these improvements
in order to justify their concerns with the I/O layer. Our experiments
were performed in the Grid’5000, an interesting distributed testbed envi-
ronment, suitable for better understanding challenges related to HPC
and DISC applications. Results on synthetic I/O benchmarks demon-
strate how the latency of I/O operations can be improved.

1 Introduction

Recent improvements related to I/O architectures have been found in some dif-
ferent dimensions. Many contributions are being proposed focusing on data dis-
tribution. Other works consider hardware combination to enhance data access.
Some of these solutions utilize flash memory, such as SSDs (Solid-State Disks), together
with HDDs (Hard Disk Drives), to address I/O management problems, thus improv-
ing the performance of these applications. Finally, other dimensions adopt
software improvements in the upper layer to enhance I/O performance, as
illustrated in [9].
Therefore, we can consider that current efforts are divided into a macro view
of software, hardware and storage systems approaches. Our proposal can be under-
stood, as Fig. 1 shows, as a characterization related to the I/O improvements,
studying each component separately (i.e. software, hardware and storage sys-
tems) and their interconnections. In a previous literature review we found a gap
which indicates a necessary set of experiments to better understand the relation
between the three components. To evaluate our proposal, we present a set of


experiments performed inside the Grid’5000, a large distributed computational
environment, aimed at indicating aspects related to I/O performance, espe-
cially considering the view of scientific and industrial applications.
This contribution is structured as follows. In Sect. 2, some elements
related to HPC and DISC are presented. Related work that justifies this character-
ization proposal is illustrated in Sect. 3. The proposed research work of the
paper is highlighted in Sect. 4. The environment used and the experimental results
in the Grid’5000 are presented in Sect. 5. Finally, Sect. 6 presents conclusions and
future work after this investigation effort.

2 HPC and DISC


In this section we present some background on HPC and DISC.

2.1 HPC
HPC systems, such as clusters, grids, and supercomputers, consist of a very
specialized and complex combination of hardware and software elements. Their
design mainly focuses on providing high processing power for large scale dis-
tributed and parallel applications, even though low communication and data
access latency are considered increasingly important requirements [3,15].
A main characteristic observed in these environments is the separation of
compute and storage resources, which results in massive data movements through
layers of the I/O path during the execution of a data-intensive application. In
such complex computing environments, many factors can affect the I/O perfor-
mance perceived by an application: workload characteristics, system architecture
and topology model, configurations in the many layers of the parallel I/O soft-
ware stack, properties of hardware components, interference and system noises,
to name a few.
Most previous research works addressing I/O performance variability pro-
posed inter-application coordination approaches focusing on reducing the impact
of multiple applications simultaneously executing on the HPC system [6,7,14,
26,30]. However, in extreme-scale computational science applications, denot-
ing applications with dedicated access to all resources of the HPC environment
for execution, I/O performance variability can be mostly attributed to intra-
application interference. One major, yet intrinsic, source of intra-application
interference in this context relates to the load balance on the parallel file
system’s (PFS) data servers.

2.2 DISC
DISC focuses on data-centric workloads; different types of applications, such as
industrial ones, can be the target of this effort, because DISC is oriented to data
management and analysis and has goals such as scalability, fault-tolerance,
availability and cost-performance. In [22] there is an interesting discussion about DISC considering
hardware, software packages and computation environment.

3 Related Work
In this section, we present some previous works targeting I/O improvements on
storage devices, systems and software, done by researchers who are concerned
about the performance of the I/O layer.
In general, conducting performance studies on storage devices is challenging
for researchers as they need to deal with various low-level concepts and new
technologies. Applications that produce and process large amounts of data in a short time
interval almost always need to store and retrieve data, and usually they encounter
latency problems when performing such operations. Many of these problems are related
to the device and technology that are being used. The related works shown below
exemplify some issues that are being addressed by researchers to improve I/O
performance. These works were selected because we are conducting a systematic
review concerning I/O improvements on storage devices and systems that were
proposed in the last 10 years. These solutions were divided into numbered groups
that were used in the characterization proposal presented in Sect. 4; the groups
are:

1. Yang et al. [29] proposed WARCIP, which means “write amplification reduc-
tion by clustering I/O pages” to minimize the negative impact of garbage
collection (GC) on SSD devices. They used a clustering algorithm to min-
imize the rewrite interval variance of pages in a flash block. Chang et al.
[4] proposed an approach to operate wear leveling on virtual erase counts
instead of real erase counts using SSD devices. Kim et al. [13] pro-
posed an I/O architecture that optimizes the I/O path to take full advantage
of NVMe SSDs. The authors’ approach works by eliminating the overhead of
user-level threads, bypassing unnecessary I/O routines and reducing the
interrupt delivery delay. Ramasamy et al. [21] proposed an algorithm called
random first flash enlargement to improve the performance of write operation
on the flash-memory-based SSDs. Shen et al. [23] proposed an I/O scheduler
where the design of the solution is motivated by unique characteristics on
Flash-Based SSDs.
2. Mackey et al. [16] proposed a novel storage device called Igloo which serves
to solve the problem of accessing cold data storage. Chen et al. [5] address
the issue of random write operations, which are very common on NAND-
flash memories. They proposed utilizing an NVM device as an auxiliary
device, because NVM technologies such as phase-change memory support
better in-place updates, yielding better I/O performance. Stratikopoulos
et al. [25] introduce an FPGA-based approach for accelerating NVMe-based
SSDs. Their solution introduces an FPGA-based fast path that accelerates
the access to the NVMe drive, improving then the I/O performance of the
device.
3. Wu et al. [27] proposed a priority-based data placement method for databases
using SSDs. They consider a mechanism and a migration rule for performing
migrations between HDDs and SSDs. Yang et al. [28] proposed a content
look-aside buffer (CLB) for simultaneously providing redundancy-free virtual
disk I/O and caching. They implemented a CLB on KVM hypervisor and
demonstrate that CLB delivers considerably improved I/O performance with
realistic workloads. Huo et al. [10] proposed a cache management algorithm,
presented as a framework named ACSH, which is based on SSD devices
and DRAM and is focused on improving metadata I/O in file
systems. Ou et al. [19] proposed a file index scheme for flash file systems called
NIS. In this scheme they are concerned with the performance of file systems
when using NAND flash as a storage device.
4. Nakashima et al. [18] proposed a method for improving I/O performance of a
big data application using SSD as cache. The results presented by the authors
demonstrate that the method can improve I/O performance. Min et al. [17]
proposed a method using NVMe SSDs to enhance I/O resource management
of Linux Cgroup on NUMA systems. Ouyang et al. [20] proposed a method
which is an aggregation staging I/O to enhance checkpoint writing perfor-
mance using staging I/O and SSD on the data server archiving better write
bandwidth. Kannan et al. [12] proposed a mechanism using Active NVRAM
based approach for I/O staging. In the considered method, each physical node
has an additional active NVRAM component to stage I/O. Bhattacharjee
et al. [2] proposed utilizing SSD to enhance recovery and restart through
random access capability in a database engine.
5. Others contributions focus on the I/O improvements targeting storage sys-
tems. Zhou et al. [31] proposed an attributed consistent hashing algorithm
called attributedCH, which manages heterogeneous nodes on a consistent
hashing ring. The algorithm is particularly suitable for heterogeneous stor-
age systems. Shi et al. [24] proposed SSDUP which is a scheme to improve
the burst buffer by addressing some limitations such as requiring large SSD
capacity and harmonious overlapping between computation phase and data
flushing stage. Du et al. [8] proposed a balanced partial stripe (BPS) scheme
that improves the write performance of RAID-6 systems which is a bottleneck
to deal with some applications.

4 Proposed Approach
In this article we present a research characterization on I/O improvements
related to the storage devices and systems targeting HPC and DISC applica-
tions.
Figure 1 presents the proposed characterization, which is composed of three
basic elements: software, hardware and storage systems. Software investigations
can be understood as improvements where the object being proposed
as a solution is an algorithm, method, framework or any programmable solution.
Hardware investigations can be characterized as improvements where the object
being proposed is a physical component or something tangible.
We found that these two groups can relate to and improve each other or a storage
system, targeting the improvement of I/O performance.
In Sect. 3, we presented some contributions made by researchers
to improve I/O performance on storage devices and systems. All the

Fig. 1. Proposed characterization of I/O improvements

contributions presented in group 1 are some kind of “software” solution
proposed to improve I/O performance on a storage device. This group of
improvements is characterized and shown in Fig. 1 as the arrow that leaves the
red circle (Software) and arrives at the green circle (Hardware). For easy viewing,
the acronym S2H-IO, which means “Software solution to improve I/O performance
on Hardware”, was created and added above the arrow.
Authors from group 2 are looking for improvements targeting I/O perfor-
mance on the hardware device, but they use hardware technology to achieve it.
In all of these cases, the solution itself involves a different hardware technology.
This group of improvements is characterized and shown in Fig. 1 as the arrow
that circles the hardware circle. For this group of improvements the acronym
H2H-IO, which means “Hardware solution to improve I/O performance on
Hardware”, was created and added below the green circle (Hardware).
All contributions presented in group 3 are some kind of “software” solution
proposed to improve I/O performance on another “software” object, differently
from group 1. It is important to notice that, although the improvements are from
software to software, they take into account the storage technologies that they
are using. This group of improvements receives the acronym S2S-IO, which means
“Software solution to improve I/O performance on Software”. It was created and
added below the red circle (Software) in Fig. 1.
Authors from group 4 presented improvements targeting I/O performance
on software, but they use technology, that is, hardware, to achieve it. Although
this approach is not the one that most naturally attracts the attention of
researchers, it was classified and characterized as presented below. This group
of improvements receives the acronym H2S-IO, which means “Hardware
solution to improve I/O performance on Software”. It was created and added
below the arrow that leaves the green circle (Hardware) and reaches the red
circle (Software) in Fig. 1.
Group 5 is concerned with I/O improvements targeting storage systems.
It should be made clear that, in this characterization, Storage Systems does
not mean some software that is present in a storage device, but rather a group
of technologies and software working together, and asynchronously, in an envi-
ronment. Because storage systems are made up of hardware and software,
the improvements proposed by researchers can be either software or hardware
improvements, or both. To this end, the acronyms S2SS-IO and H2SS-IO,
which mean “Software solution to improve I/O performance on Storage Sys-
tems” and “Hardware solution to improve I/O performance on Storage Systems”
respectively, were created and added above the arrows that reach the blue
circle above (Storage Systems).
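Purely as a reading aid (the authors do not define this in code), the characterization above can be condensed into a small mapping from each acronym to its solution domain, its improvement target and the related-work group of Sect. 3:

# Summary of the characterization in Fig. 1: acronym -> (solution domain, target, Sect. 3 group).
CHARACTERIZATION = {
    "S2H-IO":  ("software", "hardware (storage device)", 1),
    "H2H-IO":  ("hardware", "hardware (storage device)", 2),
    "S2S-IO":  ("software", "software",                  3),
    "H2S-IO":  ("hardware", "software",                  4),
    "S2SS-IO": ("software", "storage systems",           5),
    "H2SS-IO": ("hardware", "storage systems",           5),
}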

5 Experimental Environment and Evaluation Results


In this section we present the experimental environment that we used to carry
out the experiments. We also present, in a subsection, the factors that we used in
the experiments and why they were chosen.

5.1 Experimental Environment


To evaluate our proposal, we consider the Grid’5000, which is a large-scale
testbed for experiment-driven research. Grid’5000 has a focus on parallel and
distributed computing, including cloud computing, high-performance computing
and big data, and has 800 compute nodes grouped into clusters. Each cluster provides a
wide range of technologies, including different CPU processors, GPUs, storage
devices such as SSD, HDD and NVMe, and Ethernet, InfiniBand and Omni-Path
network interconnects. The Grid’5000 testbed is a secure and powerful envi-
ronment composed of 8 different sites located in France, providing a huge amount
of devices and technologies working in parallel to solve huge problems of
science.
The experimental environment used was composed of 24 nodes of the dahu
cluster located in Grenoble. The nodes are Dell PowerEdge C6420
machines interconnected with a Gigabit Ethernet network. Each node has 2 Intel
Xeon Gold 6130 2.10 GHz CPUs with 16 cores/CPU. The storage of each node
consists of a 240 GB SATA SSD, a 480 GB SATA SSD, and a 4.0 TB SATA HDD,
and the RAM is 192 GiB. CentOS 7 was used as the operating system, with kernel
3.10.0-957.21.2.el7.x86_64 and the ext4 file system. In this experiment, 16 nodes
were used as Compute Nodes (CN) and 8 as Storage Nodes (SN).
The experiments were conducted using the OrangeFS file system (version 2.9.7).
The software used to perform the experimentation was the IOR-EXTENDED
(IORE) benchmark [11]. The IORE benchmark is a flexible and distributed
I/O performance evaluation tool designed for hyperscale storage
systems and supporting a wide experimental variety of workloads. The requests
were generated by running IORE on the CNs with MPICH version 3.0.4.

5.2 Experiment Factors Definition

In order to carry out our experiments, this subsection presents the factors that
we used in the analysis.

• Storage Device
The right use of storage can improve the I/O performance of appli-
cations that need to execute I/O operations frequently. To verify that the
device technology can influence performance, this research considers
three common approaches to store data. The first one is to store
all kinds of data, both data and metadata, on an HDD device. The second is to use
an SSD to store metadata while the data are stored on HDD devices. Finally,
we store both data and metadata on the SSD device.
• Linux I/O Schedulers
To provide better usage of and access to the data, the I/O schedulers handle
the disk access requests. In this experiment we consider three Linux
I/O schedulers: Completely Fair Queueing (CFQ) [1], deadline and noop. These
schedulers were considered as factors of the experiment, complementing the
data storage scenarios shown above (a sketch of how the scheduler is switched
is given after this list).
• Task Numbers
Another important factor that we consider in the experimentation is the
number of tasks that participate in the test. In this experiment we
consider 32 and 64 tasks, because we believe that different workloads
and sizes can easily be found in DISC applications and this probably
influences the performance results.
• Access Pattern
Finally, two more factors related to the access pattern were introduced in
the experimental parameters. We consider that random and sequential
access patterns could give us different information and present broader results.
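As a minimal sketch of how the scheduler factor can be switched between runs (our illustration; the paper does not describe the exact procedure), the standard Linux sysfs file /sys/block/<device>/queue/scheduler accepts the name of the desired scheduler for a given block device:

from pathlib import Path


def set_io_scheduler(device: str, scheduler: str) -> None:
    """E.g. set_io_scheduler('sdb', 'deadline'); requires root privileges."""
    sysfs = Path(f"/sys/block/{device}/queue/scheduler")
    offered = sysfs.read_text()             # e.g. "noop deadline [cfq]"
    if scheduler not in offered:
        raise ValueError(f"{scheduler!r} is not offered by {device}: {offered.strip()}")
    sysfs.write_text(scheduler)


def current_io_scheduler(device: str) -> str:
    """Return the scheduler currently shown in brackets, e.g. 'cfq'."""
    text = Path(f"/sys/block/{device}/queue/scheduler").read_text()
    return text.split("[", 1)[1].split("]", 1)[0]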

In summary, we performed the experimentation with a total of 36 dif-
ferent scenarios, where 3 comes from the different approaches to store data and
metadata, 3 comes from the different I/O schedulers used for the requests, 2
comes from the number of tasks used and 2 comes from the data access pattern
used by the benchmark. In order to improve the results, each experiment was
performed 5 times and at the end the average was calculated as the final
result.
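A compact way to express the resulting experiment matrix is the sketch below (ours); run_iore is a hypothetical wrapper that, in the real setup, would launch one IORE execution over MPICH against the OrangeFS volume and return the measured read and write latencies.

import itertools
import random
import statistics

PLACEMENTS = ("data+metadata on HDD", "data on HDD / metadata on SSD", "data+metadata on SSD")
SCHEDULERS = ("cfq", "deadline", "noop")
TASKS = (32, 64)
PATTERNS = ("sequential", "random")
REPETITIONS = 5


def run_iore(placement, scheduler, tasks, pattern):
    """Hypothetical wrapper around one IORE run; returns (read_latency, write_latency)."""
    return random.uniform(0.2, 0.4), random.uniform(0.6, 2.3)   # placeholder values only


averages = {}
for scenario in itertools.product(PLACEMENTS, SCHEDULERS, TASKS, PATTERNS):  # 36 scenarios
    samples = [run_iore(*scenario) for _ in range(REPETITIONS)]
    averages[scenario] = (statistics.mean(r for r, _ in samples),
                          statistics.mean(w for _, w in samples))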

5.3 Experimental Results


This subsection presents the results obtained from the experimentation. In all of
them, we seek to understand how latency behaves when the bench-
mark performs read and write operations.

Before starting, let us discuss how the data and information are presented
in these graphics. In each figure presented, 24 outcomes for the read and write
operations are shown. Hatched bars show the read latency time for one
scheduler and non-hatched bars, located after the hatched bars, show the write
latency time for the same scheduler. The write latency time always comes after
the read latency time of the corresponding scheduler.
In Fig. 2 we analyze the latency time of the operations when storing both
data and metadata on the HDD device, switching among the I/O schedulers. In
Table 1 we present the average latency time for all the scenarios presented in
Fig. 2. We present this average because we believe that DISC and HPC
applications can treat and use heterogeneous kinds of data, with different access
patterns and numbers of tasks, in the same application. Considering this, it is
possible to see in Table 1 that, when storing both data and metadata on the HDD
device, the scheduler that presents the lowest average latency time for the read
operation is deadline, and for the write operation it is the CFQ scheduler.

Table 1. Average latency of Fig. 2

Scheduler   Read     Write
CFQ         0.3643   2.0151
Deadline    0.3468   2.0183
Noop        0.3637   2.2576

Fig. 2. Data and metadata on HDD

In Fig. 3 we analyze the latency time of the operations when storing data on
the HDD and metadata on the SSD device, switching among the I/O schedulers.
Just as in Fig. 2, in Fig. 3 all values for the write operation are greater than
those for the read operation. It is also possible to see in Table 2 that the average
read latency for all schedulers is smaller when storing data on HDD and
metadata on SSD, compared with the approach presented in Fig. 2. We also
notice that, when storing data on HDD and metadata on SSD, the scheduler
that presents the lowest average latency time for both read and write operations
is the noop scheduler.

Table 2. Average latency of Fig. 3

Scheduler   Read     Write
CFQ         0.2560   2.0871
Deadline    0.2707   1.9146
Noop        0.2493   1.8570

Fig. 3. Data on HDD and metadata on SSD

In Fig. 4 we store both the data and metadata on the SSD, a device that has no
mechanical components, again switching among the same I/O schedulers. In this
case the average write latency decreased significantly compared with the other
two approaches presented earlier. However, it is possible to notice that the
average read latency did not suffer significant variations.
These results suggest that if the device on which the data are stored is an SSD,
it is very likely that the latency will decrease, thereby improving the performance
of write operations (Table 3; a short calculation quantifying this reduction
follows the table).

Table 3. Average latency of Fig. 4

Scheduler   Read     Write
CFQ         0.2815   0.6827
Deadline    0.2841   0.7274
Noop        0.3087   0.6839

Fig. 4. Data and metadata on SSD
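As mentioned above, a few lines of arithmetic on the averages copied from Tables 1 and 3 (our quick check, not part of the paper) quantify that gain:

# Average (read, write) latencies per scheduler, copied from Tables 1 and 3.
hdd_only = {"cfq": (0.3643, 2.0151), "deadline": (0.3468, 2.0183), "noop": (0.3637, 2.2576)}
ssd_only = {"cfq": (0.2815, 0.6827), "deadline": (0.2841, 0.7274), "noop": (0.3087, 0.6839)}


def mean_write(table):
    return sum(write for _, write in table.values()) / len(table)


reduction = 1 - mean_write(ssd_only) / mean_write(hdd_only)
print(f"average write latency: HDD {mean_write(hdd_only):.2f} -> SSD {mean_write(ssd_only):.2f} "
      f"(about {reduction:.0%} lower)")
# -> roughly a two-thirds reduction, consistent with the discussion of Fig. 4.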

6 Conclusions and Future Work


This study described our characterization approach toward providing a better
understanding of the improvements that are being made by researchers on
the storage devices used in the I/O architecture of large environments. We con-
sider that current efforts are divided into a macro view of software, hardware and
storage systems approaches. Our proposal can be understood as a characterization
related to I/O improvements, where we study each component sepa-
rately (i.e. software, hardware and storage systems) and their interconnections.
In this paper, we presented a set of experiments performed inside the Grid’5000, a
large distributed computational environment, aimed at indicating aspects
related to I/O performance, and we could demonstrate that the latency of
I/O operations can undergo many variations when we take into account
the factors evaluated in the experiments. The most expressive result
is related to the reduction of the write latency when both data and metadata
are stored on the SSD. In order to extend this research, we intend to finish a
survey which considers more than 4.7 thousand works and classifies the
improvements of the last 10 years using the characterization presented in this
paper. We would also like to share and analyze the throughput rates measured
in this experiment in future work, and we want to perform more experiments in
Grid’5000 using an entire cluster, or even more than one, exploring other
technologies such as NVMe flash memory in the experimentation.

Acknowledgment. Experiments presented in this paper were carried out using the
Grid’5000 experimental testbed, being developed under the INRIA ALADDIN devel-
opment action with support from CNRS, RENATER and several Universities as well
as other funding bodies (see https://www.grid5000.fr). We also would like to thank the
Federal University of Juiz de Fora (UFJF), CNPq, CAPES, FAPEMIG, PTI-LASSE
and INESC P&D Brazil in SIGOM project that support in part this study.

References
1. Axboe, J.: Linux block IO–present and future. In: Ottawa Linux Symposium, pp.
51–61 (2004)
2. Bhattacharjee, B., Ross, K.A., Lang, C., Mihaila, G.A., Banikazemi, M.: Enhanc-
ing recovery using an SSD buffer pool extension. In: Proceedings of the Seventh
International Workshop on Data Management on New Hardware, pp. 10–16. ACM
(2011)
3. Chang, C., Greenwald, M., Riley, K., et al.: Fusion energy sciences exascale require-
ments review. an office of science review sponsored jointly by advanced scientific
computing research and fusion energy sciences. In: USDOE Office of Science (SC)
(2017)
4. Chang, L., Huang, S., Chou, K.: Relieving self-healing SSDs of heal storms. In:
10th ACM International Systems and Storage Conference, p. 5. ACM (2017)
5. Chen, R., Shen, Z., Ma, C., Shao, Z., Guan, Y.: NVMRA: utilizing NVM to improve
the random write operations for NAND-flash-based mobile devices. Softw. Pract.
Exp. 46, 1263–1284 (2016)
6. Dorier, M., Antoniu, G., Cappello, F., Snir, M., Orf, L.: Damaris: how to effi-
ciently leverage multicore parallelism to achieve scalable, jitter-free I/O. In: IEEE
International Conference on Cluster Computing, pp. 155-163. IEEE (2012)

7. Dorier, M., Antoniu, G., Ross, R., Kimpe, D., Ibrahim, S.: CALCioM: mitigating
i/o interference in HPC systems through cross-application coordination. In: IEEE
28th International Parallel and Distributed Processing Symposium, pp. 155–164.
IEEE (2014)
8. Du, C., Wu, C., Li, J., Guo, M., He, X.: BPS: a balanced partial stripe write scheme
to improve the write performance of raid-6 In: IEEE International Conference on
Cluster Computing, pp. 204–213. IEEE (2015)
9. Gorton, I., Klein, J.: Distribution, data, deployment: software architecture conver-
gence in big data systems. IEEE Softw. 32, 78–85 (2015)
10. Huo, Z., Huo, X., et al.: A metadata cooperative caching architecture based on
SSD and DRAM for file systems. In: International Conference on Algorithms and
Architectures for Parallel Processing, pp. 31–51. Springer (2015)
11. Inacio, E.C. and Dantas, M.A.R.: IORE: a flexible and distributed i/o performance
evaluation tool for hyperscale storage systems. In: Symposium on Computers and
Communications (ISCC), pp. 01026–01031. IEEE (2018)
12. Kannan, S., Gavrilovska, A., Schwan, K., Milojicic, D., Talwar, V.: Using active
NVRAM for I/O staging. In: Proceedings of the 2nd International Workshop on
Petascal Data Analytics: Challenges and Opportunities, pp. 15–22. ACM (2011)
13. Kim, J., Ahn, S., La, K., Chang, W.: Improving I/O performance of NVMe SSD on
virtual machines. In: Proceedings of the 31st Annual ACM Symposium on Applied
Computing, pp. 1852–1857. ACM (2016)
14. Kuo, C., Shah, A., Nomura, A., Matsuoka, S., Wolf, F.: How file access patterns
influence interference among cluster applications. In: International Conference on
Cluster Computing (CLUSTER), pp. 185–193. IEEE (2014)
15. Lucas, R., Ang, J., Bergman k., et al.: Top ten exascale research challenges. DOE
ASCAC subcommittee report, 1–86 (2014)
16. Mackey, G., Agun, M., Heinrich, M., Ryan, R., Yu, J.: Igloos make the cold bear-
able: a novel HDD technology for cold storage. In: 20th International Conference
on HPC and Communications; 16th International Conference on Smart City; 4th
International Conference on Data Science and Systems (HPCC/SmartCity/DSS),
pp. 99–108. IEEE (2018)
17. Min, J., Ahn, S., La, K., Chang, W., Kim, J.: Cgroup++: enhancing I/O resource
management of Linux Cgroup on NUMA systems with NVMe SSDs In: Proceedings
of the Posters and Demos Session of the 16th International Middleware Conference,
p. 7. ACM (2015)
18. Nakashima, K., Kon, J., Yamaguchi, S.: I/O performance improvement of secure
big data analyses with application support on SSD cache. In: Proceedings of the
12th International Conference on Ubiquitous Information Management and Com-
munication, p. 90. ACM (2018)
19. Ou, Y., Wu, X., Xiao, N., Liu, F., Chen, W.: NIS: a new index scheme for flash file
system. In: 29th Symposium on Mass Storage Systems and Technologies (MSST),
pp. 44–51. IEEE (2015)
20. Ouyang, X., Marcarelli, S., Panda, D.K.: Enhancing checkpoint performance with
staging IO and SSD. In: International Workshop on Storage Network Architecture
and Parallel I/Os, pp. 13–20. IEEE (2010)
21. Ramasamy, A.S., Karantharaj, P.: RFFE: a buffer cache management algorithm for
flash-memory-based SSD to improve write performance. Can. J. Electr. Comput.
Eng. 38, 219–231 (2015)
22. Randal, E.B.: Data-intensive supercomputing: the case for DISC. Technical report
CMU-CS-07-128 (2019)

23. Shen, K., Park, S.: FlashFQ: a fair queueing I/O scheduler for flash-based SSDs.
In: Presented as part of the 2013 USENIX Annual Technical Conference USENIX
(ATC 2013), pp. 67–78. ACM (2013)
24. Shi, X., Li, M., Liu, W., Jin, H., Yu, C., Chen, Y.: SSDUP: a traffic-aware SSD
burst buffer for HPC systems. In: Proceedings of the International Conference on
Supercomputing, p. 27. ACM (2017)
25. Stratikopoulos, A., Kotselidis, C., Goodacre, J., Luján, M.: FastPath: towards wire-
speed NVMe SSDs. In: 28th International Conference on Field Programmable Logic
and Applications (FPL), pp. 170–1707. IEEE (2018)
26. Wan, L., Wolf, M., Wang, F., Choi, J.Y., Ostrouchov, G., Klasky, S.: Comprehen-
sive measurement and analysis of the user-perceived I/O performance in a produc-
tion leadership-class storage system. In: International Conference on Distributed
Computing Systems (ICDCS), pp. 1022–1031. IEEE (2017)
27. Wu, C.H., et al.: A priority-based data placement method for databases using
solid-state drives. In: Proceedings of the 2018 Conference on Research in Adaptive
and Convergent Systems, pp. 175–182. ACM (2018)
28. Yung, C., Liu, X., Cheng, X.,: Content look-aside buffer for redundancy-free virtual
disk I/O and caching. In: International Conference on Virtual Execution Environ-
ments, pp. 214–227. ACM (2017)
29. Yang, J., Pei S., Yang, Q.: WARCIP: write amplification reduction by clustering
I/O pages. In: 12th ACM International Conference on Systems and Storage, pp.
155–166. ACM (2019)
30. Yildiz, O., Dorier, M., Ibrahim, S., Ross, R., Antoniu, G.: On the root causes of
cross-application I/O interference in HPC storage systems. In: International Par-
allel and Distributed Processing Symposium (IPDPS), pp. 750–759. IEEE (2016)
31. Zhou, J., Chen, Y., Wang, W.: Attributed consistent hashing for heterogeneous
storage systems. In: PACT, pp. 23–1. ACM (2018)
Home Fine Dust Monitoring Systems
Using XBee

Sung Woo Cho(&)

Department of Computer Science, Northwestern Polytechnic University,


Fremont, CA, USA
sung_wooc@yahoo.com

Abstract. Home air environments are affected by several pollutant sources.


Among them, fine dust is the most dangerous pollutant, which can cause serious
health problems. In general, residents in East-Asian countries trust fine dust
information on weather forecast portals. However, the quality of the air inside
homes is very different from that forecasted, as indoor air conditions cannot be
observed easily. Although fine dust values can be observed in an air purifier,
these readings can hardly be trusted, because one air purifier in a living room
cannot determine the fine dust density values of other rooms or places. There-
fore, this paper presents a home fine-dust monitoring system based on selected
places applicable in any environments. By using the FRIBEE white Arduino
board (built-in XBee shield Arduino board) and fine dust sensors, a star network
topology is developed and each fine dust density value is determined by the fine
dust sensors. Utilizing XBee and the Bluetooth network, the author examined
fine dust density values on an Arduino serial monitor and Android applications
in real time. To obtain a moderate result, two places in a home were selected to
observe the fine dust density values. Moreover, indoor fine dust density values
were compared with fine dust density values of weather forecast portals. The
proposed systems can be used as a low-cost tool for home (or other indoor
environments) fine dust monitoring.

1 Introduction

Currently, fine dust is one of the most serious causes of social health issues in East-
Asian countries. The negative effect of breathing fine dust has been a primary topic in
Korean portal news season after season. People wear masks outdoors when dangerous
fine dust density values are reported in the weather forecast. Even inhabitants having an
air purifier in the living room (or in different indoor places) cannot sense fine dust
density values or air pollutants. Furthermore, they do not consider the effects of fine
dust on their lifestyle and health.
Recently, the Internet of Things (IoT) has become a popular paradigm owing to its
benefits in modern wireless telecommunications. The IoT pertains to a world-wide
network of interconnected objects that are uniquely addressable, based on standard
communication protocols [1]. Among these wireless network protocols, Zigbee man-
ages various data traffic and communication with businesses and consumer devices.

Therefore, using the FRIBEE white Arduino board, air fine dust density values are
obtained and verified based on the XBee method.
According to conventional research, fine dust density measurements have been
implemented variously. Primarily, only one dust density sensor and one IoT platform
were used. Although several sensors have been developed, a costly Wi-Fi shield was
required and only a simple network topology was created [2]. Therefore, an experiment
was conducted to use XBee S2C, which is a newer version of XBee Pro, and the
Arduino FRIBEE white (with built-in XBee shield) to receive sensor results with star
topology. A star topology can be better used in IoT home automation research com-
pared to other topologies such as peer-to-peer because the primary coordinator serves
as a server and controls other end sensors. With two end nodes (sensor node) and one
coordinate node, two fine dust sensors in two different places send sensor values to the
coordinate node to display the values. Furthermore, an Android phone can obtain
results through Bluetooth with Arduino. By developing a home fine dust monitoring
system, the author has obtained different results between outdoor and indoor fine dust
density values. Indoor fine dust density values depend on different air environments
such as cooking, smoking, and vacuuming.
The remainder of this paper is structured as follows. Section 2 introduces the
FRIBEE Arduino platform. ZigBee and ZigBee network topologies are explained in
Sects. 3 and 4, respectively. Section 5 presents fine dust sensors used primarily in IoT
research. IoT star topology, implementation, and results are described in Sect. 6.
Finally, Sect. 7 presents the conclusions of this study and future work.

2 FRIBEE Arduino Platform

Many do-it-yourself prototyping platforms are available that allow one to create IoT
prototypes quickly and easily. Particularly, Arduino and Raspberry Pi are heavily used
as hardware platforms for teaching basic computer science in schools and for IoT
product prototype development.
Arduino is a flexible open source micro-controller that functions with several
communication and sensing technologies. This single-board development environment
allows one to read data coming from sensors and to control different devices. The
Arduino board (Arduino UNO R3) consists of an open hardware design with an
ATmega328 micro-controller [1]. Primarily, the Arduino software supports C and C++
programming languages. Various inputs and outputs are provided in the Arduino
board; therefore, eight input and output ports can be used simultaneously for various
applications.
FRIBEE white is a Fribot Arduino R3 expansion board containing a built-in XBee shield; therefore, no additional shield is required. It is compatible with Arduino and can be fitted easily with XBee, so developers gain the advantages of reduced overall volume, reduced cost, and convenient application. Wireless communication can be implemented easily by selecting among various modules such as XBee (S1)/ZigBee (S2)/Bluetooth/Wi-Fi [3] (Fig. 1).

Fig. 1. FRIBEE white board

3 ZigBee

ZigBee is an IEEE 802.15.4 standard for data communication among business and consumer devices. It was developed to provide low-power wireless connectivity for a wide range of network applications related to monitoring and control. ZigBee and ZigBee PRO are mesh communication protocols that are based on IEEE 802.15.4. ZigBee PRO is an improved version of the original ZigBee protocol, providing a number of additional features that are particularly useful for extremely large networks (which may include hundreds or even thousands of nodes).
XBee and XBee Pro, also known as XBee S1, are product names for radio communication modules produced by Digi International. The XBee and XBee Pro modules provide an easy-to-implement solution and are used for peer-to-peer and star networks. In contrast to XBee S1, XBee S2 is a module used for mesh networking and requires a more elaborate network configuration (Fig. 2).

Fig. 2. XBee S1 and XBee S2

A ZigBee network defines three primary device roles, as follows [4]:


• ZigBee Coordinator (ZC): a single ZC is required for each ZigBee network. It has a
unique PAN ID and channel number. It initiates network formation, serves as an
802.15.4 PAN coordinator, and may serve as a router once a network is formed.

• ZigBee Router (ZR): this is an optional network component. It may be associated with ZC or with the previously associated ZR. It serves as a PAN coordinator, and participates in the multi-hop routing of messages.
• ZigBee End-device (ZED): this joins the ZC or ZR and is an optional network component; it acts as an 802.15.4 end device. It is utilized for extremely low power operations, and does not allow for associations or participations in routing.

4 ZigBee Network Topologies

Three types of network topologies that ZigBee supports are as follows: star topology,
peer-to-peer topology, and cluster tree, as shown in Fig. 3 [5].

Fig. 3. ZigBee topology

4.1 Star Topology


In the star topology, communication is established between devices and a single central
controller, called the PAN coordinator. The PAN coordinator may be mains powered
while the devices will most likely be battery powered. Applications that benefit from
this topology include home automation, personal computer (PC) peripherals, toys, and
games.

4.2 Peer-to-Peer Topology


In peer-to-peer topology, there is one PAN coordinator. In contrast to star topology,
any device can communicate with any other devices if they are within the range of each
other. A peer-to-peer network can be ad hoc, self-organizing and self-healing. Appli-
cations such as industrial control and monitoring, wireless sensor networks, and asset
and inventory tracking would benefit from such a topology.

4.3 Cluster-Tree Topology


As a method to overcome the communication distance limitation of star topology, the
cluster-tree topology connects and configures several star topologies. Therefore, it is
used for a wider area. It generally consists of one coordinator and routers that each form a subtree connected to the coordinator.

5 Fine Dust Sensor

A dust sensor is used to detect house dust, cigarette smoke, etc. and is designed as a
sensor to automatically operate applications such as air purifiers and air conditioners
with air purifier functions.
The Sharp dust sensor GP2Y1010AU0F is an accurate and affordable device that is used for measuring smoke and dust particles. It is an optical dust sensor, also known as an optical air quality sensor. The Sharp dust sensor consumes little power and provides a highly reliable and stable output. A 220 µF capacitor and a 150 Ω resistor are used for pulse driving of the sensor LED [6] (Fig. 4).

Fig. 4. GP2Y1010AU0F sensor pin definitions

6 System Architecture and Implementation

6.1 System Architecture


This system architecture can be easily built into an IoT home automation system to be
applied for fine dust measurement at every location. As mentioned previously, this
system fully utilizes the advantages of the star topology. One coordinator node (remote
node) collects fine dust sensor information and two end nodes send fine dust infor-
mation to the coordinator node. The coordinator displays the two end node values in an Arduino serial monitor and is simultaneously paired with an Android application via Bluetooth (Fig. 5).

Fig. 5. System architecture

6.2 XBee Configuration


To build a star topology with XBee modules, each XBee must be configured with the X-CTU terminal. From the X-CTU setup, the XBee network topology is changed and tested. Every XBee has its own 64-bit serial number. If an XBee is within networking distance, other XBees can discover it through this serial number. The following figure shows the verification of the XBee serial number. By plugging an XBee USB adapter into a PC, a user can test the XBee serial number by clicking the Test/Query button in the PC Settings menu (Fig. 6).

Fig. 6. XBee configuration



The PAN ID is a 16-bit network address set dynamically by a coordinator. The factory default PAN ID of the XBee Series 1 module is 3332. If this value is changed manually, another PAN ID environment can be created. XBees in different PAN IDs cannot communicate with each other. In addition, a channel can contain several PAN IDs. Up to 16 channels are available, and the coordinator can choose a different channel number for the XBee network (Fig. 7).
network (Fig. 7).

Fig. 7. Relationship among Channel, PAN ID, and MY address in XBee network

As shown in the figure above, an XBee cannot communicate with other XBees if the channel or PAN ID is different, even when they are in the same place. Even when the channel number is the same, communication is impossible if the PAN ID differs. Therefore, the XBee network topology must be created under the same channel and PAN ID. In an XBee 1:N network, the MY address and the DL address must be written in the X-CTU modem configuration. In this addressing scheme, the MY address is written in the configuration of the node that sends data, and the DL address specifies the node that receives the data. The address range is 0–FFFF; 0 can be used by any node and FFFF is used for broadcasting.
In this project, in the X-CTU modem configuration of the end nodes and the remote node, the author uses the factory defaults for the PAN ID, MY address, and DL address because every node is on the same channel.
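Although the paper performs this configuration through the X-CTU graphical tool, the same parameters can also be written over a serial link with standard XBee AT commands. The following is only a minimal sketch under stated assumptions (a Python environment with pyserial, an XBee USB adapter appearing as /dev/ttyUSB0 at 9600 baud, and illustrative MY/DL values); it is not the procedure used by the author.

# Minimal sketch (not the author's X-CTU procedure): writing PAN ID, MY and DL
# over a serial link with standard XBee AT commands, using pyserial.
# The port name, baud rate and address values are illustrative assumptions.
import time
import serial

def send_at(ser, command):
    # Send one AT command terminated by a carriage return and return the module's reply.
    ser.write((command + "\r").encode("ascii"))
    return ser.read_until(b"\r").decode("ascii", errors="replace").strip()

with serial.Serial("/dev/ttyUSB0", 9600, timeout=2) as ser:
    time.sleep(1.1)                                # guard time before the escape sequence
    ser.write(b"+++")                              # enter AT command mode (no carriage return)
    time.sleep(1.1)                                # guard time after the escape sequence
    print(ser.read_until(b"\r").decode().strip())  # expect "OK"
    print(send_at(ser, "ATID3332"))                # PAN ID (factory default mentioned in the text)
    print(send_at(ser, "ATMY1"))                   # 16-bit source (MY) address of this node
    print(send_at(ser, "ATDL0"))                   # destination address low (the coordinator)
    print(send_at(ser, "ATWR"))                    # save the settings to non-volatile memory
    print(send_at(ser, "ATCN"))                    # exit command mode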
Figures 8 and 9 show the DustDensity1 and DustDensity2 end nodes (Arduino project names), respectively. Figure 10 shows the receive_test (Arduino project name) remote node, which serves as the coordinator. The two end nodes are powered by batteries, and the remote node is powered by a laptop.

Fig. 8. DustDensity1

Fig. 9. DustDensity2

Fig. 10. receive_test

6.3 Arduino Implementation


Three sketches, DustDensity1, DustDensity2, and receive_test, are created in the Arduino IDE and uploaded to the FRIBEE Arduino boards. DustDensity1 and DustDensity2 are end nodes and receive_test is the remote node. The following are parts of the DustDensity1 and DustDensity2 loop() functions that retrieve dust sensor readings and calculate dust density values.

digitalWrite(ledPower, LOW);            // turn the sensor's IR LED on (driven low in the reference circuit)
delayMicroseconds(samplingTime);        // wait until the sampling point of the LED pulse

voMeasured = analogRead(measurePin);    // read the sensor output on analog pin A0 (0-1023)

delayMicroseconds(deltaTime);
digitalWrite(ledPower, HIGH);           // turn the IR LED off again
delayMicroseconds(sleepTime);           // keep the LED off until the next pulse

calcVoltage = voMeasured * (5.0 / 1024.0);        // convert the 10-bit ADC reading to volts

dustDensity = (0.17 * calcVoltage - 0.1) * 1000;  // linear voltage-to-density conversion from the datasheet
Serial.println(dustDensity);                      // print the dust density on the serial monitor

voMeasured holds the analog value read from Arduino pin A0; calcVoltage is the voltage converted from voMeasured. Based on the Sharp GP2Y1010AU0F specification sheet [7], the author calculates “dustDensity” from this voltage and prints the dust density on a serial monitor.
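For illustration, with an assumed raw reading of voMeasured = 300, calcVoltage = 300 × (5.0/1024.0) ≈ 1.46 V and dustDensity = (0.17 × 1.46 − 0.1) × 1000 ≈ 149, the value that is then printed to the serial monitor and forwarded to the smartphone.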
Additionally, receive_test is the coordinator node code that receives fine dust density values from sensors located in different places in a home. Once the fine dust values from the end nodes are processed, receive_test sends them to an Android smartphone through Bluetooth.
if (Serial.available()) {                  // a byte has arrived from an end node via XBee
  byte data = Serial.read();               // read one byte from the XBee serial link
  Serial.print((char)data);                // echo the byte on the local serial monitor

  buffer[bufferPosition++] = data;         // accumulate the byte in the line buffer
  if (data == '\n') {                      // a complete dust density line has been received
    buffer[bufferPosition] = '\0';         // terminate the string
    // Send dust density to Smartphone
    btSerial.write(buffer, bufferPosition);
    bufferPosition = 0;                    // reset the buffer for the next line
  }
}

Fig. 11. Experimental results between two rooms

The following experimental results, shown in the Android application, are average fine dust values measured in two different locations. Figure 11 shows a test between two rooms. Figure 12 shows a test between a room and a kitchen. Because of indoor lights and a windy environment, the dust density values changed frequently. To obtain satisfactory results, the author calculates the average of 30 dust density values. Creating a dark place and closing windows may be required to measure consistent values.
Figure 11 shows the dust density values for the two rooms. To interpret the dust density values, the author referred to portal information. The portal fine dust information is as follows: 0–30 is good, 30–80 is average, 80–150 is poor, and above 150 is very poor. Therefore, one experimental result was average and the other was poor. While this experiment was conducted, the outside dust density value was 51 at approximately 3 pm. In contrast to the outside value, the indoor dust density was higher.

Fig. 12. Experimental results between a room and a kitchen

Figure 12 shows the dust density values for one room and a kitchen. The room
values were average, but the kitchen values were poor. The outside dust density value
was 67 at approximately 5 am. Hence, the author assumes that cooking might have
contributed to higher values regardless of the outside dust density values.

7 Conclusion and Future Work

Compared with existing research, a method to measure indoor fine dust values efficiently and at a lower cost was presented herein. By developing a star topology with XBee on Arduino, a user at home can easily measure the fine dust density value at each home location. Moreover, the XBee nodes can be placed according to user preferences. Two fine dust density values were collected by the Arduino coordinator and synchronized to an Android smartphone. The primary result indicated that, despite the cheap IoT platforms and sensors, these methods provided accurate results under different lighting and air conditions. Thus, the author suggests that indoor inhabitants read the dust density values from this system rather than relying on the outdoor fine dust density information on portals.
As IoT develops, context-aware communication and computing become fundamental to the IoT paradigm. Context-aware computing pertains to sensing the environment and context, and adapting behavior accordingly in IoT systems. Furthermore, current IoT industrial market products increasingly incorporate context-awareness using machine-learning and other AI algorithms.
In future research, machine-learning technologies and IoT will be evaluated in home IoT environments more thoroughly. In addition, the author plans to use machine-learning algorithms for fine dust alert systems, because the current proposed system only displays fine dust density values in a simple Android UI. The author also plans to explore more complex network architectures, such as the mesh topology, and to test them in other indoor places. The results of this future research are expected to render indoor fine dust monitoring systems extremely useful for indoor inhabitants’ health and lifestyles.

References
1. Giovanni, F., Raffaele, S., Imma, T., Roberto, C., Giorgio, V.: Polluino: an efficient cloud-
based management of IoT devices for air quality monitoring. In: IEEE 2nd International
Forum on Research and Technologies for Society and Industry Leveraging a better tomorrow
(RTSI) (2016)
2. Seung-Il, J., Eun-Ki, L.: Implementation of improved functional router using embedded Linux
system. In: The Institute of Electronics and Information Engineers Conference (2017)
3. Wook-Jin, C.: FRIBEE white board (2019). https://fribot.com/goods/view?no=66
4. Tareq, A.: A survey on environmental monitoring systems using wireless sensor networks.
J. Netw. 10, 606–615 (2016)
5. Priyanka, K.: A review on wireless networking standard-Zigbee. Int. Res. J. Eng. Technol.
(2016)
6. Ravi, K., Borade, S.: MQTT based air quality monitoring. In: 2017 IEEE Region 10
Humanitarian Technology Conference (2017)
7. Chris, N.: Air Quality Monitoring (2012). http://www.howmuchsnow.com/arduino/airquality/
Opinion Mining in Consumers Food Choice
and Quality Perception

Alessandra Amato1, Giovanni Cozzolino1(&), and Marco Giacalone2


1
University of Naples “Federico II”, via Claudio 21, 80125 Naples, Italy
{alessandra.amato,giovanni.cozzolino}@unina.it
2
Vrije Universiteit Brussel & LSTS, Vrije Universiteit Brussel,
Pleinlaan 2 4B304, 1050 Brussels, Belgium
Marco.giacalone@vub.ac.be

Abstract. In this work we present a system for the automatic analysis of text comments related to food products. Systematic analysis means allowing an analyst to see at a glance all the aggregated data and results needed to summarize the meaning of hundreds or thousands of comments written in natural language. The analysis of the comments, and therefore of the choices of the consumers, can thus constitute an asset of very high value for the companies of the sector.
To this aim we implemented a system, developed in Python. It uses state-of-the-art libraries for processing texts written in natural language, since the messages collected in the food domain are written in Italian.

Keywords: Algorithm · Food · Data analysis

1 Introduction

It is now established practice to publish online comments on the places visited, the
services offered and the products purchased. This information is a sounding board that
feeds data for marketing companies. Attention to user comments is particularly high in
the domain of food-related products.
The business around food amounts to billions of dollars.
In this work we present a system for the automatic analysis of text comments related to food products. Systematic analysis means allowing an analyst to see at a glance all the aggregated data and results needed to summarize the meaning of hundreds or thousands of comments written in natural language. The analysis of the comments, and therefore of the choices of the consumers, can thus constitute an asset of very high value for the companies of the sector.
To this aim we implemented a system for sentiment analysis, developed in Python. It uses state-of-the-art libraries for processing texts written in natural language, since the messages collected in the food domain are written in Italian.


In the case study built to validate the analyzer, the system retrieves some posts from Twitter (through the proper APIs) and performs sentiment analysis operations on them.

2 Twitter’s Streaming Processing

The tools used in this system are:


• Twitter
• Python
• Spacy
Twitter’s streaming API allows users to keep track of tweets on a specific topic by
monitoring the keywords defined by the user.
Users can access all tweets that contain a keyword (provided the volume of those tweets is less than 1% of the total flow).
However, monitoring a topic using a keyword has two important disadvantages:
1. Tweets often contain a number of elements to be discarded that disturb the analysis of information (such as the presence of links);
2. The user often does not think of all the keywords (or the most useful ones) a priori.
Our system, therefore, aims to create a streaming interface that allows the user to obtain an optimized flow that maximizes the number of relevant tweets from the stream.
Given a set of keywords selected by the user, an initial stream is produced [2–4].
The active learning component classifies the tweets as relevant or irrelevant and
simultaneously presents the user with the tweets for manual annotation.
Only tweets for which the system is most uncertain are selected for manual
annotation. A second component proposes new keywords based on co-occurrence in
the tweet text.
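The implementation of this active learning step is not reproduced in the paper; the following is only a minimal sketch of the underlying idea (uncertainty sampling with a probabilistic classifier). The scikit-learn pipeline, the TF-IDF features and the function name are assumptions, not the authors' code.

# Minimal sketch of uncertainty sampling for the active learning step described above.
# The classifier, the TF-IDF features and the function name are assumptions, not the
# authors' implementation.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def most_uncertain_tweets(labeled_texts, labels, unlabeled_texts, k=10):
    # Return the k unlabeled tweets whose relevant/irrelevant prediction is closest to 0.5.
    vectorizer = TfidfVectorizer()
    X_train = vectorizer.fit_transform(labeled_texts)
    X_pool = vectorizer.transform(unlabeled_texts)

    clf = LogisticRegression(max_iter=1000).fit(X_train, labels)
    proba = clf.predict_proba(X_pool)[:, 1]        # P(relevant) for each pooled tweet
    uncertainty = np.abs(proba - 0.5)              # 0 means maximally uncertain
    order = np.argsort(uncertainty)[:k]            # indices of the k most uncertain tweets
    return [unlabeled_texts[i] for i in order]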
The initial operation that we propose concerns the selection of all the tweets related to a particular hashtag chosen by the user. This process [5–10] is possible using the API provided by Twitter itself, which allows the programmer to interact with individual posts.
By Application Programming Interface (API) we mean a set of procedures designed to perform a specific task. An API acts as a ‘black box’.
The connection strings for Twitter are reported in the code shown in Fig. 1.

Fig. 1. Screenshot of the connection strings



3 Opinion Analysis

Once the Twitter API access has been requested and obtained, it is necessary to decide which tweets, among all those available, to consider. This is done by selecting only the posts that contain a certain hashtag, using a Python script in which the reference language and the number of tweets that the user wants to take into consideration must be specified [11–14].
In Fig. 2 we report a script for the selection of 50 tweets characterized by the hashtag
“food”.

Fig. 2. Script for the selection of tweets belonging to a given topic
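Since Figs. 1 and 2 are screenshots, the corresponding steps are sketched below using the tweepy library (3.x style API), with placeholder credentials; the library choice and the variable names are assumptions rather than a reproduction of the authors' code.

# Minimal sketch of the connection and hashtag-selection steps (Figs. 1 and 2).
# The tweepy library (3.x style API) and the placeholder credentials are assumptions;
# they are not reproduced from the authors' screenshots.
import tweepy

CONSUMER_KEY = "..."          # placeholder credentials obtained from the
CONSUMER_SECRET = "..."       # Twitter developer portal
ACCESS_TOKEN = "..."
ACCESS_TOKEN_SECRET = "..."

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth, wait_on_rate_limit=True)

# Select 50 tweets containing the hashtag "food", as in Fig. 2.
tweets = [status.text
          for status in tweepy.Cursor(api.search, q="#food", lang="en").items(50)]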

Before explaining how sentiment analysis was performed on the tweets retrieved through the methods reported in the previous section, we summarize some key points about this kind of analysis.
The term Sentiment Analysis indicates the field of natural language processing that deals with building systems for the identification and extraction of opinions from text [15–19].
It is based on the main methods of computational linguistics and textual analysis.
Existing approaches to sentiment analysis can be grouped into 4 main categories [1]:
1. Keyword spotting: classifies the text into affect categories based on the presence of influential but ambiguous words such as happy, sad, afraid, and bored;
2. Lexical affinity: it does not only aim to detect influential words, but also has the task of assigning to words, in an arbitrary manner, a probable association with particular emotions;
3. Statistical methods: these rely on elements taken from machine learning methods, such as latent semantic analysis, support vector machines, bag of words and semantic orientation. In order to extract opinions in a given context and obtain their characteristics, the grammatical relations of words are used;
4. Conceptual-level techniques: these kinds of approaches rely on knowledge representation tools such as ontologies and semantic networks, and aim to detect semantics that are expressed in a subtle way.
The import commands for the sentiment analysis modules are shown in Fig. 3.

Fig. 3. Importing commands for sentiment analysis modules



Once the tweets are obtained, they have to be analysed to extract opinions. For this operation we used Python procedures, which allowed us, through proper scripts, to obtain sentiment information about them.
In Fig. 4 the script for performing the sentiment analysis is reported.

Fig. 4. Script for sentiment analysis

At this point, the last part of the program calls the analysis function for each tweet, previously retrieved and saved in an array, printing the text and the result, as reported in Fig. 5.

Fig. 5. Script for storing and displaying the results.
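The scripts of Figs. 4 and 5 are shown only as screenshots. A minimal sketch of the same flow is given below, using TextBlob's polarity score as one possible sentiment back-end; this library choice and the thresholds are assumptions, not necessarily what the authors used.

# Minimal sketch of the per-tweet sentiment step shown in Figs. 4 and 5. TextBlob's
# polarity score and the thresholds are assumptions, not necessarily the library
# behind the authors' screenshots.
from textblob import TextBlob

def analyze_sentiment(text):
    # Map TextBlob's polarity in [-1, 1] to a coarse label.
    polarity = TextBlob(text).sentiment.polarity
    if polarity > 0.05:
        return "positive"
    if polarity < -0.05:
        return "negative"
    return "neutral"

# 'tweets' stands for the array of tweet texts collected in the previous section.
tweets = ["I love this pasta recipe", "This dish was terrible"]
for text in tweets:
    print(text, "->", analyze_sentiment(text))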

4 Opinion Mining in Consumers Food Choice

The developed Python program aims to analyze the documents of interest, in our case a series of comments related to a list of selected food products, published on one of the main social networks (Twitter). The input texts are taken from publicly accessible cooking pages available on the web, in order to evaluate each product according to the tenor of the comments, whether positive or negative. In the end, based on the score received from the comments, the product is assigned a number of stars, up to a maximum of 5 in the case of all positive comments.
Using spacy and the PyCharm IDE we have implemented the code reported in
Fig. 6.

Fig. 6. Screenshot of the code fragment reporting the rate of the comments

We imported the spaCy library and the model required for recognizing the characteristics of the Italian language, in which the comments are written, in order to process the comments written in natural language [20–25].
The score variable keeps track of the score achieved by the food product. We count the number of comments and increase the score depending on the expressions found through the command token.text.rfind:
For each “Congratulations” (complimenti) and “Very good” (buonissimo) found in the file there is an increase of 0.75; “very good” (molto buono) and “very good recipe” (ricetta molto buona) correspond to the maximum score for a comment, that is 1; “goodness” (bontà) corresponds to 0.65; “wonder” (meraviglia) to 0.8; “I don’t like” (non mi piace) to 0.25; “not bad” (non male) to 0.5; “to try” (da provare) to 0.6; “very bad” (pessimo) to 0. The original Italian expressions are reported in brackets.
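Since Fig. 6 is a screenshot, a minimal sketch of the keyword-weighting rule described above is given below. The weights come from the text; how multiple expressions within a single comment are combined, and the final conversion to a 0–5 star scale, are assumptions.

# Minimal sketch of the keyword-weighting rule described above. The weights come
# from the text; how expressions are combined within one comment and the final
# conversion to a 0-5 star scale are assumptions.
WEIGHTS = {
    "complimenti": 0.75,           # "Congratulations"
    "buonissimo": 0.75,            # "Very good"
    "molto buono": 1.0,            # "very good"
    "ricetta molto buona": 1.0,    # "very good recipe"
    "bontà": 0.65,                 # "goodness"
    "meraviglia": 0.8,             # "wonder"
    "non mi piace": 0.25,          # "I don't like"
    "non male": 0.5,               # "not bad"
    "da provare": 0.6,             # "to try"
    "pessimo": 0.0,                # "very bad"
}

def rate_product(comments):
    # Average the per-comment keyword scores and map them to a 0-5 star rating.
    score, counted = 0.0, 0
    for comment in comments:
        text = comment.lower()
        matches = [w for phrase, w in WEIGHTS.items() if text.rfind(phrase) != -1]
        if matches:
            score += max(matches)    # keep the strongest matching expression
            counted += 1
    return round(5 * score / counted, 1) if counted else None

print(rate_product(["Complimenti, ricetta molto buona!", "Non male, da provare"]))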
Fig. 7 reports the snippet of code used for printing the resulting score.

Fig. 7. Screenshot of the code fragment reporting the resulting score



In Fig. 8 a set of fragments of the analyzed comments is reported.

Fig. 8. Screenshot of the fragments of comments under analysis.

5 Conclusions

In this work we presented a system for the automatic analysis of text comments related to food products. Systematic analysis means allowing an analyst to see at a glance all the aggregated data and results needed to summarize the meaning of hundreds or thousands of comments written in natural language. The analysis of the comments, and therefore of the choices of the consumers, can thus constitute an asset of very high value for the companies of the sector.
To this aim we implemented a system, developed in Python. It uses state-of-the-art libraries for processing texts written in natural language, since the messages collected in the food domain are written in Italian.
Through this system it is possible to analyze the reactions of users to certain posts, selecting those that received positive feedback, and also to propose to each user content that is interesting for them.

Acknowledgments. This work was co-funded by the European Union’s Justice Programme
(2014–2020), CREA Project, under grant agreement No. 766463.

References
1. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 3rd edn. Pearson
Education Limited, Malaysia (2009)
2. Chianese, A., Marulli, F., Piccialli, F.: Cultural heritage and social pulse: a semantic
approach for CH sensitivity discovery in social media data 2016. In: IEEE Tenth
International Conference on Semantic Computing (ICSC), pp. 459–464 (2016)
3. Amato, F., Moscato, V., Picariello, A., Piccialli, F., Sperlí, G.: Centrality in heterogeneous
social networks for lurkers detection: An approach based on hypergraphs. Concurr. Comput.:
Pract. Exp. 30(3), e4188 (2018)
4. Hussain, S., Keung, J., Khan, A.A., Ahmad, A., Cuomo, S., Piccialli, F., Jeon, G.,
Akhunzada, A.: Implications of deep learning for the automation of design patterns
organization. J. Parallel Distrib. Comput. 117, 256–266 (2018)
5. Coppolino, L., D’Antonio, S., Mazzeo, G., Romano, L., Sgaglione, L.: Exploiting new CPU
extensions for secure exchange of eHealth data at the EU level. In: 2018 14th European
Dependable Computing Conference (EDCC), Iasi, pp. 17–24 (2018). https://doi.org/10.
1109/edcc.2018.00015
6. Coppolino, L., D’Antonio, S., Mazzeo, G., Romano, L.: A comparative analysis of emerging
approaches for securing java software with Intel SGX. Futur. Gener. Comput. Syst. 97, 620–
633 (2019). ISSN 0167-739X. https://doi.org/10.1016/j.future.2019.03.018
7. Mazzeo, G., Coppolino, L., D’Antonio, S., Mazzariello, C., Romano, L.: SIL2 assessment of
an active/standby COTS-based safety-related system. Reliab. Eng. Syst. Saf. 176, 125–134
(2018). ISSN 0951-8320. https://doi.org/10.1016/j.ress.2018.04.009
8. Cilardo, A., Barbareschi, M., Mazzeo, A.: Secure distribution infrastructure for hardware
digital contents. IET Comput. Digit. Tech. 8(6), 300–310 (2014)
9. Amelino, D., Barbareschi, M., Cilardo, A.: An IP core remote anonymous activation
protocol. IEEE Trans. Emerg. Top. Comput. 6(2), 258–268 (2016)
10. Cilardo, A., et al.: An FPGA-based key-store for improving the dependability of security
services. In: 10th IEEE International Workshop on Object-Oriented Real-Time Dependable
Systems. IEEE (2005)
11. Amato, F., Moscato, F., Moscato, V., Colace, F.: Improving security in cloud by formal
modeling of IaaS resources. Futur. Gener. Comput. Syst. 87, 754–764 (2018). https://doi.
org/10.1016/j.future.2017.08.016
12. Di Lorenzo, G., Mazzocca, N., Moscato, F., Vittorini, V.: Towards semantics driven
generation of executable web services compositions. J. Softw. 2(5), 1–15 (2007). https://doi.
org/10.4304/jsw.5.1.1-15
13. Moscato, F., Aversa, R., Di Martino, B., Rak, M., Venticinque, S., Petcu, D.: An ontology
for the cloud in mOSAIC. In: Cloud Computing: Methodology, Systems, and Applications,
pp. 467–485 (2017). https://doi.org/10.1201/b11149
14. Aversa, R., Di Martino, B., Moscato, F. Critical systems verification in MetaMORP(h)OSY.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial
Intelligence and Lecture Notes in Bioinformatics). LNCS (LNAI and LNB), vol. 8696,
pp. 119–129. Springer (2014). https://doi.org/10.1007/978-3-319-10557-4_15
15. Albanese, M., Erbacher, R.F., Jajodia, S., Molinaro, C., Persia, F., Picariello, A., Sperlì, G.,
Subrahmanian, V.S.: Recognizing unexplained behavior in network traffic. In: Network
Science and Cybersecurity, pp. 39–62. Springer, New York (2014)
16. Casillo, M., Clarizia, F., Colace, F., Lombardi, M., Pascale, F., Santaniello, D.: An approach
for recommending contextualized services in e-tourism. Information 10(5), 180 (2019)

17. Amato, F., Cozzolino, G., Sperlì, G.: A hypergraph data model for expert-finding in
multimedia social networks. Inf. (Switzerland), 10(6) (2019). Article no. 183
18. Amato, F., Moscato, V., Picariello, A., Sperli’ì, G.: Extreme events management using
multimedia social networks. Futur. Gener. Comput. Syst. 94, 444–452 (2019)
19. Amato, F., Moscato, V., Picariello, A., Piccialli, F.: SOS: a multimedia recommender system
for online social networks. Futur. Gener. Comput. Syst. 93, 914–923 (2019)
20. Clarizia, F., Colace, F., Lombardi, M., Pascale, F., Santaniello, D.: Chatbot: an education
support system for student. In: International Symposium on Cyberspace Safety and Security,
pp. 291–302. Springer, Cham (2018)
21. Colace, F., De Santo, M., Greco, L., Napoletano, P.: A query expansion method based on a
weighted word pairs approach. In: Proceedings of the 3rd Italian Information Retrieval (IIR),
vol. 964, pp. 17–28 (2013)
22. Colace, F., De Santo, M.: Adaptive hypermedia system in education: A user model and
tracking strategy proposal. In: 2007 37th Annual Frontiers in Education Conference-Global
Engineering: Knowledge Without Borders, Opportunities Without Passports, pp. T2D–18.
IEEE, 2007 October
23. Amato, F., Moscato, F., Xhafa, F.: Generation of game contents by social media analysis and
MAS planning. Comput. Hum. Behav. (2019)
24. Amato, F., Cozzolino, G., Moscato, V., Moscato, F.: Analyse digital forensic evidences
through a semantic-based methodology and NLP techniques. Futur. Gener. Comput. Syst.
98, 297–307 (2019)
25. Amato, F., Cozzolino, G., Mazzeo, A., Moscato, F.: Detect and correlate information system
events through verbose logging messages analysis. Computing 101(7), 819–830 (2019)
A Model for Human Activity Recognition
in Ambient Assisted Living

Wagner D. do Amaral1(B) , Mario A. R. Dantas2 , and Fernanda Campos2


1
School of Science, Engineering and Information Technology,
Federation University Australia, Brisbane, Australia
wagnerdaufenbachdoamaral@students.federation.edu.au
2
Department of Computer Science, Federal University of Juiz de Fora,
Juiz de Fora, Brazil
{mario.dantas,fernanda.campos}@ice.ufjf.br

Abstract. This work presents a model for human activity recognition,


through an IoT paradigm, using location and movement data, generated
from an accelerometer. The activities of five individuals from different
age groups were monitored, utilizing IoT devices, using the activities
of four of these individuals to train the model and the activities of the
remaining individual for test data. For the prediction of the activities,
the Extra Trees algorithm was used, where the results of 81.16% accuracy
were obtained when only movement data were used, 92.59% when using
both movement and location data, and 97.56% when using movement
data and synthetic location data.

1 Introduction

According to 2013 projections from the IBGE (Brazilian Institute for Geogra-
phy and Statistics), the elderly people in Brazil in 2016 would represent approx-
imately 12% of the population of the country. By 2060, the institute projects
that the population of the elderly will represent almost 34% of the population
of Brazil. In absolute numbers, it is estimated that the number of elderly people will practically triple by 2060 [1].
This great growth of the elderly population creates both the opportunity and
the need to establish conditions that may guarantee them a life with quality and
independence. Based on this scenario, technologies that monitor individuals (IoT
devices) in their homes have emerged – the so-called Ambient Assisted Living.
Although human activity recognition systems reinforce the notion that the main
target audience is the elderly population, it is important to note that this kind of
system, as well as the proposed model, can be useful for the entire population in
general, regardless of their age group or other characteristics that define them.
An Ambient Assisted Living environment consists of a heterogeneous set of wearable and environment sensors (IoT devices) that generate large volumes of data. These
environments use a variety of techniques to detect abnormalities; however, such
abnormalities are usually detected only when they are actually occurring, which

is often linked to an advanced stage of a disease. In this type of system, it is


not possible to predict anomalies in a timely manner, which makes it virtually
impossible to take preventive actions in order to avoid a critical situation [2].
By storing the spatial data of a monitored individual inside their residence, as
well as their movements, we can infer their daily activities and thus understand
their individual routine, detecting more accurately, and in advance, possible
anomalies in their health.
The current research proposes a human activity recognition model, based on
location and movement data. We have chosen to use devices of low cost and
small dimensions, focusing on being a non-intrusive model, thus guaranteeing a
life with more freedom and quality for the monitored individual. This research
is a study conducted in a progressive way; partial results of it are discussed in
[3].
This work is organized as follows: in Sect. 2, we perform an analysis of the
related work; in Sect. 3, the proposal is presented and detailed; the environment
and the obtained results are presented in Sect. 4; finally, the conclusion and
future work are presented in Sect. 5.

2 Related Work

In recent years, the subject of assisted environments has been brought up by


many articles. Since it is a relatively new concept, few standards can be found
in the literature. Some works, such as [4] or [5], use only a single accelerometer
to recognize activities. Other ones, such as [6,7] or [8], use several sensors for
the same purpose.
Wearable sensors are widely used in this area, even though other approaches
may also be found in the literature, such as using images for activity recognition.
This is the case of studies like [9].
When using wearable sensors, it is important to emphasize that the body
position where the device will be placed is of great importance. Although there
is also no standard on where the sensors should be positioned, some of the most
common places are: chest, wrist, ankle and waist. In [10], a study is conducted
in order to determine the best positioning for wearable sensors.
In [11], the identification of human activities is done through a single triaxial
accelerometer, using the concept of dynamic window sizing. The classification
system consists of three classifiers, each of which is implemented as a decision
tree. Their work, which uses only movement data, has reached an average pre-
cision of 95.4%.
Chevalier [12] proposes a model that uses a public data set of human activities, which is made available in [13]. Compared to the previously mentioned models, this one proposes the use of a smartphone, making use of both its accelerometer and gyroscope to generate data, with the waist as the sensor location and Recurrent Neural Networks as the classification method. The system’s average accuracy stood at 91.55%.

2.1 Comparison of the Approaches

Table 1 presents a comparison between the approaches used in some of the papers
that were selected as related work.

Table 1. Comparison of research studies

Study                  Sensor                                          Sensor location          Classification method
Mario [4]              Accelerometer                                   Waist                    Convolutional neural network
Choi et al. [5]        Accelerometer                                   Non-dominant wrist       Support vector machine
Dwiyantoro et al. [6]  Accelerometer and gravity sensor (smartphone)   Pants front pocket       Dynamic time warping and K-nearest neighbors
Murao and Terada [7]   Multiple accelerometers                         Wrists, ankles and hip   Dynamic time warping and support vector machine
Kim et al. [8]         Accelerometer and gyroscope (smartphone)        Waist                    Hidden Markov model

3 Proposal
As life expectancy keeps increasing, focused solutions for Ambient Assisted Liv-
ing environments arise with the objective of improving the quality of life and
guaranteeing independence for the elderly.
This work proposes a model for human activity recognition, monitoring indi-
viduals within their homes. The data to be analyzed in this prediction refer to
positioning and movement.
Figure 1 shows an overview of the proposed model, which utilizes the IoT
technology. The figure depicts every process within the model, from the input
of data – which are captured from sensors – to the prediction of the activity
performed by the monitored individual.
The main goal of this work is to propose a model that is capable of accurately
inferring what activity the monitored individual is performing, based on their
location and movement data.

Fig. 1. Overview of the model

3.1 Obtaining Data

Knowing that an individual’s activities are strongly linked to their in-house


location [14], the proposed model incorporates the identification of the indoor
location of the monitored subject.
The movement of the individual is identified from a single accelerometer
attached to their body. According to [10], the best results in domestic activity identification were found when the wearable sensor was attached to the waist or wrist. Since the model should be as unintrusive as possible, we chose to place the sensor on the individual’s wrist for this research. For the location data of the individual within the environment, Bluetooth devices will be distributed throughout the environment to map the signal intensity sent by the wearable sensor.

3.1.1 Movement
In order to obtain movement data, the individual will carry a wearable sen-
sor which will be comprised of a TinyDuino, a BLE shield for communication
between the sensor and the computer system, and an accelerometer shield. The
main board and the shields can be seen in the Fig. 2. The dimensions of the
assembled sensor are 20 mm × 20 mm × 16.8 mm.

Fig. 2. TinyDuino, BLE shield and accelerometer shield



3.1.2 Location
For the identification of an individual’s location, Bluetooth devices will be dis-
tributed throughout the environment. These devices will be responsible for ver-
ifying the signal strength of the BLE carried by the person. The location of the
monitored individual will be defined according to the signal strength measured
by the Bluetooth devices.
In this step, it is necessary to train the model to recognize the rooms of the house. In order to do so, one must wear the sensor and walk around the house, manually labeling each room.
The process of identifying the distinct areas of the environment is described
as follows:

1. Every second, each of the Bluetooth devices will store data regarding the
signal intensity of the wearable sensor in a database;
2. The computational system will be responsible for reading the last 3 sets of
data of each Bluetooth device, and for calculating the average intensity of
each device;
3. The average intensity of each device will be used as an index for the prediction
of the location;
4. The K-NN algorithm will be used to predict the location (a minimal sketch is given after this list);
5. The prediction will be stored in the database.
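A minimal sketch of steps 2–4 is given below, assuming that the last readings of each Bluetooth device have already been loaded from the database and using scikit-learn's KNeighborsClassifier; the data layout, the signal-intensity values and the value of k are illustrative assumptions.

# Minimal sketch of steps 2-4 above: averaging the last three readings of each
# Bluetooth device and predicting the room with K-NN. The data layout, the
# signal-intensity values and the choice of k are illustrative assumptions.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def average_intensity(last_readings):
    # last_readings: the last 3 signal-intensity values reported by one Bluetooth device.
    return sum(last_readings) / len(last_readings)

# Training data collected while walking through the house with the wearable sensor:
# one row of averaged intensities per Bluetooth device, labeled with the room name.
X_train = np.array([[-48.0, -70.3, -82.1],    # office
                    [-75.7, -51.2, -79.4],    # bedroom
                    [-81.3, -78.8, -49.6]])   # living room
y_train = ["office", "bedroom", "living room"]

knn = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)

# Steps 2-3: average the last 3 readings of each device, then predict the room (step 4).
latest = [[-50, -47, -49], [-72, -70, -71], [-80, -83, -81]]
current = [average_intensity(device_readings) for device_readings in latest]
predicted_room = knn.predict([current])[0]    # step 5 stores this prediction in the database
print(predicted_room)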

3.2 Human Activity Recognition

For the recognition of activities, the Extra Trees algorithm, implemented by


Scikit-learn [15], was selected. 80% of the available data will be used for training
and 20% for testing. To evaluate the model, a cross-validation (k-fold ) will be
performed within the training data set, with 10 subsets.
For the accelerometer data, we have defined a 5-second window with a 3-
second overlap, except for the last window, which will be fitted to the remaining
available time. For example, if we have data regarding 10 s of accelerometer
monitoring, we will have the following time windows: [0 s–5 s], [2 s–7 s], [4 s–9 s],
[5 s–10 s].
Since the chosen algorithm only works with numeric data, it is necessary to
convert the locations into this data type; in order to perform such conversion,
we opted to use binary coding. Moreover, since there are 7 environments in the
house, 3 new attributes would have to be created.
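A minimal sketch of the windowing and classifier set-up described above is given below. The placeholder feature matrix, the number of estimators and the random seeds are assumptions used only to make the fragment self-contained; the window generator reproduces the 5 s / 3 s-overlap scheme, including the fitted last window.

# Minimal sketch of the 5 s window / 3 s overlap segmentation and the Extra Trees
# set-up described above. The placeholder feature matrix, number of estimators and
# random seeds are assumptions used only to make the fragment self-contained.
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import cross_val_score, train_test_split

def window_bounds(total_seconds, size=5, step=2):
    # Yield [start, end] windows of 'size' seconds that overlap by 3 s (step of 2 s);
    # the last window is fitted to the remaining time: 10 s -> [0-5], [2-7], [4-9], [5-10].
    start = 0
    while start + size < total_seconds:
        yield start, start + size
        start += step
    yield total_seconds - size, total_seconds

# X: one feature vector per window (the attributes of Sect. 3.2.1); y: activity category.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 31))      # placeholder feature matrix
y = np.arange(200) % 7              # placeholder labels for the seven activity categories

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = ExtraTreesClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X_train, y_train, cv=10)      # 10-fold cross-validation
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)                        # held-out 20% test split
print(list(window_bounds(10)), scores.mean(), accuracy)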

3.2.1 Attribute Selection


The selection of attributes to be used in the activity recognition algorithm was
based on the proposal of [12]. Each of the attributes is calculated taking into
account the data relating to 5-second accelerometer signals, with the exception
of the location attributes. The selected attributes are:

• Mean [X, Y, Z] - mean values of each axis;
• Standard deviation [X, Y, Z] - standard deviation of the values of each axis;
• Median [X, Y, Z] - median values of each axis;
• Maximum value [X, Y, Z] - maximum value of each axis;
• Minimum value [X, Y, Z] - minimum value of each axis;
• Energy [X, Y, Z] - given by the formula $\frac{\sum_{i=1}^{n} k_i^2}{n}$ for the values of each axis, where n is the quantity of values;
• Interquartile range [X, Y, Z] - the difference between the third and first quartile of the values of each axis;
• Entropy [X, Y, Z] - given by the formula $-\sum_{i=1}^{n} k_i \log(k_i)$ for the values of each axis, where n is the quantity of values;
• Signal magnitude area (X, Y, Z) - given by the formula $\frac{\sum_{i=1}^{n} (|x_i| + |y_i| + |z_i|)}{n}$, where n is the quantity of values in an axis;
• Correlation [(X, Y), (X, Z), (Y, Z)] - coefficient of correlation between the data of the axes, taken two by two;
• Place [x0, x1, x2] - binary codification regarding indoor location.
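A minimal sketch of how these attributes can be computed for a single 5-second window is given below; the random input arrays, the small constant added inside the logarithm, and the 3-bit location code are illustrative assumptions that follow the formulas above.

# Minimal sketch: computing the attributes listed above for one 5-second window.
# The random input arrays, the small constant inside the logarithm and the 3-bit
# location code are illustrative assumptions that follow the formulas above.
import numpy as np

def window_features(x, y, z, place_bits):
    # x, y, z: accelerometer samples of one window; place_bits: binary room code, e.g. (0, 1, 1).
    features = []
    for axis in (x, y, z):
        k = np.asarray(axis, dtype=float)
        q3, q1 = np.percentile(k, 75), np.percentile(k, 25)
        features += [
            k.mean(), k.std(), np.median(k), k.max(), k.min(),
            np.sum(k ** 2) / k.size,                   # energy
            q3 - q1,                                   # interquartile range
            -np.sum(k * np.log(np.abs(k) + 1e-12)),    # entropy, per the formula above
        ]
    sx, sy, sz = (np.asarray(a, dtype=float) for a in (x, y, z))
    features.append(np.sum(np.abs(sx) + np.abs(sy) + np.abs(sz)) / sx.size)   # signal magnitude area
    features += [np.corrcoef(sx, sy)[0, 1],            # pairwise axis correlations
                 np.corrcoef(sx, sz)[0, 1],
                 np.corrcoef(sy, sz)[0, 1]]
    features += list(place_bits)                        # binary indoor-location code
    return features

# Example: one 5-second window sampled at 50 Hz (250 samples per axis).
rng = np.random.default_rng(0)
fx = window_features(rng.normal(size=250), rng.normal(size=250), rng.normal(size=250), (0, 1, 1))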

4 Environments and Results


In this section, we present the experimental results of the proposed model. For
this stage, 5 individuals were monitored – two of them aged between 20 and 30
years, one aged between 30 and 40, and two aged 60 or more. The data of one
of the individuals aged between 20 and 30 years were used as test data, while
the data of the other individuals were used as training data. The decision using
different age ranges in the experiment lies on the idea briefly discussed previously
that the model can be useful for the entire population in general, regardless of
their age group or other characteristics that define them. Two scenarios have
been created with the collected data: first, only the movement data were used
for the activity recognition; then, in addition to the movement data, the indoor
location data were also used.
The environment in which the experiments were conducted has an area of
approximately 67 m2 , divided into the following rooms: living room, hallway,
office, kitchen, laundry room, restroom and bedroom. The computer system
(Raspberry Pi 3 Model B) was positioned in the kitchen, while the bluetooth
devices (three laptops) were positioned in the office, bedroom and living room.
In order to generate data from a wide variety of activities and locations –
reproducing the actual routine of an individual – a set of activities has been
defined. Activities such as walking and sweeping, for example, can be performed
in any room of the house, while tooth brushing is a restroom-exclusive activity.
The activities were divided into seven categories: high intensity, walking,
domestic activity, low intensity, eating, sleeping and personal hygiene. Each indi-
vidual had 2 min to perform each category of activities, and this time interval
was divided equally among the activities of each category. Naturally, categories
that contain less activities have more time for performing each activity, as is

the case of the high intensity category, which has only one activity and thus has
120 s to perform it. Categories that have more activities, on the other hand, have
less time for each activity, as is the case of the domestic activity category, which
has an average of 11 s for each activity, since it comprises 11 activities.

4.1 Scenarios
Scenario I presents the results for the tests where only movement data were used
for the recognition of human activity. Scenario II presents the results for the tests
where movement and location data were used. Scenario II was performed in two
different ways: in II(a) the location data from the proposed model are used,
whereas in II(b) synthetic location data are used, considering an accuracy of
100%, that is, all locations are correctly labeled. This decision has been made
due to low data precision in indoor locations where there were no Bluetooth
devices (hallway, restroom and laundry room). Rooms with Bluetooth devices
reached an accuracy of nearly 95%, while rooms without Bluetooth devices had
an accuracy close to 30%.

4.2 Cross-Validation

We have run tests using three different algorithms: Extra Trees, Gradient Boost-
ing and Random Forest. Table 2 presents a comparison of the accuracy of the
model – for the training data – using the cited algorithms.

Table 2. Cross-validation comparison

Algorithm          Scenario I  Scenario II(a)  Scenario II(b)
Extra Trees        95.32%      94.68%          99.35%
Gradient Boosting  94.94%      95.26%          98.05%
Random Forest      94.68%      95.58%          98.77%

4.3 Importance of the Attributes

After running some experimental tests, we have observed that complex attributes
(interquartile range, entropy, signal magnitude area and correlation) are the
ones that are less important for the model, whereas simpler attributes – such
as maximum value and minimum value – have a greater weight for the selection
process.
In scenarios II(a) and II(b), where location data were analyzed, it is observed
that attributes that are related to location play a significant role, having a great
weight in the selection process. In scenario II(b), where all locations were correct,
these attributes have, on average, a weight three times higher than the other
attributes.

Table 3. Model accuracy in each scenario

Scenario  Accuracy
I         81.16%
II(a)     92.59%
II(b)     97.56%

Fig. 3. Confusion matrices of scenarios I, II(a) and II(b)



4.4 Precision of the Model

Table 3 presents a comparison of the accuracy of the model in each scenario,


using test data.
Despite the high accuracy of the model in the cross-validation for scenario
I (95.32%), a considerable accuracy drop can be observed when the model is
applied to test data, obtaining an accuracy of 81.16%. An interesting point to
note is that the category with the highest error rate – personal hygiene – is
strongly linked to a specific indoor location: the restroom.
With the addition of location data in both scenarios II(a) and II(b), there was
a considerable decrease in classification errors regarding the personal hygiene
class, reaching 0% in scenario II(b). The complete confusion matrices for the
scenarios I, II(a) and II(b) are presented in the Fig. 3.

5 Conclusions and Future Work

Throughout this work it was clear that the existing literature related to the
context of human activity recognition in ambient assisted living, based on IoT
technology, lacks standards, in a way that a wide variety of approaches are used.
In this work, we have decided to use only one wearable sensor (IoT device) for
each monitored individual, and data regarding location and movement are sent
to the computer system for the prediction of activities.
The model proved to be effective for the recognition of human activities
for the test data, presenting 81.16% accuracy when using only movement data,
92.59% when using both movement and location data and 97.56% when move-
ment data and synthetic location data are used.
During the tests, some Bluetooth communication problems were identified,
mainly in the indoor localization system, which led to a low accuracy of this
system in rooms where there were no Bluetooth devices – hallway, restroom
and laundry room. To address this problem, using the same approach, it would
be necessary to install Bluetooth devices in all monitored environments, which
would be impractical for the general public. Even with the problem of communi-
cation, the importance of localization for the recognition of human activity has
been proven when using synthetic location data.
The positioning of the Bluetooth devices was also a key to the success of the
research. It was necessary to test various device arrangements in order to achieve
satisfactory results. Obstructions in the path between the Bluetooth devices and
the wearable sensor, however small, can have a major impact on signal strength
measurements.
We chose the accelerometer as the main sensor for the identification of activ-
ities mainly due to its low energy consumption and small dimensions, which
favors non-intrusiveness, guaranteeing a life with quality and independence for
the monitored individuals.
After analyzing the results of the research, we can suggest some improvements
that can be addressed in future works:

• Studying other communication technologies in order to reduce the impact of


communication failures on Bluetooth devices;
• Identifying other indoor location techniques that present more consistent
results. RFID-based techniques or Wi-Fi signal strength are especially rec-
ommended as possible solutions;
• Applying a Context Quality filter to define which data should be stored or
discarded. Regarding this concern, a similar approach to [16] should be imple-
mented;
• Implementing a system that is capable of generating the routine of the moni-
tored individual based on their activities and triggering alerts if such routine
is interrupted. One solution could be the notification recommender systems.

References
1. IBGE: Projeção da população do brasil por sexo e idade 2000-2060. Revisão 2013
(2013)
2. Forkan, A.R.M., Khalil, I., Tari, Z., Foufou, S., Bouras, A.: A context-aware app-
roach for long-term behavioural change detection and abnormality prediction in
ambient assisted living. Pattern Recognit. 48(3), 628–641 (2015)
3. Amaral, W.D., Dantas, M.: Um modelo de reconhecimento de atividades humanas
baseado no uso de acelerômetro com qoc. In: Workshop de Iniciação Cientı́fica
do WSCAD 2017 (XVIII Simpósio em Sistemas Computacionais de Alto Desem-
penho), pp. 45–50 (2017)
4. Mario, M.: Human activity recognition based on single sensor square HV accel-
eration images and convolutional neural networks. IEEE Sens. J. 19, 1487–1498
(2018)
5. Choi, H., Wang, Q., Toledo, M., Turaga, P., Buman, M., Srivastava, A.: Tempo-
ral alignment improves feature quality: an experiment on activity recognition with
accelerometer data. In: 2018 IEEE/CVF Conference on Computer Vision and Pat-
tern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, pp. 462–4628
(2018)
6. Dwiyantoro, A.P.J., Nugraha, I.G.D., Choi, D.: A simple hierarchical activity
recognition system using a gravity sensor and accelerometer on a smartphone.
Int. J. Technol. 7(5), 831–839 (2016)
7. Murao, K., Terada, T.: A combined-activity recognition method with accelerome-
ters. J. Inf. Process. 24(3), 512–521 (2016)
8. Kim, Y.J., Kang, B.N., Kim, D.: Hidden Markov model ensemble for activity recog-
nition using tri-axis accelerometer. In: 2015 IEEE International Conference on Sys-
tems, Man, and Cybernetics, pp. 3036–3041 (2015)
9. Maurer, U., Smailagic, A., Siewiorek, D.P., Deisher, M.: Activity recognition and
monitoring using multiple sensors on different body positions. In: Proceedings of
the International Workshop on Wearable and Implantable Body Sensor Networks,
BSN 2006, pp 113–116. IEEE Computer Society, Washington, DC (2006)
10. Atallah, L., Lo, B., King, R., Yang, G.Z.: Sensor positioning for activity recognition
using wearable accelerometers. IEEE Trans. Biomed. Circuits Syst. 5(4), 320–329
(2011)
11. Noor, M.H.M., Salcic, Z., Wang, K.I.-K.: Adaptive sliding window segmentation
for physical activity recognition using a single tri-axial accelerometer. Pervasive
Mob. Comput. 38(1), 41–59 (2017)

12. Chevalier, G.: LSTMS for human activity recognition (2016). https://github.com/
guillaume-chevalier/LSTM-Human-Activity-Recognition. Accessed 13 June 2017
13. Anguita, D., Ghio, A., Oneto, L., Parra, X., Reyes-Ortiz, J.L.: A public domain
dataset for human activity recognition using smartphones. In: 21th European Sym-
posium on Artificial Neural Networks. ESANN, Computational Intelligence and
Machine Learning (2013)
14. Zhu, C., Sheng, W.: Motion- and location-based online human daily activity recog-
nition (2013)
15. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O.,
Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A.,
Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine
learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
16. Nazário, D.C., Campos, P.J., Inacio, E.C., Dantas, M.A.R.: Quality of context eval-
uating approach in AAL environment using IoT technology. In: 2017 IEEE 30th
International Symposium on Computer-Based Medical Systems (CBMS), Thessa-
loniki, pp. 558–563 (2017)
Omniconn: An Architecture for
Heterogeneous Devices Interoperability
on Industrial Internet of Things

Bruno Machado Agostinho1 , Cauê Baasch de Souza1 ,


Fernanda Oliveira Gomes1 , Alex Sandro Roschildt Pinto1 ,
and Mario Antônio Ribeiro Dantas2(B)
1
Federal University of Santa Catarina, Florianópolis, Brazil
{bruno.agostinho,fernanda.gomes}@posgrad.ufsc.br, caue.bs@grad.ufsc.br,
a.r.pinto@ufsc.br
2
Federal University of Juiz de Fora, Juiz de Fora, Brazil
mario.dantas@ice.ufjf.br

Abstract. The increase of the number and different types of devices


within the Internet of Things context has brought several challenges
over time. One of them is how to support heterogeneous device interop-
erability in IoT environments. This work proposes an architecture, called
Omniconn, to tackle some of these communication issues. Utilizing the
microservice approach, tests were performed with devices communicat-
ing through the Zigbee, Bluetooth LE, and Wi-fi protocols. The results
were compared with tests using the same protocols in isolation. It was
possible to observe a low percentage of timeouts and invalid packets. Although in some cases the use of multiple protocols presented lower performance than the isolated experiments, using Omniconn in a testbed environment was considered feasible. On the other hand, the
majority of tests pointed out an enhancement in the average response
time and number of requests, which reached 26% and 12% respectively.

1 Introduction
Nowadays, there has been a significant increase in the number of devices in the Internet of Things (IoT) context. It is expected that IoT will be one of the
greatest revolutions since the Internet, bringing numerous opportunities world-
wide [12]. The growth in IoT also pushes some related areas such as VANETs,
Smart Cities, and Industrial IoT (IIoT).
According to [20], two of the most significant application challenges within
the context of IoT are the high heterogeneity and cooperation among millions
of distributed devices. The quantity of device types and protocols makes standardization very difficult to approach, and this may become even more complex over the years, due to the release of new devices and protocols. An approach that makes the integration of these devices possible is necessary.
In [8], the authors proposed to use a Service-Oriented Architecture (SOA)
to deal with interoperability issues. While service utilization is a promising

approach, a traditional SOA architecture may reduce some flexibility from the
technologies adopted. Using a different method, [6] proposed to use containers
and SDN technologies to deal with heterogeneous networks, but they did not
perform connectivity tests. In [14], the authors proposed the use of a gateway
to deal with heterogeneity in the communication between Zigbee and Bluetooth
LE devices, sending the data to the cloud. However, up to the moment of publication, they had not performed any validation of the environment.
The Omniconn proposal presents an approach using microservices to handle different types of devices and protocols. Rather than a traditional SOA approach, we decided to use microservices, taking advantage of their flexibility and independence from other services. Diverging from some of the proposals presented, Omniconn was designed to use and process locally collected data, enabling the use of its infrastructure within the context of Fog Computing. The work brings as a contribution the feasibility of an environment that supports the interoperability of heterogeneous devices, presenting the results obtained in the experimental environment to validate the proposal. Furthermore, this work has validated an approach using microservices to handle different devices and protocols and has implemented an Industrial IoT control architecture.
The structure of this work is organized as follows. The second section brings some relevant concepts related to the proposal. In the next section, we discuss some related works. In Sects. 3 and 4, the Omniconn architecture and the results obtained in the experimental environment are presented. The work ends with conclusions and future work.

2 Background
2.1 Industrial Internet of Things (IIoT)

The growth in the number of devices connected to the Internet in recent years,
and the interaction among them, in which everything can be connected and
considered a distinct object, is called the Internet of Things (IoT). IoT devices
can sense and actuate in the environment, and they also interact with other
devices and users (sharing information and making decisions) [2].
According to [3], IoT is a new and growing concept based on objects or things
that are pervasively distributed in an environment. These things can be sensors,
actuators, smartphones, and smart things (devices that can interact with users
through the Internet).
By 2020, 212 billion devices are expected to be connected to the Internet [2].
However, how these devices will interact, communicate, collect, and store data is
not a consensus among IoT researchers. Although IoT is a natural trend, standards
that properly integrate IoT devices still need to be defined.
IoT applications can be used in several practical contexts: medical wireless
sensor networks, smart homes, smart factories and smart farming. Thus, IoT is
a trend that will change the way users interact with their goods and facilities.

2.2 IoT Protocols


Even with the lack of consensus about the communication standards of IoT
devices, some wireless standards are becoming popular in IoT applications.
Application layer protocols are an alternative for IoT device integration.
The Constrained Application Protocol (CoAP) follows the REST paradigm and was
designed as a lightweight counterpart to HTTP. Message Queue Telemetry Transport
(MQTT) [10] is another application standard for IoT. It is based on the
publish/subscribe (producer/consumer) paradigm, can be described as a
many-to-many standard, and is energy efficient [15].
XMPP [16] and AMQP [10] are two of the many application protocols already
used in IoT applications. Besides these, many other protocols are used for
communication among IoT devices. Wireless communication protocols such as
Bluetooth Low Energy (BLE), Wi-fi (IEEE 802.11), and Zigbee are popular
solutions. Moreover, SigFox and LoRa/LoRaWAN [4] are IoT-specific communication
protocols.

2.2.1 Bluetooth Low Energy (BLE)


Bluetooth Low Energy (BLE) is a standard developed for short distance com-
munication, focusing on energy economy. It was developed by Bluetooth Special
Interest Group, and promises to be a low energy solution that achieves more
than ten years of lifetime [9]. According to [18], BLE is a standard that can
become popular in IoT applications.
BLE uses the 2.4 GHz band; however, it is not compatible with previous Bluetooth
protocol versions. Moreover, some devices operate in Dual Mode, using classic
Bluetooth and BLE. BLE devices are in sleep mode most of their lifetime, waking
only for message exchanges; thus, these devices consume less energy.

2.2.2 Zigbee
Zigbee [5] was developed to be a reliable, low-cost, and low-energy wireless
technology. Based on the IEEE 802.15.4 standard, it implements the network and
application layers. Devices remain in sleep mode most of their lifetime to save
energy.

3 Related Works
The challenges to be overcome within the context of IoT, Industry 4.0, and
protocol interoperability have been driving increasing research in this area. In
this section, we have selected some related works to compare and validate our
proposal.
In [8], the variety of protocols are managed by service-oriented drivers, taking
advantage of the concept of “service platform”, while the network is handled
locally on the platform. Although the solution seems to work well, using a
traditional SOA architecture can reduce the system's flexibility. In the
presented model, the authors suggest creating a centralized service platform,
which presents itself as a possible single point of failure.
The use of SDN technologies and Docker containers to solve the problems
related to heterogeneity using IoT devices was proposed by [6]. Experiments
were performed using an architecture with a centralized SDN controller. The
experimental environment has focused on the connectivity between the hetero-


geneous devices connected to the networks through the performance of different
traffic flows, with possible prioritization. At the time of publication, the authors
had not yet performed any tests related to device connection. Centralizing the
activities on an SDN device may prove to be a problem in case of failure. The
centralized communication and complexity of replacing an SDN controller device
can result in a long period offline.
In their work, [7] proposed the use of Mobile Ad-Hoc Networks (MANETs) to
accelerate the collection of data from WSNs (Wireless Sensor Networks). The
proposal explores the interaction opportunities allowed by standard protocols
to enable low-latency cross-network routing. The solution reduced the latency
of delivery of urgent data packets in all cases tested. The work presented an
integration of two different types of networks in an IoT context, but it is
limited to networks whose wireless protocols operate in the same frequency band.
In [14], the authors proposed a gateway connected to devices using different
protocols, storing the collected data in the cloud. The proposal aims to solve
communication problems between IoT devices, using the BLE and Zigbee protocols
as its scope. Although it is similar to the work proposed here, the authors had
not, at the time of publication, developed the tests needed to validate their
proposal.
There are also several other approaches related to solutions for the
interoperability of heterogeneous devices, such as [11], which proposes a driver
abstraction model, [21], which suggests the use of multi-agent systems within an
SOA architecture to transform data access into services, and [17], where a
hierarchical approach was proposed for objects within an IoT context. In addition
to the mentioned works, we can also highlight [19] and [13], which propose
frameworks to access the data.

4 Omniconn Architecture
In this paper, we propose the Omniconn Architecture, developed as a solution for
interoperability problems in the communication between heterogeneous devices
within an Industrial IoT context. Omniconn was designed to act on four groups
of components, based on the work in [1]: Main Gateway, Secondary Gateways, IoT
Objects, and Control Devices. Each of these is explained in detail in this
section.
Figure 1 illustrates the interaction between Omniconn components. Although it
appears in the figure, no prototype of the control application was developed;
its functionality was emulated through HTTP requests, simulating the expected
behavior of the application.
Main Gateway: Its primary function is to provide information to the control
devices about which objects are available and which gateways should be called
for the desired service. Requests are composed of a command responsible for
performing some function or getting data from an IoT Object. The communication
between the main gateway and the secondary ones is done through HTTP requests,
in the client/server model.

Fig. 1. Omniconn prototype architecture.

Secondary Gateways: They are the main actors of this architecture. The
proposal is to have several gateways that handle all objects and sensors in
the environment without overloading. These devices are controlled
(activation/deactivation or data collection) through microservices. Services
send or receive messages to the objects or gather information from the sensors
through the GPIO pins of the board. The collected data is sent to the central
gateway.
IoT Objects: These can be treated as the actuators and sensors of a control
system. They have moderate computational capacity, require low power
consumption, and communicate through IoT protocols. For this proposal, the
protocols used were Zigbee, Bluetooth LE, and Wi-fi. The gateway must have a
transceiver with the modulation and radio frequency specified by the standard
of each protocol.
Control Devices: The control devices are responsible for showing the inter-
action options with the IoT Objects. The available objects determine the pos-
sibilities. The devices will communicate with the central gateway to log in and
acquire tokens to access the secondary gateways.
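To make the routing and token handling described above more concrete, the following is a minimal sketch of the Main Gateway's lookup-and-forward step. It is written in Python for brevity rather than the JavaScript used in the actual prototype, and the object identifiers, gateway URLs, and token format are hypothetical placeholders, not part of the original implementation.

import json
import urllib.request

# Hypothetical registry: which secondary gateway hosts the microservice for each IoT object.
SERVICE_MAP = {
    "temp-sensor-01": "http://192.168.0.11:3000/services/zigbee-temp",
    "lamp-ble-07": "http://192.168.0.12:3000/services/ble-lamp",
}

def route_command(object_id, command, token):
    # Look up the responsible secondary gateway and forward the control device's command.
    url = SERVICE_MAP[object_id]
    payload = json.dumps(command).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer " + token},  # token obtained when logging in
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))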

4.1 Development

We decided to use microservices in the development of Omniconn, approaching a
service-oriented architecture. The reason is the wide variety of devices that
may arise, which can make the insertion of new devices difficult in the future;
microservices provide the needed flexibility. Another reason is the possibility
of sharing microservices for the same device: as an example, a manufacturer can
supply the service used to connect to its device.

The Main Gateway is responsible for request routing, authentication, and
service mapping. It was implemented in JavaScript, chosen for its event-oriented
model, which suits the I/O-bound behavior expected of the architecture. The
Node.js runtime was adopted because its event loop provides asynchronous I/O.
The code consists of a monolithic API server based on the Koa.js web framework,
structured into control, model, and routing files.
Secondary gateways are responsible for hosting the microservices. Their
implementation was also done in JavaScript. The code consists of a main server,
based on the Express.js web framework, which is responsible for instantiating
the microservices using the Microcule framework. A different server framework
was chosen for the Secondary Gateway than for the Main Gateway because of its
easier integration with the microservice framework. The primary function of the
gateway prototype is service management, which means instantiating the services,
handling their inclusion and removal, and testing the connection to the devices.
The prototype allows three distinct ways to interact with an edge device (IoT
Object). In the first, the IoT object calls the microservice directly through an
HTTP request, sending its data. In the second, the IoT objects send their data
through the USB adapter corresponding to their protocol; in this case, the data
is forwarded to the responsible microservice. In the third, the adapter remains
open for detections and forwards the devices it finds to the responsible
microservice.
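A minimal sketch of the second and third interaction modes is given below: data arriving through a protocol adapter is forwarded to the microservice responsible for that protocol. The handler names and payloads are illustrative assumptions, shown in Python instead of the JavaScript/Microcule services of the prototype.

from typing import Callable, Dict

def handle_zigbee(payload: bytes) -> None:
    # Stand-in for the microservice that parses Zigbee frames.
    print("zigbee reading:", payload)

def handle_ble(payload: bytes) -> None:
    # Stand-in for the microservice that handles BLE detections.
    print("ble advertisement:", payload)

HANDLERS: Dict[str, Callable[[bytes], None]] = {
    "zigbee": handle_zigbee,  # data read from the Zigbee USB adapter
    "ble": handle_ble,        # devices found while the BLE adapter scans
}

def dispatch(protocol: str, payload: bytes) -> None:
    # Forward raw adapter data to the responsible microservice.
    handler = HANDLERS.get(protocol)
    if handler is None:
        raise ValueError("no microservice registered for protocol: " + protocol)
    handler(payload)

dispatch("zigbee", b"\x01\x17")  # e.g. a raw frame read from the adapter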

5 Experimental Environment
For the experiments, a structure was set up to simulate a small environment
from a Smart Factory. We used 3 secondary gateways connected to a central
gateway, as shown in Fig. 2.
The experiments performed in this work were designed to measure the level of
interference that the use of multiple communication protocols can generate. For
this, we chose the Zigbee, BLE, and Wi-fi protocols. The tests were performed
initially for each communication protocol individually, increasing the number of
IoT objects connected to the corresponding secondary gateway. We performed
the isolated tests to compare them with the multiple-protocol scenario. All tests
were executed five times, using the mean of the results as the final result. We
used four Raspberry Pi 2 single-board computers as the gateways in the
experiments. Each board has a four-core 900 MHz processor and 1 GB of memory.
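The measurement procedure (repeated requests, counting timeouts, and averaging over five runs) can be sketched as follows. The endpoint URL, request count, and timeout threshold are assumptions made for illustration; they are not the exact values used in the experiments.

import statistics
import time
import urllib.error
import urllib.request

def run_once(url, n_requests=100, timeout_s=2.0):
    # Fire n_requests at a gateway endpoint, recording response times and timeouts.
    times, timeouts = [], 0
    for _ in range(n_requests):
        start = time.perf_counter()
        try:
            with urllib.request.urlopen(url, timeout=timeout_s) as resp:
                resp.read()
        except urllib.error.URLError:
            timeouts += 1
            continue
        times.append(time.perf_counter() - start)
    return times, timeouts

def mean_over_runs(url, runs=5):
    # Average response time across independent runs, as done for the reported results.
    run_means = []
    for _ in range(runs):
        times, _ = run_once(url)
        if times:
            run_means.append(statistics.mean(times))
    return statistics.mean(run_means) if run_means else float("nan")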
To simulate real use, we put all IoT objects in the same environment. The
secondary gateways for BLE and Zigbee devices were placed together with the
IoT objects, while the secondary gateway responsible for Wi-Fi objects and the
central gateway were placed in a separate environment, connected to a wired
network.
Table 1 shows the quantity of each type of device used in the test cases. Due
to the stabilization of the detections for the previously presented BLE devices,

Fig. 2. Multiple protocols topology.

Table 1. Test cases configuration

              Bluetooth LE   Zigbee   Wi-fi
Test Case 1        1            1       1
Test Case 2        5            3       3
Test Case 3       10            6       6
Test Case 4       15            9       9
Test Case 5       15           12      12

it was decided to repeat the number of devices for the last two tests, using 15
instead of 20. For the results, we use the term Interoperability for the tests
using multiple protocols and Isolated for the tests using a single one. The blue
columns and lines represent the results for the referred protocol, while the red
ones represent the interoperability results.
Figure 3 shows results of the number of requests performed using the Blue-
tooth protocol in the isolated tests compared to experiments using multiple
protocols. It is possible to see that the quantities of requests in the isolated tests
and of interoperability remain very close in the tests with 1, 5, and 10 devices.
The difference slightly increases for the experiments with 15 devices (BLE and
Zigbee with 9 devices each). Finally, we can see the most significant difference
in the last test, using all the devices.
Regarding the average response time, it is possible to notice in Fig. 4 that the
results remained very close with 1 and 5 devices. The difference increased slightly
for tests with 10 and 15 devices and reached the most significant difference in
the last test, where the average response time of the interoperability tests was
almost 20% higher.

Fig. 3. Bluetooth result - total requests.

Fig. 4. Bluetooth result - average response time.

The test with the Zigbee protocol shows, as can be seen in Fig. 5, the number
of requests increasing with the number of devices and stabilizing from the tests
with 6 devices onward. The number of requests remained very close in all
experiments. The most significant differences observed were 2% higher for the
interoperability tests with 3 devices and 3% higher for the isolated tests using
6 objects.
The difference between the response time can be seen on Fig. 6.
Figure 7 shows the result of the comparison of the number of requests between
the isolated and interoperability tests related to the Wi-Fi protocol. Unlike pre-
vious protocols, the total amount is continually increasing as more devices are
inserted.
Regarding the average response time of the requests, the values found in the
isolated tests remained higher during configurations with 1, 3, and 6 devices. In
the experiments with 9 and 12 devices the average of the interoperability tests
were higher, reaching the most significant difference (approximately 26%) with
9 devices.

Fig. 5. Zigbee result - total requests.

Fig. 6. Zigbee result - average response time.

Fig. 7. Wi-fi result - total requests.



6 Conclusions and Future Work


Our work presented the Omniconn Architecture, developed to enable the
interoperability of heterogeneous devices in IoT environments, such as a Smart
Factory. We also performed tests to verify the interference caused by the use of
multiple communication protocols in real environments. Isolated tests were
performed to measure this interference for the protocols defined in the scope of
this work: BLE, Zigbee, and Wi-fi.
After the isolated tests, we performed experiments connecting devices of the
three protocols to the same Omniconn instance. The number of devices was
increased according to Table 1, and the results were compared with those of the
isolated tests.
Regarding the experiments, the results were very close in both sets of tests.
The most significant differences were found in the average response time of the
interoperability tests in comparison to the isolated ones. In the experiments
with Bluetooth, this difference was about 20%, while in the tests with Wi-fi it
reached 26%.
Although some tests showed a considerable difference in the average response
time, the totals of timeouts and invalid packets were only 0.02% and 0.014%,
respectively. These results support the feasibility of using a similar
architecture in a real environment. The tests were designed to simulate far more
requests than a real scenario would produce, so we believe that the difference
can be even lower.
As future work, the complete environment needs to be tested to determine the
best architecture for deploying Omniconn. The use of single or multiple
instances of the gateway for the entire smart factory should be tested to
measure the interference. Also, alternative layouts can be examined, such as a
secondary gateway being used to communicate with more than one type of device.

Acknowledgments. This study was financed in part by the Coordenação de
Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.
We also thank INESC-Brazil for partially supporting this research work.

References
1. Agostinho, B.M., Rotta, G., Della Mea Plentz, P., Dantas, M.A.R.: Smart Comm:
a smart home middleware supporting information exchange. In: IECON 2018 -
44th Annual Conference of the IEEE Industrial Electronics Society, pp. 4678–4684,
October 2018
2. Al-Fuqaha, A., Guizani, M., Mohammadi, M., Aledhari, M., Ayyash, M.: Internet
of Things: a survey on enabling technologies, protocols, and applications. IEEE
Commun. Surv. Tutor. 17, 2347–2376 (2015)
3. Atzori, L., Iera, A., Morabito, G.: The Internet of Things: a survey. Comput. Netw.
54(15), 2787–2805 (2010)
4. Augustin, A., Yi, J., Clausen, T., Townsley, W.M.: A study of LoRa: long range
& low power networks for the Internet of Things. Sensors 16(9), 1466 (2016)

5. Baronti, P., Pillai, P., Chook, V.W., Chessa, S., Gotta, A., Hu, Y.F.: Wireless
sensor networks: a survey on the state of the art and the 802.15.4 and Zigbee
standards. Comput. Commun. 30(7), 1655–1695 (2007). Wired/Wireless Internet
Communications
6. Bedhief, I., Kassar, M., Aguili, T.: SDN-based architecture challenging the IoT het-
erogeneity. In: 2016 3rd Smart Cloud Networks Systems (SCNS), pp. 1–3, Decem-
ber 2016
7. Bellavista, P., Cardone, G., Corradi, A., Foschini, L.: Convergence of MANET and
WSN in IoT urban scenarios. IEEE Sens. J. 13, 3558–3567 (2013)
8. Bottaro, A., Gérodolle A.: Home SOA: facing protocol heterogeneity in pervasive
applications. In: Proceedings of the 5th International Conference on Pervasive Ser-
vices, ICPS 2008, pp. 73–80. ACM, New York (2008)
9. Gomez, C., Oller, J., Paradells, J.: Overview and evaluation of bluetooth low
energy: an emerging low-power wireless technology. Sensors 12(9), 11734–11753
(2012)
10. Luzuriaga, J.E., Perez, M., Boronat, P., Cano, J.C., Calafate, C., Manzoni, P.: A
comparative evaluation of AMQP and MQTT protocols over unstable and mobile
networks. In: 2015 12th Annual IEEE Consumer Communications and Networking
Conference (CCNC), January 2015
11. Moazzami, M., Xing, G., Mashima, D., Chen, W., Herberg, U.: Spot: a
Smartphone-based platform to tackle heterogeneity in smart-home IoT systems.
In: 2016 IEEE 3rd World Forum on Internet of Things (WF-IoT), pp. 514–519,
December 2016
12. Ngu, A.H., Gutierrez, M., Metsis, V., Nepal, S., Sheng, Q.Z.: IoT middleware: a
survey on issues and enabling technologies. IEEE Internet Things J. 4(1), 1–20
(2017)
13. Perumal, T., Ramli, A.R., Leong, C.Y.: Interoperability framework for smart home
systems. IEEE Trans. Consum. Electron. 57(4), 1607–1611 (2011)
14. Rahman, T., Chakraborty, S.K.: Provisioning technical interoperability within Zig-
bee and BLE in IoT environment. In: 2018 2nd International Conference on Elec-
tronics, Materials Engineering Nano-Technology (IEMENTech), pp. 1–4, May 2018
15. Rotta, G., Dantas, M.A.R.: Um estudo sobre protocolos de comunicação para ambi-
entes de internet das coisas. Escola Regional de Alto Desempenho (2017)
16. Saint-Andre, P., Smith, K., Tronçon, R., Troncon, R.: XMPP: The Definitive
Guide. O’Reilly Series. O’Reilly Media, Sebastopol (2009)
17. Sarkar, C., Nambi, A.U., Prasad, R.V., Rahim A.: A scalable distributed architec-
ture towards unifying IoT applications. In: 2014 IEEE World Forum on Internet
of Things (WF-IoT), March 2014
18. Siekkinen, M., Hiienkari, M., Nurminen, J.K., Nieminen, J.: How low energy
is Bluetooth low energy? Comparative measurements with Zigbee/802.15.4. In:
2012 IEEE Wireless Communications and Networking Conference Workshops
(WCNCW), pp. 232–237, April 2012
19. Sobral, J., Rodrigues, J., Rabelo, R., Lima Filho, J.C., Sousa, N., Araujo, H.S.,
Filho, R.: A framework for enhancing the performance of Internet of Things appli-
cations based on RFID and WSNs. J. Netw. Comput. Appl. 107, 02 (2018)
20. Xiao, G., Guo, J., Xu, L.D., Gong, Z.: User interoperability with heterogeneous
IoT devices through transformation. IEEE Trans. Ind. Inform. 10(2), 1486–1496
(2014)
21. Zhiliang, W., Yi, Y., Lu, W., Wei, W.: A SOA based IoT communication mid-
dleware. In: 2011 International Conference on Mechatronic Science, Electric Engi-
neering and Computer (MEC), August 2011
A Framework for Allocation of IoT
Devices to the Fog Service Providers
in Strategic Setting

Anjan Bandyopadhyay1,2(B) , Fatos Xhafa3 , Saurav Mallik4,5 , Paul Krause6 ,


Sajal Mukhopadhyay1 , Vikash Kumar Singh7 , and Ujjwal Maulik8
1 NIT Durgapur, Durgapur, West Bengal, India
anjanmit@gmail.com, sajal@cse.nitdgp.ac.in
2 Amity University, Kolkata, India
3 Universitat Politècnica de Catalunya, Barcelona, Spain
fatos@cs.upc.edu
4 Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India
sauravmtech2@gmail.com
5 The University of Texas Health Science Center at Houston, Houston, TX, USA
saurav.mallik@uth.tmc.edu
6 University of Surrey, Guildford, UK
p.krause@surrey.ac.uk
7 Siksha ‘O’ Anusandhan (Deemed to be University), Bhubaneswar, India
vikas.1688@gmail.com
8 Jadavpur University, Kolkata, India
umaulik@cse.jdvu.ac.in

Abstract. In the IoT+Fog+Cloud architecture, with the ever increas-


ing growth of IoT devices, allocation of IoT devices to the Fog service
providers will be challenging and needs to be addressed properly in both
the strategic and non-strategic settings. In this paper, we have addressed
this allocation problem in strategic settings. The framework is developed
under the consideration that the IoT devices (e.g. wearable devices)
collecting data (e.g. health statistics) are deploying it to the Fog ser-
vice providers for some meaningful processing free of cost. Truthful and
Pareto optimal mechanisms are developed for this framework and are
validated with some simulations.

1 Introduction

As the unprecedented growth of IoT devices is around the corner [1] (CISCO
estimated that by 2020 there would be 50 billion connected devices, with an
average of 7 per person [2]), big chunks of data will need to be collected and
processed. The most promising option for processing these data is to use the
cloud framework, which provides infrastructure-less services to customers
ranging from individuals to small and large enterprises.
F. Xhafa—(On Leave, University of Surrey, UK).
However, the IoT devices have to be connected with
the cloud through Internet and thereby can consume a huge bandwidth and may
not be supported always with the limited bandwidth availability (even with the
next-generation standard for wireless communications such as 5G or more). To
circumvent this issue, Fog computing could be a viable solution, where much of
the data processing may be done close to the data source rather than being sent
to cloud. In this paper, we are concerned about the fact that, there are some
Fog service providers and the data collected by the IoT devices can be processed
with one or more of such Fog service providers.
At present, the number of Fog service providers are limited. However, with
the ever increasing growth of IoT devices and to meet their demand, many more
Fog service providers may join the market. The increasing number of Fog service
providers will give users (used synonymously with IoT devices) a greater choice,
while also making it difficult to choose the most appropriate provider for their
requirements. This requires a proper and efficient resource allocation mechanism
in Fog computing, in both strategic and non-strategic settings. In a strategic
setting, the users are strategic in the usual game-theoretic sense [3].
In strategic setting, the resource allocation problem can be represented both
with money (when users pay to the fog service providers) and without money
(when users get certain services free of cost). In this paper, we have studied
the resource allocation problem in a without-money setting. First, in this paper,
we consider a case where an IoT device (user) needs exclusive use of a Fog
Service Provider (FSP) for a long time once allocated (later described as
t_i = ∞). The model is then extended to an interesting case where an IoT device
needs the FSP only for a specific duration from the moment it is allocated
(later denoted as t_i ≠ ∞). A detailed simulation is provided for both cases.
The remaining sections of the paper are described as follows. The prior works
are demonstrated in Sect. 2. In Sect. 3, we describe the system model and formu-
late the problem. The proposed mechanisms are illustrated in Sect. 4. In Sect. 5,
the analysis of the F SAM -IComP is carried out. The algorithm for the extended
version of the problem is depicted in Sect. 6. The experimental results are dis-
cussed in Sect. 7. Finally, the conclusions and future works are highlighted in
Sect. 8.

2 Related Works

As background for the IoT+Fog+Cloud framework, several works in the IoT+Cloud
architecture have modelled the resource allocation problem through the concepts
of mechanism design with money (mainly auctions) and mechanism design without
money, as the participating agents are strategic in nature. Regarding the
scenario where money is involved, in [4] an auction is utilized to allocate the
computing capacity of a computer to the users; based on the users' demand for
computation time, their payments are decided. A similar line of thinking is
still relevant in today's cloud computing market and is discussed in many
subsequent papers. As the participating agents are rational, they can manipulate
their private information to gain extra incentives; to tackle such situations,
some truthful mechanisms have been discussed [5]. Other works consider the
scenario where one or more service providers offer multiple heterogeneous
services while multiple users request bundles of services among those available;
in the literature, such scenarios are modelled using the framework of
combinatorial auctions. Moving on to the case where money is not involved, in
[6] the set-up is that there are multiple users and multiple service providers,
say n, and users provide a preference ordering over the available service
providers. Here, the goal is to allocate the best available service provider to
each of the users.
The truthful mechanism is proposed for this discussed set-up. Further in [7],
the set-up discussed in [6] is extended to the case, where both the users and
the service providers are providing the strict preference ordering (full or partial
preference) over the members of the opposite community. A truthful mechanism
is proposed to allocate the services to the users (each user receives single ser-
vice). Coming back to our IoT+Fog+Cloud framework, currently there are few
existing works on the concept of fog computing [8,9]. For the detailed overview
of Fog computing and the research challenges the readers can go through [8–
11]. In [12], in the Fog environment, the bidders reveal the bids along with the
discrete sets of resources. The purpose of these bids is to reserve and allocate
those resources for a fixed period of time. The objective, here, is to allocate
the resources to the bidders in a non-conflicting manner. The above discussed
papers in Fog computing have mainly considered the problems from the monetary
perspective. However, to date no work has been done in the Fog computing
environment from a non-monetary perspective (where money is not involved in the
market in any sense). We believe this paper is the first to model one of the
problems in a Fog computing environment by utilizing the concept of mechanism
design without money.

3 System Model and Problem Formulation


In this model, we have a number, n, of Fog service providers. The Fog service
providers are always present to impart their services on demand. They may be
heterogeneous in nature: some simply provide CPU-related services, some provide
data analysis facilities, and so on. Thus, they provide services for several
categories C = {c_1, c_2, ..., c_k}. Before the allocation of IoT devices, the
providers may be clustered based on the category of services they offer (this is
a static clustering done beforehand; it raises the question of what happens
dynamically when more fog services or IoT devices arrive, which will be
addressed in our future work). We can denote all the service
providers in a particular category c_i ∈ C as P_i = {p_1^{c_i}, p_2^{c_i}, ..., p_{n_i}^{c_i}}
and all such service providers as P = {P_1, P_2, ..., P_k}, where \sum_i n_i = n.
This characterization is shown in Fig. 1.

Fig. 1. Partitioning the service provider based on the category of services provided.
Similarly, we have a number, m, of IoT devices, and their demand is
heterogeneous in nature. They will be categorized by the services they need. We
denote all IoT devices in a particular category c_i ∈ C as
A_i = {a_1^{c_i}, a_2^{c_i}, ..., a_{m_i}^{c_i}}, with \sum_i m_i = m, and all
such IoT devices as A = {A_1, A_2, ..., A_k}. Each IoT device is characterized by
a_i^{c_i} = (\delta_i^{c_i}, t_i^{c_i}), where \delta_i^{c_i} is a strict
preference ordering over P_i^* ⊆ P_i and t_i^{c_i} is the time needed to complete
the desired job of the IoT device (here, a job means the desired processing of
its collected data). Every user of the specific category c_i (viz.
a_i^{c_i} ∈ A_i) has a strict preference ordering over P_i^* ⊆ P_i. The strict
preference ordering of the j-th user in category c_i, i.e. a_j^{c_i} ∈ A_i, is
denoted by \delta_j^{c_i}, where \delta_j^{c_i} = {p_r^{c_i} : p_r^{c_i} ∈ P_i}
and each p_r^{c_i} ∈ P_i is separated by a relation ≻_j^{P_i}. For example, say
we have five service providers in P_i and user j = 2 has given preferences (it
could be a full preference as well) over p_2^{c_i}, p_1^{c_i}, and p_5^{c_i} in
that order. So, \delta_2^{c_i} becomes:
\delta_2^{c_i} = (p_2^{c_i} ≻_2^{P_i} p_1^{c_i} ≻_2^{P_i} p_5^{c_i}).
t_j^{c_i} is only a number, say 10, that depicts the completion time from when
the device is allocated. The unit of the number may be seconds, minutes, hours,
etc., depending on the application. So, we can say t_j^{c_i} ∈ ℝ.
First, let us consider the structure of the full preference case; we will come
back to the partial preference case later. From now on, without loss of
generality, we discuss the algorithm for an arbitrary category unless stated
otherwise. If we do not write the superscript c_i, it is assumed that we refer
to some arbitrary category c_i. For the full preference case, say, for example,
there are three users (viz. a_1, a_2, and a_3) and three service providers (viz.
p_1, p_2, and p_3). In one-sided matching, users give their strict preferences
over the service providers; therefore, every user can rank all the service
providers and provide strict preferences over them. Every service provider
delivers its service for a particular category, e.g., every user looks for a job
on different job sites. Here, the job sites can be depicted as the service
providers, and the users are the job seekers. Hence, the users provide their
strict preferences over the job sites,
i.e., they provide a ranking of the job sites. Here, every user (a_1, a_2, and
a_3) gives their preferences over the job sites (s_1, s_2, and s_3). In this
paper, we generally use the notation p_i ≻_k p_j, where i ≠ j; this notation
signifies that user k prefers service provider p_i to service provider p_j. The
total orderings of the users could be depicted as follows:
(1) a_1: p_1 ≻_1 p_3 ≻_1 p_2, (2) a_2: p_2 ≻_2 p_1 ≻_2 p_3, and
(3) a_3: p_3 ≻_3 p_2 ≻_3 p_1. Three possible conditions may arise in the partial
preference case: (1) m = n, (2) m > n, and (3) m < n. In these situations, every
user gives their partial preferences over the intended service providers and
obtains the best available service provider.
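For illustration, the preference orderings of the example above can be represented as ordered lists; this is only an illustrative encoding (also used in the sketches later in this section), not one prescribed by the paper.

# Full strict preferences of the three users over p1, p2, p3
# (earlier in the list means more preferred).
full_preferences = {
    "a1": ["p1", "p3", "p2"],
    "a2": ["p2", "p1", "p3"],
    "a3": ["p3", "p2", "p1"],
}

# A partial preference only ranks a subset P_i* of the providers,
# e.g. the earlier delta_2 example ranking three of five providers.
partial_preference_user2 = ["p2", "p1", "p5"]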

4 Proposed Mechanisms
In this section, RanAlgo is first given as a naive solution to our problem,
which will help to better understand the truthful mechanism called Fog Service
Allocation Mechanism with Incomplete Preferences (FSAM-IComP), motivated by
[13,14].

4.1 Random Algorithm (RanAlgo)

The central idea of RanAlgo is, for each category ci , randomly pick a user from
the available users list. Next, randomly pick a service provider from the selected
user’s preference list and allocate it. Remove the user along with the allocated
service provider from the market. The process repeats until the users list becomes
empty.
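A minimal Python sketch of RanAlgo for one category is given below, under the assumption that a provider already allocated is simply skipped when it appears in a later user's list; the variable names are illustrative.

import random

def ran_algo(users, preferences):
    # Baseline: pick a random user, give it a random still-available provider
    # from its preference list, then remove both from the market.
    allocation, taken = {}, set()
    remaining = list(users)
    while remaining:
        user = random.choice(remaining)
        options = [p for p in preferences[user] if p not in taken]
        if options:
            provider = random.choice(options)
            allocation[user] = provider
            taken.add(provider)
        remaining.remove(user)
    return allocation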

4.2 Fog Service Allocation Mechanism with Incomplete Preferences


(FSAM-IComP)

In this section, we propose a one-sided partial preference algorithm, motivated
by [13,14], where the numbers of users and service providers may not be equal
(i.e. m ≠ n).
Algorithm 1. select_best(≻_i^j, Allocated_List)
    for k = 1 to length(≻_i^j) do
        if p_k^{x_i} ∉ Allocated_List then
            return k
        end
    end
    return 0

The time complexity of FSAM-IComP is O(n) + O(n log n) + O(mn) = O(n^2) if
m = O(n).
Algorithm 2. FSAM-IComP(P_i, A_i, ≻_i)
    Allocated_List ← φ
    a* ← φ, p* ← φ, D ← φ
    for j = 1 to |A_i| do
        D[j] = j
    end
    for i = 1 to n do
        Swap D[i] with D[Random(i, n)]
    end
    for j = 1 to |A_i| do
        Assign(a_j^{x_i}, D[j])
    end
    A_i ← Sort(A_i)                      // sort A_i based on the random number assigned
    for j = 1 to |A_i| do                /* process A_i sequentially */
        a* ← a_j^{x_i}
        k ← select_best(≻_i^j, Allocated_List)
        if k = 0 then
            the user remains unallocated
        else
            p* ← p_k^{x_i}
            Allocated_List ← Allocated_List ∪ p*
            R ← R ∪ (a* ∪ p*)
        end
        a* ← φ, p* ← φ
    end
    return R
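The following is a compact executable sketch of FSAM-IComP for one category, assuming preference lists are given as ordered lists of provider identifiers. It mirrors the pseudocode above (random priority order, sequential processing, best available choice, exclusive allocation); it is an illustration, not the authors' original code.

import random

def select_best(pref_list, allocated):
    # Return the most preferred provider not yet allocated, or None.
    for provider in pref_list:
        if provider not in allocated:
            return provider
    return None

def fsam_icomp(users, preferences):
    order = list(users)
    random.shuffle(order)            # random priority order, the permutation D
    allocated, result = set(), {}
    for user in order:               # process users sequentially
        best = select_best(preferences[user], allocated)
        if best is not None:
            result[user] = best
            allocated.add(best)      # exclusive access: provider leaves the market
    return result

users = ["a1", "a2", "a3"]
prefs = {"a1": ["p1", "p3"], "a2": ["p1", "p2"], "a3": ["p2"]}
print(fsam_icomp(users, prefs))      # allocation depends on the random draw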

5 Analysis of the Proposed Mechanism


Now, we will prove some theoretical results about the proposed F SAM -IComP .
It is to be noted that, before applying the concept of the DRAW [14], we had
to take care of the (δ_i, t_i) pair of each user. So, the problem became
two-fold: first, dividing the users according to their category; second, within
each category, allocating the service providers to the users based on the
preferences and t_i. For t_i = ∞, we can directly apply the DRAW for the
allocation process. However, for t_i being some finite constant (the t_i ≠ ∞
case), the DRAW may not be directly applicable; in that case a modified DRAW is
used (discussed later under the name MFSAM-IComP). As the algorithm is two-fold,
for both the cases t_i = ∞ and t_i ≠ ∞, we have to prove that in our setting
FSAM-IComP (for the t_i = ∞ case) and MFSAM-IComP (for t_i ≠ ∞) satisfy some
economic properties such
as truthfulness and Pareto optimality. As we are applying the DRAW in our
two-fold mechanism, the proofs are similar in nature to those in [14,15].

Definition 1 (Truthful). A mechanism is truthful if a user always maximizes


his utility by declaring his true input, regardless of what the other users declare.

Proposition 1. The DRAW is Truthful.

Theorem 1. F SAM -IComP mechanism is truthful.

Proof. The proof is removed due to space constraint.

Definition 2 (Pareto optimality) [3]. Pareto optimality provides an outcome


or allocation, where you can’t move an arbitrary user to a better place without
harming at least one user except that arbitrary one.

Proposition 2. The DRAW is Pareto optimal.

Theorem 2. The allocation resulted by F SAM -IComP is Pareto Optimal.

Proof. The proof is removed due to space constraint.

6 Algorithm for the t_i ≠ ∞ Case

We will first make an observation here. The observation will provide the insights
that will help developing algorithm in this setting.

6.1 Observation

We have already seen that a user a_i^{c_i} ∈ A_i is characterized by
(δ_i, t_i). In the t_i = ∞ case, once an allocation is made, a service provider
is given exclusively to a user, which is natural since the user needs it for a
long period of time. Observe that once exclusive permission is granted to a
user, the provider cannot be given to another user, even if some of the other
users had the same preference for that service provider. However, in the
t_i ≠ ∞ case, this may no longer hold. For example, say two users have the same
first preference for a service provider; the first user needs 10 min to complete
the job and the second user needs 7 min. The service provider is available
throughout the day for imparting service to the users. In this case, if we use
FSAM-IComP, one user will get the service provider, i.e. his first preference,
and the other one will not, since FSAM-IComP gives exclusive access to a user
once allocated. But we can modify FSAM-IComP to accommodate more users on a
particular service provider if they have the same preference. We call this
algorithm Modified FSAM-IComP, in short MFSAM-IComP.

6.2 Designing the Algorithm


When developing F SAM -IComP a detailing of the several components of the
algorithm was presented. For M F SAM -IComP careful outline of the algorithm
will be sketched and this will be sufficient as already the fundamental framework
has been laid down by F SAM -IComP in ti = ∞ case.

6.3 Sketch of the Algorithm


The main idea of MFSAM-IComP is that, instead of locking a service provider
through exclusive access, we keep the service provider available as long as
possible. This viewpoint enables us to accommodate more users with their
preferred available service provider. The idea of the algorithm is given in
Algorithm 3.

Algorithm 3. MFSAM-IComP
    for each category c_i ∈ C do
        D ← extract the users of this category
        Randomize D
        for i = 1 to |D| do                            /* process each a_i */
            for j = 1 to |P*| ⊆ |P_i| do               /* process the ranked list of a_i */
                if a_i can be accommodated by p_j ∈ P* based on the length of the job then
                    assign a_i to p_j
                    break
                end
            end
        end
    end
The time complexity of the proposed algorithm is O(kmn), where k is the total
number of categories to be processed. If k = O(m) or k = O(n), and m = O(n),
then the time complexity boils down to O(n^3).
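A minimal sketch of MFSAM-IComP is shown below. How "accommodation based on the length of the job" is decided is not fully specified in the paper, so the sketch assumes each provider has a fixed time budget per day and accepts a user whenever the user's job length still fits; all names and the capacity model are illustrative assumptions.

import random

def mfsam_icomp(users, preferences, job_length, capacity):
    # Providers are not locked: a provider keeps accepting users while
    # its remaining available time can accommodate the user's job.
    remaining = dict(capacity)           # e.g. minutes of service time left per provider
    order = list(users)
    random.shuffle(order)
    allocation = {}
    for user in order:
        for provider in preferences[user]:           # scan the ranked list top to bottom
            if remaining.get(provider, 0) >= job_length[user]:
                allocation[user] = provider
                remaining[provider] -= job_length[user]
                break
    return allocation

# The 10-minute and 7-minute example from Sect. 6.1: both users can share p1.
prefs = {"a1": ["p1"], "a2": ["p1"]}
jobs = {"a1": 10, "a2": 7}
print(mfsam_icomp(["a1", "a2"], prefs, jobs, {"p1": 24 * 60}))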

Theorem 3. M F SAM -IComP is truthful

Proof. Fix a category c_i and consider the i-th user being processed. The
question is whether this user should provide the preference list truthfully or
not. Our claim is yes. First observe that users 1, 2, ..., i − 1 are processed
independently of the preference list provided by the i-th user, and the
processing is done sequentially. This ensures that no further processing will be
applied to users 1, 2, ..., i − 1. Some of the choices common to user i's
preference list may be taken away by the users processed earlier, and user i can
do nothing about it. The proof then boils down to whether the i-th user gets the
best available choice when his turn comes. By construction of the algorithm,
user i's preference list is scanned top to bottom in that order, and he gets the
next choice only when the previous entries have been exhausted. So, we can infer
that he will always get the best available choice. So, he should be truthful.
Any possible lie will allocate him a service provider which is not better than
the current best. Hence the theorem.

Theorem 4. M F SAM -IComP is Pareto optimal.

Proof. In MFSAM-IComP, a user is allocated his first choice as long as possible;
otherwise, he is allocated the best available one. At any stage i, if we
consider allocating a user by any other algorithm, it has to follow the strategy
of MFSAM-IComP; otherwise it will lead to a sub-optimal allocation in terms of
choice, since, as stated earlier, by construction MFSAM-IComP allocates a user
his first choice as long as possible (the user would get a lower-ranked service
provider than the current best, a worsening effect that violates the Pareto
optimality property). So, we conclude that MFSAM-IComP is Pareto optimal.

7 Experimental Analysis
In the fog computing framework, users give their true preferences over the
service providers. In this section, we analyze the efficiency of the FSAM-IComP
algorithm via a simulation study. RanAlgo is considered as a benchmark algorithm
and is compared with FSAM-IComP (in the case of partial preferences). The
simulations were carried out on randomly generated data. To demonstrate the
performance of the proposed FSAM-IComP algorithm, we compare it with the Random
Algorithm (RanAlgo). The experimental code is written in the Python language.

7.1 Simulation Set-Up

For the simulation purpose, here, we have considered 500 different users and 500
different service providers. It is to be noted that, in each of the categories, the
users and the service providers are fixed. For the simulation, let the number of
users and service providers be the same.

7.2 Performance Evaluation

The performance of the proposed mechanisms is evaluated using two key metrics:
(1) Efficiency Loss (EL): the difference between the index of the allocated
service provider in the user's preference list and the index of the user's most
preferred service provider in that list. (2) Best Allocation (BA): the number of
users (u_i) that obtain their most preferred service provider from their
submitted preference list, over the available service providers.
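Under the same list-based representation of preferences, the two metrics can be computed as sketched below; aggregating EL as a sum over users is an assumption made here for illustration.

def efficiency_loss(allocation, preferences):
    # EL: index of the allocated provider in the user's list minus the index
    # of the most preferred provider (which is 0), summed over allocated users.
    return sum(preferences[user].index(provider)
               for user, provider in allocation.items())

def best_allocation(allocation, preferences):
    # BA: number of users that obtained their most preferred service provider.
    return sum(1 for user, provider in allocation.items()
               if preferences[user] and preferences[user][0] == provider)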

7.3 Discussion
To demonstrate the proficiency of the proposed FSAM-IComP algorithm, we compare
it with the Random algorithm (RanAlgo) and with several variation cases, namely
FSAM-IComP-L-var, FSAM-IComP-M-var, and FSAM-IComP-S-var. Here, FSAM-IComP-L-var
depicts the case in which 1/2 of the users may manipulate their true preference
lists; likewise, FSAM-IComP-M-var and FSAM-IComP-S-var depict the cases in which
1/4 and 1/8 of the users may manipulate their true preference lists,
respectively.

7.4 Results Analysis


We consider three conditions (i.e., m/8, m/4, and m/2 users deviating from their
true preference lists), where m is the total number of users.

Fig. 2. Efficiency loss and best allocation for m = n scenario

With the help of these three conditions, we compare the performance of the
Random algorithm with the FSAM-IComP algorithm. We then calculate the efficiency
loss and compare our algorithm FSAM-IComP with the Random Algorithm (RanAlgo).
In Fig. 2a, when a subset of the users deviates from their true preference
ordering, the EL of the users under RanAlgo is higher than under the large
variation (FSAM-IComP-L-var), which is higher than under the medium variation
(FSAM-IComP-M-var), which is higher than under the small variation
(FSAM-IComP-S-var), which in turn is higher than under FSAM-IComP without
variation. This observation leads to the conclusion that the FSAM-IComP
algorithm performs much better than the Random Algorithm, and also better than
the deviation cases with m/2, m/4, and m/8 users. Next, we depict the best
allocations obtained by the users in Fig. 2b. It can be concluded that
FSAM-IComP performs better in this case as well, i.e. when the users do not
manipulate their true preferences, a substantial number of users obtain their
best preference.
In Fig. 3 we compare MFSAM-IComP against FSAM-IComP in two different scenarios,
for the case where t_i ≠ ∞. It can be seen that the number of best allocations
made by MFSAM-IComP is higher than that of FSAM-IComP in both scenarios. This is
because, in MFSAM-IComP, no service provider is given exclusive access to a
particular user, and a single service provider can be allocated to multiple
users. For this reason, a larger number of users obtain their most preferred
service provider.

Fig. 3. Best allocation for m = n and m > n scenarios

8 Conclusion and Future Works


In this paper, we have proposed a framework where users give their partial
preferences on fog service providers. Here, we have considered that every fog
service provider gives its services free of cost and every user may get their
best-preferred service from the available service providers. In our future work,
we will address the case where an IoT-based device needs services from the fog
nodes and the transactions occur in exchange for money.

Acknowledgements. Fatos Xhafa’s work has been supported by Spanish Ministry


of Science, Innovation and Universities, Programme “Estancias de Profesores e inves-
tigadores senior en centros extranjeros, incluido el programa Salvador de Madariaga
2019”, PRX19/00155. Work partially done at University of Surrey, UK.

References
1. Brogi, A., Forti, S.: QoS-aware deployment of iot applications through the fog.
IEEE Internet of Things J. 4(5), 1185–1192 (2017)
2. CISCO. Fog computing and the Internet of Things: extend the cloud to where
the things are. https://www.cisco.com/c/dam/enus/solutions/trends/iot/docs/
computing-overview.pdf (2015)
3. Nisan, N., Roughgarden, T., Tardos, E., Vazirani, V.V.: Algorithmic Game Theory.
Cambridge University Press, New York (2007)

4. Sutherland, I.E.: A futures market in computer time. Commun. ACM 11(6), 449–
451 (1968)
5. Zhang, X., Wu, C., Li, Z., Lau, F.C.M.: A truthful (1 − ε)-optimal mechanism for
on-demand cloud resource provisioning. IEEE Trans. Cloud Comput. 1 (2019)
6. Bandyopadhyay, A., Mukhopadhyay, S., Ganguly, U.: Allocating resources in cloud
computing when users have strict preferences. In: 2016 International Conference on
Advances in Computing, Communications and Informatics, ICACCI 2016, Jaipur,
India, 21-24 September 2016, pp. 2324–2328 (2016)
7. Bandyopadhyay, A., Mukhopadhyay, S., Ganguly, U.: On free of cost service dis-
tribution in cloud computing. In: 2017 International Conference on Advances in
Computing, Communications and Informatics (ICACCI), pp. 1974–1980, Septem-
ber 2017
8. Hu, P., Dhelim, S., Ning, H., Qiu, T.: Survey on fog computing. J. Netw. Comput.
Appl. 98(C), 27–42 (2017)
9. Mahmud, R., Kotagiri, R., Buyya, R.: Fog computing: a taxonomy, survey and
future directions. pp. 103–130. Springer, Singapore (2018)
10. Puliafito, C., Mingozzi, E., Longo, F., Puliafito, A., Rana, O.: Fog computing for
the Internet of Things: a survey. ACM Trans. Internet Technol. 19(2), 18:1–18:41
(2019)
11. Mouradian, C., Naboulsi, D., Yangui, S., Glitho, R.H., Morrow, M.J., Polakos,
P.A.: A comprehensive survey on fog computing: state-of-the-art and research chal-
lenges. IEEE Commun. Surv. Tutor. 20(1), 416–464 (2018)
12. Fawcett, L., Broadbent, M., Race, N.: Combinatorial auction-based resource allo-
cation in the fog. In: 2016 5th European Workshop on Software-Defined Networks,
pp. 62–67 (2016)
13. Gale, D., Shapley, L.S.: College admissions and the stability of marriage. Am.
Math. Monthly 69(1), 9–15 (1962)
14. Roughgarden, T.: CS269I: Incentives in computer science, (Stanford University
Course), Lecture #1: The draw and college admissions, September 2016
15. Roughgarden, T.: CS269I: Incentives in computer science (Stanford University
Course), Lecture #2: Stable matching, September 2016
The 12th International Workshop on
Simulation and Modelling of
Engineering and Computational Systems
(SMECS-2019)
Blockchain Based Decentralized
Authentication and Licensing
Process of Medicine

Muhammad Azeem, Zain Abubaker, Muhammad Usman Gurmani,


Tanzeela Sultana, Abdul Ghaffar, Abdul Basit Majeed Khan,
and Nadeem Javaid(B)

COMSATS University Islamabad, Islamabad 44000, Pakistan


nadeemjavaidqau@gmail.com

Abstract. Counterfeit medicines are increasing day by day, and these medicines
damage people's health. Drug Regulatory Authorities (DRAs) are trying to
overcome this issue, and a synchronized electronic medicine record can mitigate
the risk. We propose a decentralized Blockchain (BC) based medicine licensing
and authentication system to stop the production of counterfeit medicines. Our
proposed system provides a convenient way for manufacturers to register
medicines with the DRA. Vendors are also registered with the DRA; they are the
intermediaries who buy from manufacturers and sell to customers. Furthermore,
every transaction between a manufacturer and a vendor is saved to the BC. We use
Proof of Collaboration (PoC) as the consensus mechanism: the manufacturer that
deals with more vendors has more mining power. The DRA has different
departments, i.e., the Licensing Department (LD), the Regulatory Department
(RD), and the Quality Assurance Department (QAD). These departments perform many
actions, which are saved in the BC database. The LD registers manufacturers and
vendors, the RD imposes rules, and the QAD makes random checks to test the
quality of medicines. We also propose a manufacturer profile scheme, in which
the QAD rates a manufacturer according to the quality of its medicines, based on
user feedback. Moreover, we provide an interface through which users can check
the authenticity of a medicine. We compare the results of the traditional
licensing system with the BC-based licensing system.

Keywords: Blockchain · Decentralized · Authentication · Licensing ·
Data security · Medicines network · Verification · Consensus

1 Introduction

In the modern era, counterfeit medicines are still a serious issue. These
counterfeit medicines not only damage patients' health; they are also a dismal
reflection on society. The manufacturers who formulate counterfeit medicines put
patients' lives at risk and are also responsible for tax evasion, pollution,
unemployment, and child labor. To overcome these issues, the literature is full
of ideas, models, and processes by which medicines can be made traceable and
authenticatable and the production of counterfeit medicines can be minimized.
With the increasing number of medicines, the Drug Regulatory Authority (DRA) has
made a lot of rules and is trying to fully implement them in order to overcome
counterfeit medicines. The departments under the drug regulatory authority are:
(1) the Licensing Department (LD), (2) the Regulatory Department (RD), and (3)
the Quality Assurance Department (QAD). These departments work in an
unsynchronized environment, due to which efficiency and traceability are
unsatisfactory. Strict but unimplemented regulation and inefficient bureaucratic
work are also major reasons for the production of counterfeit medicines. All
these departments have their own separate databases and asynchronous
communication. This asynchronous communication benefits counterfeit medicine
manufacturers in the formation of bogus and low-quality medicines. When a
manufacturer approaches the LD to register a medicine, the LD does not have
enough information to see the history of the manufacturer, its previously
registered medicines, and the quality of the medicines it is producing.
Likewise, the QAD does not have enough information about the registered and
currently produced medicines of a manufacturer.
The vendor is an entity of this system that buys medicines from manufacturers
and sells them to users. Mischievously, some vendors are not registered with the
DRA, and some of them sell counterfeit medicines in cooperation with bogus
manufacturers.
There is a need for a synchronized, distributed database that provides data
efficiently, correctly, and securely. In the medicine domain, the LD should have
access to the history of a manufacturer, its quality reports, and its overall
progress before issuing a new medicine license. The QAD should also have access
to all licensed products of a manufacturer and should be able to check the
medicine currently being manufactured. Blockchain is now emerging as one of the
disruptive technologies, providing a beneficial and secure mechanism to trade
between entities without any centralized party. Besides, high security and
reliability make Blockchain (BC) convenient to use in every field of life. A
decentralized BC based medicine licensing and authentication system performs
better than a centralized asynchronous database. Moreover, asynchronous data can
be stolen or tampered with, whereas BC based decentralized data is very hard to
steal or tamper with: every entity of the BC holds a copy of the database, so if
a hacker tries to tamper with the data, the others have a copy of the original
data to mitigate the issue.
For controlling counterfeit medicines, our contributions are summarized as
follows:

1. We propose a decentralized BC based database for medicines, in which a
license issued to a manufacturer is validated by the miners. Other manufacturers
act as miners and validate the new license using Proof of Collaboration (PoC) as
the consensus mechanism: the manufacturer that deals with more vendors has more
mining power (a minimal sketch of this weighted selection is shown after this
list). The decentralized database also enables all departments to check medicine
licenses and authenticity in real time.
2. We propose a profile scheme in which the synchronized database allows all
entities, such as DRA departments, manufacturers, vendors, and users, to see the
profiles of manufacturers and vendors. The QAD rates the profiles of
manufacturers and vendors based on user feedback. Furthermore, a license is
canceled if a manufacturer or vendor does not maintain its profile rating above
the threshold level set by the DRA.
3. We use smart contracts to impose rules, so that both parties agree on the
rules and regulations. A violation of the rules and regulations affects the
manufacturer's profile as a negative rating.
4. We provide an interface for users to verify medicine authenticity and
quality. On the interface, the user inputs the medicine license and the number
of the transaction performed between the user and the vendor, and in return the
user gets all the details about the medicine. The user then rates the
manufacturer's and vendor's profiles according to his satisfaction.
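The paper does not spell out how PoC translates vendor collaborations into mining power; the sketch below, referenced in contribution 1, shows one minimal interpretation in Python, where the probability of being chosen to validate a new license is proportional to the number of registered vendor deals. The manufacturer names and deal counts are hypothetical.

import random

# Hypothetical collaboration records: registered vendors each manufacturer deals with.
vendor_deals = {
    "manufacturer_A": 12,
    "manufacturer_B": 3,
    "manufacturer_C": 7,
}

def pick_validator(deals):
    # Proof-of-Collaboration style selection: more vendor deals -> more mining power.
    manufacturers = list(deals)
    weights = [deals[m] for m in manufacturers]
    return random.choices(manufacturers, weights=weights, k=1)[0]

print(pick_validator(vendor_deals))  # manufacturer_A is the most likely validator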

2 Related Work

While there is a lot of research on the interoperability of health-care and
medical data, there is little research on medicine authentication and licensing
[1]. In [2], the authors discussed the openEHR interoperability standard and
explained the OmniPHR model; due to the lack of an interoperability standard,
medical and health-care data are very difficult to integrate. In [3], medical
information sharing with BC is explained, dealing with data security and
controlled sharing; inappropriate handling of medical and health-care data can
lead to personal identity leakage, insurance fraud, and misuse of personal data.
The procedure followed by patients, from contacting a doctor to buying medicine,
is explained in [4] with an application of BC. In [5], the authors proposed
MedRec, a BC based model for saving medical records and health-care data, and
examined the security and privacy of the system during record access. Blockchain
based networks rely on the Internet of Things (IoT): every entity in the network
has an IoT device for different functionality.
With the increasing number of IoT devices, there are many issues regarding the
connectivity, security, privacy, scalability, and robustness of a network.
Blockchain is used in [6–8] to tackle different problems. Different types of BC
are used in different scenarios: [7,9] use a consortium BC, while [6,8] use a
public BC. Furthermore, [6,8] use PoW as the consensus mechanism, whereas [9]
uses proof of authority and [7] uses PBFT. In [9], the authors proposed a model
for secure lightweight clients and validity states of sources for the prevention
of malicious attacks.
Networks are limit in resource sharing, every node in the network needs to
connect and should remain active for operations. With the increase in the data
set and increasing size of the application, networks are not more efficient and scal-
able. To achieve secure, reliable, scalable and robust networks [10] used BC. Con-
sensus mechanism Proof of Work (POW) is used in [11–14] and proof of collabo-
ration (PoC) is used in [12,15] which increase the security of network and reduce

the entry of malicious nodes. The Ethereum environment for transactions is used in [13-15], which provides an authentic way to transfer incentives from one node to another. Data is indisputable, tamper-proof and decentralized with the PDP consensus mechanism [10]. Decentralized storage, spectral-efficiency Q-learning and an exhaustive learning mechanism increase the connectivity of live users while disconnecting sleeping nodes [11]. In [10], PoW, electronic signatures, a point-to-point network, a hashing algorithm and a distributed ledger decrease pollution attacks and bloating problems. Multi-layer access in mobile devices and transport networks is proposed for scalable and secure data access from data centers [12].
Different sensing devices are spreading very fast to sense, manipulate and broadcast data, e.g., temperature devices are connected in a network in different manners to sense temperature and send it to the network for further analysis. Due to their very low power, storage, and computation resources, they cannot defend themselves from attacks by malicious devices. Wireless sensor nodes are connected through BC and use different consensus mechanisms in [17-20]. Provable data possession and hash rate are used in [17], along with addressing the problem of the selfish node. In [18], virtual credit, privacy protection, and a confusion mechanism make data sharing secure and reliable.

3 Traditional Approach

Traditionally, licensing and quality control of medicine are done in a centralized manner by the DRA, as shown in Fig. 1. When the DRA issues a license to a manufacturer, it verifies the information only once. At the start, manufacturers contact the LD to register a medicine in order to manufacture it [22]. The LD issues the license to a manufacturer that has a proper facility, instruments, and expertise in manufacturing. Once the LD has verified all these requirements, it allows the manufacturer to produce the medicine under a license number. A license for a medicine is not only necessary to manufacture it; it is also used for regulating the medicine throughout the process. The license ensures that the specific medicine is manufactured under proper, safe and good conditions [23]. The RD plays the role of imposing different rules and regulations on different medicines with different characteristics. Some manufacturers do not follow the rules and produce low quality medicine in a bad environment. The QAD has the right to check that the medicine a manufacturer is producing has good quality and is made following the proper rules. Vendors do not know about the manufacturer while purchasing the medicine, so there is a chance of buying counterfeit medicine and then selling it to users. Users do not know about the legality, quality, and cost of a medicine [24].
In the traditional system, the process of imposing a penalty for low quality or counterfeit medicine is very slow. There is no back-tracking system to maintain the quality of a medicine. Further, there is no way for a user to check a medicine.

Fig. 1. Centralized system of authentication and manufacturing medicine

4 Architecture

The DRA issues a license for a medicine to the manufacturer through smart contracts, as shown in Fig. 2. In the BC, the medicine license is broadcast to the network and miners check the eligibility of the specific manufacturer. Once a manufacturer is able to purchase a license, some amount of Ether is deducted from its account and transferred to the DRA's account. Every action from the DRA, such as quality checks and rule enforcement, is saved into the manufacturer's account. For example, penalties for a polluted environment or poor quality are imposed through the smart contract and written to the manufacturer's profile after miner validation.
DRA: Every specific area has its own DRA for registering medicines, which deals with medicine related challenges, quality and rules, and facilitates manufacturers with new medicine manufacturing technology. The focus of the DRA discussed in this paper is to stop the manufacturing of counterfeit medicines and to provide a secure way to check a medicine's authenticity. The functionality of the DRA departments is as follows:

• LD: Medicine licensing is the first step for a manufacturer to get permission from the DRA to manufacture a medicine. Manufacturers provide all evidence about their ability to produce, pack and store the medicine. The LD analyzes the manufacturer's pre-production steps and the history of previously manufactured medicines. Manufacturers with good pre-production steps and a good record of previously manufactured medicines are able to get the license for a particular medicine.
• RD: Every medicine has different production processes, packing constraints and ways of storing. Some medicines must be stored at freezing temperature and some at room temperature. Medicine packing processes and packing materials also have some constraints. So, all production, packing and storing related rules and regulations for every medicine are imposed by the RD. Furthermore, these rules change continuously as new technology evolves.

• QAD: This department ensures that the rules and regulations are followed by the manufacturers. With the use of IoT devices, the QAD can check quality measures of the manufacturing processes, which makes it easy to identify a manufacturer that is not following the rules. Once a manufacturer with bad activities is identified, the QAD imposes a penalty on it: some amount of Ether is deducted from the manufacturer's account and the manufacturer's profile rating is decreased. The QAD has the right to cancel the license of any culprit manufacturer.

Vendor: The vendor is the intermediate entity between end users and manufacturers. After manufacturing medicines, a manufacturer sells them to different vendors through the smart contract. The vendor first checks the authenticity of the medicine by looking at the manufacturer's profile and verifying that the license number belongs to the related manufacturer. Then the vendor purchases the medicines through the smart contract and is responsible for storing them in a good quality environment. Vendors are registered with the DRA, and every transaction between a vendor and a manufacturer is saved in the BC database. The DRA can therefore easily verify which manufacturer and vendor are making a transaction.
Once a vendor enters the network, it is hard to sell counterfeit medicines because the vendor has to maintain its profile rating. If any vendor is involved in selling counterfeit medicines, a user will eventually purchase that medicine, check it via the interface, and report the culprit vendor, and the DRA will cancel the vendor's license.
End User: Patients are the ones who use the medicines, and they do not know much about the manufacturers. They are often doubtful about medicines, even when these medicines are formulated by authorized manufacturers, well packed and stored in a good environment. To mitigate this issue, our proposed system provides an interface to the end user, on which users input the medicine license and transaction number; all details about the medicine, such as manufacturer name, formula, and batch number, are then displayed on the interface.
Smart Contract: A smart contract is essentially code written in a programming language. It holds a set of rules and is broadcast to the network when two parties agree on a situation. A smart contract does not change once it has been broadcast to the network. A penalty is imposed on nodes that agreed on a smart contract and later break the rules written in it.
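
To make the interplay between penalties, profile ratings and license cancellation concrete, the following Python sketch models these rules off-chain. The class name, the default ratings and the threshold value are illustrative assumptions, not the contract code used by the authors.

# Minimal off-chain sketch of the rating/penalty rules described above.
# Names, default ratings and thresholds are illustrative assumptions.

class Profile:
    def __init__(self, name, default_rating, threshold):
        self.name = name
        self.rating = default_rating      # e.g. 10 for manufacturers, 3 for vendors
        self.threshold = threshold        # set by the DRA
        self.licensed = True

    def apply_penalty(self, points, ether_fine, balance):
        """QAD penalty: deduct Ether and decrease the profile rating."""
        balance -= ether_fine
        self.rating -= points
        if self.rating < self.threshold:  # rating fell below the DRA threshold
            self.licensed = False         # license is canceled
        return balance

# Example: a manufacturer starts with an assumed default rating of 10.
manufacturer = Profile("ManufacturerA", default_rating=10, threshold=5)
balance = manufacturer.apply_penalty(points=3, ether_fine=2.0, balance=50.0)
balance = manufacturer.apply_penalty(points=3, ether_fine=2.0, balance=balance)
print(manufacturer.rating, manufacturer.licensed, balance)  # 4 False 46.0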

5 Methodology

5.1 Blockchain

BC is a technique used in decentralized networks for maintaining a consistent database among distributed members. Satoshi Nakamoto first used this technique in the well-known currency Bitcoin [25]. A BC based decentralized network has no single fixed database, whereas in a centralized network data has to be stored in a single fixed database. In a public BC, all members are miners and perform mining when adding a block to the BC. In a consortium BC, the network is classified into different layers and a specific layer has the right to mine. A private BC has a limited number of nodes, and only selected nodes behave as miners. Miners are the nodes in the network that solve a mathematical task to validate a transaction. Because every node has a copy of the BC, no one can change data stored in the BC unless they have very strong computation power: an attacker would have to control 51% of the nodes to control mining, which is practically infeasible. Because of this property, BC is widely studied in research nowadays.
A blockchain consists of blocks arranged in a specific order, and each block holds a number of transactions. These transactions are generated by traders and, after successful validation, are broadcast to the entire network. Blocks are chained such that every block header contains a value derived from the previous block, i.e., its hash value. A change in any block would therefore change the hashes of the entire chain. Furthermore, a nonce is added to the block, which poses the mathematical problem. The miner that solves the nonce most efficiently is considered the winning node and broadcasts its block. Miner election schemes, e.g., PoW, proof of capacity, and proof of stake, use computation power, storage capacity and capital, respectively, to elect a node as a miner. Consequently, BC is successful in providing data security in a decentralized manner.
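
As a small illustration of why changing one block invalidates the whole chain, the Python sketch below (not the authors' implementation) chains blocks by storing each block's SHA-256 hash in its successor and searches for a nonce that satisfies a simple difficulty target.

import hashlib
import json

def block_hash(block):
    # Hash the block contents, including the previous block's hash and the nonce.
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def mine(transactions, prev_hash, difficulty=3):
    """Find a nonce so that the block hash starts with `difficulty` zeros (PoW-style)."""
    nonce = 0
    while True:
        block = {"transactions": transactions, "prev_hash": prev_hash, "nonce": nonce}
        h = block_hash(block)
        if h.startswith("0" * difficulty):
            return block, h
        nonce += 1

# Build a tiny chain; tampering with block 1 changes its hash and breaks the link in block 2.
genesis, genesis_hash = mine(["license #001 issued"], prev_hash="0" * 64)
block2, block2_hash = mine(["vendor V bought medicine #001"], prev_hash=genesis_hash)
assert block2["prev_hash"] == block_hash(genesis)   # consistent chain
genesis["transactions"][0] = "license #999 issued"  # tampering with the first block
assert block2["prev_hash"] != block_hash(genesis)   # the chain no longer verifies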

5.2 BC Based Design Overview


In our proposed model, the BC based database is used for saving medicine records. Manufacturers apply for a license through a smart contract, in which they write their pre-production preparation and agree that they will never misuse the license. In the pre-production preparation, the manufacturer describes the location, production material, and production expertise. The DRA checks the manufacturer's eligibility and pre-production preparation, then sends the manufacturer's record for validation. In the PoC consensus, all miners check the manufacturer's profile; if the previous rating is good, the miners validate the profile. After validation, the DRA allows the manufacturer to formulate the medicine and issues a license number for it. The rules and regulations are also applied through the smart contract. Once a manufacturer agrees to the rules and regulations, the data is saved in the BC. The QAD then checks the manufacturer's production processes frequently and randomly. Manufacturers violating the rules are charged a penalty in Ether. Continuous violation of the rules leads the DRA to cancel the license, and the manufacturer can no longer formulate any medicine.
A vendor registers with the DRA through a smart contract, in which its medicine storing preparations are written. The DRA first analyses the storing preparation, and the vendor is validated by the miners. Vendors also possess a profile on which their rating is recorded. If the QAD identifies that a vendor violates the rules or sells counterfeit medicines, the vendor's registration is canceled and it is no longer part of the network. Otherwise, a penalty is imposed for breaking the rules.

Fig. 2. Decentralized system of authentication and manufacturing medicine

5.3 Advantage of BC Based Authentication

Decentralized: Traditionally, the DRA uses an asynchronous centralized database, which counterfeit medicine manufacturers can easily take advantage of. If any department of the DRA loses its data or its data is compromised, there is no way to recover it. Our decentralized BC based database is synchronized, reliable, secure, and transparent. Because every node has a copy of the BC, an attacker can compromise only a small portion of the network. In practice, attackers cannot overcome the network, because the data is distributed to every node.
Tamper-proofing: A malicious user might try to add its own block of transactions or bogus license numbers by compromising a node. Due to the distributed nature of the BC and the use of a consensus mechanism, such transactions are rejected by the miners and the attacker is unable to add a block to the BC.
Consistency: By maintaining the record in the BC based database, every query returns the same results. Whenever a user buys a medicine from a vendor and checks its validity via the interface, the medicine reflects its own manufacturer name and license number.
Timeliness: Based on PoC, all manufacturers are able to add their blocks to the BC. A manufacturer with more trust points, based on valid transactions with multiple vendors, is more likely to win the nonce and add a block to the BC.
Availability: Data stored in the BC can be accessed by all entities in the network. Whenever a user purchases a medicine, he can check its validity via the interface, and the BC responds instantaneously to every medicine inquiry.

6 Workflow of Proposed System


Our proposed system in Fig. 2 describes the BC based decentralized licensing and authentication process. The steps of our proposed model are: (1) drug licensing and rule application; (2) rating of manufacturers and vendors; (3) manufacturer and vendor dealing; (4) customer application interface.
Step 1: Drug licensing and rule application: The drug licensing procedure for the manufacturer is very easy in BC. Since smart contracts are introduced in our system, all activities are performed through the execution of a smart contract. The manufacturer writes the smart contract code, in which the pre-production preparation, material and relevant expertise for the medicine are written.
The LD analyzes the pre-production preparation and the availability of material and relevant expertise of the manufacturer. Then the manufacturer's smart contract is broadcast for consensus. With PoC, each miner validates the manufacturer's smart contract according to its profile rating; other manufacturers act as miners and authenticate the newly licensed manufacturer. After 51% of the miners have authenticated a manufacturer, it becomes eligible to receive the medicine license. With this consensus mechanism, no bogus manufacturer can enter the network, and manufacturers with negative ratings cannot remain part of the network. The LD broadcasts a nonce for the miners with a threshold level; a manufacturer with more valid transactions with vendors has a higher chance of meeting the threshold level.
For vendor registration, the vendor broadcasts a smart contract in which it encodes the rules it follows for storing medicine and its agreement to sell authentic medicines. The LD analyses the smart contract and sends it for miner validation. The vendor's profile is checked by the miners, who validate it according to the profile rating. Vendors with negative ratings will not pass miner validation.
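
Step 1 states that a manufacturer with more valid vendor transactions has more mining power under PoC. The sketch below is one possible way to express that weighting in Python; the field names, the rating threshold and the 51% check are assumptions added for illustration, not the paper's exact consensus code.

import random

# Illustrative PoC weighting: mining power proportional to valid vendor transactions.
miners = {
    "ManufacturerA": {"valid_tx": 40, "rating": 8},
    "ManufacturerB": {"valid_tx": 10, "rating": 9},
    "ManufacturerC": {"valid_tx": 25, "rating": 2},   # low rating, excluded from mining
}

RATING_THRESHOLD = 5   # assumed threshold set by the DRA

def pick_validator(miners):
    """Pick a validator among positively rated miners, weighted by valid transactions."""
    eligible = {m: v for m, v in miners.items() if v["rating"] >= RATING_THRESHOLD}
    names = list(eligible)
    weights = [eligible[m]["valid_tx"] for m in names]
    return random.choices(names, weights=weights, k=1)[0]

def approved(votes, total_miners):
    """A new license is accepted once more than 51% of miners have authenticated it."""
    return votes / total_miners > 0.51

print(pick_validator(miners))             # ManufacturerA is chosen most often
print(approved(votes=2, total_miners=3))  # True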
Step 2: Rating of manufacturers and vendors: For new manufacturers and vendors, the LD adds a default rating to their accounts, e.g., ten is the default highest rating for manufacturers and three is the highest rating for vendors. Manufacturers have a higher rating scale than vendors because the QAD imposes penalties on manufacturers for many factors, whereas vendors are penalized only for environmental or storing factors. The manufacturer's factors are environmental, technological, expertise, and cleanliness.
Step 3: Manufacturer and vendor dealing: Previously, dealings between manufacturers and vendors were not saved in the DRA's records. Now, the vendor broadcasts a smart contract when it needs to buy a medicine, while the manufacturer broadcasts a smart contract for selling its manufactured medicines. When both parties agree on a contract, they make the transaction, which is broadcast to the network for mining.
The miners check the profile ratings of the manufacturer and the vendor; the transaction is successfully validated only when both parties have a positive rating. In our proposed system, this guarantees that a manufacturer with a negative profile cannot sell its medicines and a vendor with a negative profile cannot buy.

Highly rated vendors prefer to deal with highly rated manufacturers. In this way, manufacturers and vendors always maintain high ratings for selling and purchasing.
Step 4: Customer application interface: Traditionally, there was no way for the customer to determine the authenticity of a medicine. The proposed model provides an interface to the customer, on which the customer inputs the license and the number of the transaction performed with the vendor, and receives all details of the medicine. A counterfeit medicine would not have any record, and the customer would receive a warning message. On the interface, the customer can complain to the DRA, by clicking the complaint button, if the medicine is bogus.
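
The customer interface of Step 4 can be pictured as a simple lookup keyed by the license and transaction numbers. The dictionary, field names and warning message below are hypothetical stand-ins for the on-chain records, added only to illustrate the flow.

# Hypothetical on-chain records keyed by (license number, transaction number).
ledger = {
    ("LIC-1001", "TX-77"): {
        "manufacturer": "ManufacturerA",
        "formula": "Paracetamol 500 mg",
        "batch": "B-2019-03",
    },
}

def verify_medicine(license_no, tx_no):
    """Return the medicine details, or a warning if no record exists (possible counterfeit)."""
    record = ledger.get((license_no, tx_no))
    if record is None:
        return {"status": "warning",
                "message": "No record found - possible counterfeit, report to DRA"}
    return {"status": "ok", **record}

print(verify_medicine("LIC-1001", "TX-77"))   # genuine medicine details
print(verify_medicine("LIC-9999", "TX-01"))   # warning message for an unknown medicine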

7 Results and Discussion

Before introducing BC into the medicine licensing and authentication processes, these processes were slow, time-consuming, insecure and unreliable. There was no data synchronization within the DRA, and anyone could tamper with records for his own purposes. Bogus licenses, unenforced regulations and a lack of quality checks on manufacturers were the issues faced in the traditional system. Due to the asynchronous data, users did not know whether a medicine was valid or not.
Our proposed system provides data security, integrity, scalability, and efficiency. A hacker cannot misuse the data for long, because the data is saved on the distributed BC based ledger: if data is tampered with on one end, the other ledgers still hold the original data. Since every transaction is validated by the consensus mechanism, miners check the profile rating first and then validate the transaction. Every medicine is validated three times in the whole process: first at the issuance of the license, then when it is purchased by a vendor and finally when the user purchases it from the vendor.
Figure 3 shows the results of license issuance with respect to time. The results show that the BC based system is very efficient and less time consuming than the existing system. The existing system consumes more hours to complete a task than our proposed system.
Figure 4 shows the results for transactions per day; our proposed BC based system performs well and can handle a huge number of transactions. In the existing system, transactions are performed manually without smart contracts. In the proposed system, a smart contract is used for every transaction, so all entities can work efficiently.
Businesses are established to earn profit in less time. Figure 5 shows the cost of transactions. The decentralized BC based database with smart contracts reduces the transparency cost, security cost, processing cost, storage cost, and communication cost. The existing system has different types of hidden costs as well.

Fig. 3. License issuance in blockchain vs traditional system

Fig. 4. Number of transaction in blockchain vs traditional system

Fig. 5. Cost of transactions in smart contract vs third party

8 Conclusion
In this paper, we proposed a BC based decentralized medicine licensing and authentication system. The manufacturer applies for a license through a smart contract; smart contracts are used for applying for licenses, imposing rules, quality checks and collecting penalties. The user also uses a smart contract to verify the authenticity of a medicine. Finally, the vendor is registered with the DRA and is part of the network for any transaction.

References
1. Drosatos, G., Kaldoudi, E.: Blockchain applications in the biomedical domain: a
scoping review. Comput. Struct. Biotechnol. J. (2019)
2. Roehrs, A., et al.: Analyzing the performance of a blockchain-based personal health
record implementation. J. Biomed. Inf. 92, 103140 (2019)

3. Han, S.-H., et al.: An empirical analysis on medical information sharing model


based on blockchain. Int. J. Adv. Comput. Res. 9(40), 20–27 (2019)
4. Engelhardt, M.A.: Hitching healthcare to the chain: an introduction to blockchain
technology in the healthcare sector. Technol. Innov. Manag. Rev. 7(10) (2017)
5. Azaria, A., et al.: Medrec: using blockchain for medical data access and permission
management. In: 2016 2nd International Conference on Open and Big Data (OBD).
IEEE (2016)
6. Zhang, Y., Wen, J.: The IoT electric business model: Using blockchain technology
for the Internet of Things. Peer-to-Peer Netw. Appl. 10(4), 983–994 (2017)
7. Novo, O.: Scalable access management in IoT using blockchain: a performance
evaluation. IEEE Internet of Things J. (2018)
8. Sharma, P.K., Park, J.H.: Blockchain based hybrid network architecture for the
smart city. Future Gener. Comput. Syst. 86, 650–655 (2018)
9. Xu, Y., et al.: Towards secure network computing services for lightweight clients
using blockchain. Wirel. Commun. Mobile Comput. (2018)
10. Lin, J., et al.: Using blockchain to build trusted LoRaWAN sharing server. Int. J.
Crowd Sci. 1(3), 270–280 (2017)
11. Lin, D., Tang, Y.: Blockchain consensus based user access strategies in D2D net-
works for data-intensive applications. IEEE Access 6, 72683–72690 (2018)
12. Dai, M., et al.: A low storage room requirement framework for distributed ledger
in blockchain. IEEE Access 6, 22970–22975 (2018)
13. Gordon, W.J., Catalini, C.: Blockchain technology for healthcare: facilitating the
transition to patient-driven interoperability. Comput. Struct. Biotechnol. J. 16,
224–230 (2018)
14. Alabi, K.: Digital blockchain networks appear to be following Metcalfe’s Law. Elec-
tron. Commer. Res. Appl. 24, 23–29 (2017)
15. Pãnescu, A.-T., Manta, V.: Smart contracts for research data rights management
over the ethereum blockchain network. Sci. Technol. Libr. 37(3), 235–245 (2018)
16. Zhang, G., et al.: Blockchain-based data sharing system for AI-powered network
operations. J. Commun. Inf. Netw. 3(3) 1–8 (2018)
17. Ren, Y., et al.: Incentive mechanism of data storage based on blockchain for wireless
sensor networks. Mobile Inf. Syst. (2018)
18. Jia, B., Zhou, T., Li, W., Liu, Z., Zhang, J.: A blockchain-based location privacy
protection incentive mechanism in crowd sensing networks. Sensors 18(11), 3894
(2018)
19. Xu, C., et al. Making big data open in edges: a resource-efficient blockchain-based
approach. IEEE Trans. Parallel Distrib. Syst. 30(4) 870–882 (2018)
20. Kushch, S., Prieto-Castrillo, F.: A rolling blockchain for a dynamic WSNs in a
smart city. arXiv preprint arXiv: 1806.11399 (2018)
21. Mandl, K.D., et al.: Public standards and patients control: how to keep electronic
medical records accessible but private. BMJ 322(7281), 283–287 (2001)
22. Choonara, I., Dunne, J.: Licensing of medicines. Arch. Dis. Child. 78(5), 402–403
(1998)
23. McGinnis, M., et al.: Clinical data as the basic staple of health learning: creating
and protecting a public good. National Academies Press (2010)
24. Farzandipour, M., Sadoughi, F., Meidani, Z.: Hospital information systems user
needs analysis: a vendor survey. J. Health Inf. Dev. Countries 5(1) (2011)
25. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system (2008)
Detection of Malicious Code Variants Based
on a Flexible and Lightweight Net

Wang Bo, Wang Xu An, Su Yang(&), and Nie Jun Ke

College of Cryptographic Engineering,


Engineering College of Armed Police Force, Xi’an, Shaanxi, China
wb516100@163.com

Abstract. As the phenomenon of code reuse becomes more widespread within the same malicious family, this paper proposes a method to detect malicious code using a novel neural net. To implement our proposed detection method, malicious code is transformed into RGB images according to its binary sequence. Then, because code reuse features are revealed in the image, the images are identified and classified automatically using a flexible and lightweight neural net. In addition, we utilize the dropout algorithm to address the data imbalance among different malware families. The experimental results demonstrate that our model performs well in accuracy and rate of convergence compared with other models.

1 Introduction

With the rapid development of information technology, the exponential growth of


malicious code has become the main threat to network security. Symantec points out
that 401 million pieces of malicious code were discovered in 2016, including 357
million new variants [1]. However, traditional automatic analysis based on feature
codes is easily bypassed by obfuscating technologies. Although automatic analysis
methods based on dynamic features (such as sandbox technology) have high recog-
nition rate for malicious codes, they have high system overhead and low detection
speed, and are not suitable for detection of malicious codes with large sample sets.
In recent years, as the makers of malware have begun to use standard software engineering practices, the phenomenon of malware being composed from several other code samples has become more widespread, making code reuse more common [2]. Most malware is reused rather than written from scratch. In 2014, Symantec Corporation pointed out that, as malware programmers focus on improving existing malware, the growth in the number of truly new malware families has slowed down. In fact, some malicious code is continually being reused and modified [3]. So, most malware variants share duplicate code snippets with their parent.
Therefore, some researchers use this feature to visualize malware. For malware classification, in contrast to traditional feature extraction methods, they use the image features obtained after visualization to classify the samples. Yoo [4] uses self-organizing maps to detect and visualize malicious code in executable files. Han et al. [5] classify malicious code by generating image matrices. Nataraj et al. [6] converted

© Springer Nature Switzerland AG 2020


L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 367–377, 2020.
https://doi.org/10.1007/978-3-030-33509-0_33

malware samples into grayscale images, using the similarity of the images between
malicious code variants, and combining image processing methods to classify malware.
On this basis, Cui et al. [7] used the convolutional neural network method to improve
the scheme proposed by L. Nataraj.
Based on a thorough study of the above research results, we propose a visualization method to classify malicious code. This method converts binary files to RGB images and then uses a flexible and lightweight neural net that is novel and well suited to image classification problems whose image features are simple. Since data visualizations such as malware visualization and protocol visualization generally use uncomplicated methods to create feature maps that replace the original textual data, the textures of these maps are usually simple compared with, for example, the textures of animals. So, a lightweight structure is enough for a net to classify texture-simple images, and a flexible structure ensures its general applicability. Therefore, the flexible and lightweight net (FLNet) we designed is suitable for our proposed detection method.
The experimental results demonstrate that, using 9342 samples from 25 families for evaluation, the average classification accuracy on the validation data set is 94.39% after 200 epochs, so the model can effectively classify malicious code samples.

2 Data Preprocessing

This paper selects the Malimg dataset published by the Vision Research Lab team in
2011 as the experimental data set, which includes 9342 samples from 25 malware
families [6].
The malware binary bit string can be divided into several substrings of length 8 bits. Since an 8-bit substring can be regarded as an unsigned integer in the range 0-255, which corresponds to the range of gray-scale values 0-255, each substring can be thought of as one pixel [7]. Three consecutive 8-bit substrings are selected, corresponding respectively to the three RGB color channels of the color image: the first 8-bit string gives the value of the R channel, the second 8-bit string the value of the G channel, and the third 8-bit string the value of the B channel. The process is repeated until all the data has been consumed (a final segment of fewer than 24 bits is padded with 1s).
For example, the bit string 011011101001100111010011 is processed as 011011101001100111010011 → 01101110, 10011001, 11010011 → 110, 153, 211.
A 3-byte (24-bit) binary number $B = (b_{23}, b_{22}, b_{21}, \ldots, b_2, b_1, b_0)$ can be converted to the values of the R, G, and B color channels by the following method:

$$R = \sum_{i=0}^{7} b_{i+16} \cdot 2^i, \qquad G = \sum_{i=0}^{7} b_{i+8} \cdot 2^i, \qquad B = \sum_{i=0}^{7} b_i \cdot 2^i.$$

In this way, the malware binary bit string is converted into a length × width × 3 matrix. Since the input to the convolutional neural network used in this paper is an RGB image of 224 × 224 pixels, to facilitate feeding images into the network we generate the color image with a length:width ratio of 1:1 (square image) and then scale it proportionally to 224 × 224 pixels to preserve the texture features in the image as much as possible. Figure 1 shows the flow chart for sample preprocessing in this paper.

Fig. 1. Analysis process using visualization methods.
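
The preprocessing described above can be sketched in a few lines of Python with NumPy and Pillow. The padding of the last segment with 1-bits, the square layout and the 224 × 224 resizing follow the text; the function name and the padding of the incomplete square with 0xFF bytes are our own assumptions.

import math
import numpy as np
from PIL import Image

def malware_to_rgb(raw_bytes, size=224):
    """Turn a malware byte string into a square RGB image, as described in this section."""
    data = bytearray(raw_bytes)
    while len(data) % 3 != 0:          # pad the last segment (<24 bits) with 1-bits
        data.append(0xFF)
    pixels = np.frombuffer(bytes(data), dtype=np.uint8).reshape(-1, 3)
    side = int(math.ceil(math.sqrt(len(pixels))))
    padded = np.full((side * side, 3), 0xFF, dtype=np.uint8)  # pad to a full square (assumption)
    padded[:len(pixels)] = pixels
    img = Image.fromarray(padded.reshape(side, side, 3), mode="RGB")
    return img.resize((size, size))    # scale proportionally to 224 x 224

# Example with the bit string from the text: 01101110 10011001 11010011 -> (110, 153, 211)
img = malware_to_rgb(bytes([0b01101110, 0b10011001, 0b11010011]))
print(img.size)  # (224, 224)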

The texture characteristics of different malicious families differ. Figure 2 shows three color images from three malicious families after image normalization.

Fig. 2. (a) (b) (c) were from the Adialer.C family, the Swizzor.gen!I family, and the VB.AT
family in sequence and all of them were squared to 224 px * 224 px.

Fig. 3. (a) (b) were from Lolyda.AA2 family and both of them were squared to 224 px * 224 px.

Samples within the same family have similar texture patterns. Figure 3 shows two sample images from the Lolyda.AA2 family after image normalization. It can be seen from the figure that the texture features of Sample 1 and Sample 2 are similar from top to bottom.
According to the strategy designed in the previous section, the Malimg sample set was processed to obtain color images for a total of 9342 samples from 25 malicious families. Compared with the grayscale image, the color image retains the main features of the grayscale image and places a more obvious emphasis on the repeated short data segments in the binary file (the repetition here is not exact repetition but pseudo-repetition, i.e., a segment can be slightly different from the preceding repeated segment, so that a color gradient pattern is produced in the color image, as shown in the green pattern in Fig. 2(c)). This article uses a deep learning method to train the classification model, which is novel and inspired by the VGG structure [8] and the Xception structure [9]. The details are described in the next section.

3 Construction of the Classification Model

3.1 Flexible and Lightweight Network


The ImageNet Large Scale Visual Recognition Challenge has spawned many innovative neural network structures, including VGG and Xception. VGG showed that stacking three 3×3 convolution kernels has an effect approximately equal to a single 7×7 convolution kernel while saving about 81% of the parameters [8].

[Figure 4 diagram: the entry flow takes 224×224×3 images through Conv 32 3×3 (stride 2), Conv 64 3×3, and SeparableConv 128/256 blocks with 1×1 strided convolutions as skip connections and 2×2 max pooling; the middle flow applies two identical blocks of SeparableConv 512 3×3 layers to 28×28×512 feature maps; the exit flow processes 7×7×512 feature maps through SeparableConv 512 blocks, fully-connected layers of 256 and 25 units, and a softmax that outputs the classification result.]

Fig. 4. The FLNet structure: the data first goes through the entry flow, then through the middle
flow which is composed of the same two blocks, and finally through the exit flow. Note that all
Convolution and SeparableConvolution layers are followed by batch normalization [10] (not
included in the diagram). All SeparableConvolution layers use a depth multiplier of 1 (no depth
expansion)

Xception utilizes separable convolutions to achieve a complete separation between cross-channel correlation and spatial correlation, greatly reducing the number of parameters without greatly affecting the convolution effect (intuitively, the number of parameters required for a 2-D convolution plus a 1-D convolution is much smaller than that of a full 3-D convolution). It also uses skip connections (which express the output as a linear superposition of the input and a nonlinear transformation of the input) and Batch Normalization (which fixes the distribution of the activation inputs of each hidden layer node, solving the Internal Covariate Shift problem) to avoid the vanishing gradient problem and to accelerate convergence, which greatly speeds up training [9].
Combining the advantages of both, we designed a novel Flexible and Lightweight network structure called FLNet. As shown in Fig. 4, FLNet has 14 layers: 12 convolutional layers that form the feature extraction base of the network and 2 final FC layers that integrate the features to achieve classification. Compared with VGG, it is a little shallower (VGG16 has 16 layers and VGG19 has 19 layers) but much more lightweight (FLNet saves about 84% of the parameters of VGG16). Compared with Xception, it is not as deep (Xception has 36 layers), but its classification effect is good enough to deal with uncomplicated texture patterns such as malware visualization images.
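
As a rough illustration (not the authors' exact architecture), the Keras sketch below builds one FLNet-style path with SeparableConv2D blocks, 1×1 strided convolutions as skip connections, batch normalization, and the two FC layers. The filter counts follow Fig. 4, but the middle and exit flows are collapsed into a single loop, so layer and parameter counts will not match the paper's.

import tensorflow as tf
from tensorflow.keras import layers, models

def separable_block(x, filters):
    """One FLNet-style block: two SeparableConv2D layers with BN, a 1x1 strided
    convolution as the skip connection, and max pooling (simplified sketch)."""
    shortcut = layers.Conv2D(filters, 1, strides=2, padding="same")(x)
    shortcut = layers.BatchNormalization()(shortcut)
    y = layers.ReLU()(x)
    y = layers.SeparableConv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.SeparableConv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.MaxPooling2D(2, strides=2, padding="same")(y)
    return layers.add([y, shortcut])

inputs = layers.Input(shape=(224, 224, 3))
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(inputs)
x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
for filters in (128, 256, 512):            # entry-flow style blocks
    x = separable_block(x, filters)
x = layers.MaxPooling2D(2)(x)              # reduce to 7x7 feature maps as in the exit flow
x = layers.Flatten()(x)
x = layers.Dense(256, activation="relu")(x)
x = layers.Dropout(0.5)(x)                 # Dropout in the FC part, as in Sect. 4
outputs = layers.Dense(25, activation="softmax")(x)   # 25 malware families
model = models.Model(inputs, outputs)
model.summary()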

3.2 Adamax Optimizer


Adam is a stepwise optimization algorithm based on adaptive estimation of the low-order moments of the stochastic objective function. The algorithm is simple to implement, computationally efficient, and low in memory usage. It is suitable not only for problems with large amounts of data or variables, but also for problems where the objective is non-stationary or has serious noise or sparse gradients [11]. Adamax is a variant of Adam based on the infinity norm.
We use Adamax as the optimizer because about 2 million parameters in FLNet participate in the computation of the convolutional layers, which plays to Adamax's strength in handling large numbers of parameters. In addition, since Adamax is an adaptive learning rate algorithm, it has a faster convergence speed, so the model still achieves higher classification accuracy when the number of training epochs is small. This makes it suitable for deployment in network environments for efficient processing of classification and detection problems.
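
For reference, selecting Adamax in Keras (the framework used later in Sect. 5) is a one-line change at compile time. The tiny stand-in model and the learning rate shown (Keras's default) are assumptions for illustration, not values reported by the authors.

from tensorflow.keras import layers, models
from tensorflow.keras.optimizers import Adamax

# Tiny stand-in model; in practice this would be FLNet.
model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Conv2D(8, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(25, activation="softmax"),
])

# Adamax (an infinity-norm variant of Adam) as the gradient optimizer.
model.compile(
    optimizer=Adamax(learning_rate=0.001),   # Keras default learning rate
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)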

4 Neural Network Model Fitness Optimization

The distribution of the data set used in this paper is shown in Fig. 5. Since the number of samples is not uniform among the malicious families (for example, the Allaple.A family has 2949 samples while the Skintrim.N family has only 80 samples), ignoring this imbalance and making no adjustment to the neural network structure would produce over-fitting, resulting in a decrease in classification accuracy.

Fig. 5. Sample distribution of the data set



Fig. 6. The Dropout schematic diagram. (a) shows a standard neural network; in this example there are 4 layers containing 16 nodes, which requires 55 calculations (if one arrow represents one calculation). (b) shows the same network using Dropout with activation probability P = 0.5: only 8 nodes participate in the calculation, which requires only 15 calculations, saving about 72.7% of the calculations.

To solve this problem, Dropout is introduced in the fully connected layer portion of the neural network; the schematic diagram is shown in Fig. 6. Hinton presented Dropout in 2012 [12]. When a complex feedforward neural network is trained on a small data set, overfitting easily occurs. To prevent overfitting, the performance of the neural network can be improved by preventing the feature detectors from co-adapting. In short, during forward propagation each neuron is activated only with a certain probability, which makes the model generalize better because it does not rely too much on particular local features.
Assume that a neural network has $L$ hidden layers, and let $l \in \{1, \ldots, L\}$ index the hidden layers, $z^{(l)}$ denote the vector of inputs to layer $l$, and $y^{(l)}$ denote the output vector of layer $l$ (with $y^{(0)} = x$ the input). $W^{(l)}$ and $b^{(l)}$ are respectively the weights and the bias of layer $l$. A standard forward propagation step of the neural network can therefore be described as:

$$z_i^{(l+1)} = w_i^{(l+1)} y^{(l)} + b_i^{(l+1)}, \qquad y_i^{(l+1)} = f\left(z_i^{(l+1)}\right),$$

where $f$ is any activation function. When Dropout is used, the forward propagation becomes:

$$r_j^{(l)} \sim \mathrm{Bernoulli}(p), \qquad \tilde{y}^{(l)} = r^{(l)} \ast y^{(l)},$$

$$z_i^{(l+1)} = w_i^{(l+1)} \tilde{y}^{(l)} + b_i^{(l+1)}, \qquad y_i^{(l+1)} = f\left(z_i^{(l+1)}\right),$$

where $\ast$ denotes the element-wise product. For any layer $l$, $r^{(l)}$ is a vector of independent Bernoulli random variables, each of which equals 1 with probability $p$. Multiplying $r^{(l)}$ by $y^{(l)}$ gives the thinned output $\tilde{y}^{(l)}$, which is used as the input to the next layer. This process is applied to each layer, which is equivalent to sampling a sub-network from the larger network.
As shown in Fig. 7, an intuitive description is that during training each neural node is activated with probability $p$, whereas during testing every node is always activated, because the integrity of the model must be ensured at test time.

Fig. 7. Operation of Dropout during training and testing. (a) While training, a node is activated with probability $p$ and $w$ denotes its weights to the next layer. (b) In testing, the node is always activated and the weights to the next layer become $W_{\mathrm{test}}^{(l)} = pW^{(l)}$.
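
The train/test behaviour of Fig. 7 can be reproduced directly from the equations above. The NumPy sketch below samples a Bernoulli mask during training and scales the weights by p at test time; the layer sizes and random data are arbitrary illustrations.

import numpy as np

rng = np.random.default_rng(0)
p = 0.5                                   # activation (keep) probability
W = rng.normal(size=(16, 8))              # weights to the next layer (arbitrary sizes)
b = np.zeros(8)
y = rng.normal(size=16)                   # output of the current layer

def relu(z):
    return np.maximum(z, 0.0)

# Training: thin the layer output with a Bernoulli mask r ~ Bernoulli(p).
r = rng.binomial(1, p, size=y.shape)
y_tilde = r * y
out_train = relu(y_tilde @ W + b)

# Testing: every node stays active, weights are scaled to p * W.
out_test = relu(y @ (p * W) + b)

print(out_train.shape, out_test.shape)    # (8,) (8,)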

5 Performance Analysis

This part evaluates the performance of FLNet and compares the results with common neural networks (e.g., VGG and Xception). In this comparison, we use Loss and Accuracy to evaluate each neural network.
In this experiment, we use the Malimg data set from the Vision Research Lab [6] and apply the preprocessing to transform the samples into RGB images. Then we use Keras to build and train the CNNs in an environment with an Intel Core i7-8750H CPU (2.20 GHz, 12 logical processors), an Nvidia GeForce GTX 1060 GPU (6 GB) and 16 GB RAM.
As shown in Fig. 8, FLNet shows a significant advantage compared with VGG16 and Xception because it has a faster convergence speed and a lower loss value. From Fig. 8(a) we can see that on the training data set, FLNet converges in about 20 epochs, much quicker than Xception (about 60 epochs) and VGG16 (about 200 or more). The reason is that FLNet is much more lightweight than the other two (about 1/7 the size of VGG16 and about 1/10 of Xception), so it can reach convergence faster. A simplified structure can extract the simple texture features efficiently, whereas a complicated structure is counterproductive here. Therefore, FLNet achieves the higher Accuracy and lower Loss during training.

Fig. 8. Accuracy and Loss on train data set

Fig. 9. Accuracy and Loss on validation data set

However, as shown in Fig. 9, the more complex neural networks show better robustness in the test period, which results in better performance than FLNet. It is worth mentioning that VGG16 reaches better performance on the validation data set than on the training data set because VGG uses Dropout to avoid overfitting, which divides the neural network into many weak classifiers during training; only by aggregating them does it obtain a strong classification effect, while in the test period dropout is disabled and the network is used in its entirety. In FLNet, we use not only Dropout but also Batch Normalization in the hidden layers to keep the input distribution close to a standard normal distribution, so in the test period FLNet does not show such an exaggerated gap.
Although FLNet does not perform as well as VGG16 and Xception on the validation data set as the number of epochs increases, FLNet still shows an advantage within 60 epochs. So, in situations where fast results are needed, such as online testing, FLNet has high application value.

6 Conclusions

This paper presents a method of converting binary files into RGB images to emphasize the feature of repeated short data segments, and an approach to detect malicious code variants using a novel neural network, FLNet, which is flexible and lightweight and performs well at extracting simple features. The Adamax algorithm is chosen as the gradient optimization algorithm because it suits neural networks with many parameters and large data sets, and it performs best compared with other algorithms on FLNet. To speed up training and further avoid overfitting, we use Dropout in the FC layers. The benefit of using a CNN to detect and classify malicious code variants is that it requires little preprocessing, needs no code execution or disassembly, and delivers good accuracy and quick response in an automatic way. The proposed scheme has potential value in online detection or in assisting professional analysts in determining the type of malicious code variants.

Acknowledgements. This work is supported by the National Key Research and Development
Program of China Under Grants No. 2017YFB0802000, National Cryptography Development
Fund of China Under Grants No. MMJJ20170112, the Natural Science Basic Research Plan in
Shaanxi Province of china (Grant Nos. 2018JM6028), National Nature Science Foundation of
China (Grant Nos. 61772550, 61572521, U1636114, 61402531), Engineering University of
PAP’s Funding for Scientific Research Innovation Team (Grant No. KYTD201805).

References
1. Symantec: Internet Security Threat Report (2017)
2. Anderson, B., Lane, T., Hash, C.: Malware phylogenetics based on the multiview graphical
lasso. In: Advances in Intelligent Data Analysis Xiii, vol. 8819, pp. 1–12 (2014)
3. Alazab, M.: Profiling and classifying the behavior of malicious codes. J. Syst. Softw. 100,
91–102 (2015)
4. Yoo, I.: Visualizing windows executable viruses using self-organizing maps. In: Proceedings
of ACM Workshop on Visualization and Data Mining for Computer Security, pp. 154–166
(2004)
5. Han, K.S., Lim, J.H., Kang, B.: Malware analysis using visualized images and entropy
graphs. Int. J. Inf. Secur. 14(1), 1–14 (2015)
6. Natara, L., Karthikeyan, S., et al.: Malware images: visualization and automatic classifi-
cation. In: International Symposium on Visualization for Cyber Security (2011)
7. Cui, Z.H., Xue, F., Cai, X.J., Cao, Y., Wang, G.G., Chen, J.J.: Detection of malicious code
variants based on deep learning. IEEE Trans. Ind. Inf. 14(7), 3187–3196 (2018)
8. Simoyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image
recognition. Computer Science (2014)
9. Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: 2017 IEEE
Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2017)

10. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing
internal covariate shift. In: Proceedings of the 32nd International Conference on Machine
Learning, pp. 448–456 (2015)
11. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. Computer Science (2014)
12. Hinton, G.E., Srivastava, N., Krizhevsky, A., et al.: Improving neural networks by
preventing co-adaptation of feature detectors. Computer Science (2012)
Preprocessing of Correlation Power Analysis
Based on Improved Wavelet Packet

Peng Ma, Ze-yu Wang, WeiDong Zhong, and Xu An Wang(&)

Key Laboratory of Network and Information Security under Chinese People


Armed Police Force (PAP), Engineering University of PAP,
Xi’an 710086, Shaanxi, China
mapengzyp@163.com, wangxazjd@163.com

Abstract. Preprocessing is a very important step in side channel analysis. The quality of the collected power traces seriously affects the efficiency of side channel analysis, so preprocessing based on the Wavelet Transform (WT) and Wavelet Packet Denoising (WPD) is widely used. However, WT has certain defects in characterizing the detail information of power traces, and the WPD threshold is neither universal nor adaptive. To solve these problems, this paper provides a preprocessing method for power traces that combines WPD and Singular Spectrum Analysis (SSA): the former decomposes the power consumption data, and the latter extracts the information of the low-frequency and high-frequency parts. Then, according to the fluctuation trend of the singular entropy, the key information contained in the two parts is extracted adaptively, so as to improve the quality of the power traces. Finally, a chosen-plaintext attack on a hardware implementation of the SM4 algorithm shows that the method improves the efficiency of Correlation Power Analysis (CPA).

Keywords: Correlation Power Analysis · Preprocessing · WT · WPD · SSA · Singular entropy

1 Introduction

In the process of encrypting data, cryptographic devices often exhibit leakage, such as power consumption [1], timing [2], electromagnetic radiation [3], and so on. Side Channel Analysis (SCA) makes full use of this leakage and, through mathematical and statistical analysis, derives the relationship between the leakage and the encrypted (or decrypted) data, so as to recover the secret key. A power analysis attack analyzes and attacks the power traces leaked by the cryptographic device during the encryption process. Correlation Power Analysis (CPA), proposed by Brier et al. [4] in 2004, has been widely adopted due to its strong attack capability and its efficiency in recovering keys.
Before performing SCA, the first step is usually to sample the leakage of the cryptographic device. To eliminate the influence of noise and improve the quality of the sampled power consumption, many methods are used to denoise the data before the power analysis attack. Thanh-Ha proposed using the estimated fourth-order cumulant instead of the original signal to perform traditional CPA and Differential Power Analysis (DPA) to improve the attack performance [5]. Xavier used the wavelet transform

© Springer Nature Switzerland AG 2020


L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 378–388, 2020.
https://doi.org/10.1007/978-3-030-33509-0_34

to improve DPA [6]; Souissi also proposed a special noise reduction threshold based on
information theory, but it is not universal in noise reduction of power trace [7]. In
addition, Wei also proved that the efficiency of wavelet transform noise reduction is
better than high-order cumulant [8]. However, the wavelet transform based noise
reduction method ignores the high frequency part of the data during the denoising
process, and has certain defects in characterizing the data detail information [9]. In
[10], Wavelet Packet Decomposition (WPD) is used in data preprocessing to denoise
data and improve the attack efficiency of power attack. However, in fact, wavelet
packet decomposition may have special threshold requirements for different applica-
tions and neglect of useful components in high-frequency signals in noise reduction
processing.
Aiming at the problems in the above noise reduction methods, this paper proposes
an improved wavelet packet denoising method. Singular Spectrum Analysis (SSA) [11]
is an effective time series analysis tool that decomposes complex signals into different
subsequences by singular value decomposition. Unlike Principal Component Analysis
(PCA) [12], SSA can process single-shot data based on a special matrix structure.
Therefore, the SSA is used to process the low-frequency part and the high-frequency
part of the wavelet packet decomposition, and the detail information of power trace is
extracted from each part according to the singular entropy [13] fluctuation trend so as to
improve the quality of the power trace. Then use three kinds of data (original power
trace, wavelet packet threshold noise reduction power trace, improved wavelet packet
noise reduction power trace) to select the plaintext attack on the hardware imple-
mentation of SM4 algorithm, and verify the attack efficiency of the proposed method
for related power attack. And the accuracy rate is improved.

2 Preliminaries

2.1 SM4 Block Cipher


The SM4 block cipher is a block encryption algorithm used in the Chinese wireless standard [14]. In 2012, it was identified by the State Commercial Cryptography Administration of China as a national cryptographic industry standard. The SM4
algorithm has a block length of 128 bits and a key length of 128 bits. The encryption
and decryption algorithms all use 32 rounds of unbalanced Feistel iteration structure,
which first appeared in the key expansion algorithm of block cipher LOKI. SM4 adds a
reverse-order transformation after 32 rounds of nonlinear iterations, so that only the
decryption key is the reverse order of the encryption key, so that the decryption
algorithm is consistent with the encryption algorithm. The structure of the SM4
encryption and decryption algorithm is identical, except that the decryption key is the
reverse order of the encryption key when the round key is used. The specific encryption
process is shown in Fig. 1.

Fig. 1. SM4 algorithm encryption flow chart

2.2 Correlation Power Analysis (CPA)


The CPA attack was proposed by Brier et al. [4]; it uses the linear relationship between real power consumption data and modeled power consumption data to recover the key. The process of a CPA attack is mainly summarized in the following steps [15]:
1. Choose a model for the target's power consumption. Generally the model is the Hamming weight model or the Hamming distance model, focused on a specific intermediate value in the encryption device.
2. Encrypt a number of different plaintexts on the target and record the power trace of the target's power consumption during each encryption.
3. For every possible value of the subkey, use the plaintexts of step 2 and the subkey guess to calculate the hypothetical power consumption according to the model.
4. Calculate the Pearson correlation coefficient (Eq. 1) between the modeled power consumption and the measured power consumption. The subkey guess with the maximum correlation coefficient is taken as the correct subkey.
5. Repeat the above steps to obtain all subkeys, and then recover the secret key from the subkeys.

$$\rho(T, H) = \frac{E(TH) - E(T)E(H)}{\sqrt{\mathrm{Var}(T)\,\mathrm{Var}(H)}} \qquad (1)$$
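
As a minimal illustration of step 4, the NumPy sketch below correlates a matrix of hypothetical power values (one column per key guess, e.g. Hamming weights of a predicted intermediate) with measured traces and picks the guess with the largest absolute correlation. The data here is random and purely illustrative, and the function name is our own.

import numpy as np

rng = np.random.default_rng(1)
n_traces, n_samples, n_guesses = 200, 500, 256

# T: measured power traces; H: modeled (hypothetical) power, one column per key guess.
T = rng.normal(size=(n_traces, n_samples))
H = rng.integers(0, 9, size=(n_traces, n_guesses)).astype(float)  # e.g. Hamming weights 0..8

def cpa_correlations(T, H):
    """Pearson correlation (Eq. 1) between each key guess and each trace sample."""
    Tc = T - T.mean(axis=0)
    Hc = H - H.mean(axis=0)
    num = Hc.T @ Tc                                   # shape (n_guesses, n_samples)
    den = np.outer(np.sqrt((Hc**2).sum(axis=0)), np.sqrt((Tc**2).sum(axis=0)))
    return num / den

rho = cpa_correlations(T, H)
best_guess = np.unravel_index(np.abs(rho).argmax(), rho.shape)[0]
print(rho.shape, best_guess)   # (256, 500), index of the most correlated key guess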

2.3 Wavelet Packet Decomposition


Traditional signal analysis methods are based on the Fourier Transform (FT) [16]. However, the FT has certain limitations in analyzing the local information of signals and in processing non-stationary signals. Even though the improved Short-Time Fourier Transform (STFT) uses a fixed sliding window function that determines a fixed time-frequency resolution of the signal, it still lacks the corresponding adaptive ability. The wavelet transform solves the above problems in signal analysis and processing. The Wavelet Transform (WT) is a signal analysis and processing method built on a series of mathematical tools such as the Fourier transform, functional analysis and numerical analysis. It is widely used in signal processing, image analysis and speech processing [17]. Figure 2 shows the two-layer wavelet decomposition of a signal S.

Fig. 2. Two-layer wavelet decomposition of signal S

As shown in Fig. 2, L is the low-frequency data of the signal (the approximation part), which is generally considered to be the main information of the signal, while H is the high-frequency data of the signal (the detail part), which is generally considered to be the secondary information of the signal and is often treated as the noise part. As can be seen from Fig. 2, the WT operates only on the low frequency data of the signal and ignores the high frequency data. This means that the WT cannot fully characterize signals that contain high frequency information [18]. Therefore, the denoising performance of the WT has certain defects.
The Wavelet Packet Transform (WPT) [19] is an improvement on the WT. Unlike the wavelet transform, which only decomposes the low-frequency data, it also decomposes the high-frequency data to achieve better time-frequency localization. The wavelet packet provides more orthogonal bases on top of wavelets. Figure 3 shows the two-layer decomposition of the signal S. It is assumed that the signal can be represented by the function $f(t)$, which is decomposed using a wavelet packet; $P_j^i$ denotes the $i$-th decomposition coefficient on the $j$-th layer after decomposition, and $G$ and $H$ are the wavelet decomposition filters. The fast algorithm for binary wavelet packet decomposition is then [20]:

Fig. 3. Two-layer wavelet packet decomposition of signal S

$$P_0^1 = f(t) \qquad (2)$$

$$P_j^{2i-1} = \sum_k H(k - 2t)\, P_{j-1}^{i}(t) \qquad (3)$$

$$P_j^{2i} = \sum_k G(k - 2t)\, P_{j-1}^{i}(t) \qquad (4)$$

The reconstruction algorithm is:

$$P_j^{i} = 2\left[\sum_k h(t - 2k)\, P_{j+1}^{2i-1}(t) + \sum_k g(t - 2k)\, P_{j+1}^{2i}(t)\right] \qquad (5)$$

In Eq. 5, $j = J-1, J-2, \ldots, 1, 0$; $i = 2^j, 2^j - 1, \ldots, 2, 1$; $J = \log_2 N$; and $h$, $g$ are the wavelet reconstruction filters.
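
A two-level wavelet packet decomposition and reconstruction of a trace, as in Eqs. 2-5, can be carried out with the PyWavelets library. The wavelet ('db4'), the decomposition level and the toy processing step below are illustrative choices, not the parameters used in the paper's experiments.

import numpy as np
import pywt

rng = np.random.default_rng(2)
trace = np.sin(np.linspace(0, 20 * np.pi, 4096)) + 0.3 * rng.normal(size=4096)  # toy power trace

# Two-level wavelet packet decomposition (both low- and high-frequency parts are split).
wp = pywt.WaveletPacket(data=trace, wavelet="db4", mode="symmetric", maxlevel=2)
nodes = wp.get_level(2, order="natural")
print([node.path for node in nodes])      # ['aa', 'ad', 'da', 'dd']

# Example processing step: attenuate the highest-frequency node, then reconstruct.
wp["dd"].data = 0.5 * wp["dd"].data
denoised = wp.reconstruct(update=True)
print(denoised.shape)                     # roughly the length of the input trace (up to padding)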

2.4 Singular Spectrum Analysis and Singular Entropy

(1) Singular Spectrum Analysis


Singular Spectrum Analysis (SSA) is a method for analyzing the distribution of the different components of the data at hand; the data targeted by SSA has nonlinear characteristics. Firstly, the data is transformed into a trajectory matrix. Secondly, the data is decomposed and regrouped using Singular Value Decomposition (SVD) [21]. Finally, the obtained data is represented by the data of the different components [22].
(2) Singular Entropy [13]
According to the decomposition principle of the signal singular value [23], an $m \times n$ dimensional real matrix $A$ can be factored into an $m \times l$ dimensional matrix $U$, an $l \times l$ dimensional diagonal matrix $\Lambda$ and an $n \times l$ dimensional matrix $V$. These three matrices satisfy the relationship of Eq. 6:

$$A_{m \times n} = U_{m \times l} \cdot \Lambda_{l \times l} \cdot V_{n \times l}^{T} \qquad (6)$$

The main diagonal elements $\lambda_i$ ($i = 1, 2, \ldots, l$) of the diagonal matrix $\Lambda$ are non-negative and arranged in descending order, i.e., $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_l \geq 0$; these diagonal elements are the singular values of the matrix $A$.
Theory and practice have shown that, for a noise-free signal or a signal with a high signal-to-noise ratio, the matrix $\Lambda$ obtained after singular value decomposition can be described as in Eq. 7:

$$\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_k, 0, \ldots, 0) \quad (k < l,\ \lambda_i \neq 0,\ i = 1, 2, \ldots, k) \qquad (7)$$

When the signal has a lower signal-to-noise ratio, the matrix obtained by the singular value decomposition can be described as in Eq. 8:

$$\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_i, \ldots, \lambda_l) \quad (\lambda_i \neq 0,\ i = 1, 2, \ldots, l) \qquad (8)$$

Obviously, the number of non-zero main diagonal elements in the matrix is closely related to the complexity of the frequency components contained in the signal. The more non-zero diagonal elements there are in the matrix $\Lambda$, the more complex the signal components; after the signal is disturbed, the main diagonal elements of $\Lambda$ may even all be non-zero. The fewer non-zero main diagonal elements in $\Lambda$, the simpler the frequency content of the signal. This shows that the matrix $\Lambda$ objectively reflects the amount of information in the signal. Therefore, the singular entropy is defined as:

$E_k = \sum_{i=1}^{k} \Delta E_i \quad (k \le l)$ (9)

where k represents the order of the singular entropy, and $\Delta E_i$ represents the increment of the singular entropy at order i, which can be calculated by Eq. 10.

$\Delta E_i = -\left(\lambda_i \Big/ \sum_{j=1}^{l} \lambda_j\right) \log\left(\lambda_i \Big/ \sum_{j=1}^{l} \lambda_j\right)$ (10)
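To make the definition concrete, the short sketch below computes the k-order singular entropy of a data matrix from its singular values. The negative sign and the normalization by the sum of the singular values follow the usual singular-entropy definition in [13]; the clipping constant is only there to avoid log(0).

```python
import numpy as np

def singular_entropy(A, k=None, eps=1e-12):
    """k-order singular entropy E_k of matrix A (cf. Eqs. 9 and 10)."""
    s = np.linalg.svd(np.asarray(A, dtype=float), compute_uv=False)  # singular values, descending
    if k is None:
        k = len(s)
    p = s / max(s.sum(), eps)                    # lambda_i / sum_j lambda_j
    dE = -p * np.log(np.clip(p, eps, None))      # increments Delta E_i
    return float(dE[:k].sum())                   # E_k = sum_{i=1..k} Delta E_i
```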

3 CPA Based on Improved Wavelet Packet

In the process of using wavelet packet decomposition and noise reduction, the choice of
threshold and the quantization method of threshold are closely related to the effect of
data noise reduction [24]. In practical applications, either the default threshold noise
reduction is selected, or the adjustment parameters are continuously tested to achieve
the purpose of noise reduction. The default threshold method is not targeted and cannot accurately denoise power consumption data according to its characteristics. When the parameters are tuned by repeated testing, the resulting parameters are only suitable for the current power consumption data and are not universal. Moreover, WPD only applies threshold denoising to the high-frequency coefficients of each

Fig. 4. Improved wavelet packet denoising flowchart (a noisy signal S → wavelet packet decomposition → the best wavelet packet tree → SSA noise reduction of the low-frequency and high-frequency coefficients → rewrite the processed coefficients to the nodes → wavelet packet tree reconstruction → denoised data X)

layer, ignoring that the low-frequency coefficients still contain a small amount of noise. In the case of a low signal-to-noise ratio, the noise contained in the low-frequency coefficients still occupies a large proportion, which seriously affects the efficiency of the power analysis attack and the accuracy of recovering the key.
Therefore, in view of the above problems, this paper combines SSA with WPD to pre-process and denoise the power consumption data, thereby improving the efficiency of CPA.
Figure 4 is a flow chart of the improved wavelet packet denoising for power data pre-processing. The basic flow is as follows:
(1) Decompose the power consumption data by wavelet packet. Firstly, a wavelet
with better symmetry, tight support and orthogonality is selected as the wavelet

Fig. 5. The best wavelet packet tree (tree decomposition and data for node (0,0))



Fig. 6. Corresponding energy value of the best wavelet packet tree nodes

base of wavelet packet decomposition, and then the number of layers of wavelet
packet decomposition is determined.
(2) Getting the best tree, as shown in Fig. 5. Choose a suitable entropy criterion and obtain the optimal wavelet packet tree of the wavelet packet decomposition.
(3) Processing of the wavelet packet decomposition coefficients. As can be seen from Fig. 6, except for the first node (parent node), which has the highest energy value, the remaining nodes alternate between larger and smaller energy values. Therefore, in wavelet packet noise reduction, the low-frequency and high-frequency nodes should be analyzed and processed at the same time. The wavelet packet coefficients are processed using singular spectrum analysis, and the power consumption information is extracted according to the fluctuation trend of the singular entropy corresponding to the low-frequency and high-frequency singular values.
(4) Wavelet packet reconstruction. The node coefficients processed by SSA are rewritten into the nodes, and the tree is then reconstructed to obtain the power consumption data after noise reduction.
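The basic flow above might be sketched with the PyWavelets library roughly as follows. For brevity, the sketch uses a full fixed-level wavelet packet decomposition instead of the entropy-selected best tree of step (2), includes a compact copy of the SSA helper from Sect. 2.4 so that it is self-contained, and treats the wavelet name, decomposition level and SSA parameters as illustrative assumptions rather than the settings used in the experiments.

```python
import numpy as np
import pywt  # PyWavelets

def _ssa_denoise(x, L, r):
    # Compact SSA: embed -> SVD -> keep r components -> anti-diagonal averaging
    x = np.asarray(x, dtype=float)
    N, K = len(x), len(x) - L + 1
    X = np.column_stack([x[i:i + L] for i in range(K)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Xr = (U[:, :r] * s[:r]) @ Vt[:r, :]
    y, c = np.zeros(N), np.zeros(N)
    for j in range(K):
        y[j:j + L] += Xr[:, j]
        c[j:j + L] += 1
    return y / c

def improved_wpd_denoise(trace, wavelet="db4", level=3, L=20, r=2):
    """Denoise a power trace: WPD -> SSA on every (low- and high-frequency) node -> reconstruction."""
    # (1) Wavelet packet decomposition of the power consumption data
    wp = pywt.WaveletPacket(data=np.asarray(trace, dtype=float),
                            wavelet=wavelet, mode="symmetric", maxlevel=level)
    # (2)-(3) Process the coefficients of every node at the chosen level with SSA
    for node in wp.get_level(level, order="natural"):
        coeffs = node.data
        if len(coeffs) > L:                      # SSA needs enough samples to embed
            wp[node.path] = _ssa_denoise(coeffs, L, r)
    # (4) Rewrite the processed coefficients and reconstruct the denoised trace
    return wp.reconstruct(update=True)[:len(trace)]
```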

4 Experiment and Analysis

In order to evaluate the efficiency and accuracy of CPA with this method, the hardware implementation of the SM4 algorithm is attacked with chosen plaintexts. Following the steps in Sect. 2.1, the S-boxes of the first round are selected as the leak points of the attack. Figure 7 shows a sampled power trace. Three kinds of power consumption data (the raw power traces, the WPD-processed power traces and the improved-WPD-processed power traces) are then used to carry out the CPA attack, and the attack efficiency of the three kinds of data and the accuracy of recovering the key are analyzed.
Figures 8, 9, 10 and 11 show the relationship between the correlation coefficient and the number of power traces obtained by performing CPA on the four S-boxes for the three types of power consumption data (from left to right: raw power consumption, WPD

Fig. 7. Sampling a power trace

Fig. 8. The relationship between the first S-box power consumption data and the correlation
coefficient

Fig. 9. The relationship between the second S-box power consumption data and the correlation
coefficient

power consumption and improved WPD power consumption). It can be concluded from the figures that, using the sampled power consumption curves, the raw power traces successfully recover the key with 190, 120, 200 and 140 traces, respectively. Traditional wavelet packet threshold denoising causes the required number of power traces to increase after the data is processed, and may even fail to recover the key correctly. After the improved WPD is used for data processing, the number of power traces required to recover the key is reduced to different degrees: cracking the keys of the four S-boxes requires 120, 100, 130 and 120 power traces, respectively, and the attack efficiency is increased by about 37%, 17%, 35% and 14%.

Fig. 10. The relationship between the third S-box power consumption data and the correlation
coefficient

Fig. 11. The relationship between the fourth S-box power consumption data and the correlation
coefficient

5 Conclusion

In this paper, in view of the current problems of wavelet packet denoising, we combine SSA with WPD and propose a CPA method based on improved WPD. The method first uses WPD to find the best wavelet packet tree and extracts the low-frequency and high-frequency coefficients of the power consumption data in each node. Then, SSA is used to remove the noise information according to the fluctuation trend of the singular values and singular entropy of the node coefficients. Finally, all processed node coefficients are written back into the wavelet packet tree, and the data is reconstructed to obtain clean power consumption data after noise reduction. The method effectively solves the problem that threshold denoising in wavelet decomposition is not targeted, while retaining the power consumption information in the high-frequency part and improving the quality of the power consumption data. Chosen-plaintext CPA against the hardware implementation of the SM4 algorithm shows that the proposed method is superior to WPD in denoising.

Acknowledgements. This work is supported by the National Key Research and Development
Program of China Under Grants No. 2017YFB0802000, National Cryptography Development
Fund of China Under Grants No. MMJJ20170112, the Natural Science Basic Research Plan in
Shaanxi Province of China (Grant Nos. 2018JM6028), National Natural Science Foundation of
China (Grant Nos. 61772550, 61572521, U1636114, 61402531), Engineering University of
PAP’s Funding for Scientific Research Innovation Team (grant no. KYTD201805).

References
1. Kocher, P., Jaffe, J., Jun, B.: Differential power analysis. In: Advances in Cryptology,
CRYPTO, vol. 1666 (1999)
2. Kocher, P.C.: Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other
systems. In: Advances in Cryptology—CRYPTO 1996. Springer, Heidelberg (1996)
3. Agrawal, D., et al.: The EM Side-Channel(s): Attacks and Assessment Methodologies (2003)
4. Brier, E., Clavier, C., Olivier, F.: Correlation power analysis with a leakage model. In:
International Workshop on Cryptographic Hardware and Embedded Systems. Springer,
Heidelberg (2004)
5. Le, T.H., Clediere, J., Serviere, C., et al.: Noise reduction in side channel attack using fourth-
order cumulant. IEEE Trans. Inf. Forensics Secur. 2(4), 710–720 (2007)
6. Charvet, X., Pelletier, H.: Improving the DPA attack using wavelet transform. In: NIST
Physical Security Testing Workshop, p. 46 (2005)
7. Souissi, Y., Elaabid, M.A., Debande, N., et al.: Novel applications of wavelet transforms
based side-channel analysis. In: Non-Invasive Attack Testing Workshop (2011)
8. Liu, W., Wu, L., Zhang, X., et al.: Wavelet-based noise reduction in power analysis attack.
In: 2014 Tenth International Conference on Computational Intelligence and Security,
pp. 405–409. IEEE (2014)
9. Yanni, P.: Application of wavelet transform in signal denoising. J. ChongQing Univ. 27(10),
40–43 (2004)
10. Duan, X., She, G., Gao, X., et al.: Wavelet packet based AES related power analysis attack.
Comput. Eng. 43(6), 84–91 (2017)
11. Myung, N.K.: Singular spectrum analysis. 1283(4), 932–942 (2009). Springer, Berlin
12. Wold, S.: Principal component analysis. Chemometr. Intell. Lab. Syst. 2(1), 37–52 (1987)
13. Yang, W., Jiang, J.: Study on singular entropy of mechanical signals. J. Mech. Eng. 36(12),
9–13 (2000). (in Chinese)
14. Wang, S., Gu, D., Liu, J., et al.: A power analysis on SMS4 using the chosen plaintext
method. In: 2013 9th International Conference on Computational Intelligence and Security
(CIS), pp. 748–752. IEEE (2013)
15. Teng, Y., Chen, Y., Chen, J. et al.: Differential power consumption and related power
analysis of SM4 algorithm. J. Chengdu Univ. Inf. Technol. 29(1), 13–18 (2014)
16. Pan, M., Lv, X., Zhang, L., et al.: Signal case analysis combining wavelet transform and
Fourier transform. Inf. Secur. Commun. Priv. 6, 62–63 (2007)
17. Ma, L., Han, Y.: Periodicity of time series using wavelet transform. In: National Academic
Conference on Youth Communication (2007)
18. Liu, Z.: Signal denoising method based on wavelet analysis. J. ZheJiang Ocean Univ. (Nat.
Sci. Ed.) 30(2), 150–154 (2011)
19. Qi, X.: Research on quantitative timing strategy based on wavelet packet transformation (2018)
20. Nikolaou, N.G., Antoniadis, I.A.: Rolling element bearing fault diagnosis using wavelet
packets. NDT E Int. 35(3), 197–205 (2002)
21. Golub, G.H., Reinsch, C.: Singular value decomposition and least squares solutions. In:
Linear Algebra, pp. 134–151. Springer, Heidelberg (1971)
22. Ai, J., Wang, Z., Zhou, X., et al.: Improved wavelet transform for noise reduction in power
analysis attacks. In: 2016 IEEE International Conference on Signal and Image Processing
(ICSIP), pp. 602–606. IEEE (2016)
23. Ren, N., Liu, Z.: Research on spectral analysis method based on modern signal processing.
Software 39(455(3)), 157–159 (2018)
24. Lv, N., Su, S., Zhai, C.: Application of improved wavelet packet threshold algorithm in
vibration signal denoising. In: 11th Youth Academic Conference of the Chinese Acoustics
Society, vol. 1, pp. 330–333 (2017)
A Method of Annotating Disease Names
in TCM Patents Based on Co-training

Na Deng1(&), Xu Chen2, and Caiquan Xiong1


1
School of Computer Science, Hubei University of Technology, Wuhan, China
iamdengna@163.com
2
School of Information and Safety Engineering,
Zhongnan University of Economics and Law, Wuhan, China
chenxu@whu.edu.cn

Abstract. In the era of big data, annotated text data is a scarce resource. The
annotated important semantic information can be used as keywords in text
analysis, mining and intelligent retrieval, as well as valuable training and testing
sets for machine learning. In the analysis, mining and intelligent retrieval of
Traditional Chinese Medicine (TCM) patents, similar to Chinese herbal medi-
cine name and medicine efficacy, disease name is also an important annotation
object. Utilizing the characteristics of TCM patent texts and based on co-training
method in machine learning, this paper proposes a method of annotating disease
names from TCM patent texts. Experiments show that this method is feasible
and effective. This method can also be extended to annotate other semantic
information in TCM patents.

1 Introduction

In the era of big data, annotated text data is a scarce resource. The annotated important
semantic information can be used as keywords in text analysis, mining and intelligent
retrieval, as well as valuable training and testing sets for machine learning. At present,
in order to obtain high quality annotated data, it usually takes a lot of manpower to
annotate a large number of text data manually.
In the analysis, mining and intelligent retrieval of TCM patents, similar to the name
of Chinese herbal medicine and the effect of medicines, the name of disease is also an
important annotation object. The annotated disease names can be used as keywords in
patent text mining, such as patent classification, patent clustering and so on. They can
also be used as keywords in patent intelligent retrieval. In intelligent search, search
engines return patents containing words similar to the semantics of disease names.
The abstract part of patent contains the main information of the patent, including
the technology used in the patent, the efficacy achieved by the patent and so on.
In TCM patent text, the abstract mainly contains the diseases that can be treated by the
patent, the main Chinese herbal used and the efficacy that can be achieved. The patent
abstracts are usually short texts containing concise information of patents. The tradi-
tional keyword extraction methods are not suitable for patent abstracts.
Based on the characteristics of TCM patent abstract texts and the co-training
method in machine learning, this paper proposes a method to annotate disease names

© Springer Nature Switzerland AG 2020


L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 389–398, 2020.
https://doi.org/10.1007/978-3-030-33509-0_35

from TCM patent texts. Experiments show that this method is feasible and effective.
This method can also be extended to annotate other semantic information of TCM
patent texts.

2 Related Work

Patent information extraction and semantic annotation is an important step before patent analysis and mining. [1] proposes a method combining rule-based and statistical learning approaches to annotate and extract information on patent features, composition and usage. [2] utilizes some fixed words to automatically extract information from patents, such as the background, technical proposal, advantages and so on. [3] proposes a novel ontology-based automatic semantic annotation approach, which extracts the structure information from patent documents and then identifies the semantics of entities and the relations between entities.
Co-training is a kind of weakly supervised machine learning method. Its working principle is that two independent classifiers are constructed; one classifier is trained with a small amount of labeled data, and then the labeled samples with high confidence are added to the training samples of the other classifier, so that the two classifiers are trained together repeatedly [4]. Co-training is very suitable for situations with a large amount of data but little or no annotated data. In the authors' previous work, [5] applies the co-training method to annotate effect clues in patent texts and achieves good results. [6, 7] use co-training to annotate effect clauses in Chinese patents.

3 Characteristics of Disease Names in TCM Patent Texts

Disease name in TCM patent texts indicates the name of disease that the patent can
treat or prevent. Through observation, we find that in TCM abstract texts, disease
names or the clauses containing disease names usually have some remarkable
characteristics:
(1) The clauses containing disease names often contain some special words, which
are called clue words in this paper. Such as “treat”, “cure” and “prevent”. Clauses
containing clue words are also highly likely to contain disease names.
(2) Disease names always appear continuously in many cases, such as “(the patent)
can treat various skin diseases, such as tinea pedis, eczema, beri-beri, and so on”.

4 Clue Words

Clue words refer to those words that can indicate that the clause is a clause containing
disease name. According to the different situations of clue words, we divide them into
the following categories:

(1) Therapeutic indicators: Therapeutic indicators refer to words such as “treat”, “cure”, “prevent” and “mainly treat”. These words are usually followed by disease names, indicating that the patent can treat certain diseases. For example, in “The invention is a kind of drug for treating sinusitis”, “sinusitis” is a disease name.
(2) Adverbial indicators of disease object: adverbial indicators of disease object refer
to the words such as “for”, “be aimed at” and so on. These adverbials are usually
followed by disease names, which mean that patents have therapeutic effects on
certain diseases.
(3) Therapeutic effect indicators: Therapeutic effect indicators refer to the words such
as “significant”, “unique”, “effective”, “curative effect”, “effect” and so on. These
words indicate that patents are effective in treating certain diseases, or the extent
to which they are effective.
(4) Therapeutic effect adverbial indicators: Therapeutic effect adverbial indicators
refer to the words such as “have” and “achieve”. These adverbials are usually
followed by therapeutic effects, indicating that patents have certain therapeutic
effects.

5 Method

According to the above, disease names often appear continuously in the abstract texts
of TCM patents, and the clauses containing disease names usually contain some special
clue words. Therefore, this paper uses these structural characteristics and text com-
position characteristics, and combines co-training technology in machine learning to
give a method to annotate disease names from TCM patent abstract texts. The flow
chart of the method is shown in Fig. 1.
The process in details is as follows:
1. Start from a small number of TCM patent abstract texts, and the texts are cut into
clauses using punctuation marks (period, comma, etc.).
2. For each clause, Chinese word segmentation is executed and according to certain
strategy, the set of disease name seeds can be obtained.
3. In more TCM patent abstract texts, clause cut and Chinese word segmentation are
carried out.
4. For each clause, the weight of each clause is calculated according to whether it
contains clue words and disease name seeds. The clause whose weight exceeds the
threshold which is manually set is taken as candidate clause.
5. After deleting the stop words and other irrelevant words from the candidate clause,
the new annotated disease name is obtained.
6. The newly annotated disease name can be added to the set of disease name seeds to
start the next iteration.

Fig. 1. Flow chart of the method (a few TCM patent abstract texts → clause cutting → Chinese word segmentation → disease name seeds; more TCM patent abstract texts → clause cutting → Chinese word segmentation → for each clause, calculate its weight according to whether it contains clue words or disease name seeds → after filtering, keep candidate clauses whose weight exceeds the threshold → stop word removal → disease names)

5.1 Preprocessing
The abstract texts of TCM patents need to be preprocessed, including clause cutting
and Chinese word segmentation. patent_abstract_list is a list storing TCM patent
abstract texts. patentID_clauselist_dict is a two-level nested dictionary. The key of the
outer dictionary is patent ID. The value clauseID_clause_dict is the inner dictionary.
The key of the inner dictionary is clause ID. The value is the word list named
words_list generated from the clause after Chinese word segmentation. The structure of
the two-level nested dictionary is shown in Fig. 2.

Fig. 2. The structure of patentID_clauselist_dict (first-level dictionary: key = patentID, value = clauseID_clause_dict; second-level dictionary: key = clauseID, value = words_list)

Source code of preprocessing in Python is shown in Fig. 3.

Fig. 3. Source code of preprocessing.
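Since Fig. 3 is reproduced only as an image, the sketch below shows what such preprocessing might look like, using the names patent_abstract_list and patentID_clauselist_dict from Sect. 5.1; the jieba package is an assumed stand-in for whichever Chinese word segmentation tool the authors actually used.

```python
import re
import jieba  # assumed Chinese word segmentation tool

def preprocess(patent_abstract_list):
    """Cut each TCM patent abstract into clauses and segment each clause into a word list."""
    patentID_clauselist_dict = {}
    for patent_id, abstract in enumerate(patent_abstract_list):
        # cut into clauses using punctuation marks (period, comma, etc.)
        clauses = [c for c in re.split(r"[。，,；;！？!?\s]+", abstract) if c]
        patentID_clauselist_dict[patent_id] = {
            clause_id: list(jieba.cut(clause))   # Chinese word segmentation
            for clause_id, clause in enumerate(clauses)}
    return patentID_clauselist_dict
```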

5.2 Collection of Disease Name Seeds


There are various expressions of disease names. Some are ended with character “病
(disease)” or “炎(inflammation)”, such as “胃病(stomach disease)”, “中耳炎(tympa-
nitis)” and “阑尾炎(appendicitis)”, and some expressions do not have obvious char-
acteristics, such as “鼻息肉(nasal polyp)” and “toothache”. In order to collect disease
name seeds, this paper adopts a simple and direct method: for those clauses containing
the words “treat” or “mainly treat”, if a non-stop word in the clause ends with “病” or
“炎”, it is considered to be a disease name. Words collected in such a way are used as
the set of disease name seeds. The Python code is shown in Fig. 4.

Fig. 4. Source code of collecting disease name seeds.
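Figure 4 is likewise only an image; the seed-collection strategy of Sect. 5.2 might be sketched as follows, with the stop-word set taken as an assumed input and “治疗”/“主治” standing for “treat”/“mainly treat”.

```python
def collect_disease_seeds(patentID_clauselist_dict, stop_words):
    """Collect words ending with 病 or 炎 from clauses that contain 治疗 (treat) or 主治 (mainly treat)."""
    treat_words = {"治疗", "主治"}
    seeds = set()
    for clauseID_clause_dict in patentID_clauselist_dict.values():
        for words_list in clauseID_clause_dict.values():
            if not treat_words.intersection(words_list):
                continue                          # only clauses containing "treat"/"mainly treat"
            for word in words_list:
                if word not in stop_words and (word.endswith("病") or word.endswith("炎")):
                    seeds.add(word)               # e.g. 胃病, 中耳炎, 阑尾炎
    return seeds
```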

5.3 Collection of Clue Words


The clause containing disease names often contains some special clue words, which
can help us identify which clauses are more likely to contain disease names, so we need
to collect these clue words. In order to obtain high quality clue words, this paper
collects some clue words artificially based on the category of clue words specified
above, as shown in Table 1. The set of clue words will be enriched in several iterations
of the whole method.

Table 1. Collection of clue words.


主治(mainly treat) 治疗(treat) 预防(prevent) 对(for)
防治(Prevent and cure) 对于(for) 针对(aim at) 治愈(cure)
效果(effect) 独特(unique) 有效(effective) 显著(significant)
疗效(curative effect) 可(can) 有(have) 具有(possess)
作用(curative effect) 达到(achieve) 用于(is used in) 病症(disease)

5.4 Calculation of the Weights of Clauses


Considering the characteristics of disease names and clauses containing disease names,
that is, the clause containing disease names often contains clue words and disease
names often appear continuously, we can calculate the weight of each clause to obtain
candidate clauses which have a high probability of containing disease names. The flow
chart of calculating weights of clauses is shown in Fig. 5.

Fig. 5. Flow chart of calculating weights of clauses (a TCM abstract text → clause cutting → Chinese word segmentation → for each clause, set its initial weight to 0 → for each word in the clause, if the word exists in the set of clue words, add weight_of_word to the clause weight; if the word exists in the set of disease name seeds, add weight_of_word to the clause weight → the total weight of the clause is obtained)

The source code for calculating the weight of a clause in Python is shown in Fig. 6, where wordlist is the word sequence of a clause after Chinese word segmentation, clue_word_list is the list of clue words, disease_list is the list of disease name seeds, and weight_of_word is a manually set parameter representing the value added each time.

Fig. 6. Source code of calculating weight of a clause.
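Because Fig. 6 is an image, a minimal sketch of the weight computation described above is shown here, using the names wordlist, clue_word_list, disease_list and weight_of_word from the text.

```python
def clause_weight(wordlist, clue_word_list, disease_list, weight_of_word=1.0):
    """Weight of one clause: add weight_of_word for every clue word or disease name seed it contains."""
    weight = 0.0
    for word in wordlist:
        if word in clue_word_list:
            weight += weight_of_word
        if word in disease_list:
            weight += weight_of_word
    return weight
```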

5.5 Storage of Weights of Clauses


After calculating the weights, the weights of all clauses in the TCM patent abstract texts need to be stored. Similar to patentID_clauselist_dict in the preprocessing step, the clause weights

are also stored in a two-level nested dictionary. The key of the outer dictionary is the
patent ID, and the value is an inner dictionary; the key of the inner dictionary is the
clause ID, and the value is the final weight of the clause. The Python code is shown in
Fig. 7, where patentID_clause_weight_dict is the outer nested dictionary, and
clause_weight_dict is the inner dictionary, which is also the value of patentID_clause_
weight_dict.

Fig. 7. Source code of storage of weights of clauses.

5.6 Creation of Candidate Clauses


After calculating the weight of each clause in TCM patent abstract texts, we get several
clauses with higher weights (exceeding the threshold manually set) as candidate clauses
that may contain disease names. The steps in details are as follows:
1. Create a new dictionary, patentID_topN_clauses, which stores the clauses with the topN weights in all patents. It is also a two-level nested dictionary.
2. Traverse the two-level nested dictionary patentID_clause_weight_dict, which stores the weights of all patent clauses.
3. For each TCM patent abstract text, create a new list named topN_clauses storing all the topN clauses, and a new list named topN_clauseid_list storing the id numbers of all topN clauses.
4. If the weight of a clause is greater than the threshold weight_threshold, record the clause_id of the clause and add the clause_id into topN_clauseid_list.
5. In the two-level nested dictionary patentID_clauselist_dict, which stores all the word sequences of the patent clauses, find the word sequences corresponding to the clauses in topN_clauseid_list and store them in the dictionary patentID_topN_clauses.
The Python code is shown in Fig. 8.

Fig. 8. Source code of creation of candidate clauses.
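The five steps above might be implemented roughly as follows; the two input dictionaries are assumed to be populated as described in Sects. 5.1 and 5.5, and the default threshold value is an illustrative assumption.

```python
def select_candidate_clauses(patentID_clause_weight_dict, patentID_clauselist_dict,
                             weight_threshold=2.0):
    """Keep, per patent, the clauses whose weight exceeds the manually set threshold."""
    patentID_topN_clauses = {}
    for patent_id, clause_weight_dict in patentID_clause_weight_dict.items():
        # step 4: record the ids of clauses whose weight exceeds the threshold
        topN_clauseid_list = [cid for cid, w in clause_weight_dict.items()
                              if w > weight_threshold]
        # step 5: look up the corresponding word sequences and store them
        patentID_topN_clauses[patent_id] = {
            cid: patentID_clauselist_dict[patent_id][cid]
            for cid in topN_clauseid_list}
    return patentID_topN_clauses
```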

6 Experiments

We run the code in Fig. 4 on 1000 TCM patent abstract texts and obtain the disease name seeds shown in Fig. 9.

Fig. 9. Disease name seeds.

After artificial screening, part of newly annotated disease names is shown in


Fig. 10.

Fig. 10. Part of newly annotated disease name seeds.



7 Conclusion

Taking advantage of the characteristics of TCM patents, namely that the clauses containing disease names also contain clue words and that disease names often appear continuously, this paper proposes a method based on co-training to annotate disease names from the abstract texts of TCM patents. The experimental results show that the method is effective.

Acknowledgments. This work was supported by National Key Research and Development
Program of China under Grant 2017YFC1405403; National Natural Science Foundation of China
under Grant 61075059; Philosophical and Social Sciences Research Project of Hubei Education
Department under Grant 19Q054; Green Industry Technology Leading Project (product devel-
opment category) of Hubei University of Technology under Grant CPYF2017008; Research
Foundation for Advanced Talents of Hubei University of Technology under Grant BSQD12131;
Natural Science Foundation of Anhui Province under Grant 1708085MF161; and Key Project of
Natural Science Research of Universities in Anhui under Grant KJ2015A236.

References
1. Guangpu, F., Xu, C., Zhiyong, P.: A rules and statistical learning based method for Chinese
patent information extraction. In: Eighth Web Information Systems & Applications
Conference. IEEE, Piscataway New Jersey (2011)
2. Hou, C.Y., Li, W.Q., Li, Y.: An automatic information extraction method based on the
characteristics of patent. Adv. Mater. Res. 472–475, 1544–1550 (2012)
3. Wang, F., Lin, L.F., Yang, Z.: An ontology-based automatic semantic annotation approach for
patent document retrieval in product innovation design. Appl. Mech. Mater. 446–447, 1581–
1590 (2013)
4. Blum, A., Mitchell, T.: Combining labeled and unlabeled data with co-training. In:
Conference on Computational Learning Theory, pp. 92–100 (1998)
5. Deng, N., Wang, C., Zhang, M., et al.: A semi-automatic annotation method of effect clue
words for chinese patents based on co-training. Int. J. Data Warehouse. Min. 14(4), 1–19
(2018)
6. Chen, X., Peng, Z., Zeng, C.: A co-training based method for Chinese patent semantic
annotation. In: The 21st ACM International Conference on Information and Knowledge
Management. ACM (2012)
7. Chen, X., Deng, N.: A semi-supervised machine learning method for Chinese patent effect
annotation. In: 2015 International Conference on Cyber-Enabled Distributed Computing and
Knowledge Discovery (CyberC). IEEE, Piscataway New Jersey (2015)
Semantic Annotation in Maritime Legal Case
Texts Based on Co-training

Jun Luo(&), Ziqi Hu, Qi Liu, Sizhuo Chen, Peiyong Wang,


and Na Deng

Hubei University of Technology, Wuhan, China


2233524221@qq.com

Abstract. In the era of artificial intelligence and big data, a large number of
legal case texts have been accumulated in the process of law enforcement of
marine rights protection. These case texts contain a lot of important information,
such as the time, place, person, event, judgment body, judgment result and so
on. The annotation of these semantic information is an important link in text
analysis, mining and retrieval of sea-related cases. In this paper, a semantic
annotation method based on collaborative training for maritime legal texts is
proposed. The experimental results show that the method is correct and feasible.

1 Introduction

In the era of artificial intelligence and big data, a large number of legal case texts have
been accumulated in the process of law enforcement of marine rights protection. These
case texts contain a lot of important information, such as the time, place, person, event,
judgment body, judgment result and so on. The annotation of these semantic infor-
mation is an important link in text analysis, mining and retrieval of sea-related cases.
How to effectively use the important information in these case texts has become an
important proposition at present.
With the development of computer science and artificial intelligence technology,
people are more and more inclined to use the powerful computing power of computer
and the powerful analytical power of artificial intelligence to assist human decision-
making. However, the strong computing power of computers and the strong analytical
power of artificial intelligence are based on the need for a large number of high-quality
training sets and test sets. Therefore, labeled data is a scarce resource in the era of big
data. At present, there is a popular saying in the data labeling industry, “How much
intelligence, how many manual work”. It reflects an objective fact: in order to provide
high-quality training and test sets for machine learning and artificial intelligence, people
usually use time-consuming manual annotation method to annotate the semantic fea-
tures of data. Under the disadvantageous condition of insufficient labeling data, in order
to extricate people from the arduous predicament of manual labeling, collaborative
training technology emerged as the times require. Under the disadvantage of insufficient
labeling data, this paper proposes a semantic labeling method based on collaborative
training to label the important semantic information of legal texts of maritime cases.

© Springer Nature Switzerland AG 2020


L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 399–408, 2020.
https://doi.org/10.1007/978-3-030-33509-0_36

2 Related Work

At present, the research on semantic annotation by scholars at home and abroad covers
many aspects, such as keyword extraction, named entity recognition, entity relationship
recognition and so on. The main purpose of named entity recognition is to identify the
components representing named entities from text. In this field, the current mainstream
methods of Chinese named entity recognition have many directions: rule-based,
statistics-based, machine learning-based and the combination of various methods.
A rule-based entity recognition method is usually based on analysis of the entity's features, from which artificial matching rules are constructed. Rule-based named entity recognition has the advantages of stable performance and fast speed in small-scale corpus tests. However, its drawbacks are that it takes a lot of money and time to summarize rules manually or to organize the corpus and named entity database, and its portability is poor: once ported to other languages or other fields, named entity recognition cannot be completed, and the rules and defined entities need to be rewritten. Rule-based methods tend to have higher accuracy, but their recall needs to be improved; moreover, once a large number of rules are written, the rules may conflict with each other.
Research on cooperative training algorithms can be traced back to the Co-Training algorithm proposed by Blum and Mitchell [1]. The algorithm assumes that the data attributes have two sufficient and redundant views [2], recorded as view 1 and view 2. Firstly, classifier C1 is trained on view 1 of the labeled data set L, and classifier C2 is trained on view 2. Then a number of instances (counted as u) are randomly selected from the unlabeled data set U and put into a pool U'. All elements in U' are labeled with C1 and C2, respectively, and the results are recorded as E1 and E2. Then, from E1 and E2, p positive and n negative labeled examples with high confidence are selected and put into L. Finally, 2p + 2n instances from U are selected and added to U', and the above process is repeated until the stopping condition is satisfied.
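A schematic sketch of this Blum–Mitchell loop with two scikit-learn classifiers is shown below. It assumes binary labels stored in a NumPy array with both classes present in the initial labeled set, and non-negative bag-of-words count features (as required by MultinomialNB); the pool handling, pseudo-label counts and number of rounds are illustrative assumptions rather than the configuration used in this paper.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

def co_training(X1, X2, y, labeled_idx, unlabeled_idx, p=1, n=3, rounds=10):
    """Schematic co-training on two views X1, X2 (same rows = same instances); mutates y in place."""
    labeled, unlabeled = list(labeled_idx), list(unlabeled_idx)
    c1 = c2 = None
    for _ in range(rounds):
        if not unlabeled:
            break
        c1 = MultinomialNB().fit(X1[labeled], y[labeled])   # classifier C1 on view 1
        c2 = MultinomialNB().fit(X2[labeled], y[labeled])   # classifier C2 on view 2
        for clf, X in ((c1, X1), (c2, X2)):
            if not unlabeled:
                break
            proba = clf.predict_proba(X[unlabeled])
            # most confident positives / negatives among the current unlabeled pool
            pos = [unlabeled[j] for j in np.argsort(-proba[:, 1])[:p]]
            neg = [unlabeled[j] for j in np.argsort(-proba[:, 0])[:n] if unlabeled[j] not in pos]
            for i, label in [(i, 1) for i in pos] + [(i, 0) for i in neg]:
                y[i] = label                # adopt the confident pseudo-label
                labeled.append(i)
                unlabeled.remove(i)
    return c1, c2
```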
Collaborative training is a weakly supervised machine learning method. Its working principle is to construct two independent classifiers, train one classifier with a small amount of labeled data, and then add the labeled samples with high confidence to the training samples of the other classifier. Through such iterations, the two classifiers cooperate with each other and gradually expand the scale of the labeled data. Collaborative training is very suitable for situations with a large amount of data but little or no annotated data. In recent years, it has been applied to various types of data annotation, such as images [4], web pages [5], emotional texts [6] and entity relationship extraction [7].
There are many research methods based on semantic annotation. Document [8] combines statistics and rules to recognize named entities in Internet text. Document [9], considering that the expression structure of numerals and time words is relatively standardized, adopts a rule-based method to recognize numerals and time in named entity recognition; considering the irregularity of person names, place names and organization names, a conditional random field model is used to identify them.

3 Systematic Framework of Semantic Annotation


3.1 Semantic Annotation Method for Sea-Related Cases Based
on Collaborative Training
The authors implement the semantic annotation of sea-related case texts by using the cooperative training algorithm. On the one hand, an AdaBoost classifier is constructed from the semantic label feature database formed by semantic annotation and natural language processing, and is used to train on the samples. In order to speed up the generation of labeled samples and complete the training process for a large number of samples, three classifiers are used in the system to label simultaneously. Cross-labeling training is conducted among the classifiers to avoid large experimental errors caused by occasional semantic labeling mistakes. Each time a newly tagged text that can be added to the sample set is generated, the confidence values of the three trained models are compared with each other. If the confidence is high, the semantic labels of the sample are added to the storage database for the next round of reading, as shown in Fig. 1.

Fig. 1. Semantic annotation model of sea-related case text based on collaborative Training

Firstly, three classifiers are trained based on three views of a small number of
labeled data samples. Then, some unlabeled data are randomly selected from the
unmarked sample data set U and labeled by three classifiers. The first n samples with
high confidence are selected from the labeled results of a single classifier and added to

the other two training sets. The above process is repeated until the end condition is
reached. When judging the new samples, the result with more votes will be taken as the
criterion, which will greatly improve the accuracy of the classifier’s conclusion.
In each iteration of the training process, the test accuracy of the collaborative training must be recorded. If the accuracy recorded in the current iteration is higher than that recorded in the previous iteration, the selected supplementary set is considered to improve the whole model, and the supplementary set is added to the training set. If the accuracy recorded in the current iteration is lower than that recorded in the previous iteration, the selected supplementary set is considered to have a negative effect on the whole model; it is returned to the test set to wait for the next iteration, and the number of times it has been returned is recorded. When this count reaches a set value (assumed to be N times), the sample is considered to have been judged N times by the model as having a side effect, and it is removed from the overall data. Through this model, a large number of test samples that improve the model can be used for training, so as to ensure the accuracy of the whole data set and of the training process.

3.2 Semantic Annotation Process Based on Collaborative Training


There are some punctuation symbols, misspelling, spelling errors, temporal errors and
misuse of singular and plural numbers on the free labeling of case texts in test samples.
These conditions greatly affect the labeling of test samples, so it is necessary to
standardize the processing of samples and convert them into a unified presentation
format. This process advocates eliminating the separators, special symbols and incor-
rect spelling test set in the text at first, and then using IK Analyzer tool to process
Chinese word segmentation and stop-using words, finally forming a preliminary
screening of samples.
In the process of semantic tagging, we will encounter situations where a phrase can be matched in more than one way. In such cases, the labeling system can split the labels with a fuzzy matching option. For example, “collision damage” can be marked either as the act involved in the case or as a professional term; the two classifications lead to different understandings. Therefore, in the training of the sample texts, semantic ambiguity must be handled in order to avoid misinterpreting the semantics and reducing the accuracy of the classification training.

4 Project Implementation Process

The specific implementation of the implementation is divided into the following five
steps:
1. Acquisition of sea-related case samples
Crawler scripts written in Python are used to capture the target information of sea-related cases. A large amount of content is crawled, which is then matched and

searched with keywords such as “law concerning the sea”, “maritime law enforcement” and “safeguarding maritime rights”. When crawler technology is used to grab text documents, some irrelevant content is produced, so regular expressions are needed to clean up and filter the irrelevant content. Crawling strategies include the depth-first traversal algorithm and the breadth-first traversal algorithm, with the Beautiful Soup library used to parse the pages.
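A minimal sketch of such a crawler is shown below; the start URL, the English keyword stand-ins for the Chinese search terms, and the cleaning pattern are hypothetical placeholders, and Beautiful Soup is used only to parse the fetched pages.

```python
import re
from collections import deque

import requests
from bs4 import BeautifulSoup

KEYWORDS = ["law concerning the sea", "maritime law enforcement",
            "safeguarding maritime rights"]           # English stand-ins for the Chinese terms
START_URL = "https://example.org/maritime-cases"      # hypothetical entry page

def crawl(start_url=START_URL, max_pages=50):
    """Breadth-first crawl that keeps the text of pages matching the sea-related keywords."""
    seen, queue, documents = {start_url}, deque([start_url]), []
    while queue and len(seen) <= max_pages:
        url = queue.popleft()
        try:
            html = requests.get(url, timeout=10).text
        except requests.RequestException:
            continue
        soup = BeautifulSoup(html, "html.parser")
        text = re.sub(r"\s+", " ", soup.get_text())    # strip markup and squeeze whitespace
        if any(k in text for k in KEYWORDS):           # keep only sea-related case pages
            documents.append(text)
        for a in soup.find_all("a", href=True):        # breadth-first traversal of the links
            link = a["href"]
            if link.startswith("http") and link not in seen:
                seen.add(link)
                queue.append(link)
    return documents
```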
2. Constructing a semantic annotation model
Based on the obtained sea-related legal case texts, the basic theory and methods of natural language processing are combined with the collaborative training method to label all kinds of important semantic information in the case texts, such as the time, law, plaintiff, defendant, illegal act, judging agency, punishment, etc. The process of labeling the semantic information of the sea-related legal case texts using collaborative training is shown in Fig. 2.
According to the different types of important semantic information, different labeling methods are proposed. For time information and legal information, because their characteristics are obvious (year, month and day, or book-title marks, etc.), a rule-based method is used to semantically mark these two types of information.

Fig. 2. Semantic information annotation flow chart of the sea-related legal case texts using collaborative training (text of maritime legal cases → Chinese word segmentation → removal of out-of-date words → classification → rule-based time information annotation, rule-based law information annotation, and annotation of other important semantic information based on collaborative training → end)

3. Building a collaborative training model


For other semantic information that does not have obvious features, such as illegal
activities, judgment institutions, punishments, etc., will use collaborative training to
construct different collaborative training models for different information, the model
work diagram shown in Fig. 3.

Fig. 3. Schematic diagram of the collaborative training model for annotating semantic information (text of maritime legal cases → Chinese word segmentation → removal of out-of-date words → semantic annotation for different text information → manual labeling of seed words → selection of collaborative training views → cooperative training on the views → labeling results)

According to the different semantic information and its characteristics, a small number of case texts are labeled manually to establish a seed lexicon. Then relatively independent but complementary collaborative training views are selected, and these views are trained cooperatively to find more semantic information in more case texts.
4. Semantic annotation of sea-related legal case texts based on collaborative training
The purpose of entity relationship labeling in maritime legal case texts is to identify and label the semantic relationships between entities in the case texts. For example, for the entity “Party Chen Huiyong” (TYPE = “PER”) and the entity “Hui'an County Chuanchuan Handling Station” (TYPE = “FAC”, SUBTYPE = “Government”), the relational mapping formed between the entities is TYPE = “PHYS”, SUBTYPE = “Located”.
The word “complaint” indicates that the type of case is a civil dispute case, so such words have a strong indicative role. After semantic tagging, these training words can distinguish the types of cases.
The method of entity relation extraction based on collaborative training is similar to
that of semantic information extraction. Firstly, a small number of seed entity relation
sequence patterns are summarized from the context containing relational seeds, and
then more relational seed instances are found by using relational sequence patterns, so a
new relational seed set is formed. In order to avoid semantic drift to some extent, this
project will use two independent feature sets to provide different and complementary
information, thus reducing annotation errors.

5. Performance evaluation
(1) Preparation of test set: In the semantic annotation of maritime legal case texts, due
to the lack of common test sets, it is proposed to build test sets independently and
evaluate the accuracy of semantic annotation of test samples manually.
(2) Spatio-temporal cost of testing under different scale corpora: test the space-time
overhead and performance of the project method in case texts of different scales
such as 100, 1000, 10000, 100000.
(3) Contrast experiments with other existing text labeling methods: compare the
accuracy and time-space overhead of the method and other text labeling methods
in different scales of 100, 1000, 10000, 100,000, etc. Compare and evaluate the
performance of the project method.

5 Experiment

In order to verify the reliability and validity of this method, we give the titles of four test sample cases; each sample corresponds to one case. The sample sentences in these tests come from the preliminary tagging and screening of the titles of the case texts. The purpose of the screening is to classify the description of the specific violations committed by the parties in each sample, namely:
Sentence 1 = “Accidents involving Foreign Maritime Traffic”
Sentence 2 = “Dispute over Liability for Damage Caused by Ship Collision”
Sentence 3 = “Dispute over Contract of Carriage of Goods by Sea”
Sentence 4 = “Ship Construction Insurance Contract Dispute Case”
According to the semantic annotation model under collaborative training, the four
test sentences are divided into:
Sentence 1 = {“foreign-related”, “maritime”, “traffic accident case”, “accident
case”}. Among them, “foreign-related” is the attribute of the relationship, “sea” is the
place where the case occurs, “traffic accident case” is the type of case, and “accident
case” is the attribute of the case, which lies between criminal and civil cases;
Sentence 2 = {“collision of ships”, “collision damage”, “liability dispute case”,
“dispute case”}, in which “collision of ships” is the action of the case, “collision
injury” is the attribute of the degree of the case, “liability dispute” is the category of the
case, and “dispute case” is the attribute of the case, generally referring to civil cases;
Sentence 3 = {“maritime”, “cargo transport”, “contract dispute case”, “dispute
case”}, in which “maritime” is the place where the case occurred, “cargo transport” is
the action when the case occurred, “contract dispute case” is the category of the case,
and “dispute case” is the attribute of the case, generally referring to civil cases;
Sentence 4 = {“Ship Construction”, “Insurance Contract”, “Contract Dispute
Case”, “Dispute Case”}, in which “Ship Construction” is the action when the case
occurs, “Insurance Contract” is the content attribute of the case concerned, and
“Contract Dispute” is the attribute of the case, generally referring to civil cases.
By training different test samples, dividing and labeling the cases on the premise of
semantic labeling, we can quickly classify the cases.

The four test sentences are classified according to the category labels formed by the semantic markers. The similarity between sentence 3 and sentence 4 has the highest confidence, so they can be handled as one type of case. Sentence 1, whose similarity confidence with sentences 2, 3 and 4 is the lowest, cannot be classified into the same case type as the other three cases. The similarity confidence between sentence 2 and sentence 3, and between sentence 2 and sentence 4, is not as high as that between sentence 3 and sentence 4, but it is higher than that between sentence 1 and the other three sentences; it therefore represents an intermediate state between high and low confidence and can be divided according to specific needs. The results obtained from the evaluation algorithm and the trained annotation model are shown in Table 1.

Table 1. Sample semantic similarity confidence tested by training model


Semantic similarity confidence Sent1 Sent2 Sent3 Sent4
Sent1 — — — —
Sent2 11.6% — — —
Sent3 10.3% 57.1% — —
Sent4 12.7% 61.9% 80.7% —

The accuracy of the sample text training can be obtained by comparing the results of the evaluation algorithm and the semantic annotation model with whether the cases are actually similar.
The experiments above introduce the algorithm and the semantic annotation model with only a few test cases. To distinguish a large number of maritime case texts, however, it is not feasible to use the pure semantic annotation model alone, because comparing the model's conclusions with the actual situation of each case text would consume a great deal of human resources and is neither convenient nor scientific. Therefore, this paper adopts the semantic annotation model of sea-related case texts based on collaborative training proposed above, which can grow from a small labeled sample set into a large sample set and thus train a large number of case texts. By establishing a large database, the model stores the labels produced by the semantic annotation of the case texts and manages the large training sample set through the database, which improves the efficiency of data management and increases data security.

Table 2. Comparing the accuracy of model training before and after semantic annotation
Model Classification model of direct training Semantic annotated training model
Accuracy rate 71% 80.6%
F1 70.3% 80.4%

Table 3. Accuracy of collaborative training model with different numbers of training texts
Number of Classification model of direct Accuracy of training model after
training training accuracy rate semantic annotation
10 81.262% 97.414%
20 79.641% 88.019%
30 74.579% 84.813%
40 72.131% 83.520%
50 71% 80.641%
60 70.40% 79.701%
70 69.98% 77%
80 69.44% 76.905%
90 68.5% 75.95%
100 70.3% 76.31%

The experimental results (shown in Table 2) show that, compared with the classification model trained directly on the original data, the collaborative training model based on semantic annotation improves the detection accuracy by nearly 9 percentage points. The scheme proposed in this paper is suitable for cases with fewer labeled and processed samples: as the number of samples increases, the training accuracy of the model decreases gradually. The trend is shown in Fig. 4, and the experimental data are recorded in Table 3.

6 Conclusion

This paper proposes a semantic annotation model for sea-related case texts based on collaborative training, which improves the classification accuracy by nearly 10 percentage points and is of great practical significance for processing texts with a large amount of data. Firstly, the tags produced by semantic tagging under collaborative training are standardized, and then the mapping relationships between tags are analyzed using their co-occurrence relationships. In this way, the mapping relationships of the related ontologies (concepts, attributes, instances, etc.) can be annotated accurately, that is, the keywords with mapping relationships such as the time, place, law, plaintiff, defendant, illegal act, judging institution, punishment, etc. in this paper. Through statistical analysis of the co-occurrence relationships between tags, the semantic correlations between them are revealed. Through a large number of repeated training runs and experiments, it is concluded that the cases with the highest similarity confidence among the samples can be judged to be of the same type, which is basically consistent with manual judgment, thus achieving the purpose of the experiments.
Deficiencies. There are still some deficiencies in the project, such as the need to judge
similar cases manually in the process of accuracy verification. Therefore, we will take
the content of natural language and machine learning as the direction of improvement

in the future. It is proposed to add a system to maintain a database related to sea-related


cases. The database contains all kinds of tags after semantic annotation of case texts
and all relevant laws and regulations related to the basis of judgment of maritime cases.
Through natural language processing, the system can automatically match the key-
words retrieved with the database of maritime legal cases, and eventually rank them
from high to low according to similarity confidence. Finally, a series of similar mar-
itime cases can be generated quickly, which can alleviate the dependence on manual
work to a certain extent.

Acknowledgments. This work was supported by Innovation and Entrepreneurship Project for
College Students in Hubei Province under Grant S201910500040; Philosophical and Social
Sciences Research Project of Hubei Education Department under Grant 19Q054.

References
1. Blum, A., Mitchell, T., et al.: Combining labeled and unlabeled data with co-training. In:
Proceedings of the 11th Annual Conference on Computational and Learning Theory, pp. 92–
100. Springer, Berlin (1998)
2. Zhou, Z.H., et al.: Collaborative Training Style in Semi-supervised Learning. Machine
Learning and its Application, p. 275. Tsinghua University Press, Beijing (2007)
3. Blum, A., Mitchell, T.: Combining labeled and unlabeled data with co-training. In:
Conference on Computational Learning Theory, pp. 92–100 (1998)
4. Xu, M., Sun, F., Jiang, X., et al.: Multi-label learning with co-training based on semi-
supervised regression. In: 2014 IEEE International Conference on Security, Pattern Analysis,
and Cybernetics, pp. 175–180 (2014)
5. Wang, W., Lee, X.D., Hu, A.L.: Co-training based semi-supervised web spam detection. In:
The 10th IEEE International Conference on Fuzzy Systems and Knowledge Discovery,
pp. 789–793 (2013)
6. Iosifidis, V., Ntoutsi, E., et al.: Large scale sentiment learning with limited labels. In: The
23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,
pp. 1823–1832 (2017)
7. Zhang, J.H., et al.: Research on Text Entity Relationship Extraction Based on Semi-
supervised Learning. Harbin Engineering University, Harbin (2017)
8. Liu, S.X., et al.: Research on Internet Named Entity Recognition Based on Conditional
Random Fields. Zhongyuan Institute of Technology, Zhengzhou (2016)
9. Cheng, Z.G., et al.: Research on Chinese Named Entity Recognition Based on Rules and
Conditional Random Fields. Central China Normal University, Wuhan (2015)
10. Wang, C.Y., et al.: Cluster analysis of labels. Mod. Libr. Inf. Technol. 5, 67–70 (2008)
Data Analytical Platform Deployment:
A Case Study from Automotive
Industry in Thailand

Chidchamaiporn Kanmai1 , Chartchai Doungsa-ard2 , Worachet Kanjanakuha3 ,


and Juggapong Natwichai1(B)
1
Department of Computer Engineering, Faculty of Engineering,
Chiang Mai University, Chiang Mai, Thailand
{chidchamaiporn ka, juggapong.n}@cmu.ac.th
2
College of Arts, Media and Technology, Chiang Mai University,
Chiang Mai, Thailand
chartchai.d@cmu.ac.th
3
ExxonMobil, Bangkok, Thailand
worachet@gmail.com

Abstract. This paper presents a case study on the deployment of a data analytical platform. The platform continuously takes datasets from a social media platform, Facebook, and provides insight information for sales personnel in the automotive industry in Thailand. The goal is to improve the performance of the sales team. Using information from the Facebook Graph API on the Facebook pages of an automotive industry dealer, the developed platform analyzes the data and stores the insights in the data storage system. When users search for related keywords or hashtags, the information can be returned in the form of graphs, images and text. Both system performance experimental results and a focus-group interview are presented to validate the proposed work.

1 Introduction

The automotive industry has been one of the most competitive industries in Thailand since its establishment in 1950. Car production can reach 200,000 units per month; half of the production is sold within the country, while the other half is exported to other countries [2]. Although the competition with regional competitors such as Indonesia and Vietnam is very high, and the emergence of electric vehicles could re-shape the industry, the automotive industry is still very important to Thailand's economy.
With such high stakes, car dealers have to carefully present their products or advertise them in a more attractive way. Obviously, marketing communication and strategy highly affect the direction of the Thai automotive

© Springer Nature Switzerland AG 2020


L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 409–414, 2020.
https://doi.org/10.1007/978-3-030-33509-0_37

industry. E-marketing through social network engagement is one of the most important marketing tools in almost all sectors, including the automotive industry. Typically, customers are interested in product information, product insights, or product reviews. According to the report in [1], customers in Thailand have enormously increased their interest in cars and bikes through social network media compared with the past. According to the survey of Facebook users in [3], the most active users are people 18 to 25 years old, while the second most active group is 26 to 34 years old. Another prominent group is the trendsetters, whose ages are between 13 and 17 years old. From the technological perspective, data from Facebook can be considered big data because of its size and structure [4].
In this paper, we present a case study on the deployment of a data analytical platform based on Facebook that provides insight information for sales personnel in the automotive industry in Thailand. The goal is to improve the performance of the sales team. The developed platform can analyze the data and store the insights in the data storage system. When users search for related keywords or hashtags, the information can be returned in the form of graphs, images and text. The performance of the platform is validated by experimental results as well as by a focus-group interview.
The organization of this paper is as follows. Section 2 presents the key components of the proposed platform, including its structure and data. The evaluation results are presented in Sect. 3. The paper is concluded in Sect. 4.

2 Proposed Platform

2.1 System Structure

The system structure is shown in Fig. 1. The system takes data from the Facebook Graph API and stores it in a Node.js-based platform. The underlying data storage engine is MongoDB. Inside the platform, there are two sub-components: the data analysis component and the RESTful service that responds to the client systems through HTTP requests. The data is transmitted in JSON format. The results provided to the clients can be graphs in various forms.
The JSON data for the system is shown in Fig. 2. In this work, there are two collections, i.e., the Post and Result collections. The Post collection stores the data from the Facebook Graph API, e.g., messages, pictures, comments, sharing information, or reactions to the posts. The Result collection stores summarized data from the posts.
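To make the two-collection design concrete, the following minimal sketch stores a fetched post and aggregates a keyword summary. The deployed platform is implemented in Node.js on MongoDB, so this Python/pymongo code, the connection string, and all field names are illustrative assumptions rather than the authors' actual schema.

# Illustrative sketch only: field names (message, reactions, shares, ...) are
# assumptions based on the description above, not the platform's real schema.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
db = client["analytics"]

def store_post(raw_post):
    # Store one post fetched from the Graph API into the Post collection.
    db.posts.insert_one({
        "post_id": raw_post["id"],
        "message": raw_post.get("message", ""),
        "picture": raw_post.get("picture"),
        "comments": raw_post.get("comments", []),
        "shares": raw_post.get("shares", 0),
        "reactions": raw_post.get("reactions", {}),
        "created_time": raw_post.get("created_time"),
    })

def summarize(keyword):
    # Aggregate posts matching a keyword into the Result collection.
    pipeline = [
        {"$match": {"message": {"$regex": keyword, "$options": "i"}}},
        {"$group": {"_id": keyword,
                    "post_count": {"$sum": 1},
                    "total_shares": {"$sum": "$shares"}}},
    ]
    for summary in db.posts.aggregate(pipeline):
        db.results.replace_one({"_id": summary["_id"]}, summary, upsert=True)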

Fig. 1. The system structure

Fig. 2. JSON data

2.2 System UI
The system frontend allows users to explore the data for searched keywords or hashtags related to the automotive industry. Examples of the system statistics are the summarization of the number of shares and posts, the summarization of the types of content and the most common keywords, and the contents by date, shown in Figs. 3, 4 and 5, respectively. Such information can be utilized by the users.

Fig. 3. An example on number of shares or posts

Fig. 4. An example of the top 4 most common keywords

Fig. 5. An example of the comment frequency by months

3 Evaluation

First, the system performance is evaluated by experimental results, shown in Fig. 6. The x-axis presents the number of posts processed in a batch. This batch process is used to prepare the stored data for the users; we typically run it at night or during the lunch break, before the users start their work. It takes the data from the Graph API, stores it in the JSON structure, and, at the same time, collects the statistical values, e.g., the number of shares and posts. The performance is presented as the execution time for the specified batch size. From the result, it can be seen that the larger the batch size is, the better the overall execution time that can be obtained.
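The following minimal sketch illustrates how such a batch experiment can be timed: posts are written to MongoDB in batches of increasing size, and the overall execution time is measured for each size. It is not the authors' code; fetch_posts stands in for the Graph API call, and the connection string is a placeholder.

# Illustrative timing sketch: larger batches mean fewer round trips to the
# database, which is one reason larger batch sizes finish sooner overall.
import time
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["analytics"]

def fetch_posts(n):
    # placeholder for the Facebook Graph API request
    return [{"post_id": i, "message": "post %d" % i, "shares": i % 5} for i in range(n)]

def run_batch(posts, batch_size):
    start = time.perf_counter()
    for i in range(0, len(posts), batch_size):
        db.posts.insert_many(posts[i:i + batch_size])  # one round trip per batch
    return time.perf_counter() - start

posts = fetch_posts(10000)
for batch_size in (100, 500, 1000, 5000):
    print(batch_size, run_batch(posts, batch_size))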

Fig. 6. Performance result

Secondly, we evaluate the system by a focus-group interview, whose outcomes are then translated into specified questions and answers on a Likert scale. The case is based on the largest automotive dealer in Thailand, where the sales division is the focus; the size of the team is 22 people with moderate experience (5–10 years of experience in sales). Before the evaluation, we explained the procedure for using the system and then let them use the system for their routine tasks, e.g., determining the sales leads, summarizing the data before making a call to a target, or communicating with the technical team. Our findings from the evaluation after their usage are as follows.
• The average score of the users on the benefit of the summarization is 4.5 out of 5, particularly among the less experienced sales staff.
• The average score of the users on the capability to assess the impact of a promotion on sales is 4.32 out of 5.
• The average score of the time-sensitive data display is 4.32 out of 5; e.g., the question of finding the highest-sales quarter can be answered effectively.
• The average score of the benefit for sales planning is 4.5 out of 5, particularly among the sales staff with 5–10 years of experience.
• The average score of the adequacy of the data display of the system is 4.36 out of 5.

4 Conclusion
In this paper, we present a case study on a data analytical platform based on automotive sales data. The system takes its input from the Facebook Graph API. The developed system analyzes the data and stores the insights in the data storage system. When the users focus on related keywords or hashtags, which could be the model or specification of a vehicle, the information is returned in the form of graphs, images, and text and is utilized by the sales team. The system performance experimental results, as well as the focus-group interview, show that our system can help the sales team work effectively. Improvements of the time-sensitive data display are planned and can be deployed in the next release.

References
1. Jiwcharoen, T.: Influence of social media advertising that affecting to brand aware-
ness consumer in Bangkok: case study, car business sector, Bangkok University
(2015)
2. TAIA: The thai automotive industry association (2017). http://www.taia.or.th/
Statistics/
3. Arpavate, W., Dejasvanong, S.C.C.: Communication behavior in facebook of stu-
dents at rajamangala university of technology Phra Nakhon. RMUTP Res. J. 7(2)
(2010)
4. Wipawin, N.: Network in a networked society. TLA Res. J. 8(2), 119–127 (2015)
The 10th International Workshop on
Streaming Media Delivery and
Management Systems (SMDMS-2019)
The Structured Way of Dealing with Heterogeneous Live Streaming Systems

Andrea Tomassilli1, Nicolas Huin2, and Frédéric Giroire1(B)

1 Université Côte d’Azur, CNRS, Inria, Sophia Antipolis, France
frederic.giroire@cnrs.fr
2 Huawei Paris Research Labs, Paris, France

Abstract. In peer-to-peer networks for video live streaming, peers can


share the forwarding load in two types of systems: unstructured and
structured. In unstructured overlays, the graph structure is not well-
defined, and a peer can obtain the stream from many sources. In struc-
tured overlays, the graph is organized as a tree rooted at the server
and parent-child relationships are established between peers. Unstruc-
tured overlays ensure robustness and a higher degree of resilience com-
pared to the structured ones. Indeed, they better manage the dynamics
of peer participation or churn. Nodes can join and leave the system at
any moment. However, they are less bandwidth efficient than structured
overlays. In this work, we propose new simple distributed repair pro-
tocols for video live streaming structured systems. We show, through
simulations and with real traces from Twitch, that structured systems
can be very efficient and robust to failures, even for high churn and when
peers have very heterogeneous upload bandwidth capabilities.

1 Introduction
Live streaming can be done either over a classic client-server architecture or a
distributed one. The high bandwidth that live streaming requires may limit the
number of clients that the source can serve. A small number of clients could be
sufficient to saturate the source resources.
In a distributed scenario (e.g., P2P), the bandwidth required can be spread
among the users and the bottleneck at the source can be reduced. So, peer-to-
peer systems are cheap to operate and scale well with respect to the centralized
ones. Because of this, the P2P technology is an appealing paradigm for providing
live streaming over the Internet.
In the P2P context, we can choose between an unstructured and a structured overlay network. In unstructured overlay networks, peers self-organize without a defined topology. Unlike the structured case, any peer can receive each piece of the video from a different peer. In structured overlay networks, peers are organized in a static structure with the source at the root of the diffusion tree. A node receives data from a parent node that can be a peer or the source of the
streaming. In this type of system, the content distribution is easier to manage than in unstructured ones. In unstructured systems, the diffusion tree is built opportunistically and, as the authors show in [6], this ensures efficiency and robustness to the dynamicity of peers. In structured systems, the departures and arrivals of users (churn) may break the diffusion tree. Our goal is to check whether these kinds of systems can also be efficient and robust to churn.
Contributions. In this work, we study a structured network for live video
streaming experiencing frequent node departures and arrivals in systems where
nodes have heterogeneous upload bandwidth.

– We propose, in Sect. 4, simple distributed repair protocols to rebuild the dif-


fusion tree when peers are leaving. Different protocols use different levels of
information.
– Using a custom made discrete-event simulator, we compare the protocols
using different metrics, i.e., delay, percentage of clients without video, number
and duration of interruptions. We validate the protocols using different peer
bandwidth distributions from literature.
– We study the efficiency of the protocols versus the level of information they
use. We show that a repair protocol can be very efficient, even when using
only a small amount of local information on its neighbors.
– Finally, we use real Twitch traces to compare our different heterogeneous
protocols in a real-life scenario of a streaming session in Sect. 5. We show
that our simple distributed repair protocols work very well in practice.

Due to lack of space, all the obtained results could not be included in the con-
ference version and can be found in [19].

2 Related Works

There is a large amount of work on video streaming systems (see [11,18] for sur-
veys), such as to improve video coding, e.g., adaptive streaming [16] or multiview
video [12], to reduce energy consumption [5], or to improve the way the overlay
network deals with churn using techniques from optimization [17], protocols, or
algorithmics. Our work lies in the last category.
There are two main categories of distributed systems for video live stream-
ing: unstructured and structured. [14] provides an overview of P2P based live
streaming services. Hybrid solutions are also possible [22]. In unstructured over-
lay networks, peers self organize themselves in an overlay network, that does
not have a defined topology. CoolStreaming/DONet [23] is an implementation
of this approach. In structured overlay networks, peers are organized in a static
structure with the source at the root of the tree. There are many techniques used
in P2P live streaming with single-source and structured topology. These tech-
niques may fall into two categories: single-tree and multiple-tree. In the single-
tree approach, each node is connected to a small number of nodes to which it
is responsible for providing the data stream. ZIGZAG [20] is an implementation
of this approach. In the multiple-tree approach, the main idea is to stripe the
content across a forest of multicast trees where each node is a leaf in every tree
except one. SplitStream [7] is an implementation of this approach.
In terms of reliability, unstructured systems are considered the best choice.
As shown in [6] this kind of system handles churn (peers leaving the system) in
a very efficient way. Only a few works [8,9] focus on tree maintenance in structured systems. In [9], the authors propose a simple distributed repair algorithm that allows the system to quickly recover and obtain a balanced tree after one
or multiple failures. In [8] the authors develop a simple repair and distributed
protocol based on a structured overlay network. By providing, through analysis
and simulations, estimations of different system metrics like bandwidth usage,
delay and number of interruptions of the streaming, they show that a structured
live-streaming system can be very efficient and resistant to churn. However, their
study assumes that all nodes have the same bandwidth, where the
bandwidth determines the maximum number of other nodes that can be served
simultaneously. We extend their work to the case of peers with heterogeneous
bandwidth.

3 Distributed Systems and Modeling

3.1 Modeling

We model a live streaming system as a tree of size n+1 where the root represents
a source streaming a live video. A summary of the variables used in this work is
given in Table 1. The n other nodes are clients wanting to watch the video. The
source is the only reliable node of the network; all other peers may be subject
to failure. A node v has a limited bandwidth d(v) used to serve its children.
A node is said to be overloaded when it has more than d(v) children. In this
case, it cannot serve all its children and some of them do not receive the video.
Note that the distance, in the logical tree, between a peer and the root gives the
reception delay of a piece of media. Hence our goal is to minimize the tree depth
while respecting degree constraints.
Each node applies the following algorithm with limited knowledge of the whole network; a minimal code sketch of these operations is given after the list.

– When a node is overloaded, it carries out a push operation. It selects two of


its children, and the first one is reattached to the second one, becoming a
grandchild.
– When a node leaves the system, one of its children is selected to replace it. The other children reattach to it. In this work, we only consider single failures, but multiple failures could be handled by considering the great-grandparent of a node or by reattaching to the root.
– When a new node arrives, it is attached to the root.
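A minimal sketch of the three operations above on an in-memory tree is given below; it is not the authors' simulator, and the choice of which child is pushed and where is deliberately left generic, since that choice is exactly what the protocols of Sect. 4 specify.

# Sketch (not the authors' code) of the push, leave, and arrive operations.
class Node:
    def __init__(self, name, bandwidth):
        self.name = name
        self.bandwidth = bandwidth      # d(v): how many children it can serve
        self.parent = None
        self.children = []

    def attach(self, child):
        child.parent = self
        self.children.append(child)

    def push(self):
        # resolve an overload: one child is reattached under another child;
        # which one goes where is what the protocols of Sect. 4 decide
        if len(self.children) <= self.bandwidth:
            return
        moved, new_parent = self.children[0], self.children[1]
        self.children.remove(moved)
        new_parent.attach(moved)

    def leave(self):
        # departure: one child replaces the leaving node, the others reattach to it
        # (assumes the leaving node has a parent and at least one child)
        replacement, *others = self.children
        self.parent.children.remove(self)
        self.parent.attach(replacement)
        for child in others:
            replacement.attach(child)

# a new node is simply attached to the root:
# root.attach(Node("new-peer", bandwidth=2))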

Table 1. Summary of the main variables and terminologies used in this work.

Variable | Signification | Default value
n | Number of nodes of the tree, root not included | 1022
d | Node bandwidth (or ideal node degree) | -
h | Height of the tree (root is at level 1) | -
μ | Repair rate (avg. operation time: 100 ms) | 1
λ | Individual churn rate (avg. time in the system: 10 min) | 1/6000
Λ | System churn rate (Λ = λn) | 1022/6000 ≈ 0.17

Terminology | Values
unit of time | 100 ms
systems with low churn | Λ ∈ [0, 0.4]
systems with high churn | Λ ∈ [0.4, 1]

Churn. We model the system churn with a Poisson process of rate Λ.
A node departure (also called churn event) occurs after an exponential time of
parameter Λ, i.e., in average after a time 1/Λ. We note the individual failure rate
λ = Λ/n. Authors in [10,21] carried out a measurement campaign of a large-
scale overlay for multimedia streaming, PPLive [1]. Among other statistics, the
authors report that the median time a user stays in such a system is around
10 min. In this study, we use this value as the default value.
Repair Rate. When a node has a push operation to carry out, it has first to
change its children list, and then to contact its two children implicated in the
operation so that they also change their neighborhoods (parent, grandparent or
child). This communication takes some amount of time, that can vary depending
on the node to contact and congestion inside the network. To take into account
this variation, we model the repair time as a random variable with an expo-
nential distribution of parameter μ. [13] reports that the communication time
in a streaming system is on average 79 ms. Thus, we believe that assuming an
average repair time of 100 ms is appropriate.
Default Values. In the following, for the ease of reading, we normalize the
repair rate to 1. We call unit of time the average repair time, 100 ms. The
normalized default churn rate, for an average stay of 10 min, λ is thus 1/6000
and the system churn rate is Λ = nλ ≈ 0.17. These default values are indicated
as typical examples and are reported in Table 1, but, in our experiments, we
present results for a range of values of Λ between 0 and 1. We talk of low churn
systems for values of Λ below 0.4, and of high churn systems for values above
0.4.
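The following sketch shows how churn and repair events can be drawn with these default rates in a discrete-event simulation; it only illustrates the exponential model and is not the custom simulator used in this work.

# Sketch of drawing churn and repair times with the default rates of Table 1.
import random

MU = 1.0                   # repair rate: one unit of time = 100 ms on average
LAMBDA = 1 / 6000          # individual churn rate: average stay of 10 min
N = 1022                   # number of peers
SYSTEM_CHURN = N * LAMBDA  # system churn rate Λ ≈ 0.17

def time_to_next_departure():
    # exponential time until the next peer leaves, anywhere in the system
    return random.expovariate(SYSTEM_CHURN)

def repair_time():
    # exponential duration of one push/repair operation (mean 1 unit = 100 ms)
    return random.expovariate(MU)

# sanity check: empirical means approach 1/Λ ≈ 5.9 units and 1/μ = 1 unit
samples = 10000
print(sum(time_to_next_departure() for _ in range(samples)) / samples)
print(sum(repair_time() for _ in range(samples)) / samples)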

3.2 Metrics

To evaluate the performance of the different protocols, we are interested in the following metrics.

Number of People Not Receiving the Video. Due to repair and churn
events, some people do not receive the video during small periods. We study
what fraction of the nodes do not receive the video and for how long.
Height of the Tree or Delay. The height of the diffusion tree gives the max-
imum delay between the source and a node.
Number of Interruptions and Interruption Duration. We monitor the
number of interruptions of the video diffusion to a node during the broadcast
lifetime, as well as the distribution of interruption durations.

4 Protocols for Heterogeneous Systems

We present three new repair protocols using different levels of knowledge.


Description of the Protocols. To obtain a tree with minimum height while
respecting the degree constraints, the following two conditions for optimality
must hold:

1. If there exists a node at level L, all previous levels must be complete;


2. For each pair of levels (L, L + 1) the minimum bandwidth between all the
nodes at level L must be greater than or equal to the maximum bandwidth
between all the nodes at level L + 1.

Thus, the distributed protocols that we propose try to maintain nodes with high
bandwidth on top of the diffusion tree. To this end, an overloaded node has to
carefully select which node is pushed under which node using its information on
the diffusion tree.
Local Bandwidth Protocol (LBP). In this protocol, each node knows
the bandwidth of each of its children. Moreover, a node keeps track of the number
of push operations done on each of its children. Note that this local information
is easy to maintain accurately.
When a node is overloaded, it pushes its child with the smallest bandwidth, as nodes with higher bandwidth should stay at the top of the diffusion tree. This child has to be pushed onto a node receiving the video. We consider all the nodes receiving the video and push onto them proportionally to their bandwidth (e.g., a node with a bandwidth of d = 4 should receive twice as many pushes as a node with a bandwidth of d = 2). In detail, the expected number of pushes to each child is proportional to its bandwidth. The parent then pushes into the child with the largest difference between the expected number of push operations and the push operations already done.
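The selection rule of LBP can be summarized by the following sketch, in which the data structures (children with a bandwidth and a per-child push counter, plus a predicate telling whether a child currently receives the video) are assumptions; it illustrates the rule described above and is not the authors' implementation.

# Sketch of LBP's selection rule (assumed data structures).
def lbp_select(parent, receives_video):
    # the child pushed away is the one with the smallest bandwidth
    pushed = min(parent.children, key=lambda c: c.bandwidth)
    served = [c for c in parent.children if receives_video(c) and c is not pushed]
    total_bw = sum(c.bandwidth for c in served)
    total_pushes = sum(c.pushes_received for c in served) + 1  # including this push
    def deficit(child):
        # expected pushes are proportional to bandwidth; push where the gap
        # between expected and already-received pushes is largest
        expected = total_pushes * child.bandwidth / total_bw if total_bw else 0.0
        return expected - child.pushes_received
    target = max(served, key=deficit)
    target.pushes_received += 1
    return pushed, target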
Bandwidth Distribution Protocol (BDP). In this protocol, nodes have
additional information. Each node knows the bandwidth of each of its children
and the bandwidth distribution of the subtree rooted in each of its children. Note
that the bandwidth distribution can be pulled up from the subtree. This may
be considered costly by the system designer. However, it is also possible for a
node to estimate this distribution by keeping the information of the bandwidth


of the nodes pushed into it.
This additional information allows us to estimate the optimal height of the
subtrees of each of the children. Thus, when a node is overloaded, it can push its
child with the smallest bandwidth into its child (receiving the video) with the
smallest estimated height.
Full Level Protocol (FLP). In this protocol, each node knows the band-
width of each of its children and the last full level of the subtree rooted in each
of its children.
This information allows knowing in which subtree nodes are missing. The
main idea is to push toward the first available slot to not increase the height
of the diffusion tree. When a node is overloaded, its child with the smallest
bandwidth is pushed into the child with the smallest last full level.
For all three protocols, a churn event is handled similarly. When a node leaves
the system, the children of the leaving node are adopted by their grandparent.
Discussion. In LBP, a node has no information about the underlying subtree.
Push operations are carried out according to the degree of the nodes that receive
the video. Since nodes may leave the system at any moment, it can thus happen
that a node is pushed into the worst subtree.
In BDP, a node knows the bandwidth distribution of each subtree rooted in
each one of its children. A node is pushed according to the estimated height of
the subtree rooted in its children. In the estimation, a node assumes that all the levels, except the last one, are complete. Hence, a node may not be pushed onto the best subtree.
In FLP, a node knows the last full level of the subtree rooted in each one of
its children. A node is pushed under the node with the smallest value. In this
way, we are sure that the node is pushed toward the best possible position. This
may not be enough. In fact, due to node arrivals and departures, the two conditions for optimality may be broken.
Thus, in none of the protocols are the two conditions for optimality always true. However, the transmission delay is not the only factor that determines
the QoS for users. Other factors like time to attach a new node, number and
duration of interruptions are as important as the transmission delay. So, we
decided to keep the protocols as simple as possible taking into consideration all
metrics.
Handling Free Riders. Some nodes in the system can have no upload band-
width and thus cannot distribute the video to anyone. They are called free riders.
They may pose difficulties and call for special treatment. An obvious observa-
tion is that protocols should not push nodes under a free rider; all our protocols
prevent this. A more problematic situation arises when deadlocks appear, in which some nodes do not receive the video and all their siblings are free riders. Push operations cannot solve this situation. In this case, we decided to ask
the concerned free riders to rejoin the system. This is a solution we wanted to
avoid as the goal of a structured protocol is to maintain as much as possible the
Fig. 1. Parameters obtained from the model for the considered streaming session.
Fig. 2. Average metrics for Distribution 1 after 250 simulations.

parent-children relations of the diffusion tree. However, we considered that this


is a small cost that free riders can pay, as they do not contribute to the system.
Evaluation of the Protocols. We extensively studied the protocols and com-
pared their performances for 4 bandwidth distributions of live video streaming
systems that we found in the literature. The study can be found in [19]. We only
present the results using Twitch traces in the next section.

5 Results with Twitch Traces


To simulate the protocols in real scenarios, we decided to use Twitch [3] as a use case. Twitch is a live streaming video platform that mainly focuses on video gaming and e-sport events. It first appeared in 2011, and its popularity grew very fast. Today, with 1.5 million broadcasters and 100 million visitors per month, Twitch represents the 4th largest source of peak Internet traffic in the US [15]. To understand the behavior of the viewers of a stream, we monitored the 100 most famous streamers (in terms of viewers and followers [4]).
We gathered the number of viewers at each moment of their streaming sessions. We noticed that most of the streaming sessions can be divided into 3 phases: (1) the start of the stream, where the number of viewers increases at an extremely high rate, (2) the central part of the stream, where the number of viewers increases at a slower rate than at the beginning, and (3) the end of the stream, where the number of viewers decreases (see Fig. 3 (Left)).
We defined a model to represent streaming sessions that follow the 3-phase pattern. This model allows us to abstract from the real data and to repeat the simulations several times to estimate the quantitative behavior of the protocols more easily. Using the fact that, on average, a user spends 106 min on twitch.tv [2], we calculate, for each phase, the arrival rate and the individual churn rate, modeled as Poisson processes. Figure 1 shows the rates given to the simulator for the considered streaming session. To calculate the leaving rate of the 3rd phase, we assumed that the arrival rate of Phase 3 is the same as the one of Phase 2. Figure 3 compares the data obtained from monitoring the user with the data obtained from the model.
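The following sketch illustrates a 3-phase session generator in the spirit of this model. The per-phase durations, arrival rates, and the end-phase departure rate are placeholders, not the values actually given to the simulator; only the average stay of 106 min is taken from [2].

# Sketch of a 3-phase session generator (illustration only; rates are placeholders).
import random

PHASES = [
    # (name, duration in minutes, arrivals per minute, departure rate per viewer per minute)
    ("start",   15, 60.0, 1 / 106),
    ("central", 90, 10.0, 1 / 106),
    ("end",     30, 10.0, 1 / 20),   # same arrival rate as Phase 2, higher departures
]

def poisson(rate):
    # number of events in one minute, drawn from exponential inter-event times
    if rate <= 0:
        return 0
    count, t = 0, random.expovariate(rate)
    while t < 1.0:
        count += 1
        t += random.expovariate(rate)
    return count

def simulate_session():
    viewers, minute, trace = 0, 0, []
    for _name, duration, arrival_rate, departure_rate in PHASES:
        for _ in range(duration):
            viewers += poisson(arrival_rate)
            viewers = max(viewers - poisson(departure_rate * viewers), 0)
            minute += 1
            trace.append((minute, viewers))
    return trace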
We simulated our 3 protocols using the data generated from the model for
our metrics of interest and according to the four distributions of bandwidth
presented in the previous section. Figure 2 summarizes the average results of the
3 protocols after 250 simulations using Distribution 1. Results for the other
distributions are similar. They are omitted due to lack of space, but they can be
found in [19].

Fig. 3. Number of viewers as a function of the time for a streaming session of the user
dizzykitten.

Height of the Diffusion Tree and Delay. We see that all protocols achieve a
very small height of the diffusion tree: around 5 or 6 for the average. Recall that
we have around 1500 users at the maximum of the stream. The protocols are thus
very efficient. The evolution of the diffusion tree height is given as an example
in Fig. 4. We see the increase of height when the users connect to the stream till
a maximum height of 6 for FLP and LBP, and of 7 for BDP. FLP gives the
tree with the smallest height for all the distributions. The results of LBP and
BDP are different from the simulation case of the previous section. Since the individual churn rate is very small during the experiment (∼1.5 × 10−5), pushing according to the children's bandwidths (local information) proves to be a good strategy, leading to a better height than BDP for 2 of the 4 distributions. In particular, LBP behaves better than BDP in the case of very distant bandwidth values (Distributions 1 and 3) and worse when the bandwidth values are close to each other (Distribution 4).
Percentage of People Without the Video During Time. In this case, the
3 protocols behave similarly, as in the simulation case. They are very efficient:
on average, only 0.2% of peers are unable to watch the video. Recall that we
count the users arriving and waiting to be connected at the right place of the
tree.
Number of Interruptions During the Diffusion. The protocols have sim-
ilar behavior in most cases. For Distributions 1, 2 and 3, the number of interruptions ranges from 12 to 21, and for Distribution 4, it ranges from 5 to 6. This means that, in the worst case, a peer staying for the whole duration of the stream experiences an interruption every 10 min. In all cases, the duration of these interruptions is very small. Considering all protocols and all the distributions, we see that a node is never interrupted for more than 0.02% of the
time. A peer remaining during the whole stream session is thus interrupted for
less than 3 seconds. A buffer of few seconds (e.g., 10 s) for the video makes these
interruptions imperceptible to the end-users. For a video rate of 480 kbps, it
corresponds to a buffer size of only 40MB.

Fig. 4. Height of the diffusion tree during time for an example of Twitch session
dizzykitten.

6 Conclusion
In this study we examined the problem of delivering live video streaming in a
P2P overlay network using a structured overlay. We have proposed 3 distributed
protocols to repair the diffusion tree of the overlay, when there is churn. The pro-
tocols use different amounts of information. Using simulations and experiments
with real traces from Twitch, we have shown that our protocols behave well with
respect to fundamental QoS metrics, even for very heterogeneous bandwidth dis-
tributions. Our main result is that, with very simple distributed repair protocols
using only local information, structured overlay networks can be very efficient
and resistant to churn.

References
1. PPLive. http://www.pplive.com/
2. Twitch Blog. https://blog.twitch.tv/twitch-hits-one-million-monthly-active-
broadcasters-21dd72942b32
3. Twitch. http://www.twitch.com/
4. Twitch Statistics. http://socialblade.com/twitch/
5. Bacco, M., Catena, M., De Cola, T., Gotta, A., Tonellotto, N.: Performance anal-
ysis of WebRTC-based video streaming over power constrained platforms. In: 2018
IEEE Global Communications Conference (GLOBECOM), pp. 1–7. IEEE (2018)
6. Bonald, T., Massoulié, L., Mathieu, F., Perino, D., Twigg, A.: Epidemic live
streaming: optimal performance trade-offs. In: ACM SIGMETRICS Performance
Evaluation Review, vol. 36, pp. 325–336. ACM (2008)
7. Castro, M., Druschel, P., Kermarrec, A.M., Nandi, A., Rowstron, A., Singh, A.:
SplitStream: high-bandwidth multicast in cooperative environments. In: ACM
SIGOPS Operating Systems Review, vol. 37, pp. 298–313. ACM (2003)
8. Giroire, F., Huin, N.: Study of repair protocols for live video streaming distributed
systems. In: IEEE GLOBECOM (2015)
9. Giroire, F., Modrzejewski, R., Nisse, N., Pérennes, S.: Maintaining balanced trees
for structured distributed streaming systems. In: International Colloquium on
Structural Information and Communication Complexity, pp. 177–188. Springer
(2013)
10. Hei, X., Liang, C., Liang, J., et al.: Insights into PPLive: a measurement study of
a large-scale P2P IPTV system. In: International World Wide Web Conference on
IPTV Workshop (2006)
11. Hoque, M.A., Siekkinen, M., Nurminen, J.K.: Energy efficient multimedia stream-
ing to mobile devices–a survey. IEEE Commun. Surveys. Tutorials 16(1), 579–597
(2014)
12. Kito, T., Fujihashi, T., Hirota, Y., Watanabe, T.: Users’ demand-based segment
scheduling for progressive multi-view video transmission. In: IEEE GLOBECOM
(2018)
13. Li, B., Qu, Y., Keung, Y., Xie, S., Lin, C., Liu, J., Zhang, X.: Inside the new
coolstreaming: principles, measurements and performance implications. In: 27th
IEEE International Conference on Computer Communications (2008)
14. Li, B., Wang, Z., Liu, J., Zhu, W.: Two decades of internet video streaming: a
retrospective view. ACM Trans. Multimedia Comput. Commun. Appl. (2013)
15. MacMillan, D., Bensinger, G.: Amazon to buy video site twitch for $970 million
(2014). http://www.wsj.com/articles/amazon-to-buy-video-site-twitch-for-more-
than-1-billion-1408988885
16. Nihei, K., Yoshida, H., Kai, N., Satoda, K., Chono, K.: Adaptive bitrate control
of scalable video for live video streaming on best-effort network. In: 2018 IEEE
Global Communications Conference (GLOBECOM), pp. 1–7. IEEE (2018)
17. Park, J., Hwang, J.N., Wei, H.Y.: Cross-layer optimization for VR video multicast
systems. In: 2018 IEEE Global Communications Conference (GLOBECOM) (2018)
18. Seufert, M., Egger, S., Slanina, M., Zinner, T., Hoßfeld, T., Tran-Gia, P.: A sur-
vey on quality of experience of HTTP adaptive streaming. IEEE Commun. Surv.
Tutorials 17(1), 469–492 (2015)
19. Tomassilli, A., Huin, N., Giroire, F.: The structured way of dealing with heteroge-
neous live streaming systems. Technical report, Inria (2017)
20. Tran, D., Hua, K., Do, T.: Zigzag: an efficient peer-to-peer scheme for media
streaming. In: IEEE INFOCOM (2003)
21. Vu, L., Gupta, I., Liang, J., Nahrstedt, K.: Measurement of a large-scale overlay
for multimedia streaming. In: ACM HPDC (2007)
22. Wang, F., Xiong, Y., Liu, J.: Mtreebone: a hybrid tree/mesh overlay for
application-layer live video multicast. In: IEEE ICDCS (2007)
23. Zhang, X., Liu, J., Li, B., Yum, T.: CoolStreaming/DONet: a data-driven overlay
network for peer-to-peer live media streaming. In: IEEE INFOCOM (2005)
A Rule Design for Trust-Oriented Internet Live Video Distribution Systems

Satoru Matsumoto1(&), Tomoki Yoshihisa1, Tomoya Kawakami2, and Yuuichi Teranishi1,3

1 Cybermediacenter, Osaka University, Osaka, Japan
smatsumoto@cmc.osaka-u.ac.jp
2 Graduate School of Information Science, Nara Institute of Science and Technology, Nara, Japan
3 National Institute of Information and Communications Technology, Tokyo, Japan

Abstract. In recent years, Internet live video distribution has become popular, and Internet distributors such as YouTubers have attracted attention. In Internet live distribution, the distributors often record themselves with a smartphone while moving and deliver the video via the Internet. Regarding the trust between the distributor and the viewers, there is a social problem in that viewers may threaten or attack the distributor when there is no trust; securing trust in Internet live distribution allows the distributor to transmit information securely and safely. However, no Internet live video distribution system has considered trust so far, and distributors hide their faces and distribute so that their surroundings are not displayed. We propose a trust-oriented Internet live distribution system based on video processing. In particular, in this research, we design rules for the event-driven processing of the relationship between distributor information, the trust environment, and the video content in trust-oriented live video distributions.

1 Introduction

In recent years, there have been cases in which a viewer identifies the distributor by name from the face captured in an Internet live distribution and performs acts of intimidation, or a viewer who takes a remark of the distributor as provocation identifies the location from the surrounding situation and attacks; it is therefore becoming a social problem that the distributor is at risk. These problems are due to the fact that an appropriate trust relationship (trust) has not been secured between the distributor and the viewer.
Therefore, in this research, we propose a trust-oriented Internet live distribution system that appropriately manages (abandons, builds, and maintains) trusts according to the policy of the distributor. By solving the problem of what kind of Internet live distribution system can properly manage trust, the distributor can transmit information by Internet live distribution with trust. This will lead to the promotion of a rich information society through distribution.


Trust-oriented Internet live distribution changes the trust management method according to the presence or absence of trust in the delivery environment. Video processing is executed autonomously in accordance with the video processing policy described in advance by the distributor, and the trust is appropriately managed. If the trust can be properly managed, effects can easily be added for the viewer, the video can be played back comfortably, and, if services that extend conventional Internet live broadcasting can be provided, trust-maintained next-generation Internet live broadcast services will be realized.

2 Related Research

In [1], the importance of trust is discussed. In addition, various Internet live distribution systems are being researched and developed both domestically and abroad and are presented at prominent international conferences ([2–4]). However, these systems cannot properly manage trust according to the policy of the distributor. The authors have built an Internet live distribution system using a P2P communication environment [5]. We propose a rule design for an Internet live distribution system that can manage trust properly by making use of the results obtained in these studies.

3 System Design

The following is a proposal for the design of rules in an Internet live distribution system to properly manage trust.

3.1 Problems with Conventional Systems


Trust in Internet live distribution has only recently started to be emphasized; it is difficult to manage properly, and the construction of a trust-oriented Internet live distribution system has been considered extremely difficult ([2–4]). However, we adopt the three approaches proposed by the authors, namely "distributor information hiding", "distributor information exposure", and "distributor information localization", and build an Internet live distribution system (a trust-oriented Internet live distribution system) that solves this problem.
In the distribution system proposed by the authors, in "distributor information hiding", an approach is adopted in which annotations about the distributor, situation telops, etc. are concealed by blurring the image and adding video effects. In "distributor information exposure", an approach is adopted in which credibility is added to the video information by adding telops and annotations about the distributor, the state, and the situation, as shown in Fig. 1. In "distributor information localization", an approach is adopted in which the distributor information and the surrounding conditions are disclosed only to specific viewers.

Fig. 1. Information related to trust.

3.2 Design Policy


In this research, an environment without trust is referred to as an un-trusty environment and an environment with trust as a trusty environment. One of the trust management policies in the un-trusty environment is to abandon trust construction and hide the distributor information as much as possible. In this research, we consider a method to hide the distributor information by automatically applying image processing to the distributed video.
In addition, another trust management policy in the un-trusty environment is a distributor policy that exposes distributor information in order to build trust. In this study, we consider a method to expose distributor information by periodically displaying telops on the distributed video. As a trust management policy in a trusty environment, there is a policy of restricting the disclosure of distributor information to a specific audience in order to maintain the trust (localization).

3.3 Trust Management Design


Under all of the environments shown in Table 1, video processing such as blurring, teloping, and the addition of annotations for concealment in the distributed video is required. By applying the distributed processing method developed by the authors [5], it is possible to distribute over the Internet with a processing time that does not decrease the video playback rate. This research differs in that it is necessary to perform object recognition and determine the character information to be displayed; therefore, a method of designating in advance combinations of template images and the character information to be displayed, and then displaying them, is used.

Table 1. Relationship between trust environment and proposed policy.


Environmental Un trusty environment Trusty
Policy environment
Video content The situation where the Need distributor Situations that
(status) disadvantage to the information and delivery should limit
distributor status description viewers
Trusty status Trust abandonment Maintaining a trust Maintaining a
trust
Policy type Distributor information Distributor information Distributor
hiding exposure information
locality limit
Affected video Distributor, background Telop, annotation ON Telop,
object blur, telop, annotation annotation ON
OFF

In this research, sets of an event, a condition, and an action are described, and ECA (Event, Condition, Action) rules that perform event-driven processing are used. By representing the relationships in Table 1 as ECA rules, the following event processing can be designed and managed. The video content is the event; the trust status and environment and the index to the template image are the conditions. In addition, connecting the rules to the image processing actions for each video object, based on the chosen approach, makes it possible to maintain or abandon the trust environment.

3.4 Ruleset Design


Based on the relationship between the policies in Table 1 and the environment described in Sect. 3.3, an example of a video processing rule set is introduced in Table 2.
The events shown in Table 2 are assumed to be signals originating from the computing environment. The condition indicates the trusty environment, the preset policy, and the trusty status selected by the distributor. The action describes the process of changing the state of the receiver's video for the condition. Here, rules for the processing that blurs the image of the distributor are described. When a video or still image is sent from the camera, the "Get picture frame" signal is generated; the examples also include the case in which the signal of the geofencing process is received when the camera enters an area that forbids exposure of the distributor.
A reason to use an ECA rule set is flexibility: for example, when the distributor wants to trust only persons participating in a certain community who are authenticated by multi-factor authentication with non-spoofable biometrics, and wants to distribute everything in an exposed state, it is possible to flexibly rewrite the rule set of Table 2 using a web interface or the like.
The trust relationship between the distributor and the receiver, which has been exemplified above, can be set by various methods and concepts, but the discussion of the methodology and its validity is omitted due to space limitations.

Table 2. Example of ECA rule set at autonomous video processing.


Event (input signal) | Condition: environment | Condition: policy type | Condition: trusty status | Action: video processing
Get picture frame | Un-trusty environment | H | Trust abandonment | Blur face
Get picture frame | Un-trusty environment | H | Maintaining a trust | Blur face
Get picture frame | Trusty environment | H | Maintaining a trust | Blur face
Get picture frame | Un-trusty environment | E | Trust abandonment | Non-Blur
Get picture frame | Un-trusty environment | E | Maintaining a trust | Blur face
Get picture frame | Trusty environment | E | Maintaining a trust | Non-Blur
Get picture frame | Un-trusty environment | L | Trust abandonment | Non-Blur
Get picture frame | Un-trusty environment | L | Maintaining a trust | Blur face
Get picture frame | Trusty environment | L | Maintaining a trust | Non-Blur
Un-trusty Area in | Un-trusty environment | H | Trust abandonment | Blur face
Un-trusty Area in | Un-trusty environment | H | Maintaining a trust | Blur face
Un-trusty Area in | Trusty environment | H | Maintaining a trust | Blur face
Un-trusty Area in | Un-trusty environment | E | Trust abandonment | Blur face
Un-trusty Area in | Un-trusty environment | E | Maintaining a trust | Blur face
Un-trusty Area in | Trusty environment | E | Maintaining a trust | Blur face
Un-trusty Area in | Un-trusty environment | L | Trust abandonment | Non-Blur
Un-trusty Area in | Un-trusty environment | L | Maintaining a trust | Blur face
Un-trusty Area in | Trusty environment | L | Maintaining a trust | Non-Blur
Policy type: H = Hiding, E = Exposure, L = Locality limit

4 Design of an Experimental Environment

The experimental environment is designed to handle processing with JavaScript on Node.js, etc. Therefore, the ECA rules are described in advance in JSON format, held in the trust-oriented Internet live distribution system, and the trust is appropriately managed. The features used as conditions, such as an image or an area in the image, hold indexes to CSV files and follow the general data structures seen in services such as [6]. The reason is that AI services on cloud computing using TensorFlow etc. have started to be released; describing the features in CSV format and saving them in cloud storage broadens the possibility of using these services. The goal is to significantly reduce the computing resources required of distributors and viewers and to lead to the natural construction of a trusted environment.
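As an illustration of this design, the sketch below encodes a few rows of Table 2 as JSON and selects the video-processing action for an incoming event. The field names and the Python matching code are assumptions for illustration; the actual system holds the rules in the Node.js-based distribution system.

# Sketch: a few rows of Table 2 encoded as JSON plus a simple matcher.
import json

RULES = json.loads("""
[
  {"event": "Get picture frame", "environment": "un-trusty", "policy": "H",
   "trusty_status": "trust abandonment", "action": "blur_face"},
  {"event": "Get picture frame", "environment": "trusty", "policy": "E",
   "trusty_status": "maintaining a trust", "action": "non_blur"},
  {"event": "Un-trusty Area in", "environment": "un-trusty", "policy": "L",
   "trusty_status": "maintaining a trust", "action": "blur_face"}
]
""")

def select_action(event, environment, policy, trusty_status):
    # return the action of the first ECA rule whose event and conditions match
    for rule in RULES:
        if (rule["event"] == event and rule["environment"] == environment
                and rule["policy"] == policy
                and rule["trusty_status"] == trusty_status):
            return rule["action"]
    return None  # no matching rule: leave the frame unchanged

# select_action("Get picture frame", "un-trusty", "H", "trust abandonment") -> "blur_face"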
In addition, as the distributed processing system for video distribution, the cloud computing environment using Windows Azure, which the authors have developed, is renewed and used [5]. Furthermore, in this research, we consider a method to reduce the increase in the load on the delivery server due to encryption by performing multicast encrypted communication.

5 Summary

In this research, we conduct video processing autonomously according to the video processing policy described in advance by the Internet live distributor and consider the appropriate management of trust. For that purpose, we considered the design of the rules while excluding the problems and the distribution load on the distributor. We designed the rule set in accordance with our proposed policies and discussed how trust-oriented autonomous video processing streamlines Internet live distribution with trust.

Acknowledgments. This research was supported by the I-O DATA Foundation and by Grants-in-Aid for Scientific Research (C) numbered JP17K00146 and JP18K11316.

References
1. Hoffman, L., et al.: Trust beyond Security: an expanded trust model. Comm. ACM 49(7), 94–
101 (2006)
2. Zhang, X., et al.: Keyword-driven image captioning via context-dependent bilateral LSTM.
In: IEEE ICME, pp. 781–786 (2017)
3. Li, J., et al.: Attention transfer from web images for video recognition. In: ACM Multimedia,
pp. 1–9 (2017)
4. Naor, Z., et al.: Content placement for video streaming over cellular networks. In:
IEEE ICNC, pp. 133–137 (2015)
5. Matsumoto, S. et al.: A design of hierarchical ECA rules for distributed multi-viewpoint
internet live broadcasting systems. In: SMDMS 3PGCIC, pp. 340–347 (2018)
6. https://ai.google/
High-Performance Computing Environment with Cooperation Between Supercomputer and Cloud

Toshihiro Kotani and Yusuke Gotoh(B)

Graduate School of Natural Science and Technology, Okayama University, Okayama, Japan
gotoh@cs.okayama-u.ac.jp

Abstract. Due to the recent popularization of machine learning, deep reinforcement learning methods such as AlphaGo, which have advanced to analyze large-scale data, are attracting great attention. In deep reinforcement learning, users evaluate many functions in large-scale computer environments, including supercomputer and cloud systems. Cloud services can provide computer resources based on the scale of the computer environment desired by users. On the other hand, in a conventional large-scale computer environment that consists only of CPUs or GPUs, the processing time greatly increases with the scale of the calculation.
In this paper, we propose a high-performance computing environment for deep reinforcement learning that links supercomputer and cloud systems. Our proposed system can construct a high-performance computing environment matched to the scale of the computing process through the cooperation of supercomputer and cloud systems with a short physical distance and a short network distance. In our evaluation of deep reinforcement learning using our proposed system, we confirmed that computer resources can be used effectively by allocating suitable processing to the supercomputer and the cloud according to the usage of the CPU, the GPU, and the memory.

1 Introduction
Due to the recent popularization of machine learning, deep reinforcement learning methods such as AlphaGo [1], which have advanced to analyze large-scale data, are attracting great attention. In deep reinforcement learning, users evaluate many functions in large-scale computer environments such as supercomputer systems and cloud systems. A supercomputer such as the "K computer" [2], operated by the RIKEN Center for Computational Science (R-CCS), is a computer system that mainly uses CPUs. Supercomputers can be used in simulation evaluations to predict the future of large-scale natural environments, such as climatic variations and tsunamis [3]. On the other hand, a cloud such as Amazon Web Services (AWS) [4] can be used in simulation evaluations using many GPUs along with CPU-based virtual servers.

Fig. 1. Hokkaido University Information NEtwork System (HINES)

Cloud services can provide computer resources according to the scale of the computer environment desired by users. On the other hand, in a conventional large-scale computer environment that consists only of CPUs or GPUs, the processing time greatly increases with the scale of the calculation. For example, distributed deep reinforcement learning, which is one kind of deep reinforcement learning, performs simulation evaluations using a large number of CPUs and a large amount of memory. In addition, GPUs are required for learning based on the results of the simulation evaluations. In this case, since the processing time must be reduced using a computer environment that combines CPUs and GPUs, we need to construct such a system.
In this paper, we propose a high-performance computing environment for deep reinforcement learning that links supercomputer and cloud systems. Our proposed system can construct a high-performance computing environment matched to the scale of the computing process through the cooperation of supercomputer and cloud systems with short physical and network distances.
The remainder of the paper is organized as follows. In Sect. 2, we explain the
configuration of the Hokkaido University Information NEtwork System (HINES).
Machine learning and distributed deep reinforcement learning are introduced
in Sects. 3 and 4. In Sect. 5, we explain the implementations of the functions
that link the supercomputer and the cloud. The performance of our proposed
environment for deep reinforcement learning is evaluated in Sect. 6. Finally, we
conclude our paper in Sect. 7.

2 Hokkaido University Information NEtwork System


2.1 Outline
Figure 1 shows the configuration of the Hokkaido University Information NEtwork System (HINES) at the Hokkaido University Information Initiative Center.

Fig. 2. Supercomputer system

Fig. 3. Intercloud system

HINES consists of a supercomputer system and an intercloud system. In addition, it achieves an advanced system environment, including a wide-area distributed cloud system, by connecting several servers installed in various areas of Japan with the Science Information NETwork 5 (SINET5) [6].

2.2 Supercomputer System


Figure 2 shows the configuration of the supercomputer system operated by the
Hokkaido University Information Initiative Center. It consists of a storage system
and two types of computing systems: Grand Chariot and Polaire. These systems
are connected by a network called Omni-Path.

2.3 Intercloud System


Figure 3 shows the configuration of the intercloud system operated by the
Hokkaido University Information Initiative Center. This intercloud system operates several servers at the University of Tokyo, Osaka University, and Kyushu University, as well as at this center, as a wide-area distributed cloud system.

3 Machine Learning
3.1 Outline
Machine learning uses several methods that enable the identification of images and the prediction of correct results through algorithms. For example, when identifying an image, machine learning takes an image of handwritten numbers as input and outputs the correct numbers through the algorithm. In machine learning, we can improve the accuracy by using an appropriate method for the learning target.

3.2 Methods for Machine Learning


We explain three types of methods for machine learning: (1) supervised learning,
(2) unsupervised learning, and (3) reinforcement learning.
Supervised learning uses labeled data. In supervised learning, we can efficiently classify data using the label information included in the training data. On the other hand, we need to make a large amount of labeled data by manually adding the label information in order to improve the classification accuracy. Therefore, the cost of preparing the data is large.
Since unsupervised learning does not use labeled data, we can classify the data without making a large amount of labeled training data. On the other hand, since we do not use labeled data, the classification accuracy of unsupervised learning is lower than that of supervised learning.
Since reinforcement learning does not use labeled data, the system learns actions that maximize an output called the reward by interacting with the learning environment. However, we need to set up the environment for learning the actions.
We explain CartPole, a learning environment for measuring the ability of reinforcement learning provided by Gym [7], as an example. CartPole is a program to control an inverted pendulum. In CartPole, we get a reward by moving the cart and keeping the pole mounted on it from falling down. In reinforcement learning, we need to improve the learning environment to obtain more rewards.
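As an illustration, the following minimal loop interacts with the CartPole environment using the classic Gym API; it takes random actions and only shows how the reward signal is obtained, without any learning.

# Minimal interaction with CartPole (classic Gym API, illustration only).
import gym

env = gym.make("CartPole-v1")
observation = env.reset()
total_reward = 0.0
for _ in range(200):
    action = env.action_space.sample()            # push the cart left or right at random
    observation, reward, done, info = env.step(action)
    total_reward += reward                        # +1 for each step the pole stays up
    if done:                                      # the pole fell or the cart left the track
        break
print("episode reward:", total_reward)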

3.3 Deep Learning


Deep learning constructs a neural network, modeled on the neural network of the brain, on a computer and learns the output, i.e., the identification or prediction result for an input, using a data set; the trained network can then identify other inputs. For example, deep learning is used for image recognition and clustering.
Deep reinforcement learning incorporates deep learning into reinforcement learning. In deep reinforcement learning, since the processing that maximizes the reward in reinforcement learning is performed with deep learning, the network learns the current state as input and an action that maximizes the reward as output.

4 Distributed Deep Reinforcement Learning


4.1 Outline
Distributed deep reinforcement learning introduces distributed processing into deep reinforcement learning. With it, we can quickly obtain a large amount of data for learning by running the learning environments in parallel.

Fig. 4. Ape-X

4.2 Ape-X

Distributed deep reinforcement learning has several methods. We explain Ape-X [8], which is one learning method that improves accuracy by reducing the learning time in distributed deep reinforcement learning.
The configuration of Ape-X is shown in Fig. 4. Its learning model is classified into three types of components: the Actor, the Learner, and the Replay Memory. The Actor generates a large amount of learning data and stores it in the Replay Memory. In addition, the Actors operate in parallel, each with a network and a learning environment, to determine their actions. The network in each Actor is updated by periodically copying the parameters from the Learner. Next, the Replay Memory stores the learning data generated by the Actors and sends it to the Learner. Finally, the Learner performs deep learning on the learning data received from the Replay Memory, updates the parameters, and provides a copy of the network based on the requests from the Actors.
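The data flow among these three components can be illustrated by the following structural sketch, in which the environment, the network, and the prioritized sampling of Ape-X are replaced by stubs; it is not the implementation used in this work.

# Structural sketch of the Ape-X data flow (stubs only, no real learning).
import random
from collections import deque

class ReplayMemory:
    def __init__(self, capacity=100000):
        self.buffer = deque(maxlen=capacity)
    def add(self, transitions):
        self.buffer.extend(transitions)
    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

class Learner:
    def __init__(self):
        self.parameters = {"step": 0}      # stand-in for network weights
    def update(self, batch):
        self.parameters["step"] += 1       # stand-in for a gradient update
    def copy_parameters(self):
        return dict(self.parameters)

class Actor:
    def __init__(self, learner):
        self.parameters = learner.copy_parameters()   # local copy of the network
    def generate(self, n=10):
        # stand-in for acting in the environment and collecting transitions
        return [("state", "action", random.random(), "next_state") for _ in range(n)]

memory, learner = ReplayMemory(), Learner()
actors = [Actor(learner) for _ in range(4)]
for step in range(100):
    for actor in actors:
        memory.add(actor.generate())                  # Actors feed the Replay Memory
    learner.update(memory.sample(32))                 # Learner trains on sampled data
    if step % 10 == 0:                                # Actors periodically refresh weights
        for actor in actors:
            actor.parameters = learner.copy_parameters()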

5 Implementation

In this section, we implement the functions that link the supercomputer and the
cloud.

5.1 Distributed Tensorflow

We use Distributed Tensorflow [9] to link the deep learning between the supercomputer and the cloud. Distributed Tensorflow creates clusters for parallel distributed processing using multiple TensorFlow servers. We allocate the processes by operating a parameter server and worker servers, and thereby achieve the cooperation of the supercomputer and the cloud.
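The following sketch shows the usual Distributed TensorFlow (TF 1.x) pattern with one parameter server and one worker; the host names and ports are placeholders, and this is not the exact program used in our evaluation.

# Sketch of a two-job Distributed TensorFlow (TF 1.x) cluster: the parameter
# server could run on the cloud and the worker on a supercomputer node.
import tensorflow as tf

cluster = tf.train.ClusterSpec({
    "ps":     ["cloud-host:2222"],     # parameter server (placeholder address)
    "worker": ["super-host:2223"],     # worker (placeholder address)
})

job_name, task_index = "worker", 0     # each process sets its own job/index
server = tf.train.Server(cluster, job_name=job_name, task_index=task_index)

if job_name == "ps":
    server.join()                      # the parameter server only serves variables
else:
    # variables are placed on the ps job, operations on this worker
    with tf.device(tf.train.replica_device_setter(cluster=cluster)):
        w = tf.Variable(0.0, name="weight")
        train_op = tf.assign_add(w, 1.0)
    with tf.Session(server.target) as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(train_op))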

5.2 MPI
The Message Passing Interface (MPI), which is a standard for parallel computing, is used in large-scale distributed processing environments such as supercomputers. The primary projects implementing the MPI library are MPICH and OpenMPI. On the supercomputer of the Hokkaido University Information Initiative Center, Japan, the Intel MPI Library (Intel MPI) [10] is installed. When we use MPI at this center, after setting the number of supercomputer nodes to be used, the number of MPI processes, and the number of threads for each process, we request the job management system to execute the job.
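As a minimal illustration of the MPI programming style, the following mpi4py example lets each process report its rank and reduces a value to rank 0; how it is submitted (nodes, processes, and threads) depends on the job management system of the center, so no job script is shown. It would typically be launched with something like mpiexec -n 4 python example.py.

# Minimal mpi4py example: ranks, size, and a reduction to rank 0.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()          # id of this MPI process
size = comm.Get_size()          # total number of MPI processes

# each process contributes a partial value; rank 0 gathers the sum
partial = float(rank)
total = comm.reduce(partial, op=MPI.SUM, root=0)
if rank == 0:
    print("processes:", size, "sum of ranks:", total)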

5.3 Horovod
Horovod [11] is a framework for the distributed processing of deep learning. In
the distributed processing of Distributed Tensorflow, we implement the system
based on parameter servers, worker servers, and the data for each one. Therefore,
both the cost of adapting conventional code to distributed processing and that
of learning is high. On the other hand, horovod can easily realize high-speed
distributed processing by reducing these costs.
In this paper, we use Horovod for distributed deep reinforcement learning by connecting a supercomputer and the cloud. Since Horovod automatically connects from the host server to the other servers by SSH based on configuration files, we do not need to check the connection status ourselves. In addition, Horovod can use MPI and supports the Keras needed to implement the Ape-X programs. In Distributed TensorFlow, on the other hand, since the supercomputer and the cloud are linked, we need to execute the program on both the parameter server and the worker server. In the cloud, we can execute programs at any time because the resources are occupied, whereas in the supercomputer the job is put into a standby state depending on the occupation situation, so the timing of the job execution is uncertain. Therefore, we need to check the program's execution state in the supercomputer to link the supercomputer and the cloud.

5.4 Network
Next we explain the networks that link the supercomputer and the cloud, as well as the problems in constructing them. When we use Distributed TensorFlow, each server that constructs the cluster is specified by its IP address and a port number. However, we cannot connect directly from the cloud to the supercomputer, or vice versa. When the cloud connects to the supercomputer, it reaches only a login node, which is a server that submits jobs to the supercomputer, and it is difficult to connect to a supercomputer composed of multiple nodes. On the other hand, the supercomputer cannot connect to the cloud even if we specify the IP address of the cloud.
In our proposed system, the supercomputer connects to the cloud through a login node. We achieved the cooperation between the supercomputer and the cloud on the network by setting up SSH port forwarding that mutually connects the supercomputer and the cloud.

Fig. 5. Processing time in Distributed TensorFlow

6 Evaluation
6.1 Evaluation Environment

In this section, we describe three types of performance evaluations. First, we evaluate the Distributed TensorFlow program that links the supercomputer and the cloud. Second, we evaluate the performance of the supercomputer and the cloud individually using the Horovod sample program. Third, we evaluate the performance using the Ape-X program.

6.2 Processing Time in Distributed TensorFlow

In Fig. 5, we show the processing time in Distributed TensorFlow. The horizontal axis is the number of epochs for learning one kind of data, and the vertical axis is the processing time. We compare the results of executing Distributed TensorFlow with the supercomputer and the cloud together and with the cloud only.
In Fig. 5, the processing time with the cloud only is shorter than with the supercomputer and the cloud together. For example, when the number of epochs is 1000, the processing time is 19.6 s with the cloud only and 481.2 s with the supercomputer and the cloud. In the cooperation between the supercomputer and the cloud, the processing time was lengthened by dividing the processing into a parameter server that updates the parameters and a worker server that learns on the same model using the data of multiple mini-batches. In the original Distributed TensorFlow, the data should be learned using more parameter servers and worker servers. However, in this evaluation, since we combined the supercomputer and the cloud with only one parameter server and one worker server, the processing performance was low and the processing time was lengthened.

Fig. 6. Performance for distributed process of cloud on Horovod

Fig. 7. Performance for distributed process of supercomputer on Horovod (total)

When we perform distributed deep reinforcement learning in the environment using the supercomputer and the cloud, the Actors operate in parallel and asynchronously, and the Learner performs its learning in the cloud, where the network distance is shorter. Therefore, the network delay during learning is smaller than in the processing time results of Fig. 5.

6.3 Performance for Distributed Process on Horovod

We show the performance of the distributed process of the cloud on Horovod in Fig. 6. The horizontal axis is the number of GPUs in the cloud, either one or two. The vertical axis is the image processing performance per second.
In Fig. 6, the image processing performance is 312.9 when the number of GPUs is one and 473.2 when the number of GPUs is two. The image processing performance improves as the number of GPUs increases. On the other hand, when the number of GPUs is two, the performance per GPU is 236.6, which is about 24.4% lower than in the case of one GPU. Therefore, the image processing performance per GPU decreases as the number of GPUs increases.
Next we show the performance of the distributed process of the supercomputer according to the number of CPUs on Horovod in Fig. 7. In addition, we show the performance of the distributed process of the supercomputer per CPU on Horovod in Fig. 8. The horizontal axis is the number of CPUs, with four settings: 8, 16, 32, and 64. The vertical axis is the number of images processed per second.
In Fig. 7, the image processing performance increases proportionally with the number of CPUs. For example, the image processing performance is 43.3 with 8 CPUs, 86.0 with 16, 170.1 with 32, and 339.3 with 64. Also, in Fig. 8, the performance per CPU is essentially constant, regardless of the number of CPUs.

Fig. 8. Performance for distributed process of supercomputer on Horovod (per unit)

Fig. 9. Performance for process speed of cloud on Ape-X

6.4 Performance for Distributed Process on Ape-X

We show the performance of the distributed process on Ape-X in Fig. 9. The horizontal axis is the number of episodes. The vertical axis is the number of images processed by the Actor per second. The number of Actors is twelve.
In Fig. 9, when the number of episodes ranges from 1 to 960, the processing speed is 63 steps per second using both the CPU and the GPU. When the number of episodes ranges from 1 to 970, the processing speed is 46 steps per second using only the CPU. At this time, the processing speed with only the CPU is 26.9% lower than when using both the CPU and the GPU. In both cases, as the

number of episodes approaches 1000, the number of processes decreases according to the number of Actors, and therefore the processing speed per second increases.
In Ape-X, the performance is degraded because we used the network learned by the Learner. However, since Ape-X performs the simulation of the reinforcement learning environment at high speed on the CPU, the performance of the Actor does not decrease as much as with Horovod. When a GPU is not used for the Actor, the performance improves because many CPUs can be used in the environment where the supercomputer and the cloud cooperate. Therefore, we can perform image processing efficiently by performing distributed deep reinforcement learning in cooperation with the supercomputer and the cloud.

7 Conclusion

In this paper, we proposed an evaluation system for deep reinforcement learning based on cooperation between a supercomputer and the cloud. Our proposed system can construct a high-performance computing environment matched to the scale of a computing process by making a supercomputer and a cloud with a short physical distance and a short network distance cooperate. In addition, we implemented the cooperation between the supercomputer and the cloud using Distributed TensorFlow, and distributed deep reinforcement learning using Horovod and Ape-X. In our evaluations, we confirmed that we can improve the performance of deep reinforcement learning by increasing the number of CPUs.
In the future, we will evaluate the performance of distributed deep reinforcement learning by cooperation between a supercomputer and the cloud using Horovod.

Acknowledgement. This work was supported by JSPS KAKENHI Grant Number


18K11265.

References
1. Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., Driessche, G.V.D.,
Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S.,
Grewe, D., Nham, J., Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M.,
Kavukcuoglu, K., Graepel, T., Hassabis, D.: Mastering the game of go with deep
neural networks and tree search. Nature 529, 484–489 (2016)
2. RIKEN Center for Computational Science (R-CCS): K computer. https://www.r-
ccs.riken.jp/en/k-computer/
3. Meteorological Research Institute (MRI): Supercomputer System. http://www.
mri-jma.go.jp/Facility/supercomputer en.html
4. Amazon Web Services (AWS): Amazon. https://aws.amazon.com/jp/
5. Hokkaido university Information NEtwork System (HINES): Hokkaido University
Information Initiative Center. https://www.hines.hokudai.ac.jp/
6. National Institute of Informatics (NII): Science Information NETwork 5. https://
www.sinet.ad.jp/en/top-en
7. Open-AI: Gym. https://gym.openai.com

8. Horgan, D., Quan, J., Budden, D., Barth-Maron, G., Hessel, M., Hasselt, H.V.,
Silver, D.: Distributed prioritized experience replay. In: Proceedings of the Inter-
national Conference on Learning Representations (ICLR 2018) (2018). https://
arxiv.org/pdf/1803.00933.pdf
9. Google: Distributed TensorFlow. https://www.tensorflow.org/deploy/distributed
10. Intel Corporation: INTEL MPI LIBRARY. https://software.intel.com/en-us/mpi-
library
11. Sergeev, A., Del Balso, M.: Horovod: fast and easy distributed deep learning in
TensorFlow, arXiv https://arxiv.org/abs/1802.05799 (2018)
Evaluation of a Distributed Sensor Data
Stream Collection Method Considering
Phase Differences

Tomoya Kawakami1(B) , Tomoki Yoshihisa2 , and Yuuichi Teranishi2,3


1
Nara Institute of Science and Technology, Ikoma, Nara, Japan
kawakami@is.naist.jp
2
Osaka University, Ibaraki, Osaka, Japan
3
National Institute of Information and Communications Technology,
Koganei, Tokyo, Japan

Abstract. We define continuous sensor data with different cycles as "sensor data streams" and have proposed methods to collect distributed sensor data streams. However, our previous paper provides simulation results only for the case where the distribution of collection cycles is uniform. Therefore, this paper provides additional simulation results for different distributions of collection cycles. The additional simulation results show that our proposed method can equalize the loads of the nodes even if the distribution of collection cycles is not uniform.

1 Introduction
In the Internet of Things (IoT), various devices (things), including sensors, generate data and publish them via the Internet. We define continuous sensor data with different cycles as a sensor data stream and have proposed methods to collect distributed sensor data streams as a topic-based pub/sub (TBPS) system [14]. In addition, we have proposed a collection system that considers phase differences to avoid concentration of the data collection at specific times caused by the combination of collection cycles [6]. These previous methods are based on skip graphs [1], one of the construction techniques for overlay networks [3,4,7–13].
In our skip graph-based method considering phase differences, the collection time is balanced within each collection cycle by the phase differences, and the probability of load concentration at a specific time or node is decreased. However, our previous paper provides simulation results only for the case where the distribution of collection cycles is uniform. Therefore, this paper provides additional simulation results for different distributions of collection cycles. The employed distributions are the normal (Gaussian) distribution and the exponential distribution.

2 Problems Addressed
2.1 Assumed Environment
The purpose of this study is to disperse the communication load in the collection of sensor streams that have different collection cycles. The source nodes have

sensors and acquire sensor data periodically. The source nodes and the collection node (sink node) of those sensor data construct a P2P network. The sink node searches for source nodes and requests sensor data streams with the desired collection cycles in the P2P network. Upon reception of the query from the sink node, a source node starts to deliver its sensor data stream via other nodes in the P2P network. The intermediate nodes relay the sensor data stream to the sink node based on their routing tables.

2.2 Input Setting

The source nodes are denoted as Ni (i = 1, · · · , n), and the sink node of the sensor data is denoted as S. In addition, the collection cycle of Ni is denoted as Ci.
In Fig. 1, each node indicates a source node or the sink node, and the branches indicate collection paths for the sensor data streams. Concretely, they indicate communication links in the application layer. The branches are indicated by dotted lines because a branch may not carry a sensor data stream, depending on the collection method. The sink node S is at the top and the four source nodes N1, · · · , N4 (n = 4) are at the bottom. The number near each source node indicates its collection cycle: C1 = 1, C2 = 2, C3 = 2, and C4 = 3. This corresponds, for example, to the case where a live camera acquires an image once every second, and N1 records the image once every second, N2 and N3 record the image once every two seconds, and N4 records the image once every three seconds. Table 1 shows the collection cycle of each source node and the sensor data to be received in the example of Fig. 1.

Fig. 1. An example of input setting

2.3 Definition of a Load

The communication load of the source nodes and the sink node is given as the total of the load due to the reception of the sensor data stream and the load due to its transmission. The communication load due to the reception is referred to as the reception load; the reception load of Ni is Ii and the reception load of

Table 1. An example of the sensor data collection

Time  N1 (Cycle = 1)  N2 (Cycle = 2)  N3 (Cycle = 2)  N4 (Cycle = 3)
0     ✓               ✓               ✓               ✓
1     ✓
2     ✓               ✓               ✓
3     ✓                                               ✓
4     ✓               ✓               ✓
5     ✓
6     ✓               ✓               ✓               ✓
7     ✓
···   ···             ···             ···             ···

S is I0. The communication load due to the transmission is referred to as the transmission load; the transmission load of Ni is Oi and the transmission load of S is O0.
In many cases, the reception load and the transmission load are proportional to the number of sensor data pieces per unit time of the sensor data stream to be sent and received. The number of pieces of sensor data per unit time of the sensor data stream delivered by Np to Nq (q ≠ p; p, q = 1, · · · , n) is R(p, q), and the number delivered by S to Nq is R(0, q).

3 Skip Graph-Based Collection System Considering Phase Differences
We have proposed a load distribution method for sensor data stream collection [6]. In this section, we describe our proposed method, which considers phase differences.

3.1 Skip Graphs


Our proposed method assumes an overlay network for skip graph-based TBPS such as that of Banno et al. [2]. Skip graphs are overlay networks in which skip lists are applied to the P2P model [1]. Figure 2 shows the structure of a skip graph. In Fig. 2, the squares show entries of the routing tables on the peers (nodes), and the number inside each square shows the key of the peer. The peers are sorted in ascending order of their keys, and bidirectional links are created among the peers. The numbers below the entries are called the "membership vector." The membership vector is an integral value assigned to each peer when the peer joins. Each peer creates links to other peers on multiple levels based on its membership vector. In skip graphs, when a single key and its assigned peer are searched for, queries are forwarded over the higher-level links to other peers. This is because the higher-level links can reach the searched key with fewer hops than the lower-level links.

[Figure: a skip graph of six peers with keys 13, 21, 33, 48, 75, 99 and membership vectors 00, 10, 01, 00, 11, 11. Level 0 links all peers in key order; level 1 links {13, 33, 48} and {21, 75, 99}; level 2 links {13, 48}, {21}, {33}, and {75, 99}.]
Fig. 2. A structure of a skip graph

In the case of range queries, which specify the beginning and the end of the keys to be searched, the queries are forwarded to the peers whose keys are within the range, or less than the end of the range. The number of hops required for a key search is O(log n), where n denotes the number of peers. In addition, the average number of links on each peer is log n.
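The following simplified sketch (ours; it ignores joins, leaves, and routing) illustrates how the membership vectors induce the levels of a skip graph, using the keys and membership vectors of Fig. 2: at level i, peers whose membership vectors agree on the first i digits form a sorted list.

```python
# Keys and binary membership vectors taken from Fig. 2.
peers = {13: "00", 21: "10", 33: "01", 48: "00", 75: "11", 99: "11"}

def level_lists(peers, max_level=2):
    """For each level, return the sorted key lists sharing a membership-vector prefix."""
    levels = {}
    for lvl in range(max_level + 1):
        groups = {}
        for key, mv in peers.items():
            groups.setdefault(mv[:lvl], []).append(key)
        levels[lvl] = sorted(sorted(g) for g in groups.values())
    return levels

for lvl, lists in level_lists(peers).items():
    print(f"level {lvl}: {lists}")
# Level 0 yields one list containing every key; the higher levels yield the
# shorter lists that let a key search skip ahead in O(log n) hops on average.
```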

3.2 Phase Differences

We have previously proposed a large-scale data collection scheme for distributed TBPS [14]. In [14], we employ "Collective Store and Forwarding," which stores and merges multiple small messages into one large message along a multi-hop tree structure on the structured overlay for TBPS, taking the delivery time constraints into account. This makes it possible to reduce the overhead of network processing even when a large number of sensor data are published asynchronously. In addition, we have proposed a collection system considering phase differences [5]. In the proposed method, the phase difference of the source node Ni is denoted as di (0 ≤ di < Ci). In this case, the collection times are Ci·p + di (p = 0, 1, 2, ...). Table 2 shows the times to collect data in the case of Fig. 1, where the collection cycle of each source node is 1, 2, or 3. By considering phase differences as in Table 2, the collection time is balanced within each collection cycle, and the probability of load concentration at a specific time or node is decreased. Each node sends its sensor data at the times determined by its collection cycle and phase difference, and the other nodes relay the sensor data to the sink node. In this paper, we call considering phase differences "phase shifting (PS)." Figures 3 and 4 show examples of the data forwarding paths on skip graphs without phase shifting (PS) and with PS, respectively.
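The collection times Ci·p + di and the resulting load balancing can be illustrated with the following small sketch (ours), which uses the cycle/phase-difference pairs listed in Table 2:

```python
def send_times(cycle, phase, horizon=12):
    """Times at which a node with the given cycle and phase difference sends data."""
    return [cycle * p + phase for p in range(horizon) if cycle * p + phase < horizon]

# (cycle C_i, phase difference d_i) pairs, taken from the rows of Table 2
nodes = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]
for cycle, phase in nodes:
    print(f"C={cycle}, d={phase}: {send_times(cycle, phase)}")

# Number of nodes sending in each time slot, with and without phase shifting.
without_ps = [sum(1 for c, _ in nodes if t % c == 0) for t in range(12)]
with_ps    = [sum(1 for c, d in nodes if t % c == d) for t in range(12)]
print("without PS:", without_ps)   # load concentrates at t = 0, 6, ...
print("with PS:   ", with_ps)      # load is spread evenly over the slots
```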

Table 2. An example of the collection time considering phase differences

Cycle Phase diff. Collect. time


1 0 0, 1, 2, 3, 4, · · ·
2 0 0, 2, 4, 6, 8, · · ·
1 1, 3, 5, 7, 9, · · ·
3 0 0, 3, 6, 9, 12, · · ·
1 1, 4, 7, 10, 13, · · ·
2 2, 5, 8, 11, 14, · · ·

4 Evaluation
In this section, we describe an additional evaluation of our proposed method [6].

4.1 Simulation Environments

In the simulation environments of our previous work, the collection cycle Ci of each source node is determined at random between 1 and 10. The selectable cycles are assumed to be limited by the practical systems; however, the distribution of the selected cycles is not uniform in the real world. In this paper, therefore, we evaluate our proposed method under different distributions of collection cycles. The employed distributions are the normal (Gaussian) distribution and the exponential distribution. To determine an integer cycle between 1 and 10, the normal distribution has an average μ of 5.5 and a variance σ² of 1.5. In addition, the exponential distribution determines the integer cycle for each node based on its cumulative distribution function (CDF), 10 × (1 − e^(−x)), where

Fig. 3. An example of the skip graphs-based method without PS



[Figure: seven source nodes N1–N7 with keys (collection cycles) 3, 3, 3, 2, 2, 1, 1 and phase differences 2, 1, 0, 1, 0, 0, 0 forward data toward the destination node D1 over time slots t = 0 to t = 5.]
Fig. 4. An example of the skip graphs-based method with PS

x is determined uniformly at random between 0.0 and 5.0. For the other parameters, the simulation time t runs from 0 to 2519, whose length is the least common multiple of the selectable cycles. In addition, this simulation assumes no communication delays among the nodes, although various communication delays exist in the real world. As comparison methods, we compare the proposed method with the skip graph-based method without PS shown in Fig. 3, with the method in which all source nodes send data to the destination node directly (Source Direct, SD), and with the method in which all source nodes send data to the next node toward the destination node (Daisy Chain, DC).
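The following sketch shows one way to generate collection cycles under the three distributions (the clipping and rounding rules are our assumptions; the paper only specifies the parameters given above):

```python
import math
import random

def uniform_cycle(rng):
    return rng.randint(1, 10)

def gaussian_cycle(rng, mu=5.5, var=1.5):
    c = round(rng.gauss(mu, math.sqrt(var)))
    return min(10, max(1, c))                        # clip into the selectable range

def exponential_cycle(rng):
    x = rng.uniform(0.0, 5.0)
    return min(10, max(1, math.ceil(10 * (1 - math.exp(-x)))))

rng = random.Random(0)
for name, f in [("uniform", uniform_cycle),
                ("gaussian", gaussian_cycle),
                ("exponential", exponential_cycle)]:
    cycles = [f(rng) for _ in range(10000)]
    print(name, sum(cycles) / len(cycles))           # exponential gives the longest cycles
```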

4.2 Simulation Results

Figures 5 and 6 show the results for the maximum instantaneous load and the total loads of the nodes by the number of nodes, respectively. The number of nodes is the value on the horizontal axis, and the allowable number of stream aggregation is under 11. For all the distributions in Fig. 5, the proposed method, skip graphs (SG) with PS, has a lower instantaneous load than the SD-based methods, in which the destination node receives data directly from the source nodes. The larger the allowable number of stream aggregation in the DC-based methods, the smaller the number of transmissions and receptions; in this simulation environment, however, the proposed method still has a lower instantaneous load than the DC-based methods. In addition, the proposed method has a lower instantaneous load than SG without PS because each node has a different transmission and reception timing due to its phase difference, even if another node is configured with the same collection cycle. In Fig. 6, on the other hand, the SD-based methods have the lowest total loads. However, the proposed method has lower total loads than the DC-based methods in this simulation environment. In addition, the total loads are the lowest for the exponential distribution because longer cycles have higher probabilities of being selected.

[Figure: three panels, (a) Uniform, (b) Gaussian, (c) Exponential; horizontal axis: Number of Nodes (100–400); vertical axis: Max. Instantaneous Load; plotted methods: SD, DC, and SG, each with and without PS.]

Fig. 5. The maximum instantaneous load by the number of nodes

[Figure: three panels, (a) Uniform, (b) Gaussian, (c) Exponential; horizontal axis: Number of Nodes (100–400); vertical axis: Total Loads [1000]; plotted methods: SD, DC, and SG, each with and without PS.]

Fig. 6. The total loads by the number of nodes

[Figure: three panels, (a) Uniform, (b) Gaussian, (c) Exponential; horizontal axis: Number of Nodes (100–400); vertical axis: Avg. Number of Hops; plotted methods: SD, DC, and SG, each with and without PS.]

Fig. 7. The average hops by the number of nodes


[Figure: three panels, (a) Uniform, (b) Gaussian, (c) Exponential; horizontal axis: Number of Nodes (100–400); vertical axis: Max. Number of Hops; plotted methods: SD, DC, and SG, each with and without PS.]

Fig. 8. The maximum hops by the number of nodes



Similar to the results for the maximum instantaneous load and total loads of the nodes, Figs. 7 and 8 show the results for the average number and maximum number of hops by the number of nodes under 11 streams aggregation, respectively. In Figs. 7 and 8, the SD-based methods have only one hop as both the average and the maximum number, although their instantaneous loads shown in Fig. 5 are high. The proposed method has log n as the average number of hops, where n denotes the number of nodes, while the DC-based methods are affected linearly by n.

[Figure: three panels, (a) Uniform, (b) Gaussian, (c) Exponential; horizontal axis: Allowable Number of Stream Aggregation; vertical axis: Max. Instantaneous Load; plotted methods: SD, DC, and SG, each with and without PS.]

Fig. 9. The maximum instantaneous load by the allowable number of stream aggregation
[Figure: three panels, (a) Uniform, (b) Gaussian, (c) Exponential; horizontal axis: Allowable Number of Stream Aggregation; vertical axis: Total Loads [1000]; plotted methods: SD, DC, and SG, each with and without PS.]

Fig. 10. The total loads by the allowable number of stream aggregation

Figures 9 and 10 show the results for the maximum instantaneous load and the total loads of the nodes by the allowable number of stream aggregation, respectively. The allowable number of stream aggregation is the value on the horizontal axis, and the number of nodes is 200. The SD-based methods have a constant maximum instantaneous load that is not affected by the allowable number of stream aggregation because the source nodes send data to the destination node directly. In Figs. 9 and 10, most of the results decrease as the allowable number of stream aggregation increases. The proposed method, SG with PS, has lower results for both the maximum instantaneous load and the total loads even in the

realistic situation of 41 streams aggregation, compared to the DC-based methods, which require many streams aggregation to reduce those loads. In addition, similar to the results by the number of nodes, the total loads are the lowest for the exponential distribution. The average and maximum numbers of hops are the same as the results for 200 nodes in Figs. 7 and 8 because they are not affected by the allowable number of stream aggregation.

5 Conclusion
We have proposed a skip graph-based collection system for sensor data streams that considers phase differences. In this paper, we evaluated our proposed method under different distributions of collection cycles. The employed distributions are the normal (Gaussian) distribution and the exponential distribution. Our additional simulation results show that our proposed system can equalize the loads of the nodes even if the distribution of collection cycles is not uniform.

Acknowledgements. This work was supported by JSPS KAKENHI Grant Number


17K00146 and I-O DATA Foundation.

References
1. Aspnes, J., Shah, G.: Skip graphs. ACM Trans. Algorithms (TALG) 3(4(37)), 1–25
(2007)
2. Banno, R., Takeuchi, S., Takemoto, M., Kawano, T., Kambayashi, T., Matsuo, M.:
Designing overlay networks for handling exhaust data in a distributed topic-based
pub/sub architecture. J. Inf. Process. (JIP) 23(2), 105–116 (2015)
3. Bharambe, A.R., Agrawal, M., Seshan, S.: Mercury: supporting scalable multi-
attribute range queries. In: Proceedings of the ACM Conference on Applications,
Technologies, Architectures, and Protocols for Computer Communications (SIG-
COMM 2004), pp. 353–366 (2004)
4. Kaneko, Y., Harumoto, K., Fukumura, S., Shimojo, S., Nishio, S.: A location-based
peer-to-peer network for context-aware services in a ubiquitous environment. In:
Proceedings of the 5th IEEE/IPSJ Symposium on Applications and the Internet
(SAINT 2005) Workshops, pp. 208–211 (2005)
5. Kawakami, T., Ishi, Y., Yoshihisa, T., Teranishi, Y.: A skip graph-based collection
system for sensor data streams considering phase differences. In: Proceedings of
the 8th International Workshop on Streaming Media Delivery and Management
Systems (SMDMS 2017) in Conjunction with the 12th International Conference
on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC 2017), pp. 506–
513 (2017)
6. Kawakami, T., Yoshihisa, T., Teranishi, Y.: A load distribution method for sensor
data stream collection considering phase differences. In: Proceedings of the 9th
International Workshop on Streaming Media Delivery and Management Systems
(SMDMS 2018) in Conjunction with the 13th International Conference on P2P,
Parallel, Grid, Cloud and Internet Computing (3PGCIC 2018), pp. 357–367 (2018)
7. Legtchenko, S., Monnet, S., Sens, P., Muller, G.: RelaxDHT: a churn-resilient
replication strategy for peer-to-peer distributed hash-tables. ACM Trans. Auton.
Adapt. Syst. (TAAS) 7(2), 28 (2012)

8. Mondal, A., Lifu, Y., Kitsuregawa, M.: P2PR-tree: an R-tree-based spatial index
for peer-to-peer environments. In: Proceedings of the International Workshop on
Peer-to-Peer Computing and Databases in Conjunction with the 9th International
Conference on Extending Database Technology (EDBT 2004), pp. 516–525 (2004)
9. Ohnishi, M., Inoue, M., Harai, H.: Incremental distributed construction method
of delaunay overlay network on detour overlay paths. J. Inf. Process. (JIP) 21(2),
216–224 (2013)
10. Shinomiya, J., Teranishi, Y., Harumoto, K., Nishio, S.: A sensor data collection
method under a system constraint using hierarchical delaunay overlay network.
In: Proceedings of the 7th International Conference on Intelligent Sensors, Sensor
Networks and Information Processing (ISSNIP 2011), pp. 300–305 (2011)
11. Shu, Y., Ooi, B.C., Tan, K.L., Zhou, A.: Supporting multi-dimensional range
queries in peer-to-peer systems. In: Proceedings of the 5th IEEE International
Conference on Peer-to-Peer Computing (P2P 2005), pp. 173–180 (2005)
12. Stoica, I., Morris, R., Liben-Nowell, D., Karger, D.R., Kaashoek, M.F., Dabek,
F., Balakrishnan, H.: Chord: a scalable peer-to-peer lookup protocol for internet
applications. IEEE/ACM Trans. Netw. 11(1), 17–32 (2003)
13. Tanin, E., Harwood, A., Samet, H.: Using a distributed quadtree index in peer-to-
peer networks. Int. J. Very Large Data Bases (VLDB) 16(2), 165–178 (2007)
14. Teranishi, Y., Kawakami, T., Ishi, Y., Yoshihisa, T.: A large-scale data collection
scheme for distributed topic-based pub/sub. In: Proceedings of the 2017 Interna-
tional Conference on Computing, Networking and Communications (ICNC 2017)
(2017)
A Mathematical Analysis of 2-Tiered Hybrid
Broadcasting Environments

Satoru Matsumoto, Kenji Ohira, and Tomoki Yoshihisa(&)

Cybermedia Center, Osaka University, Ibaraki, Osaka, Japan


{smatsumoto,ohira,yoshihisa}@cmc.osaka-u.ac.jp

Abstract. Due to the recent development of wireless broadcasting technologies, video data distribution systems for hybrid broadcasting environments have attracted great attention. In some video distribution methods for hybrid broadcasting environments, the server broadcasts some data pieces via the broadcasting channel and delivers them to all the clients to effectively reduce the interruption time. These methods fix the bandwidth allocation so that they can predict when the clients finish receiving pieces. However, the server can change the bandwidth allocation flexibly in recent environments such as 5G. In this paper, we propose a mathematical model for video data distribution in flexible-bandwidth hybrid broadcasting environments.

1 Introduction

Video distribution in hybrid broadcasting environments has attracted great attention. One of the key points of hybrid broadcasting environments is that the clients of a system can receive data from both the broadcasting channel and the communication channel. The server can send data to all clients at once via the broadcasting channel, and it can send data to each client on request via the communication channel. Conventional methods for hybrid broadcasting environments divide the data of a video into several pieces. The viewer's client terminal plays the received pieces in order from the front of the video stream, and a playback interruption occurs when the client has not yet received the piece to be played. For viewers, reducing the playback interruption time is an important issue. Figure 1 shows the data structure of the assumed conventional streaming video data. A general data amount for a GOP is the data amount of the video for 0.5 s. For example, suppose that a 30-minute video is encoded in MPEG and the bitrate is 5 Mbps, a typical bitrate of television services; the data amount for the video is 1.125 Gbytes and that for a piece is 312.5 Kbytes. Consider, for example, the MBMS mode of a 5G network shown in Fig. 2. Although the total bandwidth is fixed, the control system can control the bandwidths of the broadcast channel and the communication channel, and in such hybrid broadcasting environments the system can flexibly change the bandwidth allocation. The authors focus on 5G MBMS and use it as an approach to find the optimal bandwidth distribution between broadcast and communication. Hybrid broadcasting thus has the problem of finding the optimal allocation between the broadcast band and the communication band.
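As a quick sanity check of the data amounts quoted above, the following short calculation (ours, not from the original text; decimal units are assumed) reproduces the 1.125-Gbyte and 312.5-Kbyte figures:

```python
bitrate_bps = 5e6                            # 5 Mbps
video_bytes = bitrate_bps * 30 * 60 / 8      # 30-minute video
piece_bytes = bitrate_bps * 0.5 / 8          # one 0.5-second GOP
print(video_bytes / 1e9, "Gbytes")           # 1.125
print(piece_bytes / 1e3, "Kbytes")           # 312.5
```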


Fig. 1. Conventional streaming video data

Fig. 2. Hybrid broadcasting streaming video data

Fig. 3. Conceptual diagram of RDB method

The authors also focused on the balance between the following two points (a) and (b).
(a) When the number of clients is large, the larger the broadcast bandwidth is, the easier it is to shorten the playback interruption time (because pieces can be delivered to all clients at once).
(b) When the number of clients is small, the larger the communication bandwidth is, the easier it is to shorten the playback interruption time (because pieces can be sent on request).
In this paper, we propose a video data distribution method and a model for flexible bandwidth allocation in hybrid broadcasting environments. We call our proposed method the Request-based Division Broadcasting (RDB) method. In the RDB method, the server broadcasts a requested piece only when the transmission time via the broadcasting channel is predicted to be earlier than that via the communication channel, as shown in Fig. 3.

Fig. 4. Viewer request model

2 Related Work

Guo et al. [1] point out that 3GPP has not yet defined any broadcast/multicast solution for 5G NR, although some proposals will be revisited as soon as the corresponding time units in the latest 5G specifications become available; they analyze the use of a mixed mode that shares multicast, broadcast and unicast resources via the same physical channel. Gotoh et al. have proposed a method for creating a broadcast schedule when users use multiple data items [2]. However, the broadcast schedule is fixed and only predetermined data can be broadcast. Research is also being conducted on distributing data not only by radio broadcasting but also over the Internet. Hironaka et al. [3] are researching and developing a system called Hybridcast that distributes access destinations via radio broadcasts and downloads and displays the homepage of the access destination delivered via the Internet. However, these studies cannot broadcast data according to the user's situation. Some researchers have started to distribute data according to the user's situation in a broadcast format. A method has been proposed for determining the processing order on the terminal side so that the processing time is shortened when data are broadcast in sequence in response to requests from multiple computers [4]. It assumes large-scale distributed computation and is not intended for high-speed notification, response, or continuous transmission. A method has also been proposed in which a user terminal requests, and receives by broadcast, the necessary data from a list of data given in advance [5]. The user can only request data in the list and cannot use a large amount of data. The authors have created an Internet broadband distribution system [6], which is the predecessor of the experimental environment for distributing the hybrid broadcast proposed in this paper, and are preparing it for a new experimental environment.

3 Proposed Method and Model

There is a technology called near video-on-demand, which realizes a pseudo video-on-demand service by distributing the data repeatedly. However, a waiting time occurs until the data for the beginning of the video are acquired. This waiting time can be shortened by delivering the beginning part via unicast. In hybrid broadcasting, when the bandwidth is allocated between multicast and unicast, the allocation that appropriately and effectively reduces the latency varies with the number of viewers. Therefore, the problem is to clarify the appropriate bandwidth allocation when the total bandwidth is constant. To solve this problem, the authors propose to find the optimal value with a mathematical model. The viewer request model is shown in Fig. 4.

3.1 Data Division


In this paper, we assume that the video data is divided into two segments, S1 and S2. S1 is delivered to the clients via the communication system when they request to play the video data, and S2 is cyclically broadcast via the broadcasting system. Here, D1 denotes the duration of S1 and D2 the duration of S2. BB is the bandwidth of the broadcasting system and BC is the bandwidth of the communication system. The clients can play the video data continuously if they finish receiving S2 while they are playing S1. The data amount that the clients can receive while playing S1 is BB·D1, and the corresponding duration is D2 = BB·D1/R, where R is the video bitrate. The total duration should be the same as the video duration D. Therefore:

D1 + BB·D1/R = D

Hence,

D1 = D/(1 + BB/R)

If the average arrival interval of the clients, λ, is shorter than the time needed to receive S1 from the communication system, the waiting time before clients start playing the video data increases, because the communication bandwidth is shared with the next client and the reception time of S1 lengthens. Therefore, the shortest average arrival interval λ̂ for which the waiting time does not lengthen is given by the time needed to receive S1 from the communication system. Thus,

λ̂ = R·D1/BC = R·D/(BC(1 + BB/R))

In this paper, we assume that the total bandwidth of the broadcasting system and the communication system is fixed to B and is flexibly divided between the broadcasting system and the communication system. We define b as the ratio of the bandwidth for the broadcasting system to the total bandwidth:

BB = bB

BC = (1 − b)B

In this case,

λ̂ = R·D/((1 − b)(1 + bB/R)B)

A larger value of the denominator gives a shorter λ̂. Here, we define the function f(b) as this denominator. To find the shortest λ̂, we differentiate f(b):

f(b) = (1 − b)(1 + bB/R)B

df/db = −(1 + bB/R)B + B²(1 − b)/R

Setting the derivative to 0,

(1 + bB/R)B = B²(1 − b)/R

Therefore, the limit of the average arrival interval for which the waiting time does not lengthen is shortest when:

b = (B − R)/(2B)
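The closed-form optimum can be checked numerically, for example as follows (a small sketch of ours using B = 20 Mbps and R = 5 Mbps, as in Sect. 3.2; the grid search only verifies that the maximizer of f(b) agrees with b = (B − R)/(2B)):

```python
B, R = 20.0, 5.0                       # total bandwidth and bitrate in Mbps

def f(b):
    """Denominator of the arrival-interval limit: larger f(b) means a shorter limit."""
    return (1 - b) * (1 + b * B / R) * B

b_closed = (B - R) / (2 * B)           # 0.375 for B = 20, R = 5
grid = [i / 1000 for i in range(1001)]
b_numeric = max(grid, key=f)           # brute-force maximizer over a fine grid
print(b_closed, b_numeric, f(b_closed))
```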

3.2 Limit of Average Arrival Interval


In our assumed hybrid broadcasting environments, the video distribution system can flexibly allocate the bandwidth between the communication system and the broadcasting system. To investigate the limit of the average arrival interval under which the waiting time does not lengthen, we calculate the value of f(b) defined in the previous subsection. First, Fig. 5 shows the value of f(b) under several different total bandwidths B. We fix the bitrate R to 5 Mbps; this is a bitrate often used for MPEG-2 and is realistic. From this figure, we can see that f(b) has a maximum value even as B changes. Therefore, we need to find the value of b that gives the maximum f(b) in order to obtain the shortest limit for the average arrival interval. Also, we can see that a smaller B gives a smaller maximum f(b). This is because the limit lengthens as the total bandwidth decreases.
Figure 6 shows the value of f(b) under several different bitrates R. We fix B to 20 Mbps, assuming a practical total bandwidth for cellular networks. From this figure, we can see that f(b) has a maximum value even as R changes. As in the investigation under different total bandwidths, we need to find the value of b that gives the maximum f(b) to obtain the shortest limit for the average arrival interval. Also, we can see that a smaller R gives a smaller maximum f(b). This is because the limit lengthens as the bitrate decreases.

[Figure: f(b) plotted against b (0 to 1) for B = 10, 20, and 30, with R = 5.]
Fig. 5. f (b) graph B = 10, 20, 30 R = 5. Result of simulation, when R is fixed to 20, f (b) is
maximum when b is 0.32. The values used for B were 10, 20, and 30.

[Figure: f(b) plotted against b (0 to 1) for R = 1, 5, and 10, with B = 20.]
Fig. 6. f (b) graph B = 20 R = 1, 5, 10. Result of simulation, when B is fixed to 20, f (b) is
maximum when b is 0.46. The values used for R were 1, 5, and 10.

By considering the apportionment of R and B in this way, a well-balanced distribution becomes possible. The simulation results indicate that the waiting time is optimally handled at a ratio of B = 2.78R under the conditions of Fig. 5 and at a ratio of 0.08B = R under the conditions of Fig. 6.

4 Conclusions

Recently, streaming video delivery in hybrid broadcasting environments has received great attention. One of the research topics in the field of video distribution is the reduction of the interruption time in environments where the client can receive data from both the broadcast channel and the communication channel. In this paper, we proposed a video data delivery method and a model for flexible bandwidth allocation. By building a simulator based on the model and running simulations with a method that outputs values approximating the proposed RDB method, we found the minimum balance point for general movie data distribution.

Acknowledgments. This research was supported by a Grants-in-Aid for Scientific Research


(C) numbered JP17K00146 and JP18K11316.

References
1. Guo, W., Fuentes, M., Christodoulou, L., Mouhouche, B.: Roads to multimedia broadcast
multicast services in 5G new radio. In: Proceedings IEEE International Symposium on
Broadband Multimedia Systems and Broadcasting (BMSB), Valencia, Spain (2018)
2. Fukui, D., Gotoh, Y.: A scheduling method for switching playback speed in selective
Contents Broadcasting. Journal of Mobile Multimedia (JMM) 12, 181–196 (2017)
3. Hironaka, Y., Majima, K., Kai, K., Sunasaki, S.: Broadcast metadata providing system for
hybridcast. In: Proceedings IEEE International Conference on Consumer Electronics,
pp. 328–329 (2015)
4. Wu, C.J., Ku, C.F., Ho, J.M., Chen, M.S.: A novel pipeline approach for efficient big data broadcasting. IEEE Trans. Knowl. Data Eng. 28(1), 17–28 (2016)
5. Lu, Z., Wu, W., Li, W.W., Pan, M.: Efficient scheduling algorithms for on-demand wireless data broadcast. In: Proceedings of IEEE INFOCOM (2016)
6. Matsumoto, S., Yoshihisa, T.: Different Worlds broadcasting: a video data distribution
method for flexible bandwidth allocation in hybrid broadcasting environments. In:
Proceedings of 33rd International Conference on Advanced Information Networking and
Applications (AINA 2019), Shimane, Japan, March 2019
The 9th International Workshop on
Multimedia, Web and Virtual Reality
Technologies (MWVRTA-2019)
Influence of Japanese Traditional Crafts
on Kansei in Different Interior Styles

Ryo Nakai1(&), Yangzhicheng Lu1, Tomoyuki Ishida2,


Akihiro Miyakwa3, Kaoru Sugita2, and Yoshitaka Shibata4
1
Ibaraki University, Hitachi, Ibaraki 316-8511, Japan
{18nm728a,18nm742y}@vc.ibaraki.ac.jp
2
Fukuoka Institute of Technology, Fukuoka 811-0295, Japan
{t-ishida,sugita}@fit.ac.jp
3
Nanao, Ishikawa 926-8611, Japan
a-miyakawa@city.nanao.lg.jp
4
Iwate Prefectural University, Takizawa, Iwate 020-0693, Japan
shibata@iwate-pu.ac.jp

Abstract. In this research, we investigated the influence on Kansei when Japanese traditional crafts are placed in virtual reality spaces of a Japanese-style room, a Western-style room, and a Chinese-style room. We placed Japanese traditional crafts such as Fusuma (sliding doors), Shoji (paper sliding doors), and Tsuitate (screens) in each room style. We then conducted a questionnaire survey of 14 subjects using 10 Kansei word pairs about the impressions of the Japanese traditional crafts placed in each room style.

1 Introduction

There are many types of traditional crafts in the various regions of Japan. On the other
hand, current Japanese traditional crafts have the following problems [1]:
• difficulty procuring raw materials, lack of successors
• lack of know-how for business expansion, difficulty in developing sales channels by
individual
• weakness of information transmission and branding
Under these circumstances, the Ministry of Economy, Trade and Industry is promoting the development of overseas demand and the utilization of information technology in the "Basic Guidelines on the Promotion of Traditional Craft Industry [2]."
In our previous research, we implemented a Japanese traditional crafts presentation system based on a Kansei retrieval method using the Virtual Reality Modeling Language (VRML) [3–8] and a Japanese traditional crafts AR presentation system based on a Kansei retrieval method using Augmented Reality (AR) technology [9, 10]. On the other hand, both systems are based on a Japanese-style room and do not take into consideration the influence on Kansei when traditional crafts are placed in a Western-style room or a Chinese-style room.


Therefore, in this research, we investigate the influence on Kansei when placing traditional crafts in a Japanese-style room, a Western-style room, and a Chinese-style room. Each style of room uses a 3D space provided by the interior design site "Justeasy [11]."

2 Evaluation Score of Each Kansei Word, and Average, Standard Deviation, and Dispersion of Evaluation Score

We calculated the average value, standard deviation, and dispersion of the evaluation scores for each Kansei word when Japanese traditional crafts were placed in the Japanese-, Western-, and Chinese-style rooms. The evaluation scores of each Kansei word are shown in Table 1. In the case of "Modern - Classic," for example, the evaluation score is 2 when the impression is "moderately modern." Moreover, Table 2 shows the Fusuma, Shoji, and Tsuitate placed in each style of room.
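For reference, the average, the two standard deviations, and the dispersion reported in the tables below can be computed from the raw scores as in the following sketch (the score vector is hypothetical, and which divisor the "Dispersion" column uses is not stated explicitly, so both forms are shown):

```python
import numpy as np

# Hypothetical scores of the 14 subjects for one Kansei word pair (1-5 scale).
scores = np.array([3, 4, 2, 5, 3, 3, 4, 2, 3, 4, 3, 5, 2, 3])

print("average                 :", scores.mean())
print("standard deviation (n)  :", scores.std(ddof=0))   # divide by n
print("standard deviation (n-1):", scores.std(ddof=1))   # divide by n - 1
print("dispersion (n)          :", scores.var(ddof=0))
print("dispersion (n-1)        :", scores.var(ddof=1))
```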

Table 1. Evaluation score of each Kansei word.


1 [point]             2 [point]              3 [point]  4 [point]               5 [point]
Extremely massive     Moderately massive     Neither    Moderately light        Extremely light
Extremely gorgeous    Moderately gorgeous    Neither    Moderately simple       Extremely simple
Extremely calm        Moderately calm        Neither    Moderately gaudy        Extremely gaudy
Extremely individual  Moderately individual  Neither    Moderately traditional  Extremely traditional
Extremely modern      Moderately modern      Neither    Moderately classic      Extremely classic
Extremely plain       Moderately plain       Neither    Moderately delicate     Extremely delicate
Extremely hard        Moderately hard        Neither    Moderately soft         Extremely soft
Extremely rustic      Moderately rustic      Neither    Moderately smart        Extremely smart
Extremely warm        Moderately warm        Neither    Moderately cool         Extremely cool
Extremely like        Moderately like        Neither    Moderately dislike      Extremely dislike
Influence of Japanese Traditional Crafts on Kansei 465

Table 2. Japanese traditional crafts placed in Japanese, Western, and Chinese style rooms.

A: Fusuma placed in Japanese-style room space
B: Fusuma placed in Western-style room space
C: Fusuma placed in Chinese-style room space
D: Shoji placed in Japanese-style room space
E: Shoji placed in Western-style room space
F: Shoji placed in Chinese-style room space
G: Tsuitate placed in Japanese-style room space
H: Tsuitate placed in Western-style room space
I: Tsuitate placed in Chinese-style room space

Table 3 shows the average, standard deviation, and dispersion of the evaluation
scores of Fusuma placed in Japanese-style room space. The average value of Fusuma
placed in Japanese-style room is the Kansei word “Warm - Cool (4.00)” and “Indi-
vidual - Traditional (3.71)” at the top. Conversely, the average value of Fusuma placed
in Japanese-style room is the Kansei words “Like - Dislike (1.57)” and “Calm - Gaudy
(1.86)” at the bottom. Therefore, Fusuma placed in Japanese-style room strongly
influences the Kansei words “Cool,” “Traditional,” “Like,” and “Calm.”

Table 3. Average value, standard deviation and dispersion value of evaluation score of Fusuma
placed in Japanese-style room space.
Kansei words              Average  Standard deviation (n)  Standard deviation (n − 1)  Dispersion
Massive - Light           3.29     1.16                    1.20                        1.35
Gorgeous - Simple         3.14     1.12                    1.17                        1.27
Calm - Gaudy              1.86     0.91                    0.95                        0.84
Individual - Traditional  3.71     1.10                    1.14                        1.20
Modern - Classic 3.21 1.32 1.37 1.74
Plain - Delicate 3.14 0.99 1.03 0.98
Hard - Soft 3.21 1.21 1.25 1.45
Rustic - Smart 3.50 0.82 0.85 0.68
Warm - Cool 4.00 1.07 1.11 1.14
Like - Dislike 1.57 0.62 0.65 0.39

Table 4 shows the average, standard deviation, and dispersion of the evaluation
scores of Fusuma placed in Western-style room space. The average value of Fusuma
placed in Western-style room is the Kansei word "Hard - Soft (3.79)" and "Massive -
Light (3.43)” at the top. Conversely, the average value of Fusuma placed in Western-
style room is the Kansei words “Individual - Traditional (2.21)” and “Modern - Classic
(2.21)” at the bottom. Therefore, Fusuma placed in Western-style room strongly
influences the Kansei words “Soft,” “Light,” “Individual,” and “Modern.”
Table 5 shows the average, standard deviation, and dispersion of the evaluation
scores of Fusuma placed in Chinese-style room space. The average value of Fusuma
placed in Chinese-style room is the Kansei word "Plain - Delicate (3.64)," "Calm -
Gaudy (3.50)” and “Individual - Traditional (3.50)” at the top. Conversely, the average
value of Fusuma placed in Chinese-style room is the Kansei words “Massive - Light
(1.64)” and “Gorgeous - Simple (1.79)” at the bottom. Therefore, Fusuma placed in
Chinese-style room strongly influences the Kansei words “Delicate,” “Gaudy,” “Tra-
ditional,” “Massive,” and “Gorgeous.”

Table 4. Average value, standard deviation and dispersion value of evaluation score of Fusuma
placed in Western-style room space.
Kansei words              Average  Standard deviation (n)  Standard deviation (n − 1)  Dispersion
Massive - Light           3.43     1.12                    1.16                        1.24
Gorgeous - Simple         2.93     0.96                    1.00                        0.92
Calm - Gaudy              3.00     1.00                    1.04                        1.00
Individual - Traditional  2.21     1.08                    1.12                        1.17
Modern - Classic 2.21 0.86 0.89 0.74
Plain - Delicate 2.64 0.72 0.74 0.52
Hard - Soft 3.79 0.56 0.58 0.31
Rustic - Smart 3.14 1.12 1.17 1.27
Warm - Cool 3.07 1.16 1.21 1.35
Like - Dislike 2.57 1.05 1.09 1.10

Table 5. Average value, standard deviation and dispersion value of evaluation score of Fusuma
placed in Chinese-style room space.
Kansei words              Average  Standard deviation (n)  Standard deviation (n − 1)  Dispersion
Massive - Light           1.64     0.81                    0.84                        0.66
Gorgeous - Simple         1.79     0.77                    0.80                        0.60
Calm - Gaudy              3.50     1.45                    1.51                        2.11
Individual - Traditional  3.50     1.30                    1.34                        1.68
Modern - Classic 3.21 1.32 1.37 1.74
Plain - Delicate 3.64 1.17 1.22 1.37
Hard - Soft 2.29 1.10 1.14 1.20
Rustic - Smart 2.86 0.99 1.03 0.98
Warm - Cool 2.50 0.91 0.94 0.82
Like - Dislike 2.43 0.90 0.94 0.82

Table 6 shows the average, standard deviation, and dispersion of the evaluation
scores of Shoji placed in Japanese-style room space. The average value of Shoji placed
in Japanese-style room is the Kansei word “Gorgeous - Simple (3.64),” “Massive -
Light (3.57)” and “Individual - Traditional (3.57)” at the top. Conversely, the average
value of Shoji placed in Japanese-style room is the Kansei words “Calm - Gaudy
(1.79)” and “Plain - Delicate (2.21)” at the bottom. Therefore, Shoji placed in Japanese-
style room strongly influences the Kansei words “Simple,” “Light,” “Traditional,”
“Calm,” and “Plain.”

Table 6. Average value, standard deviation and dispersion value of evaluation score of Shoji
placed in Japanese-style room space.
Kansei words              Average  Standard deviation (n)  Standard deviation (n − 1)  Dispersion
Massive - Light           3.57     0.98                    1.02                        0.96
Gorgeous - Simple         3.64     0.81                    0.84                        0.66
Calm - Gaudy              1.79     0.56                    0.58                        0.31
Individual - Traditional  3.57     0.90                    0.94                        0.82
Modern - Classic 3.36 0.97 1.01 0.94
Plain - Delicate 2.21 1.26 1.31 1.60
Hard - Soft 3.00 1.13 1.18 1.29
Rustic - Smart 3.36 0.81 0.84 0.66
Warm - Cool 3.43 1.05 1.09 1.10
Like - Dislike 2.43 0.82 0.85 0.67

Table 7 shows the average, standard deviation, and dispersion of the evaluation
scores of Shoji placed in Western-style room space. The average value of Shoji placed
in Western-style room is the Kansei word "Gorgeous - Simple (3.64)" and "Rustic -
Smart (3.64)” at the top. Conversely, the average value of Shoji placed in Western-style
room is the Kansei words “Like - Dislike (2.43),” “Calm - Gaudy (2.50)” and “Modern
- Classic (2.50)” at the bottom. Therefore, Shoji placed in Western-style room strongly
influences the Kansei words “Simple,” “Smart,” “Like,” “Calm,” and “Modern.”
Table 8 shows the average, standard deviation, and dispersion of the evaluation
scores of Shoji placed in Chinese-style room space. The average value of Shoji placed in
Chinese-style room is the Kansei word “Plain - Delicate (3.50)” and “Modern - Classic
(3.29)” at the top. Conversely, the average value of Shoji placed in Chinese-style room
is the Kansei words “Massive - Light (2.29),” “Calm - Gaudy (2.50),” and “Like -
Dislike (2.50)” at the bottom. Therefore, Shoji placed in Chinese-style room strongly
influences the Kansei words “Delicate,” “Classic,” “Massive,” “Calm,” and “Like.”
Table 7. Average value, standard deviation and dispersion value of evaluation score of Shoji
placed in Western-style room space.
Kansei words              Average  Standard deviation (n)  Standard deviation (n − 1)  Dispersion
Massive - Light           3.93     0.59                    1.02                        0.35
Gorgeous - Simple         3.64     0.89                    0.84                        0.80
Calm - Gaudy              2.50     0.73                    0.58                        0.54
Individual - Traditional  3.21     0.77                    0.94                        0.60
Modern - Classic 2.50 1.05 1.01 1.11
Plain - Delicate 2.57 0.82 1.31 0.67
Hard - Soft 3.00 0.93 1.18 0.86
Rustic - Smart 3.64 0.72 0.84 0.52
Warm - Cool 3.43 0.62 1.09 0.39
Like - Dislike 2.43 0.73 0.85 0.53

Table 8. Average value, standard deviation and dispersion value of evaluation score of Shoji
placed in Chinese-style room space.
Kansei words              Average  Standard deviation (n)  Standard deviation (n − 1)  Dispersion
Massive - Light           2.29     0.88                    0.91                        0.78
Gorgeous - Simple         2.64     1.04                    1.08                        1.09
Calm - Gaudy              2.50     1.12                    1.16                        1.25
Individual - Traditional  3.00     1.20                    1.24                        1.43
Modern - Classic 3.29 1.10 1.14 1.20
Plain - Delicate 3.50 0.91 0.94 0.82
Hard - Soft 2.79 0.77 0.80 0.60
Rustic - Smart 3.21 0.77 0.80 0.60
Warm - Cool 3.00 1.00 1.04 1.00
Like - Dislike 2.50 0.73 0.76 0.54

Table 9 shows the average, standard deviation, and dispersion of the evaluation
scores of Tsuitate placed in Japanese-style room space. The average value of Tsuitate
placed in Japanese-style room is highest for the Kansei words "Plain - Delicate (3.50),"
"Individual - Traditional (3.29)" and "Rustic - Smart (3.29)." Conversely, it is lowest
for the Kansei words "Like - Dislike (2.21)" and "Gorgeous - Simple (2.64)." Therefore,
Tsuitate placed in Japanese-style room strongly influences the Kansei words "Delicate,"
"Traditional," "Smart," "Like," and "Gorgeous."

Table 9. Average value, standard deviation and dispersion value of evaluation score of Tsuitate
placed in Japanese-style room space.
Kansei words | Average | Standard deviation (n) | Standard deviation (n − 1) | Dispersion
Massive - Light | 2.93 | 1.28 | 1.33 | 1.64
Gorgeous - Simple | 2.64 | 1.29 | 1.34 | 1.66
Calm - Gaudy | 2.71 | 1.22 | 1.27 | 1.49
Individual - Traditional | 3.29 | 1.28 | 1.33 | 1.63
Modern - Classic | 3.00 | 1.20 | 1.24 | 1.43
Plain - Delicate | 3.50 | 0.91 | 0.94 | 0.82
Hard - Soft | 3.14 | 1.19 | 1.23 | 1.41
Rustic - Smart | 3.29 | 1.03 | 1.07 | 1.06
Warm - Cool | 2.86 | 1.12 | 1.17 | 1.27
Like - Dislike | 2.21 | 1.01 | 1.05 | 1.03

Table 10 shows the average, standard deviation, and dispersion of the evaluation
scores of Tsuitate placed in Western-style room space. The average value of Tsuitate
placed in Western-style room is highest for the Kansei words "Calm - Gaudy (3.29),"
"Massive - Light (3.14)" and "Plain - Delicate (3.14)." Conversely, it is lowest for the
Kansei words "Individual - Traditional (2.00)" and "Modern - Classic (2.43)." Therefore,
Tsuitate placed in Western-style room strongly influences the Kansei words "Gaudy,"
"Light," "Delicate," "Individual," and "Modern."
Table 11 shows the average, standard deviation, and dispersion of the evaluation
scores of Tsuitate placed in Chinese-style room space. The average value of Tsuitate
placed in Chinese-style room is highest for the Kansei words "Individual - Traditional
(3.64)" and "Rustic - Smart (3.29)." Conversely, it is lowest for the Kansei words
"Gorgeous - Simple (2.21)" and "Warm - Cool (2.36)." Therefore, Tsuitate placed in
Chinese-style room strongly influences the Kansei words "Traditional," "Smart,"
"Gorgeous," and "Warm."

Table 10. Average value, standard deviation and dispersion value of evaluation score of
Tsuitate placed in Western-style room space.
Kansei words | Average | Standard deviation (n) | Standard deviation (n − 1) | Dispersion
Massive - Light | 3.14 | 1.12 | 1.17 | 1.27
Gorgeous - Simple | 2.57 | 1.05 | 1.09 | 1.10
Calm - Gaudy | 3.29 | 1.10 | 1.14 | 1.20
Individual - Traditional | 2.00 | 1.00 | 1.04 | 1.00
Modern - Classic | 2.43 | 1.05 | 1.09 | 1.10
Plain - Delicate | 3.14 | 0.91 | 0.95 | 0.84
Hard - Soft | 2.71 | 1.03 | 1.07 | 1.06
Rustic - Smart | 2.93 | 1.03 | 1.07 | 1.07
Warm - Cool | 2.79 | 1.01 | 1.05 | 1.03
Like - Dislike | 2.93 | 1.03 | 1.07 | 1.07

Table 11. Average value, standard deviation and dispersion value of evaluation score of
Tsuitate placed in Chinese-style room space.
Kansei words | Average | Standard deviation (n) | Standard deviation (n − 1) | Dispersion
Massive - Light | 2.71 | 1.28 | 1.33 | 1.63
Gorgeous - Simple | 2.21 | 1.15 | 1.19 | 1.31
Calm - Gaudy | 3.14 | 1.06 | 1.10 | 1.12
Individual - Traditional | 3.64 | 1.17 | 1.22 | 1.37
Modern - Classic | 3.21 | 1.15 | 1.19 | 1.31
Plain - Delicate | 3.07 | 1.03 | 1.07 | 1.07
Hard - Soft | 3.00 | 0.93 | 0.96 | 0.86
Rustic - Smart | 3.29 | 0.96 | 0.99 | 0.92
Warm - Cool | 2.36 | 0.81 | 0.84 | 0.66
Like - Dislike | 2.71 | 0.96 | 0.99 | 0.92

3 Conclusion

In this research, we investigated the influence on Kansei when Japanese traditional
crafts are placed in the virtual reality space of a Japanese-style room, a Western-style
room and a Chinese-style room. As a result, we found that the impression of a traditional
craft changes with the room style. For example, Fusuma placed in Japanese-style room
strongly influences the Kansei words "Cool," "Traditional," "Like," and "Calm." On
the other hand, Fusuma placed in Western-style room strongly influences the Kansei
words "Soft," "Light," "Individual," and "Modern."

Semantic Similarity Calculation
of TCM Patents in Intelligent Retrieval
Based on Deep Learning

Na Deng1(&), Xu Chen2, and Caiquan Xiong1


1
School of Computer Science, Hubei University of Technology, Wuhan, China
iamdengna@163.com
2
School of Information and Safety Engineering,
Zhongnan University of Economics and Law, Wuhan, China
chenxu@whu.edu.cn

Abstract. Semantic similarity calculation between words is an important step of text
analysis, mining and intelligent retrieval. It can help to achieve intelligent retrieval at
the semantic level and improve the accuracy and recall rate of retrieval. Because of the
particularity of TCM (Traditional Chinese Medicine) patents and the insufficiency of
research on them, most of the current mainstream TCM patent retrieval systems are
keyword-based, and the retrieval results are not satisfactory. In order to improve the
intelligence level of TCM patent retrieval, to promote TCM innovation and to avoid
repetitive research, this paper exploits the feature learning ability of deep learning to
build a neural network model on a real TCM patent corpus and gives a method to
calculate the semantic similarity between words in TCM patents. The experimental
results show that the proposed method is effective. In addition, the method can be
extended to semantic similarity calculation in other domains.

1 Introduction

Semantic similarity calculation between words is an important topic in text analysis,
mining and intelligent retrieval: it quantifies how semantically close two words are, and
it is a key step in computing the semantic similarity of whole texts, which in turn
supports text classification, clustering, recommendation and intelligent retrieval. For
example, if the semantic relationship between the two words "pencil" and "stationery"
is known, then when the search term is "stationery" the search engine can also retrieve
texts containing the word "pencil".
As the largest intellectual property data source in the era of big data, patent data
contains rich scientific, technological, economic and legal information. According to
the statistics of the World Intellectual Property Organization (WIPO), patents contain
95% of the world’s R&D achievements. If we can make full use of patent information
for technological innovation in the R&D process, we can save 60% of the research time
and 40% of the research expenditure [1]. In the era of big data, facing the rapid growth


of patent data, how to find the required information accurately and comprehensively
becomes an urgent problem, and patent retrieval is a powerful tool to solve this
problem. For patent applicants, patent retrieval can help stimulate inspiration, avoid
duplicate research and avoid infringement; for patent examiners, they need to retrieve
authorized patents similar to those under examination and decide to grant them or not;
for patentees, patent retrieval can find patents that infringe their own patents, so that
they can file patent invalidation lawsuits to safeguard their rights and interests; and for
enterprises, patent retrieval can help them understand the R&D direction of competitors
or support decisions on technology trade.
Different from web search, patent retrieval is difficult for the following reasons:
patents contain a large number of professional terms that are not easily recognized by
Chinese word segmentation tools, and some patent applicants, for the purpose of
technical protection and of expanding the scope of patent protection, intentionally use
hyponyms, vague words or even custom words to replace the original concepts in order
to reduce the probability of the patent being retrieved.
As a sub-task of patent retrieval, TCM patent retrieval is often more difficult than
patent retrieval in other fields, mainly for the following reasons.
Firstly, aliases of traditional Chinese medicines are particularly common. Because
traditional Chinese medicine has been passed down for thousands of years, with
numerous varieties, complex sources and extensive production areas, and has been
influenced by factors such as written errors, regional dialects and usage habits, one herb
often has several different names and one name may refer to several different herbal
medicines. For example, ginseng has more than ten aliases, such as Jilin Ginseng,
Yishan Ginseng, Shizhu Ginseng, Shencao, Dijing, Tujing and so on, while Sanqi,
Tudahuang, Hutoujiao, Diburong and other Chinese herbal medicines all share the alias
Jinbuhuan.
Secondly, due to historical reasons and writing habits, there are a large number of
disease synonyms in TCM patents. For example, disease “catch a cold” can be
expressed as “feng han” or “shou liang”, “suffer from indigestion” can be expressed as
“xiao hua bu liang” or “ji shi” and so on.
All the aspects above may cause traditional keyword-based TCM patent retrieval to
miss a large proportion of relevant patents. In traditional keyword-based patent retrieval,
the searcher enters a query and the retrieval system returns the patents containing the
requested keyword, without any semantic content involved. This kind of search misses
many patents that do not match the keywords but are semantically similar to them. In
patent retrieval targeting full recall, the omission of a patent may have serious
consequences, the most direct of which are repeated R&D and patent infringement
litigation.
At present, the mainstream TCM patent retrieval systems and platforms, such as
SIPO, CNIPR, Patent Star and SooPAT [2], are all built on traditional keyword-based
retrieval methods, without the function of semantic retrieval, and the results are not
satisfactory. If we need to get a more comprehensive retrieval result on the premise of
ensuring the accuracy rate, that is, to achieve a higher recall rate, it is necessary to
execute queries repeatedly with different search terms. This requires not only strong
retrieval skills, but also familiarity with the field of TCM. The lack of query semantics
in TCM patent retrieval systems indirectly dampens the enthusiasm of inventors in the
field of TCM and slows the development of TCM innovation in China.
The characteristics of high investment, high risk and long period in the R&D of
TCM make it necessary to search patents in advance. The correctness and compre-
hensiveness of the search results are very important. Aiming at the lack of semantics in
the current TCM patent retrieval systems, and in order to realize intelligent TCM patent
retrieval, this paper will study the semantic similarity calculation between words in
TCM patent texts based on deep learning.

2 Related Work

Patent retrieval is quite different from common web search. Patent texts have some
outstanding characteristics, such as long text, semi-structured, numerous metadata,
vague expression, multi-graph and multi-language. These characteristics bring great
challenges to patent retrieval. In recent years, patent retrieval has been studied as an
important subject in the field of information retrieval at home and abroad [3, 4]. Since
2002, the National Institute of Information Science of Japan has set up a special
symposium on patent retrieval and published patent test data sets for English and
Japanese. CLEF (Cross Language Evaluation Forum) is an open evaluation platform
for information retrieval in European languages. Since 2009, CLEF-IP, a symposium
on patent retrieval, has been set up. In China, most of the research on patent retrieval
focuses on English patent retrieval. Liu Bin [5], Xu Kan [6], Yao Hongxing [7] and
other scholars have done a lot of useful research on patent retrieval, infringement
retrieval and knowledge-based patent retrieval.
There are many methods to calculate the semantic similarity between words:
ontology-based methods, WordNet-based methods [8, 9] and HowNet-based methods
[10, 11]; knowledge-base-based methods, such as those using Wikipedia [12, 13] or
Baidu Encyclopedia; and synonym-based methods using all kinds of synonym lists.
Since ontologies, knowledge bases and synonym lists cannot contain all concepts and
words, these three kinds of methods have limitations.
In the era of big data, all walks of life have accumulated huge amounts of data,
including large amounts of text data. In fact, these texts have implied the context and
semantic relationship between words. In order to reveal the relationship, an effective
method is to use deep learning technology to discover the semantic relationship in
texts.
As a novel word representation method, the distributed word vector model Word2Vec
[14] has become a core technology for applying deep learning to natural language
processing since it was proposed by Tomas Mikolov in 2013. Word2Vec overcomes the
high dimensionality and high sparseness of the traditional bag-of-words representation.
Through the training of a neural network, it uses low-dimensional, dense real-valued
vectors to represent words, which has greatly promoted the development of deep
learning in natural language processing. The low-dimensional word vectors trained from
a corpus can not only be used conveniently as inputs of neural networks, but also encode
the semantic relationships between words.

By utilizing distributed word vector technology in deep learning, this paper builds
and trains a neural network model on a large TCM patent corpus to discover the
semantic similarity between words in TCM patents.

3 Our Method

In order to find the semantic similarity between words and concepts in TCM patents,
this paper constructs a neural network model, which is essentially a probabilistic
language model built on the distributional hypothesis that the semantics of a word are
determined by its context. The workflow of our method is shown in Fig. 1.

Fig. 1. Workflow of the method: collect TCM patent texts and a list of stop words in TCM patents; perform Chinese word segmentation and remove stop words; use the resulting word sequences as input vectors of the Word2Vec neural network model; use the trained model to compute the semantic similarity between Word1 and Word2.

The details are as follows:
1. Prepare the corpus. The larger the corpus, the more comprehensive the vocabulary it
contains and the more semantic relationships between words can be found.
2. Collect stop words in TCM patents.
3. Preprocess the original corpus to obtain a corpus suitable for training the neural
network model. Preprocessing includes segmenting Chinese words with a word
segmentation tool and removing stop words, which include various punctuation
symbols.
4. After Chinese word segmentation and stop-word removal, the word sequences
obtained are used as input vectors of the Word2Vec neural network model. The model
is trained, and one of its by-products is the distributed vector representation of each
word.
5. Words are represented by distributed vectors, so the similarity between any two words
can be calculated from the distance between their vectors.
476 N. Deng et al.

The semantic similarity of any two words is negatively correlated with the distance
between their vectors: the smaller the distance, the higher the semantic similarity
between the two words; conversely, the larger the distance, the lower the semantic
similarity.
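As a concrete illustration (not taken from the paper), the distance typically used with Word2Vec vectors, and the measure underlying gensim's similarity method, is the cosine distance; a minimal sketch with hypothetical low-dimensional vectors:

    import numpy as np

    def cosine_similarity(v1, v2):
        # 1.0 means the vectors point in the same direction (high semantic similarity).
        return float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))

    # Two hypothetical word vectors (real Word2Vec vectors would be e.g. 64-dimensional).
    w1 = np.array([0.2, -0.1, 0.7, 0.3])
    w2 = np.array([0.25, -0.05, 0.6, 0.4])
    print(cosine_similarity(w1, w2))        # similarity, close to 1.0 here
    print(1.0 - cosine_similarity(w1, w2))  # the corresponding cosine distance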

3.1 Collection of Stop Words in TCM Patents


Stop words refer to words that appear frequently in documents but carry no substantial
meaning and do not help to distinguish one document from another. In text analysis and
mining, a very important step is to remove stop words, which helps to improve accuracy.
At present, people usually use general-purpose stop-word lists, which have been
collected gradually in the course of research work and have reached a consensus.
Some research organizations have published their own stop-word lists, such as the
stop-word list of the Harbin Institute of Technology, the stop-word list of the Machine
Intelligence Laboratory of Sichuan University, and the stop-word list of Baidu Corp.
However, these general stop-word lists are intended for ordinary texts, not for a
specific domain. If they are used for text analysis in a specific field, they will not give
satisfying results.
TCM patents cover two fields: traditional Chinese medicine and patents. Therefore,
TCM patents contain many domain terms, including professional terms from the field
of Chinese medicine and professional terms from the field of patents. Obviously, the
stop words that have no real effect on analysis and mining also include words from these
two fields, so a general stop-word list is not suitable for TCM patents.
In order to collect stop words suitable for TCM patents, the simplest, most direct and
effective method is adopted here: filtering high-frequency words. The concrete steps are
as follows:
1. Collect a large number of TCM patent texts;
2. Segment words for these texts;
3. Sort segmented words according to word frequency in descending order.
4. Manually filter the top N words to retain high-quality stop words.
This paper collected 363 stop words from 7000 TCM patent texts, and source code
in Python is shown in Fig. 2.

Fig. 2. Source code of collecting stop words in TCM patents.
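The code of Fig. 2 is not reproduced here; a minimal sketch of the high-frequency filtering described above, assuming the jieba segmenter and a hypothetical directory of plain-text patent files (the paths and the cut-off of 500 candidate words are illustrative, not the paper's values):

    import glob
    from collections import Counter

    import jieba

    counter = Counter()
    for path in glob.glob("tcm_patents/*.txt"):      # hypothetical corpus directory
        with open(path, encoding="utf-8") as f:
            counter.update(jieba.lcut(f.read()))     # segment and count word frequencies

    # The top candidates are then filtered manually to retain high-quality stop words.
    for word, freq in counter.most_common(500):
        print(word, freq)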

Part of stop words is shown in Table 1.

Table 1. Part of stop words.

3.2 Training of Word2Vec Model


The excellent feature learning ability of deep learning can discover the essential
characteristics of massive data and the hidden relationships within it. The Word2Vec
neural network is considered a useful tool for applying deep learning to natural language
processing. One by-product of Word2Vec neural network training is the ability to
represent words as short dense vectors, which encode the semantic similarity between
words.

In this paper, after Chinese word segmentation and stop-word removal, the word
sequences obtained from the original TCM patent texts are used as the input of the
Word2Vec neural network. When training completes, we obtain the vector representation
of each word. The Python source code for training the model is shown in Fig. 3.

Fig. 3. Source code of training model.

In the code in Fig. 3, train_file_path stores the word sequence obtained after Chinese
word segmentation and stop-word removal, with words separated by spaces;
save_model_file_path stores the finally generated neural network model;
gensim.models.Word2Vec sets the running parameters of the neural network model, in
which size is the length of the word vectors, min_count means that words whose
frequency is lower than this value will be discarded, workers is the number of concurrent
training threads, and iter is the number of training iterations.
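A minimal training sketch consistent with the parameters described above (gensim 3.x style, where the arguments are named size and iter; the file paths and parameter values are hypothetical, not necessarily those of Fig. 3):

    from gensim.models import Word2Vec
    from gensim.models.word2vec import LineSentence

    train_file_path = "tcm_patent_words.txt"    # one space-separated word sequence per line
    save_model_file_path = "tcm_word2vec.model"

    model = Word2Vec(LineSentence(train_file_path),
                     size=64,      # length of the word vectors
                     min_count=5,  # discard words rarer than this
                     workers=4,    # number of concurrent training threads
                     iter=10)      # number of training iterations
    model.save(save_model_file_path)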

3.3 Semantic Similarity Calculation


After the training of the Word2Vec model, words can be represented as distributed
vectors. These vectors already encode the semantic relationships between words, so we
can calculate the semantic similarity between two words from the distance between their
vectors. The code is shown in Fig. 4.

Fig. 4. Source code of calculating semantic similarity.

In order to calculate the semantic similarity between two words such as the terms
meaning "detoxification" and "relieving a cough", the trained model is loaded, the
similarity between the two words is calculated with the method similarity, and the
Top N words most similar to a given word are obtained with the method most_similar.
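A minimal sketch of loading the trained model and querying it in this way (the model path matches the hypothetical one above, and word_a and word_b stand in for the Chinese terms used in the paper):

    from gensim.models import Word2Vec

    model = Word2Vec.load("tcm_word2vec.model")
    word_a = "word_a"   # placeholder for the first Chinese term
    word_b = "word_b"   # placeholder for the second Chinese term
    print(model.wv.similarity(word_a, word_b))    # semantic similarity of the two words
    print(model.wv.most_similar(word_a, topn=5))  # Top 5 words most similar to word_a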

4 Experiments

Using 7,000 TCM patents as the original corpus, the abstract texts were segmented into
words and the stop words were removed, and a Word2Vec neural network was
established. Table 2 shows part of the experimental results of calculating the similarity
between some word combinations.

Table 2. Similarity between some word combinations.

Table 3. Similarity between some word combinations.



We can also find the Top N words most similar to a specific word. The experimental
result is shown in Table 3 (N = 5).
It can be seen that the method can calculate not only the semantic similarity between
Chinese herbal medicine names in TCM patents, but also the semantic similarity between
effect words and other words.

5 Conclusion

In view of the lack of semantics in the current retrieval system of TCM patents, this
paper proposes a method for computing the semantic similarity of words based on deep
learning through Word2Vec technology. This method can not only calculate the
semantic similarity between Chinese herbal medicines names in TCM patents, but also
calculate the semantic similarity between effect words, as well as other words.
Semantic similarity in TCM patents can provide query expansion and other support for
intelligent retrieval of TCM patents.

Acknowledgments. This work was supported by National Key Research and Development
Program of China under Grant 2017YFC1405403; National Natural Science Foundation of China
under Grant 61075059; Philosophical and Social Sciences Research Project of Hubei Education
Department under Grant 19Q054; Green Industry Technology Leading Project (product devel-
opment category) of Hubei University of Technology under Grant CPYF2017008; Research
Foundation for Advanced Talents of Hubei University of Technology under Grant BSQD12131;
Natural Science Foundation of Anhui Province under Grant 1708085MF161; and Key Project of
Natural Science Research of Universities in Anhui under Grant KJ2015A236.

References
1. Feng, L., Peng, Z.Y., Liu, B., et al.: A latent-citation-network based patent value evaluation
method. J. Comput. Res. Dev. 52(3), 649–660 (2015)
2. Xu, N.Y.: A brief introduction to China's major patent retrieval databases. China Invention
and Patent, vol. 9, pp. 35–37 (2014)
3. Zhang, L., Liu, Z., Li, L., et al.: PatSearch: an integrated framework for patentability
retrieval. Knowl. Inf. Syst. 57, 135–158 (2018)
4. Shalaby, W., Zadrozny, W.: Toward an interactive patent retrieval framework based on
distributed representations. In: The 40th ACM International SIGIR Conference on Research
and Development in Information Retrieval. ACM, New York (2018)
5. Liu, B., Feng, L., Wang, F., et al.: Patent search and analysis supporting technology
innovation. J. Commun. 37(3), 79–89 (2016)
6. Xu, K.: Research on query expansion of patent retrieval. Doctoral Dissertation of Dalian
University of Technology (2017)
7. Yao, H.X.: Research on method of product innovative design based on patent knowledge.
Master Dissertation of Zhejiang University (2016)
8. Zhao, Z., Yan, J., Fang, L., et al.: Measuring semantic similarity based on wordnet. In: Web
Information Systems & Applications Conference. IEEE, Piscataway (2009)
9. Zhang, X., Sun, S., Zhang, K.: An information content-based approach for measuring
concept semantic similarity in WordNet. Wirel. Pers. Commun. 3, 1–16 (2018)

10. You, B., Liu, X.R., Ning, L., Yan, Y.S.: Using information content to evaluate semantic
similarity on HowNet. In: The Eighth International Conference on Computational
Intelligence & Security. IEEE, Piscataway (2013)
11. Dai, L., Liu, B., Xia, Y., et al.: Measuring semantic similarity between words using HowNet.
In: International Conference on Computer Science & Information Technology. IEEE,
Piscataway (2008)
12. Jiang, Y., Zhang, X., Tang, Y., et al.: Feature-based approaches to semantic similarity
assessment of concepts using Wikipedia. Inf. Process. Manag. 51(3), 215–234 (2015)
13. Shirakawa, M., Nakayama, K., Hara, T., et al.: Wikipedia-based semantic similarity
measurements for noisy short texts using extended Naive Bayes. IEEE Trans. Emerg.
Top. Comput. 3(2), 205–219 (2015)
14. Mikolov, T., Chen, K., Corrado, G., et al.: Efficient estimation of word representations in
vector space. Computer Science (2013)
The Design and Development
of Assistant Application for Maritime Law
Knowledge Built on Android

Jun Luo(&), Ziqi Hu, Qi Liu, Sizhuo Chen, Peiyong Wang,


and Na Deng

Hubei University of Technology, Wuhan, China


2233524221@qq.com

Abstract. With the continuous construction and development of China's marine
industry, the protection of marine rights has become a crucial issue. Usually, knowledge
of maritime law is oriented toward legal practitioners, and the general public does not
know much about it. Most maritime cases arise mainly because the responsible persons
have weak legal awareness and a poor understanding of marine laws. Therefore, it is
very important to develop a knowledge assistant software which can popularize marine
law and reduce the probability of illegal cases. Drawing on the abundant maritime
judgment cases and relevant professional knowledge available on the Internet, this
paper designs and develops a maritime legal knowledge assistant software built on the
Android platform. Through an advanced mobile service mode, it can not only provide
timely, convenient, accurate and abundant information services for maritime
practitioners and functional departments such as maritime courts, but also promote the
reform of the maritime judgment system.

1 Background

This project comes from "Research and Development of Law Enforcement Decision
Support System for Reef and Marine Structures", a key special project of the national
key R&D plan "Marine Environment Safety Guarantee". The project is an important
measure taken after the country made considerable progress in infrastructure, platform
equipment and instruments related to the protection of maritime rights. As a
demonstration project for enhancing law enforcement capability for the protection of
marine rights, it is of great significance for better carrying out future law enforcement
activities for the protection of marine rights and for winning the struggle for the
protection of marine rights.
At present, the category judgment system used by Chinese courts provides many
cases of poor quality, lacking authority and of limited guiding significance [1].
Therefore, it is imperative to establish an application system covering maritime
disputes, maritime rights protection, case retrieval and case judgment. In the era of
artificial intelligence and big data, a large number of legal case texts have accumulated
in the process of law enforcement for marine rights protection. These texts contain
important information, which can not only provide decision support for law

enforcement for marine rights protection, but also, to a large extent, further enhance the
fairness and impartiality of the work of law enforcement and judicial personnel engaged
in marine rights protection.
Using advanced information technology and Internet technology to develop a
knowledge assistant software focusing on knowledge sharing, knowledge exchange
and knowledge push will surely become an important instrument for China's marine
law governance.

2 Related Work

Domestic research on knowledge services is mainly concentrated in universities,
laboratories and other research institutions and is mainly pure theoretical research [2];
less attention has been paid to developing assistant software that keeps up with the trend
of the times. For example, the first platform of an ocean publishing house specializing
in ocean digital publishing is the largest ocean digital electronic publishing platform in
China, but its mode of service dissemination is limited to the PC side, and support for
the now widely used mobile terminal is still at a low level [3].
Foreign research, in contrast, pays more attention to the application and practice of
knowledge service software [2]. During the development of knowledge services,
system frameworks including knowledge service system models, knowledge service
theory and DBMS-based knowledge service architectures have been put forward,
forming an important foundation for developing knowledge service technologies and
systems [4].
The purpose of this project is to study the basis of judgment in maritime man-
agement cases. From the point of view of software services, and based on various
development models, a knowledge assistant software for maritime law is designed and
developed, which focuses on knowledge sharing, knowledge exchange and knowledge
push.
The main contents and process of the research, design and development of this
subject are as follows:
(1) Feasibility analysis
(2) In-depth requirement analysis
(3) Development and implementation of the software functions, adopting an agile
development mode to quickly produce the software needed by the project.

3 System Requirement Analysis

3.1 Overall System Requirements


The Maritime Legal Knowledge Assistant Software Project requires the design and
development of a mobile software APP which supports the application of knowledge
base of marine legal cases. The application environment is Android 9.0 and above, and
the hardware environment is Huawei honor8 64 GB; RAM: 4 GB.

The users served by the maritime legal knowledge assistant software include law
enforcement personnel of maritime departments of governments at all levels, personnel
of enterprises and units related to marine resources development, scientific research and
production personnel, and so on.
The maritime legal knowledge assistant software integrates a marine legal
knowledge base and maritime legal cases, providing abundant marine knowledge and
relevant legal cases for users to learn and use.

3.2 System Performance Requirement Analysis

1. Performance requirements
The response time of case knowledge base fragment search and query should not
exceed 1 s in general (in the environment of 4G operator network). Online search and
query response time should not exceed 3 s.
2. Data capacity requirements
Generally speaking, it is only related to the local hardware capacity and is not
limited by the software.
3. Requirements for working mode
The working mode of the system in different network operating environment
should be different. In the networked state, local and cloud data and information
synchronization should be supported, and the data obtained from the Internet should be
updated in time. In order to improve the user’s efficiency and experience, the operating
interface should be brief, intuitive and easy to operate, and the page depth should not
exceed three layers.
4. Security requirements
The security protection of user’s account information and content resources is the
key consideration of this application system. Through the application of encryption
algorithm, the user data is encrypted to ensure the user’s account security and privacy
information is not leaked. By establishing key distribution and key agreement proto-
cols, the security of user accounts is greatly improved on the premise that users are as
convenient as possible [5].

3.3 System Functional Requirements


Based on the simulation and analysis of the target users and specific use scenarios, the
important functional requirements of the current version of the legal knowledge
assistant for maritime cases mainly focus on the browsing and searching of the legal
knowledge base and case base, my collection, knowledge notes, content synchro-
nization and other major functional requirements. Each main function includes a series
of corresponding functions.

1. Browse and Search Function


Relying on the Internet platform, the maritime legal knowledge assistant software
forms a knowledge base by grabbing abundant legal cases related to maritime matters.
Because the software is limited to academic exchange, this does not constitute
infringement. By searching semantically labeled keywords and matching them quickly,
the search speed can be improved [6].
2. Download Function
The download mode of FTP improves the efficiency of download. Once down-
loaded locally, users can read relevant cases offline.
3. Cloud Synchronization
Synchronization uses FTP with breakpoint resumption, which improves the efficiency
of content synchronization [7]. Under the same account, data backed up in the cloud can
be quickly restored if it is accidentally deleted or the local database is lost [7].
4. Personal Account Management Function
Personal account management function provides a series of functions such as
registered account, system version description, feedback, and about us. Users can
synchronize local and cloud content through accounts.
5. Adding Note Function
A simple TXT file is created locally so that content can be added quickly, making it
convenient for users to record their reading notes.
6. Collection Case Function
When users find cases they like, they can bookmark them in time to avoid losing the
relevant cases due to content updates.

4 System Overall Design

4.1 The Overall Structures Design of the System


The system adopts the overall structure design of the local cloud-adding end, as shown
in Fig. 1.

Fig. 1. The overall structure design of the software

4.2 Design of System Functional Structure


The overall functional structure of the system includes the login interface, the collection
management interface, the note management interface and the user management
interface. The login interface group handles user registration, login, password retrieval
and other operations. Personal user management mainly supports the maintenance of
personal information, password modification and other operations, as shown in Fig. 2.

Fig. 2. Functional structural design of software

4.3 Detailed Design and Implementation of the System

1. Agile Development Model


Software architecture plays a key role in software development. In traditional
software development, the software architecture is difficult to change, and adjusting it
comes at a large cost [8]. Agile development, by contrast, is a lightweight,
change-embracing, fast-response development approach [8].
Agile development is a human-centered, iterative, step-by-step development method
organized as a series of small waterfalls [9]. It focuses more on the role of people,
emphasizing individual and team collaboration and self-organization, rapid delivery and
demonstration of value through short iterations, continuous customer participation and
feedback, and rapid response to change [9].

Agile development is also a lightweight development approach that delivers


incremental value over multiple iterations through one or more small cross-functional
teams. Agile development manages uncertainty and embraces change through iteration
and rapid user feedback, so this project adopts an agile development model.
2. System Synchronization Function Realization
The maritime legal knowledge assistant software provides abundant case texts and
offers different legal cases for people with different reading interests. Therefore, strong
technical support is needed to achieve online remote access to and browsing of the
databases on mobile terminals.
The assistant software browses and searches relevant legal cases through keyword
search. On the cloud side, important information in the case texts, such as the time, the
place and the personnel involved in each case, has been semantically labeled. This
greatly reduces the time the software spends on keyword search and matching, thus
improving its performance.
As shown in Fig. 4, when the user clicks on the relevant case content on the mobile
side, the content is automatically adapted and matched by the browser.
3. Download of Knowledge Base of Maritime Cases
(1) Case of database capture and download
The content of the case texts cannot be downloaded directly to the mobile side
through the Internet platform, so background maintenance personnel need to update
the background database regularly. The background database is expanded by
grabbing extensive case texts from the Internet, so the process of grabbing
sea-related case texts is introduced as part of this function; a minimal sketch of
such a grabbing step is given after Fig. 4.
The grabbing flow chart for the text content of a sea-related case is shown in Fig. 3.
(2) Users use software assistants to download the text of sea-related cases
One of the key functions of the system is downloading the text of sea-related
cases through the software assistant. After downloading to the mobile terminal,
users can browse the case content offline. The knowledge base of the software
system is organized and stored in XML format, and the output formats include
html, txt and other common file formats. The process of downloading a case text
is divided into the following steps:
(1) User Input Find Keyword: After the user enters the keywords, the system
defines the scope and content of the search according to the content input
by the user.
(2) Initialization Download Model: The system automatically sets the state
variables needed for software download and initializes the download state.
(3) Download and browse abbreviated data: After matching the keywords, the
system displays the matched content in the preview section and marks the
keywords in red, providing users with a clear directory and view.
(4) Download Case Text Content: Processing of the downloaded case texts
includes batch download of files, with the current download progress shown
as a percentage and a progress bar.
(5) Download Failure and Abnormal Situation Handling: When a download
exception occurs, the download is stopped and an error prompt code is
automatically generated and uploaded. The download is restarted after
checking that the user's usage environment is normal.

Fig. 3. Grabbing flow chart for text content of sea-related cases

Fig. 4. Software assistant download flow chart of case text concerning sea
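As referenced above, a minimal Python sketch of one grabbing step, in the spirit of the flow in Fig. 3; it is not the project's actual crawler, and the URL and CSS selector are hypothetical placeholders that depend on the target site:

    import requests
    from bs4 import BeautifulSoup

    def fetch_case_text(url):
        resp = requests.get(url, timeout=10)
        resp.raise_for_status()
        resp.encoding = resp.apparent_encoding        # many Chinese sites are not served as UTF-8
        soup = BeautifulSoup(resp.text, "html.parser")
        node = soup.select_one("div.case-content")    # hypothetical selector for the judgment text
        return node.get_text("\n", strip=True) if node else ""

    if __name__ == "__main__":
        text = fetch_case_text("https://example.com/maritime-case/123")  # placeholder URL
        print(text[:200])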

4. Cloud Synchronization Function


The cloud synchronization function of the maritime legal knowledge assistant
exchanges locally stored data with the cloud: user data stored locally is synchronized to
the cloud and, according to the user's needs and a suitable network environment, data
is synchronized from the cloud back to the terminal device. The software system adopts
a local-plus-cloud synchronization management mode. The user's information
(collections, notes and personal information) is stored in two versions at the same time,
a local storage version and a cloud storage version. Normally, the version saved in the
cloud is used as the benchmark.
The process of cloud synchronization in this system is divided into five steps:
initialization of state before cloud synchronization, synchronization of local data,
removal of local data, download of data and termination of synchronization. The
process is shown in Fig. 5.

Fig. 5. Systematic synchronization flow chart

(1) State initialization


The system automatically sets and matches the state variables needed for software
synchronization, and initializes the synchronization state.
(2) Synchronizing local data
Upload local content (collection, notes and account information) to the cloud, mark
local content that is not yet synchronized, and wait for the next synchronization.

(3) Clear up local data


Before downloading cloud data, remove local content to avoid content confusion
and excessive storage redundancy caused by data duplication.
(4) Download data
The system downloads the cloud data to the terminal in the form of files and shows
the current progress with a progress bar and a percentage. When a download exception
occurs, the download is stopped and an error prompt code is automatically generated
and uploaded; the download is restarted after checking that the user's usage environment
is normal.
(5) End of synchronization process
Exit Cloud Synchronization Process Activities.
5. Personal Account Management Function
Personal user management function is a basic function that can maintain personal
information, including the modification of user password, the modification of user
name, the addition and deletion of user information and so on.
6. Adding Note Function
Note function should include: note addition, deletion, editing, viewing and other
functions. Figure 6 shows the location of the note function in the application, so that
users can record their ideas anytime, anywhere.

Fig. 6. Software added notes and collection function display

7. Collection Case Function


The collection function of software should include: adding collection, canceling
collection, viewing collection and other functions. Figure 6 shows the location of the
collection function in the application, facilitating users to add and manage the collected
case text content.
8. Design of Database Function
The entity-relationship model (E-R model) will be used in the design of database
functions. The software PowerDesigner is used to convert the database model into
TABLE and VIEW. Operating and managing the database with SQL Server effectively
supports the safe and efficient storage and manipulation of data, so as to meet the
requirement of processing user information efficiently.

The system maintains a database related to sea-related cases. The database contains
all kinds of tags after semantic annotation of case texts and all relevant laws and
regulations related to the basis of judgment of maritime cases. Through natural lan-
guage processing, the system automatically matches the keywords retrieved with the
database of maritime legal cases, and finally pushes them to users according to the
ranking of similarity from high to low.
From the point of view of system requirement, the conceptual model of the system
is established through the entity relationship model (E-R model) as shown in Fig. 7.

Fig. 7. E-R diagram of system database design

5 System Testing

5.1 System Test Environment


The main body of the experimental code is written in Java, with Python used to grab
the content of the sea-related case texts. The physical host has 8 GB of memory, an
Intel(R) Core(TM) i7-8565U CPU @ 1.80 GHz (1.99 GHz) and a 512 GB hard disk.
The development environment is 64-bit Windows 10 with the Android SDK and
JDK 1.8.0.

5.2 Software Test


Software testing is divided into unit testing, integration testing, confirmation testing,
white-box testing, black-box testing, software debugging and other stages.
The principle and purpose of software testing is to execute a program in order to
discover errors in it. A good test plan is likely to find errors that have not been found
before. The testing process should start with small-scale tests and gradually move to
large-scale tests: usually a single program module is tested first, then the focus turns to
finding errors in the integrated clusters, and finally errors are sought in the whole
system [10].

Fig. 8. Main interface of software system. Fig. 9. Application interface of the second-layer software system.

Because the agile development mode is adopted in this project, the performance testing
of each iteration version must not only cover the new features of the current iteration,
but also ensure through regression testing that existing functions are not affected.
Compatibility testing is carried out in all possible environments, and the stability and
performance of the system are tested. Performance testing is mainly divided into offline
and online performance testing, in which the stability of the software system is tested
under different network environments. System security testing mainly focuses on
penetration testing and injection attacks against user account information and content
resources. By verifying the security of the user's account information and content
resources, safe use by the user can be ensured.

6 Conclusion
1. The main interface test results of the system are shown in Fig. 8. The main interface
is divided into four parts: maritime case push, collection management, note man-
agement and personal account information management. The tested main interface has
a well-designed UI.
2. The results of the second-layer interface test of the system are shown in Fig. 9. The
user reaches this second-layer interface by searching for sea-related legal case texts or
by clicking on a case on the main interface. While the interface is concise and
attractive, it also allows the user to quickly write notes and bookmark cases while
reading the case text.

3. Due to time constraints, the time spent on software testing so far is insufficient, and
the testing process has not been completed as planned. At present, the software system
does not yet run smoothly and still has some bugs, but the main functional modules
have been written and have passed unit testing. The software will therefore be
maintained and upgraded at a later stage in order to achieve a complete maritime legal
knowledge software assistant.

Acknowledgments. This work was supported by Innovation and Entrepreneurship Project for
College Students in Hubei Province under Grant S201910500040; Philosophical and Social
Sciences Research Project of Hubei Education Department under Grant 19Q054.

References
1. Zuo, W.M.: How to realize classified judgment by artificial intelligence. China Legal Network,
vol. 2 (2018)
2. Jiang, B.N.: Research on the construction of institutional knowledge base in scientific
research management of Chinese Universities, Northeast Normal University, vol. 4 (2011)
3. Cheng, N.: Design and development of knowledge assistant software for ocean management,
Dalian University of Technology (2016)
4. Wang, W.J., Wu, G., et al.: A comparative study on the status quo of institutional knowledge
base construction at home and abroad. Nat. Lib. J. 31–35 (2010)
5. Stinson, D.R.: Cryptography: Theory and Practice, 3rd edn., pp. 330–332 (2009)
6. Qin, X.H., Hou, X., Zhao, X., et al.: An entity relation extraction algorithms. J. Beijing Univ.
Inf. Sci. Technol. Nat. Sci. Edn. 34(1), 64–67 + 98 (2019)
7. Xing, Z., et al.: Continuous transmission of breakpoints and multi-threaded download,
programmer, vol. 3 (2002)
8. Li, S.W., Wang, A.J., Tan, H.X., et al.: Software Architecture Design and Practice in Agile
Development, Computer and Information Technology, vol. 3 (2015)
9. Wang, A.J., et al.: Design and Practice of Software Architecture in Agile Development,
Henan University (2015)
10. van Vliet, H.: Software Engineering: Principles and Practice, 2nd edn. John Wiley & Sons,
New York (2000)
A Matrix Factorization Recommendation
Method Based on Multi-grained
Cascade Forest

Shangli Zhou1, Songnan Lv1, Chunyan Zeng1(&), and Zhifeng Wang2


1
Hubei Key Laboratory for High-efficiency Utilization of Solar Energy
and Operation Control of Energy Storage System,
Hubei University of Technology, Wuhan, China
cyzeng@hbut.edu.cn
2
Department of Digital Media Technology, Central China Normal University,
Wuhan, China

Abstract. The traditional collaborative filtering method based on matrix factorization
regards a user's preference as the inner product of the user's and the item's implicit
features, which limits its learning ability. Many studies have focused on using deep
neural networks to mine the interaction between implicit features, but the learning cost
of deep neural networks is large and the models lack interpretability. Therefore, a matrix
factorization recommendation method based on multi-grained cascade forest is
proposed. The inner product is replaced by a multi-grained cascade forest, which has a
deep structure but is not a deep neural network, to explore the interaction between users
and items. The method is evaluated on real-world data sets and performs well compared
with state-of-the-art methods.

1 Introduction

Collaborative filtering (CF) [1] is one of the most critical technologies in recommender
systems; it learns the interaction between users and items from their interaction
histories. Matrix factorization (MF) [2, 3], one of the most important collaborative
filtering methods, maps users and items into a latent space, represents each user and
each item by a latent vector, and finally uses a simple inner product to model the
interaction between users and items. A lot of research on recommender systems is
devoted to improving the performance of matrix factorization methods.
The traditional MF uses the inner product [4] to model the interaction between latent
vectors, which makes the matrix factorization method inevitably suffer from the
limitations of the inner product [3]. The inner product is a dimension-reducing
operation, which undoubtedly leads to the loss of interaction information, and such a
simple, fixed operation makes it difficult to capture the complex interactions between
users and items in a low-dimensional latent space. Therefore, it is difficult for matrix
factorization to further improve recommendation accuracy.


Recently, deep learning has been found to work well in various fields, and it has the
ability to approximate any continuous function [5], which has led many researchers to
focus on combining deep learning with recommender systems [6]. At present, most
recommender systems based on deep learning apply deep learning to feature
engineering and cannot improve performance at the level of the model kernel. Among
the studies that learn interaction relations with deep learning, most are based on deep
neural networks (DNN). For example, Salakhutdinov et al. [7] used a two-layer RBM
to model the ratings of users on items. Ouyang et al. [8] used a three-layer auto-encoder
to model the ratings. He et al. [9] proposed the implicit-feedback Neural Collaborative
Filtering (NCF) framework, which extends traditional matrix factorization to explore
the nonlinear relationship between users and items. Although deep neural networks
perform well in these methods, they require a large amount of data and are difficult to
put into practical use when labeling costs are high. Moreover, deep neural networks
have many hyper-parameters, and model performance depends heavily on their tuning,
which makes deep neural networks difficult to control.
In this paper, a matrix factorization recommendation method based on multi-grained
cascade forest is proposed. A feature embedding module is used to extract the latent
vectors of users and items, and a multi-grained cascade forest is then used to mine the
interaction relationships between the latent vectors. Experiments show that, compared
with DNN, the multi-grained cascade forest not only has fewer parameters and runs
faster, but also performs better.

2 Proposed Methods

We first introduce the matrix factorization recommendation method using multi-grained
cascade forest (GCF-MF) and explain how the interaction relationships between
implicit features are mined using a binary classification model. The overall process is
shown in Fig. 1.

2.1 Feature Embedding Module


This method focuses on improving the matrix factorization part of collaborative
filtering, so only the user and item IDs are used as input features, as shown in Fig. 1.
The user and item IDs are converted into sparse binary vectors by one-hot encoding as
the input of the model; a fully connected layer then embeds each sparse input vector
into a dense vector, which plays a role similar to matrix factorization. The embedded
features of users and items are then trained with GMF. After model training is
completed, the implicit features can be extracted from the model.
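As an illustration of the embedding and GMF step described above (a sketch in the spirit of He et al.'s GMF [9], not the authors' implementation; dimensions and initialization are hypothetical):

    import numpy as np

    num_users, num_items, dim = 943, 1682, 64
    rng = np.random.default_rng(0)
    P = rng.normal(scale=0.01, size=(num_users, dim))  # user embedding matrix
    Q = rng.normal(scale=0.01, size=(num_items, dim))  # item embedding matrix
    h = rng.normal(scale=0.01, size=dim)               # output-layer weights

    def gmf_score(u, i):
        # Multiplying a one-hot ID vector by the embedding matrix is just a row lookup.
        z = P[u] * Q[i]                                # element-wise interaction of latent vectors
        return 1.0 / (1.0 + np.exp(-np.dot(h, z)))     # sigmoid -> predicted preference

    print(gmf_score(0, 10))
    # After training, P[u] and Q[i] are the latent vectors handed to the GCForest stage.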

2.2 Multi-grained Cascade Forest


Multi-grained cascade forest (GCForest) is a decision tree ensemble method
proposed by Zhou et al. [10] as an alternative to deep neural networks. The GCForest
algorithm consists of multi-grained scanning and a cascade forest. Multi-grained scanning can

Fig. 1. GCF-MF model structure diagram (one-hot encoded user and item IDs pass through the feature embedding module with a GMF training layer; the extracted vectors go through multi-grained scanning with PRF and CRF forests and then a cascade forest of Layers 1 to N, whose averaged outputs give the prediction).

sense the context structure and spatial relationships in the input data [11], and the
cascade forest carries out layer-by-layer representation learning on the features. In this
paper, GCForest is used to replace the inner product operation in GMF: on the one hand,
multi-grained scanning is used to mine the spatial features of the latent vectors, and on
the other hand, the deep structure is used to further explore the interaction between the
latent vectors.
Multi-Grained Scanning
In order to explore the interaction relationship between the latent vectors, we need to
concatenate the two latent vectors to make it into one feature vector, as shown in Fig. 2.

Fig. 2. Multi-grained scanning structure diagram

Assume that the latent vectors of users and items are both 64-dimensional. After
concatenating the two latent vectors, the input vector of multi-grained scanning has
128 dimensions; a sliding window of length 32 is then used for scanning, producing
97 feature vectors of 32 dimensions each. Each feature vector extracted by the sliding
window is passed through a random forest to produce a two-dimensional class vector
(binary classification), giving 97 class vectors per forest. To increase model diversity
[12], a pair of random forests is usually used for classification: one is an ordinary
random forest, the other a completely random forest [13]. In total, 194 two-dimensional
class vectors are obtained from the original 128-dimensional vector through sliding
window scanning, and a 388-dimensional vector is obtained by concatenating them.
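The dimension arithmetic above can be checked with a small sketch. The snippet below is illustrative only: the forests are fitted on random dummy data just so the code runs, and ExtraTreesClassifier is used as a stand-in for the completely random forest; 128-dimensional input and window length 32 are the assumptions stated in the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier

rng = np.random.default_rng(0)
x = rng.normal(size=128)                       # concatenated user+item latent vector (64+64)
win = 32
windows = np.stack([x[i:i + win] for i in range(len(x) - win + 1)])   # (97, 32)

# Dummy training data so the forests can be fitted; in GCF-MF they would be
# trained on windows extracted from real (vector, label) pairs.
X_dummy = rng.normal(size=(200, win))
y_dummy = rng.integers(0, 2, size=200)
rf = RandomForestClassifier(n_estimators=30, random_state=0).fit(X_dummy, y_dummy)
crf = ExtraTreesClassifier(n_estimators=30, random_state=0).fit(X_dummy, y_dummy)  # "completely random" stand-in

# Each forest maps every window to a 2-dimensional class vector.
feats = np.concatenate([rf.predict_proba(windows).ravel(),    # 97 * 2 = 194 values
                        crf.predict_proba(windows).ravel()])  # + 194 values
print(windows.shape, feats.shape)              # (97, 32) (388,)
```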
In fact, we can use windows with different lengths to generate feature vectors of
different granularity, which will further enhance the mining of information by the
model. As shown in Fig. 1, GCForest uses sliding windows with lengths of [32, 64, 96]
to conduct multi-grained scanning, after which the scanned vectors will be input into
the subsequent cascade forest.
Cascade Forest
Similar to the layer-by-layer representation learning of DNNs, GCForest cascades
multiple layers, each composed of several random forests, to learn the input features
layer by layer, as shown in Fig. 3.

Fig. 3. Cascade forest structure diagram

Feature vector samples are input into each layer for training and generate class
vectors. Since the task is a binary classification problem, each random forest generates
a two-dimensional class vector. In Fig. 3, each layer of the cascade forest is composed
of different types of random forest to ensure diversity [12]; each layer contains 2
ordinary random forests and 2 completely random forests, so 8 enhanced features are
output. Finally, the original features and these enhanced features are concatenated as
the input of the next layer.
Each newly extended layer uses a validation set to verify the performance of the entire
cascade forest, and if there is no significant improvement in performance, the training
process is terminated. Such an adaptive way of determining model complexity lets the
model adapt to training data of different scales. The four two-dimensional class vectors
obtained from the last layer are averaged, and the class with the highest probability is
used as the predicted value for subsequent recommendation and evaluation.
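The layer-growing logic can be sketched as follows. This is an illustrative outline under assumed data shapes, not the authors' code; scikit-learn forests stand in for the PRF/CRF forests, and the real gcForest additionally uses cross-validated class vectors to limit overfitting.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.metrics import accuracy_score

def cascade_forest(X_tr, y_tr, X_va, y_va, max_layers=10, tol=1e-3):
    """Grow cascade layers (2 ordinary + 2 completely random forests) until the
    validation score stops improving; return the last layer's averaged output."""
    feats_tr, feats_va, best = X_tr, X_va, 0.0
    for layer in range(max_layers):
        forests = [RandomForestClassifier(50, random_state=i) for i in range(2)] + \
                  [ExtraTreesClassifier(50, random_state=i) for i in range(2)]
        proba_tr, proba_va = [], []
        for f in forests:
            f.fit(feats_tr, y_tr)
            proba_tr.append(f.predict_proba(feats_tr))   # (n, 2) enhanced features
            proba_va.append(f.predict_proba(feats_va))
        avg_va = np.mean(proba_va, axis=0)               # average of the 4 class vectors
        score = accuracy_score(y_va, avg_va.argmax(axis=1))
        if score <= best + tol:                          # no significant improvement: stop
            return avg_va
        best = score
        # concatenate the original features with the 8 enhanced features for the next layer
        feats_tr = np.hstack([X_tr] + proba_tr)
        feats_va = np.hstack([X_va] + proba_va)
    return avg_va

# toy usage on random data
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 388)); y = rng.integers(0, 2, 300)
pred = cascade_forest(X[:200], y[:200], X[200:], y[200:])
```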

3 Experiments

3.1 Dataset
The MovieLens dataset is one of the movie rating datasets commonly used to evaluate
collaborative filtering algorithms. We use the movielens-100k dataset, which contains a
total of 100,000 ratings of 1,682 movies by 943 users, each of whom has rated at least
20 movies.

In order to study modeling methods based on implicit feedback data [14], the rating
data of the original dataset is converted into implicit data: if a user has rated a movie,
the interaction is considered observed, otherwise it is not.

\[ y_{ui} = \begin{cases} 1, & \text{if interaction is observed;} \\ 0, & \text{otherwise} \end{cases} \tag{1} \]
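As a hedged illustration of Eq. (1), the conversion from explicit ratings to implicit interactions takes only a few lines; the file name and column layout assume the standard MovieLens-100k `u.data` format (user, item, rating, timestamp, tab-separated).

```python
import pandas as pd

# MovieLens-100k ratings file: user_id \t item_id \t rating \t timestamp
ratings = pd.read_csv("u.data", sep="\t",
                      names=["user_id", "item_id", "rating", "timestamp"])

# Eq. (1): any observed rating becomes a positive implicit interaction (y_ui = 1);
# unobserved (user, item) pairs are implicitly y_ui = 0.
implicit = ratings[["user_id", "item_id"]].assign(y=1)
print(implicit.head())
```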

3.2 Evaluation Methods


The Hit Rate (HR) and the Normalized Discounted Cumulative Gain (NDCG) [5] were
used as evaluation indicators for top-K recommendation tasks. For the top-K recom-
mendation task, HR directly evaluates whether items that the user interacted with in the
test set appear in the top-K recommendation list. NDCG accounts for the rank of the hit
items by assigning higher scores to items appearing earlier in the recommendation list.
The calculation formulas are as follows.

\[ HR@K = \frac{\text{Number of Hits}@K}{|GT|} \tag{2} \]

\[ DCG@K = \sum_{i=1}^{K} \frac{2^{r(i)} - 1}{\log_2(i+1)}, \qquad NDCG@K = \frac{DCG@K}{iDCG@K} \tag{3} \]

In Eq. 2, |GT| represents the total number of items in the test sets. In Eq. 3, DCG is the
discounted cumulative gain, which measures the ranking quality of the prediction list,
r(i) is the relevance of the item at rank i, and iDCG is the discounted cumulative gain of
the ideal ranking list.
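A minimal sketch of these metrics for the common leave-one-out setting (one held-out positive item per user, as in [9]) might look as follows; it is an illustration of Eqs. (2)–(3), not the authors' evaluation script.

```python
import numpy as np

def hr_ndcg_at_k(ranked_items, true_item, k=10):
    """ranked_items: item ids sorted by predicted score (best first);
    true_item: the single held-out positive item for this user."""
    topk = list(ranked_items[:k])
    if true_item not in topk:
        return 0.0, 0.0                      # no hit: contributes 0 to both metrics
    rank = topk.index(true_item) + 1          # 1-based position of the hit
    hr = 1.0
    ndcg = 1.0 / np.log2(rank + 1)            # (2^1 - 1)/log2(rank+1); iDCG = 1 for one relevant item
    return hr, ndcg

# usage: average over users; HR@10 is then Hits@10 / |GT|
scores = [hr_ndcg_at_k([5, 8, 2, 9], true_item=2, k=10)]
print(np.mean([h for h, _ in scores]), np.mean([n for _, n in scores]))
```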

3.3 Baselines
To verify its performance, GCF-MF has been compared with the following methods
in this field.
Item-KNN [15]: This is a standard item-based collaborative filtering method.
BPR [16]: Bayesian Personalized Ranking, which optimizes personalized ranking
tasks, is often used as a baseline for item-based recommendations.
GMF [9]: This is an extension of matrix factorization proposed in the NCF
framework to learn the nonlinear relationship between embedded vectors.

3.4 Parameter Setting


The advantage of GCForest is that it has fewer hyper-parameters to adjust compared
with other deep learning architectures. This paper mainly studies the influence of
embedded vector dimension and window settings of multi-grained scanning on the
model.

Determining the Embedded Vector Dimension

For the matrix factorization method, the dimension of the embedded vector is a very
important parameter. For the feature embedding module, we adopted the settings in [9],
setting the latent vector dimensions to [8, 16, 32, 64] for the experiments. The
performance of the GCForest-based matrix factorization recommendation method
under the different latent vector dimensions is reported in Table 1.
As shown in Table 1, the latent vector dimension has a great impact on
recommendation performance. It is important to note that a higher latent vector
dimension is not always better. In theory, the higher the dimension of the latent vector,
the more complex the interaction features that can be explored; experiments show,
however, that increasing the latent vector dimension can cause overfitting. It is therefore
necessary to select a suitable latent vector dimension to get the best out of the model.

Table 1. Performance of different latent vector dimensions

Evaluation indicator   dim = 8   dim = 16   dim = 32   dim = 64
HR@10                  0.819     0.813      0.836      0.824
NDCG@10                0.549     0.553      0.544      0.538

Window Settings of Multi-grained Scanning

The performance of multi-grained scanning in capturing spatial relationships depends
mainly on the length of the sliding windows. We set the embedded vector dimension of
the model to 32, because it works best in the previous experiments. Apart from the
no-sliding-window experiment, three sliding windows were used for each multi-grained
scanning, with a length ratio of 1:2:3.
As shown in Table 2, with an embedded vector dimension of d = 32 the concatenated
input vector has dimension D = 64. It can be seen that when the minimum sliding
window length is D/4, that is, when the window lengths are [16, 32, 48], the
recommendation performance is relatively good.

Table 2. Performance of different window settings when the latent vector dimension is 32

Evaluation indicator (d = 32)   [16, 32, 48]   [8, 16, 24]   [4, 8, 12]   No window
HR@10                           0.835          0.827         0.826        0.824
NDCG@10                         0.544          0.546         0.543        0.54

3.5 Performance Comparison


Figure 4 shows HR@10 and NDCG@10 of the models under different latent vector
dimensions; since Item-KNN does not involve latent vectors, its best result over
multiple experiments is used for comparison [17]. GCF-MF outperforms the other
three models. Although the embedded vectors of GCF-MF are trained by GMF, the
results show that GCF-MF mines the interaction relationships more effectively:
HR@10 and NDCG@10 increase by 13.9% and 14.2% on average, respectively, which
indicates that replacing the inner product with the multi-grained cascade forest
effectively improves performance.

Fig. 4. Performance comparison of the models under different embedded dimensions: (a) HR@10, (b) NDCG@10

4 Summary and Future Work

In this paper, we designed a matrix factorization recommendation method based on
multi-grained cascade forest and used a deep-learning-style method to model the
interaction relationships. The method embeds the latent vectors of users and items and
then uses the multi-grained cascade forest to model their interaction, extending the
traditional matrix factorization model to a deep model and offering a new way to study
deep-learning-based recommendation methods.

Acknowledgments. This research was supported by National Natural Science Foundation of


China (No. 61901165, No. 61501199), Science and Technology Research Project of Hubei
Education Department (No. Q20191406), Excellent Young and Middle-aged Science and
Technology Innovation Team Project in Higher Education Institutions of Hubei Province
(No. T201805), Hubei Natural Science Foundation (No. 2017CFB683), and self-determined
research funds of CCNU from the colleges’ basic research and operation of MOE
(No. CCNU18QN021).

References
1. Zhang, H., et al.: Discrete collaborative filtering. In: The 39th International ACM SIGIR
Conference. ACM (2016)
2. Koren, Y.: Factorization meets the neighborhood: a multifaceted collaborative filtering
model. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining, Las Vegas, Nevada, USA, 24–27 August 2008. ACM (2008)
3. He, X., et al.: Fast matrix factorization for online recommendation with implicit feedback
(2017)
4. Rendle, S.: Factorization machines. In: IEEE International Conference on Data Mining.
IEEE (2011)
5. Hornik, K.: Multilayer feedforward networks are universal approximators. Neural Netw. 2
(5), 359–366 (1989)
6. He, K., et al.: Deep residual learning for image recognition (2015)
7. Salakhutdinov, R., Mnih, A., Hinton, G.: Restricted Boltzmann machines for collaborative
filtering. In: Proceedings of the 24th International Conference on Machine Learning (ICML
2007), Corvallis, Oregon, 20–24 June 2007, pp. 791–798. ACM Press (2007)
8. Ouyang, Y., et al.: Autoencoder-based collaborative filtering (2014)
9. He, X., et al.: Neural collaborative filtering (2017)
10. Zhou, Z.H., Feng, J.: Deep forest: towards an alternative to deep neural networks (2017)
11. Lecun, Y., et al.: Gradient-based learning applied to document recognition. Proc. IEEE 86
(11), 2278–2324 (1998)
12. Zhou, Z.H.: Ensemble Methods - Foundations and Algorithms. Taylor & Francis (2012)
13. Breiman, L.: Random forests (2001)
14. Liang, D., et al.: Modeling user exposure in recommendation (2015)
15. Sarwar, B., et al.: Item-based collaborative filtering recommendation algorithms. In:
International Conference on World Wide Web 2001 (2001)
16. Rendle, S., et al.: BPR: Bayesian personalized ranking from implicit feedback (2012)
17. Hu, Y., Koren, Y., Volinsky, C.: Collaborative filtering for implicit feedback datasets. In:
2008 Eighth IEEE International Conference on Data Mining. IEEE (2008)
The 9th International Workshop on
Adaptive Learning via Interactive,
Cognitive and Emotional approaches
(ALICE-2019)
Multi-attribute Categorization of MOOC
Forum Posts and Applications
to Conversational Agents

Nicola Capuano1 and Santi Caballé2

1 Department of Information Engineering, Electric Engineering and Applied Mathematics,
University of Salerno, Via Giovanni Paolo II 132, 84084 Fisciano, SA, Italy
ncapuano@unisa.it
2 Faculty of Computer Science, Multimedia and Telecommunications,
Universitat Oberta de Catalunya, Rambla Poblenou 156, 08018 Barcelona, Spain
scaballe@uoc.edu

Abstract. Discussion forums are among the most common interaction tools
offered by MOOCs. Nevertheless, due to the high number of students enrolled
and the relatively small number of tutors, it is virtually impossible for instructors
to effectively monitor and moderate them. For this reason, teacher-guided
instructional scaffolding activities may be very limited, even impossible in such
environments. On the other hand, students who seek to clarify concepts may not
get the attention they need, and lack of responsiveness often favors abandon-
ment. In order to mitigate these issues, we propose in this work a multi-attribute
text categorization tool able to automatically detect useful information from
MOOC forum posts including intents, topics covered, sentiment polarity, level
of confusion and urgency. Extracted information may be used directly by
instructors for moderating and planning their interventions as well as input for
conversational software agents able to engage learners in guided, constructive
discussions through natural language. The results of an experiment aimed at
evaluating the performance of the proposed approach on an existing dataset are
also presented, as well as the description of an application scenario that exploits
the extracted information within a conversation agents’ framework.

1 Introduction and Related Work

In their short history, MOOCs have attracted learners from around the world and
gained notoriety at world-class educational institutions [1]. Nevertheless, for them to
reach their full potential, several technological and pedagogical issues remain to be solved. In par-
ticular, due to their scale, the involvement of instructors during delivery stages has to
be limited to the most critical tasks [2]. As argued in [3], this strongly affects the
assessment task, so that existing systems often resort to automated approaches and
techniques relying on students’ contribution, like peer assessment [4, 5]. Due to the
same reasons, teacher-guided instructional scaffolding activities may be very limited,
even impossible in MOOCs. As a matter of fact, the connection between teachers and
students tends to be a one-directional transfer of information.


The primary interaction tool provided by MOOCs is often a discussion forum,


where students and instructors can read and post messages. Learners resort to such a
collaboration tool to build a sense of belonging and to better understand the subject
matter. In many cases MOOC forums are full of reflections on student affect and
academic progress but they are often highly unstructured and may hinder rather than
promote the community [6]. On the other hand, it is practically impossible for
instructors to effectively monitor and moderate such forums. So, students trying to
clarify concepts may not get the attention they need, and the lack of responsiveness
often favors drop-out [7].
For these reasons, the automatic analysis of forum posts would offer valuable
information to the instructors for moderating and planning the interventions both within
the forums themselves and at a broader level of the course. In fact, several studies show
that instructors highly value the understanding of the activity in their forums [8].
However, in [9] it was discovered that students’ experiences in MOOC forums are not
appreciably affected by the presence or absence of (poor) instructor intervention.
To address these increasingly recurring needs, different approaches to the automatic
categorization of MOOC forum posts are emerging. In previous research on eLearning,
automatic text analysis, content analysis and text mining techniques have already been
used to extract opinions from user-generated content, such as reviews, forums or blogs
[10, 11]. In [6], authors have defined an instructional tool based on Natural Language
Processing (NLP) and machine learning that automatically detects confusion in forum
posts and have trained it on dataset composed of about 30,000 anonymized forum posts
from eleven online classes1.
In [12], Naive Bayes, Support Vector Machine (SVM) and Random Forest clas-
sifiers have been trained on labeled posts coming from a course to recognize sentiment
and urgency. Trained models have been then validated on different courses showing
low cross-domain classification accuracy. To consider biases among different courses,
in [13] a transfer learning framework based on a convolutional neural network and a
long short-term memory model has been proposed. The obtained results suggest that
models trained on a course forum to recognize sentiment can be adapted to other
courses by relying on a few labeled examples for model tuning.
Information gathered by automatic MOOC forum posts categorization can be used
in several ways. In [14] collective sentiment from forum posts is used to monitor
students’ trending opinions towards the course and major course tools, such as lectures
and peer-assessment. In the same paper, the authors observe a high correlation between
sentiment measured on daily forum posts and the number of students who drop out
each day. In [6], detected levels of confusion are used to map confused posts to
pertinent video-clips from course resources through information retrieval.
A research area that can also benefit from the analysis of forum posts is that of
conversation agents (chatbots): software programs engaging learners in guided, con-
structive discussions through natural language. The key objective of such agents is to
have productive conversational peer interactions, such as argumentation, explicit
explanation and mutual regulation, triggered and expanded by the agent intervention

1 https://datastage.stanford.edu/StanfordMoocPosts/

and contribution to collaborative activities. In the most common approaches [15, 16],
the agent intervention is generally triggered by the presence of keywords in forum or
chat posts. The introduction of automatic post categorization may be used to generate
more targeted and timely interventions so improving their overall effectiveness.
The aim of this paper is to present the first results of a research effort aimed at building a
multi-attribute text categorization tool for MOOC forum posts based on state-of-the-art
natural language understanding methods and tools. The tool is able to detect useful
information from forum posts including intent (the aim of the post), topics (main
learning domain concepts the post is about), sentiment (the affective polarity of the
post), confusion (level of confusion expressed by the post) and urgency (how urgently a
reply to the post is requested). An application scenario exploiting extracted information
with conversational agents is also provided.
The paper is organized as follows: after having described the adopted document
representation model as well as the proposed text categorization approach, we present
the results of an experiment aimed at evaluating the performance of the proposed
approach on an existing dataset of forum posts (the same used in [6]). Application to
conversational agents is then discussed, followed by conclusions and future work.

2 Text Categorization Approach

Text Categorization (TC) is the task of assigning predefined categories to free-text


documents based on the analysis of their content. Several methods based on machine
learning have been defined so far to automate such task [17]. Using a set of documents
with assigned class labels, such methods are able to create a model that can predict
classes of arbitrary documents. More formally, let $D = \{d_1, \ldots, d_n\}$ be a set of documents
(posts) to be classified and $C = \{c_1, \ldots, c_m\}$ a set of classes; TC aims at defining a
classification function $\phi$ as follows [18]:

\[ \phi : D \times C \to \{T, F\}, \qquad (d_i, c_j) \mapsto \phi(d_i, c_j) \tag{1} \]

which assigns the value T (true) if $d_i \in D$ is classified in $c_j \in C$, and F (false) otherwise.


TC algorithms adopt a vector representation of documents i.e. each document is
assigned values of a fixed, common set of attributes and it is then represented by a
vector of its attribute values, with the number of vector elements being the same for all
documents [19]. A simple and widely used vector text representation is the Bag Of
Words (BOW), in which each document $d$ is represented by a vector $\mathbf{d} = (w_1, \ldots, w_{|T|})$,
where $T$ is the set of terms appearing at least once in the training documents and each
$w_i \in [0, 1]$ represents how much the term $t_i \in T$ contributes to the semantics of $d$.
In the simplest implementation $w_i$ is the term frequency, i.e. the number of occur-
rences of the term $t_i$ in $d$. The tf-idf (term frequency–inverse document frequency)
function is often used as an alternative way to calculate term weights as follows:

\[ w_i = TF(t_i, d) \cdot \log \frac{|D|}{DF(t_i)} \tag{2} \]

where $TF(t_i, d)$ is the term frequency while $DF(t_i)$ is the document frequency of $t_i$, i.e. the
number of training documents in which $t_i$ appears at least once.
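A hedged sketch of this weighting, written directly from Eq. (2) rather than from any library's (slightly different) tf-idf variant, with toy tokenized posts as input:

```python
import math
from collections import Counter

docs = [["great", "course", "great", "videos"],
        ["confusing", "assignment"],
        ["great", "assignment", "deadline"]]          # toy tokenized posts

df = Counter(t for doc in docs for t in set(doc))      # DF(t): number of docs containing t

def tfidf(doc, corpus_size=len(docs)):
    tf = Counter(doc)                                  # TF(t, d): raw term counts
    return {t: tf[t] * math.log(corpus_size / df[t]) for t in tf}

print(tfidf(docs[0]))   # e.g. 'great' gets weight 2 * log(3/2)
```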
The main limitation of BOW is that it disregards context, grammar and even word
order. Such limitations can be overcome by more refined context-sensitive approaches
like Word Embeddings (WE). Within WE, words are represented by dense vectors
projecting them into a continuous vector space. The position of a word vector within
such space is learned from training documents and it is based on the words that
surround the word when it is used.
WE are able to capture semantic similarities between words: words with similar
meanings have close vector representations. In [20] it was shown that semantic and
syntactic patterns can be reproduced using vector arithmetic e.g. by subtracting the
vector representation of the word “Man” from the vector representation of “Brother”
and then adding the vector representation of “Woman”, a result which is closest to the
vector representation of “Sister” is obtained. Within such model a document can be
represented as the sum or mean of word vectors of included terms [21].
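As an illustration of this document representation (mean of word vectors), here is a minimal sketch with a toy embedding table; in practice the vectors would come from a trained model such as the one learned by the cnn-we-ff architecture described below, and the toy values here are purely hypothetical.

```python
import numpy as np

# toy 4-dimensional word embeddings (hypothetical values)
emb = {"great": np.array([0.9, 0.1, 0.0, 0.2]),
       "course": np.array([0.1, 0.8, 0.3, 0.0]),
       "confusing": np.array([-0.7, 0.2, 0.5, 0.1])}

def doc_vector(tokens):
    """Represent a document as the mean of the word vectors of its known terms."""
    vecs = [emb[t] for t in tokens if t in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(4)

print(doc_vector(["great", "course"]))
```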
Once the weight vectors associated with training documents have been constructed
with BOW or WE, a classifier capable of evaluating the degree to which each new
document belongs to one of the available classes can be trained. Many machine
learning approaches exist to perform this task including decision trees, neural net-
works, Bayesian classifiers, and support vector machines. Among them, in this
paper we adopt a neural-network-based approach leveraging the SpaCy2 open
source Python framework for natural language understanding. In particular, two dif-
ferent architectures have been experimented with for MOOC post categorization (a
minimal training sketch is given after the list below):
• bow-ff: the BOW model is adopted for document encoding and a fully connected
2-layer feed-forward neural network is used for categorization;
• cnn-we-ff: a 4-layer convolutional neural network is used to learn word vectors
from the training documents, documents are then represented by averaging the word
vectors of the included terms, and a fully connected 2-layer feed-forward neural
network is used for categorization.
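As promised above, here is a minimal, hedged training sketch. It assumes the spaCy 2.x text-classification API (the version contemporary with this work); the labels, texts and hyper-parameters are illustrative placeholders, not the authors' actual configuration.

```python
import random
import spacy

# blank English pipeline with a bag-of-words text categorizer (the bow-ff setting)
nlp = spacy.blank("en")
textcat = nlp.create_pipe("textcat",
                          config={"exclusive_classes": True, "architecture": "bow"})
nlp.add_pipe(textcat)
for label in ("question", "answer", "opinion"):        # intent categories
    textcat.add_label(label)

# tiny illustrative training set: (text, gold category scores)
train = [("How do I submit assignment 2?",
          {"cats": {"question": 1.0, "answer": 0.0, "opinion": 0.0}}),
         ("You can submit it from the course page.",
          {"cats": {"question": 0.0, "answer": 1.0, "opinion": 0.0}})]

optimizer = nlp.begin_training()
for epoch in range(20):                                 # 20 epochs, as in the experiments
    random.shuffle(train)
    losses = {}
    texts, annots = zip(*train)
    nlp.update(texts, annots, sgd=optimizer, drop=0.2, losses=losses)  # 20% dropout

print(nlp("Why is lecture 3 so confusing?").cats)       # predicted category scores
```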
With both architectures, the output layer applies a soft-max function that takes as
input a vector z of k real numbers (corresponding to non-normalized output of the
preceding layer) and normalizes it into a probability distribution as follows:

\[ \sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{k} e^{z_j}} \quad \text{for } i \in \{1, \ldots, k\} \text{ and } z = (z_1, \ldots, z_k) \in \mathbb{R}^k \tag{3} \]

After applying soft-max, each output component lies in the interval [0, 1] and the
components add up to 1. The transformation is non-linear and tends to highlight the
largest values while suppressing those that are significantly smaller than the maximum.
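A numerically stable sketch of Eq. (3), for illustration:

```python
import numpy as np

def softmax(z):
    """Normalize raw scores z into a probability distribution (Eq. 3).
    Subtracting max(z) does not change the result but avoids overflow."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))   # components in [0, 1], summing to 1
```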

2 https://spacy.io/

3 Experiments with MOOC Posts

Based on document representation models and text categorization approaches defined


in the preceding section, a multi-attribute text categorization tool for MOOC forum
posts has been developed. Given an input string representing a forum post, the tool is
currently able to detect the information summarized in Table 1.

Table 1. Meaning of the extracted attributes and supported categories.

Attribute  | Meaning                                                | Categories
Intents    | The general aim of the post                            | Three categories are currently supported: question (help seeking), answer (help giving), opinion
Topics     | Main educational domain concepts involved in the post  | Strongly domain dependent; three broad categories are currently supported (resulting from the dataset used for training): humanities, medicine, education
Sentiment  | The affective polarity of the post                     | Positive, negative, neutral
Confusion  | The level of confusion expressed by the post           | Low, medium, high
Urgency    | How urgently a reply to the post is requested          | Low, medium, high

Defined algorithms have been trained on the Stanford MOOCPosts data set con-
taining 29,604 anonymized learner forum posts from 11 Stanford University public
online classes [6] within the Humanities, Medicine and Education domain areas. Each
post was manually coded along the following dimensions by three independent coders:
question (yes/no), answer (yes/no), opinion (yes/no), sentiment (from 1 to 7), confusion
(from 1 to 7), urgency (from 1 to 7). In our experiment the first three dimensions
(question, answer, opinion) have been wrapped into a single attribute (intents), while
the feasible values for the last three (sentiment, confusion, urgency) have been discretized
into three categories (positive/negative/neutral for sentiment, low/medium/high for
confusion and urgency). The educational area of the post has been obtained from
additional course information and used to model the topics attribute.
Cross-validation has been used to estimate categorization performances. The
dataset has been divided into 4 disjoint subsets of equal size and, at each step, the k-th
subset with $k \in \{1, \ldots, 4\}$ has been used as validation set while the remaining subsets
have been used for training. The performance obtained on the validation set for intents and
concepts has been measured in terms of average precision, recall and F-score as
follows:

\[ prec = \frac{TP}{TP + FP}, \qquad rec = \frac{TP}{TP + FN}, \qquad F = 2\,\frac{prec \cdot rec}{prec + rec} \tag{4} \]

where TP is the total number of true positives (correctly predicted labels), FP is the
total number of false positives (wrongly predicted labels) while FN is the total number
of false negatives (correct but unpredicted labels) [22].
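For illustration, Eq. (4) can be computed per attribute with a few lines (or with scikit-learn's `precision_recall_fscore_support`, which provides the same averaged quantities); the labels below are hypothetical.

```python
def prf(true_labels, pred_labels, classes):
    """Macro-averaged precision, recall and F-score over the given classes (Eq. 4)."""
    precs, recs = [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(true_labels, pred_labels))
        fp = sum(t != c and p == c for t, p in zip(true_labels, pred_labels))
        fn = sum(t == c and p != c for t, p in zip(true_labels, pred_labels))
        precs.append(tp / (tp + fp) if tp + fp else 0.0)
        recs.append(tp / (tp + fn) if tp + fn else 0.0)
    prec, rec = sum(precs) / len(precs), sum(recs) / len(recs)
    f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f

print(prf(["question", "answer", "question"],
          ["question", "question", "question"],
          classes=["question", "answer", "opinion"]))
```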
In each step the classifier has been trained on 22,203 items while 7,401 have been
used for validation. Experiments with 20 and 50 training epochs have been performed,
with a batch size increasing from 4 to 32 samples per iteration. A fixed 20% dropout
rate has been used to prevent overfitting. The obtained results, averaged over the 4
validation steps, are shown in Table 2. The obtained F-score ranges from about 75% to
over 87%, with a slight dominance of the bow-ff architecture over cnn-we-ff. With
respect to training length, it should be noted that 20 epochs are enough to characterize
the dataset, while additional training does not seem to add useful information.

Table 2. Average performance of the defined categorizer on the MOOCPosts dataset.

Attribute   Architecture  Epochs  Loss   Precision  Recall  F-score
Intents     bow-ff        20      0.122  79.98%     73.33%  76.51%
Intents     bow-ff        50      0.111  78.86%     73.84%  76.27%
Intents     cnn-we-ff     20      0.087  81.28%     70.20%  75.34%
Intents     cnn-we-ff     50      0.050  81.26%     69.33%  74.82%
Topics      bow-ff        20      0.076  87.34%     81.20%  84.16%
Topics      bow-ff        50      0.063  86.39%     81.86%  84.06%
Topics      cnn-we-ff     20      0.046  83.85%     81.92%  82.87%
Topics      cnn-we-ff     50      0.022  83.35%     82.30%  82.82%
Sentiment   bow-ff        20      0.645  88.62%     86.07%  87.33%
Sentiment   bow-ff        50      0.052  88.53%     85.41%  86.94%
Sentiment   cnn-we-ff     20      0.051  85.43%     85.43%  85.43%
Sentiment   cnn-we-ff     50      0.031  84.23%     84.20%  84.21%
Confusion   bow-ff        20      0.081  87.19%     84.25%  85.70%
Confusion   bow-ff        50      0.056  86.52%     83.40%  84.93%
Confusion   cnn-we-ff     20      0.045  84.33%     81.92%  83.11%
Confusion   cnn-we-ff     50      0.027  85.39%     82.22%  83.78%
Urgency     bow-ff        20      0.098  84.05%     75.75%  79.69%
Urgency     bow-ff        50      0.083  83.35%     75.21%  79.07%
Urgency     cnn-we-ff     20      0.056  80.45%     73.95%  77.06%
Urgency     cnn-we-ff     50      0.028  80.23%     74.90%  77.47%

4 Application to Conversational Agents

In this section, an application scenario fostering the adoption of the defined text cat-
egorization tools with Conversational Agents (CAs) is described. As introduced in the
first section, CAs are software programs interacting with learners within synchronous
and asynchronous collaboration tools with the final aim of promoting constructive

Fig. 1. A sample forum thread with a conversational agent fostering productive discussion
based on the results of the proposed multi-attribute post categorization tool.

discussions through the application of pre-defined dialogue patterns. In [16], dialogue


patterns applied by CAs have been grounded on the Academically Productive Talk
(APT) framework [23], which originates from a substantial body of work on modeling
useful classroom discussion practices and norms.

According to APT, promoting students’ reasoned participation and orchestrating


academically productive discussions require teachers to provide dynamic support via
facilitative conversational moves i.e. conversational interventions (or actions) aiming to
model and trigger appropriate forms of peer dialogue. Examples include contribute
linking (e.g. “Do you agree with what … said?”, “Would you like to add something to
…”), building on prior knowledge (e.g. “How does this connect with what we know
about …”), press for reasoning (e.g. “Why do you think that?”), expand reasoning
(e.g. “That’s interesting! Can you elaborate on that?”), etc.
However, in order to dynamically generate fruitful moves, key information like
intents, topics, sentiment, urgency and confusion must be recognized in the discussion
thread as well as in individual posts. To this end, the defined categorization tools may
play a critical role. Figure 1 shows a sample student interaction within a forum thread,
moderated by a CA fostering productive discussion based on the results of the proposed
multi-attribute post categorization tool. In the first post Mario asks for help about a
specific course topic. His intent (seeking help) and related topics are recognized by the
categorizer together with additional attributes. Based on this information, the chatbot
makes a contribute linking move by soliciting the feedback of another student (Laura)
who in the past received a positive assessment on the same subject.
In the third post Laura replies to Mario, reacting to the chatbot request. The intent
and the topics of the post match those of the preceding post, so it is pertinent, but a high
level of confusion is detected, suggesting a weak or somewhat chaotic reply. The
chatbot therefore makes another contribute linking move to solicit additional clarifying inter-
ventions. As a consequence, Anna joins the discussion in the fifth post and provides a
pertinent (based on detected intents and topics), clear (based on confusion) and sup-
portive (based on sentiment) reply.
Unfortunately, the answers obtained so far do not clarify Mario’s requests; thus, in
the sixth post, he seeks further help (based on intents) on the same topic but shows a
negative sentiment as well as high levels of urgency and confusion, suggesting dis-
appointment and frustration. To relieve the stress while providing relevant and
immediate support (based on urgency), the chatbot makes an affective move comple-
mented with a direct hint linked to available educational resources.
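The kind of policy the agent applies in this scenario can be sketched as a simple rule table over the categorizer's output; this is purely illustrative of how the extracted attributes could trigger APT-style moves, not the logic of any existing CA framework, and all attribute values and move names are hypothetical.

```python
def select_move(post):
    """Map the attributes extracted from a post to a conversational move."""
    if post["intent"] == "question" and post["urgency"] == "high" \
            and post["sentiment"] == "negative":
        return "affective_move_with_resource_hint"     # relieve frustration, give a direct hint
    if post["intent"] == "question":
        return "contribute_linking"                    # invite a peer who mastered the topic
    if post["intent"] == "answer" and post["confusion"] == "high":
        return "contribute_linking"                    # the reply is unclear: solicit clarification
    if post["intent"] == "answer":
        return "press_for_reasoning"                   # e.g. "Why do you think that?"
    return "no_intervention"

post = {"intent": "question", "topics": ["software design"],
        "sentiment": "negative", "confusion": "high", "urgency": "high"}
print(select_move(post))
```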
The analyzed thread is just an example of how information extracted by the
developed categorizer may be used within a CA framework. It should be noted that
additional training must be provided to adapt the tool to learning domains. In [13] it
was demonstrated that models trained on a course forum to recognize sentiment can be
successfully adapted to other courses thanks to transfer learning methods, thus relying
on very few labeled additional examples coming from the new course. It is reasonable
to suppose that similar considerations hold for intent, confusion and urgency.
The recognition of domain topics deserves a separate discussion. First of all,
domain concepts and their links have to be modeled with some kind of representation
like knowledge graphs or lightweight ontologies [24, 25] to let the CA and the cate-
gorizer know which concepts are relevant and how they are connected. Then, the
categorizer must be trained to recognize such concepts in forum posts. To this end, in
order to avoid the arduous task of manually labeling numerous examples, it is possible
to apply further NLU methodologies like named entity recognition [26].

5 Final Remarks

In this paper we have described the preliminary results of a research effort aimed at defining a
tool for multi-attribute text categorization, specialized in the analysis of MOOC
forum posts. The developed tool is able to detect intents, topics, sentiment, confusion
and urgency of forum posts with a level of accuracy ranging from about 75% to over
87% (in terms of f-score), as resulting from an experiment made on an existing dataset
of annotated posts. An application scenario that takes advantage of the extracted
information within a conversation agents’ framework has been also discussed.
Future work is planned both in terms of categorizer improvement and in terms of its
adoption in real MOOC environments. Regarding the first point, it should be noted that
the dataset used for training is slightly imbalanced, with some underrepresented cat-
egories. Thus, performance can be improved by integrating such dataset and/or
applying data augmentation techniques. Moreover, as already pointed out in the pre-
ceding section, additional methods like transfer learning and named entity recognition
may be integrated to facilitate tool adaptation to different learning domains.
With respect to the second point, the integration of the proposed tool within an
existing framework for conversational agents is already planned in the context of the
colMOOC project (see acknowledgement) aimed at developing and experimenting
conversational agents and learning analytics tool in MOOCs.

Acknowledgement. This work has been supported by the project colMOOC “Integrating
Conversational Agents and Learning Analytics in MOOCs”, co-funded by the European Com-
mission within the Erasmus + program (ref. 588438-EPP-1-2017-1-EL-EPPKA2-KA).

References
1. Liyanagunawardena, T., Adams, A., Williams, S.: MOOCs: a systematic study of the
published literature 2008/2012. Int. Rev. Res. Open Distance Learn. 14(3), 202–227 (2013)
2. Glance, D.G., Forsey, M., Riley, M.: The pedagogical foundations of massive open online
courses. First Monday 18(5) (2013)
3. Capuano, N., Caballé, S.: Towards adaptive peer assessment for MOOCs. In: Proceedings of
the 10th International Conference on P2P, Parallel, GRID, Cloud and Internet Computing
(3PGCIC 2015), Krakow, Poland (2015)
4. Capuano, N., Loia, V., Orciuoli, F.: A fuzzy group decision making model for ordinal peer
assessment. IEEE Trans. Learn. Technol. 10(2), 247–259 (2017)
5. Albano, G., Capuano, N., Pierri, A.: Adaptive peer grading and formative assessment. J. E-
Learn. Knowl. Soc. 13(1), 147–161 (2017)
6. Agrawal, A., Venkatraman, J., Leonard, S., Paepcke, A.: YouEDU: addressing confusion in
MOOC discussion forums by recommending instructional video clips. In: Proceedings of the
International Conference on Educational Data Mining, Madrid, Spain (2015)
7. Yang, D., Wen, M., Howley, I., Kraut, R., Rose, C.: Exploring the effect of confusion in
discussion forums of massive open online courses. In: Proceedings of the 2nd ACM
Conference on Learning@Scale, New York, NY, USA (2015)
8. Hollands, F., Tirthali, D.: MOOCs: expectations and reality. Center for Benefit-Cost Studies
of Education, Teachers College, Columbia University, NY (2014)

9. Tomkin, J., Charlevoix, D.: Do professors matter?: using an A/B test to evaluate the impact
of instructor involvement on MOOC student outcomes. In: Proceedings of the ACM
Conference on Learning@Scale, New York, NY, USA (2014)
10. Binali, H., Wu, C., Potdar, V.: A new significant area: emotion detection in e-learning using
opinion mining techniques. In: Proceedings of the 3rd IEEE International Conference on
Digital Ecosystems and Technologies (DEST 2009), Istanbul, Turkey (2009)
11. El-Halees, A.: Mining opinions in user-generated contents to improve course evaluation. In:
Software Engineering and Computer Systems, pp. 107–115 (2011)
12. Caballé, S., Lapedriza, A., Masip, D., Xhafa, F., Abraham, A.: Enabling automatic just-in-
time evaluation of in-class discussions in on-line collaborative learning practices. J. Digit.
Inf. Manag. 7(5), 290–297 (2009)
13. Wei, X., Lin, H., Yang, L., Yu, Y.: A convolution-LSTM-based deep neural network for
cross-domain MOOC forum post classification. Information 8, 92 (2017)
14. Wen, M., Yang, D., Rosè, C.: Sentiment analysis in MOOC discussion forums: what does it
tell us? In: Proceedings of Educational Data Mining (2014)
15. Kumar, R., Rosé, C.: Architecture for building conversational agents that support
collaborative learning. IEEE Trans. Learn. Technol. 4(1), 21–34 (2011)
16. Demetriadis, S., Tegos, S., Psathas, G., Tsiatsos, T., Weinberger, A., Caballé, S.,
Dimitriadis, Y., Sánchez, E.G., Papadopoulos, P.M., Karakostas, A.: Conversational agents
as group-teacher interaction mediators in MOOCs. In: Proceedings of Learning
With MOOCS (LWMOOCS), Madrid, Spain (2018)
17. Manning, C., Raghavan, P., Schütze, H.: Introduction to Information Retrieval. Cambridge
University Press, Cambridge (2008)
18. Sebastiani, F.: Machine learning in automated text categorization. ACM Comput. Surv. 34
(1), 1–47 (2002)
19. Cichosz, P.: Case study in text mining of discussion forum posts: classification with bag of
words and global vectors. Appl. Math. Comput. Sci. 28(4), 787–801 (2019)
20. Mikolov, T., Sutskever, I., Chen, K., Corrado, G., Dean, J.: Distributed representations of
words and phrases and their compositionality. Adv. Neural Inf. Process. Syst. 26, 3111–
3119 (2013)
21. Le, Q., Mikolov, T.: Distributed representations of sentences and documents. In:
Proceedings of the 31st International Conference on Machine Learning (ICML 2014),
Beijing, China (2014)
22. Sokolova, M., Lapalme, G.: A systematic analysis of performance measures for classification
tasks. Inf. Process. Manag. 45, 427–437 (2009)
23. Tegos, S., Demetriadis, S.: Conversational agents improve peer learning through building on
prior knowledge. Educ. Technol. Soc. 20(1), 99–111 (2017)
24. Capuano, N., Gaeta, M., Salerno, S., Mangione, G.R.: An ontology-based approach for
context-aware e-learning. In: 3rd IEEE International Conference on Intelligent Networking
and Collaborative Systems, Fukuoka, Japan (2011)
25. Capuano, N., Dell’Angelo, L., Orciuoli, F., Miranda, S., Zurolo, F.: Ontology extraction
from existing educational content to improve personalized e-Learning experiences. In:
Proceedings of the 3rd IEEE International Conference on Semantic Computing (ICSC 2009),
Berkeley, CA, USA (2009)
26. Lee, C., Hwang, Y., Oh, H., Lim, S., Heo, J., Lee, C., Kim, H., Wang, J., Jang, M.: Fine-
grained named entity recognition using conditional random fields for question answering.
LNCS, vol. 4182, pp. 581–587 (2006)
A Tool for Creating Educational Resources
Through Content Aggregation

Antonio Sarasa-Cabezuelo1 and Santi Caballe1,2

1 Universidad Complutense de Madrid, Madrid, Spain
asarasa@ucm.es
2 Universitat Oberta de Catalunya, Barcelona, Spain
scaballe@uoc.edu

Abstract. A problem that students face in some subjects is finding online
educational resources on the subject content that are of adequate quality. This paper
presents a tool that allows a teacher to create educational resources in a simple way,
using their own material or material automatically extracted from open data or linked
data repositories. In addition, the tool offers a user management system that allows a
student to register with a specific teacher and thereby access all the resources
published by that teacher. Educational resources can be downloaded from the tool so
that they can also be used offline. In the article, the proposal is illustrated in the
context of the subject of Software Engineering.

1 Introduction

In general, a very frequent practice among students is to search online for educational
resources that complement the information provided in the classroom. These resources
are usually exercises or exams, or materials that complement the content explained in
class. However, a common situation in this process is that the quality and content of the
resources are not guaranteed by any recognized institution or entity. In this sense, using
them is a risk, since they could include erroneous or inaccurate information. Another
problem is that the materials found on the internet do not conform exactly to the contents
explained in class or to the intended level of difficulty. For this reason, finding additional
material on the network is usually a complex task, and in many cases with limited
success. The ideal situation would therefore be for the teacher of the subject to offer
additional material directly, since it will better fit what is explained in class.
From the teacher’s perspective, the main problem in creating these resources is the
lack of time, since the additional materials available to them are normally found in
different electronic formats such as Word documents, PDFs or videos. If the teacher has
to integrate these materials into a single resource, he or she must carry out a prior
editing process that delays the availability of the material and takes considerable time.
Another fact to consider in this context is the existence of digital repositories. These
are applications that facilitate a set of services [11] such as the creation, elimination,
modification, search and retrieval of digital resources. All repositories show a set of
standard features [12]: a resource and metadata management system, an access man-
agement system and an interconnection system between repositories. The first system
facilitates the management of resources (create, edit, delete) and their organization into
logical structures called digital collections, which group related digital resources. The
second system makes it possible to decide which resource contents are shown to each
user. The last system connects a set of repositories so that they can share the resources
contained in each of them, offering services for importing, exporting and visualizing
unified data. Open data repositories and linked data repositories are two types of digital
repositories [1]; these are two initiatives that have emerged in different fields but with
a similar final objective. Open data is an initiative that arises in the field of public and
private institutions and aims to make the data generated during the performance of their
activities available to any user, generally free of charge. In this way the exposed data can
be exploited in various ways by users to create value-added services or obtain strategic
or economic advantages [6]. In these repositories, the usual way to retrieve information
is through a REST web services API [4]. The invocation of a service generates a query
that returns the requested data in a file of a certain format such as XML, JSON or CSV.
On the other hand, linked data is an initiative that arises in the field of the semantic web
[3] and aims to relate data sets using the RDF language. With this language, RDF
statements can be defined in the form of subject-predicate-object triples that express the
relationships between electronic resources stored in different digital repositories. In
this way, an information graph is created that can be consulted using a graph query
language called SPARQL [9]; SPARQL queries retrieve all resources that match the
expressed restrictions. A large number of public and private institutions have adopted
this initiative and added their resources to this network [5]. One of the most
important linked data repositories is Wikidata [10]. This is an initiative of the Wiki-
media Foundation in which an extensive collection of resources, described in RDF and
related to datasets from other digital repositories, is stored [8]. In Wikidata, each
information element is characterized by a uniquely identifying label, a description that
details the characteristics of the element through property-value pairs, and a number of
aliases. Properties can link to external databases [2] or to content found in wikis such as
Wikipedia, Wikibooks or Wikiquote. To access the information in Wikidata it is possible
to use specific clients to browse its contents, use a query API [10], or use a
SPARQL endpoint [4] (an access point to the contents of Wikidata where it is
possible to write queries in SPARQL [7] and retrieve data in different formats such as
JSON, CSV or RDF [2]). Therefore, both types of repositories constitute a free and easily
accessible source of information that could be used to create educational resources for a
specific topic.
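For illustration, a minimal sketch of querying the Wikidata SPARQL endpoint from Python is shown below; the query simply looks up items labeled "software engineering" and their descriptions, and it is only an assumed example of the kind of query the tool could generate, not the query actually used by the application.

```python
import requests

ENDPOINT = "https://query.wikidata.org/sparql"
QUERY = """
SELECT ?item ?itemLabel ?itemDescription WHERE {
  ?item rdfs:label "software engineering"@en .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
"""

resp = requests.get(ENDPOINT, params={"query": QUERY, "format": "json"},
                    headers={"User-Agent": "edu-resource-tool-example/0.1"})
for row in resp.json()["results"]["bindings"]:
    # each binding holds the item URI, its English label and (if present) a description
    print(row["item"]["value"],
          row.get("itemDescription", {}).get("value", ""))
```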
This article proposes a solution to the problem posed: offering students additional
materials for their subjects, generated by the teachers themselves, in a way that does not
make producing them too costly for the teacher. The proposal is to create educational
resources by aggregating materials available to the teacher (which may be in different
formats) without having to edit them beforehand to integrate them. In particular, it is
proposed that information obtained by exploiting open data and linked data repositories
also be used to create educational resources. Therefore, the main task of the teacher in
this process is to select the materials that constitute the educational resource and to
ensure that its contents are adequate and consistent with the contents explained in class.
There is a wide variety of content aggregation tools in which no editing is necessary
and content is simply added. A very simple option would be to use a blog post or a wiki.
More specific tools can also be used that offer content container templates, allow the
presentation to be manipulated, support some basic editing of the contents, or store
resources in standard formats; an example of such tools is eXeLearning. However, each
of these options presents some disadvantage with respect to implementing the proposal.
In the first case, a blog or a wiki is not a specific tool for creating educational resources
and offers additional features that are not necessary or adaptable for this area. In the
second case, a minimum of editing and configuration of the educational resources is
required, and the possibility of using resources recovered from an open data or linked
data repository is not contemplated. For these reasons, this article presents a tool that
has been implemented with the objective of fitting the described proposal more closely.
The tool has a twofold objective. On the one hand, it offers the teacher a way to create
educational resources by aggregating content and then publishing it in a simple and
intuitive manner. On the other hand, it offers a user management system that allows a
student to register in the tool and associate with a specific teacher so that he or she can
access all the resources published by that teacher.
The article is structured as follows. Section 2 sets out the objectives of the tool
developed. Next, Sect. 3 describes the functionality implemented for each type of user.
Section 4 describes the evaluation performed on the tool. And finally, in Sect. 5 a set of
conclusions and lines of future work are proposed.

2 Objectives

In general for students, searching for complementary content for a subject on the
Internet is complicated given that the information is scattered and can come from
unreliable sources. In this sense, having a unified and controlled source of resources
would facilitate access to information. To solve this need, a tool has been created with
the following objectives:
• Create a web application that allows to create, publish and retrieve digital educa-
tional resources from the field of software engineering.
• The application will allow students and teachers from different groups to be
managed.
• Allow teachers to create resources from material uploaded by teachers or from
material recovered from open or linked data repositories. These materials are
recovered through SPARQL queries about Wikidata.
• Allow students to register with the different teachers who publish resources in the
repository.
Note that the application has been designed as a repository for creating and
downloading digital educational resources. However, it does not support some
operations such as: sorting the selected resources in a specific order, creating different
learning paths depending on the student’s skills or progress, creating activities associated
with the selected resources, or playing content inside the tool itself (without
downloading it).

3 Description

3.1 Application Architecture


The architecture of the application is shown schematically in Fig. 1.

Fig. 1. Application architecture.

Users interact with the application client through a web browser. Requests are made
to an Apache server from the client. These requests can be directed to a MySQL
database using SQL queries. The returned data is processed on the server from which
the results are sent in the form of HTML pages.

3.2 Design and Implementation


In the application, 3 different types of users are defined, each with particular function-
alities:
• The student can register as a student of a certain teacher chosen by himself or
herself. Once registered, it is possible to log in by filling in the fields provided and
access all existing resources in the system, regardless of the group to which the
student belongs. In addition, profile customization options and a resource search
engine are provided.
• The teacher has the same functions as a student but also has additional functions
such as managing the content of the tool and the students’ accounts.
• The administrator is responsible for managing resources, teachers and students.

Next, the main features are described. Any user, regardless of type, must authenticate
to the system with a username and password in order to access the functionality of the
application. Once authenticated, the user has access to the functionality of his or her
user type.
The administrator has functions related to the management of the application. The
administrator can activate a teacher who has requested to use the application, as shown
in Fig. 2 (when a teacher requests registration, activation is not immediate and the
administrator must confirm it), delete teachers and students who are already users, or
modify any user’s profile.

Fig. 2. Teacher activation

Another option is to visualize the resources that teachers have created and down-
load them as shown in Fig. 3.

Fig. 3. (a) Resources created, (b) resource display.

The main functions of the teacher are the management and creation of educational
resources, and the management of students. The registration of a student is not
immediate: the student must request registration with a form, indicating his or her
personal data and the group with which he or she wishes to be associated. The teacher in
charge of the selected group receives the request and is responsible for accepting or
rejecting the student’s registration (Fig. 4). The teacher can also remove any student
registered with him or her.

Fig. 4. Student activation

However, the main function of the teacher is the creation of digital educational
resources. Creating a resource involves several phases. First, the teacher must
upload to the application the files (documents, images, videos, audios or links) that will
be used to create the resource. Next, the system automatically generates a
SPARQL query that retrieves information and data about Software Engineering from
Wikidata that can be used to create resources. The files uploaded by a teacher can be
used by any teacher active in the application to create their own resources.
Once the files are in the application, the teacher can create resources. To do this, the
files that will form the resource must be selected, and a title and description must be
assigned. Once the resource is created, it can be previewed. The teacher can then
publish the resource so that it can be searched and retrieved by any user registered in
the application, or save it to continue editing it later. Figure 5 shows the process of
creating a resource.

Fig. 5. (a) Title and description of a resource, (b) selection of files, (c) creation of resources.

The teacher can also perform other tasks:


• Delete files uploaded to the application. To delete an uploaded file, all resources in
which it is used must be deleted first. Otherwise, the application will display a
message indicating that a file cannot be deleted while it is used in a resource.
• Display a resource. To do this, the resource must be selected; the files that are part
of the resource will then be displayed and can be deleted or modified. Figure 6
shows the resource display.

Fig. 6. (a) Resources of a teacher, (b) content of a resource.

• Download a resource. To do this, the resource must be selected and the download
option chosen.
• Modify the profile. For this, the teacher uses a link called “Profile” from which it
is possible to modify the name, description, photo and office.
• Remove students. For this, the teacher uses a link called “My Students” from
which it is possible to select a specific student and delete the account.
• Remove resources. For this, the teacher uses a link called “My Resources” from
which it is possible to select a specific resource and remove it from the application.
The main function of the student is the search and retrieval of the resources created
by the teachers. To do this, it is possible to browse the set of created resources or use a
search engine that searches the titles and descriptions associated with the resources.
Once the searched resources have been retrieved, they can be displayed by selecting
them. A student can also perform the following actions:
• Modify the profile. For this, the student uses a link called “Profile” from which
he or she can modify the name, description, photo and studies.
• Display a resource. To do this, the resource must be selected; the files that are
part of the resource will then be displayed and can be downloaded.
• Download a resource. To do this, the student must select the resource and choose
the download option.

Finally, a comment on data privacy: all data, from both teachers and students, are
encrypted within the database used by the application, so that unrestricted access to
them is not possible.

4 Evaluation

An evaluation of the usability of the tool and of user satisfaction has been carried out
with different people, both students and teachers. For this, a survey with fourteen
activities was created in which the user must perform certain tasks and answer
questions about them, either choosing values between 1 and 5 (1 being the most
unfavorable option and 5 the most favorable) or indicating how much effort the action
involved. The evaluation involved 15 teachers and students and was carried out using
Google Forms. A form1 was created that included the tasks to be performed and the
questions to be answered after performing the tasks. The form was sent to the
participants, who performed the tasks without a time limit and without help.
The results obtained in the evaluation were the following:
• Regarding the creation of a new account with the role of student, 75% have thought
that it has been very intuitive and 25% that it has been quite intuitive.
• 75% of the responses show that it is very intuitive to activate the account of a
previously created user, and 25% think it is quite intuitive.
• 87.5% of users have responded that it is intuitive to log out of the tool and the
remaining 12.5% respond that it is not very intuitive.
• Regarding the visualization of resources, 25% think that the design is very good,
62.5% think that it is quite good and 12.5% think that it is not bad.
• 37.5% think that the modification of the user profile is very intuitive, 37.5% think
that it is quite intuitive and 25% think that it is not bad.
• Regarding the search for resources, 50% report that finding the resources took them no effort, 25% almost no effort, and 25% a little effort.
• 75% think that the download of resources is intuitive and 25% think that it is not
intuitive.
• 50% think that the design of the application is very good, and the other 50% think
that the design is good.
• Regarding the task of uploading a file, creating a resource with the previously uploaded file and publishing it, 62.5% think it has been very easy and the other 37.5% think it has been easy.
• Regarding the task of uploading a file, creating a resource with the previously uploaded file and saving it, 37.5% think that it has been very easy, 25% think that it has been easy, 25% think that it has not been easy and 12.5% that it has been difficult.

1
https://docs.google.com/forms/d/e/1FAIpQLSee1Z9H1ilbRXWUwbw6HAXHbeADdjO94-3kwEJ0N7u3ASbXgw/viewform?vc=0&c=0&w=1.

• Regarding the task of modifying a resource, 25% think that it has been very easy,
50% think that it has been easy, 12.5% think that it has not been easy and 12.5%
that it has been difficult.
• 62.5% think it is very easy to remove a user from the system, and the remaining 37.5% think it is easy.
• 50% think that finding resources and eliminating them is very easy, 25% believe
that it is easy and the remaining 25% think that it is not easy.
The objective of the evaluation was to assess the usability of the application in a reduced and controlled context. In general, the results show that, in this context, the perception is that the tool is usable. However, it cannot be stated categorically that the tool is usable, as a more formal experiment should be carried out.

5 Conclusions and Future Work

This article has presented a tool that allows the creation and exploitation of educational
resources on the subject of software engineering. Three types of user profiles have been
defined in the tool: teacher, administrator and student.
The educational resources that can be created with the tool can consist of files of
different types: videos, audios, images, documents or links. Some of these resources are
automatically retrieved from open data repositories linked through SPARQL queries.
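As an illustration of this retrieval mechanism, the following hypothetical sketch queries the public DBpedia endpoint with SPARQLWrapper for resources in the "Software engineering" category. The endpoint, prefixes and query are illustrative assumptions, since the actual repositories and queries used by the tool are not detailed here.

```python
# Hypothetical example: the paper does not specify the repositories or the
# queries actually used, so this sketch targets the public DBpedia endpoint
# and asks for resources in the "Software engineering" category.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dct:  <http://purl.org/dc/terms/>
    PREFIX dbc:  <http://dbpedia.org/resource/Category:>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?resource ?label WHERE {
        ?resource dct:subject dbc:Software_engineering ;
                  rdfs:label ?label .
        FILTER (lang(?label) = "en")
    }
    LIMIT 10
""")
sparql.setReturnFormat(JSON)

# Each binding gives a linked resource and its English label, which the tool
# could then offer to teachers as candidate content for a new resource.
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["label"]["value"], "->", row["resource"]["value"])
```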
The tool makes it possible to manage different teachers who publish their own resources. In
this sense, a peculiarity of the tool is that the resources that each teacher creates and
uses are shared, so that any teacher can use resources from other teachers to create their
own resources.
Compared with other similar tools, such as learning management systems or content creation tools, it offers fewer features. However, the tool makes several important contributions: (a) it offers content specific to Software Engineering, (b) resources are created by experts, and (c) it offers a simple system for managing teachers and students.
The main lines of future work that would improve this tool are:
• Extend the tool to other subjects than Software Engineering.
• Add a resource valuation system so that the user can know for example which file is
the most complete or the most valued.
• Improve the user interface so that the language in which the content is displayed can be selected, and so that the content can be displayed in alphabetical order or by creation date.
• Create a mobile version of the tool.
• Create a forum in the tool where users and teachers can easily ask and answer
questions.
• Design a comment system for each resource created.

Acknowledgments. This work has been partially supported by the European Commission
through the project “colMOOC: Integrating Conversational Agents and Learning Analytics in
MOOCs” (588438-EPP-1-2017-1-EL-EPPKA2-KA). Also, I would like to acknowledgment to
Manuel Martín Canoray David Limón Miralles

A Methodology Approach to Evaluate
Cloud-Based Infrastructures in Support
for e-Assessment

Josep Prieto(&) and David Gañán

Faculty of Computer Science, Multimedia and Telecommunications,
Universitat Oberta de Catalunya, Rambla Poblenou 156, 08018 Barcelona, Spain
{jprieto,dganan}@uoc.edu

Abstract. In the last decade, cloud development has grown exponentially, and increasingly companies, administrative and educational institutions decide to make the leap and move their solutions to a cloud platform. In particular, cloud
technologies are applicable to educational contexts, especially in online learn-
ing, as online universities historically have their educational services installed
on-premises, but the trend is to move them to the cloud. This is mainly because
of the evident non-functional benefits of using cloud-based solutions, such as
high availability, scalability and real-time responsiveness, which are impossible or too costly to achieve on-premises. Indeed, these benefits can
effectively support the current broad and demanding educational services, such
as document sharing, communication, assessment, administrative procedures
and reporting tools, which must be available to an increasing number of students
and university staff anytime and anywhere. However, from the architectural
point of view, cloud-based systems pose some challenges not existing in tra-
ditional on-premises systems, such as modularization and separation of services,
which require additional work to guarantee performance and data protection
against potential risks. In this paper, we focus on the assessment services pro-
vided by an innovative cloud-based educational system named TeSLA, which
may reduce considerably the university costs and infrastructure maintenance,
while offering flexible and effective e-assessment solutions. The ultimate goal of
the paper is to propose a methodology to evaluate the TeSLA system from the
underlying cloud infrastructure in terms of non-functional requirements.

1 Introduction

The major benefits of cloud platforms are related to the ease of setting up an infrastructure in the cloud compared with setting it up on-premises. In a cloud environment a new machine or service can be deployed in a matter of minutes, while doing the same
on-premises (purchasing the hardware, installing it physically, setup operating systems,
etc.) can take considerably more time and become too costly. Furthermore, cloud
environments easily and rapidly enable vertical and horizontal scaling when required.
In addition to the ease of use, cloud platforms offer many other advantages, such as fast
processing, large data-storage capacity and sharing of resources that overall can help in
the accomplishment of key business non-functional requirements [1, 2].

© Springer Nature Switzerland AG 2020


L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 525–536, 2020.
https://doi.org/10.1007/978-3-030-33509-0_49

However, deploying an application in the cloud requires some important considerations, such as those related to interoperability and security issues. Indeed, some
companies are reticent to migrate their solutions to the cloud if those solutions cannot
share data and services across different cloud vendors [3], while others have the perception of losing control over data in the cloud and need to increase the development
investment to implement secure communication protocols, data encryption, etc. This is
why starting a new project oriented for the cloud is easier than migrating an existing one.
Cloud technologies are also applicable and make much sense in educational
contexts, especially in online education [4]. Universities, such as the Universitat Oberta
de Catalunya (UOC)1, historically have their VLEs (Virtual Learning Environments)
installed on premises, but the trend is to move them to the cloud as some learning
companies already offer VLE services on the cloud directly (e.g., MoodleCloud). The
purpose is to provide effective support to the broad number of services available for
education, such as document sharing, communication, assessment, reporting tools, etc.
To overcome the above challenges, this research targets an educational system
named TeSLA2, which is architected for cloud environments aiming to provide an
unambiguous proof of students’ academic progression during the learning process
while avoiding the time and physical limitations imposed by face-to-face education [6]. To this end, the system provides multiple instruments and pedagogical resources in order to check whether the students are doing their activities on their own, by means of analyzing biometric patterns, such as face or voice, or keyboard patterns, or by verifying the authorship of documents [5].
In this paper, a methodological guide is proposed for the evaluation of the TeSLA
system, which can be used for the evaluation of similar cloud-based systems devoted to
support e-assessment activities that meet the mentioned non-functional requirements.
The rest of the paper is structured as follows. Section 2 reviews the literature about
cloud-based solutions for education and e-assessment. Section 3 summarizes the TeSLA
system developed for cloud environments. Section 4 proposes a detailed methodology
to evaluate the presented cloud-based system to support e-assessment activities.
Finally, Sect. 5 concludes the paper and outlines next directions of this research.

2 Background

Cloud platforms offer multiple services of different types that can be categorized in
many ways, but in general there are three distinguished layers or service models (XaaS
services), namely Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS) [7, 8].

1
The Universitat Oberta de Catalunya (UOC), located in Barcelona, Spain, has offered full distance higher education through the Internet since 1995. UOC's Virtual Campus currently supports about 75,000 students, lecturers and tutors who are involved in some of the 350 official degrees and other postgraduate programs. The UOC web site is found at: https://www.uoc.edu.
2
The TeSLA system (Trust-based authentication & authorship e-assessment analysis) is a project
funded by the European Commission with the aim to provide educational institutions with an
adaptive trust e-assessment system for assuring e-assessment processes in online environments. The
TeSLA web site is found at: https://tesla-project.eu/.

Many contributions [7–10] agree in identifying the main advantages and issues of cloud platforms; the most important are the ease of
management and the possibility of reducing costs. The scalability of components, the
uninterrupted service or the characteristics for disaster management are also mentioned
as the main advantages of the cloud. In addition, other relevant issues are related with
security and privacy.
There are many examples of cloud architectures or framework proposals and
application case studies in different fields [12, 13]. In particular, there are many con-
tributions about cloud applications in the field of education [4]. For instance, the
authors in [14] present the architecture of an e-learning system deployed on the cloud,
which is divided in several layers with the application layer supporting educational
services, such as content production, content delivery, virtual laboratory, collaborative
learning, assessment and management features.
For the sake of our work, we focus our research on the assessment services of the
learning process on a VLE (i.e., e-assessment). We can define e-assessment as the
process where information and communication technologies are used for the man-
agement of the end-to-end assessment process [15]. In other words, e-assessment deals
with methods, processes, and web-based software tools (or systems) that allow sys-
tematic inferences and judgments to be made about the learner’s skills, knowledge, and
capabilities [16]. The literature discusses different models and platforms for virtual
automatic e-assessment, such as the work of [16] and [17]. However, most of this
literature focuses on an educational or a technological description without taking into
account any security measure. In this context, we define security measure as how the
learner is authenticated on the system when a learning activity is performed or sub-
mitted, how the authorship of the activity is verified, and how the system protects the
integrity of submitted activity. These security requirements have been previously
considered in [18] and [19] in the attempt to cover all the possibilities of cheating.
However, the counter-measures to mitigate these issues are not always easy to
implement [5].

3 Engineering a Cloud-Based e-Assessment System

In this section, we describe a cloud-oriented architecture in support for the TeSLA educational system [5, 6, 20], aimed to meet the previously mentioned security needs
found during e-assessment processes, namely authentication and authorship detection
and verification. The following sections describe the system specifications and
requirements, the architecture schema and its components.

3.1 System Specifications


The main goal of the system is to analyze students' activities in online education in order to detect cheating, such as plagiarism or activities done by someone other than the student. The system must support a large number of concurrent students and teachers, given that online institutions usually have a large community with tens of thousands of members.

There are different tools, called instruments, that allow the identification of people. Some identification mechanisms can be transferable, such as a card and an access code, while others cannot be transferred, such as biometric data. Biometric instruments are valid to check whether it is really the student who performs the activity, and include facial recognition, voice recognition, keyboard pattern recognition or analysis of the writing style. For copy detection there are also different tools, known as anti-plagiarism tools. Next, we briefly describe some of these instruments that will be involved in our pilot scenario (see Sect. 4.1 and [6, 20]):
Facial Recognition (FR): compares the face and facial expressions using images
and videos with the learner model.
Keystroke Dynamics (KD): compares the rhythm and speed of typing when using
the keyboard with the learner model.
Plagiarism Detection (PD): detects similarities (word-for-word copies) between a
given set of text documents created by students using text matching.
Forensic Analysis (FA): compares the personal writing style to the learner model,
which is updated over time with submission of new documents.
Apart from PD, these identification instruments require an initial training process or
enrolment that each student must carry out and that allows capturing their characteristic
features for later identification (learner model). During the process of analyzing the data
captured during the activity (called verification), the instruments compare the gathered
data with the initial learner model captured during enrolment and return a probability value indicating how likely it is that the activity is being performed by the enrolled student.
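Purely as an illustration of how such probabilities could be consumed, the sketch below flags the instruments whose verification score falls below a threshold. The thresholds and the flagging rule are hypothetical and do not correspond to TeSLA's actual decision logic.

```python
# Purely illustrative: the instrument codes (FR, KD, FA, PD) follow Sect. 3.1,
# but the thresholds and the flagging rule are hypothetical, not TeSLA's.
THRESHOLDS = {"FR": 0.8, "KD": 0.7, "FA": 0.7, "PD": 0.9}

def flag_activity(results: dict[str, float]) -> list[str]:
    """Return the instruments whose verification score falls below threshold.

    `results` maps an instrument code to the probability returned after
    comparing the captured data with the learner model (for PD, interpreted
    here as 1 minus the similarity with other submissions)."""
    return [code for code, score in results.items()
            if score < THRESHOLDS.get(code, 1.0)]

# Example: flag_activity({"FR": 0.93, "KD": 0.55, "FA": 0.81}) -> ["KD"]
```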

3.2 System Technical Requirements


There are some well-known non-functional requirements, already discussed in the
background section, that should be taken into account when designing the architecture
of the system, but the decisions taken to ensure them may vary when the system will be
deployed into the cloud. This section describes the decisions made to meet these non-
functional requirements found in our context of online education, namely modular-
ization, robustness, extensibility, performance, security, operability and usability [2].
Modularization. One of the most impacting decisions in the architecture of the system
was to enforce modularization with the use of containerization, which is the encap-
sulation of each module or component inside a container that is one minimal unit of
deployment and operation and can be scaled separately. The ecosystem of multiple
containers is commonly known as microservices architecture and the most well-known
example of containerization platform is Docker.
Robustness/Reliability. In order to increase the reliability of the project some decisions were taken, among others: (i) use a source code repository for each component (concretely GitHub), enforcing continuous integration and delivery, (ii) increase the testability of the system by means of unit tests for each component plus integration
tests, and (iii) create some well-defined communication interfaces between compo-
nents, implemented through REST endpoints accepting JSON format.

Extensibility. The TeSLA system was designed to be extensible in different ways, and
the modularization of components, the correct definition of interfaces and the inter-
operability between components were the key to success. On one hand, TeSLA defines
an instrument interface that can be implemented by any third party instruments to be
integrated in the system like a plugin. On the other hand, the system is able to work
with different VLEs (at the end of the project it is compatible with Moodle out-of-box)
thanks to a defined interface which can be implemented particularly by other VLEs in
the future.
Performance. The number of potential students and instructors using the TeSLA system is quite large, more than 10,000, so it should be prepared to support a large number of concurrent requests. This is not only achieved by developing high-performance components, but also by scaling some parts of the system. Scalability can be addressed
in two ways: either by scaling vertically (increase machine resources) or horizontally
(increase the number of machines supporting a service).
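As an illustration of the containerization and horizontal scaling described above, the following sketch uses the Docker SDK for Python to launch several replicas of a containerized instrument. The image name, port and replica count are hypothetical; the paper only states that components are packaged as Docker containers and can be scaled.

```python
# Sketch using the Docker SDK for Python ("docker" package). The image and
# port are hypothetical; the paper only states that components are
# containerized (Docker) and can be scaled horizontally.
import docker

client = docker.from_env()

def scale_instrument(image: str, replicas: int) -> None:
    """Run `replicas` containers of one instrument, i.e. horizontal scaling
    (more containers per service, typically behind a load balancer)."""
    for i in range(replicas):
        client.containers.run(
            image,
            name=f"tesla-instrument-{i}",
            detach=True,               # keep running in the background
            ports={"8080/tcp": None},  # let Docker pick a free host port
        )

scale_instrument("tesla/keystroke-dynamics:latest", replicas=3)
```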
Security. The most obvious security issues are related with communications security
using encryption to avoid man-in-the-middle attacks, or the authentication and
authorization mechanisms for each of the interfaces of the system (either visual
interfaces or endpoints). This is especially important to ensure in cloud systems
because data is moved from the on-premises installation to the exterior. In order to
increase even more the security in the communications between components, each
component uses its own client certificate emitted by a trusted certifying authority (CA).
Additionally, the system does not store personal information about students; the way to
identify them is using a unique TeSLA ID which is generated for each student. The
translation between the TeSLA ID and the corresponding student is stored in the Plugin
component, which is always installed on the VLE side (that is, on the institution premises).
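The following sketch illustrates these mechanisms together: JSON over REST, mutual TLS with a per-component client certificate, and an anonymous TeSLA ID as the only student identifier sent outside the institution. The endpoint URL, payload fields and certificate paths are hypothetical, since the actual TeSLA API is not specified here.

```python
# Hypothetical endpoint, payload fields and certificate paths; the paper only
# specifies REST+JSON interfaces, per-component client certificates and the
# TeSLA ID as the sole student identifier used outside the institution.
import uuid
import requests

def new_tesla_id() -> str:
    """Generate an anonymous TeSLA ID; the mapping to the real student is
    kept only in the VLE Plugin, inside the institution premises."""
    return str(uuid.uuid4())

def send_verification_sample(tesla_id: str, sample_b64: str) -> dict:
    response = requests.post(
        "https://tep.example.org/api/v1/verification",    # hypothetical TEP URL
        json={"tesla_id": tesla_id, "instrument": "FR", "sample": sample_b64},
        cert=("plugin-client.crt", "plugin-client.key"),   # mutual TLS client cert
        verify="tesla-ca.pem",                             # trusted CA bundle
        timeout=10,
    )
    response.raise_for_status()
    return response.json()   # e.g. {"probability": 0.93}
```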
Operability. Operability of the system includes deployment and maintainability. The
containerization of the components helps in the deployment of the system, which is
deployable through an installation script adaptable to each institution. The maintain-
ability is enhanced by the use of the Portal component tools. The Portal includes tools
for administrators, who can deploy and manage components (including instruments and instances), and centralizes the monitoring of all the components of the system (status,
health, logs, etc.).
Usability. This requirement is not directly related to architecture, but it is increasingly important. A system can be fully functional, performant, robust, secure,
but if it is not usable it will not succeed. The development of TeSLA system took into
account some premises (finally accomplished or not) about the usability of the system:
(i) it should not be intrusive for students or impede them to work normally if there is
some failure, (ii) it should be easy for instructors to setup activities and the results
shown should be easily interpretable, (iii) the system would be easy to be managed by
administrators, and (iv) the system has to be aware of SEND students and offer alternatives (for example, a student who cannot speak cannot use a voice recognition system, so another alternative such as face recognition or keystroke analysis should be offered).

3.3 Global Architecture


To respond to the needs described in the previous subsections, the proposed educational system TeSLA is composed of the following components (see Fig. 1):
• TeSLA Portal (or Portal for short): handles the licensing/enrolment, deployment and
statistics.
• TeSLA E-Assessment Portal (TEP for short): acts as a service broker, receiving
requests from the Virtual Learning Environment (VLE) plug-ins and forwarding
that request to the appropriate instruments and/or the Portal.
• TeSLA Instruments: an instrument evaluates a given sample in order to assess the learner's identity/authorship (see Sect. 3.1).
• TeSLA Plug-in: generates TeSLA ID used for anonymization, gathers information
about the VLE and communicates with TEP.
• External tool: collects data samples to provide to the instruments (e.g. video, sound
or keystrokes).
• Learner tool: interacts with the learner using TeSLA.
• Instructor tool: interacts with the instructor using TeSLA.
In [2], each component of the TeSLA system is described in detail.

Fig. 1 System global architecture [2].

4 Evaluation Methodology

In order to evaluate the previously described approach of the cloud-based TeSLA system, in this section we propose, first, the pilot scenario that will guide the evaluation activities of the system. Second, we propose the experimental study [21] to validate the impact of the system on
several indicators of interest related mainly to non-functional but also functional and
educational requirements. Finally, once the experiment is over, the results will be
elaborated and reported in future publications (see Sect. 5). Note that this paper
intends to propose a suitable design for a future experiment to be conducted in order to
evaluate the potential benefits of the cloud-based TeSLA system.

4.1 Pilot Scenario


In this subsection, we describe the scenario of use based on the deployment of the TeSLA
system to support assessment activities in the real online education context of the UOC
virtual campus. First, the technical aspects of the scenario are described and then the
educational approach is presented in terms of the type of e-assessment activity where
students and teachers will be involved in the real online learning context of the UOC.

Fig. 2 System’s architecture to support the pilot scenario [2].

Technical Scenario. The pilot will be performed using Moodle courses and activities,
which are already integrated into the UOC virtual campus using an LTI provider, thus
allowing for sharing information about students, courses and activities between the
virtual campus and Moodle through LTI REST APIs (see Fig. 2 and [2] for details). All the communications between Moodle and the TeSLA system will use a unique identifier called TeSLA ID, which corresponds to a unique student. The TIP component will take care of generating those TeSLA IDs and managing the conversion from and to the
internal virtual campus id. Finally, TeSLA provides a REST API that provides all the
public functionalities of the system, like send enrolment data, query student results, etc.
The Moodle activities use those APIs in order to query and update configuration of
activities in the TeSLA system, and to get evaluation results. The learner browser also
uses these APIs directly to send evaluation data either for enrolment or during activities.
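As a complement to the earlier security sketch, the following hypothetical example shows the two API uses named above, sending enrolment data and querying a student's results. The base URL, paths and fields are illustrative and do not describe the actual TeSLA API.

```python
# Hypothetical sketch of the two API operations named above; URLs, paths and
# response fields are illustrative, not the real TeSLA REST API.
import requests

BASE = "https://tesla.example.org/api/v1"   # hypothetical base URL

def send_enrolment(tesla_id: str, instrument: str, sample_b64: str) -> None:
    """Submit an enrolment sample for one instrument (e.g. "KD")."""
    requests.post(f"{BASE}/enrolment",
                  json={"tesla_id": tesla_id,
                        "instrument": instrument,
                        "sample": sample_b64},
                  timeout=10).raise_for_status()

def get_activity_results(tesla_id: str, activity_id: str) -> dict:
    """Query the evaluation results of one activity for one (anonymous) student."""
    r = requests.get(f"{BASE}/results/{activity_id}",
                     params={"tesla_id": tesla_id}, timeout=10)
    r.raise_for_status()
    return r.json()
```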

Educational Scenario. The assessment activity proposed for this pilot will be intended
to assess an online teaching planning. Students will be required to design and plan a
teaching action in a virtual environment (e.g. plan a course, design an e-assessment
activity or design a learning platform) with the following main learning objectives:
• To master the fundamental elements for the planning of processes and scenarios of
online training that respond to specific training needs.
• To plan the most appropriate strategies and resources for the evaluation of specific
online learning situations.
• To design learning activities for virtual environments (e-activities) that foster the
interaction, participation, and collaboration of students.
The assessment activity will be used both for formative and summative purposes
within the course. The teacher supports and monitors the development of the project by
giving guidelines, answering questions and reviewing the ongoing project. After the
delivery of the learning output, the teacher will provide personalized feedback about
the work and will offer a mark to each student once the activity is delivered.

4.2 Experiment Design


This section presents a comprehensive study describing all activities that are proposed
to be undertaken during the experimentation of the presented cloud-based system
TeSLA in the described pilot scenario (see Sect. 4.1). The design of a complete
empirical study includes details on the goals and hypotheses, the method (including
number and type of participants, apparatus and stimuli, and procedure), as well as the
techniques and tools of data analysis for evaluation and validation purposes of the
empirical data. The design presented is based on the standard guidelines to report
empirical studies for online education research [21].
The goals and hypotheses formulated for this scenario are related mainly to test out
the previously mentioned non-functional requirements of the cloud-based TeSLA
system (see Sect. 3.2). In particular, the modularization, robustness, extensibility,
performance, security, operability and usability of the TeSLA system when supporting
e-assessment activities. To this end, the proposed pilot scenario will be run at UOC as
described in Sect. 4.1. Next, the goals and hypotheses are formulated.
Goals
G1: To test the system performance at scale
G2: To refine the system modules
G3: To test the reliability and robustness of authentication and authorship
mechanisms
G4: To extend the system by adding new instruments
G5: To report on the system security
G6: To test the system operability in terms of deployment and maintenance
G7: To measure the system usability in terms of functionality
G8: To achieve the learning goals.

Hypotheses
H1: The system outperforms on-premises solutions at scale
H2: The system is modular as it provides extensibility and operability
H3: The authentication and authorization mechanisms are reliable and robust
H4: There are no security issues affecting the system functioning
H5: The system is functional and is considered as a valuable educational resource
H6: The use of the system increases the knowledge acquired during the course.
Following the standard method to report empirical results [21], information about
the participants, the apparatus used for experimentation and the procedure of the
experiment are proposed next.
Participants. In order to appropriately evaluate the cloud-based TeSLA system in the
terms and features considered in the formulated goals and hypotheses, the sample of the
experiment in terms of targeted participants, demographics and courses are suggested
to be the following:
• Number of students: 1,500 online students located at home
• Number of lecturers and technicians: 130
• Number of courses in which TeSLA is integrated into the assessment system: 20
(unsupervised individual assessment activities)
• Demographics of the students using TeSLA: gender balanced, age normally dis-
tributed around a mean age of 33

Apparatus and Stimuli. Several TeSLA instruments will be applied (see Sect. 3.1 for a
description). In particular, Keystroke Dynamics (KD), Forensic Analysis (FA) and Plagiarism Detection (PD) will work online for authentication and authorship purposes while students perform the assessment activity in Moodle Quiz. In addition, Facial Recognition (FR) will be used to strengthen students' authentication without impacting
on students’ activity performance.
According to the targeted type of participant (students, lecturers and technicians),
three different types of questionnaires are to be launched after the experiment:
1. After the assessment activity, the participating students will be required to fill in a
questionnaire, which will include the following 6 sections: (i) identification data;
(ii) open questions related to the pedagogical topics and contents of the assessment
activity in order to validate the knowledge acquired during the course; (iii) test-
based evaluation of the systems supporting the assessment activity (Moodle Quiz
and TeSLA instruments) as valuable educational resources; (iv) test-based evalu-
ation on the usability of the system; (v) test-based evaluation on the emotional state
when using the system; (vi) a test-based evaluation of the questionnaire. All sec-
tions have a final field to express suggestions and further comments about aspects
not represented in the questions. The questionnaire’s Sects. 2–5 are considered for
the purpose of our study.
2. The previous questionnaire will be sent to the participating lecturers, involving all sections except Sect. 2 on the knowledge acquired during the course.

3. Technicians involved in the experiment will receive a questionnaire after the experiment where they will report the following: (i) identification data; (ii) open
questions with related questions to the non-functional aspects of the system under
study (integration, scalability, performance, security, etc.); (iii) test-based evalua-
tion of the system functional aspects supporting the assessment activity from the
technical perspective; (iv) a test-based evaluation of the questionnaire. All sections
have a final field to express suggestions and further comments about aspects not
represented in the questions.
The answer categories will vary between rating scales, multiple choice or open
answers. Regarding the rating scales, for the majority of the quantitative questions the
5-point Likert scale will be used, so that students can state their level of agreement or
disagreement. The rating scale ranges from “I strongly disagree” (1), “I disagree” (2),
“neither/nor” (3) to “I agree” (4), “I strongly agree” (5).
For qualitative statistical analysis, the open answers will be summarized in the
questionnaire. For the quantitative statistical analysis, we will employ descriptive
statistics, such as Mean (M), Standard Deviation (SD) and Median (Md). We will
complement this quantitative analysis by employing accepted statistical procedures, such
as Chi-square (X2), so as to compare the observed scores to the expected scores [21].
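A minimal sketch of this analysis for one 5-point Likert item is shown below. The response values and the uniform expected distribution are illustrative assumptions, not data from the study.

```python
# Minimal sketch of the planned analysis on 5-point Likert responses;
# the sample data and the expected (uniform) distribution are illustrative.
from statistics import mean, stdev, median
from scipy.stats import chisquare

responses = [4, 5, 3, 4, 4, 2, 5, 4, 3, 5]   # one questionnaire item

print(f"M = {mean(responses):.2f}, SD = {stdev(responses):.2f}, "
      f"Md = {median(responses)}")

# Chi-square: compare observed frequencies of each Likert level (1..5)
# against expected frequencies (here, a uniform distribution).
observed = [responses.count(level) for level in range(1, 6)]
expected = [len(responses) / 5] * 5
chi2, p = chisquare(f_obs=observed, f_exp=expected)
print(f"X2 = {chi2:.2f}, p = {p:.3f}")
```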
For Sect. 4 (usability) we will use the System Usability Scale (SUS) developed by [23], which contains 10 items and a 5-point Likert scale to state the level of agreement or disagreement. SUS is generally used after the respondent has had an opportunity to use the system being evaluated.
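The sketch below applies the standard SUS scoring procedure (odd items contribute the score minus 1, even items contribute 5 minus the score, and the sum is scaled by 2.5 to a 0–100 range). The example answers are illustrative only.

```python
# Standard SUS scoring (Brooke, 1996): odd items contribute (score - 1),
# even items contribute (5 - score); the sum is scaled by 2.5 to 0-100.
# The example responses below are illustrative.
def sus_score(item_scores: list[int]) -> float:
    """`item_scores` are the ten 1-5 answers, in questionnaire order."""
    assert len(item_scores) == 10
    contributions = [(s - 1) if i % 2 == 0 else (5 - s)   # index 0 is item 1 (odd)
                     for i, s in enumerate(item_scores)]
    return sum(contributions) * 2.5

print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 4, 3]))   # -> 77.5
```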
The data from this experience will be collected by means of the Google Forms
attached to the Moodle Quiz and the UOC virtual campus as the real context where the
assessment activity and the overall experiment will be developed. Quantitative and
qualitative data will be collected from questionnaires, log files and databases from
UOC virtual campus and the Moodle platform.
Procedure. Students participating in the experiment will be requested to provide a
written response to a quiz in Moodle where they will describe their ideas for their
individual project (i.e. title, type of design, approach). Once the proposal is accepted by
the teacher, the student will develop the project. The output for this activity will be the
document containing the planning or design of an online teaching action. This
assessment activity will take place during the Fall academic semester in 2020 at the
UOC and undergraduate students from courses of both the Computer Science and
Learning Science degrees will be invited to participate in great numbers (see Partici-
pants). All the participants will be duly informed about the experiment and the systems
and technologies to be used by means of user manuals and if necessary, by online
training sessions before the experiment starts.

5 Conclusions and Future Work

Engineering applications that will be deployed in a cloud infrastructure does not differ
dramatically from engineering distributed applications deployed on-premises. How-
ever, some specific considerations should be taken into account when using a cloud
infrastructure, especially those regarding security and data privacy, because data is no longer stored and managed on-premises but on a remote server. Performance and availability are also important design concerns, so that moving to the cloud does not affect the response time of the application. Furthermore, cloud technologies offer very interesting features to distributed applications, mainly by easing the setup and maintenance of infrastructure. The first part of the paper discussed engineering applications for the cloud and described the most important points that distinguish typical on-premises application architectures from cloud-based ones.
The second part of the paper describes the architecture of a cloud-based application for trust-based online assessment called the TeSLA system. The TeSLA system
was designed to accomplish the functional and non-functional requirements found in
the context of e-assessment of educational organizations. The paper shows an overview
of the TeSLA system in terms of system specifications and requirements, the archi-
tecture schema and its components. Finally, a comprehensive experimental design is
proposed as a key contribution of the paper describing all the evaluation activities to be
undertaken during the experimentation of the presented cloud-based system TeSLA in
a future pilot scenario, including details on the goals and hypotheses, the method as
well as the techniques and tools of data analysis for evaluation and validation purposes
of the empirical data. The design presented is based on the standard guidelines to report
empirical studies for online education research.
The ultimate goal of the paper is to propose a methodology to evaluate the TeSLA
system from the underlying cloud infrastructure chiefly in terms of non-functional
requirements. The natural next step of this research is to run the proposed experimental
design and analyze the collected empirical data in order to evaluate and validate the
system from the cloud-based perspective. This experiment may be followed by others that correct and extend some design parameters to help collect more appropriate quantitative and qualitative data in order to evaluate the system more objectively.

Acknowledgements. This work has been supported by the Spanish Government through the Grant TIN2014-57364-C2-2-R “SMARTGLACIS” and the H2020-ICT-2015 TeSLA project ‘An Adaptive Trust-based e-assessment System for Learning’, Number 688520.

References
1. Talukder, A.K., Zimmerman, L.A.P.H.: Cloud economics: principles, costs, and benefits. In:
Antonopoulos, N., Gillam, L. (eds.) Cloud Computing Computer Communications and
Networks. Springer, London (2010)
2. Prieto, J., Gañán, D.: Engineering cloud-based technological infrastructure. In: Baneres, D.,
Guerrero-Roldán, A.E., Rodríguez-González, M.E. (eds.) Engineering Data-Driven Adaptive
Trust-based e-Assessment Systems. Lecture Notes on Data Engineering and Communica-
tions Technologies, vol. 34, ch. 4. Springer. ISBN 978-3-030-29325-3 (2020, in press)
3. Pecori, R.: A virtual learning architecture enhanced by fog computing and big data streams.
Future Internet 10(1), 4 (2018)
4. González-Martínez, J.A., Bote-Lorenzo, M.L., Gómez-Sánchez, E., Cano-Parra, R.: Cloud
computing and education: a state-of-the-art survey. Comput. Educ. 80, 132–151 (2015)

5. Baneres, D., Rodríguez, M.E., Guerrero-Roldán, A.E., Baró, X.: Towards an adaptive e-
assessment system based on trustworthiness. In: Caballé, S., Clarisó, R. (eds.) Formative
Assessment, Learning Data Analytics and Gamification in ICT Education, pp. 25–47.
Elsevier (2016)
6. Bañeres, D., Noguera, I., Rodríguez, M.E., Guerrero-Roldán, A.E.: Using an intelligent
tutoring system with plagiarism detection to enhance e-assessment. In: INCoS 2018,
pp. 363–372 (2018)
7. Rimal, B.P., Choi, E., Lumb, I.: A taxonomy and survey of cloud computing systems. In:
Fifth International Joint Conference on NCM 2009, pp. 44–51. IEEE, August 2009
8. Jadeja, Y., Modi, K.: Cloud computing-concepts, architecture and challenges. In: 2012
International Conference on Computing, Electronics and Electrical Technologies (ICCEET),
pp. 877–880. IEEE (2012)
9. Rimal, B.P., Jukan, A., Katsaros, D., Goeleven, Y.: Architectural requirements for cloud
computing systems: an enterprise cloud approach. J. Grid Comput. 9(1), 3–26 (2011)
10. Tsai, W.T., Sun, X., Balasooriya, J.: Service-oriented cloud computing architecture. In: 2010
Seventh International Conference on Information Technology: New Generations (ITNG),
pp. 684–689. IEEE, April 2010
11. Zimmermann, O.: Architectural refactoring for the cloud: a decision-centric view on cloud
migration. Computing 99(2), 129–145 (2017)
12. Karim, B., Tan, Q., El Emary, I., Alyoubi, B.A., Costa, R.S.: A proposed novel enterprise
cloud development application model. Memetic Comput. 8(4), 287–306 (2016)
13. Felemban, M., Basalamah, S., Ghafoor, A.: A distributed cloud architecture for mobile
multimedia services. IEEE Netw. 27(5), 20–27 (2013)
14. Masud, M.A.H., Huang, X.: An e-learning system architecture based on cloud computing.
Int. J. Comput. Electr. Autom. Control Inf. Eng. 6(2), 255–259 (2012)
15. Cook, J., Jenkins, V.: Getting Started with e-Assessment. University of Bath, Bath (2010)
16. Ala-Mutka, K.M.: A survey of automated assessment approaches for programming
assignments. Comput. Sci. Educ. 15(2), 83–102 (2005)
17. Ihantola, P., Ahoniemi, T., Karavirta, V., Seppälä, O.: Review of recent systems for
automatic assessment of programming assignments. In: Proceedings of the 10th Koli Calling
International Conference on Computing Education Research, pp. 86–93 (2010)
18. Weippl, E.: Security in e-Learning. Advances in Information Security, vol. 16. Springer
(2005)
19. Neila, R., Rabai, L.B.A.: Deploying suitable countermeasures to solve the security problems
within an e-learning environment. In: Proceedings of the 7th International Conference on
Security of Information and Networks, pp. 33–38 (2014)
20. Okada, A., Noguera, I., Alexieva, L., Rozeva, A., Kocdar, S., Brouns, F., Ladonlahti, T.,
Whitelock, D., Guerrero-Roldán, A.: Pedagogical approaches for e-assessment with
authentication and authorship verification in Higher Education. Br. J. Educ. Technol.
(2019). https://doi.org/10.1111/bjet.12733
21. Caballé, S.: A computer science methodology for online education research. Int. J. Eng.
Educ. 35(2), 548–562 (2019)
22. Kay, R.H., Loverock, S.: Assessing emotions related to learning new software: the computer
emotion scale. Comput. Hum. Behav. 24, 1605–1623 (2008)
23. Brooke, J.: SUS: a ‘quick and dirty’ usability scale. In: Jordan, P.W., Thomas, B.,
Weerdmeester, B.A., McClelland, A.L. (eds.) Usability Evaluation in Industry, pp. 189–194,
Taylor and Francis, London (1996)
Towards an Educational Model for Lifelong
Learning

Jordi Conesa1(&), Josep-Maria Batalla-Busquets1, David Bañeres1,


Carme Carrion1, Israel Conejero-Arto1, María del Carmen Cruz Gil2,
Montserrat Garcia-Alsina1, Beni Gómez-Zúñiga1,
María J. Martinez-Argüelles1, Xavier Mas1,
Tona Monjo1, and Enric Mor1
1
Universitat Oberta de Catalunya, Barcelona, Spain
{jconesac,jbatalla,dbaneres,mcarrionr,iconejero,
mgarciaals,bgomezz,mmartinezarg,xmas,
amonjop,emor}@uoc.edu
2
Universidad Carlos III de Madrid, Madrid, Spain
macruzg@bib.uc3m.es

Abstract. Today, lifelong learning is fully integrated into our society. From the
student point of view, lifelong learning has several characteristics that differ-
entiate it from regular learning: domains of interest may be very broad; learning
occurs in different depths; topics to study may be related both to work, family
and leisure; students’ continuity cannot be guaranteed since their availability can
be intermittent and inconsistent; great dynamism is required in order to allow studying any topic, in any order, at the moment that best suits each student and at the best pace for everyone. Over 25 years ago some authors already called for
moving towards innovative learning models, more personalized and where the
students would take a more active role and would decide what to learn, when to
learn and how to learn. Technology was not ready then to support this change of
pedagogical paradigm, but it seems to be ready now. Thus, the technological
context is set for facilitating a change of paradigm to promote lifelong learning.
However, lifelong learners continue suffering from a model not adapted to their
needs and preferences. This position paper discusses the current situation of
lifelong learning from a critical point of view, analyzing some of the relevant
literature and justifying the need to create new models that promote self-
determination of students in the context of lifelong learning.

Keywords: Lifelong learning · Lifewide learning · Andragogy · Heutagogy · eLearning

1 Introduction

Lifelong learning is fully integrated into our society. We constantly need to learn in our
everyday activities: for travelling, for using new software programs, for keeping
updated, for curiosity, etc. In the professional context, lifelong learning is a need: all professionals should be lifelong learners [1] and should use different kinds of environments to do so, such as formal, informal and non-formal learning environments [2].
© Springer Nature Switzerland AG 2020
L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 537–546, 2020.
https://doi.org/10.1007/978-3-030-33509-0_50

Lifelong learning has some specific characteristics to take into account. Some of
them may be due to the time and availability constraints of people, such as the
impossibility of having a full dedication, schedule constraints, time periods of
unavailability or the lack of constant dedication. Others come from the complexity of
the current world or the myriad of preferences of people, that impose a more multi-
disciplinary learning, mixing leisure and professional aspects as well as aspects related
to daily activities. Others are due to the uniqueness of each person; both referring to the
current knowledge, skills and competences everyone has and to the different necessities
of skills and knowledge of everyone. Therefore, the most suitable environment for lifelong learning is one where adults are able to choose what to learn, how to learn, when to learn, in what order and at what pace [3], which is known as heutagogy (or self-determined learning) [4].
The necessity of providing self-determined learning for lifelong learning is not new; some works claimed, almost 30 years ago, the necessity to use different models, more personalized, in which students take a more active role, deciding what to learn,
when to learn and how to learn [3]. Obviously, in the 90’s, technology was immature to
support such models, but it is ready nowadays: there are millions of digital learning
resources available, thousands of organizations teach online, current information sys-
tems are able to provide personalized learning and to automate some of the interactions
with students, there is a huge amount of social/collaboration tools that could be used
and students are used to participating in and benefiting from communities of interest. Therefore, providing a technological environment that supports self-determined learning seems feasible. Even so, lifelong learners continue suffering from a similar model, more
ubiquitous and efficient thanks to the use of technology, but still not adapted to their
needs and/or preferences.
Changes should be done not just in the way students learn, but also in what they
learn and when they learn. Choosing what content to learn requires new ways for
enrolling and choosing courses, different to the enrollment to a given subject or a group
of subjects related to a given topic, which are the typical structures of masters and
subjects offered by educational organizations. Learning whenever learners prefer
requires having flexible schedules, allowing each student to decide when to begin the
course, when to finish the course and at what pace the student will work. Implementing
these changes requires educational organizations to evolve, mostly in their business and
organizational models. Therefore, the change of paradigm does not just affect peda-
gogy, but the whole learning experience, that is, all the facets related with learning
activities and their actors/resources: the necessity of new materials, of new techno-
logical tools, of persons with new roles, of new business models, of new motivational
policies, etc.
The goal of this work is to provide a discussion on the adult needs in the context of
lifelong learning, how these needs can be addressed and from what perspectives. The
paper also provides a glimpse of a model under development at the Open University of Catalonia, which tries to design, implement and evaluate new tools, both methodological
and technological, to move forward to a more suitable lifelong learning environment.
The paper is organized as follows. Section 2 presents a brief review about the terms
lifelong learning, lifewide learning, andragogy and heutagogy; relevant in the current
context. The section also presents a use case that shows how a future lifelong learning
environment can be. Section 3 sketches the main needs of lifelong learners and points out
the current gap between their needs and the current educational offer. Section 4 briefly
presents the proposed model and the different aspects it takes into account. Finally,
Sect. 5 outlines the main conclusions and provides on-going and future directions of
research.

2 Background and a Case Study of Lifelong Learning

Lifelong learning is commonly associated with the terms lifewide learning, self-determined learning, self-directed learning, andragogy and heutagogy. Subsect. 2.1
describes such concepts and how they are related to lifelong learning. Due to the bias
we may have for the continued use of the traditional educational model, it may be
difficult to imagine how new lifelong learning environments should be and how they
may differ from current ones. In order to facilitate such an imaginative exercise,
Subsect. 2.2 presents a case study that shows some of the constraints an adult faces when learning and presents a possible lifelong learning environment that helps the learner in her learning process, by adapting seamlessly to her constraints and needs.

2.1 Background
Lifelong learning has become very relevant due to the continuous necessity to keep updated in work environments [5], but also in daily life [6]. Some research also points out its potential to improve us as a society [7, 8]. In this sense, eLearning may be a game changer to break the barriers between education and work [1], mainly because of its ability to deal with ubiquity, personalization, communication and automatization.
Many authors claim that lifelong learning should be addressed from an heuta-
gogical perspective. Heutagogy can be viewed as an evolution from pedagogy that
passes through andragogy. Heutagogy occurs due to the maturity, awareness and
autonomy of lifelong learners [7]. Blaschke has proposed a framework, in the form of a
pyramid, to reflect such perspective [4], depicted in Fig. 1.
Pedagogy may be seen as the theory of teaching. At this level the teacher is responsible for the learning process, choosing what to learn, when to learn, in what
order and how. In some sense, we can say that students are educated and have few
decisions to take about their learning. The second level is andragogy, characterized by more self-responsibility and self-control of learners. At this level, students are more aware of how they learn and what their main necessities are. They are responsible for identifying their learning necessities and for planning how these necessities will be addressed. Even
though their voice is taken into account, the role of the teachers is still very relevant,
and they take great responsibility in the learning process. Andragogy is also known as
self-directed learning. Finally, the third level is heutagogy. Heutagogy requires learners that have progressed in maturity and autonomy, who are ready to take a step further and conduct self-determined learning, that is, choosing what to learn, when to learn, how to learn and at what pace. Some authors define heutagogy as learning in the absence of educators [7]. Heutagogy, in our humble opinion, does still need educators, but with a different role, a role more focused on guiding students during the

Fig. 1. Blaschke framework reflecting the lifelong learning process (from [4]).

learning process and in promoting their curiosity and knowledge by the provision of
examples, activities, success cases or any other resource related to the student interest.
Another difference among pedagogy, andragogy and heutagogy is the type of their
learning outputs. Pedagogy and andragogy are relevant for acquiring knowledge and competences, but heutagogy is more focused on learning capabilities, understanding a capability as the ability to use a competence (skill or knowledge) efficiently to deal with different problems, even when these problems are very different from the ones seen during
learning. The acquisition of competence requires changes in the learning methodology:
to add a double loop [9]. The double loop is a process in which learners revisit their
acquired competences and try to find out how to adapt them in their daily activities to
improve them.

2.2 Case Study: An Example of How Lifelong Learning Should Be


Neus is a 40-year-old woman, mother of a 4-year-old son. She works in a construction
company and enjoys learning new things: mainly related to healthy eating, education of
young children and travelling; she loves travel. She had considered studying something
related to her interests, but she had neither the time nor the willingness to enroll in a long-term program. There are short-term programs, such as monthly courses or full
subjects, but they require a continuous dedication that, with her son and her current
work, cannot be guaranteed. Neus discovers a new lifelong education service and
decides to sign up.
Just after her subscription, she receives an email from Joan, her personal mentor.
Joan introduces himself, explains how everything works, the different courses available
and how to search and navigate thru them. He also asks about her availability, hobbies,
interests, goals and learning expectations. The courses offered are very focused and of
short duration, ranging from some hours to one week. Every course has a schedule, but
it is tentative. Courses can be started whenever the student needs and can take as much time as necessary to be finished. Each course proposes a challenge, provides the knowledge
required to address it, presents some related examples and facilitates an assessment
activity that requires the use of the learned knowledge and skills to address the pro-
posed challenge. There is a whole system of complex interrelations between courses to
state their relations and group different courses to respond to larger and complex
challenges. These groupings may represent different abstraction levels and reflect the
magnitude of the proposed challenge and the skills needed to address it. A graphical
representation of courses, their relationships and aggregations is provided by an
interactive and navigable visualization.
Thirty minutes after her first contact with the system, Neus is already aware of
the structure of the courses, of the main courses related to her interests (healthy food)
and understands what the most convenient order of courses is to address her learning.
After some interactions, by using communication tools that Neus usually uses (e-
mail, phone and messaging applications), she knows the courses she should take and in what order. Her first choice has been a course about the impact of sugar-sweetened beverages in health. The course is a compound one, made up of five smaller
courses, titled: “Introduction to the digestive system”, “What are the carbohydrates?”,
“Do I need glucose? How many?”, “Sugar-sweetened drinks - learning to identify them
-” and “Evidence based-studies on Sugar-sweetened drinks”. Since she already knows
about the digestive system and carbohydrates, she decides to enroll in the third one
directly “Do I Need glucose? How much?”. Immediately, she receives a personalized
message from Clara, her teacher on this course, briefly introducing the benefits of the
glucose, the risks of its excessive consumption, a guideline for the course and a link to
the course materials. The course has a planned dedication of one week but there is no
time limitation to finish it. She enjoyed the course for two weeks before finishing it.
Just at the end of her first course she receives news at work that worries her: there
are plans to implement an ERP in the company where she works. Neus has heard about
these systems from friends who work in the sector, and not very good things, by the way. But she does not have much knowledge of what an ERP is and what problems its implementation may have. She contacts her mentor to ask whether there are any available
courses on ERP. In few hours, she gets information about the courses on the subject.
There are content for more than one year of study, but she decides to just take a short
introductory course titled “What is an information system for organizations?”. She
finishes the course after a couple of days worried, very worried indeed. She is now
aware about the potential advantages of ERP, but also about the potential problems
their implementation may have. She wonders whether there is something that can be
done to increase the chances of success of the ERP implementation. After navigating
through the visualization of courses, she finds a course that seems interesting, it
belongs to a compound course in project management, titled “What should be done to
guarantee success in the implementation of an ERP?”. There are some preliminary
courses, but she decides to ignore them to take the relevant course with urgency and
high interest. In few hours she learnt about the critical success factors to take into
account when implementing an ERP. Enric, her teacher in the course, aware of her
situation, has provided some success and failure cases of ERP implementation to her.
After studying thoroughly these cases, she talks with her boss about the potential risks
of the future ERP implementation and ways to mitigate them. She will become a
coordinator member of the implementation project team due to her recent acquired
542 J. Conesa et al.

knowledge and will be able to deal with such responsibility due to her ability to learn
what she needs, when she needs.
Some weeks later, Neus returns to her study of sugar-sweetened drinks. She
does not have to start from the beginning, since the virtual learning environment
provides her with a visual reminder of what she had done, what she had read, the
interactions she had with her teacher and the activities she performed. Such information
helps her to resume her learning in a few hours. Since that day, Neus has been a
promoter of the lifelong learning service, which is useful for both her work and her life.

3 The Gap Between Students’ Needs and Academical Offer

There are some experiences in which learning has been adapted to lifelong learners'
necessities, but they are mostly occasional and isolated. In [7], for example, the authors
analyze an experience in Kenya addressing the fourth Sustainable Development Goal
(SDG 4) of the UN [10] (ensure inclusive and equitable quality education and promote
lifelong learning opportunities for all1). The experience used a heutagogical approach
to promote agricultural education. The education provided addressed three different
dimensions: human, financial and societal. In this experience the students were not just
knowledge receivers, but also knowledge generators, promoters and communicators.
Some communities of interest have blossomed from the experience, providing a rich and
natural environment in which to learn, but also to share the lessons about agriculture that
the farmers, who were the lifelong learners, have learnt during their lives; such lessons
may be difficult to learn from academics. In [11] the relationship between travelling and
learning is analyzed. The research presents travelling as a very suitable platform for
lifelong learning, since through travelling we not only acquire knowledge, but also
competences and soft skills (stereotype removal, cultural changes, motivation, etc.).
In [12] lifelong learning approaches are used to deal with poverty, social inclusion and
long-term unemployment. Finally, [13] analyzes different lifelong learning approaches
in the contexts of Europe and Asia. It turns out that European approaches are more
focused on individuals, promoting their employability, whereas in Asian countries there
is a strong focus on education that promotes community and a collective ethos.
Aside from some isolated approaches, the main response of higher education
organizations to lifelong learning needs is an academic offer very similar to conventional
formal education, but with more practical or work-related content. These offers tend to
take the form of long courses, scheduled like undergraduate courses (by semesters, with
similar calendars), with little (or no) flexibility in the assessment activities and with
constraints on when the courses can be started, how they can be taken and at what pace
they should be studied. Some of the offers are composed of several courses and allow
few (or no) electives, as in a master's program. A master's program has a curriculum
designed for a given standard student, a student who, in the real world, is very difficult
to find, and even more so in the case of adult learners.

1 Source: https://sustainabledevelopment.un.org/sdg4.
It seems clear that students’ needs do not fit with the characteristics of the offer that
higher education institutions are providing. Figure 2 shows some of the mismatches
between the academic offer and the students’ needs, which will be discussed in more
detail below.

Fig. 2. Main mismatches between academic offer and students' needs

Adult students cannot dedicate themselves fully to study because they must reconcile
their family, work, leisure and learning activities. In addition, they have responsibilities at
home and at work that may make them unavailable for a given period of time: an urgent
project at work or a baby issue in the family, for example. The regular academic calendar is
very unsuitable for them, since courses are long (several months) and constant
dedication is expected. In addition, the assessment activities of courses are scheduled
and allow little (or no) flexibility; it is not rare for students to fail a course
because they had to travel for a couple of weeks and were unable to deliver an
assessment activity on time. Therefore, short courses with a lot of flexibility to deal
with the potential unavailability of students are advisable for lifelong learning.
Students' schedule preferences are shaped by their responsibilities. Some
may work in shifts of one week on and one week off, for example, while others may only
be available in summer. Why, then, do academic institutions not allow them to take
courses whenever they want? Current schedules (mostly aligned with fixed semesters)
are obviously not the best solution for most people, but they are very convenient for
academic organizations.
Each learner is different, since past experiences shape our current knowledge and
abilities. The differences among learners are even more noticeable in adults. In
addition, lifelong learners do not focus on learning just one topic, but many, related
to the different facets of their lives: work, leisure, travelling, family and
others. These characteristics make it difficult to create academic offers that are
suitable for large communities of students. It seems more suitable to create very small
courses, each focused on a given piece of knowledge or a skill. With that approach, it is
easier to find interested learners, and the courses can be grouped into compound
courses that deal with a given topic in more detail. Under such a structure, students would
be able to choose the curriculum they want, avoiding unnecessary courses, taking their
interests into account and taking the courses in the order that best suits their needs.
Since lifelong learning is not a one-shot activity but a long-distance race, the current
business model (payment per enrolment) may not be the most adequate. New business
models should be considered: models that charge students for the use they actually make,
that are affordable for most people, that provide higher scalability in order to incorporate
larger numbers of students easily and, ultimately, that make academic organizations
sustainable and lifelong learning a right for everyone.

4 Towards an Educational Model for Lifelong Learning

We are working on a long-term project to create, implement and test a model that
facilitates lifelong learning in a distance learning environment. The model should be
created taking into account the scientific evidence and the lessons learnt during the last
decades.
The problem cannot be solved just by providing a new pedagogical model; models
for andragogy and heutagogy already exist, but they are rarely applied in the real world. We
humbly believe that the solution should be more multidisciplinary: a solution that
provides the tools (methodological, theoretical and technological) to deploy an
environment where lifelong learning is conducted easily and conveniently. Such a
proposal should take into account pedagogy (to promote learning), but also organizational
studies (to propose suitable ways to structure lifelong learning educational organizations),
business models (to make the proposal sustainable and scalable), user experience
(to adapt the model to the students' needs and limitations), psychology (to study
how students can be motivated in the new paradigm), information management (to study
ways of organizing academic offers in small pieces that can be recursively aggregated into
higher-level ones) and technology (to study how technology, eLearning tools
and analytics can be used to personalize learning and automate the system as much as
possible).
The basic characteristics of the proposed model are:
– Educational resources must be digital, very modular and of small granularity.
– Learning units (the subjects or courses in the current model) should be modular,
with a very small granularity (of one week or a few hours) and highly interrelated.
The interrelations between the different units will make it possible to define units of
greater granularity and complexity, but also to identify prerequisites, related subjects
and possible paths that the students can take. These interrelations will have to be shown
graphically and interactively so that the student can navigate and understand what
there is and how it is related.
– The student's experience must be integral (taking into account aspects of user
experience, pedagogy and psychology), flexible (allowing courses to be begun whenever
the student wants), dynamic (allowing any activity to be suspended whenever necessary
and resumed later), personalized (with the support of analytical tools and technology
that allow personalization) and accompanied, with mentoring figures that accompany
students throughout their educational experience and promote empowerment,
involvement, good habits and a positive attitude.
– Sustainability must be guaranteed through a suitable business model.
– Scalability should be provided by a suitable organizational model.
– The virtual learning environment should provide knowledge management functionalities
to make learning units explicit, accessible and usable.
– Accreditation systems that certify the acquisition of competences, knowledge and
capabilities should be provided, using badges [14] or similar systems.

5 Conclusions and Future Work

Lifelong learning in higher education is usually constrained by artificial barriers, such
as deadlines, mandatory subjects, inflexibility, long courses or time-restricted programs.
However, lifelong learning should be more flexible and personalized. Students
should be able to choose what they want to learn, how, when, in what order and at what
pace. To do so, new educational models should be created in order to provide more
suitable learning. These models should take into account not only educational aspects,
but also other perspectives: organizational, economical, pedagogical, technological,
user-centered and psychological.
This paper presents some background on lifelong learning, some thoughts about the
needs of lifelong learners and some misalignments between these needs and current
academic programs. It also presents the main characteristics of a holistic lifelong
learning model that is being developed at the Open University of Catalonia.
The goal of the paper is to raise discussion about current lifelong learning programs
and whether they solve the real needs of students, and to promote constructive thoughts.
To do so, some mismatches between lifelong learners' needs and academic offers are
discussed. Readers may think that these mismatches are naïve and common sense.
They really are but, even so, they should be stated and discussed, since they have existed
for many years and no improvement is foreseen.
Further work will focus on developing, implementing and evaluating the model
presented in the paper.

Acknowledgments. This work has been partially supported by the eLearn Center through the
project Xtrem 2018 and by the European Commission through the project “colMOOC: Integrating
Conversational Agents and Learning Analytics in MOOCs” (588438-EPP-1-2017-1-EL-
EPPKA2-KA). This research has also been supported by the SmartLearn Research Group at
the Universitat Oberta de Catalunya.

References
1. Ashton, J., Newman, L.: An unfinished symphony: 21st century teacher education using
knowledge creating heutagogies. Br. J. Educ. Technol. 37(6), 825–840 (2006)
2. Manuti, A., Pastore, S., Scardigno, A.F., Giancaspro, M.L., Morciano, D.: Formal and
informal learning in the workplace: a research review. Int. J. Train. Dev. 19(1), 1–17 (2015)
3. Candy, P.C.: Self-Direction for Lifelong Learning. A Comprehensive Guide to Theory and
Practice. ERIC (1991)
4. Blaschke, L.M.: Heutagogy and lifelong learning: a review of heutagogical practice and self-
determined learning. Int. Rev. Res. Open Distance Learn. 13(1), 56–71 (2012)
5. Kettle, J.: Flexible pedagogies: employer engagement and work-based learning flexible
pedagogies: preparing for the future (2013)
6. Tuckett, A.: The rise and fall of life-wide learning for adults in England. Int. J. Lifelong
Educ. 36(1–2), 230–249 (2017)
7. Carr, J., Balasubramanian, A., Atieno, K., Onyango, R.: Lifelong learning to empowerment:
beyond formal education. Distance Educ. 39(1), 69–86 (2018)
8. Louw, W.: Designing learning experiences to prepare lifelong learners for the complexities
of the workplace. In: Psycho-Social Career Meta-Capacities: Dynamics of Contemporary
Career Development, pp. 307–319. Springer International Publishing (2014)
9. Hase, S., Kenyon, C.: From andragogy to heutagogy. Ulti-BASE In-Site (2000)
10. Robert, K.W., Parris, T.M., Leiserowitz, A.A.: What is sustainable development? goals,
indicators, values, and practice. Environ. Sci. Policy Sustain. Dev. 47(3), 8–21 (2005)
11. Liang, D.J., Caton, K., Hill, K.: Lessons from the road: travel, lifewide learning, and higher
education, 15(3), 225–241 (2015)
12. Harrison, R., Vanbaelen, J.: Lifelong learning as a steppingstone to entrepreneurship and
innovation (Conference Paper) (2016)
13. Osborne, K., Borkowska, M.: A European lens upon adult and lifelong learning in Asia. Asia
Pac. Educ. Rev. 18(2), 269–280 (2017)
14. Gibson, D., Ostashewski, N., Flintoff, K., Grant, S., Knight, E.: Digital badges in education.
Educ. Inf. Technol. 20(2), 403–410 (2015)
The 7th International Workshop on
Cloud and Distributed System
Applications (CADSA-2019)
Optimization Algorithms and Tools Applied
in Agreements Negotiation

Alessandra Amato1, Flora Amato1, Giovanni Cozzolino1(&),
Marco Giacalone2, and Francesco Romeo3

1 University of Naples “Federico II”, via Claudio 21, 80125 Naples, Italy
{alessandra.amato,flora.amato,giovanni.cozzolino}@unina.it
2 Vrije Universiteit Brussel & LSTS, Pleinlaan 2, 4B304, 1050 Brussels, Belgium
Marco.giacalone@vub.ac.be
3 University of Naples “Federico II”, via Porta di Massa, 1, 80133 Naples, Italy
francesco.romeo@unina.it

Abstract. In this paper we introduce new mechanisms of dispute resolution as
a helping tool in legal procedures for lawyers, mediators and judges, with the
objective of reaching an agreement between the parties and, in some situations,
of being used directly by citizens. The primary objectives are the following: to apply
algorithmic mechanisms to the solution of certain national and cross-border civil matters,
including matrimonial regimes, successions and trusts, commercial law and consumer law,
facilitating the agreement process among the parties; and to demonstrate the efficacy of an
algorithmic approach and apply it to the allocation of goods, or the resolution
of issues, in disputes, leading the parties to a friendly solution before or during
the trial. We will focus on the algorithms used for resolving disputes
that involve a division of goods between agents, e.g. inheritance, divorces and
company law.

Keywords: Algorithm · Cloud · Resolution · Artificial Intelligence

1 Introduction

One of the aims of informatic systems (and in particular of the ones based on Artificial
Intelligence) is to help people (e.g. domain experts) make decisions by proposing
solutions to given problems. In order to be helpful and effective, those systems must be
outright easy to use and able to satisfy every possible need of the user without over-
coming him.
As informatics and AI evolve, hopefully those systems will help us with more and
more aspects of our lives in order to improve them.
The cost and the extremely long duration of trials are also a common problem for the majority
of the European Union member states. Cross-border proceedings take an inordinate
amount of time, both to set up and for a decision to be rendered. Specific problems
include finding a judge with the requisite competence and translating the summons and
other relevant material into a language intelligible to the addressee. Moreover, cross-border
civil proceedings are far too expensive for most citizens, due to the high costs of
document translation and of finding and consulting qualified legal experts. Another
obstacle that citizens face in cross-border disputes is the divergent interpretations of
different national courts, even when EU rules apply. The length of national proceedings
is also a common problem for the majority of the member states.
Moreover, the current legal systems do not adequately value the possibility of the parties
reaching an agreement; instead, they always try to find the solution in legal rules that
often diverge from the wishes of the parties, through a long and debilitating process of
conflict and confrontation1.
For all these reasons, the “Conflict Resolution with Equitative Algorithms” (CREA)
project was established by the European Commission. It involves several European
universities, each with its own role. The project aims to introduce new
mechanisms of dispute resolution as a helping tool in legal procedures for lawyers,
mediators and judges, with the objective of reaching an agreement between the parties; in
some situations, it could be used directly by citizens. Its primary objectives are the
following:
1. to apply algorithmic mechanisms to the solution of certain national and cross-border
civil matters, including matrimonial regimes, successions and trusts, commercial
law and consumer law, facilitating the agreement process among the parties;
2. to demonstrate the efficacy of an algorithmic approach [12–14] and apply it to the
allocation of goods, or the resolution of issues, in disputes, leading the parties to a
friendly solution before or during the trial.
In this paper we present an example of such a system used for resolving legal
disputes, in particular disputes regarding a division of goods. This system makes use
of mathematical algorithms belonging to game theory and is being developed
within a project that aims to spread this algorithmic approach in order to make legal
procedures faster and fairer. We will focus on the algorithms used for resolving disputes
that involve a division of goods between agents, e.g. inheritance, divorces and
company law [1–11].

2 The Algorithms Setup

The algorithms we will look at apply to cases where some goods have to be assigned to a
set of entities (e.g. people or companies), which we will call agents. Each agent is given
a share of entitlement, i.e. a share weight. Typically, if there are n agents, the share will
be 1/n for each one of them, but it could be different. For instance, shares could reflect
the closeness of relatives to a deceased person, or the effort each party has contributed.
The algorithms will give us a solution in the form of the share of each good assigned to
each agent.

1 Sections 2, 3, 4 and 5 are to be attributed to all authors; Sect. 1 in particular is to be attributed to
Marco Giacalone.
Let’s have a look at the notation we are going to use; we have:


• A set of agents $N = \{1, 2, \ldots, n\}$
• A set of goods $M = \{1, 2, \ldots, m\}$
• A market value $m_j$ for each good $j \in M$
• A share weight $w_i$ for every agent $i \in N$
• A solution of the allocation problem, where:
  – $z_{ij}$ is the share of good $j \in M$ given to agent $i \in N$
  – $Z = \{z_{ij}\}_{i \in N,\, j \in M}$ is the solution matrix
We have to let the agents express their preferences about which goods they would
like to get, which they would not, and to what extent. We can achieve that in
two ways: the bids method and the ratings method; whichever method we choose,
we end up with the utility of each good for each agent, i.e. how much each agent
values each good.

2.1 The Bids Method

• For each good, a market price is given.


• The sum of the market prices is computed. This is the budget available to each agent
in the following steps.
– An equal budget for each agent reflects the principle that all agents should be
treated equally. Only the share of entitlement could discriminate among agents.

• The market price is decreased by 20%. This price is low enough to guarantee the
sale of the good (or at least to make its sale extremely likely). Below this price,
called the lower bidding bound, an offer cannot be considered acceptable.
• Each agent is asked to distribute the budget as bids over the unassigned goods.
Each bid cannot be less than the lower bidding value, and the total value of bids
cannot exceed the budget.

– The idea is that the higher the bid, the more likely the agent is to receive the
good.
Using this method, the utility of each good for each agent is just the bid:

$u_{ij} = b_{ij}$ for each agent $i \in N$ and good $j \in M$,

with $b_{ij}$ being the bid of agent $i$ for good $j$ and $u_{ij}$ its utility.
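As a small illustration of the setup behind this method, the following sketch derives the common budget and the lower bidding bounds from the market values of the house/motorbike example used in Sect. 2.2 below. It is only an illustration under our own naming, not the CREA implementation.

```python
# Illustrative sketch of the bids-method setup (market values borrowed from the
# house/motorbike example of Sect. 2.2); names and structure are ours, not CREA's.
market_value = {'house': 100_000, 'motorbike': 6_000}

budget = sum(market_value.values())                            # same budget for every agent
lower_bound = {g: 0.8 * v for g, v in market_value.items()}    # market price decreased by 20%

# Each agent i then submits bids b_ij with b_ij >= lower_bound[j] and
# sum_j b_ij <= budget; the utility is simply u_ij = b_ij.
print(budget)        # 106000
print(lower_bound)   # {'house': 80000.0, 'motorbike': 4800.0}
```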

2.2 The Ratings Method

• Each disputed good is valued at its market price.
• Each agent evaluates how much he/she would like to receive each good. The
evaluation can be expressed through a “1 to 5 stars” marking system (like the rating of
an Amazon product or of a restaurant on TripAdvisor). This evaluation
does not regard the monetary value of the good. For instance, an agent is involved
in the allocation of a house worth 100000 euros and a second-hand Harley-Davidson
motorbike worth 6000 euros. The house is worth more, but he already
owns a beautiful house and he knows that managing a house is time and money
consuming. On the other hand, he has always dreamt about riding that motorbike.
He will give 2 stars to the house and 5 to the bike.
  – Notice that assigning 5 stars to all items will not make you any better off than
assigning 1 star to all of them. What counts is the profile: you raise the chance of
getting the items you really want by assigning them a high mark, and by giving a
low mark to those you are not too interested in.
If we call $r_{ij} \in \{1, 2, 3, 4, 5\}$ the rating agent $i$ gives to good $j$, and $K > 1$ the
revaluation rate for each star (typically $K = 1.1$), the utility is given by:

$u_{ij} = K^{(r_{ij} - 3)}\, m_j$ for each agent $i \in N$ and good $j \in M$.
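To make the rating-to-utility mapping concrete, here is a small worked sketch using the house/motorbike ratings from the example above and the typical K = 1.1; it only evaluates the formula and is not project code.

```python
# Worked example of u_ij = K**(r_ij - 3) * m_j for one agent,
# using the house/motorbike ratings from the text and K = 1.1.
K = 1.1
market_value = {'house': 100_000, 'motorbike': 6_000}
rating       = {'house': 2,       'motorbike': 5}      # the agent's 1-5 star marks

utility = {g: K ** (rating[g] - 3) * market_value[g] for g in market_value}
print(utility)   # {'house': ~90909.1, 'motorbike': 7260.0}
```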

3 The Solutions

In this section we introduce the tools adopted for the implementation of the two
most widely adopted criteria: the Equitable Allocation and the Nash solution of the
Competitive Equilibrium. Our algorithms can be formulated as optimization problems,
since they have an objective function and one or more constraints; to implement an
optimization problem we must detect what kind of problem it is in order to choose the
right tool to solve it [25–27]. We will see that the kind of problem we are going to solve
is not always the same.

3.1 The Egalitarian Equivalent Allocation


The Egalitarian Equivalent Allocation, also called Equitable Allocation, was first
introduced by Pazner and Schmeidler (Egalitarian equivalent allocations: A new concept
of economic equity, Quarterly Journal of Economics, 1978). It makes sure that all
agents receive goods (or parts of them) such that the sum of the values of the goods
received, according to each agent's own bids, is the same for everyone, and this value is
as high as possible. If agents have different weights, equality is attained once the values
are weighted with the shares.
In order to obtain an egalitarian allocation, a solution must be found that maximizes
a comfort variable $t$ such that

$$\frac{\sum_{j \in M} u_{ij}\, z_{ij}}{w_i \sum_{j \in M} u_{ij}} \ge t \;\; \forall i \in N, \qquad \sum_{i \in N} z_{ij} = 1 \;\; \forall j \in M, \qquad z_{ij} \ge 0 \;\; \forall i \in N,\, j \in M$$

By construction, this solution is egalitarian. It turns out that this solution is also
efficient: no other allocation, even a non-egalitarian one, can make all agents better off
simultaneously. The allocation, however, has some problems: it is not resource
monotonic, it does not have responsive shares, and it suffers from domination. But the most
important problem is that the allocation may cause one or more agents to be envious of
the goods assigned to other agents. Let us consider an example: suppose there are 3
agents (I, II and III) and 3 divisible items (a, b and c). In the following table, we
describe the utility of each item for each player.

Table 1. Envy example utilities


a b c
I 40 30 30
II 30 40 30
III 10 50 40

The egalitarian solution assigns item a to agent I, item b to agent II and item c to
agent III – without splits. Each one gets a utility of 40. However, agent III is envious of
agent II, because he received item c (valued 40) but he would have preferred item b
(valued 50 to him) that was assigned to agent II.

3.2 The Competitive Equilibrium from Equal Income


The Competitive Equilibrium from Equal Income (CEEI), also known as the Nash
solution, is based on the studies of A. Bogomolnaia, H. Moulin, F. Sandomirskiy and
E. Yanovskaya (Competitive division of a mixed manna, Econometrica).
Supposing each agent is given the same budget (weighted by their entitlement), the
Competitive Equilibrium from Equal Income (CEEI) solution is reached if goods are
bought such that: (a) each agent, independently of the others, makes the best choice:
given the budget, he buys the goods that maximize his own satisfaction; and (b) all goods
are sold, with no overlaps (for instance, two agents buying the same good in its entirety)
and no leftovers (no good remains unsold).
The following is the objective function of this solution, which was introduced by
J. F. Nash (The Bargaining Problem. Econometrica, 1950) and thus takes the name of
Nash solution:
$$\max \prod_{i \in N} \Big( \sum_{j \in M} u_{ij}\, z_{ij} \Big)^{w_i}$$

This solution is Resource Monotonic (more goods to divide should not be bad news
for anyone), it has Responsive Shares (if an agent raises the bid on a certain good, he
cannot end up with a smaller share of that good) and it does not suffer from domination.
Most importantly, it is envy-free, although it is usually not egalitarian. Let us consider
again three agents sharing three divisible items (see Table 1). The CEEI/Nash solution
prescribes that agent I gets item a in its entirety, agent II gets 9/10 of item b, and agent III
gets item c and 1/10 of item b. Agent III will not be envious anymore, because in his
evaluation item c in its entirety plus 1/10 of item b is no less than (actually it equals) 9/10 of
item b.
In his evaluation, both bundles are worth 45. In terms of utility, we note that the utility of
agent I is 40, the utility of agent II is 36 and the utility of agent III is 45; therefore the
allocation is not equitable. But here the agents modify the utilities depending on their likes
and dislikes of the goods, and the utility no longer directly represents money.
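The comparison between the two criteria on the Table 1 data can be verified with a few lines of Python. The allocations below are the ones described in the text; the script only recomputes the utilities received by each agent (and the Nash product) for each allocation.

```python
# Quick numeric check (Table 1 utilities) of the egalitarian allocation
# (each agent gets "their" item) versus the CEEI/Nash allocation described
# above (I: a, II: 9/10 of b, III: c + 1/10 of b).
import numpy as np

U = np.array([[40, 30, 30], [30, 40, 30], [10, 50, 40]], dtype=float)

Z_egal = np.eye(3)                                     # a->I, b->II, c->III
Z_nash = np.array([[1, 0, 0], [0, 0.9, 0], [0, 0.1, 1]])

for name, Z in [('egalitarian', Z_egal), ('CEEI/Nash', Z_nash)]:
    values = (U * Z).sum(axis=1)                       # utility each agent receives
    print(name, values, 'Nash product:', np.prod(values))
# egalitarian -> [40. 40. 40.], CEEI/Nash -> [40. 36. 45.]
```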

3.3 Constraints
The maximization problems we are considering are rather flexible and can incorporate
other constraints to better suit the division problem: we will see, as examples,
indivisible goods and restricted assignments.
A good is indivisible if it cannot be split and must be assigned to only one agent in
its entirety. If a good j is indivisible, we impose that

$z_{ij} = 0$ or $z_{ij} = 1$ for every $i \in N$.

A restricted assignment occurs if legislation requires a good to be assigned to a
specific agent or to a subset of agents. If we call $N' \subseteq N$ this subset and $j$ the good with
the restricted assignment, we impose that:

$$\sum_{i \in N'} z_{ij} = 1$$

4 Implementation Tools

4.1 Google OR-Tools


Google OR-Tools is open source software for combinatorial optimization, which seeks
to find the best solution to a problem out of a very large set of possible solutions.
Essentially, it is a set of libraries with classes and methods useful for solving optimization
problems [26, 27]. The main class provided is the Solver class, which lets us define the
problem and then solves it; there are different solvers for different types of problems:
Linear Programming, Integer Programming, Mixed-Integer Programming, Constraint
Optimization, etc. In our case, OR-Tools was used to implement the Equitable Allocation
algorithm: although our base problem is a Linear Programming one, the possibility of
having indivisible goods can turn it into an Integer Programming problem, so the
Mixed-Integer Programming (MIP) solver was used to deal with both cases.
To define an optimization problem with OR-Tools we must first declare its variables
and their types (e.g. continuous or integer); then we define the constraints, i.e. the
equations (expressed as sums of variables and products by constants) that constitute the
problem and the relations between the variables; finally, we indicate the objective
function and whether the problem is a maximization or a minimization one. At this
point we can make the solver solve the problem and then display the solution.
Let’s see how those steps were translated in our problem: the variables are the
elements of the Z matrix, i.e. the share of each good for each agent (which can be
Optimization Algorithms and Tools Applied in Agreements Negotiation 555

between 0 and 1), plus the convenient variable t (which is positive); all the variables are
continuous except for the ones associated with indivisible goods (because they can be
either 0 or 1). There are three groups of constraints in our problem: the first constraints
are that the sum of the shares for each good must be exactly 1; then we have the
constraints that incorporate the problem’s main inequality, i.e. for each agent the sum
of every goods share multiplied by a coefficient (previously calculated) must be higher
than t; the last group of constraints concerns the restricted assignments, as the solutions
relative to them are predetermined. Our goal is to maximize t in our system, so the
objective function is just t and the problem is a maximization one [15–20].
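The following sketch shows how the steps just described could look with the OR-Tools Python wrapper (pywraplp), using the Table 1 utilities and equal shares. It is a minimal illustration under our own naming, not the CREA source code, and the SCIP backend is only one possible MIP solver choice.

```python
# Minimal sketch of the Equitable Allocation model with OR-Tools (pywraplp).
# Assumptions: Table 1 utilities, equal shares w_i = 1/3, all goods divisible
# unless listed in `indivisible`. Names are illustrative, not CREA code.
from ortools.linear_solver import pywraplp

U = [[40, 30, 30], [30, 40, 30], [10, 50, 40]]   # u_ij
w = [1 / 3, 1 / 3, 1 / 3]                        # entitlement shares
n, m = len(U), len(U[0])
indivisible = set()                              # e.g. {0} would force good a to a single agent

solver = pywraplp.Solver.CreateSolver('SCIP')    # a MIP backend covers both LP and IP cases

# z[i][j]: share of good j given to agent i; t: the "comfort" variable to maximize
z = [[solver.IntVar(0, 1, f'z_{i}_{j}') if j in indivisible
      else solver.NumVar(0, 1, f'z_{i}_{j}')
      for j in range(m)] for i in range(n)]
t = solver.NumVar(0, solver.infinity(), 't')

for j in range(m):                               # each good is fully allocated
    solver.Add(sum(z[i][j] for i in range(n)) == 1)

for i in range(n):                               # normalized value of every agent is at least t
    total_i = sum(U[i])
    solver.Add(sum(U[i][j] * z[i][j] for j in range(m)) >= w[i] * total_i * t)

# A restricted assignment (Sect. 3.3) would simply add, for the allowed subset N':
#   solver.Add(sum(z[i][j0] for i in N_prime) == 1)

solver.Maximize(t)
if solver.Solve() == pywraplp.Solver.OPTIMAL:
    for i in range(n):
        print([round(z[i][j].solution_value(), 2) for j in range(m)])
```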

4.2 CSNumerics
CSNumerics is a Portable Class Library (PCL) of various numerical algorithms written
in C#. It includes the implementation of three optimization algorithms by
Michael J. D. Powell: BOBYQA for minimizing a nonlinear objective function subject
to variable bounds, LINCOA for minimizing a nonlinear objective function subject to
linear constraints, and COBYLA for minimizing a nonlinear objective function subject
to nonlinear constraints.
In our project CSNumerics was used to implement the Nash allocation algorithm,
because its objective function is non-convex, being a product of variables,
and thus cannot be implemented using OR-Tools (which only supports sums of
variables and multiplications by constants). In particular, the COBYLA (Constrained
Optimization BY Linear Approximation) algorithm was used; it only lets us define
constraints in the form of inequalities and it only solves minimization problems, but we
will see how to bypass those issues later. Setting up the problem is much the same as
with OR-Tools, but this time the objective function and the constraints are defined
within a procedure that is called by the constructor of the solver [21–24].
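Since CSNumerics targets C#, the same COBYLA-based formulation can be sketched in Python with SciPy's COBYLA implementation instead: the product objective is minimized as a negative weighted log of the Nash product, each equality is split into a pair of inequalities, and non-negativity is expressed as inequality constraints, mirroring the limitations described above. This is an equivalent sketch under our own assumptions (Table 1 utilities, equal weights), not the project's code.

```python
# Sketch of the Nash/CEEI objective solved with SciPy's COBYLA (in place of
# CSNumerics): minimize the negative weighted log of the Nash product, with
# equalities rewritten as pairs of inequalities. Illustrative only.
import numpy as np
from scipy.optimize import minimize

U = np.array([[40, 30, 30], [30, 40, 30], [10, 50, 40]], dtype=float)
n, m = U.shape
w = np.full(n, 1.0 / n)                      # equal entitlement shares

def neg_log_nash(x):
    Z = x.reshape(n, m)
    values = (U * Z).sum(axis=1)             # utility received by each agent
    return -np.sum(w * np.log(np.maximum(values, 1e-9)))

cons = []
for j in range(m):                           # sum_i z_ij = 1, as two inequalities
    cons.append({'type': 'ineq', 'fun': lambda x, j=j: 1.0 - x.reshape(n, m)[:, j].sum()})
    cons.append({'type': 'ineq', 'fun': lambda x, j=j: x.reshape(n, m)[:, j].sum() - 1.0})
for k in range(n * m):                       # z_ij >= 0
    cons.append({'type': 'ineq', 'fun': lambda x, k=k: x[k]})

x0 = np.full(n * m, 1.0 / n)                 # start from an even split
res = minimize(neg_log_nash, x0, method='COBYLA', constraints=cons,
               options={'maxiter': 20000, 'rhobeg': 0.2})
print(res.x.reshape(n, m).round(2))          # should approach the CEEI allocation above
```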

5 Conclusion

In this paper, we focused on the algorithms used for resolving disputes that involve a
division of goods between agents, e.g. inheritance, divorces and company law. We
evaluated the advantage of the Equitable Allocation, namely its equity in monetary terms,
although in most cases it is more important not to generate envy among the parties;
moreover, the Nash solution tends to avoid splitting goods. For these reasons (and the
ones discussed above) the Nash solution is closer to the way such cases are dealt
with in ordinary proceedings.
Keep in mind that the solution displayed by the program should be taken only as
advice, and it is always the mediator's decision whether to use it or not. The solution
could therefore be adjusted: e.g. for the goods that, according to the solution, must be
split, negotiations should take place. A negotiation for each such good should begin,
involving only the players entitled to some fraction of that good.
Acknowledgments. This paper has been produced with the financial support of the Justice
Programme of the European Union, 766463 CREA. The contents of this report are the sole
responsibility of the authors and can in no way be taken to reflect the views of the European
Commission.

References
1. Russell, S., Norvig, P.: Artificial Intelligence: A modern approach, 3rd edn. (2009)
2. Audiconsum. CREA Project. https://www.adiconsum.it/al-tuo-fianco/iniziative/crea-conflict-
resolution-with-equitative-algorithms/. Accessed 02 Feb 2019
3. CREA: CREA Project. http://www.crea-project.eu. Accessed 02 Feb 2019
4. CREA: Commission, European. Grant Agreement number - 766463 (2017)
5. Kumar, V.: Algorithms for constraint-satisfaction problems: a survey. AI Mag. 13(1), 32
(1992)
6. Pazner, E.A., Schmeidler, D.: Egalitarian equivalent allocations: a new concept of economic
equity. Q. J. Econ. 92, 671–687 (1978)
7. Nash, J.F.: The Bargaining Problem. Econometrica 18, 155–162 (1950)
8. Bogomolnaia, A., Moulin, H., Sandomirskiy, F., Yanovskaya, E.: Competitive division of a
mixed manna. Econometrica 85, 1847–1871 (2017)
9. Buchanan, B.G., Headrick, T.E.: Some speculation about artificial intelligence and legal
reasoning. Stan. Law Rev. 23, 40–62 (1970)
10. AI vs Lawyer: LawGeex, February 2018. https://www.lawgeex.com/resources/aivslawyer/
11. European Commission: Effective justice the 2018 EU Justice Scoreboard. https://ec.europa.eu
12. Chianese, A., Marulli, F., Piccialli, F.: Cultural heritage and social pulse: a semantic
approach for CH sensitivity discovery in social media data. In: IEEE Tenth International
Conference on Semantic Computing (ICSC), pp. 459–464 (2016)
13. Amato, F., Moscato, V., Picariello, A., Piccialli, F., Sperlí, G.: Centrality in heterogeneous
social networks for lurkers detection: an approach based on hypergraphs. Concurrency
Comput.: Pract. Exp. 30(3), e4188 (2018)
14. Hussain, S., Keung, J., Khan, A.A., Ahmad, A., Cuomo, S., Piccialli, F., Jeon, G.,
Akhunzada, A.: Implications of deep learning for the automation of design patterns
organization. J. Parallel Distrib. Comput. 117, 256–266 (2018)
15. Coppolino, L., D’Antonio, S., Mazzeo, G., Romano, L., Sgaglione, L.: Exploiting new CPU
extensions for secure exchange of eHealth data at the EU level. In: 14th European
Dependable Computing Conference (EDCC), pp. 17–24, Iasi (2018). https://doi.org/10.
1109/edcc.2018.00015 (2018)
16. Coppolino, L., D'Antonio, S., Mazzeo, G., Romano, L.: A comparative analysis of emerging
approaches for securing Java software with Intel SGX. Future Gener. Comput. Syst. 97,
620–633 (2019). ISSN 0167-739X. https://doi.org/10.1016/j.future.2019.03.018
17. Mazzeo, G., Coppolino, L., D’Antonio, S., Mazzariello, C., Romano, L.: SIL2 assessment of
an active/standby COTS-based safety-related system. Reliab. Eng. Syst. Saf. 176, 125–134
(2018). ISSN 0951- 8320. https://doi.org/10.1016/j.ress.2018.04.009
18. Cilardo, A., Barbareschi, M., Mazzeo, A.: Secure distribution infrastructure for hardware
digital contents. IET Comput. Digit. Tech. 8(6), 300–310 (2014)
19. Amelino, D., Barbareschi, M., Cilardo, A.: An IP core remote anonymous activation
protocol. IEEE Trans. Emerg. Top. Comput. 6(2), 258–268 (2016)
20. Cilardo, A., et al.: An FPGA-based key-store for improving the dependability of security
services. In: 10th IEEE International Workshop on Object-Oriented Real-Time Dependable
Systems. IEEE (2005)
21. Amato, F., Moscato, F., Moscato, V., Colace, F.: Improving security in cloud by formal
modeling of IaaS resources. Future Gener. Comput. Syst. 87, 754–764 (2018). https://doi.
org/10.1016/j.future.2017.08.016. Elsevier B.V
22. Di Lorenzo, G., Mazzocca, N., Moscato, F., Vittorini, V.: Towards semantics driven
generation of executable web services compositions. J. Softw. 2(5), 1–15 (2007). https://doi.
org/10.4304/jsw.5.1.1-15
23. Moscato, F., Aversa, R., Di Martino, B., Rak, M., Venticinque, S., Petcu, D.: An ontology
for the cloud in mOSAIC. In: Cloud Computing: Methodology, Systems, and Applications,
pp. 467–485 (2017). https://doi.org/10.1201/b11149
24. Aversa, R., Di Martino, B., Moscato, F.: Critical systems verification in MetaMORP(h)OSY.
In: Lecture Notes in Computer Science (including Lecture Notes in Artificial Intelligence and
Lecture Notes in Bioinformatics), vol. 8696, pp. 119–129. Springer. https://doi.org/10.1007/978-3-319-10557-4_15
25. Albanese, M., Erbacher, R.F., Jajodia, S., Molinaro, C., Persia, F., Picariello, A.,
Subrahmanian, V.S.: Recognizing unexplained behavior in network traffic. In: Network
Science and Cybersecurity, pp. 39–62. Springer, New York (2014)
26. Amato, F., Cozzolino, G., Moscato, V., Moscato, F.: Analyse digital forensic evidences
through a semantic-based methodology and NLP techniques. Future Gener. Comput. Syst.
98, 297–307 (2019)
27. Cozzolino, G.: Using semantic tools to represent data extracted from mobile devices. In:
Proceedings - 2018 19th International Conference on Information Reuse and Integration for
Data Science, IRI 2018, vol. 8424755, pp. 530–536. IEEE (2018)
A Configurable Implementation
of the SHA-256 Hash Function

Raffaele Martino and Alessandro Cilardo(B)

Department of Electrical Engineering and Information Technology,


University of Naples Federico II, Naples, Italy
{raffaele.martino2,acilardo}@unina.it

Abstract. This paper proposes a hardware solution for the SHA-256


hash function offering a number of configurable architecture-level fea-
tures. This flexibility allows for exploring various trade-offs between per-
formance, area occupation, and power consumption. As confirmed by
the experimental results, the approach succeeds in exposing the effects
of different architectural configurations on the resulting implementation.

1 Introduction
SHA-256 is one of the most widely adopted cryptographic hash functions nowa-
days. Its applications include, for example, Hash-Based Message Authentication
Code [11] and the Digital Signature Algorithm [12], of relevance in security-
critical areas such as the Internet of Things, finance, and cloud computing [2–
4,10]. Because of the stringent requirements posed by these areas of applica-
tion, hardware acceleration of SHA-256 is often desirable. The quest for efficient
SHA-256 implementations has been further pushed by the explosion of Bitcoin,
employing SHA-256 at the heart of its mining protocol [10].
The SHA-256 hash function is defined by the National Institute of Standards
and Technology (NIST). It was first introduced in 2001 and is currently in use
as a secure hash algorithm [13]. SHA-256 is a block-based hash function, meaning
that the variable-length message to be hashed is split into fixed-length Padded
Data Blocks (PDBs), and each PDB is processed through the hash algorithm.
The hash algorithm itself applies a compression function to a state variable for
a predefined number of iterations, or rounds, while the input PDB is expanded to
produce a key for each round of the compressor function. A detailed description
of the SHA-256 function can be found in the standard [14].
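For readers unfamiliar with the two halves of the algorithm, the following short Python reference (derived from FIPS 180-4 [14], not from the proposed VHDL) shows what the Expander computes for one 512-bit PDB: the 16 input words are stretched into the 64 schedule words consumed, one per round, by the Compressor.

```python
# Python reference of the SHA-256 message-schedule expansion (FIPS 180-4):
# 64 32-bit words W_t derived from the 16 words of one padded data block.
def rotr(x, n, width=32):
    return ((x >> n) | (x << (width - n))) & 0xFFFFFFFF

def expand(block_words):
    """block_words: list of 16 32-bit integers (one PDB)."""
    W = list(block_words)
    for t in range(16, 64):
        s0 = rotr(W[t - 15], 7) ^ rotr(W[t - 15], 18) ^ (W[t - 15] >> 3)
        s1 = rotr(W[t - 2], 17) ^ rotr(W[t - 2], 19) ^ (W[t - 2] >> 10)
        W.append((W[t - 16] + s0 + W[t - 7] + s1) & 0xFFFFFFFF)
    return W
```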
A direct hardware implementation of SHA-256 is presented in [15], while a
large number of optimizations have been proposed, including variable precom-
putation [1], pipelining [5–9], loop unrolling [8,9], and resource reordering [7–9].
This paper presents a hardware accelerator of SHA-256 which is config-
urable with respect to several architectural parameters. Relying on its flexibility,
the proposed accelerator can be adapted to different application requirements.
Like [1], we present a configurable implementation aimed at comparing differ-
ent alternatives for the implementation of the round function. Nevertheless, the
implementation proposed in this work offers more degrees of freedom than
the one in [1], as it also provides support for pipelined and unrolled configurations
as well as configurations with resource reordering.
The rest of the paper is organized as follows. Section 2 presents the proposed
architecture. Experimental results are discussed in Sect. 3. Section 4 concludes
the paper with a few final remarks.

2 Proposed Implementation
The SHA-256 implementation proposed in this paper takes as input a full PDB
and produces as output the corresponding hash value. Figure 1 shows the data
path of the proposed architecture.

Fig. 1. Top level entity of the configurable SHA-256 accelerator

When pipelining is enabled, several PDBs belonging to different messages
can be processed simultaneously. It is not possible to process simultaneously
PDBs of the same multi-block message, since it is necessary to wait until the
end of the processing of PDB j in order to process PDB j + 1, according to the
standard [14]. It is the responsibility of the external system to perform the chaining
sum.

2.1 Data Path

The architecture is composed of two parallel pipelines, one for the Compressor
and one for the Expander. For each stage of the Compressor pipeline, there is
also an associated ROM containing the values of the constants K relevant for
that stage, as required by the standard [14].
2.1.1 Compressor
The round registers, clocked by the external base clock, are also employed as
pipeline registers. The two roles are selected by a multiplexer, placed
before the register within the round and driven by a major cycle signal. The
major cycle signal is produced by the round counter, which also outputs the
address input for the ROMs.
The compressor pipeline registers contain the 8 working variables and a validity
flag, which is set during the first stage and carried up to the output, to signal
that the value of the output hash register is valid.

2.1.2 Expander
Within the Expander, the round registers work as 16-position word-wide shift
registers during the stage, commuting to parallel registers when the major cycle
signal is asserted, again by means of a multiplexer array. Since the last shift of
the stage works with the major cycle asserted, it is not recorded by the shift
register. Instead, it must be captured by properly rearranging the connection
with the register of the following stage, as shown in Fig. 2.

Fig. 2. Expander architecture with stage chaining

To perform unrolling, the shift register chain of each Expander stage is split
into a number of chains equal to the unrolling factor, as shown in Fig. 3, since
that number of expanded words W must be generated at each clock cycle. Words
are distributed among the split chains cyclically with respect to their positions
within the original chain.
According to the SHA-256 standard [14], the 16 initial words forming the
input message are in big-endian order, causing a reverse sorting of the input
message, which must be taken in little-endian order. This makes it necessary to
reverse the input of the Expander, and this reversal must in turn be taken into
account when splitting the Expander into stages.
Fig. 3. Expander architecture, unrolled by a factor of 4

2.2 Control Unit

Basically, the Control Unit should only keep the round counter enabled during
the computation and reset it at the end. Hence, in principle, it should be made
only of two states, idle and compute.
An additional last stages state is added to flush the pipeline when no new
PDBs are provided to the core for hashing. Since the number of pipeline stages is
configurable, it is not viable to employ an FSM state per pipeline stage. Instead,
a stage counter is employed, fed by the major clock cycle and enabled during
the last stages state. As for the round counter, the counting value of the stage
counter can be computed from the values of the generic parameters of the
design. When the round counter signals that the pipeline is fully flushed, the
FSM can go into the idle state.
The Control Unit is also responsible for properly driving the timing of the cir-
cuitry. Due to the presence of the multiplexers, it is possible to load the pipeline
only when the major cycle signal is asserted, but since upon reset the major
cycle signal is cleared, the very first major cycle would be lost. To avoid this, the
Control Unit introduces two control signals, which enable the Compressor and
the Expander respectively to receive incoming data also during the very first
major cycle. This is done during an additional stage, first load, as shown in
Fig. 4a.
In this version of the FSM, the control signals for initialising the Compressor
and the Expander pipeline are actually the same signal. However, depending on
the chosen architecture of the transformation round, the data path may incur
Fig. 4. FSM of the Control Unit. a with the Compressor and the Expander aligned. b
with the Expander moved ahead

a further timing issue, related to the constant ROM. For handling such cases,
the Control Unit provides an additional stage, second load, which delays only
the Compressor pipeline by one clock cycle, keeping it from accepting inputs.
To work correctly, the Expander must instead start working, hence the need of
differentiating the two control signals. The modified FSM is shown in Fig. 4b,
whilst Sect. 2.3 discusses the conditions on the round architecture under which
the modified FSM must be instantiated.

2.3 Reconfigurable Aspects Controlled by Source-Level Parameters


The following characteristics of the architecture can be reconfigured by set-
ting the corresponding generic parameter in the hardware description language
(HDL) code, namely VHDL code:

Number of Pipeline Stages. The number of pipeline stages can be set directly
as the value of the PIPELINE STAGES parameter (set to 1 to disable pipelin-
ing). This value is also the number of PDBs which can be processed at the
same time by the architecture. The number of pipeline stages must divide 64
(this constraint is checked by an assert in the code).
Unrolling Factor. The unrolling factor of the design must be set as the value
of the UNROLLING FACTOR parameter (set to 1 to disable unrolling). Note
that this value must be set consistently with the unrolling factor of the
selected internal transformation round, otherwise the architecture will not
work. Moreover, the unrolling factor must divide 64 (the latter constraint is
checked by an assert).
Timing. The boolean parameter FIX TIME allows for reconfiguring the timing of
the part of the design which provides K and W constants, as required by the
standard [14]. Due to the fact that the K ROM takes as input the counter
value, there is a one-cycle delay between the counter increment and the cor-
responding value of the constant. If the pipeline register is placed before any
use of the K constants, as it happens in the Naive transformation round
architecture, the pipeline register itself compensates for the delay and hence
no further action is required, so FIX TIME must be set to false. Otherwise,
if the value of K is used before the pipeline registers, as in transformation
round architectures which make use of the precomputation technique, we


need to introduce a one-cycle delay to align the value of K with the other
operands. This is done by the Control Unit as described in Sect. 2.2, and
this additional clock cycle impacts only the first computation, hence not
affecting steady-state throughput. To instantiate the appropriate FSM for
architectures which employ precomputation, FIX TIME must be set to true.
When this is the case, the major cycle of the Compressor is delayed by one
clock cycle, by means of a flip-flop, in order to keep the other stages aligned
with the first stage, which is the only one directly fixed by the Control Unit.
Final Sum. If the FINAL SUM boolean parameter is set to true, an additional
register is placed just before the final sum, actually resulting in an additional
stage, which is not included in the PIPELINE STAGES figure. To minimise
latency, since this additional stage requires only one clock cycle, the output
register of the circuit, which in this case is also the output register of this
additional stage, is no longer clocked with the major cycle signal, but with a
one-cycle-delayed version of it. Due to the pipelined architecture, throughput
is not affected by the presence of the final stage, even if PIPELINE STAGES is
set to 1.
If the architecture employs spatial reordering and at least one adder is placed
before the pipeline register, however, it is not profitable to add the final stage,
hence FINAL SUM can be set to false. When this is the case, the output of
the last stage is directly fed into the adders.
It is worth noting that, when an adder is placed before the registers, it is
most likely the one which performs the H + W sum. As a consequence,
an architecture which sets FINAL SUM to false usually also needs to set
FIX TIME to true. However, we choose to keep these two parameters distinct
in order to provide more flexibility for new optimised implementations of the
transformation round.

2.4 Reconfigurable Aspects Controlled by Component Declarations


The architecture of the Compressor pipeline stage component can be spec-
ified by a VHDL configuration declaration in order to configure the trans-
formation round. Alternatives are implemented as different architectures of the
Transf round entity.
The following transformation round architectures are provided, where UF
stands for Unrolling Factor:
Naive. A straightforward implementation of the transformation round, with the
pipeline register placed before all the combinatorial parts.
Precomputed UF1. An implementation of the round function with precomputa-
tion, presented in [1].
Reordering UF1. An architecture with precomputation and spatial reordering,
presented in [7].
When using the Naive transformation round core, the combinatorial part can
be further customised by specifying an architecture for the Transf round comb
component, which must implement the round function. The following architec-
tures for this component are provided:

Naive. A straightforward implementation of the round function.


Unrolled. An implementation of the round function unrolled by a factor 4.

An optimisation of the transformation round which is entirely combinatorial


should be implemented as an architecture of the Transf round core entity for
use within the Naive architecture of the Transf round entity. This way the tim-
ing issue described in Sect. 2.3 is avoided. On the other hand, if an optimisation
involves the pipeline register, this must be implemented as an architecture for
the Transf round VHDL entity.

2.5 Discussion
The main aim of our work is to provide a fully configurable design solution to
be used to explore and evaluate existing and possibly new architectures. Most of
the previous literature proposals for SHA-256 can in fact be seen as a particular
instance of the configurable solution presented here. Only architectures that
completely redefine the data path of the SHA-256 core, such as [6], are not
suitable to be described by the framework proposed in this paper. Furthermore,
a few contributions that take advantage of the particular application of the SHA-
256 function to enhance the performance of the overall system, such as [9], also
fall outside the scope of this work. Such architectures can be integrated into our
framework for the part regarding the SHA-256 design, but they cannot exploit
the benefits of the particular application. This is an intended goal of this work,
which aims to make it easier and fairer to compare SHA-256 implementations
on their own.

3 Experimental Results
The proposed architecture has been captured in VHDL. Several configuration
alternatives have been synthesized, placed and routed with the Xilinx Vivado
IDE, in order to validate the configurable implementation and to assess the
different tradeoffs brought by each configuration. Table 1 lists the different con-
figurations that have been evaluated.
The target device for this experiment is the Kintex UltraScale+ XCKU5P
FPGA [16], and the results are shown in Table 2.
Despite the fact that they are unable to handle multiple PDBs of a single
message, pipelined configurations offered the highest values of the hash rate.
It is worth recalling, however, that this improvement can be exploited only in
contexts where different messages need to be hashed. If there are not enough
messages to feed a pipelined architecture, a Naive configuration with unrolling
provides the best performance.
The Naive configuration with pipelining and unrolling showed not only the
highest hash rate, but also the best power efficiency. Nevertheless, the best area
Table 1. Detail of the explored architectures

No | Core type       | PIPELINE STAGES | UNROLLING FACTOR | FIX TIME | FINAL SUM AS STAGE
1  | Naive           | 1               | 1                | false    | true
2  | Naive           | 1               | 4                | false    | true
3  | Naive           | 4               | 1                | false    | true
4  | Naive           | 4               | 4                | false    | true
5  | Precomputed UF1 | 1               | 1                | true     | true
6  | Precomputed UF1 | 4               | 1                | true     | true
7  | Reordered UF1   | 1               | 1                | true     | false
8  | Reordered UF1   | 4               | 1                | true     | false

Table 2. SHA-256 implementation results on the Kintex FPGA. Area efficiency is expressed in Mbps/LUT, while power efficiency is expressed in Mbps/mW.

No | Critical path delay (ns) | Hash rate (Mhash/s) | LUT  | FF   | Static (W) | Dynamic (W) | Total (W) | Area eff. | Power eff.
1  | 2.398 | 6.516  | 1578 | 1875 | 0.451 | 0.122 | 0.573 | 1.057 | 2.911
2  | 7.750 | 8.065  | 2793 | 1960 | 0.451 | 0.099 | 0.550 | 0.739 | 3.754
3  | 2.745 | 22.769 | 5314 | 4302 | 0.453 | 0.412 | 0.865 | 1.097 | 6.738
4  | 9.012 | 27.741 | 9316 | 4188 | 0.453 | 0.347 | 0.800 | 1.097 | 8.877
5  | 3.004 | 5.201  | 1619 | 1866 | 0.451 | 0.115 | 0.566 | 0.822 | 2.353
6  | 3.050 | 20.492 | 5385 | 4314 | 0.454 | 0.430 | 0.883 | 0.974 | 5.941
7  | 2.297 | 6.802  | 1485 | 1640 | 0.451 | 0.131 | 0.582 | 1.173 | 2.992
8  | 2.646 | 23.621 | 4986 | 4312 | 0.454 | 0.434 | 0.887 | 1.213 | 6.817

efficiency was obtained with the Reordered UF1 transformation round, in the
pipelined configuration. The Reordered UF1 transformation round should there-
fore be preferred in area-constrained contexts, given also that its nonpipelined
configuration reached the least overall area occupation.
Interestingly, the unrolled configurations reached the best power efficiency,
both amongst the pipelined configurations and the nonpipelined configurations.

4 Conclusions
Architecture exploration is crucial for the effective evaluation and comparison of
hardware acceleration solutions. This is especially true of SHA-256 implementa-
tions, which provide a variety of different design options. In this work, a config-
urable implementation of the SHA-256 hash function was presented, providing
a superset of most literature solutions and allowing the effective exploration of
various implementation trade-offs. As future work, we will further improve the
framework flexibility, in order to support more configurations. Namely, we will

consider supporting implementations that require passing additional working
variables between pipeline stages, such as [8]. Moreover, work will be done to
extend the proposed implementation to support the other algorithms of the SHA-2
family through a configuration option.

Acknowledgements. The activities described in this article received funding from


the European Union’s Horizon 2020 research and innovation programme under the
FETHPC grant agreement RECIPE no. 801137.

References
1. Algredo-Badillo, I., Feregrino-Uribe, C., Cumplido, R., Morales-Sandoval, M.:
FPGA-based implementation alternatives for the inner loop of the secure hash
algorithm SHA-256. Microprocess. Microsyst. 37, 750–757 (2013). https://doi.org/
10.1016/j.micpro.2012.06.007
2. Amato, F., Moscato, F.: A model driven approach to data privacy verification in
e-health systems. Trans. Data Priv. 8(3), 273–296 (2015)
3. Amato, F., Mazzocca, N., Moscato, F.: Model driven design and evaluation of
security level in orchestrated cloud services. J. Netw. Comput. Appl. 106, 78–89
(2018). https://doi.org/10.1016/j.jnca.2017.12.006
4. Amato, F., Moscato, F., Moscato, V., Colace, F.: Improving security in cloud by
formal modeling of IaaS resources. Future Gener. Comput. Syst. 87, 754–764 (2018)
5. Dadda, L., Macchetti, M., Owen, J.: An ASIC design for a high speed implemen-
tation of the hash function SHA-256 (384, 512). In: GLSVLSI 2004 - 14th ACM
Great Lakes Symposium on VLSI, pp. 421–425 (2004). https://doi.org/10.1145/
988952.989053
6. Macchetti, M., Dadda, L.: Quasi-pipelined hash circuits. In: ARITH 2005 - 17th
IEEE Symposium on Computer Arithmetic, pp. 222–229 (2005). https://doi.org/
10.1109/ARITH.2005.36
7. Michail, H., Milidonis, A., Kakarountas, A., Goutis, C.: Novel high throughput
implementation of SHA-256 hash function through pre-computation technique. In:
ICECS 2005 - 12th IEEE International Conference on Electronics, Circuits, and
Systems (2005). https://doi.org/10.1109/ICECS.2005.4633433
8. Michail, H.E., Kakarountas, A., Milidonis, A., Goutis, C.: A top-down design
methodology for ultrahigh-performance hashing cores. IEEE Trans. Dependable
Secure Comput. 6, 255–268 (2009). https://doi.org/10.1109/TDSC.2008.15
9. Michail, H.E., Athanasiou, G.S., Kelefouras, V., Theodoridis, G., Goutis, C.E.:
On the exploitation of a high-throughput SHA-256 FPGA design for HMAC.
ACM Trans. Reconfigurable Technol. Syst. 5(1), 2 (2012). https://doi.org/10.1145/
2133352.2133354
10. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system (2008). https://
bitcoin.org/bitcoin.pdf
11. National Institute of Standards and Technology: The keyed-hash message authenti-
cation code (HMAC). FIPS 198-1, U.S. Department of Commerce (2008). https://
doi.org/10.6028/NIST.FIPS.198-1
12. National Institute of Standards and Technology: Digital signature standard (DSS).
FIPS 186-4, U.S. Department of Commerce (2013). https://doi.org/10.6028/NIST.
FIPS.186-4

13. National Institute of Standards and Technology: NIST policy on hash func-
tions (2015). https://csrc.nist.gov/projects/hash-functions/nist-policy-on-hash-
functions
14. National Institute of Standards and Technology: Secure hash standard (SHS). FIPS
180-4, U.S. Department of Commerce (2015). https://doi.org/10.6028/NIST.FIPS.
180-4
15. Sklavos, N., Koufopavlou, O.: On the hardware implementations of the SHA-2 (256,
384, 512) hash functions. In: ISCAS 2003 - 36th IEEE International Symposium
on Circuits and Systems (2003). https://doi.org/10.1109/ISCAS.2003.1206214
16. Xilinx Inc.: Kintex Ultrascale+ FPGA product brief (2016). https://www.xilinx.
com/support/documentation/product-briefs/kintex-ultrascale-plus-product-brief.
pdf
A Blockchain Based Incentive Mechanism
for Crowd Sensing Network

Zainib Noshad, Atia Javaid, Maheen Zahid, Ishtiaq Ali, Raja Jalees ul Hussen Khan, and Nadeem Javaid(B)

COMSATS University Islamabad, Islamabad, Pakistan
zainabnoshad@yahoo.com, atiajavaid477@gmail.com, maheen.zahid2017@gmail.com,
ishiali503@gmail.com, jalees106@gmail.com, nadeemjavaidqau@gmail.com

Abstract. A Crowd Sensing Network (CSN) uses sensor-embedded mobile
phones to collect data for some specific task, which can effectively save cost
and time. The quality of the collected data depends on the participation level
of all entities of the CSN, i.e., the service provider, the service consumers and
the data collectors. In contrast to traditional centralized incentive mechanisms
devised for CSNs, we propose a decentralized system model where incentives
are used to stimulate the involvement of data collectors and motivate
participants to join the network. Moreover, the issue of privacy leakage is
tackled by using the AES128 technique. Furthermore, the system is evaluated
by analyzing the gas consumption of all the smart contracts, whereas the
encryption technique is validated by comparing its execution time with that of
the methods used in the base paper.

Keywords: Crowd Sensing Network · Blockchain · Encryption · Service consumer · Gas consumption

1 Introduction
With the proliferation of emerging technologies such as smartphones, smartwatches,
wearable devices, and smart glasses with embedded sensors, it has become possible
to sense raw data from the environment at a very high rate. This is apparently the
new trend in the market [1]. This advance in technology has enabled many
applications to collect data from a crowd through a Mobile Crowd Sensing Network
(MCSN). This sensing paradigm works by outsourcing the acquisition of sensory
data to a crowd of volunteering users, called data collectors or crowd workers [2].
Their aim is to complete a task which is broadcast by a service provider or requester
and, in return, they are compensated with a fair share for their efforts and spent
resources. This approach is adapted from the traditional mechanism referred to as a
win-win situation: it gives both parties, i.e., client and server, a chance to collaborate,
and all the involved participants work together on a conflict and come up with a
mutually beneficial solution.

From a commercial point of view, many researchers have exploited this trend to
gain the maximum benefit from Crowd Sensing Networks (CSNs). A service
oriented approach is therefore another perspective on CSNs that has been addressed
by multiple authors [3]. Moving on, another entity [4], the service consumer, is
brought into the scenario for the sole purpose of utilizing the data acquired by the
data collectors; without this third entity, there would be little point in collecting
such a massive amount of data and letting the crowd workers spend their effort,
time and resources for no good reason at all. For businesses to improve, move
towards success and achieve organizational goals, there is a dire need [5] to analyze
and process data that is not generated by themselves, but rather purchased as raw
data from the service providers of a CSN, where gaps can be filled and leads
generated.
Although CSNs serve the purpose of sensing data at a much cheaper rate and
provide a handful of advantages, multiple issues arise with such mechanisms; the
major ones, addressed by multiple researchers, are attaining data quality [4] and
engaging skilled users. Since smartphone users participate by spending resources
such as battery, storage and computation power, they also expose themselves to
potential privacy leakage. There should be some kind of reward or incentive
mechanism to compensate them for this privacy exposure and resource
consumption; a truthful and secure incentive mechanism is therefore of utmost
importance for CSNs. Furthermore, the sensing domain is divided into two
categories, i.e., opportunistic (involuntary) and participatory (voluntary) sensing
[6–8]. This categorization helps organizations decide on task allocation, resources,
and the measures to take regarding the challenges mentioned above.
To counter the issues faced by CSNs, many incentive mechanisms have been
proposed, such as the socially aware incentive mechanism for MCSNs [9], Quality
and Usability Of INformation (QUOIN) [10] and many more monetary approaches.
However, these mechanisms rely on a central authority [11], which can lead to a
single point of failure. With the advent of technology and emerging trends in
application-centric approaches, blockchain has proved to be an optimal solution
for the problems faced by CSNs. Figure 1, taken from [12], elaborates the structure
of a blockchain.
In this paper, we propose a blockchain based data sharing mechanism for CSNs
with two communication paradigms. In addition, the AES128 encryption technique
is applied for preserving the privacy of data collectors. Smart contracts are used for
communication purposes and for enforcing the defined criteria autonomously.
The rest of the paper is organized as follows: Sect. 2 describes the motivation and
problem statement, Sect. 3 provides the related work, Sect. 4 presents the proposed
system model and its explanation, Sect. 5 gives the details of the experimental
results, and Sect. 6 concludes the paper.

Fig. 1. Blockchain structure

2 Motivation and Problem Statement

CSNs use sensors carried by mobile phones to collect sensed data and can
effectively save cost. In [13], the authors have proposed a blockchain based privacy
protection and virtual credit incentive mechanism for CSNs. They target two issues
of CSNs, i.e., privacy leakage and low user involvement. The former issue is
resolved by using an encryption technique, i.e., the Affine Cipher, and the latter is
resolved through tackling the issue of privacy leakage. However, the Affine Cipher
used for encryption is a rather weak substitution cipher and is vulnerable to all the
attacks of its cipher class. Also, smart contracts are not used as a secure
communication platform; instead, encryption techniques are applied separately
when data is submitted by the sensors, and a Merkle Tree (MT) root is used for data
validation. Furthermore, the communication paradigm between service provider
and consumers is not considered.
Taking motivation from the paper mentioned above, and to tackle these limitations,
we propose a system that is divided into two communication paradigms:

• communication between the service provider and the data collectors, and
• communication between the service provider and the service consumers.

In the proposed scenario, AES128 is used for encryption, and a smart contract is
initiated by the service provider, who acts as a requester in the network for a
specific task. Encryption is used to make sure that the workers' identities are kept
safe while they submit their committed tasks, and the incentive is distributed among
all the data collectors/workers immediately in order to stimulate user participation.
Moreover, another smart contract is deployed for service consumers, who can
request a service and get a response from the service provider. Following that, a
payment is made to the service provider for sending the requested data.
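As a rough illustration of the privacy step, the following C# sketch shows how a worker identifier could be encrypted and decrypted with AES-128 before a report is submitted. It is only a minimal sketch under our assumptions: the class name, the string-based identifier and the key handling are hypothetical, and key distribution is outside its scope.

using System;
using System.Security.Cryptography;
using System.Text;

static class WorkerPrivacy
{
    // Encrypts a worker identifier with a freshly generated AES-128 key and IV.
    public static (byte[] Cipher, byte[] Key, byte[] IV) Encrypt(string workerId)
    {
        using var aes = Aes.Create();
        aes.KeySize = 128;                       // AES-128, as in the proposed scheme
        aes.GenerateKey();
        aes.GenerateIV();
        using var enc = aes.CreateEncryptor();
        byte[] plain = Encoding.UTF8.GetBytes(workerId);
        return (enc.TransformFinalBlock(plain, 0, plain.Length), aes.Key, aes.IV);
    }

    // Recovers the identifier, given the key and IV used for encryption.
    public static string Decrypt(byte[] cipher, byte[] key, byte[] iv)
    {
        using var aes = Aes.Create();
        aes.Key = key;
        aes.IV = iv;
        using var dec = aes.CreateDecryptor();
        byte[] plain = dec.TransformFinalBlock(cipher, 0, cipher.Length);
        return Encoding.UTF8.GetString(plain);
    }
}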

3 Related Work
CSNs are networks in which a large group of individuals carry mobile devices
equipped with sensors. These devices are capable of sensing and computing shared
data, which can be used to extract useful information for measuring, mapping,
analyzing, or estimating any process of common interest. Most mobile devices
(smartphones, tablet computers, wearables) can sense ambient light, noise (through
the microphone), location (through the GPS), movement (through the
accelerometer), and much more. In [14], the authors have proposed a blockchain
based incentive mechanism for CSNs to increase the participation rate of active
users while preserving their privacy. The mechanism mainly considers truthfulness
by introducing a cryptocurrency as a secure reward to compensate the participants.
In this way, high-quality users are rewarded with a cryptocurrency that is built on
the blockchain and recorded in transaction blocks.
The server will publish a sensing task with deposit, users will upload sensing
data, miners will verify the quality of data and transactions and in the end,
server will distribute the reward to all the participants. Whereas in [13], the authors
have proposed a blockchain based location privacy preserving incentive mechanism
in CSNs. The emphasis of this mechanism is to protect the provided information
and to sanction a reward for participation, which will increase the involvement of
the users. The experiments are conducted in a campus environment with a total of
10 nodes (participants), and the results obtained are effective in terms of
encouraging user participation. In [12], the authors have introduced a blockchain
based crowd sensing system where the miners and sensing-task workers are
rewarded through some pre-defined incentives, which provides authentic anonymity
and system robustness. In [10], the authors have proposed QUOIN,
which concurrently provides quality and usability of information for crowd sens-
ing applications. Stackelberg game model is applied in QUOIN to ensure that
every participant attains a sufficient level of financial gain. The authors have
evaluated this mechanism through a case study and the results show their effi-
ciency and effectiveness for the purpose of stimulating participation rate.
In [15], the authors have suggested an incentive mechanism which is based on
contract theory for mobile CSNs. A trust scheme has been introduced between
crowd sensing platform and mobile users which is based on direct and indirect
trust. Following that, an optimal contract is laid out that is based on incen-
tive scheme to encourage mobile users for participation in CSN. This contract,
together with maximizing the platform’s profitability, satisfies individual incen-
tive compatibility. The authors in [9] have proposed a novel technique, called
the social incentive mechanism, where the social friends of the participants are
incentivized, which strengthens the social ties among participants to promote
global cooperation. The incentive allotment depends on the participants' social
circle; therefore, they are motivated to influence their friends' behavior in order to
secure an increased payback. This kind of approach is useful where the quality of
the sensed data depends on the interdependent relationship among participants
or users.
In [16], the authors have conducted a case study involving the ParticipAct
platform and the ParticipAct living lab. The experiment was carried out at the
University of Bologna and involved 170 students for a whole year across multiple
crowd sensing campaigns; the platform can access the smartphone sensors passively
and can also prompt active user collaboration. The paper presents an outline of
ParticipAct's design, architecture and features, and reports quantitative results.
Cryptographic techniques play an important role when information is exchanged
between users. In [17] and [18], the authors have compared encryption and
decryption techniques, such as RSA, Blowfish, AES, 3DES and many more, for
multimedia and for the prevention of guessing attacks. The analysis is based on
comparing encryption and decryption time, memory usage, avalanche effect, and
entropy. Likewise, in 2016, the authors of [19] designed a platform to promote CSNs
not just as a service but also as a contribution to society. However, wherever people
are involved, there is always a risk of privacy leakage; this issue causes a huge
setback to CSNs, as they are purely based on users' voluntary or involuntary
involvement in the network. To tackle this issue, the authors used the AES256
encryption technique to preserve the privacy of users, which attracted skillful
participants to the platform. Moreover, in [13], the authors have used the Affine
Cipher for the same issues mentioned above.
Furthermore, in conventional CSNs, there are many other techniques used for
preserving privacy. In [20], the authors have used the Dynamic Trust Relationships
Aware Data Privacy Protection (DTRPP) mechanism to achieve privacy. The system
is devised to combine public keys with a trust management mechanism. An
extensive simulation analysis showed that the system performs better in terms of
average delay, delivery rate and loading rate when measured against traditional
systems. Likewise, in [21], the authors have proposed a mechanism to protect the
location of mobile users by combining k-anonymity and differential privacy
preservation.

4 System Model
The proposed system is a blockchain based incentive mechanism for CSNs. In the
suggested scenario, three entities participate in the CSN: the service provider, the
service consumers, and the data collectors. The terms requester and service provider
are used interchangeably throughout the paper; likewise, the terms data collector and
worker are used interchangeably. The roles of the entities are defined in Table 1.

Table 1. Roles of entities of a CSN

Participants        Roles
Service provider    Broadcasts a task in the network and provides services to consumers
Service consumers   Request data and utilize the information acquired by a data collector
Data collectors     Measure the required data about a subject of interest, as stated in the task broadcast by the service provider, using their mobile devices

Fig. 2. System model of CSN

After entering the system, the service provider initiates a smart contract by setting
the requirements of the sensing task and stores a definite amount of deposit to
establish an incentive for the workers. Then, the task is published in the network.
Following that, the workers' identities are kept safe through AES128 encryption,
which tackles the issue of privacy leakage.
Once the workers submit their sensed data and it has been validated by the miners,
they immediately receive the incentive reserved in the smart contract protocol. This
helps in building the reputation of the system and boosts the enthusiasm of both
miners and data collectors, due to the instant receipt of incentives. In addition, since
the rules are already established in the smart contract initiated by the service
provider, both the data collectors and the miners can trust the service provider as a
reliable administrator. Also, posting the gathered data costs the data collectors a
definite amount of gas in the smart contract; this is equivalent to a security deposit
made by the data collectors before participating in the CSN, and it helps prevent
various attacks. The motivation for the proposed system model, shown in Fig. 2, is
taken from [13] and [22].

Service consumers interact separately with the service provider, as shown in Fig. 2.
They send a request to the provider for a specific service. The service provider
initiates a contract with a specified payment for the requested data, and a response
is returned. In addition, the payment is transferred to the service provider in
exchange for the requested data. Since the interaction between service provider and
consumer involves a smart contract, it gets the job done effectively and removes the
probability of errors that may occur in traditional contracts and agreements.
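To make the described flow concrete, the following C# fragment is a minimal, non-blockchain sketch of the escrow and incentive logic outlined above. In the actual system these rules are enforced by Solidity smart contracts; the method names only mirror the functions discussed in Sect. 5, and the validation delegate and bookkeeping are simplifying assumptions.

using System;
using System.Collections.Generic;

class SensingTask
{
    private double deposit;                         // locked by the service provider
    private readonly double rewardPerDatum;         // incentive per validated submission
    private readonly List<byte[]> collected = new List<byte[]>();

    public SensingTask(double deposit, double rewardPerDatum)
    {
        this.deposit = deposit;
        this.rewardPerDatum = rewardPerDatum;
    }

    // Worker submits encrypted data; once validated, the reward is released immediately.
    public double CommitTask(byte[] encryptedData, Func<byte[], bool> validate)
    {
        if (!validate(encryptedData) || deposit < rewardPerDatum) return 0;
        collected.Add(encryptedData);
        deposit -= rewardPerDatum;
        return rewardPerDatum;                      // paid to the worker at once
    }

    // Number of validated submissions so far (cf. Checkdata()).
    public int CheckData() => collected.Count;

    // Provider aborts once enough data is gathered; the remaining deposit is refunded.
    public double Abort()
    {
        double refund = deposit;
        deposit = 0;
        return refund;
    }
}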

5 Experimental Results

To assess the performance of the proposed blockchain based data sharing
mechanism for CSNs with two communication paradigms, we developed our
application in VS Code with the help of Ganache, the Truffle framework and
MetaMask. The language used is Solidity. Ganache is used for deploying our smart
contracts, whereas MetaMask is used as a bridge that permits us to run Ethereum
decentralized applications in the browser without executing a full Ethereum node.
The specifications of the system are as follows: an Intel Core i3, 7th generation,
with 4 GB RAM and 500 GB of storage. The performance parameters for the
proposed scenario are as follows:

• gas consumption of the smart contracts, and
• execution time of different cryptographic techniques.

5.1 Gas Consumption Analysis


The details of the gas consumption of each contract and its functions are described
below. In Fig. 3, the gas consumption of the requester is plotted against each
function executed by the smart contract. There are in total three functions that the
server performs, i.e., initTask(), Abort() and Checkdata().

Fig. 3. Gas consumption of the requester: transaction cost vs. execution cost (in gas) for Create Task, Abort Task and Check Data



The service provider acts as a requester in the CSN, being the initiator of the task.
Therefore, it has the authority to write the digital contract and commence the task.
It is evident from the graphs that the transaction gas is higher than the execution
gas for all the functions executed in both smart contracts. Transaction gas is
basically the gas used when the transactions are validated and stored in the
blockchain, which requires more computational power than execution gas, i.e., the
cost of executing each line of code. The function initTask() is used by the service
provider to create a task and decide the incentive for each piece of processed
sensing data. The service provider saves a definite range of deposits and defines the
reward and the number of required data items in the criteria before creating the
task. This is why it consumes the greatest amount of gas compared to the rest of the
functions, whereas Abort() is called when the service provider believes that
sufficient data has been collected, as seen from the number of data items in
Checkdata(). The remaining deposit is returned to the requester by this function.

Fig. 4. Gas consumption of the worker: transaction cost vs. execution cost (in gas) for View Task and Commit Task

Figure 4 demonstrates the gas consumption of the two functions executed by the
data collectors. In a CSN, data collectors can choose a task they are interested in
performing; in our scenario, however, we assume that the task the worker is
interested in is the one broadcast by our requester. The functions performed by the
worker are getTask() and commitTask(). The former is used to view the information
required by the service provider together with the defined criteria, i.e., the reward
and the number of data items. It is necessary for workers to first view the task,
because if the data entered is incorrect, not a single penny will be given to the
worker as an incentive. The transaction and execution gas of viewing the task is
lower than that of the latter function, because commitTask() is called to submit the
collected data, which requires more computational energy than viewing the task.
Figure 5 represents the gas consumed by the service consumer. This smart contract,
when initiated, performs the following functions: ServiceRequest(),
ServiceResponse(), and Payment(). The transaction cost of request and response is
almost the same, while Payment() requires more transaction and execution cost.

Fig. 5. Gas consumption of the service consumer: transaction cost vs. execution cost (in gas) for Service Request, Service Response and Payment

The reason for the increase in gas consumption is that some amount is deducted
from the smart contract's account and added to the service provider's account; this
transaction is then added to the blockchain.

5.2 Execution Time Analysis

Figure 6 shows the comparison of cryptographic techniques on the basis of the
execution time of encryption and decryption. Encryption time is the time required
to convert plain text into cipher text, whereas decryption time is the time required
to convert the cipher text back into plain text. Both are desired to be low for the
system to be quick and responsive, and both affect the performance of the system;
for this reason, four techniques are compared for the proposed scenario, i.e., Affine
Cipher, AES128, AES256, and 3DES. The Affine Cipher is used in [2] for
preserving the privacy of the user; however, it belongs to the class of classical
mono-alphabetic substitution techniques, which can easily be broken by solving a
set of simultaneous equations. In addition, it is vulnerable to all the classical cipher
attacks.

Fig. 6. Execution time comparison for Affine Cipher, AES128, AES256, and 3DES

Moreover, it is not considered a strong and secure method for encryption compared
to modern symmetric key block cipher techniques. Based on the literature review
[18–20], we implemented three more encryption techniques and found that AES256
and AES128 perform with less execution time than 3DES. Although AES256 is
more secure than AES128 because of its 256-bit key size and increased number of
rounds, there is a trade-off in execution time. Therefore, in order to have a fast and
secure algorithm for encryption and decryption, and to keep the system responsive,
we deployed AES128 instead of AES256, which is suitable for the proposed
scenario.
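Such a comparison can be reproduced with a simple micro-benchmark. The C# sketch below times repeated AES encryptions for 128-bit and 256-bit keys; it is only an illustration under stated assumptions (the payload, iteration count and use of the built-in Aes class are assumptions, and absolute timings will differ from the figures reported in the paper).

using System;
using System.Diagnostics;
using System.Security.Cryptography;
using System.Text;

class AesTiming
{
    // Encrypts the same payload repeatedly with the given AES key size
    // and returns the elapsed time.
    static TimeSpan TimeEncryption(int keySizeBits, byte[] payload, int iterations)
    {
        using var aes = Aes.Create();
        aes.KeySize = keySizeBits;      // 128 or 256
        aes.GenerateKey();
        aes.GenerateIV();
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            using var enc = aes.CreateEncryptor();
            enc.TransformFinalBlock(payload, 0, payload.Length);
        }
        sw.Stop();
        return sw.Elapsed;
    }

    static void Main()
    {
        byte[] sample = Encoding.UTF8.GetBytes("sensed-data-record");
        Console.WriteLine($"AES-128: {TimeEncryption(128, sample, 10000)}");
        Console.WriteLine($"AES-256: {TimeEncryption(256, sample, 10000)}");
    }
}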

6 Conclusion
With the advent of technology, blockchain has emerged as an optimal solution for
providing a decentralized environment to applications and for helping with security,
privacy and single points of failure. In this paper, a blockchain based incentive
mechanism is devised for CSNs, which aims to motivate data collectors and attract
highly skilled users. Encryption is used to preserve the privacy of participants, and a
communication platform, i.e., the smart contract, is provided for secure reporting.
The objective of the proposed model is to cater to the needs of all the entities of a
CSN in a decentralized manner, which in return achieves integral and reliable data,
a high participation rate, and secure communication. The system is evaluated by
analyzing the gas consumption of all the deployed smart contracts, whereas the
encryption technique is validated by examining its execution time and comparing it
with other techniques.
In future work, the objective is to calculate the trustworthiness of the data
contributed by the users. By comparing the users' trust attitudes and applying
non-parametric statistical methods, we can examine the subjectivity in the
contributed data for the proposed scenario.

References
1. He, D., Chan, S., Guizani, M.: User privacy and data trustworthiness in mobile
crowd sensing. IEEE Wirel. Commun. 22(1), 28–34 (2015)
2. Jin, H., Su, L., Xiao, H., Nahrstedt, K.: Incentive mechanism for privacy-aware data
aggregation in mobile crowd sensing systems. IEEE/ACM Trans. Netw. (TON)
26(5), 2019–2032 (2018)
3. Merlino, G., Arkoulis, S., Distefano, S., Papagianni, C., Puliafito, A., Papavassiliou,
S.: Mobile crowdsensing as a service: a platform for applications on top of sensing
clouds. Future Gener. Comput. Syst. 56, 623–639 (2016)
4. Nie, J., Luo, J., Xiong, Z., Niyato, D., Wang, P.: A stackelberg game approach
toward socially-aware incentive mechanisms for mobile crowdsensing. IEEE Trans.
Wirel. Commun. 18(1), 724–738 (2019)
5. Gisdakis, S., Giannetsos, T., Papadimitratos, P.: Security, privacy, and incentive
provision for mobile crowd sensing systems. IEEE Internet Things J. 3(5), 839–853
(2016)

6. Luo, C., Liu, X., Xue, W., Shen, Y., Li, J., Hu, W., Liu, A.X.: Predictable privacy-
preserving mobile crowd sensing: a tale of two roles. IEEE/ACM Trans. Netw.
(TON) 27(1), 361–374 (2019)
7. Ahmad, W., Wang, S., Ullah, A., Yasir Shabir, M.: Reputation-aware recruitment
and credible reporting for platform utility in mobile crowd sensing with smart
devices in IoT. Sensors 18(10), 3305 (2018)
8. Lane, N.D., Eisenman, S.B., Musolesi, M., Miluzzo, E., Campbell, A.T.: Urban
sensing systems: opportunistic or participatory? In: Proceedings of the 9th Work-
shop on Mobile Computing Systems and Applications, Napa Valley, CA, USA, pp.
11–16 (February 2008)
9. Yang, G., He, S., Shi, Z., Chen, J.: Promoting cooperation by the social incentive
mechanism in mobile crowdsensing. IEEE Commun. Mag. 55(3), 86–92 (2017)
10. Ota, K., Dong, M., Gui, J., Liu, A.: QUOIN: incentive mechanisms for crowd
sensing networks. IEEE Netw. 32(2), 114–119 (2018)
11. Jaimes, L.G., Vergara-Laurens, I.J., Raij, A.: A survey of incentive techniques for
mobile crowd sensing. IEEE Internet Things J. 2, 370–380 (2015)
12. Huang, J., Kong, L., Kong, L., Liu, Z., Liu, Z., Chen, G.: Blockchain-based crowdsensing
system. In: 2018 1st IEEE International Conference on Hot Information-Centric Networking
(HotICN), pp. 234–235. IEEE (August 2018)
13. Jia, B., Zhou, T., Li, W., Liu, Z., Zhang, J.: A blockchain-based location privacy
protection incentive mechanism in crowd sensing networks. Sensors 18(11), 3894
(2018)
14. Park, J.S., Youn, T.Y., Kim, H.B., Rhee, K.H., Shin, S.U.: Smart contract-based
review system for an IoT data marketplace. Sensors 18(10), 3577 (2018)
15. Wang, J., Li, M., He, Y., Li, H., Xiao, K., Wang, C.: A blockchain based privacy-
preserving incentive mechanism in crowdsensing applications. IEEE Access 6,
17545–17556 (2018)
16. Dai, M., Su, Z., Wang, Y., Xu, Q.: Contract theory based incentive scheme for mobile
crowd sensing networks. In: 2018 International Conference on Selected Topics in Mobile
and Wireless Networking (MoWNeT), pp. 1–5. IEEE (June 2018)
17. Cardone, G., Corradi, A., Foschini, L., Ianniello, R.: Participact: a large-scale
crowdsensing platform. IEEE Trans. Emerg. Topics Comput. 4(1), 21–32 (2016)
18. Ahamad, M.M., Abdullah, M.I.: Comparison of encryption algorithms for multi-
media. Rajshahi Univ. J. Sci. Eng. 44, 131–139 (2016)
19. Wahid, M.N.A., Ali, A., Esparham, B., Marwan, M.: A comparison of crypto-
graphic algorithms: DES, 3DES, AES, RSA and blowfish for guessing attacks pre-
vention (2018)
20. Mottur, P.A., Whittaker, N.R.: Vizsafe: the decentralized crowdsourcing safety
network. In: 2018 IEEE International Smart Cities Conference (ISC2), pp. 1–6
(September 2018)
21. Wu, D., Si, S., Wu, S., Wang, R.: Dynamic trust relationships aware data pri-
vacy protection in mobile crowd-sensing. IEEE Internet Things J. 5(4), 2958–2970
(2017)
22. Chi, Z., Wang, Y., Huang, Y., Tong, X.: The novel location privacy-preserving
CKD for mobile crowdsourcing systems. IEEE Access 6, 5678–5687 (2017)
Design of a Cloud-Oriented Web Application
for Legal Conflict Resolution Through
Equitative Algorithms

Alessandra Amato1, Flora Amato1, Giovanni Cozzolino1(&), Marco Giacalone2, and Francesco Romeo3

1 University of Naples “Federico II”, via Claudio 21, 80125 Naples, Italy
{alessandra.amato,flora.amato,giovanni.cozzolino}@unina.it
2 Vrije Universiteit Brussel & LSTS, Vrije Universiteit Brussel, 4B304, Pleinlaan 2, 1050 Brussels, Belgium
Marco.giacalone@vub.ac.be
3 University of Naples “Federico II”, via Porta di Massa, 1, 80133 Naples, Italy
francesco.romeo@unina.it

Abstract. With the term “Cloud Computing” we mean a model in which the end
user does not require knowledge of the physical location and configuration of the
system that provides the computation, software and storage resources in question.
Thanks to the widespread use of the Internet, applications can now be used as
services on the network. Many interesting areas of application are enjoying these
benefits, such as the legal domain: the use of systems for conflict resolution can
help lawyers, mediators and judges with the objective of reaching an agreement
between the parties. In this paper we introduce the architecture of a cloud-oriented
web application, developed within the European project CREA (Conflict
Resolution with Equitative Algorithms). Its objective is the application of
algorithms to the solution of civil matters, in particular the allocation of goods,
leading the parties to a friendly solution before or during the trial.

Keywords: Algorithm · Cloud · Resolution · Artificial Intelligence

1 Introduction

For thousands of years humankind tried to understand some of the most important
mental faculties, i.e. thinking and reasoning, whose research was assigned to philosophy
and science. Nowadays a new discipline, A.I. (Artificial Intelligence), is developing and
its aim is even more ambitious: not only understanding thinking and reasoning activities
but also creating entities which are capable of performing these activities [1]. It is hard to
define what exactly “Artificial Intelligence” is; historically the main definitions were


focused on different aspects, such as the inner reasoning process or the external actions of the
intelligent system¹. In particular, there are four main approaches [1]:
1. Thinking humanly: the process [15] leading the intelligent agents to solve a
problem is similar to the process [16] a human being would perform (this approach
is related to cognitive science);
2. Thinking rationally: the process [17] leading the intelligent agents to solve a
problem is formal and it is modelled by using logic;
3. Acting humanly: the outcome of an intelligent agent action is indistinguishable
from a human being action;
4. Acting rationally: the outcome of an intelligent agent action is the best possible
result according to the provided data [18].
In Fig. 1 it is possible to read several definitions according to different approaches:

Fig. 1. Artificial Intelligence definitions

Over time A.I. reached important and remarkable targets; indeed, there are many
interesting areas of application: natural language understanding, decisional support
system, robotics and automation in transportation, manufacturing, business etc.
Another area of application, maybe less intuitive than the previous ones but certainly
not less important, is in law: especially the use of artificial intelligence systems for legal
conflict resolution, also called “A.I. & Law research” [2–11].

¹ Sects. 2, 3 and 4 are to be attributed to all authors; Sects. 1 and 5 in particular are to be attributed to Marco Giacalone.

For this reason, the European project CREA (Conflict Resolution with Equitative
Algorithms) was born. Its objective is to introduce, thanks to the contribution of rules
and principles coming from disciplines even outside the judicial system, new
mechanisms for dispute resolution, such as support tools in legal procedures which
could be useful for lawyers, judges and mediators, making things easier also for
European citizens (2). In Europe there are currently too many costs and extremely long
trial times in cross-border and national civil proceedings in the member states. This
happens because of several problems: finding a judge with the appropriate
competence, translating the summons and other important material into languages
understandable for the
Fig. 2. Resolution time for disputes in different EU countries

Fig. 3. Resolution time for disputes in different EU countries



addressee, the divergent interpretations of different national courts, etc. Because of
these problems, most ordinary citizens cannot sustain a cross-border civil proceeding
(4). But above all, the length of national proceedings is a common problem for most of
the member states, as can be seen in Figs. 2 and 3.
The main purpose of this paper is to introduce the reader to these new mechanisms of
dispute resolution by proposing the use of A.I. techniques [12–14], notions of
economics, a concrete algorithm and an example of implementation in a programming
language.

2 System Design

To achieve the goal of creating an application in the cloud, a client-server model was
considered. Specifically, the MVC (Model-View-Controller) pattern [5] was used. In
this architectural scheme the application is divided into three components: models,
views and controllers.
This pattern allows the separation of tasks:
• Model: represents the state of the application and the operations it must perform.
The views use the ViewModel component, which contains the data to be displayed;
the controller creates and populates the ViewModel instances from the model.
// Entity representing a party (agent) involved in the dispute
public class Agent {
    public int agentID { get; set; }
    public string Name { get; set; }
    public string Surname { get; set; }
    public float Budget { get; set; }
    public string Via { get; set; }
    public string City { get; set; }
    public int Telephone { get; set; }
    public bool Conciliator { get; set; }
    public float Share_of_entitlement { get; set; }

    // Navigation properties towards the related entities
    public ICollection<Association> Associations { get; set; }
    public ICollection<Offer> Offers { get; set; }
}

• Views: responsible for showing the content through the user interface [25]. They
use the Razor view engine. The presence of logic in the views must be minimal, and
what little there is must refer to the presentation of the content;

@model CREAProject.Models.Agente

@{
    ViewData["Title"] = "Create";
}

<h2>Create</h2>

<h4>Agent</h4>
<hr />
<div class="row">
    <div class="col-md-4">
        <form asp-action="Create">
            <div asp-validation-summary="ModelOnly" class="text-danger"></div>
            <div class="form-group">
                <label asp-for="Name" class="control-label"></label>
                <input asp-for="Name" class="form-control" />
                <span asp-validation-for="Name" class="text-danger"></span>
            </div>
            <div class="form-group">
                <label asp-for="Surname" class="control-label"></label>
                <input asp-for="Surname" class="form-control" />
                <span asp-validation-for="Surname" class="text-danger"></span>
            </div>

• Controllers: the components that manage user interaction, work with the model
and select the view that will be rendered.

public class AgentesController : Controller
{
    private readonly ApplicationDbContext _context;

    public AgentesController(ApplicationDbContext context)
    {
        _context = context;
    }

    // GET: Agentes
    public async Task<IActionResult> Index()
    {
        return View(await _context.Agente.ToListAsync());
    }

    ...

    // GET: Agentes/Create
    public IActionResult Create()
    {
        return View();
    }
}

3 Data Model Design

The data model is shown in Fig. 4. The main entities and associations (in yellow in the
figure) of this model are:
• AGENT, with primary key agentID, is the entity that represents the user; among its
attributes, Share_of_entitlement represents the right of that agent on a given asset;
• BID is an association whose foreign keys are the primary keys of Good and Agent;
• GOOD, with primary key goodID, is the entity representing the assets in dispute; its
Divisible attribute is of Boolean type, so if set to TRUE the asset is divisible,
otherwise it is not;
• DISPUTE, whose primary key identifies the dispute; its ResolutionMethod attribute
selects the first or the second resolution method;
• BAD, the entity representing the debts, identified by a badID.
The next section explains how, starting from the ER model and using Entity
Framework and ASP.NET Core, we implemented the cloud application.

Fig. 4. Data model schema

4 Implementation

To develop the project on the Web, Visual Studio was used with ASP.NET Core.
ASP.NET is a multi-platform, open source framework used for building modern
cloud-based applications, allowing you to:
• build web apps and services;
• use your favorite development tools on Windows, macOS and Linux;
• deploy in the cloud or locally.
Entity Framework Core is a framework intended to hide the details of the relational
database, allowing us to think of our data model as a set of objects. With Entity
Framework, you can operate at a higher level of abstraction, managing data and
creating data-oriented applications with less code than traditional applications [18–20].
In our case study, Entity Framework [21, 22] allowed us to transform the entities of
the data model into classes and the attributes into properties, with the associations
managed through lists of objects.
public ICollection<Association> Associations { get; set; }
public ICollection<Bid> Bids { get; set; }

The ICollection<Association> Associations property allows the connection between
the Agent [23, 24] and Dispute entities (Fig. 5).
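For illustration, the following C# fragment is a minimal sketch of how the ApplicationDbContext referenced by the controllers could expose the entities of the data model through Entity Framework Core. The entity class names beyond Agente (Good, Dispute, Bid) are assumptions for the sketch, not the project's actual source.

using Microsoft.EntityFrameworkCore;

public class ApplicationDbContext : DbContext
{
    public ApplicationDbContext(DbContextOptions<ApplicationDbContext> options)
        : base(options) { }

    // Each DbSet maps one entity of the ER model to a database table
    public DbSet<Agente> Agente { get; set; }    // agents involved in the dispute
    public DbSet<Good> Good { get; set; }        // disputed goods (hypothetical class)
    public DbSet<Dispute> Dispute { get; set; }  // disputes (hypothetical class)
    public DbSet<Bid> Bid { get; set; }          // bids (hypothetical class)
}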

Fig. 5. Screenshot of the bidding view

5 Conclusions

In this work an approach that uses artificial intelligence techniques to solve problems
related to legal disputes is presented. In particular, concepts and algorithms used in
Game Theory (equitable resource allocation algorithms) are adopted to arrive at
solutions that maximize the satisfaction of the parties involved. The effectiveness of an
algorithmic approach to dispute resolution is demonstrated, explaining with an
example how to lead opposing parties to a solution that is as friendly (and fair) as
possible.
The subject matter could be of interest, first of all, to legal professionals such as
lawyers, judges and mediators, but also to experts in consumer disputes and judicial
cooperation. This new procedure could potentially modify national and cross-border
legal proceedings in Europe in a profound way, by reducing the waiting time for
citizen-consumers and eliminating differences due to specific national laws.

Acknowledgments. This work was co-funded by the European Union’s Justice Programme
(2014–2020), CREA Project, under grant agreement No. 766463.

References
1. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 3rd edn. (2009)
2. AUDICONSUM: CREA Project, 02 Feb 2019. https://www.adiconsum.it/al-tuo-fianco/
iniziative/crea-conflict-resolution-with-equitative-algorithms/
3. CREA: CREA Project, 02 Feb 2019. http://www.crea-project.eu
4. European Commission: Grant Agreement number - 766463 - CREA (2017)
5. Kumar, V.: Algorithms for constraint-satisfaction problems: a survey. AI Mag. 13(1), 32
(1992)
6. Pazner, E.A., Schmeidler, D.: Egalitarian equivalent allocations. A new concept of economic
equity. Q. J. Econ. 92, 671–687 (1978)
7. Nash, J.F.: The bargaining problem. Econometrica 18(2), 155–162 (1950)
8. Bogomolnaia, A., Moulin, H., Sandomirskiy, F., Yanovskaya, E.: Competitive division of a
mixed manna. Econometrica 85, 1847–1871 (2017)
9. Buchanan, B.G., Headrick, T.E.: Some speculation about artificial intelligence and legal
reasoning. Stan. L. Rev. 23, 40 (1970)
10. AI vs Lawyer. LawGeex, Feb 2018. https://www.lawgeex.com/resources/aivslawyer/
11. European Commission: The 2018 EU Justice Scoreboard. Effective justice. https://ec.europa.eu
12. Chianese, A., Marulli, F., Piccialli, F.: Cultural heritage and social pulse: a semantic
approach for CH sensitivity discovery in social media data. In: 2016 IEEE Tenth
International Conference on Semantic Computing (ICSC), pp. 459–464 (2016)
13. Amato, F., Moscato, V., Picariello, A., Piccialli, F., Sperlí, G.: Centrality in heterogeneous
social networks for lurkers detection: an approach based on hypergraphs. Concurr. Comput.:
Pract. Exp. 30(3), e4188 (2018)
14. Hussain, S., Keung, J., Khan, A.A., Ahmad, A., Cuomo, S., Piccialli, F., Jeon, G.,
Akhunzada, A.: Implications of deep learning for the automation of design patterns
organization. J. Parallel Distrib. Comput. 117, 256–266 (2018)
15. Coppolino, L., D’Antonio, S., Mazzeo, G., Romano, L., Sgaglione, L.: Exploiting new CPU
extensions for secure exchange of eHealth data at the EU level. In: 2018 14th European
Dependable Computing Conference (EDCC), pp. 17–24. IASI (2018). https://doi.org/10.
1109/edcc.2018.00015
16. Coppolino, L., D’Antonio, S., Mazzeo, G., Romano, L.: A comparative analysis of emerging
approaches for securing java software with Intel SGX. Future Gener. Comput. Syst. 97, 620–
633 (2019). https://doi.org/10.1016/j.future.2019.03.018

17. Mazzeo, G., Coppolino, L., D’Antonio, S., Mazzariello, C., Romano, L.: SIL2 assessment of
an active/standby COTS-based safety-related system. Reliab. Eng. Syst. Saf. 176, 125–134
(2018). https://doi.org/10.1016/j.ress.2018.04.009. ISSN 0951-8320
18. Cilardo, A., Barbareschi, M., Mazzeo, A.: Secure distribution infrastructure for hardware
digital contents. IET Comput. Dig. Tech. 8(6), 300–310 (2014)
19. Amelino, D., Barbareschi, M., Cilardo, A.: An IP core remote anonymous activation
protocol. IEEE Trans. Emerg. Top. Comput. 6(2), 258–268 (2016)
20. Cilardo, A., et al.: An FPGA-based key-store for improving the dependability of security
services. In: 10th IEEE International Workshop on Object-Oriented Real-Time Dependable
Systems. IEEE (2005)
21. Amato, F., Moscato, F., Moscato, V., Colace, F.: Improving security in cloud by formal
modeling of IaaS resources. Future Gener. Comput. Syst. 87, 754–764 (2018). https://doi.
org/10.1016/j.future.2017.08.016
22. Di Lorenzo, G., Mazzocca, N., Moscato, F., Vittorini, V.: Towards semantics driven
generation of executable web services compositions. J. Softw. 2(5), 1–15 (2007). https://doi.
org/10.4304/jsw.5.1.1-15
23. Moscato, F., Aversa, R., Di Martino, B., Rak, M., Venticinque, S., Petcu, D.: An ontology
for the cloud in mOSAIC. In: Cloud Computing: Methodology, Systems, and Applications,
pp. 467–485 (2017). https://doi.org/10.1201/b11149
24. Aversa, R., Di Martino, B., Moscato, F.: Critical systems verification in MetaMORP(h)OSY.
In: Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence and Lecture
Notes in Bioinformatics), LNCS, vol. 8696, pp. 119–129. Springer, Berlin (2014). https://
doi.org/10.1007/978-3-319-10557-4_15
25. Albanese, M., Erbacher, R.F., Jajodia, S., Molinaro, C., Persia, F., Picariello, A., Sperlì, G.,
Subrahmanian, V.S.: Recognizing unexplained behavior in network traffic. In: Network
Science and Cybersecurity, pp. 39–62. Springer, New York (2014)
Equitative Algorithms for Legal Conflict
Resolution

Alessandra Amato1, Flora Amato1, Giovanni Cozzolino1(&), and Marco Giacalone2

1 University of Naples “Federico II”, Via Claudio 21, 80125 Naples, Italy
{alessandra.amato,flora.amato,giovanni.cozzolino}@unina.it
2 Vrije Universiteit Brussel and LSTS, 4B304, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
Marco.giacalone@vub.ac.be

Abstract. In this paper we present the potential advantages of an algorithmic
approach to legal dispute resolution, introducing the CREA project with its
objectives and methods. We describe a model that represents legal disputes as
constraint satisfaction problems and two algorithmic strategies to solve them (the
Nash allocation and the Egalitarian Equitable Allocation), focusing on the category
of inheritance disputes. This approach is radically innovative because it helps
lawyers and judges to frame the legal procedure not as a dispute between the parties
but as a process aiming at a consensual agreement. We also report experimental
results to prove the effectiveness of our approach, considering a concrete legal case
dispute.

Keywords: Algorithm · Cloud · Resolution · Artificial Intelligence

1 Introduction

At present, excessive costs and extremely long trial times hinder the proper
functioning of cross-border and national civil proceedings in the Member States. Regarding
the cross-border proceedings, these proceedings take an inordinate amount of time,
both to set up and for a decision to be rendered. Specific problems include finding a
judge with the requisite competence and translating the summons and other relevant
material into a language intelligible to the addressee. Moreover, cross-border civil
proceedings are far too expensive for most citizens, due to the high costs of document
translation and finding and consulting qualified legal experts. Another obstacle that
citizens face in cross-border disputes is the divergent interpretations of different
national courts, even when EU rules apply. The length of national proceedings is also a
common problem for the majority of the member states.
Also, the current legal systems do not adequately value the possibility of the parties
reaching an agreement; instead, they always try to find the solution in legal rules that
are often divergent from the wishes of the parties, through a long and debilitating
adversarial process¹.
For all these reasons the “Conflict Resolution with Equitative Algorithms” (CREA)
project was established by the European Commission. It involves several European
universities, each with its own role. The project aims to introduce new
mechanisms of dispute resolution as a helping tool in legal procedures for lawyers,
mediators and judges with the objective to reach an agreement between the parties; in
some situations, it could be used directly by citizens. Its primary objectives are the
following:
1. to apply algorithmic mechanisms to the solution of certain national and cross-border
civil matters, including matrimonial regimes, successions and trusts, commercial
law and consumer law, facilitating the agreement process among the parties;
2. to demonstrate the efficacy of an algorithmic approach [12–14] and apply it to the
allocation of goods, or the resolution of issues, in disputes, leading the parties to a
friendly solution before or during the trial comparison.
We will focus on the algorithms used for resolving disputes that involve a division
of goods between agents, e.g. inheritance, divorces and company law [1–11].

2 The Algorithms Setup

The algorithms we will look at apply to cases where some goods have to be assigned to
a set of entities (e.g. people or companies), which we will call agents. Each agent is
given a share of entitlement, i.e. a share weight. Typically, if there are n agents, the
share will be 1/n for each of them, but it could be different. For instance, shares could
reflect the closeness of relatives to a deceased person, or the effort contributed. The
algorithms will give us a solution in the form of the share of each good assigned to
each agent.
Let's have a look at the notation we are going to use; we have:
• a set of agents N = {1, 2, …, n}
• a set of goods M = {1, 2, …, m}
• a market value m_j for each good j ∈ M
• a share weight w_i for every agent i ∈ N
• a solution of the allocation problem, where:
  – z_ij is the share of good j ∈ M given to agent i ∈ N
  – Z = {z_ij}, i ∈ N, j ∈ M, is the solution matrix
We have to make the agents express their preference about which goods they would
like to get and which they do not, and to what extent. We can achieve that in
two ways: the bids method and the ratings method; whether we choose one method or
the other, we end up with the utility for each good of each agent, i.e. how much the
agents value each good.

¹ §§ 2 and 3 are to be attributed to all authors; §§ 1, 4 and 5 in particular are to be attributed to Marco Giacalone.

2.1 The Bids Method

• For each good, a market price is given.


• The sum of the market prices is computed. This is the budget available to each agent
in the following steps.
– An equal budget for each agent reflects the principle that all agents should be
treated equally. Only the share of entitlement could discriminate among agents.
• The market price is decreased by 20%. This price is low enough to guarantee the
selling of the good (or at least provide an extremely high probability for its selling).
Below this price, called the lower bidding bound, an offer cannot be considered
acceptable.
• Each agent is asked to distribute the budget as bids over the unassigned goods. Each
bid cannot be less than the lower bidding value, and the total value of bids cannot
exceed the budget.
– The idea is that the higher the bid, the more likely is for the agent to receive the
good.
Using this method, the utility of each good for the agents is just the bid:

\[ u_{ij} = b_{ij} \quad \text{for each agent } i \in N \text{ and good } j \in M, \]

with b_ij being the bid of agent i for good j and u_ij its utility.
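As a small illustration of the bidding rules listed above, the following C# sketch checks that a vector of bids is admissible: every bid is at least the lower bidding bound (the market price decreased by 20%) and the total does not exceed the common budget (the sum of the market prices). It assumes, for simplicity, that the agent places a bid on every disputed good.

using System;

static class BidRules
{
    public static bool BidsAreAdmissible(double[] marketPrices, double[] bids)
    {
        double budget = 0;                       // common budget: sum of market prices
        foreach (double p in marketPrices) budget += p;

        double total = 0;
        for (int j = 0; j < bids.Length; j++)
        {
            double lowerBound = 0.8 * marketPrices[j];   // market price decreased by 20%
            if (bids[j] < lowerBound) return false;      // below the lower bidding bound
            total += bids[j];
        }
        return total <= budget;                  // total bids cannot exceed the budget
    }
}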

2.2 The Ratings Method

• Each disputed good is valued at the market price.


• Each agent evaluates how much he/she would like to receive each good. The
evaluation can be attained through a “1 to 5 stars” marking system (as the rating of
an Amazon product or that of a restaurant through Trip Advisor). This evaluation
does not regard the monetary value of the good. For instance, an agent is involved
in the allocation of a house worth 100000 euros and a second-hand Harley-
Davidson motorbike worth 6000 euros. The house is worth more, but he already
owns a beautiful house and he knows that managing a house is time and money
consuming. On the other hand, he has always dreamt about riding that motorbike.
He will give 2 stars to the house and 5 to the bike.
– Notice that assigning 5 stars to all items will not make you any better off than
assigning 1 star to all of them. What counts is the profile: you raise the chance of
getting the items you really want by assigning them a high mark, and by giving a
low mark to those you are not too interested in.
If we call r_ij ∈ {1, 2, 3, 4, 5} the rating agent i gives to good j, and K > 1 the
revaluation rate for each star (typically K = 1.1), the utility is given by:

\[ u_{ij} = K^{\,r_{ij}-3}\, m_j \quad \text{for each agent } i \in N \text{ and good } j \in M. \]
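A quick worked check of this formula on the house/motorbike example above (2 stars for the house worth 100000 euros, 5 stars for the motorbike worth 6000 euros), written as a minimal C# sketch with K = 1.1:

using System;

static class RatingsMethod
{
    // u_ij = K^(r_ij - 3) * m_j
    public static double Utility(int stars, double marketValue, double k = 1.1)
        => Math.Pow(k, stars - 3) * marketValue;
}

// Utility(2, 100000) ≈ 90909.09  (the house is discounted by one star below neutral)
// Utility(5, 6000)   = 7260.00   (the motorbike is revalued by two stars)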



3 The Solutions

The next step is finding a way to correctly distribute utilities among the agents. In the
end, the overall utility of an agent will be given by the sum of the utilities of the goods
received. If an agent gets a fraction of the good, the utility will be weighted by this
fraction. For instance, if, according to the proposed solution, I should receive 1/3 of a
house which I valued 90000 euros, my utility will be 90000 * (1/3) = 30000 euros.
How to find an optimal allocation of the goods? Research suggests that no single
criterion is universally better than the others. The recent literature shows that two
criteria prevail: the egalitarian equivalent allocation and the Competitive Equilibrium
from Equal Income.

3.1 The Egalitarian Equivalent Allocation


The Egalitarian Equivalent Allocation, also called Equitable Allocation, was first
introduced in Pazner and Schmeidler (Egalitarian equivalent allocations: A new con-
cept of economic equity, Quarterly Journal of Economics, 1978). It makes sure that all
agents receive goods (or parts of them) such that the sum of the goods' values according
to their own bids is the same, and this value is as high as possible. If agents have
different weights, equality is assessed after the values are weighted with the shares of
entitlement (General Procedures).
In order to obtain an egalitarian allocation, a solution must be found that maximizes
a comfort variable t such that (CREA Algorithms Stelle)

\[ \frac{\sum_{j \in M} u_{ij} z_{ij}}{w_i \sum_{j \in M} u_{ij}} \ge t \quad \forall i \in N, \qquad \sum_{i \in N} z_{ij} = 1 \quad \forall j \in M, \qquad z_{ij} \ge 0 \quad \forall i \in N,\ j \in M. \]

By construction, this solution is egalitarian. It turns out that this solution is also
efficient: no other allocation, even a non-egalitarian one, can make all agents better off
simultaneously. The allocation, however, has some problems: it’s not resource
monotonic, it doesn’t have responsive shares, it suffers from domination. But the most
important problem is that the allocation may cause one or more agents to be envious of
the goods assigned to other agents; let’s consider an example: suppose there are 3
agents (I, II and III) and 3 divisible items (a, b and c). In the following Table 1, we
describe the utility of each item by each player.

Table 1. Envy example utilities

       a   b   c
I     40  30  30
II    30  40  30
III   10  50  40

The egalitarian solution assigns item a to agent I, item b to agent II and item c to
agent III – without splits. Each one gets a utility of 40. However, agent III is envious of
agent II, because he received item c (valued 40) but he would have preferred item b
(valued 50 to him) that was assigned to agent II.
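As a concrete illustration (our own sketch, not the CREA implementation), the egalitarian program above can be solved as a linear program with SciPy, assuming equal agent weights (w_i = 1) and illustrative names:

import numpy as np
from scipy.optimize import linprog

def egalitarian_allocation(u):
    """u[i, j] = utility agent i assigns to good j; returns shares z[i, j] and t."""
    n, m = u.shape
    w = np.ones(n)                       # equal entitlements (assumption)
    n_vars = n * m + 1                   # decision vector x = [z_11 .. z_nm, t]

    c = np.zeros(n_vars)
    c[-1] = -1.0                         # maximize t  ->  minimize -t

    # For each agent i:  -sum_j u_ij z_ij + (w_i * sum_j u_ij) * t <= 0
    A_ub = np.zeros((n, n_vars))
    for i in range(n):
        A_ub[i, i * m:(i + 1) * m] = -u[i]
        A_ub[i, -1] = w[i] * u[i].sum()
    b_ub = np.zeros(n)

    # For each good j:  sum_i z_ij = 1  (every good is fully allocated)
    A_eq = np.zeros((m, n_vars))
    for j in range(m):
        A_eq[j, j:n * m:m] = 1.0
    b_eq = np.ones(m)

    bounds = [(0.0, 1.0)] * (n * m) + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    z = res.x[:-1].reshape(n, m)
    return z, res.x[-1]

# On the utilities of Table 1 the optimum is t = 0.4, consistent with the
# allocation described above (a to I, b to II, c to III, each worth 40).
z, t = egalitarian_allocation(np.array([[40, 30, 30],
                                        [30, 40, 30],
                                        [10, 50, 40]], dtype=float))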

3.2 The Competitive Equilibrium from Equal Income


The Competitive Equilibrium from Equal Income (CEEI), also known as the Nash
solution, is based on the studies of Bogomolnaia, Moulin, Sandomirskiy and
Yanovskaya (Competitive division of a mixed manna, Econometrica, 2017) [8].
Supposing each agent is given the same budget (weighted on his entitlement), the
Competitive Equilibrium from Equal Income (CEEI) solution is reached if goods are
bought such that: (a) each agent, independently of the others, makes the best choice:
given the budget, he buys goods that maximize his own satisfaction and (b) all goods
are sold with no overlaps (for instance two agents buying the same good in its entirety)
and no leftovers (no good remains unsold).
The following is the objective function of this solution, which was introduced by
Nash (The Bargaining Problem. Econometrica, 1950) and thus takes the name of Nash
solution:
$$\max \prod_{i \in N} \Bigl( \sum_{j \in M} u_{ij}\, z_{ij} \Bigr)^{w_i}$$

This solution is Resource Monotonic (more goods to divide should not be bad news
for anyone), it has Responsive Shares (if an agent raises the bid on a certain good, he
cannot end up with a smaller share of that good) and doesn’t suffer from domination.
Most importantly it’s envy free, although it’s not usually egalitarian: let’s consider
again three agents sharing three divisible items, with the same utility table considered
before.

a b c
I 40 30 30
II 30 40 30
III 10 50 40

The CEEI/Nash solution prescribes that: agent I gets item a in its entirety, agent II
gets 9/10 of item b, agent III gets item c and 1/10 of item b. Agent III will not be
envious anymore, because in his evaluation item c in its entirety plus 1/10 of item b is
worth no less than (in fact it equals) 9/10 of item b: both bundles are worth 45 to him.
In terms of utility, we note that the utility of agent I
is 40, the utility of agent II is 36 and the utility of agent III is 45, therefore the
allocation is not equitable. But here agents modify the utilities depending on the likes
and dislikes of goods and the utility does not directly represent money anymore.
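A minimal sketch of how this objective can be computed numerically (our own illustration, not the CREA implementation): since the product with positive weights is maximized exactly where its logarithm is, one can maximize the concave function sum_i w_i log(sum_j u_ij z_ij) under the same allocation constraints, for instance with SciPy:

import numpy as np
from scipy.optimize import minimize

def nash_allocation(u, w=None):
    """Maximize prod_i (sum_j u_ij z_ij)^w_i through its logarithm."""
    n, m = u.shape
    w = np.ones(n) if w is None else np.asarray(w, dtype=float)

    def neg_log_nash(x):
        z = x.reshape(n, m)
        agent_utils = (u * z).sum(axis=1)
        return -(w * np.log(agent_utils + 1e-9)).sum()

    # Each good must be fully allocated: sum_i z_ij = 1 for every j.
    cons = [{"type": "eq", "fun": lambda x, j=j: x.reshape(n, m)[:, j].sum() - 1.0}
            for j in range(m)]
    x0 = np.full(n * m, 1.0 / n)                  # start from equal splits
    res = minimize(neg_log_nash, x0, method="SLSQP",
                   bounds=[(0.0, 1.0)] * (n * m), constraints=cons)
    return res.x.reshape(n, m)

# On the example table this should approximate the split described above:
# I gets a, II gets 9/10 of b, III gets c plus 1/10 of b.
z = nash_allocation(np.array([[40, 30, 30],
                              [30, 40, 30],
                              [10, 50, 40]], dtype=float))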

3.3 Constraints
The maximization problems we are considering are rather flexible and can incorporate
other constraints to better suit the division problem: we will see, as examples,
indivisible goods and restricted assignments (CREA Algorithms so far).
A good is indivisible if it cannot be split and must be assigned to only one agent in
its entirety. If a good j is indivisible, we impose that

$$z_{ij} = 0 \ \text{ or } \ z_{ij} = 1 \quad \text{for every } i \in N$$

A restricted assignment occurs if legislation requires a good to be assigned to a
specific agent or a subset of agents. Let N' ⊆ N be this subset and j the good with
restricted assignment; we impose that:

$$\sum_{i \in N'} z_{ij} = 1$$
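A brief sketch of how these two constraints could be layered on top of the LP sketch shown earlier (illustrative only): a restricted assignment can be enforced simply by forcing z_ij = 0 for every agent outside N', while indivisibility turns the problem into an integer program and requires a mixed-integer solver instead of plain linprog.

def restrict_assignment(bounds, n, m, good_j, allowed_agents):
    """Force z_ij = 0 for every agent i outside N' (the allowed set) for good j.
    `bounds` is the per-variable bounds list of the LP sketch shown earlier."""
    for i in range(n):
        if i not in allowed_agents:
            bounds[i * m + good_j] = (0.0, 0.0)   # agent i cannot receive good j
    return bounds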

4 Constraint-Satisfaction Problems

A large number of problems in A.I. and other areas of computer science can be viewed
as special cases of constraint-satisfaction problems (also known as CSPs), e.g.
machine vision, temporal graph problems, planning of genetic experiments, etc.
An application we are interested in is the possibility of modelling legal cases as a
sort of CSP.
A constraint-satisfaction problem is a type of problem which needs a solution
satisfying a given set of conditions and constraints. Generally, the more complicated
the problem, the greater the number of constraints the solution has to satisfy;
therefore, it becomes more difficult to apply algorithms.
First of all, let us see how a CSP is represented.
CSPs are represented in terms of state space where the important thing is not the
solution path from the initial state to the final state but finding a state capable of
satisfying the constraints.
Theoretically a CSP can be defined by using three components X, D and C:
• X is a set of variables, X = {X1, X2, …, Xn}.
• D is a set of domains, D = {D1, D2, …, Dn}, where each domain has a set of
available values, e.g. Di = {vi1, vi2, …}.
• C is a set of constraints specifying allowable combinations of values.
Each state in CSPs is defined by an assignment of values to all (or even some)
variables, e.g. {Xi = vi, Xj = vj, …}.
An assignment that does not violate any constraints is called “consistent assign-
ment” while an assignment in which a value is assigned to every variable is called
“complete assignment”.

The solution to a CSP will be a consistent and complete assignment. The com-
plexity implied in these procedures is high and this is the reason why using A.I.
techniques is useful in this field. The technique we propose in this section is the
Backtracking Search.

4.1 Backtracking Search


The backtracking search is an efficient paradigm for CSP solution. In this method,
variables are instantiated sequentially [18–20]. As soon as all the variables relevant to a
constraint are instantiated, the validity of the constraint is checked [21–24]. If a partial
instantiation violates any of the constraints, backtracking is performed to the most
recently instantiated variable that still has alternatives available [25]. Clearly, whenever
a partial instantiation violates a constraint, backtracking is able to eliminate a subspace
from the Cartesian product of all variable domains [5, 15–17].
The pseudocode in Fig. 1 explains how this procedure works.

Fig. 1. Backtracking search pseudocode.
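Since the original figure is not reproduced here, the following is a minimal Python sketch of the same idea (a plain recursive backtracking search over a CSP given as variables, domains and constraints; names and structure are ours, not necessarily those of the figure):

def backtracking_search(variables, domains, constraints, assignment=None):
    """Return a complete, consistent assignment or None if none exists.

    variables:   list of variable names
    domains:     dict variable -> list of candidate values
    constraints: list of functions taking the partial assignment and returning
                 False only when the values assigned so far violate them
    """
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        return assignment                                  # complete and consistent
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        if all(c(assignment) for c in constraints):        # check as soon as possible
            result = backtracking_search(variables, domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]                                # backtrack
    return None

Constraints should be written so that they only fail once all the variables they mention are instantiated, mirroring the validity check described above.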

A legal case can be modelled as a constraint-satisfaction problem if constraints and


solutions are well defined.
Let us consider an inheritance case (examined in depth in paragraph 2.5):
• Variables: number of heirs and goods, goods value etc.
• Constraints: laws, legal precedents, heirs’ preferences for particular goods, principle
of equity etc.
• Solution: a state in which the heirs equally share the legacy and each heir receives
goods according to his/her preferences.

5 Conclusion

According to the backtracking algorithm, the Dispute Resolution System (DRS) tries to assign
different values to all variables; if a particular assignment is not consistent with the
constraints, the DRS can backtrack, cancel the last variable assignment and attempt a new
one in order to reach the best solution.
In this paper, we focused on the algorithms used for resolving disputes that involve
a division of goods between agents, e.g. inheritance, divorces and company law.

Acknowledgments. This paper has been produced with the financial support of the Justice
Programme of the European Union, 766463 CREA. The contents of this report are the sole
responsibility of the authors and can in no way be taken to reflect the views of the European
Commission.

References
1. Russell, S., Norvig, P.: Artificial Intelligence: A modern approach (3rd edition) (2009)
2. Adiconsum: CREA Project. https://www.adiconsum.it/al-tuo-fianco/iniziative/crea-conflict-
resolution-with-equitative-algorithms/. Accessed Feb 02 2019
3. CREA: CREA Project. http://www.crea-project.eu. Accessed Feb 02 2019
4. European Commission: Grant Agreement number - 766463 - CREA (2017)
5. Kumar, V.: Algorithms for constraint-satisfaction problems: a survey. AI Mag. 13(1), 32
(1992)
6. Pazner, E.A., Schmeidler, D.: Egalitarian equivalent allocations: a new concept of economic
equity. Q. J. Econ. 92, 671–687 (1978)
7. Nash, J.F.: The bargaining problem. Econometrica 18, 155–162 (1950)
8. Bogomolnaia, A., Moulin, H., Sandomirskiy, F., Yanovskaya, E.: Competitive division of a
mixed manna. Econometrica 85, 1847–1871 (2017)
9. Buchanan, B.G., Headrick, T.E.: Some speculation about artificial intelligence and legal
reasoning. Stanf. Law Rev. 23, 40–62 (1970). Stanford, s.n
10. AI vs Lawyer: LawGeex, February 2018. https://www.lawgeex.com/resources/aivslawyer/
11. European Commission: The 2018 EU Justice Scoreboard. Effective justice. https://ec.europa.eu
12. Chianese, A., Marulli, F., Piccialli, F.: Cultural heritage and social pulse: a semantic
approach for CH sensitivity discovery in social media data. In: 2016 IEEE Tenth
International Conference on Semantic Computing (ICSC), pp. 459–464 (2016)
13. Amato, F., Moscato, V., Picariello, A., Piccialli, F., Sperlí, G.: Centrality in heterogeneous
social networks for lurkers detection: an approach based on hypergraphs. Concurrency
Comput. Pract. Experience 30(3), e4188 (2018)
14. Hussain, S., Keung, J., Khan, A.A., Ahmad, A., Cuomo, S., Piccialli, F., Jeon, G.,
Akhunzada, A.: Implications of deep learning for the automation of design patterns
organization. J. Parallel Distrib. Comput. 117, 256–266 (2018)

15. Coppolino, L., D’Antonio, S., Mazzeo, G., Romano, L., Sgaglione, L.: Exploiting new CPU
extensions for secure exchange of eHealth data at the EU level. In: 2018 14th European
Dependable Computing Conference (EDCC), pp. 17–24, Iasi (2018). https://doi.org/10.
1109/edcc.2018.00015
16. Coppolino, L., D’Antonio, S., Mazzeo, G., Romano, L.: A comparative analysis of emerging
approaches for securing java software with Intel SGX. Future Gener. Comput. Syst. 97, 620–
633 (2019). https://doi.org/10.1016/j.future.2019.03.018. ISSN 0167-739X
17. Mazzeo, G., Coppolino, L., D’Antonio, S., Mazzariello, C., Romano, L.: SIL2 assessment of
an active/standby COTS-based safety-related system. Reliab. Eng. Syst. Saf. 176, 125–134
(2018). https://doi.org/10.1016/j.ress.2018.04.009. ISSN 0951-8320
18. Cilardo, A., Barbareschi, M., Mazzeo, A.: Secure distribution infrastructure for hardware
digital contents. IET Comput. Digital Tech. 8(6), 300–310 (2014)
19. Amelino, D., Barbareschi, M., Cilardo, A.: An IP core remote anonymous activation
protocol. IEEE Trans. Emerg. Topics Comput. 6(2), 258–268 (2016)
20. Cilardo, A., et al.: An FPGA-based key-store for improving the dependability of security
services. In: 10th IEEE International Workshop on Object-Oriented Real-Time Dependable
Systems. IEEE (2005)
21. Amato, F., Moscato, F., Moscato, V., Colace, F.: Improving security in cloud by formal
modeling of IaaS resources. Future Gener. Comput. Syst. 87, 754–764 (2018). https://doi.
org/10.1016/j.future.2017.08.016. Elsevier B.V
22. Di Lorenzo, G., Mazzocca, N., Moscato, F., Vittorini, V.: Towards semantics driven
generation of executable web services compositions. J. Softw. 2(5), 1–15 (2007). https://doi.
org/10.4304/jsw.5.1.1-15
23. Moscato, F., Aversa, R., Di Martino, B., Rak, M., Venticinque, S., Petcu, D.: An ontology
for the cloud in mOSAIC. In: Cloud Computing: Methodology, Systems, and Applications,
pp. 467–485 (2017). https://doi.org/10.1201/b11149
24. Aversa, R., Di Martino, B., Moscato, F.: Critical systems verification in MetaMORP
(h) OSY. In: Lecture Notes in Computer Science (including subseries Lecture Notes in
Artificial Intelligence and Lecture Notes in Bioinformatics), LNCS, vol. 8696, pp. 119–129.
Springer (2014). https://doi.org/10.1007/978-3-319-10557-4_15
25. Albanese, M., Erbacher, R.F., Jajodia, S., Molinaro, C., Persia, F., Picariello, A.,
Subrahmanian, V.S.: Recognizing unexplained behavior in network traffic. In: Network
Science and Cybersecurity, pp. 39–62. Springer, New York (2014)
The 7th International Workshop on
Cloud Computing Projects and
Initiatives (CCPI-2019)
An Approach to Help in Cloud Model
Choice for Academia Services’ Supplying

Pasquale Cantiello(B) , Beniamino Di Martino, and Michele Mastroianni

Dipartimento di Ingegneria Università degli Studi della Campania,


Aversa, Italy
{pasquale.cantiello,michele.mastroianni}@unicampania.it,
beniamino.dimartino@unina.it

Abstract. In academic institutions, there is frequently the need to pro-


vide new services, in a cloud model, to be used either in teaching or
research activities. One of the main decisions to be addressed is related
to the cloud model to adopt (private, public or hybrid), and, for the hybrid one,
which mix of functionalities to use. In this paper a new approach is proposed
to define a methodology serving as a decision-support tool that helps the ICT
manager in this choice. A simple and intuitive graph representation has been used. The
methodology has been tested with a real case study on the provisioning
of a new e-learning service for the university of the authors. The work
has to be considered a tentative approach, a starting point for further
research in the same direction.

1 Introduction and Objectives


The main purpose of this paper is to explore a new approach, in the form of an evaluation
methodology, to help in the choice of the cloud model (private, public,
hybrid) to adopt when ICT services for teaching and research activities have to be
provisioned in academia.
The approach is designed to be used as a decision support for the ICT manager
of a university that needs to provision a new service for research or
teaching activities and has to decide which cloud model to adopt.
It starts with a survey that is divided into two parts: the first to be answered
by the person who needs the service, and the second part to be answered by
the ICT manager. By representing the results of the survey in a graph, some
analysis can be done in order to give design directions.
This work is to be considered a tentative starting point for more exhaustive
research to be carried out in the future. In this sense it does not yet examine many
aspects that must be addressed (e.g., costs and time constraints).
After this introduction, the paper continues in Sect. 2 with an analysis of related
works. In Sect. 3 the services are classified by means of some parameters, and a
representation is shown in Sect. 4. The analysis done on the model is shown in Sect. 5
and a case study on a real academic need is presented in Sect. 6. The paper ends
with Sect. 7, with conclusions and future work directions.

2 Related Works
From the analysis of recent work on the topic made by the authors, no papers
have been found that are related to decision support methodologies on private
vs public cloud system choice.
In [1] the services offered in a federated way to support service provisioning
in the Italian academia have been exploited. These can be used as components
to realize more complex services provisioned in a hybrid cloud model.
As reported in [6], the cost factor is not always in favor of the public solution.
The authors of that work have shown that, in the long term, the private
solution is cheaper than the public one.
The adoption of cloud computing in e-learning systems as a strategy to
improve agility and reduce costs has been studied in [4]. The authors have dealt
with the problem of the choice between a public and a private cloud, suggesting to
rely on the experience of the manager to make the final choice. In this sense
our work can also be thought of as a way to expand the research started by that
work.
In [5] the benefits of using services from federated clouds with various coupling
levels and different optimisation methods have been studied.
A comprehensive study on cloud for research computing has been done in [2].
This report summarizes the results from a workshop on topics and challenges
for academic research cloud computing in applications, support, data movement,
administrative, legal, and financial issues.
In [3] it has been reviewed what cloud computing infrastructure will provide
in universities and what can be done to increase the benefits of common
applications for students and teachers.

3 Services Classification
The starting point of the methodology is represented by a brief survey to be
answered by the person who needs the service to be implemented. This survey is
focused not on ICT aspects, but mainly on domain requirements, and this allows
a first parameterization of the requirements to be obtained.
This information is then enriched with other ICT-related parameters, also
obtained from a brief survey answered by the ICT manager.
Both sets of parameters become variables measured in both domains and
represented on a scale of values in the 0...N range. The choice of this scale is
dictated by the need to express qualitative parameters with numbers. In our
case, the meaning of the numerical values is the following: they grow according to the
necessity of having specific features not publicly available. A value of 0 means
that the feature could be obtained by a full-public solution, while a value of N
denotes that the feature can be provided only in a private model.
The first group of parameters under analysis for this starting point of the
research are those related to the application domain (P1 to P5), shown in
Table 1.

Table 1. Domain parameters

Parameter Name Description


P1 H/W customisation Does the application require custom devices?
P2 Privacy Is there the need of specific privacy constraints
(e.g. for health or government data)?
P3 Mission critical Is the service to be considered mission critical for
the university (or research center)?
P4 Interoperation Does the service need to interoperate with other
on-premise services?
P5 Rights Do the service and its related data involve specific
copyright or patent requirements?

Table 2. ICT parameters

Parameter Name Description


P6 Customised H/W Is there already customised on-premise hardware?
P7 Security Are there specific policies to ensure data security
(e.g.: redundancy, availability)?
P8 Throughput Are there specific needs in term of bandwidth or
throughput?
P9 Authentication Are there specific authentication and accounting
requirements that must be ensured?
P10 Resources Does the university have enough hardware and human
resources?

The second group of parameters deals with pure ICT aspects of the problem.
They are labeled from P6 to P10 and shown in Table 2. They come from the
answers to the survey as given by the ICT specialist.

4 Parameters Representation

In order to represent and later analyze parameter values, the Kiviat graph has
been used. This choice is motivated by the graph's simplicity and immediacy of
understanding, and also by the types of analysis targeted by the work. This graph
also allows total or partial overlapping zones, as well as regions with small
deviations, to be easily identified.
In our case, the graph has as many rays as the number of parameters to
represent. Each parameter's value, as extracted from the answers to the survey, is
represented as a point on the ray of the corresponding parameter on a 0...N
scale. A value of 0 means that there is no need for the corresponding parameter
feature to be provided as private.

Values grow with the importance of the parameter up to the value N, which
represents a requirement that cannot be substituted and for which the private
model is mandatory.

5 Model Analysis
After the representation of the parameters on the diagram, they are connected
as usual with segments in order to form the polygon representing the model.
Now the process continues by adding regular polygons with growing rays. Each
growing ray represents a greater necessity to have the service provisioned in a
private model.
Among these regular polygons, some of them must be highlighted:

• The smallest one containing the whole problem polygon, called the external polygon,
with ray Re. This guarantees that all requirements are met.
• The one with an immediately lower radius, called the peak polygon (Rp = Re −
1). This indicates the possibility of operating the service with some relaxed
requirements.
• The internal polygon, that is the biggest one (ray Ri) entirely contained in the
problem polygon. This represents the maximum of the minimum requirements.

The analysis of the requirements that fall in the polygonal crown (rays Re
and Rp) permits the extraction of information on the constraints that could be relaxed
if one wants to contain the whole problem diagram in the Rp polygon. The number
of values in the crown is indicated with Np (number of peaks). The distance
between the internal and external rays, D = Re − Ri, gives an indication of the
regularity of the shape of the problem polygon.
So, at the end of the representation, we have obtained the values
Re, Rp, Ri, Np and D, and we can derive several suggestions from them.

• First of all, if Re = N, it is impossible to provide the service with
an entirely public model.
• If Ri = 0, there is no need to provide the service in a completely private
model.
• Low values of D are useful design indications that drive toward non-hybrid
solutions.
• High values of D denote strong fluctuations in the values of the parameters
and drive toward the adoption of hybrid models. In this case the Np value has more
importance, since it represents the functionalities that must be realized in the private
part of the hybrid model.
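As a small computational illustration (our own sketch, not part of the original methodology's tooling), the five quantities can be derived directly from the list of parameter values:

def kiviat_metrics(values):
    """Derive Re, Rp, Ri, Np and D from the surveyed parameter values (0..N scale)."""
    re = max(values)        # external polygon: smallest regular polygon covering all values
    ri = min(values)        # internal polygon: largest one contained in the problem polygon
    rp = re - 1             # peak polygon
    peaks = sum(1 for v in values if v > rp)   # Np: values falling in the polygonal crown
    d = re - ri             # indication of the regularity of the problem polygon's shape
    return re, rp, ri, peaks, d

# Applied to the survey values of the case study (Table 3), this yields
# Re = 4, Rp = 3, Ri = 1, Np = 3 and D = 3.
print(kiviat_metrics([1, 3, 4, 3, 4, 1, 2, 4, 3, 1]))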

At this point it would be appropriate to carry out a new round of refinement
of the values obtained from the survey. This time it is directed only at discussing
in depth with the stakeholder the actual need for the requirements that correspond to the
peaks. Maybe some of them could be relaxed in order to regularize the shape of
the problem polygon, with a design simplification.

The investigation of the peak parameters now requires more effort, not a simple
survey as in the first round: a deep discussion on the peak
parameters is needed. In this round all the actors are involved together, and this can help
to verify the actual need of a private approach for the related features.
If some of these peaks are lowered, a new graph with a new representation
of the parameters can be drawn and used as the final one. This is shown in the
case study.

6 A Case Study
As a case study the model has been tested with the real need to provide a new
e-learning service to the students of the University of the authors.

6.1 Survey Submission


The stakeholder, who is also the domain expert, is the teacher delegated by the
rector as responsible for innovation in teaching. The first part of the survey has
been submitted to him.
The second part of the survey has been submitted to the ICT manager of
the University.
For this study, the answers to the survey are weighted on a 0...5 scale (N = 5).
The corresponding values are shown in the Table 3.
These values have been subsequently represented graphically in the Kiviat
graph of Fig. 1. The parameter values, the problem polygon and the corresponding
shape are represented in red.
On the same graph the regular polygons with increasing parameter values can be
seen (in gray).
The external polygon is represented in green and has Re = 4. The whole problem
polygon is inside this one and, since Re < N, the full-private solution is not
mandatory.
The internal polygon is represented in black and has Ri = 1. In this
case, since Ri > 0, a full-public solution is not appropriate either.
The peak polygon is represented in blue and has Rp = 3.
The number of peaks is Np = 3. They correspond to the requirements P3,
P5 and P8 and are the constraints that must be investigated if we want to lower
Re by one unit.

6.2 Second Round


In order to verify whether the peak parameters can be lowered, a second round with the
stakeholder and the ICT manager has been carried out. This time there is not a simple survey as
in the first round, but a discussion with all the actors together.
The first peak parameter (P3) is related to the mission-critical aspects of the
service for the academia. A deeper discussion with the stakeholder has revealed
that this critical aspect has been understood more as an added value of the service

Table 3. Survey result

Parameter Name Value


P1 H/W customization 1
P2 Privacy 3
P3 Mission critical 4
P4 Interoperation 3
P5 Rights 4
P6 Customized H/W 1
P7 Security 2
P8 Throughput 4
P9 Authentication 3
P10 Resources 1

Fig. 1. Kiviat graph for e-learning service (rays P1–P10)

for the students rather than as something affecting the effectiveness of the whole academic system.
So, it has been agreed that P3 can safely be lowered from 4 to a value of 2.
The second peak parameter (P5) is related to the copyrights or patents affect-
ing the data managed by the service. It is certainly true that the lessons, with the
related audio, video and teaching materials, and their authors must be protected,
but there is no specific need to do this on on-premise systems, since cloud
providers can easily assure the needed protection. A middle-value answer therefore
seems more appropriate, so P5 has been lowered from 4 to 3.
The last peak parameter (P8) concerns the throughput and bandwidth requirements of the
service. It is obvious that audio/video streaming requires proper connections

Fig. 2. Kiviat graph after 2nd round (rays P1–P10)

with high bandwidth, and a private cloud can undoubtedly provide
better performance than a public cloud, but this is true only in local contexts.
The value of 4 initially given by the ICT manager was therefore justified,
but the stakeholder pointed out that most users want to use the e-learning service
not on campus, but away from it (e.g. at home, or during trips). In this case
the best results can be given by a public system that, if required, can also be
spread over other regions. The new value of P8 is 2.
After the second trip the new Kiviat graph is shown in Fig. 2.

6.3 Final Results


As we can see in the new graph, the shape of the problem polygon is more
regular.
The external polygon, in green, now has Re = 3.
The peak polygon, in blue, now has Rp = 2.
It is worth noting that the number of peaks Np has grown from 3 to 4; the peaks
now relate to parameters P2, P4, P5 and P9. This must not be considered a worsening
of the model.
There is no need for further rounds of investigation on the peak parameters,
since one of them, P5, was also a peak parameter at the beginning and has
already been discussed.
Since Re = 3 and Ri = 1, the suggestion we can derive from the analysis is to
adopt a hybrid solution, where the components related to the peak parameters
are the main candidates to be given by a public provider.

7 Conclusion and Future Work


In this work a first approach has been proposed to help in the choice of the most
appropriate cloud model to be used when a new service has to be provided
in the academic world.
The main objective of the work is to give a simple tool to be used as a decision
support in the choice of a specific model. To this end the authors have started
with the identification of a series of requirements (some domain-related and
some technology-related) that can be extracted through a simple two-part survey.
The model uses a simple and intuitive representation, as given by the Kiviat
graph. On this diagram some polygons related to the problem can be con-
structed and some values can be extracted that can be properly evaluated in
order to obtain the required design directions.
The work is still exploratory, has to be considered only as a starting point, and
the problem surely must be further investigated. Some other case studies must
be analyzed in order to verify the generality of the methodology.
At present, other parameters are under investigation for introduction into
the model. Some of them are related to the cost of each requirement; others
are related to time, both for the service duration and for the service
construction.

References
1. Attardi, G., Di Martino, B., Esposito, A., Mastroianni, M.: Using federated cloud
platform to implement academia services for research and administration. In: 2018
32nd International Conference on Advanced Information Networking and Applica-
tions Workshops (WAINA), pp. 413–418 (May 2018)
2. Bottum, J., Atkins, D., Blatecky, A., McMullen, R., Tannenbaum, T., Cheetham,
J., Wilgenbusch, J., Bhatia, K., Deumens, E., von Oehsen, B., et al.: The future of
cloud for academic research computing (2017)
3. Ercan, T.: Effective use of cloud computing in educational institutions. Procedia-
Soc. Behav. Sci. 2(2), 938–942 (2010)
4. Pocatilu, P., Alecu, F., Vetrici, M.: Using cloud computing for e-learning systems.
In: Proceedings of the 8th WSEAS International Conference on Data Networks,
Communications, Computers, pp. 54–59. Citeseer (2009)
5. Subramanian, T., Savarimuthu, N.: A study on optimized resource provisioning in
federated cloud. arXiv preprint. arXiv:1503.03579 (2015)
6. Vikas, S., Gurudatt, K., Vishnu, M., Prashant, K.: Private vs public cloud. Int. J.
Comput. Sci. Commun. Netw. 3(2), 79 (2013)
Italian Cloud Tourism as Tool to Develop
Local Tourist Districts Economic Vitality
and Reformulate Public Policies

Alfonso Marino(&) and Paolo Pariso

Engineering Department, University of Campania “Luigi Vanvitelli”,


Via Roma 29, 81031 Aversa, CE, Italy
{alfonso.marino,paolo.pariso}@unicampania.it

Abstract. The purpose of the study is to define and develop a methodology in


order to evaluate the different levels of cloud Tourism implementation within
Local Tourist Districts in Italy. The methodology evaluates whether and how the
region is meeting the challenge posed by new technologies and aims to
identify performance-value-enhancing factors. The observation of the phenomenon is
structured using a questionnaire on organizational typologies, which
should be submitted to all top-level managers of the analyzed districts. The
acquired data should be processed using the factor analysis approach in order to
identify the district organizational typologies.

1 Introduction

Tourism is one of Italy's most significant economic sectors and its long-term development
potential is strategic. Such development is also boosted by an ever stronger
need for quality services regardless of the market segment. This matter presents several
bottlenecks, particularly in the Italian tourism supply, with an entrepreneurial
system based on small firms without standardized services and industrial
methods. This situation also implies a rating system that differs from region to region and
within the same region. In order to accumulate knowledge on innovation
technology, public policies and the tourism sector, this research focuses on the Campania
region, analyzing cloud Tourism as a change process.
The term “cloud Tourism” was coined as a result of the rapid developments in
information and communication technologies (ICTs) and their applications to the
sector. Cloud Tourism is one of the most representative paradigms of the knowledge
economy and represents an emerging investigative field for researchers and practi-
tioners [16, 23]. Due to its capacity to provide tourist organizations with a huge and
varied amount of data, from which it is possible to gain invaluable insights [25] about
customers' views, preferences, needs and attitudes [10, 13], cloud Tourism is being
acknowledged as a key source of value for Local Tourist Districts (LTDs). In this
perspective, the aim of the research is to analyze the Local Tourist Districts (LTDs) in the
Campania region (Italy) so as to identify value-enhancing elements in LTD performance.
eTourism and cloud solutions offer ample opportunities to improve the
performance value. The strengths of the sector include the wealth of natural, historical


and cultural heritage [31] assets of the Campania region. On the other hand, the digital
divide and the organizational bottlenecks represent the main weaknesses. A competi-
tiveness-enhancing decision making policy related to cloud tourism has been devised
and implemented in the region but the process is still in its early stages. Nevertheless,
two important issues emerge from the investigation and the international research
findings on the subject (WTTC, 2017):
• citizens’ increasing expectations in terms of digital services and of promotion of
local economic development;
• the tendency of many LTDs to turn to technology for answers.
Italy’s and Campania’s LTDs, have for some years, acknowledged the importance
of managing tourist information as a result of the amount of data generated by the
increase in tourist presences and made available by ICTs. E – tourism and in
particular the cloud economy, could aid LTDs in driving economic vitality by
ensuring:
• more efficient service cost management;
• improvement of service delivery to citizens;
• transformation of the business model.
All of the local districts would benefit from a management approach based on the
pursuit of these three objectives, as on the one hand, they all compete to attract new
presences, confirm returns and extend visitors’ stay and, on the other hand, they all
need to address two major issues:
– more domestic than foreign tourism (senior citizens, young people, business visitors
and families);
– data access and management.
As to the first issue, the question is how to attract more foreign visitors i.e. how to
enhance the Destination Management Organization (DMO) [4] so as to ensure com-
prehensive and sustainable solutions. As to the second issue, the question is how to
improve ICT management. The technology is, in fact, readily available and accessible
but in the case of many districts several factors such as the lack of competencies, skills,
infrastructure and management ability have a negative impact on customers' use of
services and on service delivery. For example, limited access to any kind of digital
information can result in a delayed, partial or negative assessment or feedback
concerning the stay at the chosen destination [4, 34]. Visitors require information
delivery, possibly in real time, before, during and after the stay, so collection, storage
[29], information flow processing and big data management play a strategic role;
the cloud can therefore be a strong accelerator for growth in such an
information-intensive sector. The paper
aims to define and develop a methodology in order to evaluate the different levels of
cloud Tourism implementation in the analyzed districts. Furthermore, the study focuses
on the correlation between created value, ICT implementation and a possible increase
in the number of visitors, to be verified by applying the methodology to the context of
Campania's LTDs. In the next sections the literature review is presented, with a focus
on the theory and its operational applications. The conclusions highlight the future
implications of this research.

2 Literature Review
2.1 Categories of Innovation in the Tourism Sector
The Schumpeterian approach to innovation categories was applied to the tourism sector
by Hall [22], Weiermair [35]. Starting from different research topics, they identified the
main body of four innovation categories: product or service, process,
organizational/managerial, and market innovation. Over the last decades these four
categories were modified by Information and Communication Technology (ICT) [7].
Information and Communication Technology (ICT) represents a main agent for process
innovations. Yuan, Gretzel, and Fesenmaier [37] underline how ICT modifies the
organizational features and specific managerial objectives. Furthermore Blake, Sinclair,
and Soria [3] demonstrate that ICTs improve the outcomes and produce positive effects
when they are combined, for example, with other strategic and managerial actions such
as competence building and HRM (Human Resources Management). In addition, ICTs
are generating rapid innovation impacts also in [19] eTourism, and creating new
marketing tools, such as the Cloud, useful in customer profiling. However, several
aspects of the development still require further analysis.

2.2 The Cloud System and ETourism


The technological revolution [11] brought about by the development of the cloud [12,
20] has changed the market conditions for tourism organizations. ICTs have evolved
rapidly and provided new tools for tourism marketing and management. They support
the interactivity between tourism enterprises and consumers and, as a result, they
reengineer the entire process of developing, [14] managing and marketing tourism
products and destinations. The Cloud is defined as an ICT-integrated tourism service,
which integrates tourism sources and ICTs [2], such as artificial intelligence and the
Internet of Things (IoT), to provide explicit information and satisfactory services to
tourists, based on the development of innovative mobile communication technology
[36]. As far as the cloud development is concerned, it has been authoritatively sug-
gested [36] that three different forms of ICT are vital for tourism systems: IoT, mobile
communication, and artificial intelligence technology. Cloud computing [17] is a web-
based virtualization resource platform and a dynamic data center. It stimulates infor-
mation sharing that is fundamental to undertaking tourism projects [9]. The joint use of
the IoT and the cloud supports destinations [32] in providing information and analysis
as well as automation and control [14]. The Cloud as a mobile communication system
allows the communication of voice and data over mobile and portable devices [8]. It

can be used for mobile booking, online [28] payment, information access, and self-
entertainment during travel [1]. The combination of Artificial intelligence and the cloud
entails that a computer-based system has problem-solving, data storage and human
language understanding capabilities [34]. This technology can be employed to forecast
tourist flow, evaluate tourism services, handle tourist crowding, issue emergency
tourism alerts, and so on [36]. In this sense the cloud has changed the traditional tourist
services and therefore the tourist sector needs to become electronic to respond to the
demand for new tourism services.

2.3 eTourism and Local Tourist Districts


Cloud tourism research [7] is gaining increasing attention and a number of issues are
starting to be addressed in the literature, but the theme of LTDs has been, to date, rather
neglected. The impacts of ICTs can be seen in how stakeholders interact, for example,
networking and dynamic interfaces enable consumers and partners to redevelop the
tourism product proactively and reactively which is critical for the competitiveness of
LTDs [26]. The combinations of multiple communication technologies [5], that have
enabled the transition to new forms of service delivery, have shaped cloud tourism [6]
and made the new global market more competitive. The value of the market has
changed, e.g., the customer is a living testimonial of the journey and can communicate
in real time the qualitative aspects of the service. Moreover, tourist B&Bs across the
continent, now maintain sophisticated websites to advertise their unique features,
handle booking orders, and promote specials. The opportunities created by cloud
tourism and its technologies [18] as a result of the digital platform include improved
supply chain management, transaction automation, increased operational efficiency and
the penetration of markets previously inaccessible for firms in developing LTDs [15].
Cloud tourism [24] development can be analyzed from two different perspectives:
– service management: employment, professional growth, supply of new services;
– business organization: more productivity, competitiveness and economic growth.
In order to promote the development of Campania’s LTDs it is necessary to analyze
the problem from both perspectives. Cloud tourism [24] plays a central role in the
structural change of the sector in terms of redefining the value chain, offering new
services and changing [27] the business process.

3 Methodology

The data [21] should be acquired through questionnaires submitted to all the top-level
managers of the analyzed units. The questionnaire consists of 30 pre-developed
macro-area questions, with 15 sub-questions each. Likert (scale) statements should be
used to assess the degree of criticism with respect to each particular item. The dif-
ferentiation of organization typologies (proactive, participative and insensitive) was based
on Schein's approach [30]. Specifically, respondents should be asked to indicate the
level of criticism on a seven point scale, ranging from “highly critical” (7) to “slightly

critical” (1) for each item. The collected data should be statistically analyzed using the
Statistical Package for the Social Sciences version 22.0. The 30 macro-area questions
should be explored by principal components factor analysis and varimax rotation,
which will result in a four-factor solution. Factor analysis is the tool that will be used to
investigate variable relationships for complex concepts. It allows concepts that are not
easily measured directly to be examined, by reducing a large number of variables
into a few interpretable factors related to the analyzed context. Each factor accounts for
a certain amount of the overall variance in the observed variables, and the
factors should be ranked according to the amount of variation they explain. The
purpose of the factor analysis [21] is to transform the statements into a set of factor
combinations capable of synthesizing the characteristics of an organization type. The
internal consistency of each factor should be examined by Cronbach's alpha tests,
which show how closely related a set of items is as a group. They should be used as
a measure of scale reliability. Considering that the alpha test does not allow for the
measurement of the scale unidimensionality, an exploratory factor analysis should be
performed to check such a dimensionality. Cronbach’s alpha, as a coefficient of reli-
ability, can be written as a function of the number of test items and the average inter-
correlation among the items. Below, the formula for the standardized Cronbach’s
alpha:

$$\alpha = \frac{N \cdot \bar{c}}{\bar{v} + (N - 1)\cdot \bar{c}} \qquad (1)$$

• N is equal to the number of items;


• c-bar is the average inter-item covariance among the items;
• v-bar equals the average variance.
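The two statistical steps just described can be sketched as follows (a minimal illustration assuming the survey answers are collected in a respondents × items matrix; it uses NumPy and scikit-learn rather than the SPSS workflow named above, and rotation="varimax" requires a recent scikit-learn version):

import numpy as np
from sklearn.decomposition import FactorAnalysis

def extract_factors(responses, n_factors=4):
    """Factor analysis with varimax rotation (scikit-learn's maximum-likelihood
    variant, used here in place of the principal components extraction named
    in the text)."""
    fa = FactorAnalysis(n_components=n_factors, rotation="varimax")
    fa.fit(responses)
    return fa.components_                    # loadings, shape (n_factors, n_items)

def standardized_cronbach_alpha(responses):
    """Standardized Cronbach's alpha of Eq. (1)."""
    cov = np.cov(np.asarray(responses, dtype=float), rowvar=False)
    n_items = cov.shape[0]
    v_bar = np.diag(cov).mean()                          # average item variance
    c_bar = cov[~np.eye(n_items, dtype=bool)].mean()     # average inter-item covariance
    return (n_items * c_bar) / (v_bar + (n_items - 1) * c_bar)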
In Campania there are 12 LTDs and the methodology involves interviewing about
twelve managers for each district, for a total of about 144 interviews. The interviews
should be conducted inside the organizations. The purpose of the factor analysis [21] is
to combine the statements into a set of factors deemed to represent organi-
zational types linked to Campania's LTDs. Test results will have to show the most
pressing issues identified by LTD managers, underlining the distinctive
characteristics of the organizational types.
4 Conclusions and Future Implications
Starting from the literature review and the methodology, the analysis of the 12 Campania
LTDs will show how innovation, achieved through the implementation of new
technologies and the use of the cloud, has impacted their performance value. The
research, by analyzing the districts, will verify which factors determine better
competitiveness and an increase in the number of visitors and, by contrast, which
factors produce the most negative effects in the districts.

References
1. Adukaite, A., Reimann, A.M., Marchiori, E., Cantoni, L.: Hotel mobile apps. The case of 4
and 5 star hotels in European German-speaking countries. In: Xiang, Z., Tussyadiah, L.
(eds.) Information and Communication Technologies in Tourism 2014, pp. 45–47. Springer,
Cham (2013)
2. Ali, A., Frew, A.J.: ICT and sustainable tourism development: an innovative perspective.
J. Hospitality Tourism Technol. 5(1), 2–16 (2014)
3. Blake, A., Sinclair, M.T., Soria, J.A.C.: Tourism productivity: evidence from the United
Kingdom. Ann. Tourism Res. 33(4), 1099–1120 (2006)
4. Bokunewicz, J.F., Shulman, J.: Influencer identification in Twitter networks of destination
marketing organizations. J. Hospitality Tourism Technol. 8(2), 205–219 (2017)
5. Buhalis, D., Licata, M.C.: The future eTourism intermediaries. Tourism Manag. 23(3), 207–
220 (2002)
6. Buhalis, D.: eTourism: information technology for strategic tourism management. Pearson
(Financial Times/Prentice Hall) (2003)
7. Buhalis, D., Law, R.: Progress in information technology and tourism management. 20 years
on and 10 years after the Internet – the state of eTourism research. Tourism Manag. 29, 609–
623 (2008)
8. Buhalis, D., O’Connor, P.: Information communication technology revolutionizing tourism.
Tourism Recreat. Res. 30(3), 7–16 (2005)
9. Buhalis, D., Amaranggana, A.: Smart tourism destinations. In: Xiang, Z., Tussyadiah, L.
(eds.) Information and Communication Technologies in Tourism 2014, pp. 553–564.
Springer, Cham (2013)
10. Canhoto, A.I., Clark, M.: Customer service 140 characters at a time: the users’ perspective.
J. Mark. Manag. 29(5–6), 522–544 (2013)
11. Caragliu, A., Del Bo, C.D., Nijkamp, P.: Smart cities in Europe. J. Urban Technol. 18(2),
65–82 (2011)
12. Connell, J., Reynolds, P.: The implications of technological developments on tourist
information centers. Tourism Manag. 20(4), 501–509 (1999)
13. Choudhury, M.M., Harrigan, P.: CRM to social CRM: the integration of new technologies
into customer relationship management. J. Strateg. Mark. 22(2), 149–176 (2014)
14. Chui, M., Loffler, M., Roberts, R.: The Internet of things. McKinsey Q. 2, 1–9 (2010)
15. Dredge, D.: Destination place planning and design. Ann. Tourism Res. 26(4), 772–791
(1999)
16. Erickson, S., Rothberg, H.: Big Data and knowledge management: establishing a conceptual
foundation. Electr. J. Knowl. Manag. 12(2), 108–116 (2014)
17. European Commission: Measuring the economic impact of cloud computing in Europe
(2016)
18. Fodor, O., Werthner, H.: Harmonise: a step toward an interoperable eTourism marketplace.
Int. J. Electr. Commer. 9(2), 11–39 (2005)
19. Go, F.M., Lee, R.M., Rosso, A.P.: E-heritage in the globalizing society: enabling cross-
cultural engagement through ICT. Inf. Technol. Tourism 6(1), 55–68 (2003)
20. Gretzel, U.: Intelligent systems in tourism: a social science perspective. Ann. Tourism Res.
38(3), 757–779 (2011)
21. Hair, J.F., Anderson, R.E., Tatham, R.L., Black, W.C.: Multivariate Data Analysis with
Readings. Prentice-Hall, Hemel Hempstead (1995)
22. Hall, C.M.: Innovation and tourism policy in Australia and New Zealand: never the twain
shall meet? J. Policy Res. Tourism Leisure Events 1(1), 2–18 (2009)

23. Law, R., Buhalis, D., Cobanoglu, C.: Progress on information and communication
technologies in hospitality and tourism. Int. J. Contemp. Hospitality Manag. 26(5), 727–750
(2014)
24. Michopoulou, E., Buhalis, D., Michailidis, S., Ambrose, I.: Destination management
systems: technical challenges in developing an eTourism platform for accessible tourism in
Europe. In: Sigala, M., Mich, L., Murphy, J. (eds.) Information and Communication
Technologies in Tourism 2007, pp. 301–310. Springer, Wien (2007)
25. Morabito, V.: Big Data and Analytics: Strategic and Organizational Impacts. Springer,
Berlin (2015)
26. Pearce, D.G.: Tourist district in Paris: structure and functions. Tourism Manag. 19(1), 49–65
(1998)
27. Pühretmair, F.: It’s time to make eTourism accessible. In: Miesenberger, K., Klaus, J.,
Zagler, W., Burger, D. (eds.) Computers Helping People with Special Needs. Proceedings
9th International Conference, ICCHP 2004, Paris, France, July 2004, pp. 272–279. Springer,
Berlin (2004)
28. Reino, S., Alzua-Sorzabal, A., Baggio, R.: Adopting interoperability solutions for online
tourism distribution: an evaluation framework. J. Hospitality Tourism Technol. 7(1), 2–15
(2010)
29. Sahin, I., Gulmez, M., Kitapci, O.: E-complaint tracking and online problem-solving
strategies in hospitality management: plumbing the depths of reviews and responses on
TripAdvisor. J. Hospitality. Tourism Technol. 8(3), 372–394 (2017)
30. Schein, E.H.: Organizational Culture and Leadership, 3rd edn, pp. 193–222. Wiley, Jossey-
Bass (2010)
31. Scuderi, R.: Special focus: local resources for tourism – from impact to growth. Tourism
Econ. 24(3), 294–296 (2018)
32. Soteriades, M.: Tourism destination marketing: approaches improving effectiveness and
efficiency. J. Hospitality Tourism Technol. 3(2), 107–120 (2012)
33. Tavares, J.M., Neves, O.F., Sawant, M.: The importance of information in the destination on
the levels of tourist satisfaction. Int. J. Tourism Policy 8(2), 129–146 (2018)
34. Wang, C.: Predicting tourism demand using fuzzy time series and hybrid grey theory.
Tourism Manag. 25(3), 367–374 (2004)
35. Weiermair, K.: Product improvement or innovation: what is the key to success in tourism?
In: OECD. (ed.) Innovation and Growth in Tourism, pp. 53–69. OECD, Paris (2006)
36. Zhang, M., Yang, W.: Fuzzy comprehensive evaluation method applied in the real estate
investment risks research. Phys. Procedia 24, 1815–1821 (2012)
37. Yuan, Y.L., Gretzel, U., Fesenmaier, D.R.: The role of information technology use in
American convention and visitor bureaus. Tourism Manag. 27(3), 326–341 (2006)
Auto-scaling in the Cloud: Current Status
and Perspectives

Marta Catillo1 , Massimiliano Rak2 , and Umberto Villano1(B)


1
Department of Engineering, University of Sannio, Benevento, Italy
martacatillo@gmail.com, villano@unisannio.it
2
Department of Computer Engineering, University of Campania Luigi Vanvitelli,
Aversa, Italy
massimiliano.rak@unicampania.it

Abstract. One of the main advantages of cloud computing is elasticity,


which makes it possible to rapidly expand or reduce the amount of leased resources
in order to adapt to load variations, guaranteeing the desired quality
of service. Auto-scaling is an extensively studied topic. Making optimal
scaling choices is of paramount importance and can help reduce leasing
costs, as well as power consumption. This paper analyzes the current sta-
tus of auto-scaling in the cloud ecosystem, considering recent literature
contributions as well as existing auto-scaling solutions. Then it discusses
possible research directions in this field, fostering the development of a
methodology that, on the basis of suitably-defined performance parame-
ters, can produce an optimal auto-scaling policy to be implemented using
existing auto-scaling services and tools.

1 Introduction
Nowadays cloud computing systems are attractive and extensively used. This
popularity is mainly linked to the possibility of leasing low-cost computing
resources, thus avoiding huge investments for the purchase of proprietary and
rapidly obsolescing computing infrastructures. Moreover, elasticity is a signif-
icant feature of cloud environments: it is the ability of a system to adapt to
workload changes and represents a key factor to ensure Quality of Service (QoS)
requirements. As a matter of fact, in the NIST definitions “rapid elasticity” and
“on-demand self-service” (the enabler of elasticity) are considered core features
of cloud computing [19].
Tools that allow the amount of leased computing resources to be changed accord-
ing to some kind of user-generated or self-managed, autonomic input are called
auto-scalers. These can be extraordinarily powerful instruments, but they also
pose challenging issues that need to be addressed. In particular:

• workload forecasting is surely an important matter. To avoid QoS losses, it is


desirable to acquire additional resources before an actual workload increase.
This can be achieved only by workload prediction.


• accurate estimation of the resources actually needed is currently an open issue.


Resource under-provisioning will inevitably worsen performance; resource
over-provisioning reduces computing efficiency. In both cases, the undesired
effect is to incur unnecessary costs.

Therefore, there are a number of complex objectives that an auto-scaler


should achieve by balancing QoS and costs. Currently the issues linked to auto-
scaling are not widely understood, and a reference standard or implementation
is still lacking. Each commercial cloud provider has its own interpretation of
the problem and its own way of addressing it. Most of the time, the available
solutions are far from optimal, and suspiciously neglect leasing costs.
In this paper, after introducing the auto-scaling problem (Sect. 2) we ana-
lyze the current status of auto-scaling in the cloud ecosystem. In Sect. 3 we
summarize recent literature contributions. In Sect. 4 we deal with existing auto-
scaling implementations, including those used by several commercial providers.
The main open research issues and directions are discussed in Sect. 5. The paper
closes with our conclusions and plans for future research.

2 The Auto-scaling Problem


Auto-scaling is concerned with the dynamic adaptation of computing resources
so as to obtain the desired QoS levels (and to guarantee the SLA fulfillment)
under variable input workload at minimum cost. Auto-scaling can either be
supported by a service available in the cloud environment or by an external
logic/tool exploited by the scalable application. In both cases, it aims to find a
fair trade-off between application QoS and minimization of total resource costs.
Stated another way, any auto-scaler is an optimizer of the resource
provisioning/de-provisioning process. Anyway, it is important not to confuse an
auto-scaling solution with the initial provisioning. The initial provisioning con-
cerns the setup of the operational environment; any auto-scaler instead works
on an existing and previously configured environment by adding or reducing
resources under variable workloads. If we consider the cloud stack, the dynamic
provisioning process can take on different perspectives. These range from an
IaaS-perspective, where the provisioning mainly involves the acquisition of VMs,
to an SaaS-perspective, where it is up to the SaaS vendor to scale the infras-
tructure to give the illusion of unbounded capacity.
Ideally, at provisioning time, a perfect match between resources and work-
load should be created. Unfortunately, this is difficult to achieve. In addition,
provisioning may take (long) time, and once a resource is obtained it may not be
alive immediately. In order to achieve a good auto-scaling process, provisioning
can take different forms:

• Reactive provisioning: provisioning takes place in response to traffic.


• Proactive provisioning: provisioning takes place before the actual increase
in traffic.

• Predictive provisioning: provisioning is carried out on the basis of a fore-


cast predicting unplanned traffic peaks.
The three forms of provisioning are shown in Table 1 by highlighting their
advantages and disadvantages.

Table 1. Provisioning comparison

Provisioning  Advantages                            Disadvantages

Reactive      - easy to use                         - risk of outages if provisioning is not
              - cost saving (provision exactly        quick enough
                what you need)
Proactive     - considers overhead in advance       - suitable only for environments with
                                                      predictable loads
                                                    - high costs (adds more capacity than is
                                                      actually needed)
Predictive    - can predict unplanned load peaks    - accuracy is a challenge

In the literature, the auto-scaling problem is typically addressed as an


autonomous control problem by using the MAPE (Monitor-Analyze-Plan-
Execute) loop as a reference model [18]. During the monitoring phase, an infor-
mation gathering activity is planned to assess the status of resource utilization.
This task requires the measurement of purposely-selected metrics about system
and application current state. During the analysis phase, the information col-
lected is analyzed to be used later on. Scaling decisions are taken by avoiding
oscillations that occur when scaling actions are carried out too quickly. The
analysis process could be even more complex, if machine learning techniques for
knowledge discovery from the information gathered by monitoring are adopted
[7]. The results obtained during the analysis phase are subsequently matched to
rules defined by the application provider.
The planner makes scaling decisions. It is possible to choose between hori-
zontal and vertical scaling. Horizontal scaling is the most widely used approach;
it adjusts the amount of Virtual Machine (VM) instances by acquiring further
VMs (scaling out) or by releasing idle VMs (scaling in). Vertical scaling, instead,
involves adding (scaling up) or subtracting (scaling down) to/from existing VMs
resources such as compute cores or RAM. Horizontal and vertical scaling are
suitable for different contexts; used jointly, they may guide the choice of the
optimal scaling strategy.
The last phase of the MAPE loop is the Execution one, which involves the
execution of the scaling decision previously taken. Conceptually, Execution is
an easy phase, but there are hidden complexities due to choices to be made in
the presence of multiple providers with data centers in different geographical
regions.
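
As an illustration of the loop described above, the sketch below shows a bare-bones, reactive MAPE cycle for horizontal scaling. It is a simplified example under stated assumptions: the monitor(), scale_out() and scale_in() callables are placeholders for platform-specific mechanisms, and the thresholds and cooldown value are arbitrary.

import time

COOLDOWN_S = 300   # avoid oscillations: do not scale again too quickly

def mape_loop(monitor, scale_out, scale_in):
    last_action_ts = 0.0
    while True:
        metrics = monitor()                    # Monitor: e.g. {"cpu": 0.85, "vms": 4}
        utilization = metrics["cpu"]           # Analyze: derive an aggregate figure
        now = time.time()
        action = None
        if now - last_action_ts > COOLDOWN_S:  # Plan: respect the cooldown period
            if utilization > 0.8:
                action = scale_out             # acquire a further VM (scaling out)
            elif utilization < 0.3 and metrics["vms"] > 1:
                action = scale_in              # release an idle VM (scaling in)
        if action is not None:                 # Execute: enact the decision
            action()
            last_action_ts = now
        time.sleep(60)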
As a matter of fact, many auto-scaling techniques are MAPE-loop-based
[23]. Considering all the aspects mentioned above, it is clear that the choice of
an optimal auto-scaling strategy is challenging. This complexity is also reinforced


by the presence of multiple cloud providers with an extraordinary variety of cost
plans.

3 Auto-scaling Approaches in the Literature


This section is an overview of the main approaches adopted in the litera-
ture in order to implement auto-scaling tools. They have been divided by the
auto-scaling method adopted (load prediction, resource-aware, SLA-aware, cost-
aware), and described in the following subsections.

3.1 Load Prediction Approaches


Estimating the use of resources in a context where there is a strong variability in
the clients’ workload models is quite complex. For many websites, for example,
it is not easy to plan for load peaks, as it is necessary to consider the interleaving
of a number of factors (time of day, day of week and other seasonal factors)
that together could make the load predictable. Average load or maximum peaks
could be taken into account, but each of the two solutions has disadvantages. In
the first case, provisioning problems could arise in the presence of unexpected
peaks; in the second, there could be a waste of resources if the workload stays
under the peaks. In this context, it would be desirable to predict the incoming
load for careful management of resources. In the literature there are many works
that deal with this problem.
In [16] the authors propose prediction-based resource measurement and pro-
visioning strategies by using Neural Networks and Linear Regression. The goal
is to satisfy the incoming demand by making the best use of resources. The
solution, in addition to providing accurate forecasts, makes it also possible to
predict the resource demand ahead of the VM instance setup time.
In [14] Herbst et al. describe a novel self-adaptive approach that selects fore-
casting methods depending on the context. The approach is based on decision
trees and direct feedback cycles. The results obtained from the experimenta-
tion show that the context selection of a forecast method reduces the error of
the load predictions, compared to the results obtained from a statically-selected
forecasting method.
Finally, in [8] the authors describe a scheduling algorithm and auto-scaling
triggering strategies based on user patience, a metric that estimates the per-
ception end-users have of the QoS delivered by a service provider. This metric
takes into account the ratio between expected and actual response times for each
request. This approach reduces the resource costs while maintaining perceived
QoS at adequate levels.
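
A minimal sketch of the flavour of these prediction-based approaches is given below. It uses a plain linear fit over a sliding window to forecast demand far enough ahead to cover the VM setup time, plus the expected/actual response-time ratio as a patience-style metric; the window length, horizon and sample values are illustrative assumptions, not the exact models of [16] or [8].

import numpy as np

def predict_load(request_rates, horizon_steps):
    # Fit a linear trend on recent observations and extrapolate ahead.
    t = np.arange(len(request_rates))
    slope, intercept = np.polyfit(t, request_rates, deg=1)
    return slope * (len(request_rates) - 1 + horizon_steps) + intercept

def patience(expected_rt, actual_rt):
    # Ratio between expected and actual response times; values well below 1
    # suggest that end-users perceive degraded QoS.
    return expected_rt / actual_rt

history = [120, 130, 150, 170, 200]            # requests per minute (illustrative)
print(predict_load(history, horizon_steps=5))  # forecast beyond VM setup time
print(patience(expected_rt=0.2, actual_rt=0.5))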

3.2 Resource-Aware Approaches


The progressive diffusion of cloud computing, with data centers spread over
different geographical areas, makes it necessary to optimize resource provisioning
policies. One of the main challenges in the field of resource provisioning concerns
the distribution of resources to applications, so as to reduce power consumption
and costs. Many recent studies have addressed this problem.
In [15] the authors propose a resource prediction model based on double
exponential smoothing. Besides considering the current state of resources, the
system also takes into account historical data. The experiments performed on
the CloudSim simulator show a good model accuracy. In [9] Bankole and Ajila
describe cloud client prediction models for the TPC-W benchmark web applica-
tion using three machine learning techniques: Support Vector Machine (SVM),
Neural Networks (NN) and Linear Regression (LR). They also use the SLA met-
rics for Response Time and Throughput as input parameters to the chosen pre-
diction models. As an improvement of this work, in [6] the authors implemented
their model on a public cloud infrastructure (Amazon EC2) by extending the
experimentation time by over 200% and using a random workload pattern. In
both cases the Support Vector Machine turned out to be the best model. Paper
[11] proposes mechanisms for autonomic service provisioning, implemented in a
prototype and tested using Wikibench, a benchmark that replays real workload
obtained from Wikipedia access traces. The experiments show that the system is
self-adaptive with respect to workload fluctuations. Paper [10] proposes an auto-
matic provisioning solution for multi-tier applications called AutoMAP. It allows
both horizontal and vertical auto-scaling. The system is tested using the RUBiS
benchmark on a real cloud architecture. Another hybrid solution is provided by
Moltó et al. [21]. They propose a system providing automatic vertical elasticity,
adapting the memory size of the VMs to their current memory consumption.
The system uses live migration to prevent overload scenarios, without downtime
for the VMs.
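
As an example of the kind of prediction model mentioned above, the following sketch implements standard double (Holt) exponential smoothing over utilization samples; it is only in the spirit of [15], and the smoothing factors and data are illustrative assumptions.

def double_exponential_smoothing(series, alpha=0.5, beta=0.3, horizon=1):
    # Holt's method: track a level and a trend, then extrapolate 'horizon' steps.
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + horizon * trend

cpu_usage = [0.40, 0.42, 0.47, 0.55, 0.61]     # illustrative utilization samples
print(double_exponential_smoothing(cpu_usage, horizon=2))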

3.3 SLA-Aware Approaches

Autonomic provisioning should make it possible to meet the requirements specified by the


Service Level Agreement (SLA) signed by application providers and cloud ser-
vice providers. In an SLA, the quality of service is specified as non-functional
requirements of services. Inadequate values of QoS lead to SLA penalties. In this
context, many QoS-based resource provisioning techniques have been proposed.
In [12] García et al. propose Cloudcompass, an SLA-aware cloud platform that
manages the complete resource life cycle. In particular, it provides cloud service
providers with a generic SLA model to manage higher-level metrics, closer to
the end-user perception. Cloudcompass also makes it possible to correct QoS violations by
exploiting the elasticity features of cloud systems.
Gill and Chana in [13] describe a QoS-based resource provisioning technique
that achieves a good reduction of the execution time and execution cost of
cloud workloads. A similar model is proposed by Kaur and Chana in [17]. In
particular, they propose a QoS-Aware Resource Elasticity (QRE) framework
aiming at dynamic scalability of cloud resources.
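
The sketch below illustrates, in simplified form, how SLA-aware provisioning can be driven by non-functional thresholds: a violation both triggers a scaling suggestion and accumulates a penalty. The SLA fields and penalty values are assumptions made only for this example.

SLA = {"max_response_time_s": 0.5,
       "min_throughput_rps": 100,
       "penalty_per_violation": 10.0}

def evaluate_sla(measured_rt_s, measured_throughput_rps, sla=SLA):
    violations = []
    if measured_rt_s > sla["max_response_time_s"]:
        violations.append("response_time")
    if measured_throughput_rps < sla["min_throughput_rps"]:
        violations.append("throughput")
    penalty = len(violations) * sla["penalty_per_violation"]
    action = "scale-out" if violations else "no-op"
    return action, penalty

print(evaluate_sla(measured_rt_s=0.8, measured_throughput_rps=90))  # ('scale-out', 20.0)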

3.4 Cost-Aware Approaches


The use of auto-scaling mechanisms that try to meet cost requirements is an
active research topic. Evaluating the cost of a scalable application is not easy.
The use of a service, for example, may be metered and therefore charged per
hour, or per GB of generated I/O traffic. For these reasons, it is crucial
to develop scaling strategies taking into account the billing cycles.
Moldovan et al. [20] introduce a model for capturing the pricing schemes of
cloud services. The solution is useful for the developers of scalable applications
for public clouds, as it makes it possible to monitor costs and develop cost-aware scalabil-
ity controllers. Ocone et al. [22] propose an analysis method based on off-line
benchmarking that allows to define scaling policies to be used by auto-scalers.
The approach consists in benchmarking the web application to discover the load
processing capacities of each component, making it easier to apply scaling up/out
policies in the presence of load variations. The analysis makes it possible to identify the
trade-offs between costs and quality of service of the application.
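
A small example of taking billing cycles into account is sketched below: since many providers charge per started hour, releasing a VM long before the end of its current billing cycle wastes capacity that has already been paid for. The cycle length, safety margin and decision rule are illustrative assumptions.

BILLING_CYCLE_S = 3600   # assume per-hour billing

def should_release(vm_uptime_s, billing_cycle_s=BILLING_CYCLE_S, margin_s=300):
    # Scale in only close to the end of the current (already paid) billing cycle.
    remaining = billing_cycle_s - (vm_uptime_s % billing_cycle_s)
    return remaining <= margin_s

print(should_release(vm_uptime_s=3500))  # True: the paid hour is almost over
print(should_release(vm_uptime_s=1800))  # False: half the paid hour would be wasted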

4 Auto-scaling Implementations
In this section, we analyze some existing auto-scaling solutions, including those
used by several commercial cloud providers.
Amazon Web Service (AWS) offers an auto-scaling service in the IaaS EC2
(Elastic Compute Cloud) public cloud [1]. An Auto Scaling Group (ASG) is char-
acterized by the configuration of the virtual machines that will be part of the
group. The Auto Scaling Group maintains EC2 instances in the group by per-
forming a periodic health check. If any instance becomes unhealthy, it is stopped
and replaced with another instance within the group. Another fundamental com-
ponent is the Launch Configuration. It is a template used by Auto Scaling Group
to launch EC2 instances. It makes it possible to specify the Amazon Machine Image (AMI),
instance type, key pair, security groups, etc. during the launch configuration
step. Finally, the Scaling Plans specify when and how to scale. There are
several ways to scale within the Auto Scaling Group:
• Maintaining current instance level at all times: maintenance of a
fixed number of running instances in the Auto Scaling Group. If an instance
becomes unhealthy, it is immediately replaced.
• Manual scaling: once a new group capacity is manually specified, auto-scaling maintains
the instances at the updated capacity.
• Scale based on schedule: this approach can be used when traffic peaks are
expected. In this case scaling actions are performed at specific times.
• Scale based on demand: with this approach resources scale by using a
scaling policy. It is possible to scale in or out considering specific parameters
(CPU utilization, Memory, Network In and Out, etc.).
Amazon AWS supports only horizontal auto-scaling out-of-the-box. Vertical
auto-scaling is possible by harnessing other AWS services (e.g., CloudFormation (CF)
templates, CloudWatch alarms, SNS and Lambda).
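
For concreteness, the sketch below shows how an Auto Scaling Group with a demand-based (target tracking) policy can be created programmatically. It is a hedged example using the boto3 Python SDK: the group name, launch configuration, region, availability zone and target value are illustrative assumptions, and the launch configuration is assumed to exist already.

import boto3

autoscaling = boto3.client("autoscaling", region_name="eu-west-1")

# Create the group from an existing launch configuration (AMI, instance type, keys).
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-tier-asg",
    LaunchConfigurationName="web-tier-launch-cfg",   # assumed to exist
    MinSize=2,
    MaxSize=10,
    DesiredCapacity=2,
    AvailabilityZones=["eu-west-1a"],
)

# "Scale based on demand": keep the group's average CPU utilization around 50%.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-tier-asg",
    PolicyName="cpu-target-tracking",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"},
        "TargetValue": 50.0,
    },
)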

Azure’s auto-scaling makes it possible to set up rules to automatically scale applications


(both horizontally and vertically) without manual intervention [2]. Rules can be
based on time (scaling is carried out at specified times), or on two types of
metrics:
• Resource metrics: related to usage within Azure (memory, CPU and disk
usage, thread count, queue length). It is possible to set Azure auto-scaling to
scale up or down based on these usage parameters.
• Custom metrics: these are metrics produced by the application itself. If they
are sent to Application Insights (a performance monitoring service by Microsoft),
they can be used to make decisions on whether to scale or not.
Rackspace Auto Scale is written in Python and relies on the Rackspace Cloud
Servers, Rackspace Cloud Load Balancers, and Rackspace RackConnect v3 APIs
[3]. Rackspace only supports horizontal scaling. The scaling events can be man-
aged both with scaling rules, which can be defined through the monitoring sys-
tem, and through a schedule that can be suitably configured. A scaling group is
a set of identical servers (and optionally a load balancer), characterized by the
following components:
• Scaling group configuration: the group name, cooldown time limit, mini-
mum and maximum number of needed servers.
• Launch configuration: if a scaling event is intercepted, the specific server
configurations are managed.
• Scaling policy: specifies the actions of the policy.
• Webhook (capability-based URL): triggers a scaling policy.
Auto-scaling in the Google Cloud platform is a feature of the instance group.
In particular, a group is composed of homogeneous instances, created from a com-
mon instance template. The auto-scaling service is offered by Compute Engine,
which supports only horizontal scaling [4]. Possible scaling policies include:
• CPU utilization: the auto-scaling event is linked to the average CPU uti-
lization of a group of virtual machines.
• Load balancing serving capacity: auto-scaling is based on load balancing
serving capacity by monitoring the serving capacity of an instance group.
There is a scaling event if the VMs are over or under capacity.
• Stackdriver Monitoring metrics: auto-scaling is based on a standard met-
ric provided by Google’s Stackdriver Monitoring, or on any custom metrics.
VMware software makes it possible to obtain, in a private cloud environment, the same
auto-scaling features provided by public cloud solutions like AWS and Azure.
In particular, the service can be offered through the VMware Pool auto-scaling
plugin [5].

5 Research Directions
Until the end of the first decade of the new millennium, the auto-scaling prob-
lem was addressed mainly from the datacenter management point of view and
was strictly related to the resource scheduling problem. The research focused on the
optimal allocation of many-task applications over existing resources, in order to
minimize resource usage, reduce energy consumption and so on. According to
the state of the art, auto-scaling in infrastructures is currently no longer a technolog-
ical issue: existing tools are able to scale both horizontally and vertically in an
almost transparent way for the upper layers, thanks to virtualization, containers
and balancers. In fact, auto-scaling is even offered as a commercial service (as
discussed in Sect. 4).
Nowadays the problem is decoupled: auto-scaling is offered as a service to
the application, and the requested resources are independently scheduled on the
lower physical infrastructural layer, aiming at optimizing resource consumption,
especially for energy-aware considerations. In practice, the auto-scaling problem
moves from the infrastructure level up to the application level: scaling should
be done taking into account the single application behavior, adapting to the
workload it is subject to, not considering the load of many-task applications.
In the current state of the art, infrastructures offer the capability to scale, while it is
up to the application to decide when, how and how much to scale.
As a consequence, the research topics change and can be summarized as follows:
(i) auto-scaling policy definition and design, (ii) comparison and benchmarking
of auto-scaling tools, (iii) definition of auto-scaling metrics and performance figures,
(iv) trade-offs between scaling and costs, and finally (v) protection against Economic
Denial of Sustainability or Fraudulent cloud Resource Consumption attacks.
The first and clear open research point is the need for techniques and lan-
guages to design and express scalability policies in a way that is as much as
possible vendor- and technology-independent, being at the same time able to
capture the application behavior and to define the criteria and logic needed to
scale.
Regarding the second point, the comparison and benchmarking of auto-scaling tools,
even if the technologies are nowadays commonly available, to the best of the authors'
knowledge there is no commonly accepted reference architecture for auto-
scaling tools, and no stable benchmarks that enable a comparison among the
different techniques and solutions on the market. This is clearly an open issue:
since auto-scaling is delegated to the application, together with the associated
costs, it is relevant to know how well the auto-scaling tool will react and imple-
ment the requested policy. Performance indicators should be correctly identified
in order to properly evaluate and implement the scaling policies.
The need for benchmarks opens up a research area even for new metrics
and indicators (see point (iii) above), able to express in a clear way the over-
head introduced, how it affects the scaling policy and, overall, the scaling qual-
ity. In this field, one of the most innovative aspects is to devise metrics tak-
ing into account the relationship with costs, as they result from pay-per-use
resource usage. While at the infrastructure level the resources are almost fixed and
the costs are mainly considered to optimize their use and/or to reduce energy
dissipation (obtained by reducing the amount of resources adopted), moving the
scaling problem to the application level implies a direct effect of scaling on costs:
adding/removing a VM or changing the type of machine directly affects the


costs, and so it is possible to define policies taking into account the cost as
a performance indicator (as illustrated in Sect. 3). Auto-scaling policies should
accordingly take into account the trade-off between performance-related metrics
and cost-related metrics, as outlined in point (iv).
Moreover, the cloud pay-per-use paradigm has an additional side-effect: a
new type of cyber-attack named Economic Denial of Sustainability (EDoS)
or Fraudulent cloud Resource Consumption (FRC), which aims at forcing
cloud applications to consume more resources than usual. The objective of these
attacks is not to make the service unavailable as in a DoS attack, but to increase the
application costs to be paid to the hosting cloud provider. Mitigation techniques
for EDoS and FRC exist (interesting surveys on the topic are [24–27]), but they
require suitable application design and need to be taken into account in the
auto-scaling policy design and/or tools implementation.

6 Conclusions
Auto-scaling is no longer a technological issue, but its adoption opens up a number
of new problems and opportunities to be addressed from a research point of view.
Our research team is now working on a set of new performance figures able to
benchmark, evaluate and compare auto-scaling tools. The objective is to devise
a methodology that, taking into account such parameters, can produce in an
automated way an auto-scaling policy. This policy should be provided using a
formalism that allows its implementation over existing services and tools. The
proposed methodology should be able not only to optimize resource usage and
manage the trade-off between costs and performance, but also to take into account
the risks due to EDoS attacks.

References
1. https://aws.amazon.com/it/autoscaling/
2. https://azure.microsoft.com/en-in/features/autoscale/
3. https://www.rackspace.com/cloud/auto-scale
4. https://cloud.google.com/appengine/docs/
5. https://go.cloudbees.com/docs/plugins/vmware/
6. Ajila, S.A., Bankole, A.A.: Cloud client prediction models using machine learn-
ing techniques. In: 2013 IEEE 37th Annual Computer Software and Applications
Conference, pp. 134–142 (2013)
7. Amiri, M., Mohammad-Khanli, L.: Survey on prediction models of applications for
resources provisioning in cloud. J. Netw. Comput. Appl. 82(C), 93–113 (2017)
8. Assuncao, M., Cardonha, C., Netto, M., Cunha, R.: Impact of user patience on auto-
scaling resource capacity for cloud services. Future Gener. Comput. Syst. 55, 41–50
(2015)
9. Bankole, A.A., Ajila, S.A.: Cloud client prediction models for cloud resource provi-
sioning in a multitier web application environment. In: 2013 IEEE 7th International
Symposium on Service-Oriented System Engineering, pp. 156–161 (2013)
10. Beltran, M.: Automatic provisioning of multi-tier applications in cloud computing environments. J. Supercomput. 71, 2221–2250 (2015). https://doi.org/10.1007/s11227-015-1380-5
11. Casalicchio, E., Silvestri, L.: Mechanisms for SLA provisioning in cloud-based ser-
vice providers. Comput. Netw. 57, 795–810 (2013)
12. Garcia Garcia, A., Blanquer, I., Hernández Garcia, V.: SLA-driven dynamic cloud
resource management. Future Gener. Comput. Syst. 31, 1–11 (2014)
13. Gill, S.S., Chana, I.: Q-aware: quality of service based cloud resource provisioning.
Comput. Electr. Eng. 47, 138–160 (2015)
14. Herbst, N.R., Huber, N., Kounev, S., Amrehn, E.: Self-adaptive workload clas-
sification and forecasting for proactive resource provisioning. In: Proceedings of
the 4th ACM/SPEC International Conference on Performance Engineering, ICPE
2013, pp. 187–198. ACM, New York (2013)
15. Huang, J., Li, C., Yu, J.: Resource prediction based on double exponential smooth-
ing in cloud computing. In: 2012 2nd International Conference on Consumer Elec-
tronics, Communications and Networks (CECNet), pp. 2056–2060 (2012)
16. Islam, S., Keung, J., Lee, K., Liu, A.: Empirical prediction models for adaptive
resource provisioning in the cloud. Future Gener. Comput. Syst. 28(1), 155–162
(2012)
17. Kaur, P., Chana, I.: A resource elasticity framework for QoS-aware execution
of cloud applications. Future Gener. Comput. Syst. 37, 14–25 (2014)
18. Maurer, M., Breskovic, I., Emeakaroha, V.C., Brandic, I.: Revealing the MAPE
loop for the autonomic management of Cloud infrastructures. In: 2011 IEEE Sym-
posium on Computers and Communications (ISCC), pp. 147–152. IEEE, Corfu
(2011)
19. Mell, P., Grance, T.: The NIST definition of cloud computing. NIST Spec. Publ.
800, 145 (2011)
20. Moldovan, D., Truong, H.L., Dustdar, S.: Cost-aware scalability of applications in
public clouds (2016)
21. Moltó, G., Caballer, M., de Alfonso, C.: Automatic memory-based vertical elastic-
ity and oversubscription on cloud platforms. Future Gener. Comput. Syst. 56(C),
1–10 (2016)
22. Ocone, L., Rak, M., Villano, U.: Benchmark-based cost analysis of auto scaling web
applications in the cloud. In: 2019 IEEE 28th International WETICE Conference,
pp. 98–103 (2019)
23. Qu, C., Calheiros, R.N., Buyya, R.: Auto-scaling web applications in clouds: a
taxonomy and survey. ACM Comput. Surv. 51(4), 1–33 (2018)
24. Singh, P., Manickam, S., Ul Rehman, S.: A survey of mitigation techniques against
economic denial of sustainability (EDoS) attack on cloud computing architecture.
In: Proceedings of ICRITO 2014, May 2015
25. Somasundaram, A.: Economic denial of sustainability attack on cloud - a survey.
ICTACT J. Commun. Technol. 07(04), 6 (2016)
26. Thaper, R., Verma, A.: A survey on economic denial of sustainability attack miti-
gation techniques. Int. J. Innov. Res. Comput. Commun. Eng. 3(3), 6 (2015)
27. VivinSandar, S., Shenai, S.: Economic denial of sustainability (EDoS) in cloud
services using HTTP and XML based DDoS attacks. Int. J. Comput. Appl. 41(20),
11–16 (2012)
Dynamic Patterns for Cloud Application
Life-Cycle Management

Geir Horn1(B), Leire Orue-Echevarria Arrieta2, Beniamino Di Martino3,
Pawel Skrzypek4, and Dimosthenis Kyriazis5

1 University of Oslo, P.O. Box 1080 Blindern, 0316 Oslo, Norway
  Geir.Horn@mn.uio.no
2 Fundacion TECNALIA Research and Innovation, Derio, Spain
  Leire.Orue-Echevarria@tecnalia.com
3 University of Campania “Luigi Vanvitelli”, Caserta, Italy
  beniamino.dimartino@unina.it
4 7Bulls.com, Al. Szucha 8, 00-582 Warsaw, Poland
  pskrzypek@7bulls.com
5 University of Piraeus, Piraeus, Greece
  dimos@unipi.gr

Abstract. Cloud applications are by nature dynamic and must react to


variations in use, evolve to adopt new Cloud services, and exploit
new capabilities offered by Edge and Fog devices, or within data centers
offering Graphics Processing Units (GPUs) or dedicated processors for
Artificial Intelligence (AI). Our proposal is to alleviate this complexity
by using patterns at all stages of the Cloud application life-cycle: deploy-
ment, automatic service discovery, monitoring, and adaptive application
evolution. The main idea of this paper is that it is possible to reduce the
complexity of composing, deploying, and evolving Cross-Cloud applica-
tions using dynamic patterns.

1 Introduction
The question is not if Cloud computing should be used, but how: There are
concerns about private data, and consequently the simultaneous use of private
and public Cloud; there are questions about vendor lock-in and portability; there
are questions about the best deployment models, like deploy a Virtual Machine
(VM) and a database in that machine, or use a database as a service offered in
the Cloud; there are questions about scalability and maintenance of the deployed
application as application use and Cloud offerings evolve over time. Furthermore,
applications in the future will need to relate to major IT-trends1 like wearable
devices and sensors, mobility of users, and strong security requirements.
1 https://www.iqvis.com/blog/cloud-computing-predictions-2020/.

This work has received funding from the European Union’s Horizon 2020 research and
innovation programme under grant agreement No. 731664 MELODIC: Multi-cloud
execution-ware for large-scale optimised data-intensive computing; and grant agree-
ment No. 731533 DECIDE: Multicloud applications towards the digital single market.
© Springer Nature Switzerland AG 2020
L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 626–637, 2020.
https://doi.org/10.1007/978-3-030-33509-0_59

Today, a decade into the era of Cloud computing, the situation is similar to
the one faced a decade into the era of object-oriented programming when soft-
ware systems grew in complexity making them exponentially harder to develop
and maintain. There was a need for abstractions [30] and the identification of best
practices, which gave rise to what became known as design patterns [10]. With
growing software system complexity this again led to architectural patterns [26],
and recently to microservices patterns [6] to help the design of distributed appli-
cations in the Cloud.
Even though such patterns capture the best practices and may help the design
and initial deployment of Cloud applications, the current state of the art fails to
capture the dynamic nature of Cloud applications. The Cloud deployment must
react to variations in use, adopt new Cloud services, and exploit new capabili-
ties offered by Edge and Fog devices, or within data centers offering GPUs or
dedicated processors for AI. The Cloud services, virtual and physical resources,
and their combinations – in the core Cloud, at the Edge, or in the Fog – will
herein be referred to collectively as Cloud capabilities.
Autonomic computing [20] applied to the application Cloud deployment can
remedy some of the concerns above. This requires a continuous feedback con-
trol loop: Monitor, Analyse, Plan, Execute—with Knowledge (MAPE-K) [17].
There are already successful utility based approaches to autonomic application
configuration management in context aware mobile computing [22] and ubiq-
uitous computing systems [28]. In the novel Multi-cloud Execution ware for
Large scale Optimised Data Intensive Computing2 (MELODIC) framework these
approaches are extended to Cross-Cloud autonomic application deployment and
run-time management [14]. These approaches assume that the application to be
deployed can be modeled as a set of interconnected components or objects [25],
and there are many frameworks based on dialects of cloud modeling languages [2].
MELODIC exploits the application model in The Cloud Application Modelling
and Execution Language (CAMEL) [1], and can thus be seen as a complete mod-
els@run.time [16] framework that has already been successfully applied to several
demanding Computational Intelligence (CI) [19] applications [15].
An issue with this approach is the Planning part of the MAPE-K loop since
finding the best application configuration for a given execution context is a com-
binatorial optimization problem whose time complexity is typically exponential
in the number of application components, and possibly also the Cloud capa-
bilities offered by the Cloud providers that may host the application’s compo-
nents. The complexity of the problem remains hard even though it is possible in
many cases to find algorithms that work well for smaller configurations, e.g. an
adapted and improved version of the Nondominated Sorting Genetic Algorithm
II (NSGA-II) [21] is used in the DECIDE3 project to optimize the problem of
selecting the most appropriate cloud services for an application that needs to
comply with a set of non-functional requirements [24].

2 https://melodic.cloud/.
3 https://www.decide-h2020.eu/.

However, a better approach to deal with the complexity could be to identify statically the application topology and architecture as a software pattern.
The application topology pattern sub-graphs could then be mapped onto known
classes of deployment patterns. This would reduce the problem at run-time to
select the most suitable deployment patterns for the current application exe-
cution context. This paper will explore this vision. Section 2 will show how
the application can be statically reduced to a set of patterns. Section 3 dis-
cusses Cloud deployment patterns and how they can be used for fast deployment
decisions. The application life-cycle considerations are discussed in Sect. 4, and
Sect. 5 elaborates on the consequences of using patters for application manage-
ment and adaptation.

2 Software Patterns and Semantics


There are often practical issues which limit portability and interoperability of
legacy or even native Cloud applications [4]: different data formats, parame-
ters semantics, unclear descriptions of the exposed Application Programming
Interfaces (APIs), and so on. Furthermore, many vendors try to bind their cus-
tomers to their own platform, making it difficult or expensive for them to port
their applications to another environment when needed. An effective approach,
enabling automated reasoning, is the adoption of semantic representations, and
specifically ontologies: an ontology is a formal, machine-readable knowledge representa-
tion by means of a set of domain-related concepts and the relationships between
those concepts. The Web Ontology Language4 (OWL) is a semantic mark-up lan-
guage for publishing and sharing ontologies on the World Wide Web (WWW). A
number of ontologies related to Cloud computing have emerged in the past few years.
Androcec et al. [3] provide an overview of Cloud Computing ontologies, their
types, applications and scope. Deng et al. [9] present a formal catalog represen-
tation of Cloud services that models, with ontologies, a range of Cloud services
and their processes. Takahashi et al. [29] use ontologies to describe cybersecu-
rity operational information such as data provenance and resource dependency
information.
There are no formal, standard, and universally accepted languages to describe
design and Cloud patterns in a uniform manner. Semantic based languages have
been proposed in the literature to formalize and categorize patterns: Dietrich and
Elgar defined an ontology based model, called Object Design Ontology Layer
(ODOL) [18] that defines a series of OWL classes and properties to describe
patterns, focusing on the description of the application context of a pattern,
analyzing its possible uses and identifying real implementation of its partici-
pants [18]. On the other hand, it neglects other aspects of patterns, like their
behavior or the different relationships existing among their participants, thus
losing expressiveness.
The mOSAIC project has developed Cloud services and patterns descrip-
tion based on semantic technologies [5]. The defined Semantic Model – based
4 https://www.w3.org/TR/owl2-overview/.
on standard World Wide Web Consortium5 (W3C) semantic representation lan-


guages OWL and Semantic Markup for Web Services6 (OWL-S), and on their
integration with the ODOL model – is able to represent in a machine readable
representation Cloud Services – with their functional and non-functional features
– and Cloud Patterns, which represent correlations and composition of Cloud
services according to well established design, architectural and process patterns.
The mOSAIC Cloud Ontology has been adopted by the Institute of Electrical
and Electronics Engineers7 (IEEE) Standard for Intercloud Interoperability and
Federation8 (SIIF).
Figure 1 shows the Semantic model as a graph, structured into five conceptual
layers. The graph represents concepts (graph nodes) and relationships (graph
edges) at different layers. In each layer, relationships among concepts of the same
layer are represented, in addition to inter-layer relationships.
The Application Software Architectural Patterns layer, labeled (4) in Fig. 1,
represents the description of patterns describing the application to be deployed
on a platform, where an application pattern represents a composition of applica-
tion components embodying application domain functionality. The Cloud Pat-
terns layer, layer (3) in Fig. 1, represents the semantic description of agnostic
and vendor dependent patterns. It represents patterns at the Infrastructure as a
service (IaaS) and at the Platform as a Service (PaaS) level. The Services layer,
layer (2) in Fig. 1, represents the semantic annotation of the provider dependent
platform services and the supporting ontologies needed to identify the plat-
form provider supported operation, input and output parameters. This layer
presents details of the provider platform architecture, the functionality exposed
and the underlining details. This layer contains also the semantic description
of the agnostic platform services exposed through an ontology that exposes in
vendor neutral terms platform resources, operations and exchanged parameters.
This layer is grounded onto the two layers underneath - representing the ground-
ing of the semantic representation, and following the Web Service Description
Language (WSDL) standard: the Operations layer, layer (1) in Fig. 1, that rep-
resents the syntactic description of the operation and functionality exposed by
the platform services, and provides a machine-readable description of how the
service can be called, what parameters it expects, and what data structures
it returns; and the Parameters layer (layer 0 in Figure) which represents the
description of the data type exchanged among services as input and output of
the operations.
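
To make the idea of machine-readable pattern descriptions tangible, the following Python sketch builds a tiny OWL fragment with rdflib, linking an application pattern to an agnostic Cloud pattern through an inter-layer relationship. The namespace, class and property names are invented for this illustration and are not the actual mOSAIC/ODOL vocabulary.

from rdflib import Graph, Namespace, Literal
from rdflib.namespace import OWL, RDF, RDFS

CP = Namespace("http://example.org/cloud-patterns#")   # illustrative namespace
g = Graph()
g.bind("cp", CP)

g.add((CP.ThreeTierWebApplication, RDF.type, OWL.Class))   # application pattern (layer 4)
g.add((CP.LoadBalancedCluster, RDF.type, OWL.Class))       # agnostic Cloud pattern (layer 3)
g.add((CP.realizedBy, RDF.type, OWL.ObjectProperty))       # inter-layer relationship
g.add((CP.ThreeTierWebApplication, CP.realizedBy, CP.LoadBalancedCluster))
g.add((CP.LoadBalancedCluster, RDFS.comment,
       Literal("Stateless replicas behind a load balancer; scales horizontally.")))

print(g.serialize(format="turtle"))   # machine-readable (Turtle) representation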

3 Deployment Patterns
Microservices are a key pattern both in software engineering and in software
management, that is, in the deployment and configuration of the different
services that contribute to an overall application.
5 https://www.w3.org/.
6 https://www.w3.org/Submission/OWL-S/.
7 https://www.ieee.org/.
8 https://standards.ieee.org/project/2302.html.

Fig. 1. The Semantic Model as a graph with five layers highlighting the generic and
provider agnostic side, and the platform and vendor specific side. The semantic rep-
resentations of both sides are fundamental for defining the equivalence among the
different levels and to enable semantic inference and matchmaking within a coherent
pattern based methodology.

Microservices refer to atomic and individual services with a single but well-defined aim,
which are deployed autonomously [8] and which in composition formulate a service (i.e.
application). Given that each microservice is deployed individually, the over-
all deployment of an application raises the challenge of how different microser-
vices are deployed, i.e. how the overall deployment of the application can be
optimized. The latter becomes even more challenging in cloud infrastructure
deployments given the distributed underlying infrastructures and the different
types of resources that can be provided, e.g. VMs, containers, or bare metal; and
the way these resources are connected, e.g. bandwidth and latency.
Deployment patterns concern the process of identifying the optimal actions
and practices that will automate the deployment and execution of services in
a computing infrastructure, such as a computing Cloud [7], meaning how the
different microservices are deployed in the same VM for example or in differ-
ent VMs with specific connection requirements and constraints between these
VMs. Amato and Moscato [12] propose a pattern-based orchestration methodology that


describes the composition of a Cloud service and provides information regarding
the way the service should be deployed. The same authors suggest that a way to
orchestrate more efficiently the computing infrastructure and deploy the services
on top of them is related to the proper description of the services [13]. When the
services are properly defined, with respect to their needs, then the deployment
is adjusted in order to achieve higher performance. Yamato proposed a platform
that analyses the computing infrastructure usage during deployment in order to
achieve high performance [31]. Still, the challenge of the proper placement of the
services in heterogeneous environments such as a computing Cloud remains open.
Furthermore, there are several frameworks that aim at addressing the afore-
mentioned deployment patterns and configurations challenge. Octopus Deploy9
is a tool that is used for the deployment of services in cloud computing infras-
tructures and automates the process of the application deployment. Through
Octopus Deploy, run-time adaptation can be achieved since it detects possible
infrastructure changes and dynamically adjusts. IBM’s WAS10 deployment
manager is yet another production tool for the administration and orchestration
of application services in WAS deployment shells. In
addition to the above-mentioned tools, Google Cloud Deployment manager11
is a deployment service that speeds up the configuration and orchestra-
tion of services hosted in Google Cloud Platform. The Runtime Configurator of
this Deployment manager allows the service provider to dynamically configure
services.
One of the key enablers for optimized deployment patterns are the software
patterns and semantics, which provide the ground for describing and character-
izing the microservices to be deployed. Approaches aim at analyzing the capabil-
ities of the microservices through their semantic descriptions, and the associated
requirements in order to identify the deployment patterns in Cloud environments
by utilizing information about the current state of the Cloud infrastructures
where the services are to be deployed, the foreseen availability of resources, the
failure estimations, etc. At the same time such approaches must take additional
aspects into consideration, such as the interdependent services and the data and
network dimensions. An additional aspect to deployment patterns generation
and configuration refers to their adaptation during run-time given the services
evolution, the utilization of specific microservices of the application, and the
infrastructure evolution. The latter enables the provision of adaptive deploy-
ments through the identification and dynamic adaptation of the deployment
patterns.
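
A toy version of such a matching step is sketched below: microservice descriptions (here plain dictionaries standing in for semantic annotations) are mapped to one of a few pre-defined deployment patterns. The service attributes, pattern names and selection rule are assumptions made only for this illustration.

services = [
    {"name": "frontend", "cpu": 1, "mem_gb": 1, "latency_sensitive_to": ["api"]},
    {"name": "api",      "cpu": 2, "mem_gb": 2, "latency_sensitive_to": ["db"]},
    {"name": "db",       "cpu": 2, "mem_gb": 4, "latency_sensitive_to": []},
]

def choose_deployment_pattern(services, vm_cpu=4, vm_mem_gb=8):
    total_cpu = sum(s["cpu"] for s in services)
    total_mem = sum(s["mem_gb"] for s in services)
    latency_coupled = any(s["latency_sensitive_to"] for s in services)
    if total_cpu <= vm_cpu and total_mem <= vm_mem_gb and latency_coupled:
        return "co-located"      # everything on one VM: minimal network latency
    if latency_coupled:
        return "same-zone-vms"   # separate VMs, kept within one availability zone
    return "independent-vms"     # fully independent placement

print(choose_deployment_pattern(services))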

9 https://octopus.com/.
10 https://www.ibm.com/support/knowledgecenter/en/linuxonibm/liaag/wecm/l0wecm00 was deployment manager.htm.
11 https://cloud.google.com/deployment-manager/.

4 Cloud Application Life-Cycle


The application life-cycle of multi- and cross-Cloud native applications differs
from what the literature in software engineering has reported. The development of
Cloud native applications involves shorter and faster development cycles, and the
operation activities now have a more prominent role than before. Users expect
Cloud applications to be always available and perform well, which implies that
application providers need to respond more quickly to malfunctions or bottlenecks.
This has created the need to merge the development and operation teams
into one, namely DevOps [27], which aims at being able to continuously ’archi-
tect’, develop, integrate, test, deploy and operate the application. Furthermore,
in order to understand what needs to be improved in the application and to
always provide a responsive and available solution, means have to be provided
to monitor the different components of the application and to automatically
redeploy these components into a new configuration that responds to the
users’ needs and expectations.
The development of multi- and cross-Cloud native applications deployed on
a hybrid scenario pose several challenges, both in development and operations
time such as:
• Applications need to be always responsive in terms of availability and perfor-
mance, among other aspects, under this hybrid Cloud configuration, especially
when the micro services of the application are deployed on different Cloud
resources from different Cloud Service Providers (CSP). To achieve that, the
health and conditions of the application micro services should be continuously
monitored in order to be able to respond to malfunctions or inefficiencies at
operation time. To this end, and to ensure a complete availability, the appli-
cation shall be able to adapt itself and be redeployed in a new, appropriate
configuration.
• Communication in terms of information exchange among the micro services
and different components of the multi-Cloud application: As systems evolve
and grow, the management of endpoints becomes cumbersome. To this end,
there are tools, e.g. Kubernetes12 that support the orchestration of containers
and can manage such endpoints, can be of assistance, but the learning curve
is high.
• Vendor lock-in: Each CSP offers its own technology, libraries and frameworks
that should be used when deploying an application on their services. Further-
more, porting data and applications from one Cloud service to another can
be challenging from the technical and regulatory point of view, although the
container approach eases this task to a great extent. Moreover, article 6 of
the new regulation of Free Flow of Data [11] aims to facilitate this, through
the adherence of the CSPs to the codes of conduct currently being developed.
• Discovery, selection, management of the most appropriate Cloud service or
combination of cloud services for a specific application: The selection of cloud
services is currently more and more complex. Taking into consideration the
12 https://kubernetes.io/.
plethora of offerings and configurations that exist, this activity becomes
even more challenging. To alleviate this, cloud service brokers such as Cloud-
more13, Computenext14, Nephos hybrid Cloud management15, Intercloud16,
IBM cloud brokerage solutions (previously Gravitant17) and Jamcracker18,
come into play. However, these solutions do not present an easy way to select
Cloud services or do not consider the combination of multiple Cloud resources
to respond to the needs of multi-Cloud native applications. This situation can
also be extended to the monitoring of the non-functional characteristics of the
different Cloud resources. An initial solution to respond to this challenge has
been implemented in the DECIDE project19 , as part of the Advanced Cloud
Service meta-Intermediator (ACSmI).

5 Application Adaptation

There are many types of Cloud applications. Our vision assumes that the application
will run over a long period of time, and the deployment must therefore consider
the various factors indicated in the life-cycle discussion of the previous Sect. 4. Addi-
tionally, the application may have intrinsic variability caused by the data it
processes or caused by the application users. Consider for instance an airline
industry application responsible for scheduling planes, crews certified for some
type of planes, and the passengers. It must be available around the clock, all
days around the year. It may experience variation in demand and use between
day and night, and between week days and weekends. It may see changes in
users’ location as the earth rotates. Finally, if there is a major incident closing
a major airport hub, there may temporarily be a large demand for resources to
re-schedule planes, crews and passengers to clear this exceptional situation as
quickly as possible. All of these situations call for the application to be adaptive
to its current execution context.
It will be costly and error prone to have a human DevOps team constantly
monitoring the application and exploiting the elasticity of the Cloud computing
paradigm to provide or to remove resources as the demand fluctuates. Current
autonomic approaches based on the MAPE-K loop are reactive. In other words,
when some Complex Event Processing (CEP) shows that the monitored sensors
indicate a change in the application’s execution context in the analysis phase,
there must be an efficient planning phase finding a better deployment solution
and executing this solution before the application context has changed again.
The planning phase typically involves solving a combinatorial optimization prob-
lem whose time complexity grows exponentially with the number of factors to
13
https://web.cloudmore.com/.
14
https://www.computenext.com/platform/enterprise-cloud-brokerage.
15
http://www.nephostechnologies.com/technology/hybrid-cloud-management/.
16
https://www.intercloud.com/platform/overview.
17
https://www.ibm.com/us-en/marketplace/cloud-brokerage-solutions.
18
https://www.jamcracker.com/.
19
https://www.decide-h2020.eu.
consider for the next deployment. Using anytime algorithms can produce a
new deployment configuration within the time window of stability for
the current execution context. However, this may severely impact the solution
quality, and consequently the usefulness of the adapted deployment.
Consequently, it is paramount to reduce the number of factors to consider
in the planning phase. Knowing the software patterns of the application, and
their surjective mapping onto architecture patterns, and the surjective mapping
of these again onto deployment patterns will reduce the decision problem to
selecting the deployment pattern that is best suited for the application’s current
context. This is essentially a sorting problem that can be solved in polynomial
time. Consider as an example a very simple application with a front-end web-
server and a back-end database. Some deployment patterns can be pre-computed
for these components: They can be co-located on the same VM, they can be
on separate VMs, and the web-server and the database can both be hosted as
services offered by the Cloud platform provider. If cost is the main concern,
the current cost of using any of these three deployment patterns can quickly be
retrieved from the potential Cloud providers, and the less expensive pattern is
selected and enacted.
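
The decision step in this example can be sketched in a few lines: once the candidate deployment patterns are pre-computed, run-time planning reduces to ranking them by the current objective, here cost. The pattern names and hourly prices below are invented placeholders, not real provider offers.

patterns = [
    {"name": "co-located VM",       "hourly_cost": 0.12},
    {"name": "two separate VMs",    "hourly_cost": 0.20},
    {"name": "managed web + DBaaS", "hourly_cost": 0.17},
]

def select_pattern(patterns, objective="hourly_cost"):
    # Polynomial-time decision: pick the best of a handful of candidate patterns.
    return min(patterns, key=lambda p: p[objective])

print(select_pattern(patterns)["name"])   # -> "co-located VM"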

6 Discussion

The significant rise of Cloud computing models, services and solutions [23], in
conjunction with automatic adaptation platforms, creates a very complex envi-
ronment for the development and maintenance of modern applications. Also,
modern applications are becoming more complex and integrated with various
sources of data in different ways. Applications may process huge amounts of data
in real-time, based on data streams or complex data lakes. On the other hand,
even when using Cloud computing, the cost of computing resources is a signif-
icant element of the whole Total Cost of Ownership (TCO) of an Information
Technology (IT) system. The rising complexity of modern applications requires
dedicated methods to handle this issue. As presented in this paper, by using well
defined, described, implemented, and tested patterns, it is possible to reduce the
complexity of modern Cloud applications. It is very important to distinguish
Cloud deployment patterns from software development patterns. The former are
dedicated patterns representing application architecture, which handle typical
situations for the deployment into the Cloud. Use of these predefined patterns limits
the impact of the complexity of deployment. Also, there is limited need for test-
ing configurations based on these patterns, as they have already been deployed
and tested. On the other hand, there is still the issue of selecting the right pattern
at a given moment, especially for applications where different patterns
must be deployed at various levels of application use, e.g. high workload or
low workload. An increasing workload would sometimes even require completely
redeploying the application with a different deployment architecture.
As automatic application management platforms become standard for the
deployment of modern, complex applications, the importance of deployment
patterns is increasing. Automatic adaptation platforms need to exploit predefined,
well-described, and tested patterns, and select the best one for the particular sit-
uation. Without patterns, it would be very difficult to use these platforms, since
one must define and hard code, for each situation, a particular type of deploy-
ment. Using patterns, one can measure the effectiveness of each deployment
and learn which pattern is the most suitable for the given or similar situations.
To our knowledge, the only competing approach using patterns for
Cloud deployments is to use one deployment pattern for all situations, and hard-
code it into the application configuration. This is a less flexible approach and
requires more effort for implementation and testing than the approach outlined
here with automatic application adaptation. Also, the static use of patterns lim-
its the potential benefit of automatic adaptation, as adaptation would be limited
to the size of resources, and would not be allowed to change deployment archi-
tecture.
When automatically selecting the best deployment patterns for the given sit-
uation, the automatic adaptation platform is able not only to adjust the amount
of resources given to the application, but also to find a more optimal deployment
architecture. We expect a rising number of Cloud deployment patterns, as
the complexity of modern applications keeps increasing. Also, the increasing num-
ber of Cloud services and models will support this trend, in conjunction with
wider usage of automatic adaptation platforms.

7 Conclusion
In this paper we have devised and illustrated a preliminary methodology to sup-
port the complex Cloud application life-cycle consisting of designing, developing,
composing and deploying Cloud applications in a multiservice and multiplatform
scenario, which includes possible edge and fog integration. Our proposal is to use
patterns extensively at all stages. We are investigating techniques for automat-
ing the enactment of the different stages of the life-cycle, with the use of machine-
readable semantic representations of patterns and services, inference, and machine
learning techniques.

References
1. Rossini, A., Kritikos, K., Nikolov, N., Domaschka, J., Griesinger, F., Seybold, D.,
Romero, D., Orzechowski, M., Kapitsaki, G., Achilleos, A.: The cloud applica-
tion modelling and execution language (CAMEL). OPen Access Repositorium der
Universität Ulm, p. 39 (2017). https://doi.org/10.18725/OPARU-4339
2. Bergmayr, A., Rossini, A., Ferry, N., Horn, G., Orue-Echevarria, L., Solberg, A.,
Wimmer, M.: The evolution of CloudML and its applications. In: Paige, R., Cabot,
J., Brambilla, M., Hill, J.H. (eds.) Proceedings of the 3rd International Workshop
on Model-Driven Engineering on and for the Cloud 18th International Conference
on Model Driven Engineering Languages and Systems (MoDELS 2015), vol. 1563,
pp. 13–18. CEUR Workshop Proceedings, Ottawa (2015). http://ceur-ws.org/Vol-
1563/

3. Androcec, D., Vrcek, N., Seva, J.: Cloud computing ontologies: a systematic review.
In: The Third International Conference on Models and Ontology-based Design of
Protocols, Architectures and Services, MOPAS 2012, pp. 9–14 (2012)
4. Di Martino, B., Cretella, G., Esposito, A.: Cloud Portability and Interoperability -
Issues and Current Trends. Springer, Berlin (2015). https://doi.org/10.1007/978-
3-319-13701-8
5. Beniamino, D.M., Antonio, E., Giuseppina, C.: Semantic representation of cloud
patterns and services with automated reasoning to support cloud application porta-
bility. IEEE Trans. Cloud Comput. 5(4), 765–779 (2017). https://doi.org/10.1109/
TCC.2015.2433259
6. Richardson, C.: Microservices Patterns: With Examples in Java, 1st edn. Manning
Publications, Shelter Island, New York (2018)
7. Namiot, D., Sneps-Sneppe, M.: On micro-services architecture. 2(9), 24–27 (2014).
http://injoit.org/index.php/j1/article/viewFile/139/104
8. Taibi, D., Lenarduzzi, V., Pahl, C.: Architectural patterns for microservices: a
systematic mapping study. In: CLOSER 2018, pp. 221–232 (2018). https://pdfs.
semanticscholar.org/f6e8/8823482de1729584acbfb450d4502f4d393d.pdf
9. Deng, Y., Head, M., Kochut, A., Munson, J., Sailer, A., Shaikh, H.: Introducing
semantics to cloud services catalogs. In: 2011 IEEE International Conference on
Services Computing (SCC), pp. 24–31 (2011)
10. Gamma, E., Helm, R., Johnson, R., Vlissides, J., Booch, G.: Design Patterns:
Elements of Reusable Object-Oriented Software, 1st edn. Addison-Wesley Profes-
sional, Reading (1994)
11. European Union: Regulation (EU) 2018/1807 of the European Parliament and of
the Council of 14 November 2018 on a framework for the free flow of non-personal
data in the European Union (text with EEA relevance) (2018). http://data.europa.
eu/eli/reg/2018/1807/oj
12. Amato, F., Moscato, F.: Pattern-based orchestration and automatic verification
of composite cloud services. 56 (2016). https://www.sciencedirect.com/science/
article/pii/S0045790616302026
13. Amato, F., Moscato, F.: Exploiting cloud and workflow patterns for the analysis
of composite cloud services. Future Gener. Comput. Syst. 67, 255–265 (2017)
14. Horn, G., Skrzypek, P.: MELODIC: utility based cross cloud deployment optimisa-
tion. In: 32nd International Conference on Advanced Information Networking and
Applications (AINA) Workshops, pp. 360–367. IEEE Computer Society, Krakow
(2018). https://doi.org/10.1109/WAINA.2018.00112
15. Horn, G., Skrzypek, P., Materka, K., Przeździek, T.: Cost benefits of multi-cloud
deployment of dynamic computational intelligence applications. In: Barolli, L.,
Takizawa, M., Xhafa, F., Enokido, T. (eds.) Proceedings of the Workshops of the
33rd International Conference on Advanced Information Networking and Appli-
cations (WAINA-2019). Advances in Intelligent Systems and Computing, vol.
927, pp. 1041–1054. Springer, Matsue (2019). https://doi.org/10.1007/978-3-030-
15035-8_102
16. Blair, G., Bencomo, N., France, R.B.: Models@run.time. Computer 42(10), 22–27
(2009). https://doi.org/10.1109/MC.2009.326
17. IBM: An architectural blueprint for autonomic computing. White Paper Third
Edition, IBM, 17 Skyline Drive, Hawthorne, NY 10532, USA (2005). http://www-
03.ibm.com/autonomic/pdfs/AC%20Blueprint%20White%20Paper%20V7.pdf
18. Dietrich, J., Elgar, C.: A formal description of design patterns using OWL. In:
2005 Australian Software Engineering Conference, pp. 243–250. IEEE Computer
Society, Brisbane (2005). https://doi.org/10.1109/ASWEC.2005.6

19. Kacprzyk, J., Pedrycz, W. (eds.): Springer Handbook of Computational Intelligence. Springer Handbooks. Springer, Heidelberg (2015)
20. Kephart, J.O., Chess, D.M.: The vision of autonomic computing. Computer 36(1),
41–50 (2003). https://doi.org/10.1109/MC.2003.1160055
21. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective
genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002).
https://doi.org/10.1109/4235.996017
22. Geihs, K., Barone, P., Eliassen, F., Floch, J., Fricke, R., Gjørven, E., Hallsteinsen,
S., Horn, G., Khan, M.U., Mamelli, A., Papadopoulos, G.A., Paspallis, N., Reichle,
R., Stav, E.: A comprehensive solution for application-level adaptation. Softw.:
Pract. Exp. 39(4), 385–422 (2009). https://doi.org/10.1002/spe.900
23. Maciaszek, L.A., Skalniak, T.: Confluent factors, complexity and resultant archi-
tectures in modern software engineering: a case of service cloud applications. In:
5th International Symposium on Business Modeling and Software Design, BMSD
2015, pp. 37–45. SciTePress (2015)
24. Arostegi, M., Torre-Bastida, A., Bilbao, M.N., Del Ser, J.: A heuristic approach
to the multicriteria design of IaaS cloud infrastructures for big data applications.
Expert Syst. 35(5), e12259 (2018). https://doi.org/10.1111/exsy.12259
25. Ferry, N., Chauvel, F., Song, H., Rossini, A., Lushpenko, M., Solberg, A.: CloudMF:
model-driven management of multi-cloud applications. ACM Trans. Internet Tech-
nol. (TOIT) 18(2), 16:1–16:24 (2018). https://doi.org/10.1145/3125621
26. Avgeriou, P., Zdun, U.: Architectural patterns revisited - a pattern language. In:
Longshaw, A., Zdun, U. (eds.) Proceedings of the 10th European Conference on
Pattern Languages of Programs (EuroPLoP 2005), vol. D3, pp. 1–39. UVK - Uni-
versitaetsverlag Konstanz, Irsee (2005)
27. Jabbari, R., bin Ali, N., Petersen, K., Tanveer, B.: What is DevOps? A system-
atic mapping study on definitions and practices. In: Proceedings of the Scientific
Workshop Proceedings of XP2016, XP 2016 Workshops, pp. 12:1–12:11. ACM,
Edinburgh (2016). https://doi.org/10.1145/2962695.2962707
28. Hallsteinsen, S., Geihs, K., Paspallis, N., Eliassen, F., Horn, G., Lorenzo, J.,
Mamelli, A., Papadopoulos, G.A.: A development framework and methodology for
self-adapting applications in ubiquitous computing environments. J. Syst. Softw.
85(12), 2840–2859 (2012). https://doi.org/10.1016/j.jss.2012.07.052
29. Takahashi, T., Kadobayashi, Y., Fujiwara, H.: Ontological approach toward cyber-
security in cloud computing. In: Proceedings of the 3rd International Conference
on Security of Information and Networks, pp. 100–109. ACM (2010)
30. Cunningham, W., Beck, K.: Constructing abstractions for object-oriented applica-
tions. Technical report CR-87-25, Tektronix, Inc, Computer Research Laboratory
(1987)
31. Yamato, Y.: Optimum application deployment technology for heterogeneous IaaS
cloud. Inform. Process. Soc. Jpn. 25, 56–58 (2017)
From Monolith to Cloud Architecture
Using Semi-automated Microservices
Modernization

Salvatore Augusto Maisto(B) , Beniamino Di Martino, and Stefania Nacchia

Dipartimento di Ingegneria, University of Campania “Luigi Vanvitelli”, Aversa, Italy


{salvatoreaugusto.maisto,stefania.nacchia}@unicampania.it,
beniamino.dimartino@unina.it

Abstract. The motivation for the transition from monolithic architectures to microservices comes from the fact that constantly maintaining a monolithic architecture has resulted in difficulties in keeping pace with new development approaches such as DevOps, which call for deployments several times a day. In contrast, microservices offer a more flexible option, where individual services comply with the single responsibility principle (SRP) and can therefore be scaled and deployed independently. We propose a methodology that, starting from an application's source code, produces a new microservice-oriented architecture for the application that is deployable on the Cloud.

Keywords: Cloud computing · Microservices · Monolith · Refactoring · Modernization

1 Introduction

Lately, more and more organizations have developed a tendency to move existing enterprise-scale and mostly monolithic applications to the cloud. The reasons to do so are manifold: high availability and redundancy, automatic scaling, easier infrastructure management and compliance with the latest security standards enable
a more agile and combined flow of development and operation, also referred to
as DevOps [3]. The motivation for this transition comes also from the fact that
monolithic architectures that have grown over years can become large, complex
and in later stages even fossilize, meaning the accumulated technical debt results
in obscure structures that make the product un-maintainable with a reasonable
effort. A new architectural style, referred to as Microservices, promises to address
these issues. It started as a trend in software engineering industry practice which
was first described in detail by Lewis and Fowler [14].
Companies such as Amazon [18], Netflix [17], LinkedIn [10], SoundCloud [5]
and many more have made the transition to microservice architecture because
their existing monolithic application was too hard to maintain, develop and scale.
On the other hand, the transition between these quite different architectures is not easy: e.g., the communication between multiple microservices can introduce
performance issues if the services are too fine-grained [20]. One challenge is also
to handle the orchestration of the microservices in production. Luckily, in the last few years many new tools supporting this have emerged, such as Kubernetes [1] and Mesos [2], to name two. The main contribution of this paper is to provide
a general methodology that could support a team in re-factoring a monolith
application thus easing the transition to microservices. The basic idea behind
the proposed methodology is that not everything can be done automatically, but
it is possible to identify the application’s modules that can be used or combined
to obtain a microservices architecture and, then, also produce the skeleton stubs
for the microservices architecture. The remainder of the paper is organized as
follows. In Sect. 2 we will highlight the architecture evolution that in time led to
the birth of microservices architecture; in Sect. 3 we will highlight the current
approaches used for re-factoring and modernizing a monolithic architecture.

2 Roadmap to Microservices

2.1 Monolithic Architecture

The first architectural style introduced for application development is the so-called "monolith" architecture; according to it, the developed application is basically a single entity. Monolithic applications are easy to implement, since they consist of a single project and a single code base; moreover, the design is done once and takes into account every single functionality. This type of architecture lends itself well to applications that are small or otherwise not subject to change, but it is not well suited if we find ourselves developing applications that must necessarily evolve rapidly.
In these situations, monolithic applications can easily become mammoth, i.e. very large and complex, which makes it difficult to move quickly in development, testing and deployment. It is not uncommon that, while the application is deployed in production, the customer wants to add a new feature or change some parts of the existing workflows; in these cases even small changes absolutely must go through a complete test of the entire application before being released to production.
Furthermore, the only way to be able to scale a monolithic application is to
replicate the entire application with a consequent increase in costs and necessary
resources.

2.2 Microservice Architecture


Microservices are an architectural style that structures an application as a col-
lection of small autonomous and atomic services, modeled around a business
domain. In microservices architectures, each service is self-contained and imple-
ments a single business capability. Let’s consider a generic application as a use
case to understand the difference between Monolith architecture and Microser-
vices, as shown in Fig. 1.

Fig. 1. Differences between monolithic architecture and microservices

The main difference we observe in Fig. 1 is that all the features are initially under a single instance sharing a single database, whereas, with microservices, each feature is allocated to a different microservice that handles its own data and performs a distinct functionality.

3 State of the Art


Whether the decision is to start developing an application directly using
microservices or start from a monolith architecture that later will be refactored
towards microservices, there are many technical challenges that need to be addressed and solved in order to use microservices; e.g. microservices add more distribution, which adds more points of failure. This brings up many questions
such as how to handle failures, how services communicate with each other, how
transactions are handled and so on [19].
If a monolithic application, for some reason, stops running while in production, it is very fast to locate the failure, as nothing would work. On the other hand, with microservices, if one service stops responding the other services still do, and the localization of the failure is more difficult and must be handled
properly. As said before, communication among microservices is one of the big
questions to get right. Getting the communication wrong can lead to a situation
where microservices lose their autonomy and thus the main benefits of the whole
approach can diminish [19].
Refactoring of the existing software can be a daunting task. In order to
successfully perform the refactoring, good test coverage is required. Otherwise, there is a chance that during the introduction of microservices new bugs might end up in the existing features. It might also be hard to find and define which parts of the existing software should be split up and which ones are good candidates for microservices. One of the methods is to find seams
from the existing software [19]. A seam is a part of the code that can be isolated
and work alone in separation from the rest of the codebase [8].

Finding the seams requires good knowledge about the business use cases.
However, this knowledge should already be in the organization: either in the code base, if the monolith has a good modular architecture, or, even if the modules are not well defined, at least in clearly defined use cases.
On top of the technical challenges, adopting microservice architecture requires
organizational changes [19], as the teams need to adapt and adjust in an ever
changing environment. The organization needs to adapt accordingly to the new
architecture. This means developing, testing, deploying and taking care of the
service in production. DevOps culture needs to be adopted as teams now need to
deploy, monitor and address issues also in production [12]. Teams have to set up
their own continuous integration (CI) and continuous delivery (CD) pipelines.

4 Challenges with the Adoption of Microservices

In the previous sections we have briefly highlighted the challenges that can be faced when transitioning from a monolith to microservices, but the challenges concern the entire microservices paradigm. According to [13], the challenges of adopting a microservices architecture can be divided according to two different points of view:

– the technical challenges;
– the organizational challenges.

4.1 Technical Challenges

There are various technical challenges that need to be addressed and solved
before adopting a full microservices architecture. When the starting point is a
monolithic application, the organization is most likely already familiar with the
domain and has an idea where the seams of the application can be found. The
biggest problem in these cases is to find and separate these services.
It can take a lot of time and effort to refactor the services out from the mono-
lithic architecture. This is why the refactoring towards microservices should be
done in small steps. Also, when implementing new functionalities, they should
not be appended to the monolithic stack even though it might be faster. Instead,
organizations should expand their microservices offering and add new microser-
vices to replace the old monolithic code. It is extremely important to be careful
when doing this refactoring, because there is the possibility of introducing new
bugs in existing features. This is why good test coverage is needed before starting
this process.
When splitting up the services, attention should also be paid to the fact
that the services do not become too fine-grained. Microservices can introduce
a relevant performance overhead, especially if the communication is done over
network [19]. For example, if the communication is done using REST over HTTP
protocol, each inter-service call adds overhead from the network latency and from
marshalling and un-marshalling the data. If the services are too fine-grained there
will be a lot of traffic between them and, as each call adds overhead, the outcome
can be a system that does not perform well enough.
One of the biggest challenges is the integration between different microser-
vices [14]. It is not recommended to tie the integration between services to some
specific technology, because the teams might want to use different programming
languages when implementing services. There are also multiple other challenges
when thinking about the integration of microservices. The microservice interfaces should be simple to use and should have good backwards compatibility, so that when new functionalities are introduced, the clients using the service do not necessarily have to be updated.

4.2 Organizational Challenges

One of the organizational challenges is the structure of the organization itself, which could prevent it from successfully embracing the microservices paradigm. In order to develop good applications, the organization must align its structure with the structure of the application architecture.
If, with the previous monolithic application, the organization had big teams with clear roles like quality assurance, development and database administration, then this kind of organizational structure does not work with microservices. Conway's law states that the organization which designs a system will produce a system whose structure is a copy of the organization's structure [9]. If the structure of the organization is monolithic, then the microservices approach does not work. The organization must split these big teams into smaller teams which can work autonomously. This way the structure of the architecture is in line with the structure of the organization and they do not conflict with each other.

5 Methodology

The proposed methodology was carried out with the aim of providing support
for the re-engineering of old monolithic applications, in favor of new cloud tech-
nologies using a microservices architectural approach.
The goal is to provide the designer with a series of guidelines for the imple-
mentation of the new software and to minimize the task of redesigning the appli-
cation architecture. Another goal is to provide developers with the communica-
tion stubs of the new microservices, so that they can concentrate exclusively on
re-engineering the functionalities.
The proposed methodology consists of three phases: “Decomposition
phase” which deals with identifying a possible microservice architecture starting
from the application source code; “Microservices production and ranking
phase” which deals with producing the communication stubs and ordering the
microservices among them in order to provide a priority index for development;
and “Cloud deployment phase” in which through the annotation of the iden-
tified microservices and thanks to domain ontologies (Cloud Services and Cloud
pattern) it is possible to identify any microservices that can be replaced by cloud
services already ready for use, and finally produce a standard deployment model
of the new microservice application.

5.1 Decomposition Phase

The objective of this first phase is to identify the new microservices to be developed; in a first version it was decided to start from any monolithic application written in Java. The process of identifying the proposed microservices is the
following:

– Identify the classes that make up the project;
– For each class, identify the methods;
– Assume that for each class the corresponding microservice can exist;
– Calculate the communication rate between classes, understood as the number of times a first class calls a method of a second class and vice versa;
– Combine the classes that have a high coupling;
– Iteratively repeat the procedure until reaching a fair trade-off between the number of microservices produced and the communication rate (a sketch of this iterative merging is given below).
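The core of this procedure is the iterative merging of strongly coupled classes. The sketch below is our own illustration in Python (data structures and names such as call_counts and merge_step are hypothetical, not part of the paper's tooling); it assumes the AST and call-graph analysis has already produced, for each pair of classes, the number of method calls exchanged between them.

```python
from itertools import combinations

def communication_rate(graph, a, b):
    """Normalised edge weight: total calls exchanged between the two
    (possibly already merged) groups of classes a and b."""
    return graph.get(frozenset((a, b)), 0)

def merge_step(nodes, graph, threshold):
    """Merge the most strongly coupled pair whose communication rate exceeds
    the designer-chosen threshold X; return None when no pair qualifies."""
    candidates = [(communication_rate(graph, a, b), a, b)
                  for a, b in combinations(nodes, 2)]
    if not candidates:
        return None
    rate, a, b = max(candidates, key=lambda t: t[0])
    if rate <= threshold:
        return None
    merged = a | b                              # keep the history of joined classes
    new_nodes = (nodes - {a, b}) | {merged}
    new_graph = {}
    for pair, weight in graph.items():
        pair = frozenset(merged if n in (a, b) else n for n in pair)
        if len(pair) == 2:                      # drop edges internal to the merge
            new_graph[pair] = new_graph.get(pair, 0) + weight
    return new_nodes, new_graph

def decompose(classes, call_counts, threshold):
    """Iteratively merge highly coupled classes into microservice candidates."""
    nodes = {frozenset([c]) for c in classes}
    graph = {}
    for (a, b), count in call_counts.items():   # entry level -> normalised edges
        key = frozenset((frozenset([a]), frozenset([b])))
        graph[key] = graph.get(key, 0) + count
    while (step := merge_step(nodes, graph, threshold)) is not None:
        nodes, graph = step
    return nodes
```

For example, decompose({"Order", "Invoice", "Mailer"}, {("Order", "Invoice"): 40, ("Invoice", "Mailer"): 2}, threshold=10) merges Order and Invoice into a single microservice candidate and leaves Mailer on its own.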

Fig. 2. Decomposition phase process

Fig. 2 schematizes the process described above.


The analysis of the application was done on the application's source code; for this purpose we used two well-known code analysis techniques: the construction of the abstract syntax tree (AST) and the construction of the call graph; the first to identify the classes and their respective methods, and the second to calculate the communication rate.
For a better manipulation of the inferred information it has been decided to
produce an internal model that could represent the information recovered from
the analysis of the code and that could easily be manipulated automatically for
the aggregation of the classes. A layered graph model has been identified as the most suitable solution (Fig. 3). This model represents the identified classes as nodes and the method calls between the classes as edges. The call direction of the methods, i.e. which class calls the other, was not considered important, since for the analysis to be carried out both directions have equal value; for this reason the individual levels of the model are undirected graphs with weighted edges.
The first level, “Entry level”, consists of as many nodes as the classes
identified through the AST. Each class is connected to other classes by as many
edges as there are methods that are called by the other class and vice versa;
each edge has a weight that indicates how many times that particular method
is called.
The second level, “Edge normalization level”, provides an abstraction of
the underlying level by normalizing the number of edges between classes. For
each pair of classes there is only one weighted edge, where the weight is the sum
of all the weights of the edges between the two classes. The normalization step is
fundamental in order to have a single number (communication rate) to evaluate.
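Stated compactly (the notation below is ours, not the authors'): if $n_{AB}^{(m)}$ denotes the number of times method $m$ is called between classes $A$ and $B$ in either direction, the weight of the single normalised edge is

$$w(A,B) = \sum_{m} n_{AB}^{(m)},$$

i.e. the communication rate that the merging heuristic of the subsequent levels compares against the threshold X.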
Subsequent levels, “no merged layer”, are used to represent intermediate
class mergers. These mergers are decided by a very simple heuristic “The rate
of communication between two classes must not be greater than X” where X
is an arbitrary value chosen by the designer. The choice of this heuristic is, as of today, very simplistic and not self-explanatory. The purpose here is to theoretically validate the methodology as a whole; once the methodology is completed in every step, this heuristic variable will be the result of a constrained optimization problem that takes into account all the necessary variables.
The last level, “End merged layer”, represents the new microservice archi-
tecture. Each node keeps within it the history of the joint nodes, so as to be able
to better reconstruct the communication stubs.

5.2 Microservices Production and Ranking Phase


At this point the new microservices architecture has been defined; what is missing is the modernization of the existing code and the concrete creation of the new microservices. Fig. 4 shows the chosen process.
First the microservices communication stubs are produced, and the microservices are then ranked. For the production of the projects it was decided not to bind to
any programming language but to maintain a level of abstraction such that the
implementation of microservices could be done in any programming language.
As described in [19], one of the main technologies used for communication between microservices is the RESTful API [21]; for this reason it was decided to build, for each microservice, the skeleton of the corresponding API. The OpenAPI Initiative [11] provides a specification for the representation of RESTful APIs and a set of tools for processing the specification document in order to produce the communication stubs (client and server). For this reason it has been adopted in this work as a de facto reference standard for the production of the stubs.
Fig. 3. Layered graph model

Fig. 4. Microservices production phase process

For each identified microservice, a specification document has been
produced that includes within it all the functionalities that the microservice
exposes towards the other microservices.
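As an illustration, a minimal specification skeleton for one identified microservice could look as follows. This is a hand-written sketch using the OpenAPI 3.0 vocabulary; the service name and operation are hypothetical, not output produced by the methodology.

```python
import json

# Minimal OpenAPI 3.0 document for a hypothetical "order" microservice,
# exposing one operation towards the other microservices.
order_service_spec = {
    "openapi": "3.0.0",
    "info": {"title": "order-service", "version": "1.0.0"},
    "paths": {
        "/orders/{orderId}": {
            "get": {
                "operationId": "getOrder",
                "parameters": [{
                    "name": "orderId",
                    "in": "path",
                    "required": True,
                    "schema": {"type": "string"},
                }],
                "responses": {
                    "200": {"description": "The requested order"}
                },
            }
        }
    },
}

# The specification document is what the OpenAPI tooling consumes to
# generate the client and server communication stubs.
print(json.dumps(order_service_spec, indent=2))
```

One such document per microservice can then be fed to OpenAPI code generators to obtain the server skeletons and client stubs mentioned above.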
The microservices ranking is proposed in this phase to give the designer the priority to follow in the development of the new architecture. As described in [13], one of the main reasons for the modernization of applications is the obsolescence of the code, i.e. the use of old libraries and deprecated methods. In this phase, in a deliberately simplistic way, it has been chosen to consider only this factor; for this reason, for each microservice the deprecated parts of code are identified using a knowledge base. The microservice with the most deprecated code is the first candidate for re-engineering, and the rest are ordered according to the same logic.
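A minimal sketch of this ranking criterion (our own illustration; the knowledge base of deprecated APIs is represented here as a plain set):

```python
def rank_microservices(services, deprecated_apis):
    """Order microservices by how much deprecated code they use.

    services: mapping from microservice name to the set of library calls it uses.
    deprecated_apis: knowledge base of calls known to be deprecated.
    The most obsolete service comes first and is re-engineered first.
    """
    def deprecation_count(item):
        _, used_calls = item
        return len(used_calls & deprecated_apis)

    return [name for name, _ in sorted(services.items(),
                                       key=deprecation_count,
                                       reverse=True)]

# Example: 'billing' uses two deprecated calls, 'catalog' only one.
priority = rank_microservices(
    {"billing": {"Date.parse", "Thread.stop", "HttpClient.get"},
     "catalog": {"Thread.stop", "HttpClient.get"}},
    deprecated_apis={"Date.parse", "Thread.stop"})
print(priority)  # ['billing', 'catalog']
```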

Fig. 5. Cloud deployment phase process

5.3 Cloud Deployment Phase

In the last phase of the methodology the new microservices architecture is eval-
uated and the microservices that can be replaced by services offered in the cloud
are identified; finally a model is produced that can allow an easy deployment of
the application. Fig. 5 shows these steps.
In [7] a framework for the semantic representation of the RESTful APIs of IoT devices was proposed; in this work the same semantic model, based on OWL-S [15], is used to semantically represent the identified microservices. This semantic model is the basis of an inferential engine that, by exploiting different knowledge bases, such as the semantic representation of cloud services described in [16], the semantic representation of cloud patterns described in [6], and more generally any semantic representation of applications and cloud services, allows the identification of services that can replace the microservices of the new architecture.
The last step is to produce a standard model that can be used for application
deployment. The OASIS standard “Topology and Orchestration Specification for
Cloud Applications” (TOSCA) [22] has been identified as a reference standard.
The standard makes it possible to represent the complete architecture of a complex system and to create a model that can then be used for the deployment of the application or, in the case of this work, for the deployment of the newly identified microservices.
The methodology proposed in this phase involves the automatic creation of
a file, following the TOSCA syntax, which contains, for each microservice, all
the information necessary for its deployment. Thanks to the use of the OpenAPI
model it is possible to produce the code to be inserted in the TOSCA ImplementationArtifact and to produce CSAR packages that can, in turn, be executed through
the tools that support the TOSCA specification, such as OpenTosca [4].

Acknowledgement. This work has received funding from the Campania Regional
Government, under the project “Linee Guida e Proposte per I 4.0 Campania”.

References
1. Kubernetes (2017). https://kubernetes.io/
2. Mesos (2017). https://mesos.apache.org/
3. Bass, L., Weber, I., Zhu, L.: DevOps: A Software Architect’s Perspective. Addison-
Wesley Professional, Boston (2015)
4. Binz, T., Breitenbücher, U., Haupt, F., Kopp, O., Leymann, F., Nowak, A., Wag-
ner, S.: OpenTOSCA–a runtime for TOSCA-based cloud applications. In: Interna-
tional Conference on Service-Oriented Computing, pp. 692–695. Springer (2013)
5. Calçado, P.: Building products at soundcloud—part i: dealing with the monolith.
Dosegljivo (2014). https://developers.soundcloud.com/blog/building-products-at-
soundcloud-part-1-dealing-withthe-monolith
6. Di Martino, B., Esposito, A., Cretella, G.: Semantic representation of cloud pat-
terns and services with automated reasoning to support cloud application porta-
bility. IEEE Trans. Cloud Comput. 5(4), 765–779 (2015)
7. Di Martino, B., Esposito, A., Maisto, S.A., Nacchia, S.: A semantic IoT framework
to support RESTful devices’ API interoperability. In: 2017 IEEE 14th International
Conference on Networking, Sensing and Control (ICNSC), pp. 78–83. IEEE (2017)
8. Feathers, M.: Working Effectively with Legacy Code. Prentice Hall Professional,
Upper Saddle River (2004)
9. Fowler, M.: Microservice premium. Saatavissa, May 2015. http://martinfowler.
com/articles/microservices.html
10. Ihde, S.: InfoQ—from a monolith to microservices 1 rest: the evolution of LinkedIn’s
service architecture (2015)
11. Initiative, O., et al.: OpenAPI specification. Retrieved from GitHub, 1 (2017).
https://github.com/OAI/OpenAPI-Specification/blob/master/versions/3.0
12. Jamshidi, P.: Microservices architecture enables DevOps
13. Kalske, M., et al.: Transforming monolithic architecture towards microservice
architecture (2018)
14. Lewis, J., Fowler, M.: Microservices: a definition of this new architectural term.
Mars (2014)
15. Martin, D., Burstein, M., Hobbs, J., Lassila, O., McDermott, D., McIlraith, S.,
Narayanan, S., Paolucci, M., Parsia, B., Payne, T., et al.: OWL-S: Semantic
markup for web services. W3C member submission 22(4) (2004)
16. Martino, B.D., Cretella, G., Esposito, A., Carta, G.: An owl ontology to support
cloud portability and interoperability. Int. J. Web Grid Serv. 11(3), 303–326 (2015)
17. Mauro, T.: Adopting microservices at Netflix: lessons for architectural design
(2015). https://www.nginx.com/blog
18. Munns, C.: I love APIs 2015: microservices at Amazon (2015)
19. Newman, S.: Building Microservices: Designing Fine-Grained Systems. O’Reilly
Media, Inc., Newton (2015)
20. Richards, M.: Microservices AntiPatterns and pitfalls (2016)
21. Richardson, L., Ruby, S.: RESTful Web Services. O’Reilly Media, Inc., Newton
(2008)
22. Standard, O.: Topology and orchestration specification for cloud applications ver-
sion 1.0 (2013)
Reinforcement Learning for Resource
Allocation in Cloud Datacenter

Salvatore Venticinque, Stefania Nacchia(B) , and Salvatore Augusto Maisto

Department of Engineering, University of Campania “Luigi Vanvitelli”,


Via Roma 9, Aversa, Italy
{salvatore.venticinque,stefania.nacchia,
salvatoreaugusto.maisto}@unicampania.it

Abstract. Cloud technologies provide capabilities that can guarantee to the end user high availability, performance and scalability. However, the growing use of IoT technologies and devices has made applications not only more computationally intensive, but also data intensive. Because of this, dynamically scaling applications running on clouds can lead to varied and unpredictable results, due to the highly time-varying workloads that distinguish this new kind of application. These applications are also often composed of different independent modules that can easily be moved across devices. Automatic scheduling and allocation of these modules is not an easy task, because many conditions can prevent the design of a smart solution. Thus, determining appropriate scaling policies in a dynamic non-stationary environment is non-trivial, as a problem arises concerning resource allocation. Deciding which resources should be added and removed, when the underlying performance of the resources is in a constant state of flux, becomes an issue. In this work we model both the applications and the infrastructure in order to formulate a Reinforcement Learning problem that automatically finds the best configuration for the application modules, taking into account the environment in which they are placed and the applications already running.

Keywords: Cloud data center · Resource management · Resource allocation · Reinforcement learning · Markov Decision Process

1 Introduction
The Cloud, as of today, is a widely used computing model that enables con-
venient, on-demand network access to a shared pool of configurable computing
resources (e.g., networks, servers, storage, applications, and services) that can
be rapidly provisioned and released with minimal management effort or service
provider interaction [9]. In the cloud computing paradigm, all ICT resources are
“virtualised” as datacenter facilities and are operated by third party providers
[15–17]. Cloud computing data-centers employ virtualisation technologies that

allow the scheduling of multiple workloads on a smaller number of servers. Obviously this "multiple" provisioning must be correctly handled as different work-
loads may have different resource utilisation footprints and may further differ in
their temporal variation. Due to the highly relevant time-variance of the work-
loads, it is of major importance to address the problem of cloud resource alloca-
tion and management. More specifically the need for new frameworks that can
automatically adapt and adjust the allocation of the resources has arisen.
Numerous automatic decision-making approaches have been applied to par-
tially solve the resource allocation problem in the cloud computing environment.
One specific current trend in this field relies heavily on mechanisms provided by
Reinforcement Learning (RL) [13] to enable cloud infrastructure providers to allocate resources optimally under variable and unpredictable workloads.
Due to the dynamic nature of resource demands and the complexity of the
cloud environment, it is hard to set up a mathematical model for the effi-
cient resource allocation strategy. In this paper we present a formulation of
the resource allocation problem that takes into account the features of both the
infrastructure and the applications. The remainder of the paper is as follows: in the following sections we focus on the challenges of resource allocation and the current approaches for solving such issues. In Sect. 4 we formulate the problem we address in this paper and we model a solution using the Reinforcement Learning approach. The last section explains how we will exploit the defined model in future work.

2 Resource Management

Most companies use cloud data-centers to run:

– Data-processing and intensive applications (from development tools like continuous integration suites to business tools such as video transcoders)
– Transaction-processing software (including social networks and e-commerce websites)
– Or event-processing systems (such as fraud detection tools in the financial market).

The cloud is quite appealing because of its ability to reduce capital expenditures and increase return on investment: the traditional model, where physical hardware and software resources were bought and managed in-house, is no longer an option, due to its time- and cost-consuming management, which has often resulted in poor quality of service. In fact, many approaches face workload and system changes that affect their performance over time.
The cloud, first, must be able to handle large fluctuating loads; of course
some situations are quite predictable, e.g. the promotion of a new feature.
In these cases resources can be provisioned in advance through the use of proven
techniques like scaling and balancing. However, for unplanned usage spikes or
other kinds of events, even more sophisticated solutions cannot ensure the best
results.

This is one of the main reasons why companies rely on the cloud infrastructure: it grants features, like autoscaling mechanisms, that automatically adjust the resources allocated to applications based on their needs at any given time. On the contrary, if the resources were managed directly by the companies, these scaling mechanisms would have to be programmed by the companies themselves.
In addition, auto-scaling seems to be a lot more justified when it comes to handling changes in applications (following software improvements). The core features of auto-scaling are a pool of available resources that can be pulled or released on demand and a control loop to monitor the system and decide in real time whether it needs to grow or shrink [3]. In this scenario the burden of ensuring the Service Level Agreement (SLA) and Quality of Service (QoS) falls completely on the cloud data-centers.
Thus a cloud data-center has to constantly handle numerous requests for the
currently running applications and these demands also vary quite rapidly. Due to
the continuous arrival and departure of workloads, the allocation, migration and consolidation processes across the data center are performed periodically to ensure stability in the data center and low overheads.
Another fundamental aspect is that the workload of each service or application generally changes over time, as mentioned before, and consequently its energy consumption and related costs may significantly impact both the cloud data-center and the users.
Thus, it is necessary to adopt specific mechanisms that perform resource
allocation and management that are scalable, cost-aware and energy-aware, in order to efficiently:

– Respect the current SLA in the presence of time-varying demands.
– Ensure and maintain a high QoS level, even when reconfiguring the running services and applications, in order to avoid performance degradation.
– Keep the resource utilization of the data center between the under-utilization and over-utilization thresholds.
– Ensure that the reconfiguration of the applications and services does not impact the final cost for the end user.

2.1 Resource Allocation

Resource allocation is one of the challenges of resource management, because end-users can access resources from anywhere and at any time. The resources in a cloud can be accessed through RESTful web API requests for computation or storage, which are mapped to virtualized ICT resources (servers, blob storage, elastic IPs, etc.).
Since cloud data-centers offer an abundance of resources, the cloud computing model is able to support on-demand elastic resource allocation. However, such abundance also leads to non-optimal resource allocation.
In the cloud computing paradigm, the key challenge is the allocation of resources among end-users whose resource requests change based on their application usage patterns. These unpredictable and changing requests need to run on data-center resources across the Internet.
The aim of resource allocation for any particular cloud provider can be either to optimize the QoS of applications or to improve resource utilization and energy efficiency. The main objective is to optimize QoS parameters, e.g. response time,
which measure the efficiency of resource allocation regardless of the type of ICT
resources allocated to end-users.
The optimized QoS parameters can be any measure such as time, space, bud-
get, and communication delay. Some of the challenges associated with energy effi-
cient resource allocation policies, which have been identified also in [5], include:
– Choosing the best allocation according to characteristics such as resource
usage, performance, and power consumption.
– Evaluating the inter-dependencies between workloads that may reside on the
same physical node
– Provisioning of resource allocation and utilization at run time by evalu-
ating the possibility of centralized, federated, and standardized datacenter
resources.
– Increasing performance by assessing application inter-dependencies to facili-
tate resource consolidation.

3 Related Work
The current state of the art, regarding not only the resource allocation problem but, more generally, resource management, comprises a plethora of works.
To the best of our knowledge, most of them set as primary objective the energy
efficiency of the resource management mechanisms while focusing on the optimal
scheduling and allocation of the virtual machines.
Nathuji and Schwan [10] have studied power management techniques in the
context of virtualized data centers. Besides hardware scaling and VMs consol-
idation, the authors have introduced and applied a new power management
technique called “soft resource scaling”. The idea is to emulate hardware scal-
ing by providing less resource time for a VM using the Virtual Machine Moni-
tor’s (VMM) scheduling capability. Raghavendra et al. [11] have investigated the
problem of power management for a data center environment by combining and
coordinating five diverse power management policies. The authors explored the
problem in terms of control theory and applied a feedback control loop to coor-
dinate the controllers’ actions. It is claimed that, similarly to [10], the approach
is independent of the workload type. Like most of the previous works, the sys-
tem deals only with CPU management. Verma et al. [14] have formulated the
problem of power-aware dynamic placement of applications in virtualized hetero-
geneous systems as continuous optimization: at each time frame the placement of
VMs is optimized to minimize power consumption and maximize performance.
All these cited work focus on more classic techniques for improving resource
management, while more recently machine learning techniques have been exten-
sively used for the same purpose. Specifically the advanced field of reinforcement
learning is more and more establishing itself as a major technique for resource
management.
In [19] the authors present a unified reinforcement learning approach, namely
URL, to automate the configuration processes of virtualized machines and appli-
ances running in the virtual machines. The approach lends itself to the applica-
tion of real-time autoconfiguration of clouds. It also makes it possible to adapt
the VM resource budget and appliance parameter settings to the cloud dynam-
ics and the changing workload to provide service quality assurance. Dutreilh et al. [3] propose to deal with these problems using appropriate initialization for the early stages as well as convergence speedups applied throughout the learning phases, and present their first experimental results. They also introduce a performance model change detection mechanism to complete the learning process management. The authors of [4] propose a Reinforcement
Learning-based Dynamic Consolidation method (RL-DC) to minimize the num-
ber of active hosts according to the current resources requirement. The RL-DC
utilizes an agent to learn the optimal policy for determining the host power mode
by using a popular reinforcement learning method. [20], inspired by the success of
Deep Reinforcement Learning (DRL) on solving complicated control problems,
present a novel DRL-based framework for power-efficient resource allocation in
cloud RANs. Specifically, the authors define the state space, action space and
reward function for the DRL agent, apply a Deep Neural Network (DNN) to
approximate the action-value function, and formally formulate the resource allo-
cation problem as a convex optimization problem. [8] propose a novel hierarchi-
cal framework for solving the overall resource allocation and power management
problem in cloud computing systems. The proposed hierarchical framework com-
prises a global tier for VM resource allocation to the servers and a local tier for
distributed power management of local servers. The emerging deep reinforcement
learning (DRL) technique, which can deal with complicated control problems
with large state space, is adopted to solve the global tier problem. Furthermore,
an autoencoder and a novel weight sharing structure are adopted to handle the
high-dimensional state space and accelerate the convergence speed. On the other
hand, the local tier of distributed server power management comprises an LSTM-based workload predictor and a model-free RL-based power manager, operating
in a distributed manner.

4 Reinforcement Learning Based Resource Allocation


Reinforcement learning [13] has been applied successfully across a range of
domains supporting the automated control and allocation of resources [2,8,20].
It operates on the basic premise of punishment and reward, with agents biased
towards actions which yield the greatest utility.
To the best of our knowledge, many research activities have focused
on the management of VMs, thus resolving the problem of allocating the
jobs/applications onto the VMs. In many cases, to cite one [1], solutions are
based on the analysis of the application code in order to obtain the specific jobs
to schedule and to gain more flexibility when it comes to resource management.

4.1 Problem Statement and Modelling


In contrast, we suppose that the applications that must be allocated onto the cloud nodes are composed of atomic modules, following a microservices architecture design [7]. These applications are considered black boxes, so we cannot intervene on the basic jobs, but we can allocate the specific modules to cloud nodes. Specifically, we address two use case scenarios that help place the defined model. Let us consider that at time $t_0$ the application is deployed stochastically on the infrastructure. Suppose that the application is composed of 4 modules/jobs, and at bootstrapping the infrastructure comprises 9 nodes.
– Consider the case where the infrastructure changes, and a new node is
inserted/removed. What is the best configuration?
– If a second application must be deployed, is there a way to adjust the config-
uration so that both of them have an optimal/sub-optimal configuration?
An allocation algorithm that takes into account all the possible configurations would have to explore

$$\binom{N}{M} = \frac{N!}{M!\,(N-M)!}$$

candidate placements. This means that with N = 9 and M = 4, we would have 126 possible configurations, which is an already large exploration space that grows combinatorially with the increasing number of modules and nodes.
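A quick check of how fast this space grows, using Python's standard math.comb (the node counts beyond the example above are ours, chosen only for illustration):

```python
from math import comb

# Ways to choose which M of the N nodes host the application modules,
# as in the example above (N = 9 nodes, M = 4 modules).
print(comb(9, 4))            # 126

# The exploration space grows quickly with the size of the infrastructure:
for n in (9, 20, 50):
    print(n, comb(n, 4))     # 126, 4845, 230300
```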
The model here proposed is based on a series of parameters that can be the
input to a reinforcement learning model, in order to evaluate the best configu-
ration for the modules. Specifically, we define a set of information that helps describe both the modules of the application and the infrastructure itself. The
application is described by:
– Number of modules
– For each module $M_i$

$M_i \triangleq (B_M, CC_M, \tau_M, t_{aM})$


where $B_M$ is the input data size, which denotes the size of the computation input data needed for computing, including program code and input parameters, $CC_M$ represents the CPU cycles required to accomplish the computation, $\tau_M$ denotes the maximum tolerable delay of the task, and $t_{aM}$ is the module arrival time. On the
other hand the infrastructure is described using the following parameters:
– Number of nodes
– For each node ni

$N_i \triangleq (S_N, f_N, Pt_N, r_N^U, r_N^d, St_N, P_N)$
where $S_N$ is the storage size of the node, $f_N$ is the number of CPU cycles per second, and $Pt_N$ is the transmission power of the node (this parameter is necessary when we must consider offloading the module to another node); $r_N^U$ is the uplink rate, while $r_N^d$ is the download rate. The node at a certain time t can be in a specific state $St_N(t) \in St = \{Start, Optimal, Underestimated, Overestimated, Overloaded, Idle\}$. We assume the power consumption of a server in sleep mode is zero, and the power consumption of a node at time t in a different state is a function of the CPU utilisation.
Now, if we consider $T_M^n$ as the execution delay, which is a function of $CC_M$ and $f_N$, and $E_M^n$ as the corresponding energy consumption of the module, we can formulate a cost function of the single module executing on the node:

$$C_M^N = w_T T_M^n + w_E E_M^n$$

where $w_T$ and $w_E$ represent the weights of the time and energy cost of the specific module. The same function can be defined for the node, but in this case the offloading cost function $C_N^O$ will also take into consideration the transmission delay and the energy consumed in case the module has to be offloaded to another node. This is to address the necessity of re-configuring the allocation strategy in case of infrastructure changes or if the arrival of new applications sets off the need for a new configuration:

$$C_N^O = w_T T_N + w_E E_N$$

where $T_N$ is a combination that takes into account the transmission delay and the execution time, while $E_N$ is the energy that takes into account both the execution and the energy consumed for the transmission. These two cost functions, $C_M^N$ and $C_N^O$, combined together represent the objective function of our learning problem, which needs to be optimised in order to obtain the optimal allocation for each module.
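The two cost terms can be computed directly from the module and node descriptors defined above. The sketch below is our own illustration (the field names and the exact form of the delay and energy terms are assumptions, since the paper does not fix them):

```python
from dataclasses import dataclass

@dataclass
class Module:
    B: float      # input data size (bits)
    CC: float     # CPU cycles required
    tau: float    # maximum tolerable delay (s)
    t_a: float    # arrival time (s)

@dataclass
class Node:
    S: float      # storage size
    f: float      # CPU cycles per second
    Pt: float     # transmission power (W)
    r_up: float   # uplink rate (bit/s)
    r_down: float # downlink rate (bit/s)
    P: float      # power drawn while computing (W)

def local_cost(m: Module, n: Node, w_T: float, w_E: float) -> float:
    """C_M^N = w_T * T_M + w_E * E_M for a module executed on node n."""
    T_M = m.CC / n.f                  # execution delay
    E_M = n.P * T_M                   # energy spent computing
    return w_T * T_M + w_E * E_M

def offloading_cost(m: Module, src: Node, dst: Node,
                    w_T: float, w_E: float) -> float:
    """C_N^O adds the transmission delay and energy of moving the module."""
    T_tx = m.B / src.r_up             # time to ship the module's input data
    T_N = T_tx + m.CC / dst.f         # transmission + remote execution delay
    E_N = src.Pt * T_tx + dst.P * (m.CC / dst.f)
    return w_T * T_N + w_E * E_N
```

The (negated) sum of these costs over all modules is what the reward function of the MDP defined below is based on.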
Previously we have shown an example of all the possible configurations when
it comes to the resource allocation for a very simple application. As executing an
algorithm that can evaluate and find the best solution would be extremely time
consuming, without the certainty of obtaining the best solution, we have opted for
formulating the resource allocation problem as a reinforcement learning problem.
As shown in Fig. 1, the general agent-environment interaction model consists of an agent, an environment, a finite state space S, a set of available actions A, and a reward function $R: S \times A \to \mathbb{R}$. The decision maker is called the agent, and should be trained as the interaction system runs. The agent needs to interact with the outside, which is called the environment. The interaction between the agent and the environment is a continual process. At each decision epoch k, the agent makes a decision $a_k$ based on its knowledge about the current state $s_k$ of the environment. Once the decision is made, the environment receives the decision, makes the corresponding changes, and the updated new state $s_{k+1}$ of the environment is presented to the agent for making future decisions. The environment also provides a reward $r_k$ to the agent depending on the decision $a_k$, and the agent tries to maximize some notion of the cumulative
rewards over time. This simple reward feedback mechanism is required for the
agent to learn its optimal behaviour and policy.

Fig. 1. Reinforcement learning approach

4.2 Markov Decision Processes

Reinforcement learning problems can generally be modelled using Markov Decision Processes (MDPs) [6]. In fact, reinforcement learning methods facilitate solutions to MDPs in the absence of a complete environmental model. This is particularly useful when dealing with real-world problems. An MDP can typically be represented as a tuple of four elements, consisting of states, actions, transition probabilities and rewards. In our specific case, the Markov Decision Process can be specified as follows:

– The possible states S are: Start, Optimal, Underestimated, Overestimated, Overloaded, Idle. These represent the possible states of a node. Of course, each state has an associated tuple of the above-mentioned parameters that helps describe and define that particular state.
– The possible actions A are: Move-to-node-i, Revert and Fail. The agent can choose to move a specific module to node i according to the current state of the whole environment, but it can also revert that action if it realises that the chosen node is Overloaded or Overestimated, and it can Fail if there are no available nodes.
– The transition probabilities form a matrix that defines, for each state, the probability of moving into another state s when performing an action a.
– The reward function is the last part of the process definition. At each step, the agent gets a reward R(s, a) in a certain state s after executing a possible action a. In general, the reward function should be related to the objective function. Consequently, since the objective of our optimisation problem is to obtain the minimal sum cost and the goal of RL is to obtain the maximum reward, the value of the reward should be negatively correlated with the size of the sum cost defined in the previous section; a minimal encoding of these elements is sketched below.
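The following is a deliberately abstract encoding of these MDP elements (a sketch of our own; names such as Action and reward are assumptions, not part of the paper):

```python
from enum import Enum, auto
from typing import Optional

class State(Enum):
    """Possible states of a node, as listed above."""
    START = auto()
    OPTIMAL = auto()
    UNDERESTIMATED = auto()
    OVERESTIMATED = auto()
    OVERLOADED = auto()
    IDLE = auto()

class Action:
    """Move a module to node i, revert the last move, or fail."""
    def __init__(self, kind: str, target_node: Optional[int] = None):
        assert kind in {"move", "revert", "fail"}
        self.kind = kind
        self.target_node = target_node

def reward(total_cost: float) -> float:
    """Reward negatively correlated with the sum cost of the chosen configuration."""
    return -total_cost
```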

4.3 Model-Free Learning


Having defined our Markov Decision Process, the next step is to choose the specific reinforcement learning algorithm to solve the above-defined allocation problem. However, a specific feature of the Markov Process is the Transition Probability Matrix, which tells us how likely it is to enter a specific state given the current state and action.
This transition probability has to be given explicitly in model-based algorithms, which work well for finite states and actions, but which become impractical as the state space and action space grow.
Model-free algorithms, on the other hand, rely on trial and error to update their knowledge. As a result, they do not require space to store all the combinations of states and actions.
Thus, in the absence of a complete environmental model, as is the case here, model-free reinforcement learning algorithms such as Q-learning [18] can be used
to generate optimal policies. Q-learning belongs to a collection of algorithms
called Temporal Difference (TD) methods. Since they do not require a complete model of the environment, TD methods possess a significant advantage: they are able to make predictions incrementally and in an online fashion. Q-learning can often require significant experience within a given envi-
ronment in order to learn the best actions to perform.
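For completeness, the tabular Q-learning update that would drive such an agent is sketched below. This is a generic illustration with an epsilon-greedy policy; the env interface (reset, step, actions) is an assumption and is not tied to the specific state/action encoding above.

```python
import random
from collections import defaultdict

def q_learning(env, episodes, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning with an epsilon-greedy policy.

    env is assumed to expose reset() -> state, step(action) -> (state, reward, done)
    and a finite list of discrete actions env.actions."""
    Q = defaultdict(float)                         # Q[(state, action)] estimates
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            if random.random() < epsilon:          # explore
                action = random.choice(env.actions)
            else:                                  # exploit the current estimate
                action = max(env.actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            best_next = 0.0 if done else max(Q[(next_state, a)] for a in env.actions)
            # temporal-difference update towards the bootstrapped target
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q
```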

5 Conclusions and Future Works


In this paper we have presented a model for addressing the problem of resource allocation in a cloud environment. The presented model, compared to other approaches in the current state of the art, treats applications not as a series of jobs that can be manipulated at code level, but as a series of black-box modules. The model also represents both these modules and the cloud nodes using sets of features that can capture many usage scenarios. We have combined all these features to formulate an objective function, and modelled a Markov Decision Process as the basis for applying model-free Reinforcement Learning algorithms. In future work this model will be evaluated in the field using state-of-the-art cloud usage traces, e.g. the Google cluster traces [12]. Moreover, we will show how, with suitable constraints and adjustments, this model can also work in a hybrid Cloud-Edge scenario.

References
1. Alam, M.G.R., Hassan, M.M., Uddin, M.Z., Almogren, A., Fortino, G.: Autonomic
computation offloading in mobile edge for iot applications. Future Gener. Comput.
Syst. 90, 149–157 (2019)
2. Chevaleyre, Y., Dunne, P.E., Endriss, U., Lang, J., Lemaitre, M., Maudet, N.,
Padget, J., Phelps, S., Rodriguez-Aguilar, J.A., Sousa, P.: Issues in multiagent
resource allocation. Informatica 30(1) (2006)

3. Dutreilh, X., Kirgizov, S., Melekhova, O., Malenfant, J., Rivierre, N., Truck, I.:
Using reinforcement learning for autonomic resource allocation in clouds: towards a
fully automated workflow. In: The Seventh International Conference on Autonomic
and Autonomous Systems, ICAS 2011, pp. 67–74 (2011)
4. Farahnakian, F., Liljeberg, P., Plosila, J.: Energy-efficient virtual machines consoli-
dation in cloud data centers using reinforcement learning. In: 2014 22nd Euromicro
International Conference on Parallel, Distributed, and Network-Based Processing,
pp. 500–507. IEEE (2014)
5. Hameed, A., Khoshkbarforoushha, A., Ranjan, R., Jayaraman, P.P., Kolodziej, J.,
Balaji, P., Zeadally, S., Malluhi, Q.M., Tziritas, N., Vishnu, A., et al.: A survey and
taxonomy on energy efficient resource allocation techniques for cloud computing
systems. Computing 98(7), 751–774 (2016)
6. Howard, R.A.: Dynamic Programming and Markov Processes (1960)
7. Lewis, J., Fowler, M.: Microservices: a definition of this new architectural term.
Mars (2014)
8. Liu, N., Li, Z., Xu, J., Xu, Z., Lin, S., Qiu, Q., Tang, J., Wang, Y.: A hierarchical
framework of cloud resource allocation and power management using deep rein-
forcement learning. In: 2017 IEEE 37th International Conference on Distributed
Computing Systems (ICDCS), pp. 372–382. IEEE (2017)
9. Mell, P., Grance, T., et al.: The NIST definition of cloud computing (2011)
10. Nathuji, R., Schwan, K.: VirtualPower: coordinated power management in virtu-
alized enterprise systems. In: ACM SIGOPS Operating Systems Review, vol. 41,
pp. 265–278. ACM (2007)
11. Raghavendra, R., Ranganathan, P., Talwar, V., Wang, Z., Zhu, X.: No power strug-
gles: coordinated multi-level power management for the data center. ACM SIGOPS
Oper. Syst. Rev. 42(2), 48–59 (2008)
12. Reiss, C., Wilkes, J., Hellerstein, J.L.: Google cluster-usage traces: format+
schema. Google Inc., White Paper, pp. 1–14 (2011)
13. Sutton, R.S., Barto, A.G., et al.: Introduction to Reinforcement Learning, vol. 2.
MIT Press, Cambridge (1998)
14. Verma, A., Ahuja, P., Neogi, A.: pMapper: power and migration cost aware
application placement in virtualized systems. In: Proceedings of the 9th
ACM/IFIP/USENIX International Conference on Middleware, pp. 243–264.
Springer, New York (2008)
15. Wang, L., Chen, D., Huang, F.: Virtual workflow system for distributed collabo-
rative scientific applications on grids. Comput. Electr. Eng. 37(3), 300–310 (2011)
16. Wang, L., Chen, D., Zhao, J., Tao, J.: Resource management of distributed virtual
machines. Int. J. Ad Hoc Ubiquit. Comput. 10(2), 96–111 (2012)
17. Wang, L., Von Laszewski, G., Chen, D., Tao, J., Kunze, M.: Provide virtual
machine information for grid computing. IEEE Trans. Syst. Man Cybern.-Part
A: Syst. Hum. 40(6), 1362–1374 (2010)
18. Watkins, C.J.C.H.: Learning from delayed rewards (1989)
19. Xu, C.Z., Rao, J., Bu, X.: URL: a unified reinforcement learning approach for
autonomic cloud management. J. Parallel Distrib. Comput. 72(2), 95–105 (2012)
20. Xu, Z., Wang, Y., Tang, J., Wang, J., Gursoy, M.C.: A deep reinforcement learning
based framework for power-efficient resource allocation in cloud RANs. In: 2017
IEEE International Conference on Communications (ICC), pp. 1–6. IEEE (2017)
The 6th International Workshop
on Distributed Embedded Systems
(DEM-2019)
DUST Initializr - CAD Drawing Platform
for Designing Modules and Applications
in the DUST Framework

Thomas Huybrechts(B) , Simon Vanneste, Reinout Eyckerman, Jens de Hoog,


Siegfried Mercelis, and Peter Hellinckx

IDLab - Faculty of Applied Engineering, University of Antwerp - imec,


Sint-Pietersvliet 7, 2000 Antwerp, Belgium
{thomas.huybrechts,simon.vanneste,reinout.eyckerman,jens.dehoog,
siegfried.mercelis,peter.hellinckx}@uantwerpen.be

Abstract. The DUST framework is a middleware platform developed to create software components that are distributable across heterogeneous networks. Developing and maintaining the component-based system and its configuration can become a challenge. In this paper, we present a web-based tool that allows engineers to visually model a DUST application graph in a CAD canvas. This graph is used by the back-end template generators to create the necessary configuration files and code for each module to implement, so that developers can focus on the business logic. The current version of the tool is able to fully generate the message objects and builder classes that are used for communication over DUST, allowing the engineers to manage and update these objects through the GUI instead of through error-prone coding.

1 Introduction
With the advances in internet and communication technology, we are able to
collect and process more data than ever before. It is expected that the number
of connected Internet-of-Things (IoT) devices alone will reach 25 billion by 2021
[9]. Additionally, new emergent developments in Cyber-Physical Systems (CPS) and simulation [1] allow us to create complex autonomous machines and vehicles
that can communicate with each other to provide emergent behaviour on a larger
scale. However, hard constraints (e.g. latency, energy usage, etc.) are imposed
on these systems. The DUST framework provides a platform to research and
develop these systems.
The development of a DUST-based application is straightforward. However,
the file-based configuration of such an application can quickly become unclear for
complex architectures. Also, the modular design requires us to implement lots of
standard mandatory interfaces and rules that result in repetitive, tedious coding
work. Therefore, we propose a tool which assists the engineers in creating modules in the DUST framework faster and more efficiently, and in keeping a better overview of the entire application in order to lower their workload.

In Sect. 2 of this paper, we discuss the DUST framework and the underly-
ing communication architecture that we target to configure automatically and
generate code for. Section 3 discusses the high-level architecture of a DUST appli-
cation and the work required to successfully deploy such a system. To conclude,
we present our DUST Initializr tool in Sects. 4 and 5.

2 Distributed Uniform STreaming Framework

The Distributed Uniform STreaming (DUST) framework is a middleware plat-


form developed to create software components that are distributable across het-
erogeneous networks [10]. These components or modules are able to run on dif-
ferent network nodes, ranging from the central cloud infrastructure up to net-
work edge nodes and devices. The goal of this framework is to cover the gap
between different data sources or ‘data producers’ and the systems and services
which require this data, that we call ‘data consumers’. With the rise of the
Internet-of-Things (IoT) and large-scale agent-based simulations [1], data plays
an important role in revolutionary (distributed) applications.
The DUST framework is developed by the IDLab research group as a plat-
form to test and build distributed computing applications with live migration
possibilities [10]. The migration functionality allows us to rearrange the distri-
bution of the application modules across the network. Therefore, the system is
able to adapt to its dynamic environment (i.e. changing network infrastructure,
loss of nodes, etc.) to obtain an optimal distribution according to a defined cost
function [5]. Consider, for example, an application that monitors air pollution
using compact mobile air quality sensors installed throughout a city. Each sensor
module has restricted resources available, e.g. limited battery capacity, compu-
tational processing, etc. It may seem feasible to send all data to the cloud for
processing. However, when the number of sensors grows, the amount of data
and thus the transmission cost over the network increases. By performing small
pre-computations on the devices themselves, the amount of traffic is significantly
reduced when the network becomes saturated. Nonetheless, this approach is
detrimental to the batteries of remote nodes. A better methodology is to dis-
tribute the load over the available resources with respect to the environmental
context (e.g. batch computation of closely located nodes on the edges of the net-
work - fog computing) and restrictions (e.g. remaining battery power).
In order to create and maintain efficient distributed applications, the DUST
framework is designed to manage and monitor the entire system, i.e. from the
network to the nodes and their links. A first requirement of the application is
the ability to divide the application logic into smaller components. We refer to
these application components as modules or blocks.
The streaming of data between modules is handled by the DUST-Core
library. To this end, the core supports a variety of message protocols through
add-ons to stream messages. The communication interface accepts byte streams
and makes the underlying stack transparent for the application layer. As a
result, integrating multiple communication stacks makes it possible to switch

them depending on the current context. For example, it can be feasible to swap
between sockets and shared memory when the linked modules are placed on the
same node to optimise the network and CPU usage. The data streaming layer
is built upon the publish-subscribe pattern [4]. Each link contains a topic with
a predefined message type and is extendable to have multiple publishers and/or
subscribers connected (i.e. many-to-many).
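To make this transparency concrete, the following minimal Java sketch shows how a byte-stream interface can hide the chosen communication stack from the application layer. It is a hypothetical illustration, not the actual DUST-Core API; all interface and class names are our own.

```java
import java.io.IOException;
import java.net.Socket;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical byte-stream transport abstraction (not the DUST-Core API).
interface Transport {
    void send(byte[] payload) throws IOException;
}

// Socket-based transport for modules running on different network nodes.
class SocketTransport implements Transport {
    private final Socket socket;

    SocketTransport(String host, int port) throws IOException {
        this.socket = new Socket(host, port);
    }

    @Override
    public void send(byte[] payload) throws IOException {
        socket.getOutputStream().write(payload);
        socket.getOutputStream().flush();
    }
}

// In-process queue standing in for shared memory when both modules run
// on the same node; it bypasses the network stack entirely.
class LocalQueueTransport implements Transport {
    private final BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(1024);

    @Override
    public void send(byte[] payload) {
        queue.offer(payload);
    }

    byte[] poll() {
        return queue.poll();
    }
}
```

Because the application layer only ever sees the Transport interface, the socket-based transport can be swapped for the in-process one when the linked modules end up on the same node.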

3 DUST Applications
A DUST application is a collection of modules that work together in a distributed
environment by communicating with each other over data links with predefined
messages. We differentiate two major development parts when creating a DUST
application. On the one hand, we have the basic components that contain the
business logic or algorithms of the application. These components are the DUST
modules. Each DUST module is built upon the DUST block interface of the
DUST-Core library. This library provides a basic interface to execute program
logic and send messages to other modules. The DUST-Core will handle all nec-
essary operations and configurations to transmit and receive messages, monitor
the state of the system and perform the migration and recovery of the nodes with
minimal effort of the module developer. By separating every ‘task’ or concern
into its own module, a collection of small modular services will be obtained that
are reusable in other applications when the right level of abstraction is applied.
On the other hand, we have the collection of DUST modules that are con-
nected and working together to achieve new emergent behaviour to solve domain-
specific problems. This higher architecture of interconnected DUST modules is
a DUST application. At this level, we need to describe the interactions and con-
nections between each module to have the desired behaviour. These interactions
are fixed in a DUST file. This configuration file contains the preferred commu-
nication stack, the data links and its link role (i.e. publisher or subscriber). The
DUST file contains static parameters which the DUST-Core uses to configure
itself. However, the DUST-Core is able to reload a new configuration. This fea-
ture allows us to dynamically adapt the system to optimise its configuration
to the state of the entire system (e.g. network load, battery capacity of mobile
nodes, etc.). The DUST framework uses an orchestrator node that monitors dif-
ferent state parameters of the system in order to determine the most optimal
module configuration across the network. Therefore, the orchestrator will update
the DUST file when modules are migrated to other network nodes. An additional
orchestration file is required that hints the orchestrator about the behaviour and
characteristics of each DUST module (e.g. Worst-Case Execution Time, energy
consumption, etc.). These characteristics are determined using behaviour and
resource analysis tools, such as the COBRA framework [3,8].
Creating a complex application in DUST can be an intricate task to per-
form. As previously discussed, the development can be divided between two major
teams, i.e. the module and the application developers. Nevertheless, each team
still has its challenges in designing its part of the system. The module devel-
opers have to implement the DUST-Core interfaces which can be a tedious and

repetitive task to perform. Missing a small detail in the implementation of the


interfaces could lead to endless debugging of the module, resulting in signif-
icant lost time. The application developers on the other hand must link and
distribute all modules. Creating and configuring a small DUST application with
a small number of nodes is still manageable in simple network layouts. However,
the complexity increases tremendously when the number of nodes and modules
increases. In the end, it becomes problematic to keep an overview of the intricate
network. Therefore, we need an easy-to-use solution to assist engineers in creat-
ing, deploying and maintaining DUST applications. The DUST Initializr tool is
developed to provide the support needed from bootstrapping a new application
to reconfiguring an existing one.

4 DUST Initializr

The development and configuration of a DUST application can quickly become a


tedious task when the project is in full development. The collection of connected
DUST modules turns into a complex network in which it is hard to distinguish
the individual links. The need for a clear overview and documentation of DUST
applications has resulted in the development of the DUST Initializr tool. In
order to assist with the start, deployment, and maintenance of a project, the
DUST Initializr tool has four main features:
Creating Modules. The tool provides a graphical user interface (GUI) to model
the interfaces of existing or new modules. Afterwards, the tool generates code
projects for each newly defined module. A more in-depth discussion is provided
in Sect. 4.1;
Configuration. With the computer-aided design (CAD) canvas in the tool, the
developer is able to create a graph of the interconnected DUST modules. The
modelled graph of the DUST application is used to generate the corresponding
DUST files to configure each module. More details on creating the graphs and
configuration files are available in Sect. 4.2;
Documentation. A DUST application is a collection of DUST modules each
with their own code project. This micro-service approach makes the modules
flexible and easy to reuse. Nevertheless, the codebase has no references to its
dependent/connected modules. These links are defined on application level in
the DUST file. However, the JSON based configuration file requires a large effort
to interpret and see the global picture of all modules and their links. The CAD
graph provides a visual overview of the DUST application that is easier and
faster to understand for humans. Therefore, this tool is a great way to document
the system architecture of a project;
Modules Database. By defining the interface of a DUST module in the tool,
a new code project will be generated based on a template. The developer is
responsible for implementing the desired business logic in the block. When modules
are designed in a modular way, we can improve the re-usability of these blocks in future

projects by grouping them in a database of modules. This collection can be used


to search for existing implemented modules. By importing a module into the
project, the interface of that module does not need to be re-entered into the
tool. When exporting the project, the functional implementation of the module
will be supplied instead of an empty template, resulting in shorter development
time.

4.1 Creating Modules


The DUST Initializr assists with the development and design of modules by
defining the interfaces and connections between the blocks. A connection between
two modules consists of a data link and a message object type used to format
the content on the link.
In the first step, we need to define these links and message objects. A data
link is a communication bus that transports messages of a single predefined type
from one or more senders to one or more receivers. A module can publish
(send) to or subscribe (receive from) a data link. Therefore, a data link
is labelled with a unique topic to distinguish one from another. In addition, a
message type needs to be assigned to the link, so that the modules are able to
communicate unambiguously with each other.
A DUST message object is a data object that contains a predefined hierarchi-
cal tree structure of other message objects down to core primitive data types at
its leaves, such as numeric (integer, long, etc.) and character values. An exam-
ple structure of a message object is shown in Fig. 1. By defining and fixating
the message interface during the planning phase, we provide a contract between
module designers on how to format the data to communicate with each other.
The message contract is then used by the project compiler to generate functional
class files for the engineer in the chosen programming language. Additionally, a
message factory is created that generates immutable message objects. The fac-
tory is responsible for initialising the message object with valid values for each field.
The generated message objects are then ready to be sent over the data links.
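As an illustration of what such generated code might look like, the sketch below shows a hypothetical immutable message class whose builder acts as the message factory, loosely inspired by the car message of Fig. 1. The field names and the byte-level format are assumptions and do not reflect the actual output of the template engine.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical generated message type; the fields are illustrative only.
public final class CarMessage {
    private final String licensePlate;
    private final double speedKmh;

    private CarMessage(String licensePlate, double speedKmh) {
        this.licensePlate = licensePlate;
        this.speedKmh = speedKmh;
    }

    public String getLicensePlate() { return licensePlate; }
    public double getSpeedKmh() { return speedKmh; }

    // Serialise the message into the byte stream accepted by the link.
    public byte[] toBytes() {
        byte[] plate = licensePlate.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(4 + plate.length + 8);
        buffer.putInt(plate.length).put(plate).putDouble(speedKmh);
        return buffer.array();
    }

    // Parse a received byte stream back into a message object.
    public static CarMessage fromBytes(byte[] data) {
        ByteBuffer buffer = ByteBuffer.wrap(data);
        byte[] plate = new byte[buffer.getInt()];
        buffer.get(plate);
        return new CarMessage(new String(plate, StandardCharsets.UTF_8),
                              buffer.getDouble());
    }

    // The builder acts as the factory: every field is validated before an
    // immutable message object is created.
    public static final class Builder {
        private String licensePlate;
        private double speedKmh;

        public Builder licensePlate(String value) { this.licensePlate = value; return this; }
        public Builder speedKmh(double value) { this.speedKmh = value; return this; }

        public CarMessage build() {
            if (licensePlate == null || licensePlate.isEmpty()) {
                throw new IllegalStateException("licensePlate must be set");
            }
            if (speedKmh < 0) {
                throw new IllegalStateException("speedKmh must be non-negative");
            }
            return new CarMessage(licensePlate, speedKmh);
        }
    }
}
```

A module would then create a message with, for instance, new CarMessage.Builder().licensePlate("1-ABC-123").speedKmh(50).build() before handing it to its publisher.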
The next step is to define the module interface with the DUST messages in
place. When creating a new module in the Initializr tool, we need to define three
key parameters: module name, data link inputs and outputs. The module name
is a descriptive name for the developers to quickly differentiate each module in
the system graph and in the module database. The data link inputs and out-
puts define respectively all incoming and outgoing data links for the module.
Each module input and output (I/O) has an assigned message type that it can
receive/send. These types are used by the project compiler to generate corre-
sponding typed publisher and subscriber classes to send/receive messages. The
resulting transceiver classes implement ready-to-use functionality to (de)serialise
a message object for easy transparent transmission. The publisher class exposes
a Publish interface that sends the given message object as a parameter to all sub-
scribers on the link. The underlying logic resolves the necessary steps to serialise
the object into a data stream which is then sent over the network. The subscriber
class implements an observable pattern that notifies all registered observers in

Fig. 1. Class structure of a car message object.

the module of the received message. With the Update interface, any code can
be executed to perform computations on the received message. Additionally, the
generated subscriber is responsible for parsing the received data stream to a
usable message object.
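A stripped-down, hypothetical sketch of such a typed transceiver pair is shown below; the real generated classes additionally handle (de)serialisation and the underlying network stack, which are omitted here to keep the example self-contained.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Observer registered on the subscriber side (the "Update" interface).
interface MessageObserver<T> {
    void update(T message);
}

// Simplified in-process stand-in for a data link with one topic.
final class DataLink<T> {
    private final List<MessageObserver<T>> observers = new CopyOnWriteArrayList<>();

    void register(MessageObserver<T> observer) {
        observers.add(observer);
    }

    void deliver(T message) {
        for (MessageObserver<T> observer : observers) {
            observer.update(message);   // notify every registered observer
        }
    }
}

// Generated publisher: exposes a single Publish operation that sends the
// given message object to all subscribers on the link.
final class Publisher<T> {
    private final DataLink<T> link;

    Publisher(DataLink<T> link) {
        this.link = link;
    }

    void publish(T message) {
        // In the real framework this step serialises the message to a
        // byte stream and hands it to the configured communication stack.
        link.deliver(message);
    }
}
```

A consuming module registers an observer on the link, e.g. link.register(msg -> process(msg)), while a producing module simply calls publisher.publish(msg).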
The project compiler is responsible for generating the necessary code files
for each newly defined DUST module. The generated code projects serve as a
starting point for the developers to implement the business logic of each module.
In order to create these project files, the Initializr tool uses default code tem-
plates for each supported programming language. Each code template engine is
an extension for the project compiler. Each template consists of static code with
annotated parts. These parts are used to dynamically inject types and names
into the code based on the provided project configuration in the tool, e.g. mod-
ule name, I/O, etc. A template generated module will consist of a main block
with a configuration callback and the main processing thread. In addition, the
publisher and subscriber classes with their message type are added to the code
base according to the defined module I/O.
After the implementation of the business logic into the generated code base,
the developer is able to archive the module into the DUST module collection
for future reuse. The database contains a list of blocks that are functional and
integrable directly into an application. This feature is extremely useful for generic
modules with a modular design which are easily employable into systems with
similar requirements. In order to add a block to the DUST module database,
we need to package the source code into a module archive. Such a package
contains the executable code and/or the source code of the module to deploy the
block. Furthermore, a manifest file is included to describe the I/O and features
of the module. The file will enable the Initializr tool to generate a graphical
representation of the module with immutable I/O settings inside the graph.
This allows the developer to configure the system without having to manually
remodel the module in the tool.

Fig. 2. Example graph of a DUST application with subscriber/publisher modules and


broker node.

4.2 Creating Applications

A DUST application is created in the Initializr tool by interconnecting the mod-


ule’s I/O with data links. The development flow of the tool is focused on con-
structing a visual graph to model the application. Therefore, the main window
consists of a CAD drawing area in which the engineer is able to convert the
concept into a graph. The CAD tool assists the user to add, modify and verify
the design of the graph.
Each module that is created in the tool or imported from an archive as
described in Sect. 4.1 is represented as a rectangular node with in- and outputs in
the graph. An example of an application graph is shown in Fig. 2. By drawing an
oriented link in the graph between two modules, we define a data link. A link end
with an arrow indicates an input of the module while an end without an arrow
is an output. These inputs and outputs will result in respectively subscriber and
publisher classes in the DUST modules. Each link can have multiple publishers
and subscribers connected. However, there should be at least one publisher and
one subscriber connected to a link. Furthermore, an additional broker node will
be created when defining a many-to-many link depending on the communication
stack used. This broker will serve as a proxy to forward messages across the data
link when the underlying chosen communication stack does not support many-
to-many messaging, such as ZeroMQ [7]. This is achieved by transforming a
many-to-many link into two separate connections with a broker module in the
middle, namely a many-to-one and one-to-many link.
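Purely as an illustration of this transformation, the sketch below shows what such a broker module could look like: all publishers push onto a single inbound queue (many-to-one) and the broker fans every message out to all registered subscribers (one-to-many). It forwards raw byte messages and makes no assumptions about the actual stack used by DUST.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical broker/proxy module for stacks without many-to-many support.
final class BrokerModule implements Runnable {
    private final BlockingQueue<byte[]> inbound = new LinkedBlockingQueue<>();
    private final List<Consumer<byte[]>> outbound = new CopyOnWriteArrayList<>();

    // Publishers deliver their messages here (many-to-one side).
    void accept(byte[] message) {
        inbound.offer(message);
    }

    // Subscribers register a callback here (one-to-many side).
    void addSubscriber(Consumer<byte[]> subscriber) {
        outbound.add(subscriber);
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                byte[] message = inbound.take();    // wait for the next message
                for (Consumer<byte[]> subscriber : outbound) {
                    subscriber.accept(message);     // forward it unchanged
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();     // shut down cleanly
        }
    }
}
```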
After modelling the application graph, the tool will verify the configuration
during export. For example, the fixed typing of the module’s I/O offers an extra
validation check. When two I/O ports share a data link with different message
types, an exception will be raised as the generated publisher/subscriber pair
is incompatible. If the model passes the validation tests, a project package is
presented to the developer that contains the generated module templates, any
available module archives and the necessary configuration files required to deploy
the DUST application.

The DUST Initializr tool can generate two types of configuration files depend-
ing on the chosen operation mode that is used for deployment and initialisation
of the system. The functional behaviour of a DUST application is determined
on two different levels. The configuration of a DUST module on the lowest level
is a static description defining the target host and its data link parameters. We
call this configuration a ‘DUST file’. This JSON formatted file is used by the
DUST library during start-up to configure its subscribers and publishers on the
data links, e.g. target IP address, link name, port number, etc. The configuration
of multiple modules may reside in one DUST file. Based on a unique identifier,
a module will find its configuration parameters. The configuration of a module
will remain static as long as the DUST file is not updated.
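The exact schema of a DUST file is not reproduced in this paper. Purely as a rough illustration, the per-module entries could be represented along the following lines; the field names are assumptions, and the real configuration is a JSON document rather than Java objects.

```java
import java.util.List;
import java.util.Map;

// Hypothetical, simplified view of the entries in a DUST file.
record LinkConfig(String topic, String role, String stack, String host, int port) {}

record ModuleConfig(String moduleId, List<LinkConfig> links) {}

final class DustFile {
    private final Map<String, ModuleConfig> modules;

    DustFile(Map<String, ModuleConfig> modules) {
        this.modules = modules;
    }

    // A module locates its own parameters via its unique identifier.
    ModuleConfig forModule(String moduleId) {
        ModuleConfig config = modules.get(moduleId);
        if (config == null) {
            throw new IllegalArgumentException("No configuration for " + moduleId);
        }
        return config;
    }
}
```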
The dynamic behaviour of the DUST framework comes into play when we
push updates to the configuration of the DUST file. This can be achieved by
manually editing the content of the file. Nonetheless, this approach is a tedious
task that is not scalable or responsive at all. The automation and intelligence
of the DUST framework is provided on a second layer by the orchestrator, as
discussed in Sect. 3. In addition to managing the migration of DUST blocks over
the network, the orchestrator uses the DUST file to push updated configura-
tions to all application nodes. By deploying an orchestrator, we no longer have
to worry about the low-level placement of the modules across the network hosts,
as it will determine and persist the most optimal distribution. Notwithstanding,
the orchestrator needs information about the network topology, as well as about the
actual DUST application and its code behaviour, in order to optimise the system per-
formance. Therefore, a list of Key Performance Indicators (KPIs) is composed
that describes the context of the system [5]. These KPIs are then translated into a
cost function which the orchestrator will try to optimise when distributing the
application modules. The topology, context and behaviour of the application are
summarised in an 'orchestrator file'. Most of this information is derived from the
graph model. Additional details, such as code behaviour models from code analysis,
can be entered into the tool as well [3,8].

5 Results

The Initializr tool is developed as a responsive web application that enables


us to deploy it as a software-as-a-service (SAAS) utility. By hosting the tool
in the cloud, it becomes easy to push patches to the DUST library and/or
template engines. This guarantees that every newly generated project runs
on the latest version of DUST to avoid any incompatibility issues among different
tool versions. Furthermore, a community or company-oriented modules database
allows developers to share new modules instantly with others. The back-end
system is built in Java using the Spring Framework [2]. It provides an API
for the template engines to generate module templates and configuration files.
The interactive front-end of the DUST Initializr is created with the ReactJS
framework [6], which provides a large JavaScript library for creating user interface components.
Figure 3 shows the front-end design of the tool.

Fig. 3. Front-end view of the Initializr tool with the CAD graph.

In order to validate the development time gain, we modelled a basic DUST


application with three Java DUST modules. The system consists of an interface
node that captures data from a REST interface, creates a DUST message object
and pushes it to the processing module. The processed data is then sent to a
storage module that stores the records in an external database.

Table 1. Number of manually written/generated code lines for an example Java DUST


application.

Total code lines Generated code lines Ratio gen./total lines(%)


Interface node 402 37 9.2
Processing node 92 34 36.96
Storage node 226 41 18.14
Message object with 4 fields 201 201 100

Table 1 shows an overview of the ratio between the total number of code lines
and the template generated lines for the test application. The code base of tem-
plate generated modules only contains the pre-filled interfaces that need to be
implemented by the developer. The domain specific logic (e.g. algorithms, data
processing, etc.) will therefore take the majority of the code lines depending on
the complexity. However, the manual labour of creating customised boilerplate
interfaces for a new module is minimised. A significant conclusion in Table 1 is
the full code coverage of the message object and its builder class by the tem-
plate engine. As a result, the Initializr tool is able to fully manage the message
objects. Any change to a message interface is feasible through the tool without
any manual intervention inside the code and thus, it significantly reduces the
development time and chance of errors.

6 Conclusion
Creating and adjusting an application with the DUST framework can become a
tedious task, as the configuration has to be updated while keeping an overview of the system.
The proposed DUST Initializr tool reduces the development load of the engi-
neers. The current version includes four main features to speed up the process:
generating boilerplate code for new modules, configuring and documenting DUST networks,
and storing/importing existing modules for reuse. A SAAS application has been
developed to easily distribute the latest DUST updates to all users. The devel-
opment methodology is focused on building a visual graph of the DUST network
in the CAD window. The code generators create the necessary configuration files
to deploy the application and code projects of each module to immediately start
implementing the essential business logic. In the results, we notice that message
objects and their builders for communication over DUST are fully generated by
the tool and thus, any update to these objects is manageable through the GUI
instead of through coding.

References
1. Bosmans, S., Mercelis, S., Hellinckx, P., Denil, J.: Towards evaluating emergent
behavior of the internet of things using large scale simulation techniques. In: Pro-
ceedings of the Theory of Modeling and Simulation Symposium, p. 4. Society for
Computer Simulation International (2018)
2. Cosmina, I.: Pivotal Certified Professional Spring Developer Exam. Springer, Hei-
delberg (2017)
3. De Bock, Y., Altmeyer, S., Huybrechts, T., Broeckhove, J., Hellinckx, P.: Task-
set generator for schedulability analysis using the TACLeBench benchmark suite.
ACM SIGBED Rev. 15, 22–28 (2018). https://doi.org/10.1145/3199610.3199613
4. Eugster, P., Felber, P., Guerraoui, R., Kermarrec, A.: The many faces of publish/subscribe. ACM Comput. Surv. (CSUR) 35(2), 114–131 (2003)
5. Eyckerman, R., Sharif, M., Mercelis, S., Hellinckx, P.: Context-aware distribution
in constrained IoT environments. In: Proceedings of the 13th International Confer-
ence on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC-2018), pp.
437–446 (2019). https://doi.org/10.1007/978-3-030-02607-3_40
6. Fedosejev, A.: React.js Essentials. Packt Publishing Ltd., Birmingham (2015)
7. Hintjens, P.: ZeroMQ: Messaging for Many Applications. O’Reilly Media, Inc.,
Newton (2013)
8. Huybrechts, T., De Bock, Y., Haoxuan, L., Hellinckx, P.: COBRA-HPA: a block
generating tool to perform hybrid program analysis. Int. J. Grid Util. Comput. 10,
105–118 (2019). https://doi.org/10.1504/IJGUC.2019.098211
9. Omale, G.: Gartner identifies Top 10 strategic IoT technologies and trends (2018)
10. Vanneste, S., de Hoog, J., Huybrechts, T., Bosmans, S., Eyckerman, R., Sharif, M.,
Mercelis, S., Hellinckx, P.: Distributed uniform streaming framework: an elastic fog
computing platform for event stream processing and platform transparency. Future
Internet 11, 158 (2019). https://doi.org/10.3390/fi11070158
Distributed Task Placement in the Fog:
A Positioning Paper

Reinout Eyckerman(B) , Siegfried Mercelis, Johann Marquez-Barja,


and Peter Hellinckx

IDLab - Faculty of Applied Engineering, University of Antwerp - imec,


Sint-Pietersvliet 7, 2000 Antwerp, Belgium
{reinout.eyckerman,siegfried.mercelis,johann.marquez-barja,
peter.hellinckx}@uantwerpen.be

Abstract. As the Internet of Things (IoT) paradigm becomes


omnipresent, so does fog computing, a paradigm aimed at bringing appli-
cations closer to the end devices, aiding in lowering stress over the net-
work and improving latency. However, to efficiently place application
tasks in the fog, task placement coordination is needed. In this paper,
task placement in the fog and corresponding problems are addressed. We
look at the fundamental issue of solving Multi-Objective Optimization
problems and treat different techniques for distributed coordination. We
review how this research can be used in a smart vehicle environment,
and finish with some preliminary test results.

1 Introduction

The Internet of Things (IoT) paradigm is gaining widespread acceptance. A key


concept pushing IoT forward is Industry 4.0. One use case within the industrial
domain is leveraging IoT to monitor equipment and processes, enabling pre-
dictive maintenance and preventing downtime costs. In the automotive sector,
another case of IoT usage is smart vehicles, using photodetectors, rain sensors,
cameras, and various other connected sensors and actuators. Vinitsky et al. [1]
found that only a 10% smart vehicle penetration rate is enough to reduce traffic
congestion by 25%, showing that even a few smart vehicles can already make a differ-
ence. To utilize their sensors better, smart vehicles can be connected together to
create Vehicle Ad-Hoc Networks (VANETs), allowing vehicles to communicate
their information to each other.
However, many sensors mean many devices, which means a massive amount
of data putting a serious strain on the network. Cisco estimates that by 2021, 850
ZB of data will be generated yearly, more than triple the amount of 2016 [2].
However, most of this data is short-lived: an estimated 90% of it will be used
for processing and will not be stored. This is especially true for IoT devices: the
sensor data mainly needs to be processed and acted upon. Fog computing can
play a large role in resolving load problems like this. It aims to enable pervasive
access to a shared set of computing resources for distributed and latency-aware
© Springer Nature Switzerland AG 2020
L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 671–683, 2020.
https://doi.org/10.1007/978-3-030-33509-0_63

applications, as defined by Iorga et al. [3]. The placement of the application


tasks will happen in the fog, an intermediate network layer shown in Fig. 1,
allowing for a reduced network load and latency. However, fog networks are often
highly dynamic. An example here are the previously mentioned VANETs, where
vehicles continuously (dis)connect due to differences in speed, or by passing by
roadside access points.

Fig. 1. Example of a fog network & an application graph.

Highly dynamic networks demand efficient fog placements, so as not to lose


application efficiency when nodes move/disappear. This requires a task place-
ment coordination technique. Applications must first be divided into separate
tasks, which can be distributed over the network. The technique keeps in mind
the available hardware and network resources along with the dynamic network
aspect, and distributes these tasks over the network. The potential overhead and
communication cost make it infeasible to distribute tasks in a centralized man-
ner. Thus, a distributed task placement approach is proposed, which adapts the
task placement to changes in context. This context consists of Key Performance
Indicators (KPI), which represent device-specific weights inside the network. The
KPI change depending on the significance of a hardware resource (e.g. energy
scarcity when running on battery power).
This paper entails possible approaches for efficient task placement across the
network. Such an approach will enable application latency reductions, will increase
application efficiency, and will also enable automated placement, so that placement is no
longer the concern of the application developer. The research presented in this
paper focuses on fog computing, and the distribution of tasks to enable this
computing paradigm.

2 State of the Art

The task placement problem spans several areas. Below we present a summary
of the most relevant research.

Hardware and Software Models: Xia et al. [4] modeled basic device metrics
such as CPU, RAM and disk capacity for a task placement technique on the fog.
Huybrechts et al. [5] looked into the application model, focusing on Worst-Case
Execution Time (WCET) analysis using a hybrid approach. To achieve their goal,
the COBRA and TACLeBench tool suites were used [6]. Using techniques such
as these, the application metrics were more efficiently calculated and modeled.
Constraints: Plenty of constraints are present when working with fog devices.
Xia et al. [4] considered the resource constraint where the device has a set of
available resources, whose usage cannot be exceeded. Similarly, Sharma et al.
[7] considered the software heterogeneity when placing tasks on Google compute
clusters, with for example tasks requiring a minimum Kernel version to properly
run. They then measured the performance impact of such constraints on the
task scheduler.
Distributed Software Frameworks: Distributed software development pro-
vides more challenges than monolithic software, such as message passing and
state maintenance. To aid with this challenge, Vanneste et al. [8] developed the
Distributed Uniform STreaming (DUST) framework. The framework provides
base building blocks for event streaming in a distributed fashion. This allows
users to build parts of applications on these blocks, creating a connected dis-
tributed application, which can, in turn, be distributed over the network.
Context: The devices on which the tasks run, employ a context. This concept
of context-awareness has been around for some time. One popular definition was
coined by Dey & Abowd, whose definition included user, application, location,
and device awareness [9]. Although all these contexts influence task placement,
only device awareness influences it directly. Location and user awareness can, for
example, have an influence on the application, which can, in turn, be simplified
into a change in the software requirements. If, for example, it gets dark outside,
regular cameras will not be of much use on smart vehicles and can thus be
disabled, lowering the amount of data the task has to move, thus changing the
task requirements.
Fog Challenges: Many challenges appear when attempting task placement on
fog networks, as presented by Wen et al. [10]. They described several challenges,
such as the IoT device heterogeneity, security, network latency, dynamicity, and
fault tolerance, which correspond with problems in the proposed research, such as
the device heterogeneity. Others have already attempted similar approaches, such
as Brogi and Forti, researching into placing tasks across fog devices regarding the
Quality of Service (QoS), and proving NP-hardness [11]. Guaranteeing the QoS
was interpreted as placing the tasks without overloading the physical data links
between devices and without overloading the machines themselves. Wang et al.
[12] proposed a placement algorithm based on Multi-Access Edge Computing
(MEC), a technique where they place micro-clouds closer to the edge. They
used Linear Programming methods for mapping tree application graphs onto
tree network graphs, and make simplifications to make the problem tractable.

Optimization Techniques: There are multiple techniques for solving the task
placement problem, one of which is by use of heuristics. Xia et al. [4] looked into
application placement with fog devices, using dedicated zones for application
deployment areas to ensure placed tasks’ locality. Their goal was to minimize
the weighted average latency when placing their tasks over their devices using sev-
eral solutions, such as exhaustive search, naive search, and improvements of
naive search. Wang et al. [12] proposed an MEC application placement algo-
rithm, a technique where they placed micro-clouds closer to the edge, which is
then compared to a greedy and the vineyard algorithm. The research used Linear
Programming methods for mapping tree application graphs onto tree network
graphs. In our previous research, we compared a brute force placement technique
to a Genetic Algorithm (GA) and a Hill-Climb (HC) technique [13]. Plenty of
research already exists, but it is noticeable that a distributed implementation is lack-
ing. Distributed approaches bring massive added complexity. One method of
solving the problem is by using the Contract Net Protocol, as defined by Smith
[14]. There, the agents distributed tasks one by one, using a simple bidding
mechanism. Another solution would be the A-Team, as defined by Talukdar et
al. [15], who defined a set of autonomous agents, called the A-Team or Asyn-
chronous Team, which work together to solve the same goal. This approach was
implemented by Barbucha and Je [16], to solve the Vehicle Routing Problem.
Barbucha also showed the efficiency of distributed agent optimization [17]. Here
they focus on placing teams of agents on a distributed network and showing
that an increase of agents improves efficiency up to a certain point, after which
efficiency starts decreasing again due to overhead.
Simulation: The research is to be tested using simulation. One versatile sim-
ulator is Simgrid, which has been in active development since 2001 [18]. This
simulator allows the simulation of a concurrent software stack on top of simu-
lated hardware resources, optionally using ns-3 for network simulation. Simgrid
allows modeling the application and testing its behavior, speeding up research.
Other simulators exist, most focusing on the specific behavior of IoT networks
such as IoTSim [19], which attempts to mimic the behavior of an existing appli-
cation to generate results about certain metrics and its behavior in the network.
This State of the Art covers all the fundamental parts required for developing
a task placement coordinator. We will define problems not stated in the state
of the art. Different kinds of placement techniques will need to be compared and
reviewed, combined and adapted where necessary, to finally end up with a stable
technique.

3 Problem Definition
The main objective of this paper is to show the current problems involved when
trying to distribute tasks across IoT networks using decentralized optimization
algorithms. There are several reasons for not using a centralized approach. A
centralized approach might be too slow for time-critical applications (optimiza-
tion or communication might take too long), in a highly dynamic environment

a single point of failure is quite dangerous, and furthermore, centralized mon-


itoring of global resource availability in real-time introduces plenty of overhead.
We define the metrics used for the application and network and describe which
shapes of application and network graphs are supported. After this, we define
our context and the Multi-Objective Optimization (MOO) problem. Finally, we
describe the features an efficient technique should have, and a use case.
Metrics: Metrics for both the software application and the network hardware
must be defined. The selected network metrics can be seen in Table 1, where
the metrics of the nodes and links in the network are defined. The application
metrics are defined in Table 2. It can be seen that the defined software metrics
can almost directly be mapped to hardware metrics, with the exception of the
worst case network load, which can be mapped to both the Network Interface
Card (NIC) transfer speed of the device and to the bandwidth of the link. Most
other metrics are usually network specific, such as wireless network interference.
One exception is the data storage cost. However, streaming applications are
assumed, where no data needs to be stored on the device. The network model
can be created by using the hardware characteristics. The application model can
be created using the COBRA Framework [6].

Table 1. Network metrics

Name    Description
N       Set of physical nodes in the network
L       Set of physical links in the network
cpu_n   Processing Speed of node n
mem_n   Memory Size of node n
nic_n   NIC Transfer Speed of node n
ec_n    Energy Consumption of node n
bw_l    Bandwidth (kbps) of link l
lat_l   Latency of link l

Table 2. Software metrics

Name     Description
A        Set of tasks in the application
C        Set of data links in the application
wcet_a   Worst-Case Execution Time of task a
wcmc_a   Worst-Case Memory Consumption of task a
wcec_a   Worst-Case Energy Consumption of task a
wcnl_c   Worst-Case Network Load of data link c
mlat_c   Maximum Allowed Latency of data link c

Network and Application Shape: Both the network shape and the applica-
tion shape will be modeled as graphs. Previous research often used tree graphs for
the network, and directed linear [13] or tree graph structures for the application
[12]. Constraining the graphs allows for constraint modeling into the optimiza-
tion techniques. Mesh-like networks are underrepresented due to them greatly
enlarging the search space but are necessary to exploit the full benefits of the
network, such as redundancy. Similarly, directed mesh application graphs are

required, in order to support any kind of application. An example of a network


and application graph is shown in Fig. 1. As observed, the bottom application
node represents a data source or sensor, the top node a data sink or actuator.
Context: There is a device-specific context assigned to each device. This context
is a set of KPI, each connected to their respective device parameter. These con-
texts allow the device to tune itself with regard to task placement. If, for example,
the device has a low battery, the energy KPI goes up, preventing energy-hungry
tasks from being assigned to the device.
Multi-objective Optimization: Determining the optimal task placement on
the network results in a Multi-Objective Optimization (MOO) problem. It is
an optimization problem where multiple, potentially conflicting, criteria are to
be optimized simultaneously using multiple objective functions, e.g. optimizing
toward energy efficiency and maximizing machine utilization. Solving this tends
to result in a set of optimal solutions, where no criteria can be improved without
degrading other criteria. This set of optimal solutions is called the Pareto front.
In this scenario, each optimization criterion (e.g. energy efficiency, latency) can
be modeled into an objective function, which results in a MOO problem (a formal
sketch is given after the list below). Marler and Arora [20] provide a survey of the
available techniques. There are three main techniques for working with MOO problems.
• A priori: preferences are modeled in advance in order to achieve a subset of
Pareto-optimal solutions.
• A posteriori: the entire Pareto-optimal set is determined, after which one
selects a solution based on preferences.
• Interactive method: multiple method iterations are done, aiding the user with
a selection of preferences.
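A brief formal sketch (our notation, following the general MOO literature rather than a formulation specific to this paper):

```latex
\min_{x \in X} \; \bigl(f_1(x), f_2(x), \dots, f_k(x)\bigr), \qquad
x^* \text{ is Pareto-optimal} \iff
\nexists\, x \in X : f_i(x) \le f_i(x^*) \;\forall i
\ \text{ and } \ f_j(x) < f_j(x^*) \text{ for some } j .
```

The set of all Pareto-optimal solutions forms the Pareto front mentioned above.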
The coordination technique should be able to work autonomously, placing appli-
cations without human interaction or feedback. In order to be able to work
autonomously, the coordination technique is required to know the preferences of
placements, such as valuing energy efficient placements over low latency place-
ments. This is application and network dependent, and it is thus the system admin-
istrator's task to determine these preferences. However, this requires that the coordi-
nator is able to find a single optimal solution; it cannot work with multiple equally
good placements, because it would not know which solution to follow. In this regard,
only the a priori method can be used, for it is the only technique that can provide a
single solution, because it models the MOO problem as a single-objective problem,
either by defining weights, preferences or other techniques. This approach does often
have problems in finding Pareto optima, depending on how well the objective function
is constructed. Two a priori
methods are defined below:
Weighted Sum: A simple and widely used approach is the weighted sum app-
roach, where the set of objectives is scalarized into a single objective function
by multiplying each objective function with a user-defined weight. Marler and
Arora [21] list several problems with this approach, one of them being the need for
objective function normalization. Normalization simplifies the weight-defining process,
since the weights no longer need to take into account the scale of each objective
function.
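As a sketch, the normalized weighted-sum scalarization of k objective functions can be written as follows; the normalization bounds f_j^min and f_j^max are assumed to be known or estimated, which is not always trivial in practice:

```latex
\min_{x \in X} \; F(x) = \sum_{j=1}^{k} w_j \,
    \frac{f_j(x) - f_j^{\min}}{f_j^{\max} - f_j^{\min}},
\qquad w_j \ge 0, \quad \sum_{j=1}^{k} w_j = 1 .
```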
Lexicographic Method: When using the lexicographic method, the objective
functions are sorted and solved in order of importance. Multiple techniques exist
to aid in prioritizing the objective functions, such as the one presented by Kaufman and
Michalski [22]. They present a technique where multiple parameters are defined
per objective function, which correspond to the results they present in the set
of Pareto solutions.

3.1 Optimization Technique Features


We will now define the most prominent features which should be kept in mind
during the research process. These features include distributivity,
speed, resource consumption, memory, scalability, and optimality. The features
have an impact on the technique’s efficiency for the application placement prob-
lem and can be used as measurements.
Distribution: It is important that the technique can run in a distributed manner.
Communication overhead should be small so as not to overload the network links.
It should run competitively on smaller devices so that they can contribute to finding
the solution. Stronger devices should not need to wait on solutions or skip the
smaller devices. It should be fault tolerant so that a solution can still be found
when a device misbehaves or fails.
Speed: Another feature to keep in mind is the speed of the used technique. Work-
ing with a highly dynamic network, both on the context and on the device level, solutions
should be found quickly. Otherwise, the distributed tasks might perform badly
or devices get overloaded with tasks.
Resource Consumption: The resource consumption is of grave importance.
If the technique consumes too much memory or processing power while running,
the device might not be able to process its other tasks productively. In a highly
dynamic network, this would result in devices being unreliable.
Scalability: This feature requires that the technique be scalable on a heterogeneous
network. A major issue here is that the Internet of Things (IoT) network is het-
erogeneous by nature, where devices and links have different resources available.
Due to energy depletion, link failure or any other kind of problem, the devices
can disconnect, resulting in a continuously changing network.
Solution Memory: Solution memory is another important aspect, for it can
speed up sequential task placements, and might keep new optimal placements
in the neighborhood of the existing placement, requiring little change in current
task placement. Although this could result in a potentially less optimal placement, it
would also result in fewer application problems since most tasks can be kept
where they are running.
Global/Local Optima: The final feature should test how well techniques which
go for local optima (Contract Net Protocol, Hill Climb) compare to those looking

for global optima (Genetic Algorithm (GA), Particle Swarm Optimization). It


should be kept in mind that if global optimization techniques are not feasible,
they should be discarded as soon as possible.

3.2 Techniques

There are multiple techniques that can be used, as stated in the state of the art.
Due to the design of the application and network graphs, there is a very large
search space with worst-case O(N^A) run time, meaning exact search techniques
are infeasible. Heuristics can bypass this problem. However, finding and com-
posing the best heuristic is a major part of this research, and greatly depends
on the choices made when selecting a MOO technique. The selected technique
should be inspected for features which can be implemented in the heuristics for
pre-optimization, improving search efficiency. Another selection to be made is
the multi-agent technique, such as A-Team, combining it with heuristics.

3.3 Simulation

The Simgrid simulator platform appears to be interesting for testing the


approaches. However, simulation is limited in its results. To test the distribution
techniques, a set of VMs could be modeled into the network to be simulated.
These VMs can model the resources available to the devices, and, for exam-
ple, the Linux netem command can simulate network links between the VMs
so as to represent the real connection links. Using this VM technique, the effect
of both the distribution technique and of the distributed application can be
observed. Improved realism and scalability can be achieved using a test-bed,
using, for example, the Fed4Fire test-bed [23]. Use of a test-bed, however, is
time-consuming, and should thus be well-prepared by simulation before actual
deployment.

3.4 Use Case

The use case is found in the automotive sector. The safety of vulnerable road
users can be improved by implementing cooperative detection of such road users
near a crossing, combining vehicle data with infrastructure sensors. The network
is shown in Fig. 2. It describes infrastructure cameras monitoring for vulnerable
road users, and a roadside unit connected with a vehicle, with vehicular proximity
sensors for additional detection and a graphical interface for the driver. In case
of any problems, alerts are sent to the driver’s display in the visualized network.
The application preprocesses the data before it reaches the cloud server, where
data checking is performed. Afterwards, the data is compressed and converted
before finally reaching the target device. The application graph is shown in Fig. 3.

4 Test
We provide a small test scenario with some preliminary results. First, we
describe our test scenario, after which we explain our achieved results. In this
scenario, we test centralized coordination techniques and compare them. This
differs from distributed techniques since a single node is responsible for the
placement. A simplified scenario is provided, which differs from future
research since, currently, placement calculation happens centrally.
The use case is the previously defined use case, shown in Fig. 2. The network is
a static network, purely for testing placement techniques. Four different contexts
are used, where the cloud, edge devices, actuators, and monitoring devices have a
different context. The green node represents the video monitor as a data source,
and the red nodes are the actuators as data sinks. The blue node is a task forced
to run at the cloud level, for data integrity checking. All the colored nodes are
locked nodes, and thus cannot be moved. The Key Performance Indicators
(KPI) are defined with a preference to keep the tasks as close to the node level
as possible. The objective function we use is the following, based on [13].


C = \sum_{i=0}^{\#\text{Components}} \; \sum_{j=0}^{\#\text{KPI}} w_{ij}\, C_{ij} \qquad (1)

Here we try to minimize the cost of placing each component i onto its device,
weighted per KPI j by the device-specific KPI weight. These costs are summed over
the entire network to obtain the global cost. Several techniques are used to solve
this problem, briefly listed below.
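Purely as an illustration of Eq. (1), the sketch below evaluates the cost of a single candidate placement; the data layout (per-device KPI weights, per-component KPI costs) is our own simplification of the model above.

```java
// Minimal sketch of evaluating the objective function of Eq. (1) for one
// candidate placement; the data layout is a simplification of the model.
final class PlacementCost {

    /**
     * @param placement placement[i] = index of the device hosting component i
     * @param weights   weights[device][kpi] = KPI weight w_ij of that device
     * @param costs     costs[component][kpi] = KPI cost C_ij of the component
     * @return the global cost summed over all components and KPIs
     */
    static double evaluate(int[] placement, double[][] weights, double[][] costs) {
        double total = 0.0;
        for (int component = 0; component < placement.length; component++) {
            int device = placement[component];
            for (int kpi = 0; kpi < costs[component].length; kpi++) {
                total += weights[device][kpi] * costs[component][kpi];
            }
        }
        return total;
    }
}
```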
Branch and Bound: A branch & bound with fail-early constraint checking is
used to provide a baseline. A constraint is violated when, for example, a device is
allocated more tasks than its resources can handle. This algorithm is guaranteed to
return the best placement by iteratively checking all possible placements. However,
since it has to check all possible placements, it has a worst-case run time of O(N^A),
making it infeasible to use this technique in practice.
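The sketch below mirrors this description (exhaustive enumeration with fail-early constraint checks); the constraint and cost functions are left abstract, and the code is illustrative rather than the implementation used in the experiments.

```java
import java.util.function.BiPredicate;
import java.util.function.ToDoubleFunction;

// Illustrative branch & bound: enumerate placements, pruning a branch as
// soon as the partial placement violates a constraint.
final class BranchAndBound {
    private final int deviceCount;
    private final BiPredicate<int[], Integer> feasible; // checks the first n assigned tasks
    private final ToDoubleFunction<int[]> cost;
    private double bestCost = Double.POSITIVE_INFINITY;
    private int[] bestPlacement;

    BranchAndBound(int deviceCount, BiPredicate<int[], Integer> feasible,
                   ToDoubleFunction<int[]> cost) {
        this.deviceCount = deviceCount;
        this.feasible = feasible;
        this.cost = cost;
    }

    int[] solve(int taskCount) {
        search(new int[taskCount], 0);
        return bestPlacement;
    }

    private void search(int[] placement, int task) {
        if (task == placement.length) {              // complete placement
            double c = cost.applyAsDouble(placement);
            if (c < bestCost) {
                bestCost = c;
                bestPlacement = placement.clone();
            }
            return;
        }
        for (int device = 0; device < deviceCount; device++) {
            placement[task] = device;
            // Fail early: only descend if the partial placement of the
            // first (task + 1) tasks respects all constraints.
            if (feasible.test(placement, task + 1)) {
                search(placement, task + 1);
            }
        }
    }
}
```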
Hill Climb: The Hill-Climb (HC) heuristic is used as a local optimization algo-
rithm. It is based on a multiple restart steepest descent HC, implemented with

Fig. 2. Use case network graph
Fig. 3. Use case application graph

10 restarts. It attempts to improve total cost by greedily moving tasks to their


neighboring device where possible.
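A minimal sketch of such a multiple-restart steepest-descent hill climb over placements is given below; the cost function is abstracted away and the code is illustrative, not the implementation used in these tests.

```java
import java.util.Random;
import java.util.function.ToDoubleFunction;

// Illustrative multi-restart steepest-descent hill climb; placement[i] is
// the device index that hosts task i.
final class HillClimb {
    private final Random random = new Random();

    int[] run(int taskCount, int deviceCount, int restarts, ToDoubleFunction<int[]> cost) {
        int[] best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int r = 0; r < restarts; r++) {
            int[] current = randomPlacement(taskCount, deviceCount);
            double currentCost = cost.applyAsDouble(current);
            while (true) {
                int bestTask = -1, bestDevice = -1;
                double bestNeighbour = currentCost;
                // Steepest descent: evaluate every single-task move and
                // remember the one with the largest cost reduction.
                for (int task = 0; task < taskCount; task++) {
                    int original = current[task];
                    for (int device = 0; device < deviceCount; device++) {
                        if (device == original) continue;
                        current[task] = device;
                        double c = cost.applyAsDouble(current);
                        if (c < bestNeighbour) {
                            bestNeighbour = c;
                            bestTask = task;
                            bestDevice = device;
                        }
                    }
                    current[task] = original;        // restore before next task
                }
                if (bestTask < 0) break;             // local optimum reached
                current[bestTask] = bestDevice;
                currentCost = bestNeighbour;
            }
            if (currentCost < bestCost) {
                bestCost = currentCost;
                best = current.clone();
            }
        }
        return best;
    }

    private int[] randomPlacement(int taskCount, int deviceCount) {
        int[] placement = new int[taskCount];
        for (int i = 0; i < taskCount; i++) {
            placement[i] = random.nextInt(deviceCount);
        }
        return placement;
    }
}
```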
Genetic Algorithm: A GA is tested as well. We define a chromosome as a
complete task placement. Ranking is based on the weight and is scaled according
to the formula 1/√n, where n is the normalized position in the ranking. The best
children according to this ranking are selected and automatically pass to the next
generation. The single-point crossover function splits the placed task set in two
and shares the first part. Parents for this function are selected using roulette-
wheel selection. The mutator function changes a randomly selected task to a
randomly selected device.
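The simplified sketch below follows this description (rank-based fitness scaled by 1/√n, elitism, roulette-wheel parent selection, single-point crossover and random-reassignment mutation); population size, mutation rate and other parameters are assumptions, not the exact settings used in the experiments.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;
import java.util.function.ToDoubleFunction;

// Illustrative GA: a chromosome is a complete placement (task -> device).
final class GeneticPlacement {
    private final Random random = new Random();

    int[] run(int taskCount, int deviceCount, int populationSize, int generations,
              int elites, double mutationRate, ToDoubleFunction<int[]> cost) {
        int[][] population = new int[populationSize][];
        for (int i = 0; i < populationSize; i++) {
            population[i] = randomPlacement(taskCount, deviceCount);
        }
        for (int g = 0; g < generations; g++) {
            // Sort by cost (costs are recomputed for clarity; caching them
            // would be more efficient) and derive rank-based fitness.
            Arrays.sort(population, Comparator.comparingDouble(cost::applyAsDouble));
            double[] fitness = new double[populationSize];
            for (int i = 0; i < populationSize; i++) {
                double n = (i + 1) / (double) populationSize;  // normalized rank
                fitness[i] = 1.0 / Math.sqrt(n);
            }
            int[][] next = new int[populationSize][];
            for (int i = 0; i < elites; i++) {
                next[i] = population[i].clone();               // elitism
            }
            for (int i = elites; i < populationSize; i++) {
                int[] a = population[roulette(fitness)];
                int[] b = population[roulette(fitness)];
                int[] child = crossover(a, b);
                if (random.nextDouble() < mutationRate) {
                    child[random.nextInt(taskCount)] = random.nextInt(deviceCount);
                }
                next[i] = child;
            }
            population = next;
        }
        Arrays.sort(population, Comparator.comparingDouble(cost::applyAsDouble));
        return population[0];
    }

    // Roulette-wheel selection proportional to rank-based fitness.
    private int roulette(double[] fitness) {
        double pick = random.nextDouble() * Arrays.stream(fitness).sum();
        for (int i = 0; i < fitness.length; i++) {
            pick -= fitness[i];
            if (pick <= 0) return i;
        }
        return fitness.length - 1;
    }

    // Single-point crossover: take the first part from one parent and the
    // remainder from the other.
    private int[] crossover(int[] a, int[] b) {
        if (a.length < 2) return a.clone();
        int point = 1 + random.nextInt(a.length - 1);
        int[] child = a.clone();
        System.arraycopy(b, point, child, point, b.length - point);
        return child;
    }

    private int[] randomPlacement(int taskCount, int deviceCount) {
        int[] placement = new int[taskCount];
        for (int i = 0; i < taskCount; i++) {
            placement[i] = random.nextInt(deviceCount);
        }
        return placement;
    }
}
```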

Fig. 4. Algorithm running time with twice as many free tasks as nodes
Fig. 5. Solution improvement of GA compared to HC

Fig. 6. Comparison of calculated weight on use case

4.1 Test Setup and Results


The used hardware is a server with 2x Intel Xeon E5-2420 v2 CPU and 32 GB
RDIMM memory. Next to the use case, random networks were developed with

an increasing number of nodes. For each of these networks, we generated two


applications, which always included all sensors and actuators as either start
or end nodes, and forced at least one task at the cloud. The first of these two
applications has the same number of free tasks as there are nodes in the network,
and the second one has double this amount. Each algorithm was run ten times
and then averaged.
The first application showed that the run-times for the HC and the GA were com-
parable, and thus are not shown for brevity. Figure 4 shows that the HC
runtime increases considerably faster with the network size, compared to the GA,
which seems to be following a logarithmic trend. This is to be expected since the
operations of the GA are more costly but are more consistent than those of the
HC, which only moves a single task each iteration.
The GA is able to search for an optimum utilizing the complete set of tasks
and the complete network at the same time, allowing it to handle much larger
search spaces in much less time. However, the solution quality is another
important metric, where Fig. 5 shows the solution improvement in % of the GA
compared to the HC. This improvement on small application graphs is due to
the randomization of the GA, giving it more room to play than the HC. This
improvement decreases the more tasks there are, allowing the HC to continuously
improve. A potential improvement is the Simulated Annealing algorithm, which allows
the HC to exit local optima, requiring fewer restarts and improving both the runtime
and the found optima.
In the search space with the fewest tasks, the GA performs consid-
erably better, but the unstable lines show both algorithms have trouble finding
their solutions in the larger search space, being greatly influenced by the ran-
domness. Finally, looking at the use case scenario in Fig. 6, we see that on a
small scale HC manages to find the optimum, while the GA hovers above it.
This is due to the HC taking individual steps, allowing the last task to shift into
position, whereas the GA uses its population-based technique to get as close as
possible. Due to the small network, the HC performs considerably better.

5 Conclusion and Future Work

Fog computing will help in relieving network stress, but this needs a coordinator
function to organize the fog processes. This paper provides an approach and ends
with a test scenario. From this scenario it can be concluded that the HC tends
to do better in regard to solution quality, but the GA provides faster and more
consistent results. Additionally, this paper lists some of the most important
literature in this field and defines some problems that are yet to be solved.
A briefly touched subject in this paper is the problem of network monitoring,
including network discovery and link monitoring. This is, however, a significant
challenge when it comes to dynamic networks since it is infeasible to keep the
network status up to date at all times. Another major problem is the current
lack of security: if an attacker manages to get the coordinator to distribute his
software, he gets an arsenal of resources at his disposal, or he can make the

placement extremely inefficient by changing the weights. Load balancing should


be implemented, with a detection mechanism that is able to discover when a
task is no longer able to process the stream due to too much information and
can inform the coordinator about this so that an extra task can be deployed.
Another interesting addition to add to this work is to add software constraints,
as defined in [7].

References
1. Vinitsky, E., Parvate, K., Kreidieh, A., Wu, C., Bayen, A.: Lagrangian Control
through Deep-RL: applications to bottleneck decongestion. In: IEEE Intelligent
Transportation Systems Conference, pp. 759–765 (2018)
2. Cisco: Cisco Global Cloud Index: Forecast and Methodology 2014-2019 (white
paper). Cisco, pp. 2016–2021 (2016)
3. Iorga, M., et al.: Fog computing conceptual model. NIST, Gaithersburg, MD, Tech-
nical report, March 2018
4. Xia, Y., et al.: Combining heuristics to optimize and scale the placement of IoT
applications in the fog. In: 2018 IEEE/ACM 11th International Conference on
Utility and Cloud Computing (UCC), pp. 153–163 (2018)
5. Huybrechts, T., Mercelis, S., Hellinckx, P.: A new hybrid approach on WCET
analysis for real-time systems using machine learning, no. 5, pp. 1–5 (2018)
6. IDLab, Imec, and University of Antwerp: COBRA Framework. http://cobra.idlab.
uantwerpen.be/
7. Sharma, B., Chudnovsky, V., Hellerstein, J.L., Rifaat, R., Das, C.R.: Modeling and
synthesizing task placement constraints in Google compute clusters. In: Proceed-
ings of the 2nd ACM Symposium on Cloud Computing - SOCC 2011, pp. 1–14.
ACM Press (2011)
8. Vanneste, S., et al.: Distributed uniform streaming framework: towards an elastic
fog computing platform for event stream processing. In: Proceedings of the 13th
International Conference on P2P, Parallel, Grid, Cloud and Internet Computing,
pp. 426–436 (2019)
9. Abowd, G.D., et al.: Towards a better understanding of context and context-
awareness. In: Lecture Notes in Computer Science, vol. 1707, pp. 304–307 (1999)
10. Wen, Z., et al.: Fog orchestration for internet of things services. IEEE Internet
Comput. 21(2), 16–24 (2017)
11. Brogi, A., Forti, S.: QoS-aware deployment of IoT applications through the fog.
IEEE Internet Things J. 4(5), 1–8 (2017)
12. Wang, S., Zafer, M., Leung, K.K.: Online placement of multi-component applica-
tions in edge computing environments. IEEE Access 5, 2514–2533 (2017)
13. Eyckerman, R., Sharif, M., Mercelis, S., Hellinckx, P.: Context-aware distribution
in constrained IoT environments, pp. 437–446 (2019)
14. Smith, R.G.: Communication and control in a distributed problem solver. IEEE
Trans. Comput. C(12), 1104–1113 (1980)
15. Talukdar, S., Baerentzen, L., Gove, A., De Souza, P.: Asynchronous teams: coop-
eration schemes for autonomous agents. J. Heurist. 4(4), 295–321 (1998)
16. Barbucha, D., Je, P.: An agent-based approach to vehicle routing problem. Int. J.
Appl. Math. Comput. Sci. 4(1), 36–41 (2007)
17. Barbucha, D.: Agent-based optimization. In: Agent-Based Optimization, vol. 456,
pp. 55–75 (2013)
18. Casanova, H., Giersch, A., Legrand, A., Quinson, M., Suter, F.: Versatile, scal-
able, and accurate simulation of distributed applications and platforms. J. Parallel
Distrib. Comput. 74(10), 2899–2917 (2014)
19. Zeng, X., et al.: IOTSim: a simulator for analysing IoT applications. J. Syst. Archi-
tect. 72, 93–107 (2017)
20. Marler, R., Arora, J.: Survey of multi-objective optimization methods for engineer-
ing. Struct. Multidisc. Optim. 26(6), 369–395 (2004)
21. Marler, R.T., Arora, J.S.: The weighted sum method for multi-objective optimiza-
tion: new insights. Struct. Multidisc. Optim. 41, 853–862 (2010)
22. Kaufman, K.A., Michalski, R.S.: Learning from inconsistent and noisy data: the
AQ18 approach. In: Symposium A Quarterly Journal In Modern Foreign Litera-
tures, pp. 411–419 (1999)
23. Gavras, A., Karila, A., Fdida, S., May, M., Potts, M.: Future internet research and
experimentation: the FIRE initiative. Sigcomm 37(3), 89–92 (2007)
Using Neural Architecture Search to
Optimize Neural Networks for Embedded
Devices

Thomas Cassimon(B) , Simon Vanneste, Stig Bosmans, Siegfried Mercelis,


and Peter Hellinckx

IDLab - Faculty of Applied Engineering, University of Antwerp - imec,


Sint-Pietersvliet 7, 2000 Antwerp, Belgium
{thomas.cassimon,simon.vanneste,stig.bosmans,
siegfried.mercelis,peter.hellinckx}@uantwerpen.be

Abstract. Recent advances in the field of Neural Architecture Search


(NAS) have made it possible to develop state-of-the-art deep learning
systems without requiring extensive human expertise and hyperparam-
eter tuning. In most previous research, little concern was given to the
resources required to run the generated systems. In this paper, we present
an improvement on a recent NAS method, Efficient Neural Architecture
Search (ENAS). We adapt ENAS to not only take into account the net-
work’s performance, but also various constraints that would allow these
networks to be ported to embedded devices. Our results show ENAS’
ability to comply with these added constraints. In order to show the effi-
cacy of our system, we demonstrate it by designing a Recurrent Neural
Network (RNN) that predicts words as they are spoken, and meets the
constraints set out for operation on an embedded device.

1 Introduction
In recent years, developing state-of-the-art neural networks has become a chal-
lenge, due to the vast complexity of these systems. Developing neural networks
usually requires a substantial amount of experimentation and hyperparameter
tuning, as well as domain knowledge and expertise in designing neural networks.
This is a lengthy and tedious process due to the sheer size of the hyperparam-
eter and architectural space. In order to solve this problem, the idea of Neural
Architecture Search (NAS) [1] was introduced. In Neural Architecture Search, a
controller is trained using a reinforcement learning algorithm, REINFORCE [2]
in our case. The controller first generates a neural network. This network is then
trained and evaluated based on its performance. The controller then uses the net-
works’ performance to learn and generate better networks. This search process
still takes a long time to converge, however. In order to remedy this, Pham et al.
tested the idea of weight sharing [3]. Instead of training a network from scratch
every time, weights are shared between all architectures, allowing architectures

© Springer Nature Switzerland AG 2020


L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 684–693, 2020.
https://doi.org/10.1007/978-3-030-33509-0_64
to converge faster. While this was a major improvement on NAS, there are still
some unaddressed issues. Something that has been overlooked in most research is
the resource requirements of a NAS system. Most papers just focus on optimizing
the generated networks’ performance [1,3]. The resource requirements of neural
networks are set to become equally important however, given the prevalence of
mobile and edge devices in modern IoT networks. Some research has gone into
improving networks' resource consumption, showing promising results [4].
In this paper we will attempt to address some of these resource constraints
by introducing hard and soft constraints on our architectures. Our research com-
bines the short search times achieved by Pham et al. [3] with a multi-objective
approach, allowing us to quickly find cells that also meet a given set of con-
straints. We first discuss the current state-of-the-art in NAS technology (Sect. 2),
then we give an overview of the techniques we employed to improve on the current
state-of-the-art (Sect. 3), we explain our experiments (Sect. 4) and also present
our results (Sect. 5). Finally we offer some insights into the behaviour of our
current system and determine a possible direction for future research (Sect. 6).

2 State-of-the-art
The state-of-the-art in NAS can be split into three categories: Reinforcement
Learning (RL) based approaches [1,3,5], Evolutionary approaches [4,6] and
Bayesian approaches [7]. We chose to use a RL-based approach because of the
quick search times that recent RL-based NAS methods can achieve.
In their initial paper, Zoph and Le [1] described a way to use a reinforcement
learning controller to generate Recurrent Neural Network (RNN) cells and Con-
volutional Neural Networks (CNNs). Both of these systems performed similarly
to state-of-the-art, human-designed systems, while requiring little to no human
interaction. Later, further improvements were made on this work by introducing
weight sharing [3]. Weight sharing involves forcing all models to share a single
set of trainable parameters. Using the idea of weight sharing in ENAS, Pham et
al. were able to significantly reduce the amount of computational power required
to traverse the search space. Results show that ENAS is capable of finding com-
petitive cells in less than a day on a single GPU, compared to 32 400–43 200
GPU hours for earlier NAS algorithms [5].
Recently, new research has shown that NAS isn’t limited to optimizing for
a single objective. In their paper Elsken et al. [4] used Lamarckian evolution
to find architectures that are not only comparable to state-of-the-art systems
in terms of performance, but also offer a significant reduction in the amount of
parameters the models require.

3 Methods
For our research we intend to improve on ENAS by introducing extra constraints
to the reward function of the controller. We will design these constraints to max-
imize the generated networks’ performance, while keeping resource requirements
low. In order to make our system sufficiently flexible, we decided to split our
constraints into two categories. The first is a set of hard constraints, and the
second is a set of soft constraints. If the system fails to meet one of the hard
constraints, its reward will be zero. If the hard constraints are met, the system
will be rewarded with the sum of its soft constraints.
In order to implement these constraints, we use Eq. 1 to determine the reward,
R given to our controller during its training.
R(C_h, \mu, C_s, p) = \Big( \prod_{j=0}^{k-1} C_{h,j} \Big) \cdot \sum_{i=0}^{n-1} \mu_i \cdot C_{s,i}    (1)

In this equation, we have k hard constraints (Ch,0 through Ch,k−1 ) and n soft
constraints (Cs,0 through Cs,n−1 ). Each soft-constraint is weighted using a pre-
set weight (μi for the i-th constraint).
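As a concrete illustration, a minimal Python sketch of this reward computation is given below; the function name and argument structure are our own and not taken from the paper.

def reward(hard, soft, weights):
    # hard: k hard-constraint indicators (0 or 1); soft: n soft-constraint values;
    # weights: the preset weights mu_i. Any violated hard constraint zeroes the reward.
    gate = 1
    for c in hard:
        gate *= c
    return gate * sum(mu * s for mu, s in zip(weights, soft))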

3.1 Hard Constraints


The first hard constraint we used is the amount of memory the model uses.
Determining the memory usage requires us to traverse the cell’s graph, and
determine the size of the block’s inputs based on the size of the outputs leading
into this block. We consider this constraint violated if the networks’ memory
usage exceeds a predetermined maximum. When setting the maximum, a certain
amount of extra memory should be reserved for things like encoders, decoders
and other miscellaneous purposes. Since this amount of memory does not depend
on our architecture, we decided not to take it into account in our research. More
formally:

C_{h,\mathrm{memory}} = \begin{cases} 0 & \text{if } S_{\mathrm{model}} + \varepsilon \ge S_{\mathrm{device}} \\ 1 & \text{if } S_{\mathrm{model}} + \varepsilon < S_{\mathrm{device}} \end{cases}    (2)
In Eq. 2 Smodel is the size of our model, Sdevice is the amount of available
memory on the target device, and ε is a small amount of memory to be used for
other purposes beside our cell.
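A minimal sketch of how such a memory check could look is shown below; passing per-block parameter counts and assuming a 4-byte parameter size follows the description in this paper, but the code itself is ours, not the authors' implementation.

BYTES_PER_PARAMETER = 4  # assumed parameter size, as stated in Sect. 4.2

def memory_constraint_met(block_param_counts, device_bytes, epsilon_bytes):
    # Traverse the cell and accumulate its estimated size in bytes.
    model_bytes = sum(block_param_counts) * BYTES_PER_PARAMETER
    return 1 if model_bytes + epsilon_bytes < device_bytes else 0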
The second hard constraint we introduced is inference latency. Inference
latency is measured using a cost model that contains the approximate latency for
every available activation function and three basic operations (addition, multipli-
cation, division). By accumulating the computational costs of every node in our
cell, we can obtain an estimate of the amount of latency induced when making a
single pass through our cell. If this latency exceeds a pre-determined maximum,
we consider the constraint to be violated. This constraint can be formalized as:

C_{h,\mathrm{complexity}} = \begin{cases} 0 & \text{if } \mathrm{Latency}_{\mathrm{model}} + \varepsilon \ge \mathrm{Latency}_{\mathrm{max}} \\ 1 & \text{if } \mathrm{Latency}_{\mathrm{model}} + \varepsilon < \mathrm{Latency}_{\mathrm{max}} \end{cases}    (3)

Just as in the previous case, a small ε is added to our latency to make sure
the system has some spare time for computations that aren’t part of our model.
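A sketch of this cost-model accumulation is shown below; the latency numbers in the cost model are placeholders rather than measured values.

# Placeholder cost model (milliseconds per operation); real values would be measured.
COST_MODEL_MS = {"add": 0.001, "mul": 0.001, "div": 0.002,
                 "identity": 0.001, "relu": 0.002, "sigmoid": 0.040, "tanh": 0.040}

def latency_constraint_met(cell_operations, latency_max_ms, epsilon_ms):
    # Accumulate the approximate cost of every node in the cell.
    estimated_ms = sum(COST_MODEL_MS[op] for op in cell_operations)
    return 1 if estimated_ms + epsilon_ms < latency_max_ms else 0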
3.2 Soft Constraints

We first consider a cache constraint. This constraint was introduced because


modern processors spend a lot of time waiting for data to be fetched from mem-
ory, thus a program that is able to place more of its data in the processor’s cache
will usually be faster. Another reason why caches are important in embedded
systems is energy use. According to Horowitz [8], the cost of an off-chip DRAM
access (1–2 nJ) is a couple of orders-of-magnitude larger than the cost of an
internal cache memory access (10 pJ). In order to encourage cache use, we con-
structed a constraint that penalizes our agent for networks that do not fit into
cache, and rewards the agent for networks that fit in the cache, shown in Eq. 4.
The constraint also accounts for the various cache levels, since caches closer to
the processor tend to take fewer cycles to access.

x_{\mathrm{cache}} = \mu_{L1} \cdot (S_{L1\,\mathrm{Cache}} - S_{\mathrm{model}}) + \mu_{L2} \cdot (S_{L1\,\mathrm{Cache}} + S_{L2\,\mathrm{Cache}} - S_{\mathrm{model}}) + \mu_{L3} \cdot (S_{L1\,\mathrm{Cache}} + S_{L2\,\mathrm{Cache}} + S_{L3\,\mathrm{Cache}} - S_{\mathrm{model}})    (4)

In our experiments, we used 1/3 for μL1, 1/2 for μL2 and 1 for μL3. This choice
is arbitrary and not representative of the cost of cache misses. Our choice of
parameters gives a smaller penalty for models that don’t fit into L1 cache, a
slightly larger penalty for models that don’t fit into the combination of L1 and
L2 cache and an even larger penalty for models that don’t fit into the combination
of L1, L2 and L3 caches. Models are penalized less, but also rewarded less, for
exceeding the size of small caches. The main reason for this is that it is unlikely
that our system will be able to fit a model into an L1 cache, due to its limited
size (typically measured in kilobytes).
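A small sketch of this soft constraint is given below, using the weights mentioned above; cache sizes are passed in bytes and the function is illustrative rather than the authors' code.

def cache_reward(model_bytes, l1_bytes, l2_bytes, l3_bytes,
                 mu_l1=1/3, mu_l2=1/2, mu_l3=1.0):
    # Reward models that fit into the (cumulative) cache levels, penalize those that do not.
    return (mu_l1 * (l1_bytes - model_bytes)
            + mu_l2 * (l1_bytes + l2_bytes - model_bytes)
            + mu_l3 * (l1_bytes + l2_bytes + l3_bytes - model_bytes))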
The final constraint for our system is the performance of our cells on a
validation dataset. We calculate our performance reward as described in Eq. 5.
When designing our reward function, we noticed that Pham et al. don’t mention
the value of c used for their experiments. We decided to set c to 80, which is the
same value Zoph and Le [1] used.
p = \frac{c}{(\mathrm{ppl}_{\mathrm{valid}})^2}    (5)
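In code this corresponds to a one-line function; c = 80 follows the choice stated above.

def performance_reward(valid_perplexity, c=80.0):
    # Higher validation perplexity yields a lower performance reward.
    return c / (valid_perplexity ** 2)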

3.3 Dynamic Cell Size

Initial experiments show that ENAS is able to meet hard constraints, given a
static cell size. Our controller is still limited in how small it could make the
networks it produces, since the number of blocks in a cell is fixed. In order to
give the controller greater freedom in choosing the design of its cells, we allowed
the controller to sample the number of blocks in a cell. This allows the controller
to change the size of its cells in a significantly larger range than before. In order
to generate a dynamically-sized cell, the controller first samples the number of
blocks in a cell, after which the generation process proceeds normally.
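The sketch below illustrates the two-stage generation process conceptually; random sampling stands in for the trained controller, and the block encoding (previous-node index plus activation) is an assumption made only for illustration.

import random

def sample_dynamic_cell(min_blocks=2, max_blocks=24,
                        activations=("identity", "sigmoid", "relu", "tanh")):
    n_blocks = random.randint(min_blocks, max_blocks)   # step 1: sample the cell size
    cell = []
    for i in range(n_blocks):                           # step 2: generate blocks as usual
        previous_node = random.randint(0, i)            # 0 denotes the cell input
        cell.append((previous_node, random.choice(activations)))
    return cell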
4 Experiments
In this section, we will discuss our results. First, we explain the idea behind our
experiments in Sect. 4.1. We provide an overview of the parameters used in our
experiments (Sect. 4.2) and discuss the results of different cells in Sect. 4.3.

4.1 Use Case


In order to demonstrate our algorithm, we set our constraints based on a real-
world use case. Our model needs to be able to predict words at the same pace
as they would be spoken by a native English speaker. When giving a pre-
sentation, English native speakers tend to speak at a pace of about 100–125
words per minute; we round this to 120 words per minute [9], i.e. roughly 500 ms
per word. We also want to allow some extra time for other systems such as encoders
and decoders and miscellaneous tasks, which results in a maximum inference time of 330 ms. Since
we also want our solution to be portable, we need to run it on an embedded
platform. For this, we chose the Raspberry Pi 3B. The Raspberry Pi has 1 GB
of RAM memory, of which we will use half, leaving 500 MB for the operating
system and miscellaneous memory consumption.

4.2 Configuration
We organize our results based on whether or not our constraints were enabled.
When constraints are disabled, we only take a cell’s accuracy into account. We
also include the cell reported by Pham et al., trained from scratch. In our exper-
iments, when using dynamic cell sizes, our model is allowed to choose a cell size
in the range [2–24], whereas previous research used a static cell size
fixed at 12. Our search space consists of four activation functions: identity,
sigmoid, ReLU and tanh. In order to be able to calculate an estimated cell size
in bytes, we assumed a parameter size of 4 bytes. When performing architec-
ture search, the weight of our cache-constraint is 10−13 and the weight of the
network’s performance is left at 1. Our agent tends to have little trouble under-
standing and optimizing for its cache-constraint, and thus, we opted to put a
larger emphasis on accuracy.
Our hidden states and embeddings have dimension 1000, both during search
and during inference. Weight-tying is used for our encoders and decoders, as
described by Inan et al. [10], alongside L2 regularization, weighted by a factor of
10−7 and gradient clipping, with a maximum of 0.25. We also add our controller’s
sample entropy to its reward, weighted by a factor of 0.0001. Our generated cells
are also enhanced using highway connections [11]. We also note that, contrary
to Zoph and Le [1], we do not perform a grid-search over hyperparameters after
training our cell.
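For reference, the search configuration described in this section can be summarised as follows; the key names are ours and purely illustrative.

search_config = {
    "cell_size_range": (2, 24),      # dynamic; prior work fixed the cell size at 12
    "activations": ["identity", "sigmoid", "relu", "tanh"],
    "parameter_size_bytes": 4,
    "cache_constraint_weight": 1e-13,
    "performance_weight": 1.0,
    "hidden_dim": 1000,
    "embedding_dim": 1000,
    "l2_weight": 1e-7,
    "gradient_clip": 0.25,
    "controller_entropy_weight": 1e-4,
}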

4.3 Cells
We were unable to reproduce the performance reported by Pham et al. [3] (55.8
Perplexity points on Penn Treebank (PTB) [12]) or the performance reported
by Li et al. [13] (60.3 Perplexity points using DARTS’ [14] training code) using
the cell shown in Fig. 1a. In order to be able to provide a fair comparison, we
trained the cell reported by Pham et al. [3] ourselves, reaching a test perplexity
of 186.35, using early stopping after 14 epochs of training on the PTB dataset.
While attempting to reproduce the results from ENAS, we also noticed that
dropout on our hidden-to-hidden weights actually deteriorated our network’s
performance by a significant amount, with our best validation perplexity rising
to 221.09 after 12 epochs.

Fig. 1. Comparison of the different cells generated with dynamic cell size: (a) ENAS, (b) No Constraints, (c) Constraints.

Running our algorithm without constraints and with dynamic cell size results
in cells that are drastically smaller than those generated when the cell size is
fixed. Cells generated with this combination of parameters typically contain 3–4
blocks; due to their simplicity, these cells tend to exhibit a worse validation
accuracy than their fixed-size counterparts. The cell that was used to gather
results is shown in Fig. 1b, and was able to achieve a validation perplexity of
252.57.
The smallest cells are generated when ENAS is put under resource constraints
and given the ability to choose the size of the cells it generates. The cell we used
to gather our results consists of a single line of blocks, shown in Fig. 1c. This cell is
similar to the cell generated without constraints, which also consisted of sigmoid
and ReLU blocks. Under constraints the controller presumably creates long,
snake-like cells with a single line of blocks, because this simple structure results
in the least amount of connections, which in turn reduces the necessary amount of
memory for the cell. It should also be noted that the generated cell contains two
sigmoid activation functions, which are 19.97 times slower than ReLU activations
and 31.7 times slower than identity activations, showing that ENAS still tries
to keep accuracy in mind, even when designing cells in a constrained manner.
Figure 2 also demonstrates that ENAS is effectively capable of meeting a given
hard constraint: although it initially has trouble meeting the constraint, it quickly
learns how to design cells that meet the given requirements.

Fig. 2. Estimated inference time as a function of training steps, showing that ENAS
is capable of meeting and exceeding a given hard constraint in a reasonable amount of
time. Red areas indicate constraint violations, green areas indicate that the constraint
was met.

5 Results

Table 1 provides an overview of the results we obtained during our experiments.

Table 1. Overview of our results, showing validation perplexity, memory use and
inference latency.

System           Validation perplexity   Memory use (MB)   Inference latency (ms)   Cache reward
ENAS             186.35                  136               382.97                   −112.54
No Constraints   252.57                  56                159.16                   −45.87
Constraints      212.62                  40                159.27                   −32.54

We also provide a graphical comparison between our 3 runs in Fig. 3. This


figure does not show the value of the cache constraint, since it is strongly cor-
related with memory use. The figure shows that, in terms of resources, our
dynamically designed cells outperform ENAS by quite a large margin, while
only sacrificing a relatively small amount of performance.
Fig. 3. Graphical comparison of the different cells, all numbers are normalized to
ENAS’ scores. A smaller surface area means a system is better.

6 Conclusion and Future Work


In this section, we discuss the conclusions we were able to draw from our exper-
iments, noting the difficulty in reproducing earlier results, the effects of allowing
a dynamically chosen cell size and the efficacy of applying constraints to a NAS
system. We also outline several possible directions for future research that might
contribute to improving the state-of-the-art in NAS.
While conducting our experiments, we found that the results reported by
Pham et al. [3] are difficult to reproduce. We also briefly attempted to reproduce
the reported results on the CIFAR-10 dataset using the original author’s code,
but were unable to do so due to excessive memory use. Li et al. [13] also reported
similar issues in their report on the efficacy of random search as NAS method
and the reproducibility of the results of NAS algorithms.
When ENAS is able to choose the size of the cells it generates, it tends to
generate smaller cells. We suggest that this occurs because we add the controller’s
sample entropy to its reward function. Since the sample entropy is summed
across all decisions, making fewer decisions produces less entropy, thus encouraging
smaller networks. When disabling this operation we noticed that ENAS has a
tendency to generate larger cells, but it also tends to ignore soft constraints,
generating networks that barely comply with the given hard constraints.
We can confidently conclude that our version of ENAS is capable of meeting
a given set of hard-constraints, while still also effectively optimizing for soft-
constraints. We suggest that this is a promising area of research, and further
enhancements could be made to make ENAS able to meet these constraints in
a faster manner, while still allowing for maximum accuracy.
Currently, our reward is a linear combination of a set of soft constraints,
multiplied by the AND-operation of all hard constraints. This has been shown
to work, however, it might be worth exploring other scalarization methods such
as Hypervolume-based Scalarization [15] and Chebyshev Scalarization [16]. Van
Moffaert et al. have shown that these methods can outperform simple linear
combinations in multi-objectivized versions of single-objective problems by a
large margin, making them an interesting option to consider in future research.
Recently, promising research has been done on deep learning systems operat-
ing on graphs [17,18]. Our system is currently generating computational graphs
node-by-node in a recurrent fashion. It could, however, be valuable to progres-


sively optimize a single graph by performing operations on its nodes. In this
respect, graph embeddings could be a very useful tool.
We gratefully acknowledge the support of the NVIDIA Corporation with the
donation of the Titan Xp GPU used for this research.

References
1. Zoph, B., Le, Q.V.: Neural architecture search with reinforcement learning (2017).
https://arxiv.org/abs/1611.01578
2. Williams, R.J.: Simple statistical gradient-following algorithms for connectionist
reinforcement learning. Mach. Learn. 8, 229–256 (1992)
3. Pham, H., Guan, M., Zoph, B., Le, Q.V., Dean, J.: Efficient neural architecture
search via parameters sharing (2018). http://proceedings.mlr.press/v80/pham18a/
pham18a.pdf
4. Elsken, T., Metzen, J.H., Hutter, F.: Efficient multi-objective neural architecture
search via lamarckian evolution. In: International Conference on Learning Repre-
sentations (2019). https://openreview.net/forum?id=ByME42AqK7
5. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures
for scalable image recognition. CoRR, vol. abs/1707.07012 (2017). http://arxiv.
org/abs/1707.07012
6. Real, E., Aggarwal, A., Huang, Y., Le, Q.V.: Regularized evolution for image classi-
fier architecture search. CoRR, vol. abs/1802.01548 (2018). http://arxiv.org/abs/
1802.01548
7. Kandasamy, K., Schneider, J., Pczos, B., Xing, E.P.: Neural architecture search
with Bayesian optimisation and optimal transport. In: Conference in Neu-
ral Information Processing Systems (2018). https://papers.nips.cc/paper/7472-
neural-architecture-search-with-bayesian-optimisation-and-optimal-transport.pdf
8. Horowitz, M.: 1.1 computing’s energy problem (and what we can do about it).
In: 2014 IEEE International Solid-State Circuits Conference Digest of Technical
Papers (ISSCC), pp. 10–14, February 2014
9. Wong, B.L.: Essential Study Skills, 8th edn. Cengage, Boston (2015). ISBN
9781285430096
10. Inan, H., Khosravi, K., Socher, R.: Tying word vectors and word classifiers: a loss
framework for language modeling. CoRR, vol. abs/1611.01462 (2016). http://arxiv.
org/abs/1611.01462
11. Zilly, J.G., Srivastava, R.K., Koutnı́k, J., Schmidhuber, J.: Recurrent highway net-
works. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International Con-
ference on Machine Learning. Proceedings of Machine Learning Research, PMLR,
06–11 August 2017, vol. 70, pp. 4189–4198. International Convention Centre, Syd-
ney (2017). http://proceedings.mlr.press/v70/zilly17a.html
12. Marcus, M., Kim, G., Marcinkiewicz, M.A., MacIntyre, R., Bies, A., Ferguson,
M., Katz, K., Schasberger, B.: The penn treebank: annotating predicate argument
structure. In: Proceedings of the Workshop on Human Language Technology, HLT
1994, pp. 114–119. Association for Computational Linguistics, Stroudsburg (1994).
https://doi.org/10.3115/1075812.1075835
13. Li, L., Talwalkar, A.: Random search and reproducibility for neural architecture
search. CoRR, vol. abs/1902.07638 (2019). http://arxiv.org/abs/1902.07638
14. Liu, H., Simonyan, K., Yang, Y.: DARTS: differentiable architecture search. arXiv
preprint arXiv:1806.09055 (2018)
15. Van Moffaert, K., Drugan, M.M., Nowé, A.: Hypervolume-based multi-objective
reinforcement learning. In: Purshouse, R.C., Fleming, P.J., Fonseca, C.M., Greco,
S., Shaw, J. (eds.) Evolutionary Multi-Criterion Optimization, pp. 352–366.
Springer, Heidelberg (2013)
16. Van Moffaert, K., Drugan, M.M., Now, A.: Scalarized multi-objective reinforcement
learning: novel design techniques. In: 2013 IEEE Symposium on Adaptive Dynamic
Programming and Reinforcement Learning (ADPRL), pp. 191–199, April 2013
17. Cai, H., Zheng, V.W., Chang, K.C.-C.: A comprehensive survey of graph embed-
ding: problems, techniques and applications. IEEE Trans. Knowl. Data Eng. 30,
1616–1637 (2017)
18. Perozzi, B., Al-Rfou, R., Skiena, S.: DeepWalk: online learning of social represen-
tations. In: Proceedings of the 20th ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, KDD 2014, pp. 701–710. ACM, New York
(2014). https://doi.org/10.1145/2623330.2623732
Spiking Neural Network Implementation
on FPGA for Robotic Behaviour

Maximiliaan Walravens1(B) , Erik Verreyken1,2 , and Jan Steckel1,2


1 Cosys Lab, Faculty of Applied Engineering, University of Antwerp,
Antwerp, Belgium
maximiliaan.walravens@student.uantwerpen.be,
jan.steckel@uantwerpen.be
2 Flanders Make Strategic Research Centre, Lommel, Belgium

Abstract. Over the last few years there has been a considerable amount
of progress in the field of machine learning. Neural networks are com-
mon in academic literature and they are often used in engineering appli-
cations. A certain class of Artificial Neural Networks (ANN) is called
Spiking Neural Networks (SNN); these are neural networks that utilise
the specific time at which a neuron fires to calculate its output. Feedback
loops with these kinds of SNNs allow the execution of complex tasks in
a compact manner. Due to their parallel character they are well
suited to implementation on Field Programmable Gate Arrays
(FPGA). The aim of this paper is to take a step towards creating a colli-
sion avoidance robot, which uses a SNN on an FPGA and Reinforcement
Learning (RL) on an external device.

1 Introduction
Conventional Artificial Neural Networks (ANN) are not time-varying systems,
unlike Spiking Neural Networks (SNN) which rely on spike timings as informa-
tion conveyor. This makes SNNs computationally more powerful than the con-
ventional ANNs [1]. SNNs are inspired by the biological neural system, which
remains the most efficient neural system we know. The focus of this paper is the
implementation of SNNs onto an FPGA and the creation of a framework which
supports Hardware-In-the-Loop (HIL) simulation in a reinforcement learning
task. FPGAs are especially suited for this application because the different neu-
rons and layers can be parallelised at a fraction of the cost of
ASICs [2], certainly if we take into account the trial-and-error of trying to find
a suitable architecture for the neural network. With ASICs, a new ASIC should
be manufactured each time the desired architecture changes. The purpose of this
paper is to train the SNN with reinforcement learning. Reinforcement learning is
especially suitable for a situation where an agent, controlled by a spiking neural
network, learns by interacting with its environment. The agent should learn its
own representation of the environment to minimise the bias of the programmer
[3]. As always with neural networks it is best to give the neural network as much
freedom over its decision process as possible.
© Springer Nature Switzerland AG 2020
L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 694–703, 2020.
https://doi.org/10.1007/978-3-030-33509-0_65
Implementing this RL robot in real life would be very expensive. This is


because it is time-consuming to operate, since the test environment can hardly
be parallelised. Another roadblock is the high cost of failure: if, for instance, the
robot crashes into an obstacle, parts might break and need to be replaced. In
addition, a human supervisor would have to be present during the test at all
times. This is why many neural networks are first trained in simulation and
make use of this knowledge in the real-life set-up afterwards. However, the gap
between reality and simulation often poses a significant problem. To solve this,
transfer learning could be improved upon or the fidelity of the simulation needs
to increase. The latter could be done by trying to make the parameters used
mimic the real world more closely. A last more experimental technique would be
to create a relatively high number of different simulation environments, where
our reality would just be a sample of this set.
In this paper, a hybrid of both simulation and real-life implementation will
be described. The SNN will be implemented on an FPGA. This way the uncer-
tainty of having everything in simulation is removed after testing. This SNN will
communicate with the MATLAB environment, where the simulation of the robot
will run. This communication will be achieved through Hardware-In-the-Loop
(HIL). This way the neural network, on the FPGA, is working properly in real
life already.
As shown in Fig. 1, a non-holonomic differential drive robot will be placed
in an arena with obstacles. The robot has 10 sensors placed in all horizontal
directions. The FPGA with the SNN on it will communicate with MATLAB via
JTAG. MATLAB will receive the wheel speed values that the network thinks
are appropriate. In simulation MATLAB will convert these into the movement of
the robot (in the simulation), using adjustable parameters. Then it will calculate
the values that the sensors on the robot would output and send these back to
the neural network. If the robot collides with an obstacle, the robot will reset
itself to the starting position and the RL algorithm will use this to recalculate
the weights and delays of the neural network. These will be sent to the FPGA
and the weights and delays will be adjusted on the go.

2 Implementation
2.1 Spiking Neural Network
The type of SNN used in this paper is the leaky-integrate-and-fire model [4].
Whenever a neuron receives a spike, the potential of this neuron will rise. It will
keep rising until it reaches a certain threshold. Once that happens, the neuron
will output a spike to all neurons connected to its output. The advantage of SNNs
is that the neuron only recalculates its state, if it receives a spike. Neurons that
do not get a lot of input and are not very active will not take up processing power.
This reduces the strain on the system and makes the computation dependent
on the number of spikes instead of the network architecture and depth [5]. Once
the neuron fires, it will enter a refractory period for a certain amount of time.
Whilst in this period the neuron will not react to any incoming spikes. This limits
[Figure 1: block diagram linking the MATLAB side (Reinforcement Learning) and the FPGA side (Spiking Neural Network), exchanging wheel speeds (left/right), sensor inputs (10) and weights and delays.]

Fig. 1. MATLAB simulation of a collision avoidance non-holonomic differential drive


robot (represented by the X in the circle) in an arena with obstacles (black dots). The
FPGA, with the SNN on it, will send the wheel speeds to the MATLAB simulation,
which in turn will calculate the movement of the robot, the sensor inputs and the
collision detection.

the maximal spike rate of the neuron. The last characteristic of this model is
the leak. At a fixed time, there will be a deduction from the potential. Without
this leak, a neuron that does not get many inputs could fire after a long and
arbitrary time. This could cause the whole dynamic of the system to change and
make the learning algorithm change the weights. The leak should be seen as a
limited window of memory. The lower the leak rate, the longer an input will
linger in the SNN [4].
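A minimal discrete-time Python sketch of such a leaky-integrate-and-fire neuron is shown below; the threshold, leak and refractory values are illustrative and do not correspond to the values used on the FPGA.

class LIFNeuron:
    def __init__(self, threshold=100, leak=1, refractory_steps=5):
        self.threshold = threshold
        self.leak = leak                      # potential deducted every time step
        self.refractory_steps = refractory_steps
        self.potential = 0
        self.refractory = 0

    def step(self, spike_in):
        if self.refractory > 0:               # ignore inputs while refractory
            self.refractory -= 1
            return 0
        self.potential = max(0, self.potential - self.leak)  # leak
        if spike_in:
            self.potential += 1               # integrate the incoming spike
        if self.potential >= self.threshold:  # fire, reset and enter refractory period
            self.potential = 0
            self.refractory = self.refractory_steps
            return 1
        return 0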
An advantage of SNNs is that there are two parameters, of the intercon-
nections or synapses, that the learning algorithm can change: the weights and
the delays. Due to the potential leaking away from neurons, a neuron will spike
quicker if the connection with the previous neuron has a higher weight. This is
because the output of the previous neuron will be multiplied and more spikes
will arrive, causing the neuron to spike more quickly. The other param-
eter is the delay. If the output of the previous neuron is delayed, the potential
has more time to decrease due to leak. A high delay will cause a neuron to spike
less often. This is illustrated in Fig. 2B. In this figure 100 is the threshold value.
With a delay increase from 5 to 10, the number of spikes required went up from
5 to 7 and the time to spike increased from 20 ms to 60 ms.
The robot has 10 distance sensors (inputs) and the left and right wheel speeds
are the output signals. Therefore, the architecture chosen here is a three layer
10-6-2 network, as shown in Fig. 2A. This architecture is an arbitrary choice and
is often chosen experimentally or a proven model is used. Since the architecture is
bound to the 10 input neurons (10 sensors) and two output neurons (2 wheels),
finding a proven model is difficult. Thus, trying the 10-6-2 model could be a
viable solution.
[Figure 2: two panels, (A) and (B); panel (B) plots Neuron Potential (mV), 0–100 mV, against time (ms), 0–80 ms.]

Fig. 2. (A) A representation of a 10-6-2 neural network architecture and its intercon-
nections. The black dots signify the part of the architecture shown in Fig. 3 (B) Com-
parison of neuron with a short (first spike) and long (second spike) pre-synaptic delay.
A high delay will cause the neuron to take more time to reach the threshold and thus
spike less often.

2.2 VHDL Components


The SNN will be implemented using Hardware Description Language (HDL)
code. The architecture of the HDL is shown in a compacted way in Fig. 3. Each
of the 10 input neurons will have a synapse connection to each neuron of the next
layer. The weights and delays of each synapse, the sensor inputs and wheel speeds
will be communicated with MATLAB. The architecture of each component will
be described below.
Input Neuron: This is the neuron connected to the sensor inputs. In simulation,
these inputs will be generated by MATLAB and will be sent over JTAG, as
can be seen in Fig. 3. The component will need to convert the input signal from
the sensor into a spike signal. Because we are using a simulation, the sensor
(MATLAB) will send a number from 0 to 255. It will send out more spikes if an
obstacle is close and fewer to none if it is far away. This is an arbitrary choice
since the learning algorithm will adjust its parameters accordingly [6,7]. Unlike
the normal neuron, the input neuron will not calculate a potential.
Instead, the neuron will use a form of temporal coding: according to the sensor
value, it will send a number of spikes every 50 ms [8].
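A possible mapping from an 8-bit sensor value to a spike count per 50 ms window is sketched below; the exact encoding used on the FPGA is not specified here, and the sketch assumes that higher sensor values indicate a closer obstacle.

def spikes_per_window(sensor_value, max_spikes_per_window=50):
    # sensor_value is assumed to be in the range 0..255
    return (sensor_value * max_spikes_per_window) // 255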
Neuron: A single neuron is connected to all previous neurons, through synapses,
and will have multiple inputs. The inputs are all connected to an OR-gate, so
if multiple spikes arrive at the same time, only one will be added to the
potential. This is because two spikes are non-distinguishable if they are separated
by less than one clock cycle. “Non-distinguishable” does not mean “equal”; it
means we cannot state whether they are equal or different [4]. Once it reaches the threshold,
the neuron will go into a refractory state and ignore all input signals.
Synapse: The synapse component will need to multiply the spikes with the weight
and delay the spikes according to the delay. Both the multiplying and delaying
will run on a 1 ms clock. To be able to adjust the delay, while running, a dynamic
FIFO is used [8].
Motorblock : The wheels of the robot are controlled with a servo signal. This is
a PWM signal, with a period of 20 ms and a high time that varies between 1 ms
and 2 ms. The robot is driving backwards if the high time is 1 ms, it is standing
still at 1.5 ms high time and is driving forward at 2 ms high time. To convert
the incoming spikes to this signal, the block will count for 18 ms, how many
spikes come in from the motor neurons. These are the last two neurons of the
architecture. These are normal neurons with another name to differentiate from
the hidden layer, as can be seen in Fig. 3. After 18 ms have passed, the amount
will be converted into a high time between 1 and 2 ms.
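The conversion performed by the motor block can be sketched as follows; the maximum spike count per window is an assumed normalisation constant, not a value taken from the paper.

def spikes_to_servo_high_time_ms(spike_count, max_count_per_window=50):
    # Map the spike count gathered over the 18 ms window to a servo high time.
    ratio = min(spike_count, max_count_per_window) / max_count_per_window
    return 1.0 + ratio   # 1 ms = full reverse, 1.5 ms = standstill, 2 ms = full forward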

[Figure 3: schematic of the FPGA side (input neurons, synapses, hidden neurons, motor neurons and motor block) and the MATLAB side (simulated sensors, reinforcement learning and collision detection), exchanging sensor inputs, weights and delays, wheel speeds and the robot position (x, y, θ).]

Fig. 3. A compressed schematic of the 10-6-2 SNN, in connection with the MATLAB
simulation environment. The input neurons will convert the sensor signals into spikes
and send these, via the synapses to the next neurons. The synapses have the possibility
to strengthen or weaken and delay these spikes. The spikes reaching the motor block
will be converted into a RC Servo signal.

2.3 HDL Verifier


The communication between MATLAB and the FPGA will use FPGA-in-the-
loop (FIL) which is a form of HIL and part of the MATLAB HDLVerifier toolbox.
With this toolbox, MATLAB will be able to read and write into the on-board
slave memory locations of the FPGA, via the MATLAB command line. For this
JTAG, PCI Express or an Ethernet cable could be used. With the Zybo Z7 only
JTAG is available, at the time of writing. A disadvantage of this technique is
that only one application can use the JTAG cable at a certain time. Using this
toolbox, the FPGA can read out the sensor and weight/delay values and write
in the wheel speed registers.

2.4 MATLAB
MATLAB will use the wheel speed values to calculate the new position of the
robot in the simulation. For this it will use a differential drive toolbox in MAT-
LAB. Each time the robot moves, MATLAB will recalculate all the sensor dis-
tances and write these in the FPGA registers. It will also perform a collision
detection and if the robot has touched an obstacle or wall, a boolean will be
set to true. This can later be used for the RL algorithm’s fitness function. Since
the RL is not implemented yet, the weights and delays will be set at random at
the start of the simulation. This way we can test if the simulation is working
properly, using a random SNN.
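Conceptually, the pose update computed in the simulation corresponds to standard differential-drive kinematics, sketched here in Python; the wheel radius and track width are placeholder values, and the actual computation is performed by the MATLAB toolbox.

import math

def update_pose(x, y, theta, v_left, v_right, dt, wheel_radius=0.03, track_width=0.10):
    v = wheel_radius * (v_left + v_right) / 2.0               # forward velocity
    omega = wheel_radius * (v_right - v_left) / track_width   # angular velocity
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)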

2.5 Results

FPGA: Since there are few open source libraries for spiking neural networks
in VHDL, the basic code had to be written from scratch. This framework has
been achieved as described in the above section. To verify the functionality a
testbench architecture was created. After functionality was verified, the code was
loaded to a Zybo Z7 (XC7Z020-1CLG400C). This is an FPGA with an on-board
Zynq-7020 processor, which is used to communicate with MATLAB; this could
also be done with a soft core MicroBlaze processor.
To connect the VHDL code to the PC, the AXI protocol was used for com-
munication. This AXI block writes the output signals from the SNN HDL
block to registers on the FPGA. This way the MATLAB block can read
from these registers and send them to the PC. An addition to the FPGA
side could be a block that writes the weights to non-volatile memory. This
way the weights do not need to be initialised each time the FPGA is shut
down. However, it is still recommended to additionally save the weights con-
figuration on the PC so FPGAs could be interchanged. Since the used FPGA
(Zybo Z7) also has a hardware processor on-board, it would not be difficult
to write code to make the processor read from the sensor registers on the
robot, and send this data to the appropriate registers, for the SNN to read out.

As shown in Table 1 below, the 10-6-2 architecture uses about 14% of the total FPGA
slices. This means a network of at least 5 times the size (about 90 neurons) could
fit on a single FPGA, being able to solve relatively complex tasks.
MATLAB : The MATLAB GUI of the simulation is a simple figure with an
imported map. The collision avoidance and sensor distance estimation were
implemented in Simulink via MATLAB functions. For the differential drive con-
version of wheel speeds to x, y and θ (the orientation of the robot), the Mobile
Robotics Simulation toolbox has been used.
Table 1. Hardware consumption of a 10-6-2 (inputLayer-HiddenLayer-OutputLayer)
SNN on Zybo Z7 FPGA. Since the SNN uses about 14% of the total FPGA slices, a
network 5 times the size could fit on this particular FPGA.

             Slices   LUTs   Flip-Flops
10-6-2 SNN   1853     3200   6618
Usage (%)    14       6.02   6.22

Reinforcement Learning: In the MATLAB function a framework for RL has


been written. The code consists of the FIL code to write in the weight and delay
registers. The RL algorithm itself has not been implemented. However, for the
simulation to work, a random generator will assign random values to all weight
and delay registers. So that to this we can test a randomly parametrised SNN.
Since the main focus of this paper is to test the relevancy of SNNs on FPGAs
and not about RL, RL will be implemented in MATLAB, because the MATLAB
RL algorithms are more optimized than RL in Hardware Description Language
(HDL). In theory, RL could also be implemented on the processor of the Zybo. If
other FPGAs were used, a soft-core MicroBlaze processor could be used. A
disadvantage is that most FPGAs have limited space in the non-volatile memory,
hence the implementation of RL on an external device (PC), alongside with the
storage of weights and delays.

3 Reinforcement Learning Controllers


3.1 General

Reinforcement learning is a framework for learning sequential decisions with


delayed rewards [9]. As previously stated, RL is especially useful when we are
dealing with agents interacting with their environment. The agent here would
be the collision avoidance robot and it will be interacting with its environment
by the means of 10 sensor inputs.

3.2 Metrics
A suitable metric would be the time or distance the robot drives until it collides.
The problem with both of these is that there is no sense of urgency. If time is
chosen, the robot could drive very slowly and this way the robot could maximise
its time before collision. Another problem is that if the robot drives at a reduced
speed, the result could be a mediocre collision avoidance algorithm. If driving
speed is increased, it could instantly drive into something. If distance is chosen
on the other hand, the algorithm could be good, but it could be very slow to react
on obstacles or impulses. Ways to solve this would be to let the robot restart not
only when it collides but also when it dips below an arbitrary minimum speed.
3.3 Reinforcement Learning Model

As shown in [10], there are several RL algorithms. The problem is, that they
are not all satisfactory for this application. Many are focussed on discrete action
spaces. Since discretising the continuous values leads to an explosion of dimen-
sions, Deep Q-networks and the like are ineffective. Using more suitable algo-
rithms would be appropriate. Algorithms like Deep Deterministic Policy Gradi-
ent, Trust Region Policy Optimization or Asynchronous Advantage Actor Critic
would be best [11]. A common approach to speed up the RL is to use architecture
and parameters from previous successful set-ups. Usually we adjust the neural
network to the type of inputs. If the inputs consist of pixel values of the environ-
ment, convolutional architectures would be suitable. If the states are numerical
vectors, multi-layer perceptrons may be most feasible. If the states are sequen-
tial data, Long Short-Term Memory would be a good starting point [11]. But
in this case the aim of this paper is to check whether Spiking Neural Networks
are a good choice for applications like collision avoidance. Parameters used for
this set-up will not be taken from previous papers, but will be generated in a
simulation from scratch. An exploration mechanism should also be chosen. As
shown in [12], classical ε-greedy mechanisms fail to provide enough performance
in continuous and multi-action scenarios. A solution for this would be Noisy Net
[12] or an Ornstein-Uhlenbeck process [13].
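As an illustration of the latter, a discretised Ornstein-Uhlenbeck noise process can be sketched as follows; theta, sigma and dt are illustrative values, not tuned parameters.

import random

def ou_noise(n_steps, theta=0.15, sigma=0.2, dt=0.01, x0=0.0):
    # Mean-reverting noise around 0, commonly added to continuous actions for exploration.
    x, samples = x0, []
    for _ in range(n_steps):
        x += theta * (0.0 - x) * dt + sigma * (dt ** 0.5) * random.gauss(0.0, 1.0)
        samples.append(x)
    return samples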

3.4 Genetic Algorithm

Instead of starting with random starting parameters for the neural network,
a good idea would be to apply a genetic algorithm (GA) and later build on
this foundation to get the RL algorithm started. Genetic algorithms have been
inspired by the natural selection mechanisms described by Darwin. They apply
a set of operators on a population of solutions, in such a way that the new
population improves.
In simulation the neural network could reach far with this GA method. But
improvements could still be made with RL if it is not satisfactory enough. A
next step could involve implementation in the real world.
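A minimal genetic-algorithm loop for seeding the weight and delay values could look as follows; the population size, mutation parameters and the fitness function (e.g. distance driven before collision) are assumptions for illustration.

import random

def evolve(fitness, genome_length, population_size=20, generations=50, mutation_rate=0.1):
    # Each genome is a flat list of candidate weight/delay values.
    population = [[random.random() for _ in range(genome_length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)             # rank by fitness
        parents = population[:population_size // 2]            # selection
        children = []
        while len(parents) + len(children) < population_size:  # crossover + mutation
            a, b = random.sample(parents, 2)
            cut = random.randrange(genome_length)
            child = a[:cut] + b[cut:]
            children.append([gene + random.gauss(0.0, 0.1) if random.random() < mutation_rate
                             else gene for gene in child])
        population = parents + children
    return max(population, key=fitness)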

4 Conclusions and Future Work


In this paper, steps have been taken towards an infrastructure for a collision
avoidance robot. The integration of MATLAB with FPGA hardware and a viable
connection between the SNN on the FPGA and the simulation/RL algorithm on
a PC are explored. A follow-up to this paper could be to compare different RL
algorithms for collision avoidance. After implementation of the RL algorithm,
training in simulation could commence.
Once the robot has achieved relatively good collision avoidance behaviour
in simulation, a next step would be to use a real robot. As mentioned in the
introduction, it would be quite risky to let this robot start from scratch. Since
if it fails and runs into an obstacle, the costs of repair could be quite high.
This is why transfer learning would be a good idea. The concept of transfer
learning is to use knowledge of the past. Since training times can be long, it is
recommended to reuse data from the past, for example from a simulation that
had similar parameters to the real-life environment. However, such data is expected
not to be faithful enough to be used without alteration. Using simulation data
for a real robot, using transfer learning, has already been proven in the past [14].
Since the choice of architecture was quite arbitrary, it could also be possible
to test different kinds of neural network configurations and observe if there are
performance boosts or downgrades. These configurations could vary the number
of hidden layers and the number of nodes these layers consist of. At the time
of writing there is no analytical way to determine the ideal number of layers and
neurons. The only way is systematic experimentation, by means of a hyper-
parameter sweep. In [15] it is stated that deep neural networks seem to perform
better. However, creating a deep neural network with multiple layers does not
seem necessary for this simple problem of converting sensor input to wheel speed
for obstacle avoidance.

References
1. Maass, W.: Networks of spiking neurons: the third generation of neural network
models. Neural Netw. 10, 1659–1671 (1997)
2. Pearson, M.J., Melhuish, C., Pipe, A.G., Nibouche, M., Gilhesphy, I., Gurney, I.,
Mitchinson, B.: Design and FPGA implementation of an embedded real-time bio-
logically plausible spiking neural network processor. In: Proceedings - 2005 Inter-
national Conference on Field Programmable Logic and Applications, FPL (2005)
3. Florian, R.V.: A reinforcement learning algorithm for spiking neural networks.
In: Proceedings - Seventh International Symposium on Symbolic and Numeric
Algorithms for Scientific Computing, SYNASC (2005)
4. Cessac, B., Paugam-Moisy, H., Viéville, T.: Overview of facts and issues about
neural coding by spikes (2010)
5. O’Connor, P., Welling, M.: Deep Spiking Networks. CoRR, vol. abs/1602.08323
(2016)
6. Gerstner, W., Kistler, W.: Spiking neuron models: single neurons, populations,
plasticity. Encycl. Neurosci. 4(7–8), 277–280 (2002)
7. Goldberg, D.H., Andreou, A.G.: Distortion of neural signals by spike coding. Neural
Comput. 19, 2797–2839 (2007)
8. Rosado-Muñoz, A., Bataller-Mompeán, A., Guerrero-Martı́nez, J.: FPGA imple-
mentation of spiking neural networks. In: IFAC Proceedings Volumes (IFAC-
PapersOnline) (2012)
9. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. The MIT
Press, Cambridge (2018)
10. Arulkumaran, K., Deisenroth, M.P., Brundage, M., Bharath, A.A.: A brief survey
of deep reinforcement learning. Special Issue on Deep Learning for Image Under-
standing (2017)
11. Behzadan, V., Arslan, M.: Adversarial reinforcement learning framework for bench-
marking collision avoidance mechanisms in autonomous vehicles (2018)
12. Fortunato, M., Azar, M.G., Piot, B., Jacob, M., Hessel, M., Osband, I., Graves,
A., Mnih, V., Munos, R., Hassabis, D., Pietquin, O., Blundell, C., Legg, S.: Noisy
networks for exploration (2017)
13. Bowling, M., Veloso, M.: Rational and convergent learning in stochastic games. In:
IJCAI International Joint Conference on Artificial Intelligence (2001)
14. Mu, J.W., et al.: Transfer learning for reinforcement learning on a physical robot.
Zhonghua yi xue za zhi (2010)
15. Goodfellow, I., Bengio, Y., Courville, A.: Adaptive Computation and Machine
Learning. MIT Press (2016)
A New Approach to Selectively
Implement Control Flow Error Detection
Techniques

Jens Vankeirsbilck, Jonas Van Waes, Hans Hallez, and Jeroen Boydens(B)

Department of Computer Science, KU Leuven,


Spoorwegstraat 12, 8200 Brugge, Belgium
{jens.vankeirsbilck,jonas.vanwaes,hans.hallez,jeroen.boydens}@kuleuven.be

Abstract. Many software-implemented control flow error detection


techniques have been proposed over the years. In an effort to reduce their
overhead, recent research has focused on selective approaches. However,
correctly applying these approaches can be difficult. This paper aims to
address this concern and proposes a new approach. Our new approach is
easier to implement and is applicable on any existing control flow error
detection technique. To prove its validity, we apply our new approach
to the Random Additive Control Flow Error Detection technique and
perform fault injection experiments. The results show that the selective
implementation has approximately the same error detection ratio with a
decrease in execution time overhead.

1 Introduction

The reliability of embedded systems in ever harsher working environments is


becoming more important. These systems are, however, more vulnerable to exter-
nal disturbances ranging from high energy particles striking the hardware to elec-
tromagnetic interference and attackers [4,13,14]. These external disturbances
introduce bit-flips in the system's hardware, which can cause invalid behavior
such as erroneously controlling an actuator, corrupting data or corrupting the
execution order of instructions.
The corruption of the execution order of instructions is also known as a
control flow error (CFE). A CFE is a violation against the control flow graph
(CFG) of the program. The CFG is a representation of the program, in which
the program is divided into basic blocks and edges. A basic block is a sequence
of consecutive instructions with exactly one entry and one exit point. An edge
is an intentional path between basic blocks. CFEs are typically partitioned into
two categories: inter-block CFEs and intra-block CFEs. An inter-block CFE is
an invalid jump through the program between two different basic blocks, while
an intra-block CFE is an invalid jump within the same basic block. Both types
of CFE can cause the affected program or system to halt, to crash or to provide
erroneous output, potentially leading to hazardous situations.
© Springer Nature Switzerland AG 2020
L. Barolli et al. (Eds.): 3PGCIC 2019, LNNS 96, pp. 704–715, 2020.
https://doi.org/10.1007/978-3-030-33509-0_66
To increase the reliability of embedded systems, several software-


implemented CFE detection techniques have been proposed [10,11]. In a previ-
ous comparative study, we developed a CFE detection technique called RACFED
which has a higher detection ratio at a lower execution time overhead when com-
pared to similar techniques [15]. The imposed execution time overhead, however,
is still relatively high. Therefore, research has focused on selective CFE detection
techniques [5,9]. Selective CFE detection techniques only protect a selected part
of the target program to reduce the imposed execution time overhead. Whilst
reducing the execution time overhead, selective techniques are often difficult to
implement correctly or are difficult to apply to other CFE detection techniques.
This paper aims to address this concern by proposing a new principle to selec-
tively implement CFE detection techniques. Our new principle is relatively easy
to implement and can be used on any CFE detection technique. To validate this
principle, we apply it to our RACFED technique to create Selective RACFED
or S-RACFED. The rest of this paper is organized as follows. We discuss more
background and related work in Sect. 2. Next, we discuss S-RACFED and how
to implement it in Sect. 3. Then, we validate S-RACFED and present the exper-
iment setup and the results in Sect. 4 and Sect. 5, respectively. Conclusions are
drawn in Sect. 6.

2 Background and Related Work


Over the past years, a multitude of CFE detection techniques has been proposed [1–3,6,
10–12]. In order to detect CFEs, such techniques insert extra control variables,
called signatures, and instructions into the target program at compile time. At
run time, the added instructions recalculate these control variables and compare
them to their expected compile-time value. A mismatch indicates that a CFE
has been detected. In previous work, the authors performed a comparative study
of these techniques and used the data to develop a new CFE detection technique
called Random Additive Control Flow Error Detection (RACFED) [15]. When
compared to similar techniques, RACFED shows to be the better technique since
it detects more CFEs at a lower overhead.
The above mentioned techniques are full implementation techniques, i.e.,
they insert the control variable update and verification instructions in each basic
block in the CFG. In recent years, selective implementations have been proposed,
where the extra instructions are only inserted in certain selected basic blocks.
One way to selectively apply CFE detection, is to simply ignore some basic
blocks. This approach is taken by the S-SETA technique [5]. Within S-SETA,
smaller basic blocks have no extra instructions inserted in order to reduce the
imposed execution time overhead. While this drastically reduces the overhead,
it also reduces the detection ratio, as CFEs that occur within the ignored basic
blocks are possibly undetectable.
A second way to selectively apply CFE detection, is to insert signature update
instructions in each basic block, but to only insert signature verification instruc-
tions in selected basic blocks. Chielle et al. propose to only insert the verifica-
tion instructions in larger basic blocks with their SETA-C technique [5]. Khudia
et al. took another approach with their Abstract Control Signatures (ACS) tech-
nique [9]. They divide the basic blocks into regions and assign a specific signature
to each region. Each basic block in a region has a signature update instruction
inserted, while signature verification instructions are only inserted in basic blocks
where the control flow changes from one region to another. Khudia et al. define
their regions as a collection of basic blocks which has a single entry point but
multiple exit points. Ideally, the regions should possess certain properties that
help in minimizing the number of signature updates and verifications.
Although SETA-C and ACS show promise, implementing them can be diffi-
cult. SETA-C only inserts the verification instructions in larger basic blocks. For
algorithms with little difference in basic block length, the decrease
in execution time overhead can be limited. To gain the maximum decrease in exe-
cution time overhead when applying ACS, the division of the CFG into regions
is critical. However, Khudia et al. provide little guidance on how to find this
optimal division.
Therefore, we propose an easier and more consistent approach to selectively
implement CFE detection techniques. We propose to only insert the signature
verification instructions in return basic blocks. These are basic blocks which
contain the return statement that exits out of the current function or program
and returns control to the calling function or program. These basic blocks are
the last basic block in each possible path through the target CFG and thus allow the detection of all CFEs that occur within the CFG.

3 Selective RACFED

This section discusses how we applied the proposed selective implementation principle to our RACFED technique. First, we discuss how the principle transforms RACFED into S-RACFED with a high-level example. Next, we present the small change to the RACFED compile-time process that implements it selectively.

3.1 High-Level Example

RACFED is our CFE detection technique that uses run-time signature updates
to detect both inter- and intra-block CFEs. A high-level implementation of
RACFED is shown on the left-hand side of Fig. 1. The instructions inserted
by RACFED are indicated in bold and S is the run-time signature. As can be
seen, a signature update is inserted at the beginning of each basic block and
after each non-jump original instruction. In RACFED, each basic block has a
signature verification instruction that compares the run-time signature with the
compile-time value. These are indicated as S != #value: error(). When a mis-
match between the run-time and compile-time value is detected, a call is made
to the error-handler.
Applying our proposed selective implementation approach transforms
RACFED into S-RACFED. As shown on the right-hand side of Fig. 1, only the

Fig. 1. Transformation from RACFED to S-RACFED.

bottom basic block has a signature verification instruction inserted because that
is the return basic block. Since S-RACFED does not remove any signature update
instructions, it should have the same CFE detection ratio as RACFED but with
a lower execution time overhead.

3.2 Change in Compile-Time Process

There are four steps in the compile-time process to implement RACFED:

1. First, all needed compile-time variables are assigned to all basic blocks.
2. Then, the instruction monitoring is implemented. These update instructions
help to detect intra-block CFEs.
3. Next, the first signature update and the signature verification instructions
are inserted in each basic block.
4. Finally, the last signature update is inserted in each basic block.

To implement S-RACFED, the third step is slightly adjusted. The first sig-
nature update instruction is still inserted in each basic block. The signature
verification, however, is only inserted in return basic blocks. Therefore, an extra
check is inserted in the compile-time process that analyses whether or not the
last instruction of the basic block is a return instruction. Only if it is, is the signature verification inserted in the basic block.
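
To make this concrete, the following Python sketch mimics the adjusted third step; the BasicBlock and Instruction classes and the return-detection heuristic are simplified, hypothetical stand-ins for the real compile-time infrastructure, not the authors' implementation.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Instruction:
    text: str
    def is_return(self) -> bool:
        # Hypothetical heuristic: on ARM, a return is typically 'bx lr' or 'pop {..., pc}'.
        return self.text.startswith("bx lr") or ("pop" in self.text and "pc" in self.text)

@dataclass
class BasicBlock:
    instructions: List[Instruction]
    inserted: List[str] = field(default_factory=list)

def instrument_step3(blocks: List[BasicBlock]) -> None:
    """Adjusted step 3: signature update in every block, verification only in return blocks."""
    for bb in blocks:
        bb.inserted.append("signature_update")                 # still inserted everywhere
        if bb.instructions and bb.instructions[-1].is_return():
            bb.inserted.append("signature_verification")       # only in return basic blocks

# Tiny example: only the second block ends in a return instruction.
blocks = [
    BasicBlock([Instruction("add r0, r0, r1"), Instruction("b .loop")]),
    BasicBlock([Instruction("mov r0, r2"), Instruction("bx lr")]),
]
instrument_step3(blocks)
print([bb.inserted for bb in blocks])
# -> [['signature_update'], ['signature_update', 'signature_verification']]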

4 Experimental Setup

To thoroughly validate the proposed approach, we performed fault injection experiments, measured the error detection cost and measured the error detection
latency of S-RACFED and RACFED. This section discusses the setup of the
performed experiments in detail. First, we present the four criteria we measured
for S-RACFED. Next, we present the selected case studies and the selected
target. Finally, we discuss the used fault injection process.

4.1 Criteria
To compare S-RACFED with RACFED, we measured four criteria: the effect of
each error, the execution time overhead, the code size overhead and the error
detection latency.

4.1.1 Error Effects


For every injected fault, we determined its effect. We grouped the effects in four
categories:

• Detected (Det.): This percentage of faults was detected by the implemented countermeasure. In other words, this category is the desired result.
• Hardware Detection (HD): Many processors already have several inter-
nal fault handlers that are able to detect specific hardware faults, such as
improper bus usage or memory access violations. This category represents
the faults detected by such a fault handler.
• Silent Data Corruption (SDC): These are the faults that were not
detected by the implemented technique and caused the algorithm to produce
a wrong result.
• No Effect (NE): Finally, this is the percentage of the faults that were not
detected and did not affect the outcome of the target algorithm.

4.1.2 Error Detection Cost


Software-implemented CFE detection techniques insert extra instructions into
the code they need to protect. This means that the code size and the execution
time of the protected code will be higher than that of the unprotected code.
We measured both the code size and the execution time of the protected code and compared the results to those of
the unprotected code. The code size is measured using the text output of the
arm-none-eabi-size tool, when used on the .elf file of the compiled case study.
The execution time is measured using a hardware timer of the target.

4.1.3 Error Detection Latency


Because the main difference between RACFED and S-RACFED is the place-
ment of the verification instruction, we measured the error detection latency of
both techniques. For every detected CFE, we counted the number of executed
instructions before the CFE was detected.

4.2 Case Studies


The selected case studies are an implementation of the following algorithms: bit
count (BC), bubble sort (BS), cycle redundancy check (CRC), cubic function
solver (CU), Dijkstra’s algorithm (DIJ), fast Fourier transform (FFT), matrix
multiplication (MM) and quick sort (QS). While the BS, MM and QS case
studies use our own implementation, the other implementations were selected
from MiBench version 1.0 [7].
With the selected case studies, a wide array of embedded application domains
is covered. The BS, QS, MM and FFT case studies are also the typical appli-
cations used to validate error detection techniques in literature [1–3,6,10–12].
Furthermore, they also have varying basic block and edge distributions which
assures a thorough validation of our approach. Some setup and validation code
is required to launch the case studies and verify their results. However, only the
function implementing the target algorithm will have (S-)RACFED applied and
only that function will be subjected to CFEs.

4.3 Target
We executed the case studies on an ARM Cortex-M3, because it is an industry-
leading 32-bit processor used in many different embedded application domains.
For the CFE injection experiments, we used a simulated Cortex-M3 provided
by the Imperas simulator [8]. The Imperas simulator is an instruction set simulator that allows executing the target instructions at host (computer) speed,
speeding up the fault injection experiments. The Imperas simulator supports
ARM, MIPS, PowerPC and several other targets.
Because the Imperas simulator is an instruction accurate simulator and not
a cycle accurate simulator, it is less suited to perform timing measurements.
Therefore, we performed the execution time measurements on a physical tar-
get, the NXP LPC 1768. The NXP LPC 1768 is an ARM Cortex-M3 driven
microcontroller running at 96 MHz, including 512 kB FLASH and 32 kB RAM.

4.4 Fault Injection Process


The used CFE injection process starts with an initialization phase that initializes
variables and creates all possible CFEs. Those possible CFEs are constructed
from the user-defined options disasmFile and range. The disasmFile holds the
path to the disassembly file of the target program. This file is needed to know
which program counter (PC) values are valid for the program. The range option
indicates for which PC range the target program will be tested. The option
allows targeting a specific function (or functions) to be tested, instead of the entire program. All created possible CFEs are within this range.
Once the possible CFEs are created, the execution part of the fault injection
process starts. This phase begins with performing a check to determine whether
or not the previous loop injected a CFE. In case a CFE was injected, a new
instance of the virtual platform is created and a handler to the processor is
obtained. Next, the processor executes the number of steps indicated by the execSteps variable. The execSteps variable indicates how far into the case study the
fault injection testing has already progressed. When no CFE was injected, the
processor of the current instance of the virtual platform is progressed one step
further and the corresponding variables are updated.
Next, the current PC of the processor is read out and the loop variable
CFE Injected is updated. The read PC value is compared to the user specified
endAddress value. The endAddress is the PC value that indicates the end of the
current fault injection experiment. When this value is reached, all desired CFEs
have been injected and the experiment can thus stop. In our case, the endAddress
is the return address of the function implementing the target algorithm. In other
words, in our case the endAddress is the first PC value of the verification part
of the case study.
The next step in the algorithm is to determine whether or not the read out
PC falls within the user specified range. When the PC is not in that range, a
new CFE attempt is started and the execution phase restarts. In case the PC
does fall within the range, the corresponding CFE possibilities are extracted from all
possibilities that were created at the start of the algorithm.
Finally, the process analyses whether for the current PC a new CFE must
be injected. This is determined by comparing the current wanted possibility
index (possIndex ) with the number of possibilities for this PC. If a CFE must be
injected, it is injected, analysed for its effect on the target program, the corresponding
variables are updated and the current instance of the virtual platform is deleted.
Then, the execution part of the fault injection restarts.
When no CFE must be injected, a final check is executed. This final check
determines whether or not the current PC must be removed from the range of
allowed PC values. It does this by checking if the total amount of injected CFEs
exceeds the limit provided by the user via the minCFEs option. When the total
amount of injected CFEs exceeds the limit, the PC is removed. The minCFEs
option is mainly used to break the process out of case studies that have loops
with a lot of iterations. In this case, the loop iterations will inject the same
CFEs until the minCFEs limit is reached. Once reached, the PC values part of
the loop will be removed from the allowed PC range, causing the fault injection
experiment to skip them and test the remainder of the target program.
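
The following Python sketch condenses the described loop; the SimulatedTarget class is a hypothetical stand-in for the Imperas virtual platform, and the bookkeeping of CFE possibilities, the effect analysis and the minCFEs handling are simplified.

class SimulatedTarget:
    """Stand-in for one instance of the simulated Cortex-M3."""
    def __init__(self):
        self.pc = 0x100
    def step(self, n=1):
        self.pc += 4 * n                      # pretend every instruction is 4 bytes
    def inject_cfe(self, target_pc):
        self.pc = target_pc                   # corrupt the program counter

def run_injection(valid_pcs, end_address, min_cfes):
    possibilities = {pc: [p for p in valid_pcs if p != pc] for pc in valid_pcs}
    injected = {pc: 0 for pc in valid_pcs}
    exec_steps, cfe_injected, target = 0, True, None
    while True:
        if cfe_injected:                      # previous loop injected: create a fresh instance
            target = SimulatedTarget()
            target.step(exec_steps)           # fast-forward to the current position
            cfe_injected = False
        else:
            target.step()
            exec_steps += 1
        pc = target.pc
        if pc >= end_address:                 # all desired CFEs have been injected
            break
        if pc not in valid_pcs:               # outside the user-specified range
            continue
        if injected[pc] < len(possibilities[pc]) and injected[pc] < min_cfes:
            target.inject_cfe(possibilities[pc][injected[pc]])
            injected[pc] += 1                 # effect analysis would happen here
            cfe_injected = True
        elif injected[pc] >= min_cfes:
            valid_pcs.discard(pc)             # stop testing PCs inside long loops
    return injected

print(run_injection({0x100, 0x104, 0x108}, end_address=0x140, min_cfes=2))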

5 Results

This section presents the results of the performed experiments per defined
criterion.

5.1 Fault Injection Results

For each of the selected case studies, we defined five datasets and we subjected
each combination of case study, dataset and implemented technique to fault
injection experiments. The results are shown in Fig. 2 and are the averages for

Fig. 2. Bar chart presenting the results of the fault injection experiments.

the five datasets. The green bar indicates the percentage of CFEs that was
detected by the implemented technique (Det), the red bar indicates the percent-
age of CFEs that was not detected and caused a wrong result (SDC), and the gray bar shows the percentage of CFEs that were either detected by the microcon-
troller itself or that were not detected and did not corrupt the result of the case
study (HD + NE).
As shown, the impact of implementing RACFED selectively changes from
case study to case study. For the CU, DIJ, and QS case studies, the SDC ratio
of S-RACFED is approximately the same as when RACFED is implemented.
For the DIJ case study, S-RACFED even has a higher error detection ratio. A
minor negative impact on the SDC ratio is measured for the BS and FFT case
studies. For these two case studies, the SDC ratio increases by one or two
percent. The FFT case study also reports a much lower error detection ratio
with S-RACFED implemented, but the undetected errors are mainly covered by
either the hardware or the application resilience. However, for the BC, CRC and
MM case studies, the chart illustrates that the selective implementation has a
major negative impact on both the error detection ratio and the SDC ratio. For
these case studies, the SDC ratio increases by five to ten percent, resulting in
an intolerable SDC ratio between ten and eighteen percent. Regarding the CRC
and MM case studies, this increase is coupled with a significant decrease in error
detection ratio.
Although little to no difference between the fault injection results of RACFED and S-RACFED was expected, the results indicate that, depending on the case study, large differences are possible. Especially for the smaller case studies, i.e. BC and CRC, bigger differences were noted. The data shows that when S-RACFED is implemented, more single-bit CFEs resulted in premature
undetected exits out of the algorithm, increasing the SDC ratio.

Fig. 3. Execution time overhead of the two CFE detection techniques.

5.2 Error Detection Cost


The error detection cost of the techniques is shown in Fig. 3. The dark-blue bars
show the execution time of each technique compared to the unprotected code,
while the light-blue bars present the code size of each technique.
Here, the chart displays the expected results, with S-RACFED imposing
less execution time overhead and code size overhead. Again, the size of the reduction in overhead depends on the case study. On average, the execution
time with S-RACFED implemented is ×1.66 that of the unprotected code. In
contrast, the execution time with RACFED implemented is ×1.87 compared
to the unprotected code, on average. The reduction in code size overhead is
smaller, because only a small part of the entire code base has been protected.
On average, code size when S-RACFED is implemented is ×1.03 that of the
unprotected code, while the code size overhead of RACFED is ×1.04.

5.3 Error Detection Latency


While a lower error detection cost is the advantage of selectively implementing
a CFE detection technique, its disadvantage is a higher error detection latency.
For the performed experiments, we measured the error detection latency as the
number of executed instructions between the point of injection and the execution
of the error handler. The results are shown in the box plots of Fig. 4, with the
error detection latency data for RACFED shown in dark-blue and the data for
S-RACFED depicted in light-blue. The median value for each box plot is shown
with the orange line. To show the broad range of data as clearly as possible, we
chose a logarithmic scale on the x-axis.
As illustrated, the error detection latency for RACFED falls within the range
from 1 executed instruction to 30 executed instructions for six out of the eight
case studies. For the CU and FFT case studies the upper limit of the range

Fig. 4. Boxplots showing the extra error detection latency of S-RACFED compared to
RACFED.

extends to 4000 instructions and 500 instructions, respectively. The median value
for most case studies is located around 10 instructions.
The error detection latency of S-RACFED, however, is much higher for all
case studies. Here, we can divide the case studies in two groups. The first group
consists of BC, CRC, CU and DIJ, and has an error detection latency range
from 1 executed instruction to various upper limits, going to 1 million executed
instructions before the CFE was detected for the DIJ case study. The second
group consists of the remaining case studies and shows little spreading in its
error detection latency. For these case studies, the CFE was only detected once the entire program had finished executing.
These results indicate that S-RACFED is not suited for all types of applica-
tions. For example, an application that controls an actuator preferably has a low
error detection latency as erroneously controlling an actuator can cause damage
or injuries. Applications that are suitable for S-RACFED are applications that
process data and return the result. Since only the result is of importance, a
higher detection latency can often be tolerated.

6 Conclusions
This paper proposed a new approach to selectively implement CFE detection
techniques. We propose to only insert signature comparisons and the branches
to the error handler in the return basic blocks of the target algorithm. The
advantages of this approach are that it is easy to apply and that it can be
applied to most, if not all, existing CFE detection techniques.
We validated this approach by applying it to our own RACFED technique
to create Selective RACFED (S-RACFED). Next, we performed fault injection
experiments for eight case studies and measured the detection ratio, the error
detection cost and the error detection latency of S-RACFED. The results show
that S-RACFED has approximately the same error detection ratio with a lower
overhead when compared to RACFED. This reduction in overhead does come
at a cost of a high error detection latency. This high latency makes S-RACFED
less suited for actuator controlling applications. Its high detection ratio and low
execution time overhead do make S-RACFED very suitable for data processing
applications.

Acknowledgement. This work is supported by a research grant from the Baekeland program of the Flemish Agency for Innovation and Entrepreneurship (VLAIO) in cooperation with Televic Healthcare NV, under grant agreement IWT 150696.

References
1. Alkhalifa, Z., Nair, V.S.S., Krishnamurthy, N., Abraham, J.A.: Design and evalu-
ation of system-level checks for on-line control flow error detection. IEEE Trans.
Parallel Distrib. Syst. 10(6), 627–641 (1999). https://doi.org/10.1109/71.774911
2. Asghari, S.A., Abdi, A., Taheri, H., Pedram, H., Pourmozaffari, S.: SEDSR: soft
error detection using software redundancy. J. Softw. Eng. Appl. 5(9), 664–670
(2012). https://doi.org/10.4236/jsea.2012.59078
3. Asghari, S.A., Taheri, H., Pedram, H., Kaynak, O.: Software-based control flow
checking against transient faults in industrial environments. IEEE Trans. Industr.
Inf. 10(1), 481–490 (2014). https://doi.org/10.1109/TII.2013.2248373
4. Baffreau, S., Bendhia, S., Ramdani, M., Sicard, E.: Characterisation of microcon-
troller susceptibility to radio frequency interference. In: Proceedings of the Fourth
IEEE International Caracas Conference on Devices, Circuits and Systems (Cat.
No.02TH8611), pp. I031–1–I031–5 (2002). https://doi.org/10.1109/ICCDCS.2002.
1004088
5. Chielle, E., Rodrigues, G.S., Kastensmidt, F.L., Cuenca-Asensi, S., Tambara, L.A.,
Rech, P., Quinn, H.: S-SETA: selective software-only error-detection technique
using assertions. IEEE Trans. Nucl. Sci. 62(6), 3088–3095 (2015). https://doi.org/
10.1109/TNS.2015.2484842
6. Goloubeva, O., Rebaudengo, M., Sonza Reorda, M., Violante, M.: Soft-error detec-
tion using control flow assertions. In: Proceedings of the 18th IEEE International
Symposium on Defect and Fault Tolerance in VLSI Systems, DFT 2003, Washing-
ton, DC, USA, pp. 581–588. IEEE Computer Society (2003)
7. Guthaus, M.R., Ringenberg, J.S., Ernst, D., Austin, T.M., Mudge, T., Brown,
R.B.: MiBench: a free, commercially representative embedded benchmark suite.
In: Proceedings of the Fourth Annual IEEE International Workshop on Workload
Characterization. WWC-4 (Cat. No. 01EX538), pp. 3–14 (2001). https://doi.org/
10.1109/WWC.2001.990739
8. Imperas: Revolutionizing embedded software development (2018). http://www.
imperas.com/
9. Khudia, D.S., Mahlke, S.: Low cost control flow protection using abstract control
signatures. SIGPLAN Not. 48(5), 3–12 (2013). https://doi.org/10.1145/2499369.
2465568
10. Li, A., Hong, B.: Software implemented transient fault detection in space computer.
Aerosp. Sci. Technol. 11(2), 245–252 (2007). https://doi.org/10.1016/j.ast.2006.06.
006

11. Nicolescu, B., Savaria, Y., Velazco, R.: SIED: software implemented error detection.
In: Proceedings 18th IEEE Symposium on Defect and Fault Tolerance in VLSI
Systems, pp. 589–596 (2003)
12. Oh, N., Shirvani, P.P., McCluskey, E.J.: Control-flow checking by software signa-
tures. IEEE Trans. Reliab. 51(1), 111–122 (2002)
13. Sierawski, B.D., Reed, R.A., Mendenhall, M.H., Weller, R.A., Schrimpf, R.D., Wen,
S.J., Wong, R., Tam, N., Baumann, R.C.: Effects of scaling on muon-induced soft
errors. In: 2011 International Reliability Physics Symposium, pp. 3C.3.1–3C.3.6
(2011). https://doi.org/10.1109/IRPS.2011.5784484
14. Tang, A., Sethumadhavan, S., Stolfo, S.: CLKSCREW: exposing the perils of
security-oblivious energy management. In: 26th USENIX Security Symposium
(USENIX Security 17), pp. 1057–1074. USENIX Association, Vancouver, BC
(2017)
15. Vankeirsbilck, J., Penneman, N., Hallez, H., Boydens, J.: Random additive control
flow error detection. In: Gallina, B., Skavhaug, A., Bitsch, F. (eds.) Computer
Safety, Reliability, and Security, pp. 220–234. Springer International Publishing,
Cham (2018)
In-Air Imaging Sonar Sensor Network
with Real-Time Processing Using GPUs

Wouter Jansen(B) , Dennis Laurijssen, Robin Kerstens, Walter Daems,


and Jan Steckel

Faculty of Applied Engineering - CoSys Lab, University of Antwerp,


Groenenborgerlaan 171, Antwerp, Belgium
{wouter.jansen,jan.steckel}@uantwerpen.be

Abstract. For autonomous navigation and robotic applications, sensing the environment correctly is crucial. Many sensing modalities for this
purpose exist. In recent years, one such modality that is being used is
in-air imaging sonar. It is ideal in complex environments with rough con-
ditions such as dust or fog. However, like with most sensing modalities,
to sense the full environment around the mobile platform, multiple such
sensors are needed to capture the full 360-degree range. Currently the
processing algorithms used to create this data are insufficient to do so for
multiple sensors at a reasonably fast update rate. Furthermore, a flexi-
ble and robust framework is needed to easily implement multiple imaging
sonar sensors into any setup and serve multiple application types for the
data. In this paper we present a sensor network framework designed for
this novel sensing modality. Furthermore, an implementation of the pro-
cessing algorithm on a Graphics Processing Unit is proposed to poten-
tially decrease the computing time to allow for real-time processing of
one or more imaging sonar sensors at a sufficiently high update rate.

1 Introduction
Sonar sensing has, contrary to what nature tells us, often been judged as not being capable of supporting intelligent interaction with complex, dynamic environments when used outside water. For implementation in robotics, optical sensors
are often the first choice [4] and sonar is not considered. However, as we can see
with several species of bats, their biosonar allows for classification of objects and
finding their prey among them [7,14]. In recent years, our research group has
developed sonar systems inspired by the echolocation system used by these bats,
capable of finding detailed 3D information from complex environments [9,15].
This sensing modality can serve a wide range of applications, especially in
harsh environments where optical techniques tend to fail due to medium distor-
tions due to foggy weather or dusty environments. As an example application in
robotics, a Simultaneous Localisation and Mapping (SLAM) solution was devel-
oped using 3D sonar for estimating the ego-motion of a mobile platform [16].
Furthermore, in 2017, a control strategy was developed to navigate a mobile
platform through an environment using the acoustic flow field generated by the
sonar sensor [17]. As these works show, sonar is an increasingly capable sensing
modality.
However, often multiple sensors are embedded into a (mobile) platform, sens-
ing in different directions. This is no different for sonar. The current sonar system
we use is capable of sensing the entire frontal hemisphere in front of the sensor.
When trying to sense the entire 360-degree around a mobile platform, multiple
sonar sensors are required. In this paper we propose a framework to deploy a
network of these sonar sensors for such a platform, providing time-synchronised
measurements which can be processed in real-time on a Compute Unified Device
Architecture (CUDA) [12] enabled Graphics Processing Unit (GPU) by NVIDIA.
Our main focus lies on being able to scale up the number of sensors, general ease of use of the framework, and being able to always operate in real-time, limited only by the frequency of sonar measurements.

2 Sonar Synchronisation

First, we will provide a brief overview of the 3D sonar system used, our Embed-
ded Real Time Imaging Sonar (eRTIS) [15]. A much more in-depth description
of the sonar can be found in [9,18]. The sensor has a single emitter and uses a
known, irregularly positioned array of 32 digital microphones, as can be seen in Fig. 1a. A broadband FM-sweep is emitted, allowing the microphones to capture the reflected echoes. The processing of the recorded microphone data to create 2D or 3D images of the environment will be further discussed in Subsect. 4.1. The microphone signals are recorded over a 1 bit Sigma-Delta Pulse-Density Modulation (PDM) ADC. The embedded microprocessor on the eRTIS sensor sends out the recorded data over a USB 2.0 connection to a Raspberry Pi, which will distribute it to the network, which will be discussed in the next section.
As we use acoustic echolocation, allowing each eRTIS sensor to freely choose when to emit its FM-sweep and capture data with the microphones would cause heavy interference, as the emission signals from other sensors would be captured. This would cause artefacts, making most measurements
meaningless. Therefore, we have to make the emitters of the eRTIS sensors send
their FM-sweeps simultaneously in order to make the emitters form a consistent
source. To that end, we use a synchronisation mechanism that we have built into
the hardware, which is discussed in more detail in [8]. The eRTIS sensors use
a SYNC signal to synchronise with each other as seen on Fig. 1b. This SYNC
signal allows each eRTIS sensor to synchronise with the previous eRTIS sensor
and will also propagate the SYNC signal to further daisy-chained eRTIS sensors
in the network. To set the measurement rate, the first eRTIS sensor in the chain
can be controlled by an external clock pulse. Each connected Raspberry Pi will
receive a specific message over its USB serial data connection after the FM-
sweep has been emitted and will cause the Raspberry Pi to initiate the capture
of the measurement data over that same serial connection. The measurement
delay with this synchronised system is constant at ca. 400 ns. However, this is inconsequential, as it is but a small fraction of the measurement cycle of an eRTIS sensor sampling at 450 kHz using an update rate of 20 Hz.
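
To make this concrete with the numbers stated above: at an update rate of 20 Hz the measurement cycle lasts 50 ms, so

\frac{t_{\mathrm{delay}}}{T_{\mathrm{cycle}}} = \frac{400\ \mathrm{ns}}{1/(20\ \mathrm{Hz})} = \frac{4 \times 10^{-7}\ \mathrm{s}}{5 \times 10^{-2}\ \mathrm{s}} = 8 \times 10^{-6},

i.e. the synchronisation delay is less than a thousandth of a percent of one measurement cycle, and it is also well below the 1/450 kHz ≈ 2.2 µs sample period.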


Fig. 1. (a) Schematic overview of all elements of an eRTIS sensor connected to a Raspberry Pi. (b) Chain of eRTIS sensors connected with the SYNC connection and initiated by an external pulse generator.

3 Network

The chain of hardware-synchronised sonar sensors will continuously capture the recorded data of the microphone array. However, we need additional nodes in this network to receive these data streams and process or store them elsewhere. Furthermore, several applications need to be able to make use of the processed sensor data. This network with various node types needs a particular topology as well as a communication pipeline that is capable of working in real-time. In this section we will discuss these further. For the network topology we use a star format, as seen in Fig. 2. The Raspberry Pi of each eRTIS sensor is connected
to the central node.
This central node represents the receiver of all sonar measurements. It is cur-
rently configurable in two states. The first state is to store all raw measurement
data. It will identify, index and store the recorded data received from each eRTIS
sensor. The second state is when real-time processing is required. The applica-
tion nodes, the end-users of the processed data, are also directly connected to
the central node.
This can be various application types such as a navigation safety system, a
SLAM algorithm, an acoustic flow control system or a visualisation application.
All connections between the nodes are provided by a LAN network on the plat-
form, avoiding any wireless communication, ensuring a fast and reliable Ethernet
connection.

Fig. 2. A diagram visualising the network topology and communication between all
the nodes as well as the different states of the central node. The sensor nodes (client)
send their data packages to the central node (server). This node, being in either storage
or processing state, will identify each package and add it to a queue. The centre node
has a series of multi-threaded workers which will pick from that queue and either store
or process the measurement. In the processing state, the resulting processed imaging
data will be put on another queue. Data in this queue is sent to the correct application nodes (client) by a separate process.

There is a vast number of proprietary and open-source communication stacks for robotic sensor networks, such as the very commonly used Robot Operating System (ROS) [21] or a recently developed API called Distributed Uniform Streaming Framework (DUST) [20]. However, our imaging system requires a high computational load and is therefore less suited to implement directly in such existing frameworks. Furthermore, the simplicity of the network topology and the use of a reliable and stable wired LAN all indicate that a more lightweight, custom-designed solution was a better option. We developed the current version in Python. Nevertheless, we do not exclude the idea of implementing our sensor network into existing frameworks at a later date, once our system has matured further.
Communication between the various edge nodes and the centre node was developed as an uncomplicated client-server exchange over TCP. The sensor
nodes package and serialise the PDM recorded data combined with an identifi-
cation serial-number and a timestamp. Afterwards, they send this data package
over TCP to the central node. This node, being in either storage or processing
state, has a separate server process constantly listening for incoming packages of
sensor nodes. Each package is then unserialised, identified and added to a queue.
The centre node has a series of multi-threaded processes defined as workers which
will pick from that queue and either store or process the measurement.
When the centre node is in the processing state, the resulting processed
imaging data will be put on another queue. This queue is picked from by another process, which serialises the data and sends it to the correct application
nodes. The entire communication layer is shown in Fig. 2.
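
As an illustration of this exchange (not the framework's actual code), the Python sketch below shows a minimal length-prefixed TCP sender, a listener that unserialises and queues incoming packages, and a worker; the port number, the pickle serialisation and the process() placeholder are assumptions, and error handling is omitted.

import pickle, queue, socket, struct, threading, time

def send_measurement(host, port, serial_number, pdm_data):
    # Sensor-node side: serialise one measurement with its identification and timestamp.
    payload = pickle.dumps({"id": serial_number, "timestamp": time.time(), "pdm": pdm_data})
    with socket.create_connection((host, port)) as sock:
        sock.sendall(struct.pack("!I", len(payload)) + payload)   # length-prefixed frame

def process(measurement):
    return measurement["id"]          # placeholder for the processing pipeline of Sect. 4

def listener(port, work_queue):
    # Central-node side: accept connections, unserialise each package and queue it.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("", port))
        srv.listen()
        while True:
            conn, _ = srv.accept()
            with conn:
                size = struct.unpack("!I", conn.recv(4))[0]
                buf = b""
                while len(buf) < size:
                    buf += conn.recv(size - len(buf))
                work_queue.put(pickle.loads(buf))

def worker(work_queue, result_queue):
    # One of the workers: pick a measurement from the queue and process (or store) it.
    while True:
        result_queue.put(process(work_queue.get()))

work_q, result_q = queue.Queue(), queue.Queue()
threading.Thread(target=listener, args=(5000, work_q), daemon=True).start()
threading.Thread(target=worker, args=(work_q, result_q), daemon=True).start()
time.sleep(0.2)                        # give the listener time to start
send_measurement("127.0.0.1", 5000, "eRTIS-01", b"\x00" * 64)
print(result_q.get())                  # -> 'eRTIS-01'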

4 Real-Time Processing
While our research group had previously implemented the processing algorithm discussed in this section on a Central Processing Unit (CPU) using Python, that implementation was not capable of performing real-time computation for one or more eRTIS sensors when the number of directions of interest was increased substantially. To
achieve real-time applications making use of our 3D sonar images, an accelerated
implementation of the processing pipeline had to be made. First, we will discuss
the current processing implementation as it was created for a CPU, followed by
the benefits of using a GPU over a CPU and describing its architecture. Finally,
we will discuss the changes made to the processing algorithm to make better
usage of the GPU resources.


Fig. 3. Overview of the processing pipeline. The microphone signals containing the
reflected echoes of the detected objects are demodulated and passed through a matched
filter. Afterwards, a beamformer generates a spatial filter for every direction of interest.
The output is then put through an envelope detector for every direction. The last
visualisation step and resulting 2D image is not part of the processing but is shown
here to illustrate a possible application of the sensor imaging.

4.1 Signal Processing


For a detailed description on the process that is used to achieve the 3D data
with the eRTIS sensor, we advise the reader to read [9,18]. A brief overview
of the processing pipeline is shown in Fig. 3. As initially explained in Sect. 2, a
broadband FM-sweep is emitted and the reflected echoes are recorded by the 32
microphones on the eRTIS sensor. These signals are modulated using PDM and
communicated to the processing node over the communication channels discussed
in the previous Section. There, after being first demodulated, the signals are
processed using a matched filter.
Using broadband chirps, we can detect multiple closely spaced reflectors in
the reflection signal [18]. After this, beamforming takes place. For each direction
of interest we can calculate the delay, relative to the particular channel of the
microphone array. This allows for steering into the directions of interest with
simple delay-and-sum beamforming [19]. After beamforming, we can extract the
envelope of the signal as the representation of the reflector distribution for each
particular direction, with the time axis representing the range for that direction.
We now have a 2D or 3D image that gives us the intensity, range and direction
for each detected reflector.
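
As a minimal illustration of these stages (not the eRTIS code base), the NumPy/SciPy sketch below applies a matched filter, delay-and-sum beamforming for a single direction and envelope detection; the random input data, the integer sample delays and the Hilbert-based envelope are simplifying assumptions.

import numpy as np
from scipy.signal import hilbert

def matched_filter(mic_signals, emitted_sweep):
    # Correlate every microphone channel with the emitted FM-sweep.
    return np.array([np.correlate(ch, emitted_sweep, mode="same") for ch in mic_signals])

def delay_and_sum(filtered, delays_in_samples):
    # Shift each channel by its steering delay for this direction and sum the aligned signals.
    out = np.zeros(filtered.shape[1])
    for ch, d in zip(filtered, delays_in_samples):
        out += np.roll(ch, -int(d))
    return out

def envelope(beamformed):
    # Magnitude of the analytic signal: reflector distribution over range for this direction.
    return np.abs(hilbert(beamformed))

# Example: 32 channels of 4096 samples and one arbitrary set of steering delays.
rng = np.random.default_rng(0)
mics = rng.standard_normal((32, 4096))
sweep = rng.standard_normal(256)
profile = envelope(delay_and_sum(matched_filter(mics, sweep), rng.integers(0, 8, 32)))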

4.2 GPU Architecture

CUDA [12] has enabled programming on the GPU beyond rendering of graph-
ics. This allows to speed up and parallelise heavy computational tasks. Recently,
multiple works have been published using GPUs for audio signal processing
[2,3,11] and (sonar) beamforming algorithms such as (Linearly Constrained)
Minimum Variance, Minimum Variance Distortionless Response, Generalised
Sidelobe Canceller and Capon [1,5,6,10]. Similarly, our research group looked
at GPUs to provide the potential to perform real-time processing of an eRTIS
sonar with a high number of directions during beamforming to create a high res-
olution 3D image of the sensor’s frontal hemisphere. Our processing algorithm,
which is computationally expensive and needs to be used in parallel to allow
for multiple sensors to be processed simultaneously is therefore an ideal case for
using a GPU.
The architecture of an NVIDIA GPU has been consistent over the last few
generations. It is composed of multiple Streaming Multiprocessors (SM). Each
SM contains a set of pipelined CUDA cores. For high-speed, high bandwidth
memory access, an NVIDIA GPU has access to device memory (global) and faster on-chip (shared) memory. Device code that is written to run on the GPU cores is called a kernel function. One such kernel can be run concurrently by multiple cores; each executing instance is then called a thread. Threads
are grouped in thread blocks, with a maximum of 1024 threads for each block. A
block can be designed to organise threads in up to three dimensions. All threads
in a block have access to the same shared memory. The GPU will schedule the
blocks over each SM. An SM can run up to 32 parallel threads, which is called a
warp. Using all 32 threads in a warp is important, as not optimally using a single
SM will increase latency. During the validation of our system we used 3 different
GPUs. The Jetson TX2, GTX 1050 and GTX 1080 Ti representing what could
be used on a mobile platform with low power usage, a laptop GPU and a high-
end desktop GPU respectively. Their specifications can be found in Table 1. All
share the same restrictions on thread block dimensions and maximum amount
of blocks for each SM. The specific architecture of an NVIDIA GPU requires
careful adaptation of the algorithms to exploit its full power. The changes are
detailed in the next subsection.

Table 1. Validation NVIDIA GPU Specifications

Jetson TX2 GTX 1050 GTX 1080 Ti


Device memory 8 GB LPDDR4 4 GB GDDR5 11 GB GDDR5X
CUDA cores 256 640 3584
Streaming Multiprocessors 2 5 28
GPU clock speed 1301 MHz 1190 MHz 1582 MHz
Memory clock speed 1600 MHz 3504 MHz 5505 MHz
Power usage under heavy load ±15 W ±50 W ±250 W

4.3 GPU Algorithm Implementation

For this initial implementation of the processing pipeline discussed in Subsect. 4.1, we mostly looked at a direct application of the algorithm developed for the CPU and rewrote it for the GPU. This CPU algorithm was created in Python, making substantial use of the NumPy and SciPy packages, known for
their powerful N-dimensional array and signal processing functions respectively.
For this paper we will focus on the aspects that were done differently for optimal
usage of the GPU during the rewrite of the algorithm.
A recurring element in many parts of the processing algorithm is the set of digital signal filters being used. They are used during the PDM demodulation, the matched filter and envelope detection. In the CPU implementation, we design these filters as Butterworth filters and make use of SciPy's lfilter function. To perform this faster and in parallel on the GPU, we make use of FFT convolution filters. For this we can use the CuFFT CUDA library for parallel FFT and inverse FFT execution. Furthermore, we can easily implement parts of the filter, such as the element-wise multiplication of the complex FFTs or removing the group delay introduced by the filter, using GPU kernels.
Outside of the filters, other aspects such as the unpacking of the binary
data before the PDM demodulation, subsampling before the matched filter and
after envelope detection to increase computational performance and delay-and-
sum beamforming were implemented as device kernels. By optimally dividing
the dimensions of the matrices used in the algorithm into the correct block sizes to fill a complete warp on an SM, one can make excellent use of the GPU's resources. As discussed in Subsect. 4.2, the GPUs used for this paper share the same limitations for the maximum block dimensions, blocks per SM and threads per block. This makes it easier to decide the optimal parameters for the kernels. Moreover, an additional performance increase was achieved by preallocating device memory before we start the processing node. We noticed that
allocating and freeing memory for each worker for each measurement decreased
performance substantially. Given that our matrix sizes and input variables are
consistent, we can perform this step once and free the device memory when we
stop the entire processing node at the end.
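
The sketch below shows, in plain NumPy, what such an FFT convolution filter stage computes (forward FFT, element-wise multiplication, inverse FFT and group-delay removal); the actual implementation performs these steps with CuFFT and custom kernels, and the FIR design and cutoff used here are arbitrary examples.

import numpy as np
from scipy.signal import firwin

def fft_convolution_filter(signal, taps):
    n = len(signal) + len(taps) - 1
    spectrum = np.fft.rfft(signal, n) * np.fft.rfft(taps, n)   # element-wise multiply
    filtered = np.fft.irfft(spectrum, n)
    group_delay = (len(taps) - 1) // 2                         # delay of a linear-phase FIR
    return filtered[group_delay:group_delay + len(signal)]     # compensate the group delay

fs = 450_000                              # eRTIS sample rate mentioned in Sect. 2
taps = firwin(255, 80_000, fs=fs)         # example low-pass FIR; cutoff chosen arbitrarily
x = np.random.default_rng(1).standard_normal(16_384)
y = fft_convolution_filter(x, taps)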

5 Experimental Validation
For validating the processing on a GPU we used three different models as shown
in Table 1. For comparison, we used three Intel CPUs: an i7-7567U (2 cores, up to 4 GHz), an i7-7700HQ (4 cores, up to 3.8 GHz) and an i9-7960X (16 cores, up to 4.2 GHz). Similarly to the GPU choices, these represent what could
be expected on a mobile platform with low power usage, a laptop and a high-
end desktop respectively. We ran 3 different processing configurations for each
hardware setup. First, a 2D image of the horizontal plane in front of the sensor,
using 90 directions over the entire azimuth range. A second configuration was
a 3D image with 1850 directions distributed between −45° and 45° for both
azimuth and elevation. Finally, a 3D image of the full hemisphere in front of the
sensor, using 3000 directions. For every configuration, 100 measurements were
processed and the computation duration was recorded. The results can be found
in Table 2.

Table 2. Computing Time Results - Mean and Standard Deviation

90 directions 1850 directions 3000 directions


CPU i7-7567U 132.53 (4.89) 976.33 (14.54) 1541.41 (20.30)
CPU i7-7700HQ 187.85 (19.94) 1410.85 (183.50) 2080.59 (232.22)
CPU i9-7960X 167.13 (16.88) 1087.04 (109.72) 1653.45 (167.03)
GPU Jetson TX2 68.40 (13.40) 585.24 (61.02) 911.47 (91.97)
GPU GTX 1050 23.03 (2.38) 202.37 (20.85) 323.83 (33.18)
GPU GTX 1080 Ti 5.93 (0.66) 51.58 (6.38) 77.02 (7.81)
Standard deviation is listed between brackets. All values are in milliseconds.

One comment we would like to make when looking at the results is the remarkable
performance of the i7-7567U compared to the other CPUs. A partial explana-
tion is that the CPU implementation of the processing algorithm was done in
Python, making heavy use of the NumPy and SciPy packages. Most of their functions are not multi-threaded, and the Python Global Interpreter Lock (GIL) limits the Python interpreter to only have one thread in a state of execution at any point in time [13]. This severely limits our CPU algorithm in making full use of the i9-7960X's 16 cores. However, if we wanted to process multiple eRTIS sensors at the same time, we could make use of Python's multiprocessing package to run multiple workers simultaneously, making use of multiple cores (which we do, as explained in Sect. 3 when describing the workers). In such cases, the i9 would outclass the other CPUs in our tests. But for these results we are only looking at single-sensor processing. Nevertheless, we did not find an explanation for why the i7-7567U performs better in single-thread performance than the i9-7960X, as on paper the latter should be better. Lastly, given these results and after doing additional experiments, we found that on the Jetson TX2, well suited for a mobile platform, up to three eRTIS sensors could be processed at 5 Hz
when using 90 directions of interest. Moreover, on the GTX 1080 Ti, a single
eRTIS sensor could be processed at 10 Hz with 3000 directions covering the entire
frontal hemisphere.

6 Conclusion and Future Work


With the sensor network we built, we can now develop new application types much more easily, with this framework as a basis. From a mobile platform to a larger
desktop system, the network can be easily installed due to its flexibility. Fur-
thermore, our results validate that the initial implementation of our processing
algorithm on a GPU already heavily outperforms any CPU tested. Therefore,
using a GPU and a processing configuration appropriate for the application and
available GPU resources allows for one or more eRTIS sensors to be processed
in real-time. Additional research has to be done to optimise the GPU algorithm
further, notably for the computationally expensive steps such as beamforming
and filters. Furthermore, a comparison should be made to a CPU implementation
where instead of Python, an optimised C(++) algorithm is developed.
We intend to continue our work on this sensor network framework and
GPU/CPU algorithms to support more application use-cases and lower the
cost in terms of price and in terms of the computing requirements for all
configurations.

References
1. Asen, J.P., Buskenes, J.I., Nilsen, C.I.C., Austeng, A., Holm, S.: Implementing
Capon beamforming on a GPU for real-time cardiac ultrasound imaging. IEEE
Trans. Ultrason. Ferroelectr. Freq. Control 61(1), 76–85 (2014). https://doi.org/
10.1109/TUFFC.2014.6689777
2. Belloch, J.A., Ferrer, M., Gonzalez, A., Martinez-Zaldivar, F., Vidal, A.M.:
Headphone-based virtual spatialization of sound with a GPU accelerator. J. Audio
Eng. Soc. 61(7/8), 546–561 (2013)
3. Belloch, J.A., Bank, B., Savioja, L., Gonzalez, A., Valimaki, V.: Multi-channel IIR
filtering of audio signals using a GPU. In: 2014 IEEE International Conference on
Acoustics, Speech and Signal Processing, pp 6692–6696. IEEE (2014). https://doi.
org/10.1109/ICASSP.2014.6854895
4. Brooks, R.: A robust layered control system for a mobile robot. IEEE J. Robot.
Autom. 2(1), 14–23 (1986). https://doi.org/10.1109/JRA.1986.1087032
5. Buskenes, J.I., Åsen, J.P., Nilsen, C.I.C., Austeng, A.: Adapting the minimum vari-
ance beamformer to a graphics processing unit for active sonar imaging systems.
J. Acoust. Soc. Am. 133(5), 3613–3613 (2013). https://doi.org/10.1121/1.4806739
6. Buskenes, J.I., Asen, J.P., Nilsen, C.I.C., Austeng, A.: An optimized GPU imple-
mentation of the MVDR beamformer for active sonar imaging. IEEE J. Oceanic
Eng. 40(2), 441–451 (2015). https://doi.org/10.1109/JOE.2014.2320631
7. Griffin, D.R.: Listening in the Dark: The Acoustic Orientation of Bats and Men.
Dover Publications (1974)

8. Kerstens R., Laurijssen, D., Schouten, G., Steckel, J.: 3D point cloud data acqui-
sition using a synchronized in-air imaging sonar sensor network. In: IEEE/RSJ
International Conference on Intelligent Robots and Systems (2019, to be published)
9. Kerstens, R., Laurijssen, D., Steckel, J.: eRTIS: a fully embedded real time 3D
imaging sonar sensor for robotic applications. In: IEEE International Conference
on Robotics and Automation (2019, to be published)
10. Lorente, J., Vidal, A.M., Pinero, G., Belloch, J.A.: Parallel implementations of
beamforming design and filtering for microphone array applications. In: European
Signal Processing Conference (2011)
11. Schneider, M., Schuh, F., Kellermann, W.: The generalized frequency-domain
adaptive filtering algorithm implemented on a GPU for large-scale multichannel
acoustic echo cancellation. In: Speech Communication, 10. ITG Symposium, VDE
Verlag GmbH, p. 296 (2012)
12. NVIDIA: CUDA Toolkit (2019). https://docs.nvidia.com/cuda/index.html
13. Python: GlobalInterpreterLock (2019). https://wiki.python.org/moin/Global
InterpreterLock
14. Schnitzler, H.U., Moss, C.F., Denzinger, A.: From spatial orientation to food acqui-
sition in echolocating bats. Trends Ecol. Evol. 18(8), 386–394 (2003). https://doi.
org/10.1016/S0169-5347(03)00185-X
15. Steckel, J.: RTIS (2019). https://www.3dsonar.eu/
16. Steckel, J., Peremans, H.: BatSLAM: simultaneous localization and mapping
using biomimetic sonar. PLoS ONE 8(1), e54,076 (2013). https://doi.org/10.1371/
journal.pone.0054076
17. Steckel, J., Peremans, H.: Acoustic flow-based control of a mobile platform using
a 3D sonar sensor. IEEE Sens. J. 17(10), 3131–3141 (2017). https://doi.org/10.
1109/JSEN.2017.2688476
18. Steckel, J., Boen, A., Peremans, H.: Broadband 3-D sonar system using a sparse
array for indoor navigation. IEEE Trans. Rob. 29(1), 161–171 (2013). https://doi.
org/10.1109/TRO.2012.2221313
19. Van Trees, H.L.: Optimum Array Processing. Wiley, New York (2002). https://
doi.org/10.1002/0471221104
20. Vanneste, S., de Hoog, J., Huybrechts, T., Bosmans, S., Sharif, M., Mercelis, S.,
Hellinckx, P.: Distributed uniform streaming framework: towards an elastic fog
computing platform for event stream processing. In: International Conference on
P2P, Parallel, Grid, Cloud and Internet Computing, pp. 426–436 (2018). https://
doi.org/10.1007/978-3-030-02607-339
21. Garage, W.: Stanford Artificial Intelligence Laboratory. ROS.org (2019). http://
www.ros.org/
Comparing Machine Learning Algorithms
for RSS-Based Localization in LPWAN

Thomas Janssen(B) , Rafael Berkvens, and Maarten Weyn

University of Antwerp - imec, The Beacon, Sint-Pietersvliet 7, 2000 Antwerp, Belgium


{thomas.janssen,rafael.berkvens,maarten.weyn}@uantwerpen.be

Abstract. In smart cities, a myriad of devices is connected via Low


Power Wide Area Networks (LPWAN) such as LoRaWAN. There is a
growing need for location information about these devices, especially to
facilitate managing and retrieving them. Since most devices are battery-
powered, we investigate energy-efficient solutions such as Received Signal
Strength (RSS)-based fingerprinting localization. For this research, we
use a publicly available dataset of 130 426 LoRaWAN fingerprint mes-
sages. We evaluate ten different Machine Learning algorithms in terms of
location accuracy, R2 score, and evaluation time. By changing the rep-
resentation of the RSS data in the most optimal way, we achieve a mean
location estimation error of 340 m when using the Random Forest regres-
sion method. Although the k Nearest Neighbor (k NN) method leads to
a similar location accuracy, the computational performance decreases
compared to the Random Forest regressor.

1 Introduction

Locating a device in smart cities is becoming an increasingly challenging problem. The number of Internet of Things (IoT) devices connected to Low Power
Wide Area Networks (LPWAN) is forcing network operators to improve the scal-
ability of their networks. Furthermore, these mobile devices are typically powered
by a small battery that needs to last for several years. Sensors reporting air qual-
ity and smart water level meters are just a few examples of the growing need to
locate devices throughout the city.
LPWANs are being used as an alternative to ordinary Global Navigation
Satellite System (GNSS) receivers, which consume a significant amount of power.
Besides, satellite-based solutions are not always desired, given their limitations
in indoor environments, i.e. signals not penetrating well through walls. Sigfox,
LoRaWAN and NB-IoT are the most commonly used LPWAN technologies [10].
While the latter operates in the licensed spectrum with low latency, Sigfox and
LoRaWAN benefit from a longer range and battery life [7].
Several approaches exist to locate a transmitting device in an LPWAN. In
every approach, a trade-off must be made between location accuracy and energy
consumption. However, when comparing different studies of the same approach,
several other parameters need to be considered. For example, the cost and effort
to train a model or install equipment needs to be taken into account. Moreover, the indoor or outdoor environment and the number of receiving gateways also play a significant role in the resulting localization accuracy [8]. For Time
Difference of Arrival (TDoA) and Angle of Arrival (AoA) approaches, the gate-
ways and antennas need to be synchronized, respectively. Several state-of-the-
art TDoA algorithms are compared in [6]. TDoA-based positioning and tracking
with LoRaWAN are topics discussed in [9]. In this paper, we focus on Received
Signal Strength-based (RSS) fingerprinting.
Outdoor RSS-based fingerprinting localization can be challenging, given the
time and effort needed to create a training database and the dynamic environ-
ment of a city. However, Aernouts et al. managed to collect a large amount of
RSS samples, together with GPS coordinates as ground truth data, in the city of
Antwerp, Belgium [1]. Both Sigfox and LoRaWAN messages were collected. In
previous research, we performed outdoor fingerprinting using Sigfox with a basic
k Nearest Neighbors (k NN) algorithm [5]. The mean location estimation error
was 340 m. Meanwhile, the LoRaWAN dataset size has grown to 130 426 sam-
ples. In this research, we want to explore and compare more advanced Machine
Learning algorithms using this dataset. In the state-of-the-art, Support Vector
Machines (SVM) are being used to classify an RSS fingerprint into a correct GPS
node class in Wireless Sensor Networks (WSN) [11], in indoor environments [4]
and in simulation environments [14]. Furthermore, several Machine Learning
algorithms are evaluated in terms of location accuracy and computation time in
an indoor environment [3]. In this research, we evaluate ten different Machine
Learning algorithms using real measurement data collected in a city-scale out-
door environment.
The remainder of this chapter is organized in the following way. In Sect. 2,
we explain into more detail how the fingerprinting database is created and what
preprocessing steps were taken to input the data into the algorithms. Next, the
benefits and limitations of each regression algorithm are briefly discussed and we
justify how parameter values are chosen. In Sect. 3, we evaluate each algorithm
in terms of evaluation time, location estimation error and R2 score. Section 4
summarizes our main findings and lists the future work.

2 Methodology
In this section, we will describe the steps taken to evaluate every Machine Learn-
ing algorithm. First, some preprocessing steps need to be taken in order to rep-
resent our data in the most optimal way. Second, we will explain into detail how
each Machine Learning algorithm works and what parameters are optimized.
For each algorithm, the benefits and limitations are listed. Third, we define the
metrics used to evaluate each algorithm.

2.1 Dataset
The data that is being used to evaluate each algorithm is collected and updated
by Aernouts et al. [1]. Version 1.2 of the publicly available dataset consists of
a table with 130 426 rows, each row representing a LoRaWAN message sent in
the city of Antwerp, Belgium. These messages are plotted in Fig. 1. For each
message, the RSS values to all 72 LoRaWAN gateways are represented in the
columns, appended by the ground truth GPS coordinate and some meta data. If
a message is not received by a gateway, the RSS value in that gateway column
is set to −200 dBm. Given the spatial spread of the LoRaWAN gateways, a lot
of RSS values are set to −200 dBm. This emphasizes the challenge to represent
the data in a proper way and extract the relevant data above the noise floor.

Fig. 1. The dataset consists of 130 426 LoRaWAN messages collected in Antwerp,
Belgium. © 2019 OpenStreetMap contributors

To put this data into a Machine Learning perspective, each message corresponds to a single sample and each receiving gateway to a feature. Therefore, the dataset consists of 130 426 samples and 72 features in total.
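
A possible way to load the fingerprint table is sketched below; the file name and the column naming (a gateway prefix followed by latitude/longitude columns) are assumptions about the public dataset and may not match its exact layout.

import pandas as pd

df = pd.read_csv("lorawan_dataset_antwerp.csv")                 # hypothetical file name
rss_columns = [c for c in df.columns if c.startswith("BS")]     # hypothetical gateway prefix
X = df[rss_columns].to_numpy()              # (130426, 72) RSS matrix, -200 dBm = not received
y = df[["latitude", "longitude"]].to_numpy()                    # ground-truth GPS coordinates
print(X.shape, (X == -200).mean())          # fraction of non-received RSS values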

2.2 Preprocessing Steps

In Machine Learning, data preprocessing is required to prepare raw, sometimes


incomplete data for further processing. Hence, before feeding the Machine Learn-
ing algorithms with the dataset, some preprocessing steps are taken.
A first step is to transform the RSS data into another format. In the raw
data, the RSS values are represented in decibels relative to a milliwatt (dBm).
Torres-Sospedra et al. suggest four different RSS representations in order to
increase the localization accuracy [13]. The first two representations, positive
and normalized, can be seen as linear transformations of the dataset. The posi-
tive representation maps the minimum RSS value to a value of zero, resulting in
all positive RSS values. In this way, all −200 dBm values are mapped to a value
of zero. In the normalized representation, the positive values are mapped to the
range [0...1]. Since signal strengths are represented in a logarithmic way, it is
better to map the RSS values in a logarithmic way as well. Therefore, the ‘expo-
nential’ and ‘power’ representations are introduced in [13]. In our experiments,
we evaluate the localization accuracy when using the normalized, exponential
and power RSS representation.
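
The four representations can be sketched as follows; the exact formulas, the α and β parameters and the normalisation constants should be taken from [13], so the values below are placeholders.

import numpy as np

def positive(rss, rss_min=-200.0):
    return rss - rss_min                        # -200 dBm (not received) maps to zero

def normalized(rss, rss_min=-200.0):
    return positive(rss, rss_min) / (-rss_min)  # map into the range [0, 1]

def exponential(rss, rss_min=-200.0, alpha=24.0):
    return np.exp(positive(rss, rss_min) / alpha) / np.exp(-rss_min / alpha)

def powed(rss, rss_min=-200.0, beta=np.e):
    return positive(rss, rss_min) ** beta / (-rss_min) ** beta

X_norm, X_exp, X_pow = normalized(X), exponential(X), powed(X)   # X from the loading sketch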
Although some scaling is already introduced by the RSS representations of the previous paragraph, it is recommended to additionally apply the StandardScaler of the sklearn package in Python, which standardizes the features so that they can be fed optimally to the Machine Learning algorithms in the sklearn package.
Finally, a Principal Component Analysis (PCA) is performed on the
dataset. With PCA, we can extract the most relevant features out of the dataset.
This is highly desired, since not all gateways receive each message, thus reducing
the amount of features and noise. Moreover, extracting the principal components
of the dataset is often used to decrease the evaluation time of each Machine
Learning algorithm. We performed PCA on the dataset with 95% of the variance
retained, resulting in a reduction from 72 to 40 components.
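As an illustration of this preprocessing chain, the sketch below applies a simplified stand-in for the powed representation, standardization and PCA with 95% retained variance. It is not the authors' code: the data is a random placeholder and the exact powed transform of [13] may differ from this simplified version.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Placeholder for the LoRaWAN fingerprints: messages x 72 gateways,
# with non-receiving gateways set to -200 dBm (random data for illustration only).
X_raw = np.full((10_000, 72), -200.0)
received = rng.random(X_raw.shape) < 0.1
X_raw[received] = rng.uniform(-120, -50, received.sum())

def powed(rss, min_rss=-200.0, beta=np.e):
    # Simplified stand-in for the "powed" representation of [13]:
    # shift to positive values, normalize to [0, 1], then raise to a power.
    positive = rss - min_rss
    return (positive / (0.0 - min_rss)) ** beta

X = StandardScaler().fit_transform(powed(X_raw))  # standardize every gateway feature
pca = PCA(n_components=0.95)                      # retain 95% of the variance
X_pca = pca.fit_transform(X)
print(X_pca.shape)                                # on the real dataset: 72 -> about 40 components
```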

2.3 Regression Algorithms

Since we are predicting continuous-valued output, regression-based algorithms


are most suitable for our supervised Machine Learning problem. In total, ten
different regression algorithms are evaluated on our dataset. These algorithms
can be classified into four different categories. For every category of algorithms,
we will explain how each algorithm works and how the parameters are tuned.
Finally, the benefits and limitations of each algorithm are discussed. In our
experiments, each algorithm is benchmarked in terms of location estimation
error as the Vincenty distance between the GPS coordinate and the estimated
coordinate; the R2 score of sklearn; and evaluation time as the elapsed time
between the fitting of a model and the estimation of an output coordinate, using
a Virtual Machine with 32 GB RAM memory and 10 CPU cores, of which 6 are
used in parallel in every algorithm.
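To make these three metrics concrete, the sketch below benchmarks a regressor on placeholder data. The great-circle (haversine) distance is used here as a dependency-free substitute for the Vincenty distance used in the paper, and all variable names are illustrative.

```python
import time
import numpy as np
from sklearn.metrics import r2_score
from sklearn.neighbors import KNeighborsRegressor

def haversine_m(lat1, lon1, lat2, lon2):
    # Great-circle distance in metres; the paper uses the Vincenty (ellipsoidal) distance,
    # for which haversine is a close approximation over city-scale distances.
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = np.sin((lat2 - lat1) / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * np.arcsin(np.sqrt(a))

def benchmark(model, X_train, y_train, X_test, y_test):
    # Report the three metrics of Sect. 3: mean location error, R2 score and the elapsed
    # time between fitting the model and estimating the output coordinates.
    t0 = time.perf_counter()
    y_pred = model.fit(X_train, y_train).predict(X_test)
    elapsed = time.perf_counter() - t0
    errors = haversine_m(y_test[:, 0], y_test[:, 1], y_pred[:, 0], y_pred[:, 1])
    return errors.mean(), r2_score(y_test, y_pred), elapsed

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 40))                         # placeholder for the PCA features
y = np.column_stack([51.2 + 0.01 * rng.random(1000),    # placeholder (lat, lon) ground truth
                     4.40 + 0.01 * rng.random(1000)])
print(benchmark(KNeighborsRegressor(n_neighbors=15), X[:800], y[:800], X[800:], y[800:]))
```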

2.3.1 Linear Regression Algorithms


Linear regression algorithms attempt to fit a linear function from the provided
training data and estimate the numeric output values, given new input values.
By fitting the function, a linear Machine Learning model is created. This model
can be represented in the form of Eq. (1), where x is the feature vector of length
n and w and b are the parameters that need to be learned by training the model.

ypred = w[0] ∗ x[0] + w[1] ∗ x[1] + ... + w[n] ∗ x[n] + b (1)


Several variations of linear regression algorithms exist. In this research we
evaluate the following linear algorithms: Ordinary Least Squares, Ridge, Lasso,
Elastic Net, Stochastic Gradient Descent and Polynomial regression.

As the name suggests, Ordinary Least Squares (OLS) is the most basic linear
regression algorithm. In this algorithm, the Mean Squared Error (MSE) between
the predicted and real output values is minimized. If the features of the data are
correlated, the number of random errors in the target values increases. This phe-
nomenon is called multicollinearity. Therefore, the independence of the features
is very important in the OLS algorithm.
Ridge regression is similar to OLS, with the difference that in Ridge regression
the magnitude of the coefficients w is reduced by a factor α, as can be seen
in Eq. (2). This constraint is known as ℓ2 regularization. We find the optimal
value of α by evaluating Ridge regression with cross-validation. During the cross-
validation, we iterate over values ranging from 5 × 10⁻² to 5 × 10⁹. In this way,
the optimal value of α is found to be 3290.

\min_w \; \|Xw - y\|_2^2 + \alpha \|w\|_2^2    (2)
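A possible way to reproduce this tuning with sklearn's RidgeCV is sketched below; the alpha grid is an assumption that merely spans the range quoted above, and X and y are random placeholders for the preprocessed features and GPS targets.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(0)
X, y = rng.normal(size=(2000, 40)), rng.normal(size=(2000, 2))   # placeholders for X_pca and the targets

# Log-spaced grid roughly covering 5e-2 .. 5e9; the authors' exact grid is not specified.
ridge = RidgeCV(alphas=5 * np.logspace(-2, 9, 45), cv=5).fit(X, y)
print(ridge.alpha_)   # on the real dataset the reported optimum is about 3290
```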

In the Lasso regression algorithm, some coefficients in Eq. (1) are set to zero.
Consequently, some features are ignored by the model. This is called ℓ1 regularization. Similar to Ridge, the optimal value of α equal to 2.52 × 10⁻⁵ is found by
cross-validation. With only a few non-zero weights, the advantage of the Lasso
regression is the reduced amount of time needed to train the model.
The Elastic-Net linear regression model performs both ℓ1- and ℓ2-norm regularization of the coefficients. In fact, it is a combination of the Ridge and Lasso
algorithms, in the sense that there are fewer non-zero weights and the regular-
ization properties of Ridge are maintained.
Stochastic Gradient Descent (SGD) is a Machine Learning algorithm that can
be used for classification and regression problems. It is often used for training
artificial neural networks. However, it can also be used for training linear regres-
sion models. Gradient Descent is a method to find the values of a function that
minimizes the cost function. To find the optimal values, the initial parameters
are constantly updated. Thus, Gradient Descent is an iterative method, resulting
in slower training times. In the stochastic variant of this algorithm, one iterates
over a few randomly selected samples, reducing the computational complexity in
large datasets. Furthermore, different loss functions and corresponding parame-
ters have been evaluated on our dataset. The Huber loss function with ε equal to 1 × 10⁻² and a stopping criterion equal to 1 × 10⁻³ leads to the most accurate results (i.e. smallest location estimation error). The Huber loss function uses a squared loss and, past a distance of ε, a linear loss function.
Therefore, it becomes less sensitive to outliers in the dataset.
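A minimal sklearn sketch of this configuration is given below. SGDRegressor handles a single target, so one regressor per output coordinate is wrapped in a MultiOutputRegressor, and the data is again a random placeholder.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)
X, y = rng.normal(size=(2000, 40)), rng.normal(size=(2000, 2))   # placeholders for the preprocessed data

# Huber loss with epsilon = 1e-2 and stopping criterion tol = 1e-3 as quoted above;
# all remaining settings are sklearn defaults.
sgd = MultiOutputRegressor(SGDRegressor(loss="huber", epsilon=1e-2, tol=1e-3, max_iter=1000))
sgd.fit(X, y)
print(sgd.predict(X[:3]))
```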
The last linear regression algorithm we discuss is the polynomial regression,
which can be seen as a particular case of multiple linear regression. In this
algorithm, a linear function is fitted within a higher-dimensional space spanned by polynomial combinations of the features, and the polynomial degree can be tuned. In our case, the localization accuracy
was maximized when reducing the degree to 1. This approach allows to fit a
much wider range of data and benefits from the relatively fast performance of
linear regression algorithms.

2.3.2 Support Vector Regression


Support Vector Machines (SVM) can be used in classification and regression
problems, both in linear and multi-dimensional cases. In simple regression algo-
rithms, we attempt to minimize the error rate, while in Support Vector Regres-
sion (SVR), the goal is to maximize the margin between the hyperplane and
the support vectors (i.e. the data points closest to that hyperplane). In other words, we need to find a function that has at most ε deviation from the actually obtained targets in the training data [2]. Hence, a loss function with a margin of tolerance ε is defined because of the real-numbered target values.
Furthermore, SVR is characterized by a kernel function that maps lower dimen-
sional data to higher dimensional data. We implemented an SVR with a third-
degree polynomial kernel function and free parameters ε = 0.01 and C = 1000,
determined experimentally. The latter controls the penalty imposed on observa-
tions that lie outside the ε margin and helps to prevent overfitting.
advantage of SVR is that the computational complexity does not depend on the
dimensionality of the input space. However, when the number of samples in the
dataset exceeds a few tens of thousands, the algorithm can be computationally
demanding.
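The SVR configuration described above can be sketched as follows (placeholder data, one single-output SVR per coordinate):

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)
X, y = rng.normal(size=(500, 40)), rng.normal(size=(500, 2))   # small placeholder set; SVR scales poorly in n_samples

# Third-degree polynomial kernel with the experimentally determined epsilon-tube and penalty C.
svr = MultiOutputRegressor(SVR(kernel="poly", degree=3, epsilon=0.01, C=1000))
svr.fit(X, y)
print(svr.predict(X[:3]))
```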

2.3.3 k Nearest Neighbors


The k Nearest Neighbors algorithm is an intuitive yet effective Machine Learning
approach which is often used in indoor and outdoor localization applications [5,
12]. During the offline training phase of the algorithm, the feature vectors and
target values are only stored. In the online evaluation phase, we want to estimate
the target values (i.e. the GPS coordinates) of a test feature vector. This is done
by calculating the distance or similarity between each training vector and the
test vector. This can be done by using various distances [5,13]. As a final step, the
centroid of the target values of the k smallest distances is used as the estimate
for the target values, where k is user-defined. In our experiments, we iterate
over k ranging from 1 to 20, thus optimizing the number of nearest neighbors.
In the weighted variant of the k NN algorithm, neighbors that are closer to the
test sample have a greater influence than neighbors which are farther away. The
benefits of kNN are the fact that there is no explicit training phase, the simplicity
of the algorithm and the variety of distances to choose from. On the contrary,
the user-defined value of k, the computational complexity and the high outlier
sensitivity are the limitations of the algorithm.
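A sketch of the weighted k NN tuning described above, again on placeholder data; the cross-validated grid search is our illustration, as the paper only states that k is iterated from 1 to 20.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X, y = rng.normal(size=(2000, 40)), rng.normal(size=(2000, 2))   # placeholders for the preprocessed data

# Distance-weighted kNN with k searched over 1..20, using 6 parallel jobs as in Sect. 2.3.
grid = GridSearchCV(KNeighborsRegressor(weights="distance"),
                    {"n_neighbors": list(range(1, 21))}, cv=5, n_jobs=6)
grid.fit(X, y)
print(grid.best_params_)   # the paper reports k = 15 as optimal on the real dataset
```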

2.3.4 Random Forest


Random Forest (RF) is an ensemble technique, i.e. multiple Machine Learning
algorithms are combined to solve classification or regression problems. The gen-
eral idea is to construct multiple decision trees during the training phase and
output the mean of all individual predictions as an estimated target value. The
technique randomly samples the training observations when building the trees.
Random Forest has been proven to outperform other Machine Learning algo-
rithms in terms of indoor fingerprinting localization accuracy [15]. Despite the

computational complexity of each individual decision tree, the overall training


and matching speeds are very fast, even for high-dimensional input data. Finally,
since multiple algorithms are combined, overfitting is reduced significantly and
the stability of the technique increases. In our implementation, we chose 100 esti-
mators that are combined using the bootstrap aggregation (bagging) technique
in sklearn. Beyond this number of trees, the gain in prediction performance from learning more trees no longer made up for the computational cost of learning these additional trees.
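In sklearn this corresponds to a RandomForestRegressor with 100 bootstrap-aggregated trees, sketched below on placeholder data:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X, y = rng.normal(size=(2000, 40)), rng.normal(size=(2000, 2))   # placeholders for the preprocessed data

# 100 estimators combined with bagging (bootstrap=True is the sklearn default),
# using 6 parallel jobs as mentioned in Sect. 2.3.
forest = RandomForestRegressor(n_estimators=100, bootstrap=True, n_jobs=6, random_state=0)
forest.fit(X, y)
print(forest.predict(X[:3]))
```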

3 Results

The ten algorithms discussed in the previous section are now evaluated in terms
of location estimation error, R2 score and the time needed for the Virtual
Machine with 6 CPU cores to compute the results. In Table 1, these metrics
are summarized for every algorithm and for the lineal, exponential and powed
representations of the fingerprinting dataset. A box plot of localization errors
for every algorithm using the powed RSS representation is shown in Fig. 2.
As one can observe, the optimal RSS representation depends on the algo-
rithm being evaluated. For example, the exponential RSS representation yields
the smallest location estimation errors when evaluating the linear regression
algorithms. By using this representation, the accuracy of all linear algorithms is
around 785 m. Given their simplicity, the linear regression algorithms yield higher
errors but are computed in the least amount of time. The R2 score, which gives
an indication of how close the actual target values are to the fitted regression
line, increases from 0.70 using positive RSS to 0.73 using exponential RSS.

Table 1. Mean location estimation error (in [m]), R2 score, and evaluation time (in [s]) for every Machine Learning algorithm using the lineal, exponential and powed RSS representation of the LoRaWAN dataset.

Algorithm            Lineal RSS             Exponential RSS        Powed RSS
                     Error   R2    Time     Error   R2    Time     Error   R2    Time
k NN                 354     0.90  131      345     0.90  517      349     0.90  131
k NN weighted        348     0.90  146      344     0.90  484      343     0.90  147
SVR                  1148    0.57  1168     1206    0.49  1616     1155    0.55  1162
Linear OLS           801     0.70  1.19     785     0.73  1.15     786     0.72  1.08
Linear Ridge         800     0.70  0.82     785     0.73  0.84     785     0.72  0.83
Linear Lasso         801     0.70  0.88     785     0.73  0.90     786     0.72  0.98
Linear Elastic Net   801     0.70  0.94     785     0.73  0.92     786     0.72  0.98
Linear SGD           799     0.70  11       784     0.73  11       784     0.72  12
Linear Polynomial    801     0.70  1.20     785     0.73  1.20     786     0.72  1.29
Random Forest        351     0.91  56       609     0.84  56       340     0.91  53

Fig. 2. Location estimation errors for every regression algorithm using the powed RSS
representation of the LoRaWAN dataset

Out of all evaluated algorithms, the Support Vector Regression seems to


be the least performing, both in terms of R2 score and evaluation time. The
increased evaluation time is caused by the use of kernels, which is very time
consuming. Additionally, the training complexity in SVR is highly dependent
on the number of samples in the dataset. Hence, taking into account the mean
localization error of over 1 km, SVR is not a good choice to evaluate with our
large dataset.
In contrast, the (weighted) k NN and Random Forest algorithms yield the
best results regarding localization accuracy and R2 score. The weighted variant
of the k NN algorithm is slightly more accurate than the basic version, leading to a
mean location estimation error of 343 m using powed RSS and k = 15. Similarly,
the Random Forest ensemble technique results in an accuracy of 340 m, using
powed RSS as well. While it takes 147 s to compute the results in the k NN
algorithm, Random Forest takes advantage of its bagging technique, reducing
the computation time to 53 s.

4 Conclusion and Future Work

In this work, we evaluated ten different Machine Learning algorithms on a


dataset with 130 000 samples, consisting of RSS values to 72 LoRaWAN base
stations. Given the GPS coordinates as ground truth data, the objective was

to compare different Machine Learning algorithms that locate the transmitting


device based on this RSS data. To optimize the input of every algorithm, we
changed the representation of the RSS data, performed scaling and PCA analy-
sis on the dataset as preprocessing steps.
We evaluated every algorithm based on location estimation error, total eval-
uation time and R2 score. Since the fit time complexity of SVR is more than
quadratic in the number of samples, the algorithm is unable to scale to datasets with more than a few tens of thousands of samples, resulting in significantly higher evaluation times. Despite having different benefits and limitations, all six linear regression algorithms yield similar results. Fitting a linear model is fast but inaccurate, resulting in location errors of around 785 m. Weighted k NN and
Random Forest achieve the highest localization accuracy. Mean location estima-
tion errors of around 340 m are achieved when the RSS values are transformed to
the powed representation. The Random Forest ensemble technique successfully
avoids overfitting by averaging the predictions of multiple decision trees, while
still being able to compute the results faster than the k NN algorithm, due to
the bagging technique.
Our future work consists of further improving the dataset representation by
preprocessing the dataset using a Deep Learning approach. Afterwards, the algo-
rithms used in this work as well as more complicated Neural Network algorithms
can be evaluated. Finally, we will create a coverage map of our test environment
in Antwerp, in order to further improve the localization accuracy.

Acknowledgements. Thomas Janssen is funded by the Fund For Scientific Research


(FWO) Flanders under grant number 1S03819N.

References
1. Aernouts, M., Berkvens, R., Van Vlaenderen, K., Weyn, M.: Sigfox and LoRaWAN
datasets for fingerprint localization in large urban and rural areas. Data 3(2)
(2018). URL http://www.mdpi.com/2306-5729/3/2/13
2. Awad, M., Khanna, R.: Support vector regression. In: Efficient Learning Machines,
Berkeley, CA, pp. 67–80. Apress (2015). https://doi.org/10.1007/978-1-4302-5990-9_4
3. Bozkurt, S., Elibol, G., Gunal, S., Yayan, U.: A comparative study on machine
learning algorithms for indoor positioning. In: 2015 International Symposium on
Innovations in Intelligent SysTems and Applications (INISTA), pp. 1–8. IEEE
(2015). https://doi.org/10.1109/INISTA.2015.7276725, http://ieeexplore.ieee.org/
document/7276725/
4. Farjow, W., Chehri, A., Hussein, M., Fernando, X.: Support Vector Machines for
indoor sensor localization. In: 2011 IEEE Wireless Communications and Network-
ing Conference, pp. 779–783. IEEE (2011). https://doi.org/10.1109/WCNC.2011.
5779231, http://ieeexplore.ieee.org/document/5779231/
5. Janssen, T., Aernouts, M., Berkvens, R., Weyn, M.: Outdoor fingerprinting local-
ization using Sigfox. In: 2018 International Conference on Indoor Positioning and
Indoor Navigation (Accepted), Nantes, France (2018)

6. Jin, B., Xu, X., Zhang, T.: Robust Time-Difference-of-
Arrival (TDOA) localization using weighted least squares with cone tangent plane
constraint. Sensors 18(3), 778 (2018). https://doi.org/10.3390/s18030778, http://
www.mdpi.com/1424-8220/18/3/778
7. Mekki, K., Bajic, E., Chaxel, F., Meyer, F.: A comparative study of LPWAN
technologies for large-scale IoT deployment. ICT Express 5(1), 1–7 (2019). https://
doi.org/10.1016/j.icte.2017.12.005
8. Plets, D., Podevijn, N., Trogh, J., Martens, L., Joseph, W.: Experimental perfor-
mance evaluation of outdoor TDoA and RSS positioning in a public LoRa network.
In: IPIN 2018 - 9th International Conference on Indoor Positioning and Indoor Nav-
igation, September, pp. 24–27 (2018). https://doi.org/10.1109/IPIN.2018.8533761
9. Podevijn, N., Plets, D., Trogh, J., Martens, L., Suanet, P., Hendrikse, K., Joseph,
W.: TDoA-based outdoor positioning with tracking algorithm in a public LoRa
network. Wirel. Commun. Mob. Comput. 2018, 1–9 (2018). https://doi.org/10.
1155/2018/1864209, https://www.hindawi.com/journals/wcmc/2018/1864209/
10. Raza, U., Kulkarni, P., Sooriyabandara, M.: Low power wide area networks: an
overview. IEEE Commun. Surv. Tutor. 19(2), 855–873 (2017). https://doi.org/10.
1109/COMST.2017.2652320
11. Sallouha, H., Chiumento, A., Pollin, S.: Localization in long-range ultra nar-
row band IoT networks using RSSI. In: 2017 IEEE International Conference on
Communications (ICC), pp. 1–6. IEEE (2017). https://doi.org/10.1109/ICC.2017.
7997195
12. Song, Q., Guo, S., Liu, X., Yang, Y.: CSI amplitude fingerprinting-based NB-IoT
indoor localization. IEEE Internet Things J. 5(3), 1494–1504 (2018). https://doi.
org/10.1109/JIOT.2017.2782479, https://ieeexplore.ieee.org/document/8187642/
13. Torres-Sospedra, J., Montoliu, R., Trilles, S., Belmonte, O., Huerta, J.: Compre-
hensive analysis of distance and similarity measures for Wi-Fi fingerprinting indoor
positioning systems. Expert Syst. Appl. 42(23), 9263–9278 (2015). https://doi.org/
10.1016/J.ESWA.2015.08.013, https://www.sciencedirect.com/science/article/pii/
S0957417415005527
14. Tran, D., Nguyen, T.: Localization in wireless sensor networks based on sup-
port vector machines. IEEE Trans. Parallel Distrib. Syst. 19(7), 981–994 (2008).
https://doi.org/10.1109/TPDS.2007.70800, http://ieeexplore.ieee.org/document/
4384476/
15. Wang, Y., Xiu, C., Zhang, X., Yang, D.: WiFi indoor localization with CSI
fingerprinting-based random forest. Sensors 18(9), 2869 (2018). https://doi.org/
10.3390/S18092869, http://www.mdpi.com/1424-8220/18/9/2869
Learning to Communicate with
Multi-agent Reinforcement Learning
Using Value-Decomposition Networks

Simon Vanneste(B) , Astrid Vanneste, Stig Bosmans, Siegfried Mercelis,


and Peter Hellinckx

IDLab, Faculty of Applied Engineering, University of Antwerp - imec,


Sint-Pietersvliet 7, 2000 Antwerpen, Belgium
{simon.vanneste,stig.bosmans,siegfried.mercelis,
peter.hellinckx}@uantwerpen.be, astrid.vanneste@student.uantwerpen.be

Abstract. Recent research focuses on how agents can learn to commu-


nicate with each other. This communication between the agents allows
them to share information and coordinate their behaviour. Recent efforts
have proven successful in these cooperative problems. A major problem
we face in multi-agent reinforcement learning is the lazy agent prob-
lem, where some agents take advantage of the successful actions of other
agents. This results in agents not being able to learn a functional pol-
icy. In this paper we will combine state-of-the-art methods to design an
architecture to address cooperative problems using communication while
also eliminating the lazy agent problem. We propose two approaches for
learning to communicate that use value decomposition to address the
lazy agent problem. We find that the additive version of value decom-
position gives us results which exceeds the results of the state of the
art.

1 Introduction
In recent years, a great deal of research has been done in the field of Multi-Agent
Reinforcement Learning (MARL). The latest evolutions in MARL research show
that communication is essential in cooperative environments since no agent can
observe the complete state of the environment due to partial observability; communication also allows the agents to coordinate their behaviour [1,2]. To get the best results it is necessary to use reinforcement
learning to learn behaviour as well as communication. Foerster et al. [1] showed
that by learning to communicate, certain cooperative problems can be solved
with significantly better results.
In standard MARL we reward each agent individually, which is ideal to create
a series of agents that work by themselves. However, when we want the agents to
work together, this individual reward will not suffice. An individual reward does
not encourage agents to work together since they are not rewarded or punished
for good or bad teamwork. With a joint reward this is solved by rewarding all

agents equally. This means that agents are rewarded for a team success instead
of an individual success which encourages team behaviour. A problem we face
when using a joint reward is the “lazy agent” problem. A lazy agent is an agent
that does not have a good policy. When using individual rewards, this agent
would not be rewarded and would consequently change its policy to a more
useful policy. However, when using a joint reward, this agent can be rewarded
due to good actions of other agents. Therefore, the agent will be encouraged to
learn an unsuccessful policy.
MARL in a partially observable environment makes the problem more com-
plex because an agent cannot observe the complete environment. In these cases
communication is crucial. By using communication, we can reduce the effects of
partial observability by combining the agent’s own observations with those com-
municated by other agents. Communication between agents in a MARL system
can be approached in different ways. The simplest way is to define the communi-
cation protocol beforehand. This, however, limits the possibilities of communica-
tion. For most systems it will be impossible to define an optimal communication
protocol beforehand because we cannot know what information will be valuable
for other agents. Predefined communication policies are not optimal in terms of
what information is shared and how the data is represented. These policies will
bias our agents to solve problems the same way humans would. However if we
allow the agents to define the communication protocol themselves, the agents
may find a more efficient solution. Therefore, it is very useful to allow the agents
to learn to communicate.
In this paper we present some advancements in the state of the art in coopera-
tive MARL that benefit from both learning to communicate and value decompo-
sition [3] which offers a method to prevent the lazy agent problem. Our research is
organized in the following order. In Sect. 2 we discuss the state of the art relevant
to our problem statement. Section 3 provides background information regard-
ing reinforcement learning and deep Q-learning. Our methods are described in
Sect. 4. We present our experiments and their results in Sect. 5. Finally, we dis-
cuss our conclusions in Sect. 6.

2 Related Work

The latest research in cooperative MARL presents several different approaches to


allow agents to learn to communicate. Foerster et al. [1] and Sukhbaatar et al. [2]
both proposed a novel approach. The difference between these two approaches
is that Sukhbaatar et al. used continuous communication while Foerster et al.
chose to use discrete communication. Additionally, the approach of Foerster et
al. is more suitable for decentralized deployment, where agents can be placed on
separate devices, since the agents can be separated. Sukhbaatar et al. propose
an approach where the entire MARL system is comprised in a single network
which makes it unusable for decentralized deployment.
Foerster et al. propose two different methods for learning discrete communi-
cation, Reinforced Inter-Agent Learning (RIAL) and Differentiable Inter-Agent

Learning (DIAL). RIAL is a simpler method to learn to communicate. The poli-


cies for choosing actions and choosing communication are split in order to reduce
the size of the action space. The reward is applied to both the action policy and
the communication policy. The disadvantage of this approach is that there is no
direct feedback from other agents on the communication of a certain agent. This
makes it infeasible for the agents to agree on a communication policy. DIAL tries
to solve these problems by using gradients that flow through all agents to give
feedback on the communication.
Sunehag et al. [3] explain the lazy agent problem. When one agent learns a
policy which results in positive rewards, due to the joint reward system all agents
will receive these positive rewards. The other agents will interpret these rewards
as if they learned a successful policy as well. There is no way to distinguish
which agent performed a good action. Sunehag et al. [3] propose an approach
using value decomposition to determine the contribution of each agent in the
joint reward. This approach is discussed in detail in Sect. 4.

3 Background
In Reinforcement learning (RL) [4] we have an agent that acts in an envi-
ronment. Depending on the quality of its actions the agent will be rewarded or
punished. Using this reward, the agent will try to learn the desired behaviour by
applying a policy. This policy will determine which action the agent will choose
depending on the observation of the agent. Usually, these RL problems are mod-
elled using a Markov Decision Process (MDP) [5]. The MDP is defined by a tuple
(S, A, Ta , Ra ) with the state space S, action space A, transition function Ta and
reward function Ra . The agent’s objective is to maximize the reward by compos-
ing a policy that chooses an action a ∈ A based on the state of the environment
s ∈ S. A method to create such an agent is Q-learning. A Q-learner will try to
create a Q function which the agent will use to determine the next action by
using the following policy π(s) = argmax_a Q(s, a). This method is based on the
Bellman equation [6] (Eq. 1) which will be used to create the Q-function. Here
Q(s, a) represents the expected value of performing a certain action a while in
a certain state s. R(s) is the reward that the agent receives after performing
the action, α ∈ [0, 1] is the learning rate and γ ∈ [0, 1] is the discount factor.
To balance exploration and exploitation, we use the -greedy policy. This means
that we will follow the policy π of our agents with a probability of 1 −  and
choose a random action with probability  which will be high in the beginning
of the training and low at the end of the training.

\hat{Q}(s, a) \leftarrow \hat{Q}(s, a) + \alpha \left( R(s) + \gamma \max_{a'} \hat{Q}(s', a') - \hat{Q}(s, a) \right)    (1)
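For concreteness, a minimal tabular sketch of this update rule and of the ε-greedy policy is given below (toy state and action spaces; not part of the original paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy(Q, s, epsilon):
    # Follow pi(s) = argmax_a Q(s, a) with probability 1 - epsilon, explore otherwise.
    if rng.random() < epsilon:
        return int(rng.integers(Q.shape[1]))
    return int(np.argmax(Q[s]))

def q_learning_step(Q, s, a, r, s_next, alpha=0.1, gamma=0.97):
    # One tabular application of Eq. (1); Q is an (n_states, n_actions) array.
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

Q = np.zeros((5, 3))                                  # toy problem: 5 states, 3 actions
a = epsilon_greedy(Q, s=0, epsilon=0.05)
q_learning_step(Q, s=0, a=a, r=1.0, s_next=1)
print(Q[0])
```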

In Deep Q-Network (DQN) we use a neural network to represent the


Q-function as proposed by Mnih et al. [7]. The advantage of DQN is that the
neural network should be able to generalize over multiple action state pairs in
comparison to a tabular approach. Equation 2 shows the loss function which is

used to train the parameters θi of the DQN-network for every iteration i. In this
equation, a frozen set of network parameters θi− is used to stabilize the updates
to the network.

L_i(\theta_i) = \mathbb{E}_{s,a,r,s'} \left[ \left( r + \gamma \max_{a'} Q(s', a'; \theta_i^-) - Q(s, a; \theta_i) \right)^2 \right]    (2)

In our experiments, the agents act in a partially observable environment which


moves the problem from an MDP to a partially observable MDP (POMDP). The difference between a POMDP and a regular MDP is that our agents do not receive the full state information but only a limited observation. This makes the problem considerably more difficult since our agents do not have all the information about the environment. This also causes us to lose the guarantee
that our policy will converge. Since deep Q-learning assumes full observability
we will be using a deep recurrent Q-network as proposed by Hausknecht et
al. [8]. Instead of using a feed forward network to calculate the Q-values we
use a recurrent network. Recurrent networks have an internal state which allows
them to remember observations over time. Using a deep recurrent Q-network
helps us reduce the effect of the partial observability problem.

4 Methods

In this paper, we propose to combine the DIAL method of Foerster et al. [1], which simultaneously learns behaviour and communication, with the value decomposition method of Sunehag et al. [3] in order to reduce the lazy agent problem.

4.1 Differentiable Inter-Agent Learning (DIAL)

Differentiable Inter-Agent Learning (DIAL) [1] is a method to simultaneously


learn behaviour and communication between the agents. In DIAL, each agent
produces an action a and a message m. The agent has a C-Net that produces
a message and the Q-values for every action. The action selector will select
the action based on these Q-values. To learn the message structure, gradients
flow across agents to get some form of feedback on the message that was sent.
However if we want to let the gradients flow across agents, the messages cannot
be discrete. This is addressed by applying a discretize-regularize unit (DRU)
on the message. During training time, the DRU will add Gaussian noise to the message and apply a sigmoid function. The Gaussian noise will have more effect
on the results in the middle of the sigmoid than at the extrema of the sigmoid
as described by Foerster et al. [1]. Therefore, the network will be encouraged to
send messages that we will be able to discretize later on. During testing, the DRU
will discretize the messages. This means that the agents get more information
during training time. However when the agents are deployed, this information is
not necessary for the agents to operate. The architecture of the DIAL network
can be seen in Fig. 1.


Fig. 1. Differentiable Inter-Agent Learning Architecture [1] in which two agents are
represented over two separate time periods (t and t + 1). The execution flow is repre-
sented by black arrows and the gradient flow is represented by red arrows.
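A possible PyTorch sketch of such a discretize-regularize unit is shown below; the noise scale is an assumption and the function only illustrates the mechanism described above, not the authors' implementation.

```python
import torch

def dru(message_logits, training, sigma=2.0):
    # During training: add Gaussian noise and squash with a sigmoid so gradients can
    # flow across agents; at test time: hard-threshold the message to discrete bits.
    if training:
        return torch.sigmoid(message_logits + sigma * torch.randn_like(message_logits))
    return (message_logits > 0).float()

m = torch.randn(4, 2)           # batch of 4 agents, 2 message bits each
print(dru(m, training=True))    # continuous, differentiable messages
print(dru(m, training=False))   # discretized messages for deployment
```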

4.2 Value Decomposition

Sunehag et al. [3] assume that the total Q-value of the multi-agent system can be
obtained by adding the Q-value of the chosen action of each agent together (see
Eq. 3). Q̃_n only depends on the local observations o^n of each agent. We learn Q̃ implicitly by applying the gradient descent algorithm with the joint reward
through the summation of the individual Q-values, instead of directly learning
it by applying the joint reward on the individual Q-values. The final loss function
can be obtained by substituting the assumption of Eq. 3 into Eq. 2, which results in
Eq. 4.
Q\big((o^1, o^2, o^3, \dots, o^d), (a^1, a^2, a^3, \dots, a^d)\big) \approx \sum_{n=1}^{d} \tilde{Q}_n(o^n, a^n)    (3)

L_i(\theta_i) = \mathbb{E}_{o,a,r,o'} \left[ \left( r + \gamma \max_{a'} \sum_{n=1}^{d} \tilde{Q}_n(o'^n, a'^n; \theta_{i,n}^-) - \sum_{n=1}^{d} \tilde{Q}_n(o^n, a^n; \theta_{i,n}) \right)^2 \right]    (4)

The Q-values produced by our agents can be seen as a measure of how good
a certain action is in a certain state. Therefore we can use this to determine
the contribution of each agent in the received joint reward. By applying the
joint reward to the combined Q-value, the gradients will make sure each agent
is rewarded according to its contribution. A visual representation of the archi-
tecture is shown in Fig. 2.


Fig. 2. The architecture of value decomposition agents using a joint reward.
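A compact PyTorch sketch of this additive decomposition and of the resulting joint TD loss (Eqs. 3 and 4) is given below; tensor shapes and the use of a frozen target network are assumptions for illustration.

```python
import torch

def vdn_loss(per_agent_q, per_agent_q_next_target, actions, joint_reward, gamma=0.97):
    # per_agent_q: list of (batch, n_actions) tensors, one per agent;
    # per_agent_q_next_target: same shapes, produced by frozen target networks.
    q_total = sum(q.gather(1, a.unsqueeze(1)).squeeze(1)            # Q-value of the chosen action,
                  for q, a in zip(per_agent_q, actions))            # summed over agents (Eq. 3)
    with torch.no_grad():
        target = joint_reward + gamma * sum(q.max(dim=1).values for q in per_agent_q_next_target)
    return torch.mean((target - q_total) ** 2)                      # joint TD error (Eq. 4)

qs = [torch.randn(5, 4, requires_grad=True) for _ in range(2)]      # toy example: 2 agents, batch 5
qs_next = [torch.randn(5, 4) for _ in range(2)]
acts = [torch.randint(0, 4, (5,)) for _ in range(2)]
loss = vdn_loss(qs, qs_next, acts, joint_reward=torch.randn(5))
loss.backward()    # gradients flow back to each agent according to its contribution
print(loss.item())
```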




Fig. 3. The architecture of the Differentiable Inter-Agent Learning with Value Decom-
position (DIAL-VD).

4.3 Differentiable Inter-Agent Learning with Value Decomposition


When we combine the techniques used in DIAL [1] and value decomposition [3]
we get an architecture that we call Differentiable Inter-Agent Learning with
Value Decomposition (DIAL-VD). In this architecture we will use gradients to
decompose the joint reward and to provide feedback regarding the messages of
other agents as can be seen in Fig. 3. Adding value decomposition to DIAL does
not change the fact that the agents can be deployed on separate devices. The
value decomposition is only used at training time. When the agents are deployed,
we do not need to perform value decomposition.

4.4 Differentiable Inter-Agent Learning with Value Decomposition Networks
The value decomposition method proposed by Sunehag et al. [3] has certain lim-
itations by assuming that Eq. 3 holds. Therefore we propose a novel approach
called Differentiable Inter-Agent Learning with Value Decomposition Networks
(DIAL-VDN) that uses an additional neural network to perform the value decom-
position. Using this approach we are not bound by the assumption that the total
Q-value of our multi-agent system is formed by the summation of the individ-
ual Q-values. When using a neural network to compose our total Q-value, we
suspect that we can take into account a wider range of problems such as envi-
ronments where one agent’s success depends on the success of the other agent
or environments with heterogeneous agents. The architecture for DIAL-VDN
is very similar to the DIAL-VD architecture and can be seen in Fig. 4. The
value decomposition in DIAL-VDN is not used during deployment. Therefore,
the agents trained using this approach can be deployed on separate devices.
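As an illustration, such a value decomposition network can be realised as a small multi-layer perceptron that maps the individual chosen-action Q-values to a total Q-value; the layer sizes below are assumptions, not the configuration used in the experiments.

```python
import torch
import torch.nn as nn

class MixingNetwork(nn.Module):
    # Small MLP replacing the plain summation of Eq. (3): it maps the vector of
    # per-agent chosen-action Q-values to a single total Q-value.
    def __init__(self, n_agents, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_agents, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, chosen_q_values):            # shape: (batch, n_agents)
        return self.net(chosen_q_values).squeeze(-1)

mixer = MixingNetwork(n_agents=2)
print(mixer(torch.randn(5, 2)).shape)              # torch.Size([5])
```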

5 Results
In this section, we evaluate the results of our proposed DIAL-VD and DIAL-VDN
approaches compared to the DIAL baseline. The agents use Deep Q-learning (see


Fig. 4. The architecture of Differentiable Inter-Agent Learning with Value Decomposition Networks (DIAL-VDN).

Sect. 3) to learn their behaviour. The architecture of our C-Net is based on the
original C-net described by Foerster et al. [1]. The inputs are first processed by
an input layer. For the agent index and the previous action we use an embedding
layer to learn a better representation for these inputs. For the state we cannot use
an embedding layer because the state info is continuous. Therefore, we use a fully
connected layer to process the state info. To process the incoming messages we
first apply batch normalization followed by a ReLU and a fully connected layer.
After these input layers we combine all these inputs by performing an element
wise addition and apply three layers of GRUs with a hidden state size of 128.
Our output layer consists of batch normalization followed by a fully connected
layer, ReLU and another fully connected layer. Our implementation is based
on the Pytorch implementation of DIAL [9]. For our DIAL-VDN approach we
use a neural network to perform value decomposition. This network is a simple
network consisting of two fully connected layers with a ReLU in between.
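A PyTorch sketch following this description is given below. Apart from the three-layer GRU with hidden size 128 and the two message bits, all dimensions and names are assumptions; the actual implementation is the one referenced in [9].

```python
import torch
import torch.nn as nn

class CNet(nn.Module):
    # Produces Q-values for the environment actions plus the message logits.
    def __init__(self, n_agents, n_actions, state_dim, msg_bits=2, hidden=128):
        super().__init__()
        self.n_actions = n_actions
        self.agent_embed = nn.Embedding(n_agents, hidden)          # embedding for the agent index
        self.action_embed = nn.Embedding(n_actions + 1, hidden)    # embedding for the previous action
        self.state_fc = nn.Linear(state_dim, hidden)               # continuous state through a FC layer
        self.msg_in = nn.Sequential(nn.BatchNorm1d(msg_bits), nn.ReLU(), nn.Linear(msg_bits, hidden))
        self.gru = nn.GRU(hidden, hidden, num_layers=3, batch_first=True)
        self.head = nn.Sequential(nn.BatchNorm1d(hidden), nn.Linear(hidden, hidden),
                                  nn.ReLU(), nn.Linear(hidden, n_actions + msg_bits))

    def forward(self, agent_idx, prev_action, state, message, hidden_state=None):
        x = (self.agent_embed(agent_idx) + self.action_embed(prev_action)
             + self.state_fc(state) + self.msg_in(message))         # element-wise addition of all inputs
        out, hidden_state = self.gru(x.unsqueeze(1), hidden_state)  # one recurrent step per time step
        out = self.head(out.squeeze(1))
        return out[:, :self.n_actions], out[:, self.n_actions:], hidden_state

net = CNet(n_agents=2, n_actions=5, state_dim=8)
q, msg, h = net(torch.tensor([0, 1]), torch.tensor([0, 3]), torch.randn(2, 8), torch.randn(2, 2))
print(q.shape, msg.shape)    # torch.Size([2, 5]) torch.Size([2, 2])
```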
Both Sunehag et al. [3] and Foerster et al. [1] concluded that parameter
sharing is necessary. Parameter sharing means that instead of training a sepa-
rate network for each of our agents, we will use the same network for all agents.
Sunehag et al. stated that parameter sharing helps to solve the lazy agent prob-
lem by providing agent invariance. Foerster et al. concluded from their results
that parameter sharing provides faster training and better results. Following
these conclusions we will use parameter sharing as well.
We use the “simple reference” environment provided by OpenAI [10,11] for
this evaluation. This environment is considered one of the state of the art envi-
ronments to test and evaluate multi-agent communication RL algorithms. In this
environment there are three landmarks and two agents. The two agents have to
go to a specific target landmark but this target landmark is only known by the
other agent. This means that the agents have to communicate to successfully
reach the correct landmark. The “simple reference” environment is a collectively observable environment, which makes it suitable for testing our system
because it is crucial for the agents to communicate with each other to reach the

goal. The environment only contains two agents which makes it suitable to test
with value decomposition since this method has not been tested with a larger
amount of agents. Since there are three possible landmarks, we allow the agents
to communicate two bits of information.
For our experiments we use the following hyperparameters. We use episodes
of 30 steps. The learning rate is set to 0.0005 with a momentum of 0.05. The
discount factor is set at 0.97. The ε-value for ε-greedy is constant at 0.05. Our
target network is updated every 100 episodes. To stabilize training we use a
batch size of 64.

5.1 Average Learning Performance

In Fig. 5 our results are shown, averaged over 15 runs. The results are smoothed
by applying an exponentially weighted moving average filter with an alpha value
of 0.01. We can see that our DIAL-VD approach outperforms the baseline DIAL
system. Not only does DIAL-VD reach a higher peak-performance, it also sig-
nificantly accelerates training speed. We see that DIAL-VD reaches an average
reward of -30 approximately 3000 episodes before DIAL and DIAL-VDN. On
average we see that DIAL-VDN does not offer great improvements. At the begin-
ning of our training it is slightly faster than DIAL but this only lasts about 4000
episodes. DIAL-VDN does not reach a higher peak performance than DIAL.
In Fig. 5 we can clearly see that the value-decomposition methods suffer from
learning instability. Deep Q-learning naturally suffers from stability problems, since the exact Q-values determine which actions are chosen. Slight changes in
our network and the Q-values can greatly alter the behaviour of our agents. How-
ever, in single agent settings there are many techniques to mitigate this problem.
One of the main techniques that is used is experience replay, where we store the
experiences of the agent in a buffer. To train the network we sample from this

Fig. 5. Results for DIAL, DIAL-VD and DIAL-VDN averaged over 15 runs and
smoothed with an exponentially weighted moving average filter with an alpha value
of 0.01.

buffer. This way we can make sure the agent does not forget what it learned in
the beginning of its training. In MARL we cannot use experience replay since
the behaviour of these other agents still changes a lot, making the environment
non-stationary and causing the experiences in our buffer to be useless.

5.2 Average Peak Performance


However, to get an accurate idea of the potential of each approach we look
at the average peak performance across all runs. These results can be seen in
Table 1. We see that the average peak performance of DIAL-VD is a great deal
higher than the average peak performance of both DIAL and DIAL-VDN. We
can conclude that DIAL-VD will, on average, reach a higher peak performance
than the other approaches. We can also see that DIAL-VDN has a lower average
peak performance than our DIAL baseline.

Table 1. The average peak performance for the DIAL baseline, DIAL-VD and DIAL-
VDN.

Approach     Average reward
DIAL         −26.10
DIAL-VD      −23.43
DIAL-VDN     −27.5

6 Conclusion
This paper proposed two approaches to improve the state of the art in learning
to communicate as presented by Foerster et al. [1]. Our first approach, DIAL-
VD, inspired by the techniques presented by Sunehag et al. [3], was able to
significantly improve both training speed and peak performance. The second
approach, DIAL-VDN, showed smaller improvements. All experiments show insta-
bility in training. This instability was more extreme in both proposed value
decomposition techniques than in the DIAL baseline. However, even with this
instability, DIAL-VD proved capable of significantly improving the state of the
art in learning to communicate.
In this paper we only tested our methods in an environment with two agents.
In the future we would like to test if our approaches generalize to a larger number of agents. A large number of agents significantly increases the difficulty
of learning with a single joint reward since the agents see more rewards caused
by other agents than rewards caused by their own actions. In addition to this, we
would like to improve the performance of the DIAL-VDN approach by providing
the value decomposition network with the observations of the agents so the
network can evaluate the performance of the agents.

Acknowledgement. We gratefully acknowledge the support of NVIDIA Corporation


with the donation of the Titan Xp GPU used for this research.

References
1. Foerster, J., Assael, I.A., de Freitas, N., Whiteson, S.: Learning to communicate
with deep multi-agent reinforcement learning. In: Advances in Neural Information
Processing Systems, pp. 2137–2145 (2016)
2. Sukhbaatar, S., Fergus, R., et al.: Learning multiagent communication with back-
propagation. In: Advances in Neural Information Processing Systems, pp. 2244–
2252 (2016)
3. Sunehag, P., Lever, G., Gruslys, A., Czarnecki, W.M., Zambaldi, V., Jaderberg,
M., Lanctot, M., Sonnerat, N., Leibo, J.Z., Tuyls, K., et al.: Value-decomposition
networks for cooperative multi-agent learning. arXiv preprint arXiv:1706.05296
(2017)
4. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press
(2018)
5. Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Pro-
gramming. Wiley, Hoboken (2014)
6. Bellman, R., et al.: The theory of dynamic programming. Bull. Am. Math. Soc.
60(6), 503–515 (1954)
7. Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G.,
Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., et al.: Human-level
control through deep reinforcement learning. Nature 518(7540), 529 (2015)
8. Hausknecht, M., Stone, P.: Deep recurrent q-learning for partially observable
MDPs. In: 2015 AAAI Fall Symposium Series (2015)
9. https://github.com/minqi/learning-to-communicate-pytorch
10. Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, O.P., Mordatch, I.: Multi-agent
actor-critic for mixed cooperative-competitive environments. In: Neural Informa-
tion Processing Systems (NIPS) (2017)
11. Mordatch, I., Abbeel, P.: Emergence of grounded compositional language in multi-
agent populations. arXiv preprint arXiv:1703.04908 (2017)
AirLeakSlam: Automated Air Leak
Detection

Anthony Schenck1,2(B) , Walter Daems1 , and Jan Steckel1,2


1 University of Antwerp, Antwerp, Belgium
{anthony.schenck,walter.daems,jan.steckel}@uantwerpen.be
2 Flanders Make Strategic Research Centre, Lommel, Belgium

Abstract. Estimations indicate that up to a third of the power consumption of compressed air systems is lost due to undetected leaks. The cur-
rent methods of detection and localization of these leaks require intensive
manual labor, making use of handheld devices. In addition, the added
energy cost caused by these air leaks is concealed in the total cost of
energy. These factors can explain why there is a limited commitment
to detect and repair air leaks. To address this issue, we propose a solu-
tion which does not require manual labor in the process of detecting and
locating pressurized air leaks. By equipping existing factory vehicles with
a multi-modal sensing device containing a LIDAR, an ultrasonic micro-
phone array and a camera, we are able to locate leaks in large industrial
environments with high precision in 3D. With this proposed solution we
aim to encourage the industry to proactively search for pressurized air
leaks and thereby reduce energy losses, at a fraction of the cost of currently employed methods.

1 Introduction
A substantial amount of energy is lost every year due to leaks in pressurized air
networks. It is estimated that these leaks account for 10% to 30% of the energy
consumed by industrial compressors. Based on the information in [9] and [11] it
is possible to estimate a required energy production capacity of 2.3 GW to cover
these energy losses in Europe on a yearly basis. Current methods of detecting
pressurized air leaks require intensive manual labor. At times it can even be nec-
essary to shut down a part or even the entire production facility. The preceding
factors can make the detection of these leaks excessively expensive. Furthermore,
the cost of the energy loss due to inefficiencies in the compressed air network
caused by leaks is masked by the total cost of energy of the production facility,
and is relatively small compared to other cost reduction options [11]. Often the
first step taken when the efficiency of the air network decreases is to
add additional compressor power instead of searching for the root causes [10].
Pressurized air leaks generate ultrasonic sound which is caused by turbulence
at the exit of the leak, where the gas expands coming from the high pressure
inside of the container into the low pressure outside of the container [3,16]. The
ultrasonic sound produced by a leak makes it possible to detect it by utilizing

ultrasonic microphones, even in an environment with loud background noise,


commonly found in an industrial setting. Background noise of ultrasonic nature
can be filtered out by making use of highly directional microphones. To increase
the directionality of a microphone one can use either a parabolic dish [4] sur-
rounding the microphone or use a microphone array in combination with beam-
forming methods [2,5]. The current state of the art for detecting pressurized air
leaks consists of using a hand-held device with a highly directional ultrasonic
microphone [12,13]. An operator has to walk around the facility while pointing
the device along the entire pressurized air network. When a leak is detected, the
operator can be notified by means of a display overlaying the leak direction and
intensity or by using a headphone which downshifts the ultrasonic sound to the
human spectrum, increasing in volume when pointing directly towards a leak.
Provided that a leak has been located, the operator then has to manually place
a tag at the leak location after which a maintenance crew can repair the leak.
In this paper we propose a standalone solution for the detection and local-
ization of pressurized air leaks that does not involve any manual labor. Figure 1
shows our current prototype. The system combines our embedded Real-Time
Imaging Sonar (eRTIS) [6] to detect and locate the air leaks, a Light Detec-
tion And Ranging (LIDAR) sensor to map the environment and determine the
position of the system in the environment, and a camera to visually mark the
leak location, all in a single enclosure. This enclosure can then be mounted to
vehicles that are already present in the production facility such as fork lifts or
Automated Guided Vehicles (AGV) as shown in Fig. 2(a). This eliminates the
need for manual labor in the detection and localization process. Once the data
has been captured it can be uploaded for post-processing and analysis.
The rest of this paper is organized as follows: in Sect. 2.1 we will explain
how our eRTIS sensor passively locates ultrasonic sources. Section 2.2 describes
how we combine multiple measurements to determine the location of a leak.
Next, in Sect. 3 we describe the system architecture that we used for our solu-
tion. Section 4 will demonstrate our test methods and show our obtained results.
Finally, we will draw our conclusions and propose future work in Sect. 5.

2 Leak Detection and Localization

In this section we will briefly explain the working of the eRTIS sensor and how
it locates ultrasonic sources, as well as how we determine the position of an air
leak.

2.1 Leak Detection

The eRTIS sensor, depicted in Fig. 2(b) is equipped with a sparse microphone
array, consisting of 32 randomly placed microphones in an ellipse. It has an
angular accuracy of 1◦ and an angular resolution of 5◦ . It is capable of taking
3D measurements at a rate of 12 Hz. When performing a measurement, each

Fig. 1. Panel (a) shows the front view of the enclosure. On the right sits the LIDAR,
a Hokuyo UTM-30LX, which is used for the HectorSLAM algorithm. On the left is
the eRTIS sensor which is used in passive mode to detect and locate air leaks. In the
center is the camera. The camera takes intermittent pictures, on which we can overlay
a leak location if one has been found, aiding in the localization of a leak. Panel (b)
shows the same sensors from the backside. There is also a connection for the power
supply and an Ethernet port to connect to the NUC inside. Panel (c) shows the inside
of the enclosure, with the NUC and the power supply. The NUC uses ROS to setup
the sensors and captures their data.

microphone will record a signal s_i^M[k], with i representing the microphone number and k the discrete time variable. These signals are then band-pass filtered:

s_i^F[k] = s_i^M[k] * h^{BP}[k]    (1)
after which we apply a sum-and-delay beamformer:

s_\psi^{BF}[k] = \sum_{i=1}^{32} s_i^F[k + \tau_i(\psi)]    (2)

where τi (ψ) are the time delays as a function of ψ and ψ = [ϑ, ϕ] is the directional
vector consisting of azimuth and elevation angles as demonstrated in Fig. 2(b).
Next, we estimate the envelope of the signal by full-wave rectification and low-
pass filtering:

s_\psi^{ENV}[k] = h^{LP}[k] * |s_\psi^{BF}[k]|    (3)
Now we can estimate the energy coming from every direction of interest ψ. The
combination of these energy profiles is called the EnergyScape (ES):

ES[k, \psi] = [\, s_{\psi_1}^{ENV}[k] \;\dots\; s_{\psi_n}^{ENV}[k] \,]    (4)

Fig. 2. (a) Mounting the enclosure on a vehicle while driving around allows for the
localization of air leaks using triangulation. (b) shows the eRTIS sensor. The location
of the 32 microphones are highlighted in green. The Cartesian coordinate system is
superposed in white, together with the azimuth and elevation [ϑ, ϕ] of a source with
coordinate p.

By integrating the EnergyScape over the discrete time, we obtain a spatial spec-
trum:
SS[\psi] = \sum_{k=1}^{K} ES[k, \psi]    (5)

At this point ψ is uniformly sampled on a sphere over 4000 points, using the zonal
sphere partitioning algorithm of [8], partitioning the unit sphere into regions of
equal area with small diameter. From here we can interpolate SS[ψ] onto a reg-
ular azimuth and elevation grid: SS[ϑ, ϕ]. The distribution of sample points at
this step is no longer uniform on the sphere, but rather severely oversampled at
the poles of the sphere. At this point we have a two-dimensional matrix with
intensity values. Next, we convert this matrix into an acoustic image, in which
ultrasonic sources can be identified as high intensity areas. We then apply thresh-
olding on the image to reduce the influence of noise. We acknowledge that this
method has its limitations in environments where noise and signal are close or
equal in strength. In these cases more sophisticated detection techniques can be
implemented, such as relying on L1 minimization and sparsity, as demonstrated
in [14]. Other beamforming techniques such as broadband MUSIC could be used
as well for source detection [15]. Next we perform binary thresholding on the
image to allow for easy blob detection. We use the remaining blobs in turn as a
mask, to determine the highest intensity value of each ultrasonic source found
in the measurement. These locations are finally related back to their azimuth
and elevation [ϑ, ϕ] indicating the ultrasonic source direction. This process is
demonstrated in Fig. 3. We chose this method of determining the direction of
the ultrasonic sources as it allows to easily distinguish multiple sources in the
same measurement as well as to easily determine the center of these sources.
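To make the processing chain of Eqs. (1)–(5) concrete, the sketch below computes a spatial spectrum from placeholder microphone data. The filter design, sampling rate and integer per-direction delays are assumptions; the code only illustrates the method and is not the eRTIS implementation.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def energyscape(mic_signals, delays, fs, band=(20e3, 80e3)):
    # mic_signals: (32, n_samples) recordings, delays: (n_directions, 32) integer
    # sample delays tau_i(psi) for every candidate direction psi.
    bp = butter(4, band, btype="bandpass", fs=fs, output="sos")
    lp = butter(4, 10e3, btype="lowpass", fs=fs, output="sos")
    filtered = sosfilt(bp, mic_signals, axis=1)                        # Eq. (1): band-pass filter
    spatial_spectrum = np.empty(delays.shape[0])
    for d, tau in enumerate(delays):
        shifted = [np.roll(filtered[i], -int(t)) for i, t in enumerate(tau)]
        beam = np.sum(shifted, axis=0)                                 # Eq. (2): sum-and-delay beamformer
        envelope = sosfilt(lp, np.abs(beam))                           # Eq. (3): rectify and low-pass filter
        spatial_spectrum[d] = envelope.sum()                           # Eqs. (4)-(5): integrate over time
    return spatial_spectrum

rng = np.random.default_rng(0)
ss = energyscape(rng.normal(size=(32, 4096)),                          # placeholder recordings
                 rng.integers(0, 10, size=(100, 32)), fs=450e3)        # 100 candidate directions
print(int(ss.argmax()))                                                # direction with the most energy
```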

2.2 Leak Localization


Now that we have a method to establish the azimuth and elevation of a pressur-
ized air leak for a given measurement, we need to be able to locate the leak. We

Fig. 3. Determining the azimuth and elevation of ultrasonic sources by using the Ener-
gyScape generated by the acoustic measurements. The EnergyScape is calculated for
4000 directions sampled uniformly on a sphere. These spheres are a Lambert azimuthal
equal area projection, with grid lines every 30◦ in azimuth and elevation. The 4000
directions were generated using an equal area sphere partitioning algorithm. This is
then interpolated on a azimuth and elevation grid. After thresholding and conversion
into a black and white image we perform blob detection. We use each blob as a mask
and take the highest intensity value behind the mask as the source direction, allowing
us to detect multiple sources in each measurement. Finally we determine the azimuth
and elevation [ϑ, ϕ] of these sources.

do this by creating a map of the environment using the ROS implementation


of HectorSLAM described in [7]. By using this SLAM implementation we only
require laserscan data. This allows our solution to be implemented on any mov-
ing object already present in the environment, only requiring a power source.
Using HectorSLAM we know the position where a measurement is taken with
respect to the world coordinate system:
p_i = [x_i \; y_i \; z_i]^T    (6)

along with the relative rotation α_i of the enclosure along the z-axis, represented by its rotation matrix R_z:

R_z(\alpha_i) = \begin{bmatrix} \cos(\alpha_i) & -\sin(\alpha_i) & 0 \\ \sin(\alpha_i) & \cos(\alpha_i) & 0 \\ 0 & 0 & 1 \end{bmatrix}    (7)
Note that we assume a fixed value for the z-coordinate of the robot position. If
a leak is detected from this position pi , we are able to determine the azimuth
and elevation [ϑ, ϕ] of the leak from (5). Since we are using eRTIS in passive
mode, we are unable to extract range information from the data we receive. We
therefore limit the range to a fixed value dmax , which we set in our measurements
to 50 m. Combining the position with the source direction will give us

s_i = p_i + R_z(\alpha_i) \cdot v_i(\psi_i)    (8)

with v_i(\psi_i) the 50 m vector pointing towards the found leak position in the sensor's coordinate system. Detection of the same leak from a different position p_j will provide us with s_j = p_j + R_z(\alpha_j) \cdot v_j(\psi_j) for this measurement. Combining multiple measurements will yield a location where the vectors s_1, s_2, \dots, s_n

intersect. This intersection point will be the leak location. This is demonstrated
in Fig. 4(a).
However, the likelihood that these vectors truly intersect in 3D-space is zero,
due to noise in the measurement position as well as noise in the measurement
itself. In order to consider two measurement vectors as intersecting, we evaluate
the minimum distance between each newly generated vector and all the previ-
ously generated vectors. If the obtained distance is below 10 cm, we consider the
vectors to intersect. By using a least mean squared pseudo inverse approach [1],
we determine the intersection point. This is the point that lies closest to both
vectors. Fig. 4(b) shows the collection of intersection points from our experiment.
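A minimal sketch of this intersection step, using a pseudo-inverse to find the closest points on two direction vectors, is given below (illustrative only):

```python
import numpy as np

def closest_point(p_i, v_i, p_j, v_j):
    # Solve, in the least mean squared sense, for the parameters (t_i, t_j) minimising
    # ||(p_i + t_i v_i) - (p_j + t_j v_j)||, and return the midpoint of the two closest
    # points together with their distance.
    A = np.column_stack([v_i, -v_j])                 # 3x2 system in the two line parameters
    t = np.linalg.pinv(A) @ (p_j - p_i)
    a, b = p_i + t[0] * v_i, p_j + t[1] * v_j
    return (a + b) / 2, np.linalg.norm(a - b)

point, dist = closest_point(np.array([0.0, 0.0, 1.5]), np.array([1.0, 0.2, 0.0]),
                            np.array([5.0, -3.0, 1.5]), np.array([0.0, 1.0, 0.05]))
print(point, dist)   # accept the intersection only if dist is below the 10 cm threshold
```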
In order to estimate both the actual leak positions and the amount of leaks, we
need to be able to automatically discern the clusters and find their center, which
should result in an accurate localization of the leaks. In order to analyze the

Fig. 4. Panel (a) demonstrates the process of determining a leak location when driving
by. Combining the ultrasonic source direction vectors allows us to locate a leak by
determining their points of intersection. The red cross shows the robot location with the
protruding red line the direction it is facing (Rz (αi )), at the time of the measurement.
The green lines are the direction vectors of the detected ultrasonic source marked by
the blue crosses. The blue line is the trajectory of the robot. Panel (b) shows all the
intersections found after our experiment, shown by the red dots.

intersection points, we create histograms of the x-, y-, and z-coordinates of all the
intersection points. Next, for each dimension, we fit a probability density func-
tion (PDF) of this data using Kernel Density Estimation. In this PDF we look
for the extrema. These are the most densely clustered locations in their respec-
tive dimension. Next, we cross-reference the locations of the extrema, resulting
in multiple (x, y, z)-coordinates. For each of these coordinates, we check if there
are a certain amount of intersection points within a certain radius. We then take
the mean of their (x, y, z)-coordinates and use the result as the estimated leak
location. We choose to use this method because it does not require any prior
knowledge about the amount of leaks in the environment while also being fast
to calculate. The computational load of, for instance, K-Means clustering was much
higher, taking multiple hours to complete, while also finding clusters where
no leaks were located.
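A compact Python sketch of this clustering step is given below: a kernel density estimate per coordinate, extrema of each density, cross-referencing of the extrema into 3D candidates, and a support check within a radius. The radius, minimum support count and the synthetic data are illustrative assumptions, not values from the experiment described above.

import numpy as np
from scipy.stats import gaussian_kde
from scipy.signal import find_peaks

def estimate_leaks(points, radius=0.5, min_support=10):
    points = np.asarray(points)                 # intersection points, shape (n, 3)
    peak_values = []
    for dim in range(3):
        data = points[:, dim]
        kde = gaussian_kde(data)                # PDF of this coordinate
        grid = np.linspace(data.min(), data.max(), 500)
        peaks, _ = find_peaks(kde(grid))        # extrema of the PDF
        peak_values.append(grid[peaks])
    leaks = []
    # Cross-reference the per-dimension extrema into 3D candidate locations.
    for x in peak_values[0]:
        for y in peak_values[1]:
            for z in peak_values[2]:
                candidate = np.array([x, y, z])
                near = points[np.linalg.norm(points - candidate, axis=1) < radius]
                if len(near) >= min_support:
                    leaks.append(near.mean(axis=0))   # mean of the supporting points
    return leaks

# Synthetic demo with two clusters of intersection points.
pts = np.vstack([np.random.normal([2.0, 3.0, 1.0], 0.05, (50, 3)),
                 np.random.normal([8.0, 1.0, 2.0], 0.05, (50, 3))])
print(estimate_leaks(pts))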

3 System Architecture
At the center of the system sits a NUC, to which all the sensors are connected.
The NUC is configured to run the Robot Operating System (ROS) framework,
acting as the master. The NUC configures the sensors and handles their data.
The transfer of data under ROS is done using publisher/subscriber messaging.
These messages can be either recorded on disk using a rosbag for post-processing,
or they can be directly processed by an external subscriber. The schematic
overview of the hardware system is shown in Fig. 5(a). Figure 5(b) shows the
schematic overview of the data processing structure of our proposed solution.

[Fig. 5: block diagrams. Panel (a): hardware layout, with the eRTIS, LIDAR and camera connected to the NUC and the power supply inside the AirleakSLAM enclosure. Panel (b): data processing structure, where the captured data is either processed locally in real time or recorded in a rosbag and post-processed in the cloud backend, resulting in leak localization, an image and a maintenance ticket.]

Fig. 5. Panel (a) shows the schematic hardware layout of our proposed solution. Panel (b)
shows a schematic overview of the data processing structure. The AirleakSLAM enclo-
sure captures the data from the sensors, which can then be sent to the cloud for further
processing, either in real-time or in batch by means of a rosbag. After processing, a
maintenance ticket can be dispatched for repairs.

The sensors perform their measurements independently and at different rates.


Due to the fact that every ROS message is timestamped using the Unix epoch
time format, we are able to synchronize them in the post-processing phase by
comparing the timestamps and looking for the nearest match. This is especially
important in order to obtain high accuracy in the localization of an air leak, by
matching the position to a sonar measurement. Once a leak has been detected
and located, we determine the field-of-view of the camera with regard to the
leak location. Knowing the position where a picture has been taken allows us to
overlay the leak location on this image, by adding an intensity map of the sound
pressure level of the leak. This picture can then be added to a maintenance
ticket, facilitating the repair of said leak. An example of this method can be
seen in Fig. 6(a). Here, the yellow area gives an indication of the sound pressure
level at that location. This is overlaid over the actual leak position, which is in
this case located in the hanging piece of tubing, encircled in red.
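The nearest-timestamp matching mentioned above can be done with a few lines of Python; the sketch below pairs each sonar measurement with the closest SLAM pose by comparing Unix-epoch timestamps. The array names and the example timestamps are ours, chosen only for illustration.

import numpy as np

def nearest_match(pose_stamps, sonar_stamps):
    """For every sonar timestamp, return the index of the closest pose timestamp."""
    pose_stamps = np.asarray(pose_stamps)
    sonar_stamps = np.asarray(sonar_stamps)
    idx = np.searchsorted(pose_stamps, sonar_stamps)      # insertion points
    idx = np.clip(idx, 1, len(pose_stamps) - 1)
    left, right = pose_stamps[idx - 1], pose_stamps[idx]
    # Pick whichever neighbour is closer in time.
    return np.where(sonar_stamps - left < right - sonar_stamps, idx - 1, idx)

poses = np.array([0.00, 0.52, 1.01, 1.49, 2.03])   # pose message timestamps [s]
sonar = np.array([0.50, 1.00, 1.50])               # 2 Hz acoustic measurements [s]
print(nearest_match(poses, sonar))                 # -> [1 2 3]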

4 Experimental Results
We performed our experiments using the enclosure as shown in Fig. 1. The
LIDAR is a Hokuyo UTM-30LX, with a range of 30 m and an angle of 270◦ .

Table 1. Error distances of the experiment

Leak | 2D error (m) | 3D error (m)
A    | 0.091        | 0.112
B    | 0.192        | 0.197
C    | 0.080        | 0.119

Fig. 6. (a) When a leak has been found, the images taken are analyzed. By using
camera projection, we know when a leak was in the camera’s field of view. If this is
the case, we overlay the sound pressure level at that location. Here, the leak location is
encircled in red, with the overlay of the sound pressure level at that position. Panel (b)
shows the location of the found leaks as the result of our experiment. The actual leak
positions are marked by the blue cross. The estimated leak positions are marked by
the green asterisk. The red crosses indicate the locations where our algorithm checks
for clusters of intersection points. The blue line shows the trajectory of the robot.

The camera is an ELP Super Mini 720p USB camera with a wide angle lens.
The power supply accepts a DC input voltage of 16.8 V to 31.2 V and has a
DC output voltage of 12 V with a power rating of 100 W. We used a Pioneer
P3DX robot as our moving platform on which the enclosure was mounted. We
created an air network of nylon tubing spanning around 70 m in the basement
of our building. This room is filled with chairs, tables and many metal cabinets.
The tubing was routed along existing cable gutters and heating pipes, near the
ceiling. For our experiment we set the pressure in our network at 500 kPa. We
made three leaks in the network by drilling holes in pieces of the nylon tubing,
with diameters of 0.6 mm, 0.8 mm and 1 mm.
During the experiment we drove the robot around at walking pace, along
the blue trajectory shown in Fig. 4(b), during which we were taking acoustic
measurements at a rate of 2 Hz. At the end, we are left with a large amount of
intersection points. These intersection points are shown in Fig. 4(b). After using
our clustering algorithm we found three leak locations. Table 1 shows the distance
error of the locations we found versus the actual leak locations. Figure 6(b) shows
a plot of the found leak positions, marked by the green stars. The actual leak
positions are shown by blue crosses, while the red crosses represent the locations
we investigated with our clustering algorithm. We can see that the leaks are
very accurately located, with 2D-errors of 9 cm for leak A, 19 cm for leak B and
8 cm for leak C. In 3D the localization errors increase only slightly to 11 cm,
20 cm and 12 cm respectively. The leak with the highest error is leak B, situated
at the top left of the floor plan. As we can see from Fig. 6(b), this is the leak
that we passed only once with the robot, resulting in fewer intersection points
and therefore a less accurate estimate of its position.

5 Conclusion and Future Work


In this paper we have proposed a novel method to localize pressurized air leaks
without the need for manual labor. Our prototype is capable of accurately locating
multiple air leaks in a large environment, while remaining platform independent
through the use of HectorSLAM.
online, by making use of the ROS publisher/subscriber messaging framework. In
the future we are looking to explore different methods of localizing the air leaks,
for instance by using probabilistic methods instead of using the intersection
points. We would like to further increase the accuracy of the localization of
the leaks in the images as well, which is currently in an early development
stage. Automatic classification of the size of leaks is another aspect we would
like to explore, possibly by making use of machine learning on the recorded and
spatialized leak signal spectrum. Improving the leak localization algorithm using
advanced adaptive beamforming techniques is another avenue of interest. Finally,
we recognize that our proposed method is only a small step in limiting energy waste,
since hundreds of thousands of companies each have to contribute a small portion
in order to reduce the total energy loss of compressed air systems. We believe that
facilitating the detection of said leaks is a first and important step in reducing the
power consumption of compressed air systems.

References
1. Eikenens, A.: Intersection point of lines in 3D space (2012). https://nl.mathworks.
com/matlabcentral/fileexchange/37192-intersection-point-of-lines-in-3d-space
2. Eret, P., Meskell, C.: Microphone arrays as a leakage detection tool in industrial
compressed air systems. Adv. Acoust. Vibr. 2012, 1–10 (2012). https://doi.org/
10.1155/2012/689379
3. Frenz, E., Moon, C., Pickering, D.J., Mellen, S., Brown, W.C.: Ultrasound tech-
niques for leak detection. SAE Technical Paper Series, vol. 1 (2010). https://doi.
org/10.4271/2009-01-2159
4. Guenther, T., Kroll, A.: Automated detection of compressed air leaks using a scan-
ning ultrasonic sensor system. In: Sensors Applications Symposium, Proceedings,
SAS 2016, pp. 116–121 (2016). https://doi.org/10.1109/SAS.2016.7479830
5. Holland, S.D., Roberts, R., Chimenti, D.E., Song, J.H.: An ultrasonic array sensor
for spacecraft leak direction finding. Ultrasonics 45(1–4), 121–126 (2006). https://
doi.org/10.1016/j.ultras.2006.07.020
6. Kerstens, R., Laurijssen, D., Steckel, J.: eRTIS : a fully embedded real time 3D
imaging sonar sensor for robotic applications. In: International Conference on
Robotics and Automation (to be published)
7. Kohlbrecher, S., von Stryk, O., Meyer, J., Klingauf, U.: A flexible and scal-
able SLAM system with full 3D motion estimation. In: 2011 IEEE Interna-
tional Symposium on Safety, Security, and Rescue Robotics, vol. 32, pp. 155–160.
IEEE (2011). https://doi.org/10.1109/SSRR.2011.6106777. http://doi.wiley.com/
10.1002/rcm.8045
8. Leopardi, P.: A partition of the unit sphere into regions of equal area and small
diameter. Electronic Trans. Numer. Anal. 25, 309–327 (2006)
9. McCorkle, M.: Compressed Air System Leaks: The Cost, Detection, and Repair,
Common Culprits (2018)
10. Perry, W., Mehltretter, N.: Applying root cause analysis to compressed air: how to
solve common compressed air system problems with the 5-whys. Energy Eng.: J.
Assoc. Energy Eng. 115(4), 56–62 (2018). https://doi.org/10.1080/01998595.2018.
12016673
11. Radgen, P.: Compressed air system audits and benchmarking results from the
German compressed air campaign “Druckluft effizient” (2003)
12. SKF: Product catalog. https://www.skf.com/uk/products/condition-monitoring/
basic-condition-monitoring-products/ultrasonic-leak-detector/index.html
13. SONOTEC: product catalog. https://www.sonotec.eu/products/preventive-
maintenance/ultrasonic-testing-devices/
14. Steckel, J., Peremans, H.: Ultrasound-based air leak detection using a ran-
dom microphone array and sparse representations. In: IEEE SENSORS 2014
Proceedings, pp. 1026–1029, November 2014. https://doi.org/10.1109/ICSENS.
2014.6985178. http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=
6985178
15. Van Trees, H.L.: Optimum Array Processing. Wiley, New York (2002). https://
doi.org/10.1002/0471221104
16. Wang, T., Wang, X., Hong, M.: Gas leak location detection based on data fusion
with time difference of arrival and energy decay using an ultrasonic sensor array.
Sensors (Switzerland) 18(9) (2018). https://doi.org/10.3390/s18092985
Simulating a Combination of TDoA and
AoA Localization for LoRaWAN

Michiel Aernouts(B) , Noori BniLam, Rafael Berkvens, and Maarten Weyn

University of Antwerp - imec, IDLab - Faculty of Applied Engineering,


Sint-Pietersvliet 7, 2000 Antwerp, Belgium
michiel.aernouts@uantwerpen.be

Abstract. Location-based services are an essential aspect of many


Internet of Things (IoT) applications. Due to low power requirements
that these applications impose on their end devices, classic GNSS solu-
tions are replaced by wireless localization via Low Power Wide Area
Networks (LPWAN), e.g. Time Difference of Arrival (TDoA) localiza-
tion with LoRaWAN. Usually, at least four gateways are required to
obtain a reliable location estimate with TDoA. In this research, we pro-
pose to combine TDoA with Angle of Arrival (AoA) in a probabilis-
tic way, using only two gateways. Our simulations demonstrate a 548 m
mean error using TDoA between two gateways. Moreover, we reduce this
mean error to 399 m when a single AoA estimate is added to the TDoA
estimate.

1 Introduction

The application potential of the Internet of Things (IoT) is being recognized


for an increasing amount of use cases, e.g. electric metering, smart farming,
manufacturing automation and asset tracking [9]. Together with the growth of
the IoT, location-based services are becoming increasingly important. Although
classic Global Navigation Satellite System (GNSS) solutions such as Global Posi-
tioning System (GPS), GLONASS or Galileo provide highly accurate location
estimations, they are not suitable for all IoT use cases. For example, an asset
tracking device has to work for years on a small battery, which is hard to achieve
with GNSS due to the high power consumption of a GNSS receiver. Also, a reli-
able asset tracking application requires localization in both outdoor and indoor
scenarios, whereas GNSS is only suitable for outdoor usage.
Due to their long range communication and low power consumption, Low
Power Wide Area Networks (LPWAN) standards such as Sigfox, Narrowband-
IoT (NB-IoT) and LoRaWAN are implemented in IoT devices to provide them
with reliable and low-power wireless communication [12]. Moreover, the char-
acteristics of this wireless communication link can also be used for localization
purposes. By analyzing the received signal strength or Time of Arrival (ToA)
at the receiving LPWAN gateways, techniques such as triangulation, multilat-
eration or pattern matching can be applied to calculate a location estimate for
the transmitter. In previous research, a comparative study of signal strength


based localization methods was conducted with Sigfox transmissions [1]. The
researchers used a large ground-truth LPWAN dataset to evaluate fingerprint-
ing, ranging and proximity localization [2]. With a mean error of 586 m, their
fingerprinting implementation achieved the lowest error. However, it takes a
great effort to build and maintain a large outdoor fingerprinting database which
makes it hard to implement this method. Podevijn et al. evaluated tracking with
Time Difference of Arrival (TDoA) in a public LoRaWAN network [11]. In this
research, a median accuracy of 200 m was obtained using raw TDoA data. When
taking map information and movement speed of the transmitter into account, the
median accuracy dropped to 75 m. Bakkali et al. propose to apply an Extended
Kalman Filter (EKF) on LoRaWAN TDoA localization. With their approach,
they achieve a median location error of less than 100 m [3]. In all of the aforemen-
tioned references, researchers rely on as many receiving gateways as possible to
obtain an acceptable location estimate. Angle of Arrival (AoA) estimation sys-
tems are capable of localizing a transmitter using only two gateways. Despite its
merit in estimating the direction of received narrowband signals (e.g. LPWAN
signals), AoA has a very high deployment cost. Consequently, AoA localiza-
tion systems have not been developed for LPWANs. However, several serious
attempts to provide cost effective AoA units for IoT applications have recently
been introduced [5–7,14].
On an economic level, it is interesting to limit the installation and mainte-
nance costs of an IoT network by constricting the amount and complexity of its
gateways. For example, a logistics company that is able to cover wireless com-
munication on its site with only two LoRaWAN gateways may not be inclined
to add costly gateways just for localization purposes. In this paper, we simulate
a combination of AoA and TDoA localization using two synchronized
gateways, one of which can estimate the AoA of a received signal. The goal
of this research is to assess the feasibility of locating a transmitter in a large
outdoor area, using a limited amount of receiving gateways.
The remainder of this paper is structured in the following way: Sect. 2
describes the general workflow of TDoA and AoA localization, and Sect. 3
explains how they are applied to our simulated input data. In Sect. 4 the results
of the simulations are discussed. Finally, Sect. 5 concludes the paper and states
our intended future work.

2 Related Work

In this section, a brief overview on the workflow of TDoA and AoA localization
is provided.

2.1 Time Difference of Arrival

Time of Arrival (ToA) localization relies on the basic principle that the distance
between a receiver and a transmitter can be related to the propagation time.
Similar to signal strength based ranging methods, ToA applies triangulation


after estimating the distance between the transmitter and at least three gate-
ways [4]. However, this method requires very precise synchronization between
the transmitter and its receivers, usually via GPS. Unfortunately, adding a GPS
module to an IoT device is not a desirable option [8]. Therefore, ToA can be
ruled out as a worthwhile localization method for IoT applications.
TDoA requires only the gateway clocks to be synchronized. When a wireless
signal is received, a location is computed based on the time difference of arrival
relative to a reference gateway. With the time difference between two gateways,
a hyperbola that lists all possible transmitter locations can be described. Conse-
quently, the intersection of three hyperbolas can be accepted as the transmitter
location if at least four gateways receive the signal [4]. It is important to note
that the accuracy of the location estimate depends strongly on the timestamp
precision of the receiving gateways. For example, a precision of 1 µs already leads
to an error of 300 m, as radio signals propagate with the speed of light. In turn,
the timestamp resolution builds upon the signal bandwidth, because the rising
edge of the initial pulse of a signal has to be detected accurately to calculate an
accurate timestamp. Ultra Narrow Band (UNB) wireless networks such as Sig-
fox have a limited bandwidth of 100 Hz, which makes time-based localization for
these networks unfeasible [13]. With a bandwidth of 125 kHz to 250 kHz depend-
ing on the region, LoRaWAN is much more qualified for time-based localization.
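The sensitivity to timing precision quoted above can be checked with a two-line Python calculation; the additional precision values are ours, added only for comparison.

c = 299_792_458          # speed of light in m/s
for dt in (1e-6, 100e-9, 10e-9):
    print(f"{dt * 1e9:8.0f} ns timing error -> {c * dt:10.1f} m ranging error")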

2.2 Angle of Arrival


As mentioned before, LPWANs use long range communication transmissions
with minimal power consumption. This is possible due to the fact that IoT
transceivers are usually required to transmit only small amounts of information,
and in turn they do not need large operating bandwidths. Contrary to TDoA,
AoA-based localization techniques can provide a location estimation of a trans-
mitting LPWAN device without depending on a high timestamp precision at
the receiving gateway. Nevertheless, the cost and complexity of deploying AoA
location estimation systems at the receiver side is considered a major challenge
for the AoA-based localization deployment. In a recent work, a cost effective
IoT AoA estimation unit has been introduced [5]. The accuracy of the AoA
estimation in the band of 868 MHz in an anechoic chamber was below 1◦ .

3 Simulation
This section describes the workflow of our experiments. Firstly, we explain how
our input data was simulated. Secondly, we illustrate how the TDoA and AoA
estimations are calculated and combined using this input data.

3.1 Input Data


In our previous work [2], we published a large ground truth dataset with
LoRaWAN messages that were collected in the city center of Antwerp. Unfor-
tunately, this dataset does not contain accurate time of arrival estimates which
Fig. 1. This figure shows the spatial spread of the 10 000 transmission locations that
are used as input for our simulations. Red circles indicate the transmission locations.
Blue markers show the two gateway locations ‘RX 1’ and ‘RX 2’, which we use in our
simulations.

can be used for TDoA localization. For this work, we use the ground truth loca-
tions of 10 000 random samples in this dataset. As can be seen in Fig. 1, all
of these samples are located within a spatial bounding box of 9.88 km2 . This
particular area was chosen as it contains gateways RX 1 and RX 2 which we
use in our simulations. In turn, these gateway locations were selected because
actual LoRa gateways have been installed there for our future experiments with
real LoRaWAN data. For each sample, the time of arrival at both gateways
is simulated based on the distance between the transmission locations and the
locations of the gateways. By dividing this distance with the speed of light,
a theoretical timestamp with nanosecond precision is obtained. Of course, our
simulations have to take multipath reflections into account, as they have a sig-
nificant impact on the accuracy of the measured time of arrival. Therefore, we
draw random errors from a normal distribution and add them to the theoreti-
cal timestamps. The standard deviation of this normal distribution was set to
1.8 µs, as our technical experiments with LoRa gateways in an urban environ-
ment demonstrate that the timing error is normally distributed with a median
of 0 µs, and a standard deviation of 1.8 µs [10].

Furthermore, an angle of arrival relative to RX 1 is simulated for each sample.


Similar to the timestamp simulation, a random error from a normal distribution
is added to the actual angle between the ground truth TX locations and RX 1.
Here, we chose to set the standard deviation of the distribution to 5 degrees, as
our experiments with a sub-GHz AoA unit indicate that it is feasible to obtain
an angle estimate within this margin [7].
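A minimal Python sketch of this input simulation is shown below: a theoretical time of arrival derived from the TX-gateway distance plus Gaussian timing noise with a 1.8 µs standard deviation, and an angle of arrival at RX 1 with 5 degrees of Gaussian noise. Cartesian coordinates, the random seed and the example positions are assumptions made for the illustration.

import numpy as np

C = 299_792_458.0                 # speed of light [m/s]
rng = np.random.default_rng(42)

def simulate_measurement(tx, rx1, rx2, sigma_t=1.8e-6, sigma_aoa_deg=5.0):
    """Return noisy ToA at both gateways [s] and a noisy AoA at RX 1 [rad]."""
    toa = []
    for rx in (rx1, rx2):
        dist = np.linalg.norm(tx - rx)
        toa.append(dist / C + rng.normal(0.0, sigma_t))   # add timing noise
    true_angle = np.arctan2(tx[1] - rx1[1], tx[0] - rx1[0])
    aoa = true_angle + np.deg2rad(rng.normal(0.0, sigma_aoa_deg))
    return toa[0], toa[1], aoa

tx  = np.array([1200.0, 800.0])
rx1 = np.array([   0.0,   0.0])
rx2 = np.array([2500.0, 300.0])
print(simulate_measurement(tx, rx1, rx2))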

3.2 TDoA and AoA Localization


After simulating timing and angle information for all 10 000 TX locations as
described in Sect. 3.1, enough input data is collected to apply TDoA and AoA
localization. In this subsection, we explain how we calculate location estimates
using this input data.
First, a TDoA location estimate is calculated based on the simulated time
of arrival at both gateways. Before doing so, the geodetic coordinates of both
RX 1 and RX 2 are converted to Cartesian coordinates, using the mean of both
gateway locations as a reference point. Next, we appoint the gateway with the
first timestamp t0 as the TDoA reference gateway. Knowing the timestamp t1
from the other gateway, the difference in distance dΔ between the transmitter
and both gateways can be calculated as follows:

dΔ = (t1 − t0 ) ∗ c, (1)

with c representing the speed of light. Consequently, the gateway locations and
dΔ can be used to draw a hyperbola that crosses every possible Cartesian coordi-
nate where the transmitter can be located1 . After calculating the hyperbola, it is
mapped to a discretized grid which divides the bounding box that was described
in Sect. 3.1 in cells of 10 m by 10 m. An offset of 300 m is added to all sides of
the grid, to avoid cutting off the hyperbola if its coordinates exceed the bound-
ing box. For every cell in the discretized grid, the minimal Euclidean distance
to the TDoA hyperbola is calculated. Finally, a probability density function is
applied on the discretized grid, with a standard deviation of 100 m around the
cells that contain the hyperbola. An example of such a TDoA probability den-
sity map can be seen in Fig. 2a. In this figure, it is clear to see that the possible
location estimates for the transmitter are distributed on the entire length of the
hyperbola. To translate this to a single TDoA location estimate, the mean loca-
tion of all cells with the highest probability is calculated and converted to the
geodetic coordinate system. Consequently, this approach is strongly influenced
by the size of the bounding box. If we were to enlarge the box, more cells that
are further away from the middle of the hyperbola's bending point would be taken
into account, which makes it less likely that the mean location of all these cells
converges to a point on the hyperbola. Hence, further steps should be taken to
obtain a more reliable and scalable localization method.
¹ For this research, we started from existing source code to simulate multilateration based on timing information. The repository can be found at https://github.com/jurasofish/multilateration.

The next step is to build a probability density map for the AoA estimate.
To do so, a new discretized grid as big as the TDoA grid is created, again
with a resolution of 10 m by 10 m per cell. For each cell, the angle relative to
gateway RX 1 is calculated. Similar to how the TDoA probability density map
was obtained, a probability density function is applied to the discretized grid,
with the simulated AoA estimate as the mean of the function and a standard
deviation of 5◦ . Figure 2b is a visual example of the resulting AoA probability
density map.
By adding the TDoA and AoA probability maps to each other, a new map
is created in which the cells where the TDoA hyperbola intersects with the AoA
estimate get the highest probability. For example, adding the AoA probability
map in Fig. 2b to the TDoA probability map in Fig. 2a results in a new probabil-
ity map that is shown in Fig. 2c. In the same way as the TDoA probability map is
translated to a location estimate, the mean location of all cells with the highest
probability is calculated and converted to geodetic coordinates. If we compare
Fig. 2a and c, we see that the cells with highest probability are distributed along
the entire hyperbola in the TDoA probability map, while the probability is much
more concentrated in a specific area in the combined TDoA - AoA map. Con-
trary to only using TDoA, the combined approach is much less affected by the
size of the bounding box because, in most cases, the crossing of the hyperbola
and the angle estimate that is used as the location estimate lies within the defined
box.
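The grid-based fusion of the two probability maps can be sketched compactly in Python, as below: a Gaussian map around the TDoA hyperbola (100 m standard deviation), a Gaussian map around the AoA estimate (5 degrees), their sum, and the mean of the highest-probability cells. The gateway positions, the dΔ and AoA values, and the use of the range-difference residual instead of the exact Euclidean distance to the hyperbola are our own simplifications for illustration; only the 10 m resolution and the two standard deviations follow the text above.

import numpy as np

def fuse_tdoa_aoa(rx1, rx2, d_delta, aoa, bbox, res=10.0,
                  sigma_d=100.0, sigma_a=np.deg2rad(5.0)):
    xs = np.arange(bbox[0], bbox[1], res)
    ys = np.arange(bbox[2], bbox[3], res)
    X, Y = np.meshgrid(xs, ys)
    # Range difference of every cell to the two gateways; cells where this
    # matches d_delta lie on the TDoA hyperbola.
    dd = np.hypot(X - rx1[0], Y - rx1[1]) - np.hypot(X - rx2[0], Y - rx2[1])
    p_tdoa = np.exp(-0.5 * ((dd - d_delta) / sigma_d) ** 2)
    # Angular deviation of every cell from the AoA estimate at RX 1.
    ang = np.arctan2(Y - rx1[1], X - rx1[0])
    diff = np.angle(np.exp(1j * (ang - aoa)))        # wrap to [-pi, pi]
    p_aoa = np.exp(-0.5 * (diff / sigma_a) ** 2)
    p = p_tdoa + p_aoa                                # combined probability map
    best = p >= p.max() - 1e-9                        # highest-probability cells
    return np.array([X[best].mean(), Y[best].mean()])

rx1, rx2 = np.array([0.0, 0.0]), np.array([2500.0, 300.0])
print(fuse_tdoa_aoa(rx1, rx2, d_delta=-900.0, aoa=np.deg2rad(35.0),
                    bbox=(-300, 3000, -300, 3300)))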

4 Results

In this section, we list and discuss the location estimation errors of TDoA as well
as the combination of TDoA and AoA, which we calculated for each of the 10 000
simulated messages as described in Sect. 3. As can be seen in Table 1 and Fig. 3,
the TDoA location estimations using only 2 gateways result in a mean error of
548 m, a median error of 530 m and a 95th percentile error of 1010 m. A significant
improvement was achieved by combining TDoA and AoA, with a mean error of
399 m, a median error of 337 m and a 95th percentile error of 937 m.
These results are calculated for 8537 messages, as it was not possible to
obtain a TDoA location estimate for the remaining 1463 messages. Figure 4
illustrates the spatial spread of the TDoA localization errors, and indicates TX
locations where localization was not possible with blue circles. We see here that
most of these locations are in the vicinity of their reference gateway. The closer
a transmission is located near its reference gateway, the higher the risk that
the distance estimate between the transmitter and the non-reference gateway
is larger than the actual distance between both gateways. In this case, a TDoA
hyperbola can not be drawn because the estimated difference in distance between
the transmitter and both gateways is negative. Consequently, we were not able to
Fig. 2. This example visually explains how we combine the TDoA and AoA estimates
to obtain a location estimate. (a) TDoA probability density map. (b) AoA probability
density map. (c) Final location estimation by combining the TDoA and AoA maps

calculate a TDoA + AoA location estimate for these messages. A viable solution
for this issue could be to combine the AoA estimate with RSS ranging instead
of TDoA, which will be evaluated in our future work.

Fig. 3. This figure displays the cumulative distribution of both the TDoA results and
the TDoA + AoA combination results.

Table 1. Numerical results of the simulations

           | Mean error [m] | Median error [m] | 95th percentile [m]
TDoA       | 548            | 530              | 1010
TDoA + AoA | 399            | 337              | 937

Fig. 4. This figure shows the spatial spread of the TDoA localization errors. The color
of a circle indicates the magnitude of the localization error on that TX location. Blue
circles indicate that it was not possible to obtain a location estimate for that TX
location. Red markers indicate the gateway locations.

5 Conclusion
Generally, TDoA localization calculates multiple hyperbolas using at least four
gateways to obtain a location estimate [4]. In this research, we illustrate that it
is still possible to realize TDoA localization with only two receiving gateways,
although this approach yields a mean localization error of more than 500 m. In
some cases, TDoA localization was not possible due to the relative geometric
positions of the transmitter and the gateways. This can be resolved by applying
RSS ranging instead, which is something we will evaluate in our future work. In
the second part of our method, a single AoA estimate is added to the TDoA esti-
mate, which leads to a significant improvement with a mean error of 399 m. Com-
pared to other research on localization with LoRaWAN, the estimation errors
that we report are relatively high [3,11]. Nonetheless, these methods rely on
Bayesian filtering and sensor readings to improve the raw TDoA estimate. Our
methods are solely based on the physical characteristics of a received LoRaWAN
message, and can be further improved with similar optimizations that were pre-
sented in the aforementioned research. Also, our approach has the advantage
that it can be implemented in a LoRaWAN network with only two receiving
gateways if one of the gateways is able to detect the AoA of a received signal.
Furthermore, we will research more possible optimizations to our methods
in our future work. Instead of simulations, we will evaluate our approach with
real LoRaWAN data, using the latest version of an openly available LPWAN
dataset [2]. A sub-GHz AoA unit will be installed to assess the feasibility of
detecting the direction of a received LoRaWAN signal in a large urban area [5].
We expect this to be a challenging facet of our research, as multipath reflections
have a substantial effect in NLoS environments. Also, we will appraise the effect
of the amount of gateways as well as their locations on our methods. Lastly, the
results of our experiments with real LoRaWAN data will be compared to the
performance of other state of the art LPWAN localization methods.

References
1. Aernouts, M., Bellekens, B., Berkvens, R., Weyn, M.: A comparison of signal
strength localization methods with sigfox. In: 2018 15th Workshop on Position-
ing, Navigation and Communications (WPNC), pp. 1–6. IEEE (2018). https://
doi.org/10.1109/WPNC.2018.8555743
2. Aernouts, M., Berkvens, R., Van Vlaenderen, K., Weyn, M.: Sigfox and LoRaWAN
datasets for fingerprint localization in large urban and rural areas. Data 3(2), 13
(2018). https://doi.org/10.3390/data3020013
3. Bakkali, W., Kieffer, M., Lalam, M., Lestable, T.: Kalman filter-based localiza-
tion for Internet of Things LoRaWANTM end points. In: 2017 IEEE 28th Annual
International Symposium on Personal, Indoor, and Mobile Radio Communica-
tions (PIMRC), vol. 2017-October, pp. 1–6. IEEE (2017). https://doi.org/10.1109/
PIMRC.2017.8292242
4. Bensky, A.: Wireless Positioning Technologies and Applications, 2nd edn. Artech
House (2016)
5. Bnilam, N., Joosens, D., Steckel, J., Weyn, M.: Low cost AoA unit for IoT applica-
tions. In: 2019 13th European Conference on Antennas and Propagation (EuCAP),
pp. 3–7. European Association on Antennas and Propagation, Krakow (2019)
6. BniLam, N., Steckel, J., Weyn, M.: Synchronization of multiple independent sub-
array antennas for IoT applications. In: 12th European Conference on Antennas
and Propagation (EuCAP 2018), p. 125. Institution of Engineering and Technology
(2018). https://doi.org/10.1049/cp.2018.0484
7. BniLam, N., Steckel, J., Weyn, M.: Synchronization of multiple independent subar-
ray antennas: an application for angle of arrival estimation. IEEE Trans. Antennas
Propag. 67(2), 1223–1232 (2019). https://doi.org/10.1109/TAP.2018.2880014
8. Fargas, B.C., Petersen, M.N.: GPS-free geolocation using LoRa in low-power
WANs. In: 2017 Global Internet of Things Summit (GIoTS), pp. 1–6. IEEE (2017).
https://doi.org/10.1109/GIOTS.2017.8016251
9. Mekki, K., Bajic, E., Chaxel, F., Meyer, F.: A comparative study of LPWAN
technologies for large-scale IoT deployment. ICT Express 5(1), 1–7 (2019). https://
doi.org/10.1016/j.icte.2017.12.005
10. Podevijn, N., Plets, D., Aernouts, M., Berkvens, R., Martens, L., Weyn, M., Joseph,
W.: Experimental TDoA Localisation in real public LoRa networks (accepted). In:
10th International Conference on Indoor Positioning and Indoor Navigation, IPIN
2019, Pisa, Italy (2019)
11. Podevijn, N., Plets, D., Trogh, J., Martens, L., Suanet, P., Hendrikse, K., Joseph,
W.: TDoA-based outdoor positioning with tracking algorithm in a public LoRa
network. Wirel. Commun. Mob. Comput. 2018, 1–9 (2018). https://doi.org/10.
1155/2018/1864209
12. Raza, U., Kulkarni, P., Sooriyabandara, M.: Low power wide area networks: an
overview. IEEE Commun. Surv. Tutorials 19(2), 855–873 (2017). https://doi.org/
10.1109/COMST.2017.2652320
13. Sallouha, H., Chiumento, A., Pollin, S.: Localization in long-range ultra narrow
band IoT networks using RSSI. In: IEEE International Conference on Communi-
cations (2017). https://doi.org/10.1109/ICC.2017.7997195
14. Steckel, J., Laurijssen, D., Schenck, A., BniLam, N., Weyn, M.: Low-cost hard-
ware platform for angle of arrival estimation using compressive sensing. In: 12th
European Conference on Antennas and Propagation (EuCAP 2018), p. 222. Institu-
tion of Engineering and Technology (2018). https://doi.org/10.1049/cp.2018.0581
Localization Accuracy Performance
Comparison Between LTE-V
and IEEE 802.11p

Rreze Halili, Maarten Weyn, and Rafael Berkvens(&)

Faculty of Applied Engineering – IDLab, University of Antwerp – imec,


Antwerp, Belgium
{rreze.halili,maarten.weyn,
rafael.berkvens}@uantwerpen.be

Abstract. Most of the vehicular industry applications require precise, reliable,


and secure positioning with cm level accuracy. IEEE 802.11p and LTE-V
exploit technological solutions to achieve vehicle-to-vehicle (V2V), vehicle-to-
infrastructure (V2I), vehicle-to-pedestrian (V2P), or vehicle-to-everything (V2X)
communication. This paper presents the achievable localization accuracy of
IEEE 802.11p and LTE-V. The Cramer Rao Lower Bound for time difference of
arrival localization is computed for two different vehicular network scenarios to
determine the accuracy bounds of the technologies. Measurements are simu-
lated assuming additive white Gaussian noise channel with a variance that
depends on the geometry of the vehicular network. The simulation results show
that IEEE 802.11p outperforms LTE-V for the conditions considered in this
work. In addition to this, having network sites at both sides of the highway and
considering the geometry between vehicles and network sites improves vehicle
localization accuracy.

1 Introduction

There is a high demand for more secure, reliable, and efficient transport, which will
result in a reduction of road accidents, enhancement of road safety and traffic man-
agement. This is the reason why there is a need for high connectivity between vehicles
and road infrastructure, so they can observe what is happening around them, foresee
what will happen next, and take protective actions accordingly [1]. Different use cases
such as automated overtake, cooperative collision avoidance, or high-density pla-
tooning demand high-accuracy positioning, high reliability, and real-time response of
the location information [2].
Over the last decades, GNSS receivers have experienced huge performance improvements.
However, a standalone GNSS receiver still cannot provide accurate location infor-
mation in dense and indoor environments, such as tunnels, urban canyons, and dense
forests. In addition, a GNSS receiver needs at least four GNSS satellites, often requires
Line-of-Sight (LOS) conditions and satellite clock synchronization, and has to
overcome orbit errors, receiver noise, and multipath effects of the signals [3].


The position provided by GNSS receivers can be corrected using ground-based
reference stations, as in Differential GNSS (DGNSS) and Real-Time Kinematic (RTK)
correction; however, correction stations always need to be available in every area to
support the GNSS receivers.
Various studies have analyzed Long Term Evolution (LTE) as a comple-
mentary system to GNSS. The achievable localization accuracy is assessed by analyzing
the LTE positioning reference signal (PRS) and by using Time of Arrival (TOA) esti-
mation algorithms on LTE downlink signals, based on measurements from multiple
base stations [4–6].
IEEE has released IEEE 802.11p known as WAVE in US and ETSI ITS-G5 in
Europe, for vehicular ad-hoc networks (VANETs), which is a collection of mobile ad
hoc networks for vehicular communications and includes categories such as vehicle-to-
vehicle (V2V) communications and vehicle-to-infrastructure (V2I) communications [3,
7]. Also, Release 14 of LTE enables communication between vehicle to vehicle (V2V),
vehicle to pedestrian (V2P), vehicle to infrastructure (V2I), vehicle to network (V2N),
or, in one phrase, vehicle to everything (V2X). The standard is referred to as LTE-V [7, 8].
In 3GPP Rel-14 and 3GPP Rel-15, the GNSS system has been adopted as a
synchronization source for vehicle communication and, respectively, to accept signaling
protocols which support GNSS RTK in cellular networks. However, communication
between the user equipment (UE) and the LTE base station requires messages to be
transmitted between them. This causes extra positioning delay and energy consumption
at the UE, since several messages have to be transmitted in order to obtain a position.
Both technologies cellular LTE-V and ad-hoc IEEE 802.11p are considered to be
complementary to each other [9, 10] and not competitive. However, even cooperation
needs a detailed analysis of different parameters which have an impact on their per-
formance. This is why different studies are done in this matter, usually using network
simulators [10–13]. Different parameters, such as delay, reliability, quality of service,
data rate, mobility, coverage, economic size, etc., are in focus. However, there is a lack
of studies in the capability of technologies to localize vehicles using different posi-
tioning algorithms in different challenging propagation environments. Thus, we want to
investigate the achievable localization accuracy of LTE-V and IEEE 802.11p.
Following the preliminary analysis on the physical layers of used signals [10–12],
proposed vehicles network design [14], the Cramer Rao Lower Bound (CRLB) of Time
Difference of Arrival (TDOA) measurements in additive white Gaussian noise
(AWGN) channel is analyzed, by providing numerical results through simulation of
two above mentioned technologies on real vehicular scenario with same propagation
conditions and same frequency band.
TDOA is a localization technique to determine the location of the source by
evaluating the difference in arrival time of the signal at spatially separated base stations
[4]. CRLB is the lower bound on the variance of any unbiased estimator. In our study,
it is used to bound the localization accuracy obtained from a set of noisy TDOA
measurements. It is computed using the likelihood function, in which the covariance of
the measurement vector enters as a factor [16].
There are different studies for CRLB [4, 17–20]. Most of the studies consider a
constant variance which does not depend on the estimated parameter. However, in
practice, the variance as an input in the covariance matrix depends on the source or UE
location and also on the geometry of source and base stations. Therefore, in this study,
we have computed the standard deviation of the TDOA measurements as a function of
the source position. This results in a more realistic value for CRLB.
The rest of this paper is organized as follows. Section 2 provides a vehicular
network design proposed on the Smart Highway project [14], a short description of the
technologies and TDOA opportunities and challenges. CRLB and its deviation are
introduced in Sect. 3. And the simulation results and comparisons are presented in
Sect. 4, while conclusions are summarized in Sect. 5.

2 Vehicular Network System

According to the Smart Highway project [14] in Antwerp, Belgium, a testbed
infrastructure is planned to be deployed along the E313 highway. The testbed infrastructure
consists of interconnected hardware: onboard units (OBUs), roadside
units (RSUs), the backbone, and testbed management software on top of the Smart Highway
platform. Two different technologies will be in focus: LTE-V, implemented on software-defined radio
(SDR) modules, and IEEE 802.11p, implemented on ITS-G5. LTE-V
will enable communication between vehicles (V2V) using PC5 and between vehicles
and infrastructure and pedestrians using cellular interface named Uu. LTE-V utilizes
single-carrier frequency-division multiple access (SC-FDMA) and supports 10 and
20 MHz channels for uplink and orthogonal frequency division multiplexing (OFDM)
at the physical layer for downlink [12]. IEEE 802.11p/ITS-G5 exploits orthogonal
frequency division multiplexing (OFDM) for both uplink and downlink. The entire
signal is wideband (i.e., occupies a 10 MHz channel) [10]. In this simulation, network
sites or RSUs placed along the road will be used to communicate with OBU placed in
the vehicle and this communication will be used for localization of vehicles. Two
different scenarios are planned for the placement of the RSUs in the highway.
For communication between RSUs and OBU, we consider line of sight (LoS) and
non-line of sight (NLOS) conditions. We analyze the uplink signals from one OBU or
emitter to the RSUs, which are placed within a 500 m range.
The antenna of each network site is formed by a uniform linear array of Ma
elements with known orientation, while the OBU antenna is omnidirectional.
To comply with TDOA, the network sites are synchronized to a reference clock,
while the vehicle has its own clock. Since TDOA does not require synchronization
between the network sites and the vehicle, these assumptions are valid. We consider a
single vehicle or OBU placed at x_k = [x_k \; y_k]^T at time t_k. Meanwhile, we assume that the
positions of the network sites or RSUs are given by s_k^{(i)} = [x_k^{(i)} \; y_k^{(i)}]^T, where i = 1, 2, 3, …,
M, depending on the number of network sites. Every RSU measures the time of arrival
of the uplink signal received from the OBU and multiplies it by the speed of light to
obtain the distance between the OBU and the RSU. The Euclidean distance between the OBU
and RSU i is computed as

r_i = \sqrt{(x - x^{(i)})^2 + (y - y^{(i)})^2}.   (1)

To estimate the two-dimensional position of the emitter, we have to know the range
differences between the reference network site and the other sites. The difference is
found by considering the reference network site, which is the most powerful base
station, and each of the other sites:

z_{1j,k} = h_{1j,k}(x_k) + u_{1j,k}   (2)

The above expression shows that the resulting range differences z_{1j,k} are the sum of
the exact range differences h_{1j,k} and additive white Gaussian noise (AWGN) with per-site
standard deviation \sigma_i, variance \sigma_{1j}^2 = \sigma_1^2 + \sigma_j^2, and the joint conditional Gaussian
covariance matrix defined below, with j = 2, 3, …, M:

R = \begin{bmatrix} \sigma_1^2 + \sigma_2^2 & \cdots & \sigma_1^2 \\ \vdots & \ddots & \vdots \\ \sigma_1^2 & \cdots & \sigma_1^2 + \sigma_M^2 \end{bmatrix}.   (3)
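As a small illustration, the covariance matrix of (3) can be constructed in a few lines of Python; the function name and the example standard deviations are ours.

import numpy as np

def tdoa_covariance(sigmas):
    """sigmas: per-gateway TOA standard deviations [m], sigmas[0] = reference site."""
    sigmas = np.asarray(sigmas, dtype=float)
    m = len(sigmas) - 1                          # number of range differences
    R = np.full((m, m), sigmas[0] ** 2)          # common reference term sigma_1^2
    R[np.diag_indices(m)] += sigmas[1:] ** 2     # add sigma_j^2 on the diagonal
    return R

print(tdoa_covariance([3.0, 4.0, 5.0, 6.0]))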

3 Cramer Rao Lower Bound for TDOA

Localizing the emitter or vehicle using TDOA requires a number of signals to be


transmitted by OBU to RSUs. The estimated position will always be subject to some
uncertainty because the received waveforms will be affected by random phenomena
such as noise, fading, shadowing, multipath, and non-line of sight propagations, and
interference [4].
Considering all the above-mentioned factors which have an impact on the accuracy of
TDOA, the accuracy of this two-dimensional position estimate can be characterized by the
root-mean-square error (RMSE) of the position estimate \hat{x}, or bounded by the
Cramer Rao Lower Bound (CRLB). The position error in metres is

\epsilon_x = \sqrt{E\left[\lVert \hat{x} - x \rVert^2\right]} \geq \sqrt{\mathrm{tr}\{\mathrm{CRLB}(x)\}},   (4)

where E is the expected value and tr is the trace operator.


Cramer Rao Lower Bound is used to determine the lower bound of the achievable
positioning accuracy [16]. This lower bound can be used to determine the network
design guidelines for accurate vehicle localization [19].
The CRLB is computed from the inverse of the Fisher Information Matrix (FIM) J
defined in [16]. Following the derivation of the CRLB in an AWGN channel given
in [16], the CRLB is expressed as

\mathrm{CRLB}(x) = \left(D^T R^{-1} D\right)^{-1},   (5)

where

D = \begin{bmatrix} \frac{x - x_1}{d_1} - \frac{x - x_2}{d_2} & \frac{y - y_1}{d_1} - \frac{y - y_2}{d_2} \\ \vdots & \vdots \\ \frac{x - x_1}{d_1} - \frac{x - x_M}{d_M} & \frac{y - y_1}{d_1} - \frac{y - y_M}{d_M} \end{bmatrix},   (6)

and R is defined in (3).


The application of the CRLB for TOA and TDOA is found in different scenarios in
[4, 17–19, 21]. In [4, 17, 19] the covariance matrix R does not depend on the vehicle
position, which results in an approximated constant covariance matrix. However, in
this study, we have included the position of the vehicle in the covariance matrix, so no
constant variance for different measurements is considered. In order to model this
covariance matrix for both systems LTE-V and IEEE 802.11p, standard deviation and
variance of the TDOA should be defined.
According to [4] and [19] and the explanations above, the variance of a TDOA mea-
surement is the sum of the variances of the associated TOA measurements. This
parameter is defined in the expression below and depends on the frequency f, the relative
power weight p_k^2 of subcarrier k, the number of subcarriers N, the subcarrier spacing
F_{sc}, the signal bandwidth B, and the signal-to-noise ratio \mathrm{SNR}_j:

\sigma_{j,\mathrm{TOA}}^2 = \frac{c^2}{8\pi^2 M_a N_m \, \mathrm{SNR}_j \, F_{sc}^2 \sum_{k \in N} p_k^2 k^2}   (7)

N_m is the total number of measurements and M_a is the number of antenna elements.
In (7), \mathrm{SNR}_j is defined in dB as

\mathrm{SNR}_j = P_{\max} - \mathrm{SF}_j - N - L_j + 147 - 10\log(N_F) - 10\log(B)   (8)

L_j = \max\left(\mathrm{PL}(d_j) - G_t - G_r,\ \mathrm{MCL}\right)   (9)

where P_{max} is the maximum transmit power, SF_j is the shadow fading, N_F is the receiver
noise figure, B is the signal bandwidth, G_t and G_r are the transmitter and receiver antenna
gains, and MCL is the minimum coupling loss between the OBU and the RSU [4, 19]. PL(d_j)
are the macroscopic path losses defined by the WINNER models in [22].
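To make (7) concrete, the short Python sketch below evaluates the TOA standard deviation in metres, taking the per-link SNR as an input (as it would be produced by the link budget in (8) and (9)). The 10 dB SNR value and the unit power weights are assumptions for illustration; the subcarrier numerologies follow Table 1.

import numpy as np

C = 299_792_458.0   # speed of light [m/s]

def sigma_toa(snr_db, f_sc, n_subcarriers, ma=1, nm=1):
    """Standard deviation of a TOA measurement in metres, following (7)."""
    snr = 10.0 ** (snr_db / 10.0)               # dB to linear scale
    k = np.arange(1, n_subcarriers + 1)
    weighted_sum = np.sum(1.0 * k ** 2)         # unit power weights p_k^2 = 1
    var = C ** 2 / (8 * np.pi ** 2 * ma * nm * snr * f_sc ** 2 * weighted_sum)
    return np.sqrt(var)

# LTE-V numerology (15 kHz spacing, 12 subcarriers) vs. IEEE 802.11p
# (156.25 kHz spacing, 64 subcarriers), both at an assumed SNR of 10 dB.
print(sigma_toa(10.0, 15e3, 12), sigma_toa(10.0, 156.25e3, 64))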
The used WINNER propagation model, denoted D1, represents radio propagation in
large rural areas (up to 10 km) with low building density. The height of the transmit
antenna is typically between 20 m and 70 m, much higher than the average building
height, while the receiver antenna is assumed to be located inside a building or a
vehicle moving at a velocity between 0 and 200 km/h [22]. The considered model allows
transitions between different propagation conditions, such as LOS and NLOS. For our
rural environment, this transition is made using the WINNER radio channel models for
the probability of LOS.
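For reference, the bound in (4)-(6) can be evaluated numerically as in the following Python sketch, which builds the Jacobian D for the range differences, inverts the Fisher information D^T R^{-1} D and takes the square root of its trace. The RSU layout and the per-site standard deviations in the example are illustrative values, not the parameters of the scenarios studied below.

import numpy as np

def crlb_position_error(x, rsus, sigmas):
    """Lower bound on the 2D position RMSE at vehicle position x."""
    x = np.asarray(x, dtype=float)
    rsus = np.asarray(rsus, dtype=float)
    d = np.linalg.norm(x - rsus, axis=1)                 # distances to all RSUs
    u = (x - rsus) / d[:, None]                          # unit vectors to each RSU
    D = u[0] - u[1:]                                     # rows: gradient of h_1j = d_1 - d_j
    sigmas = np.asarray(sigmas, dtype=float)
    m = len(sigmas) - 1
    R = np.full((m, m), sigmas[0] ** 2) + np.diag(sigmas[1:] ** 2)   # covariance of (3)
    crlb = np.linalg.inv(D.T @ np.linalg.inv(R) @ D)
    return np.sqrt(np.trace(crlb))

rsus = [[0, 0], [200, 15], [400, 0], [600, 15]]          # example one-sided layout [m]
print(crlb_position_error([300.0, 7.5], rsus, [3.0, 3.0, 3.0, 3.0]))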

4 Simulation Results

The limits of the position accuracy in this work are obtained for a segment of the two-way
E313 highway in Antwerp, shown in Fig. 1. Figure 1(a) shows the first five
RSUs placed periodically on both sides of the highway, while Fig. 1(b) shows the first
four RSUs placed on one side of the highway only. The inter-site distance, i.e. the
distance between RSUs, varies.

Fig. 1. Map of the environment of the simulated area where RSUs are placed on two sides of the
E313 highway (a), and on one side of the E313 highway (b). © 2018 Google

For the first scenario, we have considered 30 RSUs, and for the second scenario, we
have considered 15 RSUs. The maximum coverage distance for each RSU is 0.5 km
for both technologies. This number is approximated from real measurements performed
on the E313 highway.
To perform TDOA, communication with at least four RSUs is needed. In the
simulated scenarios, the OBU communicates with all RSUs placed within a 500 m range.
Then, depending on the geometry between the vehicle and the RSUs and on the
propagation conditions, the OBU has to switch between LOS and NLOS commu-
nication with the RSUs. This network design avoids signal interference in time and
frequency resources.
While moving, the OBU changes its reference RSU based on the time of com-
munication with the RSUs.
For IEEE 802.11p, an OFDM physical-layer signal with a bandwidth of 10 MHz, a
carrier frequency of 5.9 GHz, and a maximum power of 23 dBm is considered. The same
holds for LTE-V, except that SC-FDMA is considered. Other parameters required for the
simulation are shown in Table 1.

Table 1. Simulation parameters of the vehicular network scenarios for LTE-V and IEEE 802.11p

Parameter             | LTE-V         | IEEE 802.11p
Pmax                  | 23 dBm        | 23 dBm
F                     | 5.9 GHz       | 5.9 GHz
Gt                    | 3 dBi         | 3 dBi
Gr                    | 8 dBi         | 8 dBi
ht                    | 53.35 m       | 53.35 m
hr                    | 1.5 m         | 1.5 m
B                     | 10 MHz        | 10 MHz
Fsc                   | 15 kHz        | 156.25 kHz
N                     | 12            | 64
SF                    | 6 dB and 8 dB | 6 dB and 8 dB
NF                    | 9 dB          | 9 dB
Nm                    | 1             | 1
Ma                    | 1             | 1
MCL                   | 70 dB         | 70 dB
p2(n)                 | 1             | 1
Modulation for uplink | SC-FDMA       | OFDM

Figure 2 shows the CRLB results for LTE-V for RSUs placed on both sides of the
highway (a) and RSUs placed on one side of the highway (b). We note that both
measurements are done under the same propagation conditions. The OBU is placed on
a vehicle driving in the same lane and all the above parameters are the same;
therefore, the only difference is the number of RSUs included in the computation of
TDOA.
We can see that when the number of RSUs is larger, i.e. when they are placed on both
sides of the highway as shown in Fig. 2(a), the maximum error can reach 1283.72 m. On the
other hand, when computing the CRLB considering one-sided RSUs, shown in Fig. 2(b), the
maximum error reaches 87.47 m.
Figure 2(c) and (d) show the CRLB results for IEEE 802.11p, again considering
RSUs placed on both sides of the highway (c) and RSUs placed on one side of the
highway (d). The only difference while obtaining TDOA is the number of RSUs
included in the study; all other parameters stay the same as presented in the table
above. We can see that when the number of RSUs is larger and they are placed
on both sides of the highway, as shown in Fig. 2(c), the maximum error is 0.0069 m.
On the other hand, when computing the CRLB considering one-sided RSUs, shown in
Fig. 2(d), this error is 0.00307 m. As shown in Fig. 2, the maximum error is found for
LTE-V when two-sided RSUs are in use.

Fig. 2. Position error maps computed using CRLB for TDOA for LTE-V and IEEE 802.11p

However, while computing TDOA using two-sided RSUs over all included vehicle
positions, we could obtain the CRLB for 85.08% of the locations for LTE-V, and for
84.21% of the locations for IEEE 802.11p. On the other hand, obtaining the CRLB for
LTE-V and IEEE 802.11p with one-sided RSUs is possible for just 23.68% and 27.19%
of the vehicle locations, respectively. This is shown in more detail in Fig. 3. The vehicle
positions where we could not obtain values for the CRLB are those where the vehicle
could not communicate with four RSUs to compute TDOA. In addition to this, LTE-V
implemented on two-sided RSUs could achieve cm-level accuracy for 81.57% of the
locations, while using only one-sided RSUs this value is 19.30%.
For IEEE 802.11p, the values are of the same order: we have cm-level accuracy for
84.21% of the cases using two-sided RSUs, while for one-sided RSUs we obtain this for
27.19% of the cases.
Comparing LTE-V and IEEE 802.11p, we see that the localization error achieved
using LTE-V is higher than that of IEEE 802.11p for both scenarios.
The missing points in all figures are cases where the number of RSUs in coverage is
less than four, or where the covariance matrix shown in (3) is singular.
The first issue can be solved by adding more RSUs along the highway. According to
[19], for a rural macro-cell, to ensure at least three LoS RSUs in 95% of the cases, the
inter-site distance should be set to 200 meters and the roadside separation to 15 meters.
The effect of the second factor is reduced when the measurement noise resulting from
the OBU location is ignored; according to [4, 17], the values for the variances in the
covariance matrix are then constant and do not depend on the position of the vehicle.

Fig. 3. Values of Cramer Rao Lower Bound for TDOA scenarios for LTE-V and IEEE 802.11p

5 Conclusion

In this paper, a comparison between LTE-V and IEEE 802.11p is made to analyze the
achievable positioning accuracy, using the CRLB in a vehicular network of multiple base
stations with Time Difference of Arrival (TDOA) measurements in an additive white
Gaussian noise channel. This work accounts for the geometry between the base stations,
referred to as roadside units (RSUs), and the vehicle, referred to as the onboard unit
(OBU), by considering a distance-dependent noise variance.
The analyses are completed in a simulated environment considering a planned
testbed along the E313 highway in Antwerp, Belgium. Two scenarios are studied: one
with two-sided RSUs, i.e. RSUs placed on both sides of the highway, and one with
one-sided RSUs. LTE-V and IEEE 802.11p both operate with a 10 MHz bandwidth at a
frequency of 5.9 GHz. According to the results, the best performance is achieved by
IEEE 802.11p when two-sided RSUs are taken into consideration.
For IEEE 802.11p, we could obtain cm level accuracy for 84.21% of positions in
the simulated highway using two-sided RSUs, while for one-sided RSUs we could
obtain cm level accuracy for 27.19% of cases. On the other hand, LTE-V implemented
on both sides of the highway performs with cm level accuracy for 81.57% of cases,
while for base stations placed just on one side of the highway we have cm level
accuracy for 19.29% of cases. The percentage of accuracy includes all positions which
are considered in the study. The missing points in all figures are cases when the number
of RSUs included in coverage is less than four, and the covariance is a singular matrix.
Future work aims at analyzing the dilution of precision and the position of the
vehicle relative to the RSUs where the CRLB reaches its maximum values. Inter-cell
interference is another parameter to be included in future CRLB computations. Finally,
we will compare the WINNER model with other models in order to see which of them
best predicts the losses in the selected environment.

References
1. V2X White Paper by NGMN Alliance, July 2018
2. 5G-PPP, 5G automotive vision, White Paper, October 2015
3. Fascista, A., Ciccarese, G., Coluccia, A., Ricci, G.: A localization algorithm based on V2I
communications and AOA estimation. IEEE Signal Process. Lett. 24, 126–130 (2016)
4. del Peral-Rosado, J., López-Salcedo, J., Seco-Granados, G., Zanier, F., Crisci, G.:
Achievable localization accuracy of the positioning reference signal of 3GPP LTE. In:
Proceedings of International Conference on Localization and GNSS, June 2012
5. Rosado, J., Arboleda, M., Zanier, F., Granados, G., Salcedo, J.: Performance limits of V2I
ranging localization with LTE networks. In: 14th Workshop on Positioning, Navigation and
Communications (WPNC) (2017)
6. Driusso, M., Marshall, Ch., Sabathy, Ch., Knutti, F., Mathis, H., Babich, F.: Vehicular
position tracking using LTE signals. IEEE Trans. Veh. Technol. 66(4), 3376–3391 (2017)
7. Chen, Sh., Hu, J., Shi, Y., Peng, Y., Fang, J., Zhao, R., Zhao, L.: Vehicle-to-everything
(V2X) services supported by LTE-based systems and 5G. IEEE Commun. Stand. Mag. 1,
70–76 (2017)
8. https://www.3gpp.org/
9. Dressler, F., Kargl, F., Ott, J., Tonguz, O., Wischhof, L.: Research challenges in
intervehicular communication: lessons of the 2010 Dagstuhl seminar. IEEE Commun. Mag.
49(5), 158–164 (2011)
10. Cecchini, G., Bazzi, A., Masini, B., Zanella, A.: Performance comparison between IEEE
802.11p and LTE-V2V in-coverage and out-of-coverage for cooperative awareness.
In: IEEE Vehicular Networking Conference (VNC), Bologna, Italy (2017)
11. Chen, S., Hu, J., Shi, Y., Zhao, L.: LTE-V: A TD-LTE-based V2X solution for future
vehicular network. IEEE Internet Things J. 3(6), 997–1005 (2016)
12. Masegosa, R., Gozalvez, J.: LTE-V for sidelink 5G V2X vehicular communications: a new
5G technology for short-range vehicle-to-everything communications. IEEE Veh. Technol.
Mag. 12(4), 30–39 (2017)
13. Bazzi, A., et al.: On the performance of IEEE 802.11p and LTE-V2V for the cooperative
awareness of connected vehicles. IEEE Trans. Veh. Technol. 66(11), 10419–10432 (2017)
14. Project: Smart Highway
15. Kim, S., Chong, J.: An efficient TDOA-based localization algorithm without synchronization
between base stations. Hindawi Publishing Corporation Int. J. Distrib. Sens. Netw. 11,
832351 (2015)
16. Kay, S.: Fundamentals of Statistical Signal Processing: Estimation Theory. Prentice-Hall
PTR (1993–1998)
17. Kaune, R., Horst, J., Koch, W.: Accuracy analysis for TDOA localization in sensor
networks. In: Proceedings of IEEE Fusion, July 2011
18. Chan, Y., Ho, K.: A simple and efficient estimator for hyperbolic location. IEEE Trans. Sign.
Process. 42(8), 1905–1915 (1994)
19. del Peral-Rosado, J., Seco-Granados, G., Kimy, S., Lopez-Salcedo, J.: Network design for
accurate vehicle localization. Euro Cost (2018)
20. Chang, Ch., Sahai, A.: Estimation Bounds for Localization. IEEE (2004)
21. Han, Y., Shen, Y., Zhang, X., Win, M., Meng, H.: Performance limits and geometric
properties of array localization. IEEE Trans. Inf. Theor. 62(2), 1054–1075 (2016)
22. IST-4-027756 WINNER II D1.1.2 V1.2 WINNER II Channel Models Part I Channel
Models (2008)
Online Reverse Engineering of CAN Data

Jens de Hoog(B) , Nick Castermans, Siegfried Mercelis, and Peter Hellinckx

IDLab - Faculty of Applied Engineering, University of Antwerp - imec,


Sint-Pietersvliet 7, 2000 Antwerp, Belgium
{jens.dehoog,siegfried.mercelis,peter.hellinckx}@uantwerpen.be,
nick.castermans@student.uantwerpen.be

Abstract. Modern cars contain numerous sensors that provide useful


data in many different situations, but the interpretation of that data is
cumbersome due to the different implementations of the Controller Area
Network (CAN) messaging system. Hence, reverse engineering is needed
in order to give sense to the internal sensor data of the car. Currently,
reverse engineering of CAN data is an ongoing topic in research, but
no method has been proposed yet to perform online reverse engineering.
Therefore, this paper presents two methodologies. The first one elabo-
rates on the online analysis of continuous signals, while the second one
focuses on the reverse engineering of user-based signals, such as direc-
tion indicators and light switches. The results show that more research
is needed to thoroughly benchmark those methods against the current
State of the Art. However, as the results are promising, this paper paves
the way towards a more scalable solution for reverse engineering in future
applications.

1 Introduction
Nowadays, cars are getting more and more advanced, with a variety of sensors
and complex algorithms. This means that cars are capable of measuring many
different sorts of data in various environments and circumstances. Examples are
lane-following mechanisms on highways, detection of pedestrians in urban areas,
detection and prevention of slipping on slippery roads, etc. Within the Mobisense
project, the quality of roads and their environments is determined using both
acoustic and air quality measurements. Although the project makes use of modern
personal cars, the measurements are carried out by external sensors, such as
smartphones. However, to calibrate those external sensor units, a trustworthy
reference is needed. Hence, the data originating from the car itself is used as
reference.
The data used in a car is shared among different sensors and actuators via
the Controller Area Network (CAN) protocol. This protocol is standardised: the
messages have the same structure, containing an integer identifier (ID) (11 bits,
but extendible to 29 bits) and a data field of 64 bits. This ID also determines the
priority of the message: the lower the ID, the higher the priority. Although the
use of an ID is standard, manufacturers are free to choose which component or

message type corresponds to which ID. Additionally, manufacturers mostly split
the 64 data bits up into sections of bytes, but they are actually also free to choose
the layout of the entire data field. Finally, the implementation of IDs and data
fields varies between manufacturers, if not between car models of a single
manufacturer.
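To make this structure concrete, the following minimal Python sketch shows how a raw frame with an identifier and up to 8 data bytes might be represented; the Linux SocketCAN wire format and all names are assumptions of this illustration, not part of the original work.

```python
import struct
from dataclasses import dataclass

@dataclass
class CanFrame:
    can_id: int   # 11-bit standard or 29-bit extended identifier
    data: bytes   # up to 8 payload bytes; the layout is manufacturer specific

def parse_socketcan_frame(raw: bytes) -> CanFrame:
    # Linux SocketCAN packs a frame as: 4-byte id, 1-byte DLC, 3 padding bytes, 8 data bytes.
    can_id, dlc = struct.unpack_from("<IB", raw)
    can_id &= 0x1FFFFFFF              # drop the EFF/RTR/ERR flag bits
    return CanFrame(can_id, raw[8:8 + dlc])
```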
The On-Board Diagnostics (OBD) standard is built on top of the CAN pro-
tocol. It is mainly built as an interface for service centers to acquire emission
related data, such as fuel consumption, engine RPM and presence of exhaust
gases. Those parameters are identified using standardised Parameter Identifica-
tions (PIDs). OBD implements a polling mechanism to obtain the values for
those PIDs. A request with the desired PID is sent to the controller, after which
it replies with the data for that PID. This polling mechanism is easy to use, but its
simplicity is also its biggest drawback. To acquire data at a relatively high frequency
(e.g. 20 Hz or more), twice as many messages have to be sent back and forth over the
CAN bus. In combination with the low-priority IDs that those messages have, the timing
is also practically unpredictable. Finally, manufacturers are free to choose which PIDs
are implemented and which are not, so obtaining internal data of the car still differs
between manufacturers.
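As an illustration of this polling overhead, the sketch below requests one emission-related PID (engine RPM, PID 0x0C) and decodes the reply; the use of the python-can library and the channel name "can0" are our assumptions, and acquiring a 20 Hz signal this way would require repeating this request/reply pair twenty times per second.

```python
import can  # python-can; the channel name "can0" is an assumption of this sketch

OBD_REQUEST_ID = 0x7DF   # broadcast request ID used by OBD-II over CAN
PID_ENGINE_RPM = 0x0C

def read_engine_rpm(bus):
    """Poll OBD-II mode 01 / PID 0x0C once and decode the reply."""
    request = can.Message(arbitration_id=OBD_REQUEST_ID, is_extended_id=False,
                          data=[0x02, 0x01, PID_ENGINE_RPM, 0, 0, 0, 0, 0])
    bus.send(request)                 # one extra message on the bus ...
    reply = bus.recv(timeout=1.0)     # ... plus one low-priority reply
    if reply and reply.data[1] == 0x41 and reply.data[2] == PID_ENGINE_RPM:
        return (256 * reply.data[3] + reply.data[4]) / 4.0  # standard scaling for PID 0x0C
    return None

if __name__ == "__main__":
    bus = can.interface.Bus(channel="can0", bustype="socketcan")
    print(read_engine_rpm(bus))
```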
Clearly, it is difficult to obtain and interpret the internal data from a car when
many different implementations of the CAN protocol exist and a high data rate is
required. Therefore, there is a need for reverse engineering methods to interpret
that data in a meaningful way. Various studies exist where data from a car is
used, but some of them make use of OBD or a limited set of cars where the
IDs of CAN are known [2,3,6,9]. Other studies do perform reverse engineering,
but they do so by processing pre-recorded log files [4,5]. No techniques exist
yet to perform reverse engineering in an online or real-time way. Such a technique would
enable developers and researchers to reverse engineer a CAN bus without recording log
files in advance, which in turn leads to a more scalable solution when deploying
algorithms on analysis devices in various cars. Additionally, using those online
methods, one is able to detect and identify discrete signals and user inputs, such
as direction indicators, lights and wipers. This has not been done yet in previous
studies. Therefore, this paper presents a methodology to perform online reverse
engineering of continuous signals, along with a novel approach for identifying
discrete signals.
This paper is organised in the following way. Section 2 elaborates on previous
studies. Section 3 presents a method for online reverse engineering, while Sect. 4
presents a methodology to identify discrete signals. The experimental setup and
results are discussed in Sect. 5. Section 6 concludes the paper and describes future
work.

2 Previous Work
In current research studies where CAN data is used, different methods exist
for acquiring data from a car. In the works of Fugiglando et al. [2], Lestyán

et al. [5], Lin et al. [6] and Sathyanarayana et al. [9], techniques are proposed to
perform driver identification using CAN data. In [2], the needed data is acquired
and provided by Audi AG and Audi Electronics Venture. Therefore, no reverse
engineering of the data is needed as the required identifiers are already known.
In [6] and [9], the data is obtained from the OBD interface inside the car. As the
required PIDs of the OBD protocol are also known, no reverse engineering needs
to take place either. In the work of Lestyán et al. [5], the authors perform reverse
engineering on a recorded log file of acquired data. In this way, they detect and
identify the required signals by themselves to perform the driver identification.
A drawback of the use of OBD in [6] and [9] is the fact that the data is
acquired using a polling mechanism, as explained in Sect. 1. Next, in the works of
[1] and [2], as the data is provided by Audi AG, no reverse engineering of raw
CAN data took place. Therefore, if the proposed work were deployed on
many cars from various manufacturers, reverse engineering would still need to take
place in order to interpret the data. Finally, the works of [5] and [4] do reverse
engineer raw CAN data, but the authors do so via offline logs of multiple rides
of various cars. They reverse engineered a reference car using the Dynamic Time
Warping (DTW) algorithm and extracted useful features from the required signals,
after which those features are searched for in the data of other, unknown cars.
This method is proven to be effective, but it still requires recorded logs. Thus,
if this work were deployed on a variety of cars, logs would have to be made
before the actual analysis can take place.
No methodology exists yet for online reverse engineering of CAN data from a car,
nor for reverse engineering of discrete signals. Such a methodology would make it
possible to install a device in a car that first reverse engineers the data
before it starts analysing it for further purposes.

3 Techniques for Online Reverse Engineering


3.1 Detecting Signals
As stated before, the construction of CAN messages makes the reverse engineer-
ing complex and time-consuming. On top of an already complex structure, the
data of a signal can even be defined at bit level in some cars, leading to an enor-
mous search space. For example, if we take the endianness and signedness into
account for a signal of an unknown length between 8 and 16 bits, the amount of
positions increases enormously to 190800 candidates.
A first attempt to reduce the search space has been recently suggested by
Marchetti et al. [7]. The authors propose a novel algorithm for the extraction
of signal boundaries within a sequence of CAN messages called Reverse Engi-
neering of Automotive Data frames (READ). The algorithm extracts signals by
inspecting every single bit of all the observed CAN messages and evaluates their
evolution over time. More specifically, the bit flip/toggle frequency Bi and its
magnitude log10(Bi) are acquired for each bit from messages with the same CAN ID. As
most signals embedded inside CAN messages represent a physical phenomenon,
these types of signals are limited by physical constraints, such as momentum.

Since most CAN messages are issued on the network by a predefined cycle time
such as ten milliseconds, the difference between successive signal values will be
noticeably small. Hence, by analysing bit-flip frequencies of neighbouring bits,
signal boundaries can be defined.
The authors defined three major heuristics: (i) counters, (ii) cyclic redun-
dancy checks (CRCs), and (iii) physical signals. Counters are signals that are
incremented by one with respect to the previous value of the signal, while CRCs are
used to detect transmission errors. As an extension to the algorithm, we added
two additional heuristics: zeroes/constants and repetitive patterns. Every other
found signal is categorised as a physical signal. These heuristics enable us to
categorise the signals in a more specific way, thus reducing the search space drastically.
We discovered some minor errors during the implementation of the heuristics
defined by Marchetti et al.; the most important one concerns the features used
to detect a counter. Instead of requiring a bit-flip frequency that exactly halves for
each bit, we improved this feature by introducing an error margin on those
frequencies. This problem arose when only part of the counter range is observed
instead of the full range of counter values.
To enable the online execution of this algorithm, it runs every 10 s. A buffer
is implemented to store the temporary CAN data. The results are sent to the
correlation blocks, which are presented in the following subsection.
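The following heavily simplified sketch illustrates the core bit-flip idea behind this step, under our own assumptions: it computes per-bit toggle rates for the buffered payloads of one CAN ID and starts a new candidate signal wherever the rate drops, mirroring the intuition that rates increase towards the least significant bit of a physical signal. The full READ algorithm additionally uses the magnitude log10(Bi) and the counter/CRC heuristics, plus our zero/constant and pattern extensions.

```python
from typing import List

def bit_flip_rates(payloads: List[bytes]) -> List[float]:
    """Fraction of consecutive frames (same CAN ID) in which each of the 64 payload bits toggled."""
    flips = [0] * 64
    for prev, cur in zip(payloads, payloads[1:]):
        # pad to 8 bytes so bit positions stay aligned
        diff = int.from_bytes(prev.ljust(8, b"\x00"), "big") ^ \
               int.from_bytes(cur.ljust(8, b"\x00"), "big")
        for bit in range(64):
            if (diff >> (63 - bit)) & 1:
                flips[bit] += 1
    pairs = max(len(payloads) - 1, 1)
    return [f / pairs for f in flips]

def candidate_boundaries(rates: List[float]) -> List[int]:
    """Start a new candidate signal where the flip rate drops compared to the previous bit."""
    starts = [0]
    for bit in range(1, 64):
        if rates[bit] < rates[bit - 1]:
            starts.append(bit)
    return starts
```

In the online variant described above, these functions would run every 10 s on the buffered payloads of each observed CAN ID.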

3.2 Identifying Signals


As mentioned in the previous section, DTW is used for the identification of CAN
signals. However, this technique has a high computational complexity: the algorithm
requires its internal matrix to be filled for the two given signals, which results in a
complexity of O(n²). Hence, the algorithm is not suitable for the analysis of long time
series in an online context. Fortunately, extensive research has been conducted on
reducing this computational complexity. In this work, we chose the FastDTW
algorithm, proposed by Salvador et al. [8].
This algorithm defines a multilevel approach that offers a linear time and space
complexity (O(n)). It does so by executing three key operations: (i) coarsening,
(ii) projection, and (iii) refinement.
Figure 1 explains the process, showing four different resolutions that were
evaluated during a complete run of the FastDTW algorithm.
The correlation block is implemented in such a way that the online require-
ment of the study is preserved. First, we use FastDTW as algorithm within a
correlation block. This results in a significant performance increase compared to
the usage of DTW. Next, several of those blocks are implemented; all of them
have the same correlation algorithm, but have a different reference signal to cor-
relate with. Therefore, multiple signals can be identified at once. Additionally,
the results of the READ algorithm mentioned in Sect. 3.1 are also incorporated
in these correlation blocks. Thus, the search space in which a correlating signal
can be found is dramatically decreased. Finally, the correlation process is exe-
cuted every 10 s; a buffer is implemented to store the temporary data from CAN
and OBD.
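A possible shape of such a correlation block is sketched below; the choice of the `fastdtw` Python package, the min-max normalization and all names are assumptions of this sketch rather than the authors' implementation.

```python
from fastdtw import fastdtw  # choosing the fastdtw package is our assumption

def normalize(signal):
    """Min-max scale a value series so signals of different ranges can be compared."""
    lo, hi = min(signal), max(signal)
    return [0.0] * len(signal) if hi == lo else [(v - lo) / (hi - lo) for v in signal]

class CorrelationBlock:
    """Ranks buffered candidate signals against one reference (e.g. OBD engine RPM);
    several blocks run in parallel, each with a different reference."""

    def __init__(self, name, reference):
        self.name = name
        self.reference = normalize(reference)

    def rank(self, candidates):
        """candidates maps a (CAN ID, byte range) key to its buffered value series;
        returns (key, DTW distance) pairs, best match first."""
        scores = []
        for key, series in candidates.items():
            distance, _path = fastdtw(self.reference, normalize(series),
                                      dist=lambda a, b: abs(a - b))
            scores.append((key, distance))
        return sorted(scores, key=lambda s: s[1])
```

In the online setting, `rank` would be invoked every 10 s on the buffered candidates that the READ step marked as physical signals.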

Fig. 1. The complete execution of the FastDTW algorithm, showing four different
resolutions. Coarsening is used to create those resolutions, after which the minimal
warp path is found and projected to the next higher resolution. Refinement is applied
by constraining the evaluation to cells in the warp path.

4 Detection and Identification of Discrete Signals

In the previous section, a methodology is presented for online reverse engineer-


ing of continuous signals sent over the CAN bus. To identify those signals, a
reference signal is needed to correlate with. When a candidate signal resembles
the reference the most, it can be seen as an outcome of the identification pro-
cess. However, the correlation process gets more difficult when the requested
signals are discrete switches, toggled by a user (e.g. direction indicators, wipers,
fog lights, etc.): it is difficult to obtain a trustworthy reference signal. In this
section, we propose a methodology that enables us to detect and identify discrete
signals. Note that, in this case, a signal is seen as a particular byte from a CAN
message. Therefore, if a CAN bus consists of 10 different IDs, each containing 8
consecutive bytes, 80 different signals are present on the bus.
First, the user is prompted that a detection process for a certain signal is
about to start. For example, if we want to detect and identify the internal signals
for the wipers, the user is prompted with both a countdown of three seconds and
the following message: “Get ready to toggle the wipers”. During this countdown
phase, the data on the CAN bus is analysed. Several signals are changing due
to internal ongoing processes (e.g. counters, fluctuating values of sensors, etc.).
As the user is expected to not trigger the requested signal, that one should be
constant during these three seconds. The signals that are not constant are put on
a blacklist. Note that a signal is seen as a certain byte inside a CAN message with
a particular ID. Therefore, when the countdown has finished, the list of signals
that were constant form a snapshot of all possible candidates. Since only these
candidates need to be analysed further on, the search space is greatly reduced.
Next, the correlation phase takes place. The user is prompted to toggle or
press the requested switch or button as much as possible. This is again accom-
plished via a countdown and a message, but the countdown is now five seconds;
it gives the user enough time to press or toggle a few times. During these five
seconds, only the non-blacklisted incoming signals are constantly correlated with
their corresponding snapshot. This correlation is carried out by the same algo-
rithm used in Sect. 3.2.
Finally, when the countdown has finished, the correlation coefficient is cal-
culated for each signal with its reference. Normally, a signal that resembles its
reference the most has the highest correlation coefficient. However, in this case,

the requested signal should not resemble its reference; instead, it should differ
as much as possible. Hence, the outcome of this correlation process is the sig-
nal with the lowest correlation coefficient. Additionally, as the algorithm knows
which signal was requested – it asked the user to trigger it – the identification
process has been successfully finished too.
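The sketch below outlines this two-phase procedure under our own assumptions: `read_signals` stands in for the CAN streamer, and a simple absolute deviation from the snapshot replaces the FastDTW-based correlation of Sect. 3.2 for brevity (choosing the candidate that differs most from its constant snapshot is equivalent to choosing the one with the lowest correlation coefficient).

```python
import time
from collections import defaultdict

def record(read_signals, seconds):
    """Collect (CAN ID, byte index) -> observed values for `seconds` seconds.
    `read_signals()` is a placeholder that yields (key, value) pairs for every byte on the bus."""
    samples = defaultdict(list)
    deadline = time.time() + seconds
    while time.time() < deadline:
        for key, value in read_signals():
            samples[key].append(value)
    return samples

def detect_user_signal(read_signals, prompt):
    print("Get ready to toggle the {}...".format(prompt))
    baseline = record(read_signals, 3)                    # countdown phase: build the blacklist
    snapshot = {key: values[0] for key, values in baseline.items()
                if len(set(values)) == 1}                 # only constant signals stay candidates
    print("Toggle the {} as much as possible!".format(prompt))
    active = record(read_signals, 5)                      # user interaction phase
    def deviation(key):                                   # distance from the constant snapshot
        return sum(abs(v - snapshot[key]) for v in active.get(key, []))
    return max(snapshot, key=deviation)                   # least self-similar candidate wins
```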

5 Experiments
In this section, we elaborate on the experimental setup that is used to perform
the experiments. Next, the obtained results are shown, followed by a discussion
on those results.

5.1 Experimental Setup


The CAN data is obtained using a Raspberry Pi 3 Model B+, extended with
a PiCAN 2 shield that provides CAN capabilities. This allows for an effortless
collection of CAN data by connecting the shield via the on-board DB9 connector
to the car.
The software consists of multiple logical blocks: (i) the streamer of CAN data,
originating from the PiCAN 2 shield, (ii) the block containing the READ algo-
rithm mentioned in Sect. 3.1, and (iii) several correlation blocks containing the
algorithm specified in Sect. 3.2, each with a different reference signal. Worth men-
tioning is that block (i) is located on the Raspberry Pi itself; the other blocks (ii)
and (iii) are located on a MacBook Pro 15” (2018). The blocks are developed via
the Distributed Uniform Streaming (DUST) framework, proposed by Vanneste
et al. [10]. Using this framework, the software blocks are interconnected with
each other via the publish/subscribe pattern.
The cars used in the experiments are: (i) a BMW X5 (2014), which has 156
CAN IDs and a publishing rate of 1480 messages per second, (ii) a Citroen C2
(2009), having 27 CAN IDs at a rate of 629 messages per second, (iii) a Mercedes-
Benz A180 (2016) with 68 IDs at a rate of 578 messages per second, (iv) an Opel
Astra J (2012) with 72 IDs at a rate of 1928 messages per second, and (v) a Fiat
Ducato (2018) with 29 IDs at a rate of 938 messages per second. We chose those
cars for the experiments because of the varieties in number of IDs and message
rates. Therefore, the algorithms are tested in different conditions.
To identify a given CAN signal, it has to correlate with a reference signal.
Different sources exist to provide such a reference (e.g. GPS sensors or Inertial
Measurement Units (IMUs)). In this research, we focused on OBD. In this way,
we can obtain trustworthy signals such as engine RPM, vehicle speed and throttle
position.

5.2 Results and Discussion


In the first experiments, we elaborate on the results of the online READ algo-
rithm. We conducted the experiment in three different situations, each with a

different set of received messages, going from 22230 CAN messages to 44460
and 88920 messages. To verify the algorithm on different cars, we conducted an
experiment on the following cars: Citroen C2, Mercedes A180 and Opel Astra J.

Table 1. Results of READ analysis. (a) Analysis on the BMW X5 in three situations:
22230, 44460 and 88920 received messages. (b) Extended analysis on three other cars.

                     BMW X5                                    Citroen C2   Mercedes A180   Opel Astra J
                     Situation 1   Situation 2   Situation 3
Counter              42            40            45            3            11              7
CRC                  104           102           101           4            5               15
Zero                 90            90            97            77           141             220
Constant             281           284           291           24           61              45
Patterns             116           124           148           5            14              6
Physical signal      77            80            77            34           75              103
Total signal count   710           720           759           147          307             396

When looking at the results of the BMW X5, shown in Table 1a, the initial
impression is that the number of signals increases with a higher number
of received messages. Due to the absence of a correct, fully reverse engineered data set,
manual validation was performed on the found signals. We confirm that READ
performs the labelling with very high accuracy. However, further research with a
fully reverse engineered data set is needed to state the contribution of
our additional heuristics with respect to the State of the Art. Despite the high
labelling accuracy, we found one major limitation of the algorithm. Since the
unused parts of a signal are labelled as a constant value, it is unable to detect
the absolute boundaries of those signals. A solution for this is to include the neigh-
bouring constant signal or to extend the boundaries to the closest byte. However,
previous research has shown that signals do not necessarily use entire byte fields
and that the remaining bits are often used as flags. When looking at Table 1b, we
see that the algorithm is also able to detect and label multiple signals of other
cars. Therefore, we can conclude that the algorithm is able to reduce the search space
drastically in comparison with a brute-force mechanism.
In the second experiment, we look at the performance of the correlation
blocks, shown in Fig. 2. In Fig. 2a the used reference signal (i.e. OBD signal of
the engine RPM) is shown. We clearly see two peaks centred around sample 15.
Figure 2b shows the signal that has the highest correlation coefficient regarding
the reference. The two peaks are also clearly visible. The DTW distance calcu-
lated between the reference and the candidate is 0.796295. Figure 2c shows the
second best candidate with a DTW distance of 0.892764. The two peaks are
also clearly visible and the overall trajectory is very similar; hence, the distances
are also very similar. The exact reason for the very slight increase in distance is
still unknown, but it is either due to a timing difference or a sensor inaccuracy.
The third best candidate (shown in Fig. 2d) has a distance of 2.32065, which is
relatively high in comparison to the other two candidates; the two peaks are less


Fig. 2. Overview of found CAN signals. Figure 2a shows the OBD reference of the
engine RPM. Figure 2b shows the best candidate with the lowest DTW distance; Fig. 2c
and d show the second and third best candidate.

visible and the overall trajectory is quite different. From this experiment, we can
see that our correlation block is able to find the best candidate signals given a
certain reference.

Table 2. Found CAN IDs and occupied bytes using the algorithm for discrete signals.

                       BMW X5      Fiat Ducato       Mercedes A180
Lights                 538 [0]     169385984 [1]     75 [0]
Direction indicators   502 [0–1]   135307264 [3]     69 [2]
Wipers                 678 [0]     169385984 [6]     69 [2]
Brake pedal            239 [2–3]   1089542 [1–2]     751 [3]
Acceleration pedal     143 [2–3]   102277121 [7]     105 [4–5]

Finally, we elaborate on the results of the algorithm for discrete signals,
shown in Table 2. First of all, the signals for lights, direction indicators and
wipers have been found on all of the cars. As can be seen in the table, the CAN
IDs of the Fiat Ducato are very high in comparison with the other two cars.
That is, the car is using extended IDs, which consist of 29 bits instead of 11
bits. These three signals have been found using user inputs, but they also had to be
validated immediately by the user. That is, in some cases, a certain counter on
the CAN bus was stable during the reference phase but incremented during
the analysis phase. Therefore, this counter was also detected as a user signal,
which was clearly not the case. Hence, user validation was needed. Nevertheless, the
algorithm is able to detect the discrete signals in an effective way. Additionally,
we see that both the brake and acceleration pedal have been found. In fact, they
are also stable during the reference phase and can be pressed during the analysis.

Thus, we can see that this algorithm is able to detect every signal that
requires a user input, independent of whether the signal is discrete.

6 Conclusion and Future Work

Modern cars contain numerous sensors that provide useful data in many different
situations, but the interpretation of that data is cumbersome due to the different
implementations of the CAN messaging system. Hence, reverse engineering is
needed in order to give sense to the internal sensor data of the car.
This paper presents two major methods to perform reverse engineering in an
online way instead of offline analysis on recorded log files. First, a mechanism
using the READ algorithm is suggested which reduces the search space dras-
tically. Given this decreased search space, online correlation blocks containing
the FastDTW algorithm were able to detect signals that correlate with a certain
reference signal.
Next, this paper also proposes a new methodology for detecting and iden-
tifying signals that require a user input, such as direction indicators and light
switches. Additionally, the algorithm is able to detect continuous signals that also
require a user input, such as the accelerator and brake pedals. A downside of this
methodology is that user validation is needed because of unpredictable counters on the
CAN bus. However, the algorithm is still able to detect user signals in an effective
way.
Regarding future work, the proposed methods need to be fully verified
and validated with a complete dataset of already reverse engineered data. Thus,
we can thoroughly benchmark our proposed algorithms against the current State
of the Art. Additionally, research is required to make the algorithms fully
autonomous. That is, no user input should be required in order to perform reverse
engineering, and the possibility of locating the algorithms on an embedded device
instead of a laptop should be taken into account. Nevertheless, these methodologies
have proven that online reverse engineering has a lot of potential, which paves
the way to a scalable solution for data extraction in many future applications.

Acknowledgement. This work was performed within the framework of the ICON
project MobiSense (grant No. HBC.2017.0155), supported by imec and Flanders Inno-
vation & Entrepreneurship (Vlaio).

References
1. Fugiglando, U., Santi, P., Milardo, S., Abida, K., Ratti, C.: Characterizing the
“driver DNA” through CAN bus data analysis. In: Proceedings of the 2nd ACM
International Workshop on Smart, Autonomous, and Connected Vehicular Systems
and Services, pp. 37–41. ACM, Snowbird (2017)
2. Fugiglando, U., Massaro, E., Santi, P., Milardo, S., Abida, K., Stahlmann, R., Net-
ter, F., Ratti, C.: Driving behavior analysis through CAN bus data in an uncon-
trolled environment. IEEE Trans. Intell. Transp. Syst. 20(2), 737–748 (2019)

3. Hallac, D., Sharang, A., Stahlmann, R., Lamprecht, A., Huber, M., Roehder, M.,
Sosič, R., Leskovec, J.: Driver identification using automobile sensor data from a
single turn. In: IEEE Conference on Intelligent Transportation Systems, Proceed-
ings, ITSC, pp. 953–958 (2016)
4. Huybrechts, T., Vanommeslaeghe, Y., Blontrock, D., Van Barel, G., Hellinckx, P.:
Automatic reverse engineering of can bus data using machine learning techniques.
In: Xhafa, F., Caballé, S., Barolli, L. (eds.) Advances on P2P, Parallel, Grid, Cloud
and Internet Computing, pp. 751–761. Springer, Cham (2018)
5. Lestyán, S., Acs, G., Biczók, G., Szalay, Z.: Extracting vehicle sensor signals from
CAN logs for driver re-identification. In: Mori, P., Furnell, S., Camp, O. (eds.)
Proceedings of the 5th International Conference on Information Systems Security
and Privacy, ICISSP, vol. 1, pp. 136–145. SciTePress, Prague (2019)
6. Lin, N., Zong, C., Tomizuka, M., Song, P., Zhang, Z., Li, G.: An overview on study
of identification of driver behavior characteristics for automotive control. Math.
Prob. Eng. 2014(2), 15 p. (2014)
7. Marchetti, M., Stabili, D.: Read: reverse engineering of automotive data frames.
IEEE Trans. Inf. Forensics Secur. 14(4), 1083–1097 (2019)
8. Salvador, S., Chan, P.: Toward accurate dynamic time warping in linear time and
space. Intell. Data Anal. 11(5), 561–580 (2007)
9. Sathyanarayana, A., Boyraz, P., Hansen, J.H.: Driver behavior analysis and route
recognition by hidden Markov models. In: Proceedings of the 2008 IEEE Interna-
tional Conference on Vehicular Electronics and Safety, ICVES 2008, October, pp
276–281. IEEE, Columbus (2008)
10. Vanneste, S., de Hoog, J., Huybrechts, T., Bosmans, S., Eyckerman, R., Sharif, M.,
Mercelis, S., Hellinckx, P.: Distributed uniform streaming framework: an elastic fog
computing platform for event stream processing and platform transparency. Future
Internet 11(7), 158 (2019)
Time Synchronization with Channel
Hopping Scheme for LoRa Networks

Ritesh Kumar Singh(B) , Rafael Berkvens(B) , and Maarten Weyn(B)

University of Antwerp - imec, The Beacon, Sint-Pietersvliet 7, 2000 Antwerp, Belgium


{riteshkumar.singh,rafael.berkvens,maarten.weyn}@uantwerpen.be

Abstract. Low-Power Wide Area Networks (LPWAN) for resilient
Internet of Things (IoT) ecosystems provide connectivity for minimal
communication loads at unprecedentedly low cost. Long Range (LoRa) Wide Area Net-
work (LoRaWAN) is an LPWAN with a long range and a low bit rate, and it
acts as a connectivity enabler. However, providing an efficient collaborative
clock synchronization service is challenging. In this paper we tackle two
problems affecting the robustness of LoRa networks. First, current research
typically focuses on the benefits of LoRa but ignores the requirement
of reliability, which may invalidate the expected benefits. To tackle this
problem, we introduce a novel time synchronization scheme that radically
reduces the usage of the existing Aloha-type protocol, improving energy con-
sumption and service quality. Second, we look into the security space
of the LoRa network, i.e., the channel selection scheme for the given spectrum.
Attacks like selective jamming are possible in a LoRa network because
the entire spectrum is not used, and the utilization of a few channels
is comparatively high. To tackle this problem, we present a channel
hopping scheme that integrates cryptographic channel selection with the
time notion of the current communication. We evaluate the time synchro-
nization and channel hopping scheme for a real-world deployed peer-
to-peer (P2P) model using commodity hardware. This paper concludes
by suggesting strategic research possibilities on top of this platform.

Introduction
Low power and low cost communication technologies have made a great impact
on the development of applications and services for IoT. Over the last decades,
due to the miniaturization of IoT communication devices in parallel with
enhanced computing, the number of applications is increasing exponentially,
such as precision agriculture and traffic updates [3].
With the wide spectrum of services and projected inevitable scaling of appli-
cations in IoT, quality measures like scalability and reliability remain a challenge.
This caused an increased demand for wide communication coverage, which in
turn led to the advent of LPWAN. These have high energy efficiency, scalabil-
ity and coverage as key ingredients. LoRa, the radio modulation technology of
LoRaWAN, is widely deployed because of its key enablers such as low power,
long range, low bit rate and low cost chip set [14].
LoRa has brought in a lot of interesting applications [9,14]. It has attracted
many researchers for evolving LoRaWAN infrastructure [18], the network stack

on top of the LoRa physical layer. LoRaWAN features a low data rate and a multi-
kilometer communication range and enables a star network topology. This even-
tually simplifies the deployment and maintenance cost of the network, resulting
in greater momentum behind LoRaWAN, with deployments all over the world
by network operators and solution providers [17].
However, along with those advantages, LoRaWAN has some limitations and
problems [2]. In particular, a key problem arises from communication latency
and the use of an Aloha-type protocol [5,7]. This in turn results in high battery power
usage with low reliability [13] and limits command-and-control applications in
LoRaWAN networks [10]. Moreover, for applications such as data
fusion and event monitoring, the timing of events becomes critical. In Wireless
Sensor Networks (WSN), this is well implemented by dedicated clock
synchronization strategies [4,6]. However, it is hard to deploy these types of applications on
top of LoRaWAN because there is no notion of time in the network.
In recent literature, the paper [2] highlights the important research challenges
in LoRaWAN networks as: exploring new channel hopping methods, scheduling
deterministic traffic over time, and limiting interference/collisions by enabling a
coordination mechanism between gateway and end devices. In addition, it specif-
ically highlights the requirement of time synchronization and channel hopping
as mitigation techniques to improve the security and robustness of LoRa networks.
In this paper, we address the aforementioned challenges by proposing a novel
time synchronization platform for LoRa networks and establish a proof of concept.
The inclusion of a time notion can tackle the most critical drawback, the availability
of downlink communication. It can help to save energy by shortening the neces-
sary guard time and turning off the radio for better duty cycling schemes. To the
best of our knowledge, this stands as the first contribution towards time synchro-
nization in LoRa networks. Also, we propose a lightweight Cryptographic Channel
Hopping (CCH) scheme to enhance the security and reliability (collision rate) of the
network. This unique distribution and utilization of the available channels can act
as a mitigation technique for security challenges such as selective jamming [16]
and increase the robustness of the network.
The paper is structured as follows. Section 1 presents the background on
time synchronization and channel hopping in the context of LoRa and LoRaWAN.
Section 2 proposes the design and implementation of the system using real LoRa
devices. In Sect. 3, we present a preliminary evaluation of the synchronization resolution and
the optimal guard time. For CCH, we evaluate the channel utilization and distri-
bution along with the total memory consumption and a statistical test to check the
uniformity of the channel distribution. Finally, we conclude this paper along with
an application use case in Sect. 4 and discuss the plans and the set of possibilities that
can be built upon the time synchronized LoRa platform as future work.

1 Background
This section gives an overview of time synchronization and CCH along with its
key elements in the context of LoRa and LoRaWAN.

Fig. 1. The LoRaWAN stack.

1.1 LoRa and LoRaWAN

LoRa is a proprietary radio modulation technology, licensed by Semtech
Corporation. It operates in the 863–870 MHz ISM band in Europe and at different
frequencies between 470 MHz and 925 MHz in other countries.
LoRaWAN defines the data-link layer and network layer in the stack [7], as in Fig. 1,
and utilizes a star networking topology, in which all end-devices connect directly
with a LoRa gateway.

1.1.1 Time Sync in LoRaWAN


LoRaWAN is implemented as a networking protocol from the LoRa physical layer up to
the application layer. However, it does not incorporate a time synchronization
protocol. It does not carry time-related information in message headers, only
slack timing requirements because of the long transmission duration. Class A
devices utilize an Aloha-type protocol [5] to send messages to gateways, which
comes with major bottlenecks: 1. Messages from a Class A end device are
application specific. Besides, when there is a message from a gateway to an
end-device (downlink message), the end-device has to wait until the next uplink
transmission to receive it, as in Fig. 2; 2. After transmitting each message, an end
device opens two receive windows of one second each, irrespective of whether or
not it will receive a message from the gateway, which results in power loss (Rx:
14.2 mA [11]).
The LoRa Alliance has introduced Class B devices as time beacon-like
devices. They have a periodic receive window to synchronize the time between end-
devices and gateways. However, they still have an additional receive window like
Class A devices. This results in consuming even more power even though there
is no message for the end-device. Class C devices always stay in
receive mode, except when they are transmitting, thereby spending the maximum
amount of power. Currently, there are no Class B or C products on the market
for reference.

Fig. 2. Class A device latency. Fig. 3. Traffic analysis

1.1.2 Channel Hopping in LoRaWAN


LoRaWAN uses a pseudo-random channel hopping method to increase the overall
network capacity and avoid accidental collisions by spreading transmissions
over the pool of available channels. The probability of selecting a channel is
constrained by duty-cycle limitations (i.e., the duty-cycle is 1% in the EU 868 MHz band for
LoRaWAN end-devices) [2,7]. As mentioned in Sect. 1.1, LoRaWAN operates
on 8 channels between 863–868 MHz in Europe. Each channel is separated
by 200 kHz from its neighbours to support the 125 kHz bandwidth defined in the
LoRaWAN specification. In order to examine channel utilization in the LoRa fre-
quency spectrum, an analysis of LoRaWAN traffic was performed using a Multi-
Tech Gateway1 and its logs to scope the channel usage of LoRa transmissions in
regular use. LoRaWAN traffic was recorded for 3 days, totalling 692 messages.
The result of the traffic analysis is shown in Fig. 3.
We observe that: 1. Most of the spectrum is not utilized; moreover, only the three
default channels are mostly used by end-devices; 2. This affects the security of legiti-
mate communication and gives scope for a better channel hopping scheme.

2 System Model - Implementation


In this section, we unfold the proposed solution for time synchronization and CCH
in a LoRa network.

2.1 Proof of Concept and System Architecture


For the proof of concept setup, and to demonstrate the possibility of bringing a time
notion into a LoRa network, a P2P model (two devices) is considered for the exper-
iment, which can be further scaled up to a full-fledged network. Also, a pseudo-
random frequency is calculated on both the master and the slave before performing
the subsequent communication between them, to adhere to CCH. The motivation
behind this setup is to sync both devices and reduce the guard time while
guaranteeing message delivery. This can eventually result in saving power and
increasing the reliability of the network.
1
http://www.multitech.com/brands/multiconnect-conduit.

Fig. 4. System architecture

For the implementation we have taken two Microchip Class A LoRa motes
as end devices, which support the 868 MHz high-frequency band2, as demonstrated in Fig. 4.
The detailed hardware setup using the LoRa motes as master and slave is as follows:
Master Node. The device is connected to a Raspberry Pi, which picks the system
time as the reference time and manages the ongoing messaging between end devices.
A script (high-priority process) handles the sending and receiving of data from the
LoRa end device using LoRa radio and MAC commands [12].
Slave Node. The device has a PIC inside the RN2483 (PIC-18LF46K22) and is connected
to an RTC (ABRACON-AB0805 [1]) and programmed using the PICkit 3 tool and
the MPLAB IDE. Programming the PIC avoids any further inclusion of hard-
ware. The RN2483 module has two reserved pins, as mentioned in its data
sheet [11]. These pins are used to set up an I2C communication with the RTC. This
RTC is ultra low powered (as low as 14 nA [1]) with an I2C and
SPI interface having a precision up to centiseconds. The clock and alarm features
of the RTC are mainly used for the time sync platform.

2.2 Clock Synchronization


Clock synchronization is done in two steps using the P2P model. First, an initial
investigation is done in the most simplified way; then, the insights of this
approach are used to develop a refined algorithm. Network joining of the device
is out of scope for this paper and is considered a prerequisite. The simplified
messaging between master and slave is shown in Fig. 7. Both nodes start with
the default frequency, power and SF as defined in the joining process. Then the mas-
ter node sends the reference time (Rt) as a sync message to the slave. This Rt is
composed of 7 bytes of data, which contain the time stamp in the form of centisec-
onds, seconds, minutes, hours, date, month and year. The slave, on receiving the
sync packet, writes the time value to the appropriate register in the RTC. For better
understanding and evaluation, we have predefined the alarm configuration to
get an interrupt every minute.
2
www.microchip.com/RN2483LoRaMote4233989.

Fig. 5. On-air time for different payloads. Fig. 6. OAt + Ct for different SF

While sending Rt to the slave device, some time elapses as computation
time (Ct) and on-air time. We calculated the on-air time (OAt) for different SF
and payloads as in Fig. 5. These two values, Ct and OAt, are constant for a given type
of message, as in Fig. 6, and need to be added to Rt. Thereby, instead of writing the
received sync message, we wrote "α", with α = Rt + OAt + Ct.
Next, we calculated the clock drift over a given time by mapping it against Rt. It is
important to measure the clock drift to maintain the optimal
guard time and re-sync period. Figure 8 shows the drift for all SF: the clock
drifts about 50 ms in one hour, so it can drift up to 1 s in 20 h. In case the
application requires second-level precision, then in the worst case the device has to
re-sync every 20 h in the current setup.
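As a quick check of the re-sync budget implied by these numbers (a worked calculation based on the figures above, not an additional measurement):

```python
drift_ms_per_hour = 50                       # measured clock drift (Fig. 8)
precision_ms = 1000                          # application requires second-level precision
resync_hours = precision_ms / drift_ms_per_hour
print(resync_hours)                          # 20.0 -> re-sync at least every 20 h
```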

Fig. 7. Message sequence for Time Sync

Fig. 8. Clock drift for different SF. Fig. 9. Time sync protocol

Fig. 10. Format for sync message. Fig. 11. Message format

Further, we observed the prospective effects of different parameters, such as the power
configuration or the device's internal frequency, on Ct and OAt. The power configuration
has no effect. However, changing the internal frequency from 16 MHz to
4 MHz resulted in a variation in Ct but not in the clock drift. We achieved a time
synchronization resolution of 55 ms between master and slave; this variation in
resolution was due to hardware dependency. Key take-away points for
developing the algorithm further were that it should be hardware independent,
have automated re-sync, be unaffected by drift, and provide message classification
from the master.
In the second step, a time sync algorithm is proposed considering the aforemen-
tioned lessons. The message sequence for this algorithm is shown in Fig. 9, with
the detailed steps listed below. The acronyms used are: sync message (Sm),
acknowledgment (Ak), RTC value (Rv), RTC read (Rr), RTC write (Rw),
schedule message (Ms), network time (Nt), alarm interrupt (Ai),
guard time (Gt), master device (Md), slave device (Sd).
This algorithm is hardware agnostic. The Md sends Sm to Sd in the format
displayed in Fig. 10, comprising one byte for each field, and stores the time (T1).
The 'id' and 'type' fields represent the network/master id and the type of message,
respectively, as mentioned in Fig. 11. Apart from the payload, each message carries
additional information for message identification and the required processing, in the
format given in Fig. 11: the sender (ID) and the type of message, so that the payload
can be read accordingly. The id details are stored
during the joining process and help to avoid anonymous messages. Sd, on receiving

Algorithm 1. Time Sync algorithm

1: procedure Join network(Sd, network)        ▷ pre-requisite
2: procedure Time sync(Sd, Md)                ▷ time sync with master
3:   Sd ← config                              ▷ default network freq, power, SF
4:   Sd ← Md(Sm, T1)                          ▷ T1 is the master reference time
5:   Rw(Sm, Reg) ← Sd; Rr(value, Reg) ← Sd    ▷ RTC write, then read back
6:   Md(T2) ← Sd(Ak, Rv)                      ▷ ack and RTC value
7:   Δ ← (T2 − T1)/2                          ▷ delta is calculated on Md
8:   Sd ← Md(Sm(T3 + Δ))                      ▷ T3 is the new reference time
9:   repeat steps 5 and 6                     ▷ set RTC with the new reference time
10:  Sd ← Md(Ms)                              ▷ schedule information
11:  Sd(ISR) : Md(ISR)                        ▷ alarm interrupt on both devices
12:  Md(Sd : Ch) ← Sd(data, Rv)               ▷ data and RTC value
13:  Sd ← Md(Ak, Sm or Ms)                    ▷ application specific

the message, parses the id and message type, performs Rw and immediately Rr,
and then sends Ak. Md receives Ak and stores this time (T2). It calculates
the difference between the two stored times and divides it by two to get the delta
(OAt + Ct), which is added to Rt for sending Sm again. Md then sends
Ms, as in Fig. 11, for receiving the sensor data. Sd configures the alarm registers to
receive an interrupt; one of the pins is configured to wake the device up on the alarm
interrupt and send the message. On receiving the interrupt, Sd sends the sensor
data and Rv. Md maps Rv onto the application requirements to decide whether a
re-sync is needed and replies accordingly. For each communication, Sd and Md open
their guard window to receive a message and set their common frequency using
CCH.
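A minimal sketch of the master-side exchange (steps 4–8 of Algorithm 1) is given below; `send` and `receive_ack` are placeholders for the RN2483 radio commands, and the 7-byte payload ordering follows the description of Rt above. This is our illustration, not the authors' implementation.

```python
import time

def build_sync_payload(t):
    """Pack a reference time into the 7-byte sync format:
    centiseconds, seconds, minutes, hours, date, month, year."""
    lt = time.localtime(t)
    centi = int((t % 1) * 100)
    return bytes([centi, lt.tm_sec, lt.tm_min, lt.tm_hour,
                  lt.tm_mday, lt.tm_mon, lt.tm_year % 100])

def master_sync(send, receive_ack):
    """One master-side round: estimate the one-way delay as half the round-trip time
    and resend a compensated reference time."""
    t1 = time.time()
    send(build_sync_payload(t1))          # Sm with reference time T1
    receive_ack()                         # slave has written and read back its RTC
    t2 = time.time()
    delta = (t2 - t1) / 2.0               # approximates OAt + Ct
    t3 = time.time()
    send(build_sync_payload(t3 + delta))  # compensated sync message
    return delta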

2.3 CCH

During time synchronization, and further for each communication, CCH is used
to select the next channel. After each message exchange, master and slave
close their radio modules (state: off) and select their next channel. A common func-
tion (select_freq), which runs on both the master and slave device, is used for CCH.
Both end devices are loaded with common keys (Kn), plain text (Pt) and
a nonce (part of the joining process). The select_freq function uses AES CTR mode
(AES is already present in the LoRa stack, so reusing it results in a light-
weight solution) to generate a pseudorandom channel from the available spec-
trum, exploiting its random access property. The cost of
using this cryptographic function (AES CTR) is very low [15], since AES and
the record of utilized channels are already part of the LoRa stack. The CCH algorithm
for channel selection works as follows (a code sketch is given after the list):

1. The frequency spectrum for EU is F = {fi} for 863 000 000 ≤ i ≤ 870 000 000,
f ∈ I, where fi has to be selected for the next communication.
2. Use AES-CTR mode with a 256-bit Kn to produce a key stream using a sequen-
tial counter block of 16 octets, comprising the counter, nonce and initial-
ization vector.
3. Give the input Pt of 16 octets for XORing with the key stream to obtain the cipher.
4. Calculate sum(cipher), i.e., the sum Σ_{i=1}^{16} cipher_i of the 16 cipher octets,
which is always random because of the inherent property of using 14 rounds.
5. With the above result, calculate fi using: sum(cipher) % (upperbound −
lowerbound) + lowerbound.
6. The lower and upper bound delimit the fi range.
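A possible realisation of these steps is sketched below using the Python `cryptography` package; how the 16-octet counter block is split between counter, nonce and IV, and the Hz-level channel granularity, are assumptions of this sketch rather than details of the deployed firmware.

```python
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

LOWER_HZ = 863_000_000
UPPER_HZ = 870_000_000

def select_freq(key: bytes, nonce_iv: bytes, counter: int, plaintext: bytes) -> int:
    """Derive the next channel from the AES-CTR key stream; both ends hold the same
    key, nonce/IV, counter and plaintext, so they land on the same frequency."""
    counter_block = nonce_iv + counter.to_bytes(4, "big")     # 12 + 4 = 16 octets (assumed split)
    encryptor = Cipher(algorithms.AES(key), modes.CTR(counter_block)).encryptor()
    cipher = encryptor.update(plaintext)                      # plaintext XOR key stream
    return sum(cipher) % (UPPER_HZ - LOWER_HZ) + LOWER_HZ     # step 5 of the list above
```

Both devices would increment the shared counter after every exchange so that they independently derive the same next frequency.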

Fig. 12. Channel hopping. Fig. 13. Memory consumption for time sync and CCH

3 Evaluation
In this section, we present the achieved sync resolution and guard time, the statistical
uniformity of the channel selection, and the channel utilization and distribution
achieved using CCH.

3.1 Guard Time and Sync Resolution


In Sect. 2, the air time is calculated for different SF and payload sizes, along with
the RTC clock drift. This helps to further determine the optimal guard time and
the sync resolution with respect to the reference clock between master and slave. The time
sync resolution achieved using the proposed algorithm is 10 ms (clock difference with the refer-
ence time), which is the minimum measurement resolution with the given RTC
(centiseconds). Changing the RTC may bring the sync resolution down to
an even lower value. The optimal guard time is 30 ms in addition to the delta on the master side,
which is dynamically picked for each instance of communication with the end
device. On the slave side, the guard time is kept at twice the delta in addition to 30 ms.
No packet loss was observed while running this experiment for a longer
period. The sync message itself requires some amount of energy, but compared
to the efficient guard time and the added benefits, the cost of the sync message becomes
negligible.

3.2 Channel Utilization and Distribution Using CCH


The evaluation of the CCH frequency selection for each round of communication is shown
in Fig. 14. This graph shows the randomness in the channel selection. Also, the
statistical information entropy [8], calculated to test uniformity, came to
5.9478 bits out of a maximum of 6.1497 bits, which shows good uniformity in the channel
selection. Figure 15 shows an equal distribution and utilization of the available
channels over the rounds of communication. From a research perspective, the EU duty
cycle is not considered in this experiment. The cost of implementing the time sync
and CCH schemes in terms of memory consumption on the device is given in
Fig. 13: 242 bytes for the implementation of time sync and 717 bytes for
CCH. The computational cost added by the time sync and the cryptographic function
is 6% and 18%, respectively. This is very low and a perfect fit for
LoRa end devices.
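The uniformity figure can be reproduced with a check of the following form (our own sketch, not the authors' code), which compares the empirical Shannon entropy of the selected channels with the log2(N) maximum for N available channels:

```python
import math
from collections import Counter

def selection_entropy(selected_channels, n_available):
    """Empirical Shannon entropy (bits) of the observed channel selections versus the
    log2(n_available) maximum reached by a perfectly uniform selection."""
    counts = Counter(selected_channels)
    total = len(selected_channels)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy, math.log2(n_available)
```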

Fig. 14. CCH for different rounds of messages. Fig. 15. Channel utilization and distribution

4 Application, Conclusion and Future Work


The inclusion of a time synchronization platform allows LoRa devices to send and
receive messages in allocated slots, reducing interference and collisions. In this
paper, a novel time synchronization platform is proposed to achieve reliability
and robustness for LoRa networks. The objective of this paper is to fill two
gaps in LoRa: time synchronization and CCH. It will be interesting to deploy several
master/slave nodes in an environment to check the fault tolerance and to per-
form empirical calculations on reliability and energy cost. The key problem
of LoRaWAN, the availability and listening time of downlink messages, is addressed.
On top of the time-cooperative approach, the cryptographic channel hop-
ping scheme reduces packet collisions and enhances the security of the network.
In addition, energy savings are achieved by lowering the guard time for message
listening.
Encouraged by the results of the proposed time sync platform, we plan to extend
this work towards further evaluation of power consumption and more strategic
research, i.e., in the EU Interreg project on greenhouse monitoring (Grow!). This scope
becomes even more substantial after the release of the recent LoRaWAN specifica-
tion, with new commands for seeking time details. However, it will take time
before this is commercially available. We will probably focus on scaling the time
sync network and measuring its performance, especially coverage, capacity and
collisions. We will first focus on dynamic task scheduling or time-sensitive middle-
ware on top of this platform. Next, we will explore joining procedures for end devices
in time sync networks. On top of CCH, we will try to come
up with a channel hopping scheme that reserves a set of channels for critical packets.
We would also try to evolve an efficient new time-sensitive device type,
enriching the possible use cases and applications of the LoRa network.

References
1. Abracon: Ab08x5 real-time clock family. Technical report, 16 October 2014.
https://abracon.com/Precisiontiming/AB08X5-RTC.PDF
2. Adelantado, F., Vilajosana, X., Tuset-Peiro, P., Martinez, B., Melia-Segui, J., Wat-
teyne, T.: Understanding the limits of LoRaWAN. IEEE Commun. Mag. 55(9),
34–40 (2017)
3. Centenaro, M., Vangelista, L., Zanella, A., Zorzi, M.: Long-range communications
in unlicensed bands: the rising stars in the IoT and smart city scenarios. IEEE
Wirel. Commun. 23(5), 60–67 (2016)
4. Lasassmeh, S.M., Conrad, J.M.: Time synchronization in wireless sensor networks:
a survey. In: Proceedings of the IEEE SoutheastCon 2010 (SoutheastCon), pp.
242–245 (2010)
5. Laya, A., Kalalas, C., Vazquez-Gallego, F., Alonso, L., Alonso-Zarate, J.: Goodbye,
aloha!. IEEE Access 4, 2029–2044 (2016). https://doi.org/10.1109/ACCESS.2016.
2557758
6. Lin, L., Ma, S., Ma, M.: A group neighborhood average clock synchronization
protocol for wireless sensor networks. Sensors 14, 14744–14764 (2014)
7. LoRa Alliance Inc.: LoRaWANTM specification. LoRa Alliance, Inc., San Ramon
(2015)
8. MacKay, D.J.C.: Information Theory, Inference, and Learning Algorithms. Cam-
bridge University Press, Cambridge (2003). https://doi.org/10.2277/0521642981
9. Mdhaffar, A., Chaari, T., Larbi, K., Jmaiel, M., Freisleben, B.: Iot-based health
monitoring via LoRaWAN. In: IEEE 17th International Conference on Smart Tech-
nologies, EUROCON 2017, pp. 519–524 (2017)
10. Mekki, K., Bajic, E., Chaxel, F., Meyer, F.: A comparative study of LPWAN
technologies for large-scale IoT deployment. ICT Express (2018). https://doi.
org/10.1016/j.icte.2017.12.005. http://www.sciencedirect.com/science/article/pii/
S2405959517302953
11. Microchip: Rn2483 LoRa technology transceiver module data sheet. Technical
report (2015). www.microchip.com/downloads/en/DeviceDoc/50002346C.pdf
12. Microchip: Rn2483 LoRaTM technology module command reference user’s guide.
Technical report (2015). http://ww1.microchip.com/downloads/en/DeviceDoc/
40001784B.pdf
13. Neumann, P., Montavont, J., Noël, T.: Indoor deployment of low-power wide area
networks (LPWAN): a LoRaWAN case study. In: 2016 IEEE 12th International
Conference on Wireless and Mobile Computing, Networking and Communications
(WiMob), pp. 1–8 (2016)

14. Noreen, U., Bounceur, A., Clavier, L.: A study of LoRa low power and wide area
network technology. In: 2017 International Conference on Advanced Technologies
for Signal and Image Processing (ATSIP), pp. 1–6 (2017). https://doi.org/10.1109/
ATSIP.2017.8075570
15. Prathiba, A., Bhaaskaran, V.S.K.: FPGA implementation and analysis of the
block cipher mode architectures for the present light weight encryption algo-
rithm. Indian J. Sci. Technol. 9(38) (2016). http://www.indjst.org/index.php/
indjst/article/view/90314
16. Proano, A., Lazos, L.: Selective jamming attacks in wireless networks. In: 2010
IEEE International Conference on Communications, pp. 1–6 (2010). https://doi.
org/10.1109/ICC.2010.5502322
17. i SCOOP: LoRaWAN across the globe (2017). www.i-scoop.eu/internet-of-things-
guide/iot-network-lora-lorawan
18. L.A.T.M. Workgroup: A technical overview of LoRa and LoRaWAN. Technical
report, LoRa Alliance (2015)
LiDAR and Camera Sensor Fusion
for 2D and 3D Object Detection

Dieter Balemans(B) , Simon Vanneste, Jens de Hoog, Siegfried Mercelis,


and Peter Hellinckx

University of Antwerp, Antwerp, Belgium


dieter.balemans@student.uantwerpen.be, {simon.vanneste,jens.dehoog,
siegfried.mercelis,peter.hellinckx}@uantwerpen.be

Abstract. Perception of the surrounding world is key for autonomous driv-
ing applications. To allow better perception in many different scenarios,
vehicles can rely on camera and LiDAR sensors. Both LiDAR and cam-
era provide different information about the world; however, they provide
information about the same features. In this research, two feature-based
fusion methods are proposed to combine camera and LiDAR informa-
tion, to improve what we know about the surrounding world and increase
our confidence in what we detect. The two methods work by proposing
a region of interest (ROI) and inferring the properties of the object in
that ROI. The output of the system contains fused sensor data alongside
extra inferred properties of the objects based on the fused sensor data.

1 Introduction
One of the first steps in autonomous driving is the perception of the surrounding
world. To accomplish this, autonomous vehicles rely on sensors. Every sensor
has its own advantages and disadvantages. In order to correctly and reliably detect
objects and obstacles, multisensor data fusion can be used. This concept con-
sists of combining the advantages and suppressing the disadvantages of multiple
sensors.
Our proof of concept setup consists of a LiDAR and a camera sensor. The data
from these sensors will be fused on a feature level basis. This means that
features are first extracted from the separate raw data streams; afterwards, these
features are combined by the fusion algorithm itself. This is in contrast with
lower-level fusion, where the raw sensor data is directly fused and used for
object classification. As the Joint Directors of Laboratories (JDL) [1] proposed,
sensor fusion can be established on different levels:
• Level zero - data alignment: Level zero processing involves synchronizing
individual data streams and mapping their individual coordinates onto each
other.
• Level one - object refinement: Level one processing involves object refine-
ment. In this level locational, parametric, and identity information is com-
bined to achieve refined representations of individual objects.
• Level two - situation refinement: Level two processing is situation refinement. It develops a description of current relationships among objects and events.
• Level three - threat refinement: Level three processing deals with threat
refinement. This is often seen as a prediction function. It projects the current
situation into the future and draws inferences about threats and opportuni-
ties.
• Level four - process refinement: Level four processing involves process
refinement. This means its objective is to refine the fusion process and monitor
its performance.
In this research, two different fusion systems are proposed that detect objects and infer properties of these objects. The main purpose is to evaluate which system performs best based on the KITTI benchmark. In the next section, the related work in the field is briefly discussed. Afterwards, a general view of sensor fusion and the main system is given in Sect. 3. Subsequently, in Sects. 4 and 5 the two fusion systems are discussed in detail. In Sect. 6 the evaluation method is discussed. Finally, the results, conclusion, and future work can be found in Sects. 7 and 8.

2 Related Work
A lot of research has already been done in the field of LiDAR and camera fusion, with mixed results. Zhang, Clarke, and Knoll [2] use these sensors for a vehicle detection system. They developed a system that proposes a hypothesis and verifies this hypothesis using a classification stage. Their classification stage is embodied by a support vector machine (SVM). Their main conclusion was that the proposed method has a lower false positive rate than other detection systems.
Liang et al. [3] from Uber Advanced Technologies Group and the University of Toronto developed a 3D object detector that uses LiDAR and camera data to perform very accurate localization. Their approach consists of continuous convolutions and uses LiDAR and bird's eye view images. The results of their research are very promising.
In other research, Xu et al. [4] use a machine learning approach. They use a fusion network to fuse the preprocessed image and point cloud data. This is in contrast with this research, where a more algorithmic method is proposed that requires almost no training. On the market side of things, interesting patents can also be found. In the United States a patent [5] exists for a method for fusing radar/camera object data and LiDAR scan points. The patented system is based on the hypothesis generation/verification principle discussed earlier. In that system the LiDAR data is used in response to an object being detected by the radar or camera. This is a clear example of level 1 processing of the JDL model, since the method uses object files to represent the position, orientation and velocity of the detected objects.
This research is focused on the fusion of LiDAR and camera data. Both the LiDAR and camera will be used to infer properties about the detected objects.
Fig. 1. Global system overview (System 1: fusion on objects): LiDAR point cloud and camera 2D image inputs are preprocessed (PCL clustering, object classification), transformed into a consistent set of units and coordinates, and combined by feature association (fusion) into a level 1 fused output for decision-making systems.

A comparison will be made between a fusion system that uses the LiDAR data as hypothesis and a system that uses the camera data. The proposed systems are mainly situated on level 0 and level 1 of the JDL model. The proposed methods differ from the state-of-the-art methods in that they are more focused on analytic data association algorithms. This work tries to improve on the state-of-the-art methods in execution time and detection accuracy.

3 General System Overview

As stated earlier, sensor fusion can be divided into different levels. This research is mainly focused on level 0 and level 1. The proposed system (shown in Fig. 1) consists of two major stages: a preprocessing stage and a fusion stage. The preprocessing stage is where the individual sensor streams are processed to detect individual objects. Thus, a feature-based fusion is implemented. The preprocessing stage is developed using off-the-shelf components. More details can be found in Sects. 4 and 5. After the preprocessing step, the actual fusion processing takes place. In this stage, the individual sensor features are combined and associated with each other. Using these combined features we can infer the objects' properties. In the current state of the system, a 2D and a 3D bounding box are inferred. These bounding boxes, alongside the fused features, are the output of the system. This output is an object-refined output, thus a level 1 output. For the fusion step two different approaches are proposed: a camera-based and a LiDAR-based approach.

4 Fusion 1: Camera Hypothesis

4.1 Preprocessing
The first approach is focused on the camera features. In this method the camera preprocessing first detects objects in the image space. These detected objects serve as a hypothesis for the fusion step. As seen in Fig. 2, the inputs of the fusion method are detected camera objects and a raw point cloud. In order to process the point cloud data fast enough, a down-sampling filter is used.
Fig. 2. Top: Fusion 1, camera hypothesis; Bottom: Fusion 2, LiDAR-based hypothesis.

The preprocessing of the first fusion system consists of an image object detection algorithm. We opted for a machine learning approach to detect objects. Therefore, the well-known You Only Look Once (YOLO) model is used. YOLO is a real-time object detection model. The main advantage of the YOLO model is that it only needs to be applied once to the image. This makes it significantly faster than prior object detection methods [6]. The output of the camera object detector is the class of each object alongside a bounding box representing the location of the object in the image.
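As an illustration of such a YOLO-based preprocessing step, the sketch below uses OpenCV's DNN module (this is only one possible implementation; the paper does not specify its YOLO setup, and the config/weight file names are placeholders):

import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")  # placeholder files

def detect_objects(image, conf_threshold=0.5):
    """Return (class_id, confidence, (x, y, w, h)) for each detected object."""
    h, w = image.shape[:2]
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())
    detections = []
    for output in outputs:
        for row in output:                     # row = [cx, cy, bw, bh, objectness, class scores...]
            scores = row[5:]
            class_id = int(np.argmax(scores))
            confidence = float(scores[class_id])
            if confidence > conf_threshold:
                cx, cy, bw, bh = row[0] * w, row[1] * h, row[2] * w, row[3] * h
                box = (int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh))
                detections.append((class_id, confidence, box))
    return detections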

4.2 Fusion

After preprocessing, the actual fusion step takes place. In this step the detected camera objects are used as a hypothesis to find associated LiDAR data. The bounding boxes of the detected objects are used to predict a region of interest (ROI) in the image space. Subsequently, the relevant LiDAR points are transformed to the image space. This transformation consists of projecting the LiDAR points onto the image plane. For this projection to be correct, it is crucial that the positions of the camera and LiDAR are precisely determined. The distance between the center of the LiDAR and the center of the camera determines the image plane coordinates. The Center of Projection (COP) is the location of the LiDAR center relative to the center of the camera. In Fig. 3 an example projection is drawn. Here the COP is in the origin and the image plane is placed at x = a. This means the camera is placed at a distance a in front of the LiDAR. Furthermore, the camera and LiDAR are placed in a perfect line behind each other at the same height. Hence the COP is at the origin.
To map a 3D point (x1, y1, z1) to the image plane x = a, the projected point (a, y1', z1') is obtained from

x1' = x1 + k (x1 − cop_x) = a
y1' = y1 + k (y1 − cop_y)
z1' = z1 + k (z1 − cop_z)

with k = (a − x1) / (x1 − cop_x).
Fig. 3. LiDAR point projection: point (x1, y1, z1) projected onto the camera plane x = a.

Because the axes of the LiDAR and camera are different, the coordinates of the projected point in the image space become:

ImageX = y1' and ImageY = z1'
With the points and the detected object in the same coordinate space, a simple data association can be applied. It is important to note that one component of the LiDAR data is lost after the projection. This can introduce an error in the detection. For example, a car may be detected behind a street pole: the LiDAR points that fall onto the pole are then seen as points of the car. Thus these points need to be filtered. Therefore, the final part of the fusion step is applying a clustering algorithm on the associated point cloud to filter out this error. This clustering will filter out all points in front of and behind the car, as they will be left out of the object's cluster. This final clustering stage can be seen as the verification stage of this method, as objects with no cluster points are filtered out.
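A minimal NumPy sketch of the projection described above (the function name and the COP/plane parameters are illustrative, not taken from the paper):

import numpy as np

def project_points_to_image(points, cop, a):
    """Project 3D LiDAR points onto the camera plane x = a.
    points: (N, 3) array of (x, y, z); cop: (cop_x, cop_y, cop_z)."""
    points = np.asarray(points, dtype=float)
    cop = np.asarray(cop, dtype=float)
    # Scale factor k for each point, derived from x' = x + k*(x - cop_x) = a
    k = (a - points[:, 0]) / (points[:, 0] - cop[0])
    projected = points + k[:, None] * (points - cop)       # (a, y', z') for every point
    # Image coordinates follow the axis convention used above: ImageX = y', ImageY = z'
    return np.stack([projected[:, 1], projected[:, 2]], axis=1)

# Usage example (all values illustrative):
# image_xy = project_points_to_image(lidar_points, cop=(0.0, 0.0, 0.0), a=0.6)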
Finally, we have fused objects of both LiDAR and camera data. These fused objects contain a class, a bounding box location on the 2D camera image, and a point cloud from the LiDAR data. Using the data of these fused objects, further properties of the objects can be inferred. For example, in this method a 3D bounding box of the object is also determined. This 3D bounding box is calculated based on principal component analysis (PCA) using the object's LiDAR point cloud [7]. First, the eigenvectors are determined using PCA. Afterwards, the point cloud of the object is transformed to the origin such that the eigenvectors correspond to the axes of the space. Next, the minimum and maximum points are calculated; these points are used to determine the box width, height, and depth. Finally, the quaternion is calculated using the eigenvectors (which determines how the final box gets rotated), and the transform to put the box in the correct location is calculated.
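A minimal NumPy sketch of this PCA-based oriented bounding box computation (the paper follows the PCL-based recipe of [7]; here the rotation is returned as a matrix, which can be converted to a quaternion if needed):

import numpy as np

def oriented_bounding_box(points):
    """PCA-based oriented bounding box of an (N, 3) point cloud."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # Eigenvectors of the covariance matrix give the principal axes
    cov = np.cov((pts - centroid).T)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Express the points in the principal-axis frame (origin at the centroid)
    local = (pts - centroid) @ eigvecs
    mins, maxs = local.min(axis=0), local.max(axis=0)
    size = maxs - mins                                  # box width, height, depth
    # Box center back in the original frame; eigvecs acts as the rotation matrix
    center = centroid + eigvecs @ ((mins + maxs) / 2.0)
    return center, size, eigvecs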
The output of the system consists of the fused objects with their properties. These objects can be made accessible to higher levels of autonomous driving software.
5 Fusion 2: LiDAR Hypothesis

5.1 Preprocessing

In contrast to the first fusion system, this system uses the LiDAR input to propose the hypothesis. This method allows for more exact fusion and better 3D object localization. In the preprocessing phase the LiDAR data is used to detect objects. These objects are extracted from the point cloud using a Euclidean clustering algorithm. This is possible because only points of the same object are close enough to each other to form a cluster. The clustering algorithm constructs a Kd-search tree [8] of the point cloud and uses a threshold to find all points close enough to each other. It must be noted that the raw point cloud from the LiDAR sensor is not directly ready to be used by the clustering algorithm. First, the point cloud is down-sampled using a voxel grid [9] with a leaf size of 0.5. Afterwards, a ground plane removal algorithm finds and filters out the ground plane points. This is done using the RANSAC algorithm [10]. Lastly, all LiDAR points that cannot be projected onto the image are filtered out. This leaves a down-sampled point cloud with only relevant possible object points, which the clustering algorithm uses to propose objects. It is important to note that many of the algorithms used have critical parameters. For example, the clustering algorithm uses a threshold to determine whether points are close enough to each other. One can agree that this value determines the quality of the output. The optimal threshold value can vary from situation to situation. As described in the future work section, these critical parameters should be monitored and controlled during the fusion process. The output of the LiDAR clustering step is the set of clusters themselves.
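A minimal sketch of such a Euclidean clustering step (illustrative SciPy/NumPy code; the paper uses PCL's implementation, and the tolerance and minimum cluster size are placeholder values):

import numpy as np
from scipy.spatial import cKDTree

def euclidean_clusters(points, tolerance=0.5, min_size=10):
    """Group points whose mutual distance is below `tolerance` into clusters."""
    pts = np.asarray(points, dtype=float)
    tree = cKDTree(pts)
    unvisited = set(range(len(pts)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue, cluster = [seed], [seed]
        while queue:
            idx = queue.pop()
            # Expand the cluster with all unvisited neighbours within the tolerance
            for n in tree.query_ball_point(pts[idx], r=tolerance):
                if n in unvisited:
                    unvisited.remove(n)
                    queue.append(n)
                    cluster.append(n)
        if len(cluster) >= min_size:
            clusters.append(np.array(cluster))
    return clusters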

5.2 Fusion

In the fusion step of system two, the LiDAR clusters are first projected onto the image space. This is done using the same method as in fusion system one. Subsequently, the projected clusters are used to propose a ROI in the image space, in which an image object detection algorithm validates the proposition by classifying the objects. This ROI is proposed by finding the minimum and maximum X and Y values of the projected clusters plus a predetermined offset parameter to increase the detection chances. Again, it must be noted that this offset parameter has a notable impact on the correct detection rate. If this parameter is too big, the ROI is too large and multiple objects may be detected as one. On the other hand, if this parameter is too small, smaller objects may not be detected correctly (Fig. 4).
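The ROI proposal itself can be sketched as follows (illustrative code; `offset` is the predetermined margin discussed above, given here in pixels, and `projected_cluster` is assumed to be an (N, 2) NumPy array of projected image coordinates):

def propose_roi(projected_cluster, offset, image_shape):
    """Propose an image ROI (x_min, y_min, x_max, y_max) around a projected LiDAR cluster."""
    xs, ys = projected_cluster[:, 0], projected_cluster[:, 1]
    h, w = image_shape[:2]
    x_min = max(int(xs.min() - offset), 0)
    y_min = max(int(ys.min() - offset), 0)
    x_max = min(int(xs.max() + offset), w - 1)
    y_max = min(int(ys.max() + offset), h - 1)
    return x_min, y_min, x_max, y_max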
After the proposition is verified, the bounding box gets updated based on the classification. Now we have fused objects with both LiDAR point cloud data and image data. As in system one, the same properties (class, 2D and 3D bounding box) are inferred using this data. The output of the system is the same as in fusion system one.
Fig. 4. Example detections: (upper image) example detection of fusion system 1; (lower image) example detection of fusion system 2. These detections were made on the test setup described in Sect. 6. The data set used is the KITTI 0059 raw data set.

6 Evaluation and Benchmark

The test setup consists of a Velodyne VLP16 LiDAR and a StereoLabs ZED 3D camera. The LiDAR has a 360-degree horizontal field of view and 16 channels over 30 degrees of vertical field of view. This means the LiDAR outputs a 3D point cloud of the surrounding world. The Robot Operating System (ROS) is used to orchestrate the communication between the nodes of the system. The two sensors are both placed on top of the vehicle facing forward. The position of these sensors is critical in the fusion process to estimate the position of detected objects.

6.1 KITTI Benchmark

To evaluate the system, the KITTI benchmark data is used. This data set is composed by the KITTI team and consists of camera, LiDAR, GPS, and IMU data. The full setup is described by Geiger, Lenz, and Urtasun [11] in their paper. This raw data set [12] consists of recording fragments alongside a label file. The label file contains labels and properties of the objects that are visible in each frame of the fragment and is used as ground truth to verify our detections. The labels can be considered very accurate, as they were collected through multiple stages of human annotation. The properties used in this research for evaluation are the class name, 2D bounding box, and 3D bounding box information.
Our systems output a label file similar to the ground truth data for each frame of a fragment. An evaluation script compares the detections and calculates the precision and recall of the detections. To calculate these values, the evaluation script collects the correctly detected objects (true positives, TP), the falsely detected objects (false positives, FP), and the not detected objects (false negatives, FN). A detection is considered correct when the class name is correct and the bounding boxes overlap by at least fifty percent. The precision and recall are then given by:

precision = TP / (TP + FP)   and   recall = TP / (TP + FN)
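As an illustration of this evaluation, the following is a minimal sketch that interprets the fifty percent overlap criterion as an intersection-over-union of at least 0.5 (names and the greedy matching strategy are illustrative, not the paper's exact script):

def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def precision_recall(detections, ground_truth, iou_threshold=0.5):
    """detections and ground_truth are lists of (class_name, box) tuples."""
    tp, matched = 0, set()
    for cls, box in detections:
        for i, (gt_cls, gt_box) in enumerate(ground_truth):
            if i not in matched and cls == gt_cls and iou(box, gt_box) >= iou_threshold:
                tp += 1
                matched.add(i)
                break
    fp = len(detections) - tp
    fn = len(ground_truth) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall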

7 Results

The evaluation of the systems is done using the method described in Sect. 6.1. To compare the systems, the execution time, precision, and recall are used. The evaluations of the 2D and 3D detections are run separately. The main objective of the benchmark is to detect cars; other object classes are ignored. The tables below show the results of the benchmarks.1

KITTI data set 0022
Method | Execution time | Precision | Recall
F1 2D | 0.0805958 s | 72.46% | 90.98%
F1 3D | 0.0805958 s | 3.87% | 4.85%
F2 2D | 0.120369 s | 84.37% | 37.43%
F2 3D | 0.120369 s | 13.77% | 5.64%

KITTI data set 0059
Method | Execution time | Precision | Recall
F1 2D | 0.079720 s | 94.08% | 76.88%
F1 3D | 0.079720 s | 19.88% | 15.25%
F2 2D | 0.156222 s | 89.28% | 25.84%
F2 3D | 0.156222 s | 40.89% | 9.73%

This research presents two fusion systems which predict a 2D image bounding box and a 3D LiDAR point cloud bounding box. The main purpose is to compare the performance of a LiDAR-based fusion and a camera-based fusion. From these results one can clearly conclude that the 2D object detection achieves much better results than the 3D object detection. Generally speaking, the first fusion system, based on image ROIs, performs better. It has consistently higher recall and precision for the 2D benchmark. The execution time of the first system is also considerably lower. The reason for this is that the first system only needs to process the image information once. This is in contrast with the second system, where a small part of the image information is processed for each cluster. Processing these small image parts takes less time; nevertheless, the total processing time of all clusters together is longer. The current clustering algorithm also proposes irrelevant clusters, for example parts of buildings. These parts do not count as correctly detected objects, but they do require processing time. The main reason the second fusion system performs worse in terms of precision and recall is the Euclidean clustering algorithm. This algorithm uses parameters that need to be specifically tuned to each situation. Therefore, the resulting clusters are not always optimal. This means that when, for example, two cars are parked close to each other, they can be detected as one cluster because the threshold parameter was too high in that situation. Comparing the 3D detection results, one can clearly notice that the performance is remarkably lower than for the 2D detections. The reason for this is that the 3D bounding box calculation method is not optimal. However, the 3D precision of the second fusion system is notably higher. This is because the method is based on LiDAR cluster hypotheses, which lead to more accurate 3D detections.

1 F1 is Fusion system one and F2 is Fusion system two.

8 Conclusion and Future Work

In this research, a LiDAR and camera sensor fusion method is proposed which accurately determines a class and bounding boxes for detected objects. The hypothesis proposition and verification stages allow for very reliable detections. The system outputs fused data with inferred properties, which can be used in decision-making algorithms of further stages of autonomous applications. This method performs slightly lower in detection performance compared to other state-of-the-art methods; however, the execution times are very promising. One of the main advantages of this method is that the feature-based fusion allows for a highly modular design. The preprocessing for each sensor is done independently from the fusion processing. The system can therefore be easily extended with extra sensors to achieve a very robust detection system.

The proposed systems in this research are situated on level zero and level one of the JDL model. This currently includes only the data alignment and object refinement stages. It is future work to further extend the level one outputs to also track the detected objects in time, and in doing so infer more properties about the objects. Using optimal state estimation algorithms, for example Kalman filters, it will be possible to predict an object's heading and velocity. Based on that information a threat level (level three output) can be determined for each object.

To further improve the results of the systems, especially the second proposed fusion system, other LiDAR feature extraction algorithms can be explored. Currently a Euclidean clustering method is used which is difficult to tune to each situation, and therefore the results are not outstanding.

Furthermore, as described earlier, the fusion process can also be optimized during execution. This can be done by monitoring and updating system parameters such as thresholds. This processing corresponds to the fourth level of the JDL model.

To further improve the results of the system, it should be possible to switch between system one and system two based on the quality of the raw sensor streams. It is future work to provide a solution for this.

This paper is partially supported by the Belgian/Flemish Smart Highway project (VLAIO HBC.2017.0612).
References
1. White, F.E.: JDL, data fusion lexicon. Technical Panel for C3, vol. 15, no. 0704,
p. 15 (1991)
2. Zhang, F., Clarke, D., Knoll, A.: Vehicle detection based on LiDAR and cam-
era fusion, In: 17th International IEEE Conference on Intelligent Transportation
Systems (ITSC), pp. 1620–1625. IEEE, October 2014
3. Liang, M., Yang, B., Wang, S., Urtasun, R.: Deep continuous fusion for multi-
sensor 3D object detection. In: Lecture Notes in Computer Science (including sub-
series Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).
LNCS, vol. 11220, pp. 663–678 (2018)
4. Xu, D., Anguelov, D., Jain, A.: PointFusion: deep sensor fusion for 3D bounding
box estimation, Technical report (2018)
5. Shuqing, Z.: System and method for fusing radar/camera object data and LiDAR
scan points. US Patent 9,476,983, February 2016
6. Redmon, J., Farhadi, A.: Yolov3: an incremental improvement. arXiv (2018)
7. Point cloud library find minimum oriented bounding box of point cloud (C++ and
PCL). http://codextechnicanum.blogspot.com/2015/04/find-minimum-oriented-
bounding-box-of.html
8. Point cloud library module KdTree. http://docs.pointclouds.org/trunk/group
kdtree.html
9. Point cloud library pcl::voxelgrid <pointt> class template reference. http://docs.
pointclouds.org/1.8.1/classpcl 1 1 voxel grid.html
10. Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24(6), 381–395 (1981)
11. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The
KITTI vision benchmark suite. In: Conference on Computer Vision and Pattern
Recognition (CVPR), pp. 3354–3361 (2012)
12. Geiger, A., Lenz, P., Stiller, C., Urtasun, R.: Vision meets robotics: the KITTI
dataset. Int. J. Robot. Res. (IJRR) 32(11), 1231–1237 (2013)
The 5th International Workshop
on Signal Processing and Machine
Learning (SiPML-2019)
Apple Brand Classification Using CNN Aiming at Automatic Apple Texture Estimation

Shigeru Kato1, Ryuji Ito1, Takaya Shiozaki1, Fuga Kitano1, Naoki Wada1, Tomomichi Kagawa1, Hajime Nobuhara2, Takanori Hino1, and Yukinori Sato3

1 Niihama College, National Institute of Technology, Niihama, Japan
{skatou,wada,kagawa}@ele.niihama-nct.ac.jp, ri.ei.nnct17@gmail.com, sozktky.4096@gmail.com, Fu.ei.nnct19@gmail.com, hino@mat.niihama-nct.ac.jp
2 University of Tsukuba, Tsukuba, Japan
nobuhara@iit.tsukuba.ac.jp
3 Hirosaki University, Hirosaki, Japan
yukisato@hirosaki-u.ac.jp

Abstract. This paper describes a system to infer the brand of an apple based on the physical features of its flesh. The system comprises hardware to examine the apple's physical features and software with a convolutional neural network (CNN) to classify an apple into one of several brands. When a sharp metal blade cuts a piece of the apple flesh, the load and the sound are measured. From these data, the computer generates an image consisting of the sound spectrogram and a color bar expressing the load change. The sound spectrogram carries rich features of the apple flesh. The image is input to the CNN to infer the brand of the apple. In the experiment part, the authors validated the proposed system. The goal of our study is to construct a system to estimate texture attributes such as crunchiness or crispness. The system is applicable to quality management of apple brands. For example, one apple randomly chosen from many apples could be examined by the present system in order to check the texture quality of the flesh.

1 Introduction

There is a saying, "An apple a day keeps the doctor away" [1]. Apples are rich in nutrients such as polyphenols, minerals, vitamins and fiber [2, 3]. Eating apples helps prevent cancer, cardiovascular disease and liver disease [4–6]. In addition, hemodialysis patients and elderly people are encouraged to eat apples [7, 8]. Furthermore, apples have anti-bacterial and anti-viral functions [9, 10].

Against this background, the relationship between the sensory characteristics and the nutrition of apples has been investigated [11]. When we eat an apple, we perceive a fragrance, a taste and a texture. Consumers ordinarily purchase apples considering a nutrition facts label which indicates the acidity or the sugar content level. Such objective attributes can be measured by sensors and thus easily quantified as numerical values. On the other hand, the texture is also significant as an indicator of the quality of the apple. However, texture levels such as crunchiness or crispness are difficult to measure.


There are some studies in the texture of apple [12, 13]. These studies are analytical
works. Whereas study on the automatic texture estimation using AI (Artificial Intel-
ligent) is not common. If the AI can quantify the texture level of the apple, the
consumers can choose good texture apple thereby such system is useful. AI such as
convolutional neural network (CNN) is very useful in the pattern recognition [14]. In
our previous work [15], we have addressed to construct a system which estimates snack
texture levels such as crunchiness and crispness represented by numerical value within
(0,1). However, the apple flesh is soft and subtle unlike snacks e.g. cookies or biscuits.
Therefore, as a first step we challenged to classify the apples into any kinds of brands
we chose. We selected three kinds of apple brands, named brands A, B and C. We
employed our equipment to examine the apples and CNN to classify the apple brand.
The input of the CNN is an image representing sound spectrogram when the blade cuts
the apple flesh. The sound spectrogram has rich features of the apple flesh. Figure 1
shows the overview of the present system.

Fig. 1. Overview of present system: (a) The flesh sample is hollowed out by a metal pipe and then trimmed to the same length; (b) The load and the sound are observed by the sensors when the sample is cut by the sharp metal blade; (c) The load and the sound are input to the computer; (d) The sound and the load features are converted to a 227 × 227 pixel RGB image; (e) A convolutional neural network (CNN) infers the probability of each brand.

As shown in Fig. 1(a), the apple's flesh sample is hollowed out using a metal pipe and then trimmed to the same length. Figure 1(b) shows the equipment used to examine the sample. The sharp metal blade cuts the flesh sample. As Fig. 1(c) illustrates, the load and the sound are observed simultaneously when the blade cuts the apple flesh. The load and sound features are converted to a 227-by-227 pixel RGB image, shown in Fig. 1(d). The image is input to the CNN (Convolutional Neural Network) shown in Fig. 1(e), and the CNN then outputs the classification probabilities for the three brands. The brand with the highest value is identified as the input sample's brand. In the example of Fig. 1(e), the sample would be identified as "Brand A". Such an intelligent system is applicable to a quality management system for apple flesh texture. For example, one apple randomly chosen from many apples could be examined by the present system to check the texture quality of the flesh. In the next section, the equipment for examining the apple flesh is explained.

2 Equipment

As shown in Fig. 2(a), a sample flesh piece is placed below the blade. The blade is moved up and down by an air cylinder. A sound sensor and a load sensor are mounted on the blade. The load and the sound are observed simultaneously and given to the computer via an A-D converter with a sampling rate of 25 [kSamples/s].

Fig. 2. Equipment and sample size: (a) The sharp metal blade cuts the sample flesh; (b) The apple flesh samples are hollowed out and cut to the same size.

3 Experiment

We purchased apples of three brands in a local supermarket in Japan. Figure 2(b) (right side) shows the apples' width and height. The sizes of the three brands' apples were within a certain range. The sample flesh was hollowed out and cut to the same size, as Fig. 2(b) (left side) shows. The flesh samples are numbered by brand as shown in Table 1.

Table 1. Number of samples.


Brand A Brand B Brand C
Number of samples 30 30 30
Sample number No. 1–30 No. 31–60 No. 61–90
814 S. Kato et al.

The experiment for observing the load and the sound of the flesh samples was carried out under the conditions shown in Table 2.

Table 2. Experiment condition.


Item Value/condition
Air cylinder pressure 0.4 [MPa]
Temperature 25–26 [°C]
Humidity 41–51 [%]
Blade down speed 15 [mm/s]

Figure 3(a) shows the result for "Brand A." The top graph illustrates the load curve with a red line and the sound with a blue line. It can be seen that when the blade cuts the sample, the load begins to increase. The middle graph shows the automatically extracted signals for 1.0 [s]. The bottom graph shows the FFT result of the extracted 1.0-[s] sound data.
Fig. 3. Graphs of examination.



Figure 3(b) and (c) show the results for "Brand B" and "Brand C", respectively. Comparing (a), (b) and (c), a louder sound and a stronger load occurred for "Brand C". In other words, "Brand C" was harder than the other two brands. The FFT results also differ depending on the brand.
As shown in Fig. 4, the signals for 1.0 [s] are extracted from the entire 10-[s] recording as follows.
(i) The maximum load point (1) is found.
(ii) The point (2) is found. The point (2) is at 10% of the maximum load.
(iii) The signals for 1.0 [s] from the point (2) are extracted as (3).

Fig. 4. Signal extraction.
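A minimal NumPy sketch of this three-step extraction (function and variable names are illustrative; the 25 kSamples/s sampling rate is taken from Sect. 2):

import numpy as np

def extract_window(load, sound, fs=25000, duration=1.0):
    """Extract a 1.0 s window starting where the load first reaches 10% of its maximum."""
    load = np.asarray(load, dtype=float)
    peak_idx = int(np.argmax(load))                               # step (i): maximum load point
    threshold = 0.10 * load[peak_idx]
    start_idx = int(np.argmax(load[:peak_idx + 1] >= threshold))  # step (ii): first 10% point
    n = int(duration * fs)                                        # step (iii): 1.0 s of samples
    return load[start_idx:start_idx + n], np.asarray(sound)[start_idx:start_idx + n]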

Figure 5(a) shows the averages and standard deviations (STDs) of the load means over the extracted 1.0 [s] for each brand. Figure 5(b) shows the averages and STDs of the integrated FFT values between 1.0 [Hz] and 10 [kHz] for each brand.

Fig. 5. Load and sound features.



Fig. 6. Image input to CNN.

It is found that "Brand C" is the hardest, considering the average load value shown in Fig. 5(a). Therefore, "Brand C" is considered crunchier than "Brand B" overall. "Brand A" appears to have an attribute intermediate between B and C.
Figure 6 illustrates how the image input to the CNN is generated. The image consists of the sound spectrogram and the visualized load intensity for the extracted 1.0-[s] signals. As shown in Fig. 6(a), the FFT (Fast Fourier Transform) is performed on each of the sections (1)–(9), where the FFT period of each section is 0.2 [s] and there is a 0.1-[s] overlap between adjacent sections. The sound spectrogram visualizes the FFT results of sections (1)–(9), as Fig. 6(a) and (b) show. Furthermore, the image also includes the load intensity, displayed in the bottom part of Fig. 6(b).
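The construction of this input image can be sketched as follows (illustrative NumPy code; the logarithmic scaling and the final rendering/resizing to a 227 × 227 RGB image are assumptions, not details specified in the paper):

import numpy as np

def spectrogram_image(sound, load, fs=25000, win=0.2, hop=0.1):
    """Spectrogram frames plus a load-intensity strip from the extracted 1.0 s window."""
    n_win, n_hop = int(win * fs), int(hop * fs)
    frames = []
    for start in range(0, len(sound) - n_win + 1, n_hop):      # sections (1)-(9)
        segment = sound[start:start + n_win]
        frames.append(np.abs(np.fft.rfft(segment)))            # magnitude spectrum per section
    spec = np.log1p(np.stack(frames, axis=1))                  # frequency on y-axis, section index on x-axis
    # Load-intensity "color bar": mean load per section, later drawn under the spectrogram
    load_bar = np.array([load[s:s + n_win].mean()
                         for s in range(0, len(load) - n_win + 1, n_hop)])
    return spec, load_bar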

4 CNN

We decided to address a classification task using a CNN as a first step in the present paper. Since we have only 90 data samples, we adopted transfer learning [16] using AlexNet [17], which is a pre-trained CNN. We employed the Neural Network Toolbox in MATLAB developed by MathWorks [18].
Figure 7 shows the CNN used to classify the apple brands. The size of the input layer is 227 × 227 × 3 for 227 × 227 pixel RGB images. The output layer consists of three nodes. The convolution layers (conv1–conv5) and fully connected layers (fc6 and fc7) are the original AlexNet part [17], which is pre-trained with a massive number of images. We only attached fc8 and the soft-max layer to the original AlexNet, where all fc8 nodes are connected with all fc7 nodes.
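As an illustration of this transfer-learning setup, the following is a minimal PyTorch/torchvision sketch (the paper uses MATLAB's Neural Network Toolbox; the torchvision weights API shown requires a recent version, and the hyperparameters only loosely mirror Table 3):

import torch
import torch.nn as nn
from torchvision import models

# Reuse pre-trained AlexNet features and replace the last fully connected layer
# (fc8) with a 3-class layer for brands A, B and C.
model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
model.classifier[6] = nn.Linear(4096, 3)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)  # "sgdm"-style solver
criterion = nn.CrossEntropyLoss()  # combines soft-max and loss, matching the soft-max output layer

def train_step(images, labels):
    """One mini-batch update; images: (B, 3, 227, 227) tensor, labels: (B,) tensor."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()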

Fig. 7. CNN for classifying apple brands.

The soft-max layer outputs classification values for brands A, B and C within (0,1). The brand with the highest probability (classification value) is identified as the brand of the input apple. In the case of Fig. 7, "Brand A" is the judgement for the input image.
Several sample images are shown in Fig. 8. It can be seen that the load of "Brand B" is smaller than that of the other brands, as the bottom parts of the images in Fig. 8 display.

Fig. 8. Input images of CNN.

The 6-fold cross-validation of Fig. 9 was performed [19, 20]. Table 3 enumerates the transfer learning settings used to validate the CNN. The result is shown in Table 4. The mean of the accuracies over Data Sets (1)–(6) was 81.1%.
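A minimal sketch of the 6-fold data split used for this validation (illustrative; `evaluate` stands for one round of transfer learning and testing and is not defined here):

import numpy as np

def six_fold_splits(n_samples=90, k=6, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation (75/15 per fold, as in Table 3)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# accuracies = [evaluate(train, test) for train, test in six_fold_splits()]
# mean accuracy over the six folds corresponds to the value reported above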

Fig. 9. 6-fold cross validation of CNN.

Table 3. Transfer learning settings.


Parameter Value/condition
Solver sgdm
Learning rate 0.0001
Max epochs 12
Mini batch size 5
Total iterations 180
Number of train data/test data 75/15

Table 4. The result of the 6-fold cross-validation.


Data set (1) (2) (3) (4) (5) (6)
Accuracy [%] 86.67 66.67 53.33 93.33 93.33 93.33
Mini-batch loss at epoch 1 1.4835 2.8320 1.2534 1.6046 1.1831 1.1622
Mini-batch loss at epoch 12 0.0128 0.0096 0.0202 0.0029 0.2083 0.1122

Fig. 10. Schematic of CNN for texture estimation.

The accuracies for Data Sets (2) and (3) were lower. In Data Set (2), "Brand A" was confused with brands B and C, because "Brand A" has an attribute intermediate between brands B and C. In Data Set (3), the "Brand B" samples were identified as "Brand A" in all misjudged cases. In the future, we should improve the sensing method. For instance, it is necessary to reduce the air pressure and blade speed in order to observe the subtle sound and load features.

5 Conclusion

This paper addressed apple brand classification aiming at realizing automatic apple texture estimation. We employed a CNN which processes an image comprising the sound spectrogram, which carries rich information about the apple flesh, and a load-change color bar. The image was input to the CNN, which classifies the apple into one of the brands. We validated that the CNN works well in general. In order to improve the accuracy of the CNN, we will in the future carry out experiments with a reduced air pressure controlling the blade movement. In addition, we will realize a texture estimation system as drawn in Fig. 10, where the CNN is capable of outputting texture levels such as crunchiness or crispness.

Acknowledgments. The authors would like to thank the reviewers for giving us fruitful comments, and Maeda of MathWorks for giving us technical advice.

References
1. Kemper, K.J.: An apple a day: how trees improve human health. Complement. Ther. Med.
46, A1–A4 (2019)
2. Wruss, J., et al.: Differences in pharmacokinetics of apple polyphenols after standardized
oral consumption of unprocessed apple juice. Nutr. J. 14, 32 (2015)
3. Manzoor, M., et al.: Variations of antioxidant characteristics and mineral contents in pulp
and peel of different apple (Malus domestica Borkh.) cultivars from Pakistan. Molecules 17,
390–407 (2012)
4. Gerhauser, C.: Cancer chemopreventive potential of apples, apple juice, and apple
components. Planta Medica 74(13), 1608–1624 (2008)
5. Bondonno, N.P., et al.: The cardiovascular health benefits of apples: whole fruit vs. isolated
compounds. Trends Food Sci. Technol. Part B 69, 243–256 (2017)
6. ChrisSkinner, R., et al.: Apple pomace improves liver and adipose inflammatory and
antioxidant status in young female rats consuming a Western diet. J. Funct. Foods 61,
103471 (2019)
7. Giaretta, A.G., et al.: Apple intake improves antioxidant parameters in hemodialysis patients
without affecting serum potassium levels. Nutr. Res. 64, 56–63 (2019)
8. Avci, A., et al.: Effects of apple consumption on plasma and erythrocyte antioxidant
parameters in elderly subjects. Exp. Aging Res. 33(4), 429–437 (2007)
9. Pires, T.C.S.P., et al.: Antioxidant and antimicrobial properties of dried Portuguese apple
variety (Malus domestica Borkh. cv Bravo de Esmolfe). Food Chem. 240, 701–706 (2018)
10. Martin, J.H.J., Crotty, S., Warren, P., Nelson, P.N.: Does an apple a day keep the doctor
away because a phytoestrogen a day keeps the virus at bay? A review of the anti-viral
properties of phytoestrogens. Phytochemistry 68(3), 266–274 (2007)
11. Endrizzi, I., et al.: A conjoint study on apple acceptability: sensory characteristics and
nutritional information. Food Qual. Prefer. 40(1), 39–48 (2015)

12. Charles, M., et al.: Application of a sensory-instrumental tool to study apple texture
characteristics shaped by altitude and time of harvest. J. Sci. Food Agric. 98(3), 1095–1104
(2018)
13. Costa, F., et al.: Assessment of apple (Malus × domestica Borkh.) fruit texture by a combined acoustic-mechanical profiling strategy. Postharvest Biol. Technol. 61(1), 21–28 (2011)
14. LeCun, Y., et al.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
15. Kato, S., et al.: Snack texture estimation system using a simple equipment and neural
network model. Future Internet 11(3), 68 (2019)
16. Shin, H.C., et al.: Deep convolutional neural networks for computer-aided detection: CNN
architectures, dataset characteristics and transfer learning. IEEE Trans. Med. Imaging 35(5),
1285–1298 (2016)
17. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional
neural networks. In: Proceedings of the 25th International Conference on Neural Information
Processing Systems (NIPS 2012), pp. 1097–1105 (2012)
18. MathWorks, Transfer Learning Using AlexNet. https://www.mathworks.com/help/
deeplearning/examples/transfer-learning-using-alexnet.html?lang=en. Accessed 14 June 2019
19. Priddy, K.L., Keller, P.E.: Dealing with limited amounts of data. In: Artificial Neural
Networks - An Introduction, chap. 11, pp. 101–102. SPIE Press, Bellingham (2005)
20. Wong, T.-T.: Performance evaluation of classification algorithms by k-fold and leave-one-
out cross validation. Pattern Recogn. 48(9), 2839–2846 (2015)
Fundamental Study on Evaluation System of Beginner's Welding Using CNN

Shigeru Kato, Takanori Hino, and Naoki Yoshikawa

Niihama College, National Institute of Technology, Niihama, Japan
skatou@ele.niihama-nct.ac.jp, hino@mat.niihama-nct.ac.jp, kuma3kuma3yoshikawa3@gmail.com

Abstract. This paper describes a fundamental system to evaluate welding performed by beginners. The authors took several pictures of metal plates welded by beginners and then created image data. Each image is a part of a welding joint in a picture. The authors extracted the partial welding images from the pictures by hand. The extracted image data are divided into two categories: "good" welding images and "bad" ones. The images are input to a CNN to classify them as "good" or "bad". In the experiment, the validation of the CNN was carried out. In the conclusion, the result of the experiment and future work are discussed.

1 Introduction

GTA (Gas Tungsten Arc) welding is compatible with various metals and applications [1]. GTA welding is indispensable to industries such as cars, ships and buildings. Therefore, it is necessary to nurture the welding skills of young people [2, 3]. However, as shown in Fig. 1(a), a referee needs to check many welded plates, which places a physical burden on them. In addition, the evaluations sometimes differ depending on the individual referee's subjectivity, as shown in Fig. 1(b).

Fig. 1. Problems in GTA welding evaluation of beginners' works: (a) the referee is tired of evaluating many GTA works; (b) the evaluations among panels sometimes differ.


Based on the above background, this study addresses automatic evaluation of GTA welding performed by beginners. Since a CNN (Convolutional Neural Network) has high performance in image recognition [4], we employed a CNN. Although there are some studies on welding joint evaluation using CNNs [5], those studies focus on finding defects in joints welded by professionals or machines. In contrast, our study's emphasis is on welding works by beginners.

2 Method

First, the authors collected two kinds of pictures of welded plates. One kind shows plates labeled "good", which passed a test for deserving certification, displayed in Fig. 2(a). The other kind shows plates labeled "bad", which did not pass the test according to a human expert's judgement, shown in Fig. 2(b). The authors took several pictures of the plates using a digital camera in a room under constant brightness, where the camera was fixed on an arm as illustrated in Fig. 2(c).

Fig. 2. Pictures of GTA welded plates: (a) examples of "good" plates; and (b) "bad" plates; (c) the diagram of the situation when the pictures were taken.

As the initial task, we focused on the evaluation of a partial area of the welded joint, as shown in Fig. 3. A human manually extracts the square welding image using a mouse, and the image is then input to a CNN based on AlexNet [6] in order to evaluate it. The image is resized to 227 × 227 automatically when it is input to the CNN. The CNN outputs the probability of each evaluation. In this example, the output probability of "good" is higher, so the input image is judged as "good". The size of the input layer is 227 × 227 × 3 for 227 × 227 pixel RGB images. The output layer consists of two nodes.
The convolution layers (conv1–conv5) and fully connected layers (fc6 and fc7) are the original AlexNet part [6], which is pre-trained with a massive number of images. We only attached fc8 and the soft-max layer to the original AlexNet, where the fc8 nodes are connected with all fc7 nodes.

Fig. 3. Schematic of present system: the welding joint part is extracted as a square by human hand. A CNN (AlexNet) is utilized to judge the welding quality.
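For this two-class setting, the same transfer-learning idea applies; a minimal PyTorch/torchvision sketch (illustrative only, since the authors use MATLAB's Neural Network Toolbox, and the weights argument depends on the torchvision version):

import torch.nn as nn
from torchvision import models

# Two-class variant: AlexNet's fc8 is replaced by a "good"/"bad" output layer.
model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
model.classifier[6] = nn.Linear(4096, 2)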

The next section explains the experiment carried out to validate the present system.

3 Experiment

As shown in Fig. 4, the image data were extracted by hand using a mouse and labeled "good" or "bad", respectively.

Fig. 4. Image data extraction: (a) a "good" plate that passed a test for deserving certification. The square welding parts are extracted by human hand; (b) likewise, "bad" images are extracted. A "bad" plate did not pass the test.

Figure 5 shows all of the "good" and "bad" image data. These are used to evaluate the performance of the CNN in Fig. 3.

Fig. 5. Image data used to validate the CNN: (a) all good image data; (b) all bad image data.

The image data shown in Fig. 5 are numbered by "good" or "bad" as shown in Table 1.

Table 1. Number of image data.


Good Bad
Number of image data 30 30
Image data number No. 1–30 No. 31–60

In order to validate the present CNN in Fig. 3, we carried out transfer learning [7] using AlexNet. Since we have only 60 data samples (good: 30, bad: 30), we performed cross-validation [8, 9]. The 6-fold cross-validation in Fig. 6 was carried out.

Fig. 6. 6-fold cross validation of CNN.

Table 2 enumerates the conditions for transfer learning of the CNN. We employed the Neural Network Toolbox in MATLAB developed by MathWorks [10].

Table 2. Transfer learning settings.


Parameter Value/condition
Solver sgdm
Learning rate 0.0001
Max epochs 20
Mini batch size 25
Total iterations 40
Number of train data/test data 50/10

The validation result is shown in Table 3. The mean of the accuracies over Data Sets (1)–(6) was 80%. This value is higher than the probability of a random choice of "good" or "bad".

Table 3. The result of the 6-fold cross-validation.


Data set (1) (2) (3) (4) (5) (6)
Accuracy [%] 80 100 90 50 70 90
Mini-batch loss at epoch 1 1.0374 1.0174 0.8765 0.9192 1.7917 0.7814
Mini-batch loss at epoch 20 0.0002 0.0010 0.00003 0.0006 0.00004 0.0002

Fig. 7. Example of welding joint detection by R-CNN.

Fig. 8. Schematic of the proposed system connected to the internet.


Fundamental Study on Evaluation System of Beginner’s Welding 827

4 Conclusion

The present paper addressed welding evaluation using a CNN. We prepared pictures of metal plates welded by beginners. As the initial task, we created small square welding images, extracted using a mouse from the welding plate pictures. The image data were divided into two categories: "good" welding images (Fig. 5(a)) and "bad" welding images (Fig. 5(b)). The images were input to the CNN, which classifies them as "good" or "bad". We confirmed that the CNN worked well in some cases. In order to improve the accuracy of the CNN, we will carry out experiments with much more data. In the future, we will realize an automatic evaluation system using R-CNN (Region-CNN), which can find the welding part automatically, as illustrated in Fig. 7. Considering the problem that evaluations sometimes differ depending on the individual referee's subjectivity, as drawn in Fig. 1(b), and also depending on regions or countries, we should construct a single server connected to the internet to evaluate welding works by only one criterion, as shown in Fig. 8.

References
1. Niles, R.W., Jackson, C.E.: Weld thermal efficiency of the GTAW process. Weld. J. 54, 25–
32 (1975)
2. Asai, S., Ogawa, T., Takebayashi, H.: Visualization and digitation of welder skill for
education and training. Weld. World 56, 26–34 (2012)
3. Byrd, A.P., Stone, R.T., Anderson, R.G., Woltjer, K.: The use of virtual welding simulators
to evaluate experimental welders. Weld. J. 94(12), 389–395 (2015)
4. LeCun, Y., et al.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
5. Park, J.-K., An, W.-H., Kang, D.-J.: Convolutional neural network based surface inspection
system for non-patterned welding defects. Int. J. Precis. Eng. Manuf. 20(3), 363–374 (2019)
6. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional
neural networks. In: Proceedings of the 25th International Conference on Neural Information
Processing Systems (NIPS 2012), pp. 1097–1105 (2012)
7. Shin, H.C., et al.: Deep convolutional neural networks for computer-aided detection: CNN
architectures, dataset characteristics and transfer learning. IEEE Trans. Med. Imaging 35(5),
1285–1298 (2016)
8. Priddy, K.L., Keller, P.E.: Dealing with limited amounts of data. In: Artificial Neural
Networks - An Introduction, chap. 11, pp. 101–102. SPIE Press, Bellingham (2005)
9. Wong, T.-T.: Performance evaluation of classification algorithms by k-fold and leave-one-
out cross validation. Pattern Recogn. 48(9), 2839–2846 (2015)
10. MathWorks, Transfer Learning Using AlexNet. https://www.mathworks.com/help/
deeplearning/examples/transfer-learning-using-alexnet.html?lang=en. Accessed 14 June
2019
Building an Early Warning Model for Detecting Environmental Pollution of Wastewater in Industrial Zones

Nghien Nguyen Ba1 and Ricardo Rodriguez Jorge2

1 Department of Information Technology, Hanoi University of Industry, Number 298, Cau Dien Street, Bac Tu Liem District, Hanoi, Vietnam
nguyenbanghien_cntt@haui.edu.vn
2 Chihuahua, Mexico
ricardo.rodriguezjorge.mx@ieee.org

Abstract. In this paper, we present two soft computing techniques, support vector regression (SVR) and fuzzy logic, to build an early warning model for detecting environmental pollution from wastewater in industrial zones. To determine the proper number of inputs for the model, we use an algorithm to find the embedding dimension of the time series. Our proposed model, which has high accuracy and a short training time, helps wastewater processing station operators take early action and avoid environmental pollution.

1 Introduction

According to Circular 24 of the Vietnamese Ministry of Natural Resources and Environment, all industrial zones that discharge waste into the environment at a flow rate greater than 1000 m3 per day must install an automatic monitoring system for wastewater quality [1]. At the wastewater processing station of an industrial zone, the operator, based on certain measurement and monitoring values, must take proper action when any criterion of the wastewater exceeds the permitted threshold value. Wastewater that does not satisfy this standard and has already been discharged into the environment leads to environmental pollution. Hence, to avoid environmental pollution, we can build a model based on historical data collected by data loggers to predict the future quality of the wastewater; the model could help operators of wastewater processing stations take early action to make sure that the quality of the wastewater always satisfies the standard requirements. Currently, Vietnam is only focused on installing automatic monitoring systems and has not yet exploited the collected data to build a prediction model for the future quality of wastewater.

Predicting future time series values from the past is applied in many fields, such as the following: economics, where forecasting stock prices helps investors choose the best time to invest in the stock market to get the highest profit, and predicting exchange rates helps import-export businesses choose appropriate times to import or export products [2–4]; and energy, where predicting the wind speed of a wind farm and the electrical load helps energy policymakers make the best plan to meet the requirements of clients [5, 6], and so on.


Due to the importance of predictive work, scientists have successfully used a number of techniques in recent years to build time series prediction models. In [7], the authors combined Kernel Regression (KR) and a Functional Link Artificial Neural Network (FLANN) to predict exchange rates from USD to GBP, INR and JPY. KR played the role of a filter and FLANN was the model used for prediction. The authors in [8] used chaos theory and reconstructed state space for predicting the exchange rates between USD and EUR. Liu [9] used a hybrid of the Discrete Wavelet Transform (DWT) and Support Vector Regression (SVR) to predict the exchange rates between CNY and USD. First, he used the DWT to decompose the time series data into different time scales, and then he chose an appropriate kernel function for SVR and performed prediction for each time scale. Finally, he synthesized the prediction result from the different predicted time scale results. The authors in [10] used a local fuzzy reconstruction method to predict exchange rates between JPY, USD and CAN. The authors in [11] successfully used a wavelet transform to filter noise in the exchange rate time series before using it for training and prediction based on a multi-layer feed-forward neural network. Liu [12] used a hybrid of neural networks and fuzzy logic to predict exchange rates between JPY and USD.
In this paper, we use fuzzy logic and SVR to build a model for the prediction of wastewater quality, and we compare the results of the two models.

2 Related Work

2.1 Building the Model Based on Support Vector Regression (SVR)


The goal of a Support Vector Machine (SVM) is to find the optimal hyperplane (the hyperplane may be a plane or a curve) that classifies data into two separate regions so that the distance between the closest points and the hyperplane is at a maximum. This distance is also called the margin. Figure 1 illustrates a hyperplane and a margin.
Assume the equation of the hyperplane is w · x + b = 0. The goal of the SVM algorithm is to find w and b that maximize the margin. The SVM algorithm applies not only to classification problems but also to regression problems. The SVR algorithm is based on a loss function which tolerates errors for points that lie within a small epsilon of the true value. This means that this function gives zero error for all the points in the training set that lie within the epsilon range. Figures 2 and 3 illustrate linear and nonlinear regression within the epsilon range [13].
For SVR, the input x is first mapped into an m-dimensional feature space by a nonlinear mapping function, and then a linear model is built in this feature space by Eq. (1):

f(x, w) = Σ_{i=1}^{m} w_i g_i(x) + b    (1)

where g_i(x), i = 1, 2, ..., m, is a set of nonlinear mapping functions.

Fig. 1. Illustration of the hyperplane and the margin.

Fig. 2. Linear regression with the epsilon range.

Fig. 3. Nonlinear regression with the epsilon range.

The accuracy of the estimate is evaluated by a loss function L(y, f(x, w)). SVR uses the epsilon-insensitive loss function proposed by Vapnik:

L = 0 if |y − f(x, w)| ≤ ε;  L = |y − f(x, w)| otherwise.    (2)

Thus, SVR performs linear regression in the multi-dimensional feature space using the loss function L while minimizing ||w||^2 to decrease the complexity of the model. This problem can be solved by introducing slack variables ξ_i and ξ_i* with i = 1, ..., n, to measure the deviation of the training samples which lie outside the epsilon range. Therefore, SVR minimizes the function below:

min (1/2) ||w||^2 + C Σ_{i=1}^{n} (ξ_i + ξ_i*)    (3)

with constraints:

y_i − f(x_i, w) ≤ ε + ξ_i
f(x_i, w) − y_i ≤ ε + ξ_i*
ξ_i, ξ_i* ≥ 0  for all i = 1, ..., n    (4)

Applying the duality theorem for minimization problems, we finally obtain the function f(x):

f(x) = Σ_{i=1}^{nSV} (a_i − a_i*) K(x_i, x) + b,  with 0 ≤ a_i, a_i* ≤ C    (5)

where nSV is the number of support vectors and K(x_i, x) is the kernel function, which can be defined as

K(x_i, x) = Σ_{j=1}^{m} g_j(x_i) g_j(x)    (6)
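As an illustration, an epsilon-SVR model of this kind can be fitted with scikit-learn as sketched below (the library, the kernel choice, the synthetic series, and the hyperparameter values are placeholders, not the paper's settings):

import numpy as np
from sklearn.svm import SVR

def make_lagged(series, d):
    """Turn a 1-D series into (X, y) pairs using the last d values as predictor inputs."""
    X = np.array([series[i:i + d] for i in range(len(series) - d)])
    y = np.array(series[d:])
    return X, y

series = np.sin(np.linspace(0, 20, 500))          # stand-in for a measured quality series
X, y = make_lagged(series, d=3)
model = SVR(kernel='rbf', C=10.0, epsilon=0.01)   # epsilon defines the insensitive range of Eq. (2)
model.fit(X[:400], y[:400])
prediction = model.predict(X[400:])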

2.2 Building a Model Based on Fuzzy Logic

Assume that we have a time series collected from a system at equal time intervals, denoted by s(1), s(2), s(3), ..., s(n). The task of time series prediction is to find a mapping from [s(k − d + 1), s(k − d + 2), ..., s(k)] to s(k + l), where d and l are constant positive integers and d is the number of inputs to the predictor. For the simple case, we assume d = 2 and l = 1. Figure 4 below shows the block diagram of the prediction system. According to the algorithm presented by Wang [14], we first form n − 2 input–output pairs: (s(1), s(2) → s(3)), (s(2), s(3) → s(4)), ..., (s(n − 2), s(n − 1) → s(n)). Next, we find the maximum and minimum of the time series and divide this domain interval into 2R + 1 regions (R is a positive integer), denoted by T1, T2, ..., T2R, T2R+1, and then we assign each region a fuzzy membership function. In our case, we choose a triangular membership function. Figures 4 and 5 illustrate the fuzzy system and the membership functions of the input and output with R = 3.

Fig. 4. Fuzzy system with 2 inputs and 1 output.

Fig. 5. Membership function for input and output.

In the next step we calculate the degree of each given input–output pair in the different
regions, assign each value to the region with the maximum degree, and then form an IF–THEN
rule. For example, if IN1 has a maximum degree of 0.8 in region T2, IN2 has a maximum degree of 0.5 in
region T5, and O has a maximum degree of 0.9 in region T6, we form the rule: IF IN1 is
T2 AND IN2 is T5 THEN O is T6. Repeating this procedure for each input–output
pair gives a set of rules. To avoid conflicting rules (two rules with the same IF part but
different THEN parts), we only accept the rule from the conflict group that has the
maximum degree. To do so we use a lookup table to represent the fuzzy rule base. The
cells of the rule base are filled with the rules; if more than one rule falls into the same cell of
the fuzzy rule base, the rule with the highest degree is kept. The degree of a
rule is calculated by formula (7) below:

$$D(\text{rule}) = \mu_A(IN1) \cdot \mu_B(IN2) \cdot \mu_C(O). \qquad (7)$$

At this point we have obtained the fuzzy rule base corresponding to all input–output pairs. The
next task is to calculate the output O for a new input sample (IN1, IN2). To do
so we first calculate the output degree of the k-th rule in the combined
fuzzy rule base for the new input IN1 and IN2 according to formula (8)
below:

$$\mu^{k}_{O^k} = \mu_{I_1^k}(IN1) \cdot \mu_{I_2^k}(IN2) \qquad (8)$$

where $O^k$ denotes the output region of rule $k$, and $I_j^k$ denotes the input region of rule $k$
for the $j$-th input component. Finally, we use the center-average defuzzification formula to
determine the output:

$$o = \frac{\displaystyle\sum_{k=1}^{N} \mu^{k}_{O^k} \cdot \bar{O}^k}{\displaystyle\sum_{k=1}^{N} \mu^{k}_{O^k}} \qquad (9)$$

where $\bar{O}^k$ denotes the center value of region $O^k$ and $N$ is the number of rules in the
combined fuzzy rule base.
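The rule-generation and defuzzification steps above (formulas (7)–(9)) can be condensed into a short sketch. This is a minimal illustration of the table-lookup scheme, assuming triangular membership functions and the two-input, one-output case; the function and variable names are our own and the region count is a placeholder.

```python
import numpy as np

def tri_memberships(x, centers, width):
    """Degree of x in each triangular region centered at `centers` with half-width `width`."""
    return np.clip(1.0 - np.abs(x - centers) / width, 0.0, None)

def build_rule_base(pairs, centers, width):
    """Wang-Mendel style lookup table: one rule per (IN1, IN2) cell, keeping the highest degree."""
    rules = {}
    for in1, in2, out in pairs:
        m1, m2, mo = (tri_memberships(v, centers, width) for v in (in1, in2, out))
        key = (int(np.argmax(m1)), int(np.argmax(m2)))          # IF-part regions
        degree = m1.max() * m2.max() * mo.max()                  # formula (7)
        if key not in rules or degree > rules[key][1]:
            rules[key] = (int(np.argmax(mo)), degree)            # THEN-part region, rule degree
    return rules

def predict(in1, in2, rules, centers, width):
    """Center-average defuzzification over the fired rules (formulas (8) and (9))."""
    m1, m2 = tri_memberships(in1, centers, width), tri_memberships(in2, centers, width)
    num = den = 0.0
    for (r1, r2), (ro, _) in rules.items():
        mu = m1[r1] * m2[r2]                                     # formula (8)
        num += mu * centers[ro]                                  # center value of the output region
        den += mu
    return num / den if den > 0 else np.nan

# toy usage: R = 3 gives 2R + 1 = 7 regions spanning the range of the series
s = np.sin(np.linspace(0, 15, 200))
R = 3
centers = np.linspace(s.min(), s.max(), 2 * R + 1)
width = centers[1] - centers[0]
pairs = [(s[k], s[k + 1], s[k + 2]) for k in range(len(s) - 2)]  # (s(k), s(k+1)) -> s(k+2)
rules = build_rule_base(pairs, centers, width)
print(predict(s[-2], s[-1], rules, centers, width))
```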

2.3 Finding Embedding Dimension Space


The number of inputs to the predictor is the embedding dimension (window size) [15]. The
authors in [16] present the false nearest neighbors method for finding the
minimum embedding dimension. Assume we have a time series $s(1), s(2), s(3), \ldots, s(n)$.
The idea of the algorithm is to combine consecutive values of the series into a
set of vectors $V$ of dimension $d$ ($d = 1, 2, 3, \ldots$), for example
$v(k) = [s(k), s(k+1), \ldots, s(k+d-1)]$. We use the Euclidean distance between two
vectors; formula (10) gives the distance between vectors $v(k)$ and $v(m)$:

$$D(d)_{km} = \sqrt{\sum_{i=0}^{d-1} \left[ s(k+i) - s(m+i) \right]^2} \qquad (10)$$

For any $v(k)$ we can find its nearest neighbor $v(m)$ with distance $D(d)_{km}$, increase
the dimension to $d+1$ to calculate $D(d+1)_{km}$, and check expression (11); if it holds,
$v(m)$ is a false nearest neighbor:

$$\frac{\left| D(d)_{km} - D(d+1)_{km} \right|}{D(d)_{km}} > D_{th} \qquad (11)$$

where $D_{th}$ is a threshold. According to [16], $D_{th}$ lies in the range 10 to 50; in our
case we choose $D_{th} = 10$. We repeat this procedure starting from $d = 1$, increasing
$d$ by 1 after each iteration, until the percentage of false nearest neighbors approaches
zero or a small value, or until increasing $d$ no longer decreases the percentage of
false nearest neighbors appreciably.
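A compact sketch of this false nearest neighbors procedure is given below. It is an illustrative implementation of expressions (10) and (11) with the stated threshold, written in plain NumPy with brute-force nearest-neighbor search; the function names and the stopping tolerance are our own choices.

```python
import numpy as np

def fnn_fraction(s, d, d_th=10.0):
    """Fraction of false nearest neighbors for embedding dimension d (expressions (10)-(11))."""
    n = len(s) - d                                       # vectors that can be extended to d+1
    V = np.array([s[k:k + d] for k in range(n)])         # d-dimensional delay vectors
    V1 = np.array([s[k:k + d + 1] for k in range(n)])    # the same vectors extended to d+1
    false = 0
    for k in range(n):
        dist = np.linalg.norm(V - V[k], axis=1)
        dist[k] = np.inf                                 # exclude the vector itself
        m = int(np.argmin(dist))                         # nearest neighbor in dimension d
        d_km, d1_km = dist[m], np.linalg.norm(V1[k] - V1[m])
        if d_km > 0 and abs(d_km - d1_km) / d_km > d_th:
            false += 1
    return false / n

def embedding_dimension(s, d_max=20, tol=0.01):
    """Smallest d whose false-neighbor fraction drops below tol (or stops improving)."""
    prev = None
    for d in range(1, d_max + 1):
        frac = fnn_fraction(s, d)
        if frac <= tol:
            return d
        if prev is not None and prev - frac < 1e-3:      # no longer decreasing appreciably
            return d
        prev = frac
    return d_max

s = np.sin(np.linspace(0, 30, 451)) + 0.05 * np.random.randn(451)
print(embedding_dimension(s))
```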

3 Principles of Our Proposal

The principle of our proposed method is illustrated in Fig. 6. First, we find the embedding
dimension of the time series data $s(t)$. Next, we form a training set from $s(t)$ and the
embedding dimension $d$. Then, we train the model built with fuzzy logic or SVR. Next, we test
the trained model on the test data and check the accuracy criterion. If the criterion
is met, we use the model for predicting future values. Otherwise, we change the
model's parameters and go back to the training step.

Fig. 6. Principle of our proposed method.

4 Simulation Results

To assess the performance of our proposal we use four time series: PH, temperature,
TSS, and COD. These data sets were collected from 0 AM on 25th
February 2019 to 2 PM on 26th February 2019, with the sampling period set to five
minutes, at the wastewater processing station of the Nittoku paper factory in the Kim Bang
district, Ha Nam province, Vietnam. The total data length is 451 points for each
parameter. We use the first 401 points as the training set and the remaining 50 points as
the test set.
For the PH parameter we find that the embedding dimension is nine. This
means that we use nine data points from the past to predict one point in the future, i.e.,
the model has nine inputs and one output. Figure 7 shows the prediction results of the
SVM model vs. the fuzzy logic model. The mean square error of SVR is 0.000077,
while that of fuzzy logic is 0.0003.
For the temperature parameter, we found that the embedding dimension of the
time series is eleven. Figure 8 shows the prediction results of the SVM model vs. the fuzzy
logic model. The mean square error of SVR is 0.04, while that of fuzzy logic is 0.09.
For TSS, we found that the embedding dimension of the time series is nine.
Figure 9 shows the prediction results of the SVM model vs. the fuzzy logic model. The
mean square error of SVR is 0.07, while that of fuzzy logic is 0.06.
For COD, we found that the embedding dimension of the time series is five.
Figure 10 shows the prediction results of the SVM model vs. the fuzzy logic model. The
mean square error of SVR is 0.02, while that of fuzzy logic is 0.77.

Fig. 7. Prediction result for the PH parameter

Fig. 8. Prediction results for the temperature parameter

Fig. 9. Prediction results for the TSS parameter



Fig. 10. Prediction results for the COD parameter

5 Conclusions

In this paper, we presented the combination of a false nearest neighbors algorithm, used to find the
embedding dimension of a time series, with fuzzy logic and SVR models for predicting
the quality of wastewater in industrial zones. We use the embedding dimension as the
number of inputs for the model, and data collected from a wastewater processing station
to build and test the model. The experimental results show that the SVR model obtains
more accurate results than the fuzzy model for most of the case study data. The
exception is the TSS case, which was very noisy; there the fuzzy logic model gives better
results than the SVR model because it tolerates noise better. In addition, training the fuzzy
logic model requires less time than training the SVR model, because the fuzzy logic model
passes over the training data set only once, while training the SVR model requires
several passes over the training data. Therefore, we can apply the fuzzy logic
model to noisy, real-time parameters with lower accuracy requirements, and apply the SVR
model otherwise. However, the measured wastewater parameters are affected by many dis-
turbances, so the collected values contain a substantial amount of noise, which causes
the embedding dimension of the time series to be large. This not only
affects the accuracy of the model but also the training time. In the future, we plan
to apply a Hilbert–Huang Transform as an adaptive filter to denoise the collected time
series data before finding the embedding dimension and building the model. This
will decrease the model's complexity and increase the prediction accuracy.

Acknowledgments. We would like to acknowledge the Hanoi University of Industry for sup-
porting our work.

References
1. Circular 24 of the Vietnamese Ministry of Natural Resources and
Environment: 24/2017/TT-BTNMT, 1st September 2017, Hanoi, Vietnam (2017)
2. Wang, Y.-F.: Predicting stock price using fuzzy grey prediction system. Expert Syst. Appl.
22, 33–39 (2002)
3. Liu, W.: Forecasting exchange rate change between USD and JPY by using dynamic
adaptive neuron fuzzy logic system. Asia Pac. Financ. Bank. Res. 2(2), 1–12 (2008)
4. Castillo, O., Melin, P.: Hybrid intelligent system for time series prediction using neural
networks, fuzzy logic, and fractal theory. IEEE Trans. Neural Netw. 13(6), 1395–1408
(2002)
5. Lanzhen, L., Tan, S.: The study on short-time wind speed prediction base on time-series
neural network algorithm. In: 2010 Asia-Pacific Power and Energy Engineering Conference
(APPEEC) (2010)
6. Thiang, Y.K.: Electrical load time series data forecasting using interval type 2 fuzzy logic
system. In: 2010 3rd International Conference on Computer Science and Information
Technology, vol. 5, pp. 527–531. IEEE (2010)
7. Hua, X., Zhang, D., Leung, S.C.: Exchange rate prediction through ANN based on kernel
regression. In: 2010 Third International Conference on Business Intelligence and Financial
Engineering, 13 August 2010, pp. 39–43. IEEE (2010)
8. Hanias, M.P., Curtis, P.G.: Time series prediction of dollar/euro exchange rate index. Int.
Res. J. Financ. Econ. 15, 232–239 (2008)
9. Liu, F.-Y.: Exchange rate based on wavelet and support vector regression. In: 2010 2nd
International Conference Advanced Computer Control (ICACC) (2010)
10. Iokibe, T., Murata, S., Koyama, M.: Prediction of foreign exchange rate by local fuzzy
reconstruction method. In: 1995 IEEE International Conference on Systems, Man and
Cybernetics. Intelligent Systems for the 21st Century, 22 October 1995, vol. 5, pp. 4051–
4054. IEEE (1995)
11. Božić, J., Vukotić, S., Babić, D.: Prediction of the RSD exchange rate by using wavelets and
neural networks. In: 2011 19th Telecommunications Forum (TELFOR) Proceedings of
Papers, 22 November 2011, pp. 703–706. IEEE (2011)
12. Liu, W.: Forecasting exchange rate change between USD and JPY using dynamic adaptive
neuron-fuzzy logic system. Asia Pac. J. Financ. Bank. Res. 2(2), 1–12 (2008)
13. Support Vector Regression – Data Mining Map. https://www.saedsayad.com/support_
vector_machine_reg.htm
14. Li, W.: Adaptive Fuzzy Systems and Control Design and Stability Analysis, pp. 65–69.
Prentice Hall International (1994, 2000)
15. Frank, R.J., Davey, N., Hunt, S.P.: Time series prediction and neural networks. J. Intell.
Robot. Syst. 31(1–3), 91–103 (2001)
16. Abarbanel, H.D., Brown, R., Sidorowich, J.J., Tsimring, L.S.: The analysis of observed
chaotic data in physical systems. Rev. Mod. Phys. 65(4), 1331 (1993)
A Robust Fully Correntropy–Based
Sparse Modeling Alternative
to Dictionary Learning

Carlos A. Loza(B)

Department of Mathematics, Universidad San Francisco de Quito, Quito, Ecuador


cloza@usfq.edu.ec

Abstract. Correntropy is a dependence measure that goes beyond


Gaussian environments and optimizations based on Minimum Squared
Error (MSE). Its ability to induce a metric that is fully modulated by a
single parameter makes it an attractive tool for adaptive signal process-
ing. We propose a sparse modeling framework based on the dictionary
learning technique known as K–SVD where Correntropy replaces MSE in
the sparse coding and dictionary update subroutines. The former yields a
robust variant of Orthogonal Matching Pursuit while the latter exploits
robust Singular Value Decompositions. The result is Correntropy–based
dictionary learning. The data–driven nature of the approach com-
bines two appealing features in unsupervised learning—robustness and
sparseness—without adding hyperparameters to the framework. Robust
recovery of bases in synthetic data and image denoising under impulsive
noise confirm the advantages of the proposed techniques.

1 Introduction
Sparse modeling refers to the mechanisms involved in the learning of a linear
generative model where the inputs are the result of sparse activations of selected
vectors from an overcomplete basis. Its rationale comes from principles of par-
simony where it is advantageous to represent a given phenomenon with as few
variables as possible. Sparse modeling has been particularly appealing to two
fields with (usually) different objectives and methodologies: statistics [2,4,18]
and signal processing [5,6,13]. In neuroscience, Olshausen and Field [15] paved
the way to what is currently known as dictionary learning—instead of using
a fixed off–the–shelf basis, the authors proposed a fully data–driven learning
scheme to estimate said basis, also known as dictionary.
In Image Processing and Computer Vision, sparse modeling is exploited for
denoising [7,16], inpainting [12], and demosaicking [11]. Most of these applica-
tions rely on K–SVD [1]—the well known dictionary learning technique that
exploits a block coordinate descent approach to reach a local stationary point of
a constrained optimization problem. Results are optimal under additive homo-
geneous Gaussian noise. Yet, if the underlying error deviates from normality, e.g.


in the presence of outliers in the form of missing pixels or impulsive noise, the
optimizers might introduce a bias.
Robust estimators are a principled scheme to deal with outliers in linear
regimes [3]. One variant of said techniques is based on Correntropy [9]—the
dependence measure that goes beyond Gaussian environments and their asso-
ciated criterion for maximum likelihood estimation: Minimum Squared Error
(MSE). Correntropy mimics induced metrics that are fully regulated via one
main hyperparameter; if said scale parameter is chosen properly, the induced
metric is robust against outliers. We harness such property to propose a novel
dictionary learning approach where MSE criteria of K–SVD are replaced by
robust metrics based on Correntropy. The result is Correntropy–based Dictio-
nary Learning, or CDL.
Like K-SVD, CDL exploits fast sparse code estimators, such as Orthog-
onal Matching Pursuit (OMP), and iterative Singular Value Decompositions
(SVD). Wang et al. proposed a Correntropy variant of OMP known as CMP
[20] while Loza and Principe devised CK–SVD, a robust alternative to MSE–
based SVD [10]. In the current work, we combine both approaches in a fully
robust Correntropy–based sparse modeling framework for linear generative mod-
els. Synthetic data and image denoising under non–homogeneous impulsive noise
confirm the robustness and sparseness of the solutions in addition to their supe-
riority over K–SVD. The rest of the paper is organized as follows: Sect. 2 details
the problem of robust dictionary learning and the proposed solutions. Section 3
summarizes the results, and, lastly, Sect. 4 concludes the paper and mentions
potential further work.

2 Correntropy–Based Sparse Modeling


Let $Y = \{y_i\}_{i=1}^{N}$ ($y_i \in \mathrm{IR}^n$) be a set of observations or measurements where
each vector can be encoded as a sparse linear combination of predictors, also
known as atoms, from an overcomplete basis, or dictionary $D \in \mathrm{IR}^{n \times K}$:

$$y = Dx_0 + n \quad \text{s.t.} \quad \|x_0\|_0 = T_0 \qquad (1)$$

where $T_0$ is the support of the ideal sparse decomposition $x_0$, $\|\cdot\|_0$ represents
the $\ell_0$–pseudonorm, and $n$ is the additive noise. The sparse coding problem aims
to estimate $x_0$ given $y$ and a sparsity constraint. The sparse modeling problem
generalizes to a full generative model where both sparse code and dictionary are
unknown. Then for $Y$, the constrained optimization becomes:

$$\min_{D,X}\, \{\|Y - DX\|_F^2\} \quad \text{s.t.} \quad \forall i,\; \|x_i\|_0 \le T_0 \qquad (2)$$

where xi is the sparse code corresponding to the yi entry and ||·||F stands for the
Frobenius norm. The performance surface in Eq. (2) is non–convex; hence, typ-
ical greedy techniques are adopted instead. In this case, K–SVD [1] generalizes
k–means by alternating between finding sparse codes, i.e. distributed represen-
tations of the inputs, and dictionary update in the form of SVD routines in a

atom–by–atom scheme. Even though K–SVD admits any off–the–shelf sparse


coding technique, Orthogonal Matching Pursuit (OMP) [19] is usually preferred
due to its convergence properties, efficiency, and simple, intuitive implementa-
tion.
OMP and SVD–based routines are anchored on the underlying assumption of
Gaussian errors. The former exploits Ordinary Least Squares (OLS) to sequen-
tially update the active set of atoms, while the latter utilizes MSE as the cost
function to update the dictionary elements. Both approaches are destined to
introduce biases in the presence of outliers. We circumvent the Gaussianity
assumption while incorporating robustness into the sparse modeling framework
by exploiting Correntropy as the cost function in both K–SVD stages.

2.1 Correntropy–Based OMP


OMP [19] aims to find a local solution to the sparse coding problem by iteratively
selecting the atom in $D$ most correlated with the current residual, i.e. for the $j$-th
iteration:

$$\lambda_j = \underset{i \in \Omega}{\operatorname{argmax}} \; |\langle r_{j-1}, d_i \rangle| \qquad (3)$$

where $r_0 = y$, $\Omega = \{1, 2, \cdots, K\}$, $d_i$ is the $i$-th column of $D$, and $\langle \cdot, \cdot \rangle$ denotes
the inner product operator. The resulting atom is then added to the active set
via $\Lambda_j = \Lambda_{j-1} \cup \lambda_j$.
Lastly, the sparse code is updated as:

xj = argmin ||y − Dx||2 (4)


x∈IRK ,supp(x)⊂Λj

which is solved via OLS. rj is then updated as rj = y − Dxj . Usually, OMP


runs for a fixed number of iterations, L, or until the norm of the residue reaches
a predefined threshold. The sparse code of Eq. (4) would be severely biased in
the presence of outliers, i.e. each dimension in the input space would be equally
weighted as a result of a non–robust estimator.
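For reference, a minimal NumPy sketch of the OMP iteration just described is shown below; it follows Eqs. (3)–(4) with a fixed number of iterations L and an OLS refit restricted to the active set. The function name, toy dictionary, and sizes are our own illustrative choices.

```python
import numpy as np

def omp(D, y, L):
    """Orthogonal Matching Pursuit: greedy atom selection (Eq. (3)) + OLS refit (Eq. (4))."""
    n, K = D.shape
    x = np.zeros(K)
    residual = y.copy()
    active = []
    for _ in range(L):
        # Eq. (3): pick the atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in active:
            active.append(j)
        # Eq. (4): ordinary least squares on the active atoms only
        coeffs, *_ = np.linalg.lstsq(D[:, active], y, rcond=None)
        x[:] = 0.0
        x[active] = coeffs
        residual = y - D @ x
    return x

# toy usage with a random normalized dictionary (illustrative sizes only)
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)
x_true = np.zeros(50); x_true[[3, 17, 41]] = rng.uniform(0.5, 1.0, 3)
y = D @ x_true + 0.01 * rng.standard_normal(20)
print(np.nonzero(omp(D, y, L=3))[0])
```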
Correntropy [9] gauges the non–linear interactions between two random variables,
$X$ and $Y$, via a mapping to a reproducing kernel Hilbert space (RKHS):

$$V_\sigma(X, Y) = E[g_\sigma(X - Y)] \qquad (5)$$

where $g_\sigma$ is the Gaussian kernel $g_\sigma(t) = \exp(-t^2/2\sigma^2)$ and $\sigma$, the kernel band-
width, modulates the norm Correntropy will mimic (also known as CIM or Cor-
rentropy Induced Metric). The metric ranges from the $\ell_0$–pseudonorm for small
$\sigma$, to the $\ell_1$–norm for increasing $\sigma$, and the $\ell_2$–norm (defaulting to the MSE criterion) for large
bandwidths. Hence, a proper choice of $\sigma$ is able to incorporate robustness into
the learning framework.
Correntropy Matching Pursuit (CMP) [20] replaces the MSE criterion of
Eq. (4) by the robust CIM criterion, i.e.:

$$x_j = \underset{x \in \mathrm{IR}^K,\, \mathrm{supp}(x) \subset \Lambda_j}{\operatorname{argmin}} \; L_\sigma(y - Dx) \qquad (6)$$

where $L_\sigma(e) = \frac{1}{n}\sum_{i=1}^{n} \sigma^2\left(1 - g_\sigma(e[i])\right)$ is the simplified version of the CIM sam-
ple estimator. The non–convex nature of the CIM demands alternative opti-
mization techniques. As in [20], Half–Quadratic (HQ) optimization [14] yields
a local minimum of the cost function via iterative minimizations of a convex,
enlarged-parameter cost. The resulting adaptive $\sigma$ hyperparameter, the weight
vector that assesses the nature of the inputs (e.g. outliers vs. inliers), the sparse
code for OMP iteration $j$, and the updated residue are estimated as:

$$\sigma_j^{(t+1)} = \left( \frac{1}{2n} \left\| y - Dx_j^{(t+1)} \right\|_2^2 \right)^{\frac{1}{2}} \qquad (7)$$

$$w_j^{(t+1)}[i] = g_\sigma\!\left( y[i] - \left(Dx_j^{(t)}\right)[i] \right), \quad i = 1, 2, \ldots, n \qquad (8)$$

$$x_j^{(t+1)} = \underset{x \in \mathrm{IR}^K,\, \mathrm{supp}(x) \subset \Lambda_j}{\operatorname{argmin}} \; \left\| W_j^{(t+1)}\,(y - Dx) \right\|_2^2 \qquad (9)$$

$$r_j = W_j (y - Dx_j) \qquad (10)$$

where $t$ is the HQ iteration and $W_j$ is the diagonal matrix version of $w_j$. The
theory behind HQ guarantees convergence of the sequences in question [14], i.e.
$\lim_{t \to \infty} x_j^{(t)} = x_j$ and $\lim_{t \to \infty} w_j^{(t)} = w_j$. Equation (9) is solved via classic
OLS; hence the whole approach can also be framed as a weighted least squares
problem. In effect, CMP weighs the inputs according to a Gaussian kernel—it
emphasizes components from the underlying model family (linear in this case)
and diminishes the influence of outliers. The result is a robust sparse code.
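The HQ loop of Eqs. (7)–(10) can be sketched as a weighted least squares refinement wrapped around the OLS step of OMP. The snippet below only illustrates that structure for a fixed active set, reusing the conventions of the previous sketch; the iteration count, the small guard on sigma, and the exact weighting scheme are our own simplifications rather than the routine of [20].

```python
import numpy as np

def cmp_refit(D, y, active, n_hq=10):
    """Half-Quadratic refinement on a fixed active set: adaptive sigma, Gaussian
    weights, then a weighted least squares step (cf. Eqs. (7)-(10))."""
    Da = D[:, active]
    n = len(y)
    x_a, *_ = np.linalg.lstsq(Da, y, rcond=None)          # start from the plain OLS solution
    w = np.ones(n)
    for _ in range(n_hq):
        e = y - Da @ x_a
        sigma = max(np.sqrt(np.sum(e ** 2) / (2 * n)), 1e-12)   # Eq. (7): adaptive bandwidth
        w = np.exp(-e ** 2 / (2 * sigma ** 2))             # Eq. (8): per-sample Gaussian weights
        sw = np.sqrt(w)                                    # weighted least squares refit
        x_a, *_ = np.linalg.lstsq(sw[:, None] * Da, sw * y, rcond=None)
    residual = w * (y - Da @ x_a)                          # Eq. (10): weighted residue
    return x_a, residual

# usage inside an OMP-style loop: after selecting `active`, replace the plain OLS refit
# with cmp_refit(D, y, active) so that outlier samples are downweighted.
```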

2.2 Correntropy–Based Dictionary Update


K–SVD [1] is a data–driven dictionary learning technique that exploits block coor-
dinate descent to obtain a stationary point of Eq. (2). In practice, K–SVD alter-
nates between sparse coding and dictionary element updates. The latter sub-
routine assumes both $X$ and $K-1$ columns of $D$ are fixed; then, the atom in
question, $d_k$, alongside its support, i.e. $x_T^k$ (the $k$-th row of $X$), are updated as:

$$\|Y - DX\|_F^2 = \left\| Y - \sum_{j=1}^{K} d_j x_T^j \right\|_F^2
= \left\| \Big( Y - \sum_{j \ne k} d_j x_T^j \Big) - d_k x_T^k \right\|_F^2
= \|E_k - d_k x_T^k\|_F^2 \qquad (11)$$

where $E_k$ is the error when the $k$-th atom is removed. The updated vector is
estimated via the SVD of $E_k^R \in \mathrm{IR}^{n \times m}$, which is a restricted version of the error
matrix that only preserves the columns of $E_k$ currently active for $d_k$.

SVD is optimal only under the MSE criterion. Correntropy K–SVD or CK–
SVD [10] replaces the SVD routines by robust alternatives that exploit the prin-
ciple of Maximum Correntropy Criterion (MCC) [9]. Specifically, let $e_i$ be the
$i$–th column of $E_k^R$ and $v_i$ its low dimensional representation linearly mapped
via the orthonormal projection matrix $U$. The goal is to maximize a novel cost
function $J(U)$ that mitigates the effect of outliers during said projection:

$$J(U) = \sum_{i=1}^{m} g_\sigma(e_i - U v_i) \qquad (12)$$

As proposed by He et al. [8], HQ optimization is exploited to enlarge the param-
eter space and admit an iterative scheme that guarantees convergence to a local
maximum. The adaptive $\sigma$ hyperparameter, the weight vector that determines
the influence of $\{e_i\}_{i=1}^{m}$, and the projection matrix are equal to:

$$\sigma_k^{(t)} = 1.06 \times \min\!\left( \sigma_E, \frac{R}{1.34} \right) \times m^{-1/5} \qquad (13)$$

$$p_k^{(t+1)}[i] = -g\!\left( e_i^T e_i - e_i^T U^{(t)} (U^{(t)})^T e_i \right) \qquad (14)$$

$$U_k^{(t+1)} = \underset{U}{\operatorname{argmax}} \; \mathrm{Tr}\!\left( U^T E_k^R\, P_k^{(t+1)}\, (E_k^R)^T U \right) \qquad (15)$$

where $t$ is the HQ iteration and $P_k^{(t)}$ is the diagonal matrix version of $p_k^{(t)}$. Partic-
ularly, Eq. (13) uses Silverman's rule [17] to estimate the kernel bandwidth adap-
tively, where $\sigma_E$ is the standard deviation of the sequence $\|e_i - U^{(t)} (U^{(t)})^T e_i\|_2$
and $R$ is its interquartile range. Equation (15) is solved via classic SVD solvers,
where the updated atom is the eigenvector corresponding to the largest eigen-
value. In short, CK–SVD is a weighted PCA implementation that downplays the
influence of outliers during the dictionary update stage of K–SVD.
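A sketch of this robust atom update is given below: it iterates the spirit of Eqs. (13)–(15) by weighting the restricted error columns with a Gaussian kernel of their reconstruction error and taking the leading eigenvector of the weighted scatter matrix. It is only an outline of the weighted-PCA idea under our own naming, a fixed iteration count, and a simplified weighting, not the exact routine of [10].

```python
import numpy as np

def robust_atom_update(E_R, n_hq=5):
    """Correntropy-weighted update of one atom from the restricted error matrix E_R (n x m)."""
    n, m = E_R.shape
    U = np.linalg.svd(E_R, full_matrices=False)[0][:, :1]      # start from the plain SVD atom
    for _ in range(n_hq):
        resid = np.linalg.norm(E_R - U @ (U.T @ E_R), axis=0)  # ||e_i - U U^T e_i||_2 per column
        sigma_e = resid.std()
        iqr = np.subtract(*np.percentile(resid, [75, 25]))
        sigma = max(1.06 * min(sigma_e, iqr / 1.34) * m ** (-1 / 5), 1e-12)  # Eq. (13)
        p = np.exp(-resid ** 2 / (2 * sigma ** 2))             # Gaussian weights on the error
                                                               # (cf. Eq. (14), HQ sign dropped)
        M = E_R @ np.diag(p) @ E_R.T                           # Eq. (15): weighted scatter matrix
        _, vecs = np.linalg.eigh(M)
        U = vecs[:, -1:]                                       # eigenvector of the largest eigenvalue
    return U[:, 0]                                             # updated (unit-norm) atom d_k
```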

2.3 Correntropy–Based Dictionary Learning

As an algorithm based on block coordinate descent, K–SVD relies on effective


sparse coders and SVD solvers to work iteratively to find a local solution. Yet,
if any of the two subroutines yields biased estimates (due to outliers or non–
Gaussian environments), it will directly affect the subsequent stage and lead to
overall biased sparse codes and dictionary. Hence, it would be advantageous to
incorporate robustness into both stages by leveraging the properties of Corren-
tropy.
We propose a combined fully Correntropy–based sparse modeling frame-
work. CDL or Correntropy–based Dictionary Learning alternates between robust
sparse coding (CMP) and Correntropy–based Dictionary Update until conver-
gence. On the one hand CMP downplays the influence of outliers (under a linear
regime) in the observation vectors Y, while on the other hand Correntropy–
based Dictionary Update routines mitigate the effect of outliers (under MSE)

in the estimated dictionary D. Thus, CDL is able to deal with both types of
outliers in a principled robust manner without any extra hyperparameters.
For completeness, we also propose a variant of K–SVD that uses CMP and
MSE–based dictionary update: CMPDL, and reintroduce C–KSVD [10], a com-
bination between OMP and Correntropy–based Dictionary Update. In this way,
it is possible to assess which K–SVD stage is more sensitive to outliers and
non–Gaussian scenarios.

3 Results

The first set of results focuses on robust sparse modeling with access to ground
truth. The dictionary, $D \in \mathrm{IR}^{20 \times 50}$, is generated by sampling a zero–mean uni-
form distribution with support $[-1, 1]$. Each column is normalized to a unit $\ell_2$–
norm. Sparse codes are generated from a uniform random variable with support
$[0, 1]$ ($T_0 = 3$).
Then, 1500 20–dimensional samples are produced via linear combinations
between sparse codes and dictionary. These samples are then affected by noise.
We compare the performance of K–SVD, CMPDL, CK–SVD, and CDL by com-
puting the inner product between atoms from estimated and generating dictio-
naries. Each dictionary learning technique alternates 40 times between sparse
coding and dictionary update. The expected sparsity support is set to L = 3.
The first type of noise is additive Gaussian. Its SNR is varied from 0 to
20 dB. Table 1 details the performance of each algorithm as the average of 50
independent runs for each SNR case (upper rows of each cell). The table also
summarizes the same metrics under additive Laplacian noise (lower rows of each
cell). It is evident that the robust variants outperform K–SVD, with CDL being
superior for most cases. The experiments with low SNR emphasize the fact that
Correntropy–based variants are able to properly handle errors with long tail
distributions.
The third type of noise is non–homogeneous in the form of missing entries in
the observation vectors. In particular, a percentage of the components from each
observation vector is set to zero. This rate is varied from 0 to 50%. Figure 1 sum-
marizes the average of 50 independent runs for each sparse modeling technique.
The Correntropy–based variants consistently outperform K–SVD while CDL is
superior in aggressive noise environments. The degradation of CDL under no
missing pixels is worth investigating as further work.
The next set of results deals with image denoising. The approach proposed in
[7] is exploited here. Essentially, the denoising mechanism invokes sparse model-
ing over local patches of the noisy image. Each patch is sparsely encoded with a
constraint on the residue norm equal to 1.15σ, where σ is the standard deviation
of homogeneous additive Gaussian noise. Then, local averaging over overlapping
patches and global weighted averaging with the noisy image renders the esti-
mated denoised example (Lagrange multiplier, λ, is set equal to 30 according to
[7]).

Table 1. Average inner product between estimated and ground truth atoms under
additive Gaussian (upper rows of cells) and Laplacian (lower rows of cells) noise. Best
results are marked bold.

SNR Algorithm
(dB) K–SVD CMPDL CK–SVD CDL
0 0.67 0.71 0.68 0.72
0.67 0.71 0.69 0.75
5 0.87 0.91 0.91 0.94
0.86 0.91 0.92 0.95
10 0.98 0.98 0.99 0.99
0.98 0.97 0.99 0.99
15 0.98 0.99 0.99 0.99
0.98 0.98 0.99 0.99

[Plot: average inner product (y-axis, 0.5–1) vs. percentage of missing entries (x-axis, 0–50) for K-SVD, CMPDL, CK-SVD, and CDL.]

Fig. 1. Average inner product between estimated and ground truth atoms under miss-
ing entries type of noise.

For the current work, we choose σ = 20. Our framework is tested under a
non–linear transformation in the form of impulsive noise, i.e. a percentage of
affected pixels will be saturated to either 0 or 255 according to additive Gaus-
sian noise with high power (σimp = 100 for this case). The rate of affected
pixels is varied from 0 to 40%. In short, the original image goes first through
an additive linear transformation with σ = 20 and then through a non–linear,
inhomogeneous transform with σimp = 100. All possible overlapping vectorized
8 × 8 pixel patches constitute the observations of the sparse model. The initial
dictionary D ∈ IR64×256 is chosen as the overcomplete Discrete Cosine Trans-
form (DCT) basis. Lastly, as suggested in [7], 10 alternating optimizations of the
block coordinate descent routine are run for each case.
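For readers who want to reproduce this setup, the following sketch extracts all overlapping 8 × 8 patches from a grayscale image and builds an overcomplete 2-D DCT dictionary of size 64 × 256 for initialization. It assumes the image is a NumPy array, uses our own helper names, and is a common construction of the overcomplete DCT basis rather than the exact initialization and averaging weights of [7].

```python
import numpy as np

def extract_patches(img, p=8):
    """All overlapping p x p patches of a 2-D image, vectorized as columns (p*p x n_patches)."""
    H, W = img.shape
    cols = [img[r:r + p, c:c + p].reshape(-1)
            for r in range(H - p + 1) for c in range(W - p + 1)]
    return np.stack(cols, axis=1)

def overcomplete_dct(p=8, k=16):
    """Overcomplete 2-D DCT dictionary of size (p*p) x (k*k), e.g. 64 x 256 for p=8, k=16."""
    D1 = np.zeros((p, k))
    for j in range(k):
        v = np.cos(np.arange(p) * j * np.pi / k)
        if j > 0:
            v -= v.mean()                      # remove the mean of non-DC atoms
        D1[:, j] = v / np.linalg.norm(v)
    return np.kron(D1, D1)                     # separable 2-D basis via Kronecker product

img = np.random.default_rng(0).integers(0, 256, size=(64, 64)).astype(float)  # stand-in image
Y = extract_patches(img)                       # observations for the sparse model
D0 = overcomplete_dct()                        # 64 x 256 initial dictionary
print(Y.shape, D0.shape)
```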
Table 2 details the average PSNR over 5 independent runs for each noise rate
and five different well known gray–scale images of size 512 × 512. In general,
CMPDL and CDL deliver the best performances while CK–SVD remains close
to the K–SVD baseline. In particular, CMPDL and CDL are fairly consistent

for a wide range of affected pixels. CMP seems to be the deciding factor here—
it filters the inlier samples to the subsequent atom update stage, and, hence,
reduces the estimation bias. On the other hand, OMP overrepresents the inputs
and passes noisy examples to the K–SVD or CK–SVD dictionary update rou-
tines. This is confirmed in Table 3 where the average number of coefficients (per
8 × 8 block) in the sparse decompositions are compared. CMP–based variants
clearly render a truly sparse representation, while other flavors overrepresent the
input by encoding residual noise until the constraint on the residue norm is met.
Therefore, Correntropy is more advantageous in the sparse coding subroutine
than in the SVD solver. The details regarding the slight difference in PSNR
between CMPDL and CDL are left as further work. Lastly, Fig. 2 illustrates the
denoising results for a case of 40% rate of outlier pixels.

Table 2. Summary of denoising performance, PSNR (dB), under different impulsive


noise (outliers) rate. Each cell reports four denoising techniques. Top left: K–SVD [7].
Top right: CMPDL. Bottom left: CK–SVD [10]. Bottom right: CDL. Best results are
marked bold.

Outlier % Barbara Boats House Lena Peppers Average


0 30.80 29.33 30.33 29.11 33.17 32.23 32.38 31.53 30.77 29.45 31.49 30.33
29.84 28.73 29.85 28.39 32.43 31.50 31.85 30.85 30.25 28.81 30.84 29.66
10 20.95 24.51 20.76 24.42 20.98 25.21 20.99 24.92 20.86 24.58 20.91 24.73
21.19 24.40 20.94 24.29 21.31 25.21 21.25 24.84 21.18 24.51 21.17 24.65
20 18.66 23.76 18.56 24.06 18.67 24.73 18.69 24.44 18.62 23.98 18.64 24.19
18.55 23.57 18.37 23.78 18.49 24.62 18.48 24.25 18.71 23.81 18.52 24.01
30 16.66 23.74 16.54 24.48 16.61 25.03 16.65 24.76 16.66 24.10 16.62 24.42
16.71 23.09 16.58 23.66 16.63 24.47 16.69 24.18 16.67 23.60 16.66 23.80
40 15.22 24.28 15.09 24.88 15.16 25.99 15.16 25.76 15.18 24.58 15.16 25.09
15.24 22.16 15.12 23.30 15.21 24.30 15.19 24.20 15.22 22.88 15.20 23.37

Table 3. Grand average number of coefficients in sparse decompositions after block–


based image denoising. Sparsest solutions are marked bold.

Outlier Algorithm
% K–SVD CMPDL CK–SVD CDL
0 0.85 0.45 1.17 0.46
10 3.72 0.96 3.96 0.96
20 7.98 1.00 8.11 1.00
30 11.12 1.04 13.47 1.05
40 13.07 1.19 17.99 1.32

[Panels: Original; Noisy, PSNR=13.64; K–SVD, PSNR=15.16; CMPDL, PSNR=25.74; CK–SVD, PSNR=15.2; CDL, PSNR=24.19.]

Fig. 2. Example of the denoising results for the image “Lena”. 40% of pixels are affected
by impulsive noise.

4 Conclusion
We proposed a robust sparse modeling framework where Correntropy is exploited
to reformulate the cost functions of both sparse coding and dictionary update
stages of K–SVD. Experiments with synthetic data and image denoising con-
firm the robustness of the estimators and their potential in applications prone
to outliers where sparsity is advantageous. In particular, Correntropy seemed to
be more decisive when used in the sparse coding subroutine of K–SVD. Further
work will involve in–depth analysis of the heuristics utilized to select the ker-
nel widths and their connection to robust linear modeling [3]. In addition, the
denoising mechanism proposed in [7] states empirical optimal hyperparameters
for MSE–based cases. We believe different sets of hyperparameters might yield
distinct stationary points for Correntropy–based optimizations that are worth
investigating.

References
1. Aharon, M., Elad, M., Bruckstein, A., et al.: K-SVD: an algorithm for design-
ing overcomplete dictionaries for sparse representation. IEEE Trans. Sig. Process.
54(11), 4311 (2006)
2. Akaike, H.: Information theory and an extension of the maximum likelihood prin-
ciple. In: Selected Papers of Hirotugu Akaike, pp. 199–213. Springer (1998)
3. Andersen, R.: Modern Methods for Robust Regression, vol. 152. Sage (2008)
4. Barron, A., Rissanen, J., Yu, B.: The minimum description length principle in
coding and modeling. IEEE Trans. Inf. Theor. 44(6), 2743–2760 (1998)

5. Candès, E.J., Romberg, J., Tao, T.: Robust uncertainty principles: exact signal
reconstruction from highly incomplete frequency information. IEEE Trans. Inf.
Theor. 52(2), 489–509 (2006)
6. Donoho, D.L., Johnstone, J.M.: Ideal spatial adaptation by wavelet shrinkage.
Biometrika 81(3), 425–455 (1994)
7. Elad, M., Aharon, M.: Image denoising via sparse and redundant representations
over learned dictionaries. IEEE Trans. Image Process. 15(12), 3736–3745 (2006)
8. He, R., Hu, B.G., Zheng, W.S., Kong, X.W.: Robust principal component analysis
based on maximum correntropy criterion. IEEE Trans. Image Process. 20(6), 1485–
1494 (2011)
9. Liu, W., Pokharel, P.P., Príncipe, J.C.: Correntropy: properties and applications
in non-Gaussian signal processing. IEEE Trans. Sig. Process. 55(11), 5286–5298
(2007)
10. Loza, C.A., Principe, J.C.: A robust maximum correntropy criterion for dictionary
learning. In: 2016 IEEE 26th International Workshop on Machine Learning for
Signal Processing (MLSP), pp. 1–6. IEEE (2016)
11. Mairal, J., Bach, F., Ponce, J., Sapiro, G., Zisserman, A.: Non-local sparse models
for image restoration. In: 2009 IEEE 12th International Conference on Computer
Vision, pp. 2272–2279. IEEE (2009)
12. Mairal, J., Elad, M., Sapiro, G.: Sparse representation for color image restoration.
IEEE Trans. Image Process. 17(1), 53–69 (2008)
13. Mallat, S., Zhang, Z.: Matching pursuit with time-frequency dictionaries. Techni-
cal report, Courant Institute of Mathematical Sciences, New York, United States
(1993)
14. Nikolova, M., Ng, M.K.: Analysis of half-quadratic minimization methods for signal
and image recovery. SIAM J. Sci. Comput. 27(3), 937–966 (2005)
15. Olshausen, B.A., Field, D.J.: Emergence of simple-cell receptive field properties by
learning a sparse code for natural images. Nature 381(6583), 607 (1996)
16. Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal
algorithms. Physica D: Nonlinear Phenom. 60(1–4), 259–268 (1992)
17. Silverman, B.W.: Density Estimation for Statistics and Data Analysis. Routledge
(2018)
18. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. Roy. Stat. Soc.
Series B (Methodol.) 58(1), 267–288 (1996)
19. Tropp, J.A., Gilbert, A.C.: Signal recovery from random measurements via orthog-
onal matching pursuit. IEEE Trans. Inf. Theor. 53(12), 4655–4666 (2007)
20. Wang, Y., Tang, Y.Y., Li, L.: Correntropy matching pursuit with application to
robust digit and face recognition. IEEE Trans. Cybern. 47(6), 1354–1366 (2017)
Labeling Activities Acquired
by a Low-Accuracy EEG Device

Ákos Rudas(B) and Sándor Laki

ELTE Eötvös Loránd University, Budapest, Hungary


{ruasaai,lakis}@inf.elte.hu

Abstract. Analyzing EEG signals can help us make inferences about
the user's activities or even thoughts, which can result in a myriad of
applications. However, clinical EEG monitoring tools are expensive, often
immobile and in need of professional supervision. Lately a couple of
companies started the production of relatively cheap, easy-to-use, and
mobile devices with significantly lower accuracy. In this paper, we intend
to investigate the usability of these devices in recognizing selected basic
activities e.g., winking, raising a hand etc., showing preliminary results
on how clustering can prove to be an efficient method in labeling a low-
quality EEG data set so that it could be used in supervised learning
scenarios.

Keywords: EEG · Clustering · Emotiv Epoc+

1 Introduction
Devices which measure and record biometric data have been used in various
fields for some time: fingerprint scanners in security systems, polygraphs for lie
detection, or thermometers in simple medical examinations. Instruments of high
precision and accuracy are very expensive and not quite accessible to the pub-
lic; however, in recent years several new devices have appeared on the market
which make it possible for virtually everyone to perform these measurements
(e.g. leap count, EEG, heart rate monitoring, EKG) with a measurement pre-
cision far lower than their clinical counterparts. Dealing with this low-accuracy
data poses a great challenge for developers who aim at creating everyday appli-
cations. In these cases one of the most significant factors is the applied signal-
and data processing model. Preprocessing the data is crucial: the signal needs to
be quantified according to how it is intended to be used - windowing is appro-
priate in our case - and, as these signals tend to be very noisy, the task of finding
and filtering out unwanted artifacts is essential for further processing. Then the
preprocessed signal can be inspected to decide which features can describe the
data in a meaningful way. If proper features have been found, it opens the way
for supervised and unsupervised machine learning methods to be utilized. Such
a processing pipeline can be efficiently used for a range of purposes, from ana-
lyzing historical EEG data to real-time classification of user-actions. It is worth

mentioning that deep learning techniques are widely used when working with
EEG data; however, it is usually very hard, if not impossible, to interpret those
results due to the black-box nature of those neural networks.
Since in the future our intention is to test supervised machine learning algo-
rithms on our EEG data set, in this paper our goal is to find a way to label
our data set as accurately as possible. To that end we take advantage of two
approaches: a statistical analysis based on signal amplitudes and one based on
clustering. The latter might shed some light on how EEG signals from dif-
ferent activities or people can be distinguished from one another. We also end
up with at least a partial answer to whether a low-quality commercial EEG
device, specifically the Emotiv Epoc+, is capable of capturing biological signals
which researchers can consider meaningful. Two data sets are subjected to our
research: one containing fewer activities from one user, the second containing more
activities from multiple users.
The remaining part of the paper is organized as follows: In Sect. 2 we review
literature which pursued somewhat similar goals however they placed accent on
different aspects. A detailed description of data collection follows in Sect. 3. In
Sect. 4 the steps of the method are explained, namely channel selection from
the raw data, feature extraction of those time frames and semi-labeling feature
vectors. Description of hyper parameters of the clustering and performance eval-
uation is described in Sect. 5. Section 6 concludes the paper.

2 Related Work

Electroencephalography is a method for monitoring electrical potential
changes on the human (as a matter of fact, originally animal) scalp. Those
changes are caused by the firing of pyramidal neurons in the outer layers of the
brain. From the resulting signals certain inferences can be made concerning the
root cause of those firings. This paper does not intend to go into further details
regarding EEG; readers who would like to dive deeper may read works like [1] and [2].
There is a rich literature on applying Deep Learning techniques on EEG
time series. A great survey of 156 papers is available by Roy et al. [3]. Since
a lot of those experiments show relatively good results when it comes to EEG
classification it would seem natural to continue that track of work. However
those techniques and their results suffer from a great downside - at least for our
purposes - which is the lack of explainability. Deep learning acts like a black
box; it learns important features of the data set and uses those for classification,
but it is very hard to interpret those results or to find out what features it had
actually derived. Therefore in this work we will not be discussing the possibilities
of DL, as instead we will be looking for intuitively derivable descriptors of our
data.
Benítez et al. [4] have shown what we have also verified: namely, that the
low cost Emotiv Epoc+ device can produce utilizable measurements despite its
low accuracy and high sensitivity to noise. They developed a BCI and tested it
in real life scenarios. They managed to transmit commands to an Arduino so it

could turn on and off a light bulb with only winking, triggering a servomotor,
etc.
However we must remark that, strictly speaking, these are not EEG
signals, but in fact EMG signals; the waves indicate muscular activities
of the face and scalp instead of electric potential changes caused by firing of
neurons in the brain. It is a very good start though: first identifying EMG
signals thus giving more information to distinguish between EMG and EEG
signals. Specifically winking detection has been dealt with in [11] with a much
more sophisticated device. Classification was achieved with a back-propagation
and a cascade-propagation feed-forward neural network.
Vijayendra et al. [5] used the same device as well and also another model
of the same manufacturer with only 5 sensors. They managed to extract real
EEG signals - according to their paper, the subjects did not really move their
arms, only imagined doing so - from the recordings with the utilization of the
Discrete Wavelet Transform technique. For classification they used an artificial
neural network. They managed to get a 98% real-time classification rate. They
even managed to control a UAV by generating the appropriate EEG patterns.
In order to be able to remove EMG artifacts from the intended-EEG signals,
in this paper we intend to identify those EMG signals in time-series recorded by
the Emotiv Epoc+ device.

3 Data Collection
Before a clinical examination in a hospital the electrodes are secured to the
inside of a rubber cap. Its flexibility then ensures the electrodes to be in optimal
touch with the scalp. The placement of the electrodes is standardized according
to the 10–20 [6] international system (the numbers 10 and 20 correspond to the
distance between electrodes in terms of scalp diameter ratio). The touching side
of the electrodes are also being covered with a gel which improves sensitivity by
enhancing conductivity.
In contrast to clinical grade devices, data collection with the Emotiv Epoc+
device is a tricky and more difficult task. Electrodes on the Emotiv Epoc+ are
connected to the base of the instrument by plastic arms. This reduces flexibility
and portability, so as a result there are cases when one or more electrodes do
not stand a chance of even touching the skin due to the varying skull shapes of
humans. There is also the question of whether this configuration sufficiently
preserves the 10–20 placement; however, it is not further discussed in this work. The
sensing side of the Epoc's electrodes is covered with a foam-like material, which is
meant to be moistened with a few drops of some saline agent (e.g. contact
lens solution) for the same reason as the gel in the case of clinical devices.
After placing the device on the user’s head with the victorious feeling of
having managed to get every curl of hair out of the way of the electrodes, the
software with which the device had come indicates how well we managed to place
the electrodes. The recording can be started and stopped manually, requiring
human interaction. The problem with this approach is that signals of moving

the hand, the finger, or thinking about hitting a button, not to mention aiming,
necessarily add noise (at least in that sense) to the recorded signals. Therefore
recording signals needs two users. To aid future visual analysis of the signals
we recorded activities with distinctive borders; about 2 s of staying idle between
each activity. These activities consisted of winking with the right eye, winking
with the left eye, blinking, and making a pop sound with the mouth (to be
precise, as Donkey made in Shrek). The reason for the choice of the latter was
to record something more complex in terms of the number of muscles used to
perform it.
In the first stage of experimentation the above mentioned activities were
recorded from one user, each activity 10 times. The structure of a recording
followed the 5 s long scheme | − −|a| − −| with a being the activity. That leaves
us with 50 measurements.
In the second stage we involved four additional users and recorded a couple
of more activities, that is: raising right hand, raising left hand, raising right foot,
raising left foot. All of these were recorded both sitting and standing, 10 times
each. The resulting data consists of 12 activities recorded from 5 people, 10 times
each. The resulting recordings had the structure | − −|a| − −|a| − −|...| − −|, so
2 s of being idle, 1 s for the activity, 2 s being idle again, etc. All files contain 10
of the same activity records from 1 user, which is about 32 s.
We also note that data is stored in EDF files, which is a format usually used
to store multi-channel biological and physical signals. Another important
note: after taking a closer look at the recorded files it turned out that some of
them were practically unusable due to the amount of noise. Therefore we filtered
out these measurements and worked without them in this study.1

4 Methodology

To determine how efficient our device is in recording accurately classifiable data,


we need to apply a supervised machine learning algorithm, train a classifier
engine and then test it. In order to do that, we need a sufficiently large data
set consisting of time frames of EEG signals with an appropriate transformation
applied to it, and we need that data labeled.
In general such signal processing pipelines follow the pattern of acquiring
the raw signal, preprocessing it, then with respect to the specific goals either
windowing the time series based on some strategy or not, which is followed by a
transformation or feature extraction step to acquire the data set to work with.
Again, according to the nature of the project the data set optionally receives
labels for the individual entities of interest.
After the data collection we have 14-channel EEG time series. First we deter-
mine a time interval which is wide enough to contain any of the observed activ-
ities, like winking or raising a hand. We get the time frames by sliding this

[Footnote 1: Our data set will be made publicly available after publication so anyone can replicate the same algorithm or try new ones.]

window through the signals. These time frames are basically numerical matri-
ces, so a variety of feature extraction methods can be applied. The method must be chosen
very carefully as the intended results are vectors such that corresponding time
frames that belong to the same activity should be closer to each other in some
metric than to time frames from other activities.

4.1 Channel Selection


When it comes to multi-channel biological data it is important to distinguish
between relevant and irrelevant channels. From a biological point of view we
could reason that, based on their location on the skull, certain electrodes have a higher
chance of capturing the desired activities due to the underlying region of the brain
[7]. On the other hand, our device is probably incapable of capturing signals that
sophisticated, so a more mathematically inspired approach was taken: Princi-
pal Component Analysis has been performed to find the channels with the highest
information content. The number of components and corresponding eigenvalues
considered was chosen so as to keep at least 99% of the information.
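A minimal sketch of this kind of PCA-based channel screening is shown below, using scikit-learn. The 99% variance threshold follows the text, while the way the retained components are mapped back to channels (by the magnitude of their loadings) and the number of channels kept are our own illustrative choices.

```python
import numpy as np
from sklearn.decomposition import PCA

def select_channels(X, var_kept=0.99, top=8):
    """X: samples x channels raw EEG. Keep enough components for 99% of the variance,
    then rank channels by their total absolute loading on those components."""
    pca = PCA(n_components=var_kept)                  # scikit-learn picks the component count
    pca.fit(X)
    loadings = np.abs(pca.components_).sum(axis=0)    # one score per channel
    return np.argsort(loadings)[::-1][:top]

# usage: X has shape (n_samples, 14) for the Epoc+; returns indices of the retained channels
X = np.random.randn(5000, 14)                         # stand-in for a real recording
print(select_channels(X))
```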

4.2 Feature Extraction from EEG Signals


We have at most 14 channels of EEG time series in each file. Considering the
128 Hz sampling rate of the Epoc+ and the 5 s long recordings from the first record-
ing session, we have 640 values per channel. A possible first step is to smooth
the data, as raw EEG tends to be very spiky. We can use a Golay filter for this
purpose with third-degree polynomials [9]. Another option is the above men-
tioned frequency filtering. Since different frequency bands contain different, very
specific information regarding brain activities [10], it seems like an intuitively
good approach. Beta waves tend to be more dominant in states of light to
moderate physical activity, and gamma waves are accentuated during more
intense physical activity. One parameter of our feature extraction method is
therefore the option of filtering for these two frequency bands.
After closer inspection of the data it turns out that a 50-value-wide window
will contain an activity. Nevertheless, a window size of 64 was chosen so that a
later FFT would run more efficiently due to the window size being a power of 2. If
we slide the window one value at a time, it gives us a very high number of
time frames. In order to reduce the size of the problem space we have the option
of keeping only the first and then every e.g. tenth window starting from that
(with this small size it is not crucial but will significantly affect efficiency with
much larger data sets). Considering all 14 channels, what we get are 14*64
matrices per time frame. Since different muscles and different EEG signals in
different frequency bands correspond to different activities, it might be more
helpful to inspect the frequency domain, so we take the two-dimensional Fourier
transform of every time frame. Then for every time frame we consider the 14*64
matrix, and take the first 64 values from the first channel, then the 64 values
from the second channel, and so on, until we get an 896-value-long feature
vector. We call this the long feature. That gives us 896 values per row in the

files for each 5 s long measurement. With a sampling rate of 128 Hz the files
would have 577 rows (or 57 after downsampling). The feature vector $f_i \in \mathbb{R}^{(n_{ch} \cdot ws)}$,

$$f_i = \left[\, f_{w_i 1} \;\; f_{w_i 2} \;\; \ldots \;\; f_{w_i n_{ch}} \,\right] \qquad (1)$$

is formed from the matrix of Fourier coefficient magnitudes

$$|F_{w_i}| = \begin{bmatrix} f_{w_i 1} \\ f_{w_i 2} \\ \vdots \\ f_{w_i n_{ch}} \end{bmatrix} \in \mathbb{R}^{n_{ch} \times ws} \qquad (2)$$

where $w_i \in \mathbb{R}^{n_{ch} \times ws}$ and our data set of windows is $D = [w_0, w_1, \ldots, w_{n_w}]$, with
$n_w$ being the number of windows (time frames), $ws$ the window size (64
in our case), and $n_{ch}$ the number of channels (14 in our case).
In order to extract features which store information about how neighboring
time frames change compared to each other, we consider the feature tripleLong.
For each time frame we put together the long features from the preceding, the
actual, and the succeeding frame. Considering files from the smaller data set with
usable signals we get about 200 feature vectors with 2688 components in each
one. Since the bigger data set contains files of 32 s and 10 activities per activity
type per user, without a couple of files which contained too noisy signals we end
up with about 2400 feature vectors with the same number of components.
The problem we face at this point is the size of the problem space, namely
working with 896- and 2688-dimensional vectors, which greatly hampers scalability.
To reduce it without losing too much information we perform a Principal
Component Analysis. That way we can rearrange the vectors such that the
more information a component carries, the lower its dimension index would
be. After this rearrangement we can make a cut such that we still keep 99%
of the variance in the data.
Most clustering algorithms use some kind of distance definition, and since
we need to calculate those distances a lot of times, and the calculation time
greatly depends on the size of those vectors, reducing it is essential in acquiring
computation times that make it possible to work with our data.
Since we were dealing with usually 6–10 dimensional vectors after the PCA,
using Euclidean distance seemed an obvious choice.
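The windowing and long-feature construction described above can be summarized in a short sketch. The 64-sample window, the stride of 10, the FFT magnitudes, and the 99% PCA cut follow the text, while the array layout and function names are our own.

```python
import numpy as np
from sklearn.decomposition import PCA

def long_features(eeg, ws=64, stride=10):
    """eeg: channels x samples array. Returns one 'long' feature vector per window:
    the FFT magnitudes of each channel concatenated into a single n_ch*ws vector."""
    n_ch, n = eeg.shape
    feats = []
    for start in range(0, n - ws + 1, stride):
        w = eeg[:, start:start + ws]                 # n_ch x ws time frame
        F = np.abs(np.fft.fft(w, axis=1))            # per-channel Fourier magnitudes
        feats.append(F.reshape(-1))                  # concatenate channel rows -> n_ch*ws values
    return np.array(feats)

# usage: 14 channels, 128 Hz, 5 s recording -> 640 samples; then reduce with PCA (99% variance)
eeg = np.random.randn(14, 640)                       # stand-in for one recording
X = long_features(eeg)                               # windows x 896
X_red = PCA(n_components=0.99).fit_transform(X)      # typically only a handful of dimensions
print(X.shape, X_red.shape)
```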

4.3 Semi-labeling

In order to acquire a data set ready to be fed to a supervised learning algorithm


we need to assign labels to the time frames. We can do it manually, but even
in the case of only one user and four activities it means labeling hundreds of
windows one-by-one. Obviously it is unfeasible, so the next step is to find an
informative feature and a clustering method compatible with this feature. To
ensure its success we need to have a class number (activity number)
assigned to all of the time frames.

Fig. 1. Time-windows visualized using 3 principal components as coordinates after


PCA of data set with long feature.

Since for each frame we know its file of origin and for each file we know
which activity it belongs to, we can separate the time frames only by assigning
the labels idle and some activity, and then linking the real labels to the some
activity windows according to the files they come from.
When we are left only with windows containing activities we can find a
feature, apply a clustering method, create a confusion matrix, and then link the
clusters to activities according to the matrix.
As for the first step, distinguishing between idle and some activity windows
is not an easy task due to the amount of noise our device suffers from. We acquired
fair results - in the sense that they correlate well with our expectations based on visual
observation - using a very simple statistical model (Fig. 2). Based on
the single multidimensional points in the time series we determined a threshold
with respect to the standard deviation of the amplitudes. Above this threshold these
single points could be labeled, so that after the time frames had been created
they could be labeled as well based on the point labels and their occurrences in that
time frame.
A possible augmentation step before thresholding is to take the Fourier
transform of the windows, filter for the desired frequency bands (beta and gamma
in our case, 16–31 Hz and greater than 32 Hz, respectively [8]), transform
it back to the time domain, and then continue with this filtered signal.
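A sketch of this amplitude-based semi-labeling step is shown below. The threshold of a few standard deviations and the per-window voting rule are illustrative choices consistent with the description, not exact values from the paper.

```python
import numpy as np

def label_windows_by_amplitude(eeg, ws=64, stride=10, k_std=3.0, min_hits=5):
    """Mark each window as 'activity' (True) or 'idle' (False) by counting samples
    whose aggregated amplitude deviates more than k_std standard deviations."""
    agg = np.abs(eeg - eeg.mean(axis=1, keepdims=True)).sum(axis=0)   # aggregate over channels
    thresh = agg.mean() + k_std * agg.std()
    point_labels = agg > thresh                                       # per-sample labels
    labels = []
    for start in range(0, eeg.shape[1] - ws + 1, stride):
        labels.append(point_labels[start:start + ws].sum() >= min_hits)
    return np.array(labels)

# usage together with the long_features() windows: same ws and stride, so indices line up
eeg = np.random.randn(14, 640)
is_activity = label_windows_by_amplitude(eeg)
print(is_activity.sum(), "windows flagged as activity out of", len(is_activity))
```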

5 Clustering Time Frames


The point of clustering our EEG time frames is to derive an automated labeling
method. If our data set turns out to be self-organizing in some feature-space,
labeling a new time frame could be as easy as finding which cluster centroid
it is closest to. This approach assumes similarity between windows of the same
activities and dissimilarities between windows of different activities (Fig. 1). This
step depends greatly on the extracted features.
K-means clustering receives a set of vectors and a parameter k which stands
for the number of desired clusters. The expected outcome is then a label vector
with the clusters assigned to the data points. Assuming that points from the same
activity should be closer to each other, k-means seemed a convenient choice.

[Plot "Activities in channel-filtered, aggregated EEG": amplitudes of the transformed EEG (y-axis) vs. data points (x-axis); legend: eeg, activity, min, max.]

Fig. 2. Detecting activities by thresholding amplitudes with respect to observed stan-


dard deviation. Data points inside yellow spikes are considered activities.

5.1 Clustering the Smaller Data Set


As a first step we applied the method on the smaller data set with k = 5 (the four
activities plus idle). The best outcome would have been 5 clearly separate groups;
4 according to the recorded files, and the 5th everywhere between. However that
was not the case.
What was clear at first sight is that most of the points ended up in one group. So far this is exactly what could be expected: as activities happen for quite a short duration, and every recording took several seconds, most of the windows contain idle activity. As for the other four activities, the results are quite diverse. Left and right winks could be identified; however, blinks were hard to distinguish from left and right winks, depending on the features used (Fig. 3). The pop sound turned out to be mostly indistinguishable. When using k-means clustering it is a common technique to use a larger k than the desired cluster number. We evaluated clustering accuracy with varying k across multiples of our desired cluster number, 5. Beyond k = 30 the gain in accuracy declines and no longer justifies the extra computational time. Clustering accuracy here is evaluated as the number of well-clustered time windows over the number of all time windows. A time window is considered well classified if it is non-idle and ends up in a cluster which is associated with the label of the file that window belongs to. To determine which cluster is associated with which activity, a confusion matrix is built from which that information can be deduced. Note that the number of clusters may exceed the number of activities; in such cases multiple clusters can be linked to a single activity. Determining labeling accuracy for the thresholding method is not so straightforward; it has been assessed merely by visual observation. Figure 2 shows, however, that it does fairly well in most cases when the signal is clear enough. Based on visual observation we conclude that it yields better results to first distinguish time windows as idle or some activity and then assign labels according to their files of origin. However, that might not be the case with a more sophisticated device, which could provide clearer data and possibly more accurate clustering.
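The cluster-to-activity association and the resulting accuracy measure described above can be sketched as follows (assuming integer-coded activity labels taken from the files of origin; the function names are illustrative):

```python
import numpy as np

def map_clusters_to_activities(cluster_labels, file_labels, n_clusters, n_activities):
    """Confusion matrix (cluster x activity); each cluster is linked to the
    activity it most often contains, so several clusters may share one activity."""
    conf = np.zeros((n_clusters, n_activities), dtype=int)
    for c, a in zip(cluster_labels, file_labels):
        conf[c, a] += 1
    return conf.argmax(axis=1)

def clustering_accuracy(cluster_labels, file_labels, cluster_to_activity, idle_label):
    """Share of non-idle windows whose cluster is linked to the activity
    of the file they come from."""
    hits = total = 0
    for c, a in zip(cluster_labels, file_labels):
        if a == idle_label:
            continue
        total += 1
        hits += int(cluster_to_activity[c] == a)
    return hits / total if total else 0.0
```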

Fig. 3. K-means clustering of data points by long-feature. Separated columns show where files recorded of different activities end and begin. Their order is: left wink, right wink, blink, pop sound, idle (no activity).

5.2 Measuring Clustering Quality

Usually clustering quality is measured by between-cluster separation (silhouette) and in-cluster density (inertia). However, our case is more complicated, as we actually do have a ground truth of time frame affiliations. Therefore, besides the usual measures we also compute a pseudo F-score; namely, for every activity - since we know which files the time frames come from - after the clusters have been assigned to them, we can measure precision and recall. If a frame is labeled with a cluster assigned to the activity the frame is from, it is a true positive. If the label indicates a different activity, it counts as a false negative for that activity, and so on.
The harder task is to verify whether a window labeled non-idle really contains an activity or not, and likewise for idle. Our solution to this problem is to use the previously introduced amplitude-based approach and treat it as ground truth.
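A sketch of this per-activity pseudo F-score, under the same assumptions as above (the file of origin acts as ground truth and the cluster-to-activity mapping yields the predicted label):

```python
def pseudo_f_score(cluster_labels, file_labels, cluster_to_activity, activity):
    """Precision, recall and F1 for one activity, counting a frame as a true
    positive when its cluster maps to the activity of its file of origin."""
    tp = fp = fn = 0
    for c, a in zip(cluster_labels, file_labels):
        predicted = cluster_to_activity[c]
        if predicted == activity and a == activity:
            tp += 1
        elif predicted == activity and a != activity:
            fp += 1
        elif predicted != activity and a == activity:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```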

5.3 Clustering Hyper-parameters

In order to determine which EEG-processing methods and features would provide the best possible clustering results, different parameter configurations have been tested. The first parameter is the feature being used, which can be either long or tripleLong. The second parameter, noIdles, determines whether the idle-labeled windows should be disregarded after labeling in order to reduce the number of classes, thus reducing potential classification (and clustering) error in the following steps. The third parameter, useFreqFilt, sets whether frequency filtering should be applied to the raw EEG data in order to keep only the beta and gamma frequency bands. A fourth parameter, useFFT, affects the feature extraction step with respect to the EEG time series transformation: if set to True, the vectors representing time frames consist of Fourier coefficients, while if set to False the original values are kept when creating the vectors (original meaning optionally smoothed, filtered, centered values, not raw EEG values).
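The grid search over these four hyper-parameters, reported in the next subsection, can be sketched as follows (evaluate_config is a hypothetical hook that would run the full pipeline for one configuration and return its pseudo F-score):

```python
from itertools import product

def grid_search(evaluate_config):
    """Exhaustively try every combination of the four hyper-parameters and
    keep the configuration with the best pseudo F-score."""
    grid = {
        "feature": ["long", "tripleLong"],
        "noIdles": [True, False],
        "useFreqFilt": [True, False],
        "useFFT": [True, False],
    }
    best_score, best_cfg = float("-inf"), None
    for values in product(*grid.values()):
        cfg = dict(zip(grid.keys(), values))
        score = evaluate_config(**cfg)
        if score > best_score:
            best_score, best_cfg = score, cfg
    return best_cfg, best_score
```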

5.4 Clustering Results


A grid search has been performed over the clustering hyper-parameters. It turned out that the most important factor was whether to include the idle windows in the clustering (the pseudo ground truth for this came from the amplitude thresholding). The second most important proved to be useFreqFilt, feature came third, and useFFT turned out to be the least significant parameter; so much so that the latter two gave different results when tested on data coming from different individuals. Therefore we conclude that noIdle and useFreqFilt are globally significant parameters, while feature and useFFT are fine-tunable on the individual scale. For the quality gain based on these parameters, see Table 1.

Table 1. Different F-scores based on different hyper-parameter values per user

         noIdle (T/F)  useFreqFilt (T/F)  feature (long/tripleLong)  useFFT (T/F)
All      0.43/0.15     0.26/0.30          0.28/0.29                  0.29/0.28
User 1   0.40/0.10     0.24/0.28          0.26/0.26                  0.27/0.24
User 2   0.38/0.10     0.22/0.26          0.24/0.24                  0.25/0.23
User 3   0.45/0.16     0.28/0.33          0.31/0.31                  0.31/0.30
User 4   0.46/0.16     0.29/0.34          0.31/0.32                  0.32/0.31
User 5   0.47/0.16     0.29/0.34          0.31/0.32                  0.32/0.30

Fig. 4. Inertia, silhouette, and F-scores for all users with k ranging from 12 to 60. (Panel (a): inertia for all users, on the order of 10^12, against the number of clusters; panel (b): silhouette and F-score for all users against the number of clusters.)

Using the derived parameter values, K-means has been performed with k values ranging from 12 to 60, both on the whole data set and individual-by-individual. Results show that clustering quality on the whole data set is far lower with respect to F-score; the maximum is around 0.4, while individually it reaches between 0.7 and 0.9 (Figs. 4 and 5). The decreasing value of the silhouette is easily explained: as more and more clusters are linked to the same activities, they are expected to become less and less separated in the feature space on average. It is also intuitive that average inertia gets lower as clusters become
smaller and more compact, but the actual quantities are not easily interpreted, as inertia is not a normalized measure. It might nevertheless prove a useful measure in the future if other clustering methods are tested on the data set with the same hyper-parameters.

Fig. 5. Inertia, silhouette, and F-scores for user 5 with k ranging from 12 to 60. (Panel (a): inertia for user 5, on the order of 10^11, against the number of clusters; panel (b): silhouette and F-score for user 5 against the number of clusters.)

6 Conclusion
This paper introduced a hybrid method for labeling feature vectors derived from time frames which were acquired from multi-channel EEG time series. This step is essential for producing a data set which is ready to be processed by a supervised machine learning model. The first step of the method distinguishes between idle time frames and frames containing some activity using a signal-amplitude-based statistical method, while the second step clusters the activity time frames based on the tripleLong feature, using Fourier coefficients but not filtering for specific frequency bands. According to our findings, the optimal value of k in the K-means clustering is around 30–36. We also conclude that the method works better on an individual scale, as results on the data set containing recordings of every user are considerably below the individual results. The F-score of clustering ended up around 0.7–0.9 for individuals. Working with the Emotiv Epoc+ device has been a very educational experience. While it is probably not capable of capturing really sophisticated EEG signals, it still manages to produce signals which correlate with the user's facial and other muscle movements. With an appropriate model in the background, this device can perform well in a wide range of applications which do not aim at exploiting high-quality EEG signals but rely more on reactions of the user, such as facial movements while concentrating, smiling, laughing, and, as shown in this work, winking and blinking. Future work will concentrate on exploring how supervised machine learning algorithms perform with the labeled data set we acquired during our experimentation. It would also be very interesting to see how the resulting EEG analysis pipeline performs on signals recorded by different commercial devices and on publicly available data sets containing clinical-quality EEG records.

References
1. Pizzagalli, D.A., et al.: Electroencephalography and high-density electrophysiolog-
ical source localization. Handb. Psychophysiol. 3, 56–84 (2007)
2. Swartz, B.E.: The advantages of digital over analog recording techniques. Elec-
troencephalogr. Clin. Neurophysiol. 106(2), 113–117 (1998)
3. Roy, Y., Banville, H., Albuquerque, I., Gramfort, A., Falk, T.H., Faubert, J.: Deep
learning-based electroencephalography analysis: a systematic review. J. Neural
Eng. (2019)
4. Benitez, D.S., Toscano, S., Silva, A.: On the use of the Emotiv EPOC neuroheadset
as a low cost alternative for EEG signal acquisition. In: 2016 IEEE Colombian
Conference on Communications and Computing (COLCOM), pp. 1–6. IEEE (2016)
5. Vijayendra, A., Saksena, S.K., Vishwanath, R.M., Omkar, S.N.: A performance
study of 14-channel and 5-channel EEG systems for real-time control of unmanned
aerial vehicles (UAVs). In: 2018 Second IEEE International Conference on Robotic
Computing (IRC), pp. 183–188. IEEE (2018)
6. Jasper, H.A.: The ten-twenty system of the international federation. Electroen-
cephalogr. Clin. Neurophysiol. 10, 371–375 (1958)
7. Marieb, E.N., Hoehn, K.: Human Anatomy & Physiology. Pearson Education, Lon-
don (2007)
8. Deuschl, G., Eisen, A., et al.: Recommendations for the practice of clinical neuro-
physiology: guidelines of the International Federation of Clinical Neurophysiology.
Electroencephalogr. Clin. Neurophysiol. 52, 1–304 (1999)
9. Schafer, R.W., et al.: What is a Savitzky-Golay filter. IEEE Sig. Process. Mag.
28(4), 111–117 (2011)
10. Dawson, G.E., Fischer, K.W.: Human Behavior and the Developing Brain. Guilford
Press, New York (1994)
11. Chambayil, B., Singla, R., Jha, R.: EEG eye blink classification using neural net-
work. In: Proceedings of the World Congress on Engineering, vol. 1, pp. 2–5 (2010)
The 2nd International Workshop on
Business Intelligence and Distributed
Systems (BIDS-2019)
Data Sharing System Integrating Access
Control Based on Smart Contracts for
IoT

Tanzeela Sultana, Abdul Ghaffar, Muhammad Azeem, Zain Abubaker,


Muhammad Usman Gurmani, and Nadeem Javaid(B)

COMSATS University, Islamabad 44000, Pakistan


nadeemjavaidqau@gmail.com
http://www.njavaid.com

Abstract. The development of the Internet of Things (IoT) brings a new concept of the Internet. The dramatic growth of IoT has increased its usage. The IoT network facilitates users in several ways, more specifically in access control and data sharing among IoT devices. However, it has many challenges, such as security risks, data protection and privacy, single point of failure due to centralization, and trust and data integrity issues. This work presents a blockchain based access control and sharing system. The main aim of this work is to overcome the issues of access control and sharing in the IoT network and to achieve authentication and trustworthiness. Blockchain technology is integrated with IoT, which simplifies access control and sharing. Multiple smart contracts, i.e., Access Control Contract (ACC), Register Contract (RC) and Judge Contract (JC), are used to provide efficient access management. Furthermore, a misbehaviour judging method is used together with a penalty mechanism. Additionally, permission levels are set for sharing resources between users. Simulation results show the cost consumption. Bar graphs illustrate the transaction and execution costs of the smart contracts and of the functions of the main contract.

1 Introduction
The development of the Internet leads to the connection of devices. With the growth of communication and networking technologies, devices are more likely to connect to each other. Devices connected to the Internet accelerate the growth of the Internet of Things (IoT) network. The idea of IoT can be summarized as a "network of devices which are connected to each other through the Internet". The main purpose of connecting devices is to share data, information or resources with other devices. The IoT network is integrated with the physical world over the Internet. The growing connection of IoT devices extends the applications of the IoT network to all fields. Applications of the IoT network include vehicular networks, where cars are integrated with entertainment, traffic and navigation systems; home automation (i.e., smart homes); health-care systems (i.e., transfer of health data); supply chain systems (asset tracking, forecasting, vendor relations, connected fleets); security systems (i.e., connected sensors and buzzers); and many others [1]. Because of this wide range of applications,
IoT devices are connected globally. According to a Gartner report, the number of devices connected to the Internet will grow to 2.4 billion by 2020. The connection of devices requires efficient management of the IoT network, as the vast usage of IoT results in growing challenges in the network. Some of the major issues are IoT device management, data confidentiality, authentication and access control, malicious attacks, centralization, etc. [2]. As the IoT network contains sensitive data, solutions are needed for network safety and security. The IoT network must be protected from attacks, unauthorized access to data and inappropriate data sharing [3]. For the security and efficiency of the IoT network, access management and data sharing are considered major aspects of network performance [4].
Several strategies have been proposed to eliminate the issues that the IoT network encounters. The IoT network is also integrated with cloud and fog for efficient utilization of the network by resource constrained devices, and to achieve efficiency, accuracy and speed in IoT data processing. Besides their storage and processing advantages, cloud and fog also bring latency, security and privacy issues [5]. The most challenging tasks of the IoT network are considered to be data sharing and access control, and there must be strategies to manage them [6,7]. To eliminate the challenges in the IoT network, blockchain technology is intended to be an effective solution. Blockchain based solutions are more effective, as they provide data integrity, security, auditing, fairness, authenticity and distribution [8].

1.1 Blockchain
Blockchain is an ingenious technology conceptualized by Satoshi Nakamoto. The idea of blockchain was presented in 2008 in a white paper. Blockchain technology was introduced for the secure transaction of a cryptocurrency, i.e., bitcoin; blockchain is thus considered the underlying technology of bitcoin. Bitcoin is the first cryptocurrency, introduced to eliminate the idea of central administration, and it is also considered an application of blockchain. Blockchain is a decentralized network technology, also called distributed public ledger technology. All transactions done in a blockchain network are recorded in a ledger, which is maintained in the form of blocks. Blocks in a blockchain network are ordered chronologically. The basic structure of blockchain is shown in Fig. 1. Blockchain is a peer-to-peer (P2P) network technology in which all nodes in the network are interconnected. To eliminate centralization, the ledger is distributed and maintained by all nodes.

Fig. 1. Basic blockchain structure
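The chained structure sketched in Fig. 1 can be illustrated with a minimal, purely conceptual example in Python (this is not the system implemented in this work; the field names and sample data are arbitrary):

```python
import hashlib
import json
import time

def make_block(transactions, prev_hash):
    """A block stores its transactions, a timestamp and the hash of the
    previous block; altering any earlier block breaks the chain of hashes."""
    header = {"timestamp": time.time(),
              "transactions": transactions,
              "prev_hash": prev_hash}
    block_hash = hashlib.sha256(
        json.dumps(header, sort_keys=True).encode()).hexdigest()
    return {**header, "hash": block_hash}

genesis = make_block([], prev_hash="0" * 64)
block_1 = make_block([{"from": "A", "to": "B", "data": "access grant"}],
                     prev_hash=genesis["hash"])
```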



Blockchain technology has greater significance than traditional transaction systems and is considered a more efficient and reliable technology. The advantages of blockchain over traditional systems are decentralization, immutability, security, scalability, fault tolerance and its trust-less nature. The demand for blockchain is increasing day by day because of these properties and features, and consequently its applications are growing in almost every field; for example, blockchain is being adopted rapidly in vehicular networks [9]. Other uses of blockchain technology include Artificial Intelligence (AI), economy, transportation, health, identity management, supply chain management and smart contract services [10]. Major features of blockchain that make it distinct from existing systems are smart contracts, consensus mechanisms, cryptography techniques, etc.

1.2 Motivation
A lot of work has been done in the literature for the efficient utilization of IoT devices. Many strategies have been proposed for access management and data sharing in the IoT network using blockchain technology. Some of the works consider access control, while others focus only on sharing. The work in [1] is based on access control management, where smart contracts are used to ensure the trustworthiness of the system. Furthermore, the authors in [2] propose an access control system in order to prevent single point of failure and unauthorized access to the network. For efficient data sharing, multiple strategies have been proposed. A trust based sharing system is proposed in [7]. In this system, data sharing is integrated with access control for authorized access, and permission levels are used for the authorization of access.

1.3 Problem Statement


The dramatic growth of the IoT network results in numerous challenges such as sharing, access control, security, trustworthiness, authentication, malicious attacks, centralization, etc. To manage access control in the IoT network, the authors in [5] propose a blockchain based cross chain framework. The main aim of this system is to provide a decentralized access model which offers security and privacy protection to IoT data; however, user information is not protected in an efficient manner. In [6], an access control management scheme is provided: a blockchain based key management scheme designed for privacy, efficiency, decentralization and scalability. The scheme improves system performance in terms of scalability; however, it fails to fully utilize the blockchain network. The sharing of data and services is a main aspect of the IoT network, and many schemes have been proposed to make data sharing more efficient. The authors in [9] propose a blockchain based service sharing system. The main goal of this scheme is to protect IoT terminals from unauthorized services and to protect lightweight clients from unauthorized service providers. In spite of its effectiveness, this scheme is inefficient for non-cooperative scenarios. Regarding sharing management, the work in [10] is based on data sharing in AI-powered networks. This scheme works on a trust based sharing strategy, and smart contracts are used to provide a secure and trustless
sharing environment. However, the proposed strategy does not work well in all
sharing scenarios.

1.4 Contribution

Taking the aforementioned limitations in the literature into consideration, a system is proposed for access control management and service sharing. The main contributions of this work are as follows:

– a blockchain based access control and data sharing model is proposed,
– multiple smart contracts are used for efficient access management,
– different permission levels are set for one user to access the data of another user or IoT device,
– a misbehaviour strategy is used in this model,
– a penalty is further determined for the user who misbehaves,
– in addition, the gas cost is examined for each smart contract and for some of the functions of the main smart contract.

The remaining sections are organized as follows. Section 2 describes the literature review in detail. Section 3 gives a complete description of the proposed system model and its work flow. Section 4 presents the simulation results and the reasoning behind the graphs. Section 5 provides the conclusion of the work.

2 Related Work

Several studies have been presented in the literature for access control management and data sharing management in blockchain networks.
The authors in [1] investigate the conflicts in access control systems in IoT. To overcome the access control issue, the authors propose a smart contract based access control system. The access control framework consists of multiple smart contracts. The main goal is to achieve trustworthiness and validation of access control. The validation is checked by the behavior of the IoT device user in terms of service requests to other users. The system is evaluated through a case study using hardware and software. The evaluation results show that the system achieves better performance, having a lower access time. However, in this system IoT devices cannot directly interact with the system. Furthermore, the time cost and overhead results do not match real world IoT scenarios.
To further address the challenges of access management in IoT systems, [2] proposes a distributed IoT access management architecture. This work aims to provide a mobile, lightweight, scalable, concurrent, accessible and resilient access control system. The system is compared with state-of-the-art Lightweight Machine to Machine (LwM2M) servers using WSN management hubs. The system outperforms traditional systems in terms of scalability, throughput rate and latency. However, it does not perform well with a single management hub.

Traditional access control schemes suffer from many issues, such as security risks, centralization and access management complexity. To solve these challenges, [3] proposes an attribute based access control system for IoT. A blockchain based decentralized system is proposed to address issues like the single point of failure problem and data tampering. The performance of the system is evaluated using a Proof of Concept (PoC) mechanism, through which the storage and computation overhead of the system is examined. The IoT devices have low computational and communication overhead. The system also achieves flexibility and supports future maintenance and updates. However, only some parts of the consensus algorithm enhance the flexibility of the system and support future management and updating.
In [4], a blockchain consensus based user access strategy is proposed. The authors investigate authenticity issues in data transmission in wireless networks. A consensus based scheme is used to verify the authenticity of the user and the Channel State Information (CSI). The scheme is also intended to improve the efficiency of users. The CSI is authenticated against fraudulent users, who intentionally manipulate their CSI to obtain resources. CSI is encoded and decoded using a conventional Neural Network (NN). Simulations are done by comparing the proposed scheme with other algorithms. Results show that the proposed scheme enhances spectral efficiency. However, in this scheme the nodes are not intelligent enough to perform several tasks simultaneously.
Multiple links and access points in the IoT network increase the security and privacy issues, and the centralization problem in the traditional IoT network brings further challenges. For efficient and secure data management in the IoT network, [5] proposes a blockchain based cross chain framework for access control, with which multiple blockchains are integrated. In this work, a comparison between multiple blockchains is performed. The results show that the integration of Fabric and IOTA is more efficient for IoT. The efficiency of the system is tested virtually for throughput and latency, and security is also achieved. However, the system does not guarantee the protection of user privacy and user information.
Furthermore, to tackle the issues related to access control, a privacy oriented blockchain based key management system is proposed in [6]. The issues of third party involvement and central authority are investigated. The main aim of the system is to reduce latency and increase cross domain access, and blockchain technology is used to bring decentralization. System performance is evaluated by simulations, and the interrelationship of parameters is also studied. The simulation results show that the multi-blockchain structure improves system performance and enhances scalability. However, the proposed scheme does not provide full persistency of the blockchain network.
IoT data are considered big data, and the access management of these data is a great challenge. To achieve trust, security and maximum access control, multiple schemes have been proposed in the literature. There are also storage issues, which create overhead and are considered in different works. To eliminate the storage issue, as well as the security and access control issues, [7] proposes an off-chain based sovereign blockchain system. In this work, the monitoring, control and regulation of nodes is maintained by the sovereign blockchain.


The performance of the system is evaluated against several existing techniques using PoC. The evaluation results show that the proposed scheme solves many problems, such as keeping excessive data on the blockchain and security and privacy concerns, and it increases the security and effectiveness of access control. However, this system does not work well for market level strategies and falls short when companies intend to integrate with it.
The authors in [8] investigate insecure data sharing among smart Mobile Terminals (MTs). A blockchain based data sharing system is proposed to overcome the security and sharing issues, and Deep Reinforcement Learning (DRL) is used to achieve high quality data sharing among MTs and IoT applications. The system also aims to provide a safe and reliable environment for MTs. The security analyses are performed under multiple attacks: eclipse attack, majority attack and terminal device failure. The results show that the proposed system withstands these attacks and achieves reliability and security. However, the system neither provides an efficient trade-off in some parameters nor supports auditing and charging services.
For secure and trustworthy service sharing among IoT devices, [9] proposes a service sharing system for resource constrained IoT devices. The sharing system is based on blockchain technology, which is used to validate the services of IoT devices. The system aims to protect lightweight (Lw) IoT clients from insecure service codes. To demonstrate its efficiency and effectiveness, the proposed model is tested using virtual cloud and edge nodes, and further comparative experiments on throughput and latency are done using Proof of Authority (PoA) and Proof of Work (PoW). The evaluation results show that the proposed system protects Lw clients from unauthorized services. In spite of this, the proposed scheme is less efficient in non-cooperative scenarios.
Data sharing in mobile communication networks is becoming complex. To manage data sharing, [10] proposes an Artificial Intelligence (AI) based network operation framework; the authors also investigate the problems hindering the full exploitation of AI. To make the data sharing system secure and trust-less, the framework combines smart contract based access control. Two blockchains are proposed in this work to improve efficiency and throughput. The system is further evaluated by comparison with existing schemes in the literature and outperforms them in terms of security, privacy and scalability. However, the proposed system is not efficient for all sharing scenarios, and it also does not work for market level strategies.

3 Proposed System Model

A blockchain based access control and data sharing system for the IoT network is proposed, motivated by the works in [1] and [10]. The proposed system model is shown in Fig. 2.

Fig. 2. System model

3.1 Smart Contracts


In this system, three smart contracts are used: the main smart contract, i.e., the Access Control Contract (ACC), the Register Contract (RC) and the Judge Contract (JC). The ACC handles access control. The RC is used to register the subject; it generates a table that records the required information of a subject and also maintains the authorization of users in the system. Moreover, the role of the JC is to manage misbehavior, which originates from the subject's side: when a subject sends too many requests for a data service, it is considered misbehavior. The JC checks for misbehavior; if misbehavior occurs, a penalty is imposed on the subject. If no misbehavior occurs, the permission levels of the subject are checked by the smart contract, and the subject can access the required services according to its permission level.

ACC. The ACC is the main smart contract. It is deployed to manage the overall access control of the system. When a subject wants to access the data services of an object, it sends a request for that service through the blockchain network. The ACC contract executes and manages all access management of the system. In the proposed system, only one ACC is used, and it manages access control for each request.

Table 1. Subject registration table

Subject  Object  Resource  Time
User A   User X  File1     2019/5/17 11:12
User B   User Y  Program2  2019/6/14 1:15
User C   User Z  File3     2019/8/8 3:00

RC. The RC is used to manage the access control of the IoT device. Its main role is to register the peer or subject that sends the service request. The RC maintains a table called the register table [1] for registration; it is shown in Table 1. In the register table, the required information of the subject is stored, such as subject, object, resource, time, etc. The RC also verifies and authenticates the subject through the register table.

JC. The JC implements a judging method that judges the misbehaviour of a subject. Before the further execution of a service request, the JC checks for misbehaviour. If misbehaviour has occurred, the JC determines the penalty for the subject; otherwise, the JC forwards the request. The JC generates the following alerts for access control (! marks an alert message):

– Access Authorized!
– Requests are Blocked!
– Static Check Failed!
– Misbehavior Detected!
– Static Check failed & Misbehavior Detected!

If no misbehaviour occurs, access is granted to the subject and the JC generates the message "Access Authorized". If any misbehaviour has been committed by the subject, the other messages are generated by the JC. For misbehaviours, a penalty is determined: the subject's requests are blocked for some time as a penalty.

3.2 Misbehavior

Misbehavior is determined by the judge contract. Misbehaviour tends to happen when a subject sends too many requests for data services in a short period of time. A misbehaviour field is maintained to record all misbehaviours. Whenever a misbehaviour happens, a penalty is decided by the JC; as a result, the requests of the subject are halted for a particular time. There are several types of misbehavior committed by a subject:

– the subject sends requests too frequently,
– the subject sends multiple requests in a particular time, i.e., 5 requests in 10 min, and
– the subject cancels the request.

As a result of a misbehavior, a penalty is determined by the JC. Under the penalty, the requests of the subject are halted; in this halted state, the subject is no longer able to send requests on the network for a certain time period.
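A conceptual sketch of this judging and penalty logic, written in plain Python rather than as the actual smart contract (the limit of 5 requests per 10 minutes follows the list above, while the penalty duration and the class name are assumptions for illustration):

```python
import time

REQUEST_LIMIT = 5        # at most 5 requests ...
WINDOW_SECONDS = 600     # ... per 10-minute window (from the list above)
PENALTY_SECONDS = 3600   # assumed blocking period after misbehaviour

class JudgeContractModel:
    """Toy model of the JC: too many requests in a short window is
    misbehaviour and blocks the subject for a penalty period."""
    def __init__(self):
        self.requests = {}        # subject -> recent request timestamps
        self.blocked_until = {}   # subject -> end of penalty

    def check(self, subject, now=None):
        now = time.time() if now is None else now
        if self.blocked_until.get(subject, 0.0) > now:
            return "Requests are Blocked!"
        recent = [t for t in self.requests.get(subject, []) if now - t < WINDOW_SECONDS]
        recent.append(now)
        self.requests[subject] = recent
        if len(recent) > REQUEST_LIMIT:
            self.blocked_until[subject] = now + PENALTY_SECONDS
            return "Misbehavior Detected!"
        return "Access Authorized!"
```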

3.3 Data Permission Control

Data permission levels are used to ensure the trustworthiness of access control. Permission levels are set according to the data sensitivity and the subject who wants to access that data. Data permission is divided into four levels, listed below (a conceptual sketch of such a permission check follows the list):

– L0: Data is not accessible


– L1: Data can be used in aggregated computation without revealing raw data
– L2: Data is partly allowed
– L3: Data or service is accessible.
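A conceptual sketch of such a permission check in plain Python (the policy entries reuse the subjects and objects of Table 1, but the levels assigned to them are invented for illustration):

```python
from enum import IntEnum

class Permission(IntEnum):
    L0 = 0   # data is not accessible
    L1 = 1   # aggregated computation only, raw data hidden
    L2 = 2   # data is partly allowed
    L3 = 3   # data or service is fully accessible

# assumed policy table: (subject, object) -> permission level
policy = {("User A", "User X"): Permission.L3,
          ("User B", "User Y"): Permission.L1}

def permission_level(subject, obj):
    """Default to L0 (no access) when no permission has been granted."""
    return policy.get((subject, obj), Permission.L0)

def can_access_raw_data(subject, obj):
    return permission_level(subject, obj) >= Permission.L3
```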

4 Simulation Results and Reasoning

In this section, the simulations of the proposed system are discussed in detail. The proposed system is evaluated for cost consumption in terms of gas usage, and the gas cost of the smart contracts and their functions is calculated.

4.1 Cost Consumption

In the Ethereum blockchain, cost consumption is evaluated in terms of gas. Gas is a measurement unit used to quantify the computational effort of transaction execution. The gas price is set by miners at the start of the transaction and determines how much fee is to be paid for a transaction; it is measured in Gwei. The amount of gas units is calculated for the execution cost and the transaction cost of the smart contract functions.
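As a worked example of how these quantities translate into a fee: the fee equals the gas units consumed multiplied by the gas price. The 89000-gas figure below matches the user register transaction cost reported in the next paragraphs, while the 20 Gwei gas price is an assumed value:

```python
GWEI = 10**9      # 1 Gwei = 1e9 wei
ETHER = 10**18    # 1 ether = 1e18 wei

def transaction_fee_eth(gas_used, gas_price_gwei):
    """Fee in ether = gas units consumed * gas price (given in Gwei)."""
    return gas_used * gas_price_gwei * GWEI / ETHER

fee = transaction_fee_eth(89_000, 20)   # 0.00178 ETH for the user register call
```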

Functions Cost. The transaction and execution costs of the functions of the smart contracts ACC, RC and JC are calculated.
Functions of ACC: Figure 3 shows the transaction and execution costs of the functions in the ACC. Since the ACC is the main contract, which controls the overall access control of the system, its functions perform more tasks than those of the other smart contracts. However, only the main functions are considered for the gas calculation, namely user registration, generating permission levels for the subject, and the data access function.

Fig. 3. ACC function cost

Transaction cost: the transaction cost of the functions is shown in the graph. The transaction cost of the ACC functions user register, permission level and data access is about 89000, 25000 and 30000 gas units, respectively.
Execution cost: the execution cost of the smart contract functions is also illustrated in the graph. The execution cost of the functions user register, permission level and data access is 65000, 5000 and 9000 gas units, respectively.
Functions of RC: the cost consumption of the RC functions is shown in Fig. 4. The RC manages the registration of the subject in the network and maintains a registration table for user information.
Transaction cost: the transaction cost of the RC functions user registration and registration table creation is about 133000 and 45000 gas units, respectively.
Execution cost: the execution cost of the user registration function and the registration table generation function is 130000 and 23000 gas units, respectively.
Functions of JC: the cost consumption of the JC functions is illustrated in Fig. 5. The JC functions are the misbehavior calculation function and the misbehavior judge function.
Transaction cost: the transaction cost of the JC functions misbehavior calculation and misbehavior judge is 80000 and 195000 gas units, respectively.
Execution cost: the execution cost of the JC functions is also shown in the bar graph. The cost of the misbehavior calculation and misbehavior judge functions is about 60000 and 165000 gas units, respectively.

Fig. 4. RC function cost

Fig. 5. JC function cost



5 Conclusion
In this work, a blockchain based system is utilized to overcome the challenges in IoT access management and data sharing. With the aid of blockchain and its features, many benefits can be brought into the IoT network.
This work is intended to provide trustworthiness, authorization and authentication in access management and data sharing. It consists of multiple smart contracts, which are used to maintain authorization, authentication and registration. Furthermore, misbehavior handling is also implemented for the case where the subject sends too many access requests in a short period of time, and a corresponding penalty is defined for the subject. If no misbehavior occurs, the permission levels are checked for the subject to access the services of the object. In addition, simulations are done to calculate the cost consumption of the smart contracts on the Ethereum platform. The costs of the smart contracts and their functions are calculated, checking both transaction and execution costs. The simulation results show that the proposed system is cost effective.

References
1. Zhang, Y., Kasahara, S., Shen, Y., Jiang, X., Wan, J.: Smart contract-based access
control for the internet of things. IEEE Internet Things J. 6, 1594–1605 (2018)
2. Novo, O.: Scalable access management in IoT using blockchain: a performance
evaluation. IEEE Internet Things J. (2018)
3. Ding, S., Cao, J., Li, C., Fan, K., Li, H.: A novel attribute-based access control
scheme using blockchain for IoT. IEEE Access 7, 38431–38441 (2019)
4. Lin, D., Tang, Y.: Blockchain consensus based user access strategies in D2D net-
works for data-intensive applications. IEEE Access 6, 72683–72690 (2018)
5. Jiang, Y., Wang, C., Wang, Y., Gao, L.: A cross-chain solution to integrating
multiple blockchains for IoT data management. Sensors 19, 2042 (2019)
6. Ma, M., Shi, G., Li, F.: Privacy-oriented blockchain-based distributed key manage-
ment architecture for hierarchical access control in the IoT scenario. IEEE Access
7, 34045–34059 (2019)
7. Sifah, E.B., Xia, Q., Agyekum, K.O.-B.O., Amofa, S., Gao, J., Chen, R., Xia, H.,
Gee, J.C., Du, X., Guizani, M.: Chain-based big data access control infrastructure.
J. Supercomput. 74, 4945–4964 (2018)
8. Liu, C.H., Lin, Q., Wen, S.: Blockchain-enabled data collection and sharing for
industrial IoT with deep reinforcement learning. IEEE Trans. Ind. Inform. (2018)
9. Xu, Y., Wang, G., Yang, J., Ren, J., Zhang, Y., Cheng, Z.: Towards secure network
computing services for lightweight clients using blockchain. Wirel. Commun. Mob.
Comput. (2018)
10. Zhang, G., Li, T., Li, Y., Hui, P., Jin, D.: Blockchain-based data sharing system
for AI-powered network operations. J. Commun. Inform. Netw. 3, 1–8 (2018)
Energy Trading Between Prosumer and
Consumer in P2P Network Using
Blockchain

Muhammad Usman Gurmani, Tanzeela Sultana, Abdul Ghaffar,


Muhammad Azeem, Zain Abubaker, Hassan Farooq, and Nadeem Javaid(B)

Department of Computer Science, Comsats University Islamabad,


Islamabad 44000, Pakistan
nadeemjavaidqau@gmail.com

Abstract. Nowadays, energy demand and energy production are increasing. Renewable energy resources will play an important role in managing the future production of electricity due to the growing development of societies. The centralized energy trading system faces a challenge in terms of fair energy distribution. The existing centralized energy trading system totally relies on a central system or third party, and the third party has many drawbacks in the form of record tampering or record altering. Fair transactions are the main issue in the energy trading sector. When bitcoin was introduced in the market, trust in blockchain technology increased. We propose a blockchain based energy trading system for peer to peer networks. Blockchain technology provides trust, security and transparency for energy trading, and with it there is no need for a third party in the energy supply sector. In our proposed work, we facilitate the prosumer who produces renewable energy and sells surplus energy to the consumer. We achieve transparency, accuracy and efficiency in our proposed work. Using a double auction process, we obtain a low energy price and achieve consumer trust in energy trading.

Keywords: Blockchain · Prosumer · Consumer · Energy trading · DSO · Double auction · Smart grid · Peer-to-Peer network

1 Introduction
The existing power system is rapidly changing around the world. For many years, these systems have relied on fossil fuels, power plants, oil, gas and petroleum to generate electricity and deliver it to users. In today's world, real-time monitoring is important in the management of a smart grid. Renewable energy is beneficial because it has no harmful impact on its surroundings. There are many types of renewable energy resources, such as solar energy, wind energy, geothermal energy and biomass. People spend a lot of money on security and trust. In a peer to peer network, the analysis of security in energy trading is an important factor for
a financial investment in the field of renewable energy production [1]. Various applications connected with the Internet of Things (IoT) have been installed into the smart grid to save time and to balance energy supply and demand. Blockchain technology provides a secure platform for storing transaction records in distributed peer to peer networks [2].
A centralized energy trading system has several drawbacks and has become an unreliable source for an energy provider. In a centralized system, all the participant nodes rely on a central server; however, the system can fail at any time and the nodes must be ready to leave the network. In the centralized energy sector, the chance of energy supply loss increases due to reliance on the third party. Consumers always want to use electricity securely and demand to receive electricity at all times. The prosumer energy production ratio is increasing rapidly due to the interest of consumers in secure and lower-cost energy. In a decentralized blockchain, every node must participate in the validation of every transaction, update every transaction record, and distribute the copy of the ledger to every node in the blockchain network. In the existing system, energy trading between prosumer and consumer takes place on a limited platform, and the security problems of that system are analysed in [3]; the consumer has no option of trading with another entity which provides a better opportunity based on a lower price.
In our proposed work, we apply a double auction feature. This feature is handled by the distribution system operator (DSO), whose role is to manage the information of the double auction between prosumers and consumers. In the current environment, consumers want a secure and low-price platform for energy trading. In the double auction process, the prosumer wants to sell energy at the maximum possible rate, while the consumer wants to buy energy at the minimum possible cost; both submit their rates to an auctioneer. If the consumer rate matches the prosumer rate, the auction process is completed. In this work, every node can perform two types of energy trading: one with the smart grid and the other with a prosumer. By applying blockchain in the double auction process, prosumer participation and market competition are increased.
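A minimal sketch of such a double auction matching step in plain Python (the bid/ask structure, the midpoint pricing rule and the sample values are illustrative assumptions, not the paper's implementation):

```python
def double_auction(asks, bids):
    """Sort prosumer asks ascending and consumer bids descending, and match
    pairs while the bid covers the ask; each trade is priced at the midpoint."""
    asks = sorted(asks, key=lambda o: o["price"])
    bids = sorted(bids, key=lambda o: o["price"], reverse=True)
    trades = []
    for ask, bid in zip(asks, bids):
        if bid["price"] < ask["price"]:
            break
        qty = min(ask["energy"], bid["energy"])
        trades.append({"seller": ask["id"], "buyer": bid["id"],
                       "energy": qty, "price": (ask["price"] + bid["price"]) / 2})
    return trades

trades = double_auction(
    [{"id": "P1", "energy": 10, "price": 4}, {"id": "P2", "energy": 5, "price": 6}],
    [{"id": "C1", "energy": 8, "price": 7}, {"id": "C2", "energy": 6, "price": 5}])
```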

2 Motivation

People currently use a trading system which is not secure enough to maintain energy trading, while technology is growing very fast. Therefore, the traditional energy trading system needs to be changed into a secure decentralized technology. In today's world, on one hand energy demand is increasing rapidly; on the other hand, this demand is being fulfilled by renewable energy. The benefit of renewable energy production is that surplus energy can be sold to fulfill consumer energy requirements. Blockchain technology provides a secure and tamper-proof platform on which energy trading can be performed. Blockchain is a chain in which blocks are created, all transactions are stored in the blocks, and the transactions are not handled by a single person. Blockchain is a distributed ledger: anyone who participates in the
blockchain network can easily access the ledger, update it and distribute it to the other nodes [4]. Each owner of a bitcoin places a digital signature on it and transfers it to the next person using the public key [5].

3 Related Work
3.1 Blockchain in Smart Grid
The existing system has some privacy issues and relies on a third party, and security is a major problem for the nodes. In [6], the authors propose a decentralized and automated secure platform for renewable energy trading in a smart home, using blockchain technology and Ethereum smart contracts to make the system secure without any need for third-party involvement in a microgrid. The simulation of that paper is implemented in the Solidity language. In that work, energy trading is limited to two nodes; as future work, the authors plan to expand the system so that more nodes can participate and to check its maximum scalability [6]. In the existing system, the energy market relies on a third party and produces electricity with the help of oil and gas. On the other hand, prosumers produce surplus energy which they want to trade with consumers on a secure platform without the intervention of a third party. In [7], the authors propose a secure and transparent energy trading platform: a decentralized system in which prosumers and consumers can trade energy autonomously. Nowadays, energy demand is increasing rapidly, and current regional energy production is not enough to fulfill this demand, so the traditional energy system faces many challenges. In [8], the authors propose a blockchain based crypto-trading project for energy trading, using blockchain software for exchanging cryptocurrency in the renewable energy trading market. The proposed system has some issues in terms of security concerns and the immaturity of the technology [8]. The local energy market (LEM) faces challenges in the current energy trading system due to the involvement of a third party, and local energy generation cannot meet local energy demand. The LEM trusts local agents who set the energy prices, which is a major problem of the local energy market.
In [9], the authors propose a private blockchain based technology for local energy trading. Using a private blockchain with an intelligent PoW consensus, a secure energy trading platform is provided for the local energy market without any extra transactions, along with a bidding platform in which prosumers and consumers can trade with each other using their own energy bidding schemes. The simulations of that work are performed in the Solidity language, and the Ethereum platform is used to create smart contracts for security purposes. In the future, there will be a need to change the regulatory system for blockchain based technology in the local energy market [9]. In the existing system, the energy trading cost is increased due to the involvement of a third party, and prosumer nodes can communicate only within a limited area; due to this restriction, they do not communicate with each other. In [10], the authors propose a blockchain based decentralized low-cost energy trading platform. Using blockchain as a cost-effective technology, the authors
guarantee secure transactive energy trading that is resistant to tampering. That work tackles the issues of reliability, visible service availability, and the high cost of energy trading transactions. The simulation of that paper is implemented in Hyperledger Fabric, and the smart contract is also written in Hyperledger. In the future, the authors plan to expand the system towards a specific blockchain based solution for the adapter module [10]. The centralized energy trading system faces many challenges in terms of good quality services and the accuracy of the record; it totally relies on a central system or third party, which has many drawbacks in the form of record tampering or record altering, so trust and security are major problems. In [3], the authors propose blockchain technology for the energy trading system; with blockchain technology, third parties are no longer necessarily needed to maintain the record. The authors analyse the security issues of a decentralized energy trading system and, keeping hacker attacks in mind, protect the proposed system against them. The blockchain cryptocurrency security mechanism uses the chord algorithm to find the disrupted node in the P2P energy trading system. In that work, the authors use bitcoin along with the SHA-256 cryptographic hash function, and the Advanced Encryption Standard (AES) for encrypting messages. The authors highlight the different attacks, and the behaviour of a hacker attack at any time is presented; using the overlay network, the system is protected from various attacks [3]. In [11], the authors tackle the problem of privacy in terms of data sharing.

3.2 Blockchain in Vehicular Network

The concept of intelligent vehicles removes the need for a driver. An intelligent vehicle is considered a self-driving car; however, security and trustworthiness are currently the main issues in communication between intelligent vehicles. In this work, the authors propose blockchain technology to tackle these security and trustworthiness issues. The blockchain is divided into two branches, (1) a local dynamic blockchain and (2) a main blockchain, which are used to minimize message latency and to manage trustworthiness. In the case of an incorporative blockchain, transparency and safety decrease as the amount of data increases [12]. There has been vast development in the field of the Internet of Vehicles (IoV), but intelligently storing the data remains a big challenge, and information security is a key issue for the IoV. The existing centralized traditional system for the IoV struggles to respond quickly in real time. The existing IoV generates big data as a record; the problem, however, is storing these big data in a secure environment. In this paper, the authors propose blockchain technology for the IoV. Using blockchain technology, the storage of vehicle information data in the intelligent system is maintained in a secure environment. In this work, blockchain technology is applied to achieve better privacy and a better communication system between vehicle-to-vehicle and vehicle-to-roadside units (RSUs). The authors discuss as future work how multiple blockchain vehicle nodes can communicate
with other vehicles and RSUs, and how the reliability of channels can be achieved in a cellular network in which traffic among vehicles is increasing [13].

3.3 Blockchain in Network Communication


Numerous existing security mechanisms are inappropriate for IoT device platforms due to maintenance problems and the extra resources, such as hardware, required to provide the services. As the usage of IoT devices on a network increases, security risks arise due to the trust placed in the network. Providing security to lightweight clients requesting services is a challenging task. In this paper, the authors propose a blockchain based security mechanism to maintain tamper-proof operation and to authenticate the state of edge services and off-chain services that are handled by an arbitration cloud trader; it also helps lightweight clients, providing services to them through a process of authentication and validation. When a lightweight client wants to perform a transaction on the network, this validation process reduces the cost of IoT devices on the network. Low delay and high throughput are achieved by applying proof of authority. The simulation of this paper is implemented in Geth 1.8.11 on a 3.4 GHz CPU, using Ubuntu MATE and Fedora 12. As future work, the authors aim to gather more feedback from IoT devices and to attain a superior trade-off among scalability, availability, flexibility and secure services using the blockchain based platform [14].

3.4 Blockchain in Wireless Sensor Network


Multi-domain networks face many operational challenges. Using a distributed blockchain, Dapps technology handles the operational phases in a multi-administration network; in the future, the authors intend to address important research problems towards blockchain based multi-administration domain networks [15]. Crowd sensing is not offered as a large-scale task to the mobile users who participate in it, and it does not give mobile users facilities for large-scale awareness. Many incentive mechanisms exist for cloud platforms; however, none of them handles the privacy issues. Many devices are connecting to the IoT, and as their number grows, these IoT devices face many challenges in terms of security risks and privacy. The existing traditional sensors are not portable and their deployment cost is high. Privacy hacking is an important problem for mobile users collecting sensor data. In this paper, the authors keep in mind the privacy issue which is clearly seen in the existing traditional system and introduce blockchain technology to handle the privacy and security issues. A blockchain based protective incentive mechanism is used to protect people's private information, and the incentive mechanism increases the interest of users in participating in the sensing task. In this paper, the experiments are performed in a limited environment and the data are collected from a small area; by applying this type of experiment, the results obtained are one-sided [16].
Using blockchain technology, different authors have resolved various problems such as data rights management, healthcare, data security and fair data sharing, data trading, node recovery, efficient energy trading and edge server participation. In [17–25], the authors have provided blockchain based solutions for the above mentioned problems.

4 Problem Statement
All the current energy trading systems are centralized, and all the procedures rely on a single system, which has many drawbacks: the record may be altered or changed and may not be available at all times [2]. Security is a serious issue in a centralized system, since anyone can enter the system and alter the record. People spend a lot of money on centralized systems but do not achieve security. In the existing system, energy trading is limited to two nodes [6]. An attacker node continuously tries to enter the energy trading system and keeps on damaging it [3]. To prevent different attacks, we apply Proof of Authority (PoA) in our proposed scheme. Nowadays, people shift their trust to developed industries which provide a secure environment for trading renewable energy; all these facilities are provided by blockchain technology. In blockchain technology, a record cannot be updated until 51% of the participants validate or authenticate the transaction record.

Fig. 1. Blockchain based energy trading between prosumer and consumer


5 Proposed System Model


Our blockchain based decentralized proposed system model consists of three entities that provide secure energy trading between prosumer and consumer in a P2P network. Our proposed work is motivated by [3]. When the prosumer needs energy to maintain its production, it purchases energy from the smart grid at a fixed price. The distribution system operator (DSO) is considered a node that handles the double auction task between prosumers and consumers. Prosumers send a request to the DSO for energy supply; the DSO receives the different requests from prosumers and consumers after verification, as shown in Fig. 1. The DSO announces the double auction process among the different prosumers and consumers for energy trading. When the double auction process is completed, the DSO saves the double auction record in the decentralized blockchain. The consumer purchases energy from the smart grid at a fixed price; however, he may not be able to afford this fixed price and wants to purchase lower-cost energy for his benefit. For this purpose, the consumer sends a request to the DSO to purchase low-price energy. The DSO delivers the energy price information to all participating prosumers and consumers. When the consumer is satisfied with the energy price, he then deals with the prosumer for low-cost energy in a secure P2P network.

Fig. 2. Blockchain based peer to peer energy trading

The submodel in Fig. 2 of our proposed system is motivated by [6]. As shown in Fig. 2, prosumers and consumers can trade energy in a decentralized blockchain network. Each node can trade energy with another node in a secure network. All
the transactions of the nodes are stored in blocks, and further blocks are added, forming a blockchain network. Every node validates the transactions and distributes a copy of each transaction to all other nodes.

5.1 Proof of Authority


Proof of authority is a type of blockchain consensus mechanism in which consensus relies on a list of validators (referred to as authorities when they are linked to physical entities); validators are accounts or nodes. Validators participate by validating transactions and blocks. In our proposed model, the DSO node is the authority node. Its responsibility is to maintain the distributed ledger, the smart contracts and the transactions, and to validate the prosumer and consumer entities. First of all, the DSO node is responsible for the authentication of the prosumer and consumer nodes, and it then executes the smart contract. The DSO node works in PoA mode, where the DSO broadcasts the double auction for energy trading to all prosumers and consumers. After the double auction, the DSO is responsible for storing all the transactions in the blockchain. In the PoA consensus mechanism, an attacker requires a 51% attack to gain control over the blockchain network, and obtaining control of the nodes is harder under PoA than under PoW.

5.2 Smart Contract


A smart contract is self-executing code. It is a type of agreement between purchaser and seller without any requirement for a third person or external mechanism. A smart contract makes the Blockchain more secure compared to a centralized system. We use a smart contract in our proposed model to make transactions between trading parties secure. In this paper, if a prosumer and a consumer participate in energy trading, both parties must create a smart contract for that trade.
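As an informal illustration of the agreement logic described above, the following Python sketch models a trade that is settled only after both parties confirm. It is not the Solidity contract used in the paper; the class EnergyTradeContract and its fields are hypothetical and only mirror the idea of a self-executing agreement without a third party.

```python
class EnergyTradeContract:
    """Toy, off-chain Python model of the agreement logic: the trade is
    settled only when both parties have confirmed (our simplification)."""

    def __init__(self, prosumer, consumer, units, price_per_unit):
        self.prosumer = prosumer
        self.consumer = consumer
        self.units = units
        self.price_per_unit = price_per_unit
        self.confirmed = {prosumer: False, consumer: False}
        self.settled = False

    def confirm(self, party):
        if party not in self.confirmed:
            raise ValueError("unknown party")
        self.confirmed[party] = True
        # Settle automatically once both parties agree
        if all(self.confirmed.values()) and not self.settled:
            self.settled = True
            return {"pay": self.units * self.price_per_unit,
                    "from": self.consumer, "to": self.prosumer}
        return None

contract = EnergyTradeContract("prosumer_A", "consumer_X", units=4, price_per_unit=0.12)
contract.confirm("prosumer_A")
print(contract.confirm("consumer_X"))   # payment instruction once both confirmed
```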

6 Simulation and Results


The simulations of our proposed model are done on Windows 10 using a laptop with 8 GB RAM, 500 GB ROM and an Intel(R) Core(TM) i3-2350M CPU @ 2.30 GHz. The simulations are implemented in the Solidity language, in which the smart contract is created. The Ganache tool is used for calculating the gas consumption required to perform a transaction. Fig. 3 shows the comparison of gas cost values among different events. When any event occurs in Solidity, it consumes some gas as a cost. We see that the gas values of the events increase according to their tasks: when the double auction event occurs, it takes a higher gas cost compared to the prosumer and consumer events.
As shown in Fig. 4, we performed different transactions in the Solidity language, with the Ganache tool used to measure gas consumption, and noted the time of every transaction in the Ganache tool.

Fig. 3. Gas cost and event creation

Fig. 4. No of transaction and time M/S

Fig. 5. Prosumer energy rate and No. of transaction of consumer



Fig. 6. Double auction time and No. of prosumer and consumer request

When a transaction is executed, its time increases or decreases according to the deployed smart contract and depends on the values of the functions that we call during the transaction.
Figure 5 compares two quantities: the x-axis shows the prosumer energy rate in bitcoin per unit and the y-axis shows the number of consumer transactions. When the price of prosumer energy increases, the number of consumer transactions decreases, because consumers want energy at minimum cost and cannot afford high-priced energy. The prosumer energy rate is associated with consumer trust and with how frequently consumers transact: as Fig. 5 shows, when the prosumer energy rate is low, the number of trusted consumer energy transactions increases.
Figure 6 represents the transaction execution time between prosumer and consumer using the double auction process. When any transaction is performed on the Blockchain, it takes some time to execute. When multiple consumers send transaction requests to a prosumer, the response time needed to complete the execution increases. When more prosumers participate in the double auction for energy supply, market competition increases and energy prices decrease, so low-cost energy trading is achieved.

7 Conclusion
In our proposed work, Blockchain technology is used for trading renewable energy between prosumer and consumer in a distributed peer-to-peer network. Using Blockchain technology, we provide a secure platform for energy trading, in which we evaluate the efficiency of our system and analyse market cost competition. In this paper, we handle privacy and transparency in a distributed peer-to-peer network for a limited energy trading environment. It is clear

from our proposed idea that every node can participate in the double auction
process to perform a transaction. The important benefit of the double auction is that it gives the consumer a secure and optimal platform for energy trading. The main contribution of this paper is to provide a secure energy environment at a lower energy cost compared to the market. In future work, we will expand our system to a large-scale market and will further improve the scalability and efficiency of our system in a real experimental environment.

References
1. Aitzhan, N.Z., Svetinovic, D.: Security and privacy in decentralized energy trading
through multi-signatures, blockchain and anonymous messaging streams. IEEE
Trans. Dependable Secure Comput. 15(5), 840–852 (2016)
2. Karame, G.O., Androulaki, E.: Bitcoin and blockchain security: introduction. In:
Bitcoin and Blockchain Security, pp. 1–9. Artech House (2016)
3. Rahmadika, S., Ramdania, D.R., Harika, M.: Security analysis on the decentralized
energy trading system using blockchain technology. J. Online Inform. 3(1), 44–47
(2018)
4. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system (2008). http://bitcoin.
org/bitcoin.pdf
5. Narayan, P.: Building Blockchain Projects: Develop Real-time Dapps Using
Ethereum and Javascript. Packt Publishing Ltd., Birmingham-Mumbai (2017)
6. Kang, E.S., Pee, S.J., Song, J.G., Jang, J.W.: A blockchain-based energy trading
platform for smart homes in a microgrid. In: 2018 3rd International Conference on
Computer and Communication Systems (ICCCS), pp. 472–476. IEEE, April 2018
7. Pee, S.J., Kang, E.S., Song, J.G., Jang, J.W.: Blockchain based smart energy trad-
ing platform using smart contract. In: 2019 International Conference on Artifi-
cial Intelligence in Information and Communication (ICAIIC), pp. 322–325. IEEE,
February 2019
8. Mannaro, K., Pinna, A., Marchesi, M.: Crypto-trading: blockchain-oriented energy
market. In: 2017 AEIT International Annual Conference, pp. 1–5. IEEE, September
2017
9. Mengelkamp, E., Notheisen, B., Beer, C., Dauer, D., Weinhardt, C.: A blockchain-
based smart grid: towards sustainable local energy markets. Comput. Sci.-Res.
Dev. 33(1–2), 207–214 (2018)
10. Lombardi, F., Aniello, L., De Angelis, S., Margheri, A., Sassone, V.: A blockchain-
based infrastructure for reliable and cost-effective IoT-aided smart grids (2018)
11. Samuel, O., Javaid, N., Awais, M., Ahmed, Z., Imran, M., Guizani, M.: A
blockchain model for fair data sharing in deregulated smart grids
12. Singh, M., Kim, S.: Branch based blockchain technology in intelligent vehicle. Com-
put. Netw. 145, 219–231 (2018)
13. Jiang, T., Fang, H., Wang, H.: Blockchain-based internet of vehicles: distributed
network architecture and performance analysis. IEEE Internet Things J. (2018)
14. Lin, J., Shen, Z., Miao, C., Liu, S.: Using blockchain to build trusted lorawan
sharing server. Int. J. Crowd Sci. 1(3), 270–280 (2017)
15. Rosa, R., Rothenberg, C.E.: Blockchain-based decentralized applications for mul-
tiple administrative domain networking. IEEE Commun. Stand. Mag. 2(3), 29–37
(2018)

16. Jia, B., Zhou, T., Li, W., Liu, Z., Zhang, J.: A blockchain-based location privacy
protection incentive mechanism in crowd sensing networks. Sensors 18(11), 3894
(2018)
17. Rehman, M., Javaid, N., Awais, M., Imran, M., Naseer, N.: Cloud based secure
service providing for IoTs using blockchain
18. Mateen, A., Javaid, N., Iqbal, S.: Towards energy efficient routing in blockchain
based underwater WSNs via recovering the void holes. MS Thesis, COMSATS
University Islamabad (CUI), Islamabad 44000, Pakistan, July 2019
19. Naz, M., Javaid, N., Iqbal, S.: Research based data rights management using
blockchain over ethereum network. MS Thesis, COMSATS University Islamabad
(CUI), Islamabad 44000, Pakistan (2019)
20. Javaid, A., Javaid, N., Imran, M.: Ensuring analyzing and monetization of data
using data science and blockchain in IoT devices. MS Thesis, COMSATS University
Islamabad (CUI), Islamabad 44000, Pakistan (2019)
21. Kazmi, N., Javaid, N., Imran, M.: Towards energy efficiency and trustfulness in
complex networks using data science techniques and blockchain. MS Thesis, COM-
SATS University Islamabad (CUI), Islamabad 44000, Pakistan (2019)
22. Zahid, M., Javaid, N., Babar, M.: Balancing electricity demand and supply in
smart grids using blockchain. MS Thesis, COMSATS University Islamabad (CUI),
Islamabad 44000, Pakistan (2019)
23. Noshad, Z., Javaid, N., Imran, M.: Analyzing and securing data using data science
and blockchain in smart networks. MS Thesis, COMSATS University Islamabad
(CUI), Islamabad 44000, Pakistan (2019)
24. Ali, I., Javaid, N., Iqbal, S.: An incentive mechanism for secure service provision-
ing for lightweight clients based on blockchain. MS Thesis, COMSATS University
Islamabad (CUI), Islamabad 44000, Pakistan (2019)
25. Jalees, R., Javaid, N., Iqbal, S.: Blockchain based node recovery scheme for wireless
sensor networks. MS Thesis, COMSATS University Islamabad (CUI), Islamabad
44000, Pakistan (2019)
Auto-Generating Examination Paper Based
on Genetic Algorithms

Xu Chen1, Deliang Zhong1, Yutian Liu1, Yipeng Li1,


Shudong Liu1(&), and Na Deng2
1
Zhongnan University of Economics and Law, 182 Nanhu Avenue, Hongshan
District, Wuhan, Hubei, China
chenxu@whu.edu.cn, 1360534847@qq.com,
990579052@qq.com, lyp2357@163.com,
liumu1321@zuel.edu.cn
2
School of Computer, Hubei University of Technology, Wuhan, China
iamdengna@163.com

Abstract. With the acceleration of education informatization, the social


demand for online examination papers is increasing. However, there are some
problems in the generation of online examination papers. Firstly, examination papers cannot be generated randomly and quickly. Secondly, examination papers cannot be adjusted dynamically according to test results. Thirdly, examination papers cannot be generated based on the individual characteristics of students. In order to solve these problems, this paper proposes a new auto-generating examination paper model based on a genetic algorithm. The model dynamically adjusts the difficulty factor of individual test questions by analyzing online learning data and historical user test results, and thus ensures that the difficulty of the generated examination papers stays in line with changes in the current educational environment. The simulation results show that the algorithm improves the efficiency and accuracy of examination paper generation and effectively controls the difficulty coefficient of the examination paper.

1 Introduction

With the wide application of computers in education, the traditional paper-based examination has evolved into the online examination. Unlike the paper-based examination, the online examination eliminates the steps of paper setting, examination, printing, collection and grade entry, which shortens the examination period. Teachers or relevant staff only need to enter the relevant information of the exam, such as the total score, the types of questions, the difficulty and the chapters covered, into the online exam system before the exam. The system will then generate the exam papers according to these constraints. The generation of test papers only depends on the restrictions entered beforehand, so the confidentiality of test papers is higher. An online examination system not only saves a lot of manpower, material and financial resources, but also makes the examination fairer and more effective.
Test results can objectively and effectively reflect the mastery of knowledge, but uncertainty in the quality of the test paper will affect the results and introduce an element of luck.


This paper presents a method of auto-generating examination
paper based on Genetic Algorithms, which can strictly control the various attributes of
the test paper, so as to achieve accurate quality control of the test paper, and make the
test results more comparable.

2 Genetic Algorithm

Genetic Algorithms (GA) simulate Darwin's theory of evolution and Mendel's theory of genetics to establish a stochastic global optimization method. A genetic algorithm can automatically acquire and accumulate knowledge of the search space during a global search, so as to achieve an efficient and parallel search, and it can adaptively steer the search process towards the optimal solution.
The objects of a genetic algorithm are all the individuals in a digitized population, and stochastic techniques are used to guide the evolution of this space towards the optimal solution. Like natural populations, digital populations also undergo selection, crossover and mutation to maintain the diversity of the search space. The core of a genetic algorithm is composed of five elements: parameter coding, initial population setting, fitness function design, genetic operation design and control parameter setting.

3 Genetic Algorithm Implementation Process

The implementation process of a genetic algorithm is similar to that of biological evolution in nature (Fig. 1). The first step is to find a "digital" coding scheme for the solutions of the problem.
The second step is to initialize the population randomly, with the individuals in the population adopting this "digital" coding. The third step is to decode the "digitized" individuals with appropriate decoding methods, calculate the fitness of the individuals in the initial population with the fitness function, and cross the individuals with higher fitness to produce new offspring. Through repeated crossover and mutation, the optimal solution is obtained or the maximum number of iterations is reached.
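The three steps above can be summarized in a generic skeleton. The following Python sketch is a toy genetic algorithm on bit strings (maximizing the number of ones), given purely as an illustration of the coding, initialization, evaluation, selection, crossover and mutation cycle; all names and parameter values are our own, and the paper's actual operators for test papers are described in Sect. 4.

```python
import random

GENES, POP_SIZE, GENERATIONS = 20, 30, 50
PC, PM = 0.7, 0.05                      # crossover and mutation probabilities

def fitness(ind):                       # toy fitness: number of ones
    return sum(ind)

def select(pop):                        # roulette-wheel selection
    total = sum(fitness(i) for i in pop)
    pick, running = random.uniform(0, total), 0.0
    for ind in pop:
        running += fitness(ind)
        if running >= pick:
            return ind
    return pop[-1]

def crossover(a, b):                    # single-point crossover
    if random.random() < PC:
        p = random.randint(1, GENES - 1)
        return a[:p] + b[p:], b[:p] + a[p:]
    return a[:], b[:]

def mutate(ind):                        # flip each gene with probability PM
    return [1 - g if random.random() < PM else g for g in ind]

population = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    offspring = []
    while len(offspring) < POP_SIZE:
        c1, c2 = crossover(select(population), select(population))
        offspring += [mutate(c1), mutate(c2)]
    population = offspring[:POP_SIZE]

print(max(fitness(i) for i in population))   # best individual found
```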

4 Genetic Algorithm Design

According to the implementation process of genetic algorithm, test paper library, test
paper, test question and test question characteristics are mapped into population,
individual, chromosome and gene respectively. Then, according to the constraint
conditions, such as the total score, difficulty coefficient and so on, the test questions
matching the feature parameters are searched from the formed question library, thereby
extracting the optimal test question combination.

[Fig. 1 depicts the loop: coding, setting the initial population, calculating individual fitness, selection, crossover and variation, with the evolution step feeding back into the fitness calculation]

Fig. 1. Genetic algorithm implementation process

4.1 Chromosome Coding and Population Initialization


The purpose of chromosome coding is to establish a mapping relationship between
phenotypes (test questions) and genotypes (characteristic parameters).
The traditional genetic algorithm uses binary coding, but because test questions have many characteristic parameters, binary coding would complicate the later crossover and mutation operators, so real-number coding is applied to the question numbers of the test questions. The use of real coding removes the individual decoding step and speeds up the search for the optimal solution. Questions of the same type are put together, so that the chromosome is segmented according to question type.

Chromosome: 9, 32, 123, 4 | 89, 129, 84, 42 | ... | 90, 121, 43, 98

Initial population generation is the initialization of the test paper library. It randomly generates a certain number of test papers under the constraints of the total score, the total number of questions and the proportion of each question type. This reduces the convergence time and the number of iterations.
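A minimal sketch of this real-number segment coding and random initialization is given below, assuming a hypothetical question bank in which question IDs are grouped by question type; the names QUESTION_BANK, PAPER_SPEC and random_chromosome are ours, not the paper's.

```python
import random

# Hypothetical question bank: question IDs grouped by type (one segment per type)
QUESTION_BANK = {
    "single_choice": list(range(1, 101)),
    "fill_in":       list(range(101, 161)),
    "essay":         list(range(161, 181)),
}
# How many questions of each type one paper must contain (assumed constraint)
PAPER_SPEC = {"single_choice": 4, "fill_in": 4, "essay": 2}

def random_chromosome():
    """One paper = one chromosome: question IDs, segmented by type."""
    return {qtype: random.sample(QUESTION_BANK[qtype], k)
            for qtype, k in PAPER_SPEC.items()}

def init_population(size):
    return [random_chromosome() for _ in range(size)]

population = init_population(size=20)
print(population[0])
```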

4.2 Fitness Function


The fitness function is used to judge the quality of the test paper. The higher the
individual’s fitness is, the more adaptable it is to the environment, so the more likely it
is to be selected for crossover. Since the total score, type and number of questions have
been determined at the time of initialization of test paper population, it is only nec-
essary to consider the calculation of the difficulty coefficient and the coverage of the
knowledge points.
(1) Difficulty factor P calculation
P = \left| NP - \frac{\sum_{i=1}^{k} p_i \, t_i}{T} \right| \qquad (1)

Here P represents the absolute value of the difference between the expected difficulty coefficient and the actual one. NP represents the expected difficulty coefficient of the test paper, with range [0, 1]; when NP > 0.8 the paper is expected to be difficult, and when NP < 0.2 it is expected to be very simple. k is the total number of questions in the test paper, i indexes the i-th question, p_i ∈ [0, 1] is the difficulty of the i-th question, t_i is the score of the i-th question, and T is the total score of the test paper, so that P also falls in [0, 1]. The smaller the value of P, the closer the actual difficulty of the test paper is to the expected difficulty, and the higher the quality of the test paper.
(2) Exposure V Calculation
V = \frac{\sum_{i=1}^{k} v_i \, t_i}{T} \cdot \frac{\sum_{i=1}^{k} sv_i}{\sum_{i=1}^{m} sv_i} \qquad (2)

V represents the exposure degree of the test questions. It is calculated from the scores of the exposed test questions and the number of times they have been used. i indexes the questions, k is the total number of questions in the test paper, v_i indicates whether the i-th question has been used in other test papers (0 means not used, 1 means used), t_i is the score of the i-th question, T is the total score of the test paper, sv_i is the total number of times the i-th question has been used, and m is the total number of questions in the question library. When none of the questions in the question library has been used, V is 0. From the expression, the value of V lies in [0, 1]. The smaller the value of V, the lower the exposure of the test paper and the higher its quality.
(3) Fitness function expression
The fitness function is composed of difficulty coefficient and exposure, and its
value should be non-negative, the bigger the better. The fitness function is
designed as follows:

f(x) = 2 - P - V

The smaller the values of P and V, the larger the value of f(x) and the better the quality of the test paper.
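The following Python sketch implements Eqs. (1), (2) and the fitness expression f(x) = 2 - P - V for one candidate paper, under the assumption that each question carries its score, difficulty, a used-before flag and a usage counter; the dictionary field names are our own.

```python
def difficulty_deviation(questions, expected_np):
    """Eq. (1): P = | NP - (sum_i p_i * t_i) / T |, with p_i = difficulty, t_i = score."""
    total = sum(q["score"] for q in questions)                       # T
    weighted = sum(q["difficulty"] * q["score"] for q in questions)  # sum p_i * t_i
    return abs(expected_np - weighted / total)

def exposure(questions, total_bank_usage):
    """Eq. (2): share of paper score from used questions times share of bank usage."""
    total = sum(q["score"] for q in questions)
    used_score = sum(q["score"] for q in questions if q["used_before"]) / total
    used_count = sum(q["times_used"] for q in questions)
    return used_score * (used_count / total_bank_usage) if total_bank_usage else 0.0

def fitness(questions, expected_np, total_bank_usage):
    """f(x) = 2 - P - V: larger is better; non-negative since P and V lie in [0, 1]."""
    return 2 - difficulty_deviation(questions, expected_np) - exposure(questions, total_bank_usage)

# Hypothetical paper of three questions
paper = [
    {"score": 40, "difficulty": 0.6, "used_before": True,  "times_used": 3},
    {"score": 30, "difficulty": 0.4, "used_before": False, "times_used": 0},
    {"score": 30, "difficulty": 0.7, "used_before": True,  "times_used": 1},
]
print(fitness(paper, expected_np=0.55, total_bank_usage=50))
```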

4.3 Selection Operator


In nature, the higher an individual's adaptability to the environment, the greater its probability of survival, so roulette-wheel selection can be used.

P = \frac{f(x)}{\sum_{i=1}^{m} f(x_i)} \qquad (3)

P denotes the probability that an individual will be selected, and m denotes the total number of papers (the population size) in the test library.
Spinning the survivability wheel (Fig. 2), when the wheel stops the pointer randomly points to the area represented by one test paper; the higher the fitness of the test paper, the higher the probability that it will be selected.

[Fig. 2 data: five papers P1-P5 with fitness values 1.0, 1.2, 1.5, 1.7 and 1.8 occupy 14%, 17%, 21%, 23% and 25% of the wheel, respectively]

Fig. 2. Survivability wheels of examination papers
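A possible implementation of this roulette-wheel selection, using the probabilities of Eq. (3), is sketched below; the fitness values are the ones shown in Fig. 2 and the function name roulette_select is ours.

```python
import random

def roulette_select(population, fitnesses):
    """Pick one individual with probability f(x) / sum_i f(x_i) (Eq. (3))."""
    total = sum(fitnesses)
    pick = random.uniform(0, total)        # spin the wheel
    running = 0.0
    for individual, f in zip(population, fitnesses):
        running += f
        if running >= pick:
            return individual
    return population[-1]                  # numerical safety net

papers = ["paper_1", "paper_2", "paper_3", "paper_4", "paper_5"]
fits = [1.0, 1.2, 1.5, 1.7, 1.8]           # the shares shown in Fig. 2
print(roulette_select(papers, fits))
```

In practice, Python's random.choices(papers, weights=fits, k=1)[0] performs the same weighted draw in a single call.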

4.4 Crossover Operator


Crossover means that two paired chromosomes exchange some genes in a certain way to form two new individuals; it is the main way of producing new individuals. Chromosome coding uses real-number segment coding, putting questions of the same type together. Because the question types are independent of each other, applying single-point crossover within each segment yields a multi-point crossover over the whole chromosome. Here the crossover probability Pc is 0.7. The crossover process is as follows:

Chromosome A: 93, 2, 12, 11 | 38, 9, 1, 29 | 84, 98, 439
Chromosome B: 90, 1, 21, 342 | 43, 8, 48, 34 | 7, 19, 943

Crossing them produces two new individuals:

Chromosome A (new): 93, 2, 12, 342 | 38, 9, 48, 29 | 84, 19, 439
Chromosome B (new): 90, 1, 21, 11 | 43, 8, 1, 34 | 7, 98, 943

The two crossed chromosomes may contain the same test question within the same type segment. If a new chromosome produced by crossover contains two identical question numbers, it is regarded as an illegal individual. In that case we delete one of the duplicated questions and randomly select a question with the same knowledge points from the question library to add to the new chromosome, making it a legitimate individual. As the population evolves and the fitness of the test papers improves, the crossover probability Pc should be adjusted accordingly. At the beginning of population formation the differences between individuals are large, so Pc should be larger. As the population evolves, the fitness differences between individuals become smaller; to avoid the population prematurely falling into a local optimum, Pc should be decreased to slow down the evolution speed.
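The segment-wise single-point crossover and the duplicate repair described above could look roughly as follows in Python; for simplicity the repair draws a replacement question of the same type rather than of the same knowledge point, and the question bank and IDs are hypothetical.

```python
import random

QUESTION_BANK = {                     # hypothetical question IDs per type
    "single_choice": list(range(1, 101)),
    "fill_in":       list(range(101, 161)),
}

def repair(segment, qtype):
    """Replace duplicated question IDs with unused ones of the same type
    (knowledge-point matching is simplified to type matching here)."""
    seen, fixed = set(), []
    for qid in segment:
        if qid in seen:
            qid = random.choice([q for q in QUESTION_BANK[qtype] if q not in seen])
        seen.add(qid)
        fixed.append(qid)
    return fixed

def crossover(parent_a, parent_b, pc=0.7):
    """Single-point crossover inside every type segment (multi-point overall)."""
    child_a, child_b = {}, {}
    for qtype in parent_a:
        seg_a, seg_b = list(parent_a[qtype]), list(parent_b[qtype])
        if random.random() < pc and len(seg_a) > 1:
            point = random.randint(1, len(seg_a) - 1)
            seg_a, seg_b = seg_a[:point] + seg_b[point:], seg_b[:point] + seg_a[point:]
        child_a[qtype] = repair(seg_a, qtype)
        child_b[qtype] = repair(seg_b, qtype)
    return child_a, child_b

a = {"single_choice": [93, 2, 12, 11], "fill_in": [138, 109, 101, 129]}
b = {"single_choice": [90, 1, 21, 34], "fill_in": [143, 108, 148, 134]}
print(crossover(a, b))
```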

4.5 Mutation Operator


The mutation operator substitutes certain genes in a chromosome to produce new individuals, which increases the diversity of the population.
Firstly, an individual is randomly selected from the population, and then the chromosome mutation operation is carried out with mutation probability Pm = 0.08. Under the condition that the question numbers and knowledge points do not repeat, genes in each segment of the chromosome are mutated to produce new individuals. Gene mutations can have both good and bad results and may reduce an individual's fitness.
Similar to the crossover probability Pc, the mutation probability Pm should also change continuously as the population evolves, but the change of Pm is incremental: at the beginning of population formation Pm should take a smaller value, and when the fitness gap between individuals becomes small, Pm should increase to maintain the diversity of the population.
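A simple variant of this mutation operator is sketched below: with probability Pm a segment swaps one of its genes for a random question of the same type that the paper does not already contain. The question bank is hypothetical and knowledge-point constraints are omitted.

```python
import random

QUESTION_BANK = {"single_choice": list(range(1, 101)),   # hypothetical IDs per type
                 "fill_in":       list(range(101, 161))}

def mutate(chromosome, pm=0.08):
    """With probability pm per segment, swap one question ID for a random
    question of the same type that the paper does not already use."""
    mutant = {qtype: list(genes) for qtype, genes in chromosome.items()}
    for qtype, genes in mutant.items():
        if random.random() < pm:
            pos = random.randrange(len(genes))
            candidates = [q for q in QUESTION_BANK[qtype] if q not in genes]
            if candidates:
                genes[pos] = random.choice(candidates)
    return mutant

print(mutate({"single_choice": [93, 2, 12, 11], "fill_in": [138, 109, 101, 129]}, pm=1.0))
```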

4.6 Evolutionary Termination


The genetic algorithm needs to iterate continuously to obtain an approximately optimal solution. At the same time, we need to consider the termination condition of the iteration. Since more iterations will not necessarily produce better results, we need to control the running time. Therefore, the following two termination conditions are used.

(1) If there is a solution that meets the expected fitness, the iteration terminates and the test paper is successfully assembled. (2) If no solution satisfies the expected fitness but the number of iterations has reached the threshold, the solution with the highest fitness is selected and the test paper is completed.

5 Summary

Based on an understanding of genetic algorithms, this paper designs and implements intelligent test paper generation for an online examination system by using a real-number segment coding strategy and a genetic algorithm with a piecewise single-point crossover operator. The experimental data show that the genetic algorithm can quickly form high-quality test papers and improve the level of online education.

Acknowledgments. This study was financially supported by the National Natural Science
Foundation of China (61602518, 71872180). The Fundamental Research Funds for the Central
Universities, Zhongnan University of Economics and Law (2722019JCT035, 2722019JCG074).

The Data Scientist Job in Italy: What
Companies Require

Maddalena della Volpe1(&) and Francesca Esposito2


1
Department of Business, Management and Innovation System,
University of Salerno, Fisciano, Italy
mdellavolpe@unisa.it
2
Department of Political and Communication Sciences,
University of Salerno, Fisciano, Italy
fraesposito@unisa.it

Abstract. In recent years, experts have considered the job of data scientist the sexiest of the 21st century. However, people with a data scientist's expertise seem to be rare. This probably happens because of the complex set of competences that this profession requires. In this paper, we deal with companies that are searching for data scientists to expand their workforce. Scraping data from the business-networking website LinkedIn, we collected, for each company, its size, sector, kind of employment, contract form, working functions and required skills. Our findings suggest that the data scientist profession extends to several sectors but is not yet consolidated. This condition intensifies the misconception about the skills required. Based on all this, we think that the role of higher education institutions becomes fundamental, on the one hand to define data science as a discipline, and on the other to train young people to acquire the set of skills needed.

1 Introduction

The advent of big data poses several challenges [1, 2], including the difficulties in
recruiting data scientists [3] capable of managing big data technologies [4, 5].
Gartner [6] identifies three components to take into consideration when companies face the big data challenge. Firstly, data, analytics and governance are separating from the Information Technology function, moving completely outside it. Secondly, big data represents at the same time an opportunity and a risk: while analytics are becoming more sophisticated, data scientists are still in their infancy. Thirdly, dealing with big data at an organizational level means converging three disciplines, i.e. information management, analytics and algorithms, plus change leadership and management.
Companies and industries are more aware that data analysis is increasingly
becoming a vital factor to be competitive, to discover new insights, and to personalize
services [7]. The complexity of a big data system is given by different sequential
modules employed in the data analysis process, such as data generation, data acqui-
sition, data storage and data analytics [8], to which we should like to add data visu-
alization. The implementation of big data deals with the cruciality of “available and


qualified data scientists who can make sense of big data with a proper understanding of
the domain and who are comfortable using analytical tools are not easy to find” [9].
Starting from this evidence, we survey companies that are searching for data scientists on the business-networking website LinkedIn, in order to better understand the demand for these rare professionals in Italy. Company profiles allow us to know which companies require these new professionals, which competences are desired and which tasks are undertaken by people with expertise in data science. Our paper is structured as follows: after this introduction, in the second section we gather definitions of data scientists given in the literature by businesses and academics. The third section briefly describes the methodology used to collect and analyse data, and in the fourth section the findings highlight how the features of the data scientist are not yet consolidated. Finally, the main conclusions are presented in the fifth section.

2 The Data Scientist in Literature

Many efforts have been made to study the aforementioned big data challenges. One of
these, it is defining and recruiting data scientists. The mission of a data scientist is to
transform data into insight, providing guidance for leaders to take action [10].
Davenport and Patil [3] seem to provide one of the most original definition of a data
scientist: a quality specialist with the curiosity in new discoveries. Actually, this def-
inition highlights the difference between the old scientist of data and the data scientist:
a part from quantitative competences, this professional must develop communication
skills, creativity and proactive behaviour. Besse and Laurent [11] affirm that the new
role of the data scientist should associate two types of approaches or logics: the
statistician logic, which infers or checks for errors or risks in specific procedures, and
the computer scientist logic, which designs the best strategy to minimize errors and
optimize complex models in order to reach research objectives. Generally, an integrated
skill set for data scientists is identified which includes mathematics, statistics, machine
learning and databases [12].
The literature about data scientists is presented as a set of attempts to establish a
common definition mainly based on the experiences of individuals. Actually, in our
survey, we find mainly descriptions made by company managers and by academic
researchers who analyse company contexts. For instance, IBM provides a compre-
hensive definition: “a data scientist represents an evolution from the business or data
analyst role. The formal training is similar, with a solid foundation typically in com-
puter science and applications, modeling, statistics, analytics and math. What sets the
data scientist apart is strong business acumen, coupled with the ability to communicate
findings to both business and IT leaders in a way that can influence how an organi-
zation approaches a business challenge. Good data scientists will not just address
business problems: they will pick the right problems that have the most value to the
organization” [13]. Hilary Mason, the founder and CEO of Fast Forward Labs, prefers
to define a data scientist as someone “who blends, math, algorithms, and an under-
standing of human behaviour with the ability to hack systems together to get answers to
interesting human questions from data” [14]. Differently, there are academic
researchers that analyse data scientist workflow. Some authors [15] asked more than

250 data scientists how they viewed their skills and careers. Then, they clustered the
survey respondents into four roles: Data Businesspeople, Data Creatives, Data
Developers, and Data Researchers. Others [16] interviewed 16 data analysts at
Microsoft and identified their typical activities as consisting of acquiring data, choosing
an architecture, shaping the data to the architecture, writing and editing code, reflecting
and iterating on the results. Sixteen Microsoft data scientists are also interviewed in [17].
They identify five distinct working styles of data scientists: Insight Providers, who
work with engineers to collect the data needed to inform decisions that managers make;
Modeling Specialists, who use their machine learning expertise to build predictive
models; Platform Builders, who create data platforms, balancing both engineering and
data analysis concerns; Polymaths, who do all data science activities themselves;
Team Leaders, who run teams of data scientists and spread best practices. Two years
later, they presented a large-scale survey with 793 data scientists, again at Microsoft, in
order to understand their educational background, the main work topics, the tools used
and activities accomplished [18]. Finally, [19] also interviewed 35 analysts in com-
mercial companies, ranging from healthcare to retail, finance and social networking.
They recognized that analysts must have an aptitude for discovery, wrangling, profil-
ing, modelling and reporting. Lastly, the role of visualization skills is emphasized as an
outcome of the whole data scientist workflow.
If we compare the profiles identified, we note how they are apparently similar, but
differ from each other. Companies are set to take diverse routes in the adoption of big
data technologies and the distinctive nature of each work performed within each sector
could result in specific competences and often in a reskilling of the workforce.
Since the data scientist role is still multifarious in literature, higher education
institutions can play a key role both in providing a scientific definition of the data
science and offering study programs able to train professionals in this field. In fact, one
of the factors that contribute to companies’ failure in utilizing data is the lack of well-
trained professionals. They should be able to manage and overcome the peculiar
challenges associated with big data. It is critical to have technical knowledge on how to
extract, prepare and format unstructured large volumes of data from multiple sources
[20]. The increased interest in filling this gap has involved some authors, who provide a
description of current data science programs in order to assist universities in designing
and developing undergraduate courses. For instance, it is described a four-year
undergraduate program in predictive analytics, machine learning, and data mining
implemented at the College of Charleston, Charleston, South Carolina, USA [21].
Moreover, findings are presented from a review of the descriptions of courses
offered in a small sample of undergraduate programs in data science and data analytics.
Then, an undergraduate course is offered in a liberal arts environment that provides
students the tools necessary to apply data science [22]. Most publications refer to undergraduate programs, while master's programs and specializations are ignored. Therefore, the role of universities is fundamental not only in producing talent who can perform the data scientist job, but also in providing acceptable definitions, which so far are taken from business experience and lack a scientific basis. This also suggests the need for collaboration between companies and universities, in order to obtain the best correspondence between study paths and integration into the labor market.

3 Methodology

By means of a traditional copy-and-paste scraping procedure, we collected about 400 job postings from the business-networking website LinkedIn, searching for the job "data scientist". We gathered data related to the companies and, where possible, information about the desired skills. As for geographical scope, we chose to limit the research to national and international companies located in Italy. The investigation was conducted over a period going from December 27, 2018 to January 15, 2019.
Data are reported using the following set of fields: job denomination, company size, number of employees, business sector, geographical localization, contract form, level of seniority and skills required. In order to display the geographical localization of the companies, we use an online resource that automatically geo-localizes a large number of addresses (www.batchgeo.com). Subsequently, we provide a descriptive statistical analysis of the collected data.
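To make the descriptive analysis concrete, the sketch below shows the kind of frequency computation applied to the collected postings; the record structure and field names are hypothetical, and the three example postings are invented for illustration only.

```python
from collections import Counter

# Hypothetical records scraped from the job postings (field names are ours)
postings = [
    {"sector": "Computer science and services", "region": "Lombardy",
     "skills": ["Python", "machine learning", "big data"]},
    {"sector": "Management consulting", "region": "Lombardy",
     "skills": ["Python", "statistics", "problem solving"]},
    {"sector": "Insurance", "region": "Piedmont",
     "skills": ["data analysis", "machine learning"]},
]

def frequency(field):
    """Share (%) of postings per value of a single-valued field."""
    counts = Counter(p[field] for p in postings)
    return {k: round(100 * v / len(postings), 1) for k, v in counts.items()}

def skill_frequency():
    """Share (%) of each skill over all skill mentions in all postings."""
    skills = [s for p in postings for s in p["skills"]]
    counts = Counter(skills)
    return {k: round(100 * v / len(skills), 1) for k, v in counts.most_common()}

print(frequency("sector"))
print(skill_frequency())
```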

4 Findings and Discussion

We start by detecting the companies' size, counting the number of employees (Table 1). The two most crowded clusters are those of companies with 11–50 employees and companies with more than 10,001 employees. Even if we do not know their turnover, we can imagine that they represent innovative companies or start-ups still under development, or large multinationals. Machine Learning, Artificial Intelligence and Data Analytics are advancing and, by now, they play an essential role in companies' futures. The largest and best companies in the world know this, and they are investing in these capabilities. We have found online trade giants such as Amazon, but also companies operating in Software (e.g. Microsoft), Telecommunications (e.g. Vodafone), TV and media (e.g. Sky), Personal products (e.g. L'Oréal), and Banking and Insurance (e.g. ING).

Table 1. The size of companies finding data scientists


Number of employees Companies (%)
0–1 0.51
2–10 4.33
11–50 17.30
51–200 16.03
201–500 5.60
501–1,000 5.60
1,001–5,000 12.21
5,001–10,000 9.41
10,001+ 16.79
Not specified 12.21

With regards to business sectors, we observe that they are multiple and very
different from each other. However, we note a prevalence of sectors relating to ICT.
Actually, if we consider the value of the presence percentage, we note that the sector of
Computer science and services (31.3%) is the most widespread. It is followed by
Selection and search of personnel (15.8%); Management consulting (7.1%); Human
resources (5.3%); Software (5.09%); Insurance (2.5%); Telecommunications (2.5%);
Internet (1.53%); Retail trade (1.3%); Public services (1.3%); Banking sector (1.02%);
Accessories and fashion (0.76%); Professional training (0.76%); Aviation and aero-
space (0.51%); Accounting (0.51%); Cosmetics and Personal Products (0.51%); Pub-
lishing (0.51%); Renewable energy (0.51%); Management in training (0.51%);
Mechanical or industrial engineering (0.51%); Marketing and advertising (0.51%);
Broadcast media (0.51%); Oil and energy (0.51%); Newspapers (0.51%); Financial
services (0.51%); Automotive sector (0.51%); Hospital structures and health (0.51%);
Hotelier (0.25%); Food and drink (0.25%); Government Administration (0.25%);
Consumer goods (0.25%); Luxury goods and Jewellery (0.25%); Biotechnology
(0.25%); Consumer electronics (0.25%); Large-scale retail trade (0.25%); Real estate
(0.25%): Pharmaceutical industry (0.25%); Logistics and supply chain (0.25%); Nan-
otechnologies (0.25%); Electrical and electronic production (0.25%); Research
(0.25%); Researches market (0.25%); Information services (0.25%); Legal services
(0.25%); Tobacco (0.25%); Leisure, travel and tourism (0.25%). Only in 11.7% of cases is the business sector not specified. It seems that the distribution of data scientist demand generates a long tail [23], in which a population with a high frequency (Computer science and services) is followed by a population with low frequency (or width), which gradually decreases (tails off) (Fig. 1).

Fig. 1. The long tail of data scientist demand by sectors



Using the Batchgeo online resource, we map the geographical localization of the companies. The platform returns a map with the companies' geo-localization, as shown in Fig. 2. As we can observe, there is a prevalence of companies in the North of Italy (75%), mainly located in the industrial zone formed by Lombardy (76%), Veneto (6%), Emilia-Romagna (5%) and Piedmont (9%). A smaller group of companies is located in the Centre of Italy (22%) and a few in the South of Italy (4%).

Fig. 2. The geographical localization of companies searching for a data scientist

The Milan area (56%) is the place in Italy with the highest number of companies searching for data scientists.
As far as contract form is concerned, we find that most companies offer a full-time contract (97.5%), while a small percentage offer an internship (1.8%), a temporary contract (0.5%) or a part-time contract (0.2%).
The level of seniority or experience required depends on where the data scientist will be positioned in the company, what the direct reporting line will be and, in general, how far the company has progressed on the path towards becoming a data-driven company. In more mature companies, the data science structure is positioned transversally to the business functions, working almost as a staff function in the development of different projects. In these cases, the analytics team reports directly to an executive. Companies are searching for an Executive in 0.2% of cases, a Director in 0.5%, medium-high level experience in 18.6%, medium level experience in 28%, minimum experience in 40%, no experience in 7.1%, and the level is not specified in 5.6%. The data scientist will mainly be hired to perform the working function of a Computer Engineer (153 job advertisements) or a Computer Specialist (123 job advertisements). The other working functions detected are specified in Table 2.

Table 2. The working function of a data scientist in companies


Working function %
Analyst 1.8
Business developer 4.6
Computer engineer 38.9
Computer specialist 31.3
Educator 1.3
Finance and accounting expert 1.5
General business 2
Management consultant 6.9
Manager 0.5
Manufacturing expert 0.3
Marketing and Advertising specialist 2
Project Manager 1
Researcher 1.8
Sales specialist 1
Strategy and planning expert 0.8
Other 2.8
Not specified 1.5

It seems that the figures of data science are not yet consolidated in their denominations. This also fuels the confusion about roles and skills. Our research finds
extensive evidence of accelerating demand for a variety of new specialist roles related
to understanding and leveraging the latest emerging technologies. We have identified
many different denominations around data science applications. Firstly, the Data sci-
entist, in its connotation of Senior, Junior, Lead and Intern; then the roles focused on
data analysis, such as Big data Analyst, Engineer, Developer, Architect, Expert and
Manager, Information Technology Data Analyst, Data Management Expert, Reporting
and Data Analyst, Data Mining Specialist; also, roles of consultancy and strategy, i.e.
Business Consultant and Business Intelligent specialist; others focused on the tech-
nology infrastructure such as Technology Consultant, Cloud Solutions Architect,
Database Developer and Engineer, Machine Learning Specialist; other on Artificial
Intelligence (AI), such as AI Engineer, Cognitive Consultant, Clinical solution
Architect, Robotic Automation Developer and Network Analyst; and finally, some
roles associated to specific business functions, such as Customer Intelligent Analyst,
Manufacturing Scientist, Marketing Data Analyst, Merchandising Data Analyst, Sales
Data Analyst.
In order to define the professional profiles wanted, we have collected skills required
and declared by companies and necessary for a candidate to be hired. We identified 239
different skills: above all, the most required skill is Python (5.8%). Then, we find
companies searching for people with expertise and knowledge in big data (4.6%),
Machine learning (4%), problem solving (3.4%), statistics (2.7%) and data analysis
(2.6%), as showed by the different bubble sizes in Fig. 3. According to WEF (2018),

the emerging skills required for future jobs will be creativity, originality and initiative;
analytical thinking and innovation; active learning and learning strategies; technology
design and programming; complex problem-solving; critical thinking and analysis;
leadership and social influence; emotional intelligence; systems analysis and evalua-
tion; reasoning, problem-solving and ideation. The data analyst and scientist is one of
the emerging jobs recognized by the WEF. If we compare these skills with those collected from LinkedIn, we observe that only a few skills match, such as problem solving (3.4%), leadership (1.2%) and team leadership (0.3%), strategic and critical thinking (0.2%), active learning (0.3%) and programming (0.2%). Creativity, originality, self-evaluation, the emotional sphere and social aspects are still ignored by companies.

Fig. 3. Skills to be hired as data scientist by companies in Italy

5 Conclusions

Starting from the business-networking website LinkedIn, we have explored features


and structures of companies that are searching for a data scientist in Italy. The main
issues discussed in the literature highlight a gap between the ideal and the real data scientist employment. Business reality poses important challenges, and that of the data scientist is a complex profession, one that not only presumes a multifarious set of knowledge and expertise, but also changes with respect to the strategic objectives of companies.
Analysing company profiles allowed us to recognise companies' needs, in terms of the competences desired and the tasks undertaken by people with expertise in data science. Despite constantly growing interest, a lack of clarity remains about the roles and skills companies want to hire. Synthesizing the main results, we can affirm that data scientists attract both multinationals, which invest large funds to exploit data from the Web, and small enterprises or start-ups, which are highly innovative and look for a skilled workforce.
The phenomenon is appearing in several sectors, generating therefore a long tail in
which the head is represented by Computer science and Software services. However,
the tail is constituted by numerous sectors that contribute significantly to the diffusion
of this profession. The level of seniority required is medium-high and the contract mostly takes a full-time form. In Italy, data scientists are in demand in the North, especially in the Milan area, where the industrial vitality of our country is concentrated together with the most prestigious Italian universities. This point suggests a reflection: it confirms the principle whereby proximity to prestigious universities influences the ability of companies to attract the best talent [24]. Moreover, closer collaboration with universities could contribute to providing guidelines and clarifications about the profession of data scientist, sometimes lost in the multitude of denominations associated with those who work with data.
Finally, the most required skills demonstrate a vision of the data scientist profession still linked to quantitative analysis and analytical thinking. The emergence of a dynamic and uncertain business context, especially when one works with data from the Web, also requires those soft skills (creativity, communication, leadership) that are lacking or not detected as important in job advertisements.
As for limitations, our research takes into consideration only job advertisements at a specific moment, which may have changed by the time this paper is published. At the same time, we limited the research to the keyword "data scientist": given the uncertainty about denominations, there are probably other wanted profiles that could be classified in the data science category.
Future research could explore other countries, in order to compare different contexts, or the role of national universities in professional development. Last but not least, another possible advancement could be to reconsider this research by opening the field to other denominations, to provide a better understanding of the specific skills required for each role.

References
1. Schewe, K.D., Thalheim, B.: Semantics in data and knowledge bases. In: International
Workshop on Semantics in Data and Knowledge Bases, pp. 1–25. Springer, Berlin (2008)
2. Chen, J., Chen, Y., Du, X., Li, C., Lu, J., Zhao, S., Zhou, X.: Big data challenge: a data
management perspective. Front. Comput. Sci. 7(2), 157–164 (2013)
3. Davenport, T.H., Patil, D.J.: Data scientist. Harvard Bus. Rev. 90(5), 70–76, 72 (2012)
4. Yin, S., Kaynak, O.: Big data for modern industry: challenges and trends (point of view).
Proc. IEEE 103(2), 143–146 (2015)
5. Davenport, T.H., Dyché, J.: Big data in big companies. International Institute for Analytics,
p. 3 (2013). https://docs.media.bitpipe.com/io_10x/io_102267/item_725049/Big-Data-in-
Big-Companies.pdf. Accessed 20 Mar 2019

6. Gartner: Data and Analytics Leadership Vision for 2017. https://www.gartner.com/binaries/


content/assets/events/keywords/business-intelligence/bie18i/gartner_data-analytics_research-
note_da-leadership-vision_2016.pdf. Accessed 20 Oct 2018
7. Oussous, A., Benjelloun, F.Z., Lahcen, A.A., Belfkih, S.: Big data technologies: a survey.
J. King Saud Univ.-Comput. Inf. Sci. 30(4), 431–448 (2018)
8. Gantz, J., Reinsel, D.: The digital universe in 2020: big data, bigger digital shadows, and
biggest growth in the far east. IDC iView: IDC Anal. Future 2012, 1–16 (2007)
9. Storey, V.C., Song, I.Y.: Big data technologies and management: what conceptual modeling
can do. Data Knowl. Eng. 108, 50–67, 52 (2017)
10. Kim, M., Zimmermann, T., DeLine, R., Begel, A.: The emerging role of data scientists on
software development teams. In: Proceedings of the 38th International Conference on
Software Engineering, pp. 96–107. ACM (2016)
11. Besse, P., Laurent, B.: De statisticien à Data Scientist-Développements pédagogiques à
l’INSA de Toulouse. Statistique et Enseignement 7(1), 75–93 (2016)
12. Dhar, V.: Data science and prediction. Commun. ACM 56(12), 64–73 (2013)
13. IBM: Data Scientists. https://www.ibm.com/analytics/us/en/technology/clouddataservices/
data-Scientist. Accessed 16 Oct 2018
14. Thompson, N.: When Tech Knows You Better Than You Know Yourself. https://www.
wired.com/story/artificial-intelligence-yuval-noah-harari-tristan-harris/. Accessed 16 Feb
2019
15. Harris, H.D., Murphy, S.P., Vaisman, M.: Analyzing the Analyzers: An Introspective Survey
of Data Scientists and Their Work. O’Reilly, Sebastopol (2013)
16. Fisher, D., DeLine, R., Czerwinski, M., Drucker, S.: Interactions with big data analytics.
Interactions 19(3), 50–59 (2012)
17. Kim, M., Zimmermann, T., DeLine, R., Begel, A.: The emerging role of data scientists on
software development teams. In: Proceedings of the 38th International Conference on
Software Engineering, ICSE 2016, pp. 96–107 (2016)
18. Kim, M., Zimmermann, T., DeLine, R., Begel, A.: Data scientists in software teams: state of
the art and challenges. IEEE Trans. Softw. Eng. 44(11), 1024–1038 (2018)
19. Kandel, S., Paepcke, A., Hellerstein, J.M., Heer, J.: Enterprise data analysis and
visualization: an interview study. IEEE Trans. Visual Comput. Graphics 12, 2917–2926
(2012)
20. Asamoah, D.A., Sharda, R., Hassan Zadeh, A., Kalgotra, P.: Preparing a data scientist: a
pedagogic experience in designing a big data analytics course. Decis. Sci. J. Innov. Educ. 15
(2), 161–190 (2017)
21. Anderson, P., Bowring, J., McCauley, R., Pothering, G., Starr, C.: An undergraduate degree
in data science: curriculum and a decade of implementation experience. In: Proceedings of
the 45th ACM Technical Symposium on Computer Science Education, pp. 145–150 (2014)
22. Aasheim, C.L., Williams, S., Rutner, P., Gardiner, A.: Data analytics vs. data science: a
study of similarities and differences in undergraduate programs based on course descriptions.
J. Inf. Syst. Educ. 26(2), 103–115 (2015)
23. Anderson, C.: The Long Tail: How Endless Choice is Creating Unlimited Demand. Random
House, New York City (2007)
24. Galloway, S.: The four: the hidden DNA of Amazon, Apple, Facebook and Google. Random
House, New York City (2017)
An Architecture for System Recovery Based
on Solution Records on Different Servers

Takayuki Kasai and Kosuke Takano(&)

Kanagawa Institute of Technology, 1030 Shimo-ogino, Atsugi, Kanagawa, Japan


s1785009@cco.kanagawa-it.ac.jp,
takano@ic.kanagawa-it.ac.jp

Abstract. It is very important to quickly solve system failures in system operation. Some studies have proposed fault-tolerant systems, such as flexible system architectures for dealing with system failures and automatic failure detection systems. However, in many cases a human identifies the system failure, and a support system that reduces the cost of trial and error for solving system failures is required. In this study, we propose an architecture for system recovery based on solution records on different servers. In an experiment using a prototype, we confirm the feasibility of the proposed system.

1 Introduction

We can find many cloud services such as Amazon Web Services [1], Google Cloud
Platform [2], and Microsoft Azure [3] for building the platform of software systems or
information services. These cloud services allow a system administrator to quickly and
flexibly configure a large-scale and complicated system architecture. Meanwhile, such large-scale and complicated systems have increased the cost for system administrators of operating the system without trouble, and some automatic functions to reduce the operating cost have been proposed. For example, monitoring services such as Datadog [4] and New Relic [5] have been introduced to automatically detect server failures by checking server metrics such as the rate of CPU usage and the amount of memory usage. However, in many cases a human still identifies and solves the system failure, and even an experienced administrator needs many trials and errors until the cause of a server failure is detected and the system is recovered.
In this study, we propose an architecture for system recovery based on solution records on different servers. When a server failure occurs, our system first queries a knowledge base of server failures and obtains a set of commands to recover the system. Then, the proposed method evaluates the performance of the commands by executing them in a virtual evaluation environment. The evaluation result is fed back to the knowledge base as a solution record and updates the priority order of the command. Through this cycle, the knowledge base can continuously update the information on which solution commands are effective for solving server failures, and return them with priority for a query. In an experiment using a prototype, we confirm the feasibility of the proposed system.


2 Related Work

There have been many studies and developments on detecting system errors and abnormal server behavior.
Holub and Wang propose an analysis engine, called RTCE (Run-Time Correlation Engine), that collects distributed system logs and analyzes the correlation between the system logs and errors, and they evaluate it in a cloud environment [4, 5]. Xu et al. propose a method for analyzing sentences in console logs and extracting the log information that will be output when a server failure occurs [6, 7]. Mirgorodskiy et al. propose a method to identify a problematic process by comparing logs of multiple identical processes performing similar activities [8]. In addition, Diao et al. propose a method to identify the cause of a server failure based on rules that associate sentences in the trouble tickets written by system administrators with the cause of a server failure [9].

3 Proposed System

Figure 1 shows an overview of our proposed system. The proposed system includes
two types of knowledge bases of server failures, a primary knowledge base and sec-
ondary knowledge bases.

Fig. 1. Overview of proposed system

(1) Primary server-failure knowledge base (primary KB): This knowledge base collects system logs of server failures and solution commands from the secondary knowledge bases located in n servers.

(2) Secondary server-failure knowledge base (secondary KB): This knowledge


base records results of solution commands for a server failure that occurred in the
server where the knowledge base is located.
When a server failure occurs in a server x, the secondary KB queries the primary KB using the error messages and executes the solution commands returned from the primary KB. The solution results are fed back to the primary KB and used to place a priority order on the commands. When the solution commands are executed in server x, a virtual evaluation environment is dynamically constructed. Since the solution commands are executed in this virtual environment, server x can evaluate the performance of the solution commands without affecting its actual environment.

3.1 Construction of Knowledge Base


To construct a primary KB, error logs and their solution commands are collected both manually and automatically. When a server failure occurs, the solution command is searched for in the primary KB using a log message as a query. Tables 1 and 2 show the schemas of the primary KB and the secondary KB, respectively.

Table 1. Schema of primary server-failure knowledge base


Field Description
eid Data ID
type Type of server failure
success_score Success score of solution command
failure_score Failure score of solution command
owner Server name that a server failure occurred
log_messages Log messages in a server failure
solution_cmd Solution command

Table 2. Schema of secondary server-failure knowledge base


Field Description
eid Data ID
flag_auto If true, command can be automatically executed
flag_human If true, command can be manually executed

When a solution command succeeds in solving the server failure, its success score is
increased. If a solution command fails, its failure score is increased. In addition, there are
two types of solution command: one can be executed automatically, and the other should
be executed by a human. The secondary KB records these properties as Boolean values.
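As a minimal illustration of how the records behind Tables 1 and 2 could be represented, the Python sketch below mirrors the two schemas as data classes. The in-memory representation, class names, and the concrete example values are our own assumptions for illustration, not part of the proposed system.

from dataclasses import dataclass

@dataclass
class PrimaryKBRecord:
    """One entry of the primary server-failure knowledge base (Table 1)."""
    eid: int               # data ID
    type: str              # type of server failure
    success_score: int     # incremented when the solution command succeeds
    failure_score: int     # incremented when the solution command fails
    owner: str             # server on which the failure occurred
    log_messages: str      # log messages observed during the failure
    solution_cmd: str      # solution command

@dataclass
class SecondaryKBRecord:
    """One entry of a secondary knowledge base (Table 2)."""
    eid: int               # data ID of the corresponding primary record
    flag_auto: bool        # True if the command may be executed automatically
    flag_human: bool       # True if the command should be executed by a human

# Illustrative record (values are hypothetical).
record = PrimaryKBRecord(
    eid=1, type="package", success_score=0, failure_score=0,
    owner="ds1",
    log_messages="E: The method driver /usr/lib/apt/methods/https could not be found",
    solution_cmd="sudo apt install -y apt-transport-https",
)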

3.2 Virtual Evaluation Environment


Figure 2 shows an overview of the virtual evaluation environment. The virtual
evaluation environment is constructed with a union file system, that is, a file system in
which a readable/writable upper directory tree and a read-only lower directory tree are
merged into one directory tree. Using the union file system, the state of the disk at the
time of a server failure is captured as a failure environment in a read-only layer, and a
solution command is executed in a readable/writable layer that is created as the virtual
evaluation environment on top of the failure environment.

Fig. 2. Overview of virtual evaluation environment

The virtual evaluation environment copies only the directory tree of the actual
environment; it cannot copy running processes or server failures that occur in a
communication network. Therefore, the virtual evaluation environment is limited to
server failures caused by changes to the directory tree. For example, it can reproduce
server failures such as a dependency error in a software package or unauthorized access
to a file, but it cannot deal with server failures such as a memory leak or a
communication network error.
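This layering can be realized, for example, with OverlayFS, which is also used in the experiment of Sect. 4.1. The sketch below shows one possible way to stack a writable evaluation layer over a read-only snapshot of the failed disk; the directory paths are placeholders, the helper names are ours, and root privileges are required to mount.

import subprocess

def mount_evaluation_env(lower, upper, work, merged):
    """Overlay a writable layer (upper) on top of the read-only failure
    snapshot (lower); solution commands are then executed inside 'merged'
    so the real file system is never modified."""
    subprocess.run(
        ["mount", "-t", "overlay", "overlay",
         "-o", f"lowerdir={lower},upperdir={upper},workdir={work}",
         merged],
        check=True)

def unmount_evaluation_env(merged):
    subprocess.run(["umount", merged], check=True)

# Hypothetical paths: /failure holds the captured state of the failed server.
mount_evaluation_env("/failure", "/eval/upper", "/eval/work", "/eval/merged")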

3.3 Ranking of Commands in Knowledge Base


To retrieve effective solution commands, a server x experiencing a failure queries the
primary KB using an error message errMsg as a search query. The similarity of error
messages is calculated by applying the vector space model:
 
score_{sim}(em_i, em_j) = \frac{em_i \cdot em_j}{|em_i| \, |em_j|}    (1)

In addition, the performance score of a solution command is calculated from its success
score and failure score as follows:

Score_{perf}(cmd_x) = Score_{success}(cmd_x) - Score_{fail}(cmd_x)    (2)

The performance score is used as a weight for ranking solution commands. The solution
commands that are associated with error messages with a high similarity score and that
have a high performance score are extracted as the candidate set of commands.
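A minimal sketch of this ranking step is given below, assuming bag-of-words term-frequency vectors for the error messages and records with the fields of Table 1. The paper does not specify how the two scores are combined; sorting first on the performance score and then on the similarity score is an assumption that reproduces the behaviour seen later in Tables 7 and 8.

import math
from collections import Counter

def cosine_similarity(msg_i, msg_j):
    """Eq. (1): cosine similarity of two error messages in the vector space model."""
    vi, vj = Counter(msg_i.lower().split()), Counter(msg_j.lower().split())
    dot = sum(vi[t] * vj[t] for t in vi)
    norm = (math.sqrt(sum(c * c for c in vi.values()))
            * math.sqrt(sum(c * c for c in vj.values())))
    return dot / norm if norm else 0.0

def performance_score(record):
    """Eq. (2): success score minus failure score of a solution command."""
    return record.success_score - record.failure_score

def rank_commands(err_msg, knowledge_base):
    """Order candidate records for the querying server (assumed combination)."""
    return sorted(
        knowledge_base,
        key=lambda r: (performance_score(r),
                       cosine_similarity(err_msg, r.log_messages)),
        reverse=True)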

3.4 Feedback of Solution Result


Figure 3 shows the feedback cycle of the performance evaluation of solution commands.
The solution result of each command in the virtual evaluation environment is recorded
in the secondary KB. In addition, the result is fed back to the primary KB.

Fig. 3. Feedback cycle of performance evaluation of solution command

If a solution command succeeds in solving a server failure, one point is added to its
success score. If the solution command fails, one point is added to its failure score. By this
cycle, the knowledge base can continuously update information on the solution results
of commands that are effective in solving server failures, and return them with priority
for the query.
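A sketch of this feedback step, assuming the record structure used in the earlier sketches: after a command has been tried in the virtual evaluation environment, one point is added to either the success or the failure score of the corresponding primary-KB entry.

def feed_back(record, solved):
    """Update the primary-KB scores from one evaluation run (Fig. 3).
    'solved' is True when the command removed the failure in the
    virtual evaluation environment, False otherwise."""
    if solved:
        record.success_score += 1
    else:
        record.failure_score += 1
    return record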

4 Experiment

In the experiment, we confirm the feasibility of our proposed system.



4.1 Experimental Environment


We used Linux (Ubuntu 16.04) as the server OS and built one server ps for the primary
KB and three distributed servers ds1, ds2, and ds3 for the secondary KBs on virtual
machines using VirtualBox. For constructing the virtual evaluation environment, we
used OverlayFS, a union file system for Linux.

Table 3. Server failure data

eid 1
  Server failure: Package cannot be downloaded from repository using https
  Cause: Package manager is not compatible with https
  Solution command: sudo apt install -y apt-transport-https

eid 2
  Server failure: Elasticsearch is not launched
  Cause: JDK is not installed
  Solution command: sudo apt install -y openjdk-8-jdk

eid 3
  Server failure: Package cannot be downloaded via proxy server
  Cause: Environment variable 'http_proxy' is not set
  Solution commands:
  (1) echo '#!/bin/bash' > /etc/profile.d/proxy.bash
  (2) echo export 'http_proxy=http://ccproxy.kanagawa-it.ac.jp:10080' >> /etc/profile.d/proxy.bash
  (3) echo export 'https_proxy=$http_proxy' >> /etc/profile.d/proxy.bash
  (4) echo Defaults env_keep+=\"http_proxy https_proxy\" >> /etc/sudoers

eid 4
  Server failure: MySQL is not launched
  Cause: Operation is not authorized in database directory
  Solution command: sudo chmod 755 /var/lib/mysql

eid 5
  Server failure: MySQL is not launched
  Cause: Operation is not authorized in database directory
  Solution command: sudo chown mysql.mysql /var/lib/mysql

eid 6
  Server failure: Package cannot be downloaded via proxy server
  Cause: Proper proxy configuration for apt is not set
  Solution commands:
  (1) echo 'Acquire::http::proxy \"http://ccproxya.kanagawa-it.ac.jp:10080/\";' > /etc/apt/apt.conf.d/10proxy
  (2) echo 'Acquire::https::proxy \"http://ccproxya.kanagawa-it.ac.jp:10080/\";' >> /etc/apt/apt.conf.d/10proxy

eid 7
  Server failure: MySQL is not launched
  Cause: Operation is not authorized in database directory
  Solution commands:
  (1) sudo chown mysql.mysql /var/lib/mysql
  (2) sudo chmod 755 /var/lib/mysql

In the experiment, we prepared 20 failure records in the primary KB (Table 3).
Tables 4, 5, and 6 show the commands used to cause the pseudo server failures, the
solution commands to solve them, and the error messages observed in the pseudo server
failures, respectively.

Table 4. Command for causing pseudo server failure


eid Command
1 sudo apt remove -y apt-transport-https
2 sudo apt remove -y openjdk-8-jdk
3 (1) unset http_proxy
(2) unset https_proxy
4 sudo chmod 000 -R /var/lib/mysql
5 sudo chown root.root /var/lib/mysql
6 find /etc/apt/apt.conf.d/ -type f | xargs -I{} sed -i -r '/Acquire::https?::proxy.+/d' {}
7 (1) sudo chown root.root /var/lib/mysql
  (2) sudo chmod 000 /var/lib/mysql

Table 5. Solution command for pseudo server failure


eid Solution command
1 sudo apt update
2 sudo systemctl start elasticsearch.service
3 sudo apt update
4 sudo systemctl restart mysql.service
5 sudo systemctl restart mysql.service
6 sudo apt update
7 sudo systemctl restart mysql.service

Table 6. Error log message in pseudo server failure


eid Error log message
1 E: The method driver /usr/lib/apt/methods/https could not be found
N: Is the package apt-transport-https installed?
E: Failed to fetch https://artifacts.elastic.co/packages/6.x/apt/dists/stable/InRelease

2 Nov 28 05:56:17 ubuntu-xenial elasticsearch[11810]: could not find java; set
JAVA_HOME or ensure java is in PATH
3 W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial/InRelease Could not
resolve ‘archive.ubuntu.com’
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial-updates/InRelease
Could not resolve ‘archive.ubuntu.com’
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial-backports/InRelease
Could not resolve ‘archive.ubuntu.com’

4 2019-01-08T16:23:12.907580Z 0 [ERROR] InnoDB: The innodb_system data file
‘ibdata1’ must be writable
2019-01-08T16:23:12.907658Z 0 [ERROR] InnoDB: The innodb_system data file
‘ibdata1’ must be writable
2019-01-08T16:23:12.907675Z 0 [ERROR] InnoDB: Plugin initialization aborted with
error Generic error

5 2019-01-08T16:23:12.907580Z 0 [ERROR] InnoDB: The innodb_system data file
‘ibdata1’ must be writable
2019-01-08T16:23:12.907658Z 0 [ERROR] InnoDB: The innodb_system data file
‘ibdata1’ must be writable
2019-01-08T16:23:12.907675Z 0 [ERROR] InnoDB: Plugin initialization aborted with
error Generic error

6 W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial/InRelease Could not
resolve ‘archive.ubuntu.com’
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial-updates/InRelease
Could not resolve ‘archive.ubuntu.com’
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial-backports/InRelease
Could not resolve ‘archive.ubuntu.com’

7 2019-01-08T16:23:12.907580Z 0 [ERROR] InnoDB: The innodb_system data file
‘ibdata1’ must be writable
2019-01-08T16:23:12.907658Z 0 [ERROR] InnoDB: The innodb_system data file
‘ibdata1’ must be writable
2019-01-08T16:23:12.907675Z 0 [ERROR] InnoDB: Plugin initialization aborted with
error Generic error

4.2 Experimental Method


First, we cause pseudo server failures by executing the commands in Table 4. Then, a
virtual evaluation environment is dynamically constructed to reproduce the state of each
pseudo server failure.
In each virtual evaluation environment, the corresponding server queries the primary KB
using an error message as a query and executes the solution command returned from the
primary KB. After executing the solution command, the solution result is fed back to the
primary KB.
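For illustration only, this experimental cycle could be driven by a short script that reuses the helpers sketched in Sect. 3 (rank_commands, feed_back). Confining the solution command to the merged overlay directory with chroot and judging success by the exit code are our assumptions; the paper does not state how the command is isolated or how success is detected.

import subprocess

def evaluate_solution(err_msg, knowledge_base, merged_dir):
    """Query the primary KB, try the ranked commands inside the virtual
    evaluation environment, and feed each result back (assumed workflow)."""
    for record in rank_commands(err_msg, knowledge_base):
        # chroot into the merged overlay so the real file system is untouched
        result = subprocess.run(
            ["chroot", merged_dir, "/bin/bash", "-c", record.solution_cmd])
        solved = (result.returncode == 0)   # simplistic success check
        feed_back(record, solved)
        if solved:
            return record
    return None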

4.3 Result
The ranking results of server-failure information from the primary KB before and after
the feedback are shown in Tables 7 and 8, respectively.

In Tables 7 and 8, we can see that the success and failure records are updated. This is
because the solution results on each distributed server are fed back to the primary KB.
In addition, after the feedback, the ranks of the error information of eid = 14 and eid = 6
are changed.
The rank of the error information of eid = 6 gets higher because the solution command
associated with eid = 6 succeeded in solving the pseudo server error, so its success score
is increased. On the contrary, the rank of the error information of eid = 14 is lowered,
since the solution command associated with eid = 14 failed to solve the pseudo server
error.

Table 7. Ranking result before feedback


Rank eid Success Failure Search score
1 14 0 0 11.22
2 4 0 0 7.93
3 6 0 0 7.93
4 20 0 0 7.93
5 3 0 0 5.86
6 11 0 0 5.86
7 17 0 0 0.36
8 0 0 0 0.34
9 9 0 0 0.34
10 19 0 0 0.34

Table 8. Ranking result after feedback


Rank eid Success Failure Search score
1 6 1 0 7.93
2 14 0 1 11.22
3 4 0 1 7.93
4 20 0 1 7.93
5 3 0 1 5.86
6 11 0 1 5.86
7 17 0 1 0.36
8 0 0 1 0.34
9 9 0 1 0.34
10 19 0 1 0.34

5 Conclusion

In this study, we have proposed an architecture for system recovery based on solution
records from different servers. Our system dynamically builds a virtual evaluation
environment that reproduces the state of a server failure, and executes solution
commands in that virtual evaluation environment to solve the problem or detect the
cause of the server failure. Moreover, our system continuously updates information on
the solution results of commands in a server-failure knowledge base, which are effective
in solving server failures, based on the performance of the solution commands in the
virtual evaluation environment, and returns them with priority for queries from
distributed servers. In the experiment using our prototype, we confirmed the feasibility
of the proposed system.
In future work, we will enhance our system to deal with server errors caused by
network and process errors, and evaluate the performance of our system in a real server
environment.

References
1. Amazon Web Services. https://aws.amazon.com/. Accessed Jan 2019
2. Google Cloud Platform. https://cloud.google.com/. Accessed Jan 2019
3. Datadog. https://www.datadoghq.com/. Accessed Jan 2019
4. Holub, V., et al.: Run-time correlation engine for system monitoring and testing. In:
Proceedings of the 6th International Conference Industry Session on Autonomic Computing
and Communications Industry Session, pp. 9–18 (2009)
5. Wang, M., et al.: Scalable run-time correlation engine for monitoring in a cloud computing
environment. In: IEEE International Conference on the Engineering of Computer-Based
Systems (ECBS), pp. 29–38 (2010)
6. Xu, W., et al.: Online system problem detection by mining patterns of console logs. In:
Proceedings of Ninth IEEE International Conference on Data Mining, ICDM 2009, pp. 588–
597 (2009)
7. Xu, W., et al.: Detecting large-scale system problems by mining console logs. In: Proceedings
of the 27th International Conference on Machine Learning (ICML-10), pp. 37–46 (2010)
8. Mirgorodskiy, A.V., et al.: Problem diagnosis in large-scale computing environments. In:
Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, p. 88 (2006)
9. Diao, Y., et al.: Rule-based problem classification in IT service management. In: 2009 IEEE
International Conference on Cloud Computing, pp. 221–228 (2009)
A Novel Approach for Selecting Hybrid
Features from Online News Textual Metadata
for Fake News Detection

Mohamed K. Elhadad, Kin Fun Li, and Fayez Gebali

Department of Electrical and Computer Engineering, University of Victoria,


Victoria, BC, Canada
{melhaddad,kinli,fayez}@uvic.ca

Abstract. Nowadays, online news platforms have become the main sources of
news for many users. Hence, an urgent need arises to find a way to classify this
news automatically and measure its validity to avoid the spread of fake news. In
this paper, we try to simulate how humans deal with news documents in real life.
We introduce a new way of dealing with the whole textual content of news
documents by extracting a number of characteristics of those texts and a complex
set of other metadata-related features, without segmenting the news documents
into parts (title, content, date, source, etc.). The performances of nine machine
learning algorithms in terms of accuracy, precision, recall and F1-score are
compared on three different datasets, obtaining much better results than those
reported in [1] and [2].

1 Introduction

Undoubtedly, the rapid development of information systems and the widespread use of
electronic means and social networks have played active roles in accelerating the pace
of events in most parts of the world. As most countries face economic problems,
security problems, wars, terrorism and violent conflicts, the majority of internet users
are relying entirely on information from various electronic platforms. These electronic
platforms, whether on the international news sites or social network sites, are char-
acterized as widespread and free. They also enable the user to access and exchange
information easily, quickly and conveniently. Therefore, the circulation of news on
these platforms has become a major source for users to base their opinions on various
issues.
With no standard or measure to determine the accuracy, validity and credibility of
the consumed news, the result may be the proliferation of fake news. Such fake news
may be systematic and follow agendas that are prepared in advance to serve specific
goals, objectives and interests of a particular faction, or the interests of certain countries
or institutions. As a result, it may influence the public opinion of the users of these
platforms.
Thus, there is an urgent need to find techniques that can assist in the processes of
discovery and classification of news documents to determine whether they are real or


fake. These techniques depend on the type of news document content we are dealing
with. News documents may contain textual data, multimedia or a mixture of them [3].
The use of machine learning techniques is one of the most common ways to perform
this classification of textual news documents. But first, a series of operations should
be carried out on the available news documents so that they are prepared, processed and
placed in the appropriate format for the different classification algorithms. These processes
are not straightforward; they require careful design and implementation because of the
nature of written texts and the valuable properties that can be used in the
classification process.
In this paper, we introduce a new way of dealing with the whole textual content of the
news, by extracting a number of characteristics of those texts and a complex set of other
metadata-related characteristics, as introduced in [3], without segmenting the text into
parts (title, content, date, source, etc.). The purpose is to simulate how humans deal with
the news in real life. We tested our proposed model using nine different classifiers on
three different datasets. Section 2 reviews the related work. Section 3 presents the
proposed model for extracting hybrid features from online news textual metadata and
details each stage. The experimental setup and discussion are given in Sect. 4. Section 5
summarizes our work, concludes our findings, and provides directions for future work.

2 Related Work

Most fake news detection systems deploy machine learning techniques to assist users in
filtering the news they are viewing and in classifying whether a particular news article is
deceptive or not. The classification and analysis are done by comparing a given news
article with pre-known news corpora that contain both misleading and truthful news
articles [4].
Ruchansky et al. [5] introduced a model which incorporates features related to the
news text, the responses that the news document receives, and the users who source it.
Potthast et al. [6] proposed a technique in which they extracted a set of features to
capture writing style. These features are used to assess the style similarity between
different categories of textual news documents.
Khurana [7] explored the linguistic features that can be extracted from news statements
and headlines using NLP techniques, and found that the use of n-grams as features,
especially unigrams, plays a vital role in discriminating fake from real textual news
documents. Ahmed et al. [2] used n-gram analysis and machine learning techniques for
detecting fake news in textual documents. They used both TF and TFIDF as feature
extraction techniques and selected only the top 1000 to 50000 features. The best
accuracy obtained was 92% on the ISOT fake news dataset using LSVM with unigrams
and the top 50000 features. In [8], they extended their experiments to another dataset,
related to the detection of fake reviews, used in [9]. The best accuracy obtained was 90%
using KNN with bigrams and the top 10000 features.

Al Asaad et al. [10] introduced a technique for verifying the credibility of news
articles depending on their characteristics. They combined several classification methods
with three text representation models (BoW, n-gram, and TFIDF). In their experiments
they used two different datasets and tested their model using the Multinomial Naive
Bayes and LSVM algorithms. They obtained the best accuracy of 95.7% when using the
Multinomial Naive Bayes classifier with BoW as the text representation technique on
the fake or real news dataset.
Khan et al. [1] presented a comparative analysis of the performance of existing
methods by implementing each one on three different datasets. They observed that the
performance of the models is highly dependent on the dataset used and that it is hard to
obtain a single superior model for all datasets. Moreover, from their experimental results,
they found that proper feature selection enhances the obtained accuracy. They also
claimed that, for small datasets with fewer than 100 k documents, Naïve Bayes (with
n-grams) can achieve results similar to those obtained with neural network-based models.
Bali et al. [11] proposed an approach for the automatic detection of fake news. They
extracted a set of features from both news headlines and news contents, such as n-gram
counts, sentiment polarity scores, and other linguistic features. They used three datasets
to evaluate their model with seven different classification algorithms, and obtained the
best accuracy when using the Gradient Boosting (XGB) classification algorithm on the
three news datasets.
In summary, there is much work in the field of fake news detection. To the best of
our knowledge of the existing literature, this work is the first to deal with news documents
without segmenting them. Additionally, we use a complex set of features extracted
from textual news metadata, besides the extracted linguistic features, to enrich the
generated feature vector with valuable features for building the detection model.

3 A Model for Extracting Hybrid Features from Online News Textual Metadata

In order for textual news documents to be classified, they first have to be prepared to be
suitable for processing by passing through the document preparation stage described
in Subsect. 3.1. Then, each news document must be transformed and represented by a
set of words that expresses its content. The classification process goes through two
different phases, training and testing, each of which consists of two stages, as depicted
in Fig. 1.

Fig. 1. Block diagram of fake news detection model

In the training phase, the two stages are feature engineering and learning, while in
the testing phase, the stages are feature engineering and detection. Feature engineering
is described in Subsect. 3.2, while the learning and detection stages are presented in
Subsect. 3.3. Though our proposed model could be extended to any language, in our
current work we deal only with news documents written in English.

3.1 Document Preparation Stage


Usually, news data is segmented into news title, content, publisher, etc. We transformed
each news document into a non-segmented format by grouping all of its segments into a
single segment containing the union of the original segments. For example, given the
news document shown below:

Title: Explosion rocks down town Damascus


Content: Sunday 01 Feb 2015, by nna, Explosion rocks down town Damascus. An
explosion inside a bus killed 6 people and injured another 10 in down town
Damascus according to preliminary reports on Sunday
Source: nna
Date: 2/1/2015
Location: Damascus
Label: Fake

We perform the following actions on the news document while transforming:


(1) Remove redundant data from content that is exactly the same in title or in any
other field, for example, the article title is repeated as it is in the news content.
(2) Remove redundant date information from both news title and content, for
example, Sunday is repeated twice.
(3) Put the date in a standard format, for example, the date will be (01 Feb 2015) instead
of (2/1/2015).
(4) Convert all numbers in the news data from numeric values into textual written
numbers, for example, the numeric value (6) will be written as (Six).
Then the news document after preparation becomes:

Text: Explosion rocks down town Damascus, an explosion inside a bus killed six people
and injured another ten in down town Damascus according to preliminary reports
on Sunday, nna, 01 Feb 2015, Damascus
Polarity: 0
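Preparation steps (1)-(4) can be approximated with a short script such as the sketch below. It is an illustration only: the field names follow the example above, the third-party num2words package is used for step (4), and the simplifications (e.g., removing only the raw date string in step (2)) are our assumptions rather than the authors' implementation.

import re
from datetime import datetime
from num2words import num2words   # third-party package, used here for illustration

def prepare(doc):
    """Merge the segments of a news document into a single text field,
    roughly following preparation steps (1)-(4)."""
    content = doc["content"]
    # (1) drop the title if it is repeated verbatim inside the content
    content = content.replace(doc["title"], "").strip(" ,.")
    # (3) put the date into a standard textual format, e.g. '01 Feb 2015'
    date = datetime.strptime(doc["date"], "%m/%d/%Y").strftime("%d %b %Y")
    # (2) remove the redundant raw date string from the content (a simplification)
    content = content.replace(doc["date"], "").strip(" ,.")
    # (4) convert numeric values into written numbers ('6' -> 'six')
    content = re.sub(r"\b\d+\b", lambda m: num2words(int(m.group())), content)
    # group all remaining segments into one non-segmented text
    text = ", ".join([doc["title"], content, doc["source"], date, doc["location"]])
    return {"text": text, "label": doc["label"]}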

3.2 The Feature Engineering Stage


As we are dealing with textual news documents, the first issue that needs to be
addressed is how to represent the news text and extract the corresponding features. Not
only does a proper representation ease data manipulation and save the time and
memory needed for processing such data, it also maintains the necessary information
without any loss. The feature engineering stage is detailed in Algorithm 1.
Table 1 contains the definitions of the basic elements of the algorithm.
Pre-processing and Feature Selection Phase. This phase aims at transforming the
obtained textual news documents using the Vector Space Model (VSM). The
transformation extracts the BoW that represents the textual content of the document,
i.e., the set of words it contains and their frequencies regardless of their order [12–16].
A hybrid set of features is then selected from both the news content and the news textual
metadata. This is done by applying the following actions:
• Text Parsing (lines 2:5); the parser detects sentences and tokenizes the textual news
documents for further textual analysis.
• Data Cleaning (lines 6:11); applies regular expressions to eliminate all symbolic
characters and non-English words, i.e., it retains only English letters, numbers, and
words, or any combination of them.
• Part of Speech (PoS) Tagging (lines 12:17); performs the tagging process to mark up
each word in the text with a proper tag such as verb, noun, adjective, etc. in the
English language.
• Stopping Words Removal (lines 23:32); stop words that do not contribute towards
the meaning or the idea of what is being communicated account for 20–30% of the
total word count [17]. Removing them reduces the indexing file size and
improves the efficiency and effectiveness of the detection system.

• Stemming (lines 33:38); applies a stemming algorithm to avoid redundant patterns,
so that different equivalent morphological forms are replaced by their common root
word. We used the Porter English stemmer to perform the word stemming process.
For example, the words "killing" and "killed" share the same root word "kill". It has
been noted that combining words with the same root may reduce the indexing size by
as much as 40–50% [7].
• Feature Selection; (1) Applies the capital-letter heuristic (line 9) to keep all words
that begin with a capital letter, since a capitalized word in the news text usually
indicates importance and should not be neglected. (2) Applies the no-short heuristic
(line 15) to remove all words whose length is less than or equal to two, as such short
words carry little meaning. (3) Considers only the words tagged as verbs, proper
nouns and adjectives (line 15) to reduce the dimension of the extracted feature vector,
as these words are the most representative and descriptive parts of any textual data.
(4) Selects some information from each news article (lines 18:21) such as
location-based, user-based, and time-based features, as this metadata gives a much
more informative representation of the textual documents. (5) Selects the source
information related to the news publisher, for example "published on Twitter" or
"published on the CNN website", as the detection model may discover relations
between the news source, the other features selected from the text and the metadata,
and the label of the textual news document.
At the end of the pre-processing and feature selection phase, we obtain a stemmed bag
of words which represents the original feature vector that will be used in the feature
extraction phase.
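As a simplified illustration (not the authors' implementation), the pre-processing and feature-selection steps of Algorithm 1 could be reproduced with NLTK: keep properly tagged proper nouns, verbs and adjectives longer than two characters, remove stop words and apply the Porter stemmer. The function name and the omission of the metadata features are our simplifications.

import re
import nltk                      # requires the punkt, averaged_perceptron_tagger
from nltk.corpus import stopwords  # and stopwords NLTK resources
from nltk.stem import PorterStemmer

STOP = set(stopwords.words("english"))
STEM = PorterStemmer()
CLEAN_RE = re.compile(r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+://\S+)|(RT)")

def stemmed_bag_of_words(text):
    """Simplified version of the pre-processing phase of Algorithm 1."""
    tokens = nltk.word_tokenize(CLEAN_RE.sub(" ", text))          # parse + clean
    tagged = nltk.pos_tag(tokens)                                  # PoS tagging
    kept = [w for w, tag in tagged
            if tag.startswith(("NNP", "JJ", "VB")) and len(w) > 2] # tag + no-short
    return [STEM.stem(w) for w in kept if w.lower() not in STOP]   # stop words + stem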
Feature Extraction Phase. This phase aims at obtaining a set of distinctive weighted
features that indicate the importance of the selected information to the document content.
Term Frequency Inverse Document Frequency (TFIDF) is one of the most well-known
statistical weighting schemes and is used to indicate the importance of words in a
document within a collection or corpus. This importance increases proportionally with
the number of times a word appears in the document, but is offset by the frequency of the
word in the corpus. The TFIDF weight is the product of the value that represents how
frequently a term occurs in a news text document (TF) and the value that shows how
important the term is (IDF) [10, 18–21].
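For reference, the TFIDF weight of a term t in document d over a corpus of N documents is tf(t, d) x log(N / df(t)). In practice this weighting can be obtained directly from scikit-learn, which the authors use for the learning stage; the sketch below is a generic example with toy documents, not the exact configuration of the proposed system.

from sklearn.feature_extraction.text import TfidfVectorizer

# Each document here is a stemmed bag of words joined back into a string (toy data).
docs = ["explos rock town damascus bus kill",
        "packag download fail proxi server"]

vectorizer = TfidfVectorizer()        # computes tf(t, d) * idf(t) and L2-normalises rows
X = vectorizer.fit_transform(docs)    # sparse matrix: documents x weighted features
print(X.shape)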

3.3 Learning and Detection Stages


In these two stages, the feature vector resulting from the feature engineering stage is
used for building the detection model. The data is split into training and testing sets
using 5 folds. The training data are then processed by the built-in algorithms of the
Scikit-learn machine learning library in Python [22]. These algorithms are used to build
the detection model with different classifiers such as logistic regression (LR), decision
trees (DT), support vector machines (SVM), etc. As for the test data, the detection
algorithm corresponding to the built model is applied, and thus the detection results are
obtained.
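A minimal sketch of the learning and detection stages with scikit-learn is given below. The 5-fold split and the choice of classifiers follow the description above; the synthetic stand-in for the TFIDF matrix, the classifier subset, and the default hyper-parameters are our assumptions for illustration.

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import LinearSVC

# Stand-in for the TFIDF feature matrix and labels from the feature-engineering stage.
X, y = make_classification(n_samples=200, n_features=50, random_state=0)

classifiers = {
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(),
    "LSVM": LinearSVC(),
}

for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")  # 5-fold CV
    print(f"{name}: mean accuracy = {scores.mean():.3f}")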

Algorithm 1 Feature Engineering Algorithm


1 Process(Document Ndoc,StoppingWords SWF){
2 /* Text Parsing */
3 DB :=Parse (Ndoc)
4 /* Text Tokenization */
5 TDB :=Tokenize (DB)
6 /*Data Cleaning*/
7 RE= "(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)|(RT)"
8 ∀ TDBi∈TDB {
9 if (PatternMatcher(TDBi, RE)=True || IsCapital (TDBi)=True)
10 CTBD : = CTBD ∪ TDBi
11 }
12 /* Part of Speech Tagging*/
13 TW= PoS_Tag(CTBD)
14 ∀ TWi∈TW {
15 if(WordTag(TWi, (“NNP” || ”JJ” || ”VB”)) = True && WordLength(TWi)>2)
16 TTBD : = TTBD ∪ TWi
17 }
18 /* Text Chunking */
19 CW:=Chunk(TW)
20 /* Named Entity Extraction */
21 NE:=ExractNE(CW)
22 /* Initializing BOW*/
23 IBOW=null
24 /* loading stopping words file */
25 LSW=ReadWords(SWF)
26 /*Stopping words removal*/
27 ∀ TTBDi ∈TTBD {
28 ∀ LSWj ∈ LSW {
29 if (WordCompare(TTBDi, LSWj) = False)
30 IBoW : = IBoW ∪ TTBDi
31 }
32 }
33 /* Stemming process*/
34 ∀ IBoWi ∈ IBoW {
35 SBoW := SBoW ∪ StemTerm(IBoWi)
36 }
37 Return (SBoW ∪ NE)
38 }

Table 1. Basic elements of Algorithm 1


# Element Definition
1 Ndoc Textual news document in unstructured format
2 SWF Stopping Words File that contains list of stopping words, auxiliary verbs,
adverbs, etc.
3 DB News document which is parsed in a readable format suitable for the next
steps
4 TDB Tokens of the parsed news document
5 RE The regular expression used to allow words which contain only English
letters and numeric values
6 CTBD Clean tokens after removing non-English and symbolic words
7 TW List of tagged words after applying Part of Speech tagging algorithm
8 TTBD List of noun, verb, adjective tagged words
9 CW List of chunks related to the tagged words after applying chunking
algorithm
10 NE List of obtained named entities (organizations, locations, dates, times, persons,
etc.)
11 IBOW Initial Bag of Words
12 LSW List of Stopping words
13 SBoW Stemmed Bag of Words

4 Experimental Results and Discussion

In this section, we discuss our experimental setup and the results for evaluating the
performance of our proposed approach on three different publicly available fake news
datasets: (1) ISOT Fake News Dataset [2], (2) LIAR dataset [23], and (3) FA-KES
dataset [24].

4.1 Dataset Characteristics

ISOT Fake News Dataset [2]. This dataset contains almost 25.2 K textual news
documents related to both fake and real news. The news documents were collected
from real-world sources covering the period 2016–2017. The fake news was
collected from different unreliable sources flagged by Politifact (a fact-checking
organization in the USA) and by Wikipedia. The real news was collected from
reuters.com. The dataset covers different topics, and each news document is described
by its title, text, type and publication date.
LIAR Dataset [23]. This dataset contains almost 12.8 K manually labeled textual
news documents. These were collected from politifact.com and
contain six labels of truthfulness ratings (pants-fire, false, barely-true, half-true, mostly-
true, and true). The dataset provides some additional meta-data like subject, speaker,
job, state, party, context, and history. It should be noted that this metadata may not
always be available in real-life scenarios.
FA-KES Dataset [24]. This dataset contains 804 textual news documents about the
Syrian war. The news documents were collected from several news organizations
and have been labeled using a semi-supervised fact-checking approach. It has both fake
and real labels and is the first dataset that presents fake news surrounding the conflict in
Syria.

4.2 Results and Discussion


To test the proposed model, experiments were conducted on the datasets described in
Subsect. 4.1. The news documents of each dataset were randomly split into 80% for
training and the remaining 20% for testing. All news documents for training and testing
pass through the stages discussed in Sect. 3. The experimental results reported in this
section are based on classification accuracy, precision, recall, and F1-measure [1, 2, 25].
5-fold cross-validation was used for the evaluation, and TFIDF was used as the feature
extraction method. Tables 2 and 3 show the classification results in detail compared
with the results from [1] and [2], respectively. Table 4 shows the results obtained when
applying our proposed model to the FA-KES dataset, which was recently published
in [24].
It is clear from the obtained results that our proposed model compares favorably with
the other reviewed models. Moreover, it can be noticed that good preparation, cleaning
and feature selection have a positive impact on the performance of the detection process.
As can be seen in Table 2, the best enhancement was almost +6%, with an accuracy of
62% obtained when applying the SVM and logistic regression classifiers on the LIAR
dataset. From Table 3, the enhancements in accuracy were between +4% and +40%;
the best enhancement was obtained when applying our proposed technique with SVM
and the four-gram method on the ISOT fake news dataset. Finally, from Table 4, when
using the FA-KES dataset, the best accuracy was almost 58%, obtained with the
Multinomial Naive Bayes classifier.

Table 2. Performance comparison between the proposed approach and [1]


Performance Algorithm
metric Decision KNN Logistic SVM Naive Bayes LSVM Perceptron Neural
Tree Regression with n-gram Networks
N=1 N = 2
[1] Acc. 51 53 56 56 60 60 – – –
P 51 53 56 57 60 59 – – –
R 51 53 56 56 60 60 – – –
F1 51 53 51 48 57 59 – – –
Our Acc. 55 58 62 62 62 62 60 59 58
results P 56 58 62 62 62 61 60 59 58
R 56 58 62 62 62 62 60 59 59
F1 56 58 61 62 62 61 60 59 59

Table 3. Accuracy comparison between the proposed approach and [2]


N-gram Algorithm
Decision KNN Logistic SVM Bernoulli Multinomial LSVM Perceptron Neural
Tree Regression Naive Naive Bayes Networks
Bayes
[2] N = 1 89 83 89 86 – – 92 – –
N = 2 85 68 88 78 – – 89 – –
N = 3 87 73 88 71 – – 87 – –
N = 4 74 69 81 55 – – 81 – –
Our N = 1 100 88 98 99 97 95 100 99 99
results N = 2 96 89 98 99 98 97 99 98 99
N = 3 91 88 98 98 98 98 99 97 99
N = 4 87 52 95 95 94 96 95 93 96

Table 4. Our results using the FA-KES dataset [24]


Performance Algorithm
metric Decision KNN Logistic SVM Bernoulli Multinomial LSVM Perceptron Neural
Tree Regression Naive Naive Bayes Networks
Bayes
Accuracy 51.07 51.14 52.13 50.14 45.15 58.09 57.24 52.08 54.08
Precision 50 50 50 48 45 63 57 51 54
Recall 50 51 52 50 45 58 57 52 55
F1-measure 50 50 47 48 45 50 56 51 54

5 Conclusions and Future Work

In this paper, a novel approach for selecting hybrid features from online news textual
metadata for fake news detection was introduced. The technique is based on treating
the textual news document as a single block without segmentation and extracting a BoW
for the textual features. Moreover, a set of user-based, post-based, social-based, and
propagation-based features [3] is selected from the news metadata to enrich the extracted
feature vector. Different classifiers are used, and performance is reported using different
metrics: accuracy, precision, recall and F1-measure. The experimental results throughout
the paper indicate the improvement achieved when using our technique for handling,
selecting, and building the feature vector used in building the detection models.
From the results reported in this work we can conclude that:
(1) For the ISOT Fake News Dataset, the best accuracy of 100% was obtained when
using the Decision Tree and LSVM models, an enhancement of +8% compared
with the results from [2].
(2) For the LIAR dataset, the best accuracy of 62% was obtained when using the
Logistic Regression, SVM, and Naive Bayes models, enhancements of +6%, +6%,
and +2%, respectively, compared with the results from [1].
(3) For the FA-KES dataset, the best accuracy of 58% was obtained when applying
the Multinomial Naive Bayes model.

It can be remarked that, when using the ISOT Fake News Dataset, the obtained
results are much better than those from the other two datasets. This could be because the
ISOT real news is collected from the Reuters website, so, after building the
classification model, the decision effectively becomes a binary classification of whether
the news document is a Reuters or non-Reuters document.
For future work, there are many avenues to pursue, including:
(1) Performing further investigation and analysis on FA-KES dataset.
(2) Deploying one or more syntactic similarity measures, and semantic similarity
measures to enrich the extracted feature vector with some syntactic and semantic
features.
(3) Extending and evaluating the proposed system on datasets in languages other than
English.
(4) Implementing a fake news detection system on parallel platforms, such as Apache
Spark, to handle large amounts of data in less time without affecting the overall
system performance.
(5) Combining the decisions of two or more classification algorithms to improve the
accuracy of the results.

References
1. Khan, J.Y., Khondaker, M., Islam, T., Iqbal, A., Afroz, S.: A benchmark study on machine
learning methods for fake news detection. arXiv preprint arXiv:1905.04749 (2019)
2. Ahmed, H., Traore, I., Saad, S.: Detection of online fake news using N-gram analysis and
machine learning techniques. In: International Conference on Intelligent, Secure, and
Dependable Systems in Distributed and Cloud Environments, vol. 10618, pp. 127–138.
Springer, Cham (2017)
3. Elhadad, M.K., Li, K.F., Gebali, F.: Fake news detection on social media: a systematic
survey. In: 2019 IEEE Pacific Rim Conference on Communications, Computers, and Signal
Processing, Victoria, B.C., Canada. IEEE (2019)
4. Bondielli, A., Marcelloni, F.: A survey on fake news and rumour detection techniques. Inf.
Sci. 497, 38–55 (2019)
5. Ruchansky, N., Seo, S., Liu, Y.: CSI: a hybrid deep model for fake news detection. In:
Proceedings of the 2017 ACM on Conference on Information and Knowledge Management,
pp. 797–806. ACM (2017)
6. Potthast, M., Kiesel, J., Reinartz, K., Bevendorff, J., Stein, B.: A stylometric inquiry into
hyperpartisan and fake news. arXiv preprint arXiv:1702.05638 (2017)
7. Khurana, U.: The linguistic features of fake news headlines and statements. Dissertation
Master’s thesis, University of Amsterdam (2017)
8. Ahmed, H., Traore, I., Saad, S.: Detecting opinion spams and fake news using text
classification. Secur. Priv. 1(1), 1–15 (2018)
9. Ott, M., Choi, Y., Cardie, C., Hancock, J.T.: Finding deceptive opinion spam by any stretch
of the imagination. In: Proceedings of the 49th Annual Meeting of the Association for
Computational Linguistics: Human Language Technologies, vol. 1. Association for
Computational Linguistics, pp. 309–319 (2011)

10. Al Asaad, B., Erascu, M.: A tool for fake news detection. In: 2018 20th International
Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC).
IEEE (2018)
11. Bali, A.P.S., Fernandes, M., Choubey, S., Goel, M.: Comparative performance of machine
learning algorithms for fake news detection. In: International Conference on Advances in
Computing and Data Sciences, pp. 420–430. Springer (2019)
12. Elhadad, M.K., Badran, K.M., Salama, G.I.: Towards ontology-based web text document
classification. In: International Conference on Aerospace Sciences & Aviation Technology
(2017)
13. Zhu, Z., Liang, J., Li, D., Yu, H., Liu, G.: Hot topic detection based on a refined TF-IDF
algorithm. IEEE Access 7, 26996–27007 (2019)
14. Fengling, W.: Research on hot topic discovery based on intelligent campus information
service platform. In: The 3rd Information Technology, Networking, Electronic and
Automation Control Conference. IEEE (2019)
15. Xu, G., Meng, Y., Chen, Z., Qiu, X., Wang, C., Yao, H.: Research on topic detection and
tracking for online news texts. IEEE Access 7, 58407–58418 (2019)
16. Wan, J., Zheng, P., Si, H., Xiong, N.N., Zhang, W., Vasilakos, A.V.: An artificial
intelligence driven multi-feature extraction scheme for big data detection. IEEE Access 7,
80122–80132 (2019)
17. Ju, Q.: Large-scale structural reranking for hierarchical text categorization. Dissertation Ph.
D. University of Trento, Italy (2013)
18. Della Vedova, M.L., Tacchini, E., Moret, S., Ballarin, G., DiPierro, M., De Alfaro, L.:
Automatic online fake news detection combining content and social signals. In: the 22nd
Conference of Open Innovations Association (FRUCT), pp. 272–279. IEEE (2018)
19. Elhadad, M.K., Li, K.F., Gebali, F.: Sentiment analysis of Arabic and English tweets. In:
Workshops of the International Conference on Advanced Information Networking and
Applications. Springer, Cham (2019)
20. Xu, K., Wang, F., Wang, H., Yang, B.: Detecting fake news over online social media via
domain reputations and content understanding. Tsinghua Sci. Technol 25(1), 20–27 (2019)
21. Elhadad, M.K., Badran, K.M., Salama, G.I.: A novel approach for ontology-based
dimensionality reduction for web text document classification. In: 2017 IEEE/ACIS 16th
International Conference on Computer and Information Science (ICIS). IEEE (2017)
22. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M.,
Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J.: Scikit-learn: machine learning in
Python. J. ML Res. 12, 2825–2830 (2011)
23. Wang, W.Y.: “liar, liar pants on fire”: a new benchmark dataset for fake news detection.
arXiv preprint, arXiv:1705.00648 (2017)
24. Salem, F.K.A., Al Feel, R., Elbassuoni, S., Jaber, M., Farah, M.: FA-KES: a fake news
dataset around the Syrian war. In: Proceedings of the International AAAI Conference on
Web and Social Media (ICWSM 2019). AAAI (2019)
25. Elhadad, M.K., Badran, K.M., Salama, G.I.: A novel approach for ontology-based feature
vector generation for web text document classification. Int. J. Softw. Innov. 6(1), 1–10
(2018)
Author Index

A Campos, Fernanda, 318


Abubaker, Zain, 70, 355, 863, 875 Cantiello, Pasquale, 601
Aernouts, Michiel, 756 Capuano, Nicola, 505
Agostinho, Bruno Machado, 329 Carrion, Carme, 537
Aleksy, Markus, 106 Cassimon, Thomas, 684
Ali, Ishtiaq, 82, 94, 568 Castermans, Nick, 776
Amato, Alessandra, 310, 549, 579, 589 Catillo, Marta, 616
Amato, Flora, 549, 579, 589 Catone, Maria Carmela, 211
An, Wang Xu, 367 Chakraborty, Ashmita, 186
Appierto, Davide, 117 Chen, Sizhuo, 399, 482
Arif, Arooj, 47 Chen, Xu, 389, 472, 887
Arrieta, Leire Orue-Echevarria, 626 Cho, Sung Woo, 299
Ashrafullah, 57 Cilardo, Alessandro, 558
Azeem, Muhammad, 70, 355, 863, 875 Conejero-Arto, Israel, 537
Conesa, Jordi, 537
B Cozzolino, Giovanni, 310, 549, 579, 589
Ba, Nghien Nguyen, 828 Cuka, Miralda, 35
Balemans, Dieter, 798
Bandyopadhyay, Anjan, 340 D
Bañeres, David, 537 Daems, Walter, 716, 746
Barolli, Admir, 26 Dantas, Mario, 275
Barolli, Leonard, 3, 14, 26, 35 Dantas, Mario A. R., 318
Batalla-Busquets, Josep-Maria, 537 Dantas, Mario Antônio Ribeiro, 287, 329
Berkvens, Rafael, 726, 756, 766, 786 Daood, Nazia, 47
BniLam, Noori, 756 de Andrade Menezes, Victor Ströele, 287
Bo, Wang, 367 de Hoog, Jens, 661, 776, 798
Borrison, Reuben, 106 de Souza, Cauê Baasch, 329
Bortolini, Diogo, 174 del Carmen Cruz Gil, María, 537
Bosmans, Stig, 684, 736 della Volpe, Maddalena, 894
Boydens, Jeroen, 704 Deng, Na, 389, 399, 472, 482, 887
Bylykbashi, Kevin, 14 Di Martino, Beniamino, 626, 638
do Amaral, Wagner D., 318
C Doi, A., 258
Caballé, Santi, 505, 515 Doungsa-ard, Chartchai, 409
Cai, Song, 152 Durresi, Heidi, 26


E K
Elhadad, Mohamed K., 914 Kagawa, Tomomichi, 811
Elmazi, Donald, 35 Kanjanakuha, Worachet, 409
Esposito, Francesca, 894 Kanmai, Chidchamaiporn, 409
Eyckerman, Reinout, 661, 671 Kappler, Chris, 197
Kasai, Takayuki, 904
F Kato, Shigeru, 811, 821
Falco, Mariacristina, 211, 234 Katoh, T., 258
Farooq, Hassan, 875 Kawakami, Tomoya, 427, 444
Fontaine, Jaron, 197 Ke, Nie Jun, 367
Kerstens, Robin, 716
G Khan, Abdul Basit Majeed, 355
Gañán, David, 525 Khan, Nasir Ali, 57
Garcia-Alsina, Montserrat, 537 Khan, Raja Jalees ul Hussen, 568
Gebali, Fayez, 914 Kitano, Fuga, 811
Ghaffar, Abdul, 70, 355, 863, 875 Kolici, Vladi, 3
Giacalone, Marco, 310, 549, 579, 589 Kotani, Toshihiro, 433
Giroire, Frédéric, 417 Krause, Paul, 340
Giuliano, Vincenzo, 117 Kyriazis, Dimosthenis, 626
Gomes, Fernanda Oliveira, 329
Gómez-Zúñiga, Beni, 537 L
Gotoh, Yusuke, 433 Laki, Sándor, 848
Groß, Christian, 106 Larcher, Lucas, 275
Gul, Hira, 47 Laurijssen, Dennis, 716
Gurmani, Muhammad Usman, 70, 355, 863, Li, Kin Fun, 141, 914
875 Li, Yipeng, 887
Lim, Kiho, 127
H Liu, Qi, 399, 482
Halili, Rreze, 766 Liu, Shudong, 887
Hallez, Hans, 704 Liu, Yi, 3
Hellinckx, Peter, 661, 671, 684, 736, 776, 798 Liu, Yutian, 887
Hino, Takanori, 811, 821 Loza, Carlos A., 838
Horn, Geir, 626 Lu, Yangzhicheng, 463
Hozawa, M., 258 Luo, Jun, 399, 482
Hu, Ziqi, 399, 482 Lv, Songnan, 494
Huin, Nicolas, 417
Huybrechts, Thomas, 661 M
Ma, Peng, 378
I Maisto, Alessandro, 211
Iannotta, Iolanda Sara, 246 Maisto, Salvatore Augusto, 638, 648
Ibrishimova, Marina Danchovsky, 141 Mallik, Saurav, 340
Ikeda, Makoto, 14, 35 Malviya, Abhash, 186
Ishida, Tomoyuki, 463 Mancuso, Azzurra, 234
Islam, Tariqul, 127 Manivannan, D., 127
Ito, Ryuji, 811 Marino, Alfonso, 609
Marquez-Barja, Johann, 671
J Martinez-Argüelles, María J., 537
Jansen, Wouter, 716 Martino, Beniamino Di, 601
Janssen, Thomas, 726 Martino, Raffaele, 558
Javaid, Atia, 82, 94, 568 Mas, Xavier, 537
Javaid, Nadeem, 47, 57, 70, 82, 94, 355, 568, Mastroianni, Michele, 601
863, 875 Mastrorilli, Federico, 221
Jorge, Ricardo Rodriguez, 828 Mateen, Abdul, 57

Matsumoto, Satoru, 427, 454 Sato, Yukinori, 811


Matsuo, Keita, 14, 35 Schenck, Anthony, 746
Maulik, Ujjwal, 340 Schmitt, Johannes, 106
Mercelis, Siegfried, 661, 671, 684, 736, 776, Shabbir, Shaista, 47
798 Shahid, Adnan, 197
Mir, Nader F., 186 Shashaj, Ariona, 221
Mistareehi, Hassan, 127 Shehzad, Faisal, 70
Miyachi, Hideo, 265 Shen, Li, 163
Miyakwa, Akihiro, 463 Shibata, Yoshitaka, 463
Monjo, Tona, 537 Shiozaki, Takaya, 811
Mor, Enric, 537 Siano, Alfonso, 211
Morino, Y., 258 Singh, Ritesh Kumar, 786
Mujeeb, Sana, 47 Singh, Vikash Kumar, 340
Mukhopadhyay, Sajal, 340 Singh, Vincy, 186
Murakami, Koshiro, 265 Skrzypek, Paweł, 626
Steckel, Jan, 694, 716, 746
N Stingo, Michele, 221
Nacchia, Stefania, 638, 648 Ströele, Victor, 275
Nakai, Ryo, 463 Sugita, Kaoru, 463
Naredla, Abhilash, 186 Sultana, Tanzeela, 70, 355, 863, 875
Natwichai, Juggapong, 409
Nobuhara, Hajime, 811
Noshad, Zainib, 82, 94, 568 T
Takahashi, H., 258
O Takano, Kosuke, 904
Obelheiro, Rafael R., 174 Takizawa, Makoto, 26
Ohara, Seiji, 26 Tanveer, Jawad, 57
Ohira, Kenji, 454 Tejomurtula, Jahnavi, 186
Teranishi, Yuuichi, 427, 444
P Tomassilli, Andrea, 417
Paranjpe, Akshay, 186
Pariso, Paolo, 609
Pelosi, Serena, 211 U
Pinto, Alex Sandro Roschildt, 329 ul Hussen Khan, Raja Jalees, 82, 94
Pioli, Laércio, 287
Polito, Massimiliano, 221
Poorter, Eli De, 197 V
Prieto, Josep, 525 Van Waes, Jonas, 704
Vankeirsbilck, Jens, 704
Q Vanneste, Astrid, 736
Qafzezi, Ermioni, 14 Vanneste, Simon, 661, 684, 736, 798
Venticinque, Salvatore, 648
R Verreyken, Erik, 694
Rak, Massimiliano, 616 Villano, Umberto, 616
Rehman, Mubariz, 57 Vitale, Pierluigi, 234, 246
Romeo, Francesco, 549, 579
Rudas, Ákos, 848
W
S Wada, Naoki, 811
Sakamoto, Shinji, 26 Walravens, Maximiliaan, 694
Sarasa-Cabezuelo, Antonio, 515 Wang, Peiyong, 399, 482

Wang, Xu An, 378 Y


Wang, Ze-yu, 378 Yang, Su, 367
Wang, Zhifeng, 152, 494 Yoshihisa, Tomoki, 427, 444, 454
Wang, Zhuang, 163 Yoshikawa, Naoki, 821
Weyn, Maarten, 726, 756, 766, 786 Z
Zahid, Maheen, 82, 94, 568
Zeng, Chunyan, 152, 494
X Zhong, Deliang, 887
Xhafa, Fatos, 340 Zhong, WeiDong, 378
Xiong, Caiquan, 163, 389, 472 Zhou, Shangli, 152, 494
