
Expert Systems with Applications 42 (2015) 125–134

Contents lists available at ScienceDirect

Expert Systems with Applications


journal homepage: www.elsevier.com/locate/eswa

Indoor localization in a hospital environment using Random Forest classifiers

Luca Calderoni, Matteo Ferrara, Annalisa Franco, Dario Maio
Department of Computer Science and Engineering, University of Bologna, Cesena, FC 47521, Italy

ARTICLE INFO

Article history:
Available online 6 August 2014

Keywords: Indoor localization; Smart health; Hospital; Emergency room; Random Forest

ABSTRACT

This paper proposes a new indoor localization system, based on RFID technology and a hierarchical structure of classifiers. This system has been specifically designed to work in unfriendly scenarios, where transmissions could be disturbed by other electronic devices or shielded walls. The infrastructure has been deployed and evaluated in the emergency unit of a large Italian hospital (48 rooms covering about 4000 m²) to detect the room where lost or forgotten patients lie. Extensive experiments show the potential of such technology for indoor localization applications in terms of accuracy, precision, complexity, robustness and scalability. In 98% of cases the system localizes the correct room (83%) or one of its adjacent rooms (15%).

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

In the last decade the need to track persons and objects in indoor environments has become increasingly important for several reasons. The emergence of smarter environments in urban areas, along with the improvements achieved in the context of global satellite positioning systems, led to an intensive use of outdoor location-aware applications (Calderoni, Maio, & Palmieri, 2012). Meanwhile, the scientific and industrial community realized the potential and benefits of location-aware applications in the indoor environment (Mautz, 2012).

Nowadays, indoor positioning is applied in a wealth of fields: customer navigation in a mall, citizen navigation inside a public building, product localization in a supermarket and indoor location-aware advertisement are just a few examples. Unfortunately, while the performance of outdoor positioning systems has become excellent, indoor positioning seems to be more complicated (Pivato, 2012). Apparently a single multipurpose solution that fits any need in indoor positioning applications is still missing. In fact, the systems proposed in literature rely on several technologies and tailored models designed in order to fit the requirements of each specific context (Liu, Darabi, Banerjee, & Liu, 2007).

Since 2007, the European Union defined several factors that should be considered when evaluating the smartness of a city (Giffinger et al., 2007). Healthcare plays a key role when the quality of life concerning an urban area is to be estimated (Schaffers, Komninos, & Pallot, 2012). The availability of modern and citizen-friendly hospitals is one of the factors that belong to the Smart Living macro area: Hernández-Muñoz et al. (2011) state that telemedicine, electronic records, and health information exchanges in remote assistance are meaningful case studies on the subject. The possibility to track patients while they are inside a hospital, and especially in the first aid area, is becoming increasingly important. In these areas emergencies frequently occur and this causes a constant flow of doctors and nurses from one room to another. This to-and-fro, together with specific mental diseases (e.g., Alzheimer's disease), often leads to lost or forgotten patients around the hospital. Hence, knowing the right position of a person inside a medical facility at any time is a meaningful problem within the indoor localization research field. Unfortunately, the medical scenario is a particularly hostile context for indoor localization because the transmitters and receivers usually adopted to locate objects have to deal with medical devices; indeed these machineries often come with some limitation on the allowed radio-frequency range to be used near them and so on.

In this paper a focus on the hospital and medical scenario is posed; an expert system capable of locating people within a hospital emergency unit is presented. This system adopts RFID transmitters and receivers and handles their signals through a dedicated infrastructure. Patients are located through a classification of these signals via a hierarchical Random Forest-based classifier proposed herein.

Corresponding author.
E-mail address: luca.calderoni@unibo.it (L. Calderoni).

http://dx.doi.org/10.1016/j.eswa.2014.07.042
0957-4174/© 2014 Elsevier Ltd. All rights reserved.

1.1. Related works

In recent years several system models concerning indoor positioning were presented. For the purposes of this work it is meaningful to focus on the system models for indoor localization based on classifiers, as the herein proposed method lies in this field.

Honary, Mihaylova, and Xydeas (2012) provide a detailed performance analysis of a k-Nearest Neighbour based positioning system and also propose a positioning scheme based on Gaussian Mixture Models in order to locate objects in a generic building. An indoor positioning technique relying on Bayesian classifiers focusing on smart office and smart building scenarios is presented by Kim, Ha, Lee, and Lee (2009).

A valuable study on the subject was proposed by Villarubia, Rubio, Paz, Bajo, and Zato (2013); it consists of a practical experiment which took place inside a University. Here a Wi-Fi based system is used to gather signals and a wide range of classifiers are compared during the localization process. Shin and Han (2010) suggest that a viable method to improve the classification process in the Wi-Fi indoor positioning context is to combine classifiers, and subsequently provide some practical case studies on the subject (Shin, Jung, Yoon, & Han, 2011).

While Koyuncu and Yang (2010) provide a general survey on indoor positioning methods, Gu, Lo, and Niemegeers (2009) focus on positioning systems for wireless personal networks.

In another key survey Liu et al. (2007) describe a wide range of indoor localization systems and compare them on the basis of some performance metrics (Accuracy, Precision, Complexity, Robustness, Scalability, Cost). Within this paper a reference to some of these key features is posed, in order to provide a helpful comparison with other systems. The survey also describes several algorithms and mathematical techniques used for indoor positioning purposes.

Mathematical approaches for indoor localization are however better described by Seco, Jimenez, Prieto, Roa, and Koutsou (2009) and divided in four main categories: geometry-based methods, cost function minimization methods, fingerprint methods and Bayesian methods. According to this taxonomy, it is reasonable to include our system among the fingerprint based ones. Fingerprint methods rely on two main steps. In the first one, the signal levels from the different base stations are recorded in order to form a training set and calibrate the system. In the localization step, the system tries to locate the source of a new signal classifying it against the previously recorded set. These methods are particularly robust in the non-line-of-sight (NLOS) context, where radio transmission across a path is partially obstructed, usually by a physical object.

1.2. Contribution

Most of the ICT (Information and Communication Technology) works in the Smart Health field relate to the design of proper information systems for medical purposes or to the innovative techniques adopted to cope with several known diseases (Zeng et al., 2013). On the contrary, this work describes a novel technique to track patients in an emergency room environment in order to find them quickly as needed. Specifically, in this context, it is important to know the room where the patient is located rather than his precise position. The system proposed herein is aimed to locate patients through RFID signals classification. In particular, signals are processed combining several instances of decision-tree classifiers called Random Forest (Breiman, 2001). As assessed by Yim (2008) the off-line phase (training) of a fingerprint method is not time critical whereas the on-line phase (localization) is really time critical. Decision-tree oriented classifiers behave well according to this time constraint. To the best of our knowledge, this paper represents the first comprehensive contribution completely dedicated to indoor localization in hospital environment.

1.3. Outline

The paper is organized as follows: in Section 2 a description of the hospital context and its related indoor positioning issues is provided, as well as the description of the sensors involved in this work and a detailed map representation of the hospital where the technical surveys were made. After a detailed description of the dataset (Section 3), the indoor classification system designed to fit the hospital scenario is introduced in Section 4 while in Section 5 a discussion on the results based on the previously proposed system model is provided.

2. Operating environment

In general, it is quite complicated to obtain satisfying results concerning indoor positioning inside a hospital or a medical facility using wireless techniques. In this specific scenario, several problems may occur due to location sensors installation. Frequently, medical devices conflict with installed sensors; moreover, these buildings are often equipped with shielded walls. For instance, we could imagine an X-ray examination room equipped with thick leaded walls (Hentila, Taparungssanagorn, Viittala, & Hamalainen, 2005). This leads to a tangible signal reduction or distortion during the communication between transmitters and receivers. In the following a description of the adopted sensors and the operating environment is proposed, in order to better understand both acquisition and localization steps of the designed system.

2.1. Sensors

The deployed infrastructure relies on Radio Frequency Identification (RFID) technology and consists of three different sensors. Each sensor communicates in Ultra High Frequency (UHF) mode on frequencies LPD 433 MHz, 446 MHz and 860 MHz. Keeping the frequency range as confined as possible allows the infrastructure not to conflict with other WiFi and medical devices.

Patients are equipped with an active RFID transmitter (RFID Tag). Each Tag lies inside a small hypoallergenic bracelet and sends a signal on a user-defined time interval basis. The duration of each signal is limited to a few milliseconds, resulting in a total effective transmission time of few hours per year. The irradiance at a point of the surface is near to 0. One of the main concerns in dealing with active RFID tags is that of batteries and power consumption. Active tags send signals to the receiver on their own and thus need a local power source. As signals related to the adopted tag are really short and energy-preserving, bracelets can be equipped with a small battery which can nevertheless keep the tag alive for more than three months.

Signals sent from tags are received by several antennas (RFID Receiver) which store the Tag Identifier (ID) and the strength of the incoming signal. Specifically, signal strength is expressed in decibels and varies from -100 dB (weak signal) to -30 dB (strong signal). Receivers stay idle until a signal from a tag occurs and, after its processing, which lasts for a few milliseconds, go back to sleep mode. Thus, these antennas are energy-preserving as well and can be either plugged to the hospital power supply or equipped with batteries.

Each antenna sends the data to a central receiver (on the same frequency range) which is connected via an RS232-to-Ethernet adapter to the local server. This central receiver collects the signals, stores them and performs real-time localization.

These features make the infrastructure compliant with Industrial, Scientific and Medical (ISM) radio frequency specifications and suitable for the hospital environment (International Telecommunication Union, 2009).

2.2. Map of the hospital

The emergency unit where the proposed system has been deployed is divided in 48 rooms covering an area of about 4000 m². After the acceptance procedure has been carried out, each patient is routed to a specific area, depending on his personal disease. Following the opinion of the hospital authorized personnel, 9 suitable locations have been determined to install 9 antennas across the unit. Specifically, each antenna was positioned following two criteria: to have the emergency unit fully covered and to allow signals generated from each room to be received by at least three antennas. Some of these antennas are not able to receive signals while transmitters are located in certain rooms. This behavior is sometimes due to the signal reduction or distortion described in Section 2 or to the distance between transmitter and receiver, as the emergency unit is quite wide (see Fig. 1). An overall representation of the emergency unit is provided in Fig. 1.

Fig. 1. The detailed map of the hospital's emergency unit where the tests were made. Rooms are labeled with numbers (from 1 to 48) while the deployed antennas are labeled with letters (from A to I). Openings between rooms are highlighted as well.

3. Data set

A dataset has been acquired through the receivers and transmitters described in Section 2.1 by a team of experts, trying to simulate real operative conditions. To this purpose, the acquisition was performed while the emergency unit was fully operational. Thus, these data are suited to face the small changes which often take place in such a context. In particular, four different individuals wearing active RFID bracelets moved through each room of the building. Bracelets sent a signal to the antennas every five seconds. Data acquisition lasted for about five consecutive hours, resulting in 14622 observations from the 48 rooms. Fig. 2 shows the distribution of these observations among the rooms. On the average, there are 304.6 observations per room. Rooms 8 and 9 hold the minimum and the maximum number of observations, respectively (precisely 226 and 338).

Fig. 2. Number of observations registered for each room.

Fig. 3 reports, for each receiver, the percentage of observations detected. According to the chart, it is clear that C and D are the antennas with the highest reception rates (52.3% and 61.5%, respectively), while B, G, and I are those with the lowest ones (36.5%, 36.1% and 24.6%, respectively).

Fig. 3. Percentage of valid observations received by each antenna.

Definition 1. Given a transmitted signal, it is considered to be a valid value with respect to an antenna A if and only if A is able to receive it.

Definition 2. Given a set of d antennas and an integer w, $1 \le w \le d$, a valid observation is an observation holding at least w valid values.

Fig. 4 shows the percentage of observations with at least w valid values ($1 \le w \le 9$): each observation presents 3.9 valid values on the average. Evidently, when $w \ge 4$, the percentage of discarded observations (i.e., observations with a number of valid values less than w) becomes too high (i.e., from 38.9% to 99.9%), undermining the system usability.

Fig. 4. Percentage of observations presenting at least w valid values; the result is reported for increasing values of w.

Fig. 5 reports, for each receiver, the range of signals strength through boxplots, which provide detailed statistics for each antenna. This chart shows clearly that all antennas receive signals within the range $[-100, -70]$ in 75% of cases, resulting in a reduced signals strength range.

Fig. 5. Boxplot on the range of values observed by different antennas. Specifically, for each receiver the median, minimum, maximum, 25th and 75th percentile values are given.

Finally, Appendix A describes several charts useful for a further detailed analysis of the collected data.

Overall, some conclusions can be drawn:

• The dataset is very difficult to handle due to the high percentage of missing values (see Figs. 3 and 4);
• The localization task is rather difficult since, in several cases, signals received from different rooms are very similar (e.g., rooms 40, 41 and 42 hold signatures close to each other, as shown in Figures A120, A123, A126);
• A future study on the rearrangement of certain antennas could improve the system performance with respect to its robustness. For instance, B, G and I could be moved to an area featuring higher reception rates (see Fig. 3).

4. Localization system

The idea behind the proposed system is to organize rooms in a hierarchical structure through a partitioning into non-overlapping macro-regions. Each macro-region contains rooms with similar observations and is assigned separately to a specific sub-system, customized to manage the data related to that macro-region.

Figs. 6 and 7 show the learning and localization activity diagrams of the proposed system, respectively. In the following, specific steps involved in training and localization procedures are described in detail.

Fig. 6. Training process.

Fig. 7. Localization process.

4.1. Training

Let $TS = \{x^1, x^2, \ldots, x^n\}$ be a set of d-dimensional valid observations, where each element $x^i_j$, $j = 1, \ldots, d$ represents the value received by the antenna j for the observation $x^i$. A class label is associated to each observation and it represents the room identifier related to that acquisition. The training set TS is used to train the localization system. This procedure could be time consuming but usually it is performed only once during the system setup. The training stage can be divided into three steps:

• Data normalization: pre-processing stage aimed at extracting a feature vector from each observation;
• Macro-regions definition: room partitioning into non-overlapping regions characterized by similar observations;
• Classifier training: training of region-specific classifiers.

4.1.1. Data normalization

The normalized training set $\widetilde{TS} = \{\tilde{x}^1, \tilde{x}^2, \ldots, \tilde{x}^n\}$ is obtained by mapping each training pattern in the range $[0, 1]$. This normalization is aimed at representing the data in an easier-to-manage format. The acquired values are in fact represented by negative integers in the range $[-100, -30]$; moreover, as shown in Section 3, observations present a considerable number of missing values ($x^i_j = 0$) which are, in this phase, replaced by the fixed constant $-1$.

The first step of data normalization consists in determining, for each receiver, the minimum and maximum value in the training set, so that all the values can successively be normalized in the range $[0, 1]$ accordingly. Let us define $MAX = \{M_1, M_2, \ldots, M_d\}$ and $MIN = \{m_1, m_2, \ldots, m_d\}$ as the sets of maximum and minimum values observed for each receiver over the whole training set TS respectively. Specifically

$$M_j = \max_{x^i \in TS,\; x^i_j \neq 0} \left( x^i_j \right), \tag{1}$$

$$m_j = \min_{x^i \in TS,\; x^i_j \neq 0} \left( x^i_j \right). \tag{2}$$

For each $x^i \in TS$, the normalized value of each single feature $\tilde{x}^i_j$, $j = 1, \ldots, d$ is calculated as follows:

$$\tilde{x}^i_j = \begin{cases} \dfrac{x^i_j - m_j}{M_j - m_j}, & \text{if } x^i_j \neq 0 \\ -1, & \text{otherwise} \end{cases} \tag{3}$$

Clearly, normalization helps to spread different observations and thus it simplifies the classification task.

4.1.2. Room grouping

Different rooms, each represented by a set of feature vectors extracted according to the normalization procedure, are then grouped into non-overlapping regions. In the following, depending on the context, we will refer to clusters or macro-regions interchangeably. The partitioning algorithm operates on $\widetilde{TS}$ and is based on a modified version of the k-Means algorithm (Dixon, 1979; MacQueen, 1967), which supports missing values. The modified algorithm is composed of the following steps:

1. Set $k = k_{max}$ random points into the d-dimensional space $[0, 1]^d$. These points represent initial cluster centroids.

2. Assign each room to the cluster holding the centroid closest to most of the observations belonging to the room. Since most of those observations contain missing values, the Partial Distance (PD) (Dixon, 1979) instead of the common Euclidean Distance is used to calculate the distance between two points. PD is calculated as follows:

$$PD(x, y) = \frac{d}{\sum_{j=1}^{d} b_j} \cdot \sum_{j = 1, \ldots, d \,\wedge\, b_j \neq 0} \left( x_j - y_j \right)^2, \tag{4}$$

where $x_j$ and $y_j$ represent the j-th values of the d-dimensional points x and y, and

$$b_j = \begin{cases} 0, & \text{if } x_j = -1 \vee y_j = -1 \\ 1, & \text{otherwise} \end{cases} \tag{5}$$

PD computes the squared Euclidean Distance between all valid values and normalizes it by the reciprocal of the proportion of valid values used during the calculation.

3. When all rooms have been assigned, recalculate the position of the k centroids. If there are no rooms associated to one or more clusters, delete them and update the number of valid clusters k. Let idc be the identifier of a cluster, let $\widetilde{TS}_{idc}$ be the set of normalized observations belonging to rooms assigned to this cluster and let $\mu^{idc}$ be the cluster's centroid. The coordinates of each centroid are calculated as the arithmetic mean of all valid observation values of the rooms assigned to the corresponding cluster:

$$\mu^{idc}_j = \frac{1}{\sum_{x \in \widetilde{TS}_{idc}} v_j} \cdot \sum_{x \in \widetilde{TS}_{idc},\; v_j = 1} x_j, \tag{6}$$

where $1 \le idc \le k$ and $1 \le j \le d$, having

$$v_j = \begin{cases} 1, & \text{if } x_j \neq -1 \\ 0, & \text{otherwise} \end{cases} \tag{7}$$

4. Repeat steps 2 and 3 until no modifications to the centroids are performed or a maximum number of iterations is reached.

Fig. 8 depicts an example of room grouping into five macro-regions. Note that, since the clustering is performed according to radio signal features, which carry no spatial information, macro-regions may also contain non-adjacent rooms (e.g., rooms 7 and 42).

Fig. 8. An example of room grouping. Here five clusters are generated.

4.1.3. Classifier training

Once the rooms are grouped into k macro-regions, different classifiers are trained using different training sets derived from $\widetilde{TS}$. Note that this operation can be performed in parallel since there are no dependencies between different classifiers and training sets. A macro-region classifier $Cl_{MR}$ is trained using $\widetilde{TS}_{MR}$, a dataset containing all the data in $\widetilde{TS}$ where the identifier of each room has been replaced by the corresponding macro-region identifier. Moreover, for each cluster, a classifier $Cl_{idc}$ is trained using $\widetilde{TS}_{idc}$, a $\widetilde{TS}$ subset containing only the observations from rooms belonging to cluster idc. Due to the high number of rooms, the chosen hierarchical organization allows each sub-classifier $Cl_{idc}$ to focus on a specific and smaller room subset, thus making the classification problem easier.

Among the classifiers suitable for the algorithm implementation, the Random Forest (Breiman, 2001) has been selected. This choice was due to the classifier's convenient behavior both in terms of accuracy and efficiency, which are critical features of such an application. In fact, the training process of the whole proposed system requires about 2 s on the average. The localization task is instead performed in real-time and its related execution time is negligible. These values are referred to an Intel Core 2 Quad Q9400 CPU at 2.66 GHz, using a multithreaded C# implementation.

Random Forest is an ensemble classifier that consists of a set of decision trees. Let N be the number of samples in the training set and P be the number of variables in the classifier; let p be the input parameter denoting the number of input variables to be used to determine the decision at a node of the tree (usually $p \ll P$). Each tree is grown as follows:

• Take a bootstrap sample by choosing n times with replacement from all N available training samples;
• Use the remaining samples to estimate the error of the tree, by predicting their classes;
• For each node of the tree, randomly choose p variables on which to base the decision at that node. Calculate the best split based on these p variables in the training set;
• Each tree is fully grown and not pruned.

To classify a new object, the sample is given as input to each of the trees in the forest; each tree gives a classification and the final class is chosen on the basis of the majority vote rule.

On the basis of the distribution of votes among the different classes, a confidence is also computed for each class. Specifically, given a class i, the related confidence $c_i$ corresponds to the percentage of votes obtained through the set of trees composing the classifier.

4.2. Localization

During the localization, a new observation x is given as input to the system in order to predict the room where the subject currently resides. The localization procedure can be divided into three steps:

• Data normalization;
• Macro-region classification;
• Room classification.

4.2.1. Data normalization

The new observation x is first normalized as described in Section 4.1.1, obtaining $\tilde{x}$. Since some feature values could be out of the maximum and minimum bounds determined in the training phase, a clipping operation is performed to preserve values in the range $[0, 1]$:

$$\tilde{x}^i_j = \begin{cases} 0, & v < 0 \\ 1, & v > 1 \\ v, & \text{otherwise} \end{cases} \tag{8}$$

where

$$v = \frac{x^i_j - m_j}{M_j - m_j}. \tag{9}$$

4.2.2. Macro-region classification

In this step, the classifier $Cl_{MR}$ is used to pre-select those macro-regions where, according to a given observation, the probability of presence of the subject is higher than a prefixed threshold. The normalized observation $\tilde{x}$ is used as input for the classifier $Cl_{MR}$ that returns a set of confidences $Conf^{MR} = \{c^{MR}_{idc},\; 1 \le idc \le k\}$ where $c^{MR}_{idc}$ represents the likelihood that the subject is located in a room belonging to the macro-region idc.

4.2.3. Room classification

In this step, the classifiers corresponding to pre-selected macro-regions are used to predict the rooms from which the signal could have been generated. The normalized observation $\tilde{x}$ is given as input to each classifier $Cl_{idc}$ for which the corresponding macro-region confidence $c^{MR}_{idc}$ is greater than or equal to a predefined threshold t. The result is a set of confidences $Conf_{idc} = \{c_{idr},\; idr \in C_{idc}\}$ where $C_{idc}$ is the set of all identifiers of rooms assigned to the macro-region idc and $c_{idr}$ represents the likelihood that the subject is located in room idr. Note that this operation can be performed in parallel since there are no dependencies between the different classifiers.

The final score of each room is calculated as:

$$s_{idr} = c^{MR}_{M(idr)} + c_{idr}, \tag{10}$$

$$M(idr) = idc, \quad idr \in C_{idc}, \tag{11}$$

where $M(idr)$ returns the identifier of the macro-region containing the room idr. In particular, the confidence of the macro-region classifier is combined with that of the selected classifiers on the basis of the sum rule. Finally, the identifiers of the r rooms with the highest scores are returned. Note that, in the proposed Random Forest-based implementation, both confidences $c^{MR}_{M(idr)}$ and $c_{idr}$ are computed as described in Section 4.1.3. The first one is provided by the classifier $Cl_{MR}$, trained to distinguish among the different macro-regions, while the second one is returned by the corresponding room classifier $Cl_{idc}$.

Fig. 9 shows the whole localization procedure over a sample observation, using room grouping as reported in Fig. 8.

5. Experimental evaluation

This section describes the experiments carried out to evaluate the performance of the proposed system in terms of accuracy, precision, complexity, robustness and scalability (Liu et al., 2007).

5.1. Test protocol

The localization system must be trained before it can be used to localize the subjects (see Section 4.1). To this purpose, the dataset described in Section 3 has been divided into two disjoint subsets: half of the observations belonging to each room is randomly selected as training data while the other half is used as test data.

To assess the impact of invalid values on the system performance, experiments have been carried out discarding (both in training and test sets) those observations that present a number of valid values less than w, with $w = 1, 2, 3$. Moreover, the localization system has been evaluated for different values of the parameter $r = 1, 2, 3$, representing the number of rooms returned by the system (see Section 4.2.3). Finally, to reduce the variability of the performance indicators, the experiments have been repeated 20 times, each time randomly choosing complementary training and test sets, and the final performance indicators have been calculated as the average of the results.
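The test protocol can be sketched as follows. This is a minimal illustration rather than the authors' implementation (which is described as a multithreaded C# program): the function names, the `(room, values)` tuple layout and the toy data are assumptions introduced here, and raw missing readings are encoded as 0 as stated in Section 4.1.1. When a room holds an odd number of observations, the sketch places the extra one in the test half, a detail the protocol does not specify.

```python
import random
from collections import defaultdict

MISSING = 0  # raw readings use 0 when an antenna did not receive the signal


def is_valid_observation(values, w):
    """Definition 2: an observation is valid if it holds at least w valid values."""
    return sum(1 for v in values if v != MISSING) >= w


def split_per_room(observations, w, rng):
    """Discard observations with fewer than w valid values, then split each
    room's remaining observations into two disjoint halves (train, test)."""
    by_room = defaultdict(list)
    for room, values in observations:
        if is_valid_observation(values, w):
            by_room[room].append((room, values))
    train, test = [], []
    for room in sorted(by_room):
        items = by_room[room][:]
        rng.shuffle(items)
        half = len(items) // 2
        train.extend(items[:half])
        test.extend(items[half:])
    return train, test


# Toy data with 3 antennas (hypothetical values, not taken from the dataset).
toy = [
    (1, [-80, 0, -75]), (1, [-82, -90, 0]), (1, [0, 0, -60]), (1, [-70, -71, -72]),
    (2, [0, -65, -66]), (2, [-90, -91, 0]),
]
for run in range(20):  # 20 repetitions with different random splits
    train, test = split_per_room(toy, w=2, rng=random.Random(run))
    assert not any(obs in test for obs in train)  # the two halves are disjoint
print(len(train), len(test))
```

In a full experiment each repetition would train the classifiers on `train`, evaluate them on `test`, and the performance indicators would be averaged over the 20 runs.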
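Since the experiments exercise the localization procedure of Section 4.2, a compact sketch of its numeric steps may help: min-max normalization with clipping (Eqs. (8) and (9)) and the sum-rule scoring of Eqs. (10) and (11). The classifier outputs are passed in as plain dictionaries, which stand in for the confidences returned by the Random Forest classifiers; the function names, data layout and example values are assumptions made for illustration.

```python
def normalize_clipped(x, mins, maxs):
    """Eqs. (8)-(9): scale each reading with the training bounds and clip to
    [0, 1]; missing readings (raw value 0) become -1 as in Section 4.1.1."""
    out = []
    for xj, mj, Mj in zip(x, mins, maxs):
        if xj == 0:
            out.append(-1.0)
            continue
        v = (xj - mj) / (Mj - mj)
        out.append(min(1.0, max(0.0, v)))
    return out


def rank_rooms(macro_conf, room_conf, t, r):
    """macro_conf maps idc -> c^MR_idc; room_conf maps idc -> {idr: c_idr}.
    Rooms of macro-regions whose confidence is at least t are scored with the
    sum rule of Eq. (10); the r room identifiers with the best scores are returned."""
    scores = {}
    for idc, c_mr in macro_conf.items():
        if c_mr < t:
            continue  # macro-region not pre-selected
        for idr, c_room in room_conf[idc].items():
            scores[idr] = c_mr + c_room
    return sorted(scores, key=lambda idr: scores[idr], reverse=True)[:r]


# Hypothetical confidences for three macro-regions and their rooms.
macro_conf = {1: 0.7, 2: 0.25, 3: 0.05}
room_conf = {1: {10: 0.6, 11: 0.4}, 2: {20: 0.9, 21: 0.1}, 3: {30: 1.0}}
print(rank_rooms(macro_conf, room_conf, t=0.2, r=2))
print(normalize_clipped([-75, -40, 0], mins=[-100] * 3, maxs=[-50] * 3))
```

With r greater than 1 the system returns several candidate rooms, which is the setting evaluated in the accuracy experiments.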

Fig. 9. Example of localization process, given a valid observation. The room grouping proposed in Fig. 8 is used. Each ci represents the condence that the observation come
from room i.

5.2. Accuracy Table 1


Accuracy of the proposed localization system as a function of the number of rooms
retrieved (r) and the minimum number of valid values used to select the observations
Generally, accuracy is measured as the average distance (w).
between the estimated location and the real location. In this
specic context, given two room identiers i and j, the distance r/ w 1 2 3

di; j between the rooms is recursively calculated as follows: 1 0.76 0.02 0.55 0.01 0.47 0.01
2 0.40 0.01 0.25 0.01 0.20 0.01
( 3 0.27 0.01 0.16 0.01 0.12 0.01
0; ij
di; j mindk; j 1; o:w: 12
k2Ai

where Ai contains the identiers of all rooms adjacent to room i. that, when r > 1 the distance used to calculate the accuracy is the
Two rooms are adjacent if they share a wall or at least a part of it lowest one among the r distances calculated.
(e.g., in Fig. 1, rooms 1; 3; 4 and 5 are adjacent to room 2). The This chart shows that, as expected, better results in terms of
distance d represents the minimum number of walls that separate the true room and the estimated one (e.g., d(4, 7) = 3). In other words, a distance of zero means the system has located the right room, a distance of one means the real room and the located one are adjacent and, more generally, a distance of u means that there are u walls between the right room and the located one. Please note that the definition of accuracy used by Liu et al. (2007) is applied here. This definition differs from the concept of accuracy usually adopted in other contexts, since the lower the value (lower distance), the more accurate the classification system is considered to be. Table 1 reports the average accuracy of the proposed system and the corresponding standard deviation for different values of r and w, representing respectively the number of rooms retrieved by the localization system and the minimum number of valid values used to select valid observations. As shown by the standard deviation values, the variability of the accuracy observed over the 20 tests is very low. Note

accuracy can be reached by applying more restrictive constraints on the number of valid features required (w = 2 or w = 3); it is worth noting that the limits imposed in these experiments are rather low with respect to the number of receivers available. Similarly, accuracy improves when a higher number of rooms (r = 2 or r = 3) is retrieved by the localization system. According to the constraints applied to the problem discussed herein, a value of r = 2 is acceptable; overall, the results provided using (r = 2, w = 2) and (r = 2, w = 3) are both satisfactory in terms of accuracy and suitable for a practical implementation.

Fig. 10 depicts the confusion matrix where each row represents a room identifier (expected for classification) and each column states the resulting localization. In other words, element (i, j) represents the percentage of signals transmitted from room i localized as deriving from room j. This matrix helps to understand whether or not the system is confusing two rooms during localization. As expected, high values are mainly distributed along the diagonal;
[Figure 10: 48 x 48 confusion matrix; room identifiers 1 to 48 on both axes; color scale from 0 to 1.]

Fig. 10. Confusion matrix. Blue values indicate a low percentage of localization, while lighter ones indicate a higher localization rate.
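The wall-distance accuracy and the row-normalized confusion matrix discussed around Fig. 10 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the room adjacency graph, the toy data and all function names (`wall_distance`, `mean_wall_distance`, `confusion_matrix`) are hypothetical, and it assumes a single retrieved room per observation (r = 1) and that crossing one wall corresponds to one hop between adjacent rooms.

```python
from collections import deque


def wall_distance(adjacency, start, goal):
    """Minimum number of walls between two rooms (BFS over adjacent rooms)."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        room, dist = queue.popleft()
        for nxt in adjacency.get(room, ()):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("rooms are not connected")


def mean_wall_distance(adjacency, true_rooms, estimated_rooms):
    """Accuracy in the sense used here: average wall distance (lower is better)."""
    dists = [wall_distance(adjacency, t, e)
             for t, e in zip(true_rooms, estimated_rooms)]
    return sum(dists) / len(dists)


def confusion_matrix(true_rooms, estimated_rooms, rooms):
    """Element (i, j): fraction of signals sent from room i localized in room j."""
    index = {room: k for k, room in enumerate(rooms)}
    counts = [[0] * len(rooms) for _ in rooms]
    for t, e in zip(true_rooms, estimated_rooms):
        counts[index[t]][index[e]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical toy layout: four rooms in a row (1-2-3-4), not the hospital map.
TOY_ADJACENCY = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
TRUE_ROOMS = [1, 1, 2, 3, 4, 4]
ESTIMATED_ROOMS = [1, 2, 2, 3, 4, 2]

toy_accuracy = mean_wall_distance(TOY_ADJACENCY, TRUE_ROOMS, ESTIMATED_ROOMS)
toy_matrix = confusion_matrix(TRUE_ROOMS, ESTIMATED_ROOMS, rooms=[1, 2, 3, 4])
```

In this toy example, mass outside the diagonal of `toy_matrix` marks rooms that are confused with one another, which is the same reading applied to Fig. 10.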

it is thus confirmed that the system holds a high accuracy level. This matrix also confirms the frequent localization errors related to rooms 40, 41 and 42, previously explained in Section 3, as the area around those room identifiers presents a notable blur.

5.3. Precision

Usually, the precision of a localization system is defined as the distribution of the distance error between the estimated location and the real location. The precision of the proposed system is measured using the Cumulative Distribution Function (CDF) of the distance error. In particular, the charts reported in Fig. 11 show the CDFs of the proposed system for different values of r and w. Each chart represents the percentage of correctly classified patterns as a function of the distance defined in (12). According to this test, the behavior observed in the previous experiment is confirmed. In particular, by selecting only observations with a minimum of two valid values (w = 2), localization is performed with a maximum distance of 1 in more than 90% of the tests that have been carried out (with respect to the real room). So, the localization system tends to identify either the correct room or one of its adjacent rooms.

5.4. Complexity

The complexity of a localization system can be attributed to hardware, software and operation factors. For simplicity, it is usually estimated as the computing time required to perform a localization. As reported in Section 4.1.3, the system performs a localization in a negligible time, and thus it can be adopted in real-time applications.

5.5. Robustness

In general, robustness measures the capability of a localization system to work normally even if some signals are not available due to receiver errors. The results in terms of precision (see Fig. 11) show that the proposed localization system holds good robustness properties. The proposed algorithm was designed to operate with many missing values, making it possible to reach satisfactory results even with a limited number of valid features available (for instance, an acceptable parameter could be w = 2, meaning that only two valid values, among 9, are required in order to accept an observation as valid).

5.6. Scalability

Further experiments have been carried out to evaluate the possibility of scaling up the proposed localization system to larger buildings with a higher number of rooms. Usually, the localization performance degrades when the distance between the transmitter and the receiver increases. This evaluation relies on the analysis of accuracy as a function of the ratio between the number of rooms in the building and the number of receivers available, referred to as rPr in the following. To this purpose, the localization system has been tested on different datasets featuring an increasing number of rooms per receiver (rPr):

• DB1: 2840 observations acquired in a building with 19 rooms and 7 receivers installed (rPr = 2.7);
• DB2: 2130 observations acquired in the same building as DB1 with 5 receivers installed (rPr = 3.8);
• DB3: the database described in Section 3 with 48 rooms and 9 receivers installed (rPr = 5.3).

The results obtained are reported in Fig. 12, where the accuracy obtained with w = 1 is reported for each of the three databases and different values of r (the markers in the chart). The chart also reports trend functions (represented as lines) that could help to predict the system behavior at different values of rPr. To validate the observed model, an additional simulation has been carried out by ignoring the signals from four of the nine receivers used in the experiments on DB3 (i.e., A, B, G and I in Fig. 1), thus increasing the value of rPr from 5.3 to 9.6 (48 rooms with 5 receivers). The accuracy measured with r equal to 1, 2 and 3 is 1.15, 0.65 and 0.45, respectively. The measured results (diamonds) are extremely close to those predicted by the trend function (1.14, 0.66 and 0.45), thus making us confident that, at least for a reasonable range of rPr values, the trend lines are rather reliable.

[Figure 11: three panels (w = 1, w = 2, w = 3), each with curves for r = 1, r = 2 and r = 3; horizontal axis from 0 to 10, vertical axis from 60 to 100.]

Fig. 11. Cumulative distribution functions of the distance error, reported for different values of r and w.

[Figure 12: accuracy (vertical axis, 0 to 1.2) as a function of rPr (horizontal axis, 0 to 10), with series for r = 1, r = 2 and r = 3.]

Fig. 12. The accuracy of the localization system on three different datasets with w = 1 and r = 1, r = 2 and r = 3 (blue, red and green elements respectively). The accuracy trend as a function of rPr (lines) is also shown. Finally, diamonds highlight the accuracy measured on a dataset with rPr = 9.6. (For interpretation of the references to colour in this figure caption, the reader is referred to the web version of this article.)

6. Conclusion

Indoor localization plays an important role in ambient intelligence applications. In this work, an expert system for indoor localization in a hospital environment is presented; the aim is to estimate the room where the patients are located within the building. With respect to other scenarios where most of the state-of-the-art work can be framed, the specific application imposes important constraints mainly related to the presence of electronic devices and shielded areas that significantly affect the acquisition process, with important implications on data quality.

The RFID technology was selected for data acquisition because of its interesting properties in terms of cost, ease of use and compliance with the Industrial, Scientific and Medical radio frequency specifications. The localization process, based on the data acquired by different antennas positioned in the building, is performed by a multi-classifier system. One of the peculiarities characterizing the indoor localization task addressed in this work is that the data acquired present many missing values: in most cases, less than one third of the acquired signals are valid and can be used for localization. The system was thus designed to deal with a high degree of uncertainty, and ad hoc feature extraction techniques have been defined for this purpose. A very efficient implementation based on Random Forest classifiers has been proposed and tested on real data, acquired in an emergency unit while it was fully operative. Experimental results are very encouraging: the system is able to work in real time and shows good behavior in terms of robustness, accuracy and precision. As an additional contribution, an experimental study concerning scalability of the proposed approach to larger buildings has been carried out to further support the feasibility of the proposal.

Despite the system's performance and its suitability with respect to the initial aims of the project, some important aspects deserve further investigation. A significant aspect of the system design is antenna positioning; the receivers' position should maximize the coverage and consequently limit as much as possible missing data issues. It would be interesting to study the benefits introduced using different approaches for feature extraction and
signals management. For instance, the proposed system bases the localization process on each single signal received by antennas; relying on signal sequences instead of single signals could improve the system performance by reducing the effects of possible outliers. Moreover, signal sequences could also provide useful information for continuous template updating, which would increase the system's robustness with respect to the frequent changes which often take place in a dynamic environment such as a hospital. Another aspect which could be investigated by the research community is how the system would behave in a multiple-floor building. This aspect should involve a study on data acquisition and localization procedures designed for a multi-floor environment. A preliminary study on signal propagation through different floors could lead to different solutions: a rack of independent systems as proposed herein could be installed on each floor separately, or a unique system able to deal with all of the considered floors could be installed instead.

Appendix A

A comprehensive list of charts for each room and receiver is provided as supplemental material. Each room is described by three charts (A, B and C) while a single chart is provided for each receiver. Charts of type A show, for each antenna, the percentage of received signals among those gathered from the involved room. Charts of type B indicate how many antennas are able to receive the signals sent from the involved room. For instance, a value of 95% in column 2 states that 95% of the signals gathered from that room are received by at least two antennas. Charts of type C show, for each receiver, the strength of the received signals gathered from the involved room. The range of signal strength is illustrated through average and standard deviation. Charts proposed for each receiver show the same mean and standard deviation from the receiver perspective. For each one of the 48 rooms, the charts illustrate the signal strength perceived by the involved receiver. A file showing the aforementioned charts is available at http://smartcity.csr.unibo.it/res/ilhe-appendixA.pdf.

Appendix B. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.eswa.2014.07.042.

References

Breiman, L. (2001). Random forests. Machine Learning, 45, 5-32.
Calderoni, L., Maio, D., & Palmieri, P. (2012). Location-aware mobile services for a smart city: Design, implementation and deployment. JTAER, 7, 74-87.
Dixon, J. K. (1979). Pattern recognition with partly missing data. IEEE Transactions on Systems, Man, and Cybernetics, 9, 617-621.
Giffinger, R., Fertner, C., Kramar, H., Kalasek, R., Pichler-Milanovic, N., & Meijers, E. (2007). Smart Cities: Ranking of European Medium-sized Cities. Centre of Regional Science, Vienna University of Technology.
Gu, Y., Lo, A., & Niemegeers, I. G. (2009). A survey of indoor positioning systems for wireless personal networks. IEEE Communications Surveys and Tutorials, 11, 13-32.
Hentila, L., Taparungssanagorn, A., Viittala, H., & Hamalainen, M. (2005). Measurement and modelling of an uwb channel at hospital. In IEEE International Conference on Ultra-Wideband, 2005. ICU 2005. http://dx.doi.org/10.1109/ICU.2005.1569968.
Hernández-Muñoz, J. M., Vercher, J. B., Muñoz, L., Galache, J. A., Presser, M., Gómez, L. A. H., et al. (2011). Smart cities at the forefront of the future internet. In J. Domingue, A. Galis, A. Gavras, T. Zahariadis, D. Lambert, & F. Cleary, et al. (Eds.), Future Internet Assembly (pp. 447-462). Springer.
Honary, M., Mihaylova, L., & Xydeas, C. (2012). Practical classification methods for indoor positioning. The Open Transportation Journal, 6, 31-38.
International Telecommunication Union (2009). Industrial, scientific and medical (ISM) applications (of radio frequency energy). Technical Report.
Kim, H. H., Ha, K. N., Lee, S., & Lee, K. C. (2009). Resident location-recognition algorithm using a bayesian classifier in the pir sensor-based indoor location-aware system. IEEE Transactions on Systems, Man, and Cybernetics, Part C, 39, 240-245.
Koyuncu, H., & Yang, S. H. (2010). A survey of indoor positioning and object locating systems. International Journal of Computer Science and Network Security, 10, 121-128.
Liu, H., Darabi, H., Banerjee, P. P., & Liu, J. (2007). Survey of wireless indoor positioning techniques and systems. IEEE Transactions on Systems, Man, and Cybernetics, Part C, 37, 1067-1080.
MacQueen, J. B. (1967). Some methods for classification and analysis of multivariate observations. In L. M. L. Cam & J. Neyman (Eds.), Proc. of the fifth Berkeley symposium on mathematical statistics and probability (pp. 281-297). University of California Press.
Mautz, R. (2012). Indoor positioning technologies. ETH Zürich Habilitation Thesis. http://dx.doi.org/10.3929/ethz-a-007313554.
Pivato, P. (2012). Analysis and characterization of Wireless Positioning Techniques in Indoor Environment. (Ph.D. thesis). University of Trento.
Schaffers, H., Komninos, N., & Pallot, M. (2012). Smart cities as innovation ecosystems sustained by the future internet. FIREBALL White Paper.
Seco, F., Jimenez, A., Prieto, C., Roa, J., & Koutsou, K. (2009). A survey of mathematical methods for indoor localization. In IEEE International Symposium on Intelligent Signal Processing, 2009. WISP 2009 (pp. 9-14). http://dx.doi.org/10.1109/WISP.2009.5286582.
Shin, J., & Han, D. (2010). Multi-classifier for wlan fingerprint-based positioning system. In S. I. Ao, L. Gelman, D. W. Hukins, A. Hunter, & A. M. Korsunsky (Eds.), World Congress on Engineering, International Association of Engineers (pp. 768-773). Newswood Limited.
Shin, J., Jung, S. H., Yoon, G., & Han, D. (2011). A multi-classifier approach for wi-fi based positioning system. In S. I. Ao & L. Gelman (Eds.), Electrical Engineering and Applied Computing (pp. 135-147). Netherlands: Springer. doi: 10.1007/978-94-007-1192-1_12.
Villarubia, G., Rubio, F., Paz, J. F. D., Bajo, J., & Zato, C. (2013). Applying classifiers in indoor location system. In J. B. Pérez, J. M. C. Rodríguez, J. Fähndrich, P. Mathieu, A. Campbell, & M. C. Suarez-Figueroa, et al. (Eds.), Trends in Practical Applications of Agents and Multiagent Systems (pp. 53-58). Springer International Publishing. doi: 10.1007/978-3-319-00563-8_7.
Yim, J. (2008). Introducing a decision tree-based indoor positioning technique. Expert Systems with Applications, 34, 1296-1302.
Zeng, D., Yang, C. C., Tseng, V. S., Xing, C., Chen, H., & Wang, F. Y., et al. (Eds.). (2013). Smart Health International Conference, ICSH 2013, Beijing, China, August 3-4, 2013. Proceedings. Lecture Notes in Computer Science (Vol. 8040). Springer.
