
Expert Systems with Applications 37 (2010) 7108–7119

Contents lists available at ScienceDirect

Expert Systems with Applications


journal homepage: www.elsevier.com/locate/eswa

An intelligent decision-support model using FSOM and rule extraction for crime prevention
Sheng-Tun Li a,b,*, Shu-Ching Kuo b,c, Fu-Ching Tsai a,d
a Institute of Information Management, National Cheng Kung University, No. 1, Ta-Hsueh Road, Tainan 701, Taiwan
b Department of Industrial and Information Management, National Cheng Kung University, Taiwan, ROC
c Department of Information Management, Diwan College of Management, Taiwan, ROC
d Pingtung County Police Bureau, No.119, Chungchen Rd., Pingtung City, Taiwan, ROC

Article info

Keywords:
Intelligent decision support
Crime prevention
Fuzzy self-organizing map network
Temporal rule extraction

Abstract

In the recent era of increasing volume crimes, crime prevention is now one of the most important global issues, along with the great concern of strengthening public security. Government and community officials are making an all-out effort to improve the effectiveness of crime prevention. Numerous investigations addressing this problem have generally employed disciplines of behavior science and statistics. Recently, the data mining approach has been shown to be a proactive decision-support tool in predicting and preventing crime. However, its effectiveness is often limited due to different natures of crime data, such as linguistic crime data evolving over time. In this paper, we propose a framework of intelligent decision-support model based on a fuzzy self-organizing map (FSOM) network to detect and analyze crime trend patterns from temporal crime activity data. In addition, a rule extraction algorithm is employed to uncover hidden causal-effect knowledge and reveal the shift around effect. In contrast to most present crime related studies, we target a non-Western real-world case, i.e. the National Police Agency (NPA) in Taiwan. The resultant model can support police managers in assessing more appropriate law enforcement strategies, as well as improving the use of police duty deployment for crime prevention.
© 2010 Elsevier Ltd. All rights reserved.

* Corresponding author at: Institute of Information Management, National Cheng Kung University, No. 1, Ta-Hsueh Road, Tainan 701, Taiwan. Tel.: +886 6 2757575x53126; fax: +886 6 2362162.
E-mail addresses: stli@mail.ncku.edu.tw (S.-T. Li), r3892104@mail.ncku.edu.tw (S.-C. Kuo), r7895104@mail.ncku.edu.tw (F.-C. Tsai).

1. Introduction

In recent years, rising crime volumes have brought serious problems to many countries in the world. For example, in Taiwan, crime volumes have increased by more than 71% in a decade, which may bring not only physical harm but also serious mental injury to the victims. To avert such a calamity, the National Police Agency, Taiwan, must invest more human resources in criminal investigation and increase law enforcement duties such as patrolling, raids, and guarding in order to maintain public order, prevent all kinds of hazards, and promote the welfare of citizens. Traditional law enforcement strategies for crime prevention focus on preventive police patrol, which is the most general duty of the police. Since 2002, the Ministry of Internal Affairs in Taiwan has been working on a project to examine the public security index (PSI) with five lights (red, yellow, purple, green, and blue) that represent five kinds of status of public security, namely "very bad", "bad", "intermediate", "good", and "very good", in order to strengthen people's impression of public security. These lights are linguistic in nature so as to attract people's attention to public security more effectively. Indeed, the formal enforcement of preventive laws is seen as an important means of preventing crime and ensuring public safety in modern societies. Increasingly, however, scholars and public officials have realized that traditional law enforcement is not the only approach to crime prevention (Visher & Weisburd, 1998). In addition, domain experts believe that a criminal history model can be used to identify other analogous patterns (Kaza, Wang, & Chen, 2007). Therefore, more research has focused on using various intelligent approaches to analyze different types of crime characteristics and to carry out specific programs.

The emerging data mining (also known as knowledge discovery in databases) has been shown to be a powerful and effective methodology to help business users to facilitate intelligent decision support (Petropoulos, Patelis, Metaxiotis, Nikolopoulos, & Assimakopoulos, 2003; Rupnik, Kukar, & Krisper, 2007). In particular, it enables criminal investigators to explore criminal acts quickly and efficiently (Alexander, Alexander, Nikolay, & Tatyana, 2008; Chen et al., 2004; Richard, Michael, & John, 2007). In the recent literature, a variety of data mining techniques have been employed in crime policy (Adderley, Townsley, & Bond, 2007; Brown & Oxford, 2001; Chen et al., 2004) and have produced encouraging results. Yet, most

0957-4174/$ - see front matter © 2010 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2010.03.004

studies emphasized the analysis of quantitative numeric crime data, and few existing studies consider linguistic crime data. Consequently, it is imperative to develop novel approaches for handling such types of data. In this paper, we propose a framework of intelligent decision-support model in order to identify crime trend patterns for different criminal activities, conduct temporal rule extraction to uncover their shift around effect, and provide a reference for experts when analyzing the different types of crime. For crime pattern discovery, we propose a hybrid FSOM model which combines the features of SOM networks and fuzzy logic in dealing with clustering, visualization, and linguistic information processing. For temporal crime rule extraction, we use a modified general rule extraction algorithm to find the hidden causal effects between different temporal linguistic crime data, which can help police managers understand criminal acts more clearly. Furthermore, we use the J-measure to quantify and rank the usability of the rules to identify their relative importance. The experimental results of the framework provide actionable information for the police management to make better use of its duty deployment and help criminal experts to develop and implement more effective law enforcement policies and crime control programs.

The remaining sections of this paper are organized as follows. Section 2 describes the confronted problem of crime prevention and reviews the literature related to the problem. Section 3 introduces the essential concepts of FSOM, temporal rule discovery, and J-measure. Section 4 describes the experimental design for data collection, network design, and performance evaluation. Section 5 discusses the experimental results and analysis that focus on the trend pattern analysis of various crimes, shift around effects, and police duty deployment analysis. Section 6 concludes this paper and outlines future work.

2. Literature review

Because crime has drawn much attention over time, the public has become increasingly concerned about the government's response to it. The International Journal of Forecasting published a special issue on crime forecasting in 2003 (Gorr & Harries, 2003). Numerous modern studies of crime demonstrate that the sharply rising crime volume has become a high-priority problem that needs to be solved promptly. In order to prevent criminal acts and discover crime trends, researchers have developed a number of methods to support law enforcement activities over the last two decades. These works originate from the disciplines of behavior and psychology, statistics, and artificial intelligence.

The approach of behavior science and psychology has been aimed at preventing individuals from committing crimes since the 1960s (Visher & Weisburd, 1998). This traditional approach relies on the expertise and tacit knowledge of specialists, which easily leads to criminal experts' fatigue, misjudgment, and slow response. Moreover, it usually lacks real data for verification, thus making it difficult to apply the findings to actual policy (Visher & Weisburd, 1998). The second approach deals with the problem of predicting the crime volume at a specific time and place using various statistical models (Brown & Oxford, 2001; Gorr, Olligschlaeger, & Thompson, 2003; Greenberg, 2001; Harries, 2003; Osgood, 2000; Palocsay, Wang, & Brookshire, 2000; Ratcliffe, 2005). Gorr et al. (2003) utilized the Naïve exponential model to forecast crime one month ahead in Pittsburgh, US. The result shows that practically any model-based forecasting approach is vastly more accurate than current police practices. Brown and Oxford (2001) presented baseline models, normal regression, and log-normal regression to predict the number of breaking and enterings (B&Es) in the city of Richmond, Virginia and found that log-normal regression is the best model. Unfortunately, Palocsay et al. (2000) pointed out that the performance of these statistical models has been considered weak due to their high error rates and limited explanatory power. Researchers have thus continued to look for new improved methods in this arena. In certain applications, ANN in artificial intelligence has performed at least as well as these models, and it is usually favored because it can approximate any nonlinear function without requiring the specification of a nonlinear model prior to analyzing the data and can extract hidden knowledge from vast data and domain experts.

The ANN approach has the capability to simulate thinking of human brains and has been successfully applied in various areas due to its salient features in classification, clustering, and rule discovery. Recently, it has been applied to deal with crime knowledge discovery and achieved impressive results. For instance, Kaza et al. (2007) integrated association analysis and modified mutual information to identify potential criminal vehicles by using law enforcement data from border-area jurisdictions in Tucson, Arizona in the US. The study results indicate that the temporal patterns and port distributions can aid in the better understanding of crossing activity. Kaikhah and Doddameti (2006) proposed a tool to discover the existing trends for each type of crime (murder, rape, robbery, auto theft) in US cities, which used a new training process of neural network. The neural network is modified via pruning hidden layer activation clustering and is trained to learn the correlations and relationships that exist in a dataset. They concluded that different types of crime trend discovery can help monitor the security environment. Grubesic (2006) utilized fuzzy clustering to detect the explanation of criminal activities for crime hot-spot areas and their spatial trends. Compared with two hard-clustering approaches (median and k-means clustering problem), the empirical results suggest that a fuzzy clustering approach is better equipped to handle crime spatial outliers. Using the 24 cases selected from Richmond, Virginia police department, Brown and Hagen (2002) adopted automated association rule mining methods to help law enforcement. The results confirm that the association method can benefit both the efficiency and accuracy of mining. Other related works include (Corcoran, Wilson, & Ware, 2003; Oatley & Ewart, 2003; Oatley, MacIntyre, Ewart, & Mugambi, 2002; Wang et al., 2002; Xue & Brown, 2003). These all indicate that the discovery of each crime trend with association knowledge provides a promising solution to the crime problem. Yet, there are still several important issues worthy of attention. The cases reported in the literature are mainly concerned with Western advanced capitalist societies and developing countries (Liu, 2006). Moreover, without taking into account the uncertainty factor, most aforementioned models developed can only deal with crisp crime data; thus the usability of the discovered crime patterns could be limited.

In this paper, we develop an intelligent decision-support model to uncover the crime patterns and their temporal relationships to support the police management in law enforcement and duty deployment. The target data set to be mined was released by NPA in Taiwan from 2003 to 2004, which is linguistic in nature, thus a model capable of dealing with fuzzy data is needed. For this, we propose a new FSOM model and a modified temporal rule extraction algorithm to be discussed in the next section.

3. Research methodology

This section discusses the underlying research methodologies in this study.

3.1. The FSOM network

The SOM neural network has been one of the most popular unsupervised neural network models to solve a wide variety of

problems in clustering, classification, visualization, and modeling (Flexer, 2001; Vesanto & Alhoniemi, 2000). It quantizes the input data and simultaneously performs a topology-preserving projection from the data space onto a regular two-dimensional grid in the output space (Kohonen, 1997). However, the traditional SOM fails to deal with uncertainties. To overcome this shortcoming, Huntsberger and Ajjimarangsee (1990) first brought the fuzzy c-means model to enhance the learning rate and weight updating strategy of the fuzzy Kohonen clustering network (FKCN). It was then extended by Bezdek, Tsao, and Pal (1992), Tsao, Bezdek, and Pal (1994), and Lin and Lee (2003) and was applied in various areas (Lei & Feihu, 1999). FKCN obtained superior learning performance but could not process linguistic data.

In this study, we follow the idea of FKCN and the structure of traditional SOM to enhance the model to allow for the manipulation of fuzzy numbers as inputs, fuzzy similarity measure, and fuzzy weight updating. To realize such improvements, three issues must be addressed: (1) representation of fuzzy time series, (2) selection of best-matching unit (BMU), and (3) the weight updating process.

3.1.1. Representation of fuzzy time series

The input layer contains n neurons for storing an input fuzzy time series, represented as a fuzzy feature vector $\tilde{A}_i = [A_{i1}, A_{i2}, \ldots, A_{ik}, \ldots, A_{in}] \in F^n$ in n-dimensional fuzzy number space F, where $A_{ik}$ is a non-negative triangular fuzzy number. Formally speaking,

$$
\begin{aligned}
\tilde{A}_i &= [A_{i1}, A_{i2}, \ldots, A_{ik}, \ldots, A_{in}] \\
&= [(a^l_{i1}, a^m_{i1}, a^r_{i1}), (a^l_{i2}, a^m_{i2}, a^r_{i2}), \ldots, (a^l_{ik}, a^m_{ik}, a^r_{ik}), \ldots, (a^l_{in}, a^m_{in}, a^r_{in})] \\
&= [A_{i1}[\alpha], A_{i2}[\alpha], \ldots, A_{ik}[\alpha], \ldots, A_{in}[\alpha]] \\
&= [(A^L_{i1}[\alpha], A^U_{i1}[\alpha]), (A^L_{i2}[\alpha], A^U_{i2}[\alpha]), \ldots, (A^L_{ik}[\alpha], A^U_{ik}[\alpha]), \ldots, (A^L_{in}[\alpha], A^U_{in}[\alpha])]
\end{aligned}
\tag{1}
$$

where $a^l$, $a^m$, and $a^r$ are the left, middle, and right bounds of the triangular fuzzy number, $A_{ik}[\alpha] = \{x \mid \mu_{A_{ik}}(x) \ge \alpha\}$ is the $\alpha$-cut set for fuzzy number $A_{ik}$, and $A^L_{ik}[\alpha] = \inf\{x \mid \mu_{A_{ik}}(x) \ge \alpha\}$ and $A^U_{ik}[\alpha] = \sup\{x \mid \mu_{A_{ik}}(x) \ge \alpha\}$ are the lower and upper bounds of the $\alpha$-cut set for the fuzzy number $A_{ik}$, respectively.

3.1.2. Selection of BMU

The neurons in the map are fully connected to adjacent ones by a neighborhood relation of the neurons. When an input vector $\tilde{A}_i$ is presented to the network, the neurons in the map compete with each other to be the winner (or the best-matching unit, BMU) $\tilde{A}_i^{*}$, which is the closest to the input vector in terms of some kind of fuzzy similarity measure due to the nature of fuzziness in the input. A variety of fuzzy similarity measures have been proposed, and each has different application domains (Chen & Lu, 2001; Kuo & Xue, 1998; Paiva & Dourado, 2004). In this paper, we adopt Paiva and Dourado's measure, which is widely used in the literature and can be generalized to handle any fuzzy similarity measure.

Assume that $A_{ik}$ and $A_{jk}$ are two fuzzy numbers (FN) of the kth component in the feature vector. The similarity measure between two fuzzy numbers $(A_{ik}, A_{jk})$ is defined as follows:

$$S: FN \times FN \to R$$

$$
S(A_{ik}, A_{jk}) = \frac{\|A_{ik} \cap A_{jk}\|}{\|A_{ik} \cup A_{jk}\|} = \frac{\int \left(|A_{ik}[\alpha] \cap A_{jk}[\alpha]|\right) d\alpha}{\int \left(|A_{ik}[\alpha] \cup A_{jk}[\alpha]|\right) d\alpha}
\tag{2}
$$

The similarity is basically the area of their intersection divided by the area of their union. $A_{ik} \cap A_{jk}$ and $A_{ik} \cup A_{jk}$ are the fuzzy t-norm and s-norm operations, respectively. These operations can be defined in various ways; we restrict ourselves to the well-known operations minimum and maximum.

Having determined the fuzzy similarity measure, the similarity of two fuzzy time series is defined as follows:

$$SF: R^n \to R$$

$$
SF(\tilde{A}_i, \tilde{A}_j) = \frac{1}{n} \sum_{k=1}^{n} S(A_{ik}, A_{jk})
\tag{3}
$$

The node with the largest SF value is assigned as BMU.

3.1.3. The weight updating process

During training, neurons that are topographically close to BMU in the map are moved toward a given input vector. It is standard to use certain geometric distance to update weights in training crisp data. Since the crime temporal data is linguistic and represented by triangular fuzzy sets, the weight updating rule is required to handle fuzzy numbers.

Let

$$
\begin{aligned}
\tilde{A}_i &= [A_{i1}, A_{i2}, \ldots, A_{ik}, \ldots, A_{in}] \\
&= [(a^l_{i1}, a^m_{i1}, a^r_{i1}), (a^l_{i2}, a^m_{i2}, a^r_{i2}), \ldots, (a^l_{ik}, a^m_{ik}, a^r_{ik}), \ldots, (a^l_{in}, a^m_{in}, a^r_{in})]
\end{aligned}
$$

be a fuzzy input vector and

$$
\begin{aligned}
\tilde{W}_j &= [W_{j1}, W_{j2}, \ldots, W_{jn}] \\
&= [(w^l_{j1}, w^m_{j1}, w^r_{j1}), (w^l_{j2}, w^m_{j2}, w^r_{j2}), \ldots, (w^l_{jk}, w^m_{jk}, w^r_{jk}), \ldots, (w^l_{jn}, w^m_{jn}, w^r_{jn})]
\end{aligned}
$$

be the fuzzy weight vector of BMU. Then the BMU node will move toward the input $\tilde{A}_i$ using the so-called "self-organization" learning rule and its weight vector after updating will be

$$
\begin{aligned}
\tilde{W}'_j &= [W'_{j1}, W'_{j2}, \ldots, W'_{jn}] \\
&= [(w^{l\prime}_{j1}, w^{m\prime}_{j1}, w^{r\prime}_{j1}), (w^{l\prime}_{j2}, w^{m\prime}_{j2}, w^{r\prime}_{j2}), \ldots, (w^{l\prime}_{jk}, w^{m\prime}_{jk}, w^{r\prime}_{jk}), \ldots, (w^{l\prime}_{jn}, w^{m\prime}_{jn}, w^{r\prime}_{jn})]
\end{aligned}
$$

where the kth component in the vector is modified according to the following updating rule:

$$
\begin{aligned}
w^{l\prime}_{jk}(t+1) &= w^{l}_{jk}(t) + \Delta w^{l}_{jk}(t) \\
w^{m\prime}_{jk}(t+1) &= w^{m}_{jk}(t) + \Delta w^{m}_{jk}(t) \\
w^{r\prime}_{jk}(t+1) &= w^{r}_{jk}(t) + \Delta w^{r}_{jk}(t)
\end{aligned}
\tag{4}
$$

The amount of adjustment for each parameter in the triangle fuzzy set is then defined as follows:

$$
\begin{aligned}
\Delta w^{l}_{jk}(t) &= \eta(t)\,(a^{l}_{ik}(t) - w^{l}_{jk}(t)) \\
\Delta w^{m}_{jk}(t) &= \eta(t)\,(a^{m}_{ik}(t) - w^{m}_{jk}(t)) \\
\Delta w^{r}_{jk}(t) &= \eta(t)\,(a^{r}_{ik}(t) - w^{r}_{jk}(t))
\end{aligned}
\tag{5}
$$

where t = 0, 1, 2, 3, ... is the time lag, and $\eta(t)$ is the learning rate. In general, $\eta(t)$ is a small positive number decreasing with time and can be defined as $\eta(t) = \eta_0 \cdot \exp(-t / t_{total})$, where $\eta_0$ is the initial rate, t is the current epoch number of training, and $t_{total}$ is the total training epochs required. The training algorithm of FSOM is formalized in Table 1.

3.2. Temporal rule extraction

The information extraction from time series databases can be divided into four directions: similarity/pattern querying, clustering/classification, pattern finding/prediction, and rule extraction (Cotofrei & Stoffel, 2002). Among them, rule extraction becomes more important since it can identify the correlation between time

series and provide understandable results. Das, Lin, Mannila, Renganathan, and Smyth (1998) outlined an effective method, resembling vector quantization, for uncovering useful rules from discretized sequences and represented the rules in the format

if A occurs, then B occurs within time T.

Table 1
The training algorithm of FSOM network.

[Step 1] Initialize random fuzzy weights of the FSOM, cluster number and learning rate $\eta_0$
[Step 2] Select BMU b using Eq. (3)
[Step 3] Update the weights of the neighborhood of BMU using Eqs. (4) and (5)
[Step 4] Decrease the neighborhood of the winner appropriately using the following equation: $\eta(t) = \eta_0 \cdot \exp(-t / t_{total})$
[Step 5] Repeat Steps 2–4 until the stopping criterion is met.

Table 2
The algorithm of temporal rule extraction.

Rule_set = ∅;
S_i = (L_i(1), L_i(2), ..., L_i(k), ..., L_i(n)), L_i(k) ∈ {red, yellow, purple, green, blue} is the PSI value at time k in time sequence
h is a time interval under consideration
// Construct all rules of two time series
for t = 0 to h;
    for t' = 1 to (n − t);
        E = { e | e = (S_1(L_i(t')) →[T=t] S_2(L_j(t' + t))) };  // list all rules
        Rule_set = Rule_set ∪ E;  // remove repetition rules
    end
end
// Check thresholds of confidence & support
for each rule R_l ∈ Rule_set
    remove R_l from Rule_set if (conf(R_l) < conf_min) or (supp(R_l) < supp_min)
return Rule_set;
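The procedure in Table 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The function name `extract_rules` is hypothetical. The sketch reads T = t as an exact lag (the consequent is observed exactly t steps after the antecedent, following the loop `for t = 0 to h`). It also assumes, purely for illustration, that support is the share of aligned observation pairs at that lag matching the rule and that confidence is the share of antecedent occurrences followed by the consequent.

```python
from collections import Counter


def extract_rules(s1, s2, h, supp_min, conf_min):
    """Enumerate rules (a, b, t): 'a in s1 at time k' -> 'b in s2 at time k + t'.

    Assumed definitions (illustrative only):
      support    = matching pairs at lag t / all aligned pairs at lag t
      confidence = matching pairs at lag t / occurrences of antecedent a at lag t
    """
    if len(s1) != len(s2):
        raise ValueError("both sequences must have the same length")
    n = len(s1)
    rules = {}
    for t in range(h + 1):                       # lags 0..h, as in Table 2
        pairs = [(s1[k], s2[k + t]) for k in range(n - t)]
        pair_counts = Counter(pairs)             # repeated rules collapse here
        lhs_counts = Counter(a for a, _ in pairs)
        for (a, b), hits in pair_counts.items():
            supp = hits / len(pairs)
            conf = hits / lhs_counts[a]
            # keep only rules that reach both expert-defined thresholds
            if supp >= supp_min and conf >= conf_min:
                rules[(a, b, t)] = (supp, conf)
    return rules


# Two short PSI sequences (illustrative input).
S1 = ["red", "purple", "yellow", "red", "green"]
S2 = ["purple", "red", "red", "purple", "red"]
all_rules = extract_rules(S1, S2, h=1, supp_min=0.0, conf_min=0.0)
strong_rules = extract_rules(S1, S2, h=1, supp_min=0.3, conf_min=0.7)
```

With both thresholds set to zero the sketch returns every distinct candidate rule; raising them prunes the rare ones, which mirrors the final loop of Table 2.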
The rules discovered identify the related behavior of patterns within a sequence over time. Compared to the traditional association relation format (Karimi & Hamilton, 2002), we prefer using simple rules to induce intuitive and significant phenomena for experts, since rules with multiple antecedents and consequents decrease the interpretability. In this paper, we extend Das's algorithm to mine the relationship of two different crime activities by allowing the processing of linguistic time series data. The rule format and extraction algorithm are illustrated as follows.

Given two linguistic temporal crime activities $S_1$ and $S_2$, where $S_i = (L_i(1), L_i(2), \ldots, L_i(k), \ldots, L_i(n))$ and $L_i(k) \in \{\text{red}, \text{yellow}, \text{purple}, \text{green}, \text{blue}\}$ is the PSI value at time k for $S_i$, the relationship between $S_1$ and $S_2$ is represented as

$$S_1(L_i(t')) \xrightarrow{T=t} S_2(L_j(t'+t)).$$

For example, $S_1$ = (red, purple, yellow, red, green), $S_2$ = (purple, red, red, purple, red) and the time interval under consideration is T = 1. The relational rules can be

$$
\begin{aligned}
&S_1(\text{red}) \xrightarrow{T=0} S_2(\text{purple}),\quad S_1(\text{purple}) \xrightarrow{T=0} S_2(\text{red}),\quad S_1(\text{yellow}) \xrightarrow{T=0} S_2(\text{red}),\\
&S_1(\text{red}) \xrightarrow{T=0} S_2(\text{purple}),\quad S_1(\text{green}) \xrightarrow{T=0} S_2(\text{red}),\quad S_1(\text{red}) \xrightarrow{T=1} S_2(\text{red}),\\
&S_1(\text{purple}) \xrightarrow{T=1} S_2(\text{red}),\quad S_1(\text{yellow}) \xrightarrow{T=1} S_2(\text{purple}),\quad S_1(\text{red}) \xrightarrow{T=1} S_2(\text{red}).
\end{aligned}
$$

Repetitive rules need to be pruned and rules below the thresholds of confidence and support, set by the domain expert, are considered uninteresting and removed. The support value refers to the percentage of task-relevant crime activities for which the pattern is true:

$$
supp(S_1(L_i) \overset{T}{\Rightarrow} S_2(L_j)) = p(S_1(L_i) \cap S_2(L_j)_T) = \frac{n(S_1(L_i), S_2(L_j), T)}{n(S_1(L_i))}
\tag{6}
$$

where $p(S_1(L_i) \cap S_2(L_j)_T)$ is the probability of events of $S_1(L_i)$ and $S_2(L_j)_T$, and $n(S_1(L_i), S_2(L_j), T)$ is the number of occurrences of $S_1(L_i)$ that are followed by a $S_2(L_j)$ within time interval T.

The confidence is the conditional probability defined by

$$
conf(S_1(L_i) \overset{T}{\Rightarrow} S_2(L_j)) = p(S_2(L_j)_T \mid S_1(L_i)) = \frac{supp(S_1(L_i) \overset{T}{\Rightarrow} S_2(L_j))}{p(S_1(L_i))}
\tag{7}
$$

The temporal rule extraction algorithm is detailed in Table 2.

3.3. The J-measure

In general, numerous rules are generated from the temporal rule extraction algorithm. Although support and confidence are two typical objective measures in finding interesting rules, some important information could be lost due to failing to take into account the frequency of the right-hand side of a rule (Harms, Li, Deogun, & Tadesse, 2002). In order to overcome this shortcoming, Smyth and Goodman (1992) developed an information-theoretic measure for evaluating the usefulness of a rule that considers the frequencies of both the left and right sides of the rule. This measure is known as the J-measure and is defined as follows:

$$
J(S_2(L_j)_T; S_1(L_i)) = p(S_1(L_i)) \times \left[ p(S_2(L_j)_T \mid S_1(L_i)) \log \frac{p(S_2(L_j)_T \mid S_1(L_i))}{p(S_2(L_j)_T)} + \bar{p}(S_2(L_j)_T \mid S_1(L_i)) \log \frac{\bar{p}(S_2(L_j)_T \mid S_1(L_i))}{\bar{p}(S_2(L_j)_T)} \right]
\tag{8}
$$

where $p(S_1(L_i))$ is the probability of event $S_1(L_i)$ occurring in the temporal crime activities sequence, $p(S_2(L_j)_T)$ is the probability of at least one $S_2(L_j)$ taking place in a randomly chosen window of duration time interval T, $p(S_2(L_j)_T \mid S_1(L_i))$ is the probability of at least one $S_2(L_j)$ occurring in a randomly chosen window of duration time interval T given that the window is immediately preceded by an $S_1(L_i)$, and $\bar{p}(S_2(L_j)_T \mid S_1(L_i))$ is the complement of $p(S_2(L_j)_T \mid S_1(L_i))$. For each rule obtained, the higher the J-measure value is, the more important the rule is.

The J-measure not only provides a useful and sound method for ranking rules, but also provides a more complex metric in rule mining and is used in much of the literature for handling different rule discovery problems, such as efficient rule discovery in a geo-spatial decision support system (Harms et al., 2002), learning fuzzy rule-based networks for function approximation (Higgins & Goodman, 1992), and incremental learning with rule-based neural networks (Higgins & Goodman, 1991).

4. Experimental design

Fig. 1 depicts the framework of intelligent decision-support model for crime prevention, of which crime pattern analysis and police duty deployment will be discussed in the next section.

4.1. Data acquisition

The data under investigation, provided by the NPA, Taiwan, is about the monthly crime volume of 20 county police bureaus in Taiwan from 2003 to 2004. Three counties were intentionally excluded because they are located in the sparsely populated eastern area of Taiwan and their crime volume is too small to be analyzed. Fourteen criminal categories are collected: intimidation, drugs, automobile theft, sexual offences, anti-social behavior, theft, motorcycle theft, firearms crime, damage, force taking, fraud, robbery, gambling, and injury. Therefore, a set of

[Figure 1: flow diagram with five stages: Data acquisition, Data preprocessing, Mining, Crime pattern analysis, Police duty deployment. Box labels: NPA, Taiwan; Crime offenses; Crime locations; Data standardization, S_i = (x_i − x_min) / (x_max − x_min); Data fuzzification; Parameter setting (clustering number, learning rate, support and confidence of temporal rule extraction); Training FSOM; Fuzzy clustering; Temporal rule extraction; Crime trends; Shift around effect.]

Fig. 1. The framework of intelligent decision-support model for crime prevention.

280 × 24 temporal data is constructed, formed by 14 offenses in each county over 24 months.

4.2. Data preprocessing

Due to the variability existing in crime categories, the monthly crime volumes are first standardized to a number between 0 and 1 using the following formula:

$$
x_{ij} = \frac{x - x_{min}}{x_{max} - x_{min}}
\tag{9}
$$

where $x_{min}$ and $x_{max}$ are the minimum and maximum of the crime volume x in a criminal category.

In addition to the crime volume, a public security index of each crime type is published by NPA, which is classified into one of five lights to illustrate the degree of security situation. Such a classification can be simulated by fuzzification. In this study, the standardized crime volumes are fuzzified by fuzzy c-means (FCM), which has been shown to be an effective way to help humans deal with uncertainty (Xu & Khoshgoftaar, 2004). The fuzzification process is conducted as follows. Given a data matrix $X = [x_{ij}]_{m \times n}$, one first converts matrix X into a set $S = \{x_{11}, x_{12}, \ldots, x_{1n}, x_{21}, x_{22}, \ldots, x_{2n}, \ldots, x_{m1}, x_{m2}, \ldots, x_{mn}\}$. Then, FCM divides S into c homogeneous nonempty subsets and obtains c clustering centers ($C_k$, k = 1, 2, ..., c) and the corresponding standard deviations ($S_k$, k = 1, 2, ..., c). Without loss of generality, the associated triangle fuzzy number $A_k$ can be established as $A_k = (C_k - w \cdot S_k, C_k, C_k + w \cdot S_k) = (a^l_k, a^m_k, a^u_k)$, and the membership degree for datum $x_{ij}$ will be

$$
\mu_{A_k}(x_{ij}) = \max\left( \min\left( \frac{x_{ij} - a^l_k}{a^m_k - a^l_k}, \frac{a^u_k - x_{ij}}{a^u_k - a^m_k} \right), 0 \right),
$$

where w is a constant in [0,1]. After defining the fuzzy sets, the standardized crime volume $x_{ij}$ is converted to one of five lights according to the best matching membership degree (BMD) b, which has the largest membership degree $\mu_{A_k}(x_{ij})$: $\mu_{A_b}(x_{ij}) = \max_{k=1,2,\ldots,c} (\mu_{A_k}(x_{ij}))$. Fig. 2 depicts the derived fuzzy triangle membership functions and fuzzification result, where $x_{ij}$ is a normalized value.

4.3. Parameter setting

At the mining stage, the FSOM network is trained to cluster the fuzzified crime data. During training, the cluster number, learning rate, and stopping criterion need to be decided in advance. The stopping criterion is used for judging whether the training model converges or not. There are some reasonable criteria with their own practical merits, of which the error-distance measure, e.g. Euclidean, is simple and widely accepted to measure the difference between the cluster center and the input data. In this study, since the data set and the cluster centers are fuzzy sets, a defuzzification procedure is needed to convert the fuzzy set to a crisp real number. Thus, the error-distance measure is defined as

$$
E(\tilde{A}_i, \tilde{W}_j) = \frac{1}{n} \sqrt{\sum_{k=1}^{n} dist(dfuz(A_{ik}), dfuz(w_{jk}))}
\tag{10}
$$

where function 'dfuz' is the well-known gravity-based defuzzification method, in this case, the centroid method.

[Figure 2: triangular membership functions for the five linguistic values very bad, bad, intermediate, good, and very good. Horizontal axis: standardized crime volume, from −0.5 to 1.5. Vertical axis: the membership degree, from 0 to 1.]

Fig. 2. Fuzzification process of standardized crime volume.
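The standardization of Eq. (9) and the triangular membership assignment of Section 4.2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the authors' code: the function names are hypothetical, FCM is not run, and the cluster centers (0, 0.25, 0.5, 0.75, 1), the common spread of 0.25, w = 1, and the mapping of high volumes to "very bad" are placeholder values chosen only for illustration.

```python
def standardize(values):
    """Min-max scaling of one crime category to [0, 1], as in Eq. (9)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]


def triangle(center, spread, w=1.0):
    """Triangular fuzzy number (C - w*S, C, C + w*S); spread must be positive."""
    return (center - w * spread, center, center + w * spread)


def membership(x, tri):
    """Membership degree of x in a triangular fuzzy number (left, mid, right)."""
    left, mid, right = tri
    return max(min((x - left) / (mid - left), (right - x) / (right - mid)), 0.0)


def best_light(x, fuzzy_sets):
    """Label whose fuzzy set gives x the largest membership degree."""
    return max(fuzzy_sets, key=lambda name: membership(x, fuzzy_sets[name]))


# Placeholder centers and spread (assumed, not taken from the data set).
LIGHTS = {
    "very good": triangle(0.00, 0.25),
    "good": triangle(0.25, 0.25),
    "intermediate": triangle(0.50, 0.25),
    "bad": triangle(0.75, 0.25),
    "very bad": triangle(1.00, 0.25),
}

scaled = standardize([10, 20, 30])
labels = [best_light(x, LIGHTS) for x in scaled]
```

In practice the centers and standard deviations would come from an FCM run on the standardized volumes; fixed values are used here only to keep the sketch self-contained.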

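The error-distance measure of Eq. (10) can be illustrated with a short Python sketch. It rests on two assumptions that the text does not spell out: 'dfuz' is taken to be the centroid of a triangular fuzzy number, (l + m + r) / 3, and 'dist' is taken to be the squared difference of the two defuzzified values, in line with the Euclidean reading mentioned above. The function names are hypothetical.

```python
import math


def centroid(tri):
    """Centroid (center of gravity) of a triangular fuzzy number (l, m, r)."""
    left, mid, right = tri
    return (left + mid + right) / 3.0


def error_distance(inputs, weights):
    """(1/n) * sqrt(sum of squared differences between defuzzified components).

    Assumes 'dist' in Eq. (10) is the squared difference (Euclidean reading).
    """
    if not inputs or len(inputs) != len(weights):
        raise ValueError("inputs and weights must be non-empty and equally long")
    total = sum((centroid(a) - centroid(w)) ** 2 for a, w in zip(inputs, weights))
    return math.sqrt(total) / len(inputs)


# Two-component fuzzy vectors with crisp-like components (illustrative values).
example = error_distance(
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(3.0, 3.0, 3.0), (4.0, 4.0, 4.0)],
)
```

A training loop could stop once this value changes by less than a chosen tolerance between epochs; that tolerance is also an assumption, not something the text specifies.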


The learning rate, a critical parameter in the course of training, has to be chosen appropriately. Fig. 3 displays the learning curves with learning rates of 0.0005, 0.001, 0.005, 0.01 and 0.05, where axes X and Y denote the iteration times and the error-distance, respectively. It shows that all models achieve good properties of convergence and stability; nevertheless, the model with a learning rate of 0.01 achieves the best performance comparatively. Therefore, the following experimental analysis is based on this rate in the FSOM neural network.

The selection of an appropriate number of clusters is usually problem dependent, and is, in general, based on the principle of minimizing the intra-cluster distance and maximizing the inter-cluster distance. In this study, we consider one of the most popular cluster validity measures used in the literature, the silhouette width (Li & Kuo, 2008; Rousseeuw, 1987), for determining the cluster numbers. The silhouette value for each object i is computed in Eq. (11)

$$
S(i) = \frac{b(i) - a(i)}{\max\{a(i), b(i)\}},
\tag{11}
$$

where a(i) is the average dissimilarity of i to all other objects in the same cluster, and b(i) is the minimum of the average dissimilarity of i to all objects in another cluster (that is, in the closest cluster). The silhouette width for a clustering distribution can then be measured by the average silhouette value for the entire data, which is defined as follows:

$$
SW = \frac{1}{n} \sum_{i=1}^{n} S(i),
\tag{12}
$$

where n is the total number of data items. A higher value of SW indicates better discrimination among clusters and the largest silhouette width indicates the best clustering (number of clusters). To find the optimal cluster number, numbers from the range of 2–10 are tried as illustrated in Table 3. The best value of silhouette width is reached around 0.1710 when two clusters are identified. However, uncovering the trends of only two crime categories is not meaningful, because two clusters are too few to be analyzed subsequently. As a result, the second candidate, four clusters, is considered in this experiment.

Finally, to proceed to temporal rule extraction, a senior detective with five years of experience is consulted for determining the three parameters of interest: support, confidence, and time interval T. From the viewpoint of empirical practice, he claims that it is hard to explain the temporal relations of crimes if the time interval is over two months. Therefore, the parameter T value ranging from 0 to 2 is considered in the experiment and the threshold values of support and confidence are set to 0.2 and 0.7, respectively.

5. Experimental results, analysis, and discussion

In this section, we analyze and discuss in detail the experimental results for the aspects of crime patterns, trends and temporal knowledge rules uncovered.

5.1. Characteristics analysis of crime trend patterns

By feeding the 280 crime linguistic time series data into the FSOM network, four crime trend patterns (clusters) are identified. To justify the clustering result, the one-way analysis of variance (ANOVA) is first used to test the significant differences among the four crime trends at the 95% confidence level. The null hypothesis of ANOVA was set as $H_0$: $\mu_1 = \ldots = \mu_4$ against $H_1$: not all $\mu_i$ (i = 1, ..., 4) are equal. The ANOVA summary table for crime trends is listed in Table 4, which indicates that the critical point for a right-tailed test at $\alpha$ = 0.05 for an F distribution with 3 and 276 degrees of freedom is 2.635 and F = 23.04 > $F_{0.05}(3, 276)$ = 2.635. Thus, we reject the null hypothesis and conclude that the means of the four crime trend patterns are different.

Next, we verify and test the significance of relationships between two clusters when data are presented as ordered pairs. An excellent way to measure such relationships is to calculate the Pearson correlation coefficient and construct a scatter plot. The null and alternative hypotheses are $H_0$: $\rho$ = 0 (no correlation) and $H_1$: $\rho \neq 0$ (significant correlation) in each test and the results are shown in Table 5. Because p-value > 0.01, we should decide not to reject the null hypothesis. At the 1% level, there is not enough evidence to conclude that significant correlations exist between the clusters. This is also verified by the scatter plots in Table 5 in that no linear correlation has been observed between the clusters.

Table 6 illustrates the characteristics of the mining results of the four clusters, which include basic statistical information about the number of data elements, mean, and variance of the crime offenses occurring during two contiguous years for each cluster. The fifth row in the table shows the cluster center, expressed in black dots
error

53 surrounded by two color curves representing the standard devia-


tion of each time dimension in each cluster. In other words, Table
52.5 6 provides information for the police management to determine

52
Fig. 3. Learning curves of the FSOM network with different learning rates (horizontal axis: iteration time, 0 to 500; vertical axis: error, 51 to 55; one curve per learning rate: 0.0005, 0.001, 0.005, 0.01, 0.05).

Table 4
ANOVA table for crime clusters.

Source of variation   Sum of squares   Degrees of freedom   Mean square   F ratio
Treatment             1.01657          3                    0.3388        23.04
Error                 4.0756           276                  0.0147
Total                 5.09217          279
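As a quick plausibility check, the F ratio of Table 4 can be re-derived from its sums of squares and degrees of freedom. The sketch below is an illustration only, not the authors' procedure: the function name `f_statistic` is hypothetical, the inputs are copied from Table 4, and the critical value 2.635 is the one quoted in the text.

```python
# Inputs copied from Table 4; 2.635 is the critical value quoted in the text.
ss_treatment, df_treatment = 1.01657, 3
ss_error, df_error = 4.0756, 276
f_critical = 2.635


def f_statistic(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F ratio)."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


ms_treatment, ms_error, f_ratio = f_statistic(
    ss_treatment, df_treatment, ss_error, df_error
)
reject_h0 = f_ratio > f_critical  # right-tailed test

print(f"MS treatment = {ms_treatment:.4f}, MS error = {ms_error:.4f}")
print(f"F = {f_ratio:.2f}, reject H0: {reject_h0}")
```

The ratio computed from the unrounded sums of squares comes out slightly below the tabulated 23.04, presumably because the table divides the rounded mean squares; either way it is far above the critical value.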

Table 3
SW values for different cluster numbers.

Cluster number 2 3 4 5 6 7 8 9 10
Silhouette widths 0.171 0.1476 0.164 0.1233 0.0802 0.1235 0.0781 0.0813 0.1235
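The silhouette width of Eqs. (11) and (12) can be sketched in a few lines. This is a minimal illustration under stated assumptions: Euclidean distance is used as the dissimilarity (the measure used with the FSOM is not restated here), the toy points and labels are invented, and the function names are hypothetical. At least two clusters are required.

```python
from math import dist


def silhouette_values(points, labels):
    """Silhouette value S(i) for every point, following Eq. (11)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            values.append(0.0)  # singleton cluster: S(i) = 0 by convention
            continue
        # a(i): average dissimilarity to the other members of the same cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b(i): smallest average dissimilarity to the members of another cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        values.append(0.0 if denom == 0 else (b - a) / denom)
    return values


def silhouette_width(points, labels):
    """Average silhouette value over all points, following Eq. (12)."""
    values = silhouette_values(points, labels)
    return sum(values) / len(values)


# Invented toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sw_good = silhouette_width(points, [0, 0, 0, 1, 1, 1])
sw_poor = silhouette_width(points, [0, 1, 0, 1, 0, 1])
print(f"SW (natural grouping) = {sw_good:.3f}")
print(f"SW (mixed grouping)   = {sw_poor:.3f}")
```

Repeating the computation for each candidate number of clusters and comparing the resulting SW values mirrors the selection procedure summarized in Table 3.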

Table 5
Correlation analysis for crime clusters.

Cluster vs. cluster    01 vs. 02   01 vs. 03   01 vs. 04   02 vs. 03   02 vs. 04   03 vs. 04
Pearson correlation    0.503       0.187       0.252       0.185       0.305       0.503
Sig. (2-tailed)        0.012       0.381       0.234       0.388       0.147       0.012
Scatter plot           [one scatter plot per cluster pair]

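The decisions drawn from Table 5 can be reproduced approximately with the usual t test for a Pearson correlation, $t = r\sqrt{(n-2)/(1-r^2)}$. The sketch below rests on assumptions that are not stated in the table: each cluster centre is taken to have n = 24 monthly values (inferred from the two-year span of the data), and 2.819 is the standard two-tailed critical value of Student's t for 22 degrees of freedom at the 1% level. Function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing H0: rho = 0 with n paired observations."""
    return r * sqrt((n - 2) / (1 - r * r))


N_MONTHS = 24        # assumption: 24 monthly values per cluster centre
T_CRIT_1PCT = 2.819  # two-tailed critical t, 22 degrees of freedom, alpha = 0.01

# Correlation coefficients copied from Table 5.
reported_r = {
    "01 vs. 02": 0.503, "01 vs. 03": 0.187, "01 vs. 04": 0.252,
    "02 vs. 03": 0.185, "02 vs. 04": 0.305, "03 vs. 04": 0.503,
}

rejections = {
    pair: abs(t_statistic(r, N_MONTHS)) > T_CRIT_1PCT
    for pair, r in reported_r.items()
}
for pair, r in reported_r.items():
    print(f"{pair}: t = {t_statistic(r, N_MONTHS):.2f}, reject H0 at 1%: {rejections[pair]}")
```

Under these assumptions no pair exceeds the 1% critical value, which agrees with the decision not to reject the null hypothesis.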
Table 6
Patterns of crime trends identified by FSOM.

              Cluster 1   Cluster 2          Cluster 3        Cluster 4
Occurrence    55          44                 94               87
Mean          0.47        0.50               0.39             0.34
Variance      0.0166      0.0215             0.0211           0.0034
Trend         [trend curve plot for each cluster]
Label         Typical     Gradual increase   Sharp increase   Wintertime

what kind of duty deployment should be applied for each crime offense. In the following, we discuss each crime trend in detail.

It is clear that cluster 1, with a mean of 0.47 and a variance of 0.0166, shows that the crime frequency in the first year tends to increase gradually until it peaks in summer and then decreases towards the end of the year; the same pattern recurs in the next year but with a lower number of occurrences. This is consistent with the fact that more interactions among people take place during the summer period, increasing the chance of crime (Gorr et al., 2003). The consulting expert in crime offenses thus classifies and labels this crime trend as the ‘Typical’ pattern. Cluster 2, with a mean of 0.50 and a variance of 0.0215, is the cluster with the highest mean and the largest variation. The first half of this trend pattern shows the same tendency as the Typical pattern but with a slightly later decrease; however, the occurrences in the second half are still high. Thus, the expert suggests that the police management should tackle this problem more seriously, as the crime frequency is still rising. This cluster is therefore classified as ‘Gradual Increase’.

Cluster 3, with a mean of 0.39 and a variance of 0.0211, clearly shows an increasing trend during both years and is accordingly labeled ‘Sharp Increase’. It contains the largest number of crime occurrences, so special attention should be paid to this crime pattern. Finally, cluster 4, with a mean of 0.34 and a variance of 0.0034, has the lowest mean and variance; however, its frequency during winter is much higher than in other seasons. Therefore, the crime occurrences belonging to this cluster are categorized as wintertime crimes (Gorr et al., 2003) and the pattern is labeled ‘Wintertime’.

It is worth noting that the occurrences decreased in February 2003 for all cluster patterns, owing to a special annual crime prevention program deployed in Taiwan around the Chinese New Year period, in which duty deployment was significantly reinforced to improve crime prevention efforts. The Typical and Wintertime crime patterns show stable trends because of their regular high-frequency occurrences during the year. Therefore, the police management should apply particular crime prevention policies for both types of crime patterns. The resulting strategy helps police management carry out appropriate crime prevention procedures at the right time and make the best use of police duty deployment. In addition, for the Gradual Increase and Sharp Increase crime patterns, the police should dig deep into specific cases to understand the reasons that keep crime frequency constantly rising. This is the only way that the phenomena described by the broken window theory can be avoided.

5.2. Crime offense analysis

Using the clustering results obtained in Section 5.1, we analyze each of the 14 crime offenses according to its percentage distribution across the clusters. In the analysis, we first find which cluster contains the highest percentage of each crime offense, and then determine whether this percentage is higher than the threshold of 40% set by the senior expert. If it is larger than the threshold, we infer that the trend is remarkable; otherwise, the trend is subordinate remarkable. We summarize the 14 crime offenses in Fig. 4.

By carefully observing Fig. 4, we find that motorcycle theft and injury belong to the Typical pattern, occurring mostly in summer. As mentioned in the previous section, the occurrences belonging to the Typical pattern are caused by the increased interaction between people, as most people like to take part in outdoor activities during this season of the year. Injury refers to physical aggression, including family violence, street violence, etc., and it is the second most common crime offense in juvenile delinquency. According to our analysis, the reason why injury is clustered in the Typical pattern is that teenagers are easily agitated by the hot weather during summer, which increases the probability of quarrels and fights. The outcomes obtained from the FSOM for the Typical pattern in this study agree completely with traditional crime theory. Damage is the only remarkable crime offense in the gradual increase pattern, indicating that the police management should take this issue more seriously by focusing more on this type of offense and working out an effective way of preventing it.

The offences in the sharp increase pattern are the most important issues for the police to pay attention to because they continue to rise over time. These include theft, drugs,

[Bar graph with one group of bars per pattern (Typical, Gradual Increase, Sharp Increase, Wintertime); vertical axis: percentage, 0% to 80%; legend: Theft, Automobile theft, Motorcycle theft, Robbery, Force taking, Intimidation, Sexual offence, Drugs, Firearms crime, Gambling, Injury, Fraud, Damage, Anti-social behavior]

Fig. 4. Bar graph of percentage distribution in every cluster.
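The labelling rule behind Fig. 4 (find the cluster holding the largest share of an offense, then compare that share with the 40% threshold) can be sketched as follows. The percentage values are invented for illustration and do not come from the paper; the function name `classify_offense` is hypothetical.

```python
THRESHOLD = 0.40  # expert-set threshold quoted in Section 5.2


def classify_offense(distribution, threshold=THRESHOLD):
    """Return (dominant cluster, trend strength) for one crime offense.

    `distribution` maps each cluster label to the share of the offense's
    time series assigned to that cluster.
    """
    dominant = max(distribution, key=distribution.get)
    if distribution[dominant] > threshold:
        strength = "remarkable"
    else:
        strength = "subordinate remarkable"
    return dominant, strength


# Hypothetical shares, for illustration only.
example_a = {"Typical": 0.55, "Gradual Increase": 0.15,
             "Sharp Increase": 0.20, "Wintertime": 0.10}
example_b = {"Typical": 0.30, "Gradual Increase": 0.25,
             "Sharp Increase": 0.35, "Wintertime": 0.10}

print(classify_offense(example_a))
print(classify_offense(example_b))
```

The strict comparison follows the wording "higher than the threshold"; how a share of exactly 40% is treated is not specified in the text.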

firearms crime, force taking, and anti-social behavior, which are all extremely violent offences, indicating the seriousness of the situation. It is evident that the current prevention policies and police duty deployments for these crime offenses are not working effectively, as the pattern shows an increasing trend. We strongly recommend that the police take notice of this phenomenon and analyze the situation in more depth. The crime offenses clustered in the Wintertime pattern all belong to property crimes, such as robbery, intimidation and gambling, except for sexual offence (Gorr et al., 2003). Property crimes should be viewed as a representation of offenses that describe material-based criminality in society and usually happen when the victim is not present. The demand for money during the period close to the Chinese New Year is the main reason why people commit property crimes in winter.

The four crime trends obtained from clustering the 14 types of crime offenses using the FSOM model accurately match the actual crime patterns in Taiwan, so the police should develop different specific strategies according to the characteristics of each of the four crime trends, paying special attention to both the gradual increase and sharp increase patterns because of their constantly increasing trend. The information we provide is useful for the police to make better use of their duty deployment, establish more satisfactory law enforcement policies, and focus adequately on the crime offenses that affect Taiwanese society.

5.3. Analysis of crime location distribution

According to the routine activity theory (RAT) (Ratcliffe, 2002), crime location is one of the most important factors when analyzing crime occurrences because it provides a straightforward explanation of the motivation of a crime. RAT indicates that for a crime to occur, three primary factors are required: the presence of a suitable target (place, time and victim), the presence of a possible offender, and the absence of a suitable guardian to prevent the crime from happening (Felson & Lawrence, 1980). That is to say, a crime will not happen if any one of the three primary factors is eliminated. Therefore, analyzing crime locations and distributions can help the police discover valuable information regarding the time and place of crimes. The police can thus be deployed to the right place at the right time to avoid the absence of a suitable guardian and prevent crimes from happening.

In recent years, owing to improvements in transportation and communication, many recidivists can easily shift around nearby precincts to commit crimes repeatedly. It is reasonable for criminals to shift their crime locations to other precincts after the local police enhance law enforcement duties in their own precincts. Consequently, the recidivists’ shifting pattern of crime locations can be predicted in a systematic way (Johnson & Bower, 2004). It is necessary for the police management to be responsible for each precinct and to integrate their manpower to prevent recidivists from continuing to shift from one precinct to another. Unfortunately, there is still no effective inter-precinct duty deployment. Thus, implementing an inter-precinct duty deployment plan will help prevent recidivists from shifting their crime locations and improve the efficiency of police duty deployment in every precinct.

After using the FSOM model to cluster the 14 crime offenses into four clusters, the experimental results show that the gradual increase and sharp increase patterns present unusual and dangerous crime tendencies with a shift around effect. Therefore, in order to reduce and prevent organized crime, we must analyze in more detail these two patterns as they occur in each precinct. We expect to discover similar crime patterns caused by the shift around effect, so we use a visualization method to improve understandability and facilitate analysis. Fig. 5 shows a map of Taiwan with the sharp increase and gradual increase patterns colored in green and purple, respectively.

As shown in Fig. 5, there is an obvious shift around effect, indicated by the presence of the same pattern in more than one precinct. This happens among nearby precincts and even across cities and counties, proving the existence of the shift around effect. Taking the crime offense damage as an example, we notice that the Sharp Increase pattern occurs in two different areas of the island: the northern area (Taipei county, Taoyuan county, Hsinchu county, and Hsinchu city) and the central area (Changhua county, Yunlin county, and Chiayi city). The fact that the same pattern occurs within a single area indicates the presence of the shift around effect, while the distance between the northern and central areas reveals the consequences of the improvements in transportation. The same phenomenon occurs with the gradual increase pattern, but between the central (Hsinchu county and Hsinchu city) and southern (Kaohsiung city, Kaohsiung county, and Pingtung county) areas of the island. Two other special cases are drugs and firearms crime, both of which occur solely on the west coast of the island and present only a gradual increase pattern. This may be caused by the high population density and high economic development of the west coast cities, and by relatively light sentences and punishment. Finally, gambling seems to be almost non-existent on the island, occurring only slightly in Keelung city, because it belongs to the Wintertime pattern, which is not considered by the model.

According to the visual analysis, we can specifically confirm the precincts affected by the shift around effect and help decision makers plan the inter-precinct operations to prevent recidivists from shifting around crime locations. The police can integrate their

[Maps of Taiwan, one panel per crime offense: Automobile theft, Intimidation, Drugs, Sexual offense, Anti-social behavior, Motorcycle theft, Theft, Firearms crime, Damage, Force taking, Fraud, Robbery, Gambling, Injury]

Fig. 5. Sharp increase and gradual increase pattern distribution in each precinct.

force and at the same time focus their operations on the specific precincts indicated by our model, thus greatly improving duty deployment and law establishment.

5.4. Analysis of shift around effect

According to the results of the experiment, the gradual increase and sharp increase patterns showed signs of the shift around effect. Das et al. (1998) indicated the importance of rule discovery and extraction from time series data; if rule discovery is applied to the data obtained, explicit relational rules among precincts and crime categories that take time effects into account can be generated. This could enrich the understanding of the shift around effect in both patterns and improve crime prevention operations (Ratcliffe, 2002; Smyth and Goodman, 1992).

Our research applies temporal linguistic time series rule discovery and the J-measure to discover the precincts with serious crime activities. Each crime offense in each city or county with the shift around effect is represented by a colored light: red, yellow, purple, green, or blue. The first two correspond to the most critical security indexes while the last three are of minor importance, which is why we focus our analysis on the first two lights, red and yellow. Finally, we extract 23 crime causal rules with the shift around effect, shown in Table 7 and Fig. 6.

From the rules shown in Table 7 and Fig. 6, we learn that the most remarkable crime offenses are automobile theft, motorcycle theft, drugs, firearms crime, injury, anti-social behavior, and fraud. The two statements within each relational rule refer to different locations, implying that the crime offense has shifted after T months. This proves the existence of the shift around effect among nearby precincts, or even among distant precincts, due to the improvements in transportation and communication means. This is further emphasized by the colored arrows in Fig. 6, which show the shift around distance of each crime offense. Taking anti-social behavior as an example, the rule

$$\text{The Taichung city is yellow} \;\stackrel{T=1}{\Longrightarrow}\; \text{The Changhua county is yellow}$$

indicates that Changhua county should deploy duties to prevent anti-social behavior one month after Taichung city. Another example of a crime offense with a large shift around effect is fraud, which is spread all over the island across long distances. A possible explanation is that this crime offense is usually committed by telephone, originating in big cities where fraud rings are concentrated. In addition, an interesting phenomenon can be observed for motorcycle theft, whose shift around effect occurs simultaneously from Taipei to Hsinchu and vice versa. This indicates the seriousness of motorcycle theft in the area and alerts the police to concentrate on its prevention.

The large number of rules extracted indicates the seriousness of the shift around effect of the crime offenses belonging to the Gradual Increase and Sharp Increase patterns. It is essential that the police

Table 7
Rules of crime offenses with shift around effect.

Crime categories      T  Rule                                                           Sup. (%)  Con. (%)  J-value
Automobile theft      2  The Chiayi city is yellow ⇒ The Chiayi county is red           23        71        0.1698
                      1  The Pingtung county is yellow ⇒ The Changhua county is yellow  22        83        0.1568
Motorcycle theft      0  The Taipei county is red ⇒ The Hsinchu county is yellow        21        83        0.1584
                      0  The Hsinchu county is yellow ⇒ The Taipei county is red        21        71        0.1383
Drugs                 2  The Pingtung county is yellow ⇒ The Chiayi county is yellow    23        71        0.1698
Firearms crime        2  The Changhua county is yellow ⇒ The Yunlin county is yellow    23        71        0.1339
Injury                1  The Pingtung county is yellow ⇒ The Yunlin county is yellow    22        71        0.0847
                      2  The Pingtung county is yellow ⇒ The Hsinchu county is yellow   23        71        0.0806
Anti-social behavior  1  The Taichung city is yellow ⇒ The Changhua county is yellow    26        86        0.1173
                      0  The Taipei city is red ⇒ The Taichung county is red            21        83        0.1113
                      1  The Taichung city is yellow ⇒ The Taichung county is red       22        71        0.0651
                      2  The Taichung city is yellow ⇒ The Taichung county is red       23        71        0.0606
                      2  The Taichung city is yellow ⇒ The Changhua county is yellow    23        71        0.0439
Fraud                 0  The Changhua county is yellow ⇒ The Kaohsiung county is red    21        83        0.2239
                      0  The Changhua county is yellow ⇒ The Taipei City is yellow      21        83        0.1113
                      1  The Kaohsiung City is yellow ⇒ The Taipei City is yellow       22        71        0.0651
                      1  The Taipei City is yellow ⇒ The Taipei County is yellow        30        78        0.0549
                      2  The Taipei City is yellow ⇒ The Taipei County is yellow        23        83        0.0507
                      2  The YiLan County is yellow ⇒ The Taipei County is yellow       23        83        0.0507
                      0  The Kaohsiung County is yellow ⇒ The Taipei County is yellow   25        75        0.0436
                      0  The Taoyuan County is yellow ⇒ The Taipei County is yellow     21        71        0.0277
                      0  The Tainan County is yellow ⇒ The Taipei County is yellow      21        71        0.0277
                      1  The Kaohsiung City is yellow ⇒ The Taipei County is yellow     22        71        0.0235
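To make the Sup. and Con. columns of Table 7 concrete, the sketch below counts a time-lagged rule of the form "precinct A shows label a in month t ⇒ precinct B shows label b in month t + T" over two monthly label series. It uses the common association-rule definitions of support and confidence, which are an assumption here since the exact counting scheme is not restated in this section; the two six-month series are invented, and only the thresholds 0.2 and 0.7 and the lag range 0 to 2 come from the text.

```python
MIN_SUPPORT = 0.2     # support threshold quoted in the text
MIN_CONFIDENCE = 0.7  # confidence threshold quoted in the text


def lagged_rule_stats(antecedent_series, consequent_series,
                      antecedent_label, consequent_label, lag):
    """Support and confidence of 'A is a at month t => B is b at month t + lag'."""
    if len(antecedent_series) != len(consequent_series):
        raise ValueError("series must have the same length")
    n_pairs = len(antecedent_series) - lag
    if n_pairs <= 0:
        raise ValueError("lag is too large for the series length")

    antecedent_hits = 0
    joint_hits = 0
    for t in range(n_pairs):
        if antecedent_series[t] == antecedent_label:
            antecedent_hits += 1
            if consequent_series[t + lag] == consequent_label:
                joint_hits += 1

    support = joint_hits / n_pairs
    confidence = joint_hits / antecedent_hits if antecedent_hits else 0.0
    return support, confidence


# Invented monthly light colors for two precincts (illustration only).
precinct_a = ["yellow", "green", "yellow", "yellow", "green", "yellow"]
precinct_b = ["green", "yellow", "green", "yellow", "yellow", "green"]

kept_lags = []
for lag in range(3):  # T = 0, 1, 2 months, as in the experiment
    support, confidence = lagged_rule_stats(
        precinct_a, precinct_b, "yellow", "yellow", lag
    )
    if support >= MIN_SUPPORT and confidence >= MIN_CONFIDENCE:
        kept_lags.append(lag)
    print(f"T = {lag}: support = {support:.2f}, confidence = {confidence:.2f}")
```

Whether the thresholds are applied as "at least" or "strictly greater than" is not specified; the sketch uses "at least".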

Fig. 6. Visualization of crime offenses with red and yellow lights. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version
of this article.)
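The J-values used to rank the rules in Table 7 refer to the J-measure of Smyth and Goodman (1992). For a rule "if Y = y then X = x" it is commonly written as

$$J(X; Y = y) = p(y)\left[p(x \mid y)\log_2\frac{p(x \mid y)}{p(x)} + \bigl(1 - p(x \mid y)\bigr)\log_2\frac{1 - p(x \mid y)}{1 - p(x)}\right].$$

The sketch below evaluates this expression. The probabilities in the example are hypothetical: Table 7 does not report the prior probability of the consequent, so its J-values cannot be recomputed from the table alone.

```python
from math import log2


def j_measure(p_antecedent, confidence, p_consequent):
    """J-measure of a rule 'if Y = y then X = x'.

    p_antecedent : p(y), probability that the antecedent holds
    confidence   : p(x | y), probability of the consequent given the antecedent
    p_consequent : p(x), prior probability of the consequent (0 < p(x) < 1)
    """
    def term(posterior, prior):
        # The limit of q * log2(q / p) as q -> 0 is 0.
        return 0.0 if posterior == 0.0 else posterior * log2(posterior / prior)

    return p_antecedent * (
        term(confidence, p_consequent)
        + term(1.0 - confidence, 1.0 - p_consequent)
    )


# Hypothetical probabilities, for illustration only.
weak_rule = j_measure(p_antecedent=0.3, confidence=0.71, p_consequent=0.4)
strong_rule = j_measure(p_antecedent=0.3, confidence=0.86, p_consequent=0.4)
print(f"J (confidence 0.71) = {weak_rule:.4f}")
print(f"J (confidence 0.86) = {strong_rule:.4f}")
```

With the antecedent probability and the prior held fixed, a higher confidence yields a higher J-value, which is why the measure can serve to prioritize rules.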

deploy preventive operations to put a stop to these crime offenses, but with a limited police force, the order in which they are confronted becomes an important issue to consider. Taking anti-social behavior as an example, the rule

$$\text{The Taichung city is yellow} \;\stackrel{T=1}{\Longrightarrow}\; \text{The Taichung county is red}$$

has a J-measure value of 0.0651, while the rule

$$\text{The Taichung city is yellow} \;\stackrel{T=2}{\Longrightarrow}\; \text{The Taichung county is red}$$

has a J-measure value of 0.0606, indicating that the first rule should be considered before the second due to its higher J-measure value. This is reasonable because the first rule’s time interval is smaller, which accords better with the fact that the shift around effect occurs over short intervals of time. Considering distance as a factor, the rule

$$\text{The Taichung city is yellow} \;\stackrel{T=1}{\Longrightarrow}\; \text{The Changhua county is yellow}$$

has a J-measure value of 0.1173, while the rule

$$\text{The Taipei city is red} \;\stackrel{T=0}{\Longrightarrow}\; \text{The Taichung county is red}$$

has a J-measure value of 0.1113, indicating that the first rule should be considered first. This is logical because the distance between Taichung city and Changhua county is smaller than the distance between Taipei city and Taichung county, which coincides with the fact that the shift around effect is more likely to happen between nearby precincts.

Therefore, by using crime rules and J-measure ranking, the police can more accurately predict the location and time of the shift around effect of the crime occurrences, substantially reducing the scope of investigation and allocating their force effectively.

5.5. Strategy-making of police duty deployment

In general, there are two kinds of strategies for police duty deployment. The first consists of deploying the routine duties every precinct needs within itself, while the second consists of special duties conducted by the NPA, the headquarters of police affairs in Taiwan, across all precincts simultaneously for the same goals (e.g. special duties for robbery prevention). The main purpose of routine duties is to prevent the occurrence of crimes belonging to the Typical and Wintertime patterns, whose tendencies are regular and predictable within all precincts. On the other hand, special duties are applied by deploying the same operations at the same time in every precinct to prevent crimes belonging to the gradual increase and sharp increase patterns, because they present the shift around effect. However, special duties usually cannot be sustained for long periods of time because of their great consumption of police duty force. Therefore, crime frequency will drop during periods with special duties but will rise again after they end. In other words, special duties can only combat crimes temporarily and cannot solve the root of the problem.

Moreover, Johnson and Bower (2004) indicated that offenders usually do not target the same neighborhoods but prefer to change target areas in order to reduce the risk of being caught. It is therefore necessary to discover the precincts with serious public security situations and deploy inter-precinct special duties only to those neighboring precincts with shift around effects, for longer periods of time. We carried out specific mining for special duties by crime location analysis and rule analysis, and furthermore illustrated detailed managerial guidance for the four critical clusters of crime patterns. According to our managerial duty guidance, patterns with the shift around effect, namely gradual increase and sharp increase, should employ special operations focused only on the specific precincts appearing in the causal rules. For example, to combat injury, coordination of forces between Pingtung county, Yunlin county, and Hsinchu county would remove the need for NPA intervention, because the latter would perform special duties covering the whole island, leading to a waste of resources. Consequently, our research elicits implicit knowledge of great importance that can help improve the efficiency of police duty deployment and prevent the shift around effect, which has been ignored up to now.

6. Conclusions and future works

The society we live in is a complicated and culturally revolutionized one, where crime problems are rising in an endless stream and their prevention has become a first priority for the police and the government. In this paper, we apply knowledge discovery technologies to linguistic data on the public security index to support decision making for planning police duty deployment. During the process of knowledge discovery we use the FSOM neural network to uncover crime patterns from crime volume data in Taiwan provided by the NPA. We then apply temporal linguistic time series rule discovery and the J-measure to identify the precincts with prominent crime activity and work out a critical value to determine the relative seriousness of the crimes in each precinct.

According to the analysis of the experimental results, we were able to discover the characteristics of four crime patterns, namely typical, gradual increase, sharp increase, and Wintertime, which provide information for the police management to determine the kind of duty deployment that should be applied for each type of crime offense. This allows for better use of the police force and prevents time and other resources from being wasted when the resources employed are not suitable for the occasion. Statistical tests (ANOVA and Pearson correlation coefficient) support our findings by showing that the four crime trends are significantly different, meaning that the four clusters represent four different crime patterns. In addition, by observation and careful analysis we noticed that the Gradual Increase and Sharp Increase patterns show a constant increase, so we paid special attention to them and studied further which crime offenses were more remarkable in each pattern and their respective locations. We also found that many of them did not happen in a single precinct, but extended to neighboring precincts, indicating the presence of a shift around effect. In this way, we could improve understandability and facilitate analysis to help decision makers plan the inter-precinct operations to prevent recidivists from shifting around crime locations. The rules inferred from the data lead to recognition of hidden relationships between crime offenses and locations, and combined with a J-measure value, they facilitated the identification of the shift around effect. This allowed a better and more accurate prediction of the location and time of the shift around effect of the crime occurrences, helping the police to counteract criminal acts more effectively and efficiently. Moreover, it is worth mentioning that the FSOM neural clustering model can be applied to various application domains that deal with vast amounts of linguistic data.

One limitation of the current study is the difficulty of accurately evaluating the performance of the proposed model, so future work could aim at finding quantitative methods to evaluate the degree of crime reduction achieved, the improvement in duty deployment, and the impact on society. Moreover, owing to the limited availability of crime data, the experiment was conducted using data that covered a span of two years. If the period under consideration could be extended, it would provide more longevity to the results obtained.

References

Adderley, R., Townsley, M., & Bond, J. (2007). Use of data mining techniques to model crime scene investigator performance. Knowledge-Based Systems, 20, 170–176.
Alexander, P., Alexander, K., Nikolay, K., & Tatyana, M. (2008). Criminalistic identification of PGM-containing products of mining and metallurgical companies. Forensic Science International, 174(1), 12–15.

Bezdek, J. C., Tsao, E. C. K., & Pal, N. R. (1992). Fuzzy Kohonen clustering networks. In Proceedings of IEEE international conferences on terms, fuzzy systems (pp. 1035–1046).
Brown, D. E., & Hagen, S. (2002). Data association methods with applications to law enforcement. Decision Support Systems, 34, 369–378.
Brown, E. D., & Oxford, B. R. (2001). Data mining time series with applications to crime analysis. IEEE International Conference on Systems, Man, and Cybernetics, 3, 1453–1458.
Chen, L. H., & Lu, H. W. (2001). An approximate approach for ranking fuzzy numbers based on left and right dominance. Computers and Mathematics with Applications, 41, 1589–1602.
Chen, H., Chung, W., Xu, J. J., Wang, G., Qin, Y., & Chau, M. (2004). Crime data mining: A general framework and some examples. IEEE Computer Society, 50–56.
Corcoran, J. J., Wilson, D. I., & Ware, A. (2003). Predicting the geo-temporal variations of crime and disorder. International Journal of Forecasting, 19, 623–634.
Cotofrei, P., & Stoffel, K. (2002). Rule extraction from time series databases using classification trees. In Proceedings of IASTED international conference on applied informatics (pp. 327–332). Innsbruck.
Das, G., Lin, K., Mannila, H., Renganathan, G., & Smyth, P. (1998). Rule discovery from time series. In The fourth international conference on knowledge discovery and data mining (pp. 16–22).
Felson, M., & Lawrence, E. (1980). Human ecology and crime: A routine activity approach. Human Ecology, 8(4), 389–406.
Flexer, A. (2001). On the use of self-organizing maps for clustering and visualization. Intelligent Data Analysis, 5, 373–384.
Greenberg, D. F. (2001). Time series analysis of crime rates. Journal of Quantitative Criminology, 17(4), 291–327.
Gorr, W., & Harries, R. (2003). Introduction to crime forecasting. International Journal of Forecasting, 19, 551–555.
Gorr, W., Olligschlaeger, A., & Thompson, Y. (2003). Short-term forecasting of crime. International Journal of Forecasting, 19, 579–594.
Grubesic, T. H. (2006). On the application of fuzzy clustering for crime hot spot detection. Journal of Quantitative Criminology, 22(1), 77–105.
Harms, S., Li, D., Deogun, J. S., & Tadesse, T. (2002). Efficient rule discovery in a geo-spatial decision support system. In Proceedings of the second national conference on digital government (pp. 235–241).
Harries, R. (2003). Modeling and predicting recorded property crime trends in England and Wales – a retrospective. International Journal of Forecasting, 19, 557–566.
Higgins, C. M., & Goodman, R. M. (1992). Learning fuzzy rule-based networks for function approximation. In Proceedings of the international joint conference on neural networks (pp. 251–256). Baltimore, MD.
Higgins, C. M., & Goodman, R. M. (1991). Incremental learning with rule-based neural networks. In Proceedings of the international joint conference on neural networks (pp. 875–880).
Kohonen, T. (1997). Self-organizing maps. Berlin, Heidelberg: Springer-Verlag.
Kuo, R. J., & Xue, K. C. (1998). A decision support system for sales forecasting through fuzzy neural networks with asymmetric fuzzy weights. Decision Support Systems, 24, 105–126.
Lei, W., & Feihu, Q. (1999). Adaptive fuzzy Kohonen clustering network for image segmentation. In International joint conference on neural networks (IJCNN) (pp. 2664–2667).
Li, S. T., & Kuo, S. C. (2008). Knowledge discovery in financial investment for forecasting and trading strategy through wavelet-based SOM networks. Expert Systems with Applications, 34(4), 935–951.
Lin, C. T., & Lee, C. S. G. (2003). Neural fuzzy system (Pearson Education Taiwan).
Liu, J. (2006). Modernization and crime patterns in China. Journal of Criminal Justice, 34, 119–130.
Oatley, C. G., & Ewart, W. B. (2003). Crimes analysis software: ‘pins in maps’, clustering and Bayes net prediction. Expert Systems with Applications, 25, 569–588.
Oatley, G., MacIntyre, J., Ewart, B., & Mugambi, E. (2002). SMART software for decision makers KDD experience. Knowledge-Based Systems, 15, 323–333.
Osgood, D. W. (2000). Poisson-based regression analysis of aggregate crime rates. Journal of Criminal Justice, 16(1), 21–43.
Paiva, R. P., & Dourado, A. (2004). Interpretability and learning in neuro-fuzzy systems. Fuzzy Sets and Systems, 147, 17–38.
Palocsay, S., Wang, W. P., & Brookshire, R. G. (2000). Predicting criminal recidivism using neural networks. Socio-Economic Planning Sciences, 34, 271–284.
Petropoulos, C., Patelis, A., Metaxiotis, K., Nikolopoulos, K., & Assimakopoulos, V. (2003). SFTIS: A decision support system for tourism demand analysis and forecasting. Journal of Computer Information Systems, 44(1), 21–32.
Ratcliffe, J. H. (2005). Detecting spatial movement of intra-region crime patterns over time. Journal of Quantitative Criminology, 21(1), 103–123.
Ratcliffe, J. H. (2002). Aoristic signatures and the spatio-temporal analysis of high volume crime patterns. Journal of Quantitative Criminology, 18(1), 23–43.
Richard, A., Michael, T., & John, B. (2007). Use of data mining techniques to model crime scene investigator performance. Knowledge-Based Systems, 20(2), 170–176.
Rousseeuw, P. J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20, 53–65.
Rupnik, R., Kukar, M., & Krisper, M. (2007). Integrating data mining and decision support through data mining based decision support system. Journal of Computer Information Systems, 47(3), 89–104.
Smyth, P., & Goodman, R. M. (1992). An information theoretic approach to rule induction from databases. IEEE Transactions on Knowledge and Data Engineering, 4(4), 301–316.
Tsao, C. K., Bezdek, J. C., & Pal, N. R. (1994). Fuzzy Kohonen clustering networks. Pattern Recognition, 27(5), 757–764.
Vesanto, J., & Alhoniemi, E. (2000). Clustering of the self-organizing map. IEEE Transactions on Neural Networks, 11(3), 586–600.
Huntsberger, T. L., & Ajjimarangsee, P. (1990). Parallel self-organizing feature maps for unsupervised pattern recognition. International Journal of General Systems, 16(4),
Visher, C. A., & Weisburd, D. (1998). Identifying what works: Recent trends in crime
357–372. prevention strategies. Crime Law and Social Change, 28, 223–242.
Johnson, S. D., & Bower, K. J. (2004). The stability of space–time clusters of burglary. Wang, Z., Meng, W., Gu, X., Ye, J., Nagai, M., & Cu, G. I. (2002). The research of the
British Journal of Criminology, 44(1), 55–65. police GIS spatial data classification technology based on rough set. In
Kaikhah, K., & Doddameti, S. (2006). Discovering trends in large datasets using Proceeding of the fourth world congress on intelligent control and automation.
neural networks. Applied Intelligence, 24, 51–60. Shanghai, PR China.
Karimi, K., & Hamilton, H. J. (2002). TimeSleuth: A tool for discovering causal and Xue, Y., & Brown, E. D. (2003). Decision model for spatial site selection by criminals:
temporal rules. In Proceedings of the 14th IEEE international conference on tools A foundation for law enforcement decision support. IEEE Transactions on
with artificial intelligence (pp. 375–380). Systems, Man, and Cybernetics – Part C: Applications and Reviews, 33, 78–85.
Kaza, S., Wang, Y., & Chen, H. (2007). Enhancing border security: Mutual Xu, Z., & Khoshgoftaar, T. M. (2004). Identification of fuzzy models of software cost
information analysis to identify suspect vehicles. Decision Support Systems, 43, estimation. Fuzzy Sets and Systems, 145, 141–163.
199–210.
