
We use the burglary data (FBI code 05) for year 2014. There are 14306 events, each with time $t_i$ and location $(x_i, y_i)$.

Model :

$$\lambda(x,y,t)=\mu(x,y)+\sum_i \lambda_r(x-x_i,\,y-y_i)\,\lambda_t(t-t_i)$$

with the stationary background rate density

$$\mu(x,y)=\frac{1}{2\pi L^2 T}\sum_i a_i\,e^{-\sqrt{(x-x_i)^2+(y-y_i)^2}/L}$$

where $T$ is the total duration of the dataset (here 365 days). The two kernels $\lambda_t$ and $\lambda_r$, as well as the background weights $a_i$, are to be inverted. The smoothing length $L$ is also to be optimized. We here follow the approach of Marsan and Lengliné (2008) and use a simple histogram distribution for the two kernels: $\lambda_t(t)=b_k$ for $T_k\le t<T_{k+1}$, and $\lambda_r(r)=c_k$ for $R_k\le r<R_{k+1}$. We use the following discretization in time and distance:

$T=\{0;\,0.1;\,0.2;\,0.5;\,1;\,2;\,3;\,4;\,5;\,7;\,10;\,15;\,20;\,30;\,50;\,100\}$ days, and

$R=\{0;\,0.1;\,0.2;\,0.3;\,0.4;\,0.5;\,0.7;\,1;\,1.5;\,2;\,3;\,5;\,10;\,20\}$ km.
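As an illustration, the model intensity can be evaluated with a few lines of NumPy. This is a minimal sketch, not the code used for the analysis: the function names (`histogram_kernel`, `background_density`, `intensity`) are hypothetical, the bin edges are the ones quoted in the text, and the triggering sum is assumed to run over events strictly earlier than $t$.

```python
import numpy as np

# Bin edges quoted in the text (days for time, km for distance).
T_EDGES = np.array([0, 0.1, 0.2, 0.5, 1, 2, 3, 4, 5, 7, 10, 15, 20, 30, 50, 100.0])
R_EDGES = np.array([0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 1, 1.5, 2, 3, 5, 10, 20.0])

def histogram_kernel(values, edges, heights):
    """Piecewise-constant kernel: heights[k] on [edges[k], edges[k+1]), 0 elsewhere."""
    values = np.atleast_1d(np.asarray(values, dtype=float))
    heights = np.asarray(heights, dtype=float)
    k = np.searchsorted(edges, values, side="right") - 1
    inside = (k >= 0) & (k < len(heights))
    out = np.zeros_like(values)
    out[inside] = heights[k[inside]]
    return out

def background_density(x, y, xs, ys, a, L, T_total):
    """mu(x, y) = sum_i a_i exp(-r_i / L) / (2 pi L^2 T)."""
    r = np.hypot(x - xs, y - ys)
    return np.sum(a * np.exp(-r / L)) / (2 * np.pi * L**2 * T_total)

def intensity(x, y, t, xs, ys, ts, a, b, c, L, T_total):
    """lambda(x, y, t): background plus triggering from events with t_i < t (assumption)."""
    past = ts < t
    r = np.hypot(x - xs[past], y - ys[past])
    trig = histogram_kernel(r, R_EDGES, c) * histogram_kernel(t - ts[past], T_EDGES, b)
    return background_density(x, y, xs, ys, a, L, T_total) + trig.sum()
```

Here `b` holds the 15 values $b_k$ and `c` the 13 values $c_k$ implied by the two edge lists; the background sum deliberately runs over all events, as in the definition of $\mu(x,y)$.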

**Expectation-Maximization algorithm :**

Knowing $L$, the parameters $\{a_i, b_k, c_k\}$ are inverted by Expectation-Maximization. The influence of event $i$ on event $j$ is $\lambda_{ij}=\lambda_r(x_j-x_i,\,y_j-y_i)\,\lambda_t(t_j-t_i)$, and the sum of all the influences of past events on $j$ is $\lambda_j=\sum_{i<j}\lambda_{ij}$. The background rate density for event $j$ is

$$\mu_j=\mu(x_j,y_j)=\sum_i \mu_{ij} \quad\text{with}\quad \mu_{ij}=\frac{a_i\,e^{-\sqrt{(x_j-x_i)^2+(y_j-y_i)^2}/L}}{2\pi L^2 T}$$

(note that the summation is now on all events $i$, thus including events $i>j$, and even $j$ itself). We define the probabilities $\omega_{ij}=\dfrac{\lambda_{ij}}{\mu_j+\lambda_j}$ that $j$ is causally triggered by $i$, and $\omega_{0,ij}=\dfrac{\mu_{ij}}{\mu_j+\lambda_j}$ that $j$ is a background event linked to the background node $i$. These probabilities are normalized by

$$\sum_{i<j}\omega_{ij}+\sum_i \omega_{0,ij}=1 .$$
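The definition of these probabilities can be sketched as follows. This is only an illustrative fragment: the function name `e_step` and the matrix layout (`lam[i, j]` for $\lambda_{ij}$, `mu[i, j]` for $\mu_{ij}$, events sorted by time so that $i<j$ means "earlier") are assumptions made for the example.

```python
import numpy as np

def e_step(lam, mu):
    """Return (omega, omega0) from the pairwise influences.

    lam[i, j]: triggering influence of event i on event j (only i < j is used).
    mu[i, j]:  contribution of background node i to event j (all i are used).
    Events are assumed to be sorted by increasing time.
    """
    lam = np.triu(np.asarray(lam, dtype=float), k=1)   # keep past events only (i < j)
    mu = np.asarray(mu, dtype=float)
    total = mu.sum(axis=0) + lam.sum(axis=0)           # mu_j + lambda_j for each j
    return lam / total, mu / total
```

By construction, each column of `omega` and `omega0` taken together sums to 1, which is the normalization stated in the text.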

**The algorithm iterates the following steps :**

• Expectation : the probabilities $\omega_{ij}$ and $\omega_{0,ij}$ are computed from the estimated kernels. Initially, the probabilities are all taken equal to 1 and normalized according to the normalization $\sum_{i<j}\omega_{ij}+\sum_i\omega_{0,ij}=1$.

• Maximization : knowing these probabilities, the log-likelihood is then

$$f(a,b,c)=-\sum_i a_i-\sum_i\int_0^{T-t_i}dt\,\lambda_t(t)+\sum_{i,j}\omega_{0,ij}\ln\mu_{ij}+\sum_{i,\,j>i}\omega_{ij}\ln\lambda_{ij} .$$

Maximizing $f$ gives $a_i=\sum_j\omega_{0,ij}$,

$$b_k=\frac{\Omega_k}{\sum_i\Delta_{i,k}} \quad\text{with}\quad \Omega_k=\sum_{i,j\,/\,T_k\le t_j-t_i<T_{k+1}}\omega_{ij},$$

where $\Delta_{i,k}=T_{k+1}-T_k$ if $T-t_i\ge T_{k+1}$, $\Delta_{i,k}=T-t_i-T_k$ if $T_k\le T-t_i<T_{k+1}$, and $\Delta_{i,k}=0$ if $T-t_i<T_k$, and

$$c_k=\frac{\Omega'_k}{S_k\sum_{k'}\Omega'_{k'}} \quad\text{with}\quad \Omega'_k=\sum_{i,j\,/\,R_k\le r_{ij}<R_{k+1}}\omega_{ij} \quad\text{and}\quad S_k=\pi\left(R_{k+1}^2-R_k^2\right).$$
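The Maximization updates translate almost directly into array operations. The sketch below is an assumption-laden illustration rather than the original implementation: `exposure` and `m_step` are hypothetical names, events are assumed sorted by time, and `numpy.histogram` is used for the weighted sums (its last bin is closed on the right, a minor difference from the half-open bins of the text).

```python
import numpy as np

def exposure(ts, t_edges, T_total):
    """Delta[i, k]: part of time bin k that lies inside the window left after event i."""
    t_edges = np.asarray(t_edges, dtype=float)
    remaining = (T_total - np.asarray(ts, dtype=float))[:, None]   # T - t_i
    lo, hi = t_edges[:-1][None, :], t_edges[1:][None, :]
    # full width if T - t_i >= T_{k+1}; truncated if inside the bin; 0 if before it
    return np.clip(np.minimum(hi, remaining) - lo, 0.0, None)

def m_step(omega, omega0, ts, xs, ys, t_edges, r_edges, T_total):
    """One update of (a, b, c) from the current probabilities."""
    t_edges = np.asarray(t_edges, dtype=float)
    r_edges = np.asarray(r_edges, dtype=float)
    i, j = np.triu_indices(len(ts), k=1)            # all pairs with i < j
    w = omega[i, j]
    dt = ts[j] - ts[i]
    r = np.hypot(xs[j] - xs[i], ys[j] - ys[i])

    a = omega0.sum(axis=1)                          # a_i = sum_j omega0_ij
    Omega, _ = np.histogram(dt, bins=t_edges, weights=w)
    Omega_r, _ = np.histogram(r, bins=r_edges, weights=w)
    b = Omega / exposure(ts, t_edges, T_total).sum(axis=0)
    S = np.pi * (r_edges[1:] ** 2 - r_edges[:-1] ** 2)   # ring areas S_k
    c = Omega_r / (S * Omega_r.sum())
    return a, b, c
```

With this normalization $\sum_k c_k S_k=1$, so $\lambda_r$ integrates to one over the plane, which is what allows the spatial integral to drop out of the second term of $f$. The sketch does not guard against empty bins (zero exposure or zero total weight), which a real implementation would need to handle.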

Convergence is tested by requiring that all non-zero values $b_k$ and $c_k$ change by less than 5 % in logarithm, e.g., $\left|\dfrac{\ln b_k}{\ln b_k'}-1\right|<0.05$, where $b_k$ is the value updated during the Maximization step, and $b_k'$ is the value prior to this step.

The smoothing length can also be optimized during the maximization step, with

$$L=\frac{\sum_{ij}\omega_{0,ij}\sqrt{(x_i-x_j)^2+(y_i-y_j)^2}}{2\sum_{ij}\omega_{0,ij}} .$$

However, doing so leads to the trivial solution $L=0$ and $b_k=0$, implying that $\omega_{ij}=0$, $\omega_{0,ij}=0$ if $i\ne j$, and $\omega_{0,ii}=1$, which has no predictive value. This solution is a global maximum ($f\to\infty$ when $L\to0$). We therefore test two distinct approaches: (model type 1) we modify the model by imposing $\omega_{0,ij}=0$ if $i$ and $j$ are co-located, and invert $L$, similarly to Mohler (2014); (model type 2) we keep $L$ fixed to an a priori value, and keep the best $L$ after comparing the models with a cross-validation method. The 1st approach gives a best $L=0.024$ km. For the 2nd, we use the burglary data from the first 81 days of 2015 (1/1/2015 to 22/3/2015) and compute the log-likelihood on this time period for the intensity $\lambda(x,y,t)$ predicted by the 2014 data alone. We also cross-validate the models of type 1 to compare the two approaches. We find the best $L$ value to be 0.1 km for the models of type 2, the cross-validation giving a better fit to the 2015 data than the models of type 1, cf. Figure 1. Cross-validation with $L=0.024$ km in the case of the 1st approach gives a poor fit.
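The update of $L$ can be recovered from the expression of $f$. The short derivation below is a sketch relying only on the definitions of $\mu_{ij}$ and $f$ given earlier, with $r_{ij}$ (our shorthand) denoting the distance between events $i$ and $j$. Only the term $\sum_{i,j}\omega_{0,ij}\ln\mu_{ij}$ depends on $L$:

$$\begin{aligned}
\ln\mu_{ij} &= \ln a_i-\frac{r_{ij}}{L}-2\ln L-\ln(2\pi T),\\
\frac{\partial f}{\partial L} &= \sum_{i,j}\omega_{0,ij}\left(\frac{r_{ij}}{L^2}-\frac{2}{L}\right)=0
\;\;\Longrightarrow\;\;
L=\frac{\sum_{i,j}\omega_{0,ij}\,r_{ij}}{2\sum_{i,j}\omega_{0,ij}} .
\end{aligned}$$

The same expression shows where the degenerate solution comes from: when each event is attributed to its own background node ($\omega_{0,ii}=1$, $r_{ii}=0$), the term $-2\ln L$ grows without bound as $L\to0$.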

We show in Figure 2 the two interaction kernels $\lambda_t$ and $\lambda_r$, for the type 2 model with $L=0.1$ km. Interaction is practically negligible apart from near-repeats within less than a day of each other, and accounts for only 1.8 % of all events.

We computed a second set of cross-validations, this time by also including the 2015 data in the triggering part: the model parameters in $\lambda(x,y,t)=\mu(x,y)+\sum_i\lambda_r(x-x_i,\,y-y_i)\,\lambda_t(t-t_i)$ are unchanged (in particular the background rate-density $\mu(x,y)$ is thus estimated from the 2014 data only), but the triggering term $\sum_i\lambda_r(x-x_i,\,y-y_i)\,\lambda_t(t-t_i)$ is now computed by summing over both 2014 and 2015 data. Remarkably, the log-likelihood is systematically found to be lower with this approach, see Table 1. This is counter-intuitive, as using more recent data to update the triggering term is expected to improve the prediction. A closer look at the time series (Figure 3) shows that there were significantly fewer events in the first 81 days of 2015 than predicted from the 2014 data. Since including the new 2015 events in the calculation of the triggering term results in a larger predicted number, doing so only strengthens the over-estimation.
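The effect on the predicted number of events can be made concrete with a small sketch. It assumes, as an illustration only, that the intensity is integrated over the whole plane (so that $\mu$ contributes $\sum_i a_i/T$ per unit time and $\lambda_r$ integrates to 1); the function name `expected_count` is hypothetical.

```python
import numpy as np

def expected_count(t0, t1, ts, a, b, t_edges, T_total):
    """Expected number of events in [t0, t1), integrated over the whole plane.

    ts: times of the events entering the triggering sum; a: background weights;
    b: histogram values of lambda_t on the bins defined by t_edges.
    """
    ts = np.asarray(ts, dtype=float)
    t_edges = np.asarray(t_edges, dtype=float)
    background = np.sum(a) * (t1 - t0) / T_total
    # time lags covered by the window for each event (empty if the event is after t1)
    lo = np.clip(t0 - ts, 0.0, None)[:, None]
    hi = np.clip(t1 - ts, 0.0, None)[:, None]
    overlap = np.clip(np.minimum(hi, t_edges[1:]) - np.maximum(lo, t_edges[:-1]), 0.0, None)
    return background + np.sum(overlap * np.asarray(b, dtype=float))
```

Since every term of the triggering sum is non-negative, passing the 2015 events in `ts` in addition to the 2014 ones can only increase the returned value, which is the mechanism described in the text.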

The over-estimation of the number of events in 2015 highlights the fact that, practically speaking, one would like to predict just where, rather than both when and where, the next event will occur, so that only the predicted marginal density $\dfrac{\lambda(x,y,t)}{\iint dx\,dy\,\lambda(x,y,t)}$ is of actual interest, instead of the complete space-time rate-density $\lambda(x,y,t)$. We therefore introduce a second measure of the capacity of the model to predict the future locations of the subsequent events, as $g(a,b,c)=\sum_i\ln\lambda(x_i,y_i,t_i)$, where the summation is done on the 2015 events only, and the triggering term of $\lambda(x,y,t)$ is computed by summing over all preceding events (including those of 2015). We show in Figure 4 that type 2 models perform better than type 1, but more importantly that a simple (exponential) smoothing of all the previous events actually does better in predicting the location of the next event, although the improvement is only marginal. This is particularly surprising, since accounting for memory in the system should a priori improve the prediction compared to a memory-less prediction as done with a simple smoothing. This is here due to a change in the spatial properties of the burglary events in 2015 (compared to 2014), which are found to be more distant from each other: the mean distance between any two burglaries was 13.58 km in 2014, and 14.05 km in 2015. For both years, consecutive events tend to be less distant than average, but there still exists a significant difference between the two time periods, cf. Figure 5. Exploiting the temporal clustering as done with our models will lead to predicted events too close to the immediately preceding (past) event, while the simple smoothing will predict a slightly larger distance, hence a better prediction.
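The two distance statistics used in this comparison are straightforward to compute. The sketch below uses hypothetical function names and assumes the coordinate arrays are sorted by event time.

```python
import numpy as np

def mean_pair_distance(xs, ys):
    """Mean distance between any two events (all pairs i < j)."""
    xs, ys = np.asarray(xs, dtype=float), np.asarray(ys, dtype=float)
    i, j = np.triu_indices(len(xs), k=1)
    return float(np.hypot(xs[i] - xs[j], ys[i] - ys[j]).mean())

def mean_lag_distance(xs, ys, n):
    """Mean distance between events i and i+n, i.e. pairs separated by n-1 events."""
    xs, ys = np.asarray(xs, dtype=float), np.asarray(ys, dtype=float)
    return float(np.hypot(xs[n:] - xs[:-n], ys[n:] - ys[:-n]).mean())
```

For a catalogue of about 14000 events the all-pairs version builds on the order of $10^8$ index pairs at once, so in practice it would probably have to be evaluated in chunks.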

These results cast strong doubts on the capacity of the models proposed here to outperform simple

hotspot maps obtained by smoothing, for the dataset analyzed. The triggering contribution to the

occurrence of future events is small (it accounts only for 1.7 % for the best model). Accounting for

memory in the system therefore can only provide a very modest contribution to the effectiveness of

the prediction scheme.

More importantly, it is assumed that the dynamics of the process stays the same over time. Possible non-stationarity of the process is thus clearly an issue, as it will prevent the use of past information to predict the future. This is for example experienced in this analysis, as 2015 burglary events are clearly not distributed (in time and in space) as they were in 2014. This non-stationarity is likely due to uncontrolled evolutions in the way these acts are performed, but, in situations where new prediction algorithms are set up and exploited by police patrols, could also be a response by burglars to such a change. Unlike natural processes like earthquakes, analyses like the one presented here could therefore have the ability to modify the observed process, making it more difficult to correctly predict future events.

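| $L$ (km) | $\bar{\omega}_0$ | Difference in log-likelihood |
|---|---|---|
| 0.01 | 100 % | $<10^{-15}$ |
| 0.02 | 99.9 % | $<10^{-7}$ |
| 0.05 | 98.7 % | -0.012 |
| 0.1 | 98.1 % | -0.017 |
| 0.2 | 93.5 % | -0.056 |
| 0.4 | 73.5 % | -0.17 |
| 1 | 45.8 % | -0.33 |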

**Table 1 :** percentage of background events $\bar{\omega}_0$ and difference in cost function $-f/N$ between cross-validation with and without use of the 2015 data, as a function of the smoothing length $L$, for type 2 models. In all cases the likelihood is lower when including the 2015 data to compute the triggering intensity.

**Figure 1 :** mean of cost function ($-f/N$) for various values of the smoothing length $L$ used to compute the background rate-density $\mu(x,y)$, for the two approaches described in the text, obtained by cross-validation. The model of type 1 with optimized $L$ has $L=0.024$ km.

**Figure 2 :** Interaction kernels $\lambda_t$ (top graphs) and $\lambda_r$ (bottom graphs) for model type 2 with $L=0.1$ km. The two dashed lines show power-laws with exponents -1.5 (for $\lambda_t$) and -7 (for $\lambda_r$).

**Figure 3 :** number of events (in blue) and predicted number, using (magenta) or not using (green) the 2015 events in the triggering term summation.

**Figure 4 :** difference in the cost function $-g$ normalized by the number of events to be predicted, compared to type 2 model predictions, for the two simple smoothings of the 2014 data (blue) and of the 2014 and 2015 data up to the prediction time (red). The simple smoothing is done using an exponential kernel. For smoothing lengths up to 0.1 km, the simple smoothing performs better than the more sophisticated model proposed here that accounts for memory effects.

**Figure 5 :** mean distance between pairs of events separated by $(n-1)$ events, for the two time periods analyzed separately.
