
Accident Analysis and Prevention 37 (2005) 870–881

Experimental evaluation of hotspot identification methods


Wen Cheng ∗ , Simon P. Washington 1
Civil Engineering and Engineering Mechanics, University of Arizona, 1209 E. Second St., Tucson, AZ 85721-0072, USA

Received 7 April 2005; received in revised form 10 April 2005; accepted 10 April 2005

Abstract

Identifying crash “hotspots”, “blackspots”, “sites with promise”, or “high risk” locations is standard practice in departments of transportation
throughout the US. The literature is replete with the development and discussion of statistical methods for hotspot identification (HSID).
Theoretical derivations and empirical studies have been used to weigh the benefits of various HSID methods; however, a small number of
studies have used controlled experiments to systematically assess various methods.
Using experimentally derived simulated data, which are argued to be superior to empirical data for this purpose, three hot spot identification methods
observed in practice are evaluated: simple ranking, confidence interval, and Empirical Bayes. With simulated data, sites with promise are
known a priori, in contrast to empirical data where high risk sites are not known for certain.
To conduct the evaluation, properties of observed crash data are used to generate simulated crash frequency distributions at hypothetical
sites. A variety of factors is manipulated to simulate a host of ‘real world’ conditions. Various levels of confidence are explored, and false
positives (identifying a safe site as high risk) and false negatives (identifying a high risk site as safe) are compared across methods. Finally,
the effects of crash history duration in the three HSID approaches are assessed.
The results illustrate that the Empirical Bayes technique significantly outperforms ranking and confidence interval techniques (with certain
caveats). As found by others, false positives and negatives are inversely related. Three years of crash history appears, in general, to provide
an appropriate crash history duration.
© 2005 Elsevier Ltd. All rights reserved.

Keywords: Safety; Sites with promise; Empirical Bayesian analysis; Crash history; Hot spot identification

∗ Corresponding author. Tel.: +1 520 621 4686; fax: +1 520 621 2550. E-mail addresses: wencheng@u.arizona.edu (W. Cheng), simonw@engr.arizona.edu (S.P. Washington).
1 Tel.: +1 520 621 4686; fax: +1 520 621 2550.
doi:10.1016/j.aap.2005.04.015

1. Background

The transportation system (consisting of road segments, intersections, ramps, interchanges, etc.) in general does not perform homogenously with respect to safety. Not surprisingly, heterogeneity in the driving population, roadside features, weather, traffic conditions, and design considerations leads to heterogeneity in crash frequencies and rates. Because of a desire and mandates to provide a safe driving environment, professionals are charged with identifying and improving "high risk" locations. A difficulty arises from the inability to differentiate between sites that are truly high risk and sites that happen to have experienced a random "up" fluctuation in crashes during a period of observation.

There is a fairly extensive literature focused on methods for the identification of "blackspots", "sites with promise", "high risk" sites, or "hotspots" (hereafter referred to as HSID for hot spot identification), covering an exhaustive set of issues. Some papers address regression to the mean issues, while others address crash outcome versus total crash modeling. Some discuss the application of Bayesian methods, while others try to make sense of cross-sectional data. There are papers that discuss the notion of 'potential accident reduction' and its role in hot spot identification, and papers that address issues surrounding crash severity. Specifically, previous research (Persaud and Hauer, 1984; Persaud, 1986, 1988; Hauer, 1997) has shown that methods relying on a simple ranking of crash counts or crash rates produce large numbers of false positives – due to the random fluctuation of crashes from year to year – leading to the attempted remediation of safety
problems at relatively safe locations. In addition, an excessive number of false negatives allows truly hazardous locations to escape identification. These errors result in inefficient use of federal and/or state aid and local government resources applied to safety improvements. Additionally, the selection of truly hazardous locations using the method of simple ranking of crash counts is relatively arbitrary; that is, the degree (or percentage) of difference between "correctable" and "average" sites is often extremely small and quantitative, not qualitative.

This observation highlights an important issue, which warrants further exploration. The separation of "safe" and "unsafe" sites (intersections, road segments, etc.) is wholly arbitrary. A binary condition does not describe the difference between safe and unsafe sites, and in fact the difference in safety across all sites being compared is continuous in nature. That is, a safety performance function underlying a set of sites varies continuously from the lowest crash frequency (or rate) to the highest crash frequency (or rate), the collection of which results in a probability distribution (or density) of crashes. In this context it is extremely difficult to identify clear differences between safe and unsafe sites, since these two designations will lie alongside one another in an ordered distribution of all sites, and their difference in safety (crash counts or rates) may be very small. Some of the implications of the arbitrary selection of safety and unsafety of sites are discussed later, but we impart at this point that identifying high-risk sites is akin to establishing a level of confidence in a statistical hypothesis test. In practice, the threshold between high-risk and non-high-risk sites is determined by available resources, with the 'highest' ranked sites being improved until safety resources are exhausted.

Hauer and Persaud (1984) drew an analogy between the first stage of identification of black-spots and a sieve, and discussed how to measure the performances of various methods of identifying hot-spot sites. On the basis of this study, Higle and Hecht (1989) conducted a simulation experiment to evaluate and compare techniques for the identification of hazardous locations in terms of crash rates. Maher and Mountain (1988) also used a simulation-based approach to compare methods, including ranking of sites on the basis of annual accident totals (AAT) and potential accident reduction (PAR). Subsequent work by Bauer and Harwood (2000), Hadayeghi et al. (2003), and Miaou and Lord (2003) has shown that safety performance functions may be curvilinear with respect to vehicle miles travelled (VMT) and are useful for assessing the risk of various sites.

The first part of this paper compares alternative HSID methods and represents a natural continuation of the Higle and Hecht study, with a number of important differences and unique contributions. First, crash frequency instead of crash rate data are used to assess hotspot identification techniques. Second, the paper compares the performance of simple ranking, classical confidence intervals, and empirical Bayesian techniques in terms of percent false negatives and positives. Third, several practical empirical crash distributions from the state of Arizona are selected to represent a realistic range of 'base' crash data. Finally, several degrees of crash heterogeneity are examined in the experimental evaluation.

In addition to evaluating the performance of various HSID methods, this paper also evaluates the effect of the crash history duration employed in the three HSID methods. Little research has been dedicated to exploring the effect of crash history duration. May (1964) first discussed the issue of how many years of crash data should be analyzed when determining crash-prone locations. He explored the difference between sorts of average crash counts with t increasing up to 13 years. He concluded that the difference diminishes as t increases, while the marginal benefit of increasing t declines. The "knee" of the curve is said to occur at t = 3 years. Several researchers (Bauer and Harwood, 2000; Persaud, 1999) also used the same period of crash history records to address different road safety issues. In this paper, the effect of crash history duration on HSID results is evaluated experimentally.

2. Hot spot identification methods

A site (road segment, intersection, interchange, etc.) may experience relatively high numbers of crashes due to: (1) an underlying safety problem, for example, a high level of traffic exposure or the nature of the site; or (2) a random "up" fluctuation in crash counts during the observation period. Simply observing unusually high crash counts does not indicate which of the two conditions prevails at the site. Some researchers have suggested using other measures. For example, McGuigan (1981, 1982) proposed the use of potential for accident reduction, defined as the difference between the observed and expected number of crashes at a site given exposure. Hakkert and Mahalel (1978) proposed that blackspots should be defined as those sites whose accident frequency is significantly higher than expected at some prescribed level of significance. Mahalel et al. (1982) proposed that the road sites selected for treatment should maximize the expected total accident reduction by treatment. In light of these various definitions, it is necessary to articulate the objective of hot spot identification as applied in this study:

The objective of hot spot identification (HSID) is to identify transportation system locations (road segments, intersections, interchanges, ramps, etc.) that possess underlying correctable safety problems, and whose effect will be revealed through elevated crash frequencies relative to similar locations.

Two aspects of the previous statement are noteworthy. First, it is possible to observe an unsafe site that does not
reveal elevated crash frequencies – these are termed false negatives. It is also possible to observe elevated crash frequencies at a relatively safe site – false positives. False positives, if acted upon, lead to investment of public funds with little to no safety benefit. False negatives lead to missed opportunities for effective safety investments. As one might expect, correct determinations include identifying a safe site as "safe" and an unsafe site as "unsafe".

For evaluative purposes, a HSID method is sought that produces the smallest proportion of false negatives and false positives. Hence, the percentages of false negatives, false positives, and overall misidentifications (false positives plus false negatives) are used to compare the performances of three commonly implemented techniques: (1) simple ranking of sites, (2) classically based confidence intervals, and (3) empirical Bayesian methods.

The simple ranking (SR) method is the most straightforward. Applying this method, a set of locations is ranked in descending order of crash frequencies (or counts, K), and the unsafe sites are then identified for engineering evaluation. This method, for example, is implemented in the current version of the Arizona Local Government Safety Project (ALGSP) Model, which is available for local jurisdictions to conduct HSID (Carey, 2001) in the state of Arizona. Note that this method does not rely on crash potential (observed minus expected) but on overall crash frequency. It is generally assumed, however, that similar sites are being compared with one another.

A second method for HSID is based on classical statistical confidence intervals (denoted CI) (Laughlin et al., 1975). Location i is identified as unsafe if the observed crash count Ki exceeds the observed average of counts of comparison (similar) locations, µ, with a level of confidence equal to δ; that is, Ki > µ + Kδ S, where S is the standard deviation of the group of comparison sites. In practice a value of δ is typically 0.90, 0.95, or 0.99, and depends upon the actual situation and considerations such as the number of sites, amount of safety investment resources, etc. These values serve as approximations, since they are borrowed from the normal distribution function and thus have no special meaning in terms of the distribution of true crash counts, which typically follows a Poisson or negative binomial distribution.

The application of empirical Bayesian methods (EB) for HSID has been developed and applied more recently than the SR and CI methods. By accounting for both crash history and expected crashes for similar sites, EB methods have been shown to offer improved ability to identify "high-risk" sites. Hauer et al. applied Empirical Bayes (EB) methods to estimate the safety at signalized intersections (Hauer et al., 1988), Persaud evaluated crash potential of Ontario road sections (Persaud, 1991) and ranked sites for potential safety improvements (Persaud, 1999), and Higle and Witkowski (1988) presented a supplemental EB technique that makes use of crash rates. Most of these research studies yielded favorable results in terms of identifying hotspots, but the range of conditions within each study was quite small, most relied on empirical data where actual safety is not known for certain, and a range of input assumptions was not tested, as is done in this study.

The empirical Bayesian method rests on the following logic. Two assumptions about crash occurrence are first needed, which can be traced back to those of Morin (1967) and Norden et al. (1956):

Assumption 1. At a given location, crash occurrence obeys the Poisson probability law. That is, Px|λ denotes the probability of recording x crashes at a site where their expected number is λ, where Px|λ = λ^x e^(−λ)/x!.

Assumption 2. The probability distribution of the λ's of the population of sites is gamma distributed, where g(λ) is denoted as the gamma probability density function, and is typically modeled as a function of site covariates.

On the basis of the above assumptions, the probability that a randomly selected site records x crashes is approximated by the negative binomial (NB) probability distribution.

In the empirical Bayesian technique, the estimate of the long-term safety of an entity is obtained using two clues, that is, the historical crash record of the entity and the expected number of crashes obtained from a safety performance function for similar sites. If the count of crashes (x) obeys the Poisson probability law and the distribution of the λ's in the reference population is approximated by a gamma probability density function, an estimator of λ, the long-term mean crashes for a site, is given by (Hauer, 1997; Harwood et al., 2000; Shen and Gan, 2003):

λi = α E{λ} + (1 − α) xi    (1)

where α is the weight factor, which can be expressed as follows:

α = E{λ} / (E{λ} + VAR{λ})    (2)

An important issue associated with these two equations is that the second of the two clues, crash history, significantly affects the estimate of λ, since longer crash histories tend to be more stable (in crashes per year) than shorter crash histories. Thus, different historical crash records yield different estimates E{λ} and VAR{λ}, and subsequently different identification error rates (false positives and false negatives). Because of its importance in HSID, the effect of crash history duration is examined in this study.
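To make the three decision rules concrete, the sketch below flags sites with each method for a single observation period: SR ranks raw counts, CI applies the threshold Ki > µ + Kδ S, and EB ranks the weighted estimates of Eqs. (1) and (2). This is only an illustrative sketch in Python; the function names, the use of a normal quantile for Kδ, and the use of the sample mean and variance of the comparison group as stand-ins for E{λ} and VAR{λ} are assumptions, not details taken from the study.

```python
import numpy as np
from scipy.stats import norm

def simple_ranking(counts, top_fraction=0.10):
    """SR: flag the sites with the highest observed crash counts."""
    n_flag = max(1, int(round(top_fraction * len(counts))))
    order = np.argsort(counts)[::-1]              # descending by observed count
    return set(order[:n_flag].tolist())

def confidence_interval(counts, delta=0.90):
    """CI: flag site i when K_i > mu + K_delta * S (normal approximation)."""
    mu, s = np.mean(counts), np.std(counts, ddof=1)
    k_delta = norm.ppf(delta)                     # e.g. about 1.28 for delta = 0.90
    return {i for i, k in enumerate(counts) if k > mu + k_delta * s}

def empirical_bayes(counts, e_lambda, var_lambda, top_fraction=0.10):
    """EB: rank sites by lambda_i = alpha*E{lambda} + (1 - alpha)*x_i, Eqs. (1)-(2)."""
    alpha = e_lambda / (e_lambda + var_lambda)
    estimates = alpha * e_lambda + (1.0 - alpha) * np.asarray(counts, dtype=float)
    n_flag = max(1, int(round(top_fraction * len(counts))))
    order = np.argsort(estimates)[::-1]
    return set(order[:n_flag].tolist())

# One hypothetical observation period for ten comparison sites
obs = np.array([5, 9, 7, 12, 10, 8, 13, 6, 19, 23])
print(simple_ranking(obs))
print(confidence_interval(obs))
print(empirical_bayes(obs, e_lambda=obs.mean(), var_lambda=obs.var(ddof=1)))
```

In the experiments that follow, the flagging fraction plays the role of δ: flagging the top 10, 5, or 1% of sites corresponds to δ of 0.90, 0.95, or 0.99.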
3. Description of simulation experiment

The main objective of this experiment is to quantify and assess the predictive performance of the SR, CI, and EB HSID methods and to assess the effect of crash history duration under a wide range of realistic conditions. Many aspects of the simulation experiment require careful attention, such as
the reason for simulating data, the determination of sample sizes, how to characterize crash data, and the reliability of tests. These aspects of the experiment are now discussed in turn.

3.1. Why simulate data when empirical data are plentiful?

One might ask: why not use empirical data instead of simulated data to test the performance of HSID methods? The answer is straightforward. When analyzing real data, i.e., crash counts, the analyst never knows a priori which sites are truly hazardous. This leads to the very difficult situation of trying to count false positives and negatives without knowing which sites are truly safe and 'unsafe'. In contrast, in a simulation it is possible to establish a priori which sites are hazardous and assess whether HSID methods can correctly identify them. The simulation approach, however, requires considerable care in 'constructing' the crash data so that they are convincingly similar to empirical crash data.

3.2. Ground rules for simulation experiment

The simulation experiment consists of the following specific steps, which are described in greater detail following this summary:

(1) Generate mean crash frequencies from real data. Crash datasets from Arizona, which represent a range of in situ crash data, are first obtained. These data are used to determine various shapes of distributions of crash site means (λ's). Gamma distributions are fit to the observed data to capture the heterogeneity in site crash means. The gamma-distributed means are denoted TPM's, for True Poisson Means, and represent the means of crashes across a collection of sites. To take advantage of 'large sample statistical properties', 1000 TPM's are generated for each of the crash distributions analyzed.
(2) From TPM's, generate random Poisson samples. Thirty independent random numbers are generated from each TPM. Specifically, for each of the 1000 sites the TPM is used to randomly generate 30 crash counts that represent OBSERVED data for 30 different observation periods (e.g. years).
(3) Evaluate HSID performance. By knowing the true state of safety for sites (the TPM's), and having observed data (the randomly generated Poisson counts) for 30 observation periods, the performance of HSID methods can be tested. The following steps are used to conduct the evaluation:
    (a) SR, CI, and EB methods are applied in separate simulation runs to rank sites for improvement. These are applied and analyzed column-wise (for a single observation period, as shown in Table 1).
    (b) For the Bayesian runs, it is assumed that rows (data across observation periods for the same site) can also be used to represent the comparison group in order to calculate E(x) and VAR(x). This implies that the analyst has accommodated for covariates and is able to estimate an expected value for a site that accounts for things such as exposure, geometrics, etc.
    (c) For the various hot spot thresholds, false positives, false negatives, and total misidentifications in percent are computed.
(4) Evaluate the effect of crash history duration. In the CI, SR, and EB methods the analyst must decide how long a history to use for calculations. In this experiment the effects of various crash histories (1 year through 10 years of data) on performance are evaluated.

3.3. Generating mean crash frequencies from empirical data

As stated previously, observed crash data are used to inform the simulation of the TPM's. Herein, 6 years (January 1995 through December 2000) of crash counts from intersections in Apache, Gila, Graham, Lapaz, Pima, and Santacruz counties in the state of Arizona are used. The cumulative distributions (i.e., cumulative percent) of crash counts for the different sites are shown in Fig. 1 (one plot for each of the six counties). Three types of characteristically different underlying cumulative distributions of TPM's were observed: an approximately exponential shape (denoted E), an approximately linear shape (denoted L), and an approximately sigmoidal shape (denoted S). These three distributional shapes are meant to reflect different types of facilities that have varying safety performance functions, where the safety performance function characterizes the collection of sites.

In addition, two levels of heterogeneity in crash counts were observed: low heterogeneity (denoted 1), where the range in observed crash counts is less than 20 crashes (difference between the highest and lowest crash count in a single observation period), and high heterogeneity (denoted 2), where the range is in excess of 50 crashes. These two groups are meant to represent low exposure and high exposure sites.

The cumulative distributions used to generate the TPM's are labeled as E1, E2, L1, L2, S1, and S2, and are shown in Fig. 1. For example, E2 represents a collection of sites with an exponentially shaped crash distribution with high heterogeneity in TPM's. These six data sets, which by good luck provided these distinctive distributions, were selected from various jurisdictions within Arizona to try to represent the range of underlying characteristics related to true accident count distributions, with the intent of making the results gained from this experiment applicable across a variety of typical situations.

3.4. Generation of random Poisson samples from TPM's

The empirical cumulative TPM's shown in Fig. 1 represent the data required to support Assumption 1. Using these data, observed crash counts were randomly generated to represent observed data for a given observation period.
Table 1
Simulated crash counts for 30 sites and 16 observation periods
Site TPM Observation period
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

1 4 5 1 4 1 2 7 4 3 4 4 2 1 1 5 5 6
2 8 5 9 8 6 8 4 9 9 5 4 8 8 9 9 13 8
3 8 12 7 10 5 5 7 11 8 8 8 11 6 6 7 8 7
4 9 12 9 10 16 8 12 7 9 11 8 10 8 16 11 6 8
5 9 10 13 12 8 9 6 12 10 9 9 4 5 12 11 11 4
6 10 15 4 6 10 4 17 6 11 12 7 10 10 15 6 17 10
7 10 8 5 10 8 13 10 11 7 12 10 8 9 9 6 9 10
8 10 7 8 11 14 10 12 7 11 12 11 12 13 7 7 7 11
9 12 13 17 8 14 12 10 16 10 7 15 17 9 11 15 14 15
10 12 10 9 13 13 6 12 18 11 15 12 12 12 13 12 13 9
11 12 9 10 10 14 15 12 7 14 6 12 11 19 9 17 10 18
12 12 11 14 14 9 16 7 15 3 10 13 9 11 7 2 12 14
13 12 15 15 16 13 8 12 13 16 16 12 15 11 15 12 14 9
14 12 14 10 10 11 15 15 12 13 14 15 13 14 11 13 17 19
15 12 11 12 12 8 12 13 12 7 9 11 9 9 9 12 4 9
16 13 8 17 13 8 12 11 17 15 16 13 12 15 16 12 14 19
17 13 9 13 16 16 11 8 6 18 12 8 7 11 12 12 17 15
18 13 10 18 15 16 10 15 10 16 17 10 6 8 8 10 13 6
19 13 14 13 17 11 6 11 18 15 11 17 16 19 13 11 15 14
20 13 7 4 13 11 12 10 17 19 6 7 12 15 7 15 14 12
21 14 16 17 12 18 13 17 12 11 7 13 15 10 18 14 17 19
22 15 15 18 21 15 15 14 13 21 14 13 20 13 12 19 16 16
23 15 11 13 16 12 12 16 10 16 19 20 21 16 13 19 11 16
24 15 9 16 16 11 14 12 15 18 11 16 14 29 11 12 19 14
25 16 18 12 15 9 19 18 14 11 19 15 18 14 18 18 14 20
26 17 22 10 19 12 15 19 18 10 11 17 20 16 15 11 10 15
27 18 14 21 9 19 16 17 19 18 18 14 16 28 19 18 19 10
28 18 8 20 19 5 16 18 20 28 16 17 19 14 15 14 18 15
29 19 26 19 18 21 17 29 12 22 25 15 23 11 19 20 15 24
30 20 22 18 23 21 23 19 26 22 16 20 19 15 14 19 13 15
Site = site number, e.g. intersection, road segment, etc.; TPM = true underlying safety of the site, or Poisson mean; simulated data = observed crash count in each observation period; shaded cells represent the 'truly hazardous' locations (sites 29 and 30, with TPM's of 19 and 20).

To provide sufficient sample sizes for statistical comparisons, gamma distributions of TPM's are fitted to the six datasets, and then 1000 TPM's are simulated. Fitting gamma distributions to a given sequence of data was implemented through the software package Arena 7.0 (Kelton et al., 2003). A summary of the fittings is shown in Table 2. Differences between the theoretical curves and the empirical data are expected, since the data represent the sum of an underlying (and unknown) safety performance function and random crash counts.

Table 2
Summary of Gamma fittings of six datasets

Data set   Fitting expression        Square error   Test statistic   p-value
E1         0.5 + Gamm(3.79, 1.75)    0.022344       26               <0.005
E2         1.5 + Gamm(15.9, 1.7)     0.011836       13.4             0.0385
L1         0.5 + Gamm(4.31, 1.71)    0.038173       11.1             0.0119
L2         3.5 + Gamm(13.4, 2.27)    0.020052       8.2              0.16
S1         0.5 + Gamm(2, 4.3)        0.014903       33.5             <0.005
S2         0.5 + Gamm(9.06, 2.57)    0.013211       23               <0.005

E—exponential shape; L—linear shape; S—sigmoidal shape; 1—low heterogeneity of crash counts; 2—high heterogeneity of crash counts.

After the TPM's are simulated (the crash means across sites, which reflect the true and unknown safety performance function), the next step is to generate observed crash counts for the sites. These counts represent the observed crash counts across observation periods for a particular site (where its true safety, the TPM, is known). It is well established that the fluctuation of crash counts across observation periods results from the randomness inherent in the underlying crash process and is well approximated by a Poisson process (see for example Lord et al., 2004). To represent this natural fluctuation, a random sample of 30 observation periods (e.g. months, years, etc.) associated with each location is simulated with a random number generator, using the TPM's defined by the fitted distributions in Fig. 1. A small snapshot of the data generated by this simulation is shown in Table 1.
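As a small sketch of this generation step (under assumed implementation details; the study itself used Arena 7.0 and purpose-written code), the E1 fitting expression from Table 2 can be used to draw 1000 TPM's and then 30 Poisson-distributed observation periods per site:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# E1 from Table 2: TPM = 0.5 + Gamm(3.79, 1.75); reading the two Gamm(.) arguments
# as shape and scale is an assumption about the Arena parameterization.
N_SITES, N_PERIODS = 1000, 30
tpm = 0.5 + rng.gamma(shape=3.79, scale=1.75, size=N_SITES)   # true Poisson means

# One Poisson draw per site and observation period (the analogue of Table 1)
observed = rng.poisson(lam=tpm[:, None], size=(N_SITES, N_PERIODS))

# Marginally (across sites and periods) the counts behave like a negative binomial
# mixture; within a single site they are Poisson with mean equal to that site's TPM.
print(observed.shape)
```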
Table 1 shows 16 simulated observation periods for 30 sites, and is perhaps the best way to illustrate how the simulation was conducted. The far left column shows the 30 sites by number. The next column, labeled TPM, is the true state of nature for a particular site, and represents the usually unknown underlying safety. The remaining 16 columns represent 16 different observation periods – years of crash data – that could possibly arise from the TPM's in column 2.
Fig. 1. Cumulative distributions of TPM’s underlying simulated crash data.

The data, in aggregate (across sites and observation periods), are negative binomial distributed, but are Poisson distributed within site.

To illustrate how high risk sites might be identified, the two sites with TPM's of 19 or more crashes per observation period may be identified a priori as hazardous, since the TPM's reflect the true underlying state of nature. The two sites in the shaded cells are truly high-risk sites, whereas the 28 sites above the shaded area are 'safe'. Using a simple ranking scheme, in any given observation period, say observation period 5, the observed number of sites that recorded 19 or more crashes was 2 out of 30, where one was a truly hazardous site (site 30) and one was not (site 25, a false positive). In observation period 5 there was also a false negative, since truly hazardous site 29 revealed only 17 crashes. Using a large number of observation periods in this fashion will yield statistical estimates of false positives and negatives.

4. Performance evaluation results for HSID methods

Establishing fair and realistic comparisons among the different HSID methods is paramount. One consideration in this regard is the use of δ—the cutoff level used to establish hazardous locations. Three values of δ are employed in the evaluations, 0.90, 0.95, and 0.99, corresponding to the top 10, 5, and 1% of all sites identified as 'hazardous', respectively. In practice δ corresponds with the availability of resources for remediation.

The parameters of the simulation experiment in their entirety now include: shapes of the TPM's (E, S, and L), levels of heterogeneity in the TPM's (1 and 2), and levels of δ (0.90, 0.95, and 0.99). Three HSID methods are assessed: simple ranking (SR), classical confidence intervals (CI), and empirical Bayesian methods (EB). The evaluation criteria include percent of false positives (FP), percent of false negatives (FN), and the sum total percent of false positives and false negatives, called false identifications (FI). The true state of nature (TPM) is simulated for 1000 sites for each of the six crash distributions, and each site is "observed" during 30 observation periods.

To conduct this simulation experiment the following steps were taken:

1. All the TPM cumulative distributions are divided into truly hazardous locations and non-hazardous locations, using thresholds of 0.90, 0.95, and 0.99 to represent different
data separation thresholds. This step results in three "critical" crash count threshold values, CC0.90, CC0.95, and CC0.99, for each combination of cumulative TPM shape and heterogeneity level.
2. The three different HSID methods are used to identify hot spots using the simulated data. Specifically, the SR method simply ranks observed frequencies, the CI method uses the entire sample mean and standard deviation to determine confidence intervals for ranking, and the EB method uses a weighted average of crash history and observed frequency based on gamma distribution parameters to rank sites.
3. Simulated crash data are then compared to the values of CC0.90, CC0.95, and CC0.99. For truly hazardous sites, if the randomly generated crash counts are lower than the values CC0.90, CC0.95, and CC0.99, then false negatives are produced. Similarly, for non-hazardous sites, when the simulated crash counts are larger than the values CC0.90, CC0.95, and CC0.99, false positives are generated. The number of false identifications is the sum total of the number of false negatives and positives.
4. To make the three performance metrics comparable across simulations, the percentages of false negatives, false positives, and false identifications are calculated. The percentage of false negatives is the number of simulated false negatives divided by the number of simulated truly safe sites, the percentage of false positives is the number of false positives divided by the number of truly hazardous locations, and the percentage of false identifications is obtained by dividing the sum of false negatives and false positives by the total number of randomly generated data locations (1000).
5. Finally, the percentages of false positives, false negatives, and false identifications across simulation conditions are tallied and reported.

Tables 3 and 4 summarize the errors (FNs, FPs, and FIs) produced under the simulated conditions. Table 3 presents the results when the heterogeneity of crash counts is low (1), while Table 4 presents the results when heterogeneity is high (2). Critical crash count threshold values increase from left to right in both tables.

For the low heterogeneity and high heterogeneity simulations, the trends of percent errors with increasing δ are consistent; however, the percentages of errors for low heterogeneity are much higher than those for high heterogeneity. The major reason is that low heterogeneity in crash counts results in relatively small standard deviations of crashes when compared with the other datasets, which in turn makes it more difficult to identify hazardous locations. On the contrary, it is easy to identify hot spots when the corresponding crash counts are greatly dispersed, particularly when the dispersion is large in the uppermost crash count deciles.

Another prominent characteristic illustrated in the tables is that the percentage of false negatives decreases with increasing δ for the three HSID methods. In most cases the percentage of false negatives is substantially smaller for the EB method. The fairly complicated explanation for this is as follows. The threshold value divides the top 'outlying' crash counts from the remainder of the data, either the top 10, 5, or 1% of observed counts. By definition these counts are more likely to suffer from regression to the mean in a subsequent observation period than counts around the TPM. Thus the crash history of the top x% of crash counts acts to reduce the effect of the current crash count x when ranking these sites. As a result, sites that suffer less from regression to the mean get ranked higher in the list—sites that ordinarily would have been ranked as false negatives.

Conversely, the percentage of false positives increases with increasing δ for the three HSID methods (except for the δ of 0.95 for L1 and L2). This result suggests that stricter

Table 3
Percent errors for low heterogeneity in crash counts
Percent errors: low heterogeneity

δ 0.9 0.95 0.99

Method CI SR EB CI SR EB CI SR EB

E FN 2.49 3.55 2.40 1.54 2.09 1.41 0.63 0.55 0.38


FP 62.76 31.97 21.63 82.47 39.73 26.87 114.32 54.00 37.67
FI 7.17 6.39 4.33 5.31 3.97 2.69 2.46 1.08 0.75

L FN 2.21 4.44 2.91 1.39 2.40 1.73 0.15 0.62 0.45


FP 106.14 39.97 26.20 65.24 45.67 32.80 431.62 61.00 45.00
FI 8.75 7.99 5.24 3.62 4.57 3.28 2.10 1.22 0.90

S FN 0.54 6.53 5.28 0.21 3.48 2.90 0.00 0.81 0.73


FP 753.44 58.73 47.50 1251.33 66.20 55.13 NA 80.33 72.33
FI 10.03 11.75 9.50 6.46 6.62 5.51 1.91 1.61 1.45
FN—false negatives; FP—false positives; FI—false identifications; CI—confidence interval; SR—simple ranking; EB—empirical Bayesian; E—exponential shape; L—linear shape; S—sigmoidal shape. Some FPs exceed 100% because of the non-normality of the distribution and the setting of the threshold; in these cases, the CI method identifies more hazardous locations than truly exist. For the same reason, the "NA" entry in the table occurs where the confidence interval analysis identifies zero truly hazardous locations. The shaded cells show the lowest identification error rate.
Table 4
Percent errors for high heterogeneity in crash counts
Percent errors: high heterogeneity
δ 0.9 0.95 0.99

Method CI SR EB CI SR EB CI SR EB

E FN 1.78 2.09 1.13 1.33 1.33 0.86 0.39 0.26 0.17


FP 24.37 18.77 10.13 32.56 25.33 16.40 57.07 26.00 16.67
FI 4.13 3.75 2.03 3.34 2.53 1.64 1.54 0.52 0.33
L FN 1.89 2.55 1.57 1.50 1.43 0.91 0.44 0.37 0.23
FP 36.33 22.93 14.13 32.20 27.20 17.33 45.22 36.67 22.67
FI 5.14 4.59 2.83 3.40 2.72 1.73 1.29 0.73 0.45
S FN 2.16 2.73 1.74 1.17 1.31 0.71 0.47 0.26 0.12
FP 34.80 24.53 15.67 41.08 24.87 13.47 38.37 25.33 12.33
FI 5.16 4.91 3.13 3.31 2.49 1.35 1.32 0.51 0.25

FN—false negatives; FP—false positives; FI—false identifications; CI—confidence interval; SR—simple ranking; EB—empirical Bayesian; E—exponential
shape; L—linear shape; S—sigmoidal shape. The shaded cells show the lowest identification error rate.

identification criteria (higher δ) will result in fewer failures to select truly hazardous sites for remedy, but will lead to a larger number of non-hazardous locations identified as hazardous. In summary, the percent of false positives increases with rising thresholds, whereas the percent of false negatives and false identifications decreases with rising thresholds. The results in almost all simulation scenarios reveal similar trends.

The three identification methods also perform differently. Compared to the other two more traditional methods, the EB method yields fewer false negatives and false positives in most simulation scenarios (see the shaded 'best' results for each of the simulations in the tables). In many cases the EB method reduced false negatives by about 50% compared to the CI and SR methods, while the reduction in false positives ranges from 30 to 50%. That is, the EB technique is more efficient in identifying sites that require further analysis and/or inspection. As for the CI and SR methods, there is not a significant performance difference between them; however, the CI method slightly outperforms the SR method in low-heterogeneity crash count situations.
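The error percentages reported in Tables 3 and 4 follow the definitions in steps 3 and 4 of this section. A minimal sketch of that bookkeeping is shown below, assuming the critical count CCδ is taken as the δ-quantile of the TPM distribution (step 1) and using the denominators as stated, which is also why some FP percentages can exceed 100%. The function name and the tie handling at the threshold are illustrative assumptions.

```python
import numpy as np

def error_rates(observed, tpm, delta=0.90):
    """Percent FN, FP, and FI for one observation period (definitions of steps 3-4)."""
    observed, tpm = np.asarray(observed), np.asarray(tpm)
    cc = np.quantile(tpm, delta)                  # critical crash count threshold CC_delta (step 1)
    hazardous = tpm > cc                          # truly hazardous sites, known a priori

    fn = np.sum(hazardous & (observed < cc))      # hazardous sites that escape identification
    fp = np.sum(~hazardous & (observed > cc))     # safe sites flagged as hazardous

    pct_fn = 100.0 * fn / np.sum(~hazardous)      # divided by truly safe sites, as stated
    pct_fp = 100.0 * fp / np.sum(hazardous)       # divided by truly hazardous sites (can exceed 100%)
    pct_fi = 100.0 * (fn + fp) / observed.size    # divided by all simulated sites (1000)
    return pct_fn, pct_fp, pct_fi
```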
5. Results for assessment of crash history duration

To assess crash history duration effects, the error rates associated with various "t" years of crash history are compared. All simulation conditions are repeated for the crash history analysis. An experimental enhancement lies in how the different periods of data are used. The snapshot of data in Table 1 is used again to demonstrate the analysis procedure. The ith column of data is assumed to represent the ith current year of crash data for all sites. Consider conducting an EB analysis: for a given t-year period, Eqs. (1) and (2) are used for each site to compute the corresponding expected crash counts. For a t-year crash history, average crash counts per year should be used in these equations. For example, in the 4th year or observation period, the crash count for site 28 is 13 crashes per year (the average of the first four observation periods), while E{λ} = 15.75 crashes (the row or comparison-site average), VAR{λ} = 4.14 crashes² (the row or comparison-site variance), and the formula weight α = 0.792. The resulting expected crash count associated with site 28 when using the first 4 years of data is 15.2 crashes. For the 16 different observation periods (shown in the table) we can generate 13 expected crash counts associated with site 28 using the 4-year crash history record. This same procedure is applied for varying crash histories. Furthermore, the same analysis approach is applied to assess the effect of crash history using the SR and CI methods. Due to the large amount of iterative computation in this experiment, computer code was written to calculate the various identification error rates associated with different periods of crash data.
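The site-28 arithmetic quoted above can be checked directly against Eqs. (1) and (2): plugging in the stated 4-year average of 13 crashes/year, E{λ} = 15.75, and VAR{λ} = 4.14 recovers α ≈ 0.792 and an expected count of about 15.2 crashes. The small wrapper below is only illustrative.

```python
def eb_estimate(observed_mean, e_lambda, var_lambda):
    """Empirical Bayes expected crash count per Eqs. (1) and (2)."""
    alpha = e_lambda / (e_lambda + var_lambda)
    return alpha, alpha * e_lambda + (1.0 - alpha) * observed_mean

# Site 28: average of the first four observation periods = 13 crashes/year;
# comparison-group (row) statistics as quoted in the text.
alpha, lam_site28 = eb_estimate(observed_mean=13.0, e_lambda=15.75, var_lambda=4.14)
print(round(alpha, 3), round(lam_site28, 1))   # 0.792 and 15.2
```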
In theory, as t increases, the average expected crash counts will converge to the site's TPM (recall that the simulated data follow the Poisson distribution) and the corresponding identification error rate converges to zero. In practice, however, as t increases each site will be affected by influential factors such as traffic volumes, driver population, maintenance activities, surrounding changes to land use, weather fluctuations, etc., and thus longer crash history periods generally are associated with increasingly less stable safety performance functions over time. In contrast, if a short period of data is used, the count suffers from random fluctuations and/or regression to the mean bias. Consequently, a trade-off is sought between a study period that is short enough to represent the current conditions and long enough to reflect the true expected crash count of a site. In this experiment, the various identification rates are plotted versus t years of crash history duration, and a "knee" in this curve is expected to serve as guidance for period selection.

To consider up to 10 years of crash history, the 30 simulated observation periods are divided into three groups: the first 10 columns of data belonging to group 1, the 11th to 20th columns belonging to group 2, and the last 10 columns belonging to group 3. This grouping
effectively increases the size of the sample used for the analysis and eliminates consideration of crash histories longer than 10 years, which practically represents an upper limit.

Generally, a plot of the identification error rate against t will reveal fluctuations along the curve due to the random fluctuations characteristic of stochastic data. To quickly identify the initial "warm-up" period – the period prior to the knee of the curve – the moving average method is utilized. The moving average Ȳi(w) (where w is the window size) of the random observations yi is defined as follows:

Ȳi(w) = (y(i−w) + · · · + yi + · · · + y(i+w)) / (2w + 1),   i = w + 1, . . . , m − w
Ȳi(w) = (y1 + · · · + yi + · · · + y(2i−1)) / (2i − 1),   i = 1, . . . , w        (3)

In this experiment, a window size of 1 is selected. Details of the moving average method can be found in Simulation with Arena (Kelton et al., 2003).
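A short sketch of the moving average of Eq. (3) is given below; applying it with w = 1 to an error-rate series (here the E1 false-identification row of Table 5) smooths random fluctuations so the "knee" is easier to locate. The NumPy implementation and the choice to leave the last w points unsmoothed are assumptions; the procedure itself follows Kelton et al. (2003).

```python
import numpy as np

def moving_average(y, w=1):
    """Moving average of Eq. (3): a shrinking window (2i-1 points) for the first w
    observations, then the full symmetric window of 2w+1 points."""
    y = np.asarray(y, dtype=float)
    m = len(y)
    ybar = np.empty(m)
    for i in range(1, m + 1):                        # 1-based index, as in Eq. (3)
        if i <= w:
            ybar[i - 1] = y[:2 * i - 1].mean()       # y_1 ... y_{2i-1}
        elif i <= m - w:
            ybar[i - 1] = y[i - 1 - w:i + w].mean()  # y_{i-w} ... y_{i+w}
        else:
            ybar[i - 1] = y[i - 1]                   # tail left unsmoothed (not defined in Eq. (3))
    return ybar

# FI row for data set E1, group 1 (Table 5), t = 1..10 years of crash history
fi_e1 = [4.44, 3.11, 2.75, 2.71, 2.60, 2.36, 2.35, 2.13, 2.20, 2.20]
print(moving_average(fi_e1, w=1).round(2))
```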
Fig. 2. Moving averages vs. original statistic.

The moving average effectively smooths the statistical fluctuations in the observations (yi) and illustrates more clearly the "warm-up" period. As illustrated in Fig. 2, it is difficult to detect the "knee" of the curve from the unsmoothed data (original plot), due primarily to the existence of two outliers (the points at t = 4 and t = 6). However, it is relatively easy to detect the knee of the curve in the plot of moving averages; it appears that a t of 5 years is close to the 'knee' of the curve. The knee of the curve is the point at which additional crash history data yields diminishing improvement in the percentage of false negatives.

To assess the effect of crash history duration, 486 simulation scenarios (three identification methods, three distribution shapes, low and high heterogeneity of crash counts, three threshold values for truly hazardous locations, three kinds of false identifications, i.e., FN, FP, and FI, and three groups) were conducted and analyzed to yield the corresponding best study duration of crash history—where "best" refers to the study period associated with the knee of the curve. Table 5 shows the identification error rates for the EB method for group 1 data.

The frequencies of the best study periods for the different confidence levels and in total are shown in Fig. 3, where the results from the three groups of data mentioned previously are aggregated. Moreover, the cumulative distribution of the sum of the three confidence levels illustrated in Fig. 3 is shown in Fig. 4.

Table 5
The identification error rates of different crash histories for group 1
Data set   t (years of crash history): 1 2 3 4 5 6 7 8 9 10

E1 FN 2.47 1.73 1.53 1.51 1.44 1.31 1.31 1.19 1.22 1.22
FP 22.20 15.56 13.75 13.57 13.00 11.80 11.75 10.67 11.00 11.00
FI 4.44 3.11 2.75 2.71 2.60 2.36 2.35 2.13 2.20 2.20

E2 FN 1.19 0.83 0.67 0.59 0.54 0.60 0.47 0.48 0.50 0.56
FP 10.70 7.44 6.00 5.29 4.83 5.40 4.25 4.33 4.50 5.00
FI 2.14 1.49 1.20 1.06 0.97 1.08 0.85 0.87 0.90 1.00

L1 FN 3.03 2.40 2.13 2.00 1.85 1.87 1.78 1.78 1.72 1.67
FP 27.30 21.56 19.13 18.00 16.67 16.80 16.00 16.00 15.50 15.00
FI 5.46 4.31 3.83 3.60 3.33 3.36 3.20 3.20 3.10 3.00

L2 FN 1.87 1.41 1.36 1.29 1.28 1.20 1.08 1.04 1.06 1.11
FP 16.80 12.67 12.25 11.57 11.50 10.80 9.75 9.33 9.50 10.00
FI 3.36 2.53 2.45 2.31 2.30 2.16 1.95 1.87 1.90 2.00

S1 FN 5.39 4.75 4.33 4.11 4.02 3.91 3.92 3.81 3.61 3.33
FP 48.50 42.78 39.00 37.00 36.17 35.20 35.25 34.33 32.50 30.00
FI 9.70 8.56 7.80 7.40 7.23 7.04 7.05 6.87 6.50 6.00

S2 FN 2.08 1.81 1.65 1.59 1.56 1.51 1.08 1.41 1.33 1.44
FP 18.70 16.33 14.88 14.29 14.00 13.60 9.75 12.67 12.00 13.00
FI 3.74 3.27 2.98 2.86 2.80 2.72 1.95 2.53 2.40 2.60

EB method and δ = 90%. FN—false negatives; FP—false positives; FI—false identifications; CI—confidence interval; SR—simple ranking; EB—empirical Bayesian; E—exponential shape; L—linear shape; S—sigmoidal shape. The shaded cells show where the "knees" are found.
Fig. 3. The frequencies of the t-year values at the "knee" of the curve, for the three confidence levels and in total.

Figs. 3 and 4 show that, among the 486 simulation scenarios, a 3-year crash history represented the largest portion of "best" study periods of crash history, and 3 through 6 years make up almost 90% of all the optimum t-years. Hence, considering the trade-off between long and short history records, if there is no significant physical change in the sites being examined and a longer history record can be obtained, it is suggested that 3 or more years be used (up to 6 years). In contrast, 3 years of crash history data represents the 'shortest' period of time that should be used and which achieves a

Table 6
Percent errors for low heterogeneity in crash counts (3 years data)
Percent errors: low heterogeneity

δ 0.9 0.95 0.99

Method CI SR EB CI SR EB CI SR EB

E FN 2.02 2.32 1.53 1.36 1.34 0.82 0.89 0.40 0.25


FP 28.06 20.88 13.75 38.60 25.50 15.50 48.56 40.00 25.00
FI 4.68 4.18 2.75 3.69 2.55 1.55 2.13 0.80 0.50
L FN 2.56 2.75 2.13 1.69 1.72 1.25 0.47 0.51 0.40
FP 33.16 24.75 19.13 50.00 32.75 23.75 91.07 50.00 40.00
FI 5.56 4.95 3.83 4.33 3.28 2.54 0.14 0.67 0.53

S FN 1.10 4.88 4.33 0.68 2.88 2.54 0.14 0.67 0.53


FP 228.21 43.88 39.00 239.38 54.75 48.25 362.16 66.25 52.50
FI 9.05 8.78 7.80 5.45 5.48 4.83 1.81 1.33 1.05
FN—false negatives; FP—false positives; FI—false identifications; CI—confidence interval; SR—simple ranking; EB—empirical Bayesian; E—exponential shape; L—linear shape; S—sigmoidal shape. Some FPs exceed 100% because of the non-normality of the distribution and the setting of the threshold; in these cases, the CI method identifies more hazardous locations than truly exist. For the same reason, an "NA" entry occurs where the confidence interval analysis identifies zero truly hazardous locations. The shaded cells show the lowest identification error rate.
Table 7
Percent errors for high heterogeneity in crash counts (3 years data)
Percent errors: high heterogeneity
δ 0.9 0.95 0.99

Method CI SR EB CI SR EB CI SR EB

E FN 1.08 1.28 0.67 0.96 0.95 0.71 0.24 0.14 0.10


FP 13.96 11.50 6.00 15.32 18.00 13.50 34.66 13.75 10.00
FI 2.51 2.30 1.20 1.98 1.80 1.35 1.00 0.28 0.20
L FN 1.72 1.63 1.36 1.19 0.96 0.87 0.41 0.21 0.20
FP 14.37 14.63 12.25 15.07 18.25 16.50 20.11 21.25 18.25
FI 3.08 2.93 2.45 2.14 1.83 1.65 0.86 0.43 0.38
S FN 2.10 2.04 1.65 0.70 0.66 0.55 0.40 0.15 0.10
FP 18.01 18.38 14.88 20.83 12.50 10.50 21.03 15.00 10.00
FI 3.73 3.68 2.98 1.85 1.25 1.05 0.90 0.30 0.20
FN—false negatives; FP—false positives; FI—false identifications; CI—confidence interval; SR —simple ranking; EB—empirical Bayesian; E—exponential
shape; L—linear shape; S—sigmoidal shape. The shaded cells show the lowest identification error rate.

significant benefit of crash history (under most general conditions). Crash histories of 1 and 2 years provided relatively little benefit under the range of conditions tested.

Tables 6 and 7 illustrate the improvement in identification performance after using 3-year crash history data (in contrast to Tables 3 and 4, which use 1 year of crash data). Comparison of these tables reveals that using 3 years of crash history data results in significant improvements in error rates for all three methods, CI, SR, and EB. Moreover, improvements are seen across most scenarios (except for two false negatives for the CI method), with the EB method showing 10–20% reductions in errors on average. While the EB method still reveals itself as the superior method, the SR and CI methods benefit disproportionately more, on average, from using longer crash histories, with reductions in errors ranging from 15 to 50%.

Fig. 4. The cumulative distribution of various optimal study periods.

6. Conclusions and recommendations

Real crash data from various intersections in six counties in the State of Arizona were used to simulate crash data with known parameters. The simulated data enable a priori knowledge regarding the true (and usually unknown) safety of simulated sites. The simulated data are then used to evaluate the performance of simple ranking, confidence interval, and empirical Bayesian high risk site identification methods under a variety of realistic conditions. The evaluation criteria are false positives, false negatives, false identifications, and diminishing returns of crash history duration, t. The study differs from the vast majority of past research, which has relied upon empirical data (where the underlying safety of sites is not known a priori), where crash rates were used, or where the effect of crash history has received little attention.

Upon examination of the simulation results described in detail in this paper, the following conclusions and recommendations are made:

(1) Empirical Bayesian methods in general outperform the other two relatively conventional HSID methods. Under a range of practical conditions, the EB method offers 50% reductions in the percentages of false positives and false negatives compared to the CI and SR methods. The EB analysis benefits, however, are contingent upon reliable and accurate safety performance functions for predicting the 'expected' safety of comparison sites, which requires good geometric, traffic, and crash data, and this is a caveat for the results observed in this investigation. It is strongly recommended that EB methods be incorporated into mainstream practice by managers of road safety who currently may use CI or SR methods.
(2) In low crash count heterogeneity situations the benefits of the EB methods are relatively less pronounced (compared to high crash count variability). That is, when the observed differences in crashes between 'high-risk' and 'safe' sites are relatively small, the EB method offers only minor improvement compared to the SR and CI methods. This might suggest that municipalities that manage safety on systems with relatively few crashes and low exposure may not experience significant improvements in performance by changing analysis platforms from SR and CI methods to the EB method.
(3) The analysis of crash history suggests that a 3-year crash history is the 'optimum' crash history, and up to 6 years of
crash history is generally better than shorter histories. It is recommended that not less than 3 years of crash history duration be used in a HSID analysis, and the most recent 6-year crash history record may be used if few substantive changes at the site occurred during this period.
(4) Finally, a drastic improvement in truly identifying hotspots is possible between the SR method (which is still used in practice) with 1 year of crash data and the EB method with 3 years of crash data. For example, the percent of false negatives and false positives associated with the latter ranges between 25 and 50% less than those associated with the former. A significant improvement in decreasing false identification rates in the results from the SR and CI methods is also possible by including 3–6 years of crash data, but these methods are still outperformed by the EB method.

Although the research here reflects an improved understanding of how various HSID methods perform, further work is still needed. The results here depend on reliable and accurate safety performance functions, which are not always available. Thus, it is not known how beneficial EB methods are relative to other methods when safety performance functions are inaccurate. Simulated data also suffer from a lack of the realism encountered in 'uncontrolled' observational settings. Factors such as changes in weather, road users, enforcement, and special events (among others) affect crash counts and, in particular, cause the crash history analysis to favor shorter periods. These factors also may lead to long-term trends in crashes, which have not been considered in this analysis. In addition, crash data that do not follow a Poisson–gamma distribution (negative binomial) may perform differently. Finally, it may be possible to improve upon the EB method and further reduce false positives and negatives.

Acknowledgements

This paper is based on research performed under contract with the Arizona Department of Transportation, whose supply of the Arizona LGSP Model and crash data is gratefully acknowledged, along with support of the research. The contents are the sole responsibility of the authors.

References

Bauer, K.M., Harwood, D., 2000. Statistical models of at-grade intersection accidents—addendum. Publication FHWA-RD-99-094.
Carey, J., 2001. Arizona local government safety project (LGSP) analysis model (Final Report 504). Phoenix, AZ.
Hadayeghi, A., Shalaby, A.S., Persaud, B.N., 2003. Macrolevel accident prediction models for evaluating safety of urban transportation systems. Transport. Res. Rec. 1840, 87–95.
Hakkert, A.S., Mahalel, D., 1978. Estimating the number of accidents at intersections from a knowledge of the traffic flow on the approaches. Accident Anal. Prevent. 10, 69–79.
Harwood, D.W., Council, F.M., Hauer, E., Hughes, W.E., Vogt, A., 2000. Prediction of the expected safety performance of rural two-lane highways. Publication FHWA-RD-99-207. FHWA, US Department of Transportation.
Hauer, E., Persaud, B.N., 1984. Problem of identifying hazardous locations using accident data. Transport. Res. Rec. 975, 36–43.
Hauer, E., Ng, J.C.N., Lovell, J., 1988. Estimation of safety at signalized intersections. Transport. Res. Rec. 1185, 48–61.
Hauer, E., 1997. Observational Before-After Studies in Road Safety. Pergamon, Tarrytown, NY.
Higle, J.L., Witkowski, J.M., 1988. Bayesian identification of hazardous locations. Transport. Res. Rec. 1185, 24–36.
Higle, J.L., Hecht, M.B., 1989. A comparison of techniques for the identification of hazardous locations. Transport. Res. Rec. 1238, 10–19.
Kelton, W.D., Sadowski, R.P., Sturrock, D.T., 2003. Simulation with Arena. McGraw-Hill, New York, NY.
Laughlin, J.C., Hauer, L.E., Hall, J.W., Clough, D.R., 1975. NCHRP Report 162: Methods for evaluating highway safety improvements. National Research Council, Washington, DC.
Lord, D., Washington, S., Ivan, J., 2004. Poisson, Poisson–gamma, and zero-inflated regression models of motor vehicle crashes: balancing statistical fit and theory. Accident Anal. Prevent., Pergamon Press/Elsevier Science.
Mahalel, D., Hakkert, A.S., Prashker, J.W., 1982. A system for the allocation of safety resources on a road network. Accident Anal. Prevent. 14 (1), 45–56.
Maher, M.J., Mountain, L.J., 1988. The identification of accident blackspots: a comparison of current methods. Accident Anal. Prevent. 20 (2), 143–151.
May, J.F., 1964. A determination of accident prone location. Traffic Eng. 34 (5), 21–27.
McGuigan, D.R.D., 1981. The use of relationships between road accidents and traffic flow in "blackspot" identification. Traffic Eng. Contr. 22 (8/9), 448–451, 453.
McGuigan, D.R.D., 1982. Non-junction accident rates and their use in "black-spot" identification. Traffic Eng. Contr. 23 (2), 60–65.
Miaou, S., Lord, D., 2003. Modeling traffic crash-flow relationships for intersections: dispersion parameter, functional form, and Bayes versus empirical Bayes methods. Transport. Res. Rec. 1840, 31–40.
Morin, A., 1967. Application of statistical concepts to accident data. Highway Res. Rec. 187, 72–79.
Norden, M., Orlansky, J., Jacobs, H., 1956. Application of statistical quality-control techniques to analysis of highway-accident data. Highway Res. Rec. 117, 17–31.
Persaud, B.N., Hauer, E., 1984. Comparison of two methods for debiasing before-and-after accident studies. Transport. Res. Rec. 975, 43–49.
Persaud, B.N., 1986. Safety migration, the influence of traffic volumes, and other issues in evaluating safety effectiveness. Transport. Res. Rec. 1086, 33–41.
Persaud, B.N., 1988. Do traffic signals affect safety? Some methodological issues. Transport. Res. Rec. 1185, 37–46.
Persaud, B.N., 1991. Estimating accident potential of Ontario road sections. Transport. Res. Rec. 1327, 47–53.
Persaud, B.N., 1999. Empirical Bayes procedure for ranking sites for safety investigation by potential for safety improvement. Transport. Res. Rec. 1665, 7–12.
Shen, J., Gan, A., 2003. Development of crash reduction factors: methods, problems, and research needs. Transport. Res. Rec. 1840, 50–56.
