GeoJournal
DOI 10.1007/s10708-016-9699-x
Utilizing fuzzy set theory to assure the quality of volunteered geographic information

Yingwei Yan · Chen-Chieh Feng · Yi-Chen Wang

© Springer Science+Business Media Dordrecht 2016

Y. Yan (✉) · C.-C. Feng · Y.-C. Wang
Department of Geography, 1 Arts Link, National University of Singapore, Singapore 117570, Singapore
Y. Yan, e-mail: yanyingwei@u.nus.edu
C.-C. Feng, e-mail: geofcc@nus.edu.sg
Y.-C. Wang, e-mail: geowyc@nus.edu.sg

Abstract This paper presents a fuzzy system to assure the quality of volunteered geographic information (VGI) collected for the purposes of species surveillance. The system uses trust as a proxy of quality. It defines the trust using both the provenance of user expertise and the fitness of geographic context and quantifies it using fuzzy set theory. The system was applied to a specific scenario, VGI-based crop pest surveillance, to demonstrate its usefulness in handling VGI quality. A case study was conducted in Jiangxi province of China, where location-based rice pest surveillance reports generated by the local farmers were collected. A field pest survey was conducted by the local pest management experts to verify the farmer-generated reports, and the survey results were used as ground truth data. The quality of the farmer-generated reports was also assessed through the fuzzy system and compared to the pest survey results. It was observed that the degree to which these two sets of results agreed with each other was satisfactory.

Keywords Volunteered geographic information · Data quality · Fuzzy system · Species surveillance

Introduction

The increasing use of volunteered geographic information (VGI) across various application domains has revolutionized how spatial knowledge can be derived. This is attributed to many of the advantages of VGI, notably the ability to provide pervasive location-based data and to allow more flexible data collection mechanisms. These features not only enable better reflections of the observations of ubiquitous users on the ground, but also facilitate the capture of data that may be otherwise left out by traditional data collection means.

Among the applications characterized by their increasing reliance on VGI efforts, species surveillances have attracted considerable attention from researchers (Goodchild 2007; Zhu et al. 2015). The use of high quality VGI for these purposes has implications for scientific inquiries pertaining to environmental, economic, health security, and by extension, sustainability issues.

VGI systems for such purposes can be readily built using existing off-the-shelf computing technologies. However, it remains challenging to assure the quality of VGI so as to derive valuable knowledge. Indeed, any applications involving VGI inevitably face the
challenges of high diversity and a greater level of data uncertainty (Goodchild and Li 2012; Kuhn 2007) because VGI can be developed by both authoritative agencies and amateur communities (Foody et al. 2013) as well as contributors of varying levels of knowledge and expertise (Goodchild 2009; Tulloch 2008). Furthermore, VGI can be created for various personal purposes (Coleman et al. 2009) and collected without explicit quality control measures (Girres and Touya 2010) or metadata (Brando and Bucher 2010). Gathering meaningful intelligence from VGI thus demands careful treatment of such data. In other words, it is crucial to measure and ensure that VGI is of a high degree of data quality.

Therefore, this paper proposes a system to assure the quality of VGI collected for the purposes of species surveillance. The system takes advantage of fuzzy set theory to handle the data uncertainty and ambiguity inherent in VGI contributions, explicitly incorporating a unique property of VGI: trust. To demonstrate the usefulness of the fuzzy system in handling VGI quality, it was applied to a specific case scenario, i.e., a VGI-based crop pest surveillance.

The following section reviews how the quality of VGI is assured by the approaches from existing work. It also illustrates the related shortcomings. To improve the existing approaches of VGI quality assurance, a fuzzy system is then presented. It is followed by a discussion on the features of the fuzzy system and the future directions of this line of research. The last section concludes the paper.

Assuring the quality of VGI

Direct and indirect approaches

Several existing studies on VGI quality assurance have adopted a direct approach which compares a VGI dataset to an authoritative gold standard reference dataset. For example, Zielstra and Zipf (2010) examined the completeness of a German OpenStreetMap dataset in comparison to a TeleAtlas MultiNet dataset. Haklay (2010) compared a London OpenStreetMap dataset with an Ordnance Survey dataset based on positional accuracy and completeness. More comprehensively, Girres and Touya (2010) extended the work of Haklay (2010) by comparing a French OpenStreetMap dataset with a BD TOPO® IGN dataset based on a larger set of spatial data quality elements.

Such a direct approach can be seen as an adoption of the traditional data quality assessment method that focuses on internal data quality (Devillers et al. 2005). It, however, has limited applicability for assuring the quality of VGI as there is generally an absence of authoritative gold standard reference datasets for VGI applications (Bishr 2007; Kuhn 2007). For example, in the case of utilizing VGI for species surveillance, voluntary observations are often conducted in sparsely populated, rural, or less explored areas of the world. In such a case the gold standard reference datasets are often lacking. In addition, a VGI dataset is often more up to date than an authoritative dataset and thus may be more accurate than the so-called gold standard reference dataset (Goodchild and Li 2012). To cope with this issue, indirect approaches relying on surrogate criteria were proposed. Four mainstream indirect approaches are described as follows:

1. The user review approach (Goodchild and Li 2012; Maué and Schade 2008). This approach is user-driven and relies on Linus' Law which assumes that "given enough eyes all bugs are shallow". Based on Linus' Law, user contributions converge on a truth through an iterative error correction process, either in terms of attributive error or positional error, or both. If one user commits an error, the error can be detected or corrected by the other users. Haklay et al. (2010) have applied this approach to OpenStreetMap and suggested its applicability to VGI in general.
2. The provenance approach (Celino 2013; Trame and Keßler 2011). This approach relies on the history of volunteered information. Requesting or tracing the history of a VGI dataset (e.g., who are the data providers?) is helpful in better understanding and assessing its quality.
3. The geographic approach (Goodchild and Li 2012). This approach is based on Tobler's first law of geography, which assumes things that are closer are more related than things that are farther apart (Tobler 1970). A VGI contribution should fit its geographic context, e.g., a report of a species occurrence is more likely to be true if many similar reports exist nearby. In addition, more credit can be given to a VGI if it is volunteered by a local resident who is physically close to the site
of the VGI event and is familiar with the local environment (Seeger 2008).
4. The trust approach (Bishr 2007; Bishr and Janowicz 2010; Bishr and Mantelas 2008). It uses trust as a proxy of quality to establish a link between VGI quality and VGI contributors' authority based on subjective evaluations. It rests on the extent to which a VGI contributor has provided honest and accurate information. Trusted VGI contributors tend to provide more trustworthy information compared to less trusted ones. The criteria for evaluating the trustworthiness of VGI replace traditional quality measures of geospatial information (e.g., completeness, logical consistency, and positional accuracy). Indeed, the information asymmetry and imperfection of a VGI environment can lead to social uncertainties in VGI consumption (Sniezek and Van Swol 2001). When high social uncertainties exist, trust appears to be particularly important as it reduces social uncertainties by confining the range of behavior expected from another (Sniezek and Van Swol 2001).

Challenges in using the indirect approaches for species surveillance applications

Among the indirect approaches, the user review approach works well for VGI that is more traceable, such as that in Wikimapia and OpenStreetMap. However, it is problematic for species surveillance applications because the objects being recorded are often highly mobile or persist for only a short period of time. It is hardly possible to go back to the reported locations to verify every user surveillance report, and therefore such reports are not peer-reviewable. Goodchild and Li (2012) also pointed out that this approach works less well for obscure phenomena, including short-lived ones. Conducting the review process for time-critical issues (e.g., pest outbreaks) is also impossible because the process is generally time-consuming. Additionally, Linus' Law sometimes fails. In a crowdsourcing-based cropland capture game, Salk et al. (2015) demonstrated that the majority agreement among volunteers cannot fully substitute for the quality assessment by experts on crowdsourced tasks.

The provenance approach, geographic approach, and trust approach appear to be applicable for species surveillance applications. However, when used alone, all three approaches fall short in fully describing VGI data quality.

The provenance approach considers VGI provenance, including data contributors' expertise. What is challenging, though, is how to appropriately incorporate provenance of user expertise, as the expertise level of a VGI contributor is difficult to collect (Keßler et al. 2009). There is also resistance to providing such information due to concerns about personal privacy and security (Song and Sun 2010). According to Coleman et al. (2009), VGI contributors can be classified into five types: (1) neophyte, (2) interested amateur, (3) expert amateur, (4) expert professional, and (5) expert authority. Normally, people are inclined to trust contributors who are expert professionals and expert authorities. However, a contributor considered to be an expert may understand a project's specification very well but lack the knowledge of local history or attributes. A contributor considered as either a neophyte or interested amateur may know little about the professional part of a VGI project but be very familiar with the characteristics and details of his or her current location. In short, the boundary between non-expert amateur and expert professional is quickly blurring in VGI environments where the expertise of a contributor cannot be simply judged based on contributor type.

As for the geographic approach, considering only fitness of geographic context tends to be less effective if a user report fits the surrounding geographic context well but actually is a false observation.

Regarding the trust approach, how trust as a proxy of quality can be effectively realized in VGI contexts is problematic. It demands appropriate methods to evaluate and quantify the trustworthiness of VGI. In Bishr and Mantelas (2008) an approach combining the trust approach and the geographic approach was proposed to assure VGI quality. Their work does provide valuable insights into the usage of the proxy. First, indeed, the four indirect approaches reviewed here are not mutually exclusive. For instance, some of the elements of trust fall under the geographic approach, i.e., the trustworthiness of VGI can be assessed based on geographic contexts. Second, their approach leverages the crowd's dual roles in VGI
creation: contributing locational data and ascertaining the reliability of data (i.e., user trust rating). The second role can be helpful in evaluating the trustworthiness of VGI. Despite these insights, in the combined approach, fuzziness that is inherent in trust (Chang et al. 2005) is not well accounted for. Assessing the quality of VGI based on trust requires special attention to the fuzzy nature of trust.

Novel approaches thus are called for to synthesize the advantages and minimize the disadvantages of the approaches mentioned above to assure the quality of VGI, with a better way to account for user expertise, geographic context, and fuzziness involved in trust judgment.

A fuzzy system

To address the problems mentioned above, we present a rule-based fuzzy system to assure the quality of user-generated species surveillance reports. The system uses trust as a proxy of quality, considering both the track record of the VGI contributors (i.e., provenance of user expertise) and the fitness of geographic context as defining factors of the trust.

Fuzziness in geospatial data quality

Traditionally, geospatial data quality is categorized into internal quality or external quality (Devillers et al. 2005). The former refers to the assessment of the difference between a dataset and the reality it represents. The latter refers to the fitness for use, the extent to which a dataset can be a good fit for its different uses. Evaluating geospatial data quality of both kinds involves using realities or fitness as the baseline for comparison. The result of the comparison is clear-cut, or "crisp", as a dataset either meets or fails to meet the standard.

From a user perspective, VGI quality may be considered by users to be meeting the standard or slightly below standard, implying a transition between all levels of quality. It is extremely limiting to treat a VGI that is slightly below the standard in the same way as another VGI that virtually fails to meet the standard. Yongting (1996) proposed the concept of fuzzy quality to account for such a transition by expressing the data quality with a fuzzy set instead of Boolean logic. In addition, given the role of trust in evaluating VGI data quality and the nature of trust being inherently fuzzy (Chang et al. 2005), adopting fuzzy set theory to assess the quality of VGI is likely to capture more accurately the whole assessment process.

Fuzzy set theory

Fuzzy sets were first introduced by Zadeh (1965) to model continuous phenomena. They generalize conventional crisp sets by allowing their elements to have degrees of membership. The membership is defined by mapping every element x from a universe of discourse X to an interval [0, 1], representing the degree to which x is an element of a fuzzy set, expressed as Eq. 1.

\[
\mu_A(x) : X \to [0, 1], \quad \text{where} \quad
\begin{cases}
\mu_A(x) = 1 & \text{if } x \text{ is totally in the fuzzy set;} \\
\mu_A(x) = 0 & \text{if } x \text{ is not in the fuzzy set;} \\
0 < \mu_A(x) < 1 & \text{if } x \text{ is partly in the fuzzy set.}
\end{cases}
\tag{1}
\]

Fuzzy sets are often used for modelling subjective human reasoning using natural languages in which many expressions have vague or imprecise meanings (Caha et al. 2012). They are therefore a prominent alternative to more traditional modelling paradigms for addressing complex, ill-defined, and less tractable systems (Manca and Curtin 2012). In geography, fuzzy sets have been applied to modelling the uncertainty inherent in spatial datasets (Al-kheder et al. 2008; Zhang et al. 2014).

System development

To introduce the system development based on fuzzy set theory, the following sections first describe its core fuzzy inference method. Then the two input variables (i.e., provenance of user expertise and fitness of geographic context), one output variable (i.e., the trustworthiness of user reports), and fuzzy rules of the system are defined. Last, the system usage is introduced.

Fuzzy inference

Mamdani-style fuzzy inference is adopted in the system as it is better suited to handling fuzziness and data uncertainty and it works better with human inputs (Power et al. 2001). The inference requires the developer to create both input and output membership
functions from linguistic interpretations of a subject. It generates output values through compositional inference rules and a defuzzification algorithm. Details about Mamdani-style fuzzy inference can be found in Mamdani (1974) and Negnevitsky (2005). A brief workflow showing how our fuzzy system derives the quality (trustworthiness) of a user report based on Mamdani-style inference is given in Fig. 1, which has four steps as follows:

Step 1. Fuzzification: Fuzzifying the crisp inputs of the system against appropriate linguistic fuzzy sets and generating membership degrees based on given membership functions.
Step 2. Rule evaluation: Applying a fuzzy rule set to infer fuzzy trustworthiness outputs.
Step 3. Aggregation of the rule outputs: Aggregating the output of each rule into a single fuzzy set for the overall fuzzy output.
Step 4. Defuzzification: Defuzzifying the aggregate output fuzzy set into a final crisp trustworthiness score using the center of gravity (COG) algorithm. The algorithm finds the point (COG) where a vertical line would slice the aggregate set, on the interval [a, b], into two equal masses using Eq. 2.

\[
\mathrm{COG} = \frac{\sum_{y=a}^{b} \mu_{\mathrm{aggregate}}(y)\, y}{\sum_{y=a}^{b} \mu_{\mathrm{aggregate}}(y)}
\tag{2}
\]

Fig. 1 Workflow of the Mamdani-style trustworthiness score inference

Input variable one: provenance of user expertise

The proposed system adopts user confidence, the strength to which a person believes that a piece of information is the best available (Peterson and Pitz 1988), as a surrogate to represent provenance of user expertise because it has been shown that confidence can be a valid cue to information accuracy (Sniezek and Van Swol 2001). This piece of information, specifically the level of confidence about the correctness of a user report, is provided by the user who has generated the report. It contributes to the willingness to accept a piece of information, especially when other materials about the information providers are unavailable (Cofta 2007; Sniezek and Buckley 1995). Indeed, confidence has been utilized to automatically evaluate the expertise of the volunteers in performing tasks such as land cover map validation (Foody et al. 2013) and galaxy classification (Bordogna et al. 2014a, b).

Our fuzzy system requires users to choose a value from a ten-point Likert scale to report their confidence levels. The value provides a measure of self-evaluation of VGI quality. Following the four-level fuzzy confidence adopted in Yu and Tsai (2006), four linguistic fuzzy sets, Not Confident (NC), Somewhat Confident (SC), Confident (C), and Very Confident (VC), are defined for the input user confidence levels, using standard triangular and left/right trapezoidal shapes. The corresponding membership functions are defined by Eq. 3 and illustrated in Fig. 2a. Note that the four fuzzy sets are not symmetric around the median value of the universe of discourse (i.e., five) (Fig. 2a) for the following reason. Because the confidence declared by non-expert VGI contributors tends to be less reliable than that declared by experts, as some contributors may be somewhat overconfident about their expertise (Pulford 1996), the membership functions representing moderate to relatively high levels of user confidence (i.e., SC, C, and VC) are shifted closer to the right end of the universe of discourse to compensate for over-confidence. The left starting point of VC is kept at 7.5, allowing the values between 7.5 and 8 to have certain low degrees of membership to VC.

Fig. 2 Membership functions of (a) contributor confidence level, (b) fitness of geographic context, and (c) trustworthiness
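
As a minimal sketch of how the four confidence fuzzy sets could be evaluated, the snippet below encodes each set as a trapezoid whose breakpoints are read off Eq. 3 (a triangle is treated as a trapezoid with a zero-width plateau). The helper names `trapmf` and `fuzzify_confidence` are illustrative assumptions and are not taken from the authors' MATLAB implementation.

```python
def trapmf(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)   # rising edge (only reached when a < b)
    return (d - x) / (d - c)       # falling edge (only reached when c < d)


# Breakpoints (a, b, c, d) assumed from Eq. 3; the universe of discourse is [0, 10].
CONFIDENCE_SETS = {
    "NC": (0.0, 0.0, 1.0, 2.5),    # left shoulder
    "SC": (1.5, 4.0, 4.0, 6.5),    # triangle peaking at 4
    "C": (4.5, 7.0, 7.0, 9.5),     # triangle peaking at 7
    "VC": (7.5, 9.5, 10.0, 10.0),  # right shoulder
}


def fuzzify_confidence(c):
    """Return the membership degree of confidence level c in each linguistic set."""
    return {label: trapmf(c, *params) for label, params in CONFIDENCE_SETS.items()}


print(fuzzify_confidence(8))
```

For a confidence level of 8 (the value used later in the running example), this sketch gives a membership of 0.6 in C and 0.25 in VC, and zero membership in NC and SC.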
\[
\mu_C(c) =
\begin{cases}
\text{Not Confident:} &
  \begin{cases}
  1 & 0 \le c \le 1 \\
  -\tfrac{2}{3}c + \tfrac{5}{3} & 1 < c < 2.5 \\
  0 & c \ge 2.5
  \end{cases} \\[4pt]
\text{Somewhat Confident:} &
  \begin{cases}
  0 & c \le 1.5 \\
  \tfrac{2}{5}c - \tfrac{3}{5} & 1.5 < c \le 4 \\
  -\tfrac{2}{5}c + \tfrac{13}{5} & 4 < c < 6.5 \\
  0 & c \ge 6.5
  \end{cases} \\[4pt]
\text{Confident:} &
  \begin{cases}
  0 & c \le 4.5 \\
  \tfrac{2}{5}c - \tfrac{9}{5} & 4.5 < c \le 7 \\
  -\tfrac{2}{5}c + \tfrac{19}{5} & 7 < c < 9.5 \\
  0 & c \ge 9.5
  \end{cases} \\[4pt]
\text{Very Confident:} &
  \begin{cases}
  0 & c \le 7.5 \\
  \tfrac{1}{2}c - \tfrac{15}{4} & 7.5 < c < 9.5 \\
  1 & c \ge 9.5
  \end{cases}
\end{cases}
\tag{3}
\]

Input variable two: fitness of geographic context

Species occurrences usually form clusters. Therefore, fitness of geographic context is evaluated using spatial clustering analysis. According to Tobler's Law, it is highly possible that a species can be observed at its habitat center (i.e., cluster center) and the possibility decreases with increasing distance away from the habitat center. Therefore, if a cluster of species surveillance reports is contributed by users, the fitness of geographic context of each report is evaluated using its spatial proximity to the center of the cluster.

The fuzzy system uses the DBSCAN clustering algorithm (Ester et al. 1996) to locate VGI clusters. DBSCAN can effectively distinguish noise points (i.e., outliers) and discover clusters with arbitrary shapes. Fitness of geographic context is quantified based on an inverse hyperbolic sine function (Eq. 4). The equation captures precisely the characteristics of the fitness of geographic context: it decays with the distance departing from the center of a VGI cluster (i.e., inverse relation with distance) by generating a value between 0 (zero fitness of geographic context) and 10 (perfect fitness of geographic context). Outliers identified by DBSCAN are assigned zero fitness of geographic context.

\[
\text{Fitness of geographic context} =
\left\{ 1 - \ln\!\left[ \frac{\mathrm{Dist}_{rtc}}{\mathrm{Dist}_{max}} + \sqrt{\left( \frac{\mathrm{Dist}_{rtc}}{\mathrm{Dist}_{max}} \right)^{2} + 1} \,\right] \right\} \times 10,
\tag{4}
\]

where Dist_rtc is the distance from a user report to its corresponding cluster center, and Dist_max is the distance between the cluster's outermost user report and the cluster's center.

Following the three-level fuzzy proximity adopted in Al-kheder et al. (2008), three linguistic fuzzy sets, Relatively Low (RL), Medium (M), and Relatively High (RH), are defined for the input fitness of geographic context, using standard triangular and left/right trapezoidal shapes. The corresponding membership functions are defined by Eq. 5 and illustrated in Fig. 2b. The fuzzy sets are symmetric around the median value of the universe of discourse (i.e., five) (Fig. 2b).

\[
\mu_F(f) =
\begin{cases}
\text{Relatively Low:} &
  \begin{cases}
  1 & 0 \le f \le 0.5 \\
  -\tfrac{1}{4}f + \tfrac{9}{8} & 0.5 < f < 4.5 \\
  0 & f \ge 4.5
  \end{cases} \\[4pt]
\text{Medium:} &
  \begin{cases}
  0 & f \le 2.5 \\
  \tfrac{2}{5}f - 1 & 2.5 < f \le 5 \\
  -\tfrac{2}{5}f + 3 & 5 < f < 7.5 \\
  0 & f \ge 7.5
  \end{cases} \\[4pt]
\text{Relatively High:} &
  \begin{cases}
  0 & f \le 5.5 \\
  \tfrac{1}{4}f - \tfrac{11}{8} & 5.5 < f < 9.5 \\
  1 & f \ge 9.5
  \end{cases}
\end{cases}
\tag{5}
\]

Output variable: trustworthiness

Following the five-level fuzzy trustworthiness in Song et al. (2004), five linguistic fuzzy sets, Very Low (VL), Low (L), Medium (M), High (H), and Very High (VH), are defined for the output trustworthiness using standard triangular and left/right trapezoidal shapes. The corresponding membership functions with a universe of discourse from 0 to 10 are defined by Eq. 6. The five fuzzy sets are asymmetric around the median value (i.e., five) (Fig. 2c) for the following reasons. Goodchild and Li (2012) suggested that
greater weights can be assigned to similar reports that are spatially clustered than to a single report. This system assesses clustered reports which already have relatively greater weights. Therefore, the fuzzy sets representing relatively poor data quality (i.e., VL and L) are placed closer to the left end of the universe of discourse, meaning that a trustworthiness can be linguistically interpreted as low or very low only when it is associated with a sufficiently low value. The peak of VL is not placed at zero to ensure that the peak value stays the same over a certain range (Zhang et al. 2014). Additionally, the wider range of M can maintain sufficient overlap in adjacent fuzzy sets (especially L and M) for the system to respond smoothly (Negnevitsky 2005).

\[
\mu_T(t) =
\begin{cases}
\text{Very Low:} &
  \begin{cases}
  1 & 0 \le t \le 0.5 \\
  -t + \tfrac{3}{2} & 0.5 < t < 1.5 \\
  0 & t \ge 1.5
  \end{cases} \\[4pt]
\text{Low:} &
  \begin{cases}
  0 & t \le 0.5 \\
  \tfrac{2}{3}t - \tfrac{1}{3} & 0.5 < t \le 2 \\
  -\tfrac{2}{3}t + \tfrac{7}{3} & 2 < t < 3.5 \\
  0 & t \ge 3.5
  \end{cases} \\[4pt]
\text{Medium:} &
  \begin{cases}
  0 & t \le 2.5 \\
  \tfrac{2}{5}t - 1 & 2.5 < t \le 5 \\
  -\tfrac{2}{5}t + 3 & 5 < t < 7.5 \\
  0 & t \ge 7.5
  \end{cases} \\[4pt]
\text{High:} &
  \begin{cases}
  0 & t \le 5.5 \\
  \tfrac{2}{3}t - \tfrac{11}{3} & 5.5 < t \le 7 \\
  -\tfrac{2}{3}t + \tfrac{17}{3} & 7 < t < 8.5 \\
  0 & t \ge 8.5
  \end{cases} \\[4pt]
\text{Very High:} &
  \begin{cases}
  0 & t \le 7.5 \\
  \tfrac{2}{3}t - 5 & 7.5 < t < 9 \\
  1 & t \ge 9
  \end{cases}
\end{cases}
\tag{6}
\]

Fuzzy rules

The full IF-THEN fuzzy rule set defined for this system is shown in Fig. 3, using a conjunction, AND, for all the rules (e.g., IF confidence level is SC AND fitness of geographic context is RL THEN trustworthiness is VL). The conjunctions in the fuzzy rules are evaluated using the fuzzy operation intersection (Negnevitsky 2005). Assuming that A and B are two fuzzy sets with membership functions \mu_A and \mu_B, respectively, the fuzzy operation intersection for creating the intersection of the two fuzzy sets is expressed as Eq. 7.

\[
\mu_{A \cap B}(x) = \min\left[ \mu_A(x), \mu_B(x) \right]
\tag{7}
\]

Fig. 3 Fuzzy rule set defined for the system, using a conjunction, AND, for all the rules

System output surface

To evaluate the performance of a Mamdani-style fuzzy system, we used its three-dimensional output surface following the suggestion by Negnevitsky (2005). A satisfactory system is built through empirical tuning until the system generates a gradually changing surface which appropriately emulates subjective human reasoning regarding how the interactions of the system's inputs influence its output in the context in which the problem is viewed.

Fig. 4 Output surface of the fuzzy system

The output surface of our system is shown in Fig. 4. The membership functions and the fuzzy rules mentioned above are decided by assessing this surface. The general trend should be that higher user
confidence levels and higher fitness of geographic context lead to higher trustworthiness, while certain special considerations should be appropriately reflected on the surface. For example, if a report has an extremely low user confidence level (meaning a very low user expertise), its trustworthiness should be very low even if its fitness of geographic context is high. Conversely, even if a report has an extremely low fitness of geographic context, its trustworthiness should be moderate if its user confidence level is very high.

System usage with a running example

Figure 5 shows an example of generating a trustworthiness score (7.68) for a report from two crisp inputs: confidence level (8) and fitness of geographic context (6.5). The red vertical line through the aggregate output fuzzy set depicts the location of the COG.

Fig. 5 Fuzzy inference process for an assumed user report

Once the system has generated the trustworthiness scores for a VGI dataset, a user-preferred threshold is used to reject or accept the reports. Non-outlier reports with trustworthiness scores lower than the threshold will be rejected; otherwise they will be accepted. Outlier reports should be specially treated. Outlier reports with trustworthiness scores lower than an assigned threshold can be simply discarded. However, outlier reports above the threshold should be treated with caution. They should be reserved or held for further observations, i.e., to see whether or not similar reports will be reported nearby to confirm them.

The selection of threshold is context-dependent and subject to the accuracy requirements of specific projects. Setting a higher threshold can reduce the number of false positives (FP), but it will inevitably increase the number of false negatives (FN). Setting a lower threshold can reduce the number of FN, while it will increase the number of FP. In the context of VGI, FN is actually preferable to FP, because incorrectly rejecting good quality VGI is better than incorrectly accepting poor quality VGI. Certainly one can choose a very high threshold to only collect VGI with extremely high trustworthiness scores, and ignore FN.

This system has been implemented using the following tools. The DBSCAN algorithm was integrated into ArcGIS as an extension through Python. The fuzzy logic toolbox of MATLAB was used for performing the fuzzy inference. Figure 6 illustrates the architecture of the implemented system.

Fig. 6 Architecture of the system implementation

Case study: a VGI-based crop pest surveillance

Motivation

VGI has been previously explored for location-based crop pest management given its potential in fostering interactive digital communities in which farmers and experts collaboratively manage crop pest risks (Deng and Chang 2012; Suen et al. 2014). In a VGI-based pest management system, the task of acquiring
geospatial data of crop pest surveillances is delegated to farmers to share their location-based observations. Information relevant to managing crop pests is then discovered from the shared surveillance data through various spatiotemporal analytics and subsequently disseminated to the farmers for them to better manage crop pest risks.

Inspired by these previous studies, in order to demonstrate the usefulness of the fuzzy system in handling VGI quality, the system was adopted to measure the quality of a set of crop pest surveillance reports collected in Xiajiang prefecture of Jiangxi province, China.

Study design and data analysis

VGI collection and quality assessment using the fuzzy system

The major crop type cultivated in Xiajiang prefecture was rice, which accounted for around 90% of the total cropland of the prefecture (216 km²). Two hundred local rice farmers distributed across the prefecture were recruited to conduct a rice pest surveillance.

The pest surveillance was conducted by the farmers from 15 to 25 August, 2014. They reported rice pest incidents (pest occurrences, damages, or both) observed during their daily farming activities. To report an observed pest incident, the observer inserted a flat bamboo chip firmly into the soil of the rice paddy where the pest incident was observed. The observer also recorded the species name, observation time, and confidence level on the bamboo chip. No other coaching to the farmers was conducted to ensure a minimum intervention to the user contributions. After the pest surveillance, we collected the geographic coordinates of the inserted bamboo chips using Trimble® GeoXT handheld GPS devices which delivered a 50 cm positioning accuracy.

Various rice pest incidents were reported; the species included mainly rice stem borers, rice leaf rollers, rice plant hoppers, rice water weevils, and mole crickets. Of the species, the rice stem borers' scope of activity was relatively fixed over time. It thus would be easier to conduct post-surveys to verify the actual presences of the reported rice stem borer incidents. Rice stem borer incident reports were therefore used to evaluate the usefulness of the system. During the pest surveillance period, 209 rice stem borer incident reports were collected.

The quality of the 209 incident reports was assessed using the fuzzy system. A threshold should be assigned to the generated trustworthiness scores of the reports to determine whether or not a report should be accepted. As mentioned above, one can set a high threshold to only collect VGI with extremely high trustworthiness scores and ignore FN. In this case study, however, we intended to preserve as many reports as possible. Thus, a moderate threshold is more appropriate. We used a range of thresholds, from 4 to 6 with an increment of 0.2, to evaluate the performance differences. Outliers, if any, were specially treated according to the method stated in the "System usage with a running example" section. The system generated categorical results, i.e., accepted, rejected, and withheld.
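
The thresholding procedure described above can be sketched as follows. This is a hedged illustration only: the function name `classify_report` and the sample `(score, is_outlier)` pairs are hypothetical, and treating a score exactly equal to the threshold as passing is an assumption, since the text only specifies "lower than" and "above".

```python
def classify_report(score, is_outlier, threshold):
    """Return 'accepted', 'rejected', or 'withheld' for a single report."""
    if score < threshold:
        # Low-scoring reports are rejected (discarded) whether or not they are outliers.
        return "rejected"
    # Outliers that pass the threshold are held back for further observation.
    return "withheld" if is_outlier else "accepted"


# Thresholds from 4 to 6 with an increment of 0.2 (rounded to avoid float drift).
THRESHOLDS = [round(4 + 0.2 * i, 1) for i in range(11)]

# Hypothetical (trustworthiness score, is_outlier) pairs, for illustration only.
reports = [(7.68, False), (3.10, False), (6.40, True), (2.00, True)]

for threshold in THRESHOLDS:
    labels = [classify_report(score, outlier, threshold) for score, outlier in reports]
    print(threshold, labels)
```

Each row of the printed output corresponds to one candidate threshold, so the three-way split of the reports can be compared across the whole range before a threshold is fixed.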
Ground truth data collection

From 26 August to 1 September, 2014 (immediately after the pest surveillance conducted by the farmers), a field pest survey was conducted by the pest management experts from the local agricultural department to verify the actual presence of the 209 reported rice stem borer incidents. The experts examined evidence including feeding wounds or holes, larval frass, egg masses, damage symptoms, and pupae of the stem borers within a two-meter buffer zone (considering the mobility of the borers) surrounding each bamboo chip. If none of this evidence could be detected within the buffer zone of a report, the report was rejected by the experts; otherwise, it was accepted. The pest survey thus generated categorical results, i.e., reports being accepted and reports being rejected. Since the survey was conducted by experienced experts, its results were considered accurate ground truth data.

Conformity tests

Subsequently, conformity tests were conducted. For each threshold within the interval [4, 6], a Cohen’s kappa statistic (Viera and Garrett 2005) showing the degree of agreement between the fuzzy system-generated results and the pest survey results was calculated. The sensitivity (Eq. 8) and specificity (Eq. 9) were also calculated. Note that the reports in withheld status were not included in the calculations. The system-generated results corresponding to the highest kappa value were mapped for visualization, and a confusion matrix was provided to show the details of the degree of agreement.

\[ \text{Sensitivity} = \frac{TP}{TP + FN}, \tag{8} \]

\[ \text{Specificity} = \frac{TN}{TN + FP}, \tag{9} \]

where TP, TN, FP, and FN represent true positive, true negative, false positive, and false negative, respectively.

To evaluate the impacts of sample sizes on the system performance, randomization tests were performed. Using the threshold that corresponded to the highest kappa value and following the method mentioned above, 30 rounds of conformity tests using 30 groups of different subsets from the whole sample set were conducted. That is, ten groups of 90 %-subsets, ten groups of 60 %-subsets, and ten groups of 30 %-subsets were randomly extracted from the whole dataset to conduct the conformity tests. Cohen’s kappa statistic was calculated for each run of the tests. Mean and standard deviation were calculated for the Cohen’s kappa statistics from each of the same percentile subsets.

Results

The fuzzy system identified eight clusters and 16 outliers among the 209 rice stem borer incident reports (Fig. 7a) and produced trustworthiness scores from 0.51 to 9.10 (Fig. 7b).

The results of the conformity tests using different thresholds are shown in Table 1. The sensitivity and specificity values confirm that elevating and lowering the thresholds can increase the numbers of FN and FP, respectively, leading to lower kappa values. The highest kappa value (0.67) corresponds to the thresholds 4.8 and five. Therefore, for the purpose of this case study, the integer five was adopted as the threshold for further analyses, although the threshold 4.8 obtained the same kappa value as the threshold five did.

With a threshold of five (see Fig. 7c), the fuzzy system rejected 29 reports, including six outliers with trustworthiness scores lower than five (red circles), and accepted 170 reports (green circles). Ten outlier reports were held (blue circles) due to their relatively high user confidence levels. The conformity test showed that around 91 % of the fuzzy system-generated results agreed with the survey results, with a corresponding kappa value of 0.67. Details are visualized by the confusion matrix shown in Fig. 8. Of the ten pest incident reports that were held, eight had in fact suffered infestations according to the survey results.

Finally, using the threshold five, it was observed that the system performed better with larger sample sizes, as the mean kappa values increased with the increase of sample size (Fig. 9). The standard deviations also decreased with the increase of sample size (Fig. 9).

Discussion

Features of the system

Using the pest surveillance data, we demonstrated that a proper use of fuzzy set theory can lead to desired
Fig. 7 Maps showing the VGI quality assessment results generated by the fuzzy system. Background map is the road map of Bing Maps. Thumbnail on the lower right corner shows the relative location of Xiajiang prefecture in China. a–c Grey polyline represents boundary of Xiajiang prefecture. a 209 reported rice stem borer incidents. b Inferred trustworthiness scores of the reports. c Final statuses of the reports based on a threshold five
Table 1 Results of the conformity tests using different thresholds

Threshold     4.0   4.2   4.4   4.6   4.8   5.0   5.2   5.4   5.6   5.8   6.0
Kappa value   0.59  0.59  0.64  0.62  0.67  0.67  0.37  0.35  0.34  0.34  0.29
Sensitivity   0.98  0.98  0.98  0.96  0.96  0.96  0.76  0.73  0.70  0.69  0.65
Specificity   0.53  0.53  0.59  0.60  0.66  0.66  0.73  0.76  0.81  0.81  0.81
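The measures reported in Table 1 can be computed from paired system and survey labels. The sketch below is illustrative Python, not the authors' implementation: the label strings, the function names, and the seeded subsampling helper are assumptions. Only the kappa values in `KAPPA_BY_THRESHOLD` are copied from Table 1.

```python
import random
from statistics import mean, stdev


def confusion_counts(system_labels, survey_labels):
    """Tally (TP, FP, FN, TN); withheld reports are skipped, as in the paper."""
    tp = fp = fn = tn = 0
    for predicted, actual in zip(system_labels, survey_labels):
        if predicted == "withheld":
            continue
        if predicted == "accepted":
            if actual == "accepted":
                tp += 1
            else:
                fp += 1
        elif actual == "accepted":
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity(tp, fn):
    return tp / (tp + fn)  # Eq. 8


def specificity(tn, fp):
    return tn / (tn + fp)  # Eq. 9


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a 2x2 table: (p_o - p_e) / (1 - p_e)."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column totals of the table.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (observed - expected) / (1 - expected)


def subset_kappas(pairs, fraction, runs=10, seed=0):
    """Mean and standard deviation of kappa over random subsets of (system, survey) pairs."""
    rng = random.Random(seed)
    size = round(len(pairs) * fraction)
    kappas = []
    for _ in range(runs):
        sample = rng.sample(pairs, size)
        system_labels, survey_labels = zip(*sample)
        kappas.append(cohen_kappa(*confusion_counts(system_labels, survey_labels)))
    return mean(kappas), stdev(kappas)


# Kappa values copied from Table 1, keyed by threshold.
KAPPA_BY_THRESHOLD = {4.0: 0.59, 4.2: 0.59, 4.4: 0.64, 4.6: 0.62, 4.8: 0.67, 5.0: 0.67,
                      5.2: 0.37, 5.4: 0.35, 5.6: 0.34, 5.8: 0.34, 6.0: 0.29}
best_kappa = max(KAPPA_BY_THRESHOLD.values())
best_thresholds = [t for t, k in KAPPA_BY_THRESHOLD.items() if k == best_kappa]
```

Here `best_thresholds` holds the two tied thresholds, 4.8 and 5.0. Calling `subset_kappas(pairs, 0.9)`, `subset_kappas(pairs, 0.6)`, and `subset_kappas(pairs, 0.3)` on a list of (system label, survey label) pairs would mirror the three groups of subset tests, under the assumption that subsets are drawn without replacement.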
VGI quality assurance results. The fuzzy system was developed based on the idea that the quality of VGI can be assured based on its geographic context and provenance of user expertise, and trust can be used as a proxy of quality. Fuzziness involved in trust judgement requires special attention, and quality itself is also inherently fuzzy. Therefore, fuzzy set theory was adopted as the key to the system design, which easily incorporates semantic knowledge into the quality assessment. Bordogna et al. (2014a) promote a linguistic decision making approach to assess the quality of VGI. Our system extends their work by demonstrating the utility of fuzzy set theory in assessing the quality of user-contributed species surveillance reports in particular.

Fig. 8 Confusion matrix visualizing the degree of agreement

The system design echoes the view of van Exel et al. (2010) that assessing VGI quality must consider not only feature quality and user quality but also the interdependency between them. To account for fitness
of geographic context (feature quality), DBSCAN clustering was used for identifying VGI clusters. As pointed out by Goodchild and Li (2012), quality measures of VGI can arise from the data themselves. More credit can be given to a clustering of similar reports than to a single report, in which case one can develop metrics of quality based on the clustered reports. In our system, the metric is based on the proximities of user reports to their corresponding cluster centers, so as to measure the reports’ fitness of geographic context. It resembles Gao et al. (2014), in which a distance-decay function is used to measure the memberships of a cluster of VGI points assigned to Harvard University campus. The closer a point was to the campus core area, the higher membership the point obtained. Similarly, Liu et al. (2010) used an interpolation procedure to measure the weights of candidate point locations assigned to South China region. The closer a location was to the core area of South China region, the higher weight the point obtained. In addition, confidence was used as a surrogate to represent provenance of user expertise (user quality). The case study confirmed that requesting the volunteers to self-evaluate the correctness of their observations was useful for assessing the quality of the generated information. By considering the interdependency between the feature quality and user quality, the fuzzy system could detect those VGI which seemed to fit geographic context well but in fact were of high uncertainty.

Fig. 9 Means and standard deviations (shown as whiskers) calculated for the Cohen’s kappa statistics from the three groups of percentile subsets extracted for testing the impacts of sample sizes

Through the compositional inference rules (Fig. 3), the system can appropriately emulate human reasoning about how the interactions of the two system input variables influence the system output (Fig. 4). A simple linear method, for example, combining the values of the two variables by summation, does not have the same capability, as demonstrated by three exemplar input–output combinations in Table 2. For Combinations 1 and 2, the simple linear method obtains two identical values (i.e., 12), while the fuzzy system generates two different values (i.e., 3.6 and 5). This is an advantage of using a rule-based fuzzy system. In Combination 1, although the fitness of geographic context is perfect, the fuzzy system treats the report as being less trustworthy than that of Combination 2 due to its overly low user confidence. In Combination 2, although the fitness of geographic context is relatively low, the system ranks the report with higher credibility as the user confidence is very high. In Combinations 1 and 3, the simple linear method obtains two different values (i.e., 12 and 7.9), while the fuzzy system generates two identical values (i.e., 3.6). The former method gives more credit to Combination 1. However, for human judgement, it appears to be less appropriate due to its overly low user confidence level. Therefore, a fuzzy system with appropriately defined system parameters (e.g., membership functions) can deal with such complicated non-linear cases through mimicking human thinking.

Table 2 Three different input–output combinations

Combination   C    F     T     Sum
1             2    10    3.6   12
2             10   2     5     12
3             5    2.9   3.6   7.9

C, F, and T denote user confidence level, fitness of geographic context, and system-derived trustworthiness, respectively

Furthermore, VGI datasets are often large in volume as VGI contributors on the ground are ubiquitous. In the case study, although the entire dataset was not large, a trend was observed that the system’s performance improved with increasing sample size, rather than the opposite (Fig. 9). The system is also robust at handling non-clustered VGI (i.e., outlier VGI detected by DBSCAN). Such VGI can be processed with two options, i.e., discarding (for reports with low user confidence levels) and holding
(for reports with high user confidence levels), rather than simply being rejected as poor quality VGI. In the case study, the survey results showed that eight of the ten outlier reports with holding statuses had in fact suffered pest infestations. This supports our view that outlier VGI should not be simply discarded. Finally, since user histories (provenance) can be harvested from VGI and fitness of geographic context can be determined from VGI, the approach also has the potential to be adapted and applied to similar VGI applications in different contexts, e.g., VGI-based earthquake casualty surveillance.

Potential future improvement

The fuzzy system should be extended in ways that generalize its applicability. Fuzzy logic enables tools to model the inherent fuzziness that would otherwise be neglected by traditional crisp logic, while it also introduces subjectivity into the modelling process. Imprecision related to subjectivity has often been cited as a limitation in conventional fuzzy systems (Adhikari and Li 2013; Al-kheder et al. 2008). In our study, although the selection and tuning of the system parameters are justified, they are still the results of a subjective process. Therefore, optimizing system parameters is perhaps the most important improvement.

Taking the case study for example, the highest kappa value obtained was 0.67 on a 91 % agreement. Although 0.67 is considered a substantial agreement (Viera and Garrett 2005), the specificity (0.66) was not as good as the sensitivity (0.96) (Table 1). In order to preserve as many reports as possible while reducing the number of false positives, one solution is to further improve the system through parameter calibrations (e.g., membership function calibrations) based on sensitivity analyses using the pest survey data (ground truth data) collected in the case study. After the calibrations, the system can be generalized to larger spatiotemporal extents with greater reliability. Another solution is to use the consensus approach, in which system parameters are optimized based on the subjective opinions of multiple decision-makers (Zhang et al. 2014). However, both solutions are often laborious and time-consuming. To remedy this problem, a machine learning approach, which involves the use of an artificial neural network to determine the appropriate system parameters automatically, seems more promising, as demonstrated in studies using combined neuro-fuzzy systems for understanding environmental quality issues (e.g., Carnevale et al. 2009; Yan et al. 2010). It will be interesting to investigate how such systems can be utilized to better assure VGI quality.

Moreover, in calculating fitness of geographic context, the spatial extent of a cluster is subject to the VGI points within the cluster. A small number of false contributions in a VGI cluster would not significantly affect the cluster’s spatial extent (the spatial extent affects the memberships of the points within the cluster), especially when the point density of the cluster is high. However, if the majority of the contributions in a VGI cluster are false contributions, our approach will be less effective because the uncertainty about the spatial extent of the cluster is high. This problem points to the need to incorporate a user reputation database into our fuzzy system to exclude contributions from contributors with lower reputation before our system performs refined data quality assurance. The idea is similar to that of the facilitated-VGI system suggested in Cinnamon and Schuurman (2013).

Conclusion

In this paper a fuzzy system to assure the quality of VGI collected for the purposes of species surveillance is presented. Although a growing volume of volunteered geospatial data on species surveillance is being garnered from the general public, means for VGI quality assurance are still limited. Developing robust computational approaches to assure the quality of VGI is crucial to the development of such public participatory surveillance programmes. The fuzzy system has the potential to benefit relevant experts, scientists, policy makers, and ordinary VGI users alike. Quantitatively, the usefulness of the system is demonstrated through a crop pest surveillance case study, although further calibrations of the system parameters will be needed. Qualitatively, the system has various features, mainly its advantages in terms of linguistic fuzziness handling, geographic context measuring, provenance acquiring, and outlier treating. Nevertheless, future work is needed to establish a neuro-fuzzy system and a user reputation system to generalize its applicability.

Acknowledgments This research has been supported by National University of Singapore (NUS); and Singapore
National Research Foundation under its International Research Centre @ Singapore Funding Initiative and administered by the IDM Programme Office through the Centre of Social Media Innovations for Communities (COSMIC).

Compliance with ethical standards

Conflict of interest We declare that there are no real or perceived conflicts of interest involved in the submission and/or publication of this manuscript.

Ethical standards This research involved human participants (farmers) as volunteers contributing location-based crop pest surveillance reports for evaluating the performance of the proposed approach of VGI quality assurance. Verbal consents of participation were sought from the participants.

References

Adhikari, B., & Li, J. (2013). Modelling ambiguity in urban planning. Annals of GIS, 19(3), 143–152.
Al-kheder, S., Wang, J., & Shan, J. (2008). Fuzzy inference guided cellular automata urban-growth modelling using multi-temporal satellite images. International Journal of Geographical Information Science, 22(11–12), 1271–1293.
Bishr, M. (2007). Weaving space into the web of trust: An asymmetric spatial trust model for social networks. In Proceedings of the 1st conference on social semantic web, Leipzig, Germany, pp. 35–46.
Bishr, M., & Janowicz, K. (2010). Can we trust information? The case of volunteered geographic information. In Proceedings of the workshop “towards digital earth: Search, discover and share geospatial data” at future internet symposium, Berlin, Germany, pp. 11–16.
Bishr, M., & Mantelas, L. (2008). A trust and reputation model for filtering and classifying knowledge about urban growth. GeoJournal, 72(3), 229–237.
Bordogna, G., Carrara, P., Criscuolo, L., Pepe, M., & Rampini, A. (2014a). A linguistic decision making approach to assess the quality of volunteer geographic information for citizen science. Information Sciences, 258, 312–327. doi:10.1016/j.ins.2013.07.013.
Bordogna, G., Carrara, P., Criscuolo, L., Pepe, M., & Rampini, A. (2014b). On predicting and improving the quality of volunteer geographic information projects. International Journal of Digital Earth, 1–22. doi:10.1080/17538947.2014.976774.
Brando, C., & Bucher, B. (2010). Quality in user generated spatial content: A matter of specifications. In Proceedings of the 13th AGILE international conference on geographic information science, Guimarães, Portugal, pp. 1–8.
Caha, J., Tuček, P., Vondráková, A., & Paclíková, L. (2012). Slope analysis of fuzzy surfaces. Transactions in GIS, 16(5), 649–661.
Carnevale, C., Finzi, G., Pisoni, E., & Volta, M. (2009). Neuro-fuzzy and neural network systems for air quality control. Atmospheric Environment, 43(31), 4811–4821.
Celino, I. (2013). Human computation VGI provenance: Semantic web-based representation and publishing. IEEE Transactions on Geoscience and Remote Sensing, 51(11), 5137–5144.
Chang, E., Thomson, P., Dillon, T., & Hussain, F. (2005). The fuzzy and dynamic nature of trust. In S. Katsikas, J. López, & G. Pernul (Eds.), Trust, privacy, and security in digital business (pp. 161–174). Berlin: Springer.
Cinnamon, J., & Schuurman, N. (2013). Confronting the data-divide in a time of spatial turns and volunteered geographic information. GeoJournal, 78(4), 657–674.
Cofta, P. (2007). Confidence, trust and identity. BT Technology Journal, 25(2), 173–178.
Coleman, D. J., Georgiadou, Y., & Labonte, J. (2009). Volunteered geographic information: the nature and motivation of produsers. International Journal of Spatial Data Infrastructures Research, 4(1), 332–358.
Deng, Y., & Chang, K. T. (2012). A design framework for event recommendation in novice low-literacy communities. International Journal of Social, Behavioral, Educational, Economic, Business and Industrial Engineering, 6(5), 999–1004.
Devillers, R., Bédard, Y., & Jeansoulin, R. (2005). Multidimensional management of geospatial data quality information for its dynamic use within GIS. Photogrammetric Engineering and Remote Sensing, 71(2), 205–215.
Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial database with noise. In Proceedings of the 2nd international conference on knowledge discovery and data mining, Portland, Oregon, USA, pp. 226–231.
Foody, G. M., See, L., Fritz, S., Van der Velde, M., Perger, C., Schill, C., & Boyd, D. S. (2013). Assessing the accuracy of volunteered geographic information arising from multiple contributors to an internet based collaborative project. Transactions in GIS, 17(6), 847–860.
Gao, S., Li, L., Li, W., Janowicz, K., & Zhang, Y. (2014). Constructing gazetteers from volunteered big geo-data based on Hadoop. Computers, Environment and Urban Systems. doi:10.1016/j.compenvurbsys.2014.02.004.
Girres, J. F., & Touya, G. (2010). Quality assessment of the French OpenStreetMap dataset. Transactions in GIS, 14(4), 435–459.
Goodchild, M. F. (2007). Citizens as sensors: The world of volunteered geography. GeoJournal, 69(4), 211–221.
Goodchild, M. F. (2009). NeoGeography and the nature of geographic expertise. Journal of Location Based Services, 3(2), 82–96.
Goodchild, M. F., & Li, L. (2012). Assuring the quality of volunteered geographic information. Spatial Statistics, 1, 110–120. doi:10.1016/j.spasta.2012.03.002.
Haklay, M. (2010). How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets. Environment and Planning B: Planning and Design, 37(4), 682–703.
Haklay, M., Basiouka, S., Antoniou, V., & Ather, A. (2010). How many volunteers does it take to map an area well? The validity of Linus’ law to volunteered geographic information. The Cartographic Journal, 47(4), 315–322.
Keßler, C., Janowicz, K., & Bishr, M. (2009). An agenda for the next generation gazetteer: Geographic information
contribution and retrieval. In Proceedings of the 17th ACM SIGSPATIAL international conference on advances in geographic information systems, Seattle, Washington, USA, pp. 91–100.
Kuhn, W. (2007). Volunteered geographic information and GIScience. In NCGIA and Vespucci workshop on volunteered geographic information, Santa Barbara, CA, USA, pp. 86–97.
Liu, Y., Yuan, Y., Xiao, D., Zhang, Y., & Hu, J. (2010). A point-set-based approximation for areal objects: A case study of representing localities. Computers, Environment and Urban Systems, 34(1), 28–39.
Mamdani, E. H. (1974). Application of fuzzy algorithms for control of simple dynamic plant. Proceedings of the Institution of Electrical Engineers, 121(12), 1585–1588.
Manca, G., & Curtin, K. (2012). Fuzzy analysis for modeling regional delineation and development: The case of the Sardinian mining geopark. Transactions in GIS, 16(1), 55–79.
Maué, P., & Schade, S. (2008). Quality of geographic information patchworks. In Proceedings of the 11th AGILE international conference on geographic information science, Girona, Spain, pp. 1–8.
Negnevitsky, M. (2005). Artificial intelligence: a guide to intelligent systems (2nd ed.). London: Pearson Education.
Peterson, D. K., & Pitz, G. F. (1988). Confidence, uncertainty, and the use of information. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14(1), 85–92.
Power, C., Simms, A., & White, R. (2001). Hierarchical fuzzy pattern matching for the regional comparison of land use maps. International Journal of Geographical Information Science, 15(1), 77–100.
Pulford, B. D. (1996). Overconfidence in human judgement. PhD dissertation. University of Leicester.
Salk, C. F., Sturn, T., See, L., Fritz, S., & Perger, C. (2015). Assessing quality of volunteer crowdsourcing contributions: Lessons from the Cropland Capture game. International Journal of Digital Earth, 1–17. doi:10.1080/17538947.2015.1039609.
Seeger, C. J. (2008). The role of facilitated volunteered geographic information in the landscape planning and site design process. GeoJournal, 72(3), 199–213.
Sniezek, J. A., & Buckley, T. (1995). Cueing and cognitive conflict in judge-advisor decision making. Organizational Behavior and Human Decision Processes, 62(2), 159–174.
Sniezek, J. A., & van Swol, L. M. (2001). Trust, confidence, and expertise in a judge-advisor system. Organizational Behavior and Human Decision Processes, 84(2), 288–307.
Song, S., Hwang, K., & Macwan, M. (2004). Fuzzy trust integration for security enforcement in grid computing. In H. Jin, G. R. Gao, Z. Xu, & H. Chen (Eds.), Network and parallel computing (pp. 9–21). Berlin: Springer.
Song, W., & Sun, G. (2010). The role of mobile volunteered geographic information in urban management. In Proceedings of the 18th international conference on geoinformatics, Beijing, China, pp. 1–5.
Suen, R. C. L., Chang, K. T. T., Wan, M. P.-H., Ng, Y. C., & Tan, B. C. Y. (2014). Interactive experiences designed for agricultural communities. In CHI ’14 extended abstracts of the conference on human factors in computing systems, Toronto, Canada, pp. 551–554.
Tobler, W. R. (1970). A computer movie simulating urban growth in the Detroit region. Economic Geography, 46, 234–240. doi:10.2307/143141.
Trame, J., & Keßler, C. (2011). Exploring the lineage of volunteered geographic information with heat maps. In Proceedings of GeoViz 2011: Linking geovisualization with spatial analysis and modeling, Hamburg, Germany.
Tulloch, D. L. (2008). Is VGI participation? From vernal pools to video games. GeoJournal, 72(3), 161–171.
van Exel, M., Dias, E., & Fruijtier, S. (2010). The impact of crowdsourcing on spatial data quality indicators. In Proceedings of the 6th GIScience international conference on geographic information science, Zurich, Switzerland, pp. 1–4.
Viera, A. J., & Garrett, J. M. (2005). Understanding interobserver agreement: The kappa statistic. Family Medicine, 37(5), 360–363.
Yan, H., Zou, Z., & Wang, H. (2010). Adaptive neuro fuzzy inference system for classification of water quality status. Journal of Environmental Sciences, 22(12), 1891–1896.
Yongting, C. (1996). Fuzzy quality and analysis on fuzzy probability. Fuzzy Sets and Systems, 83(2), 283–290.
Yu, Z., & Tsai, J. J. P. (2006). Fuzzy model tuning for intrusion detection systems. In L. T. Yang, H. Jin, J. Ma, & T. Ungerer (Eds.), Autonomic and trusted computing (pp. 193–204). Berlin: Springer.
Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8(3), 338–353.
Zhang, Z., Demšar, U., Rantala, J., & Virrantaus, K. (2014). A fuzzy multiple-attribute decision-making modelling for vulnerability analysis on the basis of population information for disaster management. International Journal of Geographical Information Science, 28(9), 1922–1939.
Zhu, A., Zhang, G., Wang, W., Xiao, W., Huang, Z., Dunzhu, G., et al. (2015). A citizen data-based approach to predictive mapping of spatial variation of natural phenomena. International Journal of Geographical Information Science, 29(10), 1864–1886.
Zielstra, D., & Zipf, A. (2010). A comparative study of proprietary geodata and volunteered geographic information for Germany. In Proceedings of the 13th AGILE international conference on geographic information science, Guimarães, Portugal.