
GIS

Data Acquisition and Editing

Introduction
Data acquisition is the process of getting data into the computer.
Spatial data can be obtained from different sources and in different
formats, and can be input into a GIS by different methods.
The whole process of data encoding and editing is called the data stream.
Analogue vs. digital data distinction.

Data Collection Workflow


Planning includes establishing user requirements, garnering
resources, and developing a project plan.
Preparation involves obtaining data, redrafting poor-quality
map sources, editing scanned map images, removing noise,
and setting up appropriate GIS hardware and software
systems to accept data.
Digitizing and transfer are the stages where the majority of
the effort will be expended.
Editing and improvement covers many techniques designed
to validate data, as well as correct errors and improve
quality.
Evaluation is the process of identifying project successes
and failures.

Stages in Data Collection Projects


Planning
Preparation
Digitizing / Transfer
Editing / Improvement
Evaluation

Important Guidelines for Data Capture


Scale or resolution: How much detail do you need for your study?
Measurement level: Do you need ordinal data, or are categories
enough?
Accuracy: How well can your measurement tool capture your data?
Sampling method: Do you collect all the data in all the places you
need?
Timeliness: Do you work with time-sensitive data that change quickly
and need to be collected right away?
Data type: Are the data appropriate for your application, both in
subject matter and in format? (Do you need field data or satellite data,
for example, or do you need soil data rather than temperature data?)
Data classification system: Do you use the same data classes as
other layers in your database (for example, land-use classes from 1955
versus 2005)?
Completeness: Have you collected all the data that you need to
answer your question?

Types of Data Acquisition


Primary

Sensory data: data most commonly associated with sensing devices,
such as the Global Positioning System (GPS), total stations, and
various forms of imagery, including both aerial photography and
digital satellite data.

Statistical data: include field data and census data, both of which
usually rely on some form of direct contact by a person to collect.

Secondary
Data collected for other specific purposes that can be converted for use
in GIS, e.g. by keyboard entry, scanning, and digitizing.

Primary Data Capture


Primary data are data captured specifically for GIS use.
Raster: remote sensing (primary capture)
Usually involves actual sensor collection,
e.g. SPOT or IKONOS satellites and aerial photography
Passive and active sensors

Resolution is a key consideration:
Spatial
Spectral
Temporal

Primary Data Capture in Raster Data

Disadvantages are:
Resolution is often too coarse (especially with satellite-mounted
sensors);
Most optical sensors are restricted by cloud cover (radar sensors
are an exception).

Vector Primary Data Capture


Surveying
Locations of objects are determined by angle and distance
measurements from known locations
Uses expensive field equipment and crews
Most accurate method for large-scale, small areas

GPS
Collection of satellites used to fix locations on the Earth's surface
Differential GPS used to improve accuracy

Total Station
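
The surveying principle above (a location fixed by an angle and a distance from a known station) can be sketched in a few lines. This is a minimal illustration, assuming a flat grid, a bearing measured clockwise from grid north, and a horizontal distance; the function name `locate_point` and the sample coordinates are hypothetical.

```python
import math


def locate_point(known_e, known_n, bearing_deg, distance):
    """Return (easting, northing) of a target observed from a known station.

    bearing_deg is measured clockwise from grid north (assumption);
    distance is horizontal, in the same units as the coordinates.
    """
    bearing = math.radians(bearing_deg)
    easting = known_e + distance * math.sin(bearing)
    northing = known_n + distance * math.cos(bearing)
    return easting, northing


# Hypothetical station at (1000, 2000); target 50 units due east.
e, n = locate_point(1000.0, 2000.0, 90.0, 50.0)
print(round(e, 3), round(n, 3))
```

Real survey reduction also corrects for slope, instrument height, and map projection scale factor, none of which are modelled here.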

Primary Data Capture: Field Data


Assembling field data can involve:
conducting house-to-house surveys
collecting traffic data along roads
recording the air temperature and other atmospheric data
gathering soils, vegetation, insects, or any number of other
environmental samples.

Primary Data Capture: Field Data Sampling

It is physically impossible to collect temperature or elevation data
everywhere.
In each case, one is forced to collect data from a sample of the total.
For GIS, sampling of geographic space is required.

Primary Data Capture: Field Data Sampling Methods

Clustered: Sampling focuses on distinct areas that have a lot of
features from which one can sample.
Systematic: Use a specific, often regular, pattern to sample. For
example, one sample at every meter along a line.
Random: The sampling has no pattern at all.
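
To make the difference between systematic and random sampling concrete, here is a small sketch that generates sample locations inside a rectangular study area. The function names and the extent are hypothetical, and clustered sampling is left out because it depends on where features are concentrated.

```python
import random


def systematic_sample(x_min, x_max, y_min, y_max, spacing):
    """Sample locations on a regular grid with the given spacing."""
    nx = int((x_max - x_min) / spacing) + 1
    ny = int((y_max - y_min) / spacing) + 1
    return [(x_min + i * spacing, y_min + j * spacing)
            for j in range(ny) for i in range(nx)]


def random_sample(x_min, x_max, y_min, y_max, n, seed=None):
    """Sample n locations with no spatial pattern (uniform random)."""
    rng = random.Random(seed)
    return [(rng.uniform(x_min, x_max), rng.uniform(y_min, y_max))
            for _ in range(n)]


# Hypothetical 10 x 10 study area.
grid_points = systematic_sample(0, 10, 0, 10, 5)
random_points = random_sample(0, 10, 0, 10, 9, seed=42)
print(len(grid_points), len(random_points))
```

Passing a seed makes the random sample repeatable, which helps when a field campaign has to be documented or revisited.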

Primary Data Capture: Field Data Sampling Methods (Cont.)

Stratification: divide data into groups, or strata

Primary Data Capture: Field Data Sampling Methods (Cont.)

Stratification (Example)
To stratify your sample of who watches certain television programs in
your city, you could divide the city into subportions, or
neighbourhoods. Then, you pick a certain number (for example, 25
people) in each neighbourhood to sample randomly, systematically (for
example, every fifth house), or clustered (such as where housing
density is highest).
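
The television example can be sketched as follows: households are grouped by neighbourhood (the strata), then a fixed number is drawn at random from each group. The household IDs, neighbourhood names, and the function `stratified_sample` are invented for illustration only.

```python
import random


def stratified_sample(households, per_stratum, seed=None):
    """Draw a random sample of equal size from every stratum.

    households: list of (household_id, neighbourhood) pairs.
    Returns {neighbourhood: [sampled household_ids]}.
    """
    rng = random.Random(seed)
    strata = {}
    for household_id, neighbourhood in households:
        strata.setdefault(neighbourhood, []).append(household_id)
    sample = {}
    for neighbourhood, ids in strata.items():
        k = min(per_stratum, len(ids))  # small strata are sampled whole
        sample[neighbourhood] = rng.sample(ids, k)
    return sample


# Hypothetical city with two neighbourhoods of 40 households each.
households = [(f"{nb}-{i}", nb) for nb in ("North", "South") for i in range(40)]
picked = stratified_sample(households, per_stratum=25, seed=7)
print({nb: len(ids) for nb, ids in picked.items()})
```

Swapping `rng.sample` for a step through every fifth ID would give the systematic variant mentioned in the example.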

Secondary Geographic Data Capture (SGDC)

Data collected for other specific purposes can be converted for use in
GIS
Raster conversion
Scanning of maps, aerial photographs, documents, etc.
Important scanning parameters are spatial and spectral (bit depth)
resolution
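
A rough worked example shows why both scanning parameters matter: spatial resolution (dots per inch) together with map scale sets the ground size of a pixel, and bit depth sets the file size. The map dimensions, scale, and function names below are assumptions chosen for illustration.

```python
INCH_IN_METRES = 0.0254


def ground_pixel_size_m(scale_denominator, dpi):
    """Ground distance covered by one scanned pixel, in metres."""
    return INCH_IN_METRES / dpi * scale_denominator


def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Approximate uncompressed raster size for a scanned sheet."""
    cols = round(width_in * dpi)
    rows = round(height_in * dpi)
    return (cols * rows * bits_per_pixel + 7) // 8


# Hypothetical 20 x 30 inch sheet at 1:50,000, scanned at 300 dpi.
print(round(ground_pixel_size_m(50_000, 300), 2), "m per pixel")
print(uncompressed_size_bytes(20, 30, 300, 8), "bytes at 8 bits per pixel")
print(uncompressed_size_bytes(20, 30, 300, 1), "bytes at 1 bit per pixel")
```

Doubling the dpi halves the pixel ground size but quadruples the file size, so the scan settings are usually chosen from the detail the source map can actually support.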

SGDC: Vector Secondary Data Capture


Collection of vector objects from maps,
photographs, plans, etc.
Digitizing

Manual (table)
Heads-up and vectorization

Photogrammetry: the science and technology of making measurements
from photographs, etc.
COGO: Coordinate Geometry

SGDC: 1- Keyboard Entry


Keycoding is the entry of data into a file at a computer terminal.
This technique is used for attribute data that are only available on
paper.
It may be appropriate for tabular data, or for small numbers of
coordinate pairs read from a paper map source or pocket GPS.
Text scanners and OCR software can be used to read data automatically.

SGDC: 2- Manual Digitizing


The most common method of encoding spatial features from paper maps.
It is also used for map encoding where topology is required and for
digitizing features of interest from hard-copy aerial photographs.
Manual digitizing requires a digitizing table that is linked to a
computer workstation.
Two modes of digitizing: point mode and stream mode

Digitizing

SGDC: Digitizing Cont.


Manual digitizing of paper maps is one of the main sources of
positional error in GIS.
The accuracy of encoding depends on factors such as the scale and
resolution of the source map and the quality of the equipment and
software being used.
Errors can be introduced by incorrect registration of the map
document or by hand wobble.

SGDC: Digitizing Cont.


Manual digitizing can also be used to digitize low volumes of data on
demand from scanned and geocorrected digital map images.

Many GIS packages provide facilities for on-screen digitizing using
raster backdrop images as a guide.

SGDC: 2 (a)- Heads-Up Digitizing


Heads-up (on-screen) digitizing
One way to create vectors from raster layers is to digitize vector
objects manually straight off a computer screen using a mouse or
digitizing cursor and GIS software.

3- Automatic Digitizing
Manual digitizing is a time-consuming, tedious process.
When a large number of complex maps need to be digitized, a more
expensive alternative is used: automatic digitizing.
Two methods:

Scanning
Automatic line following

Scanning

The most commonly used method, and appropriate when raster data are
required.
A scanner is a piece of hardware for converting an analogue source
document into a digital raster format.
Three types:

Flat-bed scanners
Rotating drum scanners
Large-format feed scanners

Scanning (Flat Bed)

Scanning (Rotating Drum)

Scanning (Large-format feed scanners)

KartoScan FB VLS

Practical problems: scanning

The possibility of optical distortion when using flat-bed scanners.
The automatic scanning of unwanted information.
The selection of an appropriate scanning tolerance to ensure important
data are encoded and background data are ignored.
The format of files produced and the input of data to GIS software.
The amount of editing required to produce data suitable for analysis.

Automatic line follower

Appropriate where digital versions of clear, distinctive lines on a
map are required.
It mimics manual digitizing and uses a laser and light-sensitive
device to follow the lines on the map.
It is a vector device and produces output as (x, y) coordinates.
Some difficulties are faced when digitizing dashed or contour lines.

4- Electronic Data Transfer


It is appropriate when the data are already available in digital form.
Most of the time there is a need to transform or convert the data to
an appropriate format compatible with the GIS software.
Most GIS software will allow data conversion.
Obtaining data from other sources requires users to address a range
of important questions.
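
As an illustration of format conversion, the sketch below turns a small CSV table of point coordinates into a GeoJSON FeatureCollection using only the Python standard library. The column names (`lon`, `lat`) and the sample rows are assumptions; a real transfer would also need to confirm the coordinate reference system of the incoming data.

```python
import csv
import io
import json


def csv_to_geojson(csv_text, x_field="lon", y_field="lat"):
    """Convert CSV rows with coordinate columns into GeoJSON point features."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        x = float(row.pop(x_field))
        y = float(row.pop(y_field))
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [x, y]},
            "properties": row,  # remaining columns become attributes
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical table exported from a data logger.
table = "id,name,lon,lat\n1,Well A,30.5,-1.25\n2,Well B,30.75,-1.5\n"
print(json.dumps(csv_to_geojson(table), indent=2))
```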

Cont.
Spatial data may be collected in digital form and transferred from
devices such as GPS, total stations, and data loggers.
Data may be purchased from a supplier or obtained from an agency.
Remotely sensed data are normally provided in electronic form.

Data Editing
After data encoding, data may include some errors derived from the
original source data, or errors introduced during the encoding
process.
It is better to intercept errors before they contaminate the GIS
database.
Data editing or cleaning can be done through four processes.

1- Detecting & Correcting errors


Errors in input data may derive from three main sources:

Errors in the source data
Errors introduced during encoding (inputting)
Errors propagated during data transfer and conversion

Errors in attribute data are relatively easy to spot and may be
identified using manual comparison with the original data.
Errors in spatial data are often more difficult to identify and correct.
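
One reason spatial errors are harder to find is that they usually need a geometric check rather than a visual comparison. The sketch below flags dangling nodes (line endpoints that no other line shares), a common symptom of undershoots and overshoots in digitized linework. The choice of error type, the function name, and the sample lines are assumptions, not taken from the slides.

```python
from collections import Counter


def find_dangling_nodes(lines, precision=3):
    """Return endpoints that are used by exactly one line end.

    lines: list of polylines, each a list of (x, y) tuples.
    Coordinates are rounded so that near-identical endpoints match.
    """
    counts = Counter()
    for line in lines:
        for x, y in (line[0], line[-1]):
            counts[(round(x, precision), round(y, precision))] += 1
    return sorted(node for node, count in counts.items() if count == 1)


# Hypothetical network: a closed triangle plus one spur that ends nowhere.
network = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(1.0, 0.0), (1.0, 1.0)],
    [(1.0, 1.0), (0.0, 0.0)],
    [(1.0, 0.0), (2.0, 0.0)],
]
print(find_dangling_nodes(network))
```

A flagged node is only a candidate error: a cul-de-sac or a stream source is a legitimate dangling end, so the results still need review.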

Figure 5.11

Examples of spatial error in vector data

Figure 5.12

Examples of original data problems and the corrected data after
processing

Source: Laser-Scan. Copyright 2005 LS 2003 Ltd. All rights reserved

Cont.
Most GIS packages provide a suite of editing tools.
Corrections can be done on-screen or automatically.
Errors are also present in raster data.

2- Re-projection, Transformation & Generalization
Once spatial and attribute data have been encoded and edited, it may
be necessary to process the data geometrically in order to provide a
common frame of reference.

The projection system

Different sources (co-ordinate systems):
Different origins
Different units of measurement
Different orientations

Scale & resolution
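
Differences in origin, units, and orientation between two plane coordinate systems can be reconciled with a similarity transformation (shift, scale, rotate). The sketch below assumes the parameters are already known; in practice they are estimated from control points. It is not a map re-projection, which requires a projection library. The function name and parameter values are hypothetical.

```python
import math


def similarity_transform(points, scale=1.0, rotation_deg=0.0, dx=0.0, dy=0.0):
    """Rotate (counter-clockwise), scale, then shift a list of (x, y) points."""
    theta = math.radians(rotation_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    transformed = []
    for x, y in points:
        new_x = scale * (cos_t * x - sin_t * y) + dx
        new_y = scale * (sin_t * x + cos_t * y) + dy
        transformed.append((new_x, new_y))
    return transformed


# Hypothetical case: source units are half the target units, the source
# axes are rotated 90 degrees, and the origins differ by (10, 20).
result = similarity_transform([(1.0, 0.0)], scale=2.0, rotation_deg=90.0,
                              dx=10.0, dy=20.0)
print([(round(x, 6), round(y, 6)) for x, y in result])
```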

Figure 5.15

Topological mismatch between data in different projections

Source: Courtesy of Peter H. Dana

3- Edge Matching & Rubber Sheeting
When a map extends across two or more map sheets, differences or
mismatches between adjacent map sheets may need to be resolved.
The process involves three basic steps:

First, mismatches at sheet boundaries must be resolved.
Second, topology must be rebuilt, as new lines and polygons have been
created from the segments that lie across map sheets.
Third, redundant map sheet boundary lines are deleted or dissolved.
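
The first step, resolving mismatches at the sheet boundary, can be sketched as pairing line endpoints from the two sheets that fall within a tolerance of each other and snapping each pair to a common position. Snapping to the midpoint and the greedy nearest-neighbour pairing are assumptions made for this illustration; GIS packages offer several strategies.

```python
import math


def edge_match(left_ends, right_ends, tolerance):
    """Pair endpoints across a sheet boundary and snap them to a midpoint.

    Returns a list of (left_index, right_index, snapped_xy) tuples.
    Endpoints with no partner inside the tolerance are left unmatched.
    """
    matches = []
    used = set()
    for i, (lx, ly) in enumerate(left_ends):
        best_j, best_d = None, tolerance
        for j, (rx, ry) in enumerate(right_ends):
            if j in used:
                continue
            d = math.hypot(lx - rx, ly - ry)
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            used.add(best_j)
            rx, ry = right_ends[best_j]
            matches.append((i, best_j, ((lx + rx) / 2, (ly + ry) / 2)))
    return matches


# Hypothetical endpoints near a sheet edge at x = 100.
left = [(100.0, 10.0), (100.0, 50.0)]
right = [(100.2, 10.4), (100.0, 80.0)]
print(edge_match(left, right, tolerance=1.0))
```

Unmatched endpoints are the cases that need manual inspection before topology is rebuilt.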

Figure 5.17

Edge matching

Cont.
Rubber sheeting involves stretching the map in various directions as
if it were drawn on a rubber sheet.
Objects on the map that are accurately placed are tacked down and
kept still.
Others that are in the wrong location or have the wrong shape are
stretched to fit with the control points.
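
A simple way to imitate this stretching is to move every point by a weighted average of the control-point displacements, with nearer control points counting more. The sketch below uses inverse distance weighting, which is a stand-in chosen for brevity; production software often uses piecewise (triangle-based) transformations instead. Control points that are "tacked down" are simply links whose source and target positions are equal.

```python
def rubber_sheet(point, control_links, power=2.0):
    """Shift a point by the inverse-distance-weighted control displacements.

    control_links: list of ((from_x, from_y), (to_x, to_y)) pairs.
    A link with identical from/to positions pins that location in place.
    """
    x, y = point
    weight_sum = shift_x = shift_y = 0.0
    for (fx, fy), (tx, ty) in control_links:
        dist_sq = (x - fx) ** 2 + (y - fy) ** 2
        if dist_sq == 0.0:
            return (tx, ty)  # exactly on a control point
        weight = 1.0 / dist_sq ** (power / 2.0)
        weight_sum += weight
        shift_x += weight * (tx - fx)
        shift_y += weight * (ty - fy)
    return (x + shift_x / weight_sum, y + shift_y / weight_sum)


# Hypothetical links: one corner pinned, one corner moved 2 units east.
links = [((0.0, 0.0), (0.0, 0.0)), ((10.0, 0.0), (12.0, 0.0))]
print(rubber_sheet((5.0, 0.0), links))
```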

Figure 5.18

Rubber sheeting

Geocoding address data


It is the process of converting an address into a point location.
The address itself, a postcode, or another non-geographic descriptor
is used to determine the geographical co-ordinates of a location.
Geocoding can be affected by the quality of the data.
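
At its simplest, geocoding is a lookup of a descriptor in a reference table (a gazetteer), and data quality shows up as failed or wrong matches. The sketch below assumes a postcode-to-coordinate table; the postcodes and coordinates are invented, and the normalisation step is only one of many clean-up rules a real geocoder would apply.

```python
def normalise(text):
    """Upper-case the descriptor and drop all whitespace before matching."""
    return "".join(text.upper().split())


def geocode(descriptor, gazetteer):
    """Return (x, y) for a descriptor, or None when there is no match."""
    return gazetteer.get(normalise(descriptor))


# Hypothetical gazetteer: made-up postcodes with made-up coordinates.
raw_table = {"AB1 2CD": (431200.0, 512300.0), "EF3 4GH": (432750.0, 510900.0)}
gazetteer = {normalise(code): xy for code, xy in raw_table.items()}

print(geocode("ab1  2cd", gazetteer))  # matches despite case and spacing
print(geocode("ZZ9 9ZZ", gazetteer))   # unknown postcode
```

Unmatched records are worth logging rather than discarding, since they often point to typing errors in the address data.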

4- Updating & Maintaining Spatial DBs
The world is a very dynamic place and things change.
Using old and out-of-date map information can cost time and money.
Keeping DBs up to date avoids problems and is a key aspect of
ongoing data editing and maintenance.

Towards an integrated DB
Each thematic layer in the DB must be encoded, corrected and
transformed to create a GIS ready for analysis.
