
Intro to advanced GIS and

a review of basic GIS

Topic 1
Outline

 About the class setting


 Materials to be covered and scheduled
 Quick review of GIS basics
 First lab (Lab 1)
Materials to be covered and
scheduled
 A review of basic GIS (1)
 Spatial data analysis
 Vector data analysis (2,3,4)
 Raster data analysis (5,6)
 Spatial interpolation (7,8)
 3-D analysis (12)
 Geoprocessing (9,10,11)
 Other topics (13)
We do not use one single book, because no single book covers
all the materials I will cover in the class.
1. I will assign many ESRI e-books for you to read.
2. There will be many papers for you to read.
3. I will give quizzes occasionally to see whether you have read them.
4. For other policies, refer to the syllabus.
What is GIS?

• A computer system for


- collecting,
- storing,
- manipulating,
- analyzing,
- displaying, and
- querying
geographically related
information.
In general, a GIS has 3
components

 Computer system
 Hardware
 Computer, plotter, printer, digitizer
 Software and appropriate procedures
 Spatially referenced or geographic
data
 People to carry out various
management and analysis tasks
Geographic Data

 Geospatial data tells
you where it is and
attribute data tells
you what it is.
Metadata describes
both geospatial and
attribute data.

In GIS, we call geographic data GIS data or spatial data


1. Geospatial data
Traditional method

 The traditional way to represent geographic data is
paper-based maps

 Geology map
 Topographic map
 City street map (we still use it a lot)
 ...
Characteristics of spatial data

 “Mappable” characteristics:
 Location (given in a coordinate system, covered in a
later lecture)
 Size, measured as the length, area, or
perimeter of the feature
 Shape, the geometric type (point, line, area)
of the feature
 Discrete or continuous
 Spatial relationships
Discrete and continuous

 Discrete data are distinct features that
have definite boundaries and identities
 A district, houses, towns, agricultural fields,
rivers, highways, …
 Continuous data have no defined borders
or distinct values; instead, values change
gradually from one location to another
 Temperature, precipitation, elevation, ...
GIS: a simplified view of the real
world
Discrete features
 Points
 Lines
 Areas
 Networks
 A series of interconnecting
lines
 Road network
 River network
 Sewage network
Continuous features
 Surfaces
 Elevation surface
 Temperature surface
Problems caused by the simplified
features still exist, but we can live with them

 Dynamic nature (not static)
 Forests grow
 River channels change
 Cities expand or decline
 Identification of discrete and continuous features
 Should a road be a line or an area?
 Scale
 Some features may not fit any feature type: fuzzy
boundaries
 Transition area between woodland and grassland
Let's not worry about these problems now! Just keep them in mind.
Points
 A point is a zero-dimensional object and
has only the property of location (x,y)

 Points can be used to model features
such as a well, building, power pole,
sample location, etc.

 Other names for a point are vertex and
node
Lines
 A line is a one-dimensional object
that has the property of length
 Lines can be used to represent roads,
streams, faults, dikes, marker beds,
boundaries, contacts, etc.
 Lines are also called edges, links,
chains, or arcs
 In an ArcInfo coverage an arc starts
with a node, has zero or more
vertices, and ends with a node
Areas (Polygons)
 A polygon is a two-dimensional object
with properties of area and perimeter

 A polygon can represent a city, geologic
formation, dike, lake, river, etc.

 Other names for a polygon are face and zone
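The properties named on the point, line, and polygon slides (location, length, area, perimeter) can be computed directly from vertex coordinates. The sketch below is only an illustration: the coordinates are made up, they are assumed to be planar (projected) units, and the function names are hypothetical rather than taken from any GIS package.

```python
import math


def line_length(vertices):
    """Length of a line given as an ordered list of (x, y) vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_perimeter(ring):
    """Perimeter of a polygon whose first vertex is not repeated at the end."""
    return line_length(ring + ring[:1])


def polygon_area(ring):
    """Area by the shoelace formula; abs() makes vertex order irrelevant."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical features in planar map units
well = (3.0, 4.0)                                   # point: location only
road = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]        # line: ordered vertices
field = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]  # polygon

print(line_length(road), polygon_perimeter(field), polygon_area(field))
```

For geographic (latitude/longitude) coordinates these planar formulas would not give ground distances, which is one reason the coordinate system matters.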
Topology needed
 A collection of numeric data which clearly describes
adjacency, containment (coincidence), and
connectivity between map features and which can
be stored and manipulated by a computer.

 A set of rules on how objects relate to each other

 Major difference in file formats

 Higher level objects have special topology rules


Topology

© Paul Bolstad, GIS Fundamentals


Two basic data models to
represent these features

 Raster spatial data model
 Defines space as an array of equally sized cells arranged in rows and
columns. Each cell contains an attribute value and location
coordinates
 Individual cells are the building blocks for creating images of point, line,
area, network, and surface features
 Continuous raster
 Numeric values change smoothly from one location to another, for
example, DEM, temperature, remote sensing images, etc.
 Discrete raster
 Relatively few possible values, which repeat in adjacent cells, for
example, land use, soil types, etc.
 Vector spatial data model
 Uses x-, y-coordinates to represent point, line, area, network, and
surface features
 A point is a single coordinate pair; lines and polygons are ordered lists of
vertices; attributes are associated with each feature
 Usually represents discrete features
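As a rough sketch of the two data models, the Python below stores small made-up examples of a continuous raster, a discrete raster, and three vector features. All values, attribute names, and the helper `distinct_values` are assumptions for illustration only.

```python
# Raster model: equally sized cells in rows and columns, one value per cell.
elevation = [            # continuous raster: values change smoothly
    [120.5, 121.0, 122.4],
    [119.8, 120.6, 121.9],
    [118.9, 119.7, 121.1],
]
land_use = [             # discrete raster: few values, repeated in adjacent cells
    ["forest", "forest", "crop"],
    ["forest", "crop", "crop"],
    ["water", "water", "crop"],
]


def distinct_values(grid):
    """Return the set of values that occur in a raster grid."""
    return {value for row in grid for value in row}


# Vector model: x, y coordinates, with attributes attached to each feature.
well = {"geometry": (2.5, 1.5), "attributes": {"kind": "well"}}
stream = {
    "geometry": [(0.0, 0.5), (1.0, 0.8), (3.0, 0.2)],   # ordered vertices
    "attributes": {"kind": "stream"},
}
parcel = {
    "geometry": [(0.0, 0.0), (3.0, 0.0), (3.0, 2.0), (0.0, 2.0)],
    "attributes": {"kind": "parcel"},
}

# The continuous raster has as many distinct values as cells;
# the discrete raster has only a few.
print(len(distinct_values(elevation)), len(distinct_values(land_use)))
```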
DIGITAL SPATIAL DATA
• RASTER

• VECTOR

• Real World

Source: Defense Mapping School


National Imagery and Mapping Agency
Raster and Vector Data Models

[Figure: a real-world scene with trees, a house, and a river, shown as a raster representation (10 × 10 grid of coded cells) and as a vector representation (features plotted on X and Y axes from 100 to 600)]
Source: Defense Mapping School
National Imagery and Mapping Agency
Example: Discrete raster
Example: continuous raster

Xie et al. 2005


Raster Real world Vector Heywood et al. 2006
Effects of changing resolution
Heywood et al. 2006
Vector – Advantages and
Disadvantages

 Advantages
 Good representation of reality
 Compact data structure
 Topology can be described in a network
 Accurate graphics
 Disadvantages
 Complex data structures
 Simulation may be difficult
 Some spatial analysis is difficult or impossible to
perform
Raster – Advantages and
Disadvantages

 Advantages
 Simple data structure
 Easy overlay
 Various kinds of spatial analysis
 Uniform size and shape
 Cheaper technology
 Disadvantages
 Large amount of data
 Less “pretty”
 Projection transformation is difficult
 Different scales between layers can be a nightmare
 May lose information due to generalization
GIS data formats (file formats)
Vector data
 Shapefiles
 Coverages
 TIN (e.g. elevation can be stored as
TIN)
 Triangulated Irregular Network
Raster data
 Grid (e.g. elevation can be stored as
Grid)
 Image (e.g. elevation can be stored as
image, all remote sensing images)
Shapefiles
 Nontopological
 Advantages: no overhead to process
topology
 Disadvantages: polygons are double
digitized, no topological data checking
 At least 3 files: .shp, .shx, .dbf
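Because a shapefile is really a set of files that share a base name, a quick check for the three required parts can catch an incomplete copy. The sketch below uses only the Python standard library; the function name and the `roads` file name are hypothetical, and the check is case-sensitive on file systems that distinguish case.

```python
import tempfile
from pathlib import Path

REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path):
    """Return the required extensions that do not exist next to shp_path."""
    path = Path(shp_path)
    return [ext for ext in REQUIRED_EXTENSIONS
            if not path.with_suffix(ext).exists()]


# Demonstration with empty placeholder files in a temporary folder
with tempfile.TemporaryDirectory() as folder:
    for ext in (".shp", ".dbf"):
        (Path(folder) / f"roads{ext}").touch()
    print(missing_shapefile_parts(Path(folder) / "roads.shp"))
```

In this demonstration the `.shx` file was never created, so it is the one reported as missing.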
Coverages
 Original ArcInfo Format
 Directory With Several Files
 Database Files are stored in the Info
Directory
 Uses Arc Node Topology
 Containment (coincident)
 Connectivity
 Adjacency
©Arthur J. Lembo
Cornell University
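To make connectivity and adjacency concrete, here is a toy arc-node table in Python. It assumes the commonly described arc-node layout in which each arc stores its from-node, to-node, left polygon, and right polygon; the arc, node, and polygon IDs are invented, and polygon 0 is used here to mean the area outside the map.

```python
from collections import defaultdict

# Each arc records its end nodes and the polygons on its left and right.
# Polygon 0 stands for the outside area (an assumption of this sketch).
arcs = {
    "a1": {"from": "n1", "to": "n2", "left": 0, "right": 101},
    "a2": {"from": "n2", "to": "n3", "left": 0, "right": 102},
    "a3": {"from": "n2", "to": "n4", "left": 101, "right": 102},
}

# Connectivity: arcs that meet at the same node are connected.
arcs_at_node = defaultdict(set)
for arc_id, arc in arcs.items():
    arcs_at_node[arc["from"]].add(arc_id)
    arcs_at_node[arc["to"]].add(arc_id)

# Adjacency: two polygons are adjacent if they lie on opposite sides of an arc.
adjacent = defaultdict(set)
for arc in arcs.values():
    left, right = arc["left"], arc["right"]
    if left != right:
        adjacent[left].add(right)
        adjacent[right].add(left)

print(sorted(arcs_at_node["n2"]))   # arcs connected at node n2
print(sorted(adjacent[101]))        # polygons adjacent to polygon 101
```

Containment is not shown; this sketch only derives the two relationships that fall out of the arc table directly.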
TIN
 A triangulated irregular network (TIN) is a data model that
is used to represent three dimensional objects. In this
case, x,y, and z values represent points. Using methods of
computational geometry, the points are connected into
what is called a triangulation, forming a network of
triangles. The lines of the triangles are called edges, and
the interior area is called a face, or facet.
 While the TIN model is somewhat more complex than the
simple point, line, and polygon vector model, or the raster
model, it is actually quite useful for representing elevations.
For example a raster grid would require grid cells to cover
the entire surface of a geographic area. Also, if we wanted
to show great detail we would have to have small grid cells.
Now, if the land area is relatively flat, we would still need
the small grid cells. However, with a TIN we would not
have to include so many points on the flat areas, but could
add more points on the steep areas where we want to show
greater detail.
 The illustration shows how we can create a TIN of the
terrain around Ithaca, NY.
 First, a series of elevation points are created
 Second, a TIN face is created with the elevation data
 Third, the faces are shaded in to give the impression of a 3D
surface
Components of a TIN
 Nodes
 Edges
 Triangles
 Hull
 Topology

©Arthur J. Lembo
Cornell University
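The components listed above can be sketched with plain Python lists. The node coordinates and elevations are made up, and the hull is taken here to be the set of edges that belong to only one triangle, which is an assumption of this sketch rather than a definition from the slides.

```python
from collections import Counter

# Nodes: points with x, y, and z values (hypothetical elevations)
nodes = [
    (0.0, 0.0, 100.0),    # node 0
    (10.0, 0.0, 110.0),   # node 1
    (10.0, 10.0, 130.0),  # node 2
    (0.0, 10.0, 105.0),   # node 3
]

# Triangles (faces): each one is three node indices
triangles = [(0, 1, 2), (0, 2, 3)]

# Edges: count how many triangles use each side
edge_count = Counter()
for a, b, c in triangles:
    for i, j in ((a, b), (b, c), (c, a)):
        edge_count[(min(i, j), max(i, j))] += 1

edges = sorted(edge_count)
# Hull: edges used by a single triangle lie on the outer boundary
hull_edges = sorted(e for e, n in edge_count.items() if n == 1)
# Topology: an edge used by two triangles means those triangles are neighbors
shared_edges = sorted(e for e, n in edge_count.items() if n == 2)

print(edges)
print(hull_edges)
print(shared_edges)
```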
Grid Properties
 Each Grid Cell holds one
value even if it is empty.
 A cell can hold an index
standing for an attribute.
 Cell resolution is given as its
size on the ground.
 Points and lines move to the
center of the cell.
 Minimum line width is one
cell.
 Rasters are easy to read
and write, and easy to draw
on the screen.
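The statement that points move to the center of the cell can be illustrated with a small calculation. This is a sketch under stated assumptions: the grid origin is its upper-left corner, rows count downward, and the function names and numbers are hypothetical.

```python
import math


def to_cell(x, y, x_min, y_max, cell_size):
    """Return (row, col) of the cell that contains map point (x, y)."""
    col = math.floor((x - x_min) / cell_size)
    row = math.floor((y_max - y) / cell_size)
    return row, col


def snapped_position(x, y, x_min, y_max, cell_size):
    """Return the center of the cell containing (x, y)."""
    row, col = to_cell(x, y, x_min, y_max, cell_size)
    return (x_min + (col + 0.5) * cell_size,
            y_max - (row + 0.5) * cell_size)


# A 30-unit grid whose upper-left corner is at (0, 300)
print(to_cell(47.0, 262.0, 0.0, 300.0, 30.0))
print(snapped_position(47.0, 262.0, 0.0, 300.0, 30.0))
print(snapped_position(59.0, 241.0, 0.0, 300.0, 30.0))
```

Both sample points fall in the same cell, so after conversion to raster they are indistinguishable, which is the loss of positional detail that cell resolution controls.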
A new data model in ArcGIS
 Geodatabase data model
 Uses a relational database that stores geographic
data
 A relational database is a type of database in which the data is organized across
several tables. Tables are associated with each other
through common fields. Data items can be recombined
from different files.
 A container for storing spatial and attribute data
and the relationships that exist among them
 Geographic features and their associated attributes can be structured
to work together as an integrated system using
rules, relationships, and topological associations
Geodatabase components-
vector data and table

 Primary (basic) components


- feature classes,
- feature datasets,
- nonspatial tables.
 Complex components
built on the basic
components:
- topology,
- relationship classes,
- geometric networks
Geodatabase components-
Raster data
 Raster data is only referenced in a personal geodatabase
 Raster data is physically stored in a multiuser geodatabase
 Raster datasets and raster catalogs
 A raster dataset is created from one or more individual rasters. When
creating a raster dataset from multiple rasters, the data is mosaicked,
or aggregated, into a single, seamless dataset in which areas of overlap
have been removed. The input rasters must be contiguous (adjacent)
and have the same properties, including the same coordinate system,
cell size, and data format. For each raster dataset (.img, grid, JPEG,
MrSID, TIFF), ArcGIS creates an ERDAS IMAGINE file (.img).
 A raster catalog is defined as a table in the geodatabase which you can
view like any other table in ArcCatalog. Each raster in the catalog is
represented by a row in the table. It contains a collection of rasters
that can be noncontiguous, stored in different formats, and have other
different properties. In order to view all the rasters in the catalog, they
must have the same coordinate system and a common geographic
extent
2. Attribute data
 Attribute data describes the “what” of
spatial data and is stored as a list or table of data
arranged as rows and columns
 Rows are records (map features)
 Each row represents a map feature, which has
a unique label ID or object ID
 Columns are fields (characteristics)
 Intersection of a column and a row shows
the values of attributes, such as color,
ownership, magnitude, classification,…
examples
A database needed

 If many fields are related to one record (feature ID), for example,
a soil unit can have over 80 estimated physical and
chemical properties, more tables are needed to store all the
attributes.
 A database management system (DBMS) is needed to manage
multiple tables.
 A database is a collection of interrelated tables in digital format.
There are four types:
 Flat file, hierarchical database, network database, relational
database
 In GIS, we usually use a relational database
[Figure: four database designs: flat file, hierarchical, network, and relational. Chang, 2004]
PIN: Parcel ID number
Zoning (zonecode): 1-residential, 2-commercial
Relational database

 A relational database is a collection of tables, also called
relations, which can be connected to each other by keys.
 A primary key represents one or more attributes whose values
can uniquely identify a record in a table. Its counterpart in
another table for the purpose of linkage is called a foreign key
 Advantages
 Each table in the database can be prepared, maintained, and
edited separately from other tables
 Efficient data management and processing, since linking tables
for query and/or analysis is often temporary
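Primary and foreign keys can be tried out with Python's built-in `sqlite3` module. The sketch below borrows the PIN and zonecode fields from the parcel example; the table names, the `acres` column, and all row values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE zone (
        zonecode INTEGER PRIMARY KEY,   -- primary key of the zone table
        zoning   TEXT
    );
    CREATE TABLE parcel (
        pin      TEXT PRIMARY KEY,      -- primary key of the parcel table
        acres    REAL,
        zonecode INTEGER REFERENCES zone(zonecode)  -- foreign key
    );
    INSERT INTO zone VALUES (1, 'residential'), (2, 'commercial');
    INSERT INTO parcel VALUES ('P101', 1.0, 1), ('P102', 3.0, 2), ('P103', 2.5, 2);
""")

# The link between the tables exists only for the duration of the query;
# each table is still stored and edited separately.
rows = conn.execute("""
    SELECT parcel.pin, zone.zoning
    FROM parcel JOIN zone ON parcel.zonecode = zone.zonecode
    ORDER BY parcel.pin
""").fetchall()
conn.close()

print(rows)
```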
Four tables linked by keys

Chang, 2004
Relationship of those separate
tables
 One-to-one: one record in one table is
related to one record in
another table

 One-to-many: one record in one table is
related to many records in
another table

 Many-to-one: many records in one table are
related to one record in
another table

 Many-to-many: many records in one table are
related to many records in
another table
Join and relate tables

 Once tables are separated as
relational tables, two operations
can be used to link those tables
during query and analysis
 Join brings together two tables
based on a common key.
 Relate connects two tables
(based on keys) but keeps the
tables separate.
 Keys do not have to have the same
name but must be of the same data
type
One-to-One Join
Employee-id   Job
1             Digislave
2             Useless Supervisor

Employee-id   Name
1             Tom
2             John

Join Employee-id to Employee-id

After join:
Employee-id   Job                  Name
1             Digislave            Tom
2             Useless Supervisor   John
Many-to-One Join
Polygon ID   Symbol
1            Qa
2            Qa
3            Pa
4            Qe

Symbol   Description
Qa       Quaternary Alluvium
Qe       Quaternary Eolian
Pa       Permian Abo

After join on Symbol:
Polygon ID   Symbol   Description
1            Qa       Quaternary Alluvium
2            Qa       Quaternary Alluvium
3            Pa       Permian Abo
4            Qe       Quaternary Eolian
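The many-to-one join on Symbol can be reproduced in a few lines of plain Python. This is only a sketch of the idea, not how ArcGIS performs a join; the field names `polygon_id`, `symbol`, and `description` are chosen here for illustration.

```python
# Polygon table: many polygons can share one symbol
polygons = [
    {"polygon_id": 1, "symbol": "Qa"},
    {"polygon_id": 2, "symbol": "Qa"},
    {"polygon_id": 3, "symbol": "Pa"},
    {"polygon_id": 4, "symbol": "Qe"},
]

# Lookup table: one description per symbol
descriptions = {
    "Qa": "Quaternary Alluvium",
    "Qe": "Quaternary Eolian",
    "Pa": "Permian Abo",
}

# Join: append the matching description to every polygon record.
# A symbol with no match would get None instead of raising an error.
joined = [
    {**row, "description": descriptions.get(row["symbol"])}
    for row in polygons
]

for row in joined:
    print(row["polygon_id"], row["symbol"], row["description"])
```

Because each polygon has at most one matching description, the joined table keeps exactly one row per polygon.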


One-to-Many Relates
Formation             Symbol
Quaternary Alluvium   Qa
Permian Abo           Pa

Symbol   Mineral
Qa       Quartz
Pa       Quartz
Qa       Gypsum
Pa       Feldspar

If the tables are related on Symbol, selecting
Polygon-id 1 will select the highlighted areas.
Many-to-Many Relates
Formation   Symbol
1           Qa
2           Qa

Symbol   Mineral
Qa       Quartz
Pa       Quartz
Qa       Gypsum
Pa       Feldspar

If the tables are related on Symbol, selecting
Polygon-id 1 will select the highlighted areas.
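A relate can be sketched as a lookup that leaves both tables untouched. The Python below uses the values from the many-to-many example and treats the first column of the left table (1, 2) as the polygon ID, which is an assumption; the function name is hypothetical.

```python
# Spatial table: polygon ID -> symbol (the tables stay separate)
polygons = {1: "Qa", 2: "Qa"}

# Nonspatial table: one row per (symbol, mineral) pair
minerals = [
    ("Qa", "Quartz"),
    ("Pa", "Quartz"),
    ("Qa", "Gypsum"),
    ("Pa", "Feldspar"),
]


def related_minerals(polygon_id):
    """Return the mineral records related to one polygon through Symbol."""
    symbol = polygons[polygon_id]
    return [mineral for sym, mineral in minerals if sym == symbol]


print(related_minerals(1))
```

Selecting polygon 1 returns more than one record, which is why this case is handled with a relate instead of a join.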
Tables in ArcGIS
 Among those separate tables, one and only one is the
spatial table (or layer attribute table), which carries the spatial
location and is linked to the spatial data. The other tables are
nonspatial tables, which can be either joined or related to
the spatial table.
 Join tables when each record in the spatial table has no more
than one matching record in the nonspatial table
 One-to-one relation

 Many-to-one relation

 Relate tables when each record in the spatial table has more
than one matching record in the nonspatial table
 One-to-many relation

 Many-to-many relation
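The join-or-relate rule above amounts to counting matches per key. The following sketch applies that rule to lists of key values; the function name is hypothetical and the sample keys come from the Symbol examples.

```python
from collections import Counter


def join_or_relate(spatial_keys, nonspatial_keys):
    """Suggest 'join' if no spatial record has more than one match, else 'relate'."""
    matches = Counter(nonspatial_keys)
    most = max((matches[key] for key in spatial_keys), default=0)
    return "join" if most <= 1 else "relate"


# Many polygons, one description per symbol: many-to-one
print(join_or_relate(["Qa", "Qa", "Pa", "Qe"], ["Qa", "Qe", "Pa"]))

# Several mineral rows per symbol: one-to-many or many-to-many
print(join_or_relate(["Qa", "Qa"], ["Qa", "Pa", "Qa", "Pa"]))
```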


The joined table

A join is only preserved within the map
document; the tables remain separate on disk, and the join can be
removed at any time
Related tables

A relate is only preserved within the map document; the
tables remain separate on disk, and the relate can be removed at any time
3. Metadata

 Meta is defined as a change or transformation. Data is
described as the factual information used as a basis for
reasoning. Put these two definitions together and
metadata would literally mean "factual information used as
a basis for reasoning which describes a change or
transformation."
 In GIS, Metadata is data about the data. It consists of
information that describes spatial data and is used to
provide documentation for data products. Metadata is the
who, what, when, where, why, and how about every
facet of the spatial data.
 According to the Federal Geographic Data Committee
(FGDC), metadata is data about the content, quality,
condition, and other characteristics of data.
Why use and create metadata
 To help organize and maintain an organization's spatial
data
- Employees may come and go but metadata can
catalogue the changes and updates made to each spatial
data set and how each employee implemented them
 To provide information to other organizations and
clearinghouses to facilitate data sharing and transfer
- It makes sense to share existing data sets rather
than producing new ones if they are already available
 To document the history of a spatial data set
- Metadata documents what changes have been made
to each data set, such as changes in geographic projection,
adding or deleting attributes, editing line intersections, or
changing file formats. All of these could have an effect on
data quality.
Metadata Should Include Data
about
 Date the data were collected
 Date the coverage was generated
 Bounding coordinates
 Processing steps
 Software used
 RMSE, etc.
 Source of the original data
 Who did the processing
 Projection
 Coordinate system
 Datum
 Units
 Spatial scale
 Attribute definitions
 Who to contact for more information

See an example of non-standard metadata


Federal Geographic Data Committee’s
(FGDC) Content Standard for Digital
Geospatial Metadata (CSDGM)

 The FGDC is developing the National Spatial Data
Infrastructure (NSDI) in cooperation with organizations
from State, local and tribal governments, the academic
community, and the private sector. The NSDI
encompasses policies, standards, and procedures for
organizations to cooperatively produce and
share geographic data.

 The objective of the CSDGM is to provide a common
set of terminology and definitions for the documentation
of digital geospatial data.
CSDGM (FGDC-STD-001-1998)
 Metadata =
 Identification_Information
 Data_Quality_Information
 Spatial_Data_Organization_Information
 Spatial_Reference_Information
 Entity_and_Attribute_Information
 Distribution_Information
 Metadata_Reference_Information
Connect to http://www.fgdc.gov/metadata/csdgm/
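As a loose illustration of the seven sections listed above, the sketch below stores a metadata record as a Python dictionary and reports which sections are still empty. It is not a compliance check (tools such as mp do that); the record contents and the inner field names are placeholders invented for this example.

```python
CSDGM_SECTIONS = [
    "Identification_Information",
    "Data_Quality_Information",
    "Spatial_Data_Organization_Information",
    "Spatial_Reference_Information",
    "Entity_and_Attribute_Information",
    "Distribution_Information",
    "Metadata_Reference_Information",
]


def missing_sections(record):
    """Return the section names that are absent or empty in a metadata record."""
    return [name for name in CSDGM_SECTIONS if not record.get(name)]


# Hypothetical, partly filled record; inner field names are placeholders
record = {
    "Identification_Information": {"Title": "Hypothetical soils layer"},
    "Spatial_Reference_Information": {"Datum": "NAD83"},
    "Metadata_Reference_Information": {"Contact": "GIS lab"},
}

for name in missing_sections(record):
    print("missing:", name)
```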
Metadata tools
 Metadata editors:
- tkme / USGS
- ArcCatalog / ESRI
- SMMS / Intergraph
- FGDCMETA / Illinois State Geological Survey
- xtme / USGS
 Metadata utilities (check compliance and export to text,
HTML,XML, or SGML):
- mp / USGS mp: Metadata Parser
- MP batch / Intergraph
- ArcCatalog powered by mp/ ESRI
 Metadata Server
- Isite / FGDC
- GeoConnect Geodata Management Server / Intergraph
- ArcIMS Metadata Server / ESRI
FGDC Clearinghouse
 The FGDC developed a clearinghouse
that allows geospatial data creators to
share their data
 However, the FGDC Clearinghouse is
not a data repository. The data
contained within the clearinghouse is
actually stored on computer servers
maintained by individual contributors.
This allows contributors to manage their
own data.
Two Components
 The FGDC Clearinghouse consists of 6
gateways and 250 nodes
 A gateway is a point of entry into the
FGDC Clearinghouse
 A clearinghouse node is a database
that contains metadata records.
Individual contributors maintain nodes
 Besides the FGDC Clearinghouse,
there are a variety of other
communities that use FGDC-compliant
metadata as the basis of their data
sharing services. These so-called
clearinghouse communities are often
developed because the participating
organizations have data of similar or
complementary types.
http://clearinghouse1.fgdc.gov/
4. Geodatabase

 Before the geodatabase, the many
GIS files (spatial data and nonspatial data) in one GIS project were
stored separately, so a large GIS project
could have hundreds of files.
 With a geodatabase, all GIS files (spatial data
and nonspatial data) in a project can be stored
in one geodatabase, using a relational
database management system (RDBMS)
Types of geodatabases

 personal
 enterprise
Personal Geodatabase

 A personal geodatabase
is stored as a file named
filename.mdb that can be
browsed and edited in ArcGIS, and it can also
be opened with Microsoft
Access. It can be read by
multiple people at the same
time, but edited by only
one person at a time.
The maximum size is 2 GB.
Multiuser Geodatabase

 Multiuser (ArcSDE or enterprise) geodatabases
are stored in IBM DB2, Informix, Oracle, or
Microsoft SQL Server.

 They can be edited through ArcSDE by many users
at the same time and are suitable for large
workgroups and enterprise GIS
implementations. There is no size limit, and raster
data is supported.
A 3-tier ArcSDE client/server architecture, with both
ArcSDE and the Oracle RDBMS running on the
same server, minimizes network traffic
and client load while increasing the server load,
compared to a 2-tier system, in which the clients
connect directly to the RDBMS
Personal and Multiuser
Geodatabase Comparison

source: www.esri.com
5. Lab 1

 Getting Started With the Geodatabase
About 2 hours
About 1 hour
Copy the result map of your last step to your homework
Copy your exam questions and results to your homework
