Understanding Topology and Shapefiles
When asked if topology is a key concept of GIS, most GIS users will nod their heads in agreement. But ask
these same folks about how topology is handled in shapefiles and the nodding heads give way to shrugging
shoulders. Why should GIS users care about topology? What are the advantages and disadvantages of
storing polygon data in shapefiles rather than coverages?
What Is Topology?
In 1736, the mathematician Leonhard Euler published a paper that arguably started the branch of
mathematics known as topology. The problem that led to Euler's work in this area, known as "The Seven
Bridges of Königsberg," is described in the accompanying article "Conundrum Inspires Topology." More
recently, the United States Census Bureau, while preparing for the 1970 census, pioneered the application
of mathematical topology to maps to reduce the errors in tabulating massive amounts of census data.
Today, topology in GIS is generally defined as the spatial relationships between adjacent or neighboring
features.
Mathematical topology assumes that geographic features occur on a two-dimensional plane. Through planar
enforcement, spatial features can be represented through nodes (0-dimensional cells); edges, sometimes
called arcs (one-dimensional cells); or polygons (two-dimensional cells). Because features can exist only on
a plane, lines that cross are broken into separate lines that terminate at nodes representing intersections
rather than simple vertices.
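The line-breaking step described above can be sketched in a few lines of Python. This is only an illustration, not the method used by any particular GIS package: the function name split_at_crossing is hypothetical, and the sketch assumes integer coordinates so that exact rational arithmetic can be used.

```python
from fractions import Fraction

def split_at_crossing(seg_a, seg_b):
    """Return (node, pieces) if two segments cross at interior points, else None.

    Each segment is ((x1, y1), (x2, y2)) with integer coordinates (an assumption
    of this sketch, so that the intersection can be computed exactly).
    """
    (x1, y1), (x2, y2) = seg_a
    (x3, y3), (x4, y4) = seg_b
    denom = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if denom == 0:
        return None  # parallel or collinear: no single crossing point
    # Parameters of the crossing point along seg_a (t) and seg_b (u).
    t = Fraction((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3), denom)
    u = Fraction((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1), denom)
    if not (0 < t < 1 and 0 < u < 1):
        return None  # the segments do not cross in their interiors
    node = (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    # Two crossing lines become four lines that all terminate at the new node.
    pieces = [(seg_a[0], node), (node, seg_a[1]), (seg_b[0], node), (node, seg_b[1])]
    return node, pieces
```

Calling split_at_crossing(((0, 0), (4, 4)), ((0, 4), (4, 0))) yields a node at (2, 2) and four line pieces, which is the situation planar enforcement produces when two digitized lines cross.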
In GIS, topology is implemented through data structure. An ArcInfo coverage is a familiar topological data
structure. A coverage explicitly stores topological relationships among neighboring polygons in the Arc
Attribute Table (AAT) by storing the adjacent polygon IDs in the LPoly and RPoly fields. Adjacent lines are
connected through nodes, and this information is stored in the arc-node table. The ArcInfo commands,
CLEAN and BUILD, enforce planar topology on data and update topology tables.
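To make the idea of stored adjacency concrete, here is a minimal Python sketch of an arc table with left and right polygon IDs. It is not the actual AAT file layout: the Arc record, the polygon_neighbors function, the sample arcs, and the use of 0 for the area outside all polygons are assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple

class Arc(NamedTuple):
    arc_id: int
    from_node: int
    to_node: int
    lpoly: int  # polygon on the left when walking from from_node to to_node
    rpoly: int  # polygon on the right

def polygon_neighbors(arcs):
    """Build polygon adjacency purely by table lookup, with no geometry involved."""
    adjacency = defaultdict(set)
    for arc in arcs:
        if arc.lpoly != arc.rpoly:
            adjacency[arc.lpoly].add(arc.rpoly)
            adjacency[arc.rpoly].add(arc.lpoly)
    return adjacency

# Hypothetical data: two polygons (1 and 2) that share arc 1; 0 is the outside area.
arcs = [
    Arc(arc_id=1, from_node=1, to_node=2, lpoly=1, rpoly=2),  # shared boundary
    Arc(arc_id=2, from_node=2, to_node=1, lpoly=1, rpoly=0),  # rest of polygon 1
    Arc(arc_id=3, from_node=1, to_node=2, lpoly=2, rpoly=0),  # rest of polygon 2
]
neighbors = polygon_neighbors(arcs)
```

Because each shared boundary is stored once with both polygon IDs attached, finding the neighbors of polygon 1 is a matter of reading neighbors[1] rather than computing any intersections.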
Over the past two or three decades, the general consensus in the GIS community had been that topological
data structures are advantageous because they provide an automated way to handle digitizing and editing
errors and artifacts; reduce data storage for polygons because boundaries between adjacent polygons are
stored only once; and enable advanced spatial analyses such as adjacency, connectivity, and containment.
Another important consequence of planar enforcement is that a map that has topology contains space-filling,
nonoverlapping polygons. Consequently, so-called cartographic (i.e., nontopological) data structures are no
longer used by mainstream GIS software.
Enter Shapefiles
Shapefiles were introduced with the release of ArcView 2 in the early 1990s. A shapefile is a nontopological
data structure that does not explicitly store topological relationships. However, unlike other simple graphic
data structures, shapefile polygons are represented by one or more rings. A ring is a closed,
non-self-intersecting loop. This structure can represent complex structures, such as polygons that contain
"islands." The vertices of a ring maintain a consistent, clockwise order so that the area to the right, as one
"walks" along the ring boundary, is inside the polygon, and the area to the left is outside the polygon.
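Ring direction can be checked with the shoelace (signed area) formula. The Python sketch below is illustrative only; the function names are hypothetical, and it assumes each ring is a list of (x, y) tuples whose first and last vertices are identical. In the published shapefile specification, rings that define holes run counterclockwise, which keeps the polygon interior on the right-hand side there as well.

```python
def signed_area(ring):
    """Shoelace formula: positive for counterclockwise rings, negative for clockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0

def is_clockwise(ring):
    return signed_area(ring) < 0

# A unit square walked up, right, down, and back: clockwise, so the interior is on the right.
outer = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
# The same vertices in reverse order run counterclockwise, as a hole ring would.
hole_like = list(reversed(outer))
```

A ring that fails the expected test can be corrected simply by reversing its vertex list.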
Moreover, polygon features in shapefile format can contain one or more parts, so that disjunct and
overlapping features can be represented. For example, an individual parcel that is split by a road can be
represented alternatively as two separate polygons with two rings and two records in the
attribute table or as one polygon with two parts and one record in the attribute table. A
source of confusion for some users is that some ArcView GIS commands can result in
spatially disjunct, multipart features.
A primary advantage of shapefiles is that this simple file structure draws faster than a
coverage does. This may be why the shapefile data structure was developed for ArcView
GIS, a software program that was originally designed for data viewing rather than analysis.
In addition, shapefiles can easily be copied and do not require importing or exporting as
do .e00 format files. The shapefile specification is readily available, and a number of other
software packages support it. These reasons have contributed to the emergence of the
shapefile as a leading GIS data transfer standard. However, these advantages do not fully
explain the resurgence of a nontopological data structure.
Topological Digitizing and Editing
One of the primary reasons topology was developed was to provide a rigorous, automated method to clean
up data entry errors and verify data. The typical digitizing procedure is to digitize all lines, build topology, and
label polygons and then clean up slivers, dangles, and under- and overshoots and build topology again,
repeating the clean and build phases as many times as necessary.
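As one example of what the cleanup phase looks for, a dangle can be recognized as a node where only one line end terminates. The Python sketch below is a simplified illustration, not the ArcInfo CLEAN algorithm; the function name and sample lines are hypothetical, and it assumes lines have already been split at intersections so that connected lines share exact endpoint coordinates.

```python
from collections import Counter

def dangling_nodes(lines):
    """Return endpoints that are used by exactly one line end."""
    degree = Counter()
    for line in lines:
        degree[line[0]] += 1   # start node
        degree[line[-1]] += 1  # end node
    return sorted(node for node, count in degree.items() if count == 1)

# Hypothetical digitized lines: a closed triangle plus one overshoot ending at (2, 0).
lines = [
    [(0, 0), (1, 0)],
    [(1, 0), (1, 1)],
    [(1, 1), (0, 0)],
    [(1, 0), (2, 0)],
]
dangles = dangling_nodes(lines)
```

Here only the free end of the overshoot is reported; the three triangle nodes each have two or more line ends attached.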
What if the process did not start with the tangled mess of "cartographic spaghetti"? By approaching digitizing
from a feature-centric perspective and enforcing planar topology when each feature boundary is digitized
and labeled, sliver polygons, dangling nodes, missing labels, and multilabeled features would be eliminated.
To be fair, computer hardware was not always powerful enough to support a feature-centric digitizing
approach that requires on-the-fly calculation of geometric intersections (though a WYSIWYG digitizing
approach was developed as far back as 1987).
Today's computers are powerful enough to support feature-centric digitizing for most GIS users. ArcView
GIS supports such feature-centric digitizing through the Append Polygon, Split Polygon, and Split Line tools.
With these tools users can add a polygon (or line) adjacent to an existing polygon and have boundaries
match perfectly. ArcView GIS also supports topological editing of shared boundaries or nodes through the
manipulation of vertices.
File Sizes No Longer an Issue
A second oft-cited advantage of topological data structures is smaller file sizes because shared vertices of
adjacent polygons are not stored twice. Theoretically these files should be up to half the size of
nontopological files. In practice, however, shapefiles are rarely twice as large as the same data stored in
coverages, in part because coverages require additional files to store the topological information. Attribute
tables are often a large proportion of the overall file size but are the same size regardless of how feature
geometry is stored. Moreover, although storage was often an important consideration in the past, the current
low cost of storage means that for most GIS users storage space is not a constraint.
Finding Adjacent Features
Perhaps the most pervasive misunderstanding about shapefiles is that because topology is not explicitly
stored, adjacent features cannot be found. However, adjacent features can easily be found by intersecting
target polygons with other polygons in the same map and identifying the points of intersection of polygons
that touch boundaries or overlap. The geometric intersections of adjacent features are calculated on the fly
by comparing the vertices of adjacent features rather than looking up adjacent features in a table.
For example, to find all the neighboring parcels of a parcel, select the parcel, choose Theme > Select by
Theme from the View menu and choose "intersect" from the drop-down box and click on New Set to select
all the parcels immediately adjacent to the originally selected parcel. More complex adjacency analyses can
be accomplished by combining the selection by theme with a query for specific attributes such as identifying
only the residential parcels adjacent to an industrial parcel. While some of the more complex adjacencies
that involve direction (e.g., find the adjacent parcels to the east of a given road) are much more difficult to
accomplish without stored topology, these analyses are not frequent nor are they a make-it-or-break-it
requirement for typical users.
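The combination of a spatial selection with an attribute query can be sketched outside of ArcView as well. The Python example below is a simplified illustration rather than a reproduction of the Select by Theme tool: the Parcel record, the function names, and the sample parcels are hypothetical, and each parcel is reduced to an axis-aligned rectangle so that the on-the-fly intersection test stays short.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Parcel:
    parcel_id: str
    land_use: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float

def touches_or_overlaps(a, b):
    """True when two rectangles share at least one boundary point."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax and
            a.ymin <= b.ymax and b.ymin <= a.ymax)

def select_adjacent(target, parcels, land_use=None):
    """Spatial selection, optionally narrowed by an attribute query."""
    return [p for p in parcels
            if p is not target
            and touches_or_overlaps(target, p)
            and (land_use is None or p.land_use == land_use)]

# Hypothetical parcels: one industrial parcel and three others.
industrial = Parcel("I1", "industrial", 0, 0, 2, 2)
parcels = [
    industrial,
    Parcel("R1", "residential", 2, 0, 4, 2),  # shares the east boundary of I1
    Parcel("C1", "commercial", 0, 2, 2, 4),   # shares the north boundary of I1
    Parcel("R2", "residential", 5, 0, 7, 2),  # not adjacent to I1
]
all_neighbors = select_adjacent(industrial, parcels)
residential_neighbors = select_adjacent(industrial, parcels, land_use="residential")
```

No adjacency table is consulted at any point; the neighbors are found by testing geometry at query time, which is the approach the text describes for shapefiles.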
Computing Adjacency Lists
Although analytical operations that require adjacency information can be performed in ArcView GIS through
the interface, performance requirements may necessitate building a table to store adjacency information.
Two algorithms for building lists of adjacent features, described here, could be incorporated in an Avenue
script. Although the representation of topological spatial relationships traditionally has been restricted to
exactly adjacent neighbors, this restriction can be relaxed to find nearby features as well. The notion of
adjacency can be extended to include features that are within some distance (D) rather than exactly
adjacent (D = 0). One advantage of computing adjacency lists is that adjacency can be defined in relation to
the spatial precision of the coordinates, making analysis less sensitive to sliver polygons. These algorithms
can find adjacency for polylines and polygons. If D > 0, adjacent points can also be identified.
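A distance-tolerant adjacency test might look like the following Python sketch. It is a deliberate simplification and not the author's Avenue code: the function name is hypothetical, features are plain lists of (x, y) vertices, and only vertex-to-vertex distances are compared, so features that come close only along the interior of an edge would be missed.

```python
from math import hypot

def is_adjacent(feature_a, feature_b, d=0.0):
    """True if any vertex of one feature lies within distance d of a vertex of the other."""
    return any(hypot(ax - bx, ay - by) <= d
               for ax, ay in feature_a
               for bx, by in feature_b)

# Hypothetical features: b shares two vertices with a; c is separated from a by a thin sliver.
a = [(0, 0), (1, 0), (1, 1), (0, 1)]
b = [(1, 0), (2, 0), (2, 1), (1, 1)]
c = [(1.001, 0), (2, 0), (2, 1), (1.001, 1)]
```

With D = 0, a and b are adjacent but a and c are not; raising D to 0.01 makes a and c adjacent, which shows how a tolerance tied to coordinate precision absorbs slivers. Because a point is just a one-vertex feature, the same test works for points when D > 0.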
One algorithm creates an adjacency list using the so-called "brute-force" approach. This simple algorithm
tests every pair of features for intersection and stores the index values of those that are adjacent. The
time required for this algorithm is proportional to the square of the number of features (N), or order O(N^2).
However, because adjacency is a symmetric spatial relation (if A is adjacent to B, then B is adjacent to A),
the brute-force algorithm can be modified to store the index for the reciprocal feature as well, so that each
pair is tested only once. The modified algorithm requires N(N-1)/2 tests, roughly half as many. A second
algorithm uses a "divide and conquer" approach to recursively subdivide features into smaller and smaller
groups. The symmetric brute-force approach is applied to these smaller groups. Because the number of
features in each group is smaller than N, the overall number of intersection tests between two features is
reduced considerably.
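Both approaches can be sketched in Python. This is one possible reading of the description, not the author's implementation: the function names are hypothetical, each feature is represented only by its bounding box (xmin, ymin, xmax, ymax), and the subdivision rule (split at the median box center, alternate axes, keep boxes that straddle the split in both groups) is an assumption.

```python
from itertools import combinations

def boxes_touch(a, b):
    """Stand-in intersection test: bounding boxes touch or overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def brute_force(boxes, ids=None, adjacency=None):
    """Test each pair once and store the result for both features."""
    ids = list(range(len(boxes))) if ids is None else ids
    adjacency = {i: set() for i in ids} if adjacency is None else adjacency
    tests = 0
    for i, j in combinations(ids, 2):
        tests += 1
        if boxes_touch(boxes[i], boxes[j]):
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency, tests

def divide_and_conquer(boxes, ids=None, adjacency=None, axis=0, leaf_size=4):
    """Recursively split features into groups, then brute-force each small group."""
    if ids is None:
        ids = list(range(len(boxes)))
        adjacency = {i: set() for i in ids}
    if len(ids) <= leaf_size:
        return brute_force(boxes, ids, adjacency)
    centers = sorted((boxes[i][axis] + boxes[i][axis + 2]) / 2 for i in ids)
    split = centers[len(centers) // 2]
    # A box that straddles the split line is placed in both groups.
    low = [i for i in ids if boxes[i][axis] <= split]
    high = [i for i in ids if boxes[i][axis + 2] >= split]
    if len(low) == len(ids) or len(high) == len(ids):
        return brute_force(boxes, ids, adjacency)  # the split did not shrink the group
    tests = 0
    for group in (low, high):
        _, group_tests = divide_and_conquer(boxes, group, adjacency, 1 - axis, leaf_size)
        tests += group_tests
    return adjacency, tests

# Hypothetical data: a 6 x 6 grid of unit squares (N = 36).
grid = [(x, y, x + 1, y + 1) for x in range(6) for y in range(6)]
bf_adjacency, bf_tests = brute_force(grid)
dc_adjacency, dc_tests = divide_and_conquer(grid)
```

For the grid, the symmetric brute-force pass performs N(N-1)/2 = 630 tests, while the recursive version produces the same adjacency lists with considerably fewer. Some pairs near a split line are tested more than once, which is the price of keeping straddling features in both groups.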
Checking Topology in Shapefiles
A planar-enforced shapefile can be created as described above or derived from a coverage. However, if
nontopological editing methods are used, a shapefile can lose its planar topology during editing. Planar
topology can be enforced on shapefiles with the assistance of some Avenue scripts. The logic of the
algorithm is described here, and the scripts can be downloaded from ArcUser Online or the author's Web
site.
The first step in enforcing planar topology in a shapefile is to remove twisted or self-intersecting polygon
rings and to ensure that the "inside" of the polygon is on the correct side of the polygon boundary. Next,
gaps are identified by creating a rectangle that encompasses all the polygons of interest and serves as a
backdrop. The polygons are subtracted from the rectangle containing all polygons. The remaining areas are
gaps. A gap polygon is removed by merging it into an adjacent polygon or by making it a legitimate polygon.
Overlaps are found by intersecting each polygon with all other polygons. If an intersection is found, then the
polygon representing the overlap is created. Overlaps can be removed by deleting the overlapping area
from one of the involved polygons. Once boundary changes have been made, the area and perimeter of
each polygon should be recalculated.
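The gap and overlap checks can be illustrated with a small Python sketch. It is not the downloadable Avenue script: the function names and sample data are hypothetical, and polygons are restricted to axis-aligned rectangles (xmin, ymin, xmax, ymax) so that subtraction from the backdrop reduces to checking cells of a grid built from the rectangle edges.

```python
from itertools import combinations

def overlap(a, b):
    """Return the rectangle shared by a and b, or None if they share no area."""
    r = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return r if r[0] < r[2] and r[1] < r[3] else None

def find_overlaps(rects):
    """Intersect each polygon with all others and keep the overlap polygons."""
    found = {}
    for (i, a), (j, b) in combinations(enumerate(rects), 2):
        shared = overlap(a, b)
        if shared:
            found[(i, j)] = shared
    return found

def find_gaps(rects):
    """Subtract the polygons from their enclosing backdrop rectangle.

    The backdrop is cut into cells along every rectangle edge; any cell that no
    rectangle covers is reported as a gap (neighboring gap cells are not merged).
    """
    xs = sorted({x for r in rects for x in (r[0], r[2])})
    ys = sorted({y for r in rects for y in (r[1], r[3])})
    gaps = []
    for x0, x1 in zip(xs, xs[1:]):
        for y0, y1 in zip(ys, ys[1:]):
            covered = any(r[0] <= x0 and x1 <= r[2] and r[1] <= y0 and y1 <= r[3]
                          for r in rects)
            if not covered:
                gaps.append((x0, y0, x1, y1))
    return gaps

# Hypothetical layer: the fourth rectangle overlaps the second and leaves a gap beside the third.
layer = [(0, 0, 2, 2), (2, 0, 4, 2), (0, 2, 2, 4), (3, 1, 4, 4)]
overlaps = find_overlaps(layer)
gaps = find_gaps(layer)
```

In this layer the check reports one overlap, (3, 1, 4, 2), between the second and fourth rectangles, and one gap, (2, 2, 3, 4). Real polygon data would need a general polygon clipping routine in place of the rectangle arithmetic.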
Conclusion
The standard notion of topology in GIS centers around explicit representation of adjacent spatial relations
and involves planar enforcement of geographic features. Although shapefiles do not explicitly store spatial
relations, they can conform to planar enforcement. If, during map production or editing, planar enforcement
is violated, then statistical summations that assume space-filling polygons could be inaccurate.
Although this may be heresy to many users, there are advantages to using shapefiles that violate planar
assumptions (i.e., shapefiles that have overlaps and/or gaps). Many useful analyses do not require data with
precise planar topology, but these analyses are never conducted because it is assumed that base data must
have topology. For instance, city and county governments find it extremely time-consuming and difficult to
build parcel coverages because parcel boundary descriptions rarely match cleanly with adjacent parcels.
Resolving boundary disputes is a very time-consuming process, often fraught with complicated legal issues.
However, a standard query of parcel data performed with reasonably coincident boundaries (i.e., submeter
accuracy) can be used to find landowners within a certain distance of a given location for notification
purposes.
Though the advantages previously attributed to topological data structures have become less clear, in large
part because of improvements in computer performance, the bottom line is that GIS users need to
adequately understand the data structures and use them appropriately.
For more information, visit the author's Web site at www.ndis.nrel.colostate.edu/davet.
About the Author
David Theobald is a scientist at the Natural Resource Ecology Lab at Colorado State University and is the
author of the book, GIS Concepts and ArcView Methods, which is available from the GIS Store.
Further Reading
Cooke, Donald F., and William H. Maxfield. "The Development of a Geographic Base
File and Its Use for Mapping," Proceedings of the Fifth Annual Conference of the
Urban and Regional Information Systems Association, pp. 207-218, 1967.
Corbett, James P. Topological Principles in Cartography, Technical Paper 48, United
States Department of Commerce, Bureau of the Census; Washington, D.C., 1979.
ESRI Shapefile Technical Description White Paper, Environmental Systems Research
Institute, Inc.; Redlands, CA, 1998.
Morehouse, Scott. "The ARC/INFO Geographic Information System," Computers and
Geosciences, Vol. 18, No. 4, pp. 435-443, 1992.
Peucker, T.K., and N. Chrisman. "Cartographic Data Structures," The American
Cartographer, Vol. 2, No. 1, pp. 55-69, 1975.
Reed, Carl. "GIS Users Shouldn't Forget About Topology," GeoWorld, Vol. 12, No. 4,
p. 12, April 1999.
Strand, Eric J. "Shapefiles Shape GIS Data Transfer Standards," GIS World, Vol. 11,
No. 5, p. 28, May 1998.