HANDBOOKS OF LOGIC
IN COMPUTER SCIENCE
and
ARTIFICIAL INTELLIGENCE AND LOGIC PROGRAMMING

Editors
S. Abramsky, Dov M. Gabbay, T. S. E. Maibaum

Executive Editor
Dov M. Gabbay

Administrator
Jane Spurr
Logical foundations
Deduction methodologies
Nonmonotonic reasoning and uncertain reasoning
Epistemic and temporal reasoning
Logic programming
Handbook of Logic
in Computer Science
Volume 5
Logic and Algebraic Methods
Edited by
S. ABRAMSKY
Christopher Stanley Professor of Computing
University of Oxford
DOV M. GABBAY
Augustus De Morgan Professor of Logic
King's College, London
and
T. S. E. MAIBAUM
Professor of the Foundations of Software Engineering
King's College, London
Volume Co-ordinator
DOV M. GABBAY
OXFORD
UNIVERSITY PRESS
Preface
We are happy to present Volume 5 of the Handbook of Logic in Computer Science, on Logic and Algebraic Methods. The first two volumes
of the Handbook presented the background on fundamental mathematical
structures: consequence relations, model theory, recursion theory, category
theory, universal algebra, topology; and on computational structures:
term-rewriting systems, λ-calculi, modal and temporal logics and algorithmic proof systems. The computational structures considered thus far have
been predominantly syntactic in character, while the discussion of mathematical structures has been quite general and free-standing.
In Volumes 3 and 4 these threads were drawn together. We looked at
how mathematical structures are used to model computational processes.
In Volume 3, the focus is on general approaches: Domain theory, denotational and algebraic semantics, and the semantics of types. In Volume
4, some more specific topics are considered, and the emphasis shifts from
structures to the actual modelling of some key computational features.
The present Volume 5 continues with logical and algebraic methodologies basic to computer science. Chapter 1 covers Martin-Löf's type theory,
originally developed to clarify the foundations of constructive mathematics; it now plays a major role in theoretical computer science. The second
chapter covers categorial logic, the interaction area between category theory and mathematical logic. It builds on the basic concepts introduced
in the chapter 'Basic Category Theory' in Volume 1 of this Handbook series. The third chapter presents methods for obtaining lower bounds on
the computational complexity of logical theories. Many such theories show
up in the landscape of logic and computation. The fourth chapter covers
algebraic specification and types. It treats the subject using set theoretical
notions only and is thus accessible to a wide range of readers. The last
(fifth) chapter deals with computability on abstract data types. It develops a theory of computable functions on abstract many-sorted algebras, a
general enough notion for the needs of computer science.
The Handbooks
We see the creation of this Handbook and its companion, the Handbook of
Logic in Artificial Intelligence and Logic Programming, as a combination of
authoritative exposition, comprehensive survey, and fundamental research
exploring the underlying unifying themes in the various areas. The intended
for the Handbook. We would particularly like to thank the staff of OUP for
their continued and enthusiastic support, and Mrs Jane Spurr, our OUP
Administrator, for her dedication and efficiency.
A View
Finally, fifteen years after the start of the Handbook project, I would like
to take this opportunity to put forward my current views about logic in
computer science. In the early 1980s the perception of the role of logic in
computer science was that of a specification and reasoning tool and that of
a basis for possibly neat computer languages. The computer scientist was
manipulating data structures and the use of logic was one of his options.
My own view at the time was that there was an opportunity for logic
to play a key role in computer science and to exchange benefits with this
rich and important application area and thus enhance its own evolution.
The relationship between logic and computer science was perceived as very
much like the relationship of applied mathematics to physics and engineering. Applied mathematics evolves through its use as an essential tool, and
so, we hoped, it would be for logic. Today my view has changed. As computer science
and artificial intelligence deal more and more with distributed and interactive systems, processes, concurrency, agents, causes, transitions, communication and control (to name a few), the researcher in this area is having
more and more in common with the traditional philosopher who has been
analysing such questions for centuries (unrestricted by the capabilities of
any hardware).
The principles governing the interaction of several processes, for example, are abstract and similar to principles governing the cooperation of two
large organisations. A detailed rule based effective but rigid bureaucracy
is very much similar to a complex computer program handling and manipulating data. My guess is that the principles underlying one are very much
the same as those underlying the other.
I believe the day is not far off when the computer
scientist will wake up one morning with the realisation that he is actually
a kind of formal philosopher!
London
April 1999
D. M. Gabbay
Contents

List of contributors

1 Introduction 1
1.1 Different formulations of type theory 3
1.2 Implementations 4
2 Propositions as sets 4
3 Semantics and formal rules 7
3.1 Types 7
3.2 Hypothetical judgements 9
3.3 Function types 12
3.4 The type Set 14
3.5 Definitions 15
4 Propositional logic 16
5 Set theory 19
5.1 The set of Boolean values 20
5.2 The empty set 21
5.3 The set of natural numbers 21
5.4 The set of functions (Cartesian product of a family of sets) 23
5.5 Propositional equality 26
5.6 The set of lists 28
5.7 Disjoint union of two sets 29
5.8 Disjoint union of a family of sets 29
5.9 The set of small sets 30
The ALF series of interactive editors for type theory 32
Categorial logic 39
Andrew M. Pitts
1 Introduction 40
2 Equational logic 43
2.1 Syntactic considerations 44
2.2 Categorical semantics 45
2.3 Internal languages 48
3 Categorical datatypes 50
3.1 Disjoint union types 52
5 Predicate logic 77
5.2 Hyperdoctrines 78
5.3 Satisfaction 82
5.4 Propositional connectives 84
5.5 Quantification 89
5.6 Equality 93
5.7 Completeness 97
6 Dependent types 100
6.1 Syntactic considerations 101
6.3 Type-categories 109
6.4 Categorical semantics 114
6.5 Dependent products 119
Further reading 123
logical theories 129
Introduction 129
Preliminaries 135
Reductions between formulas 140
Inseparability results for first-order theories 151
Inseparability results for monadic second-order theories 158
Tools for NTIME lower bounds 164
Tools for linear ATIME lower bounds 173
Applications 180
Upper bounds 196
Open problems 204
1 Introduction 219
2 Algebras 220
2.1 The basic notions 220
2.2 Homomorphisms and isomorphisms 223
2.3 Abstract data types 224
2.4 Subalgebras 225
2.5 Quotient algebras 225
3 Terms 227
3.1 Syntax 227
3.2 Semantics 228
3.3 Substitutions 229
3.4 Properties 229
4 Generated algebras, term algebras 230
4.1 Generated algebras 230
4.2 Freely generated algebras 233
4.3 Term algebras 234
4.4 Quotient term algebras 235
5 Algebras for different signatures 235
5.1 Signature morphisms 235
5.2 Reducts 237
5.3 Extensions 238
6 Logic 239
6.1 Definition 239
6.2 Equational logic 240
6.3 Conditional equational logic 241
6.4 Predicate logic 241
7 Models and logical consequences 243
7.1 Models 243
7.2 Logical consequence 244
7.3 Theories 245
7.4 Closures 246
7.5 Reducts 248
7.6 Extensions 248
8 Calculi 249
8.1 Definitions 249
8.2 An example 250
8.3 Comments 251
9 Specification 252
10 Loose specifications 253
10.1 Genuinely loose specifications 253
10.2 Loose specifications with constructors 255
10.3 Loose specifications with free constructors 256
11 Initial specifications 257
11.1 Initial specifications in equational logic 257
11.2 Examples 258
11.3 Properties 260
11.4 Expressive power of initial specifications 260
11.5 Proofs 261
11.6 Term rewriting systems and proofs 263
11.7 Rapid prototyping 265
11.8 Initial specifications in conditional equational logic 266
11.9 Comments 266
12 Constructive specifications 267
13 Specification languages 270
13.1 A simple specification language 271
13.2 Two further language constructs 274
13.3 Adding an environment 278
13.4 Flattening 281
13.5 Properties and proofs 282
13.6 Rapid prototyping 282
13.7 Further language constructs 282
13.8 Alternative semantics description 283
1 Introduction 319
1.1 Computing in algebras 322
1.2 Examples of computable and non-computable functions 325
1.3 Relations with effective algebra 329
1.4 Historical notes on computable functions on algebras 335
1.5 Objectives and structure of the chapter 340
1.6 Prerequisites 343
2 Signatures and algebras 344
2.1 Signatures 344
2.2 Terms and subalgebras 349
2.3 Homomorphisms, isomorphisms and abstract data types 350
2.4 Adding Booleans: Standard signatures and algebras 351
2.5 Adding counters: N-standard signatures and algebras 353
2.6 Adding the unspecified value u: algebras A^u of signature Σ^u 355
2.7 Adding arrays: Algebras A* of signature Σ* 356
2.8 Adding streams: Algebras Ā of signature Σ̄ 359
5.12 Engeler's lemma for While* semicomputability 429
5.13 Σ₁* definability: Input/output and halting formulae 431
5.14 The projective equivalence theorem 434
5.15 Halting sets of While procedures with random assignments 435
6 Examples of semicomputable sets of real and complex numbers 438
6.1 Computability on R and C 439
6.2 The algebra of reals; a set which is projectively While semicomputable but not While* semicomputable 441
6.3 The ordered algebra of reals; sets of reals which are While semicomputable but not While* computable 443
6.4 A set which is projectively While* semicomputable but not projectively While^N semicomputable 445
6.5 Dynamical systems and chaotic systems on R; sets which are While^N semicomputable but not While* computable 447
6.6 Dynamical systems and Julia sets on C; sets which are While^N semicomputable but not While* computable 449
7 Computation on topological partial algebras 451
7.1 The problem 452
7.2 Partial algebras and While computation 453
7.3 Topological partial algebras 455
7.4 Discussion: Two models of computation on the reals 458
7.5 Continuity of computable functions 460
7.6 Topological characterisation of computable sets in compact algebras 464
7.7 Metric partial algebra 465
7.8 Connected domains: computability and explicit definability 465
7.9 Approximable computability 470
7.10 Abstract versus concrete models for computing on the real numbers 475
8 A survey of models of computability 479
8.1 Computability by function schemes 479
8.2 Machine models 484
8.3 High-level programming constructs; program schemes 488
8.4 Axiomatic methods 490
8.5 Equational definability 490
8.6 Inductive definitions and fixed-point methods 492
8.7 Set recursion 493
8.8 A generalised Church-Turing thesis for computability 493
8.9 A Church-Turing thesis for specification 496
8.10 Some other applications 500
Index 525
Contributors
B. Nordström, Kent Petersson and Jan Smith
Department of Computer Science
Chalmers Tekniska Högskola
Institutionen för Datavetenskap
S-412 96 Göteborg
Sweden
A. Pitts
University of Cambridge Computer Laboratory
New Museums Site
Pembroke Street
Cambridge
CB2 3QG
J. Loeckx
Universität des Saarlandes
D-66041 Saarbrücken
Germany
H.-D. Ehrich
Technische Universität Braunschweig
D-38023 Braunschweig
Germany
M. Wolf
Universität des Saarlandes
D-66041 Saarbrücken
Germany
K. Compton
Department of Computer Science
University of Aarhus
Ny Munkegade, Bldg 540
DK-8000 Aarhus C
Denmark
C. Ward Henson
Department of Mathematics
University of Illinois
1409 Green St.
Urbana, IL 61801
USA
J.V.Tucker
Department of Computer Science
University of Swansea
Singleton Park
Swansea
Wales
J. I. Zucker
Department of Computing and Software
Faculty of Engineering
McMaster University
1280 Main Street West, JHE-327
Hamilton, Ontario, L8S 4L7
Canada
Contents

1 Introduction 1
1.1 Different formulations of type theory 3
1.2 Implementations 4
2 Propositions as sets 4
3 Semantics and formal rules 7
3.1 Types 7
3.2 Hypothetical judgements 9
3.3 Function types 12
3.4 The type Set 14
3.5 Definitions 15
4 Propositional logic 16
5 Set theory 19
5.1 The set of Boolean values 20
5.2 The empty set 21
5.3 The set of natural numbers 21
5.4 The set of functions (cartesian product of a family of sets) 23
5.5 Propositional equality 26
5.6 The set of lists 28
5.7 Disjoint union of two sets 29
5.8 Disjoint union of a family of sets 29
5.9 The set of small sets 30
The ALF series of interactive editors for type theory 32
1 Introduction
The type theory described in this chapter has been developed by Martin-Löf
with the original aim of being a clarification of constructive mathematics.
Unlike most other formalizations of mathematics, type theory is not based
on predicate logic. Instead, the logical constants are interpreted within type
theory through the Curry-Howard correspondence between propositions
and sets [Curry and Feys, 1958; Howard, 1980]: a proposition is interpreted
as a set whose elements represent the proofs of the proposition.
It is also possible to view a set as a problem description in a way similar to Kolmogorov's explanation of the intuitionistic propositional calculus
[Kolmogorov, 1932]. In particular, a set can be seen as a specification of a
programming problem; the elements of the set are then the programs that
satisfy the specification.
An advantage of using type theory for program construction is that it
is possible to express both specifications and programs within the same
formalism. Furthermore, the proof rules can be used to derive a correct
program from a specification as well as to verify that a given program has
a certain property. As a programming language, type theory is similar to
typed functional languages such as ML [Gordon et al., 1979; Milner et al.,
1990] and Haskell [Hudak et al., 1992], but a major difference is that the
evaluation of a well-typed program always terminates.
The notion of constructive proof is closely related to the notion of computer program. To prove a proposition (∀x ∈ A)(∃y ∈ B)P(x, y) constructively means to give a function f which when applied to an element a
in A gives an element b in B such that P(a, b) holds. So if the proposition (∀x ∈ A)(∃y ∈ B)P(x, y) expresses a specification, then the function f
obtained from the proof is a program satisfying the specification. A constructive proof could therefore itself be seen as a computer program and
the process of computing the value of a program corresponds to the process
of normalizing a proof. It is by this computational content of a constructive
proof that type theory can be used as a programming language; and since
the program is obtained from a proof of its specification, type theory can be
used as a programming logic. The relevance of constructive mathematics
to computer science was pointed out by Bishop [1970].
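The extraction of a program from a constructive proof can be illustrated in an ordinary programming language. The following Python sketch is our own illustration (the names `proof` and `witness` are invented, not the chapter's): a constructive proof of (∀x ∈ N)(∃y ∈ N) y > x is a function which, for each x, returns a witness y together with evidence that the property holds.

```python
# A constructive proof of (forall x in N)(exists y in N) y > x, read
# through the propositions-as-sets correspondence: the proof is a
# function producing, for each x, a witness y and evidence for y > x.
# (Illustrative sketch only; in type theory the evidence would be a
# proof object, not a boolean.)

def proof(x: int) -> tuple[int, bool]:
    """Return a witness y and the checked evidence that y > x."""
    y = x + 1          # the witness
    evidence = y > x   # stands in for a proof of P(x, y)
    return y, evidence

# The program extracted from the proof is the witness function itself:
witness, ok = proof(41)   # witness = 42, ok = True
```

Computing the value of `proof(41)` corresponds to normalizing the proof applied to the element 41.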
Several implementations of type theory have been made which can serve
as logical frameworks, that is, different theories can be directly expressed
in the implementations. The formulation of type theory we will describe
in this chapter forms the basis for such a framework, which we will briefly
present in the last section.
The chapter is structured as follows. First we will give a short overview
of different formulations and implementations of type theory. Section 2 will
explain the fundamental idea of propositions as sets by means of Heyting's
explanation of the intuitionistic meaning of the logical constants. The
following section will give a rather detailed description of the basic rules
and their semantics; on a first reading some of this material may just be
glanced at, in particular the subsection on hypothetical judgements. In
section 4 we illustrate type theory as a logical framework by expressing
propositional logic in it. Section 5 introduces a number of different sets
and the final section gives a short description of ALF, an implementation
of the type theory of this chapter.
Although self-contained, this chapter can be seen as a complement to our
book, Programming in Martin-Löf's Type Theory. An Introduction [Nord-
1.1 Different formulations of type theory
One of the basic ideas behind Martin-Löf's type theory is the Curry-Howard interpretation of propositions as types, that is, in our terminology,
propositions as sets. This view of propositions is closely related to Heyting's
explanation of intuitionistic logic [Heyting, 1956] and will be explained in
detail below.
Another source for type theory is proof theory. Using the identification
of propositions and sets, normalizing a derivation corresponds to computing
the value of the proof term expressing the derivation. One of Martin-Löf's
original aims with type theory was that it could serve as a framework
in which other theories could be interpreted. And a normalization proof
for type theory would then immediately give normalization for a theory
expressed in type theory.
In Martin-Löf's first formulation of type theory in 1971 [Martin-Löf,
1971a], theories like first-order arithmetic, Gödel's T [Gödel, 1958], second-order logic and simple type theory [Church, 1940] could easily be interpreted. However, this formulation contained a reflection principle expressed
by a universe V and including the axiom V ∈ V, which was shown by Girard
to be inconsistent. Coquand and Huet's 'Calculus of Constructions' [Coquand and Huet, 1986] is closely related to the type theory in [Martin-Löf,
1971a]: instead of having a universe V, they have the two types Prop and
Type and the axiom Prop ∈ Type, thereby avoiding Girard's paradox.
Martin-Löf's later formulations of type theory have all been predicative; in particular, second-order logic and simple type theory cannot be
interpreted in them. The strength of the theory considered in this chapter
instead comes from the possibility of defining sets by induction.
The formulation of type theory from 1979 in 'Constructive Mathematics
and Computer Programming' [Martin-Löf, 1982] is polymorphic and extensional. One important difference from the earlier treatments of type theory
is that normalization is not obtained by metamathematical reasoning; instead, a direct semantics is given, based on Tait's computability method.
A consequence of the semantics is that a term, which is an element in a
set, can be computed to normal form. For the semantics of this theory,
lazy evaluation is essential. Because of a strong elimination rule for the set
expressing the propositional equality, judgemental equality is not decidable. This theory is also the one in Intuitionistic Type Theory [Martin-Löf,
1984]. It is also the theory used in the NuPRL system [Constable et al.,
1986] and by the group in Groningen [Backhouse et al., 1989].
The type theory presented in this chapter was put forward by Martin-Löf
in 1986 with the specific intention that it should serve as a logical framework.
1.2 Implementations
2 Propositions as sets
The basic idea of type theory to identify propositions with sets goes back
to Curry [Curry and Feys, 1958], who noticed that the axioms for positive
implicational calculus, formulated in the Hilbert style,
in the sense that the constructors of a set will depend on the set. For
instance, an element of A → B will be of the form λ(A, B, b) and an
element of A × B will be of the form (A, B, a, b).

3 Semantics and formal rules
We will in this section first introduce the notion of type and the judgement
forms this explanation gives rise to. We then explain what a family of
types is and introduce the notions of variable, assumption and substitution
together with the rules that follow from the semantic explanations. Next,
the function types are introduced with their semantic explanation and the
formal rules which the explanation justifies. The rules are formulated in
the style of natural deduction [Prawitz, 1965].
3.1 Types
The basic notion in Martin-Löf's type theory is the notion of type. A type
is explained by saying what an object of the type is and what it means for
two objects of the type to be identical. This means that we can make the
judgement
A is a type,
which we in the formal system write as
A type,
when we know the conditions for asserting that something is an object of
type A and when we know the conditions for asserting that two objects
of type A are identical. We require that the conditions for identifying two
objects must define an equivalence relation.
When we have a type, we know from the semantic explanation of what
it means to be a type what the conditions are to be an object of that type.
So, if A is a type and we have an object a that satisfies these conditions
then
a is an object of type A,
which we formally write
a ∈ A.
Furthermore, from the semantics of what it means to be a type and the
knowledge that A is a type we also know the conditions for two objects
of type A to be identical. Hence, if A is a type and a and b are objects
of type A and these objects satisfy the equality conditions in the semantic
explanation of A then
a and b are identical objects of type A,
which we write
a = b ∈ A.
Two types are equal when an arbitrary object of one type is also an
object of the other and when two identical objects of one type are identical
objects of the other. If A and B are types we know the conditions for being
an object and the conditions for being identical objects of these types. Then
we can investigate if all objects of type A are also objects of type B and if
all identical objects of type A are also objects of type B and vice versa. If
these conditions are satisfied then
A and B are identical types,
which we formally write
A = B.
The requirement that the equality between objects of a type must be
an equivalence relation is formalized by the rules:
Reflexivity of objects
a ∈ A
a = a ∈ A
Symmetry of objects
a = b ∈ A
b = a ∈ A
Transitivity of objects
a = b ∈ A    b = c ∈ A
a = c ∈ A
The corresponding rules for types are easily justified from the meaning
of what it means to be a type:
Reflexivity of types
A type
A = A
Symmetry of types
A = B
B = A
Transitivity of types
A = B    B = C
A = C
3.2
Hypothetical judgements
The judgements we have introduced so far do not depend on any assumptions. In general, a hypothetical judgement is made in a context of the
form
x1 ∈ A1, x2 ∈ A2, ..., xn ∈ An
means that A[x ← c] and B[x ← c] are equal types for an arbitrary object
c of type C.
a ∈ A [x ∈ C],
means that we know that a[x ← c] is an object of type A[x ← c] for an
arbitrary object c of type C. We must also know that a[x ← c] and a[x ← d]
are identical objects of type A[x ← c] whenever c and d are identical objects
of type C.
That a and b are identical objects of type A depending on x ∈ C,
a = b ∈ A [x ∈ C],
means that a[x ← c] and b[x ← c] are the same objects of type A[x ← c] for
an arbitrary object c of type C.
We will illustrate the general case by giving the meaning of the judgement that A is a type in a context of length n; the other hypothetical
judgements are explained in a similar way. We assume that we already
know the explanations of the judgement forms in a context of length n − 1.
Let
x1 ∈ A1, x2 ∈ A2, ..., xn ∈ An
be a context of length n. We then know that
A1 type
A2 type [x1 ∈ A1]
An type [x1 ∈ A1, x2 ∈ A2, ..., xn−1 ∈ An−1]
To know the hypothetical judgement
A type [x1 ∈ A1, x2 ∈ A2, ..., xn ∈ An]
means that we know the judgement
A[x1 ← a] type [x2 ∈ A2[x1 ← a], ..., xn ∈ An[x1 ← a]]
for an arbitrary object a of type A1.
stituting objects for one or several of the variables in the context. Formulating the rules so they follow the semantical explanation as closely as
possible gives us:
Substitution in types
x2 ∈ A2, ..., xn ∈ An.
Substitution in equal types
The explanations of the hypothetical judgement forms justify the following rule for introducing assumptions.
Assumption
In this rule all premises are explicit. In order to make the rules shorter and
more comprehensible we will often leave out that part of the context which
is the same in the conclusion and each premise.
The rules given in the previous section without assumptions can also
be justified for hypothetical judgements.
3.3
Function types
One of the basic ways to form a new type from old ones is to form a function
type. So, if we have a type A and a family B of types over A, we want
to form the dependent function type (x ∈ A)B of functions from A to B.
In order to do this, we must explain what it means to be an object of
type (x ∈ A)B and what it means for two objects of type (x ∈ A)B to be
identical. The function type is explained in terms of application.
To know that an object c is of type (x ∈ A)B means that we know that
when we apply it to an arbitrary object a of type A we get an object c(a)
in B[x ← a] and that we get identical objects in B[x ← a] when we apply
it to identical objects a and b of A.
That two objects c and d of (x ∈ A)B are identical means that when
we apply them to an arbitrary object a of type A we get identical objects
of type B[x ← a].
Since we have now explained what it means to be an object of a function
type and for the conditions for two objects of a function type to be equal,
we can justify the rule for forming the function type:
Function type
We will use the abbreviation (A)B for (x ∈ A)B when B does not depend
on x. We will also write (x ∈ A; y ∈ B)C instead of (x ∈ A)(y ∈ B)C and
(x, y ∈ A)B instead of (x ∈ A; y ∈ A)B.
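Although no mainstream language checks dependent types statically, the defining property of (x ∈ A)B can at least be tested dynamically. In this Python sketch (our own illustration; `check_dependent` and the finite encodings are invented names), a set is a frozenset, a family B over A is a function from elements to sets, and an object of the dependent function type must send each a ∈ A into B[x ← a]:

```python
# A finite model of the dependent function type (x in A)B:
# A is a set of elements, B maps each a in A to the set B[x <- a],
# and an object c of (x in A)B must satisfy c(a) in B(a) for all a in A.
# (Hypothetical helper for illustration only.)

def check_dependent(A, B, c):
    """True if c is (extensionally) an object of the type (x in A)B."""
    return all(c(a) in B(a) for a in A)

A = frozenset(range(5))
B = lambda a: frozenset(range(a + 1))   # B[x <- a] = {0, ..., a}
c = lambda a: a                          # c(a) = a lies in {0, ..., a}

# check_dependent(A, B, c) holds, while the constant function a -> 5
# fails, because 5 never lies in B[x <- a] for a in A.
```

The point of the sketch is that the set in which c(a) lands varies with the argument a, which is exactly what distinguishes (x ∈ A)B from the ordinary function type (A)B.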
We can also justify the following two rules for application:
Application
And we have the following rules for showing that two functions are equal:
Application
Extensionality
type level. Later we will see that for the set of functions, the situation is
different.
By the rules we have introduced, we can derive the following:
η-conversion
3.4 The type Set
The objects in the type Set consist of inductively defined sets. In order
to explain a type we have to explain what it means to be an object in it
and what it means for two such objects to be the same. So, to know that
Set is a type we must explain what a set is and what it means for two
sets to be the same: to know that A is an object in Set (or equivalently
that A is a set) is to know how to form canonical elements in A and when
two canonical elements are equal. A canonical element is an element on
constructor form; examples are zero and the successor function for natural
numbers.
Two sets are the same if an element of one of the sets is also an element
of the other and if two equal elements of one of the sets are also equal
elements of the other.
This explanation justifies the following rule:
Set-formation
Set type
If we have a set A we may form a type El(A) whose objects are the
elements of the set A:
Set). The concept of type is open; it is possible to add more types to the
language; for instance, adding a new object A to Set gives the new type
El(A).
In what follows, we will often write A instead of El(A) since it will
always be clear from the context if A stands for the set A or the type of
elements of A.
3.5 Definitions
Most of the generality and usefulness of the language comes from the possibilities of introducing new constants. It is in this way that we can introduce
the usual mathematical objects like natural numbers, integers, functions,
tuples etc. It is also possible to introduce more complicated inductive sets
like sets for proof objects: it is in this way that rules and axioms of a theory
are represented in the framework.
A distinction is made between primitive and defined constants. The
value of a primitive constant is the constant itself. So, the constant has only
a type and not a definition; instead it gets its meaning by the semantics
of the theory. Such a constant is also called a constructor. Examples
of primitive constants are N, succ and 0; they can be introduced by the
following declarations:
The last example is the monomorphic identity function which, when applied
to an arbitrary set A, yields the identity function on A. It is easy to see
4 Propositional logic
Type theory can be used as a logical framework, that is, it can be used to
represent different theories. In general, a theory is presented by a list of
typings
where c1,..., cn are new primitive constants, and a list of definitions
if identical proofs of one of the propositions are also identical proofs of the
other.
The primitive constant & for conjunction is introduced by the following
declaration:
& ∈ (Set; Set)Set
From this declaration we obtain, by repeated function application, the
clause for conjunction in the usual inductive definition of formulas in the
propositional calculus:
&-formation
where we have used infix notation, that is, we have written A&B instead of &(A, B).
We must now define what counts as a proof of a conjunction, and that
is done by the following declaration of the primitive constant &I:
This declaration is the inductive definition of the set &(A, B), that is, any
element in the set is equal to an element of the form &I(A, B, a, b), where
A and B are sets and a ∈ A and b ∈ B. A proof of the syntactical form
&I(A, B, a, b) is called a canonical proof of A&B.
By function application, we obtain the introduction rule for conjunction
from the declaration of &I:
&-introduction
A ∈ Set    B ∈ Set    a ∈ A    b ∈ B
&I(A, B, a, b) ∈ A&B
The defining equations for the elimination constants are
&E1(A, B, &I(A, B, a, b)) = a
and
&E2(A, B, &I(A, B, a, b)) = b
respectively. Notice that it is the definition of the constants which justifies
their typings. To see that the typing of &E1 is correct, assume that A and
B are sets, and that p ∈ A&B. We must then show that &E1(A, B, p)
is an element in A. But since p ∈ A&B, we know that p is equal to an
element of the form &I(A, B, a, b), where a ∈ A and b ∈ B. But then we
have that &E1(A, B, p) = &E1(A, B, &I(A, B, a, b)), which is equal to a by
the defining equation of &E1.
From the typings of &E1 and &E2 we obtain, by function application,
the elimination rules for conjunction:
&-elimination 1
and
&-elimination 2
The defining equations for &E1 and &E2 correspond to Prawitz's reduction
rules in natural deduction:
A&B
A
and
A&B
B
respectively. Notice the role which these rules play here. They are used
to justify the correctness, that is, the well-typing of the elimination rules.
The elimination rules are looked upon as methods which can be executed,
and it is the reduction rules which define the execution of the elimination
rules.
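Under the propositions-as-sets reading these constants behave exactly like pairing and the two projections, and the defining equations are executed as ordinary computation. The following Python sketch is our own analogue (names such as `AndI` are invented); the set parameters A and B are kept as explicit arguments, as in the declarations above:

```python
# Proofs of A & B are pairs built by the constructor &I; the eliminators
# &E1 and &E2 are projections.  The defining equations
#   &E1(A, B, &I(A, B, a, b)) = a    &E2(A, B, &I(A, B, a, b)) = b
# correspond to Prawitz's reduction rules, run here as computation.
from dataclasses import dataclass

@dataclass(frozen=True)
class AndI:               # canonical (constructor) form of a proof of A & B
    A: object
    B: object
    a: object             # the proof of A
    b: object             # the proof of B

def andE1(A, B, p):       # &E1: from a proof p of A & B, a proof of A
    assert isinstance(p, AndI)
    return p.a

def andE2(A, B, p):       # &E2: from a proof p of A & B, a proof of B
    assert isinstance(p, AndI)
    return p.b
```

Evaluating `andE1(A, B, AndI(A, B, a, b))` returns `a`, which is precisely the first reduction rule above: the eliminator is a method that is executed, and the reduction rule defines its execution.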
The primitive constant ⊃ for implication is introduced by the declaration
⊃ ∈ (Set; Set)Set
As for conjunction, we obtain from this declaration the clause for implication in the inductive definition of formulas in the propositional calculus:
⊃-formation
In the same way as for conjunction, we can use this definition to show that
⊃ is well-typed.
The defining equation corresponds to the reduction rule
5 Set theory
5.1 The set of Boolean values
In these two definitional equalities we have omitted the types since they
can be obtained immediately from the typing of if. Henceforth, we will
often write just a = b instead of a = b ∈ A when the type A is clear from
the context.
Since there are no constructors we immediately define the selector case and
its type by the declaration
The empty set corresponds to the absurd proposition and the selector corresponds to the natural deduction rule for absurdity:
5.3 The set of natural numbers
In order to introduce the set of natural numbers, N, we must give the rules
for forming all the natural numbers as well as all the rules for forming
two equal natural numbers. These are the introduction rules for natural
numbers.
There are two ways of forming natural numbers: 0 is a natural number
and if n is a natural number then succ(n) is a natural number. There are
also two corresponding ways of forming equal natural numbers: the natural
number 0 is equal to 0, and if the natural number n is equal to m, then
succ(n) is equal to succ(m). So we have explained the meaning of the
natural numbers as a set, and can therefore make the type declaration
and form the introduction rules for the natural numbers, by declaring the
types of the constructor constants 0 and succ:
The general rules in the framework make it possible to give the introduction
rules in this simple form.
We will introduce a very general form of selector for natural numbers,
natrec, as a defined constant. It can be used both for expressing elements
by primitive recursion and proving properties by induction. The functional
constant natrec takes four arguments: the first is a family of sets that
determines the set which the result belongs to; the second and third are
the results for the zero and successor case, respectively; and the fourth is the
The selector for natural numbers could, as already mentioned, be used for
introducing ordinary primitive recursive functions. Addition and multiplication could, for example, be introduced as two defined constants
Using the rules for application together with the type and the definitional equalities for the constant natrec, it is easy to derive the type of the right-hand side of the equalities above, as well as the following equalities for addition and multiplication:
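These defining equations can be rendered concretely. The following sketch is illustrative only: Python integers and functions stand in for elements of N and for families of sets, and the names `natrec`, `plus` and `mult` mirror the defined constants of the text.

```python
# A sketch of the selector natrec. The family-of-sets argument C plays
# no computational role here, so it is kept only to mirror natrec's
# four arguments.
def natrec(C, d, e, n):
    """natrec(C, d, e, 0) = d
       natrec(C, d, e, succ(m)) = e(m, natrec(C, d, e, m))"""
    if n == 0:
        return d
    m = n - 1
    return e(m, natrec(C, d, e, m))

# Addition and multiplication as defined constants, by recursion on
# the first argument:
def plus(a, b):
    return natrec(None, b, lambda m, rec: rec + 1, a)

def mult(a, b):
    return natrec(None, 0, lambda m, rec: plus(rec, b), a)
```

For example, `plus(2, 3)` unfolds as succ(succ(3)), i.e. 5, exactly following the two natrec equalities.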
where d' and e' are functions in type theory which correspond to d and e in the definition of f.
The type of the constant natrec represents the usual elimination rule
for natural numbers
which can be obtained by assuming the arguments and then applying the
constant natrec to them. Note that, in the conclusion of the rule, the
expression natrec(C, d, e, x) contains the family C. This is a consequence
of the explicit declaration of natrec in the framework.
5.4 Cartesian product of a family of sets
Notice that the elements of the cartesian product of a family of sets, Π(A, B), are more general than ordinary functions from A to B, in that the result of applying an element of Π(A, B) to an argument can be in a set which depends on the value of the argument.
The most important defined constant for the set Π(A, B) is the constant for application. In type theory this selector takes as arguments not only an element of Π(A, B) and an object of type A but also the sets A and B themselves. The constant is introduced by the type declaration
and the definitional equality
The cartesian product of a family of sets is, when viewed as a proposition, the same as universal quantification. The type of the constructor
corresponds to the introduction rule
The cartesian product of a family of sets is a generalization of the ordinary function set. If the family of sets B over A is the same for all elements
of A, then the cartesian product is just the set of ordinary functions. The
constant → is introduced by the following explicit definition:
and the type of the selector is the same as the elimination rule
Given the empty set and the set of functions, we can define a constant
for negation in the following way:
and therefore
Example 5.2. Using the rules for natural numbers, Booleans and functions, we will show how to define a function, eqN ∈ (N, N)Bool, that decides whether two natural numbers are equal. We want the following equalities to hold:

It is impossible to define eqN directly by simultaneous recursion on both arguments. We have to do recursion on the arguments separately: first use recursion on the first argument to compute a function which, when applied to the second argument, gives us the result we want. So we first define a function f ∈ (N)(N → Bool) which satisfies the equalities
where
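A concrete sketch of this construction (again with Python values standing in for type-theoretic objects; the helper names `iszero` and `eq_nat` are ours, not the chapter's) shows how recursion on the first argument alone produces a function that is then applied to the second argument:

```python
# natrec as sketched earlier; repeated so that this block is self-contained.
def natrec(C, d, e, n):
    return d if n == 0 else e(n - 1, natrec(C, d, e, n - 1))

# iszero decides "b = 0" by a single recursion on b.
def iszero(b):
    return natrec(None, True, lambda m, _: False, b)

# eq_nat(a) builds an element of N -> Bool by recursion on a alone:
#   eq_nat(0)        = iszero
#   eq_nat(succ(n))  = the function sending 0 to False and succ(m) to eq_nat(n)(m)
def eq_nat(a):
    return natrec(
        None,
        iszero,
        lambda n, f_n: (lambda b: natrec(None, False, lambda m, _: f_n(m), b)),
        a)

def eqN(a, b):
    return eq_nat(a)(b)
```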
5.5 Propositional equality
The derived rule for symmetry can therefore be expressed by the constant
symm defined by
By applying idpeel to A, [x, y, z](Id(A, y, c) → Id(A, x, c)), a, b, d and the identity function [x]λ(Id(A, x, c), Id(A, x, c), [w]w), we get an element in the set Id(A, b, c) → Id(A, a, c). This element is applied to e in order to get the desired element in Id(A, a, c):
Example 5.4. Let us see how we can derive a rule for substitution in set
expressions. We want to have a rule
To derive such a rule, first assume that we have a set A and elements a and b of A. Furthermore, assume that c ∈ Id(A, a, b), P(x) ∈ Set [x ∈ A] and p ∈ P(a). Type checking gives us that
5.6 Lists
The set of lists List(A) is introduced in a similar way to the natural numbers, except that there is a parameter A that determines which set the elements of a list belong to. There are two constructors to build a list: nil
for the empty list and cons to add an element to a list. The constants we
have introduced so far have the following types:
The selector listrec is a constant that expresses primitive recursion
for lists. The selector is introduced by the type declaration
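By analogy with natrec, the behaviour of listrec can be sketched as follows (Python lists stand in for elements of List(A); the family argument C is computationally inert and is kept only to mirror the arity; `append` and `length` are our own illustrative defined constants):

```python
def listrec(C, a, e, xs):
    """listrec(C, a, e, nil) = a
       listrec(C, a, e, cons(x, xs)) = e(x, xs, listrec(C, a, e, xs))"""
    if not xs:
        return a
    x, rest = xs[0], xs[1:]
    return e(x, rest, listrec(C, a, e, rest))

# Two primitive recursive definitions over lists:
def append(xs, ys):
    return listrec(None, ys, lambda x, _, rec: [x] + rec, xs)

def length(xs):
    return listrec(None, 0, lambda _x, _xs, rec: 1 + rec, xs)
```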
5.7 Disjoint union of two sets
If we have two sets A and B we can form the disjoint union A + B. The elements of this set are either of the form inl(A, B, a) or of the form inr(A, B, b), where a ∈ A and b ∈ B. In order to express this in the framework we introduce the constants
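The intended behaviour of these constructors, together with a selector for A + B (called `when` in some presentations of the theory), can be sketched as follows; tags replace the formal constructors, and the set arguments A, B and the family C are inert:

```python
# inl and inr carry the sets A and B as extra arguments in the framework;
# here they are inert and only the tag matters.
def inl(A, B, a):
    return ("inl", a)

def inr(A, B, b):
    return ("inr", b)

def when(C, d, e, c):
    """Selector for A + B:
       when(C, d, e, inl(a)) = d(a)
       when(C, d, e, inr(b)) = e(b)"""
    tag, v = c
    return d(v) if tag == "inl" else e(v)
```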
5.8 Disjoint union of a family of sets
In order to be able to deal with the existential quantifier and to have a set
of ordinary pairs, we will introduce the disjoint union of a family of sets.
The set is introduced by the type declaration
There is one constructor in this set, pair, which is introduced by the type
declaration
The selector of a set Σ(A, B) splits a pair into its parts. It is defined by the type declaration
Given the selector split, it is easy to define the two projection functions
that give the first and second component of a pair:
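As a sketch (with Python tuples standing in for pairs, and the inert set arguments retained only to mirror the framework's arities):

```python
def pair(A, B, a, b):
    # the single constructor of Sigma(A, B)
    return (a, b)

def split(A, B, C, d, p):
    # split(d, pair(a, b)) = d(a, b)
    a, b = p
    return d(a, b)

# The two projections, defined from split:
def fst(A, B, p):
    return split(A, B, None, lambda a, b: a, p)

def snd(A, B, p):
    return split(A, B, None, lambda a, b: b, p)
```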
5.9 The set of small sets
A set of small sets U, or a universe, is a set that reflects some part of the set
structure on the object level. It is of course necessary to introduce this set
if one wants to do some computation using sets, for example to specify and
prove a type checking algorithm correct, but it is also necessary in order to
Example 5.5. Let us see how we can derive an element in the set
and therefore
assumptions left. This process corresponds exactly to the way we can build
a mathematical object from the outside in. If we have a problem
where the new placeholder ?1 must have the type B (since all arguments to c must have that type) and, furthermore, the type of c(?1)
must be equal to A, that is, the following equality must hold:
References
[Augustsson et al., 1990] L. Augustsson, T. Coquand, and B. Nordstrom.
A short description of Another Logical Framework. In Proceedings of the
First Workshop on Logical Frameworks, Antibes, pages 39-42, 1990.
[Backhouse, 1987] Roland Backhouse. On the meaning and construction
of the rules in Martin-Lof's theory of types. In Proceedings of the Workshop on General Logic, Edinburgh. Laboratory for the Foundations of
Computer Science, University of Edinburgh, February 1987.
[Backhouse et al., 1989] Roland Backhouse, Paul Chisholm, Grant Malcolm, and Erik Saaman. Do-it-yourself type theory. Formal Aspects of Computing, 1:19-84, 1989.
[Bishop, 1970] Errett Bishop. Mathematics as a numerical language. In
J. Myhill, A. Kino, and R. E. Vesley, editors, Intuitionism and Proof
Theory, pages 53-71, North Holland, 1970.
[Church, 1940] A. Church. A formulation of the simple theory of types.
Journal of Symbolic Logic, 5:56-68, 1940.
[Constable et al., 1986] R. L. Constable, S. F. Allen, H. M. Bromley, W. R. Cleaveland, J. F. Cremer, R. W. Harper, D. J. Howe, T. B. Knoblock, N. P. Mendler, P. Panangaden, J. T. Sasaki, and S. F. Smith. Implementing Mathematics with the NuPRL Proof Development System. Prentice Hall, 1986.
[Coquand, 1994] Catarina Coquand. From semantics to rules: a machine assisted analysis. In E. Börger, Y. Gurevich, and K. Meinke, editors, Computer Science Logic, 7th Workshop, Lecture Notes in Computer Science 832, pages 91-105. Springer-Verlag, 1994.
[Coquand and Huet, 1986] Thierry Coquand and Gérard Huet. The calculus of constructions. Technical Report 530, INRIA, Centre de Rocquencourt, 1986.
[Coquand and Paulin, 1989] Thierry Coquand and Christine Paulin. Inductively defined types. In Proceedings of the Workshop on Programming Logic, Båstad, 1989.
[Curry and Feys, 1958] H. B. Curry and R. Feys. Combinatory Logic, volume I. North-Holland, 1958.
[de Bruijn, 1970] N. G. de Bruijn. The Mathematical Language AUTOMATH, its usage and some of its extensions. In M. Laudet, D. Lacombe, L. Nolin, and M. Schützenberger, editors, Symposium on Automatic Demonstration, Lecture Notes in Mathematics 125, pages 29-61. Springer-Verlag, 1970.
[de Bruijn, 1980] N. G. de Bruijn. A survey of the project AUTOMATH.
In J. P. Seldin and J. R. Hindley, editors, To H. B. Curry: Essays on
Combinatory Logic, Lambda Calculus, and Formalism, pages 589-606,
Academic Press, 1980.
and T. Nipkow, editors, Types for Proofs and Programs, Lecture Notes in Computer Science 806. Springer-Verlag, 1994.
[Martin-Löf, 1971a] Per Martin-Löf. A theory of types. Technical Report 71-3, University of Stockholm, 1971.
[Martin-Löf, 1971b] Per Martin-Löf. Hauptsatz for the intuitionistic theory of iterated inductive definitions. In J. E. Fenstad, editor, Proceedings of the Second Scandinavian Logic Symposium, pages 179-216. North-Holland, 1971.
[Martin-Löf, 1982] Per Martin-Löf. Constructive mathematics and computer programming. In Logic, Methodology and Philosophy of Science, VI, 1979, pages 153-175. North-Holland, 1982.
[Martin-Löf, 1984] Per Martin-Löf. Intuitionistic Type Theory. Bibliopolis, Napoli, 1984.
[Milner et al., 1990] R. Milner, M. Tofte, and R. Harper. The Definition of Standard ML. MIT Press, 1990.
[Nordström et al., 1990] Bengt Nordström, Kent Petersson, and Jan M. Smith. Programming in Martin-Löf's Type Theory. An Introduction. Oxford University Press, 1990.
[Paulson, 1987] Lawrence C. Paulson. Logic and Computation. Cambridge
University Press, 1987.
[Petersson, 1982, 1984] Kent Petersson. A programming system for type theory. PMG report 9, Chalmers University of Technology, S-412 96 Göteborg, 1982, 1984.
[Peyton Jones, 1999] S. Peyton Jones (ed.). Haskell 98: A non-strict, purely functional language, February 1999. URL: www.haskell.org/onlinereport
[Prawitz, 1965] D. Prawitz. Natural Deduction. Almqvist & Wiksell, 1965.
[Smith, 1988] Jan M. Smith. The independence of Peano's fourth axiom from Martin-Löf's type theory without universes. Journal of Symbolic Logic, 53(3), 1988.
[Szasz, 1991] Nora Szasz. A machine checked proof that Ackermann's function is not primitive recursive. Licentiate thesis, Chalmers University of Technology and University of Göteborg, Sweden, June 1991. Also in G. Huet and G. Plotkin, editors, Logical Frameworks, Cambridge University Press.
[Tait, 1965] W. Tait. Infinitely long terms of transfinite type. In Formal Systems and Recursive Functions, pages 176-185. North-Holland, 1965.
[von Sydow, 1992] Björn von Sydow. A machine-assisted proof of the fundamental theorem of arithmetic. PMG Report 68, Chalmers University of Technology, June 1992.
Categorical logic
Andrew M. Pitts
Contents

1 Introduction
2 Equational logic
2.1 Syntactic considerations
2.2 Categorical semantics
2.3 Internal languages
3 Categorical datatypes
3.1 Disjoint union types
3.2 Product types
3.3 Function types
3.4 Inductive types
3.5 Computation types
4 Theories as categories
4.1 Change of category
4.2 Classifying category of a theory
4.3 Theory-category correspondence
4.4 Theories with datatypes
5 Predicate logic
5.1 Formulas and sequents
5.2 Hyperdoctrines
5.3 Satisfaction
5.4 Propositional connectives
5.5 Quantification
5.6 Equality
5.7 Completeness
6 Dependent types
6.1 Syntactic considerations
6.2 Classifying category of a theory
6.3 Type-categories
6.4 Categorical semantics
6.5 Dependent products
7 Further reading
1 Introduction
This chapter provides an introduction to the interaction between category
theory and mathematical logic. Category theory describes properties of
mathematical structures via their transformations, or 'morphisms'. On the
other hand, mathematical logic provides languages for formalizing properties of structures directly in terms of their constituent parts: elements of sets, functions between sets, relations on sets, and so on. It might seem that
the kind of properties that can be described purely in terms of morphisms
and their composition would be quite limited. However, beginning with the
attempt of Lawvere [1964; 1966; 1969; 1970] to reformulate the foundations
of mathematics using the language of category theory, the development of
categorical logic over the last three decades has shown that this is far from
true. Indeed it turns out that many logical constructs can be characterized in terms of relatively few categorical ones, principal among which is
the concept of adjoint functor. In this chapter we will see such categorical characterizations for, amongst other things, the notions of variable, substitution, propositional connectives and quantifiers, equality, and various type-theoretic constructs. We assume that the reader is familiar with some
of the basic notions of category theory, such as functor, natural transformation, (co)limit, and adjunction: see Poigné's [1992] chapter on Basic
Category Theory in Vol. I of this handbook, or any of the several available
introductions to category theory slanted towards computer science, such as
[Barr and Wells, 1990] and [Pierce, 1991].
Overview
There are three recurrent themes in the material we present.
Categorical semantics. Many systems of logic can only be modelled in a
sufficiently complete way by going beyond the usual set-based structures of
classical model theory. Categorical logic introduces the idea of a structure
valued in a category C, with the classical model-theoretic notion of structure
[Chang and Keisler, 1973] appearing as the special case when C is the
category of sets and functions. For a particular logical concept, one seeks
to identify what properties (or extra structure) are needed in an arbitrary
category to interpret the concept in a way that respects given logical axioms
and rules. A well-known example is the interpretation of simply typed
lambda calculus in cartesian closed categories (see Section 3.3).
Such categorical semantics can provide a completely general and often
quite simple formulation of what is required to model (a theory in) the
logic. This has proved useful in cases where more traditional, set-theoretic
methods of defining a notion of model either lack generality, or are inconveniently complicated to describe, or both. Seely's [1987] modelling of
various impredicative type theories, such as the Girard-Reynolds polymorphic lambda calculus [Girard, 1986], [Reynolds, 1983], is an example of this:
see [Reynolds and Plotkin, 1993] and [Pitts, 1987; 1989] for instances of
the use of categorical models in this case.
Internal languages. Category theory has evolved a characteristic form of
proof by 'diagram-chasing' to establish properties expressible in category
theoretic terms. In complex cases, such arguments can be difficult to construct and hard to follow, because of the rather limited forms of expression
of purely category-theoretic language. Categorical logic enables the use
of richer (and more familiar) forms of expression for establishing properties of particular kinds of category. One first defines a suitable 'internal
language' naming the relevant constituents of the category and then applies a categorical semantics to turn assertions in a suitable logic over the
internal language into corresponding categorical statements. Such a procedure has become most highly developed in the theory of toposes, where the
internal language of a topos, coupled with the semantics of intuitionistic
higher-order logic in toposes, enables one to reason about the objects and
morphisms of a topos 'as though they were sets and functions' (provided
one reasons constructively): see [Mac Lane and Moerdijk, 1992, VI.5], [Bell,
1988] or [McLarty, 1992, Chp. 14]. This technique has been particularly
helpful for working with toposes that contain 'sets' that have properties
incompatible with classical logic: cases in point are the modelling of the
untyped lambda calculus in terms of objects that retract onto their own
function space by D. Scott [1980], and the Moggi-Hyland modelling of the
Girard-Reynolds polymorphic lambda calculus by an internal full subcategory of the effective topos of Hyland [1988].
Term-model constructions. In many cases, the categorical semantics of a
particular kind of logic provides the basis for a correspondence between
formal theories in the logic and instances of the appropriate kind of category. In one direction, the correspondence associates to such a category the
theory induced by its internal language. In the other direction, one has to
give a 'term-model' construction manufacturing an instance of the required
categorical structure out of the syntax of the theory. An example of such a
correspondence is that between theories in βη-equational logic over simply
typed lambda calculus and cartesian closed categories. Another example
is the correspondence between theories in extensional higher-order intuitionistic logic and toposes. These are both examples in which a variety
of category theory was discovered to correspond to a pre-existing type of
logic, with interesting consequences for both sides of the equation. On the
other hand, some varieties of category, embodying important mathematical
constructs, have required the invention of new types of logic to establish
such a theory-category correspondence. A prime example of this is the
notion of category with finite limits (variously termed a cartesian or a lex
category), for which a number of different logics have been devised: the essentially algebraic theories of Freyd [1972], the lim-theories of Coste [1979,
section 2], and the generalized algebraic theories of Cartmell (see section 6)
can all be used for this purpose, although each has its drawbacks.
Categories arising from theories via term-model constructions can usually be characterized up to equivalence by a suitable universal property
(cf. Theorem 4.6, for example). This has enabled metatheoretic properties
of logics to be proved via categorical algebra. For example, Freyd's proof
of the existence and disjunction properties in intuitionistic logic is a case
in point (see [Freyd and Scedrov, 1990, Appendix B.32]). It makes use of
a categorical construction variously called sconing or glueing. The same
construction was used by Lafont to prove a strong conservativity result for
the simply typed lambda calculus generated by an algebraic theory: see
[Crole, 1993, Section 4.10]. A categorical construction closely related to
glueing was used in [Crole and Pitts, 1992] to prove an existence property
for a logic for fixed-point recursion. One of the strengths of the categorical
approach to logic is that the same categorical construction can sometimes
be applied to obtain results for a number of different logics.
2 Equational logic
Fixing disjoint, countably infinite sets of variables for each sort symbol, the (open) terms over Sg and their sorts are defined in the usual way: if x is a variable of sort σ, then x is a term of sort σ; and if F : σ1, ..., σn → τ is a function symbol and M1, ..., Mn are terms of sorts σ1, ..., σn respectively, then F(M1, ..., Mn) is a term of sort τ. (In the case n = 0, F is just a constant and we abbreviate F() to F.) We will write M : σ to indicate that M is a term of sort σ.
Recall that a set-valued structure for Sg is specified by giving a set [[σ]] for each sort symbol σ and a function [[F]] : [[σ1]] × ··· × [[σn]] → [[τ]] for each function symbol F : σ1, ..., σn → τ. The set [[σ1]] × ··· × [[σn]] is the cartesian product of the sets [[σi]] and so consists of n-tuples (a1, ..., an) with ai ∈ [[σi]]. In the case n = 0, this cartesian product contains just one element, the 0-tuple (), and so specifying the function [[F]] amounts to picking a particular element of [[τ]].
Given such a structure for Sg, the usual environment-style semantics for terms is defined by structural induction:
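In the category Set this definition can be sketched directly. The encoding below is hypothetical (ours, not the chapter's): terms are nested tuples, a structure is a dictionary of Python functions, and an environment is a dictionary:

```python
# Environment-style semantics [[M]]rho for first-order terms over a
# signature, valued in Set. Terms are ("var", x) or ("app", F, [M1, ..., Mn]).
def meaning(struct, term, rho):
    if term[0] == "var":
        # [[x]]rho = rho(x)
        return rho[term[1]]
    _, F, args = term
    # [[F(M1, ..., Mn)]]rho = [[F]]([[M1]]rho, ..., [[Mn]]rho)
    vals = [meaning(struct, M, rho) for M in args]
    return struct[F](*vals)

# Example: a signature with a constant e and a binary operation op,
# interpreted in the additive monoid of integers.
struct = {"e": lambda: 0, "op": lambda x, y: x + y}
term = ("app", "op", [("var", "x"), ("app", "e", [])])
# meaning(struct, term, {"x": 7}) computes op(7, e()) = 7 + 0
```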
between two terms, M = N, is satisfied by the structure if, for all environments ρ defined on the variables occurring in M and N, [[M]]ρ and [[N]]ρ are equal elements of the structure.
This explanation of the meaning of terms and equations is couched in
terms of elements of sets. However, we can reformulate it in terms of functions between sets, using only category-theoretic properties of Set. This
reformulation will allow us to replace the category Set of sets and functions
by other suitable categories. First note that for any fixed list of distinct variables [x1 : σ1, ..., xn : σn], the set of environments defined just on this set of variables is in bijection with the cartesian product [[σ1]] × ··· × [[σn]]. Then the mapping ρ ↦ [[M]]ρ determines a function [[σ1]] × ··· × [[σn]] → [[τ]] (assuming M has sort τ) which captures the meaning of the term M in the structure. In particular, the satisfaction of an equation amounts to the
structure. In particular, the satisfaction of an equation amounts to the
equality of the corresponding functions. So we can get a semantics expressed via functions rather than elements if we give meaning not to a
naked term, but rather to a 'term-in-context', that is, to a term together
with a list of distinct variables containing at least those which actually occur in the term. This also allows us to drop the syntactic convention that
occurrences of variables in terms be explicitly typed, by recording the sort
of a variable just once, in the context. The next section summarizes this
approach to the syntax.
2.1 Syntactic considerations
(2.1)
M : σ [Γ]    (2.5)
2.2 Categorical semantics
will denote the unique morphism whose composition with each projection πi is fi. For definiteness we will assume that the product X1 × ··· × Xn is defined by induction on the length of the list [X1, ..., Xn], using a terminal object, 1, and binary products, ×. Thus the product of the empty list is 1; and, inductively, the product of a list [X1, ..., Xn, Xn+1] of length n + 1 is given by the binary product (X1 × ··· × Xn) × Xn+1.
A structure in C for a given signature Sg is specified by giving an object [[σ]] in C for each sort σ, and a morphism [[F]] : [[σ1]] × ··· × [[σn]] → [[τ]] in C for each function symbol F : σ1, ..., σn → τ. (In the case n = 0, this means that a structure assigns a global element [[c]] : 1 → [[τ]] to a constant c : τ.) Given such a structure, for each context Γ = x1 : σ1, ..., xn : σn, term M and sort τ for which M : τ [Γ] holds, we will define a morphism in C:

where [[Γ]] denotes the product [[σ1]] × ··· × [[σn]]. Note that the rules for deriving well-formed terms-in-context are such that for each Γ, M and τ, there is at most one way to derive M : τ [Γ]. The definition of [[M : τ [Γ]]] can therefore be given by induction on the structure of this derivation. Since it is also the case that τ is uniquely determined by Γ and M, we will abbreviate [[M : τ [Γ]]] to [[M [Γ]]]. The definition has two clauses, corresponding to rules (2.2) and (2.3):
Suppose M = M' : σ [Γ] is an equation-in-context over a given signature, Sg. Since M : σ [Γ] and M' : σ [Γ] are required to hold, a structure in C for Sg gives rise, via the above definition, to morphisms [[M [Γ]]], [[M' [Γ]]] : [[Γ]] → [[σ]]. The structure is said to satisfy the equation-in-context if these morphisms are equal. If Th is an algebraic theory over Sg, then the structure is a Th-algebra in C if it satisfies all the axioms of Th.
There are very many different categories with finite products, and algebras
for an algebraic theory in one may have very different detailed structure
from algebras for the same theory in another category. Nevertheless, the
following proposition shows that whatever the underlying category, we can
still use the familiar kind of equational reasoning embodied by the rules in
Fig. 1 whilst preserving satisfaction of equations.
Theorem 2.4 (Soundness). Let C be a category with finite products and Th an algebraic theory. Then a Th-algebra in C satisfies any equation-in-context which is a theorem of Th.

Proof. The properties of equality of morphisms in C imply that the collection of equations-in-context satisfied by the Th-algebra is closed under rules (2.6), (2.7) and (2.8) in Fig. 1. Closure under rule (2.9) is a consequence of Lemma 2.2.
The converse of this theorem, namely the completeness of the categorical semantics for equational logic, will be a consequence of the material in
Section 4.2.
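For a structure in Set with a finite carrier, satisfaction of an equation-in-context is a decidable check: the two morphisms interpreting the equated terms-in-context are functions on a finite cartesian product, and they are equal exactly when they agree at every tuple. A sketch (the helper name `satisfies` is ours):

```python
from itertools import product as cartesian

def satisfies(carrier, f, g, n):
    """f and g are the two functions [[Gamma]] -> [[sigma]] interpreting the
    equated terms-in-context, each taking n arguments from `carrier`.
    The equation-in-context is satisfied iff f and g agree everywhere."""
    return all(f(*env) == g(*env) for env in cartesian(carrier, repeat=n))

# Example: Z/2 with xor satisfies commutativity x + y = y + x [x, y]:
Z2 = [0, 1]
lhs = lambda x, y: x ^ y
rhs = lambda x, y: y ^ x
```

Here `satisfies(Z2, lhs, rhs, 2)` checks all four environments, while the first and second projections give a simple example of an unsatisfied equation.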
Summary. We conclude this section by summarizing the important features of the categorical semantics of terms and equations in a category with
finite products.
Sorts are interpreted as objects.
A term is only interpreted in a context (an assignment of sorts to
finitely many variables) containing at least the variables mentioned
in the term, and such a term-in-context is interpreted as a morphism
with:
* the codomain of the morphism determined by the sort of the
term;
* the domain of the morphism determined by the context;
* variables interpreted as product projection morphisms (identity
morphisms being a special case of these);
* substitution of terms for variables interpreted via composition
and pairing;
* weakening of contexts interpreted via composition with a product projection morphism.
An equation is only considered in a context (containing at least the
variables mentioned), and such an equation-in-context is satisfied if
the two morphisms interpreting the equated terms-in-context are actually equal in the category.
2.3 Internal languages
in C.
We will not refer to this structure for Sg_C in C by name, but simply say that 'C satisfies M = N : X [Γ]' if the structure satisfies this equation-in-context. The following results are easy exercises in the use of the categorical semantics.
Proposition 2.5.
(i) Two parallel morphisms f, g : X → Y in C are equal if and only if C satisfies f(x) = g(x) : Y [x : X].
(ii) A morphism f : X → X in C is the identity on X if and only if C satisfies f(x) = x : X [x : X].
(iii) A morphism f : X → Z is the composition of g : X → Y and h : Y → Z if and only if C satisfies f(x) = h(g(x)) : Z [x : X].
(iv) An object T is a terminal object in C if and only if there is some morphism t : 1 → T satisfying x = t : T [x : T].
(v) X ← Z → Y is a binary product diagram in C if and only if there
evolving through (discrete) time'. The objects are sets X equipped with a function E : X → ω recording that x ∈ X exists at time E(x) ∈ ω, together with a function (-)+ : X → X describing how elements evolve from one instant to the next, and which is hence required to satisfy E(x+) = E(x) + 1. A morphism between two such objects, f : X → Y, is a function between the sets which preserves existence (E(f(x)) = E(x)) and evolution (f(x+) = f(x)+). The global sections of X are in bijection with the elements which exist at the beginning of time, {x ∈ X | E(x) = 0}. So this category is easily seen not to be well-pointed. One can model the algebraic theory of the previous paragraph in this category by interpreting bool as the set having two distinct elements at time 0 that evolve to a unique element at time 1, and interpreting hegel as the set with no elements at time 0 and a single element thereafter. (This interpretation for hegel is different from the initial object of [ω, Set], which has no elements at any time.)
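A finite fragment of such an object can be sketched as follows. The encoding is hypothetical (only finitely many elements are listed, and the law E(x+) = E(x) + 1 is checked where the evolution map is given):

```python
class EvolvingSet:
    """A finite fragment of an object of the 'sets evolving through time'
    category: E maps each listed element to its birth time, and step maps
    an element to its evolution at the next instant (where listed)."""
    def __init__(self, E, step):
        self.E, self.step = E, step

    def ok(self):
        # the defining law E(x+) = E(x) + 1, checked pointwise
        return all(self.E[self.step[x]] == self.E[x] + 1 for x in self.step)

    def global_sections(self):
        # global elements correspond to elements existing at time 0
        return [x for x in self.E if self.E[x] == 0]

# The interpretation of bool sketched above: two elements at time 0
# evolving to a unique element at time 1 (and onward).
booleans = EvolvingSet(
    E={"tt": 0, "ff": 0, "merged": 1, "merged2": 2},
    step={"tt": "merged", "ff": "merged", "merged": "merged2"})
```

Since `booleans` has two global sections but only one element from time 1 on, it already illustrates the failure of well-pointedness discussed in the text.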
The internal language of a category with just finite products is of rather
limited usefulness, because of the very restricted forms of expression and
judgement in equational logic. However, internal languages over richer logics can be used as powerful tools for proving complicated 'arrow-theoretic'
results by more familiar-looking 'element-theoretic' arguments. See [Bell,
1988], for extended examples of this technique in the context of topos theory
and higher-order predicate logic. The use of categorical semantics in non-well-pointed categories gives us an increased ability to construct useful
models compared with the more traditional 'sets-and-elements' approach.
The work of Reynolds and Oles (see [Oles, 1985]) on the semantics of block
structure using functor categories is an example. (As the example above
shows, functor categories are not well-pointed in general.) See [Mitchell
and Scott, 1989] for a comparison of the categorical and set-theoretic approaches in the case of the simply typed lambda calculus.
3 Categorical datatypes
rules retain the property that there is at most one derivation of a typing judgement M : σ [Γ]. Thus the definition of its meaning as a morphism [[M [Γ]]] : [[Γ]] → [[σ]] in C can be given unambiguously by induction on the structure of this derivation. The same is true for the rules considered in section 5. It is only when we come to consider dependent type theories in section 6 that this property breaks down and a more careful approach to the categorical semantics must be adopted. We will also take care to retain the property of derivable typing judgements, M : σ [Γ], that σ is uniquely determined by the term M and the context Γ. This is the reason
introduction terms of disjoint union, function and list types.
Remark 3.2 (Implicit contexts). In order to make the statement of rules less cluttered, from now on we will adopt the convention that a left-hand part of a context will not be shown if it is common to all the judgements in a rule. Thus for example rule (3.2) below is really an abbreviation for
case
Remark 3.3 (Congruence rules). The introduction and elimination
rules for the various datatype constructors to be given below have associated with them congruence rules expressing the fact that the term-forming
constructions respect equality. For example, the congruence rules corresponding to (3.1) and (3.2) are:
case
We will not bother to give such congruence rules explicitly in what follows.
3.1 Disjoint union types

We will use infix notation and write σ + σ' rather than +(σ, σ') for the disjoint union of two types σ and σ'. (Thus this type constructor has arity TYPES, TYPES → TYPES.) The rules for introducing terms of type σ + σ' are
Introduction. Since terms-in-context M : σ [Γ] get interpreted as morphisms from I = [[Γ]] to X = [[σ]], to interpret the introduction rules (3.1) we need functions on hom-sets of the form

that can be applied to [[M [Γ]]] and [[M' [Γ]]] respectively to give the interpretations of inl(M) and inr(M'). Now in the syntax, substitution commutes with term formation, and since substitution is interpreted by composition in C, it is necessary that the above functions be natural in I. Therefore by the Yoneda lemma (see [Mac Lane, 1971, III.2]), there are morphisms X → X + X' and X' → X + X' which induce the above functions on hom-sets via composition. So associated with the object X + X' we need morphisms

inl_{X,X'} : X → X + X'
Applying this with π2, n ∘ (π1 × id), and n' ∘ (π1 × id) for c, n, and n' respectively, gives

case_I(c, n, n') = case_{I×(X+X')}(π2, n ∘ (π1 × id), n' ∘ (π1 × id)) ∘ ⟨id, c⟩    (3.5)
case(inr_σ(M'), N, N') = N'(M') : τ
Writing m : I → X for [[M [Γ]]], m' : I → X' for [[M' [Γ]]], n : I × X → Y for [[N(x) [Γ, x : σ]]], and n' : I × X' → Y for [[N'(x') [Γ, x' : σ']]], then from above we have that [[case(inl_{σ'}(M), N, N') [Γ]]] is {n|n'}_I ∘ ⟨id, inl_{X,X'} ∘ m⟩ and [[case(inr_σ(M'), N, N') [Γ]]] is {n|n'}_I ∘ ⟨id, inr_{X,X'} ∘ m'⟩. From Lemma 2.2, we also have that [[N(M) [Γ]]] is n ∘ ⟨id, m⟩ and that [[N'(M') [Γ]]] is n' ∘ ⟨id, m'⟩. Consequently the structure on C for interpreting terms associated with disjoint union types described above is sound for the rules (3.4) and (3.5) provided the equations

hold for all m, n and n' (with appropriate domains and codomains). Clearly for this it is sufficient to require for all n and n' that

and the naturality of {-|-}_I ensures that (3.6) and (3.7) are also necessary for the previous two equations to hold for all m, n and n'.
which sends f to ⟨f ∘ (id × inl_{X,X'}), f ∘ (id × inr_{X,X'})⟩. If, furthermore, we require that it provide a two-sided inverse, so that

    {f ∘ (id × inl_{X,X'}) | f ∘ (id × inr_{X,X'})}_I = f

then the categorical semantics becomes sound for a further rule for σ + σ', namely:

    case(C, (x)F(inl(x)), (x')F(inr(x'))) = F(C) : τ      (3.10)

So we are led to require that C have binary coproducts. But this is not all: we can now compose the natural bijection (3.8) with

to deduce that for each I, X, X', composition with the canonical morphism
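The syntax and β-rules just described can be sketched in Python, with tagged pairs standing in for terms of a disjoint union type; this is an illustration only, and the tags and function names are ours, not the chapter's.

```python
# A sketch of disjoint union types: a value of sigma + sigma' carries an
# inl or inr tag, and `case` plays the role of the cotupling {n|n'}.
def inl(m):
    return ("inl", m)

def inr(m):
    return ("inr", m)

def case(c, n, n_prime):
    """case(C, (x)N, (x')N'): apply n to an inl-value, n_prime to an inr-value."""
    tag, value = c
    return n(value) if tag == "inl" else n_prime(value)

# The beta-rules (3.4) and (3.5): case over inl/inr reduces to the branch.
assert case(inl(3), lambda x: x + 1, lambda x: x * 2) == 4   # rule (3.4)
assert case(inr(3), lambda x: x + 1, lambda x: x * 2) == 6   # rule (3.5)
```

The further rule (3.10) corresponds to the fact that every value of the sum is of one of the two tagged forms, mirroring the uniqueness half of the coproduct property.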
3.2 Product types
One could be forgiven for jumping straight to the conclusion that product types σ × σ' should be interpreted by binary categorical products in a category C. However, as we have seen in section 2, the finite product structure of C is there primarily to enable terms involving multiple variables to be interpreted. Also, we will use the elimination rule for products that is derived systematically from its formation and introduction rules, rather than the more familiar formulation in terms of first and second projections. (See [Backhouse et al., 1989] for comments on this issue in general.) Nevertheless, with a full set of equality rules including the analogue of the 'surjective pairing' rule, binary products in C are exactly what is required to model product types. If one does not wish to model the surjective pairing rule, then a weaker structure on C (not determined uniquely up to isomorphism) suffices.
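The contrast between the two eliminators can be sketched as follows; `split` and the derived projections are our illustrative names, not the chapter's notation.

```python
# Product elimination via `split` (the systematically derived eliminator),
# with the familiar projections derived from it rather than taken as primitive.
def split(n, p):
    """split((x, x')N, P): apply the two-variable branch n to the components of p."""
    x, x_prime = p
    return n(x, x_prime)            # the analogue of the beta-rule

def fst(p):
    return split(lambda x, _: x, p)

def snd(p):
    return split(lambda _, y: y, p)

# Surjective pairing: (fst(p), snd(p)) == p for every pair p.
assert (fst((1, 2)), snd((1, 2))) == (1, 2)
```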
Recall that the introduction rule and corresponding elimination rule for
product types are
For the above semantics to be sound for (3.14), one needs that for each I the function

should have a left inverse given by n ↦ split_I(n). This condition does not suffice to determine X × X' uniquely up to isomorphism: there may be
when interpreted, give the first and second projection morphisms as one
might hope:
Remark 3.6 (One-element type). The 'nullary' version of binary product types is the one-element type unit, with the following rules (together with associated congruence rules). As for product types, we use the possibly less familiar elimination rule derived systematically from the form of the formation and introduction rules:
The last rule is the analogue of rule (3.15); in other words, it is the 'η-rule' for this type. The 'β-rule' for unit (analogous to (3.14)) is
but it is easy to see that in fact this rule is derivable from the above rules,
as is the rule
Using an argument similar to that for product types, one finds that these
rules can be interpreted soundly in C provided it has a terminal object.
3.3 Function types
in order to define
Elimination. We will use the familiar elimination rule involving application, rather than the rule systematically derived from formation and introduction, which involves judgements with higher-order contexts (see [Nordström et al., 1990, Section 7.2]):
and we define
Equality.
The semantics is sound for this rule provided that for all m : I → (X→Y)

In fact the naturality of cur implies that this holds if and only if for each X and Y
Now (3.20) and (3.22) say precisely that the natural transformation

given by

is a bijection with inverse cur. Thus, by definition, X→Y is the exponential of Y by X in the category C, with evaluation morphism

Recall that by definition, a cartesian closed category has all finite products and exponentials. Thus we have: function types satisfying β- and η-conversion can be soundly interpreted in C provided it is cartesian closed.
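The exponential adjunction just described is exactly currying. As an illustrative sketch (the function names are ours): `cur` realizes the bijection C(I × X, Y) ≅ C(I, X→Y), and its inverse is obtained by composing with the evaluation morphism.

```python
# Function types as exponentials: currying and evaluation.
def cur(f):
    """Turn f : I x X -> Y into cur(f) : I -> (X -> Y)."""
    return lambda a: (lambda x: f((a, x)))

def app(pair):
    """The evaluation morphism app : (X -> Y) x X -> Y."""
    g, x = pair
    return g(x)

def uncur(g):
    """The inverse of cur, via evaluation: uncur(g) = app o (g x id)."""
    return lambda p: app((g(p[0]), p[1]))

# The beta/eta equations make cur and uncur mutually inverse:
f = lambda p: p[0] + p[1]
assert uncur(cur(f))((2, 3)) == f((2, 3))
```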
3.4 List types

to
For the semantics to be sound for these two rules, one needs that for each
We can go one step further and insist that, given f, listrec_I(f) is the unique morphism satisfying (3.27) and (3.28). In this case the structure on C is also sound for the rule
With (3.29) and in the presence of product types, the scheme for primitive recursion becomes interdefinable with a simpler scheme of iteration,
given by the following rules:
and use (3.31), (3.32) and (3.33) to prove that this has the correct properties. To do this, one also has to prove that
deduce the general case. If C also has binary coproducts, then one can
combine nil_X and cons_X into a single morphism
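The interdefinability of primitive recursion (listrec) with the simpler scheme of iteration, in the presence of products, can be sketched as follows; the encoding and names are our illustrative choices, not the chapter's notation.

```python
from functools import reduce

# Primitive recursion on lists (listrec) encoded by plain iteration: iterate
# on pairs (result so far, portion of the list seen so far).
def listrec(nil_case, cons_case, xs):
    """cons_case(head, tail, result_for_tail): the branch has access to the tail."""
    def step(pair, x):
        acc, rest = pair
        return (cons_case(x, rest, acc), [x] + rest)
    # a right fold, expressed via reduce over the reversed list
    return reduce(step, reversed(xs), (nil_case, []))[0]

# Access to the tail is what distinguishes primitive recursion from iteration:
def tail_or_empty(xs):
    return listrec([], lambda _x, rest, _acc: rest, xs)

assert tail_or_empty([1, 2, 3]) == [2, 3]
```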
3.5 Computation types
Satisfaction of rules (3.36), (3.37), and (3.38) requires the following equational properties of η_X and lift_I to hold (we have used the naturality of lift_I in I to state these in their simplest form):
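These equational properties are the 'Kleisli triple' presentation of a monad. As a hedged sketch, here they are for a simple partiality monad; the choice of monad and the names `eta`/`lift` follow the text's pattern but the encoding is ours.

```python
# Computation types a la Moggi, sketched with partiality: a computation of
# type T(X) is either None ("no value") or a tagged value ("value", x).
def eta(x):
    """The unit: include a value as a trivial computation."""
    return ("value", x)

def lift(f):
    """Extend f : X -> T(Y) to lift(f) : T(X) -> T(Y)."""
    def lifted(c):
        return f(c[1]) if c is not None else None
    return lifted

# The three equations required of eta and lift (the Kleisli-triple laws):
#   lift(f)(eta(x)) == f(x)
#   lift(eta)(c) == c
#   lift(g)(lift(f)(c)) == lift(lambda x: lift(g)(f(x)))(c)
assert lift(eta)(eta(5)) == eta(5)
```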
4 Theories as categories

4.1 Change of category
for all sorts σ and function symbols F : σ1, ..., σn → τ. Since T preserves finite products and the semantics of terms-in-context is defined in terms of these, the meaning of a term in S is mapped by T to its meaning in T(S). More precisely, if M : τ [Γ] then
4.2
Two contexts over the signature Sg are α-equivalent if they differ only in their variables; in other words, the lists of sorts occurring in each context
are equal (and, in particular, the contexts are of equal length). Clearly
this gives an equivalence relation on the collection of contexts, and the set
of a-equivalence classes of contexts is in bijection with the set of lists of
sorts in Sg. Assuming a fixed enumeration Var = {v1, v2, ...} of the set of
variables (i.e. of the set of metavariables of arity TERMS), we can pick a
canonical representative context [v1 : σ1, ..., vn : σn] for the α-equivalence class corresponding to each list σ1, ..., σn of sorts. We will not distinguish notationally between a context and the α-equivalence class it determines.

Given contexts Γ and Γ' = [y1 : τ1, ..., ym : τm], a context morphism from Γ to Γ' is a list γ = [M1, ..., Mm] of terms satisfying Mj : τj [Γ] for j = 1, ..., m. We write

    γ = γ' : Γ → Γ'

to indicate the judgement that γ and γ' are context morphisms from Γ to Γ' that are Th-provably equal: by definition, this means that for each
(2.8) imply that Th-provable equality of context morphisms (between two
given contexts) is an equivalence relation. Moreover, rule (2.9) shows that
changing F up to a-equivalence will not change the class of 7 under this
equivalence relation.
The composition of context morphisms γ : Γ → Γ' and γ' : Γ' → Γ'' is the context morphism γ' ∘ γ : Γ → Γ'' formed by making substitutions:

The fact that γ' ∘ γ does constitute a context morphism from Γ to Γ'' follows from rule (2.4); and rule (2.9) implies that composition respects Th-provable equality:
when

Moreover, the operations of composition possess units: the identity context morphism for Γ is given by the list id_Γ = [x1, ..., xn] of variables in Γ; clearly one has
We now have all the ingredients necessary to define a category. Specifically, the classifying category, Cl(Th), of an algebraic theory Th is defined as follows.

Objects of Cl(Th) are α-equivalence classes of contexts (or, if you prefer, finite lists of sorts) over the signature of Th.

Morphisms of Cl(Th) from one object Γ to another Γ' are equivalence classes of context morphisms for the equivalence relation which identifies γ : Γ → Γ' with γ' : Γ → Γ' just if γ =_Th γ' : Γ → Γ'. Composition of morphisms in Cl(Th) is induced by composition of context morphisms, and identities are equivalence classes of identity context morphisms. We will not distinguish notationally between a context morphism and the morphism of Cl(Th) which it determines.
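The definition of Cl(Th) can be made concrete in a small sketch; the term representation below (variable indices into the source context, plus applied function symbols) is our illustrative choice, not the chapter's notation.

```python
# A toy model of the classifying category: a morphism Gamma -> Gamma' is a
# list of terms over Gamma, one per variable of Gamma', and composition is
# simultaneous substitution.
def subst(gamma, term):
    """Substitute the terms of gamma for the variables occurring in term."""
    if term[0] == "var":
        return gamma[term[1]]
    _, name, args = term
    return ("op", name, [subst(gamma, t) for t in args])

def compose(gamma_prime, gamma):
    """gamma' o gamma: substitute gamma into each component of gamma'."""
    return [subst(gamma, t) for t in gamma_prime]

def identity(n):
    """The identity context morphism is the list of the context's variables."""
    return [("var", i) for i in range(n)]

# The unit laws for composition, on a sample morphism:
g = [("var", 1), ("op", "f", [("var", 0)])]
assert compose(identity(2), g) == g and compose(g, identity(2)) == g
```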
Proposition 4.2. Cl(Th) has finite products.
Proof. Clearly, for each context Γ the empty list of terms is the unique context morphism Γ → [], and so its equivalence class is the unique morphism from Γ to [] in Cl(Th). Thus the (α-equivalence class of the) empty context [] is a terminal object in Cl(Th).

Given contexts Γ = [x1 : σ1, ..., xn : σn] and Γ' = [y1 : τ1, ..., ym : τm], we make use of the given enumeration Var = {v1, v2, ...} of the set of variables to define a context
are not merely isomorphisms (as is always the case), but actually identity
morphisms.
Remark 4.4. There is a close relationship between Cl(Th) and the free Th-algebras in Set generated by finitely many indeterminates. Writing Γ for a context [x1 : σ1, ..., xn : σn], let F_Th(Γ) denote the free algebra generated by finitely many indeterminates x1, ..., xn of sorts σ1, ..., σn respectively. Then F_Th(Γ) can be constructed by taking its underlying set at a sort τ to consist of the set of terms M satisfying M : τ [Γ], quotiented by the equivalence relation which identifies M with M' just if M = M' : τ [Γ] is a theorem of Th. This quotient set is precisely the set of morphisms Γ → [v1 : τ] in Cl(Th). Thus the hom-sets of Cl(Th) can be used to construct the free finitely generated Th-algebras in Set. Conversely, it can be shown that the category Cl(Th) is equivalent to the opposite of the category whose objects are the free, finitely generated Th-algebras in Set, and whose morphisms are all Th-algebra homomorphisms.
Next we describe the 'generic' model of Th in the classifying category.
Each sort a of the underlying signature Sg of the theory Th determines an
object in the classifying category of Th represented by the context [v1 : a].
We will denote this object of Cl(Th) by G[σ]. If F : σ1, ..., σn → τ is a function symbol in Sg, then, since
(i) We have already observed that [v1 : σ1, ..., vn : σn] is the product G[σ1] × ... × G[σn] in Cl(Th); but recall from section 2.2 that this is also the definition of G[[v1 : σ1, ..., vn : σn]].

(ii) This follows by induction on the structure of M using the explicit description of the product structure of Cl(Th) given in the proof of Proposition 4.2.

(iii) By part (ii), G[[M [Γ]]] = G[[M' [Γ]]] holds if and only if [M] and [M'] determine equal morphisms in Cl(Th), which by definition means that M = M' : τ [Γ] is a theorem of Th.
Part (ii) of the above lemma implies in particular that G satisfies the
axioms of Th and hence is a Th-algebra. G will be called the generic
Th-algebra. It enjoys the following universal property.
Theorem 4.6 (Universal property of the generic algebra). For each
category C with finite products, any Th-algebra S in C is equal to S(G) for some finite product preserving functor S : Cl(Th) → C. Moreover, the functor S is uniquely determined up to (unique) natural isomorphism by the algebra S. (S is called the functor classifying the algebra S.)
Proof. Define S on objects Γ by

and on morphisms

Then the fact that S is a functor preserving finite products follows easily from the definition of Cl(Th) in section 4.2. Applying S to G, one has for a sort σ that

and similarly for a function symbol F that

Hence S(G) = S.
Now suppose that T : Cl(Th) → C is another product-preserving functor and that there is an isomorphism, h : T(G) ≅ S, of Th-algebras in C. For each object Γ, since Γ =

one gets isomorphisms
(i) The classifying category of Th is determined uniquely up to equivalence, and the generic Th-algebra uniquely up to isomorphism, by the universal property in Theorem 4.6.

(ii) The operation of evaluating a finite product preserving functor from Cl(Th) to C at the generic algebra G is the object part of an equivalence of categories:

where FP(Cl(Th), C) is the category of finite product preserving functors and natural transformations from Cl(Th) to C, and Th-ALG(C) is the category of Th-algebras and homomorphisms in C.
Proof. Part (i) is a routine consequence of the universal property, but part
(ii) deserves further comment.
The statement of Theorem 4.6 says that the functor T ↦ T(G) is essentially surjective, and full and faithful for isomorphisms; hence it gives an equivalence between the category of functors and natural isomorphisms and the category of algebras and algebra isomorphisms. Since this is true for any C with finite products, we can replace C by its arrow category (whose objects are the morphisms of C and whose morphisms are commutative squares in C), which certainly has finite products when C does. The equivalence for objects and isomorphisms in this case implies that the original functor T ↦ T(G) is full and faithful, and hence that (4.1) holds.
∎
4.3 Theory-category correspondence
Let Sg_C be the signature defined from a category C with finite products, as in section 2.3. As we noted in that section, there is a canonical structure for Sg_C in C. Define Th_C to be the algebraic theory over Sg_C whose axioms are all equations-in-context which are satisfied by this structure. Then the structure is automatically an algebra for this theory, and hence by Theorem 4.6 corresponds to a finite product preserving functor T : Cl(Th_C) → C. The definition of Sg_C (which names the various objects and morphisms in C) and Th_C (which identifies terms which name the same things in C) entails that T is full, faithful, and essentially surjective, and
products and hence an algebraic theory. This category is a paradigmatic example of the notion of 'iteration theory' introduced by Elgot: see the book by Bloom and Ésik [1993]. (A reader who takes this advice should be warned that [Bloom and Ésik, 1993] adopts a not
uncommon viewpoint that algebraic theories can be identified with
categories with finite coproducts. Since the 2-category of categories
with finite coproducts and functors preserving such is equivalent (under the 2-functor taking a category to its opposite category) to the
2-category of categories with finite products, this viewpoint is formally equivalent to the one presented here. However, it does not sit
well with the intuitively appealing picture we have built up of the
sorts of a theory (objects of a category) as generalized sets and the
terms-in-context (morphisms) as generalized functions.)
(ii) If T : C → D is a morphism in Fp between small categories, then, for any category in Fp with small colimits, it can be shown that the functor

induced by composition with T has a left adjoint, given by left Kan extension along T. Since T corresponds
to a translation between algebraic theories and T* is the functor
restricting algebras along T, this left adjoint provides 'relatively free'
constructions on algebras for a wide variety of situations (depending
upon the nature of the translation).
(iii) Free constructions (indeed weighted colimits in general) in Fp can
be constructed via appropriate syntactic constructions on algebraic
theories and translations between them.
4.4
In this section we will examine the effect on the classifying category Cl( Th)
of a theory Th when we enrich equational logic with the various datatype
constructions considered in section 3, together with their associated introduction, elimination, and equality rules. We will look at product, disjoint
union, and function types. In each case the classifying category turns out
to possess the categorical structure that was used in Section 3 to interpret these datatype constructs. (Similar considerations apply to the other
type-forming operations considered in that section, namely list types and
'computation' types.)
Product types. (Cf. Section 3.2.) In this case, for each pair of types σ and σ' there is a binary product diagram in the classifying category Cl(Th) of the form

Similarly, in the presence of a one-element type unit (cf. Remark 3.6), [z : unit] is a terminal object in Cl(Th). It follows that in the presence of
(where x, x', z are chosen to be distinct from x1, ..., xn). In the case where Γ is empty, we have that [z : σ + σ'] is the binary coproduct of [x : σ] and [x' : σ'] in Cl(Th). Given the description of products in the category Cl(Th) in Proposition 4.2, it also follows that product with an arbitrary object [Γ] distributes over this coproduct, i.e. it is a stable coproduct (cf. 3.1). Similarly, in the presence of an empty type null (cf. Remark 3.5), [z : null] is a stable initial object in Cl(Th).
In the presence of product and one-element types we have seen that every object is isomorphic to one of the form [x : σ]. In the presence of disjoint union and empty types as well, we can conclude that Cl(Th) has all stable finite coproducts.
Function types. (Cf. Section 3.3.) In this case, for each pair of types σ and σ', the object [f : σ→σ'] is the exponential

of Cl(Th), with associated evaluation morphism

Unlike the case for disjoint union types, it is not necessary to assume the presence of product types in order to conclude that the classifying category possesses exponentials for any pair of objects. For, given any objects, the exponential is given by the object

Thus in the presence of function types, the classifying category Cl(Th) is a cartesian closed category, whether or not we assume the theory Th involves product types.
Remark 4.9 (The generic model of a theory). Suppose we consider
equational theories Th in the equational logic of product, disjoint union,
and function types. We have seen that Cl(Th) is a cartesian closed category with finite coproducts. (Recall that the stability of coproducts is
automatic in the presence of exponentials.) Such a category is sometimes
called bicartesian closed. We can define a structure G in Cl(Th) for the underlying signature of Th just as in Section 4.2, satisfying Lemma 4.5. Thus the structure G in Cl(Th) is a model of Th (i.e. satisfies the axioms
of Th) and indeed satisfies exactly the equations that are theorems of Th.
An immediate corollary of this property is the completeness of the categorical semantics: an equation is derivable from the axioms of Th using the
equational logic of product, disjoint union, and function types if (and only
if) it is satisfied by all models of Th in bicartesian closed categories.
Indeed, Theorem 4.6 extends to yield a universal property characterizing G as the generic model of Th in bicartesian closed categories: given any other model S of Th in a bicartesian closed category C, the functor S : Cl(Th) → C defined in the proof of Theorem 4.6 is a morphism of bicartesian closed categories, i.e. it preserves finite products, finite coproducts, and exponentials. It also maps G to S and is unique up to unique isomorphism with these properties.
5 Predicate logic

5.1
with one such rule for each relation symbol R. Later we will consider compound formulas built up from the atomic ones using various proposition-forming operations. The logical properties of these operations will be specified in terms of sequents of the form

where Φ is a finite list of formulas, ψ is a formula, and the judgements φ_i prop [Γ] and ψ prop [Γ] are derivable. The intended meaning of the sequent (5.2) is that the joint validity of all the formulas in Φ logically entails the validity of the formula ψ.
Figure 2 gives the basic, structural rules for deriving sequents. Since we
wish to consider predicate logic as an extension of the rules for equational
logic given in section 2.1, Fig. 2 includes a rule (Subst) permitting the
substitution in a sequent of a term for a provably equal one. The rule
(Weaken) allows for weakening of contexts. The other possible form of
weakening is to add formulas to the left-hand side of a sequent:
This is derivable from the rules in Fig. 2 because of the form of rule (Id).
Note that in stating these rules we continue to use the convention established in Remark 3.2 that the left-hand part of a context is not shown if it
is common to all the judgements in a rule. Thus for example, the full form
of rule (Subst) is really
5.2 Hyperdoctrines
Thus a prop-category is nothing other than a category with finite products, C, together with a contravariant functor Prop_C(−) from C to the category of posets and monotone functions. Prop_C(−) is a particular instance of a Lawvere 'hyperdoctrine', which in general is category- rather than poset-valued (and pseudofunctorial rather than functorial). The reader should be warned that the term 'prop-category' is not standard; indeed there is no standard terminology for the various kinds of hyperdoctrine which have been used in categorical logic.
Here are some important examples of prop-categories from the worlds
of set theory, domain theory, and recursion theory.
Example 5.2 (Subsets). The category Set of sets and functions supports the structure of a prop-category in an obvious way by taking the Set-properties of a set X to be the ordinary subsets of X. Given a function f : Y → X, the operation f* is that of inverse image, taking a subset A ⊆ X to the subset {y | f(y) ∈ A} ⊆ Y.
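Example 5.2 can be sketched directly, with predicates standing in for subsets; this is an illustration only, and the names are ours.

```python
# The prop-category Set, with predicates representing subsets: Prop(X) is
# "predicates on X" ordered by pointwise implication, and f* is inverse
# image, which on predicates is just precomposition with f.
def pullback(f, a):
    """f*(A) = {y | f(y) in A}: y satisfies f*(A) iff f(y) satisfies A."""
    return lambda y: a(f(y))

# Contravariant functoriality holds definitionally:
#   pullback(identity, A) behaves as A, and
#   pullback of a composite is the composite of pullbacks in reverse order.
is_even = lambda n: n % 2 == 0
assert pullback(lambda n: n + 1, is_even)(3) == is_even(4)
```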
Example 5.3 (Inclusive subsets of cpos). Let Cpo denote the category whose objects are posets possessing joins of all ω-chains and whose morphisms are the ω-continuous functions (i.e. monotone functions preserving joins of ω-chains). This category has products, created by the forgetful functor to Set. We make it into a prop-category by taking Prop_Cpo(X) to consist of all inclusive subsets of X, i.e. those that are closed under joins of ω-chains in X. The partial order on Prop_Cpo(X) is inclusion of subsets. Given f : Y → X, since f is ω-continuous, the inverse image of an inclusive subset of X along f is an inclusive subset of Y; this defines the operation f* : Prop_Cpo(X) → Prop_Cpo(Y).
Example 5.4 (Realizability). The prop-category Kl has as underlying category the category of sets and functions. For each set X, the poset Prop_Kl(X) is defined as follows. Let X → P(N) denote the set of functions from X to the power set of the set of natural numbers. Let ≤ denote the binary relation on this set defined by: p ≤ q if and only if there is a partial recursive function φ : N ⇀ N such that for all x ∈ X and n ∈ N, if n ∈ p(x) then φ is defined at n and φ(n) ∈ q(x). Since partial recursive functions are closed under composition and contain the identity function, ≤ is transitive and reflexive, i.e. it is a pre-order. Then Prop_Kl(X) is the quotient of X → P(N) by the equivalence relation generated by ≤; the partial order between equivalence classes [p] is that induced by ≤. Given a function f : Y → X, the operation f* : Prop_Kl(X) → Prop_Kl(Y) sends [p] to [p ∘ f]; it is easily seen to be well-defined, monotonic and functorial.
Example 5.5 (Subobjects). An important class of examples of prop-categories is provided by using the category-theoretic notion of a subobject of an object. Recall that a morphism f : Y → X in a category C is a
5.3 Satisfaction
Suppose that Φ ⊢ ψ [Γ] is a sequent over a given signature Sg. Given a structure for Sg in a prop-category C, what should it mean for the structure to satisfy the sequent? Since φ_i prop [Γ] and ψ prop [Γ] are required to hold, we get C-properties [[φ_i [Γ]]] and [[ψ [Γ]]] of the object [[Γ]]. In the case n = 1, when the sequent contains a single antecedent formula, it seems natural to define satisfaction of the sequent to mean that [[φ1 [Γ]]] ≤ [[ψ [Γ]]] holds in the poset Prop_C([[Γ]]). For the general case, let us suppose that each poset Prop_C(X) comes equipped with a distinguished element ⊤_X and a binary operation A, A' ↦ A ∧ A'. Then we can define the sequent to be satisfied if

where
the fact that ≤ is a partial order, it follows that ⊤_X ∧ A = A. Using this, the soundness of (Exchange) amounts to requiring A ∧ B = B ∧ A. But then we have both A ∧ B ≤ B and A ∧ B = B ∧ A ≤ A; so A ∧ B is a lower bound for A and B. In particular, for all A, A = ⊤_X ∧ A ≤ ⊤_X, so that ⊤_X is the greatest element of Prop_C(X). In fact A ∧ B has to be the greatest lower bound of A and B: for the soundness of (Contract) requires C ≤ C ∧ C, for all C; and the general form of (Cut) (when Φ' is non-empty) requires that, for all A, A', B, C ∈ Prop_C(X), if A ≤ B and A' ∧ B ≤ C then A ∧ A' ≤ C. The latter property implies that ∧ is monotone, and hence C ≤ A and C ≤ B imply C ≤ C ∧ C ≤ A ∧ B. So all in all, we need each poset Prop_C(X) to have finite meets. But that is not all: in view of Lemma 5.6, for the soundness of rules (Weaken) and (Subst) we will need that these finite meets be preserved by the pullback operations f* in C. Therefore we are led to the following definition.
Definition 5.7. A prop-category C has finite meets if for each object X in C the poset Prop_C(X) possesses all finite meets and these are preserved by the pullback operations f*. Thus for each X there is ⊤_X ∈ Prop_C(X) satisfying

where the left-hand side indicates the (finite) meet of the elements
Extending Theorem 2.4, we have:
Theorem 5.8 (Soundness). Let C be a prop-category that has finite
meets and let Th be a theory in the sense of section 5.1. Then any structure
in C for the underlying signature of Th that satisfies the axioms of Th also
satisfies its theorems.
Examples 5.9. The prop-categories of Examples 5.2-5.5 all have finite meets. Meets in Set (or Cpo) are given by set-theoretic intersection of subsets (or inclusive subsets); these are clearly preserved by the pullback operations, since these are given by taking inverse images of subsets along functions (or ω-continuous functions).

Finite meets in Kl can be calculated as follows. Choose some recursive bijection pr : N × N → N and define a binary operation on subsets A, B ⊆
N by
(5.5)
Negation, ¬φ, will be treated as an abbreviation of φ ⇒ false; truth, true, will be treated as an abbreviation of ¬false; bi-implication, φ ⇔ ψ, will be treated as an abbreviation of (φ ⇒ ψ) ∧ (ψ ⇒ φ).
Given a structure for Sg in a prop-category C with finite meets, finite joins, and Heyting implications, we can interpret formulas-in-context φ [Γ] as C-properties, by induction on the derivation of φ prop [Γ]. Atomic formulas are interpreted as in section 5.3, and compound formulas as follows:
v
In this way the notion given in section 5.3 of satisfaction of a sequent by a
structure applies to sequents involving the propositional connectives.
Theorem 5.11 (Soundness). Given a structure in a prop-category C with finite meets, finite joins, and Heyting implications, the collection of sequents that are satisfied by the structure (cf. 5.3) is closed under the usual introduction and elimination rules for the propositional connectives in Gentzen's natural deduction formulation of intuitionistic sequent calculus, set out in Fig. 3.
Recall that a poset P can be regarded as a category whose objects are the elements of P and whose morphisms are instances of the order relation. From this point of view, meets, joins, and Heyting implications are all instances of adjoint functors. The operation of taking the meet (or the join) of n elements is right (or left) adjoint to the diagonal functor P → P^n; and given A ∈ P, the Heyting implication operation A ⇒ (−) is right adjoint to (−) ∧ A : P → P. Figure 4 gives an alternative formulation of the rules for the intuitionistic propositional connectives reflecting this adjoint formulation. The rules take the form
plications. Meets (even infinite ones) are given by set-theoretic intersection. Finite joins are given by set-theoretic union. The Heyting implication A ⇒ B of inclusive subsets A, B ∈ Prop_Cpo(X) is given by taking the intersection of all inclusive subsets of X that contain {x | x ∉ A or x ∈ B}.
The operation f* of taking the inverse image of an inclusive subset along an ω-continuous function f : Y → X preserves meets and finite joins, but it does not preserve Heyting implications. (For example, take X to be the successor ordinal ω+, Y to be the discrete ω-cpo with the same set of elements, and f to be the identity function; when A = {ω} and B = ∅, one has

Thus Cpo supports the interpretation of conjunction and disjunction, but not implication.
Example 5.15. The prop-category Kl of Example 5.4 has finite meets, finite joins, and Heyting implications. To see this, one needs to consider numerical codes for partial recursive functions. Let n · x denote the result, if defined, of applying the nth partial recursive function (in some standard enumeration) to x. The notation {n}(x) is traditional for n · x. We will often write nx for n · x; a multiple application (n · x) · y will be written nxy, using the convention that application associates to the left.

So for each partial recursive function φ : N ⇀ N there is some n ∈ N such that nx ≃ φ(x) for all x ∈ N. Here 'e ≃ e'' means that e is defined if and only if e' is, in which case e = e'; and 'e↓' means the expression e is defined. The key requirement of this enumeration of partial recursive functions is that the partial binary operation n, m ↦ n · m makes N into a 'partial combinatory algebra', i.e. there should be K, S ∈ N satisfying, for all x, y, z ∈ N, that Kxy = x, (Sxy)↓, and Sxyz ≃ xz(yz).
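The combinatory identities required of K and S can be sketched in the total setting of Python functions, rather than the numerical codes the text requires; an illustration only, with names ours.

```python
# The K and S combinators of a (here total) combinatory algebra:
def K(x):
    return lambda y: x                          # K x y = x

def S(x):
    return lambda y: (lambda z: x(z)(y(z)))    # S x y z = x z (y z)

# A classic consequence: S K K behaves as the identity,
# since S K K z = K z (K z) = z.
assert S(K)(K)(5) == 5
```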
5.5 Quantification
The rules for forming quantified formulas are
The usual natural deduction rules for introducing and eliminating quantifiers are given in Fig. 5. Modulo the structural rules of Fig. 2, these natural deduction rules are interderivable with an 'adjoint' formulation given by the bidirectional rules of Fig. 6. It should be noted that some side conditions are implicit in these rules because of the well-formedness conditions mentioned after (5.2) that are part of the definition of a sequent. Thus x does not occur free in Φ in (∀-Intro) or (∀-Adj); and it does not occur free in ψ in (∃-Elim) or (∃-Adj).
What structure in a prop-category C with finite meets is needed to
soundly interpret quantifiers? Clearly we need functions
provides
for all A, C ∈ Prop_C(I) and B ∈ Prop_C(I × X). We can split this requirement into two: first that

provides a left adjoint to

and second a 'stability' condition for the left adjoint with respect to meets:
These adjoints are easily seen to be natural in I, and since Set has Heyting
defined by
Some calculations with partial recursive functions show that these formulas do indeed yield the required adjoints, and that they are natural in I. Since we noted in Example 5.15 that Kl has Heyting implications, the Frobenius reciprocity condition (5.10) holds automatically.
Example 5.21. The prop-category Cpo of Example 5.3 possesses natural right adjoints to pulling back along projections: they are given just as for Set by the formula (5.11), since this is an inclusive subset when A is. Cpo also possesses left adjoints to pulling back along projections: ∃_{I,X}(A) is given by the smallest inclusive subset of I containing {i ∈ I | ∃x ∈ X. (i, x) ∈ A} (i.e. by the intersection of all inclusive subsets containing that set). However, these left adjoints do not satisfy the Beck-Chevalley condition (5.7). For if this condition did hold, for any i ∈ I we could apply it with I' a one-element ω-cpo and f : I' → I the function mapping the unique element of I' to i, to conclude that i ∈ ∃_{I,X}(A) if and only if (i, x) ∈ A for some x ∈ X. In other words, the set in (5.11) would already be inclusive when A is inclusive. But this is by no means the case in general. (For example, consider when I is the successor ordinal ω+ and X is the discrete ω-cpo with the same set of elements. Then A = {(m, n) | m < n} is an inclusive subset of I × X, but the set in (5.11) is ω, which is not an inclusive subset of ω+.)

Thus the prop-category Cpo has universal quantification, but not existential quantification.
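For the prop-category Set, by contrast, both quantifiers exist, and the adjoint description can be sketched concretely when X is enumerated by a finite list; the names below are ours, for illustration.

```python
# Quantification along a projection I x X -> I in Set, with X finite:
# exists/forall are the left/right adjoints to pulling back along the projection.
def exists_x(xs, b):
    """(exists x)B = {i | some x in X has (i, x) in B}."""
    return lambda i: any(b((i, x)) for x in xs)

def forall_x(xs, b):
    """(forall x)B = {i | every x in X has (i, x) in B}."""
    return lambda i: all(b((i, x)) for x in xs)

def pullback_proj(a):
    """Pullback of A along the projection: (i, x) satisfies it iff i is in A."""
    return lambda pair: a(pair[0])

# Adjointness, pointwise: exists_x(xs, b) <= a  iff  b <= pullback_proj(a),
# and a <= forall_x(xs, b)  iff  pullback_proj(a) <= b.
assert exists_x([1, 2, 3], lambda p: p[0] == p[1])(2) == True
```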
5.6 Equality
We have seen that the categorical semantics of the propositional connectives and quantifiers provides a characterization of these logical operations in terms of various categorical adjunctions. In this section we show that, within the context of first-order logic, the same is true of equality predicates. (As with so much of categorical logic, this observation originated with Lawvere [1970].) The formation rule for equality formulas is

Natural deduction rules for introducing and eliminating such formulas are given in Fig. 7 (cf. [Nordström et al., 1990, Section 8.1]). The usual properties of equality, such as reflexivity, symmetry, and transitivity, can be derived from these rules. It is not hard to see that, modulo the structural rules of Fig. 2, the natural deduction rules are interderivable with an 'adjoint' formulation given by the bidirectional rule of Fig. 8.
Suppose C is a prop-category with finite meets. To interpret equality
formulas whilst respecting the structural rules (Weaken) and (Subst) of
Fig. 2, for each C-object X we need a C-property
Andrew M. Pitts
Given the definition of satisfaction of sequents in section 5.3, for the soundness of (=-Adj) we need
for all B ∈ Prop_C(X × X). Note that the C-properties Eq_X are uniquely determined by this requirement.
Definition 5.22. Let C be a prop-category with finite meets. We say that C has equality if, for each C-object X, the value Eq_X of the left adjoint to Δ_X* (pulling back along the diagonal) at the top element ⊤_X ∈ Prop_C(X) exists and satisfies (5.12).
Thus we have shown that if C is a prop-category with finite meets and
equality, then the sequents that are satisfied by a structure in C are closed
under the equality rules in Fig. 7.
Remark 5.23. In fact, when C also has Heyting implications and universal quantification, property (5.12) is automatic if Eq_X exists satisfying (5.13). This observation corresponds to the fact that, in the presence of implication and universal quantification, the rule (=-Adj) is interderivable with a simpler rule without 'parameters':
Example 5.24. The prop-category Set of Example 5.2 has equality. Since we have seen in previous sections that Set has implication and universal quantification, by Remark 5.23 it suffices to establish the existence of the left adjoint to Δ_X* at ⊤_X. But for this, clearly we can take Eq_X ⊆ X × X to be the identity relation {(x, x) | x ∈ X}.
Similarly for the prop-category Kl of Example 5.4: since it has implication and universal quantification, to see that it also has equality we just have to verify (5.13). This can be done by taking Eq_X to be the Kl-property represented by the function δ_X : X × X → P(ℕ) defined by
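The adjointness of Definition 5.22 can be checked exhaustively for Set on a small finite carrier. The following sketch (our own illustration, not part of the chapter's formal development) enumerates every B ⊆ X × X and confirms that the identity relation Eq_X lies below B exactly when ⊤_X lies below the pullback of B along the diagonal:

```python
from itertools import chain, combinations

X = [0, 1, 2]
pairs = [(x, y) for x in X for y in X]

def subsets(s):
    # all subsets of a finite iterable
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

eq = {(x, x) for x in X}   # Eq_X: the identity relation on X
top = set(X)               # the top element of Prop(X)

def diag_pullback(B):
    # pulling back B in Prop(X x X) along the diagonal X -> X x X
    return {x for x in X if (x, x) in B}

# the adjointness characterizing equality: Eq_X <= B  iff  top <= diag_pullback(B)
assert all((eq <= B) == (top <= diag_pullback(B)) for B in map(set, subsets(pairs)))
```

Since both sides of the equivalence are decidable subset tests, the 512 candidate properties B are checked in a fraction of a second.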
Remark 5.27 (Generalized quantifiers). Suppose that C is a prop-category with finite meets, Heyting implications, and both universal and existential quantification. Then not only do the adjoints
but in fact for any C-morphism f : X → Y, the monotone function f* : Prop_C(Y) → Prop_C(X) has both left and right adjoints, which will be denoted ∨_f and ∧_f respectively. Indeed, for A ∈ Prop_C(X) we can define
It is easier to see what these expressions mean, and to prove that they have the required adjointness properties, if we work in a suitable internal language for the prop-category C. The signature of such an internal language is like that discussed in section 2.3, but augmented with relation symbols R ⊆ X₁ × ⋯ × Xₙ for each C-property R ∈ Prop_C(X₁ × ⋯ × Xₙ) (for each tuple of C-objects X₁, …, Xₙ). Using the obvious structure in C for this signature, one can describe C-properties using the interpretation of formulas over the signature; and relations between such C-properties can be established by proving sequents in the predicate calculus and then appealing to the soundness results we have established.
From this perspective, ∨_f and ∧_f are the interpretations of formulas that are generalized quantifiers, adjoint to substitution:
can be deduced from the fact that the following bidirectional rules are derivable from the natural deduction rules for &, ⇒, ∃, ∀, =:
So the Beck-Chevalley condition amounts to requiring the other inequality: g*∨_k ≤ ∨_{f′}k′*. Dually, g*∧_k ≥ ∧_{f′}k′* holds automatically, and the right adjoints are said to satisfy a Beck-Chevalley condition for the above square if the reverse inequality holds, so that g*∧_k = ∧_{f′}k′*. With this terminology, it is the case that in a prop-category with finite meets, Heyting implications, and quantification, the left and right adjoints to the pullback operations satisfy the Beck-Chevalley condition for certain commutative squares which (by virtue of the finite products in C) are pullback squares. These are the squares of the form
5.7
Completeness
the categorical semantics implies that every theorem of the theory is also satisfied by such a model. Conversely, the categorical semantics is complete, in the sense that a judgement over the signature of a given theory is a theorem if it is satisfied by all models of the theory in prop-categories with finite meets, finite joins, Heyting implications, quantification, and equality. This completeness is an easy corollary of the stronger result that, given Th, there is a 'classifying' prop-category Cl(Th) containing a 'generic' model G, which is, in particular, a structure that satisfies a sequent (or an equation) if and only if it is a theorem of Th. This classifying prop-category can be constructed via an extension of the 'term-model' construction discussed in Section 4. The underlying category of the prop-category Cl(Th) is constructed just as in section 4.2: its objects are α-equivalence classes of contexts Γ, and its morphisms are equivalence classes of context morphisms under provable equality in Th.
Then for each object Γ we define the poset of Cl(Th)-properties of Γ as follows. Its underlying set is the quotient
where the equivalence relation φ ∼_Th φ′ holds if and only if both φ ⊢ φ′ [Γ] and φ′ ⊢ φ [Γ] are theorems of Th. The partial order on Prop_Cl(Th)(Γ) is that induced by Th-provable entailment: A ≤ A′ holds if and only if, for some (indeed, any) formulas φ and φ′ representing the equivalence classes A and A′ respectively,
is a theorem of Th.
To complete the definition of the prop-category structure (cf. Definition 5.1), we have to define the action of pulling back a Cl(Th)-property along a morphism. This is induced by the operation of substituting terms for variables in formulas. Given A ∈ Prop_Cl(Th)(Γ) and γ : Γ′ → Γ with
Proof. The propositional operations are induced by the corresponding logical connectives. Thus for any object Γ and Cl(Th)-properties A = [φ] and A′ = [φ′] of Γ, we have:
6
Dependent types
6.1
Syntactic considerations
var(σ_i) ⊆ {x_1, . . . , x_{i−1}}
where var(e) denotes the (finite) set of variables occurring in the expression e. Thus the variables x_i are distinct, and each type σ_i only involves variables which have already been listed in the context. Notational conventions for contexts will be as in section 2.1; in particular, [] denotes the empty context.
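The side-condition var(σ_i) ⊆ {x_1, . . . , x_{i−1}} is easy to mechanize. In the sketch below (an illustration in our own ad hoc notation, not the chapter's), a context is a list of (variable, type-expression) pairs, a type expression is either a string variable or an operator tuple, and well-formedness checks that each type mentions only earlier variables:

```python
def var(e):
    # the (finite) set of variables occurring in an expression:
    # a string is a variable; a tuple is an operator applied to subexpressions
    if isinstance(e, str):
        return {e}
    _, *args = e
    return set().union(set(), *[var(a) for a in args])

def wf_context(ctx):
    # the condition above: distinct variables, and each type only
    # mentions variables already listed in the context
    seen = set()
    for x, ty in ctx:
        if x in seen or not var(ty) <= seen:
            return False
        seen.add(x)
    return True

# e.g. the context [x : obj, y : obj, f : hom(x, y)] from the theory of categories
assert wf_context([("x", ("obj",)), ("y", ("obj",)), ("f", ("hom", "x", "y"))])
assert not wf_context([("x", ("obj",)), ("f", ("hom", "x", "y"))])  # y not yet listed
```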
The variant of dependently typed equational logic that we are going to describe contains rules for deriving a number of different forms of judgement, set out in Table 1. The 'secondary' judgement forms are so called because they are in fact expressible in terms of the primary forms, modulo the rules for dependently typed equational logic given below (cf. Remark 6.4).
Definition 6.2. A dependently typed algebraic theory, Th, over the signature Sg is specified by the following data.
For each type-valued function symbol s, a judgement
s(x) type [Γ_s]
called the introductory axiom of s. Here x is the list of variables of the context Γ_s, and must have length n when s has arity TERMSⁿ → TYPES. Γ_s lists the types of the arguments to which s may be applied.
For each term-valued function symbol F, a judgement
called the introductory axiom of F. Once again, x is the list of variables of the context Γ_F, and must have length n when F has arity TERMSⁿ → TERMS. Γ_F lists the types of the arguments to which F may be applied, and σ_F gives the type of the result (which may depend upon those arguments).
A collection of judgements of the form M = M′ : σ [Γ], called the term-equality axioms of Th.
A collection of judgements of the form σ = σ′ [Γ], called the type-equality axioms of Th.
Given such a theory, the theorems of Th are the judgements which are
provable using the rules shown in Figs 9 and 10. In the rules, x is the
Judgement           Intended meaning                                     Restriction

Primary forms
σ type [Γ]          'σ is a type in context Γ'                           var(σ) ⊆ var(Γ)
M : σ [Γ]           'M is a term of type σ in context Γ'                 var(M, σ) ⊆ var(Γ)
σ = σ′ [Γ]          'σ and σ′ are equal types in context Γ'              var(σ, σ′) ⊆ var(Γ)
M = M′ : σ [Γ]      'M and M′ are equal terms of type σ in context Γ'    var(M, M′, σ) ⊆ var(Γ)

Secondary forms
Γ ctxt              'Γ is a well-formed context'
γ : Γ → Γ′          'γ is a context morphism from Γ to Γ′'               var(γ) ⊆ var(Γ)
Γ = Γ′              'Γ and Γ′ are equal contexts'
γ = γ′ : Γ → Γ′     'γ and γ′ are equal context morphisms from Γ to Γ′'
(Figs 9 and 10, not reproduced here, give the rules for contexts and context morphisms, the introductory axioms s(x) type [Γ_s] and F(x) : σ_F [Γ_F], and the equality judgements such as M = M′ : σ [Γ].)
Example 6.5. The theory of categories provides an example of a well-formed, dependently typed algebraic theory. The underlying signature is:

obj  : TYPES
hom  : TERMS² → TYPES
Id   : TERMS → TERMS
Comp : TERMS⁵ → TERMS

where
It is evident from this example that the formal requirement in the introductory axiom of a function symbol (such as that for Comp) that all variables in the context occur explicitly as arguments of the function is at variance with informal practice. See [Cartmell, 1986, Section 10] for a discussion of this issue.
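To see the dependency in the introductory axiom of Comp concretely, here is a small sketch (our own illustration; the data are a hypothetical three-object poset category, not anything from the chapter) that checks an application Comp(x, y, z, g, f) against explicit hom-sets: f must inhabit hom(x, y) and g must inhabit hom(y, z), and the result is then a term whose type hom(x, z) depends on the earlier arguments:

```python
# a poset viewed as a category: objects 0, 1, 2; a unique arrow i -> j iff i <= j
objs = [0, 1, 2]
hom = {(i, j): [(i, j)] if i <= j else [] for i in objs for j in objs}

def comp_type(x, y, z, g, f):
    # introductory axiom of Comp: from f : hom(x, y) and g : hom(y, z),
    # the term Comp(x, y, z, g, f) has type hom(x, z)
    if f not in hom[(x, y)] or g not in hom[(y, z)]:
        raise TypeError("ill-typed application of Comp")
    return ("hom", x, z)

assert comp_type(0, 1, 2, (1, 2), (0, 1)) == ("hom", 0, 2)
```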
Remark 6.6 (Substitution and Weakening). A general rule for substituting along a context morphism is derivable from the rules in Figs 9 and 10, viz.
where J is one of the four forms 'e type', 'e = e′', 'e : e′', or 'e = e′ : e″', and y is the list of variables in Γ′. A special case of this is a general derived rule for weakening contexts:
The rules for substitution in Fig. 9 also have as special cases forms which correspond more closely to the substitution rule (2.9) for simply typed equational logic, viz:
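The substitution operation underlying these derived rules is just simultaneous replacement of variables by the terms of a context morphism. A sketch in the same ad hoc tuple notation for expressions (ours, not the chapter's):

```python
def subst(e, gamma):
    # simultaneous substitution along gamma, a map from variables to terms;
    # a string is a variable, a tuple is an operator applied to subexpressions
    if isinstance(e, str):
        return gamma.get(e, e)
    op, *args = e
    return (op,) + tuple(subst(a, gamma) for a in args)

# substituting the terms (u, u) for the variables (x, y) in the type hom(x, y)
assert subst(("hom", "x", "y"), {"x": "u", "y": "u"}) == ("hom", "u", "u")
```

Because the replacement is driven by a single mapping, it is simultaneous rather than sequential, which is what the derived rule for context morphisms requires.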
6.2
category from the first object to the second is the quotient of the set of Th-context morphisms Γ → Γ′ by the equivalence relation of being Th-provably equal. This definition makes sense because the following rules are derivable from those in Figs 9 and 10:
We will tend not to distinguish notationally between a Th-context morphism and the morphism of Cl(Th) which it determines.
Composition in Cl(Th).
The derived rules for substitution mentioned in Remark 6.6 can be used to
show that the following derived rules for composition are valid.
6.3
Type-categories
For a simply typed algebraic theory, we saw in section 4 that the relevant categorical structure on the corresponding classifying category was finite products. Here it will turn out to be the more general³ property of possessing a terminal object and some pullbacks. To explain which pullbacks, we identify a special class of morphisms in the classifying category of Th.
Definition 6.7. If Γ is a Th-context, the collection of Γ-indexed types in Cl(Th) is defined to be the quotient of the set of types σ such that σ type [Γ] is a theorem of Th, with respect to the equivalence relation identifying σ and σ′ just if σ = σ′ [Γ] is a theorem of Th. As usual, we will not make a notational distinction between σ and the Γ-indexed type it determines. Each such Γ-indexed type has associated with it a projection morphism, represented by the context morphism
Here x₁, …, xₙ are the variables listed in Γ, and x is any other variable. Note that the object in Cl(Th) represented by [Γ, x : σ] and the morphism representing the projection are independent of which particular variable x is chosen.
Lemma 6.8. Given a morphism γ : Γ′ → Γ in Cl(Th) and a Γ-indexed type represented by σ, then
³Since finite products can be constructed from pullbacks and a terminal object.
morphism
satisfying
and
Now since
, we must have that the list of terms γ″ is of the form
for some term N for which
is a theorem of Th. Now since
(where x′ is the list of variables in Γ′), we get a morphism
satisfying
and
as required. If δ′ : Γ″ → [Γ′, x′ : σ[γ/x]] were any other such morphism, then from the requirement
we conclude that the list δ′ is of the form
and then from the requirement
we conclude further that
along X, together with a morphism
making the following a pullback square in the category C:
which indeed results in a pullback square in Set of the required form (6.5) and satisfying the strictness conditions, with
equal to the function
Note the following property of this example, which is not typical of type-categories in general: up to bijection over X, any function with codomain X, f : Y → X, can be expressed as the projection function associated with an indexed type:
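That property of Set can be checked concretely: splitting a function f : Y → X into its fibres gives an X-indexed type whose total set, with first projection, is in bijection with Y over X. A small sketch (illustrative data of our own choosing):

```python
X = ["a", "b"]
Y = [0, 1, 2, 3]
f = {0: "a", 1: "a", 2: "b", 3: "b"}   # a function Y -> X

# the indexed type: each x in X gets the fibre of f over x
fibre = {x: [y for y in Y if f[y] == x] for x in X}

# the total set of the indexed type, and its projection back to X
total = [(x, y) for x in X for y in fibre[x]]
proj = {p: p[0] for p in total}

# bijection with Y over X: y <-> (f(y), y), commuting with the two maps to X
to_total = {y: (f[y], y) for y in Y}
assert sorted(to_total.values()) == sorted(total)
assert all(proj[to_total[y]] == f[y] for y in Y)
```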
Example 6.13 (Constant families). Every category C with finite products can be endowed with the structure of a type-category, in which the indexed types are 'constant families'. For each object X in C, one defines Type_C(X) to be Obj_C, the class of all objects of C. The associated operations are
can make each topos E into a type-category as follows. First note that E is indeed a category with a terminal object. For each object X ∈ Obj_E, define Type_E(X) to consist of all pairs (A, α) where A ∈ Obj_E and α : X × A → Ω. Here Ω denotes the codomain of the subobject classifier of E, ⊤ : 1 → Ω. Given such an X-indexed type, the total object X ⋉ (A, α) is given by forming a pullback square in E:
Example 6.15 (Split fibrations). Let Cat denote the category of small categories and functors. We can turn it into a type-category by decreeing that for each small category, C, the C-indexed types are functors 𝒜 : Cᵒᵖ → Cat. The associated total category C ⋉ 𝒜 is given by a construction due to Grothendieck:
An object of C ⋉ 𝒜 is a pair (X, A), with X an object of C and A an object of 𝒜(X).
A morphism in C ⋉ 𝒜 from (X, A) to (X′, A′) is a pair (x, a), where x : X → X′ in C and a : A → 𝒜(x)(A′) in 𝒜(X). The composition of two such morphisms (x, a) : (X, A) → (X′, A′) and (x′, a′) : (X′, A′) → (X″, A″) is given by the pair (x′ ∘ x, 𝒜(x)(a′) ∘ a). The identity morphism for the object (X, A) is given by (id_X, id_A).
The projection functor π_𝒜 : C ⋉ 𝒜 → C is of course given on both objects and morphisms by projection onto the first coordinate.
The pullback of the indexed type 𝒜 : Cᵒᵖ → Cat along a functor F : D → C is just given by composing 𝒜 with F, regarded as a functor Dᵒᵖ → Cᵒᵖ. Applying the Grothendieck construction to 𝒜 and to 𝒜 ∘ F, one obtains a pullback square in the category Cat of the required form, with F ⋉ 𝒜 : D ⋉ (𝒜 ∘ F) → C ⋉ 𝒜 the functor which acts on objects and morphisms by applying F in the first coordinate.
Regarding a set as a discrete category, there is an inclusion between the type-category of Example 6.12 and the present one. Unlike the previous example, not every morphism in Cat arises as the first projection of an indexed type. Those that do were characterized by Grothendieck and are known as 'split Grothendieck fibrations'. As the name suggests, these are a special instance of the more general notion of 'Grothendieck fibration', which would give a more general way of making Cat into a type-category, except that the 'strictness' conditions in Definition 6.9 are not satisfied. See [Jacobs, 1999] for more information on the use of Grothendieck fibrations in the semantics of type theory.
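The object and morphism parts of the Grothendieck construction translate directly into code. In the sketch below (entirely our own encoding, with hypothetical data), the base category is the poset 0 → 1, each fibre is a two-element preorder given by its order relation, and reindexing along an arrow is a monotone map; a morphism (x, a) : (i, A) → (j, A′) of the total category then exists just when x : i → j and A is below the reindexing of A′ in the fibre over i:

```python
# base category C: the poset 0 -> 1; morphisms are pairs (i, j) with i <= j
C_obj = [0, 1]
def C_hom(i, j):
    return [(i, j)] if i <= j else []

# the indexed category: each fibre is a preorder given by its <= relation
fibre_obj = {0: ["p", "q"], 1: ["r", "s"]}
fibre_le = {0: {("p", "p"), ("p", "q"), ("q", "q")},
            1: {("r", "r"), ("r", "s"), ("s", "s")}}
# reindexing along the arrows of C (contravariant: an arrow 0 -> 1
# gives a map from the fibre over 1 back to the fibre over 0)
reindex = {(0, 0): {"p": "p", "q": "q"},
           (1, 1): {"r": "r", "s": "s"},
           (0, 1): {"r": "p", "s": "q"}}

# Grothendieck construction: objects and hom-sets of the total category
total_obj = [(i, a) for i in C_obj for a in fibre_obj[i]]

def total_hom(src, tgt):
    (i, a), (j, b) = src, tgt
    return [(x, (a, reindex[x][b])) for x in C_hom(i, j)
            if (a, reindex[x][b]) in fibre_le[i]]

assert total_hom((0, "p"), (1, "s")) == [((0, 1), ("p", "q"))]
assert total_hom((0, "q"), (1, "r")) == []   # q is not below reindex(r) = p
```

In a preorder composition is uniquely determined, so the composition law (x′ ∘ x, 𝒜(x)(a′) ∘ a) reduces here to the composability check built into total_hom.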
6.4
Categorical semantics
We will now give the definition of the semantics of the types, terms, contexts, and context morphisms of Th in a type-category C. In general, contexts are interpreted as C-objects, context morphisms as C-morphisms, and types as indexed types in C. Terms are interpreted using the following notion of 'global section' of an indexed type.
Definition 6.16. Given an object X in a type-category C, the global sections of an X-indexed type A ∈ Type_C(X) are the morphisms a : X → X ⋉ A in C satisfying π_A ∘ a = id_X. We will write
and (f ⋉ A) ∘ fa = a ∘ f.
and
Contexts
Terms
Context morphisms
⟦Γ ctxt⟧ ⇝ X
σ = σ′ [Γ] is satisfied if and only if, for some (necessarily unique) object X and X-indexed type A ∈ Type_C(X),
⟦σ type [Γ]⟧ ⇝ A [X]
and
and
The generic model of a well-formed theory. Suppose that Th is a dependently typed algebraic theory that is well-formed, in the sense of Definition 6.3. Then the classifying category Cl(Th) contains a structure, G, for the underlying signature of Th. G is defined as follows, where we use a notation for the components of the structure as in Definition 6.17.
The well-formedness of Th implies that for each type-valued function symbol s, with introductory axiom s(x) type [Γ_s] say, Γ_s is a Th-context and hence determines an object X_s of Cl(Th). Then define the X_s-indexed type A_s ∈ Type_Cl(Th)(X_s) to be that represented by s(x) (which is a Γ_s-indexed type because s(x) type [Γ_s] is a theorem of Th).
For each term-valued function symbol F with introductory axiom F(x) : σ_F [Γ_F], the well-formedness of Th implies that σ_F type [Γ_F] is a theorem of Th. Hence Γ_F is a Th-context and hence determines an object X_F of Cl(Th). Then define the X_F-indexed type A_F ∈ Type_Cl(Th)(X_F) to be that represented by the Γ_F-indexed type σ_F, and define the global section
to be the morphism
represented by the Th-context morphism
The structure G has the following properties:
For each judgement Γ ctxt and Cl(Th)-object X, the relation ⟦Γ⟧ ⇝ X holds if and only if Γ ctxt is a theorem of Th and X is the object of Cl(Th) represented by Γ.
For each judgement σ type [Γ], each object X, and each X-indexed type A ∈ Type_Cl(Th)(X), the relation ⟦σ type [Γ]⟧ ⇝ A [X] holds if and only if σ type [Γ] is a theorem of Th, X is the object of Cl(Th) represented by Γ, and A is the X-indexed type represented by σ.
with
Call C reachable if there is such a sequence for every C-object. Every classifying category has this property. Conversely, if C is reachable, then it is equivalent to Cl(Th) for a theory in a suitable 'internal language' for C (cf. section 2.3).
6.5
Dependent products
(This definition makes use of the pairing notation introduced in Notation 6.18.) Symmetrically, there is a functor
Introduction
Note that the conclusion of this rule is well-formed if the hypotheses are. For if a′ is a global section of A′ over X ⋉ A, then a′
so we can apply the adjointness property from Definition 6.23 to form cur(a′) : id_X → Π(A, A′), which is, in particular, a global section of Π(A, A′).
Elimination
we can use Notation 6.18 to form
Since by definition
and
that are theorems of Th and such that X, A, and A′ are the equivalence classes of Γ, σ, and σ′(x), respectively. Then take Π(A, A′) to be the Γ-indexed type determined by
. This definition is independent of the choice of representatives. The morphism (6.10) is that represented by the Th-context morphism
Since
this does indeed induce a morphism of the required kind; and once again the definition of ap_{A,A′} is independent of the choice of representatives. The adjointness property required for Π(A, A′) can be deduced from the equality rules in Fig. 12, and the stability property follows from standard properties of substitution.
Further reading
This section lists some important topics in categorical logic which have not
been covered in this chapter and gives pointers to the literature on them.
Higher-order logic The study of toposes (a categorical abstraction of key properties of categories of set-valued sheaves) and their relationship to set theory and higher-order logic has been one of the greatest stimuli of the development of a categorical approach to logic. [Mac Lane and Moerdijk, 1992] provides a very good introduction to topos theory from a mathematical perspective; [Lambek and Scott, 1986, Part II] and [Bell, 1988] give accounts emphasizing the connections with logic. The hyperdoctrine approach to first-order logic outlined in section 5 can be extended to higher-order logic, and such higher-order hyperdoctrines can be used to generate toposes: see [Hyland et al., 1980], [Pitts, 1981, 1999] and [Hyland and Ong, 1993].
Polymorphic lambda calculus Hyperdoctrines have also been used successfully to model type theories involving type variables and quantification
over types. [Crole, 1993, Chapters 5 and 6] provides an introduction to the
categorical semantics of such type theories very much in the spirit of this
chapter.
Categories of relations Much of the development of categorical logic has been stimulated by a process of abstraction from the properties of categories of sets and functions. However, categorical properties of sets and binary relations (under the usual operation of composition of relations) have also been influential. The book by Freyd and Scedrov [1990] provides a wealth of material on categorical logic from this perspective.
Categorical proof theory Both Lawvere [1969] and Lambek [1968] put forward the idea that proofs of logical entailment between propositions may be modelled by morphisms in categories. If one is only interested in the existence of such proofs, then one might as well only consider categories with at most one morphism between any pair of objects, i.e. only consider pre-ordered sets. This is the point of view taken in section 5. However, to study the structure of proofs one must consider categories rather than just pre-orders. Lambek, in particular, has studied the connection between this categorical view of proofs and the two main styles of proof introduced by Gentzen (natural deduction and sequent calculus), introducing the notion of multicategory to model sequents in which more than one proposition occurs on either side of the turnstile: see [Lambek, 1989]. This has resulted in applications of proof theory to category theory, for example in the use of cut elimination theorems to prove coherence results: see [Minc, 1977], [Mac Lane, 1982]. In the reverse direction of applications of category theory to proof theory, general categorical machinery, particularly enriched category theory, has provided useful guidelines
for what constitutes a model of proofs in Girard's linear logic (see [Seely,
1989], [Barr, 1991], [Mackie et al., 1993], [Bierman, 1994]); for example,
in [Benton et al., 1993], such considerations facilitated the discovery of a
well-behaved natural deduction formulation of intuitionistic linear logic.
Categorical combinators The essentially algebraic nature of the category
theory corresponding to various kinds of logic and type theory gives rise to
variable-free, combinatory presentations of such systems. These have been
used as the basis of abstract machines for expression evaluation and type
checking: see [Curien, 1993], [Ritter, 1992].
References
[Backhouse et al., 1989] R. Backhouse, P. Chisholm, G. Malcolm, and
E. Saaman. Do-it-yourself type theory. Formal Aspects of Computing,
1:19-84, 1989.
[Barr, 1991] M. Barr. *-autonomous categories and linear logic. Math.
Structures in Computer Science, 1:159-178, 1991.
[Barr and Wells, 1990] M. Barr and C. Wells. Category Theory for Computing Science. Prentice Hall, 1990.
[Bell, 1988] J. L. Bell. Toposes and Local Set Theories. An Introduction,
volume 14 of Oxford Logic Guides. Oxford University Press, 1988.
[Benton et al., 1993] P. N. Benton, G. M. Bierman, V. C. V. de Paiva, and J. M. E. Hyland. A term calculus for intuitionistic linear logic. In M. Bezem and J. F. Groote, editors, Typed Lambda Calculi and Applications, Lecture Notes in Computer Science 664, pages 75-90. Springer-Verlag, 1993.
[Bierman, 1994] G. M. Bierman. On intuitionistic linear logic. PhD thesis, Cambridge Univ., 1994.
[Bloom and Esik, 1993] S. L. Bloom and Z. Esik. Iteration Theories.
EATCS Monographs on Theoretical Computer Science. Springer-Verlag,
1993.
[Cartmell, 1986] J. Cartmell. Generalised algebraic theories and contextual
categories. Annals of Pure and Applied Logic, 32:209-243, 1986.
[Chang and Keisler, 1973] C. C. Chang and H. J. Keisler. Model Theory.
North-Holland, 1973.
[Coste, 1979] M. Coste. Localisation, spectra and sheaf representation. In M. P. Fourman, C. J. Mulvey, and D. S. Scott, editors, Applications of Sheaves, Lecture Notes in Mathematics 753, pages 212-238. Springer-Verlag, 1979.
[Crole, 1993] R. L. Crole. Categories for Types. Cambridge Univ. Press,
1993.
Categorical logic
125
[Crole and Pitts, 1992] R. L. Crole and A. M. Pitts. New foundations for
fixpoint computations: Fix-hyperdoctrines and the fix-logic. Information
and Computation, 98:171-210, 1992.
[Curien, 1989] P.-L. Curien. Alpha-conversion, conditions on variables and
categorical logic. Studia Logica, 48:319-360, 1989.
[Curien, 1990] P.-L. Curien. Substitution up to isomorphism. Technical Report LIENS-90-9, Laboratoire d'Informatique, Ecole Normale
Superieure, Paris, 1990.
[Curien, 1993] P.-L. Curien. Categorical Combinators, Sequential Algorithms, and Functional Programming (2nd edition). Birkhauser, 1993.
[Dummett, 1977] M. Dummett. Elements of Intuitionism. Oxford University Press, 1977.
[Ehrhard, 1988] Th. Ehrhard. A categorical semantics of constructions. In
3rd Annual Symposium on Logic in Computer Science, pages 264-273.
IEEE Computer Society Press, 1988.
[Freyd, 1972] P. J. Freyd. Aspects of topoi. Bulletin of the Australian
Mathematical Society, 7:1-76 and 467-80, 1972.
[Freyd and Scedrov, 1990] P. J. Freyd and A. Scedrov. Categories, Allegories. North-Holland, Amsterdam, 1990.
[Girard, 1986] J.-Y. Girard. The system F of variable types, fifteen years
later. Theoretical Computer Science, 45:159-192, 1986.
[Girard, 1987] J.-Y. Girard. Linear logic. Theoretical Computer Science,
50:1-102, 1987.
[Girard, 1989] J.-Y. Girard. Proofs and Types, Cambridge Tracts in Theoretical Computer Science 7. Cambridge University Press, 1989.
[Goguen and Meseguer, 1985] J. A. Goguen and J. Meseguer. Initiality,
induction and computability. In M. Nivat and J. C. Reynolds, editors,
Algebraic Methods in Semantics, pages 459-541, Cambridge University
Press, 1985.
[Goguen et al, 1977] J. A. Goguen, J. W. Thatcher, E. G. Wagner, and
J. B. Wright. Initial algebra semantics and continuous algebras. Journal
of the Association for Computing Machinery, 24:68-95, 1977.
[Hofmann, 1995] M. Hofmann. On the interpretation of type theory in locally cartesian closed categories. In L. Pacholski and J. Tiuryn, editors, Computer Science Logic, Kazimierz, Poland, 1994, Lecture Notes in Computer Science 933. Springer-Verlag, 1995.
[Hyland, 1988] J. M. E. Hyland. A small complete category. Annals of
Pure and Applied Logic, 40:135-165, 1988.
[Hyland and Ong, 1993] J. M. E. Hyland and C.-H. L. Ong. Modified realizability toposes and strong normalization proofs. In M. Bezem and
J. F. Groote, editors, Typed Lambda Calculi and Applications, Lecture
Notes in Computer Science 664, pages 179-194. Springer-Verlag, 1993.
[Mac Lane, 1982] S. Mac Lane. Why commutative diagrams coincide with
equivalent proofs. Contemporary Mathematics, 13:387-401, 1982.
[Mac Lane and Moerdijk, 1992] S. Mac Lane and I. Moerdijk. Sheaves in
Geometry and Logic. A First Introduction to Topos Theory. Universitext.
Springer-Verlag, 1992.
[Mackie et al., 1993] I. Mackie, L. Roman, and S. Abramsky. An internal language for autonomous categories. Applied Categorical Structures,
1:311-343, 1993.
[Makkai and Reyes, 1977] M. Makkai and G. E. Reyes. First Order Categorical Logic, Lecture Notes in Mathematics 611. Springer-Verlag, 1977.
[McLarty, 1992] C. McLarty. Elementary Categories, Elementary Toposes,
Oxford Logic Guides 21. Oxford University Press, 1992.
[Minc, 1977] G. E. Minc. Closed categories and the theory of proofs. Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. im. V. A. Steklova (LOMI), 96:83-114, 145, 1977. Russian, with English summary.
[Mitchell and Scott, 1989] J. C. Mitchell and P. J. Scott. Typed lambda
models and cartesian closed categories (preliminary version). In J. W.
Gray and A. Scedrov, editors, Typed Lambda Calculi and Applications,
Lecture Notes in Computer Science, pages 301-316, Springer-Verlag,
1989.
[Moggi, 1991] E. Moggi. Notions of computation and monads. Information and Computation, 93:55-92, 1991.
[Nordstrom et al., 1990] B. Nordstrom, K. Petersson, and J. M. Smith. Programming in Martin-Löf's Type Theory, volume 7 of International Series of Monographs on Computer Science. Oxford University Press, 1990.
[Obtulowicz, 1989] A. Obtulowicz. Categorical and algebraic aspects of Martin-Löf type theory. Studia Logica, 48:299-318, 1989.
[Oles, 1985] F. J. Oles. Type algebras, functor categories and block structure. In M. Nivat and J. C. Reynolds, editors, Algebraic Methods in Semantics, pages 543-574. Cambridge University Press, 1985.
[Pierce, 1991] B. C. Pierce. Basic Category Theory for Computer Scientists. MIT Press, 1991.
[Pitts, 1981] A. M. Pitts. The theory of toposes. PhD thesis, Cambridge
Univ., 1981.
[Pitts, 1987] A. M. Pitts. Polymorphism is set theoretic, constructively. In
D. H. Pitt, A. Poigne, and D. E. Rydeheard, editors, Category Theory
and Computer Science, Proc. Edinburgh 1987, Lecture Notes in Computer Science 283, pages 12-39. Springer-Verlag, Berlin, 1987.
[Pitts, 1989] A. M. Pitts. Non-trivial power types can't be subtypes of
polymorphic types. In 4th Annual Symposium on Logic in Computer
Science, pages 6-13. IEEE Computer Society Press, 1989.
Contents
1 Introduction 129
2 Preliminaries 135
3 Reductions between formulas 140
4 Inseparability results for first-order theories 151
5 Inseparability results for monadic second-order theories 158
6 Tools for NTIME lower bounds 164
7 Tools for linear ATIME lower bounds 173
8 Applications 180
9 Upper bounds 196
10 Open problems 204
1 Introduction
In this chapter we present a method for obtaining lower bounds on the
computational complexity of logical theories, and give several illustrations
of its use. This method is an extension of widely used procedures for proving the recursive undecidability of logical theories. (See Rabin [1965] and
Ersov et al. [1965].) One important aspect of this method is that it is based
on a family of inseparability results for certain logical problems, closely related to the well-known inseparability result of Trakhtenbrot (as refined
by Vaught), that no recursive set separates the logically valid sentences
from those which are false in some finite model, as long as the underlying
language has at least one non-unary relation symbol. By using these inseparability results as a foundation, we are able to obtain hereditary lower
bounds, i.e., bounds which apply uniformly to all subtheories of the theory.
The second important aspect of this method is that we use interpretations to transfer lower bounds from one theory to another. By doing this
we eliminate the need to code machine computations into the models of the
theory being studied. (The coding of computations is done once and for all
A hereditary lower bound for Σ is a bound that holds for sat(Σ′) and val(Σ′) whenever Σ′ ⊆ val(Σ). If L is a first-order logic, define inv(L) to be the set of sentences in L that are logically invalid, i.e., false in all models. If L is a monadic second-order logic, define inv(L) to be the set of sentences false in all weak models. (See section 2 for definitions.)
The complexity classes used here are time-bounded classes for nondeterministic Turing machines and for the more general class of linear alternating Turing machines. In providing reductions between different decision
problems, we are always able to give log-lin reductions. That is, our reduction functions can be computed by a deterministic Turing machine which
operates simultaneously in log space and linear time. In particular, such
functions have the property that the size of a value is bounded uniformly
by a constant multiple of the size of the argument.
Let L₀ denote the first-order logic with a single, binary relation symbol. Let ML₀ denote the corresponding monadic second-order logic. Let T(n) be a time resource bound which grows at least exponentially in the sense that there exists a constant d, 0 < d < 1, such that T(dn)/T(n) tends to 0 as n tends to ∞. (This condition is satisfied by the iterated exponential functions and other time resource bounds which arise most commonly in connection with the computational complexity of logical theories.) Let sat_T(L₀) denote the set of sentences σ in L₀ such that σ is true in some model on a set of size at most T(|σ|). (Here |σ| denotes the length of σ.) Similarly define sat_T(ML₀) for sentences of monadic second-order logic. The inseparability results which form the cornerstone of our method are as follows:
(a) For some c > 0, satT(L0) and inv(L0) cannot be separated by any
set in NTIME(T(cn)).
(b) For some c > 0, satT(ML0) and inv(ML0) cannot be separated by
Our results concerning the various theories of finite trees can be summarized as follows:
(a) For each r ≥ 4 there are constants c and d > 0 such that sat(Σr)
is in NTIME(exp_(r-2)(dn)) but sat(Σr) and val(Σr) are hereditarily not in NTIME(exp_(r-2)(cn)). For r = 3 the upper bound is
NTIME(2^(dn²)) and the hereditary lower bound is NTIME(2^(cn)).
(b) There exist constants c and d > 0 such that sat(Σ) is in
NTIME(exp(dn)) but sat(Σ) and val(Σ)
Preliminaries
consisting of a child of the root and all its descendants. Thus, we may
regard a tree as being formed by directing an edge from the root of the tree
to the root of each of its primary subtrees.
The depth of a vertex in a tree is its distance from the root. The height
of a vertex is the maximum distance to a leaf below it. Thus, the height of
a tree is the maximum depth of its vertices, which is also the height of the
root.
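These definitions can be made concrete with a small sketch (the tree, its adjacency-list encoding, and all names below are illustrative and ours, not from the text):

```python
# A rooted tree given as an adjacency list, edges directed away from root 0.
children = {0: [1, 2], 1: [3, 4], 2: [], 3: [], 4: []}

def depth(v, root=0):
    """Distance of v from the root (0 for the root itself)."""
    if v == root:
        return 0
    parent = next(u for u, cs in children.items() if v in cs)
    return 1 + depth(parent)

def height(v):
    """Maximum distance from v down to a leaf below it."""
    return 0 if not children[v] else 1 + max(height(c) for c in children[v])

assert depth(3) == 2
assert height(0) == 2
# the height of the tree = the maximum depth of its vertices = height of root
assert height(0) == max(depth(v) for v in children)
```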
As we noted in Section 1, we will consider problems of the form sat(Σ)
and val(Σ). From a computational point of view, these two problems are
complementary. That is, a sentence σ is in sat(Σ) exactly when ¬σ is not
in val(Σ). Hence, sat(Σ) is a member of a particular complexity class if
and only if val(Σ) is a member of the corresponding co-complexity class.
If Σ is a complete theory, then sat(Σ) = val(Σ). When we are in a first-order logic, val(Σ) is the deductive closure of Σ by the Gödel completeness
theorem. There is no corresponding result for monadic second-order logic.
Often a logical theory is specified not by giving a set of axioms Σ, but
by giving a class of models C. In this situation we take Σ to be the set
of sentences true in all members of C. It is easy to verify in this case that
val(Σ) = Σ and sat(Σ) is the set of sentences true in some member of C.
If L is a first-order logic we define inv(L) to be the set of sentences
false in all models for L. This is just the complement of the set of satisfiable sentences. If L is a
monadic second-order logic we define inv(L) to be the set of sentences false
in all weak models for L.
Given a time resource bound T(n), let satT(Σ) be the set of sentences σ
true in some model of Σ of size at most T(|σ|). Also, write satT(L) for
satT(∅). Let satTp(Σ) be the set of prenex sentences true in some model
of Σ of size at most T(|σ|).
Let 𝔄 be a model for a logic L, m̄ = m1, . . . , mk elements of 𝔄, and
φ(x1, . . . , xn, y1, . . . , yk) a formula from L. Then φ𝔄(x̄, m̄) denotes the
n-ary relation defined by
(or in ATIME(T(cn),cn)).
A log-lin reduction is a mapping computable in log space and linear time.
In some sources this terminology is used for a log space computable, linearly
bounded mapping, which is a weaker notion. (Linearly bounded means that
output length is less than some constant multiple of input length.) It is not
crucial for the applications presented here that our reductions be quite so
restricted: polynomial-time, linearly bounded reductions suffice. However,
to obtain some results in the literature, such as the nondeterministic polynomial lower bounds in Grandjean [1983], linear time reductions would be
needed.
We encounter a technical problem with log-lin reductions: we do not
know if they are closed under composition. To overcome this difficulty we
define a stronger notion of reset log-lin reduction. A machine performing
such a reduction is a log space, linear time bounded Turing machine with
work tapes, an input tape, and an output tape. It has the capability to
reset the input tape head to the initial input cell on k moves during a computation, where k is fixed for all inputs; on all other moves the input tape
head remains in place or moves one cell to the right. It writes the output
sequentially from left to right. Suppose that M' and M" are two such machines using at most k' and k" resets, respectively. We informally describe
a machine M to compute the composition of the reductions computed by
M' and M". Imagine that the output tape of M' and the input tape of
M" have been removed. Instead, M' sends its output directly to M". As
M" computes its output, it calls M' to supply it with a new symbol on
those moves when the input head of M" would have moved right. M' has
only to resume its computation from the last call to supply this symbol.
On those moves where M" would have reset its input head, M' must begin
its computation anew. Now the input head of M" would have passed over
each input cell at most k" +1 times during the computation, and to supply
each symbol the input head of M' passes over each input cell at most k' +1
times. Thus, M resets its input head at most (k' + 1)(k" + 1) − 1 times.
Clearly, M is log space bounded. Since the part of M corresponding to M'
is forced to begin its computation anew at most k" times, it is easy to see
that M is linear time bounded.
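The composition argument can be sketched in code (entirely ours; the machines, names, and toy reductions below are illustrative, not the text's formal construction): g reads its "input" lazily from f, and each reset of g simply restarts f from the beginning.

```python
def compose(f, g, inp):
    # g consumes the output of f symbol by symbol; resetting the virtual
    # input restarts f, mirroring how M resumes or restarts the M' part.
    class VirtualInput:
        def __init__(self):
            self.gen = f(inp)
        def next_symbol(self):
            return next(self.gen, None)   # None signals end of input
        def reset(self):
            self.gen = f(inp)             # begin f's computation anew
    return g(VirtualInput())

def duplicate(inp):
    # toy reduction: one left-to-right pass, no resets, doubles each symbol
    for ch in inp:
        yield ch
        yield ch

def length_prefix(vin):
    # toy reduction with one reset: first pass counts the input symbols,
    # second pass (after the reset) copies them after a length header
    n = 0
    while vin.next_symbol() is not None:
        n += 1
    vin.reset()
    out = [str(n), ":"]
    while (ch := vin.next_symbol()) is not None:
        out.append(ch)
    return "".join(out)

print(compose(duplicate, length_prefix, "ab"))   # prints "4:aabb"
```

The composed reduction never materializes the intermediate string, just as the machine M in the proof dispenses with M''s output tape.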
It is not difficult to show that the prenex formulas of a logic are closed
under relativization up to reset log-lin reductions. That is, there is a reset
log-lin reduction which takes formulas of the form φ^D, where φ is a prenex
formula with no variable quantified more than once, to equivalent prenex
formulas. We use this fact often. Unfortunately, we know of no way to
eliminate duplicate quantifications of variables using reset log-lin reductions, but this can be accomplished easily with polynomial-time, linearly
bounded reductions.
A problem Π is hard for a complexity class C via reductions from a class
S if every problem Π' ∈ C can be reduced to Π by some f ∈ S. That is, if
A and A' are the alphabets for Π and Π' respectively, then f maps A'* to
A* so that w ∈ Π' if and only if f(w) ∈ Π. If, in addition, Π ∈ C, we say
that Π is complete for C via reductions from S.
One of our goals is to develop effective and easily used methods for transferring lower bounds from one problem to another. Our methods are based on
interpretations between theories (or equivalently, between classes of models) and can be seen as an extension of the most widely used methods for
proving the undecidability of logical theories; see Ersov et al. [1965] and
Rabin [1965] for a discussion of undecidable theories from this point of view.
To obtain complexity lower bounds for decidable theories we must use interpretations which have a somewhat more general form than those used
in undecidability proofs, and there are certain technicalities about lengths
of formulas which must be addressed in this more general setting. In this
section we will develop the required machinery. The first-time reader may
wish to skip the proofs in this section as they are somewhat tedious and
only the statements of results will be used later.
A common method for proving that a theory Σ in a logic L is undecidable is to show that the theory S0 of finite binary relations, formulated in
the logic L0, can be interpreted in Σ. In the simplest case this means that
formulas δ(x, ū) and π(x, y, ū) of L are given so that every finite binary
relation can be obtained (up to isomorphism) in the form
boundedness condition may not hold. However, in certain cases there are
methods to efficiently replace φ'n by an equivalent formula for which the
linear boundedness condition does hold. Roughly speaking, we can do this
when the formulas πn and δn are all in prenex form, or are obtained by
a certain kind of iterative procedure. The machinery developed here to
accomplish this task is implicit in most complexity lower bound arguments
for logical problems.
In order to describe this machinery, it is convenient to introduce an
extension L* of each logic L, in which explicit definitions are allowed. (L*
has no more expressive power than L, but properties can sometimes be
expressed by shorter formulas in L* than in L.) Continuing the example
above, let πnD denote a formula in which all quantifiers of πn have been
relativized to a new unary relation symbol D. Then the extended language
L* in this case would include a formula, formed by prefixing the explicit
definitions [D(x) ≡ δn(x)] and [P(x, y) ≡ πnD(x, y)],
whose interpretation is exactly the same as that of φ'n, although its length
is likely to be more under control. Here the equivalences in brackets are
interpreted to mean that P is explicitly defined by πn and that D is explicitly defined by δn. The general problem, treated below in this section,
is to find situations in which certain formulas of the extended language L*
can be efficiently reduced to equivalent formulas of L, without a significant
increase in the length of the formulas. (In general, it is possible to find for
each L* formula of length n an equivalent L formula of length O(n log n);
this is not good enough for sharp complexity bounds.)
Let L be either a first-order or monadic second-order logic. Define L*
as follows. Formulas of L* may contain any of the symbols occurring in
formulas of L and, in addition, relation variables Sij for each i, j ≥ 0. In
each case the arity of Sij is j, and the subscript and superscript of Sij are
expressed in binary notation. (If L is a monadic second-order logic we need
two superscripts, the first denoting the arity of element arguments and the
second denoting the arity of set arguments.) Subscripts and superscripts
of relation variables contribute to the length of formulas in which they
occur, just as element variable subscripts do. (However, superscripts may
be ignored in asymptotic estimates of formula length because they are
dominated in length by their corresponding argument lists.) We define the
set of formulas of L* inductively, and at the same time define free(φ),
the set of free variables in φ. An atomic formula φ of L* is either an
atomic formula of L or a formula P(x1, . . . , xj) where P denotes a relation
variable; in the former, free(φ) is the same as in L; in the latter, free(φ) =
{P, x1, . . . , xj}. More complex formulas may be constructed using the
logical connectives and quantifiers appropriate to L; in these cases free(φ)
is defined just as in L. The only other way to construct more complex
Notice that the truth value is consistent with the definition of free(φ).
Notice also that the second-order expression above is equivalent to its
universal counterpart, so that ¬[P(x̄) ≡ θ]φ is equivalent to [P(x̄) ≡ θ]¬φ.
If free(φ) = ∅, then φ is a sentence of L*.
We will let sat*(Σ) denote the set of sentences from L* true in some
model of Σ, sat*T(Σ) denote the set of sentences σ from L* true in some
model of Σ of size at most T(|σ|), sat*T(L) denote the set of sentences σ
from L* true in some model of size at most T(|σ|), and inv*(L) denote the
set of sentences from L* true in no model (or no weak model when L is a
monadic second-order logic).
Introduction of explicitly defined relations is standard practice in mathematical discourse. Explicitly defined relations are also similar to nonrecursive procedures in programming languages.
Explicit definitions can be used to define reductions between satisfiability problems. To provide good lower bounds these reductions must be
efficiently computable and linearly bounded. We will show, in fact, that
there are reset log-lin reductions, defined on certain subsets of sentences
from L*, that take formulas to equivalent formulas in L. (Unfortunately,
such reductions probably cannot be defined on the set of all sentences in
L*; with a little effort we can produce a polynomial-time reduction which
maps sentences in L* of length n to equivalent sentences in L of length
O(n log n).)
We inductively define positive and negative occurrences of a relation
symbol Q in formulas from L*. Q occurs positively in atomic formulas
[P(x, y) ≡ (x = y ∨ ∃z (P(x, z) ∧ E(z, y)))]
defines the path relation in each graph: P(x, y) is the least relation satisfying the equivalence, so it holds precisely when there is a path between x
and y. Now consider the related iterative definition
[P(x, y) ≡ (x = y ∨ ∃z (P(x, z) ∧ E(z, y)))]n
which defines a relation P(x, y) which holds precisely when the distance
between x and y is at most n − 1 (when n ≥ 1). Notice that this 'approximation' to the implicitly defined relation does not converge very rapidly.
The iterative definition
[P(x, y) ≡ (x = y ∨ E(x, y) ∨ ∃z (P(x, z) ∧ P(z, y)))]n
defines a relation P(x, y) which holds precisely when the distance between x
and y is at most 2^n − 1 (for n ≥ 1), so this approximation to the path relation
converges exponentially 'faster'. For an implicit definition to make sense,
θ(P) should be monotone in P (i.e., for every structure 𝔄, if P and P'
are relations on 𝔄 with P ⊆ P', then θ𝔄(P) ⊆ θ𝔄(P')). Monotonicity can
be guaranteed by requiring that P is positive in θ. No such restriction is
needed for iterative definitions. In most of our applications P does occur
positively and the iterative definitions approximate an implicit definition.
Usually, the faster the convergence, the better the lower bounds obtained by
our methods. We will see that the positivity of P in θ does have implications
in lower bound results.
To show that we can efficiently transform iterative definitions into equivalent explicit definitions, we require the following theorem, which will also
be used to show that certain sets of formulas in L* can be efficiently transformed into equivalent formulas from L.
Theorem 3.1. Let L be a first-order or monadic second-order logic and let
L' be a logic, of the same type, whose vocabulary consists of the vocabulary
of L together with relation symbols P1 , . . . , Pm . There is a reset log-lin
reduction taking each prenex formula of L' to an equivalent prenex formula
of L' having at most one occurrence of each Pj .
Proof. The proof follows an argument of Ferrante and Rackoff [1979,
pp. 155-157]. We must add some details, however, because they were
not interested in obtaining a reset log-lin reduction. We adopt the same
assumption they did there: we assume that L has a symbol for equality
and that all structures have cardinality at least 2. We could dispense with
this assumption at the cost of added complications.
We deal explicitly only with the case m = 1. It will be clear from the
proof that the procedure can be iterated to treat P1 , P2 , . . . in succession.
We describe the action of our algorithm on φ, a prenex formula from
L'. First add a 0 bit to the end of every variable index occurring in φ. This
will allow us to introduce variables of odd index without creating a conflict.
Now φ is of the form
as the formula it replaces. Since we have no Boolean variable type, we instead replace each subformula P1(xi1, . . . , xil) with an equation v1 = vb(i).
We must ensure for each i that b(i) is odd and greater than 1, that
b(i) is log-lin computable from P1(xi1, . . . , xil) (with no resets), and that
b(i) ≠ b(j) when i ≠ j. To produce b satisfying these conditions suppose
that xi1, . . . , xil are formal variables denoting actual variables with subscripts j1, . . . , jl respectively. In the string j1#j2# · · · #jl replace every
occurrence of 0 with 01, of 1 with 11, and of # with 10; let the result be
b(i). Let φ̂ be the result of replacing each formula P1(xi1, . . . , xil) in φ
by the formula v1 = vb(i). Now with a little effort we can see that φ is
equivalent to
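The encoding b just described is easy to run; here is a sketch (ours, purely illustrative): since the digit codes 01 and 11 both end in 1, the resulting binary number is always odd, and because # is itself encoded, distinct subscript strings never collide.

```python
CODE = {"0": "01", "1": "11", "#": "10"}

def b(subscripts):
    # form the string j1#j2#...#jl with each subscript in binary,
    # then apply the two-bit code and read the result as a binary number
    s = "#".join(format(j, "b") for j in subscripts)
    bits = "".join(CODE[ch] for ch in s)
    return int(bits, 2)

assert b([5]) == 55        # "101" -> "11" "01" "11" -> 0b110111 = 55
assert b([5]) % 2 == 1     # the code of any digit ends in 1, so b is odd
assert b([1, 2]) != b([3]) # "#" is encoded too, so no collisions
```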
[P1(x̄1) ≡ θ1] · · · [Pk(x̄k) ≡ θk] φ,
where φ and θ1, . . . , θk are formulas from L* whose only free relation variables are P1, . . . , Pk, to an equivalent formula
of the form
[P(ȳ) ≡ θ'] φ',
where φ' and θ' are formulas from L* whose only free relation variable is
P. Moreover, if P1, . . . , Pk occur only positively in each of the formulas
θ1, . . . , θk, then we may arrange that P occurs only positively in θ'.
Proof. As before, we assume that L has a symbol for equality and that all
structures have cardinality at least 2. Again, we could dispense with these
assumptions at the cost of added complications.
Without loss of generality, we may assume that the variable sequences
x̄1, . . . , x̄k are mutually disjoint. Let z̄ denote a sequence z, z1, . . . , zk of
distinct variables disjoint from x̄1, . . . , x̄k. The idea of the proof is that one
relation P(z̄, x̄1, . . . , x̄k) will code the relations P1(x̄1), . . . , Pk(x̄k). To be
more precise, the relation P(z̄, x̄1, . . . , x̄k) is equivalent to
Call this formula δi(x̄) (or δi for short). Define θ' to be the L* formula
Let φ' be the formula
is equivalent to [P(x̄) ≡ θ']n φ' by induction on n.
Remark 3.5. Notice that in the proof of Theorem 3.4 formula φ' is formed
simply by inserting explicit definitions of fixed length before φ. These
definitions may be eliminated by replacing relation variables in φ with
their corresponding definitions. Now if φ is a prenex formula or a member
of a prescribed set of formulas (defined below), it is easy to arrange that
φ' is a formula of the same type.
We can now say precisely which kinds of definitions are used in the
reductions described at the beginning of this section: they are prenex definitions and iterative definitions. It is useful, therefore, to have terminology
to describe sets of formulas in L* built up from prenex formulas using
prenex and iterative definitions. We must place some restrictions on these
sets to be able to efficiently translate them into equivalent formulas from
L.
Let L be a first-order or monadic second-order logic. Let L' be the logic
formed by adding relation variables P1, . . . , Pk to the vocabulary of L, and
let l be a fixed positive integer. A prescribed set of formulas over L is a set of
formulas of the form
[P1(x̄1) ≡ θ1]n1 · · · [Pk(x̄k) ≡ θk]nk φ,
where φ is a prenex formula from L', and for each i either ni = 1 and
θi is a prenex formula from L' in which only P1, . . . , Pi−1 may occur as
free relation variables (i.e., Pi has a prenex definition), or θi is a formula of
length at most l from L* in which only P1, . . . , Pi may occur as free relation
variables (i.e., Pi has an iterative definition in which the operator formula
has bounded length). We place one further restriction on sets of prescribed
formulas: each variable is quantified at most once in φ and in each formula
θi where Pi has a prenex definition. We impose this condition so that when
we relativize all the formulas within a set to a unary relation symbol D,
there is a reset log-lin reduction taking the resulting formulas to equivalent
formulas from another prescribed set of formulas. The condition is easy to
satisfy in practice.
We now present our fundamental theorem for making reductions between formulas.
Theorem 3.6. Let L be a first-order or monadic second-order logic. For
each prescribed set of formulas over L there is a reset log-lin reduction
taking each formula in the set to an equivalent formula in L.
Proof. Fix a prescribed set of formulas over L. There are relation variables
P1, . . . , Pk as in the definition such that all formulas in the set are of the
form
[P1(x̄1) ≡ θ1]n1 · · · [Pk(x̄k) ≡ θk]nk φ,
where φ is a prenex formula in which only P1, . . . , Pk may occur as free
relation variables, and for each i either ni = 1 and θi is a prenex formula
in which only P1, . . . , Pi−1 may occur as free relation variables, or θi is
a formula of length at most l in which only P1, . . . , Pi may occur as free
relation variables.
At first glance it may seem that P1, . . . , Pk are being defined simultaneously, but this is not the case. First P1 is assigned a value by an iterative
definition of depth n1 which is substituted in the remaining definitions.
Then P2 is assigned a value by the next iterative definition of depth n2
which is substituted in the remaining definitions, and so on. The proof
combines this observation with the construction used in Theorem 3.4. As
in that theorem, we will code the relations P1(x̄1), . . . , Pk(x̄k) into a single
relation P(ȳ) equivalent to
As before, let
This definition simply defines Pi(x̄i) to be θi and leaves the other relations
unchanged. Use the construction in the proof of Theorem 3.4 to produce an
where θ̂i is the formula
We claim that there is a reset log-lin reduction taking θ̂i to an equivalent prenex formula ηi with just one subformula P(ū) in which P occurs.
Whether Pi has a prenex definition, in which case θi is in prenex form,
or an iterative definition, in which case θi is of bounded length, there is
a simple reset log-lin reduction to convert θ̂i into prenex form. Apply
the reduction given by Theorem 3.1 to the result to obtain an equivalent
formula in which each of the symbols P1, . . . , Pk occurs just once. In this
formula, for each j, substitute δj(ȳj) for the subformula Pj(ȳj). Convert
to prenex form again by a reset log-lin reduction and apply the reduction
of Theorem 3.1 one more time to obtain ηi as desired. Notice that if Pi has
an iterative definition, ηi has length less than some constant determined
by l and the arities of P1, . . . , Pk.
If Pi has a prenex definition, form φi by substituting θ̂i(ū) for P(ū)
in ηi. If Pi has an iterative definition we must make several substitutions.
Beginning with φi−1, replace free variables with the corresponding variables ū and substitute the result for P(ū) in ηi. Repeat this operation ni
times. The resulting formula is φi.
In either case it is easy to see that φi is obtained by a reset log-lin
reduction.
Since φ is in prenex form, we can apply the reset log-lin reduction of
Theorem 3.1 to obtain an equivalent prenex formula in which each of
the symbols P1, . . . , Pk occurs at most once. As before, there is a reset
log-lin reduction to convert this
into a prenex formula with just one subformula P(ū) in which P occurs.
Substitute φk(ū) for this subformula to obtain finally φ'. Repeated use
of closure of reset log-lin reductions under composition shows that the
whole mapping
is reset log-lin computable.
Remark 3.7. Scrutiny of the preceding proof reveals two useful facts.
First, if all the symbols Pi have prenex definitions we can arrange that φ'
is in prenex form. Second, if we wish to restrict to formulas in which the
only connectives are ∧, ∨, and ¬, the theorem remains true providing Pi
occurs only positively in θi when Pi has an iterative definition. To see this,
observe that by the remark following Theorem 3.1 we can always ensure
that the formulas ηi each contain at most two occurrences of P. This is not
a problem when Pi has a prenex definition because ηi figures only once in
the construction of φk and there are a bounded number of such definitions.
When Pi has an iterative definition we can ensure, again by the remark
following Theorem 3.1, that Pi occurs at most once in ηi since it occurs only
positively in θi.
Hereditary lower bound results have proofs similar to the classical hereditary undecidability results. Young [1985], for example, modified techniques
used in the proof of the hereditary version of Gödel's undecidability theorem, which states that all subtheories of Peano arithmetic are undecidable,
to show that all subtheories of Presburger arithmetic have an NTIME(2^(2^cn))
lower bound. Our starting point is another classical undecidability result:
the Trakhtenbrot–Vaught inseparability theorem. Many hereditary undecidability results have been derived from this theorem.
Recall that L0 is the first-order logic whose vocabulary contains just
a binary relation symbol P. Let fsat(L0) be the set of sentences of L0
true in some finite model, and inv(L0) the set of sentences of L0 true in
no model. The Trakhtenbrot–Vaught inseparability theorem states that
fsat(L0) and inv(L0) are recursively inseparable: no recursive set contains
one of these sets and is disjoint from the other. Trakhtenbrot [1950] showed
this for a first-order logic with sufficiently many binary relations in its
vocabulary and Vaught [1960; 1962] reduced the number of binary relations
to one. To see how this theorem gives hereditary undecidability results,
suppose that for some theory Σ in a logic L there is a recursive reduction
from the sentences of L0 to the sentences of L that takes fsat(L0) into
sat(Σ) and inv(L0) into inv(L). Clearly sat(Σ) is not recursive since it
separates the image of fsat(L0) from the image of inv(L0). Moreover, if
Σ' ⊆ val(Σ), then sat(Σ) ⊆ sat(Σ') and sat(Σ') ∩ inv(L) = ∅, so sat(Σ')
is not recursive either.
Let T(n) be a time resource bound. Recall that satT(L0) is the set
of sentences σ in L0 such that σ is true in a structure of power at most
T(|σ|). Our analogue of the Trakhtenbrot–Vaught inseparability theorem states that for T satisfying certain weak hypotheses, satT(L0) and
inv(L0) are NTIME(T(cn))-inseparable for some c > 0. That is, no set in
NTIME(T(cn)) contains one of these sets and is disjoint from the other.
We show, in fact, that the result is true if we restrict to prenex sentences
in L0. Thus, using the reductions between formulas described in the previous section, we can obtain hereditary NTIME lower bounds for theories
in much the same way that we obtain hereditary undecidability results. In
Then the relation P(x1, x2, y1, y2) given by the iterative definition
is true when 0 ≤ y1 − x1 ≤ 2^m − 1 and 2(y1 − x1) = y2 − x2. Now LEFT(x, y)
is equivalent to P(0, 0, x, y) and RIGHT(x, y) is equivalent to P(0, 1, x, y)
on the interval 0, . . . , 2^m + 1, so we take m = ⌈log(n − 1)⌉ to obtain RIGHT
and LEFT on the interval 0, . . . , n. By Theorem 3.3 there are first-order
formulas φ'n and φ"n defining LEFT and RIGHT; moreover, they are computable from the unary representation of m by a reset Turing machine in
time log n and space log log n. By increasing time to log n log log n we can
make φ'n and φ"n
prenex formulas.
The height of T is h = ⌈log n⌉. Now for every i such that 0 ≤ i < h and
every j < 2^(h−i) define a quantifier-free formula θi,j(x0, . . . , xi) by induction
on i. Roughly, θi,j(x0, . . . , xi) says that if (xi, xi−1, . . . , x0) is a path in T
from vertex j = xi, then the symbol at position x0 on the input tape is
ax0. First, θ0,j(x0) is SYMaj(x0, 0) when j < n and some tautology (say
x0 = x0) when n ≤ j < 2^h. Next, θi+1,j(x0, . . . , xi+1) is the formula
the sentence
says that SYMak(k, 0) holds when k < n; that is, it says w is written on
the first n cells of the input tape at time 0. Let φw be a conjunction of this
sentence and a sentence that says SYM#(x, 0) holds for all x ≥ n (that
is, that all the tape cells from position n onward are blank at time 0).
We must show that there is a reset log-lin reduction taking w to φw, from
which it follows easily that there is a reset log-lin reduction taking w to
First, w is read from the input tape while h cells are marked off on a work
tape. One way to accomplish this is simply to keep a count on a work tape
of the number of input tape cells scanned. Count in binary. Incrementing
the count requires changing the low-order 1-bits to 0 until encountering
a 0, which is changed to 1. The work tape head is then returned to the
lowest-order bit to prepare for the next advance of the input head. It is
not difficult to show that the time required to read the input tape and do
all the increments is O(n).
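The O(n) claim is the standard amortized analysis of a binary counter: the k-th bit flips only once every 2^k increments, so n increments write O(n) cells in total. A sketch (ours, purely illustrative):

```python
def increment(bits):
    """bits[0] is the lowest-order bit; returns the number of cells written."""
    writes = 0
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0          # change the low-order 1-bits to 0 ...
        writes += 1
        i += 1
    if i == len(bits):
        bits.append(0)       # grow the counter by one cell if needed
    bits[i] = 1              # ... until a 0 is encountered; change it to 1
    return writes + 1

bits, total = [0], 0
n = 1024
for _ in range(n):
    total += increment(bits)
assert total < 2 * n         # amortized O(1) writes per increment
```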
Next the input head is reset. Now a simple algorithm utilizing a stack
to keep track of subscripts will generate φw. The maximal stack height is
h. Formula φw was defined in such a way that the information required
from the input tape can be read off from left to right as the algorithm
proceeds. Variable indices are easily computed from the stack height since
they are in unary.
This computation clearly uses just log space. We need to show that it
takes just linear time. The time required is less than a constant multiple of
the length of θh,0(x0, . . . , xh); the length of this formula is in turn less than
a constant multiple of the combined lengths of variable indices occurring
within it. By induction on i, variable xk occurs no more than 3 · 2^(i−k) times
in θi,j when k < i, and xi occurs just twice. Hence, the combined lengths
of variable indices occurring in θh,0 amounts to no more than
Thus, the computation requires just linear time. Notice that each variable
in φw is quantified just once so it is easily arranged that the same is true
of the resulting sentence.
before this formula. Here x0, x1, . . . , xm+2 are new free variables whose intended interpretations in 𝔄 are b0, b1, . . . , bm+2. Now existentially quantify
x0, x1, . . . , xm+2. There is a reset log-lin reduction that takes the resulting
formula to an equivalent prenex formula σw. (Since each variable in φ'w is
quantified just once the relativizations may be pushed inward. Then since
all of the explicit definitions are of fixed length, conversion to prenex form
is straightforward.)
If M accepts w, then φ'w is true in some model 𝔄' of power at most
It follows that σw is true in some model of power at most
This theorem has interesting implications for us. Let T(n) be a time
resource bound. Take T1(n) = T(dn), where 0 < d < 1, T2(n) = T(n), and
f(n) = n. The Seiferas–Fischer–Meyer theorem tells us that if T(dn + d) =
o(T(n)), then
NTIME(T(n)) − NTIME(T(dn)) ≠ ∅.
As before, inseparability results are closely related to satisfiability problems that are hard for certain complexity classes. The classes are of the
form
ATIME(T(cn),cn)
which is in many ways more natural than
ATIME(T(cn),n).
If there is a reset log-lin reduction from a problem Π to a problem in a class
of the first form, then we may conclude that Π is also in the class. We know of no
speed-up theorem for alternations, so we cannot make the same claim for
classes of the second form.
One of the main results of the section is Theorem 5.2, an analogue of
Theorem 4.1. We could prove this result along the same lines as Theorem 4.1, but we obtain a somewhat sharper result if we appeal to a result
of Lynch [1982] relating nondeterministic time classes to the spectra of
monadic second-order sentences. Lynch encodes Turing machine runs in a
way different from the classical method used in the last section. Rather
than explicitly accounting for symbols at each tape position and time in a
machine run, he keeps track of just the symbol changed (not its position),
the symbol which replaces it, and the direction of head movement at each
time. If the underlying models have enough structure, it is possible to
express derivability between instantaneous descriptions of nondeterministic
Turing machines with just this information. Lynch shows, in particular,
that this is the case if the underlying models have an addition relation
PLUS(x, y, z) which holds when x + y = z.
We begin, therefore, by considering the monadic second-order logic
ML+ whose vocabulary contains just a ternary relation symbol PLUS,
and M+, the monadic second-order theory of addition on initial segments
of the natural numbers. M+ can be axiomatized by a set of first-order
sentences. Explicitly define a relation x ≤ y by ∃z PLUS(x, z, y). Then M+
says that
be formulas indicating that ID(x, X) holds and the state for the instantaneous description represented by X is of the corresponding type.
Lynch [1982] shows that for each nondeterministic Turing machine M'
there is a monadic second-order formula ηM'(X, Y) that holds in (r, +)
precisely when X and Y represent instantaneous descriptions for M' and Y
can be obtained from X within r or fewer moves of M' by a computation in
which the head does not reach a tape position greater than or equal to r. We
can regard the alternating Turing machine M as a nondeterministic Turing
machine simply by ignoring state types. We also form the nondeterministic
Turing machine M' by eliminating transitions out of all states in M except
existential states and then ignoring state types, and M" by eliminating
transitions out of all states in M except universal states and then ignoring
state types. Let φ(x, X, Y) be the formula
Let φ∃(x, X, Y) be the formula
That is, φ(x, X, Y) expresses derivability between instantaneous descriptions on the interval [0, x]; φ∃(x, X, Y) (respectively φ∀(x, X, Y)) expresses the same
except that all states, excluding possibly the last, are existential (universal).
Notice, in particular, that φ(x, X, X) holds for all X. Let TERM∃(x, X)
be the formula
and TERM∀(x, X) be the formula
Let
Consider any weak model 𝔄 in which conditions (a)–(c) hold. For the moment, let us suppose that the element x from the universe of 𝔄 is finite, i.e., a finite distance from the least element 0. (By condition (a), this is a well-defined notion.) Condition (b) ensures that PLUS restricted to the interval [0, ..., x] in 𝔄 is the usual addition relation. Condition (c) ensures that quantification over subsets of {0, ..., x} in the weak model is quantification over all subsets of {0, ..., x}. Thus, for finite x, the three formulas defined above have interpretations in 𝔄 corresponding to computations of M as described above.
Now in 𝔄 consider x and X satisfying conditions (d) and (e). (We no
longer stipulate that x is finite.) Since all runs of M on input w halt, say
within k moves, we see that every such x is at most distance k from 0.
(Note that this is the case even if there is no y as described in (d).) Thus,
such an x will be finite and the observations of the previous paragraph
pertain.
Let (x, X) be the formula
Let
be the sentence
Proof. By the previous theorem we need only show that there is a formula (x, y, z) from ML0 such that for each finite ordinal n = {0, ..., n − 1} there is a binary relation R on n such that the formula defines an addition relation on n in the model 𝔄 = (n, R). Kaufmann and Shelah [1983] prove a much stronger result: there is a formula (x, y, z) such that for almost every binary relation R on n, the formula defines an addition relation on n, where 𝔄 = (n, R).
For the sake of completeness, we sketch a proof of the simpler result that there is a formula that codes an addition relation on some binary relation of each finite power.
First suppose that the vocabulary for the logic has three binary relation symbols P1, P2, P3, rather than just one, and that they interpret binary relations R1, R2, R3 on m. To simplify the proof we assume that m = r^3. It is easy to specify a formula (X) saying that the relations R1, R2, R3 restricted to m × X are functions, respectively denoted f1, f2, f3, and that (f1(x), f2(x), f3(x)) ranges over each triple in X^3 precisely once as x ranges over m. Thus |X| = r and we have defined a bijection between m and X^3. Since we can quantify over subsets of m, we can quantify over ternary relations on X when (X) holds. Therefore, we can, without much trouble, define an addition relation on X. Also, we can extend this relation to define addition modulo r. But then it is easy to define addition on m using the bijection between m and X^3.
Using the construction in the proof of Theorem 4.1, the three binary relations R1, R2, R3 on a set of size m = r^3 can be coded as a single binary relation on a set of size n = 3(m + 1). This set has three disjoint subsets of size m on which addition and addition modulo m can be coded. It is not difficult now to define addition on all of n. Thus, there is a binary relation on n from which an addition relation can be defined when n is of the form 3(r^3 + 1). With a little effort this construction can be made to work for arbitrary n.
∎
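The digit argument behind this coding can be checked concretely. The sketch below (the function names are ours, and it illustrates only the arithmetic, not the formula construction) identifies each element of m = r^3 with its base-r triple and recovers addition on m from digitwise addition modulo r with carries:

```python
def to_triple(a, r):
    """Bijection between m = r**3 and X^3: the base-r digits of a."""
    return (a % r, (a // r) % r, a // (r * r))

def from_triple(digits, r):
    """Inverse of to_triple."""
    x0, x1, x2 = digits
    return x0 + r * x1 + r * r * x2

def add_via_triples(a, b, r):
    """Add a and b digitwise, using only arithmetic modulo r plus carries;
    correct whenever a + b < r**3."""
    carry, digits = 0, []
    for d1, d2 in zip(to_triple(a, r), to_triple(b, r)):
        s = d1 + d2 + carry
        digits.append(s % r)   # addition modulo r on the digit set X
        carry = s // r         # carry detectable from comparison on X
    return from_triple(digits, r)
```

The point of the sketch is only that addition on r^3 elements is definable from addition modulo r on r elements together with the bijection, which is what the quantification over ternary relations on X provides.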
We now state an analogue of Corollary 4.3.
Corollary 5.3. Let T1(n) and T2(n) be time resource bounds such that ATIME(T2(n), n) − ATIME(T1(n), n) ≠ ∅. Suppose that lim n→∞ T1(n)/n = ∞. Then there is a constant c > 0 such that for each set Γ of satisfiable sentences with satT2(ML0) ⊆ Γ, Γ ∉ ATIME(T1(cn), cn).
The proof is the same as for Corollary 4.3. Note, however, that we rely on a result that says the linear speed-up theorem applies to alternating Turing machines. We must also use Theorem 3.6 to obtain a reset log-lin reduction from a prescribed set of sentences over ML0 to equivalent sentences in ML0.
We also have an analogue of Theorem 4.5.
Theorem 5.4. If T is a time resource bound such that for some d between 0 and 1, T(dn) = o(T(n)), then there is a constant c > 0 such that satT(ML0) and inv(ML0) are ATIME(T(cn), cn)-inseparable.
Proof. The proof is much simpler than that of Theorem 4.5. We can separate ATIME(T(n), n) and ATIME(T(dn), dn) using a straightforward diagonalization, so we do not appeal to the more difficult methods used in separating NTIME classes. Then we use the previous corollary to show that for some c > 0, if satT(ML0) ⊆ Γ and inv(ML0) ∩ Γ = ∅, then Γ ∉ ATIME(T(cn), cn). Since ATIME(T(cn), cn) is closed under complementation, we have that satT(ML0) and inv(ML0) are ATIME(T(cn), cn)-inseparable. ∎
Remark 5.5. By Theorem 5.1, Corollary 5.3 and Theorem 5.4 hold with satT(ML+) and inv(ML+) in place of satT(ML0) and inv(ML0).
It is also important to note that even though we reduced the prescribed sets of sentences in these theorems to equivalent sets of monadic second-order sentences, it is the prescribed sets which are used to obtain lower bound results. For example, in the proof of Theorem 5.4 we actually showed that there is a prescribed set Γ of sentences over ML0 such that satT(ML0) ∩ Γ and inv*(ML0) ∩ Γ are ATIME(T(cn), cn)-inseparable. In sections 6 and 7 we will find lower bounds for various theories from logics L by finding a reset log-lin reduction from Γ to Γ', a prescribed set of sentences over L, so that satT(ML0) ∩ Γ is mapped into sat*(L) ∩ Γ' and inv*(ML0) ∩ Γ is mapped into inv*(L) ∩ Γ'. Thus, for some c > 0, sat*(L) ∩ Γ' and inv*(L) ∩ Γ' are ATIME(T(cn), cn)-inseparable. Then, by Theorem 3.6, sat(L) and inv(L) are ATIME(T(cn), cn)-inseparable.
We present several useful tools for establishing NTIME lower bounds for
theories by interpreting models from classes of known complexity. We
begin with some definitions regarding interpretations of classes of models
and give a general outline of how interpretations are used to obtain lower
bounds. Theorem 6.2, a specific instance of the method, follows from the
results in section 4. It tells how to obtain lower bounds by interpreting
binary relations. We then show in Theorem 6.3 how to interpret binary
relations in finite trees of bounded height. As a consequence we obtain
hereditary lower bounds for theories of finite trees of bounded height and
a tool for obtaining further lower bounds by interpreting classes of these
trees in other theories. We obtain similar results for classes of finite trees
of unbounded height in Theorem 6.7 and its corollaries.
Let a theory in a logic L' be given, and let C0, C1, C2, ... be classes of models for a logic L whose vocabulary consists of relation symbols P1, ..., Pk. Let x1, ..., xk be sequences of distinct variables with the length of xi equal to the arity of Pi. Suppose that there are formulas δn(x, u), λ1n(x1, u), ..., λkn(xk, u) from L' which are reset log-lin computable from n (expressed in unary notation) so that for each 𝔄 ∈ Cn we have a model 𝔄' of the theory and elements m in 𝔄' with
case we must say whether we intend the classes C'n to be models for a first-order or for a monadic second-order logic. Thus, we will say that there is an interpretation of the classes Cn in the first-order (or monadic second-order) classes C'n.
Interpretations, inseparability, and prescribed sets are the cornerstones of our method. Suppose we have an interpretation of the classes Cn in classes C'kn for some nonnegative integer k. (In most cases k is 1, but occasionally we need a larger value.) Suppose also that there is a prescribed set of formulas over L such that

{φ : φ is realized in some 𝔄 ∈ Cn with |𝔄| = n}, where n = |φ|.

By adding dummy quantifiers in the right places we can ensure that |φ'| = kn. If φ is realized in some model in Cn, then φ' is realized in some model in C'kn. If φ is true in no model, then φ' is true in no model.
(There is a minor point which should be addressed here. To be completely rigorous we should require for all models 𝔄' that δn(x) define a nonempty set, since certain formulas in inv*(L) may become true when relativized to an empty relation. For example, consider the sentence ∀x (x ≠ x). We can always meet this requirement by replacing δn(x) with the formula δn(x) ∨ ∀x ¬δn(x), so we can ignore this point in subsequent discussions.) When we have a prenex interpretation or an iterative interpretation and the definitions in φ' are replaced by the appropriate iterative definitions, the sentences all belong to some prescribed set of formulas over L'. (If the interpretation is iterative, then by definition the parameter sequence cannot grow with n.) We have now that

{φ' : φ' is realized in some 𝔄' ∈ C'n with |𝔄'| = n}
the formulas having the same sort of restrictions as δn and λ1n, ..., λkn. In this case we would have
Within this framework we can also accommodate the more general kind of interpretation in which the domain of the interpreted model is not a subset of 𝔄', but a set of k-tuples from 𝔄', and the equality relation is interpreted by an equivalence relation definable in 𝔄'. We have found this to be necessary for only two theories treated here, and so we have avoided stating these definitions in the fullest generality. However, it would not have been difficult to introduce these features explicitly. (See Examples 8.12 and 8.14.)
Although we have emphasized inseparability results, we should not lose sight of the fact that the starting point for our reductions, Theorem 4.1, is a hardness result: every set Γ of satisfiable sentences with satT(L0) ⊆ Γ is hard for the complexity classes

⋃c>0 NTIME(T(cn)) and ⋃c>0 NTIME(T(n^c))

via polynomial time reductions. Thus, all our inseparability results can be reformulated as hardness results. We summarize the previous discussion and make this point precise in the following theorem:
Theorem 6.1. Let C0, C1, C2, ... be classes of models such that for some prescribed set of formulas over a first-order logic L,

{φ : φ is realized in some 𝔄 ∈ Cn with |𝔄| = n}

is hard for ⋃c>0 NTIME(T(n^c)), and suppose there is an interpretation of the classes Cn in classes C'n. Then

{φ' : φ' is realized in some 𝔄' ∈ C'n with |𝔄'| = n}
and ti = tj if and only if ui = uj holds for 1 ≤ i, j ≤ m. Using the same argument as
above we can say that the formula m(x, y) is a prenex formula which is reset log-lin computable from n and equivalent to 'm(x, y) ∧ 'm(y, x), and similarly for the other formula.
Since these formulas define equivalence relations on the set of vertices of height 2, we can speak of the two corresponding types of a vertex x of this height. Define νi(x) to be the minimum of m and the number of children of x with precisely i children. Define ν+i(x) to be the minimum of m and the number of children of x with at least i children. The first type of x is precisely determined by the values ν1(x), ..., νm(x), and the second by the values νm+1(x), ..., ν2m−1(x), ν+2m(x). We see then that the two types of a vertex are independent, and that there are m^(m+1) types of one kind and the same number of the other.
Let δn(x) be a prenex formula that says x is a child of the root and ν0(x) ≠ 0. Let λn(x, y) be a prenex formula that says δn(x) and δn(y) hold and there is a child z of the root such that ν0(z) ≥ 1 and x, z are equivalent under the first relation and z, y under the second.
It is easy to arrange that δn(x) and λn(x, y) are reset log-lin computable from n. Each binary relation on a set of size at most 2^n is isomorphic to one interpreted in this way in some tree of the class.
Corollary 6.4. Let r ≥ 3. The theory of finite trees of height r has a hereditary NTIME(expr−2(cn)) lower bound.
Corollary 6.5. Let r ≥ 3 and let a theory in a logic L be given. If there is an interpretation of the classes Trn/log n in the theory, then it has a hereditary NTIME(expr−2(cn)) lower bound.
Remark 6.6. For each r ≥ 3 there is a constant d > 0 such that every tree in Trn/log n has at most expr−2(dn) vertices. Hence, we can view Corollary 6.5 as a significant improvement over Theorem 6.2 for obtaining NTIME(expr−2(cn)) lower bounds: rather than interpreting all binary relations on sets of size expr−2(cn) we need only interpret all trees of height r on sets of this size. In applications it is often much more natural to interpret trees than binary relations. See also Theorems 7.5 and 7.9.
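The bounds throughout use the iterated-exponential notation exp_r. A minimal sketch of the convention assumed here (base 2, with exp_0(n) = n; the function name is ours):

```python
def exp_tower(r, n):
    """Iterated exponential: exp_0(n) = n and exp_r(n) = 2 ** exp_{r-1}(n).
    So exp_1(n) = 2**n, exp_2(n) = 2**2**n, and so on."""
    for _ in range(r):
        n = 2 ** n
    return n
```

For instance, with this convention the bound expr−2(dn) for r = 4 is doubly exponential in n, and for r = 3 it is singly exponential.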
We next prove results similar to Theorem 6.3 and Corollaries 6.4 and 6.5
for finite trees of unbounded height.
Theorem 6.7. Let Cn be the class of binary relations on a set of size exp∞(n − 3). Then there is a prenex interpretation of the classes Cn in the first-order classes T2n.
Proof. Recall from the proof of Theorem 6.3 the formula which defines an equivalence relation on the set of all vertices in trees of depth r except the root. In that proof r was fixed, so we could assume that the formula was in prenex form, and m increased with n. In this proof we fix m = 2 and
lower bound.
The theorems in this section are counterparts of those in the last section. In
order to obtain linear alternating time lower bounds for logical theories we
must introduce a stronger form of interpretability which we call monadic
interpretability. Theorems 7.2 and 7.3 tell how to obtain lower bounds
by monadic interpretation of addition relations and binary relations. We
then show, in Theorems 7.4 and 7.7, that binary relations have monadic
interpretations in certain classes of trees of bounded height. From these
results we obtain useful tools for establishing linear ATIME lower bounds,
and lower bounds for monadic second-order theories of trees of bounded
height.
Suppose a theory in a logic L' is given and C0, C1, C2, ... are classes of models for a monadic second-order logic ML whose vocabulary consists of relation symbols P1, ..., Pk. Suppose that there are formulas δn(x, u), ..., reset log-lin computable from n, so that for each 𝔄 ∈ Cn there is a model 𝔄' of the theory and elements m in 𝔄' with the interpreted structure isomorphic to 𝔄, and with the interpreted sets ranging over all subsets of δn(x, m) as p ranges over 𝔄'. The parameter sequence u is allowed to grow as a function of n but t must remain fixed.
The sequence {In | n ≥ 0}, where

In =
{φ : φ is realized in some 𝔄 ∈ Cn with |𝔄| = n}

{φ' : φ' is realized in some 𝔄' ∈ C'n with |𝔄'| = n}
Theorem 7.1. Let C0, C1, C2, ... be classes of models such that for some prescribed set of formulas over a monadic second-order logic L,

{φ : φ is realized in some 𝔄 ∈ Cn with |𝔄| = n}

{φ' : φ' is realized in some 𝔄' ∈ C'n with |𝔄'| = n}
Q' will be given by a prenex definition. Q'(x, y) is obviously an equivalence relation on the vertices of height 0.
Now let (x, y) be a formula (with free relation variable Q) that says
x is not a leaf and for every child t of x there is a child u of y such that
Q(t,u) holds. Now the iterative definition
where rm(x, y) is reset log-lin computable from n. We will say that two vertices x and y in a tree of height r have the same rm-type if rm(x, y) holds.
Now it is easy to show by induction on k that there is a tree of height r and sets X1, ..., Xm of leaves in this tree such that there are at least 1 + expk+1([n/log n]) rm-types among vertices of height k when k < r: every nonempty set of rm-types of vertices of height k − 1 determines a distinct rm-type for a vertex of height k. More is required to see that there is such a tree in the class in question. When k = 0 there is no problem. Consider the case k = 1. For i = 1, 2, ..., exp2([n/log n]) let Ti be the tree of height
This concludes our survey of tools for establishing lower bounds. The
next section contains many examples of their application.
8
Applications
Then the iterative definition defines a relation Q(x, y) which holds precisely when the number of consecutive 0's preceding position x and the number of consecutive 0's preceding position y are equal and at most n. Let λn(x, y) be the formula

Thus, λn(x, y) holds precisely when there are 1's at positions x and y, x precedes y, there is exactly one more 0 preceding x than preceding y (but no more than n 0's preceding y), and there is no position between x and y which has as many 0's preceding it as x has. Now a tree T of height n represented by a linear order is isomorphic to the interpreted structure when Q is given by the iterative definition above.
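The counting conditions used here are easy to check against a concrete {0,1}-word. In the sketch below, `zeros_preceding` and `edge` are hypothetical helper names of ours, and `edge` tests the stated conditions directly rather than via the iterative definition:

```python
def zeros_preceding(s, i):
    """Number of consecutive 0's in the word s immediately preceding position i."""
    c = 0
    while i - c - 1 >= 0 and s[i - c - 1] == '0':
        c += 1
    return c

def edge(s, x, y, n):
    """The stated condition: 1's at positions x and y, x precedes y, exactly
    one more 0 precedes x than precedes y (at most n 0's preceding y), and no
    position strictly between x and y has as many preceding 0's as x has."""
    zx, zy = zeros_preceding(s, x), zeros_preceding(s, y)
    return (s[x] == '1' and s[y] == '1' and x < y
            and zx == zy + 1 and zy <= n
            and all(zeros_preceding(s, z) != zx for z in range(x + 1, y)))
```

With the word "10110", for example, position 2 (one preceding 0) is connected to position 3 (no preceding 0's), matching the intended child-to-parent reading of the encoding.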
Remark 8.2. Note that for some d > 0 each tree in T2n has at most exp∞(dn) vertices, so for some d' > 0 each representation of a tree in this class has at most exp∞(d'n) elements. From Theorem 6.1 we see that we have a tool for obtaining further lower bounds. If there is a prenex or iterative interpretation of the classes Cn consisting of linear orders of length at most exp∞(n) with added unary predicates in a theory, then that theory has a hereditary lower bound of NTIME(exp∞(cn)).
λ(x, y, w),
includes all models (n, ≤, R) in Example 8.1 as w ranges over {0,1}*. The formula δ(x, w) is x ≤ w ∧ x ≠ w. The formula λ(x, y, w) is
w.
for these theories were first given by Rackoff [1975b] (see Ferrante and
Rackoff [1979]). Their treatment shows that these theories are in fact
hereditarily not elementary recursive. (To obtain this from the result stated
by Ferrante and Rackoff, we must use the fact that the theory of pairing
functions is finitely axiomatizable.)
Example 8.8 (The theory of any pairing function). A pairing function is a model 𝔅 = (B, f) where f is a one-to-one binary function on B. We show that the theory of any pairing function has a hereditary lower bound of NTIME(exp∞(cn)) by iteratively interpreting the classes Cn of linear orders of length exp∞(n) with an added unary predicate. These classes were discussed in Example 8.1.
The idea behind the interpretation is to have elements of 𝔅 represent sequences of length exp∞(n). One way to do this, say for an element a, is to find a pair (a0, a1) such that f(a0, a1) = a. (There is at most one such pair, since f is a pairing function.) Then find a quadruple (a00, a01, a10, a11) such that f(a00, a01) = a0 and f(a10, a11) = a1, and repeat until we have a sequence (aw : w ∈ {0,1}^m), where m = exp∞(n − 1). (The order on {0,1}^m is the lexicographic order.) We could then say that a represents this sequence. We may not be able to carry out this construction for every element a because f may not be onto, but certainly every sequence of length exp∞(n) is represented by some element, and every element represents at most one sequence of length exp∞(n).
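Since f is an arbitrary pairing function here, any concrete one-to-one pairing can be used to experiment with the basic (unmodified) construction. The sketch below uses the Cantor pairing on the natural numbers and keeps the number of unpairing rounds small rather than exp∞(n − 1); all function names are ours:

```python
def pair(x, y):
    """Cantor pairing: a concrete one-to-one binary function on N."""
    return (x + y) * (x + y + 1) // 2 + x

def unpair(z):
    """Inverse of pair: the unique (x, y) with pair(x, y) == z."""
    w = int(((8 * z + 1) ** 0.5 - 1) / 2)
    while (w + 1) * (w + 2) // 2 <= z:   # guard against float rounding
        w += 1
    while w * (w + 1) // 2 > z:
        w -= 1
    x = z - w * (w + 1) // 2
    return x, w - x

def represented_sequence(a, m):
    """The sequence (a_w : w in {0,1}^m), in lexicographic order of w,
    obtained by unpairing a repeatedly, m times."""
    level = [a]
    for _ in range(m):
        level = [part for v in level for part in unpair(v)]
    return level
```

Because the Cantor pairing is onto, every element decodes to a sequence here; the text's caveat about f not being onto applies to arbitrary pairing functions.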
We may regard this construction as giving a labelling of vertices in the full binary ordered tree of height m. The root is labelled a, its left and right children are labelled a0 and a1, respectively, and so on. Unfortunately, the construction has two limitations that make it unacceptable for our purposes: a branch in the tree may have several vertices with the same label; and two different branches may be labelled identically. We must modify the construction to overcome these difficulties. Rather than taking aw0 and aw1 so that f(aw0, aw1) = aw, we will take b and c so that f(b, c) = aw and then take aw0 and aw1 so that f(aw0, aw1) = c. That is, aw0 and aw1 are chosen so that ∃x f(x, f(aw0, aw1)) = aw holds. Clearly, aw0 and aw1 are uniquely determined by aw. It is still true that every element represents at most one sequence of length exp∞(n), but now a sequence of length exp∞(n) may be represented by many elements.
We claim that with this modified construction, every sequence of length exp∞(n) is represented by some element such that no branch in the associated tree has duplicate labels and no two branches are labelled identically.
Consider a sequence (aw : w ∈ {0,1}^m)
that hold under the same conditions as δn(x, u), λn(x, y, u), and the formula with arguments (x, x', u, u'), only for sequences of length exp∞(n − 1) rather than exp∞(n). (If n = 0, take Q0, Q1, and Q2 to be empty relations.)
Suppose n > 0. Using Q0 and Q1, we can say, in regard to a sequence of distinct elements of length exp∞(n − 1) represented by v, that a particular element is first; that a particular element is last; that one element occurs before another; and that one element occurs immediately before another. Hence, we can say that v represents a branch of length exp∞(n − 1) from an element u. By this we mean v represents a sequence of distinct elements of length exp∞(n − 1) such that when x is the first element of the sequence, SUCC(u, x) holds, and when x occurs immediately before y in the sequence, SUCC(x, y) holds.
Now define formulas δ, λ, and α in terms of Q0, Q1, and Q2. The formula δ(x, u) says that if Q0 is empty, then x = u, and if Q0 is not empty, the following hold.
(a) There is an element v representing a branch of length exp∞(n − 1)
the lower bounds obtained here hold if only the connectives ∧, ∨, and ¬ are allowed in formulas. However, with slightly more effort we can overcome this difficulty and use only formulas where the defined relation symbols occur only positively. We have not done so, to simplify the exposition.
Remark 8.9. It is a long-standing open question whether the first-order theory of the free group Fk on k ≥ 2 generators is decidable. Semenov [1980] observed that this theory is at least not elementary recursive. He showed that it is possible to give a first-order definition of a pairing function on Fk, from which it follows that the theory of Fk has a hereditary NTIME(exp∞(cn)) lower bound.
This use of pairing functions is a quick way to show that many decidable theories are hereditarily not elementary recursive. For example, consider the first-order theory of the model (N, +, 2^x), where N is the set of nonnegative integers. Semenov [1984] showed that this theory is decidable. Observe that the function
is not difficult to see that for some d, the trees in T3n/log n are dn
Thus, the formula holds of x precisely when there are at least two cycles with the same length as the cycle containing x. Let αn(x, y) be the formula

We see that αn(x, y) holds precisely when x lies on a cycle of length one less than the cycle containing y. These formulas interpret successor relations on sets of size at most 2^n with an added unary predicate. The interpretation differs from interpretations discussed previously, however, in that we must take a quotient by the equivalence relation, rather than interpret the domain as a subset.
Remark 8.13. The theories in Examples 8.10 and 8.12 are treated by Ferrante and Rackoff [1979]. They give a matching upper bound for the former and an upper bound of NTIME(2^(dn^2)) for the latter. (Example 8.12 first appeared in Ferrante [1974].) The inseparability, hereditary, and hardness results presented here are new. Stern [1988] investigated the complexity of the theory of commuting permutations.
Define λ″n(x, y, u) to be the formula

The first disjunct gives an edge from 1, the root of the tree, to the root of each primary subtree; the second disjunct gives the edges in the primary subtrees. Using δ″n and λ″n we can interpret each tree from the class by taking a disjoint collection of trees in the smaller class and 1 as a new root; we need only choose u suitably. Define α″n(x, u, v, w) to be the formula

By varying v and w we obtain all subsets of δ″n(x, u), so we have a monadic interpretation.
Remark 8.25. An upper bound of ATIME(exp3(dn), dn) for the first-order theory of integer multiplication can be obtained from the treatment of this theory in Ferrante and Rackoff [1979]. The original reference for an upper bound on the first-order theory of integer multiplication is Rackoff [1975b; 1976]. Decidability results and complexity bounds for related theories have been given by Maurin [1997b] and Michel [1981; 1992].
As our last example we consider the first-order theory of finite Abelian groups. Lo [1988] has given an extensive treatment of upper bounds for theories of Abelian groups. He states these bounds in terms of the classes SPACE, but it is clear that his analysis gives ATIME(2^(2^dn), dn) upper bounds. We derive a matching lower bound, not just for the theory of finite Abelian groups, but also for the theory of finite cyclic groups.
Example 8.26 (The first-order theory of finite cyclic groups). We
Let C(l) be the cyclic group of order l. We know from the remarks following Example 8.22 that there is a d > 0 such that when l > 2^(2^dn), there is a monadic iterative interpretation of T2n in C(l). More precisely, there are formulas δn(x, t', u'), λn(x, y, t', u'), and αn(x, t', u', v') given by iterative definitions such that if u' is a generator of C(l), then each tree in the class is isomorphic to (δn(x, t', u'), λn(x, y, t', u')) for some t' in C(l), and the interpreted sets include all subsets of δn(x, t', u') as v' ranges over elements of C(l). (We need to mention the generator u' of C(l) explicitly in these formulas because there is no preferred generator; this necessitates only a minor modification of the formulas in Example 8.20.)
Let p1, p2, ..., pk be the prime numbers less than 2^(2^(dn+1)) and mi the largest power of pi less than 2^(2^(dn+1)). Then since pi·mi ≥ 2^(2^(dn+1)) and pi ≤ mi, we know that mi ≥ (2^(2^(dn+1)))^(1/2) = 2^(2^dn), so that there is a monadic iterative interpretation of the classes in each C(mi) as described above.
By Chebyshev's theorem (Theorem 7 of Hardy and Wright [1964]) we know that k, the number of primes less than 2^(2^(dn+1)), is at least 2^(2^dn) for sufficiently large n. By taking d large enough, we can ensure that k is greater than the maximum number of primary subtrees in each tree of the class.
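For small bounds the prime-power step is easy to verify directly: for each prime p below the bound, take the largest power of p below the bound; since multiplying that power by p reaches the bound and p is no larger than the power, each such power is at least the square root of the bound. A sketch (our own helper, using a simple sieve):

```python
def maximal_prime_powers(bound):
    """For each prime p < bound, the largest power of p that is < bound."""
    sieve = [True] * bound
    out = {}
    for p in range(2, bound):
        if sieve[p]:
            for q in range(p * p, bound, p):   # strike out composites
                sieve[q] = False
            m = p
            while m * p < bound:               # grow to the largest power < bound
                m *= p
            out[p] = m
    return out
```

For bound 10 this yields {2: 8, 3: 9, 5: 5, 7: 7}, and each value is indeed at least √10, as the argument requires.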
Now take m = m1·m2···mk, so that C(m) ≅ C(m1) ⊕ C(m2) ⊕ ··· ⊕ C(mk). We see that m ≥ (2^(2^dn))^k ≥ 2^(2^dn). We need to show that we can combine the monadic iterative interpretations in the direct summands C(mi) to obtain a monadic iterative interpretation of T32n in C(m). To do this, we will show that we can define the decomposition of C(m) into the factor subgroups C(mi).
Let θ(x, y, t, u) be the following formula with free relation variable Q:
Thus, we can define the decomposition of C(m) into its factor subgroups. In particular, since u can be expressed as a sum u1 + u2 + ··· + uk, where ui is a generator of C(mi), the formula picks a unique generator for each factor subgroup as s ranges over maximal prime powers.
The rest of the proof proceeds as in Example 8.24. For example, we form δ'n(x, s, t', u', u) by substituting α(x, y, z, s, u) for each occurrence of x + y = z in δn(x, t', u'). Then δ″n(x, t, u) is the formula
9
Upper bounds
In this section we give upper bounds showing that most of the lower bounds obtained in sections 4–7 are best possible.
First we give upper bounds for satT(L0), satpT(L0), and sat*T(L0). Recall from Theorem 4.5 that when T(dn) = o(T(n)) for some d between 0 and 1, these sets are not in NTIME(T(cn)) for some c > 0.
Proof. To determine whether a sentence from L0 is in satT(L0), nondeterministically generate a finite binary relation. We give a nondeterministic recursive procedure that determines whether the sentence is true in this relation. If the sentence is in satT(L0), then it holds in some binary relation on a set of size at most T(n). The representation of this relation requires at most T(n)^2 bits. We will show that our recursive procedure halts within time cT(n)^(n+2) on this relation.
The procedure tests the subformulas of the sentence and combines results to produce an answer. We may assume that all negations have been pushed inward so that only atomic formulas are negated. It is clear how the procedure works if the formula is a conjunction or disjunction. If it begins with an existential quantifier, an element of the domain is nondeterministically assigned as the value of the quantified variable and the enclosed formula is checked. If it begins with a universal quantifier, then each element of the domain is assigned in turn to the quantified variable and the enclosed formula is checked. When an atomic formula is reached it can be determined in time O(T(n)^2) whether it is true for the assignment values at that point. Since, for each of at most n universal quantifiers, T(n) values are generated, the total time is O(T(n)^2 T(n)^n), as claimed.
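A deterministic analogue of this recursive procedure is easy to write down. In the sketch below (the formula encoding and names are ours), the nondeterministic assignment for an existential quantifier becomes a search, and a universal quantifier tries each element in turn, exactly as described:

```python
def holds(phi, R, N, env=None):
    """Evaluate a first-order sentence over the structure ({0,...,N-1}, R).
    Formulas are nested tuples: ('R', 'x', 'y'), ('not', p), ('and', p, q),
    ('or', p, q), ('exists', 'x', p), ('forall', 'x', p)."""
    env = env or {}
    op = phi[0]
    if op == 'R':                      # atomic formula: look up the relation
        return (env[phi[1]], env[phi[2]]) in R
    if op == 'not':
        return not holds(phi[1], R, N, env)
    if op == 'and':
        return all(holds(p, R, N, env) for p in phi[1:])
    if op == 'or':
        return any(holds(p, R, N, env) for p in phi[1:])
    if op == 'exists':                 # nondeterministic guess -> search
        return any(holds(phi[2], R, N, {**env, phi[1]: a}) for a in range(N))
    if op == 'forall':                 # each element assigned in turn
        return all(holds(phi[2], R, N, {**env, phi[1]: a}) for a in range(N))
    raise ValueError(op)
```

The deterministic search over existential witnesses is what costs the extra factor relative to the nondeterministic procedure analysed in the proof; the universal branching is the same in both.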
Suppose the sentence is in prenex normal form. For each ε > 0, whenever n is sufficiently large there are at most (1 + ε)n/log n universal quantifiers in it, so determining whether a prenex formula is in satpT(L0) is in

The same procedure is then used, except that when a relation variable is encountered, it is necessary to jump to its definition (this may take n moves), compute its value by calling our recursive procedure, and return. Total time, then, is O(nT(n)^2 T(n)^n) because the tree of recursive procedure calls has height at most n and branches at most n times at each vertex; at the leaves, there is a cost of T(n)^2 moves to evaluate atomic formulas; at the vertices corresponding to relation variable references there is a cost of O(n) moves to find the definition.
For all three bounds we must use the linear speed-up theorem (see Hopcroft and Ullman [1979]) to eliminate constants in front of the time bounds. ∎
We see that if T(n)^(n+2) = O(T(dn)) for some d > 0, then satT(L0) ∈ NTIME(T(dn)), so we have essentially the same upper and lower bounds. Similar remarks pertain in the other cases.
Proof. Given a sentence of length n in ML0, nondeterministically generate a binary relation. We use alternation to determine whether the sentence holds in the relation. If the sentence is in satT(ML0), then it holds in some binary relation on a set of size at most T(n). For each set quantifier encountered it is necessary to generate T(n) bits to assign a value to the quantified variable. There are O(n) such variables, so this part of the computation takes time O(nT(n)). This time is dominated by the O(T(n)^2) time needed to generate and verify atomic formulas. For a sentence with relation variables, we use the same procedure except that when a subformula with a relation variable is encountered, the value of the subformula is guessed and verified using alternation. ∎
Notice that if
any time, it is easy to see that this strategy can always be carried out for
A and B satisfying the hypotheses of the lemma.
∎
Theorem 9.5. Given a finite tree A of height at most r, there is a tree B ∈ Trm such that A ≡ B for all n ≥ 0.
Proof. Modify A in the following manner. For each nonleaf vertex x of depth r − 1, consider all children y of x and subtrees Ay; for each isomorphism type, if more than m subtrees Ay are isomorphic, delete enough of them so that there are precisely m. Continue this modification procedure for vertices of depth r − 2, r − 3, and so on, up to the root. Call the resulting tree B. It is clear from the two preceding lemmas that every time we delete subtrees in this process, we obtain a tree in the same ≡-class as A. Thus, A ≡ B. It is evident that B ∈ Trm.
Remark 9.6. With slight modifications, this proof shows that for any tree A of height r or less, there is a tree B ∈ T such that A ≡ B for all m > 0.
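The modification procedure in the proof of Theorem 9.5 can be sketched directly. Below, trees are encoded as tuples of subtrees (a leaf is the empty tuple) and returned in a canonical sorted form, so that tuple equality coincides with isomorphism; the encoding is ours:

```python
def prune(tree, m):
    """Bottom-up: at every vertex keep at most m children of each
    isomorphism type. Children are pruned first, then canonically
    sorted, so two subtrees are isomorphic iff their pruned forms
    are equal tuples."""
    kids = sorted(prune(child, m) for child in tree)
    kept, count = [], {}
    for c in kids:
        if count.get(c, 0) < m:   # c is canonical: equality = isomorphism
            kept.append(c)
            count[c] = count.get(c, 0) + 1
    return tuple(kept)
```

A root with five isomorphic leaf children pruned with m = 2 keeps exactly two of them, while non-isomorphic children are all retained, mirroring the deletion step in the proof.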
We have, as an immediate consequence of the preceding theorem, upper
bounds for theories of finite trees.
Corollary 9.7.
each tree in Trm has at most expr−2(cn) vertices. Thus, the time required to nondeterministically generate a tree in Trm and determine whether a sentence of length n holds in this tree is expr−2(cn)^n. This function is dominated by expr−2(dn) for some d > 0 when r > 3 and by expr−2(dn^2) when r = 3. ∎
Remark 9.8. This theorem gives matching upper bounds for the lower bounds obtained in Corollary 6.4 except for the case r = 3. There the lower bound is NTIME(2^(cn)) and the upper bound is NTIME(2^(dn^2)). Ferrante and Rackoff [1979] get precisely the same bounds for the theory of one-to-one functions (cf. Example 8.12). It is more satisfying to say that sat(E3) is a complete problem for
least integer such that m log m > n. (Recall that by Theorem 9.5, for every finite tree A of height at most r, there is a tree B ∈ Trm such that A ≡ B.) Every first-order theory with a model of power greater than 1 is hard for PSPACE (via log-space reductions), so we know fairly precisely the complexity of these theories.
Write A ≡mn B to indicate that A and B satisfy the same monadic second-order sentences of quantifier rank n with at most m variables. To obtain upper bounds for monadic second-order theories of finite trees we must introduce Ehrenfeucht games characterizing the relation ≡mn.
In such a game, players I and II play for n moves on structures A and B. Let P1, P2, ..., Pm be unary relation symbols not in the language of A and B. During each move of the game one of the symbols Pi will be assigned a pair of sets. This pair contains a subset of A and a subset of B. Initially, each of these symbols is assigned the empty set for A and the empty set for B. On each move player I picks a relation symbol Pi. The previous assignment to Pi is forgotten. Player I assigns a subset of A (or B) to Pi. Player II responds by assigning a subset of B (or A, respectively) to Pi. Whenever player I picks a singleton set, player II must respond with a singleton set. (Singleton set moves correspond to element quantifiers.) Now suppose that at the end of the game symbols P1, P2, ..., Pm are assigned subsets R1, R2, ..., Rm of A and subsets S1, S2, ..., Sm of B. If the set
Proof. It is easy to show by induction that for each r > 1 there is a c > 0 such that

We see that for each r > 2 there is a c > 0 such that h(m, n, r) < expr(c(m + log n)). When r > 2 we can determine if a sentence from MLt is in sat(MEr) by nondeterministically generating a tree in Trm, where m log m > n, and using alternation to verify that the sentence holds in this tree. This can be done in ATIME(expr(dn/log n), n).
When
10 Open problems
The first-order theory of finite fields, and several related theories, were
shown to be decidable by Ax [1968] in a paper which has proved to be
Acknowledgements
The authors wish to thank Elsevier Scientific for permission to publish an
updated, revised version of the paper [Compton and Henson, 1990].
We would like to thank the referee of Compton and Henson [1990] for
his extraordinarily careful reading of this chapter and helpful comments.
References
[Ax, 1968] J. Ax. The elementary theory of finite fields, Ann. Math., 88,
239-371, 1968.
[Ax and Kochen, 1966] J. Ax and S. Kochen. Diophantine problems over
local fields III: decidable fields, Ann. Math., 83, 437-456, 1966.
[Barwise, 1977] J. Barwise. On Moschovakis closure ordinals, J. Symbolic
Logic, 42, 292-296, 1977.
[Compton and Henson, 1990] K. J. Compton and C. W. Henson. A uniform method for proving lower bounds on the complexity of logical theories, Ann. Pure Appl. Logic, 48, 1-79, 1990.
[Compton et al., 1987] K. J. Compton, C. W. Henson, and S. Shelah. Nonconvergence, undecidability, and intractability in asymptotic problems.
Ann. Pure Appl. Logic, 36, 207-224, 1987.
[Duret, 1980] J.-L. Duret. Les corps pseudo-finis ont la propriété
d'indépendance. C. R. Acad. Sci. Paris Sér. A-B, 290, A981-A983,
1980.
[Egidi, 1993] L. Egidi. The complexity of the theory of p-adic numbers. In
Proceedings of the 34th Annual Symposium on Foundations of Computer
Science, Palo Alto, CA, pp. 412-421. IEEE Computer Society Press, Los
Alamitos, CA, 1993.
[Emerson and Halpern, 1985] E. A. Emerson and J. Halpern. Decision procedures and expressiveness in the temporal logic of branching time, J.
Comput. System Sci., 30, 1-24, 1985.
[Ersov, 1964] Y. Ersov. Decidability of the elementary theory of relatively
complemented distributive lattices and the theory of filters, Algebra i
Logika, 3, 17-38, 1964. (In Russian.)
[Ersov, 1965] Y. Ersov. On the elementary theory of maximal normed
fields, Soviet Math. Dokl., 6, 1390-1393, 1965. (English translation.)
[Ersov et al., 1965] Y. Ersov, I. A. Lavrov, A. D. Taimanov, and M. A.
Taitslin. Elementary theories, Russian Math. Surveys, 20, 35-100, 1965.
(English translation.)
[Fagin et al., 1995] R. Fagin, J. Halpern, Y. Moses, and M. Vardi. Reasoning about Knowledge. MIT Press, Cambridge, MA, 1995.
[Feferman and Vaught, 1959] S. Feferman and R. Vaught. The first-order
properties of products of algebraic systems, Fund. Math., 47, 57-103,
1959.
[Ferrante, 1974] J. Ferrante. Some upper and lower bounds on decision procedures in logic, Doctoral thesis, Massachusetts Institute of Technology,
Cambridge, MA, 1974.
[Ferrante and Rackoff, 1979] J. Ferrante and C. Rackoff. The Computational Complexity of Logical Theories, Lecture Notes in Math. 718.
Springer-Verlag, Berlin, 1979.
[Fischer and Ladner, 1979] M. J. Fischer and R. Ladner. Propositional dynamic logic of regular programs, J. Comput. System Sci., 18, 194-211,
1979.
[Fischer and Rabin, 1974] M. J. Fischer and M. Rabin. Super-exponential
complexity of Presburger arithmetic. In Complexity of Computation, R.
M. Karp, ed. SIAM-AMS Proc., vol VII, pp. 27-42. American Mathematical Society, Providence, RI, 1974.
[Fleischmann et al., 1977] K. Fleischmann, B. Mahr, and D. Siefkes.
[Stockmeyer, 1974] L. Stockmeyer. The complexity of decision problems in
automata and logic, Doctoral thesis, Massachusetts Institute of Technology, Cambridge, MA, 1974.
[Stockmeyer, 1977] L. Stockmeyer. The polynomial-time hierarchy, Theoret. Comp. Sci., 3, 1-22, 1977.
[Stockmeyer, 1987] L. Stockmeyer. Classifying the computational complexity of problems, J. Symbolic Logic , 52, 1-43, 1987.
[Tetruashvili, 1984] M. Tetruashvili. The computational complexity of the
theory of abelian groups with a given number of generators. In Frege Conference, 1984 (Schwerin, 1984), pp. 371-375. Akademie-Verlag, Berlin,
1984.
[Touraille, 1985] A. Touraille. Élimination des quantificateurs dans la
théorie élémentaire des algèbres de Boole munies d'une famille d'idéaux
distingués, C. R. Acad. Sci. Paris, Sér. I, 300, 125-128, 1985.
[Trakhtenbrot, 1950] B. A. Trakhtenbrot. The impossibility of an algorithm for the decision problem for finite models, Dokl. Akad. Nauk SSSR,
70, 569-572, 1950.
[Turing, 1937] A. M. Turing. On computable numbers, with an application
to the Entscheidungsproblem, Proc. London Math. Soc. (2), 42, 230-265,
1937. Correction, ibid., 43, 544-546, 1937.
[Vardi, 1997] M. Vardi. Why is modal logic so robustly decidable? In Descriptive Complexity and Finite Models (Princeton, NJ), pp. 149-183.
Amer. Math. Soc., Providence, RI, 1997.
[Vaught, 1960] R. Vaught. Sentences true in all constructive models, J.
Symbolic Logic, 25, 39-58, 1960.
[Vaught, 1962] R. Vaught. On a theorem of Cobham concerning undecidable theories. In Proc. 1960 Intl. Cong. Logic, Phil, and Methodology of
Sci., pp. 14-25. Stanford University Press, Stanford, CA, 1962.
[Volger, 1983] H. Volger. Turing machines with linear alternation, theories
of bounded concatenation and the decision problem of first-order theories, Theoret. Comp. Sci., 23, 333-337, 1983.
[Vorobyov, 1997] S. Vorobyov. The 'hardest' natural decidable theory. In
Proc. 12th Ann. IEEE Conf. on Logic in Computer Science, pp. 294-305. IEEE Computer Society Press, 1997.
[Vorobyov, preprint] S. Vorobyov. The most nonelementary theory,
preprint.
[Wood, 1976] C. Wood. The model theory of differential fields revisited,
Israel J. Math., 25, 331-352, 1976.
[Young, 1985] P. Young. Gödel theorems, exponential difficulty and undecidability of arithmetic theories: an exposition. In Recursion Theory,
A. Nerode and R. Shore, eds., Proc. Symp. Pure Math. 42. American
Mathematical Society, Providence, RI, 1985.
Contents
1 Introduction 219
2 Algebras 220
2.1 The basic notions 220
2.2 Homomorphisms and isomorphisms 223
2.3 Abstract data types 224
2.4 Subalgebras 225
2.5 Quotient algebras 225
3 Terms 228
3.1 Syntax 228
3.2 Semantics 229
3.3 Substitutions 230
3.4 Properties 230
4 Generated algebras, term algebras 231
4.1 Generated algebras 231
4.2 Freely generated algebras 234
4.3 Term algebras 235
4.4 Quotient term algebras 236
5 Algebras for different signatures 236
5.1 Signature morphisms 236
5.2 Reducts 238
5.3 Extensions 239
6 Logic 240
6.1 Definition 240
6.2 Equational logic 241
6.3 Conditional equational logic 242
6.4 Predicate logic 242
7 Models and logical consequences 244
7.1 Models 244
7.2 Logical consequence 245
7.3 Theories 246
7.4 Closures 247
8 Calculi 249
9 Specification 253
10 Loose specifications 254
11 Initial specifications 258
12 Constructive specifications
13 A specification language
14 Modularization and parameterization
15 Further issues
16 Categories and institutions 310
1 Introduction
It is widely accepted that the quality of software can be improved if its design is systematically based on the principles of modularization and formalization. Modularization consists in replacing a problem by several "smaller"
ones. Formalization consists in using a formal language; it obliges the software designer to be precise and, in principle, allows a mechanical treatment.
One may distinguish two modularization techniques for software design. The first technique consists in a modularization on the basis of the
control structures. It is used in classical programming languages where it
leads to the notion of a procedure. Moreover, it is used in "imperative"
specification languages such as VDM [Woodman and Heal, 1993; Andrews
and Ince, 1991], Raise [Raise Development Group, 1995], Z [Spivey, 1989]
and B [Abrial, 1996]. The second technique consists in a modularization
on the basis of the data structures. While modern programming languages
such as Ada [Barstow, 1983] and ML [Paulson, 1991] provide facilities for
this modularization technique, its systematic use leads to the notion of abstract data types. This technique is particularly interesting in the design
of software for non-numerical problems. Compared with the first technique
it is more abstract in the sense that algebras are more abstract than algorithms; in fact, control structures are related to algorithms whereas data
structures are related to algebras.
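As a toy illustration of the second technique (our own sketch, not an example from the chapter), the client code of the class below depends only on the operations of a stack abstract data type; the list representation is hidden and could be replaced without affecting clients:

```python
# A minimal sketch of data-structure-based modularization: clients use only
# the operations; the representation (here a Python list) stays hidden.

class Stack:
    """Abstract data type 'stack of integers': push, pop, top, is_empty."""

    def __init__(self):
        self._items = []          # hidden representation

    def push(self, x: int) -> None:
        self._items.append(x)

    def pop(self) -> None:
        if not self._items:
            raise ValueError("Pop applied to the empty stack")
        self._items.pop()

    def top(self) -> int:
        if not self._items:
            raise ValueError("Top applied to the empty stack")
        return self._items[-1]

    def is_empty(self) -> bool:
        return not self._items
```

Replacing the list by, say, a linked representation would change nothing for client code, which is exactly the abstraction the chapter attributes to this modularization technique.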
Formalization leads to the use of logic. The logics used are generally
variants of the equational logic or of the first-order predicate logic.
The present chapter is concerned with the specification of abstract data
types. The theory of abstract data type specification is not trivial, essentially because the objects considered, viz. algebras, have a more
complex structure than, say, integers. For more clarity the present chapter
treats algebras, logics, specification methods ("specification-in-the-small"),
specification languages ("specification-in-the-large") and parameterization
separately. In order to be accessible to a large number of readers it makes
use of set-theoretical notions only. This contrasts with a large number of
publications on the subject that make use of category theory [Ehrig and
Mahr, 1985; Ehrich et al., 1989; Sannella and Tarlecki, 2001].
Our attention is restricted to those topics that are now standard. Proofs
of theorems are omitted. For more details the reader is referred to the literature and, in particular, to the textbook [Loeckx et al., 1996]. This
textbook treats the subject along the same lines as the present chapter,
uses the same notation and contains the proofs of most of the theorems.
A book treating more advanced issues is [Sannella and Tarlecki, 2001]. A
comprehensive state-of-the-art report on recent advances in algebraic foundations of systems specification may be found in [Astesiano et al., 1999b].
A broad survey of the topic together with an annotated bibliography is in
[Cerioli et al., 1997].
Sections 2 to 8 describe the fundamental specification tools. More precisely, sections 2 to 5 are devoted to many-sorted algebras and sections 6
to 8 to logic. Section 9 introduces the general notion of a specification.
Sections 10 to 12 present three specification methods for specification-in-the-small: loose specifications, initial specifications and constructive specifications. Section 13 presents a simple prototypical specification language
for specification-in-the-large and discusses in some detail the language constructs. Section 14 shows how specification languages may be generalized
for modularization and parameterization. Section 15 briefly discusses some
further issues. Finally, section 16 briefly indicates how the notions of a category and of an institution may be used in the study of abstract data type
specifications.
2 Algebras
2.1 The basic notions
and similarly for the other operations. Note that, for instance,
stands for
(ii) The following "fancy" algebra B is also a Σ-algebra:
etc.
While the last example may suggest that a signature allows any algebra
provided its functions respect the arities, the following example illustrates
the contrary.
Example 2.5. Let Σ = (S, Ω) be the signature with
For any Σ-algebra A the carrier sets A(s) and
are not empty. In fact,
Further examples of algebras may be found in [Tucker and Meinke,
1992].
It is of course possible to generalize the definition of an algebra by
allowing the functions associated with an operation to be partial. Unfortunately, this generalization leads to serious problems with homomorphisms
(see section 2.2) as well as with the logic (see section 6) and is therefore
not further considered.
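The notions of this section can be rendered concretely in code (a sketch under our own representation and naming, not the chapter's): a signature Σ = (S, Ω) pairs a sort set with operations, each carrying an argument-sort list and a target sort, and an algebra assigns carriers and total functions:

```python
# A many-sorted signature Sigma = (S, Omega) and one algebra for it,
# both as plain data.  Names and representation are illustrative only.

S = {"bool", "nat"}
Omega = {
    "True":  ([], "bool"),       # argument sorts, target sort
    "False": ([], "bool"),
    "0":     ([], "nat"),
    "Succ":  (["nat"], "nat"),
}

# The "classical" algebra: bool gets {True, False}; nat gets the natural
# numbers, an infinite carrier represented only by whichever Python
# integers actually arise.
A_functions = {
    "True":  lambda: True,
    "False": lambda: False,
    "0":     lambda: 0,
    "Succ":  lambda n: n + 1,
}

def respects_arity(op, sorted_args):
    """Check that a list of (sort, value) arguments matches op's arity."""
    arg_sorts, _target = Omega[op]
    return [s for (s, _v) in sorted_args] == arg_sorts
```

Any other assignment of carriers and arity-respecting functions to the same signature would be an equally valid Σ-algebra, which is the point of the "fancy" algebra in the example above.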
2.2 Homomorphisms and isomorphisms
Informally, homomorphisms are mappings between algebras or, more precisely, mappings between their carrier sets that "respect" their functions.
The following definition is "classical". Examples of the different notions
and the proofs of the theorems may be found in [Tucker and Meinke, 1992]
or any textbook on universal algebra.
Definition 2.6 (Homomorphism). Let A, B be two Σ-algebras, Σ =
(S, Ω). A homomorphism h : A → B from A to B is a family (h_s)_{s ∈ S}
of functions h_s : A(s) → B(s) such that for any operation
ω : s_1 × . . . × s_n → s the following holds:
    h_s(A(ω)(a_1, . . . , a_n)) = B(ω)(h_{s_1}(a_1), . . . , h_{s_n}(a_n))
for all a_1 ∈ A(s_1), . . . , a_n ∈ A(s_n).
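For finite carrier sets the homomorphism condition of Definition 2.6 can be checked exhaustively. The sketch below (our own finite example, single-sorted for brevity) does exactly that:

```python
# Checking the homomorphism condition
#     h(A(omega)(a1, ..., an)) = B(omega)(h(a1), ..., h(an))
# exhaustively, which is possible whenever the carrier sets are finite.

from itertools import product

def is_homomorphism(h, A_ops, B_ops, A_carrier):
    """h: function on the carrier; *_ops: name -> (arity, function)."""
    for name, (arity, fA) in A_ops.items():
        _, fB = B_ops[name]
        for args in product(A_carrier, repeat=arity):
            if h(fA(*args)) != fB(*(h(a) for a in args)):
                return False
    return True

# A: integers modulo 4 with Zero and Succ;  B: integers modulo 2.
A_ops = {"Zero": (0, lambda: 0), "Succ": (1, lambda n: (n + 1) % 4)}
B_ops = {"Zero": (0, lambda: 0), "Succ": (1, lambda n: (n + 1) % 2)}
h = lambda n: n % 2               # the parity map is a homomorphism
```

The parity map h "respects" Zero and Succ; a map that did not commute with Succ would fail the check on some argument.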
2.3 Abstract data types
Informally, an abstract data type is a class of algebras closed under isomorphism (cf. [Tucker and Meinke, 1992]). This definition fits the study of
specification well because the logics to be used cannot distinguish between
isomorphic algebras.
Definition 2.15 (Abstract data type). An abstract data type for a
signature Σ is a class C ⊆ Alg(Σ) satisfying the condition:
    if A ∈ C and A ≅ B, then B ∈ C
for any pair of Σ-algebras A, B.
An abstract data type is called monomorphic if all its algebras are
isomorphic to each other; otherwise it is called polymorphic.
2.4 Subalgebras
Informally, a subalgebra is an algebra with some carriers deleted and with
the functions restricted accordingly.
Definition 2.16 (Subalgebra). Let Σ = (S, Ω) be a signature and let
A and B be Σ-algebras. The algebra B is called a subalgebra of A if the
following two conditions hold:
2.5 Quotient algebras
Fact 2.24. With the notation of Definitions 2.22 and 2.23 one has: =
Q.
The following theorem is called the first homomorphism theorem [Tucker
and Meinke, 1992] and plays an important role in section 11.
Theorem 2.25. Let Σ be a signature and let A and B be two Σ-algebras.
Furthermore, let h : A → B be a homomorphism and ≡_h
the congruence relation induced by h. If the homomorphism h is surjective, then A/≡_h ≅ B.
3 Terms
3.1 Syntax
An element of
A
3.2 Semantics
When t is a ground term the value of t does not depend on α and one may
write A(t) instead of A(α)(t).
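The value A(α)(t) of a term is computed by structural recursion, and for ground terms the assignment plays no role. A sketch (the term representation is ours, not the chapter's):

```python
# A term is either a variable name (a string) or a pair
# (operation, list-of-argument-terms); A(alpha)(t) recurses on this shape.

def evaluate(A_functions, alpha, t):
    """Value A(alpha)(t) of term t in algebra A under assignment alpha."""
    if isinstance(t, str):                 # a variable
        return alpha[t]
    op, args = t
    return A_functions[op](*(evaluate(A_functions, alpha, a) for a in args))

A_functions = {
    "0":    lambda: 0,
    "Succ": lambda n: n + 1,
    "+":    lambda m, n: m + n,
}

t = ("+", ["n", ("Succ", [("0", [])])])    # the term  n + Succ(0)
```

Evaluating the ground subterm Succ(0) needs no assignment at all, matching the remark that A(t) is well defined for ground t.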
3.3 Substitutions
Then:
3.4 Properties
The next theorem states that a term keeps its value in a subalgebra.
Theorem 3.9. Let A and B be Σ-algebras for some signature Σ = (S, Ω).
If B is a subalgebra of A, then:
that the syntax of terms has been defined by simultaneous induction on all
sorts.
Definition 4.7. Let Σ = (S, Ω) be a signature, A a Σ-algebra and S_c ⊆ S
a set of sorts. The following notation is introduced:
Informally, Σ_c
contains additional constants but A_c "remains unchanged".
Definition 4.1 may now be generalized.
Definition 4.8 (Algebras generated in some sorts). Let Σ = (S, Ω),
Σ_c and A_c be as in Definition 4.7. Finally, let Ω_c ⊆ Ω
be a set of operations with their target sort in S_c. The Σ-algebra A is said
to be:
generated in the sorts of S_c by the set Ω_c of constructors if the
Σ_c-algebra A_c is generated by the set Ω_c;
generated in the sorts of S_c if it is generated in the sorts of S_c by the
operations of Ω with their target sort in S_c.
Example 4.9. Let Σ = (S, Ω) be the signature with
Note that A is not generated in the sense of Definition 4.1, because in the
signature Σ even a list consisting of a single element is not representable
by a ground term.
It should be clear that a generated algebra (Definition 4.1) is simply an
algebra generated in all sorts (Definition 4.8).
When an algebra is generated in some sorts, properties of the carrier
sets of these sorts may be proved by induction. For instance, to prove that
a property P holds for all carriers of the carrier set A(list) of Example 4.9,
it is sufficient to prove:
4.2 Freely generated algebras
4.3 Term algebras
At first sight this definition may be confusing. The reason is that ground
terms now play a dual role: as elements of the set T_Σ they are syntactical objects; as carriers of the algebra T(Σ) they are semantical objects.
Actually, this dual role is what makes term algebras so interesting.
Theorem 4.14. A term algebra is freely generated.
By Theorem 4.12 there exists a unique homomorphism h : T(Σ) → A
for any Σ-algebra A. This homomorphism is called the evaluation homomorphism of A. Note that h(t) = A(t) for any ground term t.
Example 4.15.
4.4 Quotient term algebras
5 Algebras for different signatures
The study of algebras above was for a given, fixed signature. The relation
between algebras with different signatures is investigated in the present
section.
5.1 Signature morphisms
Let Σ = (S, Ω) and Σ' =
(S', Ω') be two signatures.
(ii) A renaming is a bijective signature morphism, i.e. a signature morphism μ = (μ_S, μ_Ω) with μ_S and μ_Ω
bijective.
If no ambiguities arise one may write μ instead of μ_S and μ_Ω.
Example 5.2.
defined by
Note that in the left-hand side of the equation the variables x and y are of
sorts el1 and el2, respectively; in the right-hand side they are both of sort
nat.
5.2 Reducts
Reducts constitute a semantical counterpart of (the syntactical notion of)
signature morphisms. As a difference the "mapping" is from Σ'-algebras
to Σ-algebras.
Definition 5.6 (Reduct). Let μ : Σ → Σ' be a signature morphism
with Σ = (S, Ω) and let A' be a Σ'-algebra. The μ-reduct of A' is the
Σ-algebra A'|μ defined by:
The now following reduct theorem bears strong similarities with Theorem 3.11.
Theorem 5.9 (Reduct theorem). Let Σ = (S, Ω), Σ' be signatures and
μ : Σ → Σ' a signature morphism. Let X be a set of variables for Σ and
t a term from T_Σ(X). Finally, let A' be a Σ'-algebra and α' : μ(X) → A'
an assignment for A'. Then:
5.3 Extensions
Contrasting with a reduct, an extension is a "mapping" in the same "direction" as a signature morphism, i.e. from Alg(Σ) to Alg(Σ').
Definition 5.10 (Extension). Let μ : Σ → Σ' be a signature morphism.
6 Logic
6.1 Definition
Definition 6.1 (Logic). An algebra logic (logic for short) L consists of:
(i) a decidable set L(Σ) for each signature Σ; an element of L(Σ) is called
a (Σ-)formula;
(ii) a function μ_L : L(Σ) → L(Σ') for each signature morphism
μ : Σ → Σ'; this function μ_L is called the formula morphism;
(iii) a relation ⊨_Σ ⊆ Alg(Σ) × L(Σ) for each signature Σ; this relation is
called the satisfaction relation (for Σ); if A ⊨_Σ φ for some Σ-algebra
A and some formula φ ∈ L(Σ), one says that φ is valid in A or that
A satisfies φ.
It is required that the following two conditions are satisfied:
(i) (Isomorphism condition.) For any signature Σ, for any formula
φ ∈ L(Σ) and for any Σ-algebras A, B with A ≅ B,
    A ⊨_Σ φ if and only if B ⊨_Σ φ.
(ii) (Satisfaction condition.) For any signatures Σ, Σ', for any signature
morphism μ : Σ → Σ', for any formula φ ∈ L(Σ) and for any
Σ'-algebra A',
    A' ⊨_Σ' μ(φ) if and only if (A'|μ) ⊨_Σ φ.
One may write μ instead of μ_L if no confusion arises. The following
definition defines an important subclass of logics.
Definition 6.2 (Logic with equality). A logic L is called a logic with
equality if, for any signature Σ, any sort s of this signature and any ground
terms t, u ∈ T_Σ,s, there exists a formula, say t ≡ u, such that
    A ⊨_Σ (t ≡ u) if and only if A(t) = A(u)
for all Σ-algebras A.
Sections 6.2 to 6.4 now present three different "instances" of this general
logic. Each of these instances constitutes a logic with equality, as the
reader may easily verify.
6.2 Equational logic
6.3 Conditional equational logic
The theorem, the remarks and the notational convention of Section 6.2
may be generalized for conditional equational logic.
Example 6.7. Let E be the signature of Example 6.5(i). Examples of
conditional equations are:
6.4 Predicate logic
In the same vein a logic PL called predicate logic can be defined by generalizing classical first-order predicate logic for many-sorted algebras. For
more transparency the set PL(Σ), the formula morphisms μ_PL and the
satisfaction relation ⊨_Σ are defined separately.
Definition 6.8 (The set PL(Σ)). For each signature Σ the set PL(Σ) of
formulas is defined inductively:
Definition 6.12 (The satisfaction relation of PL). Let Σ be a signature. The satisfaction relation of first-order predicate logic is defined
by:
    A ⊨_Σ φ iff A(α)(φ) = true for all assignments α : free(φ) → A
for each Σ-algebra A and each formula φ ∈ PL(Σ).
Example 6.13. Let Σ be the signature of Example 6.5(i) and A the "classical" algebra. Then one has, for instance,
    A ⊨ ¬(x = 0) ⊃ ∃y. x = Succ(y),
    A ⊨ ∃x. ∃y. ¬(x = y).
Note that, for instance, "∧" is an operation whereas "⊃" is a logical symbol.
Again, the logic PL may be shown to fulfil the isomorphism and satisfaction conditions.
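Satisfaction over the infinite carrier of the classical algebra cannot be decided by brute force, but the two formulas of Example 6.13 can at least be checked over a finite initial segment of the natural numbers. A sketch (ours; the bound is arbitrary, and a bounded search can suggest, not prove, satisfaction):

```python
# Bounded check of the two example formulas in the "classical" algebra,
# with quantifiers over N0 approximated by the range [0, BOUND).

BOUND = 50

def sat_formula1():
    # not(x = 0)  implies  exists y. x = Succ(y),  for every sampled x
    return all(
        (x == 0) or any(x == y + 1 for y in range(BOUND))
        for x in range(BOUND)
    )

def sat_formula2():
    # exists x. exists y. not(x = y)
    return any(x != y for x in range(BOUND) for y in range(BOUND))
```

The first formula is universally quantified over x, so the check only samples it; the existential in the second is genuinely witnessed (e.g. x = 0, y = 1).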
7 Models and logical consequences
7.1 Models
Definition 7.1 (Model).
(i) A Σ-algebra A is a model of a set Φ ⊆ L(Σ) of formulas if A ⊨_Σ φ
for each φ ∈ Φ.
(ii) A domain (or universe) for Σ is a class of Σ-algebras that is closed
under isomorphism, i.e. an abstract data type. The class of all algebras of a domain U for Σ that are models of Φ is denoted Mod_{U,Σ}(Φ),
or Mod_U(Φ) for short.
From a pragmatic point of view a domain consists of those algebras one is
interested in. Practical examples of domains are Alg(Σ), the class Gen(Σ)
of all generated Σ-algebras and the class of Σ-algebras generated in some
sorts. One writes Mod_Σ(Φ) instead of Mod_{Alg(Σ)}(Φ) and Ind_Σ(Φ) instead
of Mod_{Gen(Σ)}(Φ). If Σ is known, one may even write Mod(Φ) and Ind(Φ).
Example 7.2.
The following theorem illustrates the relation between logic and abstract
data types.
Theorem 7.3. Let L be a logic, Σ a signature, Φ ⊆ L(Σ) a set of formulas
and U a domain for Σ. Then Mod_U(Φ) is an abstract data type.
A relation between formula sets and their model classes is given in the
following theorem:
Theorem 7.4. Let L be a logic, Σ a signature, Φ, Ψ ⊆ L(Σ) sets of formulas and U a domain for Σ. Then
(i) Mod_U(Φ ∪ Ψ) = Mod_U(Φ) ∩ Mod_U(Ψ);
(ii) if Φ ⊆ Ψ, then Mod_U(Ψ) ⊆ Mod_U(Φ).
The notion of a quotient term algebra of Definition 4.18 may now be
generalized for sets of formulas.
Definition 7.5 (Quotient term algebra). Let L be a logic, Σ a signature and Φ ⊆ L(Σ) a set of formulas. The quotient term algebra of Φ is the
quotient term algebra of the class Mod_Σ(Φ). One writes T(Σ, Φ) instead of
T(Σ, Mod_Σ(Φ)). The unique homomorphism from T(Σ, Φ) to an arbitrary
algebra A of Mod_Σ(Φ) is called the initial homomorphism of A.
Again, T(Σ, Φ) is an abstract data type.
The following theorem is a direct consequence of Corollary 4.20. It
forms the basis of the initial specification method (see section 11.1) and
thus justifies the above appellation "initial homomorphism".
Theorem 7.6. Let L, Σ, Φ be as in Definition 7.5. If the quotient term
algebra T(Σ, Φ) is a model of Φ, then it is initial in Mod_Σ(Φ).
7.2 Logical consequence
7.3 Theories
The first of these facts expresses the fact that, for instance, Th_L(C) ⊨ φ
implies φ ∈ Th_L(C).
Example 7.11. Let Σ = ({nat}, {0 : → nat, Succ : nat → nat}) and A be
the "classical" Σ-algebra.
(i) Th_EL(A) contains tautologies only.
(ii) Apart from tautologies, Th_PL(A) contains formulas such as
¬(0 = Succ(0)) and ∀x : nat. ∃y : nat. y = Succ(x).
The first part of the following theorem states that theories cannot distinguish between isomorphic algebras.
Theorem 7.12. Let L be a logic, Σ a signature and A, B Σ-algebras.
(i) A ≅ B implies Th_L(A) = Th_L(B).
(ii) If L is a logic with equality and if A and B are generated, then
Th_L(A) = Th_L(B) implies A ≅ B.
The following theorem is a dual of Theorem 7.4.
Theorem 7.13. Let L be a logic, Σ a signature and C_1, C_2 two classes of
Σ-algebras. Then:
(i) Th_L(C_1 ∪ C_2) = Th_L(C_1) ∩ Th_L(C_2);
(ii) C_1 ⊆ C_2 implies Th_L(C_2) ⊆ Th_L(C_1).
To be manageable a theory should be recursively enumerable. Only in
that case is it possible to describe it axiomatically.
Definition 7.14 (Axiomatizable algebra).
(i) A Σ-algebra A is said to be axiomatizable in a logic L if Th_{L,Σ}(A) is
recursively enumerable.
(ii) An axiomatizable class of algebras is defined similarly.
Example 7.15. Let Σ be the signature of Example 2.2. The classical algebra A for this signature (with A(nat) = N_0) is called the Peano arithmetic.
It is well known that this algebra is not axiomatizable.
7.4 Closures
Φ ⊆ Ψ implies Φ* ⊆ Ψ*.
Theorem 7.19. Let L be a logic, Σ a signature, U a domain for Σ and
Φ, Ψ ⊆ L(Σ). Then:
(i) Ψ* ⊆ Φ* if and only if Mod_U(Φ) ⊆ Mod_U(Ψ);
(ii) Ψ* ⊆ Φ* if and only if Φ ⊨_U Ψ.
As a corollary of this theorem two sets Φ and Ψ of formulas have the same
models if and only if Φ is a logical consequence of Ψ and vice versa. Hence,
to prove that Mod_U(Φ) = Mod_U(Ψ) it is sufficient to prove Φ ⊨_U Ψ and
Ψ ⊨_U Φ.
A dual notion of the closure of a set of formulas is that of the closure
of a class of algebras.
Definition 7.20 (Closure of a class of algebras). Let L be a logic, Σ
a signature, U a domain for Σ and C ⊆ U a class of Σ-algebras. The closure
of C with respect to L and U is the subclass C*_{L,U,Σ} (or C* for short) of
U:
Clearly, C*_{L,U,Σ} is an abstract data type.
C*_{L,U,Σ} has properties similar to those of Φ*. In particular, with L, Σ, U
and C as in Definition 7.20 and with C' ⊆ U:
It is possible to show that Mod_U and Th_L constitute a Galois connection; the interested reader may consult, for instance, Exercise 5.7-4 in
[Loeckx et al., 1996].
7.5 Reducts
The present section investigates the logical properties of reducts. To simplify the presentation, the domain is assumed to be Alg(Σ); a generalization
for an arbitrary domain U poses no fundamental problems.
The following theorem is a straightforward consequence of the satisfaction condition.
Theorem 7.22. Let L be a logic and μ : Σ → Σ' a signature morphism.
(i) Let Φ ⊆ L(Σ). Then
    Mod_Σ'(μ(Φ)) = { A' ∈ Alg(Σ') | (A'|μ) ∈ Mod_Σ(Φ) }.
(ii) Let D' ⊆ Alg(Σ'). Then
    Th_L(D'|μ) = { φ ∈ L(Σ) | μ(φ) ∈ Th_L(D') }.
Similar properties exist for closures. An example is given by the following theorem:
Theorem 7.23. Let L be a logic, μ : Σ → Σ' a signature morphism and
Φ ⊆ L(Σ) a set of Σ-formulas. Then μ(Φ*_Σ) ⊆ (μ(Φ))*_Σ'.
7.6 Extensions
8 Calculi
8.1 Definitions
as
    φ_1, . . . , φ_n
    ----------------
          ψ
is an
One then writes Φ ⊢_K φ (or Φ ⊢ φ for short). Instead of ∅ ⊢_K φ one writes
⊢_K φ.
The following definition reflects the fact that the purpose of a calculus
is to grasp the notion of logical consequence.
Definition 8.3 (Soundness, completeness). Let L be a logic, Σ a signature and K a calculus for L and Σ. Furthermore, let U be a domain for
Σ.
(i) K is called sound with respect to U if, for any set Φ ⊆ L(Σ) and any
formula φ ∈ L(Σ),
    Φ ⊢_K φ implies Φ ⊨_U φ.
(ii) K is called complete with respect to U if, for any set Φ ⊆ L(Σ) and
any formula φ ∈ L(Σ),
    Φ ⊨_U φ implies Φ ⊢_K φ.
8.2 An example
The following calculus is a calculus for the equational logic EL of Section
6.2. It constitutes a generalization of the calculus presented in [Tucker and
Meinke, 1992] in that it allows empty carrier sets.
Definition 8.4 (Equational calculus). Let Σ be a signature and X, Y
sets of variables for Σ. The following axiom scheme (I) and four inference
rules (II) to (V) constitute a calculus for the equational logic EL and the
signature Σ:
(I) ∀X. t = t,    t ∈ T_Σ(X);
(II) from ∀X. t = u infer ∀Y. t = u,    t, u ∈ T_Σ(X ∩ Y);
(III) from ∀X. t = u infer ∀X. u = t,    t, u ∈ T_Σ(X);
(IV) from ∀X. t = u and ∀X. u = v infer ∀X. t = v,    t, u, v ∈ T_Σ(X);
(V) from ∀X. t_1 = u_1, . . . , ∀X. t_m = u_m infer ∀X. vσ = vτ,
    where m ≥ 1, t_i, u_i ∈ T_Σ(X) for all i with 1 ≤ i ≤ m,
    v ∈ T_Σ(Y), and σ, τ : Y → T_Σ(X) are two substitutions satisfying
    the condition
    for each y ∈ Y: either σ(y) = τ(y)
    or (σ(y) = t_j and τ(y) = u_j for some j,
    1 ≤ j ≤ m).
Note that the calculus takes care of empty carrier sets. More precisely,
assume that Y ⊆ X; the calculus allows ∀X.t = u to be deduced from
∀Y.t = u; on the other hand it allows ∀Y.t = u to be deduced from ∀X.t = u
only if for each sort s with Y_s = ∅ and X_s ≠ ∅ the set T_Σ(Y),s is not empty
(cf. Example 6.5(ii) with X = {x} and Y = ∅).
Theorem 8.5. The equational calculus of Definition 8.4 is sound and complete with respect to Alg(Σ).
While the proof of the soundness is straightforward, the proof of the completeness is not trivial (see, for example, [Tucker and Meinke, 1992] or
[Loeckx et al., 1996]).
8.3 Comments
The calculus above is of course also sound with respect to any domain
for the signature Σ, for instance with respect to Gen(Σ). It is not complete with respect to Gen(Σ) because the inductive consequence "⊨_Ind"
is "stronger" than the logical consequence "⊨_Σ", as was illustrated in
Examples 7.8(ii) and 7.8(iii).
In the literature several calculi have been proposed for first-order predicate logic, all of which are sound and complete with respect to Alg(Σ)
(e.g. [Ebbinghaus et al., 1984; Ryan and Sadler, 1992]). Again, they fail to
be complete with respect to Gen(Σ). Actually, it is possible to prove that
there cannot exist a calculus for first-order predicate logic which is sound
and complete with respect to Gen(Σ). This fact holds for equational logic
and conditional equational logic as well.
A calculus which is complete with respect to Alg(Σ) together with the
principle of induction is in some sense complete with respect to Gen(Σ)
(cf. Example 10.7 for more details). This does not contradict the previous
statement because the principle of induction cannot be "axiomatized" in
first-order predicate logic.
Whenever there exists a calculus which is sound and complete with
respect to some domain U, an (informal) proof of Φ ⊨_U φ may be replaced
by a (formal) proof of Φ ⊢ φ. If the calculus is sound without being
complete, it is not guaranteed that Φ ⊢ φ holds.
The axioms and inference rules of a calculus are generally very "elementary" and, especially in the case of first-order predicate logic, "non-intuitive". Proving properties of the form Φ ⊢ φ in a calculus is therefore
very tedious. To this end calculi have been implemented in the form of
"proof systems" allowing proofs to be carried out automatically or semi-automatically with a computer. Some of these systems allow induction to
be "simulated" and thus inductive consequences to be proved. From what
has been said above it follows that such systems cannot be complete. Instead, the most sophisticated among them use heuristics; if these fail the
user is asked for some "clever" lemma enabling the proof to be carried out.
9 Specification
specification of an algebra, meaning an adequate or strictly adequate specification of the monomorphic abstract data type containing this algebra.
Note that if M(sp) is a monomorphic abstract data type the notions of
adequacy and strict adequacy coincide.
A specifier or user of a specification may want to prove certain properties of a specification. He may, for instance, want to prove that a given
specification sp constitutes an adequate or strictly adequate specification
of an abstract data type C. Of course, such a proof presupposes that the
abstract data type C has been precisely defined elsewhere. Another interesting question is whether the class M(sp) defined by the specification
sp constitutes a monomorphic abstract data type. A further question is
whether M(sp) is empty (and hence "useless").
Proofs of such properties may take the form of informal "mathematical"
proofs similar to those that may be found in any textbook on theoretical
computer science. A particular case is constituted by those properties
that may be reduced to logical consequences. Such a property may be
written as Φ ⊨_U φ, where φ is, for instance, a property of the abstract
data type and where Φ is the set of formulas of an atomic specification.
According to section 8.3 such properties may be proved automatically or
semi-automatically whenever there exists a sound (and complete) calculus
with respect to U.
As indicated above, proving the adequacy of a specification presupposes that the abstract data type to be specified is exactly known. Except
for "trivial" abstract data types this is generally not the case in practice:
the specification itself often constitutes the only precise definition of the
abstract data type. Instead, the adequacy may be checked by a testing
method called rapid prototyping. Essentially, this testing method consists
in mechanically determining the value of some "critical" terms and comparing this value with one's "expectations". This mechanical evaluation
of terms presupposes that the specification is in some sense "executable"
or "constructive" and hence monomorphic, as will be discussed below.
Clearly, being based on testing rapid prototyping may disprove adequacy
but cannot prove it; in other words, rapid prototyping may at best improve
one's confidence in the adequacy.
A specification defining a polymorphic abstract data type is sometimes
called a requirement specification. In contrast, a design specification is a
monomorphic specification that allows rapid prototyping.
10 Loose specifications
False : → bool
0 : → nat
Succ : nat → nat
_ + _ : nat × nat → nat
_ * _ : nat × nat → nat
_ ≤ _ : nat × nat → bool
vars
m, n: nat
axioms ¬(True = False)
¬(0 = Succ(n))
Succ(n) = Succ(m) ⊃ n = m
(0 ≤ n) = True
(Succ(n) ≤ 0) = False
(Succ(n) ≤ Succ(m)) = (n ≤ m)
n + 0 = n
n + Succ(m) = Succ(n + m)
n * 0 = 0
n * Succ(m) = n + (n * m)
endspec
Clearly, the specification is an adequate specification of Peano arithmetic
(see Example 7.15). As Peano arithmetic is not axiomatizable, it cannot
be a strictly adequate one.
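In the spirit of the rapid prototyping discussed in section 9, the axioms of this loose specification can be tested in the "classical" algebra by sampling the universally quantified variables over an initial segment of N_0. A sketch (ours; such testing can refute adequacy, never prove it):

```python
# Sampling-based check that the axioms of the loose specification hold in
# the classical algebra of the natural numbers with True/False as booleans.

def check_axioms(bound=30):
    succ = lambda n: n + 1
    le = lambda m, n: m <= n
    for n in range(bound):
        for m in range(bound):
            assert True != False                        # not(True = False)
            assert 0 != succ(n)                         # not(0 = Succ(n))
            # bool <= bool encodes implication: Succ(n) = Succ(m) implies n = m
            assert (succ(n) == succ(m)) <= (n == m)
            assert le(0, n) is True                     # (0 <= n) = True
            assert le(succ(n), 0) is False              # (Succ(n) <= 0) = False
            assert le(succ(n), succ(m)) == le(n, m)
            assert n + 0 == n
            assert n + succ(m) == succ(n + m)
            assert n * 0 == 0
            assert n * succ(m) == n + (n * m)
    return True
```

Every axiom passes on the sampled values, which is consistent with (though far short of proving) the stated adequacy for Peano arithmetic.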
The following theorem indicates some limits of the expressive power of
loose specifications.
Theorem 10.4. Let sp = (Σ, Φ) be a loose specification in a logic L.
Then:
(i) M(sp) = (M(sp))^L;
(ii) if L is a logic for which there exists a sound and complete calculus, then M(sp) is axiomatizable in L;
(iii) if L is one of the logics of sections 6.2 to 6.4 and if M(sp) contains an algebra with an infinite carrier set, then M(sp) also contains non-generated algebras.
A variant of Theorem 10.4(iii) is also known as the Löwenheim-Skolem theorem (see, for instance, [Ebbinghaus et al., 1984]).
Properties of loose specifications may be proved along the lines discussed in section 9. In general, rapid prototyping is not possible for loose
specifications.
10.3
endspec
in which Pred is "intended" to be the predecessor function. This polymorphic specification is a strictly adequate specification of the "classical" algebra in which the value Pred(0) is left pending. More precisely, the abstract data type defined contains non-isomorphic algebras differing from each other only by the value of Pred(0).
11
Initial specifications
11.1
11.2
Examples
Example 11.5.
(i) A trivial initial specification is:
initial spec sorts nat
opns 0 : -> nat
Succ : nat -> nat
_ + _ : nat x nat -> nat
vars m, n: nat
eqns n + 0 = n
n + Succ(m) = Succ(n + m)
endspec
One may prove that the specification is an adequate specification
of the "classical" algebra. Note, in particular, that T(Σ, Φ)(nat)
consists of the equivalence classes [0], [Succ(0)], [Succ(Succ(0))], etc.
Note also that, for instance,
[0] = [0 + 0] = [0 + 0 + 0] = . . .
[Succ(0)] = [Succ(0) + 0] = [0 + Succ(0)]
= [Succ(0) + 0 + 0] = . . . .
(ii) The following example is the classical example of an initial specification. Its glamour stems from the fact that there is no equivalent loose specification with free constructors that is equally "abstract".
initial spec sorts nat, set
opns 0 : -> nat
Succ : nat -> nat
∅ : -> set
Insert : set x nat -> set
vars m, n: nat, s: set
eqns Insert(Insert(s, n), n) = Insert(s, n)
Insert(Insert(s,n),m) = Insert(Insert(s,m),n)
endspec
Informally, the first equation identifies terms in which the same "element" occurs several times; the second equation identifies terms in
which the same "elements" occur in different order.
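As an illustration (ours, not from the text), one may interpret the ground terms of sort set in an algebra of finite sets of naturals, in which both equations hold by construction:

```python
# Sketch (ours): interpreting ∅ and Insert in an algebra of finite sets,
# where the two equations of the specification hold automatically.

def empty():
    return frozenset()

def insert(s, n):
    return s | {n}

s = insert(empty(), 1)
# Insert(Insert(s, n), n) = Insert(s, n): multiplicity is irrelevant
assert insert(insert(s, 2), 2) == insert(s, 2)
# Insert(Insert(s, n), m) = Insert(Insert(s, m), n): order is irrelevant
assert insert(insert(s, 2), 3) == insert(insert(s, 3), 2)
```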
The following example illustrates the problems resulting from the fact
that initial specifications cannot define polymorphic abstract data types.
Example 11.6. Consider the initial specification deduced from the loose specification of Example 10.10 by deleting the keywords freely generated and constr and by replacing loose and axioms by initial and eqns. This specification is not an adequate specification of the "classical" algebra because T(Σ, Φ)(nat) now contains additional carriers such as [Pred(0)], [Pred(Pred(0))] and [Succ(Pred(0))].
Correcting non-adequate initial specifications is non-trivial and error-prone, as the following example illustrates.
Example 11.7. To correct the non-adequate specification of Example 11.6
the following solutions may be envisaged.
One may add the equation Pred(0) = Succ^n(0) for some fixed n > 0, and thus use [Succ^n(0)] as a default value. Of course, this solution is not acceptable: the choice of a concrete value of n results in a loss of abstraction and the dual role played by [Succ^n(0)] harms the transparency.
Another solution consists in adding to the specification a constant
Error: -> nat and the equations:
Pred(0) = Error,   (11.1)
Pred(Error) = Error,   (11.2)
Succ(Error) = Error.   (11.3)
11.3
Properties
11.4
11.5
Proofs
11.6
Definition 11.23 (Term rewriting system). The term rewriting system of an initial specification (Σ, Φ) is the reduction system (T_Σ, →) with "→" inductively defined by:
(i) tσ → uσ for each equation ∀X.t = u ∈ Φ and each ground substitution σ : X → T_Σ;
(ii) if t → u, then v[t/y] → v[u/y] for all terms v ∈ T_Σ({y}) containing at least one occurrence of the variable y.
Note that an initial specification in which an equation ∀X.t = u is replaced by ∀X.u = t leads to a different relation "→" but to the same relation "~".
The following definition and theorem constitute the basis of a criterion guaranteeing that a term rewriting system is Noetherian:
Definition 11.24 (Rewrite ordering, reduction ordering). Let be
Theorem 11.25. Let (Σ, Φ) be an initial specification and X a set of variables for Σ containing the variables occurring in Φ. If there exists a reduction ordering < on T_Σ(X) such that u < t for any equation ∀Y.t = u ∈ Φ, then the term rewriting system of (Σ, Φ) is Noetherian.
The main theorem of term rewriting states that the equivalence relation "~" coincides with the validity of ground equations in the initial algebra:
Theorem 11.26. Let (Σ, Φ) be an initial specification and (T_Σ, →) its term rewriting system. For all ground terms t, u ∈ T_Σ it is the case that t ~ u if and only if T(Σ, Φ) ⊨ t = u.
Note that the theorem does not require that the term rewriting system is Noetherian and/or confluent.
The previous theorem is the basis of the fourth principle of proof mentioned in section 11.5. Suppose one has to prove T(Σ, Φ) ⊨ ∀X.t = u. It is sufficient to show that for all ground substitutions σ : X → T_Σ it is the case that T(Σ, Φ) ⊨ tσ = uσ (according to Theorem 11.14) or, equivalently, that tσ ~ uσ (according to Theorem 11.26).
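This proof principle can be prototyped for the initial specification of Example 11.5(i). The sketch below (ours; the term representation and names are illustrative) rewrites ground terms with the rules n + 0 → n and n + Succ(m) → Succ(n + m) and decides t ~ u by comparing normal forms; comparing normal forms is justified here because this particular rewriting system happens to be Noetherian and confluent, although Theorem 11.26 itself requires neither property.

```python
# Sketch (ours): deciding t ~ u by rewriting ground terms to normal form.
# Ground terms as nested tuples: ('0',), ('Succ', t), ('+', t, u).

def nf(t):
    """Normal form under n + 0 -> n and n + Succ(m) -> Succ(n + m)."""
    if t[0] == '0':
        return t
    if t[0] == 'Succ':
        return ('Succ', nf(t[1]))
    u, v = nf(t[1]), nf(t[2])             # t = ('+', t1, t2)
    if v == ('0',):                       # n + 0 -> n
        return u
    return ('Succ', nf(('+', u, v[1])))   # n + Succ(m) -> Succ(n + m)

def equivalent(t, u):
    return nf(t) == nf(u)

zero = ('0',)
one = ('Succ', zero)
two = ('Succ', one)
assert equivalent(('+', one, one), two)               # [1 + 1] = [2]
assert equivalent(('+', zero, one), ('+', one, zero)) # [0 + 1] = [1 + 0]
```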
11.7
Rapid prototyping
11.8
The main definitions and theorems of sections 11.1 to 11.7 may be generalized in a straightforward way for conditional equational logic. In particular,
Theorem 11.2 holds for conditional equational logic as well.
The use of conditional equational logic often simplifies the design of initial specifications because it allows the expression of implications between
equalities instead of equalities only. On the other hand it may be more difficult to grasp the structure of the initial algebra T(Σ, Φ) and, in particular, of its carriers; moreover, term rewriting systems and their tools are more complex (see, for example, [Klop, 1992; Padawitz, 1988]).
11.9
Comments
Example 11.32. Let Σ be a signature with a single sort and four constants, viz. a, b, c, d. Let Φ be a set consisting of the single formula (of predicate logic):
(¬ a = b ∧ c = d) ∨ (a = b ∧ ¬ c = d).
Then Mod_Σ(Φ) has no initial algebra. In fact, by Theorem 3.8 an initial algebra A has to satisfy both A(a) ≠ A(b) and A(c) ≠ A(d). Hence, it cannot be a model of Φ.
12
Constructive specifications
Informally, constraints (i) and (ii) make Φ look like a (generalized version of a) recursive function definition in the theory of computability. Constraint
(iii) makes sure that the calculation of the value of a ground term terminates
(see Definition 12.3).
The concrete syntax for constructive specifications is chosen to be similar to that of an initial specification but with constructive instead of initial and with constr preceding each constructor in the list-of-operations.
Example 12.2.
constructive spec
sorts nat
opns constr 0 : -> nat
constr Succ : nat -> nat
_ + _ : nat x nat -> nat
vars m, n: nat
eqns m + 0 = m
m + Succ(n) = Succ(m + n)
endspec
The reader having difficulties checking that constraints (ii) and (iii) of
Definition 12.1 are satisfied may wait for the remarks preceding Example
12.6.
Definition 12.3 (Constructive specification: semantics). Let sp = (Σ, Φ, Ω_c) be a constructive specification with Σ = (S, Ω). Put Σ_c = (S, Ω_c). The meaning M(sp) of sp is the monomorphic abstract data type M(sp) = { A ∈ Alg(Σ) | A ≃ C } where C is the Σ-algebra defined by:
(i) C(s) = T_{Σc,s} for each sort s ∈ S;
(ii) C(ω) = n for each constant ω = (n: -> s) ∈ Ω_c;
(iii) C(ω)(w_1, ..., w_k) = n(w_1, ..., w_k) for each ω = (n: s_1 x ... x s_k -> s) ∈ Ω_c, k ≥ 1, and all w_i ∈ T_{Σc,s_i}, 1 ≤ i ≤ k;
(iv) C(ω)(w_1, ..., w_k) = C(tσ) for each ω = (n: s_1 x ... x s_k -> s) ∈ Ω \ Ω_c, k ≥ 0, and all w_i ∈ T_{Σc,s_i}, 1 ≤ i ≤ k, where t and σ are defined as follows: t is the right-hand side of the univocally defined equation n(v_1, ..., v_k) = t ∈ Φ and σ : Var(n(v_1, ..., v_k)) -> T_{Σc} is the univocally defined substitution for which v_iσ = w_i, 1 ≤ i ≤ k, guaranteed by constraint (ii) of Definition 12.1.
The semantics M(sp) is constructive in that the specification sp constitutes an explicit definition of the algebra C and, more importantly, because the value C(t) of an arbitrary ground term t ∈ T_Σ may be effectively computed in finite time, as expressed by the following theorem:
Theorem 12.4. Let sp = (Σ, Φ, Ω_c) and C be as in Definition 12.3. For any ground term t ∈ T_Σ the computation of the value C(t) is finite and yields a unique result.
{n + m},
{n + 0, n + Succ(m)},
{0 + m, Succ(n) + m},
{0 + 0, Succ(n) + 0, 0 + Succ(m), Succ(n) + Succ(m)},
{n + 0, n + Succ(0), n + Succ(Succ(m))}.
Finally, a sufficient condition for constraint (iii) is the following: the operations of Ω \ Ω_c may be written as ω_1, ..., ω_l, l ≥ 0, such that for each equation n(v_1, ..., v_k) = t of Φ, with ω_i = (n : s_1 x ... x s_k -> s):
13
Specification languages
the year 2000. In this section and in section 14 a very simple specification
language is described that encompasses the main features of these different
languages.
13.1
(iii)
(iv)
(v)
(vi)
(sp model Φ)
is a specification with S(sp model Φ) = S(sp);
(vii) (restriction) if sp is a specification with S(sp) = (S, Ω), if S_c ⊆ S and if Ω_c ⊆ Ω is a set of operations with target sorts in S_c, then
(sp generated in S_c by Ω_c)
list-of-sorts
list-of-operations
Note that the concrete syntax has to satisfy several "context conditions" lest μ fail to be a renaming. For instance, the sorts s'_1, ..., s'_k have to be pairwise different.
The constructs of the specification language are interpreted as operations on classes of algebras. Alternative semantics descriptions will be
briefly described in section 13.8.
Definition 13.2 (Semantics of the specification language SL). The
meaning M. (sp) of a specification sp of the language SL is defined inductively and follows the structure and notation of Definition 13.1:
endspec
+
loose spec sorts freely generated nat
opns constr 0: -> nat
constr Succ: nat -> nat
endspec)
by opns _ < _ : nat x nat -> bool)
model vars m, n: nat
axioms 0 < n = True
Succ(m) < 0 = False
Succ(m) < Succ(n) = m < n)
13.2
Informally, the quotient term extension A(Σ', Φ') is similar to the quotient term algebra T(Σ', Φ') but possesses in addition all "equational properties" of A. This results from the addition of the ground equations of the equational theory of A to Φ'. This, in its turn, required the introduction of the constants c_a allowing "access" to each carrier of A.
The following corollary constitutes a generalization of Theorem 11.2.
Corollary 13.6. Let Σ, Σ' be signatures with Σ ⊆ Σ'. Let A be a Σ-algebra and let Φ' ⊆ EL(Σ'). The quotient term extension A(Σ', Φ') is a model of Φ'.
Example 13.7. Let Σ' and A be as in Example 4.9. Let e, e_1, e_2: el and l: list be variables and
Φ' = {Add(e, Add(e, l)) = Add(e, l),
Add(e_1, Add(e_2, l)) = Add(e_2, Add(e_1, l))}.
Consider the quotient term extension A(Σ', Φ') of A for (Σ', Φ'). Then A(Σ', Φ') ≃ B with B(el) = A(el), B(list) is the set of all finite sets of elements from B(el), B([ ]) is the empty set and B(Add)(e, s) = s ∪ {e} for all e ∈ B(el) and s ∈ B(list).
The following theorem shows that all free extensions coincide with the quotient term extension up to isomorphism. It constitutes a generalization of Corollary 11.3.
Theorem 13.8. Let Σ, Σ' be signatures with Σ ⊆ Σ', let A be a Σ-algebra and let Φ' ⊆ EL(Σ'). Let A(Σ', Φ') be the quotient term algebra of A for (Σ', Φ') and q : A -> A(Σ', Φ')|_Σ its associated homomorphism.
(i) E' ≃ A(Σ', Φ') for any free extension (E', r) of A for (Σ', Φ').
(ii) If E' ≃ A(Σ', Φ') according to the isomorphism i' : A(Σ', Φ') -> E', then (E', (i'|_Σ) ∘ q) is a free extension of A for (Σ', Φ').
By the way, the preceding theorem allows a free extension (E', r) to be identified with the algebra E': according to part (ii) of the theorem the homomorphism r is uniquely determined by E' (cf. the remark at the end of Definition 13.4). Moreover, part (i) of the theorem allows the free extension E' of A for (Σ', Φ') to be identified with the algebra A(Σ', Φ').
In building a free extension of an algebra A the carrier sets may be modified by a factorization, as illustrated in Example 13.7. The following property expresses the fact that the carrier sets remain unchanged, up to isomorphism.
Definition 13.9 (Persistent free extension). Let Σ, Σ' be signatures with Σ ⊆ Σ', let A be a Σ-algebra and let Φ' ⊆ EL(Σ'). Also let E' be a free extension of A for (Σ', Φ') and i' : A(Σ', Φ') -> E' the corresponding isomorphism. Call q : A -> A(Σ', Φ')|_Σ the homomorphism associated with the quotient term algebra A(Σ', Φ'). The free extension E' is called persistent when the homomorphism
(i'|_Σ) ∘ q : A -> E'|_Σ
is an isomorphism.
Finally, it is possible to introduce the "initial counterparts" of the language constructs extend and model.
Definition 13.10 (Two further constructs of the specification language SL). The abstract syntax of the specification language of Definition 13.1 is augmented by two cases:
eqns Add(e, Add(e, l)) = Add(e, l)
Add(e_1, Add(e_2, l)) = Add(e_2, Add(e_1, l))))
by sorts list
opns [ ]: -> list
13.3
Adding an environment
While a specification language such as the one described above allows specifications to be "structured", a non-trivial specification is a long piece of text. The idea is to cut this text into "small" specifications and to give them names, according to a technique that is classical in programming. To this end it is necessary to add an "environment" mapping names into "environment-dependent specifications", henceforth called "e-specifications". A specification then consists of an e-specification together with an environment.
While this idea is simple, the corresponding definition is somewhat
subtle. In fact, as the notions of an "environment" and of an "e-specification"
are interrelated, their inductive definition has to be simultaneous.
The definition implicitly makes use of a set of names not further defined.
Definition 13.12 (Environment, abstract syntax of e-specifications).
The notions of an environment, of an e-specification, of the signature of an e-specification in an environment and of the relation "is an e-specification for the environment" are defined by simultaneous induction. Note that an environment is defined to be a partial function:
(i) an atomic specification, say at, is an e-specification for the everywhere undefined environment; the signature S(at)(env) of this e-specification in the environment env is the signature of the atomic specification at;
(ii) if esp1 and esp2 are e-specifications for the environment env, then
(esp1 + esp2)
is an e-specification for the environment env; the signature of this e-specification in the environment env is
S(esp1 + esp2)(env) = S(esp1)(env) ∪ S(esp2)(env);
(iii) to (ix): as (iii) to (ix) of Definition 13.1 and (the first part of) Definition 13.10 but with:
"e-specification for the environment env" instead of "specification";
"S(...)(env)" instead of "S(...)", where "..." stands for an (e-)specification;
(x) if env is an environment and n a name such that env(n) is defined, then n is an e-specification for the environment env; the signature of the e-specification n in the environment env is
S(n)(env) = S(env(n))(env);
(xi) the everywhere undefined function is an environment;
(xii) if env is an environment, n a name such that env(n) is undefined and esp an e-specification for the environment env, then env[esp/n] is an environment (where env[esp/n] is identical with env except for (env[esp/n])(n) = esp);
(xiii) an e-specification for an environment env is also an e-specification for an environment env', whenever env ⊆ env'.
Informally, an e-specification according to this definition is a specification according to Definition 13.1, it being understood that a name is also an e-specification. If esp is an e-specification for the environment env, then env(n) is defined for each name n occurring in esp. Finally, an environment env contains no "recursive calls" in the sense that there exists no sequence n_1, ..., n_k of names, k ≥ 1, such that:
env(n_i) contains an occurrence of n_{i+1}, 1 ≤ i ≤ k - 1;
env(n_k) contains an occurrence of n_1.
The following definition fixes the meaning of an e-specification in an
environment.
Definition 13.13 (Semantics of e-specifications). Let esp be an e-
(A | S(esp1)(env)) ∈ M(esp1)(env),
(A | S(esp2)(env)) ∈ M(esp2)(env) };
(iii) to (ix): similar to (ii) and in accordance with Definition 13.2 and (the
second part of) Definition 13.10;
(x) M(n)(env) = M(env(n))(env).
It is now possible to define a specification language with an environment. This specification language is called e-SL; syntactically it consists of a set of pairs called "specifications".
Definition 13.14 (The specification language e-SL).
(i) (Abstract syntax.) A specification of the specification language e-SL
is a pair (env, esp) where esp is an e-specification for the environment
env. Its signature is
S((env, esp)) = S(esp)(env).
(ii) (Semantics.) The meaning of a specification (env, esp) of the language e-SL is the abstract data type
M((env, esp)) = M(esp)(env).
Again, it is possible to introduce a concrete syntax. One then writes
decl; esp
instead of (env, esp). In this notation decl stands for a string called a "declaration". The set of all declarations decl and the environments (decl) they describe are defined inductively:
(i) the empty string ε is a declaration; it describes the environment
(ε) = the everywhere undefined function;
(ii) if decl is a declaration, if esp is an e-specification for the environment (decl) and if n is a name for which (decl)(n) is undefined, then
decl; n is esp
is a declaration; it describes the environment
(decl; n is esp) = ((decl))[esp/n].
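To make the inductive definition concrete, an environment can be modelled as a finite partial function, here a Python dict (a sketch of ours; the placeholder strings stand for e-specifications):

```python
# Sketch (ours): environments as dicts; a declaration extends an
# environment only at a name that is still undefined.

def empty_env():
    # the empty declaration describes the everywhere undefined environment
    return {}

def declare(env, name, esp):
    # "decl; name is esp" is legal only if name is not yet declared
    if name in env:
        raise ValueError(f"name {name!r} already declared")
    new_env = dict(env)     # env[esp/name]: env itself stays unchanged
    new_env[name] = esp
    return new_env

env = declare(empty_env(), "BOOL", "loose spec ... endspec")
env = declare(env, "NAT", "loose spec ... endspec")
assert set(env) == {"BOOL", "NAT"}
```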
Example 13.15. The specification of Example 13.3 may now be written
in a more readable way:
BOOL is loose spec sorts freely generated bool
opns constr True : -> bool
constr False : -> bool
endspec;
NAT is loose spec sorts freely generated nat
opns constr 0 : -> nat
constr Succ : nat -> nat
endspec;
extend (BOOL + NAT)
13.4
Flattening
Flattening consists in "translating" a specification into an atomic specification with the same meaning. The reason for our interest in flattening
will become clear in sections 13.5 and 13.6.
In the process of flattening, constructs such as rename and forget lead
to some minor problems of a syntactical nature. The other constructs of the
specification language SL lead to semantical problems, some of which are
undecidable. The following comments illustrate some of these problems.
For simplicity they refer to the specification language SL of sections 13.1 and 13.2.
Fact 13.16.
(i) Let sp1 = (Σ1, Φ1), sp2 = (Σ2, Φ2) be loose specifications (in the sense of Definition 10.1). Then M(sp1 + sp2) = M(sp), where sp is the loose specification (Σ1 ∪ Σ2, Φ1 ∪ Φ2).
(ii) Let sp = (Σ, Φ) be a loose specification in a logic L. Let S be a set of sorts and Ω a set of operations such that Σ ∪ (S, Ω) is a signature; let Ψ ⊆ L(Σ ∪ (S, Ω)) be a set of formulas. Then M(extend sp by (S, Ω) model Ψ) = M(sp'), where sp' is the loose specification (Σ ∪ (S, Ω), Φ ∪ Ψ).
This fact does not hold for initial specifications, loose specifications with
(free) constructors or constructive specifications.
Fact 13.17. Let sp = (Σ, Φ) be an initial specification. Let S and Ω be such that Σ ∪ (S, Ω) is a signature and let Ψ ⊆ EL(Σ ∪ (S, Ω)). Then
M(freely extend sp by (S, Ω) quotient Ψ) = M(sp')
where sp' is the initial specification (Σ ∪ (S, Ω), Φ ∪ Ψ).
Again, the fact does not hold for loose specifications or constructive specifications.
As flattening has been studied in some detail for initial specifications,
additional properties for this particular case can be found in the literature
(see, for example, [Ehrig and Mahr, 1985]). A similar approach to deriving so-called "normal forms" of structured specifications can be found in
[Wirsing, 1993].
13.5
13.6
Rapid prototyping
There are two methods that allow the rapid prototyping of a specification.
The first method is based on "compilation" and consists in first flattening the specification into an atomic specification that allows rapid prototyping.
The second method is based on "interpretation" and makes use of a
program that "interprets" the different constructs. This method is particularly simple when the atomic specifications contained in the specification are constructive (see e.g. [Loeckx and Zeyer, 1995]).
13.7
μ1(s) = s if s ∈ S or s ∉ S(sp2), and μ1(s) = s' otherwise,
for all sorts s ∈ S(sp1);
μ2(s) = s if s ∈ S or s ∉ S(sp1), and μ2(s) = s'' otherwise,
for all sorts s ∈ S(sp2);
μ1(ω) = n: μ1(s_1) x ... x μ1(s_k) -> μ1(s) if ω ∈ Ω or ω ∉ S(sp2),
and μ1(ω) = n': μ1(s_1) x ... x μ1(s_k) -> μ1(s) otherwise,
for all operations ω = (n: s_1 x ... x s_k -> s) ∈ S(sp1), k ≥ 0;
μ2(ω) defined similarly,
where s', s'', ..., n', ... are "new", pairwise different sorts and operation names.
(ii) (Semantics.) M(sp1 +_(S,Ω) sp2)
and
13.8
The semantics of the specification language was defined above as a function M mapping specifications into abstract data types. This method for describing the semantics is called model-oriented. There exist other description methods, two of which will be briefly discussed. Each of these methods naturally leads to a slightly different semantics of some of the language constructs.
According to the specification-oriented (also called presentation-oriented) description method, the semantics of a specification language is a function N mapping specifications into atomic specifications defining the same abstract data type. Assume that N maps specifications into atomic loose specifications or, alternatively, into atomic initial specifications. Two examples of the semantics of the language constructs are:
N(at)
14
14.1
14.2
[ ]: -> list
Add: el x list -> list
_ . _ : list x list -> list
Isprefix: list x list -> bool
vars l, m, n: list, e: el
axioms [ ].l = l
Add(e, l).m = Add(e, l.m)
(Isprefix(l, m) = True) = (∃n.(m = l.n))
endmspec
Hence, Σ_i = ({bool, el}, {True, False}) and Σ_e = Σ_i ∪ ({list}, {[ ], Add, ...}).
Informally, the module specification specifies lists. The abstract data types
bool and el are to be defined "elsewhere". Clearly, the abstract data type
el is intended to be a parameter.
Loose module specifications with constructors and/or free constructors
may be defined similarly.
Next, the case of initial specifications is treated (cf. section 11.1).
Definition 14.5 (Initial module specification).
14.3
An elementary modularized specification language, called MSL, is presented. Like the specification language SL presented in sections 13.1 and
13.2, it is not designed for practical use but merely tries to capture the
main features of the modularized specification languages presented in the
literature.
Again, the following definition associates with each module specification msp a module signature S(msp). The goal of the different "context
conditions" contained in the definition is essentially to avoid name clashes
between the export and import signatures and thus preserve the property
of persistency.
Definition 14.6 (Abstract syntax of the modularized specification
language MSL). The set of module specifications msp of the language
MSL and their module signatures S(msp) are defined inductively:
(i)
any atomic module specification atm is a module specification;
S(atm) is the module signature of this atomic module specification;
(ii)
if msp1 and msp2 are module specifications with S(msp1) = (Σ1i, Σ1e), S(msp2) = (Σ2i, Σ2e) and if
each sort and each operation of Σ1e ∩ Σ2i is inherited in
each sort and each operation of Σ2e ∩ Σ1i is inherited in
[Figure 3 omitted: diagrams of the module signatures involved in msp1 + msp2 (panel (a)) and msp1 ∘ msp2 (panel (b)).]
Fig. 3. Illustration of the syntax of the constructs "+" and "∘" of the modularized specification language MSL.
S(msp2),
then
(msp1 + msp2)
is a module specification with S(msp1 + msp2) = (Σ1i ∪ Σ2i, Σ1e ∪ Σ2e) (cf. Figure 3(a));
(iii)
(iv)
= (Σ_i, Σ_e), if
Σ_i for each sort
(ii)
(iii)
(iv)
M(msp2)(B)
for all A ∈
(v)-(x) similar to (iv) (cf. Definition 13.2(iv) to (vii) and Definition 13.10(viii)
to (ix)).
The reader may easily check that these different definitions are consistent.
The different language constructs, except for freely extend and quotient, may be shown to preserve persistency (cf. the remark following Definition 13.10). The constructs also preserve consistency, except possibly +, model, generated and freely generated; the construct + preserves consistency when applied to persistent specifications.
As announced above, the constructs rename and forget may lead to
module specifications, the import signature of which is not a subsignature
of the export signature.
An environment may be added in a way similar to that of section 13.3.
Example 14.8. Let msp denote the atomic module specification of Example 14.4. An example of a module specification with environment is:
LIST is msp;
BOOL is loose mspec sorts freely generated bool
opns constr True : -> bool
constr False : -> bool
endmspec;
LIST o BOOL
14.4
"new" names for the sorts and operations of the import signature, it generally modifies the export signature too. In fact, the inherited sorts and operations in the export signature have to be renamed accordingly. The same holds for the inherited sorts occurring in the arities of the exported operations. For this reason the signature morphism is defined as a signature morphism μ : Σ_i ∪ Σ_e -> Σ' rather than μ : Σ_i -> Σ'.
Definition 14.9 (Abstract syntax of the parameterized specification language PSL). The set of parameterized specifications psp of the language PSL and their module signatures S(psp) are defined inductively:
(i)-(x) as Definition 14.6(i) to (x) but with "parameterized specification" instead of "module specification";
(xi)
if psp is a parameterized specification with S(psp) = (Σ_i, Σ_e) and if μ : Σ_i ∪ Σ_e -> Σ' is a surjective signature morphism satisfying the following four conditions:
(a) for each sort s from Σ_e \ Σ_i, μ(s) = s,
(b) for each operation ω from Σ_e \ Σ_i, μ(ω) and ω have the same operation name,
(c) for any two different sorts or operations so_1e and so_2e from Σ_e, μ(so_1e) = μ(so_2e) implies that both so_1e and so_2e are inherited,
(d) for any sort or operation so_i from Σ_i and so_e from Σ_e, μ(so_i) = μ(so_e) implies that so_e is inherited,
then
(import rename psp by μ)
is a parameterized specification with
S(import rename psp by μ) = (μ(Σ_i), μ(Σ_e));
(xii) if psp is a parameterized specification with S(psp) = (Σ_i, Σ_e) and if Φ ⊆ L(Σ_i) is a set of formulas for some logic L, then
(psp import model Φ)
is a parameterized specification with
S(psp import model Φ) = S(psp).
Informally, conditions (xi)(a) and (xi)(b) express the fact that μ constitutes a renaming of the import signature. More precisely, condition (xi)(a) expresses the fact that μ renames sorts from Σ_e only if they are inherited. Condition (xi)(b) expresses the same property for operations or, at least, for their names; the condition does not extend to their arities: if a non-inherited operation ω from Σ_e contains imported sorts in its arity, ω and μ(ω) may differ from each other in their arities. Conditions (xi)(c) and (xi)(d) avoid "name clashes". More precisely, condition (xi)(c) expresses
the fact that μ is injective on the non-inherited sorts and operations from Σ_e. Condition (xi)(d) expresses the fact that μ may identify a sort or operation from Σ_i and a sort or operation from Σ_e only if the latter is inherited. Note that the signature morphism μ is not necessarily bijective and hence may fail to constitute a renaming in the sense of Definition 5.1(ii). This is sensible because it must be possible that two different formal parameters get the same actual value (see Example 14.11).
In the concrete syntax one uses pspec and endpspec instead of mspec and endmspec.
Definition 14.10 (Semantics of PSL). The meaning M(psp) of a parameterized specification psp is a module (in the sense of Definition 14.2) and is inductively defined according to Definition 14.9:
(i) to (x) as in Definition 14.7 but with "parameterized specification" instead of "module specification";
(xi) if S(psp) = (Σ_i, Σ_e), then
M(import rename psp by μ)(A)
= { B ∈ Alg(μ(Σ_e)) | (B | (μ|Σ_e)) ∈ M(psp)(A | (μ|Σ_i)) }
for each A ∈ Alg(μ(Σ_i));
(xii) if S(psp) = (Σ_i, Σ_e), then
M(psp import model Φ)(A) = M(psp)(A) if A ⊨ Φ, and ∅ otherwise,
for each A ∈ Alg(Σ_i).
Now let
NAT is (loose pspec sorts
Then
ORDERED-LISTS (sorts nat, opns < : nat x nat -> bool) ∘
is a specification of ordered lists of natural numbers. Of course, this specification makes sense only if the module it defines is consistent (in the sense of Definition 14.2). Hence, it is necessary to prove that "<" satisfies the axioms of "⊑".
By the way, if the reader is still irritated by the clumsy notation he
should remember that the specification languages presented are not intended for "practical" use.
14.5
Comments
14.6
(14.1)
where par and sp are specifications of a specification language with environment, such as the specification language PSL (with an environment), and where X is a name. It is understood that X may occur in the specification sp. Informally, the notation "X : par" indicates that actual parameters have to be of "type" par, i.e. have to belong to the abstract data type defined by par. Somewhat more precisely, (14.1) is equivalent to the following parameterized specification of PSL:
the environment is extended by the declaration
X is ...
where ... stands for the loose atomic module specification ((S(par), S(par)), Th(M(par))); informally, this module specification leaves the signature S(par) unchanged but adds the formulas of the theory of the abstract data type defined by par;
sp is turned into a parameterized specification of PSL by defining the
sorts and operations of S(par) as imported sorts and operations or,
more precisely, as parameters.
Alternatively, one may define ... to stand for the loose atomic module specification ((S(par), S(par)), ∅) and add Th(M(par)) to the parameterized specification obtained from sp with the help of the construct import model.
The following approach is called the pushout approach because the
parameter passing mechanism is defined as a pushout in an appropriate
category. The approach was first described in [Ehrich and Lohberger,
1979]. It makes use of the notion of a specification morphism (introduced
in Definition 13.19(ii)). To simplify the description of the approach it is assumed that all specifications are atomic ones or, alternatively, have been turned into atomic ones by the meaning function N of section 13.8. A
parameterized specification is now a specification morphism
[Diagram omitted: the pushout square relating par, sp(par) and act under the parameter passing morphism π.]
language PSL when the parameter passing μ is surjective. The renaming effect of parameter passing may then be simulated by the import rename construct together with the (surjective) signature morphism μ.
15
Further topics
The present section discusses topics the treatment of which has not (yet)
reached the level of maturity of the topics treated above. The reader interested in more details is referred to the bibliographic notes of section
15.7.
15.1
Behavioural abstraction
Clearly, A and B are not isomorphic and even fail to have the same theory. For instance, putting
t = Add(0, ∅), u = Add(0, Add(0, ∅)),
one has A(t) = A(u), but B(t) ≠ B(u).
On the other hand the algebras A and B behave "similarly" with respect to the operation "∈". More precisely, it is easy to show that A(t) = B(t) for all ground terms t ∈ T_{Σ,bool} of sort bool. Informally, this property ensues from the fact that membership depends neither on the order of insertion nor on the number of times an element has been inserted. Hence the algebras A and B are "similar" whenever one restricts attention to the membership problem or, put another way, to the effect of set on the "outside world" of bool.
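The phenomenon can be replayed in a small program (ours, not from the text; we take A to interpret set as finite sets and B to interpret it as sequences, which matches the properties used above):

```python
# Sketch (ours): two interpretations of Add and the membership test.
# A uses finite sets, B uses sequences; they differ on terms of sort
# set but agree on all ground terms of sort bool.

def add_A(n, s):
    return frozenset(s) | {n}

def add_B(n, s):
    return (n,) + tuple(s)

def member(n, s):      # works for both representations
    return n in s

t_A = add_A(0, frozenset())
u_A = add_A(0, add_A(0, frozenset()))
assert t_A == u_A                  # A(t) = A(u)

t_B = add_B(0, ())
u_B = add_B(0, add_B(0, ()))
assert t_B != u_B                  # B(t) != B(u)

# ... yet every membership query (every ground term of sort bool)
# receives the same value in both algebras:
for n in (0, 1):
    assert member(n, t_A) == member(n, t_B)
    assert member(n, u_A) == member(n, u_B)
```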
Add(n, Add(n, s)) = Add(n, s)
Add(n, Add(m, s)) = Add(m, Add(n, s))
distinguishing sets from lists.
15.2 Implementation
Informally, an implementation consists in the design of an efficient program
for a given specification. This design includes, in particular, the transformation of a possibly loose non-constructive specification into a constructive
one. It generally also includes a change of data types.
The implementation of data types by other more "concrete" ones is
briefly discussed in the present section. To abstract from specification
details, the notions are defined on data types rather than on specifications.
Example 15.10. A classical example of an implementation is that of
stacks by arrays and integers. Informally, the contents of a stack of depth
n are stored in the elements A[l],..., A[n] of an array A; the integer has
the value n and thus "points" to the topmost stack element. The operation Push is implemented by writing the new element into A[n + 1] and by
adding one to the integer. The operation Pop is implemented simply by
subtracting one from the integer.
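The implementation of Example 15.10 can be sketched as follows (a Python sketch; the class and method names are illustrative, and the array is modelled as a map from indices to elements):

```python
class ArrayStack:
    """Stack of depth n stored in A[1..n] plus an integer pointer n."""
    def __init__(self):
        self.a = {}              # the array A
        self.n = 0               # "points" to the topmost stack element

    def push(self, x):
        self.a[self.n + 1] = x   # write the new element into A[n+1] ...
        self.n += 1              # ... and add one to the integer

    def pop(self):
        self.n -= 1              # simply subtract one; A[n+1] is left behind

    def top(self):
        return self.a[self.n]
```

Note that pop leaves the old array entry in place: two array/pointer pairs differing only beyond the pointer represent the same stack.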
Clearly, an implementation of stacks by arrays and integers is "sensible": the abstract data types array and integer are more "concrete" than
the abstract data type stack because they are available in any imperative
programming language and thus are "nearer" to a program.
The example above illustrates three features of an implementation:
(i) a data type is in general implemented by several data types rather
than by a single one; in the example stack is implemented by array
and integer;
(ii) there does not necessarily exist a one-to-one relation between a data
type and its implementation; in the example arrays differing only by
the elements beyond the element pointed to by the integer represent
the same stack;
(iii) the implementing data types may get values that do not represent
values of the implemented data type; in the example an integer may,
for instance, have a negative value.
At first glance one may think that the notion of behavioural equivalence introduced in section 15.1 may capture the notion of implementation.
Actually, this does not work because of feature (iii) above.
The following definition is intended to capture features (ii) and (iii).
Definition 15.11 (Realization). Let A and B be Σ-algebras for some
signature Σ. The algebra B is said to realize A if A is isomorphic to a
quotient of a subalgebra of B.
Informally, the definition accounts for feature (iii) by means of the subalgebra and for feature (ii) by means of the quotient.
The next definition is intended to capture feature (i). The specifications
used in it are from a specification language such as SL.
Definition 15.12 (Construction term). Let B_1, …, B_m be algebras
for signatures Σ_1, …, Σ_m, m ≥ 1. Let sp_1, …, sp_m be strictly adequate
specifications of B_1, …, B_m (cf. Definition 9.1). A construction term of
B_1, …, B_m is a monomorphic specification sp that:
contains (occurrences of) the specifications sp_1, …, sp_m instead of
(occurrences of) atomic specifications;
uses only the constructs "+", rename and forget.
Informally, sp is obtained by "putting together" sp_1, …, sp_m.
Finally, it is possible to introduce the notion of an implementation of
an algebra by other algebras.
Definition 15.13 (Implementation). Let A, B_1, …, B_m, m ≥ 1, be
algebras and let sp be a construction term of B_1, …, B_m. The specification
sp is said to constitute an implementation of A by the basis B_1, …, B_m if
each algebra of the abstract data type M(sp) defined by sp realizes A.
The condition "sp realizes A" in this definition constitutes the correctness
criterion of the implementation. The investigation of this criterion is one
of the topics in the study of implementations.
The notion of implementation introduced above is for algebras. A generalization of this notion for abstract data types and modularized abstract
data types is straightforward. This generalization naturally leads to a notion of implementation for specifications and parameterized specifications.
An interesting question is whether the realization "is implemented by"
is transitive. Another question is whether the relation is compatible with
the constructs of a specification language.
15.3
Ordered sorts
{ 0 : → int,
Succ : nat → int,
Pred : nat → int,
Root : nat → int,
Eq : nat × int → bool,
Eq : int × nat → bool,
Eq : nat × nat → bool }.
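With an order on sorts such as nat ≤ int, the overloaded declarations of Eq above all become instances of a single declaration at the larger sorts. A small sketch of this resolution (the names SUBSORT, RANKS and applicable are assumptions, not the chapter's notation):

```python
# Assumed subsort relation: nat is a subsort of int.
SUBSORT = {("nat", "int")}

def le(s, t):
    """s is a subsort of t (reflexively)."""
    return s == t or (s, t) in SUBSORT

# A single declaration Eq : int x int -> bool suffices.
RANKS = {"Eq": [(("int", "int"), "bool")]}

def applicable(op, arg_sorts):
    """An overloaded declaration applies whenever each argument sort is a
    subsort of the declared argument sort."""
    return [(args, res) for args, res in RANKS[op]
            if len(args) == len(arg_sorts)
            and all(le(s, t) for s, t in zip(arg_sorts, args))]
```

All three overloaded versions of Eq listed above are covered by the one declaration.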
15.4
Exceptions
The main difference with the "classical" (partial) algebra of stacks is that
A(Pop) is now a total function, its domain being A(nestack) rather than
A(stack).
A precise treatment of the order-sorted solution is outside the scope of
this overview. Let it suffice to note that a sort s_OK is added for each
"concerned" sort s. This sort is "safe" in that it does not lead to errors.
In the above example nestack stands for stack_OK.
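The safe-subsort idea can be sketched with subtypes (an illustrative Python rendering; the class names are assumptions, with NEStack standing for nestack = stack_OK):

```python
class Stack:                       # sort stack
    pass

class EmptyStack(Stack):           # the empty stack: outside the safe subsort
    pass

class NEStack(Stack):              # sort nestack = stack_OK, the safe subsort
    def __init__(self, top, rest):
        self.top = top
        self.rest = rest
    def pop(self):                 # Pop is TOTAL on NEStack: no error case
        return self.rest
```

Since pop is declared only on NEStack, it is a total function with domain A(nestack), exactly as in the order-sorted treatment; attempting Pop on the empty stack is ruled out statically rather than raising an error.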
15.5
Dynamic data types
Data types are "static" structures consisting of sets and functions. Pieces
of software, however, nearly always contain "dynamic" parts consisting of
processes operating on data. As long as these processes are algorithms,
they can conveniently be described in a functional style that is compatible with algebraic data type specification, i.e. the algorithms are specified
as functions within data types. However, algebraic specification as described in this chapter reaches its limits when the processes are reactive,
i.e. when they have (possibly elaborate) interaction structures with the
environment, and when they are concurrent, i.e. when several processes act
independently and autonomously.
Dynamic data types constitute an extension of "static" data types that
is intended to overcome these limits. Several approaches have been proposed in the literature. The ones most widely discussed so far are based
on the idea that concurrent processes are simply data. Basically, these
approaches introduce an imperative flavour into the functional style of abstract data type specification by introducing sorts for process data (such
as states and actions) and process operations (such as state transitions).
On the semantic level some of these approaches use infinitary extensions of
algebras (such as continuous algebras or projection spaces) and a form of
behavioural semantics.
In order to get the flavour of the dynamic data type approach, the basic
concepts of one of the earliest and most elaborate approaches in this area,
viz. SMoLCS [Astesiano and Reggio, 1987], are briefly outlined.
Definition 15.19 (Dynamic signature). A dynamic signature is a triple
(Σ, S_0, Π) where:
Σ = (S, Ω) is a signature in the sense of Definition 2.1,
S_0 ⊆ S is a set of sorts called dynamic sorts,
Π is a set of transition predicates of the form
π : s_1 × … × s_k
Example 15.20. The following dynamic signature ((S, Ω), S_0, Π) is intended to model state variables of sort s of an imperative programming language:
S = { s, Var[s], l-Var[s] },
S_0 = { Var[s] },
Ω = { [⊥] : → Var[s],
[_] : s → Var[s],
:= _ : s → l-Var[s] },
Π = { _ -_-> _ : Var[s] × l-Var[s] × Var[s] }.
Informally, the sort s stands for the values to be assigned, the sort Var[s]
stands for the variables containing values of sort s and the sort l-Var[s]
stands for labels identifying transitions, i.e. "actions" on variables. For
instance, [⊥] stands for a variable with undefined contents and [a] stands
for a variable with contents a. Similarly,
:= a
stands for the action of assigning the value a to a variable. Finally, Π
is used to denote the possible transitions representing the assignments of
values to variables.
The signature is used in Examples 15.21 and 15.22.
A precise definition of the notion of a dynamic algebra is beyond the
scope of this overview. Informally, a dynamic algebra for the dynamic
signature (Σ, S_0, Π) consists of:
a Σ-algebra (in the sense of Definition 2.3), say A;
for each element of Π, say π : s_1 × … × s_k, a subset of A(s_1) × … × A(s_k).
In such an algebra the triples of the subset defined by "_ -_-> _" constitute
the possible transitions.
Example 15.21. A possible dynamic algebra for the dynamic signature
of Example 15.20 is defined as follows:
the Σ-algebra A is as indicated in the comments on Example 15.20;
the meaning of Π is defined to be:
{ [⊥] -(:= a)-> [a] | a ∈ A(s) } ∪ { [a] -(:= b)-> [b] | a, b ∈ A(s) }.
The triples of (the meaning of) Π represent the possible transitions; each
transition consists in the action of assigning a value to a variable.
Note that in the example the actions of the transition predicate are deterministic. This need not be so in general.
Abstract dynamic data types and dynamic specifications are defined as
usual.
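The labelled transition system of Example 15.21 can be sketched for a small value sort, say A(s) = {0, 1} (an illustrative sketch; BOT and the ("assign", a) labels are assumed encodings of [⊥] and := a):

```python
S = {0, 1}          # a small carrier for the value sort s
BOT = "bot"         # the state [⊥]: a variable with undefined contents

def transitions():
    """All triples (source, label, target) of the transition predicate."""
    # [⊥] -(:= a)-> [a] : assigning to an uninitialised variable
    ts = {(BOT, ("assign", a), a) for a in S}
    # [a] -(:= b)-> [b] : overwriting the current contents
    ts |= {(a, ("assign", b), b) for a in S for b in S}
    return ts
```

Each action := a is deterministic here, as noted above: for a fixed source state and label there is exactly one target state.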
15.6
Objects
Since the advent of SMALLTALK [Kay and Goldberg, 1976; Kay, 1993],
the underlying paradigm of object-orientation has become quite successful,
as illustrated by the numerous object-oriented programming languages,
database systems, and software development methods. In this paradigm,
software systems are considered to be dynamic collections of autonomous
objects that interact with each other. Autonomy means that each object
encapsulates all features needed to act as an independent computing agent:
individual attributes (data), methods (operations), behaviour (process) and
communication facilities. Moreover, each object has a unique identity that
is immutable throughout its lifetime. In addition, object-orientation
comes with an elaborate system of types and classes, facilitating structuring
and reuse of software.
Object specification combines ideas from algebraic data type specification, conceptual data modelling, behaviour modelling, specification of
reactive systems, and concurrency theory. The basic concepts are now illustrated by means of an example, using an ad-hoc notation in algebraic
specification style.
Example 15.22. The following specification introduces an object class
Var[s] of state variables of sort s (cf. Examples 15.20 and 15.21):
object class Var[s];
uses s, nat;
attributes value:s;
actions create; :=s; delete;
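The object class Var[s] above can be rendered in Python as follows (a sketch in object-oriented style, not the chapter's semantics; the identity counter and method names are assumptions):

```python
import itertools

_ids = itertools.count()        # source of unique, immutable identities

class Var:
    """Object class Var[s]: attribute 'value : s', actions create / := / delete."""
    def __init__(self):         # action create
        self.oid = next(_ids)   # identity, immutable throughout the lifetime
        self.value = None       # attribute value : s, undefined at birth
        self.alive = True

    def assign(self, v):        # action := s
        self.value = v

    def delete(self):           # action delete: end of the object's lifetime
        self.alive = False
```

Each instance encapsulates its own data (value), operations (assign), and lifecycle (create/delete), and carries an identity distinct from its current attribute values.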
15.7
Bibliographic notes
is pursued in [Dauchy and Gaudel, 1994; Zucca, 1999; Ehrig et al., 1995].
A recent overview of the topic may be found in [Astesiano et al., 1999a].
Object specification as outlined here is based on work by the second
author of this chapter and others [Sernadas et al., 1987; Ehrich and Sernadas, 1995; Ehrich, 1999; Ehrich and Hartel, 1996] and on the TROLL
language project [Hartmann et al., 1995; Jungclaus et al., 1996; Hartel et
al., 1997]. Other noteworthy object specification approaches related to algebraic specification are FOOPS [Goguen and Meseguer, 1987; Goguen and
Socorro, 1995] and MAUDE [Meseguer, 1993]: FOOPS is based on the specification language OBJ3 [Goguen and Winkler, 1988]; MAUDE is based on
rewriting logic that is a uniform model of concurrency [Meseguer, 1992].
16
16.1
Many authors use categories to introduce and discuss the topics treated in
this chapter (e.g. [Ehrig and Mahr, 1985; Ehrig and Mahr, 1990; Sannella
and Tarlecki, 2001]). By its very abstractness category theory can lead
to more general results and avoid repeating similar proofs. On the other
hand the approach may be difficult to understand for a reader who is not
familiar with category theory.
The starting point of this approach is the fact that any class of algebras together with their homomorphisms constitutes a category. A monomorphic
module (in the sense of Definition 14.2(iii)) may then be viewed as a functor on such categories. Several notions introduced and properties proved in
the present chapter may be viewed as notions and properties from category
theory. For instance, the notions of an initial algebra and of a reduct correspond to the categorical notions of an initial object and (at least in the
case of an inclusion) of a forgetful functor. As another example, the notion
of a free extension is related to the categorical notion of a free functor.
Another interesting category is constituted by the atomic specifications
together with their specification morphisms (see Definition 13.19(ii)). It
allows one to define the amalgamated union (Definition 13.18) and the
parameter passing mechanism as pushouts (see section 14.6).
The reader interested in category theory and its application to the specification of abstract data types may consult [Sannella and Tarlecki, 2001].
16.2
Institutions
References
[Abrial, 1996] J. R. Abrial. The B-Book: Assigning Programs to Meanings.
Cambridge University Press, 1996.
[Andrews and Ince, 1991] D. Andrews and D. Ince. Practical Formal Methods with VDM. McGraw-Hill, 1991.
[Astesiano et al., 1999a] E. Astesiano, M. Broy and G. Reggio. Algebraic
specification of concurrent systems. In E. Astesiano, H.-J. Kreowski and
B. Krieg-Brückner, editors, Algebraic Foundations of Systems Specification, pages 467-520. Springer-Verlag, 1999.
[Astesiano et al., 1999b] E. Astesiano, H.-J. Kreowski and B. Krieg-Brückner, editors, Algebraic Foundations of Systems Specification.
Springer-Verlag, 1999.
[Astesiano and Reggio, 1987] E. Astesiano and G. Reggio. An outline of
the SMoLCS methodology. In M. V. Zilli, editor, Proc. Advanced School
on Mathematical Models for the Semantics of Parallelism, Lecture Notes
in Computer Science 280, pages 81-113. Springer-Verlag, 1987.
[Astesiano and Reggio, 1993] E. Astesiano and G. Reggio. Algebraic specification of concurrency. In M. Bidoit and C. Choppy, editors, Recent
311
312
313
314
[Hartmann et al., 1995] T. Hartmann, G. Saake, R. Jungclaus, P. Hartel, and J. Kusch. Revised Version of the Modelling Language TROLL
(Version 2.0). Informatik-Bericht 94-03, Technische Universität Braunschweig, 1995.
[ISO-LOTOS, 1989] ISO-LOTOS. A formal description technique based
on the temporal ordering of observational behaviour. Technical report
IS 8807. International Standards Organization, 1989.
[Jungclaus et al., 1996] R. Jungclaus, G. Saake, T. Hartmann, and C. Sernadas. TROLL: a language for object-oriented specification of information systems. ACM Transactions on Information Systems, 14(2):175-211, 1996.
[Kaplan, 1989] S. Kaplan. Algebraic specification of concurrent systems.
Theoretical Computer Science, 69(1):69-115, 1989.
[Kay, 1993] A. Kay. The early history of Smalltalk. ACM SIGPLAN Notices, pages 69-96, March 1993.
[Kay and Goldberg, 1976] A. Kay and A. Goldberg. Smalltalk-72 Instruction manual. Technical report, Xerox PARC, March 1976.
[Klop, 1992] J.W. Klop. Term rewriting systems. In S. Abramsky, D.M.
Gabbay, and T.S.E. Maibaum, editors, Handbook of Logic in Computer
Science. Volume 2. Background: Computational Structures, pages 2-117.
Clarendon Press, 1992.
[Lehmann and Loeckx, 1993] T. Lehmann and J. Loeckx. OBSCURE,
a specification language for abstract data types. Acta Informatica,
30(4):303-350, 1993.
[Loeckx, 1997] J. Loeckx. Hierarchical constructive specifications and their
termination. Internal note, Computer Science Department, University
of Saarbrücken, 1997.
[Loeckx and Sieber, 1987] J. Loeckx and K. Sieber. The Foundations of
Program Verification. Wiley/Teubner, 1987.
[Loeckx and Zeyer, 1995] J. Loeckx and J. Zeyer. Experiences with a specification environment. In M. Broy and S. Jähnichen, editors, KORSO:
Methods, Languages and Tools for the Construction of Correct Software,
Lecture Notes in Computer Science 1009, pages 255-268. Springer-Verlag,
1995.
[Loeckx et al., 1996] J. Loeckx, H.-D. Ehrich, and M. Wolf. Specification
of Abstract Data Types. Wiley/Teubner, 1996.
[Meseguer, 1992] J. Meseguer. Conditional rewriting as a unified model of
concurrency. Theoretical Computer Science, 96(1):73-156, 1992.
[Meseguer, 1993] J. Meseguer. A logical theory of concurrent objects and
its realization in the Maude language. In G. Agha, P. Wegener, and
A. Yonezawa, editors, Research Directions in Object-Oriented Programming, pages 314-390. MIT Press, 1993.
Contents
1
Introduction
1.1 Computing in algebras
1.2 Examples of computable and non-computable functions
1.3 Relations with effective algebra
1.4 Historical notes on computable functions on algebras
1.5 Objectives and structure of the chapter
1.6 Prerequisites
2
Signatures and algebras
2.1 Signatures
2.2 Terms and subalgebras
2.3 Homomorphisms, isomorphisms and abstract data types
2.4 Adding Booleans: Standard signatures and algebras
2.5 Adding counters: N-standard signatures and algebras
2.6 Adding the unspecified value u: Algebras A^u of signature Σ^u
2.7 Adding arrays: Algebras A* of signature Σ*
2.8 Adding streams: Algebras Ā of signature Σ̄
3
While computability on standard algebras
3.1 Syntax of While(Σ)
3.2 States
3.3 Semantics of terms
3.4 Algebraic operational semantics
3.5 Semantics of statements for While(Σ)
3.6 Semantics of procedures
3.7 Homomorphism invariance theorems
3.8 Locality of computation
3.9 The language WhileProc(Σ)
3.10 Relative While computability
3.11 For(Σ) computability
3.12 While^N and For^N computability
Introduction
of natural numbers. The theory establishes what can and cannot be computed in an explicit way using finitely many simple operations on numbers.
The set of naturals and a selection of these simple operations together form
an algebra. A mathematical objective of the theory is to develop, analyse
and compare a variety of models of computation and formal systems for
defining functions over a range of algebras of natural numbers.
Computability theory on N is of importance in science because it establishes the scope and limits of digital computation. The numbers are
realised as concrete symbolic objects and the operations on the numbers
can be carried out explicitly, in finitely many concrete symbolic steps. More
generally, the numbers can be used to represent or code any form of discrete
data. However, the question arises:
Can we develop theories of functions that can be defined by
means of algorithms on other sets of data?
The obvious examples of numerical data are the integer, rational, real and
complex numbers; and associated with these numbers there are data such
as matrices, polynomials, power series and various types of functions. In
addition, there are geometric objects that are represented using the real and
complex numbers, including algebraic curves and manifolds. Examples of
syntactic data are finite and infinite strings, terms, formulae, trees and
graphs. For each set of data there are many choices for a collection of
operations from which to build algorithms.
How specific to the set of data and chosen operations are these
computability theories? What properties do the computability
theories over different sets of data have in common?
The theory of the computable functions on N is stable, rich and useful;
will the theory of computable functions on the sets of real and complex
numbers, and the other data sets also be so?
The theory of computable functions on arbitrary many-sorted algebras
will answer these questions. It generalises the theory of functions computable on algebras of natural numbers to a theory of functions computable
on any algebra made from any family of sets and operations. The notion
of 'computable' here presupposes an algorithm that computes the function
in finitely many steps, where a step is an application of a basic operation
of the algebra. Since the data are arbitrary, the algorithm's computations
are at the same level of abstraction as the data and basic operations of
the algebra. For example, this means that computations over the field R
of real numbers are exact rather than approximate. Thus, the algorithms
and computations on algebras are intimately connected to their algebraic
properties; in particular, the computability theory is invariant under isomorphisms.
Already we can see that, in the general case, there is likely to be a
ramification of computability notions. For example, in the case of computable functions on the set E of real numbers it is also natural to consider
1.1
Computing in algebras
of functions on the sets called operations; these functions are of the form
F : A_{s_1} × … × A_{s_n} → A_s
and can be total or partial. Among the operations are some standard
functions on the Booleans. Such an algebra is called a standard manysorted algebra; we say it is standard because it contains the Booleans and
their special operations. An algebra is often written
(A_1, …, A_k, c_1, …, c_p, F_1, …, F_q).
A set of names for the data set, constants and operations (and their
arities) of the algebra A is called a signature.
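A standard many-sorted algebra of this shape can be rendered very directly in Python (a sketch only; the dictionary layout and the particular carriers and operation names are illustrative assumptions):

```python
# Carriers A_1, ..., A_k; constants c_1, ..., c_p; operations F_1, ..., F_q.
# The Booleans and their standard operations are included, making the
# algebra "standard".
algebra = {
    "carriers": {"nat": range(10**6), "bool": (True, False)},
    "constants": {"zero": 0, "tt": True, "ff": False},
    "operations": {
        "succ": lambda n: n + 1,        # F : A_nat -> A_nat (total)
        "and":  lambda a, b: a and b,   # standard Boolean operation
        "not":  lambda a: not a,
        "eq":   lambda n, m: n == m,    # F : A_nat x A_nat -> A_bool
    },
}
```

The keys of the three dictionaries play the role of the signature: names for the data sets, constants and operations, from which their arities can be read off.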
For most of the time we will use many-sorted algebras with finitely
many constants and total operations, but we will need the case of partial
operations to discuss the relationship between our computable functions
and continuous functions on topological algebras such as algebras of real
numbers, and algebras with infinite data streams.
The problem is to develop and classify models of computation that
describe ways of constructing new functions on the set 5 from the basic
operations of the algebra A. In particular, each model of computation M is
a method or technique which we use to define the notion that the function
f on the carriers of A is computable from the operations on A by means
of method M; and we collect all such functions into the set
M-Comp(A)
of functions M-computable over the algebra A.
There are many useful choices for a model of computation M with
which to develop a computability theory; we list several in a survey in
section 8. In this chapter we focus on a theory for computing with a simple
imperative model, namely the While programming language.
In this programming language, basic computations on an algebra A are
performed by concurrent assignment statements of the form
x_1, …, x_n := t_1, …, t_n
where x_1, …, x_n are program variables and t_1, …, t_n are terms or expressions built from variables and the operation symbols from the signature of
the algebra A; and x_i and t_i correspond in their types (1 ≤ i ≤ n).
The control and sequencing of the basic computations are performed by
three constructs that form new programs from given programs S_1, S_2 and
S, and Boolean test b:
(i) the sequential composition construct
While(A*)
and
While(A).
By this means it is trivial to add constructs like counters, finite arrays and
infinite data streams to the theory of computation, though it is not trivial
to chart the consequences.
In summary, what mechanisms are available for computing in an algebra? The methods of computation are merely:
(i) basic operations of the algebra; and
(ii) sequencing, branching and iterating the operations.
Is equality computable? Do we have available unlimited data storage?
Can we search the algebra for data?
We will see that for any many-sorted algebra A with the Booleans, by
adding the naturals, we can add
(iii) any algorithmic construction on a numerical data representation;
and, by adding A*, we can add
(iv) local search through all elements of the subalgebra generated by given
input data;
(v) unlimited storage for data in computations.
To obtain equality we have to postulate it as a basic operation of the
algebra.
We will study these models of computation. The most important turns
out to be the programming language While*, which consists of While
programs with the naturals and finite arrays, and is defined simply by
While*(A) = While(A*).
This is the fundamental model of imperative programming that yields a
full generalisation, to an arbitrary many-sorted algebra A, of the theory of
computable functions on the set N of natural numbers, and for which the
generalised Church-Turing thesis for computation on A will be formulated
and justified.
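The ingredients just listed (terms over an algebra's operations, concurrent assignment, sequencing, branching and iteration) can be sketched as a tiny interpreter, parametrised by an algebra. This is an informal sketch, not the chapter's formal semantics: the tuple encoding of programs and the operation names are mine, and, as noted above, equality (here an inequality test noteq) has to be postulated as a basic operation of the algebra.

```python
def eval_term(t, algebra, state):
    """Evaluate a term: a variable (a string) or (op, t_1, ..., t_n)."""
    if isinstance(t, str):
        return state[t]
    op, *args = t
    return algebra[op](*(eval_term(a, algebra, state) for a in args))

def run(prog, algebra, state):
    kind = prog[0]
    if kind == "assign":                 # concurrent x_1,..,x_n := t_1,..,t_n
        _, xs, ts = prog
        vals = [eval_term(t, algebra, state) for t in ts]  # evaluate all first
        state.update(zip(xs, vals))
    elif kind == "seq":                  # sequential composition S_1; S_2; ...
        for s in prog[1:]:
            run(s, algebra, state)
    elif kind == "if":                   # if b then S_1 else S_2
        _, b, s1, s2 = prog
        run(s1 if eval_term(b, algebra, state) else s2, algebra, state)
    elif kind == "while":                # while b do S
        _, b, s = prog
        while eval_term(b, algebra, state):
            run(s, algebra, state)
    return state

# The algebra (N; 0, n+1), with an inequality test postulated as basic:
NAT = {"zero": lambda: 0,
       "succ": lambda n: n + 1,
       "noteq": lambda n, m: n != m}
```

For instance, addition by iterated successor is the While program `while i ≠ y do x, i := succ(x), succ(i)`, run from a state with i = 0; the concurrent assignment updates both variables in one step.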
1.2
Examples of computable and non-computable functions
First, let us look at the raw material of our theory, namely problems concerning computing functions and sets on specific algebraic structures. We
will give a list of questions about computing with While programs on
different algebras and invite the reader to speculate on their answers; it is
not essential that the reader understand or recognise all the concepts in
the examples. The idea is to prepare the reader for the role of algebraic
structures in the theory of computable functions and sets, and arouse his
or her curiosity.
1.-2. Consider each of the following functions:
f(n) = 4
f(n) = n
f(n) = n + 1
f(n, m) = n + m
f(n, m) = n · m
In each case is f ∈ While(N; 0, n + 1)?
3. Let B be the set of Booleans and f : B^n → B. Is f ∈ While(B; tt, ff,
and, not)?
4. Let A be a finite set and f : A → A. Is f ∈ While(A; c_1, …, c_p, F_1,
…, F_q) for any choice of constants c_i and operations F_j on A?
5. Consider the algebra
(B, N, [N → B]; tt, ff, and, not, 0, n + 1, eval)
of Booleans expanded by adding the set N of naturals with zero and
successor, and the set [N → B] of infinite sequences, or streams, of
Booleans, with the evaluation map eval : [N → B] × N → B defined
by eval(b, n) = b(n). Are the following functions While computable
over this algebra:
shift : [N → B] × N → B defined by shift(a, n) = a(n + 1);
Shift : [N → B] → [N → B] defined by Shift(a)(n) = a(n + 1)?
6. Which of the following sets of Boolean streams are
(i) While computable, and
(ii) While semicomputable, over the stream algebra in question 5?
{ a | for some n, a(n) = tt }
{ a | for all n, a(n) = tt }
{ a | for infinitely many n, a(n) = tt }
{ a | a(0) = tt, …, a(n) = tt } for some fixed n
7. Consider each of the following functions:
f(x) = 4
f(x) = x
f(x) = x/2
f(x) = x^5
f(x) = √x
f(x) = floor(x)
f(x) = 1/x
f(x) = π
f(x) = sin(x)
f(x) = cos(x)
f(x) = tan(x)
f(x) = e^x
f(x) = 0 if x < r, 1 if x ≥ r.
(a_0, …, a_n ∈ R).
Rec(R) = While(A(R)),
= { f^n(x) | n ∈ N, 0 < x < 1 }
27. Consider the rings Z[X_1, …, X_n] of all polynomials in n indeterminates over the integers. Is the ideal membership relation
q ∈ (p_1, …, p_m)
(in q, p_1, …, p_m) While decidable over this ring?
28. Consider the rings F[X_1, …, X_n] of all polynomials in n indeterminates over a field F. Is the ideal membership relation While
decidable over this ring?
29. Consider the algebra T(Σ, X) of all terms over signature Σ in the
finite set X of indeterminates. Let A^X be the set of assignments to
X in an algebra A. Define the term evaluation function
TE : T(Σ, X) × A^X → A
in either of the following two directions. One can apply computability theory on N to algebras using maps from sets of natural numbers to algebras
called numberings. The long-established theories of decision problems in
semigroups, groups, rings and fields, etc. are examples of this approach.
Furthermore, the theory of computable functions Comp(R) on the set
R of real numbers in computable analysis uses the computability theory
Comp(N) on N to formalise how real number data and functions are approximated effectively. Theories based on these approaches are parts of what
we here call effective algebra.
Alternatively, one can generalise the computability theory on N to accommodate abstract structures; the theory of computable functions on
many-sorted algebras developed in this chapter is an example, of course,
and more will be said about equivalent models of computation in section
8. However, there are examples of generalised computation theories that
are strictly stronger, such as ordinal recursion theory, set recursion theory, higher type recursion theory and domain theory. Typically these
four generalised computability theories allow infinite computations to return outputs. To appreciate the diversity of some of these theories it
is necessary to examine closely their original motivations; seen from our
simple finitistic algebraic point of view, generalised recursion theories have
a surprisingly untidy historical development.
Let us focus on the first direction. Effective algebra is a theory that
provides answers for questions such as:
When is an algebra A computable? What functions on A are
computable? What sets on A are decidable or, at least, semidecidable?
It attempts to establish the scope and limits of computation by means of
algorithms for any set of data, by applying the theory of computation on N
to universal algebras containing the set of data using numberings. Thus, it
classifies what data can be represented algorithmically, and what sets and
functions can be defined by algorithms, in the same terms as those of the
Church-Turing thesis for algorithms on N. Assuming such a thesis, we may
then use the theory of the recursive functions on N to give precise answers
to the above questions about algebras, and to the question:
What sets of data and functions on those data can be implemented on a computer in principle?
The numberings capture the scope and limits of digital data representation and, thus, effective algebra is a general theory of the digital view of
computation. More specifically, in effective algebra we can investigate the
consequences of the fact that
1. an algebra is computable;
2. an algebra is effective in some weaker senses; and
α : Ω_α → A,  Ω_α ⊆ N,
called a numbering, that lists or enumerates, possibly with repetitions,
all the elements of A; (ii) the operations of A are computable in the
enumeration: for each operation F_j : A^{n_j} → A of A there exists a recursive function
f_j : Ω_α^{n_j} → Ω_α
that tracks F_j in the set Ω_α of numbers, in the sense that for all
x_1, …, x_{n_j} ∈ Ω_α,
α(f_j(x_1, …, x_{n_j})) = F_j(α(x_1), …, α(x_{n_j})).
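A concrete instance can be sketched as follows: the integers Z numbered by the naturals via the standard pairing of signs, with a recursive function tracking the successor operation on Z in the codes. (The particular coding and the names alpha and succ_track are assumptions for illustration.)

```python
def alpha(x):
    """The numbering alpha : N -> Z, listing 0, -1, 1, -2, 2, ..."""
    return x // 2 if x % 2 == 0 else -(x // 2 + 1)

def succ_track(x):
    """A recursive function on codes tracking F(z) = z + 1, i.e. satisfying
    alpha(succ_track(x)) == alpha(x) + 1 for all codes x."""
    w = alpha(x) + 1
    # return the canonical code of w (an inverse image under alpha)
    return 2 * w if w >= 0 else 2 * (-w) - 1
```

Here the numbering is even a bijection, so the coded copy of Z is literally N with the tracked operations; in general a numbering may be partial and many-to-one.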
Conditions (i) and (ii) are shared with our While programming model
and, indeed, are necessary for an algebraic theory: recall section 1.1. There
are also a number of features that extend the methods of our While model,
including conditions (iii) and, more dramatically, (iv). Using the properties
of the numbers that represent the data we can perform global searches
through the data sets (by means of an ordering on the code set), and
store data dynamically without limitations on data storage (by means of a
pairing on the code set). Note that condition (vi) is a defining feature of
computable algebras and can be relaxed (as in the case of semicomputable
or effective algebras, for instance).
Note that an algebra A is computable if there exists some computable
numbering a for A. The computability of functions and sets over A may
depend on the numbering a; thus, to be more precise, we should say that
A, its functions and subsets etc. are a-computable. Let us define the computable subsets and functions for such an algebra.
Definition 1.2 (Sets and maps). Let A be an algebra of signature Σ,
computable under the numbering α : Ω_α → A.
(1) A set S ⊆ A^k is α-decidable, α-semidecidable or α-cosemidecidable if
the corresponding set
α^{-1}(S) = { (x_1, …, x_k) ∈ Ω_α^k | (α(x_1), …, α(x_k)) ∈ S }
of numbers is decidable, semidecidable or cosemidecidable, respectively.
Let C(A) be the set of all computable numberings of the algebra A. The
choice of a numbering α ∈ C(A) suggests that the effectiveness of a subset
or function on A may depend on α. To illustrate, let S ⊆ A and consider
the following questions:
Is S decidable for all computable numberings of A; or decidable for some, and undecidable in others; or undecidable for all
computable numberings of A?
Another question concerns the invariance of computable maps.
If A is computable under two numberings α and β, then what
is the relation between the sets Comp_α(A) and Comp_β(A)?
What is
⋂_{α ∈ C(A)} Comp_α(A)?
Consider our abstract model based on While programs. We have noted
that
While(N; 0, n + 1) = Comp(N).
The question then arises for our algebras:
What is the relationship between While(A) and Comp_α(A)
for an arbitrary computable representation α?
We can prove that if A is computable then
While(A) ⊆ ⋂_{α ∈ C(A)} Comp_α(A).    (1.1)
(In fact this inclusion holds for much weaker hypotheses on A.) The converse inclusion does not hold in general. To see why, consider the algebra
(N; 0, n − 1),
over which the successor function is not While computable, because assignments can only reduce the value of the inputs. It follows
that
While(N; 0, n − 1) ⊊ ⋂_{α ∈ C(A)} Comp_α(N; 0, n − 1)    (1.2)
because in any numbering the successor function S(x) = x + 1 can be
computed.
⋂_{α ∈ C(A)} Comp_α(A)?
1.4
Historical notes on computable functions on algebras
The generalisation of the theory of computable functions to abstract algebras has a complicated history. On the one hand the connections between
computation and algebra are intimate and ancient: algebra grew from problems in computation. However, the fact that it is now necessary to explain
how computation theory can be connected or applied to algebra is an aberration, and is the result of interesting intellectual and social mutations in
the past. It is a significant task to understand the history of generalisations of computability theory, with questions for research by historians of
mathematics, logic and computing, as well as sociologists of science.
The story that underlies this work involves the development of algebra; the development of computability theory; interactions between computability theory and algebra; and applications to computing. Some of the
connections between computation theory and algebra have been provided
in other Handbook chapters: for notes on the histories of
effective algebra, see Stoltenberg-Hansen and Tucker [1995];
computable rings and fields, see Stoltenberg-Hansen and Tucker [1999a];
algebraic methods in computer science, see Meinke and Tucker [1992].
In the following notes we discuss the nature of generalisations and point
out the earliest work on abstract computability theory. Section 8 is devoted
to a fairly detailed survey of the literature.
We first list some common-sense reasons for generalising computability
theory. A common view is to say that the purpose of a generalisation of
computability theory is one or more of the following:
(i) to say something new and useful about the original theory;
(ii) to provide new methods of use in computer science and mathematics;
(iii) to illuminate and increase our understanding of the nature of computation.
els; some commonly remembered papers are: Ianov [1960], Peter [1958], Voorhes [1958], Asser [1961], Gorn [1961] and Kaluzhnin [1961]. By the time of the celebrated Böhm and Jacopini [1966] paper on the construction of normal forms for flowcharts, the subject of flowcharts was well established.
In some of these papers the underlying data need not be the natural
numbers, strings or bits. In particular, in Kaluzhnin [1961] flowcharts are
modelled using finite connected directed graphs. These have vertices either
with one exit edge, to which is assigned an operation, for computation, or two exit edges, to which is assigned a discriminator, for tests. The graph
has one vertex with no incoming edge, for input, and one vertex with no
outgoing edge, for output. To interpret a so-called graph scheme, a set of
functions is used for the operations, and a set of properties is used for the
discriminators.
Kaluzhnin's work was used in various studies, such as Elgot's early work,
and in Thiele [1966], a major study of programming, in which flow diagrams
are presented that are not necessarily connected graphs. The semantics of
flow diagrams is defined here formally, in terms of the functions

    E_Δ,σ(n) = object or data after the n-th step in flow diagram Δ, starting at state σ,
    K_Δ,σ(n) = edge in flow diagram Δ traversed after the n-th step, starting at state σ,
using simultaneous recursions. Thiele's work influenced the formal development of operational semantics as found in the Vienna Definition Language:
see Lauer [1967; 1968] and Lucas et al. [1968]. The important point is
that predicate calculus with function symbols and equality is extended by
adding expressions that correspond with flow diagrams to make an algorithmic language involving graphs.
Thus, in the period 1946-66, some of the basic topics of a theory of computation over any set of data had been recognised, including: equivalence of
flowcharts; substitution of flowcharts into other flowcharts; transformations
and normal forms for flow charts; and logics for reasoning about flowcharts.
Flowcharts were not the only abstract model of computation to be developed.
Against the background of early work on the principles of programming
by A. A. Lyapunov and theoretical work by Ianov and others in the former Soviet Union, Ershov [1958] considered computation with any set of operations on any set of data. In Ershov [1960; 1962] the concept of operator algorithms is developed. These are imperative commands made from
expressions over a set of operations; the algorithms allow self-modification.
The model was used in early work on compilation in the former Soviet
Union. See Ershov and Shura-Bura [1980] for information on early programming.
Of particular interest is McCarthy [1963], which reviewed the requirements and content of a general mathematical theory of computation. It
emphasises the idea that classes of functions can be defined on arbitrary
sets of data. Starting with a (finite) collection F of base functions on some
collection of sets, we can define a class C{F} of functions computable in
terms of F. The mechanism used is that of recursion equations with an
informal operational meaning based on term substitution. An abstract
computability theory is an aim in itself, not 'merely' a model of programming structure etc., and McCarthy writes (p. 63):
Our characterisation of C{F} as the set of functions computable
in terms of the base functions in F cannot be independently
verified in general since there is no other concept with which it
can be compared. However it is not hard to show that all partial
recursive functions in the sense of Church and Kleene are in
C{zero, succ}.
This, of course, falls short of a generalised Church-Turing thesis. The
paper also mentions functionals and the construction of new sets of data
from old, including a product, union and function space construction for
two sets, and recursive definition of strings. McCarthy's paper is eloquent,
perceptive and an early milestone in the mathematical development of the
subject.
E. Engeler's innovative work on the subject of abstract computability
begins in Engeler [1967]. This contains a mathematically clear account of
program schemes whose operations and tests are taken from a first-order
language over a single-sorted signature. The programs are lists of labelled
conditional and operational instructions of the form

    k : if φ then go to p else go to q,
    k : ψ then go to p,

where k, p and q are natural numbers acting as labels for instructions, φ is a formula of the language and ψ is an assignment of one of the forms

    x := c,    x := y    or    x := f(y1, ..., ym),

where x, y, ... are variables, and c and f are any constant and operation
of the signature. Interpretations are given by means of a notion of state,
mapping program variables to data in a model. A basic result proved here
is this:
    To each program π one can associate a formula φ that is a countable disjunction of open formulae such that for all models A,

        π terminates on all inputs from A  ⟺  A ⊨ φ.
1.5
that term evaluation is always While computable on it; hence, the model
of While* programs is universal.
In section 5 we turn our attention to sets. We begin with a study of
computable and semicomputable sets. We prove Post's theorem in the
present setting. We also study the ideas of projections of computable and
semicomputable sets. It turns out that the classes of computable and semicomputable sets are not closed under projection. The notion of projection is
very important since it distinguishes clearly between forms of specification
and computation. Furthermore, it focuses our attention on the difference
between local search and global search in computation.
Projections also lead us to consider the relationship between While
programming and certain non-deterministic constructs on data. These include: search procedures; initialisation mechanisms; and random assignments.
Next, with each While program is associated a computation tree. With
this technique, we prove that every semicomputable set is definable by an
effective infinite disjunction of Boolean terms over the signature.
In section 6 we illustrate the core of the theory with a study of its
application to computing sets of real and complex numbers over various
many sorted algebras. We include some pleasing examples from dynamical
systems.
In section 7 we return to the special properties and problems of computation of the reals. More generally, we study computation on topological
algebras. A key consideration is the property that if a function is computable then it is continuous. To guarantee a good selection of applications
we use partial functions, which raises interesting topological issues. This
study of programming over topological algebras contains new material.
We also contrast exact versus approximate computation on the reals.
The following fact was observed in Shepherdson [1976]. Let f be a function on the reals. Then f is computable in the sense of computable analysis if, and only if, there is a function g which is While computable over the algebra (ℝ, B, ℕ; 0, 1, x + y, x·y, −x, ...) such that

    |f(x) − g(n, x)| < 2⁻ⁿ

for all n ∈ ℕ and x ∈ ℝ. We extend and adapt this result to topological
algebras.
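To make the flavour of such approximations concrete, here is a small illustrative sketch (ours, not Shepherdson's): for f = sqrt, a While-style bisection loop using only the ring and order operations yields an approximating g with |sqrt(x) − g(n, x)| ≤ 2⁻ⁿ.

```python
# An illustrative sketch (ours, not Shepherdson's) for f = sqrt: a
# While-style bisection loop over the ordered ring operations computes
# g(n, x) with |sqrt(x) - g(n, x)| <= 2^-n, for x >= 0.
def g(n: int, x: float) -> float:
    lo, hi = 0.0, max(x, 1.0)        # invariant: lo^2 <= x <= hi^2
    while hi - lo > 2.0 ** (-n):
        mid = (lo + hi) / 2          # halving = multiplying by the constant 1/2
        if mid * mid <= x:
            lo = mid
        else:
            hi = mid
    return lo
```

Since sqrt(x) always lies in the interval [lo, hi], the final lo is within the interval's width, at most 2⁻ⁿ, of sqrt(x).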
In section 8 we survey other models of computation and see their relation with While programs. We consider briefly: μ-recursive functions;
register machines; flowcharts; axiomatic methods; set recursion; and equational definability. A generalised Church-Turing thesis is discussed.
There are many subjects that we have omitted from the discussion, for example: the delicate classification of the power of constructs, including
types; computations with streams; program verification; connections with
proof theory; connections with model theory; degree theory; and generalised complexity theory. There will be good work by many authors that
1.6 Prerequisites
First, we assume the reader is familiar with the theory of the recursive
functions on the natural numbers. It is treated in many books such as
Rogers [1967], Mal'cev [1973], Cutland [1980] and Machtey and Young
[1978]. An introduction to the subject is contained in this Handbook (see
Phillips [1992]) and other handbooks (e.g. Enderton [1977]).
Secondly, we assume the reader is familiar with the basics of universal
algebra. Some mathematical text-books are: Burris and Sankappanavar
[1981] and McKenzie et al. [1987]. An introduction to the subject with the
needs of computer science in mind is contained in this Handbook (see Meinke
and Tucker [1992]) and in Wechler [1992]. The application of universal
algebra to the specification of data types is treated in Ehrig and Mahr
[1985], Meseguer and Goguen [1985] and Wirsing [1991]. The theory of
computable and other effective algebras is covered by Stoltenberg-Hansen
and Tucker [1995].
Thirdly, we will need some topology. This is covered in many books,
such as Dugundji [1966] and Kelley [1955] and in a chapter in this Handbook
(see Smyth [1992]).
Finally, we note that the subject connects with other subjects, including
term rewriting (see, for example, Klop [1992]) and domain theory (see, for
example, Stoltenberg-Hansen et al. [1994]).
2.1 Signatures
Definition 2.1 (Many-sorted signatures). A signature Σ (for a many-sorted algebra) is a pair consisting of (1) a finite set Sort(Σ) of sorts, and (2) a finite set Func(Σ) of (primitive or basic) function symbols, each symbol F having a type s1 × ... × sm → s, where m ≥ 0 is the arity of F, and s1, ..., sm ∈ Sort(Σ) are the domain sorts and s ∈ Sort(Σ) is the range sort; in such a case we write

    F : s1 × ... × sm → s.

The case m = 0 corresponds to constant symbols; we then write F : → s or just F : s.
Our signatures do not explicitly include relation symbols; relations will
be interpreted as Boolean-valued functions.
Definition 2.2 (Product types over Σ). A product type over Σ, or Σ-product type, is a symbol of the form s1 × ... × sm (m ≥ 0), where s1, ..., sm are sorts of Σ, called its component sorts. We define ProdType(Σ) to be the set of Σ-product types, with elements u, v, w, ... If u = s1 × ... × sm, we put lgth(u) = m, the length of u. When lgth(u) = 1, we identify u with its component sort. When lgth(u) = 0, u is the empty product type.
For a Σ-product type u and Σ-sort s, let Func(Σ)_{u→s} denote the set of all Σ-function symbols of type u → s.
Definition 2.3 (Σ-algebras). A Σ-algebra A has, for each sort s of Σ, a non-empty set A_s, called the carrier of sort s, and for each Σ-function symbol F : s1 × ... × sm → s, a function F^A : A_s1 × ... × A_sm → A_s.
For a Σ-product type u = s1 × ... × sm, we write

    A^u =df A_s1 × ... × A_sm.

Thus x ∈ A^u if, and only if, x = (x1, ..., xm), where xi ∈ A_si for i = 1, ..., m. So each Σ-function symbol F : u → s has an interpretation
signature
    sorts        s                                      (s ∈ Sort(Σ)),
    functions    F : s1 × ... × sm → s                  (F ∈ Func(Σ))
end

algebra
    carriers     A_s                                    (s ∈ Sort(Σ)),
    functions    F^A : A_s1 × ... × A_sm → A_s          (F ∈ Func(Σ))
end
Examples 2.5. (a) The algebra of naturals N0 = (ℕ; 0, succ) has a signature containing the sort nat and the function symbols 0 : → nat and succ : nat → nat. We can display this signature thus:
346
signature
    sorts        nat
    functions    0 : → nat,
                 S : nat → nat
end

algebra      N0
    carriers     ℕ
    functions    0 : → ℕ,
                 S : ℕ → ℕ
end
from which the signature can be inferred. Below, we will often display the
algebra instead of the signature.
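As an informal aside (the encoding and names below are our own, not the chapter's), the signature and algebra displays transcribe directly into code; a hypothetical sketch for N0 = (ℕ; 0, succ):

```python
# A hypothetical transcription (names and encoding our own) of the
# signature and algebra displays, for N0 = (N; 0, succ).
from dataclasses import dataclass

@dataclass
class Signature:
    sorts: set        # Sort(Sigma)
    functions: dict   # symbol -> (tuple of domain sorts, range sort)

@dataclass
class Algebra:
    sig: Signature
    carriers: dict    # sort -> description of the carrier
    ops: dict         # symbol -> Python function interpreting the symbol

sig_N0 = Signature(sorts={"nat"},
                   functions={"0": ((), "nat"),         # constant: empty domain
                              "S": (("nat",), "nat")})  # successor

N0 = Algebra(sig=sig_N0,
             carriers={"nat": "the natural numbers"},
             ops={"0": lambda: 0, "S": lambda n: n + 1})
```

Here a constant is a function symbol with empty domain, exactly as in Definition 2.1.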
(b) The ring of reals R0 = (ℝ; 0, 1, +, −, ×) has a carrier ℝ of sort real, and can be displayed as follows:

algebra      R0
    carriers     ℝ
    functions    0, 1 : → ℝ,
                 +, × : ℝ² → ℝ,
                 − : ℝ → ℝ
end
(c) The algebra C0 of complex numbers has two sorts, complex and real, and hence two carriers, ℂ and ℝ. It includes the algebra R0, and therefore has all the operations on ℝ listed in (b), as well as operations on ℂ, as follows:

algebra      C0
    import       R0
    carriers     ℂ
    functions    0, 1, i : → ℂ,
                 +, × : ℂ² → ℂ,
                 − : ℂ → ℂ,
                 re, im : ℂ → ℝ,
                 π : ℝ² → ℂ
end
    carriers     G
    functions    1 : → G,
                 * : G² → G,
                 inv : G → G
end
    f : A^u → A^v, where v = s1 × ... × sn,    (2.1)
    fj : A^u → A_sj  (j = 1, ..., n),          (2.2)
    f(a) ≃ (f1(a), ..., fn(a)).                (2.3)

(We will explain the '≃' in (c) below.) Conversely, given n functions fj as in (2.2), all with the same domain type u, and with range types (or sorts) s1, ..., sn respectively, we can form their vectorisation as a function f satisfying (2.1) and (2.3).
    R^c = A^u \ R = {a ∈ A^u | a ∉ R},

also of type u.
(d) (Projections.) To explain this notion, we begin with an example. Suppose R : u where u = s1 × s2 × s3 × s4 × s5. Now let v = s1 × s2 × s3 and w = s4 × s5. Then the projection of R on v (or on A^v), or the A^w-projection of R, is the relation S : v defined by existentially quantifying over A^w:

    S(x1, x2, x3)  ⟺  ∃x4, x5 ∈ A^w : R(x1, ..., x5).
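On a finite carrier the definition can be illustrated directly; a small sketch (the sets are our own example):

```python
# A finite illustration of projection (the sets are our own example).
# R : s1 x s2 x s3 x s4 x s5
R = {(1, 2, 3, 4, 5), (1, 2, 3, 9, 9), (7, 8, 9, 0, 0)}

# S(x1,x2,x3)  <=>  there exist x4,x5 with R(x1,...,x5):
S = {(x1, x2, x3) for (x1, x2, x3, _x4, _x5) in R}
```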
More generally, for u = s1 × ... × sm, let i = (i1, ..., ir) be a list of distinct indices among 1, ..., m, with complementary list ī. Then:
u|i is the restriction of u to i, that is, the product type s_i1 × ... × s_ir; and
proj[u|i](R) is the projection of R on i (or on A^{u|i}), or the A^{u|ī}-projection of R, that is, the relation S : u|i defined by existentially quantifying over A^{u|ī}:

    S(x_i1, ..., x_ir)  ⟺  ∃xj ∈ A_sj (j in ī) : R(x1, ..., xm).
2.2
Definition 2.10 (Closed terms over Σ). We define the class T(Σ) of closed terms over Σ, denoted t, t′, t1, ..., and for each Σ-sort s, the class T(Σ)_s of closed terms of sort s. These are generated inductively by the rule: if F ∈ Func(Σ)_{u→s} and ti ∈ T(Σ)_si for i = 1, ..., m, where u = s1 × ... × sm, then F(t1, ..., tm) ∈ T(Σ)_s.
Note that the implicit base case of this inductive definition is that of m = 0, which yields: for all constants c : → s, c() ∈ T(Σ)_s. In this case we write c instead of c(). Hence if Σ contains no constants, T(Σ) is empty.
Definition 2.11 (Valuation of closed terms). For A ∈ Alg(Σ) and t ∈ T(Σ)_s, we define the valuation t^A ∈ A_s of t in A by structural induction on t:

    F(t1, ..., tm)^A = F^A((t1)^A, ..., (tm)^A).
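Definition 2.11 is a structural recursion, and can be sketched directly in code (the tuple encoding of terms and the example algebra are our own conventions, not the chapter's):

```python
# Definition 2.11 as a structural recursion.  Closed terms are encoded
# (our own convention) as nested tuples (F, t1, ..., tm); an algebra
# interprets each function symbol by a Python callable.
def value(t, A):
    """t^A: interpret F(t1,...,tm) as F^A((t1)^A, ..., (tm)^A)."""
    F, *args = t
    return A[F](*(value(ti, A) for ti in args))

# The algebra (N; 0, succ) again:
A = {"0": lambda: 0, "S": lambda n: n + 1}
t = ("S", ("S", ("0",)))          # the closed term S(S(0))
```

The base case m = 0 is handled automatically: a constant symbol is applied to the empty argument list.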
s. (See Meinke and Tucker [1992, 3.2.6 ff] for definitions.) Also for a product type u = s1 × ... × sm,
2.3
Given a signature Σ, the notions of Σ-homomorphism as well as Σ-epimorphism (surjective), Σ-monomorphism (injective), Σ-isomorphism (bijective) and Σ-automorphism are defined as usual (see [Meinke and Tucker, 1992, 3.4]). We need a more sophisticated notion, that of relative homomorphism.
Definition 2.19 (Relative homomorphism and isomorphism). Let Σ and Σ′ be signatures with Σ ⊆ Σ′. Let A and B be two standard Σ′-algebras such that
(a) A Σ′-homomorphism relative to Σ from A to B, or a Σ′/Σ-homomorphism φ : A → B, is a Sort(Σ′)-indexed family of mappings

    φ = {φ_s : A_s → B_s | s ∈ Sort(Σ′)}

which is a Σ′-homomorphism from A to B, such that for all s ∈ Sort(Σ), φ_s is the identity on A_s.
(b) A Σ′/Σ-isomorphism from A to B is a Σ′/Σ-homomorphism which is also a Σ′-isomorphism from A to B.
(c) A and B are Σ′/Σ-isomorphic, written A ≅_{Σ′/Σ} B, if there is a Σ′/Σ-isomorphism from A to B.
Definition 2.20 (Abstract data types). An abstract data type of signature Σ (Σ-adt) is defined to be a class K of Σ-algebras closed under Σ-isomorphism. Examples of Σ-adt's are:
(a) the class Mod(Σ, T) of all models of a first-order Σ-theory T;
(b) the isomorphism class of a particular Σ-algebra.
2.4
    sorts        bool
    functions    true, false : → bool,
                 and, or : bool² → bool,
                 not : bool → bool
end
    if_s(b, x, y) = { x    if b = tt,
                      y    if b = ff,
algebra      N
    import       N0, B
    functions    if_nat : B × ℕ² → ℕ,
                 eq_nat, less_nat : ℕ² → B
end
(d) We will also be interested (in section 5) in the expansion R_< of R, formed by adjoining the order relation on the reals less_real : ℝ² → B, thus:

algebra      R_<
    import       R
    functions    less_real : ℝ² → B
end
2.5
Definition 2.25.
(a) A standard signature Σ is called N-standard if it includes (as well as bool) the numerical sort nat, as well as function symbols for the standard operations of zero, successor and order on the naturals:

    0 : → nat,
    S : nat → nat,
    less_nat : nat² → bool.
algebra      A^N
    import       A
    carriers     ℕ
    functions    0 : → ℕ,
                 S : ℕ → ℕ,
                 if_nat : B × ℕ² → ℕ,
                 eq_nat, less_nat : ℕ² → B
end
(c) The N-standardisation K^N of a class K of Σ-algebras is (the closure with respect to Σ^N/Σ-isomorphism of) the class {A^N | A ∈ K}.
Examples 2.27.
(a) The simplest N-standard algebra is the algebra N of Example 2.23(b).
(b) We can N-standardise the real and complex rings R and C, and the group G of Examples 2.23, to form the algebras R^N, C^N and G^N, respectively.
Remark 2.28.
(a) For any standard A, both A and N are Σ-reducts of the N-standardisation A^N (cf. Remark 2.22(d)).
(b) If A and B are two N-standard Σ-algebras, then any Σ-homomorphism from A to B is actually a Σ/Σ(N)-homomorphism, i.e., it fixes the reduct N (cf. Remark 2.22(e)).
(c) A Σ-homomorphism (or Σ-isomorphism) between two standard Σ-algebras A and B can be extended to a Σ^N-homomorphism (or Σ^N-isomorphism) between A^N and B^N. (Exercise.)
(d) If A is already N-standard, then A^N will contain a second copy of ℕ, with (only) the standard operations on it. Further, A^N can be effectively coded within A, using a standard coding of ℕ² in ℕ. (Check.)
(e) In particular, (A^N)^N can be effectively coded within A^N.
We will occasionally make use of a notion stricter than N-standardness.
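Remark 2.28(d) appeals to a standard coding of ℕ² in ℕ; the Cantor pairing function is one such coding, sketched here (the code is an illustration of our own):

```python
# The Cantor pairing function: one standard bijective coding of N^2 in
# N (an illustration of our own, not the chapter's).
def pair(x: int, y: int) -> int:
    return (x + y) * (x + y + 1) // 2 + y

def unpair(z: int) -> tuple:
    w = 0
    while (w + 1) * (w + 2) // 2 <= z:   # largest w with w(w+1)/2 <= z
        w += 1
    y = z - w * (w + 1) // 2
    return (w - y, y)
```

Both directions use only zero, successor, and order, so the coding is available in any N-standard algebra.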
2.6
In this subsection, we need not assume that Σ and A are standard. For each sort s of Σ, let u_s be a new object, representing an 'unspecified value', and let A^u_s = A_s ∪ {u_s}. For each function symbol F of Σ of type s1 × ... × sm → s, extend its interpretation F^A on A to a function

    F^{A,u} : A^u_s1 × ... × A^u_sm → A^u_s

by strictness, i.e., the value is defined as u whenever any argument is u.
Then the algebra A^u, with signature Σ^u, contains:
(i) the original carriers A_s of sort s, and functions F^A on them;
(ii) the new carriers A^u_s of sort s^u, and functions F^{A,u} on them;
(iii) a constant unspec_s : → s^u to denote u_s as a distinguished element of A^u_s; and
(iv) an embedding function i_s : s → s^u to denote the embedding of A_s into A^u_s, and the inverse function j_s : s^u → s, mapping u_s to the default term δ_s for each sort s.
Further, if A is a standard algebra, we assume A^u also includes:
(v) a Boolean-valued function Unspec_s : s^u → bool, the characteristic function of u_s;
(vi) the discriminator on A^u_s for each sort s; and
(vii) the equality operator on A^u_s for each equality sort s.
Thus, if A is standard, A^u is constructed from A as follows:
algebra      A^u
    import       A
    carriers     A^u_s                                        (s ∈ S)
    functions    u_s : → A^u_s                                (s ∈ S),
                 F^{A,u} : A^u_s1 × ... × A^u_sm → A^u_s      (F : s1 × ... × sm → s in Σ),
                 i_s : A_s → A^u_s                            (s ∈ S),
                 j_s : A^u_s → A_s                            (s ∈ S),
                 Unspec_s : A^u_s → B                         (s ∈ S),
                 if_{s^u} : B × (A^u_s)² → A^u_s              (s ∈ S),
                 eq_{s^u} : (A^u_s)² → B                      (s ∈ S^e)
end
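The strictness clause can be sketched as a generic wrapper (the sentinel object below is our own stand-in for the unspecified value u_s):

```python
# The strictness clause as a generic wrapper: the sentinel U is our own
# stand-in for the unspecified value u_s.
U = object()

def strict(f):
    """Extend f to the enlarged carriers: the value is U whenever any
    argument is U, and f's own value otherwise."""
    def f_u(*args):
        if any(a is U for a in args):
            return U
        return f(*args)
    return f_u

add_u = strict(lambda x, y: x + y)    # strict extension of addition
```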
2.7
is the array (β, l) such that for all k,

    β(k) = { α(k)    if k < l, k ≠ n,
             x       if k < l, k = n,
             u_s     otherwise;
(v) the Newlength_s operator of type s* × nat → s*, where Newlength_s^A((α, l), m) is the array (β, m) such that for all k,

    β(k) = { α(k)    if k < m,
             u_s     if k ≥ m;
algebra      A*
    import       A^{u,N}
    carriers     A*_s                                     (s ∈ S)
    functions    Null_s : → A*_s                          (s ∈ S),
                 Ap_s : A*_s × ℕ → A^u_s                  (s ∈ S),
                 Update_s : A*_s × ℕ × A^u_s → A*_s       (s ∈ S),
                 Lgth_s : A*_s → ℕ                        (s ∈ S),
                 Newlength_s : A*_s × ℕ → A*_s            (s ∈ S),
                 if_{s*} : B × (A*_s)² → A*_s             (s ∈ S),
                 eq_{s*} : (A*_s)² → B                    (s ∈ S^e)
end
and 2.30(f).)
(f) The reason for introducing starred sorts is the lack of effective coding
of finite sequences within abstract algebras in general.
(g) Starred sorts have significance in programming languages, since
starred variables can be used to model arrays, and (hence) finite but
unbounded memory.
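As an illustration of these operations (encoding an array as a pair of a finite map and a length is our own choice, not the chapter's):

```python
# A sketch of the starred-sort operations; encoding an array as a pair
# (a, l) of a finite map and a length is our own choice.
U = object()                                # the unspecified value u_s

def Null():                                 # the empty array
    return ({}, 0)

def Lgth(arr):                              # Lgth_s : A*_s -> N
    return arr[1]

def Ap(arr, k):                             # Ap_s : A*_s x N -> A^u_s
    a, l = arr
    return a.get(k, U) if k < l else U      # out of range: unspecified

def Update(arr, n, x):                      # change position n, if in range
    a, l = arr
    b = dict(a)
    if n < l:
        b[n] = x
    return (b, l)

def Newlength(arr, m):                      # truncate, or pad with U
    a, l = arr
    return ({k: v for k, v in a.items() if k < m}, m)

arr = Update(Newlength(Null(), 3), 1, 42)   # length 3, with 42 at index 1
```

Uninitialised positions read as the unspecified value, matching the case distinctions for Update and Newlength above.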
2.8
algebra      Ā
    import       A^N
    carriers     [ℕ → A_s]                                    (s ∈ S)
    functions    eval_s : [ℕ → A_s] × ℕ → A_s                 (s ∈ S),
                 if_s̄ : B × ([ℕ → A_s])² → [ℕ → A_s]          (s ∈ S)
end

where now S is the set of stream sorts.
Also, K̄ is (the closure with respect to Σ̄/Σ-isomorphism of) the class {Ā | A ∈ K}.
Remark 2.32.
In this section, we begin to study the computation of functions and relations on algebras by means of imperative programming models. We start by defining a simple programming language While = While(Σ), whose programs are constructed from concurrent assignments, sequential composition, the conditional and the 'while' construct, and may be interpreted on any many-sorted Σ-algebra; this takes up sections 3.1-3.6. We will define in detail the abstract syntax and semantics of this language, and methods by which its programs can compute functions and relations. In sections 3.7 and 3.8 we prove some algebraic properties of computation on algebras, with regard to homomorphisms and locality.
In sections 3.9-3.13, we will add to the basic language a number of
new constructs, namely 'for', procedure calls and arrays, and extend our
model of computation accordingly. In section 3.14 we study the concept
of a sequence of 'snapshots' of a computation, which will be useful later
in investigating the solvability of the halting problem in certain (locally
finite) algebras.
We conclude (section 3.15) with a useful syntactic conservativity theorem for Σ*-terms over Σ-terms.
We illustrate the theory with several examples of computations on the
algebras of real and complex numbers.
Throughout section 3, we assume (following Convention 1.4.3) that Σ is a standard signature, and A is a standard Σ-algebra.
3.1 Syntax of While(Σ)
We begin with the syntax of the language While(Σ). First, for each Σ-sort s, there are (program) variables a^s, b^s, ..., x^s, y^s, ... of sort s.
We define four syntactic classes: variables, terms, statements and procedures.
(a) Var = Var(Σ) is the class of Σ-variables, and Var_s is the class of variables of sort s.
For u = s1 × ... × sm, we write x : u to mean that x is a u-tuple of distinct variables, i.e., a tuple of distinct variables of sorts s1, ..., sm, respectively.
Further, we write VarTup = VarTup(Σ) for the class of all tuples of distinct Σ-variables, and VarTup_u for the class of all u-tuples of distinct Σ-variables.
(b) Term = Term(Σ) is the class of Σ-terms t, ..., and for each Σ-sort s, Term_s is the class of terms of sort s. These are generated by the following rules.
(i) A variable x of sort s is in Term_s.
(ii) If F ∈ Func(Σ)_{u→s} and ti ∈ Term_si for i = 1, ..., m, where u = s1 × ... × sm, then F(t1, ..., tm) ∈ Term_s.
Note again that Σ-constants are construed as 0-ary functions, and so enter the definition of Term(Σ) via clause (ii), with m = 0.
The class Term(Σ) can also be written (in more customary notation) as T(Σ, Var), i.e., the set of terms over Σ using the set Var of variables (clause (i) in the definition). Analogously, the set T(Σ) of closed terms over Σ (2.10) can be written as T(Σ, ∅).
We write type(t) = s or t : s to indicate that t ∈ Term_s.
Further, we write TermTup = TermTup(Σ) for the class of all tuples of Σ-terms, and, for u = s1 × ... × sm, TermTup_u for the class of u-tuples of terms, i.e.,

    TermTup_u =df Term_s1 × ... × Term_sm.

We write type(t) = u or t : u to indicate that t is a u-tuple of terms, i.e., a tuple of terms of sorts s1, ..., sm.
For the sort bool, we have the class of Boolean terms or Booleans

    Bool(Σ) =df Term_bool(Σ),

denoted either t^bool, ... (as above) or b, ...
This class is given (according to the above definition of Term_s) by:

    b  ::=
These have
3.2 States
For each sort s, a state of sort s on A is a function

    σ_s : Var_s → A_s.    (3.1)

Let State(A) be the set of states on A, with elements σ, τ, .... Note that State(A) is the product of the state spaces State_s(A) for all s ∈ Sort(Σ), where each State_s(A) is the set of all functions as in (3.1).
We use the following notation. For x ∈ Var_s, we often write σ(x) for σ_s(x). Also, for a tuple x = (x1, ..., xm), we write σ[x] for (σ(x1), ..., σ(xm)).
Now we define the variant of a state. Let σ be a state over A, x = (x1, ..., xn) : u and a = (a1, ..., an) ∈ A^u (for n ≥ 1). We define σ{x/a} to be the state over A formed from σ by replacing its value at xi by ai for i = 1, ..., n. That is, for all variables y:

    σ{x/a}(y) = { ai      if y ≡ xi (for some i = 1, ..., n),
                  σ(y)    otherwise.
We can now give the semantics of each of the three syntactic classes:
Term, Stmt and Proc, relative to any A StdAlg(T,). For an expression E in each of these classes, we will define a semantic function |]A.
These three semantic functions are defined in sections 3.3, 3.4-3.5 and 3.6,
respectively.
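States and variants can be sketched concretely (the dict representation is our own; note that σ{x/a} is a new state, leaving σ itself unchanged):

```python
# States as finite maps from variables to data, with the variant
# operation sigma{x/a} (the dict representation is our own); the
# variant is a new state, and sigma itself is unchanged.
def variant(sigma, xs, vals):
    """The state sigma{x/a}: agrees with sigma except at the xs."""
    tau = dict(sigma)
    tau.update(zip(xs, vals))
    return tau

sigma = {"x": 3, "y": 4}
tau = variant(sigma, ("x", "z"), (10, 7))   # sigma{(x,z)/(10,7)}
```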
3.3 Semantics of terms
    [x]^A σ = σ(x),
    [F(t1, ..., tm)]^A σ = F^A([t1]^A σ, ..., [tm]^A σ),

and, for a tuple of terms t = (t1, ..., tm),

    [t]^A σ =df ([t1]^A σ, ..., [tm]^A σ).

Definition 3.3. For any M ⊆ Vars, and states σ1 and σ2, σ1 ≈ σ2 (rel M) means σ1 ↾ M = σ2 ↾ M, i.e., ∀x ∈ M (σ1(x) = σ2(x)).
Lemma 3.4 (Functionality lemma for terms). For any term t and states σ1 and σ2, if σ1 ≈ σ2 (rel var(t)), then [t]^A σ1 = [t]^A σ2.
Proof. By structural induction on t.
3.4
In this subsection we will describe a general method for defining the meaning of a statement S, in a wide class of imperative programming languages, as a partial state transformation, i.e., a partial function

    [S]^A : State(A) ⇀ State(A).

We define this via a computation step function

    Comp^A : Stmt × State(A) × ℕ → State(A) ∪ {*}
    First : Stmt → AtSt,
    Rest^A : Stmt × State(A) → Stmt,

by

    Comp^A(S, σ, 0) = σ,
    Comp^A(S, σ, 1) = ⟨First(S)⟩^A σ,
    Comp^A(S, σ, n + 1) = { *                                          if n > 0 and S is atomic,
                            Comp^A(Rest^A(S, σ), Comp^A(S, σ, 1), n)   otherwise.    (3.2)
    CompLength^A(S, σ) = { the least n such that Comp^A(S, σ, n + 1) = *    if such an n exists,
                           ∞                                                 otherwise.

Writing l = CompLength^A(S, σ), and noting that 0 < l ≤ ∞, we define

    [S]^A σ ≃ { Comp^A(S, σ, l)    if l < ∞,
                ↑                   otherwise.
3.5
We now apply the above theory to the language While(Σ). Here there are two atomic statements: skip and concurrent assignment. We define ⟨S⟩^A for these:

    ⟨skip⟩^A σ = σ,
    ⟨x := t⟩^A σ = σ{x/[t]^A σ}.
The definitions of First and Rest^A proceed by cases on S.
1. For S atomic:

    First(S) = S,
    Rest^A(S, σ) = skip.

2. For S ≡ S1; S2:

    First(S) = First(S1),
    Rest^A(S, σ) = { S2                     if S1 is atomic,
                     Rest^A(S1, σ); S2      otherwise.

3. For S ≡ if b then S1 else S2 fi:

    First(S) = skip,
    Rest^A(S, σ) = { S1    if [b]^A σ = tt,
                     S2    if [b]^A σ = ff.

4. For S ≡ while b do S0 od:

    First(S) = skip,
    Rest^A(S, σ) = { S0; S    if [b]^A σ = tt,
                     skip     if [b]^A σ = ff.
This completes the definition of First and Rest^A. Note (in cases 3 and 4) that the Boolean test in an 'if' or 'while' statement S is assumed to take up one time cycle; this is modelled by taking First(S) = skip.
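The resulting i/o behaviour can be sketched as a direct structural interpreter (the tuple encoding of statements, and the representation of terms as functions from states to values, are our own; like [S]^A itself, the sketch may fail to terminate):

```python
# A sketch of the i/o semantics of While as a direct structural
# interpreter; the tuple encoding of statements is our own, and terms
# are represented as Python functions on states.
def run(S, sigma):
    tag = S[0]
    if tag == "skip":
        return sigma
    if tag == "assign":                      # concurrent assignment x := t
        _, xs, ts = S
        vals = [t(sigma) for t in ts]        # all terms evaluated in sigma
        tau = dict(sigma)
        tau.update(zip(xs, vals))
        return tau
    if tag == "seq":                         # S1; S2
        return run(S[2], run(S[1], sigma))
    if tag == "if":                          # if b then S1 else S2 fi
        _, b, S1, S2 = S
        return run(S1 if b(sigma) else S2, sigma)
    if tag == "while":                       # while b do S0 od
        _, b, S0 = S
        while b(sigma):
            sigma = run(S0, sigma)
        return sigma
    raise ValueError("unknown statement")

# Example: a While program computing n! in the variable f.
prog = ("seq",
        ("assign", ("f",), (lambda s: 1,)),
        ("while", lambda s: s["n"] > 0,
         ("seq",
          ("assign", ("f",), (lambda s: s["f"] * s["n"],)),
          ("assign", ("n",), (lambda s: s["n"] - 1,)))))
```

Each branch mirrors one clause of the semantics; the concurrent assignment evaluates all its terms before updating any variable.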
The following shows that the i/o semantics, derived from our algebraic
operational semantics, satisfies the usual desirable properties.
Theorem 3.6.
(a) For S atomic, [S]^A = ⟨S⟩^A, i.e.,

    [skip]^A σ = σ,
    [x := t]^A σ = σ{x/[t]^A σ}.

(b) [S1; S2]^A σ ≃ [S2]^A([S1]^A σ).
(c) [if b then S1 else S2 fi]^A σ ≃ { [S1]^A σ    if [b]^A σ = tt,
                                      [S2]^A σ    if [b]^A σ = ff.
(d) [while b do S0 od]^A σ ≃ { [while b do S0 od]^A([S0]^A σ)    if [b]^A σ = tt,
                               σ                                  if [b]^A σ = ff.
Proof. Exercise. Hint: For part (b), prove the following lemma. Formulate and prove analogous lemmas for parts (a), (c) and (d). ■
Lemma 3.7. Comp^A(S1; S2, σ, n) = { Comp^A(S1, σ, n)          if n ≤ n0,
                                    Comp^A(S2, σ′, n − n0)    if n > n0,
where n0 = CompLength^A(S1, σ) and σ′ = [S1]^A σ.
Remark 3.8.
(a) The four suitably formulated lemmas needed to prove parts (a)-(d) of Theorem 3.6 (of which Lemma 3.7 is an example for part (b)) provide an alternative definition of Comp^A(S, σ, n), which does not make use of First or Rest^A. This definition is by structural induction on S, with a secondary induction on n.
(b) The meaning function [S]^A (i.e., our i/o semantics) was derived from our operational semantics, i.e., the Comp^A function. We could also give a denotational i/o semantics for While statements. Theorem 3.6 would then provide (one direction of) a proof of the equivalence of the two semantics (as in de Bakker [1980]).
(c) The semantics given here is simpler than that given in Tucker and Zucker [1988], where the states have an 'error value' almost everywhere (for uninitialised variables), and there is an 'error state' corresponding to an aborted computation. While such an 'error semantics' is superior (we feel) to the one given here, the semantics given here is simpler, and adequate for our purposes.
For the semantics of procedures, we need the following. Let M ⊆ Vars, and σ, σ′ ∈ State(A).
Lemma 3.9. Suppose var(S) ⊆ M. If σ1 ≈ σ2 (rel M), then for all n ≥ 0,

    Comp^A(S, σ1, n) ≈ Comp^A(S, σ2, n) (rel M).

Proof. By induction on n. Use the functionality lemma (3.4) for terms.
Lemma 3.10 (Functionality lemma for statements). Suppose var(S) ⊆ M. If σ1 ≈ σ2 (rel M), then either
(i) [S]^A σ1 ↓ σ1′ and [S]^A σ2 ↓ σ2′ (say), where σ1′ ≈ σ2′ (rel M), or
(ii) [S]^A σ1 ↑ and [S]^A σ2 ↑.
Proof. From Lemma 3.9.
3.6 Semantics of procedures
Now if P ≡ proc in a out b aux c begin S end is a procedure of type u → v, its meaning [P]^A : A^u ⇀ A^v is defined as follows. For a ∈ A^u, let σ be any state on A such that σ[a] = a. Then

    [P]^A(a) ≃ { σ′[b]    if [S]^A σ ↓ σ′ (say),
                 ↑         if [S]^A σ ↑.
For [P]^A to be well defined, we need the fact that the procedure P is functional, as follows.
Lemma 3.11 (Functionality lemma for procedures). Suppose

    P = proc in a out b aux c begin S end.

If σ1 ≈ σ2 (rel a), then either
(i) [S]^A σ1 ↓ σ1′ and [S]^A σ2 ↓ σ2′ (say), where σ1′ ≈ σ2′ (rel b), or
(ii) [S]^A σ1 ↑ and [S]^A σ2 ↑.
Proof. Suppose σ1 ≈ σ2 (rel a). We can put S ≡ S_init; S′, where S_init consists of an initialisation of b and c to closed terms (see section 3.1). Then, putting

    [S_init]^A σ1 = σ1″    and    [S_init]^A σ2 = σ2″,
(b) In the N-standardised group G^N (Example 2.27(b)), the partial function ord : G ⇀ ℕ, defined by

    ord(g) ≃ { least n ≥ 1 such that g^n = 1,    if such an n exists,
               ↑                                  otherwise,

is computed by a While procedure of the form

proc in g out n
     aux prod                {temporary product}
begin
     prod := g;
     n := 1;
     while not(prod = 1)
     do  prod := prod * g;
         n := succ(n)
     od
end

We emphasise that this order function is defined uniformly over all N-standardised groups (of the given signature Σ^N).
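Transcribed for one concrete N-standardised group, say the group of units modulo m (this choice of group is our own; the While text above is uniform over all groups of the signature):

```python
# The order procedure transcribed for a concrete N-standardised group:
# the group of units modulo m (this choice of group is our own).
def ord_mod(g: int, m: int) -> int:
    """Least n >= 1 with g^n = 1 in the group of units mod m;
    diverges, like ord, if no such n exists."""
    prod, n = g % m, 1            # prod := g; n := 1
    while prod != 1:              # while not(prod = 1)
        prod = (prod * g) % m     # do   prod := prod * g;
        n = n + 1                 #      n := succ(n)  od
    return n
```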
The following proposition will be useful.
Proposition 3.15 (Closure of While computability under composition). The class of While computable functions on A is closed under composition. In other words, given (partial) functions f : A^u ⇀ A^v and g : A^v ⇀ A^w (for any Σ-product types u, v, w), if f and g are While computable on A, then so is the composed function g ∘ f : A^u ⇀ A^w.
Proof. Exercise. (Construct the appropriate While procedure for the composed function.) ■
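The proof idea, run the procedure for f and feed its output tuple to the procedure for g, propagating divergence, can be sketched as follows (modelling divergence by the value None is an encoding of our own):

```python
# Composition of tuple-valued partial functions, propagating
# divergence (modelled, by our own encoding, as the value None).
def compose(g, f):
    """g o f: feed f's output tuple to g; diverge if either stage does."""
    def h(*xs):
        ys = f(*xs)
        if ys is None:            # f diverges, so g o f diverges
            return None
        return g(*ys)
    return h
```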
Remark 3.16. Similarly, we have closure under composition for the related notions of computability still to be considered in this section, namely While^N, While*, For, For^N and For* computability, and the relativised versions of these. The results for For computability (etc.) can be derived from its equivalence with PR computability (etc.) (cf. section 8).
3.7
For S ∈ AtSt,

    φ(⟨S⟩^A σ) = ⟨S⟩^B φ(σ).

Proof. The case where S = skip is trivial. The case where S is an assignment follows from Theorem 3.19. ■
Corollary 3.21 (Homomorphism invariance for the Comp1 predicate).

    φ(Comp1^A(S, σ)) = Comp1^B(S, φ(σ)).

Theorem 3.22 (Homomorphism invariance for the Comp predicate).

    φ(Comp^A(S, σ, n)) = Comp^B(S, φ(σ), n).
    φ([P]^A(a)) ≃ [P]^B(φ(a)).
3.8 Locality of computation
    ⟨σ(x)⟩_A ⊆ ⟨σ[x]⟩_A. ■
Proof. Suppose

    P = proc in a out b aux c begin S_init; S′ end

and [S_init]^A σ = σ″, [S′]^A σ″ ↓ σ′. Then

    ⟨a⟩_A = ⟨σ[x]⟩_A = ⟨σ″[x]⟩_A,    (3.3)

since S_init consists (only) of the initialisation of b and c to the closed terms, the values of which lie in every Σ-subalgebra of A. Also, by the syntax of procedures (section 3.1(c)), var(S_init; S′) ⊆ x. Hence by Theorem 3.29, applied to S′ and σ″,

    P^A(a) =df σ′[b] ⊆ σ′[x] ⊆ ⟨σ″[x]⟩_A.    (3.4)
3.9
x := P(t),
(3.6)
(3.7)
(a) According to our syntax, in the procedure call (3.5) above, 'P' is not
just a name for a procedure but the procedure itself, i.e., the complete
text (3.6)! In practice, it is of course much more convenient and
customary to 'declare' the procedure before its call, introducing
an identifier for it, and then calling the procedure by means of this
identifier.
In any case, our syntax prevents recursive procedure calls. The situation with recursive procedures would be quite different from that described above: they cannot be eliminated so simply (de Bakker [1980]).
(b) Another way of incorporating procedure calls into statements is by
expanding the definition of terms, as was done in Tucker and Zucker
[1994]. The problem with that approach here is that it would complicate the semantics by leading to partially defined terms. In Tucker
and Zucker [1994] this problem does not occur, since the procedures,
being in the For language rather than While, produce total functions.
3.10
We define the programming language While(g), which extends the language
While by including a special function symbol g of type u → v. We can
think of g as an 'oracle' for g^A. The atomic statements of While(g)
include the oracle call
    x := g(t)
where t : u and x : v. The semantics of this is given by
    [x := g(t)]^A σ ≃ σ{x/g^A([t]^A σ)},
which is defined if, and only if, g^A([t]^A σ) is defined.
Similarly, for a tuple of (families of) functions g1, ..., gn, we can define
the programming language While(g1, ..., gn) with oracles g1, ..., gn for
g1^A, ..., gn^A, or (by abuse of notation) the programming language
While(g1^A, ..., gn^A).
In this way we can define the notion of While(g1, ..., gn) computability,
or While computability relative to g1, ..., gn, or While computability
in g1, ..., gn, of a function on A.
Similarly, we can define the notion of relative While semicomputability
of a relation on A.
We can also define the notion of uniform relative While computability
(or semicomputability) over a class K.
Lemma 3.32 (Transitivity of relative computability). If f is While
computable in g1, ..., gm, h1, ..., hn, and g1, ..., gm are While
computable in h1, ..., hn, then f is While computable in h1, ..., hn.
Proof. Suppose that g_i is computable by a While(h1, ..., hn) procedure
P_i, for i = 1, ..., m. Now, given a While(g1, ..., gm, h1, ..., hn)
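The substitution argument behind this proof can be sketched in Python, modelling oracles as function-valued arguments (all function names here are illustrative, not from the text):

```python
# Illustrative sketch: "While computable in g1, g2" modelled as a Python
# function that receives its oracles as arguments. Transitivity then amounts
# to plugging the relative programs for g1, g2 into the program for f.

def f_rel(g1, g2, x):
    # f, computable in the oracles g1 and g2 (hypothetical example)
    return g1(x) + g2(x)

def g1_rel(h, x):
    # g1, computable in the oracle h
    return h(x) * 2

def g2_rel(h, x):
    # g2, computable in the oracle h
    return h(x) + 1

def h(x):
    # the base oracle
    return x

def f_in_h(x):
    # f computable in h alone: substitute the relative programs for the oracles
    return f_rel(lambda y: g1_rel(h, y), lambda y: g2_rel(h, y), x)
```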
3.11
For(Σ) computability
    for z := 1 to t do S0 od,        (3.8)
where t : nat. Define (S0)^0 = skip and (S0)^{k+1} = S0; (S0)^k. Then
    [for z := 1 to t do S0 od]^A σ = [(S0)^k]^A σ,
where k = [t]^A σ.
Note that t is evaluated (to k) once, upon initial entry into the loop,
which is then executed exactly k times (even if the value of t changes in
the course of the execution). Thus [S]^A is always total, and functions
computable by For procedures are always total.
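As a quick illustration of this semantics (a Python sketch, not the chapter's formal For language): the bound is read once on entry, so later changes to the quantities it mentions do not affect the iteration count.

```python
# Sketch of the For-loop semantics described above: the bound term t is
# evaluated exactly once, on initial entry into the loop.

def run_for(t, body, state):
    k = t(state)                 # bound evaluated once, on entry
    for _ in range(k):
        state = body(state)      # body may change the value t would now give
    return state

state = {"x": 3}
final = run_for(lambda s: s["x"], lambda s: {"x": s["x"] + 1}, state)
# the loop runs exactly 3 times even though x changes during execution
```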
We define For(A) to be the class of functions For computable on A.
As in section 3.10, we can define the notion of relative For(Σ)
computability, and prove a transitivity lemma for it, analogous to Lemma
3.32.
Example 3.33. The functions For computable on N of type nat^k → nat
are precisely the primitive recursive functions over ℕ.
This follows from the equivalence of primitive recursiveness and For
computability on the naturals (proved in Meyer and Ritchie [1967]; see
Davis and Weyuker [1983], for example), or from section 8. Hence
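The cited equivalence can be illustrated by computing primitive recursive functions with bounded loops only (a standard sketch; `add` and `mul` are just illustrative instances):

```python
# Illustration of the Meyer-Ritchie direction: primitive recursion can be
# realised with bounded for loops alone (no while loops).

def add(m, n):
    r = m
    for _ in range(n):     # n applications of successor
        r += 1
    return r

def mul(m, n):
    r = 0
    for _ in range(n):     # n applications of add
        r = add(r, m)
    return r
```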
3.12
Consider now the While and For programming languages over Σ^N.
Definition 3.35.
(a) A WhileN(Σ) procedure is a While(Σ^N) procedure in which the input
and output variables have sorts in Σ. (However, the auxiliary variables
may have sort nat.)
(b) ProcN(Σ) is the class of WhileN(Σ) procedures.
Definition 3.36 (WhileN computable functions).
(a) A function f on A is computable on A by a WhileN procedure P if
f = P^A. It is WhileN computable on A if it is computable on A by
some WhileN procedure.
(b) A family f = (f_A | A ∈ K) of functions is WhileN computable
uniformly over K if there is a WhileN procedure P such that for all
A ∈ K, f_A = P^A.
(c) WhileN(A) is the class of functions WhileN computable on A.
The class of ForN(Σ) procedures, and ForN(Σ) computability, are
defined analogously.
Remark 3.37.
(a) If A is N-standard (so that For computability is defined on A), then
A^N has two copies of N, which we can call N and N', of sort nat
and nat', respectively (each with 0, S and < operations). To avoid
technical problems, we assume then that in the for command ((3.8) in
section 3.11), the term t can have sort nat or nat'. This assumption
helps us prove certain desirable results, for example:
(i) There are For(A^N) computable bijections, in both directions,
between the two copies of N.
3.13
(b) A family f = (f_A | A ∈ K) of functions is While* computable
uniformly over K if there is a While* procedure P such that for all
A ∈ K, f_A = P^A.
(c) While*(A) is the class of functions While* computable on A.
The class of For*(Σ) procedures, and For*(Σ) computability, are defined
analogously.
Remark 3.43.
(a) While* computability will be the basis for a generalised Church-Turing
thesis, as we will see in section 8.8.
(b) For*(Σ) computability implies While*(Σ) computability (cf.
Proposition 3.34).
(c) Relativised versions of While* and For* computability can be defined
as with While computability (section 3.10), and corresponding
transitivity lemmas (cf. Lemma 3.32) proved. Also, relative For*
computability implies relative While* computability.
(d) In N, WhileN and While* computability are equivalent to While
computability, which in turn is equivalent to partial recursiveness
over ℕ (Example 3.14(a)). Similarly, in N, For, ForN and For*
computability are all equivalent to primitive recursiveness (Example
3.33).
Theorem 3.44 (Locality of computation for While* procedures).
For a While* procedure P : u → v and a ∈ A^u such that P^A(a) ↓,
    P^A(a) ∈ ⟨a⟩_A.
Proof. This follows from the corresponding Theorem 3.30 for While
computability, applied to A*, together with Σ*/Σ conservativity of
subalgebra generation (to be proved below, in Corollary 3.65).  ∎
The following observation will be needed later.
Proposition 3.45. On A*, While* (or For*) computability coincides
with While (or For) computability.
This follows from the effective coding of (A*)* in A* (Remark 2.31(d)).
Remark 3.46 (Internal versions of While* and For* computability). If A is N-standard, we can consider 'internal versions' of While*
and For* computability, based on the 'internal version' of A*, which uses
the copy of N already in A instead of a 'new' copy (see Remark 2.31(c)).
We can show that these versions provide the same models of computation
as our standard ('external') versions.
Proposition 3.47. Suppose A is N-standard. Let While*' and For*'
computability on A be the 'internal versions' of While* and For* (respectively) computability on A (see previous remark). Then While*' and
3.14
∪ RemSet(S2).
For example, for S = while b do a2; a3; a4 od; a5, the successive
remainders are:
    while b do a2; a3; a4 od; a5,
    a2; a3; a4; while b do a2; a3; a4 od; a5,
    a3; a4; while b do a2; a3; a4 od; a5,
    a4; while b do a2; a3; a4 od; a5,
    a5.
The next proposition says that RemSet(S) contains S, and is closed
under the 'Rest' operation (for any state).
Proposition 3.50.
(a) S ∈ RemSet(S).
(b) If S' ∈ RemSet(S), then Rest^A(S', σ) ∈ RemSet(S), for any state σ.
Proof. By structural induction on S.  ∎
Proposition 3.51. RemSet(S) is finite.
Proof. Structural induction on S.  ∎
Definition 3.52. The statement remainder function
    Rem^A : Stmt × State(A) × ℕ → Stmt
is the function such that Rem^A(S, σ, n) is the statement (the 'remainder
of S') about to be executed at step n of the computation of S on A,
starting in state σ (or skip when the computation is over). It is defined
by recursion on n (tail recursion again):
    Rem^A(S, σ, 0) = S
    Rem^A(S, σ, n+1) =
        skip                                       if S is atomic,
        Rem^A(Rest^A(S, σ), Comp1^A(S, σ), n)      otherwise.
Note the similarity with the tail recursive definition of Comp^A (section
3.4). Note also that for n = 1, this yields
    Rem^A(S, σ, 1) = Rest^A(S, σ).
The two functions Comp and Rem also satisfy the following pair of
relationships, which (together with suitable base cases n = 0) could be
taken as a (re-)definition of them by simultaneous primitive recursion:
Proposition 3.53.
(a) Comp^A(S, σ, n+1) = Comp1^A(Rem^A(S, σ, n), Comp^A(S, σ, n)),
(b) Rem^A(S, σ, n+1) = Rest^A(Rem^A(S, σ, n), Comp^A(S, σ, n)),
provided Comp^A(S, σ, n) ≠ *.
Proof. Exercise.  ∎
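The tail recursions for Comp1, Rest and Rem above can be sketched as a toy small-step interpreter (a hypothetical mini-syntax in nested tuples, not the chapter's formal one):

```python
# Toy syntax: ('skip',), ('assign', var, fn), ('seq', S1, S2),
# ('while', test, body). States are dicts.

def first(S):
    # first atomic statement about to be executed
    return first(S[1]) if S[0] == 'seq' else S

def rest(S, state):
    # Rest^A: the statement remaining after one step
    if S[0] in ('skip', 'assign'):
        return ('skip',)
    if S[0] == 'seq':
        r = rest(S[1], state)
        return S[2] if r == ('skip',) else ('seq', r, S[2])
    if S[0] == 'while':
        return ('seq', S[2], S) if S[1](state) else ('skip',)

def comp1(S, state):
    # Comp1^A: the effect of the first step on the state
    a = first(S)
    if a[0] == 'assign':
        return {**state, a[1]: a[2](state)}
    return state                       # skip / boolean test: no state change

def rem(S, state, n):
    # Rem^A, by tail recursion on n, as in Definition 3.52
    if n == 0:
        return S
    if S[0] in ('skip', 'assign'):
        return ('skip',)
    return rem(rest(S, state), comp1(S, state), n - 1)
```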
Proposition 3.54. For all n, Rem^A(S, σ, n) ∈ RemSet(S) ∪ {skip}.
Proof. Induction on n. Use Proposition 3.50.  ∎
If we put S_n = Rem^A(S, σ, n), then the sequence of statements S =
S_0, S_1, S_2, ... is called the remainder sequence generated by S at σ,
written RemSeq^A(S, σ).
Together, these give the snapshot sequence generated by S at σ,
    SnapSeq^A(S, σ) = (σ_0, S_0), (σ_1, S_1), (σ_2, S_2), ...,
where σ_n = Comp^A(S, σ, n) and S_n = Rem^A(S, σ, n).
(a) The snapshot function will be used later, in considering the solvability
of the halting problem for locally finite algebras (section 5.5).
(b) The snapshot function is adapted from Davis and Weyuker [1983] (or
Davis et al. [1994]). There a 'snapshot' or 'instantaneous description'
3.15
We conclude this section with a very useful syntactic conservativity
theorem (Theorem 3.63), which says that every Σ*-term with sort in Σ is
effectively semantically equivalent to a Σ-term. This theorem will be used
in sections 4 (universality for While* computations: Corollary 4.15) and
5 (strengthening Engeler's lemma: Theorem 5.58).
First we review and extend our notation for certain syntactic classes of
terms.
Notation 3.60.
t = 0 : maxval(t) = 0.
t = St0 : maxval(t) = maxval(t0) + 1.
t = if(b, t1, t2) : maxval(t) = max(maxval(t1), maxval(t2)).
t = Lgth(r), where r is of starred sort. There are four subcases,
according to the form of r:
(i) r = Null : maxval(t) = 0.
(ii) r = Update(r0, t1, t2) : maxval(t) = maxval(Lgth(r0)).
(iii) r = Newlength(r0, t0) : maxval(t) = maxval(t0).
(iv) r = if(b, r1, r2) : maxval(t) = max(maxval(Lgth(r1)), maxval(Lgth(r2))).
Remark 3.62.
(a) This definition, which is used in stage 1 of the syntactic
transformation described in Theorem 3.63 below, uses the assumption
that the variables of t all have sorts in Σ. If, for example, t (or a subterm
of t) were a variable of sort nat, or of the form Lgth(z*) for a variable
z* of starred sort, we could not define maxval(t).
(b) Suppose (i) Σ is strictly N-standard (and so includes the sort nat),
and (ii) the sorts of a do not include nat. Then, with Term*_{a,s} =
Term_{a,s}(Σ*) with the 'internal' version of Σ* (using this sort nat
instead of a 'new' sort, cf. Remark 2.31(c)), we can still give an
appropriate definition of maxval(t) for t ∈ Term*_{a,nat}. (Check.)
Theorem 3.63 (Σ*/Σ conservativity for terms). Let a be an (arbitrary
but fixed) tuple of Σ-variables. For all s ∈ Sort(Σ), every term in
Term*_{a,s} is effectively semantically equivalent to a term in Term_{a,s}.
Proof. This construction (or transformation) of terms proceeds in three
stages:
Stage 1: from Σ*-terms (of sort in Σ^{u,N}) to Σ^{u,N}-terms;
Stage 2: from Σ^{u,N}-terms (of sort in Σ^N) to Σ^N-terms;
Stage 3: from Σ^N-terms (of sort in Σ) to Σ-terms.
In all cases, the program variables of the terms are among a.
Stage 1: From Term_a(Σ*/Σ^{u,N}) to Term_a(Σ^{u,N}). This amounts to
removing subterms of starred sort from a term of unstarred sort.
Step a. Replace every equation r1 = r2 between subterms of starred sort
by the conjunction
    Lgth(r1) = Lgth(r2) ∧ ⋀_{k<M} (Ap(r1, k̄) = Ap(r2, k̄)),
where M = maxval(Lgth(r1)), and k̄ is the numeral for k (that is, '0'
preceded by 'S' k times).
Now all (maximal) occurrences of a subterm r of starred sort are in a
context of the form either Ap(r, t) or Lgth(r).
Step b. Transform all contexts of the form Ap(r, t), by structural induction
on r. There are four cases, according to the form of r:
(i) r = Null :
    Ap(r, t) ↦ unspec.
(ii) r = Update(r0, t0, t1) :
    Ap(r, t) ↦ if(t = t0 ∧ t0 < Lgth(r0), t1, Ap(r0, t)).
(iii) r = Newlength(r0, t0) :
    Ap(r, t) ↦ if(t < t0, Ap(r0, t), unspec).
(iv) r = if(b, r1, r2) :
    Ap(r, t) ↦ if(b, Ap(r1, t), Ap(r2, t)).
Note the use of the 'if' operator in cases (ii) and (iii); hence the inclusion
of 'if' in the definition of standard algebra (section 2.4). Note also the use
of '<' in cases (ii) and (iii); hence the inclusion of '<' in the definition of
the standard algebra N (Example 2.23(b)) and N-standardisations (section
2.5).
Step c. Transform all contexts of the form Lgth(r), by structural induction
on r. Again there are four cases, according to the form of r:
(i) r = Null :
    Lgth(r) ↦ 0.
(ii) r = Update(r0, t0, t1) :
    Lgth(r) ↦ Lgth(r0).
(iii) r = Newlength(r0, t0) :
    Lgth(r) ↦ t0.
(iv) r = if(b, r1, r2) :
    Lgth(r) ↦ if(b, Lgth(r1), Lgth(r2)).
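Steps b and c amount to a structural-recursive rewriting, which can be sketched as follows (terms as nested tuples; the constructors and `unspec` mirror the cases above, but the encoding itself is illustrative):

```python
# Sketch of steps b and c: eliminate Ap(...) and Lgth(...) contexts over
# array terms built from Null, Update, Newlength and if.

def elim_ap(r, t):
    """Rewrite Ap(r, t) to a term with no starred subterms (step b)."""
    if r[0] == 'Null':
        return ('unspec',)
    if r[0] == 'Update':                      # Update(r0, t0, t1)
        _, r0, t0, t1 = r
        guard = ('and', ('eq', t, t0), ('lt', t0, elim_lgth(r0)))
        return ('if', guard, t1, elim_ap(r0, t))
    if r[0] == 'Newlength':                   # Newlength(r0, t0)
        _, r0, t0 = r
        return ('if', ('lt', t, t0), elim_ap(r0, t), ('unspec',))
    if r[0] == 'if':                          # if(b, r1, r2)
        _, b, r1, r2 = r
        return ('if', b, elim_ap(r1, t), elim_ap(r2, t))

def elim_lgth(r):
    """Rewrite Lgth(r) to a term with no starred subterms (step c)."""
    if r[0] == 'Null':
        return ('0',)
    if r[0] == 'Update':
        return elim_lgth(r[1])
    if r[0] == 'Newlength':
        return r[2]
    if r[0] == 'if':
        _, b, r1, r2 = r
        return ('if', b, elim_lgth(r1), elim_lgth(r2))
```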
(1)    r < r',    r' < r,    r = r',    r' = r.
Stage 3, and hence the proof of the lemma, is completed by noting that
all four cases listed in (1) are then equivalent to m̄ < n̄ or m̄ = n̄, and
hence (depending on m and n) to either true or false.  ∎
Remark 3.64.
(a) The transformation of terms given by the conservativity theorem is
primitive recursive in Gödel numbers.
(b) Suppose (i) Σ is strictly N-standard (and so includes a sort nat),
and (ii) the sorts of a do not include nat. Then, with Term*_{a,s} =
Term_{a,s}(Σ*) with the 'internal' version of Σ* (as in Remark 3.62(b)),
the conservativity theorem still holds. (Check.)
Recall Definition 2.15 on generated subalgebras.
Corollary 3.65 (Σ*/Σ conservativity of subalgebra generation).
Let X ⊆ ⋃_{s ∈ Sort(Σ)} A_s. Then for any Σ-sort s,
    ⟨X⟩^{A*}_s = ⟨X⟩^A_s.
In this section we examine whether or not the While programming language is a so-called universal model of computation. This means answering
questions of the form:
Let A be a Σ-algebra. Does there exist a universal While
program U_prog ∈ While(Σ) that can simulate and perform the
computations of all programs in While(Σ) on all inputs from
A? Is there a universal While procedure U_proc ∈ Proc(Σ) that
can compute all the While computable functions on A?
These questions have a number of precise and delicate formulations which
involve representing faithfully the syntax and semantics of While computations using functions on A.
To this end we need the techniques of Gödel numbering, symbolic
computations on terms, and state localisation. Specifically, for Gödel
numbering to be possible, we need the sort nat, and so we will investigate
the possibility of representing the syntax of a standard Σ-algebra A (not in
4.1
⌜Term⌝ =_df { ⌜t⌝ | t ∈ Term },
4.2
Representation of states
(tt, σ[x])
4.3
The term evaluation function on A relative to x,
    TE^A_s : Term_{x,s} × State(A) → A_s,
defined by
    TE^A_s(t, σ) = [t]^A σ,
is represented by the function
    te^A_{x,s} : ⌜Term_{x,s}⌝ × A^u → A_s
defined, similarly, by
    te^A_{x,s}(⌜t⌝, a) = [t]^A σ,
where σ is any state on A such that σ[x] = a.
4.4
Let AtSt_x be the class of atomic statements with variables among x only.
The atomic statement evaluation function on A relative to x,
    AE^A : AtSt_x × State(A) → State(A),
defined by
    AE^A(S, σ) = ⟨S⟩^A(σ),
is represented by the function
    ae^A_x : ⌜AtSt_x⌝ × A^u → A^u
defined by
    ae^A_x(⌜S⌝, a) = (⟨S⟩^A(σ))[x],
where σ is any state on A such that σ[x] = a. (Again, this is well defined,
by Lemma 3.14.) In other words, the following diagram commutes:

    AtSt_x × State(A) ---AE^A---> State(A)
        |(gn, Rep^A)                 |Rep^A
        v                            v
    ⌜AtSt_x⌝ × A^u  ---ae^A_x--->   A^u
Next, let Stmt_x be the class of statements with variables among x only,
and define
    Rest^A_x =_df Rest^A ↾ (Stmt_x × State(A)).
Then First and Rest^A_x are represented by the functions
    first : ⌜Stmt_x⌝ → ⌜AtSt_x⌝,
    rest^A_x : ⌜Stmt_x⌝ × A^u → ⌜Stmt_x⌝,
i.e., the following diagrams commute:

    Stmt_x --First--> AtSt_x        Stmt_x × State(A) --Rest^A_x--> Stmt_x
      |gn              |gn            |(gn, Rep^A)                    |gn
      v                v              v                               v
    ⌜Stmt_x⌝ -first-> ⌜AtSt_x⌝      ⌜Stmt_x⌝ × A^u  --rest^A_x--> ⌜Stmt_x⌝

Note that first is a function from ℕ to ℕ, and (unlike rest^A_x and most
of the other representing functions here) does not depend on A or x.
Next, the computation step function (relative to x)
    Comp^A_x =_df Comp^A ↾ (Stmt_x × State(A) × ℕ) :
        Stmt_x × State(A) × ℕ → State(A) ∪ {*}
is represented by the function
    comp^A_x : ⌜Stmt_x⌝ × A^u × ℕ → B × A^u,
i.e., the following diagram commutes:

    Stmt_x × State(A) × ℕ ---Comp^A_x---> State(A) ∪ {*}
        |(gn, Rep^A, id_ℕ)                    |Rep^A
        v                                     v
    ⌜Stmt_x⌝ × A^u × ℕ  ---comp^A_x--->   B × A^u

We put
    comp^A_x(⌜S⌝, a, n) = (notover^A_x(⌜S⌝, a, n), state^A_x(⌜S⌝, a, n))
with component functions
    notover^A_x : ⌜Stmt_x⌝ × A^u × ℕ → B,
    state^A_x : ⌜Stmt_x⌝ × A^u × ℕ → A^u.
4.5
Let Stmt_x be the class of While statements with variables among x only.
The statement evaluation function on A relative to x,
    SE^A : Stmt_x × State(A) → State(A),
defined by
    SE^A(S, σ) = [S]^A σ,
is represented by the function
    se^A_x : ⌜Stmt_x⌝ × A^u → A^u
defined by
    se^A_x(⌜S⌝, a) = ([S]^A σ)[x],
where σ is any state on A such that σ[x] = a. (This is also well defined, by
the functionality lemma for statements, 3.10.) In other words, the following
diagram commutes:

    Stmt_x × State(A) ---SE^A---> State(A)
        |(gn, Rep^A)                 |Rep^A
        v                            v
    ⌜Stmt_x⌝ × A^u  ---se^A_x--->   A^u
4.6
We will want a representation of the class Proc_{u→v} of all While
procedures of type u → v, in order to construct a universal procedure for
that type. This turns out to be a rather subtle matter, since it requires a
coding for arbitrary tuples of auxiliary variables. We therefore postpone
such a representation to section 4.8, and meanwhile consider a local
version, for the subclass of Proc_{u→v} of procedures with auxiliary
variables of a given fixed type, which is good enough for our present
purpose (Lemma 4.2 and Theorem 4.3).
So let a, b, c be pairwise disjoint lists of variables, with types a : u, b : v
and c : w. Let Proc_{a,b,c} be the class of While procedures of type u → v,
with declaration in a out b aux c. The procedure evaluation function on A
relative to a, b, c,
    PE^A_{a,b,c} : Proc_{a,b,c} × A^u → A^v,
defined by
    PE^A_{a,b,c}(P, a) = P^A(a),
is represented by the function
    pe^A_{a,b,c} : ⌜Proc_{a,b,c}⌝ × A^u → A^v
defined by
    pe^A_{a,b,c}(⌜P⌝, a) = P^A(a).
In other words, the following diagram commutes:

    Proc_{a,b,c} × A^u ---PE^A_{a,b,c}---> A^v
        |(gn, id)                           |id
        v                                   v
    ⌜Proc_{a,b,c}⌝ × A^u --pe^A_{a,b,c}--> A^v
4.7
(2) Course of values recursion on nat with range sort nat is reducible to
primitive recursion on nat. (Used in (a).)
(3) The constructive least number operator, used in part (c) (cf. the
definition of CompLength in section 3.4), is While computable
onAN.
References for facts (1) and (3) are given later (Theorem 8.5). Fact
(2) can be proved by an analogue of a classical technique for computability
on ℕ, which can be found in Péter [1967] or Kleene [1952].
We complete the cycle of relative computability by proving (e) as follows:
given a term t ∈ Term_{x,s}, consider the procedure
    P = proc in x out y begin y := t end.
(i) For all x and s, the term evaluation representing function te^A_{x,s} is
While computable on A^N.
(ii) For all x, the atomic statement evaluation representing function
ae^A_x, and the representing function rest^A_x, are While computable on
A^N.
(iii) For all x, the computation step representing function comp^A_x, and
its two component functions notover^A_x and state^A_x, are While
computable on A^N.
(iv) For all x, the statement evaluation representing function se^A_x is
While computable on A^N.
(v) For all a, b, c, the procedure evaluation representing function
pe^A_{a,b,c} is While computable on A^N.
Proof. From the transitivity lemma for relative computability (3.32) and
Lemma 4.2.  ∎
follows from the effective normalisability of the terms of these
varieties. In the case of rings, this means an effective transformation of
arbitrary terms to polynomials. Consequently, the unordered and
ordered algebras of real and complex numbers (R, R^<, C and C^<,
defined in Example 2.23), which we will study in section 6, have the
TEP. (See Tucker [1980, §5].)
(b) An (artificial) example of an algebra without the TEP is given in
Moldestad et al. [1980b].
Proposition 4.6. The term evaluation representing function on A* is
For (and hence While) computable on A*, uniformly for A ∈ StdAlg(Σ).
Hence the class StdAlg(Σ*) has the uniform TEP.
Proof. (Outline.) The function te^{A*} is definable by course of values
recursion (cf. Remark 8.6) on Gödel numbers of Σ*-terms, uniformly for
A ∈ StdAlg(Σ). It is therefore uniformly For computable on A*, by
Theorem 8.7(a).  ∎
Corollary 4.7.
(a) The term evaluation representing function on A is For* (and hence
While*) computable on A^N, uniformly for A ∈ StdAlg(Σ).
(b) The other semantic representing functions listed in Theorem 4.3 are
While* computable on A^N, uniformly for A ∈ StdAlg(Σ).
Remark 4.8. Suppose Σ and A are N-standard. Then the semantic
representing functions listed above (such as te^A_{x,s}) can all be defined
over A instead of A^N. In that case, Lemma 4.2, Theorem 4.3, Definition
4.4 and Corollary 4.7 can all be restated, replacing 'A^N', 'Σ^N' and 'K^N'
by 'A', 'Σ' and 'K', respectively. Similar remarks apply to the definitions
and results in sections 4.8-4.12.
Recall the definitions of generated subalgebras, and minimal carriers
and algebras (Definitions 2.15 and 2.17 and Remark 2.16).
Corollary 4.9 (Effective local enumerability).
(a) Given any Σ-product type u and Σ-sort s, there is a For* computable
uniform enumeration of the carrier set of sort s of the subalgebra ⟨a⟩_A
generated by a ∈ A^u, i.e., a total mapping
    enum^A_s : A^u × ℕ → A_s
which is For* computable on A^N, such that for each a ∈ A^u, the
mapping
    enum^A_s(a, ·) : ℕ → A_s
(where x : u) is surjective.
(b) If A has the TEP, then enum^A_s is also While computable on A^N.
Proof. Define enum^A_s simply from the appropriate term evaluation
representing function:
    enum^A_s(a, n) = te^A_{x,s}(n, a).  ∎
Corollary 4.10 (Effective global enumerability).
(a) If A is minimal at s, then there is a For* computable enumeration
of the carrier A_s, i.e., a surjective total mapping
    enum^A_s : ℕ → A_s.
4.8
    notoveru^A(⌜x⌝, ⌜S⌝, a, n) = b_n,
    stateu^A(⌜x⌝, ⌜S⌝, a, n) = ⌜t_n⌝.
Compare these functions with comp^A_x and its components notover^A_x
and state^A_x (section 4.4). Note that, for any x extending a and any
S ∈ Stmt_x,
    notover^A_x(⌜S⌝, (a, δ^A), n) = notoveru^A(⌜x⌝, ⌜S⌝, a, n) = b_n.
Proof. (Outline.) We essentially redo parts (a) and (b) of Lemma 4.2,
using uniform (in x) versions of ae^A_x and rest^A_x; i.e., we define (1) the
function
    aeu^A : ⌜VarTup⌝ × ⌜AtSt⌝ × A^u → ⌜TermTup⌝
and (2) the function
    restu^A : ⌜VarTup⌝ × ⌜Stmt⌝ × A^u → ⌜Stmt⌝.
Remark 4.16.
(a) For all u, v, the construction of Univ_{u,v} (direction (i) ⟹ (ii) in the
proof of Theorem 4.14) is uniform over Σ in the following sense.
There is a relative WhileN procedure U_{u,v} : nat × u → v, containing
oracle procedure calls (h_s | s ∈ Sort(Σ)) (section 3.12) with
h_s : nat × u → s, such that for any A ∈ StdAlg(Σ), if h_s is interpreted
as te^A_{a,s} on A (where a : u), then U_{u,v} is universal for Proc_{u→v}
on A. (We ignore the question of whether te^A_{a,s} is computable on A.)
(b) The use of term evaluation occurs at two points in the construction
of Univ_{u,v} (direction (i) ⟹ (ii)): (1) in the evaluation of Boolean
tests in the construction of the sequence
and (2) in the evaluation of the terms representing the output variables
(see proof of Theorem 4.14). We can separate, and postpone, both these
applications of term evaluation by modifying the construction of the
universal procedure as follows.
Step 1: Construct from S, not a computation sequence as in (4.1), but
rather a computation tree (section 5.10), specifically
comptree(⌜x⌝, ⌜S⌝, n) (where x = a, b, c), which is the Gödel number of
the first n levels of the computation tree from S ∈ Stmt_x, labelled by
w-tuples of terms in TermTup_{x,w}. Note that comptree : ℕ³ → ℕ is
primitive recursive.
Step 2: Select a path in this tree by evaluating Boolean tests (using
te^A_{x,bool} together with the subex operation) until you come (if at all)
to a leaf. Evaluate the terms representing the output variables at this leaf
(again using te with the subex operation).
4.9
We can strengthen the universal characterisation theorem for While computations (4.14) using the Σ*/Σ conservativity theorem (3.63).
Theorem 4.17 (Universality characterisation theorem for While*
computations). The following are equivalent, uniformly for A ∈
StdAlg(Σ):
Gödel numbers (Remark 3.64(a)), the whole algorithm can be formalised
as a While(Σ^N) procedure.
(ii) ⟹ (i): This follows trivially from Theorem 4.14.  ∎
Corollary 4.18. The following are equivalent, uniformly for A ∈ StdAlg(Σ).
4.10
Next we consider the statement remainder and snapshot functions (section
3.14), which will be useful in our investigation of the halting problem
(section 5.6). Let x : u.
The statement remainder function (relative to x)
    Rem^A_x =_df Rem^A ↾ (Stmt_x × State(A) × ℕ) :
        Stmt_x × State(A) × ℕ → Stmt_x
is represented by the function
    rem^A_x : ⌜Stmt_x⌝ × A^u × ℕ → ⌜Stmt_x⌝.
Similarly, the snapshot function (relative to x)
    Snap^A_x =_df Snap^A ↾ (Stmt_x × State(A) × ℕ) :
        Stmt_x × State(A) × ℕ → (State(A) ∪ {*}) × Stmt_x
is represented by the function
    snap^A_x : ⌜Stmt_x⌝ × A^u × ℕ → (B × A^u) × ⌜Stmt_x⌝,
where
    snap^A_x(⌜S⌝, a, n) = (comp^A_x(⌜S⌝, a, n), rem^A_x(⌜S⌝, a, n))
        = ((notover^A_x(⌜S⌝, a, n), state^A_x(⌜S⌝, a, n)), rem^A_x(⌜S⌝, a, n)).
The sequences
    comp^A_x(⌜S⌝, a, 0), comp^A_x(⌜S⌝, a, 1), comp^A_x(⌜S⌝, a, 2), ...,
    rem^A_x(⌜S⌝, a, 0), rem^A_x(⌜S⌝, a, 1), rem^A_x(⌜S⌝, a, 2), ...  and
    snap^A_x(⌜S⌝, a, 0), snap^A_x(⌜S⌝, a, 1), snap^A_x(⌜S⌝, a, 2), ...
are called, respectively, the computation representing sequence, the
remainder representing sequence and the snapshot representing sequence
generated by S (or ⌜S⌝) at a (with respect to x), denoted respectively by
compseq^A_x(⌜S⌝, a), remseq^A_x(⌜S⌝, a) and snapseq^A_x(⌜S⌝, a).
(Compare the sequences CompSeq^A(S, σ), RemSeq^A(S, σ) and
SnapSeq^A(S, σ) introduced in section 3.)
The sequences compseq^A_x(⌜S⌝, a) and snapseq^A_x(⌜S⌝, a) are said to
be non-terminating if, for all n, notover^A_x(⌜S⌝, a, n) = tt, i.e., for no n is
the computation over by step n.
These representing sequences satisfy analogues of the results listed in
section 3.14; for example:
Proposition 4.19. If snapseq^A_x(⌜S⌝, a) repeats a value at some point,
then it is periodic from that point on, and hence non-terminating. In other
words, if for some m, n with m ≠ n
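Proposition 4.19 yields a simple (bounded) divergence test wherever snapshots range over a finite set: a repeated snapshot means the run is periodic, hence non-terminating. A Python sketch (the `step` function is a hypothetical stand-in for one computation step on a snapshot):

```python
# Detect a repeated snapshot in a deterministic step function: if a snapshot
# recurs, the run is periodic from there on and so never halts.

def diverges_within(step, s0, bound):
    seen = {}
    s = s0
    for n in range(bound):
        if s in seen:
            return True        # snapshot repeated => periodic => diverges
        seen[s] = n
        s = step(s)
    return False               # no repetition observed within the bound

# Example: the two-snapshot loop 0 -> 1 -> 0 -> ... repeats immediately.
```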
(i) For all x and s, the term evaluation representing function te^A_{x,s} is
While computable on A^N.
(ii) For all x, the snapshot representing function snap^A_x, and its two
component functions comp^A_x and rem^A_x, are While computable on
A^N.
Proof. As for Theorem 4.3.  ∎
4.11
Also ⟨x⟩^A_s is finite if, and only if, there exists n such that
    ⟨x⟩_{s,n} = ⟨x⟩_{s,n+1},        (4.2)
in which case
    ⟨x⟩_{s,n} = ⟨x⟩_{s,n+1} = ⟨x⟩_{s,n+2} = ... = ⟨x⟩^A_s.
N -> N
4.12
finite.
    te^A_{S,t} : A^u → A_s,        (4.3)
and the localised step-by-step representing functions        (4.4)
    comp^A_S : A^u × ℕ → B × A^u,
    rem^A_S : A^u × ℕ → ⌜Stmt_x⌝,
    snap^A_S : A^u × ℕ → (B × A^u) × ⌜Stmt_x⌝,
    se^A_S : A^u → A^u,
such that
    te^A_{S,t}(a) = te^A_{x,s}(⌜t⌝, a),
    comp^A_S(a, n) = comp^A_x(⌜S⌝, a, n),
and similarly for the other functions listed in (4.4). We then have:
Theorem 4.27.
(a) The functions te^A_{S,t} and ae^A_S are While computable on A. The
functions rest^A_S, notover^A_S, state^A_S, comp^A_S, rem^A_S and
snap^A_S are While computable on A^N. The functions se^A_S and
pe^A_{a,b,c,P} are WhileN computable on A.
(b) Suppose A is N-standard. Then all the functions listed in (4.4) are
While computable on A.
Proof. For (a): computability of te^A_{S,t} is proved by structural
induction on t ∈ Term_x. To prove computability of rest^A_S on A^N, put
S = S_0; S_1, where S_0 does not have the form S'; S'' (and '; S_1' may be
empty), and rewrite the definition of Rest^A in section 3.5 as an explicit
definition by cases, according to the different forms of S_0. For
computability of comp^A_S on A^N, show that the family of functions
(comp^A_{S'} | S' ∈ RemSet(S)) is definable by simultaneous primitive
recursion. (Compare the definition of Comp^A in section 3.4.) Use the fact
that this family is finite, by Proposition 3.51.
Part (b) follows immediately from (a).  ∎
5
Notions of semicomputability
The second idea of importance is that of a projection of a semicomputable set. In computability theory on the set N of natural numbers, the
class of semicomputable sets is closed under taking projections, but this is
not true in the general case of algebras, even with While* computability.
(A reason is the restricted form of computable local search available in
our models of computation.) Projective semicomputability is strictly more
powerful (and less algorithmic) than semicomputability.
In this section we will study the two notions of semicomputability and
projective semicomputability in some detail. We will consider the invariance of the properties under homomorphisms. We will prove equivalences,
such as
projective While* semicomputability = projective For* computability.
In the course of the section, we also consider extensions of the While
language by non-deterministic constructs, including allowing:
(i) arbitrary initialisations of some auxiliary variables in programs;
(ii) random assignments in programs.
We prove that in these non-deterministic languages, semicomputability is
equivalent to the corresponding notion of projective semicomputability. We
also show an equivalence between projective semicomputability and
(iii) definability in a weak second-order language.
We characterise the semicomputable sets as the sets definable by some
effective countable disjunction
    ⋁_{k=0}^∞ b_k.
5.1
While semicomputability
Definition 5.2.
(a) R is While computable on A if its characteristic function is.
(b) R is While semicomputable on A if it is the halting set on A of some
While procedure.
(c) A family R = (R_A | A ∈ K) of relations is While semicomputable
uniformly over K if there is a While procedure P such that for all
A ∈ K, R_A is the halting set of P on A.
It follows from the definition that R is While semicomputable on A
if, and only if, R is the domain of a While computable (partial) function
on A.
Remark 5.3. As far as defining relations by procedures is concerned, we
can ignore output variables. More precisely, if R = Halt^A(P), then we may
assume that P has no output variables, since otherwise we can remove all
output variables from P simply by reclassifying them as auxiliary variables.
We will call any procedure without output variables a relational procedure.
Definition 5.4 (Relative While semicomputability). Given a tuple
g1, ..., gn of functions on A, a relation R on A is While semicomputable
in g1, ..., gn if it is the halting set on A of a While(g1, ..., gn) procedure,
or (equivalently) the domain of a function While computable in g1, ..., gn
(cf. section 3.10).
Example 5.5.
(a) On the naturals N (Example 2.23(b)), the While semicomputable
sets are precisely the recursively enumerable sets, and the While
computable sets are precisely the recursive sets.
(b) Consider the standard algebra R of reals (Example 2.23(c)). The set
of naturals (as a subset of ℝ) is While semicomputable on R, being
the halting set of the following procedure:
    is-nat = proc in x : real
             begin
                while not x = 0
                do x := x - 1 od
             end
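In ordinary code the same procedure looks as follows (a Python sketch; `halts_within` bounds the simulation so that the halting set becomes observable):

```python
# The is-nat procedure above: it halts exactly on the naturals viewed inside
# the reals, so its halting set is N.

def is_nat(x):
    while x != 0.0:            # diverges if x is not a natural number
        x = x - 1.0
    return True                # reached only when the loop terminates

def halts_within(x, steps):
    # bounded simulation of is_nat: observe halting within 'steps' steps
    for _ in range(steps):
        if x == 0.0:
            return True
        x = x - 1.0
    return False
```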
5.2
    P^A(a) ↓  or  Q^A(a) ↓,
and if [mg(S1, S2)]^A σ ≃ σ', then
    σ'(which) = 1 ⟹ [S1]^A σ ↓ and [mg(S1, S2)]^A σ ≈ [S1]^A σ (rel var S1),
    σ'(which) = 2 ⟹ [S2]^A σ ↓ and [mg(S1, S2)]^A σ ≈ [S2]^A σ (rel var S2).
The definition of mg(S1, S2) is by course of values recursion on the sum
of compl(S1) and compl(S2). Details are left as a (challenging) exercise.
(Hint: the tricky case is when both S1 and S2 have the form
S_i = while b_i do S_i' od; S_i'' (i = 1, 2), where '; S_i''' may be empty.)  ∎
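In spirit, the merging construction interleaves two computations step by step and reports which one halted first. A Python sketch, with generators standing in for While computations:

```python
# Interleave two step-computations so the merged run halts iff either
# component halts, recording which one ('which' = 1 or 2).

def merge(run1, run2):
    while True:
        try:
            next(run1)                 # one step of the first computation
        except StopIteration:
            return 1
        try:
            next(run2)                 # one step of the second computation
        except StopIteration:
            return 2

def countdown(n):
    # a computation that halts after n steps
    while n > 0:
        n -= 1
        yield

def forever():
    # a computation that never halts
    while True:
        yield
```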
Remark 5.7. The construction of mg(P1, P2) for A is much simpler if we
can assume that (i) A is N-standard, and (ii) A has the TEP. In that case,
by Theorem 4.3, the computation step representing function comp^A_x is
While computable on A. Using this, we can construct a While procedure
which interleaves the computation steps of S1 and S2, tests at each step
    Halt^A(P_{1∪2}) = Halt^A(P1) ∪ Halt^A(P2),
    Halt^A(P_{1∩2}) = Halt^A(P1) ∩ Halt^A(P2),
where z1 ∩ z2 = ∅.
(a) P_{1∪2} can be defined as mg(P1, P2), as in Lemma 5.6. (We ignore its
output here.)
(b) P_{1∩2} can be defined, more simply, as in the classical case:
    P_{1∩2} = proc in x aux z1, z2 begin S1; S2 end.
If R is a relation on A of type u, we write the complement of R as
    R^c =_df A^u \ R.
So for all x ∈ A^u,
    ∃y ∈ A_s R(x, y) ⟺ ∃n R(x, enum^A_s(n)) ⟺ ∃n R'(x, n),
where R'(x, n) ⟺ R(x, enum^A_s(n)).
Note that there are relativised versions (cf. Definition 5.4) of all the
results of this subsection so far.
Discussion 5.12 (Minimality and search). Corollary 5.11 is a many-sorted
version of (part of) Theorem 2.4 of Friedman [1971a], cited in
Shepherdson [1985]. The minimality condition (a version of Friedman's
Condition III) means that search in A_s is computable (or, more strictly,
semicomputable) provided A has the TEP. Thus in minimal algebras, many
of the results of classical recursion theory carry over, e.g.,
the semicomputable sets are closed under projection (as above);
a semicomputable relation has a computable selection function;
a function with semicomputable graph is computable.
(Cf. Theorem 2.4 of Friedman [1971a].) If, in addition, there is computable
equality at the appropriate sorts, other results of classical recursion theory
carry over, e.g.,
5.3
While semicomputability.
Now, for (2), we introduce a new feature: definability with the possibility of arbitrary initialisation of search variables. For this, we define a
new type of procedure.
Definition 5.15. A search procedure has the form
    P_srch = proc in a out b aux c srch d begin S end,        (5.1)
with search variables d as well as input, output and auxiliary variables, and
with the stipulations (compare section 3.1(d)):
a, b, c and d each consist of distinct variables, and they are pairwise
disjoint;
every variable in S is included among a, b, c or d;
the input and search variables a, d can occur only on the right-hand
side of an assignment in S;
(initialisation condition): S has the form S_init; S', where S_init
consists of an initialisation of the output and auxiliary variables, but
not of the search variables d.
Again, we may assume in (5.1) that P_srch has no output variables, i.e.,
that b is empty. (See Remark 5.3.)
Definition 5.16. The halting set of a search procedure as in (5.1) on A
(assuming a : u and d : w) is the set
    Halt^A(P_srch).
In other words, it is the set of tuples a ∈ A^u such that, when a is
initialised to a, then for some (non-deterministic) initialisation of d, S
halts.
Note that this reduces to Definition 5.1 when P_srch has no search
variables.
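The 'for some initialisation of d' clause can be made concrete by dovetailing candidate search values against step budgets (a Python sketch; `body_steps` is a hypothetical stand-in that returns the body's halting time for given inputs, or None if it diverges):

```python
# Sketch of the halting set of a search procedure: input a is accepted iff
# for SOME value of the search variable d the body halts. Over a countable
# search space, dovetail candidate d-values against increasing step budgets.

def halts_with_search(body_steps, a, d_values, bound):
    for budget in range(bound):
        for d in d_values[:budget + 1]:
            n = body_steps(a, d)
            if n is not None and n <= budget:
                return True          # some initialisation of d makes S halt
    return False                     # nothing found within the bound

# Toy body: halts just when d is a square-root witness for a.
steps = lambda a, d: 1 if d * d == a else None
```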
Now let R be a relation on A.
Definition 5.17. R is While semicomputable with search on A if R is
the halting set on A of some While search procedure.
Again, the notion of uniform While semicomputability with search over
K of a family of relations, is defined analogously.
Now we compare the two notions introduced above.
Theorem 5.18.
5.4
WhileN semicomputability
Let R be a relation on A.
Definition 5.19.
(a) R is WhileN computable on A if its characteristic function is (section
3.12).
(b) R is WhileN semicomputable on A if it is the halting set of some
WhileN procedure P on A^N.
Again, we may assume that P has no output variables. (See Remark
5.3.)
From Proposition 3.38 we have:
semicomputability).
StdAlg(Σ).
Note that if A has the TEP, then the construction of a 'merged' WhileN
procedure mg(P1, P2) from two WhileN procedures P1 and P2, used in
the above two theorems, is much simpler than the construction given in
Lemma 5.6 (cf. Remark 5.7).
Also Theorem 5.10 and Corollary 5.11 can respectively be restated for
WhileN semicomputability:
Theorem 5.23 (Closure of WhileN semicomputability under N-projections). Suppose R ⊆ A^{u×nat}, where u ∈ ProdType(Σ), and R
is While semicomputable on A^N. Then its N-projection {x | ∃n ∈
ℕ. R(x, n)} is WhileN semicomputable on A.
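Classically, the closure under N-projections in Theorem 5.23 is established by dovetailing: the semidecision procedures for R(x, 0), R(x, 1), … are run in an interleaved fashion, for increasingly many steps each. The Python sketch below illustrates the idea only; the step-bounded `semi_R` and the sample relation are stand-ins for a WhileN semicomputation, not the chapter's formalism.

```python
from itertools import count

def semidecide_projection(semi_R, x, max_rounds=None):
    """Semidecide  (exists n) R(x, n)  by dovetailing.

    semi_R(x, n, steps) returns True if the semidecision procedure for
    R(x, n) halts (accepts) within `steps` steps, and False if it has
    not halted yet.  We interleave all pairs (n, steps)."""
    rounds = count() if max_rounds is None else range(max_rounds)
    for bound in rounds:                 # round `bound`: run procedures
        for n in range(bound + 1):       # 0..bound for `bound` steps each
            if semi_R(x, n, bound):
                return n                 # witness found
    return None                          # only reached when max_rounds is set

# Illustrative relation: R(x, n) holds iff n * n == x; its semidecider
# "takes" n steps before answering, mimicking a possibly slow halt.
def semi_R(x, n, steps):
    return steps >= n and n * n == x

assert semidecide_projection(semi_R, 49) == 7
assert semidecide_projection(semi_R, 5, max_rounds=100) is None
```

The optional `max_rounds` cut-off only exists so that the sketch terminates on non-members; a genuine semidecision procedure would search for ever.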
Corollary 5.24 (Closure of WhileN semicomputability under projections off minimal carriers). Suppose A has the TEP. Let A_s be a
minimal carrier of A. If R ⊆ A^{u×s} is WhileN semicomputable on A, then
so is its projection {x ∈ A^u | ∃y ∈ A_s. R(x, y)}.
Example 5.25.
(a) (WhileN semicomputability of the subalgebra relation.) For a standard signature Σ, equality sort s, product type u and standard Σ-algebra A, the subalgebra relation

{(x, y) | y ∈ ⟨x⟩_s}

(where ⟨x⟩_s is the carrier of sort s of the subalgebra of A generated by
x ∈ A^u) is While semicomputable (on A^N) in the term evaluation
representing function te_s, where x : u (section 4.3). To show this,
we note that
(cf. Remark 2.16) and apply (a relativised version of) Theorem 5.23.
Hence if A has the TEP, this relation is WhileN semicomputable on
A.
is finite}
5.5 Projective WhileN semicomputability
Let R be a relation on A.
Definition 5.26.
(a) R is projectively WhileN computable on A if, and only if, R is a
projection of a While(Σ^N) computable relation on A^N.
(b) R is projectively WhileN semicomputable on A if R is a projection
of a While(Σ^N) semicomputable relation on A^N.
Proposition 5.14 can be restated for WhileN semicomputability:
Proposition 5.27. Suppose A is minimal and has the TEP. Then on A
projective WhileN semicomputability =
WhileN semicomputability.
5.6
rem^u : VarTup × ⌜Stmt⌝ × A^u × ℕ → ⌜Stmt⌝
snap^u : VarTup × ⌜Stmt⌝ × A^u × ℕ → (𝔹 × ⌜TermTup⌝) × ⌜Stmt⌝
snap^u(⌜x⌝, ⌜S⌝, a, n) = ((b_n, ⌜t_n⌝), ⌜S_n⌝)    (5.2)

where

b_n = notover^u(⌜x⌝, ⌜S⌝, a, n)
⌜S_n⌝ = rem^u(⌜x⌝, ⌜S⌝, a, n)
Lemma 5.33. The function snap^u, and its components comp^u and
rem^u, are While computable in (te_s | s ∈ Sort(Σ)) on A^N, uniformly
for A ∈ StdAlg(Σ).
Proof. Similar to Lemma 4.20.
5.7 While* semicomputability
Let R be a relation on A.
Definition 5.37.
(a) R is While* computable on A if, and only if, its characteristic function is.
(b) R is While* semicomputable if, and only if, it is the halting set of
some While procedure P on A*.
Again, we may assume that P has no output variables. (See Remark
5.3.)
From Proposition 3.45 we have:
Proposition 5.38. On A*, While* semicomputability coincides with While
semicomputability.
Theorem 5.39 (Closure of While* semicomputability under union
and intersection). The union and intersection of two While* semicomputable relations of the same type are again While* semicomputable, uniformly over StdAlg(Σ).
Proof. From Theorem 5.8, applied to A*.
Note that since A* has the TEP for all A ∈ StdAlg(Σ), there is a
uniform construction of a 'merged' While* procedure mg(P₁, P₂) from
two While* procedures P₁ and P₂, used in the above two theorems, which
is much simpler than the construction given in Lemma 5.6 (cf. Remark
5.7).
Also Theorem 5.10 (and 5.23) and Corollary 5.11 (and 5.24) can be
restated for While* semicomputability:
Theorem 5.41. Suppose R ⊆ A^{u×nat}, where u ∈ ProdType(Σ), and
R is While* semicomputable on A^N. Then its N-projection {x | ∃n ∈
ℕ. R(x, n)} is While* semicomputable on A.
Corollary 5.42 (Closure of While* semicomputability under projections off minimal carriers). Let A_s be a minimal carrier of A, and
let u ∈ ProdType(Σ).
5.9
{φ(x) | x ∈ R}
Notice that the above result holds for a given procedure P, and any
epimorphism φ : A → B. In particular, taking the case B = A, we obtain:
Corollary 5.51 (Automorphism invariance for semicomputability).
(a) If R is While semicomputable on A, then for any Σ-automorphism
φ of A, φ[R] = R.
(b) Similarly for While* semicomputable sets.
Corollary 5.52 (Automorphism invariance for projective semicomputability).
(a) If R is projectively While semicomputable on A, then for any Σ-automorphism φ of A, φ[R] = R.
(b) Similarly for projectively While* semicomputable sets.
5.10
We will define, for any While statement S over Σ, and any tuple of distinct
program variables x = x₁, … , x_n of type u = s₁ × ⋯ × s_n such that
var(S) ⊆ x, the computation tree T[S, x], which is like an 'unfolded flow
chart' of S.
The root of the tree T[S, x] is labelled 's' (for 'start'), and the leaves
are labelled 'e' (for 'end'). The internal nodes are labelled with assignment
statements and Boolean tests.
Furthermore, each edge of T[S, x] is labelled with a syntactic state, i.e., a
tuple of terms t : u, where t = t₁, … , t_n, with t_i ∈ Term_{x,s_i}. Intuitively,
t gives the current state, assuming execution of S starts in the initial state
(represented by) x.
In the course of the following definition we will make use of the restricted
tree T⁻[S, x], which is just T[S, x] without the 's' node.
We also use the notation T[S, t] for the tree formed from T[S, x] by
replacing all edge labels t′ by t′⟨x/t⟩.
The definition is by structural induction on S.
(i) S ≡ skip. Then T[S, x] is as in Fig. 1.
Fig. 1.
(ii) S ≡ y := r, where y = y₁, … , y_m and r = r₁, … , r_m, with each y_i
in x. Then T[S, x] is as in Fig. 2, where t = t₁, … , t_n is defined
by:

t_i = r_j if x_i ≡ y_j for some j,
t_i = x_i otherwise.
Fig. 2.
(iii) S ≡ S₁; S₂. Then T[S, x] is formed from T[S₁, x] by replacing each
leaf (Fig. 3) by the tree in Fig. 4.
Fig. 3.
Fig. 4.
(iv) S ≡ if b then S₁ else S₂ fi. Then T[S, x] is as in Fig. 5.
(v) S ≡ while b do S₁ od. For the sake of this case, we temporarily adjoin
another kind of leaf to our tree formalism, labelled 'i' (for 'incomplete
computation'), in addition to the e-leaf (representing an end to the
computation). Then T[S, x] is defined as the 'limit' of the sequence
of trees T_n, where T₀ is as in Fig. 6, and T_{n+1} is formed from T_n by
replacing each i-leaf (Fig. 7) by the tree in Fig. 8, where T_i⁻[S₁, t]
is formed from T⁻[S₁, t] by replacing all e-leaves in the latter by
i-leaves. Note that the Boolean test b shown in Fig. 8 is evaluated
at the 'current syntactic state' t (which amounts to evaluating b⟨x/t⟩
at 'the initial state' x). Note also that the 'limiting tree' T[S, x] does
not contain any i-leaves. (Exercise.)
Fig. 5.
Fig. 6.
Remark 5.54.
(a) In case (v) the sequence T_n[S, x] is defined by primitive recursion
on n. An equivalent definition by tail recursion is possible (Exercise;
compare the two definitions of Comp^A(S, σ, n) in sections 3.4 and
3.14; see also Remark 3.5).
(b) The construction of T[S, x] is effective in S and x. More precisely:
T[S, x] can be coded as a recursive set of numbers, with index primitive recursive in ⌜S⌝.

Example 5.55. Let S ≡ while x > 0 do x := x − 1 od, where x is a
natural number variable. Then (in the notation of case (v)) T₀, T₁ and T₂
are, respectively, as shown in Figs. 9, 10 and 11, and T[S, x] is the infinite
tree shown in Fig. 12.
Notice that each tree in the sequence of approximations is obtained
from the previous tree by replacing each i-leaf by one more iteration of the
'while' loop.
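The unfolding in Example 5.55 can be imitated symbolically. In the sketch below (my own encoding, not the chapter's formalism) syntactic states are represented as term strings, and the approximating tree T_n for S ≡ while x > 0 do x := x − 1 od is built by replacing each i-leaf with one more unfolding of the loop:

```python
def unfold(term, n):
    """Approximating tree T_n for  S = while x > 0 do x := x - 1 od,
    as a nested tuple.  Edges carry syntactic states (terms over the
    initial state x); leaves are 'e' (end) or 'i' (incomplete)."""
    if n == 0:
        return ('i', term)                      # T_0: a single i-leaf
    # Test b<x/t>, i.e. t > 0: the false branch ends the computation,
    # the true branch runs the loop body once more on the new state.
    return ('test', term + ' > 0',
            ('e', term),                        # test false: loop exits
            unfold(term + ' - 1', n - 1))       # test true: one more pass

def leaves(tree):
    """Enumerate the leaves of an approximating tree, left to right."""
    if tree[0] in ('e', 'i'):
        yield tree
    else:
        yield from leaves(tree[2])
        yield from leaves(tree[3])

t2 = unfold('x', 2)
assert list(leaves(t2)) == [('e', 'x'), ('e', 'x - 1'), ('i', 'x - 1 - 1')]
```

Each T_{n+1} has one more e-leaf than T_n and exactly one i-leaf, matching the pictures described for Figs. 9-11; the limit tree has only e-leaves, one for each possible number of loop iterations.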
5.11 Engeler's lemma
Using the computation tree for a While statement constructed in the previous subsection, we will prove an important structure theorem for While
semicomputability due to Engeler [1968a]. One of the consequences of this
result will be the semicomputability equivalence theorem (5.61).
Fig. 7.
Fig. 8.
For each leaf λ of the computation tree T[S, x], there is a Boolean b_{S,λ},
with variables among x, which expresses the conjunction of the results of all the
successive tests that (the current values of) the variables x must satisfy in
order for the computation to 'follow' the finite path from the root s to λ.
Consider, for example, a test node in T[S, x]:
if the path goes to the right here (say), then it contributes to b_{S,λ} the
conjunct

⋯ ∧ ¬b⟨x/t⟩ ∧ ⋯
Next, let (λ₀, λ₁, λ₂, …) be some effective enumeration of the leaves of T[S, x]
(e.g., in increasing depth, and, at a given depth, from left to right). Then,
writing b_{S,k} = b_{S,λ_k}, we can express the halting formula for S as the countable disjunction

halts_S =_df ⋁_{k=0}^∞ b_{S,k}.    (5.3)
Fig. 9.
Fig. 10.
Note that although the Booleans b_{S,k}, and (hence) the formula halts_S,
are constructed from a computation tree T[S, x] for some tuple x containing
var(S), their construction is independent of the choice of x.
Remark 5.56.
(a) The Booleans b_{S,k} are effective in S and k. More precisely, ⌜b_{S,k}⌝ is
partial recursive in ⌜S⌝ and k.
(b) Further, by a standard technique of classical recursion theory, for a
fixed S, if T[S, x] has at least one leaf, then the enumeration
b_{S,0}, b_{S,1}, b_{S,2}, …
can be constructed (with repetitions) so that b_{S,k} is a total function
of k, and, in fact, primitive recursive in k.
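For the loop of Example 5.55 the path Booleans can be written down explicitly: the computation reaches the k-th leaf exactly when the first k tests succeed and the next one fails, so b_{S,k} is the conjunction x > 0 ∧ x − 1 > 0 ∧ ⋯ ∧ ¬(x − k > 0). A Python sketch of this enumeration and of the (necessarily truncated) halting disjunction (5.3), with an integer x standing in for the algebra element:

```python
def b(k, x):
    """Path Boolean b_{S,k} for  S = while x > 0 do x := x - 1 od:
    the computation from initial value x follows the path to the k-th
    leaf, i.e. the loop body runs exactly k times."""
    conjuncts = [x - i > 0 for i in range(k)]        # k successful tests
    conjuncts.append(not (x - k > 0))                # final failing test
    return all(conjuncts)

def halts(x, bound=1000):
    """Halting formula (5.3), truncated to finitely many disjuncts;
    the genuine formula is the disjunction over all k."""
    return any(b(k, x) for k in range(bound))

# For natural-number x the loop halts after exactly x iterations,
# so precisely one disjunct b_{S,k} is true, namely k = x:
assert [k for k in range(10) if b(k, 3)] == [3]
assert all(halts(x) for x in range(20))
```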
Now consider a relational procedure

P = proc in a aux c begin S end
Fig. 11.
with input variables a : u and auxiliary variables c : w. Then S ≡ S_init; S′,
where S_init is an initialisation of the auxiliary variables c to the default
tuple δ^w. The computation tree for P is defined to be
T(P) =_df
and the halting formula for P is

halts_P = ⋁_{k=0}^∞ b_{S′,k}    (5.4)
(cf. (5.3)). Now, for k = 0, 1, …, let b_k be the Boolean which results from
substituting δ^w for c in b_{S′,k}. Note that var(b_k) ⊆ a. Let b_k[a]
be the evaluation of b_k when a ∈ A^u is assigned to a. Then by (5.4), the
halting set of P (5.1) is characterised as an effective countable disjunction:
a ∈ Halt^A(P) ⟺ ⋁_{k=0}^∞ b_k[a].    (5.5)
Fig. 12.
due to Engeler [1968a]:
Theorem 5.57 (Engeler's lemma). Let R be a While semicomputable relation on a standard Σ-algebra A. Then R can be expressed as an
effective countable disjunction of Booleans over Σ.
Actually, we need a stronger version of Engeler's lemma, applied to
While* programs, which we will derive next.
5.12
Fig. 13.
(5.6)

for all a ∈ A^u, where (now) b_k ∈ Term*_{bool}; i.e., the Booleans b_k, though
not of starred sort, may contain subterms of starred sort; for example,
they may be equations or inequalities between terms of starred sort. As
before, b_k[a] is the evaluation of b_k when a ∈ A^u is assigned to a.
Now for any Boolean b ∈ Term*_{bool}, let b′ be the Boolean in
Term_{a,bool} associated with b by the conservativity theorem (3.63). Then
from (5.6), for all a ∈ A^u,

(5.7)

Because the disjunction in (5.6) and the transformation b ↦ b′ are both
effective, the disjunction in (5.7) is also effective.
∎
For the converse direction:
Lemma 5.59. Let R be a relation on a standard Σ-algebra A. If R can
be expressed as an effective countable disjunction of Booleans over Σ, then
R is While* semicomputable.
Proof. Suppose R is expressed by an effective disjunction

⋁_{k=0}^∞ b_k[a].
(5.8)
5.13
(5.9)
it follows that the halting set for While* procedures is Σ₁ definable, uniformly over
StdAlg(Σ).
5.14
(5.10)
where v* is a product type of Σ*, and R₁ : u × v* is While semicomputable on A*. Then R₁ is the halting set of a While(Σ*) procedure
P on A*. By Corollary 5.69, R₁ is Σ₁ definable on A*, say

R₁(x, y*) ⟺ ∃z** ∈ A^{w**} R₀(x, y*, z**),    (5.11)

where w** is a product type of Σ**, and R₀(…) is given by an elementary formula over Σ**. Combining (5.10) and (5.11):

R(x) ⟺ ∃y* ∈ A^{v*} ∃z** ∈ A^{w**} R₀(x, y*, z**).    (5.12)
Second proof. (Here we make no assumption about equality sorts.) Suppose R : u is projectively While* semicomputable on A. Then (as before)
for some product type v* of Σ*,

R(x) ⟺ ∃y* ∈ A^{v*} R₁(x, y*)

where R₁ : u × v* is While semicomputable on A*.
By Engeler's lemma (Theorem 5.57) applied to A*, there is an effective sequence b_k*(x, y*) (k = 0, 1, 2, …) of Booleans over Σ* such that
R₁(x, y*) is equivalent over A to the disjunction of the b_k*[x, y*]. Further,
by Remark 5.56(b), this sequence can be defined so that ⌜b_k*⌝ is primitive
recursive in k. (Assume here that R is non-empty, otherwise the theorem
is trivial.) Then

R₁(x, y*) ⟺ for some k, te^{A*}_{bool}(⌜b_k*⌝, x, y*) = tt.
Further, te^{A*}_{bool} is For computable on A* (by Proposition 4.6).
Hence the function g defined on A* by

g(k, x, y*) ≃ te^{A*}_{bool}(⌜b_k*⌝, x, y*)

is For computable on A* (by Equation 3.8 and Remark 3.16). Hence the
relation

R₀(k, x, y*) ⟺_df g(k, x, y*) = tt

is For computable on A* (composing g with equality on bool), and so the
relation

R(x)

is projectively For* computable on A.
The other direction is trivial.
5.15
Fig. 14.
Fig. 15.
The solution is to represent all the variables x_i′, x_i″, x_i‴, … which arise
in this way for each i (1 ≤ i ≤ n) by a single starred variable x_i* (with
x_i*[0], x_i*[1], x_i*[2], … representing x_i, x_i′, x_i″, …). Then to each leaf λ of
T[S, x] there corresponds (as in section 5.11) a Boolean b_{S,λ}, but now in
the starred variables x* = x₁*, … , x_n*.
Again, as in section 5.11, we can define the halting formula for S as a
countable disjunction of Booleans:

halts_S =_df ⋁_{k=0}^∞ b_{S,k}

where the b_{S,k} are effective, in fact primitive recursive, in S and k. Note
however that the program variables in halts_S are now among x*, not x.
Now suppose that the relation R : u is the halting set of a While-random procedure on A,
P = proc in a aux c begin S_init; S′ end

where a : u = s₁ × ⋯ × s_m and c : w.
As in section 5.11, let b_k be the Boolean which results from substituting
the default tuple δ^w for c in b_{S′,k}. Note that var(b_k) ⊆ (a, c*), where
(a, c*) : u × w*. Then for all a ∈ A^u:

a ∈ R ⟺ ∃c* ∈ A^{w*} ∃k ∈ ℕ (te_{bool}(⌜b_k⌝, a, c*) = tt)
(cf. (5.8) in section 5.12), which (by Proposition 4.6) is projectively For*
computable on A, and hence projectively While* semicomputable on A.
This proves the theorem for the case that R is While-random semicomputable on A.
Assume, finally, that R is While*-random semicomputable, i.e., the
procedure for R may contain starred auxiliary variables, and there may be
random assignments to these. Now we can represent a sequence of random
assignments to a starred variable by a single doubly starred variable, or
two-dimensional array, which can then be effectively coded in A* (Remark
2.31(d)), and proceed as before.
While computability;
While semicomputability;
projective While semicomputability;
projective While* semicomputability.
We will also find interesting examples of sets of real and complex numbers which are semicomputable but not computable. Some of these sets
belong to dynamical system theory: orbits and periodic sets of chaotic
systems turn out to be semicomputable but not computable.
Finally we will also reconsider an example of a semicomputable, noncomputable set of complex numbers described in Blum et al. [1989]. The
effective content of their work can be obtained from the general theory.
Our main tool will be Engeler's lemma.
We will concentrate on the following algebras introduced in Example
2.23: the standard algebras
and

C_< = (C; less_real)
formed by adjoining the order relation on the reals less_real : ℝ² → 𝔹 (which
we will write as infix '<'). We will show that:

(a) the order relation on ℝ is projectively While semicomputable, but
not While semicomputable, on R; and
(b) a certain real closed subfield of ℝ is projectively While* semicomputable, but not projectively While semicomputable, on R_<.
6.1 Computability on R and C
(6.1)
for these four algebras. We will give more detailed formulations of these
facts for each of these algebras shortly.
Example 6.3 (Non-computable functions). Recall Theorem 3.66 which
says that the output of a While, WhileN or While* computable function is contained in the subalgebra generated by its inputs. From this we
can derive some negative computability results for these algebras:
(a) The square root function is not While* computable on R or R_<.
This follows from the fact that the subset of ℝ generated from the
empty set by the constants and operations of R or R_< is the set ℤ
of integers. But √2 is not in this set. (For computability in ordered
Euclidean fields incorporating the square root operation, see Engeler
[1975a].)
(b) The mod function (z ↦ |z|) is not While* computable on C or C_<.
This follows from the fact that the subset of ℝ generated from the
empty set by the constants and operations of C or C_< is again ℤ. But
again, |1 + i| = √2 is not in this set.
(c) The mod function would be computable in C if we adjoined the square
root function to the algebra R (as a reduct of C).
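The negative results in Example 6.3(a) and (b) rest on a closure computation: starting from 0 and 1 and closing under +, − and ×, one never leaves ℤ. A finite-stage Python sketch of that closure (the number of stages is an illustrative cut-off; the true subalgebra is the union over all stages):

```python
def generated_subalgebra(stage_count=3):
    """Elements of R generated from {0, 1} by +, -, and x,
    up to a given number of closure stages."""
    elems = {0.0, 1.0}
    for _ in range(stage_count):
        new = set(elems)
        for a in elems:
            for c in elems:
                new.update({a + c, a - c, a * c})
        elems = new
    return elems

elems = generated_subalgebra()
# Every generated element is an integer, so sqrt(2) is never reached:
assert all(e == int(e) for e in elems)
assert 2 ** 0.5 not in elems
```

Since every stage stays inside ℤ, the full subalgebra is contained in ℤ, and a value such as √2 can never appear as the output of a While* computation over R, exactly as Theorem 3.66 predicts.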
In the rest of this subsection, we will apply Engeler's lemma for While*
semicomputability (section 5.12) to the algebras R, R_<, C and C_<.
From the semicomputability equivalence theorem (5.61) (which follows
from Engeler's lemma) and from the TEP lemma (6.2), we get:
Theorem 6.4 (Semicomputability equivalences for R, R_<, C, C_<).
Suppose A is R or R_<, and R ⊆ ℝⁿ; or A is C or C_<, and R ⊆ ℂⁿ.
Then the following are equivalent:

(i) R is WhileN semicomputable on A;
(ii) R is While* semicomputable on A;
(iii) R can be expressed as an effective countable disjunction of Booleans
over A.
For applications of this theorem, we need the following normal form
lemmas for Booleans over R and R_<.
Lemma 6.5 (Normal form for Booleans over R). A Boolean over R,
with variables x = x₁, … , x_n of sort real only, is effectively equivalent over
R to a finite disjunction of finite conjunctions of equations and negations
of equations of the form

p(x) = 0    and    q(x) ≠ 0,
p(x) = 0
and
q(x) > 0,
6.2
in Z .
In this subsection we obtain results distinguishing various notions of semicomputability, using the algebra R of reals. In the next subsection we will
obtain other results in a similar vein, using the ordered algebra R_< of reals.
We begin with a restatement of the semicomputability equivalence theorem
(6.4) for the particular case of R.
Theorem 6.7 (Semicomputability for R). Suppose R ⊆ ℝⁿ. Then
the following are equivalent:

(i) R is WhileN semicomputable on R;
(ii) R is While* semicomputable on R;
(iii) R can be expressed as an effective countable disjunction

⋁ᵢ bᵢ(x)    (6.2)

where each bᵢ(x) is a finite conjunction of equations and negations of equations of the form

p(x) = 0    and    q(x) ≠ 0,    (6.3)
6.3
In the previous subsection we saw that the order relation on ℝ is not (even)
While* semicomputable on the algebra R. Let us add it now to R, to
form the algebra R_<, and see how this affects the computability theory.
We begin again with a restatement of the semicomputability equivalence
theorem (6.4), this time for the algebra R_<.
Theorem 6.16 (Semicomputability for R_<). Suppose R ⊆ ℝⁿ. Then
the following are equivalent:

(i) R is WhileN semicomputable on R_<;
(ii) R is While* semicomputable on R_<;
(iii) R can be expressed as an effective countable disjunction

⋁ᵢ bᵢ(x)

where each bᵢ(x) is a finite conjunction of equations and inequalities of the
form

p(x) = 0    and    q(x) > 0,

where p and q are polynomials in x = (x₁, … , x_n), with coefficients
in ℤ.
Proof. From Theorem 6.4 and Lemma 6.5.
Lander [1975, Chapter 12]. For the application below (section 6.4), we
relativise our concepts to an arbitrary subset D of ℝ.
Definition 6.17.
p(x) = 0
and
q(x) > 0,
6.4 A set which is projectively While* semicomputable but not projectively WhileN semicomputable
(6.4)
n<'E
n<
E
j :E
(6.5)

(with existential quantification over all four sorts in R_<^{E,N}). Then for all
x ∈ F, x is algebraic over some subset of E of cardinality r (= the number
of arguments of φ of sort E in (6.5)).
The rest of this subsection is a sketch of the proof.
Lemma 6.28. (In the notation of Theorem 6.27,) F can be represented
as a countable union of the form F = ⋃_{i=0}^∞ F_i, where

F_i = {x | (∃y ∈ Eʳ)(∃z ∈ ℝ^{m_i}) b_i(x, y, z)}

and b_i is a finite conjunction of equations and inequalities of the form

p(x, y, z) = 0    and    q(x, y, z) > 0
Since F is the union of the F_i[e] over all i, and all r-tuples e from E, the
theorem follows from Lemma 6.29 and the following:
Lemma 6.30. For all n, there exists a real which is algebraic over E but
not over any subset of E of cardinality n.
Proof. Take x = e₀ + e₁ + ⋯ + e_n (more strictly, j(e₀) + ⋯ + j(e_n)).
The result follows from the construction (6.4) of E.
∎
We have shown that E (although a projection on ℝ of a While semicomputable relation on ℝ × E*) is not a projection of a WhileN semicomputable relation on R_<^{E,N}. In fact, we can see (still using Engeler's lemma)
that E is not even a projection of a While semicomputable relation on
Eⁿ × ℝᵐ (for any n, m ≥ 0). Thus to define E, we must project off the
starred sort E*, or (in other words) existentially quantify over a finite, but
unbounded, sequence of elements of E.
6.5
Orb(F, s) = {F(t, s) | t ∈ T}.

The set of periodic points of F is

Per(F) = {s ∈ S | ∃t ∈ T (F(t, s) = s)}.
In modelling a dynamical system (S, F), the computability of F and
of sets such as the orbits and periodic points is of immediate interest and
importance.
Now suppose, more specifically, that the evolution of the system in time
is determined by a next state function

f : S → S

through the equations

F(0, s) = s
F(t + 1, s) = f(F(t, s)),

which have the solution

F(t, s) = f^t(s)
for t ∈ T and s ∈ S. We call such systems iterated maps. In this case, we
write

orb(f, s) = Orb(F, s) = {f^t(s) | t ∈ T}
per(f) = Per(F).
Theorem 6.31. Let A be an N-standard algebra (with N = T) containing the state space S. If the next state function f is While computable
on A then so is the system function F. Furthermore, the orbits orb(f, s)
and the set of periodic points per(f) are While semicomputable on A.
Proof. By computability of primitive recursion (Theorem 8.5) and closure of semicomputability under existential quantification over N (Theorem
5.23).
∎
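The constructions behind Theorem 6.31 can be phrased concretely: F is obtained from f by primitive recursion, the orbit is enumerated by iterating f, and membership of per(f) is semidecided by searching for a t with f^t(s) = s. A Python sketch over a state space where equality is decidable (the next state function and the search bound are illustrative; a genuine semidecision would search t unboundedly):

```python
def system_function(f, t, s):
    """F(t, s) = f^t(s), computed by primitive recursion on t."""
    for _ in range(t):
        s = f(s)
    return s

def orbit(f, s, t_max):
    """An initial segment of Orb(F, s) = { F(t, s) | t in T }."""
    return [system_function(f, t, s) for t in range(t_max)]

def is_periodic(f, s, t_max):
    """Semidecide  s in per(f):  search for t >= 1 with f^t(s) = s.
    (The bound t_max stands in for an unbounded search.)"""
    return any(system_function(f, t, s) == s for t in range(1, t_max))

f = lambda s: (2 * s) % 7          # illustrative next state function
assert orbit(f, 3, 4) == [3, 6, 5, 3]
assert is_periodic(f, 3, 10)
```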
Now we will consider the computability of some simple dynamical systems with one-dimensional state spaces. More specifically, suppose the
state space is an interval

S = I = [a, b] ⊆ ℝ

and so the next state function and system function have the form

f : I → I
F : T × I → I.
F is called an iterated map on the interval I. Dynamical systems based on
such maps have a wide range of uses and a beautiful theory. For example,
such systems will under certain circumstances exhibit 'chaos'. The following discussion is taken from Devaney [1989]. Let (I,F) be a dynamical
system based on the iterated map F.
Definition 6.32.
(a) (I, F) is sensitive to initial conditions if there exists δ > 0 such that
for all x ∈ I and any neighbourhood U of x, there exist y ∈ U and
t ∈ T such that

|F(t, x) − F(t, y)| > δ.
(b) (I, F) is topologically transitive if for any open sets U₁ and U₂ there
exist x ∈ U₁ and t ∈ T such that F(t, x) ∈ U₂.

Note that if I is compact then (I, F) is topologically transitive if, and
only if, Orb(F, x) is dense in I for some x ∈ I. (The direction '⇐' is
clear. The proof of '⇒' depends on the Baire category theorem.)
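Sensitivity to initial conditions can be observed numerically for a standard example, the logistic map f(x) = 4x(1 − x) on I = [0, 1]. In the sketch below the choice of map, of δ = 0.5 and of the iteration budget are all illustrative, and floating-point rounding makes this a demonstration rather than a proof:

```python
def logistic(x):
    """The logistic map f(x) = 4x(1-x) on the unit interval."""
    return 4.0 * x * (1.0 - x)

def F(t, x):
    """System function F(t, x) = f^t(x) by iteration."""
    for _ in range(t):
        x = logistic(x)
    return x

# Two initial points differing by 1e-9 separate by more than
# delta = 0.5 within a modest number of iterations:
x, y = 0.3, 0.3 + 1e-9
separated = any(abs(F(t, x) - F(t, y)) > 0.5 for t in range(200))
assert separated
```

The tiny initial discrepancy roughly doubles at each step, so after a few dozen iterations the two trajectories are effectively independent points of [0, 1], which is the numerical face of Definition 6.32(a).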
6.6
We reconsider an example from Blum et al. [1989], and show how it follows
from our general theory of semicomputability. We work from now on in
C_<. First, we must relate computability in the complex and real algebras.
We consider the algebras C and C_<.
Notation 6.35. If S ⊆ ℂⁿ, we write

S_ℝ =_df {(re(z₁), im(z₁), … , re(z_n), im(z_n)) | (z₁, … , z_n) ∈ S} ⊆ ℝ^{2n}.
The set F(g) is the filled Julia set of g; the boundary J(g) of F(g) is the
Julia set of g.
For any r ∈ ℝ define

U_r(g) =_df {z ∈ ℂ | ∃n (|gⁿ(z)| > r)}.

Clearly, U(g) ⊆ U_r(g) for all r.
Theorem 6.38. For g(z) = z² − c, with |c| > 4, we have: U(g) is
While semicomputable but not (even While*) computable. Thus, F(g)
is not While* semicomputable.
Proof. Assume for now that |c| > 1, and choose r = 2|c|. Then for |z| > r,

|g(z)| = |z² − c| ≥ |z|² − |c| > (3/2)|z|.
proc in a: complex
aux b: complex
begin
b:=a;
(Note that although the function z ↦ |z| is not computable, the function
z ↦ |z|² = re(z)² + im(z)² is.)
To conclude the proof we must show that F(g) is not While* semicomputable. Suppose it were; then (by the countable connectivity condition
and the reduction lemma) it would consist of countably many connected
components. But if we choose |c| > 4 it can be shown that F(g) is compact,
totally disconnected and perfect, i.e., homeomorphic to the Cantor set (see,
for example, Hocking and Young [1961]), and so we have a contradiction
(cf. Example 6.26(a)).
∎
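The semicomputability half of Theorem 6.38 is an escape-time test: iterate g and accept as soon as an iterate leaves the disc of radius r, testing |w|² > r² so that only the computable function z ↦ |z|² is needed. A bounded Python sketch (the iteration cap stands in for an unbounded semidecision; points of F(g) are never wrongly accepted, they simply never trigger the escape test):

```python
def in_U(c, z, r=None, n_max=200):
    """Semidecide  z in U(g)  for g(w) = w**2 - c:  accept iff some
    iterate escapes the radius r (default r = 2|c|).  Only the
    squared modulus re(w)**2 + im(w)**2 is ever computed."""
    def sq_mod(w):
        return w.real * w.real + w.imag * w.imag
    r2 = 4.0 * sq_mod(c) if r is None else r * r   # (2|c|)**2 by default
    w = z
    for _ in range(n_max):
        if sq_mod(w) > r2:
            return True          # escaped: z lies outside F(g)
        w = w * w - c            # one application of g
    return False                 # no escape seen within the budget

c = 5 + 0j                       # |c| > 4, as in Theorem 6.38
assert in_U(c, 10 + 0j)          # a distant point escapes at once
assert in_U(c, 3 + 0j)           # a nearer point escapes after a few steps
```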
7.1 The problem
less_p(x, y) = tt if x < y
             = ff if x > y
             = ↑ if x = y,

and

eq_p(x, y) = ↑ if x = y
           = ff if x ≠ y.
These partial functions are continuous, in the sense that the inverse images
of {tt} and {ff} are always open subsets of ℝ².
We will exploit these observations about '<' and '=' to the full by
studying topological partial algebras. We will also prove a more general
version of Theorem 7.3 for such partial algebras (Theorem 7.12).
7.2
⟦F(t₁, … , t_m)⟧^A σ ≃ F^A(⟦t₁⟧^A σ, … , ⟦t_m⟧^A σ), and ↑ otherwise,

except for the case that F(…) is the discriminator if(b, t₁, t₂), in which case
we have a 'non-strict' computation of either ⟦t₁⟧^A σ or ⟦t₂⟧^A σ, depending
on the value of ⟦b⟧^A σ:

⟦if(b, t₁, t₂)⟧^A σ ≃ ⟦t₁⟧^A σ if ⟦b⟧^A σ ↓ tt
                     ⟦t₂⟧^A σ if ⟦b⟧^A σ ↓ ff
                     ↑ if ⟦b⟧^A σ ↑.
or
7.3
⟦t⟧^A σ ↓ t.
Note that in this section, by 'function' we generally mean partial function.
Definition 7.5. Given two topological spaces X and Y, a function f :
X → Y is continuous if for every open V ⊆ Y, f⁻¹[V] =_df {x ∈ X | x ∈
dom(f) and f(x) ∈ V} is open in X.
Definition 7.6.
(1) A topological partial Σ-algebra is a partial Σ-algebra with topologies
on the carriers such that each of the basic functions is continuous.
(2) A standard topological partial algebra is a topological partial algebra
which is also a standard partial algebra, such that the carrier 𝔹 has
the discrete topology. (Cf. Definition 7.1.)
Examples 7.7.
(a) (Real algebra.) An important standard topological partial algebra for
our purpose is the algebra

R_p = (ℝ, 𝔹; 0, 1, +, −, ×, if_real, eq_p, less_p)

which is formed from R by the replacement of eq_real and less_real by
the partial operations eq_p and less_p (defined in section 7.1). It becomes a topological partial algebra by giving ℝ its usual topology, and
𝔹 the discrete topology. An open base for the standard topology on
ℝ is given by the collection of open intervals with rational endpoints.
These intervals are all While semicomputable on R_p. (Exercise.)
(b) (Interval algebras.) Another useful class of topological partial algebras are of the form

algebra      I_p
import       R_p
carriers     I
functions    i_I : I → ℝ,
             F₁ : I^{m₁} → I,
             …
             F_k : I^{m_k} → I
end
where I is the closed interval [0, 1] (with its usual topology), i_I is the
embedding of I into ℝ, and F_i : I^{m_i} → I are continuous partial functions.
These are called (partial) interval algebras on I.
Example 7.8 (While computable functions on R_p). We give two
examples of functions computable by While programs, using the above
Boolean-valued functions (eq_p and less_p) as tests. (In both cases, the inputs are taken to be positive reals to simplify the programs, although the
programs could easily be modified to apply to all reals, positive and non-positive.)
(a) The characteristic function of ℤ on ℝ⁺, is_int : ℝ⁺ → 𝔹, where

is_int(x) = tt if x is an integer
          = ff otherwise.
This is defined by the procedure

proc in x : pos-real
out b : bool
begin
b := false;
7.4
of reals as they occur to us, e.g., in physical measurements and calculations. So we can use the stream model as a source of insights for our
requirements or assumptions regarding the algebraic model, notably the
continuity requirement for computable functions.
Recall the problem discussed in Section 7.1 concerning the continuity
requirement for computable relations, i.e., Boolean-valued functions R :
ℝⁿ → 𝔹: the only continuous total functions from ℝⁿ to any discrete space
such as 𝔹 are the constant functions.
The solution, we saw, was to work with partial algebras, i.e., to interpret
the function symbols in the signature by partial functions. We use the
stream model for insight. Consider, for example, the equality and order
relations on the reals. Suppose we have two input reals (between 0 and 1)
defined by streams of decimal digits, which we read, one digit at a time:
α = 0.a₀a₁a₂···
β = 0.b₀b₁b₂···
(compare (i));
(iv)
if x = y
(same as (iii)!).
Note that examples (i) and (iii) were incorporated as basic operations in
the topological partial algebras I_p and R_p in section 7.3.
In the integer case, we can effectively test whether the input y is 0, and so
(if y = 0) give an output, namely an error message (or default value, if we
prefer). In the real case, if y = 0, this cannot be effectively decided, and
so no output (error or other) is possible. (Suppose the first n digits of the
input y are 00…0. The (n + 1)-th digit may or may not also be 0.)
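The digit-stream picture makes the partiality of eq_p concrete: reading α and β digit by digit, we can answer ff as soon as the streams visibly differ, but when α = β we read for ever. A Python sketch with digit streams as functions n ↦ n-th digit (the `fuel` bound models divergence; the subtlety that distinct streams such as 0.1000… and 0.0999… denote the same real is ignored here):

```python
from itertools import count

def eq_p(alpha, beta, fuel=None):
    """Partial equality test on digit streams: returns False once the
    streams differ, and diverges (here: exhausts `fuel`) when they
    agree for ever.  alpha, beta: functions n -> n-th decimal digit."""
    steps = count() if fuel is None else range(fuel)
    for n in steps:
        if alpha(n) != beta(n):
            return False          # provably distinct after n+1 digits
    return None                   # fuel exhausted: no answer yet

third = lambda n: 3               # the stream of 1/3 = 0.333...
a = lambda n: 3 if n < 5 else 4   # agrees with 1/3 for 5 digits only

assert eq_p(third, a) is False
assert eq_p(third, third, fuel=1000) is None
```

Inequality is thus semidecidable on streams, while equality is not even semidecidable, which is exactly the behaviour built into the partial operation eq_p of section 7.1.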
We remark that the concept of reals as streams is reminiscent of Brouwer's notion of reals defined by lawless sequences or choice sequences. In
fact, for Brouwer, a function was a constructively defined function, and he
'proved' that every function on ℝ is continuous! (See, for example, the
discussion in Troelstra and van Dalen [1988, Chapter 12].)
We conclude this discussion by pointing out a related, intensional, approach by Feferman to computation on the reals, based on Bishop's constructive approach to higher analysis (Bishop [1967], Bishop and Bridges
[1985]). This is outlined in Feferman [1992a; 1992b].
7.5
I
2
otherwise,
Lemma 7.18 (Least number operator). Let g : X × ℕ → Y be continuous, and let y₀ ∈ Y be such that {y₀} is clopen in Y. Let f : X → ℕ
be defined by

f(x) ≃ μk[g(x, k) ↓ y₀],

i.e.,

f⁻¹{k} = ⋂_{i=0}^{k−1} {x | g(x, i) ↓ ≠ y₀} ∩ {x | g(x, k) ↓ y₀}.
∎
A
Lemma 7.27. For any While procedure P, the function P^A (section 3.6)
is continuous.

Proof. Suppose P = proc in a out b aux c begin S end, where a : u and
b : v, so that P^A : A^u → A^v. Fix any state σ₀ ∈ State(A). The
imbedding and projection functions

ι_a : A^u → State(A)

and

π_b : State(A) → A^v

defined by

ι_a(x) = σ₀{a/x}

and

π_b(σ) = σ[b]

are continuous. (Exercise.) Hence the composition

π_b ∘ ⟦S⟧^A ∘ ι_a : A^u → A^v

is continuous. But this is just P^A, independent of the choice of σ₀, by the
functionality lemma (3.11) for procedures.
7.6
For background on compactness, see any of the books listed at the beginning of section 7.
Remark 7.28 (Compactness).
semicomputable ⟹ open
computable ⟹ clopen.
We can reverse the direction of the implication in the second of these assertions, under the assumption of compactness.
Theorem 7.29. Let A be a topological partial algebra, and let u = s₁ × ⋯ × s_m
∈ ProdType(Σ), where, for i = 1, … , m,

(a) A_{s_i} is compact, and
(b) A_{s_i} has an open subbase of While semicomputable sets.

Then for any relation R ⊆ A^u, the following are equivalent:

(i) R is While computable;
(ii) R is While* computable;
(iii) R is clopen in A^u.
Proof. (i)⟹(ii) is trivial.
(ii)⟹(iii) follows from Corollary 7.14.
(iii)⟹(i): Note first that from assumptions (a) and (b), the product space
A^u (with the product topology) is compact, and has an open subbase
of While semicomputable sets. Suppose now that R is clopen in A^u.
Since R is open, we can write

R = ⋃ᵢ Bᵢ

where the Bᵢ are basic open sets. Each Bᵢ is a finite intersection of
subbasic open sets, and hence semicomputable, by Theorem 5.8.
Since R is closed, R is compact, and hence R is the union of finitely
many of the Bᵢ's, and so R is semicomputable, by Theorem 5.8.
Repeating the above argument for Rᶜ, we infer by Post's theorem (5.9)
that R is computable.
∎
7.7
*<,W)) P ) 1/P
(1 ≤ p < ∞)
7.8
total functions from X to B (or to any discrete space) are the constant
functions. (Exercise.)
(c) A finite product of connected spaces is connected. (See any of the
references listed at the beginning of section 7.) Hence in a topological
Σ-algebra A, if u = s₁ × ⋯ × s_m ∈ ProdType(Σ), and A_{s_i} is
connected for i = 1, … , m, then so is A^u.
(d) The space ℝ of the reals, with its usual topology, is connected. Therefore, so is the product space ℝ^q for any q. Hence, by Corollary 7.14,
for any topological partial algebra over ℝ, such as the algebra R_p
(Example 7.7(a)), the only While or While* computable subsets
of ℝ^q are ℝ^q itself and ∅.
(e) Similarly, by the connectedness of the unit interval I (and hence of
I^q), the only While or While* computable subsets of I^q in any
interval algebra over I (Example 7.7(b)) are I^q itself and ∅,
regardless of the choice of (continuous) functions F₁, … , F_k as basic
operations!
We will only develop the theory in this section for total functions on
total algebras. The essential idea is that if f is a computable total function
on A, then f is continuous, and so, by Remark 7.32(b), its definition cannot
depend non-trivially on any Boolean tests involving variables of sort s if
A_s is connected. (We will make this precise below, in the proof of Lemma
7.40.)
Note that many of these results can be extended to the case of total functions f on connected domains in partial algebras. We intend to investigate this more fully in future work. However, for now we assume in this
subsection:
Assumption 7.33. A is a total topological algebra.
Examples 7.34 (Topological total algebras on the reals). Two total topological algebras based on the reals which will be important for our purposes are:
(a) The algebra ℛ^T (T for 'total'), defined by

algebra    ℛ^T
import     ℛ₀, 𝒩, 𝔹
functions  if_real : 𝔹 × ℝ² → ℝ,
           div_nat : ℝ × ℕ → ℝ
end
Fig. 16.
Remarks 7.41.
(a) Examples of the application of this lemma are the total topological algebras ℛ^T and 𝓘^T, and procedures of type real^q → real and intvl^q → real, respectively. Note that the result also holds with the 'internal' version of While* computability, by Proposition 3.47.
(b) Without the assumption that A^u be connected, Lemma 7.40 is false, i.e., it is possible for P^A to be total, but T(P) to be infinite. (Exercise.)
(c) Note that any computation tree T is finitely branching; therefore, by König's lemma, T is finite if, and only if, all its paths are finite.
Hence any counterexample to demonstrate (b) would be an example
of a computation tree for a procedure which defines a total function,
but nevertheless has infinite paths!
(d) The lemma also holds without the assumption that A be total, as long as P^A is total (and A^u is connected). (Exercise.)
(e) In general, this transformation of T(P) to a finite unbranching tree
given by the proof of Lemma 7.40 is not effective in P, since it depends on the evaluation of (constant) Boolean tests. If we want it
to be effective in P (as we will in the next subsection, dealing with
approximable computability), we will need a further condition on A,
such as the Boolean computability property (Definition 7.56).
Lemma 7.42. If a computation tree T(P) for a (While or While*) procedure P is finite and unbranching, then P^A is (Σ-) explicitly definable on A.
Proof. Exercise.
Remark 7.43. More generally, Lemma 7.42 holds if T(P) is finite but (possibly) branching. (Use the discriminator in constructing the defining term.)
Combining Lemmas 7.39, 7.40 and 7.42, we have conditions for an equivalence between explicit definability and While computability:
Theorem 7.44. Let A be a total topological algebra, and suppose A^u is connected. Let f : A^u → A^v be a total function. Then the following are equivalent:
(i) f is While computable on A;
(ii) f is While* computable on A;
(iii) f is explicitly definable on A.
Example 7.45. This theorem holds for the total topological algebras ℛ^T and 𝓘^T, and total functions f : ℝ^q → ℝ and f : I^q → I, respectively.
Note that by Remarks 7.38(a) and 7.41(a), the theorem also holds in these algebras with 'internal' versions of While* computability and *-explicit definability.
7.9
Approximable computability
It is often the case that functions are computed approximately, by a sequence of 'polynomial approximations'. In this way we extend the class of
computable functions to that of approximably computable functions. This
theory will build on the work of section 7.8.
First we review some basic notions on convergence of sequences of functions.
Definition 7.46 (Effective uniform convergence). Given a set X, a metric space Y, a total function f : X → Y and a sequence of total functions gₙ : X → Y (n = 0, 1, 2, ...), we say that gₙ converges effectively uniformly to f on X (or approximates f effectively uniformly on X) if, and only if, there is a total recursive function e : ℕ → ℕ such that for all n, k and all x ∈ X,

k ≥ e(n)  ⟹  d_Y(g_k(x), f(x)) < 2⁻ⁿ.
Remark 7.47. Let M : ℕ → ℕ be any total recursive function which is increasing and unbounded. Then (in the notation of Definition 7.46) the sequence gₙ converges effectively uniformly to f on X if, and only if, there is a total recursive function e : ℕ → ℕ such that for all n, k and all x ∈ X,

k ≥ e(n)  ⟹  d_Y(g_k(x), f(x)) < 1/M(n).
(Exercise.)
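As an illustrative sketch of Definition 7.46 (the particular f, gₙ and modulus e are our own choices, not from the text): on X = [0, 1/2] the partial sums of the geometric series converge effectively uniformly to 1/(1+x), with error bounded by x^(k+1)/(1+x) ≤ 2^-(k+1), so e(n) = n serves as a modulus.

```python
# Illustrative instance (our choice of f, g_n, e): on X = [0, 1/2],
# g_k(x) = sum_{i<=k} (-x)^i converges effectively uniformly to
# f(x) = 1/(1+x), since |g_k(x) - f(x)| = x^(k+1)/(1+x) <= 2^-(k+1).
def f(x):
    return 1.0 / (1.0 + x)

def g(k, x):
    return sum((-x) ** i for i in range(k + 1))

def e(n):
    # total recursive modulus: k >= e(n) implies |g_k(x) - f(x)| < 2^-n
    return n

for n in range(1, 16):
    for x in (0.0, 0.1, 0.25, 0.5):
        assert abs(g(e(n), x) - f(x)) < 2.0 ** (-n)
```

Note that the same sequence fails to converge uniformly on all of [0, 1], since at x = 1 the error is 1/2 for every k; the restriction of the domain is what makes the modulus exist.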
The theory here will be developed for total functions on metric total algebras (defined in section 7.5). We therefore assume in this subsection:
Assumption 7.48. A is a metric total algebra.
Example 7.49 (Metric total algebras on the reals). The two total topological algebras based on the reals given in Example 7.34 can be viewed as metric algebras in an obvious way. The second of these, the interval algebra 𝓘^T, will be particularly useful here.
We will present, and compare, two notions of approximable computability on metric total algebras: effective uniform While (or While*) approximability (Definition 7.50) and effective Weierstrass approximability (Definition 7.54).
So suppose A is a metric total Σ-algebra. Let u, v ∈ ProdType(Σ) and s ∈ Sort(Σ).
Definition 7.50. A total function f : A^u → A^v is effectively uniformly While (or While*) approximable on A if there is a While (or While*) procedure

P : nat × u → v

on A^N such that P^{A^N} is total on A^N and, putting gₙ(x) = P^{A^N}(n, x), the sequence gₙ converges to f effectively uniformly on A^u.
Definition 7.54 (Effective Weierstrass approximability).
(a) A total function f : A^u → A_s is effectively Σ-Weierstrass approximable over A if, for some x : u, there is a total computable function

h : ℕ → ⌜Term_{x,s}(Σ)⌝

such that, putting gₙ(x) = te^A_{x,s}(h(n), x), the sequence gₙ converges to f effectively uniformly on A^u.
(b) Effective Σ*-Weierstrass approximability is defined similarly, by replacing 'Σ' by 'Σ*' and 'te' by 'te*'.
(The term evaluation representing function te^A_{x,s} was defined in section 4.3.)
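A toy instance of this definition may help (the encoding of terms as coefficient lists, and the names h and te, are our own illustration, not the chapter's machinery): h(n) returns the syntax of the n-th approximating term — here the n-th Taylor polynomial of exp — and te evaluates such a term at a point, so that gₙ(x) = te(h(n), x) approximates exp effectively uniformly on [0, 1].

```python
# Illustrative sketch: h enumerates term *syntax* (coefficient lists of
# Taylor polynomials of exp), te is term evaluation (Horner's rule).
from math import factorial, exp

def h(n):
    # syntax of the n-th approximating term: coefficients 1/i!, i = 0..n
    return [1.0 / factorial(i) for i in range(n + 1)]

def te(term, x):
    # term evaluation at x
    acc = 0.0
    for c in reversed(term):
        acc = acc * x + c
    return acc

for n in range(3, 12):
    for x in (0.0, 0.5, 1.0):
        # the remainder of the exponential series on [0,1] is below 3/(n+1)!
        assert abs(te(h(n), x) - exp(x)) < 3.0 / factorial(n + 1)
```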
Proposition 7.55. A function on A is effectively Σ-Weierstrass approximable if, and only if, it is effectively Σ*-Weierstrass approximable.
Proof. From a computable function

h* : ℕ → ⌜Term_{x,s}(Σ*)⌝

we can construct a computable function

h : ℕ → ⌜Term_{x,s}(Σ)⌝

where, for each n, h(n) and h*(n) are Gödel numbers for semantically equivalent terms, using the fact that the transformation of Σ*-terms to Σ-terms in the conservativity theorem 2.15.4 is effective. ∎
We shall therefore usually speak of 'effective Weierstrass approximability' over an algebra to mean effective Weierstrass approximability in either
sense.
h : ℕ → ⌜Term_{x,s}(Σ)⌝

be a total computable function. Then there is a While(Σ^N) procedure P : nat × u → s such that for all x ∈ A^u and n ∈ ℕ,

P^{A^N}(n, x) = te^A_{x,s}(h(n), x).
Proof. Simple exercise.
te^A_{x,s}(h(n), x) = P^{A^N}(n, x).
Proof. Suppose

P ≡ proc in n, a out b aux c begin S end

where n : nat. Consider the While^N(Σ) procedures Pₙ : u → v (n = 0, 1, 2, ...) defined by

Pₙ ≡ proc in a out b aux n, c begin n := n̄; S end

where n̄ is the numeral for n. It is clear that for all n ∈ ℕ and x ∈ A^u,

Pₙ^A(x) = P^{A^N}(n, x).
By Lemmas 7.40 and 7.42, Pₙ^A is definable by a Σ-term tₙ. Moreover, the sequence (tₙ) is computable in n, by use of the BCP to effectivise the transformation of the tree T to T' in the construction given by the proof of Lemma 7.40. (Note that the evaluation of a constant Boolean test can be effected by the computation of any closed instance of the Boolean term, which exists by the instantiation assumption.) Hence the function h defined by

h(n) = ⌜tₙ⌝

is computable. ∎
The requirement in the above theorem that f be total derives from the
application of Lemma 7.60, which in turn used Lemma 7.40, where totality
was required.
Remark 7.62. The equivalence of (i) and (iii) was noted for the special case A = 𝓘^T, A^u = I and A_s = ℝ in Shepherdson [1976], in the course of proving the equivalence of these with another notion of computability on the reals (Theorem 7.64).
We are especially interested in computability on the reals, and, in particular, a notion of computability of functions from I^q to ℝ, developed in Grzegorczyk [1955; 1957] and Lacombe [1955]. We repeat the version given in Pour-El and Richards [1989], giving also, for completeness, the definitions of computable sequences of rationals and computable reals. Finally (Theorem 7.64), we state the equivalence of this notion with the others
listed in Theorem 7.61.
Definition 7.63.
(a) A sequence (r_k) of rationals is computable if there exist recursive functions a, b, s : ℕ → ℕ such that, for all k, b(k) ≠ 0 and

r_k = (−1)^{s(k)} · a(k)/b(k).
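As an illustrative instance of part (a) (the particular sequence is our own example): the partial sums r_k = Σ_{i≤k} 1/i! form a computable sequence of rationals, given by recursive functions a, b, s as in the definition, and they converge effectively to the computable real e.

```python
# Illustrative computable sequence of rationals converging to e:
# r_k = (-1)^s(k) * a(k)/b(k), with recursive a, b, s as below.
from fractions import Fraction
from math import factorial

def a(k):
    # numerator of sum_{i<=k} 1/i! over the common denominator k!
    return sum(factorial(k) // factorial(i) for i in range(k + 1))

def b(k):
    return factorial(k)          # never zero, as the definition requires

def s(k):
    return 0                     # all terms are nonnegative

def r(k):
    return Fraction((-1) ** s(k) * a(k), b(k))

assert all(b(k) != 0 for k in range(20))
assert abs(float(r(15)) - 2.718281828459045) < 2.0 ** (-30)
```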
and Richards [1989]. Shepherdson [1976] gave a proof of (i)⟺(iv) by (essentially) noting the equivalence (i)⟺(iii) and reproving (iii)⟺(iv). The new features in the present treatment are: (a) the equivalence (i)⟺(iii) in a more general context (Theorem 7.61), and (b) the equivalence of (ii) with the rest (Theorems 7.61 and 7.64).
7.10
Our models of computation can be applied to any algebraic structure. Furthermore, our models of computation are abstract: the computable sets
and functions on an algebra are isomorphism invariant. Thus to compute
on the real numbers we have only to choose an algebra A in which (any
one of the representations of) the set ℝ of reals is a carrier set. There
are infinitely many such algebras of representations or implementations of
the reals, all with decent theories resembling the theory of the computable
functions on the naturals. However, unlike the case of the natural numbers,
it is easy to list different algebras of reals with different classes of While
computable functions (see below).
In sections 6 and 7, we have let the abstract theory dictate our development of computation on the reals. The goal of making an attractive and
useful connection with continuity led us to use partial algebras in section 7.
Because of the fundamental role of continuity, this partial algebra approach
is important since it enables us to relate abstract computation on the reals
with concrete computation on representations of the reals (via the natural
numbers). This we saw in section 7.4 and, especially, in section 7.9. Here
we will reflect further on the distinction between concrete and abstract,
following Tucker and Zucker [1999].
The real numbers can be built from the rational numbers, and hence the
natural numbers, in a variety of equivalent ways, such as Dedekind cuts,
Cauchy sequences, decimal expansions, etc. Thus it is natural to investigate
the computability of functions on the real numbers, starting from the theory
of computable functions on the naturals. Such an approach we term a
concrete computability theory. The key idea is that of a computable real
number. A computable real number is a number that has a computable
approximation by rational numbers; the set of computable real numbers
forms a real closed subfield of the reals. Computable functions on the reals
are functions that can be computably approximated on computable real
numbers. The study of the computability of the reals began in Turing
[1936], but only later was taken up in a systematic way, in Rice [1954],
Lacombe [1955] and Grzegorczyk [1955; 1957], for example.
The different representations of the reals are specified axiomatically,
uniquely up to isomorphism, as a complete Archimedean ordered field.
But computationally they are far from being equivalent. For instance,
representing real numbers by infinite decimals leads to the problem that the
trivial function 3x cannot be computable. If Cauchy sequences are used,
which, with the product topology, is called Baire space. The theory of
computation on B is called type 2 computability theory. Klaus Weihrauch
and his collaborators, in a long series of papers, have created a fine generalisation of the theory of numberings of countable sets (recall section 1.3) to
a theory of type 2 numberings of uncountable sets. In type 2 enumeration
theory, numberings have the following form. Let X be a topological space.
A type 2 enumeration of X is a surjective partial map

α : B → X
(cf. Definition 1.1). Computability on X is analysed using type 2 computability on B. See, for example, Kreitz and Weihrauch [1985] and, especially, Weihrauch [1987].
A more abstract method for the systematic study of effective approximations of uncountable topological algebras has been developed by V.
Stoltenberg-Hansen and J. V. Tucker. It is based on representing topological algebras with algebras built from domains and applying the theory of
effective domains. This method of applying domain theory to mathematical approximation problems was first developed for topological algebras
and used on completions of local rings in Stoltenberg-Hansen and Tucker
[1985; 1988]. It was further developed on universal algebras in StoltenbergHansen and Tucker [1991; 1993; 1995]; see also Stoltenberg-Hansen et al.
[1994, Chapter 8]. We will sketch the basic method; an introduction can be
found in Stoltenberg-Hansen and Tucker [1995]. Suppose A is a topological algebra. The idea is to build an algebra R that represents A by means
α : R → A    (7.1)
While*(A) ⊆ ⋂_{α ∈ ConcRep(A)} Comp_α(A)
(compare (1.1) of Section 1.3). In the known concrete models, the computable functions are continuous; therefore the continuity of the abstract computable functions is essential.
There is much to explore in the border between abstract and concrete
computability. In Stewart [1999] it is shown that if A is an effective metric algebra with enumeration α, then the While* approximable functions on A are α-effective. The converse is not true. To bridge this gap, non-deterministic choice must be added to the 'While' language, and many-valued functions considered (see Tucker and Zucker [2000a]).
A theory of relations (or multi-valued functions) defined by generalised
Kleene schemes has been developed in Brattka [1996; 1997]. Among several
important results is an equivalence between the abstract computability
model based on Kleene schemes and Weihrauch's type 2 enumerability.
The distinction between abstract and concrete models made in Tucker
and Zucker [1999] has practical use in classifying the many approaches
to computability in concrete structures. However, this distinction needs
further theoretical refinement. One is reminded of the distinction between
'internal' and 'external', applied to higher type functionals, in Normann
[1982].
8.1
range type v, both product types over Σ; we will also write α : u → v.
The semantics of such a scheme, for each A ∈ NStdAlg(Σ) (the class of N-standard Σ-algebras), will then be a function

[α]^A : A^u → A^v.
(i) Initial functions and constants. For each Σ-product type u, Σ-sort s and function symbol F ∈ Func(Σ)_{u→s}, there is a scheme F ∈ PR(Σ)_{u→s}. On each A ∈ NStdAlg(Σ), it defines the function

F^A : A^u → A_s.
(ii) Projection. For all m > 0, u = s₁ × ... × sₘ and i with 1 ≤ i ≤ m, there is a scheme U_{u,i} ∈ PR(Σ)_{u→sᵢ}. It defines the projection function U^A_{u,i} : A^u → A_{sᵢ} on each A ∈ NStdAlg(Σ), where

U^A_{u,i}(a₁, ..., aₘ) = aᵢ.
α^A(0, x) = β^A(x)
α^A(z + 1, x) = γ^A(z, x, α^A(z, x))
PR(A) = (PR(A)_{u→v} | u, v ∈ ProdType(Σ))

where

PR(A)_{u→v} = {α^A | α ∈ PR(Σ)_{u→v}}.
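The primitive recursion scheme above can be sketched as a higher-order combinator over an arbitrary carrier (the names and the two example algebras are our own illustration):

```python
# Sketch of the scheme alpha^A(0, x) = beta^A(x),
# alpha^A(z+1, x) = gamma^A(z, x, alpha^A(z, x)).
def prim_rec(beta, gamma):
    def alpha(z, x):
        acc = beta(x)            # base case alpha(0, x)
        for i in range(z):       # unfold alpha(i+1, x) = gamma(i, x, alpha(i, x))
            acc = gamma(i, x, acc)
        return acc
    return alpha

# On the naturals: addition defined from successor.
add = prim_rec(lambda x: x, lambda z, x, acc: acc + 1)
assert add(3, 4) == 7

# On a total algebra over the reals: iterated averaging towards 1.
avg = prim_rec(lambda x: x, lambda z, x, acc: (acc + 1.0) / 2.0)
assert abs(avg(20, 0.0) - 1.0) < 1e-5
```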
It turns out that a broader class of functions provides a better generalisation of the notion of primitive recursiveness, namely PR*(Σ) computability.
Then any such scheme α ∈ PR*(Σ)_{u→v} defines a function α^A : A^u → A^v on each A ∈ NStdAlg(Σ).
Also PR*(A) is the set of PR*(Σ)-computable functions on A.
Next we add the constructive least number operator to the PR schemes.
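A minimal sketch of the least number operator (names and the demo bound are ours; a genuine μ-operator searches unboundedly):

```python
# mu(g)(x) returns the least z with g(x, z) true; None models divergence
# when no witness is found below an artificial demo bound.
def mu(g, bound=10**6):
    def h(x):
        for z in range(bound):
            if g(x, z):
                return z
        return None
    return h

# Least z with z*z >= x: the ceiling of the square root.
ceil_sqrt = mu(lambda x, z: z * z >= x)
assert ceil_sqrt(10) == 4
assert ceil_sqrt(16) == 4
assert ceil_sqrt(0) == 0
```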
μPR(Σ) = (μPR(Σ)_{u→v} | u, v ∈ ProdType(Σ)),

and similarly for μPR(Σ*).
There exist mappings

φ : PR(Σ) → ForProc(Σ)

and

ψ : ForProc(Σ) → PR(Σ)
(primitive recursive in the enumerated syntax) such that for all PR(Σ) schemes α, For(Σ) procedures P and N-standard Σ-algebras A,

[φ(α)]^A = [α]^A

and

[ψ(P)]^A = [P]^A.
For A ∈ NStdAlg(Σ),

α^A(0, x) = β^A(x)

and, for z > 0,

α^A(z, x) = γ^A(z, x, α^A(δ₁(z, x), x), ..., α^A(δ_d(z, x), x)).
operator to CR(Σ).
8.5):
(1) These can be used easily in the mathematical modelling of many deterministic systems, from computers (e.g. Harman and Tucker [1993]) to spatially extended non-linear dynamical systems (Holden et al. [1992]).
(2) The μPR schemes have been adapted and extended to characterise the computable relations on certain metric algebras, including the algebra of reals (Brattka [1996; 1997]).
8.2
Machine models
FAP(N) = FAPC(N) = FAPS(N) = FAPCS(N)
Fig. 17. (Inclusions among FAP(A), FAPC(A), FAPS(A) and FAPCS(A).)
This theorem is taken from Moldestad et al. [1980b]. It and other results about these models make clear the fact that, when computing in the abstract setting of an algebra A, adding

computation on N

unbounded algebraic memory over A

both separately, and together, increases the computational power of the formalism.
The connection with the imperative models is easily described. Assuming the straightforward generalisation of the machine models to accommodate many-sorted algebras, we have:

While(A) = FAP(A),
While^N(A) = FAPC(A),
While*(A) = FAPCS(A).
Three other machine model formalisms of interest are the finite algorithmic procedures with index registers (fapIR) and countable algorithmic procedures (cap) in Shepherdson [1973] and the generalised Turing algorithms (gTa) in Friedman [1971a], all equivalent to While* computability. In the obvious notation, we have:

FapIR(A) = Cap(A) = GTA(A) = While*(A).
8.3
High-level imperative programming models were slow to enter mainstream computability theory, despite attention being drawn to the value
of this approach in Scott [1967]. Some early textbooks to feature such
programming models were Brainerd and Landweber [1974], Manna [1974],
Bird [1976] and Clark and Cowell [1976].
8.4
Axiomatic methods
8.5
Equational definability
Equational definability may be generalised from N to an arbitrary algebra A with the natural result that, if A is an N-standard structure,
attempt at such a generalisation is Lambert [1968]. We sketch a simpler
treatment from Moldestad and Tucker [1981], adapted to many-sorted algebras.
First we choose a language Eqn = Eqn(Σ) for defining equations over a signature Σ and transforming them in simple deductions. Let Eqn have constants a, b, c, ... and variables x, y, z, ... for data; and variables p, q, r, ... for functions. Using the basic operations of the signature, we inductively define Σ-terms t, ... in the usual way. An equation in Eqn is an expression e ≡ (t₁ = t₂), where t₁ and t₂ are terms of the same sort.
A deduction of an equation e from a set of equations E is a list e₁, ..., e_k of equations such that for each i = 1, ..., k one of the following holds:
(i) eᵢ ∈ E;
(ii) eᵢ is obtained from eⱼ for some j < i by replacing every occurrence of a variable x in eⱼ by a constant c;
(iii) eᵢ is obtained from eⱼ for some j < i by replacing at least one occurrence of a subterm t of eⱼ by a constant c, where t has no free variables, and for some j' < i, e_{j'} ≡ (t = c).
An equation e is defined to be formally derivable or deducible from E,
written E h e, if there is a deduction of e from E.
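The deduction relation just defined can be sketched as a proof checker (a minimal illustration of our own: terms are strings or tuples, and only rules (i) and (ii) are implemented; rule (iii) is omitted for brevity):

```python
def subst_var(term, x, c):
    # rule (ii): replace every occurrence of variable x by constant c
    if term == x:
        return c
    if isinstance(term, tuple):
        return (term[0],) + tuple(subst_var(t, x, c) for t in term[1:])
    return term

def step_ok(e, prior, E, variables, constants):
    if e in E:                                        # rule (i)
        return True
    for ej in prior:                                  # rule (ii)
        for x in variables:
            for c in constants:
                if (subst_var(ej[0], x, c), subst_var(ej[1], x, c)) == e:
                    return True
    return False

def is_deduction(seq, E, variables, constants):
    return all(step_ok(e, seq[:i], E, variables, constants)
               for i, e in enumerate(seq))

# From p(x) = q(x) we may deduce the instance p(a) = q(a).
E = [(("p", "x"), ("q", "x"))]
proof = [(("p", "x"), ("q", "x")), (("p", "a"), ("q", "a"))]
assert is_deduction(proof, E, {"x"}, {"a"})
```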
Thus, it remains to formulate equational deductions with respect to a
given algebra A of signature E in order to formulate what it means for a
function / on A to be equationally definable on A. This is essentially giving
our system a semantics. The first semantical problem is to allow the basic
operations of A to play a role in deductions from a set of equations E, and
this is accomplished by permitting
⟹ a₁ = a₂.
Theorem 8.14.
8.6
The familiar definition of the recursive functions on N based on the primitive recursion scheme of Dedekind and Gödel, and the least number operator of Kleene, appeared in Kleene [1936]. Kleene provided a thorough
revision of the process of recursion on N sufficiently general to include recursion in objects of higher function type: see Kleene [1959; 1963]. In
Platek [1966] there is an abstract account of higher-type recursion.
Studies of higher type inductive definitions have been taken up by D.
Scott and Y. Ershov, whose work forms part of domain theory (see, for
example, Stoltenberg-Hansen et al. [1994]). The central technical notion
is that of fixed points of higher type operators.
In Moldestad et al. [1980a] Platek's methods were analysed and classified in terms of the machine models of section 8.2. Like equational definability, definability by fixed-point operators applies to an arbitrary algebra A and is there equivalent to FAPS computability. Thus, this notion coincides with While* definability in an N-standard structure. We will sketch the method (adapted to many-sorted algebras).
First we construct the language FPD = FPD(Σ) for defining fixed-point operators. Let FPD have the data and function variables of Eqn, the equation language of section 8.5. Using the basic operations of the signature and the λ-abstraction notation, we create a set of fixed-point terms of both data and function types:

t ::= x | p | F | t(t₁, ..., tₙ) | fix[λp · y₁, ..., yₙ · t]
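The idea of definability by fixed-point operators can be sketched concretely (an illustration of our own, not the FPD formalism itself): factorial arises as the least fixed point of a functional, computed by Kleene iteration starting from the everywhere-undefined partial function.

```python
# Phi(p)(n) = 1 if n = 0, else n * p(n-1); partial functions are dicts.
def Phi(p):
    def q(n):
        if n == 0:
            return 1
        prev = p.get(n - 1)
        return None if prev is None else n * prev
    return q

def lfp(functional, domain, rounds):
    p = {}                        # the everywhere-undefined function
    for _ in range(rounds):
        q = functional(p)
        p = {n: q(n) for n in domain if q(n) is not None}
    return p

fact = lfp(Phi, range(8), rounds=10)
assert fact[5] == 120
assert fact[0] == 1
```

Each iteration round extends the partial function by one more defined argument, so finitely many rounds suffice on a finite domain; this mirrors the standard construction of least fixed points of monotone operators.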
Theorem 8.15.
(a) For any standard Σ-algebra A,

FPD(A) = FAPS(A).

(b) For any N-standard Σ-algebra A,

FPD(A) = FAPCS(A) = While*(A).

For more details see Moldestad et al. [1980a].
An approach to computation on abstract data types, alternative to that
presented in this chapter, is the development in Feferman [1992a; 1992b]
of a theory of abstract computation procedures, defined by least fixed-point
schemes, influenced by Moschovakis [1984; 1989]. The 'abstract data types'
here are classes of structures similar to our standard partial many-sorted
algebras, abstract in the sense that they are closed under isomorphism,
and the computation procedures are abstract in the sense that they are
isomorphism invariant on the data types; cf. Theorem 3.24. Types (or
sorts) and operations can have an intensional or extensional interpretation.
Another treatment of inductive definitions (also influenced by Moschovakis) and a survey of their connections with machine models is given in
Hinman [1999].
8.7
Set recursion
8.8
languages). Consider a deterministic programming language over an abstract data type dt.
(a) The functions that can be programmed in the language on an algebra A which represents an implementation of dt, are the same as the functions While* programmable on A.
(b) The families of functions that can be programmed in the language uniformly over a class K of implementations of dt, are the same as the families of functions While* programmable over K.
The thesis has been discussed in Tucker and Zucker [1988].
The logical view of computable functions and sets, with its focus on
axiomatic theories and reasoning, is a more abstract view of computation
than the view from algebra and data type theory, with their focus on algorithms and programs. The logical view is directed at the specification of
computations.
8.9
In the course of our study, we have met logical and non-deterministic languages that define in a natural way the projectively computable sets (and,
equivalently, the projectively semicomputable sets). These languages are
motivated by the wish to specify problems and computations, and to leave
open all or some of the details of the programs that will solve the problems
and perform the computations.
To better understand the role of the projective computable sets, we
introduce the idea of an algorithmic specification language which includes
some ideas about non-deterministic programming languages. The properties that characterise an algorithmic specification language are forms of
algorithmically validating a specification. An algorithmic specification language is an informal concept that is intended to complement that of a
deterministic programming language. The problem we consider is that of
formalising the informal notion of an algorithmic specification language by
means of a generalised Church-Turing thesis for specification, based on
projectively computable sets.
There are four basic components to a computation:
(0) a data type;
(1) a specification of a task to be performed or problem to be solved;
(2) specifications for algorithms whose input/output behaviour accomplishes the task or solves the problem; and
(3) algorithms with appropriate i/o behaviour.
We model these components of a computation mathematically, by assuming that:
(0) a data type is a many-sorted algebra, or class of algebras;
(1) a specification of the task or problem is defined by a relation on the
algebra;
{x ∈ A^u | ∃y R(x, y)}

R(x) = {y ∈ A^v | R(x, y)}.
Quite commonly, the task is 'simplified' to computing one or more so-called
selection functions for the relation.
Definition 8.20 (Selection functions). Let R ⊆ A^u × A^v be a relation. A function

f : A^u → A^v

is a selection function for R if
(i) ∀x [∃y R(x, y) ⟹ f(x)↓ and R(x, f(x))]; and
(ii) ∀x [f(x)↓ ⟹ R(x, f(x))].
Notice that the domain and range of a selection function f are projections:

dom(f) = {x ∈ A^u | ∃y R(x, y)},
ran(f) = {y ∈ A^v | ∃x R(x, y)}.
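As an illustrative instance of Definition 8.20 (the relation and names are our own example): a selection function for the relation R(x, y) ⟺ y² = x on the naturals picks the square root when one exists, and is undefined otherwise (None modelling f(x)↑).

```python
from math import isqrt

def R(x, y):
    return y * y == x

def f(x):
    # selects the nonnegative root when x is a perfect square
    r = isqrt(x)
    return r if r * r == x else None   # None models divergence

for x in range(100):
    has_witness = any(R(x, y) for y in range(x + 1))
    if has_witness:               # (i): if some y with R(x,y) exists, f(x) is
        assert f(x) is not None and R(x, f(x))   # defined and is a witness
    if f(x) is not None:          # (ii): whenever f(x) is defined, R(x, f(x))
        assert R(x, f(x))
```

Here dom(f) = {x | ∃y R(x, y)} is the set of perfect squares, a projection of R, exactly as the remark above states.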
(i) First-order languages. Consider the first-order languages

Lang(Σ)

and

Lang(Σ*)

over the signatures Σ and Σ* with their usual semantics. The relations that are ∃-definable in these languages are the projectively While and While* computable sets.
(ii) Horn clause languages. In Tucker and Zucker [1989; 1992a] we studied a generalisation of logic programming languages based on Horn
clauses, and a semantics based on resolution. The relations definable
in this specification-cum-programming language were the projectively
While* computable sets. The logic programming model was shown
to be equivalent to certain classes of logically definable functions (Fitting [1981]).
(iii) Other definabilities. In Fitting [1981] the relations are shown to be equivalent to those definable in Montague [1968]. Hence, by work in Gordon [1970], these all coincide with the search computable functions of Moschovakis [1969a]. A summary of these results is contained in Tucker and Zucker [1988, section 7].
(iv) Non-deterministic programming languages. Finally, recall from section 5 that constructs allowing non-deterministic choices of data, state, or control in programming languages also lead to the projectively computable sets. In particular, the models

While* computability with initialisation and
While* computability with random assignments

were analysed.
The equivalence results suggest that the concepts of projective computability and semicomputability are stable in the analysis of models of specification. The concept of an algorithmic specification language in its weak form, together with all the above equivalence results, leads us to formulate the following generalised Church–Turing thesis for specification, to complement that for computation:
Thesis 8.24 (Generalised Church–Turing thesis for specification on abstract data types). Consider an adequate algorithmic specification
language S over an abstract data type dt.
(a) The relations on a many-sorted algebra A implementing dt that can
be specified in S are precisely the projectively While* computable
relations on A.
(b) The families of relations over a class K of such algebras implementing dt, that can be specified in S, uniformly over K, are precisely the families of uniformly projectively While* computable relations over K.
8.10
PR*, μPR, μPR*.

(λabs(g))(d)(n) = g(d, n).
The addition of this construct to models of computation MC leads to models of computation λMC(A):

λPR, λPR*, λμPR, λμPR*.
We investigate the relationships between these various models; for example, we prove some computational conservativity results: for any function f on A,

f ∈ λPR(A) ⟺ f ∈ PR(A)

and similarly for λPR*, λμPR and λμPR*. We also show that computability is not invariant under cartesian forms, i.e., there are functions
f such that

f ∉ PR(A)

but

cart(f) ∈ PR(A)
and similarly for λPR*, λμPR and λμPR*. Further, 'λ-elimination' does not hold, i.e., there are functions f such that

f ∈ λPR(A)

but

f ∉ PR(A);

for example, the function const^A : A → [ℕ → A], which maps data a ∈ A to the stream const^A(a) ∈ [ℕ → A] with constant value a, is in λPR(A) but not in PR(A), or even in μPR*(A). However, we do have λ-elimination + cartesian form, in the sense that

f ∈ λPR(A) ⟺ cart(f) ∈ PR(A),
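A minimal sketch of λ-abstraction and its cartesian (uncurried) form may clarify the relationship (the names lam_abs and cart are our own; this is ordinary currying, not the chapter's formal schemes):

```python
# lam_abs curries g : (a, n) -> value into a stream-valued function,
# and cart uncurries it back, so (lam_abs(g))(a)(n) = g(a, n).
def lam_abs(g):
    return lambda a: (lambda n: g(a, n))

def cart(f):
    return lambda a, n: f(a)(n)

# const maps a to the constant stream n |-> a; its cartesian form
# (a, n) |-> a is just a projection, hence primitive recursive, even
# though the stream-valued const itself is not.
const = lam_abs(lambda a, n: a)
assert const(7)(123) == 7
assert cart(const)(7, 123) == 7
assert cart(lam_abs(lambda a, n: a + n))(3, 4) == 7
```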
References
The great majority of the publications listed here are referenced in the
text. Some papers, however, marked with a star next to the date, are not
so referenced. They are included here as a guide to further reading, either
because they shed some light on the historical development of the subject,
or because they provide useful further information in certain areas, such as
program verification and computation on the reals.
[American Standards Association, 1963] American Standards Association.
Proposed American standard flowchart symbols for information processing, Communications of the Association for Computing Machinery
6:601-604, 1963.
[Apt, *1981] K. R. Apt. Ten years of Hoare's logic: A survey, Part 1,
ACM Transactions on Programming Languages and Systems 3:431-483,
1981.
[Apt and Plotkin, 1986] K. R. Apt and G. D. Plotkin. Countable nondeterminism and random assignment, Journal of the Association for Computing Machinery, 33:724-767, 1986.
[Arbib and Give'on, *1968] M. A. Arbib and Y. Give'on. Algebra automata I: Parallel programming as a prolegomena to the categorical
approach, Information and Control 12:331-345, 1968.
[Ashcroft and Manna, *197l] E. Ashcroft and Z. Manna. The translation
of 'go to' programs to 'while' programs, Information Processing 71:147-152, 1971.
[Ashcroft and Manna, *1974] E. Ashcroft and Z. Manna. Translating program schemas to while-schemas, SIAM Journal of Computing 4:125-146,
1974.
[Asser, *1960] G. Asser. Rekursive Wortfunktionen, Zeitschrift für mathematische Logik und Grundlagen der Mathematik 6:258-278, 1960.
[Asser, 1961] G. Asser. Funktionen-Algorithmen und Graphschemata,
Zeitschrift für mathematische Logik und Grundlagen der Mathematik
7:20-27, 1961.
[Back, 1983] R. J. R. Back. A continuous semantics for unbounded nondeterminism, Theoretical Computer Science 23:187-210, 1983.
[de Bakker, 1980] J. W. de Bakker. Mathematical Theory of Program Correctness, Prentice Hall, 1980.
[Banachowski et al., *1977] L. Banachowski, A. Kreczmar, G. Mirkowska,
H. Rasiowa and A. Salwicki. An introduction to algorithmic logic, mathematical investigations in the theory of programs. In Mathematical Foundations of Computer Science, A. Mazurkiewicz and Z. Pawlak eds, pp.
7-99, Banach Center Publications, 1977.
[Barwise, *1975] J. Barwise. Admissible Sets and Structures, Springer-Verlag, 1975.
[Becker, 1986] E. Becker. On the real spectrum of a ring and its application to semialgebraic geometry, Bulletin of the American Mathematical
Society (N.S.) 15:19-60, 1986.
[Bergstra and Tucker, *1980a] J. A. Bergstra and J. V. Tucker. A natural
data type with a finite equational final semantics specification but no
effective equational initial semantics specification. Bulletin of the European Association for Theoretical Computer Science 11:23-33, 1980.
[Bergstra and Tucker, *1980b] J. A. Bergstra and J. V. Tucker. A characterisation of computable data types by means of a finite equational specification method. In 7th International Colloquium on Automata, Languages and Programming, Noordwijkerhout, The Netherlands, July 1980.
J. W. de Bakker and J. van Leeuwen eds, Lecture Notes in Computer
Science 85, pp. 76-90, Springer-Verlag, 1980.
[Bergstra and Tucker, *1982a] J. A. Bergstra and J. V. Tucker. The completeness of the algebraic specification methods for data types, Information & Control 54:186-200, 1982.
[Bergstra and Tucker, *1982b] J. A. Bergstra and J. V. Tucker. Some natural structures which fail to possess a sound and decidable Hoare-like
logic for their while-programs, Theoretical Computer Science 17:303-315,
1982.
[Bergstra and Tucker, *1982c] J. A. Bergstra and J. V. Tucker. Expressiveness and the completeness of Hoare's logic, Journal of Computer &
Systems Science 25:267-284, 1982.
[Bergstra and Tucker, *1982d] J. A. Bergstra and J. V. Tucker. Two theorems about the completeness of Hoare's logic, Information Processing
Letters, 15:143-149, 1982.
[Bergstra and Tucker, *1983a] J. A. Bergstra and J. V. Tucker. Hoare's
logic and Peano's arithmetic, Theoretical Computer Science 22:265-284,
1983.
[Bergstra and Tucker, *1983b] J. A. Bergstra and J. V. Tucker. Initial and
final algebra semantics for data type specifications: two characterization
theorems, SIAM Journal of Computing 12:366-387, 1983.
[Bergstra and Tucker, *1984a] J. A. Bergstra and J. V. Tucker. Hoare's
logic for programming languages with two data types, Theoretical Computer Science 28:215-221, 1984.
[Bergstra and Tucker, *1984b] J. A. Bergstra and J. V. Tucker. The axiomatic semantics of programs based on Hoare's logic, Acta Informatica
21:293-320, 1984.
[Bergstra and Tucker, *1987] J. A. Bergstra and J. V. Tucker. Algebraic
specifications of computable and semicomputable data types, Theoretical
Computer Science 50:137-181, 1987.
[Bergstra et al, *1982] J. A. Bergstra, J. Tiuryn and J. V. Tucker. Floyd's
principle, correctness theories and program equivalence, Theoretical
Computer Science 17:113-149, 1982.
[Bird, 1976] R. Bird. Programs and Machines: An Introduction to the Theory of Computation, John Wiley and Sons, 1976.
[Bishop, 1967] E. Bishop. Foundations of Constructive Analysis, McGraw-Hill, 1967.
[Bishop and Bridges, 1985] E. Bishop and D. Bridges. Constructive Analysis, Springer-Verlag, 1985.
[Blanck, 1997] J. Blanck. Domain representability of metric spaces, Annals
of Pure and Applied Logic 83:225-247, 1997.
[Blum and Smale, 1993] L. Blum and S. Smale. The Gödel incompleteness
theorem and decidability over a ring. In From Topology to Computation:
Proceedings of the Smalefest, M. W. Hirsch, J. E. Marsden and M. Shub
eds, pp. 321-339, Springer-Verlag, 1993.
[Blum et al., 1989] L. Blum, M. Shub and S. Smale. On a theory of computation and complexity over the real numbers: NP-completeness, recursive functions and universal machines, Bulletin of the American Mathematical Society 21:1-46, 1989.
[Blum et al., 1996] L. Blum, F. Cucker, M. Shub and S. Smale. Complexity
and real computation: A manifesto, International Journal of Bifurcation
and Chaos 6(1):3-26, 1996.
[Böhm and Jacopini, 1966] C. Böhm and G. Jacopini. Flow diagrams, Turing machines and languages with only two formation rules, Communications of the Association for Computing Machinery, 9:366-371, 1966.
[Brainerd and Landweber, 1974] W. S. Brainerd and L. H. Landweber.
Theory of Computation, John Wiley & Sons, 1974.
[Brattka, 1996] V. Brattka. Recursive characterisation of computable realvalued functions and relations, Theoretical Computer Science 162:45-77,
1996.
[Brattka, 1997] V. Brattka. Order-free recursion on the real numbers,
Mathematical Logic Quarterly 43:216-234, 1997.
[Bröcker and Lander, 1975] T. Bröcker and L. C. Lander. Differentiable
Germs and Catastrophes, London Mathematical Society Lecture Note
Series 17, Cambridge University Press, 1975.
[Brown et al., *1972] S. Brown, D. Gries and T. Szymanski. Program
schemes with pushdown stores, SIAM Journal of Computing 1:242-268,
1972.
[Broy et al., 1993] M. Broy, F. Dederichs, C. Dendorfer, M. Fuchs, T. F.
Gritzner and R. Weber. The design of distributed systems: An introduction to FOCUS, Technical Report TUM-19202-2, Institut für Informatik,
Technical University of Munich, January 1993.
[Burris and Sankappanavar, 1981] S. Burris and H. P. Sankappanavar. A
Course in Universal Algebra, Springer-Verlag, 1981.
[Byerly, *1993] R. E. Byerly. Ordered subrings of the reals in which output
sets are recursively enumerable, Proceedings of the American Mathematical Society 118:597-601, 1993.
[Elgot and Robinson, *1964] C. C. Elgot and A. Robinson. Random-access, stored-program machines. An approach to programming languages, Journal of the Association for Computing Machinery, 11:365-399, 1964.
[Elgot et al., *1966] C. C. Elgot, A. Robinson and J. D. Rutledge. Multiple
control computer models, IBM Research Report RC-1622, 1966.
[Enderton, 1977] H. B. Enderton. Elements of recursion theory. In Handbook of Mathematical Logic, J. Barwise, ed., pp. 527-566, North-Holland,
1977.
[Engeler, 1967] E. Engeler. Algebraic properties of structures, Mathematical Systems Theory 1:183-195, 1967.
[Engeler, 1968a] E. Engeler. Formal Languages: Automata and Structures,
Markham Publishing Co., 1968.
[Engeler, 1968b] E. Engeler. Remarks on the theory of geometrical constructions. In The Syntax and Semantics of Infinitary Languages, Lecture Notes in Mathematics 72, pp. 64-76, Springer-Verlag, 1968.
[Engeler, 1971] E. Engeler. Structure and meaning of elementary programs.
In Symposium on Semantics of Algorithmic Languages, Lecture Notes in
Mathematics 188, pp. 89-101, Springer-Verlag, 1971.
[Engeler, 1975a] E. Engeler. On the solvability of algorithmic problems,
in Logic Colloquium '73, H. E. Rose and J. C. Shepherdson eds, pp.
231-251, North-Holland, 1975.
[Engeler, 1975b] E. Engeler. Algebraic logic. In Foundations of Computer
Science, Mathematical Centre Tracts No. 63, J. W. de Bakker, ed., Amsterdam, pp. 57-85, 1975.
[Engeler, 1993] E. Engeler. Algebraic Properties of Structures, World Scientific, 1993.
[Ershov, 1958] A. P. Ershov. On operator algorithms, Doklady Akademii
Nauk SSSR 122:967-970, 1958 (in Russian), translated in Automation
Express 1:20-23, 1959.
[Ershov, 1960] A. P. Ershov. Operator algorithms I, Problemi Kibernetiki
3, 1960. (in Russian), translated in Problems of Cybernetics 3, 1962.
[Ershov, 1962] A. P. Ershov. Operator algorithms II, Problemi Kibernetiki
8:211-233 (in Russian), 1962.
[Ershov, *1981] A. P. Ershov. Abstract computability on algebraic structures. In Algorithms in Modern Mathematics and Computer Science, A.
P. Ershov and D. E. Knuth, eds, Lecture Notes in Computer Science
122, Springer-Verlag, 1981.
[Ershov and Shura-Bura, 1980] A. P. Ershov and M. R. Shura-Bura. The
early development of programming in the USSR. In A History of Computing in the Twentieth Century, E. N. Metropolis, J. Hewlett and G.-C.
Rota eds, pp. 137-196, Academic Press, 1980.
[Feferman, 1992a] S. Feferman. A new approach to abstract data types, I:
[Knuth and Prado, *1980] D. Knuth and L. T. Prado. The early development of programming languages. In A History of Computing in the
Twentieth Century, N. Metropolis, J. Hewlett and G.-C. Rota eds, pp.
197-273, Academic Press, 1980.
[Ko, 1991] K.-I. Ko. Complexity Theory of Real Functions, Birkhäuser,
1991.
[Kolmogorov, *1953] A. N. Kolmogorov. O ponyatii algoritma, Uspekhi
Matematicheskikh Nauk 8(4):175-176, 1953.
[Kreczmar, *1977] A. Kreczmar. Programmability in fields, Fundamenta
Informaticae 1:195-230, 1977.
[Kreisel, 1971] G. Kreisel. Some reasons for generalizing recursion theory.
In Logic Colloquium '69, R. O. Gandy and C. E. M. Yates eds, pp. 139-198, North-Holland, 1971.
[Kreisel and Krivine, 1971] G. Kreisel and J. L. Krivine. Elements of Mathematical Logic, North-Holland, 1971.
[Kreitz and Weihrauch, 1985] C. Kreitz and K. Weihrauch. Theory of representations, Theoretical Computer Science 38:35-53, 1985.
[Lacombe, 1955] D. Lacombe. Extension de la notion de fonction récursive
aux fonctions d'une ou plusieurs variables réelles, I, II, III, Comptes
Rendus de l'Académie des Sciences Paris 240:2470-2480, 241:13-14, 151-153, 1955.
[Lacombe, *1971] D. Lacombe. Recursion theoretic structure for relational
systems. In Logic Colloquium '69, R. O. Gandy and C. E. M. Yates eds,
pp. 3-18, North-Holland, 1971.
[Lambert, 1968] W. M. Lambert, Jr. A notion of effectiveness in arbitrary
structures, Journal of Symbolic Logic, 33:577-602, 1968.
[Lauer, 1967] P. E. Lauer. The formal explicates of the notion of algorithm,
Technical Report TR 25.072, IBM Laboratory, Vienna, 1971.
[Lauer, 1968] P. E. Lauer. An introduction to H. Thiele's notions of algorithm, algorithmic process and graph-schemata calculus, Technical Report TR 25.079, IBM Laboratory, Vienna, 1968.
[Levien, *1962] R. E. Levien. Set-theoretic formalizations of computational
algorithms, computable functions and general purpose computers. In
Proceedings of the Symposium on the Mathematical Theory of Automation, New York, American Mathematical Society, pp. 101-123, 1962.
[Lucas et al., 1968] P. Lucas, P. E. Lauer and H. Stigleitner. Method and
notation for the formal definition of programming languages, Technical
Report TR 25.087, IBM Laboratory, Vienna, 1968.
[Luckham and Park, 1964] D. Luckham and D. M. Park. The undecidability of the equivalence problem for program schemata, Report 1141, Bolt,
Beranek and Newman Inc., 1964.
[Luckham et al., 1970] D. Luckham, D. M. Park and M. S. Paterson. On
formalized computer programs, Journal of Computer & Systems Science
4:220-249, 1970.
[Moschovakis, *1969b] Y. N. Moschovakis. Abstract first-order computability II, Transactions of the American Mathematical Society
138:465-504, 1969.
[Moschovakis, *1969c] Y. N. Moschovakis. Abstract computability and invariant definability, Journal of Symbolic Logic, 34:605-633, 1969.
[Moschovakis, 1971] Y. N. Moschovakis. Axioms for computation theories
first draft. In Logic Colloquium '69, R. O. Gandy and C. E. M. Yates
eds, pp. 199-255, North-Holland, 1971.
[Moschovakis, *1974] Y. N. Moschovakis. Elementary Induction on Abstract Structures, North-Holland, 1974.
[Moschovakis, 1984] Y. N. Moschovakis. Abstract recursion as a foundation
for the theory of recursive algorithms. In Computation and Proof Theory,
M. M. Richter, E. Börger, W. Oberschelp, B. Schinzel and W. Thomas,
eds, Lecture Notes in Mathematics 1104, pp. 289-364, Springer-Verlag,
1984.
[Moschovakis, 1989] Y. N. Moschovakis. The formal language of recursion,
Journal of Symbolic Logic 54:1216-1252, 1989.
[Möller and Tucker, 1998] B. Möller and J. V. Tucker, eds. Prospects
for Hardware Foundations. Lecture Notes in Computer Science 1546,
Springer-Verlag, 1998.
[Normann, 1978] D. Normann. Set-recursion. In Generalized Recursion
Theory II: Proceedings of the 1977 Oslo Symposium, J. E. Fenstad,
R. O. Gandy and G. E. Sacks eds, pp. 303-320, North-Holland, 1978.
[Normann, 1982] D. Normann. External and internal algorithms on the
continuous functionals. In Patras Logic Symposium, G. Metakides, ed,
Studies in Logic, vol. 109, pp. 137-144. North-Holland, 1982.
[Parsons, 1971] C. Parsons. On a number theoretic choice scheme II (Abstract), Journal of Symbolic Logic 36:587, 1971.
[Parsons, 1972] C. Parsons. On n-quantifier induction, Journal of Symbolic
Logic 37:466-482, 1972.
[Paterson, *1967] M. S. Paterson. Equivalence problems in a model of computation, PhD thesis, Cambridge University, 1967.
[Paterson, *1968] M. S. Paterson. Program schemata, Machine Intelligence
3:19-31, 1968.
[Paterson and Hewitt, 1970] M. S. Paterson and C. E. Hewitt. Comparative schematology. In Record of Project MAC Conference on Concurrent
Systems and Parallel Computation pp. 119-128, ACM; also MIT AI
Technical Memo 201, 1970.
[Péter, 1958] R. Péter. Graphschemata und rekursive Funktionen, Dialectica 12:373-393, 1958.
[Péter, 1967] R. Péter. Recursive Functions, Academic Press, 1967.
[Phillips, 1992] I. C. C. Phillips. Recursion theory. In Handbook of Logic
in Computer Science, Vol. 1, S. Abramsky, D. Gabbay and T. Maibaum
eds, pp. 79-187, Clarendon Press, 1992.
[Plaisted, *1972] D. Plaisted. Program schemas with counters. In Proceedings of the 4th Annual ACM Symposium on the Theory of Computing,
Denver, Col. pp. 44-51, Association for Computing Machinery, 1972.
[Platek, 1966] R. A. Platek. Foundations of recursion theory, PhD thesis,
Department of Mathematics, Stanford University, 1966.
[Pour-El and Caldwell, 1975] M. B. Pour-El and J. C. Caldwell. On a
simple definition of computable function of a real variable with applications to functions of a complex variable, Zeitschrift für mathematische
Logik und Grundlagen der Mathematik 21:1-19, 1975.
[Pour-El and Richards, 1989] M. B. Pour-El and J. I. Richards. Computability in Analysis and Physics, Springer-Verlag, 1989.
[Rice, 1954] H. G. Rice. Recursive real numbers, Proceedings of the American Mathematical Society 5:784-791, 1954.
[Rogers, 1967] H. Rogers, Jr. Theory of Recursive Functions and Effective
Computability, McGraw-Hill, 1967.
[Rutledge, *1964] J. D. Rutledge. On Ianov's program schemata, Journal
of the Association for Computing Machinery 11:1-9, 1964.
[Rutledge, *1970] J. D. Rutledge. Parallel processes, schemata and transformations, IBM Research Report RC-2912, 1970. Also in Architecture
and Design of Digital Computers, Boulaye ed., pp. 91-129, Dunod, 1971.
[Rutledge, *1973] J. D. Rutledge. Program Schemes as Automata 1, Journal of Computer & System Sciences 7:543-578, 1973.
[Saint John, *1994] R. Saint John. Output sets, halting sets and an arithmetical hierarchy for ordered subrings of the real numbers under
Blum/Shub/Smale computation, Technical Report TR-94-035, ICSI,
Berkeley, CA, 1994.
[Saint John, *1995] R. Saint John. Theory of computation for the real
numbers and subrings of the real numbers following Blum/Shub/Smale,
Dissertation, University of California at Berkeley, 1995.
[Sanchis, *1988] L. E. Sanchis. Reflexive Structures, Springer-Verlag, 1988.
[Schreiber, *1975] P. Schreiber. Theorie der geometrischen Konstruktionen, Deutscher Verlag der Wissenschaften, Berlin, 1975.
[Scott, 1967] D. Scott. Some definitional suggestions for automata theory,
Journal of Computer & System Sciences 1:187-212, 1967.
[Scott, *1970a] D. S. Scott. The lattice of flow diagrams, Programming
Research Group, Oxford, 1970.
[Scott, *1970b] D. S. Scott. Outline of a mathematical theory of computation. In Proceedings of the 4th Annual Princeton Conference on Information Sciences & Systems, Princeton University, pp. 169-176, 1970.
Also Technical Monograph PRG-2, Programming Research Group, Oxford University, 1970.
[Scott and Strachey, *1971] D. S. Scott and C. Strachey. Towards a mathematical semantics for computer languages. In Proceedings of the Symposium on Computers & Automata, J. Fox ed., Polytechnic Institute
[Stephens, 1997] R. Stephens. A survey of stream processing, Acta Informatica 34:491-541, 1997.
[Stephens and Thompson, 1996] R. Stephens and B. C. Thompson. Cartesian stream transformer composition, Fundamenta Informaticae 25:123-174, 1996.
[Stephenson, 1996] K. Stephenson. An algebraic approach to syntax, semantics and computation, PhD thesis, Department of Computer Science,
University of Wales, Swansea, 1996.
[Stewart, 1999] K. Stewart. Abstract and concrete models of computation
over metric algebras. PhD thesis, Computer Science Department, University of Wales, Swansea, 1999.
[Stoltenberg-Hansen, 1979] V. Stoltenberg-Hansen. Finite injury arguments in infinite computation theories, Annals of Mathematical Logic
16:57-80, 1979.
[Stoltenberg-Hansen and Tucker, 1985] V. Stoltenberg-Hansen and J. V.
Tucker. Complete local rings as domains, Technical Report 1.85, Centre
for Theoretical Computer Science, University of Leeds, 1985.
[Stoltenberg-Hansen and Tucker, 1988] V. Stoltenberg-Hansen and J. V.
Tucker. Complete local rings as domains, Journal of Symbolic Logic
53:603-624, 1988.
[Stoltenberg-Hansen and Tucker, 1991] V. Stoltenberg-Hansen and J. V.
Tucker. Algebraic and fixed point equations over inverse limits of algebras, Theoretical Computer Science 87:1-24, 1991.
[Stoltenberg-Hansen and Tucker, 1993] V. Stoltenberg-Hansen and J. V.
Tucker. Infinite systems of equations over inverse limits and infinite synchronous concurrent algorithms. In Semantics: Foundations and Applications J. W. de Bakker, W.-P. de Roever and G. Rozenberg, eds. Lecture
Notes in Computer Science 666, pp. 531-562, Springer-Verlag, 1993.
[Stoltenberg-Hansen and Tucker, 1995] V. Stoltenberg-Hansen and J. V.
Tucker. Effective algebras. In Handbook of Logic in Computer Science,
Vol. 4, S. Abramsky, D. Gabbay and T. Maibaum eds, pp. 357-526,
Oxford University Press, 1995.
[Stoltenberg-Hansen and Tucker, 1999a] V. Stoltenberg-Hansen and J. V.
Tucker. Computable rings and fields. In Handbook of Computability Theory, E. Griffor ed, North-Holland, 1999.
[Stoltenberg-Hansen and Tucker, 1999b] V. Stoltenberg-Hansen and J. V.
Tucker. Concrete models of computation for topological algebras. Theoretical Computer Science, 219:347-378, 1999.
[Stoltenberg-Hansen et al., 1994] V. Stoltenberg-Hansen, I. Lindström and
E. Griffor. Mathematical Theory of Domains, Cambridge University
Press, 1994.
[Strong, 1968] H. R. Strong, Jr. Algebraically generalised recursive function theory, IBM Journal of Research and Development, 12:465-475,
1968.
[Urzyczyn, *1981a] P. Urzyczyn. Algorithmic triviality of abstract structures, Fundamenta Informaticae 4:819-849, 1981.
[Urzyczyn, *1981b] P. Urzyczyn. The unwind property in certain algebras,
Information & Control 50:91-109, 1981.
[Urzyczyn, *1982] P. Urzyczyn. On the unwinding of flow-charts with
stacks, Fundamenta Informaticae 4:119-126, 1982.
[Urzyczyn, *1983] P. Urzyczyn. Nontrivial definability by flow-chart programs, Information & Control, 58:101-112, 1983.
[Voorhes, 1958] E. A. Voorhes. Algebraic formulation of the notion of flow-diagrams, Communications of the Association for Computing Machinery
1:4-8, 1958.
[Wagner, *1965] E. A. Wagner. Uniformly reflexive structures: An axiomatic approach to computability. In Logic, Computability & Automation: Joint PADG-AIAC Symposium, Trinkaus Manor, Oriskany, New
York, 1965.
[Wagner, 1969] E. A. Wagner. Uniformly reflexive structures: On the nature of Gödelizations and relative computability, Transactions of the
American Mathematical Society 144:1-41, 1969.
[Walker and Strong, *1973] S. Q. Walker and H. R. Strong. Characterizations of flowchartable recursions, Journal of Computer & Systems Science 7:404-447, 1973.
[Warner, *1993] S. Warner. Topological Rings, Mathematics Studies 178,
North-Holland, 1993.
[Wechler, 1992] W. Wechler. Universal Algebra for Computer Scientists,
EATCS Monographs 25, Springer-Verlag, 1992.
[Weihrauch, 1987] K. Weihrauch. Computability, EATCS Monographs 9,
Springer-Verlag, 1987.
[Weihrauch and Schreiber, 1981] K. Weihrauch and U. Schreiber. Embedding metric spaces
into complete partial orders. Theoretical Computer Science 16:5-24,
1981.
[van Wijngaarden, *1966] A. van Wijngaarden. Numerical analysis as an
independent science, BIT 6:68-81, 1966.
[Wirsing, 1991] M. Wirsing. Algebraic specification. In Handbook of Theoretical Computer Science Vol.B: Formal Methods and Semantics, J. van
Leeuwen ed, pp. 675-788, North-Holland, 1991.
[Zucker and Pretorius, 1993] J. I. Zucker and L. Pretorius. Introduction to
computability theory, South African Computer Journal 9:3-30, 1993.
Λ, 120
<, 79
×, 111
φ prop [Γ], 77
π0, 109
π1, 45
Σr, 135
Σ∞, 135
σ + σ', 53
σ = σ' [Γ], 103
σ → σ', 60
σ × σ', 57
σF, 102
*, 60
Γ, 83
Γ*, 83
e(e'), 51
f × A, 111
f*, 79, 111
pr, 84
σ type [Γ], 103
Π, 142
ATIME, 140
abstract data type, 224, 351
modularized, 285
abstraction
behavioural, 298
INDEX
real numbers, 320-321,327-329,
346, 353, 354, 396, 409,
438-451 passim, 451-478
passim
reduct of an, 237
standard, 322, 351-352,360
streams, 326,328,359-360,451,
454,501
subalgebra, 225
term algebra, 234
terms, 329
topological, 332, 451-478
passim
total, 345
ultrametric, 477
with arrays, 356-359
with unspecified value, 355-356
algebraic theory, 45
dependently typed, 102
algebraic and transcendental points,
441
algebraic domain, 477
algebraic operational semantics, 364-366
amalgamated union, 283
ap,61,120
approximate computation, 320-321,
342, 451-478 passim
arity,51,220
assignment, 228
atomic formula, 143
atomic specification, 252
Ax, J., 207
axiom, 249
scheme, 249
term-equality, 102
type-equality, 102
axiomatizable, 246
Baire space, 476
Barwise, J., 200
Baur, W., 209
behavioural abstraction, 298
behavioural equivalence, 298
classifying
category, 70
prop-category, 99
type-category, 108-109
closure, 246
of a class of algebras, 247
condition, 225
theorems for semicomputable sets,
409-413, 415-416, 420-421
coherence, 101,123
complete, 142,250
completeness, 77,97-100
Compton, K., 134,206
computational lambda calculus, 66
computation type, 66
computability theory
on natural numbers, 319-320,
322,330-331
on algebras, 320-321,330-331,
340-342
generalised, 320-321,330-331
abstract, 321,330-331,336-340
abstract versus concrete, 333-335, 475-478
history, 335-340
abstract, history, 335-340, 474-478
generalised, history, 335-340
reasons for generalising, 335-336
type two, 476
computable analysis, 321
computable real number, 475
computation
local, 372-373,387
sequence, 364
tree, 342, 423-430, 436-438,
468-469
theory, 490
concrete representation, 477
cons, 62
cons x, 62
continuous domain, 477
condition
inheritance, 302
subtype, 302
conditional equational logic, 241
CEL, 241
confluent, 263
congruence relation, 225
induced, 226
conservativity for terms, 383-387,429
consistent, 285
constant, 220
constraint, 267
construction term, 300
constructive specification, 267
abstract syntax, 267
concrete syntax, 268
semantics, 268
constructor, 231
constructors, 255
context, 44, 101
Th-provably equal, 69
α-equivalence of, 69
identity morphism, 70
morphism, 69
morphism composition, 69
contextual categories, 101
(Contract), 79
Coste, M., 41
countability/cofiniteness condition, 442
countable connectivity condition, 445
course of values recursion, 483
Curry-Howard interpretation, 3
(Cut), 79
ctxt, 103
data types, 320, 325,329,351
data type theory, 329,330-332,340
declaration, 280
default terms and values, 349
density/codensity condition, 442
dependent type, 120
depth, 139
derivable formula, 249
design specification, 253
disjoint union type, 53
domain, 243,476-477
Duret, J.-L., 207
dynamic signature, 304
dynamic sort, 304
dynamical systems, 328, 447-451,
484
emptyT, 56
Edalat, A., 477, 508
Egidi, L., 208
Ehrenfeucht games, 200
effective algebra, 321
effective domains, 476
effective definitional schemes, 339,
487
element
generalized, 49
global, 49
elementary recursive, 140
Emerson, E. A., 136
empty type, 57
Engeler, E., 338-339,408,425,429,
430, 434, 435, 446, 489,
509
Engeler's lemma, 425,428,429,434,
435,438,446
environment, 278
equation, 240
equation-in-context, 45
equational calculus, 250
equational logic, 45,240
dependently typed, 105-106
equivalence sequence, 263
Ershov, Y., 131, 142, 192, 208
Esik, Z., 75
essentially algebraic, 42,123
evaluation morphism, 62
exact computation, 320-321, 451
evaluation homomorphism, 234
exception, 303
(Exchange), 79
explicit definition, 144
exponential object, 62
fsat(L0), 153
JFp,74
false, 85
fst, 59
factorization, 226
Fagin,R., 136
Feferman, S., 209
Fenstad, J. E., 336,343,490, 510
Ferrante, J., 146,148,186,191,194,
196,202,210,211
(false-Adj), 87
(false-Elim), 86
final algebra, 224
final specification, 299
finite algorithmic procedure, 339,484
finite algorithmic procedure with
counting, 485
finite algorithmic procedure with
stacking, 486
finite algorithmic procedures with
index registers, 487
finite computation, 320-321,331
finite product
category with, 45
preservation of, 68
strictly associative, 71
strictly unital, 71
strict preservation of, 68
finite sequences, 324
first-order languages, 499
first-order logics, 138
Fischer, M. J., 136, 148, 159, 160,
193,194,195,198,207
flattening, 281
Fleischmann, K., 132
generalised Church-Turing thesis, 338,
342, 422, 478, 487, 493-500, 503
generalised element, 49
generalised quantifiers, 95
generated, 231
in some sorts, 232
generic
Th-algebra, 72
model, 77, 99, 118-119
Gentzen, G., 85, 87,123
Girard, J.-Y., 40,100, 123
global element, 49
global section, 114
Gödel numbering, 388
Gödel completeness theorem, 139
Grädel, E., 136
Grandjean, E., 137, 141,211
Grothendieck, A., 114
Grothendieck fibration, 114
ground
equation, 240
term, 227
type, 51
group, 321,327,328,346-347,353,
370,395,416
Grzegorczyk, A., 474,475
guarded command language, 488
Gurevich, Y., 207, 208
Halpern, J. Y., 136
halting formula, 423-429, 433, 437
halting problem, 417
halting set, 324,408,414,422,428,
436,457
hard, 141
Hardy, G. H., 195,197
Harel, D., 136
height, 139
Henkin, L., 210
Henson, C. W., 206
hereditary, 140
hereditary lower bounds, 131, 132,
140
inl, 53
inr, 53
junk, 230
K, 88
AC*, 80
ML0, 138
MLt, 138
ME r , 135,171
MSoo, 135, 171
MLt, 171
Machtey, M., 140
Makkai, M., 81
many-sorted signatures, see
signatures
Martin-Löf, P., 51, 100
McCarthy, J., 338, 340, 515
Macintyre, A., 210
McNaughton, R., 211
Maurin, F., 136, 195, 196
NTIME, 140
ATT/M J5(T(cra))-inseparable, 153
NTIME lower bounds, 166
nil, 62
nil x, 62
numbering, 332
natural number object, 65
weak, 65
next state function, 447
Noetherian, 263
non-deterministic assignment, 488
normal form, 263
Normann, D., 478, 493, 517
null, 57
Objc, 112
object, 307
class, 307
identities, 307
instances, 307
type, 307
observable sorts, 298
Oles, F., 50
one-element-type, 59
only positively, 145
operation, 220
constant, 220
operation name, 220
operator formula, 145
orbit, 447
order-sorted algebra, 302
order-sorted signature, 301
P, 88
P1, 88
P0, 88
pair, 58
proj, 60
prop,78
Papert, S., 211
parameterization
A-calculus approach, 295
pushout approach, 296
Peano arithmetic, 246
periodic points, 447
persistent, 238,248,274,285
free extension, 276
Poigne, A., 40
Point, F., 207
polymorphic, 224
Robinson, A., 209, 210
Roman, L., 65
rules
adjoint style, 86, 89,93
natural deduction style, 85, 89,
93
S, 88
Set, 80, 111
Sg, 43
Sgc, 48
Σr, 171
Σ∞, 171
snd, 59
split, 58
sat(S), 132
sat*(), 144
sat^(S), 144
satr(), 139
satr(L), 139
satr(ME+), 161
sat^(L), 144
sat(), 139
satT(Lo), 132
satT(ML0), 132
Sacerdote, G., 207
satisfaction, 239
condition, 239
of an equation, 47
of a judgement, 117
of sequent, 82
relation, 239
satisfiability problem, 132
Scarpellini, B., 136,210
Scedrov, A., 123
Schmitt, P., 209
Schöning, U., 136
Scott, D. S., 41
search, 333,412-413,461
search procedure, 413
Seiferas, J., 159, 160
Seiferas-Fischer-Meyer Theorem, 159
semantics
model oriented, 283
of while programs, 363-366
presentation oriented, 284
specification oriented, 284
theory oriented, 284
Semenov, A. L., 189, 210
semicomputability equivalence theorem, 431
semicomputability with search, 413-414
sentence, 144
sequent, 78
set
while computable, 324, 326-329,
407, 409, 438-451 passim
while semicomputable, 324, 326-329, 407-451 passim
halting, 324, 408, 414, 422, 428,
436, 457
α-decidable, 333
α-semidecidable, 333
α-cosemidecidable, 333
projective semicomputable, 407-451 passim
definable, 408, 431-435
semicomputable with search, 414,
416, 421-422
semialgebraic, 444
Julia, 449-451
open, 451, 464
closed, 451, 464
clopen, 451, 464
recursion, 493
Shelah.S., 164, 165
Shepherdson, J. C., 339, 342, 473,
475, 484, 487, 489, 518,
519
Shub, M., 438, 484, 506
Σ-homomorphism, 223
(Σ, X)-term, 227
signature, 43, 101, 220, 322, 344-360 passim
dynamic, 304
module, 284
morphism, 235, 237
inclusion, 236
order-sorted, 301
strongly typed, 228
simultaneous course of values
recursion, 337,483
simultaneous iterative definition, 148
slice category, 111
Slobodskoi, A. M., 209
Smale, S., 438,484,506
snapshot function, 382,402,417
sort, 43, 220
argument sort, 220
dynamic, 304
label sort, 304
observable, 298
target sort, 220
sound, 250
soundness, 47
specification
adequate, 252
atomic, 252
design, 253
e-specification, 278
final, 299
initial, 257
language 496-500
loose, 253
loose with constructors, 255
requirement, 253
strictly adequate, 252
specification morphism, 284
specification oriented semantics, 284
stable
coproduct, 56, 57
initial object, 57
join, 87
stage, 49
standardness assumption, 353
standard signatures, 351-352
statement remainder function, 417
Statman, R., 210
Stern, J., 191
Stirling, C., 136
Stoltenberg-Hansen, V., 330,332,335,
343,476,477, 520
signature morphism on, 237
rewriting system, 264
value of a, 228
term-in-context, 44
term-model, 41
Th-algebra, 47
Th-context, 107
Th, 73
theory, 245
algebraic, 45
axioms of, 45
dependently typed algebraic, 102
in predicate logic, 77
morphism, 284
oriented semantics, 284
theorems of, 45,78, 102
translation, 74
time resource bound, 140
Tiuryn, J., 136
topos, 112,122
total object, 110
Touraille, A., 208
Trakhtenbrot, B. A., 131,153
Trakhtenbrot-Vaught Inseparability
Theorem, 153
transition predicate, 304
Tucker, J. V., 317, 330, 332, 335,
343, 365, 369, 396, 432,
452, 475, 476, 477,
478, 483, 494, 495, 496,
520, 521
Turing, A., 154
type
computation, 66
dependent product, 120
disjoint union, 52
empty, 57
function, 60
ground, 51
indexed, 110
inductive, 62
one-element, 59
product, 57
type-category, 110
type-category, cont
classifying, 107-108
has dependent products, 120
reachable, 119
TYPES, 51
Typec, 110
unit, 60
Ullman, J. D., 140,159,200
uniform computation, 324
uniform convergence, 470
union
amalgamated, 283
universe, 243
universality, 341,387-388,399-401,
439,490
val, 66
val(S), 132
valid, 244
in a domain, 244
inductively valid, 244
logically valid, 244
validity problem, 132
value of a formula, 242
Var, 44
uar(r), 44
Vardi, M., 136
variable
free occurrence of, 242
Vaught, R., 131, 153, 209
vocabulary, 138
Volger, H., 192
Vorobyov, S., 137, 210
weak model, 138
weak product, 59
weak second-order language, 408, 431-435
(Weaken), 78
Weierstrass approximability, 470-474
Weihrauch, K., 476,477,523
well-formed theory, 105
well-pointed, 49
while-array programming language,
317-523 passim
while programming language, 317-523 passim
Wood, C., 209
Wright, E. M., 195, 197
Young, P., 140, 153, 195
Zakharyaschev, M., 136
Zucker, J. I., 317, 365, 369, 432, 452,
475, 478, 483, 494, 496,
497, 500, 501, 503, 521,
522