
Incomplete Notes on Geometric Control Theory

Andrew D. Lewis 23/06/2009


Associate Professor, Department of Mathematics and Statistics, Queen's University, Kingston, ON K7L 3N6, Canada

Email: andrew@mast.queensu.ca,

URL: http://penelope.mast.queensu.ca/andrew/

Preface
These are a very incomplete set of notes that will eventually turn into a book on geometric control theory. Parts of these notes are quite complete (e.g., Chapters 2, 5, and the part of Chapter 6 that is finished), parts are partially complete (e.g., Chapters 3 and 4), parts are just being started (e.g., Chapters 1, 8, and 9), and some are merely placeholders for things that are not yet written (e.g., Chapters 7, 10, and 11). Among the victims of the state of incompleteness is the referencing. So I am afraid that these notes are but an imperfect source for gaining access to the research literature in geometric control. Hopefully some of the gaps left here can be filled in by using the texts Jurdjevic [1997] and Agrachev and Sachkov [2004].

The writing of this material started, ostensibly, as the writing of material for a short course on controllability. However, (enjoyable) distractions arose, the results of which are plain to see. These distractions prevent the material on controllability from being as complete as it might have been. This is no great crime, as the subject of controllability is incomplete by nature, and the intent of the short course is to outline the foundations of controllability theory, rather than to present specific results on controllability. Nonetheless, the reader should be aware that, in their present state, these notes do not paint a very clear picture of the state of the art of controllability theory, at least as concerns specific necessary and sufficient conditions for controllability. The guilt I feel about this is bounded above by a medium-sized positive constant.

At present there is no mention of applications of geometric control theory in these notes. I do not have any plans to include mention of such applications in the future. The guilt I feel about this is bounded above by a small positive constant. If you are interested in applications of geometric control theory to mechanical systems, then we refer to the texts of Bloch [2003] and Bullo and Lewis [2004].

There are places where material is obviously missing. Places where absent material is perhaps less obvious are marked with an exclamation point in the margin, like this: ! There will obviously be many mistakes, typographical and, I am afraid, otherwise. I would appreciate it if you could pass on any that you find.

In summary: These notes are incomplete and mostly unchecked. Thus I will not guarantee that anything contained in them is correct. These notes are intended neither for distribution nor for citation.

Table of Contents

1 Notation and prerequisites
2 Real analyticity
  2.1 Real analytic functions: definitions and fundamental properties
    2.1.1 Multi-index and partial derivative notation
    2.1.2 Formal power series
    2.1.3 Formal Taylor series
    2.1.4 Convergent power series
    2.1.5 Real analytic functions
  2.2 Real analytic multivariable calculus
    2.2.1 Real analyticity and operations on functions
    2.2.2 The real analytic Inverse Function Theorem
    2.2.3 Some consequences of the Inverse Function Theorem
  2.3 Real analytic differential geometry
    2.3.1 Real analytic manifolds, submanifolds, mappings, and vector bundles
    2.3.2 The Grauert–Morrey Embedding Theorem
    2.3.3 Extension of and approximation by real analytic maps
  2.4 Local properties of analytic functions
    2.4.1 Unique factorisation domains
    2.4.2 Noetherian rings and modules
    2.4.3 The Weierstrass Preparation Theorem
    2.4.4 Algebraic properties of germs of analytic functions
    2.4.5 Properties of analytic sections of vector bundles and their germs
3 Time-dependent vector fields and their flows
  3.1 Vector fields depending measurably on time
    3.1.1 The finitely differentiable case
    3.1.2 The smooth case
    3.1.3 The locally Lipschitz case
  3.2 Absolutely continuous curves
    3.2.1 Some comments about absolute continuity
    3.2.2 Absolutely continuous curves on smooth manifolds
  3.3 Flows for time-dependent vector fields
    3.3.1 Integral curves: local existence and uniqueness
4 Set-valued analysis on manifolds
  4.1 Riemannian manifolds as metric spaces
    4.1.1 Definition of the metric
    4.1.2 Equivalence of metrics
    4.1.3 The metric structure of the tangent bundle of a Riemannian manifold
  4.2 Set-valued maps between metric and topological spaces
    4.2.1 The Hausdorff distance
    4.2.2 Set-valued maps
    4.2.3 Notions of continuity for set-valued maps
    4.2.4 Lipschitz set-valued maps
    4.2.5 Measurable set-valued maps
  4.3 Convex sets, affine subspaces, and cones
    4.3.1 Definitions
    4.3.2 Combinations and hulls
    4.3.3 Topology of convex sets and cones
    4.3.4 Separation theorems for convex sets
  4.4 Differential inclusions on manifolds
    4.4.1 Definitions
    4.4.2 Continuity of differential inclusions
    4.4.3 Lipschitz differential inclusions
    4.4.4 Differential inclusions with measurable time dependence
    4.4.5 Selections of differential inclusions
    4.4.6 Trajectories for differential inclusions
    4.4.7 Relaxation
    4.4.8 Differential inclusions associated with a discontinuous vector field
5 Families of vector fields, distributions, and affine distributions
  5.1 Distributions: definitions and basic properties
    5.1.1 Definitions
    5.1.2 Regular and singular points
    5.1.3 Distributions invariant under vector fields and diffeomorphisms
  5.2 The algebraic structure of sets of functions and vector fields
    5.2.1 Rings of functions and modules of vector fields
    5.2.2 Rings of germs of functions and modules of germs of vector fields
    5.2.3 Analytic germs
    5.2.4 Smooth germs
  5.3 The Orbit Theorem and some consequences
    5.3.1 Lie algebras of vector fields
    5.3.2 Immersed submanifolds
    5.3.3 Orbits
    5.3.4 Fixed-time orbits
    5.3.5 The Orbit Theorem
    5.3.6 The finitely generated Orbit Theorem
    5.3.7 The fixed-time Orbit Theorem
    5.3.8 Frobenius's Theorem
    5.3.9 Equivalence of Lie subalgebras of vector fields
  5.4 Affine distributions
    5.4.1 Definitions
    5.4.2 Regular and singular points
    5.4.3 Algebraic aspects of affine distributions
    5.4.4 The Lie algebra generated by an affine distribution
    5.4.5 Invariant subspace constructions
6 Geometric system models
  6.1 Controls
    6.1.1 Metric space valued controls
    6.1.2 Subsets of admissible locally essentially bounded controls
    6.1.3 Euclidean space valued controls
    6.1.4 Subsets of admissible locally integrable controls
  6.2 Differential inclusion systems
    6.2.1 Definition of differential inclusion system
    6.2.2 Trajectories and reachable sets for differential inclusion systems
  6.3 Systems depending continuously on control
    6.3.1 Definition of control system
    6.3.2 Trajectories and reachable sets for control systems
    6.3.3 The fibred manifold picture for a control system
  6.4 Control-affine systems
    6.4.1 Definition of control-affine system
    6.4.2 Trajectories and reachable sets for control-affine systems
    6.4.3 Important classes of control-affine systems
    6.4.4 Transformations of control-affine systems
  6.5 Affine systems
    6.5.1 Definition of affine system
    6.5.2 The relationship between affine systems and control-affine systems
    6.5.3 Trajectories and reachable sets for affine systems
7 Linear systems and linearisation of systems
  7.1 Linear systems
    7.1.1 Linear systems on vector spaces
    7.1.2 Linear systems on vector bundles
  7.2 Linearisation of system models
    7.2.1 Linearisation of differential inclusion systems
    7.2.2 Linearisation of control systems
    7.2.3 Linearisation of control-affine systems
    7.2.4 Linearisation of affine systems
8 Variations and the reachable set
  8.1 Jet bundles of various sorts
    8.1.1 The symmetric algebra of a vector space
    8.1.2 Jet bundles of vector bundles
    8.1.3 Jet bundles of maps between manifolds
    8.1.4 The structure of jets of maps between Euclidean spaces
    8.1.5 Higher-order tangent vectors for nets
  8.2 Properties of the reachable set for differential inclusion systems
    8.2.1 The topology of the reachable set for a differential inclusion system
  8.3 Properties of reachable sets for control systems
    8.3.1 Topology of the reachable set for control systems
    8.3.2 States reachable by subsets of trajectories
  8.4 Variations for differential inclusion systems
    8.4.1 An algebro-geometric construction
    8.4.2 A characterisation of variations
    8.4.3 The relationship between variations and the reachable set
  8.5 Variations for control systems
    8.5.1 Definition of variations
    8.5.2 The relationship between variations and the reachable set
9 Controllability theory
  9.1 Definitions for the various types of controllability
    9.1.1 Accessibility definitions
    9.1.2 Controllability definitions
    9.1.3 Geometric controllability definitions for control-affine and affine systems
  9.2 Examples illustrating controllability problems
    9.2.1 The difference between accessibility and local accessibility
    9.2.2 The distinction between accessibility and controllability
    9.2.3 The distinction between accessibility and strong accessibility
    9.2.4 Global controllability and local controllability
    9.2.5 The size of the control set can matter
    9.2.6 The role of feedback transformations
    9.2.7 Controllability might be computationally difficult to decide
  9.3 Accessibility theory
    9.3.1 Positive orbits and positive fixed-time orbits
    9.3.2 Accessibility for control systems
    9.3.3 Accessibility for control-affine systems
  9.4 Some controllability results
    9.4.1 Controllability results for differential inclusion systems
    9.4.2 Controllability results for control systems
    9.4.3 Controllability results for control-affine systems
10 Optimal control theory
11 Stabilisation theory

This version: 23/06/2009

Chapter 1 Notation and prerequisites


While we cover quite a bit of material in this book, it is not the case that the book is even close to being self-contained. In fact, quite the opposite is true: the reader is expected to have substantial prerequisites in many areas of mathematics. In this chapter we will give an overview of what is expected of the reader, give the notation we will use throughout the book, and provide references which readers can use at appropriate moments to fill in the required background. What we say is exceedingly terse. Note that it is not the case that all prerequisites need to be understood completely to read a chosen section. For example, it may be the case that some prerequisites are needed only in a certain proof. Thus a reader need not worry if there are certain gaps in their background.

The very basics

We use the most common set theoretic notation. One thing the reader may wish to be aware of is that when we write $A \subseteq B$ we mean that $A$ is a subset of $B$, allowing that $A = B$. If we require that $A \subseteq B$ and $A \neq B$, then we will write $A \subset B$. The power set of a set $X$ we denote by $2^X$. By $\mathrm{id}_X$ we denote the identity map on a set $X$. The cardinality of a set $X$ is denoted by $\mathrm{card}(X)$.

By $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ we denote the sets of integers, rational numbers, real numbers, and complex numbers, respectively. By $\mathbb{Z}_{\geq 0}$ (resp. $\mathbb{Z}_{>0}$) we denote the set of nonnegative (resp. positive) integers. By $\mathbb{R}_{\geq 0}$ (resp. $\mathbb{R}_{>0}$) we denote the set of nonnegative (resp. positive) real numbers. For $x \in \mathbb{R}$, $\lfloor x \rfloor$ denotes the largest integer less than or equal to $x$ and $\lceil x \rceil$ denotes the smallest integer not less than $x$.

We use the terms "injective" and "surjective" for maps, rather than the terms "one-to-one" or "onto." If $S_1, \dots, S_k$ are sets, the map $\mathrm{pr}_j\colon S_1 \times \cdots \times S_k \to S_j$ defined by $\mathrm{pr}_j(x_1, \dots, x_j, \dots, x_k) = x_j$ is the projection onto the $j$th factor.

Topology

We use basic facts from point set topology. Thus we expect that the reader knows what a topological space is, and knows about the various flavours of subsets of a topological space, like open sets, closed sets, compact sets, and connected sets. We shall make frequent use of basic theorems, often dealing with compactness, e.g., the Bolzano–Weierstrass Theorem and the Arzelà–Ascoli Theorem.

Let $X$ be a topological space. By $\mathrm{int}(A)$, $\mathrm{cl}(A)$, and $\mathrm{bd}(A)$ we denote the interior, closure, and boundary of $A \subseteq X$, respectively. If $Y \subseteq X$, then $Y$ inherits the subspace topology. If $A \subseteq Y \subseteq X$, then $\mathrm{int}_Y(A)$, $\mathrm{cl}_Y(A)$, and $\mathrm{bd}_Y(A)$ denote the relative interior, relative closure, and relative boundary of $A$.

We shall also often use metric spaces in our constructions, denoting a typical metric space by $(M, d)$, where $d$ denotes the distance function. For $x \in M$ and $r \in \mathbb{R}_{\geq 0}$, we denote by $B(r, x)$ and $\overline{B}(r, x)$ the open and closed balls, respectively, of radius $r$ and centre $x$.

We refer to [Willard 1970] as a good introduction to topology. We will also pull many facts about topology from Chapter 1 of [Abraham, Marsden, and Ratiu 1988].

Real analysis

The reader is assumed to be familiar with basic, and maybe some not so basic, real analysis. The Euclidean space of $n$ dimensions we denote by $\mathbb{R}^n$. Typically we will denote an element of $\mathbb{R}^n$ with a bold font, e.g., $\boldsymbol{x}$. For a vector $\boldsymbol{x} \in \mathbb{R}^n$ the components of $\boldsymbol{x}$ will be denoted by $(x_1, \dots, x_n)$ or $(x^1, \dots, x^n)$. On $\mathbb{R}^n$ we consider the standard inner product and norm denoted by
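$$\langle \boldsymbol{x}, \boldsymbol{y} \rangle = \sum_{j=1}^{n} x_j y_j, \qquad \|\boldsymbol{x}\| = \sqrt{\langle \boldsymbol{x}, \boldsymbol{x} \rangle},$$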

respectively. Other norms one can use are


$$\|\boldsymbol{x}\|_1 = \sum_{j=1}^{n} |x_j|, \qquad \|\boldsymbol{x}\|_\infty = \max\{|x_j| \mid j \in \{1, \dots, n\}\}.$$

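At times it will be convenient to use these other norms, and to recall the inequalities

$$
\begin{aligned}
\|\boldsymbol{x}\|_1 &\leq \sqrt{n}\,\|\boldsymbol{x}\|, & \|\boldsymbol{x}\|_1 &\leq n\,\|\boldsymbol{x}\|_\infty, & \|\boldsymbol{x}\| &\leq \|\boldsymbol{x}\|_1, \\
\|\boldsymbol{x}\| &\leq \sqrt{n}\,\|\boldsymbol{x}\|_\infty, & \|\boldsymbol{x}\|_\infty &\leq \|\boldsymbol{x}\|_1, & \|\boldsymbol{x}\|_\infty &\leq \|\boldsymbol{x}\|.
\end{aligned}
\tag{1.1}
$$

By $B^n(r, \boldsymbol{x})$ (resp. $\overline{B}^n(r, \boldsymbol{x})$) we denote the open ball (resp. closed ball) of radius $r$ centred at $\boldsymbol{x} \in \mathbb{R}^n$. Unless otherwise stated, these balls are taken with respect to the standard norm. The balls with respect to the norm $\|\cdot\|_\infty$ are denoted by $D^n(r, \boldsymbol{x})$ and $\overline{D}^n(r, \boldsymbol{x})$ and are often called disks.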
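As an informal sanity check of the inequalities (1.1), the following Python sketch evaluates the three norms on randomly sampled vectors and confirms that all six inequalities hold up to a floating-point tolerance. The function names (`norm_1`, `norm_2`, `norm_inf`, `check_inequalities`), the sampling range, and the tolerance are our own illustrative choices and are not part of the notes; a numerical check of this kind is, of course, no substitute for a proof.

```python
import math
import random


def norm_1(x):
    """The 1-norm: sum of absolute values of the components."""
    return sum(abs(xj) for xj in x)


def norm_2(x):
    """The standard (Euclidean) norm induced by the standard inner product."""
    return math.sqrt(sum(xj * xj for xj in x))


def norm_inf(x):
    """The infinity-norm: largest absolute value of a component."""
    return max(abs(xj) for xj in x)


def check_inequalities(x, tol=1e-12):
    """Return True if all six inequalities of (1.1) hold for x.

    A small slack (an assumption of this sketch) absorbs rounding error
    in the cases where an inequality holds with equality.
    """
    n = len(x)
    n1, n2, ninf = norm_1(x), norm_2(x), norm_inf(x)
    slack = tol * max(1.0, n1)
    return all([
        n1 <= math.sqrt(n) * n2 + slack,
        n1 <= n * ninf + slack,
        n2 <= n1 + slack,
        n2 <= math.sqrt(n) * ninf + slack,
        ninf <= n1 + slack,
        ninf <= n2 + slack,
    ])


# Sample vectors in dimensions 1 through 7 (hypothetical test data).
random.seed(0)
samples = [
    [random.uniform(-10.0, 10.0) for _ in range(n)]
    for n in range(1, 8)
    for _ in range(100)
]
all_hold = all(check_inequalities(x) for x in samples)
print(all_hold)
```

The vector with all components equal to 1 shows that the constants $\sqrt{n}$ and $n$ cannot be improved, since for it the first, second, and fourth inequalities hold with equality.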


If $U \subseteq \mathbb{R}^n$ is an open set and if $f\colon U \to \mathbb{R}^m$ is $k$-times continuously differentiable for $k \in \mathbb{Z}_{>0}$, then we say that $f$ is of class $C^k$. The $k$th derivative of $f$ at $\boldsymbol{x} \in U$ we denote by $D^k f(\boldsymbol{x})$, and we note that $D^k f(\boldsymbol{x})$ is a symmetric multilinear map from $(\mathbb{R}^n)^k$ to $\mathbb{R}^m$. The set of such maps we denote by $L^k_{\mathrm{sym}}(\mathbb{R}^n; \mathbb{R}^m)$. Maps that are infinitely differentiable are of class $C^\infty$ and maps that are real analytic are of class $C^\omega$. By a map of class $C^0$ we mean a continuous map.

A good reference for the real analysis we use is [Rudin 1976]. Another excellent, and more advanced, source is [Hewitt and Stromberg 1975].

Differential geometry

We suppose the reader to be thoroughly familiar with basic differential geometry. We shall frequently employ the Einstein summation convention, although we will not do so slavishly. In particular, we will not be consistent with using superscript indices in places where the Einstein summation convention insists that you use superscript indices. Thus, for example, we will denote a point in $\mathbb{R}^n$ as $(x_1, \dots, x_n)$ and not as $(x^1, \dots, x^n)$ when we feel as if we want to do so.

All manifolds we consider to be either infinitely differentiable or real analytic. When we use the word "smooth," we shall always mean infinitely differentiable. We shall often use the expression "infinitely differentiable or real analytic, as is required," by which we mean that infinite differentiability is assumed, unless objects are being considered that are real analytic. For manifolds $M$ and $N$ and for $r \in \mathbb{Z}_{\geq 0} \cup \{\infty, \omega\}$, the mappings from $M$ to $N$ of class $C^r$ are denoted by $C^r(M, N)$. We abbreviate $C^r(M) = C^r(M, \mathbb{R})$ for $r \in \mathbb{Z}_{\geq 0} \cup \{\infty\} \cup \{\omega\}$. We shall mostly be careful to state degrees of differentiability at all times. However, just to be safe: unless we state otherwise, all mappings and functions will be assumed to be of class $C^\infty$. If we wish to weaken this to some finite degree of differentiability or strengthen this to analyticity (we shall often do this), we shall say so explicitly.
The tangent bundle of a manifold $M$ is denoted by $TM$, with $T_x M$ denoting the tangent space at $x$. The cotangent bundle and cotangent space at $x$ are similarly denoted by $T^*M$ and $T^*_x M$, respectively. For $r, s \in \mathbb{Z}_{\geq 0}$, $T^r_s(TM)$ denotes the vector bundle of $(r, s)$-tensors on $M$. So that there is no confusion, tangent vectors are tensors of type $(1, 0)$ and cotangent vectors are tensors of type $(0, 1)$.

For a vector bundle $\pi\colon V \to M$ of class $C^k$, the set of $C^k$-sections will be denoted by $\Gamma^k(V)$, $k \in \mathbb{Z}_{\geq 0} \cup \{\infty, \omega\}$. Thus $\Gamma^k(TM)$ denotes the set of vector fields of class $C^k$ and, more generally, $\Gamma^k(T^r_s(TM))$ denotes the set of tensor fields of type $(r, s)$ and of class $C^k$; we do not use special notation for these. We denote by $V_x = \pi^{-1}(x)$ the fibre at $x$. Sometimes, but not always, the zero vector in the fibre $V_x$ is denoted by $0_x$.

If $\Phi\colon M \to N$ is differentiable, $T\Phi\colon TM \to TN$ denotes the derivative of $\Phi$, with $T_x\Phi$ being the restriction to $T_x M$. We renounce other notation for the derivative, such as $\Phi_*$ (which we use for push-forward) or $d\Phi$. If $\Phi\colon M \to N$ is a diffeomorphism, if $A \in \Gamma^k(T^r_s(TM))$, and if $B \in \Gamma^k(T^r_s(TN))$, then $\Phi_* A$ denotes the push-forward of $A$ and $\Phi^* B$ denotes the pull-back of $B$. Note that $\Phi$ does not need to be a diffeomorphism to define $\Phi^* B$ in the case that $r = 0$, but the other possibilities generally require $\Phi$ to be a diffeomorphism.

The Lie derivative of a tensor field $A$ with respect to a vector field $X$ is denoted by $\mathcal{L}_X A$. In case $A = f$ is a function, we might often write $\mathcal{L}_X f = X f$ if we are trying to be concise.

Almost everything we will need to know about differential geometry is contained in the text of Abraham, Marsden, and Ratiu [1988].

Riemannian geometry

Riemannian geometry does not feature critically in our presentation, but we will occasionally benefit from assuming that our manifolds possess a Riemannian metric. Principally, we will be interested in the metric structure a Riemannian manifold induces on the manifold, and we discuss this in Section 4.1. We will on occasion make use of some nontrivial facts about Riemannian manifolds; we refer to [Lang 1995] as a useful text in these cases.

Measure and integration theory

The dependence on time of controls must be allowed to be quite arbitrary in order to deal with some of the phenomena that can arise in control theory. A general and useful class of controls to consider are those that are measurable, by which we mean Lebesgue measurable. For controls taking values in Euclidean space, we can also ask for controls to be integrable, by which we mean Lebesgue integrable. Thus we require enough knowledge of measure theory to know what is meant by Lebesgue measurability and Lebesgue integrability. At various times we will also require enough knowledge of measure theory to understand standard, but nontrivial, manipulations of the concepts of measurability and integrability. For example, a reader who knows enough measure theory to know the Dominated Convergence Theorem and what it means will probably possess sufficient measure theory to get through this book.

The Lebesgue measure on $\mathbb{R}$ will be denoted by $\lambda$. For an interval $I \subseteq \mathbb{R}$ and for a subset $A \subseteq \mathbb{R}$, we denote by $L^1(I; A)$ the Lebesgue integrable $A$-valued functions on $I$.
By $L^1_{\mathrm{loc}}(I; A)$ we denote the set of locally integrable $A$-valued functions, meaning that $f|K \in L^1(K; A)$ for every compact interval $K \subseteq I$.

A good reference for measure and integration theory is [Cohn 1980].

Functional analysis

We expect the reader to be acquainted with elementary functional analysis, such as Banach space theory. We will also, however, occasionally need some less basic functional analysis. In particular, we will make use of some facts about locally convex topological vector spaces, as such concepts are essential to understanding how to topologise spaces of functions and vector fields on manifolds. Basics of functional analysis such as we need can be found in [Rudin 1991].

Linear algebra

A thorough understanding of linear algebra is essential in almost any area of control theory. We assume the reader to be fully acquainted with finite-dimensional linear algebra.

For $\mathbb{R}$-vector spaces $U$ and $V$, we will denote the set of $\mathbb{R}$-linear maps from $U$ to $V$ by either $L(U; V)$ or $\mathrm{Hom}_{\mathbb{R}}(U; V)$. More or less, when we are focusing on analytical ideas, we will use the former notation, while the latter notation will be used when algebraic structure is the focus. By $V^* = L(V; \mathbb{R})$ we denote the dual of $V$. If $\alpha \in V^*$ and $v \in V$, we might denote $\alpha(v)$ by $\langle \alpha; v \rangle$ or $\alpha \cdot v$. For a subset $S$ of an $\mathbb{R}$-vector space $V$, $\mathrm{span}_{\mathbb{R}}(S)$ is the linear hull of $S$, i.e., the smallest subspace of $V$ containing $S$.

By $\mathbb{R}^{m \times n}$ we denote the set of real matrices with $m$ rows and $n$ columns. A typical matrix will be denoted using a bold font, e.g., $\boldsymbol{A}$. By $\boldsymbol{I}_n$ we denote the $n \times n$ identity matrix.

A good basic reference for finite-dimensional linear algebra is [Halmos 1986]. More advanced topics are covered in [Roman 2005].

Algebra

We expect that the reader knows the basic definitions and properties for groups, rings (in particular fields), modules (in particular, vector spaces), and algebras. We expect the reader to know about tensor products.

By $S_k$ we denote the symmetric group on $k$ symbols, which means, precisely, the group of bijections of the set $\{1, \dots, k\}$.

Let us review some notation regarding direct sums and products. For a family $(V_a)_{a \in A}$ of $\mathbb{R}$-vector spaces, we regard the direct sum of these vector spaces as the following set of maps:

$$\bigoplus_{a \in A} V_a = \Bigl\{ \phi\colon A \to \bigcup_{a \in A} V_a \Bigm| \phi(a) \in V_a,\ a \in A,\ \phi(a) = 0 \text{ for all but finitely many } a \in A \Bigr\}.$$

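In like manner, the direct product of the same family of vector spaces is also a set of maps:

$$\prod_{a \in A} V_a = \Bigl\{ \phi\colon A \to \bigcup_{a \in A} V_a \Bigm| \phi(a) \in V_a,\ a \in A \Bigr\}.$$

Thus the direct sum is a subspace of the direct product if we define the operations of vector addition and scalar multiplication on $\prod_{a \in A} V_a$ by

$$(\phi_1 + \phi_2)(a) = \phi_1(a) + \phi_2(a), \qquad (\lambda \phi)(a) = \lambda(\phi(a)).$$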
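The description of the direct sum as a set of finitely supported maps can be sketched concretely in Python. In this sketch, which is our own illustration and not part of the notes, we take every $V_a$ to be $\mathbb{R}$ and store a map $\phi$ as a dictionary listing only the indices $a$ at which $\phi(a) \neq 0$; the function names `evaluate`, `add`, and `scale` are hypothetical. The index set may then be infinite, while each element still has finite support.

```python
def evaluate(phi, a):
    """Return phi(a); indices absent from the dictionary map to zero."""
    return phi.get(a, 0.0)


def add(phi, psi):
    """Pointwise sum (phi + psi)(a) = phi(a) + psi(a).

    Entries that cancel to zero are dropped, so that the dictionary
    always records exactly the (finite) support of the map.
    """
    result = dict(phi)
    for a, value in psi.items():
        total = result.get(a, 0.0) + value
        if total == 0.0:
            result.pop(a, None)
        else:
            result[a] = total
    return result


def scale(lam, phi):
    """Pointwise scalar multiple (lam * phi)(a) = lam * phi(a)."""
    return {a: lam * value for a, value in phi.items() if lam * value != 0.0}


# Two elements of the direct sum over the index set of nonnegative integers.
phi = {0: 1.0, 5: -2.0}
psi = {5: 2.0, 7: 3.0}

print(add(phi, psi))                # {0: 1.0, 7: 3.0}
print(evaluate(add(phi, psi), 5))   # 0.0
print(scale(2.0, phi))              # {0: 2.0, 5: -4.0}
```

The sum and scalar multiple of finitely supported maps are again finitely supported, which is why the direct sum is closed under the operations above. A map such as $a \mapsto 1$ on an infinite index set belongs to the direct product but not to the direct sum, and cannot be stored in this representation.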

The $k$-fold tensor product of a vector space $V$ with itself we denote by $T^k(V) = V \otimes \cdots \otimes V$.


The tensor algebra of $V$ is then

$$T(V) = \bigoplus_{k=0}^{\infty} T^k(V),$$

with the understanding that $T^0(V) = \mathbb{R}$. We comment that $T^r_s(V) \simeq T^r(V) \otimes T^s(V^*)$ in the case when $V$ is finite-dimensional.

We shall occasionally recall some not quite elementary facts from algebra, but will provide the necessary background in these cases. References for basic algebra are [Hungerford 1980] and [Lang 1984].

Computational complexity

This is definitely not a book on computational complexity methods in control theory, but we will on occasion make some statements using the language of computational complexity. Therefore, we should have in mind some idea about what these statements mean.

The aspect of this that we will be the most vague about is the class of problems considered in the theory of computational complexity. The problems we consider are decidability problems (meaning that they have answers that are "yes" or "no") and will have a characteristic "size" which measures the complexity of the problem. For a problem involving graphs, for example, the size might be the number of nodes in the graph; for a problem involving linear algebra, the size might be the dimension of the vector space or some number related to the dimension of the vector space. For us, dealing with problems in control theory, the complexity of the problem will be related to the dimension of the state space and the way in which one represents the system. In any case, we will be concerned with decidability problems with a characteristic size that we will typically denote by $N$.

By a "solution algorithm" we mean a method for taking an instance of the decidability problem and producing a correct "yes" or "no" answer in all cases. A solution algorithm which takes a problem of size $N$ and correctly returns the answer to the decidability problem in at most $K$ steps, where $K$ satisfies an inequality of the form $K \leq C N^p$ for some $C, p \in \mathbb{Z}_{>0}$, is called a polynomial-time solution algorithm. The class of problems admitting polynomial-time solution algorithms is denoted P.
The class NP of problems are those problems for which, if one is given an affirmative answer to an instance of size $N$ of the decidability problem, one can evaluate whether the answer is correct using at most $K$ steps, where $K$ satisfies an inequality of the form $K \leq C N^p$ for some $C, p \in \mathbb{Z}_{>0}$. Clearly $\mathrm{P} \subseteq \mathrm{NP}$. It is not presently known whether $\mathrm{P} = \mathrm{NP}$.

A problem $L$ is called NP-complete if (1) it is in NP and (2) any other problem $L'$ of size $N$ in NP can be converted to $L$ using an algorithm with $K$ steps, where $K$ satisfies $K \leq C N^p$. A problem $L$ is called NP-hard if any problem $L'$ of size $N$ in NP can be converted to $L$ using an algorithm with $K$ steps, where $K$ satisfies $K \leq C N^p$.

The idea is that the class P of problems are "nice" in that a relatively efficient algorithm exists to solve them. The class NP are not known to be as nice (it is not known whether they can be solved using a polynomial-time algorithm), but are also not so bad since solutions can be efficiently verified. Problems that are NP-complete are the hardest problems in NP. Problems that are NP-hard are at least as difficult as any problem in NP.
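To make the notion of efficient verification concrete, here is a minimal Python sketch for one standard NP problem, chosen by us purely for illustration and not taken from the notes: deciding whether the nodes of a graph can be coloured with three colours so that adjacent nodes receive different colours. An affirmative answer comes with a certificate, namely a proposed colouring, and checking the certificate takes a number of steps that is linear in the number of nodes plus the number of edges. The function name `verify_colouring` and the encoding of graphs as edge lists are assumptions of the sketch.

```python
def verify_colouring(num_nodes, edges, colouring, num_colours=3):
    """Check a proposed colouring of a graph in polynomial time.

    num_nodes  -- number of nodes, labelled 0, ..., num_nodes - 1
    edges      -- list of pairs (u, v) of adjacent nodes
    colouring  -- list assigning a colour in range(num_colours) to each node
    Returns True exactly when the certificate is a valid colouring.
    """
    # The certificate must assign one colour to every node.
    if len(colouring) != num_nodes:
        return False
    # Every colour used must be one of the allowed colours.
    if any(c not in range(num_colours) for c in colouring):
        return False
    # Adjacent nodes must receive different colours: one pass over the edges.
    return all(colouring[u] != colouring[v] for u, v in edges)


# A cycle on four nodes (hypothetical example data).
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]

print(verify_colouring(4, cycle, [0, 1, 0, 1]))  # True: a valid certificate
print(verify_colouring(4, cycle, [0, 0, 1, 2]))  # False: nodes 0 and 1 clash
```

Verifying a given colouring is cheap, whereas no polynomial-time algorithm is known for finding one in general. This is exactly the gap between checking and solving that separates the definitions of NP and P.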


An introduction to matters of computation and computational complexity can be found in the book [Sipser 1996]. !

Control theory

It is important not to forget that geometric control theory is about control theory as well as differential geometry. It is assumed that the reader is familiar with basic topics in control theory, as such familiarity forms an essential context within which to understand geometric control theory. Without appreciating the needs of control theory, it is possible to turn geometric control theory into a rather unhappy distortion of what it is supposed to be. When thinking about problems in geometric control theory, one should always ask, "What is the problem in control theory that this is contributing to?"

Of course, control theory is an enormous subject with many specialities, and geometric control theory is only even slightly related to a small subset of these specialities, e.g., continuous-time, finite-dimensional systems described by differential equation models. And even within this small subset, one does not need to understand everything to be able to properly contextualise geometric control theory. For example, one need not understand the latest in robust adaptive controller design schemes to do useful research in geometric control theory. However, perhaps as a bare minimum, one should have familiarity with the following areas.

1. Linear control theory: The basics of linear control theory are well-established, and so form an excellent starting point for studying control theory. The fundamental problems are all precisely formulated and, in some sense at least, solved. There are many linear systems texts available, most of which are intended for graduate students in engineering. Many of these are presented with sufficient rigour to be useful preparation for geometric control theory. One book that stands out from the norm in its treatment is that of Wonham [1985]. This book provides a very good formulation of linear control theory for the purposes of studying geometric control theory, and so might be a good starting point for someone coming from a background where control theory is absent.

2. Some of nonlinear control theory: The subject known as "nonlinear control theory" is quite expansive. Some areas of nonlinear control theory clearly overlap with geometric control theory. However, there is a significant body of nonlinear control theory that really has no overlap with geometric control theory, and is more of a "nonlinearisation" of linear control theory. This latter sort of nonlinear control theory is presented in the text [Khalil 1996]. It is useful to understand some of this sort of nonlinear control theory. In particular, some of the more analytical techniques from this area are useful to know. However, the tools are decidedly not geometric, and at some point one has to really shake free from this sort of nonlinear control theory in order to immerse oneself in geometric control theory. A more geometric presentation of nonlinear control theory can be found in the books of Isidori [1995] and Nijmeijer and van der Schaft [1990]. The latter volume, in particular, is very closely aligned with a geometric approach, but is also faithful to the needs of control theory.

This version: 23/06/2009

Chapter 2 Real analyticity


One of the places where geometric control theory departs from classical differential geometry occurs in the occasional importance of real analyticity in geometric control theory. The reasons for this are deep and varied, as explained in [Sussmann 1990]. For our purposes, one of the important properties of real analyticity has to do with algebraic properties of germs of analytic functions. The setup for this is developed in Section 2.4. Another important aspect of real analyticity has to do with extending maps from a subset to a larger space. For smooth maps, this is often very easily done using things like partitions of unity and the Tietze Extension Theorem [Abraham, Marsden, and Ratiu 1988, §5.5]. For analytic maps, the matter is more subtle because of the absence of partitions of unity. In Section 2.3.3 we present some useful results concerning extensibility of analytic functions. The facts we present about real analyticity are more or less well-known, but it is useful to organise them in one place for ease of reference. What we say here is a bare scratching of the surface of the important, interesting, and subtle topic of real analyticity.

2.1 Real analytic functions: definitions and fundamental properties


Real analytic functions are defined as being locally prescribed by a convergent power series. We, therefore, begin by describing formal (i.e., not depending on any sort of convergence) power series. We then indicate how the usual notion of a Taylor series gives rise to a formal power series, and we prove Borel's Theorem which says that all formal power series arise as Taylor series. This leads us to consider convergence of power series, and then finally to consider real analytic functions. Much of what we say here is a fleshing out of some material from Chapter 2 of [Krantz and Parks 2002].

2.1.1 Multi-index and partial derivative notation

A multi-index is an element of $\mathbb{Z}^n_{\ge 0}$. For a multi-index $I$ we shall write $I = (i_1, \dots, i_n)$. We introduce the following notation:

1. $|I| = i_1 + \dots + i_n$;
2. $I! = i_1! \cdots i_n!$;
3. $x^I = x_1^{i_1} \cdots x_n^{i_n}$ for $x = (x_1, \dots, x_n) \in \mathbb{R}^n$;
4. $|x|^I = |x_1|^{i_1} \cdots |x_n|^{i_n}$ for $x = (x_1, \dots, x_n) \in \mathbb{R}^n$.

Note that the standard basis $(e_1, \dots, e_n)$ for $\mathbb{R}^n$ is in $\mathbb{Z}^n_{\ge 0}$, so we shall think of these vectors as elements of $\mathbb{Z}^n_{\ge 0}$ when it is convenient to do so. The following property of the set of multi-indices will often be useful.

2.1.1 Lemma For $n \in \mathbb{Z}_{>0}$ and $m \in \mathbb{Z}_{\ge 0}$,
$$\operatorname{card}(\{I \in \mathbb{Z}^n_{\ge 0} \mid |I| = m\}) = \binom{n+m-1}{n-1}.$$

Proof We begin with an elementary lemma.

1 Sublemma For $n \in \mathbb{Z}_{>0}$ and $m \in \mathbb{Z}_{\ge 0}$,
$$\sum_{j=0}^{m} \binom{n+j-1}{n-1} = \binom{n+m}{n}.$$

Proof Recall that for $j, k \in \mathbb{Z}_{\ge 0}$ with $j \le k$ we have
$$\binom{k}{j} = \frac{k!}{j!(k-j)!}.$$
We claim that if $j, k \in \mathbb{Z}_{>0}$ satisfy $j \le k$ then
$$\binom{k}{j} + \binom{k}{j-1} = \binom{k+1}{j}.$$
This is a direct computation:
$$\frac{k!}{j!(k-j)!} + \frac{k!}{(j-1)!(k-j+1)!} = \frac{(k-j+1)k!}{(k-j+1)j!(k-j)!} + \frac{jk!}{j(j-1)!(k-j+1)!} = \frac{(k-j+1)k! + jk!}{j!(k-j+1)!} = \frac{(k+1)k!}{j!((k+1)-j)!} = \frac{(k+1)!}{j!((k+1)-j)!} = \binom{k+1}{j}.$$
Now we have
$$\sum_{j=0}^{m} \binom{n+j-1}{n-1} = 1 + \sum_{j=1}^{m} \binom{n+j-1}{n-1} = 1 + \sum_{j=1}^{m} \binom{n+j}{n} - \sum_{j=1}^{m} \binom{n+j-1}{n} = 1 + \binom{n+m}{n} - \binom{n}{n} = \binom{n+m}{n},$$
as desired.

We now prove the lemma by induction on $n$. For $n = 1$ we have
$$\operatorname{card}(\{j \in \mathbb{Z}_{\ge 0} \mid j = m\}) = 1 = \binom{m}{0},$$
which gives the conclusions of the lemma in this case. Now suppose that the lemma holds for $n \in \{1, \dots, k\}$. If $I \in \mathbb{Z}^{k+1}_{\ge 0}$ satisfies $|I| = m$, then write $I = (i_1, \dots, i_k, i_{k+1})$ and take $I' = (i_1, \dots, i_k)$. If $i_{k+1} = j \in \{0, 1, \dots, m\}$ then $|I'| = m - j$. Thus
$$\operatorname{card}(\{I \in \mathbb{Z}^{k+1}_{\ge 0} \mid |I| = m\}) = \sum_{j=0}^{m} \operatorname{card}(\{I' \in \mathbb{Z}^{k}_{\ge 0} \mid |I'| = m - j\}) = \sum_{j=0}^{m} \binom{k+m-j-1}{k-1} = \sum_{j=0}^{m} \binom{k+j-1}{k-1} = \binom{k+m}{k} = \binom{(k+1)+m-1}{(k+1)-1},$$
using the sublemma in the penultimate step. This proves the lemma by induction.
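As a sanity check on Lemma 2.1.1 and its sublemma, one can enumerate multi-indices by brute force for small $n$ and $m$ and compare with the binomial formulas. The following is only an illustrative sketch; the function names `count_multi_indices`, `lemma_count`, and `sublemma_sum` are ours and do not appear in the notes.

```python
from itertools import product
from math import comb


def count_multi_indices(n, m):
    """Count I in Z_{>=0}^n with |I| = m by direct enumeration."""
    # Each entry of I is at most m, so it suffices to search {0,...,m}^n.
    return sum(1 for I in product(range(m + 1), repeat=n) if sum(I) == m)


def lemma_count(n, m):
    """The closed form asserted by the lemma: binom(n+m-1, n-1)."""
    return comb(n + m - 1, n - 1)


def sublemma_sum(n, m):
    """Left-hand side of the sublemma: sum_{j=0}^m binom(n+j-1, n-1)."""
    return sum(comb(n + j - 1, n - 1) for j in range(m + 1))
```

For example, `count_multi_indices(3, 2)` enumerates the six multi-indices of length 3 and total degree 2, in agreement with `lemma_count(3, 2)`.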

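Multi-index notation is also convenient for representing partial derivatives of multivariable functions. Let us start from the ground up. Let $(e_1, \dots, e_n)$ be the standard basis for $\mathbb{R}^n$ and denote by $(\alpha_1, \dots, \alpha_n)$ the dual basis for $(\mathbb{R}^n)^*$. Let $U \subseteq \mathbb{R}^n$ be open and let $f \in C^\infty(U)$. The $k$th total derivative of $f$ at $x_0$ we denote by $D^k f(x_0)$, noting that $D^k f(x_0) \in L^k_{\mathrm{sym}}(\mathbb{R}^n; \mathbb{R})$ is a symmetric $k$-multilinear map [see Abraham, Marsden, and Ratiu 1988, Proposition 2.4.14]. We denote by $L^k(\mathbb{R}^n; \mathbb{R})$ the set of $k$-multilinear maps with its usual basis
$$\{\alpha_{j_1} \otimes \dots \otimes \alpha_{j_k} \mid j_1, \dots, j_k \in \{1, \dots, n\}\}.$$
Thinking of $D^k f(x_0)$ as a multilinear map, forgetting about its being symmetric, we write
$$D^k f(x_0) = \sum_{j_1, \dots, j_k = 1}^{n} \frac{\partial^k f}{\partial x_{j_1} \cdots \partial x_{j_k}}(x_0)\, \alpha_{j_1} \otimes \dots \otimes \alpha_{j_k}.$$
Now, for $j_1, \dots, j_k \in \{1, \dots, n\}$, define $I \in \mathbb{Z}^n_{\ge 0}$ by letting $i_m \in \mathbb{Z}_{\ge 0}$ be the number of times $m \in \{1, \dots, n\}$ appears in the list of numbers $j_1, \dots, j_k$. Then, via the product $\odot$ on $L^k_{\mathrm{sym}}(\mathbb{R}^n; \mathbb{R})$ as discussed in Section 8.1.1, we can also write
$$D^k f(x_0) = \sum_{\substack{i_1, \dots, i_n \in \mathbb{Z}_{\ge 0} \\ i_1 + \dots + i_n = k}} \frac{1}{i_1! \cdots i_n!} \frac{\partial^k f}{\partial x_1^{i_1} \cdots \partial x_n^{i_n}}(x_0)\, \alpha_1^{i_1} \odot \dots \odot \alpha_n^{i_n}.$$
For $I = (i_1, \dots, i_n) \in \mathbb{Z}^n_{\ge 0}$, we write
$$\frac{\partial^{|I|} f}{\partial x^I}(x_0) = \frac{\partial^{|I|} f}{\partial x_1^{i_1} \cdots \partial x_n^{i_n}}(x_0).$$
We may also write this in a different way:
$$\frac{\partial^{|I|} f}{\partial x^I}(x_0) = \frac{\partial^{|I|} f}{\underbrace{\partial x_1 \cdots \partial x_1}_{i_1\ \text{times}} \cdots \underbrace{\partial x_n \cdots \partial x_n}_{i_n\ \text{times}}}(x_0).$$
Indeed, because of symmetry of the derivative, for any collection of numbers $j_1, \dots, j_{|I|} \in \{1, \dots, n\}$ for which $k$ occurs $i_k$ times for each $k \in \{1, \dots, n\}$, we have
$$\frac{\partial^{|I|} f}{\partial x^I}(x_0) = \frac{\partial^{|I|} f}{\partial x_{j_1} \cdots \partial x_{j_{|I|}}}(x_0).$$
In any case, we can also write
$$D^k f(x_0) = \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = k}} \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0)\, \alpha_1^{i_1} \odot \dots \odot \alpha_n^{i_n}.$$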

We shall freely interchange the various partial derivative notations discussed above, depending on what we are doing.

2.1.2 Formal power series

To get started with our discussion of real analyticity, it is useful to first engage in a little algebra so that we can write power series without having to worry about convergence.

2.1.2 Definition (Formal power series with arbitrary indeterminates) Let $\xi = \{\xi_1, \dots, \xi_n\}$ be a finite set and denote by $\mathbb{Z}^{\xi}_{\ge 0}$ the set of maps from $\xi$ into $\mathbb{Z}_{\ge 0}$. The set of $\mathbb{R}$-formal power series with indeterminates $\xi$ is the set of maps from $\mathbb{Z}^{\xi}_{\ge 0}$ to $\mathbb{R}$, and is denoted by $\mathbb{R}[[\xi]]$.

There is a concrete way to represent $\mathbb{Z}^{\xi}_{\ge 0}$. Given $\phi\colon \xi \to \mathbb{Z}_{\ge 0}$ we note that $\phi$ is uniquely determined by the $n$-tuple
$$(\phi(\xi_1), \dots, \phi(\xi_n)) \in \mathbb{Z}^n_{\ge 0}.$$
Such an $n$-tuple is nothing but an $n$-multi-index. Therefore, we shall identify $\mathbb{Z}^{\xi}_{\ge 0}$ with the set $\mathbb{Z}^n_{\ge 0}$ of multi-indices. Thus, rather than writing $\alpha(\phi)$ for $\alpha \in \mathbb{R}[[\xi]]$ and $\phi \in \mathbb{Z}^{\xi}_{\ge 0}$, we shall write $\alpha(I)$ for $I \in \mathbb{Z}^n_{\ge 0}$. Using this notation, the $\mathbb{R}$-algebra operations are defined by
$$(\alpha + \beta)(I) = \alpha(I) + \beta(I), \qquad (a\alpha)(I) = a(\alpha(I)), \qquad (\alpha \cdot \beta)(I) = \sum_{\substack{I_1, I_2 \in \mathbb{Z}^n_{\ge 0} \\ I_1 + I_2 = I}} \alpha(I_1)\beta(I_2),$$
for $a \in \mathbb{R}$ and $\alpha, \beta \in \mathbb{R}[[\xi_1, \dots, \xi_n]]$. We shall identify the indeterminate $\xi_j$, $j \in \{1, \dots, n\}$, with the element $\xi_j$ of $\mathbb{R}[[\xi]]$ defined by
$$\xi_j(I) = \begin{cases} 1, & I = e_j, \\ 0, & \text{otherwise}, \end{cases}$$
where $(e_1, \dots, e_n)$ is the standard basis for $\mathbb{R}^n$, thought of as a family of elements of $\mathbb{Z}^n_{\ge 0}$. One can readily verify that, using this identification, the $k$-fold product of $\xi_j$ is
$$\xi_j^k(I) = \begin{cases} 1, & I = ke_j, \\ 0, & \text{otherwise}. \end{cases}$$
Therefore, it is straightforward to see that if $\alpha \in \mathbb{R}[[\xi]]$ then
$$\alpha = \sum_{I = (i_1, \dots, i_n) \in \mathbb{Z}^n_{\ge 0}} \alpha(I)\, \xi_1^{i_1} \cdots \xi_n^{i_n}. \tag{2.1}$$
Adopting the notational convention $\xi^I = \xi_1^{i_1} \cdots \xi_n^{i_n}$, the preceding formula admits the compact representation
$$\alpha = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha(I)\, \xi^I.$$

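We can describe explicitly the units in the ring $\mathbb{R}[[\xi]]$, and give a formula for the inverse for these units.

2.1.3 Proposition (Units in $\mathbb{R}[[\xi]]$) A member $\alpha \in \mathbb{R}[[\xi]]$ is a unit if and only if $\alpha(0) \ne 0$. Moreover, if $\alpha$ is a unit, then we have
$$\alpha^{-1}(I) = \frac{1}{\alpha(0)} \sum_{k=0}^{\infty} \Bigl(1 - \frac{\alpha}{\alpha(0)}\Bigr)^{k}(I)$$
for all $I \in \mathbb{Z}^n_{\ge 0}$.

Proof First of all, suppose that $\alpha$ is a unit. Thus there exists $\beta \in \mathbb{R}[[\xi]]$ such that $\alpha \cdot \beta = 1$. In particular, this means that $\alpha(0)\beta(0) = 1$, and so $\alpha(0)$ is a unit in $\mathbb{R}$, i.e., is nonzero. Next suppose that $\alpha(0) \ne 0$. To prove that $\alpha$ is a unit we use the following lemma.

1 Lemma If $\beta \in \mathbb{R}[[\xi]]$ satisfies $\beta(0) = 0$ then $(1 - \beta)$ is a unit in $\mathbb{R}[[\xi]]$ and
$$(1 - \beta)^{-1}(I) = \sum_{k=0}^{\infty} \beta^k(I)$$
for all $I \in \mathbb{Z}^n_{\ge 0}$.

Proof First of all, we claim that $\sum_{k=0}^{\infty} \beta^k$ is a well-defined element of $\mathbb{R}[[\xi]]$. To see this, we claim that, for $k \in \mathbb{Z}_{>0}$, $\beta^k(I) = 0$ whenever $|I| \in \{0, 1, \dots, k-1\}$. We can prove this by induction on $k$. For $k = 1$ this follows from the assumption that $\beta(0) = 0$. So suppose that $\beta^k(I) = 0$ for $|I| \in \{0, 1, \dots, k-1\}$, whenever $k \in \{1, \dots, r\}$. Then, for $I \in \mathbb{Z}^n_{\ge 0}$ satisfying $|I| \in \{0, 1, \dots, r\}$, we have
$$\beta^{r+1}(I) = (\beta \cdot \beta^r)(I) = \sum_{\substack{I_1, I_2 \in \mathbb{Z}^n_{\ge 0} \\ I_1 + I_2 = I}} \beta(I_1)\beta^r(I_2) = \beta(0)\beta^r(I) + \sum_{\substack{I' \in \mathbb{Z}^n_{\ge 0} \setminus \{0\} \\ I - I' \in \mathbb{Z}^n_{\ge 0}}} \beta(I')\beta^r(I - I') = 0,$$
using the definition of the product in $\mathbb{R}[[\xi]]$, the fact that $\beta(0) = 0$, and the induction hypothesis (note that $|I - I'| \le r - 1$ when $I' \ne 0$). Thus we indeed have $\beta^k(I) = 0$ whenever $|I| \in \{0, 1, \dots, k-1\}$. This implies that, if $I \in \mathbb{Z}^n_{\ge 0}$, then the sum $\sum_{k=0}^{\infty} \beta^k(I)$ is finite (only the terms with $k \le |I|$ can be nonzero), and the formula in the statement of the lemma for $(1 - \beta)^{-1}$ at least makes sense. To see that it is actually the inverse of $1 - \beta$, for $I \in \mathbb{Z}^n_{\ge 0}$ we compute
$$\Bigl((1 - \beta) \cdot \sum_{k=0}^{\infty} \beta^k\Bigr)(I) = \sum_{k=0}^{\infty} \beta^k(I) - \sum_{k=1}^{\infty} \beta^k(I) = \beta^0(I) = 1(I),$$
where $1$ denotes the multiplicative identity in $\mathbb{R}[[\xi]]$, as desired.

Proceeding with the proof, let us define $\beta = 1 - \frac{\alpha}{\alpha(0)}$ so that $\beta(0) = 0$. By the lemma, $1 - \beta$ is a unit. Since $\alpha = \alpha(0)(1 - \beta)$ it follows that $\alpha$ is also a unit, and that $\alpha^{-1} = \alpha(0)^{-1}(1 - \beta)^{-1}$. The formula in the statement of the proposition then follows from the lemma above.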
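The product on formal power series and the inversion formula of Proposition 2.1.3 can be tried out on truncated series. The sketch below is only an illustration under our own conventions: a power series is stored as a dictionary from multi-indices (tuples) to coefficients, everything is truncated at total degree `d`, and the names `multiply` and `inverse` are ours. Exact rational arithmetic from the standard `fractions` module is used so that the check "product equals 1" is exact up to the truncation degree.

```python
from fractions import Fraction


def multiply(a, b, d):
    """Cauchy product (a.b)(I) = sum_{I1+I2=I} a(I1) b(I2), kept for |I| <= d."""
    c = {}
    for I1, u in a.items():
        for I2, v in b.items():
            I = tuple(p + q for p, q in zip(I1, I2))
            if sum(I) <= d:
                c[I] = c.get(I, 0) + u * v
    return c


def inverse(a, n, d):
    """Coefficients of a^{-1} for |I| <= d, via (1/a(0)) * sum_k beta^k.

    Here beta = 1 - a/a(0), so beta(0) = 0 and beta^k(I) = 0 for |I| < k;
    hence only k = 0, ..., d contribute to the coefficients with |I| <= d.
    """
    zero = (0,) * n
    a0 = a[zero]  # assumed nonzero, i.e., a is a unit
    beta = {I: -v / a0 for I, v in a.items() if I != zero and sum(I) <= d}
    total = {zero: Fraction(1)}  # the k = 0 term, beta^0 = 1
    power = {zero: Fraction(1)}
    for _ in range(d):
        power = multiply(power, beta, d)  # next power of beta, truncated
        for I, v in power.items():
            total[I] = total.get(I, 0) + v
    return {I: v / a0 for I, v in total.items()}
```

For instance, with $n = 2$ one can take `a = {(0, 0): Fraction(2), (1, 0): Fraction(1), (0, 1): Fraction(3)}`, compute `inverse(a, 2, 4)`, and check with `multiply` that the product has coefficient 1 at `(0, 0)` and coefficient 0 at every other multi-index of degree at most 4.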

Note that one of the consequences of the proof of the proposition is that the expression given for $\alpha^{-1}$ makes sense since the sum is finite for a fixed $I \in \mathbb{Z}^n_{\ge 0}$.

2.1.3 Formal Taylor series

One can see an obvious notational resemblance between the representation (2.1) and power series in the usual sense. A common form of power series is the Taylor series for an infinitely differentiable function about a point. In this section we flesh this out by assigning to a $C^\infty$-map a formal power series in a natural way.

Throughout this section we let $\xi = \{\xi_1, \dots, \xi_n\}$ so $\mathbb{R}[[\xi]]$ denotes the $\mathbb{R}$-formal power series in these indeterminates. We let $(e_1, \dots, e_n)$ be the standard basis for $\mathbb{R}^n$. We might typically denote the dual basis for $(\mathbb{R}^n)^*$ by $(e^1, \dots, e^n)$, but notationally, in this section, it is instead convenient to denote the dual basis by $(\xi_1, \dots, \xi_n)$.

Let $x_0 \in \mathbb{R}^n$ and let $U$ be a neighbourhood of $x_0$. We suppose that $f\colon U \to \mathbb{R}^m$ is infinitely differentiable. We let $D^k f(x_0)$ be the $k$th derivative of $f$ at $x_0$, noting that this is an element of $L^k_{\mathrm{sym}}(\mathbb{R}^n; \mathbb{R}^m)$ as discussed in Section 2.1.1. As we saw in our discussion in Section 2.1.1, writing this in the basis for $L^k_{\mathrm{sym}}(\mathbb{R}^n; \mathbb{R})$ gives
$$D^k f(x_0) = \sum_{a=1}^{m} \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = k}} \frac{1}{I!} \frac{\partial^{|I|} f_a}{\partial x^I}(x_0)\, (\xi_1^{i_1} \odot \dots \odot \xi_n^{i_n}) \otimes e_a.$$
Thus, to $f$ we can associate an element $\alpha_f(x_0) \in \mathbb{R}[[\xi_1, \dots, \xi_n]] \otimes \mathbb{R}^m$ by
$$\alpha_f(x_0) = \sum_{a=1}^{m} \sum_{I \in \mathbb{Z}^n_{\ge 0}} \frac{1}{I!} \frac{\partial^{|I|} f_a}{\partial x^I}(x_0)\, (\xi_1^{i_1} \cdots \xi_n^{i_n}) \otimes e_a.$$
One can (somewhat tediously) verify using the high-order Leibniz Rule [Abraham, Marsden, and Ratiu 1988, Supplement 2.4A] that, for $\mathbb{R}$-valued functions, this map is a homomorphism of $\mathbb{R}$-algebras. That is to say,
$$\alpha_{af}(x_0) = a\,\alpha_f(x_0), \qquad \alpha_{f+g}(x_0) = \alpha_f(x_0) + \alpha_g(x_0), \qquad \alpha_{fg}(x_0) = \alpha_f(x_0) \cdot \alpha_g(x_0).$$

We shall call $\alpha_f(x_0)$ the formal Taylor series of $f$ at $x_0$.

To initiate our discussions of convergence, let us consider $\mathbb{R}$-valued functions for the moment, just for simplicity. The expression for $\alpha_f(x_0)$ is reminiscent of the Taylor series for $f$ about $x_0$:
$$\sum_{k=0}^{\infty} \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = k}} \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0)(x - x_0)^I.$$
This series will generally not converge, even though as small children we probably thought that it did converge for infinitely differentiable functions. The situation regarding convergence is, in fact, as dire as possible, as is shown by the following theorem of Borel [1895]. Actually, Borel only proves the case where $n = 1$. The proof we give for arbitrary $n$ follows [Mirkil 1956].

2.1.4 Theorem (Borel) If $x_0 \in \mathbb{R}^n$ and if $U$ is a neighbourhood of $x_0$, then the map $f \mapsto \alpha_f(x_0)$ from $C^\infty(U)$ to $\mathbb{R}[[\xi]]$ is surjective.
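Proof For $I \in \mathbb{Z}^n_{\ge 0}$ let us abbreviate
$$f^{(I)}(x) = \frac{\partial^{|I|} f}{\partial x^I}(x).$$
Let us also define $h\colon \mathbb{R} \to \mathbb{R}$ by
$$h(x) = \begin{cases} 0, & x \in (-\infty, -2], \\ \mathrm{e}\, \mathrm{e}^{-1/(1-(x+1)^2)}, & x \in (-2, -1), \\ 1, & x \in [-1, 1], \\ \mathrm{e}\, \mathrm{e}^{-1/(1-(x-1)^2)}, & x \in (1, 2), \\ 0, & x \in [2, \infty). \end{cases}$$
As is well-known, cf. [Abraham, Marsden, and Ratiu 1988, Page 82] and Example 2.1.5, the function $h$ is infinitely differentiable. We depict the function in Figure 2.1.

Figure 2.1 The bump function $h$

Let $\alpha \in \mathbb{R}[[\xi]]$. Without loss of generality we assume that $x_0 = 0$. Let $r \in \mathbb{R}_{>0}$ be such that the closed ball $\overline{B}^n(r, 0)$ is contained in $U$. We recursively define a sequence $(f_j)_{j \in \mathbb{Z}_{\ge 0}}$ in $C^\infty(U)$ as follows. We take $f_0 \in C^\infty(U)$ such that $f_0(0) = \alpha(0)$ and such that $\operatorname{supp}(f_0) \subseteq \overline{B}^n(r, 0)$, e.g., take $f_0(x) = \alpha(0) h(\frac{2}{r}\|x\|)$. Now suppose that $f_0, f_1, \dots, f_k$ have been defined and define $g_{k+1}\colon U \to \mathbb{R}$ to be a homogeneous polynomial function in $x_1, \dots, x_n$ of degree $k + 1$ so that, for every multi-index $I = (i_1, \dots, i_n)$ for which $|I| = k + 1$, we have
$$g_{k+1}^{(I)}(0) = I!\,\alpha(I) - f_0^{(I)}(0) - \dots - f_k^{(I)}(0).$$
(The factor $I!$ accounts for the factor $\frac{1}{I!}$ in the definition of the formal Taylor series.) Note that $g_{k+1}^{(I)}(0) = 0$ if $|I| = m \in \{0, 1, \dots, k\}$ since in these cases $g_{k+1}^{(I)}$ will be a homogeneous polynomial of degree $k + 1 - m$. Next let
$$\hat{g}_{k+1}(x) = g_{k+1}(x) h(\tfrac{2}{r}\|x\|)$$
so that, since the function $x \mapsto h(\frac{2}{r}\|x\|)$ is equal to 1 in a neighbourhood of 0,
$$\hat{g}_{k+1}^{(I)}(0) = I!\,\alpha(I) - f_0^{(I)}(0) - \dots - f_k^{(I)}(0)$$
when $|I| = k + 1$, and $\hat{g}_{k+1}^{(I)}(0) = 0$ if $|I| \in \{0, 1, \dots, k\}$. Also $\operatorname{supp}(\hat{g}_{k+1}) \subseteq \overline{B}^n(r, 0)$. Next let $\lambda \in \mathbb{R}_{>0}$. If we define
$$h_{\lambda,k+1}(x) = \lambda^{-k-1} \hat{g}_{k+1}(\lambda x)$$
then, for $|I| = m$,
$$h_{\lambda,k+1}^{(I)}(x) = \lambda^{-k-1+m} \hat{g}_{k+1}^{(I)}(\lambda x).$$
Thus, if $|I| \in \{0, 1, \dots, k\}$, the exponent of $\lambda$ is negative and, since $\hat{g}_{k+1}^{(I)}$ is bounded, we can choose $\lambda \ge 1$ sufficiently large that $|h_{\lambda,k+1}^{(I)}(x)| < 2^{-k-1}$ for every $x \in \overline{B}^n(r, 0)$. With $\lambda$ so chosen we take $f_{k+1}(x) = h_{\lambda,k+1}(x)$. This recursive definition ensures that, for each $k \in \mathbb{Z}_{\ge 0}$, $f_k$ has the following properties:

1. $\operatorname{supp}(f_k) \subseteq \overline{B}^n(r, 0)$;
2. $f_k^{(I)}(0) = I!\,\alpha(I) - f_0^{(I)}(0) - \dots - f_{k-1}^{(I)}(0)$ whenever $I = (i_1, \dots, i_n)$ satisfies $|I| = k$;
3. $f_k^{(I)}(0) = 0$ if $|I| \in \{0, 1, \dots, k-1\}$;
4. $|f_k^{(I)}(x)| < 2^{-k}$ whenever $|I| \in \{0, 1, \dots, k-1\}$ and $x \in \overline{B}^n(r, 0)$.

We then define $f(x) = \sum_{k=0}^{\infty} f_k(x)$. From the second and third properties of the functions $f_k$, $k \in \mathbb{Z}_{\ge 0}$, above we see that $\alpha = \alpha_f(0)$, provided that $f$ can be differentiated term-by-term. It thus remains to show that $f$ is infinitely differentiable. We shall do this by showing that the sequences of partial sums for all partial derivatives converge uniformly. The partial sums we denote by
$$F_m(x) = \sum_{k=0}^{m} f_k(x).$$
Since all functions in our series have support contained in $\overline{B}^n(r, 0)$, this is tantamount to showing that, for all multi-indices $I$, the sequences $(F_m^{(I)})_{m \in \mathbb{Z}_{\ge 0}}$ are Cauchy sequences in the Banach space $C^0(\overline{B}^n(r, 0); \mathbb{R})$ of continuous $\mathbb{R}$-valued functions on $\overline{B}^n(r, 0)$ equipped with the norm
$$\|g\|_\infty = \sup\{|g(x)| \mid x \in \overline{B}^n(r, 0)\}.$$
Let $\epsilon \in \mathbb{R}_{>0}$ and let $I = (i_1, \dots, i_n)$ be a multi-index. Let $N \in \mathbb{Z}_{>0}$ be such that
$$\sum_{j=l+1}^{m} \frac{1}{2^j} < \epsilon$$
for every $l, m \ge N$, this being possible since $\sum_{j=1}^{\infty} \frac{1}{2^j} < \infty$. Then, for $l, m \ge \max\{N, |I|\}$ with $m > l$ and $x \in \overline{B}^n(r, 0)$, we have
$$|F_l^{(I)}(x) - F_m^{(I)}(x)| = \Bigl|\sum_{j=l+1}^{m} f_j^{(I)}(x)\Bigr| \le \sum_{j=l+1}^{m} |f_j^{(I)}(x)| \le \sum_{j=l+1}^{m} \frac{1}{2^j} < \epsilon,$$
showing that $(F_m^{(I)})_{m \in \mathbb{Z}_{>0}}$ is a Cauchy sequence in $C^0(\overline{B}^n(r, 0); \mathbb{R})$, as desired.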

Thus any possible coefficients in a formal power series can arise as the Taylor coefficients for an infinitely differentiable function. Of course, an arbitrary power series
$$\sum_{k=0}^{\infty} \sum_{\substack{I = (i_1, \dots, i_n) \\ |I| = k}} \alpha(I)(x_1 - x_{01})^{i_1} \cdots (x_n - x_{0n})^{i_n}$$
may well only converge when $x = x_0$. Not only this, but even when the Taylor series does converge, it may not converge to the function producing its coefficients.

2.1.5 Example (A Taylor series not converging to the function giving rise to it) We define $f\colon \mathbb{R} \to \mathbb{R}$ by
$$f(x) = \begin{cases} \mathrm{e}^{-\frac{1}{x^2}}, & x \ne 0, \\ 0, & x = 0, \end{cases}$$
and in Figure 2.2 we show the graph of $f$.

Figure 2.2 Everyone's favourite smooth but not analytic function

We claim that the Taylor series for $f$ about $0$ is the zero $\mathbb{R}$-formal power series. To prove this, we must compute the derivatives of $f$ at $x = 0$. The following lemma is helpful in this regard.

1 Lemma For $j \in \mathbb{Z}_{\ge 0}$ there exists a polynomial $p_j$ of degree at most $2j$ such that
$$f^{(j)}(x) = \frac{p_j(x)}{x^{3j}} \mathrm{e}^{-\frac{1}{x^2}}, \qquad x \ne 0.$$

Proof We prove this by induction on $j$. Clearly the lemma holds for $j = 0$ by taking $p_0(x) = 1$. Now suppose the lemma holds for $j \in \{0, 1, \dots, k\}$. Thus
$$f^{(k)}(x) = \frac{p_k(x)}{x^{3k}} \mathrm{e}^{-\frac{1}{x^2}}$$
for a polynomial $p_k$ of degree at most $2k$. Then we compute
$$f^{(k+1)}(x) = \frac{x^3 p_k'(x) - 3kx^2 p_k(x) + 2p_k(x)}{x^{3(k+1)}} \mathrm{e}^{-\frac{1}{x^2}}.$$
Using the rules for differentiation of polynomials, one easily checks that
$$x \mapsto x^3 p_k'(x) - 3kx^2 p_k(x) + 2p_k(x)$$
is a polynomial whose degree is at most $2(k + 1)$.

From the lemma we infer the infinite differentiability of $f$ on $\mathbb{R} \setminus \{0\}$. We now need to consider the derivatives at 0. For this we employ another lemma.

2 Lemma $\lim_{x \to 0} \dfrac{\mathrm{e}^{-\frac{1}{x^2}}}{x^k} = 0$ for all $k \in \mathbb{Z}_{\ge 0}$.

Proof Taking $y = \frac{1}{x}$, we note that
$$\lim_{x \to 0} \frac{\mathrm{e}^{-\frac{1}{x^2}}}{x^k} = \lim_{|y| \to \infty} \frac{y^k}{\mathrm{e}^{y^2}}.$$
We have
$$\mathrm{e}^{y^2} = \sum_{j=0}^{\infty} \frac{y^{2j}}{j!}.$$
In particular, $\mathrm{e}^{y^2} \ge \frac{y^{2(k+1)}}{(k+1)!}$, and so
$$\Bigl|\frac{y^k}{\mathrm{e}^{y^2}}\Bigr| \le \frac{(k+1)!}{|y|^{k+2}},$$
and so
$$\lim_{x \to 0} \frac{\mathrm{e}^{-\frac{1}{x^2}}}{x^k} = 0,$$
as desired.
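The decay asserted in this lemma is very fast, and it can be instructive to look at a few numbers. The following sketch (with our own function names `ratio` and `bound`) evaluates $\mathrm{e}^{-1/x^2}/x^k$ in floating point and compares it with the elementary bound $(k+1)!\,|x|^{k+2}$, which follows from the inequality $\mathrm{e}^{y^2} \ge y^{2(k+1)}/(k+1)!$ with $y = 1/x$. This is only a numerical illustration, not a substitute for the proof, and for very small $x$ floating-point underflow makes the computed values unreliable.

```python
import math


def ratio(x, k):
    """Evaluate exp(-1/x^2) / x^k for x != 0."""
    return math.exp(-1.0 / x**2) / x**k


def bound(x, k):
    """The bound (k+1)! * |x|^(k+2) on |ratio(x, k)|."""
    return math.factorial(k + 1) * abs(x) ** (k + 2)


# A few sample values for k = 10: the ratio collapses far faster than the bound.
samples = [(x, ratio(x, 10), bound(x, 10)) for x in (1.0, 0.5, 0.2, 0.1)]
```

Each entry of `samples` is a triple $(x, \text{ratio}, \text{bound})$. Already at $x = 0.1$ the ratio is below $10^{-30}$ even though it has been divided by $x^{10} = 10^{-10}$.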

Now, letting $p_k(x) = \sum_{j=0}^{2k} a_j x^j$, we may directly compute
$$\lim_{x \to 0} f^{(k)}(x) = \lim_{x \to 0} \sum_{j=0}^{2k} a_j x^j \frac{\mathrm{e}^{-\frac{1}{x^2}}}{x^{3k}} = \sum_{j=0}^{2k} a_j \lim_{x \to 0} \frac{\mathrm{e}^{-\frac{1}{x^2}}}{x^{3k-j}} = 0.$$
The same computation applied to the difference quotient $\frac{f^{(k)}(x)}{x}$ shows, inductively, that $f^{(k+1)}(0)$ exists and is zero. Thus we arrive at the conclusion that $f$ is infinitely differentiable on $\mathbb{R}$, and that $f$ and all of its derivatives are zero at $x = 0$. Thus the Taylor series is indeed zero. This is clearly a convergent power series; it converges everywhere to the zero function. However, $f(x) \ne 0$ except when $x = 0$. Thus the Taylor series about 0 for $f$, while convergent everywhere, converges to $f$ only at $x = 0$. This is therefore an example of a function that is infinitely differentiable, but is not equal to its Taylor series about $x = 0$ in any neighbourhood of $x = 0$. This function may seem rather useless, but in actuality it is quite an important one. For example, we used it in the construction for the proof of Theorem 2.1.4. It is also used in the construction of partitions of unity which are so important in smooth differential geometry, and whose absence in real analytic differential geometry makes the latter subject so subtle.

Another way to think of the preceding example is that it tells us that the map $f \mapsto \alpha_f(x_0)$ from $C^\infty(U)$ to $\mathbb{R}[[\xi]]$, while surjective, is not injective.

2.1.4 Convergent power series

Throughout this section we let $\xi = \{\xi_1, \dots, \xi_n\}$ so $\mathbb{R}[[\xi]]$ denotes the $\mathbb{R}$-formal power series in these indeterminates. Let us turn to formal power series that converge, and give some of their properties. Recalling notation from Section 2.1.1, we state the following.

2.1.6 Definition (Convergent formal power series) Let $\xi = \{\xi_1, \dots, \xi_n\}$. A formal power series $\alpha \in \mathbb{R}[[\xi]]$ converges at $x \in \mathbb{R}^n$ if there exists a bijection $\phi\colon \mathbb{Z}_{>0} \to \mathbb{Z}^n_{\ge 0}$ such that the series
$$\sum_{j=1}^{\infty} \alpha(\phi(j)) x^{\phi(j)}$$
converges. Let us denote by $R_{\mathrm{conv}}(\alpha)$ the set of points $x \in \mathbb{R}^n$ such that $\alpha$ converges at $x$. We call $R_{\mathrm{conv}}(\alpha)$ the region of convergence. We denote by
$$\hat{\mathbb{R}}[[\xi]] = \{\alpha \in \mathbb{R}[[\xi]] \mid R_{\mathrm{conv}}(\alpha) \ne \{0\}\}$$
the set of power series converging at some nonzero point.

2.1.7 Remark (On notions of convergence for multi-indexed sums) Note that the definition of convergence we give is quite weak, as we require convergence for some arrangement of the index set $\mathbb{Z}^n_{\ge 0}$. A stronger notion of convergence would be that the series
$$\sum_{j=1}^{\infty} \alpha(\phi(j)) x^{\phi(j)}$$
converge for every bijection $\phi\colon \mathbb{Z}_{>0} \to \mathbb{Z}^n_{\ge 0}$. This, it turns out, is equivalent to absolute convergence of the series, i.e., that
$$\sum_{I \in \mathbb{Z}^n_{\ge 0}} |\alpha(I)||x|^I < \infty.$$
This is essentially explained by Roman [2005] (see Theorem 13.24) and Rudin [1976] (see Theorem 3.55).¹ We shall take an understanding of this for granted.

¹ Also see the important paper of Dvoretzky and Rogers [1950] in this regard, where it is shown that the equivalence of absolute and unconditional convergence only holds in finite dimensions.

Let us now show that convergence as in the definition above at any nontrivial point (i.e., a nonzero point) leads to a strong form of convergence at a large subset of other points. To be precise about this, for $x \in \mathbb{R}^n$ let us denote
$$C(x) = \{(c_1 x_1, \dots, c_n x_n) \in \mathbb{R}^n \mid c_1, \dots, c_n \in (-1, 1)\}.$$
Thus $C(x)$ is the smallest open box, with sides parallel to the coordinate axes and centred at the origin, whose closure contains $x$.

2.1.8 Theorem (Uniform and absolute convergence of formal power series) Let $\alpha \in \mathbb{R}[[\xi]]$ and suppose that $\alpha$ converges at $x_0 \in \mathbb{R}^n$. Then $\alpha$ converges uniformly and absolutely on every compact subset of $C(x_0)$.

Proof Let $K \subseteq C(x_0)$ be compact. The theorem holds trivially if $K = \{0\}$, so we suppose this is not the case. Let $\lambda \in (0, 1)$ be such that $|x_j| \le \lambda |x_{0j}|$ for $x \in K$ and for $j \in \{1, \dots, n\}$. Let $\phi\colon \mathbb{Z}_{>0} \to \mathbb{Z}^n_{\ge 0}$ be a bijection such that
$$\sum_{j=1}^{\infty} \alpha(\phi(j)) x_0^{\phi(j)}$$
converges. This implies, in particular, that the sequence $(\alpha(\phi(j)) x_0^{\phi(j)})_{j \in \mathbb{Z}_{>0}}$ is bounded. Thus there exists $C \in \mathbb{R}_{>0}$ such that $|\alpha(I)||x_0|^I \le C$ for every $I \in \mathbb{Z}^n_{\ge 0}$. Then $|\alpha(I)||x|^I \le C\lambda^{|I|}$ for every $x \in K$. In order to complete the proof we use the following lemma.

1 Lemma For $x \in (-1, 1)$ and $m \in \mathbb{Z}_{\ge 0}$,
$$\sum_{j=0}^{\infty} \frac{(m + j)!}{j!} x^j = \frac{\mathrm{d}^m}{\mathrm{d}x^m} \Bigl(\frac{x^m}{1 - x}\Bigr).$$

Proof Let $a \in (0, 1)$ and recall that
$$\frac{1}{1 - a} = \sum_{j=0}^{\infty} a^j \quad \Longrightarrow \quad \frac{a^m}{1 - a} = \sum_{j=0}^{\infty} a^{m+j}$$
[Rudin 1976, Theorem 3.26]. Also, by the ratio test, the series
$$\sum_{j=0}^{\infty} (m + j)(m + j - 1) \cdots (m + j - k) a^{m+j-k-1} \tag{2.2}$$
converges for $k \in \mathbb{Z}_{\ge 0}$. Now, for $x \in [-a, a]$, since $|x^{m+j}| \le a^{m+j}$ and $\sum_{j=0}^{\infty} a^{m+j} = \frac{a^m}{1-a} < \infty$, we have
$$\sum_{j=0}^{\infty} x^{m+j} = \frac{x^m}{1 - x},$$
with the convergence being uniform and absolute on $[-a, a]$. Since $|(m + j)x^{m+j-1}| \le (m + j)a^{m+j-1}$, the series obtained by differentiating term-by-term converges uniformly and absolutely on $[-a, a]$ since the series (2.2) converges. Thus the series can be differentiated term-by-term to give
$$\frac{\mathrm{d}}{\mathrm{d}x} \Bigl(\frac{x^m}{1 - x}\Bigr) = \sum_{j=0}^{\infty} (m + j)x^{m+j-1}.$$
In fact, by the same argument, this differentiation can be made $m$ times to give
$$\frac{\mathrm{d}^m}{\mathrm{d}x^m} \Bigl(\frac{x^m}{1 - x}\Bigr) = \sum_{j=0}^{\infty} (m + j) \cdots (m + j - m + 1)x^j = \sum_{j=0}^{\infty} \frac{(m + j)!}{j!} x^j,$$
as desired.

Now, continuing with the proof, for $x \in K$ and for $m \in \mathbb{Z}_{\ge 0}$ we have
$$\sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| \le m}} |\alpha(I)x^I| = \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| \le m}} |\alpha(I)||x|^I \le \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| \le m}} C\lambda^{|I|} < C \sum_{j=0}^{\infty} \binom{n + j - 1}{n - 1} \lambda^j = \frac{C}{(n-1)!} \sum_{j=0}^{\infty} \frac{(n + j - 1)!}{j!} \lambda^j = \frac{C}{(n - 1)!} \frac{\mathrm{d}^{n-1}}{\mathrm{d}\lambda^{n-1}} \Bigl(\frac{\lambda^{n-1}}{1 - \lambda}\Bigr),$$
using Lemma 2.1.1 and Lemma 1. Thus the sum
$$\sum_{I \in \mathbb{Z}^n_{\ge 0}} |\alpha(I)x^I|$$
converges on $K$, and uniformly in $x \in K$ since our computation above provides a bound independent of $x$.
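The identity of Lemma 1 in the preceding proof can be checked numerically. Since $\frac{x^m}{1-x}$ differs from $\frac{1}{1-x}$ by a polynomial of degree $m - 1$, its $m$th derivative is $\frac{m!}{(1-x)^{m+1}}$, and this closed form is what the sketch below compares against. The closed form is our own remark and the function names are ours; the series is truncated after a fixed number of terms, so agreement is only up to floating-point and truncation error.

```python
from math import factorial, prod


def series_side(x, m, terms=400):
    """Partial sum of sum_j (m+j)!/j! * x^j, for |x| < 1."""
    # (m+j)!/j! = (j+1)(j+2)...(j+m), computed as an exact integer product.
    return sum(prod(range(j + 1, j + m + 1)) * x**j for j in range(terms))


def closed_form(x, m):
    """m-th derivative of x^m/(1-x), namely m!/(1-x)^(m+1)."""
    return factorial(m) / (1.0 - x) ** (m + 1)
```

For example, with $x = \frac12$ and $m = 3$ both sides evaluate (up to rounding) to $96$.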

The result implies that, if we have convergence (in the weak sense of Definition 2.1.6) for a formal power series at some nonzero point in $\mathbb{R}^n$, we have a strong form of convergence in some neighbourhood of the origin. We now define
$$R_{\mathrm{abs}}(\alpha) = \bigcup_{r \in \mathbb{R}_{>0}} \Bigl\{x \in \mathbb{R}^n \Bigm| \sum_{I \in \mathbb{Z}^n_{\ge 0}} |\alpha(I) y^I| < \infty \text{ for all } y \in B^n(r, x)\Bigr\},$$
which we call the region of absolute convergence. The following result gives the relationship between the two regions of convergence.

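2.1.9 Proposition ($\operatorname{int}(R_{\mathrm{conv}}(\alpha)) = R_{\mathrm{abs}}(\alpha)$) For $\alpha \in \mathbb{R}[[\xi]]$, $\operatorname{int}(R_{\mathrm{conv}}(\alpha)) = R_{\mathrm{abs}}(\alpha)$.

Proof Let $x \in \operatorname{int}(R_{\mathrm{conv}}(\alpha))$. Then there exists $\lambda > 1$ such that $\lambda x \in R_{\mathrm{conv}}(\alpha)$. For such a $\lambda$, $x \in C(\lambda x)$. Let $K \subseteq C(\lambda x)$ be compact and let $r \in \mathbb{R}_{>0}$ be such that $B^n(r, x) \subseteq K$, e.g., take $K$ to be a large enough closed cube. By Theorem 2.1.8 it follows that
$$\sum_{I \in \mathbb{Z}^n_{\ge 0}} |\alpha(I) y^I| < \infty$$
for $y \in B^n(r, x) \subseteq K$, and so $x \in R_{\mathrm{abs}}(\alpha)$.

If $x \in R_{\mathrm{abs}}(\alpha)$ then there exists $r \in \mathbb{R}_{>0}$ such that
$$\sum_{I \in \mathbb{Z}^n_{\ge 0}} |\alpha(I) y^I| < \infty$$
for $y \in B^n(r, x)$. In particular, $\alpha$ converges at every $y \in B^n(r, x)$ and so $x \in \operatorname{int}(R_{\mathrm{conv}}(\alpha))$.

This result has the following corollary that will be useful for us.

2.1.10 Corollary (Property of coefficients for convergent power series) If $\alpha \in \hat{\mathbb{R}}[[\xi]]$ and if $x \in R_{\mathrm{abs}}(\alpha)$ then there exist $C, \epsilon \in \mathbb{R}_{>0}$ such that
$$|\alpha(I)| \le \frac{C}{(|x_1| + \epsilon)^{i_1} \cdots (|x_n| + \epsilon)^{i_n}}$$
for every $I \in \mathbb{Z}^n_{\ge 0}$.

Proof Note that, if $x \in R_{\mathrm{abs}}(\alpha)$, then $(|x_1|, \dots, |x_n|) \in R_{\mathrm{abs}}(\alpha)$ by definition of the region of absolute convergence and by Theorem 2.1.8. Now, by Proposition 2.1.9 there exists $\epsilon \in \mathbb{R}_{>0}$ such that $(|x_1| + \epsilon, \dots, |x_n| + \epsilon) \in R_{\mathrm{conv}}(\alpha)$. Thus there exists a bijection $\phi\colon \mathbb{Z}_{>0} \to \mathbb{Z}^n_{\ge 0}$ such that
$$\sum_{j=1}^{\infty} \alpha(\phi(j))(|x_1| + \epsilon)^{\phi(j)_1} \cdots (|x_n| + \epsilon)^{\phi(j)_n}$$
converges. Therefore, the terms in this series must be bounded. Thus there exists $C \in \mathbb{R}_{>0}$ such that
$$|\alpha(I)|(|x_1| + \epsilon)^{i_1} \cdots (|x_n| + \epsilon)^{i_n} < C$$
for every $I \in \mathbb{Z}^n_{\ge 0}$.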
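Using this property of the coefficients of a convergent power series, one can deduce the following result.

2.1.11 Corollary (Convergent power series converge to infinitely differentiable functions) If $\alpha \in \hat{\mathbb{R}}[[\xi]]$ then the series
$$\sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha(I) x^I$$
converges in $R_{\mathrm{abs}}(\alpha)$ to an infinitely differentiable function whose derivatives are obtained by differentiating the series term-by-term.

Proof Denote by $f$ the function defined by the series on $R_{\mathrm{abs}}(\alpha)$. By induction it suffices to show that any partial derivative of $f$ is defined on $R_{\mathrm{abs}}(\alpha)$ by a convergent power series. Consider a term $\alpha(I)x^I$ in the power series for $I \in \mathbb{Z}^n_{\ge 0}$. For $j \in \{1, \dots, n\}$ we have
$$\frac{\partial}{\partial x_j} \alpha(I) x^I = \begin{cases} 0, & i_j = 0, \\ i_j \alpha(I) x^{I - e_j}, & i_j \ge 1. \end{cases}$$
Thus, when differentiating the terms in the power series with respect to $x_j$, the only nonzero contribution will come from terms corresponding to multi-indices of the form $I + e_j$. In this case,
$$\frac{\partial}{\partial x_j} \alpha(I + e_j) x^{I + e_j} = (i_j + 1)\alpha(I + e_j) x^I.$$
Therefore, the power series whose terms are the partial derivatives of those for the given power series with respect to $x_j$ is
$$\sum_{I \in \mathbb{Z}^n_{\ge 0}} (i_j + 1)\alpha(I + e_j) x^I.$$
Now let $x \in R_{\mathrm{abs}}(\alpha)$ and, according to Corollary 2.1.10, let $C, \epsilon \in \mathbb{R}_{>0}$ be such that
$$|\alpha(I)| \le \frac{C}{(|x_1| + \epsilon)^{i_1} \cdots (|x_n| + \epsilon)^{i_n}}, \qquad I \in \mathbb{Z}^n_{\ge 0}.$$
Let $y \in R_{\mathrm{abs}}(\alpha)$ be such that
$$y \in D^n(\tfrac{\epsilon}{2}, x) = \{x' \in \mathbb{R}^n \mid |x'_j - x_j| < \tfrac{\epsilon}{2},\ j \in \{1, \dots, n\}\}.$$
Note that $|y_j| \le |x_j| + |y_j - x_j| < |x_j| + \frac{\epsilon}{2}$. Also let
$$\lambda = \max\Bigl\{\frac{|x_1| + \frac{\epsilon}{2}}{|x_1| + \epsilon}, \dots, \frac{|x_n| + \frac{\epsilon}{2}}{|x_n| + \epsilon}\Bigr\} \in (0, 1)$$
and let $C' = \frac{C}{|x_j| + \epsilon}$. Then we compute
$$\sum_{I \in \mathbb{Z}^n_{\ge 0}} |i_j + 1||\alpha(I + e_j)||y|^I \le \sum_{I \in \mathbb{Z}^n_{\ge 0}} C'|i_j + 1| \Bigl(\frac{|x_1| + \frac{\epsilon}{2}}{|x_1| + \epsilon}\Bigr)^{i_1} \cdots \Bigl(\frac{|x_n| + \frac{\epsilon}{2}}{|x_n| + \epsilon}\Bigr)^{i_n} \le \sum_{m=0}^{\infty} \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = m}} C'|i_j + 1|\lambda^m \le \sum_{m=0}^{\infty} C'(m + 1) \binom{n + m - 1}{n - 1} \lambda^m,$$
using Lemma 2.1.1. The ratio test shows that this last series converges. Thus the power series whose terms are the partial derivatives of those for the given power series with respect to $x_j$ converges uniformly and absolutely in a neighbourhood of $x$. Thus
$$\frac{\partial}{\partial x_j} \Bigl(\sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha(I) x^I\Bigr) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} (i_j + 1)\alpha(I + e_j) x^I,$$
which gives the corollary, after an induction as we indicated at the beginning of the proof.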

In Theorem 2.1.15 below we shall show that, in fact, convergent power series are real analytic on $R_{\mathrm{abs}}(\alpha)$.

2.1.5 Real analytic functions

Throughout this section we let $\xi = \{\xi_1, \dots, \xi_n\}$ so $\mathbb{R}[[\xi]]$ denotes the $\mathbb{R}$-formal power series in these indeterminates. Now understanding some basic facts about convergent power series, we are in a position to use this knowledge to define what we mean by a real analytic function, and give some properties of such functions.

2.1.12 Definition (Real analytic function) Let $U \subseteq \mathbb{R}^n$ be open. A function $f\colon U \to \mathbb{R}$ is real analytic or of class $C^\omega$ on $U$ if, for every $x_0 \in U$, there exist $\alpha_{x_0} \in \hat{\mathbb{R}}[[\xi]]$ and $r \in \mathbb{R}_{>0}$ such that
$$f(x) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha_{x_0}(I)(x - x_0)^I = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha_{x_0}(I)(x_1 - x_{01})^{i_1} \cdots (x_n - x_{0n})^{i_n}$$
for all $x \in B^n(r, x_0)$. The set of real analytic functions on $U$ is denoted by $C^\omega(U)$. A map $f\colon U \to \mathbb{R}^m$ is real analytic or of class $C^\omega$ on $U$ if its components $f_1, \dots, f_m\colon U \to \mathbb{R}$ are real analytic. The set of real analytic $\mathbb{R}^m$-valued maps on $U$ is denoted by $C^\omega(U; \mathbb{R}^m)$.

2.1.13 Notation ("Real analytic" or "analytic") We shall very frequently, especially outside the confines of this chapter, write "analytic" in place of "real analytic." This is not problematic since in our private life we use the term "holomorphic" and not the term "analytic" when referring to functions of a complex variable.

We can now show that a real analytic function is infinitely differentiable with real analytic derivatives, and that the power series coefficients $\alpha_{x_0}(I)$, $I \in \mathbb{Z}^n_{\ge 0}$, are actually the Taylor series coefficients for $f$ at $x_0$.

2.1.14 Theorem (Real analytic functions have analytic partial derivatives of all orders) If $U \subseteq \mathbb{R}^n$ is open and if $f \in C^\omega(U)$, then all partial derivatives of $f$ are real analytic. Moreover, if $x_0 \in U$, and if $\alpha \in \hat{\mathbb{R}}[[\xi]]$ and $r \in \mathbb{R}_{>0}$ are such that
$$f(x) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha(I)(x - x_0)^I \tag{2.3}$$
for all $x \in B^n(r, x_0)$, then $\alpha = \alpha_f(x_0)$.


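Proof We begin with a lemma.

1 Lemma If $U \subseteq \mathbb{R}^n$ is open and if $f \in C^\omega(U)$ then $f$ is differentiable and its partial derivatives are analytic functions.

Proof Let $x_0 \in U$ and let $r \in \mathbb{R}_{>0}$ and $\alpha \in \hat{\mathbb{R}}[[\xi]]$ be such that
$$f(x) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha(I)(x - x_0)^I \tag{2.4}$$
for all $x \in B^n(r, x_0)$. As we showed in the proof of Corollary 2.1.11, the power series whose terms are the partial derivatives of those for the power series for $f$ with respect to $x_j$ is
$$\sum_{I \in \mathbb{Z}^n_{\ge 0}} (i_j + 1)\alpha(I + e_j)(x - x_0)^I.$$
Now let $\rho \in \mathbb{R}_{>0}$ be such that
$$x_\rho \triangleq (x_{01} + \rho, \dots, x_{0n} + \rho) \in B^n(r, x_0).$$
Since the series (2.4) converges at $x_\rho$, the terms in the series (2.4) must be bounded. Thus there exists $C \in \mathbb{R}_{>0}$ such that, for all $I \in \mathbb{Z}^n_{\ge 0}$,
$$|\alpha(I)(x_\rho - x_0)^I| = |\alpha(I)|\rho^{|I|} \le C.$$
Let $x \in B^n(r, x_0)$ be such that $|x_j - x_{0j}| < \lambda\rho$, $j \in \{1, \dots, n\}$, for some $\lambda \in (0, 1)$. We then estimate
$$\sum_{I \in \mathbb{Z}^n_{\ge 0}} |(i_j + 1)\alpha(I + e_j)(x - x_0)^I| = \sum_{I \in \mathbb{Z}^n_{\ge 0}} (i_j + 1)|\alpha(I + e_j)||x - x_0|^I \le C \sum_{I \in \mathbb{Z}^n_{\ge 0}} (i_j + 1) \frac{|x - x_0|^I}{\rho^{|I| + 1}} \le \frac{C}{\rho} \sum_{I \in \mathbb{Z}^n_{\ge 0}} (i_j + 1)\lambda^{|I|} \le \frac{C}{\rho} \sum_{k=0}^{\infty} \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = k}} (k + 1)\lambda^k = \frac{C}{\rho} \sum_{k=0}^{\infty} (k + 1) \binom{n + k - 1}{n - 1} \lambda^k,$$
where we have used Lemma 2.1.1. The ratio test can be used to show that this last series converges. Since this holds for every $x$ for which $|x_j - x_{0j}| < \lambda\rho$ for $\lambda \in (0, 1)$, it follows that there is a neighbourhood of $x_0$ for which the series
$$\sum_{I \in \mathbb{Z}^n_{\ge 0}} \frac{\partial}{\partial x_j} \alpha(I)(x - x_0)^I$$
converges absolutely and uniformly. This means that $\frac{\partial f}{\partial x_j}$ is represented by a convergent power series in a neighbourhood of $x_0$. Since $x_0 \in U$ is arbitrary, it follows that $\frac{\partial f}{\partial x_j}$ is analytic in $U$.

Now, the only part of the statement in the theorem that does not follow immediately from a repeated application of the lemma is the final assertion. This conclusion is proved as follows. If we evaluate (2.3) at $x = x_0$ we see that $\alpha(0) = f(x_0)$. In the proof of the lemma above we showed that
$$\frac{\partial f}{\partial x_j}(x) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} (i_j + 1)\alpha(I + e_j)(x - x_0)^I$$
in a neighbourhood of $x_0$. If we evaluate this at $x = x_0$ we see that $\alpha(e_j) = \frac{\partial f}{\partial x_j}(x_0)$. We can then inductively apply this argument to higher-order derivatives to derive the formula
$$\alpha(i_1 e_1 + \dots + i_n e_n) = \frac{1}{i_1! \cdots i_n!} \frac{\partial^{i_1 + \dots + i_n} f}{\partial x_1^{i_1} \cdots \partial x_n^{i_n}}(x_0) = \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0),$$
which gives $\alpha(I) = \alpha_f(x_0)(I)$, as desired.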

By definition, real analytic functions are represented by convergent power series (in fact their Taylor series, by Theorem 2.1.14) in a neighbourhood of any point. Conversely, any convergent power series defines a real analytic function on its domain of convergence.

2.1.15 Theorem (Convergent power series define real analytic functions) If $\alpha \in \hat{\mathbb{R}}[[\xi]]$ then the function $f\colon R_{\mathrm{abs}}(\alpha) \to \mathbb{R}$ defined by
$$f(x) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha(I) x^I \tag{2.5}$$
is real analytic.
Proof By Corollary 2.1.11 we know that f is innitely dierentiable and its derivatives can be gotten by term-by-term dierentiation of the series for f . Let x0 Rabs . By an induction on the argument of Corollary 2.1.11, if J = ( j1 , . . . , jn ) Zn then 0 | J| f (x 0 ) = x J Note that (i1 + j1 ) (i1 + 1) (in + jn ) (in + 1)(I + J)xI 0.

IZn 0

1 | J| f (x0 ) = J! x J

IZn 0

i1 + j1 in + jn (I + J)xI 0. j1 jn 1 | J| f (x0 )(x x0 ) J J! x J

We must show that f (x) =


JZn 0

for x in some neighbourhood of x0 . To this end, the following lemma will be useful.

23/06/2009

2.1 Real analytic functions: denitions and fundamental properties

27

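1 Lemma If $a, b \in \mathbb{R}_{>0}$ satisfy $a + b < 1$, then

$$\sum_{I \in \mathbb{Z}^n_{\ge 0}} \sum_{J \in \mathbb{Z}^n_{\ge 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} a^{|I|} b^{|J|} = \Bigl( \frac{1}{1 - a - b} \Bigr)^n.$$

Proof Recall that for $m \in \mathbb{Z}_{\ge 0}$ we have

$$\sum_{j=0}^m \binom{m}{j} a^{m-j} b^j = (a + b)^m,$$

as is easily shown by induction. Therefore,

$$\sum_{m=0}^\infty \sum_{j=0}^m \binom{m}{j} a^{m-j} b^j = \frac{1}{1 - a - b},$$

cf. the proof of Lemma 1 from the proof of Theorem 2.1.8. Note that the sets

$$\{(i, j) \in \mathbb{Z}^2_{\ge 0} \mid i + j = m\}, \qquad \{(m, j) \in \mathbb{Z}^2_{\ge 0} \mid j \le m\}$$

are in one-to-one correspondence. Using this fact we have

$$\begin{aligned}
\sum_{I \in \mathbb{Z}^n_{\ge 0}} \sum_{J \in \mathbb{Z}^n_{\ge 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} a^{|I|} b^{|J|}
&= \sum_{I \in \mathbb{Z}^n_{\ge 0}} \sum_{J \in \mathbb{Z}^n_{\ge 0}} \binom{i_1 + j_1}{j_1} a^{i_1} b^{j_1} \cdots \binom{i_n + j_n}{j_n} a^{i_n} b^{j_n} \\
&= \Bigl( \sum_{m_1=0}^\infty \sum_{j_1=0}^{m_1} \binom{m_1}{j_1} a^{m_1 - j_1} b^{j_1} \Bigr) \cdots \Bigl( \sum_{m_n=0}^\infty \sum_{j_n=0}^{m_n} \binom{m_n}{j_n} a^{m_n - j_n} b^{j_n} \Bigr) \\
&= \Bigl( \frac{1}{1 - a - b} \Bigr)^n,
\end{aligned}$$

as desired.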
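As a numerical sanity check of the lemma in the case $n = 1$, the following sketch sums the double series up to a finite order and compares it with $1/(1 - a - b)$. The function name `double_sum`, the truncation order, and the sample values of $a$ and $b$ are illustrative choices, not part of the notes.

```python
from math import comb


def double_sum(a, b, terms=200):
    """Truncated version of sum_{m>=0} sum_{j=0}^{m} C(m, j) a^(m-j) b^j.

    Via the correspondence (i, j) <-> (m, j) with m = i + j, this is the
    same as summing C(i + j, j) a^i b^j over all i, j >= 0.
    """
    total = 0.0
    for m in range(terms):
        for j in range(m + 1):
            total += comb(m, j) * a ** (m - j) * b ** j
    return total


a, b = 0.2, 0.3                # any positive values with a + b < 1
lhs = double_sum(a, b)         # truncated left-hand side for n = 1
rhs = 1.0 / (1.0 - a - b)      # closed form from the lemma
print(lhs, rhs)
```

For general $n$ the sum factors into $n$ copies of the one-variable sum, so the same check applies after raising both sides to the power $n$.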

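Let $z \in \mathcal{R}_{\mathrm{abs}}(\alpha)$ be such that none of the components of $z$ are zero and such that $\bigl| \frac{x_{0j}}{z_j} \bigr| < 1$ for $j \in \{1, \dots, n\}$. This is possible by openness of $\mathcal{R}_{\mathrm{abs}}(\alpha)$. Denote

$$\lambda = \max\Bigl\{ \Bigl| \frac{x_{01}}{z_1} \Bigr|, \dots, \Bigl| \frac{x_{0n}}{z_n} \Bigr| \Bigr\}.$$

Let $\rho \in \mathbb{R}_{>0}$ be such that $\lambda + \rho < 1$, and let $x$ satisfy $|x_j - x_{0j}| < \rho |z_j|$ for $j \in \{1, \dots, n\}$. Since the series for $f$ converges absolutely at $z$, its terms are bounded, so there exists $C \in \mathbb{R}_{>0}$ such that $|\alpha(I)| |z|^I \le C$ for all $I \in \mathbb{Z}^n_{\ge 0}$. With these definitions we now compute

$$\begin{aligned}
\sum_{J \in \mathbb{Z}^n_{\ge 0}} \Bigl| \frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(x_0) \Bigr| |x - x_0|^J
&\le \sum_{J \in \mathbb{Z}^n_{\ge 0}} \sum_{I \in \mathbb{Z}^n_{\ge 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} |\alpha(I + J)| |x_0|^I |x - x_0|^J \\
&\le \sum_{J \in \mathbb{Z}^n_{\ge 0}} \sum_{I \in \mathbb{Z}^n_{\ge 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} C \frac{|x_0|^I \rho^{|J|} |z|^J}{|z|^{I + J}} \\
&\le C \sum_{J \in \mathbb{Z}^n_{\ge 0}} \sum_{I \in \mathbb{Z}^n_{\ge 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} \lambda^{|I|} \rho^{|J|} \\
&= C \Bigl( \frac{1}{1 - \lambda - \rho} \Bigr)^n,
\end{aligned}$$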


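using the lemma above. This shows that the Taylor series for $f$ at $x_0$ converges absolutely and uniformly in a neighbourhood of $x_0$. It remains to show that it converges to $f$. Let $x$ be a point in the neighbourhood of $x_0$ where the Taylor series of $f$ at $x_0$ converges. Let $k \in \mathbb{Z}_{>0}$. By Taylor's Theorem [Abraham, Marsden, and Ratiu 1988, Theorem 2.4.15] there exists $z \in \{(1 - t)x_0 + tx \mid t \in [0, 1]\}$ such that

$$f(x) = \sum_{\substack{J \in \mathbb{Z}^n_{\ge 0} \\ |J| \le k}} \frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(x_0)(x - x_0)^J + \sum_{\substack{J \in \mathbb{Z}^n_{\ge 0} \\ |J| = k + 1}} \frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(z)(x - x_0)^J.$$

By Corollary 2.1.11 we have

$$\frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(z) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} \alpha(I + J) z^I.$$

Therefore,

$$\begin{aligned}
\Bigl| f(x) - \sum_{\substack{J \in \mathbb{Z}^n_{\ge 0} \\ |J| \le k}} \frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(x_0)(x - x_0)^J \Bigr|
&\le \sum_{\substack{J \in \mathbb{Z}^n_{\ge 0} \\ |J| = k + 1}} \Bigl| \frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(z) \Bigr| |x - x_0|^J \\
&\le \sum_{I \in \mathbb{Z}^n_{\ge 0}} \sum_{\substack{J \in \mathbb{Z}^n_{\ge 0} \\ |J| = k + 1}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} |\alpha(I + J)| |z|^I |x - x_0|^J.
\end{aligned}$$

Just as we did above when we showed that the Taylor series for $f$ at $x_0$ converges absolutely, we can show that the series

$$\sum_{J \in \mathbb{Z}^n_{\ge 0}} \sum_{I \in \mathbb{Z}^n_{\ge 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} |\alpha(I + J)| |z|^I |x - x_0|^J$$

converges. Therefore,

$$\lim_{k \to \infty} \sum_{I \in \mathbb{Z}^n_{\ge 0}} \sum_{\substack{J \in \mathbb{Z}^n_{\ge 0} \\ |J| = k + 1}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} |\alpha(I + J)| |z|^I |x - x_0|^J = 0,$$

and so

$$\lim_{k \to \infty} \Bigl| f(x) - \sum_{\substack{J \in \mathbb{Z}^n_{\ge 0} \\ |J| \le k}} \frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(x_0)(x - x_0)^J \Bigr| = 0,$$

showing that the Taylor series for $f$ at $x_0$ converges to $f$ in a neighbourhood of $x_0$.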

As a final result, let us characterise real analytic functions by providing an exact description of their derivatives.

2.1.16 Theorem (Derivatives of real analytic functions) If $f \in C^\infty(U)$ then the following statements are equivalent:
(i) $f \in C^\omega(U)$;
(ii) for each $x_0 \in U$ there exists a neighbourhood $V \subseteq U$ of $x_0$ and $C, r \in \mathbb{R}_{>0}$ such that

$$\Bigl| \frac{\partial^{|I|} f}{\partial x^I}(x) \Bigr| \le C\, I!\, r^{-|I|}$$

for all $x \in V$ and $I \in \mathbb{Z}^n_{\ge 0}$.
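Proof First suppose that $f$ is real analytic and let $x_0 \in U$. We will use the following lemmata.

1 Lemma Let $J \in \mathbb{Z}^n_{\ge 0}$ and let $x \in \mathbb{R}^n$ satisfy $|x_k| < 1$, $k \in \{1, \dots, n\}$. Then

$$J! \sum_{I \in \mathbb{Z}^n_{\ge 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} x^I = \frac{\partial^{|J|}}{\partial x^J} \Bigl( \prod_{k=1}^n \frac{x_k^{j_k}}{1 - x_k} \Bigr).$$

Proof By Lemma 1 from the proof of Theorem 2.1.8 we have

$$j_k! \sum_{i_k = 0}^\infty \binom{i_k + j_k}{j_k} x_k^{i_k} = \sum_{i_k = 0}^\infty \frac{(i_k + j_k)!}{i_k!} x_k^{i_k} = \frac{d^{j_k}}{dx_k^{j_k}} \Bigl( \frac{x_k^{j_k}}{1 - x_k} \Bigr), \qquad k \in \{1, \dots, n\}.$$

Therefore,

$$\begin{aligned}
J! \sum_{I \in \mathbb{Z}^n_{\ge 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} x^I
&= \sum_{I \in \mathbb{Z}^n_{\ge 0}} j_1! \binom{i_1 + j_1}{j_1} x_1^{i_1} \cdots j_n! \binom{i_n + j_n}{j_n} x_n^{i_n} \\
&= \Bigl( j_1! \sum_{i_1 = 0}^\infty \binom{i_1 + j_1}{j_1} x_1^{i_1} \Bigr) \cdots \Bigl( j_n! \sum_{i_n = 0}^\infty \binom{i_n + j_n}{j_n} x_n^{i_n} \Bigr) \\
&= \frac{d^{j_1}}{dx_1^{j_1}} \Bigl( \frac{x_1^{j_1}}{1 - x_1} \Bigr) \cdots \frac{d^{j_n}}{dx_n^{j_n}} \Bigl( \frac{x_n^{j_n}}{1 - x_n} \Bigr) \\
&= \frac{\partial^{|J|}}{\partial x^J} \Bigl( \prod_{k=1}^n \frac{x_k^{j_k}}{1 - x_k} \Bigr),
\end{aligned}$$

as desired.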

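2 Lemma For each $R \in (0, 1)$ there exists $A, \lambda \in \mathbb{R}_{>0}$ such that, for each $m \in \mathbb{Z}_{\ge 0}$,

$$\sup\Bigl\{ \Bigl| \frac{d^m}{dx^m} \Bigl( \frac{x^m}{1 - x} \Bigr) \Bigr| \Bigm| x \in [-R, R] \Bigr\} \le A\, m!\, \lambda^{-m}.$$

Proof We first claim that

$$\sum_{j=0}^\infty \binom{m + j}{j} x^j = \frac{1}{(1 - x)^{m+1}} \tag{2.6}$$

for $x \in (-1, 1)$ and $m \in \mathbb{Z}_{\ge 0}$. Indeed, by [Rudin 1976, Theorem 3.26] we have

$$\sum_{j=0}^\infty x^j = \frac{1}{1 - x},$$

and convergence is uniform and absolute on $[-R, R]$ for $R \in (0, 1)$. Differentiation $m$-times of both sides with respect to $x$ then gives (2.6). By Lemma 1 from the proof of Theorem 2.1.8 we have

$$\frac{d^m}{dx^m} \Bigl( \frac{x^m}{1 - x} \Bigr) = \sum_{j=0}^\infty \frac{(m + j)!}{j!} x^j$$

for $x \in (-1, 1)$. If $x \in [-R, R]$ then

$$\begin{aligned}
(1 - R)^m \Bigl| \frac{d^m}{dx^m} \Bigl( \frac{x^m}{1 - x} \Bigr) \Bigr|
&= (1 - R)^m \Bigl| \sum_{j=0}^\infty \frac{(m + j)!}{j!} x^j \Bigr| = (1 - R)^m m! \Bigl| \sum_{j=0}^\infty \binom{m + j}{j} x^j \Bigr| \\
&= m! \Bigl( \frac{1 - R}{1 - x} \Bigr)^m \frac{1}{1 - x} \le m! \frac{1}{1 - R}.
\end{aligned}$$

That is to say,

$$\Bigl| \frac{d^m}{dx^m} \Bigl( \frac{x^m}{1 - x} \Bigr) \Bigr| \le \frac{1}{1 - R}\, m!\, (1 - R)^{-m},$$

and so the lemma follows with $A = \frac{1}{1 - R}$ and $\lambda = 1 - R$.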
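The identity (2.6) and the resulting bound can be checked numerically. The sketch below compares a truncated version of the series in (2.6) with $1/(1 - x)^{m+1}$ and then verifies the bound on a grid in $[-R, R]$, using the closed form $\frac{d^m}{dx^m}\bigl(\frac{x^m}{1 - x}\bigr) = \frac{m!}{(1 - x)^{m+1}}$ that follows from (2.6). The helper names, the truncation order, the grid, and the choice $R = 1/2$ are illustrative assumptions.

```python
from math import comb, factorial


def series_2_6(m, x, terms=400):
    """Truncated left-hand side of (2.6): sum_j C(m + j, j) x^j."""
    return sum(comb(m + j, j) * x ** j for j in range(terms))


def mth_derivative(m, x):
    """d^m/dx^m (x^m / (1 - x)), using the closed form m! / (1 - x)^(m + 1)."""
    return factorial(m) / (1.0 - x) ** (m + 1)


R = 0.5
A, lam = 1.0 / (1.0 - R), 1.0 - R          # constants from the lemma
grid = [-R + 2.0 * R * i / 20 for i in range(21)]

# Largest deviation between the truncated series and the closed form in (2.6).
max_series_error = max(
    abs(series_2_6(m, x) - 1.0 / (1.0 - x) ** (m + 1))
    for m in range(6)
    for x in grid
)

# The bound |d^m/dx^m (x^m/(1-x))| <= A m! lam^(-m), with a rounding allowance.
bound_holds = all(
    abs(mth_derivative(m, x)) <= A * factorial(m) * lam ** (-m) * (1.0 + 1e-12)
    for m in range(6)
    for x in grid
)
print(max_series_error, bound_holds)
```

The bound is attained at $x = R$, which is why the comparison includes a small relative allowance for rounding.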

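Now, for $x$ in a neighbourhood $V$ of $x_0$ we have

$$f(x) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0)(x - x_0)^I.$$

Let us abbreviate $\alpha(I) = \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0)$. By Corollary 2.1.10 there exists $C', \sigma \in \mathbb{R}_{>0}$ such that

$$|\alpha(I)| \le C' \sigma^{-|I|}, \qquad I \in \mathbb{Z}^n_{\ge 0}.$$

By Corollary 2.1.11 and following the computations from the proof of Theorem 2.1.15, we can write

$$\frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(x) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} \alpha(I + J)(x - x_0)^I \tag{2.7}$$

for $J \in \mathbb{Z}^n_{\ge 0}$ and for $x$ in a neighbourhood of $x_0$. Therefore, there exists $\rho \in (0, \sigma)$ sufficiently small that, if $x \in \mathbb{R}^n$ satisfies $|x_j - x_{0j}| < \rho$, $j \in \{1, \dots, n\}$, (2.7) holds. Let $C_J$ be the partial derivative

$$\frac{\partial^{|J|}}{\partial x^J} \Bigl( \prod_{k=1}^n \frac{x_k^{j_k}}{1 - x_k} \Bigr)$$

evaluated at $x = (\frac{\rho}{\sigma}, \dots, \frac{\rho}{\sigma})$. Let $R \in (0, 1)$ satisfy $R > \frac{\rho}{\sigma}$. By the second lemma above there exists $A, \lambda \in \mathbb{R}_{>0}$ such that, for each $k \in \{1, \dots, n\}$ and each $x_k \in [-R, R]$, we have

$$\Bigl| \frac{d^{j_k}}{dx_k^{j_k}} \Bigl( \frac{x_k^{j_k}}{1 - x_k} \Bigr) \Bigr| \le A\, j_k!\, \lambda^{-j_k}.$$

It follows that

$$\Bigl| \frac{\partial^{|J|}}{\partial x^J} \Bigl( \prod_{k=1}^n \frac{x_k^{j_k}}{1 - x_k} \Bigr) \Bigr| \le A^n J!\, \lambda^{-|J|}$$

whenever $x = (x_1, \dots, x_n)$ satisfies $|x_j| < R$, $j \in \{1, \dots, n\}$. In particular, $C_J \le A^n J!\, \lambda^{-|J|}$. Then, for $x$ such that $|x_j - x_{0j}| < \rho$, $j \in \{1, \dots, n\}$, we have

$$\begin{aligned}
\Bigl| \frac{\partial^{|J|} f}{\partial x^J}(x) \Bigr|
&\le J! \sum_{I \in \mathbb{Z}^n_{\ge 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} |\alpha(I + J)| |x - x_0|^I \\
&\le J! \sum_{I \in \mathbb{Z}^n_{\ge 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} C' \sigma^{-|J|} \Bigl( \frac{\rho}{\sigma} \Bigr)^{|I|} \\
&= C' \sigma^{-|J|} C_J \le C' A^n J!\, (\sigma\lambda)^{-|J|},
\end{aligned}$$

using the lemmata above. Thus the second condition in the statement of the theorem holds with $C = C' A^n$ and $r = \sigma\lambda$.

Conversely, suppose that for each $x_0 \in U$ there exists a neighbourhood $V \subseteq U$ of $x_0$ and $C, r \in \mathbb{R}_{>0}$ such that

$$\Bigl| \frac{\partial^{|I|} f}{\partial x^I}(x) \Bigr| \le C\, I!\, r^{-|I|}$$

for all $x \in V$ and $I \in \mathbb{Z}^n_{\ge 0}$. Then, for $x_0 \in U$, let $C, r, \rho \in \mathbb{R}_{>0}$ be such that $\rho < r$ and

$$\Bigl| \frac{\partial^{|I|} f}{\partial x^I}(x) \Bigr| \le C\, I!\, r^{-|I|}, \qquad I \in \mathbb{Z}^n_{\ge 0},\ x \in B^n(\rho, x_0).$$

Then, for $x \in B^n(\rho, x_0)$ we have

$$\sum_{I \in \mathbb{Z}^n_{\ge 0}} \Bigl| \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0) \Bigr| |x - x_0|^I \le C \sum_{I \in \mathbb{Z}^n_{\ge 0}} \Bigl( \frac{\rho}{r} \Bigr)^{|I|} = C \sum_{k=0}^\infty \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = k}} \Bigl( \frac{\rho}{r} \Bigr)^k = C \sum_{k=0}^\infty \binom{n + k - 1}{n - 1} \Bigl( \frac{\rho}{r} \Bigr)^k,$$

using Lemma 2.1.1. By the ratio test, the last series converges, giving absolute convergence of the Taylor series of $f$ at $x_0$. We must also show that the series converges to $f$. Let $k \in \mathbb{Z}_{>0}$, let $x \in B^n(\rho, x_0)$, and recall from Taylor's Theorem [Abraham, Marsden, and Ratiu 1988, Theorem 2.4.15] that there exists $z \in \{(1 - t)x_0 + tx \mid t \in [0, 1]\}$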

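such that

$$f(x) = \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| \le k}} \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0)(x - x_0)^I + \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = k + 1}} \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(z)(x - x_0)^I.$$

Thus

$$\Bigl| f(x) - \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| \le k}} \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0)(x - x_0)^I \Bigr| \le \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = k + 1}} \Bigl| \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(z) \Bigr| |x - x_0|^I \le C \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = k + 1}} \Bigl( \frac{\rho}{r} \Bigr)^{|I|} = C \binom{n + k}{n - 1} \Bigl( \frac{\rho}{r} \Bigr)^{k + 1}.$$

As we saw above, the series

$$\sum_{k=0}^\infty \binom{n + k - 1}{n - 1} \Bigl( \frac{\rho}{r} \Bigr)^k$$

converges, and so

$$\lim_{k \to \infty} \binom{n + k}{n - 1} \Bigl( \frac{\rho}{r} \Bigr)^{k + 1} = 0,$$

giving

$$\lim_{k \to \infty} \Bigl| f(x) - \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| \le k}} \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0)(x - x_0)^I \Bigr| = 0,$$

and so $f$ is equal to its Taylor series about $x_0$ in a neighbourhood of $x_0$.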

2.1.17 Remarks (On derivative estimates for real analytic functions)

1. Note that, given any $N \in \mathbb{Z}_{>0}$, continuity of a $C^\infty$ function and its derivatives ensures that one can find an estimate of the form

$$\Bigl| \frac{\partial^{|I|} f}{\partial x^I}(x) \Bigr| \le C\, I!\, r^{-|I|}$$

for all $x$ in a neighbourhood of some point $x_0$ and for all $I$ for which $|I| \le N$. Thus the key distinction between smooth and analytic functions is that for analytic functions one can do this uniformly for all $I \in \mathbb{Z}^n_{\ge 0}$.

2. Readers familiar with the theory of holomorphic functions of several complex variables will recognise the estimate from the preceding theorem. Indeed, one can deduce the estimate we give from the holomorphic theory by means of the Cauchy estimates for derivatives. One place where the real theory differs from the complex theory is that, in the holomorphic case, the estimate one gives can be expressed as

$$\Bigl| \frac{\partial^{|I|} f}{\partial z^I}(z) \Bigr| \le C\, I!\, r^{-|I|} \sup\{ |f(z)| \mid z \in D^n(\rho, z_0) \}, \tag{2.8}$$

if $f$ is holomorphic in the disk $D^n(\rho, z_0) = \{ z \in \mathbb{C}^n \mid |z_j - z_{0j}| < \rho,\ j \in \{1, \dots, n\} \}$. The point here is that the bound on the derivatives involves the value of the function. In the real analytic case this is not possible. This has dire implications when one topologises the set of real analytic functions. Because of the form of the bounds for holomorphic functions, to topologise holomorphic functions is easy: one uses the topology of uniform convergence on compact sets. The analyticity of the limit function then essentially follows due to the bound (2.8). However, the set of real analytic functions is not a closed subspace of the set of continuous functions topologised by uniform convergence on compact sets.
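The last statement of the remark can be illustrated with a standard example that is not taken from the notes: the functions $f_k(x) = \sqrt{x^2 + 1/k}$ are real analytic on $\mathbb{R}$, yet they converge uniformly to $|x|$, which is not real analytic at $0$. Since $\sqrt{x^2 + \varepsilon} - |x| = \varepsilon / (\sqrt{x^2 + \varepsilon} + |x|) \le \sqrt{\varepsilon}$, the uniform error is at most $1/\sqrt{k}$. The sketch below estimates the error on a grid in $[-1, 1]$; the names `f` and `sup_error` and the grid size are illustrative.

```python
import math


def f(k, x):
    """The real analytic function f_k(x) = sqrt(x^2 + 1/k)."""
    return math.sqrt(x * x + 1.0 / k)


def sup_error(k, num_points=2001):
    """Grid estimate of sup |f_k(x) - |x|| over [-1, 1]."""
    xs = [-1.0 + 2.0 * i / (num_points - 1) for i in range(num_points)]
    return max(abs(f(k, x) - abs(x)) for x in xs)


# The error shrinks like 1/sqrt(k), so f_k -> |x| uniformly on [-1, 1].
errors = {k: sup_error(k) for k in (1, 100, 10000)}
for k, err in errors.items():
    print(k, err, 1.0 / math.sqrt(k))
```

The same estimate holds on every compact set, so the convergence is uniform on compact sets even though the limit leaves the class of real analytic functions.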

2.2 Real analytic multivariable calculus


Now that we know what a real analytic function is, and what some of its properties are, we can turn to the calculus of real analytic functions. We describe here the bare basics of this theory, enough so that we can do basic real analytic differential geometry.

2.2.1 Real analyticity and operations on functions

Let us verify that analyticity is respected by the standard ring operations for $\mathbb{R}$-valued functions on a set.

2.2.1 Proposition (Real analyticity and algebraic operations) If $U \subseteq \mathbb{R}^n$ is open and if $f, g \in C^\omega(U)$ then $f + g, fg \in C^\omega(U)$. If, moreover, $g(x) \ne 0$ for all $x \in U$, then $\frac{f}{g} \in C^\omega(U)$.
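Proof Let us first prove that $f + g, fg \in C^\omega(U)$. Let $x_0 \in U$ and let $r \in \mathbb{R}_{>0}$ be such that, for $x \in B^n(r, x_0)$, we have

$$f(x) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha_f(x_0)(I)(x - x_0)^I, \qquad g(x) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha_g(x_0)(I)(x - x_0)^I,$$

with the convergence being uniform and absolute on $B^n(r, x_0)$. Absolute convergence implies that for any bijection $\phi \colon \mathbb{Z}_{\ge 0} \to \mathbb{Z}^n_{\ge 0}$ we have

$$f(x) = \sum_{j=0}^\infty \alpha_f(x_0)(\phi(j))(x - x_0)^{\phi(j)}, \qquad g(x) = \sum_{j=0}^\infty \alpha_g(x_0)(\phi(j))(x - x_0)^{\phi(j)}.$$

The standard results on sums and products [Rudin 1976, Theorem 3.4] of series now apply (noting that convergence is absolute) to show that

$$\begin{aligned}
f(x) + g(x) &= \sum_{j=0}^\infty \bigl( \alpha_f(x_0)(\phi(j)) + \alpha_g(x_0)(\phi(j)) \bigr)(x - x_0)^{\phi(j)}, \\
f(x) g(x) &= \sum_{k=0}^\infty \sum_{j=0}^k \alpha_f(x_0)(\phi(j))\, \alpha_g(x_0)(\phi(k - j))\, (x - x_0)^{\phi(j) + \phi(k - j)}
\end{aligned}$$

for all $x \in B^n(r, x_0)$, with convergence being absolute in both series. Absolute convergence then implies that we can de-rearrange the series to get

$$\begin{aligned}
f(x) + g(x) &= \sum_{I \in \mathbb{Z}^n_{\ge 0}} \bigl( \alpha_f(x_0)(I) + \alpha_g(x_0)(I) \bigr)(x - x_0)^I, \\
f(x) g(x) &= \sum_{I \in \mathbb{Z}^n_{\ge 0}} \sum_{\substack{I_1, I_2 \in \mathbb{Z}^n_{\ge 0} \\ I_1 + I_2 = I}} \alpha_f(x_0)(I_1)\, \alpha_g(x_0)(I_2)\, (x - x_0)^I
\end{aligned}$$

for $x \in B^n(r, x_0)$. Thus the power series $\alpha_f(x_0) + \alpha_g(x_0)$ and $\alpha_f(x_0) \alpha_g(x_0)$ converge in a neighbourhood of $x_0$ to $f + g$ and $fg$, respectively. In particular, $f + g, fg \in C^\omega(U)$.

To show that $\frac{f}{g} \in C^\omega(U)$ if $g$ is nonzero on $U$, we show that $\frac{1}{g} \in C^\omega(U)$; that $\frac{f}{g} \in C^\omega(U)$ then follows by our conclusion above for multiplication of real analytic functions. Let $x_0 \in U$ and let $r \in \mathbb{R}_{>0}$ be such that

$$g(x) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha_g(x_0)(I)(x - x_0)^I \tag{2.9}$$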

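for $x \in B^n(r, x_0)$, convergence being absolute and uniform in $B^n(r, x_0)$. Let us abbreviate $\alpha = \alpha_g(x_0)$. Since $g$ is nowhere zero on $U$ it follows that $\alpha(0) \ne 0$ and so, by Proposition 2.1.3, $\alpha$ is a unit in $\mathbb{R}[[\xi]]$ with inverse defined by

$$\alpha^{-1}(I) = \frac{1}{\alpha(0)} \sum_{k=0}^\infty \Bigl( 1 - \frac{\alpha}{\alpha(0)} \Bigr)^k (I).$$

We will show that this is a convergent power series. Let $\rho \in \mathbb{R}_{>0}$ be such that

$$\bar{x} \triangleq (x_{01} + \rho, \dots, x_{0n} + \rho) \in B^n(r, x_0).$$

Since the series (2.9) converges at $\bar{x}$, the terms in the series must be bounded. Thus there exists $C \in \mathbb{R}_{>0}$ such that, for all $I \in \mathbb{Z}^n_{\ge 0}$,

$$|\alpha(I)(\bar{x} - x_0)^I| = |\alpha(I)| \rho^{|I|} \le C.$$

Therefore, for $I \in \mathbb{Z}^n_{\ge 0}$,

$$\Bigl| \Bigl( 1 - \frac{\alpha}{\alpha(0)} \Bigr)(I) \Bigr| \le \Bigl( 1 + \frac{C}{|\alpha(0)|} \Bigr) \rho^{-|I|}.$$

Let $C' = \max\bigl\{ 1, 1 + \frac{C}{|\alpha(0)|} \bigr\}$ and let $\lambda \in \mathbb{R}_{>0}$ be such that $C'\lambda \in (0, 1)$. If $x \in B^n(r, x_0)$ satisfies $|x_j - x_{0j}| < \lambda\rho$, $j \in \{1, \dots, n\}$,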

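we have

$$\begin{aligned}
\sum_{I \in \mathbb{Z}^n_{\ge 0}} \Bigl| \sum_{k=0}^\infty \Bigl( 1 - \frac{\alpha}{\alpha(0)} \Bigr)^k (I) \Bigr| |x - x_0|^I
&= \sum_{I \in \mathbb{Z}^n_{\ge 0}} \Bigl| \sum_{k=0}^{|I|} \Bigl( 1 - \frac{\alpha}{\alpha(0)} \Bigr)^k (I) \Bigr| |x - x_0|^I \\
&\le \sum_{m=0}^\infty \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = m}} \sum_{k=0}^m (C')^k \lambda^m
\le \sum_{m=0}^\infty \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = m}} \sum_{k=0}^m (C'\lambda)^m \\
&= \sum_{m=0}^\infty \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = m}} (m + 1)(C'\lambda)^m
= \sum_{m=0}^\infty \binom{n + m - 1}{n - 1} (m + 1)(C'\lambda)^m,
\end{aligned}$$

using Lemma 2.1.1 and the fact, from Lemma 1 in the proof of Proposition 2.1.3, that $\bigl( 1 - \frac{\alpha}{\alpha(0)} \bigr)^k (I) = 0$ whenever $|I| \in \{0, 1, \dots, k - 1\}$. The last series can be shown to converge by the ratio test, and this shows that the series

$$\frac{1}{\alpha(0)} \sum_{I \in \mathbb{Z}^n_{\ge 0}} \sum_{k=0}^\infty \Bigl( 1 - \frac{\alpha}{\alpha(0)} \Bigr)^k (I)\, (x - x_0)^I$$

converges in a neighbourhood of $x_0$.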
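The formula for the inverse power series used in the proof can be tried out in one variable. The following sketch, under the assumption $n = 1$ and with power series truncated to finitely many coefficients, forms $\frac{1}{\alpha(0)} \sum_k (1 - \alpha/\alpha(0))^k$ and checks it against the known expansion of $1/(2 + x)$. The helper names `mul` and `reciprocal` and the example series are illustrative, not from the notes.

```python
def mul(p, q, n):
    """Cauchy product of coefficient lists p and q, keeping degrees < n."""
    out = [0.0] * n
    for i, pi in enumerate(p[:n]):
        for j, qj in enumerate(q[: n - i]):
            out[i + j] += pi * qj
    return out


def reciprocal(alpha, n):
    """First n coefficients of 1/alpha via (1/alpha(0)) * sum_k beta^k,
    where beta = 1 - alpha/alpha(0) has zero constant term."""
    a0 = alpha[0]
    beta = [0.0] + [-c / a0 for c in alpha[1:n]]
    beta += [0.0] * (n - len(beta))
    total = [0.0] * n
    power = [1.0] + [0.0] * (n - 1)      # beta^0
    for _ in range(n):                   # beta^k has no terms of degree < k
        total = [t + c for t, c in zip(total, power)]
        power = mul(power, beta, n)
    return [t / a0 for t in total]


alpha = [2.0, 1.0]                       # g(x) = 2 + x, so alpha(0) = 2
inv = reciprocal(alpha, 8)               # coefficients of 1/(2 + x)
check = mul(alpha, inv, 8)               # should be 1 + 0 x + 0 x^2 + ...
print(inv)
print(check)
```

Only finitely many powers of $1 - \alpha/\alpha(0)$ contribute to each coefficient, which is the same vanishing property used in the estimate above.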

Of course, the addition part of the preceding result also applies to $\mathbb{R}^m$-valued real analytic maps. That is, if $f, g \in C^\omega(U, \mathbb{R}^m)$, then $f + g \in C^\omega(U, \mathbb{R}^m)$. Next we consider compositions of real analytic maps.

2.2.2 Proposition (Compositions of real analytic maps are real analytic) Let $U \subseteq \mathbb{R}^n$ and $V \subseteq \mathbb{R}^m$ be open, and let $f \colon U \to V$ and $g \colon V \to \mathbb{R}^p$ be real analytic. Then $g \circ f \colon U \to \mathbb{R}^p$ is real analytic.

Proof It suffices to consider the case where $p = 1$, and so we regard $g$ as a scalar-valued function. We denote the components of $f$ by $f_1, \dots, f_m \colon U \to \mathbb{R}$. Let $x_0 \in U$ and let $y_0 = f(x_0) \in V$. For $x$ in a neighbourhood $U' \subseteq U$ of $x_0$ and for $k \in \{1, \dots, m\}$ we write

$$f_k(x) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha_k(I)(x - x_0)^I \tag{2.10}$$

and for $y$ in a neighbourhood $V' \subseteq V$ of $y_0$ we write

$$g(y) = \sum_{J \in \mathbb{Z}^m_{\ge 0}} \beta(J)(y - y_0)^J. \tag{2.11}$$

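Following Theorem 2.1.8, let $\bar{x} \in U'$ be such that the series (2.10) converges absolutely at $x = \bar{x}$ for each $k \in \{1, \dots, m\}$ and such that $\bar{x}_j - x_{0j} > 0$, $j \in \{1, \dots, n\}$. In like fashion, let $\bar{y} \in V'$ be such that (2.11) converges absolutely at $y = \bar{y}$ and such that $\bar{y}_k - y_{0k} > 0$, $k \in \{1, \dots, m\}$. Thus we have $A, B \in \mathbb{R}_{>0}$ such that

$$\sum_{I \in \mathbb{Z}^n_{\ge 0}} |\alpha_k(I)| (\bar{x} - x_0)^I < A, \quad k \in \{1, \dots, m\}, \qquad \sum_{J \in \mathbb{Z}^m_{\ge 0}} |\beta(J)| (\bar{y} - y_0)^J < B.$$

Let

$$r = \min\Bigl\{ 1, \frac{\bar{y}_1 - y_{01}}{A}, \dots, \frac{\bar{y}_m - y_{0m}}{A} \Bigr\}.$$

If $x \in U'$ satisfies $|x_j - x_{0j}| < r(\bar{x}_j - x_{0j})$, $j \in \{1, \dots, n\}$, then, for $k \in \{1, \dots, m\}$,

$$\sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| \ge 1}} |\alpha_k(I)| |x - x_0|^I \le \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| \ge 1}} |\alpha_k(I)|\, r^{|I|} (\bar{x} - x_0)^I \le r \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| \ge 1}} |\alpha_k(I)| (\bar{x} - x_0)^I \le rA \le \bar{y}_k - y_{0k}$$

since $r \le 1$. Therefore, by (2.11),

$$\sum_{J \in \mathbb{Z}^m_{\ge 0}} |\beta(J)| \Bigl( \sum_{\substack{I_1 \in \mathbb{Z}^n_{\ge 0} \\ |I_1| \ge 1}} |\alpha_1(I_1)| |x - x_0|^{I_1} \Bigr)^{j_1} \cdots \Bigl( \sum_{\substack{I_m \in \mathbb{Z}^n_{\ge 0} \\ |I_m| \ge 1}} |\alpha_m(I_m)| |x - x_0|^{I_m} \Bigr)^{j_m} \le B.$$

It follows that

$$\sum_{J \in \mathbb{Z}^m_{\ge 0}} \beta(J) \Bigl( \sum_{\substack{I_1 \in \mathbb{Z}^n_{\ge 0} \\ |I_1| \ge 1}} \alpha_1(I_1)(x - x_0)^{I_1} \Bigr)^{j_1} \cdots \Bigl( \sum_{\substack{I_m \in \mathbb{Z}^n_{\ge 0} \\ |I_m| \ge 1}} \alpha_m(I_m)(x - x_0)^{I_m} \Bigr)^{j_m}$$

converges absolutely for $x \in U'$ satisfying $|x_j - x_{0j}| < r(\bar{x}_j - x_{0j})$, $j \in \{1, \dots, n\}$. Note, however, that since $\alpha_k(0) = y_{0k}$, $k \in \{1, \dots, m\}$, this means that the series

$$\sum_{J \in \mathbb{Z}^m_{\ge 0}} \beta(J) \Bigl( \sum_{I_1 \in \mathbb{Z}^n_{\ge 0}} \alpha_1(I_1)(x - x_0)^{I_1} - y_{01} \Bigr)^{j_1} \cdots \Bigl( \sum_{I_m \in \mathbb{Z}^n_{\ge 0}} \alpha_m(I_m)(x - x_0)^{I_m} - y_{0m} \Bigr)^{j_m}$$

converges absolutely for $x \in U'$ satisfying $|x_j - x_{0j}| < r(\bar{x}_j - x_{0j})$, $j \in \{1, \dots, n\}$. This last series, however, is precisely $g \circ f(x)$. This series is also a power series after a rearrangement, and any rearrangement will not affect convergence, cf. Remark 2.1.7. Thus $g \circ f$ is expressed as a convergent power series in a neighbourhood of $x_0$.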

2.2.2 The real analytic Inverse Function Theorem

The Inverse Function Theorem lies at the heart of many of the constructions in differential geometry. Thus it is essential for us to have at our disposal a real analytic version of the Inverse Function Theorem. We make the following obvious definition.

2.2.3 Definition (Real analytic diffeomorphism) If $U, V \subseteq \mathbb{R}^n$ are open sets, a map $f \in C^\omega(U, V)$ is a real analytic diffeomorphism if it (i) is a bijection, (ii) is real analytic, and (iii) has a real analytic inverse.

With this language, we have the following theorem.

2.2.4 Theorem (Real analytic Inverse Function Theorem) Let $U \subseteq \mathbb{R}^n$ be open and let $f \in C^\omega(U, \mathbb{R}^n)$. If the matrix

$$Df(x_0) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x_0) & \cdots & \frac{\partial f_1}{\partial x_n}(x_0) \\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial x_1}(x_0) & \cdots & \frac{\partial f_n}{\partial x_n}(x_0) \end{pmatrix}$$

is invertible for $x_0 \in U$, then there exists a neighbourhood $U'$ of $x_0$ such that $f|U' \colon U' \to f(U')$ is a real analytic diffeomorphism.

Proof Since $f$ is infinitely differentiable, by the usual Inverse Function Theorem [Abraham, Marsden, and Ratiu 1988, Theorem 2.5.2], there exists a neighbourhood $U'$ of $x_0$ such that $f|U' \colon U' \to f(U')$ is a $C^\infty$-diffeomorphism. We wish to show that this map is a real analytic diffeomorphism. Obviously it is an analytic bijection, so we need to show that its inverse is real analytic. To do this we shall show that, in a neighbourhood of $f(x_0)$, the inverse satisfies the derivative bounds of Theorem 2.1.16. We will do this by making use of the higher-order Chain Rule, which relates the derivatives of the composition of two maps to the derivatives of each of the maps. To state this requires some notation that we now introduce.

Let $r \in \mathbb{Z}_{>0}$ and let $r_1, \dots, r_k \in \mathbb{Z}_{\ge 0}$ satisfy $r_1 + \dots + r_k = r$. We denote by $S_{r_1, \dots, r_k}$ the subset of $S_r$ having the property that $\sigma \in S_{r_1, \dots, r_k}$ satisfies

$$\sigma(r_1 + \dots + r_j + 1) < \dots < \sigma(r_1 + \dots + r_j + r_{j+1}), \qquad j \in \{0, 1, \dots, k - 1\},$$

with the understanding that $r_0 = 0$. Thus $\sigma \in S_{r_1, \dots, r_k}$ rearranges $\{1, \dots, r\}$ in such a way that order is preserved in the first $r_1$ entries, the next $r_2$ entries, and so on. Let us also denote by $S^<_{r_1, \dots, r_k}$ the subset of $S_{r_1, \dots, r_k}$ given by

$$S^<_{r_1, \dots, r_k} = \{ \sigma \in S_{r_1, \dots, r_k} \mid \sigma(1) < \sigma(r_1 + 1) < \dots < \sigma(r_1 + \dots + r_{k-1} + 1) \}.$$

In this case, the rearrangement by $\sigma \in S^<_{r_1, \dots, r_k}$ preserves order in the first place, the $(r_1 + 1)$st place, the $(r_1 + r_2 + 1)$st place, and so on. With this notation, we can state and prove the higher-order Chain Rule, a full proof being difficult to locate in the literature.

1 Lemma Let $U \subseteq \mathbb{R}^n$ and $V \subseteq \mathbb{R}^m$ be open, consider maps $g \colon U \to V$ and $f \colon V \to \mathbb{R}^k$, and let $x \in U$. If $g$ and $f$ are of class $C^r$ then $f \circ g$ is of class $C^r$ and, moreover,

$$D^r(f \circ g)(x) \cdot (v_1, \dots, v_r) = \sum_{j=1}^r \sum_{\substack{r_1, \dots, r_j \in \mathbb{Z}_{>0} \\ r_1 + \dots + r_j = r}} \sum_{\sigma \in S^<_{r_1, \dots, r_j}} D^j f(g(x)) \cdot \bigl( D^{r_1} g(x) \cdot (v_{\sigma(1)}, \dots, v_{\sigma(r_1)}), \dots, D^{r_j} g(x) \cdot (v_{\sigma(r_1 + \dots + r_{j-1} + 1)}, \dots, v_{\sigma(r)}) \bigr) \tag{2.12}$$

for $v_1, \dots, v_r \in \mathbb{R}^n$.

Proof The proof is by induction on $r$. For $r = 1$ the result is simply the usual Chain Rule [Abraham, Marsden, and Ratiu 1988, Theorem 2.4.3]. Assume the result is true for $r \in \{1, \dots, s\}$. We thus have

$$D^s(f \circ g)(x) \cdot (v_2, \dots, v_{s+1}) = \sum_{j=1}^s \sum_{\substack{s_1, \dots, s_j \in \mathbb{Z}_{>0} \\ s_1 + \dots + s_j = s}} \sum_{\sigma \in S^<_{s_1, \dots, s_j}} D^j f(g(x)) \cdot \bigl( D^{s_1} g(x) \cdot (v_{\sigma(2)}, \dots, v_{\sigma(s_1 + 1)}), \dots, D^{s_j} g(x) \cdot (v_{\sigma(s_1 + \dots + s_{j-1} + 2)}, \dots, v_{\sigma(s+1)}) \bigr)$$
for every v2 , . . . , vs+1 Rn , and where S< s1 ,...,s j Ss permutes the set {2, . . . , s + 1} in the obvious way. Let us now make an observation about permutations. Let j {1, . . . , s + 1}, let s1 , . . . , s j Z>0 satisfy s1 + + s j = s + 1, and let S< . For brevity denote s ,...,s tl = s1 + + sl for l {1, . . . , j }. We have two cases.
1 j

1. s1 = 1: In this case let j = j 1, dene sl = sl+1 for l {1, . . . , j 1}, and let tl = s1 + + sl for l {1, . . . , j}. We then have ((1), ( (t2 s2 + 1), . . . , (t2 )), . . . , ( (t j s j + 1), . . . , (t j ))) = ((1), ((t2 s2 + 1), . . . , (t2 )), . . . , ((t j s j + 1), . . . , (t j ))), (2.13) where S< Ss permutes {2, . . . , s + 1} in the obvious way. Note that this s ,...,s uniquely species s1 , . . . , s j and .
1 j

2. s1 1: Here we take j = j , s1 = s1 1, sl = sl for l {2, . . . , j}. Let us denote tl = s1 + + sl for l {1, . . . , j}. Then there exist l0 {1, . . . , j} giving the corresponding cycle S j given by = (1 l0 ) and Ss(1) ,s(2) ,...,s( j) such that (( (t1 s1 + 1), . . . , (t1 )), . . . , ( (t j s j + 1), . . . , (t j ))) = ((1, (t(1) s(1) + 1), . . . , (t(1) )), . . . , ((t( j) s( j) + 1), . . . , (t( j) ))), (2.14) where permutes {2, . . . , s + 1} in the obvious way. Note that this uniquely species s1 , . . . , s j , , and . Note that the cycle is necessary to ensure that (1) = 1, a necessary condition that S< . The cycle serves to place the slot into which the 1 is s ,...,s
1 j

inserted at the beginning of the slot list. Conversely, let j {1, . . . , s}, let s1 , . . . , s j Z>0 have the property that s1 + + s j = s, and let S< s1 ,...,sk . Denote tl = s1 + + sl for l {1, . . . , j}. Then we have two scenarios. 1. We take j = j + 1, let s1 = 1 and sl = sl1 for l {2, . . . , s + 1}. Dene tl = s1 + + sl . Then there exists S< such that (2.13) holds. Moreover, this uniquely determines s ,...,s s1 , . . . , s j and .
1 j

2. We take j = j and let l0 {1, . . . , j}. Then take S j to be the cycle (1 l0 ). We then dene s1 = s(1) + 1 and sl = s(l) for l {2, . . . , j}. Then there exists S< such s ,...,s that (2.14) holds. Note that this uniquely species s1 , . . . , s j and .
1 j


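Using this observation, along with the usual Chain Rule and the symmetry of the derivatives of $f$ of order up to $s$, we then compute

$$\begin{aligned}
D^{s+1}(f \circ g)(x) \cdot (v_1, \dots, v_{s+1})
= \sum_{j=1}^s \sum_{\substack{s_1, \dots, s_j \in \mathbb{Z}_{>0} \\ s_1 + \dots + s_j = s}} \sum_{\sigma \in S^<_{s_1, \dots, s_j}} \Bigl(
& D^{j+1} f(g(x)) \cdot \bigl( Dg(x) \cdot v_1, D^{s_1} g(x) \cdot (v_{\sigma(2)}, \dots, v_{\sigma(s_1 + 1)}), \dots, D^{s_j} g(x) \cdot (v_{\sigma(s_1 + \dots + s_{j-1} + 2)}, \dots, v_{\sigma(s+1)}) \bigr) \\
& + D^j f(g(x)) \cdot \bigl( D^{s_1 + 1} g(x) \cdot (v_1, v_{\sigma(2)}, \dots, v_{\sigma(s_1 + 1)}), \dots, D^{s_j} g(x) \cdot (v_{\sigma(s_1 + \dots + s_{j-1} + 2)}, \dots, v_{\sigma(s+1)}) \bigr) \\
& + \dots \\
& + D^j f(g(x)) \cdot \bigl( D^{s_1} g(x) \cdot (v_{\sigma(2)}, \dots, v_{\sigma(s_1 + 1)}), \dots, D^{s_j + 1} g(x) \cdot (v_1, v_{\sigma(s_1 + \dots + s_{j-1} + 2)}, \dots, v_{\sigma(s+1)}) \bigr) \Bigr) \\
= \sum_{j'=1}^{s+1} \sum_{\substack{s'_1, \dots, s'_{j'} \in \mathbb{Z}_{>0} \\ s'_1 + \dots + s'_{j'} = s + 1}} \sum_{\sigma' \in S^<_{s'_1, \dots, s'_{j'}}}
& D^{j'} f(g(x)) \cdot \bigl( D^{s'_1} g(x) \cdot (v_{\sigma'(1)}, \dots, v_{\sigma'(s'_1)}), \dots, D^{s'_{j'}} g(x) \cdot (v_{\sigma'(s'_1 + \dots + s'_{j'-1} + 1)}, \dots, v_{\sigma'(s+1)}) \bigr),
\end{aligned}$$

as desired.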
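In one variable, the higher-order Chain Rule (2.12) can be read as a sum over the partitions of $\{1, \dots, r\}$ into nonempty blocks: as we read the definition of $S^<_{r_1, \dots, r_j}$, each choice of $(r_1, \dots, r_j)$ and $\sigma$ lists the blocks of one such partition in order of their smallest elements. Under that reading, $(f \circ g)^{(r)}(x)$ is the sum over partitions of $f^{(j)}(g(x))$ times the product of $g^{(|B|)}(x)$ over the blocks $B$, where $j$ is the number of blocks. The sketch below checks this for polynomials with exact integer arithmetic. All function names and the example polynomials are illustrative choices.

```python
from math import prod


def set_partitions(items):
    """Yield all partitions of the list `items` into nonempty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        yield [[first]] + partition                      # `first` alone
        for i in range(len(partition)):                  # or joined to a block
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def poly_derivative(coeffs, order):
    """Coefficients (lowest degree first) of the `order`-th derivative."""
    for _ in range(order):
        coeffs = [k * c for k, c in enumerate(coeffs)][1:]
    return coeffs


def poly_eval(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))


def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def chain_rule_derivative(f, g, r, x):
    """r-th derivative of f(g(x)) as a sum over set partitions of r items."""
    gx = poly_eval(g, x)
    total = 0
    for partition in set_partitions(list(range(r))):
        outer = poly_eval(poly_derivative(f, len(partition)), gx)
        inner = prod(poly_eval(poly_derivative(g, len(block)), x)
                     for block in partition)
        total += outer * inner
    return total


f = [0, 0, 0, 1]                          # f(y) = y^3
g = [0, 1, 1]                             # g(x) = x + x^2
composite = poly_mul(poly_mul(g, g), g)   # (x + x^2)^3, expanded directly
via_formula = [chain_rule_derivative(f, g, r, 1) for r in range(1, 5)]
direct = [poly_eval(poly_derivative(composite, r), 1) for r in range(1, 5)]
print(via_formula, direct)
```

The number of partitions grows quickly with $r$ (the Bell numbers), which gives some feeling for why the combinatorics of the formula are delicate.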

If the vectors $v_1, \dots, v_r$ in the statement of the lemma are chosen from the standard basis vectors $e_1, \dots, e_n$, say $v_j = e_{k_j}$, $j \in \{1, \dots, r\}$, then the formula above for the higher-order Chain Rule yields a formula for the partial derivative

$$\frac{\partial^r (f \circ g)}{\partial x_{k_1} \cdots \partial x_{k_r}}(x)$$

in terms of the partial derivatives of $f$ and $g$ of order up to $r$.

Let us now commence the proof of the theorem by estimating the derivatives of the inverse of $f$. By Theorem 2.1.16 there exists a neighbourhood $U''$ of $x_0$ and $C, R \in \mathbb{R}_{>0}$ such that

$$\Bigl| \frac{\partial^{|J|} f_i}{\partial x^J}(x) \Bigr| \le C\, J!\, R^{-|J|}, \qquad x \in U'',\ J \in \mathbb{Z}^n_{\ge 0},\ i \in \{1, \dots, n\}. \tag{2.15}$$

We shrink $U'$ if necessary so that $U' \subseteq U''$, and so $f|U'$ is a $C^\infty$-diffeomorphism onto its image $V' = f(U')$. We denote by $f$ the restriction of $f$ to $U'$ (an entirely forgivable abuse of notation) and by $g$ the inverse of $f$. We will show that there exists $\beta, \rho \in \mathbb{R}_{>0}$ such that

$$\Bigl| \frac{\partial^{|J|} g_i}{\partial y^J}(y) \Bigr| \le \beta\, J!\, \rho^{-|J|}, \qquad y \in V',\ J \in \mathbb{Z}^n_{\ge 0},\ i \in \{1, \dots, n\}. \tag{2.16}$$

We do this by induction on $|J|$. For $|J| \in \{0, 1\}$ the result follows because, for $|J| = 0$, $g$ is bounded on $V'$, and because, for $|J| = 1$, the Jacobian matrix for $g$ is given by

$$\begin{pmatrix} \frac{\partial g_1}{\partial y_1}(y) & \cdots & \frac{\partial g_1}{\partial y_n}(y) \\ \vdots & \ddots & \vdots \\ \frac{\partial g_n}{\partial y_1}(y) & \cdots & \frac{\partial g_n}{\partial y_n}(y) \end{pmatrix} = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(g(y)) & \cdots & \frac{\partial f_1}{\partial x_n}(g(y)) \\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial x_1}(g(y)) & \cdots & \frac{\partial f_n}{\partial x_n}(g(y)) \end{pmatrix}^{-1}, \qquad y \in V',$$

(this being a consequence of the smooth Inverse Function Theorem) and so the partial derivatives of $g$ are bounded.

So suppose that (2.16) holds for $|J| \in \{0, 1, \dots, r - 1\}$ and let $J = (j_1, \dots, j_n) \in \mathbb{Z}^n_{\ge 0}$ have $|J| = r \ge 2$. Let us define $k_1, \dots, k_r \in \{1, \dots, n\}$ by

$$(k_1, \dots, k_r) = (\underbrace{1, \dots, 1}_{j_1 \text{ terms}}, \dots, \underbrace{n, \dots, n}_{j_n \text{ terms}}).$$

Let us consider the expression from the higher-order Chain Rule applied to the $r$th derivative of $f \circ g$, taking $v_j = e_{k_j}$, $j \in \{1, \dots, r\}$. The term in the outer sum in (2.12) with $j = 1$ is then

$$D^1 f(g(y)) \cdot (D^r g(y) \cdot (e_{k_1}, \dots, e_{k_r})),$$

which is the vector in $\mathbb{R}^n$ with components

$$\sum_{l=1}^n \frac{\partial f_i}{\partial x_l}(g(y)) \frac{\partial^r g_l}{\partial y_{k_1} \cdots \partial y_{k_r}}(y), \qquad i \in \{1, \dots, n\}. \tag{2.17}$$

Since $f \circ g = \mathrm{id}_{V'}$, the derivatives of $f \circ g$ of order higher than 1 will vanish. Thus the higher-order Chain Rule expresses the partial derivative of $g_l$ in (2.17) as a linear combination of terms of the form
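$$\sum_{i=1}^n \sum_{l_1 = 1}^n \cdots \sum_{l_j = 1}^n \frac{\partial g_k}{\partial y_i}(y) \frac{\partial^j f_i}{\partial x_{l_1} \cdots \partial x_{l_j}}(g(y)) \frac{\partial^{r_1} g_{l_1}}{\partial y_{k_{\sigma(1)}} \cdots \partial y_{k_{\sigma(r_1)}}}(y) \cdots \frac{\partial^{r_j} g_{l_j}}{\partial y_{k_{\sigma(r_1 + \dots + r_{j-1} + 1)}} \cdots \partial y_{k_{\sigma(r)}}}(y), \tag{2.18}$$

for $k \in \{1, \dots, n\}$, where $j \in \{2, \dots, r\}$, $r_1 + \dots + r_j = r$, and $\sigma \in S^<_{r_1, \dots, r_j}$. We have

$$\Bigl| \frac{\partial^j f_i}{\partial x_{l_1} \cdots \partial x_{l_j}}(g(y)) \Bigr| \le C\, j!\, R^{-j}$$

by (2.15) and we have

$$\Bigl| \frac{\partial^{r_m} g_{l_m}}{\partial y_{k_{\sigma(r_1 + \dots + r_{m-1} + 1)}} \cdots \partial y_{k_{\sigma(r_1 + \dots + r_m)}}}(y) \Bigr| \le \beta\, r_m!\, \rho^{-r_m}, \qquad m \in \{1, \dots, j\},$$

by the induction hypothesis. Let us also denote

$$M = \sup\Bigl\{ \Bigl| \frac{\partial g_k}{\partial y_i}(y) \Bigr| \Bigm| y \in V',\ i, k \in \{1, \dots, n\} \Bigr\}.$$

With this notation, the expression (2.18) is bounded in magnitude by

$$n^{j+1} M C \beta^j\, j!\, R^{-j}\, r_1! \cdots r_j!\, \rho^{-r}.$$

Thus

$$\Bigl| \frac{\partial^r g_k}{\partial y_{k_1} \cdots \partial y_{k_r}}(y) \Bigr| \le \sum_{j=2}^r \sum_{\substack{r_1, \dots, r_j \in \mathbb{Z}_{>0} \\ r_1 + \dots + r_j = r}} \sum_{\sigma \in S^<_{r_1, \dots, r_j}} n^{j+1} M C \beta^j\, j!\, R^{-j}\, r_1! \cdots r_j!\, \rho^{-r}.$$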
By (2.15) we have j fi ( g( y)) C j!R j . xl1 xl j By the induction hypothesis,


1 rm glm ( y) rm !(1)rm 1 2 (2C)rm Rrm +1 , rm x(r1 ++rm1 +1) x(r1 ++rm )

m {1, . . . , j}.

2.2.5 Remark (On the higher-order Chain Rule) In our proof we make use of the higher-order Chain Rule for multi-variable functions. The formula is non-trivial. In the single-variable case, the formula is due to Faà di Bruno [1855], an Italian mathematician and priest. The combinatorics of the formula arise in various places, including in the study of cumulants in probability theory. The generalisation of the Faà di Bruno formula to multiple variables is enticing and has been studied by various people. Some useful formulae are given by Constantine and Savits [1996].

2.2.3 Some consequences of the Inverse Function Theorem

Having the real analytic Inverse Function Theorem at our disposal, we are now in a position to perform some of the more or less standard constructions that follow from it. Gratifyingly, these follow from the Inverse Function Theorem just as they do in the differentiable case. We first state the real analytic Implicit Function Theorem.

2.2.6 Theorem (Real analytic Implicit Function Theorem) Let $m, n \in \mathbb{Z}_{>0}$, let $U \times V \subseteq \mathbb{R}^n \times \mathbb{R}^m$ be open, and let $f \in C^\omega(U \times V, \mathbb{R}^m)$. Denote a point in $\mathbb{R}^n \times \mathbb{R}^m$ by $(x, y)$. If the matrix

$$D_2 f(x_0, y_0) = \begin{pmatrix} \frac{\partial f_1}{\partial y_1}(x_0, y_0) & \cdots & \frac{\partial f_1}{\partial y_m}(x_0, y_0) \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial y_1}(x_0, y_0) & \cdots & \frac{\partial f_m}{\partial y_m}(x_0, y_0) \end{pmatrix}$$

is invertible for $(x_0, y_0) \in U \times V$, then there exist
(i) neighbourhoods $U'$ of $x_0$ and $W'$ of $f(x_0, y_0)$, respectively, and
(ii) $g \in C^\omega(U' \times W', V)$
such that $f(x, g(x, z)) = z$ for all $(x, z) \in U' \times W'$.

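Proof Let us define $h \colon U \times V \to \mathbb{R}^n \times \mathbb{R}^m$ by $h(x, y) = (x, f(x, y))$. The Jacobian matrix of partial derivatives of $h$ is

$$Dh(x_0, y_0) = \begin{pmatrix}
1 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & 1 & 0 & \cdots & 0 \\
\frac{\partial f_1}{\partial x_1}(x_0, y_0) & \cdots & \frac{\partial f_1}{\partial x_n}(x_0, y_0) & \frac{\partial f_1}{\partial y_1}(x_0, y_0) & \cdots & \frac{\partial f_1}{\partial y_m}(x_0, y_0) \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1}(x_0, y_0) & \cdots & \frac{\partial f_m}{\partial x_n}(x_0, y_0) & \frac{\partial f_m}{\partial y_1}(x_0, y_0) & \cdots & \frac{\partial f_m}{\partial y_m}(x_0, y_0)
\end{pmatrix}.$$

By hypothesis, the lower right block is invertible. Since the upper left block is invertible and the upper right block is zero, it follows that the matrix is invertible. Therefore, by the Inverse Function Theorem there exists a neighbourhood $U' \times V'$ of $(x_0, y_0)$ such that $h|U' \times V'$ is a real analytic diffeomorphism. Given the form of $h$, $h(U' \times V') = U' \times W'$ and the inverse of $h|U' \times V'$ has the form

$$(x, z) \mapsto (x, g(x, z)) \tag{2.19}$$

for some real analytic $g \colon U' \times W' \to V' \subseteq V$. One can easily verify that $f(x, g(x, z)) = z$ for all $(x, z) \in U' \times W'$ by virtue of the fact that the map (2.19) is the inverse of $h|U' \times V'$.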
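A concrete instance may help fix the roles of $x$, $y$, and $z$ in the theorem. The example below is ours, not from the notes: take $n = m = 1$, $f(x, y) = x^2 + y^2$, and $(x_0, y_0) = (0, 1)$, so that $D_2 f(x_0, y_0) = 2 \ne 0$ and $f(x_0, y_0) = 1$. Near $(x, z) = (0, 1)$ the real analytic map $g(x, z) = \sqrt{z - x^2}$ satisfies $f(x, g(x, z)) = z$, which the sketch checks at a few sample points.

```python
import math


def f(x, y):
    """Example map f(x, y) = x^2 + y^2 (an illustrative choice)."""
    return x * x + y * y


def g(x, z):
    """Branch of the solution of f(x, y) = z through y0 = 1.

    Only valid while z - x^2 > 0, i.e. near (x, z) = (0, 1).
    """
    return math.sqrt(z - x * x)


samples = [(x, z) for x in (-0.2, 0.0, 0.1) for z in (0.9, 1.0, 1.2)]
residuals = [abs(f(x, g(x, z)) - z) for x, z in samples]
print(max(residuals))
```

Here the neighbourhoods $U'$ and $W'$ must be small enough that $z - x^2$ stays positive, which mirrors the local nature of the theorem.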

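The next result gives a local normal form for real analytic maps whose derivative is surjective at a point.

2.2.7 Theorem (Real analytic local submersion theorem) Let $U \subseteq \mathbb{R}^n$ be open and let $f \in C^\omega(U, \mathbb{R}^m)$. If the matrix

$$Df(x_0) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x_0) & \cdots & \frac{\partial f_1}{\partial x_n}(x_0) \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1}(x_0) & \cdots & \frac{\partial f_m}{\partial x_n}(x_0) \end{pmatrix}$$

has rank $m$ for $x_0 \in U$, then there exists
(i) a neighbourhood $U_1 \subseteq U$ of $x_0$,
(ii) an open set $U_2 \subseteq \mathbb{R}^m \times \mathbb{R}^{n-m}$, and
(iii) a real analytic diffeomorphism $\Phi \colon U_2 \to U_1$
such that $f \circ \Phi(y, z) = y$ for all $(y, z) \in U_2$.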

Proof Let U Rn be a complement to ker(D f (x0 )) so that D f (x0 )|U is an isomorphism onto Rm [Roman 2005, ]. Note that Rn Uker(D f (x0 )). Choose a basis (1 , . . . , m , 1 , . . . , nm ) for Rn for which (1 , . . . m ) is a basis for U and (1 , . . . , nm ) is a basis for ker(D f (x0 )). Dene an isomorphism : Rm Rnm Rn by (u v) = u1 1 + + um m + v1 1 + + vnm nm .

Then dene f : 1 (U) Rm by f = f 1 so that D f (x0 ) = D f (x0 ) 1 . Now dene : 1 (U) Rm Rnm by g ( y, z) = ( g f ( y z), z).


is Let ( y0 , z0 ) = 1 (x0 ). Note that the Jacobian matrix of partial derivatives of g ( y0 , z0 ) = Dg


f 1 (y , z ) y1 0 0

. . .

.. . .. .

f 1 (y , z ) ym 0 0

. . .

f 1 (y , z ) z 1 0 0

. . .

.. . .. .

f 1 (y , z ) znm 0 0

f m ( y0 , z0 ) y1

f m (y , z ) ym 0 0

f m ( y0 , z0 ) z 1

0 . . . 0

0 . . . 0

1 . . . 0

. . . f m ( y , z ) 0 . znm 0 0 . . . 1

We claim that the upper left block of the Jacobian matrix, i.e., D1 f ( y0 , z0 ), is invertible. This follows since is an isomorphism from Rm {0} onto U and since U is dened so ( y0 , z0 ) is an isomorphism. By the that D f (x0 )|U is an isomorphism. It then follows that D g |U is an Inverse Function Theorem there exists a neighbourhood U of ( y0 , z0 ) such that g (U ) and denote by : U2 U analytic dieomorphism onto its image. Denote U2 = g |U . Let us write ( y, z) = (1 ( y, z), 2 ( y, z)). Note that the inverse of g ( y , z ) = g (1 ( y, z), 2 ( y, z)) = ( g f (1 ( y, z), 2 ( y, z)), 2 (z)) = ( y, z) for all ( y, z) U2 . Thus 2 ( y, z) = z for all ( y, z) U2 . Moreover, if we dene U1 = (U ) and : U2 U1 by = , then we have y= f (1 ( y, z), 2 ( y, z)) = f ( y, z) = f ( y, z), as desired.

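Next we give a similar result when the derivative is injective at a point.

2.2.8 Theorem (Real analytic local immersion theorem) Let $U \subseteq \mathbb{R}^n$ be open and let $f \in C^\omega(U, \mathbb{R}^m)$. If the matrix

$$Df(x_0) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x_0) & \cdots & \frac{\partial f_1}{\partial x_n}(x_0) \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1}(x_0) & \cdots & \frac{\partial f_m}{\partial x_n}(x_0) \end{pmatrix}$$

has rank $n$ for $x_0 \in U$, then there exists
(i) a neighbourhood $V_1$ of $f(x_0)$,
(ii) a neighbourhood $V_2 \subseteq U \times \mathbb{R}^{m-n}$ of $(x_0, 0)$, and
(iii) a real analytic diffeomorphism $\Psi \colon V_1 \to V_2$
such that $\Psi \circ f(x) = (x, 0)$ for all $x \in U$ for which $(x, 0) \in V_2$.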
Proof Since rank(D f (x0 )) = n, D f (x0 ) is injective. Let V Rm be a complement to image(D f (x0 )). Choose a basis (1 , . . . , n , 1 , . . . , mn ) for Rm such that (1 , . . . , n ) is a basis for image(D f (x0 )) and such that (1 , . . . , mn ) is a basis for V. Dene an isomorphism : Rn Rmn Rm by (u v) = u1 1 + + un n + + v1 1 + + vmn mn .


Then dene f : U Rn Rmn by f = 1 f so that D f (x0 ) = 1 D f (x0 ). Also dene m n n n m : UR g R R by (x , y ) = g f (x) + (0, y). at (x0 , 0) is Note that the Jacobian matrix of g (x0 , 0) = Dg
f 1 (x , 0) x1 0

. . .

.. . .. .

f 1 (x , 0) xm 0

. . .

0 . .. . . . 0 1 . .. . . . 0

0 . . . 0 0 . . . 1

f m (x , 0) x1 0

f m (x , 0) xm 0

0 . . . 0

0 . . . 0

The upper left corner of this matrix is invertible since is an isomorphism of Rn {0} onto (x0 ) is an isomorphism and image(D f (x0 )) and since D f (x0 ) is injective. It follows that D g |V2 is a dieomorphism onto its so there exists a neighbourhood V2 of (x0 , 0) such that g (V2 ) and denote by : V V2 the inverse of g |V2 . Note that image. Let V = g (x, y) = (x, y) g for every (x, y) V . Therefore, (x, 0) = g f (x) = 1 f (x) = x. Thus, if we dene V1 = (V ) and = 1 , we see that the conclusions of the theorem hold.

In Figure 2.3 we depict the behaviour of maps satisfying the hypotheses of Theorems 2.2.7 and 2.2.8 after the application of the diffeomorphisms from the theorems. Even when the derivative is neither injective nor surjective, we can still provide a characterisation of the local behaviour of a map.

2.2.9 Theorem (Real analytic local representation theorem) Let $U \subseteq \mathbb{R}^n$ be open and let $f \in C^\omega(U, \mathbb{R}^m)$. If $k = \operatorname{rank}(Df(x_0))$ for $x_0 \in U$, then there exists
(i) a neighbourhood $U_1 \subseteq U$ of $x_0$,
(ii) an open set $U_2 \subseteq \mathbb{R}^k \times \mathbb{R}^{n-k}$,
(iii) a real analytic diffeomorphism $\Phi \colon U_2 \to U_1$,
(iv) an isomorphism $\psi \colon \mathbb{R}^m \to \mathbb{R}^k \oplus \mathbb{R}^{m-k}$, and
(v) a real analytic map $g \colon U_2 \to \mathbb{R}^{m-k}$
such that $\psi \circ f \circ \Phi(y, z) = (y, g(y, z))$ for all $(y, z) \in U_2$ and such that $Dg(\Phi^{-1}(x_0)) = 0$.
Proof Let (1 , . . . , k , 1 , . . . , mk ) be a basis for Rm such that (1 , . . . , k ) is a basis for image(D f (x0 )) and (1 , . . . , mk ) is a basis for a complement V to image(D f (x0 )). In like manner we let (a1 , . . . , ak , b1 , . . . , bnk ) be a basis for Rn such that (b1 , . . . , bnk ) is a basis

[Figure 2.3 The character of local submersions (left) and immersions (right)]

for ker(D f (x0 )) and (a1 , . . . , ak ) is a basis for a complement U to ker(D f (x0 )). Dene isomorphisms : Rk Rnk Rn by (u v) = u1 a1 + + uk ak + v1 b1 + + vnk bnk and : Rm Rk Rmk by 1 (r s) = r1 1 + + rk k + s1 1 + + smk mk . Let us dene f : U Rk Rmk by f = f . For x U write f (x) = ( f 1 (x), f 2 (x)). Note k that D f 1 (x0 ) is injective since is an isomorphism of image(D f (x0 )) with R {0}. Thus f 1 satises the hypotheses of the local submersion theorem, and so the conclusions of that theorem furnish us with a neighbourhood U1 of x0 , an open set U 2 Rk Rnk , and a real analytic dieomorphism : U2 U1 such that f 1 ( y, z) = y for all ( y, z) U2 . Let us m k dene g : U2 R by g = f 2 so that f ( y, z) = ( y, g( y, z)). Note that D f 2 (x0 ) = 0 since maps {0} Rnk to ker(D f (x0 )). Thus Dg(1 (x0 )) = 0. Finally, f ( y, z) = f ( y, z) = ( y, g( y, z)), giving the theorem.

Finally, we consider the case where the derivative is not necessarily injective nor surjective, but is of constant rank.


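2.2.10 Theorem (Real analytic local rank theorem) Let $U \subseteq \mathbb{R}^n$ be open and let $f \in C^\omega(U, \mathbb{R}^m)$. If the rank of the matrix

$$Df(x) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x) & \cdots & \frac{\partial f_1}{\partial x_n}(x) \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1}(x) & \cdots & \frac{\partial f_m}{\partial x_n}(x) \end{pmatrix}$$

is equal to $k$ in a neighbourhood $U' \subseteq U$ of $x_0$, then there exists
(i) a neighbourhood $U_1 \subseteq U$ of $x_0$,
(ii) an open set $U_2 \subseteq \mathbb{R}^k \times \mathbb{R}^{n-k}$,
(iii) an open set $V_1 \subseteq \mathbb{R}^m$,
(iv) an open set $V_2 \subseteq \mathbb{R}^k \times \mathbb{R}^{m-k}$, and
(v) real analytic diffeomorphisms $\Phi \colon U_2 \to U_1$ and $\Psi \colon V_1 \to V_2$
such that $\Psi \circ f \circ \Phi(y, z) = (y, 0)$ for all $(y, z) \in U_2$.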
Proof The local representation theorem furnishes us with a neighbourhood $U_1\subseteq U$ of $x_0$, an open subset $U_2\subseteq\mathbb{R}^k\times\mathbb{R}^{n-k}$, a real analytic diffeomorphism $\Phi\colon U_2\to U_1$, an isomorphism $\psi\colon\mathbb{R}^m\to\mathbb{R}^k\oplus\mathbb{R}^{m-k}$, and a real analytic map $g\colon U_2\to\mathbb{R}^{m-k}$ such that $\psi\circ f\circ\Phi(y,z)=(y,g(y,z))$ for all $(y,z)\in U_2$ and such that $Dg(\Phi^{-1}(x_0))=0$. We shall assume that $U_1\subseteq U'$. Let us denote $F(y,z)=(y,g(y,z))$. If $\operatorname{pr}_1\colon\mathbb{R}^k\oplus\mathbb{R}^{m-k}\to\mathbb{R}^k$ is the projection onto the first factor, we have

$$\operatorname{pr}_1\circ DF(y,z)\cdot(u,v)=u.$$

This means that the map

$$(u,0)\mapsto DF(y,z)\cdot(u,0)\in\operatorname{image}(DF(y,z))\qquad(2.20)$$

is injective. By hypothesis $\operatorname{image}(DF(y,z))$ has dimension $k$, and so the map (2.20) is an isomorphism for each $(y,z)\in U_2$. Now let $(u,Dg(y,z)\cdot(u,v))\in\operatorname{image}(DF(y,z))$. Since the map (2.20) is an isomorphism there exists $u'\in\mathbb{R}^k$ such that

$$DF(y,z)\cdot(u',0)=(u',Dg(y,z)\cdot(u',0))=(u',D_1g(y,z)\cdot u')=(u,Dg(y,z)\cdot(u,v)).$$

Thus $u'=u$ and

$$Dg(y,z)\cdot(u,v)=D_1g(y,z)\cdot u+D_2g(y,z)\cdot v=D_1g(y,z)\cdot u$$

for all $v\in\mathbb{R}^{n-k}$. Thus $D_2g(y,z)\cdot v=0$ for all $(y,z)\in U_2$ and $v\in\mathbb{R}^{n-k}$. Thus $g$ is not a function of $z$. Since

$$D_2F(y,z)\cdot v=(0,D_2g(y,z)\cdot v),$$

we conclude that $F$ also does not depend on $z$. Now define $\hat F\colon\operatorname{pr}_1(U_2)\to\mathbb{R}^k\oplus\mathbb{R}^{m-k}$ by $\hat F(y)=F(y,z_y)$ where $z_y\in\mathbb{R}^{n-k}$ is such that $(y,z_y)\in U_2$. It follows from the fact that the map (2.20) is an isomorphism and that $F$ is independent of $z$ that $D\hat F(y)$ is injective for every $y\in\operatorname{pr}_1(U_2)$. In particular, this holds at $\operatorname{pr}_1(x_0)$. We can now apply the local immersion theorem to give a neighbourhood $\hat V_1$ of $\hat F(\operatorname{pr}_1(x_0))$, a neighbourhood $\hat V_2\subseteq\operatorname{pr}_1(U_2)\times\mathbb{R}^{m-k}$ of $(x_0,0)$, and a diffeomorphism $\hat\Psi\colon\hat V_1\to\hat V_2$ such that $\hat\Psi\circ\hat F(y)=(y,0)$ for all $y\in\operatorname{pr}_1(U_2)$ such that $(y,0)\in\hat V_2$. Now let $(y,z)\in U_2$ and note that

$$(y,0)=\hat\Psi\circ\hat F(y)=\hat\Psi\circ F(y,z)=\hat\Psi\circ\psi\circ f\circ\Phi(y,z),$$

giving the theorem after taking $V_1=\psi^{-1}(\hat V_1)$ and $\Psi=\hat\Psi\circ\psi$.
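To make the conclusion of the rank theorem concrete, here is a small worked example. It is not taken from the notes: the map $f$ and the changes of coordinates $\Phi$ and $\Psi$ below are chosen purely for illustration, with $n=m=2$ and $k=1$.

```latex
% Illustrative example (an assumption, not from the source): a constant-rank map on R^2.
\[
f(x_1,x_2)=\bigl(x_1+x_2,\;(x_1+x_2)^2\bigr),\qquad
Df(x)=\begin{bmatrix}1&1\\ 2(x_1+x_2)&2(x_1+x_2)\end{bmatrix},
\qquad \operatorname{rank}Df(x)=1\ \text{for all } x.
\]
% Straighten the domain so that the first component of f becomes the coordinate y.
\[
\Phi(y,z)=(y-z,\;z)\quad\Longrightarrow\quad f\circ\Phi(y,z)=(y,\;y^2).
\]
% Here g(y,z)=y^2 does not depend on z, as the proof predicts.
% Straighten the codomain so that the image becomes the y-axis.
\[
\Psi(a,b)=(a,\;b-a^2)\quad\Longrightarrow\quad
\Psi\circ f\circ\Phi(y,z)=(y,\;y^2-y^2)=(y,0).
\]
```

Both $\Phi$ and $\Psi$ are polynomial with polynomial inverses, so they are real analytic diffeomorphisms of $\mathbb{R}^2$, and in this example the sets $U_1$, $U_2$, $V_1$, $V_2$ can all be taken to be $\mathbb{R}^2$.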

2.3 Real analytic differential geometry


In this section we review real analytic differential geometry. Here we shall only do three things: (1) provide the basic definitions for real analytic manifolds and maps; (2) formulate the important theorem of Grauert [1958] on the embedding of real analytic manifolds in Euclidean space; (3) state the principal extension and approximation results for real analytic functions. Other useful results in real analytic differential geometry will be stated in Section 2.4. We suppose the reader to be already acquainted with smooth differential geometry, and so we are short on motivation, intuition, and examples.

2.3.1 Real analytic manifolds, submanifolds, mappings, and vector bundles

The basic definitions of real analytic differential geometry follow those for smooth differential geometry, but with "smooth" replaced with "real analytic." So that there is no ambiguity, we shall quickly go through the definitions.

2.3.1 Definition (Real analytic charts, atlases, and differentiable structures) Let $S$ be a set. A chart for $S$ is a pair $(U,\phi)$ with
(i) $U$ a subset of $S$, and
(ii) $\phi\colon U\to\mathbb{R}^n$ an injection for which $\phi(U)$ is an open subset of $\mathbb{R}^n$.
A $C^\omega$-atlas for $S$ is a family $\mathscr{A}=((U_a,\phi_a))_{a\in A}$ of charts for $S$ with the properties that $S=\bigcup_{a\in A}U_a$, and that, whenever $U_a\cap U_b\neq\emptyset$, we have
(iii) $\phi_a(U_a\cap U_b)$ and $\phi_b(U_a\cap U_b)$ are open subsets of $\mathbb{R}^n$, and
(iv) the overlap map $\phi_{ab}\triangleq\phi_b\circ\phi_a^{-1}|\phi_a(U_a\cap U_b)$ is a $C^\omega$-diffeomorphism from $\phi_a(U_a\cap U_b)$ to $\phi_b(U_a\cap U_b)$.
Two $C^\omega$-atlases $\mathscr{A}_1$ and $\mathscr{A}_2$ are equivalent if $\mathscr{A}_1\cup\mathscr{A}_2$ is also a $C^\omega$-atlas. A $C^\omega$-differentiable structure, or a real analytic differentiable structure, on $S$ is an equivalence class of atlases under this equivalence relation. A $C^\omega$-differentiable manifold, or a $C^\omega$-manifold, or a real analytic manifold, $M$ is a set $S$ with a $C^\omega$-differentiable structure. An admissible chart for a manifold $M$ is a pair $(U,\phi)$ that is a chart for some atlas defining the differentiable structure. If all charts take values in $\mathbb{R}^n$ for some fixed $n$, then $n$ is the dimension of $M$, denoted by $\dim(M)$.

The manifold topology on a set $S$ with a differentiable structure is the topology generated by the domains of the admissible charts.
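As a sanity check on the overlap condition, the following worked example (an illustration added here, not part of the original notes) gives a two-chart $C^\omega$-atlas for the unit circle using stereographic projections. The chart names $\phi_1$, $\phi_2$ are chosen only for this example.

```latex
% Illustrative example (an assumption, not from the source): an analytic atlas on the circle.
\[
S=\mathbb{S}^1=\{(x,y)\in\mathbb{R}^2\mid x^2+y^2=1\},\qquad
U_1=\mathbb{S}^1\setminus\{(0,1)\},\quad U_2=\mathbb{S}^1\setminus\{(0,-1)\},
\]
\[
\phi_1(x,y)=\frac{x}{1-y},\qquad \phi_2(x,y)=\frac{x}{1+y},\qquad
\phi_1(U_1)=\phi_2(U_2)=\mathbb{R}.
\]
% Inverse of the first chart and the images of the overlap.
\[
\phi_1^{-1}(t)=\Bigl(\frac{2t}{t^2+1},\,\frac{t^2-1}{t^2+1}\Bigr),\qquad
\phi_1(U_1\cap U_2)=\phi_2(U_1\cap U_2)=\mathbb{R}\setminus\{0\}.
\]
% The overlap map is t -> 1/t, which is real analytic with real analytic inverse.
\[
\phi_{12}(t)=\phi_2\circ\phi_1^{-1}(t)
=\frac{2t/(t^2+1)}{1+(t^2-1)/(t^2+1)}=\frac{2t}{2t^2}=\frac{1}{t},
\qquad t\in\mathbb{R}\setminus\{0\}.
\]
```

Since $t\mapsto 1/t$ is its own inverse on $\mathbb{R}\setminus\{0\}$, conditions (iii) and (iv) of the definition both hold for this pair of charts.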


In Figure 2.4 we illustrate how one should think about the overlap condition.
[Figure 2.4: An interpretation of the overlap condition]

When we speak of manifolds, real analytic or otherwise, we will not generally suppose them to have a well-defined dimension, i.e., a manifold may have connected components with varying dimension, and the dimension of these components can even be unbounded. (For example, the disjoint union $\bigsqcup_{n\in\mathbb{Z}_{>0}}\mathbb{R}^n$ is a manifold with infinitely many connected components, and there is no bound on the dimension of the components.)

Now we turn to real analytic maps between manifolds.

2.3.2 Definition (Local representative of a map, real analytic map) Let $M$ and $N$ be real analytic manifolds and let $f\colon M\to N$ be a map. Let $x\in M$, let $(U,\phi)$ be a chart for which $U$ is a neighbourhood of $x$, and let $(V,\psi)$ be a chart for which $V$ is a neighbourhood of $f(x)$, assuming that $f(U)\subseteq V$ (if $f$ is continuous, $U$ can always be made sufficiently small so that this holds). The local representative of $f$ with respect to the charts $(U,\phi)$ and $(V,\psi)$ is the map $f_{\phi\psi}\colon\phi(U)\to\psi(V)$ given by

$$f_{\phi\psi}(\boldsymbol{x})=\psi\circ f\circ\phi^{-1}(\boldsymbol{x}).$$

With this notation we make the following definitions.
(i) We say that $f\colon M\to N$ is of class $C^\omega$, or is real analytic, if, for every point $x\in M$ and every chart $(V,\psi)$ for $N$ for which $f(x)\in V$, there exists a chart $(U,\phi)$ for $M$ such that $f(U)\subseteq V$ and for which the local representative $f_{\phi\psi}$ is of class $C^\omega$.
(ii) The set of class $C^\omega$ maps from $M$ to $N$ is denoted by $C^\omega(M,N)$.
(iii) We denote by $C^\omega(M)=C^\omega(M,\mathbb{R})$ the set of real analytic functions on $M$.
(iv) If $f$ is a bijection of class $C^\omega$, and if $f^{-1}$ is also of class $C^\omega$, then $f$ is a $C^\omega$-diffeomorphism or a real analytic diffeomorphism.
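The following worked example (added for illustration; it is not from the original notes) computes a local representative and shows why the chart domain $U$ may need to be shrunk so that $f(U)\subseteq V$. It assumes that the angle chart below is admissible for the usual real analytic structure on the circle, regarded as the unit circle in $\mathbb{C}$.

```latex
% Illustrative example (an assumption, not from the source): the squaring map on the circle.
\[
M=N=\mathbb{S}^1\subseteq\mathbb{C},\qquad f(z)=z^2,
\]
\[
V=\mathbb{S}^1\setminus\{-1\},\qquad \psi(e^{i\theta})=\theta\in(-\pi,\pi).
\]
% Taking U = V fails: f(e^{i\pi/2}) = -1 is not in V. Shrink the domain instead.
\[
U=\{e^{i\theta}\mid \theta\in(-\tfrac{\pi}{2},\tfrac{\pi}{2})\},\qquad \phi=\psi|U,
\qquad f(U)=\mathbb{S}^1\setminus\{-1\}\subseteq V.
\]
% The local representative is linear, hence real analytic.
\[
f_{\phi\psi}(\theta)=\psi\circ f\circ\phi^{-1}(\theta)=\psi(e^{2i\theta})=2\theta,
\qquad \theta\in(-\tfrac{\pi}{2},\tfrac{\pi}{2}).
\]
```

The same computation about any other point, using a rotated angle chart, gives a local representative of the form $\theta\mapsto2\theta+\text{const}$, which is how one would check part (i) of the definition at every point.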

[Figure 2.5: The local representative of a map]

In Figure 2.5 we depict how one should think about the local representative.

Just as in the smooth case, we have the following statement regarding real analyticity of maps.

2.3.3 Proposition (Characterisation of real analytic maps) Let $M$ and $N$ be real analytic manifolds. For a continuous map $f\colon M\to N$ the following statements are equivalent:
(i) $f\in C^\omega(M,N)$;
(ii) for atlases $\mathscr{A}=((U_a,\phi_a))_{a\in A}$ for $M$ and $\mathscr{B}=((V_b,\psi_b))_{b\in B}$ for $N$ with the property that, for every $a\in A$, there exists $b_a\in B$ such that $f(U_a)\subseteq V_{b_a}$, all local representatives $f_{\phi_a\psi_{b_a}}$, $a\in A$, are real analytic.
Proof See [Abraham, Marsden, and Ratiu 1988, Proposition 3.2.6].

Next we consider submanifolds.

2.3.4 Definition (Real analytic submanifold) A subset $S$ of a $C^\omega$-manifold $M$ is a $C^\omega$-submanifold or a real analytic submanifold if, for each point $x\in S$, there is an admissible chart $(U,\phi)$ for $M$ with $x\in U$, and such that
(i) $\phi$ takes its values in a product $\mathbb{R}^k\times\mathbb{R}^{n-k}$, and
(ii) $\phi(U\cap S)=\phi(U)\cap(\mathbb{R}^k\times\{0\})$.
A chart with these properties is a submanifold chart for $S$.

In Figure 2.6 we illustrate how one should think about submanifolds. We shall have a little more to say about submanifolds in Section 5.3.2.

[Figure 2.6: A submanifold chart]

Let us now turn to real analytic vector bundles. As with real analytic manifolds, the constructions here are the same as in the smooth case, but we quickly review the constructions so as to ensure no ambiguity.

2.3.5 Definition (Local vector bundle) Let $U\subseteq\mathbb{R}^n$ and $V\subseteq\mathbb{R}^m$ be open sets.
(i) A local vector bundle is a product $U\times\mathbb{R}^k$.

(ii) If $U\times\mathbb{R}^k$ and $V\times\mathbb{R}^l$ are local vector bundles, then a map $g\colon U\times\mathbb{R}^k\to V\times\mathbb{R}^l$ is a $C^\omega$-local vector bundle map or real analytic local vector bundle map if it has the form $g(x,v)=(g_1(x),g_2(x)\cdot v)$, where $g_1\colon U\to V$ and $g_2\colon U\to L(\mathbb{R}^k;\mathbb{R}^l)$ are real analytic.
(iii) If, in part (ii), $g_1$ is a real analytic diffeomorphism and $g_2(x)$ is an isomorphism for each $x\in U$, then we say that $g$ is a $C^\omega$-local vector bundle isomorphism or a real analytic local vector bundle isomorphism.

A vector bundle is constructed, just as was a manifold, by patching together local objects.

2.3.6 Definition (Vector bundle) A $C^\omega$-vector bundle, or a real analytic vector bundle, is a set $S$ that has an atlas $\mathscr{A}=\{(U_a,\phi_a)\}_{a\in A}$ where $\operatorname{image}(\phi_a)$ is a local vector bundle, $a\in A$, and for which the overlap maps are $C^\omega$-local vector bundle isomorphisms. Such an atlas is a $C^\omega$-vector bundle atlas. Two $C^\omega$-vector bundle atlases, $\mathscr{A}_1$ and $\mathscr{A}_2$, are equivalent if $\mathscr{A}_1\cup\mathscr{A}_2$ is a $C^\omega$-vector bundle atlas. A $C^\omega$-vector bundle structure, or a real analytic vector bundle structure, is an equivalence class of such atlases. A chart in one of these atlases is called an admissible vector bundle chart. A typical vector bundle will be denoted by $V$.

The base space $M$ of a vector bundle $V$ is given by all points $v\in V$ having the property that there exists an admissible vector bundle chart $(\mathcal{V},\psi)$ such that $\psi(v)=(\boldsymbol{x},0)\in U\times\mathbb{R}^k$. This definition may be shown to make sense, since the overlap maps are local vector bundle isomorphisms that map the zero vector in one local vector bundle to the zero vector in another. To any point $v\in V$ we associate a point $x\in M$ as follows. Let $(\mathcal{V},\psi)$ be an admissible vector bundle chart for $V$ around $v$. Thus $\psi(v)=(\boldsymbol{x},\boldsymbol{v})\in U\times\mathbb{R}^k$. Define $x=\psi^{-1}(\boldsymbol{x},0)$. Once again, since the overlap maps are local vector bundle isomorphisms, this definition makes sense. We denote the resulting map from $V$ to $M$ by $\pi$ and we call this the vector bundle projection. Sometimes we will write a vector bundle as $\pi\colon V\to M$.

One may verify the following properties of vector bundles.


2.3.7 Proposition Let $\pi\colon V\to M$ be a real analytic vector bundle. Then
(i) $M$ is a $C^\omega$-submanifold of $V$, and
(ii) $\pi$ is a real analytic surjective submersion.

When we wish to think of the base space $M$ as a submanifold of $V$, we shall call it the zero section and denote it by $Z(V)$. For $x\in M$, the set $\pi^{-1}(x)$ is the fibre over $x$, and is often written $V_x$. One may verify that the operations of vector addition and scalar multiplication defined on $V_x$ in a fixed vector bundle chart are actually independent of the choice made for this chart. Thus $V_x$ is indeed a vector space. We will sometimes denote the zero vector in $V_x$ as $0_x$. If $N\subseteq M$ is a submanifold, we denote by $V|N$ the restriction of the vector bundle to $N$, and we note that this is a vector bundle with base space $N$.

A $C^\omega$-section, or a real analytic section, of a real analytic vector bundle $\pi\colon V\to M$ is a real analytic map $\xi\colon M\to V$ such that $\xi(x)\in V_x$ for each $x\in M$. The set of analytic sections of $\pi\colon V\to M$ is denoted by $\Gamma^\omega(V)$. One can verify that the tangent bundle is a real analytic vector bundle. Real analytic sections of the tangent bundle are real analytic vector fields.

2.3.2 The Grauert–Morrey Embedding Theorem

In this section we will present the embedding theorem for real analytic manifolds due to Grauert [1958] and Morrey [1958]. A proof of this theorem is simply not practical, as a full exposition of the required machinery is itself enough material for a book. For example, the proof of Grauert relies on sheaves and that of Morrey, which only works in the compact case, relies on results from partial differential equations. What we will do, however, is provide some colour for the embedding theorem. We refer to [Krantz and Parks 2002, §6.4] for additional discussion.

For smooth manifolds, the embedding theorem of Whitney [1936] shows that every second-countable Hausdorff manifold can be embedded in a sufficiently large copy of Euclidean space as a submanifold. The proof of the so-called Whitney Embedding Theorem, while not trivial, is understandable by anyone with a background in basic differential geometry. The construction, however, involves partitions of unity in a fundamental way. For real analytic manifolds, there are no real analytic partitions of unity (we shall explore this in Section 2.3.3), so the proof of Whitney does not extend to give an embedding theorem for real analytic manifolds. Thus the embedding theorem for real analytic manifolds marks a prominent place where the real analytic theory departs from the smooth theory.

Another mark of the depth of the real analytic embedding theorem comes from a comparison with the theory of complex manifolds, i.e., manifolds where charts take values in $\mathbb{C}^n$ and overlap maps are required to be holomorphic. Some complex manifolds are rather rigid in the sense that they support few holomorphic functions. There is, however, an interesting class of complex manifolds that support many holomorphic functions; these are called Stein manifolds after Stein [1951]. To be precise, a complex manifold $M$ is a Stein manifold if


1. for every compact set $K\subseteq M$, the holomorphic convex hull of $K$, defined by
$$\operatorname{hconv}(K)=\{z\in M\mid |f(z)|\le\sup\{|f(z')|\mid z'\in K\}\ \text{for all holomorphic}\ f\colon M\to\mathbb{C}\},$$
is also compact,
2. whenever $z_1,z_2\in M$ are distinct, there exists a holomorphic function $f\colon M\to\mathbb{C}$ such that $f(z_1)\neq f(z_2)$, and
3. for every $z\in M$ there exist holomorphic functions $f_1,\dots,f_n\colon M\to\mathbb{C}$ which, in a neighbourhood of $z$, define coordinate functions for an admissible chart.

It is illustrative to mention that, for example, a connected Riemann surface is a Stein manifold if and only if it is not compact. Thus the Riemann sphere is not a Stein manifold; in fact the only holomorphic functions on the Riemann sphere are the constant functions [Noguchi 1998, Theorem 3.7.8]. However, a Stein manifold has many holomorphic functions. It is the existence of such holomorphic functions that allows one to prove an embedding theorem for Stein manifolds in $\mathbb{C}^N$ [Remmert 1955]. A discussion of Stein manifolds, including a proof of the embedding theorem, can be found in [Hörmander 1973].

Now the key connection to the real analytic embedding theorem is this. It turns out that a real analytic manifold can be regarded as the real part of a complex manifold [Whitney and Bruhat 1959]. Said otherwise, one can complexify a real analytic manifold into a complex manifold. Grauert [1958] proves that the complexification of a real analytic manifold is a Stein manifold by answering a question posed by Levi [1910] regarding the character of the domain of holomorphy of a holomorphic function. (Contributions in this direction are also made by Cartan [1957].) By this device, the embedding theorem for Stein manifolds of Remmert [1955] implies the following theorem, recalling that a proper map is one for which the preimages of compact sets are compact.

2.3.8 Theorem (Grauert–Morrey Embedding Theorem) If $M$ is a paracompact, real analytic, Hausdorff manifold, then there exist $N\in\mathbb{Z}_{>0}$ and a proper embedding $\iota\colon M\to\mathbb{R}^N$.

2.3.3 Extension of and approximation by real analytic maps

In this section we consider the important problem of extending real analytic functions, maps, and sections of vector bundles from local to global. This is a rather complicated issue, especially when compared to the smooth case where one has partitions of unity at one's disposal that make extension from local to global rather straightforward. If one considers the holomorphic case first, then something akin to panic can creep in. For example, as we mentioned in the previous section, the only holomorphic functions on the Riemann sphere are the constant functions. However, in a local chart there will be many holomorphic functions. However, also as we saw in the previous section, one should not have in mind the situation of a general complex manifold when one thinks about real analytic manifolds. One should think, actually, about Stein manifolds. For Stein manifolds, since they can be embedded in complex Euclidean space, one can use


these embeddings to extend locally defined objects to global objects. A powerful tool for doing this is the theory of sheaves, and many of the useful results in this line are presented in the language of sheaves. We shall only sketch the main ideas in this area, and provide references.

Before we get to the important extension results, let us provide a non-extension result.

2.3.9 Theorem (The principle of analytic continuation) If $M$ is a connected real analytic manifold, if $S\subseteq M$ is a set possessing an accumulation point, and if $f\in C^\omega(M)$ is a function for which $f(x)=0$ for every $x\in S$, then $f(x)=0$ for every $x\in M$.

Proof Let $x_0\in M$ be an accumulation point for $S$. Then, given a coordinate chart $(U,\phi)$ for $M$ with $x_0\in U$, there exists a sequence of distinct points $(x_j)_{j\in\mathbb{Z}_{>0}}$ in $S\cap U$ converging to $x_0$. Let us suppose that $\phi(x_0)=0$, that $\phi(U)=B^n(R,0)\subseteq\mathbb{R}^n$, and denote $\boldsymbol{x}_j=\phi(x_j)$, $j\in\mathbb{Z}_{>0}$. Note that the sequence $(\boldsymbol{x}_j/\|\boldsymbol{x}_j\|)_{j\in\mathbb{Z}_{>0}}$ on $\mathbb{S}^{n-1}$ must contain a convergent subsequence by the Bolzano–Weierstrass Theorem [Abraham, Marsden, and Ratiu 1988, Theorem 1.5.4] and compactness of $\mathbb{S}^{n-1}$. Thus we can suppose, by passing to a subsequence if necessary, that there exists $\boldsymbol{u}\in\mathbb{S}^{n-1}$ such that $\langle\boldsymbol{u},\boldsymbol{x}_j\rangle>0$ for every $j\in\mathbb{Z}_{>0}$. This amounts to asserting that all points in the sequence lie on one side of a hyperplane in $\mathbb{R}^n$. Let us also suppose, without loss of generality by passing to a subsequence if necessary, that the sequence $(\|\boldsymbol{x}_j\|)_{j\in\mathbb{Z}_{>0}}$ is monotonically decreasing. We claim that, for every $r\in\mathbb{Z}_{\ge0}$, (1) $D^r(f\circ\phi^{-1})(0)=0$ and (2) there exists a sequence $(\boldsymbol{x}_{r,j})_{j\in\mathbb{Z}_{>0}}$ in $B^n(R,0)$ converging to $0$ such that $D^r(f\circ\phi^{-1})(\boldsymbol{x}_{r,j})=0$. We prove this by induction on $r$. For $r=0$, by continuity of $f$ we have

$$f\circ\phi^{-1}(0)=\lim_{j\to\infty}f\circ\phi^{-1}(\boldsymbol{x}_j)=0,$$

and so our claim holds with $\boldsymbol{x}_{0,j}=\boldsymbol{x}_j$, $j\in\mathbb{Z}_{>0}$. Now suppose that, for $r\in\{0,1,\dots,k\}$, we have (1) $D^r(f\circ\phi^{-1})(0)=0$ and (2) there exists a sequence $(\boldsymbol{x}_{r,j})_{j\in\mathbb{Z}_{>0}}$ in $B^n(R,0)$ converging to $0$ such that $D^r(f\circ\phi^{-1})(\boldsymbol{x}_{r,j})=0$. Let $j\in\mathbb{Z}_{>0}$. By the Mean Value Theorem [Abraham, Marsden, and Ratiu 1988, Proposition 2.4.8], there exists

$$\boldsymbol{x}_{k+1,j}\in\{(1-t)\boldsymbol{x}_{k,j}+t\boldsymbol{x}_{k,j+1}\mid t\in[0,1]\}$$

such that $D^{k+1}(f\circ\phi^{-1})(\boldsymbol{x}_{k+1,j})=0$. Let us show that $\lim_{j\to\infty}\boldsymbol{x}_{k+1,j}=0$. Let $t_j\in[0,1]$ be such that $\boldsymbol{x}_{k+1,j}=(1-t_j)\boldsymbol{x}_{k,j}+t_j\boldsymbol{x}_{k,j+1}$. Then, since

$$\|\boldsymbol{x}_{k+1,j}\|\le(1-t_j)\|\boldsymbol{x}_{k,j}\|+t_j\|\boldsymbol{x}_{k,j+1}\|,$$

we indeed have $\lim_{j\to\infty}\boldsymbol{x}_{k+1,j}=0$. Continuity of $D^{k+1}(f\circ\phi^{-1})$ ensures that

$$D^{k+1}(f\circ\phi^{-1})(0)=\lim_{j\to\infty}D^{k+1}(f\circ\phi^{-1})(\boldsymbol{x}_{k+1,j})=0,$$

giving our claim by induction. Since $f\circ\phi^{-1}$ is analytic and all of its derivatives vanish at $0$, there is a neighbourhood $V_0$ of $0$ in $B^n(R,0)$ such that $f\circ\phi^{-1}$ vanishes identically on $V_0$. Therefore, there is a neighbourhood $U_0$ of $x_0$ such that $f$ vanishes identically on $U_0$. To show that $f$ vanishes on $M$ we use a lemma.


1 Lemma If $I\subseteq\mathbb{R}$ is an open interval and if $g\colon I\to\mathbb{R}$ is a real analytic function vanishing on a nonempty open subset $U\subseteq I$, then $g$ vanishes on $I$.

Proof Let $t_0\in U$. Since $g$ vanishes on $U$, all derivatives $g^{(r)}(t_0)$, $r\in\mathbb{Z}_{\ge0}$, are zero. Denote

$$J=\{t\in I\mid g^{(r)}(t)=0,\ r\in\mathbb{Z}_{\ge0}\}.$$

Since the functions $g^{(r)}$ are continuous, $(g^{(r)})^{-1}(0)$ is a closed subset of $I$ for every $r\in\mathbb{Z}_{\ge0}$. Since $J=\bigcap_{r\in\mathbb{Z}_{\ge0}}(g^{(r)})^{-1}(0)$, it follows that $J$ is a closed subset of $I$. Also, if $t\in J$ then the Taylor series for $g$ at $t$ is identically zero, and so there is a neighbourhood of $t$ on which $g$, and so all of its derivatives as well, vanish identically. Thus $J$ is open. Since $I$ is connected and $J$ is a nonempty open and closed subset, $J=I$.

Now we let $x_1\in M$ and show that $f(x_1)=0$. Since $M$ is connected and locally path connected, there exists a piecewise real analytic curve $\gamma\colon[0,1]\to M$ such that $\gamma(0)=x_0$ and $\gamma(1)=x_1$. (By Proposition 1.6.7 of [Abraham, Marsden, and Ratiu 1988], connected and locally path connected spaces are path connected. Thus, given $x_0,x_1\in M$, there exists a continuous curve $\gamma\colon[0,1]\to M$ such that $\gamma(0)=x_0$ and $\gamma(1)=x_1$. For every $t\in[0,1]$ one has a coordinate chart about $\gamma(t)$. Since $\gamma([0,1])$ is compact, it can be covered by finitely many such coordinate charts. One can then use these charts and the curve $\gamma$ to construct a piecewise real analytic curve connecting $x_0$ and $x_1$.) Let $I_1,\dots,I_k\subseteq[0,1]$ be a partition such that $\gamma|\operatorname{int}(I_j)$ is real analytic for each $j\in\{1,\dots,k\}$. Suppose, without loss of generality, that $\gamma'(t)$ is nowhere zero. Suppose that the intervals are ordered so that $\inf I_{j_1}\ge\sup I_{j_2}$ if $j_1>j_2$. Note that $f\circ\gamma$ is real analytic on $\operatorname{int}(I_j)$ by Proposition 2.2.2. Now note that $f\circ\gamma$ vanishes in a neighbourhood of $0\in I_1$. By the lemma, $f\circ\gamma$ vanishes on $\operatorname{int}(I_1)$, and by continuity $f\circ\gamma$ vanishes on $I_1$. If $t_1=\sup I_1$, let $(s_j)_{j\in\mathbb{Z}_{>0}}$ be a sequence in $I_1$ converging to $t_1$. Then $f(\gamma(s_j))=0$ for every $j\in\mathbb{Z}_{>0}$, and so from the first part of the proof we know that $f$ vanishes in a neighbourhood $U_1$ of $\gamma(t_1)$. The process can now be continued to show that $f\circ\gamma$ vanishes on $I_2,\dots,I_k$. In particular, this gives $f(x_1)=0$.
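The hypothesis that the accumulation point lies in $M$ cannot be dropped. The following small example (added here for illustration, not taken from the notes) contrasts a case where the theorem applies with one where it does not.

```latex
% Illustrative example (an assumption, not from the source).
% Case 1: M = R, S = {1/j : j = 1, 2, ...}. The point 0 is an accumulation point of S in M,
% so any f in C^omega(R) with f(1/j) = 0 for all j must vanish identically.
\[
M=\mathbb{R},\quad S=\{1/j\mid j\in\mathbb{Z}_{>0}\},\quad
f\in C^\omega(\mathbb{R}),\ f|S=0\ \Longrightarrow\ f=0.
\]
% Case 2: M = (0, infinity) with the same S. Now S has no accumulation point in M,
% and the theorem does not apply; indeed there is a nonzero analytic function vanishing on S.
\[
M=\mathbb{R}_{>0},\qquad f(x)=\sin\Bigl(\frac{\pi}{x}\Bigr),\qquad
f(1/j)=\sin(j\pi)=0\ \text{for all } j\in\mathbb{Z}_{>0},\qquad f\neq0.
\]
```

In the second case $f$ is real analytic on $\mathbb{R}_{>0}$ as a composition of real analytic maps, but it has no real analytic (or even continuous) extension to a neighbourhood of $0$, consistent with the first case.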

Of course, the theorem has the following corollary which indicates the rigidity of real analyticity.

2.3.10 Corollary (Functions agreeing on sets with an accumulation point) If $M$ is a connected real analytic manifold, if $S\subseteq M$ is a set possessing an accumulation point, and if $f,g\in C^\omega(M)$ are functions for which $f(x)=g(x)$ for every $x\in S$, then $f(x)=g(x)$ for every $x\in M$.

In particular, it follows from the theorem that a function vanishing on a nonempty open subset of a connected real analytic manifold must vanish everywhere. For smooth functions, of course, this is not the case; see, for example, the function in Figure 2.1. In the smooth case, such functions are crucial for constructing partitions of unity which are so valuable for making many constructions in smooth differential geometry. In the real analytic case, of course, partitions of unity do not exist. However, there are still similar sorts of theorems, and these go under the name of Cousin problems after Cousin [1895]. Let us state the two problems.

2.3.11 Problem (Cousin I) Let $M$ be a real analytic manifold and let $r\in\mathbb{Z}_{\ge0}\cup\{\infty,\omega\}$. Let there be given the following data:


(i) a countable family $(U_j)_{j\in\mathbb{Z}_{>0}}$ of open subsets of $M$ for which $M=\bigcup_{j\in\mathbb{Z}_{>0}}U_j$ and
(ii) for each $j,k\in\mathbb{Z}_{>0}$, a function $f_{jk}\in C^r(U_j\cap U_k)$ such that
(a) $f_{jk}(x)=-f_{kj}(x)$, $x\in U_j\cap U_k$, $j,k\in\mathbb{Z}_{>0}$, and
(b) $f_{jk}(x)+f_{kl}(x)+f_{lj}(x)=0$, $x\in U_j\cap U_k\cap U_l$, $j,k,l\in\mathbb{Z}_{>0}$.
Does there exist a family $(f_j)_{j\in\mathbb{Z}_{>0}}$ of functions $f_j\in C^r(U_j)$, $j\in\mathbb{Z}_{>0}$, such that $f_{jk}(x)=f_j(x)-f_k(x)$ for all $x\in U_j\cap U_k$, $j,k\in\mathbb{Z}_{>0}$?

2.3.12 Problem (Cousin II) Let $M$ be a real analytic manifold and let $r\in\mathbb{Z}_{\ge0}\cup\{\infty,\omega\}$. Let there be given the following data:
(i) a countable family $(U_j)_{j\in\mathbb{Z}_{>0}}$ of open subsets of $M$ for which $M=\bigcup_{j\in\mathbb{Z}_{>0}}U_j$ and
(ii) for each $j,k\in\mathbb{Z}_{>0}$, an $\mathbb{R}_{>0}$-valued function $f_{jk}\in C^r(U_j\cap U_k)$ such that
(a) $f_{jk}(x)f_{kj}(x)=1$, $x\in U_j\cap U_k$, $j,k\in\mathbb{Z}_{>0}$, and
(b) $f_{jk}(x)f_{kl}(x)f_{lj}(x)=1$, $x\in U_j\cap U_k\cap U_l$, $j,k,l\in\mathbb{Z}_{>0}$.
Does there exist a family $(f_j)_{j\in\mathbb{Z}_{>0}}$ of $\mathbb{R}_{>0}$-valued functions $f_j\in C^r(U_j)$, $j\in\mathbb{Z}_{>0}$, such that $f_{jk}(x)=f_j(x)f_k^{-1}(x)$ for all $x\in U_j\cap U_k$, $j,k\in\mathbb{Z}_{>0}$?

Of course, we are interested here in solutions to the Cousin problems for real analytic data. However, as we shall see, for the solution to Cousin II, it helps to have the problem formulated in general. Let us state the solutions to these problems.

2.3.13 Theorem (Solution of Cousin I) A solution to Cousin I always exists when $r=\omega$.

2.3.14 Theorem (Solution of Cousin II) A solution to Cousin II exists for $r=\omega$ if and only if a solution exists for $r=0$.

The solutions to the Cousin problems on Stein manifolds were given by [Cartan 1951-52]. They are deduced from Cartan's so-called Theorem B which has to do with the vanishing of certain cohomology groups of a coherent analytic sheaf. Note that the solution to Cousin II demonstrates what is often called Oka's principle: most things that are true in the continuous category are also true in the real analytic category.²

As we say, Cartan's Theorem B furnishes solutions to the Cousin problems. There is a Theorem A as well. As with his Theorem B, Theorem A is formulated by Cartan in the language of sheaves. It has a non-sheaf statement, however.

2.3.15 Theorem (Cartan's Theorem A) If $\pi\colon V\to M$ is a real analytic vector bundle and if $x_0\in M$, then there exist sections $\xi_1,\dots,\xi_n\in\Gamma^\omega(V)$ such that $V_{x_0}=\operatorname{span}_{\mathbb{R}}(\xi_1(x_0),\dots,\xi_n(x_0))$.

Let us conclude this section with the following approximation result proved by Whitney [1936]. Additional discussion can be found in [Shiga 1964].
²Actually, Oka did not care much about real analytic geometry, and his principle is typically applied to the holomorphic category on Stein manifolds. However, we adapt the principle to our needs.


2.3.16 Theorem (Approximation of smooth mappings by analytic mappings) Let $M$ and $N$ be real analytic manifolds, let $f\in C^\infty(M,N)$, and suppose that we are given the following data:
(i) a family $((U_a,\phi_a))_{a\in A}$ of coordinate charts for $M$;
(ii) for each $a\in A$, a compact subset $K_a\subseteq U_a$;
(iii) a family $((V_b,\psi_b))_{b\in B}$ of coordinate charts for $N$ with the property that, for each $a\in A$, there exists $b_a\in B$ such that $f(U_a)\subseteq V_{b_a}$;
(iv) a family $(\epsilon_a)_{a\in A}$ of positive real numbers;
(v) a sequence $(x_j)_{j\in\mathbb{Z}_{>0}}$ in $M$ for which the set $\{x_j\mid j\in\mathbb{Z}_{>0}\}$ has no accumulation points and for which, for each $j\in\mathbb{Z}_{>0}$, there exists $a_j\in A$ such that $x_j\in U_{a_j}$.
Then there exists $g\in C^\omega(M,N)$ such that
(vi) $g(U_a)\subseteq V_{b_a}$,
(vii) $D^rf_{\phi_{a_j}\psi_{b_{a_j}}}(\phi_{a_j}(x_j))=D^rg_{\phi_{a_j}\psi_{b_{a_j}}}(\phi_{a_j}(x_j))$ for each $r,j\in\mathbb{Z}_{>0}$, and
(viii) $\sup\{\|D^rf_{\phi_a\psi_{b_a}}(\boldsymbol{x})-D^rg_{\phi_a\psi_{b_a}}(\boldsymbol{x})\|\mid\boldsymbol{x}\in\phi_a(K_a)\}<\epsilon_a$ for every $a\in A$ and $r\in\mathbb{Z}_{>0}$.

Some interesting consequences of this theorem are the following. We suppose that the reader knows about jets, something we will formally discuss in Section 8.1.

2.3.17 Corollary (Analytic functions satisfying constraints) Let $M$ be a real analytic manifold and let $(x_j)_{j\in\mathbb{Z}_{>0}}$ be a sequence in $M$ for which $\{x_j\mid j\in\mathbb{Z}_{>0}\}$ has no accumulation point. For $r\in\mathbb{Z}_{>0}$ let $p_j\in J^r(M;\mathbb{R})$ be an $r$-jet at $x_j$, $j\in\mathbb{Z}_{>0}$. Then there exists $f\in C^\omega(M)$ such that $j_rf(x_j)=p_j$, $j\in\mathbb{Z}_{>0}$.

2.3.18 Corollary (Analytic sections satisfying constraints) Let $\pi\colon V\to M$ be a real analytic vector bundle and let $(x_j)_{j\in\mathbb{Z}_{>0}}$ be a sequence in $M$ for which $\{x_j\mid j\in\mathbb{Z}_{>0}\}$ has no accumulation point. For $r\in\mathbb{Z}_{>0}$ let $p_j\in J^rV$ be an $r$-jet at $x_j$, $j\in\mathbb{Z}_{>0}$. Then there exists $\xi\in\Gamma^\omega(V)$ such that $j_r\xi(x_j)=p_j$, $j\in\mathbb{Z}_{>0}$.

2.4 Local properties of analytic functions


In this section we explore some important very local properties of real analytic functions, and consequences of these. These local properties contribute greatly to the important role played by real analyticity in geometric control theory. Since a complete accounting of this topic is difficult to locate in the literature, we provide this here. Much of what we do here has a strong algebraic flavour. We assume that the reader is familiar with basic concepts from rings and modules, and knows about very basic concepts such as primes and irreducibles, ideals, submodules, and short exact sequences of modules. We refer to [Hungerford 1980] as a reference for this material.

2.4.1 Unique factorisation domains

In this section we review some facts about unique factorisation domains.


2.4.1 Definition (Unique factorisation domain) A unique factorisation domain is an integral domain $R$ such that:
(i) if $r\in R$ is nonzero and not a unit, then there exist irreducible elements $f_1,\dots,f_k\in R$ such that $r=f_1\cdots f_k$;
(ii) if, for irreducible elements $f_1,\dots,f_k,g_1,\dots,g_l\in R$, we have $f_1\cdots f_k=g_1\cdots g_l$, then $k=l$ and there exists $\sigma\in S_k$ such that $f_j\mid g_{\sigma(j)}$ and $g_{\sigma(j)}\mid f_j$ for each $j\in\{1,\dots,k\}$.

The first part of the definition tells us that every nonzero element of $R$ that is not a unit is expressible as a product of irreducibles. The second part of the definition tells us that the expression as a product of irreducibles is unique up to order and the factors differing by a unit.

It is often convenient to eliminate the ambiguity of knowing the irreducible factors only up to multiplication by units. Let us denote by $I_R$ the set of irreducible elements of a commutative unit ring $R$. On $I_R$ define a relation $\sim$ by $p_1\sim p_2$ if $p_2=up_1$ for some unit $u$. The following definition develops some terminology associated to this.

2.4.2 Definition (Selection of irreducibles) Let $R$ be a commutative unit ring and let $I_R$ be the set of irreducible elements in $R$, with $\sim$ the equivalence relation described above. A selection of irreducibles is a map $P\colon(I_R/\!\sim)\to I_R$ such that $P([p])\in[p]$. We shall denote a selection of irreducibles $P$ by $(p_a)_{a\in A_R}$ where $A_R=I_R/\!\sim$ and where $p_a=P(a)$.

The following result encapsulates why the notion of a selection of irreducibles is valuable.

2.4.3 Proposition (Unique factorisation determined by selection of irreducibles) If $R$ is a unique factorisation domain and if $(p_a)_{a\in A_R}$ is a selection of irreducibles, then, given a nonzero nonunit $r\in R$, there exist unique $p_{a_1},\dots,p_{a_k}\in(p_a)_{a\in A_R}$ and a unique unit $u\in R$ such that $r=up_{a_1}\cdots p_{a_k}$.
Proof Let $f_1,\dots,f_k$ be irreducibles such that $r=f_1\cdots f_k$. For $j\in\{1,\dots,k\}$, define $p_{a_j}\in\{p_a\mid a\in A_R\}$ so that $p_{a_j}\in[f_j]$. Then $f_j=u_jp_{a_j}$ for some unit $u_j\in R$, $j\in\{1,\dots,k\}$. Then we have

$$r=u_1\cdots u_k\,p_{a_1}\cdots p_{a_k},$$

giving the existence part of the result. Now suppose that

$$r=up_{a_1}\cdots p_{a_k}=u'p_{a'_1}\cdots p_{a'_{k'}}\qquad(2.21)$$

are two representations of the desired form. Since $(up_{a_1})p_{a_2}\cdots p_{a_k}$ and $(u'p_{a'_1})p_{a'_2}\cdots p_{a'_{k'}}$ are two factorisations by irreducibles, we immediately conclude that $k'=k$. For convenience, let us define $f_1=up_{a_1}$, $f_j=p_{a_j}$, $j\in\{2,\dots,k\}$, and $f'_1=u'p_{a'_1}$, $f'_j=p_{a'_j}$, $j\in\{2,\dots,k\}$. We can then assert the existence of $\sigma\in S_k$ such that $f'_j=v_jf_{\sigma(j)}$, $j\in\{1,\dots,k\}$, for units $v_1,\dots,v_k$. By the definition of a selection of irreducibles, for each $j\in\{1,\dots,k\}$ we have $f'_j=v_jf_{\sigma(j)}=w_jp_{b_j}$ for a unit $w_j$ and $p_{b_j}\in(p_a)_{a\in A_R}$. Now we have a few cases.
1. $f'_1=u'p_{a'_1}=v_1f_1=v_1up_{a_1}$: In this case we have $u'p_{a'_1}=uv_1p_{a_1}=w_1p_{b_1}$. We conclude that $p_{a'_1}\sim p_{a_1}\sim p_{b_1}$, implying that $p_{a'_1}=p_{a_1}=p_{b_1}$. We also conclude that $u'=uv_1=w_1$.
2. $f'_1=u'p_{a'_1}=v_1f_{\sigma(1)}=v_1p_{a_{\sigma(1)}}$ for $\sigma(1)\neq1$: Here we have $u'p_{a'_1}=v_1p_{a_{\sigma(1)}}=w_1p_{b_1}$, and as above we conclude that $p_{a'_1}=p_{a_{\sigma(1)}}=p_{b_1}$ and $u'=v_1=w_1$.
3. $f'_j=p_{a'_j}=v_jf_1=v_jup_{a_1}=w_jp_{b_j}$ for $j\neq1$: Here we conclude that $p_{a'_j}=p_{a_1}=p_{b_j}$ and $1_R=v_ju=w_j$.
4. $f'_j=p_{a'_j}=v_jf_{\sigma(j)}=v_jp_{a_{\sigma(j)}}=w_jp_{b_j}$ for $j\neq1$ and $\sigma(j)\neq1$: In this case we conclude that $p_{a'_j}=p_{a_{\sigma(j)}}=p_{b_j}$ and $1_R=v_j=w_j$.
We then see that $p_{a'_1}\cdots p_{a'_k}=p_{a_{\sigma(1)}}\cdots p_{a_{\sigma(k)}}$, and we immediately conclude from (2.21) that $u'=u$, and this completes the proof of uniqueness.
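To illustrate the proposition in the simplest case, the following is a minimal sketch assuming $R=\mathbb{Z}$, whose units are $\pm1$, with the selection of irreducibles given by the positive primes. The function name `factor_with_selection` is hypothetical and the trial-division method is chosen only for brevity; it is not part of the notes.

```python
def factor_with_selection(r: int):
    """Factor a nonzero nonunit integer r as r = u * p_1 * ... * p_k.

    Assumption for this sketch: R = Z and the selection of irreducibles
    picks the positive prime from each class {p, -p}. Returns (u, primes)
    with u in {1, -1} and primes listed in nondecreasing order.
    """
    if r in (0, 1, -1):
        raise ValueError("r must be a nonzero nonunit of Z")
    unit = 1 if r > 0 else -1  # the unique unit u of the proposition
    n = abs(r)
    primes = []
    d = 2
    # Trial division: every factor found this way is a positive prime.
    while d * d <= n:
        while n % d == 0:
            primes.append(d)
            n //= d
        d += 1
    if n > 1:
        primes.append(n)  # the remaining cofactor is itself prime
    return unit, primes


if __name__ == "__main__":
    u, ps = factor_with_selection(-60)
    print(u, ps)
```

Without a selection, $-60$ could equally be written as $(-2)\cdot2\cdot3\cdot5$ or $2\cdot2\cdot(-3)\cdot5$; fixing the positive primes pushes all of the sign ambiguity into the single unit $u$, which is what the uniqueness statement asserts.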

For unique factorisation domains we have the following valuable characterisation of primes and irreducibles.

2.4.4 Proposition (Primes and irreducibles in unique factorisation domains) If $R$ is a unique factorisation domain, then the following three statements concerning $p\in R$ are equivalent:
(i) $p$ is prime;
(ii) $p$ is irreducible;
(iii) if $d\mid p$ then either $d$ is a unit or $d=up$ for a unit $u$.
Proof The first two assertions we prove are valid for general integral domains.
(i) $\Longrightarrow$ (ii) Suppose that $p$ is prime and that $p=ab$. Then $p\mid(ab)$ and since $p$ is prime, without loss of generality we can assert that $p\mid a$. Thus $a=qp$ for some $q\in R$. Then $p=ab=qpb$ which implies that $qb=1_R$. Thus $b$ is a unit, and so $p$ is irreducible.
(ii) $\Longrightarrow$ (iii) Let $p$ be irreducible and suppose that $d\mid p$ so that $(p)\subseteq(d)$. Therefore, since $p$ is irreducible, we have $(p)=(d)$ or $(d)=R$. In the first case we have $p\mid d$, and so $p=ud$ for some unit $u$. In the second case, $d$ is a unit.
(iii) $\Longrightarrow$ (i) Suppose that $p\mid(ab)$ and that $p$ satisfies (iii). Using the properties of a unique factorisation domain, write $p=f_1\cdots f_k$ for irreducibles $f_1,\dots,f_k$. It follows from (iii) that $k=1$ since irreducibles are not units. Now write

$$a=g_1\cdots g_l,\qquad b=h_1\cdots h_m$$

for irreducibles $g_1,\dots,g_l,h_1,\dots,h_m$. Then there exists $r\in R$ such that

$$rf_1=g_1\cdots g_lh_1\cdots h_m.$$

Now write $r$ as a product of irreducibles: $r=s_1\cdots s_q$. Then

$$s_1\cdots s_qf_1=g_1\cdots g_lh_1\cdots h_m.$$

Using the definition of a unique factorisation domain we conclude that, for some $a'\in\{g_1,\dots,g_l,h_1,\dots,h_m\}$, we have $f_1\mid a'$ and $a'\mid f_1$. This allows us to conclude that either $p\mid a$ or $p\mid b$. Thus $p$ is prime.
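The implication from irreducible to prime genuinely uses unique factorisation. A standard example, added here for illustration and not taken from the notes, is the ring $\mathbb{Z}[\sqrt{-5}]$ with the norm $N(a+b\sqrt{-5})=a^2+5b^2$.

```latex
% Illustrative example (an assumption, not from the source): irreducible but not prime.
\[
R=\mathbb{Z}[\sqrt{-5}],\qquad N(a+b\sqrt{-5})=a^2+5b^2,\qquad N(\alpha\beta)=N(\alpha)N(\beta).
\]
% 2 is irreducible: if 2 = alpha * beta, then N(alpha) N(beta) = 4, and a^2 + 5 b^2 = 2
% has no integer solutions, so one factor has norm 1 and is therefore a unit.
\[
2=\alpha\beta\ \Longrightarrow\ \{N(\alpha),N(\beta)\}=\{1,4\}.
\]
% 2 is not prime: it divides a product without dividing either factor.
\[
2\mid 6=(1+\sqrt{-5})(1-\sqrt{-5}),\qquad
\frac{1\pm\sqrt{-5}}{2}\notin\mathbb{Z}[\sqrt{-5}].
\]
```

Consequently $6=2\cdot3=(1+\sqrt{-5})(1-\sqrt{-5})$ gives two essentially different factorisations into irreducibles, so $\mathbb{Z}[\sqrt{-5}]$ is an integral domain that is not a unique factorisation domain.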

Let us now examine polynomial rings over unique factorisation domains.

2.4.5 Definition (Primitive polynomial) Let $R$ be a unique factorisation domain and let $A=\sum_{j=0}^ka_j\xi^j\in R[\xi]$.
(i) A content of $A$ is a greatest common divisor of $\{a_0,a_1,\dots,a_k\}$.
(ii) $A$ is primitive if it has a content that is a unit in $R$.

We now record some results about polynomials over unique factorisation domains. We need a little warm up before getting to the main statement.

2.4.6 Definition (Fraction field) Let $R$ be an integral domain and define an equivalence relation $\sim$ in $R\times(R\setminus\{0_R\})$ by

$$(r,s)\sim(r',s')\iff rs'-r's=0_R.$$

The set of equivalence classes under this equivalence relation is the fraction field of $R$, and is denoted by $F_R$. The equivalence class of $(r,s)$ is denoted by $\frac{r}{s}$.

The statement of the next result concerning polynomials over a unique factorisation domain relies on the fact that the polynomial ring over an integral domain $R$ is naturally a subset of the polynomial ring over the fraction field $F_R$.

2.4.7 Proposition (Properties of polynomials over unique factorisation domains) Let $R$ be a unique factorisation domain with $F_R$ its fraction field. Then the following statements hold for polynomials $A,B\in R[\xi]\subseteq F_R[\xi]$.
(i) $A=c_AA'$ where $c_A$ is a content of $A$ and $A'\in R[\xi]$ is primitive;
(ii) if $c_A$ and $c_B$ are contents of $A$ and $B$, respectively, then $c_Ac_B$ is a content of $A\cdot B$;
(iii) if $A$ and $B$ are primitive, then $A\cdot B$ is primitive;
(iv) if $A$ and $B$ are primitive, then $A\mid B$ and $B\mid A$ in $R[\xi]$ if and only if $A\mid B$ and $B\mid A$ in $F_R[\xi]$;
(v) if $A$ is primitive and if $\deg(A)>0$, then $A$ is irreducible in $R[\xi]$ if and only if it is irreducible in $F_R[\xi]$.
Proof (i) Write $A = \sum_{j=0}^{k} a_j \xi^j$ and write $a_j = c_A a'_j$ for $j \in \{0, 1, \dots, k\}$. Then the result follows by taking $A' = \sum_{j=0}^{k} a'_j \xi^j$. (ii) By part (i), write $A = c_A A'$ and $B = c_B B'$ for $A'$ and $B'$ primitive. If $c'$ is a content for $A' \cdot B'$, it is easy to see that $c_A c_B c'$ is a content for $A \cdot B = (c_A A') \cdot (c_B B')$. Thus it suffices to show that $c'$ is a unit, i.e., that $A' \cdot B'$ is primitive. Suppose that $A' \cdot B'$ is not primitive and write $C' = A' \cdot B' = \bigl(c'_k = \sum_{j=0}^{k} a'_j b'_{k-j}\bigr)_{k \in \mathbb{Z}_{\ge 0}}$, where $A' = (a'_j)_{j \in \mathbb{Z}_{\ge 0}}$ and $B' = (b'_j)_{j \in \mathbb{Z}_{\ge 0}}$. Suppose that $p \in R$ is irreducible and that $p \mid c'_j$ for all $j$. If $c_{A'}$ is a content for $A'$ we have $p \nmid c_{A'}$ since $c_{A'}$ is a unit. Similarly, $p \nmid c_{B'}$ where $c_{B'}$ is a content for $B'$. Now define
$$n_A = \inf\{l \in \{0, 1, \dots, \deg(A)\} \mid p \mid a'_j,\ j \in \{0, 1, \dots, l-1\},\ p \nmid a'_l\},$$
$$n_B = \inf\{l \in \{0, 1, \dots, \deg(B)\} \mid p \mid b'_j,\ j \in \{0, 1, \dots, l-1\},\ p \nmid b'_l\}.$$
Note that $p \mid c'_{n_A + n_B}$, and since
$$c'_{n_A+n_B} = a'_0 b'_{n_A+n_B} + \cdots + a'_{n_A-1} b'_{n_B+1} + a'_{n_A} b'_{n_B} + a'_{n_A+1} b'_{n_B-1} + \cdots + a'_{n_A+n_B} b'_0,$$

we have $p \mid a'_{n_A} b'_{n_B}$, which implies that $p \mid a'_{n_A}$ or $p \mid b'_{n_B}$ since irreducibles are prime in unique factorisation domains (Proposition 2.4.4). This contradicts the definition of $n_A$ or of $n_B$, and so $A' \cdot B'$ is primitive. (iii) This follows directly from part (ii) since the product of units is again a unit. (iv) Since $R \subseteq F_R$, it is clear that if $A \mid B$ and $B \mid A$ in $R[\xi]$, then $A \mid B$ and $B \mid A$ in $F_R[\xi]$. Now suppose that $A \mid B$ and $B \mid A$ in $F_R[\xi]$. Then $A = U \cdot B$ where $U \in F_R[\xi]$ is a unit. This means that $U = u$ for some $u \in F_R$, and let us write $u = \frac{a}{b}$ for $a, b \in R$ with $b \neq 0_R$. We thus have $bA = aB$. Since $A$ and $B$ are primitive, if $c_A$ and $c_B$ are contents for $A$ and $B$, respectively, these must be units. Therefore, both $b$ and $bc_A$ are contents for $bA$ and both $a$ and $ac_B$ are contents for $aB$. This means that $a = bv$ for a unit $v \in R$ so that $bA = bvB$. Since $R[\xi]$ is an integral domain, this implies that $A = vB$ for a unit $v \in R$. Now, $A \mid B$ and $B \mid A$ in $R[\xi]$ since $v$ is also a unit in $R[\xi]$. (v) Suppose that $A$ is not irreducible in $F_R[\xi]$ and write $A = B \cdot C$ for $B, C \in F_R[\xi]$ both nonunits. We must therefore have $\deg(B), \deg(C) \ge 1$. Write
$$B = \sum_{j=0}^{k} \frac{a_j}{b_j} \xi^j, \qquad C = \sum_{j=0}^{l} \frac{c_j}{d_j} \xi^j$$
for $a_j, b_j \in R$ with $b_j \neq 0_R$ for $j \in \{0, 1, \dots, k\}$ and for $c_j, d_j \in R$ with $d_j \neq 0_R$ for $j \in \{0, 1, \dots, l\}$. Write $b = b_0 b_1 \cdots b_k$ and for $j \in \{0, 1, \dots, k\}$ define $\hat{b}_j = b_0 b_1 \cdots b_{j-1} b_{j+1} \cdots b_k$. Define $\hat{B} = \sum_{j=0}^{k} a_j \hat{b}_j \xi^j \in R[\xi]$ and write $\hat{B} = c_B B'$ where $c_B$ is a content of $\hat{B}$ and where $B' \in R[\xi]$ is primitive, by part (i). A direct computation then shows that $B = \frac{1}{b} \hat{B} = \frac{c_B}{b} B'$. An entirely similar computation gives $C = \frac{c_C}{d} C'$ where $c_C \in R$ and $C' \in R[\xi]$ is primitive. Therefore, since $A = B \cdot C$, we have $bdA = c_B c_C B' \cdot C'$. Since $A$ and $B' \cdot C'$ are primitive, the latter by part (iii), it follows that both $bd$ and $c_B c_C$ are contents for $bdA$. Thus $c_B c_C = u\,bd$ for a unit $u \in R$. Thus $bdA = bd\,u B' \cdot C'$, or $A = uB' \cdot C'$. Since $\deg(B') = \deg(B) \ge 1$ and $\deg(C') = \deg(C) \ge 1$, this implies that $A$ is not irreducible in $R[\xi]$. Now suppose that $A$ is irreducible in $F_R[\xi]$ and write $A = B \cdot C$ for $B, C \in R[\xi] \subseteq F_R[\xi]$. Thus either $B$ or $C$ must be a unit in $F_R[\xi]$, and so we must have either $\deg(B) = 0$ or $\deg(C) = 0$. Suppose, without loss of generality, that $\deg(B) = 0$ so that $B = b_0 \in R \setminus \{0\}$. Then, if $c_C$ is a content for $C$, $b_0 c_C$ is a content for $A = B \cdot C$. Since $A$ is primitive, $b_0 c_C$ must be a unit, and so, in particular, $b_0$ must be a unit in $R$. Thus $B$ is a unit in $R[\xi]$, showing that $A$ is irreducible in $R[\xi]$.
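
Parts (i) to (iii) of the proposition can be checked numerically in the special case $R = \mathbb{Z}$. The following is a minimal sketch under that assumption; the helper names `content`, `primitive_part`, and `poly_mul` are hypothetical, and polynomials are represented as coefficient lists with the constant term first.

```python
from functools import reduce
from math import gcd


def content(coeffs):
    """A content of a nonzero integer polynomial: the gcd of its coefficients."""
    return reduce(gcd, coeffs, 0)


def primitive_part(coeffs):
    """The primitive polynomial A' with A = content(A) * A' (part (i))."""
    c = content(coeffs)  # assumes the polynomial is nonzero, so c > 0
    return [a // c for a in coeffs]


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


A = [4, 6]      # 4 + 6*xi, content 2
B = [9, 0, 3]   # 9 + 3*xi^2, content 3

# Part (ii): the product of contents is a content of the product.
product_content = content(poly_mul(A, B))
# Part (iii): the product of the primitive parts is primitive.
primitive_product_content = content(poly_mul(primitive_part(A), primitive_part(B)))
```

Here contents are only determined up to the units $\pm 1$ of $\mathbb{Z}$; the sketch always returns the non-negative representative.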

Now, using the proposition, we can prove our main result concerning the factorisation properties of polynomials over unique factorisation domains.

2.4.8 Theorem (Polynomial rings over unique factorisation domains are unique factorisation domains) If $R$ is a unique factorisation domain, then $R[\xi]$ is a unique factorisation domain.

Proof Let $A \in R[\xi]$ be a nonzero nonunit. If $\deg(A) = 0$ then $A$ is an element of $R$ under the natural inclusion of $R$ in $R[\xi]$. In this case, $A$ possesses a factorisation as a product of irreducibles since $R$ is a unique factorisation domain. Now suppose that $\deg(A) \ge 1$, and by Proposition 2.4.7(i) write $A = c_A A'$ where $c_A \in R$ is a content of $A$ and where $A'$ is primitive. If $c_A$ is not a unit then write $c_A = c_{A,1} \cdots c_{A,l}$ where $c_{A,j} \in R$, $j \in \{1, \dots, l\}$, are irreducible, this being possible since $R$ is a unique factorisation domain. Note that the elements $c_{A,j}$, $j \in \{1, \dots, l\}$, are also irreducible thought of as elements of $R[\xi]$ (why?). Now, since $F_R[\xi]$ is a unique factorisation domain, being a Euclidean domain, write $A' = P'_1 \cdots P'_k$ where $P'_1, \dots, P'_k \in F_R[\xi]$ are irreducible. Now proceed as in the proof of Proposition 2.4.7(v) to show that, for $j \in \{1, \dots, k\}$, $P'_j = \frac{a_j}{b_j} P_j$ for $a_j, b_j \in R$ with $b_j \neq 0_R$ and with $P_j \in R[\xi]$ primitive. Since $\frac{a_j}{b_j}$ is a unit in $F_R$ and so in $F_R[\xi]$, it follows that $P_j$ is irreducible in $F_R[\xi]$, and so in $R[\xi]$ by Proposition 2.4.7(v). Writing $a = a_1 \cdots a_k$ and $b = b_1 \cdots b_k$, we have $A' = \frac{a}{b} P_1 \cdots P_k$, or $bA' = aP_1 \cdots P_k$. Since $A'$ and $P_1 \cdots P_k$ are primitive (the latter by Proposition 2.4.7(iii)), it follows that $a = ub$ for $u$ a unit in $R$. Therefore, if $c_A$ is not a unit, we have
$$A = c_A A' = c_{A,1} \cdots c_{A,l} (uP_1) P_2 \cdots P_k,$$
where $c_{A,1}, \dots, c_{A,l} \in R \subseteq R[\xi]$ and $uP_1, P_2, \dots, P_k \in R[\xi]$ are all irreducible in $R[\xi]$. If $c_A$ is a unit, then $A$ is primitive already and, taking $A' = A$, we can directly write $A = (uP_1) P_2 \cdots P_k$, where $uP_1, P_2, \dots, P_k \in R[\xi]$ are irreducible. This gives part (i) of Definition 2.4.1.

Now we verify part (ii) of Definition 2.4.1. We begin with a lemma. We already know from above that every nonzero nonunit of $R[\xi]$ possesses a factorisation as a product of irreducibles. The lemma guarantees that the factorisation is of a certain form.

1 Lemma If $R$ is a unique factorisation domain and if $A \in R[\xi]$ is written as a product of irreducibles, $A = F_1 \cdots F_m$, then there exist irreducibles $c_1, \dots, c_l \in R$ and irreducibles $P_1, \dots, P_k \in R[\xi]$ such that $l + k = m$ and such that $F_{j_r} = c_r$, $r \in \{1, \dots, l\}$, and $F_{j_{l+s}} = P_s$, $s \in \{1, \dots, k\}$, where $\{1, \dots, m\} = \{j_1, \dots, j_m\}$.

Proof From the first part of the proof of the theorem we can write $A = c_{A,1} \cdots c_{A,l} P_1 \cdots P_k$ for irreducibles $c_{A,1}, \dots, c_{A,l} \in R$ and irreducibles $P_1, \dots, P_k \in R[\xi]$. We thus have $F_1 \cdots F_m = c_{A,1} \cdots c_{A,l} P_1 \cdots P_k$. Let $\{j_1, \dots, j_{l'}\}$ be the indices from $\{1, \dots, m\}$ such that $\deg(F_j) = 0$ if and only if $j \in \{j_1, \dots, j_{l'}\}$. Denote by $\{j_{l'+1}, \dots, j_m\}$ the remaining indices, so that $\deg(F_j) \ge 1$ if and only if $j \in \{j_{l'+1}, \dots, j_m\}$. Since the polynomials $F_{j_{l'+1}}, \dots, F_{j_m}$ are irreducible, they are primitive, so that $c_{A,1} \cdots c_{A,l}$ and $F_{j_1} \cdots F_{j_{l'}}$ are both contents for $A$. Thus there exists a unit $u \in R$ such that $F_{j_1} \cdots F_{j_{l'}} = u c_{A,1} \cdots c_{A,l}$. By unique factorisation in $R$, $l' = l$ and there exists $\sigma \in S_l$ such that $F_{j_r} = u_{\sigma(r)} c_{A,\sigma(r)}$ for $r \in \{1, \dots, l\}$, and where $u_1, \dots, u_l$ are units in $R$. The result now follows by taking $c_r = u_{\sigma(r)} c_{A,\sigma(r)}$, $r \in \{1, \dots, l\}$, and $P_s = F_{j_{l+s}}$, $s \in \{1, \dots, k\}$.

Now, using the lemma, let $c_1 \cdots c_l P_1 \cdots P_k$ and $c'_1 \cdots c'_{l'} P'_1 \cdots P'_{k'}$ be two factorisations of $A$ by irreducibles, where $c_1, \dots, c_l, c'_1, \dots, c'_{l'} \in R$ are irreducible and $P_1, \dots, P_k, P'_1, \dots, P'_{k'} \in R[\xi]$ are irreducible. Since $P_1 \cdots P_k$ and $P'_1 \cdots P'_{k'}$ are primitive, $c_1 \cdots c_l$ and $c'_1 \cdots c'_{l'}$ are contents for $A$, and so there exists a unit $u \in R$ such that $c_1 \cdots c_l = u c'_1 \cdots c'_{l'}$. Since $R$ is a unique factorisation domain, $l = l'$ and there exists a permutation $\sigma \in S_l$ such that $c'_{\sigma(j)} = u_j c_j$ for $j \in \{1, \dots, l\}$, and for some set $u_1, \dots, u_l$ of units. Since $P_1 \cdots P_k$ and $P'_1 \cdots P'_{k'}$ have the same content, up to multiplication by a unit, it follows that $P_1 \cdots P_k = U P'_1 \cdots P'_{k'}$ where $U \in R[\xi]$ is a unit. Thus $U = v$ where $v \in R$ is a unit. Therefore, since $F_R[\xi]$ is a unique factorisation domain, $k = k'$ and there exists a permutation $\tau \in S_k$ such that $P'_{\tau(j)} = v_j P_j$ for $j \in \{1, \dots, k\}$, and where $v_j$ is a unit in $F_R$. Thus, in $F_R[\xi]$, we have $P'_{\tau(j)} \mid P_j$ and $P_j \mid P'_{\tau(j)}$, $j \in \{1, \dots, k\}$. By Proposition 2.4.7(iv) we then have, in $R[\xi]$, $P'_{\tau(j)} \mid P_j$ and $P_j \mid P'_{\tau(j)}$, $j \in \{1, \dots, k\}$. Therefore, there exist units $u_1, \dots, u_k \in R$ such that $P'_{\tau(j)} = u_j P_j$, $j \in \{1, \dots, k\}$. This then gives the uniqueness, up to units, of factorisation in $R[\xi]$.
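
As a small illustration of the two steps of the proof (extract a content, then factor the primitive part), consider the following worked example in $\mathbb{Z}[\xi]$. The particular polynomial is chosen here for illustration only.

```latex
Take $R = \mathbb{Z}$ and $A = 6\xi^2 - 6$. A content is $c_A = 6 = 2 \cdot 3$, and the primitive part is
$A' = \xi^2 - 1 = (\xi - 1)(\xi + 1)$, so that
\[
  A = 2 \cdot 3 \cdot (\xi - 1)(\xi + 1).
\]
Another factorisation into irreducibles is
\[
  A = (-2) \cdot 3 \cdot (1 - \xi)(\xi + 1),
\]
which differs from the first only by the units $\pm 1$ of $\mathbb{Z}[\xi]$, in agreement with the theorem.
```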

2.4.2 Noetherian rings and modules

The next bit of background in algebra we consider is part of what is commonly known as commutative algebra. The results here can be found in Section VIII.1 of [Hungerford 1980], for example.

2.4.9 Definition (Noetherian module, Noetherian ring) Let $R$ be a commutative unit ring and let $A$ be a unitary $R$-module. (i) The module $A$ is Noetherian if, for every family $(B_j)_{j \in \mathbb{Z}_{>0}}$ of submodules of $A$ satisfying $B_j \subseteq B_{j+1}$, $j \in \mathbb{Z}_{>0}$, there exists $k \in \mathbb{Z}_{>0}$ such that $B_j = B_k$ for $j \ge k$. (ii) The ring $R$ is Noetherian if, for every family $(I_j)_{j \in \mathbb{Z}_{>0}}$ of ideals of $R$ satisfying $I_j \subseteq I_{j+1}$, $j \in \mathbb{Z}_{>0}$, there exists $k \in \mathbb{Z}_{>0}$ such that $I_j = I_k$ for $j \ge k$.

For us, the following result will be key.

2.4.10 Proposition (Finitely generated property of submodules of Noetherian modules) Let $A$ be a unitary module over a commutative unit ring $R$. Then $A$ is Noetherian if and only if every submodule of $A$ is finitely generated.
Proof Suppose that $A$ is Noetherian and let $B \subseteq A$ be a submodule. Let $P(B)$ be the set of finitely generated submodules of $B$, which we partially order by $C_1 \preceq C_2$ if $C_1 \subseteq C_2$ for $C_1, C_2 \in P(B)$. Note that $P(B)$ is nonempty as it contains, for example, the trivial submodule. We claim that $P(B)$ contains a maximal element under the partial ordering. Suppose not so that, for each $C \in P(B)$, there exists $C' \in P(B)$ such that $C \subsetneq C'$. Use the Axiom of Choice to define $\phi\colon P(B) \to P(B)$ such that $C \subsetneq \phi(C)$. Recursively define $\psi\colon \mathbb{Z}_{\ge 0} \to P(B)$ by $\psi(0) = \{0\}$ and $\psi(j+1) = \phi(\psi(j))$, $j \in \mathbb{Z}_{\ge 0}$. This gives a sequence $(\psi(j))_{j \in \mathbb{Z}_{\ge 0}}$ of finitely generated submodules of $B$ such that $\psi(j) \subsetneq \psi(j+1)$, $j \in \mathbb{Z}_{\ge 0}$, contradicting the fact that $A$ is Noetherian. Thus $P(B)$ contains a maximal element $C$. We claim that $C = B$. Let $(c_1, \dots, c_k)$ be generators for $C$. For $b \in B$ let $D_b$ be the module generated by $(b, c_1, \dots, c_k)$. Thus $D_b \in P(B)$ and $C \subseteq D_b$. Since $C$ is maximal in $P(B)$ it follows that $C = D_b$ for every $b \in B$. This implies that $B \subseteq C$ and so $C = B$ since we obviously have $C \subseteq B$. Thus $B$ is finitely generated.

Now suppose that every submodule of $A$ is finitely generated. Let $(B_j)_{j \in \mathbb{Z}_{>0}}$ be a sequence of submodules of $A$ satisfying $B_j \subseteq B_{j+1}$ for $j \in \mathbb{Z}_{>0}$. Then take $B = \bigcup_{j \in \mathbb{Z}_{>0}} B_j$. It is easy to verify that $B$ is a submodule, and so is finitely generated. Therefore, there exist $b_1, \dots, b_m \in B$ which generate $B$. By definition of $B$, $b_l \in B_{j_l}$ for some $j_l \in \mathbb{Z}_{>0}$. Let $k = \max\{j_1, \dots, j_m\}$. Then $b_1, \dots, b_m \in B_k$ and so $B \subseteq B_k$. Therefore, $B_j = B = B_k$ for $j \ge k$. Thus $A$ is Noetherian.
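
The stabilisation of ascending chains can be seen concretely in the ring $\mathbb{Z}$, where the ideal $(n_1, \dots, n_j)$ is generated by $\gcd(n_1, \dots, n_j)$. The following is a minimal sketch under that assumption; the function names and the sample list are hypothetical and chosen only for illustration, and a finite list can of course only illustrate, not prove, stabilisation.

```python
from math import gcd


def chain_generators(elements):
    """Generators g_j of the ideals (n_1, ..., n_j) in Z, for j = 1, 2, ..."""
    gens, g = [], 0
    for n in elements:
        g = gcd(g, n)  # (g) + (n) = (gcd(g, n)) in Z
        gens.append(g)
    return gens


def stabilisation_index(gens):
    """Smallest 1-based k such that the listed ideals agree from position k on."""
    k = len(gens)
    while k > 1 and gens[k - 2] == gens[-1]:
        k -= 1
    return k


# Ascending chain (360) <= (360, 84) <= (360, 84, 90) <= ...
gens = chain_generators([360, 84, 90, 35, 12, 100])
k = stabilisation_index(gens)
```

In this example the chain becomes constant, equal to $(1) = \mathbb{Z}$, from the fourth ideal onwards.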

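The following lemma about Noetherian modules is useful.

2.4.11 Lemma (Noetherian modules and exact sequences) Let $R$ be a commutative unit ring and let $A$, $B$, and $C$ be unitary $R$-modules such that we have a short exact sequence
$$0 \longrightarrow A \xrightarrow{\ \phi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0.$$
Then the following statements are equivalent: (i) $B$ is Noetherian; (ii) $A$ and $C$ are Noetherian.

Proof (i) $\Rightarrow$ (ii) Let $B$ be Noetherian. If $A' \subseteq A$ is a submodule, then $\phi(A')$ is a submodule of $B$, and is isomorphic to $A'$ since $\phi$ is injective. Let $(A_j)_{j \in \mathbb{Z}_{>0}}$ be a family of submodules of $A$ satisfying $A_j \subseteq A_{j+1}$, $j \in \mathbb{Z}_{>0}$. Since $B$ is Noetherian there exists $k \in \mathbb{Z}_{>0}$ such that $\phi(A_j) = \phi(A_k)$ for $j \ge k$. Since $\phi$ is an $R$-module monomorphism this means that $A_j = A_k$ for $j \ge k$. Thus $A$ is Noetherian. If $C' \subseteq C$ is a submodule, then $\psi^{-1}(C')$ is a submodule of $B$. Let $(C_j)_{j \in \mathbb{Z}_{>0}}$ be a family of submodules of $C$ satisfying $C_j \subseteq C_{j+1}$, $j \in \mathbb{Z}_{>0}$. Since $B$ is Noetherian there exists $k \in \mathbb{Z}_{>0}$ such that $\psi^{-1}(C_j) = \psi^{-1}(C_k)$ for $j \ge k$. Surjectivity of $\psi$ then implies that $C_j = C_k$ for $j \ge k$. Thus $C$ is Noetherian. (ii) $\Rightarrow$ (i) Here we identify $A$ with the submodule $\phi(A)$ of $B$. First we claim that if $B' \subseteq B'' \subseteq B$ are submodules satisfying
$$B' \cap A = B'' \cap A, \qquad (B' + A)/A = (B'' + A)/A,$$
then $B' = B''$. Indeed, let $b'' \in B''$. Then there exists $b' \in B'$ such that $b' + A = b'' + A$. Thus $b'' - b' \in A$. Since $B' \subseteq B''$ we also have $b'' - b' \in B''$ and so $b'' - b' \in B'' \cap A = B' \cap A$. Therefore $b'' - b' \in B'$ and so $b'' \in B'$, as claimed. Now let $(B_j)_{j \in \mathbb{Z}_{>0}}$ be a family of submodules of $B$ such that $B_j \subseteq B_{j+1}$, $j \in \mathbb{Z}_{>0}$. Since $A$ is Noetherian there exists $l \in \mathbb{Z}_{>0}$ such that $B_j \cap A = B_l \cap A$ for $j \ge l$. Since $C$ is Noetherian, and noting that $C \simeq B/A$, there exists $m \in \mathbb{Z}_{>0}$ such that $(B_j + A)/A = (B_m + A)/A$ for $j \ge m$. Letting $k = \max\{l, m\}$ we see that
$$B_j \cap A = B_k \cap A, \qquad (B_j + A)/A = (B_k + A)/A$$
for $j \ge k$. Our claim at the beginning of this part of the proof then gives $B_j = B_k$ for $j \ge k$.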

The following consequence of the lemma will be useful.

2.4.12 Corollary (Finite direct sums of Noetherian modules are Noetherian) If $R$ is a commutative unit ring and if $A_1, \dots, A_k$ are unitary modules over $R$, then $\bigoplus_{j=1}^{k} A_j$ is Noetherian if $A_1, \dots, A_k$ are Noetherian.

Proof We prove this by induction on $k$, the case of $k = 1$ being trivial. Suppose the corollary holds for $k \in \{1, \dots, m\}$ and let $A_1, \dots, A_{m+1}$ be Noetherian modules. Consider the exact sequence
$$0 \longrightarrow A_1 \oplus \cdots \oplus A_m \longrightarrow A_1 \oplus \cdots \oplus A_m \oplus A_{m+1} \longrightarrow A_{m+1} \longrightarrow 0,$$
where the second arrow is the inclusion and the third arrow is the projection. By Lemma 2.4.11, $A_1 \oplus \cdots \oplus A_m \oplus A_{m+1}$ is Noetherian since $A_{m+1}$ and $A_1 \oplus \cdots \oplus A_m$ are Noetherian, the latter by the induction hypothesis.

Noetherian rings give rise to Noetherian modules.

2.4.13 Proposition (Modules over Noetherian rings are Noetherian) If $A$ is a finitely generated unitary module over a Noetherian ring $R$, then $A$ is Noetherian.
Proof Let $a_1, \dots, a_k$ be generators for $A$ and define $\phi\colon R^k \to A$ by
$$\phi(r_1, \dots, r_k) = r_1 a_1 + \cdots + r_k a_k.$$
Then $\phi$ is an $R$-module epimorphism and so $A \simeq R^k / \ker(\phi)$ by the first isomorphism theorem [Hungerford 1980, Theorem IV.1.7]. Consider now the exact sequence
$$0 \longrightarrow \ker(\phi) \longrightarrow R^k \longrightarrow A \simeq R^k / \ker(\phi) \longrightarrow 0,$$
where the second arrow is the inclusion and the third arrow is the projection. By Corollary 2.4.12, $R^k$ is Noetherian. By Lemma 2.4.11 it follows that $A$ is Noetherian.

The following result will also be of interest to us in Section 5.2.4 for showing that a certain ring is not Noetherian. The proof we give here comes from [Perdry 2004].

2.4.14 Theorem (Krull Intersection Theorem) If $R$ is Noetherian and $I \subseteq R$ is an ideal, then there exists $a \in I$ such that
$$(1 - a) \bigcap_{j \in \mathbb{Z}_{>0}} I^j = \{0\}.$$
Proof We first prove a lemma which gives an interesting class of Noetherian rings. This lemma goes under the name of the Hilbert Basis Theorem.

1 Lemma If $R$ is a Noetherian ring then the polynomial ring $R[\xi_1, \dots, \xi_n]$ is also a Noetherian ring.

Proof Since $R[\xi_1, \dots, \xi_n] \simeq R[\xi_1, \dots, \xi_{n-1}][\xi_n]$, it suffices by induction and Corollary 2.4.12 to prove the theorem when $n = 1$. First some terminology. If $P \in R[\xi]$ is given by $P = a_k \xi^k + \cdots + a_1 \xi + a_0$ with $a_k \neq 0$, then call $a_k$ the initial coefficient of $P$. Let $I$ be an ideal in $R[\xi]$. We will show $I$ is finitely generated, so that $R[\xi]$ is Noetherian by Proposition 2.4.10, recalling that the submodules of a ring, thought of as a module over itself, are precisely its ideals. We define a sequence $(P_k)_{k \in \mathbb{Z}_{\ge 0}}$ in $I$ as follows. Let $P_0 \in I$ be chosen so that
$$\deg(P_0) = \min\{\deg(P) \mid P \in I \setminus \{0\}\}.$$
Then, if $P_0, P_1, \dots, P_k$ have been chosen, choose $P_{k+1}$ so that
$$\deg(P_{k+1}) = \min\{\deg(P) \mid P \in I \setminus (P_0, P_1, \dots, P_k)\}.$$
Let $a_k$ be the initial coefficient of $P_k$, and consider the ideal $J$ of $R$ generated by the family $(a_k)_{k \in \mathbb{Z}_{\ge 0}}$ of initial coefficients. Since $R$ is Noetherian, by Proposition 2.4.10 there exists $m \in \mathbb{Z}_{>0}$ such that $J$ is generated by $(a_0, a_1, \dots, a_m)$. We claim that $I$ is generated by $P_0, P_1, \dots, P_m$. Indeed, suppose otherwise. Then $P_{m+1} \in I \setminus (P_0, P_1, \dots, P_m)$. Since $a_0, a_1, \dots, a_m$ generate the ideal $J$, $a_{m+1} = \sum_{k=0}^{m} r_k a_k$ for some $r_0, \dots, r_m \in R$. Let $Q = \sum_{k=0}^{m} r_k P_k \xi^{d_k}$ where $d_k = \deg(P_{m+1}) - \deg(P_k)$. One sees that $\deg(Q) = \deg(P_{m+1})$ and that the coefficient of $\xi^{\deg(P_{m+1})}$ in $Q$ is $a_{m+1}$. Thus $\deg(P_{m+1} - Q) < \deg(P_{m+1})$ and also $P_{m+1} - Q \in I$. Since $Q \in (P_0, P_1, \dots, P_m)$ it follows that $P_{m+1} - Q \notin (P_0, P_1, \dots, P_m)$ since $P_{m+1} \notin (P_0, P_1, \dots, P_m)$. But this contradicts the definition of $\deg(P_{m+1})$, and so we conclude that $I$ is generated by $P_0, P_1, \dots, P_m$.

By Proposition 2.4.10 there exist $a_1, \dots, a_n \in I$ generating $I$. Let $b \in \bigcap_{j \in \mathbb{Z}_{>0}} I^j$ and note that, since $b \in I^m$ for each $m \in \mathbb{Z}_{>0}$, we can write
$$b = p_{m1} a_1^m + \cdots + p_{mn} a_n^m$$
for some $p_{ml} \in R$, $l \in \{1, \dots, n\}$. Let us define
$$P_m = p_{m1} \xi_1^m + \cdots + p_{mn} \xi_n^m \in R[\xi_1, \dots, \xi_n].$$
For $m \in \mathbb{Z}_{>0}$ define an ideal $J_m \subseteq R[\xi_1, \dots, \xi_n]$ as being generated by $P_1, \dots, P_m$. Then we clearly have $J_m \subseteq J_{m+1}$ for $m \in \mathbb{Z}_{>0}$. By the Hilbert Basis Theorem (i.e., the lemma above), there exists $k \in \mathbb{Z}_{>0}$ such that $J_m = J_k$ for $m \ge k$. Thus $P_{k+1} \in J_k$ and so there exist $Q_1, \dots, Q_k \in R[\xi_1, \dots, \xi_n]$ such that
$$P_{k+1} = Q_1 P_k + \cdots + Q_k P_1,$$
and we may, moreover, without loss of generality assume that $Q_j$ is homogeneous of degree $j$ for each $j \in \{1, \dots, k\}$. By definition of $P_j$, $j \in \{1, \dots, k+1\}$, if we evaluate the above equality at $\xi_l = a_l$, $l \in \{1, \dots, n\}$, we have
$$b = \bigl(Q_1(a_1, \dots, a_n) + \cdots + Q_k(a_1, \dots, a_n)\bigr) b,$$
and the coefficient of $b$ on the right is in $I$, being a linear combination of products of positive powers of $a_1, \dots, a_n$. Thus $b \in bI$, this holding for any $b \in \bigcap_{j \in \mathbb{Z}_{>0}} I^j$. Therefore, for any $b \in \bigcap_{j \in \mathbb{Z}_{>0}} I^j$ we have $(1 - a_b) b = 0$ for some $a_b \in I$. Now, by Proposition 2.4.10, let $b_1, \dots, b_d$ generate $\bigcap_{j \in \mathbb{Z}_{>0}} I^j$, and let $a_1, \dots, a_d \in I$ satisfy $(1 - a_j) b_j = 0$, $j \in \{1, \dots, d\}$. If $b \in \bigcap_{j \in \mathbb{Z}_{>0}} I^j$ then we write $b = r_1 b_1 + \cdots + r_d b_d$ for $r_1, \dots, r_d \in R$, and determine that
$$(1 - a_1) \cdots (1 - a_d) b = (1 - a_1) \cdots (1 - a_d)(r_1 b_1 + \cdots + r_d b_d) = 0.$$
We then let $a \in R$ be such that $1 - a = (1 - a_1) \cdots (1 - a_d)$, and note that, actually, $a \in I$. This gives the theorem.
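
The conclusion of the theorem can be checked by brute force in the finite (hence Noetherian) rings $\mathbb{Z}/n\mathbb{Z}$ with $I$ a principal ideal. The following is a minimal sketch under those assumptions; all function names are hypothetical. It uses the fact that $I^j = (g^j)$ when $I = (g)$, and that the descending chain of powers is constant once two consecutive powers agree.

```python
def principal_ideal(g, n):
    """The ideal (g) of Z/n, as a frozenset of residues."""
    return frozenset((r * g) % n for r in range(n))


def intersection_of_powers(g, n):
    """The intersection of the ideals (g)^j = (g^j), j >= 1, in Z/n."""
    power, current = g % n, principal_ideal(g, n)
    while True:
        power = (power * g) % n
        nxt = principal_ideal(power, n)
        if nxt == current:  # the descending chain has stabilised
            return current
        current = nxt


def krull_witnesses(g, n):
    """All a in I = (g) with (1 - a) b = 0 for every b in the intersection."""
    ideal = principal_ideal(g, n)
    inter = intersection_of_powers(g, n)
    return sorted(a for a in ideal
                  if all(((1 - a) * b) % n == 0 for b in inter))


# In Z/6 with I = (2), every power of I equals I, and a = 4 works.
witnesses_mod_6 = krull_witnesses(2, 6)
# In Z/8 with I = (2), the powers shrink to {0}, so every a in I works.
witnesses_mod_8 = krull_witnesses(2, 8)
```

The first example shows that the intersection of the powers of $I$ need not be zero when $R$ is not local, which is why the factor $1 - a$ appears in the statement.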

For us, it is the following corollary that will be of immediate value. We recall that an ideal $I$ in a ring $R$ is maximal if $I \neq R$ and if, whenever $J \subseteq R$ is an ideal for which $I \subseteq J$, either $J = I$ or $J = R$. A local ring is a ring possessing a unique maximal ideal.

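2.4.15 Corollary (Krull Intersection Theorem for local rings) If $R$ is a Noetherian local ring with unique maximal ideal $\mathfrak{m}$, then $\bigcap_{j \in \mathbb{Z}_{>0}} \mathfrak{m}^j = \{0\}$.

Proof We first claim that, for a general local ring, $\mathfrak{m}$ consists of all of the nonunits of $R$. Indeed, if $a \in R$ is a nonunit then the ideal $(a)$ generated by $a$ is not equal to $R$ and so is contained in a maximal ideal. Therefore, we must have $(a) \subseteq \mathfrak{m}$. In particular, $a \in \mathfrak{m}$. Conversely, if $a \in \mathfrak{m}$ then $(a) \subseteq \mathfrak{m}$. Since $\mathfrak{m}$ is maximal, $\mathfrak{m} \neq R$ and so $(a) \neq R$. Thus $a$ is not a unit. Now we claim that if $a \in \mathfrak{m}$ then $1 - a$ is a unit. Indeed, if $1 - a$ were not a unit, then our argument above gives $1 - a \in \mathfrak{m}$ and so gives $1 \in \mathfrak{m}$. This, however, gives $\mathfrak{m} = R$ and so contradicts the maximality of $\mathfrak{m}$. Now, according to the Krull Intersection Theorem, let $a \in \mathfrak{m}$ be such that $(1 - a) \bigcap_{j \in \mathbb{Z}_{>0}} \mathfrak{m}^j = \{0\}$. Thus, if $b \in \bigcap_{j \in \mathbb{Z}_{>0}} \mathfrak{m}^j$, we have $(1 - a) b = 0$. By our assertion of the previous paragraph, $1 - a$ is a unit. Therefore, $b = 0$, as claimed.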
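
The corollary is often used in the contrapositive, to show that a local ring is not Noetherian. The following standard illustration is offered only as a sketch; it is not taken from these notes, and the notation $\mathscr{E}_0$ is introduced here for illustration.

```latex
Let $\mathscr{E}_0$ be the ring of germs at $0$ of smooth functions on $\mathbb{R}$. This is a local
ring whose maximal ideal $\mathfrak{m}$ consists of the germs vanishing at $0$, and $\mathfrak{m}^j$
consists of the germs whose derivatives of order less than $j$ all vanish at $0$. The germ of
\[
  f(x) = \begin{cases} \mathrm{e}^{-1/x^2}, & x \neq 0, \\ 0, & x = 0, \end{cases}
\]
is nonzero, yet all of its derivatives vanish at $0$, so that
$[f]_0 \in \bigcap_{j \in \mathbb{Z}_{>0}} \mathfrak{m}^j \neq \{0\}$.
By the corollary, $\mathscr{E}_0$ cannot be Noetherian.
```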

2.4.3 The Weierstrass Preparation Theorem

We now turn to one of the most important structural results for real analytic functions, the Weierstrass Preparation Theorem. In this section we shall set up, state, and prove the result. In the next section we shall see how the Weierstrass Preparation Theorem lends a great deal of algebraic structure to real analytic functions. The Weierstrass Preparation Theorem is concerned with the behaviour of real analytic functions in one of the variables of which they are a function. It is useful to have some notation for this. We let $U \subseteq \mathbb{R}^n$ be a neighbourhood of $0$ and $V \subseteq \mathbb{R}$ be a neighbourhood of $0$. A typical point in $U$ we shall denote by $x$ and a typical point in $V$ we shall denote by $y$. We are interested in functions in $C^\omega(U \times V)$. If $f \in C^\omega(U \times V)$ admits a power series expansion valid on all of $U \times V$, then we will write this power series expansion as
$$f(x, y) = \sum_{j=0}^{\infty} \sum_{I \in \mathbb{Z}^n_{\ge 0}} f_{I,j}\, x^I y^j.$$
Let us define a particular class of real analytic function where its character in one of the variables is polynomial.

2.4.16 Definition (Weierstrass polynomial) Let $U \times V \subseteq \mathbb{R}^n \times \mathbb{R}$ be a neighbourhood of $(0, 0)$. A function $W \in C^\omega(U \times V)$ is a Weierstrass polynomial of degree $k$ if there exist $w_0, w_1, \dots, w_{k-1} \in C^\omega(U)$ satisfying (i) $w_j(0) = 0$, $j \in \{0, 1, \dots, k-1\}$, (ii) $W(x, y) = y^k + w_{k-1}(x) y^{k-1} + \cdots + w_1(x) y + w_0(x)$ for all $(x, y) \in U \times V$.

With this as setup, we can state the Weierstrass Preparation Theorem, following [Krantz and Parks 2002, Theorem 6.1.3].

2.4.17 Theorem (Weierstrass Preparation Theorem) Let $U_A \times V_A \subseteq \mathbb{R}^n \times \mathbb{R}$ be a neighbourhood of $(0, 0)$ and suppose that $A \in C^\omega(U_A \times V_A)$ is given by
$$A(x, y) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \sum_{j=0}^{\infty} A_{I,j}\, x^I y^j, \qquad (x, y) \in U_A \times V_A,$$
where
$$A_{0,0} = A_{0,1} = \cdots = A_{0,k-1} = 0, \qquad A_{0,k} = 1$$
for some $k \in \mathbb{Z}_{>0}$. Then the following statements hold:
(i) if $B \in C^\omega(U_B \times V_B)$ has a convergent power series in a neighbourhood $U_B \times V_B$ of $(0, 0)$, then there exist unique real analytic functions $Q \in C^\omega(U_Q \times V_Q)$ and $R \in C^\omega(U_R \times V_R)$ defined on neighbourhoods $U_Q \times V_Q$ and $U_R \times V_R$, respectively, of $(0, 0)$ such that
(a) $Q$ and $R$ are represented by power series
$$Q(x, y) = \sum_{j=0}^{\infty} \sum_{I \in \mathbb{Z}^n_{\ge 0}} Q_{I,j}\, x^I y^j, \qquad (x, y) \in U_Q \times V_Q,$$
$$R(x, y) = \sum_{j=0}^{\infty} \sum_{I \in \mathbb{Z}^n_{\ge 0}} R_{I,j}\, x^I y^j, \qquad (x, y) \in U_R \times V_R,$$
where $R_{I,j} = 0$ for all $I \in \mathbb{Z}^n_{\ge 0}$ and $j \ge k$ and
(b) $B(x, y) = Q(x, y) A(x, y) + R(x, y)$ for all $x \in U \subseteq U_A \cap U_B \cap U_Q \cap U_R$ and $y \in V \subseteq V_A \cap V_B \cap V_Q \cap V_R$;
(ii) there exist unique $W \in C^\omega(U_W \times V_W)$ and $E \in C^\omega(U_E \times V_E)$ defined on neighbourhoods $U_W \times V_W$ and $U_E \times V_E$, respectively, of $(0, 0)$ such that
(a) $W$ is a Weierstrass polynomial of degree $k$,
(b) $E(0, 0) \neq 0$, and
(c) $E(x, y) A(x, y) = W(x, y)$ for all $x \in U \subseteq U_A \cap U_E \cap U_W$ and $y \in V \subseteq V_A \cap V_E \cap V_W$.
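Proof We store the following lemma for later use, using the notation that if $I, J \in \mathbb{Z}^n_{\ge 0}$ then $J \le I$ if $j_k \le i_k$ for each $k \in \{1, \dots, n\}$.

1 Lemma Let $a, b \in \mathbb{R}_{>0}$ with $b < a$ and let $I \in \mathbb{Z}^n_{\ge 0}$ be such that $i_k \neq 0$ for some $k \in \{1, \dots, n\}$. Then
(i) $$\sum_{\substack{J \le I \\ j_k < i_k}} \Bigl(\frac{a}{b}\Bigr)^{|J|} \le \frac{b a^{n-1}}{(a-b)^n} \Bigl(\frac{a}{b}\Bigr)^{|I|}$$ and
(ii) $$\sum_{\substack{J \le I \\ |J| < |I|}} \Bigl(\frac{a}{b}\Bigr)^{|J|} \le \frac{n b a^{n-1}}{(a-b)^n} \Bigl(\frac{a}{b}\Bigr)^{|I|}.$$

Proof Let us prove the first statement. Note that for $\alpha \neq 1$ and $r \in \mathbb{Z}_{>0}$ we have
$$(\alpha - 1) \sum_{s=0}^{r} \alpha^s = \alpha^{r+1} - 1 \quad \Longrightarrow \quad \sum_{s=0}^{r} \alpha^s = \frac{\alpha^{r+1} - 1}{\alpha - 1}. \tag{2.22}$$
Using this fact we compute
$$\begin{aligned}
\sum_{\substack{J \le I \\ j_k < i_k}} \Bigl(\frac{a}{b}\Bigr)^{|J|}
&= \Bigl(\sum_{j_k=0}^{i_k-1} \Bigl(\frac{a}{b}\Bigr)^{j_k}\Bigr) \prod_{\substack{l=1 \\ l \neq k}}^{n} \Bigl(\sum_{j_l=0}^{i_l} \Bigl(\frac{a}{b}\Bigr)^{j_l}\Bigr)
= \frac{(a/b)^{i_k} - 1}{(a/b) - 1} \prod_{\substack{l=1 \\ l \neq k}}^{n} \frac{(a/b)^{i_l+1} - 1}{(a/b) - 1} \\
&= \frac{b}{b^{|I|}} \frac{a^{i_k} - b^{i_k}}{a - b} \prod_{\substack{l=1 \\ l \neq k}}^{n} \frac{a^{i_l+1} - b^{i_l+1}}{a - b}
\le \frac{b}{b^{|I|}} \frac{a^{i_k}}{a - b} \prod_{\substack{l=1 \\ l \neq k}}^{n} \frac{a^{i_l+1}}{a - b} \\
&= \frac{b}{b^{|I|}} \frac{a^{|I|+n-1}}{(a-b)^n}
= \frac{b a^{n-1}}{(a-b)^n} \Bigl(\frac{a}{b}\Bigr)^{|I|},
\end{aligned}$$
as desired. The second assertion of the lemma follows from the first after noting that
$$\sum_{\substack{J \le I \\ |J| < |I|}} \Bigl(\frac{a}{b}\Bigr)^{|J|} \le \sum_{k=1}^{n} \sum_{\substack{J \le I \\ j_k < i_k}} \Bigl(\frac{a}{b}\Bigr)^{|J|}.$$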
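Now we proceed with the proof of the first assertion of the theorem. Let us write
$$B(x, y) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \sum_{j=0}^{\infty} B_{I,j}\, x^I y^j, \qquad (x, y) \in U_B \times V_B.$$
Let us first show that, at the level of formal power series, there exist unique formal power series in indeterminates $\xi = (\xi_1, \dots, \xi_n)$ and $\eta$,
$$\sum_{j=0}^{\infty} \sum_{I \in \mathbb{Z}^n_{\ge 0}} Q_{I,j}\, \xi^I \eta^j, \qquad \sum_{j=0}^{\infty} \sum_{I \in \mathbb{Z}^n_{\ge 0}} R_{I,j}\, \xi^I \eta^j,$$
with $R_{I,j} = 0$ for $I \in \mathbb{Z}^n_{\ge 0}$ and $j \ge k$ and such that, at the level of formal power series, $B = QA + R$. Note that the formula $B = QA + R$ reads
$$B_{I,j} = \sum_{J \le I} \sum_{m=0}^{j} Q_{J,m} A_{I-J,j-m} + R_{I,j}, \qquad I \in \mathbb{Z}^n_{\ge 0},\ j \in \mathbb{Z}_{\ge 0}. \tag{2.23}$$
For $j = k$, using the fact that $R_{I,k} = 0$ for all $I \in \mathbb{Z}^n_{\ge 0}$, the formula (2.23) reads
$$B_{I,k} = \sum_{J \le I} \sum_{m=0}^{k} Q_{J,m} A_{I-J,k-m} = Q_{I,0} + \sum_{\substack{J \le I \\ |J| < |I|}} \sum_{m=0}^{k} Q_{J,m} A_{I-J,k-m},$$
since $A_{0,j} = 0$ for $j \in \{0, 1, \dots, k-1\}$ and $A_{0,k} = 1$. Thus
$$Q_{I,0} = B_{I,k} - \sum_{\substack{J \le I \\ |J| < |I|}} \sum_{m=0}^{k} Q_{J,m} A_{I-J,k-m}, \qquad I \in \mathbb{Z}^n_{\ge 0}. \tag{2.24}$$
From this we infer that $Q_{I,0}$ is determined uniquely from $A$ and $B$, and the set of $Q_{J,m}$, where $J \le I$, $|J| < |I|$, and $m \in \{0, 1, \dots, k\}$. A particular case of the formula (2.24) is
$$Q_{0,0} = B_{0,k}. \tag{2.25}$$
Therefore, starting from (2.25), we can recursively and uniquely define $Q_{I,0}$ for all $I \in \mathbb{Z}^n_{\ge 0}$. Now take $j = k + l$ in (2.23) for $l \in \mathbb{Z}_{>0}$. Then we have, using the fact that $R_{I,j} = 0$ for $j \ge k$,
$$\begin{aligned}
B_{I,k+l} &= \sum_{J \le I} \sum_{m=0}^{k+l} Q_{J,m} A_{I-J,k+l-m}
= \sum_{m=0}^{k+l} Q_{I,m} A_{0,k+l-m} + \sum_{\substack{J \le I \\ |J| < |I|}} \sum_{m=0}^{k+l} Q_{J,m} A_{I-J,k+l-m} \\
&= Q_{I,l} + \sum_{m=0}^{l-1} Q_{I,m} A_{0,k+l-m} + \sum_{\substack{J \le I \\ |J| < |I|}} \sum_{m=0}^{k+l} Q_{J,m} A_{I-J,k+l-m},
\end{aligned}$$
using the fact that $A_{0,j} = 0$ for $j \in \{0, 1, \dots, k-1\}$ and $A_{0,k} = 1$. Thus we have
$$Q_{I,l} = B_{I,k+l} - \sum_{m=0}^{l-1} Q_{I,m} A_{0,k+l-m} - \sum_{\substack{J \le I \\ |J| < |I|}} \sum_{m=0}^{k+l} Q_{J,m} A_{I-J,k+l-m} \tag{2.26}$$
for $I \in \mathbb{Z}^n_{\ge 0}$ and $l \in \mathbb{Z}_{>0}$. From this we infer that we can solve uniquely for $Q_{I,l}$, $I \in \mathbb{Z}^n_{\ge 0}$, $l \in \mathbb{Z}_{>0}$, in terms of $A$ and $B$, the $Q_{I,m}$ with $m < l$, and the set of $Q_{J,m}$ with $J \le I$, $|J| < |I|$, and $m \in \{0, 1, \dots, k+l\}$. When $I = 0$, in particular, the formula (2.26) reads
$$Q_{0,l} = B_{0,k+l} - \sum_{m=0}^{l-1} Q_{0,m} A_{0,k+l-m},$$
showing that we can recursively define $Q_{0,l}$ for $l \in \mathbb{Z}_{>0}$. Then (2.26) can be recursively applied to determine $Q_{I,l}$ for $I \in \mathbb{Z}^n_{\ge 0}$ and $l \in \mathbb{Z}_{>0}$. Finally, for $I \in \mathbb{Z}^n_{\ge 0}$ and $j \in \{0, 1, \dots, k-1\}$, we can directly apply (2.23) to obtain
$$\begin{aligned}
R_{I,j} &= B_{I,j} - \sum_{J \le I} \sum_{m=0}^{j} Q_{J,m} A_{I-J,j-m}
= B_{I,j} - \sum_{m=0}^{j} Q_{I,m} A_{0,j-m} - \sum_{\substack{J \le I \\ |J| < |I|}} \sum_{m=0}^{j} Q_{J,m} A_{I-J,j-m} \\
&= B_{I,j} - \sum_{\substack{J \le I \\ |J| < |I|}} \sum_{m=0}^{j} Q_{J,m} A_{I-J,j-m},
\end{aligned} \tag{2.27}$$
using the fact that $A_{0,j} = 0$ for $j \in \{0, 1, \dots, k-1\}$. Thus $R_{I,j}$ is uniquely determined from the coefficients of $A$ and $B$, and from the set of $Q_{J,m}$ with $J \le I$, $|J| < |I|$, and $m \in \{0, 1, \dots, k-1\}$. The preceding computations show that unique formal power series exist for $Q$ and $R$ such that (2.23) holds. It remains to show that the resulting power series for $Q$ and $R$ converge. By Theorem 2.1.16 there exist $b, c \in \mathbb{R}_{>0}$ such that
$$\max\{|A_{I,j}|, |B_{I,j}|\} \le b c^{|I|+j}, \qquad I \in \mathbb{Z}^n_{\ge 0},\ j \in \mathbb{Z}_{\ge 0}.$$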

Let , , R>0 be chosen so that > b, , > c, and bck 1 < , 3 We claim that |QI, j | j |I| , I Zn 0 , j Z0 . bck+1 1 < , c 3 n1 1 < . n k 1 3 c ( c) ( c) nbk+1

(2.28)

We prove this by induction on |I| + j. By (2.25) we have |Q0,0 | = |B0,k | b < , giving (2.28) for |I| + j = 0. Now assume that (2.28) holds for I Z0 and j Z0 such that |I| + j = r 1. Then let I Zn and j Z0 be such that |I| + j = r. By (2.26) we have 0 |QI, j | bc|I|+k+ j +
l 1 m=0 k+ j

m |I| bck+ jm +
J I m =0 | J | <| I | j j 1 m =0

m | J| bc|I|| J|+k+ jm
k+ j

j |I|

bck c

|I |

c j k + bc

c +

|I |

b
m=0

m J I | J|<|I|

| J|

By denition of and since , > c, bck c Using (2.22) we compute c j k bc


j1 m=0 j

|I |

1 < . 3

c j k (/c) j 1 1 j cj bc = bck+1 j (/c) 1 c 1 j bck+1 1 < . j c c 3

bck+1

By (2.22) and the lemma above we compute c


j

|I |

k+ j

b
m =0

m J I | J | <| I |

| J|

|I|

(/c)k+ j+1 1 ncn1 (/c) 1 ( c)n c

|I |

b k+ j+1 ck k+1

k+1

k+ j+1 ck+ j+1 ncn1 c ( c)n

ncn1 1 < . n k 3 c ( c) ( c) b

Combining the previous three estimates we obtain |QI, j | j |I| ,

as in (2.28). Now, given that (2.28) holds, we claim that QI, j , I Zn , j Z0 , denes a convergent 0 power series. To see this, let (0, 1) and let r, R>0 be chosen so that r, = . If (x, y) Rn R satisfy |x j | < r, j {1, . . . , n}, and | y| < , then

|QI, j ||x| | y|
j=0 IZn 0 m=0 IZn j =0 0 | I | =m

(r) () =
j m m=0 JZn+1 | J|=m
0

=
m m =0

n+m m . n

This last series converges by the ratio test. Therefore, the series

QI, j xI y j
j=0 IZn 0

converges for (x, y) satisfying |x j | < r, j {1, . . . , n}, and | y| < . Since Q is analytic, R = B QA is also real analytic, and so the power series for R also converges. This gives the rst part of the theorem. For the second part of the theorem, dene B(x, y) = yk . Then apply the rst part of the theorem to give Q and R such that B = QA + R in a neighbourhood of (0, 0). Then dene W = B R and E = Q. Clearly W = EA. Since B0,k = 1, by (2.25) we have Q(0, 0) = Q0,0 = 1 0. If we apply (2.27) with I = 0 we have R0, j = B0, j for j {1, . . . , n}, giving W0, j = 0 for j {1, . . . , n}. Therefore, noting that RI, j = 0 for j k.
k

W (x, y) =
j=0 IZn 0 | I | >0

(BI, j RI, j )xI y j ,

which shows that W is a Weierstrass polynomial.
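
The formal part of the proof is constructive: the recursions (2.24) to (2.27) compute the coefficients of $Q$ and $R$ one at a time. The following is a minimal sketch of those recursions in the special case $n = 1$, so that multi-indices $I$ are single non-negative integers. The function name `weierstrass_divide` and the dictionary representation `{(i, j): coefficient}` for $\sum \text{coefficient} \cdot x^i y^j$ are assumptions made here for illustration; convergence is not addressed.

```python
from functools import lru_cache


def weierstrass_divide(A, B, k):
    """Return functions q(i, l) and r(i, j) giving the coefficients of Q and R.

    A and B map (i, j) to the coefficient of x**i * y**j (one x-variable).
    A must satisfy A[0, j] = 0 for j < k and A[0, k] = 1.
    """
    def a(i, j):
        return A.get((i, j), 0)

    def b(i, j):
        return B.get((i, j), 0)

    assert all(a(0, j) == 0 for j in range(k)) and a(0, k) == 1

    @lru_cache(maxsize=None)
    def q(i, l):
        # (2.24) when l = 0 and (2.26) when l > 0; (2.25) is the case i = l = 0.
        total = b(i, k + l)
        total -= sum(q(i, m) * a(0, k + l - m) for m in range(l))
        total -= sum(q(j, m) * a(i - j, k + l - m)
                     for j in range(i) for m in range(k + l + 1))
        return total

    def r(i, j):
        # (2.27), for j < k; R has no terms of degree >= k in y.
        if j >= k:
            return 0
        return b(i, j) - sum(q(ip, m) * a(i - ip, j - m)
                             for ip in range(i) for m in range(j + 1))

    return q, r


# Example: divide B = y^2 by A = y - x (so k = 1).
# One expects y^2 = (y + x)(y - x) + x^2, i.e. Q = x + y and R = x^2.
q, r = weierstrass_divide({(0, 1): 1, (1, 0): -1}, {(0, 2): 1}, k=1)
```

In the example the recursion reproduces ordinary polynomial division by a polynomial that is monic in $y$, which is the situation exploited in Lemma 2.4.20 below.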

The following technical lemmata will be helpful when we subsequently work with the Weierstrass Preparation Theorem. We rst provide a class of analytic functions satisfying the hypotheses of the Weierstrass Preparation Theorem. 2.4.18 Denition (Normalised function) For an open subset U V Rn R, a function f C (U) is normalised if, in a neighbourhood of (0, 0),

f (x, y) =
j =0 IZn 0

I , j xI y j ,

where 0,0 = 0,1 = = 0,k1 = 0 and 0,k = 1 for some k Z>0 .

The following lemma says that many functions are, in fact, normalisable in a simple manner. 2.4.19 Lemma (Functions satisfying the hypotheses of the Weierstrass Preparation Theorem) Let U V Rn R and let f1 , . . . , fk C (U) be nonzero functions with the

property that fj (0, 0) = 0, j {1, . . . , k}. Then there exists an orthogonal transformation : Rn+1 Rn+1 such that fj fj , j {1, . . . , k}, are normalised. Consequently, there exist E1 , . . . , Ek , analytic in a neighbourhood U V of (0, 0) and nonzero at (0, 0), and Weierstrass polynomials W1 , . . . , Wk analytic in U such that fj (x, y) = Ej (x, y)Wj (x, y) for all (x, y) U V .
Proof We rst claim that there exists an open subset of unit vectors u Rn+1 such that the functions a f j (au), j {1, . . . , k}, are not identically zero in a neighbourhood of 0. We prove this by induction on k. For k = 1, suppose otherwise and let u Rn+1 be a unit vector. Then there exists a neighbourhood V of u in the unit sphere Sn Rn+1 and R>0 such that f1 (av) = 0 for v V and a ( , ). By Theorem 2.3.9 it follows that f1 is identically zero, contradicting our hypotheses. Thus there is some u1 Sn such that the function a f1 (au1 ) is not identically zero in a neighbourhood of 0. Moreover, there is a neighbourhood U1 of u1 in Sn such that a f1 (au) is not identically zero in a neighbourhood of 0 for every u U1 . Now suppose the claim holds for k {1, . . . , m} and let f1 , . . . , fm+1 satisfy the hypotheses of the lemma. By the induction hypothesis there exists an open set Um of Sn such that the functions a f j (au), j {1, . . . , m}, are not identically zero in a neighbourhood of 0 for each u Um . By our argument above for k = 1, there can be no open subset of Um such that a fm+1 (au) is not identically zero in a neighbourhood of 0 for all u Um . This gives our claim. By our claim, let u Rn+1 be such that none of the functions a f j (au) are identically zero in a neighbourhood of 0. Now let : Rn+1 Rn+1 be an orthogonal transformation for which (en+1 ) = u. Note that f j (0, y) = f j ( y(en+1 )) = f j (au). Therefore, the function y f (0, y) is not identically zero in a neighbourhood of 0 for each j {1, . . . , k}. One can readily see that this implies that the functions f1 , . . . , fk satisfy the same hypotheses as the function A from the Weierstrass Preparation Theorem.

2.4.20 Lemma (Products of real analytic functions involving Weierstrass polynomials) Let $U \times V \subseteq \mathbb{R}^n \times \mathbb{R}$ be a neighbourhood of $(0, 0)$ and suppose that $f, g, W \in C^\omega(U \times V)$. The following statements hold:
(i) if $f$ is a polynomial in $y$, i.e.,
$$f(x, y) = f_k(x) y^k + \cdots + f_1(x) y + f_0(x) \tag{2.29}$$
for some $f_j \in C^\omega(U)$, $j \in \{0, 1, \dots, k\}$, if $W$ is a Weierstrass polynomial, and if $f = gW$, then $g$ is a polynomial in $y$, i.e.,
$$g(x, y) = g_m(x) y^m + \cdots + g_1(x) y + g_0(x) \tag{2.30}$$
for $g_j \in C^\omega(U)$, $j \in \{0, 1, \dots, m\}$;
(ii) if $W$ is a Weierstrass polynomial, if $f$ and $g$ are polynomials in $y$, i.e., $f$ and $g$ are as in (2.29) and (2.30), respectively, and if $W = fg$, then there exist $E, F \in C^\omega(U)$ such that $E(0) \neq 0$ and $F(0) \neq 0$ and such that $Ef$ and $Fg$ are Weierstrass polynomials.
Proof (i) Since the coefficient of the highest degree term of $y$ in $W$ is $1$, i.e., a unit in $C^\omega(U)$, and since $f$ is a polynomial in $y$, we can perform polynomial long division to write $f = QW + R$ where the degree of $R$ as a polynomial in $y$ is less than that of $W$ and where $Q$ is a polynomial in $y$ [Hungerford 1980, Theorem III.6.2]. By the uniqueness assertion of the Weierstrass Preparation Theorem, since $f = gW$, we must have $g = Q$ and $R = 0$. In particular, $g$ is a polynomial in $y$. (ii) Let $k$ and $m$ be the degrees of $f$ and $g$, i.e., $f_k$ and $g_m$ are nonzero in (2.29) and (2.30). Let $r$ be the degree of $W$. Then we must have
$$y^r = W(0, y) = f(0, y)\, g(0, y) = f_k(0)\, g_m(0)\, y^{k+m},$$
implying that $f_k(0)$ and $g_m(0)$ are nonzero, and so $f_k$ and $g_m$ are invertible in a neighbourhood of $0$. Thus the result follows by taking $E(x) = f_k(x)^{-1}$ and $F(x) = g_m(x)^{-1}$.
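
A small worked instance of part (ii) may make the role of the normalising factors $E$ and $F$ clearer. The functions below are chosen here purely for illustration, with $n = 1$.

```latex
Let $W(x, y) = y^2 - x^2$, a Weierstrass polynomial of degree $2$ with $w_1(x) = 0$ and $w_0(x) = -x^2$.
Take
\[
  f(x, y) = 2(y - x), \qquad g(x, y) = \tfrac{1}{2}(y + x),
\]
so that $W = fg$, with leading coefficients $f_1(x) = 2$ and $g_1(x) = \tfrac{1}{2}$. Then
\[
  E(x) = f_1(x)^{-1} = \tfrac{1}{2}, \qquad F(x) = g_1(x)^{-1} = 2,
\]
and $Ef = y - x$ and $Fg = y + x$ are Weierstrass polynomials of degree $1$, since their
coefficients $\mp x$ of $y^0$ vanish at $x = 0$.
```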

2.4.4 Algebraic properties of germs of analytic functions

In this section we use the Weierstrass Preparation Theorem to prove some important results about the infinitesimal character of real analytic functions. First we must characterise what we mean by infinitesimal. We work here in the setting of real analytic differential geometry, and so we let $M$ be a real analytic manifold with $x_0 \in M$. We define an equivalence relation on the set of ordered pairs $(f, U)$, where $U$ is a neighbourhood of $x_0$ and $f \in C^\omega(U)$, as follows. We say that $(f_1, U_1)$ and $(f_2, U_2)$ are equivalent if there exists a neighbourhood $U \subseteq U_1 \cap U_2$ of $x_0$ such that $f_1|U = f_2|U$. This notion of equivalence is readily verified to be an equivalence relation. We denote a typical equivalence class by $[(f, U)]_{x_0}$, or simply by $[f]_{x_0}$ if the domain of $f$ is understood. The set of equivalence classes we denote by $\mathscr{C}^\omega_{x_0}(M)$, which we call the set of germs of analytic functions at $x_0$. We make $\mathscr{C}^\omega_{x_0}(M)$ a ring by defining the following operations of addition and multiplication:
\[ [(f_1, U_1)]_{x_0} + [(f_2, U_2)]_{x_0} = [(f_1|U_1 \cap U_2 + f_2|U_1 \cap U_2,\ U_1 \cap U_2)]_{x_0}, \]
\[ [(f_1, U_1)]_{x_0} \cdot [(f_2, U_2)]_{x_0} = [((f_1|U_1 \cap U_2)(f_2|U_1 \cap U_2),\ U_1 \cap U_2)]_{x_0}. \]
It is elementary to verify that these operations are well-defined, and indeed make $\mathscr{C}^\omega_{x_0}(M)$ a ring. We shall study the algebraic properties of this ring in this section. First of all, we make a fairly easy observation about the character of $\mathscr{C}^\omega_{x_0}(M)$.

2.4.21 Proposition (Characterisation of analytic function germs) For $M$ a $C^\omega$-manifold and for $x_0 \in M$, the ring $\mathscr{C}^\omega_{x_0}(M)$ is isomorphic to the ring $\hat{\mathbb{R}}[[\xi_1,\dots,\xi_n]]$ of convergent power series in $n$ indeterminates, where $n$ is the dimension of the connected component of $M$ containing $x_0$.

Proof Let $(U, \phi)$ be a chart about $x_0$ such that $\phi(x_0) = 0$. We identify $U$ with $\phi(U) \subseteq \mathbb{R}^n$ and a function on $U$ with its local representative. An analytic function, by definition, is one whose Taylor series converges in some neighbourhood of every point in its domain of definition, and which is equal to its Taylor series on that neighbourhood. Thus, if $[(f, V)]_0 \in \mathscr{C}^\omega_0(U)$, in some neighbourhood $V$ of $0$ in $U$ we have
\[ f(x_1,\dots,x_n) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(0)\, x^I. \]
Assigning to $[(f, V)]_0$ its Taylor series thus gives a map from $\mathscr{C}^\omega_0(U)$ to $\hat{\mathbb{R}}[[\xi]]$, and this map is surjective since every convergent power series defines an analytic function in a neighbourhood of $0$. That this map is also injective follows since two analytic functions having the same Taylor series at $0$ agree in a neighbourhood of $0$.


Note that the isomorphism of $\mathscr{C}^\omega_{x_0}(M)$ with $\hat{\mathbb{R}}[[\xi]]$ in the preceding result is not natural, but depends on a choice of coordinate chart. However, the key point is that if one chooses any coordinate chart, the isomorphism is induced. We shall use this fact in our subsequent proofs to reduce ourselves to the case where the manifold is $\mathbb{R}^n$. This simplifies things greatly. However, it is also interesting to have a coordinate independent way of thinking of the isomorphism of the preceding result, and we shall see how to do this in Section 8.1.

Our first serious algebraic result concerning the structure of germs of real analytic functions is the following. We recall from our discussion before Corollary 2.4.15 the notion of a local ring.

2.4.22 Theorem ($\mathscr{C}^\omega_{x_0}(M)$ is a local ring) For a real analytic manifold $M$ and for $x_0 \in M$, $\mathscr{C}^\omega_{x_0}(M)$ is a local ring with unique maximal ideal given by
\[ \mathfrak{m} = \{[f]_{x_0} \mid f(x_0) = 0\}. \]


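Proof We let $(U, \phi)$ be a chart for $M$ with $x_0 \in U$, and suppose that $\phi(x_0) = 0$. As in the proof of Proposition 2.4.21, it suffices to show that $\mathscr{C}^\omega_0(\phi(U))$ is a local ring whose unique maximal ideal consists of the germs of functions vanishing at $0$. However, since germs are local constructions by definition, we can replace the open subset $\phi(U)$ with $\mathbb{R}^n$ for notational simplicity. First of all, note that $\mathfrak{m}$ is indeed an ideal. Now suppose that $J \subseteq \mathscr{C}^\omega_0(\mathbb{R}^n)$ is a nonzero proper ideal and define
\[ k = \inf\Bigl\{ r \in \mathbb{Z}_{\ge 0} \Bigm| \text{there exists } \sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha_I x^I \in J \text{ such that } \alpha_I \neq 0 \text{ for some } I \text{ with } |I| = r \Bigr\}. \]
We claim that $k \in \mathbb{Z}_{>0}$. Indeed, if $k = 0$ then $J$ contains a unit and so $J$ is not proper. If $k = \infty$ then all elements of $J$ are zero. Thus $k$ is indeed nonzero and finite. Now let $[f]_0 \in J$. By definition of $k$ we can write
\[ f(x) = \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| \ge k}} \alpha_I x^I \]
in a neighbourhood of $0$. Thus $[f]_0 \in \mathfrak{m}$ and so $J \subseteq \mathfrak{m}$.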

The following obvious corollary will be used repeatedly.


2.4.23 Corollary (Units in $\mathscr{C}^\omega_{x_0}(M)$) For a real analytic manifold $M$ and for $x_0 \in M$, $[f]_{x_0} \in \mathscr{C}^\omega_{x_0}(M)$ is a unit if and only if $[f]_{x_0} \notin \mathfrak{m}$, i.e., if and only if $f(x_0) \neq 0$.

Proof In the proof of Corollary 2.4.15 we showed that in a local ring the set of units is precisely the complement of the unique maximal ideal.
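For a concrete feel for the corollary in one variable, the following Python sketch computes the first few Taylor coefficients of the inverse of a power series whose constant term is nonzero. It is an illustration only: it works with truncated coefficient lists rather than germs, it says nothing about convergence, and the name `series_inverse` is hypothetical.

```python
from fractions import Fraction


def series_inverse(a, n):
    """Return the first n coefficients of 1/a.

    `a` lists the coefficients a[0] + a[1] x + a[2] x^2 + ... of a
    one-variable power series. An inverse exists only when a[0] != 0,
    i.e. when the series does not lie in the maximal ideal.
    """
    if a[0] == 0:
        raise ValueError("constant term vanishes, so the series is not a unit")
    a = [Fraction(c) for c in a] + [Fraction(0)] * n   # pad with zeros
    b = [1 / a[0]]
    # Coefficient m of a*b must vanish for m >= 1; solve for b[m].
    for m in range(1, n):
        s = sum(a[j] * b[m - j] for j in range(1, m + 1))
        b.append(-s / a[0])
    return b


# 1/(1 - x) = 1 + x + x^2 + ...
geometric = series_inverse([1, -1], 5)
```

The recursion only ever divides by the constant term, which mirrors the statement that $f(x_0) \neq 0$ is exactly what is needed for invertibility.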

We also have the following rather useful property of the ring of germs of real analytic functions.

2.4.24 Theorem ($\mathscr{C}^\omega_{x_0}(M)$ is a unique factorisation domain) For a real analytic manifold $M$ and for $x_0 \in M$, $\mathscr{C}^\omega_{x_0}(M)$ is a unique factorisation domain.


Proof As in the proof of Theorem 2.4.22, we suppose that $M = \mathbb{R}^n$ and that $x_0 = 0$. For simplicity of notation let us denote a germ by $[\cdot]$ rather than by $[\cdot]_0$. We prove the theorem by induction on $n$. The following lemma will be helpful. We denote by $\mathscr{C}^\omega_0(\mathbb{R}^n)[\eta]$ the polynomial ring over the ring $\mathscr{C}^\omega_0(\mathbb{R}^n)$. We think of $\mathscr{C}^\omega_0(\mathbb{R}^n)[\eta]$ as a subset of $\mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n+1})$ by asking that the polynomial
\[ P = [f_k]\eta^k + \dots + [f_1]\eta + [f_0] \]
be mapped to the germ of the function
\[ (x, y) \mapsto f_k(x)y^k + \dots + f_1(x)y + f_0(x). \]
With this identification, we state the following.

1 Lemma If $[P] \in \mathscr{C}^\omega_0(\mathbb{R}^n)[\eta]$ is a nonzero nonunit, then the following two statements are equivalent:
(i) $[P]$ is irreducible in the ring $\mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n+1})$;
(ii) $[P]$ is irreducible in the ring $\mathscr{C}^\omega_0(\mathbb{R}^n)[\eta]$.

Proof (i) $\Rightarrow$ (ii): Suppose that $[P]$ is not irreducible as an element of $\mathscr{C}^\omega_0(\mathbb{R}^n)[\eta]$. Then $[P] = [P_1][P_2]$ for nonzero nonunits $[P_1], [P_2] \in \mathscr{C}^\omega_0(\mathbb{R}^n)[\eta]$. One readily checks that $[P_1]$ and $[P_2]$ are also nonzero nonunits in $\mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n+1})$. Thus $[P]$ is not irreducible in $\mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n+1})$.

(ii) $\Rightarrow$ (i): We first prove this part of the lemma in the case that $P$ is equal to a Weierstrass polynomial $W$. Thus we assume that $[W]$ is not irreducible in $\mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n+1})$ so that $[W] = [f_1][f_2]$ for nonzero nonunits $[f_1], [f_2] \in \mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n+1})$. Note that $W(0, y) = f_1(0, y)f_2(0, y)$ so that neither of the functions $y \mapsto f_1(0, y)$ nor $y \mapsto f_2(0, y)$ is identically zero in a neighbourhood of $0$. Thus $f_1$ and $f_2$ are normalised and so, by the Weierstrass Preparation Theorem, we write $[f_1] = [E_1][W_1]$ and $[f_2] = [E_2][W_2]$ for units $[E_1], [E_2] \in \mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n+1})$ and Weierstrass polynomials $W_1$ and $W_2$. Thus $[W] = [E_1][E_2][W_1][W_2]$. Since $[W_1][W_2]$ is a Weierstrass polynomial, it follows from the uniqueness of the second part of the Weierstrass Preparation Theorem that $[E_1][E_2]$ is the identity and $[W_1][W_2] = [W]$. Therefore, $[W]$ is not irreducible as an element of $\mathscr{C}^\omega_0(\mathbb{R}^n)[\eta]$.

Now suppose that $[P]$ is not irreducible in $\mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n+1})$. As an element of $\mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n+1})$ note that $P$ is normalised. Thus we can write $[P] = [E][W]$ for a unit $[E] \in \mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n+1})$ and for a Weierstrass polynomial $W$. Since $[P]$ is not irreducible in $\mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n+1})$ it follows that $[W]$ is also not irreducible in $\mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n+1})$. As we showed in the previous paragraph, this means that $[W] = [W_1][W_2]$ for Weierstrass polynomials $W_1$ and $W_2$. Therefore, $[P] = [E][W_1][W_2]$. Since $[P] \in \mathscr{C}^\omega_0(\mathbb{R}^n)[\eta]$ and since $[W_2]$ is a Weierstrass polynomial, by Lemma 2.4.20(i) it follows that $[E][W_1] \in \mathscr{C}^\omega_0(\mathbb{R}^n)[\eta]$, showing that $[P]$ is not irreducible in $\mathscr{C}^\omega_0(\mathbb{R}^n)[\eta]$.

We now prove the theorem by induction on $n$. For $n = 1$, let us first identify the irreducibles in $\mathscr{C}^\omega_0(\mathbb{R})$. If $[f]$ is irreducible then the ideal generated by $[f]$ is maximal. By Theorem 2.4.22 it follows that $f(0) = 0$ if $[f]$ is irreducible. If $f(0) = 0$ then $f$ admits a convergent power series expansion
\[ f(x) = \sum_{j=k}^{\infty} \alpha_j x^j = x^k \sum_{j=0}^{\infty} \alpha_{k+j} x^j \tag{2.31} \]


for some $k \in \mathbb{Z}_{>0}$, and where $\alpha_k \neq 0$. Thus the function $x \mapsto x$ generates the unique maximal ideal in $\mathscr{C}^\omega_0(\mathbb{R})$. Now, if $[f]$ is a nonzero nonunit, then $f$ has a power series expansion about $0$ as in (2.31). Thus $f$ is a product of $k$ irreducibles and the unit $\sum_{j=0}^{\infty} \alpha_{k+j} x^j$. This gives the theorem for $n = 1$.

Now suppose that the theorem holds for $n \in \{1,\dots,m\}$ and let $[f] \in \mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{m+1})$ be a nonzero nonunit. Note that the induction hypothesis is that $\mathscr{C}^\omega_0(\mathbb{R}^m)$ is a unique factorisation domain. Therefore, by Theorem 2.4.8, $\mathscr{C}^\omega_0(\mathbb{R}^m)[\eta]$ is a unique factorisation domain. By Lemma 2.4.19 there exists an orthogonal transformation $\psi$ of $\mathbb{R}^{m+1}$ such that $\psi^*f$ is normalised. Thus there exists a Weierstrass polynomial $W$ of degree (say) $k$ and an analytic function $E$ not vanishing at $(0, 0)$ such that $[\psi^*f] = [E][W]$. Note that $[W]$ is a nonzero nonunit in $\mathscr{C}^\omega_0(\mathbb{R}^m)[\eta]$. Since $\mathscr{C}^\omega_0(\mathbb{R}^m)[\eta]$ is a unique factorisation domain, $[W] = [P_1] \cdots [P_k]$ for some irreducible $[P_1],\dots,[P_k] \in \mathscr{C}^\omega_0(\mathbb{R}^m)[\eta]$. As functions in $\mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{m+1})$, we note that $[P_1],\dots,[P_k]$ are normalised as a consequence of their being nonzero nonunits. Thus, by the Weierstrass Preparation Theorem, we can write $[P_j] = [E_j][W_j]$ for units $[E_j] \in \mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{m+1})$ and for Weierstrass polynomials $W_j$, $j \in \{1,\dots,k\}$. Thus $[W] = [E'][W_1] \cdots [W_k]$ for a unit $[E'] \in \mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{m+1})$, and so
\[ [\psi^*f] = [F][W_1] \cdots [W_k], \]
where $[F] = [E][E']$. By the lemma above, $[W_1],\dots,[W_k]$ are irreducible as elements of $\mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{m+1})$. Since $[g] \mapsto [\psi^*g]$ is an isomorphism of $\mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{m+1})$, it follows that $[(\psi^{-1})^*W_1],\dots,[(\psi^{-1})^*W_k]$ are irreducible. Thus $[f]$ is a finite product of irreducibles, namely
\[ [f] = [(\psi^{-1})^*F][(\psi^{-1})^*W_1] \cdots [(\psi^{-1})^*W_k]. \]

Now we show that the representation of $[f]$ as a product of irreducibles is unique, up to order and multiplication by units. Suppose that $[f] = [f_1] \cdots [f_l]$ for irreducibles $[f_1],\dots,[f_l] \in \mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{m+1})$. Then
\[ [\psi^*f] = [\psi^*f_1] \cdots [\psi^*f_l]. \]
Now apply the second part of the Weierstrass Preparation Theorem to write, for each $j \in \{1,\dots,l\}$, $[\psi^*f_j] = [E_j'][W_j']$ for a Weierstrass polynomial $W_j'$ and for $[E_j'] \in \mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{m+1})$ a unit. Thus
\[ [\psi^*f] = [F'][W_1'] \cdots [W_l'] \]
for some unit $[F'] \in \mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{m+1})$ and for Weierstrass polynomials $W_1',\dots,W_l'$. By the uniqueness from the second part of the Weierstrass Preparation Theorem, we must have $[F] = [F']$ and $[W_1'] \cdots [W_l'] = [W_1] \cdots [W_k]$. Since $\mathscr{C}^\omega_0(\mathbb{R}^m)[\eta]$ is a unique factorisation domain, we must have $k = l$ and $[W_{\sigma(j)}'] = [E_j''][W_j]$, $j \in \{1,\dots,k\}$, for some $\sigma \in S_k$ and for units $[E_j''] \in \mathscr{C}^\omega_0(\mathbb{R}^m)[\eta]$, $j \in \{1,\dots,k\}$. Since
\[ [f] = [(\psi^{-1})^*F'][(\psi^{-1})^*W_1'] \cdots [(\psi^{-1})^*W_l'], \]
the uniqueness of the product of irreducibles follows.
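The case $n = 1$ of the proof above rests on writing a nonzero germ as $x^k$ times a unit, as in (2.31). A minimal Python sketch of that splitting, acting on a truncated list of Taylor coefficients, is given here purely as an illustration; the function name `split_order` is hypothetical and the truncation is an assumption made so that the data is finite.

```python
def split_order(coeffs):
    """Split a one-variable series as x**k times a series with nonzero constant term.

    `coeffs` lists Taylor coefficients, lowest degree first.
    Returns (k, unit_coeffs), where k is the order of vanishing at 0 and
    unit_coeffs are the coefficients of the remaining factor.
    """
    for k, c in enumerate(coeffs):
        if c != 0:
            return k, coeffs[k:]
    raise ValueError("the zero series has no finite order")


# 3x^2 + x^3 = x^2 (3 + x): two factors of the irreducible x, times a unit.
order, unit_part = split_order([0, 0, 3, 1])
```

The order $k$ counts the irreducible factors $x$, and the remaining factor has nonzero constant term, so it is a unit by Corollary 2.4.23.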

The next result will be of particular interest to us, especially its consequences in the next section.


2.4.25 Theorem ($\mathscr{C}^\omega_{x_0}(M)$ is a Noetherian ring) For a real analytic manifold $M$ and for $x_0 \in M$, $\mathscr{C}^\omega_{x_0}(M)$ is a Noetherian ring.

Proof As in the proof of Theorem 2.4.22, we suppose that $M = \mathbb{R}^n$ and that $x_0 = 0$. We also use $[\cdot]$ in place of $[\cdot]_0$ to represent a germ. As in the proof of Theorem 2.4.24, we prove the theorem by induction on $n$. For $n = 1$, we claim that all ideals in $\mathscr{C}^\omega_0(\mathbb{R})$ are principal. Indeed, let $I \subseteq \mathscr{C}^\omega_0(\mathbb{R})$ be a nonzero proper ideal and note by Theorem 2.4.22 that $I \subseteq \mathfrak{m}$. Therefore, the ideal generated by $x \mapsto x^k$ is contained in $I$ for some least $k \in \mathbb{Z}_{>0}$. Moreover, if $[g] \in I$ then $g(x) = x^m \tilde{g}(x)$ for some $m \ge k$ and where $\tilde{g}(0) \neq 0$. Therefore, $[g]$ is contained in the ideal generated by $x \mapsto x^k$ and so $I$ is equal to the ideal generated by $x \mapsto x^k$. Thus $I$ is finitely generated.

Now suppose that the theorem holds for $n \in \{1,\dots,m\}$ and let $I \subseteq \mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{m+1})$ be a nonzero ideal. By Lemma 2.4.19 we let $\psi$ be an orthogonal transformation of $\mathbb{R}^{m+1}$ such that $I$ contains an element $[f]$ for which $\psi^*f$ is normalised. By the Weierstrass Preparation Theorem, write $[\psi^*f] = [E][W]$ for a unit $[E] \in \mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{m+1})$ and a Weierstrass polynomial $W$. Suppose that the degree of $W$ is $k$. Denote $\psi^*I = \{[\psi^*g] \mid [g] \in I\}$, and note that, since $[g] \mapsto [\psi^*g]$ is an isomorphism of $\mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{m+1})$, $I$ is finitely generated if and only if $\psi^*I$ is finitely generated. Define
\[ \mathscr{A} = \{[h] \in \psi^*I \cap \mathscr{C}^\omega_0(\mathbb{R}^m)[\eta] \mid \text{degree of } h \text{ is less than } k\} \]
to be the elements of $\psi^*I$ that are germs of polynomial functions in $y$ with degree less than that of $W$. Note that $\mathscr{A}$ is a module over $\mathscr{C}^\omega_0(\mathbb{R}^m)$, and as such is a submodule of $\mathscr{C}^\omega_0(\mathbb{R}^m)[\eta]$. By the induction hypothesis, $\mathscr{C}^\omega_0(\mathbb{R}^m)$ is a Noetherian ring. By Lemma 1 from the proof of Theorem 2.4.14, $\mathscr{C}^\omega_0(\mathbb{R}^m)[\eta]$ is also a Noetherian ring and so $\mathscr{A}$ is finitely generated by Proposition 2.4.10. Let $[h_1],\dots,[h_r]$ be generators for the $\mathscr{C}^\omega_0(\mathbb{R}^m)$-module $\mathscr{A}$.

Now let $[g] \in \psi^*I$ and write $[g] = [Q][W] + [R]$ as in the Weierstrass Preparation Theorem. Note that the Weierstrass Preparation Theorem gives $[R] \in \mathscr{C}^\omega_0(\mathbb{R}^m)[\eta]$ and the polynomial degree of $[R]$ is less than $k$. Since $[W] = [E]^{-1}[\psi^*f] \in \psi^*I$, we have $[R] = [g] - [Q][W] \in \psi^*I$. Therefore, $[R]$ is in the $\mathscr{C}^\omega_0(\mathbb{R}^m)$-module $\mathscr{A}$, and from this we see that $[h_1],\dots,[h_r],[\psi^*f]$ generate $\psi^*I$, giving the theorem.

2.4.5 Properties of analytic sections of vector bundles and their germs

In this section we extend the algebraic structures investigated in the preceding section from functions to sections of vector bundles. First, as with functions, we define germs of sections. We let $\pi\colon V \to M$ be a real analytic vector bundle and let $x_0 \in M$. We define an equivalence relation between ordered pairs $(\xi, U)$, where $U \subseteq M$ is a neighbourhood of $x_0$ and $\xi \in \Gamma^\omega(V|U)$, as follows. We say that $(\xi_1, U_1)$ and $(\xi_2, U_2)$ are equivalent if there exists a neighbourhood $U \subseteq U_1 \cap U_2$ of $x_0$ such that $\xi_1|U = \xi_2|U$. This is readily verified to be an equivalence relation, and we denote a typical equivalence class by $[(\xi, U)]_{x_0}$, or simply by $[\xi]_{x_0}$ if the domain of $\xi$ is understood. The set of equivalence classes is denoted by $\mathscr{G}^\omega_{x_0}(V)$, which we call the set of germs of analytic sections of $V$ at $x_0$. We make $\mathscr{G}^\omega_{x_0}(V)$ into a module over $\mathscr{C}^\omega_{x_0}(M)$ by defining the following operations of addition and scalar multiplication:
\[ [(\xi_1, U_1)]_{x_0} + [(\xi_2, U_2)]_{x_0} = [((\xi_1 + \xi_2)|U_1 \cap U_2,\ U_1 \cap U_2)]_{x_0}, \]
\[ [(f, U_1)]_{x_0} \cdot [(\xi, U_2)]_{x_0} = [((f\xi)|U_1 \cap U_2,\ U_1 \cap U_2)]_{x_0}. \]
We wish to understand some of the structure of this module. First we have the following result which gives a more or less concrete characterisation of the germs of sections.

2.4.26 Proposition (Characterisation of analytic section germs) For $\pi\colon V \to M$ a $C^\omega$-vector bundle and for $x_0 \in M$, the module $\mathscr{G}^\omega_{x_0}(V)$ is isomorphic to the module $\hat{\mathbb{R}}[[\xi_1,\dots,\xi_n]] \otimes \mathbb{R}^m$, where $n$ is the dimension of the connected component of $M$ containing $x_0$ and $m$ is the dimension of $V_{x_0}$.

Proof We choose a vector bundle chart about $x_0$ whose image is $U \times \mathbb{R}^m$, where $U \subseteq \mathbb{R}^n$ is an open set and where $x_0$ is mapped to $0$. Then a section $\xi$ is represented by $m$ analytic functions on $U$, these being the components of $\xi$. Thus $\xi$ is represented by an element of $C^\omega(U, \mathbb{R}^m)$, and real analyticity implies, as in the proof of Proposition 2.4.21, that $[\xi]_{x_0}$ is uniquely determined by the Taylor coefficients of this map at $0$. As we saw in Section 2.1.3, these Taylor coefficients are in $\mathbb{R}[[\xi]] \otimes \mathbb{R}^m$, and convergence of the resulting power series implies that they are actually in $\hat{\mathbb{R}}[[\xi]] \otimes \mathbb{R}^m$. That the assignment to $[\xi]_{x_0}$ of the corresponding element in $\hat{\mathbb{R}}[[\xi]] \otimes \mathbb{R}^m$ is an isomorphism follows as in the proof of Proposition 2.4.21.

Since the ring of germs of functions is Noetherian, we immediately have the following result.

2.4.27 Theorem ($\mathscr{G}^\omega_{x_0}(V)$ is a Noetherian module) For a real analytic vector bundle $\pi\colon V \to M$ and for $x_0 \in M$, $\mathscr{G}^\omega_{x_0}(V)$ is a finitely generated Noetherian module. In particular, all submodules of $\mathscr{G}^\omega_{x_0}(V)$ are finitely generated.

Proof By Propositions 2.4.10 and 2.4.13 we only need to show that $\mathscr{G}^\omega_{x_0}(V)$ is finitely generated as a module over $\mathscr{C}^\omega_{x_0}(M)$. Choose a vector bundle chart for $V$ about $x_0$ taking values in $U \times \mathbb{R}^m$, where $U \subseteq \mathbb{R}^n$ is open. Let $e_1,\dots,e_m$ be the sections of $V$, defined on the chart domain, whose local representatives are
\[ x \mapsto (x, \boldsymbol{e}_j), \qquad j \in \{1,\dots,m\}, \]
where $(\boldsymbol{e}_1,\dots,\boldsymbol{e}_m)$ is the standard basis for $\mathbb{R}^m$. Any section $\xi$ of $V$ defined on the chart domain will then have the form
\[ x \mapsto \xi^1(x)e_1(x) + \dots + \xi^m(x)e_m(x) \]
for real analytic functions $\xi^1,\dots,\xi^m$ on the chart domain. In particular, the germs $[e_1]_{x_0},\dots,[e_m]_{x_0}$ generate $\mathscr{G}^\omega_{x_0}(V)$.

Now let us use our characterisation of germs to say something about sections locally, not just infinitesimally. To set this up, we note that, for a real analytic vector bundle $\pi\colon V \to M$, $\Gamma^\omega(V)$ is a module over $C^\omega(M)$ with the operations of addition and scalar multiplication defined by
\[ (\xi_1 + \xi_2)(x) = \xi_1(x) + \xi_2(x), \qquad (f\xi)(x) = f(x)\xi(x). \]


We shall see in Chapter 5 that submodules of $\Gamma^\omega(V)$ will be of interest to us. It turns out that the Noetherian structure of $\mathscr{G}^\omega_{x_0}(V)$ has implications for the local structure of submodules of $\Gamma^\omega(V)$. To be precise about this, let $\mathscr{M}$ be a submodule of $\Gamma^\omega(V)$. For an open set $U \subseteq M$ let us denote
\[ \mathscr{M}|U = \{\xi|U \mid \xi \in \mathscr{M}\}. \]
The submodule $\mathscr{M}$ is locally finitely generated if, for every $x_0 \in M$, there exists a neighbourhood $N$ of $x_0$ and sections $\xi_1,\dots,\xi_k \in \Gamma^\omega(V|N)$ which generate $\mathscr{M}|N$ as a module over $C^\omega(N)$. One might guess that the fact that submodules of $\mathscr{G}^\omega_{x_0}(V)$ are finitely generated implies that any submodule of $\Gamma^\omega(V)$ is locally finitely generated. This is true, but it is not obvious.

Let us see what needs to be done in order to show that the Noetherian property of $\mathscr{G}^\omega_{x_0}(V)$ implies that submodules of $\Gamma^\omega(V)$ are locally finitely generated. Let $\mathscr{M} \subseteq \Gamma^\omega(V)$ be a submodule and let $x_0 \in M$. Denote by
\[ \mathscr{G}^\omega_{x_0}(\mathscr{M}) = \{[(\xi, M)]_{x_0} \mid \xi \in \mathscr{M}\} \]
the set of germs at $x_0$ of sections in $\mathscr{M}$. Using the fact that $\mathscr{M}$ is a submodule of $\Gamma^\omega(V)$ it is an elementary exercise to show that $\mathscr{G}^\omega_{x_0}(\mathscr{M})$ is a submodule of the $\mathscr{C}^\omega_{x_0}(M)$-module $\mathscr{G}^\omega_{x_0}(V)$. By Theorem 2.4.27 it follows that $\mathscr{G}^\omega_{x_0}(\mathscr{M})$ is finitely generated, say by germs $[(\xi_1, U_1)]_{x_0},\dots,[(\xi_k, U_k)]_{x_0}$. Let $U = \cap_{j=1}^k U_j$ and, by a mild abuse of notation, let $\xi_j$ denote the restriction of $\xi_j$ to $U$, $j \in \{1,\dots,k\}$. It seems reasonable that $\mathscr{M}$ is locally generated by the sections $\xi_1,\dots,\xi_k$. However, this seemingly obvious fact defies elementary proof. We shall prove this assertion below.

Some notation is helpful to prove the theorems we need to characterise the local character of submodules of $\Gamma^\omega(V)$. Let $M$ and $N$ be real analytic manifolds and let $x_0 \in M$. As previously done, we introduce an equivalence relation on the set of pairs $(\Phi, U)$ where $U$ is a neighbourhood of $x_0$ and $\Phi\colon U \to N$ is real analytic. Two such pairs, $(\Phi_1, U_1)$ and $(\Phi_2, U_2)$, are equivalent if there exists a neighbourhood $U \subseteq U_1 \cap U_2$ of $x_0$ such that $\Phi_1|U = \Phi_2|U$. An equivalence class is denoted by $[(\Phi, U)]_{x_0}$, or simply by $[\Phi]_{x_0}$, and is called a germ of mappings from $M$ to $N$. By $\mathscr{C}^\omega_{x_0}(M; N)$ we denote the set of such germs. If $N = \mathbb{R}^m$ for some $m \in \mathbb{Z}_{>0}$, then $\mathscr{C}^\omega_{x_0}(M; \mathbb{R}^m)$ has the structure of a module over $\mathscr{C}^\omega_{x_0}(M)$ according to the following operations:
\[ [(\Phi_1, U_1)]_{x_0} + [(\Phi_2, U_2)]_{x_0} = [((\Phi_1 + \Phi_2)|U_1 \cap U_2,\ U_1 \cap U_2)]_{x_0}, \]
\[ [(f, U_1)]_{x_0} \cdot [(\Phi, U_2)]_{x_0} = [((f\Phi)|U_1 \cap U_2,\ U_1 \cap U_2)]_{x_0}. \]
The theorem we want is then the following.

2.4.28 Theorem (Submodules of real analytic sections are locally finitely generated) Let $\pi\colon V \to M$ be a real analytic vector bundle, let $\mathscr{M} \subseteq \Gamma^\omega(V)$ be a submodule, let $x_0 \in M$, and let $[(\xi_1, U_1)]_{x_0},\dots,[(\xi_k, U_k)]_{x_0}$ be generators for $\mathscr{G}^\omega_{x_0}(\mathscr{M})$. Then there exists a neighbourhood $U$ of $x_0$ such that $[\xi_1]_x,\dots,[\xi_k]_x$ generate $\mathscr{G}^\omega_x(\mathscr{M})$ for every $x \in U$. In particular, $\mathscr{M}$ is locally finitely generated.


Proof We can without loss of generality assume that $V$ is a local vector bundle: $V = U \times \mathbb{R}^m$. The base space is $U$ and the vector bundle projection is $\pi(x, v) = x$. By considering only principal parts, sections $\xi\colon U \to U \times \mathbb{R}^m$ become mappings $f\colon U \to \mathbb{R}^m$. We thus suppose that $\mathscr{M}$ is a submodule of $C^\omega(U, \mathbb{R}^m)$. We let $x_0 \in U$ and we suppose that $[f_1]_{x_0},\dots,[f_k]_{x_0} \in \mathscr{C}^\omega_{x_0}(U; \mathbb{R}^m)$ generate $\mathscr{G}^\omega_{x_0}(\mathscr{M})$. To prove the theorem, we must show that there exists a neighbourhood $U_0 \subseteq U$ of $x_0$ such that, if $f \in \mathscr{M}|U_0$, we can write
\[ f(x) = \phi^1(x)f_1(x) + \dots + \phi^k(x)f_k(x) \]
for every $x \in U_0$, for some $\phi^1,\dots,\phi^k \in C^\omega(U_0)$. For notational simplicity, we will sometimes denote germs by $\mathscr{C}^\omega_x(\mathbb{R}^n)$ (and similarly for germs of mappings) in order to not have to keep track of the name of specific neighbourhoods. Also for notational simplicity we suppose that $x_0 = 0$.

The proof is by a double induction on $n$ and $m$. We shall first prove the theorem for $m = m_0 > 1$, supposing that it already holds for smaller $m \in \{1,\dots,m_0 - 1\}$. We shall then suppose that $m = 1$ and prove the theorem for $n = n_0 + 1$, supposing the theorem has been proved for all $m$ and $n \in \{0, 1,\dots,n_0\}$. Since the theorem holds vacuously when $n = 0$, this will prove the theorem.

So first take $m = m_0 > 1$ and suppose that the theorem holds for $m \in \{1,\dots,m_0 - 1\}$. Let $f_1,\dots,f_k \in \mathscr{M}$ be such that $[f_1]_0,\dots,[f_k]_0$ generate $\mathscr{G}^\omega_0(\mathscr{M})$. Let us denote the components of $f_j$ by $f_j^l$, $l \in \{1,\dots,m_0\}$. Let us define $[\tilde{f}_j]_0 \in \mathscr{C}^\omega_0(U; \mathbb{R}^{m_0-1})$ by
\[ \tilde{f}_j(x) = (f_j^1(x),\dots,f_j^{m_0-1}(x)), \qquad x \in U,\ j \in \{1,\dots,k\}. \]
Let $\mathscr{A}'$ be the submodule of $\mathscr{C}^\omega_0(U; \mathbb{R}^{m_0-1})$ generated by $[\tilde{f}_1]_0,\dots,[\tilde{f}_k]_0$. Let $\mathscr{A}''$ be the submodule of $\mathscr{C}^\omega_0(U; \mathbb{R}^{m_0})$ comprised of those germs in $\mathscr{G}^\omega_0(\mathscr{M})$ whose first $m_0 - 1$ components are zero. We can identify $\mathscr{A}''$ with an ideal of $\mathscr{C}^\omega_0(U)$ in the natural way, by projection onto the $m_0$th factor. It is easy to check that both $\mathscr{A}'$ and $\mathscr{A}''$ are indeed submodules. Since $\mathscr{C}^\omega_0(U)$ is Noetherian, there exist $[f_1'']_0,\dots,[f_r'']_0$ generating $\mathscr{A}''$. Let $\mathscr{M}' \subseteq C^\omega(U, \mathbb{R}^{m_0-1})$ be the submodule defined by
\[ \mathscr{M}' = \{f \in C^\omega(U, \mathbb{R}^{m_0-1}) \mid [f]_0 \in \mathscr{A}'\} \]
and let $\mathscr{M}'' \subseteq C^\omega(U, \mathbb{R}^{m_0})$ be the submodule defined by
\[ \mathscr{M}'' = \{f \in C^\omega(U, \mathbb{R}^{m_0}) \mid [f]_0 \in \mathscr{A}''\}. \]
By the induction hypotheses, there exists a neighbourhood $U' \subseteq U$ of $0$ such that $\tilde{f}_1|U',\dots,\tilde{f}_k|U'$ generate $\mathscr{M}'|U'$ and such that $f_1''|U',\dots,f_r''|U'$ generate $\mathscr{M}''|U'$. Note that we can write
\[ [f_a'']_0 = [\rho_a^1]_0[f_1]_0 + \dots + [\rho_a^k]_0[f_k]_0 \]
for some $[\rho_a^j]_0 \in \mathscr{C}^\omega_0(U')$, $j \in \{1,\dots,k\}$, $a \in \{1,\dots,r\}$. By shrinking $U'$ if necessary, we can write
\[ f_a''(x) = \rho_a^1(x)f_1(x) + \dots + \rho_a^k(x)f_k(x) \]
for all $x \in U'$ and $a \in \{1,\dots,r\}$. Now let $f \in \mathscr{M}|U'$ and let its components be denoted by $f^1,\dots,f^{m_0}$. Since $[f]_0 \in \mathscr{G}^\omega_0(\mathscr{M})$, there exist $[\phi^1]_0,\dots,[\phi^k]_0 \in \mathscr{C}^\omega_0(U)$ such that
\[ [f]_0 = [\phi^1]_0[f_1]_0 + \dots + [\phi^k]_0[f_k]_0. \]


In particular, if $\tilde{f} \in C^\omega(U', \mathbb{R}^{m_0-1})$ has as components the first $m_0 - 1$ components of $f$, we have
\[ [\tilde{f}]_0 = [\phi^1]_0[\tilde{f}_1]_0 + \dots + [\phi^k]_0[\tilde{f}_k]_0. \]
Moreover, by the induction hypothesis, the germs $[\phi^1]_0,\dots,[\phi^k]_0$ can be chosen such that, for $x \in U'$,
\[ \tilde{f}(x) = \phi^1(x)\tilde{f}_1(x) + \dots + \phi^k(x)\tilde{f}_k(x). \]
Define $g = f - \phi^1 f_1 - \dots - \phi^k f_k$ and note that the first $m_0 - 1$ components of $g$ vanish. Thus $[g]_0 \in \mathscr{A}''$ and so, by the induction hypothesis, we can write
\[ g(x) = \sum_{a=1}^{r} \alpha^a(x)f_a''(x) = \sum_{a=1}^{r}\sum_{j=1}^{k} \alpha^a(x)\rho_a^j(x)f_j(x) \]
for $x \in U'$. Thus
\[ f(x) = \sum_{j=1}^{k}\Bigl(\phi^j(x) + \sum_{a=1}^{r} \alpha^a(x)\rho_a^j(x)\Bigr)f_j(x) \]
for $x \in U'$, completing this part of the induction.

For the second part of the induction, we let $m = 1$ and $n = n_0 + 1$, assuming the theorem to have been proved for all $m \in \mathbb{Z}_{>0}$ and $n \in \{0, 1,\dots,n_0\}$. We denote a typical point in $\mathbb{R}^{n_0} \times \mathbb{R}$ by $(x, y)$ and we assume that our neighbourhood has the form $U \times V \subseteq \mathbb{R}^{n_0} \times \mathbb{R}$. We begin by assuming that the germs $[f_1],\dots,[f_k] \in \mathscr{C}^\omega_0(\mathbb{R}^{n_0} \times \mathbb{R}; \mathbb{R})$ are germs of Weierstrass polynomials, and denote $f_j = W_j$, $j \in \{1,\dots,k\}$, these polynomials being defined on $U \times V$, possibly after shrinking this domain. Let $d$ be the maximum of the degrees of $W_1,\dots,W_k$, and suppose without loss of generality that $d$ is the degree of $W_k$. For $j \in \{1,\dots,k-1\}$ let us write
\[ W_j(x, y) = Q_j(x, y)W_k(x, y) + R_j(x, y), \qquad (x, y) \in U' \times V', \]
for a neighbourhood $U' \times V' \subseteq U \times V$ of $(0, 0)$. Here $R_1,\dots,R_{k-1}$ are polynomials in $y$ of degree less than $d$. Let us adopt the notation $R_k = W_k$ for convenience (it also being notationally consistent with applying the Weierstrass Preparation Theorem to give $W_k = W_k$). For $s \in \{0, 1,\dots,d-1\}$ and for $j \in \{1,\dots,k\}$ define
\[ P_{js}(x, y) = R_j(x, y)y^s. \]
Since $R_1,\dots,R_k$ are polynomials in $y$ of degree at most $d$, $P_{js}$ is a polynomial in $y$ of degree less than $2d$ with coefficients in $C^\omega(U')$. The set of all polynomials in $y$ of degree less than $2d$ with coefficients in $C^\omega(U')$ is a free $C^\omega(U')$-module that is isomorphic to $C^\omega(U', \mathbb{R}^{2d})$ by the isomorphism that maps the polynomial function
\[ (x, y) \mapsto g_{2d-1}(x)y^{2d-1} + \dots + g_1(x)y + g_0(x) \]
to the mapping
\[ x \mapsto (g_0(x), g_1(x),\dots,g_{2d-1}(x)). \]
Making this identification, we denote by $\mathscr{A} \subseteq \mathscr{C}^\omega_0(U'; \mathbb{R}^{2d})$ the submodule generated by $[P_{js}]_0$, $j \in \{1,\dots,k\}$, $s \in \{0, 1,\dots,d-1\}$. Let $\mathscr{N}$ be the submodule of $C^\omega(U', \mathbb{R}^{2d})$ defined by
\[ \mathscr{N} = \{g \in C^\omega(U', \mathbb{R}^{2d}) \mid [g]_0 \in \mathscr{A}\}. \]


By the induction hypothesis, there exists a neighbourhood $U_0 \subseteq U'$ of $0$ such that, if $g \in \mathscr{N}|U_0$, we can write (thinking now of elements of $\mathscr{N}$ as being polynomial functions)
\[ g(x, y) = \sum_{j=1}^{k}\sum_{s=0}^{d-1} \phi^{js}(x)R_j(x, y)y^s \tag{2.32} \]
for some $\phi^{js} \in C^\omega(U_0)$, $j \in \{1,\dots,k\}$, $s \in \{0, 1,\dots,d-1\}$.

We wish to understand the character of the germs $[W_1]_{(x,y)},\dots,[W_k]_{(x,y)}$ for $(x, y) \in U_0 \times V'$. To do this, if $[f]_{(x,y)} \in \mathscr{C}^\omega_{(x,y)}(\mathbb{R}^{n_0} \times \mathbb{R})$ we define $[f']_{(0,0)} \in \mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n_0} \times \mathbb{R})$ by taking $f'(u, v) = f(x + u, y + v)$ for $(u, v)$ in a neighbourhood of $(0, 0)$. In particular, we let $W_k' \in \mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n_0} \times \mathbb{R})$ correspond to $W_k$. Note that $W_k'$ is polynomial in $v$ with the same degree as $W_k$. But it is not necessarily a Weierstrass polynomial.

Now, using this notation, by the Weierstrass Preparation Theorem write $[W_k']_{(0,0)} = [E']_{(0,0)}[W']_{(0,0)}$ for a unit $[E']_{(0,0)} \in \mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n_0} \times \mathbb{R})$ and a Weierstrass polynomial $W'$. Note that $W_k'$ is a polynomial in $v$ of degree $d$ and so, by Lemma 2.4.20(i), $E'$ is a polynomial in $v$, and its highest degree coefficient must therefore be $1$. Let $d_{E'}$ and $d_{W'}$ be the degrees of $E'$ and $W'$, respectively, noting that $d_{E'} + d_{W'} = d$. Now let $f \in \mathscr{M}|U_0 \times V'$ and, still using the notation above, write
\[ [f']_{(0,0)} = [Q']_{(0,0)}[W_k']_{(0,0)} + [R']_{(0,0)}, \]
where $R'$ is a polynomial in $v$ of degree less than $d_{W'}$. By unpriming this equation we have
\[ [f]_{(x,y)} = [Q]_{(x,y)}[W_k]_{(x,y)} + [R]_{(x,y)}. \]
Now let us go back to examining germs at $(0, 0)$. First write
\[ [f]_{(0,0)} = [\phi^1]_{(0,0)}[W_1]_{(0,0)} + \dots + [\phi^k]_{(0,0)}[W_k]_{(0,0)} \]
for $[\phi^1]_{(0,0)},\dots,[\phi^k]_{(0,0)} \in \mathscr{C}^\omega_{(0,0)}(\mathbb{R}^{n_0} \times \mathbb{R})$. Apply the Weierstrass Preparation Theorem to write
\[ [\phi^j]_{(0,0)} = [\beta^j]_{(0,0)}[W_k]_{(0,0)} + [\rho^j]_{(0,0)}, \qquad j \in \{1,\dots,k\}, \]
where $\rho^1,\dots,\rho^k$ are polynomials in $y$ of degree less than $d$. From our application above of the Weierstrass Preparation Theorem to the germ of $f$ at a general point $(x, y)$, at $(0, 0)$ we can write $[f]_{(0,0)} = [Q]_{(0,0)}[W_k]_{(0,0)} + [R]_{(0,0)}$ where $R$ is a polynomial in $y$ of degree less than $d$. We then have
\begin{align*}
[R]_{(0,0)} &= [f]_{(0,0)} - [Q]_{(0,0)}[W_k]_{(0,0)} \\
&= \sum_{j=1}^{k-1} [\phi^j]_{(0,0)}[W_j]_{(0,0)} + [\phi^k]_{(0,0)}[W_k]_{(0,0)} - [Q]_{(0,0)}[W_k]_{(0,0)} \\
&= \sum_{j=1}^{k-1} \bigl([\beta^j]_{(0,0)}[W_k]_{(0,0)} + [\rho^j]_{(0,0)}\bigr)\bigl([Q_j]_{(0,0)}[W_k]_{(0,0)} + [R_j]_{(0,0)}\bigr) \\
&\qquad + [\phi^k]_{(0,0)}[W_k]_{(0,0)} - [Q]_{(0,0)}[W_k]_{(0,0)} \\
&= \sum_{j=1}^{k-1} [\rho^j]_{(0,0)}[R_j]_{(0,0)} + [A]_{(0,0)}[W_k]_{(0,0)}, \tag{2.33}
\end{align*}


where, by back substitution, we see that
\[ [A]_{(0,0)} = \sum_{j=1}^{k-1} \bigl([\beta^j]_{(0,0)}[Q_j]_{(0,0)}[W_k]_{(0,0)} + [\beta^j]_{(0,0)}[R_j]_{(0,0)} + [\rho^j]_{(0,0)}[Q_j]_{(0,0)}\bigr) + [\phi^k]_{(0,0)} - [Q]_{(0,0)}. \]
In (2.33), the left hand side, i.e., $R$, is a polynomial in $y$ of degree less than $d$. On the right hand side, the terms in the sum are polynomials in $y$ of degree less than $2d$. Thus $AW_k$ is a polynomial in $y$ of degree less than $2d$. From Lemma 2.4.20(i) it follows that $A$ is a polynomial in $y$, and so its degree must be less than $d$, since the degree of $W_k$ is $d$. Therefore, remembering our convention that $R_k = W_k$, $[R]_{(0,0)} \in \mathscr{A}$ since $\rho^1,\dots,\rho^{k-1}, A$ are finite linear combinations of $1, y,\dots,y^{d-1}$ with coefficients in $C^\omega(U_0)$. Thus, by the induction hypotheses as pointed out above, cf. (2.32), we can write
\[ R(x, y) = \sum_{j=1}^{k} \sigma^j(x, y)R_j(x, y), \qquad (x, y) \in U_0 \times V', \]
where $\sigma^1,\dots,\sigma^k$ are polynomials in $y$ of degree less than $d$. Then, for $(x, y) \in U_0 \times V'$, we have
\begin{align*}
[f]_{(x,y)} &= [Q]_{(x,y)}[W_k]_{(x,y)} + [R]_{(x,y)} \\
&= [Q]_{(x,y)}[W_k]_{(x,y)} + \sum_{j=1}^{k} [\sigma^j]_{(x,y)}[R_j]_{(x,y)} \\
&= \bigl([Q]_{(x,y)} + [\sigma^k]_{(x,y)}\bigr)[W_k]_{(x,y)} + \sum_{j=1}^{k-1} [\sigma^j]_{(x,y)}\bigl([W_j]_{(x,y)} - [Q_j]_{(x,y)}[W_k]_{(x,y)}\bigr) \\
&= \Bigl([Q]_{(x,y)} + [\sigma^k]_{(x,y)} - \sum_{j=1}^{k-1} [\sigma^j]_{(x,y)}[Q_j]_{(x,y)}\Bigr)[W_k]_{(x,y)} + \sum_{j=1}^{k-1} [\sigma^j]_{(x,y)}[W_j]_{(x,y)}.
\end{align*}

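This shows that $[W_1]_{(x,y)},\dots,[W_k]_{(x,y)}$ generate $\mathscr{G}^\omega_{(x,y)}(\mathscr{M})$, so completing this part of the induction in the case that $f_1,\dots,f_k$ are Weierstrass polynomials.

Now suppose that $f_1,\dots,f_k$ are not necessarily Weierstrass polynomials. By Lemma 2.4.19 let $\psi\colon \mathbb{R}^{n_0+1} \to \mathbb{R}^{n_0+1}$ be an orthogonal transformation such that $\psi^*f_1,\dots,\psi^*f_k$ are normalised, and by the Weierstrass Preparation Theorem write $\psi^*f_j = E_jW_j$ in a neighbourhood $U' \times V' \subseteq U \times V$ of $(0, 0)$. Here, for $j \in \{1,\dots,k\}$, $W_j$ is a Weierstrass polynomial and $E_j$ can be assumed nonzero on $U' \times V'$ by shrinking the domain, if necessary. Let $\mathscr{A}' \subseteq \mathscr{C}^\omega_{(0,0)}(U' \times V')$ be the submodule (i.e., ideal) generated by $[W_1]_{(0,0)},\dots,[W_k]_{(0,0)}$ and let $\mathscr{M}'$ be the submodule of $C^\omega(U' \times V')$ defined by
\[ \mathscr{M}' = \{f \in C^\omega(U' \times V') \mid [f]_{(0,0)} \in \mathscr{A}'\}. \]
By our proof above for the case when $f_1,\dots,f_k$ are Weierstrass polynomials, there exists a neighbourhood $U_0 \times V_0 \subseteq U' \times V'$ of $(0, 0)$ such that $W_1|U_0 \times V_0,\dots,W_k|U_0 \times V_0$ generate $\mathscr{M}'|U_0 \times V_0$. Let $f \in \mathscr{M}|\psi(U_0 \times V_0)$ so that $[f]_{(0,0)} \in \mathscr{G}^\omega_{(0,0)}(\mathscr{M})$. Thus we can write
\[ [f]_{(0,0)} = [\phi^1]_{(0,0)}[f_1]_{(0,0)} + \dots + [\phi^k]_{(0,0)}[f_k]_{(0,0)} \]
for some $[\phi^1]_{(0,0)},\dots,[\phi^k]_{(0,0)} \in \mathscr{C}^\omega_{(0,0)}(\psi(U_0 \times V_0))$. Therefore,
\begin{align*}
[\psi^*f]_{(0,0)} &= [\psi^*\phi^1]_{(0,0)}[\psi^*f_1]_{(0,0)} + \dots + [\psi^*\phi^k]_{(0,0)}[\psi^*f_k]_{(0,0)} \\
&= [\psi^*\phi^1]_{(0,0)}[E_1]_{(0,0)}[W_1]_{(0,0)} + \dots + [\psi^*\phi^k]_{(0,0)}[E_k]_{(0,0)}[W_k]_{(0,0)},
\end{align*}
showing that $[\psi^*f]_{(0,0)} \in \mathscr{A}'$. Thus $\psi^*f \in \mathscr{M}'|U_0 \times V_0$ and so we can write
\[ \psi^*f(x, y) = \sigma^1(x, y)W_1(x, y) + \dots + \sigma^k(x, y)W_k(x, y), \qquad (x, y) \in U_0 \times V_0, \]
for some $\sigma^1,\dots,\sigma^k \in C^\omega(U_0 \times V_0)$. Then we have
\[ f(x, y) = (\psi^{-1})^*\sigma^1(x, y)\,(\psi^{-1})^*W_1(x, y) + \dots + (\psi^{-1})^*\sigma^k(x, y)\,(\psi^{-1})^*W_k(x, y), \qquad (x, y) \in \psi(U_0 \times V_0), \]
showing that $(\psi^{-1})^*W_1,\dots,(\psi^{-1})^*W_k$ generate $\mathscr{M}|\psi(U_0 \times V_0)$. Since $E_1,\dots,E_k$ are nowhere zero on $U_0 \times V_0$, it follows that $f_j = (\psi^{-1})^*(E_jW_j)$, $j \in \{1,\dots,k\}$, generate $\mathscr{M}|\psi(U_0 \times V_0)$. The theorem now follows since $\psi(U_0 \times V_0)$ is a neighbourhood of $(0, 0)$.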

The above theorem will be an important one for us. But there is more to the story than is immediately implied by the theorem. An important question, given sections $\xi_1,\dots,\xi_k \in \Gamma^\omega(V)$, is, What are the linear relations between the sections when they are not linearly independent? The following definition makes this notion precise.

2.4.29 Definition (Module of relations) Let $\pi\colon V \to M$ be a real analytic vector bundle and let $\xi_1,\dots,\xi_k \in \Gamma^\omega(V)$. Denote by $\mathbb{R}^k_M$ the trivial vector bundle $M \times \mathbb{R}^k$.
(i) The submodule $R(\xi_1,\dots,\xi_k)$ of $\Gamma^\omega(\mathbb{R}^k_M)$ given by
\[ R(\xi_1,\dots,\xi_k) = \Bigl\{\phi \in \Gamma^\omega(\mathbb{R}^k_M) \Bigm| \sum_{j=1}^{k} \phi^j(x)\xi_j(x) = 0_x,\ x \in M\Bigr\} \]
is the module of relations for the sections $\xi_1,\dots,\xi_k$.
(ii) For $x \in M$, the submodule $R_x(\xi_1,\dots,\xi_k)$ of $\mathscr{G}^\omega_x(\mathbb{R}^k_M)$ given by
\[ R_x(\xi_1,\dots,\xi_k) = \Bigl\{[\phi]_x \in \mathscr{G}^\omega_x(\mathbb{R}^k_M) \Bigm| \sum_{j=1}^{k} [\phi^j]_x[\xi_j]_x = 0\Bigr\} \]
is the module of relations at $x$ for the sections $\xi_1,\dots,\xi_k$.

The modules $R(\xi_1,\dots,\xi_k)$ and $R_x(\xi_1,\dots,\xi_k)$ provide all linear relations between the sections $\xi_1,\dots,\xi_k$. Of course, we have
\[ R_x(\xi_1,\dots,\xi_k) = \{[\phi]_x \mid \phi \in R(\xi_1,\dots,\xi_k)\}. \]
Moreover, since $R_x(\xi_1,\dots,\xi_k)$ is a submodule of the Noetherian module $\mathscr{G}^\omega_x(\mathbb{R}^k_M)$, it follows that $R_x(\xi_1,\dots,\xi_k)$ is finitely generated. The Oka Coherence Theorem tells us that $R(\xi_1,\dots,\xi_k)$ is locally finitely generated.
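To make the notion of a relation concrete, here is a small Python sketch for the trivial line bundle over $\mathbb{R}^2$ with the two sections $x$ and $y$. It is an illustration under loud assumptions: polynomials in two variables stand in for real analytic functions, they are stored as dictionaries mapping exponent pairs to coefficients, and the helper names `poly_mul`, `poly_add`, and `is_relation` are hypothetical.

```python
from collections import defaultdict


def poly_mul(p, q):
    """Multiply two polynomials stored as {(i, j): coeff} for x^i y^j."""
    out = defaultdict(int)
    for (a, b), c in p.items():
        for (d, e), f in q.items():
            out[(a + d, b + e)] += c * f
    return {k: v for k, v in out.items() if v != 0}


def poly_add(p, q):
    """Add two polynomials, dropping terms whose coefficients cancel."""
    out = defaultdict(int, p)
    for k, v in q.items():
        out[k] += v
    return {k: v for k, v in out.items() if v != 0}


def is_relation(phis, sections):
    """Check whether sum_j phis[j] * sections[j] is the zero polynomial."""
    total = {}
    for phi, xi in zip(phis, sections):
        total = poly_add(total, poly_mul(phi, xi))
    return total == {}


x = {(1, 0): 1}
y = {(0, 1): 1}
minus_x = {(1, 0): -1}

# (y, -x) is a relation between the sections x and y: y*x - x*y = 0.
check = is_relation([y, minus_x], [x, y])
```

In this example the pair $(y, -x)$ lies in the module of relations, whereas $(x, y)$ does not, since $x^2 + y^2$ is not the zero function.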


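2.4.30 Theorem (Oka Coherence Theorem (non-sheaf version)) Let $\pi\colon V \to M$ be a real analytic vector bundle and let $\xi_1,\dots,\xi_k \in \Gamma^\omega(V)$. Then, for any $x_0 \in M$, there exists a neighbourhood $U \subseteq M$ of $x_0$ and $\rho_1,\dots,\rho_r \in \Gamma^\omega(\mathbb{R}^k_U)$ such that $R_x(\xi_1,\dots,\xi_k)$ is generated by $[\rho_1]_x,\dots,[\rho_r]_x$ for each $x \in U$.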

Proof We can without loss of generality assume that V is a local vector bundle: V = U Rm . The base space is U and the vector bundle projection is (x, v) = x. By considering only principal parts, sections : U U Rm become mappings f : U Rm . Thus we suppose that we have f 1 , . . . , f k C (U, Rm ), and we dene
k Rx ( f 1 , . . . , f k ) = []x C x (U; R ) k

[ j ]x [ f j ]x = 0 .
j=1

To prove the theorem, we must show that, for any $x_0 \in U$, there exists a neighbourhood $U_0 \subseteq U$ of $x_0$ and $\rho_1,\dots,\rho_r \in C^\omega(U_0, \mathbb{R}^k)$ such that $\mathcal{R}_x(f_1,\dots,f_k)$ is generated by $[\rho_1]_x,\dots,[\rho_r]_x$ for each $x \in U_0$. For notational simplicity, we will sometimes denote germs by $C^\omega_x(\mathbb{R}^n)$ (and similarly for germs of mappings) in order to not have to keep track of the name of specific neighbourhoods. Also for notational simplicity we suppose that $x_0 = 0$.

The proof is by a double induction on $n$ and $m$. We shall first prove the theorem for $m = m_0 > 1$, supposing that it already holds for smaller $m \in \{1,\dots,m_0 - 1\}$. We shall then suppose that $m = 1$ and prove the theorem for $n = n_0 + 1$, supposing the theorem has been proved for all $m$ and $n \in \{0, 1,\dots,n_0\}$. Since the theorem holds vacuously when $n = 0$, this will prove the theorem.

So first take $m = m_0 > 1$ and suppose that the theorem holds for $m \in \{1,\dots,m_0 - 1\}$. Let us denote the components of $f_j$ by $f_j^l$, $l \in \{1,\dots,m_0\}$. Suppose that $[\rho]_x \in \mathcal{R}_x(f_1,\dots,f_k)$. Then, in particular,
\[
\sum_{j=1}^k \rho^j(x) f_j^1(x) = 0,
\]
and so $[\rho]_x \in \mathcal{R}_x(f_1^1,\dots,f_k^1)$. That is to say, $\mathcal{R}_x(f_1,\dots,f_k) \subseteq \mathcal{R}_x(f_1^1,\dots,f_k^1)$.

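By the induction hypothesis there exists a neighbourhood $U' \subseteq U$ of $0$ and $\sigma_1,\dots,\sigma_r \in C^\omega(U', \mathbb{R}^k)$ such that $[\sigma_1]_x,\dots,[\sigma_r]_x$ generate $\mathcal{R}_x(f_1^1,\dots,f_k^1)$. Thus, for $x \in U'$,
\[
\mathcal{R}_x(f_1,\dots,f_k) \subseteq \Bigl\{ \sum_{a=1}^r [\alpha^a]_x [\sigma_a]_x \Bigm| [\alpha^a]_x \in C^\omega_x(\mathbb{R}^n) \Bigr\}.
\]
If $[\alpha^1]_x,\dots,[\alpha^r]_x \in C^\omega_x(\mathbb{R}^n)$ are such that
\[
\sum_{a=1}^r [\alpha^a]_x [\sigma_a]_x \in \mathcal{R}_x(f_1,\dots,f_k) \tag{2.34}
\]
then we have
\[
\sum_{j=1}^k \sum_{a=1}^r [\alpha^a]_x [\sigma_a^j]_x [f_j^i]_x = 0, \qquad i \in \{1,\dots,m_0\}.
\]
However, since $[\sigma_1]_x,\dots,[\sigma_r]_x$ generate $\mathcal{R}_x(f_1^1,\dots,f_k^1)$, we have
\[
\sum_{j=1}^k [\sigma_a^j]_x [f_j^1]_x = 0, \qquad a \in \{1,\dots,r\}, \tag{2.35}
\]
and so the equations for $[\alpha^1]_x,\dots,[\alpha^r]_x \in C^\omega_x(\mathbb{R}^n)$ to be such that (2.34) holds are
\[
\sum_{j=1}^k \sum_{a=1}^r [\alpha^a]_x [\sigma_a^j]_x [f_j^i]_x = 0, \qquad i \in \{2,\dots,m_0\}. \tag{2.36}
\]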

Let us define $g_1,\dots,g_r \in C^\omega(U', \mathbb{R}^{m_0-1})$ by
\[
g_a^i(x) = \sum_{j=1}^k \sigma_a^j(x) f_j^{i+1}(x), \qquad i \in \{1,\dots,m_0 - 1\}.
\]

Then the solutions $[\alpha^1]_x,\dots,[\alpha^r]_x$ of (2.36) are precisely elements of $\mathcal{R}_x(g_1,\dots,g_r)$. By the induction hypothesis there exists a neighbourhood $U_0 \subseteq U'$ of $0$ and $\beta_1,\dots,\beta_s \in C^\omega(U_0, \mathbb{R}^r)$ such that $\mathcal{R}_x(g_1,\dots,g_r)$ is generated by $[\beta_1]_x,\dots,[\beta_s]_x$ for each $x \in U_0$. We then define $\rho_1,\dots,\rho_s \in C^\omega(U_0, \mathbb{R}^k)$ by
\[
\rho_b^j(x) = \sum_{a=1}^r \beta_b^a(x) \sigma_a^j(x), \qquad j \in \{1,\dots,k\}.
\]

We claim that $[\rho_1]_x,\dots,[\rho_s]_x$ generate $\mathcal{R}_x(f_1,\dots,f_k)$ for every $x \in U_0$. First of all,
\[
\sum_{j=1}^k [\rho_b^j]_x [f_j^i]_x = \sum_{j=1}^k \sum_{a=1}^r [\beta_b^a]_x [\sigma_a^j]_x [f_j^i]_x = 0, \qquad i \in \{1,\dots,m_0\},\ x \in U_0,
\]
by (2.35) and since $\mathcal{R}_x(g_1,\dots,g_r)$ is generated by $[\beta_1]_x,\dots,[\beta_s]_x$ for each $x \in U_0$. Thus $[\rho_b]_x \in \mathcal{R}_x(f_1,\dots,f_k)$ for $b \in \{1,\dots,s\}$ and $x \in U_0$. Now, if $[\rho]_x \in \mathcal{R}_x(f_1,\dots,f_k)$ then
\[
\sum_{j=1}^k [\rho^j]_x [f_j^i]_x = 0, \qquad i \in \{1,\dots,m_0\}.
\]

Thus there exist $[\alpha^1]_x,\dots,[\alpha^r]_x \in C^\omega_x(\mathbb{R}^n)$ such that
\[
[\rho^j]_x = \sum_{a=1}^r [\alpha^a]_x [\sigma_a^j]_x, \qquad j \in \{1,\dots,k\}.
\]
However, by definition of $\mathcal{R}_x(g_1,\dots,g_r)$ we can write
\[
[\alpha^a]_x = \sum_{b=1}^s [\gamma^b]_x [\beta_b^a]_x, \qquad a \in \{1,\dots,r\},
\]
for some $[\gamma^1]_x,\dots,[\gamma^s]_x \in C^\omega_x(\mathbb{R}^n)$. Therefore,
\[
[\rho^j]_x = \sum_{a=1}^r \sum_{b=1}^s [\gamma^b]_x [\beta_b^a]_x [\sigma_a^j]_x, \qquad j \in \{1,\dots,k\},
\]

showing that $[\rho]_x$ is in the module generated by $[\rho_1]_x,\dots,[\rho_s]_x$. This completes this part of the inductive argument.

Next we let $m = 1$ and $n = n_0 + 1$, and suppose that the theorem has been proved for all $m \in \mathbb{Z}_{>0}$ and $n \in \{0, 1,\dots,n_0\}$. In the usual way, we denote a point in $\mathbb{R}^{n_0+1}$ by $(x, y)$, and we suppose that the neighbourhood of $0$ in $\mathbb{R}^{n_0+1}$ in which we work is of the form $U \times V$ for a neighbourhood $U \subseteq \mathbb{R}^{n_0}$ of $0$ and a neighbourhood $V \subseteq \mathbb{R}$ of $0$. We first suppose that $f_1,\dots,f_k$ are Weierstrass polynomials, and denote $f_j = W_j$, $j \in \{1,\dots,k\}$. Thus $W_j \in C^\omega(U)[\eta]$, possibly after shrinking $U$ and $V$. We let $d$ be the maximum of the degrees of $W_1,\dots,W_k$.

We wish to understand the character of $\mathcal{R}_{(x,y)}(W_1,\dots,W_k)$ for $(x, y) \in U \times V$. To do this, if $[f]_{(x,y)} \in C^\omega_{(x,y)}(\mathbb{R}^{n_0} \times \mathbb{R})$ we define $[f']_{(0,0)} \in C^\omega_{(0,0)}(\mathbb{R}^{n_0} \times \mathbb{R})$ by taking $f'(u, v) = f(x + u, y + v)$ for $(u, v)$ in a neighbourhood of $(0, 0)$. We can do this for germs of mappings taking values in Euclidean spaces as well, and, up to the end of the proof of the lemma we are about to state and prove, we will use without comment the prime to denote a germ at $(0, 0)$ corresponding to a germ at $(x, y)$. In particular, we let $W_1',\dots,W_k' \in C^\omega_{(0,0)}(\mathbb{R}^{n_0} \times \mathbb{R})$ correspond to $W_1,\dots,W_k$. Note, then, that
\[
\mathcal{R}_{(0,0)}(W_1',\dots,W_k') = \{ [\rho']_{(0,0)} \mid [\rho]_{(x,y)} \in \mathcal{R}_{(x,y)}(W_1,\dots,W_k) \}.
\]
Note also that $W_1',\dots,W_k'$ are polynomials in $v$ of degree the same as $W_1,\dots,W_k$ are in $y$, but are not necessarily Weierstrass polynomials. The following lemma is now useful.
1 Lemma For each $(x, y) \in U \times V$ the $C^\omega_{(x,y)}(\mathbb{R}^{n_0} \times \mathbb{R})$-module $\mathcal{R}_{(x,y)}(W_1,\dots,W_k)$ is generated by its elements of the form $([P_1]_{(x,y)},\dots,[P_k]_{(x,y)})$ where $[P_j]_{(x,y)} \in C^\omega_x(\mathbb{R}^{n_0})[\eta]$, $j \in \{1,\dots,k\}$, are polynomial functions of $y$ of degree at most $d$.

Proof Without loss of generality, suppose that $\deg(W_k) = d$. By the Weierstrass Preparation Theorem, thinking of $[W_k']_{(0,0)}$ as an element of $C^\omega_{(0,0)}(\mathbb{R}^{n_0} \times \mathbb{R})$, write $[W_k']_{(0,0)} = [E']_{(0,0)} [W']_{(0,0)}$ for a unit $[E']_{(0,0)} \in C^\omega_{(0,0)}(\mathbb{R}^{n_0} \times \mathbb{R})$ and a Weierstrass polynomial $W'$. Note that $W_k'$ is a polynomial in $v$ of degree $d$ and so, by Lemma 2.4.20(i), $E'$ is a polynomial in $v$, and its highest degree coefficient must therefore be $1$. Let $d_E$ and $d_W$ be the degrees of $E'$ and $W'$, respectively, noting that $d_E + d_W = d$. For $[\rho']_{(0,0)} \in \mathcal{R}_{(0,0)}(W_1',\dots,W_k')$, by the Weierstrass Preparation Theorem, write
\[
[\rho'^{j}]_{(0,0)} = [Q'^{j}]_{(0,0)} [W_k']_{(0,0)} + [R'^{j}]_{(0,0)}, \qquad j \in \{1,\dots,k-1\},
\]
where $R'^{j}$ is a polynomial of degree less than $d_W$. Define
\[
[R'^{k}]_{(0,0)} = [\rho'^{k}]_{(0,0)} + \sum_{j=1}^{k-1} [Q'^{j}]_{(0,0)} [W_j']_{(0,0)}.
\]

We claim that $([R'^{1}]_{(0,0)},\dots,[R'^{k}]_{(0,0)}) \in \mathcal{R}_{(0,0)}(W_1',\dots,W_k')$. Indeed,
\begin{align*}
\sum_{j=1}^k [R'^{j}]_{(0,0)} [W_j']_{(0,0)}
&= \sum_{j=1}^{k-1} \bigl( [\rho'^{j}]_{(0,0)} - [Q'^{j}]_{(0,0)} [W_k']_{(0,0)} \bigr) [W_j']_{(0,0)} \\
&\quad + [\rho'^{k}]_{(0,0)} [W_k']_{(0,0)} + \sum_{j=1}^{k-1} [Q'^{j}]_{(0,0)} [W_j']_{(0,0)} [W_k']_{(0,0)} \\
&= \sum_{j=1}^k [\rho'^{j}]_{(0,0)} [W_j']_{(0,0)} = 0.
\end{align*}

Thus we have
\[
\sum_{j=1}^{k-1} [R'^{j}]_{(0,0)} [W_j']_{(0,0)} + [R'^{k}]_{(0,0)} [E']_{(0,0)} [W']_{(0,0)} = 0.
\]
The sum on the left is one whose terms are polynomial in $v$ of degree less than $d_W + d$, which implies that $[R'^{k}]_{(0,0)} [E']_{(0,0)} [W']_{(0,0)}$ is a polynomial in $v$ of degree less than $d_W + d$. Since $[W']_{(0,0)}$ is a Weierstrass polynomial of degree $d_W$, from Lemma 2.4.20(i) we have that $[R'^{k}]_{(0,0)} [E']_{(0,0)}$ must be a polynomial, and so have degree less than $d$. Recalling that $[E']_{(0,0)}$ is a unit, we can write
\[
[R'^{k}]_{(0,0)} = [E']_{(0,0)}^{-1} [E']_{(0,0)} [R'^{k}]_{(0,0)},
\]

showing that $[R'^{k}]_{(0,0)}$ is a polynomial of degree less than $d$, as are $[R'^{1}]_{(0,0)},\dots,[R'^{k-1}]_{(0,0)}$. Now we write
\begin{multline*}
([\rho'^{1}]_{(0,0)},\dots,[\rho'^{k}]_{(0,0)}) = ([W_k']_{(0,0)}, 0,\dots,0, -[W_1']_{(0,0)}) [Q'^{1}]_{(0,0)} + \dots \\
+ (0,\dots,0, [W_k']_{(0,0)}, -[W_{k-1}']_{(0,0)}) [Q'^{k-1}]_{(0,0)} + ([R'^{1}]_{(0,0)},\dots,[R'^{k}]_{(0,0)}).
\end{multline*}
This shows that elements of $\mathcal{R}_{(0,0)}(W_1',\dots,W_k')$ are linear combinations of vectors whose components are polynomials in $v$ with degree at most $d$. By unpriming this equation we get
\begin{multline*}
([\rho^{1}]_{(x,y)},\dots,[\rho^{k}]_{(x,y)}) = ([W_k]_{(x,y)}, 0,\dots,0, -[W_1]_{(x,y)}) [Q^{1}]_{(x,y)} + \dots \\
+ (0,\dots,0, [W_k]_{(x,y)}, -[W_{k-1}]_{(x,y)}) [Q^{k-1}]_{(x,y)} + ([R^{1}]_{(x,y)},\dots,[R^{k}]_{(x,y)}),
\end{multline*}
and the lemma follows.

To complete the proof of the theorem in the case when $f_1,\dots,f_k$ are Weierstrass polynomials, let $[\rho]_{(x,y)} \in \mathcal{R}_{(x,y)}(W_1,\dots,W_k)$ have the form of the lemma. Thus
\[
[\rho^j]_{(x,y)} = \sum_{a=0}^d [\rho^{ja}]_x [y^a]_{(x,0)}, \qquad j \in \{1,\dots,k\},
\]
for $[\rho^{ja}]_x \in C^\omega_x(\mathbb{R}^{n_0})$, $j \in \{1,\dots,k\}$, $a \in \{0, 1,\dots,d\}$. Note that we must have
\[
\sum_{j=1}^k \sum_{a=0}^d [\rho^{ja}]_{(x,y)} [y^a]_{(x,0)} [W_j]_{(x,y)} = 0.
\]


Since $W_1,\dots,W_k$ are polynomials with degree at most $d$, this preceding equation is a polynomial equation in $\eta$ of degree at most $2d$. Let $[C_b]_x \in C^\omega_x(\mathbb{R}^{n_0})$, $b \in \{0, 1,\dots,2d\}$, be the coefficient of $y^b$ for the expression on the left, noting that this will be a linear function of the coefficients $[\rho^{ja}]_x$, $j \in \{1,\dots,k\}$, $a \in \{0, 1,\dots,d\}$. Thus we can write
\[
[C_b]_x = \sum_{j=1}^k \sum_{a=0}^d [c_b^{ja}]_x [\rho^{ja}]_x = 0, \qquad b \in \{0, 1,\dots,2d\}, \tag{2.37}
\]

where $c_b^{ja} \in C^\omega(U)$, $b \in \{0, 1,\dots,2d\}$, $a \in \{0, 1,\dots,d\}$, $j \in \{1,\dots,k\}$, are real analytic functions on $U$ determined from the coefficients of the polynomials $W_1,\dots,W_k$. Thus the induction hypotheses give a neighbourhood $U_0 \subseteq U$ of $0$ and functions $\sigma_s^{ja} \in C^\omega(U_0)$, $s \in \{1,\dots,r\}$, $a \in \{0, 1,\dots,d\}$, $j \in \{1,\dots,k\}$, such that every solution $[\rho^{ja}]_x$ to (2.37) is given by
\[
[\rho^{ja}]_x = \sum_{s=1}^r [\alpha^s]_x [\sigma_s^{ja}]_x, \qquad j \in \{1,\dots,k\},\ a \in \{0, 1,\dots,d\},
\]
for some $[\alpha^s]_x \in C^\omega_x(\mathbb{R}^{n_0})$. Then, by the lemma above, if we define $\rho_s \in C^\omega(U_0 \times V, \mathbb{R}^k)$, $s \in \{1,\dots,r\}$, by
\[
\rho_s^j(x, y) = \sum_{a=0}^d \sigma_s^{ja}(x) y^a, \qquad j \in \{1,\dots,k\},\ s \in \{1,\dots,r\},
\]

then $[\rho_1]_{(x,y)},\dots,[\rho_r]_{(x,y)}$ generate $\mathcal{R}_{(x,y)}(W_1,\dots,W_k)$ for every $(x, y) \in U_0 \times V$. This proves the theorem in the case that $f_1,\dots,f_k$ are Weierstrass polynomials.

Now let us suppose that this is not necessarily the case. By Lemma 2.4.19 let $\psi\colon \mathbb{R}^{n_0+1} \to \mathbb{R}^{n_0+1}$ be an orthogonal transformation such that $\psi^* f_1,\dots,\psi^* f_k$ are normalised. Then, by the Weierstrass Preparation Theorem, write $\psi^* f_j = E_j W_j$ for $(x, y)$ in some neighbourhood $U' \times V'$ of $(0, 0)$, and where $W_1,\dots,W_k$ are Weierstrass polynomials and $E_1,\dots,E_k \in C^\omega(U' \times V')$ are nonzero at $(0, 0)$. Let us suppose that $U' \times V'$ is sufficiently small that each $E_j$ is nowhere zero. By the proof for Weierstrass polynomials above, there exists a neighbourhood $U_0 \times V_0 \subseteq U' \times V'$ of $(0, 0)$ and $\rho_1,\dots,\rho_r \in C^\omega(U_0 \times V_0, \mathbb{R}^k)$ such that $[\rho_1]_{(x,y)},\dots,[\rho_r]_{(x,y)}$ generate $\mathcal{R}_{(x,y)}(W_1,\dots,W_k)$ for each $(x, y) \in U_0 \times V_0$. One then directly sees that $[\psi_*(E^{-1}\rho_1)]_{(x,y)},\dots,[\psi_*(E^{-1}\rho_r)]_{(x,y)}$ generate $\mathcal{R}_{(x,y)}(f_1,\dots,f_k)$ for each $(x, y) \in \psi(U_0 \times V_0)$, where $\psi_* = (\psi^{-1})^*$ and where $E^{-1}\rho_a$ has components $E_j^{-1}\rho_a^j$. Since $\psi(U_0 \times V_0)$ is a neighbourhood of $(0, 0)$, the theorem follows.

The point of the theorem is nicely encapsulated by a commutative diagram. With the setting of the statement of the theorem, define $\Xi\colon \Gamma^\omega(\mathbb{R}^k_U) \to \Gamma^\omega(V|U)$ by
\[
(\Xi(\alpha))(x) = \alpha^1(x)\xi_1(x) + \dots + \alpha^k(x)\xi_k(x)
\]
and $P\colon \Gamma^\omega(\mathbb{R}^r_U) \to \Gamma^\omega(\mathbb{R}^k_U)$ by
\[
(P(\beta))^j(x) = \beta^1(x)\rho_1^j(x) + \dots + \beta^r(x)\rho_r^j(x), \qquad j \in \{1,\dots,k\}.
\]
Then we have the exact sequence
\[
\Gamma^\omega(\mathbb{R}^r_U) \xrightarrow{\ P\ } \Gamma^\omega(\mathbb{R}^k_U) \xrightarrow{\ \Xi\ } \Gamma^\omega(V|U) \longrightarrow 0
\]
of $C^\omega(U)$-modules.

The results here are part of a substantial piece of work in this area by Oka. The theorem we state here appeared, but not quite in the same form, in [Oka 1950]. The statement and proof of the coherence theorem we give follow those of Hörmander [1973]. Oka, and most of the folks interested in his work, are concerned with sheaf theory. The work of Oka has been translated into English [Oka 1984]. The way in which one slickly states Oka's Coherence Theorem in this setting is: "Every finitely generated analytic sheaf is coherent." The notion of coherence of a sheaf means that the sheaf and its module of relations are both locally finitely generated. Thus our results might be interpreted as saying, "Analytic submodules of sections of vector bundles are coherent submodules," where by a coherent submodule we mean one that is locally finitely generated and whose module of relations is locally finitely generated.

This version: 23/06/2009

Chapter 3 Time-dependent vector fields and their flows


In this chapter we introduce the class of time-dependent differential equations and vector fields that are useful in control theory, and we discuss the existence, uniqueness, and dependence on initial conditions for integral curves of these vector fields. We begin in Section 3.1 by defining the class of vector fields we will consider, and giving some of their basic properties. After this initial rather technical discussion, in Section 3.2 we begin our discussion of integral curves by defining and characterising the class of curves that arise as integral curves for our general class of vector fields. In Section 3.3 we consider flows for our classes of vector fields, carefully outlining how these flows depend on time, initial condition, and parameters.

3.1 Vector fields depending measurably on time


In control theory one deals with controls that are generally not continuous functions of time. This necessitates the consideration of differential equations where the dependence on time is quite general. In this section we consider with some care the matter of differential equations where the time-dependence is locally integrable. It is convenient notationally, although admittedly repetitive, to break the discussion into three parts: the finitely differentiable case; the smooth case; the analytic case.

3.1.1 The finitely differentiable case

Although we are mainly interested in vector fields, we do this via functions. So we first make some definitions concerning functions depending on time in a general way.

3.1.1 Definition (Finitely differentiable functions with measurable time-dependence) Let $r \in \mathbb{Z}_{\ge 0}$ and let $M$ be a $C^\infty$-manifold. A Carathéodory function of class $C^r$ on $M$ is a map $f\colon T \times M \to \mathbb{R}$ with the following properties:
(i) $T \subseteq \mathbb{R}$ is an interval called the time-domain for $f$;
(ii) for each $t \in T$, the map $f_t\colon M \to \mathbb{R}$ defined by $f_t(x) = f(t, x)$ is of class $C^r$;
(iii) for each $x \in M$, the map $f^x\colon T \to \mathbb{R}$ defined by $f^x(t) = f(t, x)$ is Lebesgue measurable.


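A Carathéodory function $f\colon T \times M \to \mathbb{R}$ of class $C^r$, $r \in \mathbb{Z}_{\ge 0}$, is locally integrably of class $C^r$ if, for every compact set $K \subseteq M$ and every $X_1,\dots,X_k \in \Gamma^\infty(TM)$, $k \in \{0, 1,\dots,r\}$, there exists $g \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[
|X_1 \cdots X_k f_t(x)| \le g(t), \qquad (t, x) \in T \times K.
\]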
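To make the role of the integrable bound $g$ concrete, here is a minimal numerical sketch for a hypothetical example that does not appear in the notes: $M = \mathbb{R}$, $T = [0, 1]$, and $f(t, x) = t^{-1/2}\sin x$ for $t > 0$ with $f(0, x) = 0$. The function is unbounded in $t$, yet $g(t) = t^{-1/2}$ bounds all $x$-derivatives and is integrable on $T$. The check uses coordinate derivatives, which is the form of the bound appearing in Proposition 3.1.3(i).

```python
import math

def g(t):
    """Candidate integrable bound: g(t) = t**-0.5 for t > 0, and 0 at t = 0."""
    return 1.0 / math.sqrt(t) if t > 0 else 0.0

def dk_f(t, x, k):
    """k-th x-derivative of the hypothetical f(t, x) = g(t) * sin(x)."""
    # Derivatives of sin cycle through sin, cos, -sin, -cos.
    trig = (math.sin, math.cos,
            lambda s: -math.sin(s), lambda s: -math.cos(s))[k % 4]
    return g(t) * trig(x)

def bound_holds(r=3, n_t=50, n_x=50):
    """Check |d^k f_t / dx^k| <= g(t) on a grid of T x K, with K = [-2, 2]."""
    for a in range(n_t + 1):
        t = a / n_t
        for b in range(n_x + 1):
            x = -2.0 + 4.0 * b / n_x
            for k in range(r + 1):
                if abs(dk_f(t, x, k)) > g(t) + 1e-12:
                    return False
    return True

def integral_of_g(eps):
    """Closed form of the integral of t**-0.5 over [eps, 1]."""
    return 2.0 * (1.0 - math.sqrt(eps))
```

The closed-form integral stays below $2$ as `eps` tends to $0$, which is what local integrability of $g$ on $[0, 1]$ amounts to here.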

The set of functions with time-domain $T$ that are locally integrably of class $C^r$, $r \in \mathbb{Z}_{\ge 0}$, is denoted by $\mathrm{LIC}^r(T, M)$.

Using these definitions for functions, we can easily extend to corresponding definitions for vector fields.

3.1.2 Definition (Finitely differentiable vector fields with measurable time-dependence) Let $r \in \mathbb{Z}_{\ge 0}$ and let $M$ be a $C^\infty$-manifold. A Carathéodory vector field of class $C^r$ on $M$ is a map $X\colon T \times M \to TM$ such that, if $X_t\colon M \to TM$ is defined by $X_t(x) = X(t, x)$, then
(i) for each $t \in T$, $X_t$ is a vector field and
(ii) the map $(t, x) \mapsto X_t f(x)$ is a Carathéodory function of class $C^r$ for every $f \in C^\infty(M)$.
The interval $T$ is called the time-domain for $X$. A Carathéodory vector field $X\colon T \times M \to TM$ of class $C^r$, $r \in \mathbb{Z}_{\ge 0}$, is locally integrably of class $C^r$ if the function $(t, x) \mapsto X_t f(x)$ is in $\mathrm{LIC}^r(T, M)$ for every $f \in C^\infty(M)$. The set of vector fields that are locally integrably of class $C^r$, $r \in \mathbb{Z}_{\ge 0}$, is denoted by $\mathrm{LI}^r(T, TM)$.

The following result gives a more easily parsed characterisation of elements of $\mathrm{LIC}^r(T, M)$ and $\mathrm{LI}^r(T, TM)$.

3.1.3 Proposition (Characterisation of $\mathrm{LIC}^r(T, M)$ and $\mathrm{LI}^r(T, TM)$) Let $M$ be a $C^\infty$-manifold, let $T \subseteq \mathbb{R}$ be an interval, and let $r \in \mathbb{Z}_{\ge 0}$. Then the following statements hold:
(i) a Carathéodory function $f\colon T \times M \to \mathbb{R}$ of class $C^r$ is an element of $\mathrm{LIC}^r(T, M)$ if and only if, for every coordinate chart $(U, \phi)$ for $M$ with coordinates $(x^1,\dots,x^n)$, for every compact subset $K \subseteq \phi(U)$, and for every $j_1,\dots,j_k \in \{1,\dots,n\}$ with $k \in \{0, 1,\dots,r\}$, there exists $g \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[
\Bigl| \frac{\partial^k (f_t \circ \phi^{-1})}{\partial x^{j_1} \cdots \partial x^{j_k}}(x) \Bigr| \le g(t), \qquad (t, x) \in T \times K;
\]

(ii) a Carathéodory vector field $X\colon T \times M \to TM$ is an element of $\mathrm{LI}^r(T, TM)$ if and only if, for every coordinate chart $(U, \phi)$ for $M$ with coordinates $(x^1,\dots,x^n)$, the components $X^1,\dots,X^n$ of $X$ in these coordinates are elements of $\mathrm{LIC}^r(T, U)$.

Proof (i) First suppose that $f \in \mathrm{LIC}^r(T, M)$, let $(U, \phi)$ be a chart with coordinates $(x^1,\dots,x^n)$, and let $K \subseteq \phi(U)$ be compact. The following lemma is useful for some of the coordinate constructions we perform.

1 Lemma Let $M$ be a smooth manifold, let $(U, \phi)$ be a coordinate chart for $M$, and let $K \subseteq \phi(U)$ be compact. If $\tilde{A} \in \Gamma^k(T^r_s(T(\phi(U))))$ is an $(r, s)$-tensor field of class $C^k$ on $\phi(U)$, $k \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$, then there exists an $(r, s)$-tensor field $A \in \Gamma^k(T^r_s(TM))$ of class $C^k$ on $M$ such that $(\phi_* A)|K = \tilde{A}|K$.


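Proof For $x \in K$ let $\epsilon_x \in \mathbb{R}_{>0}$ be such that $B^n(2\epsilon_x, x) \subseteq \phi(U)$. Since $K$ is compact and since $K \subseteq \cup_{x \in K} B^n(\epsilon_x, x)$, there exist $x_1,\dots,x_m \in K$ such that $K \subseteq \cup_{j=1}^m B^n(\epsilon_{x_j}, x_j)$. Let $K' = \mathrm{cl}(\cup_{j=1}^m B^n(\epsilon_{x_j}, x_j))$ and note that $K' \subseteq \phi(U)$ and that $K \subseteq \mathrm{int}(K')$. By the smooth Tietze Extension Theorem [Abraham, Marsden, and Ratiu 1988, Proposition 5.5.8] let $f \in C^\infty(\phi(U))$ be such that $f(x) = 1$ for $x \in K$ and $f(x) = 0$ for $x \in \phi(U) \setminus \mathrm{int}(K')$. Then define $\tilde{A}' = f\tilde{A} \in \Gamma^k(T^r_s(T(\phi(U))))$ and define $A(x) = \phi^*\tilde{A}'(x)$ for $x \in U$ and $A(x) = 0$ otherwise. Note that $A$ is of class $C^k$ and that $\phi_* A$ agrees with $\tilde{A}$ on $K$, as desired.

Let $\tilde{X}_1,\dots,\tilde{X}_n \in \Gamma^\infty(T(\phi(U)))$ be the standard vector fields: $\tilde{X}_j(x) = (x, e_j)$. By the lemma, there exist smooth vector fields $X_1,\dots,X_n$ on $M$ whose local representatives agree with the vector fields $\tilde{X}_1,\dots,\tilde{X}_n$ on $K$. Then, for $k \in \{0, 1,\dots,r\}$ and for $j_1,\dots,j_k \in \{1,\dots,n\}$,
\[
\tilde{X}_{j_1} \cdots \tilde{X}_{j_k} (f_t \circ \phi^{-1})(x) = \frac{\partial^k (f_t \circ \phi^{-1})}{\partial x^{j_1} \cdots \partial x^{j_k}}(x).
\]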

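Then, since $\phi^{-1}(K)$ is compact and since $f \in \mathrm{LIC}^r(T, M) \subseteq \mathrm{LIC}^k(T, M)$, there exists $g \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[
\Bigl| \frac{\partial^k (f_t \circ \phi^{-1})}{\partial x^{j_1} \cdots \partial x^{j_k}}(x) \Bigr| \le g(t), \qquad (t, x) \in T \times K,
\]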

as desired.

Now we prove the converse. Let $K \subseteq M$ be compact and let $X_1,\dots,X_r \in \Gamma^\infty(TM)$. Let $x \in K$ and let $(U_x, \phi_x)$ be a coordinate chart about $x$ such that $\phi_x(x) = 0$. Let $r_x \in \mathbb{R}_{>0}$ be such that $\overline{B}^n(r_x, 0) \subseteq \phi_x(U_x)$. Then the open sets $(\phi_x^{-1}(B^n(\tfrac{1}{2}r_x, 0)))_{x \in K}$ cover $K$, and so compactness of $K$ gives $x_1,\dots,x_k \in K$ such that
\[
K \subseteq \cup_{j=1}^k \phi_{x_j}^{-1}(B^n(\tfrac{1}{2}r_{x_j}, 0)) \subseteq \cup_{j=1}^k \phi_{x_j}^{-1}(\overline{B}^n(r_{x_j}, 0)).
\]
For each $j \in \{1,\dots,k\}$ the function
\[
\overline{B}^n(r_{x_j}, 0) \ni x \mapsto (X_1 \cdots X_r f_t) \circ \phi_{x_j}^{-1}(x) \in \mathbb{R}
\]
is a linear combination of products of partial derivatives of the components of the vector fields $X_1,\dots,X_r$ and partial derivatives of $f_t \circ \phi_{x_j}^{-1}$, with the total number of partial derivatives being $r$. Since $\overline{B}^n(r_{x_j}, 0)$ is compact, this implies that each term in this linear combination will be bounded by a constant multiplied by the norm of a partial derivative of $f_t \circ \phi_{x_j}^{-1}$ of degree at most $r$. Our hypotheses then give $g_j \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[
|(X_1 \cdots X_r f_t) \circ \phi_{x_j}^{-1}(x)| \le g_j(t), \qquad (t, x) \in T \times \overline{B}^n(r_{x_j}, 0).
\]
If we define $g(t) = \max\{g_1(t),\dots,g_k(t)\}$ then $g \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ and $|X_1 \cdots X_r f_t(x)| \le g(t)$ for every $(t, x) \in T \times K$, as desired.


(ii) Suppose that $X \in \mathrm{LI}^r(T, TM)$ and let $(U, \phi)$ be a coordinate chart with coordinates $(x^1,\dots,x^n)$. Let $\chi^j\colon U \to \mathbb{R}$ be the $j$th coordinate function. Then $X_t^j = X_t \chi^j$ is the $j$th component of $X_t$, and so the result follows from the first part of the proposition.

For the converse, let $(U, \phi)$ be a coordinate chart with coordinates $(x^1,\dots,x^n)$, and let $(X^1,\dots,X^n)$ be the components of $X$, thought of as functions on $T \times U$. Suppose that these functions are elements of $\mathrm{LIC}^r(T, U)$. Let $f \in C^\infty(M)$ and let $K \subseteq \phi(U)$ be compact. By the previous part of the proposition, to show that $X_t f \in \mathrm{LIC}^r(T, M)$, and so that $X \in \mathrm{LI}^r(T, TM)$, it suffices to show that there exists $g \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[
\Bigl| \frac{\partial^k ((X_t f) \circ \phi^{-1})}{\partial x^{j_1} \cdots \partial x^{j_k}}(x) \Bigr| \le g(t), \qquad x \in K,
\]
for all $j_1,\dots,j_k \in \{1,\dots,n\}$ and $k \in \{0, 1,\dots,r\}$. This conclusion, however, follows in the same manner as the similar statement in the second part of the proof above.

3.1.2 The smooth case

It is fairly easy to provide the definitions for infinitely differentiable time-dependent vector fields, given that we understand the finitely differentiable case. As above, we start by considering functions.

3.1.4 Definition (Smooth functions with measurable time-dependence) Let $M$ be a $C^\infty$-manifold. A Carathéodory function of class $C^\infty$ on $M$ is a map $f\colon T \times M \to \mathbb{R}$ with the following properties:
(i) $T \subseteq \mathbb{R}$ is an interval called the time-domain for $f$;
(ii) for each $t \in T$, the map $f_t\colon M \to \mathbb{R}$ defined by $f_t(x) = f(t, x)$ is of class $C^\infty$;
(iii) for each $x \in M$, the map $f^x\colon T \to \mathbb{R}$ defined by $f^x(t) = f(t, x)$ is Lebesgue measurable.
A Carathéodory function $f\colon T \times M \to \mathbb{R}$ of class $C^\infty$ is locally integrably of class $C^\infty$ if it is locally integrably of class $C^r$ for every $r \in \mathbb{Z}_{\ge 0}$. The set of functions with time-domain $T$ that are locally integrably of class $C^\infty$ is denoted by $\mathrm{LIC}^\infty(T, M)$.

For vector fields, this leads to the following definition.

3.1.5 Definition (Smooth vector fields with measurable time-dependence) Let $M$ be a $C^\infty$-manifold. A Carathéodory vector field of class $C^\infty$ on $M$ is a map $X\colon T \times M \to TM$ such that, if $X_t\colon M \to TM$ is defined by $X_t(x) = X(t, x)$, then
(i) for each $t \in T$, $X_t$ is a vector field and
(ii) the map $(t, x) \mapsto X_t f(x)$ is a Carathéodory function of class $C^\infty$ for every $f \in C^\infty(M)$.
The interval $T$ is called the time-domain for $X$. A Carathéodory vector field $X\colon T \times M \to TM$ of class $C^\infty$ is locally integrably of class $C^\infty$ if the map $(t, x) \mapsto X_t f(x)$ is in $\mathrm{LIC}^\infty(T, M)$ for every $f \in C^\infty(M)$. The set of vector fields that are locally integrably of class $C^\infty$ is denoted by $\mathrm{LI}^\infty(T, TM)$.


The next result follows immediately from Proposition 3.1.3.

3.1.6 Proposition (Characterisation of $\mathrm{LIC}^\infty(T, M)$ and $\mathrm{LI}^\infty(T, TM)$) Let $M$ be a $C^\infty$-manifold and let $T \subseteq \mathbb{R}$ be an interval. Then the following statements hold:
(i) a Carathéodory function $f\colon T \times M \to \mathbb{R}$ of class $C^\infty$ is an element of $\mathrm{LIC}^\infty(T, M)$ if and only if, for every coordinate chart $(U, \phi)$ for $M$ with coordinates $(x^1,\dots,x^n)$ and for every compact subset $K \subseteq \phi(U)$ and every $j_1,\dots,j_k \in \{1,\dots,n\}$ with $k \in \mathbb{Z}_{\ge 0}$, there exists $g \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[
\Bigl| \frac{\partial^k (f_t \circ \phi^{-1})}{\partial x^{j_1} \cdots \partial x^{j_k}}(x) \Bigr| \le g(t), \qquad (t, x) \in T \times K;
\]
(ii) a Carathéodory vector field $X\colon T \times M \to TM$ is an element of $\mathrm{LI}^\infty(T, TM)$ if and only if, for every coordinate chart $(U, \phi)$ for $M$ with coordinates $(x^1,\dots,x^n)$, the components $X^1,\dots,X^n$ of $X$ in these coordinates are elements of $\mathrm{LIC}^\infty(T, U)$.

3.1.3 The locally Lipschitz case

The final class of time-dependent vector fields we consider, those that are locally Lipschitz, arises in a natural way in the investigation of conditions for existence of integral curves. As we shall see in Proposition 3.1.12, all reasonable classes of vector fields, i.e., those that are differentiable, are locally Lipschitz. As always, we begin with functions. In order to make sensible definitions, we first need a technical result.

3.1.7 Lemma (Independence of Lipschitz condition on metric) Let $M$ be a smooth manifold and let $G_1$ and $G_2$ be Riemannian metrics on $M$, with $d_1$ and $d_2$ the corresponding metrics. Then, for $f\colon M \to \mathbb{R}$, the following statements are equivalent:
(i) for every compact subset $K \subseteq M$ there exists $L_1 \in \mathbb{R}_{>0}$ such that $|f(x) - f(y)| \le L_1 d_1(x, y)$ for every $x, y \in K$;
(ii) for every compact subset $K \subseteq M$ there exists $L_2 \in \mathbb{R}_{>0}$ such that $|f(x) - f(y)| \le L_2 d_2(x, y)$ for every $x, y \in K$.

Proof Suppose that for every compact subset $K \subseteq M$ there exists $L_1 \in \mathbb{R}_{>0}$ such that $|f(x) - f(y)| \le L_1 d_1(x, y)$ for every $x, y \in K$. Let $K \subseteq M$ be compact. Then, according to Theorem 4.1.8, let $c \in \mathbb{R}_{>0}$ be such that $c^{-1} d_1(x, y) \le d_2(x, y)$ for every $x, y \in K$. Taking $L_2 = cL_1$ we have
\[
|f(x) - f(y)| \le L_1 d_1(x, y) \le L_2 d_2(x, y), \qquad x, y \in K,
\]
as desired. The converse implication is proved in the same manner.

Now, with that piece of annoyance out of the way, the following definition can be made.

3.1.8 Definition (Lipschitz functions with measurable time-dependence) Let $M$ be a smooth manifold with a Riemannian metric $G$ and corresponding metric $d_G$.


(i) A function $f\colon M \to \mathbb{R}$ is locally Lipschitz if, for every compact subset $K \subseteq M$ there exists $L \in \mathbb{R}_{>0}$ such that $|f(x) - f(y)| \le L d_G(x, y)$ for every $x, y \in K$.
(ii) A function $f\colon T \times M \to \mathbb{R}$ is locally Lipschitz if it has the following properties:
(a) $T \subseteq \mathbb{R}$ is an interval called the time-domain for $f$;
(b) for each $t \in T$, the map $f_t\colon M \to \mathbb{R}$ defined by $f_t(x) = f(t, x)$ is locally Lipschitz;
(c) for each $x \in M$, the map $f^x\colon T \to \mathbb{R}$ defined by $f^x(t) = f(t, x)$ is Lebesgue measurable.
A function $f \in \mathrm{LIC}^0(T, M)$ is locally integrally Lipschitz if, for every compact set $K \subseteq M$ there exists $L \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[
|f(t, x) - f(t, y)| \le L(t) d_G(x, y), \qquad (t, x, y) \in T \times K \times K.
\]
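A quick numerical sketch of the last inequality, for a hypothetical example not taken from the notes: on $M = \mathbb{R}$ with the usual distance and $T = [0, 1]$, the function $f(t, x) = t^{-1/2}|x|$ (set to $0$ at $t = 0$) is not differentiable in $x$, but it satisfies the Lipschitz bound with the integrable function $L(t) = t^{-1/2}$.

```python
import itertools
import math

def f(t, x):
    """Hypothetical example: f(t, x) = |x| / sqrt(t), with f(0, x) = 0."""
    return abs(x) / math.sqrt(t) if t > 0 else 0.0

def L(t):
    """Time-dependent Lipschitz constant L(t) = 1 / sqrt(t), with L(0) = 0."""
    return 1.0 / math.sqrt(t) if t > 0 else 0.0

def lipschitz_bound_holds(n_t=40, n_x=40, radius=2.0):
    """Check |f(t, x) - f(t, y)| <= L(t) |x - y| on a grid of T x K x K."""
    xs = [-radius + 2.0 * radius * i / n_x for i in range(n_x + 1)]
    for a in range(n_t + 1):
        t = a / n_t
        for x, y in itertools.product(xs, xs):
            if abs(f(t, x) - f(t, y)) > L(t) * abs(x - y) + 1e-12:
                return False
    return True
```

Here the compact set is $K = [-2, 2]$ and the bound follows from $\bigl||x| - |y|\bigr| \le |x - y|$; the grid check is only a sanity test, not a proof.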

The set of locally integrally Lipschitz functions with time-domain $T$ is denoted by $\mathrm{LIL}(T, M)$.

3.1.9 Remark (On the need for localness) Note that the lemma preceding the definitions holds for a compact subset of $M$. By restricting to this sort of local definition, the definition of locally Lipschitz can be made independent of the choice of Riemannian metric $G$. Were we to define global Lipschitz functions, the choice of Riemannian metric would then matter, i.e., global Lipschitzness depends not just on the manifold structure, but on a choice of metric.

Now we can define what we mean by a locally Lipschitz vector field.

3.1.10 Definition (Lipschitz vector fields with measurable time-dependence) Let $M$ be a paracompact smooth manifold.
(i) A vector field $X\colon M \to TM$ is locally Lipschitz if $Xf$ is locally Lipschitz for every $f \in C^\infty(M)$.
(ii) A map $X\colon T \times M \to TM$ is a locally Lipschitz vector field if, for $X_t\colon M \to TM$ defined by $X_t(x) = X(t, x)$,
(a) for each $t \in T$, $X_t$ is a vector field and
(b) for each $t \in T$, $X_t f$ is locally Lipschitz for every $f \in C^\infty(M)$.
The interval $T$ is called the time-domain for $X$. A vector field $X \in \mathrm{LI}^0(T, TM)$ is locally integrally Lipschitz if $X_t f$ is locally integrally Lipschitz for every $f \in C^\infty(M)$. The set of locally integrally Lipschitz vector fields with time-domain $T$ is denoted by $\mathrm{LIL}(T, TM)$.

We can fairly easily characterise locally Lipschitz functions and vector fields in coordinates. Note that locally Lipschitz functions are Carathéodory functions since locally Lipschitz functions are more or less obviously continuous.

3.1.11 Proposition (Characterisation of $\mathrm{LIL}(T, M)$ and $\mathrm{LIL}(T, TM)$) Let $M$ be a paracompact $C^\infty$-manifold and let $T \subseteq \mathbb{R}$ be an interval. Then the following statements hold:


(i) a Carathéodory function $f\colon T \times M \to \mathbb{R}$ is an element of $\mathrm{LIL}(T, M)$ if and only if, for every coordinate chart $(U, \phi)$ for $M$ and every compact subset $K \subseteq \phi(U)$, there exist $g, L \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[
|f_t \circ \phi^{-1}(x)| \le g(t), \qquad (t, x) \in T \times K,
\]
and
\[
|f_t \circ \phi^{-1}(x) - f_t \circ \phi^{-1}(y)| \le L(t) \|x - y\|, \qquad (t, x, y) \in T \times K \times K;
\]
(ii) a Carathéodory vector field $X\colon T \times M \to TM$ is an element of $\mathrm{LIL}(T, TM)$ if and only if, for every coordinate chart $(U, \phi)$ for $M$ with coordinates $(x^1,\dots,x^n)$, the components $X^1,\dots,X^n$ of $X$ are elements of $\mathrm{LIL}(T, U)$.

Proof Note that paracompactness of $M$ allows us to equip $M$ with a Riemannian metric [Abraham, Marsden, and Ratiu 1988, Theorem 5.5.12].

(i) First suppose that $f \in \mathrm{LIL}(T, M)$, let $(U, \phi)$ be a chart for $M$, and let $K \subseteq \phi(U)$ be compact. From Proposition 3.1.3 we immediately have
\[
|f_t \circ \phi^{-1}(x)| \le g(t), \qquad x \in K,
\]

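for some $g \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$. Let $G$ be a Riemannian metric agreeing on $\phi^{-1}(K)$ with the pull-back by $\phi$ of the standard Riemannian metric on $\phi(U)$ [Abraham, Marsden, and Ratiu 1988, Theorem 5.5.9]. Let $d_G$ be the associated metric. Since $\phi^{-1}(K)$ is compact, there exists $L \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[
|f_t(x) - f_t(y)| \le L(t) d_G(x, y), \qquad t \in T,\ x, y \in \phi^{-1}(K).
\]
Pushing this forward to $\phi(U)$ gives
\[
|f_t \circ \phi^{-1}(x) - f_t \circ \phi^{-1}(y)| \le L(t) \|x - y\|, \qquad (t, x, y) \in T \times K \times K,
\]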

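as desired.

For the other implication, suppose that, for each chart $(U, \phi)$ and each compact $K \subseteq \phi(U)$, there exist $g, L \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[
|f_t \circ \phi^{-1}(x)| \le g(t), \qquad x \in K,
\]
and
\[
|f_t \circ \phi^{-1}(x) - f_t \circ \phi^{-1}(y)| \le L(t) \|x - y\|, \qquad (t, x, y) \in T \times K \times K.
\]
From Proposition 3.1.3 it follows that $f \in \mathrm{LIC}^0(T, M)$. Let $K \subseteq M$ be compact and let $x \in K$. Let $(U_x, \phi_x)$ be a chart about $x$ such that $\phi_x(x) = 0$. Let $r_x \in \mathbb{R}_{>0}$ be such that $K_x \triangleq \overline{B}^n(r_x, 0) \subseteq \phi_x(U_x)$. By assumption there exists $L_x \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[
|f_t \circ \phi_x^{-1}(y) - f_t \circ \phi_x^{-1}(z)| \le L_x(t) \|y - z\|, \qquad (t, y, z) \in T \times K_x \times K_x.
\]
Pulling this formula back to $U_x$ by $\phi_x$ gives
\[
|f_t(y) - f_t(z)| \le L_x(t) d_x(y, z), \qquad t \in T,\ y, z \in \phi_x^{-1}(K_x),
\]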


where $d_x$ is the metric on $U_x$ induced by the standard metric on $\phi_x(U_x)$. Now let $G$ be a Riemannian metric on $M$ and denote by $d_G$ the corresponding metric. By Theorem 4.1.8 there exists $c_x \in \mathbb{R}_{>0}$ such that
\[
|f_t(y) - f_t(z)| \le c_x L_x(t) d_G(y, z), \qquad t \in T,\ y, z \in \phi_x^{-1}(K_x).
\]
Now note that $(\phi_x^{-1}(B^n(r_x, 0)))_{x \in K}$ covers $K$ and so there exist $x_1,\dots,x_k \in K$ such that
\[
K \subseteq \cup_{j=1}^k \phi_{x_j}^{-1}(B^n(r_{x_j}, 0)) \subseteq \cup_{j=1}^k \phi_{x_j}^{-1}(K_{x_j}).
\]
Let us abbreviate $N_j = \phi_{x_j}^{-1}(B^n(r_{x_j}, 0))$, $L_j = L_{x_j}$, and $c_j = c_{x_j}$ for $j \in \{1,\dots,k\}$. As in the proof of Theorem 4.1.8, there exists $r \in \mathbb{R}_{>0}$ such that, if $x, y \in K$ satisfy $d_G(x, y) < r$, then $x, y \in N_j$ for some $j \in \{1,\dots,k\}$. Since $f \in \mathrm{LIC}^0(T, M)$, there exists $g \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[
|f_t(x)| \le g(t), \qquad t \in T,\ x \in K.
\]
Taking the supremum gives
\[
\sup\{|f_t(x)| \mid x \in K\} \le g(t), \qquad t \in T.
\]
Now define $L \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ by
\[
L(t) = \max\Bigl\{c_1 L_1(t),\dots,c_k L_k(t), \frac{2g(t)}{r}\Bigr\}.
\]
We let $x, y \in K$. If $d_G(x, y) < r$ then $x, y \in N_j$ for some $j \in \{1,\dots,k\}$ and so
\[
|f_t(x) - f_t(y)| \le c_j L_j(t) d_G(x, y) \le L(t) d_G(x, y).
\]
If $d_G(x, y) \ge r$ then
\[
|f_t(x) - f_t(y)| \le |f_t(x)| + |f_t(y)| \le 2g(t) = \frac{2g(t)}{r} r \le L(t) d_G(x, y),
\]

and so $f \in \mathrm{LIL}(T, M)$, as desired.

(ii) Suppose that $X \in \mathrm{LIL}(T, TM)$ and let $(U, \phi)$ be a chart with coordinates $(x^1,\dots,x^n)$. Let $\chi^j\colon U \to \mathbb{R}$ be the $j$th coordinate function. Then $X_t^j = X_t \chi^j$ is the $j$th component of $X_t$. Now this part of the result follows from the first part of the proposition.

For the other implication, let $(U, \phi)$ be a coordinate chart with coordinates $(x^1,\dots,x^n)$ and let $K \subseteq \phi(U)$ be compact. Let $X^1,\dots,X^n$ be the components of $X$, thought of as functions on $T \times U$. Suppose that $X^1,\dots,X^n \in \mathrm{LIL}(T, U)$. Thus, in particular, $X^j \in \mathrm{LIC}^0(T, U)$ for $j \in \{1,\dots,n\}$ and so there exist $g_1,\dots,g_n \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[
|X_t^j \circ \phi^{-1}(x)| \le g_j(t), \qquad t \in T,\ x \in K,\ j \in \{1,\dots,n\}.
\]
Also, there exist $L_1,\dots,L_n \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[
|X_t^j \circ \phi^{-1}(x) - X_t^j \circ \phi^{-1}(y)| \le L_j(t) \|x - y\|, \qquad t \in T,\ x, y \in K,\ j \in \{1,\dots,n\}.
\]
Now let $f \in C^\infty(M)$. By Proposition 3.1.3 it follows that $X_t f \in \mathrm{LIC}^0(T, U)$. In Proposition 3.1.12 below we will show that continuously differentiable


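functions on open subsets of Euclidean space are locally Lipschitz. In particular, the partial derivatives of $f \circ \phi^{-1}$ are locally Lipschitz, and so there exist $\lambda_1,\dots,\lambda_n \in \mathbb{R}_{>0}$ such that
\[
\Bigl| \frac{\partial (f \circ \phi^{-1})}{\partial x^j}(x) - \frac{\partial (f \circ \phi^{-1})}{\partial x^j}(y) \Bigr| \le \lambda_j \|x - y\|
\]
for all $x, y \in K$ and $j \in \{1,\dots,n\}$ (this is a matter of applying the Mean Value Theorem, as we shall see in Proposition 3.1.12). Let us denote
\[
\|f_{,j}\|_{K,\infty} = \sup\Bigl\{ \Bigl| \frac{\partial (f \circ \phi^{-1})}{\partial x^j}(x) \Bigr| \Bigm| x \in K \Bigr\}, \qquad
\|X_t^j\|_{K,\infty} = \sup\{ |X_t^j \circ \phi^{-1}(x)| \mid x \in K \}.
\]
Then
\begin{align*}
|(X_t f) \circ \phi^{-1}(x) - (X_t f) \circ \phi^{-1}(y)|
&= \Bigl| \sum_{j=1}^n \Bigl( X_t^j \circ \phi^{-1}(x) \frac{\partial (f \circ \phi^{-1})}{\partial x^j}(x) - X_t^j \circ \phi^{-1}(y) \frac{\partial (f \circ \phi^{-1})}{\partial x^j}(y) \Bigr) \Bigr| \\
&= \Bigl| \sum_{j=1}^n X_t^j \circ \phi^{-1}(x) \Bigl( \frac{\partial (f \circ \phi^{-1})}{\partial x^j}(x) - \frac{\partial (f \circ \phi^{-1})}{\partial x^j}(y) \Bigr) \\
&\qquad + \sum_{j=1}^n \frac{\partial (f \circ \phi^{-1})}{\partial x^j}(y) \bigl( X_t^j \circ \phi^{-1}(x) - X_t^j \circ \phi^{-1}(y) \bigr) \Bigr| \\
&\le \sum_{j=1}^n |X_t^j \circ \phi^{-1}(x)| \Bigl| \frac{\partial (f \circ \phi^{-1})}{\partial x^j}(x) - \frac{\partial (f \circ \phi^{-1})}{\partial x^j}(y) \Bigr| \\
&\qquad + \sum_{j=1}^n \Bigl| \frac{\partial (f \circ \phi^{-1})}{\partial x^j}(y) \Bigr| \, |X_t^j \circ \phi^{-1}(x) - X_t^j \circ \phi^{-1}(y)| \\
&\le \sum_{j=1}^n \lambda_j \|X_t^j\|_{K,\infty} \|x - y\| + \sum_{j=1}^n \|f_{,j}\|_{K,\infty} L_j(t) \|x - y\|.
\end{align*}
Note that $\|X_t^j\|_{K,\infty} \le g_j(t)$ for all $t \in T$. Therefore, if we define $L \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ by
\[
L(t) = \sum_{j=1}^n \bigl( \lambda_j g_j(t) + \|f_{,j}\|_{K,\infty} L_j(t) \bigr),
\]
we have
\[
|(X_t f) \circ \phi^{-1}(x) - (X_t f) \circ \phi^{-1}(y)| \le L(t) \|x - y\|
\]
for all $t \in T$ and $x, y \in K$. That $X_t f \in \mathrm{LIL}(T, M)$ follows from the first part of the proposition. Thus $X \in \mathrm{LIL}(T, TM)$, as desired.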
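The estimate at the end of the proof is a product-rule bound, and it can be sanity-checked numerically. The sketch below uses hypothetical data that are not from the notes: $n = 1$, $K = [-1, 1]$, the chart is the identity, the vector field has component $X_t(x) = t^{-1/2}\sin x$, and $f(x) = x^2$. The constants $\lambda = 2$ and $\sup_K |f'| = 2$, and the functions $g(t) = L_1(t) = t^{-1/2}$, are worked out by hand for this example.

```python
import math

def X(t, x):
    """Component of the hypothetical vector field: X_t(x) = sin(x) / sqrt(t)."""
    return math.sin(x) / math.sqrt(t)

def df(x):
    """Derivative of the hypothetical test function f(x) = x**2."""
    return 2.0 * x

def Xf(t, x):
    """(X_t f)(x) = X_t(x) * f'(x) in the single coordinate."""
    return X(t, x) * df(x)

LAMBDA = 2.0   # Lipschitz constant of f' on K = [-1, 1]
SUP_DF = 2.0   # supremum of |f'| on K

def g(t):
    """Bound on |X_t| over K (since |sin| <= 1)."""
    return 1.0 / math.sqrt(t)

def L1(t):
    """Lipschitz constant of X_t (since sin is 1-Lipschitz)."""
    return 1.0 / math.sqrt(t)

def L(t):
    """Combined constant: lambda * g(t) + sup|f'| * L1(t)."""
    return LAMBDA * g(t) + SUP_DF * L1(t)

def estimate_holds(n_t=20, n_x=40):
    """Check |X_t f(x) - X_t f(y)| <= L(t) |x - y| on a grid."""
    xs = [-1.0 + 2.0 * i / n_x for i in range(n_x + 1)]
    for a in range(1, n_t + 1):
        t = a / n_t
        for x in xs:
            for y in xs:
                if abs(Xf(t, x) - Xf(t, y)) > L(t) * abs(x - y) + 1e-12:
                    return False
    return True
```

The constant produced by the proof is not sharp (here $L(t) = 4t^{-1/2}$, while the true Lipschitz constant of $X_t f$ on $K$ is smaller), but sharpness is not needed: only local integrability of $L$ matters.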

We can fairly easily show now that differentiable functions are locally Lipschitz. More precisely, we have the following result.


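3.1.12 Proposition ($\mathrm{LIC}^r(T, M) \subseteq \mathrm{LIL}(T, M)$ and $\mathrm{LI}^r(T, TM) \subseteq \mathrm{LIL}(T, TM)$) Let $M$ be a smooth, paracompact manifold, let $T \subseteq \mathbb{R}$ be an interval, and let $r \in \mathbb{Z}_{>0} \cup \{\infty\}$. Then $\mathrm{LIC}^r(T, M) \subseteq \mathrm{LIL}(T, M)$ and $\mathrm{LI}^r(T, TM) \subseteq \mathrm{LIL}(T, TM)$.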
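Proof Let us first show that $\mathrm{LIC}^r(T, M) \subseteq \mathrm{LIL}(T, M)$. Let $(U, \phi)$ be a chart with coordinates $(x^1,\dots,x^n)$, and let $K \subseteq \phi(U)$ be compact. If $f \in \mathrm{LIC}^r(T, M)$, then, by Proposition 3.1.3, there exist $g_0, g_1,\dots,g_n \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[
|f_t \circ \phi^{-1}(x)| \le g_0(t), \qquad \Bigl| \frac{\partial (f_t \circ \phi^{-1})}{\partial x^j}(x) \Bigr| \le g_j(t), \quad j \in \{1,\dots,n\},
\]
for all $(t, x) \in T \times K$.

Let $x \in K$ and let $r_x \in \mathbb{R}_{>0}$ be such that $B^n(2r_x, x) \subseteq \phi(U)$. For $y_1, y_2 \in B^n(r_x, x)$, the Mean Value Theorem [Abraham, Marsden, and Ratiu 1988, Proposition 2.4.8] gives
\begin{align*}
|f_t \circ \phi^{-1}(y_1) - f_t \circ \phi^{-1}(y_2)|
&\le \sup\{ \|D(f_t \circ \phi^{-1})(y)\| \mid y \in B^n(r_x, x) \} \|y_1 - y_2\| \\
&\le \sum_{j=1}^n \sup\Bigl\{ \Bigl| \frac{\partial (f_t \circ \phi^{-1})}{\partial x^j}(y) \Bigr| \Bigm| y \in B^n(r_x, x) \Bigr\} \|y_1 - y_2\| \\
&\le \sum_{j=1}^n g_j(t) \|y_1 - y_2\|,
\end{align*}
using (1.1). Since $(B^n(r_x, x))_{x \in K}$ covers $K$, there exist $x_1,\dots,x_k \in K$ such that $K \subseteq \cup_{j=1}^k B^n(r_{x_j}, x_j)$. Let us abbreviate $N_j = B^n(r_{x_j}, x_j)$ for $j \in \{1,\dots,k\}$. As in the proof of Theorem 4.1.8, there exists $r \in \mathbb{R}_{>0}$ such that if $x, y \in K$ satisfy $\|x - y\| < r$ then $x, y \in N_j$ for some $j \in \{1,\dots,k\}$. Define $L \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ by
\[
L(t) = \max\Bigl\{ \frac{2g_0(t)}{r}, \sum_{j=1}^n g_j(t) \Bigr\}.
\]
We let $x, y \in K$. If $\|x - y\| < r$ then $x, y \in N_j$ for some $j \in \{1,\dots,k\}$ and so
\[
|f_t \circ \phi^{-1}(x) - f_t \circ \phi^{-1}(y)| \le L(t) \|x - y\|.
\]
If $\|x - y\| \ge r$ then
\[
|f_t \circ \phi^{-1}(x) - f_t \circ \phi^{-1}(y)| \le |f_t \circ \phi^{-1}(x)| + |f_t \circ \phi^{-1}(y)| \le 2g_0(t) = \frac{2g_0(t)}{r} r \le L(t) \|x - y\|,
\]
and so the local representative of $f$ is locally integrally Lipschitz, and then the result follows from Proposition 3.1.11.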
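The Mean Value Theorem step can be tried out on a concrete function. The following sketch is an illustration under assumptions of our own choosing: $K = [-1, 1]^2$ (convex, so the Mean Value Theorem applies across all of $K$ and the $2g_0(t)/r$ term is not needed), and $f(t, x^1, x^2) = t^{-1/2}\sin(x^1)(x^2)^2$. The bounds `g1` and `g2` on the partial derivatives are computed by hand for this example.

```python
import math
import random

def f(t, x1, x2):
    """Hypothetical example: f(t, x) = sin(x1) * x2**2 / sqrt(t)."""
    return math.sin(x1) * x2 ** 2 / math.sqrt(t)

def g1(t):
    """Bound on |d f_t / d x1| = |cos(x1) * x2**2| / sqrt(t) over K."""
    return 1.0 / math.sqrt(t)

def g2(t):
    """Bound on |d f_t / d x2| = |2 * sin(x1) * x2| / sqrt(t) over K."""
    return 2.0 * math.sin(1.0) / math.sqrt(t)

def L(t):
    """Lipschitz constant from the Mean Value Theorem: the sum of the g_j."""
    return g1(t) + g2(t)

def check(samples=2000, seed=0):
    """Randomly test |f_t(p) - f_t(q)| <= L(t) * ||p - q|| for p, q in K."""
    rng = random.Random(seed)
    for _ in range(samples):
        t = rng.uniform(0.01, 1.0)
        p = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        q = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        lhs = abs(f(t, *p) - f(t, *q))
        rhs = L(t) * math.dist(p, q)
        if lhs > rhs + 1e-12:
            return False
    return True
```

For a non-convex compact set the segment between two points may leave the set, which is why the proof treats nearby points with the Mean Value Theorem and far-apart points with the crude bound $2g_0(t)$.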

3.2 Absolutely continuous curves


We now turn to the matter of the existence of integral curves, i.e., curves $\gamma\colon T \to M$ satisfying $\gamma'(t) = X(t, \gamma(t))$, for $X \in \mathrm{LI}^r(T, TM)$. As with our definitions concerning vector fields, we separately consider the smooth and finitely differentiable case and the analytic case.

3.2.1 Some comments about absolute continuity

From measure and integration theory [see Cohn 1980], we know that a function $f\colon [a, b] \to \mathbb{R}$ is absolutely continuous if there exists $g \in L^1([a, b]; \mathbb{R})$ such that
\[
f(t) = f(a) + \int_a^t g(\tau)\,\mathrm{d}\tau, \qquad t \in [a, b].
\]

For a general interval $T$, a function $f\colon T \to \mathbb{R}$ is locally absolutely continuous if $f|[a, b]$ is absolutely continuous for every compact interval $[a, b] \subset T$. Recall that a locally absolutely continuous function has the following properties:
1. $f$ is continuous;
2. $f$ is almost everywhere differentiable;
3. the Fundamental Theorem of Calculus holds, i.e., if
\[ f(t) = f(a) + \int_a^t g(\tau)\, d\tau \]
for some $g \in L^1([a, b]; \mathbb{R})$, then $f'(t) = g(t)$ for almost every $t \in [a, b]$;
4. if a locally absolutely continuous function $f\colon T \to \mathbb{R}$ satisfies $f'(t) = 0$ for almost every $t \in T$, then $f$ is a constant function;
5. if $T = [a, b]$ then $f\colon [a, b] \to \mathbb{R}$ is absolutely continuous if and only if, for every $\epsilon \in \mathbb{R}_{>0}$, there exists $\delta \in \mathbb{R}_{>0}$ such that, if $((a_j, b_j))_{j \in \{1, \dots, k\}}$ is a family of disjoint intervals such that
\[ \sum_{j=1}^k |b_j - a_j| < \delta, \qquad \text{then} \qquad \sum_{j=1}^k |f(b_j) - f(a_j)| < \epsilon. \]

(Note that the first three properties alone are not enough to ensure absolute continuity.) We shall freely use these properties and characterisations of absolute continuity. We also comment that absolute continuity is really a local concept. To be precise about this, we have the following.

3.2.1 Lemma (Locality of absolute continuity) Let $f\colon [a, b] \to \mathbb{R}$ be a function and let $\epsilon \in \mathbb{R}_{>0}$. Then $f$ is absolutely continuous if and only if, for every $t \in [a, b]$, $f|[a, b] \cap [t - \epsilon, t + \epsilon]$ is absolutely continuous.
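Proof. Suppose that $f$ is absolutely continuous and let $\epsilon \in \mathbb{R}_{>0}$. If $t \in [a, b]$ let $a' = \inf([a, b] \cap [t - \epsilon, t + \epsilon])$. Then, for $t' \in [a, b] \cap [t - \epsilon, t + \epsilon]$,
\[ f(t') = f(a) + \int_a^{t'} f'(\tau)\, d\tau = f(a) + \int_a^{a'} f'(\tau)\, d\tau + \int_{a'}^{t'} f'(\tau)\, d\tau = f(a') + \int_{a'}^{t'} f'(\tau)\, d\tau, \]
giving absolute continuity of $f|[a, b] \cap [t - \epsilon, t + \epsilon]$.

Now let $\epsilon \in \mathbb{R}_{>0}$ and suppose that $f|[a, b] \cap [t - \epsilon, t + \epsilon]$ is absolutely continuous for every $t \in [a, b]$. There exist $t_1, \dots, t_k \in [a, b]$ such that $[a, b] \subset \bigcup_{j=1}^k [t_j - \epsilon, t_j + \epsilon]$. For $j \in \{1, \dots, k\}$ let $g_j \in L^1([a, b] \cap [t_j - \epsilon, t_j + \epsilon]; \mathbb{R})$ be such that
\[ (f|[a, b] \cap [t_j - \epsilon, t_j + \epsilon])(t) = f(a_j) + \int_{a_j}^t g_j(\tau)\, d\tau, \]
where $t \in [a, b] \cap [t_j - \epsilon, t_j + \epsilon]$ and where $a_j = \inf([a, b] \cap [t_j - \epsilon, t_j + \epsilon])$. Then define $g\colon [a, b] \to \mathbb{R}$ by asking that $g(t) = g_j(t)$ if $t \in [a, b] \cap [t_j - \epsilon, t_j + \epsilon]$. Then one checks that $g \in L^1([a, b]; \mathbb{R})$ and
\[ f(t) = f(a) + \int_a^t g(\tau)\, d\tau, \]
giving the absolute continuity of $f$.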

3.2.2 Absolutely continuous curves on smooth manifolds

With the above comments about absolute continuity at hand, we can make the following definition.

3.2.2 Definition (Absolutely continuous curve) Let $T \subset \mathbb{R}$ be an interval and let $M$ be a $C^\infty$-manifold. A curve $\gamma\colon T \to M$ is (locally) absolutely continuous if $f \circ \gamma\colon T \to \mathbb{R}$ is (locally) absolutely continuous for every $f \in C^\infty(M)$.

We can concretely characterise absolute continuity of curves as follows.

3.2.3 Proposition (Characterisation of locally absolutely continuous curves) Let $T$ be an interval and let $M$ be a $C^\infty$-manifold. A curve $\gamma\colon T \to M$ is (locally) absolutely continuous if and only if, for every chart $(U, \phi)$ for $M$ with coordinates $(x^1, \dots, x^n)$, the functions $\phi^j \circ \gamma$, $j \in \{1, \dots, n\}$, are (locally) absolutely continuous when restricted to a subinterval $T' \subset T$ such that $\gamma(T') \subset U$, where $\phi^j\colon U \to \mathbb{R}$ is the $j$th coordinate function.
Proof. First suppose that $\gamma$ is (locally) absolutely continuous and let $(U, \phi)$ be a chart with coordinates $(x^1, \dots, x^n)$. Let $t_0 \in T$ be such that $\gamma(t_0) \in U$. Let $r \in \mathbb{R}_{>0}$ be such that $B^n(r, \phi(\gamma(t_0))) \subset \phi(U)$. Let $f \in C^\infty(M)$ be such that $f(x) = 1$ for $x \in \phi^{-1}(B^n(r, \phi(\gamma(t_0))))$. Then $f \circ \gamma$ is (locally) absolutely continuous, and so in particular continuous. Thus there exists $\delta \in \mathbb{R}_{>0}$ such that $\phi \circ \gamma(t) \in B^n(r, \phi(\gamma(t_0)))$ for all $t \in [t_0 - \delta, t_0 + \delta]$. Now let $\phi^j_r \in C^\infty(M)$ be a function agreeing with the $j$th coordinate function $\phi^j$ on $\phi^{-1}(B^n(r, \phi(\gamma(t_0))))$. (Local) absolute continuity of $\gamma$ gives (local) absolute continuity of $\phi^j_r \circ \gamma$. Then Lemma 3.2.1 gives (local) absolute continuity of $\phi^j \circ \gamma$ restricted to $[t_0 - \delta, t_0 + \delta]$. Since this can be done for every $t_0$ for which $\gamma(t_0) \in U$, another application of Lemma 3.2.1 gives (local) absolute continuity of $\phi^j \circ \gamma$ when restricted to any connected interval $T'$ containing $t_0$ and for which $\gamma(T') \subset U$.

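To prove the converse, let $[a, b] \subset T$ be a compact subinterval. For $t \in [a, b]$ let $(U_t, \phi_t)$ be a chart for $M$ about $\gamma(t)$ such that $\phi_t(\gamma(t)) = 0$. Let $r_t \in \mathbb{R}_{>0}$ be such that $B^n(r_t, 0) \subset \phi_t(U_t)$, such a positive number existing since the coordinate functions are absolutely continuous, and so continuous, functions of $t$. Since $\gamma([a, b])$ is compact (by continuity of $\gamma$, which follows from continuity of $\phi^j \circ \gamma$ for all coordinate charts), there exist $t_1, \dots, t_k \in [a, b]$ such that
\[ \gamma([a, b]) \subset \bigcup_{j=1}^k \phi_{t_j}^{-1}(B^n(\tfrac{1}{2} r_{t_j}, 0)) \subset \bigcup_{j=1}^k \phi_{t_j}^{-1}(B^n(r_{t_j}, 0)). \]
Let $\delta \in \mathbb{R}_{>0}$ be such that, for each $t \in [a, b]$, $\gamma([t - \delta, t + \delta]) \subset \phi_{t_j}^{-1}(B^n(r_{t_j}, 0))$ for some $j \in \{1, \dots, k\}$. Let $t \in [a, b]$ and let $j \in \{1, \dots, k\}$ be such that $\gamma([t - \delta, t + \delta]) \subset \phi_{t_j}^{-1}(B^n(r_{t_j}, 0))$. Now let $f \in C^\infty(M)$. By the Mean Value Theorem there exists $C \in \mathbb{R}_{>0}$ such that
\[ |f \circ \phi_{t_j}^{-1}(x_1) - f \circ \phi_{t_j}^{-1}(x_2)| \le C \|x_1 - x_2\| \]
for all $x_1, x_2 \in B^n(r_{t_j}, 0)$. If $t_1, t_2 \in [t - \delta, t + \delta]$ we compute
\[
\begin{aligned}
|f \circ \gamma(t_1) - f \circ \gamma(t_2)|
&= |f \circ \phi_{t_j}^{-1} \circ \phi_{t_j} \circ \gamma(t_1) - f \circ \phi_{t_j}^{-1} \circ \phi_{t_j} \circ \gamma(t_2)| \\
&\le C \sum_{l=1}^n |\phi_{t_j}^l \circ \gamma(t_1) - \phi_{t_j}^l \circ \gamma(t_2)|,
\end{aligned}
\]
using (1.1). Now let $\epsilon \in \mathbb{R}_{>0}$. By absolute continuity of $\phi_{t_j}^l \circ \gamma$, $l \in \{1, \dots, n\}$, and by Lemma 3.2.1, there exists $\delta' \in \mathbb{R}_{>0}$ such that, for any finite family $((a_i, b_i))_{i \in \{1, \dots, s\}}$ of disjoint subintervals in $[t - \delta, t + \delta]$ satisfying
\[ \sum_{i=1}^s |b_i - a_i| < \delta', \]
it holds that
\[ \sum_{i=1}^s |\phi_{t_j}^l \circ \gamma(b_i) - \phi_{t_j}^l \circ \gamma(a_i)| < \frac{\epsilon}{Cn} \]
for each $l \in \{1, \dots, n\}$. Then
\[ \sum_{i=1}^s |f \circ \gamma(b_i) - f \circ \gamma(a_i)| \le C \sum_{i=1}^s \sum_{l=1}^n |\phi_{t_j}^l \circ \gamma(b_i) - \phi_{t_j}^l \circ \gamma(a_i)| < \epsilon, \]
giving absolute continuity of $f \circ \gamma$ on $[t - \delta, t + \delta]$. The proposition now follows from Lemma 3.2.1.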

This gives the following important property of locally absolutely continuous curves.

3.2.4 Corollary (Locally absolutely continuous curves are almost everywhere differentiable) If $M$ is a smooth manifold, if $T \subset \mathbb{R}$ is an interval, and if $\gamma\colon T \to M$ is locally absolutely continuous, then $\gamma$ is differentiable for almost every $t \in T$.


Proof. Let $t \in T$ and let $(U, \phi)$ be a chart about $\gamma(t)$. Since $\gamma$ is continuous by Proposition 3.2.3, there exists an interval $T_t \subset T$ such that $\gamma(t') \in U$ for every $t' \in T_t$. By Proposition 3.2.3 and properties of $\mathbb{R}$-valued absolutely continuous functions, it follows that $\gamma$ is differentiable for almost every $t' \in T_t$. Note that $(T_t)_{t \in T}$ covers $T$. Thus there exists a countable set $(t_j)_{j \in \mathbb{Z}_{>0}}$ such that $(T_{t_j})_{j \in \mathbb{Z}_{>0}}$ covers $T$. For each $j \in \mathbb{Z}_{>0}$, the set of points in $T_{t_j}$ at which $\gamma$ is not differentiable has measure zero. The result follows since a countable union of sets of measure zero has measure zero.

3.3 Flows for time-dependent vector fields

In this section we combine the definitions from the preceding two sections to talk about integral curves and flows of time-dependent vector fields. We do this by first providing the standard existence and uniqueness theorems for time-dependent vector fields defined on open subsets of Euclidean space. Then we apply these results to give existence and uniqueness results for time-dependent vector fields on manifolds. After we understand integral curves, we can define flows for time-dependent vector fields by understanding how integral curves depend on initial condition. We also characterise dependence on parameters.

3.3.1 Integral curves: local existence and uniqueness

We begin by defining what we mean by an integral curve of a time-dependent vector field, where by time-dependent vector field we mean the sort of time-dependence considered in Section 3.1. We also consider the matters of existence and uniqueness of integral curves for these sorts of vector fields. Let us first make the definition.

3.3.1 Definition (Integral curve) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, let $M$ be a $C^\infty$- or $C^\omega$-manifold, as is required, let $T \subset \mathbb{R}$ be an interval, and let $X \in \mathrm{LI}^r(T, TM)$. An integral curve for $X$ is a locally absolutely continuous curve $\gamma\colon T \to M$ such that $\gamma'(t) = X(t, \gamma(t))$ for almost every $t \in T$.

We next characterise existence and uniqueness of integral curves. To do this we first consider the case of vector fields defined on open subsets of $\mathbb{R}^n$. We first consider the existence problem.

3.3.2 Theorem (Local existence of integral curves for continuous vector fields) Let $U \subset \mathbb{R}^n$ be open, let $T \subset \mathbb{R}$ be an interval, and let $X \in \mathrm{LI}^0(T, U)$. Then, for each $(t_0, x_0) \in T \times U$, there exists a subinterval $T' \subset T$, relatively open in $T$ and with $t_0 \in \mathrm{int}_T(T')$, and an integral curve $\xi\colon T' \to U$ such that $\xi(t_0) = x_0$.
Proof. Let $\boldsymbol{X}\colon T \times U \to \mathbb{R}^n$ be the principal part of $X$. By Proposition 3.1.3, the components $X^1, \dots, X^n$ of $\boldsymbol{X}$ are in $\mathrm{LIC}^0(T, U)$. Let us prove a lemma.

1 Lemma For a continuous map $\xi\colon T \to U$ the function $t \mapsto \boldsymbol{X}(t, \xi(t))$ is locally integrable.

Proof. First of all, let us show that $t \mapsto \boldsymbol{X}(t, \xi(t))$ is measurable. It suffices to prove this when $T$ is compact, so we make this assumption. Since $\xi$ is continuous, there exists a sequence $(\xi_j)_{j \in \mathbb{Z}_{>0}}$ of piecewise constant functions converging uniformly to $\xi$. (Indeed, uniform limits of piecewise constant functions are known as regulated functions, and regulated functions include continuous functions [Bourbaki 2004, Theorem II.1.2].) That is, for each $j \in \mathbb{Z}_{>0}$ there exists a partition $(T_{j,1}, \dots, T_{j,k_j})$ of $T$ such that $\xi_j(t) = x_{j,l}$ for some $x_{j,l} \in \mathbb{R}^n$ when $t \in T_{j,l}$, $l \in \{1, \dots, k_j\}$. Then
\[ \boldsymbol{X}(t, \xi_j(t)) = \sum_{l=1}^{k_j} \boldsymbol{X}(t, x_{j,l}) \chi_{T_{j,l}}(t), \]
and so $t \mapsto \boldsymbol{X}(t, \xi_j(t))$ is measurable. Now, by continuity of $x \mapsto \boldsymbol{X}(t, x)$,
\[ \lim_{j \to \infty} \boldsymbol{X}(t, \xi_j(t)) = \boldsymbol{X}(t, \xi(t)), \]
and measurability of $t \mapsto \boldsymbol{X}(t, \xi(t))$ follows since the pointwise limit of measurable functions is measurable [Cohn 1980, Proposition 2.1.4].

Now let $t \in T$ and suppose that $t > t_0$. Then, by continuity of $\xi$, there exists a compact set $K \subset U$ such that $\xi(s) \in K$ for every $s \in [t_0, t]$. Since $X^1, \dots, X^n \in \mathrm{LIC}^0(T, U)$ there exist $g_1, \dots, g_n \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that $|X^j(t, x)| \le g_j(t)$ for every $(t, x) \in T \times K$. Then, defining $g \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ by
\[ g(t) = \sum_{j=1}^n g_j(t), \]
we have
\[ \|\boldsymbol{X}(t, x)\| \le \sum_{j=1}^n g_j(t) = g(t), \]
using (1.1). Therefore,
\[ \int_{t_0}^t \|\boldsymbol{X}(s, \xi(s))\|\, ds \le \int_{t_0}^t g(s)\, ds < \infty. \]

The same statement holds if $t < t_0$, flipping the limits of integration, and this gives the desired local integrability.

Let $r \in \mathbb{R}_{>0}$ be chosen so that $B^n(r, x_0) \subset U$. Since $X^j \in \mathrm{LIC}^0(T, U)$ there exists $g_j \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that $|X^j(t, x)| \le g_j(t)$ for every $(t, x) \in T \times B^n(r, x_0)$. Then, if we define $g \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ by
\[ g(t) = \sum_{j=1}^n g_j(t), \]
we have
\[ \|\boldsymbol{X}(t, x)\| \le \sum_{j=1}^n g_j(t) = g(t), \qquad (t, x) \in T \times B^n(r, x_0), \]
as in the proof of the lemma above. Then, since $g$ is locally integrable, the function $G_+\colon [t_0, \infty) \cap T \to \mathbb{R}$ defined by
\[ G_+(t) = \int_{t_0}^t g(s)\, ds \tag{3.1} \]


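is continuous. Let us suppose that $t_0 \ne \sup T$ so that there exists $b \in \mathbb{R}_{>0}$ such that $[t_0, t_0 + b] \subset T$. Thus, since $g$ is nonnegative, there exists $T_+ \in \mathbb{R}_{>0}$ such that $[t_0, t_0 + T_+] \subset T$ and such that
\[ G_+(t) = \int_{t_0}^t g(s)\, ds < r, \qquad t \in [t_0, t_0 + T_+]. \]
For the remainder of the proof, we consider $r$ and $T_+$ to be chosen as above. Let $C^0([t_0, t_0 + T_+]; \mathbb{R}^n)$ be the Banach space of continuous $\mathbb{R}^n$-valued functions on $[t_0, t_0 + T_+]$ equipped with the norm
\[ \|\xi\|_\infty = \sup\{ \|\xi(t)\| \mid t \in [t_0, t_0 + T_+] \} \]
(see [Hewitt and Stromberg 1975, Theorem 7.9]). Let $\xi_0 \in C^0([t_0, t_0 + T_+]; \mathbb{R}^n)$ be defined by $\xi_0(t) = x_0$. Let $\overline{B}_+(r, \xi_0)$ be the closed ball of radius $r$ and centre $\xi_0$ in $C^0([t_0, t_0 + T_+]; \mathbb{R}^n)$. For $\alpha \in (0, T_+]$ let us define $\xi_\alpha \in \overline{B}_+(r, \xi_0)$ by
\[ \xi_\alpha(t) = \begin{cases} x_0, & t \in [t_0, t_0 + \alpha], \\ x_0 + \int_{t_0}^t \boldsymbol{X}(s, \xi_\alpha(s - \alpha))\, ds, & t \in (t_0 + \alpha, t_0 + T_+]. \end{cases} \]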

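It is not clear that this definition makes sense, so let us verify that it does. We fix $\alpha \in (0, T_+]$. If $t \in [t_0, t_0 + \alpha]$ the meaning of $\xi_\alpha(t)$ is unambiguous. If $t \in (t_0 + \alpha, t_0 + 2\alpha] \cap [t_0, t_0 + T_+]$ then $\xi_\alpha(t)$ is determined from the already known value of $\xi_\alpha$ on $[t_0, t_0 + \alpha]$. Similarly, if $t \in (t_0 + 2\alpha, t_0 + 3\alpha] \cap [t_0, t_0 + T_+]$ then $\xi_\alpha(t)$ is determined from the already known value of $\xi_\alpha$ on $[t_0, t_0 + 2\alpha]$. In a finite number of such steps, one determines $\xi_\alpha$ on $[t_0, t_0 + T_+]$.

We now show that $\xi_\alpha \in \overline{B}_+(r, \xi_0)$. If $t \in [t_0, t_0 + \alpha]$ then $\|\xi_\alpha(t) - x_0\| = 0$. If $t \in (t_0 + \alpha, t_0 + 2\alpha]$ then
\[
\begin{aligned}
\|\xi_\alpha(t) - x_0\|
&= \Big\| \int_{t_0}^{t_0 + \alpha} 0\, ds + \int_{t_0 + \alpha}^t \boldsymbol{X}(s, x_0)\, ds \Big\| \\
&\le \int_{t_0}^{t_0 + \alpha} 0\, ds + \int_{t_0 + \alpha}^t \|\boldsymbol{X}(s, x_0)\|\, ds \le \int_{t_0}^t g(s)\, ds < r.
\end{aligned}
\]
By induction, if $t \in (t_0 + (k - 1)\alpha, t_0 + k\alpha]$ then
\[ \|\xi_\alpha(t) - x_0\| \le \sum_{j=0}^{k-2} \int_{t_0 + j\alpha}^{t_0 + (j+1)\alpha} g(s)\, ds + \int_{t_0 + (k-1)\alpha}^t g(s)\, ds \le r, \]
giving $\xi_\alpha \in \overline{B}_+(r, \xi_0)$, as desired.

We claim that the family $(\xi_\alpha)_{\alpha \in (0, T_+]}$ is equicontinuous, i.e., for each $\epsilon \in \mathbb{R}_{>0}$ there exists $\delta \in \mathbb{R}_{>0}$ such that
\[ |t_1 - t_2| < \delta \implies \|\xi_\alpha(t_1) - \xi_\alpha(t_2)\| < \epsilon \]
for all $\alpha \in (0, T_+]$. So let $\epsilon \in \mathbb{R}_{>0}$ and note that the function $G_+\colon [t_0, t_0 + T_+] \to \mathbb{R}$ defined by (3.1) is continuous, and so uniformly continuous, its domain being compact. Therefore, there exists $\delta \in \mathbb{R}_{>0}$ such that
\[ |t_1 - t_2| < \delta \implies |G_+(t_1) - G_+(t_2)| < \epsilon. \]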

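Let $\delta$ be so chosen. Then, if $|t_1 - t_2| < \delta$ with $t_1 > t_2$,
\[
\begin{aligned}
\|\xi_\alpha(t_1) - \xi_\alpha(t_2)\|
&= \Big\| \int_{t_0}^{t_1} \boldsymbol{X}(s, \xi_\alpha(s - \alpha))\, ds - \int_{t_0}^{t_2} \boldsymbol{X}(s, \xi_\alpha(s - \alpha))\, ds \Big\| \\
&\le \int_{t_2}^{t_1} \|\boldsymbol{X}(s, \xi_\alpha(s - \alpha))\|\, ds \le \int_{t_2}^{t_1} g(s)\, ds = G_+(t_1) - G_+(t_2) < \epsilon,
\end{aligned}
\]
as desired.

Thus we have an equicontinuous family $(\xi_\alpha)_{\alpha \in (0, T_+]}$ contained in the bounded set $\overline{B}_+(r, \xi_0)$. Consider then the sequence $(\xi_{T_+/j})_{j \in \mathbb{Z}_{>0}}$ contained in this family. By the Arzelà–Ascoli Theorem [Abraham, Marsden, and Ratiu 1988, Theorem 1.5.12] and the Bolzano–Weierstrass Theorem [Abraham, Marsden, and Ratiu 1988, Theorem 1.5.4] there exists an increasing sequence $(j_k)_{k \in \mathbb{Z}_{>0}}$ such that the sequence $(\xi_{T_+/j_k})_{k \in \mathbb{Z}_{>0}}$ converges in $C^0([t_0, t_0 + T_+]; \mathbb{R}^n)$, i.e., converges uniformly. Let us denote the limit by $\xi_+ \in \overline{B}_+(r, \xi_0)$. It remains to show that $\xi_+$ is an integral curve for $X$ satisfying $\xi_+(t_0) = x_0$. For this, an application of the Dominated Convergence Theorem [Cohn 1980, Theorem 2.4.4], continuity of $\boldsymbol{X}$ in the second argument, and equicontinuity of $(\xi_\alpha)_{\alpha \in (0, T_+]}$ gives
\[
\begin{aligned}
\xi_+(t) = \lim_{k \to \infty} \xi_{T_+/j_k}(t)
&= x_0 + \lim_{k \to \infty} \int_{t_0}^t \boldsymbol{X}(s, \xi_{T_+/j_k}(s - T_+/j_k))\, ds \\
&= x_0 + \int_{t_0}^t \boldsymbol{X}\big(s, \lim_{\alpha \to 0} \xi_\alpha(s - \alpha)\big)\, ds = x_0 + \int_{t_0}^t \boldsymbol{X}(s, \xi_+(s))\, ds.
\end{aligned}
\]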

Therefore, by the lemma above, $\xi_+$ is absolutely continuous and $\xi_+'(t) = \boldsymbol{X}(t, \xi_+(t))$ for almost every $t \in [t_0, t_0 + T_+]$. Thus $\xi_+$ is an integral curve. Obviously $\xi_+(t_0) = x_0$.

Next suppose that $t_0 \ne \inf T$. Then there exists $a \in \mathbb{R}_{>0}$ such that $[t_0 - a, t_0] \subset T$. As above, we let $r \in \mathbb{R}_{>0}$ be such that $B^n(r, x_0) \subset U$. Define $G_-\colon (-\infty, t_0] \cap T \to \mathbb{R}$ by
\[ G_-(t) = \int_t^{t_0} g(s)\, ds \]
so that $G_-$ is continuous. Since $g$ is nonnegative, there exists $T_- \in \mathbb{R}_{>0}$ such that $[t_0 - T_-, t_0] \subset T$ and such that
\[ G_-(t) = \int_t^{t_0} g(s)\, ds < r, \qquad t \in [t_0 - T_-, t_0]. \]
Now, with $r$ and $T_-$ thusly defined, we can proceed as above to show the existence of an integral curve $\xi_-\colon [t_0 - T_-, t_0] \to U$ for $X$ such that $\xi_-(t_0) = x_0$.

The proof of the theorem is complete if we define $T'$ and $\xi$ as follows.
1. $\mathrm{int}(T) = \emptyset$: The interval $T' = \{t_0\}$ and the trivial integral curve $\xi_0(t) = x_0$ satisfy the conclusions of the theorem.
2. $t_0 \ne \sup T$ and $t_0 = \inf T$: The interval $T' = [t_0, t_0 + T_+)$ and the integral curve $\xi = \xi_+|T'$, with $\xi_+$ as defined above, satisfy the conclusions of the theorem.
3. $t_0 = \sup T$ and $t_0 \ne \inf T$: The interval $T' = (t_0 - T_-, t_0]$ and the integral curve $\xi = \xi_-|T'$, with $\xi_-$ as defined above, satisfy the conclusions of the theorem.
4. $t_0 \ne \sup T$ and $t_0 \ne \inf T$: The interval $T' = (t_0 - T_-, t_0 + T_+)$ and the integral curve
\[ \xi(t) = \begin{cases} \xi_-(t), & t \in (t_0 - T_-, t_0], \\ \xi_+(t), & t \in (t_0, t_0 + T_+) \end{cases} \]
satisfy the conclusions of the theorem.
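The delayed-argument curves used in the existence proof can be imitated numerically. The following is a minimal sketch, not part of the original argument: the function name `delayed_approximation`, the uniform grid, and the left Riemann sum are all assumptions made for illustration, and the integral is started at $t_0 + \alpha$ so that the delayed argument $s - \alpha$ never falls below $t_0$. For the test field $X(t, x) = x$ with $x_0 = 1$, the approximations should approach $e^t$ as the delay shrinks.

```python
import math


def delayed_approximation(X, t0, x0, T_plus, alpha, steps=2000):
    """Approximate the delayed curve xi_alpha on [t0, t0 + T_plus].

    xi_alpha(t) = x0                                   for t <= t0 + alpha,
    xi_alpha(t) = x0 + int X(s, xi_alpha(s - alpha)) ds  for t >  t0 + alpha,
    with the integral taken from t0 + alpha to t and replaced by a left
    Riemann sum on a uniform grid (an assumption of this sketch).
    """
    h = T_plus / steps
    lag = max(1, round(alpha / h))  # the delay, measured in grid steps
    xi = [x0] * (steps + 1)
    for i in range(lag, steps):
        s = t0 + i * h
        # The integrand only uses values of xi that are already known.
        xi[i + 1] = xi[i] + h * X(s, xi[i - lag])
    return xi


# Hypothetical test problem: X(t, x) = x, x(0) = 1, exact solution exp(t).
delays = (0.1, 0.01, 0.001)
errors = [
    abs(delayed_approximation(lambda t, x: x, 0.0, 1.0, 1.0, a)[-1] - math.e)
    for a in delays
]
```

In this sketch each value of the curve is computed from values at earlier times only, which mirrors the step-by-step construction in the proof. The compactness argument (Arzelà–Ascoli) is what replaces the naive expectation that the error simply shrinks with the delay.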

Note that the conditions of the above theorem are not sufficient to ensure uniqueness of integral curves, as the following standard example shows.

3.3.3 Example (Integral curves may not be unique) We consider the vector field $X$ on $\mathbb{R}$ defined by $X(x) = x^{1/3} \frac{\partial}{\partial x}$. To see that integral curves are not unique, consider the differential equation for integral curves passing through $x = 0$ at time $t = 0$:
\[ \dot{x}(t) = x^{1/3}(t), \qquad x(0) = 0. \]
By elementary integration this has the solution $x_0(t) = (\tfrac{2}{3})^{3/2}\, t \sqrt{|t|}$. However, $x_1(t) = 0$ is also clearly a solution. Indeed, there is a family of solutions of the form
\[ x(t) = \begin{cases} x_0(t + t_-), & t \in (-\infty, -t_-], \\ 0, & t \in (-t_-, t_+), \\ x_0(t - t_+), & t \in [t_+, \infty), \end{cases} \]
where $t_-, t_+ \in \mathbb{R}_{>0}$.
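The non-uniqueness can be checked numerically. The sketch below is illustrative only: the helper names (`cube_root`, `x_branch`, `residual`) and the finite-difference derivative are assumptions, and the check is restricted to $t \ge 0$, where the sign of the real cube root causes no difficulty (negative times need separate care). It compares the zero solution with a solution that rests at $0$ until a hypothetical departure time $t_+ = 0.25$ and then follows a translate of $x_0$.

```python
def cube_root(x):
    """Real cube root, defined for negative arguments as well."""
    return x ** (1.0 / 3.0) if x >= 0 else -((-x) ** (1.0 / 3.0))


C = (2.0 / 3.0) ** 1.5  # the constant (2/3)^(3/2) appearing in x_0


def x_zero(t):
    """The trivial solution x_1(t) = 0."""
    return 0.0


def x_branch(t, t_plus=0.0):
    """Rest at 0 until t_plus, then leave along a translate of x_0."""
    return 0.0 if t <= t_plus else C * (t - t_plus) ** 1.5


def residual(x, t, h=1e-6):
    """|x'(t) - x(t)^(1/3)|, with x' estimated by a central difference."""
    derivative = (x(t + h) - x(t - h)) / (2.0 * h)
    return abs(derivative - cube_root(x(t)))


times = [0.5, 1.0, 2.0, 3.0]
residuals_zero = [residual(x_zero, t) for t in times]
residuals_branch = [residual(lambda s: x_branch(s, 0.25), t) for t in times]
```

Both curves pass through $x = 0$ at $t = 0$ and have small residuals at the sampled times, yet they differ for $t > 0.25$.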

Thus we need some additional hypotheses for uniqueness. These are provided in the following theorem.

3.3.4 Theorem (Local existence and uniqueness of integral curves for Lipschitz vector fields) Let $U \subset \mathbb{R}^n$ be open, let $T \subset \mathbb{R}$ be an interval, and let $X \in \mathrm{LIL}(T, U)$. Then, for each $(t_0, x_0) \in T \times U$, there exists a subinterval $T' \subset T$, relatively open in $T$ and with $t_0 \in \mathrm{int}_T(T')$, and an integral curve $\xi\colon T' \to U$ such that $\xi(t_0) = x_0$. Moreover, if $\tilde{T}'$ is another such interval and $\eta\colon \tilde{T}' \to U$ is another such integral curve, then $\eta(t) = \xi(t)$ for all $t \in \tilde{T}' \cap T'$.
Proof. Note that the existence statement in the theorem follows from Theorem 3.3.2 since $\mathrm{LIL}(T, TM) \subset \mathrm{LIC}^0(T, TM)$. However, we shall reprove this via an argument that also ensures uniqueness. Let $r \in \mathbb{R}_{>0}$ be such that $B^n(r, x_0) \subset U$. As in the proof of Theorem 3.3.2, there exists $g \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[ \|\boldsymbol{X}(t, x)\| \le \sum_{j=1}^n g_j(t) = g(t), \qquad (t, x) \in T \times B^n(r, x_0). \]

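Since $X$ is locally integrally Lipschitz, for each $j \in \{1, \dots, n\}$ there exists $L_j \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ such that
\[ |X^j(t, x) - X^j(t, y)| \le L_j(t) \|x - y\| \]
for all $t \in T$ and $x, y \in B^n(r, x_0)$. Then
\[ \|\boldsymbol{X}(t, x) - \boldsymbol{X}(t, y)\| \le \sum_{j=1}^n |X^j(t, x) - X^j(t, y)| \le \sum_{j=1}^n L_j(t) \|x - y\| = L(t) \|x - y\|, \]
if we define $L \in L^1_{\mathrm{loc}}(T; \mathbb{R}_{\ge 0})$ by
\[ L(t) = \sum_{j=1}^n L_j(t). \]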

Let us choose (0, 1). We rst consider the case where t0 sup T so that there exists b R>0 such that [t0 , t0 + b] T. Dene G+ , + : [t0 , ) T R by
t t

G+ (t) =

g(s) ds,
t0

+ (t)

=
t0

L(s) ds.

Since g and L are nonnegative and continuous, we can choose T+ R>0 such that
t t

G+ (t) =

g(s) ds r,
t0

+ (t)

=
t0

L(s) ds <

2r

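for $t \in [t_0, t_0 + T_+]$. As in the proof of Theorem 3.3.2, let $\xi_0$ be the trivial curve $t \mapsto x_0$, $t \in [t_0, t_0 + T_+]$, and let $\overline{B}_+(r, \xi_0)$ be the closed ball of radius $r$ and centre $\xi_0$ in $C^0([t_0, t_0 + T_+]; \mathbb{R}^n)$. Define $F_+\colon \overline{B}_+(r, \xi_0) \to C^0([t_0, t_0 + T_+]; \mathbb{R}^n)$ by
\[ F_+(\xi)(t) = x_0 + \int_{t_0}^t \boldsymbol{X}(s, \xi(s))\, ds. \]
By the lemma from the proof of Theorem 3.3.2, $s \mapsto \boldsymbol{X}(s, \xi(s))$ is locally integrable, showing that $F_+$ is well-defined and that $F_+(\xi)$ is absolutely continuous.

We claim that $F_+(\overline{B}_+(r, \xi_0)) \subset \overline{B}_+(r, \xi_0)$. Suppose that $\xi \in \overline{B}_+(r, \xi_0)$ so that
\[ \|\xi(t) - x_0\| \le r, \qquad t \in [t_0, t_0 + T_+]. \]
Then, for $t \in [t_0, t_0 + T_+]$,
\[ \|F_+(\xi)(t) - x_0\| = \Big\| \int_{t_0}^t \boldsymbol{X}(s, \xi(s))\, ds \Big\| \le \int_{t_0}^t \|\boldsymbol{X}(s, \xi(s))\|\, ds \le \int_{t_0}^t g(s)\, ds \le r, \]
as desired.

We claim that $F_+|\overline{B}_+(r, \xi_0)$ is a contraction mapping. That is, we claim that there exists $\lambda \in [0, 1)$ such that
\[ \|F_+(\xi) - F_+(\eta)\|_\infty \le \lambda \|\xi - \eta\|_\infty \]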

for every , B+ (r, 0 ). Indeed, let , B+ (r, 0 ) and compute, for t [t0 , t0 + T+ ],
t t

F+ ()(t) F+ ()(t) =

X (s, (s)) ds
t0 t t0 t t0

X (s, (s))

X (s, (s)) X (s, (s)) ds L(s) (s) (s) ds 2r + (t) ,

t0

since (s), (s) Bn (r, x0 ) for every s [t0 , t0 + T+ ]. This proves that F+ |B+ (r, 0 ) is a contraction mapping. By the Contraction Mapping Theorem [Abraham, Marsden, and Ratiu 1988, Theorem 1.2.6] there exists a unique xed point for F+ which we denote by + . Thus
t

+ (t) = F+ (+ )(t) =

t0

X (s, + (s)) ds.

Dierentiating the rst and last expressions with respect to t shows that + is an integral curve for X. Now we consider the case when t0 inf T so there exists a R>0 such that [t0 a, t0 ] T. We proceed as above, cf. the corresponding part of the proof of Theorem 3.3.2, to provide T R>0 such that
t0

G (t)
t

g(s) ds < r,

t0 (t) t

L(s) ds <

for t [t0 T , t0 ]. We then dene B (r, 0 ) as the ball of radius r and centre 0 in C0 ([t0 T , t0 ]; Rn ) and dene F : B (r, x0 ) C0 ([t0 T , t0 ]; Rn ) by
t

F ()(t) = x0 +
t0

X (s, (s)) ds.

We show, as above, that F (B (r, 0 )) B (r, 0 ) and that F |B (r, 0 ) is a contraction mapping, so possessing a unique xed point . This xed point is an integral curve, as above. We can then dene an interval T and an integral curve as at the end of the proof of Theorem 3.3.2. We now prove uniqueness of this integral curve on T . Suppose that : T U is another integral curve satisfying (t0 ) = x0 . Then (t) = X (t, (t)), tT.

Therefore, by the Fundamental Theorem of Calculus,


t t

(t) = (t0 ) +
t0

(s) ds = x0 +
t0 t

X (s, (s))

for t t0 and (t) = (t0 ) +

(s) ds = x0 +
t0 t0

X (s, (s))


for $t \le t_0$. It then follows that $\zeta|[t_0, \infty) \cap T'$ is a fixed point for $F_+$ and $\zeta|(-\infty, t_0] \cap T'$ is a fixed point for $F_-$. Therefore, $\zeta$ agrees with $\xi$ on $T'$ by the uniqueness part of the Contraction Mapping Theorem.

Now suppose that $\tilde{T}' \subset \mathbb{R}$ is some other interval containing $t_0$ and that $\eta\colon \tilde{T}' \to U$ is an integral curve satisfying $\eta(t_0) = x_0$. Suppose that $\eta(t) \ne \xi(t)$ for some $t \in T' \cap \tilde{T}'$. Suppose first that $t > t_0$. Let
\[ t_1 = \inf\{ t \in [t_0, \infty) \cap T' \cap \tilde{T}' \mid \eta(t) \ne \xi(t) \}. \]
Then $\eta(t) = \xi(t)$ for $t \in [t_0, t_1)$. Continuity of integral curves implies that $\eta(t_1) = \xi(t_1)$. Denote $x_1 = \xi(t_1)$. Note that both $\xi$ and $\eta$ are integral curves for $X$ satisfying $\xi(t_1) = \eta(t_1) = x_1$. By our above arguments for existence and uniqueness, there exists $T_+ \in \mathbb{R}_{>0}$ and a unique integral curve $\tilde{\xi}$ on $[t_1, t_1 + T_+]$ satisfying $\tilde{\xi}(t_1) = x_1$. Thus $\xi$ and $\eta$ must agree with $\tilde{\xi}$ on $[t_1, t_1 + T_+]$, contradicting the definition of $t_1$. A similar argument leads to a similar contradiction when we assume that $\xi$ and $\eta$ disagree at some $t \in T' \cap \tilde{T}'$ with $t < t_0$.

In particular, the following result is one that is most often useful.

3.3.5 Corollary (Local existence and uniqueness of integral curves for differentiable vector fields) Let $U \subset \mathbb{R}^n$ be open, let $T \subset \mathbb{R}$ be an interval, and let $X \in \mathrm{LI}^r(T, U)$, $r \in \mathbb{Z}_{>0} \cup \{\infty\}$. Then, for each $(t_0, x_0) \in T \times U$, there exists a subinterval $T' \subset T$, relatively open in $T$ and with $t_0 \in \mathrm{int}_T(T')$, and an integral curve $\xi\colon T' \to U$ such that $\xi(t_0) = x_0$. Moreover, if $\tilde{T}'$ is another such interval and $\eta\colon \tilde{T}' \to U$ is another such integral curve, then $\eta(t) = \xi(t)$ for all $t \in \tilde{T}' \cap T'$.
Proof This follows from Theorem 3.3.4 and Proposition 3.1.12.
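The fixed-point argument behind Theorem 3.3.4 suggests an iteration that can be tried numerically. The following is a rough sketch under stated assumptions rather than the construction of the proof: the name `picard_iterate`, the uniform grid, and the trapezoidal rule are illustrative choices, and the test field $X(t, x) = x$ on $[0, 0.5]$ is picked because its Lipschitz function $L(t) = 1$ integrates to $0.5 < 1$ there, so the map $\xi \mapsto F_+(\xi)$ is a contraction in the sup norm.

```python
import math


def picard_iterate(X, t0, x0, T_plus, iterations=25, steps=1000):
    """Iterate xi -> F(xi), where F(xi)(t) = x0 + int_{t0}^{t} X(s, xi(s)) ds.

    The iteration starts from the constant curve xi_0(t) = x0 and works on a
    uniform grid; integrals are approximated by the trapezoidal rule (an
    assumption of this sketch, not something the proof relies on).
    """
    h = T_plus / steps
    grid = [t0 + i * h for i in range(steps + 1)]
    xi = [x0] * (steps + 1)
    for _ in range(iterations):
        values = [X(s, x) for s, x in zip(grid, xi)]
        new = [x0]
        for i in range(steps):
            new.append(new[-1] + 0.5 * h * (values[i] + values[i + 1]))
        xi = new
    return grid, xi


# Hypothetical test problem: X(t, x) = x, x(0) = 1, exact solution exp(t).
grid, xi = picard_iterate(lambda t, x: x, 0.0, 1.0, 0.5)
picard_error = max(abs(x - math.exp(t)) for t, x in zip(grid, xi))
```

Unlike the delayed-argument construction used for merely continuous dependence on state, this iteration converges to a single limit, which reflects the uniqueness statement.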

Chapter 4 Set-valued analysis on manifolds


One of the models we shall use for systems is a differential inclusion model. The framework for this sort of model requires some terminology and results from the theory of set-valued analysis. This sort of analysis is developed in several places, e.g., [Aubin and Cellina 1984, Aubin and Frankowska 1990, Filippov 1988, Smirnov 2002]. This sort of theory is widely used in certain areas of economics, and we refer to [Klein and Thompson 1984] for a treatment geared towards economic applications. Most of the results in the area are developed in the framework of Euclidean space and, occasionally, metric spaces. We wish to develop this theory systematically on manifolds. To do so we will sometimes use the metric structure on a manifold induced by a Riemannian metric. This requires, at least on aesthetic grounds, that we develop a framework where all definitions and theorems do not depend on a choice of Riemannian metric. We develop the tools for establishing these sorts of results in Section 4.1.

4.1 Riemannian manifolds as metric spaces


In this section we recall the metric structure of a Riemannian manifold. We also prove that two Riemannian metrics, when restricted to compact subsets, give rise to equivalent metric structures. This fact is essential in our development of a metric-independent theory of set-valued analysis on manifolds.

4.1.1 Definition of the metric

We begin by defining the length of a curve on a smooth Riemannian manifold. A curve $\gamma\colon [a, b] \to M$ is piecewise differentiable if it is continuous and if there exists a partition $(I_1, \dots, I_k)$ of $[a, b]$ such that $\gamma|I_j$ is differentiable for $j \in \{1, \dots, k\}$.

4.1.1 Definition (Length of curve) Let $(M, G)$ be a smooth Riemannian manifold. The length of a piecewise differentiable curve $\gamma\colon [a, b] \to M$ is
\[ \ell_G(\gamma) = \int_a^b \sqrt{G(\gamma'(t), \gamma'(t))}\, dt. \]

The length of a curve depends only on its trace, and not on the specific parameterisation.

4.1.2 Lemma (Independence of length on parameterisation) Let $(M, G)$ be a smooth Riemannian manifold, let $\gamma\colon [a, b] \to M$ be a piecewise differentiable curve, and let $\sigma\colon [c, d] \to [a, b]$ be a reparameterisation of $\gamma$, i.e., $\sigma$ is a differentiable bijection for which $\sigma'(s) > 0$ for all $s \in [c, d]$. Then $\ell_G(\gamma) = \ell_G(\gamma \circ \sigma)$.
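Proof. We compute
\[
\begin{aligned}
\ell_G(\gamma \circ \sigma)
&= \int_c^d \sqrt{G((\gamma \circ \sigma)'(s), (\gamma \circ \sigma)'(s))}\, ds
= \int_c^d \sigma'(s) \sqrt{G(\gamma'(\sigma(s)), \gamma'(\sigma(s)))}\, ds \\
&= \int_a^b \sqrt{G(\gamma'(t), \gamma'(t))}\, dt = \ell_G(\gamma),
\end{aligned}
\]
using the change of variable formula for the integral.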
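The independence of length on parameterisation is easy to observe numerically in one dimension. The sketch below is only an illustration: the function `length`, the midpoint quadrature, and the particular reparameterisation $\sigma(s) = s + s^2$ from $[0, 1]$ onto $[0, 2]$ (which has $\sigma'(s) = 1 + 2s > 0$) are assumptions chosen for the example, and the metric $(1 + x^2)\, dx \otimes dx$ on $\mathbb{R}$ is used as a test case.

```python
import math


def length(curve, metric, a, b, steps=20000):
    """Approximate int_a^b sqrt(G(gamma'(t), gamma'(t))) dt for a curve in R.

    The metric is G = metric(x) dx (x) dx. Velocities are estimated by finite
    differences and the integrand is sampled at midpoints of a uniform grid.
    """
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t_left = a + i * h
        velocity = (curve(t_left + h) - curve(t_left)) / h
        midpoint = curve(t_left + 0.5 * h)
        total += math.sqrt(metric(midpoint) * velocity ** 2) * h
    return total


def metric(x):
    return 1.0 + x * x  # G = (1 + x^2) dx (x) dx


def gamma(t):
    return t  # gamma : [0, 2] -> R


def sigma(s):
    return s + s * s  # sigma : [0, 1] -> [0, 2], sigma'(s) = 1 + 2s > 0


length_gamma = length(gamma, metric, 0.0, 2.0)
length_reparam = length(lambda s: gamma(sigma(s)), metric, 0.0, 1.0)
# Closed form of int_0^2 sqrt(1 + x^2) dx, used only as a cross-check.
length_exact = math.sqrt(5.0) + 0.5 * math.asinh(2.0)
```

The two numerical lengths agree with each other and with the closed-form value up to quadrature error, as the lemma predicts.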

We can now state the definition of distance between two points on a Riemannian manifold, noting that we can always parameterise a curve defined on a compact interval $[a, b]$ so that the reparameterised curve is defined on $[0, 1]$.

4.1.3 Definition (Riemannian distance) For a smooth connected Riemannian manifold $(M, G)$ and for $x, y \in M$, the distance between $x$ and $y$ is
\[ d_G(x, y) = \inf\{ \ell_G(\gamma) \mid \gamma\colon [0, 1] \to M \text{ is a piecewise differentiable curve for which } \gamma(0) = x \text{ and } \gamma(1) = y \}. \]

4.1.4 Remark (Disconnected Riemannian manifolds as metric spaces) In our definition of the metric on a Riemannian manifold $(M, G)$, we asked that the manifold $M$ be connected so that there exists a curve connecting any two points, and so the definition of the distance function makes sense. If the manifold $M$ is disconnected, then one can extend the definition of the distance function to this case by asking that $d_G(x, y) = \infty$ whenever $x$ and $y$ lie in different connected components of $M$. This extension is not profound, and will not be of much interest to us. Thus we shall often assume that Riemannian manifolds are connected when we wish to view them as metric spaces.

We can now show that $d_G$ is, in fact, a metric. We recall that the manifold topology on a manifold is the topology generated by the domains of admissible charts.

4.1.5 Theorem (Riemannian manifolds are metric spaces) If $(M, G)$ is a smooth Riemannian manifold, then $(M, d_G)$ is a metric space. Moreover, the metric topology of $M$ agrees with the manifold topology.
Proof. First of all, we may as well assume that $M$ is connected, since the only nontrivial parts of the definition of a metric space to verify are those that hold for the connected components. Now, if $M$ is connected, it is path connected. As we argued in the proof of Theorem 2.3.9, this means that we can find a piecewise differentiable curve connecting every two points in $M$. Thus $d_G(x, y)$ is defined for every $x, y \in M$. Let $x \in M$. Since the length of the trivial curve $t \mapsto x$ is zero, it follows that $d_G(x, x) = 0$.

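Now let $x, y \in M$. For a piecewise differentiable curve $\gamma\colon [0, 1] \to M$, define $\bar{\gamma}\colon [0, 1] \to M$ by $\bar{\gamma}(t) = \gamma(1 - t)$. Note that $\ell_G(\bar{\gamma}) = \ell_G(\gamma)$ by the change of variables formula. Then the map $\gamma \mapsto \bar{\gamma}$ is a bijection from
\[ \{ \gamma\colon [0, 1] \to M \mid \gamma \text{ is piecewise differentiable with } \gamma(0) = x \text{ and } \gamma(1) = y \} \]
to
\[ \{ \gamma\colon [0, 1] \to M \mid \gamma \text{ is piecewise differentiable with } \gamma(0) = y \text{ and } \gamma(1) = x \}. \]
This allows us to conclude that $d_G(x, y) = d_G(y, x)$.

Let $x, y, z \in M$. Now let $\gamma_1\colon [0, 1] \to M$ be piecewise differentiable with $\gamma_1(0) = x$ and $\gamma_1(1) = z$ and let $\gamma_2\colon [0, 1] \to M$ be piecewise differentiable with $\gamma_2(0) = z$ and $\gamma_2(1) = y$. Then the curve $\gamma_1 * \gamma_2\colon [0, 1] \to M$ defined by
\[ \gamma_1 * \gamma_2(t) = \begin{cases} \gamma_1(2t), & t \in [0, \tfrac{1}{2}], \\ \gamma_2(2t - 1), & t \in (\tfrac{1}{2}, 1] \end{cases} \]
is piecewise differentiable and satisfies $\gamma_1 * \gamma_2(0) = x$ and $\gamma_1 * \gamma_2(1) = y$. Moreover,
\[ \ell_G(\gamma_1 * \gamma_2) = \ell_G(\gamma_1) + \ell_G(\gamma_2). \]
This allows us to conclude that $d_G(x, y) \le d_G(x, z) + d_G(z, y)$.

Next we show that $d_G(x, y) > 0$ when $x \ne y$. Let $(U, \phi)$ be a chart about $x$ and suppose that $\phi(x) = 0$. For $\boldsymbol{x} \in \phi(U)$, let $G_{\boldsymbol{x}}$ be the inner product on $\mathbb{R}^n$ ($n$ is the dimension of the connected component of $M$ containing $x$) whose components with respect to the standard basis are the components of $G$ at $\phi^{-1}(\boldsymbol{x})$. Since all norms on $\mathbb{R}^n$ are equivalent, for all $v \in \mathbb{R}^n$ and $\boldsymbol{x} \in \phi(U)$ we have $G_{\boldsymbol{x}}(v, v) \ge c_{\boldsymbol{x}} G_0(v, v)$ for some $c_{\boldsymbol{x}} \in \mathbb{R}_{>0}$. Let $r \in \mathbb{R}_{>0}$ be such that $B^n(r, 0) \subset \phi(U)$ and such that, if $y \in U$, then $\phi(y) \notin B^n(r, 0)$. By continuity of $\boldsymbol{x} \mapsto G_{\boldsymbol{x}}$, there exists $c \in \mathbb{R}_{>0}$ such that $G_{\boldsymbol{x}}(v, v) \ge c\, G_0(v, v)$ for all $\boldsymbol{x} \in B^n(r, 0)$. Now let $\xi\colon [0, 1] \to B^n(r, 0)$ be piecewise differentiable with $\xi(0) = 0$. Then, using Proposition 6.1.8, we compute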
1 0

(t), (t)) dt c G(t) (

1 0

(t), (t)) dt G0 (
1 0

cG0

(t),
0

(t)

1/2

= c G0 ((t), (t)).

Now let : [0, 1] M be piecewise dierentiable with (0) = x and (1) = y. Let T R>0 be such that (t) U for all t [0, T]. Then, if is the local representative,
G ()

G (|[0, T ])

c G0 ((T), (T)) cr,

giving dG (x, y)

0, as desired.


To prove the final assertion of the theorem, it suffices to show that (1) if $O$ is an open subset for the manifold topology, there exists an open ball $B$ for $d_G$ such that $B \subset O$ and (2) if $B$ is an open ball for $d_G$, then there exists an open set $O$ in the manifold topology such that $O \subset B$. For the first statement, let $O$ be an open subset for the manifold topology. Let $x \in O$ and let $(U, \phi)$ be a normal coordinate chart [Kobayashi and Nomizu 1963, Theorem 8.7] about $x$ such that $U \subset O$. Then there exists $r \in \mathbb{R}_{>0}$ such that $B^n(r, \phi(x)) \subset \phi(U)$. By [Kobayashi and Nomizu 1963, Theorem 8.7] it follows that $\phi^{-1}(B^n(r, \phi(x)))$ is a ball in the metric topology. For the second assertion, let $B$ be a ball in the metric topology for $d_G$. Let $x$ be the centre of the ball $B$ and let $(U, \phi)$ be a normal coordinate chart about $x$. Let $r \in \mathbb{R}_{>0}$ be such that the radius of $B$ exceeds $r$ and such that $B^n(r, \phi(x)) \subset \phi(U)$. Then let $U' = \phi^{-1}(B^n(r, \phi(x)))$ and note that $(U', \phi|U')$ is a chart for $M$ such that $U' \subset B$, as desired.

We shall sometimes ask that the metric space (M, dG ) be complete in the sense that Cauchy sequences in M converge. The following result is helpful in these cases. 4.1.6 Theorem (Characterisation of complete connected Riemannian manifolds) Let (M, G) be a smooth connected Riemannian manifold with associated metric dG . Then every compact subset of M is closed and bounded and the following statements are equivalent: (i) (M, dG ) is complete; (ii) each closed and bounded subset of M is compact.
Proof. Let us prove the first assertion of the theorem. Let $K \subset M$ be compact. Let us first show that $K$ is complete. Let $(x_j)_{j \in \mathbb{Z}_{>0}}$ be a Cauchy sequence in $K$. By the Bolzano–Weierstrass Theorem [Abraham, Marsden, and Ratiu 1988, Theorem 1.5.4] there exists a convergent subsequence $(x_{j_m})_{m \in \mathbb{Z}_{>0}}$, converging to $x_0 \in K$. Let $\epsilon \in \mathbb{R}_{>0}$ and let $N \in \mathbb{Z}_{>0}$ be sufficiently large that $d_G(x_j, x_k) < \frac{\epsilon}{2}$ for $j, k \ge N$ and $d_G(x_{j_m}, x_0) < \frac{\epsilon}{2}$ for $m \ge N$. Then, for $j, m \ge N$ and noting that $j_m \ge N$,
\[ d_G(x_j, x_0) \le d_G(x_j, x_{j_m}) + d_G(x_{j_m}, x_0) < \epsilon, \]
giving convergence of $(x_j)_{j \in \mathbb{Z}_{>0}}$ to $x_0 \in K$. This gives completeness of $K$ and so closedness of $K$ since it then follows that sequences in $K$ that converge have their limits in $K$.

Now we show that $K$ is bounded. Let $\epsilon \in \mathbb{R}_{>0}$. Since $K \subset \bigcup_{x \in K} B(\epsilon, x)$ there exist $x_1, \dots, x_m \in K$ such that $K \subset \bigcup_{j=1}^m B(\epsilon, x_j)$. Let
\[ M = \max\{ d_G(x_j, x_k) \mid j, k \in \{1, \dots, m\} \} + 2\epsilon. \]
If $x, y \in K$ then $x \in B(\epsilon, x_j)$ and $y \in B(\epsilon, x_k)$ for some $j, k \in \{1, \dots, m\}$. Then
\[ d_G(x, y) \le d_G(x, x_j) + d_G(x_j, x_k) + d_G(x_k, y) < M, \]
giving boundedness of $K$, as desired.

(i) $\Longrightarrow$ (ii) Let $(M, d_G)$ be complete and suppose that $K \subset M$ is closed and bounded. Let $R \in \mathbb{R}_{>0}$ and $x_0 \in K$ be such that $K \subset B(R, x_0)$. By the Hopf–Rinow Theorem [Lang 1995, Theorem 6.6], for each $x \in K$ there exists a geodesic of length at most $R$ connecting $x_0$ to $x$. Thus $K$ is contained in the image of the restriction of the Riemannian exponential map $\exp_{x_0}\colon T_{x_0}M \to M$ to the closed ball of radius $R$ in $T_{x_0}M$. This image is compact, being the image of a compact set under a continuous map [Abraham, Marsden, and Ratiu 1988, Proposition 1.5.2]. Thus $K$ is a closed subset of a compact set, and so compact [Abraham, Marsden, and Ratiu 1988, Proposition 1.5.2].

(ii) $\Longrightarrow$ (i) Let $(x_j)_{j \in \mathbb{Z}_{>0}}$ be a Cauchy sequence. We claim that this sequence must be bounded. Indeed, let $N \in \mathbb{Z}_{>0}$ be such that $d_G(x_j, x_k) \le 1$ for $j, k \ge N$, and define
\[ R = \max\big( \{ d_G(x_j, x_N) \mid j \in \{1, \dots, N\} \} \cup \{1\} \big). \]
Then note that $x_j \in \overline{B}(R, x_N)$ for every $j \in \mathbb{Z}_{>0}$. Since we are assuming that closed and bounded subsets are compact, $\overline{B}(R, x_N)$ is compact. By the Bolzano–Weierstrass Theorem [Abraham, Marsden, and Ratiu 1988, Theorem 1.5.4], $(x_j)_{j \in \mathbb{Z}_{>0}}$ possesses a convergent subsequence. However, as we showed in the first paragraph of the proof, this means that the sequence converges, giving completeness of $M$.

4.1.7 Remark (Existence of complete Riemannian metrics) Since complete Riemannian metrics will sometimes be of value to us, it is useful to know that they exist. To this end, Nomizu and Ozeki [1961] prove that every paracompact manifold possesses a complete smooth Riemannian metric.

4.1.2 Equivalence of metrics

Although the metric topology agrees with the manifold topology, we shall make certain constructions in this chapter that rely not just on the topology of the metric, but on the metric structure itself. In such cases, we would like for these constructions to be independent of the choice of metric. The following result is the key to this.

4.1.8 Theorem (Equivalence of metrics on compact sets) If $G_1$ and $G_2$ are smooth Riemannian metrics on $M$ with metrics $d_1$ and $d_2$, respectively, and if $K \subset M$ is compact, then there exists $c \in \mathbb{R}_{>0}$ such that
\[ c^{-1} d_1(x, y) \le d_2(x, y) \le c\, d_1(x, y) \]
for every $x, y \in K$.
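Proof We begin with a lemma.

1 Lemma If $\mathbb{G}_1$ and $\mathbb{G}_2$ are inner products on a finite-dimensional $\mathbb{R}$-vector space $V$, then there exists $c \in \mathbb{R}_{>0}$ such that
$$c^{-1}\mathbb{G}_1(v, v) \le \mathbb{G}_2(v, v) \le c\,\mathbb{G}_1(v, v)$$
for all $v \in V$.

Proof Let $\mathbb{G}_j^\flat \in L(V; V^*)$ and $\mathbb{G}_j^\sharp \in L(V^*; V)$, $j \in \{1, 2\}$, be the induced linear maps. Note that
$$\mathbb{G}_1(\mathbb{G}_1^\sharp \circ \mathbb{G}_2^\flat(v_1), v_2) = \mathbb{G}_2(v_1, v_2) = \mathbb{G}_2(v_2, v_1) = \mathbb{G}_1(\mathbb{G}_1^\sharp \circ \mathbb{G}_2^\flat(v_2), v_1),$$
showing that $\mathbb{G}_1^\sharp \circ \mathbb{G}_2^\flat$ is $\mathbb{G}_1$-symmetric. Let $(e_1, \dots, e_n)$ be a $\mathbb{G}_1$-orthonormal basis for $V$ that is also a basis of eigenvectors for $\mathbb{G}_1^\sharp \circ \mathbb{G}_2^\flat$. The matrix representatives of $\mathbb{G}_1$ and $\mathbb{G}_2$ are then
$$[\mathbb{G}_1] = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}, \qquad [\mathbb{G}_2] = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix},$$
where $\lambda_1, \dots, \lambda_n \in \mathbb{R}_{>0}$. Let us assume without loss of generality that $\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$. Then taking $c = \max\{\lambda_n, \lambda_1^{-1}\}$ gives the result, as one can verify directly.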
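The direct verification mentioned at the end of the lemma can be sketched numerically in the two-dimensional case. The following is only an illustration under stated assumptions: the inner products are given as symmetric positive-definite $2 \times 2$ matrices, the eigenvalues are computed as the roots of $\det([\mathbb{G}_2] - \lambda[\mathbb{G}_1]) = 0$, and the function names (`equivalence_constant`, `quadratic_form`, `check_bounds`) are hypothetical, not taken from the notes.

```python
import math

def equivalence_constant(G1, G2):
    """Return c = max(lam_max, 1/lam_min), where lam_min <= lam_max are the
    roots of det(G2 - lam*G1) = 0 for symmetric positive-definite 2x2 matrices.
    These roots are the eigenvalues of G1^{-1} G2 (an assumption of this sketch:
    both matrices really are symmetric and positive-definite)."""
    a1, b1, d1 = G1[0][0], G1[0][1], G1[1][1]
    a2, b2, d2 = G2[0][0], G2[0][1], G2[1][1]
    # det(G2 - lam*G1) = qa*lam^2 + qb*lam + qc
    qa = a1 * d1 - b1 * b1
    qb = -(a1 * d2 + a2 * d1 - 2.0 * b1 * b2)
    qc = a2 * d2 - b2 * b2
    disc = math.sqrt(max(qb * qb - 4.0 * qa * qc, 0.0))
    lam_min = (-qb - disc) / (2.0 * qa)
    lam_max = (-qb + disc) / (2.0 * qa)
    return max(lam_max, 1.0 / lam_min)

def quadratic_form(G, v):
    """Evaluate G(v, v) for a symmetric 2x2 matrix G and a vector v."""
    return G[0][0] * v[0] ** 2 + 2.0 * G[0][1] * v[0] * v[1] + G[1][1] * v[1] ** 2

def check_bounds(G1, G2, samples=360):
    """Check c^{-1} G1(v,v) <= G2(v,v) <= c G1(v,v) on sampled unit vectors."""
    c = equivalence_constant(G1, G2)
    tol = 1e-9
    for k in range(samples):
        angle = 2.0 * math.pi * k / samples
        v = (math.cos(angle), math.sin(angle))
        g1, g2 = quadratic_form(G1, v), quadratic_form(G2, v)
        if not (g1 / c - tol <= g2 <= c * g1 + tol):
            return False
    return True
```

Since both sides of the inequality are homogeneous of degree two in $v$, sampling unit vectors is enough for this sanity check; it is of course not a substitute for the proof.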

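Continuing with the proof, let $K \subset M$ be compact. For $x \in K$ let $N_x$ be a neighbourhood of $x$ and, by the lemma, let $c_x \in \mathbb{R}_{>0}$ be such that
$$c_x^{-1}\mathbb{G}_1(v_x, v_x) < \mathbb{G}_2(v_x, v_x) < c_x\mathbb{G}_1(v_x, v_x)$$
for every nonzero $v_x \in T_xM$. By continuity of $\mathbb{G}_1$ and $\mathbb{G}_2$ we can shrink $N_x$ so that
$$c_x^{-1}\mathbb{G}_1(v_y, v_y) < \mathbb{G}_2(v_y, v_y) < c_x\mathbb{G}_1(v_y, v_y)$$
for every $y \in N_x$ and every nonzero $v_y \in T_yM$. Since $K$ is compact and since $K \subset \cup_{x \in K} N_x$, there exist $x_1, \dots, x_k \in K$ such that $K \subset \cup_{j=1}^k N_{x_j}$. If we take $c \in \mathbb{R}_{>0}$ such that $c^2 = \max\{c_{x_1}, \dots, c_{x_k}\}$, then
$$c^{-2}\mathbb{G}_1(v_x, v_x) < \mathbb{G}_2(v_x, v_x) < c^2\mathbb{G}_1(v_x, v_x)$$
for every $x \in K$ and every nonzero $v_x \in T_xM$.

Now let $x, y \in K$. If $x$ and $y$ are in separate connected components of $M$ then $d_{\mathbb{G}_1}(x, y) = d_{\mathbb{G}_2}(x, y) = \infty$ and the conclusions of the theorem hold. Thus we suppose that $x$ and $y$ lie in a single connected component of $M$. Let $(\gamma_j: [0, 1] \to M)_{j \in \mathbb{Z}_{>0}}$ be a sequence of curves connecting $x$ and $y$ such that $\ell_{\mathbb{G}_1}(\gamma_j) < d_{\mathbb{G}_1}(x, y) + \frac{1}{j}$. Then, for each $j \in \mathbb{Z}_{>0}$,
$$d_{\mathbb{G}_2}(x, y) \le \ell_{\mathbb{G}_2}(\gamma_j) \le c\,\ell_{\mathbb{G}_1}(\gamma_j) \le c\,d_{\mathbb{G}_1}(x, y) + \frac{c}{j}.$$
Taking the limit as $j \to \infty$ gives $d_{\mathbb{G}_2}(x, y) \le c\,d_{\mathbb{G}_1}(x, y)$. A similar argument gives $c^{-1}d_{\mathbb{G}_1}(x, y) \le d_{\mathbb{G}_2}(x, y)$, and so gives the theorem.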

The compactness of $K$ in the preceding theorem is generally necessary, as the following example shows.

4.1.9 Example (Nonequivalent metrics) Let $M = \mathbb{R}$ and let
$$\mathbb{G}_1 = dx \otimes dx, \qquad \mathbb{G}_2 = (1 + x^2)^2\, dx \otimes dx.$$
Let $x, y \in \mathbb{R}$ and let $\gamma: [0, 1] \to \mathbb{R}$ be the curve $\gamma(t) = (1 - t)x + ty$ connecting $x$ and $y$. Then, since any curve connecting $x$ and $y$ that does not overlap on itself is a reparameterisation of $\gamma$, it follows that
$$d_{\mathbb{G}_1}(x, y) = \int_0^1 |\gamma'(t)|\, dt = |y - x|$$
and
$$d_{\mathbb{G}_2}(x, y) = \int_0^1 (1 + \gamma(t)^2)|\gamma'(t)|\, dt = \tfrac{1}{3}(x^2 + xy + y^2 + 3)|y - x|.$$
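As a numerical sanity check on the closed form for $d_{\mathbb{G}_2}$, the integral with integrand $(1 + \gamma(t)^2)|\gamma'(t)|$ can be approximated by a midpoint rule and compared with $\frac{1}{3}(x^2 + xy + y^2 + 3)|y - x|$. This is a minimal sketch; the function names and the number of quadrature steps are assumptions made for illustration only.

```python
def gamma(t, x, y):
    """The straight-line curve gamma(t) = (1 - t) x + t y."""
    return (1.0 - t) * x + t * y

def length_G1(x, y):
    """d_{G1}(x, y) = |y - x| for the standard metric on R."""
    return abs(y - x)

def length_G2(x, y, steps=20000):
    """Midpoint-rule approximation of int_0^1 (1 + gamma(t)^2) |gamma'(t)| dt."""
    speed = abs(y - x)  # |gamma'(t)| is constant along the straight line
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += (1.0 + gamma(t, x, y) ** 2) * speed * h
    return total

def closed_form_G2(x, y):
    """The closed form (x^2 + x y + y^2 + 3) |y - x| / 3."""
    return (x * x + x * y + y * y + 3.0) * abs(y - x) / 3.0

def ratio_from_zero(y):
    """d_{G2}(0, y) / d_{G1}(0, y), which equals (y^2 + 3) / 3 for y != 0."""
    return closed_form_G2(0.0, y) / length_G1(0.0, y)
```

With $x = 0$ the ratio of the two distances is $(y^2 + 3)/3$, which grows without bound in $y$; this is the quantitative content of the nonequivalence.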

Let us take $x = 0$. Then note that for any $C \in \mathbb{R}_{>0}$ there exists $y \in \mathbb{R}$ such that $d_{\mathbb{G}_2}(0, y) \ge C\,d_{\mathbb{G}_1}(0, y)$. This precludes $d_{\mathbb{G}_1}$ and $d_{\mathbb{G}_2}$ from being equivalent.

4.1.3 The metric structure of the tangent bundle of a Riemannian manifold

When we talk about Lipschitz differential inclusions on manifolds, it will be convenient to have a natural Riemannian metric on the tangent bundle. There is a natural way to induce a Riemannian metric on the tangent bundle of a Riemannian manifold, due to Sasaki [1958]. We outline this construction in this section.

There are a variety of ways to describe the induced Riemannian metric on the tangent bundle, and all constructions rely in an essential way on defining a connection on the bundle $\pi_{TM}: TM \to M$. We take for granted facts and notation about jet bundles that we will not review until Section 8.1.

Most of our constructions are valid for a general affine connection. Thus we let $M$ be a smooth manifold and we let $\nabla$ be an affine connection on $M$. In this section, for notational simplicity, let us abbreviate the tangent bundle projection by $\pi: TM \to M$. We denote by $VTM = \ker(T\pi)$ the subbundle of tangent spaces to the fibres of $TM$. Let $\nu: VTM \to TM$ be the restriction of the tangent bundle projection for $TM$. For $v_x, w_x \in T_xM$ we define the vertical lift of $w_x$ at $v_x$ by
$$\mathrm{vlft}_{v_x}(w_x) = \frac{d}{dt}\Big|_{t=0}(v_x + t w_x) \in V_{v_x}TM.$$
The map $w_x \mapsto \mathrm{vlft}_{v_x}(w_x)$ is an isomorphism of $T_xM$ with $V_{v_x}TM$, and so provides us with an isomorphism of the vector bundle $\nu: VTM \to TM$ with the pull-back vector bundle $\pi^*\pi: \pi^*TM \to TM$. We have the following commutative diagram:
$$\begin{array}{ccc} \pi^*TM & \xrightarrow{\ \mathrm{vlft}\ } & VTM \\ {\scriptstyle \mathrm{pr}_1} \searrow & & \swarrow {\scriptstyle \nu} \\ & TM & \end{array}$$
If $X: M \to TM$ is a vector field on $M$, we denote by $X^*VTM$ the pull-back of the vertical bundle to $M$ by $X$. Note that if $Y: M \to X^*VTM$ is a section of this pull-back bundle, then $Y(x) \in V_{X(x)}TM \simeq T_xM$. Thus sections of $X^*VTM$ are naturally thought of as simply vector fields. We shall make this identification implicitly below. Using the above constructions, let us provide, associated with a general affine connection $\nabla$, a section of the bundle $\pi^1_0: J^1TM \to TM$.

4.1.10 Lemma (Connections and sections of jet bundles) Let $\nabla$ be an affine connection on a smooth manifold $M$. Then there exists a unique section $S: TM \to J^1TM$ satisfying
$$\nabla Y(x) = j_1Y(x) - S(Y(x)),$$
for every vector field $Y \in \Gamma^\infty(TM)$, noting that
$$j_1X(x) - S(X(x)) \in T^*_xM \otimes V_{X(x)}TM \simeq T^*_xM \otimes T_xM.$$

Proof The existence of the section $S$ is simply a matter of defining $S(v_x) = j_1Y(x) - \nabla Y(x)$, where $Y$ is any vector field satisfying $Y(x) = v_x$. We should verify that this definition is well-defined. Let $f \in C^\infty(M)$ be such that $f(x) = 0$. Then, for any vector field $X$,
$$\nabla(fX)(x) = f(x)\nabla X(x) + df(x) \otimes X(x) = df(x) \otimes X(x). \qquad (4.1)$$
Similarly,
$$j_1(fX)(x) = f(x)j_1X(x) + df(x) \otimes X(x) = df(x) \otimes X(x). \qquad (4.2)$$
Thus, if $Z$ is a vector field vanishing at $x$, since $\Gamma^\infty(TM)$ is a locally free module (such matters are discussed ad nauseam in Section 5.2), in a neighbourhood of $x$ we can write
$$Z = f_1X_1 + \dots + f_nX_n$$
for linearly independent (over $C^\infty(M)$) vector fields $X_1, \dots, X_n$ and for functions $f_1, \dots, f_n$ which vanish at $x$. By (4.1) and (4.2) we then have $j_1Z(x) = \nabla Z(x)$. Now let $Y'$ be a vector field such that $Y'(x) = Y(x) = v_x$ and compute
$$(j_1Y(x) - \nabla Y(x)) - (j_1Y'(x) - \nabla Y'(x)) = j_1(Y - Y')(x) - \nabla(Y - Y')(x).$$
Note that the vector field $Z = Y - Y'$ vanishes at $x$, and so our computations above give $j_1Z(x) = \nabla Z(x)$, giving well-definedness of $S$. It is clear that $S$ is uniquely determined from $\nabla$ by the condition in the statement of the result.

We now use the map $S$ as the basis for our further constructions. Let $v_1 \in J^1TM$, let $v = \pi^1_0(v_1)$, and let $x = \pi(v)$. Let $X: M \to TM$ be a vector field satisfying $j_1X(x) = v_1$. Define $L_{v_1} \in \mathrm{Hom}_{\mathbb{R}}(T_xM; T_vTM)$ by $L_{v_1}(v_x) = T_xX(v_x)$.

4.1.11 Lemma Let $v_1 \in J^1TM$, let $v = \pi^1_0(v_1)$, and let $x = \pi(v)$. The following statements hold:
(i) $L_{v_1}$ is a well-defined linear injection;
(ii) $T_v\pi \circ L_{v_1} = \mathrm{id}_{T_xM}$;
(iii) $\mathrm{image}(L_{v_1})$ is a complement to $V_vTM$ in $T_vTM$.
Moreover, if $v \in TM$ and $x = \pi(v)$, and if $L: T_xM \to T_vTM$ is a linear map satisfying $T_v\pi \circ L = \mathrm{id}_{T_xM}$, then there exists a unique $v_1 \in (\pi^1_0)^{-1}(v)$ such that $L = L_{v_1}$.

Proof (i) Suppose that $X_1, X_2: M \to TM$ satisfy $j_1X_1(x) = j_1X_2(x) = v_1$. This means that, for any smooth curve $\gamma: I \to M$ satisfying $0 \in \mathrm{int}(I)$ and $\gamma(0) = x$, it holds that
$$\frac{d}{ds}\Big|_{s=0}(X_1 \circ \gamma)(s) = \frac{d}{ds}\Big|_{s=0}(X_2 \circ \gamma)(s).$$
This immediately gives $T_xX_1(v_x) = T_xX_2(v_x)$ for every $v_x \in T_xM$. Thus the definition of $L_{v_1}$ is independent of the choice of local section $X$. Linearity of $L_{v_1}$ is now obvious. To see that $L_{v_1}$ is injective note that $\pi \circ X = \mathrm{id}_M$ and so $T_{X(x)}\pi \circ T_xX = \mathrm{id}_{T_xM}$. Thus $L_{v_1}$ possesses a left-inverse and so is injective.

(ii) This was proved as part of the proof of the previous assertion.

(iii) Suppose that $u_v \in \mathrm{image}(L_{v_1}) \cap V_vTM$ and write $u_v = L_{v_1}(v_x)$ for $v_x \in T_xM$. Let $\gamma: I \to M$ be a smooth curve satisfying $0 \in \mathrm{int}(I)$ and $\gamma'(0) = v_x$. If $X: M \to TM$ is a vector field satisfying $j_1X(x) = v_1$, then this means that
$$u_v = \frac{d}{ds}\Big|_{s=0}(X \circ \gamma)(s).$$
Since $u_v$ is vertical we have $T_v\pi(u_v) = 0$. This in turn means that
$$0 = \frac{d}{ds}\Big|_{s=0}(\pi \circ X \circ \gamma)(s) = \frac{d}{ds}\Big|_{s=0}\gamma(s) = v_x.$$
This means, therefore, that $u_v = 0$. This gives $\mathrm{image}(L_{v_1}) \cap V_vTM = \{0\}$. This part of the result then follows from a dimension count.

For the last assertion, let $(U, \phi)$ be a chart for $M$. Denote the natural coordinates for $TM$ by $(x^j, v^k)$. Suppose that $\phi(x) = \boldsymbol{0}$ and that $T\phi(v) = (\boldsymbol{0}, \boldsymbol{v})$. An arbitrary linear map between $T_xM$ and $T_vTM$ will have the coordinate representation
$$L = A^k_j\, dx^j \otimes \frac{\partial}{\partial x^k} + B^k_j\, dx^j \otimes \frac{\partial}{\partial v^k}.$$
The condition that $T_v\pi \circ L = \mathrm{id}_{T_xM}$ is readily seen to imply that $A^k_j = \delta^k_j$, $j, k \in \{1, \dots, n\}$. Now, if we define a local section $X$ on $U$ with local representative $\boldsymbol{X}(\boldsymbol{x}) = (\boldsymbol{x}, \boldsymbol{v} + \boldsymbol{B}\boldsymbol{x})$, where $\boldsymbol{B}$ is the $n \times n$ matrix with components $B^k_j$, $j, k \in \{1, \dots, n\}$, then it is immediate that if we take $v_1 = j_1X(x)$ we have $L = L_{v_1}$. This gives the existence assertion. For uniqueness suppose that $L_{v_1} = L_{v_2}$ and let $X_1$ and $X_2$ be local sections such that $L_{v_1} = T_xX_1$ and $L_{v_2} = T_xX_2$. This means that $j_1X_1(x) = j_1X_2(x)$ and so $v_1 = v_2$, as desired.

Now, associated with $S$, we define an endomorphism $P^H_S$ of $TTM$ by
$$P^H_S(u_v) = L_{S(v)} \circ T_v\pi(u_v), \qquad u_v \in T_vTM.$$
The following assertions are more or less obvious.

4.1.12 Lemma Let $\nabla$ be an affine connection on $M$ and let $S: TM \to J^1TM$ be the corresponding section. The endomorphism $P^H_S \in \Gamma^\infty(T^*TM \otimes TTM)$ has the following properties:
(i) $\ker(P^H_S) = VTM$;
(ii) $TTM = \ker(P^H_S) \oplus \mathrm{image}(P^H_S)$.
Proof (i) It is clear that $VTM \subset \ker(P^H_S)$. For the opposite inclusion, suppose that $P^H_S(u_v) = 0$. Since $L_{S(v)}$ is injective this implies that $T_v\pi(u_v) = 0$, whence $u_v$ is vertical.
(ii) Since $T_v\pi$ is surjective it follows that $\mathrm{image}(P^H_S(v)) = \mathrm{image}(L_{S(v)})$, which is complementary to $V_vTM$ by Lemma 4.1.11.

The endomorphism $P^H_S$ is called the horizontal projection associated with the connection $\nabla$ and the endomorphism $P^V_S = \mathrm{id}_{TTM} - P^H_S$ is called the vertical projection. It is easy to see that $P^V_S$ is a projection onto $VTM$ and $P^H_S$ is a projection onto a subbundle, denoted by $HTM$, that is complementary to $VTM$. For $v_x \in T_xM$, note that both $H_{v_x}TM$ and $V_{v_x}TM$ are naturally isomorphic to $T_xM$. For $H_{v_x}TM$, we note that the restriction of $T_{v_x}\pi$ to $H_{v_x}TM$ is the desired isomorphism with $T_xM$. For $V_{v_x}TM$, we note that this subspace is the tangent space to the fibre $T_xM$, and so is naturally isomorphic to $T_xM$ without having to say more.


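We will have need of the coordinate expressions for the horizontal and vertical projections. Let $(U, \phi)$ be a chart for $M$ and denote natural coordinates for $TM$ by $(x^j, v^k)$. Parsing the above constructions, one can show that
$$P^H_S = \delta^j_k\, dx^k \otimes \frac{\partial}{\partial x^j} - \Gamma^i_{jk}v^k\, dx^j \otimes \frac{\partial}{\partial v^i}, \qquad P^V_S = \delta^j_k\, dv^k \otimes \frac{\partial}{\partial v^j} + \Gamma^i_{jk}v^k\, dx^j \otimes \frac{\partial}{\partial v^i},$$
where $\Gamma^i_{jk}$ are the Christoffel symbols of $\nabla$ in the chart.

The preceding constructions allow us to make the following definition.

4.1.13 Definition (Sasaki Riemannian metric) Let $(M, \mathbb{G})$ be a smooth Riemannian manifold, let $\nabla$ be the associated Levi-Civita affine connection, and let $S: TM \to J^1TM$ be the associated section. The Sasaki Riemannian metric on $TM$ is the metric given by
$$\mathbb{G}^T(X_{v_x}, Y_{v_x}) = \mathbb{G}(T_{v_x}\pi_{TM}(X_{v_x}), T_{v_x}\pi_{TM}(Y_{v_x})) + \mathbb{G}(P^V_S(X_{v_x}), P^V_S(Y_{v_x})),$$
for $v_x \in TM$ and $X_{v_x}, Y_{v_x} \in T_{v_x}TM$. The metric on $TM$ associated with the Sasaki Riemannian metric is denoted by $d^T_{\mathbb{G}}$.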

4.2 Set-valued maps between metric and topological spaces

In this section we give the essential characterisations of set-valued maps. Much of the discussion is most naturally carried out for metric spaces (or sometimes more generally for topological spaces with certain properties) and so most of the presentation is given in this framework. However, given our efforts in Section 4.1, we know that this allows us to include Riemannian manifolds in our framework. Since paracompact manifolds possess Riemannian metrics [Abraham, Marsden, and Ratiu 1988, Theorem 5.5.12], and since manifolds that are not paracompact are to be regarded as pathological, this means that the analysis we present in this section can be applied to any manifold of significance. However, some of the constructions do generally depend on the choice of metric. In such cases we outline, where appropriate, the independence of our definitions and theorems of the choice of a particular metric. Our discussion begins with the construction of the Hausdorff metric on the collection of compact subsets of a Riemannian manifold.

4.2.1 The Hausdorff distance

We begin with a fairly simple measure of closeness of sets.

4.2.1 Definition (Distance between sets) Let $(M, d)$ be a metric space. For nonempty subsets $A$ and $B$ of $M$ the distance between $A$ and $B$ is
$$\mathrm{dist}(A, B) = \inf\{d(x, y) \mid x \in A,\ y \in B\}.$$
If $A = \{x\}$ for some $x \in M$, then we denote $\mathrm{dist}(x, B) = \mathrm{dist}(\{x\}, B)$ and, if $B = \{y\}$ for some $y \in M$, then we denote $\mathrm{dist}(A, y) = \mathrm{dist}(A, \{y\})$.


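A sometimes useful construction involving the distance between sets is the following.

4.2.2 Definition (Upper limit and lower limit of sequences of sets) Let $(M, d)$ be a metric space and let $(A_j)_{j \in \mathbb{Z}_{>0}}$ be a sequence of subsets of $M$.
(i) The lower limit of $(A_j)_{j \in \mathbb{Z}_{>0}}$ is
$$\mathop{\mathrm{Liminf}}_{j \to \infty} A_j = \Bigl\{x \in M \Bigm| \limsup_{j \to \infty} \mathrm{dist}(x, A_j) = 0\Bigr\}.$$
(ii) The upper limit of $(A_j)_{j \in \mathbb{Z}_{>0}}$ is
$$\mathop{\mathrm{Limsup}}_{j \to \infty} A_j = \Bigl\{x \in M \Bigm| \liminf_{j \to \infty} \mathrm{dist}(x, A_j) = 0\Bigr\}.$$
(iii) A set $A \subset M$ is the limit of $(A_j)_{j \in \mathbb{Z}_{>0}}$, and we write $A = \mathrm{Lim}_{j \to \infty} A_j$, if
$$A = \mathop{\mathrm{Liminf}}_{j \to \infty} A_j = \mathop{\mathrm{Limsup}}_{j \to \infty} A_j.$$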

For general sets $A$ and $B$ there is not much useful one can say about $\mathrm{dist}(A, B)$. However, if we make some assumptions about the sets, then there is some structure here. Let us explore some of this.

4.2.3 Proposition (Continuity of distance to a set) Let $(M, d)$ be a metric space. If $B \subset M$ is nonempty then the function $x \mapsto \mathrm{dist}(x, B)$ on $M$ is uniformly continuous in the metric topology.
Proof Let $\epsilon \in \mathbb{R}_{>0}$ and take $\delta = \frac{\epsilon}{2}$. Let $x_1, x_2 \in M$ and let $y \in B$ be such that $d(x_1, y) - \mathrm{dist}(x_1, B) < \frac{\epsilon}{2}$. Then, if $d(x_1, x_2) < \delta$,
$$\mathrm{dist}(x_2, B) \le d(x_2, y) \le d(x_2, x_1) + d(x_1, y) < \mathrm{dist}(x_1, B) + \epsilon.$$
In a symmetric manner one shows that $\mathrm{dist}(x_1, B) < \mathrm{dist}(x_2, B) + \epsilon$, provided that $d(x_1, x_2) < \delta$. Therefore,
$$|\mathrm{dist}(x_1, B) - \mathrm{dist}(x_2, B)| < \epsilon,$$
provided that $d(x_1, x_2) < \delta$, giving uniform continuity, as desired.
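The argument in the proof can be tightened to the estimate $|\mathrm{dist}(x_1, B) - \mathrm{dist}(x_2, B)| \le d(x_1, x_2)$, which is easy to test numerically. The following is a minimal sketch, assuming that $M$ is the Euclidean plane and that $B$ is a finite nonempty set of points; the function names, the sampling box and the random seed are hypothetical choices made for the illustration.

```python
import math
import random

def dist_to_set(x, B):
    """dist(x, B) for a finite nonempty set B of points in the plane."""
    return min(math.dist(x, y) for y in B)

def lipschitz_violations(B, trials=1000, seed=0):
    """Count sampled pairs (x1, x2) violating
    |dist(x1, B) - dist(x2, B)| <= d(x1, x2); the expected count is zero."""
    rng = random.Random(seed)
    count = 0
    for _ in range(trials):
        x1 = (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))
        x2 = (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))
        gap = abs(dist_to_set(x1, B) - dist_to_set(x2, B))
        if gap > math.dist(x1, x2) + 1e-12:  # small tolerance for rounding
            count += 1
    return count
```

A random search of this kind cannot prove the estimate, but a nonzero count would indicate an error in an implementation of the distance function.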

Now let us consider some properties of the distance function for closed sets. It is convenient to have at this point the notion of a Heine-Borel metric space, by which we mean one in which closed and bounded sets are compact. By Theorem 4.1.6 we know, for example, that complete Riemannian manifolds, equipped with the Riemannian distance metric, are Heine-Borel.

4.2.4 Proposition (Set distance and closed sets) Let $(M, d)$ be a metric space. If $A, B \subset M$ are closed sets then the following statements hold:
(i) if $A \cap B = \emptyset$ then $\mathrm{dist}(x, B), \mathrm{dist}(A, y) > 0$ for all $x \in A$ and $y \in B$;
(ii) if $(M, d)$ is Heine-Borel and if $A$ is compact, then there exist $x_0 \in A$ and $y_0 \in B$ such that $\mathrm{dist}(A, B) = d(x_0, y_0)$.
Proof (i) Suppose that $\mathrm{dist}(x, B) = 0$. Then there exists a sequence $(y_j)_{j \in \mathbb{Z}_{>0}}$ in $B$ such that $d(y_j, x) < \frac{1}{j}$ for each $j \in \mathbb{Z}_{>0}$. Thus the sequence $(y_j)_{j \in \mathbb{Z}_{>0}}$ converges to $x$ and so $x \in \mathrm{cl}(B) = B$. Therefore, if $A \cap B = \emptyset$ we can conclude that, if $\mathrm{dist}(x, B) = 0$, then $x \notin A$. That is, $\mathrm{dist}(x, B) > 0$ for every $x \in A$, and similarly $\mathrm{dist}(A, y) > 0$ for every $y \in B$.
(ii) By Proposition 4.2.3 the function $x \mapsto \mathrm{dist}(x, B)$ is continuous and so too then is its restriction to the compact set $A$. Thus, since continuous functions on compact sets attain their minimum [Abraham, Marsden, and Ratiu 1988, Corollary 1.5.8], there exists $x_0 \in A$ such that $\mathrm{dist}(A, B) = \mathrm{dist}(x_0, B)$. Now there exists a sequence $(y_j)_{j \in \mathbb{Z}_{>0}}$ in $B$ such that $d(y_j, x_0) < \mathrm{dist}(x_0, B) + \frac{1}{j}$ for each $j \in \mathbb{Z}_{>0}$. Abbreviate $r = \mathrm{dist}(x_0, B)$. The sequence $(y_j)_{j \in \mathbb{Z}_{>0}}$ is contained in the closed ball $\overline{\mathsf{B}}(r + 1, x_0)$, which is compact since $M$ is Heine-Borel. Therefore, by the Bolzano-Weierstrass Theorem [Abraham, Marsden, and Ratiu 1988, Theorem 1.5.4], there exists a convergent subsequence $(y_{j_k})_{k \in \mathbb{Z}_{>0}}$ converging to $y_0$. Since $B$ is closed we necessarily have $y_0 \in B$. We claim that $\mathrm{dist}(A, B) = d(x_0, y_0)$. Indeed, continuity of the metric ensures that
$$\mathrm{dist}(A, B) = \mathrm{dist}(x_0, B) = \lim_{k \to \infty} d(y_{j_k}, x_0) = d(y_0, x_0),$$
as desired.

The above developments have to do with the minimum distance of points in two sets. Next let us turn to a different comparison of sets, one that compares their shapes. In order to do this the following notion will be useful.

4.2.5 Definition (r-neighbourhood of a set) Let $(M, d)$ be a metric space. For $A \subset M$ nonempty and for $r \in \mathbb{R}_{>0}$, the set $\mathsf{B}(r, A) = \cup_{x \in A}\mathsf{B}(r, x)$ is called the r-neighbourhood of $A$.

One now defines the following numbers associated with nonempty subsets $A, B \subset M$ of a metric space $(M, d)$:
$$h^*(A, B) = \inf\{r \in \mathbb{R}_{>0} \mid B \subset \mathsf{B}(r, A)\}, \qquad h_*(A, B) = \inf\{r \in \mathbb{R}_{>0} \mid A \subset \mathsf{B}(r, B)\}.$$
Let us record some properties of these numbers.

4.2.6 Lemma (Properties of $h^*(A, B)$ and $h_*(A, B)$) Let $(M, d)$ be a metric space. For nonempty subsets $A, B, C \subset M$ the following statements hold:
(i) $h^*(A, B) = \sup\{\mathrm{dist}(A, y) \mid y \in B\}$;
(ii) $h_*(A, B) = \sup\{\mathrm{dist}(x, B) \mid x \in A\}$;
(iii) if $A$ is bounded then $h_*(A, B) < \infty$;
(iv) if $B$ is bounded then $h^*(A, B) < \infty$;
(v) $h^*(A, B) = h_*(B, A)$;
(vi) $h^*(A, B) = 0$ if and only if $B \subset \mathrm{cl}(A)$;
(vii) $h_*(A, B) = 0$ if and only if $A \subset \mathrm{cl}(B)$;
(viii) $h^*(A, C) \le h^*(A, B) + h^*(B, C)$;
(ix) $h_*(A, C) \le h_*(A, B) + h_*(B, C)$.

Proof (i) and (ii) We prove (ii). Let $\epsilon \in \mathbb{R}_{>0}$. Then, for each $x \in A$, there exists $y_x \in B$ such that
$$\mathrm{dist}(x, B) \le d(x, y_x) < h_*(A, B) + \epsilon$$
(the first inequality follows from the definition of $\mathrm{dist}(x, B)$ and the second follows from the definition of $h_*(A, B)$). This holding for every $x \in A$ we have
$$\sup\{\mathrm{dist}(x, B) \mid x \in A\} \le h_*(A, B) + \epsilon.$$
This holding for every $\epsilon \in \mathbb{R}_{>0}$ we have $\sup\{\mathrm{dist}(x, B) \mid x \in A\} \le h_*(A, B)$.

Again let $\epsilon \in \mathbb{R}_{>0}$ and note that for $x \in A$ there exists $y_x \in B$ such that $d(x, y_x) < \mathrm{dist}(x, B) + \epsilon$. Thus
$$\sup\{d(x, y_x) \mid x \in A\} \le \sup\{\mathrm{dist}(x, B) \mid x \in A\} + \epsilon.$$
This means that every point in $A$ lies within a distance $\sup\{\mathrm{dist}(x, B) \mid x \in A\} + \epsilon$ of some point in $B$. By the definition of $h_*(A, B)$ this gives
$$h_*(A, B) \le \sup\{\mathrm{dist}(x, B) \mid x \in A\} + \epsilon$$
and, since $\epsilon$ is arbitrary, $h_*(A, B) \le \sup\{\mathrm{dist}(x, B) \mid x \in A\}$. Combining the previous two paragraphs gives this part of the proof.

(iii) and (iv) Let us prove, say, (iv). If $B$ is bounded there exist $r \in \mathbb{R}_{>0}$ and $x_0 \in A$ such that $B \subset \mathsf{B}(r, x_0) \subset \mathsf{B}(r, A)$. Thus $h^*(A, B) \le r$ and so is finite.

(v) This is obvious.

(vi) and (vii) Let us prove, say, (vi). Suppose that $h^*(A, B) = 0$ and let $y \in B$. Then, for every $r \in \mathbb{R}_{>0}$ there exists $x \in A$ such that $d(x, y) < r$. Thus there exists a sequence $(x_j)_{j \in \mathbb{Z}_{>0}}$ in $A$ such that $d(x_j, y) < \frac{1}{j}$ for every $j \in \mathbb{Z}_{>0}$. Thus the sequence converges to $y$ and so $y \in \mathrm{cl}(A)$.

Next suppose that $B \subset \mathrm{cl}(A)$ and let $y \in B$. Then there exists a sequence $(x_j)_{j \in \mathbb{Z}_{>0}}$ in $A$ such that $d(x_j, y) < \frac{1}{j}$ for every $j \in \mathbb{Z}_{>0}$. Thus $y \in \mathsf{B}(\frac{1}{j}, A)$ for every $j \in \mathbb{Z}_{>0}$. Since this holds for every $y \in B$ it follows that $h^*(A, B) = 0$.

(viii) and (ix) Let us prove, say, (ix). Let $x \in A$, $y \in B$, and $z \in C$. We then have
$$d(x, z) \le d(x, y) + d(y, z).$$
Since this holds for every $z \in C$ we then have
$$\inf\{d(x, z) \mid z \in C\} \le d(x, y) + \inf\{d(y, z) \mid z \in C\} \implies \mathrm{dist}(x, C) \le d(x, y) + \mathrm{dist}(y, C).$$
Since this holds for every $y \in B$ we have
$$\mathrm{dist}(x, C) \le \inf\{d(x, y) \mid y \in B\} + \sup\{\mathrm{dist}(y, C) \mid y \in B\} \implies \mathrm{dist}(x, C) \le \mathrm{dist}(x, B) + h_*(B, C).$$
Since this holds for every $x \in A$ we have
$$\sup\{\mathrm{dist}(x, C) \mid x \in A\} \le \sup\{\mathrm{dist}(x, B) \mid x \in A\} + h_*(B, C) \implies h_*(A, C) \le h_*(A, B) + h_*(B, C),$$
as desired.
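For finite sets the supremum characterisations in parts (i) and (ii) of Lemma 4.2.6 can be evaluated directly. The sketch below is only an illustration: it assumes $M = \mathbb{R}$ with the standard metric and finite nonempty sets given as Python lists, it takes the first number of the lemma to be $\sup\{\mathrm{dist}(A, y) \mid y \in B\}$ and the second to be $\sup\{\mathrm{dist}(x, B) \mid x \in A\}$, and the function names are hypothetical.

```python
def dist_point_set(x, B):
    """dist(x, B) for a finite nonempty set B of real numbers."""
    return min(abs(x - y) for y in B)

def h_upper(A, B):
    """sup{dist(A, y) | y in B}: how far points of B can be from A."""
    return max(dist_point_set(y, A) for y in B)

def h_lower(A, B):
    """sup{dist(x, B) | x in A}: how far points of A can be from B."""
    return max(dist_point_set(x, B) for x in A)

def hausdorff(A, B):
    """The larger of the two one-sided numbers."""
    return max(h_upper(A, B), h_lower(A, B))

# A is contained in B, so the second number vanishes while the first does not.
A = [0.0, 1.0]
B = [0.0, 1.0, 5.0]
C = [2.0]
one_sided = (h_upper(A, B), h_lower(A, B))
```

In this example `h_lower(A, B)` is zero because $A \subset B$, while `h_upper(A, B)` is the distance from the point $5$ to $A$, which shows that the two numbers need not agree.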

Note that it is not generally the case that $h^*(A, B) = h_*(A, B)$, as we shall see in examples below. To symmetrise this we then make the following definition.

4.2.7 Definition ((Upper and lower) Hausdorff distance) Let $(M, d)$ be a metric space. For nonempty subsets $A, B \subset M$,
(i) the upper Hausdorff distance between $A$ and $B$ is $h^*(A, B)$,
(ii) the lower Hausdorff distance between $A$ and $B$ is $h_*(A, B)$, and
(iii) the Hausdorff distance between $A$ and $B$ is $h(A, B) = \max\{h^*(A, B), h_*(A, B)\}$.

From Lemma 4.2.6 we have the following result.

4.2.8 Theorem (Properties of the Hausdorff distance for compact sets) Let $(M, d)$ be a metric space. If $A, B, C \subset M$ are nonempty and compact then the following statements hold:
(i) $h(A, B) \ge 0$;
(ii) $h(A, B) = 0$ if and only if $A = B$;
(iii) $h(A, B) = h(B, A)$;
(iv) $h(A, C) \le h(A, B) + h(B, C)$.

The theorem then makes the following definition sensible.

4.2.9 Definition (Hausdorff metric) Let $(M, d)$ be a metric space. If $\mathscr{K}(M)$ denotes the collection of compact subsets of $M$, then $h: \mathscr{K}(M) \times \mathscr{K}(M) \to \mathbb{R}_{\ge 0}$ is the Hausdorff metric.

Let us give a few examples to illustrate the Hausdorff distance.

4.2.10 Examples (Hausdorff distance)

1. We take $M = \mathbb{R}$ with $d$ the standard metric. Let us take $A = (0, 1)$ and $B = [0, 1]$. Note that for any $r \in \mathbb{R}_{>0}$ we have $A \subset \mathsf{B}(r, B)$ and $B \subset \mathsf{B}(r, A)$. Thus $h^*(A, B) = h_*(A, B) = 0$ and so $h(A, B) = 0$. However, $A \ne B$. This explains why closedness is an essential property for sets to possess when using the Hausdorff distance.
2. Let us take $A = [0, 1]$ and $B = [0, \infty)$. Since $A \subset B$ we have $h_*(A, B) = 0$. However, there is no finite $r \in \mathbb{R}_{>0}$ for which $B \subset \mathsf{B}(r, A)$ and so $h^*(A, B) = \infty$. Thus $h(A, B) = \infty$. This explains why boundedness is important when using the Hausdorff distance.

As the final result in this section, we indicate that the topology on the set $\mathscr{K}(M)$ induced by the metric $h$ is independent of $d$ in the case when $d$ is the Riemannian distance.

4.2.11 Proposition (Independence of Hausdorff metric topology on Riemannian metric) Let $M$ be a smooth manifold, let $\mathbb{G}_1$ and $\mathbb{G}_2$ be smooth Riemannian metrics on $M$, let $d_1$ and $d_2$ be the associated metrics on $M$, and let $h_1$ and $h_2$ be the associated Hausdorff metrics on $\mathscr{K}(M)$. If $(M, d_1)$ and $(M, d_2)$ are complete metric spaces, then the metric topologies on $\mathscr{K}(M)$ induced by $h_1$ and $h_2$ agree.
Proof Let $\mathrm{dist}_1$ and $\mathrm{dist}_2$ be the set distance functions associated with the metrics $d_1$ and $d_2$, respectively. Let $\mathcal{B}_1(r, K)$ and $\mathcal{B}_2(r, K)$ be the open balls of radius $r$ and centre $K$ in the metrics $h_1$ and $h_2$, respectively. Let $\mathsf{B}_1(r, A)$ and $\mathsf{B}_2(r, A)$ be the r-neighbourhoods of $A$ with respect to the metrics $d_1$ and $d_2$, respectively.

Let $\mathcal{O}_1 \subset \mathscr{K}(M)$ be an open set in the topology induced by the metric $h_1$ and let $K \in \mathcal{O}_1$. By openness of $\mathcal{O}_1$, let $r \in \mathbb{R}_{>0}$ be such that $\mathcal{B}_1(r, K) \subset \mathcal{O}_1$. If $K' \in \mathcal{B}_1(r, K)$ then
$$h_1^*(K, K') \le h_1(K, K') < r \implies K' \subset \mathsf{B}_1(r, K).$$
We claim that $\mathsf{B}_1(r, K)$ is bounded. Since $K$ is compact, by Theorem 4.1.6 it is bounded. Let $R \in \mathbb{R}_{>0}$ and $x_0 \in M$ be such that $K$ is contained in the $d_1$-ball of radius $R$ and centre $x_0$. Let $x, y \in \mathsf{B}_1(r, K)$ and let $x', y' \in K$ be such that $x$ is in the $d_1$-ball of radius $r$ and centre $x'$ and such that $y$ is in the $d_1$-ball of radius $r$ and centre $y'$. Then
$$d_1(x, y) \le d_1(x, x') + d_1(x', y') + d_1(y', y) < 2R + 2r,$$
giving boundedness of $\mathsf{B}_1(r, K)$, as desired.

Let $A = \mathrm{cl}(\mathsf{B}_1(r, K))$, noting that $A$ is compact by Theorem 4.1.6. By Theorem 4.1.8 there exists $c \in \mathbb{R}_{>0}$ such that $d_1(x, y) \le c\,d_2(x, y)$ for all $x, y \in A$. Define $r' = \frac{r}{c}$ and let $K' \in \mathcal{B}_2(r', K)$. First suppose that $h_1(K', K) = h_1^*(K', K)$. Then
$$h_1(K', K) = \sup\{\mathrm{dist}_1(K', y) \mid y \in K\} \le \sup\{c\,\mathrm{dist}_2(K', y) \mid y \in K\} = c\,h_2^*(K', K) \le c\,h_2(K', K) < r,$$
using Lemma 4.2.6(i). Similarly, if $h_1(K', K) = h_{1,*}(K', K)$,
$$h_1(K', K) = \sup\{\mathrm{dist}_1(x, K) \mid x \in K'\} \le \sup\{c\,\mathrm{dist}_2(x, K) \mid x \in K'\} = c\,h_{2,*}(K', K) \le c\,h_2(K', K) < r,$$
using Lemma 4.2.6(ii). Thus $h_1(K', K) < r$ and this shows that $\mathcal{B}_2(r', K) \subset \mathcal{B}_1(r, K) \subset \mathcal{O}_1$.

In like manner, one shows that, if $\mathcal{O}_2 \subset \mathscr{K}(M)$ is open in the metric topology of $h_2$ and if $K \in \mathcal{O}_2$, then there exists $r' \in \mathbb{R}_{>0}$ such that $\mathcal{B}_1(r', K) \subset \mathcal{O}_2$. This shows that the metric topologies of $h_1$ and $h_2$ agree.

4.2.2 Set-valued maps

Let us first give the definition and notation for a set-valued map.

4.2.12 Definition (Set-valued map) Let $S$ and $T$ be sets.
(i) A set-valued map from $S$ to $T$ is a map $F$ from $S$ to the set $2^T$ of subsets of $T$. A set-valued map from $S$ to $T$ will be denoted by $F: S \twoheadrightarrow T$.
(ii) The domain of a set-valued map $F: S \twoheadrightarrow T$ is
$$\mathrm{dom}(F) = \{x \in S \mid F(x) \ne \emptyset\}.$$
(iii) The graph of a set-valued map $F: S \twoheadrightarrow T$ is
$$\mathrm{graph}(F) = \{(x, y) \in S \times T \mid y \in F(x)\}.$$

If the sets $S$ and $T$ have additional structure, one can ask for additional structure for set-valued maps. The following gives some of the more important such properties.

4.2.13 Definition (Flavours of set-valued maps) Let $S$ and $T$ be topological spaces and let $F: S \twoheadrightarrow T$ be a set-valued map.
(i) The set-valued map is open-valued (resp. closed-valued, compact-valued) at $x \in S$ if $F(x)$ is open (resp. closed, compact).
(ii) The set-valued map is open-valued (resp. closed-valued, compact-valued) if $F(x)$ is open (resp. closed, compact) for every $x \in S$.
Now let $M$ and $N$ be metric spaces and let $F: M \twoheadrightarrow N$ be a set-valued map.
(iii) The set-valued map is bounded-valued at $x \in M$ if $F(x)$ is bounded.
(iv) The set-valued map is bounded-valued if $F(x)$ is bounded for every $x \in M$.
(v) The set-valued map $F$ is bounded if there exist $R \in \mathbb{R}_{>0}$ and $y_0 \in N$ such that $F(x) \subset \mathsf{B}(R, y_0)$ for each $x \in M$.
(vi) The set-valued map $F$ is locally bounded if $F|K$ is bounded for every compact subset $K$ of $M$.

4.2.3 Notions of continuity for set-valued maps

We shall be interested in cases where the sets $S$ and $T$ are manifolds, and sometimes Riemannian manifolds. In these cases, we can use the topology of the manifold, and sometimes the metric structure, to provide notions of continuity for set-valued maps.

We begin by giving notions of continuity of set-valued maps depending explicitly on a metric for their definition.

4.2.14 Definition (Upper hemicontinuous, lower hemicontinuous, Hausdorff continuous) Let $(M, d_M)$ and $(N, d_N)$ be metric spaces with Hausdorff distances $h_M$ and $h_N$, respectively. Let $F: M \twoheadrightarrow N$ be a set-valued map for which $\mathrm{dom}(F) = M$.
(i) The set-valued map $F$ is upper hemicontinuous at $x_0 \in M$ if, for every $\epsilon \in \mathbb{R}_{>0}$, there exists $\delta \in \mathbb{R}_{>0}$ such that $h_N^*(F(x_0), F(x)) < \epsilon$ for every $x \in \mathsf{B}(\delta, x_0)$.
(ii) The set-valued map $F$ is upper hemicontinuous if it is upper hemicontinuous at every point in $M$.
(iii) The set-valued map $F$ is lower hemicontinuous at $x_0 \in M$ if, for every $\epsilon \in \mathbb{R}_{>0}$, there exists $\delta \in \mathbb{R}_{>0}$ such that $h_{N,*}(F(x_0), F(x)) < \epsilon$ for every $x \in \mathsf{B}(\delta, x_0)$.
(iv) The set-valued map $F$ is lower hemicontinuous if it is lower hemicontinuous at every point in $M$.
(v) The set-valued map $F$ is Hausdorff continuous at $x_0 \in M$ if it is both upper and lower hemicontinuous at $x_0$. That is, $F$ is Hausdorff continuous at $x_0$ if, for every $\epsilon \in \mathbb{R}_{>0}$, there exists $\delta \in \mathbb{R}_{>0}$ such that $h_N(F(x), F(x_0)) < \epsilon$ for every $x \in \mathsf{B}(\delta, x_0)$.
(vi) The set-valued map $F$ is Hausdorff continuous if it is Hausdorff continuous at every point in $M$.

There are also notions of continuity of set-valued maps that can be made without reference to the Hausdorff metric, and which use only a topology for the sets in their definition.

4.2.15 Definition (Upper and lower semicontinuity) Let $S$ and $T$ be topological spaces and let $F: S \twoheadrightarrow T$ be a set-valued map.
(i) The set-valued map $F$ is upper semicontinuous at $x_0 \in S$ if, for every open set $V \supset F(x_0)$, there exists a neighbourhood $U \subset S$ of $x_0$ such that $F(U) \subset V$.
(ii) The set-valued map $F$ is upper semicontinuous if it is upper semicontinuous at each $x \in S$.
(iii) The set-valued map $F$ is lower semicontinuous at $x_0 \in S$ if, for any $y_0 \in F(x_0)$ and any neighbourhood $V$ of $y_0$, there exists a neighbourhood $U$ of $x_0$ such that $F(x) \cap V \ne \emptyset$ for every $x \in U$.
(iv) The set-valued map $F$ is lower semicontinuous if it is lower semicontinuous at each $x \in S$.
(v) The set-valued map $F$ is continuous at $x_0 \in S$ if it is both upper and lower semicontinuous at $x_0$.
(vi) The set-valued map $F$ is continuous if it is both upper and lower semicontinuous.

The terminology we use here is very much not standardised. Different authors will use the words hemicontinuous and semicontinuous to mean different things. The reader should be alert to this.

Let us now establish the relationships between the notions of hemicontinuity and semicontinuity.
The following result should be read carefully; the implications have a potentially confusing asymmetry.

4.2.16 Theorem (Relationship between hemicontinuity and semicontinuity) Let $(M, d_M)$ and $(N, d_N)$ be metric spaces with Hausdorff distances $h_M$ and $h_N$, respectively. Let $F: M \twoheadrightarrow N$ be a set-valued map for which $\mathrm{dom}(F) = M$. Then the following statements hold:
(i) if $F$ is upper semicontinuous at $x_0 \in M$ then it is upper hemicontinuous at $x_0$;
(ii) if $F$ is lower hemicontinuous at $x_0 \in M$ then it is lower semicontinuous at $x_0$;
(iii) if $F$ is upper hemicontinuous at $x_0 \in M$ and if $F(x_0)$ is compact, then $F$ is upper semicontinuous at $x_0$;
(iv) if $F$ is lower semicontinuous at $x_0 \in M$ and if $F(x_0)$ is totally bounded, then $F$ is lower hemicontinuous at $x_0$.
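Proof In the proof, we denote balls in the metrics $d_M$ and $d_N$ by $\mathsf{B}_M(r, x)$ and $\mathsf{B}_N(r, y)$, respectively. Similarly, $\mathsf{B}_M(r, A)$ and $\mathsf{B}_N(r, B)$ are the r-neighbourhoods of $A \subset M$ and $B \subset N$, respectively.

(i) Suppose that $F$ is not upper hemicontinuous at $x_0$. Then there exists $\epsilon \in \mathbb{R}_{>0}$ such that, for every $\delta \in \mathbb{R}_{>0}$, $h_N^*(F(x_0), F(x)) \ge \epsilon$ for some $x \in \mathsf{B}_M(\delta, x_0)$. Therefore, for each $j \in \mathbb{Z}_{>0}$ there exists $x_j \in M$ such that $d_M(x_j, x_0) < \frac{1}{j}$ and such that $h_N^*(F(x_0), F(x_j)) \ge \epsilon$. This means that, for the sequence $(x_j)_{j \in \mathbb{Z}_{>0}}$ so defined and for the open set $V = \mathsf{B}_N(\frac{\epsilon}{2}, F(x_0))$ containing $F(x_0)$, we have $F(x_j) \not\subset V$ for every $j \in \mathbb{Z}_{>0}$. Since every neighbourhood of $x_0$ contains some $x_j$, $F$ is not upper semicontinuous at $x_0$.

(ii) Let $y_0 \in F(x_0)$ and let $V$ be an open subset of $N$ for which $y_0 \in V$. Let $\epsilon \in \mathbb{R}_{>0}$ be such that $\mathsf{B}_N(\epsilon, y_0) \subset V$. Since $F$ is lower hemicontinuous at $x_0$ let $\delta \in \mathbb{R}_{>0}$ be such that $h_{N,*}(F(x_0), F(x)) < \epsilon$ for $x \in \mathsf{B}_M(\delta, x_0)$. Take $U = \mathsf{B}_M(\delta, x_0)$. It then follows that $F(x_0) \subset \mathsf{B}_N(\epsilon, F(x))$ for $x \in U$. In particular, for each $x \in U$ there exists $y \in F(x)$ such that $y \in \mathsf{B}_N(\epsilon, y_0) \subset V$. Thus $F$ is lower semicontinuous at $x_0$.

(iii) Let $V \subset N$ be an open set such that $F(x_0) \subset V$. If $V = N$ there is nothing to prove, so suppose that $N \setminus V$ is nonempty. By Proposition 4.2.3 the function $y \mapsto \mathrm{dist}(y, N \setminus V)$ is continuous and, by Proposition 4.2.4(i), it is positive on $F(x_0)$ since $N \setminus V$ is closed and disjoint from the closed set $F(x_0)$. Since $F(x_0)$ is compact, this function attains a positive minimum $\epsilon$ on $F(x_0)$. By upper hemicontinuity of $F$ at $x_0$ there exists $\delta \in \mathbb{R}_{>0}$ such that $h_N^*(F(x_0), F(x)) < \epsilon$ for every $x \in \mathsf{B}_M(\delta, x_0)$. Take $U = \mathsf{B}_M(\delta, x_0)$. If $x \in U$ then we have $F(x) \subset \mathsf{B}_N(\epsilon, F(x_0))$. By the definition of $\epsilon$ this means that $F(x) \cap (N \setminus V) = \emptyset$ and so $F(x) \subset V$ for $x \in U$. Thus $F$ is upper semicontinuous at $x_0$.

(iv) Suppose that $F$ is not lower hemicontinuous at $x_0$. This means that there exists $\epsilon \in \mathbb{R}_{>0}$ such that, for every $\delta \in \mathbb{R}_{>0}$, $h_{N,*}(F(x_0), F(x)) \ge \epsilon$ for some $x \in \mathsf{B}_M(\delta, x_0)$. Thus there exists a sequence $(x_j)_{j \in \mathbb{Z}_{>0}}$ in $M$ such that, for every $j \in \mathbb{Z}_{>0}$, $d_M(x_j, x_0) < \frac{1}{j}$ and such that $F(x_0) \not\subset \mathsf{B}_N(\frac{\epsilon}{2}, F(x_j))$. Since $F(x_0)$ is totally bounded there exist $y_1, \dots, y_k \in F(x_0)$ such that
$$F(x_0) \subset \cup_{l=1}^k \mathsf{B}_N(\tfrac{\epsilon}{4}, y_l). \qquad (4.3)$$
We claim that there exists $l \in \{1, \dots, k\}$ such that, for every $J \in \mathbb{Z}_{>0}$, there exists $j \ge J$ such that $F(x_j) \cap \mathsf{B}_N(\frac{\epsilon}{4}, y_l) = \emptyset$. Suppose otherwise. That is, suppose that, for each $l \in \{1, \dots, k\}$, there exists $J_l \in \mathbb{Z}_{>0}$ such that $F(x_j) \cap \mathsf{B}_N(\frac{\epsilon}{4}, y_l) \ne \emptyset$ for $j \ge J_l$. Let $J = \max\{J_1, \dots, J_k\}$ and let $y \in F(x_0)$. By (4.3) there exists $l(y) \in \{1, \dots, k\}$ such that $y \in \mathsf{B}_N(\frac{\epsilon}{4}, y_{l(y)})$. Then, for $j \ge J$ and for $y \in F(x_0)$ there exists $y(j, y) \in F(x_j) \cap \mathsf{B}_N(\frac{\epsilon}{4}, y_{l(y)})$. We then have
$$d_N(y, y(j, y)) \le d_N(y, y_{l(y)}) + d_N(y_{l(y)}, y(j, y)) < \tfrac{\epsilon}{2}.$$
But this contradicts the fact that $F(x_0) \not\subset \mathsf{B}_N(\frac{\epsilon}{2}, F(x_j))$ for every $j \in \mathbb{Z}_{>0}$. Thus we can conclude that there exists $l \in \{1, \dots, k\}$ such that, for every $J \in \mathbb{Z}_{>0}$, there exists $j \ge J$ such that $F(x_j) \cap \mathsf{B}_N(\frac{\epsilon}{4}, y_l) = \emptyset$. Since $y_l \in F(x_0)$ and since every neighbourhood of $x_0$ contains such an $x_j$, this means that $F$ is not lower semicontinuous at $x_0$.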

Now having at hand the denitions and the basic characterisations of hemicontinuity and semicontinuity, let us look at some examples and try to get a handle on what these notions mean.


4.2.17 Examples (Hemicontinuity and semicontinuity) 1. We equip Rn and Rm with their standard metrics. Let M Rn . A plain old map f : M Rm denes a set-valued map F f : M Rm according to F f (x) = { f (x)}. It is easy to show that F f is upper semicontinuous at x0 if and only if f is continuous at x0 and that F f is lower semicontinuous at x0 if and only if f is continuous at x0 . Since F f (x) is compact for each x M it follows from Theorem 4.2.16 that F f is upper hemicontinuous if and only if it is upper semicontinuous if and only if it is lower hemicontinuous if and only if it is lower semicontinuous if and only if it is Hausdor continuous if and only if it is continuous if and only if f is continuous. That is to say, all possible notions of continuity (that we have dened) agree in this case. This is reassuring. Somewhat less reassuring is the fact that upper (resp. lower) semicontinuous functions do not correspond to upper (resp. lower) semicontinuous set-valued maps. This is one of the commonly accepted and confusing bits of this business. 2. We equip R with its standard metric. Dene F : R R by F(x) = (x 1, x + 1). We claim that F is upper hemicontinuous but not upper semicontinuous. Indeed, it is easy to verify that if |x1 x2 | < then h (F(x1 ), F(x2 )) < , h (F(x1 ), F(x2 )) < .

Thus $F$ is actually Hausdorff continuous and so, in particular, upper hemicontinuous. We next claim that $F$ is not upper semicontinuous at $0$. To see this, consider the open subset $V = (-1, 1)$ which has the property that $F(0) \subseteq V$. Consider the sequence $(x_j = \tfrac{1}{j})_{j \in \mathbb{Z}_{>0}}$ which converges to $0$. Since $F(x_j) \not\subseteq V$ for every $j \in \mathbb{Z}_{>0}$ it follows that $F$ is indeed not upper semicontinuous at $0$. Note that $F(0)$ is not compact.

3. Let us take $\mathsf{M} = \{\tfrac{1}{k} \mid k \in \mathbb{Z}_{>0}\} \cup \{0\}$, which we think of as a zero-dimensional manifold, and define $F : \mathsf{M} \twoheadrightarrow \mathbb{R}$ by
$$F(x) = \begin{cases} \{0, 1, \dots, k\}, & x = \tfrac{1}{k},\ k \in \mathbb{Z}_{>0}, \\ \mathbb{Z}_{\ge 0}, & x = 0. \end{cases}$$
We claim that $F$ is lower semicontinuous but not lower hemicontinuous. To see that $F$ is lower semicontinuous at $x_0 = \tfrac{1}{k}$ for $k \in \mathbb{Z}_{>0}$, note that any sequence $(x_j)_{j \in \mathbb{Z}_{>0}}$ in $\mathsf{M}$ converging to $x_0$ must be eventually constant; that is, there exists $N \in \mathbb{Z}_{>0}$ such that $x_j = \tfrac{1}{k}$ for all $j \ge N$. In this case we clearly have $F(x_j) \cap V = F(x_0) \cap V$ for any $j \ge N$ and for any open set $V \subseteq \mathbb{R}$. From this we easily deduce that $F$ is lower semicontinuous at $x_0 = \tfrac{1}{k}$. Next we show that $F$ is lower semicontinuous at $x_0 = 0$. Indeed, let $(x_j)_{j \in \mathbb{Z}_{>0}}$ be a sequence in $\mathsf{M}$ converging to $0$. If $V \subseteq \mathbb{R}$ is an open set for which $F(0) \cap V \ne \emptyset$ then this means exactly that $m \in V$ for some $m \in \mathbb{Z}_{\ge 0}$. In particular this means that


$F(\tfrac{1}{k}) \cap V \ne \emptyset$ for all $k \ge m$. This means $F(x_j) \cap V \ne \emptyset$ for all $j$ sufficiently large. This gives lower semicontinuity of $F$. Finally, we claim that $F$ is not lower hemicontinuous at $0$. To see this one need only note that $\sup\{\operatorname{dist}_{\mathbb{R}}(u, F(\tfrac{1}{k})) \mid u \in F(0)\} = \infty$ for all $k \in \mathbb{Z}_{>0}$. This precludes lower hemicontinuity of $F$ at $0$. Note that $F(0)$ is not totally bounded.

The preceding two examples show that the converses of the first two parts of Theorem 4.2.16 are generally false.

4. Let $\mathsf{M} \subseteq \mathbb{R}^n$ and let $f_-, f_+ : \mathsf{M} \to \mathbb{R}$ have the following properties: (a) $f_-$ is lower semicontinuous; (b) $f_+$ is upper semicontinuous; (c) $f_-(x) \le f_+(x)$ for all $x \in \mathsf{M}$. Let us define $F : \mathsf{M} \twoheadrightarrow \mathbb{R}$ by $F(x) = [f_-(x), f_+(x)]$. It is then an easy exercise to show that $F$ is upper semicontinuous. In the case where $\mathsf{M} = \mathbb{R}$ we depict the situation in Figure 4.1.

Figure 4.1 The graph of an upper continuous set-valued function (top) and a lower continuous set-valued function (bottom)

5. Now let $\mathsf{M} \subseteq \mathbb{R}^n$ be an open submanifold and let $f_-, f_+ : \mathsf{M} \to \mathbb{R}$ have the following properties: (a) $f_-$ is upper semicontinuous; (b) $f_+$ is lower semicontinuous; (c) $f_-(x) \le f_+(x)$ for all $x \in \mathsf{M}$.

Let us define $F : \mathsf{M} \twoheadrightarrow \mathbb{R}$ by $F(x) = [f_-(x), f_+(x)]$. It is still an easy exercise to show that $F$ is lower semicontinuous. We refer to Figure 4.1 for a depiction of this situation.

Let us now consider some further properties one can assign to set-valued maps, and the relationships between these properties. First let us consider the important special case when the assigned sets are closed.

4.2.18 Definition (Closed set-valued maps) Let $S$ and $T$ be topological spaces. A set-valued map $F : S \twoheadrightarrow T$ is (i) closed at $x_0 \in S$ if, for every sequence $(x_j)_{j \in \mathbb{Z}_{>0}}$ in $S$ converging to $x_0$ and for every sequence $(y_j)_{j \in \mathbb{Z}_{>0}}$ for which $y_j \in F(x_j)$ and which converges to some $y_0 \in T$, it holds that $y_0 \in F(x_0)$, and is (ii) closed if it is closed at each point in $S$.

Let us explore the relationship between closedness and our concepts of continuity.

4.2.19 Proposition (Closedness and upper semicontinuity) If $S$ and $T$ are first countable Hausdorff topological spaces and if $F : S \twoheadrightarrow T$ is a set-valued map, then the following statements hold: (i) $F$ is closed if and only if $\operatorname{graph}(F)$ is a closed subset of $S \times T$; (ii) if $F$ is upper semicontinuous and if $F(x)$ is closed for every $x \in S$, then $F$ is closed; (iii) if $T$ is compact and if $F$ is closed, then $F$ is upper semicontinuous.
Proof Note that first countability of $S$ implies that points in the closure of a subset of $S$ are limits of sequences in that subset. Since $S$ is Hausdorff, limits of sequences are unique. The same comment holds, of course, for $T$ and $S \times T$.

(i) Suppose that $\operatorname{graph}(F)$ is closed, let $x_0 \in S$, let $(x_j)_{j \in \mathbb{Z}_{>0}}$ be a sequence converging to $x_0$, and let $(y_j)_{j \in \mathbb{Z}_{>0}}$ be a sequence in $T$ such that $y_j \in F(x_j)$ and such that $\lim_{j \to \infty} y_j = y_0$. Thus $((x_j, y_j))_{j \in \mathbb{Z}_{>0}}$ is a sequence in $\operatorname{graph}(F)$ converging to $(x_0, y_0) \in S \times T$. Therefore, closedness of $\operatorname{graph}(F)$ gives $(x_0, y_0) \in \operatorname{graph}(F)$, or $y_0 \in F(x_0)$, as desired. Now suppose that $F$ is closed at each $x \in S$ and let $((x_j, y_j))_{j \in \mathbb{Z}_{>0}}$ be a sequence in $\operatorname{graph}(F)$ converging to $(x_0, y_0) \in S \times T$. Thus $\lim_{j \to \infty} x_j = x_0$ and $\lim_{j \to \infty} y_j = y_0$. Since $y_j \in F(x_j)$ and since $F$ is closed at $x_0$, it follows that $y_0 \in F(x_0)$, i.e., $(x_0, y_0) \in \operatorname{graph}(F)$. Thus $\operatorname{graph}(F)$ is closed.

(ii) We will show that, under the given hypotheses, the complement of $\operatorname{graph}(F)$ is open. Let $(x_0, y_0) \in (S \times T) \setminus \operatorname{graph}(F)$. Since $F(x_0)$ is closed, let $V$ be a neighbourhood of $y_0$ for which $\operatorname{cl}(V) \cap F(x_0) = \emptyset$. Let us define a neighbourhood of $F(x_0)$ by $V' = T \setminus \operatorname{cl}(V)$. By upper semicontinuity of $F$ there exists a neighbourhood $U$ of $x_0$ such that $x \in U$ implies that $F(x) \subseteq V'$. This means that $(U \times V) \cap \operatorname{graph}(F) = \emptyset$ and so $U \times V$ is a neighbourhood of $(x_0, y_0)$ in $(S \times T) \setminus \operatorname{graph}(F)$. Thus $(S \times T) \setminus \operatorname{graph}(F)$ is open, as desired.

(iii) Suppose that $T$ is compact and that $F$ is not upper semicontinuous at some $x_0 \in S$. Since $F$ is not upper hemicontinuous at $x_0$ (by Theorem 4.2.16) there exists a neighbourhood $V$ of $F(x_0)$ such that, for every neighbourhood $U$ of $x_0$, we have $F(x) \not\subseteq V$ for some $x \in U$. Thus there is a sequence $(x_j)_{j \in \mathbb{Z}_{>0}}$ in $S$ converging to $x_0$ and a sequence $(y_j)_{j \in \mathbb{Z}_{>0}}$ in $T$ for which $y_j \in F(x_j)$ and for which $y_j \notin V$ for each $j \in \mathbb{Z}_{>0}$. Compactness of $T$ ensures that there is a convergent subsequence $(y_{j_k})_{k \in \mathbb{Z}_{>0}}$ and the limit of this sequence cannot be in $V$ by openness of $V$. But this means that $F$ is not closed at $x_0$.

Let us give examples to show that there is, in general, no correspondence between upper semicontinuity and closedness.

4.2.20 Examples (Closedness and upper semicontinuity)

1. We let $S = T = \mathbb{R}$ with the standard topology. The set-valued map $F : \mathbb{R} \twoheadrightarrow \mathbb{R}$ given by
$$F(x) = \begin{cases} \{\tfrac{1}{x}\}, & x \ne 0, \\ \{0\}, & x = 0 \end{cases}$$
is closed but not upper semicontinuous.

2. The set-valued map $F : \mathbb{R} \twoheadrightarrow \mathbb{R}$ given by $F(x) = (0, 1)$ is not closed but is upper semicontinuous.

Next we consider some additional characterisations of set-valued maps when the sets assigned are compact. That this is of particular interest is already clear from Theorem 4.2.16.

4.2.21 Proposition (Compactness and upper semicontinuity) Let $(\mathsf{M}, d_{\mathsf{M}})$ and $(\mathsf{N}, d_{\mathsf{N}})$ be first countable Hausdorff topological spaces and let $F : \mathsf{M} \twoheadrightarrow \mathsf{N}$ be a set-valued map for which $F(x_0)$ is compact for some $x_0 \in \mathsf{M}$. Then the following statements are equivalent: (i) $F$ is upper semicontinuous at $x_0$; (ii) for every sequence $(x_j)_{j \in \mathbb{Z}_{>0}}$ in $\mathsf{M}$ converging to $x_0$ and for every sequence $(y_j)_{j \in \mathbb{Z}_{>0}}$ in $\mathsf{N}$ for which $y_j \in F(x_j)$, there exists a subsequence $(y_{j_k})_{k \in \mathbb{Z}_{>0}}$ converging to a point in $F(x_0)$.
Proof Denote by $B_{\mathsf{M}}(r, A)$ and $B_{\mathsf{N}}(r, B)$ the $r$-neighbourhoods of $A \subseteq \mathsf{M}$ and $B \subseteq \mathsf{N}$, respectively. Similarly let $\operatorname{dist}_{\mathsf{M}}$ and $\operatorname{dist}_{\mathsf{N}}$ be the set distance functions for $\mathsf{M}$ and $\mathsf{N}$.

First suppose that $F$ is upper semicontinuous at $x_0$, let $(x_j)_{j \in \mathbb{Z}_{>0}}$ be a sequence in $\mathsf{M}$ converging to $x_0 \in \mathsf{M}$, and let $(y_j)_{j \in \mathbb{Z}_{>0}}$ be a sequence in $\mathsf{N}$ for which $y_j \in F(x_j)$. Suppose that $(y_j)_{j \in \mathbb{Z}_{>0}}$ has no subsequence converging to a point in $F(x_0)$. Let $y \in F(x_0)$. Then there exists a neighbourhood $V_y$ of $y$ and $N_y \in \mathbb{Z}_{>0}$ such that $y_j \notin V_y$ for $j \ge N_y$. Since $F(x_0) \subseteq \bigcup_{y \in F(x_0)} V_y$ and since $F(x_0)$ is compact, there exist $y_1, \dots, y_k \in F(x_0)$ such that $F(x_0) \subseteq \bigcup_{j=1}^k V_{y_j} \triangleq V$. Let $N = \max\{N_{y_1}, \dots, N_{y_k}\}$ and note that, if $j \ge N$, then $y_j \notin V$. Since $F$ is upper semicontinuous, there exists $N' \in \mathbb{Z}_{>0}$ such that $F(x_j) \subseteq V$ for $j \ge N'$. This contradiction gives this part of the result.

Now suppose $F$ is not upper semicontinuous at $x_0$. Then there exists a neighbourhood $V$ of $F(x_0)$ and a sequence $(x_j)_{j \in \mathbb{Z}_{>0}}$ converging to $x_0$ such that $F(x_j) \not\subseteq V$ for each $j \in \mathbb{Z}_{>0}$. Let $(y_j)_{j \in \mathbb{Z}_{>0}}$ be a sequence for which $y_j \in F(x_j)$ and $y_j \notin V$. Since $\mathsf{N} \setminus V$ is closed, any convergent subsequence of $(y_j)_{j \in \mathbb{Z}_{>0}}$ must converge to a point in $\mathsf{N} \setminus V$. In particular, such subsequences cannot converge to a point in $F(x_0)$.

4.2.4 Lipschitz set-valued maps

In our study of differential equations, we saw that the Lipschitz property of vector fields is a natural one as concerns existence and uniqueness of integral curves. We shall see that for differential inclusions an important role is also played by the Lipschitz property. In this section we develop the theory of general set-valued Lipschitz maps.

4.2.22 Definition (Lipschitz set-valued map) Let $(\mathsf{M}, d_{\mathsf{M}})$ and $(\mathsf{N}, d_{\mathsf{N}})$ be metric spaces and let $h_{\mathsf{M}}$ and $h_{\mathsf{N}}$ denote the corresponding Hausdorff distances. (i) A set-valued map $F : \mathsf{M} \twoheadrightarrow \mathsf{N}$ is Lipschitz if there exists $L \in \mathbb{R}_{>0}$ such that $h_{\mathsf{N}}(F(x), F(y)) \le L\, d_{\mathsf{M}}(x, y)$ for every $x, y \in \mathsf{M}$. (ii) A set-valued map $F : \mathsf{M} \twoheadrightarrow \mathsf{N}$ is locally Lipschitz if $F|K$ is Lipschitz for every compact subset $K \subseteq \mathsf{M}$.
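To get a feel for the Lipschitz condition, here is a small numerical sketch in Python. It is only an illustration under stated assumptions: the map $F(x) = [x - 1, x + 1]$ is a hypothetical example chosen here, the intervals are replaced by finite samples, and the helper names `one_sided`, `hausdorff`, and `lipschitz_ratio` are ours. The Hausdorff distance of two finite sets is computed as the larger of the two one-sided distances.

```python
def one_sided(A, B):
    """sup over a in A of the distance from a to the finite set B."""
    return max(min(abs(a - b) for b in B) for a in A)


def hausdorff(A, B):
    """Hausdorff distance between two finite nonempty subsets of R."""
    return max(one_sided(A, B), one_sided(B, A))


def F(x, n=201):
    """Finite sample of [x - 1, x + 1]; a hypothetical example map."""
    return [x - 1 + 2 * i / (n - 1) for i in range(n)]


def lipschitz_ratio(x, y):
    """h(F(x), F(y)) / |x - y| for the sampled map, x != y."""
    return hausdorff(F(x), F(y)) / abs(x - y)


if __name__ == "__main__":
    # For translates of an interval one expects a ratio close to 1.
    print(lipschitz_ratio(0.0, 0.25))
```

For this particular map the ratio stays at about 1 for the sampled sets, which is consistent with $F$ being Lipschitz with constant $L = 1$; the sampling of course only approximates the intervals.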

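Let us verify that the notion of local Lipschitzness is independent of the choice of Riemannian metric in the case where the metric space is a Riemannian manifold.

4.2.23 Proposition (Locally Lipschitz maps between Riemannian manifolds) Let $\mathsf{M}$ and $\mathsf{N}$ be manifolds, let $G_{\mathsf{M}}$ and $\tilde{G}_{\mathsf{M}}$ be smooth Riemannian metrics on $\mathsf{M}$ with associated metrics $d_{\mathsf{M}}$ and $\tilde{d}_{\mathsf{M}}$, respectively, and let $G_{\mathsf{N}}$ and $\tilde{G}_{\mathsf{N}}$ be smooth Riemannian metrics on $\mathsf{N}$ with associated metrics $d_{\mathsf{N}}$ and $\tilde{d}_{\mathsf{N}}$, respectively. For a set-valued map $F : \mathsf{M} \twoheadrightarrow \mathsf{N}$, the following statements hold:

(i) the following two statements are equivalent:
(a) $F$ is a locally Lipschitz map from $(\mathsf{M}, d_{\mathsf{M}})$ to $(\mathsf{N}, d_{\mathsf{N}})$;
(b) $F$ is a locally Lipschitz map from $(\mathsf{M}, \tilde{d}_{\mathsf{M}})$ to $(\mathsf{N}, d_{\mathsf{N}})$;

(ii) if $F$ is compact-valued and if $(\mathsf{N}, d_{\mathsf{N}})$ and $(\mathsf{N}, \tilde{d}_{\mathsf{N}})$ are complete metric spaces, then the following four statements are equivalent:
(a) $F$ is a locally Lipschitz map from $(\mathsf{M}, d_{\mathsf{M}})$ to $(\mathsf{N}, d_{\mathsf{N}})$;
(b) $F$ is a locally Lipschitz map from $(\mathsf{M}, \tilde{d}_{\mathsf{M}})$ to $(\mathsf{N}, d_{\mathsf{N}})$;
(c) $F$ is a locally Lipschitz map from $(\mathsf{M}, d_{\mathsf{M}})$ to $(\mathsf{N}, \tilde{d}_{\mathsf{N}})$;
(d) $F$ is a locally Lipschitz map from $(\mathsf{M}, \tilde{d}_{\mathsf{M}})$ to $(\mathsf{N}, \tilde{d}_{\mathsf{N}})$.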

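Proof Throughout the proof, we will adopt obvious conventions of using subscripts $\mathsf{M}$ and $\mathsf{N}$ and using a tilde or not to connote when dealing with one of the four metric spaces $(\mathsf{M}, d_{\mathsf{M}})$, $(\mathsf{M}, \tilde{d}_{\mathsf{M}})$, $(\mathsf{N}, d_{\mathsf{N}})$, and $(\mathsf{N}, \tilde{d}_{\mathsf{N}})$. This should not be ambiguous.

(i) Suppose that $F$ is a locally Lipschitz map from $(\mathsf{M}, d_{\mathsf{M}})$ to $(\mathsf{N}, d_{\mathsf{N}})$. Let $K \subseteq \mathsf{M}$ be compact and let $L \in \mathbb{R}_{>0}$ be such that $h_{\mathsf{N}}(F(x), F(y)) \le L\, d_{\mathsf{M}}(x, y)$ for every $x, y \in K$. By Theorem 4.1.8 let $c \in \mathbb{R}_{>0}$ be such that $d_{\mathsf{M}}(x, y) \le c\, \tilde{d}_{\mathsf{M}}(x, y)$ for every $x, y \in K$. Then
$$h_{\mathsf{N}}(F(x), F(y)) \le L\, d_{\mathsf{M}}(x, y) \le cL\, \tilde{d}_{\mathsf{M}}(x, y),$$
showing that $F$ is a locally Lipschitz map from $(\mathsf{M}, \tilde{d}_{\mathsf{M}})$ to $(\mathsf{N}, d_{\mathsf{N}})$. Of course, one similarly shows that a locally Lipschitz map from $(\mathsf{M}, \tilde{d}_{\mathsf{M}})$ to $(\mathsf{N}, d_{\mathsf{N}})$ is also a locally Lipschitz map from $(\mathsf{M}, d_{\mathsf{M}})$ to $(\mathsf{N}, d_{\mathsf{N}})$.

(ii) Suppose that $F$ is a locally Lipschitz map from $(\mathsf{M}, d_{\mathsf{M}})$ to $(\mathsf{N}, d_{\mathsf{N}})$. Let $K \subseteq \mathsf{M}$ be compact and let $L \in \mathbb{R}_{>0}$ be such that $h_{\mathsf{N}}(F(x), F(y)) \le L\, d_{\mathsf{M}}(x, y)$ for every $x, y \in K$. Let
$$R_K = \sup\{d_{\mathsf{M}}(x, y) \mid x, y \in K\}$$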

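and note by Theorem 4.1.6 that $R_K < \infty$. Thus
$$\sup\{h_{\mathsf{N}}(F(x), F(y)) \mid x, y \in K\} \le L R_K,$$
showing that $F(K)$ is a $h_{\mathsf{N}}$-bounded subset of $\mathscr{K}(\mathsf{N})$. Let $R \in \mathbb{R}_{>0}$ and $A \in \mathscr{K}(\mathsf{N})$ be such that $F(K) \subseteq \mathscr{B}_{\mathsf{N}}(R, A)$, where $\mathscr{B}_{\mathsf{N}}(R, A)$ denotes the ball of radius $R$ and centre $A$ in $\mathscr{K}(\mathsf{N})$ with respect to the metric $h_{\mathsf{N}}$. If $B_{\mathsf{N}}(R, A)$ denotes the $R$-neighbourhood of $A$ in $\mathsf{N}$ with respect to the metric $d_{\mathsf{N}}$, then, as in the proof of Proposition 4.2.11, we can show that $B_{\mathsf{N}}(R, A)$ is bounded since $A$ is compact. Therefore, by Theorem 4.1.6, $C \triangleq \operatorname{cl}(B_{\mathsf{N}}(R, A))$ is compact. Let $c \in \mathbb{R}_{>0}$ be such that $\tilde{d}_{\mathsf{N}}(u, v) < c\, d_{\mathsf{N}}(u, v)$ for all $u, v \in C$. Let $x, y \in K$. First suppose that $\tilde{h}_{\mathsf{N}}(F(x), F(y)) = \tilde{h}^-_{\mathsf{N}}(F(x), F(y))$. Then
$$\tilde{h}_{\mathsf{N}}(F(x), F(y)) = \sup\{\widetilde{\operatorname{dist}}_{\mathsf{N}}(u, F(y)) \mid u \in F(x)\} < c \sup\{\operatorname{dist}_{\mathsf{N}}(u, F(y)) \mid u \in F(x)\} = c\, h^-_{\mathsf{N}}(F(x), F(y)) \le c\, h_{\mathsf{N}}(F(x), F(y)) \le cL\, d_{\mathsf{M}}(x, y),$$
using Lemma 4.2.6(i). If $\tilde{h}_{\mathsf{N}}(F(x), F(y)) = \tilde{h}^+_{\mathsf{N}}(F(x), F(y))$ we obtain the same estimate using Lemma 4.2.6(ii). Thus $\tilde{h}_{\mathsf{N}}(F(x), F(y)) \le cL\, d_{\mathsf{M}}(x, y)$ for all $x, y \in K$, showing that $F$ is a locally Lipschitz map from $(\mathsf{M}, d_{\mathsf{M}})$ to $(\mathsf{N}, \tilde{d}_{\mathsf{N}})$.

The remainder of this part of the proof follows in an obvious way by combining the computations from part (i) with the computations from the preceding paragraph.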

4.2.5 Measurable set-valued maps

We will wish to consider set-valued maps where the independent variable is to be thought of as time, i.e., as taking values in an interval in $\mathbb{R}$. Of course, the constructions from the preceding section apply equally to this case. However, because of the particular structure of $\mathbb{R}$, one can consider the generalisation of measurable functions to set-valued functions.

4.2.24 Definition (Measurable set-valued maps) Let $(\mathsf{M}, d)$ be a metric space and let $T \subseteq \mathbb{R}$ be an interval. A set-valued map $F : T \twoheadrightarrow \mathsf{M}$ is measurable if, for every open set $U \subseteq \mathsf{M}$, the set $\{t \in T \mid F(t) \cap U \ne \emptyset\}$ is Lebesgue measurable.


4.3 Convex sets, affine subspaces, and cones


In the theory of differential inclusions in Section 4.4, a prominent role is played by convex differential inclusions. Convexity also plays an important part in controllability theory (Chapter 9) and optimal control theory (Chapter 10). In this section we introduce the notions related to convexity that we shall use. A standard text with additional information along these lines is [Rockafellar 1970].

4.3.1 Definitions

We begin by defining subsets of a $\mathbb{R}$-vector space that have the properties we shall study.

4.3.1 Definition (Convex set, cone, convex cone, affine subspace) Let $V$ be a $\mathbb{R}$-vector space. (i) A subset $C \subseteq V$ is convex if, for each $x_1, x_2 \in C$, we have $\{s x_1 + (1 - s) x_2 \mid s \in [0, 1]\} \subseteq C$. (ii) A subset $K \subseteq V$ is a cone if, for each $x \in K$, we have $\{\lambda x \mid \lambda \in \mathbb{R}_{\ge 0}\} \subseteq K$. (iii) A subset $K \subseteq V$ is a convex cone if it is both convex and a cone. (iv) A subset $A \subseteq V$ is an affine subspace if, for each $x_1, x_2 \in A$, we have $\{s x_1 + (1 - s) x_2 \mid s \in \mathbb{R}\} \subseteq A$.

Note that the set $\{s x_1 + (1 - s) x_2 \mid s \in [0, 1]\}$ is the line segment in $V$ between $x_1$ and $x_2$. Thus a set is convex when the line segment connecting any two points in the set remains in the set. In a similar manner, $\{\lambda x \mid \lambda \in \mathbb{R}_{\ge 0}\}$ is the ray emanating from $0 \in V$ through the point $x$. A set is thus a cone when the rays emanating from $0$ through all points remain in the set. One usually considers cones whose rays emanate from a general point in $V$, but we will not employ this degree of generality. An affine subspace is a set where the (bi-infinite) line through any two points in the set remains in the set. We illustrate some of the intuition concerning these various sorts of sets in Figure 4.2.

4.3.2 Combinations and hulls

We shall be interested in generating convex sets, cones, and affine subspaces containing given sets.

4.3.2 Definition (Convex hull, coned hull, coned convex hull, affine hull) Let $V$ be a $\mathbb{R}$-vector space and let $S \subseteq V$ be nonempty.


Figure 4.2 An illustration of a convex set (top left), a cone (top right), a convex cone (bottom left), and an affine subspace (bottom right)

(i) A convex combination from $S$ is a linear combination in $V$ of the form
$$\sum_{j=1}^k \lambda_j v_j, \qquad k \in \mathbb{Z}_{>0},\ \lambda_1, \dots, \lambda_k \in \mathbb{R}_{\ge 0},\ \sum_{j=1}^k \lambda_j = 1,\ v_1, \dots, v_k \in S.$$

(ii) The convex hull of $S$, denoted by $\operatorname{conv}(S)$, is the smallest convex subset of $V$ containing $S$.

(iii) The coned hull of $S$, denoted by $\operatorname{cone}(S)$, is the smallest cone in $V$ containing $S$.

(iv) A coned convex combination from $S$ is a linear combination in $V$ of the form
$$\sum_{j=1}^k \lambda_j v_j, \qquad k \in \mathbb{Z}_{>0},\ \lambda_1, \dots, \lambda_k \in \mathbb{R}_{\ge 0},\ v_1, \dots, v_k \in S.$$

(v) The coned convex hull of $S$, denoted by $\operatorname{conv\,cone}(S)$, is the smallest convex cone in $V$ containing $S$.

(vi) An affine combination from $S$ is a linear combination in $V$ of the form
$$\sum_{j=1}^k \lambda_j v_j, \qquad k \in \mathbb{Z}_{>0},\ \lambda_1, \dots, \lambda_k \in \mathbb{R},\ \sum_{j=1}^k \lambda_j = 1,\ v_1, \dots, v_k \in S.$$

(vii) The affine hull of $S$, denoted by $\operatorname{aff}(S)$, is the smallest affine subspace of $V$ containing $S$.

4.3.3 Remark (Sensibility of hull definitions) The definitions of $\operatorname{conv}(S)$, $\operatorname{cone}(S)$, $\operatorname{conv\,cone}(S)$, and $\operatorname{aff}(S)$ make sense because intersections of convex sets are convex, intersections of cones are cones, and intersections of affine subspaces are affine subspaces.

The terms coned hull and coned convex hull are not standard. In the literature these will often be called the cone generated by $S$ and the convex cone generated by $S$, respectively.

Convex combinations have the following useful property which also describes the convex hull.

4.3.4 Proposition (The convex hull is the set of convex combinations) Let $V$ be a $\mathbb{R}$-vector space, let $S \subseteq V$ be nonempty, and denote by $C(S)$ the set of convex combinations from $S$. Then $C(S) = \operatorname{conv}(S)$.
Proof First we show that $C(S)$ is convex. Consider two elements of $C(S)$ given by
$$x = \sum_{j=1}^k \lambda_j u_j, \qquad y = \sum_{l=1}^m \mu_l v_l.$$
Then, for $s \in [0, 1]$ we have
$$s x + (1 - s) y = \sum_{j=1}^k s \lambda_j u_j + \sum_{l=1}^m (1 - s) \mu_l v_l.$$
For $r \in \{1, \dots, k + m\}$ define
$$w_r = \begin{cases} u_r, & r \in \{1, \dots, k\}, \\ v_{r-k}, & r \in \{k + 1, \dots, k + m\} \end{cases}$$
and
$$\rho_r = \begin{cases} s \lambda_r, & r \in \{1, \dots, k\}, \\ (1 - s) \mu_{r-k}, & r \in \{k + 1, \dots, k + m\}. \end{cases}$$
Clearly $w_r \in S$ and $\rho_r \ge 0$ for $r \in \{1, \dots, k + m\}$. Also,
$$\sum_{r=1}^{k+m} \rho_r = \sum_{j=1}^k s \lambda_j + \sum_{l=1}^m (1 - s) \mu_l = s + (1 - s) = 1.$$
Thus $s x + (1 - s) y \in C(S)$, and so $C(S)$ is convex. This necessarily implies that $\operatorname{conv}(S) \subseteq C(S)$ since $\operatorname{conv}(S)$ is the smallest convex set containing $S$.

To show that $C(S) \subseteq \operatorname{conv}(S)$ we will show by induction on the number of elements in the linear combination that all convex combinations are contained in the convex hull. This is obvious for the convex combination of one vector. So suppose that every convex combination of the form
$$\sum_{j=1}^k \lambda_j u_j, \qquad k \in \{1, \dots, m\},$$
is in $\operatorname{conv}(S)$, and consider a convex combination from $S$ of the form
$$y = \sum_{l=1}^{m+1} \mu_l v_l = \sum_{l=1}^m \mu_l v_l + \mu_{m+1} v_{m+1}.$$
If $\sum_{l=1}^m \mu_l = 0$ then $\mu_l = 0$ for each $l \in \{1, \dots, m\}$. Thus $y \in \operatorname{conv}(S)$ by the induction hypothesis. So we may suppose that $\sum_{l=1}^m \mu_l \ne 0$ which means that $\mu_{m+1} \ne 1$. Let us define $\mu'_l = \mu_l (1 - \mu_{m+1})^{-1}$ for $l \in \{1, \dots, m\}$. Since
$$1 - \mu_{m+1} = \sum_{l=1}^m \mu_l$$
it follows that
$$\sum_{l=1}^m \mu'_l = 1.$$
Therefore,
$$\sum_{l=1}^m \mu'_l v_l \in \operatorname{conv}(S)$$
by the induction hypothesis. But we also have
$$y = (1 - \mu_{m+1}) \sum_{l=1}^m \mu'_l v_l + \mu_{m+1} v_{m+1}$$
by direct computation. Therefore, $y$ is a convex combination of two elements of $\operatorname{conv}(S)$. Since $\operatorname{conv}(S)$ is convex, this means that $y \in \operatorname{conv}(S)$, giving the result.
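The induction step in the preceding proof can be turned into a small computation. The following Python sketch is an illustration only, with function names (`convex_combination`, `by_segments`) chosen by us: it evaluates a convex combination directly, and then again using nothing but two-point convex combinations, peeling off the last vector and rescaling the remaining coefficients by $(1 - \mu_{m+1})^{-1}$ as in the proof. Exact rational arithmetic is assumed so that the two results can be compared for equality.

```python
from fractions import Fraction


def convex_combination(weights, points):
    """Evaluate sum_j weights[j] * points[j], checking the weights first."""
    if not weights or len(weights) != len(points):
        raise ValueError("need equally many weights and points")
    if any(w < 0 for w in weights) or sum(weights) != 1:
        raise ValueError("weights must be nonnegative and sum to 1")
    dim = len(points[0])
    return tuple(sum(w * p[i] for w, p in zip(weights, points))
                 for i in range(dim))


def by_segments(weights, points):
    """Same point, built only from two-point convex combinations."""
    if len(weights) == 1:
        return tuple(points[0])
    last = weights[-1]
    if last == 1:
        # All other coefficients vanish, so the point is the last vector.
        return tuple(points[-1])
    # Rescale the first m coefficients so that they sum to 1 again.
    rescaled = [w / (1 - last) for w in weights[:-1]]
    inner = by_segments(rescaled, points[:-1])
    return tuple((1 - last) * a + last * b
                 for a, b in zip(inner, points[-1]))


if __name__ == "__main__":
    mu = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
    vs = [(0, 0), (3, 0), (0, 6)]
    print(convex_combination(mu, vs), by_segments(mu, vs))
```

Both evaluations agree, which is exactly the "direct computation" invoked at the end of the proof.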

For cones one has a similar result.

4.3.5 Proposition (The set of positive multiples is the coned hull) Let $V$ be a $\mathbb{R}$-vector space, let $S \subseteq V$ be nonempty, and denote $K(S) = \{\lambda x \mid x \in S,\ \lambda \in \mathbb{R}_{\ge 0}\}$. Then $K(S) = \operatorname{cone}(S)$.


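Proof Note that $K(S)$ is clearly a cone which contains $S$. Thus $\operatorname{cone}(S) \subseteq K(S)$. Now suppose that $y \in K(S)$. Thus $y = \lambda x$ for $x \in S$ and $\lambda \in \mathbb{R}_{\ge 0}$. Since $\operatorname{cone}(S)$ is a cone containing $x$, we must have $y \in \operatorname{cone}(S)$, giving $K(S) \subseteq \operatorname{cone}(S)$.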

One also has an interpretation along these lines for convex cones.

4.3.6 Proposition (The coned convex hull is the set of coned convex combinations) Let $V$ be a $\mathbb{R}$-vector space, let $S \subseteq V$ be nonempty, and denote by $K'(S)$ the set of coned convex combinations from $S$. Then $K'(S) = \operatorname{conv\,cone}(S)$.

Proof We first show that if $x, y \in K'(S)$ then $x + y \in K'(S)$ and that if $x \in K'(S)$ then $\lambda x \in K'(S)$ for $\lambda \in \mathbb{R}_{\ge 0}$. The second of these assertions is obvious. For the first, let $z = \tfrac{1}{2} x + \tfrac{1}{2} y$. Then $z \in K'(S)$ and so $2z = x + y \in K'(S)$. Thus $K'(S)$ is closed under addition and positive scalar multiplication. From this it immediately follows that $(1 - s) x + s y \in K'(S)$ for any $x, y \in K'(S)$ and $s \in [0, 1]$. Thus $K'(S)$ is convex. It is evident that $K'(S)$ is also a cone, and so we must have $\operatorname{conv\,cone}(S) \subseteq K'(S)$. Now let
$$y = \sum_{j=1}^k \lambda_j v_j \in K'(S).$$
By the fact that $\operatorname{conv\,cone}(S)$ is a cone containing $S$ we must have $k \lambda_j v_j \in \operatorname{conv\,cone}(S)$ for $j \in \{1, \dots, k\}$. Since $\operatorname{conv\,cone}(S)$ is convex and contains $k \lambda_j v_j$ for $j \in \{1, \dots, k\}$ we must have
$$\sum_{j=1}^k \frac{1}{k} (k \lambda_j v_j) = y \in \operatorname{conv\,cone}(S),$$
giving the result.

Finally, we prove the expected result for affine subspaces, namely that the affine hull is the set of affine combinations. In order to do this we first give a useful characterisation of affine subspaces.

4.3.7 Proposition (Characterisation of an affine subspace) A nonempty subset $A$ of a $\mathbb{R}$-vector space $V$ is an affine subspace if and only if there exists $x_0 \in V$ and a subspace $U \subseteq V$ such that $A = \{x_0 + u \mid u \in U\}$.
Proof Let $x_0 \in A$ and define $U = \{x - x_0 \mid x \in A\}$. The result will be proved if we prove that $U$ is a subspace. Let $x - x_0 \in U$ for some $x \in A$ and $a \in \mathbb{R}$. Then
$$a(x - x_0) = a x + (1 - a) x_0 - x_0,$$
and so $a(x - x_0) \in U$ since $a x + (1 - a) x_0 \in A$. For $x_1 - x_0, x_2 - x_0 \in U$ with $x_1, x_2 \in A$ we have
$$(x_1 - x_0) + (x_2 - x_0) = (x_1 + x_2 - x_0) - x_0.$$
Thus we will have $(x_1 - x_0) + (x_2 - x_0) \in U$ if we can show that $x_1 + x_2 - x_0 \in A$. However, we have
$$\begin{aligned} x_1 - x_0, x_2 - x_0 \in U \quad &\Longrightarrow \quad 2(x_1 - x_0), 2(x_2 - x_0) \in U \\ &\Longrightarrow \quad 2(x_1 - x_0) + x_0, 2(x_2 - x_0) + x_0 \in A \\ &\Longrightarrow \quad \tfrac{1}{2}(2(x_1 - x_0) + x_0) + \tfrac{1}{2}(2(x_2 - x_0) + x_0) \in A, \end{aligned}$$
which gives the result after we notice that
$$\tfrac{1}{2}(2(x_1 - x_0) + x_0) + \tfrac{1}{2}(2(x_2 - x_0) + x_0) = x_1 + x_2 - x_0.$$

We now make the following definition, corresponding to the preceding result.

4.3.8 Definition (Linear part of an affine subspace) Let $A$ be an affine subspace of a $\mathbb{R}$-vector space $V$, and let $x_0 \in V$ and a subspace $U \subseteq V$ satisfy $A = x_0 + U$ as in the preceding proposition. The subspace $U$ is called the linear part of $A$ and denoted by $L(A)$.

Now we can characterise the affine hull as the set of affine combinations.

4.3.9 Proposition (The affine hull is the set of affine combinations) Let $V$ be a $\mathbb{R}$-vector space, let $S \subseteq V$ be nonempty, and denote by $A(S)$ the set of affine combinations from $S$. Then $A(S) = \operatorname{aff}(S)$.
Proof We first show that the set of affine combinations is an affine subspace. Choose $x_0 \in S$ and define
$$U(S) = \{v - x_0 \mid v \in A(S)\}.$$
We first claim that $U(S)$ is the set of linear combinations of the form
$$\sum_{j=1}^k \lambda_j v_j, \qquad k \in \mathbb{Z}_{>0},\ \lambda_1, \dots, \lambda_k \in \mathbb{R},\ \sum_{j=1}^k \lambda_j = 0,\ v_1, \dots, v_k \in S. \tag{4.4}$$
To see this, note that if
$$u = \sum_{j=1}^k \lambda_j u_j - x_0 \in U(S)$$
then we can write
$$u = \sum_{j=1}^{k+1} \lambda_j u_j, \qquad \lambda_1, \dots, \lambda_{k+1} \in \mathbb{R},\ \sum_{j=1}^{k+1} \lambda_j = 0,\ u_1, \dots, u_{k+1} \in S,$$
by taking $\lambda_{k+1} = -1$ and $u_{k+1} = x_0$. Similarly, consider a linear combination of the form (4.4). We can without loss of generality suppose that $x_0 \in \{v_1, \dots, v_k\}$, since if this is not true we can simply add $0 x_0$ to the sum. Thus we suppose, without loss of generality, that $v_k = x_0$. We then have
$$u = \Bigl( \sum_{j=1}^{k-1} \lambda_j v_j + (\lambda_k + 1) x_0 \Bigr) - x_0.$$
Since the term in the parenthesis is clearly an element of $A(S)$ it follows that $u \in U(S)$. With this characterisation of $U(S)$ it is then easy to show that $U(S)$ is a subspace of $V$. Moreover, it is immediate from Proposition 4.3.7 that $A(S)$ is then an affine subspace. Since $\operatorname{aff}(S)$ is the smallest affine subspace containing $S$ it follows that $\operatorname{aff}(S) \subseteq A(S)$.

To show that $A(S) \subseteq \operatorname{aff}(S)$ we use induction on the number of elements in an affine combination in $A(S)$. For an affine combination with one term this is obvious. So suppose that every affine combination of the form
$$\sum_{j=1}^k \lambda_j v_j, \qquad k \in \{1, \dots, m\},$$
is in $\operatorname{aff}(S)$ and consider an affine combination of the form
$$x = \sum_{j=1}^{m+1} \lambda_j v_j = \sum_{j=1}^m \lambda_j v_j + \lambda_{m+1} v_{m+1}.$$
It must be the case that at least one of the numbers $\lambda_1, \dots, \lambda_{m+1}$ is not equal to $1$. So, without loss of generality suppose that $\lambda_{m+1} \ne 1$ and then define $\lambda'_j = (1 - \lambda_{m+1})^{-1} \lambda_j$, $j \in \{1, \dots, m\}$. We then have
$$\sum_{j=1}^m \lambda'_j = 1,$$
so that
$$\sum_{j=1}^m \lambda'_j v_j \in \operatorname{aff}(S)$$
by the induction hypothesis. It then holds that
$$x = (1 - \lambda_{m+1}) \sum_{j=1}^m \lambda'_j v_j + \lambda_{m+1} v_{m+1}.$$
This is then in $\operatorname{aff}(S)$.
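For a finite set $S \subseteq \mathbb{R}^n$ the subspace $U(S)$ from the preceding proof is spanned by the differences $v - x_0$, $v \in S$, since a combination with coefficients summing to zero can be rewritten as $\sum_j \lambda_j (v_j - x_0)$. The following Python sketch uses this to compute the dimension of the linear part of $\operatorname{aff}(S)$. It is a sketch under assumptions: the points are given with integer or rational coordinates, and the helper names `rank` and `affine_hull_dimension` are ours.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of vectors, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def affine_hull_dimension(points):
    """Dimension of the linear part of aff(points): rank of {v - x0}."""
    x0 = points[0]
    diffs = [[a - b for a, b in zip(p, x0)] for p in points[1:]]
    return rank(diffs)


if __name__ == "__main__":
    print(affine_hull_dimension([(0, 0, 0), (1, 1, 1), (2, 2, 2)]))
    print(affine_hull_dimension([(0, 0), (1, 0), (0, 1)]))
```

The choice of base point $x_0$ does not affect the answer, in keeping with the fact that the linear part depends only on the affine subspace.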

4.3.3 Topology of convex sets and cones

Let us now say a few words about the topology of convex sets. In this section we restrict our attention to finite-dimensional $\mathbb{R}$-vector spaces with their standard topology. Note that every convex set is a subset of its affine hull. Moreover, as a subset of its affine hull, a convex set has an interior.

4.3.10 Definition (Relative interior and relative boundary) If $V$ is a finite-dimensional $\mathbb{R}$-vector space and if $C \subseteq V$ is a convex set, the set
$$\operatorname{rel\,int}(C) = \{x \in C \mid x \in \operatorname{int}_{\operatorname{aff}(C)}(C)\}$$
is the relative interior of $C$ and the set $\operatorname{rel\,bd}(C) = \operatorname{cl}(C) \setminus \operatorname{rel\,int}(C)$ is the relative boundary of $C$.


The point is that, while a convex set may have an empty interior, its interior can still be defined in a weaker, but still useful, sense. The notion of relative interior leads to the following useful concept.

4.3.11 Definition (Dimension of a convex set) Let $V$ be a finite-dimensional $\mathbb{R}$-vector space, let $C \subseteq V$ be convex, and let $U \subseteq V$ be the subspace for which $\operatorname{aff}(C) = \{x_0 + u \mid u \in U\}$ for some $x_0 \in V$. The dimension of $C$, denoted by $\dim(C)$, is the dimension of the subspace $U$.

The following result will be used in our development.

4.3.12 Proposition (Closures and relative interiors of convex sets and cones are convex sets and cones) Let $V$ be a finite-dimensional $\mathbb{R}$-vector space, let $C \subseteq V$ be convex, and let $K \subseteq V$ be a convex cone. Then (i) $\operatorname{cl}(C)$ is convex and $\operatorname{cl}(K)$ is a convex cone and (ii) $\operatorname{rel\,int}(C)$ is convex and $\operatorname{rel\,int}(K)$ is a convex cone. Moreover, $\operatorname{aff}(C) = \operatorname{aff}(\operatorname{cl}(C))$ and $\operatorname{aff}(K) = \operatorname{aff}(\operatorname{cl}(K))$.
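Proof For the purposes of the proof we put a norm $\|\cdot\|$ on $V$; the result and the proof are independent of the choice of this norm.

(i) Let $x, y \in \operatorname{cl}(C)$ and let $s \in [0, 1]$. Suppose that $(x_j)_{j \in \mathbb{Z}_{>0}}$ and $(y_j)_{j \in \mathbb{Z}_{>0}}$ are sequences in $C$ converging to $x$ and $y$, respectively. Note that $s x_j + (1 - s) y_j \in C$ for each $j \in \mathbb{Z}_{>0}$. Moreover, if $\epsilon > 0$ then
$$\|s x + (1 - s) y - s x_j - (1 - s) y_j\| \le s \|x - x_j\| + (1 - s) \|y - y_j\| < \epsilon,$$
provided that $j$ is sufficiently large that $s \|x - x_j\| < \tfrac{\epsilon}{2}$ and $(1 - s) \|y - y_j\| < \tfrac{\epsilon}{2}$. Thus the sequence $(s x_j + (1 - s) y_j)_{j \in \mathbb{Z}_{>0}}$ converges to $s x + (1 - s) y$ and so $s x + (1 - s) y \in \operatorname{cl}(C)$. This shows that $\operatorname{cl}(C)$ is convex. Since $C \subseteq \operatorname{cl}(C)$ it follows that $\operatorname{aff}(C) \subseteq \operatorname{aff}(\operatorname{cl}(C))$. Moreover, since $C \subseteq \operatorname{aff}(C)$ and since $\operatorname{aff}(C)$ is closed we have
$$\operatorname{cl}(C) \subseteq \operatorname{cl}(\operatorname{aff}(C)) = \operatorname{aff}(C),$$
so giving $\operatorname{aff}(C) = \operatorname{aff}(\operatorname{cl}(C))$ as desired. An entirely similar argument shows that $\operatorname{cl}(K)$ is convex and that $\operatorname{aff}(K) = \operatorname{aff}(\operatorname{cl}(K))$.

(ii) Let us first consider the convex set $C$. To simplify matters, since the relative interior is the interior relative to the affine subspace containing $C$, and since the topology of an affine subspace is the same as that for a vector space, we shall assume that $\operatorname{aff}(C) = V$ and show that $\operatorname{int}(C)$ is convex. We first prove a lemma.

Lemma 1 If $V$ is a finite-dimensional $\mathbb{R}$-vector space, if $C$ is a convex set, if $x \in \operatorname{rel\,int}(C)$, and if $y \in \operatorname{cl}(C)$ then $[x, y) \triangleq \{s x + (1 - s) y \mid s \in (0, 1]\}$ is contained in $\operatorname{rel\,int}(C)$.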


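Proof As above, let us assume, without loss of generality, that $\operatorname{aff}(C) = V$. Let us also equip $V$ with a norm. Since $x \in \operatorname{int}(C)$ there exists $r > 0$ such that $B(x, r) \subseteq C$. Since $y \in \operatorname{cl}(C)$, for every $\epsilon > 0$ there exists $y_\epsilon \in C \cap B(y, \epsilon)$. Let $z = \alpha x + (1 - \alpha) y \in [x, y)$ for $\alpha \in (0, 1]$, and define $\rho = \alpha r - (1 - \alpha) \epsilon$. If $\epsilon$ is sufficiently small we can ensure that $\rho \in \mathbb{R}_{>0}$, and we assume that $\epsilon$ is so chosen. For $z' \in B(z, \rho)$ we have
$$\begin{aligned} \|z' - z\| < \rho \quad &\Longrightarrow \quad \|z' - (\alpha x + (1 - \alpha) y_\epsilon + (1 - \alpha)(y - y_\epsilon))\| < \rho \\ &\Longrightarrow \quad \|z' - (\alpha x + (1 - \alpha) y_\epsilon)\| < \rho + (1 - \alpha) \epsilon = \alpha r \\ &\Longrightarrow \quad z' \in \{\alpha x' + (1 - \alpha) y_\epsilon \mid x' \in B(x, r)\}. \end{aligned}$$
Since $y_\epsilon \in C$ and $B(x, r) \subseteq C$ it follows that $z' \in C$ and so $B(z, \rho) \subseteq C$. This gives our claim that $[x, y) \subseteq \operatorname{int}(C)$.

That $\operatorname{int}(C)$ is convex follows immediately since, if $x, y \in \operatorname{int}(C)$, Lemma 1 ensures that the line segment connecting $x$ and $y$ is contained in $\operatorname{int}(C)$.

Now consider the convex cone $K$. We know now that $\operatorname{rel\,int}(K)$ is convex so we need only show that it is a cone. This, however, is obvious. Indeed, if $x \in \operatorname{rel\,int}(K)$ suppose that $\lambda x \notin \operatorname{rel\,int}(K)$ for some $\lambda \in \mathbb{R}_{>0}$. Since $\lambda x \in K$ we must then have $\lambda x \in \operatorname{bd}(K)$. By Lemma 1 this means that $(\lambda + \epsilon) x \notin K$ for all $\epsilon \in \mathbb{R}_{>0}$. This contradicts the fact that $K$ is a cone.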

The following result will also come up in our constructions.

4.3.13 Proposition (The closure of the relative interior) If $V$ is a finite-dimensional $\mathbb{R}$-vector space and if $C \subseteq V$ is a convex set then $\operatorname{cl}(\operatorname{rel\,int}(C)) = \operatorname{cl}(C)$.

Proof It is clear that $\operatorname{cl}(\operatorname{rel\,int}(C)) \subseteq \operatorname{cl}(C)$. Let $x \in \operatorname{cl}(C)$ and let $y \in \operatorname{rel\,int}(C)$. By Lemma 1 in the proof of Proposition 4.3.12 it follows that the half-open line segment $[y, x)$ is contained in $\operatorname{rel\,int}(C)$. Therefore, there exists a sequence $(x_j)_{j \in \mathbb{Z}_{>0}}$ in this line segment, and so in $\operatorname{rel\,int}(C)$, converging to $x$. Thus $x \in \operatorname{cl}(\operatorname{rel\,int}(C))$.

4.3.4 Separation theorems for convex sets

One of the most important properties of convex sets in convex analysis, and indeed for us in our proof of the Maximum Principle, is the notion of certain types of convex sets being separated by hyperplanes. We shall only examine those parts of the theory that we will use; we refer to [Rockafellar 1970] for further discussion. In this section we again consider subsets of a finite-dimensional $\mathbb{R}$-vector space $V$. In order to make things clear, let us define all of our terminology precisely.

4.3.14 Definition (Hyperplane, half-space, support hyperplane) Let $V$ be a finite-dimensional $\mathbb{R}$-vector space.

(i) A hyperplane in $V$ is a subset of the form $\{x \in V \mid \langle \lambda; x \rangle = a\}$ for some $\lambda \in V^* \setminus \{0\}$ and $a \in \mathbb{R}$. Such a hyperplane is denoted by $P_{\lambda,a}$.

(ii) A half-space in $V$ is a subset of the form $\{x \in V \mid \langle \lambda; x \rangle > a\}$ for some $\lambda \in V^* \setminus \{0\}$ and $a \in \mathbb{R}$. We shall denote
$$H^-_{\lambda,a} = \{x \in V \mid \langle \lambda; x \rangle < a\}, \qquad H^+_{\lambda,a} = \{x \in V \mid \langle \lambda; x \rangle > a\}.$$

(iii) If $A \subseteq V$, a support hyperplane for $A$ is a hyperplane $P_{\lambda,a}$ such that $A \subseteq H^+_{\lambda,a} \cup P_{\lambda,a}$.

(iv) For subsets $A, B \subseteq V$, a separating hyperplane is a hyperplane $P_{\lambda,a}$ for which
$$A \subseteq H^+_{\lambda,a} \cup P_{\lambda,a}, \qquad B \subseteq H^-_{\lambda,a} \cup P_{\lambda,a}.$$

The following result is a basis for many separation theorems for convex sets.

4.3.15 Theorem (Convex sets possess supporting hyperplanes) If $V$ is a finite-dimensional $\mathbb{R}$-vector space and if $C \subseteq V$ is a convex set not equal to $V$, then $C$ possesses a supporting hyperplane.
Proof For convenience in the proof we suppose that $V$ is equipped with a norm $\|\cdot\|$ arising from an inner product $\langle \cdot, \cdot \rangle$; the statement of the result and the character of the proof are independent of this choice. We note that the inner product identifies $V^*$ naturally with $V$, and we make this identification without mention in the proof.

Let $x_0 \notin \operatorname{cl}(C)$, let $z \in C$, and define $r = \|x_0 - z\|$. Define $A = \operatorname{cl}(C) \cap \overline{B}(x_0, r)$ noting that $A$ is a nonempty compact set. Define $f : A \to \mathbb{R}_{>0}$ by $f(y) = \|x_0 - y\|$. The map $f$ is continuous and so there exists $y_0 \in A \subseteq \operatorname{cl}(C)$ such that $f(y_0)$ is the minimum value of $f$. Let $\lambda = y_0 - x_0$ and $a = \langle y_0, y_0 - x_0 \rangle$. We will show that $P_{\lambda,a}$ is a support hyperplane for $C$.

First let us show that $P_{\lambda,a}$ separates $\{x_0\}$ and $\operatorname{cl}(C)$. A direct computation shows that $\langle \lambda, x_0 \rangle = -\|x_0 - y_0\|^2 + a < a$. To show that $\langle \lambda, x \rangle \ge a$ for all $x \in \operatorname{cl}(C)$, suppose otherwise. Thus let $y \in C$ be such that $\langle \lambda, y \rangle < a$. By Lemma 1 in the proof of Proposition 4.3.12 the line segment from $y$ to $y_0$ is contained in $\operatorname{cl}(C)$. Define $g : [0, 1] \to \mathbb{R}$ by $g(s) = \|(1 - s) y_0 + s y - x_0\|^2$. Thus $g$ is the square of the distance from $x_0$ to points on the line segment from $y$ to $y_0$. Note that $g(s) \ge g(0)$ for all $s \in (0, 1]$ since $y_0$ is the closest point in $\operatorname{cl}(C)$ to $x_0$. A computation gives
$$g(s) = (1 - s)^2 \|y_0 - x_0\|^2 + 2 s (1 - s) \langle y - x_0, y_0 - x_0 \rangle + s^2 \|y - x_0\|^2$$
and another computation gives $g'(0) = 2(\langle \lambda, y \rangle - a)$ which is strictly negative by our assumption about $y$. This means that $g$ strictly decreases near zero, which contradicts the definition of $y_0$. Thus we must have $\langle \lambda, y \rangle \ge a$ for all $y \in \operatorname{cl}(C)$.
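The construction in the proof is easy to carry out numerically when the closest point $y_0$ can be computed. As a sketch under stated assumptions, the following Python code takes $C$ to be an axis-aligned box in $\mathbb{R}^n$ with the Euclidean inner product, a hypothetical example chosen because the nearest point is then obtained by clamping each coordinate. It forms $\lambda = y_0 - x_0$ and $a = \langle y_0, y_0 - x_0 \rangle$ as in the proof. The function names are ours, and $x_0$ is assumed to lie outside the box so that $\lambda \ne 0$.

```python
def dot(u, v):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def project_to_box(x0, lower, upper):
    """Nearest point to x0 in the box prod_i [lower[i], upper[i]]."""
    return tuple(min(max(c, lo), hi) for c, lo, hi in zip(x0, lower, upper))


def support_hyperplane(x0, lower, upper):
    """Return (lam, a) with <lam, x0> < a <= <lam, x> for all x in the box."""
    y0 = project_to_box(x0, lower, upper)
    lam = tuple(p - q for p, q in zip(y0, x0))
    if all(c == 0 for c in lam):
        raise ValueError("x0 must lie outside the box")
    return lam, dot(y0, lam)


if __name__ == "__main__":
    lam, a = support_hyperplane((2, 3), (0, 0), (1, 1))
    corners = [(0, 0), (1, 0), (0, 1), (1, 1)]
    print(lam, a, [dot(lam, c) >= a for c in corners])
```

Since a linear function on a box attains its minimum at a corner, checking the corners is enough to confirm $\langle \lambda, x \rangle \ge a$ on the whole box in this example.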

During the course of the proof of the theorem we almost proved the following result.

4.3.16 Corollary (Separation of convex sets and points) If $V$ is a finite-dimensional $\mathbb{R}$-vector space, if $C \subseteq V$ is convex, and if $x_0 \notin \operatorname{int}(C)$ then there exists a separating hyperplane for $\{x_0\}$ and $C$.


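Proof If $x_0 \notin \operatorname{cl}(C)$ then the result follows immediately from the proof of Theorem 4.3.15. If $x_0 \in \operatorname{bd}(C)$ then let $(x_j)_{j \in \mathbb{Z}_{>0}}$ be a sequence in $V \setminus \operatorname{cl}(C)$ converging to $x_0$. For each $j \in \mathbb{Z}_{>0}$ let $\lambda_j \in V^* \setminus \{0\}$ and $a_j \in \mathbb{R}$ have the property that
$$\langle \lambda_j; x_j \rangle \le a_j, \quad j \in \mathbb{Z}_{>0}, \qquad \langle \lambda_j; y \rangle > a_j, \quad y \in C,\ j \in \mathbb{Z}_{>0}.$$
Let us without loss of generality take $a_j = \langle \lambda_j; x_j \rangle$; this corresponds to choosing the hyperplane separating $C$ from $x_j$ to pass through $x_j$. Let $\alpha_j = \frac{\lambda_j}{\|\lambda_j\|}$, $j \in \mathbb{Z}_{>0}$. The sequence $(\alpha_j)_{j \in \mathbb{Z}_{>0}}$ is a sequence in the unit sphere in $V^*$ which is compact. Thus we can choose a convergent subsequence which we also denote, by an abuse of notation, by $(\alpha_j)_{j \in \mathbb{Z}_{>0}}$. Let $\alpha \in V^*$ denote the limit of this sequence. Defining $c_j = \langle \alpha_j; x_j \rangle$ we then have
$$\langle \alpha_j; x_j \rangle = c_j, \quad j \in \mathbb{Z}_{>0}, \qquad \langle \alpha_j; y \rangle > c_j, \quad y \in C,\ j \in \mathbb{Z}_{>0}.$$
Let $c = \lim_{j \to \infty} c_j$. For $y \in C$ this gives
$$\langle \alpha; x_0 \rangle = \lim_{j \to \infty} \langle \alpha_j; x_j \rangle = c, \qquad \langle \alpha; y \rangle = \lim_{j \to \infty} \langle \alpha_j; y \rangle \ge c,$$
as desired.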

The following consequence of Theorem 4.3.15 is also of independent interest.

4.3.17 Corollary (Disjoint convex sets are separated) If $V$ is a finite-dimensional $\mathbb{R}$-vector space and if $C_1, C_2 \subset V$ are disjoint convex sets, then there exists a hyperplane separating $C_1$ and $C_2$.

Proof Define $C_1 - C_2 = \{x_1 - x_2 \mid x_1 \in C_1,\ x_2 \in C_2\}$. One checks directly that $C_1 - C_2$ is convex. Since $C_1$ and $C_2$ are disjoint it follows that $0 \notin C_1 - C_2$. By Theorem 4.3.15 there exists a hyperplane $P$, passing through $0$, separating $C_1 - C_2$ from $0$. We claim that this implies that the same hyperplane $P$, appropriately translated, separates $C_1$ and $C_2$. To see this note that $P$ gives rise to $\lambda \in V^* \setminus \{0\}$ such that
$$\langle\lambda; x_1 - x_2\rangle \ge 0, \qquad x_1 \in C_1,\ x_2 \in C_2.$$
Let
$$a_1 = \inf\{\langle\lambda; x_1\rangle \mid x_1 \in C_1\}, \qquad a_2 = \sup\{\langle\lambda; x_2\rangle \mid x_2 \in C_2\},$$
so that $a_1 - a_2 \ge 0$. For any $a \in [a_2, a_1]$ we have
$$\langle\lambda; x_1\rangle \ge a, \quad \langle\lambda; x_2\rangle \le a, \qquad x_1 \in C_1,\ x_2 \in C_2,$$
giving the separation of $C_1$ and $C_2$, as desired.
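The last step of the proof, passing from a functional that is nonnegative on differences to a separating level $a \in [a_2, a_1]$, can be illustrated on two polygons. This is a minimal sketch under stated assumptions: the sets are convex hulls of the listed vertices, so the infimum and supremum are attained at vertices, and the functional `lam` is a hypothetical choice made by inspection rather than computed from the theorem.

```python
from itertools import product

def dot(u, v):
    """Natural pairing, written in standard coordinates on R^2."""
    return sum(a * b for a, b in zip(u, v))

def separation_interval(lam, verts1, verts2):
    """Return (a2, a1): a1 = inf of lam over C1, a2 = sup of lam over C2.

    For convex hulls of finitely many points both are attained at vertices.
    """
    a1 = min(dot(lam, v) for v in verts1)
    a2 = max(dot(lam, v) for v in verts2)
    return a2, a1

# Illustrative data (assumed): two disjoint triangles and a functional lam.
C1 = [(2.0, 0.0), (3.0, 1.0), (3.0, -1.0)]
C2 = [(0.0, 0.0), (1.0, 1.0), (1.0, -1.0)]
lam = (1.0, 0.0)

# Hypothesis used in the proof: lam is nonnegative on C1 - C2.
# By linearity it suffices to check differences of vertices.
nonneg = all(dot(lam, (p[0] - q[0], p[1] - q[1])) >= 0
             for p, q in product(C1, C2))

a2, a1 = separation_interval(lam, C1, C2)
a = 0.5 * (a1 + a2)  # any a in [a2, a1] works
print("lam >= 0 on C1 - C2:", nonneg)
print("separating levels form the interval", (a2, a1), "; chosen a =", a)
```

Any other level in the interval would do equally well; the midpoint is chosen only for definiteness.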


We shall require the following quite general result concerning separation of convex sets by hyperplanes.

4.3.18 Theorem (A general separation theorem) If $V$ is a finite-dimensional $\mathbb{R}$-vector space and if $C_1, C_2 \subset V$ are convex sets, then they possess a separating hyperplane if and only if either of the following two conditions holds:
(i) there exists a hyperplane $P$ such that $C_1, C_2 \subset P$;
(ii) $\mathrm{rel\,int}(C_1) \cap \mathrm{rel\,int}(C_2) = \emptyset$.

Proof Suppose that $C_1$ and $C_2$ possess a separating hyperplane $P$. Therefore, there exist $\lambda \in V^* \setminus \{0\}$ and $a \in \mathbb{R}$ such that
$$\langle\lambda; x_1\rangle \ge a, \quad \langle\lambda; x_2\rangle \le a, \qquad x_1 \in C_1,\ x_2 \in C_2.$$
If $\langle\lambda; x\rangle = a$ for all $x \in C_1 \cup C_2$ then (i) holds. Now suppose that $\langle\lambda; x_1\rangle > a$ for some $x_1 \in C_1$ (a similar argument will obviously apply if $\langle\lambda; x_2\rangle < a$ for some $x_2 \in C_2$) and let $x_0 \in \mathrm{rel\,int}(C_1)$. Since $P$ is a support hyperplane for $C_1$ and since $C_1 \not\subset P$, it follows that the relative interior, and so $x_0$, lies in the appropriate half-space defined by $P$. Since $P$ separates $C_1$ and $C_2$ this precludes $x_0$ from being in $C_2$. Thus (ii) holds.

Now suppose that (i) holds. It is then clear that $P$ is a separating hyperplane for $C_1$ and $C_2$.

Finally, suppose that (ii) holds. From Proposition 4.3.12 and Corollary 4.3.17 it holds that $\mathrm{rel\,int}(C_1)$ and $\mathrm{rel\,int}(C_2)$ possess a separating hyperplane. Thus there exist $\lambda \in V^* \setminus \{0\}$ and $a \in \mathbb{R}$ such that
$$\langle\lambda; x_1\rangle \ge a, \quad \langle\lambda; x_2\rangle \le a, \qquad x_1 \in \mathrm{rel\,int}(C_1),\ x_2 \in \mathrm{rel\,int}(C_2).$$
Therefore, by Proposition 4.3.13 we also have
$$\langle\lambda; x_1\rangle \ge a, \quad \langle\lambda; x_2\rangle \le a, \qquad x_1 \in \mathrm{cl}(C_1),\ x_2 \in \mathrm{cl}(C_2),$$
which implies this part of the theorem.

4.4 Differential inclusions on manifolds


In this section we study the principal sort of set-valued map that we shall use: the differential inclusion. Differential inclusions are, in some way of thinking about them, a generalisation of differential equations or a generalisation of a certain class of control system. The idea is that one wishes to model a dynamical process not by directly prescribing the tangent vectors to trajectories, but by prescribing a set in which tangent vectors to potential trajectories must live. (Note that this is what a control system really does.) The study of differential inclusions is somewhat technical, relying in essential ways on the theory of set-valued maps from Section 4.2.


In almost all presentations of differential inclusions, the states take values in open subsets of Euclidean space. In this section our formulations are on manifolds, and we prove that in any coordinate chart the usual definitions hold. (Some differential inclusion concepts on manifolds are presented by Ledyaev and Zhu [2007].)

4.4.1 Definitions

We begin with the terminology that we shall use for differential inclusions and their properties.

4.4.1 Definition (Differential inclusion) Let $M$ be a smooth manifold and let $\mathbb{T} \subset \mathbb{R}$ be an interval.
(i) A differential inclusion on $M$ with time-domain $\mathbb{T}$ is a set-valued map $\mathcal{X}\colon \mathbb{T} \times M \twoheadrightarrow TM$ such that $\mathrm{dom}(\mathcal{X}) = \mathbb{T} \times M$ and such that $\mathcal{X}(t,x) \subset T_xM$ for each $(t,x) \in \mathbb{T} \times M$.
(ii) A differential inclusion $\mathcal{X}\colon \mathbb{T} \times M \twoheadrightarrow TM$ is time-independent if there exists a set-valued map $\mathcal{X}'\colon M \twoheadrightarrow TM$ such that $\mathcal{X}(t,x) = \mathcal{X}'(x)$ for every $(t,x) \in \mathbb{T} \times M$. We shall commonly write $\mathcal{X}$ in place of $\mathcal{X}'$, and simply forget about the redundant time-domain.
(iii) A differential inclusion $\mathcal{X}\colon \mathbb{T} \times M \twoheadrightarrow TM$ is open-valued (resp. closed-valued, bounded-valued, compact-valued, convex-valued) at $(t,x) \in \mathbb{T} \times M$ if $\mathcal{X}(t,x)$ is open (resp. closed, bounded, compact, convex).
(iv) A differential inclusion $\mathcal{X}\colon \mathbb{T} \times M \twoheadrightarrow TM$ is open-valued (resp. closed-valued, bounded-valued, compact-valued, convex-valued) if $\mathcal{X}(t,x)$ is open (resp. closed, bounded, compact, convex) for every $(t,x) \in \mathbb{T} \times M$.

We will sometimes consider cases where the time dependence of a differential inclusion is regular in some way. In such cases, as for vector fields with regular time dependence, it is sometimes convenient to include time as an independent variable. We use the convention that, if $\mathbb{T} \subset \mathbb{R}$ is an interval (possibly not open), then $T\mathbb{T} = \{(x,v) \in T\mathbb{R} \mid x \in \mathbb{T}\}$. With this convention, we have the following definition.

4.4.2 Definition (Suspension of a differential inclusion) Let $M$ be a smooth manifold, let $\mathbb{T} \subset \mathbb{R}$ be an interval, and let $\mathcal{X}\colon \mathbb{T} \times M \twoheadrightarrow TM$ be a differential inclusion.
The suspension of $\mathcal{X}$ is the time-independent differential inclusion $\mathcal{X}_{\mathrm{spnd}}\colon \mathbb{T} \times M \twoheadrightarrow T(\mathbb{T} \times M)$ on $\mathbb{T} \times M$ defined by
$$\mathcal{X}_{\mathrm{spnd}}(t,x) = \{(1, v_x) \mid v_x \in \mathcal{X}(t,x)\}.$$

Just as with vector fields, differential inclusions can be conveniently represented in coordinate charts.

4.4.3 Definition (Local representative of a differential inclusion) Let $M$ be a smooth manifold, let $\mathbb{T} \subset \mathbb{R}$ be an interval, and let $\mathcal{X}\colon \mathbb{T} \times M \twoheadrightarrow TM$ be a differential inclusion. If $(U, \phi)$ is a chart for $M$ with $\phi(U) \subset \mathbb{R}^n$, then the local representative of $\mathcal{X}$ is the set-valued map $\mathcal{X}_\phi\colon \phi(U) \twoheadrightarrow \mathbb{R}^n$ given by
$$\mathcal{X}_\phi(x) = \{v \in \mathbb{R}^n \mid (x, v) \in T\phi(\mathcal{X}(\phi^{-1}(x)))\}.$$

4.4.2 Continuity of differential inclusions

In this section we consider differential inclusions with the various continuity properties introduced in Section 4.2.3. We also indicate how these definitions are manifested in coordinates, so making the connection with the usual definitions for differential inclusions whose domain is an open subset of Euclidean space. We begin with those that depend explicitly on a choice of Riemannian metric on $M$.

4.4.4 Definition (Upper hemicontinuous, lower hemicontinuous, Hausdorff continuous differential inclusions) Let $(M, G)$ be a smooth Riemannian manifold with $d_G$ the associated metric on $M$ and with $d^T_G$ the metric associated with the Sasaki Riemannian metric on $TM$, let $\mathbb{T} \subset \mathbb{R}$ be an interval, and let $\mathcal{X}\colon M \twoheadrightarrow TM$ be a time-independent differential inclusion.
(i) A time-independent differential inclusion $\mathcal{X}\colon M \twoheadrightarrow TM$ is upper hemicontinuous (resp. lower hemicontinuous, Hausdorff continuous) at $x_0 \in M$ if it is upper hemicontinuous (resp. lower hemicontinuous, Hausdorff continuous) at $x_0$ as a set-valued map between the metric spaces $(M, d_G)$ and $(TM, d^T_G)$.
(ii) A time-independent differential inclusion $\mathcal{X}\colon M \twoheadrightarrow TM$ is upper hemicontinuous (resp. lower hemicontinuous, Hausdorff continuous) if it is upper hemicontinuous (resp. lower hemicontinuous, Hausdorff continuous) at every $x \in M$.

Next we consider continuity of differential inclusions such as can be defined using only the topology of the manifold.

4.4.5 Definition (Upper semicontinuous, lower semicontinuous, continuous differential inclusions) Let $M$ be a smooth manifold, let $\mathbb{T} \subset \mathbb{R}$ be an interval, and let $\mathcal{X}\colon M \twoheadrightarrow TM$ be a time-independent differential inclusion.
(i) A time-independent differential inclusion $\mathcal{X}\colon M \twoheadrightarrow TM$ is upper semicontinuous (resp. lower semicontinuous, continuous) at $x_0 \in M$ if it is upper semicontinuous (resp. lower semicontinuous, continuous) at $x_0$ as a set-valued map.
(ii) A time-independent differential inclusion $\mathcal{X}\colon M \twoheadrightarrow TM$ is upper semicontinuous (resp. lower semicontinuous, continuous) if it is upper semicontinuous (resp. lower semicontinuous, continuous) as a set-valued map.

Of course, all of the results from Section 4.2.3 can be repeated here, and we shall freely use these results as needed.

Let us now turn to characterising the various sorts of continuous differential inclusions in coordinates. We begin with the notions of continuity that depend on a Riemannian metric $G$ on $M$. In these cases, if $(U, \phi)$ is a chart for $M$, we endow $\phi(U) \subset \mathbb{R}^n$ and $\phi(U) \times \mathbb{R}^n \subset \mathbb{R}^n \times \mathbb{R}^n$ with their standard metrics, and when we need a metric on these spaces, we shall use this standard metric unless we say otherwise.

4.4.6 Proposition (Local characterisations of hemicontinuous differential inclusions) Let $(M, G)$ be a smooth Riemannian manifold with $d_G$ the associated metric on $M$ and with $d^T_G$ the metric associated with the Sasaki Riemannian metric on $TM$, and let $\mathcal{X}\colon M \twoheadrightarrow TM$ be a time-independent compact-valued differential inclusion. For $x_0 \in M$, the following statements are equivalent:
(i) $\mathcal{X}$ is upper hemicontinuous (resp. lower hemicontinuous, Hausdorff continuous) at $x_0$;
(ii) for any chart $(U, \phi)$ with $x_0 \in U$, the local representative of $\mathcal{X}$ is upper hemicontinuous (resp. lower hemicontinuous, Hausdorff continuous) at $\phi(x_0)$.
Proof This follows from Proposition 4.2.23.

For semicontinuity, depending only on the manifold topology as it does, the situation is simpler.

4.4.7 Proposition (Local characterisations of semicontinuous differential inclusions) Let $M$ be a smooth manifold and let $\mathcal{X}\colon M \twoheadrightarrow TM$ be a time-independent differential inclusion. For $x_0 \in M$, the following statements are equivalent:
(i) $\mathcal{X}$ is upper semicontinuous (resp. lower semicontinuous, continuous) at $x_0$;
(ii) for any chart $(U, \phi)$ with $x_0 \in U$, the local representative of $\mathcal{X}$ is upper semicontinuous (resp. lower semicontinuous, continuous) at $\phi(x_0)$.
Proof This follows since coordinate chart maps are homeomorphisms.

Note that the above definitions are all made for time-independent differential inclusions. The definitions can be adapted for time-dependent differential inclusions as follows.

4.4.8 Definition (Continuity for time-dependent differential inclusions) Let $M$ be a smooth manifold, let $\mathbb{T} \subset \mathbb{R}$ be an interval, and let $\mathcal{X}\colon \mathbb{T} \times M \twoheadrightarrow TM$ be a differential inclusion.
(i) The differential inclusion $\mathcal{X}$ is upper semicontinuous (resp. lower semicontinuous, continuous) if it is upper semicontinuous (resp. lower semicontinuous, continuous) as a set-valued map.
Now let $G$ be a smooth Riemannian metric on $M$ with $d_G$ the associated metric on $M$ and $d^T_G$ the metric on $TM$ associated with the Sasaki Riemannian metric.
(ii) The differential inclusion $\mathcal{X}$ is upper hemicontinuous (resp. lower hemicontinuous, Hausdorff continuous) if it is upper hemicontinuous (resp. lower hemicontinuous, Hausdorff continuous) as a set-valued map between the metric spaces $(\mathbb{T} \times M, d_G)$ and $(TM, d^T_G)$.

The following result is useful, since it will allow us to derive many results for regular time-dependent differential inclusions from those for time-independent differential inclusions.


4.4.9 Proposition (Continuity of time-dependent differential inclusions) Let $M$ be a smooth manifold, let $\mathbb{T} \subset \mathbb{R}$ be an interval, and let $\mathcal{X}\colon \mathbb{T} \times M \twoheadrightarrow TM$ be a differential inclusion.
(i) The differential inclusion $\mathcal{X}$ is upper semicontinuous (resp. lower semicontinuous, continuous) if and only if the time-independent differential inclusion $\mathcal{X}_{\mathrm{spnd}}$ is upper semicontinuous (resp. lower semicontinuous, continuous).
Now let $G$ be a smooth Riemannian metric on $M$ with $d_G$ the associated metric on $M$ and $d^T_G$ the metric on $TM$ associated with the Sasaki Riemannian metric.
(ii) The differential inclusion $\mathcal{X}$ is upper hemicontinuous (resp. lower hemicontinuous, Hausdorff continuous) if and only if the time-independent differential inclusion $\mathcal{X}_{\mathrm{spnd}}$ is upper hemicontinuous (resp. lower hemicontinuous, Hausdorff continuous).

4.4.3 Lipschitz differential inclusions

As with Lipschitz differential equations, Lipschitz differential inclusions play an important role.

4.4.10 Definition (Locally Lipschitz differential inclusions) Let $(M, G)$ be a smooth Riemannian manifold with $d_G$ the associated metric and $d^T_G$ the metric associated with the Sasaki Riemannian metric, and let $\mathcal{X}\colon M \twoheadrightarrow TM$ be a time-independent differential inclusion.
(i) The differential inclusion $\mathcal{X}$ is Lipschitz if it is a Lipschitz map between the metric spaces $(M, d_G)$ and $(TM, d^T_G)$.
(ii) The differential inclusion $\mathcal{X}$ is locally Lipschitz if it is a locally Lipschitz map between the metric spaces $(M, d_G)$ and $(TM, d^T_G)$.

The property of local Lipschitzness is independent of metric in some cases.

4.4.11 Proposition (Local characterisations of Lipschitz differential inclusions) Let $(M, G)$ be a smooth Riemannian manifold with $d_G$ the associated metric on $M$ and with $d^T_G$ the metric associated with the Sasaki Riemannian metric on $TM$, and let $\mathcal{X}\colon M \twoheadrightarrow TM$ be a time-independent compact-valued differential inclusion. The following statements are equivalent:
(i) $\mathcal{X}$ is locally Lipschitz;
(ii) for any chart $(U, \phi)$, the local representative of $\mathcal{X}$ is locally Lipschitz.

Proof This follows from Proposition 4.2.23 along with a standard compactness argument.

4.4.4 Differential inclusions with measurable time dependence

Let us also consider differential inclusions with measurable time dependence.


4.4.12 Definition (Measurable differential inclusion) Let $M$ be a smooth manifold and let $\mathbb{T} \subset \mathbb{R}$ be an interval. Let P $\in$ {upper semicontinuous, lower semicontinuous, continuous, upper hemicontinuous, lower hemicontinuous, Hausdorff continuous, locally Lipschitz}. Then
(i) a differential inclusion $\mathcal{X}\colon \mathbb{T} \times M \twoheadrightarrow TM$ is measurable/P if
  (a) the map $t \mapsto \mathcal{X}(t,x)$ is measurable for every $x \in M$ and
  (b) the map $x \mapsto \mathcal{X}(t,x)$ has property P for almost every $t \in \mathbb{T}$.
(ii) In particular, a differential inclusion $\mathcal{X}\colon \mathbb{T} \times M \twoheadrightarrow TM$ is Carathéodory if it is measurable/continuous.

As with time-dependent vector fields, one can ask for the dependence on time of a differential inclusion to have useful integrability properties.

4.4.13 Definition (Integrable time dependence for differential inclusions) Let $M$ be a smooth manifold, let $\mathbb{T}$ be an interval, and let $\mathcal{X}\colon \mathbb{T} \times M \twoheadrightarrow TM$ be a differential inclusion.
(i) The differential inclusion $\mathcal{X}$ is locally integrally bounded if, for every compact subset $K \subset M$ and for every $f \in C^\infty(M)$, there exists $g \in L^1_{\mathrm{loc}}(\mathbb{T}; \mathbb{R}_{\ge 0})$ such that
$$\sup\{|df(x) \cdot X_{(t,x)}| \mid X_{(t,x)} \in \mathcal{X}(t,x)\} \le g(t)$$
for every $(t,x) \in \mathbb{T} \times K$.
Now let $G$ be a smooth Riemannian metric on $M$ with $d_G$ the associated metric on $M$ and $d^T_G$ the metric on $TM$ associated with the Sasaki Riemannian metric.
(ii) The differential inclusion $\mathcal{X}$ is locally integrally Lipschitz if, for every compact subset $K \subset M$, there exists $L \in L^1_{\mathrm{loc}}(\mathbb{T}; \mathbb{R}_{\ge 0})$ such that
$$d^T_G(\mathcal{X}(t,x_1), \mathcal{X}(t,x_2)) \le L(t)\, d_G(x_1, x_2)$$
for every $t \in \mathbb{T}$ and every $x_1, x_2 \in K$.

4.4.5 Selections of differential inclusions

4.4.14 Definition (Selection of a differential inclusion) For a differential inclusion $\mathcal{X}\colon \mathbb{T} \times M \twoheadrightarrow TM$ on a smooth manifold $M$, a selection of $\mathcal{X}$ is a time-dependent vector field $X\colon \mathbb{T} \times M \to TM$ such that $X(t,x) \in \mathcal{X}(t,x)$ for every $(t,x) \in \mathbb{T} \times M$.

4.4.6 Trajectories for differential inclusions

Let us state a few results concerning the existence of trajectories for differential inclusions. First we need to know what we mean by a trajectory.

4.4.15 Definition (Trajectory for a differential inclusion) Let $M$ be a smooth manifold, let $\mathbb{T} \subset \mathbb{R}$ be an interval, and let $\mathcal{X}\colon \mathbb{T} \times M \twoheadrightarrow TM$ be a differential inclusion. A trajectory for $\mathcal{X}$ is a locally absolutely continuous curve $\gamma\colon \mathbb{T} \to M$ such that $\gamma'(t) \in \mathcal{X}(t, \gamma(t))$ for almost every $t \in \mathbb{T}$.

Now we can give some of the basic conditions for existence of trajectories. We begin with a result for time-independent differential inclusions.

4.4.16 Theorem (Trajectories for upper semicontinuous differential inclusions) Let $M$ be a smooth manifold and let $\mathcal{X}\colon M \twoheadrightarrow TM$ be a time-independent differential inclusion satisfying the following conditions:
(i) $\mathcal{X}$ is compact-valued and convex-valued;
(ii) $\mathcal{X}$ is upper semicontinuous.
Then, for each $x_0 \in M$, there exists an interval $\mathbb{T}$ with $0 \in \mathbb{T}$, and a trajectory $\gamma\colon \mathbb{T} \to M$ such that $\gamma(0) = x_0$.

For time-dependent differential inclusions, we have the following results.

4.4.17 Theorem (Trajectories for upper semicontinuous differential inclusions) Let $(M, G)$ be a smooth Riemannian manifold, let $\mathbb{T} \subset \mathbb{R}$ be an interval, and let $\mathcal{X}\colon \mathbb{T} \times M \twoheadrightarrow TM$ be a differential inclusion satisfying the following conditions:
(i) $\mathcal{X}$ is closed-valued and convex-valued;
(ii) $\mathcal{X}$ is measurable/upper semicontinuous;
(iii) $\mathcal{X}$ is locally integrally bounded.
Then, for each $(t_0, x_0) \in \mathbb{T} \times M$, there exists a subinterval $\mathbb{T}' \subset \mathbb{T}$, relatively open in $\mathbb{T}$ and with $t_0 \in \mathrm{int}_{\mathbb{T}}(\mathbb{T}')$, and a trajectory $\gamma\colon \mathbb{T}' \to M$ such that $\gamma(t_0) = x_0$.

One can relax the condition that the differential inclusion be convex-valued, but only at the cost of strengthening the hypothesis of upper semicontinuity to continuity.

4.4.18 Theorem (Trajectories for upper semicontinuous differential inclusions) Let $(M, G)$ be a smooth Riemannian manifold, let $\mathbb{T} \subset \mathbb{R}$ be an interval, and let $\mathcal{X}\colon \mathbb{T} \times M \twoheadrightarrow TM$ be a differential inclusion satisfying the following conditions:
(i) $\mathcal{X}$ is closed-valued;
(ii) $\mathcal{X}$ is measurable/continuous;
(iii) $\mathcal{X}$ is locally integrally bounded.
Then, for each $(t_0, x_0) \in \mathbb{T} \times M$, there exists a subinterval $\mathbb{T}' \subset \mathbb{T}$, relatively open in $\mathbb{T}$ and with $t_0 \in \mathrm{int}_{\mathbb{T}}(\mathbb{T}')$, and a trajectory $\gamma\colon \mathbb{T}' \to M$ such that $\gamma(t_0) = x_0$.

4.4.7 Relaxation

It is very often of interest to compare trajectories of a differential inclusion with those of its convexification. The most popular theorem for doing this is attributed to Ważewski [1962] and Filippov [1967]. This theorem has been extended in various ways by, for example, Joó and Tallos [1999] and Cernea [2001]. Here we report the version of Cernea [2001] as it is the most general.

4.4.19 Theorem (Relaxation Theorem) Let $(M, G)$ be a smooth Riemannian manifold, let $\mathbb{T} \subset \mathbb{R}$ be an interval, and let $\mathcal{X}, \mathcal{Y}\colon \mathbb{T} \times M \twoheadrightarrow TM$ be differential inclusions for which $\mathcal{X}$ has the following properties:
(i) $\mathcal{X}$ is convex-valued and compact-valued;
(ii) $\mathcal{X}$ is measurable/locally Lipschitz;
(iii) $\mathcal{X}$ is continuous,
and $\mathcal{Y}$ has the following properties:
(iv) $\mathcal{Y}$ is compact-valued;
(v) $\mathcal{Y}$ is measurable/upper semicontinuous;
(vi) $\mathcal{Y}(t,x) \subset \mathcal{X}(t,x)$ for every $(t,x) \in \mathbb{T} \times M$.
Then $\mathrm{cl}(\mathrm{conv}(\mathcal{Y})) = \mathcal{X}$ if and only if the set of trajectories for $\mathcal{Y}$ is dense in the set of trajectories for $\mathcal{X}$ relative to the topology of uniform convergence on compact sets.

4.4.8 Differential inclusions associated with a discontinuous vector field

One of the places where differential inclusions have seen widespread use is for regularising a nonsmooth vector field.

Let $(M, G)$ be a Riemannian manifold with $\nabla$ the Levi-Civita affine connection. For a curve $\gamma\colon I \to M$ and for $t_1, t_2 \in I$, write $x_j = \gamma(t_j)$, $j \in \{1, 2\}$, and let $\tau_{x_1,x_2}\colon T_{x_1}M \to T_{x_2}M$ denote the isomorphism defined by parallel translation along $\gamma$. Let $x_0 \in M$ and, for $r \in \mathbb{R}_{>0}$ sufficiently small, let $N_r$ be a normal neighbourhood of $x_0$ of radius $r$ [Kobayashi and Nomizu 1963, Theorem 8.7]. Thus, for $x \in N_r$ there exists a unique geodesic from $x_0$ to $x$ whose length does not exceed $r$. Let us denote by $\tau_{x_0,x}\colon T_{x_0}M \to T_xM$ the isomorphism defined by parallel translation along this unique geodesic. A subset $Z \subset M$ has measure zero if, for every $\epsilon \in \mathbb{R}_{>0}$, there exists a covering $(B_j)_{j\in\mathbb{Z}_{>0}}$ of $Z$ by open balls in the metric $d_G$ such that $\sum_{j=1}^\infty \mathrm{vol}_G(B_j) < \epsilon$. One can easily show that this definition of measure zero is independent of metric.

Now, suppose that we are given a time-dependent vector field $X\colon \mathbb{T} \times M \to TM$, making no assumptions about any regularity properties of $X$. To $X$ assign the time-dependent differential inclusion $\mathcal{X}_X$ by
$$\mathcal{X}_X(t, x_0) = \bigcap_{\delta > 0}\ \bigcap_{\substack{Z \subset N_\delta \\ \mu(Z) = 0}} \mathrm{cl}(\mathrm{conv}(\{\tau_{x_0,x}(X(t,x)) \mid x \in N_\delta \setminus Z\})).$$

Despite the lack of any regularity of $X$, the differential inclusion $\mathcal{X}_X$ does have some useful regularity properties.
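To make the regularisation concrete, here is a rough numerical sketch for the simplest possible case, assuming $M = \mathbb{R}$ with the Euclidean metric (so that parallel translation is the identity) and a time-independent vector field with a jump at the origin. The function names, the sample vector field, and the list `exceptional` standing in for a single measure-zero set $Z$ are all assumptions made for illustration; the true definition intersects over every measure-zero set and every radius, which a finite sample can only suggest.

```python
import math

def X(x):
    """Illustrative discontinuous vector field on R (an assumption):
    -sign(x) away from 0, with an arbitrary value assigned at 0."""
    if x > 0:
        return -1.0
    if x < 0:
        return 1.0
    return 5.0  # value on the measure-zero set {0}; it should not matter

def regularised_interval(X, x0, exceptional=(0.0,),
                         deltas=(1.0, 0.1, 0.01), samples=200):
    """Approximate the regularised set at x0 as an interval [lo, hi].

    For each radius delta, sample the neighbourhood [x0 - delta, x0 + delta],
    discard the points of the exceptional set, and take the convex hull
    [min, max] of the sampled values; then intersect over the radii.
    """
    lo, hi = -math.inf, math.inf
    for delta in deltas:
        values = []
        for i in range(samples + 1):
            x = x0 - delta + 2.0 * delta * i / samples
            if x in exceptional:
                continue
            values.append(X(x))
        lo = max(lo, min(values))  # intersecting intervals raises the lower end
        hi = min(hi, max(values))  # and lowers the upper end
    return lo, hi

print("at x0 = 0:  ", regularised_interval(X, 0.0))
print("at x0 = 0.5:", regularised_interval(X, 0.5))
```

At the discontinuity the sketch returns the whole interval between the two one-sided values, while away from it the set collapses to the single value of the vector field, which is the qualitative behaviour the construction is designed to produce.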


4.4.20 Theorem (Regularity of differential inclusions associated with discontinuous vector fields) Let $(M, G)$ be a Riemannian manifold, let $\mathbb{T} \subset \mathbb{R}$ be an interval, and let $X\colon \mathbb{T} \times M \to TM$ have the property that, for each compact subset $K \subset M$ and each $f \in C^\infty(M)$, there exists $g \in L^1_{\mathrm{loc}}(\mathbb{T}; \mathbb{R}_{\ge 0})$ such that $|df(x) \cdot X(t,x)| \le g(t)$ for every $(t,x) \in \mathbb{T} \times K$. Then the differential inclusion $\mathcal{X}_X$ is compact-valued, convex-valued, measurable/upper semicontinuous, and locally integrally bounded. In particular, Theorem 4.4.17 implies that, for each $(t_0, x_0) \in \mathbb{T} \times M$, there exists a subinterval $\mathbb{T}' \subset \mathbb{T}$, relatively open in $\mathbb{T}$ and with $t_0 \in \mathrm{int}_{\mathbb{T}}(\mathbb{T}')$, and a trajectory $\gamma\colon \mathbb{T}' \to M$ for $\mathcal{X}_X$ such that $\gamma(t_0) = x_0$.

Chapter 5 Families of vector fields, distributions, and affine distributions


As we shall see when we start discussing geometric control theory in earnest in subsequent chapters, the notion of a family of vector fields arises naturally in control theory. Associated with families of vector fields are the pointwise linear hulls of these vector fields. These are what are called distributions. Associated with families of vector fields and distributions are many constructions that can be used to shed light on the properties of control systems. In this chapter we study distributions and families of vector fields in detail. We also define the notion of an affine distribution as this comes up in our treatment of certain kinds of systems, namely control-affine systems (Section 6.4) and affine systems (Section 5.4).

After an initial discussion of distributions and their basic properties in Section 5.1, we present some algebraic properties of distributions in Section 5.2. Here we will see some of the features associated with real analyticity begin to play a role. We then discuss the Orbit Theorem, which will be of some importance to us. In our treatment of the Orbit Theorem, we will again see that some properties of real analyticity play an important part. We then describe affine distributions.

5.1 Distributions: definitions and basic properties


In this section we define what we mean by a distribution, and give some properties of distributions that can easily be associated with these basic definitions.

5.1.1 Definitions

A distribution on a manifold assigns to each point in the manifold a subspace of the tangent space.

5.1.1 Definition (Distribution) Let $M$ be a manifold of class $C^\infty$ or $C^\omega$, as is required. A distribution on $M$ is a subset $D \subset TM$ such that, for each $x \in M$, the subset $D_x = D \cap T_xM$ is a subspace (and so, in particular, is nonempty). Associated with the notion of a distribution we have the following.
(i) A distribution $D$ is of class $C^r$, $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, if, for each $x_0 \in M$, there exists a neighbourhood $N$ of $x_0$ and a family $(X_j)_{j\in J}$ of $C^r$-vector fields, called local generators, on $N$ such that
$$D_x = \mathrm{span}_{\mathbb{R}}(X_j(x) \mid j \in J)$$
for each $x \in N$.
(ii) A distribution $D$ of class $C^r$, $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, is locally finitely generated if, for each $x_0 \in M$, there exists a neighbourhood $N$ of $x_0$ and a family $(X_1, \dots, X_k)$ of $C^r$-vector fields, called local generators, on $N$ such that
$$D_x = \mathrm{span}_{\mathbb{R}}(X_1(x), \dots, X_k(x))$$
for each $x \in N$.
(iii) A distribution $D$ of class $C^r$, $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, is finitely generated if there exists a family $(X_1, \dots, X_k)$ of $C^r$-vector fields, called generators, on $M$ such that
$$D_x = \mathrm{span}_{\mathbb{R}}(X_1(x), \dots, X_k(x))$$
for each $x \in M$.
The nonnegative integer $\dim(D_x)$ is called the rank of $D$ at $x$ and is sometimes denoted $\mathrm{rank}(D_x)$.

The following rather non-obvious result is due to Sussmann [2008].

5.1.2 Theorem (Distributions of class $C^r$ are finitely generated) For $r \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$ and for a distribution $D$ on a smooth paracompact Hausdorff manifold $M$ of bounded dimension, the following statements are equivalent:
(i) $D$ is of class $C^r$;
(ii) for each $x_0 \in M$ and each $v_{x_0} \in D_{x_0}$, there exists a neighbourhood $N$ of $x_0$ and a $C^r$-vector field $X$ such that $X(x_0) = v_{x_0}$ and $X(x) \in D_x$ for each $x \in N$;
(iii) there exists a family $(X_1, \dots, X_k)$ of $C^r$-vector fields on $M$ such that
$$D_x = \mathrm{span}_{\mathbb{R}}(X_1(x), \dots, X_k(x))$$
for each $x \in M$.

Proof (i) $\implies$ (ii) Suppose that $D$ is of class $C^r$, let $x_0 \in M$, and let $v_{x_0} \in D_{x_0}$. Let $N$ be a neighbourhood of $x_0$ and let $(X_j)_{j\in J}$ be a family of vector fields on $N$ of class $C^r$ such that $D_x = \mathrm{span}_{\mathbb{R}}(X_j(x) \mid j \in J)$ for $x \in N$. Let $j_1, \dots, j_k \in J$ be such that $(X_{j_1}(x_0), \dots, X_{j_k}(x_0))$ is a basis for $D_{x_0}$. Then
$$v_{x_0} = c_1 X_{j_1}(x_0) + \dots + c_k X_{j_k}(x_0)$$
for some uniquely defined $c_1, \dots, c_k \in \mathbb{R}$. The vector field $X = c_1 X_{j_1} + \dots + c_k X_{j_k}$ defined on $N$ is then of class $C^r$, is $D$-valued on $N$, and satisfies $X(x_0) = v_{x_0}$.


(ii) $\implies$ (iii) Since $M$ has bounded dimension, there exists $n \in \mathbb{Z}_{\ge 0}$ such that all connected components of $M$ have dimension at most $n$. Thus there exists a least integer $n_0 \in \mathbb{Z}_{\ge 0}$ such that $\mathrm{rank}(D_x) \le n_0$ for every $x \in M$. Use the notation $\mathrm{rank}(D_x) = \mathrm{rank}_D(x)$. For $k \in \{0, 1, \dots, n_0 + 1\}$ denote
$$U_k = \{x \in M \mid \mathrm{rank}_D(x) \ge k\}.$$
By Proposition 5.1.7 below, $\mathrm{rank}_D$ is lower semicontinuous, and so $U_k$ is open for each $k \in \{0, 1, \dots, n_0 + 1\}$. Moreover,
$$\emptyset = U_{n_0+1} \subset U_{n_0} \subset \dots \subset U_1 \subset U_0 = M.$$
We wish to define a certain open cover of each of the open sets $U_k$, $k \in \{0, 1, \dots, n_0 + 1\}$, with a certain property. We do this inductively. The following two general lemmata will be key in our inductive construction. Note that the notation of the lemmata may or may not correspond to the notation of the theorem and its proof. So beware.

1 Lemma Let $M$ be a smooth, paracompact, Hausdorff manifold all of whose connected components have dimension bounded by $n \in \mathbb{Z}_{\ge 0}$ and let $(W_j)_{j\in J}$ be an open cover of $M$. Then there exists an open cover $(V_a)_{a\in A}$ of $M$ with the following properties:
(i) $(V_a)_{a\in A}$ is a refinement of $(W_j)_{j\in J}$ (i.e., for each $a \in A$ there exists $j \in J$ such that $V_a \subset W_j$);
(ii) there exist subsets $A_1, \dots, A_{n+1} \subset A$ such that $A = \bigcup_{l=1}^{n+1} A_l$ and such that, whenever $a_1, a_2 \in A_l$ are distinct for some $l \in \{1, \dots, n+1\}$, it holds that $V_{a_1} \cap V_{a_2} = \emptyset$.

Proof Since $M$ is paracompact it possesses a Riemannian metric by Corollary 5.5.13 of [Abraham, Marsden, and Ratiu 1988]. Therefore, we may assume that $M$ is a metric space, and we denote the metric by $d$. A smooth manifold can be triangulated, by which we mean that there exists a homeomorphism $\phi\colon S \to M$ where $S$ is a union of simplices which intersect only at their boundaries [Munkres 1966, Theorem 8.4]. By successive barycentric subdivisions of the simplices of $S$ we can assume that all simplices $S_b$, $b \in B$, comprising $S$ are such that $\phi(S_b) \subset W_j$ for some $j \in J$. Let us adopt the usual slight abuse of terminology and say that $\phi(S_b)$, $b \in B$, is a simplex. Let $m \in \{0, 1, \dots, n\}$ and define
$$\mathcal{F}_m = \{F \subset M \mid F \text{ is an open } m\text{-dimensional face of some simplex}\}$$
and denote $|\mathcal{F}_m| = \bigcup_{F\in\mathcal{F}_m} F$. For $F \in \mathcal{F}_m$ denote
$$C(F) = \mathrm{cl}\Bigl(\bigcup\{F' \in \mathcal{F}_m \mid F' \ne F\}\Bigr).$$

Note that each $F \in \mathcal{F}_m$ is both open and closed in the relative topology of $|\mathcal{F}_m|$. Also, $C(F)$ is closed in the relative topology of $|\mathcal{F}_m|$ since its intersection with any compact set is a finite union of closed sets, and so closed. This implies that $F \cap C(F) = \emptyset$ by virtue of $F$ being relatively open. Now let $x \in F$. Then $\{x\}$ and $C(F)$ are disjoint closed sets, and so $d(x, C(F)) = \inf\{d(x,y) \mid y \in C(F)\}$ is positive. (Indeed, suppose otherwise. Then there exists a sequence $(y_k)_{k\in\mathbb{Z}_{>0}}$ in $C(F)$ converging to $x$. Thus $x \in C(F)$ since $C(F)$ is relatively closed. This contradicts the fact that $\{x\}$ and $C(F)$ are disjoint.) Denote by $B(x, \tfrac12 d(x, C(F)))$ the open ball in $M$ of radius $\tfrac12 d(x, C(F))$ and define
$$B(F) = \bigcup_{x\in F} B(x, \tfrac12 d(x, C(F))).$$
We claim that if $F_1, F_2 \in \mathcal{F}_m$ are disjoint, then $B(F_1)$ and $B(F_2)$ are disjoint. Suppose otherwise and let $x \in B(F_1) \cap B(F_2)$. Let $x_1 \in F_1$ and $x_2 \in F_2$ be such that $x \in B(x_j, \tfrac12 d(x_j, C(F_j)))$, $j \in \{1, 2\}$. Then
$$d(x_1, x_2) \le d(x_1, x) + d(x_2, x) < \tfrac12 d(x_1, C(F_1)) + \tfrac12 d(x_2, C(F_2)) \le \max\{d(x_1, C(F_1)), d(x_2, C(F_2))\}.$$
Thus either $d(x_1, x_2) < d(x_1, C(F_1))$ or $d(x_1, x_2) < d(x_2, C(F_2))$. In the first case we have $x_2 \in F_1$, contradicting the fact that $x_2 \in C(F_1)$, and in the second case we have $x_1 \in F_2$, contradicting the fact that $x_1 \in C(F_2)$. Thus $B(F_1)$ and $B(F_2)$ are disjoint if $F_1, F_2 \in \mathcal{F}_m$ are disjoint.

For $F \in \mathcal{F}_m$ let $j_F \in J$ be such that $F \subset W_{j_F}$, this being possible since our triangulation of $M$ was chosen in precisely this manner. Define an open set $V_F = B(F) \cap W_{j_F}$. Note that $F \subset V_F \subset W_{j_F}$. Moreover, since $B(F_1)$ and $B(F_2)$ are disjoint for $F_1, F_2 \in \mathcal{F}_m$ disjoint, it follows that $V_{F_1}$ and $V_{F_2}$ are disjoint for $F_1, F_2 \in \mathcal{F}_m$ disjoint. If $x \in M$ then $x$ belongs to some open $m$-dimensional face from the triangulation of $M$ for some $m \in \{0, 1, \dots, n\}$. Therefore,
$$M = \bigcup_{m=0}^{n}\ \bigcup_{F\in\mathcal{F}_m} V_F.$$

Thus we have an open cover of $M$ that refines $(W_j)_{j\in J}$. The index set for the open cover is the set
$$A = \{F \mid F \in \mathcal{F}_m,\ m \in \{0, 1, \dots, n\}\}.$$
It remains to show that the index set for the open cover satisfies the second condition in the statement of the lemma. For $m \in \{1, \dots, n+1\}$ let $A_m = \mathcal{F}_{m-1}$ so that $A$ is the disjoint union of $A_1, \dots, A_{n+1}$. As we have shown above, for each $m \in \{1, \dots, n+1\}$, the family of open sets $(V_F)_{F\in A_m}$ is pairwise disjoint, just as is asserted in the statement of the lemma.

The following technical lemma will also be useful.

2 Lemma If $U$ is an open subset of a smooth, paracompact, Hausdorff manifold $M$ then there exists $f \in C^\infty(M)$ such that $f(x) \in \mathbb{R}_{>0}$ for all $x \in U$ and $f(x) = 0$ for all $x \in M \setminus U$.

Proof We equip $M$ with a Riemannian metric $G$, this being possible by paracompactness of $M$ [Abraham, Marsden, and Ratiu 1988, Corollary 5.5.13]. We denote by $\nabla$ the Levi-Civita connection of $G$. Let $g \in C^\infty(M)$. For $k \in \mathbb{Z}_{\ge 0}$ we define $\nabla^k g$ recursively by $\nabla^0 g = g$, $\nabla^1 g = dg$, and $\nabla^k g = \nabla(\nabla^{k-1} g)$. Thus $\nabla^k g$ is a $(0, k)$-tensor field which, when represented in coordinates, is a sum of the $k$th derivatives of $g$, plus terms involving lower-order derivatives of $g$ and Christoffel symbols. If $C \subset M$ is compact we define
$$\|g\|_{k,C} = \sup\{\|\nabla^j g(x)\| \mid x \in C,\ j \in \{0, 1, \dots, k\}\},$$
where $\|\cdot\|$ indicates the norm induced on tensors by the norm associated with the Riemannian metric. The value of $\|g\|_{k,C}$ depends on the choice of metric $G$, but if $G'$ is another metric with $\|\cdot\|'_{k,C}$ the corresponding norm, then there exists $M \in \mathbb{R}_{>0}$ such that
$$M^{-1}\|g\|'_{k,C} \le \|g\|_{k,C} \le M\|g\|'_{k,C}$$
for every $g \in C^\infty(M)$, cf. Lemma 3.1.7. The family $\|\cdot\|_{k,C}$, $k \in \mathbb{Z}_{\ge 0}$, $C \subset M$ compact, of norms defines a locally convex topology on $C^\infty(M)$ (often called uniform convergence of all derivatives on compact sets), and if a sequence $(g_j)_{j\in\mathbb{Z}_{>0}}$ satisfies
$$\lim_{j\to\infty} \|g - g_j\|_{k,C} = 0, \qquad k \in \mathbb{Z}_{\ge 0},\ C \subset M \text{ compact},$$
then $\lim_{j\to\infty} g_j(x) = g(x)$ (obviously) and $g$ is infinitely differentiable (less obvious). We refer to [Hirsch 1976, §2.1] for details.

We suppose that $M$ is connected since, if it is not, we can construct $f$ for each connected component, which suffices to give $f$ on $M$. If $M$ is paracompact, connectedness allows us to conclude that $M$ is second countable [Abraham, Marsden, and Ratiu 1988, Proposition 5.5.11]. Using Lemma 2.76 of [Aliprantis and Border 1999], we let $(K_j)_{j\in\mathbb{Z}_{>0}}$ be a sequence of compact subsets of $U$ such that $K_j \subset \mathrm{int}(K_{j+1})$ for $j \in \mathbb{Z}_{>0}$ and such that $\bigcup_{j\in\mathbb{Z}_{>0}} K_j = U$. For $j \in \mathbb{Z}_{>0}$ let $g_j\colon M \to [0, 1]$ be a smooth function such that $g_j(x) = 1$ for $x \in K_j$ and $g_j(x) = 0$ for $x \in M \setminus K_{j+1}$; see [Abraham, Marsden, and Ratiu 1988, Proposition 5.5.8]. Let us define $\alpha_j = \|g_j\|_{j,K_{j+1}}$ and take $\beta_j \in \mathbb{R}_{>0}$ to satisfy $\beta_j < (\alpha_j 2^j)^{-1}$. We define $f$ by
$$f(x) = \sum_{j=1}^{\infty} \beta_j g_j(x),$$
and claim that $f$ as defined satisfies the conclusions of the lemma. First of all, since each of the functions $g_j$ takes values in $[0, 1]$ and vanishes outside $K_{j+1}$, we have
$$|f(x)| \le \sum_{j=1}^{\infty} |\beta_j g_j(x)| \le \sum_{j=1}^{\infty} \beta_j \|g_j\|_{0,K_{j+1}} \le \sum_{j=1}^{\infty} \beta_j \|g_j\|_{j,K_{j+1}} \le \sum_{j=1}^{\infty} \frac{1}{2^j} \le 1,$$
and so $f$ is well-defined. If $x \in U$ then there exists $N \in \mathbb{Z}_{>0}$ such that $x \in K_N$. Thus $g_N(x) = 1$ and so $f(x) \in \mathbb{R}_{>0}$. If $x \in M \setminus U$ then $g_j(x) = 0$ for all $j \in \mathbb{Z}_{>0}$ and so $f(x) = 0$. All that remains to show is that $f$ is infinitely differentiable.

First let $x \in M$, let $m \in \mathbb{Z}_{>0}$, and let $j \in \mathbb{Z}_{\ge 0}$ be such that $j \le m$. If $x \notin K_{m+1}$ then $g_m$ is zero in a neighbourhood of $x$, and so $\nabla^j g_m(x) = 0$. If $x \in K_{m+1}$ then
$$\|\nabla^j g_m(x)\| \le \sup\{\|\nabla^j g_m(x')\| \mid x' \in K_{m+1}\} \le \sup\{\|\nabla^j g_m(x')\| \mid x' \in K_{m+1},\ j \in \{0, 1, \dots, m\}\} = \alpha_m.$$
Thus, whenever $j \le m$ we have $\|\nabla^j g_m(x)\| \le \alpha_m$ for every $x \in M$. Let us define $f_m \in C^\infty(M)$ by
$$f_m(x) = \sum_{j=1}^{m} \beta_j g_j(x).$$


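Let $C \subset M$ be compact, let $k \in \mathbb{Z}_{\ge 0}$, and let $\epsilon \in \mathbb{R}_{>0}$. Take $N \in \mathbb{Z}_{>0}$ sufficiently large that
$$\sum_{m=m_1+1}^{m_2} \frac{1}{2^m} < \epsilon$$
for $m_1, m_2 \ge N$ with $m_1 < m_2$, this being possible by convergence of $\sum_{j=1}^{\infty} \frac{1}{2^j}$. Then, for $m_1, m_2 \ge \max\{k, N\}$, and writing $\beta_m$ for the coefficient of $g_m$ in the sum defining $f$ and $\alpha_m$ for the bound on the derivatives of $g_m$ up to order $m$,
$$\begin{aligned}
\|f_{m_1} - f_{m_2}\|_{k,C} &= \sup\{\|\nabla^j f_{m_1}(x) - \nabla^j f_{m_2}(x)\| \mid x \in C,\ j \in \{0, 1, \dots, k\}\} \\
&= \sup\Bigl\{\Bigl\|\sum_{m=m_1+1}^{m_2} \beta_m \nabla^j g_m(x)\Bigr\| \Bigm| x \in C,\ j \in \{0, 1, \dots, k\}\Bigr\} \\
&\le \sup\Bigl\{\sum_{m=m_1+1}^{m_2} \beta_m \|\nabla^j g_m(x)\| \Bigm| x \in C,\ j \in \{0, 1, \dots, k\}\Bigr\} \\
&\le \sum_{m=m_1+1}^{m_2} \beta_m \alpha_m < \sum_{m=m_1+1}^{m_2} \frac{1}{2^m} < \epsilon.
\end{aligned}$$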
Thus, for every $k \in \mathbb{Z}_{\ge 0}$ and $C \subset M$ compact, $(f_m)_{m \in \mathbb{Z}_{>0}}$ is a Cauchy sequence in the norm $\|\cdot\|_{k,C}$. This implies that the sequence $(f_m)_{m \in \mathbb{Z}_{>0}}$ converges to a function that is infinitely differentiable.

With the use of the preceding lemmata, we can prove the following, reverting now to the notation of the theorem and its proof.

3 Lemma There exist $C^r$-vector fields $X^k_m$, $k \in \{0,1,\dots,n_0\}$, $m \in \{1,\dots,n+1\}$, on $M$, taking values in $D$, such that, for every $k \in \{0,1,\dots,n_0\}$ and for every $x \in U_k$,
\[
\dim(\mathrm{span}_{\mathbb{R}}(\{X^j_m(x) \mid j \in \{0,1,\dots,k\},\ m \in \{1,\dots,n+1\}\})) \ge k.
\]

Proof We prove this by induction on $k$. For $k = 0$ the assertion is obvious. Indeed, one merely takes the vector fields $X^0_1,\dots,X^0_{n+1}$ to all be zero, and the conclusion holds in this case. Now assume that the conclusions hold for $k \in \{0,1,\dots,s\}$. Let $x \in U_{s+1}$ and define
\[
U^s_x = \mathrm{span}_{\mathbb{R}}(\{X^k_m(x) \mid k \in \{0,1,\dots,s\},\ m \in \{1,\dots,n+1\}\}),
\]
noting that $\dim(U^s_x) \ge s$ by the induction hypothesis. Since $x \in U_{s+1}$ there exists $v_x \in D_x$ such that
\[
\dim(\mathrm{span}_{\mathbb{R}}(U^s_x \cup \{v_x\})) \ge s+1.
\]
By the hypotheses of part (ii) of the theorem, there exists a neighbourhood $W^{s+1}_x$ of $x$ and a $C^r$-vector field $X_x$ on $W^{s+1}_x$, taking values in $D$, such that $X_x(x) = v_x$. By multiplying $X_x$ by a smooth function that is positive on $W^{s+1}_x$ and with compact support [see Abraham, Marsden, and Ratiu 1988, Theorem 5.5.9] we can extend $X_x$ to a $D$-valued $C^r$-vector field on all of $M$. By shrinking $W^{s+1}_x$ if necessary, we can suppose that
\[
\dim(\mathrm{span}_{\mathbb{R}}(U^s_{x'} \cup \{X_x(x')\})) \ge s+1, \qquad x' \in W^{s+1}_x,
\]
by lower semicontinuity of the rank of a distribution (see Proposition 5.1.7 below). With $W^{s+1}_x$ so specified we have $W^{s+1}_x \subset U_{s+1}$. Note that $(W^{s+1}_x)_{x \in U_{s+1}}$ is then an open cover of $U_{s+1}$.

By Lemma 1 let $(V^{s+1}_a)_{a \in A}$ be a refinement of the open cover $(W^{s+1}_x)_{x \in U_{s+1}}$ of $U_{s+1}$ such that the index set $A$ is a disjoint union of sets $A_1,\dots,A_{n+1}$ such that $V^{s+1}_{a_1} \cap V^{s+1}_{a_2} = \emptyset$ whenever $a_1, a_2 \in A_l$ are distinct for some $l \in \{1,\dots,n+1\}$. Let us denote
\[
V^{s+1}_l = \bigcup_{a \in A_l} V^{s+1}_a, \qquad l \in \{1,\dots,n+1\}.
\]
For $l \in \{1,\dots,n+1\}$ and $a \in A_l$ let $x_{l,a} \in U_{s+1}$ be such that $V^{s+1}_a \subset W^{s+1}_{x_{l,a}}$. Define a $C^r$-vector field $Y^{s+1}_l$ on $V^{s+1}_l$ by asking that, if $x \in V^{s+1}_a$ for some (necessarily unique) $a \in A_l$, then $Y^{s+1}_l(x) = X_{x_{l,a}}(x)$. The vector field $Y^{s+1}_l$ will then have the property that
\[
\dim(\mathrm{span}_{\mathbb{R}}(U^s_x \cup \{Y^{s+1}_l(x)\})) \ge s+1, \qquad x \in V^{s+1}_l,\ l \in \{1,\dots,n+1\}. \tag{5.1}
\]
For each $l \in \{1,\dots,n+1\}$, by Lemma 2 let $f_l \in C^\infty(M)$ be such that $f_l(x) \in \mathbb{R}_{>0}$ for $x \in V^{s+1}_l$ and such that $f_l(x) = 0$ for $x \in M \setminus V^{s+1}_l$. Then define $X^{s+1}_l = f_l Y^{s+1}_l$ so that
\[
\dim(\mathrm{span}_{\mathbb{R}}(U^s_x \cup \{X^{s+1}_l(x)\})) \ge s+1, \qquad x \in V^{s+1}_l,\ l \in \{1,\dots,n+1\}, \tag{5.2}
\]
by (5.1), since $X^{s+1}_l(x)$ is a nonzero multiple of $Y^{s+1}_l(x)$ for all $x \in V^{s+1}_l$.

To complete the proof of the lemma, let $x \in U_{s+1}$ and let $a \in A$ be such that $x \in V^{s+1}_a$. Then $a \in A_l$ for some unique $l \in \{1,\dots,n+1\}$ and so $x \in V^{s+1}_l$. Therefore, by (5.2) and recalling the definition of $U^s_x$,
\[
\begin{aligned}
&\mathrm{span}_{\mathbb{R}}(U^s_x \cup \{X^{s+1}_l(x)\}) \subset \mathrm{span}_{\mathbb{R}}(\{X^j_m(x) \mid j \in \{0,1,\dots,s+1\},\ m \in \{1,\dots,n+1\}\})\\
&\quad\Longrightarrow\quad \dim(\mathrm{span}_{\mathbb{R}}(\{X^j_m(x) \mid j \in \{0,1,\dots,s+1\},\ m \in \{1,\dots,n+1\}\})) \ge s+1,
\end{aligned}
\]
as desired.

We now conclude the proof of this part of the theorem by showing that the $D$-valued $C^r$-vector fields $(X^j_m \mid j \in \{1,\dots,n_0\},\ m \in \{1,\dots,n+1\})$ from Lemma 3 generate $D$. Let $x \in M$ and let $k = \mathrm{rank}_D(x)$ so that $x \in U_k$. By Lemma 3 we have
\[
\dim(\mathrm{span}_{\mathbb{R}}(\{X^j_m(x) \mid j \in \{1,\dots,n_0\},\ m \in \{1,\dots,n+1\}\})) \ge \dim(\mathrm{span}_{\mathbb{R}}(\{X^j_m(x) \mid j \in \{1,\dots,k\},\ m \in \{1,\dots,n+1\}\})) \ge k.
\]
However, since $\dim(D_x) = k$ and since all vector fields $X^j_m$, $j \in \{1,\dots,n_0\}$, $m \in \{1,\dots,n+1\}$, are $D$-valued, we conclude that
\[
\mathrm{span}_{\mathbb{R}}(\{X^j_m(x) \mid j \in \{1,\dots,n_0\},\ m \in \{1,\dots,n+1\}\}) = D_x,
\]
as desired.

(iii) $\Longrightarrow$ (i) This is obvious.

5.1.3 Remark (Bound on the number of generators for a distribution) During the course of the proof of the preceding theorem, we showed that, if all connected components of $M$ are bounded in dimension by $n$ and if the dimension of $D_x$, $x \in M$, is bounded by $m$, then one can find $m(n+1)$ generators for $D$.

5.1.4 Remark (Generalised vector subbundles are finitely generated) While we have been talking in this section about distributions, much of what we say applies to a vector bundle $\pi\colon V \to M$ over $M$. In particular, our notion of a $C^r$-distribution can be adapted to give the notion of a $C^r$-generalised subbundle of $V$ as being an assignment of a subspace of each fibre such that, around every point in $M$, there is a family of sections which fibrewise span the assigned subspaces. Moreover, by a mere change of notation, the proof of Theorem 5.1.2 adapts to give the following statement.

Let $\pi\colon V \to M$ be a vector bundle such that $M$ is smooth, paracompact, Hausdorff, and of bounded dimension and such that the fibres of $V$ have bounded dimension. For a generalised subbundle $E$ of $V$ and for $r \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$, the following statements are equivalent:
(i) $E$ is of class $C^r$;
(ii) for each $x_0 \in M$ and each $v_{x_0} \in E_{x_0}$, there exists a neighbourhood $N$ of $x_0$ and a $C^r$-section $\xi$ of $V$ such that $\xi(x_0) = v_{x_0}$ and $\xi(x) \in E_x$ for each $x \in N$;
(iii) there exists a family $(\xi_1,\dots,\xi_k)$ of $C^r$-sections of $V$ such that $E_x = \mathrm{span}_{\mathbb{R}}(\xi_1(x),\dots,\xi_k(x))$ for each $x \in M$.

5.1.2 Regular and singular points

One of the complications ensuing from the notion of a distribution arises if the dimensions of the subspaces $D_x$, $x \in M$, are not locally constant. The following definition associates some language with this.

5.1.5 Definition (Regular point, singular point) Let $D$ be a distribution on a manifold $M$. A point $x_0 \in M$
(i) is a regular point for $D$ if there exists a neighbourhood $N$ of $x_0$ such that $\mathrm{rank}(D_x) = \mathrm{rank}(D_{x_0})$ for every $x \in N$ and
(ii) is a singular point for $D$ if it is not a regular point for $D$.
A distribution $D$ on $M$ is regular if every point in $M$ is a regular point for $D$, and is singular otherwise.

Although our definition of regular and singular points is made for arbitrary distributions, these definitions only have real value in the case when the distribution has some smoothness. A regular distribution of class $C^r$, $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, is often called a subbundle of $TM$ of class $C^r$.

For continuous distributions one can make some statements about the character of rank and the character of the set of regular and singular points. To do this we make the following definition.

5.1.6 Definition (Upper and lower semicontinuity) Let $X$ be a topological space. A function $f\colon X \to \mathbb{R}$ is
(i) upper semicontinuous if $f^{-1}((-\infty, a))$ is open for every $a \in \mathbb{R}$ and is
(ii) lower semicontinuous if $f^{-1}((a, \infty))$ is open for every $a \in \mathbb{R}$.
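As a quick illustration of Definition 5.1.6, here is a small worked computation. The choice of the characteristic function $\chi_{(0,1)}$ of the open interval $(0,1) \subset \mathbb{R}$ is ours and is not taken from the text; it is only meant as a sketch of how the two conditions are checked.

```latex
\[
\chi_{(0,1)}^{-1}((a,\infty)) =
\begin{cases}
\mathbb{R}, & a < 0,\\
(0,1), & 0 \le a < 1,\\
\emptyset, & a \ge 1,
\end{cases}
\qquad
\chi_{(0,1)}^{-1}((-\infty,a)) =
\begin{cases}
\emptyset, & a \le 0,\\
\mathbb{R} \setminus (0,1), & 0 < a \le 1,\\
\mathbb{R}, & a > 1.
\end{cases}
\]
```

Every set in the first list is open, so $\chi_{(0,1)}$ is lower semicontinuous. The set $\mathbb{R} \setminus (0,1)$ in the second list is closed but not open, so $\chi_{(0,1)}$ is not upper semicontinuous, and in particular it is not continuous.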


One can easily verify that $f$ is continuous if and only if it is both upper and lower semicontinuous. One might gain some insight into semicontinuity by showing that a set $A \subset X$ is open (resp. closed) if and only if the characteristic function $\chi_A$ is lower (resp. upper) semicontinuous. We shall refrain here from a detailed discussion of semicontinuity, although such a discussion is interesting. We refer the reader to [Aliprantis and Border 1999, §2.20].

In the following result, if $D$ is a distribution on $M$, then we denote by $\mathrm{rank}_D\colon M \to \mathbb{Z}_{\ge 0}$ the function defined by $\mathrm{rank}_D(x) = \mathrm{rank}(D_x)$.

5.1.7 Proposition (Rank and regular points for differentiable distributions) If $D$ is a distribution of class $C^1$ on $M$ then the function $\mathrm{rank}_D$ is lower semicontinuous and the set of regular points of $D$ is open and dense.

Proof Let $a \in \mathbb{R}$ and let $x_0 \in \mathrm{rank}_D^{-1}((a,\infty))$. Thus $k \triangleq \mathrm{rank}_D(x_0) > a$. This means that there are $k$ vector fields $X_1,\dots,X_k$ of class $C^1$ defined in a neighbourhood $N$ of $x_0$ such that $D_{x_0} = \mathrm{span}_{\mathbb{R}}(X_1(x_0),\dots,X_k(x_0))$. Now choose a vector bundle chart $(V,\psi)$ for $TM$ about $x_0$ so that the local sections $X_1,\dots,X_k$ have local representatives
\[
\boldsymbol{x} \mapsto (\boldsymbol{x}, \boldsymbol{X}_j(\boldsymbol{x})), \qquad j \in \{1,\dots,k\}.
\]
Let $(U,\phi)$ be the induced chart for $M$ and let $\boldsymbol{x}_0 = \phi(x_0)$. The vectors $(\boldsymbol{X}_1(\boldsymbol{x}_0),\dots,\boldsymbol{X}_k(\boldsymbol{x}_0))$ are then linearly independent. Therefore, there exist $j_1,\dots,j_k \in \{1,\dots,n\}$ (supposing that $n$ is the dimension of $M$) such that the matrix
\[
\begin{bmatrix}
X^{j_1}_1(\boldsymbol{x}_0) & \cdots & X^{j_1}_k(\boldsymbol{x}_0)\\
\vdots & \ddots & \vdots\\
X^{j_k}_1(\boldsymbol{x}_0) & \cdots & X^{j_k}_k(\boldsymbol{x}_0)
\end{bmatrix}
\]
has nonzero determinant, where $X^{j_l}_i(\boldsymbol{x}_0)$ is the $j_l$th component of $\boldsymbol{X}_i$, $i, l \in \{1,\dots,k\}$. By continuity of the determinant there exists a neighbourhood $U'$ of $\boldsymbol{x}_0$ such that the matrix
\[
\begin{bmatrix}
X^{j_1}_1(\boldsymbol{x}) & \cdots & X^{j_1}_k(\boldsymbol{x})\\
\vdots & \ddots & \vdots\\
X^{j_k}_1(\boldsymbol{x}) & \cdots & X^{j_k}_k(\boldsymbol{x})
\end{bmatrix}
\]
has nonzero determinant for every $\boldsymbol{x} \in U'$. Thus the vectors $(\boldsymbol{X}_1(\boldsymbol{x}),\dots,\boldsymbol{X}_k(\boldsymbol{x}))$ are linearly independent for every $\boldsymbol{x} \in U'$. Therefore, the local sections $X_1,\dots,X_k$ are linearly independent on $\phi^{-1}(U')$, and so $\phi^{-1}(U') \subset \mathrm{rank}_D^{-1}((a,\infty))$, which gives lower semicontinuity of $\mathrm{rank}_D$.

Let us denote by $R_D$ the set of regular points of $D$ and let $x_0 \in R_D$. Then, by definition of $R_D$, there exists a neighbourhood $U$ of $x_0$ such that $U \subset R_D$. Thus $R_D$ is open.

Now let $x_0 \in M$ and let $U$ be a connected neighbourhood of $x_0$. Since the function $\mathrm{rank}_D$ is locally bounded, there exists a least integer $N$ such that $\mathrm{rank}_D(x) \le N$ for each $x \in U$. Moreover, since $\mathrm{rank}_D$ is integer-valued, there exists $x' \in U$ such that $\mathrm{rank}_D(x') = N$. Now, by lower semicontinuity of $\mathrm{rank}_D$, there exists a neighbourhood $U' \subset U$ of $x'$ such that $\mathrm{rank}_D(x) \ge N$ for all $x \in U'$. By definition of $N$ we also have $\mathrm{rank}_D(x) \le N$ for each $x \in U'$. Thus $x' \in R_D$, and so $x_0 \in \mathrm{cl}(R_D)$. Therefore, $R_D$ is dense.
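To see the determinant argument in the proof of Proposition 5.1.7 at work, the following is a small worked example. The manifold $M = \mathbb{R}^2$ with coordinates $(x,y)$ and the generators $X_1$, $X_2$ are assumptions chosen for illustration only; they do not appear in the text.

```latex
\[
X_1 = \frac{\partial}{\partial x}, \qquad
X_2 = x\,\frac{\partial}{\partial y}, \qquad
\det\begin{bmatrix} 1 & 0\\ 0 & x \end{bmatrix} = x,
\]
\[
\mathrm{rank}_D(x,y) =
\begin{cases}
2, & x \neq 0,\\
1, & x = 0,
\end{cases}
\qquad
\mathrm{rank}_D^{-1}((a,\infty)) =
\begin{cases}
\mathbb{R}^2, & a < 1,\\
\{(x,y) \mid x \neq 0\}, & 1 \le a < 2,\\
\emptyset, & a \ge 2.
\end{cases}
\]
```

Each superlevel set is open, so $\mathrm{rank}_D$ is lower semicontinuous, although it is not upper semicontinuous since $\mathrm{rank}_D^{-1}((-\infty,2))$ is the line $\{x = 0\}$. The set of regular points is $\{(x,y) \mid x \neq 0\}$, which is open and dense, and the singular points form the line $\{x = 0\}$.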


5.1.8 Remark (Singular points of measure zero?) One often sees the confounding of "open and dense" and "complement has measure zero" in discussions of regular points. Note that one can have open and dense subsets of $\mathbb{R}^n$ for which the complement has arbitrarily large measure. Therefore, there is no correspondence between "open and dense" and "complement has measure zero." If we consider an analytic distribution on an analytic manifold, then the set of singular points is of measure zero. However, much more is true in this case: the set of singular points is a proper analytic subset (meaning it is locally the set of common zeros of a finite number of analytic functions). "Measure zero" is a worthlessly coarse description of a proper analytic subset. Therefore, "measure zero" should be left out of this discussion, and one should use "open and dense" in the differentiable case and "complement being a proper analytic subset" in the analytic case.

The notions of regularity and singularity bear on the character of local generators for a distribution.

5.1.9 Proposition (A useful class of local generators for distributions) Let $M$ be a manifold of class $C^\infty$ or $C^\omega$, as is required, let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, and let $D$ be a $C^r$-distribution on $M$. Then, for each $x_0 \in M$ there exists a neighbourhood $N$ of $x_0$ and local generators $(X_1,\dots,X_k)$ for $D$ on $N$ with the following properties:
(i) $(X_1(x_0),\dots,X_m(x_0))$ form a basis for $D_{x_0}$;
(ii) $X_{m+1}(x_0) = \dots = X_k(x_0) = 0_{x_0}$.
In particular, if $x_0$ is a regular point for $D$, the vector fields $(X_1,\dots,X_m)$ are local generators for $D$ in some neighbourhood (possibly smaller than $N$) of $x_0$.

Proof Let $(Y_1,\dots,Y_k)$ be local generators defined on a neighbourhood $N$ of $x_0$. We can assume there are finitely many of these by Theorem 5.1.2 in the case when $r \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$ and by Theorem 2.4.28 in case $r = \omega$. We may rearrange the vector fields $(Y_1,\dots,Y_k)$ so that $(Y_1(x_0),\dots,Y_m(x_0))$ forms a basis for $D_{x_0}$. We then let $(\boldsymbol{v}_{m+1},\dots,\boldsymbol{v}_k) \subset \mathbb{R}^k$ be a basis for the kernel of the linear map $L_{x_0}\colon \mathbb{R}^k \to D_{x_0}$ defined by
\[
L_{x_0}(\boldsymbol{v}) = \sum_{j=1}^k v^j Y_j(x_0).
\]
Define $\boldsymbol{R} \in \mathrm{GL}(k;\mathbb{R})$ by
\[
\boldsymbol{R} = \begin{bmatrix} \boldsymbol{e}_1 & \cdots & \boldsymbol{e}_m & \boldsymbol{v}_{m+1} & \cdots & \boldsymbol{v}_k \end{bmatrix},
\]
where $\boldsymbol{e}_j \in \mathbb{R}^k$, $j \in \{1,\dots,k\}$, is the $j$th standard basis vector, and define $X_1,\dots,X_k$ by
\[
X_j = \sum_{l=1}^k R^l_j Y_l, \qquad j \in \{1,\dots,k\}.
\]
Then $X_j = Y_j$ for $j \in \{1,\dots,m\}$, and $X_j(x_0) = L_{x_0}(\boldsymbol{v}_j) = 0_{x_0}$ for $j \in \{m+1,\dots,k\}$. This gives the first conclusion of the proposition. For the second, if $x_0$ is a regular point for $D$ then let $D'$ be the distribution on $N$ generated by $(X_1,\dots,X_m)$. Then, for $x$ in a neighbourhood $N' \subset N$ of $x_0$, $\mathrm{rank}_{D'}(x) = m$ by lower semicontinuity of $\mathrm{rank}_{D'}$ and, shrinking $N'$ if necessary, $\mathrm{rank}_D(x) = m$ by regularity of $x_0$. Since $D' \subset D$ it follows that $D'|N' = D|N'$.
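The change of generators in the proof of Proposition 5.1.9 can be traced through on a small example. The data below ($M = \mathbb{R}^2$ with coordinates $(x,y)$, the generators $Y_1$, $Y_2$, and the point $x_0 = (0,0)$) are hypothetical choices made only for illustration.

```latex
\[
Y_1 = \frac{\partial}{\partial x}, \qquad
Y_2 = \frac{\partial}{\partial x} + x\,\frac{\partial}{\partial y}, \qquad
D_{x_0} = \mathrm{span}_{\mathbb{R}}\Big(\frac{\partial}{\partial x}\Big|_{x_0}\Big), \qquad m = 1,\ k = 2,
\]
\[
L_{x_0}(\boldsymbol{v}) = (v^1 + v^2)\,\frac{\partial}{\partial x}\Big|_{x_0}, \qquad
\ker(L_{x_0}) = \mathrm{span}_{\mathbb{R}}(\boldsymbol{v}_2), \quad \boldsymbol{v}_2 = (-1, 1),
\]
\[
\boldsymbol{R} = \begin{bmatrix} \boldsymbol{e}_1 & \boldsymbol{v}_2 \end{bmatrix}
= \begin{bmatrix} 1 & -1\\ 0 & 1 \end{bmatrix}, \qquad
X_1 = Y_1 = \frac{\partial}{\partial x}, \qquad
X_2 = -Y_1 + Y_2 = x\,\frac{\partial}{\partial y}.
\]
```

Here $X_1(x_0)$ is a basis for $D_{x_0}$ and $X_2(x_0) = 0_{x_0}$, as the proposition asserts. Note that $x_0$ is a singular point in this example (the rank is $2$ wherever $x \neq 0$), so $X_1$ alone does not generate $D$ near $x_0$; this is why the final assertion of the proposition is restricted to regular points.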


5.1.3 Distributions invariant under vector fields and diffeomorphisms

In this section we investigate properties of distributions relative to vector fields. For a distribution $D$ let $\Gamma^r(D)$ be the set of $C^r$-vector fields taking values in $D$.

5.1.10 Definition (Distributions invariant under vector fields and diffeomorphisms) Let $r \in \mathbb{Z}_{>0} \cup \{\infty, \omega\}$, let $M$ be a manifold of class $C^\infty$ or $C^\omega$, as required, let $D$ be a distribution of class $C^r$, let $X$ be a $C^r$-vector field, and let $\Phi\colon M \to M$ be a $C^r$-diffeomorphism. The distribution $D$
(i) is invariant under $X$ if $[X,Y] \in \Gamma^r(D)$ for every $Y \in \Gamma^r(D)$ and
(ii) is invariant under $\Phi$ if $\Phi^*Y \in \Gamma^r(D)$ for every $Y \in \Gamma^r(D)$.

One would like to think that invariance under a vector field and invariance under its flow are equivalent. This is sort of true sometimes. For the statement of the following theorem, we use the fact that $\Gamma^r(D)$ is a module over the ring $C^r(M)$. In Section 5.2.1 we say what it means for this module to be locally finitely generated.

5.1.11 Theorem (Invariant distributions) Let $r \in \{\infty, \omega\}$, let $M$ be a $C^r$-manifold, let $D$ be a $C^r$-distribution, and let $X \in \Gamma^r(TM)$. Consider the following two statements:
(i) $D$ is invariant under $X$;
(ii) $(\Phi^X_t)^*Y(x) \in D_x$ for every $Y \in \Gamma^r(D)$ and every pair $(t,x)$ for which $\Phi^X_t(x)$ is defined.
Then (ii) $\Longrightarrow$ (i) always, and (i) $\Longrightarrow$ (ii) if $\Gamma^r(D)$ is a locally finitely generated module of vector fields.
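Proof (i) $\Longrightarrow$ (ii) Let $x \in M$ and let $X \in \Gamma^r(TM)$ satisfy $\{[X,Y] \mid Y \in \Gamma^r(D)\} \subset \Gamma^r(D)$. Let $N$ be a neighbourhood of $x$ such that $\Gamma^r(D|N)$ is finitely generated. Consider vector fields $Y_1,\dots,Y_k \in \Gamma^r(D)$ whose restrictions to $N$ generate the module $\Gamma^r(D|N)$. Then, the hypotheses of the theorem imply that
\[
[X, Y_j](x') = \sum_{i=1}^k f^i_j(x') Y_i(x'), \qquad x' \in N,
\]
for $f^i_j \in C^r(N)$, $i, j \in \{1,\dots,k\}$. For $t \in \mathbb{R}$ such that $x \in U(X,t)$ define $v_j(t) = (\Phi^X_t)^*Y_j(x)$, $j \in \{1,\dots,k\}$, so that $t \mapsto v_j(t)$ is a curve in $T_xM$. Then, by [Abraham, Marsden, and Ratiu 1988, Theorem 4.2.19],
\[
\begin{aligned}
\frac{\mathrm{d}}{\mathrm{d}t} v_j(t) &= \frac{\mathrm{d}}{\mathrm{d}t} (\Phi^X_t)^*Y_j(x) = (\Phi^X_t)^*[X, Y_j](x) = (\Phi^X_t)^*\Big(\sum_{i=1}^k f^i_j Y_i\Big)(x)\\
&= \sum_{i=1}^k (\Phi^X_t)^*f^i_j(x)\,(\Phi^X_t)^*Y_i(x) = \sum_{i=1}^k (\Phi^X_t)^*f^i_j(x)\, v_i(t).
\end{aligned}
\]
Define $\boldsymbol{A}_x(t) \in \mathbb{R}^{k \times k}$ by
\[
A^i_{x,j}(t) = (\Phi^X_t)^*f^i_j(x)
\]
and let $\boldsymbol{\Psi}_x\colon \mathbb{R} \to \mathbb{R}^{k \times k}$ be the solution to the matrix initial value problem
\[
\frac{\mathrm{d}}{\mathrm{d}t} \boldsymbol{\Psi}_x(t) = \boldsymbol{A}_x(t)\boldsymbol{\Psi}_x(t), \qquad \boldsymbol{\Psi}_x(0) = \boldsymbol{I}_k.
\]
We claim that
\[
v_j(t) = \sum_{i=1}^k \Psi^i_{x,j}(t) Y_i(x).
\]
Indeed,
\[
\frac{\mathrm{d}}{\mathrm{d}t} \sum_{i=1}^k \Psi^i_{x,j}(t) Y_i(x) = \sum_{i=1}^k \frac{\mathrm{d}}{\mathrm{d}t}\Psi^i_{x,j}(t) Y_i(x) = \sum_{i,l=1}^k A^l_{x,j}(t)\Psi^i_{x,l}(t) Y_i(x) = \sum_{l=1}^k (\Phi^X_t)^*f^l_j(x) \Big(\sum_{i=1}^k \Psi^i_{x,l}(t) Y_i(x)\Big).
\]
Moreover,
\[
\sum_{i=1}^k \Psi^i_{x,j}(0) Y_i(x) = Y_j(x), \qquad v_j(0) = Y_j(x).
\]
Thus
\[
t \mapsto v_j(t) \quad\text{and}\quad t \mapsto \sum_{i=1}^k \Psi^i_{x,j}(t) Y_i(x)
\]
satisfy the same differential equation with the same initial condition. Thus they are equal. This gives
\[
(\Phi^X_t)^*Y_j(x) = \sum_{i=1}^k \Psi^i_{x,j}(t) Y_i(x),
\]
showing that $(\Phi^X_t)^*Y_j(x) \in D_x$. By linearity of pull-back it immediately follows that $(\Phi^X_t)^*Y(x) \in D_x$ for every $Y \in \Gamma^r(D)$.

(ii) $\Longrightarrow$ (i) Let $x \in M$ and let $\epsilon \in \mathbb{R}_{>0}$ be such that $\Phi^X_t(x)$ exists for $t \in (-\epsilon, \epsilon)$. Then, for each $t \in (-\epsilon, \epsilon)$, we have $(\Phi^X_t)^*Y(x) \in D_x$. Therefore, since $D_x$ is a subspace, we use [Abraham, Marsden, and Ratiu 1988, Theorem 4.2.19] to compute
\[
[X, Y](x) = \frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0} (\Phi^X_t)^*Y(x) \in D_x,
\]
as desired.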
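The objects appearing in the proof of Theorem 5.1.11 can be computed by hand in a simple case. The following sketch uses hypothetical data on $M = \mathbb{R}^2$ with coordinates $(x,y)$; neither the distribution nor the vector field $X$ is taken from the text, and the standard convention $\frac{\mathrm{d}}{\mathrm{d}t}(\Phi^X_t)^*Y = (\Phi^X_t)^*[X,Y]$ is assumed.

```latex
\[
D = \mathrm{span}_{\mathbb{R}}(Y_1), \quad Y_1 = \frac{\partial}{\partial x}, \qquad
X = x\,\frac{\partial}{\partial x} + \frac{\partial}{\partial y}, \qquad
[X, Y_1] = -\frac{\partial}{\partial x} = -Y_1 \in \Gamma^\infty(D),
\]
\[
\Phi^X_t(x,y) = (\mathrm{e}^t x,\ y + t), \qquad
(\Phi^X_t)^*Y_1(x,y) = (T_{(x,y)}\Phi^X_t)^{-1}\, Y_1(\Phi^X_t(x,y)) = \mathrm{e}^{-t}\,\frac{\partial}{\partial x} \in D_{(x,y)},
\]
\[
\frac{\mathrm{d}}{\mathrm{d}t}\Big(\mathrm{e}^{-t}\,\frac{\partial}{\partial x}\Big) = -\mathrm{e}^{-t}\,\frac{\partial}{\partial x} = (\Phi^X_t)^*[X, Y_1](x,y).
\]
```

In the notation of the proof, $k = 1$, the single coefficient is the constant function $-1$, and the scalar initial value problem has solution $\mathrm{e}^{-t}$, which reproduces the pull-back computed directly from the flow.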

5.1.12 Remark (Modules of vector fields invariant under vector fields) The implication (i) $\Longrightarrow$ (ii) can be made in a more general context. We let $\mathscr{M}$ be a locally finitely generated submodule of $\Gamma^r(TM)$ (see Section 5.2.1 for the notation) and suppose that $[X,Y] \in \mathscr{M}$ for every $Y \in \mathscr{M}$. Then the above proof, by merely replacing $\Gamma^r(D)$ with $\mathscr{M}$, shows that
\[
\{\mathrm{Ad}^X_t Y(x) \mid Y \in \mathscr{M}\} \subset D(\mathscr{M})_x
\]
for every $(t,x)$ such that $\Phi^X_t(x)$ is defined.


5.2 The algebraic structure of sets of functions and vector fields


In this section we discuss some important algebraic structure associated with functions and vector fields. Sometimes certain nontrivial algebraic properties relate to geometric properties that are important to us.

5.2.1 Rings of functions and modules of vector fields

Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$ and let $M$ be a $C^r$-manifold. Note that $C^r(M)$ is a ring with the operations of addition and multiplication defined pointwise:
\[
(f + g)(x) = f(x) + g(x), \qquad (f \cdot g)(x) = f(x) g(x).
\]
Also, $\Gamma^r(TM)$ is a module over the ring $C^r(M)$. Addition and multiplication by ring elements are defined pointwise:
\[
(X + Y)(x) = X(x) + Y(x), \qquad (f X)(x) = f(x) X(x),
\]
for $X, Y \in \Gamma^r(TM)$ and $f \in C^r(M)$. If $\mathscr{X} \subset \Gamma^r(TM)$ is a set of $C^r$-vector fields, we denote by $T(\mathscr{X})$ the module generated by $\mathscr{X}$. One easily verifies that $T(\mathscr{X})$ is the set of linear combinations
\[
f^1 X_1 + \dots + f^k X_k, \qquad k \in \mathbb{Z}_{>0},\ f^1,\dots,f^k \in C^r(M),\ X_1,\dots,X_k \in \mathscr{X}.
\]
For subsets $\mathscr{F} \subset C^r(M)$ of functions and $\mathscr{X} \subset \Gamma^r(TM)$ of vector fields, and for a submanifold $N$ of $M$, we denote by $\mathscr{F}|N$ and $\mathscr{X}|N$ the families of functions and vector fields on $N$ defined by
\[
\mathscr{F}|N = \{f|N \mid f \in \mathscr{F}\}, \qquad \mathscr{X}|N = \{X|N \mid X \in \mathscr{X}\},
\]
respectively. We call these the restrictions of $\mathscr{F}$ and $\mathscr{X}$ to $N$. A submodule $\mathscr{M} \subset \Gamma^r(TM)$ is locally finitely generated if, for each $x_0 \in M$, there exists a neighbourhood $N$ of $x_0$ and $X_1,\dots,X_k \in \mathscr{M}|N$ such that, for any $X \in \mathscr{M}|N$, there exist $f^1,\dots,f^k \in C^r(N)$ such that
\[
X = f^1 X_1 + \dots + f^k X_k.
\]
If $D$ is a $C^r$-distribution then $\Gamma^r(D)$ denotes the set of vector fields taking values in $D$. It is clear that $\Gamma^r(D)$ is a submodule of $\Gamma^r(TM)$. Conversely, if $\mathscr{X} \subset \Gamma^r(TM)$ is a subset of vector fields, then we can define a distribution $D(\mathscr{X})$ on $M$ by
\[
D(\mathscr{X})_x = \mathrm{span}_{\mathbb{R}}(X(x) \mid X \in \mathscr{X}).
\]
We call $D(\mathscr{X})$ the distribution generated by $\mathscr{X}$.

There is a common situation when the set of vector fields taking values in a distribution is a locally finitely generated module.


5.2.1 Proposition (Sections of regular distributions are locally free modules) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, let $M$ be a $C^\infty$- or $C^\omega$-manifold, as is required, and let $D$ be a $C^r$-distribution on $M$. If $x_0 \in M$ is a regular point of $D$ then there exists a neighbourhood $N$ of $x_0$ such that $\Gamma^r(D)|N$ is a free finitely generated module.

Proof By Proposition 5.1.9 let $X_1,\dots,X_m \in \Gamma^r(D|N)$ be such that $(X_1(x),\dots,X_m(x))$ is a basis for $D_x$ for all $x \in N$ for some neighbourhood $N$ of $x_0$. Let us assume, without loss of generality, that $N$ is the domain of a coordinate chart $(N,\phi)$ and let $X^l_j$, $l \in \{1,\dots,n\}$, be the components of $X_j$, $j \in \{1,\dots,m\}$, in these coordinates. Here $n$ is the dimension of the connected component of $M$ containing $x_0$. Let us arrange the components of the vector fields $X_1,\dots,X_m$ in a matrix
\[
\boldsymbol{X}(x) = \begin{bmatrix} \boldsymbol{X}_1(x)\\ \boldsymbol{X}_0(x) \end{bmatrix},
\]
where the $(l,j)$th entry of $\boldsymbol{X}_1(x) \in \mathbb{R}^{m \times m}$ is $X^l_j(x)$, $j, l \in \{1,\dots,m\}$, and the rows of $\boldsymbol{X}_0(x) \in \mathbb{R}^{(n-m) \times m}$ contain the components $X^l_j(x)$, $j \in \{1,\dots,m\}$, $l \in \{m+1,\dots,n\}$. Since $(X_1(x_0),\dots,X_m(x_0))$ are linearly independent, we can rearrange the coordinates so that $\boldsymbol{X}_1(x_0)$ is invertible. Then, by shrinking $N$ if necessary, $\boldsymbol{X}_1(x)$ is invertible for every $x \in N$. Now let $Y \in \Gamma^r(D)|N$, let the components of $Y$ be $Y^l$, $l \in \{1,\dots,n\}$, and arrange the components in a vector
\[
\boldsymbol{Y}(x) = \begin{bmatrix} \boldsymbol{Y}_1(x)\\ \boldsymbol{Y}_0(x) \end{bmatrix},
\]
where $\boldsymbol{Y}_1(x) \in \mathbb{R}^m$ contains the first $m$ components and $\boldsymbol{Y}_0(x) \in \mathbb{R}^{n-m}$ contains the last $n-m$ components. Now fix $x \in N$. Since $Y(x) \in D_x$, $Y(x) \in \mathrm{span}_{\mathbb{R}}(X_1(x),\dots,X_m(x))$. Therefore, there exist unique $f^1_Y(x),\dots,f^m_Y(x) \in \mathbb{R}$ such that
\[
Y(x) = f^1_Y(x) X_1(x) + \dots + f^m_Y(x) X_m(x).
\]
Writing this as a matrix equation we have
\[
\boldsymbol{X}_1(x) \boldsymbol{f}_Y(x) = \boldsymbol{Y}_1(x), \qquad \boldsymbol{X}_0(x) \boldsymbol{f}_Y(x) = \boldsymbol{Y}_0(x),
\]
where the components of $\boldsymbol{f}_Y(x) \in \mathbb{R}^m$ are $f^j_Y(x)$, $j \in \{1,\dots,m\}$. Therefore,
\[
\boldsymbol{f}_Y(x) = \boldsymbol{X}_1^{-1}(x) \boldsymbol{Y}_1(x).
\]
Therefore, since this construction holds for every $x \in N$, we have
\[
Y(x) = \sum_{j=1}^m (\boldsymbol{X}_1^{-1}(x) \boldsymbol{Y}_1(x))^j X_j(x).
\]
By Cramer's Rule, or some such, the components of $\boldsymbol{X}_1^{-1}$ are $C^r$-functions of $x \in N$, and so $Y$ is a $C^r(N)$-linear combination of $X_1,\dots,X_m$, showing that $\Gamma^r(D)|N$ is finitely generated. To show that this module is free, it suffices to show that $(X_1,\dots,X_m)$ are linearly independent over $C^r(N)$. Suppose that there exist $f^1,\dots,f^m \in C^r(N)$ such that
\[
f^1 X_1 + \dots + f^m X_m = 0_{\Gamma^r(TM)}.
\]
Then, for every $x \in N$,
\[
f^1(x) X_1(x) + \dots + f^m(x) X_m(x) = 0_x \quad\Longrightarrow\quad f^1(x) = \dots = f^m(x) = 0,
\]
giving the desired linear independence.
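The coefficient formula from the proof of Proposition 5.2.1 can be checked on a concrete rank-2 distribution. The example below, on $M = \mathbb{R}^3$ with coordinates $(x,y,z)$, is an assumption made for illustration; the generators $X_1$, $X_2$ are not from the text.

```latex
\[
X_1 = \frac{\partial}{\partial x} + y\,\frac{\partial}{\partial z}, \qquad
X_2 = \frac{\partial}{\partial x} + \frac{\partial}{\partial y}, \qquad
\boldsymbol{X}_1 = \begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix}, \qquad
\boldsymbol{X}_0 = \begin{bmatrix} y & 0 \end{bmatrix},
\]
\[
\boldsymbol{f}_Y = \boldsymbol{X}_1^{-1}\boldsymbol{Y}_1
= \begin{bmatrix} 1 & -1\\ 0 & 1 \end{bmatrix}
\begin{bmatrix} Y^1\\ Y^2 \end{bmatrix}
= \begin{bmatrix} Y^1 - Y^2\\ Y^2 \end{bmatrix},
\qquad
\boldsymbol{X}_0 \boldsymbol{f}_Y = y\,(Y^1 - Y^2) = Y^3.
\]
```

So a vector field $Y$ with components $(Y^1, Y^2, Y^3)$ takes values in $D$ exactly when $Y^3 = y(Y^1 - Y^2)$, and in that case $Y = (Y^1 - Y^2) X_1 + Y^2 X_2$. The coefficients are as differentiable as the components of $Y$, which is the content of the finite generation argument.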

Of course, it does not hold that the set of sections of a regular distribution is globally free and finitely generated. For example, for $n \in \mathbb{Z}_{>0}$, the module of sections of $TS^{2n}$ is not free; this is the Hairy Ball Theorem; see [Milnor 1978] for a nice proof. However, one does have the following global result due in a slightly different form to Swan [1962].

5.2.2 Theorem (Swan's Theorem for distributions) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$, let $M$ be a smooth, paracompact, Hausdorff manifold of bounded dimension, and let $D$ be a regular distribution on $M$ of class $C^r$. Then $\Gamma^r(D)$ is a finitely generated projective module over $C^r(M)$; that is to say, $\Gamma^r(D)$ is a direct summand of a finitely generated free module over $C^r(M)$.

Proof By Theorem 5.1.2, let $X_1,\dots,X_k$ be generators for $D$. Let $\mathbb{R}^k_M$ denote the trivial vector bundle $M \times \mathbb{R}^k$ and define a vector bundle map $\phi\colon \mathbb{R}^k_M \to TM$ by
\[
\phi(x, (v^1,\dots,v^k)) = v^1 X_1(x) + \dots + v^k X_k(x).
\]
Clearly $\mathrm{image}(\phi) = D$. Since $D$ is regular, $\ker(\phi)$ is a subbundle of $\mathbb{R}^k_M$ [Abraham, Marsden, and Ratiu 1988, Proposition 3.4.18]. Let $\langle\cdot,\cdot\rangle$ be the standard inner product on $\mathbb{R}^k$ which we think of as a vector bundle metric on $\mathbb{R}^k_M$. Define $E_x$ to be the orthogonal complement to $\ker(\phi_x)$. Note that $E$ is then a subbundle of $\mathbb{R}^k_M$. We claim that $\phi|E$ is a $C^r$-vector bundle isomorphism onto $D$. Certainly $\phi|E$ is a $C^r$-vector bundle map. We claim that $\phi|E$ is onto $D$. Indeed, let $x \in M$ and let $v_x \in D_x$. Then, since $\mathrm{image}(\phi) = D$, there exists $(x,\boldsymbol{v}) \in \mathbb{R}^k_M$ such that $\phi(x,\boldsymbol{v}) = v_x$. Write $(x,\boldsymbol{v}) = (x,\boldsymbol{u}+\boldsymbol{w})$ for $(x,\boldsymbol{u}) \in \ker(\phi)$ and $(x,\boldsymbol{w}) \in E_x$. Then
\[
v_x = \phi(x,\boldsymbol{v}) = \phi(x,\boldsymbol{u}) + \phi(x,\boldsymbol{w}) = \phi(x,\boldsymbol{w}),
\]
and so $\phi|E$ is onto $D$. To show that $\phi|E$ is injective, suppose that $\phi(x,\boldsymbol{v}_1) = \phi(x,\boldsymbol{v}_2)$ for $(x,\boldsymbol{v}_1), (x,\boldsymbol{v}_2) \in E_x$. Then
\[
\phi(x,\boldsymbol{v}_2 - \boldsymbol{v}_1) = 0_x \quad\Longrightarrow\quad (x,\boldsymbol{v}_2 - \boldsymbol{v}_1) \in \ker(\phi) \quad\Longrightarrow\quad \boldsymbol{v}_2 - \boldsymbol{v}_1 = \boldsymbol{0},
\]
the last implication holding since $\boldsymbol{v}_2 - \boldsymbol{v}_1$ is also orthogonal to $\ker(\phi_x)$, as desired.

Now we recall that $C^r$-vector bundles over $M$ are isomorphic if and only if their sets of sections are isomorphic as $C^r(M)$-modules, cf. [Nelson 1967, §6]. Thus $\Gamma^r(D)$ and $\Gamma^r(E)$ are isomorphic. To complete the proof, note that $\Gamma^r(\mathbb{R}^k_M)$ is a finitely generated free module over $C^r(M)$, cf. the proof of Proposition 5.2.1. Moreover, any section $\xi$ of $\mathbb{R}^k_M$ with $\xi(x) = (x,\boldsymbol{v}(x))$ can be written uniquely as $\xi(x) = (x,\boldsymbol{u}(x)) + (x,\boldsymbol{w}(x))$ with $(x,\boldsymbol{u}(x)) \in \ker(\phi_x)$ and $(x,\boldsymbol{w}(x)) \in E_x$. That is, $\xi = \eta + \zeta$ for $\eta \in \Gamma^r(\ker(\phi))$ and $\zeta \in \Gamma^r(E)$, and this decomposition is unique. That is to say, $\Gamma^r(\mathbb{R}^k_M) = \Gamma^r(\ker(\phi)) \oplus \Gamma^r(E)$, the direct sum being one of $C^r(M)$-modules. This gives the theorem.
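The construction in the proof of Theorem 5.2.2 can be written out explicitly in a small case. The following sketch takes $M = S^1$ with angular coordinate $\theta$ and $D = TS^1$, with a deliberately redundant pair of generators; these choices are ours, made only to illustrate the splitting.

```latex
\[
X_1 = \cos\theta\,\frac{\partial}{\partial\theta}, \qquad
X_2 = \sin\theta\,\frac{\partial}{\partial\theta}, \qquad
\phi(\theta, (v^1, v^2)) = (v^1\cos\theta + v^2\sin\theta)\,\frac{\partial}{\partial\theta},
\]
\[
\ker(\phi_\theta) = \mathrm{span}_{\mathbb{R}}((-\sin\theta, \cos\theta)), \qquad
E_\theta = \mathrm{span}_{\mathbb{R}}((\cos\theta, \sin\theta)), \qquad
\phi(\theta, a(\cos\theta, \sin\theta)) = a\,\frac{\partial}{\partial\theta},
\]
\[
\boldsymbol{v} = \underbrace{(-v^1\sin\theta + v^2\cos\theta)(-\sin\theta, \cos\theta)}_{\boldsymbol{u}\,\in\,\ker(\phi_\theta)}
+ \underbrace{(v^1\cos\theta + v^2\sin\theta)(\cos\theta, \sin\theta)}_{\boldsymbol{w}\,\in\,E_\theta}.
\]
```

Thus $\phi|E$ is an isomorphism onto $TS^1$ and the free module of sections of the trivial rank-2 bundle splits as the direct sum of the sections of $\ker(\phi)$ and the sections of $E$. In this particular example the summand is itself free, being generated by $\frac{\partial}{\partial\theta}$; the point is only to display the mechanism of the proof.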


5.2.3 Remark (The point of Swan's Theorem) The theorem of Swan [1962] had to do with the classification of finitely generated projective modules over the ring of functions. Swan worked in the category of compact topological spaces and vector bundles over these. But the point is the same: The set of finitely generated projective modules over a ring of functions is in exact correspondence with the sets of sections of vector bundles (with suitable hypotheses).

The proof of the fact that to a $C^r$-vector bundle one assigns a finitely generated projective module follows easily from an adaptation of the proof of Theorem 5.2.2. Indeed, let $\pi\colon V \to M$ be a vector bundle with $M$ smooth, paracompact, Hausdorff, and of bounded dimension, and suppose the fibres of $V$ have bounded dimension. From Remark 5.1.4 we can find finitely many global generators. Then the proof of Theorem 5.2.2 is adapted by mere change of notation to show that the set of sections of $V$ is a finitely generated projective module.

To go the other way is more or less straightforward. Let $\mathscr{M}$ be a finitely generated projective module over $C^r(M)$, $r \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$. Then, by definition, there exists a module $\mathscr{N}$ over $C^r(M)$ such that
\[
\mathscr{M} \oplus \mathscr{N} \simeq \underbrace{C^r(M) \oplus \dots \oplus C^r(M)}_{k\ \text{factors}}.
\]
The direct sum on the right is naturally isomorphic to the set of sections of the trivial vector bundle $\mathbb{R}^k_M = M \times \mathbb{R}^k$. Thus we can write $\mathscr{M} \oplus \mathscr{N} = \Gamma^r(\mathbb{R}^k_M)$. For $a \in \{1,2\}$, let $\pi_a\colon \Gamma^r(\mathbb{R}^k_M) \to \Gamma^r(\mathbb{R}^k_M)$ be the projection onto the $a$th factor. As per [Nelson 1967, §6] (essentially), associated with $\pi_a$ is a vector bundle map $\hat{\pi}_a\colon \mathbb{R}^k_M \to \mathbb{R}^k_M$. Since $\pi_a \circ \pi_a = \pi_a$ (by virtue of $\pi_a$ being a projection), $\hat{\pi}_a \circ \hat{\pi}_a = \hat{\pi}_a$. To show that $\mathscr{M}$ is the set of sections of a vector subbundle of $\mathbb{R}^k_M$ it suffices to show that $\hat{\pi}_1$ has locally constant rank. Following along the lines of the proof of Proposition 5.1.7 one can show that $x \mapsto \mathrm{rank}(\hat{\pi}_{a,x})$ is lower semicontinuous for $a \in \{1,2\}$. However, since $\mathrm{rank}(\hat{\pi}_{1,x}) + \mathrm{rank}(\hat{\pi}_{2,x}) = k$ for all $x \in M$, if $x \mapsto \mathrm{rank}(\hat{\pi}_{1,x})$ is lower semicontinuous at $x_0$, then $x \mapsto \mathrm{rank}(\hat{\pi}_{2,x})$ is upper semicontinuous at $x_0$. Thus we conclude that both of these functions must be continuous at $x_0$. Since $x \mapsto \mathrm{rank}(\hat{\pi}_{1,x})$ is integer-valued, it must therefore be locally constant.

This correspondence between modules of sections of vector bundles and finitely generated projective modules is one of the correspondences between algebra and geometry that underlies the subject of algebraic geometry.

The following result is sometimes helpful in understanding the structure of distributions generated by a collection of vector fields.

5.2.4 Proposition (Distributions generated by vector fields) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X} \subset \Gamma^r(TM)$. Then, the distributions
(i) $D(\mathscr{X})$,
(ii) $D(T(\mathscr{X}))$, and
(iii) $D(\mathrm{span}_{\mathbb{R}}(\mathscr{X}))$
agree.

Proof We have
\[
\begin{aligned}
D(T(\mathscr{X}))_x &= \mathrm{span}_{\mathbb{R}}((f^1 X_1 + \dots + f^k X_k)(x) \mid k \in \mathbb{Z}_{>0},\ f^1,\dots,f^k \in C^r(M),\ X_1,\dots,X_k \in \mathscr{X})\\
&= \mathrm{span}_{\mathbb{R}}(a_1 X_1(x) + \dots + a_k X_k(x) \mid k \in \mathbb{Z}_{>0},\ a_1,\dots,a_k \in \mathbb{R},\ X_1,\dots,X_k \in \mathscr{X})\\
&= \mathrm{span}_{\mathbb{R}}(X(x) \mid X \in \mathscr{X}) = D(\mathscr{X})_x
\end{aligned}
\]
and
\[
\begin{aligned}
D(\mathrm{span}_{\mathbb{R}}(\mathscr{X}))_x &= \mathrm{span}_{\mathbb{R}}((a_1 X_1 + \dots + a_k X_k)(x) \mid k \in \mathbb{Z}_{>0},\ a_1,\dots,a_k \in \mathbb{R},\ X_1,\dots,X_k \in \mathscr{X})\\
&= \mathrm{span}_{\mathbb{R}}(a_1 X_1(x) + \dots + a_k X_k(x) \mid k \in \mathbb{Z}_{>0},\ a_1,\dots,a_k \in \mathbb{R},\ X_1,\dots,X_k \in \mathscr{X})\\
&= \mathrm{span}_{\mathbb{R}}(X(x) \mid X \in \mathscr{X}) = D(\mathscr{X})_x,
\end{aligned}
\]
as desired.

Thus, when talking about distributions generated by families of vector fields, we can, without loss of generality, suppose that the family of vector fields is a subspace or a submodule if this is useful. For a submodule $\mathscr{M}$ of $C^r$-vector fields, one can then ponder the relationships between $\mathscr{M}$ and $\Gamma^r(D(\mathscr{M}))$. Similarly, if $D$ is a distribution, one can ponder the relationships between $D$ and $D(\Gamma^r(D))$. Let us address these matters.

5.2.5 Proposition (Modules generated by sections of distributions, and distributions generated by modules) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, let $M$ be a smooth or analytic manifold, as is required, let $D$ be a $C^r$-distribution, and let $\mathscr{M} \subset \Gamma^r(TM)$ be a submodule. Then the following relationships hold:
(i) $D = D(\Gamma^r(D))$;
(ii) $\mathscr{M} \subset \Gamma^r(D(\mathscr{M}))$;
(iii) if $D(\mathscr{M})$ is regular then $\mathscr{M} = \Gamma^r(D(\mathscr{M}))$.

Proof (i) Let $v_x \in D_x$. Then, by Theorem 5.1.2 there exists $X \in \Gamma^r(D)$ such that $X(x) = v_x$. Therefore, $v_x \in D(\Gamma^r(D))$. Conversely, if $v_x \in D(\Gamma^r(D))$ then there exist $X_1,\dots,X_k \in \Gamma^r(D)$ and $c_1,\dots,c_k \in \mathbb{R}$ such that
\[
v_x = c_1 X_1(x) + \dots + c_k X_k(x).
\]
Since $X_1(x),\dots,X_k(x) \in D_x$, $v_x \in D_x$, as desired.

(ii) If $X \in \mathscr{M}$ then $X(x) \in D(\mathscr{M})_x$ for every $x \in M$. Thus $X \in \Gamma^r(D(\mathscr{M}))$.

(iii) Let $x \in M$. By definition of $D(\mathscr{M})$ there exist $X_1,\dots,X_m \in \mathscr{M}$ such that $(X_1(x),\dots,X_m(x))$ is a basis for $D(\mathscr{M})_x$. As we showed in the proof of Proposition 5.2.1, regularity of $D(\mathscr{M})$ implies that $(X_1,\dots,X_m)$ is a set of local generators for $\Gamma^r(D(\mathscr{M}))$ in a neighbourhood of $x$. Thus $\mathscr{M}$ contains a set of local generators for $\Gamma^r(D(\mathscr{M}))$ about every $x \in M$. Thus $\mathrm{span}_{C^r(M)}(\mathscr{M}) = \Gamma^r(D(\mathscr{M}))$.

The converse inclusion from the second part of the result is false in general.


5.2.6 Example (Generally, $\Gamma^r(D(\mathscr{M})) \not\subset \mathscr{M}$) Let $M = \mathbb{R}$ and let $\mathscr{M} \subset \Gamma^r(T\mathbb{R})$ be the submodule generated by $X = x^2 \frac{\partial}{\partial x}$. Then
\[
D(\mathscr{M})_x =
\begin{cases}
\{0\}, & x = 0,\\
T_x\mathbb{R}, & \text{otherwise}.
\end{cases}
\]
Then the vector field $X' = x \frac{\partial}{\partial x}$ is in $\Gamma^r(D(\mathscr{M}))$ but not in $\mathscr{M}$. Unlike many counterexamples we encounter to natural statements, this one does not rely on lack of analyticity. Instead it has to do with the character of singular points of a distribution.

5.2.2 Rings of germs of functions and modules of germs of vector fields

It is sometimes useful, particularly in the analytic case, to talk not about the module of vector fields over the ring of functions, but of the module of germs of vector fields over the ring of germs of functions. We can repeat the constructions using germs from Sections 2.4.4 and 2.4.5 for functions and vector fields with general differentiability properties. Let us see how to do this.

Let $M$ be a manifold of class $C^\infty$ or $C^\omega$, as is needed. Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$ and let $x_0 \in M$. Let $U_1$ and $U_2$ be neighbourhoods of $x_0$ and let $f_j \in C^r(U_j)$ and $X_j \in \Gamma^r(TU_j)$, $j \in \{1,2\}$. We say that the pairs $(f_1,U_1)$ and $(f_2,U_2)$ (resp. $(X_1,U_1)$ and $(X_2,U_2)$) are equivalent if there exists a neighbourhood $W$ of $x_0$ such that $W \subset U_1 \cap U_2$ and $f_1|W = f_2|W$ (resp. $X_1|W = X_2|W$). This notion of equivalence is easily verified to be an equivalence relation, and the set of equivalence classes is denoted by $C^r_{x_0}(M)$ (resp. $\Gamma^r_{x_0}(TM)$), and an equivalence class is called a germ of $C^r$-functions at $x_0$ (resp. germ of $C^r$-vector fields at $x_0$). We denote an equivalence class in $C^r_{x_0}(M)$ (resp. $\Gamma^r_{x_0}(TM)$) by $[(f,U)]_{x_0}$ (resp. $[(X,U)]_{x_0}$). Note that $C^r_{x_0}(M)$ is a ring with the operations of addition and multiplication given by
\[
\begin{aligned}
[(f,U)]_{x_0} + [(g,V)]_{x_0} &= [((f+g)|U \cap V,\ U \cap V)]_{x_0},\\
[(f,U)]_{x_0} \cdot [(g,V)]_{x_0} &= [((f \cdot g)|U \cap V,\ U \cap V)]_{x_0},
\end{aligned}
\]
and that $\Gamma^r_{x_0}(TM)$ is a module over the ring $C^r_{x_0}(M)$ with the operations of addition and multiplication defined by
\[
\begin{aligned}
[(X,U)]_{x_0} + [(Y,V)]_{x_0} &= [((X+Y)|U \cap V,\ U \cap V)]_{x_0},\\
[(f,U)]_{x_0} \cdot [(X,V)]_{x_0} &= [((f X)|U \cap V,\ U \cap V)]_{x_0}.
\end{aligned}
\]
We shall often abbreviate $[(f,U)]_{x_0}$ (resp. $[(X,U)]_{x_0}$) with $[f]_{x_0}$ (resp. $[X]_{x_0}$) when the neighbourhood is not relevant.

Note that one can define germs at all points in the manifold $M$, and one can imagine that, rather than talking about functions (which assign to each point a number) and sections (which assign to each point a tangent vector), one can talk about assigning to each point a germ of functions and a germ of vector fields. To make this precise and add structure to such constructions, one uses sheaf theory, something we will not discuss here; see [Kashiwara and Schapira 1990].


Why would this ring and module structure for germs be of interest, as opposed to just thinking about sections? In nonlinear control theory one is often interested in local properties of a system, and germs capture exactly these local properties in an intrinsic way. Moreover, this module structure sometimes has properties that are of great value. We shall next explore this, and in doing so will come to understand some distinctions between the sets of analytic and smooth functions and vector fields.

5.2.3 Analytic germs

In Sections 2.4.4 and 2.4.5 we devoted significant effort to describing the algebraic properties of the ring of germs of analytic functions and the module of germs of analytic sections of a real analytic vector bundle. Let us summarise what we showed there.

5.2.7 Properties of analytic functions and analytic vector fields Let $M$ be an analytic manifold and let $x_0 \in M$. The following facts hold.
1. The ring $C^\omega_{x_0}(M)$ is a local ring (Theorem 2.4.22).
2. The ring $C^\omega_{x_0}(M)$ is a unique factorisation domain (Theorem 2.4.24).
3. The module $\Gamma^\omega_{x_0}(TM)$ is Noetherian (Theorem 2.4.27).
4. If $\mathscr{M} \subset \Gamma^\omega(TM)$ is a submodule, let $\Gamma_x(\mathscr{M}) = \{[X]_x \mid X \in \mathscr{M}\}$ be the submodule of $\Gamma^\omega_x(TM)$ consisting of germs of vector fields from $\mathscr{M}$. If $[X_1]_{x_0}, \dots, [X_k]_{x_0}$ generate $\Gamma_{x_0}(\mathscr{M})$, then there exists a neighbourhood $N$ of $x_0$ such that $[X_1]_x, \dots, [X_k]_x$ generate $\Gamma_x(\mathscr{M})$ for each $x \in N$ (Theorem 2.4.28).

It might be tempting to believe that the finitely generated property holds globally. However, this is false.

5.2.8 Example (Noetherianness in the analytic case is only local$^1$) We will show that $C^\omega(\mathbb{R})$ is not Noetherian. Recall that $\sin \in C^\omega(\mathbb{R})$ and that $x \mapsto \sin(x)$ admits the product representation
\[ \sin(x) = x \prod_{j=1}^{\infty} \Bigl(1 - \frac{x^2}{j^2\pi^2}\Bigr), \]
which converges uniformly on every compact subset of $\mathbb{R}$ to $x \mapsto \sin(x)$. Let $f_k$, $k \in \mathbb{Z}_{>0}$, be defined by
\[ f_k(x) = x \prod_{j=k}^{\infty} \Bigl(1 - \frac{x^2}{j^2\pi^2}\Bigr), \]
and let $I_k$ be the ideal in $C^\omega(\mathbb{R})$ generated by $f_k$. Thus $I_k = \{f f_k \mid f \in C^\omega(\mathbb{R})\}$. Note that
\[ f_k(x) = f_{k+1}(x) \Bigl(1 - \frac{x^2}{k^2\pi^2}\Bigr), \]
showing that $I_k \subset I_{k+1}$, $k \in \mathbb{Z}_{>0}$. Note that if $f \in I_k$ then $f(k\pi) = 0$. Since $f_{k+1}(k\pi) \neq 0$, we conclude that $f_{k+1} \notin I_k$ and so we in fact have $I_k \subsetneq I_{k+1}$, $k \in \mathbb{Z}_{>0}$. Thus we have a chain
\[ I_1 \subsetneq I_2 \subsetneq \cdots \]
that is not finite. Thus $C^\omega(\mathbb{R})$ is not Noetherian, as claimed.

Note that the ideal $I = \bigcup_{k \in \mathbb{Z}_{>0}} I_k$ is not finitely generated. (Indeed, were $I$ to be generated by analytic functions $g_1, \dots, g_m$, there would necessarily be some $k \in \mathbb{Z}_{>0}$ such that $I = (g_1, \dots, g_m) \subset I_k$. But this implies that $I_j = I_k$ for all $j \ge k$, contradicting what we have already shown.) Therefore, it follows that if we define a submodule $\mathscr{M}$ of $\Gamma^\omega(T\mathbb{R})$ by
\[ \mathscr{M} = \Bigl\{ f \frac{\partial}{\partial x} \Bigm| f \in I \Bigr\}, \]
then $\mathscr{M}$ is not finitely generated.

$^1$ The author would like to thank Mike Roth for suggesting this example.

5.2.4 Smooth germs

In the $C^\infty$ case, the ring of germs of functions is not Noetherian. We can illustrate this with an example and references to commutative algebra, as per [Eisenbud 1995].

5.2.9 Example (The module of germs of smooth functions is not Noetherian) Let $M = \mathbb{R}$ and consider the ring $C^\infty_0(\mathbb{R})$ of germs of functions at $0$. Consider the map $\mathrm{Ev}_0 \colon C^\infty_0(\mathbb{R}) \to \mathbb{R}$ given by $\mathrm{Ev}_0([(f, U)]_0) = f(0)$. This is a surjective homomorphism of rings and the kernel $\mathfrak{m}$ is the unique maximal ideal in $C^\infty_0(\mathbb{R})$, i.e., $C^\infty_0(\mathbb{R})$ is a local ring. (It is clearly an ideal. It is maximal because if $[(f, U)]_0 \notin \mathfrak{m}$ then $f(0) \neq 0$ and so $f$ is a unit in $C^\infty_0(\mathbb{R})$.) We claim that this ideal is generated by $[(\mathrm{id}_{\mathbb{R}}, \mathbb{R})]_0$. Indeed, let $[(f, U)]_0 \in \mathfrak{m}$ and write, for $x \in U$,
\[ f(x) = \int_0^x f'(\xi)\, d\xi = x \int_0^1 f'(x\eta)\, d\eta. \]
The function $\tilde{f} \colon x \mapsto \int_0^1 f'(x\eta)\, d\eta$ is infinitely differentiable and so $f = \mathrm{id}_{\mathbb{R}} \cdot \tilde{f}$. Thus every function $f$ vanishing at $0$ is a product of $\mathrm{id}_{\mathbb{R}}$ and an infinitely differentiable function, i.e., $f(x) = x \tilde{f}(x)$. This shows that the maximal ideal $\mathfrak{m}$ is principal, being generated by $[(\mathrm{id}_{\mathbb{R}}, \mathbb{R})]_0$.

Now, for $k \in \mathbb{Z}_{>0}$, $\mathfrak{m}^k$ is the set of all germs of functions in $C^\infty_0(\mathbb{R})$ whose derivatives of order $0, 1, \dots, k-1$ vanish at $0$, and so $\bigcap_{k \in \mathbb{Z}_{>0}} \mathfrak{m}^k$ is the set of germs of functions in $C^\infty_0(\mathbb{R})$, all of whose derivatives vanish at $0$. Note that $\bigcap_{k \in \mathbb{Z}_{>0}} \mathfrak{m}^k \neq \{[(0, \mathbb{R})]_0\}$ since, for example, if
\[ f(x) = \begin{cases} e^{-1/x^2}, & x \neq 0, \\ 0, & x = 0, \end{cases} \]


then $[(f, \mathbb{R})]_0 \in \bigcap_{k \in \mathbb{Z}_{>0}} \mathfrak{m}^k$, but $[(f, \mathbb{R})]_0$ is not the zero germ. We now can immediately conclude from the Krull Intersection Theorem (Theorem 2.4.14) that $C^\infty_0(\mathbb{R})$ is not Noetherian.

With respect to vector fields this gives the following consequence.

5.2.10 Example (A distribution with generators generating an infinitely generated module) We again take $M = \mathbb{R}$ and define a distribution $\mathsf{D}$ on $M$ by
\[ \mathsf{D}_x = \begin{cases} T_x\mathbb{R}, & x \neq 0, \\ \{0\}, & x = 0. \end{cases} \]
More or less obviously, $\Gamma^\infty(\mathsf{D})$ consists of those smooth vector fields which vanish at $0$. We shall choose a stupid set of generators for $\mathsf{D}$. Define a submodule $\mathscr{X} \subset \Gamma^\infty(\mathsf{D})$ by
\[ \mathscr{X} = \bigl\{ X \in \Gamma^\infty(\mathsf{D}) \bigm| \lim_{x \to 0} x^{-k} X(x) = 0,\ k \in \mathbb{Z}_{>0} \bigr\}. \]
Note that $\mathscr{X}$ contains the smooth vector field $X(x) = f(x) \frac{\partial}{\partial x}$, where
\[ f(x) = \begin{cases} e^{-1/x^2}, & x \neq 0, \\ 0, & x = 0. \end{cases} \]
Therefore, $\mathscr{X}$ generates $\mathsf{D}$. We claim that $\mathscr{X}$ is not finitely generated. Indeed, we can see that the vector fields $X_j$, $j \in \mathbb{Z}_{>0}$, defined by $X_j(x) = x^{-j} X(x)$ are linearly dependent over $C^\infty(\mathbb{R})$. Note, however, that $\mathsf{D}$ admits the perfectly nice generator $X'(x) = x \frac{\partial}{\partial x}$, and the module generated by $X'$ is finitely generated. The difference between $X$ and $X'$ is that $X$ is smooth but not analytic, whereas $X'$ is analytic. Make sure you understand that the preceding example does not contradict Theorem 5.1.2.
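The flatness of $e^{-1/x^2}$ at $0$, which drives both of the preceding examples, can be checked numerically. The following is only a sketch: the helper names `flat` and `ratio` are ours, and a floating-point experiment illustrates, but of course does not prove, that $x^{-k} f(x) \to 0$ as $x \to 0$ for each fixed $k$.

```python
import math


def flat(x):
    """The flat function f(x) = exp(-1/x^2) for x != 0, with f(0) = 0."""
    return math.exp(-1.0 / x**2) if x != 0 else 0.0


def ratio(x, k):
    """Evaluate x^(-k) f(x); for each fixed k this tends to 0 as x -> 0."""
    return flat(x) / x**k


if __name__ == "__main__":
    # For every fixed k the quotient shrinks as x approaches 0, even
    # though x^(-k) alone blows up.
    for k in (1, 5, 10):
        print(k, [ratio(x, k) for x in (0.5, 0.2, 0.1, 0.05)])
```

No polynomial factor $x^{-k}$ can compensate for the decay of $f$, which is why all derivatives of $f$ vanish at $0$ and why $f \frac{\partial}{\partial x}$ lies in the submodule $\mathscr{X}$ of Example 5.2.10.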

5.3 The Orbit Theorem and some consequences


In this section we consider an important theorem that is not so immediately connected to the theory of distributions, but, as we shall see, leads to important theorems regarding special classes of distributions. Moreover, the ideas we present here provide some of the important first steps towards controllability theory. Contributions to the Orbit Theorem have been made by Hermann [1962], Krener [1974], Lobry [1970], Stefan [1974a,b], and Sussmann [1973]. Our presentation follows that of Agrachev and Sachkov [2004, Chapter 5].

5.3.1 Lie algebras of vector fields

For $r \in \{\infty, \omega\}$ and for a $C^r$-manifold $M$, we note that $\Gamma^r(TM)$ has the structure of a Lie algebra. This Lie algebra structure can be defined most conveniently by recalling that the set of vector fields is in one-to-one correspondence with the set of derivations of $C^r(M)$ (see [Abraham, Marsden, and Ratiu 1988, Theorem 4.2.16] for the classical $r = \infty$ case and [Grabowski 1981] for the $r = \omega$ case). The derivation associated to a vector field $X$, denoted by $\mathscr{L}_X$, is defined simply by $\mathscr{L}_X f(x) = \langle df(x); X(x) \rangle$. For $X, Y \in \Gamma^r(TM)$ we note that $f \mapsto \mathscr{L}_X \mathscr{L}_Y f - \mathscr{L}_Y \mathscr{L}_X f$ is a derivation (one can just check this). Thus, associated to this derivation is a unique vector field that we denote by $[X, Y]$, which is the Lie bracket of $X$ and $Y$. This Lie bracket is easily verified to have the properties that render $\Gamma^r(TM)$ an $\mathbb{R}$-Lie algebra:
1. the map $(X, Y) \mapsto [X, Y]$ is $\mathbb{R}$-bilinear;
2. $[X, X] = 0_{\Gamma^r(TM)}$ for all $X \in \Gamma^r(TM)$;
3. $[X, [Y, Z]] + [Z, [X, Y]] + [Y, [Z, X]] = 0_{\Gamma^r(TM)}$ for all $X, Y, Z \in \Gamma^r(TM)$ (the Jacobi identity).

A subset $\mathscr{L} \subset \Gamma^r(TM)$ is a Lie subalgebra if it is a subspace of the $\mathbb{R}$-vector space $\Gamma^r(TM)$ and if $[X, Y] \in \mathscr{L}$ for every $X, Y \in \mathscr{L}$. If $\mathscr{X} \subset \Gamma^r(TM)$ then $\mathscr{L}^{(\infty)}(\mathscr{X})$ denotes the smallest Lie subalgebra containing $\mathscr{X}$. We can explicitly describe this Lie subalgebra.

5.3.1 Proposition (The Lie algebra generated by a set of vector fields) Let $r \in \{\infty, \omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X} \subset \Gamma^r(TM)$. The Lie algebra $\mathscr{L}^{(\infty)}(\mathscr{X})$ is comprised of finite $\mathbb{R}$-linear combinations of vector fields of the form
\[ [X_k, [X_{k-1}, \dots, [X_2, X_1] \cdots ]], \qquad k \in \mathbb{Z}_{>0},\ X_1, \dots, X_k \in \mathscr{X}. \]
Proof For vector fields $X_1, \dots, X_k \in \mathscr{X}$, since $\mathscr{L}^{(\infty)}(\mathscr{X})$ is a Lie subalgebra of $\Gamma^r(TM)$, it follows by induction that
\[ [X_k, [X_{k-1}, \dots, [X_2, X_1] \cdots ]] \in \mathscr{L}^{(\infty)}(\mathscr{X}). \]
Since $\mathscr{L}^{(\infty)}(\mathscr{X})$ is a subspace of the $\mathbb{R}$-vector space $\Gamma^r(TM)$, it also follows that all $\mathbb{R}$-linear combinations of such vector fields are in $\mathscr{L}^{(\infty)}(\mathscr{X})$.

To prove the opposite inclusion, it suffices to show (since $\mathscr{L}^{(\infty)}(\mathscr{X})$ is the smallest Lie subalgebra containing $\mathscr{X}$) that the set of all $\mathbb{R}$-linear combinations in the statement of the proposition forms a Lie algebra. If we have two $\mathbb{R}$-linear combinations of vector fields of the form stated in the proposition, their Lie bracket will be in $\mathscr{L}^{(\infty)}(\mathscr{X})$ if and only if the Lie bracket of each of the summands is in $\mathscr{L}^{(\infty)}(\mathscr{X})$ (by linearity of the Lie bracket). Consider two vector fields of the form stated in the proposition:
\[ X = [X_k, [X_{k-1}, \dots, [X_2, X_1] \cdots ]], \qquad Y = [Y_l, [Y_{l-1}, \dots, [Y_2, Y_1] \cdots ]]. \]
We shall prove by induction on $k$ that $[X, Y] \in \mathscr{L}^{(\infty)}(\mathscr{X})$ for any $k$ and $l$. Note that $[X, Y] \in \mathscr{L}^{(\infty)}(\mathscr{X})$ for any $Y$ and $l$, and for $k = 1$. Now suppose this is true for $k = 1, \dots, m$. Then, taking $k = m + 1$, we have
\[ [X, Y] = [[X_{m+1}, X_1'], Y], \]
where $X_1' = [X_m, \dots, [X_2, X_1] \cdots ]$. By the Jacobi identity we have
\[ [[X_{m+1}, X_1'], Y] + [[Y, X_{m+1}], X_1'] + [[X_1', Y], X_{m+1}] = 0_{\Gamma^r(TM)}. \]
This gives
\[ [X, Y] = [X_1', [Y, X_{m+1}]] + [X_{m+1}, [X_1', Y]]. \]
By the induction hypothesis, $[X_1', [X_{m+1}, Y]] \in \mathscr{L}^{(\infty)}(\mathscr{X})$ since $X_1'$ is a bracket of length $m$. Also $[X_1', Y] \in \mathscr{L}^{(\infty)}(\mathscr{X})$, so the second term on the right is in $\mathscr{L}^{(\infty)}(\mathscr{X})$. Thus the set of linear combinations of the form stated in the proposition forms a Lie subalgebra, giving the result.

Note that since $\mathscr{L}^{(\infty)}(\mathscr{X})$ is a subspace, one immediately has $\mathscr{L}^{(\infty)}(\mathscr{X}) = \mathscr{L}^{(\infty)}(\mathrm{span}_{\mathbb{R}}(\mathscr{X}))$. However, it is not generally the case that $\mathscr{L}^{(\infty)}(\mathscr{X})$ is a submodule. In this respect, however, the following result is useful.

5.3.2 Proposition (The Lie algebra generated by a submodule of vector fields is a submodule) If $r \in \{\infty, \omega\}$, if $M$ is a $C^r$-manifold, and if $\mathscr{M} \subset \Gamma^r(TM)$ is a submodule, then $\mathscr{L}^{(\infty)}(\mathscr{M})$ is a submodule of $\Gamma^r(TM)$.
Proof By Proposition 5.3.1 it suffices to show that, for any $f \in C^r(M)$ and for any $X_1, \dots, X_k \in \mathscr{M}$,
\[ f [X_k, [X_{k-1}, \dots, [X_2, X_1] \cdots ]] \in \mathscr{L}^{(\infty)}(\mathscr{M}). \]
We prove this by induction on $k$, it clearly being true for $k = 1$ since $\mathscr{M}$ is a submodule. Assume now that the statement holds for $k \in \{1, \dots, m\}$. Then
\[ f [X_{m+1}, [X_m, \dots, [X_2, X_1] \cdots ]] = [f X_{m+1}, [X_m, \dots, [X_2, X_1] \cdots ]] + (\mathscr{L}_{[X_m, \dots, [X_2, X_1] \cdots ]} f) X_{m+1}. \]
By the induction hypothesis, $(\mathscr{L}_{[X_m, \dots, [X_2, X_1] \cdots ]} f) X_{m+1} \in \mathscr{L}^{(\infty)}(\mathscr{M})$. Since $f X_{m+1} \in \mathscr{M}$, by Proposition 5.3.1 it follows that
\[ [f X_{m+1}, [X_m, \dots, [X_2, X_1] \cdots ]] \in \mathscr{L}^{(\infty)}(\mathscr{M}), \]
giving the result.
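The proof above, and the one for Proposition 5.3.3, lean on how the Lie bracket interacts with multiplication by functions. For convenience, here is a short worked derivation of the two identities involved, using only the definition of the bracket as a commutator of derivations; the test function $g \in C^r(M)$ is our notation.

```latex
% For f, g in C^r(M) and X, Y in \Gamma^r(TM):
\begin{align*}
\mathscr{L}_{[fX, Y]} g
  &= f\,\mathscr{L}_X \mathscr{L}_Y g - \mathscr{L}_Y\bigl(f\,\mathscr{L}_X g\bigr) \\
  &= f\,\mathscr{L}_X \mathscr{L}_Y g - (\mathscr{L}_Y f)\,\mathscr{L}_X g
     - f\,\mathscr{L}_Y \mathscr{L}_X g \\
  &= f\,\mathscr{L}_{[X, Y]} g - (\mathscr{L}_Y f)\,\mathscr{L}_X g,
\end{align*}
% so that
\begin{equation*}
[fX, Y] = f\,[X, Y] - (\mathscr{L}_Y f)\,X,
\qquad\text{equivalently}\qquad
f\,[X, Y] = [fX, Y] + (\mathscr{L}_Y f)\,X .
\end{equation*}
% Skew-symmetry of the bracket then gives the companion identity
\begin{equation*}
[X, gY] = g\,[X, Y] + (\mathscr{L}_X g)\,Y .
\end{equation*}
```

Taking $X = X_{m+1}$ and $Y = [X_m, \dots, [X_2, X_1] \cdots ]$ in the second form of the first identity gives exactly the decomposition used in the induction step.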

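Associated with the collection of vector fields $\mathscr{L}^{(\infty)}(\mathscr{X})$ is the distribution $\mathsf{D}(\mathscr{L}^{(\infty)}(\mathscr{X}))$, which we abbreviate by $\mathsf{L}^{(\infty)}(\mathscr{X})$. The following result simplifies some parts of the subsequent discussion.

5.3.3 Proposition (Characterisation of $\mathsf{L}^{(\infty)}(\mathscr{X})$) Let $r \in \{\infty, \omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X} \subset \Gamma^r(TM)$. Then the distributions
(i) $\mathsf{L}^{(\infty)}(\mathscr{X})$,
(ii) $\mathsf{L}^{(\infty)}(\mathscr{T}(\mathscr{X}))$,
(iii) $\mathsf{L}^{(\infty)}(\mathrm{span}_{\mathbb{R}}(\mathscr{X}))$, and
(iv) $\mathsf{D}(\mathscr{T}(\mathscr{L}^{(\infty)}(\mathscr{X})))$
agree.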


Proof It is clear that
\[ \mathsf{L}^{(\infty)}(\mathscr{X}) \subset \mathsf{L}^{(\infty)}(\mathrm{span}_{\mathbb{R}}(\mathscr{X})) \subset \mathsf{L}^{(\infty)}(\mathscr{T}(\mathscr{X})). \]
We will show that $\mathsf{L}^{(\infty)}(\mathscr{T}(\mathscr{X})) \subset \mathsf{L}^{(\infty)}(\mathscr{X})$. By Proposition 5.3.1 it suffices to show that
\[ [Y_k, [Y_{k-1}, \dots, [Y_2, Y_1] \cdots ]](x) \in \mathsf{L}^{(\infty)}(\mathscr{X})_x \]
for every $Y_1, \dots, Y_k \in \mathscr{T}(\mathscr{X})$ and for every $x \in M$. We prove this by first showing that
\[ [Y_k, [Y_{k-1}, \dots, [Y_2, Y_1] \cdots ]] = f^1 Z_1 + \dots + f^s Z_s \]
for every $Y_1, \dots, Y_k \in \mathscr{T}(\mathscr{X})$, where $f^1, \dots, f^s \in C^r(M)$ and where
\[ Z_j = [X_{j,l}, [X_{j,l-1}, \dots, [X_{j,2}, X_{j,1}] \cdots ]] \]
for $X_{j,1}, \dots, X_{j,l} \in \mathscr{X}$ with $l \in \{1, \dots, k\}$. This we prove by induction on $k$. It is clearly true for $k = 1$, so suppose it holds for $k \in \{1, \dots, m\}$ and let $Y_1, \dots, Y_m, Y_{m+1} \in \mathscr{T}(\mathscr{X})$. Write
\[ Y_{m+1} = f^1 X_1 + \dots + f^s X_s, \qquad f^1, \dots, f^s \in C^r(M),\ X_1, \dots, X_s \in \mathscr{X}. \]
Then
\[ [Y_{m+1}, [Y_m, \dots, [Y_2, Y_1] \cdots ]] = \sum_{j=1}^{s} \Bigl( f^j [X_j, [Y_m, \dots, [Y_2, Y_1] \cdots ]] - (\mathscr{L}_{[Y_m, \dots, [Y_2, Y_1] \cdots ]} f^j) X_j \Bigr). \]
By the induction hypothesis,
\[ [Y_m, \dots, [Y_2, Y_1] \cdots ] = g^1 Z_1 + \dots + g^d Z_d \]
for $g^1, \dots, g^d \in C^r(M)$ and where
\[ Z_a = [X_{a,l_a}, [X_{a,l_a - 1}, \dots, [X_{a,2}, X_{a,1}] \cdots ]] \]
for $X_{a,1}, \dots, X_{a,l_a} \in \mathscr{X}$ and where $l_a \in \{1, \dots, m\}$ for each $a \in \{1, \dots, d\}$. Then
\begin{align*}
[X_j, [Y_m, \dots, [Y_2, Y_1] \cdots ]] &= \sum_{a=1}^{d} [X_j, g^a [X_{a,l_a}, [X_{a,l_a - 1}, \dots, [X_{a,2}, X_{a,1}] \cdots ]]] \\
&= \sum_{a=1}^{d} \Bigl( g^a [X_j, [X_{a,l_a}, [X_{a,l_a - 1}, \dots, [X_{a,2}, X_{a,1}] \cdots ]]] + (\mathscr{L}_{X_j} g^a) [X_{a,l_a}, [X_{a,l_a - 1}, \dots, [X_{a,2}, X_{a,1}] \cdots ]] \Bigr).
\end{align*}
This proves that $[Y_{m+1}, [Y_m, \dots, [Y_2, Y_1] \cdots ]]$ has the desired form.

From this it immediately follows from Proposition 5.3.1 that
\[ [Y_k, [Y_{k-1}, \dots, [Y_2, Y_1] \cdots ]](x) \in \mathsf{L}^{(\infty)}(\mathscr{X})_x \]
for every $Y_1, \dots, Y_k \in \mathscr{T}(\mathscr{X})$ and for every $x \in M$, and so the first three distributions in the statement of the proposition are equal. The equality of these distributions with the fourth distribution in the statement of the proposition follows from Proposition 5.2.4.


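The preceding constructions can be applied to the special case where $\mathscr{X} = \Gamma^r(\mathsf{D})$ consists of the $\mathsf{D}$-valued vector fields of class $C^r$, where $\mathsf{D}$ is a $C^r$-distribution. In this case we use the following abbreviated notation:
\[ \mathscr{L}^{(\infty)}(\mathsf{D}) = \mathscr{L}^{(\infty)}(\Gamma^r(\mathsf{D})), \qquad \mathsf{L}^{(\infty)}(\mathsf{D}) = \mathsf{L}^{(\infty)}(\Gamma^r(\mathsf{D})). \]
Note that since $\Gamma^r(\mathsf{D})$ is a submodule of $\Gamma^r(TM)$, it follows from Proposition 5.3.2 that $\mathscr{L}^{(\infty)}(\mathsf{D})$ is also a submodule of $\Gamma^r(TM)$.

Recall that at the end of Section 5.2.1 we indicated that for a submodule $\mathscr{M}$ of vector fields, $\mathscr{M} \subset \Gamma^r(\mathsf{D}(\mathscr{M}))$, and the inclusion is, in general, strict. Moreover, unlike many of our anomalies, this one is not a result of analyticity. Nonetheless, the following result concerning the distribution generated by the Lie algebras generated by these modules holds.

5.3.4 Theorem (Sometimes $\mathsf{L}^{(\infty)}(\mathscr{X}) = \mathsf{L}^{(\infty)}(\mathsf{D}(\mathscr{X}))$) Let $r \in \{\infty, \omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X} \subset \Gamma^r(TM)$ be such that $\mathscr{L}^{(\infty)}(\mathscr{X})$ is a locally finitely generated submodule of vector fields. Then $\mathsf{L}^{(\infty)}(\mathscr{X}) = \mathsf{L}^{(\infty)}(\mathsf{D}(\mathscr{X}))$.
Proof Our proof will rely on the Orbit Theorem in the version of Theorem 5.3.21. First of all, note that, since the $\mathscr{X}$-orbit through a point $x_0$ is the set of points reached by flowing forwards and backwards along vector fields from $\mathscr{X}$,
\[ \mathrm{Orb}(x_0, \mathscr{X}) = \mathrm{Orb}(x_0, \mathscr{X} \cup -\mathscr{X}). \]
By Theorem 4.4.19,
\[ \mathrm{Orb}(x_0, \mathscr{X} \cup -\mathscr{X}) = \mathrm{Orb}(x_0, \Gamma^r(\mathsf{D}(\mathscr{X}))), \]
since
\[ \mathrm{conv\,cone}(\{X(x) \mid X \in \mathscr{X} \cup -\mathscr{X}\}) = \mathrm{span}_{\mathbb{R}}(\{X(x) \mid X \in \mathscr{X} \cup -\mathscr{X}\}) = \mathsf{D}(\mathscr{X})_x. \]
Therefore, by Theorem 5.3.21,
\[ \mathsf{L}^{(\infty)}(\mathscr{X})_x = T_x \mathrm{Orb}(x, \mathscr{X}) = T_x \mathrm{Orb}(x, \Gamma^r(\mathsf{D}(\mathscr{X}))) = \mathsf{L}^{(\infty)}(\mathsf{D}(\mathscr{X}))_x, \]
as claimed.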

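The result has two interesting and often applicable corollaries.

5.3.5 Corollary (In the smooth constant rank case, $\mathsf{L}^{(\infty)}(\mathscr{X}) = \mathsf{L}^{(\infty)}(\mathsf{D}(\mathscr{X}))$) Let $M$ be a $C^\infty$-manifold, let $\mathscr{X} \subset \Gamma^\infty(TM)$, and let $x$ be a regular point of $\mathsf{L}^{(\infty)}(\mathscr{X})$. Then $\mathsf{L}^{(\infty)}(\mathscr{X})_x = \mathsf{L}^{(\infty)}(\mathsf{D}(\mathscr{X}))_x$.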
Proof This follows from Theorem 5.3.4, along with Proposition 5.2.1.

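5.3.6 Corollary (In the analytic case, $\mathsf{L}^{(\infty)}(\mathscr{X}) = \mathsf{L}^{(\infty)}(\mathsf{D}(\mathscr{X}))$) Let $M$ be a $C^\omega$-manifold and let $\mathscr{X} \subset \Gamma^\omega(TM)$. Then $\mathsf{L}^{(\infty)}(\mathscr{X})_x = \mathsf{L}^{(\infty)}(\mathsf{D}(\mathscr{X}))_x$ for every $x \in M$.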
Proof This follows from Theorem 5.3.4, along with Theorem 2.4.28.

Let us consider a few examples that illustrate the subtlety of the preceding results.


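5.3.7 Examples (The relationship between $\mathsf{L}^{(\infty)}(\mathscr{X})$ and $\mathsf{L}^{(\infty)}(\mathsf{D}(\mathscr{X}))$) Here we take $M = \mathbb{R}^2$ and define smooth vector fields $X_1$ and $X_2$ on $\mathbb{R}^2$ by
\[ X_1(x_1, x_2) = \frac{\partial}{\partial x_1}, \qquad X_2(x_1, x_2) = f(x_1) \frac{\partial}{\partial x_2}, \]
where $f \colon \mathbb{R} \to \mathbb{R}$ is any smooth function vanishing only at $x = 0$. We shall take $\mathscr{X} = (X_1, X_2)$ and consider a few $f$'s.
1. First let us consider $f(x) = x$. We compute
\[ [X_1, X_2](x_1, x_2) = \frac{\partial}{\partial x_2}. \]
Therefore, $\mathsf{L}^{(\infty)}(\mathscr{X}) = T\mathbb{R}^2$. Thus we must have $\mathsf{L}^{(\infty)}(\mathscr{X}) = \mathsf{L}^{(\infty)}(\mathsf{D}(\mathscr{X}))$. Since $f$ is analytic, this is in agreement with Corollary 5.3.6.
2. Next consider $f(x) = x^2$. In this case we have
\[ [X_1, X_2](x_1, x_2) = 2 x_1 \frac{\partial}{\partial x_2}, \qquad [X_1, [X_1, X_2]](x_1, x_2) = 2 \frac{\partial}{\partial x_2}. \]
Thus we can again conclude that $\mathsf{L}^{(\infty)}(\mathscr{X}) = T\mathbb{R}^2$, implying that $\mathsf{L}^{(\infty)}(\mathscr{X}) = \mathsf{L}^{(\infty)}(\mathsf{D}(\mathscr{X}))$. This again is consistent with Corollary 5.3.6. Note, however, that the distribution $\mathsf{L}^{(\infty)}(\mathscr{X})$ is generated by different brackets than was the case when we took $f(x) = x$. Thus the fact that $\mathsf{L}^{(\infty)}(\mathscr{X}) = \mathsf{L}^{(\infty)}(\mathsf{D}(\mathscr{X}))$ is less obvious in this case.
3. The final case we consider is
\[ f(x) = \begin{cases} e^{-1/x^2}, & x \neq 0, \\ 0, & x = 0. \end{cases} \]
We note that
\[ \mathsf{D}(\mathscr{X})_{(x_1, x_2)} = \begin{cases} T_{(x_1, x_2)} \mathbb{R}^2, & x_1 \neq 0, \\ \mathrm{span}_{\mathbb{R}}\bigl(\frac{\partial}{\partial x_1}\bigr), & x_1 = 0. \end{cases} \]
Thus, for example, the vector fields
\[ X_1'(x_1, x_2) = \frac{\partial}{\partial x_1}, \qquad X_2'(x_1, x_2) = x_1 \frac{\partial}{\partial x_2} \]
generate $\mathsf{D}(\mathscr{X})$. As above, one can compute $[X_1', X_2'] = \frac{\partial}{\partial x_2}$, and so we have $\mathsf{L}^{(\infty)}(\mathsf{D}(\mathscr{X})) = T\mathbb{R}^2$. However, one can easily show that $\mathsf{L}^{(\infty)}(\mathscr{X}) = \mathsf{D}(\mathscr{X})$ and so $\mathsf{L}^{(\infty)}(\mathscr{X}) \subsetneq \mathsf{L}^{(\infty)}(\mathsf{D}(\mathscr{X}))$. Again, this is a problem with the generators $\mathscr{X}$ for $\mathsf{D}(\mathscr{X})$ being smooth but not analytic.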
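The bracket computations in the first two cases of Examples 5.3.7 are easy to reproduce mechanically. The following is a minimal sketch for polynomial vector fields on $\mathbb{R}^2$ only, using the coordinate formula $[X, Y]^i = \sum_j (X^j \partial_j Y^i - Y^j \partial_j X^i)$. The representation (a polynomial as a dictionary from exponent pairs to coefficients) and the helper names `p_add`, `p_mul`, `p_diff` and `bracket` are our own choices, not notation from these notes. The third case is not polynomial and is not covered.

```python
from collections import defaultdict

# A polynomial in (x1, x2) is a dict {(i, j): c} meaning c * x1^i * x2^j.
# A vector field on R^2 is a pair (P1, P2) of such polynomials, standing
# for P1 d/dx1 + P2 d/dx2.


def p_add(p, q, sign=1):
    """Return p + sign * q, dropping zero coefficients."""
    out = defaultdict(int)
    for mono, c in p.items():
        out[mono] += c
    for mono, c in q.items():
        out[mono] += sign * c
    return {mono: c for mono, c in out.items() if c != 0}


def p_mul(p, q):
    """Return the product p * q."""
    out = defaultdict(int)
    for (a1, a2), c in p.items():
        for (b1, b2), d in q.items():
            out[(a1 + b1, a2 + b2)] += c * d
    return {mono: c for mono, c in out.items() if c != 0}


def p_diff(p, var):
    """Partial derivative with respect to x1 (var=0) or x2 (var=1)."""
    out = {}
    for mono, c in p.items():
        if mono[var] > 0:
            lowered = list(mono)
            lowered[var] -= 1
            out[tuple(lowered)] = c * mono[var]
    return out


def bracket(X, Y):
    """Lie bracket [X, Y] with components sum_j (X^j d_j Y^i - Y^j d_j X^i)."""
    comps = []
    for i in range(2):
        comp = {}
        for j in range(2):
            comp = p_add(comp, p_mul(X[j], p_diff(Y[i], j)))
            comp = p_add(comp, p_mul(Y[j], p_diff(X[i], j)), sign=-1)
        comps.append(comp)
    return tuple(comps)


X1 = ({(0, 0): 1}, {})            # d/dx1
X2_linear = ({}, {(1, 0): 1})     # x1 d/dx2      (case f(x) = x)
X2_quad = ({}, {(2, 0): 1})       # x1^2 d/dx2    (case f(x) = x^2)

if __name__ == "__main__":
    print(bracket(X1, X2_linear))               # d/dx2
    print(bracket(X1, X2_quad))                 # 2 x1 d/dx2
    print(bracket(X1, bracket(X1, X2_quad)))    # 2 d/dx2
```

In the quadratic case the first bracket still vanishes along $x_1 = 0$, and it is only the second-order bracket that spans the missing direction there, which is the point made in the second case of the example.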

5.3.2 Immersed submanifolds

Embedded submanifolds are easy to understand in that their very definition gives rise to a natural differentiable structure. In this section we give a brief discussion of immersed submanifolds and their topological and differentiable structure. We begin with the definition, giving the definition of an embedded submanifold alongside that for an immersed submanifold, so that the distinction is more apparent.

5.3.8 Definition (Embedded and immersed submanifolds) Let $r \in \mathbb{Z}_{>0} \cup \{\infty, \omega\}$ and let $N$ and $M$ be $C^r$-manifolds.
(i) A $C^r$-injective immersion $f \colon N \to M$ is a $C^r$-embedding if $f(N)$ is a submanifold.
(ii) A subset $S \subset M$ is a $C^r$-immersed submanifold if there exists a manifold $N$ and a $C^r$-injective immersion $f \colon N \to M$ for which $S = \mathrm{image}(f)$.

Let $S \subset M$ be a $C^r$-immersed submanifold, and let $x \in S$. By well-known theorems on the local character of immersions [Abraham, Marsden, and Ratiu 1988, Theorem 3.5.7], there exists a chart $(U, \phi)$ around $x$ with the following properties:
1. $\phi$ takes values in $\mathbb{R}^{n-m} \times \mathbb{R}^m$;
2. $\phi(U \cap S)$ consists of possibly infinitely many disconnected components (in the standard topology on $\mathbb{R}^{n-m} \times \mathbb{R}^m$), exactly one of which contains $\phi(x)$;
3. the connected component of $\phi(U \cap S)$ containing $\phi(x)$ is an open subset of $\{(\boldsymbol{x}, \boldsymbol{0}) \mid \boldsymbol{x} \in \mathbb{R}^{n-m}\}$.

We then define a chart for $S$ around $x$ as follows. We define
\[ U_x = \phi^{-1}\bigl( \{(\boldsymbol{x}, \boldsymbol{0}) \mid \boldsymbol{x} \in \mathbb{R}^{n-m}\} \cap \phi(U) \bigr), \]
and we define $\phi_x(y) = \mathrm{pr}_1 \circ \phi(y)$ for each $y \in U_x$, where $\mathrm{pr}_1 \colon \mathbb{R}^{n-m} \times \mathbb{R}^m \to \mathbb{R}^{n-m}$ is projection onto the first factor. One can readily verify that $(U_x, \phi_x)$ is a chart. Furthermore, two such charts whose domains intersect will satisfy the overlap condition, by virtue of the fact that the corresponding charts for $M$ satisfy the overlap condition. It is clear that $\mathscr{A}_S = \{(U_x, \phi_x)\}_{x \in S}$ defines an atlas for $S$. Thus we see that an immersed submanifold obtains a differentiable structure from $M$. Since an immersed submanifold is defined to be the image of a manifold $N$ under an injective immersion $f \colon N \to M$, it also follows that $S$ inherits a differentiable structure from $N$ via $f$. One can show that these two differentiable structures for $S$ agree.

Now let us say a few words about the topology on an immersed submanifold $S \subset M$. Such an immersed submanifold possesses two natural topologies: (1) the manifold topology for the differentiable structure defined above and (2) the subspace topology inherited from $M$. These topologies agree if and only if $S$ is an embedded submanifold. Moreover, the manifold topology is stronger than the subspace topology. By this we mean that all subsets that are open in the subspace topology are also open in the manifold topology, but there may be subsets that are open in the manifold topology that are not open in the subspace topology. An example illustrates this.


5.3.9 Example (Submanifolds of $\mathbb{T}^2$) We think of the torus $\mathbb{T}^2$ as being the set of equivalence classes of the equivalence relation on $\mathbb{R}^2$ defined by
\[ (x_1, x_2) \sim (y_1, y_2) \iff (x_1 - y_1, x_2 - y_2) \in \mathbb{Z}^2. \]
We denote by $\pi \colon \mathbb{R}^2 \to \mathbb{T}^2 \simeq \mathbb{R}^2/\!\sim$ the canonical projection. Now, for $a = (a_1, a_2) \in \mathbb{R}^2$, define a submanifold $\tilde{S}_a$ of $\mathbb{R}^2$ by
\[ \tilde{S}_a = \{(a_1 t, a_2 t) \mid t \in \mathbb{R}\}, \]
and denote $S_a = \pi(\tilde{S}_a)$. One can show the following.
1. If $(a_1, a_2)$ is linearly independent over $\mathbb{Q}$ (think of $\mathbb{R}$ as a $\mathbb{Q}$-vector space), then $S_a$ is an immersed, but not embedded, submanifold. Moreover, the differentiable structure described above makes $S_a$ diffeomorphic to $\mathbb{R}$.
2. If $(a_1, a_2)$ is linearly dependent over $\mathbb{Q}$, then $S_a$ is an embedded submanifold. Moreover, the differentiable structure described above makes $S_a$ diffeomorphic to $\mathbb{S}^1$.
A reader for whom these statements are not obvious might benefit from thinking about them for a while.

5.3.3 Orbits

The Orbit Theorem has to do with. . . well. . . orbits of a family of vector fields. We shall define the meaning of this in this section.

First let us define a group that can be associated with a family of vector fields. In order to do this, we first introduce some terminology regarding local diffeomorphisms.

5.3.10 Definition (Local diffeomorphisms, groups of local diffeomorphisms) Let $r \in \{\infty, \omega\}$ and let $M$ be a $C^r$-manifold.
(i) A $C^r$-local diffeomorphism on $M$ is a pair $(\Phi, U)$ where $U \subset M$ is a (possibly empty) open subset called the domain and where $\Phi \colon U \to \Phi(U)$ is a $C^r$-diffeomorphism. The image of $(\Phi, U)$ is the open set $\Phi(U)$.
(ii) If $(\Phi, U)$ and $(\Psi, V)$ are $C^r$-local diffeomorphisms, their composition $(\Psi, V) \circ (\Phi, U)$ is the $C^r$-local diffeomorphism $(\Psi \circ \Phi|\Phi^{-1}(V), \Phi^{-1}(V))$.
(iii) If $(\Phi, U)$ is a $C^r$-local diffeomorphism, its inverse $(\Phi, U)^{-1}$ is the $C^r$-local diffeomorphism $(\Phi^{-1}, \Phi(U))$.
(iv) A group of $C^r$-local diffeomorphisms is a family $\mathscr{G}$ of $C^r$-local diffeomorphisms such that, if $(\Phi, U), (\Psi, V) \in \mathscr{G}$, then $(\Psi, V) \circ (\Phi, U) \in \mathscr{G}$ and $(\Phi, U)^{-1} \in \mathscr{G}$.

The notion of a local diffeomorphism with an empty domain is possibly confusing. However, in cases where such local diffeomorphisms arise, they are simply ignorable. So this should not be dwelled upon.


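We shall often make a slight abuse of notation by denoting a local diffeomorphism by $\Phi$ rather than $(\Phi, U)$. In cases when we do this, we believe there will be no loss in clarity.

Next we consider local diffeomorphisms generated by a family of vector fields. We recall that if $X \in \Gamma^1(TM)$ then $\Phi^X_t(x)$ is defined for $(t, x)$ in an open subset of $\mathbb{R} \times M$ that we denote by $D(X)$. For a vector field $X$, we denote by $I(X, x_0) \subset \mathbb{R}$ the domain of the maximal integral curve of $X$ through $x_0$. For $t \in \mathbb{R}$ we denote by $U(X, t)$ the largest (possibly empty) open subset of $M$ such that $(\Phi^X_t, U(X, t))$ is a local diffeomorphism.

5.3.11 Definition (Local group of diffeomorphisms generated by a family of vector fields) Let $r \in \{\infty, \omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X} \subset \Gamma^r(TM)$. By $\mathrm{Diff}(\mathscr{X})$ we denote the group of local diffeomorphisms generated by the local diffeomorphisms
\[ \{(\Phi^X_t, U(X, t)) \mid X \in \mathscr{X},\ t \in \mathbb{R}\}. \]

Let us obtain a concrete description of $\mathrm{Diff}(\mathscr{X})$. First of all, to simplify notation, since the open set $U(X, t)$ associated with the mapping $\Phi^X_t$ is implicit, we shall write $\Phi^X_t$ for the local diffeomorphism $(\Phi^X_t, U(X, t))$. Then, if $\boldsymbol{X} = (X_1, \dots, X_k)$ is a family of vector fields from $\mathscr{X}$ and if $\boldsymbol{t} = (t_1, \dots, t_k) \in \mathbb{R}^k$, then we denote
\[ \Phi^{\boldsymbol{X}}_{\boldsymbol{t}} = \Phi^{X_k}_{t_k} \circ \dots \circ \Phi^{X_1}_{t_1}, \]
which we think of as a composition of local diffeomorphisms and so a local diffeomorphism. For $x \in M$ we shall also write
\[ \Phi^{\boldsymbol{X}}_{\boldsymbol{t}}(x) = \Phi^{X_k}_{t_k} \circ \dots \circ \Phi^{X_1}_{t_1}(x), \]
with the understanding that this is defined if $x$ is in the domain of the local diffeomorphism $\Phi^{\boldsymbol{X}}_{\boldsymbol{t}}$. The set of such $x$'s we denote by $U(\boldsymbol{X}, \boldsymbol{t})$, noting that this is an open subset of $M$. One can then directly verify that
\[ \mathrm{Diff}(\mathscr{X}) = \bigl\{ \Phi^{\boldsymbol{X}}_{\boldsymbol{t}} \bigm| \boldsymbol{X} \in \mathscr{X}^k,\ \boldsymbol{t} \in \mathbb{R}^k,\ k \in \mathbb{Z}_{>0} \bigr\}. \]
The preceding discussion is greatly complicated by the fact that we allow vector fields from $\mathscr{X}$ to possibly not be complete. If all vector fields from $\mathscr{X}$ are complete, then one easily sees that
\[ \mathrm{Diff}(\mathscr{X}) = \bigl\{ \Phi^{X_k}_{t_k} \circ \dots \circ \Phi^{X_1}_{t_1} \bigm| X_1, \dots, X_k \in \mathscr{X},\ t_1, \dots, t_k \in \mathbb{R} \bigr\}. \]
The reader would benefit by keeping this special case in mind.

We can now define what we mean by an orbit for a family of vector fields.

5.3.12 Definition (Orbit) Let $r \in \{\infty, \omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X} \subset \Gamma^r(TM)$ be a family of $C^r$-vector fields. The orbit of $\mathscr{X}$ through $x_0 \in M$ is the set
\[ \mathrm{Orb}(x_0, \mathscr{X}) = \bigl\{ \Phi^{\boldsymbol{X}}_{\boldsymbol{t}}(x_0) \bigm| \boldsymbol{X} \in \mathscr{X}^k,\ \boldsymbol{t} \in \mathbb{R}^k,\ k \in \mathbb{Z}_{>0} \bigr\}. \]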


Note that two distinct orbits are disjoint. Thus the set of orbits defines a partition of $M$.

Let us study the concept of an orbit by using some examples. Our examples all involve complete vector fields, so obviating some of the complications of the constructions above.

5.3.13 Examples (Orbits)
1. In Example 5.3.9 we considered a family $(S_a)_{a \in \mathbb{R}^2}$ of submanifolds of $\mathbb{T}^2$ that were immersed or embedded submanifolds, depending on $a$. We can adapt this example to illustrate some properties of orbits. For $a = (a_1, a_2) \in \mathbb{R}^2$ define a vector field $\tilde{X}_a$ on $\mathbb{R}^2$ by
\[ \tilde{X}_a = a_1 \frac{\partial}{\partial x_1} + a_2 \frac{\partial}{\partial x_2}. \]
The integral curve of $\tilde{X}_a$ through $(x_{10}, x_{20})$ is $t \mapsto (x_{10} + a_1 t, x_{20} + a_2 t)$, and we note that the image of this integral curve through $(0, 0)$ is the submanifold $\tilde{S}_a$ from Example 5.3.9. Now recall the projection $\pi \colon \mathbb{R}^2 \to \mathbb{T}^2$ from Example 5.3.9, and note that the vector field $\tilde{X}_a$ projects to a well-defined vector field $X_a$ on $\mathbb{T}^2$ by $X_a(\pi(x_1, x_2)) = T_{(x_1, x_2)}\pi(\tilde{X}_a)$. If we take $\mathscr{X} = (X_a)$, then the orbits of $\mathscr{X}$ are precisely the images of the integral curves of $X_a$. The point is that if $(a_1, a_2)$ is linearly independent over $\mathbb{Q}$ then $\mathrm{Orb}(\pi(x_1, x_2), \mathscr{X})$ is an immersed, but not embedded, submanifold of $\mathbb{T}^2$. If $(a_1, a_2)$ is linearly dependent over $\mathbb{Q}$ then $\mathrm{Orb}(\pi(x_1, x_2), \mathscr{X})$ is an embedded submanifold of $\mathbb{T}^2$.
2. We take $M = \mathbb{R}^2$ and define
\[ X_1 = x_1 \frac{\partial}{\partial x_1}, \qquad X_2 = x_2 \frac{\partial}{\partial x_2}. \]
The flows of $X_1$ and $X_2$ are easily determined explicitly. Using these flows one can readily determine the orbits for $\mathscr{X} = (X_1, X_2)$. Let us illustrate how to do this in two cases; the other cases follow in the same manner.
(a) $x_0 = (x_{01}, x_{02})$ with $x_{01} \in \mathbb{R}_{>0}$ and $x_{02} = 0$: Since $X_2 = 0$ on the $x_1$-axis and since $X_1$ is tangent to the $x_1$-axis, $\mathrm{Orb}(x_0, \mathscr{X})$ will be contained in the $x_1$-axis. Moreover, if $x_1 \in \mathbb{R}_{>0}$ and if we define $t_1 = \log\frac{x_1}{x_{01}}$, then $\Phi^{X_1}_{t_1}(x_0) = (x_1, 0)$. Moreover, for any $t \in \mathbb{R}$,
\[ \Phi^{X_1}_t(x_0) \in \{(x_1, 0) \mid x_1 \in \mathbb{R}_{>0}\}. \]
Thus we must have $\mathrm{Orb}(x_0, \mathscr{X}) = \{(x_1, 0) \mid x_1 \in \mathbb{R}_{>0}\}$.


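(b) $x_0 = (x_{01}, x_{02})$ with $x_{01}, x_{02} \in \mathbb{R}_{>0}$: Here we let $(x_1, x_2) \in \mathbb{R}^2$ with $x_1, x_2 \in \mathbb{R}_{>0}$. We then define $t_1 = \log\frac{x_1}{x_{01}}$ and $t_2 = \log\frac{x_2}{x_{02}}$ and note that $\Phi^{X_2}_{t_2} \circ \Phi^{X_1}_{t_1}(x_0) = (x_1, x_2)$. Moreover, for $t_1, \dots, t_k \in \mathbb{R}$ and for $j_1, \dots, j_k \in \{1, 2\}$,
\[ \Phi^{X_{j_k}}_{t_k} \circ \dots \circ \Phi^{X_{j_1}}_{t_1}(x_0) \in \{(x_1, x_2) \mid x_1, x_2 \in \mathbb{R}_{>0}\}. \]
This shows that $\mathrm{Orb}(x_0, \mathscr{X}) = \{(x_1, x_2) \mid x_1, x_2 \in \mathbb{R}_{>0}\}$.
In any case, it is easy to see that there are nine distinct orbits for the family of vector fields $\mathscr{X} = (X_1, X_2)$, and these are determined to be
\begin{align*}
\mathrm{Orb}((0, 0), \mathscr{X}) &= \{(0, 0)\}, \\
\mathrm{Orb}((1, 0), \mathscr{X}) &= \{(x_1, 0) \mid x_1 \in \mathbb{R}_{>0}\}, \\
\mathrm{Orb}((-1, 0), \mathscr{X}) &= \{(x_1, 0) \mid x_1 \in \mathbb{R}_{<0}\}, \\
\mathrm{Orb}((0, 1), \mathscr{X}) &= \{(0, x_2) \mid x_2 \in \mathbb{R}_{>0}\}, \\
\mathrm{Orb}((0, -1), \mathscr{X}) &= \{(0, x_2) \mid x_2 \in \mathbb{R}_{<0}\}, \\
\mathrm{Orb}((1, 1), \mathscr{X}) &= \{(x_1, x_2) \mid x_1, x_2 \in \mathbb{R}_{>0}\}, \\
\mathrm{Orb}((-1, 1), \mathscr{X}) &= \{(x_1, x_2) \mid x_1 \in \mathbb{R}_{<0},\ x_2 \in \mathbb{R}_{>0}\}, \\
\mathrm{Orb}((-1, -1), \mathscr{X}) &= \{(x_1, x_2) \mid x_1, x_2 \in \mathbb{R}_{<0}\}, \\
\mathrm{Orb}((1, -1), \mathscr{X}) &= \{(x_1, x_2) \mid x_1 \in \mathbb{R}_{>0},\ x_2 \in \mathbb{R}_{<0}\}.
\end{align*}
We depict these orbits in Figure 5.1.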

Figure 5.1 Orbits

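3. We let $M = \mathbb{R}^2$ and define
\[ X_1 = \frac{\partial}{\partial x_1}, \qquad X_2 = f(x_1) \frac{\partial}{\partial x_2}, \]
where
\[ f(x) = \begin{cases} e^{-1/x^2}, & x \in \mathbb{R}_{>0}, \\ 0, & x \in \mathbb{R}_{\le 0}. \end{cases} \]
We take $\mathscr{X} = (X_1, X_2)$ and claim that $\mathrm{Orb}(x, \mathscr{X}) = \mathbb{R}^2$ for every $x \in \mathbb{R}^2$. It suffices to show that, for example, $\mathrm{Orb}(0, \mathscr{X}) = \mathbb{R}^2$. For $(x_1, x_2) \in \mathbb{R}^2$ with $x_1 > 0$ we define $t_1 = x_1$ and $t_2 = \frac{x_2}{f(x_1)}$, and directly compute
\[ \Phi^{X_2}_{t_2} \circ \Phi^{X_1}_{t_1}(0) = (x_1, x_2). \]
If $(x_1, x_2) \in \mathbb{R}^2$ with $x_1 \le 0$ we define $t_1 = 1$, $t_2 = \frac{x_2}{f(1)}$, and $t_3 = -1 + x_1$, and directly compute
\[ \Phi^{X_1}_{t_3} \circ \Phi^{X_2}_{t_2} \circ \Phi^{X_1}_{t_1}(0) = (x_1, x_2). \]
This shows that $\mathrm{Orb}(0, \mathscr{X}) = \mathbb{R}^2$ as desired.

5.3.4 Fixed-time orbits

In this section we consider a modification of the notion of an orbit as defined in the previous section. Let $r \in \{\infty, \omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X} \subset \Gamma^r(TM)$ be a family of vector fields. Above we defined $\mathrm{Diff}(\mathscr{X})$ as the group of local diffeomorphisms generated by flows of vector fields from $\mathscr{X}$. In this section we modify this construction slightly to give the orbit corresponding to flows whose total time is fixed.

To make this construction, we first consider flows whose total time is zero. For convenience and to reestablish notation, we recall the explicit characterisation of $\mathrm{Diff}(\mathscr{X})$ from above. As above, for a vector field $X$ we shall often denote the local diffeomorphism $(\Phi^X_t, U(X, t))$ simply by $\Phi^X_t$. Let $\boldsymbol{X} = (X_1, \dots, X_k)$ be a finite family of vector fields from $\mathscr{X}$ and let $\boldsymbol{t} = (t_1, \dots, t_k) \in \mathbb{R}^k$. Then we define a local diffeomorphism
\[ \Phi^{\boldsymbol{X}}_{\boldsymbol{t}} = \Phi^{X_k}_{t_k} \circ \dots \circ \Phi^{X_1}_{t_1}, \]
understanding implicitly that this is only interesting when the resulting composition has nonempty domain. With this notation,
\[ \mathrm{Diff}(\mathscr{X}) = \bigl\{ \Phi^{\boldsymbol{X}}_{\boldsymbol{t}} \bigm| \boldsymbol{X} \in \mathscr{X}^k,\ \boldsymbol{t} \in \mathbb{R}^k,\ k \in \mathbb{Z}_{>0} \bigr\}. \]
We now define a subgroup $\mathrm{Diff}_0(\mathscr{X})$ of $\mathrm{Diff}(\mathscr{X})$ by
\[ \mathrm{Diff}_0(\mathscr{X}) = \Bigl\{ \Phi^{\boldsymbol{X}}_{\boldsymbol{t}} \Bigm| \boldsymbol{X} \in \mathscr{X}^k,\ \boldsymbol{t} \in \mathbb{R}^k,\ \sum_{j=1}^{k} t_j = 0,\ k \in \mathbb{Z}_{>0} \Bigr\}. \]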
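The following property of $\mathrm{Diff}_0(\mathscr{X})$ is useful in understanding some of the subsequent constructions.

5.3.14 Proposition ($\mathrm{Diff}_0(\mathscr{X})$ is a normal subgroup of $\mathrm{Diff}(\mathscr{X})$) Let $r \in \{\infty, \omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X} \subset \Gamma^r(TM)$. Then the following statements hold:
(i) $\mathrm{Diff}_0(\mathscr{X})$ is a subgroup of the group $\mathrm{Diff}(\mathscr{X})$ of local diffeomorphisms; that is, if $(\Phi, U), (\Psi, V) \in \mathrm{Diff}_0(\mathscr{X})$, then $(\Phi, U)^{-1} \in \mathrm{Diff}_0(\mathscr{X})$ and $(\Phi, U) \circ (\Psi, V) \in \mathrm{Diff}_0(\mathscr{X})$;
(ii) $\mathrm{Diff}_0(\mathscr{X})$ is a normal subgroup; that is, if $(\Phi, U) \in \mathrm{Diff}_0(\mathscr{X})$ and if $(\Psi, V) \in \mathrm{Diff}(\mathscr{X})$, then $(\Psi, V) \circ (\Phi, U) \circ (\Psi, V)^{-1} \in \mathrm{Diff}_0(\mathscr{X})$.
Proof (i) Let $\boldsymbol{X} = (X_1, \dots, X_k)$ and $\boldsymbol{Y} = (Y_1, \dots, Y_m)$ be families of vector fields and $\boldsymbol{t} = (t_1, \dots, t_k)$ and $\boldsymbol{s} = (s_1, \dots, s_m)$ be families of real numbers such that $\Phi = \Phi^{\boldsymbol{X}}_{\boldsymbol{t}}$ and $\Psi = \Phi^{\boldsymbol{Y}}_{\boldsymbol{s}}$. Thus
\[ \sum_{j=1}^{k} t_j = \sum_{l=1}^{m} s_l = 0. \]
Then $(\Phi, U)^{-1}$ is defined by
\[ \Phi^{X_1}_{-t_1} \circ \dots \circ \Phi^{X_k}_{-t_k}, \]
and so $(\Phi, U)^{-1} \in \mathrm{Diff}_0(\mathscr{X})$. Similarly, $(\Phi, U) \circ (\Psi, V)$ is defined by
\[ \Phi^{X_k}_{t_k} \circ \dots \circ \Phi^{X_1}_{t_1} \circ \Phi^{Y_m}_{s_m} \circ \dots \circ \Phi^{Y_1}_{s_1}, \]
and so $(\Phi, U) \circ (\Psi, V) \in \mathrm{Diff}_0(\mathscr{X})$.
(ii) Let $\boldsymbol{X} = (X_1, \dots, X_k)$ and $\boldsymbol{Y} = (Y_1, \dots, Y_m)$ be families of vector fields and $\boldsymbol{t} = (t_1, \dots, t_k)$ and $\boldsymbol{s} = (s_1, \dots, s_m)$ be families of real numbers such that $\Phi = \Phi^{\boldsymbol{X}}_{\boldsymbol{t}}$ and $\Psi = \Phi^{\boldsymbol{Y}}_{\boldsymbol{s}}$. Thus
\[ \sum_{j=1}^{k} t_j = 0. \]
Note that $(\Psi, V) \circ (\Phi, U) \circ (\Psi, V)^{-1}$ is defined by
\[ \Phi^{Y_m}_{s_m} \circ \dots \circ \Phi^{Y_1}_{s_1} \circ \Phi^{X_k}_{t_k} \circ \dots \circ \Phi^{X_1}_{t_1} \circ \Phi^{Y_1}_{-s_1} \circ \dots \circ \Phi^{Y_m}_{-s_m}, \]
whose total time is $\sum_{l} s_l + \sum_{j} t_j - \sum_{l} s_l = 0$, and so $(\Psi, V) \circ (\Phi, U) \circ (\Psi, V)^{-1} \in \mathrm{Diff}_0(\mathscr{X})$, as desired.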

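Now let $T \in \mathbb{R}$. Define
\[ \mathrm{Diff}_T(\mathscr{X}) = \Bigl\{ \Phi^{\boldsymbol{X}}_{\boldsymbol{t}} \Bigm| \boldsymbol{X} \in \mathscr{X}^k,\ \boldsymbol{t} \in \mathbb{R}^k,\ \sum_{j=1}^{k} t_j = T,\ k \in \mathbb{Z}_{>0} \Bigr\}. \]
We can now make the following definition.

5.3.15 Definition (Fixed-time orbit) Let $r \in \{\infty, \omega\}$, let $M$ be a $C^r$-manifold, let $\mathscr{X} \subset \Gamma^r(TM)$ be a family of $C^r$-vector fields, and let $T \in \mathbb{R}$. The $T$-orbit of $\mathscr{X}$ through $x_0 \in M$ is the set
\[ \mathrm{Orb}_T(x_0, \mathscr{X}) = \Bigl\{ \Phi^{\boldsymbol{X}}_{\boldsymbol{t}}(x_0) \Bigm| \boldsymbol{X} \in \mathscr{X}^k,\ \boldsymbol{t} \in \mathbb{R}^k,\ \sum_{j=1}^{k} t_j = T,\ k \in \mathbb{Z}_{>0} \Bigr\}. \]
A fixed-time orbit of $\mathscr{X}$ through $x_0 \in M$ is a set of the form $\mathrm{Orb}_T(x_0, \mathscr{X})$ for some $T \in \mathbb{R}$.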

5.3.5 The Orbit Theorem

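Before we state the Orbit Theorem we need some terminology and notation. Let $r \in \{\infty, \omega\}$ and let $M$ be a $C^r$-manifold. For a $C^r$-local diffeomorphism $(\Phi, U)$ and for $X \in \Gamma^r(TU)$, denote $\mathrm{Ad}_\Phi X = \Phi_* X$; this is the push-forward of $X$ by $\Phi$. Thus $\mathrm{Ad}_\Phi X$ is the vector field on $\Phi(U)$ defined by
\[ \mathrm{Ad}_\Phi X(x) = T_{\Phi^{-1}(x)}\Phi \circ X \circ \Phi^{-1}(x). \]
If $X \in \Gamma^r(TM)$ then we will write $\mathrm{Ad}_\Phi X$ as an abbreviation for $\mathrm{Ad}_\Phi (X|U)$. With this terminology, we are ready to state the Orbit Theorem.

5.3.16 Theorem (Orbit Theorem) Let $r \in \{\infty, \omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X} \subset \Gamma^r(TM)$ be a family of vector fields. Then, for each $x_0 \in M$,
(i) $\mathrm{Orb}(x_0, \mathscr{X})$ is a connected immersed $C^r$-submanifold of $M$ and
(ii) for each $x \in \mathrm{Orb}(x_0, \mathscr{X})$, $T_x \mathrm{Orb}(x_0, \mathscr{X}) = \mathrm{span}_{\mathbb{R}}(\mathrm{Ad}_\Phi X(x) \mid \Phi \in \mathrm{Diff}(\mathscr{X}),\ X \in \mathscr{X})$.
Proof Let us denote a family of vector fields by
\[ \mathscr{O}(\mathscr{X}) = \{\mathrm{Ad}_\Phi X \mid \Phi \in \mathrm{Diff}(\mathscr{X}),\ X \in \mathscr{X}\}, \]
noting that $\mathrm{Ad}_\Phi X$ is defined on $\Phi(U)$ where $U$ is the domain of the local diffeomorphism $\Phi$. We also define a distribution $\mathsf{O}$ by
\[ \mathsf{O}_x = \mathrm{span}_{\mathbb{R}}(\mathrm{Ad}_\Phi X(x) \mid \Phi \in \mathrm{Diff}(\mathscr{X}),\ X \in \mathscr{X}). \]
The following lemma gives a useful property of these subspaces.

1 Lemma For $x_0 \in M$ and for $x \in \mathrm{Orb}(x_0, \mathscr{X})$, $\dim(\mathsf{O}_x) = \dim(\mathsf{O}_{x_0})$.
Proof If $x \in \mathrm{Orb}(x_0, \mathscr{X})$ then there exists $\Psi \in \mathrm{Diff}(\mathscr{X})$ such that $x = \Psi(x_0)$. Let $\Phi \in \mathrm{Diff}(\mathscr{X})$ be such that $x_0$ is in the image of $\Phi$ and let $X \in \mathscr{X}$, so that $\mathrm{Ad}_\Phi X(x_0) \in \mathsf{O}_{x_0}$. Then
\begin{align*}
T_{x_0}\Psi(\mathrm{Ad}_\Phi X(x_0)) &= T_{x_0}\Psi \circ T_{\Phi^{-1}(x_0)}\Phi \circ X \circ \Phi^{-1}(x_0) \\
&= T_{x_0}\Psi \circ T_{\Phi^{-1}(x_0)}\Phi \circ X \circ \Phi^{-1} \circ \Psi^{-1} \circ \Psi(x_0) \\
&= T_{(\Psi \circ \Phi)^{-1}(x)}(\Psi \circ \Phi) \circ X \circ (\Psi \circ \Phi)^{-1}(x) \\
&= \mathrm{Ad}_{\Psi \circ \Phi} X(x) \in \mathsf{O}_x,
\end{align*}
since $\Psi \circ \Phi \in \mathrm{Diff}(\mathscr{X})$. Since $T_{x_0}\Psi$ is an isomorphism, this shows that $\dim(\mathsf{O}_x) \ge \dim(\mathsf{O}_{x_0})$. This argument can be reversed to give the opposite inequality.

For $x \in M$ suppose that $\dim(\mathsf{O}_x) = m_x$. Let $\boldsymbol{Y}_x = (Y_1, \dots, Y_{m_x}) \subset \mathscr{O}(\mathscr{X})$ be such that $(Y_1(x), \dots, Y_{m_x}(x))$ is a basis for $\mathsf{O}_x$. Define $\phi_x \colon \mathbb{R}^{m_x} \to M$ by
\[ \phi_x(t_1, \dots, t_{m_x}) = \Phi^{Y_{m_x}}_{t_{m_x}} \circ \dots \circ \Phi^{Y_1}_{t_1}(x). \]
Since $\frac{\partial \phi_x}{\partial t_j}$ is equal to $Y_j(x)$ when evaluated at $t_1 = \dots = t_{m_x} = 0$, it follows that $\phi_x$ is an embedding in a neighbourhood of $\boldsymbol{0} \in \mathbb{R}^{m_x}$; let us denote the image of this neighbourhood under $\phi_x$ by $U(\boldsymbol{Y}_x)$. Thus $U(\boldsymbol{Y}_x)$ is a $C^r$-submanifold.

2 Lemma $U(\boldsymbol{Y}_x) \subset \mathrm{Orb}(x, \mathscr{X})$.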


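Proof By definition we can write $Y_j = \mathrm{Ad}_{\Phi_j} X_j$ for $\Phi_j \in \mathrm{Diff}(\mathscr{X})$ and $X_j \in \mathscr{X}$, $j \in \{1, \dots, m_x\}$. We recall from [Abraham, Marsden, and Ratiu 1988, Proposition 4.2.4] the formula $\Phi^Y_t = \Phi \circ \Phi^X_t \circ \Phi^{-1}$, which holds if $Y = \mathrm{Ad}_\Phi X$ for vector fields $X$ and $Y$ and a diffeomorphism $\Phi$. Using this formula we have, for $\boldsymbol{t} \in \phi_x^{-1}(U(\boldsymbol{Y}_x))$,
\[ \phi_x(\boldsymbol{t}) = \Phi^{Y_{m_x}}_{t_{m_x}} \circ \dots \circ \Phi^{Y_1}_{t_1}(x) = \Phi_{m_x} \circ \Phi^{X_{m_x}}_{t_{m_x}} \circ \Phi_{m_x}^{-1} \circ \dots \circ \Phi_1 \circ \Phi^{X_1}_{t_1} \circ \Phi_1^{-1}(x), \]
showing that $U(\boldsymbol{Y}_x) \subset \mathrm{Orb}(x, \mathscr{X})$, as desired, since all mappings in the above composition are in $\mathrm{Diff}(\mathscr{X})$.

3 Lemma $T_y U(\boldsymbol{Y}_x) = \mathsf{O}_y$ for every $y \in U(\boldsymbol{Y}_x)$.
Proof Let $j \in \{1, \dots, m_x\}$ and let
\[ \Psi_{\boldsymbol{t}} = \Phi^{Y_{m_x}}_{t_{m_x}} \circ \dots \circ \Phi^{Y_{j+1}}_{t_{j+1}}. \]
Then
\begin{align*}
\frac{\partial \phi_x}{\partial t_j}(\boldsymbol{t}) &= \frac{\partial}{\partial t_j} \Phi^{Y_{m_x}}_{t_{m_x}} \circ \dots \circ \Phi^{Y_1}_{t_1}(x) \\
&= T\Psi_{\boldsymbol{t}} \circ Y_j \circ \Phi^{Y_j}_{t_j} \circ \dots \circ \Phi^{Y_1}_{t_1}(x) \\
&= T\Psi_{\boldsymbol{t}} \circ Y_j \circ \Psi_{\boldsymbol{t}}^{-1} \circ \phi_x(\boldsymbol{t}) = \mathrm{Ad}_{\Psi_{\boldsymbol{t}}} Y_j(\phi_x(\boldsymbol{t})) \in \mathsf{O}_{\phi_x(\boldsymbol{t})}.
\end{align*}
Since this holds for every $j \in \{1, \dots, m_x\}$, it follows that $\mathrm{image}(T_{\boldsymbol{t}}\phi_x) \subset \mathsf{O}_{\phi_x(\boldsymbol{t})}$. By Lemma 1 we have
\[ \dim(\mathsf{O}_{\phi_x(\boldsymbol{t})}) = \dim(\mathsf{O}_x) = \dim(\mathrm{image}(T_{\boldsymbol{t}}\phi_x)). \]
Therefore,
\[ \mathsf{O}_{\phi_x(\boldsymbol{t})} = \mathrm{image}(T_{\boldsymbol{t}}\phi_x) = T_{\phi_x(\boldsymbol{t})} U(\boldsymbol{Y}_x), \]
as desired.

4 Lemma The subsets $(U(\boldsymbol{Y}_x))_{x \in M}$, for $\boldsymbol{Y}_x$ as defined above, form a basis for a topology on $M$.
Proof By, e.g., Theorem 5.3 of [Willard 1970], it suffices to show that for any pair $U(\boldsymbol{Y}_{x_1})$ and $U(\boldsymbol{Y}_{x_2})$ of such subsets, and for any $x$ in their intersection, there exists $U(\boldsymbol{Y}_x)$ such that $U(\boldsymbol{Y}_x) \subset U(\boldsymbol{Y}_{x_1}) \cap U(\boldsymbol{Y}_{x_2})$. Let $x \in U(\boldsymbol{Y}_{x_1}) \cap U(\boldsymbol{Y}_{x_2})$ and let $Y_1, \dots, Y_{m_x} \in \mathscr{O}(\mathscr{X})$ be such that $\mathsf{O}_x = \mathrm{span}_{\mathbb{R}}(Y_1(x), \dots, Y_{m_x}(x))$. Let $\phi_x \colon \mathbb{R}^{m_x} \to M$ be the map defined above. By Lemma 3 it follows that
\[ Y_1(y), \dots, Y_{m_x}(y) \in T_y U(\boldsymbol{Y}_{x_1}), \qquad Y_1(y), \dots, Y_{m_x}(y) \in T_y U(\boldsymbol{Y}_{x_2}) \]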

for every $y$ in a neighbourhood of $x$. Therefore, the integral curves of the vector fields $Y_1, \dots, Y_{m_x}$ with initial conditions in $U(\boldsymbol{Y}_{x_1})$ (resp. $U(\boldsymbol{Y}_{x_2})$) nearby $x$ remain in $U(\boldsymbol{Y}_{x_1})$ (resp. $U(\boldsymbol{Y}_{x_2})$). Therefore, concatenations of these integral curves nearby $x$ will also remain in $U(\boldsymbol{Y}_{x_1})$ (resp. $U(\boldsymbol{Y}_{x_2})$). In short, for $t_1, \dots, t_{m_x}$ sufficiently near zero,
\[ \Phi^{Y_{m_x}}_{t_{m_x}} \circ \dots \circ \Phi^{Y_1}_{t_1}(x) \in U(\boldsymbol{Y}_{x_1}) \qquad (\text{resp. } \Phi^{Y_{m_x}}_{t_{m_x}} \circ \dots \circ \Phi^{Y_1}_{t_1}(x) \in U(\boldsymbol{Y}_{x_2})). \]
Thus, by restricting $\phi_x$ to a small enough neighbourhood $N$ of $\boldsymbol{0}$, if we define $U(\boldsymbol{Y}_x) = \phi_x(N)$, we have $U(\boldsymbol{Y}_x) \subset U(\boldsymbol{Y}_{x_1}) \cap U(\boldsymbol{Y}_{x_2})$.

Let us call the topology on $M$ generated by the sets $(U(\boldsymbol{Y}_x))$ the orbit topology.

5 Lemma In the orbit topology, the orbits are connected, open, and closed.
Proof Let $X \in \mathscr{X}$. Since the integral curve $t \mapsto \Phi^X_t(x)$ is a continuous curve of $M$ and since it is tangent to $U(\boldsymbol{Y}_x)$, it follows that the curve is continuous in the relative topology on $U(\boldsymbol{Y}_x)$. Since $U(\boldsymbol{Y}_x)$ is a submanifold, the relative topology is the same as the topology induced by the immersion $\phi_x$. Thus the integral curve $t \mapsto \Phi^X_t(x)$ is continuous in the orbit topology. Therefore, by definition of orbits, orbits are path connected and so connected.
If $x \in \mathrm{Orb}(x_0, \mathscr{X})$ then every set $U(\boldsymbol{Y}_x)$ is a subset of $\mathrm{Orb}(x_0, \mathscr{X})$. Since $U(\boldsymbol{Y}_x)$ is open, it follows that $\mathrm{Orb}(x_0, \mathscr{X})$ is open.
Note that $M$ is a disjoint union of orbits. Therefore, the complement of an orbit is a union of orbits. Thus the complement of an orbit is a union of open sets and so open. Thus an orbit is closed.

This last lemma shows that the orbits are the connected components of $M$ in the orbit topology.

6 Lemma For $x_0 \in M$,
\[ \mathrm{Orb}(x_0, \mathscr{X}) = \bigcup_{x \in \mathrm{Orb}(x_0, \mathscr{X})} U(\boldsymbol{Y}_x), \]
the union being over the neighbourhoods $U(\boldsymbol{Y}_x)$ constructed above.
Proof It is trivial that $\mathrm{Orb}(x_0, \mathscr{X}) \subset \bigcup_{x \in \mathrm{Orb}(x_0, \mathscr{X})} U(\boldsymbol{Y}_x)$. The converse inclusion follows from Lemma 2.

Since each of the sets $U(\boldsymbol{Y}_x)$ is diffeomorphic to an open subset of $\mathbb{R}^{m_x}$ by $\phi_x^{-1}$, it follows that $(U(\boldsymbol{Y}_x), \phi_x^{-1})$ is a chart for $\mathrm{Orb}(x_0, \mathscr{X})$ for every $x \in \mathrm{Orb}(x_0, \mathscr{X})$. The overlap maps between intersecting charts $(U(\boldsymbol{Y}_{x_1}), \phi_{x_1}^{-1})$ and $(U(\boldsymbol{Y}_{x_2}), \phi_{x_2}^{-1})$ are obtained by concatenations of flows of vector fields from $\mathscr{X}$, and so are diffeomorphisms. This shows that $\mathrm{Orb}(x_0, \mathscr{X})$ is an immersed submanifold as in (i). Assertion (ii) follows from Lemma 3 and the definition of the differentiable structure on the orbits.

5.3.17 Remark (The orbit topology) In the proof of the Orbit Theorem we prescribed a topology on $M$ to which we will, on occasion, refer. This topology is called the orbit topology, and is defined as follows, using the notation from the proof of the Orbit Theorem. For a family of $C^r$-vector fields $\mathscr{X}$, $r\in\{\infty,\omega\}$, and for $x\in M$, let $Y_x=(Y_1,\dots,Y_{m_x})$ be vector fields from the family
$$\mathscr{O}(\mathscr{X})=\{\mathrm{Ad}_\Phi X \mid \Phi\in\mathrm{Diff}(\mathscr{X}),\ X\in\mathscr{X}\}$$
for which $(Y_1(x),\dots,Y_{m_x}(x))$ are a basis for $T_x\mathrm{Orb}(x,\mathscr{X})$. Define $\phi_x\colon\mathbb{R}^{m_x}\to M$ by
$$\phi_x(t_1,\dots,t_{m_x})=\Phi^{Y_{m_x}}_{t_{m_x}}\circ\cdots\circ\Phi^{Y_1}_{t_1}(x),$$
and note that this map, restricted to a neighbourhood of $0\in\mathbb{R}^{m_x}$, is an embedding. The image of this neighbourhood under $\phi_x$ is denoted by $U(Y_x)$. The sets $U(Y_x)$ form a basis for a topology, and this topology is the orbit topology.

The Orbit Theorem gives us an insightful description of the tangent spaces to $\mathscr{X}$-orbits. However, computationally the description is not the most useful since it requires that we know something about the group $\mathrm{Diff}(\mathscr{X})$. One can wonder whether there is a simpler infinitesimal description. If one has some intuition about things analytic, one might imagine that such an infinitesimal description is possible for the analytic version of the Orbit Theorem. We shall show that this is true. We begin by describing a subspace of the tangent spaces to the $\mathscr{X}$-orbits. The proof of the theorem is adapted from that of Jakubczyk [2001, Proposition 4.15].

5.3.18 Theorem (A subspace of the tangent space of an orbit) Let $r\in\{\infty,\omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X}$ be a family of $C^r$-vector fields on $M$. Then $L^{(\infty)}(\mathscr{X})_{x_0}\subset T_{x_0}\mathrm{Orb}(x_0,\mathscr{X})$ for every $x_0\in M$.
Proof Let $x_0\in M$. By Proposition 5.3.1 it suffices to show that
$$[X_k,[X_{k-1},\dots,[X_2,X_1]\cdots]](x_0)\in T_{x_0}\mathrm{Orb}(x_0,\mathscr{X})\qquad(5.3)$$
for any vector fields $X_1,\dots,X_k\in\mathscr{X}$. We prove this using some notation and three lemmata. First the notation. For diffeomorphisms $\Phi$ and $\Psi$ of $M$, denote $[\Phi,\Psi]=\Phi^{-1}\circ\Psi^{-1}\circ\Phi\circ\Psi$. With this notation we have the following lemma.

1 Lemma Let $X_1,\dots,X_k\in\Gamma^r(TM)$ and recursively define
$$\begin{aligned}
\psi_1(t_1)&=\Phi^{X_1}_{t_1},\\
\psi_2(t_1,t_2)&=[\Phi^{X_1}_{t_1},\Phi^{X_2}_{t_2}],\\
\psi_3(t_1,t_2,t_3)&=[[\Phi^{X_1}_{t_1},\Phi^{X_2}_{t_2}],\Phi^{X_3}_{t_3}],\\
&\ \ \vdots\\
\psi_k(t_1,t_2,\dots,t_k)&=[\cdots[\Phi^{X_1}_{t_1},\Phi^{X_2}_{t_2}],\dots,\Phi^{X_k}_{t_k}],
\end{aligned}$$
for $t_1,\dots,t_k\in\mathbb{R}$ such that all flows are defined. Then
$$\frac{\partial}{\partial t_1}\Big|_{t_1=0}\psi_k(t_1,\dots,t_k)(x_0)=-X_1(x_0)+\frac{\partial}{\partial t_1}\Big|_{t_1=0}\Phi^{X_k}_{-t_k}\circ\psi_{k-1}(t_1,\dots,t_{k-1})\circ\Phi^{X_k}_{t_k}(x_0).$$

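Proof It is easy to see by induction that if any of the numbers $t_1,\dots,t_k$ are zero, then $\psi_k=\mathrm{id}_M$. We shall use this fact frequently. First note that differentiation of the relation $\psi_{k-1}(t_1,\dots,t_{k-1})\circ\psi_{k-1}(t_1,\dots,t_{k-1})^{-1}(x)=x$ gives
$$\frac{\partial}{\partial t_1}\big(\psi_{k-1}(t_1,\dots,t_{k-1})^{-1}(x)\big)=-T\psi_{k-1}(t_1,\dots,t_{k-1})^{-1}\circ\frac{\partial\psi_{k-1}(t_1,\dots,t_{k-1})}{\partial t_1}\circ\psi_{k-1}(t_1,\dots,t_{k-1})^{-1}(x).$$
Evaluating at $t_1=0$ and using the fact stated at the beginning of the proof then gives
$$\frac{\partial}{\partial t_1}\Big|_{t_1=0}\big(\psi_{k-1}(t_1,\dots,t_{k-1})^{-1}(x)\big)=-X_1(x).$$
Using this fact along with the statement made at the beginning of the proof, we calculate
$$\begin{aligned}
\frac{\partial}{\partial t_1}\Big|_{t_1=0}\psi_k(t_1,\dots,t_k)(x_0)
&=\frac{\partial}{\partial t_1}\Big|_{t_1=0}[\psi_{k-1}(t_1,\dots,t_{k-1}),\Phi^{X_k}_{t_k}](x_0)\\
&=\frac{\partial}{\partial t_1}\Big|_{t_1=0}\psi_{k-1}(t_1,\dots,t_{k-1})^{-1}\circ\Phi^{X_k}_{-t_k}\circ\psi_{k-1}(t_1,\dots,t_{k-1})\circ\Phi^{X_k}_{t_k}(x_0)\\
&=-X_1(x_0)+\frac{\partial}{\partial t_1}\Big|_{t_1=0}\Phi^{X_k}_{-t_k}\circ\psi_{k-1}(t_1,\dots,t_{k-1})\circ\Phi^{X_k}_{t_k}(x_0),
\end{aligned}$$
giving the result.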

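We also recall the definition of the pull-back of a vector field $X$ by a diffeomorphism $\Phi$: $\Phi^*X=T\Phi^{-1}\circ X\circ\Phi$. With this notation we have the following lemma.

2 Lemma With the notation from Lemma 1,
$$\frac{\partial^k}{\partial t_k\cdots\partial t_1}\Big|_{t_1=\cdots=t_k=0}\psi_k(t_1,\dots,t_k)(x_0)=\frac{\partial^{k-1}}{\partial t_k\cdots\partial t_2}\Big|_{t_2=\cdots=t_k=0}\big((\Phi^{X_k}_{t_k})^*\cdots(\Phi^{X_2}_{t_2})^*X_1\big)(x_0).$$
Proof We prove this by induction on $k$. For $k=2$ we use Lemma 1 to determine that
$$\begin{aligned}
\frac{\partial^2}{\partial t_2\partial t_1}\Big|_{t_1=t_2=0}\psi_2(t_1,t_2)(x_0)
&=\frac{\partial^2}{\partial t_2\partial t_1}\Big|_{t_1=t_2=0}\Phi^{X_2}_{-t_2}\circ\Phi^{X_1}_{t_1}\circ\Phi^{X_2}_{t_2}(x_0)\\
&=\frac{\partial}{\partial t_2}\Big|_{t_2=0}T\Phi^{X_2}_{-t_2}\circ X_1(\Phi^{X_2}_{t_2}(x_0))\\
&=\frac{\partial}{\partial t_2}\Big|_{t_2=0}\big((\Phi^{X_2}_{t_2})^*X_1\big)(x_0),
\end{aligned}$$
giving the lemma for $k=2$.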

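Now suppose the lemma holds for $k\in\{1,\dots,m-1\}$. An application of Lemma 1 and the induction hypothesis gives
$$\begin{aligned}
&\frac{\partial^m}{\partial t_m\cdots\partial t_1}\Big|_{t_1=\cdots=t_m=0}\psi_m(t_1,\dots,t_m)(x_0)\\
&\quad=\frac{\partial^m}{\partial t_m\cdots\partial t_1}\Big|_{t_1=\cdots=t_m=0}\Phi^{X_m}_{-t_m}\circ\psi_{m-1}(t_1,\dots,t_{m-1})\circ\Phi^{X_m}_{t_m}(x_0)\\
&\quad=\frac{\partial^{m-1}}{\partial t_m\cdots\partial t_2}\Big|_{t_2=\cdots=t_m=0}T\Phi^{X_m}_{-t_m}\circ\frac{\partial}{\partial t_1}\Big|_{t_1=0}\psi_{m-1}(t_1,\dots,t_{m-1})\circ\Phi^{X_m}_{t_m}(x_0)\\
&\quad=\frac{\partial^{m-1}}{\partial t_m\cdots\partial t_2}\Big|_{t_2=\cdots=t_m=0}\Big((\Phi^{X_m}_{t_m})^*\frac{\partial}{\partial t_1}\Big|_{t_1=0}\psi_{m-1}(t_1,\dots,t_{m-1})\Big)(x_0)\\
&\quad=\frac{\partial}{\partial t_m}\Big|_{t_m=0}\Big((\Phi^{X_m}_{t_m})^*\frac{\partial^{m-1}}{\partial t_{m-1}\cdots\partial t_1}\Big|_{t_1=\cdots=t_{m-1}=0}\psi_{m-1}(t_1,\dots,t_{m-1})\Big)(x_0)\\
&\quad=\frac{\partial}{\partial t_m}\Big|_{t_m=0}\frac{\partial^{m-2}}{\partial t_{m-1}\cdots\partial t_2}\Big|_{t_2=\cdots=t_{m-1}=0}\big((\Phi^{X_m}_{t_m})^*(\Phi^{X_{m-1}}_{t_{m-1}})^*\cdots(\Phi^{X_2}_{t_2})^*X_1\big)(x_0)\\
&\quad=\frac{\partial^{m-1}}{\partial t_m\cdots\partial t_2}\Big|_{t_2=\cdots=t_m=0}\big((\Phi^{X_m}_{t_m})^*\cdots(\Phi^{X_2}_{t_2})^*X_1\big)(x_0),
\end{aligned}$$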

which is the result. (Note that our freely swapping partial derivatives with pull-backs is justified since we are differentiating the pull-back with respect to its argument, and the pull-back is linear in its argument.)

Now we prove the key fact.

3 Lemma We use the notation from Lemma 2. For $x_0\in M$, if we define $\gamma_{x_0}(s)=\psi_k(s,\dots,s)(x_0)$ for all $s\in\mathbb{R}$ such that the expression makes sense, then
$$\frac{d^j}{ds^j}\Big|_{s=0}\gamma_{x_0}(s)=0,\qquad j\in\{1,\dots,k-1\},$$
and
$$\frac{d^k}{ds^k}\Big|_{s=0}\gamma_{x_0}(s)=k!\,[X_k,\dots,[X_2,X_1]\cdots](x_0).$$
Proof Let $j\in\{1,\dots,k-1\}$. By the Chain Rule for high-order derivatives [Abraham, Marsden, and Ratiu 1988, Supplement 2.4A],
$$\frac{d^j}{ds^j}\Big|_{s=0}\gamma_{x_0}(s)=\sum_{\substack{j_1,\dots,j_k\in\{0,1,\dots,j\}\\ j_1+\cdots+j_k=j}}\frac{j!}{j_1!\cdots j_k!}\,\frac{\partial^j}{\partial t_1^{j_1}\cdots\partial t_k^{j_k}}\Big|_{t_1=\cdots=t_k=0}\psi_k(t_1,\dots,t_k)(x_0).\qquad(5.4)$$

Note that each term in the above will have $j_a=0$ for some $a\in\{1,\dots,k\}$. The partial derivatives in (5.4), when evaluated at $t_1=\cdots=t_k=0$, will then necessarily be taken with $t_a=0$ for some $a\in\{1,\dots,k\}$. By our comment at the beginning of the proof of Lemma 1, it follows that all such partial derivatives will be zero. By the same reasoning, in the expression
$$\frac{d^k}{ds^k}\Big|_{s=0}\gamma_{x_0}(s)=\sum_{\substack{j_1,\dots,j_k\in\{0,1,\dots,k\}\\ j_1+\cdots+j_k=k}}\frac{k!}{j_1!\cdots j_k!}\,\frac{\partial^k}{\partial t_1^{j_1}\cdots\partial t_k^{j_k}}\Big|_{t_1=\cdots=t_k=0}\psi_k(t_1,\dots,t_k)(x_0)$$
for the $k$th derivative, the only nonzero term in the sum occurs when $j_1=\cdots=j_k=1$, since otherwise at least one of the numbers $j_1,\dots,j_k$ will be zero. That is to say,
$$\frac{d^k}{ds^k}\Big|_{s=0}\gamma_{x_0}(s)=k!\,\frac{\partial^k}{\partial t_1\cdots\partial t_k}\Big|_{t_1=\cdots=t_k=0}\psi_k(t_1,\dots,t_k)(x_0).$$

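Let us now turn to the proof of the fact that this expression is the iterated Lie bracket in the statement of the lemma. We prove this by showing that, for any $j\in\{2,\dots,k\}$,
$$\frac{\partial}{\partial t_j}\Big|_{t_j=0}\big((\Phi^{X_k}_{t_k})^*\cdots(\Phi^{X_j}_{t_j})^*[X_{j-1},\dots,[X_2,X_1]\cdots]\big)(x_0)=\big((\Phi^{X_k}_{t_k})^*\cdots(\Phi^{X_{j+1}}_{t_{j+1}})^*[X_j,\dots,[X_2,X_1]\cdots]\big)(x_0).\qquad(5.5)$$
This we prove by induction on $j$. For $j=2$ we have
$$\begin{aligned}
\frac{\partial}{\partial t_2}\Big|_{t_2=0}\big((\Phi^{X_k}_{t_k})^*\cdots(\Phi^{X_2}_{t_2})^*X_1\big)(x_0)
&=\Big((\Phi^{X_k}_{t_k})^*\cdots(\Phi^{X_3}_{t_3})^*\frac{\partial}{\partial t_2}\Big|_{t_2=0}(\Phi^{X_2}_{t_2})^*X_1\Big)(x_0)\\
&=\big((\Phi^{X_k}_{t_k})^*\cdots(\Phi^{X_3}_{t_3})^*[X_2,X_1]\big)(x_0),
\end{aligned}$$
using the well-known characterisation of the Lie bracket by $[X,Y](x)=\frac{d}{dt}\big|_{t=0}(\Phi^X_t)^*Y(x)$ [see Abraham, Marsden, and Ratiu 1988, Theorem 4.2.19]. Now suppose that (5.5) holds for $j\in\{2,\dots,m-1\}$. Then we use the induction hypotheses to get
$$\begin{aligned}
&\frac{\partial}{\partial t_m}\Big|_{t_m=0}\big((\Phi^{X_k}_{t_k})^*\cdots(\Phi^{X_m}_{t_m})^*[X_{m-1},\dots,[X_2,X_1]\cdots]\big)(x_0)\\
&\quad=\Big((\Phi^{X_k}_{t_k})^*\cdots(\Phi^{X_{m+1}}_{t_{m+1}})^*\frac{\partial}{\partial t_m}\Big|_{t_m=0}(\Phi^{X_m}_{t_m})^*[X_{m-1},\dots,[X_2,X_1]\cdots]\Big)(x_0)\\
&\quad=\big((\Phi^{X_k}_{t_k})^*\cdots(\Phi^{X_{m+1}}_{t_{m+1}})^*[X_m,\dots,[X_2,X_1]\cdots]\big)(x_0),
\end{aligned}$$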

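giving (5.5). Now we use Lemma 2 and recursively apply (5.5) to give
$$\begin{aligned}
\frac{\partial^k}{\partial t_k\cdots\partial t_1}\Big|_{t_1=\cdots=t_k=0}\psi_k(t_1,\dots,t_k)(x_0)
&=\frac{\partial^{k-1}}{\partial t_k\cdots\partial t_2}\Big|_{t_2=\cdots=t_k=0}\big((\Phi^{X_k}_{t_k})^*\cdots(\Phi^{X_2}_{t_2})^*X_1\big)(x_0)\\
&=[X_k,[X_{k-1},\dots,[X_2,X_1]\cdots]](x_0),
\end{aligned}$$
as desired.

We can now complete the proof of the theorem by verifying (5.3). Recalling the distribution $O$ from the proof of the Orbit Theorem, the vector fields $X_1,\dots,X_k$ are $O$-valued and so tangent to $\mathrm{Orb}(x_0,\mathscr{X})$. Therefore, with the notation of Lemma 3, $\gamma_{x_0}(s)\in\mathrm{Orb}(x_0,\mathscr{X})$ for every $s\in\mathbb{R}$ for which it is defined. Therefore, since the first $k-1$ derivatives of $\gamma_{x_0}$ at $s=0$ vanish by Lemma 3, the $k$th derivative of $\gamma_{x_0}$ at $s=0$ is tangent to $\mathrm{Orb}(x_0,\mathscr{X})$. That is to say, by Lemma 3,
$$\frac{d^k}{ds^k}\Big|_{s=0}\gamma_{x_0}(s)=k!\,[X_k,\dots,[X_2,X_1]\cdots](x_0)\in T_{x_0}\mathrm{Orb}(x_0,\mathscr{X}),$$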

showing that (5.3) holds.
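The content of Lemma 3 for $k=2$ can be checked numerically. The sketch below is only an illustration under stated assumptions: the vector fields `X1` and `X2` on $\mathbb{R}^3$, the RK4 integrator `rk4_flow`, and the finite-difference step sizes are all choices made here, not part of the text. It builds the curve $s\mapsto\Phi^{X_1}_{-s}\circ\Phi^{X_2}_{-s}\circ\Phi^{X_1}_{s}\circ\Phi^{X_2}_{s}(x_0)$ and compares its first two derivatives at $s=0$ with $0$ and $2[X_2,X_1](x_0)$, using the convention $[X,Y]=DY\cdot X-DX\cdot Y$.

```python
# Numerical sanity check of Lemma 3 for k = 2.
# Assumptions (not from the text): the fields X1, X2, the RK4 integrator,
# and the finite-difference step sizes are illustrative choices.

def rk4_flow(field, x, t, steps=100):
    """Approximate the time-t flow of `field` starting at x (classical RK4)."""
    h = t / steps
    x = list(x)
    for _ in range(steps):
        k1 = field(x)
        k2 = field([a + 0.5 * h * b for a, b in zip(x, k1)])
        k3 = field([a + 0.5 * h * b for a, b in zip(x, k2)])
        k4 = field([a + h * b for a, b in zip(x, k3)])
        x = [a + h * (p + 2 * q + 2 * r + s) / 6.0
             for a, p, q, r, s in zip(x, k1, k2, k3, k4)]
    return x

def X1(x):
    return [0.0, 1.0, 0.0]          # X1 = d/dx2

def X2(x):
    return [1.0, 0.0, x[1]]         # X2 = d/dx1 + x2 d/dx3

def gamma(s, x0):
    """The curve psi_2(s, s)(x0): apply X2, then X1, then X2 and X1 backwards."""
    x = rk4_flow(X2, x0, s)
    x = rk4_flow(X1, x, s)
    x = rk4_flow(X2, x, -s)
    x = rk4_flow(X1, x, -s)
    return x

def directional_derivative(field, x, v, eps=1e-6):
    """Central-difference approximation of D field(x) . v."""
    plus = field([a + eps * b for a, b in zip(x, v)])
    minus = field([a - eps * b for a, b in zip(x, v)])
    return [(p - m) / (2 * eps) for p, m in zip(plus, minus)]

def bracket(X, Y, x):
    """[X, Y](x) = DY(x) X(x) - DX(x) Y(x)."""
    dY_X = directional_derivative(Y, x, X(x))
    dX_Y = directional_derivative(X, x, Y(x))
    return [a - b for a, b in zip(dY_X, dX_Y)]

def derivatives_at_zero(x0, h=1e-2):
    """Finite-difference first and second derivatives of gamma at s = 0."""
    gp, g0, gm = gamma(h, x0), gamma(0.0, x0), gamma(-h, x0)
    first = [(p - m) / (2 * h) for p, m in zip(gp, gm)]
    second = [(p - 2 * c + m) / h ** 2 for p, c, m in zip(gp, g0, gm)]
    return first, second

if __name__ == "__main__":
    x0 = [0.3, -0.2, 0.5]
    first, second = derivatives_at_zero(x0)
    print("gamma'(0)   ~", first)
    print("gamma''(0)  ~", second)
    print("2 [X2, X1]  ~", [2 * c for c in bracket(X2, X1, x0)])
```

For these particular fields the flows are affine in the state, so the integration error is negligible and the agreement is limited only by rounding in the finite differences.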


The proof we give of the above theorem is a bit technical. However, contained in the proof is an idea that will play a prominent role in our discussion of controllability. Let us discuss this a little. The main idea in the proof is that we want to show that the brackets of vector fields from $\mathscr{X}$ of the form $[X_k,[X_{k-1},\dots,[X_2,X_1]\cdots]]$ are tangent to orbits for $\mathscr{X}$. In order to do this, we note that concatenations of integral curves of vector fields from $\mathscr{X}$ necessarily remain in the orbit. That is,
$$\Phi^{Y_m}_{t_m}\circ\cdots\circ\Phi^{Y_1}_{t_1}(x_0)\in\mathrm{Orb}(x_0,\mathscr{X})$$
for every $Y_1,\dots,Y_m\in\mathscr{X}$ and $t_1,\dots,t_m$ sufficiently small. Now consider a curve $\tau\colon\mathbb{R}\to\mathbb{R}^m$ so that
$$s\mapsto\Phi^{Y_m}_{\tau_m(s)}\circ\cdots\circ\Phi^{Y_1}_{\tau_1(s)}(x_0)$$
is a curve in $\mathrm{Orb}(x_0,\mathscr{X})$. If the first $r-1$ derivatives of this curve at zero vanish, then the $r$th derivative is a tangent vector to $\mathrm{Orb}(x_0,\mathscr{X})$. The problem is then to prescribe $Y_1,\dots,Y_m$ and the map $\tau$ in such a way that this $r$th derivative is the desired tangent vector:
$$\frac{d^r}{ds^r}\Big|_{s=0}\Phi^{Y_m}_{\tau_m(s)}\circ\cdots\circ\Phi^{Y_1}_{\tau_1(s)}(x_0)=[X_k,[X_{k-1},\dots,[X_2,X_1]\cdots]](x_0).$$
The technicalities in the proof of the theorem arise from making exactly these definitions.

An immediate consequence of Theorems 5.3.16 and 5.3.18 is the following well-known result of Rashevsky [1938] and Chow [1939].

5.3.19 Corollary (The Rashevsky–Chow Theorem) Let $M$ be a connected $C^\infty$-manifold and let $\mathscr{X}\subset\Gamma^\infty(TM)$ be a family of vector fields. If $L^{(\infty)}(\mathscr{X})=TM$ then, for $x_1,x_2\in M$, there exist $X_1,\dots,X_k\in\mathscr{X}$ and $t_1,\dots,t_k\in\mathbb{R}$ such that
$$x_2=\Phi^{X_k}_{t_k}\circ\cdots\circ\Phi^{X_1}_{t_1}(x_1).$$

Proof Let $x_1\in M$. By Theorem 5.3.18 we have $L^{(\infty)}(\mathscr{X})_x\subset T_x\mathrm{Orb}(x_1,\mathscr{X})$ for every $x\in\mathrm{Orb}(x_1,\mathscr{X})$ and so
$$L^{(\infty)}(\mathscr{X})_x\subset T_x\mathrm{Orb}(x_1,\mathscr{X})\subset T_xM=L^{(\infty)}(\mathscr{X})_x,$$
giving $T_x\mathrm{Orb}(x_1,\mathscr{X})=T_xM$ for every $x\in\mathrm{Orb}(x_1,\mathscr{X})$. Thus $\mathrm{Orb}(x_1,\mathscr{X})$ is an open submanifold of $M$. Thus, recalling the basis for the orbit topology from the proof of Theorem 5.3.16, open subsets of $\mathrm{Orb}(x_1,\mathscr{X})$ in the orbit topology are open subsets in the relative topology on $M$. Since $M$ is a disjoint union of its orbits, it is a disjoint union of open sets. Each component in this disjoint union is necessarily closed since its complement is open, being a union of open sets. Thus each orbit is a connected component of $M$. Since $M$ is assumed connected it follows that $\mathrm{Orb}(x_1,\mathscr{X})=M$, which is the result.
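To make the conclusion of the corollary concrete, here is a small sketch for a hypothetical family on $\mathbb{R}^2$, chosen here for illustration and not taken from the text: $X_1=\partial/\partial x_1$ and $X_2=x_1\,\partial/\partial x_2$. The two fields span the tangent space only where $x_1\neq0$, but $[X_1,X_2]=\partial/\partial x_2$, so $L^{(\infty)}(\mathscr{X})=TM$. The function `steer` (a name introduced here) writes down one explicit concatenation of three flows joining any two points; negative times are allowed, as in the corollary.

```python
# Steering between points with a bracket-generating family on R^2.
# Assumption: the fields X1 = d/dx1 and X2 = x1 d/dx2 are an illustrative
# choice; their flows are known in closed form.

def flow_X1(p, t):
    """Flow of X1 = d/dx1 for time t."""
    return (p[0] + t, p[1])

def flow_X2(p, t):
    """Flow of X2 = x1 d/dx2 for time t (x1 is constant along the flow)."""
    return (p[0], p[1] + p[0] * t)

def steer(start, goal):
    """Return a list of (flow, time) pairs taking `start` to `goal`.

    Strategy: move to the line x1 = 1 with X1, adjust x2 with X2 (which
    has unit speed there), then move to the target x1 with X1.
    """
    return [
        (flow_X1, 1.0 - start[0]),
        (flow_X2, goal[1] - start[1]),
        (flow_X1, goal[0] - 1.0),
    ]

def apply_plan(start, plan):
    """Concatenate the flows in `plan`, starting from `start`."""
    p = start
    for flow, t in plan:
        p = flow(p, t)
    return p

if __name__ == "__main__":
    start, goal = (0.0, 0.0), (0.0, 3.0)   # both points lie where X2 vanishes
    plan = steer(start, goal)
    print("times:", [t for _, t in plan])
    print("end point:", apply_plan(start, plan))
```

The example deliberately starts and ends on the line $x_1=0$, where $X_2$ vanishes; the plan has to leave that line before $x_2$ can change.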

The converse of the Rashevsky–Chow Theorem is generally false.

5.3.20 Example (Failure of the converse of the Rashevsky–Chow Theorem) Recall Example 5.3.13–3 where $M=\mathbb{R}^2$ and where $\mathscr{X}=(X_1,X_2)$ is defined by
$$X_1=\frac{\partial}{\partial x_1},\qquad X_2=f(x_1)\frac{\partial}{\partial x_2},$$
where
$$f(x)=\begin{cases}e^{-1/x^2},&x\in\mathbb{R}_{>0},\\ 0,&x\in\mathbb{R}_{\le0}.\end{cases}$$
In Example 5.3.13–3 we explicitly showed that $M=\mathrm{Orb}(0,\mathscr{X})$. However, one can also directly show that
$$L^{(\infty)}(\mathscr{X})_{(x_1,x_2)}=\begin{cases}T_{(x_1,x_2)}\mathbb{R}^2,&x_1>0,\\ \mathrm{span}_{\mathbb{R}}\big(\tfrac{\partial}{\partial x_1}\big),&x_1\le0.\end{cases}$$
Thus $L^{(\infty)}(\mathscr{X})\neq TM$.

5.3.6 The finitely generated Orbit Theorem

Now we turn to characterising situations where the tangent spaces to the orbits are exactly the subspaces $L^{(\infty)}(\mathscr{X})$.

5.3.21 Theorem (The Orbit Theorem in the finitely generated case) Let $r\in\{\infty,\omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X}\subset\Gamma^r(TM)$ be such that $\mathscr{L}^{(\infty)}(\mathscr{X})$ is a locally finitely generated submodule of vector fields. Then, for each $x_0\in M$,
(i) $\mathrm{Orb}(x_0,\mathscr{X})$ is a connected immersed $C^r$-submanifold of $M$ and
(ii) for each $x\in\mathrm{Orb}(x_0,\mathscr{X})$, $T_x\mathrm{Orb}(x_0,\mathscr{X})=L^{(\infty)}(\mathscr{X})_x$.
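Proof From the Orbit Theorem and Theorem 5.3.18 it only remains to show that $T_x\mathrm{Orb}(x_0,\mathscr{X})\subset L^{(\infty)}(\mathscr{X})_x$. For $X\in\mathscr{X}$ we have $[X,Y]\in\mathscr{L}^{(\infty)}(\mathscr{X})$ for every $Y\in\mathscr{L}^{(\infty)}(\mathscr{X})$, this since $\mathscr{L}^{(\infty)}(\mathscr{X})$ is a Lie subalgebra. Since $\mathscr{L}^{(\infty)}(\mathscr{X})$ is assumed to be locally finitely generated, Theorem 5.1.11 (along with Remark 5.1.12) gives $\mathrm{Ad}_{\Phi^X_t}Y(x)\in L^{(\infty)}(\mathscr{X})_x$ for every $X\in\mathscr{X}$ and $t\in\mathbb{R}$ such that $x\in\Phi^X_t(U(X,t))$. A trivial induction then gives
$$\mathrm{Ad}_{\Phi^{X_k}_{t_k}}\cdots\mathrm{Ad}_{\Phi^{X_1}_{t_1}}Y(x)=\mathrm{Ad}_{\Phi^{X_k}_{t_k}\circ\cdots\circ\Phi^{X_1}_{t_1}}Y(x)\in L^{(\infty)}(\mathscr{X})_x\qquad(5.6)$$
for every $X_1,\dots,X_k\in\mathscr{X}$ and $t_1,\dots,t_k\in\mathbb{R}$, where we use the fact that push-forward commutes with composition [Abraham, Marsden, and Ratiu 1988, Proposition 4.2.3]. However, by Theorem 5.3.16,
$$T_x\mathrm{Orb}(x_0,\mathscr{X})=\{\mathrm{Ad}_\Phi X(x)\mid\Phi\in\mathrm{Diff}(\mathscr{X}),\ X\in\mathscr{X}\},$$
and so (5.6) implies that $T_x\mathrm{Orb}(x_0,\mathscr{X})\subset L^{(\infty)}(\mathscr{X})_x$.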

5.3.22 Remark (The submodule assumption in the finitely generated Orbit Theorem) In the statement of the preceding theorem we asked that $\mathscr{L}^{(\infty)}(\mathscr{X})$ be a submodule of $\Gamma^r(TM)$. For a general family $\mathscr{X}$ of vector fields, it will not be the case that $\mathscr{L}^{(\infty)}(\mathscr{X})$ is a submodule. However, if all one is interested in is the tangent spaces to orbits, then, by Propositions 5.3.2 and 5.3.3, one can replace $\mathscr{X}$ with the module $\mathscr{T}(\mathscr{X})$ generated by $\mathscr{X}$. Equivalently, also by Proposition 5.3.3, one can replace $\mathscr{L}^{(\infty)}(\mathscr{X})$ with the module $\mathscr{T}(\mathscr{L}^{(\infty)}(\mathscr{X}))$ generated by $\mathscr{L}^{(\infty)}(\mathscr{X})$. As long as the module $\mathscr{L}^{(\infty)}(\mathscr{T}(\mathscr{X}))$ or $\mathscr{T}(\mathscr{L}^{(\infty)}(\mathscr{X}))$ is locally finitely generated, it will hold that $T_x\mathrm{Orb}(x_0,\mathscr{X})=L^{(\infty)}(\mathscr{X})_x$ for all $x_0\in M$ and $x\in\mathrm{Orb}(x_0,\mathscr{X})$.

The next example shows that the hypothesis in Theorem 5.3.21 that the module be finitely generated is essential.

5.3.23 Example (The necessity of finite generation) Here we take $M=\mathbb{R}^2$ and define smooth vector fields $X_1$ and $X_2$ on $\mathbb{R}^2$ by
$$X_1(x_1,x_2)=\frac{\partial}{\partial x_1},\qquad X_2(x_1,x_2)=f(x_1)\frac{\partial}{\partial x_2},$$
where
$$f(x)=\begin{cases}e^{-1/x^2},&x\neq0,\\ 0,&x=0.\end{cases}$$
Taking $\mathscr{X}=(X_1,X_2)$, note that $\mathrm{Orb}(0,\mathscr{X})=\mathbb{R}^2$ in much the same manner as Example 5.3.13–3. However, one can easily show that $L^{(\infty)}(\mathscr{X})_0=\mathrm{span}_{\mathbb{R}}\big(\tfrac{\partial}{\partial x_1}\big)\subsetneq T_0\mathrm{Orb}(0,\mathscr{X})$. The problem here is that the module $\mathscr{L}^{(\infty)}(\mathscr{X})$ is not finitely generated. Indeed, one computes the bracket of $X_2$ with $X_1$ taken $k$ times to be
$$[X_1,\dots,[X_1,X_2]\cdots]=f^{(k)}(x_1)\frac{\partial}{\partial x_2}.$$
The set of all such vector fields is linearly independent over $C^\infty(U)$ in any neighbourhood $U$ of $0$, cf. Example 5.2.10.

Again, it is important to distinguish between a distribution generated by a family of vector fields being finitely generated and the module generated by a family of vector fields being finitely generated. This gives the following important results for families of analytic vector fields and certain families of smooth vector fields.

5.3.24 Corollary (The Orbit Theorem when $L^{(\infty)}(\mathscr{X})$ has constant rank) Let $M$ be a $C^\infty$-manifold and let $\mathscr{X}\subset\Gamma^\infty(TM)$ be such that the distribution $L^{(\infty)}(\mathscr{X})$ is regular. Then, for each $x_0\in M$,
(i) $\mathrm{Orb}(x_0,\mathscr{X})$ is a connected immersed smooth submanifold of $M$ and
(ii) for each $x\in\mathrm{Orb}(x_0,\mathscr{X})$, $T_x\mathrm{Orb}(x_0,\mathscr{X})=L^{(\infty)}(\mathscr{X})_x$.
Proof This follows from Theorem 5.3.21, along with Proposition 5.2.1.

5.3.25 Corollary (The Orbit Theorem in the analytic case) Let $M$ be an analytic manifold and let $\mathscr{X}\subset\Gamma^\omega(TM)$. Then, for each $x_0\in M$,
(i) $\mathrm{Orb}(x_0,\mathscr{X})$ is a connected immersed analytic submanifold of $M$ and
(ii) for each $x\in\mathrm{Orb}(x_0,\mathscr{X})$, $T_x\mathrm{Orb}(x_0,\mathscr{X})=L^{(\infty)}(\mathscr{X})_x$.
Proof This follows from Theorem 5.3.21, along with Theorem 2.4.28.

5.3.7 The fixed-time Orbit Theorem

In this section we give the version of the Orbit Theorem corresponding to the fixed-time orbits considered in Section 5.3.4. We let $r\in\{\infty,\omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X}\subset\Gamma^r(TM)$ be a family of $C^r$-vector fields. We denote by $\mathscr{X}_0$ the family of vector fields given by
$$\mathscr{X}_0=\Big\{\sum_{j=1}^k\lambda_jX_j\ \Big|\ X_1,\dots,X_k\in\mathscr{X},\ \lambda_1,\dots,\lambda_k\in\mathbb{R},\ \sum_{j=1}^k\lambda_j=0,\ k\in\mathbb{Z}_{>0}\Big\}.$$

The following characterisation of $\mathscr{X}_0$ is useful and insightful. We refer the reader to Definition 4.3.8 for the definition of the linear part of an affine subspace.

5.3.26 Proposition (Characterisation of $\mathscr{X}_0$) Let $r\in\{\infty,\omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X}\subset\Gamma^r(TM)$ be a family of $C^r$-vector fields. Then $\mathscr{X}_0$ is the linear part of the affine hull of $\mathscr{X}$ in $\Gamma^r(TM)$.
Proof In the first part of the proof of Proposition 4.3.9 we showed that for a subset $S\subset V$ of an $\mathbb{R}$-vector space $V$, the linear part of the affine hull of $S$ is the set of elements of the form
$$\sum_{j=1}^k\lambda_jv_j,\qquad k\in\mathbb{Z}_{>0},\ \lambda_1,\dots,\lambda_k\in\mathbb{R},\ \sum_{j=1}^k\lambda_j=0,\ v_1,\dots,v_k\in S;$$
this is (4.4). Our result is exactly this fact in the particular case we are discussing.

We also let $\mathscr{D}(\mathscr{X})$ be the derived algebra of $\mathscr{L}^{(\infty)}(\mathscr{X})$. That is to say, $\mathscr{D}(\mathscr{X})$ is the subspace of $\mathscr{L}^{(\infty)}(\mathscr{X})$ generated by vector fields of the form $[Y_1,Y_2]$ where $Y_1,Y_2\in\mathscr{L}^{(\infty)}(\mathscr{X})$. We have the following explicit characterisation of the derived algebra. In the proof we use the notion of an ideal of a Lie algebra, and the reader will recall that a subspace $\mathfrak{i}$ of a Lie algebra $\mathfrak{g}$ is an ideal if $[\xi,\eta]\in\mathfrak{i}$ for every $\xi\in\mathfrak{g}$ and $\eta\in\mathfrak{i}$.

5.3.27 Proposition (Characterisation of $\mathscr{D}(\mathscr{X})$) Let $r\in\{\infty,\omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X}\subset\Gamma^r(TM)$ be a family of $C^r$-vector fields. Then the derived algebra $\mathscr{D}(\mathscr{X})$ of $\mathscr{L}^{(\infty)}(\mathscr{X})$ is comprised of finite $\mathbb{R}$-linear combinations of vector fields of the form
$$[X_k,[X_{k-1},\dots,[X_2,X_1]\cdots]],\qquad X_1,\dots,X_k\in\mathscr{X},\ k\ge2.$$
Proof We first claim that the derived algebra is an ideal of $\mathscr{L}^{(\infty)}(\mathscr{X})$, indeed the ideal generated by elements of the form $[X_1,X_2]$ for $X_1,X_2\in\mathscr{X}$. First, it is clear from the definition that $\mathscr{D}(\mathscr{X})$ is an ideal. It is also clear, since $\mathscr{X}\subset\mathscr{L}^{(\infty)}(\mathscr{X})$, that $[X_1,X_2]\in\mathscr{D}(\mathscr{X})$ for every $X_1,X_2\in\mathscr{X}$. Thus $\mathscr{D}(\mathscr{X})$ contains the ideal generated by brackets from $\mathscr{X}$. Now consider an element from $\mathscr{D}(\mathscr{X})$ of the form $[Y_1,Y_2]$ for $Y_1,Y_2\in\mathscr{L}^{(\infty)}(\mathscr{X})$, noting that all elements of $\mathscr{D}(\mathscr{X})$ are finite linear combinations of such elements. By Proposition 5.3.1, $[Y_1,Y_2]$ is a finite linear combination of brackets of the type in the statement of the result. Moreover, if we consider the proof of Proposition 5.3.1, we can see that the brackets involved will be of the form in the statement of the proposition with $k\ge2$. From this we conclude our claim that $\mathscr{D}(\mathscr{X})$ is the ideal of $\mathscr{L}^{(\infty)}(\mathscr{X})$ generated by brackets $[X_1,X_2]$ for $X_1,X_2\in\mathscr{X}$.
From the fact that the derived algebra is an ideal and that it contains all vector fields of the form $[X_1,X_2]$ for $X_1,X_2\in\mathscr{X}$, it follows that
$$[X_k,[X_{k-1},\dots,[X_2,X_1]\cdots]]\in\mathscr{D}(\mathscr{X})$$
for every $X_1,\dots,X_k\in\mathscr{X}$, $k\ge2$. Conversely, the set of finite $\mathbb{R}$-linear combinations of the form in the statement of the result is easily shown to be an ideal of $\mathscr{L}^{(\infty)}(\mathscr{X})$ and it clearly contains the brackets $[X_1,X_2]$ for $X_1,X_2\in\mathscr{X}$. Thus $\mathscr{D}(\mathscr{X})$ is contained in this set of linear combinations, which completes the proof.

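We then define
$$\mathscr{I}(\mathscr{X})=\mathrm{span}_{\mathbb{R}}(X+Y\mid X\in\mathscr{X}_0,\ Y\in\mathscr{D}(\mathscr{X})).$$
The following characterisation of $\mathscr{I}(\mathscr{X})$ is useful.

5.3.28 Proposition (Characterisation of $\mathscr{I}(\mathscr{X})$) Let $r\in\{\infty,\omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X}\subset\Gamma^r(TM)$ be a family of $C^r$-vector fields. Then the following statements hold:
(i) $\mathscr{I}(\mathscr{X})$ is an ideal of $\mathscr{L}^{(\infty)}(\mathscr{X})$;
(ii) the codimension of $\mathscr{I}(\mathscr{X})$ in $\mathscr{L}^{(\infty)}(\mathscr{X})$ is zero if $\mathscr{I}(\mathscr{X})\cap\mathscr{X}\neq\emptyset$ and is one otherwise.
Proof (i) If $Y\in\mathscr{L}^{(\infty)}(\mathscr{X})$ and if $X\in\mathscr{I}(\mathscr{X})$, then $[Y,X]$ is obviously in the derived algebra of $\mathscr{L}^{(\infty)}(\mathscr{X})$, by definition of the derived algebra. Since $\mathscr{D}(\mathscr{X})\subset\mathscr{I}(\mathscr{X})$, this part of the result follows.
(ii) From Proposition 5.3.1 and Proposition 5.3.27 we have that any element of $\mathscr{L}^{(\infty)}(\mathscr{X})$ can be written as
$$\sum_{j=1}^k\lambda_jX_j+Y,\qquad X_1,\dots,X_k\in\mathscr{X},\ \lambda_1,\dots,\lambda_k\in\mathbb{R},\ Y\in\mathscr{D}(\mathscr{X}).$$
Thus $\mathscr{L}^{(\infty)}(\mathscr{X})$ is the sum of the subspaces $\mathrm{span}_{\mathbb{R}}(\mathscr{X})$ and $\mathscr{D}(\mathscr{X})$. Referring to Proposition 5.3.26, $\mathscr{I}(\mathscr{X})$ is the sum of the subspaces $L(\mathrm{aff}(\mathscr{X}))$ and $\mathscr{D}(\mathscr{X})$. Note that $L(\mathrm{aff}(\mathscr{X}))$ is a subspace of $\mathrm{span}_{\mathbb{R}}(\mathscr{X})$. Moreover, from Proposition 4.3.7, $L(\mathrm{aff}(\mathscr{X}))=\mathrm{span}_{\mathbb{R}}(\mathscr{X})$ if and only if $\mathscr{X}\cap L(\mathrm{aff}(\mathscr{X}))\neq\emptyset$. Also by Proposition 4.3.7, if $\mathscr{X}\cap L(\mathrm{aff}(\mathscr{X}))=\emptyset$, then the codimension of $L(\mathrm{aff}(\mathscr{X}))$ in $\mathrm{span}_{\mathbb{R}}(\mathscr{X})$ is one. This gives this part of the result.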

We define
$$I(\mathscr{X})_x=\{X(x)\mid X\in\mathscr{I}(\mathscr{X})\}$$
so that $I(\mathscr{X})$ is a distribution on $M$. Since $\mathscr{I}(\mathscr{X})$ is an ideal of $\mathscr{L}^{(\infty)}(\mathscr{X})$, it is also a Lie subalgebra, and so is a Lie subalgebra of $\Gamma^r(TM)$. Frobenius's Theorem, Theorem 5.3.32 below, ensures that this means that $I(\mathscr{X})$ possesses a maximal integral manifold through any point $x$, and that the tangent space of this integral manifold at any point $y$ is $I(\mathscr{X})_y$. The picture one should have in mind is that $\mathscr{I}(\mathscr{X})$ is to $\mathrm{Orb}_T(x,\mathscr{X})$ what $\mathscr{L}^{(\infty)}(\mathscr{X})$ is to $\mathrm{Orb}(x,\mathscr{X})$. Indeed, one has the following theorem.

5.3.29 Theorem (Fixed-time Orbit Theorem) If $\mathscr{X}$ is a family of complete analytic vector fields on $M$ and $x\in M$, then the following statements are true for each $T\in\mathbb{R}$:
(i) $\mathrm{Orb}_T(x,\mathscr{X})$ is a connected analytic immersed submanifold;
(ii) for each $y\in\mathrm{Orb}_T(x,\mathscr{X})$, $T_y(\mathrm{Orb}_T(x,\mathscr{X}))=I(\mathscr{X})_y$.

5.3.8 Frobenius's Theorem

We next use the Orbit Theorem to prove an important theorem concerning integral manifolds for distributions. The following definitions are essential to Frobenius's Theorem. A notion related to that of an orbit is the following.

5.3.30 Definition (Integral manifold, integrable distribution, foliation) Let $r\in\{\infty,\omega\}$, let $M$ be a $C^r$-manifold, and let $D$ be a $C^r$-distribution.
(i) An integral manifold of $D$ is a $C^r$-immersed submanifold $S$ of $M$ such that $T_xS=D_x$ for every $x\in S$.
(ii) An integral manifold $S$ for $D$ is maximal if it is connected, and if every connected integral manifold $S'$ for $D$ such that $S'\cap S\neq\emptyset$ is an open submanifold of $S$.
(iii) The distribution $D$ is integrable if, for each $x\in M$, there exists an integral manifold of $D$ containing $x$.
(iv) A $C^r$-foliation of $M$ is a family $(S_a)_{a\in A}$ of pairwise disjoint immersed $C^r$-submanifolds such that
(a) $M=\bigcup_{a\in A}S_a$ and
(b) for each $x_0\in M$, there exists a neighbourhood $N$ of $x_0$ and a family $(X_b)_{b\in B}$ of $C^r$-vector fields for which $T_xS_a=\mathrm{span}_{\mathbb{R}}(X_b(x)\mid b\in B)$.

Let us illustrate these definitions with examples.

5.3.31 Examples (Integral manifolds)
1. In Example 5.3.13–2 we considered an example with $M=\mathbb{R}^2$ and defined
$$X_1=x_1\frac{\partial}{\partial x_1},\qquad X_2=x_2\frac{\partial}{\partial x_2}.$$

The orbits are shown in Figure 5.1, and we note that these are also the maximal integral manifolds. Note that the integral manifolds passing through distinct points may have different dimensions. Trivially, the family of maximal integral manifolds comprises a foliation.
2. Let $M=\mathbb{R}^3$ and define
$$X_1=\frac{\partial}{\partial x_2},\qquad X_2=\frac{\partial}{\partial x_1}+x_2\frac{\partial}{\partial x_3},$$
and let $D$ be the distribution generated by $X_1$ and $X_2$. We shall see that Frobenius's Theorem provides an easy means of verifying that this distribution does not possess integral manifolds. However, let us verify this by hand to possibly get some insight. Let us fix $t\in\mathbb{R}$ and compute
$$\begin{aligned}
\Phi^{X_1}_t(0,0,0)&=(0,t,0),\\
\Phi^{X_2}_t\circ\Phi^{X_1}_t(0,0,0)&=(t,t,t^2),\\
\Phi^{X_1}_{-t}\circ\Phi^{X_2}_t\circ\Phi^{X_1}_t(0,0,0)&=(t,0,t^2),\\
\Phi^{X_2}_{-t}\circ\Phi^{X_1}_{-t}\circ\Phi^{X_2}_t\circ\Phi^{X_1}_t(0,0,0)&=(0,0,t^2).
\end{aligned}$$

Now suppose that $S$ is an integral manifold for $D$ containing $0=(0,0,0)$. Thus $S$ must be two-dimensional. Since $X_1,X_2\in\Gamma^\infty(D)$ and since $TS\subset D$, it must be the case that the integral curves, and therefore concatenations of integral curves, of $X_1$ and $X_2$ with initial conditions in $S$ must remain in $S$. Therefore, for each $t\in\mathbb{R}$, we must have $(0,0,t^2)\in S$. Therefore, reparameterising by $s=t^2$, $\frac{\partial}{\partial x_3}\in T_0S$. However, one readily checks that $(X_1(0),X_2(0),\frac{\partial}{\partial x_3})$ is linearly independent, prohibiting $S$ from being two-dimensional. Thus $D$ has no integral manifold passing through $0$. One can show, in fact, that $D$ possesses no integral manifolds passing through any point.
3. We next consider the example from Example 5.3.13–3. We take $M=\mathbb{R}^2$ and define
$$X_1=\frac{\partial}{\partial x_1},\qquad X_2=f(x_1)\frac{\partial}{\partial x_2},$$
where
$$f(x)=\begin{cases}e^{-1/x^2},&x\in\mathbb{R}_{>0},\\ 0,&x\in\mathbb{R}_{\le0}.\end{cases}$$
In Example 5.3.13–3 we showed that there was one orbit, and this was all of $\mathbb{R}^2$. If $(x_{01},x_{02})\in\mathbb{R}^2$ with $x_{01}>0$ we can see that $\{(x_1,x_2)\in\mathbb{R}^2\mid x_1\in\mathbb{R}_{>0}\}$ is the unique maximal integral manifold of $D(\mathscr{X})$ through $(x_{01},x_{02})$. For $(x_{01},x_{02})\in\mathbb{R}^2$ with $x_{01}<0$, the maximal integral manifold of $D(\mathscr{X})$ through $(x_{01},x_{02})$ is $\{(x_1,x_{02})\mid x_1\in\mathbb{R}_{<0}\}$. Note that there are no integral manifolds through points on the $x_2$-axis.

The theorem we will state has two parts. The smooth part was proved by Frobenius [1877] and the analytic part was proved by Nagano [1966]. Contributions also come from [Hermann 1960]. In the statement of the theorem, we say that a distribution $D$ is involutive if $L^{(\infty)}(D)=D$.

5.3.32 Theorem (Frobenius's Theorem) Let $r\in\{\infty,\omega\}$, let $M$ be a $C^r$-manifold, and let $D$ be a $C^r$-distribution on $M$. Then the following statements hold:
(i) if $r=\infty$ and if $\mathrm{rank}_D$ is locally constant, then $D$ is integrable if and only if it is involutive;
(ii) if $r=\omega$ then $D$ is integrable if and only if it is involutive.
Moreover, in case the hypotheses are satisfied in either of the above cases, the set of maximal integral manifolds forms a foliation of $M$.
Proof First suppose that $D$ is integrable. Let $x_0\in M$ and let $S$ be the maximal integral manifold through $x_0$. Since every $D$-valued vector field is tangent to $S$, this since $TS\subset D$, it follows that $U\subset S$ for some neighbourhood $U$ of $x_0$ in the orbit topology. Thus $TU\subset TS$. Moreover, if $x\in U$ and if $v_x\in T_xS=D_x$, then let $X\in\Gamma^r(D)$ be such that $v_x=X(x)$. Then, since $\Phi^X_t(x)\in U\subset\mathrm{Orb}(x_0,\Gamma^r(D))$ for $t$ sufficiently small, it follows that $X(x)=v_x\in T_xU$. That is to say, for $x\in U$, $T_xS=T_x\mathrm{Orb}(x_0,\Gamma^r(D))$. Since $L^{(\infty)}(D)_x\subset T_x\mathrm{Orb}(x_0,\Gamma^r(D))$ by Theorem 5.3.18, it follows that $L^{(\infty)}(D)_x=D_x$, so $D$ is involutive.
Conversely, suppose that $D$ is involutive. Note that the conditions in both parts of the theorem ensure that $\Gamma^r(D)$ is locally finitely generated as a $C^r(M)$-module. Since $L^{(\infty)}(D)=D$, it follows that $\mathscr{L}^{(\infty)}(D)$ is locally finitely generated as a $C^r(M)$-module. Therefore, by Theorem 5.3.21, $T_x\mathrm{Orb}(x,\Gamma^r(D))=D_x$ for every $x\in M$. Thus $\mathrm{Orb}(x,\Gamma^r(D))$ is an integral manifold for $D$ through $x$.
Now we verify the final assertion of the theorem. Disjointness of the orbits ensures that the orbits are maximal integral manifolds, and moreover shows that the maximal integral manifolds form a partition of $M$. That this partition is a foliation follows since local generators for $D$ will satisfy part (iv b) in the definition of a foliation.
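For a rank-two distribution on $\mathbb{R}^3$ spanned by two vector fields, involutivity at a point can be tested by checking whether the bracket of the generators lies in their span. The sketch below does this numerically. It rests on assumptions made here for illustration: the generators are linearly independent at the test point, the bracket is computed with central differences using $[X,Y]=DY\cdot X-DX\cdot Y$, and the helper names and tolerance are hypothetical. The pair `X1`, `X2` is the one from Example 5.3.31–2; the pair `X1`, `Z2` is a made-up involutive comparison.

```python
# Pointwise involutivity test for a rank-2 distribution on R^3.
# Assumptions: generators independent at the test point; finite-difference
# brackets; tolerance chosen by hand.

def directional_derivative(field, x, v, eps=1e-6):
    """Central-difference approximation of D field(x) . v."""
    plus = field([a + eps * b for a, b in zip(x, v)])
    minus = field([a - eps * b for a, b in zip(x, v)])
    return [(p - m) / (2 * eps) for p, m in zip(plus, minus)]

def bracket(X, Y, x):
    """[X, Y](x) = DY(x) X(x) - DX(x) Y(x)."""
    dY_X = directional_derivative(Y, x, X(x))
    dX_Y = directional_derivative(X, x, Y(x))
    return [a - b for a, b in zip(dY_X, dX_Y)]

def det3(u, v, w):
    """Determinant of the 3x3 matrix with rows u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def bracket_in_span(X, Y, x, tol=1e-6):
    """True if [X, Y](x) lies in span(X(x), Y(x)), up to `tol`."""
    return abs(det3(X(x), Y(x), bracket(X, Y, x))) < tol

def X1(x):
    return [0.0, 1.0, 0.0]      # d/dx2

def X2(x):
    return [1.0, 0.0, x[1]]     # d/dx1 + x2 d/dx3  (Example 5.3.31-2)

def Z2(x):
    return [1.0, 0.0, x[0]]     # d/dx1 + x1 d/dx3  (commutes with X1)

if __name__ == "__main__":
    point = [0.4, -1.3, 2.0]
    print("span(X1, X2) involutive at point:", bracket_in_span(X1, X2, point))
    print("span(X1, Z2) involutive at point:", bracket_in_span(X1, Z2, point))
```

A single failing point already rules out involutivity; passing at sampled points is of course only evidence, not a proof.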

Note that Example 5.3.31–3 shows that any attempt to relax the constant rank condition in the $C^\infty$-case will be met with failure in general.

5.3.9 Equivalence of Lie subalgebras of vector fields

The final topic that we discuss here that is directly related to the Orbit Theorem has to do with equivalence of families of vector fields. Let us say what this means.

5.3.33 Definition (Local and global equivalence of families of vector fields) Let $r\in\{\infty,\omega\}$, let $M$ and $N$ be $C^r$-manifolds, and let $\mathscr{X}\subset\Gamma^r(TM)$ and $\mathscr{Y}\subset\Gamma^r(TN)$.
(i) The families $\mathscr{X}$ and $\mathscr{Y}$ are locally equivalent about $x_0\in M$ and $y_0\in N$ if there exist
(a) neighbourhoods $U$ of $x_0$ and $V$ of $y_0$ and
(b) a $C^r$-diffeomorphism $\Phi\colon U\to V$
such that $\Phi_*$ is a bijection from $\mathscr{X}|U$ to $\mathscr{Y}|V$.
(ii) The families $\mathscr{X}$ and $\mathscr{Y}$ are globally equivalent if there exists a diffeomorphism $\Phi\colon M\to N$ such that $\Phi_*$ is a bijection from $\mathscr{X}$ to $\mathscr{Y}$.

One would like, of course, to have computable conditions that verify when two families of vector fields are locally or globally equivalent. Doing this in any generality is difficult. However, it is possible to give conditions that apply to the situation of interest in control theory. To do so requires some terminology.

5.3.34 Definition (Transitive, complete, and point separating Lie algebras of vector fields, isotropy subalgebra) Let $r\in\{\infty,\omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{L}\subset\Gamma^r(TM)$ be a Lie subalgebra. The subalgebra $\mathscr{L}$:
(i) is transitive if $\{X(x)\mid X\in\mathscr{L}\}=T_xM$ for every $x\in M$;
(ii) is complete if, for every $X\in\mathscr{L}$, every $x\in M$, and every bounded sequence $(t_j)_{j\in\mathbb{Z}_{>0}}$ in $I(X,x)$, convergence of $((\Phi^X_{t_j})^*Y(x))_{j\in\mathbb{Z}_{>0}}$ in $T_xM$ for every $Y\in\mathscr{L}$ implies that the sequence $(\Phi^X_{t_j}(x))_{j\in\mathbb{Z}_{>0}}$ has a limit point;
(iii) separates points if, for distinct $x_1,x_2\in M$, there exists $X\in\mathscr{L}$ such that $X(x_1)=0_{x_1}$ and $X(x_2)\neq0_{x_2}$.
For $x\in M$, the isotropy subalgebra of $\mathscr{L}$ at $x$ is
$$\mathscr{L}_x=\{X\in\mathscr{L}\mid X(x)=0_x\}.$$

The following property of the isotropy algebra is useful.

5.3.35 Lemma (Quotient by the isotropy algebra) If $r\in\{\infty,\omega\}$, if $M$ is a $C^r$-manifold, and if $\mathscr{L}\subset\Gamma^r(TM)$ is a transitive Lie subalgebra, then the map $\iota(\mathscr{L})_x\colon\mathscr{L}/\mathscr{L}_x\to T_xM$ defined by
$$\iota(\mathscr{L})_x(X+\mathscr{L}_x)=X(x)$$
is a vector space isomorphism.
Proof Let us first check that the map is well-defined. Let $X,X'\in\mathscr{L}$ with $X-X'\in\mathscr{L}_x$. Then $X'(x)=X(x)-(X-X')(x)=X(x)$, giving well-definedness. Linearity of $\iota(\mathscr{L})_x$ is clear. Let $v_x\in T_xM$. By transitivity of $\mathscr{L}$ there exists $X\in\mathscr{L}$ such that $X(x)=v_x$. Then $\iota(\mathscr{L})_x(X+\mathscr{L}_x)=v_x$, giving surjectivity of $\iota(\mathscr{L})_x$. If $\iota(\mathscr{L})_x(X+\mathscr{L}_x)=0_x$ then $X(x)=0_x$ so $X\in\mathscr{L}_x$, giving injectivity of $\iota(\mathscr{L})_x$.

Next we state a theorem of Sussmann [1974] (generalising a result of Nagano [1966]) which gives conditions for global equivalence. Actually, it is not so much of interest to us to be able to actually check these conditions. The main thing of interest to us is the character of the conditions guaranteeing local or global equivalence.

5.3.36 Theorem (Global equivalence of Lie algebras of vector fields) Let $M$ and $N$ be connected $C^\omega$-manifolds, and let $\mathscr{L}\subset\Gamma^\omega(TM)$ and $\mathscr{M}\subset\Gamma^\omega(TN)$ be Lie subalgebras. Assume the following:
(i) $\mathscr{L}$ and $\mathscr{M}$ are transitive;
(ii) there exists a Lie algebra isomorphism $\phi\colon\mathscr{L}\to\mathscr{M}$;
(iii) for some $x_0\in M$ and $y_0\in N$ it holds that $\phi(\mathscr{L}_{x_0})=\mathscr{M}_{y_0}$;
(iv) either
(a) $M$ and $N$ are simply connected or
(b) the Lie subalgebras $\mathscr{L}$ and $\mathscr{M}$ separate points.
Then there exists a unique analytic diffeomorphism $\Phi\colon M\to N$ for which $\Phi(x_0)=y_0$ and for which $\phi=\Phi_*|\mathscr{L}$.
Proof The hypotheses of the theorem imply that $\dim(M)$ and $\dim(N)$ are well-defined and equal. We let this dimension be denoted by $n$. We consider the product manifold $M\times N$. For vector fields $X\in\Gamma^\omega(TM)$ and $Y\in\Gamma^\omega(TN)$ we define $X\oplus Y\in\Gamma^\omega(T(M\times N))$ by $X\oplus Y(x,y)=(X(x),Y(y))$, noting that $T_{(x,y)}(M\times N)\simeq T_xM\oplus T_yN$. One can show, e.g., in coordinates, that
$$[X_1\oplus Y_1,X_2\oplus Y_2]=[X_1,X_2]\oplus[Y_1,Y_2].\qquad(5.7)$$
Let
$$\mathscr{L}\oplus\mathscr{M}=\{X\oplus\phi(X)\mid X\in\mathscr{L}\},$$
and one shows by (5.7) that $\mathscr{L}\oplus\mathscr{M}$ is a Lie subalgebra of $\Gamma^\omega(T(M\times N))$.

1 Lemma For $(x,y)\in M\times N$, $\mathrm{rank}(D(\mathscr{L}\oplus\mathscr{M})_{(x,y)})\ge n$, and equality holds if and only if $\phi(\mathscr{L}_x)=\mathscr{M}_y$.
Proof If $u_x\in T_xM$ then transitivity of $\mathscr{L}$ ensures that $u_x=X(x)$ for some $X\in\mathscr{L}$. Therefore, the projection $\mathrm{pr}_1$ from $D(\mathscr{L}\oplus\mathscr{M})_{(x,y)}$ to $T_xM$ is surjective, giving the inequality in the statement of the lemma.
Now suppose that $\phi(\mathscr{L}_x)=\mathscr{M}_y$. Then, if $\mathrm{pr}_1(X(x),\phi(X)(y))=0_x$ for $X\in\mathscr{L}$ we must have $X\in\mathscr{L}_x$, and so $\phi(X)\in\mathscr{M}_y$, and so $(X(x),\phi(X)(y))=0_{(x,y)}$. This gives injectivity of $\mathrm{pr}_1$ and so gives $\mathrm{rank}(D(\mathscr{L}\oplus\mathscr{M})_{(x,y)})=n$.
Finally, suppose that $\mathrm{rank}(D(\mathscr{L}\oplus\mathscr{M})_{(x,y)})=n$ (i.e., that $\mathrm{pr}_1$ is injective) and suppose that $X\in\mathscr{L}_x$. Since $\mathrm{pr}_1$ is injective and $\mathrm{pr}_1(X(x),\phi(X)(y))=X(x)=0_x$, $(X(x),\phi(X)(y))=(0_x,0_y)$, showing that $\phi(X)\in\mathscr{M}_y$. Thus $\phi(\mathscr{L}_x)\subset\mathscr{M}_y$. The argument can be flipped to give the opposite inclusion.

By Corollary 5.3.25 the orbit of $\mathscr{L}\oplus\mathscr{M}$ through $(x_0,y_0)$ is an $n$-dimensional immersed submanifold. By Frobenius's Theorem, this orbit is an integral manifold of $D(\mathscr{L}\oplus\mathscr{M})$. Let us denote this orbit by
$$S(x_0,y_0)=\mathrm{Orb}((x_0,y_0),\mathscr{L}\oplus\mathscr{M}).$$
Note that Lemma 1 shows that for all $(x,y)\in S(x_0,y_0)$ it holds that $\phi(\mathscr{L}_x)=\mathscr{M}_y$. The projections from $M\times N$ to $M$ and $N$, respectively, restrict to maps from $S(x_0,y_0)$ to $M$ and $N$, respectively, that we denote by $\pi_M$ and $\pi_N$. We use another lemma.

2 Lemma For $(x,y)\in S(x_0,y_0)$ and $X\in\mathscr{L}$, $I(X,x)=I(\phi(X),y)$.
Proof Suppose that $I(X,x)=(a,b)$ and that $I(\phi(X),y)=(c,d)$, allowing that some or all of the endpoints might be infinite. We first show that $d\ge b$. Suppose otherwise and that $d<b$. This means that $d$ is finite. We recall from Lemma 5.3.35 that the vector spaces $T_xM$ and $\mathscr{L}/\mathscr{L}_x$ are isomorphic, as are $T_yN$ and $\mathscr{M}/\mathscr{M}_y$. We shall tacitly make these identifications in the following argument. Let $\phi_{(x,y)}\colon\mathscr{L}/\mathscr{L}_x\to\mathscr{M}/\mathscr{M}_y$ be defined by
$$\phi_{(x,y)}(X+\mathscr{L}_x)=\phi(X)+\mathscr{M}_y.$$
We claim that this map is well-defined. Indeed, if $X,X'\in\mathscr{L}$ satisfy $X-X'\in\mathscr{L}_x$ then
$$\phi(X')+\mathscr{M}_y=\phi(X-(X-X'))+\mathscr{M}_y=\phi(X)-\phi(X-X')+\mathscr{M}_y=\phi(X)+\mathscr{M}_y,$$
since $\phi(\mathscr{L}_x)=\mathscr{M}_y$ by Lemma 1. For $X'\in\mathscr{L}$ we define $\gamma_{X'}\colon(a,b)\to T_xM$ by
$$\gamma_{X'}(t)=(\Phi^X_t)^*X'(x).$$

Repeated application of Theorem 4.2.19 from [Abraham, Marsden, and Ratiu 1988] gives dk dtk
t=0

X (t) = [X, [X, . . . , [X, X ] ]](x),

with k occurrences of X in the bracket on the right. Similarly we dene (X ) : (c, d) T y N by (X) (X ) (t) = (t ) (X )( y), and determine that dk dtk
t=0

(X ) (t) = [(X), [(X), . . . , [(X), (X )] ]]( y).

Since is a Lie algebra isomorphism it follows that [(X), [(X), . . . , [(X), (X )] ]] = ([X, [X, . . . , [X, X ] ]]). Therefore, dk dk ( t ) = (X ) (t) X dtk t=0 dtk t=0 for every k Z0 , making the identications Tx M L /Lx and T y N since the curves X and (X ) are analytic, we have (x, y)
(x, y) ((X t ) X (x)) = (t (X)

M /M y . Therefore,

) (X )( y)

for all $t \in [0, d)$. Let $(t_j)_{j \in \mathbb{Z}_{>0}}$ be a sequence in $[0, d)$ converging to $d$. Continuity of the flow ensures that
$$\lim_{j \to \infty} (\Phi^X_{t_j})^* X'(x) = (\Phi^X_d)^* X'(x).$$
Thus $\lim_{j \to \infty} (\Phi^{\phi(X)}_{t_j})^* \phi(X')(y)$ exists, and completeness of $\mathscr{M}$ ensures that the sequence $(\Phi^{\phi(X)}_{t_j}(y))_{j \in \mathbb{Z}_{>0}}$ has a limit point which we denote by $\bar{y}$. Then there exists $\epsilon \in \mathbb{R}_{>0}$ and a neighbourhood $U$ of $\bar{y}$ such that $\Phi^{\phi(X)}_t(y')$ exists for all $t \in (-\epsilon, \epsilon)$ and all $y' \in U$. Let $k \in \mathbb{Z}_{>0}$ be such that $\Phi^{\phi(X)}_{t_k}(y) \in U$ and such that $d - t_k < \epsilon$. Then the integral curve $t \mapsto \Phi^{\phi(X)}_t(y)$ for $\phi(X)$ through $y$ can be extended to be defined on $[0, t_k + \epsilon)$. Since $d < t_k + \epsilon$, this contradicts the fact that $\sup I(\phi(X), y) = d$. Therefore, we cannot have $d < b$.

The above argument shows that $d \ge b$. Reversing the roles of $X$ and $\phi(X)$, using completeness of $\mathscr{L}$, one shows that $b \ge d$. By replacing $X$ with $-X$ and repeating the arguments, one shows that $c = a$.


3 Lemma The maps $\pi_M \colon S(x_0, y_0) \to M$ and $\pi_N \colon S(x_0, y_0) \to N$ are covering maps.

Proof First we show that the maps $\pi_M$ and $\pi_N$ are surjective. Let $x \in M$. Since $D(\mathscr{L})_x = T_xM$ for every $x \in M$, the Rashevsky–Chow Theorem ensures that there exist vector fields $X_1, \dots, X_k \in \mathscr{L}$ and $t_1, \dots, t_k \in \mathbb{R}_{>0}$ such that
$$x = \Phi^{X_k}_{t_k} \circ \cdots \circ \Phi^{X_1}_{t_1}(x_0).$$
Let us define
$$x_j = \Phi^{X_j}_{t_j} \circ \cdots \circ \Phi^{X_1}_{t_1}(x_0), \qquad j \in \{0, 1, \dots, k\}.$$
We claim that for each $j \in \{0, 1, \dots, k\}$ there exists $y_j \in N$ such that $(x_j, y_j) \in S(x_0, y_0)$. We prove this by induction on $j$, this being clearly true for $j = 0$. So suppose that the assertion is true for $j \in \{0, 1, \dots, m\}$ for $m \in \{0, 1, \dots, k - 1\}$. Thus there exists $y_m \in N$ such that $(x_m, y_m) \in S(x_0, y_0)$. The integral curve of the vector field $X_{m+1} \times \phi(X_{m+1})$ through $(x_m, y_m)$ is then $t \mapsto (\Phi^{X_{m+1}}_t(x_m), \Phi^{\phi(X_{m+1})}_t(y_m))$. Since the flow of $X_{m+1}$ is assumed to be defined on an interval containing $[0, t_{m+1}]$, the integral curve of $\phi(X_{m+1})$ is also defined on such an interval, this by Lemma 2. Thus $y_{m+1} = \Phi^{\phi(X_{m+1})}_{t_{m+1}}(y_m)$ has the property that $(x_{m+1}, y_{m+1}) \in S(x_0, y_0)$. This proves by induction, in particular, that there exists $y_k \in N$ such that $(x_k, y_k) \in S(x_0, y_0)$. This gives surjectivity of $\pi_M$. Surjectivity of $\pi_N$ is similarly shown.

Next we show that $\pi_M$ and $\pi_N$ are local diffeomorphisms. Let $(x, y) \in S(x_0, y_0)$. Then, by transitivity of $\mathscr{L}$, there exist vector fields $X_1, \dots, X_n \in \mathscr{L}$ such that
$$T_xM = \operatorname{span}_{\mathbb{R}}(X_1(x), \dots, X_n(x)).$$
Then
$$\frac{\partial}{\partial t_j} \Phi^{X_n}_{t_n} \circ \cdots \circ \Phi^{X_1}_{t_1}(x) = X_j(x)$$
when evaluated at $t_1 = \cdots = t_n = 0$. Thus, by the Inverse Function Theorem, there exists $\epsilon \in \mathbb{R}_{>0}$ such that the map
$$\psi_x \colon (-\epsilon, \epsilon)^n \to M, \qquad (t_1, \dots, t_n) \mapsto \Phi^{X_n}_{t_n} \circ \cdots \circ \Phi^{X_1}_{t_1}(x)$$
is a diffeomorphism onto its image. We claim that $T_yN = \operatorname{span}_{\mathbb{R}}(\phi(X_1)(y), \dots, \phi(X_n)(y))$. It suffices to show linear independence of the proposed spanning set. To see this, suppose that $c_1 \phi(X_1)(y) + \cdots + c_n \phi(X_n)(y) = 0_y$ for $c_1, \dots, c_n \in \mathbb{R}$. Then $\phi(c_1 X_1 + \cdots + c_n X_n) \in \mathscr{M}_y = \phi(\mathscr{L}_x)$.

Therefore, there exists $X \in \mathscr{L}_x$ such that
$$\phi(c_1 X_1 + \cdots + c_n X_n) = \phi(X) \implies c_1 X_1 + \cdots + c_n X_n = X \implies c_1 X_1(x) + \cdots + c_n X_n(x) = 0_x,$$
and so $c_1 = \cdots = c_n = 0$, giving the desired linear independence. Now the map
$$\psi_y \colon (-\epsilon, \epsilon)^n \to N, \qquad (t_1, \dots, t_n) \mapsto \Phi^{\phi(X_n)}_{t_n} \circ \cdots \circ \Phi^{\phi(X_1)}_{t_1}(y)$$
is well-defined (by Lemma 2) and a diffeomorphism onto its image for $\epsilon$ sufficiently small. Denote
$$U(x) = \{\psi_x(t) \mid t \in (-\epsilon, \epsilon)^n\}, \quad U(y) = \{\psi_y(t) \mid t \in (-\epsilon, \epsilon)^n\}, \quad U(x, y) = \{(\psi_x(t), \psi_y(t)) \mid t \in (-\epsilon, \epsilon)^n\}.$$
Thus the map
$$U(x) \ni \psi_x(t) \mapsto (\psi_x(t), \psi_y(t) = \psi_y \circ \psi_x^{-1} \circ \psi_x(t)) \in U(x, y)$$

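is a diffeomorphism onto its image since both $\psi_y$ and $\psi_x^{-1}$ are diffeomorphisms onto their images. This proves that $\pi_M$ is a local diffeomorphism. The same argument can be applied to show that $\pi_N$ is a local diffeomorphism.

Now we let $x \in M$ and show that if $(x, y_1), (x, y_2) \in \pi_M^{-1}(x)$ are distinct, then $U(x, y_1) \cap U(x, y_2) = \emptyset$. Assume the contrary and let $(x', y')$ be in the intersection. Then, using our notation above,
$$y' = \psi_{y_1} \circ \psi_x^{-1}(x') = \psi_{y_2} \circ \psi_x^{-1}(x').$$
Let $t = \psi_x^{-1}(x')$. Then we have
$$\Phi^{\phi(X_n)}_{t_n} \circ \cdots \circ \Phi^{\phi(X_1)}_{t_1}(y_1) = \Phi^{\phi(X_n)}_{t_n} \circ \cdots \circ \Phi^{\phi(X_1)}_{t_1}(y_2),$$
from which we deduce, by uniqueness of integral curves, that $y_1 = y_2$: a contradiction. One similarly shows that if $(x_1, y), (x_2, y) \in \pi_N^{-1}(y)$ are distinct, then $U(x_1, y) \cap U(x_2, y) = \emptyset$.

The final step in showing that the maps $\pi_M$ and $\pi_N$ are covering maps is to show that, e.g.,
$$\pi_M^{-1}(U(x)) = \bigcup_{(x, y) \in \pi_M^{-1}(x)} U(x, y).$$
Let $(x', y') \in \pi_M^{-1}(U(x))$, so that $x' \in U(x)$ and $(x', y') \in S(x_0, y_0)$. Then $x' = \psi_x(t)$ for some unique $t \in \mathbb{R}^n$ and so
$$x = \Phi^{X_1}_{-t_1} \circ \cdots \circ \Phi^{X_n}_{-t_n}(x').$$
Using Lemma 2 to ensure well-definedness, we can define
$$y = \Phi^{\phi(X_1)}_{-t_1} \circ \cdots \circ \Phi^{\phi(X_n)}_{-t_n}(y').$$
Then
$$(x, y) = \big(\Phi^{X_1}_{-t_1} \circ \cdots \circ \Phi^{X_n}_{-t_n}(x'),\ \Phi^{\phi(X_1)}_{-t_1} \circ \cdots \circ \Phi^{\phi(X_n)}_{-t_n}(y')\big) = \Phi^{X_1 \times \phi(X_1)}_{-t_1} \circ \cdots \circ \Phi^{X_n \times \phi(X_n)}_{-t_n}(x', y').$$
Thus $(x, y) \in S(x_0, y_0)$. Since
$$x' = \Phi^{X_n}_{t_n} \circ \cdots \circ \Phi^{X_1}_{t_1}(x), \qquad y' = \Phi^{\phi(X_n)}_{t_n} \circ \cdots \circ \Phi^{\phi(X_1)}_{t_1}(y),$$
it follows that $(x', y') = (\psi_x(t), \psi_y(t)) \in U(x, y)$, as desired. Similarly one shows that
$$\pi_N^{-1}(U(y)) = \bigcup_{(x, y) \in \pi_N^{-1}(y)} U(x, y).$$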

If $M$ and $N$ are simply connected then the covering maps $\pi_M$ and $\pi_N$ must be single coverings, i.e., they are bijections [Lee 2004, Corollary 11.24]. Since they are local diffeomorphisms, they are in fact diffeomorphisms, and the map $\Phi = \pi_N \circ \pi_M^{-1}$ is then a diffeomorphism from $M$ to $N$.

If $\mathscr{L}$ and $\mathscr{M}$ separate points, then we also claim that $\pi_M$ and $\pi_N$ are diffeomorphisms. The only thing to show is that they are injective. Suppose that $\pi_M$ is not injective. Then, for some $x \in M$, there exist distinct $y_1, y_2 \in N$ such that $(x, y_1), (x, y_2) \in S(x_0, y_0)$. By Lemma 1 it follows that $\phi(\mathscr{L}_x) = \mathscr{M}_{y_1} = \mathscr{M}_{y_2}$. This, however, contradicts the fact that $\mathscr{M}$ separates points. Thus $\pi_M$ is a diffeomorphism, and one similarly shows that $\pi_N$ is a diffeomorphism.

Next we show that $\Phi_* X = \phi(X)$ for every $X \in \mathscr{L}$. First let us determine the derivatives of the diffeomorphisms $\pi_M$ and $\pi_N$. Let $\mathrm{pr}_M \colon M \times N \to M$ and $\mathrm{pr}_N \colon M \times N \to N$ be the projections. We then have $\pi_M = \mathrm{pr}_M \circ \iota$ and $\pi_N = \mathrm{pr}_N \circ \iota$, where $\iota \colon S(x_0, y_0) \to M \times N$ is the inclusion. Therefore, for $(x, y) \in S(x_0, y_0)$,
$$T_{(x,y)}\pi_M(u_x, v_y) = T_{(x,y)}\mathrm{pr}_M \circ T_{(x,y)}\iota(u_x, v_y) = T_{(x,y)}\mathrm{pr}_M(T\iota(u_x, v_y)) = u_x,$$
where $T\iota \colon TS(x_0, y_0) \to T(M \times N)$ is the inclusion, i.e., the derivative of the inclusion $\iota$. Similarly, $T_{(x,y)}\pi_N(u_x, v_y) = v_y$. Note that both $T_{(x,y)}\pi_M$ and $T_{(x,y)}\pi_N$ are isomorphisms. Now let $X \in \mathscr{L}$ and let $x \in M$. Since $S(x_0, y_0)$ is an $\mathscr{L} \times \mathscr{M}$-orbit, it follows that $(X(x), \phi(X)(\Phi(x))) \in T_{(x, \Phi(x))}S(x_0, y_0)$. Therefore,
$$\begin{aligned}
\Phi_* X(\Phi(x)) &= T_x\Phi \circ X \circ \Phi^{-1}(\Phi(x)) = T_x\Phi(X(x)) \\
&= T_x(\pi_N \circ \pi_M^{-1})(X(x)) = T_{\pi_M^{-1}(x)}\pi_N \circ T_x\pi_M^{-1}(X(x)) \\
&= T_{\pi_M^{-1}(x)}\pi_N(X(x), \phi(X)(\Phi(x))) = \phi(X)(\Phi(x)),
\end{aligned}$$
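as desired.

Finally, we show uniqueness of the diffeomorphism $\Phi$. Let $\Psi \colon M \to N$ be any diffeomorphism for which $\Psi(x_0) = y_0$ and for which $\Psi_* X = \phi(X)$ for every $X \in \mathscr{L}$. Then, by Proposition 4.2.4 of [Abraham, Marsden, and Ratiu 1988], $\Phi^{\phi(X)}_t(\Psi(x)) = \Psi \circ \Phi^X_t(x)$ for every $(t, x) \in D(X)$. Now let $x \in M$ and use the Rashevsky–Chow Theorem to write
$$x = \Phi^{X_k}_{t_k} \circ \cdots \circ \Phi^{X_1}_{t_1}(x_0)$$
for $X_1, \dots, X_k \in \mathscr{L}$ and $t_1, \dots, t_k \in \mathbb{R}_{>0}$. Then
$$\begin{aligned}
\Psi(x) &= \Psi \circ \Phi^{X_k}_{t_k} \circ \cdots \circ \Phi^{X_1}_{t_1}(x_0) \\
&= \Phi^{\phi(X_k)}_{t_k} \circ \Psi \circ \Phi^{X_{k-1}}_{t_{k-1}} \circ \cdots \circ \Phi^{X_1}_{t_1}(x_0) \\
&\;\;\vdots \\
&= \Phi^{\phi(X_k)}_{t_k} \circ \cdots \circ \Phi^{\phi(X_1)}_{t_1}(y_0),
\end{aligned}$$
since $\Psi(x_0) = y_0$. Therefore, the conditions $\Psi(x_0) = y_0$ and $\Psi_* X = \phi(X)$ uniquely determine $\Psi(x)$ for every $x \in M$.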

There are other hypotheses one can use for the preceding theorem to arrive at the same conclusions. Some hypotheses are stronger, some are equivalent. Rather than state various corollaries which give the same conclusions under different hypotheses, we shall simply prove the statements about relationships between hypotheses, leaving the reader to plug these into the preceding theorem to give the desired corollaries. This is, we believe, the clearer thing to do.

Let us show that the somewhat difficult to understand condition of completeness follows if all vector fields in the Lie subalgebra are complete.

5.3.37 Proposition (Lie subalgebras of complete vector fields are complete Lie subalgebras of vector fields) If $M$ is a smooth manifold and if $\mathscr{L} \subset \Gamma^\infty(TM)$ is a Lie subalgebra of complete vector fields, then $\mathscr{L}$ is complete.

Proof Let $(t_j)_{j \in \mathbb{Z}_{>0}}$ be a bounded sequence in $\mathbb{R}$. By the Bolzano–Weierstrass Theorem, there is a subsequence $(t_{j_k})_{k \in \mathbb{Z}_{>0}}$ that converges. Let $X \in \mathscr{L}$. Continuity of the flow ensures that the sequence $(\Phi^X_{t_{j_k}})_{k \in \mathbb{Z}_{>0}}$ (which can be defined since $X$ is complete) converges. Thus $(\Phi^X_{t_j})_{j \in \mathbb{Z}_{>0}}$ has a limit point.

We can then give two special cases of Lie subalgebras of vector fields that are complete. The first comes as a corollary to the preceding result after one recalls that all vector fields on a compact manifold are complete [Abraham, Marsden, and Ratiu 1988, Corollary 4.1.20].

5.3.38 Corollary (Lie subalgebras of vector fields on compact manifolds are complete) If $M$ is a smooth compact manifold and if $\mathscr{L} \subset \Gamma^\infty(TM)$ is a Lie subalgebra, then $\mathscr{L}$ is complete.

One also has the following result.

5.3.39 Proposition (The set of all vector fields is complete) If $M$ is an analytic manifold then $\Gamma^\omega(TM)$ is complete.

Proof Let $X \in \Gamma^\omega(TM)$, $x \in M$, and let $(t_j)_{j \in \mathbb{Z}_{>0}}$ be a bounded sequence in $I(X, x)$. We will show that if $(\Phi^X_{t_j}(x))_{j \in \mathbb{Z}_{>0}}$ does not have a limit point, then there exists $Y \in \Gamma^\omega(TM)$ such that $((\Phi^X_{t_j})^* Y(x))_{j \in \mathbb{Z}_{>0}}$ does not converge. Take a sequence $(v_j)_{j \in \mathbb{Z}_{>0}}$ in $T_xM$ that does not converge and define $u_j = T_x\Phi^X_{t_j}(v_j) \in T_{\Phi^X_{t_j}(x)}M$. Taking $Y$ to be an analytic vector field for which $Y(\Phi^X_{t_j}(x)) = u_j$, so that $(\Phi^X_{t_j})^* Y(x) = v_j$, cf. Theorem 2.3.16, gives the result.


While the global equivalence described above is deep and interesting, it is actually the following local result that will be of most interest to us.

5.3.40 Theorem (Local equivalence of Lie algebras of vector fields) Let $M$ and $N$ be $C^\omega$ manifolds, and let $\mathscr{L} \subset \Gamma^\omega(TM)$ and $\mathscr{M} \subset \Gamma^\omega(TN)$ be Lie subalgebras. Assume the following:
(i) $\mathscr{L}$ and $\mathscr{M}$ are transitive;
(ii) there exists a Lie algebra isomorphism $\phi \colon \mathscr{L} \to \mathscr{M}$;
(iii) for some $x_0 \in M$ and $y_0 \in N$ it holds that $\phi(\mathscr{L}_{x_0}) = \mathscr{M}_{y_0}$.
Then there exist neighbourhoods $U$ of $x_0$ and $V$ of $y_0$ and a unique analytic diffeomorphism $\Phi \colon U \to V$ for which $\Phi(x_0) = y_0$ and for which $\phi(X)(y) = \Phi_* X(y)$ for every $X \in \mathscr{L}$ and $y \in V$.

Proof The construction in the proof of Theorem 5.3.36 of $S(x_0, y_0)$ as the orbit of $\mathscr{L} \times \mathscr{M}$ through $(x_0, y_0)$ is still valid as this only depends on the transitivity of $\mathscr{L}$ and $\mathscr{M}$. If $X_1, \dots, X_n \in \mathscr{L}$ are such that $T_{x_0}M = \operatorname{span}_{\mathbb{R}}(X_1(x_0), \dots, X_n(x_0))$ then, just as in the proof of Lemma 3 from Theorem 5.3.36, $T_{y_0}N = \operatorname{span}_{\mathbb{R}}(\phi(X_1)(y_0), \dots, \phi(X_n)(y_0))$. Moreover, again just as in the proof of Lemma 3 from Theorem 5.3.36, we can define
$$\psi_{x_0} \colon (-\epsilon, \epsilon)^n \to M, \qquad (t_1, \dots, t_n) \mapsto \Phi^{X_n}_{t_n} \circ \cdots \circ \Phi^{X_1}_{t_1}(x_0)$$
and
$$\psi_{y_0} \colon (-\epsilon, \epsilon)^n \to N, \qquad (t_1, \dots, t_n) \mapsto \Phi^{\phi(X_n)}_{t_n} \circ \cdots \circ \Phi^{\phi(X_1)}_{t_1}(y_0)$$
for $\epsilon \in \mathbb{R}_{>0}$ sufficiently small, and these maps are local diffeomorphisms. Then, just as in the proof of Lemma 3 from Theorem 5.3.36, it follows that the projections $\pi_M$ and $\pi_N$ are diffeomorphisms from a neighbourhood of $(x_0, y_0)$ to neighbourhoods $U$ of $x_0$ and $V$ of $y_0$, respectively. From this one establishes that $\Phi = \pi_N \circ \pi_M^{-1}|U$ is the desired diffeomorphism, possibly after shrinking $U$ and then taking $V$ to be the image of $U$. Just as in the proof of Theorem 5.3.36 one shows that $\phi(X)(y) = \Phi_* X(y)$ for every $y \in V$. Uniqueness of $\Phi$ follows as in the proof of Theorem 5.3.36.

The results in this section indicate that the information needed to classify families of vector fields is contained in the Lie algebra generated by this family of vector fields. Generally, obtaining this information will be very difficult, but this is a problem that is considered by researchers in special cases. For example, the special case of classifying distributions is one of great interest. Much work has been done in this area, and we refer to [Gardner, Shadwick, and Wilkens 1989, Pasillas-Lépine and Respondek 2001, 2002] as instances of this programme.


5.4 Affine distributions


In one of our geometric models of a control system that we introduce in Chapter 6, the basic geometric ingredient will be what we call an affine distribution. Since we have discussed distributions at some length, our task here will be easier since affine distributions bear some resemblance to distributions.

5.4.1 Definitions

An affine distribution assigns an affine subspace of the tangent space to each point in the manifold. We refer to Section 4.3 for what we shall mean by an affine subspace.

5.4.1 Definition (Affine distribution) Let $M$ be a manifold of class $C^\infty$ or $C^\omega$, as is required. An affine distribution on $M$ is a subset $A \subset TM$ such that, for each $x \in M$, the subset $A_x \triangleq A \cap T_xM$ is an affine subspace (and so, in particular, is nonempty).
(i) An affine distribution $A$ is of class $C^r$, $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, if, for each $x_0 \in M$, there exists a neighbourhood $N$ of $x_0$, a vector field $X_0$ on $N$ of class $C^r$, and a family $(X_j)_{j \in J}$ of $C^r$-vector fields, called local generators, on $N$ such that $A_x = X_0(x) + \operatorname{span}_{\mathbb{R}}(X_j(x) \mid j \in J)$ for each $x \in N$.
(ii) An affine distribution $A$ of class $C^r$, $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, is locally finitely generated if, for each $x_0 \in M$, there exists a neighbourhood $N$ of $x_0$ and a family $(X_0, X_1, \dots, X_k)$ of $C^r$-vector fields, called local generators, on $N$ such that $A_x = X_0(x) + \operatorname{span}_{\mathbb{R}}(X_1(x), \dots, X_k(x))$ for each $x \in N$.
(iii) An affine distribution $A$ of class $C^r$, $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, is finitely generated if there exists a family $(X_0, X_1, \dots, X_k)$ of $C^r$-vector fields, called generators, on $M$ such that $A_x = X_0(x) + \operatorname{span}_{\mathbb{R}}(X_1(x), \dots, X_k(x))$ for each $x \in M$.

As per Proposition 4.3.7, associated with $A_x$ is a subspace of $T_xM$. The distribution which assigns to $x$ this subspace is the linear part of $A$ and is denoted by $L(A)$. The nonnegative integer $\dim(L(A)_x)$ is called the rank of $A$ at $x$ and is sometimes denoted $\operatorname{rank}(A_x)$.

The following result is fairly obvious, but since we shall implicitly use it on many occasions, we state it explicitly.

5.4.2 Lemma (Affine distributions and their linear part) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, let $M$ be a $C^\infty$- or $C^\omega$-manifold, as is required, and let $A$ be a $C^r$-affine distribution on $M$. The following statements hold:
(i) $L(A)$ is a $C^r$-distribution;
(ii) if $U \subset M$ is an open subset, then the following two statements are equivalent for a $C^r$-vector field $X_0$ on $U$ and a family $(X_j)_{j \in J}$ of $C^r$-vector fields on $U$:
(a) $X_0$ is $A$-valued and $(X_j)_{j \in J}$ generate $L(A)|U$;
(b) $A_x = X_0(x) + \operatorname{span}_{\mathbb{R}}(X_j(x) \mid j \in J)$ for every $x \in U$.

Proof Part (i) follows from part (ii), so we just prove the latter.
(ii a) $\implies$ (ii b) Let $x \in U$ and let $v_x \in A_x$. Since $X_0(x) \in A_x$, by Proposition 4.3.7 we have $A_x = X_0(x) + L(A)_x$, and so this part of the result immediately follows.
(ii b) $\implies$ (ii a) Since $0_x \in \operatorname{span}_{\mathbb{R}}(X_j(x) \mid j \in J)$ we have $X_0(x) \in A_x$. This part of the result then follows from Proposition 4.3.7.

As with distributions, affine distributions are globally finitely generated.

5.4.3 Theorem (Affine distributions of class $C^r$ are finitely generated) For $r \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$ and for an affine distribution $A$ on a paracompact Hausdorff manifold $M$ of bounded dimension, the following statements are equivalent:
(i) $A$ is of class $C^r$;
(ii) for each $x_0 \in M$ and each $v_{x_0} \in A_{x_0}$, there exists a neighbourhood $N$ of $x_0$ and a $C^r$-vector field $X$ such that $X(x_0) = v_{x_0}$ and $X(x) \in A_x$ for each $x \in N$;
(iii) there exists a family $(X_0, X_1, \dots, X_k)$ of $C^r$-vector fields on $M$ such that $A_x = X_0(x) + \operatorname{span}_{\mathbb{R}}(X_1(x), \dots, X_k(x))$ for each $x \in M$.

Proof (i) $\implies$ (ii) Let $x_0 \in M$ and let $v_{x_0} \in A_{x_0}$. Let $N$ be a neighbourhood of $x_0$, let $X_0$ be a $C^r$-vector field on $N$, and let $(X_j)_{j \in J}$ be a family of $C^r$-vector fields on $N$ such that $A_x = X_0(x) + \operatorname{span}_{\mathbb{R}}(X_j(x) \mid j \in J)$ for $x \in N$. Let $j_1, \dots, j_k \in J$ be such that $(X_{j_1}(x_0), \dots, X_{j_k}(x_0))$ is a basis for $L(A)_{x_0}$. Then
$$v_{x_0} = X_0(x_0) + c_1 X_{j_1}(x_0) + \cdots + c_k X_{j_k}(x_0)$$
for some uniquely defined $c_1, \dots, c_k \in \mathbb{R}$. The vector field $X = X_0 + c_1 X_{j_1} + \cdots + c_k X_{j_k}$ is of class $C^r$, is $A$-valued on $N$, and satisfies $X(x_0) = v_{x_0}$.

(ii) $\implies$ (iii) By hypothesis, for each $x \in M$ there exists a neighbourhood $N_x$ of $x$ and a $C^r$-vector field $X_x$ on $N_x$ such that $X_x(x') \in A_{x'}$ for $x' \in N_x$. Since $M$ is paracompact, we use [Abraham, Marsden, and Ratiu 1988, Theorem 5.5.12] to provide a smooth partition of unity $((U_j, g_j))_{j \in J}$ subordinate to $(N_x)_{x \in M}$. Thus $(U_j)_{j \in J}$ is a locally finite open cover of $M$ with $U_j \subset N_x$ for some $x \in M$, and the functions $g_j \colon U_j \to [0, 1]$, $j \in J$, are such that $\sum_{j \in J} g_j(x) = 1$ for every $x \in M$ (the sum being finite since $(U_j)_{j \in J}$ is locally finite). For each $j \in J$ let $x_j \in M$ be such that $U_j \subset N_{x_j}$. Now define a vector field $X_0$ on $M$ by
$$X_0(x) = \sum_{j \in J} g_j(x) X_{x_j}(x). \tag{5.8}$$
By local finiteness of $(U_j)_{j \in J}$ the sum is finite for each $x \in M$, so $X_0$ is well-defined. Since the functions $g_j$, $j \in J$, are of class $C^\infty$ and the vector fields $X_{x_j}$, $j \in J$, are of class $C^r$, $X_0$ is of class $C^r$. We claim that $X_0$ takes values in $A$. Note that $A_x$ is convex for each $x \in M$. Thus, since $(g_j(x))_{j \in J}$ has finitely many nonzero terms, all in $[0, 1]$, whose sum is $1$, the sum on the right-hand side of (5.8) is a convex combination of vectors from the convex set $A_x$, and so is also in $A_x$ by Proposition 4.3.4. Since $A$ is of class $C^r$, it follows directly that $L(A)$ is of class $C^r$. Therefore, by Theorem 5.1.2, there exist $C^r$-vector fields $X_1, \dots, X_k$ on $M$ which generate $L(A)$. The vector fields $(X_0, X_1, \dots, X_k)$ then generate $A$.

(iii) $\implies$ (i) This is obvious.
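The convexity step in the preceding proof can be written out in one line. The following is only a sketch: the base point $a$ and the vectors $\ell_j$ are notation introduced here for illustration, and the sums are over the finitely many $j$ with $g_j(x) \neq 0$.

```latex
X_{x_j}(x) = a + \ell_j, \quad a \in A_x,\ \ell_j \in L(A)_x
\quad\Longrightarrow\quad
\sum_{j} g_j(x)\, X_{x_j}(x)
  = \Big(\sum_{j} g_j(x)\Big) a + \sum_{j} g_j(x)\, \ell_j
  = a + \sum_{j} g_j(x)\, \ell_j \in a + L(A)_x = A_x .
```

Only the condition $\sum_j g_j(x) = 1$ is used here, so the argument works for any affine combination and not just for convex ones.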

5.4.2 Regular and singular points

Just as one has the notion of regular and singular points for distributions, these concepts are also applicable to affine distributions.

5.4.4 Definition (Regular point, singular point) Let $A$ be an affine distribution on a manifold $M$. A point $x_0 \in M$
(i) is a regular point for $A$ if there exists a neighbourhood $N$ of $x_0$ such that $\operatorname{rank}(L(A)_x) = \operatorname{rank}(L(A)_{x_0})$ for every $x \in N$ and
(ii) is a singular point for $A$ if it is not a regular point for $A$.
An affine distribution $A$ on $M$ is regular if every point in $M$ is a regular point for $A$, and is singular otherwise.

In the following result, if $A$ is an affine distribution on $M$, then we denote by $\operatorname{rank}_A \colon M \to \mathbb{Z}_{\ge 0}$ the function defined by $\operatorname{rank}_A(x) = \operatorname{rank}(A_x)$.

5.4.5 Proposition (Rank and regular points for differentiable affine distributions) If $A$ is an affine distribution of class $C^1$ on $M$ then the function $\operatorname{rank}_A$ is lower semicontinuous and the set of regular points of $A$ is open and dense.
Proof This follows immediately from Proposition 5.1.7.
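To make the notions of rank and singular point concrete, here is a minimal Python sketch that evaluates $\operatorname{rank}_A$ pointwise. The generators used, $X_1 = \partial/\partial x_1$ and $X_2 = x_1\,\partial/\partial x_2$ on $\mathbb{R}^2$, are a hypothetical choice for the linear part made for this illustration, and the helper names `rank`, `linear_generators`, and `rank_A` are not from the notes.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, by exact Gaussian elimination."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def linear_generators(x1, x2):
    """Hypothetical generators of L(A): X1 = d/dx1 and X2 = x1 d/dx2."""
    return [[1, 0], [0, x1]]


def rank_A(x1, x2):
    """rank(A_x) = dim(L(A)_x); the base vector field X0 plays no role."""
    return rank(linear_generators(x1, x2))
```

For these generators the rank is 2 off the line $x_1 = 0$ and drops to 1 on it, so the singular points form exactly that line. The rank can only jump up under small perturbations of the point, which is what lower semicontinuity says.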

As with distributions, one can devise a class of generators for affine distributions that have some useful properties.

5.4.6 Proposition (A useful class of local generators for affine distributions) Let $M$ be a manifold of class $C^\infty$ or $C^\omega$, as is required, let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, and let $A$ be a $C^r$-affine distribution on $M$. Then, for each $x_0 \in M$ there exists a neighbourhood $N$ of $x_0$ and local generators $(X_0, X_1, \dots, X_k)$ for $A$ on $N$ with the following properties:
(i) $(X_1(x_0), \dots, X_m(x_0))$ form a basis for $L(A)_{x_0}$;
(ii) $X_{m+1}(x_0) = \cdots = X_k(x_0) = 0_{x_0}$.
In particular, if $x_0$ is a regular point for $A$, the vector fields $(X_0, X_1, \dots, X_m)$ are local generators for $A$ in some neighbourhood (possibly smaller than $N$) of $x_0$. Moreover, if $0_{x_0} \in A_{x_0}$, then $X_0$ may be further chosen so that $X_0(x_0) = 0_{x_0}$.

Proof By Proposition 5.1.9 choose local generators $(X_1, \dots, X_k)$ for $L(A)$ defined on a neighbourhood $N_1$ of $x_0$ such that $(X_1(x_0), \dots, X_m(x_0))$ form a basis for $L(A)_{x_0}$ and such that $X_{m+1}(x_0) = \cdots = X_k(x_0) = 0_{x_0}$. Let $X_0$ be any $A$-valued vector field defined on some neighbourhood $N_2$ of $x_0$. Then the vector fields $(X_0, X_1, \dots, X_k)$ generate $A$ on $N = N_1 \cap N_2$. The second assertion follows from the corresponding assertion from Proposition 5.1.9.

Let us prove the final assertion of the proposition. If $0_{x_0} \in A_{x_0}$ then $L(A)_{x_0} = A_{x_0}$. Therefore, there exist $c_1, \dots, c_m \in \mathbb{R}$ such that
$$X_0(x_0) = c_1 X_1(x_0) + \cdots + c_m X_m(x_0).$$
Then define $X_0' = X_0 - c_1 X_1 - \cdots - c_m X_m$ and note that $X_0'(x) \in A_x$ for every $x \in N$ and that $X_0'(x_0) = 0_{x_0}$. Then $(X_0', X_1, \dots, X_k)$ are the required local generators.

5.4.3 Algebraic aspects of affine distributions

Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, let $M$ be a $C^\infty$- or $C^\omega$-manifold, as is required, and let $A$ be a $C^r$-affine distribution. We denote by $\Gamma^r(A)$ the set of $A$-valued vector fields of class $C^r$. For $x_0 \in M$ we also denote by $\Gamma^r_{x_0}(A)$ the subset of $\Gamma^r_{x_0}(TM)$ consisting of the germs of $A$-valued vector fields. Note that $\Gamma^r(A)$ is generally not a $C^r(M)$-submodule of $\Gamma^r(TM)$ and that $\Gamma^r_{x_0}(A)$ is generally not a $C^r_{x_0}(M)$-submodule of $\Gamma^r_{x_0}(TM)$. However, there is some algebraic structure that we would like to understand. The following more or less obvious result gets us started.

We adapt some standard vector space notation to modules as follows. If $A$ is a module over a ring $R$, if $B \subset A$ is a submodule, and if $a_0 \in A$, then we denote
$$a_0 + B \triangleq \{a_0 + b \mid b \in B\}.$$
We now state the following result.

5.4.7 Lemma (Characterisation of $\Gamma^r(A)$ and $\Gamma^r_{x_0}(A)$) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, let $M$ be a smooth or analytic manifold, as is required, and let $A$ be a $C^r$-affine distribution on $M$. Then the following statements hold:
(i) if $X_0 \in \Gamma^r(A)$ then $\Gamma^r(A) = X_0 + \Gamma^r(L(A))$;
(ii) if $[(X_0, U)] \in \Gamma^r_{x_0}(A)$ then $\Gamma^r_{x_0}(A) = [(X_0, U)] + \Gamma^r_{x_0}(L(A))$.

Proof (i) Let $X_0 \in \Gamma^r(A)$. Clearly $X_0 + \Gamma^r(L(A)) \subset \Gamma^r(A)$. Conversely, if $X \in \Gamma^r(A)$ then $X - X_0 \in \Gamma^r(L(A))$ (by Proposition 4.3.7) and so $X = X_0 + (X - X_0) \in X_0 + \Gamma^r(L(A))$.

(ii) Let $[(X_0, U)] \in \Gamma^r_{x_0}(A)$. If $[(X, V)] \in \Gamma^r_{x_0}(L(A))$ then
$$[(X_0, U)] + [(X, V)] = [(X_0 + X, U \cap V)] \in \Gamma^r_{x_0}(A),$$
giving $\Gamma^r_{x_0}(A) \supset [(X_0, U)] + \Gamma^r_{x_0}(L(A))$. For the converse inclusion, let $[(X, V)] \in \Gamma^r_{x_0}(A)$ and note that
$$[(X, V)] - [(X_0, U)] = [(X - X_0, U \cap V)] \in \Gamma^r_{x_0}(L(A))$$
since, as in the first part of the proof, $(X - X_0)|U \cap V \in \Gamma^r(L(A)|U \cap V)$. Therefore,
$$[(X, V)] = [(X_0, U)] + [(X - X_0, U \cap V)] \in [(X_0, U)] + \Gamma^r_{x_0}(L(A)),$$
as desired.


The essential point is that $\Gamma^r(A)$ (resp. $\Gamma^r_{x_0}(A)$) is characterised by two things: (1) a vector field $X_0 \in \Gamma^r(A)$ (resp. a germ $[(X_0, U)] \in \Gamma^r_{x_0}(A)$) and (2) $\Gamma^r(L(A))$ (resp. $\Gamma^r_{x_0}(L(A))$). The choice of $\Gamma^r(L(A))$ (resp. $\Gamma^r_{x_0}(L(A))$) is canonical, but the choice of $X_0$ (resp. $[(X_0, U)]$) is not. This is rectified as follows. We let
$$\pi_{L(A)} \colon \Gamma^r(TM) \to \Gamma^r(TM)/\Gamma^r(L(A)) \qquad (\text{resp. } \pi_{x_0, L(A)} \colon \Gamma^r_{x_0}(TM) \to \Gamma^r_{x_0}(TM)/\Gamma^r_{x_0}(L(A)))$$
be the projection onto the quotient module. Then, for an affine distribution $A$, if $X_0 \in \Gamma^r(A)$ (resp. $[(X_0, U)] \in \Gamma^r_{x_0}(A)$) then $\pi_{L(A)}(X_0)$ (resp. $\pi_{x_0, L(A)}([(X_0, U)])$) is uniquely determined by $A$. This all shows the following (including a minor foray into sheaf language that we promise we will never make again).

5.4.8 Proposition (Equivalent characterisation of affine distributions) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$ and let $M$ be a smooth or analytic manifold, as is required. Then the following two statements hold:
(i) There is a 1–1 correspondence between the following two collections of objects:
(a) affine distributions $A$ of class $C^r$ on $M$;
(b) pairs $(X_0, D)$ where $D$ is a distribution of class $C^r$ on $M$ and $X_0 \in \operatorname{image}(\pi_D)$.
(ii) There is a 1–1 correspondence between the following two collections of objects:
(a) affine distributions $A$ of class $C^r$ on $M$;
(b) pairs $(X_0, \mathscr{D})$ where $\mathscr{D}$ is a subsheaf of the sheaf $\mathscr{G}^r(TM)$ of germs of $C^r$-vector fields on $M$ and $X_0 \in \operatorname{image}(\pi_{\mathscr{D}})$, where $\pi_{\mathscr{D}} \colon \mathscr{G}^r(TM) \to \mathscr{G}^r(TM)/\mathscr{D}$ is the canonical projection.

With this understanding of how an affine distribution is built, we can adapt the terminology and results of Section 5.2 to affine distributions. We shall do this in a concise way, uncharacteristically for us. We let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, we let $M$ be a manifold of class $C^\infty$ or $C^\omega$, as is required, and we let $A$ be a $C^r$-affine distribution on $M$.
1. If $\Gamma^r(L(A))$ is locally finitely generated, we shall say that $\Gamma^r(A)$ is locally finitely generated.
2. As per Proposition 5.2.1, if $x_0 \in M$ is a regular point for $A$, then there exists a neighbourhood $N$ of $x_0$ such that $\Gamma^r(A)|N$ is finitely generated.
3. As per Theorem 2.4.28, $\Gamma^\omega(A)$ is locally finitely generated.
4. By $D(A)$ we denote the distribution generated by $A$. That is, in the notation of Section 5.2.1, $D(A) = D(\Gamma^r(A))$.
5. As per Proposition 5.2.5, $\Gamma^r(A) \subset \Gamma^r(D(A))$. However, the opposite inclusion does not generally hold. To see this, let $A$ be the affine distribution on $M = \mathbb{R}^2$ generated by
$$X_0(x_1, x_2) = x_1^2 \frac{\partial}{\partial x_2}, \qquad X_1(x_1, x_2) = \frac{\partial}{\partial x_1}.$$
Then the vector field $X(x_1, x_2) = x_1 \frac{\partial}{\partial x_2}$ is in $\Gamma^\infty(D(A))$ but not in $\Gamma^\infty(A)$.
6. If $\Gamma^r_{x_0}(L(A))$ is finitely generated we shall say that $\Gamma^r_{x_0}(A)$ is finitely generated.
7. As per Theorem 2.4.27, $\Gamma^\omega_{x_0}(A)$ is finitely generated.

5.4.4 The Lie algebra generated by an affine distribution

As a family of vector fields, $\Gamma^r(A)$, $r \in \{\infty, \omega\}$, is capable of defining a Lie subalgebra of vector fields. As in Section 5.3.1 we can define $\mathscr{L}^{(\infty)}(\Gamma^r(A))$ as the Lie subalgebra of vector fields generated by the $A$-valued vector fields. We shall abbreviate this Lie subalgebra by $\mathscr{L}^{(\infty)}(A)$. We also denote $L^{(\infty)}(A) \triangleq D(\mathscr{L}^{(\infty)}(A))$. We wish to characterise this distribution in a useful way. In order to do this, we let $D(A)$ be the distribution generated by $\Gamma^r(A)$, i.e., $D(A) = D(\Gamma^r(A))$. We also denote $L^{(\infty)}(D(A)) = L^{(\infty)}(\Gamma^r(D(A)))$. With this notation we have the following result.

5.4.9 Proposition (The Lie subalgebra of vector fields generated by an affine distribution) Let $r \in \{\infty, \omega\}$, let $M$ be a smooth or analytic manifold, as is required, and let $A$ be a $C^r$-affine distribution on $M$. If $\mathscr{L}^{(\infty)}(A)$ is a finitely generated module of vector fields, then the distributions $L^{(\infty)}(A)$ and $L^{(\infty)}(D(A))$ agree.
Proof This is a direct consequence of Theorem 5.3.4.

The following corollary covers an important case where the hypotheses of the preceding proposition hold.

5.4.10 Corollary (The Lie subalgebra of vector fields generated by an analytic affine distribution) Let $M$ be an analytic manifold and let $A$ be an analytic affine distribution on $M$. Then the distributions $L^{(\infty)}(A)$ and $L^{(\infty)}(D(A))$ agree.
Proof This follows from Proposition 5.4.9, along with Theorem 2.4.28.
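The bracket computations in Example 5.4.11, which follows, can be checked mechanically for polynomial choices of $f$. The sketch below is an illustration only: it represents a polynomial in $(x_1, x_2)$ as a dictionary from exponent pairs to coefficients, a representation and set of helper names (`add`, `mul`, `diff`, `bracket`, `vadd`, `X0`, `X1`) chosen here and not taken from the notes. It uses the coordinate formula $[X, Y]^i = \sum_j (X^j \partial_j Y^i - Y^j \partial_j X^i)$.

```python
def clean(p):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {m: c for m, c in p.items() if c != 0}


def add(p, q, sign=1):
    """Return p + sign * q."""
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) + sign * c
    return clean(out)


def mul(p, q):
    """Product of two polynomials in (x1, x2)."""
    out = {}
    for (a, b), c in p.items():
        for (d, e), f in q.items():
            key = (a + d, b + e)
            out[key] = out.get(key, 0) + c * f
    return clean(out)


def diff(p, var):
    """Partial derivative with respect to x1 (var=0) or x2 (var=1)."""
    out = {}
    for m, c in p.items():
        if m[var] > 0:
            lowered = list(m)
            lowered[var] -= 1
            key = tuple(lowered)
            out[key] = out.get(key, 0) + c * m[var]
    return clean(out)


def bracket(X, Y):
    """Lie bracket [X, Y] of vector fields given as pairs of polynomials."""
    comps = []
    for i in range(2):
        total = {}
        for j in range(2):
            total = add(total, mul(X[j], diff(Y[i], j)))
            total = add(total, mul(Y[j], diff(X[i], j)), sign=-1)
        comps.append(total)
    return tuple(comps)


def vadd(X, Y):
    """Sum of two vector fields."""
    return tuple(add(a, b) for a, b in zip(X, Y))


X1 = ({(0, 0): 1}, {})  # the vector field d/dx1


def X0(k):
    """The vector field x1**k d/dx2, i.e. f(x) = x**k (k is a sketch parameter)."""
    return ({}, {(k, 0): 1})
```

With $f(x) = x$ a single bracket already produces $\partial/\partial x_2$, while with $f(x) = x^2$ a second bracket is needed, in agreement with the first two cases of the example. The flat function of the third case is not polynomial and is outside the scope of this sketch.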

Example 5.3.7 can be adapted to illustrate the preceding result.

5.4.11 Example (The Lie algebra generated by an affine distribution) We take $M = \mathbb{R}^2$ and define smooth vector fields $X_0$ and $X_1$ on $\mathbb{R}^2$ by
$$X_0(x_1, x_2) = f(x_1) \frac{\partial}{\partial x_2}, \qquad X_1(x_1, x_2) = \frac{\partial}{\partial x_1},$$
where $f \colon \mathbb{R} \to \mathbb{R}$ is any smooth function vanishing only at $x = 0$. We take $A_{(x_1, x_2)} = X_0(x_1, x_2) + \operatorname{span}_{\mathbb{R}}(X_1(x_1, x_2))$, and note that
$$D(A)_{(x_1, x_2)} = \begin{cases} T_{(x_1, x_2)}\mathbb{R}^2, & x_1 \neq 0, \\ \operatorname{span}_{\mathbb{R}}\big(\frac{\partial}{\partial x_1}\big), & x_1 = 0. \end{cases}$$
Therefore, $D(A)$ is generated by the vector fields
$$Y_1 = \frac{\partial}{\partial x_1}, \qquad Y_2 = x_1 \frac{\partial}{\partial x_2}.$$
Since
$$[Y_1, Y_2](x_1, x_2) = \frac{\partial}{\partial x_2},$$
we conclude that $L^{(\infty)}(D(A)) = T\mathbb{R}^2$. We now consider the character of $L^{(\infty)}(A)$ for a few $f$'s.
1. First let us consider $f(x) = x$. Note that $X_0, X_0 + X_1 \in \Gamma^\infty(A)$. We compute
$$[X_0 + X_1, X_0](x_1, x_2) = \frac{\partial}{\partial x_2},$$
and so $L^{(\infty)}(A) = T\mathbb{R}^2$. Thus we have $L^{(\infty)}(A) = L^{(\infty)}(D(A))$. Since $f$ is analytic, this is in agreement with Corollary 5.4.10.
2. Next consider $f(x) = x^2$. In this case we have
$$[X_0 + X_1, X_0](x_1, x_2) = 2 x_1 \frac{\partial}{\partial x_2}, \qquad [X_0 + X_1, [X_0 + X_1, X_0]](x_1, x_2) = 2 \frac{\partial}{\partial x_2}.$$

Thus we can again conclude that $L^{(\infty)}(A) = T\mathbb{R}^2$, implying that $L^{(\infty)}(A) = L^{(\infty)}(D(A))$. This again is consistent with Corollary 5.4.10. Note, however, that the distribution $L^{(\infty)}(A)$ is generated by different brackets than was the case when we took $f(x) = x$. Thus the fact that $L^{(\infty)}(A) = L^{(\infty)}(D(A))$ is less obvious in this case.
3. The final case we consider is
$$f(x) = \begin{cases} e^{-1/x^2}, & x \neq 0, \\ 0, & x = 0. \end{cases}$$
One can easily show that $L^{(\infty)}(A) = D(A)$ and so $L^{(\infty)}(A) \subsetneq L^{(\infty)}(D(A))$. Thus analyticity is required in Proposition 5.4.9.

5.4.12 Remark (Equivalence of affine distributions) Having broached the subject of Lie algebras generated by affine distributions, and recalling from Section 5.3.9 the defining role of Lie algebras in the equivalence problem for families of vector fields, one can wonder whether there are theorems like Theorem 5.3.36 that characterise affine distributions, retaining the information about the affine structure (note that Theorem 5.3.36 does not retain this structure). It turns out that this is a touchy matter, and there is a fairly simple reason for this. Let us explain, first by noting the following two things.
1. For our study of controllability, we are mostly interested in situations where $0_{x_0} \in A_{x_0}$ for some $x_0 \in M$; see the next section.
2. In the class of affine distributions are those affine distributions with trivial linear part, i.e., $L(A)_x = \{0_x\}$ for all $x \in M$. Such an affine distribution is really then just a vector field, the vector field denoted by $X_0$ in our terminology for local generators.
Combining the preceding two comments, a useful classification of affine distributions will include as a consequence a classification of vector fields which vanish at some point $x_0$, i.e., a classification of vector fields at equilibrium points. This is an essentially impossible problem to resolve in any generality; see, for example, [Takens 1974, Theorem 2]. Although no useful general classification of affine systems is tractable, researchers work on special cases. There are many instances of this, and we refer to [Elkin 1999, Gardner and Shadwick 1990, Zhitomirskii and Respondek 2000] as examples.

5.4.5 Invariant subspace constructions

In these notes we shall frequently encounter the situation where our interest is in a point $x_0 \in M$ where $0_{x_0} \in A_{x_0}$. One can easily verify that this is equivalent to the fact that $L(A)_{x_0} = A_{x_0}$. In this case there are some interesting constructions one can perform, and we provide these here. The following lemma gets us started.

5.4.13 Lemma (The linearisation of a vector field) Let $X$ be a $C^1$-vector field on a $C^\infty$-manifold $M$ for which $X(x_0) = 0_{x_0}$ for some $x_0 \in M$. Then there exists a unique $A_X(x_0) \in \operatorname{End}(T_{x_0}M)$ such that $A_X(x_0) \cdot v_{x_0} = [V, X](x_0)$, where $V \in \Gamma^1(TM)$ is such that $V(x_0) = v_{x_0}$.
Proof Let us first show that $[V, X](x_0)$ depends only on the value of $V$ at $x_0$. Suppose that $V(x_0) = V'(x_0) = v_{x_0}$. Then $[V - V', X](x_0) = 0_{x_0}$, since both $V - V'$ and $X$ vanish at $x_0$, which gives $[V, X](x_0) = [V', X](x_0)$, and hence well-definedness of $A_X(x_0)$. Linearity of $A_X(x_0)$ follows from $\mathbb{R}$-linearity of the Lie bracket. Uniqueness follows since, if $[X_1, Y](x_0) = [X_2, Y](x_0)$ for all $Y \in \Gamma^1(TM)$, then $X_1(x_0) = X_2(x_0)$.
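In coordinates the lemma says that $A_X(x_0)$ is the Jacobian of $X$ at $x_0$, since $[V, X] = DX \cdot V - DV \cdot X$ and the second term vanishes where $X$ does. The following Python sketch checks this numerically with central differences. The pendulum-type vector field, the two extensions `V_const` and `V_other`, and the step size `h` are all assumptions made for this illustration.

```python
import math


def jacobian(X, x0, h=1e-6):
    """Central-difference Jacobian of X at x0; entry [j][k] is dX^j/dx^k."""
    n = len(x0)
    J = [[0.0] * n for _ in range(n)]
    for k in range(n):
        xp, xm = list(x0), list(x0)
        xp[k] += h
        xm[k] -= h
        fp, fm = X(xp), X(xm)
        for j in range(n):
            J[j][k] = (fp[j] - fm[j]) / (2 * h)
    return J


def bracket_at(V, X, x0, h=1e-6):
    """[V, X](x0) = DX(x0) V(x0) - DV(x0) X(x0), in coordinates."""
    DX, DV = jacobian(X, x0, h), jacobian(V, x0, h)
    v, x = V(x0), X(x0)
    n = len(x0)
    return [sum(DX[j][k] * v[k] - DV[j][k] * x[k] for k in range(n))
            for j in range(n)]


def pendulum(x):
    """Hypothetical example field, vanishing at the origin."""
    return [x[1], -math.sin(x[0])]


def V_const(x):
    """Constant extension of the tangent vector (1, 2)."""
    return [1.0, 2.0]


def V_other(x):
    """A different vector field with the same value (1, 2) at the origin."""
    return [1.0 + x[0] ** 2 + x[1], 2.0 + math.sin(x[0])]
```

Evaluating `bracket_at` with either `V_const` or `V_other` at the origin gives the same vector up to discretisation error, which is the well-definedness asserted in the lemma, and `jacobian(pendulum, [0.0, 0.0])` approximates the matrix of the linearisation.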

Let us call $A_X(x_0)$ the linearisation of $X$ at $x_0$. Note that for the definition to make sense, it is essential that $X(x_0) = 0_{x_0}$. In coordinates $(x^1, \dots, x^n)$ for $M$, the components of $A_X(x_0)$ relative to the coordinate basis $(\frac{\partial}{\partial x^1}, \dots, \frac{\partial}{\partial x^n})$ are $\frac{\partial X^j}{\partial x^k}(x_0)$, $j, k \in \{1, \dots, n\}$.

Now we provide some constructions in linear algebra. For this purpose, let $V$ be an $\mathbb{R}$-vector space with $U \subset V$ a subspace. Let $\mathscr{L} \subset \operatorname{End}(V)$. By $\langle \mathscr{L}, U \rangle$ we denote the smallest subspace of $V$ which (1) contains $U$ and (2) is an invariant subspace for every $L \in \mathscr{L}$.

5.4.14 Lemma (Characterisation of $\langle \mathscr{L}, U \rangle$) The subspace $\langle \mathscr{L}, U \rangle$ is generated by vectors from $V$ of the form
$$L_k \circ \cdots \circ L_1(u), \qquad k \in \mathbb{Z}_{\ge 0},\ u \in U,\ L_1, \dots, L_k \in \mathscr{L}.$$

Proof Let $\mathscr{L}(U)$ be the subspace generated by the vectors of the form stated in the lemma. Note that $U \subset \mathscr{L}(U)$ and that, if $v \in \mathscr{L}(U)$ and $L \in \mathscr{L}$, then $L(v) \in \mathscr{L}(U)$. Thus $\langle \mathscr{L}, U \rangle \subset \mathscr{L}(U)$ since $\langle \mathscr{L}, U \rangle$ is the smallest subspace with these properties. For the reverse inclusion it suffices to show that any vector of the form asserted in the lemma is in $\langle \mathscr{L}, U \rangle$. To prove this, we use induction. Since $U \subset \langle \mathscr{L}, U \rangle$ the assertion is true for $k = 0$. Now assume that every vector of the form
$$L_j \circ \cdots \circ L_1(u), \qquad j \in \{0, 1, \dots, m\},\ u \in U,\ L_1, \dots, L_j \in \mathscr{L},$$
is an element of $\langle \mathscr{L}, U \rangle$. Then, for $u \in U$ and $L_1, \dots, L_{m+1} \in \mathscr{L}$, by the induction hypothesis and since $\langle \mathscr{L}, U \rangle$ is invariant under $L_{m+1}$,
$$L_{m+1} \circ L_m \circ \cdots \circ L_1(u) = L_{m+1}(L_m \circ \cdots \circ L_1(u)) \in \langle \mathscr{L}, U \rangle,$$
as desired.
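The characterisation in Lemma 5.4.14 suggests a simple way to compute the smallest invariant subspace in finite dimensions: start from spanning vectors of the subspace, repeatedly apply the operators, and keep only the vectors that enlarge the span. The following Python sketch does this with exact rational arithmetic. The function names `reduce_against`, `matvec`, and `invariant_subspace`, and the representation of operators as lists of rows, are choices made here for illustration.

```python
from fractions import Fraction


def reduce_against(basis, v):
    """Subtract multiples of the stored basis vectors to clear their pivots in v."""
    v = list(v)
    for pivot, b in basis:
        if v[pivot] != 0:
            factor = v[pivot] / b[pivot]
            v = [a - factor * c for a, c in zip(v, b)]
    return v


def matvec(L, v):
    """Apply the matrix L (a list of rows) to the vector v."""
    return [sum(Fraction(a) * c for a, c in zip(row, v)) for row in L]


def invariant_subspace(operators, U):
    """Basis of the smallest subspace containing span(U), invariant under operators."""
    basis = []  # pairs (pivot index, vector), each reduced against earlier ones
    queue = [[Fraction(c) for c in u] for u in U]
    while queue:
        v = reduce_against(basis, queue.pop())
        pivot = next((i for i, c in enumerate(v) if c != 0), None)
        if pivot is None:
            continue  # v already lies in the span found so far
        basis.append((pivot, v))
        for L in operators:
            queue.append(matvec(L, v))  # images of new vectors must be included
    return [b for _, b in basis]
```

For a single nilpotent shift operator on $\mathbb{R}^3$ with $e_1 \mapsto e_2 \mapsto e_3 \mapsto 0$, starting from $e_1$ yields all of $\mathbb{R}^3$, while starting from $e_3$ yields only the line through $e_3$. The loop terminates because the basis can grow at most $\dim(V)$ times.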

Now we combine the preceding two constructions in the following definition.

5.4.15 Definition ($\mathscr{Z}_{x_0}(A)$ and $\langle \mathscr{Z}_{x_0}(A), S_{x_0} \rangle$) Let $r \in \mathbb{Z}_{>0} \cup \{\infty, \omega\}$, let $A$ be a $C^r$-affine distribution on a smooth manifold $M$, and suppose that $0_{x_0} \in A_{x_0}$.
(i) We denote $\mathscr{Z}_{x_0}(A) = \{X \in \Gamma^r(A) \mid X(x_0) = 0_{x_0}\}$.
(ii) If $S_{x_0} \subset T_{x_0}M$ is a subspace, then, following our notation above, $\langle \mathscr{Z}_{x_0}(A), S_{x_0} \rangle$ is the smallest subspace of $T_{x_0}M$ that (a) contains $S_{x_0}$ and (b) is invariant under $A_X(x_0)$ for every $X \in \mathscr{Z}_{x_0}(A)$.

In some cases, the subspace $\langle \mathscr{Z}_{x_0}(A), S_{x_0} \rangle$ is more or less easy to understand.

5.4.16 Proposition (Concrete description of $\langle \mathscr{Z}_{x_0}(A), S_{x_0} \rangle$) Let $r \in \mathbb{Z}_{>0} \cup \{\infty, \omega\}$, let $M$ be a manifold of class $C^\infty$ or $C^\omega$, as is required, and let $A$ be a $C^r$-affine distribution on $M$ such that $0_{x_0} \in A_{x_0}$ for $x_0 \in M$. Suppose that there exists a neighbourhood $N$ of $x_0$ and $X_0, X_1, \dots, X_k \in \Gamma^r(A|N)$ such that
(i) $(X_1(x_0), \dots, X_m(x_0))$ is a basis for $L(A)_{x_0}$,
(ii) $X_{m+1}(x_0) = \cdots = X_k(x_0) = 0_{x_0}$, and
(iii) $(X_1, \dots, X_k)$ generate $\Gamma^r(L(A)|N)$;
cf. Proposition 5.4.6 for conditions (i) and (ii). Then, for a subspace $S_{x_0} \supset L(A)_{x_0}$,
$$\langle \mathscr{Z}_{x_0}(A), S_{x_0} \rangle = \langle \{A_{X_0}(x_0), A_{X_{m+1}}(x_0), \dots, A_{X_k}(x_0)\}, S_{x_0} \rangle.$$
In particular, if $x_0$ is a regular point for $A$, if $S_{x_0} \supset L(A)_{x_0}$, and if $X \in \mathscr{Z}_{x_0}(A)$, then
$$\langle \mathscr{Z}_{x_0}(A), S_{x_0} \rangle = \langle \{A_X(x_0)\}, S_{x_0} \rangle.$$
Proof Let us denote $\mathscr{X}_0=\{X_0,X_{m+1},\dots,X_k\}$ and let $S_{x_0}\subset T_{x_0}M$ be as stated in the proposition. Clearly we have $\langle\mathscr{Z}_{x_0}(\mathscr{X}_0),S_{x_0}\rangle\subset\langle\mathscr{Z}_{x_0}(\mathsf{A}),S_{x_0}\rangle$, so it is the opposite inclusion we prove. Let $V\in\mathscr{Z}_{x_0}(\mathsf{A})$ and write
$$V=X_0+f^1X_1+\cdots+f^kX_k$$
for $C^r$-functions $f^1,\dots,f^k$; this is possible by Lemma 5.4.7 and condition (iii) in the statement of the proposition. Note that we must necessarily have $f^1(x_0)=\cdots=f^m(x_0)=0$. If $Y\in\Gamma^r(TM)$ then
$$[Y,V]=[Y,X_0]+f^1[Y,X_1]+\cdots+f^k[Y,X_k]+(Yf^1)X_1+\cdots+(Yf^k)X_k.$$
This gives
$$\begin{aligned}A_V(x_0)&=A_{X_0}(x_0)+f^1(x_0)A_{X_1}(x_0)+\cdots+f^k(x_0)A_{X_k}(x_0)+X_1(x_0)\otimes df^1(x_0)+\cdots+X_k(x_0)\otimes df^k(x_0)\\&=A_{X_0}(x_0)+f^{m+1}(x_0)A_{X_{m+1}}(x_0)+\cdots+f^k(x_0)A_{X_k}(x_0)+X_1(x_0)\otimes df^1(x_0)+\cdots+X_m(x_0)\otimes df^m(x_0).\end{aligned}$$
This shows that $A_V(x_0)=L'_V+L''_V$ where $L'_V$ is a linear combination of $A_{X_0}(x_0),A_{X_{m+1}}(x_0),\dots,A_{X_k}(x_0)$ and where $\operatorname{image}(L''_V)\subset L(\mathsf{A})_{x_0}$.

To prove the proposition, it suffices to show that every vector in $T_{x_0}M$ of the form
$$A_{V_r}(x_0)\circ\cdots\circ A_{V_1}(x_0)(v),\qquad r\in\mathbb{Z}_{\ge0},\ V_1,\dots,V_r\in\mathscr{Z}_{x_0}(\mathsf{A}),\ v\in S_{x_0},$$
lies in $\langle\mathscr{X}_0,S_{x_0}\rangle$. We prove this by induction on $r$. It is clearly true for $r=0$, so suppose that all vectors of the form
$$A_{V_j}(x_0)\circ\cdots\circ A_{V_1}(x_0)(v),\qquad j\in\{0,1,\dots,s\},\ V_1,\dots,V_j\in\mathscr{Z}_{x_0}(\mathsf{A}),\ v\in S_{x_0},$$
lie in $\langle\mathscr{X}_0,S_{x_0}\rangle$ and let $V_1,\dots,V_{s+1}\in\mathscr{Z}_{x_0}(\mathsf{A})$ and $v\in S_{x_0}$. As in the preceding paragraph, write $A_{V_{s+1}}(x_0)=L'_{V_{s+1}}+L''_{V_{s+1}}$, where $L'_{V_{s+1}}$ is a linear combination of $A_{X_0}(x_0),A_{X_{m+1}}(x_0),\dots,A_{X_k}(x_0)$ and where $\operatorname{image}(L''_{V_{s+1}})\subset L(\mathsf{A})_{x_0}$. We then have
$$A_{V_{s+1}}(x_0)\circ A_{V_s}(x_0)\circ\cdots\circ A_{V_1}(x_0)(v)=L'_{V_{s+1}}\bigl(A_{V_s}(x_0)\circ\cdots\circ A_{V_1}(x_0)(v)\bigr)+L''_{V_{s+1}}\bigl(A_{V_s}(x_0)\circ\cdots\circ A_{V_1}(x_0)(v)\bigr).$$
Since $L'_{V_{s+1}}$ is a linear combination of $A_{X_0}(x_0),A_{X_{m+1}}(x_0),\dots,A_{X_k}(x_0)$ we clearly have
$$L'_{V_{s+1}}\bigl(A_{V_s}(x_0)\circ\cdots\circ A_{V_1}(x_0)(v)\bigr)\in\langle\mathscr{X}_0,S_{x_0}\rangle$$
by the induction hypothesis. Also, since $L(\mathsf{A})_{x_0}\subset S_{x_0}\subset\langle\mathscr{X}_0,S_{x_0}\rangle$, and since $\operatorname{image}(L''_{V_{s+1}})\subset L(\mathsf{A})_{x_0}$, we have
$$L''_{V_{s+1}}\bigl(A_{V_s}(x_0)\circ\cdots\circ A_{V_1}(x_0)(v)\bigr)\in\langle\mathscr{X}_0,S_{x_0}\rangle,$$
giving the first part of the result.

The final assertion is an immediate consequence of the first since, if $x_0$ is a regular point for $\mathsf{A}$ and if $X\in\mathscr{Z}_{x_0}(\mathsf{A})$, by Proposition 5.4.6 we can find local generators $X_1,\dots,X_m$ for $L(\mathsf{A})$ with $(X_1(x),\dots,X_m(x))$ a basis for $L(\mathsf{A})_x$ for every $x$ in a neighbourhood of $x_0$, and such that $(X_0=X,X_1,\dots,X_m)$ locally generate $\mathsf{A}$.

5.4.17 Remark (The computation of $\langle\mathscr{Z}_{x_0}(\mathsf{A}),S_{x_0}\rangle$ is finite) Let us adopt the notation of the preceding proposition. If one chooses a basis $(v_1,\dots,v_d)$ for $S_{x_0}$, then $\langle\mathscr{Z}_{x_0}(\mathsf{A}),S_{x_0}\rangle$ is spanned by vectors of the form
$$A_{X_{j_s}}(x_0)\circ\cdots\circ A_{X_{j_1}}(x_0)(v_j),\qquad s\in\mathbb{Z}_{\ge0},\ j_1,\dots,j_s\in\{m+1,\dots,k\},\ j\in\{1,\dots,d\}.\tag{5.9}$$
Note that if the dimension of the connected component of $M$ containing $x_0$ is $n$, then the Cayley-Hamilton Theorem [Hungerford 1980, Theorem VII.5.2] gives $A^r_{X_j}(x_0)$, $j\in\{m+1,\dots,k\}$, for $r\ge n$ as a linear combination of $A^l_{X_j}(x_0)$, $l\in\{0,1,\dots,n-1\}$. Therefore, there are only finitely many vectors (5.9) that one must compute to determine a set of generators for $\langle\mathscr{Z}_{x_0}(\mathsf{A}),S_{x_0}\rangle$.
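The finiteness of this computation can be illustrated with a small sketch. It assumes the linearisations have already been written as square matrices in some basis; the names `invariant_subspace`, `mats`, and `seeds` are hypothetical and not notation from these notes. The routine closes a spanning set under the given matrices and stops as soon as no new direction appears, which must happen after at most $n$ enlargements.

```python
from fractions import Fraction


def mat_vec(A, v):
    """Apply the square matrix A (a list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def reduce_against(v, basis):
    """Subtract multiples of the basis vectors from v.

    `basis` maps a pivot index to a vector whose first nonzero entry sits at
    that index; visiting pivots in increasing order zeroes every pivot entry.
    """
    v = list(v)
    for p in sorted(basis):
        if v[p] != 0:
            c = v[p] / basis[p][p]
            v = [x - c * y for x, y in zip(v, basis[p])]
    return v


def invariant_subspace(mats, seeds):
    """Basis of the smallest subspace containing `seeds` that is invariant
    under every matrix in `mats` (exact rational arithmetic)."""
    basis = {}
    queue = [[Fraction(x) for x in s] for s in seeds]
    while queue:
        v = reduce_against(queue.pop(), basis)
        pivot = next((i for i, x in enumerate(v) if x != 0), None)
        if pivot is None:
            continue  # v already lies in the span found so far
        basis[pivot] = v
        queue.extend(mat_vec(A, v) for A in mats)  # new directions to test
    return [basis[p] for p in sorted(basis)]
```

For instance, `invariant_subspace([[[0, 1], [0, 0]]], [[0, 1]])` returns a basis with two elements, while replacing the matrix by a diagonal one and the seed by an eigenvector returns a single vector.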


Let us give an example which will be meaningful to anyone who knows about linear controllability.

5.4.18 Example (The controllability matrix via invariant subspaces) We let $V$ be a finite-dimensional $\mathbb{R}$-vector space, of dimension $n$, which we think of as being a manifold. Let $A\in\operatorname{End}(V)$ and define a vector field $X_0$ on $V$ by $X_0(x)=(x,A(x))$. Also let $b_1,\dots,b_m\in V$ and define vector fields $X_j$, $j\in\{1,\dots,m\}$, on $V$ by $X_j(x)=(x,b_j)$. Thus the vector field $X_0$ is a linear vector field and the vector fields $X_1,\dots,X_m$ are constant vector fields. We define an affine distribution $\mathsf{A}$ (we hope the reader will forgive the bad practice of using $A$ and $\mathsf{A}$ together) by
$$\mathsf{A}_x=X_0(x)+\operatorname{span}_{\mathbb{R}}(X_1(x),\dots,X_m(x)).$$
Clearly $0\in\mathsf{A}_0$. Let us determine an explicit expression for $\langle\mathscr{Z}_0(\mathsf{A}),L(\mathsf{A})_0\rangle$. Note that $0$ is a regular point for $\mathsf{A}$ and so, by Proposition 5.4.16,
$$\langle\mathscr{Z}_0(\mathsf{A}),L(\mathsf{A})_0\rangle=\langle\{A_{X_0}(0)\},L(\mathsf{A})_0\rangle.$$
Moreover, since $X_0$ is linear, its linearisation is itself: $A_{X_0}(0)=A$. By Lemma 5.4.14 we note that $\langle\mathscr{Z}_0(\mathsf{A}),L(\mathsf{A})_0\rangle$ is then the subspace generated by vectors of the form
$$A^k(v),\qquad k\in\mathbb{Z}_{\ge0},\ v\in L(\mathsf{A})_0.$$
Since $L(\mathsf{A})_0$ is the subspace spanned by the vectors $b_1,\dots,b_m$, $\langle\mathscr{Z}_0(\mathsf{A}),L(\mathsf{A})_0\rangle$ is the subspace generated by vectors of the form
$$A^k(b_j),\qquad k\in\mathbb{Z}_{\ge0},\ j\in\{1,\dots,m\}.$$
By the Cayley-Hamilton Theorem, this means that the vectors of the form
$$A^k(b_j),\qquad k\in\{0,1,\dots,n-1\},\ j\in\{1,\dots,m\},$$
generate $\langle\mathscr{Z}_0(\mathsf{A}),L(\mathsf{A})_0\rangle$. Let us organise this into something a little familiar. Let us define a linear map $B\colon\mathbb{R}^m\to V$ by $B(u)=u_1b_1+\cdots+u_mb_m$. Let $(\mathbb{R}^m)^n$ denote the $n$-fold direct sum of $\mathbb{R}^m$ with itself. We can then define a linear map $L_{A,B}\colon(\mathbb{R}^m)^n\to V$ by
$$L_{A,B}(u_1,\dots,u_n)=B(u_1)+A\circ B(u_2)+\cdots+A^{n-1}\circ B(u_n),$$
which we represent by the block matrix
$$\begin{bmatrix}B&A\circ B&\cdots&A^{n-1}\circ B\end{bmatrix}.\tag{5.10}$$
We note that $\langle\mathscr{Z}_0(\mathsf{A}),L(\mathsf{A})_0\rangle=\operatorname{image}(L_{A,B})$. The linear map (5.10), or more properly the matrix representation of this linear map in a basis for $V$, is called the controllability matrix for the pair $(A,B)$, and will feature prominently in our discussion of linear controllability.
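As a concrete illustration of this example, the following sketch assembles the block matrix with blocks $B, AB, \dots, A^{n-1}B$ for matrices given as lists of rows and computes its rank, which is the dimension of the subspace just described. The function names and the two test systems are hypothetical choices made only for this illustration.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def controllability_matrix(A, B):
    """Rows of the block matrix [B, AB, ..., A^(n-1) B] for an n x n matrix A."""
    n = len(A)
    blocks, current = [], [[Fraction(x) for x in row] for row in B]
    for _ in range(n):
        blocks.append(current)
        current = mat_mul(A, current)
    # Concatenate the blocks horizontally, row by row.
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]


def rank(M):
    """Rank by Gauss-Jordan elimination in exact rational arithmetic."""
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r


# Double integrator: the invariant subspace is all of R^2.
double_integrator = controllability_matrix([[0, 1], [0, 0]], [[0], [1]])
# Identity dynamics with one input direction: the subspace is a line.
degenerate = controllability_matrix([[1, 0], [0, 1]], [[1], [0]])
```

Here `rank(double_integrator)` is 2 and `rank(degenerate)` is 1, matching the dimension of the span of the vectors $A^k(b_j)$ in each case.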

This version: 23/06/2009

Chapter 6 Geometric system models


In this chapter we explore various sorts of system models that can arise in geometric control theory. Just which sort of model one chooses to develop can depend on just what one is trying to do. If one has a specific application in mind (as we do not here), then the model type is normally forced upon the user. However, for the purposes of understanding classes of systems, the choice of system model is often less clear. A good rule of thumb is to use a model whose structure captures the structure one is interested in, but does not add any additional structure. This sort of vague guideline does not typically pin down precisely the appropriate class of system to deal with, but can be helpful nonetheless. In these notes we do not advocate a particular model, but instead present a few sorts of models that are commonly encountered. For each class of system we give some basic associated definitions and prove some basic properties of the systems. In particular, for each class of system we will introduce the notion of a reachable set. The reachable set and its properties feature very prominently in all of the various manifestations of control theory, and particularly in geometric control theory. In Chapter 8 we will study various ways of understanding the reachable set.

6.1 Controls
In most, but not all, system models, one selects a class of admissible controls for which one considers corresponding trajectories. Just what is the right class of controls to consider typically depends on what one is trying to achieve. For instance, if one is studying optimality, one might want to choose the largest possible class of controls so that the optimality results are as strong as possible. However, when studying controllability, perhaps one would like to choose nice controls so as to prove that controllability is possible in a reasonable way. In this section we study with some care the various sorts of admissible controls typically encountered.

6.1.1 Metric space valued controls

We think of a control as being a function of time, and we think of time as taking values in an interval that we typically denote by $\mathbb{T}$. We begin by considering controls taking values in a metric space. To do this, we need to define a class of such controls that is manageable. By $\lambda$ we denote the Lebesgue measure. Let us agree, in this section, to call metric space valued maps controls, since this is how we will think of them. We will formalise this in Definitions 6.1.4 and 6.1.14 below.

6.1.1 Definition (Measurable, essentially bounded, piecewise constant controls) Let $(C,d)$ be a metric space and let $\mathbb{T}\subset\mathbb{R}$ be an interval. For a control $\mu\colon\mathbb{T}\to C$ we have the following definitions.
(i) The control $\mu$ is measurable if $\mu^{-1}(U)$ is Lebesgue measurable for every open set $U\subset C$.
(ii) The control $\mu$ is essentially bounded if there exists a compact subset $K\subset C$ such that $\lambda(\{t\in\mathbb{T}\mid\mu(t)\notin K\})=0$.
(iii) The control $\mu$ is locally essentially bounded if $\mu|\mathbb{T}'$ is essentially bounded for every compact subinterval $\mathbb{T}'\subset\mathbb{T}$.
(iv) If $\mathbb{T}$ is bounded, $\mu$ is piecewise constant if there exists a partition $(\mathbb{T}_1,\dots,\mathbb{T}_k)$ of $\mathbb{T}$ into subintervals such that $\mu|\mathbb{T}_j$ is constant for each $j\in\{1,\dots,k\}$.
(v) For a general interval $\mathbb{T}$, $\mu$ is piecewise constant if $\mu|\mathbb{T}'$ is piecewise constant for every compact subinterval $\mathbb{T}'\subset\mathbb{T}$.
We also use the following notation.
(vi) The set of measurable $C$-valued controls on $\mathbb{T}$ is denoted by $L^0(\mathbb{T};C)$.
(vii) The set of measurable essentially bounded $C$-valued controls on $\mathbb{T}$ is denoted by $L^\infty(\mathbb{T};C)$.
(viii) The set of measurable locally essentially bounded $C$-valued controls on $\mathbb{T}$ is denoted by $L^\infty_{\mathrm{loc}}(\mathbb{T};C)$.
(ix) The set of piecewise constant $C$-valued controls on $\mathbb{T}$ is denoted by $L_{\mathrm{pwc}}(\mathbb{T};C)$.
(x) The set of continuous $C$-valued controls on $\mathbb{T}$ is denoted by $C^0(\mathbb{T};C)$.

Note that we obviously have $L^\infty(\mathbb{T};C)\subset L^\infty_{\mathrm{loc}}(\mathbb{T};C)$ and $L_{\mathrm{pwc}}(\mathbb{T};C)\subset L^\infty_{\mathrm{loc}}(\mathbb{T};C)$. One might think that controls from $L_{\mathrm{pwc}}(\mathbb{T};C)$ are very special, as indeed they are, but they also approximate very general classes of controls.

6.1.2 Theorem (Measurable controls are approximated by piecewise constant controls) For a metric space $(C,d)$, for an interval $\mathbb{T}\subset\mathbb{R}$, and for a control $\mu\colon\mathbb{T}\to C$, consider the following two statements:
(i) $\mu\in L^0(\mathbb{T};C)$;
(ii) there exists a sequence $(\mu_j)_{j\in\mathbb{Z}_{>0}}$ in $L_{\mathrm{pwc}}(\mathbb{T};C)$ such that $\lim_{j\to\infty}\mu_j(t)=\mu(t)$ for almost every $t\in\mathbb{T}$.
Then (ii) $\Longrightarrow$ (i) and, if $(C,d)$ is separable, (i) $\Longrightarrow$ (ii). Moreover, if $\mu$ is essentially bounded, then the sequence $(\mu_j)_{j\in\mathbb{Z}_{>0}}$ in $L_{\mathrm{pwc}}(\mathbb{T};C)$ from (ii) can be chosen so that $\mu_j(\mathbb{T})\subset K$ for a compact subset $K\subset C$.


Proof (i) $\Longrightarrow$ (ii) Let us say that a control $\nu\colon\mathbb{T}\to C$ is simple if there exists a partition $(A_1,\dots,A_k)$ of $\mathbb{T}$ into Lebesgue measurable sets and $u_1,\dots,u_k\in C$ such that $\nu(t)=u_j$ when $t\in A_j$. Note that simple controls are measurable. We first claim that if $\mu\in L^0(\mathbb{T};C)$ then there exists a sequence $(\nu_j)_{j\in\mathbb{Z}_{>0}}$ of simple controls from $\mathbb{T}$ to $C$ such that $\lim_{j\to\infty}\nu_j(t)=\mu(t)$ for almost every $t\in\mathbb{T}$.

We first prove this when $\mathbb{T}$ is bounded. Since $C$ is separable, let $(u_k)_{k\in\mathbb{Z}_{>0}}$ be a countable set of points dense in $\operatorname{image}(\mu)$. For $k,m\in\mathbb{Z}_{>0}$, let $B(\frac1m,u_k)$ be the ball of radius $\frac1m$ centred at $u_k$, and note that $\operatorname{image}(\mu)\subset\bigcup_{k\in\mathbb{Z}_{>0}}B(\frac1m,u_k)$ for each $m\in\mathbb{Z}_{>0}$. Thus $\mathbb{T}=\bigcup_{k\in\mathbb{Z}_{>0}}\mu^{-1}(B(\frac1m,u_k))$ and so
$$\lim_{r\to\infty}\lambda\Bigl(\bigcup_{k=1}^{r}\mu^{-1}(B(\tfrac1m,u_k))\Bigr)=\lambda(\mathbb{T})<\infty.$$
Therefore, for each $m\in\mathbb{Z}_{>0}$ there exists $r_m\in\mathbb{Z}_{>0}$ such that, if
$$A_m=\mathbb{T}\setminus\bigcup_{k=1}^{r_m}\mu^{-1}(B(\tfrac1m,u_k)),$$
then $\lambda(A_m)<2^{-m}$. Now define $\nu_m\colon\mathbb{T}\to C$ by
$$\nu_m(t)=\begin{cases}u_1,&t\in\mu^{-1}(B(\tfrac1m,u_1)),\\u_2,&t\in\mu^{-1}(B(\tfrac1m,u_2))\setminus\mu^{-1}(B(\tfrac1m,u_1)),\\\ \vdots\\u_{r_m},&t\in\mu^{-1}(B(\tfrac1m,u_{r_m}))\setminus\bigcup_{l=1}^{r_m-1}\mu^{-1}(B(\tfrac1m,u_l)),\\\bar u,&t\in A_m,\end{cases}$$
where $\bar u\in C$ is arbitrary, but fixed. By construction, for $t\in\mathbb{T}\setminus A_m$ we have $d(\mu(t),\nu_m(t))<\frac1m$. Without loss of generality, we suppose that the sequence $(r_m)_{m\in\mathbb{Z}_{>0}}$ is increasing, and so, if we define $Z_m=\bigcup_{l=m}^\infty A_l$, we have $Z_{m+1}\subset Z_m$ for each $m\in\mathbb{Z}_{>0}$. Moreover, if $Z=\bigcap_{m=1}^\infty Z_m$, then $\lambda(Z)=0$ [Cohn 1980, Proposition 1.2.4]. Note that for $t\in\mathbb{T}\setminus Z$ we have $\lim_{m\to\infty}\nu_m(t)=\mu(t)$, giving our claim when $\mathbb{T}$ is bounded.

If $\mathbb{T}$ is not bounded, then write $\mathbb{T}=\bigcup_{j=1}^\infty\mathbb{T}_j$ for a sequence $(\mathbb{T}_j)_{j\in\mathbb{Z}_{>0}}$ of pairwise disjoint bounded intervals. For each $j\in\mathbb{Z}_{>0}$, by the previous paragraph let $(\nu_{j,k})_{k\in\mathbb{Z}_{>0}}$ be a sequence of simple controls from $\mathbb{T}_j$ to $C$ such that $\lim_{k\to\infty}\nu_{j,k}(t)=\mu(t)$ for all $t\in\mathbb{T}_j\setminus Z_j$, for a subset $Z_j\subset\mathbb{T}_j$ of zero measure. Then define a sequence $(\nu_k)_{k\in\mathbb{Z}_{>0}}$ of simple controls from $\mathbb{T}$ to $C$ by asking that, for $j\in\{1,\dots,k\}$, $\nu_k(t)=\nu_{j,k}(t)$ for $t\in\mathbb{T}_j$. For $t\in\mathbb{T}\setminus\bigcup_{j=1}^k\mathbb{T}_j$, we let $\nu_k(t)=\bar u$, where $\bar u$ is again arbitrary, but fixed. It then follows that, if $t\in\mathbb{T}\setminus\bigcup_{j=1}^\infty Z_j$, then $\lim_{k\to\infty}\nu_k(t)=\mu(t)$. Since $\bigcup_{j=1}^\infty Z_j$ has zero measure, our claim of $\mu$ being the pointwise limit, almost everywhere, of simple controls follows.

Now we prove this part of the theorem. We begin by assuming that $\mathbb{T}$ is bounded and that $\mu$ is a simple control. We thus have a partition $(A_1,\dots,A_k)$ of $\mathbb{T}$ into Lebesgue measurable sets and elements $u_1,\dots,u_k\in C$ such that $\mu(t)=u_j$ for $t\in A_j$, $j\in\{1,\dots,k\}$. Let $m\in\mathbb{Z}_{>0}$. By regularity of the Lebesgue measure [Cohn 1980, Proposition 1.4.1], for each $j\in\{1,\dots,k\}$ we write $A_j=U_j\setminus B_j$ where $U_j$ is open and where $B_j\subset U_j$ has the property that $\lambda(B_j)<\frac{1}{m2^{m+2}}$. Since $U_j$ is open, it is a countable union of disjoint open intervals. If $U_j$ is in fact a finite union of open intervals then denote $V_j=U_j$. If any of the intervals comprising $V_j$ have common endpoints, then these intervals may be shrunk so that their complement in $A_j$ has measure at most $\frac{1}{m2^{m+1}}$. Next suppose that $U_j$ is a countable union of open intervals $(J_{j,l})_{l\in\mathbb{Z}_{>0}}$. Since $U_j$ is bounded we must have $\sum_{l=1}^\infty\lambda(J_{j,l})<\infty$. Therefore, there exists $N_j\in\mathbb{Z}_{>0}$ such that $\sum_{l=N_j+1}^\infty\lambda(J_{j,l})<\frac{1}{m2^{m+2}}$. We then define $V_j=\bigcup_{l=1}^{N_j}J_{j,l}$. If any of the intervals $J_{j,1},\dots,J_{j,N_j}$ have common endpoints, they can be shrunk while maintaining the fact that the measure of their complement in $A_j$ is at most $\frac{1}{m2^{m+1}}$. Define $\mu_m\colon\mathbb{T}\to C$ on $V_j$ by asking that $\mu_m(t)=u_j$ for $t\in V_j$. Doing this for each $j\in\{1,\dots,k\}$ defines $\mu_m$ on the set $\bigcup_{j=1}^kV_j$, which is a finite union of open intervals whose complement has measure controlled by the bounds above. The complement to $\bigcup_{j=1}^kV_j$ is a union of intervals, and on these intervals define $\mu_m$ to be $\bar u$, where $\bar u\in C$ is arbitrary but fixed. Note that $\mu_m$ as constructed is piecewise constant, and that $\mu_m(t)=\mu(t)$ for $t\in\bigcup_{j=1}^kV_j$. Thus, if we define
$$Y_m=\{t\in\mathbb{T}\mid d(\mu(t),\mu_m(t))>\tfrac1m\},$$
we have $\lambda(Y_m)<\frac{1}{m2^{m+1}}$. Define $Z_r=\bigcup_{m=r}^\infty Y_m$ and note that
$$\lambda(Z_r)\le\sum_{m=r}^\infty\frac{1}{m2^{m+1}}<\frac1r\sum_{m=r}^\infty\frac{1}{2^{m+1}}<\frac1r.$$
Since $Z_{r+1}\subset Z_r$ for each $r\in\mathbb{Z}_{>0}$, if we define $Z=\bigcap_{r=1}^\infty Z_r$ we have $\lambda(Z)=0$ [Cohn 1980, Proposition 1.2.4]. Suppose that $t\in\mathbb{T}\setminus Z$. Then $t\in\mathbb{T}\setminus Z_r$ for some $r\in\mathbb{Z}_{>0}$ and so $t\in\mathbb{T}\setminus Y_m$ for $m\ge r$. Therefore, $d(\mu(t),\mu_m(t))\le\frac1m$ for $m\ge r$ and so $\lim_{m\to\infty}\mu_m(t)=\mu(t)$. This gives the desired approximation of simple controls by piecewise constant controls.

Now assume that $\mathbb{T}$ is bounded, but that $\mu$ is not necessarily a simple control. For $m\in\mathbb{Z}_{>0}$, from the computations above let $\nu_m\colon\mathbb{T}\to C$ be a simple control such that
$$\lambda\bigl(\{t\in\mathbb{T}\mid d(\mu(t),\nu_m(t))>\tfrac{1}{2m}\}\bigr)<\frac{1}{m2^{m+2}}$$
and let $\mu_m\colon\mathbb{T}\to C$ be a piecewise constant control such that
$$\lambda\bigl(\{t\in\mathbb{T}\mid d(\nu_m(t),\mu_m(t))>\tfrac{1}{2m}\}\bigr)<\frac{1}{m2^{m+2}}.$$
Define
$$Y_m=\{t\in\mathbb{T}\mid d(\mu(t),\nu_m(t))>\tfrac{1}{2m}\}\cup\{t\in\mathbb{T}\mid d(\nu_m(t),\mu_m(t))>\tfrac{1}{2m}\}$$
so that $\lambda(Y_m)<\frac{1}{m2^{m+1}}$. If we define $Z_r=\bigcup_{m=r}^\infty Y_m$ and $Z=\bigcap_{r=1}^\infty Z_r$, then, as in the previous paragraph, $\lambda(Z_r)<\frac1r$ and $\lambda(Z)=0$. If $t\in\mathbb{T}\setminus Z$ then, again as in the previous paragraph, there exists $r\in\mathbb{Z}_{>0}$ such that $t\in\mathbb{T}\setminus Y_m$ for $m\ge r$. But this means that
$$d(\mu(t),\mu_m(t))\le d(\mu(t),\nu_m(t))+d(\nu_m(t),\mu_m(t))\le\frac1m$$
for $m\ge r$, and so $\lim_{m\to\infty}\mu_m(t)=\mu(t)$, giving this part of the theorem when $\mathbb{T}$ is bounded.

Finally, when $\mathbb{T}$ is not bounded, write $\mathbb{T}=\bigcup_{j=1}^\infty\mathbb{T}_j$ for a sequence $(\mathbb{T}_j)_{j\in\mathbb{Z}_{>0}}$ of pairwise disjoint bounded intervals. For each $j\in\mathbb{Z}_{>0}$ let $(\mu_{j,k})_{k\in\mathbb{Z}_{>0}}$ be a sequence of piecewise constant controls from $\mathbb{T}_j$ to $C$ such that $\lim_{k\to\infty}\mu_{j,k}(t)=\mu(t)$ for all $t\in\mathbb{T}_j\setminus Z_j$, for a subset $Z_j\subset\mathbb{T}_j$ of zero measure. Then define a sequence $(\mu_k)_{k\in\mathbb{Z}_{>0}}$ of piecewise constant controls from $\mathbb{T}$ to $C$ by asking that, for $j\in\{1,\dots,k\}$, $\mu_k(t)=\mu_{j,k}(t)$ for $t\in\mathbb{T}_j$. For $t\in\mathbb{T}\setminus\bigcup_{j=1}^k\mathbb{T}_j$, we let $\mu_k(t)=\bar u$, where $\bar u$ is again arbitrary, but fixed. It then follows that, if $t\in\mathbb{T}\setminus\bigcup_{j=1}^\infty Z_j$, then $\lim_{k\to\infty}\mu_k(t)=\mu(t)$. Since $\bigcup_{j=1}^\infty Z_j$ has zero measure, this completes the proof of this part of the theorem.

(ii) $\Longrightarrow$ (i) For this part of the proof, we will actually prove something more general. We shall prove that if $(\mu_j)_{j\in\mathbb{Z}_{>0}}$ is a sequence in $L^0(\mathbb{T};C)$ such that $\lim_{j\to\infty}\mu_j(t)=\mu(t)$ for almost every $t\in\mathbb{T}$, then $\mu\in L^0(\mathbb{T};C)$. Since piecewise constant controls are measurable, this will prove this part of the theorem. Define
$$Z_\mu=\{t\in\mathbb{T}\mid\lim_{j\to\infty}\mu_j(t)\ne\mu(t)\}.$$

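Let us first suppose that $Z_\mu=\emptyset$. Let $U\subset C$ be open and let $t\in\mu^{-1}(U)$. Since $\lim_{j\to\infty}\mu_j(t)=\mu(t)$, there exists $N\in\mathbb{Z}_{>0}$ such that $\mu_j(t)\in U$ for $j\ge N$. Thus $t\in\bigcap_{j=N}^\infty\mu_j^{-1}(U)$, and so
$$\mu^{-1}(U)\subset\bigcup_{k=1}^\infty\bigcap_{j=k}^\infty\mu_j^{-1}(U).$$
Now let $A\subset C$ be closed and suppose that $t\in\mathbb{T}$ satisfies
$$t\in\bigcap_{j=k}^\infty\mu_j^{-1}(A)$$
for some $k\in\mathbb{Z}_{>0}$. This means that $\mu_j(t)\in A$ for all $j\ge k$ and so $\mu(t)=\lim_{j\to\infty}\mu_j(t)\in A$ since $A$ is closed. Thus we have
$$\bigcup_{k=1}^\infty\bigcap_{j=k}^\infty\mu_j^{-1}(A)\subset\mu^{-1}(A).$$
Now take a fixed open set $U\subset C$ and define sequences $(A_k)_{k\in\mathbb{Z}_{>0}}$ of closed subsets and $(U_k)_{k\in\mathbb{Z}_{>0}}$ of open subsets of $C$ by
$$A_k=\{u\in C\mid d(u,C\setminus U)\ge\tfrac1k\},\qquad U_k=\{u\in C\mid d(u,C\setminus U)>\tfrac1k\}.$$
Note that $U_k\subset A_k$ and that
$$U=\bigcup_{k\in\mathbb{Z}_{>0}}A_k=\bigcup_{k\in\mathbb{Z}_{>0}}U_k.$$
By our observations above for general open and closed sets we have
$$\mu^{-1}(U_k)\subset\bigcup_{m=1}^\infty\bigcap_{j=m}^\infty\mu_j^{-1}(U_k)\qquad\text{and}\qquad\bigcup_{m=1}^\infty\bigcap_{j=m}^\infty\mu_j^{-1}(A_k)\subset\mu^{-1}(A_k).$$
Therefore,
$$\mu^{-1}(U)=\bigcup_{k=1}^\infty\mu^{-1}(U_k)\subset\bigcup_{k=1}^\infty\bigcup_{m=1}^\infty\bigcap_{j=m}^\infty\mu_j^{-1}(U_k)$$
and
$$\mu^{-1}(U)=\bigcup_{k=1}^\infty\mu^{-1}(A_k)\supset\bigcup_{k=1}^\infty\bigcup_{m=1}^\infty\bigcap_{j=m}^\infty\mu_j^{-1}(A_k)\supset\bigcup_{k=1}^\infty\bigcup_{m=1}^\infty\bigcap_{j=m}^\infty\mu_j^{-1}(U_k),$$
and so
$$\mu^{-1}(U)=\bigcup_{k=1}^\infty\bigcup_{m=1}^\infty\bigcap_{j=m}^\infty\mu_j^{-1}(U_k).$$
Since $\mu_j$ is measurable for each $j\in\mathbb{Z}_{>0}$, $\mu_j^{-1}(U_k)$ is Lebesgue measurable for each $j,k\in\mathbb{Z}_{>0}$. Thus, in turn,
$$\bigcap_{j=m}^\infty\mu_j^{-1}(U_k),\qquad\bigcup_{m=1}^\infty\bigcap_{j=m}^\infty\mu_j^{-1}(U_k),\qquad\bigcup_{k=1}^\infty\bigcup_{m=1}^\infty\bigcap_{j=m}^\infty\mu_j^{-1}(U_k)$$
are Lebesgue measurable, and so $\mu^{-1}(U)$ is Lebesgue measurable. This completes this part of the proof when $Z_\mu=\emptyset$.

To complete the proof, suppose that $Z_\mu\ne\emptyset$. Define $\nu_j\colon\mathbb{T}\to C$, $j\in\mathbb{Z}_{>0}$, by
$$\nu_j(t)=\begin{cases}\mu_j(t),&t\in\mathbb{T}\setminus Z_\mu,\\\mu(t),&t\in Z_\mu,\end{cases}$$
so that $\lim_{j\to\infty}\nu_j(t)=\mu(t)$ for every $t\in\mathbb{T}$. We claim that $\nu_j$ is measurable for each $j\in\mathbb{Z}_{>0}$. Indeed, let $U\subset C$ be open. Then
$$\nu_j^{-1}(U)=(\nu_j^{-1}(U)\cap Z_\mu)\cup(\nu_j^{-1}(U)\cap(\mathbb{T}\setminus Z_\mu)).$$
Since the Lebesgue measure is complete, $\nu_j^{-1}(U)\cap Z_\mu$ is Lebesgue measurable. Also, note that
$$\nu_j^{-1}(U)\cap(\mathbb{T}\setminus Z_\mu)=\mu_j^{-1}(U)\cap(\mathbb{T}\setminus Z_\mu),$$
and so it follows that $\nu_j^{-1}(U)\cap(\mathbb{T}\setminus Z_\mu)$ is measurable. Thus $\nu_j$ is measurable, as claimed. Thus, by the proof above, $\mu^{-1}(U)$ is Lebesgue measurable for every open subset $U\subset C$.

Now we turn to the final assertion of the theorem. Let $K\subset C$ be a compact set such that
$$\lambda(\{t\in\mathbb{T}\mid\mu(t)\notin K\})=0.$$
Then define $\tilde\mu\colon\mathbb{T}\to C\cap K$ by
$$\tilde\mu(t)=\begin{cases}\mu(t),&\mu(t)\in K,\\\bar u,&\mu(t)\notin K,\end{cases}$$
where $\bar u\in C\cap K$ is arbitrary but fixed. By the implication (i) $\Longrightarrow$ (ii) proved above, there exists a sequence $(\mu_j)_{j\in\mathbb{Z}_{>0}}$ in $L_{\mathrm{pwc}}(\mathbb{T};C\cap K)$ such that $\lim_{j\to\infty}\mu_j(t)=\tilde\mu(t)$ for almost every $t\in\mathbb{T}$. Since $\mu(t)=\tilde\mu(t)$ for almost every $t\in\mathbb{T}$, it follows that $\lim_{j\to\infty}\mu_j(t)=\mu(t)$ for almost every $t\in\mathbb{T}$, as desired.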
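The first step of the proof, in which a control is replaced by the first point of a dense sequence lying within $\frac1m$ of its value, can be sketched as follows for a real-valued control. This is only an illustration under simplifying assumptions: the control is `math.sin`, a finite net stands in for the first $r_m$ points of the dense sequence, and the names `simple_approximation`, `net`, and `fallback` are hypothetical.

```python
import math


def simple_approximation(mu, dense_points, m, fallback):
    """Return nu_m, where nu_m(t) is the first listed point within 1/m of mu(t)."""
    def nu_m(t):
        value = mu(t)
        for u in dense_points:
            if abs(value - u) < 1.0 / m:
                return u
        return fallback  # value used on the exceptional set of small measure
    return nu_m


# A finite net in [-1, 1] standing in for finitely many points of a dense sequence.
net = [k / 10.0 for k in range(-10, 11)]
nu_5 = simple_approximation(math.sin, net, 5, fallback=0.0)
samples = [0.01 * i for i in range(700)]
worst = max(abs(math.sin(t) - nu_5(t)) for t in samples)
```

In this example the net is fine enough that the fallback value is never used, so the control `nu_5` takes finitely many values and stays within $\frac15$ of the original control at every sampled time.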

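6.1.3 Remark (Sequences of measurable controls converging pointwise almost everywhere) Note that in the second part of the proof, we proved the following: If $(C,d)$ is a metric space, if $\mu\colon\mathbb{T}\to C$ is a control, and if $(\mu_j)_{j\in\mathbb{Z}_{>0}}$ is a sequence in $L^0(\mathbb{T};C)$ such that $\lim_{j\to\infty}\mu_j(t)=\mu(t)$ for almost every $t\in\mathbb{T}$, then $\mu\in L^0(\mathbb{T};C)$. This is interesting, but this is not a book about measure theory.

6.1.2 Subsets of admissible locally essentially bounded controls

We will often wish to consider subsets of controls having various properties. Before we do so, we shall make precise the notion of a control and assign some notation to the notion. We will also provide a few rather natural constructions with controls that can be useful in characterising subsets of controls.

6.1.4 Definition (Admissible locally essentially bounded controls) Let $r\in\mathbb{Z}_{\ge0}\cup\{\infty,\omega\}$ and let $\Sigma=(M,F,C,\mathbb{T})$ be a $C^r$-control system.
(i) A locally essentially bounded control for $\Sigma$ is a map $\mu\in L^\infty_{\mathrm{loc}}(\mathbb{T}';C)$ where $\mathbb{T}'\subset\mathbb{T}$ is a subinterval called the time-domain for $\mu$ and denoted by $\operatorname{TD}(\mu)$.
(ii) The set of locally essentially bounded controls for $\Sigma$ with time-domain $\mathbb{T}'$ is denoted by $\operatorname{Cont}_\infty(\Sigma,\mathbb{T}')$ and the set of all controls is denoted by $\operatorname{Cont}_\infty(\Sigma)$.
(iii) A set of admissible locally essentially bounded controls is simply a subset $\mathscr{U}\subset\operatorname{Cont}_\infty(\Sigma)$.

Next we consider some constructions one can perform with controls.

6.1.5 Definition (Constructions using locally essentially bounded controls) Let $r\in\mathbb{Z}_{\ge0}\cup\{\infty,\omega\}$ and let $\Sigma=(M,F,C,\mathbb{T})$ be a $C^r$-control system.
(i) If $\mu\in\operatorname{Cont}_\infty(\Sigma,\mathbb{T}')$ and if $\mathbb{T}''\subset\mathbb{T}'$ is a subinterval, we denote by $\mu|\mathbb{T}''\in\operatorname{Cont}_\infty(\Sigma,\mathbb{T}'')$ the restriction of $\mu$ to $\mathbb{T}''$.
(ii) Let $\mathbb{T}_1,\mathbb{T}_2\subset\mathbb{T}'$ be subintervals of a subinterval $\mathbb{T}'\subset\mathbb{T}$ and let $\mu_1\in\operatorname{Cont}_\infty(\Sigma,\mathbb{T}_1)$ and $\mu_2\in\operatorname{Cont}_\infty(\Sigma,\mathbb{T}_2)$. The controls $\mu_1$ and $\mu_2$ are concatible if (a) $\sup\mathbb{T}_1=\inf\mathbb{T}_2$ and (b) $\sup\mathbb{T}_1\in\mathbb{T}_1\cup\mathbb{T}_2$, i.e., the boundary point for $\mathbb{T}_1$ and $\mathbb{T}_2$ is contained in at least one of the intervals. If $\mu_1$ and $\mu_2$ are concatible then their concatenation is the control $\mu_1*\mu_2\in\operatorname{Cont}_\infty(\Sigma,\mathbb{T}_1\cup\mathbb{T}_2)$ defined by
$$\mu_1*\mu_2(t)=\begin{cases}\mu_1(t),&t\in\mathbb{T}_1\setminus\{\sup\mathbb{T}_1\},\\\mu_2(t),&t\in\mathbb{T}_2\setminus\{\inf\mathbb{T}_2\},\\\mu_1(t),&t=\sup\mathbb{T}_1,\ \sup\mathbb{T}_1\in\mathbb{T}_1,\\\mu_2(t),&t=\inf\mathbb{T}_2,\ \inf\mathbb{T}_2\in\mathbb{T}_2,\ \sup\mathbb{T}_1\notin\mathbb{T}_1.\end{cases}$$
(iii) Let $\mu\in\operatorname{Cont}_\infty(\Sigma,\mathbb{T}')$ and let $\mathbb{T}''\subset\mathbb{T}'$ be a subinterval. We consider the following cases.
(a) $\mathbb{T}'\setminus\mathbb{T}''$ is connected and $\inf\mathbb{T}'<\inf\mathbb{T}''$: In this case define the control $\mu''\in\operatorname{Cont}_\infty(\Sigma,\mathbb{T}'\setminus\mathbb{T}'')$ by $\mu''(t)=\mu(t)$.
(b) $\mathbb{T}'\setminus\mathbb{T}''$ is disconnected: In this case, $\mathbb{T}''$ is bounded; let us denote $a=\inf\mathbb{T}''$ and $b=\sup\mathbb{T}''$. We then define
$$\mathbb{T}'''=(\mathbb{T}'\cap(-\infty,a])\cup\{t-(b-a)\mid t\in\mathbb{T}'\cap(b,\infty)\}$$
and $\mu''\in\operatorname{Cont}_\infty(\Sigma,\mathbb{T}''')$ by
$$\mu''(t)=\begin{cases}\mu(t),&t\in\mathbb{T}'''\cap(-\infty,a],\\\mu(t+(b-a)),&t\in\mathbb{T}'''\cap(a,\infty).\end{cases}$$
The control $\mu''$ is the $\mathbb{T}''$-deletion of $\mu$.
(iv) Let $\mu\in\operatorname{Cont}_\infty(\Sigma,\mathbb{T}')$, let $\mathbb{T}''\subset\mathbb{T}'$ be a subinterval, and let $\mu'\in\operatorname{Cont}_\infty(\Sigma,\mathbb{T}'')$. The control $\mu''\in\operatorname{Cont}_\infty(\Sigma,\mathbb{T}')$ defined by
$$\mu''(t)=\begin{cases}\mu(t),&t\in\mathbb{T}'\setminus\mathbb{T}'',\\\mu'(t),&t\in\mathbb{T}'',\end{cases}$$
is the $\mu'$-substitution of $\mu$.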
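Two of the constructions in the preceding definition, concatenation and substitution, can be sketched in code. The representation below is an assumption made only for illustration: a control is a triple `(a, b, function)` on a closed interval `[a, b]`, so that the common endpoint always belongs to the first interval and the first control supplies the value there. The function names are hypothetical.

```python
def concatenate(control_1, control_2):
    """Concatenate controls given as (a, b, function) on closed intervals."""
    a1, b1, mu1 = control_1
    a2, b2, mu2 = control_2
    if b1 != a2:
        raise ValueError("controls are not concatible: sup T1 != inf T2")

    def mu(t):
        if not a1 <= t <= b2:
            raise ValueError("t lies outside the time-domain")
        # The common endpoint belongs to T1, so mu1 supplies the value there.
        return mu1(t) if t <= b1 else mu2(t)

    return (a1, b2, mu)


def substitute(control, replacement):
    """Replace `control` by `replacement` on the smaller interval of the latter."""
    a, b, mu = control
    c, d, mu_prime = replacement
    if not (a <= c and d <= b):
        raise ValueError("the replacement's time-domain must be a subinterval")

    def mu_sub(t):
        if not a <= t <= b:
            raise ValueError("t lies outside the time-domain")
        return mu_prime(t) if c <= t <= d else mu(t)

    return (a, b, mu_sub)


# Hypothetical scalar controls used only to exercise the two constructions.
joined = concatenate((0.0, 1.0, lambda t: 1.0), (1.0, 3.0, lambda t: -t))
patched = substitute(joined, (0.5, 0.75, lambda t: 7.0))
```

With these definitions `joined[2](1.0)` evaluates the first control at the shared endpoint, and `patched[2]` agrees with `joined[2]` outside `[0.5, 0.75]`.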

In Figure 6.1 we depict the four constructions on controls from the previous definition.

Figure 6.1 Constructions on controls: restriction (top left), concatenation (top right), deletion (bottom left), and substitution (bottom right)

We shall sometimes use a definition of concatenation that is a little different from what we have provided above. In our construction above, we ask that the domains of definition of the concatenated controls $\mu_1$ and $\mu_2$ are adjacent. Sometimes, however, both controls are defined on intervals starting at $0$: $\mu_1\colon[0,T_1]\to C$ and $\mu_2\colon[0,T_2]\to C$. In this case, the concatenation $\mu_1*\mu_2\colon[0,T_1+T_2]\to C$ of $\mu_1$ and $\mu_2$ is defined by
$$\mu_1*\mu_2(t)=\begin{cases}\mu_1(t),&t\in[0,T_1],\\\mu_2(t-T_1),&t\in(T_1,T_1+T_2].\end{cases}$$
Finally, we can give some properties of subsets of admissible controls that will be useful.

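6.1.6 Definition (Properties of admissible locally essentially bounded controls) Let $r\in\mathbb{Z}_{\ge0}\cup\{\infty,\omega\}$ and let $\Sigma=(M,F,C,\mathbb{T})$ be a $C^r$-control system. Let $\mathscr{U}\subset\operatorname{Cont}_\infty(\Sigma)$ be a set of admissible controls.
(i) The set $\mathscr{U}$ is closed under restriction if, for every $\mu\in\mathscr{U}$ and $\mathbb{T}'\subset\operatorname{TD}(\mu)$, $\mu|\mathbb{T}'\in\mathscr{U}$.
(ii) The set $\mathscr{U}$ is closed under concatenation if, for all concatible elements $\mu_1$ and $\mu_2$ of $\mathscr{U}$, $\mu_1*\mu_2\in\mathscr{U}$.
(iii) The set $\mathscr{U}$ is closed under interval deletion if, for every $\mu\in\mathscr{U}$ and $\mathbb{T}'\subset\operatorname{TD}(\mu)$, the $\mathbb{T}'$-deletion of $\mu$ is in $\mathscr{U}$.
(iv) The set $\mathscr{U}$ is closed under substitution if, for every $\mu,\mu'\in\mathscr{U}$ satisfying $\operatorname{TD}(\mu')\subset\operatorname{TD}(\mu)$, the $\mu'$-substitution of $\mu$ is in $\mathscr{U}$.

6.1.3 Euclidean space valued controls

Of course, the Euclidean space $\mathbb{R}^m$ is a metric space with the metric
$$d(u,v)=\Bigl(\sum_{j=1}^m(v_j-u_j)^2\Bigr)^{1/2}.$$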

If $C\subset\mathbb{R}^m$ is any subset, it is necessarily a metric space with the metric induced from that on $\mathbb{R}^m$. Moreover, since $\mathbb{R}^m$ is separable, so too is $C$. Thus the developments of the preceding section concerning metric space valued controls also hold for controls taking values in a subset $C\subset\mathbb{R}^m$. However, we can take advantage of the additional structure of Euclidean space to make some further definitions.

6.1.7 Definition (Integrable controls) Let $C\subset\mathbb{R}^m$ and let $\mathbb{T}\subset\mathbb{R}$ be an interval. A measurable control $\mu\colon\mathbb{T}\to C$
(i) is integrable if, for $j\in\{1,\dots,m\}$,
$$\int_{\mathbb{T}}|\mu_j(t)|\,dt<\infty,$$
and
(ii) is locally integrable if $\mu|\mathbb{T}'$ is integrable for every compact interval $\mathbb{T}'\subset\mathbb{T}$.
We also use the following notation.
(iii) The set of integrable $C$-valued controls on $\mathbb{T}$ is denoted by $L^1(\mathbb{T};C)$.
(iv) The set of locally integrable $C$-valued controls on $\mathbb{T}$ is denoted by $L^1_{\mathrm{loc}}(\mathbb{T};C)$.
(v) For $r\in\mathbb{Z}_{\ge0}\cup\{\infty,\omega\}$, the set of $C$-valued controls on $\mathbb{T}$ of class $C^r$ is denoted by $C^r(\mathbb{T};C)$ (when we refer to differentiability here, we are thinking of differentiability of maps taking values in $\mathbb{R}^m$).

Of course, if $\mu\in L^1(\mathbb{T};C)$, the integral of $\mu$ is defined as the element of $\mathbb{R}^m$ given by
$$\int_{\mathbb{T}}\mu(t)\,dt=\Bigl(\int_{\mathbb{T}}\mu_1(t)\,dt,\dots,\int_{\mathbb{T}}\mu_m(t)\,dt\Bigr).$$
The following result is useful.

6.1.8 Proposition (Characterisation of integrable controls) For a subset $C\subset\mathbb{R}^m$ and a measurable control $\mu\colon\mathbb{T}\to C$, the following statements are equivalent:
(i) $\mu\in L^1(\mathbb{T};C)$;
(ii) the $\mathbb{R}$-valued function $t\mapsto\|\mu(t)\|$ is integrable.
Moreover, if either of the above equivalent conditions holds, then
$$\Bigl\|\int_{\mathbb{T}}\mu(t)\,dt\Bigr\|\le\int_{\mathbb{T}}\|\mu(t)\|\,dt.$$

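Proof (i) $\Longrightarrow$ (ii) Since the product and sum of measurable functions is measurable [Cohn 1980, Propositions 2.1.5 and 2.1.6], the function
$$t\mapsto\sum_{j=1}^m\mu_j(t)^2$$
is measurable. Since the composition of a measurable function with a continuous function is measurable, it then follows that the function $t\mapsto\|\mu(t)\|$ is measurable. Using (1.1) we have
$$\int_{\mathbb{T}}\|\mu(t)\|\,dt\le\int_{\mathbb{T}}|\mu_1(t)|\,dt+\cdots+\int_{\mathbb{T}}|\mu_m(t)|\,dt<\infty,$$
giving the result.

(ii) $\Longrightarrow$ (i) Recall that
$$|u_1|+\cdots+|u_m|\le\sqrt{m}\,\|u\|$$
for every $u\in\mathbb{R}^m$. Therefore, for each $j\in\{1,\dots,m\}$ we have
$$\int_{\mathbb{T}}|\mu_j(t)|\,dt\le\sqrt{m}\int_{\mathbb{T}}\|\mu(t)\|\,dt<\infty,$$
as desired.

Now we prove the final assertion of the proposition. The inequality obviously holds if $\int_{\mathbb{T}}\mu(t)\,dt=0$, so we may suppose that $\int_{\mathbb{T}}\mu(t)\,dt\ne0$. Let $u\in\mathbb{R}^m$ be such that $\|u\|=1$ and
$$\int_{\mathbb{T}}\mu(t)\,dt=u\,\Bigl\|\int_{\mathbb{T}}\mu(t)\,dt\Bigr\|.$$
Therefore, using linearity of the integral and the fact that $\langle u,u\rangle=1$,
$$\int_{\mathbb{T}}\langle u,\mu(t)\rangle\,dt=\Bigl\langle u,\int_{\mathbb{T}}\mu(t)\,dt\Bigr\rangle=\Bigl\|\int_{\mathbb{T}}\mu(t)\,dt\Bigr\|.$$
Since $|u_j|\le1$ for each $j\in\{1,\dots,m\}$ we can use the Cauchy-Bunyakovsky-Schwarz inequality to get
$$\langle u,\mu(t)\rangle\le|\langle u,\mu(t)\rangle|\le\|u\|\,\|\mu(t)\|=\|\mu(t)\|.$$
Therefore,
$$\Bigl\|\int_{\mathbb{T}}\mu(t)\,dt\Bigr\|\le\int_{\mathbb{T}}\|\mu(t)\|\,dt,$$
as desired.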


From Theorem 6.1.2, since integrable controls are measurable it follows that integrable controls are approximated by piecewise constant controls in the sense that, if $\mu\in L^1(\mathbb{T};C)$, then there exists a sequence $(\mu_j)_{j\in\mathbb{Z}_{>0}}$ in $L_{\mathrm{pwc}}(\mathbb{T};C)$ such that $\lim_{j\to\infty}\mu_j(t)=\mu(t)$ for almost every $t\in\mathbb{T}$. However, for integrable controls there is another sort of approximation that is also useful.

6.1.9 Theorem (Integrable controls are approximated by piecewise constant controls) If $C\subset\mathbb{R}^m$, if $\mathbb{T}\subset\mathbb{R}$ is an interval, and if $\mu\in L^1_{\mathrm{loc}}(\mathbb{T};C)$, then there exists a sequence $(\mu_j)_{j\in\mathbb{Z}_{>0}}$ in $L_{\mathrm{pwc}}(\mathbb{T};C)$ such that
$$\lim_{j\to\infty}\int_{\mathbb{T}}\|\mu(t)-\mu_j(t)\|\,dt=0.$$

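Proof We prove the theorem by proving it for cases increasing in generality, until the desired final result is achieved.

We first suppose that $\mathbb{T}$ is bounded and that $C$ is bounded. Let $M\in\mathbb{R}_{>0}$ be such that $C\subset B^m(M,0)$. Let $k\in\mathbb{Z}_{>0}$. By Theorem 6.1.2 there exists a piecewise constant control $\mu_k\colon\mathbb{T}\to C$ such that
$$\lambda\Bigl(\Bigl\{t\in\mathbb{T}\Bigm|\|\mu(t)-\mu_k(t)\|>\frac{1}{2\lambda(\mathbb{T})k}\Bigr\}\Bigr)<\frac{1}{2Mk}.$$
Then
$$\int_{\mathbb{T}}\|\mu(t)-\mu_k(t)\|\,dt<\frac{1}{2\lambda(\mathbb{T})k}\lambda(\mathbb{T})+\frac{1}{2Mk}M=\frac1k,$$
giving the result in this case.

Next we suppose that $\mathbb{T}$ is bounded but that $C\subset\mathbb{R}^m$ is arbitrary. Let $k\in\mathbb{Z}_{>0}$. For $j\in\mathbb{Z}_{>0}$ define
$$\nu_j(t)=\begin{cases}\mu(t),&\|\mu(t)\|\le j,\\\bar u,&\|\mu(t)\|>j,\end{cases}$$
for some arbitrary but fixed $\bar u\in C$. We have $\mu(t)=\lim_{j\to\infty}\nu_j(t)$ for almost every $t\in\mathbb{T}$. By the Dominated Convergence Theorem [Cohn 1980, Theorem 2.4.4],
$$\lim_{j\to\infty}\int_{\mathbb{T}}\|\mu(t)-\nu_j(t)\|\,dt=0.$$
Thus there exists $N$ sufficiently large that
$$\int_{\mathbb{T}}\|\mu(t)-\nu_N(t)\|\,dt<\frac{1}{2k}.$$
By the argument in the previous paragraph there exists a piecewise constant control $\mu_k\colon\mathbb{T}\to C$ such that
$$\int_{\mathbb{T}}\|\nu_N(t)-\mu_k(t)\|\,dt<\frac{1}{2k}.$$
Then, using the triangle inequality and monotonicity of the integral,
$$\int_{\mathbb{T}}\|\mu(t)-\mu_k(t)\|\,dt\le\int_{\mathbb{T}}\|\mu(t)-\nu_N(t)\|\,dt+\int_{\mathbb{T}}\|\nu_N(t)-\mu_k(t)\|\,dt<\frac1k,$$
giving the result in this case.

Finally, we prove the theorem under the stated hypotheses, allowing $\mathbb{T}$ to be unbounded. Let $(\mathbb{T}_j)_{j\in\mathbb{Z}_{>0}}$ be a partition of $\mathbb{T}$ into disjoint bounded intervals. Let $k\in\mathbb{Z}_{>0}$. For each $j\in\mathbb{Z}_{>0}$ let $\mu_{j,k}\colon\mathbb{T}_j\to C$ be a piecewise constant control such that
$$\int_{\mathbb{T}_j}\|\mu(t)-\mu_{j,k}(t)\|\,dt<\frac{1}{k2^{j+1}}.$$
Then define the piecewise constant control $\mu_k\colon\mathbb{T}\to C$ by asking that $\mu_k(t)=\mu_{j,k}(t)$ when $t\in\mathbb{T}_j$. Then
$$\int_{\mathbb{T}}\|\mu(t)-\mu_k(t)\|\,dt=\sum_{j=1}^\infty\int_{\mathbb{T}_j}\|\mu(t)-\mu_{j,k}(t)\|\,dt<\sum_{j=1}^\infty\frac{1}{k2^{j+1}}=\frac{1}{2k}<\frac1k,$$
which completes the proof.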
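The statement of the theorem can be checked numerically in a very simple case. The sketch below is not the construction used in the proof: it assumes a continuous scalar control on a bounded interval and samples it at the midpoints of $k$ equal subintervals, which is one convenient way to produce piecewise constant controls. The names `piecewise_constant` and `l1_distance`, and the control $t\mapsto t$ on $[0,1]$, are hypothetical.

```python
def piecewise_constant(mu, a, b, k):
    """Sample mu at the midpoints of k equal subintervals of [a, b]."""
    h = (b - a) / k
    values = [mu(a + (j + 0.5) * h) for j in range(k)]

    def mu_k(t):
        j = min(int((t - a) / h), k - 1)  # index of the subinterval containing t
        return values[j]

    return mu_k


def l1_distance(f, g, a, b, n=10000):
    """Midpoint-rule estimate of the integral of |f - g| over [a, b]."""
    h = (b - a) / n
    return h * sum(abs(f(a + (i + 0.5) * h) - g(a + (i + 0.5) * h))
                   for i in range(n))


def mu(t):
    """A simple integrable control on [0, 1], chosen only for illustration."""
    return t


errors = [l1_distance(mu, piecewise_constant(mu, 0.0, 1.0, k), 0.0, 1.0)
          for k in (10, 100, 1000)]
```

For this control the integral of the error over each subinterval can be computed by hand, giving a total of $\frac{1}{4k}$, so the entries of `errors` should be close to 0.025, 0.0025, and 0.00025.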

If the control set $C\subset\mathbb{R}^m$ has further structure, then approximations can be made by nicer sets of controls.

6.1.10 Theorem (Approximation of measurable controls by continuous controls) If $C\subset\mathbb{R}^m$ is convex, if $\mathbb{T}\subset\mathbb{R}$ is an interval, and if $\mu\in L^0(\mathbb{T};C)$, then there exists a sequence $(\mu_j)_{j\in\mathbb{Z}_{>0}}$ of continuous $C$-valued controls on $\mathbb{T}$ such that $\lim_{j\to\infty}\mu_j(t)=\mu(t)$ for almost every $t\in\mathbb{T}$.

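Proof Let us begin by considering a piecewise constant control $\nu\in L_{\mathrm{pwc}}(\mathbb{T};C)$. This means that we have a partition $(\mathbb{T}_j)_{j\in J}$ of $\mathbb{T}$ into finitely or countably many bounded intervals with $\nu(t)=u_j$ for $t\in\mathbb{T}_j$, $j\in J$. Let us suppose that the intervals are ordered so that $\sup\mathbb{T}_{j_1}\le\inf\mathbb{T}_{j_2}$ if $j_1<j_2$. Let $\epsilon\in\mathbb{R}_{>0}$. Let $J'=J\setminus\{\sup J\}$. Then let $j\in J'$ and let $t_j=\sup\mathbb{T}_j=\inf\mathbb{T}_{j+1}$. Let $\delta_j=\min\{\frac{\epsilon}{2^{j+2}},\frac12\lambda(\mathbb{T}_j),\frac12\lambda(\mathbb{T}_{j+1})\}$. Then, for $t\in[t_j-\delta_j,t_j+\delta_j]$, define
$$\nu_\epsilon(t)=\frac{1}{2\delta_j}(t-t_j)(u_{j+1}-u_j)+\frac12(u_j+u_{j+1}).$$
Note that $t\mapsto\nu_\epsilon(t)$ linearly interpolates between $u_j$ and $u_{j+1}$ on $[t_j-\delta_j,t_j+\delta_j]$. By convexity of $C$, $\nu_\epsilon(t)\in C$ for every $t\in[t_j-\delta_j,t_j+\delta_j]$. Moreover, if we define $\nu_\epsilon(t)$ for $t\in\mathbb{T}$ by the preceding process if $t\in[t_j-\delta_j,t_j+\delta_j]$ for some $j$ and otherwise by $u_j$ if $t\in\mathbb{T}_j$, we note that $\nu_\epsilon$ agrees exactly with $\nu$ except on $\bigcup_{j\in J'}[t_j-\delta_j,t_j+\delta_j]$. But
$$\lambda\Bigl(\bigcup_{j\in J'}[t_j-\delta_j,t_j+\delta_j]\Bigr)\le\sum_{j=1}^\infty\frac{\epsilon}{2^{j+1}}<\epsilon,$$
showing that a piecewise constant control agrees with a continuous control except on a set whose measure can be made as small as desired.

Now, by Theorem 6.1.2 let $(\nu_j)_{j\in\mathbb{Z}_{>0}}$ be a sequence of piecewise constant controls with the property that
$$\lambda\bigl(\{t\in\mathbb{T}\mid\|\mu(t)-\nu_j(t)\|>\tfrac{1}{2j}\}\bigr)<\frac{1}{j2^{j+2}}$$
and, by our computations at the beginning of the proof, let $(\mu_j)_{j\in\mathbb{Z}_{>0}}$ be a sequence of continuous controls such that
$$\lambda\bigl(\{t\in\mathbb{T}\mid\|\nu_j(t)-\mu_j(t)\|>\tfrac{1}{2j}\}\bigr)<\frac{1}{j2^{j+2}}.$$
Define
$$Y_j=\{t\in\mathbb{T}\mid\|\mu(t)-\nu_j(t)\|>\tfrac{1}{2j}\}\cup\{t\in\mathbb{T}\mid\|\nu_j(t)-\mu_j(t)\|>\tfrac{1}{2j}\}$$
so that $\lambda(Y_j)<\frac{1}{j2^{j+1}}$. If we define $Z_k=\bigcup_{j=k}^\infty Y_j$ and $Z=\bigcap_{k=1}^\infty Z_k$, then
$$\lambda(Z_k)\le\sum_{j=k}^\infty\frac{1}{j2^{j+1}}<\frac1k\sum_{j=k}^\infty\frac{1}{2^{j+1}}<\frac1k.$$
Since $Z_{k+1}\subset Z_k$ for each $k\in\mathbb{Z}_{>0}$, we have $\lambda(Z)=0$ [Cohn 1980, Proposition 1.2.4]. Suppose that $t\in\mathbb{T}\setminus Z$. Then $t\in\mathbb{T}\setminus Z_k$ for some $k\in\mathbb{Z}_{>0}$ and so $t\in\mathbb{T}\setminus Y_j$ for $j\ge k$. But this means that
$$\|\mu(t)-\mu_j(t)\|\le\|\mu(t)-\nu_j(t)\|+\|\nu_j(t)-\mu_j(t)\|\le\frac1j$$
for $j\ge k$, and so $\lim_{j\to\infty}\mu_j(t)=\mu(t)$, giving the theorem.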
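The interpolation step at the start of the proof can be sketched for a scalar piecewise constant control. The code assumes the switching times and half-widths are supplied by the caller and that the half-widths are small enough for the windows not to overlap, as guaranteed in the proof by the choice of the half-widths; the names `smooth`, `breaks`, `values`, and `deltas` are hypothetical.

```python
def smooth(breaks, values, deltas):
    """Continuous control built from a piecewise constant one.

    breaks: switching times t_1 < t_2 < ...; values: the constant values, one
    more than the number of breaks; deltas: half-widths of the interpolation
    windows, assumed small enough that the windows do not overlap.
    """
    def nu(t):
        for j, (tj, dj) in enumerate(zip(breaks, deltas)):
            if tj - dj <= t <= tj + dj:
                u, v = values[j], values[j + 1]
                # Linear interpolation between u and v across the window.
                return (t - tj) * (v - u) / (2 * dj) + (u + v) / 2
        # Outside every window the original constant value is kept.
        j = sum(1 for tj in breaks if t >= tj)
        return values[j]

    return nu


# Hypothetical control on [0, 3]: 0 on [0, 1), 1 on [1, 2), -1 on [2, 3].
nu = smooth(breaks=[1.0, 2.0], values=[0.0, 1.0, -1.0], deltas=[0.25, 0.25])
```

The smoothed control equals the original one outside the two windows, takes the average of the neighbouring values at each switching time, and matches the constant values at the window endpoints, so it is continuous and stays within the convex hull of the original values.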

The same proof as that for Theorem 6.1.9 gives the following result.

6.1.11 Corollary (Approximation in norm of integrable controls by continuous controls) If $C\subset\mathbb{R}^m$ is convex, if $\mathbb{T}\subset\mathbb{R}$ is an interval, and if $\mu\in L^1(\mathbb{T};C)$, then there exists a sequence $(\mu_j)_{j\in\mathbb{Z}_{>0}}$ of continuous $C$-valued controls on $\mathbb{T}$ such that
$$\lim_{j\to\infty}\int_{\mathbb{T}}\|\mu(t)-\mu_j(t)\|\,dt=0.$$

The final approximation result we state for Euclidean space valued controls holds when the control set is open, something which does not happen in many situations.

6.1.12 Theorem (Approximation of measurable controls by analytic controls) If $C\subset\mathbb{R}^m$ is open and convex, if $\mathbb{T}\subset\mathbb{R}$ is a bounded interval, and if $\mu\in L^0(\mathbb{T};C)$, then there exists a sequence $(\mu_j)_{j\in\mathbb{Z}_{>0}}$ of analytic $C$-valued controls on $\mathbb{T}$ such that $\lim_{j\to\infty}\mu_j(t)=\mu(t)$ for almost every $t\in\mathbb{T}$.

Proof We can assume, without loss of generality, that $\mathbb{T}$ is compact. We begin by considering a continuous control $\mu\in C^0(\mathbb{T};C)$. Since $\mathbb{T}$ is compact, $\mu(\mathbb{T})$ is compact [Abraham, Marsden, and Ratiu 1988, Proposition 1.5.2]. For $u\in\operatorname{image}(\mu)$ let $\epsilon_u\in\mathbb{R}_{>0}$ be such that $B^m(\epsilon_u,u)\subset C$. Since $\operatorname{image}(\mu)\subset\bigcup_{u\in\operatorname{image}(\mu)}B^m(\epsilon_u,u)$ there exist $u_1,\dots,u_k\in\operatorname{image}(\mu)$ such that $\operatorname{image}(\mu)\subset\bigcup_{j=1}^kB^m(\epsilon_{u_j},u_j)$. Let $\epsilon\in\mathbb{R}_{>0}$ and suppose that
$$\epsilon<\min\{\epsilon_{u_1},\dots,\epsilon_{u_k}\}.$$
By the Weierstrass Approximation Theorem [Rudin 1976, ] there exists a polynomial, and therefore analytic, control $\nu$ such that $\|\mu(t)-\nu(t)\|<\epsilon$ for every $t\in\mathbb{T}$. Note that $\nu$ is $C$-valued.

Now, by Theorem 6.1.10, a measurable control is well approximated, except on a set of arbitrarily small measure, by a continuous control. As in the preceding paragraph, a continuous control is uniformly approximated by an analytic control. Then the arguments used in the last paragraph of the proof of Theorem 6.1.10 can be repeated here to give the present theorem.
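The uniform polynomial approximation used in this proof can be made concrete with Bernstein polynomials, which give one standard constructive proof of the Weierstrass Approximation Theorem; whether the cited reference proceeds this way is not assumed here. The sketch works on $[0,1]$ (a general compact interval can be rescaled) with a hypothetical scalar control, and the name `bernstein` is an illustrative choice.

```python
from math import comb


def bernstein(f, n):
    """Degree-n Bernstein polynomial of f on [0, 1]."""
    nodes = [f(k / n) for k in range(n + 1)]

    def p(t):
        # A convex combination of the sampled values f(k/n).
        return sum(c * comb(n, k) * t**k * (1 - t)**(n - k)
                   for k, c in enumerate(nodes))

    return p


def f(t):
    """A continuous, non-differentiable control on [0, 1], for illustration."""
    return abs(t - 0.5)


grid = [i / 200 for i in range(201)]
p_10, p_100 = bernstein(f, 10), bernstein(f, 100)
err_10 = max(abs(f(t) - p_10(t)) for t in grid)
err_100 = max(abs(f(t) - p_100(t)) for t in grid)
```

Because each value of a Bernstein polynomial on $[0,1]$ is a convex combination of values of the original control, the polynomial control takes values in any convex set containing the image of the original control, and the sampled errors `err_10` and `err_100` decrease as the degree grows.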


The following corollary follows by the same sort of proof as Theorem 6.1.9.

6.1.13 Corollary (Approximation in norm of integrable controls by analytic controls) If $C\subset\mathbb{R}^m$ is open and convex, if $\mathbb{T}\subset\mathbb{R}$ is a bounded interval, and if $\mu\in L^1(\mathbb{T};C)$, then there exists a sequence $(\mu_j)_{j\in\mathbb{Z}_{>0}}$ of analytic $C$-valued controls on $\mathbb{T}$ such that
$$\lim_{j\to\infty}\int_{\mathbb{T}}\|\mu(t)-\mu_j(t)\|\,dt=0.$$

6.1.4 Subsets of admissible locally integrable controls The constructions of Section 6.1.2 can be repeated for controls taking values in a subset of Euclidean space. The only distinction is that the set of controls can now expanded to include locally integrable controls. 6.1.14 Denition (Admissible locally integrable controls) Let r Z0 {, } and let = (M, F, C, T) be a Cr -control system with C Rm a subset of Euclidean space. (i) A locally integrable control for is a map L1 loc (T ; C) where T T is a subinterval called the time-domain for and denoted by TD(). (ii) The set of locally integrable controls for with time-domain T is denoted by Cont1 (, T ) and the set of all controls is denoted by Cont1 (). (iii) A set of admissible controls is simply a subset U Cont1 (). Next we consider some constructions one can perform with controls. 6.1.15 Denition (Constructions using locally integrable controls) Let r Z0 {, } and let = (M, F, C, T) be a Cr -control system with C Rm a subset of Euclidean space. (i) If Cont1 (, T ) and if T T is a subinterval, we denote by |T Cont1 (; T ) the restriction of to T . (ii) Let T1 , T2 T be subintervals of a subinterval T T and let 1 Cont1 (, T1 ) and 2 Cont1 (, T2 ). The controls 1 and 2 is concatible if (a) sup T1 = inf T2 and (b) sup T1 T1 T2 , i.e., the boundary point for T1 and T2 is contained in at least one of the intervals. If 1 and 2 are concatible then their concatenation is the control 1 2 Cont1 (, T ) dened by 1 (t), t T1 {sup T1 }, 2 (t), t T2 {inf T2 }, 1 2 (t) = 1 (t), t = sup T1 , sup T1 T1 , 2 (t), t = inf T2 , inf T2 T2 , sup T2 T1 . (iii) Let Cont1 (, T ) and let T T be a subinterval. We consider three cases.


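(a) $\mathbb{T}' \setminus \mathbb{T}''$ is connected and $\inf \mathbb{T}' < \inf \mathbb{T}''$: In this case define the control $\mu' \in \mathrm{Cont}^1(\Sigma, \mathbb{T}' \setminus \mathbb{T}'')$ by $\mu'(t) = \mu(t)$.
(b) $\mathbb{T}' \setminus \mathbb{T}''$ is disconnected: In this case, $\mathbb{T}''$ is bounded; let us denote $a = \inf \mathbb{T}''$ and $b = \sup \mathbb{T}''$. We then define $\mathbb{T}''' = (\mathbb{T}' \cap (-\infty, a]) \cup \{t - (b - a) \mid t \in \mathbb{T}' \cap (b, \infty)\}$ and $\mu' \in \mathrm{Cont}^1(\Sigma, \mathbb{T}''')$ by
$$\mu'(t) = \begin{cases} \mu(t), & t \in \mathbb{T}''' \cap (-\infty, a], \\ \mu(t + (b - a)), & t \in \mathbb{T}''' \cap (a, \infty). \end{cases}$$
The control $\mu'$ is the $\mathbb{T}''$-deletion of $\mu$.
(iv) Let $\mu \in \mathrm{Cont}^1(\Sigma, \mathbb{T}')$, let $\mathbb{T}'' \subseteq \mathbb{T}'$ be a subinterval, and let $\mu' \in \mathrm{Cont}^1(\Sigma, \mathbb{T}'')$. The control $\mu'' \in \mathrm{Cont}^1(\Sigma, \mathbb{T}')$ defined by
$$\mu''(t) = \begin{cases} \mu(t), & t \in \mathbb{T}' \setminus \mathbb{T}'', \\ \mu'(t), & t \in \mathbb{T}'' \end{cases}$$
is the $\mu'$-substitution of $\mu$.

We recall from the discussion following Definition 6.1.5 that there is a different notion of concatenation that we will sometimes use.

Finally, let us give some properties of subsets of admissible controls that will be useful.

6.1.16 Definition (Properties of subsets of controls) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$ and let $\Sigma = (M, F, C, \mathbb{T})$ be a $C^r$-control system with $C \subseteq \mathbb{R}^m$ a subset of Euclidean space. Let $\mathscr{U} \subseteq \mathrm{Cont}^1(\Sigma)$ be a set of admissible controls.
(i) The set $\mathscr{U}$ is closed under restriction if, for every $\mu \in \mathscr{U}$ and every subinterval $\mathbb{T}' \subseteq \mathrm{TD}(\mu)$, $\mu|\mathbb{T}' \in \mathscr{U}$.
(ii) The set $\mathscr{U}$ is closed under concatenation if, for all concatible elements $\mu_1$ and $\mu_2$ of $\mathscr{U}$, $\mu_1 * \mu_2 \in \mathscr{U}$.
(iii) The set $\mathscr{U}$ is closed under interval deletion if, for every $\mu \in \mathscr{U}$ and every subinterval $\mathbb{T}' \subseteq \mathrm{TD}(\mu)$, the $\mathbb{T}'$-deletion of $\mu$ is in $\mathscr{U}$.
(iv) The set $\mathscr{U}$ is closed under substitution if, for every $\mu, \mu' \in \mathscr{U}$ satisfying $\mathrm{TD}(\mu') \subseteq \mathrm{TD}(\mu)$, the $\mu'$-substitution of $\mu$ is in $\mathscr{U}$.

Note that when controls take values in Euclidean space, $C \subseteq \mathbb{R}^m$, there is a source of genuine confusion in terms of terminology. Specifically, the constructions and notation of Section 6.1.2 and those of this section apply equally well. A resolution to this potential confusion is not of enormous importance. However, we will adopt a rule about how the terminology is used, in order to avoid confusion. The rule is based on the type of system we are considering, rather than just the type of control we are considering. In Section 6.3 we will consider what we call a control system. In Section 6.4 we will consider a particular sort of control system that we call a control-affine system. Our convention will be as follows.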


6.1.17 Convention for classes of control We will use the following implicit conventions, unless otherwise stated.
(i) If we are working with control systems, even with control sets $C$ that are subsets of Euclidean space, we shall always suppose controls to be chosen from $\mathrm{Cont}^\infty(\Sigma)$, cf. Example 6.3.5.
(ii) If we are working with control-affine systems, we shall always suppose controls to be chosen from $\mathrm{Cont}^1(\Sigma)$, cf. Proposition 6.4.4. We shall mention this again at the end of Section 6.4.2.
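The concatenation of Definition 6.1.15(ii) can be sketched in code. What follows is only a minimal illustration under stated assumptions, not part of the formal development: the names `Control`, `concatible`, and `concatenate` are hypothetical, controls are taken to be real-valued, and a control is modelled as a Python function together with the endpoints of its time-domain and flags recording whether each endpoint belongs to the interval.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Control:
    """A real-valued control on an interval (illustrative names only)."""
    lo: float
    hi: float
    lo_closed: bool
    hi_closed: bool
    value: Callable[[float], float]

    def contains(self, t: float) -> bool:
        """Whether t lies in the time-domain of the control."""
        above = t > self.lo or (self.lo_closed and t == self.lo)
        below = t < self.hi or (self.hi_closed and t == self.hi)
        return above and below


def concatible(mu1: Control, mu2: Control) -> bool:
    # (a) sup T1 = inf T2, and (b) the boundary point lies in T1 or in T2.
    return mu1.hi == mu2.lo and (mu1.hi_closed or mu2.lo_closed)


def concatenate(mu1: Control, mu2: Control) -> Control:
    if not concatible(mu1, mu2):
        raise ValueError("controls are not concatible")
    boundary = mu1.hi

    def value(t: float) -> float:
        if t < boundary:
            return mu1.value(t)
        if t > boundary:
            return mu2.value(t)
        # At the boundary point, mu1 is used when sup T1 belongs to T1.
        return mu1.value(t) if mu1.hi_closed else mu2.value(t)

    return Control(mu1.lo, mu2.hi, mu1.lo_closed, mu2.hi_closed, value)


mu1 = Control(0.0, 1.0, True, True, lambda t: 1.0)    # defined on [0, 1]
mu2 = Control(1.0, 2.0, True, False, lambda t: -1.0)  # defined on [1, 2)
mu = concatenate(mu1, mu2)
```

Here the two time-domains share the boundary point $t = 1$, and the concatenation takes the value of the first control there, as in the third case of the definition.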

6.2 Differential inclusion systems


A rather general class of model that we shall think about is what we call a differential inclusion system. In Section 4.4 we devoted some significant effort to understanding basic features of differential inclusions, so we will mostly give mere definitions. In subsequent sections we shall investigate features of differential inclusion systems.

6.2.1 Definition of differential inclusion system

The definition of a differential inclusion system is more or less obvious, if one recalls our definition of a differential inclusion from Definition 4.4.1.

6.2.1 Definition (Differential inclusion system) A differential inclusion system is a triple $\Sigma = (M, \mathscr{F}, \mathbb{T})$ where
(i) $M$ is a smooth manifold,
(ii) $\mathbb{T} \subseteq \mathbb{R}$ is an interval called the time-domain for the system, and
(iii) $\mathscr{F} : \mathbb{T} \times M \twoheadrightarrow TM$ is a differential inclusion.
A differential inclusion system $\Sigma = (M, \mathscr{F}, \mathbb{T})$ is time-independent if there exists a time-independent differential inclusion $\mathscr{F}' : M \twoheadrightarrow TM$ such that $\mathscr{F}(t, x) = \mathscr{F}'(x)$ for all $(t, x) \in \mathbb{T} \times M$. For time-independent differential inclusion systems, we will often write $\mathscr{F}$ in place of $\mathscr{F}'$ and also write $\Sigma = (M, \mathscr{F})$, omitting the redundant time-domain.

Note that we do not a priori ascribe any properties to the differential inclusion in a differential inclusion system. This differs from how we handle other sorts of systems, where we nail down precisely the properties of the system. For differential inclusion systems, there is simply too much variability in the sorts of properties one wishes for a differential inclusion to have. This can be seen, for example, in the results of Sections 4.4.6 and 4.4.7, where various sorts of hypotheses are needed for various sorts of conclusions. We will adopt the practice of giving as a prefix to "differential inclusion system" the desired properties of the differential inclusion. Thus, for example, if we wish to have the differential inclusion be upper semicontinuous, we shall say that $\Sigma$ is an upper semicontinuous differential inclusion system.


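6.2.2 Trajectories and reachable sets for differential inclusion systems

Of course, the condition to be satisfied by a trajectory for a differential inclusion system is $\xi'(t) \in \mathscr{F}(t, \xi(t))$, for a suitable curve $t \mapsto \xi(t) \in M$. It only remains to provide terminology and notation.

6.2.2 Definition (Trajectories for differential inclusion systems) Let $\Sigma = (M, \mathscr{F}, \mathbb{T})$ be a differential inclusion system.
(i) A trajectory for $\Sigma$ is, for a subinterval $\mathbb{T}' \subseteq \mathbb{T}$, a locally absolutely continuous curve $\xi : \mathbb{T}' \to M$ that is a trajectory for $\mathscr{F}$. The subinterval $\mathbb{T}'$ is the time-domain for $\xi$ that we denote by $\mathrm{TD}(\xi)$.
(ii) For a subinterval $\mathbb{T}' \subseteq \mathbb{T}$, we denote
$$\mathrm{Traj}(\Sigma, \mathbb{T}') = \{\xi \mid \xi \text{ is a trajectory for which } \mathrm{TD}(\xi) = \mathbb{T}'\}$$
and $\mathrm{Traj}(\Sigma) = \bigcup_{\mathbb{T}' \subseteq \mathbb{T}} \mathrm{Traj}(\Sigma, \mathbb{T}')$.

Associated with trajectories is the important notion of the reachable set.

6.2.3 Definition (Reachable sets for differential inclusion systems) Let $\Sigma = (M, \mathscr{F}, \mathbb{T})$ be a differential inclusion system and let $x_0 \in M$ and $t_0 \in \mathbb{T}$.
(i) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time $t$ is
$$\mathcal{R}_\Sigma(x_0, t_0, t) = \{\xi(t) \mid \xi \in \mathrm{Traj}(\Sigma),\ [t_0, t] \subseteq \mathrm{TD}(\xi),\ \xi(t_0) = x_0\}.$$
(ii) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time at most $t$ is
$$\mathcal{R}_\Sigma(x_0, t_0, \le t) = \bigcup_{\tau \in [t_0, t]} \mathcal{R}_\Sigma(x_0, t_0, \tau).$$
(iii) The reachable set from $(t_0, x_0)$ is
$$\mathcal{R}_\Sigma(x_0, t_0) = \bigcup_{t \in \mathbb{T} \cap [t_0, \infty)} \mathcal{R}_\Sigma(x_0, t_0, t).$$
Let $\mathscr{T} \subseteq \mathrm{Traj}(\Sigma)$ be a subset of trajectories for $\Sigma$.
(iv) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time $t$ with trajectories from $\mathscr{T}$ is
$$\mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{T}) = \{\xi(t) \mid \xi \in \mathscr{T},\ [t_0, t] \subseteq \mathrm{TD}(\xi),\ \xi(t_0) = x_0\}.$$
(v) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time at most $t$ with trajectories from $\mathscr{T}$ is
$$\mathcal{R}_\Sigma(x_0, t_0, \le t, \mathscr{T}) = \bigcup_{\tau \in [t_0, t]} \mathcal{R}_\Sigma(x_0, t_0, \tau, \mathscr{T}).$$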


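(vi) The reachable set from $(t_0, x_0)$ with trajectories from $\mathscr{T}$ is
$$\mathcal{R}_\Sigma(x_0, t_0, \mathscr{T}) = \bigcup_{t \in \mathbb{T} \cap [t_0, \infty)} \mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{T}).$$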

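The definitions above apply to a differential inclusion system that may depend explicitly on time. Of course, the definitions also apply to time-independent differential inclusion systems. However, for time-independent systems, one very often wishes to work with definitions for reachable sets that acknowledge the time-independence.

6.2.4 Definition (Reachable sets for time-independent differential inclusion systems) Let $\Sigma = (M, \mathscr{F})$ be a time-independent differential inclusion system and let $x_0 \in M$.
(i) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time $t$ is
$$\mathcal{R}_\Sigma(x_0, t) = \{\xi(t) \mid \xi \in \mathrm{Traj}(\Sigma),\ [0, t] \subseteq \mathrm{TD}(\xi),\ \xi(0) = x_0\}.$$
(ii) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time at most $t$ is
$$\mathcal{R}_\Sigma(x_0, \le t) = \bigcup_{\tau \in [0, t]} \mathcal{R}_\Sigma(x_0, \tau).$$
(iii) The reachable set from $x_0$ is
$$\mathcal{R}_\Sigma(x_0) = \bigcup_{t \in \mathbb{R}_{\ge 0}} \mathcal{R}_\Sigma(x_0, t).$$
Let $\mathscr{T} \subseteq \mathrm{Traj}(\Sigma)$ be a subset of trajectories for $\Sigma$.
(iv) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time $t$ with trajectories from $\mathscr{T}$ is
$$\mathcal{R}_\Sigma(x_0, t, \mathscr{T}) = \{\xi(t) \mid \xi \in \mathscr{T},\ [0, t] \subseteq \mathrm{TD}(\xi),\ \xi(0) = x_0\}.$$
(v) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time at most $t$ with trajectories from $\mathscr{T}$ is
$$\mathcal{R}_\Sigma(x_0, \le t, \mathscr{T}) = \bigcup_{\tau \in [0, t]} \mathcal{R}_\Sigma(x_0, \tau, \mathscr{T}).$$
(vi) The reachable set from $x_0$ with trajectories from $\mathscr{T}$ is
$$\mathcal{R}_\Sigma(x_0, \mathscr{T}) = \bigcup_{t \in \mathbb{R}_{\ge 0}} \mathcal{R}_\Sigma(x_0, t, \mathscr{T}).$$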

There could possibly be some confusion arising from our use of two different notions of reachable set. In practice, however, this confusion will not arise, since we will never use the two notions simultaneously. Moreover, it should be clear from context which notion is used in any situation.
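As a concrete illustration of the time-independent reachable sets defined above, consider the differential inclusion on $M = \mathbb{R}$ whose value at every point is the interval $[-1, 1]$. This example is not taken from the text and is chosen only because its reachable sets are known in closed form: trajectories are exactly the curves with $|\xi'(s)| \le 1$ almost everywhere, so the reachable set from $x_0$ in time $t$ is $[x_0 - t, x_0 + t]$. The sketch below, in which the function name `reachable_samples` is hypothetical, samples only piecewise-constant velocities with values in $\{-1, 0, 1\}$, and so produces a finite subset of the reachable set.

```python
from itertools import product


def reachable_samples(x0: float, t: float, pieces: int = 4) -> set:
    """Endpoints xi(t) of curves with xi(0) = x0 whose velocity is piecewise
    constant with values in {-1, 0, 1}; each such curve is a trajectory of
    the inclusion xi'(s) in [-1, 1]."""
    dt = t / pieces
    endpoints = set()
    for velocities in product((-1.0, 0.0, 1.0), repeat=pieces):
        # Integration is exact because the velocity is constant on each piece.
        endpoints.add(x0 + dt * sum(velocities))
    return endpoints


samples = reachable_samples(x0=0.0, t=2.0)
print(min(samples), max(samples))
```

The extreme samples are $x_0 - t$ and $x_0 + t$, attained by the constant velocities $-1$ and $+1$; every other sample lies between them, consistent with the closed-form description.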


6.3 Systems depending continuously on control


The models we consider in this section are the $\dot{x} = F(x, u)$ sorts of models that are commonly encountered in the nonlinear and geometric control literature. This is a quite general class of system, particularly if one allows the set in which $u$ lives to be sufficiently rich. For example, the control-affine systems considered in Section 6.4 certainly fall into this class.

6.3.1 Definition of control system

In order to state our definition, it is convenient to introduce some notation. Let $S$ be a topological space, let $M$ be a manifold, and let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$. For a map $f : S \times M \to \mathbb{R}$ and for $s \in S$ define $f_s : M \to \mathbb{R}$ by $f_s(x) = f(s, x)$. If $r \in \mathbb{Z}_{\ge 0}$, a map $f : S \times M \to \mathbb{R}$ is of parameterised class $C^r$ if, for every $k \in \{0, 1, \dots, r\}$ and for every $X_1, \dots, X_k \in \Gamma^\infty(TM)$, the map $(s, x) \mapsto \mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k} f_s(x)$ is continuous. If $r = \infty$, a map $f : S \times M \to \mathbb{R}$ is of parameterised class $C^\infty$ if, for every $k \in \mathbb{Z}_{\ge 0}$ and for every $X_1, \dots, X_k \in \Gamma^\infty(TM)$, the map $(s, x) \mapsto \mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k} f_s(x)$ is continuous.

Next let $X : S \times M \to TM$ satisfy $X(s, x) \in T_xM$ so that, if we define $X_s : M \to TM$ by $X_s(x) = X(s, x)$, then $X_s$ is a vector field. We say that $X$ is of parameterised class $C^r$ for $r \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$ if the map $(s, x) \mapsto \mathscr{L}_{X_s} f(x)$ is of parameterised class $C^r$ for every $f \in C^\infty(M)$.

Let us characterise functions and vector fields of parameterised class $C^r$.

6.3.1 Proposition (Functions and vector fields of parameterised class $C^r$) Let $S$ be a topological space, let $M$ be a smooth manifold, and let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$. Then the following statements hold:
(i) a map $f : S \times M \to \mathbb{R}$ is of parameterised class $C^r$ if and only if, for every coordinate chart $(\mathcal{U}, \phi)$ for $M$, the local representative $(s, x) \mapsto f(s, \phi^{-1}(x))$ and the first $r$ derivatives of this local representative with respect to $x$ are continuous on $S \times \phi(\mathcal{U})$;
(ii) a map $X : S \times M \to TM$ satisfying $X(s, x) \in T_xM$, $(s, x) \in S \times M$, is of parameterised class $C^r$ if and only if, for every coordinate chart $(\mathcal{U}, \phi)$ for $M$, the local representative $(s, x) \mapsto T\phi \circ X(s, \phi^{-1}(x))$ and the first $r$ derivatives of this local representative with respect to $x$ are continuous on $S \times \phi(\mathcal{U})$.
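Proof (i) Suppose that $f : S \times M \to \mathbb{R}$ is of parameterised class $C^r$ and let $(\mathcal{U}, \phi)$ be a coordinate chart. Let $\bar{X}_1, \dots, \bar{X}_n \in \Gamma^\infty(T(\phi(\mathcal{U})))$ be the standard vector fields on $\phi(\mathcal{U})$. Let $x_0 \in \phi(\mathcal{U})$ and let $\epsilon \in \mathbb{R}_{>0}$ be such that $\bar{B}^n(\epsilon, x_0) \subseteq \phi(\mathcal{U})$. By Lemma 1 from the proof of Proposition 3.1.3, there exist vector fields $X_1, \dots, X_n \in \Gamma^\infty(TM)$ such that $\phi_* X_j | B^n(\epsilon, x_0) = \bar{X}_j | B^n(\epsilon, x_0)$, $j \in \{1, \dots, n\}$. Then, for $j_1, \dots, j_k \in \{1, \dots, n\}$, with $k \in \mathbb{Z}_{\ge 0}$, $k \le r$, we have
$$\mathscr{L}_{X_{j_1}} \cdots \mathscr{L}_{X_{j_k}} f_s \circ \phi^{-1}(x) = \frac{\partial^k (f_s \circ \phi^{-1})}{\partial x^{j_1} \cdots \partial x^{j_k}}(x), \qquad x \in B^n(\epsilon, x_0).$$
Thus the partial derivative on the right is continuous as a function of $(s, x)$ at $(s, x_0)$, and this gives the desired implication.

For the converse, let $f : S \times M \to \mathbb{R}$ be such that the local representative $(s, x) \mapsto f(s, \phi^{-1}(x))$ and its first $r$ derivatives are continuous for every chart. Let $k \in \{0, 1, \dots, r\}$ and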


let $X_1, \dots, X_k \in \Gamma^\infty(TM)$. The local representative of $\mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k} f_s$ is a linear combination of terms where each term is linear in the local representative of $f_s$ and its first $k$ derivatives, and the coefficient involves derivatives of the vector fields $X_1, \dots, X_k$. It follows that the local representative of $(s, x) \mapsto \mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k} f_s(x)$ is continuous, and this gives this part of the result.

(ii) Suppose that $X$ is of parameterised class $C^r$. Let $(\mathcal{U}, \phi)$ be a chart for $M$ with coordinates $(x^1, \dots, x^n)$. Let $(s, x) \mapsto X^j(s, x)$, $j \in \{1, \dots, n\}$, be the components of the local representative of $X$. Let $x_0 \in \phi(\mathcal{U})$ and let $\epsilon \in \mathbb{R}_{>0}$ be such that $\bar{B}^n(\epsilon, x_0) \subseteq \phi(\mathcal{U})$. Let $\phi^j$, $j \in \{1, \dots, n\}$, be the coordinate functions and, by Lemma 1 from the proof of Proposition 3.1.3, let $f^1, \dots, f^n \in C^\infty(M)$ be functions agreeing with $\phi^1, \dots, \phi^n$ on $\phi^{-1}(K)$, where $K = \bar{B}^n(\epsilon, x_0)$. Note that the local representative of $(s, x) \mapsto \mathscr{L}_{X_s} f^j(x)$ is $(s, x) \mapsto X^j(s, x)$ for $j \in \{1, \dots, n\}$ and $x \in B^n(\epsilon, x_0)$. From the first part of the proof, we conclude that the local representative and its first $r$ derivatives are continuous at $(s, x_0)$.

For the converse, let $X : S \times M \to TM$ be such that $X(s, x) \in T_xM$ for $(s, x) \in S \times M$, and suppose that the local representative $(s, x) \mapsto T\phi \circ X(s, \phi^{-1}(x))$ and its first $r$ derivatives are continuous for all charts. Then, for $f \in C^\infty(M)$, the local representative of $(s, x) \mapsto \mathscr{L}_{X_s} f$ is
$$(s, x) \mapsto X_s^j(x) \frac{\partial (f \circ \phi^{-1})}{\partial x^j}(x),$$
with summation over $j$ implied. If $k \in \mathbb{Z}_{\ge 0}$ satisfies $k \le r$ and if $X_1, \dots, X_k \in \Gamma^\infty(TM)$, then one can use the same argument as in the second half of the proof of part (i) to show that the local representative of $(s, x) \mapsto \mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k} \mathscr{L}_{X_s} f(x)$ and its first $r$ derivatives are continuous. Thus, from the first part of the proof, $X$ is of parameterised class $C^r$.

With the preceding terminology, we have the following definition of a control system.

6.3.2 Definition (Control system) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$. A $C^r$-control system is a 4-tuple $\Sigma = (M, F, C, \mathbb{T})$, where
(i) $M$ is a smooth or analytic manifold, as is required, whose elements are called states,
(ii) $C$ is a separable metric space called the control set,
(iii) $\mathbb{T} \subseteq \mathbb{R}$ is an interval called the time-domain for the system, and
(iv) $F : \mathbb{T} \times M \times C \to TM$ is obtained as follows: there exist a separable metric space $S$, a measurable locally essentially bounded map $\gamma : \mathbb{T} \to S$, and a map $\hat{F} : S \times M \times C \to TM$ with the following properties:
(a) $\hat{F}(s, x, u) \in T_xM$ for each $(s, x, u) \in S \times M \times C$;
(b) the map $S \times C \times M \ni (s, u, x) \mapsto \hat{F}(s, x, u) \in TM$ is of parameterised class $C^r$;
(c) $F(t, x, u) = \hat{F}(\gamma(t), x, u)$ for every $(t, x, u) \in \mathbb{T} \times M \times C$.
A $C^r$-control system $\Sigma = (M, F, C, \mathbb{T})$ is time-independent if there exists a map $F' : M \times C \to TM$ such that $F(t, x, u) = F'(x, u)$ for all $(t, x, u) \in \mathbb{T} \times M \times C$. For time-independent control systems, we will often write $F$ in place of $F'$ and also write $\Sigma = (M, F, C)$, omitting the redundant time-domain, which we shall assume to be $\mathbb{T} = \mathbb{R}$.


The peculiar way in which time enters explicitly into the system, through the map $\hat{F}$ and the metric space $S$, warrants explanation. First of all, note that one can take $S = \mathbb{R}$ and let $\gamma : \mathbb{T} \to \mathbb{R}$ be measurable and locally essentially bounded, and in this way obtain the explicit dependence of $F$ on time in the usual way. However, there is another situation we wish to include in our setup. In Chapter 7 we shall linearise systems about fixed trajectories as a means of studying local behaviour around the trajectory. In this case, even though the system itself may not have explicit dependence on time, such explicit time dependence will enter through the trajectory about which one is linearising. Moreover, as we shall see in Chapter 7, the time dependence enters in exactly the way we have prescribed in the definition.

A control system $\Sigma = (M, F, C, \mathbb{T})$ gives rise to a differential inclusion system $\Sigma_F = (M, \mathscr{F}_F, \mathbb{T})$ in the obvious way:
$$\mathscr{F}_F(t, x) = \{F(t, x, u) \mid u \in C\}.$$
The interesting thing is that, given the conditions we have for control systems, the differential inclusion system has some structure. The following result records this.

6.3.3 Proposition (Differential inclusion systems coming from control systems) If $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$ and if $\Sigma = (M, F, C, \mathbb{T})$ is a $C^r$-control system, then the differential inclusion $\mathscr{F}_F : \mathbb{T} \times M \twoheadrightarrow TM$ defined by $\mathscr{F}_F(t, x) = \{F(t, x, u) \mid u \in C\}$ is measurable/locally integrally Lipschitz.

6.3.2 Trajectories and reachable sets for control systems

The governing equations for a control system are
$$\xi'(t) = F(t, \xi(t), \mu(t)),$$
for suitable functions $t \mapsto \mu(t) \in C$ and $t \mapsto \xi(t) \in M$. To ensure that these equations make sense, the differential equation should be shown to have the properties needed for existence and uniqueness of solutions. We do this by allowing the controls for the system to be as general as reasonable.

6.3.4 Proposition (Property of control system when the control is specified) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$ and let $\Sigma = (M, F, C, \mathbb{T})$ be a $C^r$-control system. If $\mu \in L^\infty_{\mathrm{loc}}(\mathbb{T}; C)$ then $F^\mu \in \mathrm{LI}\Gamma^r(\mathbb{T}; TM)$, where $F^\mu : \mathbb{T} \times M \to TM$ is defined by $F^\mu(t, x) = F(t, x, \mu(t))$.
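Proof Let $S$, $\gamma : \mathbb{T} \to S$, and $\hat{F} : S \times M \times C \to TM$ be such that $F(t, x, u) = \hat{F}(\gamma(t), x, u)$. Let us abbreviate $G(t, x) = F(t, x, \mu(t))$. Let us first show that $G$ is a Carathéodory vector field of class $C^r$. Let $f \in C^\infty(M)$. Clearly, for each $t \in \mathbb{T}$, the function $\mathscr{L}_{G_t} f$ is of class $C^r$. Now fix $x \in M$ and define $\hat{G}^x : S \times C \to T_xM$ by $\hat{G}^x(s, u) = \hat{F}(s, x, u)$ and $G_x : \mathbb{T} \to T_xM$ by $G_x(t) = G(t, x)$. Let us define $\gamma \times \mu : \mathbb{T} \to S \times C$ by $\gamma \times \mu(t) = (\gamma(t), \mu(t))$. Since $\gamma$ and $\mu$ are measurable, for open sets $\mathcal{T} \subseteq S$ and $\mathcal{D} \subseteq C$, $\gamma^{-1}(\mathcal{T})$ and $\mu^{-1}(\mathcal{D})$ are Lebesgue measurable. Therefore,
$$(\gamma \times \mu)^{-1}(\mathcal{T} \times \mathcal{D}) = \{t \in \mathbb{T} \mid \gamma(t) \in \mathcal{T},\ \mu(t) \in \mathcal{D}\} = \gamma^{-1}(\mathcal{T}) \cap \mu^{-1}(\mathcal{D}),$$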


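showing that $(\gamma \times \mu)^{-1}(\mathcal{T} \times \mathcal{D})$ is Lebesgue measurable. Now let $\mathcal{V} \subseteq S \times C$ be an arbitrary open set. Since $S$ and $C$ are separable and since open sets of $S \times C$ of the form $\mathcal{T} \times \mathcal{D}$ generate the product topology, there exists a countable family $(\mathcal{T}_j \times \mathcal{D}_j)_{j \in \mathbb{Z}_{>0}}$ of products of open sets such that $\mathcal{V} = \bigcup_{j \in \mathbb{Z}_{>0}} \mathcal{T}_j \times \mathcal{D}_j$. Then
$$(\gamma \times \mu)^{-1}(\mathcal{V}) = \bigcup_{j \in \mathbb{Z}_{>0}} (\gamma \times \mu)^{-1}(\mathcal{T}_j \times \mathcal{D}_j),$$
and since countable unions of Lebesgue measurable sets are Lebesgue measurable, it follows that $(\gamma \times \mu)^{-1}(\mathcal{V})$ is measurable.

Let $U \subseteq \mathbb{R}$ be open. Since $df(x)$ is continuous, $df(x)^{-1}(U) \subseteq T_xM$ is open. Since $\hat{G}^x$ is continuous, $(\hat{G}^x)^{-1}(df(x)^{-1}(U)) \subseteq S \times C$ is open. Therefore, $(\gamma \times \mu)^{-1}((\hat{G}^x)^{-1}(df(x)^{-1}(U)))$ is Lebesgue measurable, and so the function
$$t \mapsto df(x) \circ \hat{G}^x \circ (\gamma \times \mu)(t) = df(x) \circ G_x(t)$$
is measurable. This shows that $G$ is indeed a Carathéodory vector field.

Now let $f \in C^\infty(M)$. To show that $G \in \mathrm{LI}\Gamma^r(\mathbb{T}; TM)$ we must show that the function $(t, x) \mapsto \mathscr{L}_{G_t} f(x)$ is in $\mathrm{LIC}^r(\mathbb{T}; M)$. So let $K \subseteq M$ be compact, let $k \in \mathbb{Z}_{\ge 0}$ satisfy $k \le r$, and let $X_1, \dots, X_k \in \Gamma^\infty(TM)$. For $(s, u) \in S \times C$, denote $\hat{G}_{(s,u)} : M \to TM$ by $\hat{G}_{(s,u)}(x) = \hat{F}(s, x, u)$. Since $\hat{F}$ is of parameterised class $C^r$, the map
$$(s, u, x) \mapsto \mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{\hat{G}_{(s,u)}} f)(x) \qquad (6.1)$$
is continuous. Now fix $t_0 \in \mathbb{T}$ and for $t \in \mathbb{T}$ denote
$$|t_0, t| = \begin{cases} [t_0, t], & t \ge t_0, \\ [t, t_0], & t < t_0. \end{cases}$$
Since $\gamma$ and $\mu$ are locally essentially bounded, for $t \in \mathbb{T}$ there exist compact sets $K_\gamma(t) \subseteq S$ and $K_\mu(t) \subseteq C$ such that $\gamma(\tau) \in K_\gamma(t)$ for almost every $\tau \in |t_0, t|$ and $\mu(\tau) \in K_\mu(t)$ for almost every $\tau \in |t_0, t|$. By continuity of the map (6.1), there exists a compact subset $A(t) \subseteq \mathbb{R}$ such that $\mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{\hat{G}_{(s,u)}} f)(x) \in A(t)$ for $(s, x, u) \in K_\gamma(t) \times K \times K_\mu(t)$. But this implies that
$$\mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{G_\tau} f)(x) \in A(t), \qquad x \in K,\ \text{a.e. } \tau \in |t_0, t|. \qquad (6.2)$$
Now define $g : \mathbb{T} \to \mathbb{R}_{\ge 0}$ by
$$g(t) = \operatorname{ess\,sup}\{|\mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{G_\tau} f)(x)| \mid x \in K,\ \tau \in |t_0, t|\}.$$
We claim that $g \in L^1_{\mathrm{loc}}(\mathbb{T}; \mathbb{R}_{\ge 0})$. Since $g$ is locally essentially bounded by virtue of (6.2), our claim will follow if we can show that $g$ is measurable. To prove this, we note that on the subintervals
$$\mathbb{T}_- = \{t \in \mathbb{T} \mid t < t_0\}, \qquad \mathbb{T}_+ = \{t \in \mathbb{T} \mid t \ge t_0\}$$
of $\mathbb{T}$, $g$ is nonincreasing and nondecreasing, respectively. Thus, on every compact subinterval of $\mathbb{T}$, $g$ is of bounded variation [Cohn 1980, Proposition 4.4.2]. A function of locally bounded variation is almost everywhere continuous [Taylor 1965, Theorem 9-1 I], and so locally Riemann integrable [Cohn 1980, Theorem 2.5.1], and so measurable. Finally, we can obviously modify $g$ on a set of measure zero so that
$$|\mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{G_t} f)(x)| \le g(t), \qquad t \in \mathbb{T},\ x \in K.$$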


Note that this also preserves the measurability of $g$ [Hewitt and Stromberg 1975, Theorem 11.23]. This proves that $(t, x) \mapsto \mathscr{L}_{G_t} f(x)$ is in $\mathrm{LIC}^r(\mathbb{T}; M)$, as desired.

One might wonder, in cases where the control set is a subset of Euclidean space, whether the class of controls can be enlarged to include locally integrable controls, while still maintaining the validity of the preceding result. The next elementary example shows that this is generally not the case.

6.3.5 Example (Locally integrable controls for control systems) We consider the time-independent $C^\infty$-control system $\Sigma = (M, F, C)$ with $M = \mathbb{R}$, $C = \mathbb{R}$, and $F(x, u) = u^2 \frac{\partial}{\partial x}$. Note that the control $\mu : \mathbb{R}_{\ge 0} \to C$ defined by
$$\mu(t) = \begin{cases} t^{-1/2}, & t \in \mathbb{R}_{>0}, \\ 0, & t = 0 \end{cases}$$
is locally integrable. However, the vector field $F^\mu(t, x) = t^{-1} \frac{\partial}{\partial x}$ is not in $\mathrm{LI}\Gamma^\infty(\mathbb{R}_{\ge 0}; TM)$, since $t \mapsto t^{-1}$ is not integrable on any interval $[0, \epsilon]$. The problem is that the control appears nonlinearly in $F$, allowing the possibility of local integrability of the control to be lost on substitution into $F$. This suggests that, as long as the control appears linearly, one should be able to use locally integrable controls. This is indeed true. Systems where the control appears linearly are called control-affine systems, which we will discuss in detail in Section 6.4. In Proposition 6.4.4 we will see that, for these systems, it is valid to allow locally integrable controls.

As concerns trajectories, we now have the following result.

6.3.6 Corollary (Existence and uniqueness of trajectories for control systems) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, let $\Sigma = (M, F, C, \mathbb{T})$ be a $C^r$-control system, let $\mu \in L^\infty_{\mathrm{loc}}(\mathbb{T}; C)$, and let $(t_0, x_0) \in \mathbb{T} \times M$. Then the following statements hold:
(i) if $r \ge 0$ then there exist a subinterval $\mathbb{T}' \subseteq \mathbb{T}$, relatively open in $\mathbb{T}$ and with $t_0 \in \mathrm{int}_{\mathbb{T}}(\mathbb{T}')$, and an integral curve $\xi : \mathbb{T}' \to M$ of $F^\mu$ such that $\xi(t_0) = x_0$;
(ii) if $r \ge 1$ then, additionally, if $\mathbb{T}''$ is another interval and if $\eta : \mathbb{T}'' \to M$ is another integral curve of $F^\mu$ as in part (i), then $\eta(t) = \xi(t)$ for all $t \in \mathbb{T}'' \cap \mathbb{T}'$.
Proof This follows from Theorems 3.3.2 and 3.3.4, along with Proposition 3.1.12.
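Returning to Example 6.3.5, the loss of integrability on substitution can be checked by direct computation. The sketch below is only an illustration: the function names are hypothetical, and the integrals over $[\epsilon, 1]$ are evaluated from the elementary antiderivatives $2\sqrt{t}$ of $t^{-1/2}$ and $\log t$ of $t^{-1}$ rather than by numerical quadrature.

```python
import math


def integral_mu(eps: float) -> float:
    """Integral of mu(t) = t**(-1/2) over [eps, 1], in closed form."""
    return 2.0 - 2.0 * math.sqrt(eps)


def integral_mu_squared(eps: float) -> float:
    """Integral of mu(t)**2 = 1/t over [eps, 1], in closed form."""
    return -math.log(eps)


for eps in (1e-2, 1e-4, 1e-8):
    # The first integral stays below 2; the second grows without bound.
    print(eps, integral_mu(eps), integral_mu_squared(eps))
```

As $\epsilon \to 0$ the first integral converges to 2, so the control is integrable near zero, while the second grows like $-\log \epsilon$, which is the failure of local integrability for the coefficient $t^{-1}$ of $F^\mu$.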

Since we know that trajectories exist, we can assign some notation to them. We recall from Definition 6.1.4 the formal definition and notation for controls.

6.3.7 Definition (Trajectories for control systems) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$ and let $\Sigma = (M, F, C, \mathbb{T})$ be a $C^r$-control system.
(i) A controlled trajectory for $\Sigma$ is a pair $(\xi, \mu)$ where, for a subinterval $\mathbb{T}' \subseteq \mathbb{T}$, $\mu \in \mathrm{Cont}^\infty(\Sigma, \mathbb{T}')$ and $\xi : \mathbb{T}' \to M$ satisfy $\xi'(t) = F(t, \xi(t), \mu(t))$ for almost every $t \in \mathbb{T}'$. The subinterval $\mathbb{T}'$ is the time-domain for $(\xi, \mu)$ that we denote by $\mathrm{TD}(\xi, \mu)$.


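(ii) A trajectory for $\Sigma$ is, for a subinterval $\mathbb{T}' \subseteq \mathbb{T}$, a curve $\xi : \mathbb{T}' \to M$ for which $(\xi, \mu)$ is a controlled trajectory for some $\mu \in \mathrm{Cont}^\infty(\Sigma, \mathbb{T}')$. The subinterval $\mathbb{T}'$ is the time-domain for $\xi$ that we denote by $\mathrm{TD}(\xi)$.
(iii) For a subinterval $\mathbb{T}' \subseteq \mathbb{T}$, we denote
$$\mathrm{CTraj}(\Sigma, \mathbb{T}') = \{(\xi, \mu) \mid (\xi, \mu) \text{ is a controlled trajectory for which } \mathrm{TD}(\xi, \mu) = \mathbb{T}'\}$$
and $\mathrm{CTraj}(\Sigma) = \bigcup_{\mathbb{T}' \subseteq \mathbb{T}} \mathrm{CTraj}(\Sigma, \mathbb{T}')$.
(iv) For a subinterval $\mathbb{T}' \subseteq \mathbb{T}$, we denote
$$\mathrm{Traj}(\Sigma, \mathbb{T}') = \{\xi \mid \xi \text{ is a trajectory for which } \mathrm{TD}(\xi) = \mathbb{T}'\}$$
and $\mathrm{Traj}(\Sigma) = \bigcup_{\mathbb{T}' \subseteq \mathbb{T}} \mathrm{Traj}(\Sigma, \mathbb{T}')$.
(v) If $\mathscr{U} \subseteq \mathrm{Cont}^\infty(\Sigma)$ is a subset of locally essentially bounded controls, then we denote
$$\mathrm{CTraj}(\Sigma, \mathbb{T}', \mathscr{U}) = \{(\xi, \mu) \in \mathrm{CTraj}(\Sigma, \mathbb{T}') \mid \mu \in \mathscr{U}\},$$
$$\mathrm{CTraj}(\Sigma, \mathscr{U}) = \{(\xi, \mu) \in \mathrm{CTraj}(\Sigma) \mid \mu \in \mathscr{U}\},$$
$$\mathrm{Traj}(\Sigma, \mathbb{T}', \mathscr{U}) = \{\xi \in \mathrm{Traj}(\Sigma, \mathbb{T}') \mid \text{there exists } \mu \in \mathscr{U} \text{ such that } (\xi, \mu) \in \mathrm{CTraj}(\Sigma, \mathbb{T}')\},$$
$$\mathrm{Traj}(\Sigma, \mathscr{U}) = \{\xi \in \mathrm{Traj}(\Sigma) \mid \text{there exists } \mu \in \mathscr{U} \text{ such that } (\xi, \mu) \in \mathrm{CTraj}(\Sigma)\}.$$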

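We now turn our attention to providing the definitions for reachable sets for control systems.

6.3.8 Definition (Reachable sets for control systems) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$. Let $\Sigma = (M, F, C, \mathbb{T})$ be a $C^r$-control system and let $t_0 \in \mathbb{T}$ and $x_0 \in M$.
(i) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time $t$ is
$$\mathcal{R}_\Sigma(x_0, t_0, t) = \{\xi(t) \mid \xi \in \mathrm{Traj}(\Sigma),\ [t_0, t] \subseteq \mathrm{TD}(\xi),\ \xi(t_0) = x_0\}.$$
(ii) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time at most $t$ is
$$\mathcal{R}_\Sigma(x_0, t_0, \le t) = \bigcup_{\tau \in [t_0, t]} \mathcal{R}_\Sigma(x_0, t_0, \tau).$$
(iii) The reachable set from $(t_0, x_0)$ is
$$\mathcal{R}_\Sigma(x_0, t_0) = \bigcup_{t \in \mathbb{T} \cap [t_0, \infty)} \mathcal{R}_\Sigma(x_0, t_0, t).$$
Let $\mathscr{T} \subseteq \mathrm{Traj}(\Sigma)$ be a subset of trajectories for $\Sigma$.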


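(iv) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time $t$ with trajectories from $\mathscr{T}$ is
$$\mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{T}) = \{\xi(t) \mid \xi \in \mathscr{T},\ [t_0, t] \subseteq \mathrm{TD}(\xi),\ \xi(t_0) = x_0\}.$$
(v) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time at most $t$ with trajectories from $\mathscr{T}$ is
$$\mathcal{R}_\Sigma(x_0, t_0, \le t, \mathscr{T}) = \bigcup_{\tau \in [t_0, t]} \mathcal{R}_\Sigma(x_0, t_0, \tau, \mathscr{T}).$$
(vi) The reachable set from $(t_0, x_0)$ with trajectories from $\mathscr{T}$ is
$$\mathcal{R}_\Sigma(x_0, t_0, \mathscr{T}) = \bigcup_{t \in \mathbb{T} \cap [t_0, \infty)} \mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{T}).$$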

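If $\mathscr{T} = \mathrm{Traj}(\Sigma, \mathscr{U})$ for some set $\mathscr{U} \subseteq \mathrm{Cont}^\infty(\Sigma)$ of admissible controls, then we will denote $\mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{T})$, $\mathcal{R}_\Sigma(x_0, t_0, \le t, \mathscr{T})$, and $\mathcal{R}_\Sigma(x_0, t_0, \mathscr{T})$ by $\mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{U})$, $\mathcal{R}_\Sigma(x_0, t_0, \le t, \mathscr{U})$, and $\mathcal{R}_\Sigma(x_0, t_0, \mathscr{U})$, respectively.

The preceding definitions apply to a control system that will generally depend explicitly on time. However, very often we will be considering time-independent control systems. For such systems the above definitions still apply. However, for the purposes of studying the reachable set in these cases, one normally considers all trajectories as starting at time zero. We will find it convenient to do this, and the following definitions formalise this.

6.3.9 Definition (Reachable sets for time-independent control systems) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$. Let $\Sigma = (M, F, C)$ be a time-independent $C^r$-control system and let $x_0 \in M$.
(i) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time $t$ is
$$\mathcal{R}_\Sigma(x_0, t) = \{\xi(t) \mid \xi \in \mathrm{Traj}(\Sigma),\ [0, t] \subseteq \mathrm{TD}(\xi),\ \xi(0) = x_0\}.$$
(ii) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time at most $t$ is
$$\mathcal{R}_\Sigma(x_0, \le t) = \bigcup_{\tau \in [0, t]} \mathcal{R}_\Sigma(x_0, \tau).$$
(iii) The reachable set from $x_0$ is
$$\mathcal{R}_\Sigma(x_0) = \bigcup_{t \in \mathbb{R}_{\ge 0}} \mathcal{R}_\Sigma(x_0, t).$$
Let $\mathscr{T} \subseteq \mathrm{Traj}(\Sigma)$ be a subset of trajectories for $\Sigma$.
(iv) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time $t$ with trajectories from $\mathscr{T}$ is
$$\mathcal{R}_\Sigma(x_0, t, \mathscr{T}) = \{\xi(t) \mid \xi \in \mathscr{T},\ [0, t] \subseteq \mathrm{TD}(\xi),\ \xi(0) = x_0\}.$$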


(v) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time at most $t$ with trajectories from $\mathscr{T}$ is
$$\mathcal{R}_\Sigma(x_0, \le t, \mathscr{T}) = \bigcup_{\tau \in [0, t]} \mathcal{R}_\Sigma(x_0, \tau, \mathscr{T}).$$
(vi) The reachable set from $x_0$ with trajectories from $\mathscr{T}$ is
$$\mathcal{R}_\Sigma(x_0, \mathscr{T}) = \bigcup_{t \in \mathbb{R}_{\ge 0}} \mathcal{R}_\Sigma(x_0, t, \mathscr{T}).$$
If $\mathscr{T} = \mathrm{Traj}(\Sigma, \mathscr{U})$ for some set $\mathscr{U} \subseteq \mathrm{Cont}^\infty(\Sigma)$ of admissible controls, then we will denote $\mathcal{R}_\Sigma(x_0, t, \mathscr{T})$, $\mathcal{R}_\Sigma(x_0, \le t, \mathscr{T})$, and $\mathcal{R}_\Sigma(x_0, \mathscr{T})$ by $\mathcal{R}_\Sigma(x_0, t, \mathscr{U})$, $\mathcal{R}_\Sigma(x_0, \le t, \mathscr{U})$, and $\mathcal{R}_\Sigma(x_0, \mathscr{U})$, respectively.

The two possible notions for reachable sets will not be confusing. We will never use them simultaneously, and it will always be clear from context whether we are using the time-dependent or the time-independent version.

6.3.3 The fibred manifold picture for a control system

This short section will be of a somewhat expository, rather than precise, nature. To keep things simple, and since what we are saying here will be used nowhere else in the book, we consider only time-independent systems.

Brockett [1977] proposed a generalisation of the $\dot{x} = F(x, u)$ model we are considering in this section. His generalisation is as follows. Let $\pi : C \to M$ be a fibred manifold over $M$. Thus $C$ is a manifold, and $\pi$ is a surjective submersion. One should think of $\pi^{-1}(x)$ for $x \in M$ as the controls one can apply at $x$. With this setup for modelling controls, a control system is then a map $F : C \to TM$ such that $F(u) \in T_{\pi(u)}M$. A trajectory for such a system is then a pair $(\xi, \mu)$ where $t \mapsto \xi(t) \in M$ and $t \mapsto \mu(t) \in C$ are such that $\pi(\mu(t)) = \xi(t)$. Thus, given a trajectory defined on an interval $\mathbb{T}$, we have a commutative diagram:

[Commutative diagram relating the maps $\mu : \mathbb{T} \to C$, $\pi : C \to M$, $\xi : \mathbb{T} \to M$, $F : C \to TM$, $\xi' : \mathbb{T} \to TM$, and $\pi_{TM} : TM \to M$, with $\pi \circ \mu = \xi$ and $F \circ \mu = \xi'$.]

Brockett argues that this is a useful generalisation of what we call a control system in some cases. The rationale is the following. Let us consider the problem of steering on the sphere $S^2$. Locally, since the sphere is two-dimensional, we can do this by following integral curves of two linearly independent vector fields:
$$\xi'(t) = \mu^1(t) F_1(\xi(t)) + \mu^2(t) F_2(\xi(t)).$$


Thus $F_1$ and $F_2$ are vector fields on $S^2$ and the controls $\mu^1$ and $\mu^2$ are $\mathbb{R}$-valued. Thus, at least locally, our problem can be attacked using a control system with $M = S^2$, $C = \mathbb{R}^2$, and $F(x, (u^1, u^2)) = u^1 F_1(x) + u^2 F_2(x)$. However, the Hairy Ball Theorem guarantees that each of the vector fields $F_1$ and $F_2$ vanishes at at least one point in $S^2$. Thus these vector fields are never linearly independent everywhere, and so the problem of steering on $S^2$ can never be formulated as a problem using two controls, despite the fact that $S^2$ is two-dimensional. So, argues Brockett, why not take the controls applied at $x \in S^2$ as coming from $\pi_{TM}^{-1}(x)$, which at least has the right dimension. That is to say, in the fibred manifold picture above, we take $C = TS^2$ and define $F(u) = u$.

The main drawback of this sort of formulation is the restrictions placed on the problem by asking that $C$ be a fibred manifold. This requires far more smoothness than is often present in a typical control problem. For instance, if the controls have bounds, then the control set $\pi^{-1}(x)$ is not a manifold. Thus this approach is only useful in those cases where controls take values in a manifold, and this is a stringent requirement, and in fact makes the formulation essentially uninteresting from the point of view of control theory. One could, in principle, generalise from fibred manifolds to some more general object like fibred topological spaces. However, this is a stretch, particularly until an interesting problem in geometric (or any other kind of) control theory is solved using this sort of approach. The author is unaware of any such solved problem.

Note, however, that the steering problem on $S^2$ can always be formulated using more than two vector fields, say
$$\xi'(t) = \mu^1(t) F_1(\xi(t)) + \mu^2(t) F_2(\xi(t)) + \mu^3(t) F_3(\xi(t)).$$
Thus we take $C = \mathbb{R}^3$ and $F(x, (u^1, u^2, u^3)) = u^1 F_1(x) + u^2 F_2(x) + u^3 F_3(x)$. This manner of addressing the problem is not without its drawbacks. The main problem is that there is not a nice correspondence between controls and trajectories. That is to say, there may well be controls $t \mapsto \mu_1(t)$ and $t \mapsto \mu_2(t)$, differing everywhere, but giving rise to the same trajectory $t \mapsto \xi(t)$. This is seldom a problem in applications, since in applications a control is something real, and so there is seldom any ambiguity about what control is intended to be applied at any moment to achieve the desired actuation. However, this lack of a unique correspondence between controls and trajectories is awkward mathematically.

One way, and perhaps the preferred way, to handle these sorts of situations is to use differential inclusions to model the systems. After all, what one is really interested in is trajectories, and differential inclusions offer a formulation that allows the definition of trajectories without the additional baggage of controls. However, it is also the case that having one's hands concretely on the controls is useful, and this renders the differential inclusion formulation problematic.
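The lack of a unique correspondence between controls and trajectories can be seen in a small computation. As an assumption made purely for illustration (the text does not specify the three vector fields), take the rotation fields $F_a(x) = e_a \times x$ on $S^2 \subseteq \mathbb{R}^3$, so that $u^1 F_1(x) + u^2 F_2(x) + u^3 F_3(x) = u \times x$. Then the control values $u$ and $u + c\,x$ give the same velocity at $x$ for every $c \in \mathbb{R}$. The function names below are hypothetical.

```python
def cross(a, b):
    """Cross product of two vectors in R^3, given as 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def velocity(x, u):
    """u^1 F_1(x) + u^2 F_2(x) + u^3 F_3(x) for the assumed rotation fields
    F_a(x) = e_a x x; by linearity this is the cross product u x x."""
    return cross(u, x)


x = (0.0, 0.0, 1.0)    # a point of the sphere
u = (1.0, 2.0, 3.0)    # one control value
v = (1.0, 2.0, -5.0)   # a different control value, v = u - 8 x

print(velocity(x, u), velocity(x, v))
```

Both control values produce the same tangent vector at $x$. Applying the same modification along a whole trajectory, replacing $\mu(t)$ by $\mu(t) + c(t)\,\xi(t)$ with $c$ nowhere zero, gives two controls that differ everywhere and yet generate the same curve.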


The upshot is: One should be flexible in terms of what sorts of models one uses, knowing what issues might make one model preferable over another in certain circumstances.

6.4 Control-affine systems


In this section we study a special class of control systems, a class where the dynamics are affine in the control. Many control problems in practice are naturally modelled by systems of this form. Moreover, the simple manner in which the control appears in the equations leads to a corresponding simplification of some of the theoretical developments surrounding these systems.

6.4.1 Definition of control-affine system

We can begin straightaway.

6.4.1 Definition (Control-affine system) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$. A $C^r$-control-affine system is a 4-tuple $\Sigma = (M, \mathscr{F}, C, \mathbb{T})$, where
(i) $M$ is a smooth or analytic manifold, as is required, whose elements are called states,
(ii) $C \subseteq \mathbb{R}^m$ is the control set,
(iii) $\mathbb{T} \subseteq \mathbb{R}$ is an interval called the time-domain for the system, and
(iv) $\mathscr{F} = (F_0, F_1, \dots, F_m)$ is a family such that, for each $a \in \{0, 1, \dots, m\}$, $F_a : \mathbb{T} \times M \to TM$ is obtained as follows: there exist a separable metric space $S_a$, a measurable locally essentially bounded map $\gamma_a : \mathbb{T} \to S_a$, and a map $\hat{F}_a : S_a \times M \to TM$ with the following properties:
(a) $\hat{F}_a(s, x) \in T_xM$ for each $(s, x) \in S_a \times M$;
(b) $\hat{F}_a$ is of parameterised class $C^r$;
(c) $F_a(t, x) = \hat{F}_a(\gamma_a(t), x)$ for every $(t, x) \in \mathbb{T} \times M$.
(v) The vector field $F_0$ is called the drift vector field and the vector fields $F_1, \dots, F_m$ are the control vector fields or input vector fields.
A $C^r$-control-affine system $\Sigma = (M, \mathscr{F}, C, \mathbb{T})$ with $\mathscr{F} = (F_0, F_1, \dots, F_m)$ is time-independent if there exist vector fields $F'_0, F'_1, \dots, F'_m$ on $M$ such that $F_a(t, x) = F'_a(x)$ for all $(t, x) \in \mathbb{T} \times M$ and $a \in \{0, 1, \dots, m\}$. For time-independent control-affine systems, we will often write $F_a$ in place of $F'_a$, $a \in \{0, 1, \dots, m\}$, and also write $\Sigma = (M, \mathscr{F}, C)$, omitting the redundant time-domain, which we shall assume to be $\mathbb{T} = \mathbb{R}$.

The same comments as made after Definition 6.3.2 concerning the nature of the explicit time-dependence can be made here for control-affine systems.

The following definition will be useful, although it is perhaps not clear why at present.

6.4.2 Definition (Control-affine pre-system) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$. A $C^r$-control-affine pre-system is a triple $\Sigma = (M, \mathscr{F}, \mathbb{T})$, where


(i) M is a smooth or analytic manifold, as is required, whose elements are called states, (ii) T R is an interval called the time-domain for the system, and (iii) F = (F0 , F1 , . . . , Fm ) is a family such that, for each a {0, 1, . . . , m}, Fa : T M TM is obtained as follows: there exists separable metric space Sa , a measurable locally a : Sa M TM with the essentially bounded map a : T Sa , and a map F following properties: a (s, x) Tx M for each (s, x) Sa M; (a) F a is of parameterised class Cr ; (b) F a (a (t), x) for every (t, x) T M. (c) Fa (t, x) = F A Cr -control-ane pre-system = (M, F , T) with F = (F0 , F1 , . . . , Fm ) is timeindependent if there exist vector elds F0 , F1 , . . . , Fm on M such that Fa (t, x) = Fa (x, u) for all (t, x) T M and a {0, 1, . . . , m}. For time-independent control-ane pre-systems, we will often write Fa in place of Fa , a {0, 1, . . . , m}, and also write = (M, F ), omitting the redundant time-domain which we shall assume to be T = R. Of course, a control-ane system gives rise to a control system in the obvious way. 6.4.3 Proposition (Control-afne systems are control systems) Let r Z0 {, }. If = (M, (F0 , F1 , . . . , Fm ), C, T) is a Cr -control-ane system and if F : T M C TM is dened by
m

F(t, x, u) = F0 (t, x) +
a=1

ua Fa (t, x),

then = (M, F, C, T) is a Cr -control system. Moreover, if is time-independent, then so is .
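The assembly of the control system dynamics in Proposition 6.4.3 can be illustrated by a minimal sketch in the special case where the state manifold is Euclidean space and tangent vectors are lists of components. The names `control_affine_dynamics`, `drift`, and `push` are hypothetical and chosen only for this illustration; nothing here addresses the regularity conditions of the proposition.

```python
from typing import Callable, List, Sequence

Vector = List[float]
# A time-dependent vector field on R^n: (t, x) -> components of the tangent vector at x.
VectorField = Callable[[float, Vector], Vector]


def control_affine_dynamics(
    drift: VectorField, control_fields: Sequence[VectorField]
) -> Callable[[float, Vector, Sequence[float]], Vector]:
    """Return F with F(t, x, u) = F_0(t, x) + sum_a u_a F_a(t, x)."""

    def F(t: float, x: Vector, u: Sequence[float]) -> Vector:
        if len(u) != len(control_fields):
            raise ValueError("one control component is needed per control vector field")
        value = list(drift(t, x))
        for u_a, F_a in zip(u, control_fields):
            w = F_a(t, x)
            value = [v_i + u_a * w_i for v_i, w_i in zip(value, w)]
        return value

    return F


# Illustrative example (an assumption, not from the text): a double integrator on R^2.
def drift(t: float, x: Vector) -> Vector:
    return [x[1], 0.0]


def push(t: float, x: Vector) -> Vector:
    return [0.0, 1.0]


F = control_affine_dynamics(drift, [push])
print(F(0.0, [1.0, 2.0], [3.0]))
```

The control enters only through the scalar coefficients `u_a`, which is the structural feature exploited throughout this section.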


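Proof Let $S_a$ and $\tau_a : \mathbb{T} \to S_a$ be the metric spaces and locally essentially bounded maps for which $F_a(t, x) = \hat{F}_a(\tau_a(t), x)$ for $t \in \mathbb{T}$, $x \in M$, and for maps $\hat{F}_a : S_a \times M \to TM$, $a \in \{0, 1, \dots, m\}$. Define $S = S_0 \times S_1 \times \dots \times S_m$ (noting that $S$ is a separable metric space) and $\tau : \mathbb{T} \to S$ by $\tau(t) = (\tau_0(t), \tau_1(t), \dots, \tau_m(t))$ (noting that $\tau$ is locally essentially bounded since a finite product of compact sets is compact [Abraham, Marsden, and Ratiu 1988, Proposition 1.5.3]). The only (possibly not entirely) obvious assertion to prove is that the map $\hat{F} : S \times M \times C \to TM$ defined by
\[ \hat{F}((s_0, s_1, \dots, s_m), x, u) = \hat{F}_0(s_0, x) + \sum_{a=1}^m u_a \hat{F}_a(s_a, x) \]
is such that $(s, u, x) \mapsto \hat{F}(s, x, u)$ is of parameterised class $C^r$ if $\hat{F}_0, \hat{F}_1, \dots, \hat{F}_m$ are of parameterised class $C^r$, where we denote $s = (s_0, s_1, \dots, s_m)$. Let $f \in C^\infty(M)$. We must show that, for $k \in \mathbb{Z}_{\ge 0}$ with $k \le r$ and $X_1, \dots, X_k \in \Gamma^\infty(TM)$, the map $(s, u, x) \mapsto \mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{\hat{F}_{(s,u)}} f)(x)$ is continuous, where $\hat{F}_{(s,u)}$ is the vector field defined by $\hat{F}_{(s,u)}(x) = \hat{F}(s, x, u)$. Note that
\[ \mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{\hat{F}_{(s,u)}} f)(x) = \mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{\hat{F}_{0,s_0}} f)(x) + \sum_{a=1}^m u_a\, \mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{\hat{F}_{a,s_a}} f)(x), \]
where $\hat{F}_{a,s_a}$ is the vector field defined by $\hat{F}_{a,s_a}(x) = \hat{F}_a(s_a, x)$, $a \in \{0, 1, \dots, m\}$. By hypothesis, the maps
\[ (s_a, x) \mapsto \mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{\hat{F}_{a,s_a}} f)(x), \qquad a \in \{0, 1, \dots, m\}, \]
are continuous, and so the result follows since sums and products of continuous functions are continuous.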

6.4.2 Trajectories and reachable sets for control-affine systems

For a control-affine system $(M, (F_0, F_1, \dots, F_m), C, \mathbb{T})$, the equations defining trajectories are
\[ \xi'(t) = F_0(t, \xi(t)) + \sum_{a=1}^m \mu_a(t) F_a(t, \xi(t)), \]
for suitable functions $t \mapsto \mu(t) \in C$ and $t \mapsto \xi(t) \in M$. Of course, since a control-affine system gives rise in a natural way to a control system as in Proposition 6.4.3, locally essentially bounded controls $\mu \in L^\infty_{\mathrm{loc}}(\mathbb{T}; C)$ give rise to trajectories as in Corollary 6.3.6. Thus the definitions of admissible controls and their associated trajectories hold for control-affine systems as in Definition 6.3.7. However, because controls for control-affine systems take values in Euclidean space, and because the controls appear linearly in the differential equations, one can enlarge the class of controls from $L^\infty_{\mathrm{loc}}(\mathbb{T}; C)$ to $L^1_{\mathrm{loc}}(\mathbb{T}; C)$. The following result ensures that the resulting differential equations will possess solutions.

6.4.4 Proposition (Property of control-affine system when the control is specified) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$ and let $\Sigma = (M, \mathscr{F}, C, \mathbb{T})$ be a $C^r$-control-affine system with $\Sigma' = (M, F, C, \mathbb{T})$ the associated $C^r$-control system. If $\mu \in L^1_{\mathrm{loc}}(\mathbb{T}; C)$ then $F_\mu \in \mathrm{LI}\Gamma^r(\mathbb{T}; TM)$, where $F_\mu : \mathbb{T} \times M \to TM$ is defined by $F_\mu(t, x) = F(t, x, \mu(t))$.
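Proof Let $S_a$, $\tau_a : \mathbb{T} \to S_a$, and $\hat{F}_a : S_a \times M \to TM$ be such that $F_a(t, x) = \hat{F}_a(\tau_a(t), x)$, $a \in \{0, 1, \dots, m\}$. Let us define $S = S_0 \times S_1 \times \dots \times S_m$ and define $\tau : \mathbb{T} \to S$ by $\tau(t) = (\tau_0(t), \tau_1(t), \dots, \tau_m(t))$. Let us abbreviate $G = F_\mu$, let $\hat{F}$ satisfy $F(t, x, u) = \hat{F}(\tau(t), x, u)$, and let $\hat{F}_a$, $a \in \{0, 1, \dots, m\}$, satisfy $F_a(t, x) = \hat{F}_a(\tau_a(t), x)$. We write $G_t(x) = G(t, x)$ and $F_{a,t}(x) = F_a(t, x)$. For $s_a \in S_a$, denote by $\hat{F}_{a,s_a}$ the vector field given by $\hat{F}_{a,s_a}(x) = \hat{F}_a(s_a, x)$, $a \in \{0, 1, \dots, m\}$. It follows from Proposition 6.3.4 and the fact that $C \subset \mathbb{R}^m$ is separable that $F_\mu$ is a Carathéodory vector field. For $f \in C^\infty(M)$ we must show that the function $(t, x) \mapsto \mathscr{L}_{G_t} f(x)$ is in $\mathrm{LIC}^r(\mathbb{T}; M)$. Let $K \subset M$ be compact, let $k \in \mathbb{Z}_{>0}$ satisfy $k \le r$, and let $X_1, \dots, X_k \in \Gamma^\infty(TM)$. Since $\hat{F}_a$, $a \in \{0, 1, \dots, m\}$, is of parameterised class $C^r$, the map
\[ (s_a, x) \mapsto \mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{\hat{F}_{a,s_a}} f)(x), \qquad a \in \{0, 1, \dots, m\}, \tag{6.3} \]
is continuous. Now fix $t_0 \in \mathbb{T}$ and for $t \in \mathbb{T}$ denote
\[ |t_0, t| = \begin{cases} [t_0, t], & t \ge t_0, \\ [t, t_0], & t < t_0. \end{cases} \]
Since $\tau_a$ is locally essentially bounded for each $a \in \{0, 1, \dots, m\}$, for $t \in \mathbb{T}$ there exists a compact set $K_a(t) \subset S_a$ such that $\tau_a(t') \in K_a(t)$ for almost every $t' \in |t_0, t|$. By continuity of the map (6.3), for each $a \in \{0, 1, \dots, m\}$ there exists a compact set $A_a(t) \subset \mathbb{R}$ such that $\mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{\hat{F}_{a,s_a}} f)(x) \in A_a(t)$ for $(s_a, x) \in K_a(t) \times K$. But this implies that
\[ \mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{\hat{F}_{a,\tau_a(t')}} f)(x) \in A_a(t), \qquad x \in K,\ \text{a.e. } t' \in |t_0, t|. \]
Now, for $a \in \{0, 1, \dots, m\}$, define $g_a : \mathbb{T} \to \mathbb{R}_{\ge 0}$ by
\[ g_a(t) = \operatorname{ess\,sup}\{ |\mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{\hat{F}_{a,\tau_a(t')}} f)(x)| \mid x \in K,\ t' \in |t_0, t| \}. \]
Using the argument applied to $g$ in the proof of Proposition 6.3.4, we have $g_a \in L^\infty_{\mathrm{loc}}(\mathbb{T}; \mathbb{R}_{\ge 0})$. Now define $g : \mathbb{T} \to \mathbb{R}_{\ge 0}$ by
\[ g(t) = g_0(t) + \sum_{a=1}^m |\mu_a(t)|\, g_a(t). \]
We claim that $g \in L^1_{\mathrm{loc}}(\mathbb{T}; \mathbb{R}_{\ge 0})$. Indeed, for a compact subinterval $\mathbb{T}' \subset \mathbb{T}$ let
\[ M_a = \operatorname{ess\,sup}\{ g_a(t) \mid t \in \mathbb{T}' \}, \qquad a \in \{0, 1, \dots, m\}, \]
noting that $M_a < \infty$ for $a \in \{0, 1, \dots, m\}$. Therefore, since $\mu \in L^1_{\mathrm{loc}}(\mathbb{T}; C)$,
\[ \int_{\mathbb{T}'} g(t)\, dt = \int_{\mathbb{T}'} g_0(t)\, dt + \sum_{a=1}^m \int_{\mathbb{T}'} |\mu_a(t)|\, g_a(t)\, dt \le \lambda(\mathbb{T}') M_0 + \sum_{a=1}^m M_a \int_{\mathbb{T}'} |\mu_a(t)|\, dt < \infty, \]
where $\lambda$ denotes Lebesgue measure, giving $g \in L^1_{\mathrm{loc}}(\mathbb{T}; \mathbb{R}_{\ge 0})$, as desired. Finally, note that, for $x \in K$ and for almost every $t \in \mathbb{T}$, we have
\[ |\mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{G_t} f)(x)| = \Bigl| \mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}\Bigl( \mathscr{L}_{F_{0,t}} f + \sum_{a=1}^m \mu_a(t)\, \mathscr{L}_{F_{a,t}} f \Bigr)(x) \Bigr| \le |\mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{F_{0,t}} f)(x)| + \sum_{a=1}^m |\mu_a(t)|\, |\mathscr{L}_{X_1} \cdots \mathscr{L}_{X_k}(\mathscr{L}_{F_{a,t}} f)(x)| \le g(t). \]
By modifying $g$ on a set of measure zero, we can ensure that the preceding inequality holds for all $t \in \mathbb{T}$ without affecting the fact that $g \in L^1_{\mathrm{loc}}(\mathbb{T}; \mathbb{R}_{\ge 0})$. This shows that $(t, x) \mapsto \mathscr{L}_{G_t} f(x)$ is in $\mathrm{LIC}^r(\mathbb{T}; M)$, and so proves the proposition.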
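To see why enlarging the class of controls from locally essentially bounded to locally integrable is meaningful, consider the following small numerical sketch. It assumes the scalar driftless system on the real line with a single constant unit control vector field, so that the trajectory equation reduces to integrating the control, and it uses the control $t \mapsto t^{-1/2}$ on $(0, 1]$, which is unbounded near zero but integrable. The function names and the midpoint quadrature are choices made only for this illustration.

```python
import math


def mu(t: float) -> float:
    """An unbounded but integrable control on (0, 1]."""
    return 1.0 / math.sqrt(t)


def integrate_midpoint(f, a: float, b: float, n: int) -> float:
    """Midpoint rule; it never evaluates f at the singular endpoint a = 0."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def xi_exact(t: float) -> float:
    """Closed-form trajectory from the initial state 0: the integral of mu over [0, t]."""
    return 2.0 * math.sqrt(t)


# The control blows up near t = 0, yet the state reached at time 1 is finite.
approx = integrate_midpoint(mu, 0.0, 1.0, 100_000)
print(approx, xi_exact(1.0))
```

The quadrature converges slowly because of the singularity, so the comparison with the closed form should only be expected to hold to a couple of decimal places.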


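We next define trajectories for control-affine systems. This is mostly repetition from Definition 6.3.7, but allowing locally integrable rather than locally essentially bounded controls. We recall from Definition 6.1.14 the definitions of admissible subsets of locally integrable controls.

6.4.5 Definition (Trajectories for control-affine systems) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$ and let $\Sigma = (M, \mathscr{F}, C, \mathbb{T})$ be a $C^r$-control-affine system with $\Sigma' = (M, F, C, \mathbb{T})$ the associated control system.
(i) A controlled trajectory for $\Sigma$ is a pair $(\xi, \mu)$ where, for a subinterval $\mathbb{T}' \subset \mathbb{T}$, $\mu \in L^1_{\mathrm{loc}}(\mathbb{T}'; C)$ and $\xi : \mathbb{T}' \to M$ satisfy $\xi'(t) = F(t, \xi(t), \mu(t))$ for almost every $t \in \mathbb{T}'$. The subinterval $\mathbb{T}'$ is the time-domain for $(\xi, \mu)$ that we denote by $\mathrm{TD}(\xi, \mu)$.
(ii) A trajectory for $\Sigma$ is, for a subinterval $\mathbb{T}' \subset \mathbb{T}$, a curve $\xi : \mathbb{T}' \to M$ for which $(\xi, \mu)$ is a controlled trajectory for some $\mu \in \mathrm{Cont}^1(\Sigma, \mathbb{T}')$. The subinterval $\mathbb{T}'$ is the time-domain for $\xi$ that we denote by $\mathrm{TD}(\xi)$.
(iii) For a subinterval $\mathbb{T}' \subset \mathbb{T}$, we denote
\[ \mathrm{CTraj}(\Sigma, \mathbb{T}') = \{ (\xi, \mu) \mid (\xi, \mu) \text{ is a controlled trajectory for } \Sigma \text{ for which } \mathrm{TD}(\xi, \mu) = \mathbb{T}' \} \]
and $\mathrm{CTraj}(\Sigma) = \bigcup_{\mathbb{T}' \subset \mathbb{T}} \mathrm{CTraj}(\Sigma, \mathbb{T}')$.
(iv) For a subinterval $\mathbb{T}' \subset \mathbb{T}$, we denote
\[ \mathrm{Traj}(\Sigma, \mathbb{T}') = \{ \xi \mid \xi \text{ is a trajectory for } \Sigma \text{ for which } \mathrm{TD}(\xi) = \mathbb{T}' \} \]
and $\mathrm{Traj}(\Sigma) = \bigcup_{\mathbb{T}' \subset \mathbb{T}} \mathrm{Traj}(\Sigma, \mathbb{T}')$.
(v) If $\mathscr{U} \subset \mathrm{Cont}^1(\Sigma)$ is a subset of locally integrable controls, then we denote
\[ \mathrm{CTraj}(\Sigma, \mathbb{T}', \mathscr{U}) = \{ (\xi, \mu) \in \mathrm{CTraj}(\Sigma, \mathbb{T}') \mid \mu \in \mathscr{U} \}, \]
\[ \mathrm{CTraj}(\Sigma, \mathscr{U}) = \{ (\xi, \mu) \in \mathrm{CTraj}(\Sigma) \mid \mu \in \mathscr{U} \}, \]
\[ \mathrm{Traj}(\Sigma, \mathbb{T}', \mathscr{U}) = \{ \xi \in \mathrm{Traj}(\Sigma, \mathbb{T}') \mid \text{there exists } \mu \in \mathscr{U} \text{ such that } (\xi, \mu) \in \mathrm{CTraj}(\Sigma, \mathbb{T}') \}, \]
\[ \mathrm{Traj}(\Sigma, \mathscr{U}) = \{ \xi \in \mathrm{Traj}(\Sigma) \mid \text{there exists } \mu \in \mathscr{U} \text{ such that } (\xi, \mu) \in \mathrm{CTraj}(\Sigma) \}. \]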

Note that because control-affine systems are control systems, there is a genuine ambiguity between the preceding definition and Definition 6.3.7. This was touched upon at the end of Section 6.1.2. This ambiguity is, first and foremost, not really problematic since it will be generally clear from context (or better, by explicit statement) what class of controls is being used at any moment. However, to eliminate any possible confusion. . .

6.4.6 Convention for classes of controls (again) For control-affine systems, unless otherwise stated, we suppose that the class of controls considered is $L^1_{\mathrm{loc}}(\mathbb{T}; C)$.


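Next we define the notions of reachable sets for control-affine systems. These follow mostly as repetition from those for control systems. However, for clarity, since the set from which admissible controls can be chosen is different, we perform the repetition.

6.4.7 Definition (Reachable sets for control-affine systems) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$. Let $\Sigma = (M, \mathscr{F}, C, \mathbb{T})$ be a $C^r$-control-affine system and let $t_0 \in \mathbb{T}$ and $x_0 \in M$.
(i) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time $t$ is
\[ \mathcal{R}_\Sigma(x_0, t_0, t) = \{ \xi(t) \mid \xi \in \mathrm{Traj}(\Sigma),\ \xi(t_0) = x_0,\ [t_0, t] \subset \mathrm{TD}(\xi) \}. \]
(ii) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time at most $t$ is
\[ \mathcal{R}_\Sigma(x_0, t_0, \le t) = \bigcup_{\tau \in [t_0, t]} \mathcal{R}_\Sigma(x_0, t_0, \tau). \]
(iii) The reachable set from $(t_0, x_0)$ is
\[ \mathcal{R}_\Sigma(x_0, t_0) = \bigcup_{t \in \mathbb{T} \cap [t_0, \infty)} \mathcal{R}_\Sigma(x_0, t_0, t). \]
Let $\mathscr{T} \subset \mathrm{Traj}(\Sigma)$ be a subset of trajectories for $\Sigma$.
(iv) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time $t$ with trajectories from $\mathscr{T}$ is
\[ \mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{T}) = \{ \xi(t) \mid \xi \in \mathscr{T},\ \xi(t_0) = x_0,\ [t_0, t] \subset \mathrm{TD}(\xi) \}. \]
(v) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time at most $t$ with trajectories from $\mathscr{T}$ is
\[ \mathcal{R}_\Sigma(x_0, t_0, \le t, \mathscr{T}) = \bigcup_{\tau \in [t_0, t]} \mathcal{R}_\Sigma(x_0, t_0, \tau, \mathscr{T}). \]
(vi) The reachable set from $(t_0, x_0)$ with trajectories from $\mathscr{T}$ is
\[ \mathcal{R}_\Sigma(x_0, t_0, \mathscr{T}) = \bigcup_{t \in \mathbb{T} \cap [t_0, \infty)} \mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{T}). \]
If $\mathscr{T} = \mathrm{Traj}(\Sigma, \mathscr{U})$ for some set $\mathscr{U} \subset \mathrm{Cont}^1(\Sigma)$ of admissible controls, then we will denote $\mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{T})$, $\mathcal{R}_\Sigma(x_0, t_0, \le t, \mathscr{T})$, and $\mathcal{R}_\Sigma(x_0, t_0, \mathscr{T})$ by $\mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{U})$, $\mathcal{R}_\Sigma(x_0, t_0, \le t, \mathscr{U})$, and $\mathcal{R}_\Sigma(x_0, t_0, \mathscr{U})$, respectively.

As with control systems, these definitions can be specialised to time-independent systems with a simplification of notation.

6.4.8 Definition (Reachable sets for time-independent control-affine systems) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$. Let $\Sigma = (M, \mathscr{F}, C)$ be a time-independent control-affine system and let $x_0 \in M$.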

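(i) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time $t$ is
\[ \mathcal{R}_\Sigma(x_0, t) = \{ \xi(t) \mid \xi \in \mathrm{Traj}(\Sigma),\ \xi(0) = x_0,\ [0, t] \subset \mathrm{TD}(\xi) \}. \]
(ii) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time at most $t$ is
\[ \mathcal{R}_\Sigma(x_0, \le t) = \bigcup_{\tau \in [0, t]} \mathcal{R}_\Sigma(x_0, \tau). \]
(iii) The reachable set from $x_0$ is
\[ \mathcal{R}_\Sigma(x_0) = \bigcup_{t \in \mathbb{R}_{\ge 0}} \mathcal{R}_\Sigma(x_0, t). \]
Let $\mathscr{T} \subset \mathrm{Traj}(\Sigma)$ be a subset of trajectories for $\Sigma$.
(iv) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time $t$ with trajectories from $\mathscr{T}$ is
\[ \mathcal{R}_\Sigma(x_0, t, \mathscr{T}) = \{ \xi(t) \mid \xi \in \mathscr{T},\ \xi(0) = x_0,\ [0, t] \subset \mathrm{TD}(\xi) \}. \]
(v) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time at most $t$ with trajectories from $\mathscr{T}$ is
\[ \mathcal{R}_\Sigma(x_0, \le t, \mathscr{T}) = \bigcup_{\tau \in [0, t]} \mathcal{R}_\Sigma(x_0, \tau, \mathscr{T}). \]
(vi) The reachable set from $x_0$ with trajectories from $\mathscr{T}$ is
\[ \mathcal{R}_\Sigma(x_0, \mathscr{T}) = \bigcup_{t \in \mathbb{R}_{\ge 0}} \mathcal{R}_\Sigma(x_0, t, \mathscr{T}). \]
If $\mathscr{T} = \mathrm{Traj}(\Sigma, \mathscr{U})$ for some set $\mathscr{U} \subset \mathrm{Cont}^1(\Sigma)$ of admissible controls, then we will denote $\mathcal{R}_\Sigma(x_0, t, \mathscr{T})$, $\mathcal{R}_\Sigma(x_0, \le t, \mathscr{T})$, and $\mathcal{R}_\Sigma(x_0, \mathscr{T})$ by $\mathcal{R}_\Sigma(x_0, t, \mathscr{U})$, $\mathcal{R}_\Sigma(x_0, \le t, \mathscr{U})$, and $\mathcal{R}_\Sigma(x_0, \mathscr{U})$, respectively.

The two possible notions for reachable sets will not be confusing. We will never use them simultaneously, and it will always be clear from context whether we are using the time-dependent or the time-independent version.

6.4.3 Important classes of control-affine systems

Within the class of control-affine systems lie a couple of interesting subclasses of systems that warrant singling out. In this section we consider two such classes: the linear systems and the driftless systems.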

Linear systems

Linear control systems are probably the most important class of systems in applications, and also form an interesting subclass of systems for exhibiting some of the important ideas in geometric control theory. In Section 7.1 we will consider linear systems in a very general and geometric context. For our purposes here, we will consider linear systems in a rather more pedestrian framework, and the one that is most often thought of when one hears the words "linear system."

6.4.9 Definition (Linear system) A linear system is a 5-tuple $\Sigma = (V, A, B, C, \mathbb{T})$, where
(i) $V$ is a finite-dimensional $\mathbb{R}$-vector space,
(ii) $C \subset \mathbb{R}^m$,
(iii) $\mathbb{T} \subset \mathbb{R}$ is an interval called the time-domain for the system,
(iv) $A : \mathbb{T} \to \mathrm{End}_{\mathbb{R}}(V)$ is locally essentially bounded, and
(v) $B : \mathbb{T} \to \mathrm{Hom}_{\mathbb{R}}(\mathbb{R}^m; V)$ is locally essentially bounded.

A linear system $\Sigma = (V, A, B, C, \mathbb{T})$ is time-independent if there exist $A' \in \mathrm{End}_{\mathbb{R}}(V)$ and $B' \in \mathrm{Hom}_{\mathbb{R}}(\mathbb{R}^m; V)$ such that $A(t) = A'$ and $B(t) = B'$ for every $t \in \mathbb{T}$. For time-independent linear systems, we will often write $A$ in place of $A'$ and $B$ in place of $B'$, and also write $\Sigma = (V, A, B, C)$, omitting the redundant time-domain which we shall assume to be $\mathbb{T} = \mathbb{R}$.

The idea is that the equations governing a linear system according to our definition are
\[ \xi'(t) = A(t) \cdot \xi(t) + B(t) \cdot \mu(t), \]
where $t \mapsto \xi(t)$ is a curve in $V$ and $t \mapsto \mu(t)$ is a control. The "$\cdot$" in the above equation is merely notation to eliminate the ugly $A(t)(\xi(t))$ and $B(t)(\mu(t))$ that would otherwise result.

Associated to a linear control system $(V, A, B, C, \mathbb{T})$ is a $C^\omega$-control-affine system $\Sigma = (M, (F_0, F_1, \dots, F_m), C, \mathbb{T})$ defined as follows:
1. $M = V$;
2. $C = C$ (abuse of notation);
3. $\mathbb{T} = \mathbb{T}$ (abuse of notation);
4. the separable metric spaces $S_a$, the maps $\tau_a : \mathbb{T} \to S_a$, and the maps $\hat{F}_a : S_a \times M \to TM$, $a \in \{0, 1, \dots, m\}$, needed to define $\Sigma$ are given by:
(a) $a = 0$: $S_0 = \mathrm{End}_{\mathbb{R}}(V)$, $\tau_0(t) = A(t)$, and $\hat{F}_0(\sigma, x) = (x, \sigma(x))$;
(b) $a \in \{1, \dots, m\}$: $S_a = \mathrm{Hom}_{\mathbb{R}}(\mathbb{R}^m; V)$, $\tau_a(t) = B(t)$, and $\hat{F}_a(\sigma, x) = (x, \sigma(e_a))$, where $(e_1, \dots, e_m)$ is the standard basis for $\mathbb{R}^m$.

This construction is absurdly complicated by (1) allowing time-dependent linear systems and (2) the weird nature of the time dependence in our general notion of control-affine systems. However, when a linear system is time-independent, things are gratifyingly simpler, and more easily resemble the usual notion of a time-independent

linear system. To be precise, to a time-independent linear system $\Sigma = (V, A, B, C)$ we assign the time-independent control-affine system $\Sigma = (M, (F_0, F_1, \dots, F_m), C)$ according to the following:
1. $M = V$;
2. $C = C$ (abuse of notation);
3. $\mathbb{T} = \mathbb{T}$ (abuse of notation);
4. $F_0(x) = (x, A(x))$ and $F_a(x) = (x, B(e_a))$, $a \in \{1, \dots, m\}$, where $(e_1, \dots, e_m)$ is the standard basis for $\mathbb{R}^m$.

Having now defined linear systems, we will use them as we go along to understand certain general properties and problems in geometric control theory. Linear systems of the sort considered above are a nice class of systems to consider by virtue of the fact that their solutions are relatively easy to characterise.

Driftless systems

The next special class of control-affine systems we consider are the so-called driftless systems. As the name suggests, a driftless system is one where the drift vector field is zero. So here comes the formal definition.

6.4.10 Definition (Driftless system) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$. A $C^r$-driftless system is a 4-tuple $\Sigma = (M, \mathscr{F}, C, \mathbb{T})$, where
(i) $M$ is a smooth or analytic manifold, as is required,
(ii) $C \subset \mathbb{R}^m$ is the control set,
(iii) $\mathbb{T} \subset \mathbb{R}$ is an interval called the time-domain for the system, and
(iv) $\mathscr{F} = (F_1, \dots, F_m)$ is a family such that, for each $a \in \{1, \dots, m\}$, $F_a : \mathbb{T} \times M \to TM$ is obtained as follows: there exist a separable metric space $S_a$, a measurable locally essentially bounded map $\tau_a : \mathbb{T} \to S_a$, and a map $\hat{F}_a : S_a \times M \to TM$ with the following properties:
(a) $\hat{F}_a(s, x) \in T_x M$ for each $(s, x) \in S_a \times M$;
(b) $\hat{F}_a$ is of parameterised class $C^r$;
(c) $F_a(t, x) = \hat{F}_a(\tau_a(t), x)$ for every $(t, x) \in \mathbb{T} \times M$.
(v) The vector fields $F_1, \dots, F_m$ are the control vector fields or input vector fields.

A $C^r$-driftless system $\Sigma = (M, \mathscr{F}, C, \mathbb{T})$ with $\mathscr{F} = (F_1, \dots, F_m)$ is time-independent if there exist vector fields $F'_1, \dots, F'_m$ on $M$ such that $F_a(t, x) = F'_a(x)$ for all $(t, x) \in \mathbb{T} \times M$ and $a \in \{1, \dots, m\}$. For time-independent driftless systems, we will often write $F_a$ in place of $F'_a$, $a \in \{1, \dots, m\}$, and also write $\Sigma = (M, \mathscr{F}, C)$, omitting the redundant time-domain which we shall assume to be $\mathbb{R}$.

A driftless system $(M, (F_1, \dots, F_m), C, \mathbb{T})$ is governed by the equations
\[ \xi'(t) = \sum_{a=1}^m \mu_a(t) F_a(t, \xi(t)), \]
for a curve $t \mapsto \xi(t)$ on $M$ and a control $t \mapsto \mu(t)$. Like linear systems, driftless systems provide an often useful class of systems to illustrate certain ideas in geometric control theory.

6.4.11 Remark (Control-affine systems are driftless systems, sort of) As a curiosity, we remark that if $\Sigma = (M, (F_0, F_1, \dots, F_m), C, \mathbb{T})$ is a control-affine system, then there is associated to this a driftless system $\Sigma' = (M, (F'_1, \dots, F'_{m+1}), C', \mathbb{T})$ given by taking $F'_a = F_{a-1}$, $a \in \{1, \dots, m+1\}$, and taking
\[ C' = \{ (1, u_2, \dots, u_{m+1}) \in \mathbb{R}^{m+1} \mid (u_2, \dots, u_{m+1}) \in C \}. \]
This observation is of no practical importance, since it does not lead to a means of solving any interesting problems in geometric (or any other kind of) control theory. However, it can sometimes be a useful notational trick, kind of like adding time as a variable to a differential equation to render a time-dependent equation time-independent.

6.4.4 Transformations of control-affine systems

If one is studying an application where the dynamics are modelled by a control-affine system, then typically the drift vector field $F_0$ and the control vector fields $F_1, \dots, F_m$ are given naturally in terms of the physical model, and the controls very often have a real meaning in terms of voltages, torques, etc. Nonetheless, it is sometimes useful when studying control-affine systems to allow the drift vector field and the control vector fields to be modified, as long as the trajectories remain the same for both systems. The following definition encapsulates this.

6.4.12 Definition (Equivalence of control-affine systems) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$ and let $\Sigma_F = (M, (F_0, F_1, \dots, F_l), \mathbb{R}^l, \mathbb{T})$ and $\Sigma_G = (M, (G_0, G_1, \dots, G_m), \mathbb{R}^m, \mathbb{T})$ be $C^r$-control-affine systems. The control-affine systems $\Sigma_F$ and $\Sigma_G$ are equivalent if $\mathrm{Traj}(\Sigma_F) = \mathrm{Traj}(\Sigma_G)$.

The matter of studying control system equivalence is one of great interest in the control community, and the notion of equivalence we give above is not the one that researchers in this area use. The usual definition of equivalence allows for different state manifolds, and asks that there be a diffeomorphism between the state manifolds which, along with its inverse, maps trajectories of one system to those of the other. This notion of equivalence is much stronger than the one we give, and the idea with the stronger notion is to use the diffeomorphism part of the equivalence to find a set of coordinates where the system takes a normal form. Thus the study of equivalence becomes one of classification, something we do not want to get into, cf. the comments at the end of Section 5.3.9 and Remark 5.4.12. Our notion of equivalence is more geometric, but even with it there are some subtleties that require some care [Elkin 1999].
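Returning to the constructions of Section 6.4.3, the following minimal sketch illustrates two of them in coordinates: the control-affine vector fields assigned to a time-independent linear system, and the notational trick of Remark 6.4.11 that absorbs the drift into an augmented control set. It assumes the state space is Euclidean space with matrices stored as nested lists; the function names `linear_to_control_affine`, `to_driftless`, and `velocity` are hypothetical and serve only this illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Field = Callable[[Vector], Vector]  # time-independent vector field on R^n


def mat_vec(A: Sequence[Sequence[float]], x: Vector) -> Vector:
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def linear_to_control_affine(A, B) -> Tuple[Field, List[Field]]:
    """Drift F_0(x) = A x and control fields F_a(x) = B e_a (the a-th column of B)."""
    m = len(B[0])

    def F0(x: Vector) -> Vector:
        return mat_vec(A, x)

    def make_field(a: int) -> Field:
        column = [row[a] for row in B]
        return lambda x: list(column)

    return F0, [make_field(a) for a in range(m)]


def to_driftless(F0: Field, control_fields: Sequence[Field], controls):
    """Remark 6.4.11: prepend the drift as a field whose control coefficient is pinned to 1."""
    fields = [F0] + list(control_fields)
    augmented_controls = [(1.0,) + tuple(u) for u in controls]
    return fields, augmented_controls


def velocity(fields: Sequence[Field], u: Sequence[float], x: Vector) -> Vector:
    """Right-hand side of a driftless system: sum_a u_a F_a(x)."""
    v = [0.0] * len(x)
    for u_a, F_a in zip(u, fields):
        w = F_a(x)
        v = [v_i + u_a * w_i for v_i, w_i in zip(v, w)]
    return v


# Illustrative data (an assumption): the double integrator.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0], [1.0]]
F0, control_fields = linear_to_control_affine(A, B)
fields, augmented = to_driftless(F0, control_fields, [(3.0,)])
print(velocity(fields, augmented[0], [1.0, 2.0]))
```

The augmented system has the same velocities as the original one, which is all that the remark claims; as the text says, nothing is gained beyond notation.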


6.5 Affine systems

In this section we consider what is essentially a blending of differential inclusion systems with control-affine systems. The motivation behind these models is to provide a setting for studying control-affine systems that is naturally invariant under the sort of equivalence discussed in Section 6.4.4.

6.5.1 Definition of affine system

In Section 5.4 we studied the notion of an affine distribution on a manifold. Here we wish to generalise this slightly to allow for time dependence. The reason for doing this is that when we linearise affine systems about nontrivial reference trajectories in Section 7.2.4, we shall arrive naturally at an affine system that is time-dependent. As with control systems and control-affine systems, the time dependence will be assumed to arise in a particular way, through a measurable curve in a metric space. This, of course, complicates the presentation. We begin by considering affine distributions that depend on a parameter taking values in a topological space.

6.5.1 Definition (Parameterised affine distribution) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, let $M$ be a $C^\infty$- or $C^\omega$-manifold, as is required, and let $S$ be a topological space. A subset $A \subset S \times TM$ is an affine distribution of parameterised class $C^r$ if, for every $x_0 \in M$, there exist a neighbourhood $N$ of $x_0$ and vector fields $X_0, X_1, \dots, X_k : S \times M \to TM$ of parameterised class $C^r$ such that
\[ A_{(s,x)} \triangleq A \cap (\{s\} \times T_x M) = X_0(s, x) + \mathrm{span}_{\mathbb{R}}(X_1(s, x), \dots, X_k(s, x)) \]
for every $(s, x) \in S \times N$.

It is then straightforward to define what we mean by an affine system.

6.5.2 Definition (Affine system) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$. A $C^r$-affine system is a 4-tuple $\Sigma = (M, A, F, \mathbb{T})$, where
(i) $M$ is a smooth or analytic manifold, as is required, whose elements are called states,
(ii) $\mathbb{T} \subset \mathbb{R}$ is an interval called the time-domain for the system,
(iii) $A \subset \mathbb{T} \times TM$ is defined as follows: there exist a separable metric space $S$, a measurable locally essentially bounded map $\tau : \mathbb{T} \to S$, and a subset $\hat{A} \subset S \times TM$ with the following properties:
(a) $\hat{A}$ is an affine distribution of parameterised class $C^r$;
(b) $A_{(t,x)} \triangleq A \cap (\{t\} \times T_x M) = \hat{A}_{(\tau(t),x)}$ for every $(t, x) \in \mathbb{T} \times M$;
(iv) $F : \mathbb{T} \times M \twoheadrightarrow TM$ is a differential inclusion with the following properties:
(a) $F(t, x) \subset A_{(t,x)}$ for every $(t, x) \in \mathbb{T} \times M$;
(b) $\mathrm{aff}(F(t, x)) = A_{(t,x)}$ for every $(t, x) \in \mathbb{T} \times M$.


A $C^r$-affine system $\Sigma = (M, A, F, \mathbb{T})$ is time-independent if there exists an affine distribution $A'$ on $M$ such that $A_{(t,x)} = A'_x$ for every $(t, x) \in \mathbb{T} \times M$. For time-independent affine systems, we will often write $A$ in place of $A'$, and also write $\Sigma = (M, A, F)$, omitting the redundant time-domain.

Obviously, an affine system $\Sigma = (M, A, F, \mathbb{T})$ gives rise to the differential inclusion system $(M, F, \mathbb{T})$, where one ignores the structure of the affine distribution. As with differential inclusion systems, we do not specify a priori the properties of the differential inclusion $F$, preferring to give whatever properties are needed to do what we wish at any moment.

For reasons that are not quite clear at the moment, the following definition will be useful.

6.5.3 Definition (Affine pre-system) Let $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$. A $C^r$-affine pre-system is a triple $\Sigma = (M, A, \mathbb{T})$, where
(i) $M$ is a smooth or analytic manifold, as is required, whose elements are called states,
(ii) $\mathbb{T} \subset \mathbb{R}$ is an interval called the time-domain for the system,
(iii) $A \subset \mathbb{T} \times TM$ is defined as follows: there exist a separable metric space $S$, a measurable locally essentially bounded map $\tau : \mathbb{T} \to S$, and a subset $\hat{A} \subset S \times TM$ with the following properties:
(a) $\hat{A}$ is an affine distribution of parameterised class $C^r$;
(b) $A_{(t,x)} \triangleq A \cap (\{t\} \times T_x M) = \hat{A}_{(\tau(t),x)}$ for every $(t, x) \in \mathbb{T} \times M$.

A $C^r$-affine pre-system $\Sigma = (M, A, \mathbb{T})$ is time-independent if there exists an affine distribution $A'$ on $M$ such that $A_{(t,x)} = A'_x$ for every $(t, x) \in \mathbb{T} \times M$. For time-independent affine pre-systems, we will often write $A$ in place of $A'$, and also write $\Sigma = (M, A)$, omitting the redundant time-domain.

6.5.2 The relationship between affine systems and control-affine systems

It will be useful for us to be able to have certain relationships between the more abstract affine systems and the more concrete control-affine systems. In this section, we establish some of these relationships. First, it is more or less clear that associated with any control-affine system is an affine system. The following definition encapsulates this.

6.5.4 Definition (The affine system associated with a control-affine system) Let $\Sigma = (M, (F_0, F_1, \dots, F_m), C, \mathbb{T})$ be a control-affine system of class $C^r$, $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$. The $\Sigma$-affine system is defined by $(M, A_\Sigma, F_\Sigma, \mathbb{T})$, where
(i) $A_{\Sigma,(t,x)} = F_0(t, x) + \mathrm{span}_{\mathbb{R}}(F_1(t, x), \dots, F_m(t, x))$ and
(ii) $F_\Sigma(t, x) = \{ F_0(t, x) + \sum_{a=1}^m u_a F_a(t, x) \mid u \in C \}$
for every $(t, x) \in \mathbb{T} \times M$.


It is more or less obvious (1) that the $\Sigma$-affine system corresponding to a control-affine system $\Sigma$ is time-independent if $\Sigma$ is time-independent and (2) that the trajectories for the $\Sigma$-affine system are the same as those for $\Sigma$. It should not be difficult to believe that, given an affine system, there is no natural control-affine system that can be associated with it, and possibly no control-affine system at all whose trajectories agree with those of the affine system. However, we will sometimes find it useful to approximate an affine system with a control-affine system. The next definition captures a natural way of doing this.

6.5.5 Definition (Inner and outer control-affine realisations) Let $\Sigma = (M, A, F, \mathbb{T})$ be an affine system.
(i) A control-affine system $\Sigma_{\mathrm{inner}} = (M, \mathscr{F}, C, \mathbb{T})$ is an inner realisation of $\Sigma$ if $A_{\Sigma_{\mathrm{inner}}} = A$ and if $F_{\Sigma_{\mathrm{inner}}}(t, x) \subset F(t, x)$ for every $(t, x) \in \mathbb{T} \times M$.
(ii) A control-affine system $\Sigma_{\mathrm{outer}} = (M, \mathscr{F}, C, \mathbb{T})$ is an outer realisation of $\Sigma$ if $A_{\Sigma_{\mathrm{outer}}} = A$ and if $F(t, x) \subset F_{\Sigma_{\mathrm{outer}}}(t, x)$ for every $(t, x) \in \mathbb{T} \times M$.

6.5.3 Trajectories and reachable sets for affine systems

For an affine system $\Sigma = (M, A, F, \mathbb{T})$, the condition to be satisfied by a trajectory is that for a differential inclusion system, namely $\xi'(t) \in F(t, \xi(t))$, where $t \mapsto \xi(t) \in M$ is a suitable curve on $M$. It only remains to provide terminology and notation, and these follow the corresponding definitions for differential inclusion systems.

6.5.6 Definition (Trajectories for affine systems) Let $\Sigma = (M, A, F, \mathbb{T})$ be an affine system.
(i) A trajectory for $\Sigma$ is, for a subinterval $\mathbb{T}' \subset \mathbb{T}$, a locally absolutely continuous curve $\xi : \mathbb{T}' \to M$ that is a trajectory for $F$. The subinterval $\mathbb{T}'$ is the time-domain for $\xi$ that we denote by $\mathrm{TD}(\xi)$.
(ii) For a subinterval $\mathbb{T}' \subset \mathbb{T}$, we denote
\[ \mathrm{Traj}(\Sigma, \mathbb{T}') = \{ \xi \mid \xi \text{ is a trajectory for } \Sigma \text{ for which } \mathrm{TD}(\xi) = \mathbb{T}' \} \]
and $\mathrm{Traj}(\Sigma) = \bigcup_{\mathbb{T}' \subset \mathbb{T}} \mathrm{Traj}(\Sigma, \mathbb{T}')$.

Associated with trajectories is the notion of the reachable set.

6.5.7 Definition (Reachable sets for affine systems) Let $\Sigma = (M, A, F, \mathbb{T})$ be an affine system and let $x_0 \in M$ and $t_0 \in \mathbb{T}$.
(i) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time $t$ is
\[ \mathcal{R}_\Sigma(x_0, t_0, t) = \{ \xi(t) \mid \xi \in \mathrm{Traj}(\Sigma),\ \xi(t_0) = x_0,\ [t_0, t] \subset \mathrm{TD}(\xi) \}. \]

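(ii) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time at most $t$ is
\[ \mathcal{R}_\Sigma(x_0, t_0, \le t) = \bigcup_{\tau \in [t_0, t]} \mathcal{R}_\Sigma(x_0, t_0, \tau). \]
(iii) The reachable set from $(t_0, x_0)$ is
\[ \mathcal{R}_\Sigma(x_0, t_0) = \bigcup_{t \in \mathbb{T} \cap [t_0, \infty)} \mathcal{R}_\Sigma(x_0, t_0, t). \]
Let $\mathscr{T} \subset \mathrm{Traj}(\Sigma)$ be a subset of trajectories for $\Sigma$.
(iv) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time $t$ with trajectories from $\mathscr{T}$ is
\[ \mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{T}) = \{ \xi(t) \mid \xi \in \mathscr{T},\ \xi(t_0) = x_0,\ [t_0, t] \subset \mathrm{TD}(\xi) \}. \]
(v) Let $t \in \mathbb{T} \cap [t_0, \infty)$. The reachable set from $(t_0, x_0)$ in time at most $t$ with trajectories from $\mathscr{T}$ is
\[ \mathcal{R}_\Sigma(x_0, t_0, \le t, \mathscr{T}) = \bigcup_{\tau \in [t_0, t]} \mathcal{R}_\Sigma(x_0, t_0, \tau, \mathscr{T}). \]
(vi) The reachable set from $(t_0, x_0)$ with trajectories from $\mathscr{T}$ is
\[ \mathcal{R}_\Sigma(x_0, t_0, \mathscr{T}) = \bigcup_{t \in \mathbb{T} \cap [t_0, \infty)} \mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{T}). \]

The definitions above will apply to an affine system that may depend explicitly on time. Of course, the definitions also apply to time-independent affine systems. However, for time-independent systems, one very often wishes to work with definitions for reachable sets that acknowledge the time-independence.

6.5.8 Definition (Reachable sets for time-independent affine systems) Let $\Sigma = (M, A, F)$ be a time-independent affine system and let $x_0 \in M$.
(i) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time $t$ is
\[ \mathcal{R}_\Sigma(x_0, t) = \{ \xi(t) \mid \xi \in \mathrm{Traj}(\Sigma),\ \xi(0) = x_0,\ [0, t] \subset \mathrm{TD}(\xi) \}. \]
(ii) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time at most $t$ is
\[ \mathcal{R}_\Sigma(x_0, \le t) = \bigcup_{\tau \in [0, t]} \mathcal{R}_\Sigma(x_0, \tau). \]
(iii) The reachable set from $x_0$ is
\[ \mathcal{R}_\Sigma(x_0) = \bigcup_{t \in \mathbb{R}_{\ge 0}} \mathcal{R}_\Sigma(x_0, t). \]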
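Let $\mathscr{T} \subset \mathrm{Traj}(\Sigma)$ be a subset of trajectories for $\Sigma$.
(iv) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time $t$ with trajectories from $\mathscr{T}$ is
\[ \mathcal{R}_\Sigma(x_0, t, \mathscr{T}) = \{ \xi(t) \mid \xi \in \mathscr{T},\ \xi(0) = x_0,\ [0, t] \subset \mathrm{TD}(\xi) \}. \]
(v) Let $t \in \mathbb{R}_{\ge 0}$. The reachable set from $x_0$ in time at most $t$ with trajectories from $\mathscr{T}$ is
\[ \mathcal{R}_\Sigma(x_0, \le t, \mathscr{T}) = \bigcup_{\tau \in [0, t]} \mathcal{R}_\Sigma(x_0, \tau, \mathscr{T}). \]
(vi) The reachable set from $x_0$ with trajectories from $\mathscr{T}$ is
\[ \mathcal{R}_\Sigma(x_0, \mathscr{T}) = \bigcup_{t \in \mathbb{R}_{\ge 0}} \mathcal{R}_\Sigma(x_0, t, \mathscr{T}). \]

There could possibly be some confusion with our using two different notions of reachable set. However, in practice this confusion will not be manifested as we will never use the two notions simultaneously. Moreover, it should be clear from context what notion is used in any situation.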
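To make the reachable set of a time-independent system a little more tangible, here is a rough numerical sketch. It assumes a Euclidean state space, a differential inclusion sampled by finitely many velocities at each state, and trajectories approximated by explicit Euler steps with the velocity held constant on each step. The function name `euler_reachable_points` and the example inclusion are assumptions made for illustration; the output is only a finite sample of points, not the reachable set itself.

```python
from typing import Callable, Iterable, Sequence, Set, Tuple

State = Tuple[float, ...]


def euler_reachable_points(
    velocities: Callable[[State], Iterable[Sequence[float]]],
    x0: Sequence[float],
    t: float,
    steps: int,
) -> Set[State]:
    """Endpoints at time t of Euler polygons whose step velocities lie in velocities(x)."""
    h = t / steps
    frontier: Set[State] = {tuple(x0)}
    for _ in range(steps):
        # Advance every point reached so far by one step along each sampled velocity.
        frontier = {
            tuple(x_i + h * v_i for x_i, v_i in zip(x, v))
            for x in frontier
            for v in velocities(x)
        }
    return frontier


# Illustrative inclusion on the real line (an assumption): velocities -1 and +1 at every state.
def two_velocities(x: State):
    return [(-1.0,), (1.0,)]


points = euler_reachable_points(two_velocities, (0.0,), 1.0, 4)
print(sorted(points))
```

For this example the sampled endpoints all lie in the interval from -1 to 1, and refining the step count fills in that interval more densely.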

Chapter 7 Linear systems and linearisation of systems

At the moment, this is just a placeholder for things yet to be written.

7.1 Linear systems

7.1.1 Linear systems on vector spaces

7.1.2 Linear systems on vector bundles

7.2 Linearisation of system models

7.2.1 Linearisation of differential inclusion systems

7.2.2 Linearisation of control systems

7.2.3 Linearisation of control-affine systems

7.2.4 Linearisation of affine systems


Chapter 8 Variations and the reachable set

In Chapter 6, while defining the notions of trajectories for our various classes of systems, we gave definitions of reachable sets. In this chapter we describe in detail some of the properties of these reachable sets. We begin by describing some of the basic structure of reachable sets. We then turn to providing one of the fundamental tools for discovering properties of the reachable set: the variation. Our description of variations is facilitated by an understanding of jet bundles for maps and for sections of vector bundles. We provide a very quick background in Section 8.1. Regarding variations, we have already seen something of these in Chapter 7. However, the variations considered in the development of linearisation were, tautologically, of first order. The introduction of higher-order variations is an essential part of the study of controllability theory and optimal control theory that we shall study in Chapters 9 and 10, respectively. We consider variations of various sorts corresponding to the different system models we introduced in Chapter 6. In each case we show how the sets of variations approximate, in some sense, the reachable set.

8.1 Jet bundles of various sorts

In the development of variations that we undertake in this chapter, we will use jet bundle concepts and notations. We shall utilise two principal sorts of jet bundles: (1) jet bundles associated with sections of a vector bundle (specifically the tangent bundle) and (2) jet bundles associated with maps between manifolds. Since a knowledge of the detailed structure of jet bundles is not always part of the repertoire of the control theoretician, we provide a somewhat detailed exposition of jet bundle structure. However, it is still the case that our presentation will seem rushed to someone unfamiliar with the ways of jet bundles. Such readers are directed to [Saunders 1989] and [Kolář, Michor, and Slovák 1993, Chapter 4].

8.1.1 The symmetric algebra of a vector space

Jet bundles have a rich algebraic structure that we will rely heavily upon in our characterisations of variations. Much of this algebraic structure comes from the symmetric algebra structure of partial derivatives. In this section we say some things about symmetric algebras, since there can be mixed conventions here to confuse things. Let


$I_S(V)$ be the two-sided ideal of $T(V)$ generated by elements of the form
\[ v_1 \otimes v_2 - v_2 \otimes v_1, \qquad v_1, v_2 \in V. \]
The symmetric algebra of $V$ is the $\mathbb{R}$-algebra $S(V) = T(V)/I_S(V)$. The product in $S(V)$ induced by the tensor product is denoted by
\[ (A_1 + I_S(V)) \odot (A_2 + I_S(V)) \triangleq A_1 \otimes A_2 + I_S(V). \]
A related, but not quite identical, object to an element of the symmetric algebra is a symmetric tensor, by which we mean an element $A \in T^k(V)$ such that $A(v_{\sigma(1)}, \dots, v_{\sigma(k)}) = A(v_1, \dots, v_k)$ for every $\sigma \in \mathfrak{S}_k$. The set of symmetric elements of $T^k(V)$ is denoted by $TS^k(V)$, and we denote $TS(V) = \bigoplus_{k \in \mathbb{Z}_{\ge 0}} TS^k(V)$. We define a projection $\mathrm{Sym}_k : T^k(V) \to TS^k(V)$ by
\[ \mathrm{Sym}_k(A) = \frac{1}{k!} \sum_{\sigma \in \mathfrak{S}_k} \sigma(A). \]
Then, if $k, l \in \mathbb{Z}_{\ge 0}$, and if $A \in TS^k(V)$ and $B \in TS^l(V)$, we can define
\[ A \odot B = \frac{(k + l)!}{k!\, l!}\, \mathrm{Sym}_{k+l}(A \otimes B), \]
which makes $TS(V)$ into a commutative $\mathbb{R}$-algebra. Then there exists a unique algebra homomorphism $\sigma_V : S(V) \to TS(V)$ such that the diagram
\[ \begin{array}{ccc} T(V) & \longrightarrow & S(V) \\ & {\scriptstyle k!\,\mathrm{Sym}_k} \searrow & \big\downarrow {\scriptstyle \sigma_V} \\ & & TS(V) \end{array} \]
commutes.

8.1.2 Jet bundles of vector bundles

Let $\pi_V : V \to M$ be a vector bundle over $M$. In this paper we shall only be interested in the tangent bundle $\pi_{TM} : TM \to M$, but there is some benefit to being general in our initial exposition. Let $k \in \mathbb{Z}_{\ge 0}$ and let $x \in M$. Sections $\xi, \eta \in \Gamma^\infty(\pi_V)$ agree to order $k$ at $x$ if, in vector bundle coordinates about $x$, $\xi$ and $\eta$ and their first $k$ derivatives agree when evaluated at $x$. This definition is independent of choice of coordinates (by the Chain Rule for higher-order derivatives; see Lemma 1 from the proof of


Theorem 2.2.4) and denes an equivalence relation on the set of sections of V . An equivalence class is called a k-jet of sections and the set of equivalence classes is called the bundle of k-jets and denoted by Jk V . For k, l Z0 with k l we have the natural projection (V )k : Jk V Jl V sending the k-jet to the l-jet; this obviously makes sense l since if sections agree to order k they also agree to order l k. Note that J0 V is naturally identied with V. The composition (V )k V (V )k can be shown to give a 1 k bundle structure to (V )k : J V M. Vector bundle coordinates for V induce natural coordinates for Jk V as the rst k Taylor series coecients for sections. Thus, if V is locally of the form U Rm with U an open subset of Rn , Jk V is locally parameterised by k n m n m U Rm L(Rn ; Rm ) L2 sym (R ; R ) Lsym (R ; R ). If (V ) then we denote by jk ((V )k ) the corresponding section of the k-jet bundle. It is easy to show that if k Jk V then there exists a section (V ) such that jk ((V )k (k )) = k . Thus we can, without loss of generality, write a typical point in Jk V as jk (x) for some section and some appropriate x M. For x M we k shall denote by Jk x V the set of k-jets of the form j (x). One can see that (V )k : Jk V M is a vector bundle if we dene vector bundle operations jk (x) + jk (x) = jk ( + )(x), a( jk (x)) = jk (a)(x) for , (V ) and a R. It is not the case that (V )k : Jk V Jl V is a vector bundle, l although (V )k is a vector bundle mapping. For l = k 1, however, this bundle does l have some important structure. Indeed, (V )k : Jk V Jk1 V is an ane bundle k 1 modelled on the pull-back of the bundle Sk (T M) V to Jk1 V . Let us explicitly dene the ane structure to which we allude. Let jk (x) Jk V , let f1 , . . . , fk C (M) have the property that f1 (x) = = fk (x) = 0, and let (V ). The ane structure is then dened by its satisfying jk (x) + (d f1 (x) d fk (x)) (x) = jk ( + ( f1 fk ))(x).

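A convenient means of depicting the affine structure of $(\pi_V)^k_{k-1}$ is through the following short exact sequence of vector bundles:
\[
0\longrightarrow S^k(T^*M)\otimes V\xrightarrow{\ \epsilon_k\ }J^kV\xrightarrow{\ (\pi_V)^k_{k-1}\ }J^{k-1}V\longrightarrow0.\tag{8.1}
\]
Here $\epsilon_k$ is the injection defined by
\[
\epsilon_k\bigl((df_1(x)\odot\dots\odot df_k(x))\otimes\eta(x)\bigr)=j_k((f_1\cdots f_k)\eta)(x).
\]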
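As an illustration that is not part of the original development, the affine structure can be checked by hand in the simplest possible case. The sketch below assumes the trivial line bundle $V=\mathbb{R}\times\mathbb{R}$ over $M=\mathbb{R}$ with $k=2$, and uses derivative coordinates $(x,\xi(x),\xi'(x),\xi''(x))$ on $J^2V$; this choice of coordinates and the symbols $\xi$, $\eta$, $f_1$, $f_2$ are assumptions made only for the example.

```latex
% Assumed setting: M = R, V = R x R (trivial line bundle), k = 2.
% A section is a function \xi; its 2-jet at x_0 in derivative coordinates is
\[
  j_2\xi(x_0) = \bigl(x_0,\ \xi(x_0),\ \xi'(x_0),\ \xi''(x_0)\bigr).
\]
% Take f_1(x) = f_2(x) = x - x_0, so that f_1(x_0) = f_2(x_0) = 0,
% and let \eta be another section.  By the product rule,
\[
  \frac{d}{dx}\Big|_{x_0}\bigl((x-x_0)^2\eta(x)\bigr) = 0,
  \qquad
  \frac{d^2}{dx^2}\Big|_{x_0}\bigl((x-x_0)^2\eta(x)\bigr) = 2\,\eta(x_0),
\]
% and therefore
\[
  j_2\bigl(\xi + f_1 f_2\,\eta\bigr)(x_0)
  = \bigl(x_0,\ \xi(x_0),\ \xi'(x_0),\ \xi''(x_0) + 2\,\eta(x_0)\bigr).
\]
```

Only the top-order coordinate moves, and the underlying 1-jet is unchanged, which is what exactness of the sequence (8.1) at $J^kV$ predicts in this case.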

In particular, this shows that the restriction of $J^kV$ to the zero section of $J^{k-1}V$ is isomorphic to $S^k(T^*M)\otimes V$, a fact that we will make use of.

We shall also wish to have at our disposal infinite jets. These are obtained as the projective limit as $k\to\infty$ of the jet bundles $J^kV$. Explicitly, $J^\infty V$ denotes the set of maps $\Xi\colon\mathbb{Z}_{\ge0}\to\bigcup_{k\in\mathbb{Z}_{\ge0}}J^kV$ having the following two properties:


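1. $\Xi(k)\in J^kV$, $k\in\mathbb{Z}_{\ge0}$;
2. $(\pi_V)^k_l(\Xi(k))=\Xi(l)$ for $k,l\in\mathbb{Z}_{\ge0}$ satisfying $k\ge l$.

It is possible to consider topological and differentiable structures for $J^\infty V$, but we shall not need these here. It is sufficient for our purposes to consider $J^\infty V$ as a set, noting that $J^\infty_xV=\{\Xi\in J^\infty V\mid\Xi(0)\in V_x\}$ is a $\mathbb{R}$-vector space with operations
\[
(\Xi+\Theta)(k)=\Xi(k)+\Theta(k),\qquad(a\Xi)(k)=a(\Xi(k))
\]
for $k\in\mathbb{Z}_{\ge0}$, $\Xi,\Theta\in J^\infty_xV$, and $a\in\mathbb{R}$. For a section $\xi\in\Gamma^\infty(V)$ we denote by $j_\infty\xi(x)\in J^\infty_xV$ the element defined by $j_\infty\xi(x)(k)=j_k\xi(x)$. By Theorem 2.1.4, if $\Xi\in J^\infty_xV$ then there exists a section $\xi\in\Gamma^\infty(V)$ such that $j_\infty\xi(x)=\Xi$.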

8.1.3 Jet bundles of maps between manifolds

We shall also require another sort of jet bundle. Here we let $M$ and $N$ be manifolds. Let $k\in\mathbb{Z}_{\ge0}$ and let $x\in M$. Maps $\phi,\psi\in C^\infty(M,N)$ agree to order $k$ at $x$ if $\phi(x)=\psi(x)$ and if, in a chart for $M$ about $x$ and a chart for $N$ about $\phi(x)=\psi(x)$, the first $k$ derivatives of $\phi$ and $\psi$ agree when evaluated at $x$. As with the construction in the previous section, this definition is independent of coordinate charts by the higher-order Chain Rule. This also defines an equivalence relation on $C^\infty(M,N)$, and we call an equivalence class a $k$-jet of maps. The set of equivalence classes we denote by $J^k(M;N)$. For $k,l\in\mathbb{Z}_{\ge0}$ with $k\ge l$ we again have a natural projection, this denoted by $\rho^k_l\colon J^k(M;N)\to J^l(M;N)$. We have a natural identification of $J^0(M;N)$ with $M\times N$. One makes $J^k(M;N)$ a manifold by using the Taylor coefficients of maps as coordinates. Thus if $M$ is locally modelled by an open subset $U\subset\mathbb{R}^n$ and if $N$ is locally modelled by an open subset $V\subset\mathbb{R}^m$, then $J^k(M;N)$ is locally parameterised by
\[
U\times V\times L(\mathbb{R}^n;\mathbb{R}^m)\times L^2_{\mathrm{sym}}(\mathbb{R}^n;\mathbb{R}^m)\times\dots\times L^k_{\mathrm{sym}}(\mathbb{R}^n;\mathbb{R}^m).
\]

If $\phi\in C^\infty(M,N)$ then we denote by $j_k\phi\colon M\to J^k(M;N)$ the map assigning to $x$ the $k$-jet of $\phi$ at $x$. As with jets of sections, it is easy to see that if $\Phi_k\in J^k(M;N)$ then there exists $\phi\in C^\infty(M,N)$ such that $j_k\phi(x)=\Phi_k$ for an appropriate $x\in M$. For $x\in M$ and $y\in N$ we denote by $J^k_{(x,y)}(M;N)$ the set of $k$-jets of the form $j_k\phi(x)$ where $\phi(x)=y$.

There is a useful characterisation of $J^k_{(x,y)}(M;N)$ as follows. First note that, for $x\in M$, $J^k_{(x,0)}(M;\mathbb{R})$ is a $\mathbb{R}$-algebra with vector space operations
\[
j_kf(x)+j_kg(x)=j_k(f+g)(x),\qquad a(j_kf(x))=j_k(af)(x),
\]
and with product
\[
(j_kf(x))\cdot(j_kg(x))=j_k(fg)(x).
\]
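A small computation may help make the product of jets concrete; it is offered here as an illustrative sketch, with the particular functions chosen only for the example. Take $M=\mathbb{R}$, $x=0$, and $k=2$, and represent each jet by its Taylor polynomial of degree at most $2$ with zero constant term.

```latex
% Assumed example: M = R, x = 0, k = 2, with f(t) = t and g(t) = t + t^2.
% Both functions vanish at 0, so their jets lie in J^2_{(0,0)}(R; R).
\[
  f(t)\,g(t) = t^2 + t^3
  \quad\Longrightarrow\quad
  (j_2 f(0))\cdot(j_2 g(0)) = j_2(fg)(0) \ \text{ is represented by } \ t^2 .
\]
% The cubic term is discarded because 2-jets only record derivatives
% up to order 2.  In particular the algebra is nilpotent:
\[
  (j_2 f(0))^3 = j_2(t^3)(0) = 0 .
\]
```

The product is thus multiplication of polynomials followed by truncation at degree $k$, which is why the algebra has no unit and every element is nilpotent.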


To make our notation more wieldy, let us abbreviate, following [Kolář, Michor, and Slovák 1993, Section 12.8], $T^{*k}_xM=J^k_{(x,0)}(M;\mathbb{R})$. In like manner, of course, $J^k_{(y,0)}(N;\mathbb{R})$ is a $\mathbb{R}$-algebra which we denote by $T^{*k}_yN$. Now, for $j_k\phi(x)\in J^k_{(x,y)}(M;N)$ define a map
\[
\hom(j_k\phi(x))\colon T^{*k}_yN\to T^{*k}_xM,\qquad j_kf(y)\mapsto j_k(f\circ\phi)(x).
\]
This map may be shown to be a homomorphism of $\mathbb{R}$-algebras. Moreover, the map $J^k_{(x,y)}(M;N)\ni j_k\phi(x)\mapsto\hom(j_k\phi(x))$ can be shown to be a bijection, and so this establishes $J^k_{(x,y)}(M;N)$ as essentially being the set $\mathrm{Hom}(T^{*k}_yN;T^{*k}_xM)$ of $\mathbb{R}$-algebra homomorphisms. In particular, $J^k_{(x,y)}(M;N)$ is itself a $\mathbb{R}$-algebra with product defined by
\[
\hom(j_k\phi\cdot j_k\psi)(j_kf(y))=\hom(j_k\phi)(j_kf(y))\cdot\hom(j_k\psi)(j_kf(y)).
\]
We shall frequently omit the "hom" and directly regard elements of $J^k_{(x,y)}(M;N)$ as algebra homomorphisms without comment.

Unlike the situation with jets of sections of a vector bundle, the bundles of $k$-jets of maps do not form a vector bundle. We do still have an affine structure, however. In this case, $\rho^k_{k-1}\colon J^k(M;N)\to J^{k-1}(M;N)$ is an affine bundle modelled on the pull-back of the vector bundle $S^k(T^*M)\otimes TN$ to $J^{k-1}(M;N)$. Let us describe this affine structure explicitly. To do this we use our interpretation above of elements of $J^k_{(x,y)}(M;N)$ as algebra homomorphisms. We first establish an injection $\epsilon_k\colon S^k(T^*_xM)\otimes T_yN\to J^k_{(x,y)}(M;N)$. Let $j_kg(y)\in T^{*k}_yN$, let $f_1,\dots,f_k\in C^\infty(M)$ satisfy $f_1(x)=\dots=f_k(x)=0$, and let $Y$ be a vector field on $N$. One can then show that $j_kg(y)\mapsto j_k(\mathscr{L}_Yg(y)\,f_1\cdots f_k)(x)$ is a homomorphism of the $\mathbb{R}$-algebras $T^{*k}_yN$ and $T^{*k}_xM$. Moreover, one easily shows that the map sending $(df_1(x)\odot\dots\odot df_k(x))\otimes Y(y)$ to this homomorphism is well-defined, giving the injection $\epsilon_k$. Now we can use this to describe the affine structure of $\rho^k_{k-1}\colon J^k(M;N)\to J^{k-1}(M;N)$. Let $j_k\phi(x)\in J^k_{(x,y)}(M;N)$ and let $A(x)\otimes Y(y)\in S^k(T^*_xM)\otimes T_yN$. One may then show that $j_k\phi(x)+A(x)\otimes Y(y)$, addition being that of algebra homomorphisms, is well-defined and gives the desired affine structure. This affine structure is conveniently represented by the following exact sequence of vector spaces:
\[
0\longrightarrow S^k(T^*_xM)\otimes T_yN\xrightarrow{\ \epsilon_k\ }J^k_{(x,y)}(M;N)\xrightarrow{\ \rho^k_{k-1}\ }J^{k-1}_{(x,y)}(M;N).
\]

As with jets of sections of vector bundles, we will want to consider the projective limit as $k\to\infty$ of $J^k(M;N)$, which we denote by $J^\infty(M;N)$. Explicitly, an element of $J^\infty(M;N)$ is a map $\Phi\colon\mathbb{Z}_{\ge0}\to\bigcup_{k\in\mathbb{Z}_{\ge0}}J^k(M;N)$ having the following two properties:
1. $\Phi(k)\in J^k(M;N)$, $k\in\mathbb{Z}_{\ge0}$;
2. $\rho^k_l(\Phi(k))=\Phi(l)$ for $k,l\in\mathbb{Z}_{\ge0}$ satisfying $k\ge l$.


We shall content ourselves with simply the set structure of $J^\infty(M;N)$. We shall also use the fact that $J^\infty_{(x,y)}(M;N)=\{\Phi\in J^\infty(M;N)\mid\Phi(0)=(x,y)\}$ is identified with the set of $\mathbb{R}$-algebra homomorphisms from $J^\infty_{(y,0)}(N;\mathbb{R})$ to $J^\infty_{(x,0)}(M;\mathbb{R})$. To describe this, we should first give the algebra structure on $T^{*\infty}_xM=J^\infty_{(x,0)}(M;\mathbb{R})$. This is defined by the vector space operations
\[
(F+G)(k)=F(k)+G(k),\qquad(aF)(k)=a(F(k)),
\]
and the product
\[
(F\cdot G)(k)=F(k)\cdot G(k),
\]
for $k\in\mathbb{Z}_{\ge0}$, $F,G\in T^{*\infty}_xM$, and $a\in\mathbb{R}$. Of course, similar definitions hold for $T^{*\infty}_yN$. Now let $\Phi\in J^\infty_{(x,y)}(M;N)$ and $F\in T^{*\infty}_yN$ and use Borel's Theorem to write $\Phi=j_\infty\phi(x)$ and $F=j_\infty f(y)$ for $\phi\in C^\infty(M,N)$ and $f\in C^\infty(N)$. Here $j_\infty\phi(x)$ and $j_\infty f(y)$ are defined by
\[
j_\infty\phi(x)(k)=j_k\phi(x),\qquad j_\infty f(y)(k)=j_kf(y),\qquad k\in\mathbb{Z}_{\ge0}.
\]
Then define
\[
\Phi(F)=j_\infty(f\circ\phi)(x),
\]
so identifying $J^\infty_{(x,y)}(M;N)$ with the set $\mathrm{Hom}(T^{*\infty}_yN;T^{*\infty}_xM)$ of $\mathbb{R}$-algebra homomorphisms.

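8.1.4 The structure of jets of maps between Euclidean spaces

Let us consider briefly a special case of the preceding constructions that will be of particular interest. The following lemma is of use. We denote
\[
S^{\le k}(V)=\bigoplus_{j=1}^kS^j(V),
\]
for $k\in\mathbb{Z}_{>0}$ and for a $\mathbb{R}$-vector space $V$. For the map
\[
\delta_k\colon V\to T^{\le k}(V),\qquad v\mapsto v\oplus(v\otimes v)\oplus\dots\oplus(v\otimes\dots\otimes v),\tag{8.2}
\]
we have the following lemma.

8.1.1 Lemma If $U$ and $V$ are $\mathbb{R}$-vector spaces with $V$ finite-dimensional and if $\phi\colon\mathrm{image}(\delta_k)\to U$, then there exists a unique $\hat\phi\in L(S^{\le k}(V);U)$ such that $\hat\phi\,|\,\mathrm{image}(\delta_k)=\phi$.

Proof For $j\in\{1,\dots,k\}$ let $\iota_j\colon S^j(V)\to S^{\le k}(V)$ be the inclusion of the $j$th factor in the direct sum. Let $\delta^j\colon V\to S^j(V)$ be defined by $\delta^j(v)=v\otimes\dots\otimes v$. Then define $\phi_j\colon\mathrm{image}(\delta^j)\to U$ by $\phi_j=\phi\circ\iota_j$. Then, as in [Bourbaki 1990, Proposition V.9.13], there exists a unique map $\hat\phi_j\in L(S^j(V);U)$ such that $\hat\phi_j\,|\,\mathrm{image}(\delta^j)=\phi_j$. Now define $\hat\phi\colon S^{\le k}(V)\to U$ by
\[
\hat\phi\bigl(v_{11}\oplus(v_{21}\odot v_{22})\oplus\dots\oplus(v_{k1}\odot\dots\odot v_{kk})\bigr)=\hat\phi_1(v_{11})+\hat\phi_2(v_{21}\odot v_{22})+\dots+\hat\phi_k(v_{k1}\odot\dots\odot v_{kk}),
\]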


with the extension to all of $S^{\le k}(V)$ being made by enforcing linearity. The well-definedness and linearity of $\hat\phi$ follow from the well-definedness and linearity of the maps $\hat\phi_1,\hat\phi_2,\dots,\hat\phi_k$. Uniqueness of $\hat\phi$ follows from the fact that its restriction to each factor $S^j(V)$ in the direct sum $\bigoplus_{j=1}^kS^j(V)$ is uniquely determined by the maps $\phi_j$, $j\in\{1,\dots,k\}$.

Let us abbreviate $(\mathbb{R}^p)^{*k}=J^k_{(0_p,0)}(\mathbb{R}^p;\mathbb{R})$. We note that, by Lemma 8.1.1,
\[
(\mathbb{R}^p)^{*k}\simeq\bigoplus_{j=1}^kL^j_{\mathrm{sym}}(\mathbb{R}^p;\mathbb{R}),
\]
and so elements of $(\mathbb{R}^p)^{*k}$ can be thought of as polynomial functions on $\mathbb{R}^p$ of degree $k$ and having zero constant term. Explicitly, if $j_kf(0_p)\in(\mathbb{R}^p)^{*k}$ then one has the associated polynomial function
\[
v\mapsto Df(0_p)\cdot v+\frac{1}{2!}D^2f(0_p)\cdot(v,v)+\dots+\frac{1}{k!}D^kf(0_p)\cdot(v,\dots,v),
\]

which is simply the truncation of the Taylor series to order $k$. We then have the following lemma which will be of subsequent interest.

8.1.2 Lemma If $k,p\in\mathbb{Z}_{>0}$ then
\[
j_kf(0_p)(v)=\frac{df(sv)}{ds}\bigg|_{s=0}+\dots+\frac{1}{k!}\frac{d^kf(sv)}{ds^k}\bigg|_{s=0},
\]
where we think of $j_kf(0_p)$ as a polynomial function.

Proof By an elementary induction we have
\[
\frac{d^jf(sv)}{ds^j}=D^jf(sv)\cdot(v,\dots,v).
\]
From this the lemma immediately follows.
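The formula of Lemma 8.1.2 is easy to exercise on a specific function. The following worked example is an added illustration; the function $f$ and the values $p=2$, $k=3$ are chosen here purely for the computation.

```latex
% Assumed example: p = 2, k = 3, f(x_1, x_2) = \sin x_1 + x_1 x_2, v = (v_1, v_2).
\[
  f(sv) = \sin(s v_1) + s^2 v_1 v_2 .
\]
% Derivatives along the ray s \mapsto sv, evaluated at s = 0:
\[
  \frac{d f(sv)}{ds}\Big|_{s=0} = v_1, \qquad
  \frac{d^2 f(sv)}{ds^2}\Big|_{s=0} = 2 v_1 v_2, \qquad
  \frac{d^3 f(sv)}{ds^3}\Big|_{s=0} = -v_1^3 .
\]
% Assembling with the factors 1/j! gives the polynomial representing the 3-jet:
\[
  j_3 f(0_2)(v) = v_1 + v_1 v_2 - \tfrac{1}{6} v_1^3 ,
\]
% which is the Taylor polynomial of f at the origin truncated at degree 3.
```

Computing one-variable derivatives along rays in this way avoids writing out the symmetric multilinear maps $D^jf(0_p)$ explicitly.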

8.1.5 Higher-order tangent vectors for nets

In this section we introduce what we call higher-order tangent vectors at a point. The typical way this is done is via curves. However, it is useful to also allow sequences in the construction, as variations sometimes naturally arise using sequential constructions [Agrachev and Gamkrelidze 1993, §5]. We capture the constructions here in the language of directed sets and nets. Let us recall the definitions.

8.1.3 Definition (Directed set, net) A directed set is a pair $(D,\preceq)$ with $D$ being a set and $\preceq$ a binary relation with the following properties:
(i) $a\preceq a$ for every $a\in D$;
(ii) $a\preceq b$ and $b\preceq c$ implies that $a\preceq c$;
(iii) for each $a,b\in D$, there exists $c\in D$ such that $a,b\preceq c$.


If $\mathcal{T}$ is a topological space, a $D$-net is a map $\phi\colon D\to\mathcal{T}$. A $D$-net $\phi\colon D\to\mathcal{T}$ converges to $x_0\in\mathcal{T}$ if, for every neighbourhood $U$ of $x_0$, there exists $a\in D$ such that $\phi(b)\in U$ for every $b$ satisfying $a\preceq b$. We then write $x_0=\lim\phi$.

Let $M$ be a smooth manifold and let $x_0\in M$. We shall consider directed sets and associated nets taking values in $M$.
1. We first consider the set $D=\mathbb{R}_{>0}$ equipped with the relation $a\preceq b$ if and only if $b\le a$. The nets we consider here are of the form $\phi\colon\mathbb{R}_{>0}\to M$ such that $\lim\phi=x_0$. This means that $\lim_{a\to0}\phi(a)=x_0$. We do not necessarily require that the map $\phi$ have any continuity properties. However, if $\phi$ is of class $C^r$ as a map in the usual sense, we shall say the net is of class $C^r$.
2. The second directed set we consider is $D=\mathbb{Z}_{>0}$ equipped with the relation $j\preceq k$ if and only if $j\le k$. The nets we consider are of the form $\phi\colon\mathbb{Z}_{>0}\to M$ such that $\lim\phi=x_0$. This means, in this case, that $\lim_{j\to\infty}\phi(j)=x_0$.

Let us call any net of the above form a natural net at $x_0$. For $D\in\{\mathbb{R}_{>0},\mathbb{Z}_{>0}\}$ and for $a\in D$, let us denote
\[
s_a=\begin{cases}a,&D=\mathbb{R}_{>0},\\[2pt]\dfrac{1}{a},&D=\mathbb{Z}_{>0},\end{cases}
\]
so that $\lim_{a\to0}s_a=0$ when $D=\mathbb{R}_{>0}$ and $\lim_{a\to\infty}s_a=0$ when $D=\mathbb{Z}_{>0}$. The following lemma ensures that the constructions we are about to make are independent of coordinates.

8.1.4 Lemma (Independence of higher-order tangent vectors on coordinate chart) Let $M$ be a smooth manifold, let $x_0\in M$, let $D\in\{\mathbb{R}_{>0},\mathbb{Z}_{>0}\}$, and let $\phi\colon D\to M$ be a natural net at $x_0$. For $k\in\mathbb{Z}_{>0}$, there exists at most one $v_0\in T_{x_0}M$ such that, for every chart $(U,\chi)$ about $x_0$, the local representative $\phi_\chi$ of $\phi$ satisfies
\[
\lim_{s_a\to0}\frac{\phi_\chi(a)-(s_a)^kv}{(s_a)^k}=0,
\]
where $v\in\mathbb{R}^n$ is the local representative of $v_0$.

With the preceding as setup, we make the following definition.

8.1.5 Definition (Tame natural nets) Let $M$ be a smooth manifold, let $x_0\in M$ and let $k\in\mathbb{Z}_{>0}$. A natural net $\phi$ at $x_0$ is tame if there exists $v_0\in T_{x_0}M$ satisfying the conclusions of Lemma 8.1.4. The vector $v_0$ is the $k$th-order tangent vector to the net $\phi$.
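To see how the order $k$ enters, it may help to work out a sequential net by hand. The following sketch is an added illustration; it assumes $M=\mathbb{R}$, $x_0=0$, the identity chart, and writes $s_j=1/j$ for the scaling parameter attached to $j\in\mathbb{Z}_{>0}$ (so that $s_j\to0$ as $j\to\infty$). Only the identity chart is checked here.

```latex
% Assumed example: D = Z_{>0}, M = R, x_0 = 0, net \phi(j) = 1/j^2, s_j = 1/j.
% Second-order test (k = 2):
\[
  \frac{\phi(j) - (s_j)^2 v}{(s_j)^2}
  = \frac{j^{-2} - j^{-2} v}{j^{-2}} = 1 - v ,
\]
% which tends to 0 if and only if v = 1.
% First-order test (k = 1):
\[
  \frac{\phi(j) - s_j v}{s_j}
  = \frac{j^{-2} - j^{-1} v}{j^{-1}} = \frac{1}{j} - v
  \ \longrightarrow\ -v \quad (j \to \infty),
\]
% which tends to 0 if and only if v = 0.
```

In this chart the net therefore has first-order tangent vector $0$ and second-order tangent vector $1$: the sequence approaches $x_0$ too quickly to be seen at first order, and the second-order condition recovers its leading behaviour.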

8.2 Properties of the reachable set for differential inclusion systems


In this section we consider some of the basic properties of the reachable sets for differential inclusion systems. Of course, the results will apply to affine systems as well, since these are also differential inclusion systems.


8.2.1 The topology of the reachable set for a differential inclusion system

We begin by investigating some of the basic properties of the reachable set for a differential inclusion system. These basic properties can be important, for example, when studying the existence of optimal control laws. We have the following result.

8.2.1 Theorem (Topology of reachable set for differential inclusion systems) Let $\Sigma=(M,F,\mathbb{T})$ be a convex-valued and compact-valued differential inclusion system, let $(t_0,x_0)\in\mathbb{T}\times M$, and let $T\in\mathbb{T}\cap[t_0,\infty)$. If there exists a compact set $K\subset M$ such that
\[
\{\xi(t)\mid t\in[t_0,T],\ \xi\in\mathrm{Traj}(\Sigma,[t_0,T])\}\subset K,
\]
then the following statements hold:
(i) $\mathcal{R}_\Sigma(x_0,t_0,T)$ is compact and connected;
(ii) the set-valued map $[t_0,T]\ni t\mapsto\mathcal{R}_\Sigma(x_0,t_0,t)$ is continuous.

8.3 Properties of reachable sets for control systems


In this section we shall prove a few fundamental results about the character of the reachable set. We begin by proving some boundedness and compactness results about the reachable set. After this, we consider some important results concerning sets of states reachable by simple classes of controls. Of particular interest are results concerning states reachable by piecewise constant controls.

8.3.1 Topology of the reachable set for control systems

In this section we investigate the basic topological properties of the reachable set. These basic properties can be important, for example, when studying the existence of optimal control laws. We have the following result.

8.3.1 Theorem (Topology of the reachable set for control systems) Let $r\in\mathbb{Z}_{>0}\cup\{\infty,\omega\}$ and let $\Sigma=(M,F,C,\mathbb{T})$ be a control system of class $C^r$. If $\{F(t,x,u)\mid u\in C\}\subset T_xM$ is compact and convex for each $x\in M$ and if there exists a compact set $K\subset M$ such that
\[
\{\xi(t)\mid t\in[t_0,T],\ \xi\in\mathrm{Traj}(\Sigma,[t_0,T])\}\subset K,
\]
then the following statements hold:
(i) $\mathcal{R}_\Sigma(x_0,t_0,T)$ is compact;
(ii) the set-valued map $[t_0,T]\ni t\mapsto\mathcal{R}_\Sigma(x_0,t_0,t)$ is continuous.

8.3.2 States reachable by subsets of trajectories

Sometimes it is useful to be able to approximate trajectories generated by arbitrary controls with trajectories generated by simpler controls. The following result indicates one case in which this can be done.

8.3.2 Theorem (Approximation using piecewise constant controls) Let $r\in\mathbb{Z}_{>0}\cup\{\infty,\omega\}$, let $\Sigma=(M,F,C)$ be a time-independent control system of class $C^r$, let $x_0\in M$, and let $T\in\mathbb{R}_{>0}$. If $x\in\mathcal{R}_\Sigma(x_0,T)$ then there exists a sequence $(y_j)_{j\in\mathbb{Z}_{>0}}$ in $\mathcal{R}_\Sigma(x_0,T,L_{\mathrm{pwc}}([0,T];C))$ such that $x=\lim_{j\to\infty}y_j$.

8.4 Variations for differential inclusion systems


Now we turn to the study of variations for our various system models. Variations can be thought of as being infinitesimal approximations to the reachable set, and as such form our principal tool for understanding the structure of the reachable set. Before we study variations in detail, we consider a seemingly unrelated and perhaps strange construction with jet bundles.

In this section we consider the construction of variations for very special classes of differential inclusion systems, systems that are regular enough to be smoothly selectioned. The presentation here follows [Aguilar and Lewis 2008]. Let us first define precisely the sort of system we consider.

8.4.1 Definition ($C^r$-selectionable differential inclusion system) Let $r\in\mathbb{Z}_{\ge0}\cup\{\infty,\omega\}$. A time-independent differential inclusion system $\Sigma=(M,F)$ is $C^r$-selectionable if, for every $x_0\in M$, there exists a neighbourhood $N$ such that, if $v_0\in F(x_0)$, then there exists a $C^r$-selection of $F|N$.

Obviously, smoothly selectionable, i.e., $C^\infty$-selectionable, differential inclusion systems are very special. However, for example, the differential inclusion system associated with a time-independent control-affine system of class $C^\infty$ is smoothly selectionable, and this is the sort of thing we have in mind.

8.4.1 An algebro-geometric construction

Before making constructions for a smoothly selectionable differential inclusion system, we make a substantial detour to discuss some constructions that have no relation to any sort of control system, but are just geometric constructions on the state manifold.

8.4.2 Definition (Multitrajectories, variations) Let $M$ be a manifold, let $x_0\in M$, and let $\xi=(\xi_1,\dots,\xi_p)\subset\Gamma^\infty(TM)$ be such that $\xi_j$ is complete for every $j\in\{1,\dots,p\}$.
(i) The $C^r$-map
\[
\Phi^{\xi}_{x_0}\colon\mathbb{R}^p_{\ge0}\to M,\qquad(t_1,\dots,t_p)\mapsto\Phi^{\xi_p}_{t_p}\circ\dots\circ\Phi^{\xi_1}_{t_1}(x_0)
\]


is the $\xi$-multitrajectory.
(ii) A $p$-end-time variation is a $C^\infty$-map $\tau\colon\mathbb{R}_{\ge0}\to\mathbb{R}^p_{\ge0}$ with the property that $\tau(0)=0_p$. The set of $p$-end-time variations is denoted by $\mathrm{ET}_p$.
(iii) Let $\tau$ be a $p$-end-time variation. The order of the pair $(\xi,\tau)$ at $x_0$, denoted $\mathrm{ord}_{x_0}(\xi,\tau)$, is the smallest positive integer $k$ such that $j_k(\Phi^{\xi}_{x_0}\circ\tau)(0)\ne0_{x_0}$ (derivatives are assumed to be taken from the right). If no such $k$ exists then the order is taken to be $\infty$.
(iv) Let $\tau$ be a $p$-end-time variation such that $(\xi,\tau)$ has finite order $k=\mathrm{ord}_{x_0}(\xi,\tau)$. The $(\xi,\tau)$-variation at $x_0$ is the curve $\nu_{\xi,\tau}\colon s\mapsto\Phi^{\xi}_{x_0}\circ\tau(s)$ and the $(\xi,\tau)$-infinitesimal variation is the tangent vector $V_{\xi,\tau}(x_0)\in T_{x_0}M$ defined by
\[
V_{\xi,\tau}(x_0)=j_k\nu_{\xi,\tau}(0)\in S^k(\mathbb{R}^*)\otimes T_{x_0}M\simeq T_{x_0}M\quad\text{(cf. Section 8.1)}
\]
(derivatives are assumed to be taken from the right).
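The definitions above can be traced through in a one-dimensional example. The sketch below is an added illustration, not taken from the source: it assumes $M=\mathbb{R}$, $x_0=0$, $p=2$, the complete vector fields $\xi_1=\partial/\partial x$ and $\xi_2=-\partial/\partial x$, and the Taylor-coefficient normalisation of Lemma 8.1.2 when identifying the $k$-jet with a tangent vector.

```latex
% Assumed example: M = R, x_0 = 0, \xi_1 = d/dx, \xi_2 = -d/dx.
% Flows: \Phi^{\xi_1}_t(x) = x + t and \Phi^{\xi_2}_t(x) = x - t, so the
% multitrajectory (flow along \xi_1 for time t_1, then \xi_2 for time t_2) is
\[
  \Phi^{\xi}_{x_0}(t_1, t_2) = t_1 - t_2 .
\]
% End-time variation with nonnegative components and \tau(0) = (0, 0):
\[
  \tau(s) = (s + s^2,\ s), \qquad
  \Phi^{\xi}_{x_0}\circ\tau(s) = s^2 .
\]
% The first derivative at s = 0 vanishes and the second does not, hence
\[
  \mathrm{ord}_{x_0}(\xi, \tau) = 2, \qquad
  V_{\xi,\tau}(x_0)
  = \frac{1}{2!}\,\frac{d^2}{ds^2}\Big|_{s=0} s^2 \;\frac{\partial}{\partial x}
  = \frac{\partial}{\partial x}\Big|_{x_0} .
\]
```

The first-order effects of the two flows cancel, so the variation is only visible at second order, where the extra $s^2$ of time spent along $\xi_1$ produces a tangent vector in the direction of $\xi_1$.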

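8.4.3 Remark (The completeness assumption) In our development we shall only be considering jets, so the assumption that the vector fields $\xi_j$, $j\in\{1,\dots,p\}$, be complete is immaterial since, for example, we can demand without loss of generality that they have compact support.

Let $\xi=(\xi_1,\dots,\xi_p)\subset\Gamma^\infty(TM)$, and let $\tau$ be a $p$-end-time variation with $\mathrm{ord}_{x_0}(\xi,\tau)=k$. Then $j_r(\Phi^{\xi}_{x_0}\circ\tau)(0)=0_{J^r_{x_0}M}$ for each $r\in\{1,\dots,k-1\}$. By the exact sequence (8.1) it follows that $j_k(\Phi^{\xi}_{x_0}\circ\tau)(0)\in S^k(\mathbb{R}^*)\otimes T_{x_0}M$. Since $S^k(\mathbb{R}^*)$ is essentially the collection of $\mathbb{R}$-valued polynomial functions on $\mathbb{R}$ of homogeneous degree $k$, it is canonically isomorphic to $\mathbb{R}$ (by evaluating all polynomials at $1$). Thus we can think of $j_k(\Phi^{\xi}_{x_0}\circ\tau)(0)$ as being an element of $T_{x_0}M$.

The first observation we make is that for $k\in\mathbb{Z}_{>0}$ we have
\[
j_k(\Phi^{\xi}_{x_0}\circ\tau)(0)=j_k\tau(0)\circ j_k\Phi^{\xi}_{x_0}(0_p),
\]
where we think of
\[
j_k\tau(0)\in\mathrm{Hom}((\mathbb{R}^p)^{*k};\mathbb{R}^{*k}),\qquad j_k\Phi^{\xi}_{x_0}(0_p)\in\mathrm{Hom}(T^{*k}_{x_0}M;(\mathbb{R}^p)^{*k})
\]
as homomorphisms of $\mathbb{R}$-algebras. Given $A\in J^k_{(x_0,0_p)}(\mathbb{R}^p;M)\simeq\mathrm{Hom}(T^{*k}_{x_0}M;(\mathbb{R}^p)^{*k})$ we define a map
\[
\Psi^k_{p,A}\colon\mathrm{Hom}((\mathbb{R}^p)^{*k};\mathbb{R}^{*k})\to\mathrm{Hom}(T^{*k}_{x_0}M;\mathbb{R}^{*k}),\qquad B\mapsto B\circ A.
\]
As a final bit of notation before we get to the important definitions, we note that we have a canonical identification
\[
J^k_{(0,0_p)}(\mathbb{R};\mathbb{R}^p)\simeq\bigoplus_{j=1}^kS^j(\mathbb{R}^*)\otimes\mathbb{R}^p\simeq\bigoplus_{j=1}^k\mathbb{R}^p.
\]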


With this identification we denote
\[
J^k_{(0,0_p)}(\mathbb{R}_{\ge0};\mathbb{R}^p_{\ge0})=\bigl\{(a_1,\dots,a_k)\in J^k_{(0,0_p)}(\mathbb{R};\mathbb{R}^p)\bigm|a_j(r_j)>0,\ j\in\{1,\dots,k\}\bigr\},
\]
where $r_j$ is the largest integer such that $a_j(r)=0$ for $r<r_j$. A moment's thought shows that $J^k_{(0,0_p)}(\mathbb{R}_{\ge0};\mathbb{R}^p_{\ge0})$ is the set of $k$-jets at $(0,0_p)$ of positive $p$-end-time variations. The idea is that the lowest nonzero term in the Taylor expansion of each component is positive.

With the preceding notation as backdrop, we make the following definition.

8.4.4 Definition (Kernels of algebra homomorphisms) For $x_0\in M$ and for $A\in\mathrm{Hom}(T^{*k}_{x_0}M;(\mathbb{R}^p)^{*k})$ let us denote
\[
Z(A)=\bigl\{j_k\tau(0)\in J^k_{(0,0_p)}(\mathbb{R}_{\ge0};\mathbb{R}^p_{\ge0})\bigm|\Psi^k_{p,A}(j_k\tau(0))=0\bigr\}.
\]

Now let us denote by $TM^p$ the $p$-fold Whitney sum of $TM$ with itself, and denote by $\pi^p_{TM}\colon TM^p\to M$ the canonical projection. For a family $\xi=(\xi_1,\dots,\xi_p)$ of $C^\infty$-vector fields on $M$, let us denote by $\xi$ the corresponding section of $TM^p$, accepting a convenient abuse of notation. In the statement of the next result we recall from (8.2) the definition of the map $\delta_k$. For $\mathbb{R}$-algebras $A$ and $B$ we recall that $\mathrm{Hom}(A;B)\subset L(A;B)$, but $\mathrm{Hom}(A;B)$ is not a subspace in general.

8.4.5 Theorem (A linear map characterising variations) For each $k,p\in\mathbb{Z}_{>0}$ there exists a unique map
\[
T^k_p(x_0)\in L\bigl(S^{\le k}(J^{k-1}_{x_0}TM^p);L(T^{*k}_{x_0}M;(\mathbb{R}^p)^{*k})\bigr)
\]
such that
\[
T^k_p(x_0)(\delta_k(j_{k-1}\xi(x_0)))=j_k\Phi^{\xi}_{x_0}(0_p)
\]
for every family $\xi=(\xi_1,\dots,\xi_p)$ of $C^\infty$-vector fields. Moreover, the diagram
\[
\begin{array}{ccccccc}
S^{\le1}(J^0_{x_0}TM^p)&\longleftarrow&S^{\le2}(J^1_{x_0}TM^p)&\longleftarrow&S^{\le3}(J^2_{x_0}TM^p)&\longleftarrow&\cdots\\[4pt]
\big\downarrow{\scriptstyle T^1_p(x_0)}&&\big\downarrow{\scriptstyle T^2_p(x_0)}&&\big\downarrow{\scriptstyle T^3_p(x_0)}&&\\[4pt]
\mathrm{Hom}(T^{*1}_{x_0}M;(\mathbb{R}^p)^{*1})&\longleftarrow&\mathrm{Hom}(T^{*2}_{x_0}M;(\mathbb{R}^p)^{*2})&\longleftarrow&\mathrm{Hom}(T^{*3}_{x_0}M;(\mathbb{R}^p)^{*3})&\longleftarrow&\cdots
\end{array}
\]
commutes, where the horizontal arrows are the canonical projections.

Note that we are actually not interested in the behaviour of $T^k_p(x_0)$ off the image of $\delta_k$. Thus, while $T^k_p(x_0)$ is linear, we are only interested in its restriction to the algebraic variety $\mathrm{image}(\delta_k)$. Moreover, $T^k_p(x_0)$ is system independent, depending on $k$, $p$, $\dim(M)$, and the Baker–Campbell–Hausdorff formula. The following result summarises how one should interpret the maps $T^k_p(x_0)$ in our context.


8.4.6 Proposition (Algebraic characterisation of variations) If $\xi=(\xi_1,\dots,\xi_p)$ is a family of $C^\infty$-vector fields on $M$, if $\tau$ is a $p$-end-time variation, if $x_0\in M$, and if $k,p\in\mathbb{Z}_{>0}$, then
\[
j_k(\Phi^{\xi}_{x_0}\circ\tau)(0)=j_k\tau(0)\circ T^k_p(x_0)(j_{k-1}\xi(x_0)).
\]
Moreover, if $j_k\tau(0)\in Z(T^k_p(x_0)(j_{k-1}\xi(x_0)))$ then $\mathrm{ord}_{x_0}(\xi,\tau)>k$.

While the notation $j_k(\Phi^{\xi}_{x_0}\circ\tau)(0)$ is more compact, the notation $j_k\tau(0)\circ T^k_p(x_0)(j_{k-1}\xi(x_0))$ better represents the structure of the problem.

The diagram from Theorem 8.4.5 allows us to take the limit as $k\to\infty$ in a natural (i.e., projective) way. To do this explicitly (i.e., without just writing "$\mathrm{proj\,lim}_{k\to\infty}$" in front of everything) requires us to introduce some notation. For $k,l\in\mathbb{Z}_{>0}$ with $l\le k$ we have a projection $\sigma^k_l\colon S^{\le k}(J^{k-1}_{x_0}TM^p)\to S^{\le l}(J^{l-1}_{x_0}TM^p)$ defined by
\[
\sigma^k_l(\delta_k(j_{k-1}\xi(x_0)))=\delta_l(j_{l-1}\xi(x_0)).
\]

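By Lemma 8.1.1, the preceding equation uniquely determines the linear map $\sigma^k_l$. Let us denote by $S^{\le\infty}(J^\infty_{x_0}TM^p)$ the set of maps
\[
\Theta\colon\mathbb{Z}_{>0}\to\bigcup_{k\in\mathbb{Z}_{>0}}S^{\le k}(J^{k-1}_{x_0}TM^p)
\]
having the properties
1. $\Theta(k)\in S^{\le k}(J^{k-1}_{x_0}TM^p)$, $k\in\mathbb{Z}_{>0}$, and
2. $\Theta(l)=\sigma^k_l(\Theta(k))$ for $k,l\in\mathbb{Z}_{>0}$ satisfying $l\le k$.

The set $S^{\le\infty}(J^\infty_{x_0}TM^p)$ has the obvious $\mathbb{R}$-vector space structure defined by
\[
(\Theta+\Lambda)(k)=\Theta(k)+\Lambda(k),\qquad(a\Theta)(k)=a(\Theta(k)),
\]
for $\Theta,\Lambda\in S^{\le\infty}(J^\infty_{x_0}TM^p)$ and $a\in\mathbb{R}$. We then have linear maps
\[
\sigma^\infty_k\colon S^{\le\infty}(J^\infty_{x_0}TM^p)\to S^{\le k}(J^{k-1}_{x_0}TM^p),\qquad k\in\mathbb{Z}_{>0},
\]
defined by $\sigma^\infty_k(\Theta)=\Theta(k)$. Let us define
\[
\delta_\infty\colon J^\infty_{x_0}TM^p\to S^{\le\infty}(J^\infty_{x_0}TM^p)
\]
by $\delta_\infty(\Xi)(k)=\delta_k(\Xi(k-1))$. We also define
\[
T^{*\infty}_{x_0}M=\mathop{\mathrm{proj\,lim}}_{k\to\infty}T^{*k}_{x_0}M,\qquad(\mathbb{R}^p)^{*\infty}=\mathop{\mathrm{proj\,lim}}_{k\to\infty}(\mathbb{R}^p)^{*k}.
\]
With all of this notation we have the following result.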


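8.4.7 Proposition (Existence of projective limit) For $p\in\mathbb{Z}_{>0}$ there exists a unique map
\[
T^\infty_p(x_0)\in L\bigl(S^{\le\infty}(J^\infty_{x_0}TM^p);L(T^{*\infty}_{x_0}M;(\mathbb{R}^p)^{*\infty})\bigr)
\]
such that
\[
T^\infty_p(x_0)(\delta_\infty(j_\infty\xi(x_0)))=j_\infty\Phi^{\xi}_{x_0}(0_p)
\]
for every family $\xi=(\xi_1,\dots,\xi_p)$ of $C^\infty$-vector fields. Moreover, for each $k\in\mathbb{Z}_{>0}$, the following diagram commutes:
\[
\begin{array}{ccc}
S^{\le k}(J^{k-1}_{x_0}TM^p)&\xleftarrow{\ \sigma^\infty_k\ }&S^{\le\infty}(J^\infty_{x_0}TM^p)\\[4pt]
\big\downarrow{\scriptstyle T^k_p(x_0)}&&\big\downarrow{\scriptstyle T^\infty_p(x_0)}\\[4pt]
\mathrm{Hom}(T^{*k}_{x_0}M;(\mathbb{R}^p)^{*k})&\longleftarrow&\mathrm{Hom}(T^{*\infty}_{x_0}M;(\mathbb{R}^p)^{*\infty})
\end{array}
\]

Proof This follows from the definitions of the objects involved, along with applications of Lemma 8.1.1.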

8.4.2 A characterisation of variations

We now apply the above definitions to the situation where we have a smoothly selectionable differential inclusion system $\Sigma=(M,F)$. We shall denote by $\Gamma^\infty(F)$ the set of smooth selections of $F$. We can also define
\[
J^kF=\{j_k\xi(x)\mid\xi\in\Gamma^\infty(F),\ x\in M\}\subset J^kTM.
\]
We also denote
\[
F^p=\{\xi_1(x)\oplus\dots\oplus\xi_p(x)\in TM^p\mid\xi_1(x),\dots,\xi_p(x)\in F(x),\ x\in M\},
\]
and denote by $\pi^p_F\colon F^p\to M$ the projection. We denote by $\Gamma^\infty(F^p)$ the $p$-tuples $\xi=(\xi_1,\dots,\xi_p)$ of smooth selections of $F$ and
\[
J^kF^p=\{j_k\xi(x)\mid\xi\in\Gamma^\infty(F^p),\ x\in M\}\subset J^kTM^p.
\]
The next step is the following idea, adapting to our setting the common notion of neutralisability encountered in the controllability literature.

8.4.8 Definition (Neutralisable families of sections) Let $k\in\mathbb{Z}_{>0}$. A family $\xi=(\xi_1,\dots,\xi_p)$ of $C^\infty$-vector fields is neutralisable to order $k$ (resp. positively neutralisable to order $k$) at $x_0\in M$ if
\[
Z(T^k_p(x_0)(j_{k-1}\xi(x_0)))\cap\bigl\{j_k\tau(0)\in J^k_{(0,0_p)}(\mathbb{R}_{\ge0};\mathbb{R}^p_{\ge0})\bigm|j_k\tau(0)\ne0\bigr\}\ne\emptyset.
\]

The condition of (positive) neutralisability of $\xi=(\xi_1,\dots,\xi_p)$ to order $k$ is a condition on the $(k-1)$-jets of the vector fields $\xi_1,\dots,\xi_p$. Infinitesimal variations are defined by taking appropriate order jets of (positively) neutralisable elements. Thus the following notion is useful.


8.4.9 Definition (Neutralisable families for a differential inclusion system) Let $\Sigma=(M,F)$ be a smoothly selectionable differential inclusion system. For $k,p\in\mathbb{Z}_{>0}$ with $k\ge2$ denote
\[
N^k_p(F,x_0)=\bigl\{j_{k-1}\xi(x_0)\in J^{k-1}_{x_0}F^p\bigm|\xi=(\xi_1,\dots,\xi_p)\text{ is neutralisable to order }k-2\text{ at }x_0\bigr\}.
\]

Note that the condition for membership in $N^k_p(F,x_0)$ is a condition on $(\pi_{TM^p})^{k-1}_{k-2}$ applied to the element. Infinitesimal variations of order $k$ associated with $F$-vector fields are then essentially $k$-jets of elements that are neutralisable to order $k-1$. For $k=1$ there is no neutralisation to be done, so we directly define
\[
\mathscr{V}^1_p(\Sigma,x_0)=\{j_1\tau(0)\circ T^1_p(x_0)(\xi(x_0))\mid\xi=(\xi_1,\dots,\xi_p)\in\Gamma^\infty(F^p),\ \tau\in\mathrm{ET}_p\}.
\]
Then, for $k\ge2$, we denote
\[
\mathscr{V}^k_p(\Sigma,x_0)=\bigl\{j_k\tau(0)\circ T^k_p(x_0)(j_{k-1}\xi(x_0))\bigm|j_{k-1}\xi(x_0)\in N^k_p(F,x_0),\ \tau\in Z(j_{k-1}\xi(x_0))\bigr\}.
\]
Note that
\[
\mathscr{V}^k_p(\Sigma,x_0)\subset S^k(\mathbb{R}^*)\otimes T_{x_0}M\simeq T_{x_0}M.
\]
Let us also denote
\[
\mathscr{V}^k(\Sigma,x_0)=\bigcup_{p\in\mathbb{Z}_{>0}}\mathscr{V}^k_p(\Sigma,x_0).
\]
These subsets of $T_{x_0}M$ are (possibly high-order) tangent vectors to the reachable set.

Let us record some of the more basic properties of our tangent vectors to the reachable set.

8.4.10 Proposition (Basic properties of variational cones) For a smoothly selectionable differential inclusion system $\Sigma=(M,F)$ and for $x_0\in M$, the following statements hold:
(i) $\mathscr{V}^1(\Sigma,x_0)=\mathrm{conv\,cone}(F(x_0))$;
(ii) $\mathscr{V}^k(\Sigma,x_0)$ is a convex cone.

It is very often the case that the variational cones constructed in the literature have a nesting property whereby cones of lower-order variations are subsets of the cones of higher-order variations, cf. [Bianchini and Stefani 1993, Proposition 2.5]. This is not quite the case for our setup since we require the composition $\Phi^{\xi}_{x_0}\circ\tau$ to be of class $C^\infty$. We could relax this condition to achieve the nesting property. However, since the nesting property is not necessary for our constructions, we elect not to do this. Nonetheless, we have the following result which essentially captures the desired behaviour.

8.4.11 Proposition (Quasi-nesting property of variational cones) Let $\Sigma=(M,F)$ be a smoothly selectionable differential inclusion system and let $x_0\in M$. Then the following statements hold:
(i) for $k,m\in\mathbb{Z}_{>0}$, $\mathscr{V}^k(\Sigma,x_0)\subset\mathscr{V}^{mk}(\Sigma,x_0)$;
(ii) if $V_1,\dots,V_m\in\mathscr{V}^\infty(\Sigma,x_0)$, then there exists $k\in\mathbb{Z}_{>0}$ such that $V_1,\dots,V_m\in\mathscr{V}^k(\Sigma,x_0)$;
(iii) $\mathscr{V}^\infty(\Sigma,x_0)$ is a convex cone.


8.4.3 The relationship between variations and the reachable set

In this section we examine the relationship between the variational cones and the reachable set for a smoothly selectionable differential inclusion system $\Sigma=(M,F)$. This relationship plays a crucial role in our examination of controllability in Chapter 9. In the characterisations we give, we use a coordinate chart about $x_0$ to establish the connection between $\mathscr{V}^\infty(\Sigma,x_0)$ and $\mathcal{R}_\Sigma(x_0,T)$. To do this, we need some notation. We suppose that $(U,\chi)$ is a chart for $M$ for which $\chi(x_0)=0$. If $S\subset T_{x_0}M$ we denote
\[
S_\chi=\{v\in\mathbb{R}^n\mid(0,v)\in T_{x_0}\chi(S)\}.
\]
Similarly, we denote
\[
\mathcal{R}_{\Sigma,\chi}(x_0,T)=\{\chi(x)\mid x\in\mathcal{R}_\Sigma(x_0,T)\}.
\]
The principal result is the following.

8.4.12 Theorem (Variations and the reachable set for differential inclusion systems) Let $\Sigma=(M,F)$ be a smoothly selectionable differential inclusion system and let $x_0\in M$. Let $m\in\mathbb{Z}_{>0}$ and let $K\subset\mathrm{int}(\mathscr{V}^m(\Sigma,x_0))$ be a closed convex cone. Then, for any coordinate chart $(U,\chi)$ about $x_0$ for which $\chi(x_0)=0$, there exist $\epsilon,T\in\mathbb{R}_{>0}$ such that
\[
K_\chi\cap B(\epsilon,0)\subset\mathcal{R}_{\Sigma,\chi}(x_0,t)
\]
for every $t\in[0,T]$.

8.5 Variations for control systems


In this section we consider variations for control systems. There are many possible ways of defining such variations. For example, Bianchini and Stefani [1993] give a general and elaborate construction of variations depending continuously on controls. Their construction has the benefit of producing variations that can be summed, allowing one to assert convexity properties for cones of variations along a trajectory. We keep things simple in this section and we follow the notion of variations as defined by Kawski [1990].

8.5.1 Definition of variations

We recall from Section 8.1.5 the notion of a tame natural net and a $k$th-order tangent vector for a tame net.

8.5.1 Definition (Infinitesimal net variations) Let $\Sigma=(M,F,C)$ be a smooth time-independent control system, let $x_0\in M$ and let $k\in\mathbb{Z}_{>0}$. An infinitesimal net variation of order $k$ at $x_0$ is a $k$th-order tangent vector to a tame natural net $\phi\colon D\to M$ at $x_0$ for which $\phi(a)\in\mathcal{R}_\Sigma(x_0)$ for every $a\in D$. The set of infinitesimal net variations of order $k$ at $x_0$ we denote by $\mathscr{N}^k(\Sigma,x_0)$.

The infinitesimal net variations enjoy the following properties.


8.5.2 Proposition (Properties of infinitesimal net variations) For a smooth time-independent control system $\Sigma=(M,F,C)$ and for $x_0\in M$, the following properties hold:
(i) for every $\lambda\in[0,1]$ and every $k\in\mathbb{Z}_{>0}$, $\lambda^k\mathscr{N}^k(\Sigma,x_0)\subset\mathscr{N}^k(\Sigma,x_0)$;
(ii) if $k,m\in\mathbb{Z}_{>0}$ satisfy $k\le m$, $\mathscr{N}^k(\Sigma,x_0)\subset\mathscr{N}^m(\Sigma,x_0)$;
(iii) for every $k\in\mathbb{Z}_{>0}$, every $v_1,v_2\in\mathscr{N}^k(\Sigma,x_0)$, and every $\lambda\in[0,1]$, $\lambda^kv_1+(1-\lambda)^kv_2\in\mathscr{N}^k(\Sigma,x_0)$.

Note that $\mathscr{N}^k(\Sigma,x_0)$ is not necessarily a cone. It will, therefore, be convenient to introduce the cone generated by $\mathscr{N}^k(\Sigma,x_0)$:
\[
\overline{\mathscr{N}}^k(\Sigma,x_0)=\mathrm{cone}(\mathscr{N}^k(\Sigma,x_0)).
\]
We also define $\overline{\mathscr{N}}^\infty(\Sigma,x_0)=\bigcup_{k\in\mathbb{Z}_{>0}}\overline{\mathscr{N}}^k(\Sigma,x_0)$.

8.5.2 The relationship between variations and the reachable set

For the statement of the next result, we recall the notation preceding the statement of Theorem 8.4.12.

8.5.3 Theorem (Variations and the reachable set for control systems) Let $\Sigma=(M,F,C)$ be a smooth time-independent control system and let $x_0\in M$. Let $m\in\mathbb{Z}_{>0}$ and let $K\subset\mathrm{int}(\overline{\mathscr{N}}^m(\Sigma,x_0))$ be a closed convex cone. Then, for any coordinate chart $(U,\chi)$ about $x_0$ for which $\chi(x_0)=0$, there exist $\epsilon,T\in\mathbb{R}_{>0}$ such that
\[
K_\chi\cap B(\epsilon,0)\subset\mathcal{R}_{\Sigma,\chi}(x_0,t)
\]
for every $t\in[0,T]$.

Chapter 9 Controllability theory


In this chapter we consider the important and fundamental notion of controllability for the various system models considered in Chapter 6. Intuitively, a system is controllable when one can steer the system from any initial state to any final state by a trajectory of the system. We shall see, however, that this intuitive notion of controllability is not always satisfactory, and we must work with various refinements of this idea. The associated definitions are given in Section 9.1. Following this, we provide a collection of examples in Section 9.2 that illustrate that the various shades of controllability introduced in Section 9.1 are distinguished by concrete examples, most of which are very simple. We then consider the property of accessibility, which is rather closely connected to the Orbit Theorem that we studied in Section 5.3. Finally, in Section 9.4 we look at a few basic results concerning local controllability.

9.1 Definitions for the various types of controllability

In this section we provide definitions for various notions of controllability. Since the notions of controllability have to do with the reachable set, since the reachable set is determined by trajectories, and since the notion of a trajectory is available for all systems, we can make our presentation less cumbersome by providing definitions for controllability that work for any of our systems. Thus, when we say "a system $\Sigma$" in this section, we mean that $\Sigma$ can be (1) a differential inclusion system, (2) a control system with unspecified differentiability, (3) a control-affine system with unspecified differentiability, or (4) an affine system. In all cases, a system has a manifold of states typically denoted by $M$ and a time-domain typically denoted by $\mathbb{T}$, and these are the only system specific objects to which we need to refer. We will also use the expression "a time-independent system $\Sigma$" with the same conventions applying.

9.1.1 Accessibility definitions

The first notion of controllability that we consider is accessibility, which itself comes in two flavours. Both have to do with the fact that the reachable set has a nonempty interior. We recall from Definitions 6.2.3, 6.3.8, 6.4.7, and 6.5.7 the definitions of the reachable sets for our various system models.


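9.1.1 Definition (Accessibility for time-dependent systems) Let $\Sigma$ be a system with state manifold $M$ and time-domain $\mathbb{T}$ and let $(t_0,x_0)\in\mathbb{T}\times M$.
(i) The system $\Sigma$ is accessible from $(t_0,x_0)$ if $\mathrm{int}(\mathcal{R}_\Sigma(x_0,t_0))\ne\emptyset$.
(ii) The system $\Sigma$ is locally accessible from $(t_0,x_0)$ if there exists $T\in\mathbb{T}\cap(t_0,\infty)$ such that $\mathrm{int}(\mathcal{R}_\Sigma(x_0,t_0,{\le}t))\ne\emptyset$ for every $t\in(t_0,T]$.
(iii) The system $\Sigma$ is strongly accessible from $(t_0,x_0)$ if there exists $T\in\mathbb{T}\cap(t_0,\infty)$ such that $\mathrm{int}(\mathcal{R}_\Sigma(x_0,t_0,T))\ne\emptyset$.
(iv) The system $\Sigma$ is locally strongly accessible from $(t_0,x_0)$ if there exists $T\in\mathbb{T}\cap(t_0,\infty)$ such that $\mathrm{int}(\mathcal{R}_\Sigma(x_0,t_0,t))\ne\emptyset$ for every $t\in(t_0,T]$.
Let $\mathscr{T}\subset\mathrm{Traj}(\Sigma)$.
(v) The system $\Sigma$ is $\mathscr{T}$-accessible from $(t_0,x_0)$ if $\mathrm{int}(\mathcal{R}_\Sigma(x_0,t_0,\mathscr{T}))\ne\emptyset$.
(vi) The system $\Sigma$ is locally $\mathscr{T}$-accessible from $(t_0,x_0)$ if there exists $T\in\mathbb{T}\cap(t_0,\infty)$ such that $\mathrm{int}(\mathcal{R}_\Sigma(x_0,t_0,{\le}t,\mathscr{T}))\ne\emptyset$ for every $t\in(t_0,T]$.
(vii) The system $\Sigma$ is strongly $\mathscr{T}$-accessible from $(t_0,x_0)$ if there exists $T\in\mathbb{T}\cap(t_0,\infty)$ such that $\mathrm{int}(\mathcal{R}_\Sigma(x_0,t_0,T,\mathscr{T}))\ne\emptyset$.
(viii) The system $\Sigma$ is locally strongly $\mathscr{T}$-accessible from $(t_0,x_0)$ if there exists $T\in\mathbb{T}\cap(t_0,\infty)$ such that $\mathrm{int}(\mathcal{R}_\Sigma(x_0,t_0,t,\mathscr{T}))\ne\emptyset$ for every $t\in(t_0,T]$.
If $\Sigma$ is a control system or a control-affine system and if $\mathscr{T}=\mathrm{Traj}(\Sigma,\mathscr{U})$ for some subset $\mathscr{U}$ of admissible controls, then we will say that $\Sigma$ is $\mathscr{U}$-accessible, locally $\mathscr{U}$-accessible, strongly $\mathscr{U}$-accessible, or locally strongly $\mathscr{U}$-accessible from $(t_0,x_0)$ if it is $\mathscr{T}$-accessible, locally $\mathscr{T}$-accessible, strongly $\mathscr{T}$-accessible, or locally strongly $\mathscr{T}$-accessible from $(t_0,x_0)$, respectively.

Of course, the preceding definitions can be adapted to time-independent systems, recalling from Definitions 6.2.4, 6.3.9, 6.4.8, and 6.5.8 the definitions of reachable sets for time-independent systems.

9.1.2 Definition (Accessibility for time-independent systems) Let $\Sigma$ be a time-independent system with state manifold $M$ and let $x_0\in M$.
(i) The system $\Sigma$ is accessible from $x_0$ if $\mathrm{int}(\mathcal{R}_\Sigma(x_0))\ne\emptyset$.
(ii) The system $\Sigma$ is locally accessible from $x_0$ if there exists $T\in\mathbb{R}_{>0}$ such that $\mathrm{int}(\mathcal{R}_\Sigma(x_0,{\le}t))\ne\emptyset$ for every $t\in(0,T]$.
(iii) The system $\Sigma$ is strongly accessible from $x_0$ if there exists $T\in\mathbb{R}_{>0}$ such that $\mathrm{int}(\mathcal{R}_\Sigma(x_0,T))\ne\emptyset$.
(iv) The system $\Sigma$ is locally strongly accessible from $x_0$ if there exists $T\in\mathbb{R}_{>0}$ such that $\mathrm{int}(\mathcal{R}_\Sigma(x_0,t))\ne\emptyset$ for every $t\in(0,T]$.
Let $\mathscr{T}\subset\mathrm{Traj}(\Sigma)$.
(v) The system $\Sigma$ is $\mathscr{T}$-accessible from $x_0$ if $\mathrm{int}(\mathcal{R}_\Sigma(x_0,\mathscr{T}))\ne\emptyset$.
(vi) The system $\Sigma$ is locally $\mathscr{T}$-accessible from $x_0$ if there exists $T\in\mathbb{R}_{>0}$ such that $\mathrm{int}(\mathcal{R}_\Sigma(x_0,{\le}t,\mathscr{T}))\ne\emptyset$ for every $t\in(0,T]$.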


(vii) The system $\Sigma$ is strongly $\mathscr{T}$-accessible from $x_0$ if there exists $T \in \mathbb{R}_{>0}$ such that $\mathrm{int}(\mathcal{R}_\Sigma(x_0, T, \mathscr{T})) \neq \emptyset$.
(viii) The system $\Sigma$ is locally strongly $\mathscr{T}$-accessible from $x_0$ if there exists $T \in \mathbb{R}_{>0}$ such that $\mathrm{int}(\mathcal{R}_\Sigma(x_0, t, \mathscr{T})) \neq \emptyset$ for every $t \in (0, T]$.
If $\Sigma$ is a time-independent control system or a control-affine system and if $\mathscr{T} = \mathrm{Traj}(\Sigma, \mathscr{U})$ for some subset $\mathscr{U}$ of admissible controls, then we will say that $\Sigma$ is $\mathscr{U}$-accessible, locally $\mathscr{U}$-accessible, strongly $\mathscr{U}$-accessible, or locally strongly $\mathscr{U}$-accessible from $x_0$ if it is $\mathscr{T}$-accessible, locally $\mathscr{T}$-accessible, strongly $\mathscr{T}$-accessible, or locally strongly $\mathscr{T}$-accessible from $x_0$, respectively.

9.1.2 Controllability definitions

The notions of accessibility and strong accessibility are too coarse to be of much value. If one wants the theory of controllability to give results that have significant impact on applications, the property of the reachable set having a nonempty interior is simply not strong enough to achieve much. Thus the notions of accessibility and strong accessibility are to be regarded as the first thing one should check for a system. Moreover, as we shall see in Section 9.3, it is often quite straightforward to verify accessibility and strong accessibility. This is decidedly not true of the more subtle and useful notions of controllability we introduce in this section. We begin by considering controllability definitions relative to a given trajectory.

9.1.3 Definition (Controllability along a reference trajectory for time-dependent systems) Let $\Sigma$ be a system with state manifold $M$ and time-domain $\mathbb{T}$, let $(t_0, x_0) \in \mathbb{T} \times M$, and let $\xi_0 \in \mathrm{Traj}(\Sigma, \mathbb{T})$ satisfy $\xi_0(t_0) = x_0$.
(i) Let $t \in \mathbb{T} \cap (t_0, \infty)$. The system $\Sigma$ is locally controllable along $\xi_0$ from $(t_0, x_0)$ at time $t$ if $\xi_0(t) \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t_0, t))$.
(ii) The system $\Sigma$ is small time locally controllable along $\xi_0$ from $(t_0, x_0)$ if there exists $T \in \mathbb{T} \cap (t_0, \infty)$ such that $\xi_0(t) \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t_0, t))$ for all $t \in (t_0, T]$.
Let $\mathscr{T} \subset \mathrm{Traj}(\Sigma)$.
(iii) Let $t \in \mathbb{T} \cap (t_0, \infty)$. The system $\Sigma$ is locally $\mathscr{T}$-controllable along $\xi_0$ from $(t_0, x_0)$ at time $t$ if $\xi_0(t) \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{T}))$.
(iv) The system $\Sigma$ is small time locally $\mathscr{T}$-controllable along $\xi_0$ from $(t_0, x_0)$ if there exists $T \in \mathbb{T} \cap (t_0, \infty)$ such that $\xi_0(t) \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{T}))$ for all $t \in (t_0, T]$.
If $\Sigma$ is a control system or a control-affine system and if $\mathscr{T} = \mathrm{Traj}(\Sigma, \mathscr{U})$ for some subset $\mathscr{U}$ of admissible controls, then we will say that $\Sigma$ is locally $\mathscr{U}$-controllable or small time locally $\mathscr{U}$-controllable along $\xi_0$ from $(t_0, x_0)$ if it is locally $\mathscr{T}$-controllable or small time locally $\mathscr{T}$-controllable along $\xi_0$ from $(t_0, x_0)$, respectively.

The preceding definitions can, as usual, be simplified to time-independent systems with a mild gain in efficiency of terminology.

9.1.4 Definition (Controllability along a reference trajectory for time-independent systems) Let $\Sigma$ be a time-independent system with state manifold $M$, let $x_0 \in M$, and let


$\xi_0 \in \mathrm{Traj}(\Sigma)$ satisfy $\xi_0(0) = x_0$. Suppose that $\xi_0$ is defined on an interval $\mathbb{T}$ such that $\inf \mathbb{T} = 0$ and such that $0 \in \mathbb{T}$.
(i) Let $t \in \mathbb{T}$. The system $\Sigma$ is locally controllable along $\xi_0$ from $x_0$ at time $t$ if $\xi_0(t) \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t))$.
(ii) The system $\Sigma$ is small time locally controllable along $\xi_0$ from $x_0$ if there exists $T \in \mathbb{T}$ such that $\xi_0(t) \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t))$ for all $t \in (0, T]$.
Let $\mathscr{T} \subset \mathrm{Traj}(\Sigma)$.
(iii) Let $t \in \mathbb{T}$. The system $\Sigma$ is locally $\mathscr{T}$-controllable along $\xi_0$ from $x_0$ at time $t$ if $\xi_0(t) \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t, \mathscr{T}))$.
(iv) The system $\Sigma$ is small time locally $\mathscr{T}$-controllable along $\xi_0$ from $x_0$ if there exists $T \in \mathbb{T}$ such that $\xi_0(t) \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t, \mathscr{T}))$ for all $t \in (0, T]$.
If $\Sigma$ is a control system or a control-affine system and if $\mathscr{T} = \mathrm{Traj}(\Sigma, \mathscr{U})$ for some subset $\mathscr{U}$ of admissible controls, then we will say that $\Sigma$ is locally $\mathscr{U}$-controllable or small time locally $\mathscr{U}$-controllable along $\xi_0$ from $x_0$ if it is locally $\mathscr{T}$-controllable or small time locally $\mathscr{T}$-controllable along $\xi_0$ from $x_0$, respectively.

A special case of particular interest arises when the reference trajectory is the trivial trajectory $t \mapsto x_0$ for some $x_0 \in M$. We introduce some terminology associated with this situation.

9.1.5 Definition (Controllability from a point for time-dependent systems) Let $\Sigma$ be a system with state manifold $M$ and time-domain $\mathbb{T}$ and let $(t_0, x_0) \in \mathbb{T} \times M$.
(i) Let $t \in \mathbb{T} \cap (t_0, \infty)$. The system $\Sigma$ is locally controllable from $(t_0, x_0)$ at time $t$ if $x_0 \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t_0, t))$.
(ii) The system $\Sigma$ is small time locally controllable from $(t_0, x_0)$ if there exists $T \in \mathbb{T} \cap (t_0, \infty)$ such that $x_0 \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t_0, t))$ for all $t \in (t_0, T]$.
Let $\mathscr{T} \subset \mathrm{Traj}(\Sigma)$.
(iii) Let $t \in \mathbb{T} \cap (t_0, \infty)$. The system $\Sigma$ is locally $\mathscr{T}$-controllable from $(t_0, x_0)$ at time $t$ if $x_0 \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{T}))$.
(iv) The system $\Sigma$ is small time locally $\mathscr{T}$-controllable from $(t_0, x_0)$ if there exists $T \in \mathbb{T} \cap (t_0, \infty)$ such that $x_0 \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t_0, t, \mathscr{T}))$ for all $t \in (t_0, T]$.
If $\Sigma$ is a control system or a control-affine system and if $\mathscr{T} = \mathrm{Traj}(\Sigma, \mathscr{U})$ for some subset $\mathscr{U}$ of admissible controls, then we will say that $\Sigma$ is locally $\mathscr{U}$-controllable or small time locally $\mathscr{U}$-controllable from $(t_0, x_0)$ if it is locally $\mathscr{T}$-controllable or small time locally $\mathscr{T}$-controllable from $(t_0, x_0)$, respectively.

These definitions can be adapted to the time-independent case, of course. It is the following definitions of controllability that one sees investigated most frequently in the literature.

9.1.6 Definition (Controllability from a point for time-independent systems) Let $\Sigma$ be a time-independent system with state manifold $M$ and let $x_0 \in M$.


(i) Let $t \in \mathbb{R}_{>0}$. The system $\Sigma$ is locally controllable from $x_0$ at time $t$ if $x_0 \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t))$.
(ii) The system $\Sigma$ is small time locally controllable from $x_0$ if there exists $T \in \mathbb{R}_{>0}$ such that $x_0 \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t))$ for all $t \in (0, T]$.
Let $\mathscr{T} \subset \mathrm{Traj}(\Sigma)$.
(iii) Let $t \in \mathbb{R}_{>0}$. The system $\Sigma$ is locally $\mathscr{T}$-controllable from $x_0$ at time $t$ if $x_0 \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t, \mathscr{T}))$.
(iv) The system $\Sigma$ is small time locally $\mathscr{T}$-controllable from $x_0$ if there exists $T \in \mathbb{R}_{>0}$ such that $x_0 \in \mathrm{int}(\mathcal{R}_\Sigma(x_0, t, \mathscr{T}))$ for all $t \in (0, T]$.
If $\Sigma$ is a control system or a control-affine system and if $\mathscr{T} = \mathrm{Traj}(\Sigma, \mathscr{U})$ for some subset $\mathscr{U}$ of admissible controls, then we will say that $\Sigma$ is locally $\mathscr{U}$-controllable or small time locally $\mathscr{U}$-controllable from $x_0$ if it is locally $\mathscr{T}$-controllable or small time locally $\mathscr{T}$-controllable from $x_0$, respectively.

Finally, we introduce very strong forms of controllability. These make the most sense for time-independent systems.

9.1.7 Definition (Global controllability, total controllability) Let $\Sigma$ be a time-independent system with state manifold $M$.
(i) Let $x_0 \in M$. The system $\Sigma$ is globally controllable from $x_0$ if $\mathcal{R}_\Sigma(x_0) = M$.
(ii) The system $\Sigma$ is totally controllable if it is globally controllable from every $x \in M$.
Let $\mathscr{T} \subset \mathrm{Traj}(\Sigma)$.
(iii) Let $x_0 \in M$. The system $\Sigma$ is globally $\mathscr{T}$-controllable from $x_0$ if $\mathcal{R}_\Sigma(x_0, \mathscr{T}) = M$.
(iv) The system $\Sigma$ is totally $\mathscr{T}$-controllable if it is globally $\mathscr{T}$-controllable from every $x \in M$.
If $\Sigma$ is a control system or a control-affine system and if $\mathscr{T} = \mathrm{Traj}(\Sigma, \mathscr{U})$ for some subset $\mathscr{U}$ of admissible controls, then we will say that $\Sigma$ is globally $\mathscr{U}$-controllable from $x_0$ or totally $\mathscr{U}$-controllable if it is globally $\mathscr{T}$-controllable from $x_0$ or totally $\mathscr{T}$-controllable, respectively.

9.1.3 Geometric controllability definitions for control-affine and affine systems

Control-affine and affine systems have associated with them geometric variants of controllability that are interesting to formulate, as they indicate how the geometry of the affine distribution affects the controllability of the systems one can consider. First we give a notion that is useful for categorising systems.

9.1.8 Definition (Proper affine and control-affine system, proper control set) Let $\Sigma_{\mathrm{as}} = (M, A, F, \mathbb{T})$ be an affine system and let $\Sigma_{\mathrm{cas}} = (M, \mathscr{F}, C, \mathbb{T})$, with $\mathscr{F} = (F_0, F_1, \dots, F_m)$, be a control-affine system of class $C^r$, $r \in \mathbb{Z}_{\geq 0} \cup \{\infty, \omega\}$. Let $(t_0, x_0) \in \mathbb{T} \times M$ and denote
$$F_C(t_0, x_0) = \Bigl\{ F_0(t_0, x_0) + \sum_{a=1}^{m} u_a F_a(t_0, x_0) \Bigm| u \in C \Bigr\}.$$


(i) The affine system $\Sigma_{\mathrm{as}}$ is proper at $(t_0, x_0)$ if $0_{x_0} \in \mathrm{int}_{\mathrm{aff}(F(t_0, x_0))}(\mathrm{conv}(F(t_0, x_0)))$.
(ii) The control set $C$ is proper if $0 \in \mathrm{int}_{\mathrm{aff}(C)}(\mathrm{conv}(C))$.
(iii) The control-affine system $\Sigma_{\mathrm{cas}}$ is proper at $(t_0, x_0)$ if $0_{x_0} \in \mathrm{int}_{\mathrm{aff}(F_C(t_0, x_0))}(\mathrm{conv}(F_C(t_0, x_0)))$.
Now suppose that $\Sigma_{\mathrm{as}}$ and $\Sigma_{\mathrm{cas}}$ are time-independent, let $x_0 \in M$, and denote
$$F_C(x_0) = \Bigl\{ F_0(x_0) + \sum_{a=1}^{m} u_a F_a(x_0) \Bigm| u \in C \Bigr\}.$$
(iv) The affine system $\Sigma_{\mathrm{as}}$ is proper at $x_0$ if $0_{x_0} \in \mathrm{int}(\mathrm{conv}(F(x_0)))$.
(v) The control-affine system $\Sigma_{\mathrm{cas}}$ is proper at $x_0$ if $0_{x_0} \in \mathrm{int}_{\mathrm{aff}(F_C(x_0))}(\mathrm{conv}(F_C(x_0)))$.

The following result is clear.

9.1.9 Lemma (Proper control sets and proper control-affine systems) Let $\Sigma = (M, (F_0, F_1, \dots, F_m), C, \mathbb{T})$ be a control-affine system of class $C^r$, $r \in \mathbb{Z}_{\geq 0} \cup \{\infty, \omega\}$. If $C$ is proper and if $F_0(t_0, x_0) = 0_{x_0}$, then $\Sigma$ is proper at $(t_0, x_0)$. Similarly, if $\Sigma$ is time-independent, if $C$ is proper, and if $F_0(x_0) = 0_{x_0}$, then $\Sigma$ is proper at $x_0$.

The objective in this section is to give geometric definitions of controllability for affine and control-affine systems. In order to do this, we recall from Definitions 6.4.2 and 6.5.3 the notions of a control-affine pre-system and an affine pre-system. These, it will be recalled, capture the geometric ingredients of a control-affine system and affine system, respectively, without the presence of a control set or a differential inclusion. Now we can make the following controllability definitions, beginning with time-dependent control-affine systems.

9.1.10 Definition (Controllability for time-dependent control-affine pre-systems) Let $\Sigma = (M, (F_0, F_1, \dots, F_m), \mathbb{T})$ be a control-affine pre-system of class $C^r$, $r \in \mathbb{Z}_{\geq 0} \cup \{\infty, \omega\}$, and let $(t_0, x_0) \in \mathbb{T} \times M$. Suppose that $F_0(t_0, x_0) = 0_{x_0}$.
(i) Let $t \in \mathbb{T} \cap (t_0, \infty)$. The pre-system $\Sigma$ is properly locally controllable from $(t_0, x_0)$ at time $t$ if, for every proper $C \subset \mathbb{R}^m$, the control-affine system $(M, (F_0, F_1, \dots, F_m), C, \mathbb{T})$ is locally controllable from $(t_0, x_0)$ at time $t$.
(ii) The pre-system $\Sigma$ is properly small time locally controllable from $(t_0, x_0)$ if, for every proper $C \subset \mathbb{R}^m$, the control-affine system $(M, (F_0, F_1, \dots, F_m), C, \mathbb{T})$ is small time locally controllable from $(t_0, x_0)$.
(iii) Let $t \in \mathbb{T} \cap (t_0, \infty)$. The pre-system $\Sigma$ is locally uncontrollable from $(t_0, x_0)$ at time $t$ if, for every compact $C \subset \mathbb{R}^m$, the control-affine system $(M, (F_0, F_1, \dots, F_m), C, \mathbb{T})$ is not locally controllable from $(t_0, x_0)$ at time $t$.


(iv) The pre-system $\Sigma$ is small time locally uncontrollable from $(t_0, x_0)$ if, for every compact $C \subset \mathbb{R}^m$, the control-affine system $(M, (F_0, F_1, \dots, F_m), C, \mathbb{T})$ is not small time locally controllable from $(t_0, x_0)$.
(v) Let $t \in \mathbb{T} \cap (t_0, \infty)$. The pre-system $\Sigma$ is conditionally locally controllable from $(t_0, x_0)$ at time $t$ if it is neither properly locally controllable nor locally uncontrollable from $(t_0, x_0)$ at time $t$.
(vi) The pre-system $\Sigma$ is conditionally small time locally controllable from $(t_0, x_0)$ if it is neither properly small time locally controllable nor small time locally uncontrollable from $(t_0, x_0)$.

Now we make these definitions for time-independent systems.

9.1.11 Definition (Controllability for time-independent control-affine pre-systems) Let $\Sigma = (M, (F_0, F_1, \dots, F_m))$ be a time-independent control-affine pre-system of class $C^r$, $r \in \mathbb{Z}_{\geq 0} \cup \{\infty, \omega\}$, and let $x_0 \in M$. Suppose that $F_0(x_0) = 0_{x_0}$.
(i) Let $t \in \mathbb{R}_{>0}$. The pre-system $\Sigma$ is properly locally controllable from $x_0$ at time $t$ if, for every proper $C \subset \mathbb{R}^m$, the time-independent control-affine system $(M, (F_0, F_1, \dots, F_m), C)$ is locally controllable from $x_0$ at time $t$.
(ii) The pre-system $\Sigma$ is properly small time locally controllable from $x_0$ if, for every proper $C \subset \mathbb{R}^m$, the time-independent control-affine system $(M, (F_0, F_1, \dots, F_m), C)$ is small time locally controllable from $x_0$.
(iii) Let $t \in \mathbb{R}_{>0}$. The pre-system $\Sigma$ is locally uncontrollable from $x_0$ at time $t$ if, for every compact $C \subset \mathbb{R}^m$, the time-independent control-affine system $(M, (F_0, F_1, \dots, F_m), C)$ is not locally controllable from $x_0$ at time $t$.
(iv) The pre-system $\Sigma$ is small time locally uncontrollable from $x_0$ if, for every compact $C \subset \mathbb{R}^m$, the time-independent control-affine system $(M, (F_0, F_1, \dots, F_m), C)$ is not small time locally controllable from $x_0$.
(v) Let $t \in \mathbb{R}_{>0}$. The pre-system $\Sigma$ is conditionally locally controllable from $x_0$ at time $t$ if it is neither properly locally controllable nor locally uncontrollable from $x_0$ at time $t$.
(vi) The pre-system $\Sigma$ is conditionally small time locally controllable from $x_0$ if it is neither properly small time locally controllable nor small time locally uncontrollable from $x_0$.

The same constructions can be made for affine pre-systems. Here they come.

9.1.12 Definition (Controllability for time-dependent affine pre-systems) Let $\Sigma = (M, A, \mathbb{T})$ be an affine pre-system of class $C^r$, $r \in \mathbb{Z}_{\geq 0} \cup \{\infty, \omega\}$, and let $(t_0, x_0) \in \mathbb{T} \times M$. Suppose that $0_{x_0} \in A_{(t_0, x_0)}$.
(i) Let $t \in \mathbb{T} \cap (t_0, \infty)$. The pre-system $\Sigma$ is properly locally controllable from $(t_0, x_0)$ at time $t$ if, for every differential inclusion $F \colon \mathbb{T} \times M \twoheadrightarrow TM$ with the following properties:
(a) $F(t, x) \subset A_{(t,x)}$ for every $(t, x) \in \mathbb{T} \times M$;


(b) $\mathrm{aff}(F(t, x)) = A_{(t,x)}$ for every $(t, x) \in \mathbb{T} \times M$;
(c) the affine system $(M, A, F, \mathbb{T})$ is $C^r$-selectionable and proper at $(t_0, x_0)$,
the affine system $(M, A, F, \mathbb{T})$ is locally controllable from $(t_0, x_0)$ at time $t$.
(ii) The pre-system $\Sigma$ is properly small time locally controllable from $(t_0, x_0)$ if, for every differential inclusion $F \colon \mathbb{T} \times M \twoheadrightarrow TM$ with the following properties:
(a) $F(t, x) \subset A_{(t,x)}$ for every $(t, x) \in \mathbb{T} \times M$;
(b) $\mathrm{aff}(F(t, x)) = A_{(t,x)}$ for every $(t, x) \in \mathbb{T} \times M$;
(c) the affine system $(M, A, F, \mathbb{T})$ is $C^r$-selectionable and proper at $(t_0, x_0)$,
the affine system $(M, A, F, \mathbb{T})$ is small time locally controllable from $(t_0, x_0)$.
(iii) Let $t \in \mathbb{T} \cap (t_0, \infty)$. The pre-system $\Sigma$ is locally uncontrollable from $(t_0, x_0)$ at time $t$ if, for every differential inclusion $F \colon \mathbb{T} \times M \twoheadrightarrow TM$ with the following properties:
(a) $F(t, x) \subset A_{(t,x)}$ for every $(t, x) \in \mathbb{T} \times M$;
(b) $\mathrm{aff}(F(t, x)) = A_{(t,x)}$ for every $(t, x) \in \mathbb{T} \times M$;
(c) $F$ is compact-valued,
the affine system $(M, A, F, \mathbb{T})$ is not locally controllable from $(t_0, x_0)$ at time $t$.
(iv) The pre-system $\Sigma$ is small time locally uncontrollable from $(t_0, x_0)$ if, for every differential inclusion $F \colon \mathbb{T} \times M \twoheadrightarrow TM$ with the following properties:
(a) $F(t, x) \subset A_{(t,x)}$ for every $(t, x) \in \mathbb{T} \times M$;
(b) $\mathrm{aff}(F(t, x)) = A_{(t,x)}$ for every $(t, x) \in \mathbb{T} \times M$;
(c) $F$ is compact-valued,
the affine system $(M, A, F, \mathbb{T})$ is not small time locally controllable from $(t_0, x_0)$.
(v) Let $t \in \mathbb{T} \cap (t_0, \infty)$. The pre-system $\Sigma$ is conditionally locally controllable from $(t_0, x_0)$ at time $t$ if it is neither properly locally controllable nor locally uncontrollable from $(t_0, x_0)$ at time $t$.
(vi) The pre-system $\Sigma$ is conditionally small time locally controllable from $(t_0, x_0)$ if it is neither properly small time locally controllable nor small time locally uncontrollable from $(t_0, x_0)$.

Now we make these definitions for time-independent systems.

9.1.13 Definition (Controllability for time-independent affine pre-systems) Let $\Sigma = (M, A)$ be a time-independent affine pre-system of class $C^r$, $r \in \mathbb{Z}_{\geq 0} \cup \{\infty, \omega\}$, and let $x_0 \in M$. Suppose that $0_{x_0} \in A_{x_0}$.
(i) Let $t \in \mathbb{R}_{>0}$. The pre-system $\Sigma$ is properly locally controllable from $x_0$ at time $t$ if, for every time-independent differential inclusion $F \colon M \twoheadrightarrow TM$ with the following properties:
(a) $F(x) \subset A_x$ for every $x \in M$;
(b) $\mathrm{aff}(F(x)) = A_x$ for every $x \in M$;
(c) the time-independent affine system $(M, A, F)$ is $C^r$-selectionable and proper at $x_0$,
the time-independent affine system $(M, A, F)$ is locally controllable from $x_0$ at time $t$.
(ii) The pre-system $\Sigma$ is properly small time locally controllable from $x_0$ if, for every time-independent differential inclusion $F \colon M \twoheadrightarrow TM$ with the following properties:
(a) $F(x) \subset A_x$ for every $x \in M$;
(b) $\mathrm{aff}(F(x)) = A_x$ for every $x \in M$;
(c) the time-independent affine system $(M, A, F)$ is $C^r$-selectionable and proper at $x_0$,
the time-independent affine system $(M, A, F)$ is small time locally controllable from $x_0$.
(iii) Let $t \in \mathbb{R}_{>0}$. The pre-system $\Sigma$ is locally uncontrollable from $x_0$ at time $t$ if, for every time-independent differential inclusion $F \colon M \twoheadrightarrow TM$ with the following properties:
(a) $F(x) \subset A_x$ for every $x \in M$;
(b) $\mathrm{aff}(F(x)) = A_x$ for every $x \in M$;
(c) $F$ is compact-valued,
the time-independent affine system $(M, A, F)$ is not locally controllable from $x_0$ at time $t$.
(iv) The pre-system $\Sigma$ is small time locally uncontrollable from $x_0$ if, for every time-independent differential inclusion $F \colon M \twoheadrightarrow TM$ with the following properties:
(a) $F(x) \subset A_x$ for every $x \in M$;
(b) $\mathrm{aff}(F(x)) = A_x$ for every $x \in M$;
(c) $F$ is compact-valued,
the time-independent affine system $(M, A, F)$ is not small time locally controllable from $x_0$.
(v) Let $t \in \mathbb{R}_{>0}$. The pre-system $\Sigma$ is conditionally locally controllable from $x_0$ at time $t$ if it is neither properly locally controllable nor locally uncontrollable from $x_0$ at time $t$.
(vi) The pre-system $\Sigma$ is conditionally small time locally controllable from $x_0$ if it is neither properly small time locally controllable nor small time locally uncontrollable from $x_0$.

9.2 Examples illustrating controllability problems


Before we get started with the precise definitions and statements of theorems, let us first consider some examples which exhibit the subtle nature of the problems in controllability. The presentation here will be a little intuitive and possibly a little vague. Nevertheless, we hope it will be useful motivation for the chapters to follow. The examples we consider are all instances of control-affine systems, such as were discussed in Section 6.4.

9.2.1 The difference between accessibility and local accessibility

Here we take $M = \mathbb{R}^2$ with coordinates $(x_1, x_2)$, define
$$F_0 = \frac{\partial}{\partial x_1}, \qquad F_1 = f(x_1) \frac{\partial}{\partial x_2},$$
where
$$f(x) = \begin{cases} e^{-1/x^2}, & x \in \mathbb{R}_{>0}, \\ 0, & x \in \mathbb{R}_{\leq 0}. \end{cases}$$
In Figure 9.1 we show the graph of $f$, and we note that $f$ is infinitely differentiable, but not analytic.

Figure 9.1 The smooth but not analytic function $f$ (graph of $f(x)$ for $x \in [-3, 3]$)

We take $C = \mathbb{R}$ for simplicity. The differential equations governing the system are
$$\dot{x}_1(t) = 1, \qquad \dot{x}_2(t) = f(x_1(t)) u(t).$$
We are interested in the character of the reachable set from $x_0 = (-1, 0)$. A moment's thought shows that the reachable set is as indicated in Figure 9.2. We can see that for $t > 1$ we have $\mathrm{int}(\mathcal{R}_\Sigma(x_0, \leq t)) \neq \emptyset$ but for $t \leq 1$ we have $\mathrm{int}(\mathcal{R}_\Sigma(x_0, \leq t)) = \emptyset$. Thus this is a system that is accessible, but not locally accessible. This is possible because this is a system that is smooth, but not analytic. For analytic systems, we shall see in Section 9.3 that accessible and locally accessible are equivalent.
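The failure of local accessibility can be checked numerically. The following is a minimal sketch, assuming the function $f(x) = e^{-1/x^2}$ for $x > 0$ (and $0$ otherwise) from the definition above and a constant control $u$; the helper name `x2_at` and the midpoint quadrature are illustrative choices, not part of the notes. Since $x_1(s) = -1 + s$, the second coordinate is $x_2(t) = u \int_0^t f(-1 + s)\,ds$, which vanishes identically until $x_1$ becomes positive.

```python
import math


def f(x):
    """Smooth but non-analytic function: exp(-1/x^2) for x > 0, and 0 for x <= 0."""
    return math.exp(-1.0 / (x * x)) if x > 0 else 0.0


def x2_at(t, u=1.0, steps=2000):
    """Approximate x2(t) for a constant control u, starting from x0 = (-1, 0).

    Uses x1(s) = -1 + s, so x2(t) = u * integral_0^t f(-1 + s) ds,
    evaluated with the midpoint rule (an illustrative choice).
    """
    h = t / steps
    return u * h * sum(f(-1.0 + (i + 0.5) * h) for i in range(steps))


for t in (0.5, 1.0, 2.0, 3.0):
    print(f"t = {t}: x2(t) with u = 1 is {x2_at(t):.6f}")
```

For $t \leq 1$ every quadrature node has $x_1 \leq 0$, so the computed $x_2(t)$ is exactly zero regardless of $u$; for $t > 1$ the value scales linearly with $u$, which is how the reachable set acquires interior points.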

Figure 9.2 The reachable set for a system that is accessible but not locally accessible (the initial point $(-1, 0)$ is marked)

9.2.2 The distinction between accessibility and controllability

We take $M = \mathbb{R}^2$ with coordinates $(x_1, x_2)$, define
$$F_0 = x_1^2 \frac{\partial}{\partial x_2}, \qquad F_1 = \frac{\partial}{\partial x_1},$$
and take $C = [-1, 1]$. The resulting differential equations are
$$\dot{x}_1(t) = u(t), \qquad \dot{x}_2(t) = x_1(t)^2.$$
We wish to understand the character of the reachable set from $0 = (0, 0)$. To understand the reachable set we first consider the effect of applying the controls $u_-(t) = -1$ and $u_+(t) = 1$. The resulting trajectories are directly computed to be
$$(x_{1,-}(t), x_{2,-}(t)) = (-t, \tfrac{1}{3} t^3), \qquad (x_{1,+}(t), x_{2,+}(t)) = (t, \tfrac{1}{3} t^3).$$
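The trajectories for $u_-$ and $u_+$ can be reproduced with a few lines of code. This is a sketch only: the function name `endpoint` and the representation of a control as a list of `(u, duration)` pairs are assumptions made for illustration. On each piece with constant $u$ one has $x_1(s) = x_1 + us$, so $x_2$ increases by $\int_0^{\Delta t} (x_1 + us)^2\,ds$, which is integrated in closed form.

```python
def endpoint(pieces, start=(0.0, 0.0)):
    """Endpoint of x1' = u, x2' = x1^2 under a piecewise-constant control.

    `pieces` is a list of (u, duration) pairs applied in order
    (a hypothetical encoding chosen for this sketch).
    """
    x1, x2 = start
    for u, dt in pieces:
        # Integral of (x1 + u*s)^2 for s in [0, dt], in closed form.
        x2 += x1 * x1 * dt + x1 * u * dt**2 + u * u * dt**3 / 3.0
        x1 += u * dt
    return x1, x2


T, tau = 1.0, 0.25
print(endpoint([(-1.0, T)]))                      # constant control u_-
print(endpoint([(+1.0, T)]))                      # constant control u_+
print(endpoint([(-1.0, tau), (+1.0, T - tau)]))   # one switch at time tau
```

The same helper evaluates a control that switches once from $u_-$ to $u_+$ at a time $\tau$, in which case the first coordinate of the endpoint is $T - 2\tau$.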

It is fairly evident that these trajectories form the left and right boundaries of the reachable set. Now fix $T \in \mathbb{R}_{>0}$. The upper boundary of the reachable set in time $T$ is obtained by controls which switch once between $u_-$ and $u_+$ in the interval $[0, T]$. If the control $u$ is given by
$$u(t) = \begin{cases} -1, & t \in [0, \tau], \\ 1, & t \in (\tau, T], \end{cases}$$
one directly computes that the trajectory satisfies
$$(x_1(T), x_2(T)) = \bigl(T - 2\tau, \tfrac{1}{3}(T - 2\tau)^3 + \tfrac{2}{3}\tau^3\bigr).$$
Thus $\tau = \tfrac{1}{2}(T - x_1(T))$ and so we determine that, on the part of the upper boundary of $\mathcal{R}_\Sigma(0, T)$ where $x_1(T) \leq 0$, we have
$$x_2(T) = \tfrac{1}{3} x_1(T)^3 + \tfrac{1}{12}(T - x_1(T))^3.$$
The part of the upper boundary where $x_1(T) \geq 0$ is the mirror image, obtained by switching from $u_+$ to $u_-$. We sketch the reachable set in Figure 9.3. The dark shaded region is $\mathcal{R}_\Sigma(0, T)$ which, it turns out, is equal to $\mathcal{R}_\Sigma(0, \leq T)$. The shaded region is $\mathcal{R}_\Sigma(0)$. Also in Figure 9.3 we show the reachable set in the case when $C = \mathbb{R}$. In this case the reachable sets $\mathcal{R}_\Sigma(0, T)$, $\mathcal{R}_\Sigma(0, \leq T)$, and $\mathcal{R}_\Sigma(0)$ all agree. The point of this example is that the reachable set has a nonempty interior, but that $0$ is not contained in this interior.

Figure 9.3 The reachable set of $\dot{x}_1 = u$; $\dot{x}_2 = x_1^2$ with bounded controls (left) and unbounded controls (right)

9.2.3 The distinction between accessibility and strong accessibility

In this case we take $M = \mathbb{R}^2$ with coordinates $(x_1, x_2)$, define
$$F_0 = \frac{\partial}{\partial x_2}, \qquad F_1 = \frac{\partial}{\partial x_1},$$

and take $C = [-1, 1]$. The governing equations are
$$\dot{x}_1(t) = u(t), \qquad \dot{x}_2(t) = 1.$$
As with the preceding example, we consider the character of the reachable set from $0 = (0, 0)$. Again, we compute the boundaries of the reachable set, although in this case the computations are much simpler than in the preceding example. The left and right boundaries are defined by the sets
$$\{(x_1, x_2) \mid x_2 = -x_1\}, \qquad \{(x_1, x_2) \mid x_2 = x_1\},$$
respectively. Since we obviously have
$$\mathcal{R}_\Sigma(0, T) \subset \{(x_1, T) \mid x_1 \in \mathbb{R}\},$$
it follows that the upper boundary of $\mathcal{R}_\Sigma(0, \leq T)$ is $\{(x_1, T) \mid x_1 \in [-T, T]\}$. In Figure 9.4 we show the reachable sets. The shaded region is $\mathcal{R}_\Sigma(0)$, the dashed lines are the sets $\mathcal{R}_\Sigma(0, T)$ for various $T$, and the hatched region is $\mathcal{R}_\Sigma(0, \leq T)$. We also show the reachable sets in the case of $C = \mathbb{R}$ in the same figure. The point of this example is that the reachable set in time $T$ may have a different interior than the reachable set in time less than or equal to $T$.

Figure 9.4 The reachable set of $\dot{x}_1 = u$; $\dot{x}_2 = 1$ with bounded controls (left) and unbounded controls (right)

9.2.4 Global controllability and local controllability

Here we take $M = \mathbb{R} \times \mathbb{S}^1$ with coordinates $(x, \theta)$, define
$$F_0 = \frac{\partial}{\partial \theta}, \qquad F_1 = \frac{\partial}{\partial x},$$

and consider $C = [-1, 1]$. The governing equations are
$$\dot{x}(t) = u(t), \qquad \dot{\theta}(t) = 1.$$
Note that this system is the image of the preceding system under the projection $\pi \colon \mathbb{R}^2 \to \mathbb{R} \times \mathbb{S}^1$ defined by $\pi(x_1, x_2) = (x_1, (\cos(x_2), \sin(x_2)))$, and that $\pi(0, 0) = (0, (1, 0))$. We shall think of $\mathbb{R} \times \mathbb{S}^1$ as the strip $\{(x, y) \in \mathbb{R}^2 \mid y \in [0, 2\pi]\}$ with the upper and lower edges of the strip identified. By doing this we can depict the reachable set as the projection under $\pi$ of the reachable set from the preceding example. These are shown in Figure 9.5 where the shaded region represents $\mathcal{R}_\Sigma((0, (1, 0)), \leq T)$ and the dashed lines represent $\mathcal{R}_\Sigma((0, (1, 0)), T)$ for various $T$. The dark shaded region represents the points in the reachable set for $T > 2\pi$. In Figure 9.6 we show the reachable sets for $C = \mathbb{R}$. The point of this example is that the reachable set for small times may not contain $x_0$ in its interior, but for larger times $x_0$ is in the interior of the reachable set. This distinction is that between global and local.

Figure 9.5 The reachable set of $\dot{x} = u$; $\dot{\theta} = 1$ with bounded controls for small time (left) and longer time (right); the initial point $(0, (1, 0))$ is marked

Figure 9.6 The reachable set of $\dot{x} = u$; $\dot{\theta} = 1$ with unbounded controls for small time (left) and longer time (right); the initial point $(0, (1, 0))$ is marked

9.2.5 The size of the control set can matter

In this example we take $M = \mathbb{R}^3$ with coordinates $(x_1, x_2, x_3)$, define
$$F_0 = x_1^2 \frac{\partial}{\partial x_3}, \qquad F_1 = \frac{\partial}{\partial x_1}, \qquad F_2 = \frac{\partial}{\partial x_2} + \tfrac{1}{2} x_1^2 \frac{\partial}{\partial x_3},$$

and take $C_r = [-r, r] \times [-r, r]$ for $r \in \mathbb{R}_{>0}$. The governing equations are
$$\dot{x}_1(t) = u_1(t), \qquad \dot{x}_2(t) = u_2(t), \qquad \dot{x}_3(t) = (1 + \tfrac{1}{2} u_2(t)) x_1(t)^2.$$
We are interested in controllability from $0 = (0, 0, 0)$. It can be shown that $\mathcal{R}_\Sigma(0, T)$ has a nonempty interior for every $T \in \mathbb{R}_{>0}$. However, if $r \in (0, 2)$ then clearly we have $\dot{x}_3(t) \geq 0$ for every control $t \mapsto u(t) \in C_r$. Therefore, for $r \in (0, 2)$ it follows that $0$ is not in the interior of $\mathcal{R}_\Sigma(0)$. However, when $r > 2$, one can show that $0$ is in the interior of $\mathcal{R}_\Sigma(0, T)$ for any $T \in \mathbb{R}_{>0}$. The point of this example is that, even for consideration of local properties of the reachable set, the size of the control set can play a role. The reasons for this are not entirely clear at this point, but motivate our definitions for geometric controllability from Section 9.1.3. Indeed, the system we consider here falls into the category of being conditionally controllable from $0$.
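The role of the threshold $r = 2$ can be seen directly from the third equation. The sketch below (the function name `x3_rate_bounds` is an illustrative assumption) computes the range of $\dot{x}_3 = (1 + \tfrac{1}{2} u_2) x_1^2$ as $u_2$ ranges over $[-r, r]$ for a fixed value of $x_1$; it only illustrates the sign obstruction for $r < 2$ and says nothing about the harder claim that the system is controllable for $r > 2$.

```python
def x3_rate_bounds(r, x1):
    """Range of x3' = (1 + u2/2) * x1^2 over u2 in [-r, r], for fixed x1.

    The expression is affine in u2, so the extremes occur at u2 = -r and u2 = r.
    """
    lo = (1.0 - r / 2.0) * x1 * x1
    hi = (1.0 + r / 2.0) * x1 * x1
    return lo, hi


for r in (1.0, 2.0, 3.0):
    lo, hi = x3_rate_bounds(r, x1=0.5)
    print(f"r = {r}: x3' ranges over [{lo}, {hi}]")
```

For $r < 2$ the lower bound is nonnegative, so $x_3$ can never decrease; once $r > 2$ the lower bound becomes negative and the obstruction disappears.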

9.2.6 The role of feedback transformations

We take $M = \mathbb{R}^3$ with coordinates $(x_1, x_2, x_3)$, define
$$F_0 = x_2 \frac{\partial}{\partial x_1}, \qquad F_1 = x_3 \frac{\partial}{\partial x_2}, \qquad F_2 = \frac{\partial}{\partial x_3},$$
and take $C = \mathbb{R}^2$ for simplicity. The equations corresponding to this system are
$$\dot{x}_1(t) = x_2(t), \qquad \dot{x}_2(t) = x_3(t) u_1(t), \qquad \dot{x}_3(t) = u_2(t). \tag{9.1}$$
We will be interested in the controllability from $0 = (0, 0, 0)$ of this system, which we will denote by $\Sigma_1$ (i.e., the determination of whether the reachable set contains the initial point in its interior). In this case, we choose to linearise the system about $0$. The linearisation is the linear system satisfying
$$\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \\ \dot{x}_3(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} u_1(t) \\ u_2(t) \end{bmatrix}.$$
This linearisation is clearly not controllable since the equation for $x_2$ reads $\dot{x}_2(t) = 0$. Now let us consider a feedback transformation, i.e., a change of control, defined by $v_1 = u_1 - 1$, $v_2 = u_2$. This gives rise to an equivalent control-affine system, with the notion of equivalence from Section 6.4.4. After substituting this change of variable into the equations governing the system, we see that the system with the controls $v_1$ and $v_2$ is still a control-affine system, but now with drift and control vector fields
$$g_0 = x_2 \frac{\partial}{\partial x_1} + x_3 \frac{\partial}{\partial x_2}, \qquad g_1 = x_3 \frac{\partial}{\partial x_2}, \qquad g_2 = \frac{\partial}{\partial x_3},$$
respectively. The equations governing the behaviour of this transformed system are
$$\dot{x}_1(t) = x_2(t), \qquad \dot{x}_2(t) = x_3(t) + x_3(t) v_1(t), \qquad \dot{x}_3(t) = v_2(t). \tag{9.2}$$
Note that, although the system represented by (9.2) is different from that represented by (9.1), there is an exact correspondence between the sets of controlled trajectories for these systems, this being given by the mapping of the controlled trajectory
$$(t \mapsto (x_1(t), x_2(t), x_3(t)), \; t \mapsto (u_1(t), u_2(t)))$$
for (9.1) to the controlled trajectory
$$(t \mapsto (x_1(t), x_2(t), x_3(t)), \; t \mapsto (u_1(t) - 1, u_2(t)))$$
for (9.2). Note in particular that the trajectories in $M$ are the same for both systems.
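The effect of the feedback transformation on the linearisations can be checked with the classical Kalman rank condition. The sketch below is illustrative only: the matrices are obtained by differentiating the right-hand sides of (9.1) and (9.2) at $x = 0$ with zero control, the helper names `rank` and `kalman_rank` are assumptions, and the Kalman test is used here as a stand-in for whatever criterion the notes develop later.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def rank(M):
    """Rank of a matrix by Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    rk = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(rk, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for i in range(len(M)):
            if i != rk and M[i][col] != 0:
                factor = M[i][col] / M[rk][col]
                M[i] = [a - factor * b for a, b in zip(M[i], M[rk])]
        rk += 1
    return rk


def kalman_rank(A, B):
    """Rank of the controllability matrix [B, AB, ..., A^(n-1) B]."""
    n = len(A)
    blocks, block = [], B
    for _ in range(n):
        blocks.append(block)
        block = matmul(A, block)
    # Concatenate the blocks side by side, row by row.
    ctrb = [sum((blk[i] for blk in blocks), []) for i in range(n)]
    return rank(ctrb)


B = [[0, 0], [0, 0], [0, 1]]             # same input matrix for both systems
A1 = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]   # linearisation of (9.1) at 0
A2 = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]   # linearisation of (9.2) at 0
print(kalman_rank(A1, B), kalman_rank(A2, B))
```

A rank equal to the state dimension, here 3, is the Kalman condition for controllability of a linear system; the two linearisations differ only in the entry coupling $x_3$ into $\dot{x}_2$, which is exactly the term produced by the shift $u_1 = v_1 + 1$.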


We next wish to investigate the linearisation of the system (9.2). This is easily seen to be
$$\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \\ \dot{x}_3(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} v_1(t) \\ v_2(t) \end{bmatrix}.$$
This system can be shown to be controllable using Corollary 9.4.9. The point of this example is that controllability properties are not invariant under feedback transformations. Specifically, in this example it is controllability of the linearisation that is not invariant under feedback transformations.

9.2.7 Controllability might be computationally difficult to decide

Let $n \in \mathbb{Z}_{>0}$, let r =
1 (4n 3

1) , and let j = (4n 2 j)/(2 j + 1) , j = (2 j + 1)( j + 2) (4n + 1), j = (2 j + 1)( j + 2) (4n + 1)

for j {1, . . . , 2n}. Consider the system on M = R2n+r+2 with governing equations 0 = u, x j = x j1 , x j = y
x jj xjj +1 , 2n1 n = x2 x2 , y 0 x1 = P( y), z

j {1, . . . , r}, j {1, . . . , n 1, n + 1, . . . , 2n},

where $P$ is a homogeneous polynomial. Standard techniques easily show that the $(x, y)$ subsystem, which is decoupled from $z$, has the property that any state nearby $0$ can be reached from $0$ in as small a time as desired. It is then more or less clear that the entire system itself has this property if and only if $P$ changes sign in any neighbourhood of $0$; deciding this is NP-hard. The point is that coming up with an algorithm that efficiently decides whether a given system is controllable may be quite hard. Note, however, that if you come up with an efficient algorithm that can correctly decide whenever a system is controllable, you will be the recipient of \$1,000,000 from the Clay Institute for solving one of the Millennium Problems. Good luck!
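To make the decision problem concrete, here is a brute-force sketch of the sign test for a homogeneous polynomial. The function name `takes_both_signs` and the sampling grid are illustrative assumptions. Because $P(sy) = s^d P(y)$ for $s > 0$, the polynomial changes sign in every neighbourhood of $0$ exactly when it takes both signs somewhere, so it suffices to look at points of any fixed size. The sketch evaluates $P$ on a grid whose size grows exponentially with the number of variables, and a negative answer only means that no sign change was found on the grid, not that none exists.

```python
from itertools import product


def takes_both_signs(P, dim, grid=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """Search a grid in [-1, 1]^dim for points where P is positive and negative.

    Returns True as soon as both signs have been seen. Returning False is
    inconclusive: the grid (an arbitrary choice) may simply have missed them.
    """
    seen_pos = seen_neg = False
    for y in product(grid, repeat=dim):
        value = P(y)
        seen_pos = seen_pos or value > 0
        seen_neg = seen_neg or value < 0
        if seen_pos and seen_neg:
            return True
    return False


print(takes_both_signs(lambda y: y[0] ** 2 - y[1] ** 2, dim=2))
print(takes_both_signs(lambda y: y[0] ** 2 + y[1] ** 2, dim=2))
```

The exponential number of evaluations in this naive search is consistent with the hardness statement above, although it is of course not a proof of it.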

9.3 Accessibility theory


In this section we consider the matter of deciding the accessibility of a system. As we shall see, it is often relatively straightforward to determine whether a system is accessible, particularly in the analytic case. The Orbit Theorem, Theorem 5.3.16, and related constructions play an important role in accessibility theory. Indeed, we begin with a modification of the notion of an orbit.

9.3.1 Positive orbits and positive fixed-time orbits

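The Orbit Theorem gives a clear description of the orbits for a family of vector fields. The problem with the Orbit Theorem, as concerns its direct application to control theory, is that in control theory time only goes forwards. Thus some of the power of the Orbit Theorem is lost. However, one can still define useful concepts related to orbits that take into consideration that time goes forward. This is what we do in this section.

Let $r \in \{\infty, \omega\}$, let $M$ be a $C^r$-manifold, and let $\mathscr{X} \subset \Gamma^r(TM)$. To begin, we define semigroups $\mathrm{Diff}^+(\mathscr{X})$ and $\mathrm{Diff}^+_T(\mathscr{X})$ of local diffeomorphisms by
$$\mathrm{Diff}^+(\mathscr{X}) = \bigl\{ \Phi^{\boldsymbol{X}}_{\boldsymbol{t}} \bigm| \boldsymbol{X} \in \mathscr{X}^k, \ \boldsymbol{t} \in \mathbb{R}^k_{\geq 0}, \ k \in \mathbb{Z}_{>0} \bigr\}$$
and
$$\mathrm{Diff}^+_T(\mathscr{X}) = \Bigl\{ \Phi^{\boldsymbol{X}}_{\boldsymbol{t}} \Bigm| \boldsymbol{X} \in \mathscr{X}^k, \ \boldsymbol{t} \in \mathbb{R}^k_{\geq 0}, \ \sum_{j=1}^{k} t_j = T, \ k \in \mathbb{Z}_{>0} \Bigr\}.$$
Corresponding to these we have the following subsets of the orbit and the $T$-orbit.

9.3.1 Definition (Positive orbit, positive fixed-time orbit) Let $r \in \{\infty, \omega\}$, let $M$ be a $C^r$-manifold, let $\mathscr{X} \subset \Gamma^r(TM)$ be a family of $C^r$-vector fields, and let $x_0 \in M$.
(i) The positive orbit of $\mathscr{X}$ through $x_0$ is the set
$$\mathrm{Orb}^+(x_0, \mathscr{X}) = \bigl\{ \Phi^{\boldsymbol{X}}_{\boldsymbol{t}}(x_0) \bigm| \boldsymbol{X} \in \mathscr{X}^k, \ \boldsymbol{t} \in \mathbb{R}^k_{\geq 0}, \ k \in \mathbb{Z}_{>0} \bigr\}.$$
(ii) For $T \in \mathbb{R}_{\geq 0}$, the positive $T$-orbit of $\mathscr{X}$ through $x_0$ is the set
$$\mathrm{Orb}^+_T(x_0, \mathscr{X}) = \Bigl\{ \Phi^{\boldsymbol{X}}_{\boldsymbol{t}}(x_0) \Bigm| \boldsymbol{X} \in \mathscr{X}^k, \ \boldsymbol{t} \in \mathbb{R}^k_{\geq 0}, \ \sum_{j=1}^{k} t_j = T, \ k \in \mathbb{Z}_{>0} \Bigr\}.$$
(iii) A positive fixed-time orbit of $\mathscr{X}$ through $x_0 \in M$ is a set of the form $\mathrm{Orb}^+_T(x_0, \mathscr{X})$ for some $T \in \mathbb{R}_{\geq 0}$.

Associated with these subsets of orbits and fixed-time orbits are the following notions.

9.3.2 Definition (Attainable, strongly attainable) Let $r \in \{\infty, \omega\}$, let $M$ be a $C^r$-manifold, let $\mathscr{X} \subset \Gamma^r(TM)$ be a family of $C^r$-vector fields, and let $x_0 \in M$.
(i) The family of vector fields $\mathscr{X}$ is attainable from $x_0$ if $\mathrm{int}(\mathrm{Orb}^+(x_0, \mathscr{X})) \neq \emptyset$.
(ii) The family of vector fields $\mathscr{X}$ is locally attainable from $x_0$ if there exists $T \in \mathbb{R}_{>0}$ such that
$$\mathrm{int}\Bigl( \bigcup_{\tau \in (0, t]} \mathrm{Orb}^+_\tau(x_0, \mathscr{X}) \Bigr) \neq \emptyset$$
for every $t \in (0, T]$.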


(iii) The family of vector fields $\mathscr{X}$ is strongly attainable from $x_0$ if there exists $T \in \mathbb{R}_{>0}$ such that $\mathrm{int}(\mathrm{Orb}^+_T(x_0, \mathscr{X})) \neq \emptyset$.
(iv) The family of vector fields $\mathscr{X}$ is locally strongly attainable from $x_0$ if there exists $T \in \mathbb{R}_{>0}$ such that $\mathrm{int}(\mathrm{Orb}^+_t(x_0, \mathscr{X})) \neq \emptyset$ for every $t \in (0, T]$.

It is evident that generally we will have $\mathrm{Orb}^+(x_0, \mathscr{X}) \subsetneq \mathrm{Orb}(x_0, \mathscr{X})$ and $\mathrm{Orb}^+_T(x_0, \mathscr{X}) \subsetneq \mathrm{Orb}_T(x_0, \mathscr{X})$. However, in the analytic case, the positive orbits are nice subsets of the orbits. The following result is due to Sussmann and Jurdjevic [1972].

9.3.3 Theorem (Property of positive orbits) Let $M$ be an analytic manifold, let $\mathscr{X} \subset \Gamma^\omega(TM)$, and let $x_0 \in M$. Then, with respect to the orbit topologies on $\mathrm{Orb}(x_0, \mathscr{X})$ and $\mathrm{Orb}_T(x_0, \mathscr{X})$,
(i) $\mathrm{int}(\mathrm{Orb}^+(x_0, \mathscr{X}))$ is dense in $\mathrm{Orb}^+(x_0, \mathscr{X})$ and
(ii) $\mathrm{int}(\mathrm{Orb}^+_T(x_0, \mathscr{X}))$ is dense in $\mathrm{Orb}^+_T(x_0, \mathscr{X})$.

The theorem is generally false in the absence of the assumption of analyticity.

9.3.4 Example (The positive orbit for smooth systems does not have a dense interior in the orbit) We take $M = \mathbb{R}^2$ and define vector fields $F_0 = \frac{\partial}{\partial x_1}$ and $F_1 = f(x_1) \frac{\partial}{\partial x_2}$, where
$$f(x) = \begin{cases} e^{-1/x^2}, & x > 0, \\ 0, & x \leq 0. \end{cases}$$
We consider the family of vector fields $\mathscr{X} = \{F_0 + u F_1 \mid u \in \mathbb{R}\}$. The positive orbits for this system are the reachable sets for the control-affine system considered in Section 9.2.1. In particular, the positive orbit from $(-1, 0)$ is shown in Figure 9.2. In Example 5.3.133 we showed that $\mathrm{Orb}(x, \mathscr{X}) = \mathbb{R}^2$ for every $x \in \mathbb{R}^2$. In particular, $\mathrm{int}(\mathrm{Orb}^+((-1, 0), \mathscr{X}))$ is not dense in $\mathrm{Orb}((-1, 0), \mathscr{X})$.

Now, for analytic systems, it is more or less easy to imagine how attainability is related to the Orbit Theorem. The following result was obtained by Sussmann and Jurdjevic [1972].

9.3.5 Theorem (Attainability for analytic systems) Let $\mathscr{X} \subset \Gamma^\omega(TM)$ be a family of analytic vector fields on an analytic manifold. For $x_0 \in M$, the following statements hold:
(i) the family $\mathscr{X}$ is locally attainable from $x_0$ if and only if $\mathscr{L}^{(\infty)}(\mathscr{X})_{x_0} = T_{x_0} M$;
(ii) the family $\mathscr{X}$ is locally strongly attainable from $x_0$ if and only if $\mathscr{I}(\mathscr{X})_{x_0} = T_{x_0} M$.

9.3.2 Accessibility for control systems

In this section we use the attainability results of Sussmann and Jurdjevic [1972] to arrive at accessibility results for control systems, i.e., the class of systems introduced in Section 6.3. We consider a time-independent control system $\Sigma = (M, F, C)$ of class $C^\omega$. We let $\mathscr{U}$ be a class of admissible controls containing $L_{\mathrm{pwc}}(\mathbb{T}; C)$ for any interval $\mathbb{T}$. For fixed $u \in C$ note that $F^u : x \mapsto F(x, u)$ defines an analytic vector field on $M$. We can then define a family of vector fields $\mathscr{X}_\Sigma = \{F^u \mid u \in C\}$ on $M$.

Let $x_0 \in M$. If $x \in \mathrm{Orb}^+(x_0, \mathscr{X}_\Sigma)$, this means that there exist $u_1, \dots, u_k \in C$ and $t_1, \dots, t_k \in \mathbb{R}_{>0}$ such that
\[ x = \Phi^{F^{u_k}}_{t_k} \circ \cdots \circ \Phi^{F^{u_1}}_{t_1}(x_0). \]
Let $T_0 = 0$ and $T_j = \sum_{i=1}^{j} t_i$, $j \in \{1, \dots, k\}$. Thus, if we define $\mu \in L_{\mathrm{pwc}}([0, T_k]; C)$ by $\mu(t) = u_j$ when $t \in [T_{j-1}, T_j)$ and $\mu(T_k) = u_k$, then we see that $\xi(T_k) = x$ where $\xi$ is the solution to the initial value problem
\[ \xi'(t) = F(\xi(t), \mu(t)), \qquad \xi(0) = x_0. \tag{9.3} \]
Thus $x \in \mathcal{R}_\Sigma(x_0, \mathscr{U})$. Conversely, suppose that $\mu \in L_{\mathrm{pwc}}([0, T]; C)$ for some $T \in \mathbb{R}_{>0}$. Then there exist $T_0, T_1, \dots, T_k \in [0, T]$ such that $0 = T_0 < T_1 < \cdots < T_{k-1} < T_k = T$ and such that $\mu(t) = u_j$ for $t \in (T_{j-1}, T_j)$. We can then define $t_j = T_j - T_{j-1}$, $j \in \{1, \dots, k\}$. Then, if $x = \xi(T)$ where $\xi$ is the solution to the initial value problem (9.3), we have
\[ x = \Phi^{F^{u_k}}_{t_k} \circ \cdots \circ \Phi^{F^{u_1}}_{t_1}(x_0), \]
and so $x \in \mathrm{Orb}^+(x_0, \mathscr{X}_\Sigma)$. Thus $\mathrm{Orb}^+(x_0, \mathscr{X}_\Sigma) \subset \mathcal{R}_\Sigma(x_0, \mathscr{U})$. A similar argument shows, of course, that $\mathrm{Orb}^+_T(x_0, \mathscr{X}_\Sigma) \subset \mathcal{R}_\Sigma(x_0, T, \mathscr{U})$. In fact, we have the following theorem.

9.3.6 Theorem (Accessibility for analytic control systems) Let $\Sigma = (M, F, C)$ be an analytic time-independent control system, let $\mathscr{U}$ be a class of admissible controls containing the piecewise constant controls, and let $x_0 \in M$. Then the following statements hold:
(i) $\Sigma$ is locally accessible from $x_0$ if and only if $L^{(\infty)}(\mathscr{X}_\Sigma)_{x_0} = T_{x_0}M$;
(ii) $\Sigma$ is locally strongly accessible from $x_0$ if and only if $I(\mathscr{X}_\Sigma)_{x_0} = T_{x_0}M$.
Moreover,
(iii) if $\Sigma$ is locally accessible from $x_0$ then $\mathrm{int}(\mathcal{R}_\Sigma(x_0, \le T, \mathscr{U}))$ is dense in $\mathcal{R}_\Sigma(x_0, \le T, \mathscr{U})$ for every $T \in \mathbb{R}_{>0}$;
(iv) if $\Sigma$ is locally strongly accessible from $x_0$ then $\mathrm{int}(\mathcal{R}_\Sigma(x_0, T, \mathscr{U}))$ is dense in $\mathcal{R}_\Sigma(x_0, T, \mathscr{U})$ for every $T \in \mathbb{R}_{>0}$.


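Proof (i) That $\Sigma$ is locally accessible from $x_0$ if $L^{(\infty)}(\mathscr{X}_\Sigma)_{x_0} = T_{x_0}M$ follows from our arguments before the statement of the theorem, along with Theorem 9.3.5. For the converse statement, we first prove a lemma.

1 Lemma If $\xi \in \mathrm{Traj}(\Sigma, \mathscr{U})$ has the property that $\xi(t_0) \in \mathrm{Orb}(x_0, \mathscr{X}_\Sigma)$ for some $t_0 \in \mathrm{TD}(\xi)$, then $\xi(t) \in \mathrm{Orb}(x_0, \mathscr{X}_\Sigma)$ for every $t \in \mathrm{TD}(\xi)$.

Proof Let $x_0 = \xi(t_0)$, let $t \in \mathrm{TD}(\xi)$, and, by Theorem 8.3.2, let $(\xi_j)_{j \in \mathbb{Z}_{>0}}$ be a sequence of trajectories defined on $[t_0, t]$ with piecewise constant controls such that $\lim_{j \to \infty} \xi_j(t) = \xi(t)$. By our comments preceding the statement of the theorem, each of the curves $\xi_j$ is a finite concatenation of integral curves of vector fields from $\mathscr{X}_\Sigma$. Thus, since the vector fields from $\mathscr{X}_\Sigma$ are tangent to $\mathrm{Orb}(x_0, \mathscr{X}_\Sigma)$, the curves $\xi_j$ take values in $\mathrm{Orb}(x_0, \mathscr{X}_\Sigma)$. Since $\mathrm{Orb}(x_0, \mathscr{X}_\Sigma)$ is an immersed submanifold, it follows that $\lim_{j \to \infty} \xi_j(t) \in \mathrm{Orb}(x_0, \mathscr{X}_\Sigma)$, giving the result. ▼

From the lemma we conclude that $\mathcal{R}_\Sigma(x_0, t, \mathscr{U}) \subset \mathrm{Orb}(x_0, \mathscr{X}_\Sigma)$ for every $t \in \mathbb{R}_{\ge 0}$. Therefore, if $\Sigma$ is locally accessible from $x_0$, we must have $T_{x_0}\mathrm{Orb}(x_0, \mathscr{X}_\Sigma) = T_{x_0}M$. By the appropriate version of the Orbit Theorem (Corollary 5.3.25 in this case), it follows that $L^{(\infty)}(\mathscr{X}_\Sigma)_{x_0} = T_{x_0}M$.

(ii) This part of the theorem follows from an argument similar to that for the first part of the theorem.

(iii) Note that the orbit topology on the connected component of $M$ containing $x_0$ is the same as the manifold topology. From Theorem 8.3.2 and the arguments preceding the statement of the theorem, $\mathrm{Orb}^+(x_0, \mathscr{X}_\Sigma)$ is dense in $\mathcal{R}_\Sigma(x_0, \mathscr{U})$. Therefore, by Theorem 9.3.3, $\mathrm{int}(\mathrm{Orb}^+(x_0, \mathscr{X}_\Sigma))$ is dense in $\mathcal{R}_\Sigma(x_0, \mathscr{U})$. Since $\mathrm{Orb}^+(x_0, \mathscr{X}_\Sigma) \subset \mathcal{R}_\Sigma(x_0, \mathscr{U})$ we also have $\mathrm{int}(\mathrm{Orb}^+(x_0, \mathscr{X}_\Sigma)) \subset \mathrm{int}(\mathcal{R}_\Sigma(x_0, \mathscr{U}))$. Thus we can conclude that $\mathrm{int}(\mathcal{R}_\Sigma(x_0, \mathscr{U}))$ is dense in $\mathcal{R}_\Sigma(x_0, \mathscr{U})$, as desired.

(iv) This follows from an argument similar to that for the preceding part of the theorem.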
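The correspondence used before Theorem 9.3.6, between a concatenation of flows and a trajectory driven by a piecewise constant control, can be checked numerically. The following is a minimal sketch under stated assumptions: the system (a double integrator on $\mathbb{R}^2$ with $F(x, u) = (x_2, u)$), the control values, and the durations are all hypothetical choices made for illustration, and the names `flow`, `mu`, and `integrate` are not from the text.

```python
from bisect import bisect_right

def F(x, u):
    """Hypothetical system: F(x, u) = (x2, u), a double integrator."""
    return (x[1], u)

def flow(u, t, x):
    """Closed-form flow of the vector field F^u for time t."""
    x1, x2 = x
    return (x1 + x2 * t + 0.5 * u * t * t, x2 + u * t)

controls = [1.0, -2.0, 0.5]     # u_1, u_2, u_3 (illustrative values)
durations = [0.5, 0.25, 0.75]   # t_1, t_2, t_3 (illustrative values)

# Switching times T_0 = 0 and T_j = t_1 + ... + t_j.
switch_times = [0.0]
for t in durations:
    switch_times.append(switch_times[-1] + t)

def mu(t):
    """Piecewise constant control: u_j on [T_{j-1}, T_j), and u_k at T_k."""
    j = bisect_right(switch_times, t) - 1
    return controls[min(j, len(controls) - 1)]

def integrate(x, steps_per_interval=100):
    """Midpoint-rule integration of xi'(t) = F(xi(t), mu(t)) from t = 0 to T_k."""
    for j, dt in enumerate(durations):
        h = dt / steps_per_interval
        for i in range(steps_per_interval):
            u = mu(switch_times[j] + (i + 0.5) * h)  # control at the step midpoint
            k1 = F(x, u)
            x_half = (x[0] + 0.5 * h * k1[0], x[1] + 0.5 * h * k1[1])
            k2 = F(x_half, u)
            x = (x[0] + h * k2[0], x[1] + h * k2[1])
    return x

x0 = (1.0, 0.0)

# Point of the positive orbit: compose the flows in the order u_1, ..., u_k.
x_flows = x0
for u, t in zip(controls, durations):
    x_flows = flow(u, t, x_flows)

# Endpoint of the trajectory for the piecewise constant control mu.
x_traj = integrate(x0)
```

The two endpoints `x_flows` and `x_traj` agree up to rounding, since the midpoint rule happens to be exact for this particular system; for a general system one would only expect agreement up to the integrator's error.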

For smooth control systems, it is possible to give useful results for accessibility, even though they are not as sharp as those for analytic systems.

9.3.7 Theorem (Accessibility for smooth control systems) Let $\Sigma = (M, F, C)$ be a smooth time-independent control system, let $\mathscr{U}$ be a class of admissible controls containing the piecewise constant controls, and let $x_0 \in M$. Then the following statements hold:
(i) if $L^{(\infty)}(\mathscr{X}_\Sigma)_{x_0} = T_{x_0}M$ then $\Sigma$ is locally accessible from $x_0$;
(ii) if $I(\mathscr{X}_\Sigma)_{x_0} = T_{x_0}M$ then $\Sigma$ is locally strongly accessible from $x_0$;
(iii) if $\Sigma$ is locally accessible from every $x \in M$, then there exists an open dense subset $O \subset M$ such that $L^{(\infty)}(\mathscr{X}_\Sigma)_x = T_xM$ for every $x \in O$;
(iv) if $\Sigma$ is locally strongly accessible from every $x \in M$, then there exists an open dense subset $O \subset M$ such that $I(\mathscr{X}_\Sigma)_x = T_xM$ for every $x \in O$.

Proof The constructions preceding the statement of Theorem 9.3.6 are valid for smooth systems, and so for smooth systems we have $\mathrm{Orb}^+(x_0, \mathscr{X}_\Sigma) \subset \mathcal{R}_\Sigma(x_0, \mathscr{U})$. Thus, just as in Theorem 9.3.6, we conclude that if $L^{(\infty)}(\mathscr{X}_\Sigma)_{x_0} = T_{x_0}M$ then $\Sigma$ is locally accessible from $x_0$. A similar argument is valid for the second assertion, and so the first two parts of the theorem are proved.

For the third part of the theorem, note that the set of points where $L^{(\infty)}(\mathscr{X}_\Sigma)_x = T_xM$ is open by Proposition 5.1.7. To prove that the set of points where this holds is dense, suppose otherwise, and that there exists an open set $U$ such that $\dim(L^{(\infty)}(\mathscr{X}_\Sigma)_x) < \dim(T_xM)$ for each $x \in U$. Then, again by Proposition 5.1.7, there exists an open set $V \subset U$ such that $\dim(L^{(\infty)}(\mathscr{X}_\Sigma)_x) = k$ for all $x \in V$. Let $x \in V$. By Frobenius's Theorem, there exists a $k$-dimensional submanifold $N$ containing $x$ such that $T_yN = L^{(\infty)}(\mathscr{X}_\Sigma)_y$ for every $y \in N$. In particular, this implies that $F^u(y) \in T_yN$ for every $y \in N$ and so, following Lemma 1, trajectories for $\Sigma$ starting from $x$ remain in $N$ sufficiently near $x$. This prohibits $\Sigma$ from being locally accessible from $x$.

An argument similar to that for the third part of the result gives the last part of the result.
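To see why parts (i) and (ii) of Theorem 9.3.7 are only sufficient conditions, a worked example may help. The sketch below uses a hypothetical system on $\mathbb{R}^2$ built from a flat function; the choice of system is an assumption made for illustration, and local accessibility is taken to mean that the sets of points reachable in arbitrarily small time have nonempty interior. Signs of brackets depend on the bracket convention, but the spans do not.

```latex
% Illustrative system (an assumption for this sketch): M = R^2, C = R,
%   F(x,u) = F_0(x) + u F_1(x),
%   F_0 = \partial/\partial x_1,  F_1 = f(x_1)\,\partial/\partial x_2,
%   f(s) = e^{-1/s^2} for s \neq 0,  f(0) = 0.
\begin{align*}
&[F_0, F_1] = f'(x_1)\,\frac{\partial}{\partial x_2}, \qquad
  \operatorname{ad}_{F_0}^k F_1 = f^{(k)}(x_1)\,\frac{\partial}{\partial x_2}, \qquad
  [F_1, \operatorname{ad}_{F_0}^k F_1] = 0, \\
&f^{(k)}(0) = 0 \ \text{for all } k \in \mathbb{Z}_{\ge 0}
  \ \Longrightarrow\
  L^{(\infty)}(\mathscr{X}_\Sigma)_{(0,x_2)}
  = \operatorname{span}_{\mathbb{R}}\Bigl(\frac{\partial}{\partial x_1}\Bigr)
  \neq T_{(0,x_2)}\mathbb{R}^2, \\
&\xi(t) = \Bigl(t,\ x_2 + u \int_0^t f(s)\,\mathrm{d}s\Bigr)
  \ \text{for constant } u \in \mathbb{R}
  \ \Longrightarrow\
  \{(t, y) \mid 0 < t \le T,\ y \in \mathbb{R}\}
  \subset \mathcal{R}_\Sigma((0,x_2), \le T, \mathscr{U}).
\end{align*}
```

Since $\int_0^t f(s)\,\mathrm{d}s > 0$ for $t > 0$, the last inclusion shows that this system is locally accessible from every point of the line $x_1 = 0$, even though the rank condition fails there. The rank condition does hold on the open dense set where $x_1 \neq 0$, which is consistent with part (iii).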

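9.3.3 Accessibility for control-affine systems

Of course, control-affine systems are control systems, and so the accessibility results from the preceding section also apply to analytic control-affine systems. However, for control-affine systems, the special form of the system allows one to state simpler conditions for accessibility, using only the drift and control vector fields $F_0, F_1, \dots, F_m$. The key to doing this is the following pair of observations.

9.3.8 Lemma (The family of vector fields associated to a control-affine system I) Let $\Sigma = (M, F = (F_0, F_1, \dots, F_m), C)$ be a smooth time-independent control-affine system. If $\mathrm{aff}(C) = \mathbb{R}^m$ then $\mathrm{span}_{\mathbb{R}}(\mathscr{F}_\Sigma) = \mathrm{span}_{\mathbb{R}}(\mathscr{X}_\Sigma)$.

Proof By definition of $\mathscr{X}_\Sigma$, the inclusion $\mathrm{span}_{\mathbb{R}}(\mathscr{X}_\Sigma) \subset \mathrm{span}_{\mathbb{R}}(\mathscr{F}_\Sigma)$ holds. By Proposition 4.3.9 there exist $\lambda_1, \dots, \lambda_k \in \mathbb{R}$ and $u_1, \dots, u_k \in C$ such that
\[ \sum_{j=1}^{k} \lambda_j = 1, \qquad 0 = \sum_{j=1}^{k} \lambda_j u_j. \]
Therefore,
\[ \sum_{j=1}^{k} \lambda_j \Bigl( F_0 + \sum_{a=1}^{m} u_j^a F_a \Bigr) = F_0 + \sum_{j=1}^{k} \sum_{a=1}^{m} \lambda_j u_j^a F_a = F_0. \]
Thus $F_0 \in \mathrm{span}_{\mathbb{R}}(\mathscr{X}_\Sigma)$. Similarly, again by Proposition 4.3.9, for each $a \in \{1, \dots, m\}$ there exist $\lambda_1, \dots, \lambda_k \in \mathbb{R}$ and $u_1, \dots, u_k \in C$ such that
\[ \sum_{j=1}^{k} \lambda_j = 1, \qquad e_a = \sum_{j=1}^{k} \lambda_j u_j, \]
where $e_a$, $a \in \{1, \dots, m\}$, is the $a$th standard basis vector for $\mathbb{R}^m$. Therefore
\[ \sum_{j=1}^{k} \lambda_j \Bigl( F_0 + \sum_{b=1}^{m} u_j^b F_b \Bigr) = F_0 + \sum_{j=1}^{k} \sum_{b=1}^{m} \lambda_j u_j^b F_b = F_0 + F_a. \]
Thus $F_0 + F_a \in \mathrm{span}_{\mathbb{R}}(\mathscr{X}_\Sigma)$, showing that $F_a \in \mathrm{span}_{\mathbb{R}}(\mathscr{X}_\Sigma)$, $a \in \{1, \dots, m\}$. Thus we have shown that the inclusion $\mathrm{span}_{\mathbb{R}}(\mathscr{F}_\Sigma) \subset \mathrm{span}_{\mathbb{R}}(\mathscr{X}_\Sigma)$ also holds.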
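The affine combinations in the proof of Lemma 9.3.8 can be made concrete in the smallest nontrivial case. The following sketch assumes a hypothetical single-input system with the two-point control set $C = \{-1, 1\}$, and writes $F^u = F_0 + uF_1$ for $u \in C$; this choice of $C$ is an illustration only.

```latex
% Hypothetical instance: m = 1, C = \{-1, 1\}, so that aff(C) = R.
% Here F^u = F_0 + u F_1, so F^{1} = F_0 + F_1 and F^{-1} = F_0 - F_1.
\begin{align*}
0 &= \tfrac{1}{2}(-1) + \tfrac{1}{2}(1), & \tfrac{1}{2} + \tfrac{1}{2} &= 1
  &&\Longrightarrow\quad F_0 = \tfrac{1}{2}F^{-1} + \tfrac{1}{2}F^{1}, \\
e_1 = 1 &= 0 \cdot (-1) + 1 \cdot (1), & 0 + 1 &= 1
  &&\Longrightarrow\quad F_0 + F_1 = F^{1}, \\
& & &
  &&\Longrightarrow\quad F_1 = F^{1} - F_0 = \tfrac{1}{2}F^{1} - \tfrac{1}{2}F^{-1}.
\end{align*}
```

Note that the coefficients expressing $F_1$ sum to zero, while those expressing $F_0$ sum to one.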


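9.3.9 Lemma (The family of vector fields associated to a control-affine system II) Let $\Sigma = (M, F = (F_0, F_1, \dots, F_m), C)$ be a control-affine system. Let $L_0^{(\infty)}(\mathscr{F}_\Sigma)$ be the smallest subalgebra of $\Gamma^\infty(TM)$ containing $\{F_1, \dots, F_m\}$ and which is invariant under $F_0$, i.e., $[F_0, X] \in L_0^{(\infty)}(\mathscr{F}_\Sigma)$ for each $X \in L_0^{(\infty)}(\mathscr{F}_\Sigma)$. The following statements hold:
(i) $L_0^{(\infty)}(\mathscr{F}_\Sigma)$ is generated as an $\mathbb{R}$-vector space by vector fields of the form
\[ [F_{a_1}, [F_{a_2}, \cdots, [F_{a_{k-1}}, F_a]]], \qquad a_1, \dots, a_{k-1} \in \{0, 1, \dots, m\},\ a \in \{1, \dots, m\}; \tag{9.4} \]
(ii) if $\mathrm{aff}(C) = \mathbb{R}^m$ then $L_0^{(\infty)}(\mathscr{F}_\Sigma) = I(\mathscr{X}_\Sigma)$.

Proof (i) Clearly $F_1, \dots, F_m \in L_0^{(\infty)}(\mathscr{F}_\Sigma)$. Also, since $L_0^{(\infty)}(\mathscr{F}_\Sigma)$ is a subalgebra and is invariant under $F_0$, the vector fields $[F_{a_1}, F_a]$, $a_1 \in \{0, 1, \dots, m\}$, $a \in \{1, \dots, m\}$, are in $L_0^{(\infty)}(\mathscr{F}_\Sigma)$. Continuing in this way one readily sees that each of the vector fields of the form (9.4) is in $L_0^{(\infty)}(\mathscr{F}_\Sigma)$. Thus the vector fields (9.4) must be contained in $L_0^{(\infty)}(\mathscr{F}_\Sigma)$. This part of the lemma follows since, by definition, $L_0^{(\infty)}(\mathscr{F}_\Sigma)$ is the smallest subalgebra containing these generators.

(ii) If $u \in C$ let us write
\[ F^u = F_0 + \sum_{a=1}^{m} u^a F_a \in \Gamma^\infty(TM). \]
Since $\mathrm{aff}(C) = \mathbb{R}^m$, by Lemma 9.3.8 we have $\mathrm{span}_{\mathbb{R}}(\mathscr{F}_\Sigma) = \mathrm{span}_{\mathbb{R}}(\mathscr{X}_\Sigma)$, meaning that the derived algebras of $L^{(\infty)}(\mathscr{F}_\Sigma)$ and $L^{(\infty)}(\mathscr{X}_\Sigma)$ agree: $D(\mathscr{F}_\Sigma) = D(\mathscr{X}_\Sigma)$. Since $D(\mathscr{F}_\Sigma)$ is the subalgebra generated by the vector fields
\[ [F_{a_1}, F_{a_2}],\ [F_{a_1}, [F_{a_2}, F_{a_3}]],\ \dots, \qquad F_{a_k} \in \mathscr{F}_\Sigma, \]
this part of the lemma will follow from the first part of the lemma if we can show that $\mathrm{span}_{\mathbb{R}}(\mathscr{X}_{\Sigma,0}) = \mathrm{span}_{\mathbb{R}}(F_1, \dots, F_m)$. A typical element of $\mathscr{X}_{\Sigma,0}$ looks like
\[ \sum_{j=1}^{k} \lambda_j \Bigl( F_0 + \sum_{a=1}^{m} u_j^a F_a \Bigr) = \sum_{j=1}^{k} \lambda_j \sum_{a=1}^{m} u_j^a F_a, \qquad \lambda_1, \dots, \lambda_k \in \mathbb{R},\ \sum_{j=1}^{k} \lambda_j = 0,\ u_1, \dots, u_k \in C. \]
Thus we obviously have $\mathrm{span}_{\mathbb{R}}(\mathscr{X}_{\Sigma,0}) \subset \mathrm{span}_{\mathbb{R}}(F_1, \dots, F_m)$. Since $\mathrm{aff}(C) = \mathbb{R}^m$, the linear part of $\mathrm{aff}(C)$ is also $\mathbb{R}^m$. This means that, for any $a \in \{1, \dots, m\}$, there exist $\lambda_1, \dots, \lambda_k \in \mathbb{R}$ and $u_1, \dots, u_k \in C$ such that
\[ \sum_{j=1}^{k} \lambda_j = 0, \qquad \sum_{j=1}^{k} \lambda_j u_j = e_a. \]
Therefore
\[ \sum_{j=1}^{k} \lambda_j \Bigl( F_0 + \sum_{b=1}^{m} u_j^b F_b \Bigr) = F_a, \]
showing that $\mathrm{span}_{\mathbb{R}}(F_1, \dots, F_m) \subset \mathrm{span}_{\mathbb{R}}(\mathscr{X}_{\Sigma,0})$, and so proving the lemma.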

As usual, we denote
\[ L_0^{(\infty)}(\mathscr{F}_\Sigma)_x = \{X(x) \mid X \in L_0^{(\infty)}(\mathscr{F}_\Sigma)\}, \]
so defining a distribution $L_0^{(\infty)}(\mathscr{F}_\Sigma)$ on $M$. The preceding two results and Theorem 9.3.6 give the following result.

9.3.10 Theorem (Accessibility for analytic control-affine systems) Let $\Sigma = (M, F = (F_0, F_1, \dots, F_m), C)$ be an analytic time-independent control-affine system, let $\mathscr{U}$ be a class of admissible controls containing the piecewise constant controls, and let $x_0 \in M$. If $\mathrm{aff}(C) = \mathbb{R}^m$, then the following statements hold:
(i) $\Sigma$ is locally accessible from $x_0$ if and only if $L^{(\infty)}(\mathscr{F}_\Sigma)_{x_0} = T_{x_0}M$;
(ii) $\Sigma$ is locally strongly accessible from $x_0$ if and only if $L_0^{(\infty)}(\mathscr{F}_\Sigma)_{x_0} = T_{x_0}M$.
Moreover,
(iii) if $\Sigma$ is locally accessible from $x_0$ then $\mathrm{int}(\mathcal{R}_\Sigma(x_0, \le T, \mathscr{U}))$ is dense in $\mathcal{R}_\Sigma(x_0, \le T, \mathscr{U})$ for every $T \in \mathbb{R}_{>0}$;
(iv) if $\Sigma$ is locally strongly accessible from $x_0$ then $\mathrm{int}(\mathcal{R}_\Sigma(x_0, T, \mathscr{U}))$ is dense in $\mathcal{R}_\Sigma(x_0, T, \mathscr{U})$ for every $T \in \mathbb{R}_{>0}$.

One of the important features of the preceding characterisation of accessibility for control-affine systems is that it is checkable. Indeed, Sontag [1988] shows that the accessibility of a control-affine system can be decided by an algorithm whose running time is polynomial with respect to a natural size parameter of the problem.

We also have the result for smooth control-affine systems that follows from Theorem 9.3.7.

9.3.11 Theorem (Accessibility for smooth control-affine systems) Let $\Sigma = (M, F = (F_0, F_1, \dots, F_m), C)$ be a smooth time-independent control-affine system, let $\mathscr{U}$ be a class of admissible controls containing the piecewise constant controls, and let $x_0 \in M$. If $\mathrm{aff}(C) = \mathbb{R}^m$, then the following statements hold:
(i) if $L^{(\infty)}(\mathscr{F}_\Sigma)_{x_0} = T_{x_0}M$ then $\Sigma$ is locally accessible from $x_0$;
(ii) if $L_0^{(\infty)}(\mathscr{F}_\Sigma)_{x_0} = T_{x_0}M$ then $\Sigma$ is locally strongly accessible from $x_0$;
(iii) if $\Sigma$ is locally accessible from every $x \in M$, then there exists an open dense subset $O \subset M$ such that $L^{(\infty)}(\mathscr{X}_\Sigma)_x = T_xM$ for every $x \in O$;
(iv) if $\Sigma$ is locally strongly accessible from every $x \in M$, then there exists an open dense subset $O \subset M$ such that $I(\mathscr{X}_\Sigma)_x = T_xM$ for every $x \in O$.
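The rank conditions in Theorems 9.3.10 and 9.3.11 can be explored numerically for a specific system. The following is a minimal sketch, not a decision procedure: it assumes a hypothetical driftless system on $\mathbb{R}^3$ (a kinematic unicycle with $F_0 = 0$ and $C = \mathbb{R}^2$), approximates Jacobians by central differences, and uses the bracket convention $[X, Y] = DY \cdot X - DX \cdot Y$. The helper names `jacobian`, `lie_bracket`, and `rank` are illustrative.

```python
import math

def jacobian(field, x, h=1e-6):
    """Approximate the Jacobian of a vector field at x by central differences."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = field(xp), field(xm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J

def lie_bracket(X, Y, x):
    """Evaluate [X, Y](x) = DY(x) X(x) - DX(x) Y(x) (one common sign convention)."""
    n = len(x)
    JX, JY = jacobian(X, x), jacobian(Y, x)
    Xx, Yx = X(x), Y(x)
    return [sum(JY[i][j] * Xx[j] - JX[i][j] * Yx[j] for j in range(n))
            for i in range(n)]

def rank(vectors, tol=1e-6):
    """Rank of a list of vectors by Gaussian elimination with partial pivoting."""
    rows = [list(v) for v in vectors]
    r = 0
    for c in range(len(rows[0])):
        if r == len(rows):
            break
        p = max(range(r, len(rows)), key=lambda i: abs(rows[i][c]))
        if abs(rows[p][c]) < tol:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# Hypothetical control vector fields, with state x = (x1, x2, theta).
def F1(x):
    return [math.cos(x[2]), math.sin(x[2]), 0.0]

def F2(x):
    return [0.0, 0.0, 1.0]

x0 = [0.0, 0.0, 0.0]
bracket = lie_bracket(F1, F2, x0)

# The control vector fields alone span a 2-dimensional subspace at x0 ...
rank_without_bracket = rank([F1(x0), F2(x0)])
# ... while adding the single bracket [F1, F2] gives all of the tangent space.
rank_with_bracket = rank([F1(x0), F2(x0), bracket])
```

For this example the bracket supplies the missing direction, so the rank condition holds at `x0`. A numerical rank with a tolerance is only suggestive; an exact check would compute the brackets symbolically.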

9.4 Some controllability results


In this section we present a few more or less elementary results concerning controllability, as opposed to accessibility. Controllability is subtle, and the results we give will be nowhere near as satisfying as those for accessibility. We will give no hard controllability theorems in this section, as these typically require rather specialised developments. What we do instead is give an idea of how one might arrive at some of these hard controllability theorems.

9.4.1 Controllability results for differential inclusion systems

In this section we provide a few results on the controllability of differential inclusion systems. We consider a time-independent differential inclusion system $\Sigma = (M, F)$, and we shall prescribe the properties required of $F$ as we go along. We begin with a rather general result that relies on our constructions with variations from Section 8.4. Thus we recall that in that section, under the assumption that $F$ is smoothly selectionable, we constructed a sequence $(V^k(\Sigma, x_0))_{k \in \mathbb{Z}_{>0}}$ of convex cones in $T_{x_0}M$. The super theorem for the approach to controllability of smoothly selectionable differential inclusion systems is the following.

9.4.1 Theorem (Controllability of differential inclusion systems using variations) Let $\Sigma = (M, F)$ be a smoothly selectionable time-independent differential inclusion system and let $x_0 \in M$. If $V^k(\Sigma, x_0) = T_{x_0}M$ for some $k \in \mathbb{Z}_{>0}$, then $\Sigma$ is small time locally controllable from $x_0$.
Proof This follows immediately from Theorem 8.4.12.

The problem with this theorem is that it is in general very difficult to ascertain the contents of the variational cones $V^k(\Sigma, x_0)$. Therefore, the problem becomes one of determining whether any of the variational cones satisfies the hypotheses of the preceding theorem. Let us give a few more or less elementary results.

9.4.2 Proposition (Fully actuated differential inclusion systems are controllable) Let $\Sigma = (M, F)$ be a smoothly selectionable time-independent differential inclusion system and let $x_0 \in M$. If $0_{x_0} \in \mathrm{conv}(F(x_0))$ then $V^1(\Sigma, x_0) = T_{x_0}M$, and so, in particular, $\Sigma$ is small time locally controllable from $x_0$.
Proof This follows from Proposition 8.4.10(i).

Our next result is a little less trivial. For a smoothly selectionable time-independent differential inclusion system $\Sigma = (M, F)$ and for $x_0 \in M$, let us define
\[ Z_{x_0}(F) = \{X \in \Gamma^\infty(TM) \mid X \text{ is a smooth selection of } F \text{ and } X(x_0) = 0_{x_0}\}. \]
We recall from Lemma 5.4.13 that if $X \in Z_{x_0}(F)$ then there is defined a linear map $A_X(x_0) \in \mathrm{End}(T_{x_0}M)$ such that $A_X(x_0) \cdot v_{x_0} = [V, X](x_0)$, where $V \in \Gamma^1(TM)$ is such that $V(x_0) = v_{x_0}$. We also recall from Lemma 5.4.14 the definition of the subspace $\langle Z_{x_0}(F), U \rangle$, where $U$ is a subspace of $T_{x_0}M$. With these recollections made, we have the following result.

9.4.3 Proposition (A class of subspaces of variations for differential inclusion systems) Let $\Sigma = (M, F)$ be a smoothly selectionable time-independent differential inclusion system and let $x_0 \in M$. If $U \subset V(\Sigma, x_0)$ is a subspace, then $\langle Z_{x_0}(F), U \rangle \subset V(\Sigma, x_0)$. In particular, if $\langle Z_{x_0}(F), U \rangle = T_{x_0}M$, then $\Sigma$ is small time locally controllable from $x_0$.

9.4.4 Corollary (Linear controllability of differential inclusion systems) Let $\Sigma = (M, F)$ be a smoothly selectionable time-independent differential inclusion system and let $x_0 \in M$. If $0_{x_0} \in \mathrm{int}(\mathrm{conv}(F(x_0)))$ then $\langle Z_{x_0}(F), \mathrm{span}_{\mathbb{R}}(F(x_0)) \rangle \subset V(\Sigma, x_0)$. In particular, if $\langle Z_{x_0}(F), \mathrm{span}_{\mathbb{R}}(F(x_0)) \rangle = T_{x_0}M$, then $\Sigma$ is small time locally controllable from $x_0$.

We also have a necessary condition for controllability as follows.

9.4.5 Proposition (A zeroth-order necessary condition for controllability of differential inclusion systems) Let $\Sigma = (M, F)$ be an upper semicontinuous time-independent differential inclusion system and let $x_0 \in M$. If $F(x_0)$ is compact and if $0_{x_0} \notin \mathrm{conv}(F(x_0))$, then $\Sigma$ is not small time locally controllable from $x_0$.

9.4.2 Controllability results for control systems

In this section we provide a few results regarding controllability for control systems such as are defined in Section 6.3. We consider a time-independent $C^\infty$ control system $\Sigma = (M, F, C)$. As with differential inclusion systems, we begin with a sort of super theorem indicating how control variations give rise to controllability. For the statement of the theorem, we recall from Section 8.5 the definition of the sequence $(N^k(\Sigma, x_0))_{k \in \mathbb{Z}_{>0}}$ of convex cones in $T_{x_0}M$.

9.4.6 Theorem (Controllability of control systems using variations) Let $\Sigma = (M, F, C)$ be a smooth time-independent control system and let $x_0 \in M$. If $N^k(\Sigma, x_0) = T_{x_0}M$ for some $k \in \mathbb{Z}_{>0}$, then $\Sigma$ is small time locally controllable from $x_0$.
Proof This follows immediately from Theorem 8.5.3.

As with differential inclusion systems, the difficulty with applying this theorem is the determination of vectors in the cones $N^k(\Sigma, x_0)$. Also as with differential inclusion systems, we provide a few elementary results.

9.4.7 Proposition (Fully actuated control systems are controllable) Let $\Sigma = (M, F, C)$ be a smooth time-independent control system and let $x_0 \in M$. If $0_{x_0} \in \mathrm{conv}(F(x_0))$ then $V^1(\Sigma, x_0) = T_{x_0}M$, and so, in particular, $\Sigma$ is small time locally controllable from $x_0$.

The next result we give is a linearisation result, and so requires some additional structure for the system. Thus we make the following assumptions about $\Sigma = (M, F, C)$:
1. $C \subset \mathbb{R}^m$;
2. $F : M \times C \to TM$ can be extended to a continuously differentiable map $\bar{F} : M \times \mathbb{R}^m \to TM$ such that $\bar{F}|M \times C = F$;
3. for $(x_0, u_0) \in M \times C$, $F(x_0, u_0) = 0_{x_0}$.

With these assumptions, let us define $A_\Sigma(x_0, u_0) \in \mathrm{End}(T_{x_0}M)$ to be the linear map corresponding to the vector field $x \mapsto F(x, u_0)$ that vanishes at $x_0$. Let us also define $B_\Sigma(x_0, u_0) \in \mathrm{Hom}_{\mathbb{R}}(\mathbb{R}^m; T_{x_0}M)$ to be the derivative at $u_0$ of the map $u \mapsto F(x_0, u)$. This then defines a time-independent linear system $(T_{x_0}M, A_\Sigma(x_0, u_0), B_\Sigma(x_0, u_0), \mathbb{R}^m)$.

9.4.8 Proposition (A class of subspaces of variations for control systems) Let $\Sigma = (M, F, C)$ be a time-independent control system subject to the conditions above at $(x_0, u_0) \in M \times C$. If $U \subset N(\Sigma, x_0)$ is a subspace, then $\langle A_\Sigma(x_0, u_0), U \rangle \subset N(\Sigma, x_0)$. In particular, if $\langle A_\Sigma(x_0, u_0), U \rangle = T_{x_0}M$, then $\Sigma$ is small time locally controllable from $x_0$.

The result also has the following useful corollary.

9.4.9 Corollary (Linear controllability of control systems) Let $\Sigma = (M, F, C)$ be a smooth time-independent control system and let $x_0 \in M$. Denote $F(x_0, C) = \{F(x_0, u) \mid u \in C\}$. If $0_{x_0} \in \mathrm{int}(\mathrm{conv}(F(x_0, C)))$ then $\langle A_\Sigma(x_0, u_0), \mathrm{span}_{\mathbb{R}}(F(x_0, u_0)) \rangle \subset N(\Sigma, x_0)$. In particular, if $\langle A_\Sigma(x_0, u_0), \mathrm{span}_{\mathbb{R}}(F(x_0, u_0)) \rangle = T_{x_0}M$, then $\Sigma$ is small time locally controllable from $x_0$.

Note that, from Example 5.4.18, the condition that $\langle A_\Sigma(x_0, u_0), \mathrm{span}_{\mathbb{R}}(F(x_0, u_0)) \rangle = T_{x_0}M$ can be expressed as a certain linear map having maximal rank.

9.4.3 Controllability results for control-affine systems

In the literature on controllability, a prominent role is played by control-affine systems. The results from the preceding section on control systems apply, of course, to control-affine systems. Thus in this section we state a few other results that are well adapted to the control-affine setting. The key idea is to identify subspaces of $N(\Sigma, x_0)$. We shall state two results, neither of which is entirely obvious, but which follow from results in [Bianchini and Stefani 1993]. Our first result is interesting in that, for it to hold in general, one cannot place any a priori bounds on the controls, cf. the example of Section 9.2.5.
9.4.10 Proposition (A sufficient condition for controllability involving only the control vector fields) Let $\Sigma = (M, F = (F_0, F_1, \dots, F_m), C)$ be a smooth time-independent control-affine system and let $x_0 \in M$. Denote $\mathscr{F}_1 = (F_1, \dots, F_m)$. If $C = \mathbb{R}^m$ then $L^{(\infty)}(\mathscr{F}_1)_{x_0} \subset N(\Sigma, x_0)$. In particular, if $L^{(\infty)}(\mathscr{F}_1)_{x_0} = T_{x_0}M$, then $\Sigma$ is small time locally controllable from $x_0$.

Our next result is a low-order result.

9.4.11 Proposition (A sufficient condition for controllability involving Lie brackets of degree two) Let $\Sigma = (M, F = (F_0, F_1, \dots, F_m), C)$ be a smooth time-independent control-affine system and let $x_0 \in M$. If $C$ is proper and convex and if $F(x_0, 0) = 0_{x_0}$, then
\[ \mathrm{span}_{\mathbb{R}}\bigl([F_a, F_b](x_0) \mid a, b \in \{0, 1, \dots, m\}\bigr) \subset N(\Sigma, x_0). \]
In particular, if
\[ \mathrm{span}_{\mathbb{R}}\bigl([F_a, F_b](x_0) \mid a, b \in \{0, 1, \dots, m\}\bigr) = T_{x_0}M, \]
then $\Sigma$ is small time locally controllable from $x_0$.

One might speculate that such a result might hold with the Lie brackets of degree two being replaced by Lie brackets of higher degree. This is not true, for as soon as one looks to brackets of degree three or higher, there arises the possibility of obstructions to controllability. This is entirely related to the notion of neutralisability considered for variations in Section 8.4.


Chapter 10 Optimal control theory


At the moment, this is just a placeholder for things yet to be written.

Chapter 11 Stabilisation theory


At the moment, this is just a placeholder for things yet to be written.


Bibliography
Abraham, R., Marsden, J. E., and Ratiu, T. S. [1988] Manifolds, Tensor Analysis, and Applications, second edition, number 75 in Applied Mathematical Sciences, Springer-Verlag, ISBN 0-387-96790-7.
Agrachev, A. A. and Gamkrelidze, R. V. [1993] Local controllability and semigroups of diffeomorphisms, Acta Applicandae Mathematicae. An International Journal on Applying Mathematics and Mathematical Applications, 32(1), 1–57.
Agrachev, A. A. and Sachkov, Y. [2004] Control Theory from the Geometric Viewpoint, volume 87 of Encyclopedia of Mathematical Sciences, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 3-540-21019-9.
Aguilar, C. O. and Lewis, A. D. [2008] Jet bundles and algebro-geometric characterisations for controllability of affine systems, in Proceedings of the 47th IEEE Conference on Decision and Control, pages 1267–1274, Institute of Electrical and Electronics Engineers, Cancun, Mexico.
Aliprantis, C. D. and Border, K. C. [1999] Infinite-dimensional analysis, second edition, Springer-Verlag, New York–Heidelberg–Berlin.
Aubin, J.-P. and Cellina, A. [1984] Differential Inclusions: Set-Valued Maps and Viability Theory, volume 264 of Grundlehren der mathematischen Wissenschaften, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 0-387-13105-1.
Aubin, J.-P. and Frankowska, H. [1990] Set-Valued Analysis, volume 2 of Systems & Control: Foundations & Applications, Birkhäuser, Boston/Basel/Stuttgart, ISBN 0-8176-3478-9.
Bianchini, R. M. and Stefani, G. [1993] Controllability along a trajectory: A variational approach, SIAM Journal on Control and Optimization, 31(4), 900–927.
Bloch, A. M. [2003] Nonholonomic Mechanics and Control, volume 24 of Interdisciplinary Applied Mathematics, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 0-387095535-6.
Borel, E. [1895] Sur quelques points de la théorie des fonctions, Annales Scientifiques de l'École Normale Supérieure. Quatrième Série, 12(3), 44.
Bourbaki, N. [1990] Algebra II, Elements of Mathematics, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 3-540-00706-7.
Bourbaki, N. [2004] Functions of a Real Variable, Elements of Mathematics, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 3-540-65340-6.


Brockett, R. W. [1977] Control theory and analytical mechanics, in Geometric Control Theory, C. Martin and R. Hermann, editors, pages 1–48, Math Sci Press, 53 Jordan Road, Brookline, Massachusetts.
Bullo, F. and Lewis, A. D. [2004] Geometric Control of Mechanical Systems: Modeling, Analysis, and Design for Simple Mechanical Systems, number 49 in Texts in Applied Mathematics, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 0-387-22195-6.
Cartan, H. [1951-52] Séminaire Henri Cartan de l'École Normale Supérieure, Lecture notes.
Cartan, H. [1957] Variétés analytiques réelles et variétés analytiques complexes, Bulletin de la Société Mathématique de France, 85, 77–99.
Cernea, A. [2001] On the converse statement of the Filippov–Ważewski relaxation theorem, Commentationes Mathematicae Universitatis Carolinae, 42(1), 77–81.
Chow, W.-L. [1939] Über Systemen von linearen partiellen Differentialgleichungen erster Ordnung, Mathematische Annalen, 117, 98–105.
Cohn, D. L. [1980] Measure Theory, Birkhäuser, Boston/Basel/Stuttgart, ISBN 0-8176-3003-1.
Constantine, G. M. and Savits, T. H. [1996] A multivariate Faà di Bruno formula with applications, Transactions of the American Mathematical Society, 348(2), 503–520.
Cousin, P. [1895] Sur les fonctions de n variables, Acta Mathematica, 19, 1–62.
Dvoretzky, A. and Rogers, C. A. [1950] Absolute and unconditional convergence in normed linear spaces, Proceedings of the National Academy of Sciences of the United States of America, 36(3), 192–197.
Eisenbud, D. [1995] Commutative Algebra, number 150 in Graduate Texts in Mathematics, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 0-387-94268-8.
Elkin, V. I. [1999] Reduction of Nonlinear Control Systems. A Differential Geometric Approach, number 472 in Mathematics and its Applications, Kluwer Academic Publishers, Dordrecht, ISBN 0-7923-5623-3, translated from the 1997 Russian original by P. S. V. Naidu.
Faà di Bruno, C. F. [1855] Note sur une nouvelle formule du calcul différentiel, The Quarterly Journal of Mathematics. Oxford. Second Series, 1, 359–360.
Filippov, A. F. [1967] Classical solutions of differential equations with multivalued right-hand side, Journal of the Society of Industrial and Applied Mathematics, Series A Control, 5(4), 609–621.
Filippov, A. F. [1988] Differential Equations with Discontinuous Righthand Sides, number 18 in Mathematics and its Applications (Soviet Series), Kluwer Academic Publishers, Dordrecht, ISBN 90-277-2699-X.
Frobenius, G. [1877] Über das Pfaffsche Problem, Journal für die Reine und Angewandte Mathematik, 82, 230–315.


Gardner, R. B. and Shadwick, W. F. [1990] Feedback equivalence for general control systems, Systems & Control Letters, 15(1), 15–23.
Gardner, R. B., Shadwick, W. F., and Wilkens, G. R. [1989] Feedback equivalence and symmetries of Brunowski normal forms, in Dynamics and Control of Multibody Systems, pages 115–130, number 97 in Contemporary Mathematics, American Mathematical Society, Providence, Rhode Island.
Grabowski, J. [1981] Derivations of Lie algebras of analytic vector fields, Compositio Mathematica, 43(2), 239–252.
Grauert, H. [1958] On Levi's problem and the imbedding of real-analytic manifolds, Annals of Mathematics. Second Series, 68, 460–472.
Halmos, P. R. [1986] Finite-Dimensional Vector Spaces, second edition, Undergraduate Texts in Mathematics, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 0-387-90093-4.
Hermann, R. [1960] On the differential geometry of foliations, Annals of Mathematics. Second Series, 72, 445–457.
Hermann, R. [1962] The differential geometry of foliations. II, Journal of Applied Mathematics and Mechanics. Translation of the Soviet journal Prikladnaya Matematika i Mekhanika, 11, 303–315.
Hewitt, E. and Stromberg, K. [1975] Real and Abstract Analysis, number 25 in Graduate Texts in Mathematics, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 0-387-90138-8.
Hirsch, M. W. [1976] Differential Topology, number 33 in Graduate Texts in Mathematics, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 0-387-90147-5.
Hörmander, L. [1973] An Introduction to Complex Analysis in Several Variables, second edition, North-Holland, Amsterdam/New York, ISBN 0-444-10523-9.
Hungerford, T. W. [1980] Algebra, number 73 in Graduate Texts in Mathematics, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 0-387-90518-9.
Isidori, A. [1995] Nonlinear Control Systems, third edition, Communications and Control Engineering Series, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 3-540-19916-0.
Jakubczyk, B. [2001] Introduction to geometric nonlinear control; controllability and the Lie bracket, in Mathematical Control Theory, Parts 1 and 2, pages 107–168, number VIII in ICTP Lecture Notes, ITCP, Abdus Salam International Center for Theoretical Physics, Trieste.
Joó, I. and Tallos, P. [1999] The Filippov–Ważewski relaxation theorem revisited, Acta Mathematica Hungarica, 83(1-2), 171–177.


Jurdjevic, V. [1997] Geometric Control Theory, number 51 in Cambridge Studies in Advanced Mathematics, Cambridge University Press, New York/Port Chester/Melbourne/Sydney, ISBN 0-521-49502-4.
Kashiwara, M. and Schapira, P. [1990] Sheaves on Manifolds, number 292 in Grundlehren der mathematischen Wissenschaften, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 3-540-51861-4.
Kawski, M. [1990] High-order small-time local controllability, in Nonlinear Controllability and Optimal Control, volume 133 of Monographs and Textbooks in Pure and Applied Mathematics, pages 431–467, Marcel Dekker, New York.
Khalil, H. K. [1996] Nonlinear Systems, second edition, Prentice-Hall, Englewood Cliffs, NJ, ISBN 0-13-228024-8.
Klein, E. and Thompson, A. C. [1984] Theory of Correspondences: Including Applications to Mathematical Economics, Canadian Mathematical Society Series of Monographs and Advanced Texts, John Wiley and Sons, New York, New York, ISBN 0-471-88016-7.
Kobayashi, S. and Nomizu, K. [1963] Foundations of Differential Geometry, Volume I, number 15 in Interscience Tracts in Pure and Applied Mathematics, Interscience Publishers, New York, ISBN 0-470-49647-9.
Kolář, I., Michor, P. W., and Slovák, J. [1993] Natural Operations in Differential Geometry, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 0-387-56235-4.
Krantz, S. G. and Parks, H. R. [2002] A Primer of Real Analytic Functions, second edition, Birkhäuser Advanced Texts, Birkhäuser, Boston/Basel/Stuttgart, ISBN 0-8176-4264-1.
Krener, A. J. [1974] A generalization of Chow's theorem and the bang-bang theorem to nonlinear control problems, Journal of the Society of Industrial and Applied Mathematics, Series A Control, 12, 43–52.
Lang, S. [1984] Algebra, second edition, Addison Wesley, Reading, MA, ISBN 0-201-05487-6.
Lang, S. [1995] Differential and Riemannian Manifolds, number 160 in Graduate Texts in Mathematics, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 0-387-94338-2.
Ledyaev, Y. S. and Zhu, Q. J. [2007] Nonsmooth analysis on smooth manifolds, Transactions of the American Mathematical Society, 359(8), 3687–3732.
Lee, J. M. [2004] Introduction to Topological Manifolds, number 202 in Graduate Texts in Mathematics, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 0-387-95026-5.
Levi, E. E. [1910] Studii sui puncti singolari essenziale delle funzioni analitiche di due o più variabili complesse, Annali di Matematica Pura ed Applicata. Serie III, 17, 61–87.
Lobry, C. [1970] Contrôlabilité des systèmes non linéaires, Journal of the Society of Industrial and Applied Mathematics, Series A Control, 8, 573–605.


Milnor, J. W. [1978] Analytic proofs of the Hairy Ball Theorem and the Brouwer Fixed Point Theorem, The American Mathematical Monthly, 85(7), 521–524.
Mirkil, H. [1956] Differentiable functions, formal power series, and moments, Proceedings of the American Mathematical Society, 7(4).
Morrey, C. B. [1958] The analytic embedding of abstract real analytic manifolds, Annals of Mathematics. Second Series, 68, 159–201.
Munkres, J. R. [1966] Elementary Differential Topology, volume 54 of Annals of Mathematical Studies, Princeton University Press, Princeton, New Jersey.
Nagano, T. [1966] Linear differential systems with singularities and an application to transitive Lie algebras, Journal of the Mathematical Society of Japan, 18, 398–404.
Nelson, E. [1967] Tensor Analysis, Princeton University Press, Princeton, New Jersey, ISBN 0-691-08046-1.
Nijmeijer, H. and van der Schaft, A. J. [1990] Nonlinear Dynamical Control Systems, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 0-387-97234-X.
Noguchi, J. [1998] Introduction to Complex Analysis, number 168 in Translations of Mathematical Monographs, American Mathematical Society, Providence, Rhode Island, ISBN 0-8218-0377-8.
Nomizu, K. and Ozeki, H. [1961] The existence of complete Riemannian metrics, Proceedings of the American Mathematical Society, 12, 889–891.
Oka, K. [1950] Sur les fonctions analytiques de plusieurs variables. VII. Sur quelques notions arithmétiques, Bulletin de la Société Mathématique de France, 78, 1–27.
Oka, K. [1984] Collected Works, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 3-540-13240-6, translated from the French by R. Narasimhan. With commentaries by H. Cartan. Edited by R. Remmert.
Pasillas-Lépine, W. and Respondek, W. [2001] On the geometry of Goursat structures, ESAIM. Control, Optimization and Calculus of Variations. European Series in Applied and Industrial Mathematics, 6, 119–181.
Pasillas-Lépine, W. and Respondek, W. [2002] Contact systems and corank one involutive subdistributions, Acta Applicandae Mathematicae. An International Journal on Applying Mathematics and Mathematical Applications, 69(2), 105–128.
Perdry, H. [2004] An elementary proof of the Krull intersection theorem, The American Mathematical Monthly, 111(4), 356–357.
Rashevsky, P. K. [1938] Any two points of a totally nonholonomic space may be connected by an admissible line, Uchenye Zapiski Pedagogicheskogo Instituta imeni Libknechta, 2, 83–94.
Remmert, R. [1955] Theorie der Modifikationen. I. Stetige und eigentliche Modifikationen komplexer Räume, Mathematische Annalen, 129, 274–296.


Rockafellar, R. T. [1970] Convex Analysis, Princeton Mathematical Series, Princeton University Press, Princeton, New Jersey.
Roman, S. [2005] Advanced Linear Algebra, second edition, number 135 in Graduate Texts in Mathematics, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 0-387-24766-1.
Rudin, W. [1976] Principles of Mathematical Analysis, third edition, International Series in Pure & Applied Mathematics, McGraw-Hill, New York, ISBN 0-07-054235-X.
[1991] Functional Analysis, second edition, International Series in Pure and Applied Mathematics, McGraw-Hill, New York, ISBN 0-07-054236-8.
Sasaki, S. [1958] On the differential geometry of tangent bundles of Riemannian manifolds, The Tohoku Mathematical Journal. Second Series, 10, 338–354.
Saunders, D. J. [1989] The Geometry of Jet Bundles, number 142 in London Mathematical Society Lecture Note Series, Cambridge University Press, New York/Port Chester/Melbourne/Sydney, ISBN 0-521-36948-7.
Shiga, K. [1964] Some aspects of real-analytic manifolds and differentiable manifolds, Journal of the Mathematical Society of Japan, 16(2), 128–142.
Sipser, M. [1996] Introduction to the Theory of Computation, Brooks/Cole, Pacific Grove, CA, ISBN 053494728X.
Smirnov, G. V. [2002] Introduction to the Theory of Differential Inclusions, volume 41 of Graduate Studies in Mathematics, American Mathematical Society, Providence, Rhode Island, ISBN 0-8218-2977-7.
Sontag, E. D. [1988] Controllability is harder to decide than accessibility, SIAM Journal on Control and Optimization, 26(5), 1106–1118.
Stefan, P. [1974a] Accessibility and foliations with singularities, American Mathematical Society. Bulletin. New Series, 80, 1142–1145.
[1974b] Accessible sets, orbits and foliations with singularities, Proceedings of the London Mathematical Society. Third Series, 29, 699–713.
Stein, K. [1951] Analytische Funktionen mehrerer komplexer Veränderlichen zu vorgegebenen Periodizitätsmoduln und das zweite Cousinsche Problem, Mathematische Annalen, 123, 201–222.
Sussmann, H. J. [1973] Orbits of families of vector fields and integrability of distributions, Transactions of the American Mathematical Society, 180, 171–188.
[1974] An extension of a theorem of Nagano on transitive Lie algebras, Proceedings of the American Mathematical Society, 45(3), 349–356.
[1990] Why real analyticity is important in control theory, in Perspectives in Control Theory (Sielpia), pages 315–340, number 2 in Progress in Systems and Control Theory, Birkhäuser, Boston/Basel/Stuttgart.


[2008] Smooth distributions are globally finitely spanned, in Analysis and Design of Nonlinear Control Systems, A. Astolfi and L. Marconi, editors, pages 3–8, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 978-3-540-74357-6.
Sussmann, H. J. and Jurdjevic, V. [1972] Controllability of nonlinear systems, Journal of Differential Equations, 12, 95–116.
Swan, R. G. [1962] Vector bundles and projective modules, Transactions of the American Mathematical Society, 105(2), 264–277.
Takens, F. [1974] Singularities of vector fields, Institut des Hautes Études Scientifiques. Publications Mathématiques, 43, 47–100.
Taylor, A. E. [1965] General Theory of Functions and Integration, Blaisdell Publishing Company, New York/Toronto/London.
Wazewski, T. [1962] Sur quelques définitions équivalentes des quasitrajectoires des systèmes de commande, Bulletin de l'Académie Polonaise des Sciences. Série des Sciences Mathématiques, Astronomiques et Physiques, 10, 469–474.
Whitney, H. [1936] Differentiable manifolds, Annals of Mathematics. Second Series, 37(3), 645–680.
Whitney, H. and Bruhat, F. [1959] Quelques propriétés fondamentales des ensembles analytiques-réels, Commentarii Mathematici Helvetici, 33, 132–160.
Willard, S. [1970] General Topology, Addison Wesley, Reading, MA, reprint: [Willard 2004].
[2004] General Topology, Dover Publications, Inc., New York, ISBN 0-486-43479-6, reprint of 1970 Addison-Wesley edition.
Wonham, W. M. [1985] Linear Multivariable Control, A Geometric Approach, third edition, number 10 in Applications of Mathematics, Springer-Verlag, New York–Heidelberg–Berlin, ISBN 0-387-96071-6.
Zhitomirskii, M. and Respondek, W. [2000] Simple germs of corank one affine distributions, in Singularities Symposium–Łojasiewicz 70, pages 269–276, number 44 in Banach Center Publications, Polish Academy of Science, Warsaw.
