
A literate Gram-Schmidt algorithm in Clojure

Ahmed Fasih

1 Introduction
This document is a pilot vehicle for a number of systems that I have been very
excited to try out. It is:

• first and foremost, part of my Homework 1 for ECE 850 (advanced linear
  control at The Ohio State University, homework due 2010-1-20), to implement
  Gram-Schmidt orthonormalization and apply it to (5);
• but more importantly, a “literate program” [1] written for the Nuweb
  literate programming (LP) tool [2], my first such program;
• nearly as importantly, outputs a program in Clojure [3], a modern Lispy
  programming language for the JVM, and uses Incanter [4], a statistics
  library in Clojure, my first use of both language and library;
• and worthy of being mentioned:

  – will have a bibliography managed by JabRef (which ran from within
    my browser without me having to install anything, nice!) [5],
  – will hopefully be very regularly tracked by Mercurial source control
    [6].

1.1 Literate programming


Literate programming was developed by Knuth, who also invented the TeX digital
typesetting software. To my mind, LP is an approach to software engineering
that involves writing literature instead of a program, specifically by
switching the quantity of comments and code.
Thus, I am essentially writing a LaTeX-typeset document with bits of code
embedded in it. Literate programming tools like Nuweb or Knuth’s CWEB
define the markup that indicates code, and accept as input a file containing
both LaTeX and computer source code, which is processed to output
compiler-ready source file(s) as well as a human-ready LaTeX-typeset document.
Furthermore, LP makes aggressive use of “macros,” here meaning named
scraps of code. As an example, below is the first macro I typed (thirty minutes
ago):

⟨My first macro 2a⟩ ≡
    ; Welcome to Clojure!

Used in part 2b.

The macro’s name is “My first macro” and is printed at the top of the scrap
along with a scrap identifier (in this case, “2a,” the page number followed by
a letter) inside the ⟨·⟩.
A key idea is that macros can contain other macros: this next one refers to
“My first macro” that we just defined.

⟨This is how we build webs 2b⟩ ≡
    ⟨My first macro 2a⟩
    (+ 1 2 3 4) ; some Clojure code
    (reduce + [1 2 3 4]) ; Goodness me, it’s good to return to Lisp

Used in part 2c.

Thus, instead of source code being laid out serially as dictated by a compiler,
snippets of code that accomplish notable and nameable chunks of work are laid
out and built up in a way that appeals to a human reader. The supremacy of
this benefit over all the others in the LP framework is only beginning to dawn
on me as I continue to use it. This web-like nature of fully literate programs will
manifest itself in different ways in different languages. E.g., Nuweb’s literate
source is in C, and this engenders, I claim, a very different chunking than I
envision in this Lisp-Clojure mathematics program. (I think Clojure is so much
of a Lisp that I do not hesitate to refer to it as Lisp-Clojure or
Clojure-Lisp.)
After all the relevant macros have been constructed, we can output them to
files in some compiler-dictated way, almost as an afterthought:

"src/test.clj" 2c ≡
    ⟨This is how we build webs 2b⟩

When Nuweb parses this literate program, it will output a .tex file that gets
converted to a PDF (which you are perhaps reading now), as well as a file
src/test.clj which can be fed into a compiler. Any other output files I define
will also be created, ready to be fed into their respective compilers.
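
For concreteness, here is roughly what the tangled src/test.clj would contain once both macros above have been expanded in place. This is a sketch reconstructed from the scraps, not a capture of actual Nuweb output, and the exact whitespace may differ.

```clojure
; Welcome to Clojure!
(+ 1 2 3 4)            ; some Clojure code
(reduce + [1 2 3 4])   ; Goodness me, it's good to return to Lisp
```

Both forms evaluate to 10: the first sums its arguments directly, the second folds + over a vector.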

2 Mathematics
Gram-Schmidt orthonormalization is an algorithm that takes a set of vectors,
$\{\vec{a}_i\}_{i=1}^{m}$, each vector therein being in some inner-product
space such as $\mathbb{R}^n$, and outputs an orthonormal basis of
$\operatorname{span}\{\vec{a}_i\}$. In less technical terms, we have $m$
vectors and we wish to find $r \le m$ vectors that are orthogonal to one
another, have length 1, and, most importantly, can be linearly combined
to obtain any of the original $m$ vectors.
This algorithm gives us the dimension of the span of our starting vectors for
free, since this is $r$, the number of basis vectors found. If we form a matrix
with columns as our vectors,

\[ A = \begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \cdots & \vec{a}_m \end{bmatrix}, \tag{1} \]

then this dimension is also equal to the rank of $A$, $\rho(A)$.


We will outline the algorithm assuming that A is full-rank, and point out as
we go the additional steps needed to deal with rank-deficient matrices.
Gram-Schmidt orthonormalization is still useful for full-rank A matrices, even
though the columns of A are then already linearly independent, since it yields
an orthonormal basis, which A’s full-rankedness does not guarantee.
The algorithm begins as follows. Let candidate basis vectors be denoted as
$\vec{b}_i$ for $i = 1, \dots, r$ (with $r$ unknown in advance), and let

\[ \vec{b}_1 = \operatorname{unit}(\vec{a}_1), \tag{2} \]

that is, let the first basis vector be simply the first input vector, normalized.
Then, let the subsequent basis vectors be (suppressing the arrows for clarity)

\[
\begin{aligned}
b_2 &= \operatorname{unit}(a_2 - \langle a_2, b_1 \rangle b_1) \\
b_3 &= \operatorname{unit}(a_3 - \langle a_3, b_1 \rangle b_1 - \langle a_3, b_2 \rangle b_2),
\end{aligned}
\tag{3}
\]

where $\langle a, b \rangle$ is the inner product of $a$ and $b$. Observe that
the second basis vector is simply the second input vector minus whichever
component lies parallel to the first basis vector, normalized.
Likewise, the third basis vector is the third input vector minus the
components of it that lie parallel to the first two basis vectors. Each new
input vector has the components along the existing basis vectors removed to
obtain the new basis vector.
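
As a small worked instance of (2) and (3), take the illustrative inputs $a_1 = (1, 1)^T$ and $a_2 = (1, 0)^T$ in $\mathbb{R}^2$. These two vectors are chosen here purely for the example and do not come from the homework.

```latex
\begin{align*}
b_1 &= \operatorname{unit}(a_1) = \tfrac{1}{\sqrt{2}}\,(1, 1)^T, \\
\langle a_2, b_1 \rangle &= \tfrac{1}{\sqrt{2}}, \\
a_2 - \langle a_2, b_1 \rangle b_1 &= (1, 0)^T - \tfrac{1}{2}\,(1, 1)^T = \tfrac{1}{2}\,(1, -1)^T, \\
b_2 &= \tfrac{1}{\sqrt{2}}\,(1, -1)^T .
\end{align*}
```

One can check directly that $\langle b_1, b_2 \rangle = 0$ and that both vectors have unit length.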
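Then, the $i$th basis vector would be

\[
\begin{aligned}
b_i &= \operatorname{unit}\Bigl(a_i - \sum_{j=1}^{i-1} \langle a_i, b_j \rangle b_j\Bigr) \\
    &= \operatorname{unit}\bigl(a_i - B_{i-1} B_{i-1}^T a_i\bigr) \\
    &= \operatorname{unit}\bigl((I - B_{i-1} B_{i-1}^T)\, a_i\bigr),
\end{aligned}
\tag{4}
\]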

where $B_{i-1} = [\, b_1 \; b_2 \; \cdots \; b_{i-1} \,]$. To see how the second
line follows from the first, note that the vector

\[
\begin{bmatrix}
\langle a_i, b_1 \rangle \\
\langle a_i, b_2 \rangle \\
\vdots \\
\langle a_i, b_{i-1} \rangle
\end{bmatrix}
= B_{i-1}^T a_i,
\]

that is, simply a decomposition of $a_i$ onto the columns of $B_{i-1}$. Then, we
left-multiply this with $B_{i-1}$ (without the transpose) to get a linear
combination of the $i - 1$ existing $b$’s. (Gil Strang’s Emmy-deserving linear
algebra videos will free your mind [7].)
If A were rank-deficient, that is, if one or more $a_i$’s were linear
combinations of the others, then for some $i$ the residual
$(I - B_{i-1} B_{i-1}^T)\, a_i$ will be zero (that is, $a_i$ can be fully
expressed as a linear combination of existing basis vectors). Such a residual
cannot be normalized, so we omit those vectors from the set of basis vectors.
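
The projection-and-normalize step can be sketched in plain Clojure, without any matrix library. This is a minimal illustration under stated assumptions: vectors are ordinary Clojure vectors of doubles, and the names dot, norm, unit, and residual are made up for this sketch rather than taken from Incanter or from the implementation later in this document.

```clojure
;; Hypothetical helpers for illustration; none of these names come from Incanter.
(defn dot [x y] (reduce + (map * x y)))           ; inner product <x, y>
(defn norm [x] (Math/sqrt (dot x x)))             ; Euclidean length
(defn unit [x] (let [n (norm x)] (mapv #(/ % n) x)))

(defn residual
  "Subtract from a its components along each orthonormal vector in bs."
  [a bs]
  (reduce (fn [r b] (mapv - r (map #(* (dot a b) %) b)))
          (vec a)
          bs))

;; Illustrative inputs a1 = (3, 4) and a2 = (1, 0).
(def b1 (unit [3.0 4.0]))                  ; approximately (0.6, 0.8)
(def b2 (unit (residual [1.0 0.0] [b1])))  ; approximately (0.8, -0.6)

;; A dependent input leaves (numerically) nothing behind.
(def leftover (residual [6.0 8.0] [b1]))
```

Here leftover should have a norm near zero, which is the signal that the input adds no new direction and should be dropped.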

2.1 The application


To demonstrate the numerical issues which may arise during orthonormalization,
the following matrix will be submitted to Gram-Schmidt:

\[
A = \begin{bmatrix}
1 & 2 & \cdots & 12 \\
13 & 14 & \cdots & 24 \\
\vdots & \vdots & \ddots & \vdots \\
49 & 50 & \cdots & 60
\end{bmatrix}.
\tag{5}
\]

This 5 × 12 matrix must have column rank ≤ 5. However, due to roundoff
errors, a numerical implementation may wrongly find $\vec{b}_i \neq \vec{0}$ for
some $i$ whose residual should vanish, and admit it as a basis vector.
Depending on the microarchitecture, subsequent iterations of the algorithm may
fall into a repeatable pattern of such spurious basis vectors.
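
The hazard is easy to reproduce with ordinary double arithmetic. The snippet below is a generic illustration, not part of the homework code: a quantity that is zero on paper comes out slightly nonzero, so an exact test against zero would wrongly admit it, while a tolerance test would not. The tolerance 1e-10 is simply an illustrative choice.

```clojure
;; Mathematically 0.1 + 0.2 - 0.3 = 0, but not in IEEE double arithmetic.
(def residue (- (+ 0.1 0.2) 0.3))

(zero? residue)                  ; false: a tiny nonzero value survives
(< (Math/abs residue) 1e-10)     ; true: a tolerance test treats it as zero
```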

3 Programming
Let’s begin by converting our mathematical recipe (2)–(4) to Clojure by defining
a recursive function that, as the algorithm proceeds, removes vectors from the
input matrix A and grows the matrix of basis vectors B as needed. Clojure
functions can be defined with multiple arities, which we take advantage of here:

⟨GS core function 4a⟩ ≡
    (defn gram-schmidt
      ⟨GS arity 1: no basis vectors, start algorithm 4b⟩
      ⟨GS arity 2: analyze input vectors with available basis vectors 5a⟩)

Used in part 7c.

When given solely an A matrix of input vectors, we take the first input vector
(normalized) to be the first basis vector and pass it into the 2-arity function.

⟨GS arity 1: no basis vectors, start algorithm 4b⟩ ≡
    ([A]
      (gram-schmidt (sel A :except-cols 0) (mkunit (sel A :cols 0))))

Used in part 4a.

where mkunit normalizes its input vector. Incanter provides the slicing
infrastructure of sel.
The 2-ary function receives A input vectors and B basis vectors, and for a
given $a_i$ (the first column of A), computes $b_i$ as per (4) (i.e., finds
$a_i$’s component not along any vectors already in B). Then it recursively
calls itself on a smaller A and a potentially larger B:

⟨GS arity 2: analyze input vectors with available basis vectors 5a⟩ ≡
    ([A B]
      (if ⟨Termination check: if no more As, return B; else— 5c⟩
        (let
          [a (sel A :cols 0)
           b (mmult (minus (identity-matrix (nrow A)) (mmult B (trans B))) a)]
          ⟨Recursion step 5b⟩)))

Used in part 4a.

The verbosity of Clojure for mathematical programming is notable, particularly
to those coming from Matlab and other domain-specific languages.
The function’s recursion shows A shrinking and the test that
$b_i \neq \vec{0}$ before it is appended to B:

⟨Recursion step 5b⟩ ≡
    (recur
      (sel A :except-cols 0)
      (if (not-small b)
        (bind-columns B (mkunit b))
        B))

Used in part 5a.

For b to be a bona fide new basis vector, it must be not-small in the $\ell_2$
sense (see below). Here’s the termination check.

⟨Termination check: if no more As, return B; else— 5c⟩ ≡
    (= 0 (ncol A)) B

Used in part 5a.
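
The two Clojure features this implementation leans on, multiple arities and recur, can be seen in isolation in the toy function below. The function sum-squares is made up for this illustration and has nothing to do with Gram-Schmidt; it only mirrors the shape of gram-schmidt, where the 1-arity seeds an accumulator and the 2-arity loops until the input is exhausted.

```clojure
;; Illustrative only: sum-squares is a hypothetical function for this sketch.
(defn sum-squares
  ;; Arity 1: seed the accumulator and delegate to arity 2.
  ([xs] (sum-squares xs 0))
  ;; Arity 2: consume one element per iteration; recur re-enters this
  ;; arity without growing the call stack.
  ([xs acc]
   (if (empty? xs)
     acc
     (recur (rest xs) (+ acc (* (first xs) (first xs)))))))

(sum-squares [1 2 3])   ; => 14
```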

And that’s all there is to the algorithm’s implementation. It remains to define
some helper functions that we’ve alluded to:

⟨Helper functions 5d⟩ ≡
    (defn vecnorm [x] (sqrt (sum-of-squares x)))
    (defn not-small [x] (> (vecnorm x) 1e-10))
    (defn mkunit [x] (div x (vecnorm x)))

Used in part 7c.

In all the above code, Incanter provides sel (slicing), mmult (matrix
multiplication), minus, identity-matrix, trans (transpose), nrow and ncol,
bind-columns (matrix concatenation), and sum-of-squares.
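
For readers without Incanter at hand, the same algorithm can be sketched in plain Clojure. This is an analogue written for illustration, not the implementation above: it assumes a matrix is represented as a sequence of column vectors, uses a fold instead of explicit recursion, and introduces the hypothetical names dot, residual, and gram-schmidt-cols. The helpers vecnorm, not-small, and mkunit are re-implemented here to mirror their Incanter-based namesakes.

```clojure
;; Plain-Clojure sketch (no Incanter). Vectors are Clojure vectors of doubles;
;; a matrix is a sequence of its columns.
(defn dot [x y] (reduce + (map * x y)))
(defn vecnorm [x] (Math/sqrt (dot x x)))
(defn not-small [x] (> (vecnorm x) 1e-10))
(defn mkunit [x] (let [n (vecnorm x)] (mapv #(/ % n) x)))

(defn residual
  "Remove from a its components along each orthonormal vector in basis."
  [a basis]
  (reduce (fn [r b] (mapv - r (map #(* (dot a b) %) b)))
          (vec a)
          basis))

(defn gram-schmidt-cols
  "Return an orthonormal basis (a vector of columns) for the span of cols."
  [cols]
  (reduce (fn [basis a]
            (let [b (residual a basis)]
              (if (not-small b)
                (conj basis (mkunit b))   ; new direction: keep it
                basis)))                  ; dependent column: skip it
          []
          cols))

;; Columns of the 5 x 12 matrix (5): column k holds k, k+12, ..., k+48.
(def cols
  (for [k (range 1 13)]
    (mapv #(double (+ k (* 12 %))) (range 5))))

(count (gram-schmidt-cols cols))   ; number of basis vectors found
```

With the 1e-10 threshold this sketch should find two basis vectors for matrix (5). Unlike the 1-arity scrap above, it also applies the not-small test to the first column.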

4 Results
Let’s use gram-schmidt defined above by first defining our input matrix A, as
an Incanter matrix.
⟨Requested example 6a⟩ ≡
    (def A (matrix (range 1 61) 12))
    (def B (gram-schmidt A))

Used in part 7c.

The output is

\[
B = \begin{bmatrix}
0.014800609797685755 & 0.7744552549693355 \\
0.1924079273699148 & 0.5128149661283447 \\
0.3700152449421439 & 0.25117467728735265 \\
0.5476225625143729 & -0.010465611553636452 \\
0.7252298800866019 & -0.2721059003946242
\end{bmatrix}.
\]

This shows that the rank of A is just 2. Experimenting with the threshold
of not-small did not change the rank nor the basis vectors given, unless it was
set too close to machine precision (in which case 12 basis vectors were found,
the last several of which were extremely similar to each other).
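
Rank 2 is also what exact arithmetic predicts for (5). Writing $\mathbf{1}$ for the all-ones vector in $\mathbb{R}^5$ (notation introduced only for this remark), every column is the first column shifted by a multiple of $\mathbf{1}$:

```latex
\[
a_k = a_1 + (k - 1)\,\mathbf{1}, \qquad k = 1, \dots, 12,
\qquad a_1 = (1, 13, 25, 37, 49)^T .
\]
```

So all twelve columns lie in $\operatorname{span}\{a_1, \mathbf{1}\}$, and since $a_1$ is not a multiple of $\mathbf{1}$, that span has dimension exactly 2.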
We note that shuffling the order of the columns of A should produce different
orthonormal bases for the same space. For example,

\[
B' = \begin{bmatrix}
0.0778498944161523 & -0.7706746355884528 \\
0.2335496832484569 & -0.4954336943068626 \\
0.3892494720807615 & -0.2201927530252724 \\
0.5449492609130662 & 0.055048188256316403 \\
0.7006490497453707 & 0.3302891295379068
\end{bmatrix}
\]

\[
B'' = \begin{bmatrix}
0.04207031619116713 & -0.7734533525013498 \\
0.21035158095583564 & -0.5057194997124216 \\
0.37863284572050415 & -0.23798564692349328 \\
0.5469141104851727 & 0.029748205865432836 \\
0.7151953752498412 & 0.29748205865435823
\end{bmatrix}
\]

are also orthonormal bases for the span of the columns of A.
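
Orthonormality is easy to spot-check from printed values alone. The sketch below copies the two columns of the first basis B reported in this section and tests that each has unit length and that they are mutually orthogonal. The helper names dot and close?, and the loose tolerance of 1e-6, are arbitrary choices for this illustration.

```clojure
;; The two columns of B, copied from the output reported in this section.
(def b1 [0.014800609797685755 0.1924079273699148 0.3700152449421439
         0.5476225625143729 0.7252298800866019])
(def b2 [0.7744552549693355 0.5128149661283447 0.25117467728735265
         -0.010465611553636452 -0.2721059003946242])

(defn dot [x y] (reduce + (map * x y)))

;; Hypothetical helper: equality up to a loose, arbitrary tolerance.
(defn close? [x y] (< (Math/abs (- x y)) 1e-6))

(def orthonormal?
  (and (close? 1.0 (dot b1 b1))    ; unit length
       (close? 1.0 (dot b2 b2))    ; unit length
       (close? 0.0 (dot b1 b2))))  ; mutually orthogonal
```

The same check can be repeated for the columns of the other two bases.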

5 Code output
First, we’ll need to import Incanter and contrib.seq-utils libraries.

⟨Load libraries 6b⟩ ≡
    (use '[clojure.contrib.seq-utils :only [shuffle]])
    (use '(incanter core charts))

Used in part 7c.

Let’s add a few additional functions to make our lives a bit easier.
Here are a couple of functions to convert an Incanter matrix into a LaTeX
table, with complete decimal precision.

⟨Matrix to TeX printer 7a⟩ ≡
    (defn list-to-tex-row [lst]
      (str
        (reduce
          (fn [s n]
            (str
              s
              (if (> (.length s) 0) " & ")
              (print-str n)))
          "" lst)
        " \\\\ \n"))

    (defn matrix-to-tex-table [A] (apply str (map #(list-to-tex-row (to-list %1)) A)))

Used in part 7c.
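
To make the row format concrete, here is a hypothetical alternative to list-to-tex-row built on clojure.string/join. It is a sketch, not the code used in this document, and it assumes Clojure 1.2 or later, where the clojure.string namespace is available. The name row->tex is made up for the example.

```clojure
(require '[clojure.string :as string])

;; Hypothetical alternative to list-to-tex-row: join the printed entries
;; with " & " and terminate the row with a LaTeX line break and a newline.
(defn row->tex [row]
  (str (string/join " & " (map print-str row)) " \\\\ \n"))

(row->tex [1 2 3])   ; => "1 & 2 & 3 \\\\ \n"
```

Written as a Clojure string literal, the result ends in two backslashes, a space, and a newline, which LaTeX reads as the end of a table row.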

And here we generate a few alternative basis vectors:

⟨Shuffled A 7b⟩ ≡
    (doseq [n (range 3)]
      (println "Gram-Schmidt-obtained orthonormal basis for shuffled input:")
      (println (matrix-to-tex-table (gram-schmidt (sel A :cols (shuffle (range 12)))))))

Used in part 7c.

We define the output file as follows.

"src/gs.clj" 7c ≡
    ⟨Load libraries 6b⟩
    ⟨Helper functions 5d⟩

    ⟨GS core function 4a⟩

    ⟨Requested example 6a⟩

    ⟨Matrix to TeX printer 7a⟩

    (println "Gram-Schmidt-obtained orthonormal basis:")
    (println (matrix-to-tex-table B))

    ⟨Shuffled A 7b⟩

6 Time-varying control: problem 2
We have also been tasked to derive an expression of the control input to a
time-varying plant with dynamics

ẋ(t) = A(t)x(t) + B(t)u(t). (6)

This control-driven state evolution can be expressed in terms of the state
transition matrix $\Phi(s, t)$ such that $x(s) = \Phi(s, t)x(t)$ as follows
(shown in class)

\[ x(t_1) = \Phi(t_1, t_0)\,x(t_0) + \int_{t_0}^{t_1} \Phi(t_1, \tau)\,B(\tau)\,u(\tau)\,d\tau. \tag{7} \]

Our goal is to find the control $u(t)$ to drive the system from a given
$x(t_0)$ to some $x(t_1)$. Following [8, Theorem 6.11], and letting transpose
be $(\cdot)'$, we posit that the control

\[ u(t) = -B'(t)\,\Phi'(t_1, t)\left[ W^{-1}(t_0, t_1)\bigl(\Phi(t_1, t_0)\,x(t_0) - x(t_1)\bigr) \right] \tag{8} \]

(note that the expression in the brackets is independent of $t$), with

\[ W(t_0, t_1) = \int_{t_0}^{t_1} \Phi(t_1, \tau)\,B(\tau)\,\bigl[\Phi(t_1, \tau)\,B(\tau)\bigr]'\,d\tau, \tag{9} \]

will satisfy this requirement.


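To see this, insert (8) and (9) into (7):

\[
\begin{aligned}
x(t_1) = \Phi(t_1, t_0)\,x(t_0) - {} & \int_{t_0}^{t_1} \Phi(t_1, \tau)\,B(\tau)\,\bigl[B'(\tau)\,\Phi'(t_1, \tau)\bigr]\,d\tau \\
& \cdot\, W^{-1}(t_0, t_1)\bigl(\Phi(t_1, t_0)\,x(t_0) - x(t_1)\bigr).
\end{aligned}
\tag{10}
\]

Note that the integral expression is equal to $W(t_0, t_1)$ by (9), since
$[\Phi(t_1, \tau)B(\tau)]' = B'(\tau)\Phi'(t_1, \tau)$, thus simplifying the
above equation to

\[ x(t_1) = \Phi(t_1, t_0)\,x(t_0) - I\,\bigl[\Phi(t_1, t_0)\,x(t_0) - x(t_1)\bigr] = x(t_1). \tag{11} \]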
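
As a sanity check on (8) and (9), consider the simplest possible plant, a scalar integrator with $A(t) = 0$ and $B(t) = 1$. This special case is chosen here for illustration and is not part of the assigned problem. Then $\Phi \equiv 1$ and the formulas reduce to:

```latex
\begin{align*}
W(t_0, t_1) &= \int_{t_0}^{t_1} 1 \, d\tau = t_1 - t_0, \\
u(t) &= -\frac{x(t_0) - x(t_1)}{t_1 - t_0} = \frac{x(t_1) - x(t_0)}{t_1 - t_0}, \\
x(t_0) + \int_{t_0}^{t_1} u(\tau)\, d\tau &= x(t_0) + \bigl(x(t_1) - x(t_0)\bigr) = x(t_1).
\end{align*}
```

The control is a constant push at exactly the average rate needed to cover the distance $x(t_1) - x(t_0)$ in time $t_1 - t_0$, as one would expect.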

Acknowledgements
Thanks to liebke, Chousuke, mattrepl on #clojure@Freenode, and to Rich
Hickey for Clojure, David Liebke for Incanter, and Don Knuth for literate
programming.

References
[1] D. E. Knuth, “Literate programming,” The Computer Journal, vol. 27,
    pp. 97–111, May 1984.
[2] P. Briggs, Nuweb Version 1.0b1: A Simple Literate Programming Tool.
    http://nuweb.sourceforge.net/nuwebhtml/index.html.
[3] Clojure. http://clojure.org.
[4] Incanter. http://incanter.org.
[5] JabRef Reference Manager Documentation.
    http://jabref.sourceforge.net/help/Contents.php.
[6] Mercurial SCM. http://mercurial.selenic.com.
[7] G. Strang, “MIT 18.06 Linear Algebra,” 1999.
    http://ocw.mit.edu/OcwWeb/Mathematics/18-06Spring-2005/VideoLectures.
[8] C.-T. Chen, Linear System Theory and Design. Oxford University Press,
    1999.
