
A Novel Recursive Method for Computing the Null-space of Random Matrices of Arbitrary Size

Muhammad Ali Raza Anjum, Muhammad Yasir Siddique Anjum
Department of Engineering Sciences, Army Public College of Management and Sciences (APCOMS), Rawalpindi, Pakistan

Abstract: The problem of finding the null-space arises time and again in many important science and engineering applications, among them bioinformatics, gene expression analysis, structural analysis, computational fluid dynamics, electromagnetics, and optimization theory. Many of the existing methods rely heavily on the structure, size, and sparsity of the matrices in question and are therefore tailored only for particular applications. Besides these, many hybrid methods have also been tried out; instead of being helpful, they turn out to be even more complex. In this paper, we propose a novel method for computing the null-space of a matrix. The method makes no a priori assumptions and is applicable to purely random matrices of arbitrary size. In addition, the method is recursive in nature, which provides the flexibility of finding an approximate solution whenever the cost of an increase in accuracy is not justified by the corresponding increase in computation time. And yet, despite all this, the method is simple, stable, and has excellent convergence properties.

Keywords: null-space, recursive, computation

I. INTRODUCTION
The problem of computing the null-space generally arises in under-determined systems. At first, it may appear that such systems have no solution at all. On the contrary, these systems present infinitely many solutions to the problem at hand, precisely because of the presence of a null-space. In an under-determined system, the number of vectors in the column space exceeds the maximum number of independent vectors the column space can permit. As a result, the excess vectors turn out to be dependent, and a particular linear combination of these columns with the independent columns can generate a zero right-hand side, or more specifically, a null. Any vector capable of producing such a null is a null-vector, and the linear combinations of all such null-vectors trace out the entire null-space of the under-determined system. Hence, the system has a null-space solution in addition to a particular solution, resulting in an infinite number of solutions for such systems [1].
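As a simple illustration (our own example, not drawn from any particular application), consider the single equation

    x_1 + 2 x_2 = 0

One equation in two unknowns is under-determined: any scalar multiple of the null-vector (2, -1)^T satisfies it, so the solutions trace out an entire line through the origin, which is the null-space of the 1 × 2 matrix [1 2].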
Under-determined systems come up in physical problems in which fewer output measurements are available than the number of parameters controlling these outputs. For example, in gene expression analysis [2, 3] the number of genes involved can be very large while the number of samples available from patients can be very few. Accurate prediction of the genes responsible for a particular outcome is a daunting task and so far remains a challenge. A similar problem presents itself in the area of sparse sensing [4-6]: a signal or image with a possibly large number of features is represented by far fewer samples, and reconstruction of such large features from so few samples is quite a puzzle [7]. Identical problems arise in many other areas of importance: in the numerical solution of differential equations [8], in computational fluid dynamics [9], in finite element analysis [10], in structural analysis [11], and in optimization and minimization [7].
Popular methods for computing the null-space fall into four broad categories: LU decomposition, Householder's method for QR decomposition, the Singular Value Decomposition (SVD), and Krylov sub-space methods [7, 12, 13]. LU decomposition works on the principle of elimination and creates a rectangular matrix U with a square upper-triangular block and an adjacent block of all zeros which corresponds to the null-space solution; it inherently cannot take advantage of sparsity [7]. Householder's method performs elimination using a square orthogonal matrix Q and ends up with a rectangular upper-triangular matrix R similar to U. Columns of Q corresponding to the zero blocks of R reveal the respective null-space basis. QR is not preferred for large matrices [12]. SVD decomposes a rectangular matrix into a diagonal matrix of singular values and two orthonormal matrices U and V that contain the null-space basis corresponding to the zero singular values. SVD is computationally expensive [14]. Krylov methods are iterative methods for solving large-scale systems; they exploit the sparsity of the matrix to generate Krylov sub-spaces and find a null-space solution therein. These methods rely heavily on sparsity and apply only to a certain class of matrices [13].
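For reference, the SVD route just described can be sketched in a few lines of NumPy (an illustrative snippet of ours, not part of the proposed method):

    import numpy as np

    def null_space_svd(A, tol=1e-12):
        # Right singular vectors of A whose singular values are
        # numerically zero span the null-space of A.
        U, s, Vt = np.linalg.svd(A)
        rank = int(np.sum(s > tol * s[0]))  # s is sorted in descending order
        return Vt[rank:].T                  # columns form a null-space basis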
There are several other methods which are variants or combinations of the methods mentioned. Since they follow independent approaches to the null-space problem and depend on different sets of parameters, it is not easy to compare them fairly. Therefore, the purpose of this paper is not to compare them. Instead, a new method is proposed for the computation of the null-space. The method takes a novel approach to the problem and makes no presumptions about the structure, size, or sparsity of a matrix. It is iterative in nature and equally applicable to random and deterministic matrices of any size. Despite all this, it is simple and elegant, and it brings a completely new insight to the process of finding the null-space, dispensing altogether with the earlier approaches.




The rest of the article is organized as follows. Section II begins with a system model and presents the basic nomenclature adopted in this paper. The proposed method to compute the null-space recursively is discussed in Section III, followed by a geometrical interpretation of the method in Section IV. Section V is devoted to the issues of convergence and stability of the recursive solution. A complete algorithm for the step-by-step computation of the null-space according to the proposed method is outlined in Section VI. Finally, the article presents simulation results in Section VII and a brief conclusion in Section VIII.
II. SYSTEM MODEL
Consider a system of linear equations,

    A x = 0                                              (1)

where A is a matrix of dimensions M × N and x is a vector of dimension N × 1, with M and N the number of rows and columns respectively. We let M = N - 1 so that the number of equations is less than the number of unknowns by at least one and hence the system is under-determined. The zero right-hand side indicates that we are looking for a solution in the null-space.
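As a concrete instance of this model (a sketch of ours; the variable names are arbitrary), such a system can be generated by dropping the last row of a random square matrix:

    import numpy as np

    rng = np.random.default_rng(0)
    N = 4                            # number of unknowns (columns)
    B = rng.standard_normal((N, N))  # random square matrix
    A = B[:-1, :]                    # drop the last row: M = N - 1
    # A x = 0 now has M = 3 equations in N = 4 unknowns, so a
    # non-trivial null-space solution is guaranteed to exist.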
III. RECURSIVE SOLUTION
In order to solve Eq. (1), x should lie in the right null-space of A. A is bound to have a null-space even if the original square matrix had full rank, because the removal of a row makes it rank-deficient by at least one vector. The null-space will then consist of all scalar multiples of this vector and will form a line in the entire space. We begin solving Eq. (1) by multiplying A with an arbitrary vector x[n] at time n. Since our choice of x[n] is purely arbitrary at the beginning, the right-hand side might not be zero and there could be a residue. Let us denote this residue as the error vector e[n],

    e[n] = A x[n]                                        (2)

where e[n] has dimensions M × 1. Squaring the error in Eq. (2),

    J[n] = e^T[n] e[n] = x^T[n] A^T A x[n]               (3)

Minimizing Eq. (3) with respect to x[n] gives the gradient,

    ∇[n] = 2 A^T A x[n]                                  (4)

Setting the gradient in Eq. (4) to zero cannot be solved directly for two reasons: a zero left-hand side and a singular matrix A^T A. But these very reasons promise the existence of a solution in the null-space, so we can search for it. A natural choice for finding x[n] is the Steepest Descent Algorithm [15],

    x[n+1] = x[n] - (μ/2) ∇[n]                           (5)

where μ is the step-size (the factor of one half merely absorbs the constant in the gradient). Substituting Eq. (4) in Eq. (5),

    x[n+1] = x[n] - μ A^T A x[n]                         (6)

Substituting Eq. (2) in Eq. (6),

    x[n+1] = x[n] - μ A^T e[n]                           (7)

Eq. (7) is a recursive solution to Eq. (1). It searches for x in the null-space of A. Eq. (1) may have multiple solutions in the null-space of A; therefore, x will not be unique.
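A minimal sketch of the recursion in Eq. (7), assuming A is held as a NumPy array (the function name, initial vector, and iteration count are our choices, not the paper's):

    import numpy as np

    def null_space_sd(A, mu, iters=10000):
        # Recursive search of Eq. (7): x[n+1] = x[n] - mu * A^T e[n],
        # where e[n] = A x[n] is the error vector of Eq. (2).
        _, N = A.shape
        x = np.full(N, 1e-3)        # arbitrary (small, non-zero) start
        for _ in range(iters):
            e = A @ x               # Eq. (2)
            x = x - mu * (A.T @ e)  # Eq. (7)
        return x

Since any non-zero limit point lies in the null-space, the returned x is typically normalized; different starting vectors may converge to different scalar multiples, reflecting the non-uniqueness noted above.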

IV. GEOMETRICAL INTERPRETATION
In this section, we provide a geometrical interpretation of the proposed method. For the purpose of illustration, and therefore for the sake of simplicity, we take a 2 × 2 Hadamard matrix and remove its last row. Starting with Eq. (3), also known as the performance surface, we observe that it is the quadratic form of the matrix A^T A. When A^T A is positive definite, the performance surface is zero only when x = 0. But A is formed by the removal of at least one row; hence A^T A does not have full rank and is positive semi-definite. The performance surface is therefore zero for values other than x = 0, and those values lie in the null-space of A^T A. The search equation, Eq. (7), looks for x in this null-space. Once a solution is found, all its linear combinations solve Eq. (3) as well; these linear combinations form the complete null-space. The direction of the null-space is determined by the direction of the eigenvector associated with the zero eigenvalue of the A^T A matrix.
Figure 1 provides a pictorial representation of the whole process, whereas Fig. 2 presents a three-dimensional view of Fig. 1.
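The 2 × 2 Hadamard example can be checked numerically (our own verification snippet): the eigenvector of A^T A belonging to the zero eigenvalue gives the direction of the null-space line.

    import numpy as np

    H = np.array([[1.0, 1.0],
                  [1.0, -1.0]])    # 2 x 2 Hadamard matrix
    A = H[:1, :]                   # last row removed: A = [1 1]
    R = A.T @ A                    # [[1, 1], [1, 1]], positive semi-definite
    w, V = np.linalg.eigh(R)       # eigenvalues in ascending order
    print(w)                       # [0. 2.]
    print(V[:, 0])                 # eigenvector for the zero eigenvalue,
                                   # i.e. the null-space direction x1 = -x2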
V. CONVERGENCE AND STABILITY
A critical parameter controlling the stability and convergence of Eq. (7) is the step-size μ. The use of the stochastic gradient in Eq. (6) makes the system even more sensitive to the step-size: a small step-size may lead to slow convergence, whilst a large step-size may result in instability. The obvious problem here is the choice of an optimal step-size that yields a stable algorithm and ensures faster convergence. This has been achieved using the Normalized Least Mean Square (NLMS) algorithm [15]. It selects a normalized step-size by minimizing the error between the desired output and the a posteriori output, resulting in a time-varying, self-adjusting step-size [16]. Rewriting Eq. (7) with a time-varying step-size μ[n],

    x[n+1] = x[n] - μ[n] A^T e[n]                        (8)

Figure 1. Geometrical interpretation of the proposed method in two dimensions for a 2 × 2 Hadamard matrix with the last row removed: (a) origin, (b) performance surface, (c) null-space, (d) contour intersecting the null-space at the head of the located vector, (e) point of intersection of the contour and the null-space, (f) desired null-space vector.


We define the a posteriori error as

    ε[n] = A x[n+1]                                      (9)

Substituting Eq. (8) in Eq. (9) and re-arranging,

    ε[n] = e[n] - μ[n] A A^T e[n]                        (10)

Squaring and minimizing Eq. (10) with respect to μ[n] yields

    μ[n] = (e^T[n] R e[n]) / (e^T[n] R R e[n])           (11)

where

    R = A A^T                                            (12)

Eq. (11) gives the optimal step-size that minimizes the a posteriori error criterion defined in Eq. (9), guaranteeing stability and providing faster convergence. We can regularize it by adding a small constant δ to the denominator to avoid division by zero,

    μ[n] = (e^T[n] R e[n]) / (δ + e^T[n] R R e[n])       (13)

Eq. (13) seems intimidating at first due to the presence of large terms in the numerator as well as the denominator, and computing such large terms in every iteration can be expensive. Another question arises about the exact magnitude of the chosen δ. We can simplify Eq. (13) by observing that both the numerator and the denominator in Eq. (13) are scalars. Also, in the steady-state,

    e[n+1] ≈ e[n]

Allowing for some misadjustment ℳ, we can re-write Eq. (13) in the steady-state as a constant step-size. Finally,

    μ = ℳ / tr(R)                                        (14)

Eq. (14) is the simplified equation for the optimal step-size.

Figure 3. Learning curve of the proposed algorithm. The step-size parameter is selected for the misadjustment values 10%, 20%, and 30% according to Eq. (14), for N = 40 (M = N - 1 = 39).


ℳ is a dimension-free measure of degradation [15]. It arises from the use of the stochastic gradient in Eq. (4), which implies that the instantaneous value of the squared error is not zero even if the mean square error (MSE) is zero. It is directly proportional to the step-size: a larger step-size results in a larger misadjustment and vice versa.
It is well known that for the convergence and stability of an iterative method, the step-size should be kept below the reciprocal of the maximum eigenvalue of the matrix that controls the iterative process, which in this case is the R = A A^T matrix. Since eigenvalues may not always be available in practice, one way to restrict the step-size is to keep it below the reciprocal of the trace of the corresponding matrix. The trace, being the sum of all the eigenvalues, is a safer and more practical measure, and it also avoids the need to compute the eigenvalues, a considerable task in itself. Eq. (14) automatically leads us to this result: by scaling the step-size by the reciprocal of the trace of R, it keeps the recursion within this bound and thus guarantees both the convergence and the stability of the method.
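Both step-size rules can be written compactly; the sketch below (with function names and the regularization value of our choosing) computes the time-varying step of Eq. (13) and the trace-based constant step of Eq. (14):

    import numpy as np

    def step_size_nlms(A, e, delta=1e-8):
        # Optimal time-varying step-size of Eq. (13), with R = A A^T.
        Re = A @ (A.T @ e)                           # R e[n]
        return float(e @ Re) / (delta + float(Re @ Re))

    def step_size_trace(A, misadjustment=0.1):
        # Simplified constant step-size of Eq. (14): mu = M / tr(R).
        return misadjustment / float(np.sum(A * A))  # tr(A A^T) = sum(A_ij^2)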
VI. ALGORITHM
We summarize the complete procedure in the form of an algorithm as follows.

1. Form the R matrix: R = A A^T.
2. Calculate the step-size: μ = ℳ / tr(R).
3. Initialize x[0].
4. Compute the error: e[n] = A x[n].
5. Update the estimate: x[n+1] = x[n] - μ A^T e[n], and repeat from step 4 until the error is sufficiently small.

Figure 2. Geometrical interpretation of the proposed method in three dimensions for a 2 × 2 Hadamard matrix with the last row removed: (a) performance surface, (b) contour intersecting the null-space at the head of the located vector, (c) null-space. The red arrow, with its tip at the point of intersection of the three surfaces, represents the desired null-space vector.
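Putting the steps together, a complete sketch of the procedure might look as follows (function name, stopping rule, and defaults are ours; the paper leaves them open):

    import numpy as np

    def recursive_null_space(A, misadjustment=0.3, iters=20000, tol=1e-10):
        # Steps 1-2: tr(R) with R = A A^T, and mu = M / tr(R), Eq. (14).
        mu = misadjustment / float(np.sum(A * A))
        # Step 3: initialize x[0] to zero plus a small constant.
        x = np.full(A.shape[1], 1e-3)
        for _ in range(iters):
            e = A @ x                  # Step 4: e[n] = A x[n], Eq. (2)
            if np.linalg.norm(e) < tol:
                break
            x = x - mu * (A.T @ e)     # Step 5: update, Eq. (7)
        return x / np.linalg.norm(x)   # unit vector along the null-space

As a quick sanity check, np.linalg.norm(A @ recursive_null_space(A)) should come out close to zero for a random A with M = N - 1 rows.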

VII. SIMULATION AND RESULTS
We now present the simulation results of the algorithm for a random matrix whose entries are independent and identically distributed Gaussian random variables with zero mean and unit variance. Figure 3 compares the convergence of the proposed algorithm for three different values of misadjustment. Learning curves of the algorithm are plotted for a random choice of entries with N = 40 (and hence M = N - 1 = 39). The step-size parameter is selected for the misadjustment values 10%, 20%, and 30% according to Eq. (14). x[0] is initialized to zero plus a small constant to avoid division by zero. Each plot is based on fifty simulation runs. Figure 3 reveals that the algorithm converges quickly for the smaller values of the step-size and that its accuracy is not lost for the larger values of the step-size.
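The setup of Fig. 3 can be reproduced along these lines (a single-run sketch of ours; the paper averages fifty runs):

    import numpy as np

    rng = np.random.default_rng(1)
    N = 40
    A = rng.standard_normal((N - 1, N))  # i.i.d. Gaussian, zero mean, unit variance

    mu = 0.1 / float(np.sum(A * A))      # Eq. (14) with 10% misadjustment
    x = np.full(N, 1e-3)                 # zero plus a small constant
    mse = []
    for n in range(2000):
        e = A @ x
        mse.append(float(e @ e))         # squared error J[n], Eq. (3)
        x = x - mu * (A.T @ e)
    # Plotting mse against n gives the learning curve of Fig. 3.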
Figure 4 provides information about the transient behavior of the algorithm. Starting with the initial value and letting the recursive equation run, we obtain two sequences of variables, the two components of the solution vector, shown on the bottom and left axes of Fig. 4 respectively. Plotting these two sequences against each other, we obtain the trajectory followed by the algorithm.
Figure 4 shows one such trajectory obtained for the null-space vector of the matrix A, along with the contour plots that highlight the performance surface of the algorithm. The results are presented for a misadjustment value of 100% and 30 iterations; convergence is complete within five iterations. The performance surface for the null-space vector of the matrix is displayed in Fig. 5.

Figure 4. Trajectory showing the convergence of the null-space vector on the performance surface according to Eq. (14) for a misadjustment value of 100%.

Figure 5. Performance surface for the null-space vector.

The eccentricity of the performance surface is determined by the spread of the eigenvalues of the matrix A^T A. Since A is not a full-rank matrix due to its rectangular nature, A^T A is also deficient in rank. At least one of its eigenvalues is zero, and so is the value of the performance surface along the axis associated with the corresponding eigenvector. The spread of the hyper-ellipses in that direction is infinite, and the ellipses no longer close on that axis. Therefore, the shape of the performance surface is no longer a set of hyper-ellipses but a parabolic cylinder with its base aligned to the null-space eigenvector, which accounts for its particular shape in our case.
VIII. CONCLUSION
A novel method for the computation of the null-space of random matrices of arbitrary size has been presented in this paper. A recursive algorithm with an optimal step-size was derived to solve the problem with ensured convergence, accompanied by a geometrical interpretation to facilitate visualization.
REFERENCES
[1] G. Strang, Linear Algebra and Its Applications. Brooks Cole, 2005.
[2] A. Sharma, et al., "A filter based feature selection algorithm using null space of covariance matrix for DNA microarray gene expression data," Current Bioinformatics, vol. 7, no. 3, pp. 289-294, 2012.
[3] A. Sharma, et al., "Null space based feature selection method for gene expression data," International Journal of Machine Learning and Cybernetics, vol. 3, pp. 269-276, 2012.
[4] M.-J. Lai and Y. Liu, "The null space property for sparse recovery from multiple measurement vectors," Applied and Computational Harmonic Analysis, vol. 30, pp. 402-406, 2011.
[5] Y. Plan, "Compressed sensing, sparse approximation, and low-rank matrix estimation," Ph.D. dissertation, California Institute of Technology, 2011.
[6] N. Vaswani and R. Chellappa, "Principal components null space analysis for image and video classification," IEEE Trans. Image Processing, vol. 15, pp. 1816-1830, 2006.
[7] G. Strang, Computational Science and Engineering. Wellesley-Cambridge Press, 2007.
[8] H. J. Neradt, "Null-space methods for numerical solutions of differential equations," Ph.D. dissertation, University of Illinois at Urbana-Champaign, 2007.
[9] Shashank, et al., "A co-located incompressible Navier-Stokes solver with exact mass, momentum and kinetic energy conservation in the inviscid limit," Journal of Computational Physics, 2010.
[10] G. Shklarski and S. Toledo, "Computing the null space of finite element problems," Computer Methods in Applied Mechanics and Engineering, vol. 198, pp. 3084-3095, 2009.
[11] S. Leyendecker, et al., "The discrete null space method for constrained mechanical systems in nonlinear structural and multibody dynamics," PAMM, vol. 5, pp. 205-206, 2005.
[12] C. Gotsman and S. Toledo, "On the computation of null spaces of sparse rectangular matrices," SIAM J. Matrix Anal. Appl., vol. 30, pp. 445-463, 2008.
[13] M. Arioli and G. Manzini, "A null space algorithm for mixed finite-element approximations of Darcy's equation," Communications in Numerical Methods in Engineering, vol. 18, pp. 645-657, 2002.
[14] J. Tzeng, "Split-and-combine singular value decomposition for large-scale matrix," Journal of Applied Mathematics, vol. 2013, p. 8, 2013.
[15] B. Farhang-Boroujeny, Adaptive Filters: Theory and Applications, 2nd ed. Wiley, 2013.
[16] K.-A. Lee, et al., Subband Adaptive Filtering: Theory and Implementation. Wiley, 2009.

