The Mixed-Determined Problem: N M N M

GEOS 5004: Analytical Geosciences Hole: 3.1 Inversion: The Mixed-Determined Problem

3. The Mixed-Determined Problem
for both the least-squares misfit method and the $l_2$ smallest model method, a solution to the inversion of the matrix (either $G^T G$ or $G G^T$) required that none of the eigenvalues equal zero
this is equivalent to saying that the rows (or columns) of the matrix must be linearly independent
in both cases, linear dependence means that the problem is mixed-determined
for the least-squares misfit case: despite N>M , there are an infinite number of models that fit
the data with the same best misfit
for the smallest model case: despite N<M, some of the data kernels $\mathbf{g}_i$ are linearly dependent upon one another and (due to data errors) there is no exact match to the data
mixed-determined
some subset of the model parameters (or linear combinations of them) are underdetermined
=> minimize the norm of the model for these parameters
some subset of the model parameters (or linear combinations of them) are overdetermined
=> minimize the norm of the data misfit for these parameters
compromise: minimize both norms, with some tradeoff between the two norms
minimize the objective function:
$$\Phi = \mu \|\mathbf{m}\|_2^2 + \|\Delta\mathbf{d}\|_2^2 = \mu \sum_{j=1}^{M} m_j^2 + \sum_{i=1}^{N} \left( e_i - \sum_{j=1}^{M} g_{ij} m_j \right)^2$$
where $\mu > 0$ is a tradeoff parameter chosen by the user
larger value gives more emphasis on minimizing the model size & less on the data misfit
damped least squares misfit
when N>M , approach the mixed-determined problem as if it were over-determined
differentiate the objective function with respect to each of the unknown model parameters, and set
the derivatives to be equal to zero

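$$\frac{\partial \Phi}{\partial m_k} = 0 = 2\mu m_k - 2\sum_{i=1}^{N} \left( e_i - \sum_{j=1}^{M} g_{ij} m_j \right) g_{ik}, \qquad k = 1, 2, \ldots, M$$
$$\mu m_k + \sum_{i=1}^{N} \sum_{j=1}^{M} g_{ij} m_j g_{ik} = \sum_{i=1}^{N} e_i g_{ik}, \qquad k = 1, 2, \ldots, M$$
$$\sum_{j=1}^{M} \left( \mu \delta_{kj} + \sum_{i=1}^{N} (G^T)_{ki} g_{ij} \right) m_j = \sum_{i=1}^{N} (G^T)_{ki} e_i, \qquad k = 1, 2, \ldots, M$$
$$\left( \mu I + G^T G \right) \mathbf{m} = G^T \mathbf{e}$$
the damped least squares misfit solution:
$$\mathbf{m} = \left( \mu I + G^T G \right)^{-1} G^T \mathbf{e}$$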
the matrix that needs to be inverted is still square, of size M x M, and symmetric
adding a positive constant to the diagonal adds the same positive constant to every eigenvalue
the eigenvalues of $G^T G$ are never negative, so they all become positive
=> the matrix can always be inverted
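The damped solution above can be tried numerically. The following is a minimal NumPy sketch, not the course's own code: the function name `damped_least_squares`, the symbol `mu` for the tradeoff parameter, and the small test matrix are all assumptions made for illustration. The matrix has two identical columns, so $G^T G$ has a zero eigenvalue and cannot be inverted without damping.

```python
import numpy as np

def damped_least_squares(G, e, mu):
    """Solve (mu*I + G^T G) m = G^T e for the model m."""
    M = G.shape[1]
    A = mu * np.eye(M) + G.T @ G          # M x M, symmetric
    return np.linalg.solve(A, G.T @ e)

# Hypothetical example: N = 3 data, M = 2 model parameters.
# The two columns are identical, so G^T G is singular.
G = np.array([[1.0, 1.0],
              [2.0, 2.0],
              [3.0, 3.0]])
e = np.array([2.0, 4.0, 6.0])

m_damped = damped_least_squares(G, e, mu=1e-6)
print(m_damped)
```

With a tiny damping value, the result should be the smallest of the infinitely many models that fit these data, with the two parameters sharing the signal equally.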
regularized smallest model
when N<M , approach the mixed-determined problem as if it were under-determined
during the smallest model derivation, it was shown that the model that minimizes the size of the
model while matching the data is a linear combination of the data kernels

$$m_k = \sum_{i=1}^{N} a_i g_{ik} \qquad \Longleftrightarrow \qquad \mathbf{m} = \sum_{i=1}^{N} a_i \mathbf{g}_i = G^T \mathbf{a}$$
substitute into the mixed-determined tradeoff objective function

$$\Phi = \mu \sum_{j=1}^{M} \sum_{m=1}^{N} \sum_{n=1}^{N} a_m g_{mj} g_{nj} a_n + \sum_{i=1}^{N} \left( e_i - \sum_{j=1}^{M} g_{ij} \sum_{p=1}^{N} g_{pj} a_p \right)^2$$
differentiate with respect to the unknowns $a_k$, $k = 1, 2, \ldots, N$, and set the derivatives equal to zero

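$$\frac{\partial \Phi}{\partial a_k} = 0 = \mu \sum_{j=1}^{M} \sum_{m=1}^{N} a_m g_{mj} g_{kj} + \mu \sum_{j=1}^{M} \sum_{n=1}^{N} g_{kj} g_{nj} a_n - 2 \sum_{i=1}^{N} \left( e_i - \sum_{j=1}^{M} g_{ij} \sum_{p=1}^{N} g_{pj} a_p \right) \left( \sum_{j=1}^{M} g_{ij} g_{kj} \right)$$
$$2\mu \sum_{i=1}^{N} \left( \sum_{j=1}^{M} g_{kj} g_{ij} \right) a_i - 2 \sum_{i=1}^{N} \left( \sum_{j=1}^{M} g_{kj} g_{ij} \right) \left[ e_i - \sum_{j=1}^{M} g_{ij} \sum_{p=1}^{N} g_{pj} a_p \right] = 0, \qquad k = 1, 2, \ldots, N$$
$$\sum_{i=1}^{N} \left( \sum_{j=1}^{M} g_{kj} g_{ij} \right) \left[ \mu a_i - e_i + \sum_{j=1}^{M} g_{ij} \sum_{p=1}^{N} g_{pj} a_p \right] = 0, \qquad k = 1, 2, \ldots, N$$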
regularized smallest model continued
the above equation can only be satisfied for all values of k if the term in square brackets is zero

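$$\mu a_i - e_i + \sum_{j=1}^{M} g_{ij} \sum_{p=1}^{N} g_{pj} a_p = 0, \qquad i = 1, 2, \ldots, N$$
$$\sum_{p=1}^{N} \left( \mu \delta_{ip} + \sum_{j=1}^{M} g_{ij} (G^T)_{jp} \right) a_p = e_i, \qquad i = 1, 2, \ldots, N$$
$$\left( \mu I + G G^T \right) \mathbf{a} = \mathbf{e} \qquad \Rightarrow \qquad \mathbf{a} = \left( \mu I + G G^T \right)^{-1} \mathbf{e}$$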
substitute back into the equation for the model
the regularized $l_2$ smallest model solution
$$\mathbf{m} = G^T \left( \mu I + G G^T \right)^{-1} \mathbf{e}$$
the matrix that needs to be inverted is still square, of size N x N, and symmetric
adding a positive constant to the diagonal adds the same positive constant to every eigenvalue
the eigenvalues of $G G^T$ are never negative, so they all become positive
=> the matrix can always be inverted
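As a quick numerical check of the regularized smallest model formula, here is a short NumPy sketch. The function name, the symbol `mu`, and the example values are assumptions for illustration only. It also evaluates the M x M damped least-squares form with the same `mu`; the two expressions are algebraically identical for any positive damping, so the models should agree to rounding error.

```python
import numpy as np

def regularized_smallest_model(G, e, mu):
    """Return m = G^T (mu*I + G G^T)^{-1} e."""
    N = G.shape[0]
    a = np.linalg.solve(mu * np.eye(N) + G @ G.T, e)   # N x N system for a
    return G.T @ a                                     # m = G^T a

# Hypothetical under-determined example: N = 2 data, M = 3 parameters.
G = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
e = np.array([2.0, 3.0])
mu = 0.1

m_reg = regularized_smallest_model(G, e, mu)

# Same damping applied through the M x M damped least-squares form.
m_dls = np.linalg.solve(mu * np.eye(3) + G.T @ G, G.T @ e)
print(m_reg, m_dls)
```

In practice one would solve whichever system is smaller: N x N when N<M, M x M when N>M.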
damping / regularization
eigenvalue decomposition of system of equations:

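$$A\mathbf{x} = \mathbf{b} \qquad A = R \Lambda R^T \qquad \mathbf{x}' = R^T \mathbf{x} \qquad \mathbf{b}' = R^T \mathbf{b} \qquad x'_i = \frac{b'_i}{\lambda_i}$$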
for least-squares misfit or $l_2$ smallest model in a mixed-determined system, the eigenvalues can be zero => divide by zero!
adding a constant to the diagonal of the matrix to be inverted adds a constant to the eigenvalues

$$x'_i = \frac{b'_i}{\lambda_i + \mu}, \qquad \mu > 0$$
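The rotate, divide, rotate-back recipe can be sketched in a few lines of NumPy. This is an illustration under stated assumptions: the helper name `solve_by_eigen`, the damping symbol `mu`, and the 2 x 2 matrix are hypothetical, and `np.linalg.eigh` is used because the matrix is assumed symmetric.

```python
import numpy as np

def solve_by_eigen(A, b, mu=0.0):
    """Solve A x = b for symmetric A using its eigenvalue decomposition."""
    lam, R = np.linalg.eigh(A)      # eigenvalues (ascending), eigenvectors in columns
    b_rot = R.T @ b                 # rotated right-hand side b'
    x_rot = b_rot / (lam + mu)      # divide by (damped) eigenvalues
    return R @ x_rot                # rotate back to x

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # eigenvalues 1 and 3
b = np.array([3.0, 3.0])

x_exact = solve_by_eigen(A, b)              # mu = 0: ordinary solution
x_damped = solve_by_eigen(A, b, mu=1.0)     # damped: shorter solution
print(x_exact, x_damped)
```

With `mu = 0` the function fails (division by zero) whenever an eigenvalue is zero, which is exactly the situation the damping is meant to repair.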
how much damping/regularization is required?
in principle, a tiny value added to the zero eigenvalues will prevent divide by zero
in practice, numerical precision of computer calculations also plays a role
the condition number for a matrix is:
$$c = \frac{\lambda_{max}}{\lambda_{min}}$$
if $1/c$ approaches machine precision, the eigenvalues cannot be computed accurately
=> smaller eigenvalues will take on random small values that are round-off zeroes
numerical precision: floating point (32-bit) calculation: ~$10^{-6}$
double-precision (64-bit) calculation: ~$10^{-14}$
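A small sketch of the comparison between condition number and machine precision, assuming NumPy. The helper `condition_number` and the diagonal test matrix are hypothetical. `np.finfo` reports the machine epsilon of each floating-point type, which is somewhat smaller than the rough working figures quoted above.

```python
import numpy as np

def condition_number(A):
    """Ratio of largest to smallest eigenvalue of a symmetric matrix."""
    lam = np.linalg.eigvalsh(A)
    return lam.max() / lam.min()

eps32 = np.finfo(np.float32).eps    # about 1.2e-7
eps64 = np.finfo(np.float64).eps    # about 2.2e-16

# Hypothetical matrix with eigenvalues spanning eight orders of magnitude.
A = np.diag([1.0e4, 1.0, 1.0e-4])
c = condition_number(A)

# Rough rule of thumb: the smallest eigenvalue is resolvable only if
# c times the machine epsilon is well below one.
resolvable_in_32bit = c * eps32 < 1.0
resolvable_in_64bit = c * eps64 < 1.0
print(c, resolvable_in_32bit, resolvable_in_64bit)
```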
damping / regularization
for $\lambda_{max} \gg \mu \gg \lambda_{min}$:
increases the size of the smallest eigenvalues, stabilizing their effect
has little effect on the largest eigenvalues, hence little effect on those data/model parameters
from the objective function:
=> greater damping equals smaller model (by allowing greater misfit)
=> tradeoff between smaller model and smaller misfit
data with measurement errors
should the misfit be minimized?
each datum has an associated measurement error
$$e_i = e_i^t + \delta e_i$$
$e_i^t$ = true datum; $\delta e_i$ = unknown error; $e_i$ = observed datum
assume that the data errors are uncorrelated: $\delta e_i$ is statistically unrelated to $\delta e_j$ for $i \neq j$
assume that the data errors are random and each satisfies a Gaussian (normal) distribution with zero mean and a standard deviation $\sigma_i$
apply a weighted $l_2$ norm to the data measurement errors
$$\|\delta\mathbf{e}\|_2^2 = \sum_{i=1}^{N} \frac{\delta e_i^2}{\sigma_i^2}$$
the expected value of this norm can be shown to be equal to N
apply the same norm to the data misfit, called chi-squared misfit
$$\chi^2 = \|\Delta\mathbf{d}\|_2^2 = \sum_{i=1}^{N} \frac{\Delta d_i^2}{\sigma_i^2}$$
philosophy: fit the data statistically to a level similar to that of the measurement errors: $\chi^2 = N$
method: explore tradeoff curve, calculating $\chi^2$ for each value of damping, until $\chi^2 = N$
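The tradeoff-curve search can be sketched as a simple scan over damping values. Everything named here is an assumption for illustration: the functions `chi_squared` and `damping_for_target_chi2`, the symbol `mu`, and the synthetic data (a hypothetical kernel matrix with errors of 0.1 on each datum). The scan stops at the first grid value where the chi-squared misfit reaches N; a real search would refine between grid points.

```python
import numpy as np

def chi_squared(G, e, sigma, m):
    """Weighted l2 misfit: sum over i of ((e_i - (G m)_i) / sigma_i)^2."""
    return float(np.sum(((e - G @ m) / sigma) ** 2))

def damping_for_target_chi2(G, e, sigma, mus):
    """Scan ascending damping values; return the first with chi^2 >= N."""
    N, M = G.shape
    for mu in mus:
        m = np.linalg.solve(mu * np.eye(M) + G.T @ G, G.T @ e)
        chi2 = chi_squared(G, e, sigma, m)
        if chi2 >= N:
            return mu, m, chi2
    return None

# Hypothetical synthetic data: truth m = [1, 2] plus errors of size 0.1.
G = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [1.0, -1.0]])
e = np.array([1.1, 1.9, 3.1, -0.9])
sigma = np.full(4, 0.1)

mu_star, m_star, chi2_star = damping_for_target_chi2(
    G, e, sigma, np.logspace(-4, 1, 51))
print(mu_star, m_star, chi2_star)
```

For very small damping the misfit here is below N (the data are over-fit); increasing the damping shrinks the model and raises the misfit until the target is reached.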
errors and stability
eigenvectors with small eigenvalues correspond to directions in model space that are not well
sampled by linear combinations of the data kernels
example: in 2D space, collect two data
$$\mathbf{g}_1 = [\,15 \;\; 17\,] \qquad \mathbf{g}_2 = [\,16 \;\; 18\,]$$
these two kernels are linearly independent, but barely, as they are almost parallel
the eigenvalues of $G G^T$ are:
$$\lambda_1 = 1094 \qquad \lambda_2 = 0.00365 \qquad c = 3 \times 10^5$$
the first eigenvector is
$$\mathbf{r}_1 = \frac{1}{\sqrt{158629}} \begin{bmatrix} 273 \\ 290 \end{bmatrix}$$
which is almost parallel to the data kernels
in this direction, the data provide strong constraints on the length of the model
the rotated data are (very roughly): $e'_1$ = (sum of data)/$\sqrt{2}$ and $e'_2$ = (difference of data)/$\sqrt{2}$
the first rotated datum will have an error similar to the original data
the second rotated datum will have very large error (perhaps larger than its value!)
e.g., if truth is $\mathbf{m} = [\,1 \;\; 0\,]$, then $\mathbf{e} = [\,15 \;\; 16\,]$; if $\sigma_e = [\,1 \;\; 1\,]$, $\mathbf{e}' = [\,22 \pm 1 \;\;\; 0.05 \pm 1\,]$
in the direction of $\mathbf{r}_2$ the small eigenvalue will expand the data errors to produce garbage
=> the length of the resulting model in the $\mathbf{r}_2$ direction will have very large error
instability: for a large condition number, small data errors can produce huge model errors
the large model errors will be in directions corresponding with the small eigenvalues
damping or regularization prevents these terms from blowing up
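The 2D example can be reproduced numerically. This NumPy sketch uses the two kernels from the slide; the data error of +1 on the first datum and the damping value of 1 are hypothetical choices made only to show the instability and its cure.

```python
import numpy as np

G = np.array([[15.0, 17.0],
              [16.0, 18.0]])            # rows are the data kernels g_1, g_2

lam = np.linalg.eigvalsh(G @ G.T)       # ascending: small eigenvalue first
cond = lam[1] / lam[0]

m_true = np.array([1.0, 0.0])
e_true = G @ m_true                     # [15, 16]

# Hypothetical measurement error: +1 on the first datum only.
e_obs = e_true + np.array([1.0, 0.0])

m_est = np.linalg.solve(G, e_obs)       # undamped: the error is amplified
m_damp = np.linalg.solve(1.0 * np.eye(2) + G.T @ G, G.T @ e_obs)  # damped

print(lam, cond)
print(m_est, m_damp)
```

A data error of size 1 moves the undamped model far from the truth, while the damped model stays within about one unit of it, at the cost of no longer fitting the data exactly.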
scaled data
minimizing the misfit norms $\Phi = \|\Delta\mathbf{d}\|_2^2$ or $\Phi = \mu \|\mathbf{m}\|_2^2 + \|\Delta\mathbf{d}\|_2^2$ will place equal weight on matching data with small or large measurement errors
philosophy: should place more emphasis on matching data that are more accurate
this is even more important if data with different physical meanings are mixed together
e.g., gravity data plus magnetic data
e.g., ???
normalize the data equations by the data measurement errors

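$$\frac{\mathbf{g}_i}{\sigma_i} \cdot \mathbf{m} = \frac{e_i}{\sigma_i} \qquad \Rightarrow \qquad \mathbf{g}'_i \cdot \mathbf{m} = e'_i$$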
the new data measurement errors have a standard deviation of one
use the new data and kernels in the inversion
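A minimal sketch of the scaling step, assuming NumPy. The helper `scale_by_errors` and the example values are hypothetical. Two data measure the same model parameter, one accurately and one poorly, so the effect of the weighting is easy to see.

```python
import numpy as np

def scale_by_errors(G, e, sigma):
    """Divide each data equation g_i . m = e_i by its standard deviation."""
    return G / sigma[:, None], e / sigma

# Hypothetical example: data 1 and 2 both measure m_1, datum 3 measures m_2.
G = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
e = np.array([1.0, 3.0, 5.0])
sigma = np.array([0.1, 10.0, 1.0])      # datum 1 accurate, datum 2 poor

G_s, e_s = scale_by_errors(G, e, sigma)

m_unweighted, *_ = np.linalg.lstsq(G, e, rcond=None)
m_weighted, *_ = np.linalg.lstsq(G_s, e_s, rcond=None)
print(m_unweighted, m_weighted)
```

Without scaling, the first parameter is the plain average of the two conflicting data; with scaling, it follows the accurate datum almost exactly.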
spectral decomposition
form a new matrix composed of smaller matrices

$$H = \begin{bmatrix} 0 & G \\ G^T & 0 \end{bmatrix}$$

this matrix is square, of size (N+M) x (N+M) , and symmetric
=> eigenvalues and eigenvectors exist; and the eigenvectors are orthogonal
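splitting the eigenvector into two shorter vectors, and using $s_i$ to indicate the eigenvalues, the eigenvector equation is
$$\begin{bmatrix} 0 & G \\ G^T & 0 \end{bmatrix} \begin{bmatrix} \mathbf{u}_i \\ \mathbf{v}_i \end{bmatrix} = s_i \begin{bmatrix} \mathbf{u}_i \\ \mathbf{v}_i \end{bmatrix} \qquad \Rightarrow \qquad G \mathbf{v}_i = s_i \mathbf{u}_i \qquad G^T \mathbf{u}_i = s_i \mathbf{v}_i$$
where $\mathbf{u}_i$ is of length N and $\mathbf{v}_i$ is of length M
isolating $\mathbf{u}_i$ and $\mathbf{v}_i$ in $G \mathbf{v}_i = s_i \mathbf{u}_i$ and $G^T \mathbf{u}_i = s_i \mathbf{v}_i$, and plugging each into the other equation
$$G G^T \mathbf{u}_i = s_i^2 \mathbf{u}_i \qquad G^T G \mathbf{v}_i = s_i^2 \mathbf{v}_i$$
which we recognize as the eigenvector equations for $G G^T$ and $G^T G$
$\mathbf{v}_i$ are the eigenvectors for the least-squares misfit problem
$\mathbf{u}_i$ are the eigenvectors for the $l_2$ smallest model problem
and $s_i^2$ are the eigenvalues for both problems
the $s_i$ are called the singular values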
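The link between the block matrix, the two eigenvalue problems, and the singular values can be checked numerically. This is a sketch assuming NumPy, with a hypothetical 3 x 2 kernel matrix. The eigenvalues of the block matrix come in plus/minus pairs of the singular values, padded with zeros when N and M differ.

```python
import numpy as np

G = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])                  # hypothetical N = 3, M = 2
N, M = G.shape

# Block matrix H = [[0, G], [G^T, 0]], size (N+M) x (N+M), symmetric.
H = np.block([[np.zeros((N, N)), G],
              [G.T, np.zeros((M, M))]])

eig_H = np.linalg.eigvalsh(H)               # ascending order
s = np.linalg.svd(G, compute_uv=False)      # singular values, descending
eig_GtG = np.linalg.eigvalsh(G.T @ G)       # ascending order

print(eig_H)
print(s, eig_GtG)
```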
singular value decomposition
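define a diagonal, N x M matrix S with the singular values on the diagonal (shown here for N = 4, M = 3)
define matrices U and V with the eigenvectors in the columns
$$S = \begin{bmatrix} s_1 & 0 & 0 \\ 0 & s_2 & 0 \\ 0 & 0 & s_3 \\ 0 & 0 & 0 \end{bmatrix} \qquad U = \begin{bmatrix} \mathbf{u}_1 & \cdots & \mathbf{u}_N \end{bmatrix} \qquad V = \begin{bmatrix} \mathbf{v}_1 & \cdots & \mathbf{v}_M \end{bmatrix}$$
$$G V = U S \qquad G^T U = V S^T$$
since the eigenvectors are orthonormal
$$V V^T = I \qquad U U^T = I$$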
singular value decomposition (SVD)
$$G = U S V^T$$
these matrices are usually truncated to include only the $P \le \min(N, M)$ non-zero eigenvalues
$S_P$ is P x P, $U_P$ is N x P, and $V_P$ is M x P
the vectors in $U_P$ correspond to (non-zero eigenvector part of) the $l_2$ smallest model
the vectors in $V_P$ correspond to (non-zero eigenvector part of) the least-squares misfit model
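The matrices U, S and V and their truncated versions can be built with `np.linalg.svd`. The sketch below assumes NumPy and a hypothetical 3 x 2 kernel matrix; the relative threshold of `1e-10` used to decide which singular values count as non-zero is also an assumption.

```python
import numpy as np

G = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])              # hypothetical N = 3, M = 2
N, M = G.shape

U, s, Vt = np.linalg.svd(G)             # full SVD: U is N x N, Vt is V^T (M x M)
V = Vt.T

S = np.zeros((N, M))                    # N x M, singular values on the diagonal
S[:len(s), :len(s)] = np.diag(s)

P = int(np.sum(s > 1e-10 * s[0]))       # number of non-zero singular values
U_P, S_P, V_P = U[:, :P], S[:P, :P], V[:, :P]

print(s, P)
print(U_P.shape, S_P.shape, V_P.shape)
```

The truncated product `U_P @ S_P @ V_P.T` rebuilds G exactly, since the discarded columns only multiply zeros.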
singular value decomposition continued
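plug the SVD into the data equations
$$U_P S_P V_P^T \mathbf{m} = \mathbf{e}$$
$$V_P S_P^{-1} U_P^T U_P S_P V_P^T \mathbf{m} = V_P S_P^{-1} U_P^T \mathbf{e}$$
the inverse of the S matrix is trivial (it is diagonal, so each singular value is replaced by its reciprocal)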
natural generalized inverse

$$\mathbf{m} = V_P S_P^{-1} U_P^T \mathbf{e} \qquad G^{-g} = V_P S_P^{-1} U_P^T$$
it is natural only in the sense that the $l_2$ norm seems natural
it is NOT an inverse, as $G G^{-g} \neq I$ unless P = N and $G^{-g} G \neq I$ unless P = M
the natural generalized inverse relates the model to the data in rotated coordinates
$$\mathbf{m}' = V_P^T \mathbf{m} \qquad \mathbf{e}' = U_P^T \mathbf{e} \qquad S_P \mathbf{m}' = \mathbf{e}'$$
the natural generalized inverse simultaneously finds
- the $l_2$ smallest model for the linear combination of model parameters that are underconstrained
- the least-squares misfit model for the linear combination of model parameters that are overconstrained
it is equivalent to the damped/regularized system with the smallest possible damping/regularization
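A sketch of the natural generalized inverse built from the truncated SVD, assuming NumPy. The function name, the tolerance, and the rank-1 example are hypothetical. For comparison, the result is checked against `np.linalg.pinv`, NumPy's Moore-Penrose pseudo-inverse, which is computed from the same truncated SVD.

```python
import numpy as np

def natural_generalized_inverse(G, tol=1e-10):
    """Return V_P S_P^{-1} U_P^T, keeping singular values > tol * s_max."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    P = int(np.sum(s > tol * s[0]))
    return Vt[:P, :].T @ np.diag(1.0 / s[:P]) @ U[:, :P].T

# Hypothetical rank-1 kernel matrix: P = 1, smaller than both N = 2 and M = 3.
G = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])
e = np.array([1.0, 2.0])

G_g = natural_generalized_inverse(G)
m = G_g @ e

print(m)
print(G @ G_g)      # not the identity, since P < N
print(G_g @ G)      # not the identity, since P < M
```

Here the two data are consistent copies of one measurement, so the result is the smallest model that fits them: a multiple of the kernel [1, 2, 3].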
singular value decomposition continued
practical issues
small singular values still lead to instability
=> damp/regularize by adding a small value to the singular values
OR
=> truncate the matrices (shrink P ) to eliminate the small singular values
allows a greater data misfit; produces a smaller model
the damped SVD solution is equivalent to the damped / regularized solutions before
eigenvalue decomposition and singular value decomposition have a high computational cost
they are not practical for large matrices (dimension over 1,000-10,000)
but they provide substantial insight into the system and its inverse
many algorithms exist to more efficiently solve systems of equations
different algorithms maximize efficiency by taking advantage of special properties of the matrix
(e.g., sparse, symmetric, etc.)
but the theory above holds
=> choose which system of equations to solve based upon the desired objective function
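The two stabilizing options listed above, damping and truncation, can be compared in one small function. This is a sketch under assumptions: the name `svd_solve` and the symbol `mu` are hypothetical, and the damped factor used here is $s_i / (s_i^2 + \mu)$, chosen because it reproduces the damped least-squares solution exactly (it adds the damping to the squared singular values rather than to the singular values themselves). The data are the nearly parallel 2D kernels with a hypothetical observed vector.

```python
import numpy as np

def svd_solve(G, e, mu=0.0, P=None):
    """Model from the SVD of G, optionally damped (mu) or truncated (P).

    With mu = 0, every retained singular value must be non-zero.
    """
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    if P is not None:                       # keep the P largest singular values
        U, s, Vt = U[:, :P], s[:P], Vt[:P, :]
    factors = s / (s ** 2 + mu)             # equals 1/s when mu = 0
    return Vt.T @ (factors * (U.T @ e))

G = np.array([[15.0, 17.0],
              [16.0, 18.0]])
e = np.array([16.0, 16.0])                  # hypothetical observed data

m_full = svd_solve(G, e)                    # undamped, unstable
m_damped = svd_solve(G, e, mu=1.0)          # damped
m_trunc = svd_solve(G, e, P=1)              # small singular value removed

print(m_full, m_damped, m_trunc)
```

Both stabilized models are much shorter than the undamped one, and neither fits the data exactly.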
