
Tutorial on Compressed Sensing

Exercises

1. Exercise
Let $\eta \ge 0$. Show that the $\ell_0$ minimization problem

$$(P_{0,\eta}) \qquad x^{\#} = \arg\min \|z\|_0 \quad \text{s.t.} \quad \|Az - y\|_2 \le \eta$$

for general $m \times N$ matrices $A$ and $y \in \mathbb{R}^m$ is NP-hard.

Hint: You can use the fact that the exact cover problem is NP-hard.
Exact Cover Problem: Given as input a natural number $m$ divisible by 3 and a system $\{T_j : j = 1, \ldots, N\}$ of subsets of $\{1, \ldots, m\}$ with $|T_j| = 3$ for all $j \in [N]$, decide whether there is a subsystem of mutually disjoint sets $\{T_j : j \in J\}$, $J \subset [N]$, such that $\bigcup_{j \in J} T_j = \{1, \ldots, m\}$.

Solution
We show that any algorithm solving the $\ell_0$-problem can be transformed in polynomial time into an algorithm solving the exact cover problem.
To this end, let $\{T_j : j = 1, \ldots, N\}$ be a system of subsets of $\{1, \ldots, m\}$ with $|T_j| = 3$. We construct a matrix $A \in \mathbb{R}^{m \times N}$ by putting

$$a_{ij} := \begin{cases} 1 & \text{if } i \in T_j, \\ 0 & \text{if } i \notin T_j, \end{cases}$$

i.e. the $j$-th column of $A$ is the indicator vector of $T_j$, denoted by $\chi_{T_j}$, and

$$Ax = \sum_{j=1}^{N} x_j \chi_{T_j}. \qquad (1)$$

This construction can of course be done in polynomial time. Now let $x$ be the solution of the minimization problem

$$\min \|x\|_0 \quad \text{s.t.} \quad Ax = y = (1, \ldots, 1)^T. \qquad (P_0)$$

From (1) it follows that

$$m = \|y\|_0 = \|Ax\|_0 = \Big\| \sum_{j=1}^{N} x_j \chi_{T_j} \Big\|_0 \le \sum_{j=1}^{N} \|x_j \chi_{T_j}\|_0 \le 3\|x\|_0, \qquad (2)$$

i.e. $\|x\|_0 \ge m/3$. The last step in the inequality follows because every $\chi_{T_j}$ has exactly three nonzero entries and $\|x_j \chi_{T_j}\|_0 = 0$ if $x_j = 0$.
We show that the exact cover problem has a positive solution if and only if $\|x\|_0 = m/3$. Thus, after solving $(P_0)$, we can decide whether the exact cover problem has a positive solution by computing the $\ell_0$-norm of the solution $x$.
Let us first assume that the exact cover problem has a positive solution. Then there is a set $J \subset \{1, \ldots, N\}$ with $|J| = m/3$ and $y = \chi_{\{1,\ldots,m\}} = \sum_{j \in J} \chi_{T_j}$. Hence $y = Ax$ for $x = \chi_J$ with $\|x\|_0 = |J| = m/3$, which by (2) is indeed a minimizer of $(P_0)$.
If, on the other hand, $y = Ax$ with $\|x\|_0 = m/3$, then by (1) the $m/3$ sets $\{T_j : j \in \operatorname{supp} x\}$ must together cover all $m$ elements; since each of them has exactly three elements, they are mutually disjoint and hence solve the exact cover problem.
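The reduction is straightforward to carry out numerically. A minimal MATLAB sketch, where the set system and the name solve_P0 for a hypothetical $\ell_0$ solver are purely illustrative:

% Build the matrix A of the reduction from a set system T and read off
% the exact-cover answer from the l0-norm of the minimizer.
m = 6;                                     % divisible by 3
T = {[1 2 3], [3 4 5], [4 5 6], [1 2 6]}; % sets with |T_j| = 3
N = numel(T);
A = zeros(m, N);
for j = 1:N
    A(T{j}, j) = 1;                        % column j = indicator of T_j
end
y = ones(m, 1);
% x = solve_P0(A, y);                      % hypothetical l0 solver
% hasCover = (nnz(x) == m/3);              % exact cover exists iff true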
2. Exercise
Let $q > 1$ and let $A$ be an $m \times N$ matrix with $m < N$. Show that there is a 1-sparse vector $x$ which is not a solution of the optimization problem

$$(P_q) \qquad x^{*} = \arg\min \|z\|_q \quad \text{s.t.} \quad Az = Ax.$$

Solution
For $j = 1, \ldots, N$ let $e_j \in \mathbb{R}^N$ denote the $j$-th canonical unit vector, which is 1-sparse. Suppose that for every $j$ and all $z \in \mathbb{R}^N$ with $Az = Ae_j$ and $z \ne e_j$ we have $\|z\|_q^q > \|e_j\|_q^q = 1$. Since $m < N$, there is $v \in \ker(A) \setminus \{0\}$; let $t \ne 0$ with $|t| < 1/\|v\|_\infty$. Because $A(e_j + tv) = Ae_j$, we obtain

$$1 < \|e_j + tv\|_q^q = |1 + tv_j|^q + \sum_{k \ne j} |tv_k|^q = |1 + tv_j|^q + |t|^q \sum_{k \ne j} |v_k|^q \sim_{t \to 0} 1 + qtv_j,$$

where the asymptotic expansion follows from the Taylor (binomial) expansion of $|1 + tv_j|^q$ and the fact that $|t|^q = o(|t|)$ for $q > 1$.

This inequality shows that $v_j = 0$, since it holds for small $t$ of both signs. But this is true for all $j \in \{1, \ldots, N\}$, and therefore it follows that $v = 0$, which yields a contradiction.
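For $q = 2$ this failure can also be observed directly, since the minimizer of $(P_2)$ is the least-norm solution $A^{+}y$. A minimal MATLAB check, with random $A$ as an example choice:

% The minimum l2-norm solution of Az = A*e_1 is pinv(A)*(A*e_1), which
% is generically dense and has norm < 1, so it differs from e_1.
rng(1); m = 5; N = 10;
A = randn(m, N);
x = zeros(N, 1); x(1) = 1;         % the 1-sparse vector e_1
z = pinv(A) * (A * x);             % least-norm solution of Az = Ax
fprintf('nnz(z) = %d, norm(z) = %.3f\n', nnz(z), norm(z));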
3. Exercise
Let $A$ be an $m \times N$ matrix, $y \in \mathbb{R}^m$, $\eta > 0$, and let $\|\cdot\|$ be an arbitrary norm on $\mathbb{R}^m$. Show that the solution of the optimization problem

$$(P_{1,\eta}) \qquad x^{*} = \arg\min \|z\|_1 \quad \text{s.t.} \quad \|Az - y\| \le \eta$$

is $m$-sparse whenever it is unique.

Hint: Show that the system of columns $\{a_j : j \in \operatorname{supp} x^{*}\}$ is linearly independent.

Solution
We show that if $x^{*}$ is the unique solution, with $K := \operatorname{supp} x^{*}$, then the columns $\{a_j : j \in K\}$ have to be linearly independent. Since at most $m$ columns can be linearly independent, the statement follows. Suppose $\{a_j : j \in K\}$ is not linearly independent; then there is a $v \in \mathbb{R}^N$ with $Av = 0$ and $v \ne 0$, i.e. $v \in \ker(A)$, and $\operatorname{supp}(v) \subset K$. In particular, $x^{*} + tv$ is feasible for every $t \in \mathbb{R}$. Because of the uniqueness we have for every $t \ne 0$ small enough (in absolute value):

$$\|x^{*}\|_1 < \|x^{*} + tv\|_1 = \sum_{j \in K} |x^{*}_j + tv_j| = \sum_{j \in K} \operatorname{sign}(x^{*}_j + tv_j)(x^{*}_j + tv_j)$$
$$= \sum_{j \in K} \operatorname{sign}(x^{*}_j + tv_j)\, x^{*}_j + t \sum_{j \in K} v_j \operatorname{sign}(x^{*}_j + tv_j)$$
$$= \sum_{j \in K} \operatorname{sign}(x^{*}_j)\, x^{*}_j + t \sum_{j \in K} v_j \operatorname{sign}(x^{*}_j)$$
$$= \|x^{*}\|_1 + t \sum_{j \in K} v_j \operatorname{sign}(x^{*}_j),$$

where we used that $\operatorname{sign}(x^{*}_j + tv_j) = \operatorname{sign}(x^{*}_j)$ for $j \in K$ and $|t|$ small, since $x^{*}_j \ne 0$ on $K$. This is a contradiction, since we can choose the sign of $t$ such that $t \sum_{j \in K} v_j \operatorname{sign}(x^{*}_j) \le 0$.
4. Exercise
Let $A$ be an $m \times N$ matrix and $2s \le m$. Show that the following statements are equivalent:

i) There is a mapping $\Lambda : \mathbb{R}^m \to \mathbb{R}^N$ such that $\Lambda(Ax) = x$ for all $x \in \Sigma_s$. We call such a mapping $\Lambda$ a decoder.

ii) $\Sigma_{2s} \cap \ker(A) = \{0\}$.

iii) For any set $T$ with $\#T = 2s$, the matrix $A_T$ has rank $2s$.

iv) The symmetric non-negative matrix $A_T^{t} A_T$ is invertible, i.e. positive definite.

Solution
The equivalence between ii), iii) and iv) is linear algebra.
For example ii) ⇒ iii): If $\Sigma_{2s} \cap \ker(A) = \{0\}$, we can deduce that for every $T \subset \{1, \ldots, N\}$ with $|T| \le 2s$ it holds that $\ker(A_T) = \{0\}$, and therefore that $A_T$ has full rank.
i) ⇒ ii): Let $x \in \Sigma_{2s} \cap \ker(A)$; then we can write $x = x_1 - x_0$, where both $x_1$ and $x_0$ lie in $\Sigma_s$. Since $x \in \ker(A)$ it holds that $Ax = 0$ and therefore $Ax_1 = Ax_0$, which implies by assumption i) that $x_1 = \Lambda(Ax_1) = \Lambda(Ax_0) = x_0$, and therefore $x = 0$.
ii) ⇒ i): For any $y \in \mathbb{R}^m$ we define the decoder $\Lambda(y)$ as an element with smallest support in the set of solutions $\{x \in \mathbb{R}^N : Ax = y\}$ (and, say, $\Lambda(y) := 0$ if this set is empty). Suppose there is $x_1 \in \Sigma_s$ such that $x_0 := \Lambda(Ax_1) \ne x_1$. Then $Ax_0 = Ax_1$ and $\|x_0\|_0 \le \|x_1\|_0 \le s$, and hence $x_1 - x_0 \in \Sigma_{2s} \cap \ker(A) = \{0\}$, i.e. $x_1 = x_0$, a contradiction.
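For tiny dimensions the decoder from ii) ⇒ i) can be realized by brute force. A sketch, where the function name and tolerance are illustrative and the cost is exponential in $N$:

function x = l0_decoder(A, y)
% Brute-force decoder: a solution of Az = y with smallest support.
    [~, N] = size(A);
    if norm(y) < 1e-9, x = zeros(N, 1); return; end
    for k = 1:N
        S = nchoosek(1:N, k);              % all index sets of size k
        for r = 1:size(S, 1)
            Tk = S(r, :);
            z = A(:, Tk) \ y;              % least squares on support Tk
            if norm(A(:, Tk)*z - y) < 1e-9 % exact fit found
                x = zeros(N, 1); x(Tk) = z; return;
            end
        end
    end
end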
5. Exercise
[NSP] Given a matrix $A \in \mathbb{R}^{m \times N}$, every vector $x \in \mathbb{R}^N$ supported on a set $T$ is the unique solution of $(P_1)$ with $y = Ax$ if and only if $A$ satisfies the null space property relative to $T$.
Reminder: $A$ is said to satisfy the null space property relative to the set $T$ if for all $v \in \ker(A) \setminus \{0\}$ it holds that

$$\|v_T\|_1 < \|v_{T^C}\|_1,$$

where $(v_T)_i = v_i$ if $i \in T$ and $(v_T)_i = 0$ otherwise.

Solution
Given an index set $T$, assume first that every vector $x \in \mathbb{R}^N$ supported on $T$ is the unique minimizer of $(P_1)$ with $y = Ax$. Let $v \in \ker(A) \setminus \{0\}$. Then $v_T$ is supported on $T$, so $v_T$ is the unique minimizer of $(P_1)$ with $y = Av_T$. But because of $A(v_T + v_{T^C}) = Av = 0$, we have $A(-v_{T^C}) = Av_T$, and $-v_{T^C} \ne v_T$ (otherwise $v = 0$); hence by assumption $\|v_T\|_1 < \|-v_{T^C}\|_1 = \|v_{T^C}\|_1$.

Conversely, let us assume that the NSP relative to $T$ holds. Given a vector $x$ supported on $T$, for every $z \in \mathbb{R}^N$ with $Az = Ax$ and $z \ne x$ we have $v := x - z \in \ker(A) \setminus \{0\}$. We obtain

$$\|x\|_1 \le \|x - z_T\|_1 + \|z_T\|_1 = \|v_T\|_1 + \|z_T\|_1 < \|v_{T^C}\|_1 + \|z_T\|_1 = \|z_{T^C}\|_1 + \|z_T\|_1 = \|z\|_1,$$

where the strict inequality is the null space property, and we used $x - z_T = (x - z)_T = v_T$ and $v_{T^C} = -z_{T^C}$, both because $x$ is supported on $T$.


6. Exercise
Given a matrix $A \in \mathbb{R}^{m \times N}$, a vector $x \in \mathbb{R}^N$ with support $T$ is the unique minimizer of $(P_1)$ if and only if $\big|\sum_{j \in T} \operatorname{sign}(x_j) v_j\big| < \|v_{T^C}\|_1$ for all $v \in \ker A \setminus \{0\}$.

Solution
Let us start by proving that the inequality implies that $x \in \mathbb{R}^N$ with support $T$ is the unique minimizer of $(P_1)$. For a vector $z \in \mathbb{R}^N$, $z \ne x$, with $Az = Ax$ we write, with $v = x - z \in \ker(A) \setminus \{0\}$,

$$\|z\|_1 = \|z_T\|_1 + \|z_{T^C}\|_1 = \|(x - v)_T\|_1 + \|v_{T^C}\|_1 > |\langle x - v, \operatorname{sign}(x)_T \rangle| + |\langle v, \operatorname{sign}(x)_T \rangle| \ge |\langle x, \operatorname{sign}(x)_T \rangle| = \|x\|_1,$$

where the strict inequality uses the assumption (and Hölder's inequality for the first summand), and the last estimate is the triangle inequality.

It remains to show that the inequality holds as soon as $x$, supported on $T$, is the unique minimizer of $(P_1)$. In this situation, for $v \in \ker(A) \setminus \{0\}$, the vector $z = x - v$ satisfies $Az = Ax$ and $\|x\|_1 < \|z\|_1$. From this we can deduce

$$\langle x, \operatorname{sign}(z)_T \rangle \le \|x\|_1 < \|z\|_1 = \|z_T\|_1 + \|z_{T^C}\|_1 = \langle z, \operatorname{sign}(z)_T \rangle + \|z_{T^C}\|_1,$$

which is equivalent to

$$\langle v, \operatorname{sign}(z)_T \rangle < \|z_{T^C}\|_1, \quad \text{i.e.} \quad \langle v, \operatorname{sign}(x - v)_T \rangle < \|v_{T^C}\|_1,$$

since $z_{T^C} = -v_{T^C}$. But since this holds true for every $v \in \ker(A) \setminus \{0\}$, we may replace $v$ by $tv$ for $t > 0$ and obtain

$$\langle tv, \operatorname{sign}(x - tv)_T \rangle < t\|v_{T^C}\|_1 \quad \Leftrightarrow \quad \langle v, \operatorname{sign}(x - tv)_T \rangle < \|v_{T^C}\|_1.$$

For $t$ small enough it holds that $\operatorname{sign}(x_j - tv_j) = \operatorname{sign}(x_j)$ for $j \in T$, and therefore $\langle v, \operatorname{sign}(x)_T \rangle < \|v_{T^C}\|_1$. Applying the same argument to $-v$ yields the claimed absolute value.
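The sign condition can be probed numerically by sampling the kernel; note that sampling can only falsify the condition, never prove it. A sketch with example sizes:

% Sample random kernel vectors v and test the sign condition
% |<sign(x)_T, v_T>| < ||v_{T^c}||_1 from Exercise 6.
rng(2); m = 10; N = 20; s = 3;
A = randn(m, N);
T = 1:s; x = zeros(N, 1); x(T) = randn(s, 1);
V = null(A);                               % basis of ker(A)
ok = true;
for trial = 1:1e4
    v = V * randn(size(V, 2), 1);          % random kernel vector
    ok = ok && abs(sign(x(T))' * v(T)) < norm(v(setdiff(1:N, T)), 1);
end
disp(ok)   % true on all samples is consistent with unique recovery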

7. Exercise
Show that the RIP implies the NSP.
More explicitly: let $A \in \mathbb{R}^{m,d}$ satisfy the restricted isometry property (RIP) of order $2s$ with constant $0 < \delta_{2s} < 1/3$, i.e.

$$(1 - \delta_{2s})\|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta_{2s})\|x\|_2^2$$

holds for all $2s$-sparse vectors $x$, i.e. for all

$$x \in \Sigma_{2s} = \{v \in \mathbb{R}^d \mid \|v\|_0 = \#\{i \mid v_i \ne 0\} \le 2s\}.$$

Show that $A$ satisfies the null space property of order $s$ (NSP), i.e. for any $T \subset [d]$ with $\#T \le s$ and any $v \in \ker A \setminus \{0\}$ it holds that

$$2\|v_T\|_1 < \|v\|_1,$$

where $(v_T)_i = v_i$ if $i \in T$ and $(v_T)_i = 0$ otherwise.

Hint:

1. First show that

$$|\langle Ax, Ay \rangle| \le \delta_{2s}\|x\|_2\|y\|_2$$

if $x, y$ are $s$-sparse with disjoint supports.

2. For $v \in \ker A \setminus \{0\}$ let $T_0 \subset [d]$ denote the set of indices corresponding to the $s$ largest entries of $v$ (in magnitude). Further, let $T_0^c = T_1 \cup T_2 \cup \ldots$ be a partition of $T_0^c$ such that $T_1$ contains the indices of the $s$ largest entries of $v_{T_0^c}$, $T_2$ those of the $s$ largest entries of $v_{T_0^c \setminus T_1}$, etc.

Solution
1. step Let $x, y \in \Sigma_s$ with disjoint supports and with $\|x\|_2 = \|y\|_2 = 1$. Then it holds that $x \pm y \in \Sigma_{2s}$ and $\|x \pm y\|_2^2 = 2$. Using the RIP of $A$ of order $2s$ we obtain

$$2(1 - \delta_{2s}) = (1 - \delta_{2s})\|x \pm y\|_2^2 \le \|A(x \pm y)\|_2^2 \le (1 + \delta_{2s})\|x \pm y\|_2^2 = 2(1 + \delta_{2s}).$$

Now the claim follows from the polarization identity, since

$$|\langle Ax, Ay \rangle| = \frac{1}{4}\big| \|A(x + y)\|_2^2 - \|A(x - y)\|_2^2 \big| \le \frac{1}{4}\big( 2(1 + \delta_{2s}) - 2(1 - \delta_{2s}) \big) = \delta_{2s}.$$

The general case follows by applying this to $x/\|x\|_2$ and $y/\|y\|_2$.
2. step Let $v \in \ker A \setminus \{0\}$ and let $T_0 \subset [d] = \{1, 2, \ldots, d\}$ denote the set of indices corresponding to the $s$ largest entries of $v$ (in magnitude). Further divide $T_0^c = [d] \setminus T_0$ into sets $T_1, T_2, \ldots$, where $T_1$ contains the indices of the $s$ largest entries of $v_{T_0^c}$, $T_2$ the indices of the $s$ largest entries of $v_{T_0^c \setminus T_1}$, and so on. In total we have split $[d]$ into disjoint sets $T_0, T_1, \ldots$ such that $T_0$ contains the indices of the $s$ largest entries of $v$, $T_1$ the indices of the next $s$ largest entries, etc., hence

$$[d] = T_0 \cup T_1 \cup T_2 \cup \ldots$$

Since $v$ is an element of the kernel of $A$ we get

$$0 = Av = A(v_{T_0} + v_{T_1} + v_{T_2} + \ldots) \quad \Rightarrow \quad Av_{T_0} = -A(v_{T_1} + v_{T_2} + \ldots).$$

Now we can apply the RIP (since $\#T_0 \le s$) to arrive at

$$(1 - \delta_{2s})\|v_{T_0}\|_2^2 \le \|Av_{T_0}\|_2^2 = \langle Av_{T_0}, Av_{T_0} \rangle = \langle Av_{T_0}, -A(v_{T_1} + v_{T_2} + \ldots) \rangle = \sum_{i \ge 1} \langle Av_{T_0}, -Av_{T_i} \rangle.$$

Here we can apply our first step to get

$$(1 - \delta_{2s})\|v_{T_0}\|_2^2 \le \sum_{i \ge 1} \langle Av_{T_0}, -Av_{T_i} \rangle \le \delta_{2s} \sum_{i \ge 1} \|v_{T_0}\|_2 \|v_{T_i}\|_2.$$

Using our construction of the $T_i$'s we further estimate, for $i \ge 1$,

$$\|v_{T_i}\|_2 = \Big( \sum_{j \in T_i} v_j^2 \Big)^{1/2} \le \Big( \sum_{j \in T_i} \big( \max_{k \in T_i} |v_k| \big)^2 \Big)^{1/2} \le \sqrt{s}\, \max_{k \in T_i} |v_k| \le \sqrt{s}\, \min_{k \in T_{i-1}} |v_k| \le \sqrt{s}\, \frac{\sum_{j \in T_{i-1}} |v_j|}{s} = \frac{\|v_{T_{i-1}}\|_1}{\sqrt{s}}.$$

Hence,

$$(1 - \delta_{2s})\|v_{T_0}\|_2^2 \le \delta_{2s}\|v_{T_0}\|_2 \sum_{i \ge 1} \|v_{T_i}\|_2 \le \delta_{2s}\|v_{T_0}\|_2 \sum_{i \ge 1} \frac{\|v_{T_{i-1}}\|_1}{\sqrt{s}} = \frac{\delta_{2s}\|v_{T_0}\|_2}{\sqrt{s}}\, \|v\|_1.$$

Dividing by $\|v_{T_0}\|_2$ (which is nonzero since $v \ne 0$) and by $(1 - \delta_{2s})$, and using that $\delta_{2s} < 1/3$ implies $\delta_{2s}/(1 - \delta_{2s}) < 1/2$, we end up with

$$\|v_{T_0}\|_2 \le \frac{1}{\sqrt{s}}\, \frac{\delta_{2s}}{1 - \delta_{2s}}\, \|v\|_1 < \frac{\|v\|_1}{2\sqrt{s}}.$$

This yields the claim: by the Cauchy–Schwarz inequality it holds for any $s$-sparse $x$ that

$$\|x\|_1 = \langle x, \operatorname{sign}(x) \rangle \le \|x\|_2 \|\operatorname{sign}(x)\|_2 \le \sqrt{s}\, \|x\|_2,$$

and hence, for any $T$ with $\#T \le s$,

$$\|v_T\|_1 \le \|v_{T_0}\|_1 \le \sqrt{s}\, \|v_{T_0}\|_2 < \frac{\|v\|_1}{2},$$

i.e. $2\|v_T\|_1 < \|v\|_1$.
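In practice the RIP constant cannot be computed exactly (it involves all supports of size $2s$), but it can be bounded from below by sampling supports. A minimal MATLAB sketch, with all sizes chosen for illustration:

% Monte-Carlo lower bound on the RIP constant delta_{2s}:
% delta_{2s} = max over #T = 2s of ||A_T'*A_T - I||_2; sampling random
% supports only gives a lower bound (illustrative sketch).
rng(3); m = 50; d = 200; s = 3;
A = randn(m, d) / sqrt(m);          % normalized Gaussian matrix
delta = 0;
for trial = 1:2000
    T = randperm(d, 2*s);           % random support of size 2s
    G = A(:, T)' * A(:, T);
    delta = max(delta, norm(G - eye(2*s)));
end
fprintf('estimated delta_{2s} >= %.3f\n', delta);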

8. Exercise
Let $s \in \mathbb{N}$, $0 < \delta < 1$ and let

$$m \ge c\delta^{-2} s \log(ed/s).$$

Further let $A = \tilde{A}/\sqrt{m} \in \mathbb{R}^{m,d}$ with i.i.d. entries $\tilde{a}_{ij} \sim \mathcal{N}(0,1)$. Show that $A$ satisfies the RIP of order $s$ with RIP constant $\delta_s \le \delta$ with probability at least

$$1 - 2\exp(-C\delta^2 m).$$

Hint: First use a Bernstein inequality to show that

$$\mathbb{P}\big( \big|\|Ax\|_2^2 - \|x\|_2^2\big| \ge t\|x\|_2^2 \big) \le 2\exp(-ct^2 m)$$

holds for all $0 < t \le 1$ (only $t = \delta/2$ will be needed) and all $x \in \mathbb{R}^d$. Then show the desired RIP inequality for a fixed $s$-dimensional subspace using a covering argument.

Solution
We use the Bernstein inequality: let $X_1, \ldots, X_m$ be independent mean-zero (i.e. $\mathbb{E}X_i = 0$) subexponential random variables, i.e.

$$\mathbb{P}(|X_i| \ge t) \le \beta \exp(-\kappa t)$$

holds for all $t > 0$ and some constants $\beta, \kappa > 0$. Then it holds that

$$\mathbb{P}\Big( \Big| \sum_{i=1}^{m} X_i \Big| \ge t \Big) \le 2\exp\Big( \frac{-\kappa^2 t^2}{4\beta m + \kappa t} \Big).$$

1. step With Bernstein's inequality we first want to show that

$$\mathbb{P}\big( \big|\|Ax\|_2^2 - \|x\|_2^2\big| \ge t\|x\|_2^2 \big) \le 2\exp(-ct^2 m).$$

By homogeneity we may assume $x \in \mathbb{R}^d$ with $\|x\|_2 = 1$; let $\tilde{a}_i$ denote the $i$-th row of $\tilde{A}$. Consider the random variables

$$X_i = |\langle \tilde{a}_i, x \rangle|^2 - \|x\|_2^2 = |\langle \tilde{a}_i, x \rangle|^2 - 1.$$

Then it holds that:

• the $X_i$ are independent, since the rows $\tilde{a}_i$ are independent,

• the $X_i$ are subexponential, since $\langle \tilde{a}_i, x \rangle$ is Gaussian and the square of a Gaussian is subexponential,

• the $X_i$ have mean zero, since

$$\mathbb{E}X_i = \mathbb{E}|\langle \tilde{a}_i, x \rangle|^2 - \|x\|_2^2 = \|x\|_2^2\, \mathbb{E}\Big|\Big\langle \tilde{a}_i, \frac{x}{\|x\|_2} \Big\rangle\Big|^2 - \|x\|_2^2 = \|x\|_2^2\, \mathbb{E}|g|^2 - \|x\|_2^2 = 0$$

with $g \sim \mathcal{N}(0,1)$, because $\langle \tilde{a}_i, x/\|x\|_2 \rangle \sim \mathcal{N}(0,1)$,

• it holds that

$$\frac{1}{m} \sum_{i=1}^{m} X_i = \frac{1}{m} \sum_{i=1}^{m} \big( |\langle \tilde{a}_i, x \rangle|^2 - \|x\|_2^2 \big) = \sum_{i=1}^{m} \Big|\Big\langle \frac{\tilde{a}_i}{\sqrt{m}}, x \Big\rangle\Big|^2 - \|x\|_2^2 = \|Ax\|_2^2 - \|x\|_2^2.$$

Now we apply Bernstein's inequality to get

$$\mathbb{P}\big( \big|\|Ax\|_2^2 - \|x\|_2^2\big| \ge t \big) = \mathbb{P}\Big( \frac{1}{m}\Big| \sum_{i=1}^{m} X_i \Big| \ge t \Big) = \mathbb{P}\Big( \Big| \sum_{i=1}^{m} X_i \Big| \ge mt \Big) \le 2\exp\Big( \frac{-\kappa^2 t^2 m^2}{4\beta m + \kappa t m} \Big) \le 2\exp(-ct^2 m)$$

for $0 < t \le 1$, with $c = \kappa^2/(4\beta + \kappa)$.
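This concentration can be checked empirically before moving on; a MATLAB sketch, where the dimension, the deviation $t$, and the trial count are arbitrary choices:

% Empirical tail probability of | ||Ax||_2^2 - 1 | >= t for a fixed
% unit vector x; it should decay quickly as m grows.
rng(4); d = 100; t = 0.2; trials = 2000;
x = randn(d, 1); x = x / norm(x);
for m = [20 40 80 160]
    hits = 0;
    for k = 1:trials
        A = randn(m, d) / sqrt(m);
        hits = hits + (abs(norm(A*x)^2 - 1) >= t);
    end
    fprintf('m = %3d: empirical tail prob %.4f\n', m, hits/trials);
end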

2. step We fix an $s$-dimensional coordinate subspace. To this end, let $T \subset [d]$ with $\#T = s$ and let

$$X_T = \{x \in \Sigma_s \mid \operatorname{supp} x \subset T\}.$$

We want to show that

$$(1 - \delta)\|x\|_2 \le \|Ax\|_2 \le (1 + \delta)\|x\|_2$$

holds for all $x \in X_T$ with probability at least

$$1 - 2\Big(\frac{12}{\delta}\Big)^{s} \exp(-c\delta^2 m).$$

Let $Q \subset X_T$ be a $\delta/4$-net of the unit sphere of $X_T$, i.e.

• $\|q\|_2 = 1$ for all $q \in Q$, and

• for any $x \in X_T$ with $\|x\|_2 = 1$ there exists some $q \in Q$ with $\|x - q\|_2 \le \delta/4$.

It is known that we can choose $Q$ with $\#Q \le (12/\delta)^s$ (a proof is given below).


For any $q \in Q$ and $t = \delta/2$ we now use the first step to get

$$\mathbb{P}\big( \big|\|Aq\|_2^2 - \|q\|_2^2\big| \ge \delta/2 \big) \le 2\exp(-c\delta^2 m),$$

which is equivalent to

$$\Big(1 - \frac{\delta}{2}\Big)\|q\|_2^2 \le \|Aq\|_2^2 \le \Big(1 + \frac{\delta}{2}\Big)\|q\|_2^2$$

with probability at least $1 - 2\exp(-c\delta^2 m)$. Hence, for any (fixed) $q \in Q$ it also holds that

$$\Big(1 - \frac{\delta}{2}\Big)\|q\|_2 \le \|Aq\|_2 \le \Big(1 + \frac{\delta}{2}\Big)\|q\|_2$$

with probability at least $1 - 2\exp(-c\delta^2 m)$, since $1 - \delta/2 \le \sqrt{1 - \delta/2}$ and $\sqrt{1 + \delta/2} \le 1 + \delta/2$. By a union bound, this inequality holds simultaneously for all $q \in Q$ with probability at least

$$1 - 2\,\#Q\, \exp(-c\delta^2 m) \ge 1 - 2(12/\delta)^s \exp(-c\delta^2 m).$$

Now we want to prove that the desired inequality also holds for all $x \in X_T$. Let $\hat{\delta} \ge 0$ be the smallest constant such that

$$\|Ax\|_2 \le (1 + \hat{\delta})\|x\|_2$$

holds for all $x \in X_T$, and let $v \in X_T$ be fixed with $\|v\|_2 = 1$. Then there is some $q \in Q$ with $\|v - q\|_2 \le \delta/4$ and we get

$$\|Av\|_2 \le \|Aq\|_2 + \|A(v - q)\|_2 \le 1 + \frac{\delta}{2} + (1 + \hat{\delta})\frac{\delta}{4}.$$

Taking the supremum over all such $v$ implies

$$\hat{\delta} \le \frac{\delta}{2} + (1 + \hat{\delta})\frac{\delta}{4} \quad \Rightarrow \quad \hat{\delta} \le \frac{3\delta/4}{1 - \delta/4} < \delta.$$

For the lower bound we then estimate, for any unit vector $v \in X_T$,

$$\|Av\|_2 \ge \|Aq\|_2 - \|A(v - q)\|_2 \ge 1 - \frac{\delta}{2} - (1 + \delta)\frac{\delta}{4} \ge 1 - \delta.$$

3. step We have already proved the inequality for every fixed $s$-dimensional coordinate subspace. Since there are

$$\binom{d}{s} \le \Big(\frac{ed}{s}\Big)^{s}$$

possibilities to choose $s$ indices out of $[d]$, a union bound gives a failure probability of at most $2(ed/s)^s (12/\delta)^s \exp(-c\delta^2 m)$, which is bounded by $2\exp(-C\delta^2 m)$ under the assumption $m \ge c\delta^{-2} s \log(ed/s)$ (with $c$ chosen large enough), and the claim follows.


Covering argument It remains to show the covering bound which we used for the set $Q$; we prove it in a more general setting.
Let $X$ be an $m$-dimensional normed space, let $\varepsilon > 0$ and denote $B_X = \{x \in X \mid \|x\| \le 1\}$. Then the covering number

$$N = \min\Big\{ n \in \mathbb{N} \ \Big|\ \exists\, q_1, \ldots, q_n \in B_X : B_X \subset \bigcup_{i=1}^{n} (q_i + \varepsilon B_X) \Big\}$$

can be bounded by

$$N \le \Big( \frac{2 + 2\varepsilon}{\varepsilon} \Big)^{m}.$$

Indeed, let $Q = \{q_1, \ldots, q_k\}$ be (any) maximal set of points in $B_X$ with

$$\|q_i - q_j\| > \varepsilon \quad \text{for all } i \ne j.$$

Then it holds that:

• $B_X \subset \bigcup_i (q_i + \varepsilon B_X)$, since otherwise there is some $z \in B_X$ with $\|z - q_i\| > \varepsilon$ for all $i = 1, \ldots, k$, in contradiction to the maximality of $Q$. Hence, we have $k \ge N$.

• The sets $q_i + \frac{\varepsilon}{2} B_X$ are mutually disjoint. Assume that there exist $i, j$ and $z \in (q_i + \frac{\varepsilon}{2} B_X) \cap (q_j + \frac{\varepsilon}{2} B_X)$. It follows that $\|q_i - q_j\| \le \|q_i - z\| + \|q_j - z\| \le \varepsilon/2 + \varepsilon/2 = \varepsilon$, which implies $i = j$.

We conclude

$$\bigcup_{i=1}^{k} \Big( q_i + \frac{\varepsilon}{2} B_X \Big) \subset B_X + \frac{\varepsilon}{2} B_X \subset (1 + \varepsilon) B_X,$$

and comparing the volumes we arrive at

$$\operatorname{vol}\Big( \bigcup_{i=1}^{k} \Big( q_i + \frac{\varepsilon}{2} B_X \Big) \Big) = k \operatorname{vol}\Big( \frac{\varepsilon}{2} B_X \Big) = k \Big( \frac{\varepsilon}{2} \Big)^{m} \operatorname{vol}(B_X) \le \operatorname{vol}\big( (1 + \varepsilon) B_X \big) = (1 + \varepsilon)^{m} \operatorname{vol}(B_X),$$

hence

$$k \Big( \frac{\varepsilon}{2} \Big)^{m} \le (1 + \varepsilon)^{m} \quad \Rightarrow \quad N \le k \le \Big( \frac{2 + 2\varepsilon}{\varepsilon} \Big)^{m}.$$
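The greedy packing construction from this proof is easy to imitate numerically in small dimension; a sketch for the Euclidean unit disk, where the sample-based net and all parameters are illustrative assumptions:

% Greedily pick an eps-separated subset of sampled points in the unit
% disk; by the argument above it is an eps-net of the samples, and its
% size respects the bound ((2 + 2*eps)/eps)^m.
rng(5); epsi = 0.25; mDim = 2;
P = 2*rand(2e5, mDim) - 1;
P = P(sum(P.^2, 2) <= 1, :);               % sample points in the ball
Q = zeros(0, mDim);
for i = 1:size(P, 1)
    if isempty(Q) || min(sqrt(sum((Q - P(i,:)).^2, 2))) > epsi
        Q(end+1, :) = P(i, :);             %#ok<SAGROW>
    end
end
fprintf('net size %d <= bound %.0f\n', size(Q,1), ((2+2*epsi)/epsi)^mDim);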

Matlab Exercises

1. Matlab Exercise

1. Implement the basis pursuit

$$\min_{x \in \mathbb{R}^d} \|x\|_1 \quad \text{subject to} \quad Ax = y$$

in the form of a linear optimization problem; a sketch is given below.

Hint: The Matlab routine linprog can be useful.

2. Test your program on noisy measurements of the form $y = Ax + z$, where $z \in \mathbb{R}^m$ is either deterministic noise (i.e. $\|z\|$ is small) or random Gaussian noise (i.e. $z_i \sim \mathcal{N}(0, \sigma^2)$ with $\sigma > 0$ small).
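One standard reformulation (a sketch, not necessarily the implementation in the accompanying Matlab files) splits the variable into $[x; u]$ with $|x_i| \le u_i$ and minimizes $\sum_i u_i$:

% Basis pursuit as a linear program over [x; u] with |x_i| <= u_i:
%   min sum(u)  s.t.  A*x = y,  x - u <= 0,  -x - u <= 0.
function x = basis_pursuit(A, y)
    [m, d] = size(A);
    f   = [zeros(d, 1); ones(d, 1)];   % objective: sum of the u_i
    Ain = [ eye(d), -eye(d);           %  x - u <= 0
           -eye(d), -eye(d)];          % -x - u <= 0
    bin = zeros(2*d, 1);
    Aeq = [A, zeros(m, d)];            % equality constraint A*x = y
    xu  = linprog(f, Ain, bin, Aeq, y);
    x   = xu(1:d);
end

For the noisy test, the equality constraint can be relaxed to the linear constraints $-\eta \le (Ax - y)_i \le \eta$, which keeps the problem a linear program.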

2. Matlab Exercise
Show numerically that the number of measurements $m$ only has to grow logarithmically in the dimension $d$ if we want to recover an $s$-sparse signal $x_0 \in \mathbb{R}^d$ from linear measurements $y = Ax_0$ with $A \in \mathbb{R}^{m,d}$.
To show this, calculate the approximation error for increasing values of $d$ and $m$ and plot the resulting error matrix; a sketch follows below.
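A possible experiment, reusing basis_pursuit from the previous exercise (the grid and the sparsity $s = 5$ are example choices matching Figure 1):

% Phase-transition experiment: basis pursuit recovery error on a
% grid of (m, d) values; err(i, j) is plotted as a heat map.
s = 5; ms = 5:5:60; ds = 100:100:1000;
err = zeros(numel(ms), numel(ds));
for i = 1:numel(ms)
    for j = 1:numel(ds)
        m = ms(i); d = ds(j);
        A  = randn(m, d) / sqrt(m);
        x0 = zeros(d, 1); x0(randperm(d, s)) = randn(s, 1);
        err(i, j) = norm(x0 - basis_pursuit(A, A*x0));
    end
end
imagesc(ds, ms, err); axis xy; colorbar;
xlabel('dimension d'); ylabel('measurements m');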

3. Matlab Exercise

1. Implement the 1-bit compressed sensing algorithm

$$\max_{x \in \mathbb{R}^d} \sum_{i=1}^{m} y_i \langle a_i, x \rangle \quad \text{subject to} \quad \|x\|_1 \le R, \ \|x\|_2 \le 1,$$

which recovers the true signal $x_0 \in \mathbb{R}^d$ with $\|x_0\|_1 \le R$ and $\|x_0\|_2 \le 1$ from the measurements $y_i = \operatorname{sign}\langle a_i, x_0 \rangle$, $i = 1, \ldots, m$.

Hint: The Matlab package CVX can be useful; a sketch is given after this exercise.

2. Test your algorithm with noisy measurements of the form

$$y_i = \begin{cases} \operatorname{sign}\langle a_i, x_0 \rangle & \text{with probability } 1 - p, \\ -\operatorname{sign}\langle a_i, x_0 \rangle & \text{with probability } p \end{cases}$$

for some $0 < p < 1/2$.
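A possible CVX formulation (a sketch; it assumes CVX is installed, and all problem sizes are example choices):

% One-bit compressed sensing: maximize a linear functional over the
% intersection of an l1 ball and the l2 unit ball.
rng(6); m = 200; d = 1000; s = 5; R = sqrt(s);
A  = randn(m, d);
x0 = zeros(d, 1); x0(randperm(d, s)) = randn(s, 1);
x0 = x0 / norm(x0);                 % ||x0||_2 = 1, hence ||x0||_1 <= R
y  = sign(A * x0);                  % one-bit measurements

cvx_begin quiet
    variable x(d)
    maximize( y' * (A * x) )
    subject to
        norm(x, 1) <= R;
        norm(x, 2) <= 1;
cvx_end
fprintf('error = %.3f\n', norm(x0 - x));

For the noisy test one flips each entry of y independently with probability p, e.g. y = y .* (2*(rand(m,1) > p) - 1).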

4. Matlab Exercise
Let $f : B^d \to \mathbb{R}$ be a ridge function with $f(x) = g(\langle a, x \rangle)$ for some (unknown) $s$-sparse ridge vector $a \in \mathbb{R}^d$ with $\|a\|_2 = 1$ and some differentiable ridge profile $g : \mathbb{R} \to \mathbb{R}$ with $g'(0) \ne 0$. The ridge vector $a$ can be recovered by the following algorithm:

• Input: ridge function $f(x) = g(\langle a, x \rangle)$, $h > 0$ small and $m \in \mathbb{N}$.

• Take $\Phi \in \mathbb{R}^{m \times d}$ a normalized Bernoulli matrix (i.e. with entries $\pm 1$, both with probability $1/2$).

• Put $\tilde{b}_j := \frac{f(h\varphi_j) - f(0)}{h}$, $j = 1, \ldots, m$, where $\varphi_j$ denotes the $j$-th row of $\Phi$.

• Put $\tilde{a} := \Delta_1(\tilde{b}) = \arg\min_{w \in \mathbb{R}^d} \|w\|_1$ s.t. $\Phi w = \tilde{b}$.

• Put $\hat{a} := \tilde{a}/\|\tilde{a}\|_2$.

• Output: $\hat{a}$.

Implement this algorithm and show numerically that it indeed recovers the ridge vector $a$; a sketch is given below.
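A sketch of the algorithm, reusing basis_pursuit from the first Matlab exercise; the profile g and the sizes follow the bottom-right panel of Figure 1, and interpreting "normalized" as scaling by $1/\sqrt{m}$ is an assumption:

% Ridge recovery: estimate the sparse ridge vector a from finite
% differences of f along the rows of a Bernoulli matrix.
rng(7); d = 1000; s = 5; m = 60; h = 0.01;
a = zeros(d, 1); a(randperm(d, s)) = randn(s, 1); a = a / norm(a);
g = @(t) tanh(t - 1);                       % ridge profile, g'(0) ~= 0
f = @(x) g(a' * x);                         % oracle access to f only

Phi = sign(randn(m, d)) / sqrt(m);          % normalized Bernoulli matrix
b = zeros(m, 1);
for j = 1:m
    b(j) = (f(h * Phi(j, :)') - f(zeros(d, 1))) / h;  % b_j ~ g'(0)*<phi_j, a>
end
atilde = basis_pursuit(Phi, b);             % Delta_1(b)
ahat = atilde / norm(atilde);
fprintf('error = %.4f\n', min(norm(a - ahat), norm(a + ahat)));  % sign ambiguity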

Solution
Matlab implementations are given in the corresponding Matlab files.
[Figure 1 shows four plots. Top left: basis pursuit error $\|x_0 - x\|_2$ against the number of measurements $m$ ($s = 5$, $d = 1000$) for no noise, Gaussian noise with sigma = 0.1, and deterministic noise with r = 0.1. Top right: phase transition of the error $\|x_0 - x\|_2$ over measurements $m$ and dimension $d$ (sparsity $s = 5$). Bottom left: one-bit recovery error $\|x_0 - x\|_2$ against $m$ ($s = 5$, $d = 1000$) for no noise and misclassification probabilities $p = 0.1$ and $p = 0.2$. Bottom right: ridge recovery error $\|a - \hat{a}\|_2$ against the step size $h$ (g = @(t)tanh(t-1), $s = 5$, $d = 1000$, $m = 60$).]

Figure 1: Top: generated figure of the basis pursuit test (left) and figure generated by the phase transition experiment (right). Bottom: generated figure of the one-bit test (left) and generated figure of the ridge function test (right).
