Exercises
1. Exercise
Let η ≥ 0. Show that the ℓ0-minimization problem
\[
\min \|x\|_0 \quad \text{s.t.} \quad \|Ax - y\|_2 \le \eta
\]
is NP-hard.
Hint: You can use the fact that the exact cover problem is NP-hard.
Exact Cover Problem: Given as input a natural number m divisible by 3 and a system {T_j : j = 1, . . . , N} of subsets of {1, . . . , m} with |T_j| = 3 for all j ∈ [N], decide whether there is a subsystem of mutually disjoint sets {T_j : j ∈ J}, J ⊂ [N], such that ∪_{j∈J} T_j = {1, . . . , m}.
Solution
We show that any algorithm solving the ℓ0-problem can be transformed in polynomial time into an algorithm solving the exact cover problem.
Let therefore {T_j : j = 1, . . . , N} be a system of subsets of {1, . . . , m} with |T_j| = 3. We construct a matrix A ∈ R^{m×N} by putting
\[
a_{ij} := \begin{cases} 1 & \text{if } i \in T_j, \\ 0 & \text{if } i \notin T_j, \end{cases}
\]
i.e. the j-th column of A is the indicator vector of T_j, denoted by \chi_{T_j}, and
\[
Ax = \sum_{j=1}^{N} x_j \chi_{T_j}. \tag{1}
\]
This construction can of course be done in polynomial time. Let now x be the solution to the minimization problem
\[
\min \|x\|_0 \quad \text{s.t.} \quad Ax = y = (1, \dots, 1)^T. \tag{P_0}
\]
By (1) it follows that
\[
m = \|y\|_0 = \|Ax\|_0 = \Big\|\sum_{j=1}^{N} x_j \chi_{T_j}\Big\|_0 \le \sum_{j=1}^{N} \|x_j \chi_{T_j}\|_0 \le 3\|x\|_0, \tag{2}
\]
i.e. ‖x‖_0 ≥ m/3, where the last inequality holds because every \chi_{T_j} has exactly three nonzero entries and ‖x_j \chi_{T_j}‖_0 = 0 if x_j = 0.
We show that the exact cover problem has a solution if and only if ‖x‖_0 = m/3. Hence, after solving (P_0), we can decide whether the exact cover problem has a positive solution by computing the ℓ0-norm of the minimizer x.
Let us first assume that the exact cover problem has a positive solution. Then there is a set J ⊂ {1, . . . , N} with |J| = m/3 and
\[
y = \chi_{\{1,\dots,m\}} = \sum_{j \in J} \chi_{T_j}.
\]
Hence y = Ax for x = \chi_J with ‖x‖_0 = |J| = m/3, which is indeed the minimum of (P_0) because of (2).
If, on the other hand, y = Ax and ‖x‖_0 = m/3, then both inequalities in (2) hold with equality; this forces the sets T_j, j ∈ supp x, to be mutually disjoint and to cover {1, . . . , m}, so {T_j : j ∈ supp x} solves the exact cover problem.
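The reduction can be sanity-checked numerically. The following Python sketch uses a made-up toy instance and a brute-force ℓ0-solver (exponential, of course, which is the point of the exercise): it builds A as above and confirms that the minimal support size equals m/3 exactly when an exact cover exists.

```python
import itertools
import numpy as np

# Hypothetical toy instance: m = 6 and four 3-element sets.
m = 6
sets = [{1, 2, 3}, {4, 5, 6}, {2, 3, 4}, {1, 5, 6}]
N = len(sets)

# Column j of A is the indicator vector of T_j, as in the reduction above.
A = np.zeros((m, N))
for j, T in enumerate(sets):
    for i in T:
        A[i - 1, j] = 1.0
y = np.ones(m)

def min_l0(A, y, tol=1e-9):
    """Brute-force the l0-problem: smallest support S with y in range(A_S)."""
    _, N = A.shape
    for k in range(1, N + 1):
        for S in itertools.combinations(range(N), k):
            x_S = np.linalg.lstsq(A[:, list(S)], y, rcond=None)[0]
            if np.linalg.norm(A[:, list(S)] @ x_S - y) < tol:
                return k, S
    return None, None

k, S = min_l0(A, y)
print(k, S)  # → 2 (0, 1)
assert k == m // 3  # an exact cover exists: {1,2,3} and {4,5,6}
```

Here the minimizer uses the columns of {1, 2, 3} and {4, 5, 6}, exactly the exact cover of this instance.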
2. Exercise
Let q > 1 and let A be an m × N matrix with m < N. Show that there is a 1-sparse vector x which is not a solution of the optimization problem
\[
\min \|z\|_q \quad \text{s.t.} \quad Az = Ax.
\]
Solution
For j = 1, . . . , N let e_j ∈ R^N denote the j-th canonical unit vector, which is 1-sparse. Suppose that for every j and all z ∈ R^N with Az = Ae_j and z ≠ e_j we have ‖z‖_q^q > ‖e_j‖_q^q = 1. Since m < N there is a v ∈ ker(A) \ {0}; let t ≠ 0 with |t| < 1/‖v‖_∞. Then e_j + tv is feasible and we obtain
\[
1 < \|e_j + tv\|_q^q = |1 + tv_j|^q + \sum_{k \ne j} |tv_k|^q = |1 + tv_j|^q + |t|^q \sum_{k \ne j} |v_k|^q \sim_{t \to 0} 1 + qtv_j,
\]
since q > 1 makes the term |t|^q Σ_{k≠j} |v_k|^q of order o(t). Choosing j with v_j ≠ 0 and t small with tv_j < 0 therefore gives ‖e_j + tv‖_q^q < 1, a contradiction.
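For q = 2 the failure of 1-sparse optimality can be seen explicitly. In the made-up instance below, A = [1 1], and the ℓ2-minimizer over {z : Az = Ae_1} spreads its mass over both coordinates:

```python
import numpy as np

# Toy instance (made up): A = [1 1], x = e1, q = 2.
A = np.array([[1.0, 1.0]])
e1 = np.array([1.0, 0.0])

# The minimum-l2-norm solution of Az = Ae1 is the pseudoinverse solution.
z = np.linalg.pinv(A) @ (A @ e1)
print(z)  # → [0.5 0.5]

# The 1-sparse vector e1 is strictly worse in the l2 sense:
assert np.linalg.norm(z) < np.linalg.norm(e1)
```

Indeed ‖z‖_2^2 = 1/2 < 1 = ‖e_1‖_2^2, so the 1-sparse vector e_1 is not the ℓ2-minimizer.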
3. Exercise
Solution
We show that if x^* is the unique solution of the ℓ1-minimization problem, with K := supp x^*, then the columns {a_j : j ∈ K} have to be linearly independent. Since at most m columns can be linearly independent, the statement follows.
Suppose {a_j : j ∈ K} is not linearly independent. Then there is a v ≠ 0 with Av = 0, i.e. v ∈ ker(A), and supp(v) ⊂ K. But because of the uniqueness we have for every t ≠ 0 small enough (in absolute value):
\[
\|x^*\|_1 < \|x^* + tv\|_1 = \sum_{j \in K} |x^*_j + tv_j| = \sum_{j \in K} \operatorname{sign}(x^*_j + tv_j)(x^*_j + tv_j)
= \sum_{j \in K} \operatorname{sign}(x^*_j)\,x^*_j + t \sum_{j \in K} v_j \operatorname{sign}(x^*_j)
= \|x^*\|_1 + t \sum_{j \in K} v_j \operatorname{sign}(x^*_j),
\]
where we used that sign(x^*_j + tv_j) = sign(x^*_j) for |t| small enough, since x^*_j ≠ 0 for j ∈ K. This is a contradiction, since we can choose the sign of t such that t Σ_{j∈K} v_j sign(x^*_j) ≤ 0.
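The statement can be observed numerically by solving basis pursuit as a linear program; the following hedged Python sketch (random instance, scipy's linprog standing in for whatever solver one prefers) checks that the minimizer has at most m nonzero entries:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, N, s = 10, 30, 3
A = rng.standard_normal((m, N))
x0 = np.concatenate([rng.standard_normal(s), np.zeros(N - s)])
y = A @ x0

# Basis pursuit min ||x||_1 s.t. Ax = y as an LP: x = u - v with u, v >= 0.
c = np.ones(2 * N)
res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=y, bounds=(0, None))
x_hat = res.x[:N] - res.x[N:]

support = np.flatnonzero(np.abs(x_hat) > 1e-6)
print(len(support))  # by the exercise, a unique minimizer has at most m nonzeros
assert len(support) <= m
```

For a random Gaussian A the ℓ1-minimizer is unique almost surely, so the support bound from the exercise applies.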
4. Exercise
Let A be an m × N matrix and 2s ≤ m. Show that the following statements are equivalent:
i) There is a decoder Λ : R^m → R^N such that Λ(Ax) = x for every x ∈ Σ_s, i.e. every s-sparse vector is recovered exactly from its measurements.
ii) Σ_{2s} ∩ ker(A) = {0}.
iii) For any set T with #T = 2s, the matrix A_T has rank 2s.
iv) The symmetric positive semidefinite matrix A_T^t A_T is invertible, i.e. positive definite.
Solution
The equivalence of ii), iii) and iv) is linear algebra. For example ii) ⇒ iii): if Σ_{2s} ∩ ker(A) = {0}, then for every T ⊂ {1, . . . , N} with |T| ≤ 2s it holds that ker(A_T) = {0}, and therefore A_T has full rank.
i) ⇒ ii): Let x ∈ Σ_{2s} ∩ ker(A). Then we can write x = x_1 − x_0, where both x_1 and x_0 lie in Σ_s. Since x ∈ ker(A) we have Ax = 0 and therefore Ax_1 = Ax_0, which implies by assumption i) that x_1 = x_0 and therefore x = 0.
ii) ⇒ i): For any y ∈ R^m we define the decoder Λ by letting Λ(y) be an element with the smallest support in the set of solutions {x ∈ R^N : Ax = y}. Suppose there is an x_1 ∈ Σ_s such that x_0 := Λ(Ax_1) ≠ x_1. Then Ax_0 = Ax_1 and ‖x_0‖_0 ≤ ‖x_1‖_0 ≤ s, hence x_1 − x_0 ∈ Σ_{2s} ∩ ker(A) \ {0}. By assumption ii) this implies x_1 = x_0, a contradiction.
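Condition iii) is easy to check directly for small random matrices; a quick numpy check (exhaustive over all 2s-subsets, so only feasible for tiny, made-up sizes):

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
m, N, s = 6, 10, 2  # tiny made-up sizes with 2s <= m
A = rng.standard_normal((m, N))

# Condition iii): every submatrix of 2s columns has full rank 2s.
full_rank = all(
    np.linalg.matrix_rank(A[:, list(T)]) == 2 * s
    for T in itertools.combinations(range(N), 2 * s)
)
print(full_rank)  # a Gaussian matrix satisfies this almost surely
assert full_rank
```

By the equivalence above, such a matrix allows exact recovery of every s-sparse vector from its measurements.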
5. Exercise
[NSP] Given a matrix A ∈ R^{m×N}, every vector x ∈ R^N supported on a set T is the unique solution of (P_1) with y = Ax if and only if A satisfies the null space property relative to T.
Reminder: A is said to satisfy the null space property relative to the set T if for all v ∈ ker(A) \ {0} it holds that
\[
\|v_T\|_1 < \|v_{T^C}\|_1.
\]
Solution
Given an index set T, assume that every vector x ∈ R^N supported on T is the unique minimizer of (P_1) with y = Ax. Then for every v ∈ ker(A) \ {0}, the vector v_T is the unique minimizer of (P_1) with y = Av_T. But because of A(v_T + v_{T^C}) = Av = 0 we have A(−v_{T^C}) = Av_T, i.e. −v_{T^C} is feasible as well, and hence by uniqueness ‖v_T‖_1 < ‖v_{T^C}‖_1.
Conversely, let us assume that the NSP relative to T holds. Given a vector x supported on T, for every z ∈ R^N with Az = Ax and z ≠ x we have v := x − z ∈ ker(A) \ {0}, and we obtain by the NSP
\[
\|x\|_1 \le \|v_T\|_1 + \|z_T\|_1 < \|v_{T^C}\|_1 + \|z_T\|_1 = \|z_{T^C}\|_1 + \|z_T\|_1 = \|z\|_1,
\]
where we used that x is supported on T, so that x = v_T + z_T and v_{T^C} = −z_{T^C}. Hence x is the unique minimizer.
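Since the NSP inequality is invariant under scaling of v, it suffices to check it on a basis of ker(A). In the contrived Python example below the kernel is one-dimensional and spanned by a known vector, so the check is deterministic:

```python
import numpy as np

rng = np.random.default_rng(2)
m, N = 5, 6

# Made-up example: construct A with known one-dimensional kernel span{v}.
v = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 3.0])
B = rng.standard_normal((m, N))
A = B - np.outer(B @ v, v) / (v @ v)  # every row of A is orthogonal to v
assert np.allclose(A @ v, 0)

def nsp(v, T):
    """NSP relative to T for the kernel vector v: ||v_T||_1 < ||v_{T^C}||_1."""
    mask = np.zeros(len(v), bool)
    mask[T] = True
    return np.abs(v[mask]).sum() < np.abs(v[~mask]).sum()

print(nsp(v, [5]), nsp(v, [0, 5]))  # → True False
```

So by the exercise, every x supported on {6} is the unique ℓ1-minimizer for this A, while vectors supported on {1, 6} need not be.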
6. Exercise
Solution
Let us start by proving that the inequality implies that x ∈ R^N with support T is the unique minimizer of (P_1). For a vector z ∈ R^N, z ≠ x, with Az = Ax we write, with v = x − z ∈ ker(A) \ {0},
\[
\langle x, \operatorname{sign}(z)_T\rangle \le \|x\|_1 < \|z\|_1 = \|z_T\|_1 + \|z_{T^C}\|_1 = \langle z, \operatorname{sign}(z)_T\rangle + \|z_{T^C}\|_1
\]
\[
\Leftrightarrow \quad \langle v, \operatorname{sign}(z)_T\rangle < \|z_{T^C}\|_1
\quad\Leftrightarrow\quad \langle v, \operatorname{sign}(x - v)_T\rangle < \|z_{T^C}\|_1 = \|v_{T^C}\|_1.
\]
But since the last inequality holds true for every v ∈ ker(A) \ {0} by assumption, ‖x‖_1 < ‖z‖_1 holds for every feasible z ≠ x, i.e. x is the unique minimizer.
7. Exercise
Show that the RIP implies the NSP.
More explicitly: let A ∈ R^{m,d} satisfy the restricted isometry property (RIP) of order 2s with constant 0 < δ_{2s} < 1/3, i.e.
\[
(1 - \delta_{2s})\|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta_{2s})\|x\|_2^2
\]
holds for all 2s-sparse vectors x, i.e. for all
\[
x \in \Sigma_{2s} = \{v \in R^d \mid \|v\|_0 = \#\{i \mid v_i \ne 0\} \le 2s\}.
\]
Show that A satisfies the null space property of order s (NSP), i.e. for any T ⊂ [d] with #T ≤ s and any v ∈ ker A \ {0} it holds that
\[
2\|v_T\|_1 < \|v\|_1,
\]
where (v_T)_i = v_i if i ∈ T and (v_T)_i = 0 otherwise.
Hint:
1. First show that
\[
|\langle Ax, Ay\rangle| \le \delta_{2s}\|x\|_2\|y\|_2
\]
if x, y are s-sparse with disjoint support.
2. For v ∈ ker A \ {0} let T_0 ⊂ [d] denote the set of indices corresponding to the s largest entries of v (in magnitude). Further let T_0^c = T_1 ∪ T_2 ∪ . . . be a partition of T_0^c such that T_1 contains the indices of the s largest entries of v_{T_0^c}, T_2 the indices of the s largest entries of v_{T_0^c \setminus T_1}, etc.
Solution
1. step Let x, y ∈ Σs with disjoint support and with kxk2 = kyk2 = 1. Then it
holds x ± y ∈ Σ2s and kx ± yk22 = 2. Using the RIP of A of order 2s we obtain
2(1 − δ2s ) = (1 − δ2s )kx ± yk22 ≤ kA(x ± y)k22 ≤ (1 + δ2s )kx ± yk22 = 2(1 + δ2s ).
Now the claim follows from the polarization identity, since
1 1
kA(x + y)k22 − kA(x − y)k22 ≤ 2(1 + δ2s ) − 2(1 − δ2s ) = δ2s .
|hAx, Ayi| =
4 4
2. step: Let v ∈ ker A \ {0} and let T_0 ⊂ [d] = {1, 2, . . . , d} denote the set of indices corresponding to the largest s entries of v (in magnitude). Further divide T_0^c = [d] \ T_0 into sets
T_1 — the indices of the s largest entries of v_{T_0^c},
T_2 — the indices of the s largest entries of v_{T_0^c \setminus T_1},
. . .
In total we have split [d] into disjoint sets T_0, T_1, . . . such that T_0 contains the indices of the s largest entries of v, T_1 the indices of the next s largest entries, etc., hence
\[
[d] = T_0 \cup T_1 \cup T_2 \cup \dots
\]
Since v ∈ ker A we have Av_{T_0} = −A(v_{T_1} + v_{T_2} + . . .), and the RIP yields
\[
(1 - \delta_{2s})\|v_{T_0}\|_2^2 \le \|Av_{T_0}\|_2^2 = \langle Av_{T_0}, Av_{T_0}\rangle = \langle Av_{T_0}, -A(v_{T_1} + v_{T_2} + \dots)\rangle = \sum_{i \ge 1}\langle Av_{T_0}, -Av_{T_i}\rangle.
\]
Hence, by the first step and the estimate ‖v_{T_i}‖_2 ≤ ‖v_{T_{i−1}}‖_1/√s (every entry of v_{T_i} is bounded by the average magnitude of the entries of v_{T_{i−1}}),
\[
(1 - \delta_{2s})\|v_{T_0}\|_2^2 \le \delta_{2s}\|v_{T_0}\|_2 \sum_{i \ge 1}\|v_{T_i}\|_2 \le \delta_{2s}\|v_{T_0}\|_2 \sum_{i \ge 1}\frac{\|v_{T_{i-1}}\|_1}{\sqrt{s}} \le \frac{\delta_{2s}\|v_{T_0}\|_2}{\sqrt{s}}\|v\|_1.
\]
Dividing by ‖v_{T_0}‖_2 (1 − δ_{2s}) and using that δ_{2s} < 1/3 implies δ_{2s}/(1 − δ_{2s}) < 1/2, we end up with
\[
\|v_{T_0}\|_2 \le \frac{1}{\sqrt{s}}\,\frac{\delta_{2s}}{1 - \delta_{2s}}\,\|v\|_1 < \frac{\|v\|_1}{2\sqrt{s}},
\]
which yields the claim: by the Cauchy–Schwarz inequality it holds for any x ∈ R^s that
\[
\|x\|_1 = \langle x, \operatorname{sign}(x)\rangle \le \|x\|_2\|\operatorname{sign}(x)\|_2 = \sqrt{s}\,\|x\|_2,
\]
and hence, for any T ⊂ [d] with #T ≤ s,
\[
\|v_T\|_1 \le \|v_{T_0}\|_1 \le \sqrt{s}\,\|v_{T_0}\|_2 < \frac{\|v\|_1}{2}.
\]
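The block decomposition T_0, T_1, … and the key estimate Σ_{i≥1} ‖v_{T_i}‖_2 ≤ ‖v‖_1/√s can be verified directly; a small numpy sketch on a random vector:

```python
import numpy as np

rng = np.random.default_rng(3)
d, s = 100, 5
v = rng.standard_normal(d)

# Sort indices by magnitude and chop into blocks T0, T1, ... of size s.
order = np.argsort(-np.abs(v))
blocks = [order[i:i + s] for i in range(0, d, s)]

# Key step of the proof: sum_{i>=1} ||v_{T_i}||_2 <= ||v||_1 / sqrt(s).
tail = sum(np.linalg.norm(v[b]) for b in blocks[1:])
bound = np.abs(v).sum() / np.sqrt(s)
print(tail <= bound)  # → True
assert tail <= bound
```

The inequality holds deterministically for every vector, which is exactly what the proof exploits.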
8. Exercise
Let s ∈ N, 0 < δ < 1 and let
\[
m \ge c\delta^{-2} s \log(ed/s).
\]
Further let A = Ã/√m ∈ R^{m,d} with i.i.d. entries ã_{ij} ∼ N(0, 1). Show that A satisfies the RIP of order s with RIP constant δ_s ≤ δ with probability at least
\[
1 - 2\exp(-C\delta^2 m).
\]
Hint: First show that a concentration inequality of the form P(|‖Ax‖_2^2 − ‖x‖_2^2| ≥ t) ≤ 2 exp(−cmt^2) holds for all 0 < t ≤ 1 and x ∈ R^d with ‖x‖_2 = 1. Then show the desired RIP inequality for a fixed s-dimensional subspace using a covering argument.
Solution
We use the Bernstein inequality: let X_1, . . . , X_m be independent mean-zero (i.e. EX_i = 0) subexponential random variables, i.e.
\[
P(|X_i| \ge t) \le \beta\exp(-\kappa t) \quad \text{for all } t > 0.
\]
Then
\[
P\Big(\Big|\sum_{i=1}^{m} X_i\Big| \ge t\Big) \le 2\exp\Big(-\frac{\kappa^2 t^2}{2\beta m + \kappa t}\Big).
\]
Therefore let x ∈ R^d with ‖x‖_2 = 1 and let ã_i denote the i-th row of Ã. Consider the random variables
\[
X_i := |\langle \tilde{a}_i, x\rangle|^2 - \|x\|_2^2.
\]
Then it holds:
• The X_i are independent, since the rows ã_i are independent.
• EX_i = 0, since ⟨ã_i, x⟩ ∼ N(0, ‖x‖_2^2) and hence E|⟨ã_i, x⟩|^2 = ‖x‖_2^2; moreover the X_i are subexponential, being centered squares of Gaussian variables.
• It holds that
\[
\frac{1}{m}\sum_{i=1}^{m} X_i = \frac{1}{m}\sum_{i=1}^{m}\big(|\langle \tilde{a}_i, x\rangle|^2 - \|x\|_2^2\big) = \sum_{i=1}^{m}\Big|\Big\langle \frac{\tilde{a}_i}{\sqrt{m}}, x\Big\rangle\Big|^2 - \|x\|_2^2 = \|Ax\|_2^2 - \|x\|_2^2.
\]
Now we apply Bernstein's inequality to get
\[
P\big(\big|\|Ax\|_2^2 - \|x\|_2^2\big| \ge t\big) = P\Big(\Big|\frac{1}{m}\sum_{i=1}^{m} X_i\Big| \ge t\Big) = P\Big(\Big|\sum_{i=1}^{m} X_i\Big| \ge mt\Big) \le 2\exp\Big(-\frac{\kappa^2 t^2 m^2}{2\beta m + \kappa t m}\Big) \le 2\exp(-c t^2 m)
\]
for 0 < t ≤ 1, with c = κ²/(2β + κ), since 2βm + κtm ≤ (2β + κ)m in this range.
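This concentration can be illustrated by Monte Carlo; a rough Python sketch (the exponential decay in m is the point, not the constants):

```python
import numpy as np

rng = np.random.default_rng(4)
d, t, trials = 20, 0.5, 500
x = rng.standard_normal(d)
x /= np.linalg.norm(x)  # unit vector, so E||Ax||^2 = ||x||^2 = 1

# Empirical P(| ||Ax||^2 - ||x||^2 | >= t) for growing m: the Bernstein
# bound says this decays exponentially in m.
probs = []
for m in (20, 80, 320):
    A = rng.standard_normal((trials, m, d)) / np.sqrt(m)
    dev = np.abs(np.sum((A @ x) ** 2, axis=1) - 1.0)
    probs.append(float(np.mean(dev >= t)))
print(probs)  # the probabilities decrease rapidly with m
```

Note that ‖Ax‖_2^2 here is a χ²_m variable scaled by 1/m, so the deviation probabilities can also be computed exactly; the simulation just mirrors the bound used in the proof.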
2. step: Fix T ⊂ [d] with #T = s and consider
\[
X_T = \{x \in \Sigma_s \mid \operatorname{supp} x \subset T\}.
\]
Let Q be a δ/4-net of the unit sphere of X_T, i.e. a finite set of unit vectors in X_T such that
• for any x ∈ X_T with ‖x‖_2 = 1 there exists some q ∈ Q with ‖x − q‖_2 ≤ δ/4.
Applying the first step with t = δ/2 to a fixed q ∈ Q, we get
\[
\big|\|Aq\|_2^2 - \|q\|_2^2\big| \le \frac{\delta}{2}\|q\|_2^2,
\]
which is equivalent to
\[
\Big(1 - \frac{\delta}{2}\Big)\|q\|_2^2 \le \|Aq\|_2^2 \le \Big(1 + \frac{\delta}{2}\Big)\|q\|_2^2,
\]
with probability at least 1 − 2 exp(−cδ²m). Hence, for any (fixed) q ∈ Q it also holds that
\[
\Big(1 - \frac{\delta}{2}\Big)\|q\|_2 \le \|Aq\|_2 \le \Big(1 + \frac{\delta}{2}\Big)\|q\|_2
\]
with probability at least 1 − 2 exp(−cδ²m), using √(1 − δ/2) ≥ 1 − δ/2 and √(1 + δ/2) ≤ 1 + δ/2. Hence, this inequality holds (simultaneously) for all q ∈ Q with probability at least
\[
1 - 2\,\#Q\,\exp(-c\delta^2 m)
\]
by a union bound.
Now we want to prove that the desired inequality also holds for all x ∈ X_T. Let δ̂ ≥ 0 be the smallest constant such that
\[
\|Ax\|_2 \le (1 + \hat{\delta})\|x\|_2
\]
holds for all x ∈ X_T, and let v ∈ X_T be fixed with ‖v‖_2 = 1. Then there is some q ∈ Q with ‖v − q‖_2 ≤ δ/4 and we get
\[
\|Av\|_2 \le \|Aq\|_2 + \|A(v - q)\|_2 \le 1 + \frac{\delta}{2} + (1 + \hat{\delta})\frac{\delta}{4}.
\]
Taking the supremum over all unit vectors v ∈ X_T and using the minimality of δ̂, this implies
\[
1 + \hat{\delta} \le 1 + \frac{\delta}{2} + (1 + \hat{\delta})\frac{\delta}{4} \quad\Rightarrow\quad \hat{\delta} \le \frac{3\delta/4}{1 - \delta/4} < \delta,
\]
since δ < 1. The lower bound follows similarly: ‖Av‖_2 ≥ ‖Aq‖_2 − ‖A(v − q)‖_2 ≥ 1 − δ/2 − (1 + δ)δ/4 ≥ 1 − δ.
3. step: We already proved the inequality for the subspace X_T corresponding to a fixed support set T. Since there are
\[
\binom{d}{s} \le \Big(\frac{ed}{s}\Big)^{s}
\]
subsets T ⊂ [d] with #T = s, a union bound over all of them yields the claim for m ≥ cδ^{−2}s log(ed/s), after adjusting the constants. It remains to bound the size of the net: the minimal number N of points of an ε-net of the unit ball B_X of an m-dimensional normed space X (applied here with dimension s and ε = δ/4) can be bounded by
\[
N \le \Big(\frac{2 + 2\varepsilon}{\varepsilon}\Big)^{m}.
\]
To prove this, let {q_1, . . . , q_k} ⊂ B_X be a maximal set of points with mutual distances greater than ε. Then it holds:
• B_X ⊂ ∪_{i=1}^k (q_i + εB_X), since otherwise there is some z ∈ B_X with ‖z − q_i‖ > ε for all i = 1, . . . , k, in contradiction to the maximality. In particular {q_1, . . . , q_k} is an ε-net, hence k ≥ N.
• The sets q_i + (ε/2)B_X are mutually disjoint: assume that there exist i ≠ j and some z ∈ (q_i + (ε/2)B_X) ∩ (q_j + (ε/2)B_X); then ‖q_i − q_j‖ ≤ ε, a contradiction.
We conclude
\[
\bigcup_{i=1}^{k}\Big(q_i + \frac{\varepsilon}{2}B_X\Big) \subset \Big(1 + \frac{\varepsilon}{2}\Big)B_X \subset (1 + \varepsilon)B_X,
\]
and comparing the volumes we arrive at
\[
\operatorname{vol}\Big(\bigcup_{i=1}^{k}\Big(q_i + \frac{\varepsilon}{2}B_X\Big)\Big) = k\operatorname{vol}\Big(\frac{\varepsilon}{2}B_X\Big) = k\Big(\frac{\varepsilon}{2}\Big)^{m}\operatorname{vol}(B_X) \le \operatorname{vol}((1 + \varepsilon)B_X) = (1 + \varepsilon)^{m}\operatorname{vol}(B_X),
\]
hence
\[
k\Big(\frac{\varepsilon}{2}\Big)^{m} \le (1 + \varepsilon)^{m} \quad\Rightarrow\quad N \le k \le \Big(\frac{2 + 2\varepsilon}{\varepsilon}\Big)^{m}.
\]
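A greedy maximal ε-separated set, exactly as in the proof, can be built numerically. The sketch below (Euclidean unit ball in R², made-up parameters) checks that its size respects the volumetric bound ((2 + 2ε)/ε)^m:

```python
import numpy as np

rng = np.random.default_rng(5)
dim, eps = 2, 0.5

# Random candidate points in the Euclidean unit ball of R^2.
cand = rng.standard_normal((20000, dim))
cand = cand[np.linalg.norm(cand, axis=1) <= 1.0]

# Greedy maximal eps-separated set; as in the proof, it is also an eps-net
# of the candidate points (every rejected point is within eps of the net).
net = []
for p in cand:
    if all(np.linalg.norm(p - q) > eps for q in net):
        net.append(p)

bound = ((2 + 2 * eps) / eps) ** dim
print(len(net), "<=", bound)
assert len(net) <= bound
```

The assertion can never fail: the disjoint-balls volume comparison above holds for every ε-separated set, not just this particular one.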
Matlab Exercises
1. Matlab Exercise
2. Matlab Exercise
Show numerically that the number of measurements m only has to grow logarithmically in the dimension d if we want to recover an s-sparse signal x_0 ∈ R^d from linear measurements y = Ax_0 with A ∈ R^{m,d}.
To show this, calculate for increasing values of d and m the error of your approximation and plot the resulting error matrix.
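The reference solutions are Matlab; a rough Python equivalent of the recovery step (basis pursuit via scipy's linprog, sizes made up) looks like this:

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """min ||x||_1 s.t. Ax = y, as an LP in (u, v) >= 0 with x = u - v."""
    _, d = A.shape
    res = linprog(np.ones(2 * d), A_eq=np.hstack([A, -A]), b_eq=y,
                  bounds=(0, None))
    return res.x[:d] - res.x[d:]

rng = np.random.default_rng(6)
s, d, m = 5, 200, 60  # m of order s*log(d/s) measurements
x0 = np.zeros(d)
x0[rng.choice(d, s, replace=False)] = rng.standard_normal(s)
A = rng.standard_normal((m, d)) / np.sqrt(m)

x_hat = basis_pursuit(A, A @ x0)
print(np.linalg.norm(x_hat - x0))  # small: the sparse signal is recovered
```

Sweeping d and m on a grid and recording this error reproduces the phase-transition picture of the exercise.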
3. Matlab Exercise
1. Implement an algorithm which recovers the true signal x_0 ∈ R^d with ‖x_0‖_1 ≤ R and ‖x_0‖_2 ≤ 1 from one-bit measurements y_i = sign⟨a_i, x_0⟩, i = 1, . . . , m.
Hint: The Matlab package CVX can be useful.
2. Test your algorithm with noisy measurements of the form
\[
y_i = \begin{cases} \operatorname{sign}\langle a_i, x_0\rangle & \text{with probability } 1 - p, \\ -\operatorname{sign}\langle a_i, x_0\rangle & \text{with probability } p. \end{cases}
\]
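The exercise asks for a convex program via CVX. As a simpler, hedged alternative, the classical linear one-bit estimator x̂ ∝ Σ_i y_i a_i (whose expectation is proportional to x_0 for Gaussian a_i) already illustrates the robustness to sign flips:

```python
import numpy as np

rng = np.random.default_rng(7)
d, m, p = 20, 2000, 0.1
x0 = rng.standard_normal(d)
x0 /= np.linalg.norm(x0)

A = rng.standard_normal((m, d))
y = np.sign(A @ x0)
flips = rng.random(m) < p  # noisy measurements: flip each sign w.p. p
y[flips] *= -1

# Linear estimator: x_hat proportional to (1/m) sum_i y_i a_i, then normalize.
x_hat = A.T @ y
x_hat /= np.linalg.norm(x_hat)
print(x_hat @ x0)  # correlation close to 1 despite 1-bit, noisy data
```

Note this only recovers the direction of x_0; the magnitude is lost by the sign measurements, which is why the exercise normalizes ‖x_0‖_2 ≤ 1.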
4. Matlab Exercise
Let f : B^d → R be a ridge function, f(x) = g(⟨a, x⟩), for some (unknown) s-sparse ridge vector a ∈ R^d with ‖a‖_2 = 1 and some differentiable ridge profile g : R → R with g′(0) ≠ 0. The ridge vector a can be recovered by the following algorithm:
• Take Φ ∈ R^{m×d} a normalized Bernoulli matrix (i.e. with entries ±1, both with probability 1/2).
• Put b̃_j := (f(hφ_j) − f(0))/h, j = 1, . . . , m, where φ_j is the j-th row of Φ and h > 0 a small step size, so that b̃ ≈ g′(0)Φa.
• Recover a sparse vector x̂ from the approximate linear measurements b̃ (e.g. by ℓ1-minimization) and normalize.
• Output: â.
Implement this algorithm and show numerically that it indeed recovers the ridge vector a.
Solution
Matlab implementations are given in the corresponding Matlab files.
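For the ridge exercise, a hedged Python sketch of the whole pipeline (finite differences plus ℓ1-recovery via linprog; profile g(t) = tanh(t − 1) as in the figure, the other parameters made up):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(8)
d, m, s, h = 200, 60, 5, 1e-4
g = lambda t: np.tanh(t - 1)  # ridge profile from the figure
a = np.zeros(d)
a[rng.choice(d, s, replace=False)] = rng.standard_normal(s)
a /= np.linalg.norm(a)
f = lambda x: g(x @ a)  # the ridge function; only f is "observable"

Phi = rng.choice([-1.0, 1.0], size=(m, d)) / np.sqrt(m)  # normalized Bernoulli
b = (f(h * Phi) - f(np.zeros(d))) / h  # finite differences: b ≈ g'(0) Phi a

# Sparse recovery of g'(0)*a from b via l1-minimization, written as an LP.
res = linprog(np.ones(2 * d), A_eq=np.hstack([Phi, -Phi]), b_eq=b,
              bounds=(0, None))
a_hat = res.x[:d] - res.x[d:]
a_hat /= np.linalg.norm(a_hat)
print(abs(a_hat @ a))  # close to 1: the direction of a is recovered
```

The finite-difference error is O(h), so for small h the ℓ1-minimizer is a small perturbation of g′(0)a, and normalizing recovers a up to sign.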
[Figure 1 shows four plots generated by the Matlab files: test basis pursuit — error ‖x_0 − x‖_2 vs. measurements m for s = 5, d = 1000 (no noise, Gaussian noise sigma = 0.1, deterministic noise r = 0.1); phase transition — error ‖x_0 − x‖_2 over measurements m vs. dimension d for sparsity s = 5; test one bit — error ‖x_0 − x‖_2 vs. m for s = 5, d = 1000 (no noise, misclassification prob. p = 0.1 and p = 0.2); ridge function — error ‖a − â‖_2 vs. step size h for g = @(t)tanh(t-1), s = 5, d = 1000, m = 60.]
Figure 1: Top: generated figure of test basis pursuit (left) and figure generated by phase transition (right). Bottom: generated figure of test one bit (left) and generated figure of ridge function (right).