
Solutions to the Exercises on the Kernel Trick

Laurenz Wiskott
Institut fur Neuroinformatik
Ruhr-Universitat Bochum, Germany, EU

2 February 2017

Contents

1 Kernel Trick
   1.1 Exercises
      1.1.1 Exercise: Kernel function for a linear expansion
      1.1.2 Exercise: Quadratic kernel function in $\mathbb{R}^2$

1 Kernel Trick

1.1 Exercises

1.1.1 Exercise: Kernel function for a linear expansion

Given $N$-dimensional input data vectors $\mathbf{x}$ that get expanded into a $K$-dimensional space ($K \gg N$) with a linear transformation:

$\mathbf{z} := \mathbf{A}\mathbf{x}$  with  $\mathbf{A} \in \mathbb{R}^{K \times N}$ .  (1)

The inner product in the expanded space shall be given by

$(\mathbf{z}, \mathbf{z}') := \mathbf{z}^T \mathbf{C} \mathbf{z}'$ ,  (2)

with $\mathbf{C}$ being a symmetric, positive definite $K \times K$ matrix.


© 2017 Laurenz Wiskott (homepage https://www.ini.rub.de/PEOPLE/wiskott/). This work (except for all figures from other sources, if present) is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/. Figures from other sources have their own copyright, which is generally indicated. Do not distribute parts of these lecture notes showing figures with non-free copyrights (here usually figures I have the rights to publish but you don't, like my own published figures).
Several of my exercises (not necessarily on this topic) were inspired by papers and textbooks by other authors. Unfortunately, I did not document that well, because initially I did not intend to make the exercises publicly available, and now I cannot trace it back anymore. So I cannot give as much credit as I would like to. The concrete versions of the exercises are certainly my own work, though.
These exercises complement my corresponding lecture notes available at https://www.ini.rub.de/PEOPLE/wiskott/Teaching/Material/, where you can also find other teaching material such as programming exercises. The table of contents of the lecture notes is reproduced here to give an orientation as to when the exercises can reasonably be solved. For the best learning effect, I recommend first seriously trying to solve the exercises yourself before looking at the solutions.

(a) Determine the kernel function $k(\mathbf{x}, \mathbf{x}')$ that realizes this inner product in the space of the input data, so that $k(\mathbf{x}, \mathbf{x}') = (\mathbf{z}(\mathbf{x}), \mathbf{z}(\mathbf{x}'))$ for all $\mathbf{x}, \mathbf{x}'$.
Solution: To calculate the kernel function, replace $\mathbf{z}$ in the definition of the inner product in the expanded space with an expression in $\mathbf{x}$:

$(\mathbf{z}, \mathbf{z}') \overset{(2)}{=} \mathbf{z}^T \mathbf{C} \mathbf{z}'$  (3)

$\overset{(1)}{=} \mathbf{x}^T \underbrace{\mathbf{A}^T \mathbf{C} \mathbf{A}}_{=:\, \mathbf{B} \,\in\, \mathbb{R}^{N \times N}} \mathbf{x}'$  (4)

$=: k(\mathbf{x}, \mathbf{x}')$ .  (5)
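A quick numerical sanity check of this identity is sketched below (assuming NumPy; the particular matrices $\mathbf{A}$ and $\mathbf{C}$ are arbitrary random choices, with $\mathbf{C}$ constructed to be symmetric and positive definite):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 10, 100

A = rng.standard_normal((K, N))       # expansion matrix A in R^{K x N}, Eq. (1)
M = rng.standard_normal((K, K))
C = M @ M.T + K * np.eye(K)           # symmetric, positive definite K x K matrix

B = A.T @ C @ A                       # N x N matrix B = A^T C A, Eq. (4)

x  = rng.standard_normal(N)
xp = rng.standard_normal(N)
z, zp = A @ x, A @ xp                 # expanded vectors z = A x, z' = A x'

explicit = z @ C @ zp                 # inner product in the expanded space, Eq. (2)
kernel   = x @ B @ xp                 # kernel function k(x, x'), Eq. (5)

print(np.allclose(explicit, kernel))  # True
```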

(b) How many multiplications ($\times$) and additions ($+$) does it cost to compute the inner product (i) explicitly via the expanded space and (ii) with the kernel function?
Solution: Projecting each vector into the expanded space costs $KN\,\times$ and $K(N-1)\,+$. The inner product in the expanded space costs $K \cdot K\,\times$ and $K(K-1)\,+$ to multiply $\mathbf{z}'$ with $\mathbf{C}$, and $K\,\times$ and $(K-1)\,+$ to multiply the result with $\mathbf{z}$. Taken together, computing the inner product via the expanded space costs

$2\,(KN\,\times + K(N-1)\,+) + ((K+1)K\,\times + (K+1)(K-1)\,+) \approx K^2\,(\times + +)$ .  (6)

Calculating the inner product with the kernel function costs $N \cdot N\,\times$ and $N(N-1)\,+$ to multiply $\mathbf{x}'$ with $\mathbf{B}$, and $N\,\times$ and $(N-1)\,+$ to multiply the result with $\mathbf{x}$, resulting in

$((N+1)N\,\times + (N+1)(N-1)\,+) \approx N^2\,(\times + +)$ .  (7)

If $N = 10$ and $K = 100$, the costs reduce from $12{,}100\,\times + 11{,}799\,+ \approx 10{,}000\,(\times + +)$ to $110\,\times + 99\,+ \approx 100\,(\times + +)$, i.e. by about two orders of magnitude.
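For the concrete example $N = 10$, $K = 100$, these counts can be reproduced with a few lines of Python (a minimal sketch; it simply evaluates the formulas from Eqs. (6) and (7), nothing more):

```python
N, K = 10, 100

# explicit route: expand both vectors (Eq. 1), then z^T C z' (Eq. 2)
mult_explicit = 2 * K * N + (K + 1) * K
add_explicit  = 2 * K * (N - 1) + (K + 1) * (K - 1)

# kernel route: x^T B x' with the precomputed N x N matrix B (Eq. 5)
mult_kernel = (N + 1) * N
add_kernel  = (N + 1) * (N - 1)

print(mult_explicit, add_explicit)   # 12100 11799
print(mult_kernel, add_kernel)       # 110 99
```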
(c) Show that the kernel function is positive semi-definite, i.e. $k(\mathbf{x}, \mathbf{x}) \geq 0$ for all $\mathbf{x}$ and $\mathbf{A}$. Why is it not necessarily strictly positive definite?
Solution: We know $\mathbf{C}$ is positive definite, i.e. $\mathbf{z}^T \mathbf{C} \mathbf{z} > 0$ for any $\mathbf{z} \neq \mathbf{0}$. Evaluating $k(\mathbf{x}, \mathbf{x})$ amounts to expanding the given vector $\mathbf{x}$ into the corresponding expanded vector $\mathbf{z}$ and then using the positive definite matrix $\mathbf{C}$ to compute the inner product of $\mathbf{z}$ with itself in the expanded space; this is positive if $\mathbf{z} \neq \mathbf{0}$ and zero if $\mathbf{z} = \mathbf{0}$, hence always non-negative. Thus, $k(\mathbf{x}, \mathbf{x}) \geq 0$ for all $\mathbf{x}$, i.e. the kernel function is at least positive semi-definite.
The kernel function is not necessarily strictly positive definite, because some nonzero vectors $\mathbf{x}$ might get mapped onto the zero vector $\mathbf{z} = \mathbf{0}$, which happens if the matrix $\mathbf{A}$ does not have full rank.
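The following minimal sketch illustrates this failure mode with a deliberately rank-deficient expansion from $\mathbb{R}^2$ to $\mathbb{R}^3$ (the concrete matrices are illustrative choices, not part of the exercise):

```python
import numpy as np

# A deliberately maps only the first coordinate, so (0, 1)^T lies in its null space
A = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [3.0, 0.0]])
C = np.eye(3)             # symmetric, positive definite
B = A.T @ C @ A           # kernel matrix B = A^T C A

x = np.array([0.0, 1.0])  # nonzero input vector with A x = 0
print(x @ B @ x)          # 0.0 -> k(x, x) = 0 although x != 0
```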

1.1.2 Exercise: Quadratic kernel function in $\mathbb{R}^2$

Consider the quadratic kernel function

$k(\mathbf{x}, \mathbf{x}') = (2 + \mathbf{x} \cdot \mathbf{x}')^2$  (1)

for input data $\mathbf{x} \in \mathbb{R}^2$.

(a) Determine a nonlinear expansion and an inner product in the expanded space that, taken together,
match the kernel function.
Solution: First expand the kernel function to see more clearly what it looks like:

$k(\mathbf{x}, \mathbf{x}') \overset{(1)}{=} (2 + \mathbf{x} \cdot \mathbf{x}')^2$  (2)

$= (2 + x_1 x_1' + x_2 x_2')^2$  (3)

$= 4 + 4 x_1 x_1' + 4 x_2 x_2' + x_1^2 x_1'^2 + 2 x_1 x_1' x_2 x_2' + x_2^2 x_2'^2$ .  (4)

Now we see that the kernel function involves products of the components of each input vector up to order two. This suggests an expansion into the space of polynomials of degree two,

$\mathbf{z}(\mathbf{x}) := (1, x_1, x_2, x_1^2, x_1 x_2, x_2^2)^T$ .  (5)

Substituting $\mathbf{z}$ for $\mathbf{x}$ in (4) yields

$(\mathbf{z}, \mathbf{z}') := 4 z_1 z_1' + 4 z_2 z_2' + 4 z_3 z_3' + z_4 z_4' + 2 z_5 z_5' + z_6 z_6'$ .  (6)

It is obvious that this is indeed an inner product in the expanded space. Take a moment to think about results that would not be an inner product.
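A short numerical check of this correspondence (assuming NumPy; the helper name `expand` and the random test vectors are illustrative choices, not part of the exercise):

```python
import numpy as np

def expand(x):
    """Polynomial expansion of Eq. (5): (1, x1, x2, x1^2, x1*x2, x2^2)."""
    x1, x2 = x
    return np.array([1.0, x1, x2, x1**2, x1 * x2, x2**2])

w = np.array([4.0, 4.0, 4.0, 1.0, 2.0, 1.0])  # weights of the inner product, Eq. (6)

rng = np.random.default_rng(1)
x, xp = rng.standard_normal(2), rng.standard_normal(2)

kernel = (2.0 + x @ xp) ** 2                  # quadratic kernel, Eq. (1)
inner  = np.sum(w * expand(x) * expand(xp))   # weighted inner product, Eq. (6)

print(np.allclose(kernel, inner))             # True
```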

(b) Is the result unique? If not, provide another pair of expansion and inner product that matches the
kernel function.
Solution: The result is not unique, since one could introduce arbitrary factors in the expansion that one could compensate for in the inner product. If we use the expansion

$\mathbf{z}(\mathbf{x}) := (2, 2 x_1, 2 x_2, x_1^2, \sqrt{2}\, x_1 x_2, x_2^2)^T$ ,  (7)

the inner product in the expanded space becomes the standard Euclidean product,

$(\mathbf{z}, \mathbf{z}') := z_1 z_1' + z_2 z_2' + z_3 z_3' + z_4 z_4' + z_5 z_5' + z_6 z_6'$ .  (8)
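The same check as before, now with the rescaled expansion and the standard Euclidean product (again assuming NumPy; the helper name `expand_scaled` is an illustrative choice):

```python
import numpy as np

def expand_scaled(x):
    """Rescaled expansion of Eq. (7): (2, 2*x1, 2*x2, x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return np.array([2.0, 2.0 * x1, 2.0 * x2, x1**2, np.sqrt(2.0) * x1 * x2, x2**2])

rng = np.random.default_rng(2)
x, xp = rng.standard_normal(2), rng.standard_normal(2)

kernel = (2.0 + x @ xp) ** 2                   # quadratic kernel, Eq. (1)
inner  = expand_scaled(x) @ expand_scaled(xp)  # standard Euclidean product, Eq. (8)

print(np.allclose(kernel, inner))              # True
```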
