Azamat Berdyshev
University of Toronto
Entropy
$$H(X) = -\sum_{x \in \mathcal{X}} P(x) \log P(x) = -\mathbb{E}\left[\log P(X)\right]$$
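The entropy definition can be checked numerically. The following is a minimal sketch, assuming a discrete distribution given as a list of probabilities and a base-2 logarithm (the slide does not fix the base); the function name `entropy` is chosen here for illustration.

```python
import math


def entropy(p, base=2.0):
    """Shannon entropy H(X) = -sum_x P(x) log P(x) of a discrete distribution.

    `p` is a list of probabilities that sums to 1. The base is an assumption:
    base 2 gives the result in bits.
    """
    # Outcomes with P(x) = 0 are skipped, following the convention 0 log 0 = 0.
    return -sum(px * math.log(px, base) for px in p if px > 0)


fair_coin = entropy([0.5, 0.5])    # a fair coin carries 1 bit
biased_coin = entropy([0.9, 0.1])  # a biased coin carries less than 1 bit
```

A uniform distribution maximizes the entropy for a fixed number of outcomes, which is why the fair coin gives the larger value.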
Some Information Theory basics Information Bottleneck Method Applications in Deep Learning
Conditional Entropy
$$H(Y|X) = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} P_{XY}(x, y) \log \frac{P_X(x)}{P_{XY}(x, y)} = \sum_{x \in \mathcal{X}} P_X(x)\, H(Y|X = x)$$
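As a sketch of how the double sum is evaluated, the snippet below computes the conditional entropy of Y given X from a joint probability table, matching the form that averages H(Y|X = x) over x. The table layout `joint[x][y]`, the base-2 logarithm, and the function name are assumptions made for this example.

```python
import math


def conditional_entropy(joint, base=2.0):
    """H(Y|X) = sum_{x,y} P_XY(x, y) log( P_X(x) / P_XY(x, y) ).

    `joint[x][y]` holds P_XY(x, y); rows index x, columns index y.
    """
    h = 0.0
    for row in joint:
        p_x = sum(row)  # marginal P_X(x) is the row sum
        for p_xy in row:
            if p_xy > 0:  # zero-probability cells contribute nothing
                h += p_xy * math.log(p_x / p_xy, base)
    return h


# Two independent fair bits: knowing X says nothing about Y.
independent = [[0.25, 0.25], [0.25, 0.25]]
# Y is a copy of X: knowing X removes all uncertainty about Y.
copy = [[0.5, 0.0], [0.0, 0.5]]
```

For the independent table the result equals H(Y), and for the copy table it is zero, the two extremes of conditional entropy.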
Mutual Information
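A minimal sketch, assuming the standard definition I(X;Y) = Σ P(x,y) log( P(x,y) / (P(x) P(y)) ), which also equals H(Y) − H(Y|X). The table layout `joint[x][y]`, the base-2 logarithm, and the function name are illustrative assumptions.

```python
import math


def mutual_information(joint, base=2.0):
    """I(X;Y) = sum_{x,y} P(x,y) log( P(x,y) / (P(x) P(y)) ).

    `joint[x][y]` holds P_XY(x, y); rows index x, columns index y.
    """
    p_x = [sum(row) for row in joint]        # marginal over rows
    p_y = [sum(col) for col in zip(*joint)]  # marginal over columns
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p_xy in enumerate(row):
            if p_xy > 0:  # zero-probability cells contribute nothing
                mi += p_xy * math.log(p_xy / (p_x[i] * p_y[j]), base)
    return mi


independent_bits = [[0.25, 0.25], [0.25, 0.25]]  # X and Y independent
copied_bit = [[0.5, 0.0], [0.0, 0.5]]            # Y is a copy of X
```

Independent variables share no information, while a perfect copy shares the full entropy of the variable.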
• Consider the information channel: $X \xrightarrow{f(x)} T \xrightarrow{g(t)} Y$

$$\min_{P_{T|X}(t|x)} I(X; T) \quad \text{subject to} \quad I(T; Y) > \text{threshold} \tag{1}$$
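To make the two quantities in the optimization concrete, the sketch below evaluates I(X;T) and I(T;Y) for a fixed encoder P(T|X). It does not solve the optimization; it only scores one candidate encoder. It assumes discrete variables, base-2 logarithms, and that T depends on Y only through X, as in the channel X → T → Y. All function and variable names are hypothetical.

```python
import math


def mutual_information(joint, base=2.0):
    """I(A;B) from a joint table joint[a][b] = P(a, b)."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log(p / (p_a[i] * p_b[j]), base)
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )


def information_plane_point(p_xy, p_t_given_x):
    """Return (I(X;T), I(T;Y)) for joint p_xy[x][y] and encoder p_t_given_x[x][t]."""
    n_x, n_y, n_t = len(p_xy), len(p_xy[0]), len(p_t_given_x[0])
    p_x = [sum(row) for row in p_xy]
    # P(x, t) = P(x) P(t|x)
    p_xt = [[p_x[x] * p_t_given_x[x][t] for t in range(n_t)] for x in range(n_x)]
    # P(t, y) = sum_x P(t|x) P(x, y), using the Markov structure X -> T.
    p_ty = [
        [sum(p_t_given_x[x][t] * p_xy[x][y] for x in range(n_x)) for y in range(n_y)]
        for t in range(n_t)
    ]
    return mutual_information(p_xt), mutual_information(p_ty)


# Toy problem: X uniform on {0, 1, 2, 3}, Y is the parity of X.
p_xy = [[0.25 if y == x % 2 else 0.0 for y in range(2)] for x in range(4)]
identity = [[1.0 if t == x else 0.0 for t in range(4)] for x in range(4)]
parity = [[1.0 if t == x % 2 else 0.0 for t in range(2)] for x in range(4)]

full_copy = information_plane_point(p_xy, identity)
compressed = information_plane_point(p_xy, parity)
```

In this toy problem the parity encoder keeps the same I(T;Y) as the identity encoder while halving I(X;T), which is the kind of trade-off the constrained minimization is looking for.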
1 Ravid Shwartz-Ziv and Naftali Tishby. “Opening the Black Box of Deep Neural Networks via Information”. In: arXiv preprint arXiv:1703.00810 (2017).
$$X \to T_1 \to \cdots \to T_{i-1} \to T_i \to T_{i+1} \to \cdots \to T_k \to Y$$
Key Observation
• For a given threshold on I(T; Y), the Information Bottleneck (IB) tells how well you can do but not how to achieve it; to find that out we still need to train the network.
• The cost function in deep neural network training is highly nonlinear, but the information bottleneck optimization (1) is convex and thus has a unique optimal solution!
Main Takeaway
Solving the Information Bottleneck optimization (1) for all levels of the threshold on I(T; Y) gives the so-called Information Bottleneck bounds on the information plane.
2 Ravid Shwartz-Ziv and Naftali Tishby. “Opening the Black Box of Deep Neural Networks via Information”. In: arXiv preprint arXiv:1703.00810 (2017).
Questions?