
Information Bottleneck Method

Azamat Berdyshev

University of Toronto

February 22, 2018


Outline: Some Information Theory basics · Information Bottleneck Method · Applications in Deep Learning

Entropy

Let X ∈ 𝒳 be a discrete random variable distributed as X ∼ P. Then

    H(X) = − ∑_{x∈𝒳} P(x) log P(x) = − E[log P(X)]

Intuition: How much “uncertainty” is in random variable X.
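
As a quick illustration (my own sketch, not from the slides), here is a minimal NumPy version of this definition; using the base-2 logarithm gives the answer in bits.

    import numpy as np

    def entropy(p):
        """Shannon entropy H(X) = -sum_x P(x) log2 P(x), in bits; zero-probability outcomes are skipped."""
        p = np.asarray(p, dtype=float)
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    print(entropy([0.5, 0.5]))   # fair coin: 1.0 bit
    print(entropy([0.9, 0.1]))   # biased coin: ~0.47 bits, less uncertainty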


Conditional Entropy

Let (X, Y) ∈ 𝒳 × 𝒴 be a pair of discrete random variables jointly distributed as (X, Y) ∼ P_XY. Then

    H(X|Y) = ∑_{x∈𝒳} ∑_{y∈𝒴} P_XY(x, y) log [ P_Y(y) / P_XY(x, y) ]
           = ∑_{y∈𝒴} P_Y(y) H(X|Y = y)

Intuition: How much “uncertainty” is left in the random variable X once the random variable Y is observed.
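
A small sketch of the same quantity in code (my own illustration, not part of the slides): H(X|Y) computed directly from a joint probability table, with rows indexed by x and columns by y.

    import numpy as np

    def conditional_entropy(p_xy):
        """H(X|Y) in bits from a joint table P_XY with rows = x, columns = y."""
        p_xy = np.asarray(p_xy, dtype=float)
        p_y = p_xy.sum(axis=0, keepdims=True)                  # marginal P_Y(y)
        ratio = np.divide(p_y, p_xy, out=np.ones_like(p_xy), where=p_xy > 0)
        return np.sum(np.where(p_xy > 0, p_xy * np.log2(ratio), 0.0))

    p_xy = np.array([[0.25, 0.25],
                     [0.25, 0.25]])                            # X and Y independent and uniform
    print(conditional_entropy(p_xy))                           # 1.0 bit: observing Y tells nothing about X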


Mutual Information

As before, let (X, Y) ∈ 𝒳 × 𝒴 be a pair of discrete random variables jointly distributed as (X, Y) ∼ P_XY. Then

    I(X; Y) = H(X) − H(X|Y)
            = ∑_{x∈𝒳} ∑_{y∈𝒴} P_XY(x, y) log [ P_XY(x, y) / (P_X(x) P_Y(y)) ]

Intuition: How much “information” (in bits) about the random variable X is contained in the random variable Y.
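
Again as a hedged illustration (not from the slides), the same formula evaluated on a joint table:

    import numpy as np

    def mutual_information(p_xy):
        """I(X;Y) in bits from a joint table P_XY with rows = x, columns = y."""
        p_xy = np.asarray(p_xy, dtype=float)
        p_x = p_xy.sum(axis=1, keepdims=True)                  # marginal P_X(x)
        p_y = p_xy.sum(axis=0, keepdims=True)                  # marginal P_Y(y)
        ratio = np.divide(p_xy, p_x * p_y, out=np.ones_like(p_xy), where=p_xy > 0)
        return np.sum(np.where(p_xy > 0, p_xy * np.log2(ratio), 0.0))

    print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))    # 0.0: independent variables
    print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))        # 1.0: Y determines X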


Data Processing Inequality

- Let X → Y → Z be a Markov chain; then

    I(X; Y) ≥ I(X; Z)

- Reparametrization invariance trick: for any invertible φ, ψ,

    I(X; Y) = I(φ(X); ψ(Y))
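
A toy numerical check (my own example, not from the slides): pass a fair bit X through two noisy binary channels to obtain Y and then Z, so that X → Y → Z is a Markov chain, and verify that I(X; Y) ≥ I(X; Z).

    import numpy as np
    from scipy.stats import entropy            # used as KL divergence: I(A;B) = D_KL(P_AB || P_A P_B)

    def mi(p_joint):
        p_a = p_joint.sum(axis=1, keepdims=True)
        p_b = p_joint.sum(axis=0, keepdims=True)
        return entropy(p_joint.ravel(), (p_a * p_b).ravel(), base=2)

    def bsc(eps):
        """Binary symmetric channel: flips its input bit with probability eps."""
        return np.array([[1 - eps, eps], [eps, 1 - eps]])

    p_x = np.array([0.5, 0.5])
    p_xy = np.diag(p_x) @ bsc(0.1)             # joint of (X, Y)
    p_xz = np.diag(p_x) @ bsc(0.1) @ bsc(0.2)  # joint of (X, Z): Z only sees X through Y
    print(mi(p_xy), ">=", mi(p_xz))            # ≈0.53 >= ≈0.17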


Information Bottleneck Problem


(N. Tishby, F. Pereira, W. Bialek, 1999)

- Consider the information channel X --f(x)--> T --g(t)--> Y. The Information Bottleneck problem is

    minimize over P_T|X(t|x):   I(X; T)
    subject to:                 I(T; Y) ≥ ε                        (1)

- Letting β be the Lagrange multiplier, this becomes

    min over P_T|X(t|x):   I(X; T) − β I(T; Y)                     (2)
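
A hedged sketch of how (2) can be optimized in practice, using the self-consistent update equations of Tishby, Pereira and Bialek (1999): alternately re-estimate p(t|x), p(t) and p(y|t). The toy joint distribution, cluster count and β value below are my own illustrative choices.

    import numpy as np

    def iterative_ib(p_xy, n_t=2, beta=5.0, n_iter=200, seed=0):
        """Return a (soft) encoder p(t|x) minimizing I(X;T) - beta * I(T;Y) for the given joint P_XY."""
        rng = np.random.default_rng(seed)
        p_x = p_xy.sum(axis=1)                                           # P_X(x)
        p_y_given_x = p_xy / p_x[:, None]                                # P(y|x)
        p_t_given_x = rng.dirichlet(np.ones(n_t), size=p_xy.shape[0])    # random initial encoder
        for _ in range(n_iter):
            p_t = p_t_given_x.T @ p_x                                    # P(t)
            p_y_given_t = (p_t_given_x * p_x[:, None]).T @ p_y_given_x / p_t[:, None]   # P(y|t)
            # KL( p(y|x) || p(y|t) ) for every pair (x, t); small epsilon avoids log(0)
            kl = np.sum(p_y_given_x[:, None, :] *
                        np.log((p_y_given_x[:, None, :] + 1e-12) / (p_y_given_t[None, :, :] + 1e-12)),
                        axis=2)
            p_t_given_x = p_t[None, :] * np.exp(-beta * kl)              # exponential-family update
            p_t_given_x /= p_t_given_x.sum(axis=1, keepdims=True)
        return p_t_given_x

    # Toy joint: X takes 4 values (a relevant bit plus a noise bit), Y equals the relevant bit.
    p_xy = np.array([[0.25, 0.00], [0.25, 0.00], [0.00, 0.25], [0.00, 0.25]])
    print(iterative_ib(p_xy).round(2))   # T should keep the relevant bit and compress away the noise

Sweeping β (equivalently, the constraint level ε in (1)) over its range traces the full trade-off between compression I(X; T) and prediction I(T; Y), i.e. the Information Bottleneck bound on the information plane.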


Deep Neural Nets and Markov Chains


Given an input X ∈ ℝ^d, consider training a Multilayer Perceptron, a.k.a. a Deep Neural Net (DNN), with Stochastic Gradient Descent.

[1] Ravid Shwartz-Ziv and Naftali Tishby. “Opening the Black Box of Deep Neural Networks via Information”. arXiv preprint arXiv:1703.00810 (2017).

Deep Neural Nets and Markov Chains

- Training is a Markov process, thus the DNN forms the Markov chain

    X → T_1 → · · · → T_{i−1} → T_i → T_{i+1} → · · · → T_k → Ŷ

- The data processing inequality tells us that

    I(X; T_1) ≥ I(X; T_2) ≥ · · · ≥ I(X; T_k) ≥ I(X; Ŷ),

  hence adding further layers of activation functions (a.k.a. neurons) cannot create any new information about X (a sketch of how these layer-wise quantities can be estimated follows below).
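
As a rough sketch in the spirit of Shwartz-Ziv and Tishby's estimation procedure: discretize the layer's activations into bins and treat each distinct binned activation pattern as one symbol. The bin count, random weights and single tanh layer below are illustrative assumptions, not the paper's exact setup.

    import numpy as np

    def mi_with_input(activations, n_bins=30):
        """Estimate I(X;T) in bits when every input x is distinct and equiprobable, so I(X;T) = H(T_binned)."""
        edges = np.linspace(activations.min(), activations.max(), n_bins)
        binned = np.digitize(activations, edges)                   # discretize each unit's activation
        _, counts = np.unique(binned, axis=0, return_counts=True)  # distinct discretized layer patterns
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 10))                  # 1000 distinct inputs
    T = np.tanh(X @ rng.normal(size=(10, 5)))        # stands in for one hidden layer's activations
    print(mi_with_input(T))                          # upper-bounded by log2(1000) ≈ 9.97 bits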


Key Observation

- For a given ε, the Information Bottleneck (IB) tells you how well you can do, but not how to achieve it; to find that out we still need to train the network.

- The cost function in Deep Neural Net training is highly nonlinear, but the Information Bottleneck optimization (2) is convex and thus has a unique optimal solution!

- Finding the global optimum in DNN training is NP-hard, but optimizing the Information Bottleneck is very efficient (O(n³))!


Main Takeaway
Solving the Information Bottleneck optimization (1) for all levels of ε gives the so-called Information Bottleneck bound in the information plane, the plane with coordinates I(X; T) and I(T; Y).

[2] Ravid Shwartz-Ziv and Naftali Tishby. “Opening the Black Box of Deep Neural Networks via Information”. arXiv preprint arXiv:1703.00810 (2017).

Thank you for your attention!

Questions?

