Outline
Discuss the value of modular, deep, gradient-based systems, especially in robotics
Introduce a new and useful family of modules
Properties of new family
Online training with non-Gaussian priors
e.g., encouraging sparsity or multi-task weight sharing
[System diagram: an RGB camera, NIR camera, and IMU data feed a classification stage and a motion planner driven by a goal system (with camera and lighting inputs); training signals include webcam/LabelMe data, labeled 3-D points, human-driven example paths, a classification cost, and a variance cost on motion plans; arrows indicate both data flow and gradient flow.]
New Modules
Modules that are important in this system require two new abilities:
Induce new priors on weights
Allow modules to solve internal optimization problems
Alternate Priors
[Diagram: a chain of modules (M1 → M2 → M3), each with weights w_i, feeding a loss function c; shown three times under different weight-update rules.]
L2 Backpropagation (standard additive gradient step):
w_i^{t+1} = w_i^t - α ∇_i c
KL-divergence prior (exponentiated gradient, multiplicative step):
w_i^{t+1} = w_i^t e^{-α ∇_i c}
General mirror-descent update through a link function ∇f:
∇f(w_i^{t+1}) = ∇f(w_i^t) - α ∇c(w_i^t)
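The two concrete update rules above can be sketched on a toy quadratic loss. This is a minimal illustration, not the paper's training code; the target vector and step size are made up for the example. The additive (L2) step and the multiplicative (exponentiated-gradient / KL) step both converge, but the multiplicative step keeps every weight strictly positive at every iteration.

```python
import numpy as np

def grad(w, target):
    """Gradient of the quadratic loss c(w) = 0.5 * ||w - target||^2."""
    return w - target

target = np.array([0.2, 0.05, 0.7, 0.05])  # illustrative positive target
alpha = 0.5

# Standard (L2) backpropagation: additive update w <- w - alpha * grad
w_l2 = np.full(4, 0.25)
for _ in range(500):
    w_l2 = w_l2 - alpha * grad(w_l2, target)

# KL / exponentiated-gradient update: multiplicative, weights stay positive
w_kl = np.full(4, 0.25)
for _ in range(500):
    w_kl = w_kl * np.exp(-alpha * grad(w_kl, target))

print(w_l2, w_kl)  # both approach the target
```

The multiplicative form is exactly mirror descent with the negative-entropy link function, which is why it pairs naturally with a KL prior on the weights.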
New Modules
Modules that are important in this system require two new abilities:
Induce new priors on weights
Allow modules to solve internal optimization problems, which produce interesting nonlinear effects, such as inhibition, that involve coupled outputs
Sparse Approximation
[Diagram: three ways to map an input onto a basis with inhibition between elements: direct connections, a linear projection, and a KL-regularized optimization.]
Sparse Approximation
Assumes the input is a sparse combination of elements, plus observation noise
Many possible elements
Only a few present in any particular example
Olshausen and Field, Sparse Coding of Natural Images Produces Localized, Oriented, Bandpass Receptive Fields, Nature 95
Doi and Lewicki, Sparse Coding of Natural Images Using an Overcomplete Set of Limited Capacity Units, NIPS 04
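The generative assumption above (many possible elements, only a few active per example, plus observation noise) is easy to sketch. Sizes, the number of active elements, and the noise level below are illustrative choices, not values from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

n_elements, dim = 50, 20                  # many candidate elements
basis = rng.normal(size=(dim, n_elements))

# Only a few elements are present in any particular example.
w = np.zeros(n_elements)
active = rng.choice(n_elements, size=3, replace=False)
w[active] = rng.uniform(0.5, 1.5, size=3)

# Observed input: sparse combination of basis elements plus noise.
x = basis @ w + 0.01 * rng.normal(size=dim)
print(np.count_nonzero(w), "of", n_elements, "elements active")
```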
Sparse Approximation
Semantic meaning is sparse
Sparse Approximation
[Diagram: basis coefficients w_1 produce a reconstruction r_1 = B w_1 of the input; the reconstruction error (cross-entropy) supplies the error gradient used to update the coefficients.]
Sparse Approximation
[Diagram: KL-regularized coefficients on a KL-regularized basis; training examples enter as input, sparse coding computes r = B w^(i) for each example, reconstruction error (cross-entropy) trains the basis, and the coefficients are passed on as the module's output.]
Optimization Modules
L1-regularized sparse approximation:
min_w  (reconstruction loss)  +  λ ||w||_1  (regularization term)
The problem is convex.
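A standard way to solve this convex objective, with a squared-error reconstruction loss, is ISTA (iterative shrinkage-thresholding). This is a generic sketch of that solver, not the source's implementation (the talk elsewhere uses a cross-entropy reconstruction loss); all data and parameter values are illustrative.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (componentwise soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(x, B, lam, n_iters=500):
    """Minimize 0.5*||x - B w||^2 + lam*||w||_1 by ISTA."""
    step = 1.0 / np.linalg.norm(B, ord=2) ** 2   # 1 / Lipschitz const of the smooth part
    w = np.zeros(B.shape[1])
    for _ in range(n_iters):
        g = B.T @ (B @ w - x)                    # gradient of the reconstruction loss
        w = soft_threshold(w - step * g, step * lam)
    return w

rng = np.random.default_rng(0)
B = rng.normal(size=(20, 50))
w_true = np.zeros(50)
w_true[[3, 17, 41]] = [1.0, -0.8, 1.2]
x = B @ w_true + 0.01 * rng.normal(size=20)

w_hat = ista(x, B, lam=0.1)
print(np.count_nonzero(np.abs(w_hat) > 1e-3))   # only a few coefficients survive
```

Because the objective is convex, ISTA with this step size decreases the objective monotonically, and the soft threshold drives most coefficients exactly to zero.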
Preliminary Results
L1 sparse coding
Main Points
Modular, gradient-based systems are an important design tool for large-scale learning systems
New tools are needed to include a family of modules that have important properties
Presented a generalized backpropagation technique that:
Allows priors that encourage, e.g., sparsity (KL prior): uses mirror descent to modify weights
Uses implicit differentiation to compute gradients through modules (e.g., sparse approximation) that internally solve an optimization problem
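The implicit-differentiation idea in the second point can be illustrated on a module whose inner problem is ridge-regularized least squares, chosen here because its optimality condition is smooth; the talk's sparse-approximation modules use KL/L1 regularizers instead. Everything below (the scalar parameter theta, the downstream loss, the sizes) is an assumption made for the sketch. Differentiating the optimality condition g(w, θ) = 0 gives dw/dθ = -H⁻¹ ∂g/∂θ, which we check against finite differences.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=5)
B0 = rng.normal(size=(5, 3))
D = rng.normal(size=(5, 3))   # direction in which the basis varies with theta
lam = 0.1

def inner_solve(theta):
    """Inner problem: w*(theta) = argmin_w 0.5||x - B w||^2 + 0.5*lam*||w||^2."""
    B = B0 + theta * D
    return np.linalg.solve(B.T @ B + lam * np.eye(3), B.T @ x), B

def outer_loss(w):
    return 0.5 * np.sum(w ** 2)   # illustrative downstream loss on the module output

# Optimality condition: g(w, theta) = B^T (B w - x) + lam * w = 0 at w*.
theta = 0.3
w_star, B = inner_solve(theta)
H = B.T @ B + lam * np.eye(3)                       # dg/dw at the optimum
dg_dtheta = D.T @ (B @ w_star - x) + B.T @ (D @ w_star)
dw_dtheta = -np.linalg.solve(H, dg_dtheta)          # implicit function theorem
grad_implicit = w_star @ dw_dtheta                  # chain rule: dL/dw . dw/dtheta

# Finite-difference check of the implicit gradient
eps = 1e-6
lp = outer_loss(inner_solve(theta + eps)[0])
lm = outer_loss(inner_solve(theta - eps)[0])
grad_fd = (lp - lm) / (2 * eps)
print(grad_implicit, grad_fd)   # the two should agree closely
```

The same recipe applies to any inner optimizer with a differentiable optimality condition: backpropagate through the solution by solving one linear system in the Hessian, without unrolling the solver's iterations.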
Acknowledgements
The authors would like to thank the UPI team, especially Cris Dima, David Silver, and Carl Wellington
Thanks also to DARPA and the Army Research Office for supporting this work through the UPI program and the NDSEG fellowship