
Machine Learning

Deep Learning


Deep Learning (DL)


• Jared Dean, Big Data, Data Mining, and
Machine Learning, Wiley
• Brett Lantz, Machine Learning with R
• M S Ram, Introduction to Deep
Learning, Indian Institute of Technology
• Papers, websites, tutorials
Deep Learning Overview
• The term “deep learning” has become
widely used to describe machine learning
methods that can learn more abstract
concepts than previously available
methods, primarily in the areas of image
recognition and text processing
• Deep Learning is another name for a set
of algorithms that use a neural network as
an architecture and learn the features
automatically from the data
Deep Learning - Brief History
• The earliest example of artificial neural networks is the
Perceptron algorithm developed by Rosenblatt in 1958
• In the late 1970s, researchers discovered that the Perceptron
cannot approximate many nonlinear decision functions
• In the 1980s, researchers found a solution to that problem by
stacking multiple layers of linear classifiers (hence the name
“multilayer perceptron”) to approximate nonlinear decision
functions
• Due to the lack of computational power and labeled data, neural
networks were left out of mainstream research in the late 1990s
and early 2000s
• Since the late 2000’s, neural networks have recovered and
become more successful thanks to the availability of
inexpensive, parallel hardware (graphics processors,
computer clusters) and a massive amount of labeled data
Perceptron
• It was the first algorithmically described neural
network, invented by Rosenblatt in 1958
• It is the simplest form of a neural network used
for the classification of patterns said to be
linearly separable
• The perceptron built around a single neuron is
limited to performing pattern classification with
only two classes (hypotheses)
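The two-class, linearly separable setting above can be sketched with Rosenblatt's learning rule on the logical AND problem. This is a minimal illustration; the function names (`fit`, `predict`) and the learning rate are my own choices, not from the slides.

```python
import numpy as np

def predict(w, b, x):
    # Step activation: fire (1) if the weighted sum exceeds 0.
    return 1 if np.dot(w, x) + b > 0 else 0

def fit(X, y, lr=0.1, epochs=20):
    # Rosenblatt's rule: update weights only on misclassified points.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            error = target - predict(w, b, xi)   # -1, 0, or +1
            w += lr * error * xi
            b += lr * error
    return w, b

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])        # logical AND: linearly separable
w, b = fit(X, y)
preds = [predict(w, b, xi) for xi in X]
print(preds)                       # → [0, 0, 0, 1]
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this loop finds a separating line; on XOR (not linearly separable) it never would, which is exactly the limitation noted in the history above.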
Neural Network – Why Success ?
• The most important reason is that neural networks have
a lot of parameters and can approximate very nonlinear
functions. So if the problem is complex and a lot of
data is available, neural networks are good approximators for it
– For instance, neural networks with three or more hidden layers
have proven to do quite well at tasks such as recognizing
handwritten digits or selecting images with dogs in them

• The second reason is that neural networks are very
flexible: we can change the architecture fairly easily to
adapt to specific problems/domains (such as
convolutional neural networks and recurrent neural
networks)
Deep Learning Application
Neural Network & Deep Learning
• Ideas drawn from neural networks and
machine learning are hybridized to
perform improved learning tasks beyond
the capability of either one operating on its
own, and
• ideas inspired by the human brain lead to
new perspectives wherever they are of
particular importance
Deep Learning - Example
Deep Learning - Example 1
Deep Learning - Example 2
Deep Learning - Example 3
Deep Learning - Aim
• Create algorithms
– that can understand scenes and describe them in
natural language
– that can infer semantic concepts to allow machines to
interact with humans using these concepts
• Requires creating a series of abstractions
– Image (Pixel Intensities) -> Objects in Image ->
Object Interactions -> Scene Description
• Deep learning aims to automatically learn these
abstractions with little supervision
How do we train?
• Inspiration from mammal brain
• Multiple Layers of “neurons” (Rumelhart et al. 1986)
• Train each layer to compose the representations
of the previous layer into a higher-level
abstraction
– Ex: Pixels -> Edges -> Contours -> Object parts ->
Object categories
– Local Features -> Global Features
• Train the layers one-by-one (Hinton et al 2006)
– Greedy strategy
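The greedy layer-by-layer strategy can be sketched with a small stacked autoencoder: train layer 1 to reconstruct the raw input, freeze it, then train layer 2 on layer 1's codes. This is a minimal illustration of the idea, not Hinton et al.'s exact 2006 procedure (which used restricted Boltzmann machines); all sizes and names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder_layer(X, n_hidden, lr=0.5, epochs=200):
    """Train one sigmoid autoencoder layer by gradient descent on
    squared reconstruction error; return only the encoder weights."""
    n_in = X.shape[1]
    W_enc = rng.normal(0, 0.1, (n_in, n_hidden))
    W_dec = rng.normal(0, 0.1, (n_hidden, n_in))
    for _ in range(epochs):
        H = sigmoid(X @ W_enc)          # hidden code
        X_hat = H @ W_dec               # linear reconstruction
        err = X_hat - X
        # Backprop through decoder then encoder.
        grad_dec = H.T @ err
        grad_enc = X.T @ ((err @ W_dec.T) * H * (1 - H))
        W_dec -= lr * grad_dec / len(X)
        W_enc -= lr * grad_enc / len(X)
    return W_enc

X = rng.random((64, 8))                 # toy "pixel" data
W1 = train_autoencoder_layer(X, 4)      # layer 1: trained on raw input
H1 = sigmoid(X @ W1)                    # layer-1 representations
W2 = train_autoencoder_layer(H1, 2)     # layer 2: trained on layer-1 codes
H2 = sigmoid(H1 @ W2)                   # higher-level abstraction
print(H2.shape)                          # → (64, 2)
```

Each stage only ever sees the previous layer's output, which is what makes the strategy "greedy": no layer is revisited once trained.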
Neuron vs Artificial Network
Element of Neural Network
1. Inputs are fed into the perceptron
2. Each input is multiplied by its weight
3. The weighted inputs are summed and a bias is
added
4. An activation function is applied. Note that here
we use a step function, but there are other,
more sophisticated activation functions such as
the sigmoid, hyperbolic tangent (tanh), rectifier
(ReLU) and more. No worries, we will cover many
of them in the future!
5. The output is either triggered, as 1, or not, as 0.
Note that we use y hat (ŷ) to label the output
produced by our perceptron model
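The five steps above can be written as a single forward pass. A minimal sketch: the step activation matches the slide, and sigmoid/ReLU stand in for the alternatives it mentions; the input, weight, and bias values are made up for illustration.

```python
import numpy as np

def step(z):
    return 1.0 if z > 0 else 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return max(0.0, z)

def perceptron(x, w, b, activation=step):
    z = np.dot(w, x) + b        # steps 1-3: weight each input, sum, add bias
    return activation(z)        # step 4: apply the activation function
    # step 5: the returned value is y_hat

x = np.array([1.0, 0.5])        # inputs
w = np.array([0.4, -0.2])       # weights
b = 0.1                         # bias

y_hat = perceptron(x, w, b)     # step activation
print(y_hat)                    # → 1.0  (z = 0.4 - 0.1 + 0.1 = 0.4 > 0)
```

Swapping `activation=sigmoid` or `activation=relu` changes only step 4: the output becomes a smooth value rather than a hard 0/1 decision.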
Deep Learning Model
First DL Model
• Training neural networks
takes a long time,
especially when the
training set is large
• It therefore makes sense
to use many machines
to train our neural
networks
• Too many concepts to learn
– Too many object categories
– Too many ways of interaction between objects
• The underlying behaviour is a highly varying function
– f: L -> V
– L: latent factors of variation
• low dimensional latent factor space
– V: visible behaviour
• high dimensional observable space
– f: highly non-linear function
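The mapping f: L -> V above can be illustrated concretely: a few latent factors (low-dimensional L) generate a high-dimensional observation through a non-linear map. The particular map here (a random linear mix followed by tanh) and all dimensions are made up purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
dim_L, dim_V = 3, 100            # low-dim latent space, high-dim visible space
A = rng.normal(size=(dim_V, dim_L))

def f(z):
    # Non-linear map from latent factors to visible behaviour.
    return np.tanh(A @ z)

z = rng.normal(size=dim_L)       # a point in the latent space L
v = f(z)                         # its visible behaviour in V
print(z.shape, v.shape)          # → (3,) (100,)
```

The point of deep learning, in these terms, is to invert this picture: recover a small set of explanatory factors z from only the high-dimensional observations v.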