In numerical analysis and functional analysis, a discrete wavelet transform (DWT) is any wavelet
transform for which the wavelets are discretely sampled. As with other wavelet transforms, a key
advantage it has over Fourier transforms is temporal resolution: it captures both
frequency and location information (location in time). The most commonly used set of discrete
wavelet transforms was formulated by the Belgian mathematician Ingrid Daubechies in 1988. This
formulation is based on the use of recurrence relations to generate progressively finer discrete
samplings of an implicit mother wavelet function; each resolution is twice that of the previous scale.
In her seminal paper, Daubechies derives a family of wavelets, the first of which is the Haar wavelet.
Interest in this field has exploded since then, and many variations of Daubechies' original wavelets
have been developed.
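
As an illustration of the simplest member of the family, the following is a minimal sketch of one level of the Haar DWT, assuming the orthonormal normalisation (division by the square root of 2) and an even-length input. The function names `haar_dwt` and `haar_idwt` and the sample signal are chosen here for illustration and do not come from the text.

```python
import numpy as np

def haar_dwt(x):
    """One level of the orthonormal Haar DWT of an even-length 1-D signal."""
    x = np.asarray(x, dtype=float)
    if x.size % 2:
        raise ValueError("signal length must be even")
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)   # low-pass: scaled pairwise sums
    detail = (even - odd) / np.sqrt(2.0)   # high-pass: scaled pairwise differences
    return approx, detail

def haar_idwt(approx, detail):
    """Invert haar_dwt by rebuilding the even and odd samples."""
    approx = np.asarray(approx, dtype=float)
    detail = np.asarray(detail, dtype=float)
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

signal = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 0.0, 2.0])
approx, detail = haar_dwt(signal)
restored = haar_idwt(approx, detail)
```

Each detail coefficient is tied to one pair of samples, so it records where a change occurs as well as how large it is. Applying `haar_dwt` again to `approx` would give the next coarser scale.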

• Sinusoidal waves differ only in their frequency. In a four-element example, the first basis vector
does not complete any cycles, the second completes one full cycle, the third completes two cycles,
and the fourth completes three cycles (which is equivalent to completing one cycle in the opposite
direction). Differences in phase can be represented by multiplying a given basis vector by a
complex constant.
• Wavelets, by contrast, have both frequency and location. As before, the first completes zero
cycles, and the second completes one cycle. However, the third and fourth both have the same
frequency, twice that of the second. Rather than differing in frequency, they differ in location: the
third is nonzero over the first two elements, and the fourth is nonzero over the second two.
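
The comparison can be checked numerically. The sketch below assumes a four-element signal, the standard DFT basis for the sinusoids, and the unnormalised Haar basis for the wavelets; the variable names are illustrative only.

```python
import numpy as np

N = 4
n = np.arange(N)

# Row k of the DFT basis is a complex sinusoid completing k cycles over N samples.
fourier = np.exp(2j * np.pi * np.outer(n, n) / N)

# Unnormalised 4-point Haar basis: one constant vector, one full-length
# wavelet, and two half-length wavelets at different positions.
haar = np.array([
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1,  0,  0],
    [0,  0,  1, -1],
], dtype=float)

# Every Fourier vector is nonzero everywhere: frequency but no location.
fourier_support = [np.flatnonzero(np.abs(row) > 1e-12).tolist() for row in fourier]

# The last two Haar vectors share a frequency but occupy different positions.
haar_support = [np.flatnonzero(row).tolist() for row in haar]

# Both sets are orthogonal bases, so their Gram matrices are diagonal.
fourier_gram = fourier @ fourier.conj().T
haar_gram = haar @ haar.T
```

Here `haar_support` lists the indices where each Haar vector is nonzero, which is the "location" that the sinusoidal basis lacks.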

Daubechies wavelets, based on the work of Ingrid Daubechies, are a family of orthogonal
wavelets defining a discrete wavelet transform and characterized by a maximal number of
vanishing moments for some given support. With each wavelet type of this class, there is a scaling
function (called the father wavelet) which generates an orthogonal multiresolution analysis.
Daubechies constructed compactly supported orthonormal wavelets, thus making discrete wavelet
analysis practicable.
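
To make "orthogonal" and "vanishing moments" concrete, the sketch below checks these properties for the four-tap Daubechies filter (db2). The closed-form coefficients and the alternating-flip rule for the wavelet filter are standard results rather than something stated in the text, and the normalisation (taps summing to the square root of 2) is an assumption of this example.

```python
import numpy as np

s3 = np.sqrt(3.0)
# db2 scaling (low-pass) filter, normalised so the taps sum to sqrt(2).
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))
# Wavelet (high-pass) filter by the alternating-flip rule g[k] = (-1)^k h[3-k].
g = np.array([(-1) ** k * h[3 - k] for k in range(4)])

k = np.arange(4)
checks = {
    "h sums to sqrt(2)": np.isclose(h.sum(), np.sqrt(2.0)),
    "h has unit energy": np.isclose(h @ h, 1.0),
    "h orthogonal to its shift by 2": np.isclose(h[0] * h[2] + h[1] * h[3], 0.0),
    "g orthogonal to h": np.isclose(g @ h, 0.0),
    "zeroth moment of g vanishes": np.isclose(g.sum(), 0.0),
    "first moment of g vanishes": np.isclose((k * g).sum(), 0.0),
}
```

The last two entries are the two vanishing moments of db2: the wavelet filter annihilates constant and linear sequences, which is the most that four taps allow.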

The names of the Daubechies family wavelets are written dbN, where N is the order, and db the
"surname" of the wavelet. The db1 wavelet, as mentioned above, is the same as the Haar wavelet.
[Figure: the wavelet functions psi of the next nine members of the family]

The regionprops function reports, among others, the following shape measurements:

'Area' Returns a scalar that specifies the actual number of pixels in the region. (This value might
differ slightly from the value returned by bwarea, which weights different patterns of pixels
differently.)

'BoundingBox' Returns the smallest rectangle containing the region, specified as a 1-by-Q*2 vector,
where Q is the number of image dimensions, for example, [ul_corner width]. ul_corner specifies the
upper-left corner of the bounding box in the form [x y z ...]. width specifies the width of the
bounding box along each dimension in the form [x_width y_width ...]. regionprops uses ndims to get
the dimensions of a label matrix or binary image, ndims(L), and numel to get the dimensions of
connected components, numel(CC.ImageSize).

'Centroid' Returns a 1-by-Q vector that specifies the center of mass of the region. The first element
of Centroid is the horizontal coordinate (or x-coordinate) of the center of mass, and the second
element is the vertical coordinate (or y-coordinate). All other elements of Centroid are in order of
dimension.

[Figure: the centroid and bounding box for a discontiguous region]
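
The three measurements can be reproduced by hand for a small mask. The following is a plain NumPy sketch, not a call to regionprops; it assumes 0-based pixel-centre coordinates (x as column, y as row), so the numbers will not match MATLAB's conventions exactly. The function name `region_measurements` is hypothetical.

```python
import numpy as np

def region_measurements(mask):
    """Area, bounding box and centroid of the True pixels in a 2-D mask.

    Coordinates are 0-based pixel centres (x = column, y = row); this is an
    assumption of the sketch and differs from MATLAB's regionprops.
    """
    mask = np.asarray(mask, dtype=bool)
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("mask contains no foreground pixels")
    area = int(rows.size)
    x_min, y_min = int(cols.min()), int(rows.min())
    width = int(cols.max()) - x_min + 1
    height = int(rows.max()) - y_min + 1
    centroid = (float(cols.mean()), float(rows.mean()))
    return area, (x_min, y_min, width, height), centroid

# A discontiguous region: two separate blobs treated as one region.
mask = np.zeros((5, 8), dtype=bool)
mask[1:3, 1:3] = True   # 2x2 blob
mask[3, 6] = True       # isolated pixel
area, bbox, centroid = region_measurements(mask)
```

For a discontiguous region like this one, the centroid can fall on a background pixel, and the bounding box spans both blobs.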

A wavelet is a wave-like oscillation with an amplitude that begins at zero, increases, and then
decreases back to zero. It can typically be visualized as a "brief oscillation" like one might see
recorded by a seismograph or heart monitor. Generally, wavelets are purposefully crafted to have
specific properties that make them useful for signal processing. Wavelets can be combined, using a
"reverse, shift, multiply and integrate" technique called convolution, with portions of a known signal
to extract information from the unknown signal.

A wavelet series is a representation of a square-integrable function with respect to
either a complete, orthonormal set of basis functions, or an overcomplete set or frame of a vector
space, for the Hilbert space of square-integrable functions.

Mathematically, the wavelet will correlate with the signal if the unknown signal contains information
of similar frequency. This concept of correlation is at the core of many practical applications of
wavelet theory.
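
The correlation idea can be sketched as follows. The example assumes a synthetic signal containing one short sine burst and a complex Gaussian-windowed sinusoid standing in for the wavelet (Morlet-like); the frequencies, window widths, and variable names are choices made for this illustration.

```python
import numpy as np

freq = 0.1                           # cycles per sample (assumed test frequency)
t = np.arange(512)

# Test signal: a short sine burst centred on sample 300, near zero elsewhere.
burst_centre = 300
envelope = np.exp(-0.5 * ((t - burst_centre) / 10.0) ** 2)
signal = envelope * np.sin(2 * np.pi * freq * (t - burst_centre))

# Complex Gaussian-windowed sinusoid (Morlet-like wavelet) at the same frequency.
tau = np.arange(-30, 31)
window = np.exp(-0.5 * (tau / 10.0) ** 2)
wavelet = window * np.exp(2j * np.pi * freq * tau)

# Slide the wavelet along the signal; np.correlate conjugates its second argument.
response = np.abs(np.correlate(signal, wavelet, mode="same"))
peak = int(np.argmax(response))

# A wavelet at a very different frequency barely responds to the same burst.
mismatched = window * np.exp(2j * np.pi * 0.3 * tau)
weak = float(np.abs(np.correlate(signal, mismatched, mode="same")).max())
```

The response peaks where the burst sits, so the wavelet reports both that the frequency is present and when it occurs. Using a complex wavelet and taking the magnitude makes the result insensitive to the phase of the burst.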
It is natural to ask whether it is sufficient to pick a discrete subset of the upper halfplane to be
able to reconstruct a signal from the corresponding wavelet coefficients. One such system is the
affine system for some real parameters $a > 1$, $b > 0$. The corresponding discrete subset of the
halfplane consists of all the points $(a^m, n a^m b)$ with $m, n \in \mathbb{Z}$.

In any discretised wavelet transform, there are only a finite number of wavelet coefficients for each
bounded rectangular region in the upper halfplane. Still, each coefficient requires the evaluation of
an integral. In special situations this numerical complexity can be avoided if the scaled and shifted
wavelets form a multiresolution analysis. This means that there has to exist an auxiliary function,
the father wavelet $\varphi$ in $L^2(\mathbb{R})$, and that $a$ is an integer. A typical choice is
$a = 2$ and $b = 1$. The most famous pair of father and mother wavelets is the Daubechies 4-tap
wavelet.
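
For the typical choice a = 2 and b = 1, the discrete subset of the halfplane is the dyadic grid. The short sketch below enumerates a few of its points; the function name `sampling_grid` and the finite index ranges are assumptions made for illustration, since the full set runs over all integers m and n.

```python
def sampling_grid(a, b, scales, shifts):
    """Points (a**m, n * a**m * b) of the affine system for the given m and n."""
    if a <= 1 or b <= 0:
        raise ValueError("requires a > 1 and b > 0")
    return [(a ** m, n * a ** m * b) for m in scales for n in shifts]

# Dyadic choice a = 2, b = 1: the shift step doubles with each coarser scale.
grid = sampling_grid(2, 1, scales=range(0, 3), shifts=range(0, 4))
```

At scale 1 the shifts are 0, 1, 2, 3; at scale 2 they are 0, 2, 4, 6; at scale 4 they are 0, 4, 8, 12. Coarser scales are sampled more sparsely in position.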

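A simple elastic snake is defined by a set of $n$ points $\mathbf{v}_i$ where $i = 0, \ldots, n-1$,
the internal elastic energy term $E_{\text{internal}}$, and the external edge-based energy term
$E_{\text{external}}$. The purpose of the internal energy term is to control the deformations made
to the snake, and the purpose of the external energy term is to control the fitting of the contour
onto the image. The external energy is usually a combination of the forces due to the image itself
$E_{\text{image}}$ and the constraint forces introduced by the user $E_{\text{con}}$.

The energy function of the snake is the sum of its external energy and internal energy, or

$$E_{\text{snake}}^{*} = \int_0^1 E_{\text{snake}}(\mathbf{v}(s))\,ds = \int_0^1 \left( E_{\text{internal}}(\mathbf{v}(s)) + E_{\text{image}}(\mathbf{v}(s)) + E_{\text{con}}(\mathbf{v}(s)) \right) ds$$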

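Internal energy
The internal energy of the snake is composed of the continuity of the contour $E_{\text{cont}}$ and
the smoothness of the contour $E_{\text{curv}}$:

$$E_{\text{internal}} = E_{\text{cont}} + E_{\text{curv}}$$

This can be expanded as

$$E_{\text{internal}} = \frac{1}{2}\left( \alpha(s)\,\left|\mathbf{v}_s(s)\right|^2 + \beta(s)\,\left|\mathbf{v}_{ss}(s)\right|^2 \right) = \frac{1}{2}\left( \alpha(s)\,\left\| \frac{d\mathbf{v}}{ds}(s) \right\|^2 + \beta(s)\,\left\| \frac{d^2\mathbf{v}}{ds^2}(s) \right\|^2 \right)$$

where $\alpha(s)$ and $\beta(s)$ are user-defined weights; these control the internal energy
function's sensitivity to the amount of stretch in the snake and the amount of curvature
in the snake, respectively, and thereby control the number of constraints on the shape of
the snake.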

In practice, a large weight for the continuity term penalizes changes in distances
between points in the contour. A large weight for the smoothness term penalizes
oscillations in the contour and will cause the contour to act as a thin plate.
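
The effect of the two weights can be explored with a discrete version of the internal energy. The sketch below assumes a closed contour, constant weights, and unit-spaced finite differences in place of the derivatives along the contour; the text gives no discrete form, so this discretisation and the name `internal_energy` are assumptions of the example.

```python
import numpy as np

def internal_energy(points, alpha=1.0, beta=1.0):
    """Discrete internal energy of a closed snake given as an (n, 2) array.

    The continuity term uses first differences and the smoothness term uses
    second differences; alpha and beta are taken as constants (assumptions).
    """
    v = np.asarray(points, dtype=float)
    nxt = np.roll(v, -1, axis=0)          # v[i+1], wrapping around the contour
    prv = np.roll(v, 1, axis=0)           # v[i-1], wrapping around the contour
    first = nxt - v                       # stretch between neighbours
    second = nxt - 2.0 * v + prv          # bending at each point
    continuity = np.sum(first ** 2)
    curvature = np.sum(second ** 2)
    return 0.5 * (alpha * continuity + beta * curvature)

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
stretched = [(0, 0), (2, 0), (2, 2), (0, 2)]
e_square = internal_energy(square)
e_stretched = internal_energy(stretched)
```

Doubling the spacing between the points quadruples the energy, which is how a large continuity weight discourages the contour from stretching. Setting `beta=0` leaves only the stretch penalty.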

Image energy
Energy in the image is some function of the features of the image. This is one of the
most common points of modification in derivative methods. Features in images and
images themselves can be processed in many and various ways.

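For an image $I(x, y)$ with lines, edges, and terminations present in the image, the general
formulation of energy due to the image is

$$E_{\text{image}} = w_{\text{line}} E_{\text{line}} + w_{\text{edge}} E_{\text{edge}} + w_{\text{term}} E_{\text{term}}$$

where $w_{\text{line}}$, $w_{\text{edge}}$, $w_{\text{term}}$ are weights of these salient features.
Higher weights indicate that the salient feature will have a larger contribution to the image force.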

Line functional

The line functional is the intensity of the image, which can be represented as

$$E_{\text{line}} = I(x, y)$$

The sign of the line weight $w_{\text{line}}$ will determine whether the line will be attracted to
either dark lines or light lines.

Some smoothing or noise reduction may be used on the image, in which case the
line functional appears as

$$E_{\text{line}} = \operatorname{filter}(I(x, y))$$

Edge functional

The edge functional is based on the image gradient. One implementation of
this is

$$E_{\text{edge}} = -\left| \nabla I(x, y) \right|^2$$

A snake originating far from the desired object contour may erroneously
converge to some local minimum. Scale space continuation can be
used in order to avoid these local minima. This is achieved by using a
blur filter on the image and reducing the amount of blur as the
calculation progresses to refine the fit of the snake. The energy
functional using scale space continuation is

$$E_{\text{edge}} = -\left| G_\sigma * \nabla^2 I \right|^2$$

where $G_\sigma$ is a Gaussian with standard deviation $\sigma$. Minima of this
function fall on the zero-crossings of $G_\sigma * \nabla^2 I$, which define edges
as per Marr–Hildreth theory.
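
The basic gradient-based edge functional (without the blur step) can be sketched as the negative squared gradient magnitude of the image. The example below assumes central-difference gradients from `np.gradient` and a synthetic step image; the function name `edge_energy` is illustrative, and the Gaussian-blurred scale space variant is not implemented here.

```python
import numpy as np

def edge_energy(image):
    """Negative squared gradient magnitude, with central-difference gradients."""
    gy, gx = np.gradient(np.asarray(image, dtype=float))
    return -(gx ** 2 + gy ** 2)

# Synthetic image: dark left half, bright right half, edge between columns 4 and 5.
image = np.zeros((6, 10))
image[:, 5:] = 1.0
energy = edge_energy(image)
edge_columns = np.flatnonzero(energy[0] < 0).tolist()
```

The energy is zero in the flat regions and negative only next to the step, so a contour minimising it is drawn toward the edge. A snake that starts in a flat region sees no gradient at all, which is the situation that blurring the image first is meant to address.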

Active contour model, also called snakes, is a framework in computer vision for delineating an
object outline from a possibly noisy 2D image. The snakes model is popular in computer vision, and
snakes are widely used in applications like object tracking, shape recognition, segmentation, edge
detection and stereo matching.

A snake is an energy-minimizing, deformable spline influenced by constraint and image forces that
pull it towards object contours and internal forces that resist deformation. Snakes may be
understood as a special case of the general technique of matching a deformable model to an image
by means of energy minimization.[1] In two dimensions, the active shape model represents a discrete
version of this approach, taking advantage of the point distribution model to restrict the shape range
to an explicit domain learned from a training set.

Snakes – active deformable models

Snakes do not solve the entire problem of finding contours in images, since the method requires
knowledge of the desired contour shape beforehand. Rather, they depend on other mechanisms
such as interaction with a user, interaction with some higher level image understanding process, or
information from image data adjacent in time or space.

In machine learning and cognitive science, artificial neural networks (ANNs) are a family of
models inspired by biological neural networks (the central nervous systems of animals, in particular
the brain) and are used to estimate or approximate functions that can depend on a large number
of inputs and are generally unknown. Artificial neural networks are generally presented as systems
of interconnected "neurons" which exchange messages between each other. The connections have
numeric weights that can be tuned based on experience, making neural nets adaptive to inputs and
capable of learning.

For example, a neural network for handwriting recognition is defined by a set of input neurons which
may be activated by the pixels of an input image. After being weighted and transformed by
a function (determined by the network's designer), the activations of these neurons are then passed
on to other neurons. This process is repeated until finally, an output neuron is activated. This
determines which character was read.
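
The weighting, transformation, and passing-on of activations can be sketched with a single hidden layer. The example below assumes a tanh transfer function, a three-pixel "image", and hand-picked weights; none of these come from the text, and a real handwriting recogniser would learn its weights from data.

```python
import numpy as np

def forward(pixels, w_hidden, b_hidden, w_out, b_out):
    """Propagate input activations through one hidden layer.

    Returns the index of the most strongly activated output and all scores.
    """
    hidden = np.tanh(w_hidden @ pixels + b_hidden)   # weighted sum, then transfer function
    scores = w_out @ hidden + b_out                  # one score per candidate character
    return int(np.argmax(scores)), scores

# Toy sizes and hand-picked weights, purely for illustration.
pixels = np.array([0.0, 1.0, 0.0])
w_hidden, b_hidden = np.eye(3), np.zeros(3)
w_out, b_out = np.eye(3), np.zeros(3)
label, scores = forward(pixels, w_hidden, b_hidden, w_out, b_out)
```

The index of the largest output score plays the role of the activated output neuron that determines which character was read.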

Like other machine learning methods – systems that learn from data – neural networks have been
used to solve a wide variety of tasks that are hard to solve using ordinary rule-based programming,
including computer vision and speech recognition.
An artificial neural network is an interconnected group of nodes, akin to the vast network of neurons in a brain.
Here, each circular node represents an artificial neuron and an arrow represents a connection from the output
of one neuron to the input of another.

Learning paradigms
There are three major learning paradigms, each corresponding to a particular abstract learning task.
These are supervised learning, unsupervised learning and reinforcement learning.

Supervised learning

In supervised learning, we are given a set of example pairs $(x, y)$, $x \in X$, $y \in Y$, and the
aim is to find a function $f : X \to Y$ in the allowed class of functions that matches the examples.
In other words, we wish to infer the mapping implied by the data; the cost function is related to the
mismatch between our mapping and the data and it implicitly contains prior knowledge about the
problem domain.

A commonly used cost is the mean-squared error, which tries to minimize the average squared error
between the network's output, $f(x)$, and the target value $y$ over all the example pairs. When one
tries to minimize this cost using gradient descent for the class of neural networks called multilayer
perceptrons (MLP), one obtains the common and well-known backpropagation algorithm for training
neural networks.

Tasks that fall within the paradigm of supervised learning are pattern recognition (also known as
classification) and regression (also known as function approximation). The supervised learning
paradigm is also applicable to sequential data (e.g., for speech and gesture recognition). This can be
thought of as learning with a "teacher", in the form of a function that provides continuous feedback
on the quality of solutions obtained thus far.
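
Minimising the mean-squared error by gradient descent can be sketched on the simplest possible model. The example below assumes a linear model f(x) = w x + b and toy data from a known line, so the gradients can be written out by hand; for a multilayer perceptron the same update rule applies, with the gradients supplied by backpropagation. All names and constants are choices made for illustration.

```python
import numpy as np

def mse(pred, target):
    """Mean-squared error between predictions and targets."""
    return float(np.mean((pred - target) ** 2))

# Example pairs (x, y) drawn from the line y = 2x + 1 (assumed toy data).
x = np.linspace(-1.0, 1.0, 20)
y = 2.0 * x + 1.0

w, b = 0.0, 0.0          # parameters of the model f(x) = w * x + b
learning_rate = 0.1
initial_cost = mse(w * x + b, y)
for _ in range(500):
    error = (w * x + b) - y
    # Gradients of the mean-squared error with respect to w and b.
    grad_w = 2.0 * np.mean(error * x)
    grad_b = 2.0 * np.mean(error)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
final_cost = mse(w * x + b, y)
```

After the loop, `w` and `b` approach the slope and intercept that generated the data, and the cost falls from its initial value toward zero.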

bw = activecontour(A,mask) segments the 2-D grayscale image A into foreground (object) and
background regions using active contour based segmentation. The output image bw is a binary image
where the foreground is white (logical true) and the background is black (logical false). mask is a
binary image that specifies the initial state of the active contour. The boundaries of the object
region(s) (white) in mask define the initial contour position used for contour evolution to segment
the image. To obtain faster and more accurate segmentation results, specify an initial contour
position that is close to the desired object boundaries.

Defining a Problem
To define a pattern recognition problem, arrange a set of Q input vectors as columns in a matrix. Then
arrange another set of Q target vectors so that they indicate the classes to which the input vectors are
assigned (see "Data Structures" for a detailed description of data formatting for static and time series
data). There are two approaches to creating the target vectors.

One approach can be used when there are only two classes; you set each scalar target value to either 1
or 0, indicating which class the corresponding input belongs to. For instance, you can define the two-class
exclusive-or classification problem as follows:
inputs = [0 1 0 1; 0 0 1 1];   % each column is one two-element input vector
targets = [0 1 1 0];           % scalar target per column: exclusive-or of its two elements

Sensitivity and specificity are statistical measures of the performance of a binary
classification test, also known in statistics as a classification function:

• Sensitivity (also called the true positive rate, or the recall in some fields) measures the
proportion of positives that are correctly identified as such (e.g., the percentage of sick people
who are correctly identified as having the condition).
• Specificity (also called the true negative rate) measures the proportion of negatives that are
correctly identified as such (e.g., the percentage of healthy people who are correctly identified as
not having the condition).

Thus sensitivity quantifies the avoidance of false negatives, as specificity does for false positives. For
any test, there is usually a trade-off between the measures. For instance, in an airport
security setting in which one is testing for potential threats to safety, scanners may be set to trigger
on low-risk items like belt buckles and keys (low specificity), in order to reduce the risk of missing
objects that do pose a threat to the aircraft and those aboard (high sensitivity). This trade-off can be
represented graphically as a receiver operating characteristic curve. A perfect predictor would be
described as 100% sensitive (e.g., all sick are identified as sick) and 100% specific (e.g., no healthy
are identified as sick); however, theoretically any predictor will possess a minimum error
bound known as the Bayes error rate.

Sensitivity refers to the test's ability to correctly detect patients who do have the condition. Consider
the example of a medical test used to identify a disease. The sensitivity of the test is the proportion
of people who test positive for the disease among those who have the disease. Mathematically, this
can be expressed as:

$$\text{sensitivity} = \frac{\text{number of true positives}}{\text{number of true positives} + \text{number of false negatives}}$$

A negative result in a test with high sensitivity is useful for ruling out disease. A high sensitivity
test is reliable when its result is negative, since it rarely misdiagnoses those who have the
disease. A test with 100% sensitivity will recognize all patients with the disease by testing
positive. A negative test result would definitively rule out presence of the disease in a patient.

A positive result in a test with high sensitivity is not useful for ruling in disease. Suppose a
'bogus' test kit is designed to show only one reading, positive. When used on diseased patients,
all patients test positive, giving the test 100% sensitivity. However, sensitivity by definition does
not take into account false positives. The bogus test also returns positive on all healthy patients,
giving it a false positive rate of 100%, rendering it useless for detecting or "ruling in" the disease.

Sensitivity is not the same as the precision or positive predictive value (ratio of true positives to
combined true and false positives), which is as much a statement about the proportion of actual
positives in the population being tested as it is about the test.

The calculation of sensitivity does not take into account indeterminate test results. If a test
cannot be repeated, indeterminate samples either should be excluded from the analysis (the
number of exclusions should be stated when quoting sensitivity) or can be treated as false
negatives (which gives the worst-case value for sensitivity and may therefore underestimate it).

A test with high sensitivity has a low type II error rate. In non-medical contexts, sensitivity is
sometimes called recall.

Specificity relates to the test's ability to correctly detect patients without a condition. Consider the
example of a medical test for diagnosing a disease. Specificity of a test is the proportion of
healthy patients known not to have the disease, who will test negative for it. Mathematically, this
can also be written as:

$$\text{specificity} = \frac{\text{number of true negatives}}{\text{number of true negatives} + \text{number of false positives}}$$

A positive result in a test with high specificity is useful for ruling in disease. The test rarely
gives positive results in healthy patients. A test with 100% specificity will read negative, and
accurately exclude disease from all healthy patients. A positive result signifies a high
probability of the presence of disease.[3]

A negative result in a test with high specificity is not useful for ruling out disease. Assume a
'bogus' test is designed to read only negative. This is administered to healthy patients, and
reads negative on all of them. This will give the test a specificity of 100%. Specificity by
definition does not take into account false negatives. The same test will also read negative
on diseased patients, therefore it has a false negative rate of 100%, and will be useless for
ruling out disease.

A test with a high specificity has a low type I error rate.
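
Both measures follow directly from the four outcome counts. The sketch below uses hypothetical counts for a screening of 100 diseased and 900 healthy patients, and then reproduces the 'bogus' always-positive test; the numbers are invented for illustration.

```python
def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)

# Hypothetical screening of 100 diseased and 900 healthy patients.
tp, fn, tn, fp = 90, 10, 810, 90
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
precision = tp / (tp + fp)   # depends on how rare the disease is

# The 'bogus' always-positive test: perfect sensitivity, zero specificity.
bogus_sens = sensitivity(tp=100, fn=0)
bogus_spec = specificity(tn=0, fp=900)
```

With these counts the test is 90% sensitive and 90% specific, yet only half of its positive results are true positives, because healthy patients outnumber diseased ones nine to one. This is why sensitivity is not the same as precision.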

In the field of machine learning, a confusion matrix, also known as an error matrix,[3] is a specific
table layout that allows visualization of the performance of an algorithm, typically a supervised
learning one (in unsupervised learning it is usually called a matching matrix). Each column of the
matrix represents the instances in a predicted class while each row represents the instances in an
actual class (or vice-versa).[2] The name stems from the fact that it makes it easy to see if the system
is confusing two classes (i.e. commonly mislabeling one as another).

It is a special kind of contingency table, with two dimensions ("actual" and "predicted"), and identical
sets of "classes" in both dimensions (each combination of dimension and class is a variable in the
contingency table).

A confusion matrix is a table that is often used to describe the performance of
a classification model (or "classifier") on a set of test data for which the true
values are known.
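
A confusion matrix can be built by counting (actual, predicted) pairs. The sketch below assumes integer class labels starting at 0, rows indexed by actual class, and columns by predicted class; the label lists are hypothetical.

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes):
    """Count matrix with rows indexed by actual class and columns by predicted class."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        matrix[a, p] += 1
    return matrix

# Hypothetical labels for a two-class problem (0 = negative, 1 = positive).
actual    = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
cm = confusion_matrix(actual, predicted, n_classes=2)
tn, fp, fn, tp = cm.ravel()
```

The diagonal holds the correct predictions, and the off-diagonal cells show which class is being mistaken for which. In the two-class case the four cells are exactly the counts needed for sensitivity and specificity.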