
EC6402

COMMUNICATION THEORY

DEPT/ YEAR/ SEM: ECE/ II/ IV


PREPARED BY: Mr. M.ASIRVATHAM/AP/ECE

UNIT 1

AMPLITUDE MODULATION

Review of spectral characteristics of periodic and non-periodic signals.


Generation and demodulation of AM signal.
Generation and demodulation of DSBSC signal.
Generation and demodulation of SSB signal.
Generation and demodulation of VSB signal.
Comparison of amplitude modulation systems.
Frequency translation.
FDM.
Non-linear distortion.

Introduction:
In electronics, a signal is an electric current or electromagnetic field used to convey data from one
place to another. The simplest form of signal is a direct current (DC) that is switched on and off; this is the
principle by which the early telegraph worked. More complex signals consist of an alternating-current (AC)
or electromagnetic carrier that contains one or more data streams.

Modulation:

Modulation is the addition of information (or the signal) to an electronic or optical signal carrier. Modulation can be applied to direct current (mainly by turning it on and off), to alternating current, and to optical signals. One can think of blanket waving as a form of modulation used in smoke signal transmission (the carrier being a steady stream of smoke). Morse code, invented for telegraphy and still used in amateur radio, uses a binary (two-state) digital code similar to the code used by modern computers. For most of radio and telecommunication today, the carrier is alternating current (AC) in a given range of frequencies. Common modulation methods include:
Amplitude modulation (AM), in which the voltage applied to the carrier is varied
over time
Frequency modulation (FM), in which the frequency of the carrier waveform is
varied in small but meaningful amounts
Phase modulation (PM), in which the natural flow of the alternating current
waveform is delayed temporarily

Classification of Signals:
Some important classifications of signals

Analog vs. Digital signals: as stated in the previous lecture, a signal with a
magnitude that may take any real value in a specific range is called an analog signal
while a signal with amplitude that takes only a finite number of values is called a
digital signal.

Continuous-time vs. discrete-time signals: continuous-time signals may be analog or digital signals such that their magnitudes are defined for all values of t, while discrete-time signals are analog or digital signals with magnitudes that are defined at specific instants of time only and are undefined for other time instants.

Periodic vs. aperiodic signals: periodic signals are those that are constructed from a
specific shape that repeats regularly after a specific amount of time T0, [i.e., a
periodic signal f(t) with period T0 satisfies f(t) = f(t+nT0) for all integer values of
n], while aperiodic signals do not repeat regularly.

Deterministic vs. probabilistic signals: deterministic signals are those that can be
computed beforehand at any instant of time while a probabilistic signal is one that
is random and cannot be determined beforehand.

Energy vs. Power signals: as described below.

Energy and Power Signals


The total energy contained in and average power provided by a signal f(t) (which is a function of time) are defined as

E_f = ∫_{−∞}^{∞} |f(t)|^2 dt,

and

P_f = lim_{T→∞} (1/T) ∫_{−T/2}^{T/2} |f(t)|^2 dt,

respectively.

For periodic signals, the power P can be computed using a simpler form based on the periodicity of the signal as

P_{periodic f} = (1/T) ∫_{t0}^{t0+T} |f(t)|^2 dt,

where T here is the period of the signal and t0 is an arbitrary time instant chosen to simplify the computation of the integration (so that you only have to integrate over one period).

Classification of Signals into Power and Energy Signals


Most signals can be classified into Energy signals or Power signals. A signal is classified into an energy or a power signal according to the following criteria:

a) Energy Signals: an energy signal is a signal with finite energy and zero average power (0 ≤ E < ∞, P = 0),

b) Power Signals: a power signal is a signal with infinite energy but finite average power (0 < P < ∞, E → ∞).

Comments:

1. The square root of the average power P of a power signal is what is usually defined as the RMS value of that signal.

2. Your book says that if a signal approaches zero as t approaches ∞ then the signal is an energy signal. This is in most cases true but not always, as you can verify in part (d) of the following example.

3. All periodic signals are power signals (but not all nonperiodic signals are energy signals).

4. Any signal f that has limited amplitude (|f| < ∞) and is time limited (f = 0 for |t| > t0 for some t0 > 0) is an energy signal, as in part (g) of the following example.

Exercise 1: determine if the following signals are Energy signals, Power signals, or
neither, and evaluate E and P for each signal (see examples 2.1 and 2.2 on
pages 17 and 18 of your textbook for help).

a) a(t) = 3 sin(2πt)

This is a periodic signal, so it must be a power signal. Let us prove it.

E_a = ∫_{−∞}^{∞} |a(t)|^2 dt = ∫_{−∞}^{∞} |3 sin(2πt)|^2 dt = 9 ∫_{−∞}^{∞} (1/2)[1 − cos(4πt)] dt
    = (9/2) ∫_{−∞}^{∞} dt − (9/2) ∫_{−∞}^{∞} cos(4πt) dt → ∞

Notice that the evaluation of the last line in the above equation is infinite because of the first term. The second term is bounded, so it has no effect on the overall value of the energy.

Since a(t) is periodic with period T = 2π/(2π) = 1 second, we get

P_a = (1/1) ∫_0^1 |a(t)|^2 dt = ∫_0^1 |3 sin(2πt)|^2 dt = 9 ∫_0^1 (1/2)[1 − cos(4πt)] dt
    = (9/2) ∫_0^1 dt − (9/2) ∫_0^1 cos(4πt) dt = 9/2 − (9/(8π)) [sin(4πt)]_0^1 = 9/2

So, the energy of that signal is infinite and its average power is finite (9/2).
This means that it is a power signal as expected. Notice that the average
power of this signal is as expected (square of the amplitude divided by 2)
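As a quick numerical cross-check of the result above, here is a small sketch assuming NumPy is available; the step size and window lengths are arbitrary choices:

import numpy as np

# Numerical check: a(t) = 3 sin(2*pi*t) should have average power 9/2 = 4.5 over
# one period, while its energy keeps growing with the integration window.
dt = 1e-5
t1 = np.arange(0.0, 1.0, dt)                 # one period (T = 1 s)
a1 = 3.0 * np.sin(2.0 * np.pi * t1)
P_est = np.sum(np.abs(a1) ** 2) * dt / 1.0   # (1/T) * integral of |a(t)|^2 dt

t2 = np.arange(0.0, 100.0, dt)               # a much wider window
E_est = np.sum(np.abs(3.0 * np.sin(2.0 * np.pi * t2)) ** 2) * dt

print(P_est)   # ~ 4.5
print(E_est)   # ~ 450: grows in proportion to the window, so E -> infinity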

b) b(t) = 5 e^{−2|t|}

Let us first find the total energy of the signal.

E_b = ∫_{−∞}^{∞} |b(t)|^2 dt = ∫_{−∞}^{∞} (5 e^{−2|t|})^2 dt = 25 ∫_{−∞}^{∞} e^{−4|t|} dt = 2 · 25 ∫_0^{∞} e^{−4t} dt
    = 2 · 25 [−e^{−4t}/4]_0^{∞} = 25/4 + 25/4 = 50/4 = 12.5 J

The average power of the signal is

P_b = lim_{T→∞} (1/T) ∫_{−T/2}^{T/2} |b(t)|^2 dt = lim_{T→∞} (1/T) ∫_{−T/2}^{T/2} (5 e^{−2|t|})^2 dt
    = 25 lim_{T→∞} (1/T) ∫_{−T/2}^{T/2} e^{−4|t|} dt = 2 · 25 lim_{T→∞} (1/T) ∫_0^{T/2} e^{−4t} dt
    = 2 · 25 lim_{T→∞} (1/T) [−e^{−4t}/4]_0^{T/2} = (25/2) lim_{T→∞} (1 − e^{−2T}) / T = 0

So, the signal b(t) is definitely an energy signal.



c) c(t) = 4 e^{−3t} for |t| ≤ 5, and 0 for |t| > 5

d) d(t) = 1/√t for t ≥ 1, and 0 for t < 1

Let us first find the total energy of the signal.

E_d = ∫_{−∞}^{∞} |d(t)|^2 dt = ∫_1^{∞} (1/t) dt = [ln t]_1^{∞} → ∞

So, this signal is NOT an energy signal. However, it is also NOT a power
signal since its average power as shown below is zero.

The average power of the signal is

P_d = lim_{T→∞} (1/T) ∫_{−T/2}^{T/2} |d(t)|^2 dt = lim_{T→∞} (1/T) ∫_1^{T/2} (1/t) dt
    = lim_{T→∞} (1/T) [ln t]_1^{T/2} = lim_{T→∞} ln(T/2) / T

Using L'Hôpital's rule, we see that the power of the signal is zero. That is,

P_d = lim_{T→∞} ln(T/2) / T = lim_{T→∞} (1/T) / 1 = 0

So, not all signals that approach zero as time approaches positive and negative infinity are energy signals. They may not be power signals either.

e) e(t) = 7t^2

f) f(t) = 2 cos^2(2πt)

g) g(t) = 12 cos^2(2πt) for 8 ≤ t ≤ 31, and 0 elsewhere

AMPLITUDE MODULATION:

In amplitude modulation, the instantaneous amplitude of a carrier wave is varied in accordance with the instantaneous amplitude of the modulating signal. The main advantages of AM are its small bandwidth and simple transmitter and receiver designs. AM is implemented by mixing the carrier wave in a nonlinear device with the modulating signal. This produces upper and lower sidebands, which are the sum and difference frequencies of the carrier wave and the modulating signal.

The carrier signal is represented by


c(t) = A cos(wct)

The modulating signal is represented by

m(t) = m cos(wmt)

Then the final modulated signal is

s(t) = A [1 + m(t)] cos(wct)
     = A [1 + m cos(wmt)] cos(wct)
     = A cos(wct) + (Am/2) cos((wc + wm)t) + (Am/2) cos((wc − wm)t)

For demodulation reasons, the magnitude of m(t) is always kept less than 1 and its frequency much smaller than that of the carrier signal. The modulated signal has frequency components at the frequencies wc, wc + wm and wc − wm.
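A minimal sketch of this in Python (assuming NumPy; the carrier and message frequencies, amplitude and modulation index below are illustrative values, not taken from the text):

import numpy as np

fs = 100_000.0                 # sampling rate (Hz), arbitrary for the sketch
t = np.arange(0, 0.1, 1 / fs)  # 100 ms of signal
fc, fm = 10_000.0, 1_000.0     # assumed carrier and message frequencies
A, m = 1.0, 0.5                # carrier amplitude and modulation index (m < 1)

# AM: s(t) = A * [1 + m*cos(2*pi*fm*t)] * cos(2*pi*fc*t)
s = A * (1 + m * np.cos(2 * np.pi * fm * t)) * np.cos(2 * np.pi * fc * t)

# Inspect the spectrum: energy should appear only at fc and fc +/- fm.
S = np.abs(np.fft.rfft(s)) / len(s)
f = np.fft.rfftfreq(len(s), 1 / fs)
peaks = f[np.argsort(S)[-3:]]          # three largest spectral lines
print(np.sort(peaks))                  # ~ [ 9000. 10000. 11000.]

The three largest spectral lines land at the carrier and at the two side frequencies, matching the expansion above.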

DSBSC:

Double Sideband Suppressed Carrier Modulation: In amplitude modulation the amplitude of a high-frequency carrier is varied in direct proportion to the low-frequency (baseband) message signal. The carrier is usually a sinusoidal waveform, that is,

c(t) = Ac cos(ωct + φc)
or
c(t) = Ac sin(ωct + φc)

where:
Ac is the unmodulated carrier amplitude;
ωc = 2πfc is the unmodulated carrier angular frequency in radians/s;
φc is the unmodulated carrier phase, which we shall assume is zero.

The amplitude modulated carrier has the mathematical form

DSB-SC(t) = A(t) cos(ωct)

where A(t) is the instantaneous amplitude of the modulated carrier, and is a linear function of the message signal m(t). A(t) is also known as the envelope of the modulated signal. For double-sideband suppressed carrier (DSB-SC) modulation the amplitude is related to the message as follows:

A(t) = Ac m(t)

Consider a message signal with spectrum (Fourier transform) M(ω) which is band limited to 2πB rad/s as shown in Figure 1(b). The bandwidth of this signal is B Hz and ωc is chosen such that ωc >> 2πB. Applying the modulation theorem, the Fourier transform of the modulated signal is

A(t) cos(ωct) = Ac m(t) cos(ωct) ⇌ (Ac/2) [M(ω − ωc) + M(ω + ωc)]

GENERATION OF DSBSC:
The DSB-SC can be generated using either the balanced modulator or the ring-modulator.
The balanced modulator uses two identical AM generators along with an adder. The two
amplitude modulators have a common carrier with one of them modulating the input
message , and the other modulating the inverted message . Generation of AM is not simple,
and to have two AM generators with identical operating conditions is extremely difficult.

Hence, laboratory implementation of the DSB-SC usually uses the ring modulator, shown in figure 1.

Figure 1: The ring modulator used for the generation of the double-sideband suppressed-carrier (DSB-SC) signal
This standard form of DSB-SC generation is the most preferred method of laboratory
implementation. However, it cannot be used for the generation of the AM waveform.
The DSB-SC and the DSB forms of AM are closely related: the DSB-SC with the
addition of the carrier becomes the DSB, while the DSB with the carrier removed results in
the DSB-SC form of modulation. Yet, existing methods of DSB cannot be used for the
generation of the DSB-SC. Similarly the ring modulator cannot be used for the generation
of the DSB. These two forms of modulation are generated using different methods. Our
attempt in this work is to propose a single circuit capable of generating both the DSB-SC
and the DSB forms of AM.
THE MODIFIED SWITCHING MODULATOR:
The block diagram of the modified switching modulator given in figure 1, has all
the blocks of the switching modulator, but with an additional active device. In this case, the
active device has to be of three terminals to enable it being used as a controlled switch.
Another significant change is that of the adder being shifted after the active device. These

changes in the switching-modulator enable the carrier to independently control the


switching action of the active device, and thus eliminate the restriction existing in the usual
switching modulator (equation (2)). In addition, the same circuit can generate the DSB-SC waveform. Thus the task of the modulators given in figures 1 and 2 is accomplished by the
single modulator of figure 3.

Figure 2: The modified switching modulator
It is possible to obtain the AM or the DSB-SC waveform from the modified switching modulator of figure 3 by just varying the amplitude of the square-wave carrier. It may be noted that the carrier performs two tasks: (i) it controls the switching action of the active devices and (ii) it controls the depth of modulation of the generated AM waveform. Thus, the proposed modification in the switching modulator enables the generation of both the AM and the DSB-SC from a single circuit. Also, it may be noted that the method is devoid of any assumptions or stringent, difficult-to-maintain operating conditions, as in existing low-power generation of the AM. We now implement the modified switching modulator and record the observed output in the next section.
Experimental results
The circuit implemented for testing the proposed method is given in figure 4, which
uses transistors CL-100 and CK-100 for controlled switches, two transformers for the
adder, followed by a passive BPF. The square-wave carrier and the sinusoidal message are

given from a function generator (6 MHz Aplab FG6M). The waveforms are observed on a mixed-signal oscilloscope (100 MHz Agilent 54622D, capable of recording the output in .tif format).

Figure 3: The implementation of the modified switching modulator to generate the AM and the DSB-SC waveforms
The modified switching modulator is tested using a single tone message of 706 Hz,
with a square-wave carrier of frequency 7.78 kHz. The depth of modulation of the
generated waveform can be varied either by varying the amplitude of the carrier or by
varying the amplitude of the signal. Figure 5 has the results of the modulated waveforms
obtained using the modified switching modulator. It can be seen that the same circuit is
able to generate AM for varying depths of modulation, including the over-modulation and
the DSB-SC. The quality of the modulated waveforms is comparable to that obtained using
industry standard communication modules (like the LabVolt for example).

Properties of DSB-SC Modulation:

(a) There is a 180° phase reversal at the points where the message m(t) (and hence A(t)) goes negative. This is typical of DSB-SC modulation.

(b) The bandwidth of the DSB-SC signal is double that of the message signal, that is, BW_DSB-SC = 2B (Hz).

(c) The modulated signal is centered at the carrier frequency ωc with two identical sidebands (double-sideband): the lower sideband (LSB) and the upper sideband (USB). Being identical, they both convey the same message component.

(d) The spectrum contains no isolated carrier. Thus the name suppressed carrier.

(e) The 180° phase reversal causes the positive (or negative) side of the envelope to have a shape different from that of the message signal. This is known as envelope distortion, which is typical of DSB-SC modulation.

(f) The power in the modulated signal is contained entirely in the two sidebands.

Generation of DSB-SC Signals

The circuits for generating modulated signals are known as modulators. The basic
modulators are Nonlinear, Switching and Ring modulators. Conceptually, the simplest
modulator is the product or multiplier modulator which is shown in figure 1-a. However, it
is very difficult (and expensive) in practice to design a product modulator that maintains
amplitude linearity at high carrier frequencies. One way of replacing the modulator stage
is by using a non-linear device. We use the non-linearity to generate a harmonic that
contains the product term, then use a BPF to separate the term of interest. Figure 3 shows a block diagram of a nonlinear DSB-SC modulator. Figure 4 shows a double balanced modulator that uses diodes as the non-linear device and then uses a BPF to separate the product term.

The received DSB-SC signal is

Sm(t) = DSB-SC(t) = Ac m(t) cos(ωct)

The receiver first generates an exact (coherent) replica (same phase and frequency) of the unmodulated carrier

Sc(t) = cos(ωct)

The coherent carrier is then multiplied with the received signal to give

Sm(t) · Sc(t) = Ac m(t) cos(ωct) · cos(ωct)
             = (1/2) Ac m(t) + (1/2) Ac m(t) cos(2ωct)

The first term is the desired baseband signal while the second is a band-pass signal centered at 2ωc. A low-pass filter with bandwidth equal to that of m(t) will pass the first term and reject the band-pass component.
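A rough numerical illustration of coherent detection (assuming NumPy/SciPy; the frequencies and the 5th-order Butterworth low-pass filter are arbitrary illustrative choices, not a prescribed receiver design):

import numpy as np
from scipy.signal import butter, filtfilt

fs = 100_000.0
t = np.arange(0, 0.1, 1 / fs)
fc, fm = 10_000.0, 500.0                 # assumed carrier and message frequencies
m = np.cos(2 * np.pi * fm * t)           # message
dsb = m * np.cos(2 * np.pi * fc * t)     # DSB-SC: m(t) * cos(wc*t)

# Coherent detection: multiply by a locally generated, phase-aligned carrier...
v = dsb * np.cos(2 * np.pi * fc * t)     # = m/2 + (m/2) cos(2*wc*t)

# ...then low-pass filter to keep the m(t)/2 term and reject the 2*fc component.
b, a = butter(5, 2 * fm / (fs / 2))      # cutoff at 2*fm (normalized to Nyquist)
m_hat = 2 * filtfilt(b, a, v)            # scale by 2 to undo the 1/2 factor

print(np.max(np.abs(m_hat[1000:-1000] - m[1000:-1000])))   # small residual error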

Single Side Band (SSB) Modulation:

In DSB-SC it is observed that there is symmetry in the band structure. So, even if only one half is transmitted, the other half can be recovered at the receiver. By doing so, the bandwidth and power of transmission are reduced by half.

Depending on which half of the DSB-SC signal is transmitted, there are two types of SSB (a generation sketch is given after this list):

1. Lower Side Band (LSB) Modulation
2. Upper Side Band (USB) Modulation
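One standard way of generating SSB, the phase-shift method built around a 90° phase shifter (a Hilbert transform), can be sketched as follows. This is an illustration with assumed frequencies, not a circuit described in this section:

import numpy as np
from scipy.signal import hilbert

fs = 100_000.0
t = np.arange(0, 0.1, 1 / fs)
fc, fm = 10_000.0, 1_000.0                    # assumed carrier / message frequencies
m = np.cos(2 * np.pi * fm * t)

# Phase-shift SSB: m(t)cos(wc t) -/+ m_hat(t)sin(wc t), m_hat = Hilbert transform of m
m_h = np.imag(hilbert(m))                     # 90-degree phase-shifted message
usb = m * np.cos(2 * np.pi * fc * t) - m_h * np.sin(2 * np.pi * fc * t)
lsb = m * np.cos(2 * np.pi * fc * t) + m_h * np.sin(2 * np.pi * fc * t)

f = np.fft.rfftfreq(len(t), 1 / fs)
print(f[np.argmax(np.abs(np.fft.rfft(usb)))])   # ~ 11000 Hz (upper sideband only)
print(f[np.argmax(np.abs(np.fft.rfft(lsb)))])   # ~  9000 Hz (lower sideband only)

For a single-tone message the upper- and lower-sideband outputs reduce to single tones at fc + fm and fc − fm, which the spectral peaks confirm.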

Vestigial Side Band (VSB) Modulation:


The following are the drawbacks of SSB signal generation:

1. Generation of an SSB signal is difficult.
2. Selective filtering is to be done to get the original signal back.
3. The phase shifter should be exactly tuned to 90°.

To overcome these drawbacks, VSB modulation is used. It can be viewed as a compromise between SSB and DSB-SC.

In VSB:

1. One sideband is not rejected fully.
2. One sideband is transmitted fully and a small part (vestige) of the other sideband is transmitted.

The transmission bandwidth is BWv = B + fv, where fv is the width of the vestigial frequency band.

FREQUENCY TRANSLATION:
The transfer of signals occupying a specified frequency band, such
as a channel or group of channels, from one portion of the frequency spectrum to another,
in such a way that the arithmetic frequency difference of signals within the band is
unaltered.
FREQUENCY-DIVISION MULTIPLEXING (FDM):
It is a form of signal multiplexing which involves assigning non-overlapping
frequency ranges to different signals or to each "user" of a medium.
FDM can also be used to combine signals before final modulation onto a carrier
wave. In this case the carrier signals are referred to as subcarriers: an example is stereo
FM transmission, where a 38 kHz subcarrier is used to separate the left-right difference
signal from the central left-right sum channel, prior to the frequency modulation of the
composite signal. A television channel is divided into subcarrier frequencies for video,
color, and audio. DSL uses different frequencies for voice and
for upstream and downstream data transmission on the same conductors, which is also an
example of frequency duplex. Where frequency-division multiplexing is used to allow
multiple users to share a physical communications channel, it is called frequency-division
multiple access (FDMA).
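A toy FDM sketch (assuming NumPy; the subcarrier frequencies and messages are made-up illustration values): two baseband signals are translated to disjoint frequency slots so they can share one medium and be separated again by band-pass filtering.

import numpy as np

fs = 200_000.0
t = np.arange(0, 0.05, 1 / fs)

# Two baseband "users", each translated to its own non-overlapping slot
# by modulating a different subcarrier.
m1 = np.cos(2 * np.pi * 1_000 * t)            # user 1 baseband
m2 = np.cos(2 * np.pi * 1_500 * t)            # user 2 baseband
f1, f2 = 20_000.0, 30_000.0                   # assumed subcarrier frequencies

fdm = m1 * np.cos(2 * np.pi * f1 * t) + m2 * np.cos(2 * np.pi * f2 * t)

# The composite spectrum shows user 1 around 20 kHz and user 2 around 30 kHz,
# so a band-pass filter per slot can recover each user at the receiver.
f = np.fft.rfftfreq(len(t), 1 / fs)
S = np.abs(np.fft.rfft(fdm))
print(f[S > 0.25 * S.max()])                  # ~ [19000. 21000. 28500. 31500.] Hz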
NONLINEAR DISTORTION:
It is a term used (in fields such as electronics, audio and telecommunications) to
describe the phenomenon of a non-linear relationship between the "input" and "output"
signals of - for example - an electronic device.
EFFECTS OF NONLINEARITY:

Nonlinearity can have several effects, which are unwanted in typical situations. If the input-output relationship is written as a power series, y = a1 x + a2 x^2 + a3 x^3 + ..., then the a3 term, for example, would, when the input is a sine wave with frequency ω, result in an extra sine wave at 3ω (see the sketch at the end of this section).

In certain situations, this spurious signal can be filtered away because the "harmonic" 3ω lies far outside the frequency range used, but in cable television, for example, third-order distortion could cause a 200 MHz signal to interfere with the regular channel at 600 MHz.
Nonlinear distortion applied to a superposition of two signals at different frequencies
causes the circuit to act as a frequency mixer, creating intermodulation distortion.
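The sketch below (assuming NumPy; the tone frequencies and the coefficients a1, a3 are illustrative values) passes two tones through a memoryless cubic nonlinearity and lists the spectral lines that appear, showing both the 3ω harmonics and the intermodulation products mentioned above.

import numpy as np

fs = 4_000_000.0
t = np.arange(0, 0.005, 1 / fs)
f1, f2 = 200e3, 300e3                       # two input tones (illustrative values)
x = np.cos(2 * np.pi * f1 * t) + np.cos(2 * np.pi * f2 * t)

# Memoryless nonlinearity y = a1*x + a3*x**3 (coefficients chosen for illustration).
a1, a3 = 1.0, 0.2
y = a1 * x + a3 * x ** 3

f = np.fft.rfftfreq(len(t), 1 / fs)
S = np.abs(np.fft.rfft(y))
print(np.sort(f[S > 0.02 * S.max()]))
# Besides f1 and f2, the cubic term creates 3*f1, 3*f2, 2*f1 +/- f2 and 2*f2 +/- f1.
# The intermodulation products 2*f1 - f2 = 100 kHz and 2*f2 - f1 = 400 kHz fall
# close to the original tones and cannot simply be filtered away.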

PART A (2 MARK) QUESTIONS.

1. As related to AM, what is over modulation, under modulation and 100% modulation?
2. Draw the frequency spectrum of VSB, where it is used
3. Define modulation index of an AM signal
4. Draw the circuit diagram of an envelope detector
5. What is the mid-frequency of the IF section of AM receivers and what is its bandwidth?
6. A transmitter radiates 9 kW without modulation and 10.125 kW after modulation.
Determine depth of modulation.
7. Draw the spectrum of DSB.
8. Define the transmission efficiency of AM signal.
9. Draw the phasor diagram of AM signal.
10. Advantages of SSB.
11. Disadvantages of DSB-FC.
12. What are the advantages of a superheterodyne receiver?
13. Advantages of VSB.
14. Distinguish between low level and high level modulator.
15. Define FDM & frequency translation.
16. Give the parameters of receiver.
17. Define sensitivity and selectivity.
18. Define fidelity.
19. What is meant by image frequency?
20. Define multitone modulation.

PART B (16 MARK) QUESTIONS

1. Explain the generation of AM signals using square law modulator. (16)


2. Explain the detection of AM signals using envelope detector. (16)
3. Explain about the Balanced modulator to generate a DSB-SC signal. (16)
4. Explain about coherent detector to detect SSB-SC signal. (16)
5. Explain the generation of SSB using balanced modulator. (16)
6. Draw the circuit diagram of the Ring modulator and explain its operation. (16)
7. Discuss the coherent detection of a DSB-SC modulated wave with a block diagram of the detector and explain. (16)
8. Explain the working of Superheterodyne receiver with its parameters. (16)
9. Draw the block diagram for the generation and demodulation of a VSB signal and
explain the principle of operation. (16)
10. Write short notes on frequency translation and FDM? (16)

UNIT II
ANGLE MODULATION

Phase and frequency modulation


Single tone
Narrow band FM
Wideband FM
Transmission bandwidth
Generation of FM signal.
Demodulation of FM signal

PHASE MODULATION:
Phase modulation (PM) is a form of modulation that represents information as
variations in the instantaneous phase of a carrier wave.
Unlike its more popular counterpart, frequency modulation (FM), PM is not very
widely used for radio transmissions. This is because it tends to require more complex
receiving hardware and there can be ambiguity problems in determining whether, for
example, the signal has changed phase by +180° or −180°. PM is used, however, in digital
music synthesizers such as the Yamaha DX7, even though these instruments are usually
referred to as "FM" synthesizers (both modulation types sound very similar, but PM is
usually easier to implement in this area).

Figure: An example of phase modulation. The top diagram shows the modulating signal superimposed on the carrier wave; the bottom diagram shows the resulting phase-modulated signal.

PM changes the phase angle of the complex envelope in direct proportion to the message signal.

Suppose that the signal to be sent (called the modulating or message signal) is m(t) and the carrier onto which the signal is to be modulated is

c(t) = Ac sin(ωct + φc)

(in words: carrier(time) = carrier amplitude × sin(carrier frequency × time + phase shift)).

This makes the modulated signal

y(t) = Ac sin(ωct + m(t) + φc)

This shows how m(t) modulates the phase - the greater m(t) is at a point in time, the
greater the phase shift of the modulated signal at that point. It can also be viewed as a
change of the frequency of the carrier signal, and phase modulation can thus be considered
a special case of FM in which the carrier frequency modulation is given by the time
derivative of the phase modulation.
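A small sketch of that equivalence (assuming NumPy/SciPy, with illustrative carrier/message frequencies and phase sensitivity kp): the same waveform is built once as PM and once as FM driven by dm/dt.

import numpy as np
from scipy.integrate import cumulative_trapezoid

fs = 100_000.0
t = np.arange(0, 0.1, 1 / fs)
fc, fm = 10_000.0, 200.0                 # assumed carrier / message frequencies
kp = 0.8                                 # phase sensitivity, rad per unit message
m = np.sin(2 * np.pi * fm * t)           # message signal

# Phase modulation: the carrier phase follows the message directly.
pm = np.cos(2 * np.pi * fc * t + kp * m)

# Equivalent FM view: instantaneous frequency fc + (kp / 2*pi) * dm/dt,
# integrated back into a phase.
f_inst = fc + (kp / (2 * np.pi)) * np.gradient(m, t)
phase = 2 * np.pi * cumulative_trapezoid(f_inst, t, initial=0.0)
fm_view = np.cos(phase)

print(np.max(np.abs(pm - fm_view)))      # small (~1e-3): the two descriptions agree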
The spectral behavior of phase modulation is difficult to derive, but the
mathematics reveals that there are two regions of particular interest:

For small amplitude signals, PM is similar to amplitude modulation (AM) and exhibits its unfortunate doubling of baseband bandwidth and poor efficiency.

For a single large sinusoidal signal, PM is similar to FM, and its bandwidth is approximately 2(h + 1)fM, where fM = ωm/2π and h is the modulation index defined below. This is also known as Carson's rule for PM.

MODULATION INDEX:
As with other modulation indices, this quantity indicates by how much the
modulated variable varies around its unmodulated level. It relates to the variations in the
phase of the carrier signal:

h = Δθ,

where Δθ is the peak phase deviation. Compare to the modulation index for frequency modulation.

Variable-capacitance diode phase modulator:

This circuit varies the phase between two square waves through at least 180°. This
capability finds application in fixed-frequency, phase shift, resonant-mode converters. ICs
such as the UC3875 usually only work up to about 500 kHz, whereas this circuit can be
extended up to tens of megahertz. In addition, the circuit shown uses low-cost components.
This example was used for a high-efficiency 2-MHz RF power supply.

The signal is delayed at each gate by the RC network formed by the 4.7k input
resistor and capacitance of the 1N4003 diode. The capacitance of the diode, and hence
delay, can be varied by controlling the reverse dc bias applied across the diode. The 100k
resistor to ground at the input to the second stage corrects a slight loss of 1:1 symmetry.
The fixed delay for output A adjusts the phase to be approximately in phase at a 5-V bias.

Note that the control voltage should not drop below approximately 3 V, because the diodes
will start to be forward-biased and the signal will be lost.

FREQUENCY MODULATION:

Frequency modulation (FM) conveys information over a carrier wave by varying its instantaneous frequency. This is in contrast with amplitude modulation, in which
the amplitude of the carrier is varied while its frequency remains constant.
In analog applications, the difference between the instantaneous and the base frequency of
the carrier is directly proportional to the instantaneous value of the input signal
amplitude. Digital data can be sent by shifting the carrier's frequency among a set of
discrete values, a technique known as frequency-shift keying.
Frequency modulation can be regarded as phase modulation where the carrier phase
modulation is the time integral of the FM modulating signal.
FM is widely used for broadcasting of music and speech, and in two-way
radio systems, in magnetic tape recording systems, and certain video transmission systems.
In radio systems, frequency modulation with sufficient bandwidth provides an advantage in
cancelling naturally-occurring noise. Frequency-shift keying (digital FM) is widely used in
data and fax modems.
THEORY:
Suppose the baseband data signal (the message) to be transmitted is xm(t) and the sinusoidal carrier is

xc(t) = Ac cos(2πfct),

where fc is the carrier's base frequency and Ac is the carrier's amplitude. The modulator combines the carrier with the baseband data signal to get the transmitted signal:

y(t) = Ac cos(2πfct + 2πfΔ ∫_0^t xm(τ) dτ)    (1)

In this equation, f(τ) = fc + fΔ xm(τ) is the instantaneous frequency of the oscillator and fΔ is the frequency deviation, which represents the maximum shift away from fc in one direction, assuming xm(t) is limited to the range ±1.

Although it may seem that this limits the frequencies in use to fc ± fΔ, this neglects the distinction between instantaneous frequency and spectral frequency. The frequency spectrum of an actual FM signal has components extending out to infinite frequency, although they become negligibly small beyond a point.
SINUSOIDAL BASEBAND SIGNAL:
While it is an over-simplification, a baseband modulated signal may be approximated by a sinusoidal continuous wave signal xm(t) = Am cos(2πfmt) with a frequency fm. The integral of such a signal is

∫_0^t xm(τ) dτ = (Am / (2πfm)) sin(2πfmt)

Thus, in this specific case, equation (1) above simplifies to:

y(t) = Ac cos(2πfct + (Am fΔ / fm) sin(2πfmt)),

where the amplitude Am of the modulating sinusoid is represented by the peak deviation fΔ (see frequency deviation).

The harmonic distribution of a sine wave carrier modulated by such a sinusoidal signal can be represented with Bessel functions; this provides a basis for a
mathematical understanding of frequency modulation in the frequency domain.
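A sketch of that statement (assuming NumPy/SciPy; the carrier frequency, modulating frequency and modulation index beta are illustrative values): the measured amplitude of the spectral line at fc + n·fm matches |J_n(beta)|.

import numpy as np
from scipy.special import jv

fs = 1_000_000.0
t = np.arange(0, 0.2, 1 / fs)
fc, fm = 100_000.0, 1_000.0          # assumed carrier / modulating frequencies
beta = 2.4                           # modulation index (peak deviation / fm)

# Single-tone FM: s(t) = cos(2*pi*fc*t + beta*sin(2*pi*fm*t))
s = np.cos(2 * np.pi * fc * t + beta * np.sin(2 * np.pi * fm * t))

# Spectrum: lines at fc + n*fm with amplitudes |J_n(beta)| (Bessel functions).
S = 2 * np.abs(np.fft.rfft(s)) / len(s)
f = np.fft.rfftfreq(len(s), 1 / fs)
for n in range(4):
    line = S[np.argmin(np.abs(f - (fc + n * fm)))]
    print(n, round(line, 3), round(abs(jv(n, beta)), 3))   # measured vs |J_n(beta)|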
MODULATION INDEX:
As with other modulation indices, this quantity indicates by how much the modulated variable varies around its unmodulated level. It relates to the variations in the frequency of the carrier signal:

h = Δf / fm,

where fm is the highest frequency component present in the modulating signal xm(t), and Δf is the peak frequency deviation, i.e. the maximum deviation of the instantaneous frequency from the carrier frequency. If h << 1, the modulation is called narrowband FM, and its bandwidth is approximately 2fm. If h >> 1, the modulation is called wideband FM and its bandwidth is approximately 2Δf. While wideband FM uses more bandwidth, it can improve the signal-to-noise ratio significantly.


With a tone-modulated FM wave, if the modulation frequency is held constant and
the modulation index is increased, the (non-negligible) bandwidth of the FM signal
increases, but the spacing between spectra stays the same; some spectral components
decrease in strength as others increase. If the frequency deviation is held constant and the
modulation frequency increased, the spacing between spectra increases.
Frequency modulation can be classified as narrow band if the change in the carrier
frequency is about the same as the signal frequency, or as wide-band if the change in the
carrier frequency is much higher (modulation index >1) than the signal frequency. [1] For
example, narrowband FM is used for two way radio systems such as Family Radio
Service where the carrier is allowed to deviate only 2.5 kHz above and below the center
frequency, carrying speech signals of no more than 3.5 kHz bandwidth. Wide-band FM is
used for FM broadcasting where music and speech is transmitted with up to 75 kHz
deviation from the center frequency, carrying audio with up to 20 kHz bandwidth.
CARSON'S RULE:
A rule of thumb, Carson's rule states that nearly all (~98%) of the power of a frequency-modulated signal lies within a bandwidth of

BW = 2(Δf + fm),

where Δf, as defined above, is the peak deviation of the instantaneous frequency from the center carrier frequency, and fm is the highest frequency in the modulating signal.
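As a quick worked example using the broadcast-FM figures quoted earlier (75 kHz peak deviation and audio of up to 20 kHz), Carson's rule gives BW ≈ 2(75 + 20) kHz = 190 kHz, which is of the order of the 200 kHz channel spacing commonly used for FM broadcasting.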

NOISE QUIETING:
The noise power decreases as the signal power increases, therefore the SNR goes
up significantly.
MODULATION:
FM signals can be generated using either direct or indirect frequency modulation.
Direct FM modulation can be achieved by directly feeding the message into the
input of a VCO.

For indirect FM modulation, the message signal is integrated to generate a phase-modulated signal. This is used to modulate a crystal-controlled oscillator, and the
result is passed through a frequency multiplier to give an FM signal.
DEMODULATION:
Many FM detector circuits exist. One common method for recovering the
information signal is through a Foster-Seeley discriminator. A phase-lock loop can be used
as an FM demodulator.
Slope detection demodulates an FM signal by using a tuned circuit, which has its
resonant frequency slightly offset from the carrier frequency. As the frequency rises and
falls, the tuned circuit provides a changing amplitude of response, converting FM to AM.
AM receivers may detect some FM transmissions by this means, though it does not provide
an efficient method of detection for FM broadcasts.
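None of the specific circuits above is modeled here, but the common underlying idea (recovering the instantaneous frequency) can be sketched numerically, assuming NumPy/SciPy; all frequencies and the deviation below are illustrative values:

import numpy as np
from scipy.signal import hilbert
from scipy.integrate import cumulative_trapezoid

fs = 1_000_000.0
t = np.arange(0, 0.02, 1 / fs)
fc, fm, fdev = 100_000.0, 1_000.0, 10_000.0   # assumed carrier, message, peak deviation
m = np.sin(2 * np.pi * fm * t)

# FM modulation: integrate the message into the carrier phase.
phase = 2 * np.pi * fc * t + 2 * np.pi * fdev * cumulative_trapezoid(m, t, initial=0.0)
s = np.cos(phase)

# Demodulation: recover the instantaneous frequency from the analytic signal,
# then remove the carrier offset and rescale by the deviation.
inst_phase = np.unwrap(np.angle(hilbert(s)))
inst_freq = np.gradient(inst_phase, t) / (2 * np.pi)
m_hat = (inst_freq - fc) / fdev

err = np.max(np.abs(m_hat[500:-500] - m[500:-500]))   # trim Hilbert edge effects
print(err)                                            # small recovery error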
APPLICATIONS:
MAGNETIC TAPE STORAGE:
FM is also used at intermediate frequencies by all analog VCR systems,
including VHS, to record both the luminance (black and white) and the chrominance
portions of the video signal. FM is the only feasible method of recording video to and
retrieving video from Magnetic tape without extreme distortion, as video signals have a
very large range of frequency components, from a few hertz to several megahertz, too wide for equalizers to work with due to electronic noise below −60 dB. FM also keeps the
tape at saturation level, and therefore acts as a form of noise reduction, and a
simple limiter can mask variations in the playback output, and the FM capture effect
removes print-through and pre-echo. A continuous pilot-tone, if added to the signal as
was done on V2000 and many Hi-band formats can keep mechanical jitter under control
and assist time base correction.
These FM systems are unusual in that they have a ratio of carrier to maximum
modulation frequency of less than two; contrast this with FM audio broadcasting where the
ratio is around 10,000. Consider for example a 6 MHz carrier modulated at a 3.5 MHz rate;
by Bessel analysis the first sidebands are on 9.5 and 2.5 MHz, while the second sidebands are on 13 MHz and −1 MHz. The result is a sideband of reversed phase on +1 MHz; on demodulation, this results in an unwanted output at 6 − 1 = 5 MHz. The system must be designed so that this is at an acceptable level.

SOUND:
FM is also used at audio frequencies to synthesize sound. This technique, known
as FM synthesis, was popularized by early digital synthesizers and became a standard
feature for several generations of personal computer sound cards.
RADIO:
The wideband FM (WFM) requires a wider signal bandwidth than amplitude
modulation by an equivalent modulating signal, but this also makes the signal more robust
against noise and interference. Frequency modulation is also more robust against simple
signal amplitude fading phenomena. As a result, FM was chosen as the
modulation standard for high frequency, high fidelity radio transmission: hence the term
"FM radio" (although for many years the BBC called it "VHF radio", because commercial
FM broadcasting uses a well-known part of the VHF bandthe FM broadcast band).
FM receivers employ a special detector for FM signals and exhibit
a phenomenon called capture effect, where the tuner is able to clearly receive the stronger
of two stations being broadcast on the same frequency. Problematically
however, frequency drift or lack of selectivity may cause one station or signal to be
suddenly overtaken by another on an adjacent channel. Frequency drift typically
constituted a problem on very old or inexpensive receivers, while inadequate selectivity
may plague any tuner.
An FM signal can also be used to carry a stereo signal: see FM stereo. However,
this is done by using multiplexing and demultiplexing before and after the FM process. The
rest of this article ignores the stereo multiplexing and demultiplexing process used in
"stereo FM", and concentrates on the FM modulation and demodulation process, which is
identical in stereo and mono processes.
A high-efficiency radio-frequency switching amplifier can be used to transmit FM
signals (and other constant-amplitude signals). For a given signal strength (measured at the
receiver antenna), switching amplifiers use less battery power and typically cost less than
a linear amplifier. This gives FM another advantage over other modulation schemes that
require linear amplifiers, such as AM and QAM.
FM is commonly used at VHF radio frequencies for high-fidelity broadcasts of music and speech (see FM broadcasting). Normal (analog) TV sound
is also broadcast using FM. A narrow band form is used for voice communications in

commercial and amateur radio settings. In broadcast services, where audio fidelity is
important, wideband FM is generally used. In two-way radio, narrowband FM (NBFM) is
used to conserve bandwidth for land mobile radio stations, marine mobile, and many other
radio services.

VARACTOR FM MODULATOR:

Varactor FM Modulator
Another FM modulator which is widely used in transistorized circuitry uses a
voltage-variable capacitor (VARACTOR). The varactor is simply a diode, or pn junction,
that is designed to have a certain amount of capacitance between junctions. View (A) of
figure 2 shows the varactor schematic symbol. A diagram of a varactor in a simple
oscillator circuit is shown in view (B).This is not a working circuit, but merely a simplified
illustration. The capacitance of a varactor, as with regular capacitors, is determined by the
area of the capacitor plates and the distance between the plates. The depletion region in the
varactor is the dielectric and is located between the p and n elements, which serve as the
plates. Capacitance is varied in the varactor by varying the reverse bias which controls the
thickness of the depletion region. The varactor is so designed that the change in

capacitance is linear with the change in the applied voltage. This is a special design
characteristic of the varactor diode. The varactor must not be forward biased because it
cannot tolerate much current flow. Proper circuit design prevents the application of
forward bias.

IMPORTANT QUESTION
PART A
All questions Two Marks:
1. What do you mean by narrowband and wideband FM?
2. Give the frequency spectrum of narrowband FM?
3. Why is the Armstrong method superior to the reactance modulator?
4. Define frequency deviation in FM?
5. State Carson's rule of FM bandwidth.
6. Differentiate between narrowband and wideband FM.
7. What are the advantages of FM?
8. Define PM.
9. What is meant by indirect FM generation?
10. Draw the phasor diagram of narrow band FM.
11. Write the expression for the spectrum of a single tone FM signal.
12. What are the applications of phase locked loop?
13. Define modulation index of FM and PM.
14. Differentiate between phase and frequency modulation.
15. A carrier of frequency 100 MHz is frequency modulated by a signal x(t) = 20 sin(200×10³ t). What is the bandwidth of the FM signal if the frequency sensitivity of the modulator is 25 kHz per volt?
16. What is the bandwidth required for an FM wave in which the modulating signal frequency is 2 kHz and the maximum frequency deviation is 12 kHz?
17. Determine and draw the instantaneous frequency of a wave having a total phase angle given by θ(t) = 2000t + sin 10t.
18. Draw the block diagram of PLL.

PART B

1. Explain the indirect method of generation of FM wave and any one method of
demodulating an FM wave. (16)
2. Derive the expression for the frequency modulated signal. Explain what is meant by
narrowband FM and wideband FM using the expression. (16)
3. Explain any two techniques of demodulation of FM. (16)
4. Explain the working of the reactance tube modulator and derive an expression to show
how the variation of the amplitude of the input signal changes the frequency of the output
signal of the modulator. (16)
5. Discuss the effects of nonlinearities in FM. (8)
6. Discuss in detail FM stereo multiplexing. (8)
7. Draw the frequency spectrum of FM and explain. Explain how Varactor diode can be
used for frequency modulation. (16)
8. Discuss the indirect method of generating a wide-band FM signal. (8)
9. Draw the circuit diagram of Foster-Seelay discriminator and explain its working. (16)
10. Explain the principle of indirect method of generating a wide-band FM signal with a
neat block diagram. (8)

UNIT III

RANDOM PROCESS

Review of probability.
Random variables and random process.
Gaussian process.
Noise.
Shot noise.
Thermal noise.
White noise.
Narrow band noise.
Noise temperature.
Noise figure.

INTRODUCTION OF PROBABILITY:
Probability theory is the study of uncertainty. Through this class, we will be relying on concepts
from probability theory for deriving machine learning algorithms. These notes attempt to cover the
basics of probability theory at a level appropriate for this course. The mathematical theory of probability is very
sophisticated, and delves into a branch of analysis known as measure theory. In these notes, we provide a
basic treatment of probability that does not address these finer details.

1 Elements of probability
In order to define a probability on a set we need a few basic elements:

Sample space Ω: The set of all the outcomes of a random experiment. Here, each outcome ω ∈ Ω can be thought of as a complete description of the state of the real world at the end of the experiment.

Set of events (or event space) F: A set whose elements A ∈ F (called events) are subsets of Ω (i.e., A ⊆ Ω is a collection of possible outcomes of an experiment).

Probability measure: A function P : F → R that satisfies the following properties:
- P(A) ≥ 0, for all A ∈ F
- P(Ω) = 1
- If A1, A2, ... are disjoint events (i.e., Ai ∩ Aj = ∅ whenever i ≠ j), then
  P(∪_i Ai) = Σ_i P(Ai)

These three properties are called the Axioms of Probability.


Example: Consider the event of tossing a six-sided die. The sample space is Ω = {1, 2, 3, 4, 5, 6}. We can define different event spaces on this sample space. For example, the simplest event space is the trivial event space F = {∅, Ω}. Another event space is the set of all subsets of Ω. For the first event space, the unique probability measure satisfying the requirements above is given by P(∅) = 0, P(Ω) = 1. For the second event space, one valid probability measure is to assign the probability of each set in the event space to be i/6, where i is the number of elements of that set.
Properties:
- If A ⊆ B, then P(A) ≤ P(B).
- P(A ∩ B) ≤ min(P(A), P(B)).
- (Union Bound) P(A ∪ B) ≤ P(A) + P(B).
- P(Ω \ A) = 1 − P(A).
- (Law of Total Probability) If A1, ..., Ak are a set of disjoint events such that ∪_{i=1}^k Ai = Ω, then Σ_{i=1}^k P(Ai) = 1.

2 Random variables
Consider an experiment in which we flip 10 coins, and we want to know the number of coins that come up heads. Here, the elements of the sample space Ω are 10-length sequences of heads and tails. For example, we might have ω0 = (H, H, T, H, T, H, H, T, T, T) ∈ Ω. However, in practice, we usually do not care about the probability of obtaining any particular sequence of heads and tails. Instead we usually care about real-valued functions of outcomes, such as the number of heads that appear among our 10 tosses, or the length of the longest run of tails. These functions, under some technical conditions, are known as random variables.

More formally, a random variable X is a function X : Ω → R. Typically, we will denote random variables using upper case letters X(ω) or more simply X (where the dependence on the random outcome ω is implied). We will denote the value that a random variable may take on using lower case letters x.

Example: In our experiment above, suppose that X(ω) is the number of heads which occur in the sequence of tosses ω. Given that only 10 coins are tossed, X(ω) can take only a finite number of values, so it is known as a discrete random variable. Here, the probability of the set associated with a random variable X taking on some specific value k is

P(X = k) := P({ω : X(ω) = k}).

Example: Suppose that X(ω) is a random variable indicating the amount of time it takes for a radioactive particle to decay. In this case, X(ω) takes on an infinite number of possible values, so it is called a continuous random variable. We denote the probability that X takes on a value between two real constants a and b (where a < b) as

P(a ≤ X ≤ b) := P({ω : a ≤ X(ω) ≤ b}).
2.1 Cumulative distribution functions
In order to specify the probability measures used when dealing with random variables, it is often convenient to specify alternative functions (CDFs, PDFs, and PMFs) from which the probability measure governing an experiment immediately follows. In this section and the next two sections, we describe each of these types of functions in turn.

A cumulative distribution function (CDF) is a function FX : R → [0, 1] which specifies a probability measure as

FX(x) ≜ P(X ≤ x).    (1)

By using this function one can calculate the probability of any event in F. Figure 1 shows a sample CDF function.
2.2 Probability mass functions
When a random variable X takes on a finite set of possible values (i.e., X is a discrete random variable), a simpler way to represent the probability measure associated with a random variable is to directly specify the probability of each value that the random variable can assume. In particular, a probability mass function (PMF) is a function pX : Ω → R such that

pX(x) ≜ P(X = x).

In the case of discrete random variables, we use the notation Val(X) for the set of possible values that the random variable X may assume. For example, if X(ω) is a random variable indicating the number of heads out of ten tosses of a coin, then Val(X) = {0, 1, 2, ..., 10}.

Properties:
- 0 ≤ pX(x) ≤ 1.
- Σ_{x ∈ Val(X)} pX(x) = 1.
- Σ_{x ∈ A} pX(x) = P(X ∈ A).
2.3 Probability density functions
For some continuous random variables, the cumulative distribution function FX(x) is differentiable everywhere. In these cases, we define the probability density function or PDF as the derivative of the CDF, i.e.,

fX(x) ≜ dFX(x)/dx.    (2)

Note here that the PDF for a continuous random variable may not always exist (i.e., if FX(x) is not differentiable everywhere).

According to the properties of differentiation, for very small Δx,

P(x ≤ X ≤ x + Δx) ≈ fX(x) Δx.    (3)

Both CDFs and PDFs (when they exist!) can be used for calculating the probabilities of different events. But it should be emphasized that the value of the PDF at any given point x is not the probability of that event, i.e., fX(x) ≠ P(X = x). For example, fX(x) can take on values larger than one (but the integral of fX(x) over any subset of R will be at most one).

Properties:
- fX(x) ≥ 0.
- ∫_{−∞}^{∞} fX(x) dx = 1.
- ∫_{x ∈ A} fX(x) dx = P(X ∈ A).
2.4 Expectation
Suppose that X is a discrete random variable with PMF pX(x) and g : R → R is an arbitrary function. In this case, g(X) can be considered a random variable, and we define the expectation or expected value of g(X) as

E[g(X)] ≜ Σ_{x ∈ Val(X)} g(x) pX(x).

If X is a continuous random variable with PDF fX(x), then the expected value of g(X) is defined as

E[g(X)] ≜ ∫_{−∞}^{∞} g(x) fX(x) dx.

Intuitively, the expectation of g(X) can be thought of as a weighted average of the values that g(x) can take on for different values of x, where the weights are given by pX(x) or fX(x). As a special case of the above, note that the expectation, E[X], of a random variable itself is found by letting g(x) = x; this is also known as the mean of the random variable X.

Properties:
- E[a] = a for any constant a ∈ R.
- E[a f(X)] = a E[f(X)] for any constant a ∈ R.
- (Linearity of Expectation) E[f(X) + g(X)] = E[f(X)] + E[g(X)].
- For a discrete random variable X, E[1{X = k}] = P(X = k).
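A quick Monte Carlo sketch of these definitions and of linearity (assuming NumPy; the choice of distribution and of f, g is arbitrary):

import numpy as np

rng = np.random.default_rng(0)

# Monte Carlo check of E[g(X)] for X ~ Uniform(0, 1) and g(x) = x**2:
# the exact value is the integral of x**2 on [0, 1], i.e. 1/3.
x = rng.uniform(0.0, 1.0, size=1_000_000)
print(np.mean(x ** 2))                           # ~ 0.333

# Linearity of expectation: E[f(X) + g(X)] = E[f(X)] + E[g(X)].
f, g = np.sin(x), x ** 2
print(np.mean(f + g), np.mean(f) + np.mean(g))   # the two numbers agree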
2.5 Variance
The variance of a random variable X is a measure of how concentrated the distribution of a random
variable X is around its mean. Formally, the variance of a random variable X is defined as
Var[X] ≜ E[(X − E[X])^2]

Using the properties in the previous section, we can derive an alternate expression for the variance:

E[(X − E[X])^2] = E[X^2 − 2E[X]X + E[X]^2]
               = E[X^2] − 2E[X]E[X] + E[X]^2
               = E[X^2] − E[X]^2,

where the second equality follows from linearity of expectations and the fact that E[X] is actually a constant with respect to the outer expectation.

Properties:
- Var[a] = 0 for any constant a ∈ R.
- Var[a f(X)] = a^2 Var[f(X)] for any constant a ∈ R.
2.6 Some common random variables
Discrete random variables

X ∼ Bernoulli(p) (where 0 ≤ p ≤ 1): one if a coin with heads probability p comes up heads, zero otherwise.
p(x) = p if x = 1;  1 − p if x = 0

X ∼ Binomial(n, p) (where 0 ≤ p ≤ 1): the number of heads in n independent flips of a coin with heads probability p.
p(x) = (n choose x) p^x (1 − p)^{n−x}

X ∼ Geometric(p) (where p > 0): the number of flips of a coin with heads probability p until the first heads.
p(x) = p (1 − p)^{x−1}

X ∼ Poisson(λ) (where λ > 0): a probability distribution over the nonnegative integers used for modeling the frequency of rare events.
p(x) = e^{−λ} λ^x / x!

Continuous random variables

X ∼ Uniform(a, b) (where a < b): equal probability density to every value between a and b on the real line.
f(x) = 1/(b − a) if a ≤ x ≤ b;  0 otherwise

X ∼ Exponential(λ) (where λ > 0): decaying probability density over the nonnegative reals.
f(x) = λ e^{−λx} if x ≥ 0;  0 otherwise

X ∼ Normal(μ, σ^2): also known as the Gaussian distribution.
f(x) = (1 / (√(2π) σ)) exp(−(x − μ)^2 / (2σ^2))
Figure 2: PDF and CDF of a couple of random variables.
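A short sketch that draws samples from each of these distributions with NumPy and checks the sample mean and variance against the usual formulas (the parameter values are arbitrary; note that NumPy's exponential sampler is parameterized by the scale 1/λ):

import numpy as np

rng = np.random.default_rng(1)
n, p, lam, a, b, mu, sigma = 10, 0.3, 4.0, 2.0, 5.0, 1.0, 2.0
N = 1_000_000

samples = {
    "Bernoulli(p)":     rng.binomial(1, p, N),         # mean p
    "Binomial(n,p)":    rng.binomial(n, p, N),         # mean n*p
    "Geometric(p)":     rng.geometric(p, N),           # mean 1/p (flips until first head)
    "Poisson(lam)":     rng.poisson(lam, N),           # mean lam, variance lam
    "Uniform(a,b)":     rng.uniform(a, b, N),          # mean (a+b)/2
    "Exponential(lam)": rng.exponential(1 / lam, N),   # NumPy uses scale = 1/lambda
    "Normal(mu,s^2)":   rng.normal(mu, sigma, N),      # mean mu, variance sigma^2
}
for name, s in samples.items():
    print(f"{name:18s} mean={s.mean():6.3f} var={s.var():6.3f}")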

3 Two random variables


Thus far, we have considered single random variables. In many situations, however, there may be more than one quantity that we are interested in knowing during a random experiment. For instance, in an experiment where we flip a coin ten times, we may care about both X(ω) = the number of heads that come up as well as Y(ω) = the length of the longest run of consecutive heads. In this section, we consider the setting of two random variables.
3.1 Joint and marginal distributions
Suppose that we have two random variables X and Y. One way to work with these two random variables is to consider each of them separately. If we do that we will only need FX(x) and FY(y). But if we want to know about the values that X and Y assume simultaneously during outcomes of a random experiment, we require a more complicated structure known as the joint cumulative distribution function of X and Y, defined by

FXY(x, y) = P(X ≤ x, Y ≤ y)

It can be shown that by knowing the joint cumulative distribution function, the probability of any event involving X and Y can be calculated.

The joint CDF FXY(x, y) and the joint distribution functions FX(x) and FY(y) of each variable separately are related by

FX(x) = lim_{y→∞} FXY(x, y),
FY(y) = lim_{x→∞} FXY(x, y).

Here, we call FX(x) and FY(y) the marginal cumulative distribution functions of FXY(x, y).

Properties:
- 0 ≤ FXY(x, y) ≤ 1.
- lim_{x,y→∞} FXY(x, y) = 1.
- lim_{x,y→−∞} FXY(x, y) = 0.
- FX(x) = lim_{y→∞} FXY(x, y).
3.2 Joint and marginal probability mass functions
If X and Y are discrete random variables, then the joint probability mass function pXY : R × R → [0, 1] is defined by

pXY(x, y) = P(X = x, Y = y).

Here, 0 ≤ pXY(x, y) ≤ 1 for all x, y, and Σ_{x ∈ Val(X)} Σ_{y ∈ Val(Y)} pXY(x, y) = 1.

How does the joint PMF over two variables relate to the probability mass function for each variable separately? It turns out that

pX(x) = Σ_y pXY(x, y),

and similarly for pY(y). In this case, we refer to pX(x) as the marginal probability mass function of X. In statistics, the process of forming the marginal distribution with respect to one variable by summing out the other variable is often known as marginalization.
3.3 Joint and marginal probability density functions
Let X and Y be two continuous random variables with joint distribution function FXY. In the case that FXY(x, y) is everywhere differentiable in both x and y, then we can define the joint probability density function,

fXY(x, y) = ∂^2 FXY(x, y) / ∂x∂y.

Like in the single-dimensional case, fXY(x, y) ≠ P(X = x, Y = y), but rather

∫∫_{(x,y) ∈ A} fXY(x, y) dx dy = P((X, Y) ∈ A).

Note that the values of the probability density function fXY(x, y) are always nonnegative, but they may be greater than 1. Nonetheless, it must be the case that ∫_{−∞}^{∞} ∫_{−∞}^{∞} fXY(x, y) dx dy = 1.

Analogous to the discrete case, we define

fX(x) = ∫_{−∞}^{∞} fXY(x, y) dy,

as the marginal probability density function (or marginal density) of X, and similarly for fY(y).

3.4 Conditional distributions


Conditional distributions seek to answer the question: what is the probability distribution over Y, when we know that X must take on a certain value x? In the discrete case, the conditional probability mass function of Y given X is simply

pY|X(y|x) = pXY(x, y) / pX(x),

assuming that pX(x) ≠ 0.

In the continuous case, the situation is technically a little more complicated because the probability that a continuous random variable X takes on a specific value x is equal to zero. Ignoring this technical point, we simply define, by analogy to the discrete case, the conditional probability density of Y given X = x to be

fY|X(y|x) = fXY(x, y) / fX(x),

provided fX(x) ≠ 0.
3.5 Bayes's rule

A useful formula that often arises when trying to derive an expression for the conditional probability of one variable given another is Bayes's rule.

In the case of discrete random variables X and Y,

PY|X(y|x) = PXY(x, y) / PX(x) = PX|Y(x|y) PY(y) / Σ_{y' ∈ Val(Y)} PX|Y(x|y') PY(y').

If the random variables X and Y are continuous,

fY|X(y|x) = fXY(x, y) / fX(x) = fX|Y(x|y) fY(y) / ∫_{−∞}^{∞} fX|Y(x|y') fY(y') dy'.

3.6 Independence
Two random variables X and Y are independent if FXY(x, y) = FX(x) FY(y) for all values of x and y. Equivalently,

- For discrete random variables, pXY(x, y) = pX(x) pY(y) for all x ∈ Val(X), y ∈ Val(Y).
- For discrete random variables, pY|X(y|x) = pY(y) whenever pX(x) ≠ 0, for all y ∈ Val(Y).
- For continuous random variables, fXY(x, y) = fX(x) fY(y) for all x, y ∈ R.
- For continuous random variables, fY|X(y|x) = fY(y) whenever fX(x) ≠ 0, for all y ∈ R.

To get around the technical point mentioned in Section 3.4 (that P(X = x) = 0 for a continuous X), a more reasonable way to calculate the conditional CDF is

FY|X(y, x) = lim_{Δx→0} P(Y ≤ y | x ≤ X ≤ x + Δx).

It can be easily seen that if F(x, y) is differentiable in both x and y then

FY|X(y, x) = ∫_{−∞}^{y} fX,Y(x, α) dα / fX(x),

and therefore we define the conditional PDF of Y given X = x in the following way:

fY|X(y|x) = fXY(x, y) / fX(x)

Informally, two random variables X and Y are independent if knowing the value of one variable will never have any effect on the conditional probability distribution of the other variable, that is, you know all the information about the pair (X, Y) by just knowing fX(x) and fY(y). The following lemma formalizes this observation:

Lemma 3.1. If X and Y are independent then for any subsets A, B ⊆ R, we have

P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B).

By using the above lemma one can prove that if X is independent of Y then any function of X is independent of any function of Y.
3.7 Expectation and covariance
Suppose that we have two discrete random variables X, Y and g : R^2 → R is a function of these two random variables. Then the expected value of g is defined in the following way,

E[g(X, Y)] ≜ Σ_{x ∈ Val(X)} Σ_{y ∈ Val(Y)} g(x, y) pXY(x, y).

For continuous random variables X, Y, the analogous expression is

E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) fXY(x, y) dx dy.

We can use the concept of expectation to study the relationship of two random variables with each other. In particular, the covariance of two random variables X and Y is defined as

Cov[X, Y] ≜ E[(X − E[X])(Y − E[Y])]

Using an argument similar to that for variance, we can rewrite this as

Cov[X, Y] = E[(X − E[X])(Y − E[Y])]
          = E[XY − X E[Y] − Y E[X] + E[X]E[Y]]
          = E[XY] − E[X]E[Y] − E[Y]E[X] + E[X]E[Y]
          = E[XY] − E[X]E[Y].

Here, the key step in showing the equality of the two forms of covariance is in the third equality, where we use the fact that E[X] and E[Y] are actually constants which can be pulled out of the expectation. When Cov[X, Y] = 0, we say that X and Y are uncorrelated.

Properties:
- (Linearity of expectation) E[f(X, Y) + g(X, Y)] = E[f(X, Y)] + E[g(X, Y)].
- Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y].
- If X and Y are independent, then Cov[X, Y] = 0.
- If X and Y are independent, then E[f(X) g(Y)] = E[f(X)] E[g(Y)].
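A small numerical sketch of these covariance properties (assuming NumPy; the particular construction of the dependent pair is arbitrary):

import numpy as np

rng = np.random.default_rng(2)
N = 1_000_000

# Independent X, Y: the sample covariance should be (close to) zero.
x = rng.normal(0.0, 1.0, N)
y = rng.normal(0.0, 1.0, N)
print(np.cov(x, y)[0, 1])                                   # ~ 0

# Dependent case: Y = X + noise has positive covariance with X, and
# Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y] should hold numerically.
y = x + 0.5 * rng.normal(0.0, 1.0, N)
cov_xy = np.cov(x, y)[0, 1]
print(np.var(x + y), np.var(x) + np.var(y) + 2 * cov_xy)    # the two agree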

4 Multiple random variables


The notions and ideas introduced in the previous section can be generalized to more than two random variables. In particular, suppose that we have n continuous random variables, X1(ω), X2(ω), ..., Xn(ω). In this section, for simplicity of presentation, we focus only on the continuous case, but the generalization to discrete random variables works similarly.

4.1 Basic properties

We can define the joint distribution function of X1, X2, ..., Xn, the joint probability density function of X1, X2, ..., Xn, the marginal probability density function of X1, and the conditional probability density function of X1 given X2, ..., Xn, as

FX1,X2,...,Xn(x1, x2, ..., xn) = P(X1 ≤ x1, X2 ≤ x2, ..., Xn ≤ xn)

fX1,X2,...,Xn(x1, x2, ..., xn) = ∂^n FX1,X2,...,Xn(x1, x2, ..., xn) / ∂x1 ... ∂xn

fX1(x1) = ∫_{−∞}^{∞} ... ∫_{−∞}^{∞} fX1,X2,...,Xn(x1, x2, ..., xn) dx2 ... dxn

fX1|X2,...,Xn(x1 | x2, ..., xn) = fX1,X2,...,Xn(x1, x2, ..., xn) / fX2,...,Xn(x2, ..., xn)

To calculate the probability of an event A ⊆ R^n we have

P((x1, x2, ..., xn) ∈ A) = ∫_{(x1,x2,...,xn) ∈ A} fX1,X2,...,Xn(x1, x2, ..., xn) dx1 dx2 ... dxn    (4)

Chain rule: From the definition of conditional probabilities for multiple random variables, one can show that

f(x1, x2, ..., xn) = f(xn | x1, x2, ..., xn−1) f(x1, x2, ..., xn−1)
                  = f(xn | x1, x2, ..., xn−1) f(xn−1 | x1, x2, ..., xn−2) f(x1, x2, ..., xn−2)
                  = ... = f(x1) ∏_{i=2}^{n} f(xi | x1, ..., xi−1).

Independence: For multiple events, A1, ..., Ak, we say that A1, ..., Ak are mutually independent if for any subset S ⊆ {1, 2, ..., k}, we have

P(∩_{i∈S} Ai) = ∏_{i∈S} P(Ai).

Likewise, we say that random variables X1, ..., Xn are independent if

f(x1, ..., xn) = f(x1) f(x2) ⋯ f(xn).
Here, the definition of mutual independence is simply the natural generalization of independence of
two random variables to multiple random variables.
Independent random variables arise often in machine learning algorithms where we assume that the
training examples belonging to the training set represent independent samples from some unknown
probability distribution. To make the significance of independence clear, consider a bad training
set in which we first sample a single training example (x^(1), y^(1)) from some unknown distribution, and then add m − 1 copies of the exact same training example to the training set. In this case, we have (with some abuse of notation)

P((x^(1), y^(1)), ..., (x^(m), y^(m))) ≠ ∏_{i=1}^{m} P(x^(i), y^(i)).

Despite the fact that the training set has size m, the examples are not independent! While clearly the
procedure described here is not a sensible method for building a training set for a machine learning
algorithm, it turns out that in practice, non-independence of samples does come up often, and it has
the effect of reducing the effective size of the training set.

4.2 Random vectors


Suppose that we have n random variables. When working with all these random variables together,
we will often find it convenient to put them in a vector X = [X1 X2 ... Xn]^T. We call the
resulting vector a random vector (more formally, a random vector is a mapping from Ω to R^n). It
should be clear that random vectors are simply an alternative notation for dealing with n random
variables, so the notions of joint PDF and CDF will apply to random vectors as well.
Expectation: Consider an arbitrary function g : R^n → R. The expected value of this function
is defined as

E[g(X)] = ∫_{R^n} g(x1, x2, ..., xn) f_{X1,X2,...,Xn}(x1, x2, ..., xn) dx1 dx2 ... dxn,        (5)

where ∫_{R^n} is n consecutive integrations from −∞ to ∞. If g is a function from R^n to R^m, then the
expected value of g is the element-wise expected value of the output vector, i.e., if

g(x) = [g1(x), g2(x), ..., gm(x)]^T,

then

E[g(X)] = [E[g1(X)], E[g2(X)], ..., E[gm(X)]]^T.
Covariance matrix: For a given random vector X : Ω → R^n, its covariance matrix Σ is the n × n
square matrix whose entries are given by Σ_ij = Cov[Xi, Xj].
From the definition of covariance, the (i, j) entry is

Σ_ij = Cov[Xi, Xj] = E[Xi Xj] − E[Xi]E[Xj],

so that, collecting all the entries into matrix form,

Σ = E[X X^T] − E[X] E[X]^T = ... = E[(X − E[X])(X − E[X])^T],

where the matrix expectation is defined in the obvious way (element-wise).
The covariance matrix has a number of useful properties:
- Σ ⪰ 0; that is, Σ is positive semidefinite.
- Σ = Σ^T; that is, Σ is symmetric.
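A small numerical check of the matrix identity above (added for illustration; the dependence structure and names are arbitrary): estimate Σ as E[XX^T] − E[X]E[X]^T from samples and compare it with the centred form.

```python
import numpy as np

rng = np.random.default_rng(1)

# n = 3 random variables with an arbitrary linear dependence structure.
A = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, 0.3, 1.0]])
X = rng.normal(size=(100_000, 3)) @ A.T      # each row is one realization of the random vector

mu = X.mean(axis=0)

# Sigma = E[X X^T] - E[X] E[X]^T
sigma_1 = (X.T @ X) / X.shape[0] - np.outer(mu, mu)

# Sigma = E[(X - E[X])(X - E[X])^T]
Xc = X - mu
sigma_2 = (Xc.T @ Xc) / X.shape[0]

print(np.allclose(sigma_1, sigma_2))                  # True: the two forms agree
print(np.allclose(sigma_1, sigma_1.T))                # True: Sigma is symmetric
print(np.all(np.linalg.eigvalsh(sigma_1) >= -1e-12))  # positive semidefinite (up to round-off)
```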
4.3 The multivariate Gaussian distribution
One particularly important example of a probability distribution over random vectors X is called
the multivariate Gaussian or multivariate normal distribution. A random vector X ∈ R^n is said
to have a multivariate normal (or Gaussian) distribution with mean μ ∈ R^n and covariance matrix
Σ ∈ S^n_{++} (where S^n_{++} refers to the space of symmetric positive definite n × n matrices) if

f_{X1,X2,...,Xn}(x1, x2, ..., xn; μ, Σ) = 1 / ((2π)^{n/2} |Σ|^{1/2}) · exp( −(1/2) (x − μ)^T Σ^{−1} (x − μ) ).

We write this as X ~ N(μ, Σ). Notice that in the case n = 1, this reduces to the regular definition of a
normal distribution with mean parameter μ1 and variance Σ11.
Generally speaking, Gaussian random variables are extremely useful in machine learning and statistics for
two main reasons. First, they are extremely common when modeling noise in statistical algorithms. Quite
often, noise can be considered to be the accumulation of a large number of small independent random
perturbations affecting the measurement process; by the Central Limit Theorem, summations of independent
random variables will tend to look Gaussian. Second, Gaussian random variables are convenient for many
analytical manipulations, because many of the integrals involving Gaussian distributions that arise in practice
have simple closed form solutions.
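For concreteness, the following sketch (not from the original notes) draws samples from N(μ, Σ) using a Cholesky factor and evaluates the density formula above; μ and Σ are arbitrary example values.

```python
import numpy as np

def mvn_pdf(x, mu, sigma):
    """Evaluate the multivariate normal density at x."""
    n = mu.size
    diff = x - mu
    quad = diff @ np.linalg.solve(sigma, diff)          # (x - mu)^T Sigma^{-1} (x - mu)
    norm = (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(sigma))
    return np.exp(-0.5 * quad) / norm

mu = np.array([1.0, -2.0])
sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

# Sampling: X = mu + L z with L the Cholesky factor of Sigma and z ~ N(0, I).
rng = np.random.default_rng(2)
L = np.linalg.cholesky(sigma)
samples = mu + rng.normal(size=(50_000, 2)) @ L.T

print("sample mean      :", samples.mean(axis=0))            # close to mu
print("sample covariance:", np.cov(samples, rowvar=False))   # close to sigma
print("pdf at mu        :", mvn_pdf(mu, mu, sigma))
```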

GAUSSIAN PROCESS:
In probability theory and statistics, a Gaussian process is a stochastic process whose realizations
consist of random values associated with every point in a range of times (or of space) such that each
such random variable has a normal distribution. Moreover, every finite collection of those random variables
has a multivariate normal distribution.
Gaussian processes are important in statistical modeling because of properties inherited from the normal
distribution. For example, if a random process is modeled as a Gaussian process, the distributions of various
derived quantities can be obtained explicitly. Such quantities include: the average value of the process over a
range of times; the error in estimating the average using sample values at a small set of times.
A process is Gaussian if and only if for every finite set of indices t1, ..., tk in the index set T, the vector

(X_{t1}, X_{t2}, ..., X_{tk})

is a vector-valued Gaussian random variable. Using characteristic functions of random variables, the
Gaussian property can be formulated as follows: {Xt ; t ∈ T} is Gaussian if and only if, for every finite
set of indices t1, ..., tk, there are reals σ_{lj} with σ_{ii} > 0 and reals μ_j such that, for all real θ1, ..., θk,

E[ exp( i Σ_l θ_l X_{t_l} ) ] = exp( −(1/2) Σ_{l,j} σ_{lj} θ_l θ_j + i Σ_j μ_j θ_j ).

The numbers σ_{lj} and μ_j can be shown to be the covariances and means of the variables in the
process.
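The defining property above (every finite collection is jointly Gaussian) suggests a simple way to draw sample paths on a finite grid: build the covariance matrix from a chosen covariance function and sample a multivariate normal. The squared-exponential kernel used in this sketch is an illustrative assumption; the text does not specify a kernel.

```python
import numpy as np

def sq_exp_kernel(t1, t2, length=0.5, variance=1.0):
    """Squared-exponential covariance k(t1, t2) (an assumed example kernel)."""
    return variance * np.exp(-0.5 * ((t1 - t2) / length) ** 2)

t = np.linspace(0.0, 5.0, 200)                    # finite set of time indices t1..tk
K = sq_exp_kernel(t[:, None], t[None, :])         # covariance matrix of (X_t1, ..., X_tk)
K += 1e-9 * np.eye(t.size)                        # small jitter for numerical stability

# A zero-mean Gaussian process sample path on this grid is just a draw from N(0, K).
rng = np.random.default_rng(3)
L = np.linalg.cholesky(K)
path = L @ rng.normal(size=t.size)

print(path[:5])   # first few values of one realization
```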

NOISE:
In common use, the word noise means any unwanted sound. In both analog and digital
electronics, noise is an unwanted perturbation to a wanted signal; it is called noise as a generalization of the
audible noise heard when listening to a weak radio transmission. Signal noise is heard as acoustic noise if
played through a loudspeaker; it manifests as 'snow' on a television or video image. Noise can block, distort,
change or interfere with the meaning of a message in human, animal and electronic communication.
In signal processing or computing it can be considered unwanted data without meaning; that is, data
that is not being used to transmit a signal, but is simply produced as an unwanted by-product of other activities.
"Signal-to-noise ratio" is sometimes used informally to refer to the ratio of useful information to false or
irrelevant data in a conversation or exchange, such as off-topic posts and spam in online discussion forums and
other online communities. In information theory, however, noise is still considered to be information. In a
broader sense, film grain or even advertisements encountered while looking for something else can be

considered noise. In biology, noise can describe the variability of a measurement around the mean, for
example transcriptional noise describes the variability in gene activity between cells in a population.
In many of these areas, the special case of thermal noise arises, which sets a fundamental lower limit
to what can be measured or signaled and is related to basic physical processes at the molecular level described
by well-established thermodynamics considerations, some of which are expressible by simple formulae.

SHOT NOISE:
Shot noise consists of random fluctuations of the electric current in an electrical conductor, which are
caused by the fact that the current is carried by discrete charges (electrons). The strength of this noise increases
for growing magnitude of the average current flowing through the conductor. Shot noise is to be distinguished
from current fluctuations in equilibrium, which happen without any applied voltage and without any average
current flowing. These equilibrium current fluctuations are known as Johnson-Nyquist noise.
Shot noise is important in electronics, telecommunication, and fundamental physics.
The strength of the current fluctuations can be expressed by giving the variance of the current, ⟨(I − ⟨I⟩)²⟩, where
⟨I⟩ is the average ("macroscopic") current. However, the value measured in this way depends on the frequency
range of fluctuations which is measured ("bandwidth" of the measurement): The measured variance of the
current grows linearly with bandwidth. Therefore, a more fundamental quantity is the noise power, which is
essentially obtained by dividing through the bandwidth (and, therefore, has the dimension ampere squared
divided by Hertz). It may be defined as the zero-frequency Fourier transform of the current-current correlation
function.

THERMAL NOISE:
Thermal noise (Johnson–Nyquist noise, Johnson noise, or Nyquist noise) is the electronic noise
generated by the thermal agitation of the charge carriers (usually the electrons) inside an electrical conductor at
equilibrium, which happens regardless of any applied voltage.
Thermal noise is approximately white, meaning that the power spectral density is nearly equal
throughout the frequency spectrum (however see the section below on extremely high frequencies).
Additionally, the amplitude of the signal has very nearly a Gaussian probability density function.
This type of noise was first measured by John B. Johnson at Bell Labs in 1928. He described his
findings to Harry Nyquist, also at Bell Labs, who was able to explain the results.
Noise voltage and power
Thermal noise is distinct from shot noise, which consists of additional current fluctuations that occur
when a voltage is applied and a macroscopic current starts to flow. For the general case, the above definition
applies to charge carriers in any type of conducting medium (e.g. ions in an electrolyte), not just resistors. It
can be modeled by a voltage source representing the noise of the non-ideal resistor in series with an ideal noise
free resistor.
The power spectral density, or voltage variance (mean square) per hertz of bandwidth, is given by

v_n² / Δf = 4 k_B T R

where kB is Boltzmann's constant in joules per kelvin, T is the resistor's absolute temperature in kelvins, and R
is the resistor value in ohms (Ω). For quick calculation at room temperature, the noise voltage density is approximately 0.13·sqrt(R) nV/√Hz.

For example, a 1 kΩ resistor at a temperature of 300 K has a noise voltage density of about 4 nV/√Hz.

For a given bandwidth, the root mean square (RMS) of the voltage, vn, is given by

v_n = sqrt(4 k_B T R Δf)

where Δf is the bandwidth in hertz over which the noise is measured. For a 1 kΩ resistor at room temperature
and a 10 kHz bandwidth, the RMS noise voltage is 400 nV. A useful rule of thumb to remember is that 50 Ω at
1 Hz bandwidth corresponds to 1 nV noise at room temperature.
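The numbers quoted above can be reproduced with a few lines of Python (added here purely for illustration):

```python
import numpy as np

k_B = 1.380649e-23      # Boltzmann constant, J/K

def thermal_noise_vrms(R, T=300.0, bandwidth=1.0):
    """RMS thermal noise voltage sqrt(4 k T R df) in volts."""
    return np.sqrt(4.0 * k_B * T * R * bandwidth)

# 1 kohm resistor, 10 kHz bandwidth, room temperature: about 400 nV RMS.
print(thermal_noise_vrms(1e3, T=300.0, bandwidth=10e3) * 1e9, "nV")

# Rule of thumb: 50 ohm in 1 Hz bandwidth is roughly 1 nV.
print(thermal_noise_vrms(50.0, T=300.0, bandwidth=1.0) * 1e9, "nV")
```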
A resistor in a short circuit dissipates a noise power of

P = v_n² / R = 4 k_B T Δf.

The noise generated at the resistor can transfer to the remaining circuit; the maximum noise power
transfer happens with impedance matching when the Thevenin equivalent resistance of the remaining circuit is
equal to the noise generating resistance. In this case each one of the two participating resistors dissipates noise
in both itself and in the other resistor. Since only half of the source voltage drops across any one of these
resistors, the resulting noise power is given by

P = k_B T Δf

where P is the thermal noise power in watts. Notice that this is independent of the noise generating resistance.

Noise current:
The noise source can also be modeled by a current source in parallel with the resistor by taking the Norton
equivalent, which corresponds simply to dividing by R. This gives the root mean square value of the current source
as:

i_n = sqrt(4 k_B T Δf / R).

Thermal noise is intrinsic to all resistors and is not a sign of poor design or manufacture, although resistors
may also have excess noise.

Noise power in decibels:


Signal power is often measured in dBm (decibels relative to 1 milliwatt, assuming a 50 ohm load).
From the equation above, noise power in a resistor at room temperature, in dBm, is then:

P_dBm = 10 log10(k_B T Δf × 1000)

where the factor of 1000 is present because the power is given in milliwatts, rather than watts. This equation
can be simplified by separating the constant parts from the bandwidth:

P_dBm = −173.8 + 10 log10(Δf)

which is more commonly seen approximated as:

P_dBm = −174 + 10 log10(Δf).

Noise power at different bandwidths is then simple to calculate:

Bandwidth (Δf)    Thermal noise power    Notes
1 Hz              −174 dBm
10 Hz             −164 dBm
100 Hz            −154 dBm
1 kHz             −144 dBm
10 kHz            −134 dBm               FM channel of 2-way radio
100 kHz           −124 dBm
180 kHz           −121.45 dBm            One LTE resource block
200 kHz           −120.98 dBm            One GSM channel (ARFCN)
1 MHz             −114 dBm
2 MHz             −111 dBm               Commercial GPS channel
6 MHz             −106 dBm               Analog television channel
20 MHz            −101 dBm               WLAN 802.11 channel
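The table values follow directly from P(dBm) ≈ −174 + 10 log10(Δf); the short check below is an added illustration, not part of the original material.

```python
import math

k_B = 1.380649e-23   # Boltzmann constant, J/K

def thermal_noise_dbm(bandwidth_hz, T=290.0):
    """Thermal noise power k T df, expressed in dBm."""
    p_watts = k_B * T * bandwidth_hz
    return 10.0 * math.log10(p_watts * 1000.0)    # factor 1000: W -> mW

for bw, label in [(1.0, "1 Hz"), (10e3, "10 kHz"), (200e3, "200 kHz (GSM)"),
                  (1e6, "1 MHz"), (20e6, "20 MHz (WLAN)")]:
    print(f"{label:>15}: {thermal_noise_dbm(bw):7.2f} dBm")
```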

Thermal noise on capacitors:


Thermal noise on capacitors is referred to as kTC noise. Thermal noise in an RC circuit has an
unusually simple expression, as the value of the resistance (R) drops out of the equation. This is because higher
R contributes to more filtering as well as to more noise. The noise bandwidth of the RC circuit is 1/(4RC),
which can be substituted into the above formula to eliminate R. The mean-square and RMS noise voltage
generated in such a filter are:

v_n² = k_B T / C,    v_n = sqrt(k_B T / C).

Thermal noise accounts for 100% of kTC noise, whether it is attributed to the resistance or to the
capacitance.
In the extreme case of the reset noise left on a capacitor by opening an ideal switch, the resistance is
infinite, yet the formula still applies; however, now the RMS must be interpreted not as a time average, but as
an average over many such reset events, since the voltage is constant when the bandwidth is zero. In this sense,
the Johnson noise of an RC circuit can be seen to be inherent, an effect of the thermodynamic distribution of
the number of electrons on the capacitor, even without the involvement of a resistor.
The noise is not caused by the capacitor itself, but by the thermodynamic equilibrium of the amount
of charge on the capacitor. Once the capacitor is disconnected from a conducting circuit, the thermodynamic
fluctuation is frozen at a random value with standard deviation as given above.
The reset noise of capacitive sensors is often a limiting noise source, for example in image sensors.
As an alternative to the voltage noise, the reset noise on the capacitor can also be quantified as the electrical
charge standard deviation, as

Q_n = sqrt(k_B T C).

Since the charge variance is k_B T C, this noise is often called kTC noise.
Any system in thermal equilibrium has state variables with a mean energy of kT/2 per degree of
freedom. Using the formula for energy on a capacitor (E = ½CV²), the mean noise energy on a capacitor can be
seen to also be ½C·(k_B T/C), or kT/2. Thermal noise on a capacitor can be derived from this relationship,
without consideration of resistance.
The kTC noise is the dominant noise source at small capacitances.
Noise of capacitors at 300 K
Capacitance    Noise voltage √(kT/C)    Electrons
1 fF           2 mV                     12.5 e−
10 fF          640 µV                   40 e−
100 fF         200 µV                   125 e−
1 pF           64 µV                    400 e−
10 pF          20 µV                    1250 e−
100 pF         6.4 µV                   4000 e−
1 nF           2 µV                     12500 e−
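The rows of this table follow from v_n = sqrt(kT/C) and Q_n = sqrt(kTC); the short check below is an added illustration.

```python
import math

k_B = 1.380649e-23     # Boltzmann constant, J/K
q_e = 1.602176634e-19  # elementary charge, C

def ktc_noise(C, T=300.0):
    """Return (RMS voltage in volts, charge noise in electrons) for a capacitance C."""
    v_rms = math.sqrt(k_B * T / C)
    electrons = math.sqrt(k_B * T * C) / q_e
    return v_rms, electrons

for C in [1e-15, 1e-12, 1e-9]:          # 1 fF, 1 pF, 1 nF
    v, n = ktc_noise(C)
    print(f"C = {C:.0e} F: v_n = {v*1e6:8.1f} uV, {n:8.1f} electrons")
```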

Noise at very high frequencies:


The above equations are good approximations at any practical radio frequency in use (i.e. frequencies
below about 80 gigahertz). In the most general case, which includes up to optical frequencies, the power
spectral density of the voltage across the resistor R, in V²/Hz, is given by:

Φ(f) = 4 R h f / ( exp(h f / (k_B T)) − 1 )

where f is the frequency, h is Planck's constant, kB is the Boltzmann constant and T is the temperature in kelvins. If the
frequency is low enough, that means:

h f << k_B T

(this assumption is valid until a few terahertz at room temperature) then the exponential can be expressed in
terms of its Taylor series. The relationship then becomes:

Φ(f) ≈ 4 R k_B T.

In general, both R and T depend on frequency. In order to know the total noise it is enough to
integrate over all the bandwidth. Since the signal is real, it is possible to integrate over only the positive
frequencies, then multiply by 2. Assuming that R and T are constants over all the bandwidth Δf, then the root
mean square (RMS) value of the voltage across a resistor due to thermal noise is given by

v_n = sqrt(4 k_B T R Δf),

that is, the same formula as above.

WHITE NOISE:
White noise is a random signal (or process) with a flat power spectral density. In other words, the
signal contains equal power within a fixed bandwidth at any center frequency. White noise draws its name
from white light in which the power spectral density of the light is distributed over the visible band in such a
way that the eye's three color receptors (cones) are approximately equally stimulated. In the statistical sense, a
time series rt is called white noise if {rt} is a sequence of independent and identically distributed (iid) random
variables with finite mean and variance. In particular, if rt is normally distributed with mean zero and variance
σ², the series is called Gaussian white noise.
An infinite-bandwidth white noise signal is a purely theoretical construction. The bandwidth of white noise is
limited in practice by the mechanism of noise generation, by the transmission medium and by finite
observation capabilities. A random signal is considered "white noise" if it is observed to have a flat spectrum
over a medium's widest possible bandwidth.

WHITE NOISE IN A SPATIAL CONTEXT:


While it is usually applied in the context of frequency domain signals, the term white noise is also
commonly applied to a noise signal in the spatial domain. In this case, it has an auto correlation which can be
represented by a delta function over the relevant space dimensions. The signal is then "white" in the spatial
frequency domain (this is equally true for signals in the angular frequency domain, e.g., the distribution of a
signal across all angles in the night sky).

STATISTICAL PROPERTIES:
(Figure: a finite-length, discrete-time realization of a white noise process generated on a computer.)
Being uncorrelated in time does not restrict the values a signal can take. Any distribution of values is
possible (although it must have zero DC component). Even a binary signal which can only take on the values
1 or −1 will be white if the sequence is statistically uncorrelated. Noise having a continuous distribution, such
as a normal distribution, can of course be white.
It is often incorrectly assumed that Gaussian noise (i.e., noise with a Gaussian amplitude distribution; see
normal distribution) is necessarily white noise, yet neither property implies the other. Gaussianity refers to the
probability distribution with respect to the value, i.e. the probability that the signal has a certain given value,
while the term 'white' refers to the way the signal power is distributed over time or among frequencies.
We can therefore find Gaussian white noise, but also Poisson, Cauchy, etc. white noises. Thus, the
two words "Gaussian" and "white" are often both specified in mathematical models of systems. Gaussian white
noise is a good approximation of many real-world situations and generates mathematically tractable models.
These models are used so frequently that the term additive white Gaussian noise has a standard abbreviation:
AWGN. Gaussian white noise has the useful statistical property that its values are independent (see Statistical
independence).
White noise is the generalized mean-square derivative of the Wiener process or Brownian motion.

APPLICATIONS:
It is used by some emergency vehicle sirens due to its ability to cut through background noise, which
makes it easier to locate.
White noise is commonly used in the production of electronic music, usually either directly or as an
input for a filter to create other types of noise signal. It is used extensively in audio synthesis, typically to
recreate percussive instruments such as cymbals which have high noise content in their frequency domain.
It is also used to generate impulse responses. To set up the equalization (EQ) for a concert or other
performance in a venue, a short burst of white or pink noise is sent through the PA system and monitored from
various points in the venue so that the engineer can tell if the acoustics of the building naturally boost or cut
any frequencies. The engineer can then adjust the overall equalization to ensure a balanced mix.
White noise can be used for frequency response testing of amplifiers and electronic filters. It is not
used for testing loudspeakers as its spectrum contains too great an amount of high frequency content. Pink
noise is used for testing transducers such as loudspeakers and microphones.
White noise is a common synthetic noise source used for sound masking by a tinnitus masker.[1]
White noise is a particularly good source signal for masking devices as it contains higher frequencies in equal
volumes to lower ones, and so is capable of more effective masking for high pitched ringing tones most
commonly perceived by tinnitus sufferers.

White noise is used as the basis of some random number generators. For example, Random.org uses a
system of atmospheric antennae to generate random digit patterns from white noise.
White noise machines and other white noise sources are sold as privacy enhancers and sleep aids and
to mask tinnitus. Some people claim white noise, when used with headphones, can aid concentration by
masking irritating or distracting noises in a person's environment.

MATHEMATICAL DEFINITION:
White random vector:
A random vector w is a white random vector if and only if its mean vector and autocorrelation matrix
are the following:

μ_w = E[w] = 0,    R_ww = E[w w^T] = σ² I.

That is, it is a zero-mean random vector, and its autocorrelation matrix is a multiple of the identity
matrix. When the autocorrelation matrix is a multiple of the identity, we say that it has spherical correlation.
White random process (white noise)
A continuous-time random process w(t), where t ∈ R, is a white noise process if and only if its mean
function and autocorrelation function satisfy the following:

μ_w(t) = E[w(t)] = 0,    R_ww(t1, t2) = E[w(t1) w(t2)] = (N0/2) δ(t1 − t2),

i.e. it is a zero mean process for all time and has infinite power at zero time shift since its autocorrelation
function is the Dirac delta function.
The above autocorrelation function implies the following power spectral density:

S_ww(f) = N0/2,

since the Fourier transform of the delta function is equal to 1. Since this power spectral density is the same at
all frequencies, we call it white as an analogy to the frequency spectrum of white light.
A generalization to random elements on infinite dimensional spaces, such as random fields, is the
white noise measure.
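As a discrete-time illustration of these definitions (an added sketch, not from the source text), the code below generates Gaussian white noise and checks that its sample autocorrelation is concentrated at zero lag and that its spectrum is roughly flat.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(4)
sigma = 1.0
w = rng.normal(scale=sigma, size=100_000)         # zero-mean Gaussian white noise

# Sample autocorrelation at a few lags: ~sigma^2 at lag 0, ~0 elsewhere.
for lag in range(4):
    r = np.mean(w[:w.size - lag] * w[lag:])
    print(f"R_ww[{lag}] = {r:+.4f}")

# The estimated power spectral density is roughly flat across frequency.
f, Pxx = signal.welch(w, fs=1.0, nperseg=4096)
print("PSD spread (max/min):", Pxx.max() / Pxx.min())
```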
Random vector transformations
Two theoretical applications using a white random vector are the simulation and whitening of another
arbitrary random vector. To simulate an arbitrary random vector, we transform a white random vector with a
carefully chosen matrix. We choose the transformation matrix so that the mean and covariance matrix of the

transformed white random vector matches the mean and covariance matrix of the arbitrary random vector that
we are simulating. To whiten an arbitrary random vector, we transform it by a different carefully chosen matrix
so that the output random vector is a white random vector.
These two ideas are crucial in applications such as channel estimation and channel equalization in
communications and audio. These concepts are also used in data compression.
Simulating a random vector
Suppose that a random vector x has mean μ_x and covariance matrix Kxx. Since this matrix is Hermitian symmetric
and positive semidefinite, by the spectral theorem from linear algebra, we can diagonalize or factor the matrix
in the following way:

Kxx = E Λ E^T

where E is the orthogonal matrix of eigenvectors and Λ is the diagonal matrix of eigenvalues.
We can simulate the 1st and 2nd moment properties of this random vector (mean μ_x and
covariance matrix Kxx) via the following transformation of a white vector w with zero mean and unit variance:

x_s = E Λ^{1/2} w + μ_x.

Thus, the output of this transformation has expectation

E[x_s] = E Λ^{1/2} E[w] + μ_x = μ_x

and covariance matrix

E[(x_s − μ_x)(x_s − μ_x)^T] = E Λ^{1/2} E[w w^T] Λ^{1/2} E^T = E Λ E^T = Kxx.
Whitening a random vector
The method for whitening a vector x with mean μ_x and covariance matrix Kxx is to perform the
following calculation:

w = Λ^{−1/2} E^T (x − μ_x).

Thus, the output of this transformation has expectation

E[w] = Λ^{−1/2} E^T (E[x] − μ_x) = 0

and covariance matrix

E[w w^T] = Λ^{−1/2} E^T E[(x − μ_x)(x − μ_x)^T] E Λ^{−1/2} = Λ^{−1/2} E^T Kxx E Λ^{−1/2}.

By diagonalizing Kxx (Kxx = E Λ E^T), we get

E[w w^T] = Λ^{−1/2} Λ Λ^{−1/2} = I.

Thus, with the above transformation, we can whiten the random vector to have zero mean and the
identity covariance matrix.
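A compact numerical sketch of both transformations is given below (added for illustration; Kxx and the mean are arbitrary example values).

```python
import numpy as np

rng = np.random.default_rng(5)

mu_x = np.array([1.0, -1.0, 0.5])
Kxx = np.array([[4.0, 1.2, 0.4],
                [1.2, 3.0, 0.9],
                [0.4, 0.9, 2.0]])

# Diagonalize: Kxx = E Lambda E^T
lam, E = np.linalg.eigh(Kxx)

# --- Simulation: colour a white vector w -> x = E Lambda^{1/2} w + mu_x
w = rng.normal(size=(200_000, 3))
x = w @ (E * np.sqrt(lam)).T + mu_x
print("simulated covariance:\n", np.cov(x, rowvar=False))     # close to Kxx

# --- Whitening: w_hat = Lambda^{-1/2} E^T (x - mu_x)
w_hat = (x - mu_x) @ E / np.sqrt(lam)
print("whitened covariance:\n", np.cov(w_hat, rowvar=False))  # close to the identity
```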
Random signal transformations
We cannot extend the same two concepts of simulating and whitening to the case of continuous time
random signals or processes. For simulating, we create a filter into which we feed a white noise signal. We
choose the filter so that the output signal simulates the 1st and 2nd moments of any arbitrary random process.
For whitening, we feed any arbitrary random signal into a specially chosen filter so that the output of the filter
is a white noise signal.
Simulating a continuous-time random signal

(Figure: white noise fed into a linear, time-invariant filter to simulate the 1st and 2nd moments of an arbitrary random process.)

We can simulate any wide-sense stationary, continuous-time random process x(t) with constant
mean μ_x, covariance function Kx(τ) and power spectral density Sx(ω). We can simulate this signal using frequency domain techniques.

Because Kx(τ) is Hermitian symmetric and positive semi-definite, it follows that Sx(ω) is real and can be
factored as

Sx(ω) = H(ω) H*(ω) = |H(ω)|²

if and only if Sx(ω) satisfies the Paley-Wiener criterion.

If Sx(ω) is a rational function, we can then factor it into pole-zero form. Choosing a minimum phase H(ω) so that its poles and zeros lie inside the left half s-plane, we can
then simulate x(t) with H(ω) as the transfer function of the filter.
We can simulate x(t) by constructing the following linear, time-invariant filter (∗ denotes convolution):

x(t) = h(t) ∗ w(t) + μ_x

where w(t) is a continuous-time, white-noise signal with the following 1st and 2nd moment properties:

E[w(t)] = 0,    R_ww(τ) = δ(τ)   (unit power spectral density).

Thus, the resultant signal x(t) has the same 2nd moment properties as the desired signal.

Whitening a continuous-time random signal

(Figure: an arbitrary random process x(t) fed into a linear, time-invariant filter that whitens x(t) to create white noise at the output.)

Suppose we have a wide-sense stationary, continuous-time random process x(t)
defined with the same mean μ_x, covariance function Kx(τ), and power spectral density Sx(ω) as above.
We can whiten this signal using frequency domain techniques. We factor the power spectral density
Sx(ω) as described above.

Choosing the minimum phase H(ω) so that its poles and zeros lie inside the left half s-plane, we can
then whiten x(t) with the following inverse filter:

Hinv(ω) = 1 / H(ω).

We choose the minimum phase filter so that the resulting inverse filter is stable. Additionally, we
must be sure that H(ω) is strictly positive for all ω so that Hinv(ω) does not have any singularities.

The final form of the whitening procedure is:

w(t) = hinv(t) ∗ (x(t) − μ_x)

so that w(t) is a white noise random process with zero mean and constant, unit power spectral density

S_ww(ω) = 1.

Note that this power spectral density corresponds to a delta function for the covariance function of w(t).
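A discrete-time analogue of this shaping/whitening idea is easy to demonstrate (an illustrative sketch, not the continuous-time procedure itself): white noise is coloured by a stable first-order filter H(z) = 1/(1 − a z⁻¹), and the inverse filter 1 − a z⁻¹ whitens it again. The pole value a is an assumed example parameter.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(6)
a = 0.9                                   # pole of the shaping filter (assumed example value)

w = rng.normal(size=200_000)              # unit-variance white noise

# Shaping (simulation): x[n] = a x[n-1] + w[n]  ->  H(z) = 1 / (1 - a z^-1)
x = signal.lfilter([1.0], [1.0, -a], w)

# Whitening: pass x through the inverse filter Hinv(z) = 1 - a z^-1
w_rec = signal.lfilter([1.0, -a], [1.0], x)

f, Pxx = signal.welch(x, nperseg=4096)    # lowpass-shaped spectrum
f, Pww = signal.welch(w_rec, nperseg=4096)
print("coloured PSD ratio (max/min):", Pxx.max() / Pxx.min())   # large: not white
print("whitened PSD ratio (max/min):", Pww.max() / Pww.min())   # near 1: roughly flat
```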

Narrowband Noise Representation

In most communication systems, we are often dealing with band-pass filtering of


signals. Wideband noise will be shaped into band limited noise. If the bandwidth of the
band limited noise is relatively small compared to the carrier frequency, we refer to this as
narrowband noise.
We can derive the power spectral density Gn(f) and the auto-correlation
function Rnn(τ) of the narrowband noise and use them to analyze the performance
of linear systems. In practice, we often deal with mixing (multiplication), which is a
non-linear operation, and the system analysis becomes difficult. In such a case, it is
useful to express the narrowband noise as

n(t) = x(t) cos 2πfc t − y(t) sin 2πfc t        (1)

where fc is the carrier frequency within the band occupied by the noise. x(t) and y(t)
are known as the quadrature components of the noise n(t). The Hilbert transform of
n(t) is

n̂(t) = H[n(t)] = x(t) sin 2πfc t + y(t) cos 2πfc t        (2)

Proof:
The Fourier transform of n(t) is

N(f) = (1/2) X(f − fc) + (1/2) X(f + fc) + (j/2) Y(f − fc) − (j/2) Y(f + fc).

Let N̂(f) be the Fourier transform of n̂(t). In the frequency domain,
N̂(f) = N(f)[−j sgn(f)]: we simply multiply all positive frequency components of
N(f) by −j and all negative frequency components of N(f) by j. Thus,

N̂(f) = −j (1/2) X(f − fc) + j (1/2) X(f + fc) − j (j/2) Y(f − fc) + j (−j/2) Y(f + fc)

N̂(f) = −(j/2) X(f − fc) + (j/2) X(f + fc) + (1/2) Y(f − fc) + (1/2) Y(f + fc)

and the inverse Fourier transform of N̂(f) is

n̂(t) = x(t) sin 2πfc t + y(t) cos 2πfc t.

The quadrature components x(t) and y(t) can now be derived from equations (1)
and (2):

x(t) = n(t) cos 2πfc t + n̂(t) sin 2πfc t        (3)

and

y(t) = n̂(t) cos 2πfc t − n(t) sin 2πfc t        (4)

Given n(t), the quadrature components x(t) and y(t) can be obtained by using the above
arrangement.
x(t) and y(t) have the following properties:
1. E[x(t) y(t)] = 0; x(t) and y(t) are uncorrelated with each other.
2. x(t) and y(t) have the same means and variances as n(t).
3. If n(t) is Gaussian, then x(t) and y(t) are also Gaussian.
4. x(t) and y(t) have identical power spectral densities, related to the power
spectral density of n(t) by

Gx(f) = Gy(f) = Gn(f − fc) + Gn(f + fc)        (5)

for |f| ≤ 0.5B (and zero otherwise), where B is the bandwidth of n(t).

Proof:
Equation (5) is the key that will enable us to calculate the effect of noise on AM and FM
systems.

It implies that the power spectral density of x(t) and y(t) can be found by

shifting the positive portion and negative portion of Gn(f) to zero frequency and adding to
give Gx(f) and Gy(f).
In the special case where Gn(f) is symmetrical about the carrier frequency fc,
the positive- and negative-frequency contributions are shifted to zero frequency and added to
give

Gx(f) = Gy(f) = 2Gn(f − fc) = 2Gn(f + fc)        (6)
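Equations (1)-(4) can be checked numerically. The sketch below (an added illustration; the sample rate, carrier and bandwidth values are arbitrary) builds a narrowband noise n(t) by band-pass filtering white noise around fc, forms its Hilbert transform with scipy, and recovers x(t) and y(t) from equations (3) and (4).

```python
import numpy as np
from scipy import signal

fs, fc, B = 100_000.0, 10_000.0, 2_000.0          # sample rate, carrier, noise bandwidth (Hz)
t = np.arange(200_000) / fs

# Narrowband noise: white noise band-pass filtered around fc.
rng = np.random.default_rng(7)
b, a = signal.butter(4, [fc - B / 2, fc + B / 2], btype="bandpass", fs=fs)
n = signal.lfilter(b, a, rng.normal(size=t.size))

n_hat = np.imag(signal.hilbert(n))                 # Hilbert transform of n(t)

# Quadrature components, equations (3) and (4).
x = n * np.cos(2 * np.pi * fc * t) + n_hat * np.sin(2 * np.pi * fc * t)
y = n_hat * np.cos(2 * np.pi * fc * t) - n * np.sin(2 * np.pi * fc * t)

# Property checks: x and y are (nearly) uncorrelated and have the same variance as n.
print("E[x y]  ~", np.mean(x * y))
print("var n, x, y:", np.var(n), np.var(x), np.var(y))
```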

Performance of Binary FSK:


Consider the synchronous detector of binary FSK signals. In the presence of additive white Gaussian
noise (AWGN), w(t), the received signal is

r(t) = A cos 2πfc1 t + w(t)

where A is a constant and fc1 is the carrier frequency employed if a 1 has been sent. The signals at the output of
the band-pass filters of centre frequencies fc1 and fc2 are

r1(t) = A cos 2πfc1 t + n1(t)

and

r2(t) = n2(t)

where

n1(t) = x1(t) cos 2πfc1 t − y1(t) sin 2πfc1 t

and

n2(t) = x2(t) cos 2πfc2 t − y2(t) sin 2πfc2 t

are the narrowband noises. With appropriate design of the low-pass filters and sampling period, the sampled output
signals are

vo1 = A + x1,    vo2 = x2    and    v = vo1 − vo2 = A + [x1 − x2].

x1 and x2 are statistically independent Gaussian random variables with zero mean and fixed variance
σ² = N, where N is the noise power. It can be seen that one of the detectors has signal plus
noise, while the other detector has noise only.

When fc2 is the carrier frequency employed for sending a 0, the received signal is r(t) = A cos 2πfc2 t + w(t). It can be
shown that

v = −A + [x1 − x2].

Since E[(x1 − x2)²] = E[x1²] − 2E[x1 x2] + E[x2²] = E[x1²] + E[x2²] = 2σ², the total
variance is σt² = 2σ².
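The decision statistic v = ±A + (x1 − x2) with variance 2σ² gives an error probability Q(A/√(2σ²)) for this synchronous detector. A minimal Monte Carlo sketch of that statistic follows (A and N are assumed example values; the code is an added illustration, not part of the original material).

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(8)

A, N = 1.0, 0.1            # signal amplitude and per-branch noise power (assumed example values)
trials = 1_000_000

x1 = rng.normal(scale=sqrt(N), size=trials)   # noise in the "mark" branch
x2 = rng.normal(scale=sqrt(N), size=trials)   # noise in the "space" branch

# Transmit a 1: decision statistic v = A + (x1 - x2); an error occurs if v < 0.
v = A + (x1 - x2)
p_sim = np.mean(v < 0)

# Theory: v is Gaussian with mean A and variance 2N, so Pe = Q(A / sqrt(2N)).
sigma_t = sqrt(2 * N)
p_theory = 0.5 * erfc(A / (sigma_t * sqrt(2)))
print(f"simulated Pe = {p_sim:.5f}, theoretical Pe = {p_theory:.5f}")
```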

NOISE TEMPERATURE:
In electronics, noise temperature is a temperature (in kelvins) assigned to a component such that the
noise power delivered by the noisy component to a noiseless matched resistor is given by

P_RL = k_B T_s B_n

in watts, where:
k_B is the Boltzmann constant (1.381×10⁻²³ J/K, joules per kelvin)
T_s is the noise temperature (K)
B_n is the noise bandwidth (Hz)
Engineers often model noisy components as an ideal noiseless component in series with a noisy resistor. The source resistor
is often assumed to be at room temperature, conventionally taken as 290 K (17 °C, 62 °F).

APPLICATIONS:
A communications system is typically made up of a transmitter, a communications channel, and a
receiver. The communications channel may consist of any one or a combination of many different physical media
(air, coaxial cable, printed wiring board traces). The important thing to note is that no matter what physical
media the channel consists of, the transmitted signal will be randomly corrupted by a number of different
processes. The most common form of signal degradation is called additive noise.
The additive noise in a receiving system can be of thermal origin (thermal noise) or can be from other noise-generating processes. Most of these other processes generate noise whose spectrum and probability distributions
are similar to thermal noise. Because of these similarities, the contributions of all noise sources can be lumped
together and regarded as thermal noise. The noise power generated by all these sources (P_n) can be described by
assigning to the noise a noise temperature (T_n) defined as:

T_n = P_n / (k_B B_n)

In a wireless communications receiver, T_n would equal the sum of two noise temperatures:

T_n = T_ant + T_sys

T_ant is the antenna noise temperature and determines the noise power seen at the output of the antenna. The
physical temperature of the antenna has no effect on T_ant. T_sys is the noise temperature of the receiver circuitry
and is representative of the noise generated by the non-ideal components inside the receiver.

NOISE FACTOR AND NOISE FIGURE:


An important application of noise temperature is its use in the determination of a component's noise
factor. The noise factor quantifies the noise power that the component adds to the system when its input noise
temperature is T0:

F = 1 + T_e / T0

where T_e is the component's noise temperature. The noise factor (a linear term) can be converted to noise figure (in decibels) using:

NF = 10 log10(F).
NOISE TEMPERATURE OF A CASCADE:


If there are multiple noisy components in cascade, the noise temperature of the cascade can be calculated
using the Friis equation:

T_cas = T_1 + T_2/G_1 + T_3/(G_1 G_2) + ... + T_n/(G_1 G_2 ... G_{n−1})

where
T_cas = cascade noise temperature
T_1 = noise temperature of the first component in the cascade
T_2 = noise temperature of the second component in the cascade
T_3 = noise temperature of the third component in the cascade
T_n = noise temperature of the nth component in the cascade
G_1 = linear gain of the first component in the cascade
G_2 = linear gain of the second component in the cascade
G_3 = linear gain of the third component in the cascade
G_{n−1} = linear gain of the (n−1)th component in the cascade
Components early in the cascade have a much larger influence on the overall noise temperature than those
later in the chain. This is because noise introduced by the early stages is, along with the signal, amplified by the
later stages. The Friis equation shows why a good quality preamplifier is important in a receive chain.
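A small calculator for the cascade formula is sketched below (added for illustration; the component values in the example, an LNA followed by a lossy cable and a mixer, are assumptions).

```python
def cascade_noise_temperature(temps_K, gains_linear):
    """Friis equation: T_cas = T1 + T2/G1 + T3/(G1*G2) + ..."""
    t_cas, g_prod = 0.0, 1.0
    for T, G in zip(temps_K, gains_linear):
        t_cas += T / g_prod
        g_prod *= G
    return t_cas

# Example (assumed values): LNA (35 K, 20 dB gain), cable (75 K, -1 dB), mixer (900 K, -6 dB).
temps = [35.0, 75.0, 900.0]
gains = [10 ** (20 / 10), 10 ** (-1 / 10), 10 ** (-6 / 10)]
print(f"cascade noise temperature = {cascade_noise_temperature(temps, gains):.1f} K")
```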

MEASURING NOISE TEMPERATURE:


The direct measurement of a component's noise temperature is a difficult process. Suppose that the noise
temperature of a low noise amplifier (LNA) is measured by connecting a noise source to the LNA with a piece of
transmission line. From the cascade noise temperature it can be seen that the noise temperature of the transmission
line has the potential of being the largest contributor to the output measurement (especially when you consider
that LNAs can have noise temperatures of only a few kelvin). To accurately measure the noise temperature of the
LNA, the noise from the input coaxial cable needs to be accurately known. This is difficult because poor surface
finishes and reflections in the transmission line make actual noise temperature values higher than those predicted
by theoretical analysis.
Similar problems arise when trying to measure the noise temperature of an antenna. Since the noise
temperature is heavily dependent on the orientation of the antenna, the direction that the antenna was pointed
during the test needs to be specified. In receiving systems, the system noise temperature will have three main
contributors: the antenna, the transmission line, and the receiver circuitry. The antenna noise
temperature is considered to be the most difficult to measure because the measurement must be made in the field
on an open system. One technique for measuring antenna noise temperature involves using cryogenically cooled
loads to calibrate a noise figure meter before measuring the antenna. This provides a direct reference comparison at
a noise temperature in the range of very low antenna noise temperatures, so that little extrapolation of the collected
data is required.

NOISE FIGURE:
Noise figure (NF) is a measure of degradation of the signal-to-noise ratio (SNR), caused by components
in a radio frequency (RF) signal chain. The noise figure is defined as the ratio of the output noise power of a device
to the portion thereof attributable to thermal noise in the input termination at standard noise temperature T0 (usually
290 K). The noise figure is thus the ratio of actual output noise to that which would remain if the device itself did
not introduce noise. It is a number by which the performance of a radio receiver can be specified.
The noise figure is the difference in decibels (dB) between the noise output of the actual receiver and the
noise output of an ideal receiver with the same overall gain and bandwidth when the receivers are connected to
sources at the standard noise temperature T0 (usually 290 K). The noise power from a simple load is equal to kTB,
where k is Boltzmann's constant, T is the absolute temperature of the load (for example a resistor), and B is the
measurement bandwidth.
This makes the noise figure a useful figure of merit for terrestrial systems where the antenna effective
temperature is usually near the standard 290 K. In this case, one receiver with a noise figure, say, 2 dB better than
another, will have an output signal to noise ratio that is about 2 dB better than the other. However, in the case of
satellite communications systems, where the antenna is pointed out into cold space, the antenna effective
temperature is often colder than 290 K. In these cases a 2 dB improvement in receiver noise figure will result in
more than a 2 dB improvement in the output signal to noise ratio. For this reason, the related figure of effective
noise temperature is therefore often used instead of the noise figure for characterizing satellite-communication
receivers and low noise amplifiers.
In heterodyne systems, output noise power includes spurious contributions from image-frequency
transformation, but the portion attributable to thermal noise in the input termination at standard noise temperature
includes only that which appears in the output via the principal frequency transformation of the system and
excludes that which appears via the image frequency transformation.

DEFINITION:
The noise factor of a system is defined as:

F = SNR_in / SNR_out

where SNR_in and SNR_out are the input and output power signal-to-noise ratios, respectively. The noise figure is
defined as:

NF = SNR_in,dB − SNR_out,dB

where SNR_in,dB and SNR_out,dB are in decibels (dB). The noise figure is the noise factor, given in dB:

NF = 10 log10(F)

These formulae are only valid when the input termination is at standard noise temperature T0, although in practice
small differences in temperature do not significantly affect the values.
The noise factor of a device is related to its noise temperature Te:

F = 1 + Te / T0

Devices with no gain (e.g., attenuators) have a noise factor F equal to their attenuation L (absolute value, not in dB)
when their physical temperature equals T0. More generally, for an attenuator at a physical temperature T, the noise
temperature is Te = (L − 1)T, giving a noise factor of:

F = 1 + (L − 1) T / T0

If several devices are cascaded, the total noise factor can be found with Friis' formula:

F = F_1 + (F_2 − 1)/G_1 + (F_3 − 1)/(G_1 G_2) + ... + (F_n − 1)/(G_1 G_2 ... G_{n−1})

where Fn is the noise factor for the n-th device and Gn is the power gain (linear, not in dB) of the n-th device. In a
well designed receive chain, only the noise factor of the first amplifier should be significant.
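These relations are easy to wrap into small helper functions (an illustrative sketch; the component values used in the example are assumptions, not values from the text).

```python
import math

T0 = 290.0   # standard noise temperature, K

def noise_figure_db(noise_factor):
    return 10.0 * math.log10(noise_factor)

def noise_factor_from_temperature(Te):
    return 1.0 + Te / T0

def friis_noise_factor(factors, gains_linear):
    """Cascaded noise factor: F = F1 + (F2-1)/G1 + (F3-1)/(G1*G2) + ..."""
    f_total, g_prod = 0.0, 1.0
    for i, (F, G) in enumerate(zip(factors, gains_linear)):
        f_total += F if i == 0 else (F - 1.0) / g_prod
        g_prod *= G
    return f_total

# Example (assumed): LNA with NF = 0.8 dB and 15 dB gain, followed by a mixer with NF = 7 dB.
F_lna, G_lna = 10 ** (0.8 / 10), 10 ** (15 / 10)
F_mix, G_mix = 10 ** (7.0 / 10), 1.0
F_tot = friis_noise_factor([F_lna, F_mix], [G_lna, G_mix])
print(f"total noise figure = {noise_figure_db(F_tot):.2f} dB")
```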

IMPORTANT QUESTIONS
PART A

1. Define noise figure.


2. What is white noise?
3. What is thermal noise? Give the expression for the thermal noise voltage across a resistor.
4. What is shot noise?
5. Define noise temperature.
6. Find the thermal noise voltage developed across a resistor of 700 Ω. The bandwidth of the
measuring instrument is 7 MHz and the ambient temperature is 27 °C.
7. Define a random variable?
8. What is a random process?
9. What is Gaussian process?
10. What is a stationary random process?

PART B
1. Derive the effective noise temperature of a cascade amplifier. Explain how the various
noises are generated in the method of representing them. (16)
2. Explain how the various noises are generated and the method of representing them. (16)
3. Write notes on noise temperature and noise figure. (8)
4. Derive the noise figure for cascade stages. (8)
5. What is narrowband noise? Discuss the properties of the quadrature components of a
narrowband noise. (8)
6. What is meant by noise equivalent bandwidth? Illustrate it with a diagram. (8)
7. Derive the expression for output signal to noise for a DSB-SC receiver using coherent
detection. (16)
8. Write short notes on noise in SSB. (16)
9. Discuss the following: . (16)
i) Noise equivalent bandwidth (4)
ii) Narrow band noise (4)
iii) Noise temperature (4)
iv) Noise spectral density (4)
10. How is a sine wave plus noise represented? Obtain the joint PDF of such a noise
component. (16)

UNIT IV
NOISE CHARACTERIZATION

Superheterodyne radio receiver and its characteristic.


SNR.
Noise in DSBSC systems using coherent detection.
Noise in AM systems using envelope detection.
Noise in FM systems.
FM threshold effect.
Pre-emphasis and de-emphasis in FM.
Comparison of performances.

SUPERHETERODYNE RADIO RECEIVER:


In electronics, a superheterodyne receiver uses frequency mixing or heterodyning to convert a received
signal to a fixed intermediate frequency, which can be more conveniently processed than the original radio carrier
frequency. Virtually all modern radio and television receivers use the superheterodyne principle.

DESIGN AND EVOLUTION:

Schematic of a typical superheterodyne receiver.


The diagram above shows the minimum requirements for a single-conversion superheterodyne receiver
design. The essential elements, common to all superheterodyne circuits, are: a signal-receiving antenna, a
broadband RF amplifier, a variable-frequency local oscillator, a frequency mixer, a band-pass filter to remove
unwanted mixer product signals, and a demodulator to recover the original audio signal. Cost-optimized designs use
one active device for both local oscillator and mixer, called a "converter" stage. One example is the pentagrid
converter.
Circuit description:
A suitable antenna is required to receive the chosen range of broadcast signals. The signal received is very
small, sometimes only a few microvolts. Reception starts with the antenna signal fed to the R.F. stage. The R.F.
amplifier stage must be selectively tuned to pass only the desired range of channels required. To allow the receiver
to be tuned to a particular broadcast channel a method of changing the frequency of the local oscillator is needed.
The tuning circuit in a simple design may use a variable capacitor, or varicap diode. Only one or two tuned stages
need to be adjusted to track over the tuning range of the receiver.
Mixer stage
The signal is then fed into the mixer stage circuit. The mixer is also fed with a signal from the variable
frequency local oscillator (VFO) circuit. The mixer produces both sum and difference beat frequency signals,
each one containing a copy of the desired signal. The four frequencies at the output include the wanted signal fd,
the original fLO, and the two new frequencies fd + fLO and fd − fLO. The output signal also contains a number of
undesirable frequencies: 3rd- and higher-order intermodulation products. These multiple signals are
removed by the I.F. bandpass filter, leaving only the desired offset I.F. signal fIF, which contains the
original broadcast information fd.

Intermediate frequency stage:


All the intermediate-frequency stages operate at a fixed frequency which need not be adjusted. [6] The I.F.
amplifier section at fIF is tuned to be highly selective. By changing fLO, the resulting fd − fLO (or fd + fLO) signal can be
tuned to the amplifier's fIF; the suitably amplified signal includes the frequency the user wishes to tune, fd. The local
oscillator is tuned to produce a frequency fLO close to fd. In typical amplitude modulation ("AM radio" in the U.S.,
or MW) receivers, that frequency is 455 kHz;[10] for FM receivers, it is usually 10.7 MHz; for television, 33.4 to
45.75 MHz.
Other signals from the mixed output of the heterodyne are filtered out by this stage. This depends on the
intermediate frequency chosen in the design process. Typically it is 455 kHz for a single-stage conversion receiver.
A higher chosen I.F. offset reduces the effect that interference from powerful radio transmissions in adjacent
broadcast bands has on the required signal.
Usually the intermediate frequency is lower than either the carrier or oscillator frequencies, but with some
types of receiver (e.g. scanners and spectrum analyzers) it is more convenient to use a higher intermediate
frequency. In order to avoid interference to and from signal frequencies close to the intermediate frequency, in
many countries IF frequencies are controlled by regulatory authorities. Examples of common IFs are 455 kHz for
medium-wave AM radio, 10.7 MHz for FM, 38.9 MHz (Europe) or 45 MHz (US) for television, and 70 MHz for
satellite and terrestrial microwave equipment.
Bandpass filter:
The filter must have a band-pass range equal to or less than the frequency spacing between adjacent
broadcast channels. A perfect filter would have a high attenuation factor for adjacent channels, but with a broad
bandpass response to obtain a better quality of received signal. This may be designed with a dual frequency tuned
coil filter design, or a multi pole ceramic crystal filter.
Demodulation:
The received signal is now processed by the demodulator stage where the broadcast signal (usually audio, but
possibly data) is recovered and amplified. A.M. demodulation requires simple rectification of the
signal and a resistor-capacitor (RC) low-pass filter to remove the high-frequency
carrier component. Other modes of transmission will require more specialized circuits to recover the
broadcast signal. The remaining audio signal is then amplified and fed to a suitable transducer, such as a
loudspeaker or headphones.
Advanced designs:
To overcome obstacles such as image response, multiple IF stages are used, and in some cases multiple
stages with two IFs of different values are used. For example, the front end might be sensitive to 1-30 MHz, the
first half of the radio to 5 MHz, and the last half to 50 kHz. Two frequency converters would be used, and the radio
would be a double conversion superheterodyne; a common example is a television receiver where the audio
information is obtained from a second stage of intermediate-frequency conversion. Receivers which are tunable

over a wide bandwidth (e.g. scanners) may use an intermediate frequency higher than the signal, in order to
improve image rejection.
Other uses:
In the case of modern television receivers, no other technique was able to produce the precise bandpass
characteristic needed for vestigial sideband reception, first used with the original NTSC system introduced in 1941.
This originally involved a complex collection of tunable inductors which needed careful adjustment, but since the
1970s or early 1980s these have been replaced with precision electromechanical surface acoustic wave (SAW)
filters. Fabricated by precision laser milling techniques, SAW filters are cheaper to produce, can be made to
extremely close tolerances, and are stable in operation. To avoid tooling costs associated with these components
most manufacturers then tended to design their receivers around the fixed range of frequencies offered which
resulted in de-facto standardization of intermediate frequencies.
Modern designs:
Microprocessor technology allows replacing the superheterodyne receiver design by a software defined
radio architecture, where the IF processing after the initial IF filter is implemented in software. This technique is
already in use in certain designs, such as very low-cost FM radios incorporated into mobile phones, since the
system already has the necessary microprocessor.
Radio transmitters may also use a mixer stage to produce an output frequency, working more or less as the
reverse of a superheterodyne receiver.
Technical advantages:
Superheterodyne receivers have superior characteristics to simpler receiver types in frequency stability
and selectivity. They offer better stability than Tuned radio frequency receivers (TRF) because a tunable oscillator
is more easily stabilized than a tunable amplifier, especially with modern frequency synthesizer technology. IF
filters can give narrower pass bands at the same Q factor than an equivalent RF filter. A fixed IF also allows the
use of a crystal filter when exceptionally high selectivity is necessary. Regenerative and super-regenerative
receivers offer better sensitivity than a TRF receiver, but suffer from stability and selectivity problems.
Drawbacks of this design:
High-side and low-side injection:
The amount that a signal is down-shifted by the local oscillator depends on whether its frequency f is
higher or lower than fLO. That is because its new frequency is |f − fLO| in either case. Therefore, there are potentially
two signals that could both shift to the same fIF: one at f = fLO + fIF and another at f = fLO − fIF. One of those signals,
called the image frequency, has to be filtered out prior to the mixer to avoid aliasing. When the upper one is filtered
out, it is called high-side injection, because fLO is above the frequency of the received signal. The other case is
called low-side injection. High-side injection also reverses the order of a signal's frequency components. Whether
that actually changes the signal depends on whether it has spectral symmetry. The reversal can be undone later in
the receiver, if necessary.

Image Frequency (fimage):


One major disadvantage to the superheterodyne receiver is the problem of image frequency. In heterodyne
receivers, an image frequency is an undesired input frequency equal to the station frequency plus twice the
intermediate frequency. The image frequency results in two stations being received at the same time, thus
producing interference. Image frequencies can be eliminated by sufficient attenuation on the incoming signal by
the RF amplifier filter of the superheterodyne receiver.
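The image relationship is simple to compute; the sketch below (the station and IF values are assumed examples) shows the image for high-side injection, where fimage = fstation + 2 fIF.

```python
def image_frequency(f_station_hz, f_if_hz, high_side=True):
    """Image frequency of a superheterodyne receiver."""
    return f_station_hz + 2 * f_if_hz if high_side else f_station_hz - 2 * f_if_hz

# Example: a 1000 kHz AM station with a 455 kHz IF (high-side injection).
f_station, f_if = 1000e3, 455e3
f_lo = f_station + f_if
print(f"LO = {f_lo/1e3:.0f} kHz, image = {image_frequency(f_station, f_if)/1e3:.0f} kHz")
```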

Early Autodyne receivers typically used IFs of only 150 kHz or so, as it was difficult to maintain reliable
oscillation if higher frequencies were used. As a consequence, most Autodyne receivers needed quite elaborate
antenna tuning networks, often involving double-tuned coils, to avoid image interference. Later super heterodynes
used tubes especially designed for oscillator/mixer use, which were able to work reliably with much higher IFs,
reducing the problem of image interference and so allowing simpler and cheaper aerial tuning circuitry.
For medium-wave AM radio, a variety of IFs have been used, but usually 455 kHz is used.
Local oscillator radiation:
It is difficult to keep stray radiation from the local oscillator below the level that a nearby receiver can
detect. The receiver's local oscillator can act like a miniature CW transmitter. This means that there can be mutual
interference in the operation of two or more superheterodyne receivers in close proximity. In espionage, oscillator
radiation gives a means to detect a covert receiver and its operating frequency. One effective way of preventing the
local oscillator signal from radiating out from the receiver's antenna is by adding a shielded and power supply
decoupled stage of RF amplification between the receiver's antenna and its mixer stage.
Local oscillator sideband noise:
Local oscillators typically generate a single frequency signal that has negligible amplitude modulation but
some random phase modulation. Either of these impurities spreads some of the signal's energy into sideband
frequencies. That causes a corresponding widening of the receiver's frequency response, which would defeat the
aim to make a very narrow bandwidth receiver such as to receive low-rate digital signals. Care needs to be taken to
minimize oscillator phase noise, usually by ensuring that the oscillator never enters a non-linear mode.

SIGNAL-TO-NOISE RATIO:

Signal-to-noise ratio (often abbreviated SNR or S/N) is a measure used in science and engineering to
quantify how much a signal has been corrupted by noise. It is defined as the ratio of signal power to the noise
power corrupting the signal. A ratio higher than 1:1 indicates more signal than noise. While SNR is commonly
quoted for electrical signals, it can be applied to any form of signal (such as isotope levels in an ice core or
biochemical signaling between cells).

In less technical terms, signal-to-noise ratio compares the level of a desired signal (such as music) to the
level of background noise. The higher the ratio, the less obtrusive the background noise is.
"Signal-to-noise ratio" is sometimes used informally to refer to the ratio of useful information to false or
irrelevant data in a conversation or exchange. For example, in online discussion forums and other online
communities, off-topic posts and spam are regarded as "noise" that interferes with the "signal" of appropriate
discussion.
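In practice the ratio is usually quoted in decibels; a two-line helper (added for illustration):

```python
import math

def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in dB from signal and noise powers (same units)."""
    return 10.0 * math.log10(signal_power / noise_power)

print(snr_db(1e-3, 1e-6))   # 1 mW signal over 1 uW noise -> 30 dB
```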

FM DEMODULATORS AND THRESHOLD EFFECT:


An important aspect of analogue FM satellite systems is the FM threshold effect. In FM systems where the
signal level is well above the noise, the received carrier-to-noise ratio and the demodulated signal-to-noise ratio are related by a fixed (linear, in dB) relationship.

This relationship, however, does not apply when the carrier-to-noise ratio decreases below a certain point.
Below this critical point the signal-to-noise ratio decreases significantly. This is known as the FM threshold effect.
(FM threshold is usually defined as the carrier-to-noise ratio at which the demodulated signal-to-noise ratio falls 1
dB below the linear relationship. It generally is considered to occur at about 10 dB.)
Below the FM threshold point the noise signal (whose amplitude and phase are randomly varying), may
instantaneously have an amplitude greater than that of the wanted signal. When this happens the noise will produce
a sudden change in the phase of the FM demodulator output. In an audio system this sudden phase change makes a
"click". In video applications the term "click noise" is used to describe short horizontal black and white lines that
appear randomly over a picture.
Because satellite communications systems are power limited they usually operate with only a small design
margin above the FM threshold point (perhaps a few dB). Because of this circuit designers have tried to devise
techniques to delay the onset of the FM threshold effect. These devices are generally known as FM threshold
extension demodulators. Techniques such as FM feedback, phase locked loops and frequency locked loops are used
to achieve this effect. By such techniques the onset of FM threshold effects can be delayed until the C/N ratio is
around 7 dB.
Pre-emphasis and de-emphasis:
Random noise has a 'triangular' spectral distribution in an FM system, with the effect that noise occurs
predominantly at the highest frequencies within the baseband. This can be offset, to a limited extent, by boosting
the high frequencies before transmission and reducing them by a corresponding amount in the receiver. Reducing

the high frequencies in the receiver also reduces the high-frequency noise. These processes of boosting and then
reducing certain frequencies are known as pre-emphasis and de-emphasis, respectively.
The amount of pre-emphasis and de-emphasis used is defined by the time constant of a simple RC
filter circuit. In most of the world a 50 µs time constant is used. In North America, 75 µs is used. This applies to
both mono and stereo transmissions and to baseband audio (not the subcarriers).
The amount of pre-emphasis that can be applied is limited by the fact that many forms of contemporary
music contain more high-frequency energy than the musical styles which prevailed at the birth of FM broadcasting.
They cannot be pre-emphasized as much because it would cause excessive deviation of the FM carrier. (Systems
more modern than FM broadcasting tend to use either programme-dependent variable pre-emphasis, e.g. dbx in
the BTSC TV sound system, or none at all.)
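The time constants translate directly into corner frequencies f = 1/(2πτ); the short check below is an added illustration.

```python
import math

def preemphasis_corner_hz(tau_seconds):
    """Corner frequency of the first-order pre-emphasis/de-emphasis network."""
    return 1.0 / (2.0 * math.pi * tau_seconds)

for tau_us in (50, 75):
    print(f"{tau_us} us time constant -> corner at {preemphasis_corner_hz(tau_us * 1e-6):.0f} Hz")
```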
FM stereo:
In the late 1950s, several systems to add stereo to FM radio were considered by the FCC. Included were
systems from 14 proponents including Crosley, Halstead, Electrical and Musical Industries, Ltd (EMI), Zenith
Electronics Corporation and General Electric. The individual systems were evaluated for their strengths and
weaknesses during field tests in Uniontown, Pennsylvania using KDKA-FM in Pittsburgh as the originating
station. The Crosley system was rejected by the FCC because it degraded the signal-to-noise ratio of the main
channel and did not perform well under multipath RF conditions. In addition, it did not allow for SCA services
because of its wide FM sub-carrier bandwidth. The Halstead system was rejected due to lack of high frequency
stereo separation and reduction in the main channel signal-to-noise ratio. The GE and Zenith systems, so similar
that they were considered theoretically identical, were formally approved by the FCC in April 1961 as the standard
stereo FM broadcasting method in the USA and later adopted by most other countries.
It is important that stereo broadcasts should be compatible with mono receivers. For this reason, the left
(L) and right (R) channels are algebraically encoded into sum (L+R) and difference (LR) signals. A mono
receiver will use just the L+R signal so the listener will hear both channels in the single loudspeaker. A stereo
receiver will add the difference signal to the sum signal to recover the left channel, and subtract the difference
signal from the sum to recover the right channel.
The (L+R) Main channel signal is transmitted as baseband audio in the range of 30 Hz to 15 kHz. The
(LR) Sub-channel signal is modulated onto a 38 kHz double-sideband suppressed carrier (DSBSC) signal
occupying the baseband range of 23 to 53 kHz.
A 19 kHz pilot tone, at exactly half the 38 kHz sub-carrier frequency and with a precise phase relationship
to it, as defined by the formula below, is also generated. This is transmitted at 810% of overall modulation level
and used by the receiver to regenerate the 38 kHz sub-carrier with the correct phase.
The final multiplex signal from the stereo generator contains the Main Channel (L+R), the pilot tone, and
the sub-channel (L-R). This composite signal, along with any other sub-carriers, modulates the FM transmitter.
The instantaneous deviation of the transmitter carrier frequency due to the stereo audio and pilot tone (at
10% modulation) is:

Deviation = 75 kHz x [ 0.9 x ( (A+B)/2 + ((A-B)/2) x sin(4*pi*fp*t) ) + 0.1 x sin(2*pi*fp*t) ]

where A and B are the pre-emphasized left and right audio signals and fp is the frequency of the pilot tone. Slight
variations in the peak deviation may occur in the presence of other subcarriers or because of local regulations.
Converting the multiplex signal back into left and right audio signals is performed by a stereo decoder,
which is built into stereo receivers.
In order to preserve stereo separation and signal-to-noise parameters, it is normal practice to apply pre-emphasis to the left and right channels before encoding, and to apply de-emphasis at the receiver after decoding.
Stereo FM signals are more susceptible to noise and multipath distortion than are mono FM signals.
In addition, for a given RF level at the receiver, the signal-to-noise ratio for the stereo signal will be worse
than for the mono receiver. For this reason many FM stereo receivers include a stereo/mono switch to allow
listening in mono when reception conditions are less than ideal, and most car radios are arranged to reduce the
separation as the signal-to-noise ratio worsens, eventually going to mono while still indicating a stereo signal is
being received.

IMPORTANT QUESTIONS
PART A
1. How is threshold reduction achieved in an FM receiver?
2. What is meant by FOM of a receiver?
3. What is extended threshold demodulator?
4. Draw the Phasor representation of FM noise.
5. Define pre-emphasis and de-emphasis.
6. What is capture effect in FM?
7. What is the SNR for AM with small noise case?
8. What is threshold effect with respect to noise?
9. Define SNR.
10. Define CSNR.
11. Discuss the factors that influence the choice of intermediate frequency in a radio receiver.

PART B
1. Define Hilbert Transform with a suitable example. Give the method of generation and
detection of SSB wave. (16)
2. Discuss the noise performance of AM system using envelope detection. (16)
3. Compare the noise performance of AM and FM systems. (16)
4. Explain the significance of pre-emphasis and de-emphasis in FM system? (8)
5. Derive the noise power spectral density of the FM demodulation and explain its
performance with diagram. (16)
6. Draw the block diagram of FM demodulator and explain the effect of noise in detail.
Explain the FM threshold effect and capture effect in FM? (16)
7. Explain the FM receiver with block diagram. (8)

UNIT V
INFORMATION THEORY

Discrete messages and information content.


Concept of amount of information.
Average information.
Entropy.
Information rate.
Source coding to increase average information per bit.
Shannon-fano coding.
Huffman coding.
Lempel-Ziv (LZ) coding.
Shannon's theorem.
Channel capacity.
Bandwidth.
S/N trade-off.
Mutual information.
Channel capacity.
Rate distortion theory.
Lossy source coding.

INFORMATION THEORY:
Information theory is a branch of applied mathematics and electrical engineering involving the
quantification of information. Information theory was developed by Claude E. Shannon to find fundamental limits
on signal processing operations such as compressing data and on reliably storing and communicating data. Since its
inception it has broadened to find applications in many other areas, including statistical inference, natural language
processing, cryptography generally, networks other than communication networks (as in neurobiology), the
evolution and function of molecular codes, model selection in ecology, thermal physics, quantum computing,
plagiarism detection and other forms of data analysis.
A key measure of information is known as entropy, which is usually expressed by the average number of
bits needed for storage or communication. Entropy quantifies the uncertainty involved in predicting the value of
a random variable. For example, specifying the outcome of a fair coin flip (two equally likely outcomes) provides
less information (lower entropy) than specifying the outcome from a roll of a die (six equally likely outcomes).
Applications of fundamental topics of information theory include lossless data compression (e.g. ZIP
files), lossy data compression (e.g. MP3s), and channel coding (e.g. for DSL lines). The field is at the intersection
of mathematics, statistics, computer science, physics, neurobiology, and electrical engineering. Its impact has been
crucial to the success of the Voyager missions to deep space, the invention of the compact disc, the feasibility of
mobile phones, the development of the Internet, the study of linguistics and of human perception, the
understanding of black holes, and numerous other fields. Important sub-fields of information theory are source
coding, channel coding, algorithmic complexity theory, algorithmic information theory, information-theoretic
security, and measures of information.

OVERVIEW:
The main concepts of information theory can be grasped by considering the most widespread means of
human communication: language. Two important aspects of a concise language are as follows: First, the most
common words (e.g., "a", "the", "I") should be shorter than less common words (e.g., "benefit", "generation",
"mediocre"), so that sentences will not be too long. Such a tradeoff in word length is analogous to data
compression and is the essential aspect of source coding. Second, if part of a sentence is unheard or misheard due
to noise (e.g., a passing car), the listener should still be able to glean the meaning of the underlying message.
Such robustness is as essential for an electronic communication system as it is for a language; properly building
such robustness into communications is done by channel coding. Source coding and channel coding are the
fundamental concerns of information theory.
Note that these concerns have nothing to do with the importance of messages. For example, a platitude
such as "Thank you; come again" takes about as long to say or write as the urgent plea, "Call an ambulance!" while
the latter may be more important and more meaningful in many contexts. Information theory, however, does not
consider message importance or meaning, as these are matters of the quality of data rather than the quantity and
readability of data, the latter of which is determined solely by probabilities.
Information theory is generally considered to have been founded in 1948 by Claude Shannon in his
seminal work, "A Mathematical Theory of Communication". The central paradigm of classical information theory
is the engineering problem of the transmission of information over a noisy channel. The most fundamental results
of this theory are Shannon's source coding theorem, which establishes that, on average, the number of bits needed
to represent the result of an uncertain event is given by its entropy; and Shannon's noisy-channel coding theorem,

which states that reliable communication is possible over noisy channels provided that the rate of communication
is below a certain threshold, called the channel capacity. The channel capacity can be approached in practice by
using appropriate encoding and decoding systems.
Information theory is closely associated with a collection of pure and applied disciplines that have been
investigated and reduced to engineering practice under a variety of rubrics throughout the world over the past half
century or more: adaptive systems, anticipatory systems, artificial intelligence, complex systems, complexity
science, cybernetics, informatics, machine learning, along with systems sciences of many descriptions. Information
theory is a broad and deep mathematical theory, with equally broad and deep applications, amongst which is the
vital field of coding theory.
Coding theory is concerned with finding explicit methods, called codes, of increasing the efficiency and
reducing the net error rate of data communication over a noisy channel to near the limit that Shannon proved is the
maximum possible for that channel. These codes can be roughly subdivided into data compression (source coding)
and error-correction (channel coding) techniques. In the latter case, it took many years to find the methods
Shannon's work proved were possible. A third class of information theory codes are cryptographic algorithms (both
codes and ciphers). Concepts, methods and results from coding theory and information theory are widely used
in cryptography and cryptanalysis. See the article ban (information) for a historical application.
Information theory is also used in information retrieval, intelligence gathering, gambling, statistics, and
even in musical composition.
Quantities of information

Information theory is based on probability theory and statistics. The most important quantities of
information are entropy, the information in a random variable, and mutual information, the amount of
information in common between two random variables. The former quantity indicates how easily message
data can be compressed while the latter can be used to find the communication rate across a channel.
The choice of logarithmic base in the following formulae determines the unit of information entropy that is
used. The most common unit of information is the bit, based on the binary logarithm. Other units include
the nat, which is based on the natural logarithm, and the hartley, which is based on the common logarithm.
In what follows, an expression of the form p log p is considered by convention to be equal to zero
whenever p = 0. This is justified because lim(p -> 0+) p log p = 0 for any logarithmic base.

Entropy:

Entropy of a Bernoulli trial as a function of success probability, often called the binary entropy
function, Hb(p). The entropy is maximized at 1 bit per trial when the two possible outcomes are equally
probable, as in an unbiased coin toss.
The entropy, H, of a discrete random variable X is a measure of the amount of uncertainty associated with
the value of X.
Suppose one transmits 1000 bits (0s and 1s). If these bits are known ahead of transmission (to be a certain
value with absolute probability), logic dictates that no information has been transmitted. If, however, each is
equally and independently likely to be 0 or 1, 1000 bits (in the information theoretic sense) have been
transmitted. Between these two extremes, information can be quantified as follows. If {x1, ..., xn} is the set of all
messages that X could be, and p(x) is the probability of X taking the value x, then the entropy of X is defined:

H(X) = E_X[ I(x) ] = - Sum over x of p(x) log p(x)

(Here, I(x) is the self-information, which is the entropy contribution of an individual message, and E_X is
the expected value.) An important property of entropy is that it is maximized when all the messages in the message
space are equiprobable, p(x) = 1/n (i.e., most unpredictable), in which case H(X) = log n.
The special case of information entropy for a random variable with two outcomes is the binary entropy
function, usually taken to logarithmic base 2:

Hb(p) = -p log2(p) - (1 - p) log2(1 - p)
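As a quick illustration (a minimal sketch, not part of the original notes), the entropy formula above can be evaluated numerically; the 0·log 0 = 0 convention is handled by skipping zero-probability terms, and the function name is illustrative only.

```python
import math

def entropy(probs, base=2):
    """Entropy H = -sum p*log(p), with the convention 0*log(0) = 0."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Binary entropy function Hb(p): maximized at 1 bit per trial when p = 0.5
print(entropy([0.5, 0.5]))   # 1.0 bit  (fair coin)
print(entropy([0.9, 0.1]))   # ~0.469 bits (a biased coin carries less information)
print(entropy([1/6] * 6))    # ~2.585 bits (fair die, i.e. log2(6))
```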

Joint entropy:
The joint entropy of two discrete random variables X and Y is merely the entropy of their pairing: (X,Y).
This implies that if X and Y are independent, then their joint entropy is the sum of their individual entropies.
For example, if (X,Y) represents the position of a chess piece (X the row and Y the column), then the joint entropy
of the row of the piece and the column of the piece will be the entropy of the position of the piece.

Despite similar notation, joint entropy should not be confused with cross entropy.

Conditional entropy (equivocation):


The conditional entropy or conditional uncertainty of X given random variable Y (also called
the equivocation of X about Y) is the average conditional entropy over Y:

H(X|Y) = E_Y[ H(X|y) ] = - Sum over y of p(y) Sum over x of p(x|y) log p(x|y)

Because entropy can be conditioned on a random variable or on that random variable being a certain
value, care should be taken not to confuse these two definitions of conditional entropy, the former of which is in
more common use. A basic property of this form of conditional entropy is that:

H(X|Y) = H(X,Y) - H(Y)

Mutual information (transinformation):


Mutual information measures the amount of information that can be obtained about one random variable
by observing another. It is important in communication where it can be used to maximize the amount of
information shared between sent and received signals. The mutual information of X relative to Y is given by:

I(X;Y) = E_{X,Y}[ SI(x,y) ] = Sum over x,y of p(x,y) log [ p(x,y) / ( p(x) p(y) ) ]

where SI (specific mutual information) is the pointwise mutual information.


A basic property of the mutual information is that

I(X;Y) = H(X) - H(X|Y)

That is, knowing Y, we can save an average of I(X;Y) bits in encoding X compared to not knowing Y.
Mutual information is symmetric:

I(X;Y) = I(Y;X) = H(X) + H(Y) - H(X,Y)

Mutual information can be expressed as the average Kullback-Leibler divergence (information gain) of
the posterior probability distribution of X given the value of Y from the prior distribution on X:

I(X;Y) = E_{p(y)} [ D_KL( p(X|Y=y) || p(X) ) ]

In other words, this is a measure of how much, on the average, the probability distribution on X will
change if we are given the value of Y. This is often recalculated as the divergence from the product of the marginal
distributions to the actual joint distribution:

I(X;Y) = D_KL( p(X,Y) || p(X) p(Y) )

Mutual information is closely related to the log-likelihood ratio test in the context of contingency tables
and the multinomial distribution and to Pearson's χ² test: mutual information can be considered a statistic for
assessing independence between a pair of variables, and has a well-specified asymptotic distribution.
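As a hedged illustration of the definition above, the sketch below computes I(X;Y) directly from a joint probability table; the function name and the example channel are assumptions added for illustration, not taken from the source.

```python
import math

def mutual_information(joint, base=2):
    """I(X;Y) = sum_{x,y} p(x,y) * log[ p(x,y) / (p(x) p(y)) ].
    `joint` is a dict {(x, y): p(x, y)} whose values sum to 1."""
    px, py = {}, {}
    for (x, y), p in joint.items():          # marginal distributions
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log(p / (px[x] * py[y]), base)
               for (x, y), p in joint.items() if p > 0)

# A noisy binary channel: uniform input, flipped with probability 0.1
joint = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
print(mutual_information(joint))  # ~0.531 bits shared between input and output
```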

Kullback-Leibler divergence (information gain):


The Kullback-Leibler divergence (or information divergence, information gain, or relative entropy) is a
way of comparing two distributions: a "true" probability distribution p(X), and an arbitrary probability
distribution q(X). If we compress data in a manner that assumes q(X) is the distribution underlying some data,
when, in reality, p(X) is the correct distribution, the Kullback-Leibler divergence is the number of average
additional bits per datum necessary for compression. It is thus defined

D_KL( p(X) || q(X) ) = Sum over x of p(x) log [ p(x) / q(x) ]

Although it is sometimes used as a 'distance metric', it is not a true metric since it is not symmetric and does not
satisfy the triangle inequality (making it a semi-quasimetric).
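The following short sketch (illustrative only; distributions and names are assumptions) evaluates the definition above and shows the asymmetry mentioned in the text.

```python
import math

def kl_divergence(p, q, base=2):
    """D_KL(p || q) = sum_x p(x) log[ p(x)/q(x) ]: the average number of extra
    bits per datum paid for coding data from p with a code designed for q."""
    return sum(pi * math.log(pi / qi, base) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.25]       # "true" distribution
q = [1/3, 1/3, 1/3]         # assumed distribution
print(kl_divergence(p, q))  # ~0.085 bits of overhead per symbol
print(kl_divergence(q, p))  # ~0.082 bits -- not symmetric, so not a true metric
```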

Coding theory:
Coding theory is one of the most important and direct applications of information theory. It can be
subdivided into source coding theory and channel coding theory. Using a statistical description for data,
information theory quantifies the number of bits needed to describe the data, which is the information entropy of
the source.
Data compression (source coding): There are two formulations for the compression problem:
1. lossless data compression: the data must be reconstructed exactly;
2. lossy data compression: allocates the bits needed to reconstruct the data, within a specified fidelity level
measured by a distortion function. This subset of information theory is called rate-distortion theory.

Error-correcting codes (channel coding): While data compression removes as much redundancy as
possible, an error correcting code adds just the right kind of redundancy (i.e., error correction) needed to
transmit the data efficiently and faithfully across a noisy channel.
This division of coding theory into compression and transmission is justified by the information transmission

theorems, or source-channel separation theorems, that justify the use of bits as the universal currency for
information in many contexts. However, these theorems only hold in the situation where one transmitting user
wishes to communicate to one receiving user. In scenarios with more than one transmitter (the multiple-access
channel), more than one receiver (the broadcast channel) or intermediary "helpers" (the relay channel), or more
general networks, compression followed by transmission may no longer be optimal. Network information
theory refers to these multi-agent communication models.

SOURCE THEORY:
Any process that generates successive messages can be considered a source of information. A memoryless
source is one in which each message is an independent identically-distributed random variable, whereas the
properties of ergodicity and stationarity impose more general constraints. All such sources are stochastic. These
terms are well studied in their own right outside information theory.

Rate:
Information rate is the average entropy per symbol. For memoryless sources, this is merely the entropy of
each symbol, while, in the case of a stationary stochastic process, it is

r = lim as n -> infinity of H( X_n | X_{n-1}, X_{n-2}, ..., X_1 );

that is, the conditional entropy of a symbol given all the previous symbols generated. For the more general
case of a process that is not necessarily stationary, the average rate is

r = lim as n -> infinity of (1/n) H( X_1, X_2, ..., X_n );

that is, the limit of the joint entropy per symbol. For stationary sources, these two expressions give the
same result.

It is common in information theory to speak of the "rate" or "entropy" of a language. This is appropriate,
for example, when the source of information is English prose. The rate of a source of information is related to
its redundancy and how well it can be compressed, the subject of source coding.

Channel capacity:
Communications over a channel, such as an Ethernet cable, is the primary motivation of information
theory. As anyone who's ever used a telephone (mobile or landline) knows, however, such channels often fail to
produce exact reconstruction of a signal; noise, periods of silence, and other forms of signal corruption often
degrade quality. How much information can one hope to communicate over a noisy (or otherwise imperfect)
channel?
Consider the communications process over a discrete channel. A simple model of the process is as
follows.

Here X represents the space of messages transmitted, and Y the space of messages received during a unit
time over our channel. Let p(y | x) be the conditional probability distribution function of Y given X. We will
consider p(y | x) to be an inherent fixed property of our communications channel (representing the nature of
the noise of our channel). Then the joint distribution of X and Y is completely determined by our channel and by
our choice of f(x), the marginal distribution of messages we choose to send over the channel. Under these
constraints, we would like to maximize the rate of information, or the signal, we can communicate over the
channel. The appropriate measure for this is the mutual information, and this maximum mutual information is
called the channel capacity and is given by:

C = max over f(x) of I(X;Y)

This capacity has the following property related to communicating at information rate R (where R is
usually bits per symbol). For any information rate R < C and coding error ε > 0, for large enough N, there exists a
code of length N and rate R and a decoding algorithm, such that the maximal probability of block error is at most ε;
that is, it is always possible to transmit with arbitrarily small block error. In addition, for any rate R > C, it is
impossible to transmit with arbitrarily small block error.
Channel coding is concerned with finding such nearly optimal codes that can be used to transmit data over
a noisy channel with a small coding error at a rate near the channel capacity.
BIT RATE:
In telecommunications and computing, bitrate (sometimes written bit rate, data rate or as a
variable R or fb) is the number of bits that are conveyed or processed per unit of time.
The bit rate is quantified using the bits per second (bit/s or bps) unit, often in conjunction with an SI
prefix such as kilo- (kbit/s or kbps), mega-(Mbit/s or Mbps), giga- (Gbit/s or Gbps) or tera- (Tbit/s or Tbps). Note
that, unlike many other computer-related units, 1 kbit/s is traditionally defined as 1,000 bit/s, not 1,024 bit/s; this
was the case also before 1999, when prefixes for units of information were introduced in the standard IEC 60027-2.

The formal abbreviation for "bits per second" is "bit/s" (not "bits/s", see writing style for SI units). In less
formal contexts the abbreviations "b/s" or "bps" are often used, though this risks confusion with "bytes per second"
("B/s", "Bps"). 1 Byte/s (Bps or B/s) corresponds to 8 bit/s (bps or b/s).

Shannon-Fano coding
In the field of data compression, Shannon-Fano coding, named after Claude Elwood
Shannon and Robert Fano, is a technique for constructing a prefix code based on a set of symbols and
their probabilities (estimated or measured). It is suboptimal in the sense that it does not achieve the
lowest possible expected code word length like Huffman coding; however, unlike Huffman coding, it does
guarantee that all code word lengths are within one bit of their theoretical ideal -log P(x). The technique
was proposed in Shannon's "A Mathematical Theory of Communication", his 1948 article introducing the
field of information theory. The method was attributed to Fano, who later published it as a technical
report. Shannon-Fano coding should not be confused with Shannon coding, the coding method used to
prove Shannon's noiseless coding theorem, or with Shannon-Fano-Elias coding (also known as Elias
coding), the precursor to arithmetic coding.
In Shannon-Fano coding, the symbols are arranged in order from most probable to least
probable, and then divided into two sets whose total probabilities are as close as possible to being
equal. All symbols then have the first digits of their codes assigned; symbols in the first set receive "0"
and symbols in the second set receive "1". As long as any sets with more than one member remain, the
same process is repeated on those sets, to determine successive digits of their codes. When a set has
been reduced to one symbol, of course, this means the symbol's code is complete and will not form the
prefix of any other symbol's code.
The algorithm works, and it produces fairly efficient variable-length encodings; when the two
smaller sets produced by a partitioning are in fact of equal probability, the one bit of information used to
distinguish them is used most efficiently. Unfortunately, Shannon-Fano does not always produce
optimal prefix codes; the set of probabilities {0.35, 0.17, 0.17, 0.16, 0.15} is an example of one that will
be assigned non-optimal codes by Shannon-Fano coding.
For this reason, Shannon-Fano is almost never used; Huffman coding is almost as
computationally simple and produces prefix codes that always achieve the lowest expected code word
length, under the constraint that each symbol is represented by a code formed of an integral number of
bits. This is a constraint that is often unneeded, since the codes will be packed end-to-end in long
sequences. If we consider groups of codes at a time, symbol-by-symbol Huffman coding is only optimal
if the probabilities of the symbols are independent and are each some power of a half, i.e., of the form 1/2^k. In most
situations, arithmetic coding can produce greater overall compression than either Huffman or Shannon-
Fano, since it can encode in fractional numbers of bits, which more closely approximate the actual
information content of the symbol. However, arithmetic coding has not superseded Huffman the way that
Huffman supersedes Shannon-Fano, both because arithmetic coding is more computationally
expensive and because it is covered by multiple patents.
Shannon-Fano coding is used in the IMPLODE compression method, which is part of
the ZIP file format.

SHANNON-FANO ALGORITHM:
A Shannon-Fano tree is built according to a specification designed to define an effective code table. The
actual algorithm is simple:
1. For a given list of symbols, develop a corresponding list of probabilities or frequency counts so that each
symbol's relative frequency of occurrence is known.
2. Sort the lists of symbols according to frequency, with the most frequently occurring symbols at the left
and the least common at the right.
3. Divide the list into two parts, with the total frequency counts of the left half being as close to the total of
the right as possible.
4. The left half of the list is assigned the binary digit 0, and the right half is assigned the digit 1. This means
that the codes for the symbols in the first half will all start with 0, and the codes in the second half will all
start with 1.
5. Recursively apply steps 3 and 4 to each of the two halves, subdividing groups and adding bits to the
codes until each symbol has become a corresponding code leaf on the tree.
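A minimal Python sketch of the five steps above (illustrative only, not from the source), using the five-symbol alphabet of the example that follows:

```python
def shannon_fano(freqs):
    """Build a Shannon-Fano code table from {symbol: count}: sort by frequency,
    split into two halves of nearly equal total, assign 0/1 and recurse."""
    codes = {}

    def split(symbols, prefix):
        if len(symbols) == 1:
            codes[symbols[0]] = prefix or "0"
            return
        total = sum(freqs[s] for s in symbols)
        running, cut, best = 0, 1, float("inf")
        for i in range(1, len(symbols)):
            running += freqs[symbols[i - 1]]
            diff = abs(total - 2 * running)   # |left total - right total|
            if diff < best:
                best, cut = diff, i
        split(symbols[:cut], prefix + "0")    # left half starts with 0
        split(symbols[cut:], prefix + "1")    # right half starts with 1

    ordered = sorted(freqs, key=freqs.get, reverse=True)
    split(ordered, "")
    return codes

print(shannon_fano({"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}))
# {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}
```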
Example

Shannon-Fano Algorithm
The example shows the construction of the Shannon code for a small alphabet. The five symbols which can be
coded have the following frequency:
Symbol         A           B           C           D           E
Count          15          7           6           6           5
Probabilities  0.38461538  0.17948718  0.15384615  0.15384615  0.12820513

All symbols are sorted by frequency, from left to right (shown in Figure a). Putting the dividing line
between symbols B and C results in a total of 22 in the left group and a total of 17 in the right group. This
minimizes the difference in totals between the two groups.
With this division, A and B will each have a code that starts with a 0 bit, and the C, D, and E codes will all
start with a 1, as shown in Figure b. Subsequently, the left half of the tree gets a new division between A and
B, which puts A on a leaf with code 00 and B on a leaf with code 01.
After four division procedures, a tree of codes results. In the final tree, the three symbols with the highest
frequencies have all been assigned 2-bit codes, and two symbols with lower counts have 3-bit codes as
shown table below:
Symbol    A    B    C    D     E
Code      00   01   10   110   111

This results in 2 bits each for A, B and C and 3 bits each for D and E, giving an average of
(2x15 + 2x7 + 2x6 + 3x6 + 3x5) / 39 ≈ 2.28 bits per symbol.

HUFFMAN CODING:
The Shannon-Fano algorithm doesn't always generate an optimal code. In 1952, David A. Huffman gave a
different algorithm that always produces an optimal tree for any given probabilities. While the Shannon-Fano tree
is created from the root to the leaves, the Huffman algorithm works from leaves to the root in the opposite
direction.
1. Create a leaf node for each symbol and add it to a priority queue, ordered by frequency of occurrence.
2. While there is more than one node in the queue:
1. Remove the two nodes of lowest probability or frequency from the queue
2. Prepend 0 and 1 respectively to any code already assigned to these nodes

3. Create a new internal node with these two nodes as children and with probability equal to the
sum of the two nodes' probabilities.
4. Add the new node to the queue.
3. The remaining node is the root node and the tree is complete.
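A sketch of the procedure above using Python's heapq module (illustrative, not from the source). Ties may be broken differently from the worked example below, but the resulting code lengths are the same.

```python
import heapq
from itertools import count

def huffman(freqs):
    """Huffman coding: repeatedly merge the two least-frequent nodes,
    prepending 0/1 to the codes already assigned inside each merged subtree."""
    tiebreak = count()                        # unique counter avoids comparing dicts on ties
    heap = [(f, next(tiebreak), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)       # two lowest-frequency nodes
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

codes = huffman({"A": 15, "B": 7, "C": 6, "D": 6, "E": 5})
print(codes)  # A gets a 1-bit code; B, C, D and E get 3-bit codes
# Average length = (1*15 + 3*(7 + 6 + 6 + 5)) / 39 ≈ 2.23 bits per symbol
```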
Example

Huffman Algorithm
Using the same frequencies as for the Shannon-Fano example above, viz:
Symbol         A           B           C           D           E
Count          15          7           6           6           5
Probabilities  0.38461538  0.17948718  0.15384615  0.15384615  0.12820513

In this case D & E have the lowest frequencies and so are allocated 0 and 1 respectively and grouped
together with a combined probability of 0.28205128. The lowest pair now are B and C so they're allocated 0
and 1 and grouped together with a combined probability of 0.33333333. This leaves BC and DE now with
the lowest probabilities so 0 and 1 are prepended to their codes and they are combined. This then leaves just
A and BCDE, which have 0 and 1 prepended respectively and are then combined. This leaves us with a
single node and our algorithm is complete.
The code lengths for the different characters this time are 1 bit for A and 3 bits for all other characters.

Symbol    A    B     C     D     E
Code      0    100   101   110   111

This results in 1 bit for A and 3 bits each for B, C, D and E, giving an average of
(1x15 + 3x7 + 3x6 + 3x6 + 3x5) / 39 ≈ 2.23 bits per symbol.

Lempel-Ziv-Welch:
Lempel-Ziv-Welch (LZW) is a universal lossless data compression algorithm created by Abraham
Lempel, Jacob Ziv, and Terry Welch. It was published by Welch in 1984 as an improved implementation of
the LZ78 algorithm published by Lempel and Ziv in 1978. The algorithm is simple to implement and has very high
throughput.

ALGORITHM:
Idea:
The scenario described in Welch's 1984 paper[1] encodes sequences of 8-bit data as fixed-length 12-bit
codes. The codes from 0 to 255 represent 1-character sequences consisting of the corresponding 8-bit character,
and the codes 256 through 4095 are created in a dictionary for sequences encountered in the data as it is encoded.
At each stage in compression, input bytes are gathered into a sequence until the next character would make a
sequence for which there is no code yet in the dictionary. The code for the sequence (without that character) is
emitted, and a new code (for the sequence with that character) is added to the dictionary.
The idea was quickly adapted to other situations. In an image based on a color table, for example, the
natural character alphabet is the set of color table indexes, and in the 1980s, many images had small color tables
(on the order of 16 colors). For such a reduced alphabet, the full 12-bit codes yielded poor compression unless the
image was large, so the idea of a variable-width code was introduced: codes typically start one bit wider than the
symbols being encoded, and as each code size is used up, the code width increases by 1 bit, up to some prescribed
maximum (typically 12 bits).
Further refinements include reserving a code to indicate that the code table should be cleared (a "clear
code", typically the first value immediately after the values for the individual alphabet characters), and a code to
indicate the end of data (a "stop code", typically one greater than the clear code). The clear code allows the table to
be reinitialized after it fills up, which lets the encoding adapt to changing patterns in the input data. Smart encoders
can monitor the compression efficiency and clear the table whenever the existing table no longer matches the input
well.
Since the codes are added in a manner determined by the data, the decoder mimics building the table as it
sees the resulting codes. It is critical that the encoder and decoder agree on which variety of LZW is being used:
the size of the alphabet, the maximum code width, whether variable-width encoding is being used, the initial code
size, whether to use the clear and stop codes (and what values they have). Most formats that employ LZW build

this information into the format specification or provide explicit fields for them in a compression header for the
data.
Encoding:
A dictionary is initialized to contain the single-character strings corresponding to all the possible input
characters (and nothing else except the clear and stop codes if they're being used). The algorithm works by
scanning through the input string for successively longer substrings until it finds one that is not in the dictionary.
When such a string is found, the index for the string less the last character (i.e., the longest substring that is in the
dictionary) is retrieved from the dictionary and sent to output, and the new string (including the last character) is
added to the dictionary with the next available code. The last input character is then used as the next starting point
to scan for substrings.
In this way, successively longer strings are registered in the dictionary and made available for subsequent
encoding as single output values. The algorithm works best on data with repeated patterns, so the initial parts of a
message will see little compression. As the message grows, however, the compression ratio tends asymptotically to
the maximum.
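A compact Python sketch of the encoding loop just described. It assumes a full 8-bit initial alphabet and ignores variable-width packing and clear/stop codes (simplifications made for illustration, not taken from the source).

```python
def lzw_encode(text):
    """LZW encoding sketch: grow the buffered sequence until it is no longer in
    the dictionary, emit the code for the known prefix, and register the
    extended sequence under the next free code."""
    dictionary = {chr(c): c for c in range(256)}   # one entry per input character
    next_code = 256
    sequence = ""
    output = []
    for ch in text:
        if sequence + ch in dictionary:
            sequence += ch                         # keep extending the match
        else:
            output.append(dictionary[sequence])    # emit code for the known prefix
            dictionary[sequence + ch] = next_code  # register the extended string
            next_code += 1
            sequence = ch                          # restart from the current character
    if sequence:
        output.append(dictionary[sequence])
    return output

print(lzw_encode("TOBEORNOTTOBEORTOBEORNOT"))
```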
Decoding:
The decoding algorithm works by reading a value from the encoded input and outputting the
corresponding string from the initialized dictionary. At the same time it obtains the next value from the input, and
adds to the dictionary the concatenation of the string just output and the first character of the string obtained by
decoding the next input value. The decoder then proceeds to the next input value (which was already read in as the
"next value" in the previous pass) and repeats the process until there is no more input, at which point the final input
value is decoded without any more additions to the dictionary.
In this way the decoder builds up a dictionary which is identical to that used by the encoder, and uses it to
decode subsequent input values. Thus the full dictionary does not need be sent with the encoded data; just the
initial dictionary containing the single-character strings is sufficient (and is typically defined beforehand within the
encoder and decoder rather than being explicitly sent with the encoded data.)
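A matching decoder sketch under the same assumptions as the encoder above; the else branch handles the one case where a received code is not yet in the decoder's dictionary (the cScSc pattern discussed further below).

```python
def lzw_decode(codes):
    """LZW decoding sketch: rebuild the dictionary on the fly while reading codes."""
    dictionary = {c: chr(c) for c in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    result = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:                                   # code not yet known to the decoder
            entry = previous + previous[0]      # must be previous + its first character
        result.append(entry)
        dictionary[next_code] = previous + entry[0]   # conjectured entry, now confirmed
        next_code += 1
        previous = entry
    return "".join(result)

encoded = lzw_encode("TOBEORNOTTOBEORTOBEORNOT")   # reusing the encoder sketch above
print(lzw_decode(encoded))                          # round-trips back to the original
```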
Variable-width codes:
If variable-width codes are being used, the encoder and decoder must be careful to change the width at the
same points in the encoded data, or they will disagree about where the boundaries between individual codes fall in
the stream. In the standard version, the encoder increases the width from p to p + 1 when a sequence + s is
encountered that is not in the table (so that a code must be added for it) but the next available code in the table is
2p (the first code requiring p + 1 bits). The encoder emits the code for at width p (since that code does not require
p + 1 bits), and then increases the code width so that the next code emitted will be p + 1 bits wide.
The decoder is always one code behind the encoder in building the table, so when it sees the code for , it
will generate an entry for code 2p 1. Since this is the point where the encoder will increase the code width, the
decoder must increase the width here as well: at the point where it generates the largest code that will fit in p bits.
Unfortunately some early implementations of the encoding algorithm increase the code width and then emit at
the new width instead of the old width, so that to the decoder it looks like the width changes one code too early.
This is called "Early Change"; it caused so much confusion that Adobe now allows both versions in PDF files, but

includes an explicit flag in the header of each LZW-compressed stream to indicate whether Early Change is being
used. Most graphic file formats do not use Early Change.
When the table is cleared in response to a clear code, both encoder and decoder change the code width
after the clear code back to the initial code width, starting with the code immediately following the clear code.
Packing order:
Since the codes emitted typically do not fall on byte boundaries, the encoder and decoder must agree on
how codes are packed into bytes. The two common methods are LSB-First ("Least Significant Bit First")
and MSB-First ("Most Significant Bit First"). In LSB-First packing, the first code is aligned so that the least
significant bit of the code falls in the least significant bit of the first stream byte, and if the code has more than 8
bits, the high order bits left over are aligned with the least significant bit of the next byte; further codes are packed
with LSB going into the least significant bit not yet used in the current stream byte, proceeding into further bytes as
necessary. MSB-first packing aligns the first code so that its most significant bit falls in the MSB of the first stream
byte, with overflow aligned with the MSB of the next byte; further codes are written with MSB going into the most
significant bit not yet used in the current stream byte.
Example:
The following example illustrates the LZW algorithm in action, showing the status of the output and
the dictionary at every stage, both in encoding and decoding the data. This example has been constructed to give
reasonable compression on a very short message. In real text data, repetition is generally less pronounced, so
longer input streams are typically necessary before the compression builds up efficiency.
The plaintext to be encoded (from an alphabet using only the capital letters) is:
TOBEORNOTTOBEORTOBEORNOT#
The # is a marker used to show that the end of the message has been reached. There are thus 26 symbols
in the plaintext alphabet (the 26 capital letters A through Z), plus the stop code #. We arbitrarily assign these the
values 1 through 26 for the letters, and 0 for '#'. (Most flavors of LZW would put the stop code after the data
alphabet, but nothing in the basic algorithm requires that. The encoder and decoder only have to agree what value it
has.)
A computer will render these as strings of bits. Five-bit codes are needed to give sufficient combinations
to encompass this set of 27 values. The dictionary is initialized with these 27 values. As the dictionary grows, the
codes will need to grow in width to accommodate the additional entries. A 5-bit code gives 2^5 = 32 possible
combinations of bits, so when the 33rd dictionary word is created, the algorithm will have to switch at that point
from 5-bit strings to 6-bit strings (for all code values, including those which were previously output with only five
bits). Note that since the all-zero code 00000 is used, and is labeled "0", the 33rd dictionary entry will be
labeled 32. (Previously generated output is not affected by the code-width change, but once a 6-bit value is
generated in the dictionary, it could conceivably be the next code emitted, so the width for subsequent output shifts
to 6 bits to accommodate that.)

The initial dictionary, then, will consist of the following entries:


Symbol   Binary   Decimal
#        00000    0
A        00001    1
B        00010    2
C        00011    3
D        00100    4
E        00101    5
F        00110    6
G        00111    7
H        01000    8
I        01001    9
J        01010    10
K        01011    11
L        01100    12
M        01101    13
N        01110    14
O        01111    15
P        10000    16
Q        10001    17
R        10010    18
S        10011    19
T        10100    20
U        10101    21
V        10110    22
W        10111    23
X        11000    24
Y        11001    25
Z        11010    26

Encoding:
Buffer input characters in a sequence ω until ω + next character is not in the dictionary. Emit the code for
ω, and add ω + next character to the dictionary. Start buffering again with the next character.
Current Sequence   Next Char   Output (Code / Bits)   Extended Dictionary   Comments
NULL               T
T                  O           20 / 10100             27: TO                27 = first available code after 0 through 26
O                  B           15 / 01111             28: OB
B                  E            2 / 00010             29: BE
E                  O            5 / 00101             30: EO
O                  R           15 / 01111             31: OR
R                  N           18 / 10010             32: RN                32 requires 6 bits, so for next output use 6 bits
N                  O           14 / 001110            33: NO
O                  T           15 / 001111            34: OT
T                  T           20 / 010100            35: TT
TO                 B           27 / 011011            36: TOB
BE                 O           29 / 011101            37: BEO
OR                 T           31 / 011111            38: ORT
TOB                E           36 / 100100            39: TOBE
EO                 R           30 / 011110            40: EOR
RN                 O           32 / 100000            41: RNO
OT                 #           34 / 100010                                  # stops the algorithm; send the cur seq
                                0 / 000000                                  and the stop code

Unencoded length = 25 symbols x 5 bits/symbol = 125 bits


Encoded length = (6 codes x 5 bits/code) + (11 codes x 6 bits/code) = 96 bits.
Using LZW has saved 29 bits out of 125, reducing the message by more than 23%. If the message were longer, then
the dictionary words would begin to represent longer and longer sections of text, allowing repeated words to be
sent very compactly.

Decoding:
To decode an LZW-compressed archive, one needs to know in advance the initial dictionary used, but
additional entries can be reconstructed as they are always simply concatenations of previous entries.
Input (Bits / Code)   Output Sequence   New Dictionary Entry (Full / Conjecture)   Comments
10100 / 20            T                 -- / 27: T?
01111 / 15            O                 27: TO / 28: O?
00010 / 2             B                 28: OB / 29: B?
00101 / 5             E                 29: BE / 30: E?
01111 / 15            O                 30: EO / 31: O?
10010 / 18            R                 31: OR / 32: R?                            created code 31 (last to fit in 5 bits)
001110 / 14           N                 32: RN / 33: N?                            so start using 6 bits
001111 / 15           O                 33: NO / 34: O?
010100 / 20           T                 34: OT / 35: T?
011011 / 27           TO                35: TT / 36: TO?
011101 / 29           BE                36: TOB / 37: BE?                          36 = TO + 1st symbol (B) of
011111 / 31           OR                37: BEO / 38: OR?                          next coded sequence received (BE)
100100 / 36           TOB               38: ORT / 39: TOB?
011110 / 30           EO                39: TOBE / 40: EO?
100000 / 32           RN                40: EOR / 41: RN?
100010 / 34           OT                41: RNO / 42: OT?
000000 / 0            #

At each stage, the decoder receives a code X; it looks X up in the table and outputs the sequence χ it
codes, and it conjectures χ + ? as the entry the encoder just added, because the encoder emitted X for χ precisely
because χ + ? was not in the table, and the encoder goes ahead and adds it. But what is the missing letter? It is the
first letter in the sequence coded by the next code Z that the decoder receives. So the decoder looks up Z, decodes
it into the sequence ω and takes the first letter z and tacks it onto the end of χ as the next dictionary entry.
This works as long as the codes received are in the decoder's dictionary, so that they can be decoded into
sequences. What happens if the decoder receives a code Z that is not yet in its dictionary? Since the decoder is
always just one code behind the encoder, Z can be in the encoder's dictionary only if the encoder just generated it,
when emitting the previous code X for χ. Thus Z codes some ω that is χ + ?, and the decoder can determine the
unknown character as follows:
1. The decoder sees X and then Z.
2. It knows X codes the sequence χ and Z codes some unknown sequence ω.
3. It knows the encoder just added Z to code χ + some unknown character,
4. and it knows that the unknown character is the first letter z of ω.
5. But the first letter of ω (= χ + ?) must then also be the first letter of χ.
6. So ω must be χ + x, where x is the first letter of χ.
7. So the decoder figures out what Z codes even though it's not in the table,
8. and upon receiving Z, the decoder decodes it as χ + x, and adds χ + x to the table as the value of Z.
This situation occurs whenever the encoder encounters input of the form cScSc, where c is a single
character, S is a string and cS is already in the dictionary, but cSc is not. The encoder emits the code for cS, putting
a new code for cSc into the dictionary. Next it sees cSc in the input (starting at the second c of cScSc) and emits the
new code it just inserted. The argument above shows that whenever the decoder receives a code not in its
dictionary, the situation must look like this.
Although input of form cScSc might seem unlikely, this pattern is fairly common when the input stream is
characterized by significant repetition. In particular, long strings of a single character (which are common in the
kinds of images LZW is often used to encode) repeatedly generate patterns of this sort.
Further coding:
The simple scheme described above focuses on the LZW algorithm itself. Many applications apply further
encoding to the sequence of output symbols. Some package the coded stream as printable characters using some
form of Binary-to-text encoding; this will increase the encoded length and decrease the compression ratio.
Conversely, increased compression can often be achieved with an adaptive entropy encoder. Such a coder estimates

the probability distribution for the value of the next symbol, based on the observed frequencies of values so far.
Standard entropy encoding such as Huffman coding or arithmetic coding then uses shorter codes for values with
higher probabilities.
Uses:
When it was introduced, LZW compression provided the best compression ratio among all well-known
methods available at that time. It became the first widely used universal data compression method on computers. A
large English text file can typically be compressed via LZW to about half its original size.
LZW was used in the program compress, which became a more or less standard utility in Unix systems
circa 1986. It has since disappeared from many distributions, for both legal and technical reasons, but as of 2008 at
least FreeBSD includes both compress and uncompress as a part of the distribution. Several other popular
compression utilities also used LZW, or closely related methods.
LZW became very widely used when it became part of the GIF image format in 1987. It may also
(optionally) be used in TIFF and PDF files. (Although LZW is available in Adobe Acrobat software, Acrobat by
default uses the DEFLATE algorithm for most text and color-table-based image data in PDF files.)

Shannon's Theorem:
Shannon's Theorem gives an upper bound to the capacity of a link, in bits per second (bps), as a function
of the available bandwidth and the signal-to-noise ratio of the link.
The Theorem can be stated as:
C = B * log2(1+ S/N)
where C is the achievable channel capacity, B is the bandwidth of the line, S is the average signal power
and N is the average noise power.
The signal-to-noise ratio (S/N) is usually expressed in decibels (dB) given by the formula:
10 * log10(S/N)
so for example a signal-to-noise ratio of 1000 is commonly expressed as
10 * log10(1000) = 30 dB.
Here is a graph showing the relationship between C/B and S/N (in dB):

Examples
Here are two examples of the use of Shannon's Theorem.
Modem
For a typical telephone line with a signal-to-noise ratio of 30 dB and an audio bandwidth of 3 kHz, we get a
maximum data rate of:
C = 3000 * log2(1001)
which is a little less than 30 kbps.
Satellite TV Channel
For a satellite TV channel with a signal-to-noise ratio of 20 dB and a video bandwidth of 10 MHz, we get a
maximum data rate of:
C=10000000 * log2(101)
which is about 66 Mbps.
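The two examples above can be reproduced with a few lines of Python (a minimal sketch; the function name is illustrative):

```python
import math

def shannon_capacity(bandwidth_hz, snr_db):
    """Shannon's Theorem: C = B * log2(1 + S/N), with S/N supplied in dB."""
    snr_linear = 10 ** (snr_db / 10)          # convert dB back to a power ratio
    return bandwidth_hz * math.log2(1 + snr_linear)

print(shannon_capacity(3_000, 30) / 1e3)       # ~29.9 kbps (telephone line)
print(shannon_capacity(10_000_000, 20) / 1e6)  # ~66.6 Mbps (satellite TV channel)
```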

CHANNEL CAPACITY:
In electrical engineering, computer science and information theory, channel capacity is the tightest upper
bound on the amount of information that can be reliably transmitted over a communications channel. By the noisy-channel coding theorem, the channel capacity of a given channel is the limiting information rate (in units
of information per unit time) that can be achieved with arbitrarily small error probability.
Information theory, developed by Claude E. Shannon during World War II, defines the notion of channel
capacity and provides a mathematical model by which one can compute it. The key result states that the capacity of
the channel, as defined above, is given by the maximum of the mutual information between the input and output of
the channel, where the maximization is with respect to the input distribution.

BANDWIDTH:
It has several related meanings:

Bandwidth (signal processing) or analog bandwidth, frequency bandwidth or radio bandwidth: a measure
of the width of a range of frequencies, measured in hertz

Bandwidth (computing) or digital bandwidth: a rate of data transfer, bit rate or throughput, measured in
bits per second (bps)

Spectral line width: the width of an atomic or molecular spectral line, measured in hertz

Bandwidth can also refer to:

Bandwidth (linear algebra), the width of the band of non-zero terms around the diagonal of a matrix

In kernel density estimation, "bandwidth" describes the width of the convolution kernel used

A normative expected range of linguistic behavior in language expectancy theory

In business jargon, the resources needed to complete a task or project

Bandwidth (radio program): A Canadian radio program

Graph bandwidth, in graph theory

SIGNAL-TO-NOISE RATIO:
Signal-to-noise ratio (often abbreviated SNR or S/N) is a measure used in science and engineering to
quantify how much a signal has been corrupted by noise. It is defined as the ratio of signal power to the noise
power corrupting the signal. A ratio higher than 1:1 indicates more signal than noise. While SNR is commonly
quoted for electrical signals, it can be applied to any form of signal (such as isotope levels in an ice
core or biochemical signaling between cells).
In less technical terms, signal-to-noise ratio compares the level of a desired signal (such as music) to the
level of background noise. The higher the ratio, the less obtrusive the background noise is.
"Signal-to-noise ratio" is sometimes used informally to refer to the ratio of useful information to false or
irrelevant data in a conversation or exchange. For example, in online discussion forums and other online
communities, off-topic posts and spam are regarded as "noise" that interferes with the "signal" of appropriate
discussion.
Signal-to-noise ratio is defined as the power ratio between a signal (meaningful information) and the
background noise (unwanted signal):

SNR = P_signal / P_noise

where P is average power. Both signal and noise power must be measured at the same or equivalent points
in a system, and within the same system bandwidth. If the signal and the noise are measured across the
same impedance, then the SNR can be obtained by calculating the square of the amplitude ratio:

SNR = ( A_signal / A_noise )^2

where A is root mean square (RMS) amplitude (for example, RMS voltage). Because many signals have a
very wide dynamic range, SNRs are often expressed using the logarithmic decibel scale. In decibels, the SNR is
defined as

SNR(dB) = 10 log10 ( P_signal / P_noise )

which may equivalently be written using amplitude ratios as

SNR(dB) = 20 log10 ( A_signal / A_noise )

The concepts of signal-to-noise ratio and dynamic range are closely related. Dynamic range measures the
ratio between the strongest un-distorted signal on a channel and the minimum discernable signal, which for most
purposes is the noise level. SNR measures the ratio between an arbitrary signal level (not necessarily the most
powerful signal possible) and noise. Measuring signal-to-noise ratios requires the selection of a representative
or reference signal. In audio engineering, the reference signal is usually a sine wave at a standardized nominal
or alignment level, such as 1 kHz at +4 dBu (1.228 VRMS).

SNR is usually taken to indicate an average signal-to-noise ratio, as it is possible that (near) instantaneous
signal-to-noise ratios will be considerably different. The concept can be understood as normalizing the noise level
to 1 (0 dB) and measuring how far the signal 'stands out'.
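A small sketch of the two decibel forms given above (illustrative names only, not from the source):

```python
import math

def snr_db_from_power(signal_power, noise_power):
    """SNR in decibels from average powers: 10 * log10(P_signal / P_noise)."""
    return 10 * math.log10(signal_power / noise_power)

def snr_db_from_amplitude(signal_rms, noise_rms):
    """Equivalent form using RMS amplitudes measured across the same
    impedance: 20 * log10(A_signal / A_noise)."""
    return 20 * math.log10(signal_rms / noise_rms)

print(snr_db_from_power(1.0, 0.001))         # 30.0 dB
print(snr_db_from_amplitude(1.228, 0.1228))  # 20.0 dB
```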

Mutual information:
In probability theory and information theory, the mutual information (sometimes known by
the archaic term transinformation) of two random variables is a quantity that measures the mutual dependence of
the two variables. The most common unit of measurement of mutual information is the bit, when logarithms to the
base 2 are used.
Definition of mutual information:
Formally, the mutual information of two discrete random variables X and Y can be defined as:

I(X;Y) = Sum over y Sum over x of p(x,y) log [ p(x,y) / ( p1(x) p2(y) ) ]

where p(x,y) is the joint probability distribution function of X and Y, and p1(x) and p2(y) are the marginal
probability distribution functions of X and Y respectively.
In the case of continuous random variables, the summation is replaced by a definite double integral:

I(X;Y) = Double integral of p(x,y) log [ p(x,y) / ( p1(x) p2(y) ) ] dx dy

where p(x,y) is now the joint probability density function of X and Y, and p1(x) and p2(y) are the marginal
probability density functions of X and Y respectively.
These definitions are ambiguous because the base of the log function is not specified. To disambiguate,
the function I could be parameterized as I(X,Y,b) where b is the base. Alternatively, since the most common unit of
measurement of mutual information is the bit, a base of 2 could be specified.
Intuitively, mutual information measures the information that X and Y share: it measures how much
knowing one of these variables reduces our uncertainty about the other. For example, if X and Y are independent,
then knowing X does not give any information about Y and vice versa, so their mutual information is zero. At the
other extreme, if X and Y are identical then all information conveyed by X is shared with Y:
knowing X determines the value of Y and vice versa. As a result, in the case of identity the mutual information is
the same as the uncertainty contained in Y (or X) alone, namely the entropy of Y (or X: clearly if X and Y are
identical they have equal entropy).
Mutual information quantifies the dependence between the joint distribution of X and Y and what the joint
distribution would be if X and Y were independent. Mutual information is a measure of dependence in the
following sense: I(X; Y) = 0 if and only if X and Y are independent random variables. This is easy to see in one
direction: if X and Y are independent, then p(x, y) = p(x) p(y), and therefore:

log [ p(x,y) / ( p(x) p(y) ) ] = log 1 = 0

Moreover, mutual information is nonnegative (i.e. I(X;Y) ≥ 0) and symmetric (i.e. I(X;Y) = I(Y;X)).

CHANNEL CAPACITY:
In electrical engineering, computer science and information theory, channel capacity is the tightest upper
bound on the amount of information that can be reliably transmitted over a communications channel. By the noisy-channel coding theorem, the channel capacity of a given channel is the limiting information rate (in units
of information per unit time) that can be achieved with arbitrarily small error probability.
Information theory, developed by Claude E. Shannon during World War II, defines the notion of channel
capacity and provides a mathematical model by which one can compute it. The key result states that the capacity of
the channel, as defined above, is given by the maximum of the mutual information between the input and output of
the channel, where the maximization is with respect to the input distribution.
Formal definition

Let X represent the space of signals that can be transmitted, and Y the space of signals received, during a block of
time over the channel. Let pY|X(y | x)
be the conditional distribution function of Y given X. Treating the channel as a known statistical system, pY|X(y | x) is
an inherent fixed property of the communications channel (representing the nature of the noise in it). Then the joint
distribution pX,Y(x, y)
of X and Y is completely determined by the channel and by the choice of pX(x),
the marginal distribution of signals we choose to send over the channel. The joint distribution can be recovered by
using the identity

pX,Y(x, y) = pY|X(y | x) pX(x)

Under these constraints, we next maximize the amount of information, or the message, that one can communicate over
the channel. The appropriate measure for this is the mutual information I(X;Y), and this maximum mutual
information is called the channel capacity and is given by

C = sup over pX(x) of I(X;Y)

Noisy-channel coding theorem:


The noisy-channel coding theorem states that for any ε > 0 and for any rate R less than the channel
capacity C, there is an encoding and decoding scheme that can be used to ensure that the probability of block error
is less than ε for a sufficiently long code. Also, for any rate greater than the channel capacity, the probability of
block error at the receiver goes to one as the block length goes to infinity.
Example application:

An application of the channel capacity concept to an additive white Gaussian noise (AWGN) channel
with B Hz bandwidth and signal-to-noise ratio S/N is the Shannon-Hartley theorem:

C = B log2 ( 1 + S/N )

C is measured in bits per second if the logarithm is taken in base 2, or nats per second if the natural logarithm is
used, assuming B is in hertz; the signal and noise powers S and N are measured in watts or volts^2, so the signal-to-noise ratio here is expressed as a power ratio, not in decibels (dB); since figures are often cited in dB, a conversion
may be needed. For example, 30 dB is a power ratio of 10^(30/10) = 10^3 = 1000.
Slow-fading channel:
In a slow-fading channel, where the coherence time is greater than the latency requirement, there is no
definite capacity as the maximum rate of reliable communications supported by the channel, log2(1 + |h|^2 SNR),
depends on the random channel gain |h|^2. If the transmitter encodes data at rate R [bits/s/Hz], there is a certain
probability that the decoding error probability cannot be made arbitrarily small,

pout = P( log2(1 + |h|^2 SNR) < R ),
in which case the system is said to be in outage. With a non-zero probability that the channel is in deep
fade, the capacity of the slow-fading channel in strict sense is zero. However, it is possible to determine the largest
value of R such that the outage probability pout is less than ε. This value is known as the ε-outage capacity.
FAST-FADING CHANNEL:
In a fast-fading channel, where the latency requirement is greater than the coherence time and the codeword length
spans many coherence periods, one can average over many independent channel fades by coding over a large
number of coherence time intervals. Thus, it is possible to achieve a reliable rate of communication
of E[ log2(1 + |h|^2 SNR) ] [bits/s/Hz], and it is meaningful to speak of this value as the capacity of the
fast-fading channel.

RATE DISTORTION THEORY:


Ratedistortion theory is a major branch of information theory which provides the theoretical foundations
for lossy data compression; it addresses the problem of determining the minimal amount
of entropy (or information) R that should be communicated over a channel, so that the source (input signal) can be
approximately reconstructed at the receiver (output signal) without exceeding a given distortion D.

IMPORTANT QUESTIONS
PART A
All questions Two Marks
1. What is entropy?
2. What is prefix code?
3. Define information rate.
4. What is channel capacity of binary synchronous channel with error probability of 0.2?
5. State channel coding theorem.
6. Define entropy for a discrete memory less source.
7. What is channel redundancy?
8. Write down the formula for the mutual information.
9. When is the average information delivered by a source of alphabet size 2, maximum?
10. Name the source coding techniques.
11. Write down the formula for mutual information.
12. Write the expression for code efficiency in terms of entropy.
13. Is the information of a continuous system non negative? If so, why?
14. Explain the significance of the entropy H(X/Y) of a communication system where X is the
transmitter and Y is the receiver.
15. An event has six possible outcomes with probabilities 1/2, 1/4, 1/8, 1/16, 1/32, 1/32. Find the
entropy of the system.

PART B
1. Discuss Source coding theorem, give the advantage and disadvantage of channel coding in
detail, and discuss the data compaction. (16)
2. Explain the Huffman coding algorithm in detail and compare it with other types of coding.
(8)
3. Explain the properties of entropy and, with a suitable example, explain the entropy of a binary
memoryless source. (8)
4. What is entropy? Explain the important properties of entropy. (8)
5. Five symbols of the alphabet of a discrete memoryless source and their probabilities are given
below. (8)
S = [S0, S1, S2, S3, S4]
P[S] = [0.4, 0.2, 0.2, 0.1, 0.1]
Code the symbols using Huffman coding.
6. Write short notes on differential entropy, derive the channel capacity theorem and discuss
the implications of the information capacity theorem. (16)
7. What do you mean by a binary symmetric channel? Derive the channel capacity formula for a
symmetric channel. (8)
8. Construct a binary optimal code for the following probability symbols using the Huffman procedure
and calculate the entropy of the source, average code length, efficiency, redundancy and variance:
0.2, 0.18, 0.12, 0.1, 0.1, 0.08, 0.06, 0.06, 0.06, 0.04 (16)
9. Define mutual information. Find the relation between the mutual information and the joint
entropy of the channel input and channel output. Explain the important properties of mutual
information. (16)
10. Derive the expression for the channel capacity of a continuous channel. Also find the expression
for the channel capacity of a continuous channel of infinite bandwidth. Comment on the results.
(16)

UNIVERSITY QUESTIONS

Reg. No.

Question Paper Code: E3077


B.E./B.Tech. Degree Examinations, Apr/May 2010
Regulations 2008
Fourth Semester
Electronics and Communication Engineering
EC2252 Communication Theory
Time: Three Hours

Maximum: 100 Marks


Answer ALL Questions
Part A - (10 x 2 = 20 Marks)

1. How many AM broadcast stations can be accommodated in a 100 kHz bandwidth if
the highest frequency modulating a carrier is 5 kHz?
2. What are the causes of linear distortion?
3. Draw the block diagram of a method for generating a narrowband FM signal.
4. A carrier wave of frequency 100 MHz is frequency modulated by a signal
20 sin(200π × 10³ t). What is the bandwidth of the FM signal if the frequency sensitivity of the
modulator is 25 kHz/V?
5. When is a random process called deterministic?
6. A receiver connected to an antenna of resistance 50 Ω has an equivalent noise
resistance of 30 Ω. Find the receiver noise figure.
7. What are the characteristics of superheterodyne receivers?
8. What are the methods to improve FM threshold reduction?
9. Define entropy function.
10. Define Rate Bandwidth and Bandwidth efficiency.

Part B - (5 x 16 = 80 Marks)
11. (a) (i) Draw an envelope detector circuit used for demodulation of AM and
explain its operation.
(10)
(ii) How can SSB be generated using Weaver's method? Illustrate with a neat
block diagram.
(6)
OR
11. (b) (i) Discuss in detail frequency translation and the frequency division
multiplexing technique with diagrams.
(10)
(ii) Compare Amplitude Modulation and Frequency Modulation.
(6)

12. (a) (i) Using suitable mathematical analysis, show that FM modulation produces
an infinite number of sidebands. Also deduce an expression for the frequency-modulated
output and its frequency spectrum.
(10)
(ii) How can you generate FM from PM and PM from FM?
(6)
OR

12. (b) (i) A 20 MHz carrier is frequency modulated by a sinusoidal signal such that the
maximum frequency deviation is 100 kHz. Determine the modulation index
and the approximate bandwidth of the FM signal for the following modulating
signal frequencies:
(1) 1 kHz (2) 100 kHz and (3) 500 kHz.
(8)
(ii) Derive the time domain expressions of FM and PM signals.
(8)

13. (a) (i) Given a random process X(t) = A cos(ωt + θ), where A and ω are constants
and θ is a uniform random variable, show that X(t) is ergodic in both mean
and autocorrelation.
(8)
(ii) Write a short note on shot noise and also explain the power spectral
density of shot noise.
(8)
OR
13. (b) Write the details about narrowband noise and the properties of the quadrature
components of narrowband noise.
(16)
14. (a) Derive an expression for the SNR at the input (SNRC) and the output (SNRO) of a
coherent detector.
(16)

OR
14. (b) (i) Explain pre-emphasis and de-emphasis in detail.
15. (a) (i) Find the code words for the five symbols of the alphabet of a discrete memoryless
source with probabilities {0.4, 0.2, 0.2, 0.1, 0.1}, using Huffman coding,
and determine the source entropy and average code word length.
(10)
(ii) Discuss the source coding theorem.
(6)
OR
15. (b) (i) Derive the channel capacity of a continuous band-limited white Gaussian
noise channel.
(10)
(ii) Discuss rate distortion theory.
(6)
