http://blackboard.tudelft.nl/.
Copyright © 2018 by Optica, TUDelft
Lecture Notes
Authors:
Paul Urbach
Aurèle Adam
Sander Konijnenberg
March-April 2018
Monday 16th April, 2018, 09:44
Optica Lecture Notes TN2421 2 of 165 Monday 16th April, 2018, 09:44
Contents
1 Introduction
3 Geometrical Optics
3.1 Introduction
3.2 Fermat’s Principle
3.3 Some Consequences of Fermat’s Principle
3.4 Perfect Imaging by Conic Sections
3.5 Gaussian Geometrical Optics
3.5.1 Gaussian Imaging by a Single Spherical Surface
3.5.2 The Thin Lens
3.5.3 Construction of the Image of a Finite Object
3.5.4 Two Thin Lenses
3.5.5 The Matrix Method
3.5.6 The Thick Lens
3.6 Stops
3.7 Beyond Gaussian Geometrical Optics
3.7.1 Aberrations
3.8 Beyond Geometrical Optics
4 Optical Instruments
4.1 The Camera Obscura
4.2 The Camera
4.2.1 Camera in a Mobile Phone
4.3 The Human Eye
4.3.1 Accommodation
4.3.2 Retina
4.3.3 Eyeglasses
4.3.4 New Correction Technique
4.4 Magnifying glass
4.4.1 Magnifying power
4.4.2 Nomenclature
4.5 Eyepieces
4.6 The Compound Microscope
4.7 Telescopes
5 Polarisation
5.1 Polarization states, Jones Vectors, Jones Matrices
5.2 Creating and manipulating polarisation states
5.2.1 Jones Matrices
5.2.2 Linear Polarizers
5.2.3 Quarter-Wave Plates
5.2.4 Half-Wave Plates
5.2.5 Full-Wave Plates
5.3 How to Determine Whether a Matrix Corresponds to a Linear Polariser or a Wave Plate
5.4 Decomposition of Elliptical Polarisation into Linear and Circular States
8 Lasers
8.1 Unique Properties of Lasers and Their Applications
8.1.1 High Monochromaticity; Narrow Spectral Width; High Temporal Coherence
8.1.2 Highly Collimated Beam; Diffraction Limited Collimation
8.1.3 Very Small Focused Spot; Diffraction Limited Focused Spot; High Spatial Coherence
8.1.4 High Power; CW and Pulsed
8.1.5 Wide Tuning Range
8.2 Optical Resonator
8.3 Amplification
8.3.1 The A and B Einstein Coefficients
8.3.2 Relation Between the Einstein Coefficients
8.3.3 Population Inversion
8.4 Cavities
8.5 Problems of Laser Operation
8.5.1 How to Realize Single Frequency Operation
8.5.2 How to Prevent Transverse Modes
8.6 Types of Lasers
8.6.1 Optical Pumping
8.6.2 Electron-Collision Pump
8.6.3 Atom Collision
8.6.4 Chemical Pump
8.6.5 Semiconductor Laser
Appendices
Chapter 1
Introduction
As a physics student, you are probably familiar with many concepts of optics and the nature
of light. From secondary school you may remember Snell’s law of refraction, the lens formula
and ray tracing rules, and the interference fringes observed in the double-slit experiment. By
now you have also learned that Maxwell’s equations show that light consists of
electromagnetic waves, that its speed c was found to be constant, which led to the
development of the theory of relativity, and that light exhibits a wave-particle duality that is
explained by quantum mechanics and by the De Broglie hypothesis in particular. Although this is
already a sizable body of knowledge, there is still a lot to learn about optics, which is
what we will do in this course. However, many of these new topics do not require
knowledge of quantum mechanics or even Maxwell’s equations. Thus, we are in the
somewhat strange situation where we may have to "take a step back" or "forget" some of the things
we already know about light in order to explain certain concepts as simply and clearly as possible.
We remark that what you will learn in this course applies to a much larger part of physics
than only optics. In fact, optics refers strictly speaking only to electromagnetic fields of visible
wavelengths from 390 nm to 780 nm. But everything we will discuss applies to electromagnetic
radiation of any wavelength, from γ radiation of $10^{-13}$ nm wavelength to long radio waves of
wavelength of more than $10^{3}$ m. The approximate theories that we will discuss, such
as geometrical optics, are valid provided the wavelength is sufficiently small
compared to the size of the objects occurring in the problem. These theories therefore apply to any of the
mentioned wavelengths, provided the same ratio of wavelength to typical object size holds.
In this course, we will mostly follow the book "Optics" by Eugene Hecht, to which we will
usually refer simply as "Hecht". Although this book offers nice reading, it is very long.
Therefore these Lecture Notes are made self-contained. This means that you do not need the book
of Hecht to master the topics. If you sometimes have the feeling that you need a more detailed
explanation than given in the Lecture Notes, you can use the links to websites with additional
explanations and demonstrations that are provided at the end of several sections. If you would still like
to use a book for further reading, we recommend reading the section(s) in the book of Hecht to
which we refer.
We summarize the content of the Lecture Notes:
• First we recall some basic facts about Maxwell’s equations, show how the wave equation
is derived from them, and then discuss some special solutions such as plane
waves and spherical waves. This has already been treated in the course "Golven" and
therefore we will go through this material quickly during the lectures. You may want to
study it yourself in more detail to refresh your knowledge because you have to master this
chapter to be able to follow other parts of the course. Chapter 2 corresponds to Appendix
1 of Hecht, Sections 2.7-2.9 and some parts of Chapter 3.
• Then we study light from the point of view of Geometrical Optics. The theory studied
can be found in a couple of sections of Chapters 4, 5 and 6 of Hecht. This model of optics
applies to cases where the wavelength of light can be considered to be vanishingly small
compared to other lengths of the problem. In geometrical optics light is considered to
travel as rays. With this concept we can explain phenomena observed in for example the
pinhole camera, or simple microscopes and telescopes.
• Next we study different kinds of polarisation of light and how one can manipulate them.
This corresponds to some sections of Chapter 8 of Hecht.
• Then we discuss the superposition of light waves and the phenomena of interference of light
and how this depends on a property of light sources called coherence. This corresponds to
certain sections in Chapter 7, 9 and 12 of Hecht.
• In Chapter 7 we treat Diffraction Optics. In this model we describe light as a wave, with
which we can explain phenomena such as interference fringes caused by the interaction of
light with structures of finite size, such as a slit or aperture in a screen. Our treatment of
this subject differs quite a bit from how it is discussed in Hecht, but you may nevertheless
find it useful to read parts of Chapter 10 of Hecht.
• Finally, in Chapter 8 the unique properties of lasers and their applications are discussed.
In discussing lasers, many of the properties of light discussed in previous chapters will play
a role, in particular coherence and diffraction. A laser contains an optical resonator
and we will explain how such a resonator can be used to achieve very high light intensities.
A laser also requires a medium which amplifies the light by stimulated emission. To
understand the mechanism of stimulated emission, the theory of Einstein will be discussed.
We also consider some problems with lasers and how they are commonly solved.
It may bother you that you are sometimes taught models which we know are basically "wrong"
or "inaccurate" (such as Geometrical Optics). But then remember that in the end all of physics
is merely a model that tries to describe reality. Some models, which tend to be more complex,
are more accurate than others, but depending on the phenomena we want to predict, a simpler,
less accurate model may suffice. For example, in a substantial number of practical cases, such as
the modeling of image formation in cameras, geometrical optics is already sufficiently accurate
and a model based on Maxwell’s equations or even the scalar wave equation would be too
computationally demanding. From a pedagogical point of view, it surely seems preferable to
learn the simpler model prior to learning the more accurate model: think of how we need to
learn Newtonian mechanics before we can get a grasp of quantum mechanics or relativity.
Chapter 2
The Basics of Electromagnetic and Wave Optics
The content of this chapter is assumed known in the rest of the course.
It has already been treated in the course "Golven", therefore it will be
discussed only briefly in the lectures. In summary, with the material in
this chapter you should know and be able to do the following:
• Derive the scalar wave equation for the components of the electromagnetic
field.
• Understand time harmonic plane waves, spherical waves, wave fronts and
the phase velocity.
• Compute the rate of energy flow and its long-time average.
• Understand the Brewster angle, total internal reflection and evanescent waves.
Maxwell’s equations provide a very complete description of light which includes diffraction,
interference and polarization. Yet this description is strictly speaking not fully accurate because it allows
monochromatic electromagnetic waves to carry any energy, whereas according to quantum optics
the energy is quantized. According to quantum optics light is a flow of massless particles, the
photons, which each carry an extremely small quantum of energy, which for frequency ω is
$\hbar\omega$, where $\hbar = 6.63 \times 10^{-34}/(2\pi)$ Js. Optical frequencies are of the order $5 \times 10^{14}$ Hz, hence
$\hbar\omega \approx 3.3 \times 10^{-19}$ J.
Quantum optics is mainly important in experiments involving only a small number of photons,
i.e. at very low light intensities, and for specially prepared states of photons (e.g. entangled
states) for which there is no classical description.
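The order of magnitude quoted above can be checked with a few lines of Python. This is a minimal sketch: the function names and the 1 mW beam power are illustrative assumptions, not values from these notes.

```python
import math

H_PLANCK = 6.63e-34  # Planck constant in J s, the value quoted in the text


def photon_energy(frequency_hz: float) -> float:
    """Energy in joule of one photon with the given frequency in Hz."""
    hbar = H_PLANCK / (2 * math.pi)      # reduced Planck constant
    omega = 2 * math.pi * frequency_hz   # angular frequency in rad/s
    return hbar * omega


def photons_per_second(power_w: float, frequency_hz: float) -> float:
    """Mean number of photons per second carried by a beam of the given power."""
    return power_w / photon_energy(frequency_hz)


energy = photon_energy(5e14)           # typical optical frequency
rate = photons_per_second(1e-3, 5e14)  # hypothetical 1 mW beam
print(f"{energy:.2e} J per photon, {rate:.1e} photons per second")
```

Even a weak beam therefore carries an enormous number of photons per second, which is why the classical description is adequate at ordinary intensities.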
CHAPTER 2. THE BASICS OF ELECTROMAGNETIC AND WAVE OPTICS
Table 2.1: The mean photon flux density for a sampling of common
sources
where n̂ is the outwards pointing unit normal on A. Using Gauss’s divergence theorem (A.13) to
convert the left-hand side into a volume integral, the following differential form of the conservation
law is obtained:
$$-\nabla\cdot\boldsymbol{\mathcal{J}} = \frac{\partial \rho}{\partial t}. \tag{2.2}$$
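At every point in space and at every time the field vectors satisfy the Maxwell equations (see also Khan Academy - Faraday’s Law Introduction, and Khan Academy - Magnetic field created by a current carrying wire (Ampere’s Law)):
$$\nabla\times\boldsymbol{\mathcal{E}} = -\frac{\partial\boldsymbol{\mathcal{B}}}{\partial t}, \quad \text{(Faraday's law)}, \tag{2.3}$$
$$\nabla\times\frac{\boldsymbol{\mathcal{B}}}{\mu_0} = \epsilon_0\frac{\partial\boldsymbol{\mathcal{E}}}{\partial t} + \boldsymbol{\mathcal{J}}, \quad \text{(Maxwell's law)}, \tag{2.4}$$
$$\nabla\cdot\epsilon_0\boldsymbol{\mathcal{E}} = \varrho, \quad \text{(Gauss's law)}, \tag{2.5}$$
$$\nabla\cdot\boldsymbol{\mathcal{B}} = 0 \quad \text{(there is no magnetic charge)}, \tag{2.6}$$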
2.3. Maxwell Equations in Matter
where $\epsilon_0 = 8.8544 \times 10^{-12}$ C$^2$ N$^{-1}$ m$^{-2}$ is the so-called dielectric permittivity and $\mu_0 = 1.2566 \times 10^{-6}$ m kg C$^{-2}$ is the magnetic permeability of vacuum. The quantity $c = (1/\epsilon_0\mu_0)^{1/2} = 2.997924562 \times 10^{8} \pm 1.1$ m/s is the speed of light in vacuum.
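As a quick numerical check of the relation between the three constants, the sketch below evaluates $c = (1/\epsilon_0\mu_0)^{1/2}$ with the rounded values quoted above. The function name is an illustrative choice, and because the inputs are rounded to five digits the result only agrees with the quoted speed of light to about that accuracy.

```python
import math

EPSILON_0 = 8.8544e-12  # permittivity of vacuum, C^2 N^-1 m^-2 (value from the text)
MU_0 = 1.2566e-6        # permeability of vacuum, m kg C^-2 (value from the text)


def speed_of_light(eps0: float = EPSILON_0, mu0: float = MU_0) -> float:
    """Speed of light in vacuum from c = 1 / sqrt(eps0 * mu0)."""
    return 1.0 / math.sqrt(eps0 * mu0)


c = speed_of_light()
print(f"c = {c:.4e} m/s")
```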
where χ is a dimensionless quantity, the so-called susceptibility. We stress that $\boldsymbol{\mathcal{E}}$ is the total
local field at the position of the dipole, i.e. it contains the contribution of other dipoles that are
also excited and radiate fields themselves. Only in the case of dilute gases can the influence of the
other dipoles in matter be neglected, and the local electric field is then simply given by the field
emitted by the external source. A dipole moment density that changes with time corresponds to
a current density $\boldsymbol{\mathcal{J}}$ [C s$^{-1}$ m$^{-2}$] and a charge density given by
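$$\boldsymbol{\mathcal{J}}(\mathbf{r},t) = \frac{\partial\boldsymbol{\mathcal{P}}(\mathbf{r},t)}{\partial t} = \epsilon_0\chi\frac{\partial\boldsymbol{\mathcal{E}}(\mathbf{r},t)}{\partial t}, \tag{2.8}$$
$$\varrho(\mathbf{r},t) = -\nabla\cdot\boldsymbol{\mathcal{P}}(\mathbf{r},t) = -\nabla\cdot(\epsilon_0\chi\boldsymbol{\mathcal{E}}). \tag{2.9}$$
Substitution of (2.8) and (2.9) into Maxwell’s equations (2.4) and (2.5) gives
$$\nabla\times\frac{\boldsymbol{\mathcal{B}}}{\mu_0} = \epsilon_0(1+\chi)\frac{\partial\boldsymbol{\mathcal{E}}}{\partial t}, \tag{2.10}$$
$$\nabla\cdot(\epsilon_0\boldsymbol{\mathcal{E}}) = -\nabla\cdot(\epsilon_0\chi\boldsymbol{\mathcal{E}}), \tag{2.11}$$
where it is assumed that apart from the current and charges of the induced polarisation, there
are no other currents and charges in the material considered. We define the permittivity $\epsilon$ (same
dimension as $\epsilon_0$) of the material by
$$\epsilon = \epsilon_0(1+\chi). \tag{2.12}$$
Then
$$\nabla\times\boldsymbol{\mathcal{E}} = -\frac{\partial\boldsymbol{\mathcal{B}}}{\partial t}, \quad \text{(Faraday's Law)}, \tag{2.13}$$
$$\nabla\times\boldsymbol{\mathcal{B}} = \epsilon\mu_0\frac{\partial\boldsymbol{\mathcal{E}}}{\partial t}, \quad \text{(Maxwell's Law)}, \tag{2.14}$$
$$\nabla\cdot\epsilon\boldsymbol{\mathcal{E}} = 0, \quad \text{(Gauss's Law)}, \tag{2.15}$$
$$\nabla\cdot\boldsymbol{\mathcal{B}} = 0 \quad \text{(there is no magnetic charge)}. \tag{2.16}$$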
It is seen that Maxwell’s equations in matter are identical to those in vacuum provided the
permittivity of vacuum is replaced by that of the material.
Remark: If the material is magnetic, the magnetic permeability is different from vacuum and
denoted by µ. In Maxwell’s equations one should then replace µ0 by µ. However, at optical
frequencies magnetic effects are negligible (except in ferromagnetic materials, which are rare).
If the material is conducting, then there is an additional current density given by Ohm’s Law:
$$\boldsymbol{\mathcal{J}}_c = \sigma\boldsymbol{\mathcal{E}}, \tag{2.17}$$
where σ [C s$^{-1}$ V$^{-1}$ m$^{-1}$] is the conductivity. This current density has to be added to the right-hand
side of (2.14). The other Maxwell equations remain unchanged.
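$$\nabla\times\nabla\times\boldsymbol{\mathcal{E}} + \epsilon\mu_0\frac{\partial^2\boldsymbol{\mathcal{E}}}{\partial t^2} = \mathbf{0}. \tag{2.18}$$
Now for any vector field $\boldsymbol{\mathcal{A}}$ there holds:
$$\nabla\times\nabla\times\boldsymbol{\mathcal{A}} = -\nabla^2\boldsymbol{\mathcal{A}} + \nabla\nabla\cdot\boldsymbol{\mathcal{A}}. \tag{2.19}$$
$$\nabla\times\nabla\times\boldsymbol{\mathcal{E}} = -\nabla^2\boldsymbol{\mathcal{E}}, \tag{2.22}$$
$$\nabla^2\mathcal{U} - \epsilon\mu_0\frac{\partial^2\mathcal{U}}{\partial t^2} = 0. \tag{2.24}$$
The refractive index is the dimensionless quantity defined by
$$n = \sqrt{\frac{\epsilon}{\epsilon_0}}. \tag{2.25}$$
$$\nabla^2\mathcal{U} - n^2\epsilon_0\mu_0\frac{\partial^2\mathcal{U}}{\partial t^2} = 0. \tag{2.26}$$
The speed of light in matter is
$$\frac{c}{n} = \frac{1}{\sqrt{\epsilon\mu_0}}. \tag{2.27}$$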
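The chain from susceptibility to refractive index to speed of light in matter, i.e. (2.12), (2.25) and (2.27), can be traced numerically. The sketch below is illustrative only: the susceptibility value 1.25 is a hypothetical choice that gives $n = 1.5$, roughly the index of common glass.

```python
import math

EPSILON_0 = 8.8544e-12  # permittivity of vacuum, value from the text
MU_0 = 1.2566e-6        # permeability of vacuum, value from the text


def permittivity(chi: float) -> float:
    """Permittivity of the material, eps = eps0 * (1 + chi), Eq. (2.12)."""
    return EPSILON_0 * (1.0 + chi)


def refractive_index(chi: float) -> float:
    """Refractive index n = sqrt(eps / eps0), Eq. (2.25)."""
    return math.sqrt(permittivity(chi) / EPSILON_0)


def speed_in_matter(chi: float) -> float:
    """Speed of light in the material, 1 / sqrt(eps * mu0), Eq. (2.27)."""
    return 1.0 / math.sqrt(permittivity(chi) * MU_0)


chi_glass = 1.25  # hypothetical susceptibility, chosen so that n = 1.5
n = refractive_index(chi_glass)
v = speed_in_matter(chi_glass)
c = 1.0 / math.sqrt(EPSILON_0 * MU_0)
print(f"n = {n:.3f}, v = {v:.4e} m/s, c/n = {c / n:.4e} m/s")
```

The last two printed numbers coincide, which is the content of (2.27).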
2.5. Time Harmonic Solutions of the Wave Equation
where k = kx x̂ + ky ŷ + kz ẑ is the wave vector and the planes of constant phase are perpendicular
to the direction of k. Eq. (2.32) is a solution of (2.24) if
The direction of the wave vector can be chosen arbitrarily, but its length is fixed by the frequency
ω.
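That the frequency fixes the length of the wave vector but not its direction can be checked numerically. The sketch below assumes that the condition referred to above has the standard form $|\mathbf{k}| = \omega n/c$, consistent with (2.50), and that the plane wave is $\cos(\mathbf{k}\cdot\mathbf{r} - \omega t + \varphi)$. It evaluates the left-hand side of (2.26) with central finite differences in normalised units ($c = 1$). All names and numerical values are illustrative.

```python
import math


def plane_wave(r, t, k, omega, amplitude=1.0, phase=0.0):
    """Real time-harmonic plane wave amplitude * cos(k.r - omega*t + phase)."""
    k_dot_r = sum(ki * ri for ki, ri in zip(k, r))
    return amplitude * math.cos(k_dot_r - omega * t + phase)


def wave_equation_residual(r, t, k, omega, n, c=1.0, h=1e-3, phase=0.0):
    """Laplacian(U) - (n/c)^2 d^2U/dt^2, estimated with central differences."""
    def u(pos, time):
        return plane_wave(pos, time, k, omega, phase=phase)

    centre = u(r, t)
    laplacian = 0.0
    for axis in range(3):
        plus, minus = list(r), list(r)
        plus[axis] += h
        minus[axis] -= h
        laplacian += (u(plus, t) - 2.0 * centre + u(minus, t)) / h**2
    d2_dt2 = (u(r, t + h) - 2.0 * centre + u(r, t - h)) / h**2
    return laplacian - (n / c) ** 2 * d2_dt2


n, omega = 1.5, 2.0               # normalised units with c = 1
point, time = (0.3, -0.2, 0.5), 0.7
k_good = (1.0, 2.0, 2.0)          # |k| = 3 = omega * n / c
k_bad = (1.0, 2.0, 1.0)           # wrong length, same frequency

res_good = wave_equation_residual(point, time, k_good, omega, n, phase=0.4)
res_bad = wave_equation_residual(point, time, k_bad, omega, n, phase=0.4)
print(f"residual with |k| = n*omega/c: {res_good:.2e}")
print(f"residual with wrong |k|:       {res_bad:.2e}")
```

Any other direction of `k_good` with the same length gives an equally small residual, while a wave vector of the wrong length does not solve the equation.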
where the amplitude $\mathcal{A}(\mathbf{r}) > 0$ and phase $\varphi(\mathbf{r})$ are functions of position $\mathbf{r}$. The wave fronts, i.e.
the surfaces of constant phase, are now in general not planes, hence the solution is in general not
a plane wave. Eq. (2.34) could for example be a wave with spherical wave fronts, as discussed
below.
Remark We will show in Section 7.1 that every time-harmonic solution of the wave equation
can always be expanded in terms of in general infinitely many plane waves of the form (2.32).
[Figure: planes of constant phase of a time-harmonic plane wave, perpendicular to the wave vector k and one wavelength λ apart. The planes should be drawn with infinite spatial extent, since the disturbance occupies all of space.]
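For time harmonic solutions it is often convenient to use complex notation. Define the
complex amplitude by:
$$U(\mathbf{r}) = \mathcal{A}(\mathbf{r})e^{i\varphi(\mathbf{r})}, \tag{2.35}$$
i.e. the modulus of the complex number $U(\mathbf{r})$ is the amplitude $\mathcal{A}(\mathbf{r})$ and the argument of $U(\mathbf{r})$
is the phase $\varphi(\mathbf{r})$ at $t = 0$. Then (2.34) can be written as
$$\mathcal{U}(\mathbf{r},t) = \mathrm{Re}\left[U(\mathbf{r})e^{-i\omega t}\right]. \tag{2.36}$$
Hence $\mathcal{U}(\mathbf{r},t)$ is the real part of the complex time harmonic function
$$U(\mathbf{r})e^{-i\omega t}. \tag{2.37}$$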
Remark: In the case of vector fields such as $\mathbf{E}$ and $\mathbf{H}$, the complex amplitudes are also called
"complex fields". Complex amplitudes and complex fields are functions of position $\mathbf{r}$ only; the
time dependent factor $\exp(-i\omega t)$ is omitted. To get the physically meaningful real quantity, the
complex amplitude or complex field must first be multiplied by $\exp(-i\omega t)$ and then the real
part must be taken.
Real-valued physical quantities (whether they are time-harmonic or have a more general time
dependence) are denoted by a calligraphic letter, e.g. $\mathcal{U}$, $\mathcal{B}_x$ or $\mathcal{E}_x$. The symbols are bold when we
are dealing with a vector, e.g. $\boldsymbol{\mathcal{E}}$ or $\boldsymbol{\mathcal{B}}$. The complex amplitude of a time-harmonic function is
linked to the real physical quantity by (2.36) and is written as an ordinary letter such as $U$ and $\mathbf{E}$.
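The link between a complex amplitude and the real physical quantity can be made concrete with Python's built-in complex numbers. This is a minimal sketch with illustrative names and values: it builds the complex amplitude from a modulus and a phase as in (2.35), multiplies by $\exp(-i\omega t)$, takes the real part, and compares the result with $\mathcal{A}\cos(\varphi - \omega t)$.

```python
import cmath
import math


def complex_amplitude(amplitude: float, phase: float) -> complex:
    """Complex amplitude with the given modulus and argument, as in Eq. (2.35)."""
    return amplitude * cmath.exp(1j * phase)


def real_field(u: complex, omega: float, t: float) -> float:
    """Real physical quantity: multiply by exp(-i*omega*t), then take the real part."""
    return (u * cmath.exp(-1j * omega * t)).real


amplitude, phase = 2.0, 0.3  # illustrative values
omega, t = 5.0, 0.1
u = complex_amplitude(amplitude, phase)

from_complex = real_field(u, omega, t)
from_cosine = amplitude * math.cos(phase - omega * t)
print(from_complex, from_cosine)
```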
It is easier to calculate with complex quantities than with trigonometric functions (cosine and
sine). As long as all the operations carried out on the functions are linear, the operations can
be carried out on the complex quantities. To get the real-valued physical quantity (i.e. the
physically meaningful result), you multiply the finally obtained complex amplitude by $\exp(-i\omega t)$
and take the real part. This is allowed because taking the real part commutes
with linear operations, i.e. taking first the real part to get the real-valued physical quantity and
then operating on this real physical quantity gives the same result as operating on the complex
scalar and taking the real part at the end.
Remarks
1. The complex quantity of which the real part has to be taken is $U\exp(-i\omega t)$. It is not
necessary to drag the time dependent factor $\exp(-i\omega t)$ along in the computations: it suffices
to compute with the complex amplitude $U$, but to get the physically relevant quantity, you of
course first have to multiply $U$ by $\exp(-i\omega t)$ and then take the real part. However, omitting
the factor $\exp(-i\omega t)$ and computing only with the complex amplitude $U$ requires that
the differentiation with respect to time, $\partial/\partial t$, is everywhere replaced by multiplication by
$-i\omega$. This is for example done in the time-harmonic Maxwell’s equations in Section 2.6
below.
2. An example showing that complex computations cannot be used for nonlinear operations is
taking the square of a time-harmonic function. Suppose the real-valued quantity is $\mathcal{U} =
\mathcal{A}\cos(\varphi - \omega t)$. In complex notation:
$$\mathcal{U} = \mathrm{Re}\left[Ue^{-i\omega t}\right] = \mathrm{Re}\left[\mathcal{A}e^{i\varphi - i\omega t}\right], \tag{2.41}$$
This is not the same as (2.42). Hence when the computations are nonlinear, such as in
computing the energy (squaring the field), do NOT use complex fields but work with the
real fields.
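Both remarks can be illustrated numerically. The sketch below uses illustrative values and assumes the field $\mathcal{U} = \mathcal{A}\cos(\varphi - \omega t)$ of Remark 2. It first checks that multiplying the complex amplitude by $-i\omega$ reproduces the time derivative of the real field (a linear operation), and then shows that squaring the complex quantity and taking the real part does not reproduce the square of the real field (a nonlinear operation).

```python
import cmath
import math

A, PHI, OMEGA = 1.5, 0.4, 3.0   # illustrative modulus, phase and frequency
U = A * cmath.exp(1j * PHI)     # complex amplitude


def real_part(amplitude: complex, t: float, omega: float = OMEGA) -> float:
    """Re[amplitude * exp(-i*omega*t)]."""
    return (amplitude * cmath.exp(-1j * omega * t)).real


def real_field(t: float) -> float:
    """The real-valued quantity A*cos(phi - omega*t)."""
    return A * math.cos(PHI - OMEGA * t)


t, dt = 0.2, 1e-6

# Linear operation: d/dt acting on the real field versus -i*omega on U.
derivative_real = (real_field(t + dt) - real_field(t - dt)) / (2 * dt)
derivative_complex = real_part(-1j * OMEGA * U, t)

# Nonlinear operation: squaring the real field versus squaring U*exp(-i*omega*t).
square_real = real_field(t) ** 2
square_complex = real_part(U**2, t, omega=2 * OMEGA)

print(derivative_real, derivative_complex)  # agree
print(square_real, square_complex)          # differ
```

The two squares are related by $\cos^2\theta = (1 + \cos 2\theta)/2$: the complex computation misses the constant term $\mathcal{A}^2/2$ and gets the oscillating term wrong by a factor of 2.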
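$$\nabla^2\mathcal{U}(r,t) = \frac{1}{r}\frac{\partial^2}{\partial r^2}\left[r\,\mathcal{U}(r,t)\right]. \tag{2.45}$$
The scalar wave equation (2.24) becomes, after multiplying by $r$:
$$\frac{\partial^2}{\partial r^2}\left[r\,\mathcal{U}\right] = \epsilon\mu_0\frac{\partial^2}{\partial t^2}\left[r\,\mathcal{U}\right]. \tag{2.46}$$
It is easy to see that
$$\mathcal{U}(r,t) = \frac{f(r \pm ct/n)}{r}, \tag{2.47}$$
satisfies (2.46) for any choice for the function $f$. The surfaces of constant phase are spheres:
$$r = \text{constant} \mp \frac{c}{n}t, \tag{2.48}$$
and therefore the solutions are called spherical waves. Of particular interest are the time harmonic
spherical waves:
$$\mathcal{U}(r,t) = \frac{\mathcal{A}}{r}\cos(kr \pm \omega t + \varphi), \tag{2.49}$$
where
$$k = n\omega/c. \tag{2.50}$$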
If the − sign applies, the wave propagates outwards, away from the origin. The wave is then
radiated by a source at the origin. Indeed, if the − sign holds in (2.49), then a surface of constant
phase moves outwards when time increases. Similarly, if the + sign holds, the wave propagates
towards the origin, corresponding to a sink at the origin.
In either case the amplitude of the wave, $\mathcal{A}/r$, is proportional to the inverse distance, so
that energy is conserved. The energy flux is proportional to the square of the amplitude and hence
to the inverse distance squared. Since the area of a sphere of radius $r$ is proportional to $r^2$, the
energy flux integrated over a sphere is independent of its radius.
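The energy argument can be verified in a few lines. The sketch below takes the energy flux to be the squared amplitude up to a constant factor, which is set to 1 here as an assumption, and integrates it over spheres of very different radii. The function names and numbers are illustrative.

```python
import math


def spherical_amplitude(a: float, r: float) -> float:
    """Amplitude a / r of a spherical wave at distance r from the origin."""
    return a / r


def flux_through_sphere(a: float, r: float) -> float:
    """Squared amplitude times the area of the sphere of radius r.

    The proportionality constant between squared amplitude and energy
    flux is set to 1, since it does not depend on r.
    """
    flux_density = spherical_amplitude(a, r) ** 2
    return flux_density * 4.0 * math.pi * r**2


radii = (0.5, 1.0, 10.0, 1.0e3)
fluxes = [flux_through_sphere(2.0, r) for r in radii]
print(fluxes)  # the same value, 16*pi, for every radius
```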
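Using complex notation we have for the outwards propagating wave:
$$\mathcal{U}(r,t) = \mathrm{Re}\left[U(r)e^{-i\omega t}\right] = \mathrm{Re}\left[\frac{A}{r}e^{ikr - i\omega t}\right] \tag{2.51}$$
with $U(r) = A\exp(ikr)/r$ and $A = \mathcal{A}\exp(i\varphi)$, where $\varphi$ is the argument and $\mathcal{A}$ the modulus of
the complex amplitude $A$.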
In Fig. 2.2 and Fig. 2.3 spherical wave fronts are shown. It is seen from Fig. 2.3 that for
an observer at very large distance to a source, the spherical wave fronts look plane. Hence if
your distance to the source is large, you perceive the spherical wave as a plane wave propagating
towards you.
[Figure: the flattening of spherical waves with distance.]
2.6. Time Harmonic Maxwell Equations in Matter
with
where $\varphi_x$ is the argument of the complex number $E_x$ etc. With similar notations for the magnetic
field, we obtain by substitution into Maxwell’s equations (2.13), (2.14), (2.15) and (2.16) the
time-harmonic Maxwell equations for the complex fields:
$$\nabla\times\mathbf{E} = i\omega\mathbf{B}, \tag{2.53}$$
$$\nabla\times\mathbf{B} = -i\omega\epsilon\mu_0\mathbf{E}, \tag{2.54}$$
$$\nabla\cdot\epsilon\mathbf{E} = 0, \tag{2.55}$$
$$\nabla\cdot\mathbf{B} = 0. \tag{2.56}$$
The scalar wave equation in time-harmonic form is called the Helmholtz equation. The
permittivity $\epsilon$ is in general complex-valued and depends on the frequency. The latter property
is called dispersion. The imaginary part of the permittivity is a measure of the absorption of the
light. Except close to a resonance frequency, the imaginary part of $\epsilon(\omega)$ is small and the
real part is a slowly increasing function of frequency. Near a resonance the real part changes
rapidly and decreases with ω, while the imaginary part has a maximum at the resonance
frequency, corresponding to maximum absorption at resonance. In some books the notation
$\epsilon = (n + i\kappa)^2$ is used, where $n$ and $\kappa$ (not to be confused with the wavenumber $k$) are both real,
with $n$ the refractive index and $\kappa$ a measure of absorption. We then have $\mathrm{Re}(\epsilon) = n^2 - \kappa^2$ and
$\mathrm{Im}(\epsilon) = 2n\kappa$ (see Fig. 2.4). As remarked before, at optical frequencies the magnetic permeability
differs only very little from the vacuum value $\mu_0$.
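The relation between the pair $(n, \kappa)$ and the complex permittivity is easy to evaluate with complex arithmetic. The sketch below follows the convention $\epsilon = (n + i\kappa)^2$ quoted above. The values $n = 1.5$ and $\kappa = 0.1$ are hypothetical, and the inverse function assumes that the principal square root (positive real part) is the physically relevant one.

```python
import cmath


def permittivity_from_index(n: float, kappa: float) -> complex:
    """Complex permittivity in the convention eps = (n + i*kappa)^2."""
    return complex(n, kappa) ** 2


def index_from_permittivity(eps: complex) -> complex:
    """Inverse relation: principal square root, returned as n + i*kappa."""
    return cmath.sqrt(eps)


eps = permittivity_from_index(1.5, 0.1)  # hypothetical weakly absorbing material
print(eps.real, eps.imag)                # n^2 - kappa^2 and 2*n*kappa
print(index_from_permittivity(eps))      # recovers n + i*kappa
```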
Figure 2.4: Real part $n^2 - \kappa^2$ and imaginary part $2n\kappa$ of the permittivity $\epsilon = (n + i\kappa)^2$, as
function of wavelength and of frequency.
Figure 2.5: Refractive index as function of wavelength and frequency for several important optical crystals. (Source: data published by The Harshaw Chemical Co.)
[Figure: the electromagnetic spectrum from γ-rays through X-rays, ultraviolet, light, infrared, microwaves and radio waves down to power-line frequencies. For each band the chart lists the frequency (Hz), wavelength (m), photon energy (eV and J), microscopic source, detection method and artificial generation.]
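with
$$\mathbf{E}(\mathbf{r}) = \mathbf{A}e^{i\mathbf{k}\cdot\mathbf{r}}, \tag{2.59}$$
where $\mathbf{A}$ is a complex vector
$$\mathbf{A} = \begin{pmatrix} A_x \\ A_y \\ A_z \end{pmatrix}, \tag{2.60}$$
with $A_x = |A_x|e^{i\varphi_x}$ etc. Here $\mathbf{k}$ is the wave vector, which satisfies (2.33). By substituting (2.59)
into (2.55) it follows that
$$\mathbf{A}\cdot\mathbf{k} = 0, \tag{2.61}$$
which means that the electric field vector is in every point perpendicular to the wave vector.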
For simplicity we now choose the wave vector in the direction of the x-axis and we assume that
the electric field vector is parallel to the y-axis. This case is called a y-polarised electromagnetic
wave. The complex field is then written as
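$$\mathbf{B}(x) = \frac{k}{\omega}\,\hat{\mathbf{x}}\times\hat{\mathbf{y}}\,Ae^{ikx} = \sqrt{\epsilon\mu_0}\,Ae^{ikx}\,\hat{\mathbf{z}}. \tag{2.63}$$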
The real electromagnetic field is thus:
We conclude that the electric and magnetic field of a plane wave are in phase, i.e. at any given
point x both the electric and the magnetic field achieve their maximum and minimum values at
the same time.
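$$\boldsymbol{\mathcal{S}}(\mathbf{r},t) = \boldsymbol{\mathcal{E}}(\mathbf{r},t)\times\frac{\boldsymbol{\mathcal{B}}(\mathbf{r},t)}{\mu_0}. \tag{2.66}$$
More precisely, the flow of electromagnetic energy through a small surface $dA$ with normal $\hat{\mathbf{n}}$ at
point $\mathbf{r}$ is given by
$$\boldsymbol{\mathcal{S}}(\mathbf{r},t)\cdot\hat{\mathbf{n}}\,dA. \tag{2.67}$$
If this scalar product is positive, the energy flow is in the direction of $\hat{\mathbf{n}}$, otherwise it is opposite
to $\hat{\mathbf{n}}$. Hence the direction of $\boldsymbol{\mathcal{S}}(\mathbf{r},t)$ is the direction of the flow of energy at point $\mathbf{r}$ and the length
$\|\boldsymbol{\mathcal{S}}(\mathbf{r},t)\|$ is the amount of the flow of energy, per unit of time and per unit of area perpendicular
to the direction of $\boldsymbol{\mathcal{S}}$.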
2.8. Electromagnetic Energy
By adopting the results from electrostatics and magnetostatics, it follows that the total energy
stored in the electromagnetic field per unit of volume at a point $\mathbf{r}$ is equal to the sum of the
electric and the magnetic energy densities:
$$U_{em}(\mathbf{r},t) = \frac{\epsilon}{2}\,\boldsymbol{\mathcal{E}}(\mathbf{r},t)\cdot\boldsymbol{\mathcal{E}}(\mathbf{r},t) + \frac{1}{2\mu_0}\,\boldsymbol{\mathcal{B}}(\mathbf{r},t)\cdot\boldsymbol{\mathcal{B}}(\mathbf{r},t). \tag{2.68}$$
Remark: The energy flux $\boldsymbol{\mathcal{S}}$ and the energy density $U_{em}$ depend nonlinearly on the field. For
$U_{em}$ the quadratic dependence on the electric and magnetic fields is clear. To see that the
Poynting vector is also quadratic in the electromagnetic field, you should realize that the electric and
magnetic fields are inseparable: together they form the electromagnetic field. Stated differently:
if the amplitude of the electric field is doubled, then that of the magnetic field is doubled as well and
hence the Poynting vector is increased by a factor 4.
Conclusion: if you have to compute the Poynting vector or the electromagnetic energy density of a
time harmonic electromagnetic field, substitute the real-valued vector fields and do NOT use
complex computations. An exception is the calculation of the long-time average of the Poynting
vector or the energy density. As we will show below, the time averages of the energy flux and
energy density can be expressed quite conveniently in terms of the complex field amplitudes in this case.
For the plane wave (2.62), (2.63) we substitute the real-valued fields in the Poynting vector
and the electromagnetic energy density and get:
$$\boldsymbol{\mathcal{S}}(x,t) = \boldsymbol{\mathcal{E}}(x,t)\times\frac{\boldsymbol{\mathcal{B}}(x,t)}{\mu_0} = \sqrt{\frac{\epsilon}{\mu_0}}\,|A|^2\cos^2(kx - \omega t + \varphi)\,\hat{\mathbf{x}}, \tag{2.69}$$
$$U_{em}(x,t) = \epsilon|A|^2\cos^2(kx - \omega t + \varphi). \tag{2.70}$$
We see that the energy flow of a plane wave is in the direction of the wave vector which is also
the direction of the phase velocity.
Optica Lecture Notes TN2421 21 of 165 Monday 16th April, 2018, 09:44
26/08/16 11:50 AM
CHAPTER 2. THE BASICS OF ELECTROMAGNETIC AND WAVE OPTICS
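Because

$$\mathcal{A}(t) = \mathrm{Re}\left[A e^{-i\omega t}\right] = \frac{1}{2}\left[A e^{-i\omega t} + A^{*} e^{i\omega t}\right],$$

where the ∗ means complex conjugation, and with a similar expression for $\mathcal{B}(t)$, it follows that

$$\lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2}\mathcal{A}(t)\,\mathcal{B}(t)\,dt = \lim_{T\to\infty}\frac{1}{4T}\int_{-T/2}^{T/2}\left[AB^{*} + A^{*}B + AB\,e^{-2i\omega t} + A^{*}B^{*}e^{2i\omega t}\right]dt = \frac{1}{2}\,\mathrm{Re}\left[AB^{*}\right],$$

since the two terms that oscillate at frequency $2\omega$ average to zero.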
If one takes the average of the product of two time harmonic quantities over a time interval
long compared with the period of the oscillation, the result is half the real part of the product of
the complex amplitude of one quantity and the complex conjugate of the other.
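The rule stated above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names, the sample amplitudes and the sampling parameters are all illustrative assumptions. It averages the product of two real time-harmonic signals over an integer number of periods and compares the result with half the real part of $AB^{*}$.

```python
import cmath
import math


def time_average_product(a: complex, b: complex, omega: float = 2 * math.pi,
                         periods: int = 50, samples_per_period: int = 200) -> float:
    """Average Re[a e^{-i w t}] * Re[b e^{-i w t}] over an integer number of periods."""
    period = 2 * math.pi / omega
    n = periods * samples_per_period
    dt = periods * period / n
    total = 0.0
    for k in range(n):
        phase = cmath.exp(-1j * omega * k * dt)
        total += (a * phase).real * (b * phase).real
    return total / n


def half_real_part(a: complex, b: complex) -> float:
    """Closed-form time average: (1/2) Re[a b*]."""
    return 0.5 * (a * b.conjugate()).real


# Hypothetical complex amplitudes, chosen only for illustration.
a = 2.0 * cmath.exp(0.3j)
b = 1.5 * cmath.exp(-1.1j)
print(time_average_product(a, b), half_real_part(a, b))
```

The two printed numbers should agree to within rounding error, because uniform sampling over whole periods makes the terms at frequency $2\omega$ cancel.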
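If we apply this to time harmonic electric and magnetic fields,

$$\boldsymbol{\mathcal{E}}(\mathbf{r},t) = \mathrm{Re}\left[\mathbf{E}(\mathbf{r})e^{-i\omega t}\right],$$
$$\boldsymbol{\mathcal{B}}(\mathbf{r},t) = \mathrm{Re}\left[\mathbf{B}(\mathbf{r})e^{-i\omega t}\right],$$

then we find that the time averaged energy flow, denoted by $\mathbf{S}(\mathbf{r})$, is given by

$$\mathbf{S}(\mathbf{r}) = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2}\boldsymbol{\mathcal{S}}(\mathbf{r},t)\,dt = \frac{1}{2\mu_0}\,\mathrm{Re}\left[\mathbf{E}\times\mathbf{B}^{*}\right]. \tag{2.76}$$

Applying the same rule to both terms of the energy density, the time averaged electromagnetic energy density is

$$\langle U_{en}(\mathbf{r})\rangle \stackrel{\text{def}}{=} \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2}U_{en}(\mathbf{r},t')\,dt' = \frac{\epsilon_0}{4}\,\mathbf{E}(\mathbf{r})\cdot\mathbf{E}(\mathbf{r})^{*} + \frac{1}{4\mu_0}\,\mathbf{B}(\mathbf{r})\cdot\mathbf{B}(\mathbf{r})^{*} = \frac{\epsilon_0}{4}|\mathbf{E}(\mathbf{r})|^2 + \frac{1}{4\mu_0}|\mathbf{B}(\mathbf{r})|^2. \tag{2.77}$$

For the special case of the plane wave (2.62), (2.63), these expressions give:

$$\mathbf{S} = \frac{1}{2}\sqrt{\frac{\epsilon_0}{\mu_0}}\,\mathrm{Re}\left[AA^{*}\right]\hat{\mathbf{x}} = \frac{1}{2}\sqrt{\frac{\epsilon_0}{\mu_0}}\,|A|^2\,\hat{\mathbf{x}}. \tag{2.78}$$

The length of the vector (2.78) is the time averaged flow of energy per unit of area in the direction of the plane wave and is commonly called the intensity of the wave. For the time averaged electromagnetic energy density of the plane wave we get:

$$\langle U_{en}\rangle = \frac{\epsilon_0}{4}|A|^2 + \frac{1}{4\mu_0}\,\epsilon_0\mu_0\,|A|^2 = \frac{\epsilon_0}{2}|A|^2. \tag{2.79}$$

It is seen that both the time averaged energy flux and the time averaged energy density of a plane wave are proportional to the modulus squared of the complex electric field amplitude.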
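Part of the incident field is reflected into medium y > 0 and part is transmitted into medium y < 0. The reflected and transmitted fields are also plane waves:

$$\boldsymbol{\mathcal{E}}^r(\mathbf{r},t) = \mathrm{Re}\left[\mathbf{E}^r(\mathbf{r})e^{-i\omega t}\right] = \mathrm{Re}\left[\mathbf{A}^r e^{i(\mathbf{k}^r\cdot\mathbf{r}-\omega t)}\right], \quad \text{for } y>0, \tag{2.81}$$
$$\boldsymbol{\mathcal{E}}^t(\mathbf{r},t) = \mathrm{Re}\left[\mathbf{E}^t(\mathbf{r})e^{-i\omega t}\right] = \mathrm{Re}\left[\mathbf{A}^t e^{i(\mathbf{k}^t\cdot\mathbf{r}-\omega t)}\right], \quad \text{for } y<0. \tag{2.82}$$

Our aim is to determine the reflected and transmitted field amplitudes $\mathbf{A}^r$ and $\mathbf{A}^t$. At the interface y = 0 the fields satisfy the boundary conditions

$$\hat{\mathbf{n}}\times(\boldsymbol{\mathcal{E}}^i+\boldsymbol{\mathcal{E}}^r) = \hat{\mathbf{n}}\times\boldsymbol{\mathcal{E}}^t, \tag{2.83}$$
$$\hat{\mathbf{n}}\times(\boldsymbol{\mathcal{B}}^i+\boldsymbol{\mathcal{B}}^r) = \hat{\mathbf{n}}\times\boldsymbol{\mathcal{B}}^t, \tag{2.84}$$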
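where n̂ = ŷ is the unit normal on the interface. This means that the tangential components of the total electric and total magnetic field are continuous across the interface: at y = 0, the x- and z-components of the total electric field and of the total magnetic field have the same values on both sides.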
We will only demonstrate this for the electric field. By choosing a closed loop in the (x, y)-plane as shown in Fig. 2.8 and integrating the normal component of Faraday's Law (2.13) over the area A bounded by the loop L, we obtain:

$$-\frac{d}{dt}\iint_A \hat{\mathbf{z}}\cdot\boldsymbol{\mathcal{B}}\,dA = \iint_A \hat{\mathbf{z}}\cdot\nabla\times\boldsymbol{\mathcal{E}}\,dA = \oint_L \boldsymbol{\mathcal{E}}\cdot d\mathbf{l}, \tag{2.89}$$

where in the last step we used Stokes' theorem, and the direction of integration over the loop corresponds to the direction of the normal ẑ according to the right-handed screw rule. In words: minus the rate of change of the magnetic flux through the surface A is equal to the integral of the tangential electric field over the bounding closed loop L. By taking the limit dy → 0, the surface integral and the integrals over the vertical parts of the loop vanish, and there remain only the integrals of the tangential electric field over the horizontal parts of the loop on both sides of the interface y = 0. Since these parts are traversed in opposite directions and have the same length, we conclude for the loop shown in Fig. 2.8 that the x-component of the electric field is continuous across the interface.

Figure 2.8: Closed loop in the (x, y)-plane enclosing the area A and surrounding part of the interface y = 0, as used in Stokes' theorem to derive the continuity of the electric and magnetic components that are tangential to the interface and parallel to the plane through the loop.
By choosing the closed loop in the (y, z)-plane instead of the (x, y)-plane, one finds that the z-component of the electric field is continuous as well. Hence all tangential electric field components are continuous across the interface.
By integrating Maxwell's equations that contain the div-operator, (2.15) and (2.16), over a pill box with height dy and with top and bottom surfaces on either side of and parallel to the interface, and considering the limit dy → 0, we find continuity relations for the normal components of the fields.
Since (2.85), (2.86), (2.87), (2.88) hold for all times, it is easy to see that the complex fields satisfy the same boundary conditions at y = 0. The same holds for the boundary conditions for the normal components, but, as will become clear, these are not needed in the derivation below: we will only need the continuity of the tangential electric and magnetic components.
2.10.2 S-polarisation
We have shown in (2.61) and (2.63) of Section 2.7 that the electric and magnetic fields of a plane wave are perpendicular to each other and to the wave vector. Let the incident plane wave have wave vector:

$$\mathbf{k}^i = k_x^i\,\hat{\mathbf{x}} + k_y^i\,\hat{\mathbf{y}} = k^i\sin\theta_i\,\hat{\mathbf{x}} - k^i\cos\theta_i\,\hat{\mathbf{y}}, \tag{2.97}$$

where $\theta_i$ is the angle with the normal as shown in Fig. 2.9, $k^i = \omega\sqrt{\epsilon_i\mu_0} = k_0 n_i$, $k_0 = \omega\sqrt{\epsilon_0\mu_0}$ is the wave number in vacuum, and $n_i$ is the refractive index of the medium in y > 0. The plane of incidence is defined as the plane through the normal and the incident wave vector. In the present case this is the (x, y)-plane. The electric vector $\mathbf{A}^i$ of the incident plane wave can have any direction in the plane perpendicular to the wave vector. We distinguish two (orthogonal) cases:
1. S-polarisation: in this case the electric field is in the z-direction, i.e. perpendicular ("Senkrecht") to the plane of incidence (see Fig. 2.9):

$$\mathbf{E}^i(x,y) = A^i e^{ik^i[\sin(\theta_i)x-\cos(\theta_i)y]}\,\hat{\mathbf{z}}. \tag{2.98}$$

2. P-polarisation: then the electric field vector is parallel to the plane of incidence and perpendicular to the wave vector (see Fig. 2.10):

$$\mathbf{E}^i(x,y) = A^i e^{ik^i[\sin(\theta_i)x-\cos(\theta_i)y]}\left[\cos\theta_i\,\hat{\mathbf{x}} + \sin\theta_i\,\hat{\mathbf{y}}\right]. \tag{2.99}$$
It will follow from the computations below that if the incident electric field is S-polarised, the reflected and transmitted fields are also S-polarised, and similarly for P-polarisation. This is the reason for using this particular decomposition: it greatly simplifies the calculations, and since any incident electric field can always be written as the superposition of an S-polarised and a P-polarised wave, studying these two cases is sufficient.
We will consider S-polarisation in this section. We seek amplitudes $A^r$ and $A^t$ such that the reflected and transmitted plane waves are given by

$$\mathbf{E}^r(x,y) = A^r e^{ik^i[\sin(\theta_r)x+\cos(\theta_r)y]}\,\hat{\mathbf{z}}, \tag{2.100}$$
$$\mathbf{E}^t(x,y) = A^t e^{ik^t[\sin(\theta_t)x-\cos(\theta_t)y]}\,\hat{\mathbf{z}}, \tag{2.101}$$
[Figure 2.9: incident, reflected and transmitted plane waves for S-polarisation; figure not reproduced.]

Figure 2.10: A P-polarised incident plane wave: the electric field vector is parallel to the plane of incidence. [Figure not reproduced.]
where $k^t = k_0 n_t$, with $n_t$ the refractive index of the material in y < 0. The angles $\theta_r$ and $\theta_t$ are called the angle of reflection and the angle of refraction (or transmission), respectively.
First of all we claim that the tangential components of the wave vectors of the three plane waves, i.e. their x-components, must be identical:

$$k_x^i = k_x^r = k_x^t. \tag{2.103}$$
If this were not the case, the incident, reflected and transmitted electric fields (2.98), (2.100) and (2.101) would for y = 0 be periodic functions of x but with different periods, and fields with different periods as functions of x cannot satisfy the boundary conditions at the interface y = 0 for all x. For P-polarisation a similar reasoning applies, so that (2.103) holds for all polarisations.
We conclude from (2.103) that
θr = θi , (2.104)
nt sin θt = ni sin θi , (Snell’s Law). (2.105)
Hence, the angle of reflection is the same as the angle of incidence, and the transmitted wave is refracted towards or away from the normal by the amount given by (2.105). Eq. (2.105) is Snell's Law, named after Willebrord Snellius (1580-1626), mathematics professor in Leiden, who never published the result during his lifetime. René Descartes independently derived the law in terms of sines, using heuristic momentum conservation arguments, in his 1637 essay Dioptrics; therefore, in French-speaking countries it is called Descartes' Law. Note that the Reflection Law (2.104) and Snell's Law (2.105) hold independent of the polarisation, hence they also hold for the case of a P-polarised incident wave.
Snell's Law implies that when the angle of incidence θi increases, the angle of transmission increases as well. If the medium in y > 0 is air with refractive index ni = 1 and the other medium is glass with refractive index nt = 1.5, then the maximum angle of transmission occurs when θi = 90°, and then

$$\theta_{t,\max} = \arcsin(n_i/n_t) = 41.8^\circ. \tag{2.106}$$

In case the light is incident from glass, i.e. ni = 1.5 and nt = 1.0, the angle of incidence θi cannot be larger than 41.8°, because otherwise there is no real solution for θt. It turns out that when θi > 41.8°, the wave is totally reflected, i.e. there is no propagating transmitted wave in air. The angle θi,critical = 41.8° is called the critical angle of total internal reflection. It exists only if a wave is incident from a medium with larger refractive index on a medium with lower refractive index (nt < ni). The critical angle is independent of the polarisation of the incident wave.
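As a small illustration of Snell's Law (2.105) and of the critical angle, the following sketch computes the angle of refraction and the critical angle for the air-glass values used above. The function names are illustrative choices, not part of the notes.

```python
import math


def refraction_angle_deg(theta_i_deg: float, n_i: float, n_t: float):
    """Angle of refraction in degrees from n_t sin(theta_t) = n_i sin(theta_i).

    Returns None when there is no real solution (total internal reflection).
    """
    sin_t = n_i * math.sin(math.radians(theta_i_deg)) / n_t
    if sin_t > 1.0:
        return None
    return math.degrees(math.asin(sin_t))


def critical_angle_deg(n_i: float, n_t: float):
    """Critical angle in degrees; exists only when n_t < n_i."""
    if n_t >= n_i:
        return None
    return math.degrees(math.asin(n_t / n_i))


# Air to glass at grazing incidence: the largest possible angle of transmission.
print(refraction_angle_deg(90.0, 1.0, 1.5))
# Glass to air: critical angle, and an angle of incidence above it.
print(critical_angle_deg(1.5, 1.0))
print(refraction_angle_deg(60.0, 1.5, 1.0))
```

The first two calls both give about 41.8 degrees, which reflects the symmetry of Snell's Law under interchanging the two media; the last call returns `None` because 60 degrees lies above the critical angle.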
We proceed now with computing the amplitudes Ar and At . Because Ai is known, there are
two unknowns, hence we need two equations to determine them. The first equation that we apply
is the continuity of the tangential electric field at y = 0. Since the z-component is tangential, it
follows that
Ai + Ar = At . (2.107)
The second equation is obtained from the continuity of the tangential component of the magnetic field. It follows from Faraday's Law (2.13) that

$$\mathbf{B}^i(x,y) = \frac{\mathbf{k}^i}{\omega}\times\hat{\mathbf{z}}\,A^i e^{ik^i[\sin(\theta_i)x-\cos(\theta_i)y]} = -\frac{n_i}{c}\left[\cos\theta_i\,\hat{\mathbf{x}}+\sin\theta_i\,\hat{\mathbf{y}}\right]A^i e^{ik^i[\sin(\theta_i)x-\cos(\theta_i)y]}, \tag{2.108}$$
$$\mathbf{B}^r(x,y) = -\frac{n_i}{c}\left[-\cos\theta_i\,\hat{\mathbf{x}}+\sin\theta_i\,\hat{\mathbf{y}}\right]A^r e^{ik^i[\sin(\theta_i)x+\cos(\theta_i)y]}, \tag{2.109}$$
$$\mathbf{B}^t(x,y) = -\frac{n_t}{c}\left[\cos\theta_t\,\hat{\mathbf{x}}+\sin\theta_t\,\hat{\mathbf{y}}\right]A^t e^{ik^t[\sin(\theta_t)x-\cos(\theta_t)y]}, \tag{2.110}$$
where c is the speed of light in vacuum. The tangential components at the interface y = 0 are

$$B_x^i(x,0) = -\frac{n_i}{c}\cos(\theta_i)\,A^i e^{ik^i\sin(\theta_i)x}, \tag{2.111}$$
$$B_x^r(x,0) = \frac{n_i}{c}\cos(\theta_i)\,A^r e^{ik^i\sin(\theta_i)x}, \tag{2.112}$$
$$B_x^t(x,0) = -\frac{n_t}{c}\cos(\theta_t)\,A^t e^{ik^t\sin(\theta_t)x}. \tag{2.113}$$
Since the total tangential components are continuous at the interface:
− ni cos(θi )Ai + ni cos(θi )Ar = −nt cos(θt )At . (2.114)
The reflection and transmission coefficients are defined by $A^r/A^i$ and $A^t/A^i$, respectively. They are called the Fresnel coefficients of S-polarisation. Eqns. (2.107) and (2.114) imply:

$$r_S = \frac{A^r}{A^i} = \frac{n_i\cos\theta_i - n_t\cos\theta_t}{n_i\cos\theta_i + n_t\cos\theta_t} = -\frac{\sin(\theta_i-\theta_t)}{\sin(\theta_i+\theta_t)}, \tag{2.115}$$
$$t_S = \frac{A^t}{A^i} = \frac{2n_i\cos\theta_i}{n_i\cos\theta_i + n_t\cos\theta_t} = \frac{2\cos\theta_i\sin\theta_t}{\sin(\theta_i+\theta_t)}, \tag{2.116}$$

where at the far right we have substituted Snell's Law.
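For readers who want the intermediate algebra, the step from (2.107) and (2.114) to (2.115) and (2.116) can be sketched as follows. This is a worked illustration added here; it uses only the two continuity equations and Snell's Law (2.105).

```latex
% Substitute A^t = A^i + A^r from (2.107) into (2.114), written as
%   n_i cos(theta_i) (A^i - A^r) = n_t cos(theta_t) A^t :
\begin{align*}
n_i\cos\theta_i\,(A^i - A^r) &= n_t\cos\theta_t\,(A^i + A^r) \\
(n_i\cos\theta_i - n_t\cos\theta_t)\,A^i &= (n_i\cos\theta_i + n_t\cos\theta_t)\,A^r \\
r_S = \frac{A^r}{A^i} &= \frac{n_i\cos\theta_i - n_t\cos\theta_t}{n_i\cos\theta_i + n_t\cos\theta_t},
\qquad t_S = 1 + r_S = \frac{2n_i\cos\theta_i}{n_i\cos\theta_i + n_t\cos\theta_t}.
\end{align*}
% With Snell's Law, n_t = n_i sin(theta_i) / sin(theta_t):
\begin{align*}
r_S &= \frac{\sin\theta_t\cos\theta_i - \sin\theta_i\cos\theta_t}
           {\sin\theta_t\cos\theta_i + \sin\theta_i\cos\theta_t}
     = -\frac{\sin(\theta_i-\theta_t)}{\sin(\theta_i+\theta_t)}, \\
t_S &= \frac{2\cos\theta_i\sin\theta_t}{\sin(\theta_i+\theta_t)}.
\end{align*}
```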
2.10.3 P-polarization
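The incident electric field is in this case given by (2.99). The reflected and transmitted fields are similarly given by

$$\mathbf{E}^r(x,y) = A^r e^{ik^i[\sin(\theta_i)x+\cos(\theta_i)y]}\left[-\cos\theta_i\,\hat{\mathbf{x}}+\sin\theta_i\,\hat{\mathbf{y}}\right], \tag{2.117}$$
$$\mathbf{E}^t(x,y) = A^t e^{ik^t[\sin(\theta_t)x-\cos(\theta_t)y]}\left[\cos\theta_t\,\hat{\mathbf{x}}+\sin\theta_t\,\hat{\mathbf{y}}\right]. \tag{2.118}$$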
The x-components of the electric field are continuous across the interface y = 0 because they are
tangential, hence
Ai cos θi − Ar cos θi = At cos θt . (2.120)
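The magnetic field of the incident wave follows from Faraday's Law (2.13):

$$\mathbf{B}^i(x,y) = \frac{\mathbf{k}^i}{\omega}\times\left[\cos(\theta_i)\hat{\mathbf{x}}+\sin(\theta_i)\hat{\mathbf{y}}\right]A^i e^{ik^i[\sin(\theta_i)x-\cos(\theta_i)y]} = \frac{n_i}{c}\,A^i e^{ik^i[\sin(\theta_i)x-\cos(\theta_i)y]}\,\hat{\mathbf{z}}. \tag{2.121}$$

The magnetic fields of the reflected and transmitted waves follow analogously:

$$\mathbf{B}^r(x,y) = \frac{n_i}{c}\,A^r e^{ik^i[\sin(\theta_i)x+\cos(\theta_i)y]}\,\hat{\mathbf{z}}, \tag{2.122}$$
$$\mathbf{B}^t(x,y) = \frac{n_t}{c}\,A^t e^{ik^t[\sin(\theta_t)x-\cos(\theta_t)y]}\,\hat{\mathbf{z}}. \tag{2.123}$$

Because ẑ is tangential to the interface and the tangential magnetic components are continuous across y = 0, it follows that

$$n_i A^i + n_i A^r = n_t A^t. \tag{2.124}$$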
Eqns. (2.120) and (2.124) then imply for the Fresnel coefficients of P-polarisation:

$$r_P = \frac{A^r}{A^i} = \frac{\dfrac{\cos\theta_i}{n_i}-\dfrac{\cos\theta_t}{n_t}}{\dfrac{\cos\theta_i}{n_i}+\dfrac{\cos\theta_t}{n_t}} = \frac{\tan(\theta_i-\theta_t)}{\tan(\theta_i+\theta_t)}, \tag{2.125}$$
$$t_P = \frac{A^t}{A^i} = \frac{\dfrac{2\cos\theta_i}{n_t}}{\dfrac{\cos\theta_i}{n_i}+\dfrac{\cos\theta_t}{n_t}} = \frac{2\cos\theta_i\sin\theta_t}{\sin(\theta_i+\theta_t)\cos(\theta_i-\theta_t)}. \tag{2.126}$$
The Fresnel coefficients are easy to use. To compute the reflection and transmission of a generally polarised incident plane wave, you first write the incident electric field as a linear combination of S- and P-polarised waves. Then, for the given angle of incidence θi and given refractive indices ni and nt, you calculate the angle of refraction θt. Next, you substitute θi and θt at the right of (2.115), (2.116) and (2.125), (2.126). Finally, you multiply the S- and P-components of the incident field by the corresponding Fresnel coefficients and take the same linear combination to obtain the reflected and transmitted fields.
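The procedure just described can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the decomposition of the incident field into a pair `(a_s, a_p)` of S- and P-amplitudes are assumptions made here, and the code is restricted to angles below the critical angle, where θt is real.

```python
import math


def fresnel_coefficients(theta_i_deg: float, n_i: float, n_t: float):
    """Return (r_s, t_s, r_p, t_p) following (2.115), (2.116), (2.125), (2.126)."""
    theta_i = math.radians(theta_i_deg)
    sin_t = n_i * math.sin(theta_i) / n_t  # Snell's Law (2.105)
    if sin_t > 1.0:
        raise ValueError("above the critical angle: no real angle of refraction")
    cos_i = math.cos(theta_i)
    cos_t = math.sqrt(1.0 - sin_t ** 2)
    r_s = (n_i * cos_i - n_t * cos_t) / (n_i * cos_i + n_t * cos_t)
    t_s = 2.0 * n_i * cos_i / (n_i * cos_i + n_t * cos_t)
    r_p = (cos_i / n_i - cos_t / n_t) / (cos_i / n_i + cos_t / n_t)
    t_p = (2.0 * cos_i / n_t) / (cos_i / n_i + cos_t / n_t)
    return r_s, t_s, r_p, t_p


def reflect_and_transmit(a_s: complex, a_p: complex,
                         theta_i_deg: float, n_i: float, n_t: float):
    """Apply the coefficients to the S- and P-amplitudes of the incident field."""
    r_s, t_s, r_p, t_p = fresnel_coefficients(theta_i_deg, n_i, n_t)
    reflected = (r_s * a_s, r_p * a_p)
    transmitted = (t_s * a_s, t_p * a_p)
    return reflected, transmitted


# Air to glass at normal incidence.
print(fresnel_coefficients(0.0, 1.0, 1.5))
# Equal S and P amplitudes incident at 45 degrees.
print(reflect_and_transmit(1.0, 1.0, 45.0, 1.0, 1.5))
```

At normal incidence from air to glass the sketch gives r_S = −0.2, r_P = +0.2 and t_S = t_P = 0.8 with the sign conventions of (2.115) and (2.125).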
Remark. Note that $|r_S|^2 + |t_S|^2 \neq 1$ and $|r_P|^2 + |t_P|^2 \neq 1$. The reason is that the ratio of the transmitted energy flow in y < 0 to the incident energy flow is not given by $|t_S|^2$. In fact, as follows from (2.78), in computing this ratio one should take account of the velocity of light, which is different in the two media. To obtain the ratio of the transmitted to the incident intensity, $|t_S|^2$ and $|t_P|^2$ have to be multiplied by $n_t/n_i$, which equals 1.5 for incidence from air onto glass. For the energy flow through the interface one additionally has to account for the factor $\cos\theta_t/\cos\theta_i$ between the cross-sections of the transmitted and incident beams.
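The remark can be made concrete with a short numerical check. The sketch below assumes the standard definitions R = |r|² and T = (n_t cos θt)/(n_i cos θi) |t|² for the fractions of the energy flow through the interface that are reflected and transmitted; these definitions and the function name are assumptions of this example, and the code is again limited to angles below the critical angle.

```python
import math


def power_coefficients(theta_i_deg: float, n_i: float, n_t: float):
    """Return (R_s, T_s, R_p, T_p) for the energy flow through the interface."""
    theta_i = math.radians(theta_i_deg)
    sin_t = n_i * math.sin(theta_i) / n_t
    if sin_t > 1.0:
        raise ValueError("above the critical angle")
    cos_i = math.cos(theta_i)
    cos_t = math.sqrt(1.0 - sin_t ** 2)
    r_s = (n_i * cos_i - n_t * cos_t) / (n_i * cos_i + n_t * cos_t)
    t_s = 2.0 * n_i * cos_i / (n_i * cos_i + n_t * cos_t)
    r_p = (n_t * cos_i - n_i * cos_t) / (n_t * cos_i + n_i * cos_t)
    t_p = 2.0 * n_i * cos_i / (n_t * cos_i + n_i * cos_t)
    # Velocity (n_t / n_i) and beam cross-section (cos_t / cos_i) factors.
    factor = (n_t * cos_t) / (n_i * cos_i)
    return r_s ** 2, factor * t_s ** 2, r_p ** 2, factor * t_p ** 2


for angle in (0.0, 30.0, 60.0, 85.0):
    R_s, T_s, R_p, T_p = power_coefficients(angle, 1.0, 1.5)
    print(angle, R_s + T_s, R_p + T_p)
```

Each printed sum should equal 1 up to rounding, which expresses conservation of energy at the interface.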
Brewster angle. It follows from Snell's Law (2.105) that sin θt = (ni/nt) sin θi. Hence θt monotonically increases with θi and therefore there exists some θi such that

$$\theta_i + \theta_t = 90^\circ. \tag{2.130}$$

For this particular angle of incidence, the denominator tan(θi + θt) at the far right of (2.125) becomes infinite and hence the P-polarised wave is not reflected at all. This angle of incidence is called the Brewster angle θB [3]. It is easy to see from (2.115) that the reflection is never zero for S-polarisation. Hence if unpolarised light is incident at the Brewster angle, the reflected light will be purely S-polarised. We have θt = 90° − θi, hence sin(θt) = cos(θi), and by Snell's Law (writing θi = θB):

$$\tan(\theta_B) = \frac{n_t}{n_i}. \tag{2.131}$$

[3] MIT OCW - Reflection at The Air-glass Boundary: demonstration of reflection of polarized light and the Brewster angle.
Optica Lecture Notes TN2421 30 of 165 Monday 16th April, 2018, 09:44
reflected will increase, and it will become more difficult to see
the page through the glass. When ui ≈ 90° the slide will look
like a perfect mirror as the reflection coefficients (Fig. 4.49) go
2.10. Reflection and - 1.0. Even a poor
to Transmission at ansurface (see photo), such as the cover of
Interface
this book, will be mirrorlike at glancing incidence. Hold the
book horizontally at the level of the middle of your eye and face
We see that there aisbright
always a solution,
light; you will seeindependent of whether
the source reflected nicely the
in thewave
cover.is incident from the
material with smallest
This suggests that X-rays could be mirror-reflected at glancingwe have θB = 56.3
or largest refractive index. For the air-glass interface o
In Fig. 2.11 the reflection and transmission coefficients of S- and P-polarised waves are shown
2n
as function of the angle of incidence[tfor the case of incidencei from air to glass.
i ]u = 0 = [t#]u = 0 = (4.48) There is no critical
i i
n + n
angle of total reflection in this case and hence the reflection and transmission coefficients are well
i t
defined for all angles between 0o and 90o . The Brwester angle is indicated. It is seen that the
It will be shown in Problem 4.63 that the expression
reflection coefficients decrease from the values −0.2 and 0.8 for θi = 0o to -1 formore 90o . The
θi = dense (ni 7 nt), is of inte
t + (-r )
transmission coefficients monotonically decrease to 0 at θi = 90 .
# # = 1 o (4.49)
ut 7 ui, and r#, as described by E
Fig. 2.11 showsholds the for
Fresnel
all ui, coefficients
whereas when the wave is incident from glass tive.toFigure
air. The
4.50 shows that r# i
critical angle is θc = 41.8 as derived earlier. At the angle of total internal reflection
o the(4.47)]
[Eq. reflection
at ui = 0, reaching +
t +r =1 (4.50)
coefficients are identical to 1. There is againi an iangle where the reflection of P-polarised angle, uc.light is
Specifically, uc is the sp
zero θB = 33.7 . is true only at normal incidence.
o
gle (p. 133) for which ut = p>2
Depending on the The foregoing
refractive discussion,
indices and theforangle
the most part, was restricted
of incidence, to
the reflection coefficients
tively can at ui = 0 and
[Eq. (4.47)]
be negative. The reflected external
the case ofelectric reflection
field then has 7 ni). The opposite
(i.e.,annt additional π phasesitu- + 1 at ui =toucthe
shift compared , as is evident from
ation of internal reflection, in which the incident medium
incident wave. In contrast, the transmitted field is always in phase with the incident field, is the Again, ri passes
i.e. through zero at t
the transmission coefficients are always positive.
1.0
1.0
t ""
t⊥
0.5
0.5
Amplitude coefficients
Amplitude coefficients
r⊥
r ""
0
0 up! uc
up
r ""
r⊥
–0.5
–0.5
33.7° 41.8
56.3° –1.0
–1.0 0 30
0 30 60 90
ui (de
ui (degrees)
Figure 4.49 The amplitude coefficients of reflection and transmission as Figure 4.50 The amplitude coefficien
Figure 2.11: Reflection andoftransmission
a function coefficients
incident angle. These correspondas
to function of thent angle
external reflection 7 ni of incident
incidence
angle.ofThese
S- correspond to inte
and P-polarised waves
at anincident from air
air–glass interface (nti to glass. The Brewster angle is denoted air-glass
= 1.5). by θp interface (nti = 1>1.5).
Optica Lecture Notes TN2421 31 of 165 Monday 16th April, 2018, 09:44
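The two Brewster angles quoted for the air-glass interface follow directly from (2.131). A minimal sketch (function names are illustrative assumptions) that also confirms that r_P of (2.125) vanishes at these angles:

```python
import math


def brewster_angle_deg(n_i: float, n_t: float) -> float:
    """Brewster angle in degrees from tan(theta_B) = n_t / n_i, Eq. (2.131)."""
    return math.degrees(math.atan(n_t / n_i))


def r_p(theta_i_deg: float, n_i: float, n_t: float) -> float:
    """P-polarisation reflection coefficient (2.125), below the critical angle."""
    theta_i = math.radians(theta_i_deg)
    sin_t = n_i * math.sin(theta_i) / n_t
    cos_i = math.cos(theta_i)
    cos_t = math.sqrt(1.0 - sin_t ** 2)
    return (cos_i / n_i - cos_t / n_t) / (cos_i / n_i + cos_t / n_t)


air_to_glass = brewster_angle_deg(1.0, 1.5)
glass_to_air = brewster_angle_deg(1.5, 1.0)
print(air_to_glass, glass_to_air)
print(r_p(air_to_glass, 1.0, 1.5), r_p(glass_to_air, 1.5, 1.0))
```

The two angles come out at about 56.3 and 33.7 degrees and add up to 90 degrees, as expected since each is the angle of refraction belonging to the other.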
Figure 2.12: Reflection and transmission coefficients as function of the angle of incidence of S- and P-polarised waves incident from glass to air. [Figure not reproduced.]
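At the critical angle of total internal reflection for incidence from glass onto air ($n_t = 1$), Snell's Law (2.105) gives $n_i\sin\theta_{i,c} = n_t$. This is equivalent to

$$k_x^t = k_0 n_i\sin\theta_{i,c} = k_0 n_t = k_0. \tag{2.133}$$

The wave vector $\mathbf{k}^t = k_x^t\hat{\mathbf{x}} + k_y^t\hat{\mathbf{y}}$ in y < 0 satisfies

$$(k_x^t)^2 + (k_y^t)^2 = k_0^2 n_t^2 = k_0^2, \tag{2.134}$$

so that at the critical angle

$$k_y^t = 0. \tag{2.135}$$

For angles of incidence above the critical angle we have $k_x^t > k_0$, and it follows from (2.134) that $(k_y^t)^2 = k_0^2 - (k_x^t)^2 < 0$, hence $k_y^t$ is imaginary:

$$k_y^t = \pm\sqrt{k_0^2-(k_x^t)^2} = \pm i\sqrt{(k_x^t)^2-k_0^2}, \tag{2.136}$$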
where the last square root is a positive real number. It can be shown that above the critical angle the reflection coefficients are complex numbers with modulus 1: $|r_S| = |r_P| = 1$. This implies that the reflected intensity is identical to the incident intensity, while at the same time the transmission coefficients are not zero! For example, for S-polarisation we have according to (2.107):

$$t_S = 1 + r_S \neq 0 \tag{2.137}$$

because, although $|r_S| = 1$, $r_S \neq -1$. Therefore there is an electric field in y < 0, given by

$$\mathbf{E}(x,y)e^{-i\omega t} = t_S\,e^{ik_x^t x + ik_y^t y - i\omega t}\,\hat{\mathbf{z}} = t_S\,e^{i(k_x^t x-\omega t)}\,e^{y\sqrt{(k_x^t)^2-k_0^2}}\,\hat{\mathbf{z}}, \quad y<0, \tag{2.138}$$
where we have chosen the − sign in (2.136) to guarantee that the field does not blow up for y → −∞. Since $k_x^t$ is real, the wave propagates in the x-direction. In the y-direction, however, the wave is not propagating. Its amplitude decreases exponentially as function of the distance |y| to the interface, and therefore the wave is confined to a thin layer adjacent to the interface. Such a wave is called an evanescent wave. By computing the Poynting vector one can show that the energy is propagated parallel to the interface, in the direction in which $k_x^t$ is positive. Hence no energy is transported away from the interface into the air region.
In summary, for angles of incidence in glass above the critical angle, the transmitted field in air is evanescent. We shall return to evanescent waves in the chapter on diffraction theory.
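To get a feeling for how thin the layer occupied by the evanescent wave is, the decay constant in (2.138) can be evaluated numerically. In the sketch below the vacuum wavelength of 633 nm and the angle of incidence of 45° are assumed example values that do not appear in the notes, and the function name is an illustrative choice.

```python
import math


def evanescent_decay(theta_i_deg: float, n_i: float, n_t: float,
                     wavelength_vacuum: float):
    """Return (kappa, depth): decay constant sqrt(kx^2 - (k0 n_t)^2) and 1/kappa.

    The field amplitude in the second medium falls off as exp(-kappa * |y|).
    """
    k0 = 2.0 * math.pi / wavelength_vacuum
    kx = k0 * n_i * math.sin(math.radians(theta_i_deg))
    if kx <= k0 * n_t:
        raise ValueError("not above the critical angle: transmitted wave propagates")
    kappa = math.sqrt(kx ** 2 - (k0 * n_t) ** 2)
    return kappa, 1.0 / kappa


# Assumed example: glass (n = 1.5) to air, 45 degrees, 633 nm vacuum wavelength.
kappa, depth = evanescent_decay(45.0, 1.5, 1.0, 633e-9)
print(depth)
print(math.exp(-kappa * depth))
```

For these assumed values the 1/e decay distance is roughly 285 nm, i.e. less than half a vacuum wavelength.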
1. Youtube video - 8.03 - Lect 18 - Index of Refraction, Reflection, Fresnel Equations, Brewster
Angle - Lecture by Walter Lewin
1. Hecht 5.6
Figure 2.13: Light reflected within a dielectric cylinder. [Figure not reproduced.]

Rays incident at angles greater than θmax will strike the interior wall at angles less than θc. They will be only partially reflected at each such encounter with the core–cladding interface and will quickly leak out of the fiber. Accordingly, θmax, which is known as the acceptance angle, defines the half-angle of the acceptance cone of the fiber.

EXAMPLE 5.11 (Hecht)
A fiber has a core index of 1.499 and a cladding index of 1.479. When surrounded by air what will be its (a) acceptance angle, (b) numerical aperture, and (c) the critical angle at the core–cladding interface?

SOLUTION
(b) From Hecht Eq. (5.61),
NA = (n_f^2 − n_c^2)^{1/2} = (1.499^2 − 1.479^2)^{1/2} = 0.244,
which is a typical value.
(a) Since sin θmax = NA/n_i with n_i = 1 for air, θmax = sin^{-1}(0.244) = 14.1°. Hence 2θmax = 28.2°.

Figure 2.15: A coherent bundle of 10 µm glass fibers which transmits an image even though it is knotted and sharply bent. [Figure not reproduced.]
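The numbers in the fiber example can be reproduced with a few lines of Python. This is a sketch under the usual step-index assumptions, namely NA = (n_f² − n_c²)^{1/2}, sin θmax = NA/n_i and sin θc = n_c/n_f; the function name is an illustrative choice.

```python
import math


def fiber_parameters(n_core: float, n_clad: float, n_outside: float = 1.0):
    """Return (NA, acceptance angle in degrees, core-cladding critical angle in degrees)."""
    numerical_aperture = math.sqrt(n_core ** 2 - n_clad ** 2)
    acceptance = math.degrees(math.asin(numerical_aperture / n_outside))
    critical = math.degrees(math.asin(n_clad / n_core))
    return numerical_aperture, acceptance, critical


# Core index 1.499 and cladding index 1.479, fiber surrounded by air.
na, theta_max, theta_c = fiber_parameters(1.499, 1.479)
print(na, theta_max, theta_c)
```

With the indices of the example this gives NA ≈ 0.244 and θmax ≈ 14.1°, and a critical angle at the core–cladding interface of about 80.6°.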
Chapter 3
Geometrical Optics
What you should know and be able to do after studying this chapter.
• Principle of Fermat.
• Know how to work with the sign convention of the lens maker’s formula (not
the derivation of the formula).
• Understand how the Lens maker’s Formula of a single lens follows from the
formula for a single interface.
• Understand how the image of two and more lenses is derived from that of
a single lens by constructing and computing the intermediate images. You
do not need to know the imaging equation and the formulae for the focal
distances of two thin lenses.
• Understand the matrix method (you do not need to know the matrices by
heart).
3.1 Introduction
Geometrical optics is an old subject, but it is still essential for understanding and designing optical
instruments such as cameras, microscopes and telescopes. Geometrical optics started long
before light was described as a wave, as is done in wave optics, and long before it was discovered
that light is an electromagnetic wave governed by Maxwell’s equations.
In this chapter we go back in history and treat geometrical optics. That may seem strange now
that we have a much more accurate and better theory at our disposal. However, the predictions
of geometrical optics are under quite common circumstances very useful and also very accurate.
In fact, for many optical systems and practical instruments there is no good alternative for
geometrical optics because more accurate theories are much too complicated to use.
Molecules of a material that is being illuminated start to radiate spherical waves (more precisely, they radiate like tiny electric dipoles) and the total wave scattered and transmitted by
the material is the sum of all these spherical waves. A time harmonic wave has at every point in
space and at every instant of time a well defined phase. A wave front is a surface of constant
phase. The velocity at which the wave front moves is the light velocity which points in the
direction of the local normal on the wave front. For plane waves we have shown that the normal
to the wave front is the direction of the wave vector which also coincides with the direction of the
phase velocity and the direction of the flow of energy (the direction of the Poynting vector). For
more general waves, the normal to the wave front may still be considered to be in the direction
of the local flow of energy provided the curvature of the wave front is not too large. Such waves
behave locally as plane waves.
Geometrical optics is based on the intuitive idea that light consists of a bunch of rays. But
what is a ray?
Consider a point source at some distance before an opaque screen with an aperture. According
to the ray picture, the light distribution on a second screen further away from the source and
parallel to the first screen is simply an enlarged copy of the aperture (see Fig. 3.1). The copy is
enlarged due to the fanning out of the rays. However, this description is only accurate when the
wavelength of the light is very small compared to the diameter of the aperture. If the aperture is
only ten times the wavelength, the pattern is much broader due to the bending of the rays around
the edge of the aperture. This phenomenon is called diffraction and can not be explained by
geometrical optics.
Figure 3.1: Light distribution on a screen due to a rectangular aperture. Left: for a large
aperture, we get an enlarged copy of the aperture. Right: for an aperture that is of the order of
the wavelength there is strong bending and diffraction of light.
Geometrical optics is accurate when the sizes of the objects in the system are large compared to
the wavelength. It is possible to derive geometrical optics from Maxwell’s equations by expanding
the electromagnetic field in a Taylor series in the wavelength and retaining only the first term of
this expansion [1].
Principle of Fermat (1657) The path followed by a light ray between two points is the one
that costs the least amount of time.
[1] See Chapter 1 of M. Born & E. Wolf, "Principles of Optics".
The speed of light in a material with refractive index n is c/n, where c = 3 × 10^8 m/s is
the speed of light in vacuum. At the time of Fermat, the conviction was that the speed of light
must be finite, but nobody at the time could suspect how incredibly large it actually is. In 1676
the Danish astronomer Ole Römer computed the speed from inspecting the eclipses of a moon
of Jupiter and arrived at an estimate that was only 30% too low.
Let $\mathbf{r}(s)$ be a ray with s the length parameter. The ray links two points S and P. Suppose that the refractive index varies with position: $n(\mathbf{r})$. Over the infinitesimal distance from s to s + ds, the speed of the light is

$$\frac{c}{n(\mathbf{r}(s))}. \tag{3.1}$$

Hence the time light needs to go from $\mathbf{r}(s)$ to $\mathbf{r}(s+ds)$ is

$$dt = \frac{n(\mathbf{r}(s))}{c}\,ds, \tag{3.2}$$

and the total time to go from S to P is

$$t_{S\to P} = \int_0^{s_P}\frac{n(\mathbf{r}(s))}{c}\,ds, \tag{3.3}$$

where $s_P$ is the distance along the ray from S to P. The optical path length of the ray between S and P is defined by

$$\mathrm{OPL} = \int_0^{s_P} n(\mathbf{r}(s))\,ds. \tag{3.4}$$

So the OPL is the distance weighted by the refractive index.
Fermat’s Principle is thus equivalent to the statement that a ray follows the path
with shortest OPL.
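The definition (3.4) is easy to evaluate numerically. The sketch below is illustrative only: the function names, the layered example and the linear index profile are assumptions made here. It computes the OPL for a ray that crosses homogeneous layers and for a ray along which the index varies continuously, and converts the OPL to a travel time via (3.3).

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum in m/s


def opl_layers(layers):
    """OPL for a list of (refractive index, geometric path length) pairs."""
    return sum(n * d for n, d in layers)


def opl_numeric(n_of_s, s_end: float, steps: int = 1000) -> float:
    """Midpoint-rule approximation of the integral of n(s) ds from 0 to s_end."""
    ds = s_end / steps
    return sum(n_of_s((k + 0.5) * ds) for k in range(steps)) * ds


def travel_time(opl: float) -> float:
    """Travel time t = OPL / c, from Eqs. (3.3) and (3.4)."""
    return opl / C_VACUUM


# 2 m of air (n taken as 1.0) followed by 1 m of glass (n = 1.5).
layered = opl_layers([(1.0, 2.0), (1.5, 1.0)])
# Hypothetical profile n(s) = 1 + 0.5 s along a 2 m long ray.
graded = opl_numeric(lambda s: 1.0 + 0.5 * s, 2.0)
print(layered, graded, travel_time(layered))
```

The layered example gives an OPL of 3.5 m and the linear profile gives 3 m, so in both cases the light needs as much time as it would to cover that distance in vacuum.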
Remark. Actually, Fermat’s Principle as formulated above is not complete. There are circumstances
in which a ray can take two paths between two points that have different travel times. Each
of these paths then corresponds to a minimum travel time compared to nearby paths, so the
travel time is in general a local minimum.
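As an illustration of Fermat's Principle, one can minimise the OPL numerically for a ray that crosses a flat interface between two homogeneous media and check that the minimising path obeys Snell's Law. The geometry in this sketch is an assumption chosen for the example: the source lies at height a above the interface in a medium of index n1, the end point at depth b below it in a medium of index n2, the horizontal separation is d, and a simple ternary search stands in for a proper optimiser.

```python
import math


def optical_path_length(x, n1, n2, a, b, d):
    """OPL of the two-segment path that crosses the interface at horizontal position x."""
    return n1 * math.hypot(x, a) + n2 * math.hypot(d - x, b)


def fermat_crossing_point(n1, n2, a, b, d, iterations=200):
    """Find the crossing point on [0, d] that minimises the OPL (ternary search)."""
    lo, hi = 0.0, d
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if optical_path_length(m1, n1, n2, a, b, d) < optical_path_length(m2, n1, n2, a, b, d):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


n1, n2, a, b, d = 1.0, 1.5, 1.0, 1.0, 1.0
x_min = fermat_crossing_point(n1, n2, a, b, d)
sin_1 = x_min / math.hypot(x_min, a)            # sine of the angle of incidence
sin_2 = (d - x_min) / math.hypot(d - x_min, b)  # sine of the angle of refraction
print(n1 * sin_1, n2 * sin_2)
```

The two printed numbers agree to within the accuracy of the search, i.e. the path of least optical length satisfies n1 sin θ1 = n2 sin θ2.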
Figure 3.2: The density and therefore the refractive index decreases and hence the light speed increases in warmer air. This bends the wavefront and the rays. The same happens with sound waves.
Figure 3.4: The bending of rays through the inhomogeneous atmosphere. The upper layers are
less dense, hence the light speed is higher there. The sun therefore seems to be higher up in the
sky than is actually the case.
where n is the refractive index of the medium in y > 0. The point (x, 0) should be such
that the travel time is minimum, i.e.
\[ \frac{d}{dx}\left[ d_1(x) + d_2(x) \right] = \frac{x - x_P}{d_1(x)} - \frac{x_Q - x}{d_2(x)} = 0. \tag{3.6} \]
Hence
\[ \sin\theta_i = \sin\theta_r, \tag{3.7} \]
or
\[ \theta_r = \theta_i, \tag{3.8} \]
where θi and θr are the angles of incidence and reflection as shown in Fig. 3.5.
Figure 3.5: Ray from P to Q via the mirror.
• Snell’s law of refraction
Next we consider refraction at an interface. Let y = 0 be the interface between a medium
with refractive index ni in y > 0 and a medium with refractive index nt in y < 0. Let P = (xP, yP) and Q = (xQ, yQ) with
yP > 0 and yQ < 0 (see Fig. 3.6). What path will a ray follow that goes from P to Q?
Since the refractive index is constant in both half spaces, the ray is a straight line in both
media. Let (x, 0) be the coordinate of the intersection point of the ray with the interface.
Then the travel time is
\[ \frac{n_i}{c} d_1(x) + \frac{n_t}{c} d_2(x) = \frac{n_i}{c}\sqrt{(x - x_P)^2 + y_P^2} + \frac{n_t}{c}\sqrt{(x_Q - x)^2 + y_Q^2}. \tag{3.9} \]
The travel time must be minimum, hence there must hold
\[ \frac{d}{dx}\left[ n_i d_1(x) + n_t d_2(x) \right] = n_i \frac{x - x_P}{d_1(x)} - n_t \frac{x_Q - x}{d_2(x)} = 0, \tag{3.10} \]
where the travel time has been multiplied by the speed of light in vacuum. Eq. (3.10)
implies
\[ n_i \sin\theta_i = n_t \sin\theta_t, \tag{3.11} \]
where θi and θt are the angles between the ray in the upper half space and the normal to
the surface and between the ray in the lower half space and the normal (Fig. 3.6).
[Figure 3.6: Ray from P(xP, yP) in the medium with index ni to Q(xQ, yQ) in the medium with index nt, crossing the interface at (x, 0); d1 and d2 are the path lengths, θi and θt the angles with the normal.]
Hence we have derived the law of reflection and Snell’s Law from Fermat’s Principle. In Chapter
2, we derived the Reflection Law and Snell’s Law from the continuity conditions
for the electromagnetic field components at the interface.
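The derivation above can be checked numerically. The following sketch minimises the optical path $n_i d_1(x) + n_t d_2(x)$ of Eq. (3.10) over the crossing point x and then compares both sides of Snell’s Law. The points P and Q, the indices and the helper name `crossing_point` are illustrative assumptions; a ternary search is used because the function is convex in x.

```python
import math


def crossing_point(P, Q, n_i, n_t, iters=200):
    """Find the x that minimises n_i*d1(x) + n_t*d2(x) by ternary search.

    P = (xP, yP) with yP > 0 lies in the upper medium,
    Q = (xQ, yQ) with yQ < 0 lies in the lower medium.
    """
    (xP, yP), (xQ, yQ) = P, Q

    def opl(x):
        return n_i * math.hypot(x - xP, yP) + n_t * math.hypot(xQ - x, yQ)

    lo, hi = min(xP, xQ), max(xP, xQ)   # the minimum lies between xP and xQ
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if opl(m1) < opl(m2):
            hi = m2                      # minimum is not in (m2, hi]
        else:
            lo = m1                      # minimum is not in [lo, m1)
    return 0.5 * (lo + hi)


P, Q = (0.0, 1.0), (2.0, -1.0)   # illustrative points
n_i, n_t = 1.0, 1.5              # e.g. air above, glass below

x = crossing_point(P, Q, n_i, n_t)
sin_i = (x - P[0]) / math.hypot(x - P[0], P[1])   # sine of angle of incidence
sin_t = (Q[0] - x) / math.hypot(Q[0] - x, Q[1])   # sine of angle of refraction
snell_mismatch = n_i * sin_i - n_t * sin_t        # should be close to zero
```

Setting `n_t = n_i` and mirroring Q into the upper half space turns the same search into a check of the law of reflection.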
1. Perfect focusing of a parallel beam of light by refraction. Suppose there are two
media with refractive index n1 and n2 , with n2 > n1 and suppose point S is at infinity in
the medium with refractive index n2 . We try to construct a surface (interface) between
the two media such that all rays from S are focused into the same point F (see Fig. 3.9).
Because S is at very large distance, the rays entering from the right are parallel. Since all
parallel rays have travelled the same distance when they hit the surface DD’ perpendicular
to the rays, all parallel rays have the same phase at their intersection points with the plane
DD’. If point A is on the interface sought for, the travel time for a ray from D to F via A
must be minimum and hence must be the same for all points A. Hence
\[ \frac{n_2}{c}|DA| + \frac{n_1}{c}|AF| = \text{constant}, \tag{3.12} \]
where “constant” means the same value for all rays, hence for all points A on the interface.
By moving the plane DD’ parallel to itself, we can achieve that for this new plane DD’ we
then get:
\[ e|DA| - |AF| = 0, \tag{3.13} \]
where e = n2/n1 > 1. Hence the set of points A defines a hyperboloid.
In contrast, when n2 < n1, as shown at the right of Fig. 3.9, then e < 1 and the interface
is an ellipsoid with F as one of its focal points.
Figure 3.7: Overview of the definitions of some conic sections. The lower figure shows a definition
that unifies the three definitions in the above figure by introducing a parameter called the
eccentricity e. The point F is the focus and the line e = ∞ is the directrix of the conic sections.
Figure 3.8: Perfect imaging: a cone of rays diverging from S converges onto P. The light continues
after P.
Figure 3.9: (a) Hyperboloid (n2 > n1) and (b) ellipsoid (n2 < n1) to perfectly focus a parallel
beam incident from the medium with refractive index n2 into a point in a medium with refractive
index n1.
Figure 3.10: Lens with hyperboloid surfaces for perfect imaging of a pair of points.
The direction of the rays in Fig. 3.9 can obviously also be reversed, in which case the rays
from point F are all perfectly collimated (i.e. parallel). If medium 2 consists of glass and
medium 1 of air, we conclude that by joining two hyperboloids as shown in Fig. 3.10, point
F1 in air is perfectly imaged to point F2, also in air.
2. Perfect focusing of parallel rays by a mirror. Let there be a parallel bundle of
rays in air (n = 1) and suppose we want to focus all rays in point F. We draw a plane
Σ1 perpendicular to the rays as shown in Fig. 3.11. The rays that hit Σ1 have traversed
the same optical path length. We draw a second surface Σ2 parallel to Σ1. Consider rays
hitting the mirror in A1 and A2. The OPL from Wj via Aj to F must be the same for all
rays:
\[ \mathrm{OPL} = |W_1A_1| + |A_1F| = |W_2A_2| + |A_2F|. \tag{3.14} \]
Since Σ2 is parallel to Σ1:
\[ |W_1A_1| + |A_1D_1| = |W_2A_2| + |A_2D_2|, \]
where Dj is the projection of Aj onto Σ2.
Hence (3.14) will be satisfied for points A for which |AF| = |AD|, i.e. for which the distance
to F is the same as to Σ2. This is a paraboloid with F as focus and Σ2 as directrix.
By reversing the arrows, we get (within geometrical optics) a perfect parallel beam from
a point source. Parabolic mirrors are used everywhere, from automobile headlights to
radiotelescopes.
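The equal-OPL property (3.14) of the paraboloid is easy to verify numerically in a cross-section. In the sketch below the mirror is the parabola $y = x^2/(4f)$ with focus F = (0, f), and the rays travel parallel to the y-axis starting from a plane at height h. The values of f and h and the function name `opl_via_mirror` are illustrative assumptions.

```python
import math

f = 0.5   # distance from the vertex to the focus F = (0, f), illustrative
h = 3.0   # height of the plane Sigma_1 where the rays start, illustrative


def opl_via_mirror(x):
    """OPL in air (n = 1) from W = (x, h) down to A on the mirror, then to F."""
    y = x * x / (4.0 * f)                   # mirror point A = (x, y)
    return (h - y) + math.hypot(x, y - f)   # |WA| + |AF|


# The OPL should be the same for every ray, whatever its height x.
opls = [opl_via_mirror(x) for x in (-1.0, -0.3, 0.0, 0.7, 1.2)]
spread = max(opls) - min(opls)
```

The common value is h + f, because |AF| equals the distance from A to the directrix y = -f.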
Figure 3.11: A paraboloid mirror.
2. Yale Courses - 16. Ray or Geometrical Optics I - Lecture by Ramamurti Shankar
3. Yale Courses - 17. Ray or Geometrical Optics II - Lecture by Ramamurti Shankar
In gaussian geometrical optics only paraxial rays and spherical surfaces are con-
sidered. In gaussian geometrical optics every point has a perfect image.
Figure 3.12: Imaging by a spherical interface between two media with refractive indices n2 > n1.
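It suffices to show that P is independent of the ray, i.e. of A. Let $s_o$ and $s_i$ be the distances
from S to V and from P to V. We will express $s_i$ in terms of $s_o$ and show that the result is independent
of A. Choose a Cartesian coordinate system (x, y) with origin at V and such that the x-axis
points from V to C. Let α1 and α2 be the angles of the rays SA and AP with the x-axis as shown
in Fig. 3.12. θi is the angle of incidence of ray SA with the local normal CA on the surface and
θt similarly is the angle of refraction. By considering the angles of ∆SCA we find
\[ \theta_i = \alpha_1 + \varphi. \tag{3.17} \]
Similarly, from ∆CPA we find $\theta_t = \varphi - \alpha_2$, and Snell’s Law for small angles, $n_1\theta_i = n_2\theta_t$, gives
\[ n_1\alpha_1 + n_2\alpha_2 = (n_2 - n_1)\varphi. \tag{3.19} \]
We have
\[ \alpha_1 \approx \tan(\alpha_1) = \frac{y_A}{s_o + x_A}, \qquad \alpha_2 \approx \tan(\alpha_2) = \frac{y_A}{s_i - x_A}, \tag{3.20} \]
and
\[ \varphi \approx \frac{y_A}{R}, \tag{3.21} \]
which is small for paraxial rays. Hence,
\[ x_A = R - R\cos\varphi \approx R - R\left(1 - \frac{\varphi^2}{2}\right) = \frac{R}{2}\varphi^2 \approx 0 \tag{3.22} \]
is of second order in $y_A$ and can therefore be neglected. Then (3.20) becomes
\[ \alpha_1 = \frac{y_A}{s_o}, \qquad \alpha_2 = \frac{y_A}{s_i}. \tag{3.23} \]
By substituting (3.23) and (3.21) into (3.19) we find
\[ \frac{n_1}{s_o} y_A + \frac{n_2}{s_i} y_A = \frac{n_2 - n_1}{R} y_A, \]
or
\[ \frac{n_1}{s_o} + \frac{n_2}{s_i} = \frac{n_2 - n_1}{R}. \tag{3.24} \]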
This implies that si and hence P are independent of yA , i.e. of the ray chosen, hence P is a
perfect image within the approximation of gaussian geometrical optics.
Note that when so → ∞ we have si → R n2/(n2 − n1), which is called the focal distance fi on
the image side:
\[ f_i = \frac{n_2}{n_2 - n_1} R, \tag{3.25} \]
and (3.24) becomes
\[ \frac{n_1}{s_o} + \frac{n_2}{s_i} = \frac{n_2}{f_i}. \tag{3.26} \]
Similarly we have for the focal distance on the object side: fo = n1 R/(n2 − n1 ). It is clear that
our assumption that the refracted ray through point A intersects the extension of SV means that
so > fo .
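As a small worked example of (3.24) to (3.26), the sketch below solves the imaging equation of a single spherical surface for the image distance. The numerical values (air to glass, R = 10 cm) and the function name `image_distance_surface` are illustrative assumptions.

```python
def image_distance_surface(s_o, n1, n2, R):
    """Solve n1/s_o + n2/s_i = (n2 - n1)/R (Eq. 3.24) for s_i.

    Fails with a division by zero when s_o equals the object focal
    distance f_o, because the image is then at infinity.
    """
    power = (n2 - n1) / R
    return n2 / (power - n1 / s_o)


n1, n2, R = 1.0, 1.5, 0.10        # air to glass, radius 10 cm (illustrative)

f_i = n2 * R / (n2 - n1)          # Eq. (3.25): image-side focal distance
f_o = n1 * R / (n2 - n1)          # object-side focal distance

s_i = image_distance_surface(0.60, n1, n2, R)       # object 60 cm in front of V
s_i_far = image_distance_surface(1.0e9, n1, n2, R)  # very distant object
```

With these values the object distance exceeds f_o, so the image is real; for a very distant object the image distance approaches f_i, as stated above.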
Suppose now that n1 > n2. The rays are then refracted away from the normal. Suppose there
is a ray bundle incident from medium 1 which, when viewed from medium 1, seems to converge
to a point S to the right of the surface. The point S is called a virtual object point because it is
not really present in medium 2. It can be shown that in this case (3.24) holds but with a negative
sign in front of n1/so. The sign convention listed in Table 3.1 avoids having to insert such
minus signs in (3.24) by hand.
Figure 3.13: Imaging of a virtual object S by a spherical interface between two media with
refractive indices n1 > n2.
As follows from the table, the object distance so is
positive when the object is to the left of the surface and negative when S is a virtual object in
medium 2 to which the rays in medium 1 seem to converge. Similarly, the image distance si is
taken to be positive if the image is to the right of the surface, but negative if the image is virtual
(i.e. to the left of the surface). A virtual image occurs when the rays in image space do not
converge to a point but diverge. When these rays are extended back into object space, they intersect
in a point P in front of the surface. For an observer in image space the rays seem to
come from P, which is therefore called the virtual image point.
The same sign conventions apply for fo as for so and for fi as for si. Finally, the radius of
curvature R is positive if the surface is convex when viewed from medium 1 and negative when
it is concave when viewed from medium 1. With these sign conventions, there always holds
\[ \frac{n_1}{s_o} + \frac{n_2}{s_i} = \frac{n_2}{f_i} = \frac{n_1}{f_o}, \tag{3.27} \]
with
\[ f_o = \frac{n_1}{n_2 - n_1} R, \qquad f_i = \frac{n_2}{n_2 - n_1} R. \tag{3.28} \]
Table 3.1: Sign convention for spherical refracting surfaces and thin lenses (light
entering from the left)
Quantity    Positive (+)          Negative (−)
so, fo      left of V             right of V
xo          left of Fo            right of Fo
si, fi      right of V            left of V
xi          right of Fi           left of Fi
R           C is right of V       C is left of V
yo, yi      above optical axis    below optical axis
left surface with radius R1. This intermediate image then serves as the object for imaging by
the second surface with radius R2.
Figure 3.14: A spherical lens.
Two rays starting from S are drawn in Fig. 3.14. These rays are refracted by the first surface
towards its local normal so that there must hold nl > nm. However, in the situation shown,
this refraction is not sufficiently strong to make these rays converge and intersect in an image
point after the first surface (if you extend the rays after the first surface as if the second
surface was not present, the rays would never intersect but would diverge). If one extends these
diverging rays to the region before the first surface, they intersect in P’. This means that for an
“observer inside the lens”, the diverging rays after the first surface appear to come out of point
P’ in front of the surface, which therefore is the virtual image of S for the first surface.
According to (3.27) we have
\[ \frac{n_m}{s_{o1}} + \frac{n_l}{s_{i1}} = \frac{n_l - n_m}{R_1}, \tag{3.29} \]
where in the case of Fig. 3.14, R1 > 0. Let d be the distance between the vertices: d = |V1V2|.
P’ is to the left of the second surface with object distance to vertex V2 given by
\[ s_{o2} = d - s_{i1}, \tag{3.32} \]
where $s_{i1} < 0$ because P’ is a virtual image for the first surface.
This underlines again that the sign convention is very convenient because, whatever case occurs,
(3.32) applies in all cases. With (3.27) applied to P’ and the second surface we get
\[ \frac{n_l}{d - s_{i1}} + \frac{n_m}{s_{i2}} = \frac{n_m - n_l}{R_2}. \tag{3.33} \]
Adding (3.29) and (3.33) gives
\[ \frac{n_m}{s_{o1}} + \frac{n_m}{s_{i2}} = (n_l - n_m)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) + \frac{n_l d}{(s_{i1} - d)\, s_{i1}}. \tag{3.34} \]
Note that si1 is a function of so1 and it is not so easy to express the image distance si2 in terms
of the object distance so1 alone. One can use either (3.29) or (3.33) to solve for si1 and substitute
into the other equation. However, if the lens is thin, d → 0, (3.34) simplifies because the term containing si1 drops
out and we get:
\[ \frac{n_m}{s_o} + \frac{n_m}{s_i} = (n_l - n_m)\left(\frac{1}{R_1} - \frac{1}{R_2}\right), \tag{3.35} \]
where so = so1 and si = si2 are the object distance and image distance to the common vertex
V = V1 = V2 of the two surfaces. It follows from the sign convention that so is positive (negative)
if the object is to the left (right) of V and that si is positive (negative) if the image is to the
right (left) of V .
When so → ∞, the image distance becomes the focal length fi in image space whereas when
si → ∞, the object distance becomes fo. Since the refractive index in both half spaces is the
same (nm), these focal distances are the same: f = fi = fo, with
\[ \frac{1}{f} = \frac{n_l - n_m}{n_m}\left(\frac{1}{R_1} - \frac{1}{R_2}\right). \tag{3.36} \]
The imaging equation of the thin lens then reads
\[ \frac{1}{s_o} + \frac{1}{s_i} = \frac{1}{f} \quad \text{(thin lens)}. \tag{3.37} \]
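Equations (3.36) and (3.37) translate directly into two small functions. The sketch below uses an illustrative biconvex glass lens in air; the function names and numerical values are assumptions made for the example, and the signs of R1, R2, so and si follow Table 3.1.

```python
def thin_lens_focal_length(n_l, n_m, R1, R2):
    """Eq. (3.36): 1/f = (n_l - n_m)/n_m * (1/R1 - 1/R2)."""
    return 1.0 / ((n_l - n_m) / n_m * (1.0 / R1 - 1.0 / R2))


def thin_lens_image(s_o, f):
    """Eq. (3.37) solved for s_i; diverges when s_o equals f."""
    return 1.0 / (1.0 / f - 1.0 / s_o)


# Biconvex lens: R1 > 0 (C1 right of V), R2 < 0 (C2 left of V).
f = thin_lens_focal_length(n_l=1.5, n_m=1.0, R1=0.10, R2=-0.10)

s_i_real = thin_lens_image(0.30, f)     # object outside the focal length
s_i_virtual = thin_lens_image(0.05, f)  # object inside the focal length
```

An object outside the focal length gives a positive image distance (real image to the right of the lens), while an object inside the focal length gives a negative image distance, i.e. a virtual image to the left of the lens.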
Remarks
1. We have seen in Section 3.5.1 that when in gaussian geometrical optics a parallel bundle
of rays (so → ∞) is incident on a single spherical surface, the rays are focused in a point
at a distance f from the vertex. It is clear from symmetry that if the parallel bundle is
incident under an angle and we vary this angle, the focal point will be displaced and vary
over a sphere centred at the vertex with radius f given by (3.28). In the approximation
of gaussian geometrical optics where only paraxial rays are considered, this sphere may
be considered to be a plane. Therefore in gaussian geometrical optics a parallel
bundle is focused to a point in the focal plane at distance f given by (3.36)
from the lens.
2. Similarly, a line perpendicular to the optical axis is imaged into a circular arc which in the
paraxial limit may be regarded to be a straight line. Hence in gaussian geometrical
optics, lines are imaged to lines and planes to planes.
Figure 3.15: Object and image for a thin lens.
y-coordinate yi of a point in image space is positive if the point is above the optical axis and
negative otherwise.
Draw the ray through the focal point in object space and the ray through the centre O of
the lens. The first ray becomes parallel in image space while the latter is unrefracted. Their
intersection gives the location of the image point. The image is real if the intersection occurs in
image space and is virtual otherwise. From the similar triangles ∆AOFi and ∆P2P1Fi it follows
that
\[ \frac{y_o}{|y_i|} = \frac{f}{s_i - f}. \tag{3.38} \]
From the similar triangles ∆S2S1Fo and ∆BOFo:
\[ \frac{|y_i|}{y_o} = \frac{f}{s_o - f} \tag{3.39} \]
(the absolute value of yi is taken because according to our sign convention yi in Fig. 3.15 is
negative whereas (3.39) is a ratio of lengths). By multiplying these two equations we get the
Newtonian form of the lens equation:
\[ x_o x_i = f^2, \tag{3.40} \]
where xo and xi are the distances of the object and image to the front and back focal planes,
respectively:
\[ x_o = s_o - f, \qquad x_i = s_i - f. \tag{3.41} \]
Here xo is reckoned positive if the object is to the left of Fo and xi is positive if the image is to
the right of Fi.
The transverse magnification is
\[ M = \frac{y_i}{y_o} = -\frac{s_i}{s_o} = -\frac{x_i}{f}, \tag{3.42} \]
where the last identity follows from considering similar triangles in Fig. 3.15. A positive M
means an erect image, a negative M means an inverted image.
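A quick numerical check of the Newtonian form (3.40) and of the two expressions for the magnification in (3.42) is sketched below. The focal length and object distance are arbitrary illustrative values.

```python
f, s_o = 0.10, 0.30                  # illustrative values in metres

s_i = 1.0 / (1.0 / f - 1.0 / s_o)    # thin lens equation (3.37)

x_o = s_o - f                        # Eq. (3.41): distances to the focal planes
x_i = s_i - f

newton_residual = x_o * x_i - f * f  # Eq. (3.40): should vanish

M_gauss = -s_i / s_o                 # Eq. (3.42), first form
M_newton = -x_i / f                  # Eq. (3.42), second form
```

Both forms give the same negative magnification here, so the image is inverted and half the size of the object.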
3.5.4 Two Thin Lenses
Figure 3.16: Two thin lenses separated by a distance that is larger than the sum of their focal
lengths.
In the case of Fig. 3.17 the distance d between two positive lenses is smaller than their
focal lengths. First the intermediate image P1′ is constructed. It is a real image for L1 which is
constructed by finding the intersection of rays 2 and 3 passing through the back and front focal
points Fi1 and Fo1 of lens L1, respectively. P1′ is a virtual object point for lens L2. To find its
image by L2, draw ray 4 from P1′ through the centre of lens L2 back to S1 (this ray is refracted
by lens L1 but not by L2) and draw ray 3 as refracted by lens L2. Since ray 3 is parallel to
the optical axis between the lenses, it passes through the back focal point Fi2 of lens L2. The
intersection of rays 3 and 4 is the final image point P1.
It is easy to express the image distance si in terms of the object distance so for two thin lenses. These
distances are measured from the vertices of L2 and L1, respectively. The intermediate image P1′
due to lens L1 has distance si,1 to the vertex of lens L1 satisfying:
\[ \frac{1}{s_o} + \frac{1}{s_{i,1}} = \frac{1}{f_1}. \tag{3.43} \]
P1′ is the object for lens L2 with object distance so,2 = d − si,1, where d is the distance between the
lenses. Hence, with si = si,2, the lens equation for lens L2 implies:
\[ \frac{1}{d - s_{i,1}} + \frac{1}{s_i} = \frac{1}{f_2}. \tag{3.44} \]
By solving for si,1 from (3.43) and substituting the result into (3.44) one finds:
\[ s_i = \frac{s_o f_1 f_2 - f_2 d (s_o - f_1)}{s_o f_1 - (d - f_2)(s_o - f_1)} \quad \text{(two thin lenses)}. \tag{3.45} \]
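The closed form (3.45) can be compared with a step-by-step application of the thin lens equation to each lens. The sketch below does this for one illustrative configuration; the function names and numbers are assumptions made for the example.

```python
def thin_lens_image(s_o, f):
    """Thin lens equation (3.37) solved for the image distance."""
    return s_o * f / (s_o - f)


def two_lens_image(s_o, f1, f2, d):
    """Closed form (3.45) for two thin lenses a distance d apart."""
    num = s_o * f1 * f2 - f2 * d * (s_o - f1)
    den = s_o * f1 - (d - f2) * (s_o - f1)
    return num / den


f1, f2, d, s_o = 0.10, 0.05, 0.30, 0.20   # illustrative values in metres

# Sequential imaging: first by L1, then the intermediate image is the object of L2.
s_i1 = thin_lens_image(s_o, f1)           # Eq. (3.43)
s_o2 = d - s_i1                           # object distance for L2
s_i_sequential = thin_lens_image(s_o2, f2)  # Eq. (3.44)

# Direct evaluation of the closed form.
s_i_formula = two_lens_image(s_o, f1, f2, d)
```

A negative `s_o2` would indicate a virtual object for L2, as in Fig. 3.17; the same two functions handle that case without modification because of the sign convention.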
Figure 3.17: Two thin lenses at a distance smaller than their focal lengths.
By taking the limit so → ∞, we obtain the back focal length of the two lenses, while by taking
the limit si → ∞ we get the front focal length:
\[ \text{b.f.l.} = \frac{(f_1 - d) f_2}{f_1 + f_2 - d}, \tag{3.46} \]
\[ \text{f.f.l.} = \frac{f_1 (f_2 - d)}{f_1 + f_2 - d}. \tag{3.47} \]
By construction using the intermediate image it is clear that the magnification of the two lens
system is the product of the magnifications of the two lenses:
\[ M = M_1 M_2. \tag{3.48} \]
Remarks
1. When f1 + f2 = d the focal points are at infinity. Such a system is called telecentric.
2. In the limit that the lenses are very close together, d → 0, (3.45) becomes
\[ \frac{1}{s_o} + \frac{1}{s_i} = \frac{1}{f_1} + \frac{1}{f_2}. \tag{3.49} \]
The focal length f of the two lenses in contact satisfies:
\[ \frac{1}{f} = \frac{1}{f_1} + \frac{1}{f_2}. \tag{3.50} \]
Two positive lenses in close contact reinforce each other, i.e. the second positive lens makes
the convergence of the first lens stronger. Similarly, two negative lenses in contact make a
more strongly negative system. The same applies for more than two lenses in close contact.
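The focal lengths (3.46) and (3.47) and the contact limit (3.50) can be evaluated with a few lines of code. The focal lengths and separation below are illustrative assumptions; note that both functions divide by zero when f1 + f2 = d, which is the case of Remark 1 where the focal points are at infinity.

```python
def back_focal_length(f1, f2, d):
    """Eq. (3.46): distance from the second lens to the back focal point."""
    return (f1 - d) * f2 / (f1 + f2 - d)


def front_focal_length(f1, f2, d):
    """Eq. (3.47): distance from the front focal point to the first lens."""
    return f1 * (f2 - d) / (f1 + f2 - d)


f1, f2 = 0.10, 0.05                        # illustrative focal lengths in metres

bfl = back_focal_length(f1, f2, 0.30)      # lenses 30 cm apart
ffl = front_focal_length(f1, f2, 0.30)

# Lenses in contact (d = 0): both reduce to 1/f = 1/f1 + 1/f2, Eq. (3.50).
f_contact_back = back_focal_length(f1, f2, 0.0)
f_contact_front = front_focal_length(f1, f2, 0.0)
```

For lenses in contact the front and back focal lengths coincide, as expected for a single equivalent thin lens.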
Sign convention for the ray angle: α is the smallest angle between the ray and the op-
tical axis and is positive for a positive slope and negative for a negative slope.
We write M for the matrix and specify it for a number of cases. One can prove (but we will not
do it) that all matrices have determinant=1.
a. Homogeneous medium with refractive index n. For example, if the planes have distance
d and there is a homogeneous medium of refractive index n between the two planes,
we have
\[ n\alpha_2 = n\alpha_1, \qquad y_2 = y_1 + \alpha_1 d, \]
and hence
\[ M = \begin{pmatrix} 1 & 0 \\ \frac{d}{n} & 1 \end{pmatrix}. \tag{3.53} \]
This matrix will also be denoted by the symbol T12 ("transfer matrix").
b. Spherical surface. Let the two planes 1 and 2 be immediately to the left and right of a
spherical interface with radius R, centre C and vertex V, with refractive indices n1 and n2
to the left and right of the interface. With reference to Fig. 3.12, the angle α2 is according
to the sign convention negative whereas in Eq. (3.19) it was taken positive. Hence, we
have to replace α2 by −α2 in (3.19) and obtain:
\[ n_2\alpha_2 = n_1\alpha_1 - \frac{(n_2 - n_1)\, y_1}{R}, \qquad y_2 = y_1, \]
where we used (3.21): ϕ = y1/R. Hence the transfer matrix for a single surface is
\[ S = \begin{pmatrix} 1 & -\frac{n_2 - n_1}{R} \\ 0 & 1 \end{pmatrix}. \tag{3.54} \]
c. Thin lens. If the two planes are the entrance and exit planes of a thin lens of refractive
index nl in an ambient medium nm, we have
\[ L_{\text{thin}} = \begin{pmatrix} 1 & -\frac{n_m}{f} \\ 0 & 1 \end{pmatrix}. \tag{3.55} \]
If between two planes there are several lenses, single spherical surfaces and homogeneous spaces,
then the matrix M for the two planes is simply the product of the separate matrices.
Imaging conditions. Let S be a point in the first plane with distance to the optical axis
denoted by y1. Suppose there is an image point P of S in plane 2. This means that there must
be a y2 such that any (paraxial) ray through S will pass through point P with distance y2 to the
optical axis. Hence
\[ M_{21}\, n_1\alpha_1 + M_{22}\, y_1 = y_2, \tag{3.56} \]
for all α1. This is possible only when
\[ M_{21} = 0, \qquad y_2 = M_{22}\, y_1 \qquad \text{(imaging conditions)}. \tag{3.57} \]
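The matrices (3.53) to (3.55) and the imaging conditions (3.57) can be tried out with plain 2×2 lists. The sketch below assumes, as in (3.56), that a matrix acts on the column vector (nα, y), and that the matrix of the element met first by the ray stands rightmost in the product. The helper names and the numerical values are illustrative assumptions.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transfer(d, n):
    """Eq. (3.53): propagation over distance d in a medium of index n."""
    return [[1.0, 0.0], [d / n, 1.0]]


def surface(n1, n2, R):
    """Eq. (3.54): refraction at a spherical surface of radius R."""
    return [[1.0, -(n2 - n1) / R], [0.0, 1.0]]


def thin_lens(f, n_m=1.0):
    """Eq. (3.55): thin lens of focal length f in a medium of index n_m."""
    return [[1.0, -n_m / f], [0.0, 1.0]]


# Object plane 0.30 before a thin lens (f = 0.10), image plane 0.15 behind it.
f, s_o, s_i = 0.10, 0.30, 0.15
M = matmul(transfer(s_i, 1.0), matmul(thin_lens(f), transfer(s_o, 1.0)))

M21, M22 = M[1][0], M[1][1]          # imaging requires M21 = 0, then M22 = y2/y1
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Two surfaces with zero separation reproduce the thin lens matrix.
lens_from_surfaces = matmul(surface(1.5, 1.0, -0.10), surface(1.0, 1.5, 0.10))
```

Here M21 vanishes because the two planes satisfy the thin lens equation, and M22 equals the transverse magnification of (3.42).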
optical axis (in the paraxial approximation) is in good approximation a plane perpendicular to
the optical axis. This plane is called the primary principal plane and its intersection with
the optical axis is called the first principal point H1 .
By considering the rays that are incident parallel to the optical axis and focused in the back
focal point, the second principal plane and second principal point H2 are defined in a similar
way. The principal planes need not be inside the lens. In particular for meniscus lenses this is
not the case.
Figure 3.18: Principal planes of a thick lens.
One can show by a rather long computation, which we omit², that when the object and image distances $s_o$, $s_i$ are measured with respect to the principal points $H_1$ and $H_2$, we have
\[ \frac{1}{s_o} + \frac{1}{s_i} = \frac{1}{f}, \tag{3.58} \]
where $f$, the focal length as measured from the first and second principal planes, satisfies:
\[ \frac{1}{f} = \frac{n_l - n_m}{n_m} \left( \frac{1}{R_1} - \frac{1}{R_2} + \frac{(n_l - n_m)\, d_l}{n_l R_1 R_2} \right), \tag{3.59} \]
where $d_l$ is the distance between the vertices. For completeness we list the distances $h_1$ and $h_2$ from $H_1$ to $V_1$ and from $H_2$ to $V_2$ ($h_j > 0$ if $H_j$ is to the right of $V_j$):
\[ h_1 = -\frac{f\,(n_l - n_m)\, d_l}{R_2\, n_l}, \tag{3.60} \]
\[ h_2 = -\frac{f\,(n_l - n_m)\, d_l}{R_1\, n_l}. \tag{3.61} \]
² A derivation can be found in the nicely written book K.K. Sharma, Optics: Principles and Applications, Academic Press 2006, Section 4.2.
Figure 3.19: Position of the principal planes for different lenses of the same power.
A thick lens can be described by a ray matrix which has the same form as for a thin lens, provided the reference planes, which for the thin lens pass through the vertices, are for the thick lens taken to be the principal planes. So
\[ L_{\text{thick}} = \begin{pmatrix} 1 & -\dfrac{n_m}{f} \\ 0 & 1 \end{pmatrix}, \tag{3.62} \]
where $1/f$ is given by (3.59).
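As a rough numerical illustration of (3.59)-(3.61), the sketch below evaluates the focal length and the positions of the principal points for a symmetric biconvex lens. The function name and the lens data are hypothetical; the formulas are coded in the form in which they appear in these notes.

```python
def thick_lens(n_l, n_m, R1, R2, d_l):
    """Return (f, h1, h2) following the form of (3.59)-(3.61)."""
    delta = n_l - n_m
    inv_f = (delta / n_m) * (1.0 / R1 - 1.0 / R2
                             + delta * d_l / (n_l * R1 * R2))
    f = 1.0 / inv_f
    h1 = -f * delta * d_l / (R2 * n_l)   # position of H1 relative to V1
    h2 = -f * delta * d_l / (R1 * n_l)   # position of H2 relative to V2
    return f, h1, h2

# Illustrative biconvex glass lens in air, lengths in metres.
f, h1, h2 = thick_lens(n_l=1.5, n_m=1.0, R1=0.10, R2=-0.10, d_l=0.01)

# Setting d_l = 0 recovers the thin-lens focal length.
f_thin, _, _ = thick_lens(1.5, 1.0, 0.10, -0.10, 0.0)

print(f"f = {f:.5f} m, h1 = {h1:.5f} m, h2 = {h2:.5f} m, f_thin = {f_thin:.5f} m")
```

For this symmetric lens the two principal points lie inside the lens at equal distances from the vertices, so that $h_1 = -h_2$.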
In summary, the model of refraction by a thick lens within gaussian geometrical optics is the same as for a thin lens, provided the object and image distances and the focal distances are all measured from the principal planes.
3.6 Stops
An element such as the rim of a lens or a separate diaphragm, which determines the set of rays that can contribute to the image, is called the aperture stop. An ordinary camera has a variable diaphragm.
The entrance pupil is the image of the aperture stop by all elements preceding the aperture stop. If there are no lenses between object and aperture stop, the aperture stop itself is the entrance pupil. Similarly, the exit pupil is the image of the aperture stop by all elements following it. The entrance pupil determines the cone of light that enters the optical system, while the cone leaving it is determined by the exit pupil.
For any object point, the chief ray is the ray in the cone that passes through the centre of the entrance pupil, and hence also through the centres of the aperture stop and the exit pupil. A marginal ray is a ray that, for an object point on the optical axis, passes through the rim of the entrance pupil (and hence also through the rims of the aperture stop and the exit pupil).
The field stop determines the size of the object that can be imaged. The field stop could for example be the edge of a CCD detector.
For a fixed diameter D of the exit pupil and fixed object distance so , the magnification of
the system is according to (3.42) and (3.40) given by M = xi /f = f /xo . It follows that when
f is increased, the magnification increases. A larger magnification means a lower energy density
hence a longer exposure time, i.e. the speed of the lens is reduced. Camera lenses are usually
specified by two numbers: the focal length f and the largest diameter D of the exit pupil. The
f-number is the ratio of the focal length to this diameter:
\[ f\text{-number} = \frac{f}{D}. \]
For example, f/2 means $f = 2D$. Since the exposure time is proportional to the square of the f-number, an f/1.4 lens is twice as fast as an f/2 lens.
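The factor of two quoted above follows directly from the square law. A minimal sketch, with hypothetical function names and assuming only that the exposure time scales with the square of the f-number:

```python
def f_number(f, D):
    """f-number: ratio of the focal length f to the pupil diameter D."""
    return f / D

def exposure_ratio(n_slow, n_fast):
    """Ratio of exposure times, assuming time ~ (f-number)**2."""
    return (n_slow / n_fast) ** 2

print(f_number(0.050, 0.025))               # a 50 mm lens with a 25 mm pupil
ratio = exposure_ratio(2.0, 1.4)
print(f"f/2 needs {ratio:.2f} times the exposure time of f/1.4")
```

The ratio is close to 2 because 1.4 is approximately the square root of 2.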
The power of a lens is defined by
\[ \text{Power} = D = \frac{n_m}{f}, \tag{3.64} \]
where $n_m$ is the refractive index of the surrounding medium (the symbol $D$ here denotes the power, not the pupil diameter used above). The unit of the power is "Diopter" when $f$ is specified in meters. Hence a lens in air with power 20 Diopter has focal distance 5 cm. The power is positive for a positive (i.e. convergent) lens and negative for a negative (i.e. divergent) lens.
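The conversion between power and focal distance in (3.64) is a one-liner. The sketch below uses hypothetical helper names and assumes a lens in air unless another index is passed in.

```python
def power_diopter(f_m, n_m=1.0):
    """Power n_m / f in Diopter, with f in metres, as in (3.64)."""
    return n_m / f_m

def focal_length_m(power, n_m=1.0):
    """Inverse relation: focal distance in metres for a given power."""
    return n_m / power

f_20 = focal_length_m(20.0)
print(f"20 Diopter in air: f = {100 * f_20:.1f} cm")
print(f"f = -2 m in air: power = {power_diopter(-2.0):+.2f} Diopter")
```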
Figure 3.21: Aperture stop (A.S.) between the second and third lens, with entrance pupil and exit pupil (in this case the pupils are virtual images of the aperture stop). Also shown are the chief ray and the marginal ray.
3.7 Beyond Gaussian Geometrical Optics
3.7.1 Aberrations
When non-paraxial rays are traced, it is found that in general these do not intersect the ideal gaussian image point. Instead of a single spot, a spot diagram is found which is more or less confined. The deviation from an ideal point image is quantified in terms of aberrations. One distinguishes between monochromatic and chromatic aberrations. The latter are caused by the
fact that the refractive index depends on wavelength. Recall that in paraxial geometrical optics
Snell’s Law (2.105) is replaced by: $n_i \theta_i = n_t \theta_t$, i.e. $\sin\theta_i$ and $\sin\theta_t$ are replaced by their linear terms. If instead one retains the first two terms of the Taylor series of the sine, the errors in the imaging can be quantified by five monochromatic aberrations, the so-called primary or Seidel aberrations.
The best known is spherical aberration, which is caused by the fact that for a convergent
spherical lens rays that make a large angle with the optical axis are focused closer to the lens
than the paraxial rays (see Fig. 3.22). Distortion is one of the other of the five primary
Figure 3.24: EUV (Extreme UV) lithographic machine from ASML. Mirrors replace the lenses because of the lack of lenses for this wavelength.
3.8 Beyond Geometrical Optics
Aberrations can be quantified by analysing the spot diagram or, alternatively, by considering the wave front in image space converging to a point image. When a point object is imaged, ideally the transmitted wave front at the exit pupil is part of a perfect sphere centred on the gaussian image point. Aberrations cause the wave front to deviate from a perfect sphere. According to a generally accepted criterion formulated first by Rayleigh, aberrations start to deteriorate images considerably if the wavefront aberrations cause path length differences of more than a quarter of the wavelength. When the aberrations are less than this, the system is
called diffraction limited. Even if the wave transmitted by the exit pupil were perfectly spherical, the wave front is only part of a sphere since the field is limited by the aperture. An aperture causes diffraction, i.e. bending and spreading of the light. When one images a point object on the optical axis, diffraction causes the light distribution called the Airy spot, as shown in Fig. 3.25. The Airy spot has full-width at half maximum:
\[ \text{FWHM} = 1.6\, \frac{\lambda s_i}{D}, \tag{3.65} \]
where $D$ is the diameter of the exit pupil and $s_i$ is the image distance as predicted by gaussian geometrical optics. Diffraction depends on the wavelength and hence it cannot be described by geometrical optics, which applies to the limit of vanishing wavelength. We will treat diffraction by apertures in Chapter 7.
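To get a feeling for the size of the Airy spot, (3.65) can be evaluated for some typical numbers. The values below (green light, a 25 mm exit pupil, an image distance of 50 mm) are illustrative assumptions, and the prefactor 1.6 is simply the one quoted in (3.65).

```python
def airy_fwhm(wavelength, s_i, D, prefactor=1.6):
    """Width of the Airy spot according to (3.65); all lengths in metres."""
    return prefactor * wavelength * s_i / D

fwhm = airy_fwhm(wavelength=550e-9, s_i=0.050, D=0.025)
print(f"FWHM = {fwhm * 1e6:.2f} micrometre")

# Halving the pupil diameter doubles the width of the spot.
fwhm_small_pupil = airy_fwhm(550e-9, 0.050, 0.0125)
```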
Figure 3.25: Left: intensity of the Airy pattern; right: cross section of the Airy pattern (amplitude).
Chapter 4
Optical Instruments
What you should know and be able to do after studying this chapter.
• Understand the principle of the eye and its accommodation with the near
and far point.
• Know the principle of the magnifier and the eyepiece and their use in the microscope and telescope.
• Understand the microscope and the telescope concept and the angular magnification in both cases.
After having studied the laws of gaussian geometrical optics, we are able to build more complex systems based on optical elements, such as lenses and reflectors. In this chapter we describe the most common systems.
Figure 4.1: Left: The principle of the camera obscura and its use by a painter. It is suspected that Vermeer used a camera obscura. Right: An example of an image with a view of Central Park (NY) looking north in spring ©Abelardo Morell
is released, the diaphragm closes to a preset value, the mirror swings up and the CCD or film is
exposed. To focus the camera the entire lens is moved toward or away from the detection plane.
Autofocus is based on maximizing the contrast of the images.
Figure 4.2: Inside of a reflex camera. When taking an image the mirror swings up and light goes to the sensor instead. 1) Camera lens, 2) Reflex mirror, 3) Focal-plane shutter, 4) Image sensor, 5) Matte focusing screen, 6) Condenser lens, 7) Pentaprism/pentamirror, 8) Viewfinder eyepiece.
The angular field of view (AFOV) is defined for scenes at large distances. The AFOV is the angle subtended at the lens by the detector area when the image distance is the focal length $f$. The AFOV decreases when $f$ increases. A standard SLR has a focal length of around 6 cm and the AFOV is then between 40° and 50°. The horizontal field of view (HFOV) is related to the AFOV as shown in Fig. 4.3b. More complex systems of lenses can have a variable focal distance, so they are able to zoom into a scene. To achieve such an effect, the lenses are translated. Fig. 4.4 shows the effects on the picture of changing the distance between the object and the camera at constant focal length, and of changing the focal length for a fixed position of the observer.
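The definition of the AFOV given above translates into a simple formula: a detector of size $d$ placed at the distance $f$ behind the lens subtends the angle $2\arctan(d/2f)$. The sketch below is only an illustration of this geometry; the detector size (the diagonal of a 36 mm × 24 mm frame) is an assumption that is not stated in the text.

```python
import math

def afov_degrees(detector_size, f):
    """Angle subtended at the lens by a detector of the given size at distance f."""
    return math.degrees(2.0 * math.atan(detector_size / (2.0 * f)))

# Assumed detector: diagonal of a 36 mm x 24 mm frame, lengths in mm.
diagonal = math.hypot(36.0, 24.0)

afov = afov_degrees(diagonal, 60.0)            # f = 6 cm
afov_tele = afov_degrees(diagonal, 120.0)      # doubling f narrows the field
print(f"f = 60 mm: {afov:.1f} deg, f = 120 mm: {afov_tele:.1f} deg")
```

Under this assumption the result for $f = 6$ cm lies at the lower end of the 40° to 50° range mentioned above.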
The depth of focus is determined by the diaphragm. When the aperture is wide open, rays
coming from different objects at various distances will not all be focused on the screen (see
Fig. 4.5) and the image is blurred. When the aperture is reduced, this effect is less and therefore
a smaller diaphragm implies a larger depth of focus. The drawback is that less light reaches the
sensor, therefore a longer exposure time is needed.
1 http://www.edmundoptics.com/resources/application-notes/imaging/understanding-focal-length-and-field-of-view/
Figure 4.4: Effect of the focal length of the lens system and of the distance of the object in the
pictures taken.
Figure 4.5: Left: the ray diagram with rays showing the origin of the depth of focus. Right: two
pictures taken with different apertures. In the top image, only the black ball appears sharp, the
background is blurred; in the bottom image all balls appear sharp.
(n ≈ 1.336) with the iris (or pupil) in it. It can expand or contract from 2 mm (bright sun) to 8 mm (low light) diameter to adapt to the light intensity. The iris also gives colour to the eye. After this, the rays reach the flexible crystalline lens which has the size of a bean (9 mm in diameter, and 4 mm thick in unaccommodated condition). Its index of refraction varies from 1.406 in its centre to 1.386 at the edge.
Figure 4.7: Left: Optical rays showing how an eye accommodates by changing its focal length.
Right: Relaxed and contracted muscle at the crystalline lens needed for this accommodation.
The entire eye can effectively be treated as two lenses in contact, of which the crystalline lens can change its focal length. In unaccommodated condition, the front focal distance of the two lens system is $f_1 = 16$ mm as measured from the cornea and the back focal distance is equal to the length of the eye: $f_2 = 24$ mm. These focal distances are different because the refractive indices of the surrounding medium (air and vitreous humour) differ. The power of the intact unaccommodated eye lens system is according to (3.64):
\[ D = \frac{n_{vh}}{f_2} = \frac{1.337}{0.0243} = 55 \text{ Diopter}. \tag{4.1} \]
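The number in (4.1) can be reproduced with the values that appear in that equation. The sketch below uses hypothetical variable names and simply applies (3.64) with the index of the vitreous humour.

```python
n_vh = 1.337        # refractive index of the vitreous humour, as used in (4.1)
f_2 = 0.0243        # back focal distance in metres, as used in (4.1)

power_eye = n_vh / f_2      # power according to (3.64)
print(f"Power of the unaccommodated eye: {power_eye:.1f} Diopter")
```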
4.3.1 Accommodation
In relaxed condition the lens focuses light coming from infinity on the retina. When the object
is closer, the eye muscles contract due to which the crystalline lens becomes more convex and
the focal length of the system decreases as seen in Fig. 4.7-right. At a certain point, the object
will be too close to be focused on the retina: this is called the near point of the eye. Due to the
loss of elasticity of the muscle, the near point distance moves from 7 cm for teenagers to 100 cm for 60-year-olds. Fig. 4.7 shows the optical rays entering the eye for two configurations: an object at infinity and an object nearby. The so-called normal near point is at 25 cm. The far point is the position of the furthest object that is imaged on the retina by the unaccommodated eye. For the normal eye the far point is at infinity.
4.3.2 Retina
The retina is composed of approximately 125 million photoreceptor cells: the rods and the cones. The rods are highly sensitive black and white (intensity) sensors, while the cones are colour sensitive for the wavelengths 390 nm - 780 nm. UV light is absorbed by the lens (people whose lens is removed because of cataract can "see" UV light). The fovea centralis is the most sensitive centre of the retina with a high density of cones. The eyes move continuously to focus the image on this area. The information is transferred by the optic nerve, placed at the back where it causes a blind spot.
4.3.3 Eyeglasses
The eye can suffer from imperfections as seen in Fig. 4.8. We discuss the most common imper-
fections and their solutions.
a. Myopia or Nearsightedness. A myopic eye has too high a power, so that distant objects are focused in front of the retina by the unaccommodated eye. The far point is thus not at infinity but at a finite distance. This can be corrected by a negative lens. Suppose the far point is at 2 m. If the concave lens makes a virtual image of a distant object at 2 m in front of the cornea, the unaccommodated eye can see it clearly. The lens law (3.37) with $s_o = \infty$ then implies $f = s_i = -2$ m. Hence the required power of the lens is:
\[ D = \frac{1}{f} = -0.5 \text{ Diopter}. \tag{4.2} \]
The lens is best put in the front focal plane of the eye, i.e. at approximately 16 mm in front of
the cornea. The reason is that in this case the magnification of the eye and the negative lens
together are the same as for the uncorrected eye. To see this, draw a ray from the top of the
object through the centre of the negative lens. This will then be made parallel to the optical
axis by the eye lens and the distance of this ray to the optical axis is the image size on the
retina. This ray will end up at the same point of the retina as when the negative lens is taken
out because it is unrefracted by this lens.
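The calculation leading to (4.2) can be repeated for any far point. The following sketch is a simplified illustration with a hypothetical function name: it treats the spectacle lens as a thin lens in air and, like the example in the text, measures the far point from the lens.

```python
def myopia_correction(far_point_m):
    """Power (Diopter) of the thin lens that images infinity at the far point.

    With s_o = infinity the lens law gives f = s_i, and the virtual image
    must lie at the far point in front of the lens, so s_i = -far_point_m.
    """
    f = -far_point_m
    return 1.0 / f

for far_point in (2.0, 1.0, 0.5):
    print(f"far point {far_point} m -> {myopia_correction(far_point):+.1f} Diopter")
```

The closer the far point, the stronger the negative lens that is needed.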
Contact lenses are very close to the eye lens and hence the total power of the eye with a
contact lens is simply the sum of the power of the eye and the contact lens.
c. Presbyopia. This is the lack of accommodation of the eye, such as in people over 40. It results in an increase in the distance between the near point and the retina. This defect affects all images. Presbyopia is usually corrected by glasses with progressive correction, the upper part for distance vision and the lower part for near vision.
d. Astigmatism. In this case the focal distances for two directions perpendicular to the
optical axis are different. It is attributed to a lack of symmetry of revolution of the cornea. This
is compensated for by using glasses which themselves are astigmatic.
Figure 4.8: Correction of nearsighted (left) and farsighted (right) eye.
4.4 Magnifying glass
The image size on the retina can be increased by bringing the object closer to the eye (reduce $s_o$ at fixed $s_i$). But $s_o$ cannot be smaller than the near point $d_o$, which we take here to be 25 cm. It is desirable to use a lens that makes a magnified erect image at a distance to the eye greater than $d_o$. This can be achieved by a positive lens with the object closer to the lens than the front focal point, thereby producing a magnified virtual image. An example is given in Fig. 4.10.
fixed, the ratio of the image size on the retina for the eye with and without magnifying glass is:
\[ MP = \frac{\alpha_a}{\alpha_u}, \tag{4.4} \]
where $\alpha_a$ and $\alpha_u$ are the angles between the optical axis and the chief rays for the aided and the unaided eye as shown in Fig. 4.10. Working with these angles instead of distances is in particular useful when the (virtual) image of the magnifying glass becomes infinite (see below). Using $\alpha_a \approx y_i/L$ and $\alpha_u \approx y_o/d_o$, with $y_i$ and $y_o$ positive and with $L$ the positive distance from the image to the eye, we have
\[ MP = \frac{y_i\, d_o}{y_o\, L}. \tag{4.5} \]
Since $s_i < 0$ we have
\[ \frac{y_i}{y_o} = -\frac{s_i}{s_o} = 1 + \frac{-s_i}{f}, \]
where we used the Lens Equation for the magnifying glass. We have $-s_i = |s_i| = L - l$, where $l$ is the distance between the magnifying glass and the eye. Hence, (4.5) becomes:
\[ MP = \frac{d_o}{L} \left[ 1 + \frac{L - l}{f} \right] = \frac{d_o}{L} \left[ 1 + D\,(L - l) \right], \tag{4.6} \]
where $D$ is the power of the magnifying glass. We can distinguish three situations:
1. $l = f$: the magnifying power is $d_o D$.
2. $l = 0$: the largest value of MP corresponds to the smallest $L$, i.e. $L = d_o$:
\[ MP|_{l=0,\,L=d_o} = d_o D + 1. \tag{4.7} \]
3. The object is at the focal point of the magnifier ($s_o = f$), so the virtual image is at infinity ($L = \infty$):
\[ MP|_{L=\infty} = d_o D \tag{4.8} \]
for every distance $l$ between the eye and the magnifying glass. The rays are parallel so that the eye views the object in a relaxed way. This is the most common use of the magnifier.
Figure 4.10: An unaided view (top) and an aided view using a magnifier (bottom).
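The three situations can be verified numerically from (4.6). The sketch below is only an illustration with hypothetical names; it rewrites (4.6) as $d_o\,[1/L + D\,(1 - l/L)]$, which is algebraically the same expression but can also be evaluated for $L = \infty$.

```python
import math

def magnifying_power(L, l, f, d_o=0.25):
    """Magnifying power (4.6), written so that L = infinity can be evaluated."""
    D = 1.0 / f                                # power of the magnifier
    return d_o * (1.0 / L + D * (1.0 - l / L))

f, d_o = 0.10, 0.25                            # a 10 Diopter magnifier, lengths in metres

mp_case1 = magnifying_power(L=0.50, l=f, f=f)          # situation 1: l = f
mp_case2 = magnifying_power(L=d_o, l=0.0, f=f)         # situation 2: l = 0, L = d_o
mp_case3 = magnifying_power(L=math.inf, l=0.05, f=f)   # situation 3: image at infinity

print(mp_case1, mp_case2, mp_case3)
```

For this 10 Diopter magnifier the first and third situations both give $d_o D = 2.5$, and the second gives $d_o D + 1 = 3.5$.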
4.4.2 Nomenclature
Normally magnifiers are specified in terms of the magnifying power when $L = \infty$: for example, a magnifier with a power of 10 D has an MP equal to 2.5 and it will be called 2.5×. In other words, with the object at the focal point of the lens, the image on the retina is 2.5 times larger than it would be if the object were at the near point of the unaided eye.
4.5 Eyepieces
An eyepiece or ocular is a magnifier used before the eye at the end of another optical instrument such as a microscope or a telescope. The eye looks into the ocular and the ocular "looks" into the optical instrument. The ocular provides a magnified virtual image of the image produced by the optical instrument. Similar to the magnifying glass, the virtual image should preferably be at or near infinity to be viewed by a relaxed eye. Several types of eyepieces exist and most of them are made out of two lenses: 1. the field lens, which is the first lens in the ocular; 2. the eye-lens, which is closest to the eye at a fixed distance called the eye relief. The aperture of the eyepiece is controlled by a field stop. An example is given in Fig. 4.11. You can play with the eyepieces that are commonly used at the following address: Oculaires
Figure 4.11: Example of an eyepiece used in situ: 1) real image, 2) field diaphragm, 3) eye relief, 4) exit pupil.
spectacle maker, Zacharias Janssen of Middleburg. Galileo runs barrel of the device. Rays diverging from each point of this im-
a close second, having announced his invention of a compound age will emerge from the eye-lens (which in this simple case is
microscope in 1610. A simple version, which is closer to these the eyepiece itself) parallel to each other, as noted in the previous
earliest devices than it is to a modern laboratory microscope, is section. The ocular magnifies the intermediate image still further.
depicted in Fig. 5.110. Thus the magnifying power of the entire system is the product of
The lens system, here a singlet, closest to the object is referred the transverse linear magnification of the objective, MTo, and the
to as the objective. It forms a real, inverted, magnified image of angular magnification of the eyepiece, MAe, that is,
the object. This image resides in space on the plane of the field
MP = MToMAe (5.80)
stop of the eyepiece and has to be small enough to fit inside the
The objective magnifies the object and brings it up in the form
of a real image, where it can be examined as if through a mag-
nifying glass.
Recall that MT = -xi >ƒ, Eq. (5.26). With this in mind most,
but not all, manufacturers design their microscopes such that
the distance (corresponding to xi) from the second focus of the
objective to the first focus of the eyepiece is standardized at
160 mm. This distance, known as the tube length, is denoted
by L in the figure. (Some authors define tube length as the
image distance of the objective.) Hence, with the final image
Exit pupil at infinity [Eq. (5.79)] and the standard near point taken as
254 mm (10 inches),
160 254
MP = a - ba b (5.81)
f ray
ƒo ƒe
Figure 4.12: Simple compound microscope. The objective forms a real image of a nearby object. The eyepiece enlarges this intermediate image. The final image can be bigger than the barrel of the device since it is virtual.
linear magnification of the objective MT0 and the angular magnification of the eyepiece MAe :
According to (3.42): MT = −xi /f , where xi is the distance of the image made by the objective
to its back focal plane. We have xi = L which is the tube length, i.e. the distance between the
second focus of the objective and the first focus of the eyepiece. The tube length is standardized
at 16 cm. We can then write:
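$$M_P = \frac{-x_i}{f_o}\,\frac{d_o}{f_e} = \frac{-16}{f_o}\,\frac{25}{f_e}, \tag{4.10}$$
with the standard near-point distance $d_o = 25$ cm and the focal lengths $f_o$ and $f_e$ expressed in cm. As an example, an Amici objective gives 40× and, combined with a 10× eyepiece, one gets $|M_P| = 400$.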
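As a quick numerical check of (4.10), the following minimal sketch evaluates the magnifying power for the example above. The function name and the conversion from a "40×" marking to a focal length (using the 16 cm tube length) are illustrative assumptions, not part of the notes.

```python
def microscope_power(f_objective_cm, f_eyepiece_cm,
                     tube_length_cm=16.0, near_point_cm=25.0):
    """Magnifying power of a compound microscope, Eq. (4.10); all lengths in cm."""
    objective_magnification = -tube_length_cm / f_objective_cm  # M_T = -x_i / f_o
    eyepiece_magnification = near_point_cm / f_eyepiece_cm      # M_A = d_o / f_e
    return objective_magnification * eyepiece_magnification

# Assumed conversion: a 40x objective has f_o = 16/40 = 0.4 cm,
# a 10x eyepiece has f_e = 25/10 = 2.5 cm.
print(microscope_power(0.4, 2.5))
```

The negative sign of the result expresses that the image is inverted.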
The numerical aperture (NA) of a microscope is a measure of its capability to gather light from the object. It is defined by
$$\mathrm{NA} = n_i \sin\theta_{\max},$$
with $n_i$ the refractive index of the immersing medium (usually air, but it could be water or oil) and $\theta_{\max}$ the half-angle of the maximum cone of light accepted by the lens. The numerical aperture is the second number etched in the barrel of the objective. It ranges from 0.07 (low-power objectives) to 1.4 (high-power objectives). In Chapter 7 it will be explained that the NA determines the resolving power: the minimum transverse distance between two object points that can be resolved in the image is inversely proportional to the NA.
(4.12)
4.7 Telescopes
A telescope enlarges the retinal image of a distant object. Like a compound microscope, it is composed of an objective and an eyepiece, as seen in Fig. 4.13. The object in this figure is at a large but finite distance; therefore an image is formed by the objective just after its second focal point. The eyepiece makes a virtual magnified image to be viewed with a relaxed eye. Therefore the intermediate image formed by the objective must lie within the focal length $f_e$ of the eyepiece. The final image is inverted.

Figure 4.13: Keplerian astronomical telescope, accommodating the eye.
As seen earlier, the angular magnification is $M_P = \alpha_a/\alpha_u$, where $\alpha_u$ is the half-angle of the cone of light that is collected and $\alpha_a$ is the half-angle of the apparent cone of rays. From triangles $F_{o1}BC$ and $F_{e2}DE$ in Fig. 4.14 we see that
$$M_P = -\frac{f_o}{f_e}. \tag{4.13}$$

[Figure 4.14: Ray angles for a telescope, showing the triangles $F_{o1}BC$ and $F_{e2}DE$.]
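Equation (4.13) is easy to evaluate numerically. The sketch below uses illustrative values ($f_o = 1.00$ m, $f_e = 0.05$ m); the function names are assumptions, and the lens separation $f_o + f_e$ applies to a very distant object viewed with a relaxed eye.

```python
def telescope_power(f_objective, f_eyepiece):
    """Angular magnification of a Keplerian telescope, Eq. (4.13)."""
    return -f_objective / f_eyepiece

def lens_separation(f_objective, f_eyepiece):
    """Distance between the lenses for an object at infinity and a relaxed eye."""
    return f_objective + f_eyepiece

f_o, f_e = 1.00, 0.05   # focal lengths in metres (illustrative values)
print(telescope_power(f_o, f_e), lens_separation(f_o, f_e))
```

The negative sign again indicates an inverted final image.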
Chapter 5
Polarisation
What you should know and be able to do after studying this chapter.
• Know that different states of polarization are due to the phase difference
between two orthogonal components of the electric field.
• Know that linear polarisation and circular polarisation state are special
cases.
• Know how to change linear polarisation into circular and the reverse.
Here, Ax and Ay are the real-valued (positive) amplitudes of each of the electric field components.
While k and ω are fixed in this case, we can vary Ax , Ay , ϕx and ϕy in whatever way we like!
This degree of freedom is why different states of polarisation exist: the state of polarisation is
determined by the ratio of the amplitudes and by the phase difference ϕy −ϕx between
the two orthogonal components of the light wave. Varying the quantity ϕy − ϕx means
that we are ‘shifting’ Ey (r, t) with respect to Ex (r, t) 1 . Let us consider the electric field in a
fixed plane z = 0:
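$$\begin{pmatrix} E_x(0,t) \\ E_y(0,t) \end{pmatrix}
= \begin{pmatrix} A_x \cos(-\omega t + \varphi_x) \\ A_y \cos(-\omega t + \varphi_y) \end{pmatrix}
= \mathrm{Re}\left\{ \begin{pmatrix} E_x(0) \\ E_y(0) \end{pmatrix} e^{-i\omega t} \right\}
= \mathrm{Re}\left\{ \begin{pmatrix} A_x e^{i\varphi_x} \\ A_y e^{i\varphi_y} \end{pmatrix} e^{-i\omega t} \right\}. \tag{5.3}$$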
The complex vector
$$J = \begin{pmatrix} E_x(0) \\ E_y(0) \end{pmatrix} = \begin{pmatrix} A_x e^{i\varphi_x} \\ A_y e^{i\varphi_y} \end{pmatrix} \tag{5.4}$$
is a vector commonly used to characterize the polarisation state, and it is called the Jones
vector.
Let us see what happens to the electric field vector at a fixed position in space as a function of time, for different choices of $A_x$, $A_y$ and $\varphi_y - \varphi_x$.
a) Linear polarisation: ϕy − ϕx = 0. In this case we have
$$J = \begin{pmatrix} A_x \\ A_y \end{pmatrix} e^{i\varphi_x}. \tag{5.5}$$
Equality of the phases: ϕy = ϕx , means that the field components Ex (z, t) and Ey (z, t) are
in phase: when Ex (z, t) is large Ey (z, t) is large, and when Ex (z, t) is small Ey (z, t) is small.
We can write
$$\begin{pmatrix} E_x(0,t) \\ E_y(0,t) \end{pmatrix} = \begin{pmatrix} A_x \\ A_y \end{pmatrix} \cos(\omega t - \varphi_x), \tag{5.6}$$
which shows that for ϕy − ϕx = 0 the electric field simply oscillates in one direction given
by the vector Ax x̂ + Ay ŷ. See Fig. 5.1a.
b) Circular polarisation: ϕy − ϕx = ±π/2, Ax = Ay . In this case the Jones vector is:
$$J = \begin{pmatrix} 1 \\ \pm i \end{pmatrix} A_x e^{i\varphi_x}. \tag{5.7}$$
The field components Ex (z, t) and Ey (z, t) are π/2 radians (90 degrees) out of phase: when
Ex (z, t) is large Ey (z, t) is small, and when Ex (z, t) is small Ey (z, t) is large. We can write
for z = 0 and with ϕx = 0:
$$\begin{pmatrix} E_x(0,t) \\ E_y(0,t) \end{pmatrix}
= \begin{pmatrix} A_x \cos(-\omega t) \\ A_x \cos(-\omega t \pm \pi/2) \end{pmatrix}
= A_x \begin{pmatrix} \cos(\omega t) \\ \pm \sin(\omega t) \end{pmatrix}. \tag{5.8}$$
We see that the electric field vector moves in a circle. When the wave is approaching you, i.e. you are looking towards the source, and the electric field is rotating anticlockwise, the light is called left-circularly polarised (+ sign in (5.8)); if it rotates clockwise, it is called right-circularly polarised (− sign in (5.8)). See Fig. 5.1b.
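The cases above can be summarised in a small classifier that inspects the amplitude ratio and the phase difference $\varphi_y - \varphi_x$ of a Jones vector. This is a minimal sketch: the function name and tolerance are assumptions, a phase difference of $\pi$ is also treated as linear, and the handedness of a general ellipse is assigned from the sign of the phase difference, extending the sign convention used above for the circular case.

```python
import cmath
import math

def classify_polarisation(Ex, Ey, tol=1e-9):
    """Classify a Jones vector (Ex, Ey) as linear, circular or elliptical."""
    Ax, Ay = abs(Ex), abs(Ey)
    if Ax < tol or Ay < tol:
        return "linear"                      # only one component is present
    delta = cmath.phase(Ey / Ex)             # phi_y - phi_x, wrapped to (-pi, pi]
    if abs(math.sin(delta)) < tol:
        return "linear"                      # components in phase or in antiphase
    handedness = "left" if delta > 0 else "right"
    is_quarter = abs(abs(delta) - math.pi / 2) < tol
    if is_quarter and abs(Ax - Ay) < tol * max(Ax, Ay):
        return handedness + " circular"
    return handedness + " elliptical"

print(classify_polarisation(1, 1))     # in phase
print(classify_polarisation(1, 1j))    # phi_y - phi_x = +pi/2, equal amplitudes
print(classify_polarisation(2, -1j))   # phi_y - phi_x = -pi/2, unequal amplitudes
```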
¹ KhanAcademy - Polarization of light, linear and circular: Explanation of different polarisation states and their applications.
5.2. Creating and manipulating polarisation states
which shows that the electric vector moves along an ellipse with major and minor axes
parallel to the x- and y-axis. When the + sign applies we say that the field is left elliptically
polarised, otherwise it is right elliptically polarised.
The normalised vector represents of course the same polarisation state as the unnormalised one. In general, multiplying the Jones vector by a complex number does not change the polarisation state. If we multiply for example by $e^{i\theta}$, this has the same result as changing the instant that $t = 0$, hence it does not change the polarisation state. In fact:
$$E(t) = \mathrm{Re}\left[ e^{i\theta} J e^{-i\omega t} \right] = \mathrm{Re}\left[ J e^{-i\omega (t - \theta/\omega)} \right]. \tag{5.13}$$
2. Hecht 8.1.1-8.1.4
Figure 5.1: Illustration of different types of polarisation. Top: linear polarisation; middle: circular polarisation; bottom: elliptical polarisation. The red lines indicate the field components $E_x$, $E_y$. The blue line indicates the vector $E$. The black line indicates the trajectory of $E(t)$.
light pass through a dichroic crystal (which absorbs light polarised perpendicular to its so-called
optic axis4 ). A third method is sending the light through a wire grid polariser which consists
of a metallic grating with subwavelength slits. Such a grating only transmits the electric field
component that is perpendicular to the slits.
So suppose that through one such process we have obtained linearly polarised light. How
can we change such a state of linear polarisation into circularly or elliptically polarised light?
Or how can we rotate a state of linear polarisation over a certain angle? Well, we have seen
that the polarisation state depends on the ratio of the amplitudes and on the phase difference
ϕy − ϕx of the orthogonal components Ey and Ex of the electric field. Thus, to change linearly
polarised light to some other state of polarisation, we must introduce one phase shift (say ∆ϕx ) to
one component (say Ex ), and another phase shift ∆ϕy to the orthogonal component Ey . We can
achieve this with birefringent crystals, such as calcite5 . What is special about these crystals is
that they have two different refractive indices: light polarised in one certain direction experiences
a refractive index of no , while light polarised perpendicular to it feels another refractive index
ne (the subscripts o and e stand for ‘ordinary’ and ‘extraordinary’), but for our purpose we do
not need to understand this terminology. The direction for which the refractive index is smallest
(which can be either no or ne ) is called the fast axis (since its phase velocity is largest) , and
the other direction is the slow axis. Because of there being two different refractive indices, one
can see double images through a birefringent crystal6 . The difference between the two refractive
indices ∆n = ne − no is called the birefringence.
Suppose that the fast axis corresponds to ne , and is aligned with Ey , while the slow axis
(which then is no ) is aligned with Ex . If the wave travels a distance d through the crystal, then
$E_y$ will accumulate a phase $\Delta\varphi_y = \frac{2\pi n_e}{\lambda} d$, and $E_x$ will accumulate a phase $\Delta\varphi_x = \frac{2\pi n_o}{\lambda} d$. Thus, the phase difference $\varphi_y - \varphi_x$ has increased by
$$\Delta\varphi_y - \Delta\varphi_x = \frac{2\pi}{\lambda}\, d\, (n_e - n_o). \tag{5.14}$$

⁴ Hecht §8.3.2 ‘Dichroic Crystals’.
⁵ Hecht §8.4 ‘Birefringence’.
⁶ Double Vision - Sixty Symbols: Demonstration of double refraction by a calcite crystal due to birefringence.
J̃ = MJ, (5.15)
A matrix such as M, which transforms one state of polarisation into another, is called a Jones matrix. Depending on the phase difference which a wave accumulates by travelling through the crystal, these devices are called quarter-wave plates (phase difference π/2), half-wave plates (phase difference π), or full-wave plates (phase difference 2π). The applications of these wave plates will be discussed in later sections.
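To get a feeling for the thicknesses involved, the sketch below evaluates (5.14) and inverts it for the thinnest plate giving a prescribed phase difference. The refractive indices are approximate textbook values for calcite near 589 nm and are used here only as an illustrative assumption; the function names are likewise hypothetical.

```python
import math

def retardance(d, wavelength, n_e, n_o):
    """Accumulated phase difference between E_y and E_x, Eq. (5.14)."""
    return 2 * math.pi * d * (n_e - n_o) / wavelength

def thinnest_plate(phase_difference, wavelength, n_e, n_o):
    """Smallest thickness giving |phase difference| equal to the requested value."""
    return phase_difference * wavelength / (2 * math.pi * abs(n_e - n_o))

wavelength = 589e-9          # metres (assumed, sodium D line)
n_o, n_e = 1.658, 1.486      # approximate values for calcite (assumed)

d_quarter = thinnest_plate(math.pi / 2, wavelength, n_e, n_o)
print(d_quarter)                                   # thickness in metres
print(retardance(d_quarter, wavelength, n_e, n_o)) # negative: n_e < n_o here
```

With these numbers the quarter-wave thickness is below one micrometre. The sign of the retardance only tells which of the two components is delayed.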
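Consider as an example the Jones matrix which describes the change of linearly polarised light into circularly polarised light. Assume that we have diagonally (linearly) polarised light, so that
$$J = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{5.17}$$
We want to change it into the circularly polarised state
$$J = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix}, \tag{5.18}$$
where one can check that indeed $\varphi_y - \varphi_x = \pi/2$. We can do this by passing the light through a crystal such that $E_x$ accumulates no phase difference, and $E_y$ accumulates a phase difference π/2. We can write this transformation as
$$\underbrace{\begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}}_{M} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix}. \tag{5.19}$$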
Here, the matrix (let’s call it M) is the Jones matrix describing the operation of a quarter-wave
plate8 .
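The action of this quarter-wave plate can be verified with a few lines of code. The sketch below applies the matrix of (5.19) to diagonally polarised light and reads off the resulting phase difference; the helper name `apply_jones` is an assumption made for illustration.

```python
import cmath
import math

def apply_jones(M, J):
    """Multiply a 2x2 Jones matrix M with a Jones vector J."""
    return (M[0][0] * J[0] + M[0][1] * J[1],
            M[1][0] * J[0] + M[1][1] * J[1])

QUARTER_WAVE = ((1, 0), (0, 1j))                     # matrix M of Eq. (5.19)
diagonal = (1 / math.sqrt(2), 1 / math.sqrt(2))      # Eq. (5.17)

out = apply_jones(QUARTER_WAVE, diagonal)
phase_difference = cmath.phase(out[1]) - cmath.phase(out[0])
print(out, phase_difference)   # equal moduli, phase difference pi/2
```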
Another important Jones matrix is the rotation matrix. In the preceding discussion we have assumed that the fast and slow axes were aligned with the y- and x-direction (i.e. they were parallel to $E_y$ and $E_x$). But what happens if we rotate our wave plate such that its fast and slow axes no longer coincide with $y$ and $x$, but rather with some other $y'$ and $x'$ as in Fig. 5.2? In that case we need to apply a basis transformation: the electric field vector, which is expressed in the $(x, y)$ basis, should first be expressed in the $(x', y')$ basis before we apply the Jones matrix of the wave plate to it. After we have applied the Jones matrix, we must transform the electric field vector back from the $(x', y')$ basis to the $(x, y)$ basis. By referring to Fig. 5.2 we see that
$$\begin{pmatrix} E_{x'} \\ E_{y'} \end{pmatrix}
= \begin{pmatrix} E_x \cos\theta + E_y \sin\theta \\ -E_x \sin\theta + E_y \cos\theta \end{pmatrix}
= R_{-\theta} \begin{pmatrix} E_x \\ E_y \end{pmatrix}, \tag{5.20}$$
where $E_{x'} = A_{x'} e^{i\varphi_{x'}}$ and $E_{y'} = A_{y'} e^{i\varphi_{y'}}$ are the components of $J$ on the $(x', y')$ basis, and $R_{-\theta} = R_\theta^{-1}$, with $R_\theta$ the rotation matrix over an angle θ in the counter-clockwise direction⁹:
$$R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \tag{5.21}$$
(That $R_\theta$ indeed is a rotation over angle θ counter-clockwise is easy to see by considering what happens when $R_\theta$ is applied to the vector $(1, 0)^T$.)

⁷ Hecht §8.7 ‘Retarders’.
⁸ Hecht §8.13.2 ‘The Jones Vectors’, §8.13.3 ‘The Jones and Mueller Matrices’.
If the matrix M describes the Jones matrix as defined in (5.16), then the matrix $M_\theta$ for the same wave plate but with $x'$ as slow and $y'$ as fast axis is given by:
$$M_\theta = R_\theta M R_{-\theta}. \tag{5.22}$$
Figure 5.2: If the wave plate is rotated, the fast and slow axes no longer correspond to $y$ and $x$. Instead, we have to introduce a new coordinate system $y'$, $x'$.
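The recipe (5.22) translates directly into code. The following minimal sketch builds $R_\theta M R_{-\theta}$ for 2×2 matrices stored as nested tuples. The helper names, and the choice of diag(1, i) and diag(1, −1) as quarter- and half-wave plates (no phase on $E_x$, phase π/2 or π on $E_y$), are assumptions made for illustration.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2))

def rotation(theta):
    """Counter-clockwise rotation matrix R_theta, Eq. (5.21)."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def rotated_element(M, theta):
    """Jones matrix of an element rotated over theta, Eq. (5.22)."""
    return matmul(rotation(theta), matmul(M, rotation(-theta)))

QUARTER_WAVE = ((1, 0), (0, 1j))
HALF_WAVE = ((1, 0), (0, -1))

M90 = rotated_element(QUARTER_WAVE, math.pi / 2)   # the two axes swap roles
M45 = rotated_element(HALF_WAVE, math.pi / 4)      # exchanges E_x and E_y
print(M90)
print(M45)
```

Rotating the quarter-wave plate by 90° moves the phase factor i from the y- to the x-component, as expected when the fast and slow axes are interchanged.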
Clearly, if one transmits horizontally polarised light through it, all of it will go through. However,
if one transmits vertically polarised light through it, nothing will go through. More generally, if
we transmit through it light that is polarised at an angle α, we get
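$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \cos\alpha \\ \sin\alpha \end{pmatrix} = \begin{pmatrix} \cos\alpha \\ 0 \end{pmatrix}. \tag{5.24}$$
The amplitude of the transmitted field is reduced by the factor $\cos\alpha$, which implies that the intensity of the transmitted light is reduced by the factor $\cos^2\alpha$. This relation is known as Malus’ law.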
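Malus' law can be tabulated by applying the polariser matrix of (5.24) to unit-amplitude light polarised at an angle α. This is a small sketch with hypothetical names; the intensity is taken as the squared modulus of the transmitted Jones vector.

```python
import math

POLARISER_X = ((1, 0), (0, 0))   # transmits only the x-component, as in Eq. (5.24)

def transmitted_intensity(alpha):
    """Intensity behind the x-polariser for unit-amplitude light polarised at angle alpha."""
    Ex, Ey = math.cos(alpha), math.sin(alpha)
    out_x = POLARISER_X[0][0] * Ex + POLARISER_X[0][1] * Ey
    out_y = POLARISER_X[1][0] * Ex + POLARISER_X[1][1] * Ey
    return abs(out_x) ** 2 + abs(out_y) ** 2

for degrees in (0, 30, 60, 90):
    print(degrees, transmitted_intensity(math.radians(degrees)))
```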
⁹ KhanAcademy - Linear transformation examples: Rotations
Figure 5.3: Rotation of horizontally polarised light over an angle α using a half-wave plate.
How can you determine whether it corresponds to a linear polariser or another polarisation
changing plate?
1. Linear Polariser The matrix corresponds to a linear polariser if there is a real vector
which remains invariant under M and the real vector orthogonal to the first is mapped to
zero. In other words, there must be an orthogonal basis of real eigenvectors and one of the
eigenvalues must be 1 and the other 0. Hence to check that a given matrix corresponds to
a linear polariser, one should compute the eigenvalues and show they are equal to 1 and 0
and the eigenvectors should be real and orthogonal.
2. Wave plate To show that a matrix corresponds to a wave plate, there should exist two real orthogonal eigenvectors with, in general, complex eigenvalues of modulus 1. In fact, one of the eigenvectors corresponds to the fast axis with refractive index $n_1$, say, and the other to the slow axis with refractive index $n_2$, say. The eigenvalues are then
$$e^{ikn_1 d}, \quad e^{ikn_2 d},$$
where $d$ is the thickness of the plate and $k$ is the wave number. Hence to verify that a 2×2 matrix corresponds to a wave plate, one has to compute the eigenvalues, check that these have modulus 1 and that the corresponding eigenvectors are real and orthogonal.
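The two checks above can be sketched in code. The example below computes the eigenvalues of a 2×2 matrix from its trace and determinant, tests whether the eigenvectors are real (up to a common phase factor) and orthogonal, and then inspects the eigenvalues. All function names and tolerances are assumptions. When the two eigenvalues coincide and the matrix is a multiple of the identity, the sketch simply takes the x- and y-directions as eigenvectors.

```python
import cmath
import math

def eigenpairs(M, tol=1e-12):
    """Eigenvalues and eigenvectors of a 2x2 matrix ((a, b), (c, d))."""
    (a, b), (c, d) = M
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    pairs = []
    for index, lam in enumerate(((trace + root) / 2, (trace - root) / 2)):
        # Each candidate solves one row of (M - lam I) v = 0; keep the larger one.
        candidates = [(b, lam - a), (lam - d, c)]
        v = max(candidates, key=lambda w: abs(w[0]) + abs(w[1]))
        if abs(v[0]) + abs(v[1]) < tol:      # M is a multiple of the identity
            v = (1, 0) if index == 0 else (0, 1)
        pairs.append((lam, (complex(v[0]), complex(v[1]))))
    return pairs

def is_real_direction(v, tol=1e-9):
    """True if v equals a real vector times a single complex factor."""
    pivot = max(v, key=abs)
    unit_phase = pivot / abs(pivot)
    return all(abs((x / unit_phase).imag) <= tol * abs(pivot) for x in v)

def classify_jones_matrix(M, tol=1e-9):
    (l1, v1), (l2, v2) = eigenpairs(M)
    norm1 = math.hypot(abs(v1[0]), abs(v1[1]))
    norm2 = math.hypot(abs(v2[0]), abs(v2[1]))
    inner = v1[0].conjugate() * v2[0] + v1[1].conjugate() * v2[1]
    orthogonal = abs(inner) <= tol * norm1 * norm2
    if not (orthogonal and is_real_direction(v1) and is_real_direction(v2)):
        return "other"
    if min(abs(l1 - 1) + abs(l2), abs(l2 - 1) + abs(l1)) <= tol:
        return "linear polariser"            # eigenvalues 1 and 0
    if abs(abs(l1) - 1) <= tol and abs(abs(l2) - 1) <= tol:
        return "wave plate"                  # both eigenvalues have modulus 1
    return "other"

print(classify_jones_matrix(((0.5, 0.5), (0.5, 0.5))))     # polariser at 45 degrees
print(classify_jones_matrix(((1, 0), (0, 1j))))            # quarter-wave plate
print(classify_jones_matrix(((0.5, -0.5j), (0.5j, 0.5))))  # complex eigenvectors
```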
$$J = \begin{pmatrix} A_x e^{i\varphi_x} \\ A_y e^{i\varphi_y} \end{pmatrix} = A_x e^{i\varphi_x} \begin{pmatrix} 1 \\ 0 \end{pmatrix} + A_y e^{i\varphi_y} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{5.29}$$
5.4. Decomposition of Elliptical Polarisation into Linear and Circular States
Alternatively, any elliptical polarisation state can be written as the sum of two circular polari-
sation states, one right and the other left circular polarised:
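$$J = \begin{pmatrix} A_x e^{i\varphi_x} \\ A_y e^{i\varphi_y} \end{pmatrix}
= \frac{1}{2}\left( A_x e^{i\varphi_x} - i A_y e^{i\varphi_y} \right) \begin{pmatrix} 1 \\ i \end{pmatrix}
+ \frac{1}{2}\left( A_x e^{i\varphi_x} + i A_y e^{i\varphi_y} \right) \begin{pmatrix} 1 \\ -i \end{pmatrix}. \tag{5.30}$$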
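The decomposition (5.30) can be checked numerically. In the sketch below the coefficients of the circular states $(1, i)^T$ and $(1, -i)^T$ are computed and recombined; the function names are assumptions. With the sign convention of (5.7) and (5.8), $(1, i)^T$ is the left-circular state.

```python
import cmath

def circular_components(Ex, Ey):
    """Coefficients of (1, i) and (1, -i) in the decomposition of Eq. (5.30)."""
    c_left = (Ex - 1j * Ey) / 2    # multiplies (1, i)
    c_right = (Ex + 1j * Ey) / 2   # multiplies (1, -i)
    return c_left, c_right

def recombine(c_left, c_right):
    """Rebuild the Jones vector from its two circular components."""
    return (c_left + c_right, 1j * c_left - 1j * c_right)

J = (0.8, 0.6 * cmath.exp(0.4j))       # an arbitrary elliptical state (assumed values)
c_left, c_right = circular_components(*J)
print(c_left, c_right)
print(recombine(c_left, c_right))      # reproduces J
```

For linearly polarised light along x the two coefficients are equal, so linear polarisation is an equal-weight superposition of left and right circular polarisation.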
We conclude that to study what happens to elliptic polarisation, it suffices to consider two
orthogonal linear polarisations, or, if that is more convenient, left and right circular polarised
light. In a birefringent material two linear polarisations, namely the one parallel to the o- and the
e-axis, each propagate with their own refractive index. To predict what happens to an arbitrary
linear polarisation state which is not aligned to either of these axes, or more generally what
happens to an elliptical polarisation state, we write this polarisation as a linear combination of o- and e-polarisation states, i.e. we expand the field on the o- and e-basis.
In sugar, the left and right circular polarisation states each propagate with their own refractive index. Therefore sugars are said to be circularly birefringent. To see what happens to an arbitrary elliptical polarisation state in such a material, you should write it as a linear combination of left and right circular polarisations.
1. Double Vision - Sixty Symbols: Demonstration of double refraction by a calcite crystal due
to birefringence.
2. Hecht §8.4
3. Hecht §8.7.1
4. Hecht §8.13
5. MIT OCW - Linear Polarizer: Demonstration of linear polarizers and linear polarisation.
8. MIT OCW - Quarter-wave Plate: Demonstration of the quarter-wave plate to create elliptical (in particular circular) polarisation.
Chapter 6
What you should know and be able to do after studying this chapter.
• Understand the link between the time coherence and the frequency bandwidth.
Although the model of geometrical optics helps us to design optical systems and explains many phenomena, there are also phenomena that require a more elaborate model. For example, interference fringes observed in Young’s double-slit experiment, or the Arago spot, indicate that light is more accurately modelled as a wave. From the first year of your Bachelor, you probably remember that interference maxima in the double-slit experiment occur when the difference in path length is an integer multiple of the wavelength:
$$d \sin\theta = m\lambda,$$
where d is the distance between the two slits, θ is the angle of the propagation direction of
the light, m is an integer, and λ is the wavelength of the light. You may also remember the
Huygens-Fresnel principle, that states that each of the narrow slits acts as point source, from
which spherical waves are radiated.
In this chapter we will find more results that derive from the wave model of light. We will
see how light interferes in certain cases and how the wave model of light can reduce to the ray
model of light by considering (in)coherence. In the largest part of the discussion we will assume
that all the light has the same polarization, so we can treat the fields as scalars. In the last part
we will look at how polarization affects interference, as is described in the Fresnel-Arago laws.
It is very much worth noting that the concepts of interference and coherence are not just
restricted to optics. Since quantum mechanics dictates that particles have a wave-like nature,
interference and coherence also play a role in e.g. solid state physics and quantum information.
CHAPTER 6. INTERFERENCE AND COHERENCE
of $\mathcal{U}(t)^2$, because it fluctuates so rapidly for optical frequencies. We recall the definition of the time average over an interval of length $T$ at a specific time $t$ given in (2.74) in Chapter 2:
$$\langle f(t) \rangle = \frac{1}{T} \int_{t-T/2}^{t+T/2} f(t')\, dt', \tag{6.7}$$
where $T$ is a time interval that is very large compared to the period of visible light. For a time-harmonic function, the time average is independent of $t$. Indeed for (6.6) we get
$$\begin{aligned}
I &= \langle (\mathcal{U}_1(t) + \mathcal{U}_2(t))^2 \rangle \\
  &= 4 \cos^2(\varphi/2)\, \langle \cos^2(\omega t + \varphi/2) \rangle \\
  &= 1 + \cos(\varphi),
\end{aligned} \tag{6.8}$$
where T ω >> 1. It is important to note that one can use complex notation to obtain the factor
1 + cos(ϕ) more easily. Let us write
where
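$$U_1 = 1, \quad U_2 = e^{-i\varphi}. \tag{6.10}$$
Then we find
$$\begin{aligned}
|U_1 + U_2|^2 &= |1 + e^{-i\varphi}|^2 \\
 &= (1 + e^{i\varphi})(1 + e^{-i\varphi}) \\
 &= 1 + 1 + e^{-i\varphi} + e^{i\varphi} \\
 &= 2 + 2\cos(\varphi),
\end{aligned} \tag{6.11}$$
hence
$$I = \frac{1}{2} |U_1 + U_2|^2. \tag{6.12}$$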
(To see why this works, recall Eq. (2.75) and choose $A = B = U_1 + U_2$.) In this chapter we omit the factor 1/2 in the time-averaged intensity. Hence we define $I_1 = |U_1|^2$ and $I_2 = |U_2|^2$, and we then find for the time-averaged intensity of the sum of $U_1$ and $U_2$:
$$I = |U_1 + U_2|^2 = I_1 + I_2 + 2\,\mathrm{Re}\{U_1^* U_2\}. \tag{6.13}$$
Here, $2\,\mathrm{Re}\{U_1^* U_2\}$ is known as the interference term. In the famous double-slit experiment (which we will discuss in a later section), we can interpret the terms as follows: let us say $U_1$ is the field that comes from slit 1, and $U_2$ comes from slit 2. If we were to open only slit 1, we would measure on the screen some intensity $I_1$, and if we were to open only slit 2, we would see some intensity $I_2$. If we were to open both slits at the same time, we would not simply see $I_1 + I_2$, but we would see fringes which are due to the interference term $2\,\mathrm{Re}\{U_1^* U_2\}$.
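That the complex shortcut (6.12) reproduces the time average (6.8) can be checked numerically. The sketch below averages the squared real field over a window much longer than the period and compares the result with $\frac{1}{2}|U_1 + U_2|^2$. The helper name, the units (period equal to 1) and the window length are assumptions.

```python
import cmath
import math

def time_average(f, t_centre, T, samples=100_000):
    """Midpoint-rule estimate of (1/T) * integral of f over [t - T/2, t + T/2], Eq. (6.7)."""
    dt = T / samples
    start = t_centre - T / 2
    return sum(f(start + (k + 0.5) * dt) for k in range(samples)) / samples

omega = 2 * math.pi     # angular frequency; the period is 1 in these units
phi = math.pi / 3       # phase difference between the two fields

# Direct route: average the squared real field over a window T >> period.
direct = time_average(
    lambda t: (math.cos(omega * t) + math.cos(omega * t + phi)) ** 2,
    t_centre=0.0, T=200.3)

# Shortcut: half the squared modulus of the summed complex amplitudes, Eq. (6.12).
U1, U2 = 1, cmath.exp(-1j * phi)
shortcut = 0.5 * abs(U1 + U2) ** 2

print(direct, shortcut)   # both close to 1 + cos(phi)
```

The small residual difference between the two numbers shrinks as the averaging window grows, in line with the condition $T\omega \gg 1$.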
More generally, if we want to see the intensity of a sum of multiple time-harmonic fields $U_j$ that all have the same frequency, we have to compute the coherent sum
$$I = \Big| \sum_j U_j \Big|^2. \tag{6.14}$$
However, we will see in the next section that sometimes the fields are unable to interfere. In that case all the interference terms of the coherent sum vanish, and the intensity is given by the incoherent sum
$$I = \sum_j |U_j|^2. \tag{6.15}$$
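The difference between (6.14) and (6.15) is exactly the sum of the interference terms. A minimal sketch, with hypothetical function names and an arbitrary choice of fields:

```python
import cmath
import math

def coherent_intensity(fields):
    """Squared modulus of the summed complex amplitudes, Eq. (6.14)."""
    return abs(sum(fields)) ** 2

def incoherent_intensity(fields):
    """Sum of the individual intensities, Eq. (6.15)."""
    return sum(abs(u) ** 2 for u in fields)

phi = math.pi / 3
fields = [1, cmath.exp(-1j * phi)]     # U_1 and U_2 of Eq. (6.10)

interference = coherent_intensity(fields) - incoherent_intensity(fields)
print(coherent_intensity(fields), incoherent_intensity(fields), interference)
```

For two fields the difference equals $2\,\mathrm{Re}\{U_1^* U_2\} = 2\cos\varphi$.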
6.2 Coherence
In the discussion so far we have only considered monochromatic light. The light is then perfectly
coherent. But what happens when light is not monochromatic but consists of multiple frequencies
(e.g. laser light may consist of multiple harmonics)? Or what if our light comes from multiple
sources which radiate independently (e.g. each point of an extended source such as a lamp acts as
an independent radiator)? To answer such questions, we must study the topic of coherence. One
could make a distinction between coherent and incoherent light. An intuitive way to think
about these concepts is in terms of the ability to form interference fringes: for example, with
laser light (which is coherent) one can form an interference pattern using a double slit, while
with sunlight (which is incoherent) this is much more difficult. However, it is not impossible
to create interference fringes with natural light1 : the trick is to put the two slits very closely
together. More specifically, the distance between the slits should be smaller than the coherence
length of the light. This is a very important observation: no light is actually completely coherent
or completely incoherent in practice. All light is partially coherent, but some light is more
coherent than others: laser light has a longer coherence length than sunlight, but in the end it is
possible to get an interference pattern from sunlight, or to see no interference fringes with laser
light. In the next sections we discuss these properties of light more quantitatively.
where Aω (r) is the complex amplitude of the time harmonic field with frequency ω. When there
is only a certain frequency band that contributes, then Aω = 0 for ω outside this band. We
define the complex time-dependent field U (r, t) by
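$$U(\mathbf{r}, t) = \int_0^\infty A_\omega(\mathbf{r})\, e^{-i\omega t}\, d\omega. \tag{6.17}$$
Then the real field is
$$\mathcal{U}(\mathbf{r}, t) = \mathrm{Re}\, U(\mathbf{r}, t). \tag{6.18}$$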
Note that the complex field $U(\mathbf{r}, t)$ now contains the time dependence, in contrast with the time-harmonic (i.e. single-frequency) case in Chapter 2, where the time dependence $e^{-i\omega t}$ was a separate factor.
Quasi-monochromatic field. If there is a frequency band with width ∆ω and centre fre-
quency ωc and the band is very narrow, we speak of a quasi-monochromatic field. This can in
good approximation be considered to be time-harmonic:
We see that ∆ω Aωc (r) is the complex amplitude of the time-harmonic field which previously in
Chapter 2 and the following chapters has been written as U (r).
Remark: In the present chapter U (r, t) will be the complex field which includes the time
dependence (even in the time-harmonic case).
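We now compute the intensity of polychromatic light. The instantaneous energy flux is again proportional to the square of the instantaneous real field, $\mathcal{U}(\mathbf{r}, t)^2$. We average the instantaneous intensity over the integration time $T$ of common detectors, which is very long compared to the period $2\pi/\omega$ of the field. Using the definition (2.74) and
$$\mathcal{U}(\mathbf{r}, t) = \mathrm{Re}\, U(\mathbf{r}, t) = \frac{1}{2}\left[ U(\mathbf{r}, t) + U(\mathbf{r}, t)^* \right],$$
we get
$$\begin{aligned}
\langle \mathcal{U}(\mathbf{r}, t)^2 \rangle
 &= \frac{1}{4} \langle (U(\mathbf{r}, t) + U(\mathbf{r}, t)^*)(U(\mathbf{r}, t) + U(\mathbf{r}, t)^*) \rangle \\
 &= \frac{1}{4} \left[ \langle U(\mathbf{r}, t)^2 \rangle + \langle (U(\mathbf{r}, t)^*)^2 \rangle + 2 \langle U(\mathbf{r}, t)^* U(\mathbf{r}, t) \rangle \right] \\
 &= \frac{1}{2} \langle U(\mathbf{r}, t)^* U(\mathbf{r}, t) \rangle \\
 &= \frac{1}{2} \langle |U(\mathbf{r}, t)|^2 \rangle,
\end{aligned} \tag{6.21}$$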
where the averages of U (r, t)2 and (U (r, t)∗ )2 are zero because they are oscillating functions of
time. In contrast, U (r, t)∗ U (r, t) has a DC-component which does not average to zero.
Remark: In contrast to the time-harmonic case, the long time average of polychromatic light
depends in principle on the time t at which the average is taken. However, the fields that we
consider in this chapter have frequency bands that are sufficiently narrow and hence the dura-
tion of the fields is sufficiently long that the precise time around which the average over the time
interval is computed does not matter.
We use for the intensity again the expression without the factor 1/2 in front, i.e.
$$I(\mathbf{r}) = \langle |U(\mathbf{r}, t)|^2 \rangle.$$
Hence the time-averaged intensity has been expressed in terms of the squared modulus of the complex field.
If τ is changed, the maxima of the interference pattern shift as a function of $x$, which is easy to observe. How interference fringes for tilted collimated (i.e. not converging or diverging) beams are observed in a Michelson interferometer is demonstrated in footnote 2. It is possible to obtain different fringe patterns using diverging beams instead of collimated beams, as is demonstrated in footnote 3.

Figure 6.1: A Michelson interferometer that can be used to study the temporal coherence of a field. A beam is split in two by a beam splitter, and the two beams interfere in a detector after one beam has propagated through one arm of the interferometer for a time $t$, while another beam has propagated through the other arm for a time $t + \tau$. The time difference τ can be varied by varying the length of one arm. [Figure labels: source; variable delay τ; detected intensity $I(\tau) = \langle |U(t) + U(t + \tau)|^2 \rangle$.]

² MIT OCW - Two-beam Interference - Collimated Beams: Interference of laser light in a Michelson interferometer for collimated beams.
We define the self coherence function $\Gamma(\tau)$ as
$$\Gamma(\tau) = \langle U^*(t)\, U(t + \tau) \rangle. \tag{6.24}$$
0 ≤ |γ(τ )| ≤ 1, (6.27)
Recall that we vary τ by varying the length of one of the arms in the Michelson interferometer.
Let us see what happens for different cases.
and
$$\gamma(\tau) = e^{-i\omega\tau}. \tag{6.31}$$
³ MIT OCW - Two-beam interference - Diverging Beams: Interference of laser light in a Michelson interferometer for diverging beams.
So for monochromatic light we expect to see a cosine interference pattern, which shifts as we
change the arm length of the interferometer (i.e. change τ ). For all τ , a clear interference pattern
should be observed.
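In that case:
$$\begin{aligned}
\Gamma(\tau) &= \frac{1}{4} \Big\langle \left( e^{i(\omega_c + \Delta\omega/2)t} + e^{i(\omega_c - \Delta\omega/2)t} \right) \left( e^{-i(\omega_c + \Delta\omega/2)(t+\tau)} + e^{-i(\omega_c - \Delta\omega/2)(t+\tau)} \right) \Big\rangle \\
 &\approx \frac{e^{-i(\omega_c + \Delta\omega/2)\tau} + e^{-i(\omega_c - \Delta\omega/2)\tau}}{4} \\
 &= \frac{e^{-i\omega_c \tau}}{2} \cos(\Delta\omega\, \tau/2),
\end{aligned} \tag{6.34}$$
where in the second line we assumed that the terms that oscillate in time average to 0. Hence, the complex degree of self-coherence is
$$\gamma(\tau) = \frac{\Gamma(\tau)}{\Gamma(0)} = e^{-i\omega_c \tau} \cos(\Delta\omega\, \tau/2),$$
and the interference pattern is
$$I(x, \tau) = \left[ 1 + \mathrm{Re}\left\{ \gamma(\tau)\, e^{-i k_x x} \right\} \right] = \left[ 1 + \cos(\Delta\omega\, \tau/2) \cos(\omega_c \tau + k_x x) \right]. \tag{6.36}$$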
It is seen that the interference term is the product of the function $\cos(\omega_c \tau + k_x x)$, which is a quickly oscillating function of τ, and a slowly varying envelope $\cos(\Delta\omega\, \tau/2)$. It is interesting to note that the envelope, and hence $\gamma(\tau)$, vanishes for some periodically spaced τ, which means that for certain τ the degree of self-coherence vanishes and no interference fringes form. Indeed, this is what we see in the demonstrations of footnotes 4 and 5. Note that if Δω increases, the intervals between the zeros of $\gamma(\tau)$ decrease.
If more frequencies are added, the envelope function is not a cosine function but on average decreases with τ. The typical value of τ below which interference is observed is roughly equal to half the first zero of the envelope function. This value is called the coherence time $\Delta\tau_c$.
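The two-frequency result (6.34) can be verified by carrying out the time average numerically. The sketch below uses arbitrary assumed frequencies and a window containing a whole number of beat periods, so that the oscillating cross terms average out. All names are hypothetical.

```python
import cmath
import math

omega_c = 2 * math.pi * 5.0       # centre frequency (assumed, arbitrary units)
delta_omega = 2 * math.pi * 0.1   # separation of the two frequencies (assumed)

def field(t):
    """Complex field with two equal-amplitude frequencies, as in Eq. (6.34)."""
    return 0.5 * (cmath.exp(-1j * (omega_c + delta_omega / 2) * t)
                  + cmath.exp(-1j * (omega_c - delta_omega / 2) * t))

def self_coherence(tau, T=1000.0, samples=20_000):
    """Midpoint-rule time average of U*(t) U(t + tau) over [0, T]."""
    dt = T / samples
    total = sum(field((k + 0.5) * dt).conjugate() * field((k + 0.5) * dt + tau)
                for k in range(samples))
    return total / samples

def analytic(tau):
    """Right-hand side of Eq. (6.34)."""
    return 0.5 * cmath.exp(-1j * omega_c * tau) * math.cos(delta_omega * tau / 2)

tau = 1.7
print(self_coherence(tau), analytic(tau))
print(abs(self_coherence(math.pi / delta_omega)))   # first zero of the envelope
```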
⁴ MIT OCW - Fringe Contrast - Path Difference: Demonstration of how fringe contrast varies with propagation distance.
⁵ MIT OCW - Coherence Length and Source Spectrum: Demonstration of how the coherence length depends on the spectrum of the laser light.
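Example. We consider an admittedly rather special, but simple, case in which the frequency band is $\omega_c - \Delta\omega/2 < \omega < \omega_c + \Delta\omega/2$ and the amplitudes of all frequency components are the same. Then the complex field is (the factor $1/\Delta\omega$ is for normalisation):
$$U(t) = \frac{1}{\Delta\omega} \int_{\omega_c - \Delta\omega/2}^{\omega_c + \Delta\omega/2} e^{-i\omega t}\, d\omega = \frac{\sin(\Delta\omega\, t/2)}{\Delta\omega\, t/2}\, e^{-i\omega_c t}. \tag{6.37}$$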
It is seen that the modulus of the mutual coherence is given by a sinc-function. The sinc-function
vanishes at certain τ , where the interference totally disappears, and decreases (although not
monotonically) to zero for τ → ∞. A reasonable definition of the pulse width of the field U (t)
is half the distance between the smallest positive and negative zeros, hence in this example the
pulse duration is 2π/∆ω. The coherence time can be set equal to half the first zero of the sinc
function, i.e. ∆τc = π/∆ω which for this example is half the pulse duration. We show three
examples of Re {γ(τ )} for three bandwidths in Fig. 6.2, with identical centre frequency ωc .
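The sinc behaviour can be reproduced numerically. The sketch below assumes that the degree of self-coherence is the normalised Fourier transform of the (flat) power spectrum, with the sign convention $e^{-i\omega\tau}$, and compares its modulus with the sinc envelope. The function names and parameter values are illustrative assumptions.

```python
import cmath
import math

def degree_of_coherence(tau, omega_c, delta_omega, samples=2000):
    """Normalised Fourier transform of a flat spectrum of width delta_omega around omega_c."""
    d_omega = delta_omega / samples
    start = omega_c - delta_omega / 2
    total = sum(cmath.exp(-1j * (start + (k + 0.5) * d_omega) * tau)
                for k in range(samples))
    return total / samples

def sinc_envelope(tau, delta_omega):
    """sin(x)/x with x = delta_omega * tau / 2."""
    x = delta_omega * tau / 2
    return 1.0 if x == 0 else math.sin(x) / x

omega_c = 2 * math.pi * 5.0        # assumed centre frequency (arbitrary units)
delta_omega = 2 * math.pi * 0.5    # assumed bandwidth

first_zero = 2 * math.pi / delta_omega
coherence_time = first_zero / 2    # pi / delta_omega, as in the text
for tau in (0.0, coherence_time, first_zero):
    print(tau, abs(degree_of_coherence(tau, omega_c, delta_omega)),
          abs(sinc_envelope(tau, delta_omega)))
```

Doubling the bandwidth halves both the first zero and the coherence time.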
• Readers who are familiar with stochastic signal analysis recognise in $\Gamma(\tau) = \langle U^*(t)\, U(t + \tau) \rangle$ the autocorrelation of $U(t)$. Informally, one can interpret the autocorrelation function as the ability to predict the field $U$ at time $t + \tau$ given the field at time $t$. For instance, in the limit of complete incoherence where $\Gamma(\tau) = \delta(\tau)$, we cannot predict anything about $U(t + \tau)$ given $U(t)$ for $\tau \neq 0$, meaning $U(t)$ is completely random. In the limit of complete coherence where $|\gamma(\tau)| = 1$, we can predict $U(t + \tau)$ given $U(t)$ for any τ.
• From the Wiener-Khinchin theorem we know that the Fourier transform of the self
coherence function gives the spectral power of U (t)
Using the uncertainty principle, we can see that the larger the spread of the frequencies of
U (t) (i.e. the larger the bandwidth), the more sharply peaked Γ (τ ) is. Thus, the light gets
temporally less coherent when it consists of a larger range of frequencies.
It is also possible to think of temporal coherence as the length of wavetrains/pulses emitted
by the source: the less temporally coherent the source is, the shorter the emitted wavetrains.
After all, we have seen that if a source is less temporally coherent, it emits radiation at more
frequencies, and we have seen that according to the uncertainty principle this allows shorter
pulses to be formed (see footnote 6). One can think of the coherence time τc as the duration of such a pulse
(which is inversely proportional to the bandwidth of the signal), and of the coherence length
as the corresponding distance lc = cτc. In the case of perfectly temporally coherent light, the
wavetrains would be infinitely long which is physically impossible. Therefore, there are in practice
no perfectly coherent (i.e. monochromatic) sources. Instead, if a source is nearly monochromatic,
6
See for example Heisenberg’s Microscope - Sixty Symbols, 0:20 to 2:38: Basic explanation of the uncertainty
principle (though in the context of quantum physics). Or Hecht §7.2, 7.4.2, 7.4.3.
Figure 6.2: Real part of the degree of self-coherence Re{γ} as a function of τ/τc for bandwidths,
from top to bottom, of ∆ω = 0.01ωc, ∆ω = 0.02ωc and ∆ω = 0.04ωc. In all cases ωc was the
same. The τ axes are normalised by the coherence time for the first chosen bandwidth, i.e. τ is
normalised by τc = π/(0.01ωc). It is seen that the coherence time decreases as the bandwidth
increases.
[Figure 6.3 (diagram): a screen with two pinholes at r1 and r2, a distance d apart; the fields
U(r1, t) and U(r2, t) travel at angle θ, with path length difference ∆R, to a point on a distant
screen where I(τ) = ⟨|U(r1, t) + U(r2, t + τ)|²⟩ is measured while τ is varied.]
Figure 6.3: Young's experiment can be used to study the spatial coherence of a field. A screen
with two holes at the two points of interest, r1 and r2, is used to let the fields in these points
interfere with each other on a second screen at large distance. Because the light propagates
over different distances from the two holes to the point of observation, U(r1, t) interferes with
U(r2, t + τ), where τ is the corresponding difference in propagation time. For different points
on the screen, the time difference τ varies. Due to the distance from the holes to the screen, the
amplitudes of U(r1) and U(r2) have decreased by the time they have reached the screen, but
when the distance between the screens is very large, all amplitudes have decreased by the same
factor, which can then be omitted.
In Young's experiment, a screen is used with two pinholes at the positions of the points P1
and P2. Let r1 and r2 be the position vectors of the two points. We write the field in P1 as a
superposition of time-harmonic fields as in (6.17):
\[
U(\mathbf{r}_1, t) = \int A_\omega(\mathbf{r}_1)\, e^{-i\omega t}\, d\omega. \tag{6.40}
\]
According to the Huygens-Fresnel principle, a time-harmonic disturbance with frequency ω in the
pinhole at r1 causes a radiating spherical wave behind the screen, such that the time-harmonic
field in some point r is given by
\[
A_\omega(\mathbf{r}_1)\, \frac{e^{-i\omega\left(t - |\mathbf{r}-\mathbf{r}_1|/c\right)}}{|\mathbf{r}-\mathbf{r}_1|}. \tag{6.41}
\]
7
See Hecht, §9.3.1 ‘Young’s experiment’
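The total field in r due to the pinhole at P1 is obtained by integrating over all frequencies:
\[
\int A_\omega(\mathbf{r}_1)\, \frac{e^{-i\omega\left(t - |\mathbf{r}-\mathbf{r}_1|/c\right)}}{|\mathbf{r}-\mathbf{r}_1|}\, d\omega
 = \frac{U\!\left(\mathbf{r}_1,\, t - |\mathbf{r}-\mathbf{r}_1|/c\right)}{|\mathbf{r}-\mathbf{r}_1|}. \tag{6.42}
\]
Hence, the field in r at time t due to the pinhole at r1 is proportional to the field at r1 at the
earlier time t − |r − r1|/c, i.e. earlier by the time it takes for the light to propagate from r1 to r.
The proportionality factor scales with the reciprocal distance between r and r1.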
Consider the set-up shown in Fig. 6.3. The fields from the two pinholes at r1 and r2 interfere
with each other on a second screen at large distance. Because of the difference in propagation
distance ∆R = |r − r1| − |r − r2|, there is a time difference τ between the two fields when they
arrive at the same point r on the screen, given by
\[
\tau = \frac{\Delta R}{c}. \tag{6.43}
\]
Furthermore, because of the propagation, the amplitudes are reduced by a factor proportional to
the reciprocal distance, which is different for the two fields, but if the distance z between the two
screens is large enough, we can take both factors to be 1/z and then omit this common factor.
The interference pattern on the screen is then given by
The complex degree of mutual coherence is defined by using these intensities to normalize
Γ12(τ):
\[
\gamma_{12}(\tau) = \frac{\Gamma_{12}(\tau)}{\sqrt{\Gamma_{11}(0)}\,\sqrt{\Gamma_{22}(0)}}. \tag{6.47}
\]
The modulus of γ12 is smaller than or equal to 1. We can write (6.44) as
\[
I(\tau) = I_1 + I_2 + 2\sqrt{I_1}\sqrt{I_2}\,\mathrm{Re}\{\gamma_{12}(\tau)\}. \tag{6.48}
\]
By varying the point of observation r over the screen, we can vary τ and by measuring the
intensities we can deduce the real part of γ12 (τ ). Note that γ12 (τ ) indicates the ability to form
fringes: if γ12 (τ ) is zero for a certain τ , then the interference is zero and the intensity is the sum
of the intensities of the two holes.
Let us see what happens when U (r, t) is a monochromatic field
In that case
So we get
\[
\gamma_{12}(\tau) = \frac{\Gamma_{12}(\tau)}{|A(\mathbf{r}_1)|\,|A(\mathbf{r}_2)|} = e^{-i\omega\tau + i\varphi}, \tag{6.51}
\]
where ϕ is the phase difference of A(r2) and A(r1). In this case γ12 has modulus 1, as expected
for a monochromatic field. The intensity on the screen becomes
\[
I(\tau) = |A(\mathbf{r}_1)|^2 + |A(\mathbf{r}_2)|^2 + 2|A(\mathbf{r}_1)||A(\mathbf{r}_2)|\cos(\omega\tau + \varphi). \tag{6.52}
\]
So indeed we see interference fringes as one would expect for a monochromatic wave. If ϕ = 0,
then interference maxima occur for
\[
\omega\tau = 2\pi m, \qquad m = 0, \pm 1, \pm 2, \ldots \tag{6.53}
\]
Noting that ω = 2πc/λ and ∆R = cτ, we find that maxima occur when
\[
\Delta R = m\lambda. \tag{6.54}
\]
For large distance between the screens (in the Fraunhofer limit), these path length differences
correspond to directions of the maxima given by the angles θm (see Fig. 6.3):
\[
\theta_m = \frac{\Delta R}{d} = m\,\frac{\lambda}{d}, \tag{6.55}
\]
where d is the distance between the slits and m is an integer. This result should be familiar from
secondary school treatments of Young's double slit experiment (see footnote 8).
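A short numerical illustration of (6.55) follows. The helper name `fringe_maxima_angles`, the wavelength of 500 nm and the slit distance of 0.1 mm are illustrative assumptions; the small-angle form θm = mλ/d is the one given above.

```python
def fringe_maxima_angles(wavelength, slit_distance, max_order):
    """Small-angle directions theta_m = m * wavelength / slit_distance, in radians."""
    return {m: m * wavelength / slit_distance
            for m in range(-max_order, max_order + 1)}

angles = fringe_maxima_angles(wavelength=500e-9, slit_distance=0.1e-3, max_order=2)
for order, theta in sorted(angles.items()):
    print(order, theta)
```

For these values neighbouring maxima are separated by about 5 mrad, so on a screen at 1 m distance the fringes are roughly 5 mm apart.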
For fields that are not monochromatic, the degree of mutual coherence varies with τ and is
less than 1.
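Example. Suppose that the fields in the points P1 and P2 are identical and given by (6.37):
\[
U(\mathbf{r}_1, t) = U(\mathbf{r}_2, t) = \frac{\sin(\Delta\omega t/2)}{\Delta\omega t/2}\, e^{-i\omega_c t}. \tag{6.56}
\]
This field consists of a spectrum with centre frequency ωc and bandwidth ∆ω and with constant
spectral density. The complex degree of mutual coherence can be computed. We only state the
final result, without going into the details of the derivation:
\[
\gamma_{12}(\tau) = \frac{\sin(\Delta\omega\tau/2)}{\Delta\omega\tau/2}\, e^{-i\omega_c\tau}. \tag{6.57}
\]
In particular,
\[
\mathrm{Re}\{\gamma_{12}(\tau)\} = \frac{\sin(\Delta\omega\tau/2)}{\Delta\omega\tau/2}\,\cos(\omega_c\tau), \tag{6.58}
\]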
where, as above, τ = ∆R/c ≈ dθ/c, with θ the angle between the normal on the screen and
the line connecting the point of observation to the point on the screen halfway between P1 and
P2. The factor cos(ωc τ) is equal to Re{γ12} for monochromatic light with frequency ωc. The
sinc function in (6.58) is a slowly varying envelope function that is smaller than 1 for τ > 0. For
τ = 2π/∆ω this factor vanishes and therefore the coherence time is set equal to ∆τc = π/∆ω
and the coherence length is ∆ℓc = πc/∆ω. In Fig. 6.4, the intensity
is shown as a function of angle θ for several values of the bandwidth. It is seen that between θ = 0
and the first zero, the fringe amplitudes decrease for increasing angle θ because of the increase
of the difference in propagation distance ∆R from the two holes to the point of observation.
Furthermore, for larger bandwidth the amplitudes of the fringes decrease faster with the angle.
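The fringe pattern described here can be sketched by inserting (6.58) into the interference law (6.48). The code below assumes equal intensities I1 = I2 = I0 at the two pinholes and the small-angle relation τ = dθ/c; the function names, the centre wavelength of 600 nm and the pinhole distance of 0.2 mm are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def sinc(x):
    return 1.0 if x == 0 else math.sin(x) / x

def re_gamma12(tau, omega_c, delta_omega):
    """Real part of the degree of mutual coherence, Eq. (6.58)."""
    return sinc(delta_omega * tau / 2) * math.cos(omega_c * tau)

def young_intensity(theta, slit_distance, omega_c, delta_omega, i0=1.0):
    """Eq. (6.48) with I1 = I2 = i0 and tau = d * theta / c (small angles)."""
    tau = slit_distance * theta / C
    return 2 * i0 * (1 + re_gamma12(tau, omega_c, delta_omega))

def coherence_time(delta_omega):
    return math.pi / delta_omega

def coherence_length(delta_omega):
    return C * coherence_time(delta_omega)

omega_c = 2 * math.pi * C / 600e-9   # illustrative centre wavelength of 600 nm
delta_omega = 0.05 * omega_c         # 5% relative bandwidth
print(coherence_length(delta_omega)) # about ten centre wavelengths

# Angle at which tau = 2*pi/delta_omega: the sinc envelope vanishes and the fringes disappear.
theta_zero = C * (2 * math.pi / delta_omega) / 200e-6
print(young_intensity(0.0, 200e-6, omega_c, delta_omega))
print(young_intensity(theta_zero, 200e-6, omega_c, delta_omega))
```

On axis the two contributions add fully constructively (intensity 4 I0), while at the first zero of the envelope the intensity is just the sum 2 I0 of the two pinhole intensities.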
Figure 6.4: The intensity I(θ) as function of θ for a bandwidth from top to bottom of ∆ω =
0.025ωc , ∆ω = 0.05ωc and ∆ω = 0.1ωc . In all cases ωc has the same value. It is seen that the
fringe amplitudes decrease with increasing θ due to the fact that the difference in propagation
distances ∆R, and hence τ , increases with θ. The larger the bandwidth the faster this decrease
with angle. Furthermore for certain angles there is no interference at all.
Let us interpret the mutual coherence function Γ12(τ) and the complex degree of mutual
coherence.
• We have 0 ≤ |γ12(τ)| ≤ 1 and |γ12| indicates the ability to form fringes in a double-slit
experiment. When γ12(τ) is zero there is no interference.
• We recognize Γ12(τ) = ⟨U*(r1, t)U(r2, t + τ)⟩ to be the cross-correlation of the two
signals U(r1, t) and U(r2, t).
Let us assume that z is so large that all distances |Pi Qj| in the denominators can be replaced
by z. Then the distances can be omitted and we get for the mutual coherence in Q1 and Q2 for
zero time delay τ = 0:
\[
\begin{aligned}
\Gamma_{Q_1Q_2}(0) &= \langle U(Q_1,t)^*\, U(Q_2,t)\rangle \\
&= \langle U(P_1, t-|P_1Q_1|/c)^*\, U(P_1, t-|P_1Q_2|/c)\rangle
 + \langle U(P_1, t-|P_1Q_1|/c)^*\, U(P_2, t-|P_2Q_2|/c)\rangle \\
&\quad + \langle U(P_2, t-|P_2Q_1|/c)^*\, U(P_1, t-|P_1Q_2|/c)\rangle
 + \langle U(P_2, t-|P_2Q_1|/c)^*\, U(P_2, t-|P_2Q_2|/c)\rangle \\
&= \Gamma_{P_1P_1}\!\left(\frac{|P_1Q_1|-|P_1Q_2|}{c}\right)
 + \Gamma_{P_1P_2}\!\left(\frac{|P_1Q_1|-|P_2Q_2|}{c}\right) \\
&\quad + \Gamma_{P_2P_1}\!\left(\frac{|P_2Q_1|-|P_1Q_2|}{c}\right)
 + \Gamma_{P_2P_2}\!\left(\frac{|P_2Q_1|-|P_2Q_2|}{c}\right).
\end{aligned} \tag{6.64}
\]
8
KhanAcademy - Young’s Double slit part 1
6.3. Change of Spatial Coherence due to Propagation
Figure 6.5: Two incoherent point sources P1 , P2 and two points Q1 , Q2 in a plane at distance z
from the point sources. The degree of mutual coherence at Q1 and Q2 increases to 1 for α → 0.
Similarly,
\[
\Gamma_{Q_1Q_1}(0) = \Gamma_{Q_2Q_2}(0) = 2I_0. \tag{6.66}
\]
The complex degree of mutual coherence for zero time delay becomes
\[
\gamma_{Q_1Q_2}(0) = \frac{\Gamma_{Q_1Q_2}(0)}{\sqrt{\Gamma_{Q_1Q_1}(0)}\,\sqrt{\Gamma_{Q_2Q_2}(0)}}
 = \frac{1}{2}\left[ e^{-i\frac{\omega_c}{c}\left(|P_1Q_2|-|P_1Q_1|\right)} + e^{-i\frac{\omega_c}{c}\left(|P_2Q_2|-|P_2Q_1|\right)} \right]. \tag{6.67}
\]
The degree of mutual coherence at Q1 and Q2 is thus not zero and depends on the one hand on
the optical path difference between P1 and Q1 and P1 and Q2, and on the other hand on the path
difference between P2 and Q1 and P2 and Q2.
Suppose that P1 = (a/2, 0, 0), P2 = (−a/2, 0, 0) and Qj = (xj, 0, z) for j = 1, 2. Then, for z
large:
\[
|P_1Q_j| = z\sqrt{1 + \frac{(a/2 - x_j)^2}{z^2}} \approx z + \frac{(a/2 - x_j)^2}{2z}, \tag{6.68}
\]
and hence
\[
|P_1Q_2| - |P_1Q_1| \approx \frac{(a/2 - x_2)^2 - (a/2 - x_1)^2}{2z} = \frac{a(x_1 - x_2) + x_2^2 - x_1^2}{2z}. \tag{6.69}
\]
Similarly,
\[
|P_2Q_2| - |P_2Q_1| \approx \frac{-a(x_1 - x_2) + x_2^2 - x_1^2}{2z}. \tag{6.70}
\]
Hence,
\[
\gamma_{Q_1Q_2}(0) = e^{-i\frac{\omega_c}{c}\frac{x_2^2 - x_1^2}{2z}}\,\cos\!\left[\frac{\omega_c\, a}{2cz}\,(x_1 - x_2)\right]. \tag{6.71}
\]
It is thus seen that the degree of mutual coherence depends on the angle α ≈ a/z subtended by
the two point sources at the midpoint (0, 0, z) on the screen. The smaller this angle, the higher
the degree of spatial coherence.
We see from (6.71) that by keeping Q1 fixed, we can retrieve the angle α by measuring
Re{γQ1Q2(0)} for a number of different positions of Q2.
If the light is not quasi-monochromatic and has a larger bandwidth, the modulus |ΓQ1Q2(0)|
will be smaller, which means a reduced coherence compared to the quasi-monochromatic case.
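The paraxial result can be compared numerically with (6.67) evaluated with exact distances. The sketch below uses the geometry stated above (sources at (±a/2, 0, 0), observation points at (xj, 0, z)) and writes the paraxial expression as the product of the phase factor exp(−i(ωc/c)(x2² − x1²)/(2z)) and cos[ωc a(x1 − x2)/(2cz)], which is what follows from substituting (6.69) and (6.70) into (6.67). The function names and all numerical values are illustrative assumptions.

```python
import cmath
import math

def gamma_exact(x1, x2, a, z, wavelength):
    """Eq. (6.67) with exact source-to-point distances."""
    k = 2 * math.pi / wavelength  # equals omega_c / c

    def dist(source_x, x):
        return math.hypot(source_x - x, z)

    d1 = dist(a / 2, x2) - dist(a / 2, x1)     # |P1Q2| - |P1Q1|
    d2 = dist(-a / 2, x2) - dist(-a / 2, x1)   # |P2Q2| - |P2Q1|
    return 0.5 * (cmath.exp(-1j * k * d1) + cmath.exp(-1j * k * d2))

def gamma_paraxial(x1, x2, a, z, wavelength):
    """Paraxial approximation obtained from (6.69) and (6.70)."""
    k = 2 * math.pi / wavelength
    phase = cmath.exp(-1j * k * (x2 ** 2 - x1 ** 2) / (2 * z))
    return phase * math.cos(k * a * (x1 - x2) / (2 * z))

wavelength, z = 500e-9, 10.0
x1, x2 = 0.0, 1e-3
wide = gamma_paraxial(x1, x2, 1e-3, z, wavelength)    # sources 1 mm apart
narrow = gamma_paraxial(x1, x2, 1e-4, z, wavelength)  # sources 0.1 mm apart
print(abs(gamma_exact(x1, x2, 1e-3, z, wavelength) - wide))
print(abs(wide), abs(narrow))
```

Reducing the source separation a (and hence the subtended angle α ≈ a/z) brings |γ| closer to 1, as stated in the text.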
The dependence of the spatial coherence on propagation can be understood without calculations
by realizing that the fields at Q1 and Q2 both partly consist of the fields radiated by
P1 and P2. The fields at Q1 and Q2 radiated by the point source P1 are coherent provided the
difference in propagation distance from P1 to Q1 and from P1 to Q2 is smaller than
the coherence length. The same applies to the fields at Q1 and Q2 due to point source P2.
Therefore the fields at Q1 and Q2 are correlated (i.e. they are partially coherent), even though
the fields at P1 and P2 are completely uncorrelated.
\[
V = \frac{I_{max} - I_{min}}{I_{max} + I_{min}}. \tag{6.72}
\]
For example, if we have two perfectly coherent, monochromatic point sources U1, U2 (the slits
in a double-slit experiment), each with intensity I1 = |U1|², I2 = |U2|², then the interference
pattern is given by (6.52):
\[
I(\tau) = I_1 + I_2 + 2\sqrt{I_1 I_2}\,\cos(\omega\tau + \varphi). \tag{6.73}
\]
We then get
\[
I_{max} = I_1 + I_2 + 2\sqrt{I_1 I_2}, \qquad I_{min} = I_1 + I_2 - 2\sqrt{I_1 I_2}, \tag{6.74}
\]
so
\[
V = \frac{2\sqrt{I_1 I_2}}{I_1 + I_2}. \tag{6.75}
\]
In case I1 = I2 , we find V = 1. In the opposite case where U1 and U2 are completely incoherent,
we find
I(τ ) = I1 + I2 , (6.76)
from which follows
Imax = Imin = I1 + I2 , (6.77)
which gives V = 0.
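The two limiting cases can be reproduced with a few lines of code. The sketch below evaluates (6.72) from the extrema of the interference pattern; the optional argument `gamma_modulus`, which scales the interference term as in (6.48), is an assumption added here to interpolate between full coherence (1) and complete incoherence (0). The function names are illustrative.

```python
import math

def visibility_from_extrema(i_max, i_min):
    """Eq. (6.72)."""
    return (i_max - i_min) / (i_max + i_min)

def two_beam_visibility(i1, i2, gamma_modulus=1.0):
    """Visibility of I = I1 + I2 + 2*sqrt(I1*I2)*|gamma|*cos(...)."""
    cross = 2 * math.sqrt(i1 * i2) * gamma_modulus
    return visibility_from_extrema(i1 + i2 + cross, i1 + i2 - cross)

print(two_beam_visibility(1.0, 1.0))        # equal intensities, fully coherent
print(two_beam_visibility(1.0, 4.0))        # unequal intensities reduce the visibility
print(two_beam_visibility(1.0, 1.0, 0.0))   # completely incoherent
```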
9
See Hecht §9.2.1. ‘Temporal and spatial coherence’
10
Hecht §12.4.1 ‘The Michelson Stellar Interferometer ’
6.5. Interference and polarization
Figure 6.6: Illustration of Imax and Imin of an interference pattern I(x) that determines the
visibility V.
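\[
(\mathbf{E}_1 + \mathbf{E}_2)\cdot(\mathbf{E}_1 + \mathbf{E}_2) = \mathbf{E}_1\cdot\mathbf{E}_1 + \mathbf{E}_2\cdot\mathbf{E}_2 + 2\,\mathbf{E}_1\cdot\mathbf{E}_2. \tag{6.79}
\]
This is always possible, whether the fields are polarized or randomly polarized. Then (6.79)
becomes
\[
\mathbf{E}_1\cdot\mathbf{E}_1 + \mathbf{E}_2\cdot\mathbf{E}_2 + 2\,\mathbf{E}_1\cdot\mathbf{E}_2
 = E_{1\perp}^2 + E_{2\perp}^2 + 2E_{1\perp}E_{2\perp} + E_{1\parallel}^2 + E_{2\parallel}^2 + 2E_{1\parallel}E_{2\parallel}. \tag{6.82}
\]
If the fields are randomly polarized, the time average of the ⊥-part will equal the average of the
∥-part, so the time-averaged intensity becomes
\[
\begin{aligned}
I &= 2\,\langle E_{1\perp}^2 + E_{2\perp}^2 + 2E_{1\perp}E_{2\perp}\rangle \\
  &= 2\,\langle E_{1\parallel}^2 + E_{2\parallel}^2 + 2E_{1\parallel}E_{2\parallel}\rangle.
\end{aligned} \tag{6.83}
\]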
11
See Hecht 9.1
This is qualitatively the same as what we would get if the fields had parallel polarization, e.g.
\[
\mathbf{E}_1 = \begin{pmatrix} E_{1\perp} \\ 0 \end{pmatrix}, \qquad
\mathbf{E}_2 = \begin{pmatrix} E_{2\perp} \\ 0 \end{pmatrix}. \tag{6.84}
\]
This gives us the Second Fresnel-Arago law: two fields with parallel polarization interfere
the same way as two fields that are randomly polarized. This also indicates that
our initial assumption in the previous sections that all our fields have parallel polarization is not
as limiting as it may have appeared at first.
The third Fresnel-Arago law states the following: suppose we have some field
\[
\mathbf{E} = \begin{pmatrix} E_{\perp} \\ E_{\parallel} \end{pmatrix} \tag{6.85}
\]
that is randomly polarized. Suppose we separate the two polarizations, and rotate one so that
the two resulting fields are aligned, e.g.
\[
\mathbf{E}_1 = \begin{pmatrix} E_{\perp} \\ 0 \end{pmatrix}, \qquad
\mathbf{E}_2 = \begin{pmatrix} E_{\parallel} \\ 0 \end{pmatrix}. \tag{6.86}
\]
Then these two fields cannot interfere with each other, because E⊥ and E∥ are incoherent.
Chapter 7
Scalar Diffraction Optics
What you should know and be able to do after studying this chapter.
• Know when the scalar wave equation can be used to propagate fields.
• Know the Rayleigh-Sommerfeld formula (not necessarily all details, but at
least be able to write the integral over spherical waves with amplitudes
proportional to the field in the starting plane).
• Understand intuitively how the Fourier transform says something about
resolution.
• Know how the Fresnel and Fraunhofer propagation integrals relate to Fourier
transforms.
• Know that and why propagation to the far field corresponds to taking the
Fourier transform.
• Know that and why propagation to the focal plane of a lens corresponds to
taking the Fourier transform.
• Understand how the Fourier transforming property of a lens can be used for
Fourier filtering.
In this chapter we will study the theory that describes how light propagates. Why is this
important? It is the propagation of light that revealed to us its wave-like nature: in the
double-slit experiment, we inferred from the interference pattern observed on a screen after
propagation that light is a wave. To demonstrate more convincingly that light is indeed a wave,
we require a detailed quantitative model of the propagation of light, which gives experimentally
verifiable predictions.
But a precise description of the propagation of light is not only important for fundamental
science, it also has many practical applications. For example, if we want to analyse a sample by
illuminating it and measuring the scattered light, we need to take into account the fact that the
detected light has not only been affected by the sample, but by both the sample and propagation.
Another example is lithography: if we want to print a pattern onto a substrate using a mask
that is illuminated, we need to realize that if there is a certain distance between the mask and
the photoresist, the light that reaches the resist doesn’t have the exact shape of the mask because
of propagation effects. Thus, the mask needs to be designed to compensate for this effect. These
motivations are illustrated in Fig. 7.1.
[Figure 7.1 (diagram): (a) light from two slits propagates to a screen, where an interference
pattern is observed; (b) light from a sample propagates to a detector.]
Figure 7.1: Several motivations for a quantitative model of the propagation of light. It may serve
fundamentally scientific purposes, since it would provide predictions that can be tested. It could
also be applied in sample analyses or lithography.
In Section 2.4 we have derived that in homogeneous matter (i.e. the permittivity and refractive
index are constant), every component of a time-harmonic electromagnetic field
𝒰(r, t) = Re[U(r)e^{−iωt}] satisfies the scalar Helmholtz equation (2.23):
\[
\left(\nabla^2 + k^2\right) U(\mathbf{r}) = 0, \tag{7.1}
\]
where k = ω√(εμ0) is the wave number of the light in matter with permittivity ε and refractive
index n = √(ε/ε0).
One particular solution of the Helmholtz equation (if we do not impose boundary conditions) is
the plane wave solution
\[
U(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}, \tag{7.2}
\]
where
\[
k_x^2 + k_y^2 + k_z^2 = k^2. \tag{7.3}
\]
The choice of the sign of the components of the wave vector k determines in which direction the
plane wave propagates as time progresses.
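That the plane wave (7.2) solves the Helmholtz equation (7.1) whenever (7.3) holds can be verified numerically. The following sketch approximates the Laplacian by central finite differences; the step size, the wave vector and the evaluation point are illustrative assumptions, and the residual is expected to be small but not exactly zero because of the discretisation.

```python
import cmath
import math

def plane_wave(x, y, z, kx, ky, kz):
    return cmath.exp(1j * (kx * x + ky * y + kz * z))

def helmholtz_residual(point, kvec, h=1e-3):
    """Finite-difference estimate of (laplacian + k^2) U at `point` for a plane wave."""
    x, y, z = point
    kx, ky, kz = kvec
    k_squared = kx * kx + ky * ky + kz * kz

    def u(px, py, pz):
        return plane_wave(px, py, pz, kx, ky, kz)

    centre = u(x, y, z)
    laplacian = (u(x + h, y, z) + u(x - h, y, z)
                 + u(x, y + h, z) + u(x, y - h, z)
                 + u(x, y, z + h) + u(x, y, z - h)
                 - 6 * centre) / h ** 2
    return laplacian + k_squared * centre

k = 2 * math.pi                      # wavelength 1 in arbitrary units
kx, ky = 0.3 * k, 0.4 * k
kz = math.sqrt(k * k - kx * kx - ky * ky)
residual = helmholtz_residual((0.1, -0.2, 0.3), (kx, ky, kz))
print(abs(residual), k * k)          # residual is tiny compared to k^2
```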
In the following Sections 7.1 and 7.2 we will present two equivalent methods to compute the
propagation of the field through homogeneous matter, i.e. matter of which the refractive index is
independent of position. Although both methods in the end describe the same physics, the two
different forms give physical insight into different aspects of propagation, as will be seen in
Sections 7.4 and 7.5. The two methods which we will discuss can be applied to propagate any
component U of the electromagnetic field, provided the propagation is in homogeneous matter.
With this assumption the two methods give identical and rigorous results.
When the refractive index is not constant, Maxwell's equations are no longer equivalent
to the wave equation for the individual electromagnetic field components, and there is then
coupling of the components due to the curl operators in Maxwell's equations. When the variation of
the refractive index is slow on the scale of the wavelength, the scalar wave equation may still be
a good approximation, but for structures that vary on the scale of the wavelength (i.e. on the
scale of ten microns or less), the scalar wave equation is not sufficiently accurate.
7.1. Angular Spectrum Method
Figure 7.2: Illustration of what is calculated. We know the field U in the plane z = 0. Given
that field U (x, y, 0), we want to find U in the plane z. It is assumed that the field propagates in
the positive z-direction, which means that all sources are in z < 0.
Our goal is to find what the field looks like in some plane z = constant, given the field
in the plane z = 0, as is illustrated in Fig. 7.2. The sources of the field are assumed to be
in the half space z < 0. One way to see how light propagates from one plane to another is
by using the angular spectrum method. We decompose the field in plane waves with a
two-dimensional Fourier transform. Since we know how each plane wave propagates, we can
propagate each component separately and then add them all together by taking the inverse
Fourier transform. Mathematically, it is described as follows: we know the field U(x, y, 0). We
will write U0(x, y) = U(x, y, 0) for convenience and apply a two-dimensional Fourier transform
to U0:
\[
\mathcal{F}(U_0)(\xi, \eta) = \iint U_0(x, y)\, e^{-2\pi i(\xi x + \eta y)}\, dx\, dy. \tag{7.4}
\]
The most important properties of the Fourier transform are listed in Appendix E. By defining
The variables in the Fourier plane, (ξ, η) and (kx, ky), are called spatial frequencies.
Equation (7.7) says we can write U0(x, y) = U(x, y, z = 0) as an integral (a sum) of plane
waves (see footnote 1) with wave vector k = (kx, ky, kz)^T, each with weight F(U0)(kx/2π, ky/2π).
We know how each plane wave with complex amplitude F(U0)(kx/2π, ky/2π) and wave vector
k = (kx, ky, kz)^T propagates over a distance z > 0:
\[
\mathcal{F}(U_0)\!\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}\right) e^{i(k_x x + k_y y)}
\;\rightarrow\;
\mathcal{F}(U_0)\!\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}\right) e^{i(k_x x + k_y y + k_z z)}. \tag{7.8}
\]
Thus, the field U(x, y, z) in the plane z (for some z > 0) is given by
\[
U(x, y, z) = \frac{1}{4\pi^2}\iint \mathcal{F}(U_0)\!\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}\right) e^{i(k_x x + k_y y + k_z z)}\, dk_x\, dk_y, \tag{7.9}
\]
where
\[
k_z = \sqrt{\left(\frac{2\pi}{\lambda}\right)^2 - k_x^2 - k_y^2}, \tag{7.10}
\]
with λ the wavelength of the light as measured in the material (hence, λ = λ0/n, with λ0 the
wavelength in vacuum). The sign in front of the square root in (7.10) could in principle be chosen
negative: one would then also obtain a solution of the Helmholtz equation. The choice of the
sign of kz is determined by the direction in which the light propagates, which in turn depends
on the location of the sources. We have to choose here the + sign because the time dependence
is given by e^{−iωt} and the sources are assumed to be in z < 0.
We can write (7.9) alternatively as
Note that one can interpret this as a diagonalization of the propagation operator, as explained
in Appendix F.
We can observe something interesting: if kx² + ky² > (2π/λ)², then kz becomes imaginary, and
e^{i kz z} decays exponentially for increasing z:
\[
e^{i\left[k_x x + k_y y + z\sqrt{(2\pi/\lambda)^2 - k_x^2 - k_y^2}\,\right]}
 = e^{i(k_x x + k_y y)}\, e^{-z\sqrt{k_x^2 + k_y^2 - (2\pi/\lambda)^2}}. \tag{7.13}
\]
These exponentially decaying waves are evanescent in the positive z-direction. We have met
evanescent waves already in the context of total internal reflection discussed in Section 2.10.5.
The physical consequences of evanescent waves in the angular spectrum decomposition will be
explained in Section 7.4.
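The recipe of the angular spectrum method (transform, multiply each plane wave by e^{i kz z}, transform back) can be sketched in a few lines. The code below is a one-dimensional simplification of (7.9) on a periodic grid, written with a naive discrete Fourier transform so that it needs only the standard library; the function names, grid sizes and wavelengths are illustrative assumptions, and a practical implementation would use a two-dimensional FFT instead.

```python
import cmath
import math

def dft(values, sign):
    """Naive discrete Fourier transform; sign=-1 forward, sign=+1 inverse (unnormalised)."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * math.pi * m * idx / n) for idx, v in enumerate(values))
        for m in range(n)
    ]

def angular_spectrum_propagate(u0, dx, wavelength, z):
    """Propagate samples u0 (spacing dx) over a distance z > 0 in a homogeneous medium."""
    n = len(u0)
    k = 2 * math.pi / wavelength
    spectrum = dft(u0, -1)
    propagated = []
    for m, amplitude in enumerate(spectrum):
        m_signed = m if m < n // 2 else m - n       # signed spatial-frequency index
        kx = 2 * math.pi * m_signed / (n * dx)
        kz = cmath.sqrt(k * k - kx * kx)            # real if propagating, +i|kz| if evanescent
        propagated.append(amplitude * cmath.exp(1j * kz * z))
    return [value / n for value in dft(propagated, +1)]

n, wavelength = 32, 0.5e-6

# A propagating plane wave only acquires the phase exp(i*kz*z).
dx, z, mode = 1e-6, 10e-6, 3
u0 = [cmath.exp(2j * math.pi * mode * idx / n) for idx in range(n)]
kx0 = 2 * math.pi * mode / (n * dx)
kz0 = math.sqrt((2 * math.pi / wavelength) ** 2 - kx0 ** 2)
expected = [v * cmath.exp(1j * kz0 * z) for v in u0]
u_z = angular_spectrum_propagate(u0, dx, wavelength, z)
max_error = max(abs(a - b) for a, b in zip(u_z, expected))

# A wave with kx > k (sub-wavelength period) is evanescent and has almost vanished after 1 micron.
fine_dx = 0.1e-6
u_fast = [cmath.exp(2j * math.pi * 10 * idx / n) for idx in range(n)]
u_evanescent = angular_spectrum_propagate(u_fast, fine_dx, wavelength, 1e-6)
print(max_error, max(abs(v) for v in u_evanescent))
```

The principal branch of `cmath.sqrt` returns a positive imaginary number for a negative real argument, which corresponds to the + sign chosen in (7.10) and gives decaying rather than growing evanescent waves.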
1
Every picture is made of waves - Sixty Symbols, 3:33 to 7:15: Basic explanation of Fourier transforms. Also
see section 7.4
7.2. Rayleigh-Sommerfeld Diffraction Integral
Figure 7.3: The plane waves in the angular spectrum of a time-harmonic field which propagates in
the z-direction are parametrized by kx, ky. There are two types of waves: the propagating waves,
which correspond to √(kx² + ky²) < k and which have constant amplitude, and the evanescent waves,
for which √(kx² + ky²) > k and of which the amplitude decreases exponentially with propagation.
Remark. In homogeneous space the scalar Helmholtz equation for every electric field com-
ponent is equivalent to Maxwell’s equations and hence we may propagate each component Ex ,
Ey and Ez individually using the angular spectrum method. If the data in the plane z = 0 of
these field components are physically consistent, the thus obtained electric field will automatically
satisfy the condition that the electric field is free of divergence, i.e.
∇ · E = 0, (7.14)
everywhere in z > 0. This is equivalent to the statement that the electric vectors of the plane
waves in the angular spectrum are perpendicular to their wave vectors. Alternatively, one can
propagate only the Ex and Ey -components and afterwards determine Ez from the condition that
(7.14) must be satisfied.
where we defined
\[
r = \sqrt{(x - x')^2 + (y - y')^2 + z^2}. \tag{7.16}
\]
Remark. In (7.15) there is an additional factor z/r compared to the expressions for a time-
harmonic spherical wave given in (2.51) and at the right-hand side of (6.41). This factor means
that the spherical waves in the Rayleigh-Sommerfeld diffraction integral have amplitudes that
depend on the angle of radiation (although their wave front is spherical), the amplitude being
largest in the forward direction.
What happens when an object is illuminated and the reflected or transmitted light is detected
at some distance from the object? Let us look at transmission for example. When the object is
much larger than the wavelength, a transmission function τ (x, y) is often defined and the field
transmitted by the object is then assumed to be simply the product of the incident field and the
function τ (x, y). For example, for a hole in a metallic screen with diameter large compared to
the wavelength, the transmission function would be 1 inside the hole and 0 outside. However, if
the object has features of the size of the order of the wavelength, this simple model breaks down
and the transmitted field must instead be determined by solving Maxwell’s equations. This is
not easy but there exist software packages that can do it.
Now suppose that the transmitted electric field has been obtained in a plane z = 0 very close
to the object (a distance within a fraction of a wavelength). This field is called the transmit-
ted near field and it may have been obtained by simply multiplying the incident field with a
transmission function τ (x, y) or by solving Maxwell’s equations. This transmitted near field is
a kind of footprint of the object. But it should be clear that although it is quite common in
optics to speak in terms of "imaging an object", strictly speaking we do NOT image an object
as such, but we image the transmitted (or reflected) near field which is a kind of copy of the object.
After the transmitted near field has been obtained, we apply the angular spectrum method
to propagate the individual components through homogeneous matter (e.g. air) from the object
to the detector plane or to an optical element like a lens. The first step is to Fourier transform
the transmitted component U0(x, y) = U(x, y, 0). So what is a spatial Fourier transform? A
spatial Fourier transform decomposes the component into plane waves. To each plane wave,
characterized by the wave numbers kx and ky, it assigns a complex amplitude F(U0)(kx/2π, ky/2π),
the magnitude of which indicates how important the role is which this particular wave plays in
the formation of the object. So what can we say about an object U0(x, y), simply by looking at
the magnitude of its spatial Fourier transform |F(U0)(kx/2π, ky/2π)|?
7.4. Intuition for the Spatial Fourier Transform in Optics
Suppose U0(x, y) has sharp features, i.e. there are regions where U0(x, y) varies rapidly as a
function of x and y. To describe these features as a combination of plane waves, these waves
must also vary rapidly as a function of x and y, which means that the length of their wave vectors
√(kx² + ky²) must be large. Thus, the more sharp features U0(x, y) has, the larger we can expect
|F(U0)(kx/2π, ky/2π)| to be for large √(kx² + ky²), i.e. high spatial frequencies can be expected
to have large amplitude. Similarly, the slowly varying, broad features of U0(x, y) are described
by slowly fluctuating waves, i.e. by F(U0)(kx/2π, ky/2π) for small √(kx² + ky²), i.e. for low
spatial frequencies. This is sketched in Fig. 7.4.
To illustrate these concepts we choose a certain field, take its Fourier transform, remove the
higher spatial frequencies and then invert the Fourier transform. We then expect that the result-
ing field has lost its sharp features and only retains its broad features, i.e. the image is blurred.
Conversely, if we remove the lower spatial frequencies but retain the higher, then the result will
only show its sharp features, i.e. its contours. These effects are shown in Fig. 7.5.
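The same experiment can be tried on a one-dimensional "image". The sketch below low-pass and high-pass filters a rectangular profile (a slit with two sharp edges) using a naive discrete Fourier transform; the signal, the cutoff index and the function names are illustrative assumptions.

```python
import cmath
import math

def dft(values, sign):
    """Naive discrete Fourier transform; sign=-1 forward, sign=+1 inverse (unnormalised)."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * math.pi * m * idx / n) for idx, v in enumerate(values))
        for m in range(n)
    ]

def fourier_filter(signal, keep):
    """Zero every spatial frequency whose signed index fails keep(index), then transform back."""
    n = len(signal)
    spectrum = dft(signal, -1)
    for m in range(n):
        m_signed = m if m < n // 2 else m - n
        if not keep(m_signed):
            spectrum[m] = 0
    return [(v / n).real for v in dft(spectrum, +1)]

signal = [1.0 if 24 <= idx < 40 else 0.0 for idx in range(64)]   # slit with two sharp edges
cutoff = 4
low = fourier_filter(signal, lambda m: abs(m) <= cutoff)    # broad features only: blurred slit
high = fourier_filter(signal, lambda m: abs(m) > cutoff)    # sharp features only: the edges

steepest_original = max(abs(signal[i + 1] - signal[i]) for i in range(63))
steepest_blurred = max(abs(low[i + 1] - low[i]) for i in range(63))
print(steepest_original, steepest_blurred)
print(max(abs(v) for v in high), abs(high[0]))   # high-pass output is concentrated at the edges
```

Because the two masks are complementary, the low-pass and high-pass results add up to the original profile; the blurred version has lost its steep edges, while the high-pass version is largest around the two edges.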
Recall the observation we made about Eq. (7.13): if kx² + ky² > (2π/λ)², the plane wave decays
exponentially as the field propagates. We have seen that losing these high spatial frequencies
leads to loss of resolution: propagation of light leads to irrecoverable loss of resolution.
Because by propagation through homogeneous space all the information contained in the high
spatial frequencies corresponding to evanescent waves is lost (only exponentially small amplitudes
of the evanescent waves remain), perfect imaging is impossible no matter how well designed an
optical system is. It is this fact that motivates near-field microscopy which tries to detect these
evanescent waves by scanning close to the sample, thus obtaining a high resolution. Another
relevant research topic is hyperbolic metamaterials, in which evanescent waves can propagate,
rather than decay exponentially3 .
Remark. The importance of the phase for the field can also be seen by looking at the plane wave
expansion (7.9). We have seen that the field in a plane z = constant can be obtained by
propagating the plane waves by multiplying their amplitudes by the phase factors e^{i kz z}, which depend
on the propagation distance z. If one leaves the evanescent waves out of consideration (which
after some distance do not contribute to the field anymore anyway), it follows that ONLY the
phases of the plane waves change upon propagation while their amplitudes (the moduli of their
complex amplitudes) do NOT change. Yet, depending on the propagation distance z, widely
differing light patterns are obtained (see e.g. Fig. 7.8).
3
Hyperbolic materials allow in principle super-resolution. They are treated in the Master course "Advanced
Photonics"
4
See the course Advanced Photonics
Figure 7.4: A qualitative interpretation of spatial Fourier transforms. The low spatial frequencies
(i.e. small √(kx² + ky²)) represent slow fluctuations, and therefore contribute to the broad features
of the real-space object. The high spatial frequencies (i.e. large √(kx² + ky²)) fluctuate rapidly, and
can therefore form sharp features in the real-space object.
Another important qualitative aspect of the Fourier transform is the uncertainty principle. It
states that many waves of different frequencies have to be added to get a function that is
confined to a small space (see footnote 5). Stated differently, if U(x, y) is confined to a very small region, then
F(U)(kx, ky) must be very spread out. This can also be illustrated by the scaling property of
the Fourier transform:
\[
\text{if } h(x) = f(ax) \text{ then } \mathcal{F}(h)\!\left(\frac{k_x}{2\pi}\right) = \frac{1}{|a|}\,\mathcal{F}(f)\!\left(\frac{k_x}{2\pi a}\right), \tag{7.17}
\]
which simply states that the more h(x) is squeezed, the more its Fourier transform F(h) spreads
out. This principle is shown in Fig. 7.7. Perhaps you are more familiar with the uncertainty
principle in the context of quantum physics: a particle cannot have both a definite momentum
and a definite position. In fact, this is just one particular manifestation of the uncertainty
principle just described. A quantum state |ψ⟩ can be described in the position basis ψx(x) as
well as in the momentum basis ψp(p). The basis transformation that links these two expressions
is the Fourier transform
\[
\psi_p(p) = \mathcal{F}\{\psi_x(x)\}(p), \tag{7.18}
\]
so of course the two are subject to the uncertainty principle! In fact, any two quantum observ-
ables which are related by Fourier transform (also called conjugate variables), such as position
and momentum, or voltage and electric charge, have this uncertainty relation. The uncertainty
relation roughly says that:
[5] Heisenberg’s Microscope - Sixty Symbols, 0:20 to 2:38: Basic explanation of the uncertainty principle (though
in the context of quantum physics).
7.4. Intuition for the Spatial Fourier Transform in Optics
Figure 7.5: Demonstration of the roles of different spatial frequencies. By removing the high
spatial frequencies, only the broad features of the image remain: we lose resolution. If the low
spatial frequencies are removed only the sharp features (i.e. the contours) in the image remain.
CHAPTER 7. SCALAR DIFFRACTION OPTICS
Figure 7.6: Demonstration of the role of the phase of the spatial Fourier transform. If the ampli-
tude information is removed, but phase information is kept, some features of the original image
are still recognizable. However, if the phase information is removed but amplitude information
is kept, the original image is completely lost.
If a function f (x) has width ∆x, its Fourier transform has a width ∆kx ≈ 2π/∆x.
Since after propagation over a distance z, the evanescent waves do not contribute to the Fourier
transform of the field, it follows that this Fourier transform has maximum width ∆kx = k. By
the uncertainty principle it follows that after propagation, the minimum width of the field is
∆x, ∆y ≈ 2π/k = λ. Hence, the minimum feature size of a field after propagation is
of the order of the wavelength. This poses a fundamental limit to resolution given by the
wavelength of the light.
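The trade-off between the two widths can be checked numerically. The sketch below is only an illustration: the Gaussian test function, the grid parameters and the use of RMS widths of the intensity are assumptions made here, not choices from these notes. With that width definition the product of the widths equals 1/2 rather than 2π; the point is that the product stays fixed while the individual widths trade off.

```python
import numpy as np

def rms_width(axis, density):
    """RMS width of a non-negative density sampled on a uniform axis."""
    p = density / density.sum()
    mean = (axis * p).sum()
    return np.sqrt(((axis - mean) ** 2 * p).sum())

def gaussian_widths(sigma, n=4096, span=200.0):
    """Return (width in x, width in kx) for f(x) = exp(-x^2 / (2 sigma^2))."""
    x = (np.arange(n) - n // 2) * (span / n)
    f = np.exp(-x**2 / (2 * sigma**2))
    # ifftshift puts x = 0 at index 0, fftshift puts kx = 0 back in the centre
    spectrum = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(f)))
    kx = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(n, d=span / n))
    return rms_width(x, np.abs(f) ** 2), rms_width(kx, np.abs(spectrum) ** 2)

for sigma in (0.5, 1.0, 2.0):
    dx, dk = gaussian_widths(sigma)
    print(f"sigma={sigma}: dx={dx:.4f}, dkx={dk:.4f}, product={dx * dk:.4f}")
```

Squeezing the function in x (smaller sigma) widens its spectrum by the same factor, as the scaling property (7.17) predicts.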
[Figure: three panels, each showing a field U(x, y) next to its Fourier transform Û(kx , ky ).]
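\begin{align}
U(x, y, z) &= \frac{1}{i\lambda} \iint U_0(x', y')\, \frac{z}{r}\, \frac{e^{ikr}}{r}\, dx'\, dy' \tag{7.20} \\
&\approx \frac{1}{i\lambda z} \iint U_0(x', y')\, e^{ikr}\, dx'\, dy'. \tag{7.21}
\end{align}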
The reason why we can NOT apply the same approximation for r in the exponent, is because
there r is multiplied by k = 2π/λ which is very large, so any error introduced by approximating
r would be magnified significantly by k. To approximate r in the exponent eikr we need to be
more careful, and instead apply a Taylor expansion. Recall
\[ r = \sqrt{(x - x')^2 + (y - y')^2 + z^2} = z\sqrt{\frac{(x - x')^2 + (y - y')^2}{z^2} + 1}. \tag{7.22} \]
We know that for a small number s we can expand (compare (6.69)):
\[ \sqrt{s + 1} = 1 + \frac{s}{2} - \frac{s^2}{8} + \ldots \tag{7.23} \]
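Since we assumed that z is large, $\frac{(x - x')^2 + (y - y')^2}{z^2}$ is small, so we can expand
\[
\begin{aligned}
r &= z\sqrt{\frac{(x - x')^2 + (y - y')^2}{z^2} + 1} \\
&\approx z\left(1 + \frac{(x - x')^2 + (y - y')^2}{2z^2}\right) \\
&= z + \frac{(x - x')^2 + (y - y')^2}{2z} \qquad \text{Fresnel approximation}
\end{aligned}
\tag{7.24}
\]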
With this approximation, we arrive at the Fresnel diffraction integral, which can be written
in different forms
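\[
\begin{aligned}
U(x, y, z) &\approx \frac{e^{ikz}}{i\lambda z} \iint U_0(x', y')\, e^{\frac{ik}{2z}\left[(x - x')^2 + (y - y')^2\right]}\, dx'\, dy' \\
&= \frac{e^{ikz}\, e^{\frac{ik(x^2 + y^2)}{2z}}}{i\lambda z} \iint U_0(x', y')\, e^{\frac{ik(x'^2 + y'^2)}{2z}}\, e^{-ik\left(\frac{x}{z}x' + \frac{y}{z}y'\right)}\, dx'\, dy' \\
&= \frac{e^{ikz}\, e^{\frac{ik(x^2 + y^2)}{2z}}}{i\lambda z}\, \mathcal{F}\left\{U_0(x', y')\, e^{\frac{ik(x'^2 + y'^2)}{2z}}\right\}\left(\frac{x}{\lambda z}, \frac{y}{\lambda z}\right).
\end{aligned}
\tag{7.25}
\]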
7.5. The Fresnel and Fraunhofer Approximations
The Fresnel integral is the Fourier transform of the field U0 (x′, y′) multiplied by
the Fresnel propagator $e^{\frac{ik(x'^2 + y'^2)}{2z}}$.
Remark. By Fourier transforming (7.25), one gets the plane wave amplitudes of the Fresnel
integral. It turns out that these amplitudes are equal to F(U0 ) multiplied by a phase factor.
This phase factor is a paraxial approximation of the exact phase factor given by exp(izkz ), i.e.
it contains as exponent the parabolic approximation of kz . Therefore the Fresnel approximation
is also called the paraxial approximation. In fact, it can be shown that the Fresnel diffraction
integral is a solution of the paraxial wave equation and conversely, that every solution of the
paraxial wave equation can be written as a Fresnel diffraction integral.
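The parabolic approximation of kz mentioned in this remark can be written out explicitly. The short expansion below is a sketch that uses only the Taylor series (7.23), and it is valid when $k_x^2 + k_y^2 \ll k^2$:

```latex
\[
k_z = \sqrt{k^2 - k_x^2 - k_y^2}
    = k\sqrt{1 - \frac{k_x^2 + k_y^2}{k^2}}
    \approx k - \frac{k_x^2 + k_y^2}{2k},
\qquad\text{so that}\qquad
e^{izk_z} \approx e^{ikz}\, e^{-iz\frac{k_x^2 + k_y^2}{2k}} .
\]
```

The exponent is quadratic in kx and ky, which is why the approximation is called parabolic.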
\begin{align}
r &\approx z + \frac{(x - x')^2 + (y - y')^2}{2z} && \text{Fresnel approximation} \tag{7.26} \\
&\approx z + \frac{x^2 + y^2 - 2xx' - 2yy'}{2z} && \text{Fraunhofer approximation.} \tag{7.27}
\end{align}
We have thus omitted the quadratic terms $x'^2 + y'^2$, so with respect to the Fresnel diffraction
integral, we simply omit $e^{\frac{ik(x'^2 + y'^2)}{2z}}$ to obtain the Fraunhofer diffraction integral
\[ U(x, y, z) \approx \frac{e^{ikz}\, e^{\frac{ik(x^2 + y^2)}{2z}}}{i\lambda z}\, \mathcal{F}(U_0)\left(\frac{x}{\lambda z}, \frac{y}{\lambda z}\right). \tag{7.28} \]
The far field of U0 (x′, y′) is simply its Fourier transform with an additional quadratic
phase factor [6].
Note that the coordinates in which we have to evaluate F(U0 ) scale with 1/z, and the overall
field U (x, y, z) also scales with 1/z. This means that as you choose z larger (i.e. you propagate
the field further), the field simply spreads out without changing its shape, and its amplitude goes
down. Hence the field diverges as the propagation distance z increases:
Eventually, for sufficiently large propagation distances, light always spreads out
while preserving the shape of the light distribution.
Remark 1. The Fresnel integral is, like the Fraunhofer integral, also a Fourier transform,
evaluated in spatial frequencies which depend on the point of observation:
\[ \xi = \frac{x}{\lambda z}, \qquad \eta = \frac{y}{\lambda z}. \tag{7.29} \]
However, in contrast to the Fraunhofer integral, the Fresnel integral depends additionally in a
different way on the propagation distance z, namely the propagator in the integrand also depends
on z. This is the reason that the Fresnel integral does not merely depend on the ratios x/z and
y/z but in a more complicated manner on the position of the point of observation. Therefore
the Fresnel integral can yield quite diverse patterns depending on the value of the propagation
distance z, as is shown in Fig. 7.8.
[6] Also see Hecht §7.4.4, subsection ‘Fourier Analysis and Diffraction’.
Remark 3. The points of observation where the Fraunhofer formulae can be used must in
any case satisfy:
\[ \frac{x}{z} < 1, \qquad \frac{y}{z} < 1. \tag{7.32} \]
The reason is that the Fresnel approximation already fails when x/z > 1, hence so does Fraunhofer.
On top of that, when x/z > 1, the spatial frequency $k_x = \frac{2\pi x}{z\lambda} > k$, which corresponds to an
evanescent wave. An evanescent wave obviously cannot contribute to the Fraunhofer far field.
[Figure 7.8: four plots of intensity I versus x. Top left: NF = 0.01. Top right: NF = 4. Bottom left: NF = 1.0. Bottom right: NF = 10.]
Figure 7.8: An example of Fresnel fields of a slit. The distance to the slit increases from very
close to the slit at the bottom right, to increasing intermediate distances in the top right and the
bottom left figures, to very large propagation distances in the top left figure. $N_F = D^2/(\lambda z)$ is the
Fresnel number. In the top left, the Fresnel and Fraunhofer approximations give identical results
equal to the Fourier transform of the slit, which is the sinc function. The size of the slit relative
to the pattern is shown in each case by the black bar and hence the pattern becomes broader with
propagation distance. Note: the decrease of amplitude with distance is NOT shown. Increasing
the distance beyond that for the case shown in the top left figure does not change the shape of
the pattern anymore: it only gets broader while the amplitude decreases.
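The qualitative behaviour described in this caption can be reproduced with a few lines of numerical integration. The sketch below is not the code used for Fig. 7.8: it assumes a one-dimensional version of the Fresnel integral (prefactor $1/\sqrt{i\lambda z}$ instead of $1/(i\lambda z)$), a unit-amplitude slit, and illustrative values for the wavelength and slit width.

```python
import numpy as np

def fresnel_slit_1d(x, z, wavelength, width, n_src=4001):
    """1D Fresnel integral of a unit-amplitude slit, by trapezoidal quadrature."""
    k = 2 * np.pi / wavelength
    xs = np.linspace(-width / 2, width / 2, n_src)       # source points x'
    w = np.full(n_src, xs[1] - xs[0])                     # trapezoid weights
    w[0] = w[-1] = 0.5 * (xs[1] - xs[0])
    kernel = np.exp(1j * k * (x[:, None] - xs[None, :]) ** 2 / (2 * z))
    return np.exp(1j * k * z) / np.sqrt(1j * wavelength * z) * (kernel @ w)

wavelength, width = 500e-9, 1e-3                          # illustrative values
for fresnel_number in (10, 1.0, 0.01):
    z = width**2 / (wavelength * fresnel_number)
    # observation window: a few slit widths or a few far-field lobes, whichever is larger
    x = np.linspace(-3, 3, 601) * max(width, wavelength * z / width)
    intensity = np.abs(fresnel_slit_1d(x, z, wavelength, width)) ** 2
    print(f"NF={fresnel_number}: z={z:.3g} m, on-axis intensity={intensity[300]:.3f}")
```

For the smallest Fresnel number the normalised intensity should be close to the squared sinc function, i.e. the Fraunhofer pattern of the slit; for larger Fresnel numbers the pattern stays roughly as wide as the slit and shows ripples.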
The Fraunhofer far field of a rectangular hole in a plane at large distance z is obtained by
substituting (7.37) into (7.28).
• The first zero along the x-direction from the centre x = 0 occurs for
\[ x = \pm\frac{\lambda z}{a}. \tag{7.38} \]
The distance between the first zeros is 2λz/a and is thus larger when the width of the
slit is smaller.
• The inequalities (7.32) imply that when a < λ, the far field pattern does not have
any zeros as function of x. It is then not possible to deduce the width a from the
Fraunhofer intensity. This is an illustration of the fact that details of size less than
the wavelength can not propagate to the far field.
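The two statements above are easy to check with numbers. The sketch below assumes a one-dimensional cut through the far field of the rectangular hole, so that the normalised intensity is a squared sinc function, and uses illustrative values for the wavelength, the distance and the widths.

```python
import numpy as np

def slit_far_field_intensity(x, z, wavelength, a):
    """Normalised Fraunhofer intensity of a slit of width a (1D cut)."""
    return np.sinc(a * x / (wavelength * z)) ** 2   # np.sinc(t) = sin(pi t) / (pi t)

def first_zero(z, wavelength, a):
    """Position of the first zero, x = lambda z / a, as in (7.38)."""
    return wavelength * z / a

wavelength, z = 500e-9, 1.0                          # illustrative values
for a in (100e-6, 2e-6, 0.4e-6):
    x0 = first_zero(z, wavelength, a)
    reachable = x0 / z < 1                           # condition (7.32)
    print(f"a={a:.1e} m: first zero at x={x0:.3g} m, inside x/z < 1: {reachable}")
```

For the narrowest slit (a smaller than the wavelength) the first zero would lie at x/z > 1, outside the region where the Fraunhofer formula applies, so no zero is observed.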
an infinite series of slits with finite width is given by the convolution X∆ (x) ⊗ Wslit (x).
If we want the number of slits to be finite, we multiply the expression with another block
function Warray (x) to get
The diffraction pattern in the far field is given by the Fourier transform of the trans-
mitted near field. If the incident illumination is a perpendicular plane wave with unit
amplitude, the transmitted near field is simply τ (x). Using the fact that convolutions in
real space correspond to products in Fourier space and vice versa, and using the fact that
F{X∆ (x)} = (1/∆)X1/∆ (ξ), we find
\[ \mathcal{F}(\tau) = X_{1/\Delta}\, \mathcal{F}(W_{\text{slit}}) \otimes \mathcal{F}(W_{\text{array}}) \tag{7.41} \]
7.6. Fourier Optics
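and the Fraunhofer field of the array of slits is (omitting the quadratic phase factor):
\[ \mathcal{F}(\tau)\left(\frac{x}{\lambda z}\right) = \frac{aA}{\Delta} \sum_{m=-\infty}^{\infty} \operatorname{sinc}\left(m\pi\frac{a}{\Delta}\right) \operatorname{sinc}\left(\pi A\left(\frac{x}{\lambda z} - \frac{m}{\Delta}\right)\right). \tag{7.45} \]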
The width of a diffraction order is given by the width of the function (7.43), i.e. it is given
by
\[ \text{angular width} = \frac{\lambda}{A}, \tag{7.48} \]
Hence, the larger A, i.e. the more slits there are in the array, the narrower the peaks into
which the energy is diffracted. The phenomenon that the angles of diffraction of the orders
depend on wavelength is used to separate wavelengths. Grating spectrometers use periodic
structures such as this array of slits to separate and measure wavelengths.
The amplitudes of the diffracted orders:
\[ \operatorname{sinc}\left(m\pi\frac{a}{\Delta}\right), \tag{7.49} \]
are determined by the width of the slits. Hence the envelope (i.e. large features) of the
Fraunhofer diffraction pattern is determined by the small-scale properties of the array,
namely the width of the slits. This is illustrated in Fig. 7.9.
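The role of the slit width as envelope can be made concrete by evaluating (7.45) at the positions of the diffraction orders. The sketch below makes several assumptions that are not stated in these notes: sinc(u) is taken to mean sin(u)/u, the array width is A = N∆ for N slits, the infinite sum is truncated, and the numerical values are illustrative.

```python
import numpy as np

def sinc_u(u):
    """sin(u)/u with the limit 1 at u = 0 (assumed convention for (7.45))."""
    return np.sinc(np.asarray(u) / np.pi)

def grating_far_field(xi, a, period, n_slits, m_max=50):
    """Evaluate (7.45) at xi = x / (lambda z), truncating the sum at |m| <= m_max."""
    A = n_slits * period                       # assumed total width of the array
    m = np.arange(-m_max, m_max + 1)
    terms = sinc_u(m * np.pi * a / period) * sinc_u(np.pi * A * (xi - m / period))
    return a * A / period * terms.sum()

period, a, n_slits = 10e-6, 5e-6, 20           # slit width is half the period
for order in (0, 1, 2, 3):
    amp = grating_far_field(order / period, a, period, n_slits)
    print(f"order {order}: amplitude {amp:.3e}")
```

With a slit width equal to half the period, the even orders other than the zeroth are suppressed by the envelope (7.49), while the odd orders fall off as 1/m.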
\[ \sqrt{x^2 + y^2 + (z - f)^2} = \text{constant}, \tag{7.51} \]
are spheres centred on the focal point, and the amplitude of the field is proportional to the
reciprocal of the distance between (x, y, z) and the focal point. The minus sign in the exponent
in (7.50) is explained by the (implicit) time dependence which is, as always for monochromatic
fields, given by the factor exp(−iωt) where ω > 0. After multiplying by this factor, it is seen that
the phase at a particular point and time is given by
\[ -k\sqrt{x^2 + y^2 + (z - f)^2} - \omega t, \tag{7.52} \]
and hence as time increases the phase can only be kept constant when the distance to the focal
point decreases. Incidentally, for a point (x, y, z) to the right of the focal point, the spherical wave
fronts propagate away from the focal point and therefore one should choose there the plus sign
in the exponent.
Returning to the region 0 < z < f , we now consider the field in the exit pupil of the lens, i.e.
we choose z = 0, where the field is truncated by the aperture of the lens. Hence the field in
the plane z = 0 is given by
\[ 1_{\odot_a}(x, y)\, \frac{e^{-ik\sqrt{x^2 + y^2 + f^2}}}{\sqrt{x^2 + y^2 + f^2}}, \quad \text{for } (x, y, 0) \text{ in the exit pupil of the lens}, \tag{7.53} \]
where
\[ 1_{\odot_a}(x, y) = \begin{cases} 1 & \text{if } x^2 + y^2 < a^2, \\ 0 & \text{otherwise.} \end{cases} \tag{7.54} \]
The field (7.53) is the field in the exit pupil as predicted by geometrical optics when the lens is
diffraction limited. In diffraction optics we compute the field in the focal region using diffraction
integrals, instead of using ray tracing as is done in geometrical optics. Hence, the modification
introduced by diffraction optics is due to the more accurate propagation of the field in the exit
pupil (as predicted by geometrical optics) into the region behind the exit pupil of the lens.
The starting field in the exit pupil is however still the same as in geometrical optics.
Because of the more accurate propagation using diffraction integrals we find that the field is
not strictly zero outside of the cones shown in Fig. 7.10, although most of the energy is indeed
concentrated inside the cones. Also, the field inside the cones is modified and not exactly given
by the spherical wave front, due to the diffraction by the exit pupil.
If a/f is sufficiently small, we may replace in the denominator of (7.53) the distance
$\sqrt{x^2 + y^2 + f^2}$ between a point in the exit pupil and the focal point by f . Replacing this
distance by f is not allowed in the exponent however, because the error made in this replacement
would be enhanced too much by the multiplication by the large wave number k. In the
exponent we therefore use instead the first two terms of the Taylor series, i.e. we apply the
paraxial approximation (7.23):
\[ \sqrt{x^2 + y^2 + f^2} = f\sqrt{1 + \frac{x^2 + y^2}{f^2}} \approx f + \frac{x^2 + y^2}{2f}, \tag{7.55} \]
where we dropped the constant factors $e^{ikf}$ and 1/f . For a general incident field U0 (x, y) in the
entrance pupil, the lens applies a transformation such that the field in the exit plane becomes:
\[ U_0(x, y) \to U_0(x, y)\, 1_{\odot_a}(x, y)\, e^{-ik\frac{x^2 + y^2}{2f}}, \tag{7.57} \]
(transformation applied by a lens between its entrance and exit planes)
The function that multiplies U0 (x, y) is the transmission function of the lens:
\[ \tau_{\text{lens}}(x, y) = 1_{\odot_a}(x, y)\, e^{-ik\frac{x^2 + y^2}{2f}}. \tag{7.58} \]
This result makes sense: in the centre (x, y) = 0 the lens is thickest, so the phase is shifted the
most (but we can define this phase shift to be zero because only phase differences matter, not
absolute phase). As is indicated by the minus-sign in the exponent, the further you go away
from the centre of the lens, the less the phase is shifted. For shorter f , the lens focuses more
strongly, so the curvature of the lens is higher, and the phase shift changes more rapidly as a
function of the radial coordinate. Note that transmission function (7.58) has modulus 1 so that
energy is conserved.
Figure 7.10: Top: wavefronts in image space due to the focussing of a plane wave that propagates
parallel to the optical axis according to geometrical optics. There is no light outside of the two
cones. Bottom left: amplitude as predicted by diffraction optics. There is no absolute darkness:
the boundary of the cones is diffuse. Furthermore, the intensity does not increase monotonically
with decreasing distance to the focal point as predicted by geometrical optics. Bottom right:
phase of the focused field as predicted by diffraction optics.
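If we propagate this new field using the Fresnel diffraction integral of Eq. (7.25), we get
\[ U(x, y, z) = \frac{e^{ikz}\, e^{\frac{ik(x^2 + y^2)}{2z}}}{i\lambda z}\, \mathcal{F}\left\{U_0(x', y')\, 1_{\odot_a}(x', y')\, e^{ik(x'^2 + y'^2)\left(\frac{1}{2z} - \frac{1}{2f}\right)}\right\}\left(\frac{x}{\lambda z}, \frac{y}{\lambda z}\right). \tag{7.59} \]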
This expression can be used for sufficiently large distances from the exit pupil plane, in particular
in the focal plane and beyond it. It is seen at the bottom of Fig. 7.10 that the field is not
monotonically increasing with decreasing distance to the focal point. Instead, secondary maxima
are seen along the optical axis. Also the boundary of the light cone is not sharp as predicted by
geometrical optics but diffuse. For points in the back focal plane of the lens, i.e. z = f , we find
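\[ U(x, y, f) = \frac{e^{ikf}\, e^{\frac{ik(x^2 + y^2)}{2f}}}{i\lambda f}\, \mathcal{F}\left\{U_0(x', y')\, 1_{\odot_a}(x', y')\right\}\left(\frac{x}{\lambda f}, \frac{y}{\lambda f}\right), \tag{7.60} \]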
which is the same as the Fraunhofer integral! Thus, the field at the focal plane is the same as
the far field of the field in the entrance pupil of the lens, or to put it differently:
The field in the entrance pupil of the lens and the field at the focal plane are
related by a Fourier transform (apart from a quadratic phase factor in front of the
integral).
It can be shown that the fields in the front focal plane U (x, y, −f ) and the back focal plane
U (x, y, f ) are related exactly by a Fourier transform, i.e. without the additional quadratic
phase factor [7].
So a lens performs a Fourier transform. Let us see if that corresponds to some of the things
we know from geometrical optics:
• We know from geometrical optics that if we illuminate a lens with parallel rays of light (a
plane wave), they all intersect in the back focal plane. This corresponds with the fact that
for U0 (x, y) = 1 (i.e. plane wave illumination, neglecting the finite aperture of the lens,
i.e. neglecting diffraction effects), its Fourier transform is a delta peak:
\[ \mathcal{F}(U_0)\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}\right) = \delta\left(\frac{k_x}{2\pi}\right)\delta\left(\frac{k_y}{2\pi}\right). \tag{7.61} \]
• If in geometrical optics we illuminate a lens with tilted parallel rays of light (a plane wave
propagating in some other direction), then the point in the back focal plane where they
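intersect is laterally displaced. A tilted plane wave is described by $U_0(\mathbf{r}) = e^{i\mathbf{k}_0\cdot\mathbf{r}}$, and its
Fourier transform with respect to (x, y) is given by
\[ \mathcal{F}\{U_0\}\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}, z\right) = \delta\left(\frac{k_x - k_{0,x}}{2\pi}\right)\delta\left(\frac{k_y - k_{0,y}}{2\pi}\right), \]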
It seems that our new model of light propagation confirms what we know from geometrical
optics. But in the previous two examples we have discarded the influence of the finite size of the
[7] Introduction to Fourier Optics, J. Goodman, §5.2.2 - Several calculations on the Fourier transforming
properties of lenses.
Figure 7.11: Airy spot: surface plot of the intensity and cross-section of the field.
where si is the image distance as given by the Lens Law. This field is called the Point
Spread Function (PSF for short). For an object point that is not on the optical axis, the
PSF is translated such that it is centred on the image point according to geometrical optics.
2. The second method is by propagating the field of the point object from the object plane
to the entrance pupil of the lens using the Fresnel diffraction formula, multiplying by the
transmission function of the lens given by (7.58), and finally propagating the field from the
exit pupil of the lens to the image plane using the Fresnel diffraction formula again.
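Both methods give identical formulae for the PSF Airy disk [8]: [NOTE: you do not need
to know the following formula for the examination]
\[ \text{PSF}(x, y) = \frac{\pi a^2}{\lambda s_i}\, \frac{J_1\left(2\pi\frac{a}{\lambda s_i}\sqrt{x^2 + y^2}\right)}{\frac{2\pi a}{\lambda s_i}\sqrt{x^2 + y^2}}, \tag{7.66} \]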
Usually we consider as object plane the plane immediately behind the object (on the side of
the lens). We assume that we know the field in this plane. This field has been transmitted (or
reflected) by the object and is then further propagated through the optical system. The object
plane is discretised by a set of points and the images of these points are given by translated
versions of the PSF weighted by the field in these points. The total image field is then:
\[ U_i(x, y, s_i) = \frac{1}{M} \iint \text{PSF}\left(\frac{x}{M} - x_o, \frac{y}{M} - y_o\right) U_o(x_o, y_o, s_o)\, dx_o\, dy_o, \tag{7.67} \]
where xi = M xo , yi = M yo is the image point and M = si /so is the magnification and the factor
1/M is to preserve energy. If the magnification is unity, the image field is a convolution between
the PSF and the object field. If the magnification differs from unity, the integral can be made
into a convolution by rescaling the coordinates in image space.
It is clear from (7.66) that larger radius a of the lens and smaller wavelength λ imply a narrower
PSF. This in turn implies that the kernel in the convolution is more sharply peaked and hence
that the resolution of the image is higher [9]. Alternatively, one could think of the aperture of
the lens cutting away the higher spatial frequencies, as shown in Fig. 7.5, which causes loss of
resolution of the Fourier transform observed in the image plane.
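How the width of the PSF scales with a and λ can be made quantitative by locating the first dark ring of (7.66), which occurs at the first positive zero of J1. The sketch below avoids external special-function libraries by evaluating J1 from its standard integral representation; the bracketing interval for the bisection and the numerical values of λ, a and si are assumptions chosen for illustration.

```python
import numpy as np

def bessel_j1(u, n=2001):
    """J1(u) from (1/pi) * integral over [0, pi] of cos(tau - u sin(tau)), trapezoidal rule."""
    tau = np.linspace(0.0, np.pi, n)
    integrand = np.cos(tau - u * np.sin(tau))
    h = tau[1] - tau[0]
    return h * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1])) / np.pi

def first_airy_zero(lo=3.0, hi=4.5, tol=1e-10):
    """First positive zero of J1, found by bisection on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bessel_j1(lo) * bessel_j1(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

u0 = first_airy_zero()
wavelength, a, s_i = 500e-9, 5e-3, 0.1               # illustrative values
radius = u0 * wavelength * s_i / (2 * np.pi * a)      # from 2 pi a r / (lambda s_i) = u0
print(f"first zero of J1: {u0:.4f}")
print(f"dark-ring radius: {radius * 1e6:.2f} um = {u0 / (2 * np.pi):.3f} * lambda * s_i / a")
```

The radius of the first dark ring is proportional to λ si /a, so doubling the lens radius or halving the wavelength halves the size of the PSF.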
be made. By applying three SLMs in series, one can tune the polarisation, the phase and
the amplitude pixel by pixel and hence very special fields can be made in the focal region of
a lens, for example an electric field with only a longitudinal component (i.e. Ez -component)
in the focal point [10].
• Suppose we have the setup as shown in Fig. 7.13. With one lens we can create the Fourier
transform of some field U0 (x, y), and we can put a mask in the focal plane and then with a
second lens we can invert the Fourier transform of this new field. This procedure is called
Fourier filtering using lenses. Fourier filtering means that the amplitude and/or phase of
the plane waves in the angular spectrum of the field can be manipulated. An application
of this idea is the phase contrast microscope.
[Figure 7.13: diagram of two lenses in series, with the successive planes labelled U(x,y), Û(kx,ky) and U(x,-y); each plane lies a focal distance f from the neighbouring lens.]
Figure 7.13: Setup for Fourier filtering. The first lens creates a Fourier transform of U (x, y),
to which we can apply some operation (e.g. applying different phase shifts to different parts
of the field). The second lens then applies another Fourier transform (which is the same as the
inverse Fourier transform and a mirror transformation).
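The statement in the caption, that two successive Fourier transforms return the mirrored field, and the idea of filtering in the focal plane can both be tried out with a discrete Fourier transform. This is a minimal one-dimensional sketch: the test field, the grid size and the low-pass mask are assumptions made here, and the lenses are idealised as exact Fourier transforms.

```python
import numpy as np

def four_f(u, mask=None):
    """Two successive discrete Fourier transforms with an optional mask in between.

    The mask is given on the fftshift-ed frequency grid (zero frequency in the centre).
    """
    spectrum = np.fft.fftshift(np.fft.fft(u))
    if mask is not None:
        spectrum = spectrum * mask
    return np.fft.fft(np.fft.ifftshift(spectrum)) / u.size

n = 256
x = np.arange(n) - n // 2                                  # also serves as centred frequency index
u = np.exp(-((x - 30) / 10.0) ** 2) + 0.2 * np.cos(2 * np.pi * 40 * x / n)  # bump plus fine ripple

mirrored = four_f(u)                                       # no mask: output is u(-x)
low_pass = (np.abs(x) < 20).astype(float)                  # block |frequency index| >= 20
smoothed = four_f(u, low_pass)                             # ripple at index 40 is removed

print(np.allclose(mirrored, np.roll(u[::-1], 1)))
print(f"ripple before: {np.abs(np.fft.fft(u))[40] / n:.3f}, "
      f"after: {np.abs(np.fft.fft(smoothed))[40] / n:.3f}")
```

Without a mask the output is the input mirrored about the origin; with the low-pass mask the fine ripple disappears while the broad bump survives, which is the loss of sharp features illustrated in Fig. 7.5.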
7.7 Superresolution
We have emphasized that evanescent waves set the ultimate limit to resolution in optics. In
Chapter 3, it was explained that, although within geometrical optics one can image a single
point perfectly using conical surfaces, several points, let alone an extended object, can not be
imaged perfectly. It was furthermore explained that when only paraxial rays are considered, i.e.
within Gaussian geometrical optics, perfect imaging of extended objects is possible. However,
rays whose angle with the optical axis is large cause aberrations. But even when perfect imaging
would be possible in geometrical optics, a real image can never be perfect due to the fact that
information contained in the amplitudes and phase of the evanescent waves can not propagate.
The resolution that can be obtained with an optical system consisting of lenses is lower than would
follow from considering only the effect of evanescent waves because, apart from the evanescent waves,
also the propagating waves with spatial frequencies that are so high that they are not captured by
the optical system, can not contribute to the image. Therefore the image of a point object has
the size
λ/NAi , (7.68)
where NAi = a/si is the numerical aperture in image space, i.e. it is the sine of half the opening
angle of the cone subtended by the exit pupil at the Gaussian image point on the optical axis.
This resolution limit is called the diffraction limit.
[10] See Phys. Rev. Lett. 100, 123904, 2008
The size of the image of a point as given by the PSF in (7.66), is influenced by the magnifica-
tion of the system. To characterize the resolution of a diffraction limited system, it is therefore
better to consider the numerical aperture on the object side: NAo = NAi |M | = a/so . The value
of NAo is the sine of the half angle of the cone subtended by the entrance pupil of the system
at the object point on the optical axis. This is the cone of wave vectors emitted by this object
point that can contribute to the image. The larger the half angle of this cone, the higher the
spatial frequencies that can contribute and hence the higher the resolution.
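A short numerical example may help to see why the object-side numerical aperture is the better figure of merit. The sketch below simply evaluates NAi = a/si, NAo = a/so and the size (7.68); the helper name and all numerical values are hypothetical and chosen only for illustration.

```python
def diffraction_limit(wavelength, a, s_o, s_i):
    """Return (NA_o, NA_i, spot size in image space, the same spot referred to object space)."""
    magnification = s_i / s_o
    na_i = a / s_i
    na_o = na_i * magnification          # equals a / s_o
    return na_o, na_i, wavelength / na_i, wavelength / na_o

na_o, na_i, size_image, size_object = diffraction_limit(500e-9, a=5e-3, s_o=50e-3, s_i=200e-3)
print(f"NA_o={na_o:.3f}, NA_i={na_i:.3f}")
print(f"image-side size {size_image * 1e6:.1f} um, object-side size {size_object * 1e6:.1f} um")
```

The image-side spot is larger than the object-side value by exactly the magnification, so only the object-side value says something about the smallest object detail that can be resolved.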
It should be clear by now that beating the diffraction limit is extremely difficult. Nevertheless,
a lot of research in physics has been and still is directed to realizing this goal. Many attempts
have been made, some successful, others not, but, whether successful or not, most were
based on very ingenious ideas. To close this chapter on diffraction theory, we will give a flavor
of the attempts to achieve what is called superresolution.
Confocal microscope. A focused spot is used to scan the object and the reflected field is
imaged onto a small detector (“point detector”). The resolution is roughly a factor 1.5
better than for normal imaging with full field of view using the same objective. The higher
resolution is achieved thanks to the illumination by oblique plane waves that are present in
the spatial (Fourier) spectrum of the focused spot. By illumination with plane waves under
large angles of incidence, higher spatial frequencies of the object which are under normal
incidence not accepted by the objective, are now “folded back” into the cone of plane waves
accepted by the objective. The higher resolution comes at the price of longer imaging time
because of scanning. The confocal microscope was invented by Marvin Minsky in 1957.
Figure 7.14: A laser beam is focused by an objective and scanned over an object. The reflected
light is imaged by the same objective onto a small detector.
The Perfect Lens based on negative refraction. It can be shown that when a material has
negative permittivity and negative permeability, the phase velocity of a plane wave
is opposite to the energy velocity. Furthermore, when a slab of such material is surrounded
by material with positive permittivity and positive permeability equal to the absolute values
of the permittivity and permeability of the slab, the reflection of waves is zero for every
angle of incidence and every state of polarisation. Moreover, evanescent waves gain
amplitude inside the slab and it turns out that there are two planes, one inside the slab
and one on the other side of it, where a perfect image of a point in front of the slab occurs.
Note that the increase of amplitude of an evanescent wave does not violate the conservation
of energy because an evanescent wave does not propagate energy in the direction in which
it is evanescent.
The simple slab geometry which acts as a perfect lens was invented by John Pendry in
2000 [11]. Unfortunately, a material with negative permittivity and negative permeability
has not been found in nature, although there seems to be no fundamental reason why it
could not exist. Therefore many groups have attempted to mimic such a material by
conventional materials such as metals. There are however more fundamental reasons why
Pendry’s perfect lens will not work satisfactorily, even if the material would exist. We refer
to the master lecture Advanced Photonics for more details.
Figure 7.15: Pendry’s perfect lens consists of a slab of a material with negative permittivity and
negative permeability such that its absolute values are equal to the positive permittivity and
positive permeability of the surrounding material. Points outside the slab are imaged perfectly
in two planes: one inside the slab and the other on the other side of the slab.
Hyperbolic materials Hyperbolic materials are anisotropic, i.e. the phase velocity of a plane
wave depends on the polarisation. The permittivity of an anisotropic material is a tensor
(loosely speaking a (3,3)-matrix). Normally the eigenvalues of the permittivity matrix
are positive, however in a hyperbolic material two eigenvalues are of equal sign and the
third has opposite sign. In such a medium all waves of the so-called extra-ordinary type
of polarisation, propagate, no matter how high the spatial frequencies are. Hence for this
state of polarisation, there are NO evanescent waves and therefore super-resolution and
perfect imaging should be possible in such a medium.
Natural hyperbolic media seem to exist for a few frequencies in the mid-infrared. For visible
wavelengths, materials with hyperbolic behaviour are too lossy to give superresolution.
Therefore one tries to approximate hyperbolic media by so-called metamaterials which are
made of very thin metallic and dielectric layers so that the effective permittivity has the
desired hyperbolic property. The success of this idea has however been moderate so far.
Nonlinear effects. When the refractive index of a material depends on the local electric field,
the material is nonlinear. At optical frequencies nonlinear effects are in general quite
small, but with a strong laser they become significant. One effect is self-focusing, where
the change in refractive index is proportional to the local light intensity. The locally higher intensity
causes an increase of the refractive index, leading to a waveguiding effect due to which the
beam focuses even more strongly. Hence the focused beam becomes more and more narrow
while propagating until finally the material breaks down.
[11] J.B. Pendry, Phys. Rev. Lett. 85 (18), 3966, 2000
Figure 7.16: Examples of composite materials consisting of thin (sub-wavelength) layers of metals
and dielectrics. These artificial materials are called metamaterials.
Figure 7.18: Spot used for excitation (top left) and for depletion (top middle). Fluorescence
signal top right. In the lower figure the confocal image is compared to the STED image.
1. Every picture is made of waves - Sixty Symbols, 3:33 to 7:15: Basic explanation of Fourier
transforms.
2. Heisenberg’s Microscope - Sixty Symbols, 0:20 to 2:38: Basic explanation of the uncertainty
principle (though in the context of quantum physics).
Chapter 8
Lasers
What you should know and be able to do after studying this chapter.
• Understand the optical resonator and the reason for needing it.
• Understand the role of the amplifier and explain what the gain curve is.
• Explain the principle of the population inversion and how it can be achieved.
• Understand what transverse modes are and how they can be eliminated.
In the early 1950s a new source of microwave radiation, the maser, was invented by C.H. Townes
in the USA and A.M. Prokhorov and N.G. Basov in the USSR. Maser stands for "Microwave
Amplification by Stimulated Emission of Radiation". In 1958, A.L. Schawlow and Townes formulated
the physical constraints that must be met to realize a similar device for visible light. This resulted in
1960 in the first optical maser, built by T.H. Maiman in the USA. This device has since then been
called a laser, which stands for "Light Amplification by Stimulated Emission of Radiation".
It has revolutionized science and engineering and is being applied in many different applications
such as:
• bar code readers,
• compact discs,
• computer printers,
• fiberoptic communication,
• sensors,
• material processing,
• non-destructive testing,
• position and motion control,
• spectroscopy,
• medical applications, such as treatment of retina detachment,
• nuclear fusion,
• holography.
∆ν = 1/τc , (8.2)
where τc is called the coherence time, the typical duration of the bursts.
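As a quick numerical illustration of (8.2), the sketch below converts a coherence time into a spectral width. The two coherence times are hypothetical round numbers chosen to contrast short bursts with long wave trains; they are not measured values for any particular source.

```python
def spectral_width(coherence_time):
    """Spectral width in Hz from (8.2): delta_nu = 1 / tau_c."""
    return 1.0 / coherence_time

for label, tau_c in (("short bursts", 1e-9), ("long wave trains", 1e-3)):
    print(f"{label}: tau_c = {tau_c:.0e} s -> delta_nu = {spectral_width(tau_c):.0e} Hz")
```

A wave train that lasts a million times longer has a spectrum that is a million times narrower.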
Laser light is also emitted by atoms but, as will be explained in this chapter, due to the
special configuration of the laser, the wave trains can be extremely long corresponding to a very
long coherence time.
The property of a very narrow spectral width is essential for many of the already mentioned
applications of lasers, in particular in communication, high resolution spectroscopy, interferom-
etry and for sensors.
θ = h/f, (8.3)
where 2h is the size of the source and f is the focal length of the lens. Hence the light can
be collimated by either choosing a lens with large focal length or by reducing the size of the
source, or both. Both methods lead however to weak intensities. Before the invention of the
laser, collimated beams were obtained by using a tiny light source. Hence collimated beams were
in those days always very weak.
There exists a fundamental limit for the collimation of a beam. As follows from Chapter 7, a
time-harmonic beam of diameter D and wavelength λ has a diffraction limited divergence given
by:
\[ \theta = \frac{\lambda}{D}. \tag{8.4} \]
8.1. Unique Properties of Lasers and Their Applications
Figure 8.1: A discharge lamp is positioned in the focal plane of a converging lens. Every atom in
the lamp emits a spherical wave during a burst of radiation lasting on average a coherence time
τc . The overall divergence of the beam is determined by the atoms at the extreme positions of
the source.
Note that the minimum divergence depends on the wavelength. When a laser beam is used, the
diffraction limited divergence angle can almost be reached. The minimum divergence angle of
a laser beam therefore does not depend on the size of the laser source, as it does for classical
light sources. Furthermore, all the power emitted by the laser can be collimated so that very
high intensities can be realized. High degree of collimation is very useful for many applications
Figure 8.2: A laser beam can almost reach diffraction limited collimation.
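The two divergence formulas (8.3) and (8.4) can be compared with a few lines of code. This is a minimal sketch: the source size, focal length, wavelength and beam diameter are assumed values chosen only to show the orders of magnitude.

```python
def geometric_divergence(half_size_m, focal_length_m):
    """Divergence theta = h / f of an extended source of size 2h, see (8.3)."""
    return half_size_m / focal_length_m


def diffraction_divergence(wavelength_m, beam_diameter_m):
    """Diffraction limited divergence theta = lambda / D, see (8.4)."""
    return wavelength_m / beam_diameter_m


# Assumed numbers: a 1 mm lamp (h = 0.5 mm) behind a lens with f = 100 mm,
# and a laser beam of diameter D = 5 mm at a wavelength of 500 nm.
theta_lamp = geometric_divergence(0.5e-3, 0.1)
theta_laser = diffraction_divergence(500e-9, 5e-3)

print(f"lamp:  {theta_lamp:.1e} rad")
print(f"laser: {theta_laser:.1e} rad")
```

With these assumed numbers the lamp beam diverges far more strongly than the diffraction limit, and shrinking the lamp to close the gap would reduce its power accordingly.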
8.1.3 Very Small Focused Spot; Diffraction Limited Focused Spot; High Spatial Coherence
If we add a second lens after the first lens in Fig. 8.1 a spot is obtained in the focal plane of the
second lens. This spot can be very small only when the light has been made almost perfectly
collimated by the first lens.
What is the smallest focal spot that one can achieve? If one focuses a perfectly collimated
beam with a lens with very small aberrations, the lateral size of the focused spot is limited by
diffraction. According to Chapter 7:
\[ \text{diffraction limited spot size} = 0.6\, \frac{f}{D}\, \lambda = 0.6\, \frac{\lambda}{\mathrm{NA}}. \tag{8.5} \]
With a laser one can achieve a diffraction limited spot that has a very high intensity. Almost all
the light emitted by the laser can be focused into the spot.
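To get a feeling for (8.5), the following sketch evaluates the spot size 0.6 λ/NA. The wavelength of 780 nm and numerical aperture of 0.45 are assumed, typical values for a compact disc read-out lens and are not taken from these notes.

```python
def diffraction_limited_spot(wavelength_m, numerical_aperture):
    """Lateral spot size 0.6 * lambda / NA, see (8.5)."""
    return 0.6 * wavelength_m / numerical_aperture


# Assumed example values: near-infrared diode laser and a moderate-NA lens.
spot = diffraction_limited_spot(780e-9, 0.45)
print(f"spot size = {spot * 1e6:.2f} micrometre")
```

A shorter wavelength or a larger numerical aperture gives a proportionally smaller spot.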
As has been explained in Chapter 6, a light wave has high spatial coherence if it is well
behaved in space. At any given time, its amplitude and phase in any point can be predicted.
The spherical waves emitted by a point source have this property. But when there are many
point sources (atoms) that each emit bursts of harmonic waves that start at random times as
is the case in a classical light source, the amplitude and phase of the total emitted field at any
position in space cannot be predicted. The only way to make the light spatially coherent is
by making the light source very small, but then there is hardly any light. As will be explained
below, the design of the laser ensures that the emissions by the atoms of the amplifying medium
are phase correlated, which leads to very high temporal and spatial coherence. The property of
a small spot size with high intensity is essential for many applications. For the compact disc, a
scanning diffraction limited focused spot is necessary to obtain maximum resolution and hence
maximum storage density of data, while a high light intensity is needed for high sensitivity of
writing and reading the data. In material processing (cutting, welding and drilling) spots with
very high powers are needed. In retina surgery a very small high intensity spot is applied to weld
the retina without damaging the surrounding healthy tissue.
There are many applications of high power lasers such as for cutting and welding materials. In
Integrated Circuit (IC) manufacturing the required high transistor density on the chip will be
realised in the future by using the very short extreme ultra-violet (EUV) wavelength of 13.5 nm to
image the mask pattern into the photoresist. To obtain EUV light with sufficiently high intensity,
extremely powerful CO2 lasers are used to excite a plasma. Extremely high power lasers are
also applied to initiate fusion and in many nonlinear optics applications. Lasers with very short
pulses are used to study very fast phenomena with short decay times, for optical computers to
realise faster clocks and for high resolution imaging.
8.2. Optical Resonator
• an optical resonator;
• an amplifying medium.
In this section we consider the optical resonator. The function of the resonator is to obtain a
high light energy density and to gain control over the emission wavelengths.
A resonator, whether it is mechanical like a pendulum, a spring or a string, or electrical like an
LRC circuit, has one or multiple resonance frequencies νres. Every resonator has losses, so
the oscillation gradually dies out if no energy is supplied after the initial excitation. The
losses cause an exponential decrease of the amplitude of the oscillation as shown in Fig. 8.4. The
oscillation is therefore not purely monochromatic but has a finite bandwidth given by ∆ν ≈ 1/τ
as shown in Fig. 8.4, where τ is the time at which the amplitude of the oscillation has reduced
to half the initial value.
Figure 8.4: Damped oscillation (left) and frequency spectrum of a damped oscillation (right)
with resonance frequency νres and frequency width equal to the reciprocal of the decay time.
The optical resonator is a region filled with some material with refractive index n bounded
by two aligned highly reflective mirrors at a distance L. The resonator is called a Fabry-Perot
cavity.
Let the z-axis be chosen along the axis of the cavity as shown in Fig. 8.5, and assume that the
transverse directions are so large that the light can be considered a plane wave bouncing back
and forth along the z-axis between the two mirrors. Let ω be the frequency and k0 = ω/c the
wave number in vacuum. The plane wave that propagates in the positive z-direction is given by:
For very good mirrors, the amplitude remains unchanged upon reflections while the phase
typically changes by π. Hence, after one round trip (i.e. two reflections) the field (8.6) is (the
possible phase changes at the mirrors add up to 2π and hence have no effect):
A high field builds up when this wave constructively interferes with (8.6), i.e. when
\[ k = \frac{2\pi m}{2nL}, \quad \text{or} \quad \nu = \frac{kc}{2\pi} = m\, \frac{c}{2nL}, \tag{8.8} \]
∆ν = c/(2nL), (8.9)
which is the so-called free spectral range. For a gas laser that is 1 m long, the free spectral
range is approximately 150 MHz.
Example Suppose that the cavity is 100 cm long and is filled with a material with refractive
index n = 1. Light with visible wavelength of λ = 500 nm corresponds to mode number
m = 2L/λ = 4 × 10^6 and the free spectral range is ∆ν = c/(2L) = 150 MHz.
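The numbers in this example follow directly from (8.8) and (8.9). A minimal sketch of the calculation, with function names that are chosen here for illustration:

```python
C = 299_792_458.0  # speed of light in vacuum (m/s)


def mode_number(cavity_length_m, vacuum_wavelength_m, n=1.0):
    """Longitudinal mode number m = 2 n L / lambda, from (8.8)."""
    return 2.0 * n * cavity_length_m / vacuum_wavelength_m


def free_spectral_range(cavity_length_m, n=1.0):
    """Spacing between neighbouring resonances, c / (2 n L), see (8.9)."""
    return C / (2.0 * n * cavity_length_m)


m = mode_number(1.0, 500e-9)
fsr = free_spectral_range(1.0)
print(f"m = {m:.3g}, free spectral range = {fsr / 1e6:.1f} MHz")
```

Halving the cavity length doubles the free spectral range, which is relevant for the single-mode operation discussed later in this chapter.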
Because of losses caused by the mirrors (which never reflect perfectly) and by the absorption
and scattering of the light, the resonances have a certain frequency width ∆ν. When a resonator
is used as a laser, one of the mirrors is given a small transmission to couple the laser light out.
This also corresponds to a loss of the resonator. To compensate for all losses, the cavity must
contain an amplifying medium. Due to the amplification the resonance line widths inside the
bandwidth of the amplifier are reduced to very sharp lines as shown in Fig. 8.6.
Figure 8.6: Resonant frequencies of a cavity of length L when the refractive index n = 1. With
an amplifier inside the cavity, the linewidths of the resonances within the bandwidth of the
amplifier are reduced. The red curve is the spectral function of the amplification.
8.3 Amplification
Amplification can be achieved by a medium with atomic resonances that are at or close to one of
the resonances of the resonator. We first recall the simple theory developed by Einstein in 1916
of the dynamic equilibrium of a material in the presence of electromagnetic radiation.
ħω = E2 − E1 , (8.10)
an atom that is initially in the lower energy state 1 can be excited to state 2. Here ħ is the reduced
Planck constant:
\[ \hbar = \frac{6.626070040}{2\pi} \times 10^{-34}\ \mathrm{J\,s}. \tag{8.11} \]
Suppose W (ω) is the time-averaged electromagnetic energy density per unit of frequency interval
around frequency ω. Hence W has dimension J s/m³. Let N1 and N2 be the number of atoms
in state 1 and 2, respectively, where
N1 + N2 = N, (8.12)
is the total number of atoms (which is constant). The rate of absorption is the rate of decrease
of N1 and is proportional to the energy density and the number of atoms in state 1:
\[ \frac{dN_1}{dt} = -B_{12} N_1 W(\omega), \quad \text{absorption}, \tag{8.13} \]
where B12 > 0 is a constant of proportionality with dimension m³ J⁻¹ s⁻². Without any external
influence, an atom that is in the excited state will usually transfer to state 1 within 1 ns or so,
while emitting a photon of energy (8.10). This process is called spontaneous emission since
it happens also without an electromagnetic field present. The rate of spontaneous emission is
given by:
\[ \frac{dN_2}{dt} = -A_{21} N_2, \quad \text{spontaneous emission}, \tag{8.14} \]
where A21 has dimension s⁻¹. The lifetime of spontaneous emission is τsp = 1/A21 . It is
important to note that the spontaneously emitted photon is emitted in a random direction.
In fact, since an atom can in general be described by a radiating electric dipole, the statistical
distribution of radiation angles is proportional to the intensity pattern of the field radiated by the
dipole. Furthermore, since the radiation occurs at a random time, there is no phase relation
between the spontaneously emitted field and the field that excites the atom.
It is less obvious that in the presence of an electromagnetic field of frequency close to the
atomic resonance, an atom in the excited state can also be stimulated by that field to emit a
photon and transfer to the lower energy state. The rate of stimulated emission is proportional
to the number of excited atoms and to the energy density of the field:
\[ \frac{dN_2}{dt} = -B_{21} N_2 W(\omega), \quad \text{stimulated emission}, \tag{8.15} \]
where B21 has the same dimension as B12 . It is very important to remark that stimulated
emission occurs in the same electromagnetic mode (e.g. a plane wave) as the mode of the
field that excites the transition and that the phase of the radiated field is identical to that
of the exciting field. This implies that stimulated emission enhances the electromagnetic field by
constructive interference and this property is crucial for the operation of the laser.
Figure 8.7: Absorption, Spontaneous Emission and Stimulated Emission with their respective
rates.
\[ W_T(\omega) = \frac{\hbar\omega^3}{\pi^2 c^3}\, \frac{1}{\exp\left(\frac{\hbar\omega}{k_B T}\right) - 1}, \tag{8.16} \]
The rates of upward and downward transitions of the atoms in the wall of the box must be
identical:
B12 N1 WT (ω) = A21 N2 + B21 N2 WT (ω). (8.18)
Hence,
\[ W_T(\omega) = \frac{A_{21}}{B_{12} N_1/N_2 - B_{21}}. \tag{8.19} \]
But in thermal equilibrium:
\[ \frac{N_1}{N_2} = \exp\left(-\frac{E_1 - E_2}{k_B T}\right) = \exp\left(\frac{\hbar\omega}{k_B T}\right). \tag{8.20} \]
By substituting (8.20) into (8.19), and comparing the result with (8.16), it follows that both
expressions for WT (ω) are identical for all temperatures only if
\[ B_{12} = B_{21}, \qquad A_{21} = \frac{\hbar\omega^3}{\pi^2 c^3}\, B_{21}. \tag{8.21} \]
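The consistency argument leading to (8.21) can also be checked numerically. The sketch below inserts the Boltzmann ratio (8.20) and the relations (8.21) into the rate balance (8.19) and compares the result with Planck's law (8.16) at a few temperatures. The value B21 = 1 is an arbitrary assumption (it cancels), and the function names are chosen here for illustration.

```python
import math

HBAR = 6.626070040e-34 / (2.0 * math.pi)  # reduced Planck constant (J s), see (8.11)
KB = 1.380649e-23                          # Boltzmann constant (J/K)
C = 299_792_458.0                          # speed of light in vacuum (m/s)


def planck_energy_density(omega, temperature):
    """Energy density per unit angular frequency, Planck's law (8.16)."""
    x = HBAR * omega / (KB * temperature)
    return HBAR * omega**3 / (math.pi**2 * C**3) / math.expm1(x)


def balance_energy_density(omega, temperature, b21=1.0):
    """W_T from the rate balance (8.19), using (8.20) and (8.21)."""
    b12 = b21
    a21 = HBAR * omega**3 / (math.pi**2 * C**3) * b21
    n1_over_n2 = math.exp(HBAR * omega / (KB * temperature))
    return a21 / (b12 * n1_over_n2 - b21)


omega_green = 2.0 * math.pi * C / 550e-9
for temperature in (300.0, 3000.0, 6000.0):
    w_planck = planck_energy_density(omega_green, temperature)
    w_balance = balance_energy_density(omega_green, temperature)
    print(temperature, w_planck, w_balance)
```

Both expressions agree at every temperature, which is what the relations (8.21) are required to guarantee.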
Example For green light of λ = 550 nm, we have ω/c = 2π/λ = 1.1424 × 10^7 m⁻¹ and thus
\[ \frac{A_{21}}{B_{21}} = \frac{\hbar\omega^3}{\pi^2 c^3} = 1.5930 \times 10^{-14}\ \mathrm{J\,s\,m^{-3}}. \tag{8.22} \]
Hence the spontaneous and stimulated emission rates are equal if W (ω) = 1.5930 × 10^−14 J s m⁻³.

                     I (W m⁻²)
Mercury lamp         10^4
Continuous laser     10^5
Pulsed laser         10^13

For a (narrow) frequency band dω the time averaged energy density is W (ω) dω and for a
plane wave the energy density is related to the intensity I (i.e. the length of the time averaged
Poynting vector) as:
\[ I = W(\omega)\, d\omega\, c. \]
A typical value for the frequency width of a narrow emission line of an ordinary light source
is 10^10 Hz, i.e. dω = 2π × 10^10 Hz. Hence, the spontaneous and stimulated emission rates
are identical if the intensity is I ≈ 3.0 × 10^5 W/m². Comparing with Table 8.1, only laser
light reaches intensities of this order or higher. For classical light sources the
spontaneous emission rate is much larger than the stimulated emission rate. If a beam with
frequency width dω and energy density W (ω) dω propagates through a material, the rate of loss
of energy is proportional to:
According to (8.18) this is equal to the spontaneous emission rate. Indeed, the spontaneously
emitted light corresponds to a loss of intensity of the beam because it is emitted in random
directions and with random phase.
When N2 > N1 , the light is amplified. This state is called population inversion and it is
essential for the operation of the laser. Note that the ratio of the spontaneous and stimulated
emission rates is, according to (8.21), proportional to ω³. Hence for shorter wavelengths such as
X-rays, it is much more difficult to make lasers than for visible light.
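The ω³ dependence of the ratio A21/B21 in (8.21) is easy to evaluate. The sketch below computes the ratio for green light and for an X-ray wavelength of 1 nm, and the plane-wave intensity at which spontaneous and stimulated emission rates are equal. The relation I = c W(ω) dω for a plane wave in vacuum, the X-ray wavelength and the function names are assumptions made for this illustration.

```python
import math

HBAR = 6.626070040e-34 / (2.0 * math.pi)  # reduced Planck constant (J s), see (8.11)
C = 299_792_458.0                          # speed of light in vacuum (m/s)


def a_over_b(wavelength_m):
    """Ratio A21 / B21 = hbar * omega^3 / (pi^2 c^3), see (8.21)."""
    omega = 2.0 * math.pi * C / wavelength_m
    return HBAR * omega**3 / (math.pi**2 * C**3)


def equal_rate_intensity(wavelength_m, delta_omega):
    """Intensity I = c * W * d_omega at which W equals A21 / B21 (assumed plane wave in vacuum)."""
    return C * a_over_b(wavelength_m) * delta_omega


green = a_over_b(550e-9)
xray = a_over_b(1e-9)
print(f"A21/B21 at 550 nm: {green:.4e} J s m^-3")
print(f"A21/B21 at 1 nm:   {xray:.4e} J s m^-3 (factor {xray / green:.3g} larger)")
print(f"equal-rate intensity at 550 nm, 10 GHz line width: "
      f"{equal_rate_intensity(550e-9, 2.0 * math.pi * 1e10):.3g} W/m^2")
```

Because the ratio scales as 1/λ³, going from 550 nm to 1 nm raises the energy density needed to make stimulated emission dominate by a factor 550³.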
Figure 8.8: ∆N/N as function of t/(A21 + 2B12 W ) when all atoms are in the ground state at
t = 0, i.e. ∆N (0) = −N .
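The behaviour in Fig. 8.8 can be reproduced from the rate equations (8.13)-(8.15). With ∆N = N2 − N1, N1 + N2 = N and B12 = B21, adding the three contributions gives d∆N/dt = −A21 N − (A21 + 2B12 W)∆N, which has a closed-form solution. The sketch below evaluates it; the numerical rates are assumptions in arbitrary units.

```python
import math


def inversion_fraction(t, a21, b_w, initial=-1.0):
    """Delta N(t) / N for a two-level system.

    a21     : spontaneous emission rate A21
    b_w     : product B12 * W (absorption and stimulated emission rate per atom)
    initial : Delta N(0) / N, equal to -1 when all atoms start in the ground state
    """
    rate = a21 + 2.0 * b_w
    steady_state = -a21 / rate
    return steady_state + (initial - steady_state) * math.exp(-rate * t)


# Assumed rates in arbitrary units: weak and very strong radiation fields.
for b_w in (0.1, 1.0, 1000.0):
    values = [inversion_fraction(t, 1.0, b_w) for t in (0.0, 0.5, 5.0, 50.0)]
    print(b_w, [round(v, 4) for v in values])
```

However strong the field, ∆N/N approaches −A21/(A21 + 2B12 W), which stays negative, so two levels alone do not give population inversion.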
A way to achieve population inversion of levels 1 and 2 and hence amplification of the radiation
with frequency ω with ħω = E2 − E1 is to use more atomic levels, for example three. In Fig. 8.9
the ground state is state 1 with two upper levels 2 and 3 such that E1 < E2 < E3 . The transition
of interest is still that from level 2 to level 1. Initially almost all atoms are in the ground state
1. Atoms are pumped with rate R from level 1 directly to level 3. The transition 3 → 2 is
non-radiative and has high rate A32 so that level 3 is quickly emptied and therefore N3 remains
small. State 2 is called a metastable state because each atom’s residence time in the metastable
state is relatively long. Therefore the population tends to increase and leads to a population
inversion between the metastable state 2 and the lower ground state 1 (which is continuously
being depopulated to the highest level).
To obtain population inversion, a majority of ground state electrons (State 1) must be pro-
moted to the highly excited energy level (State 3), requiring a significant input of external energy.
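A strongly simplified rate-equation sketch makes the pumping requirement quantitative. It assumes that level 3 empties so quickly that N3 ≈ 0, that there is no decay from level 3 directly to level 1, and that stimulated transitions are negligible (no light in the cavity yet), so that dN2/dt = R N1 − A21 N2 with N1 + N2 = N. These simplifications and the function name are assumptions made for illustration only.

```python
def three_level_steady_state(pump_rate, a21, total=1.0):
    """Steady state of dN2/dt = R*N1 - A21*N2 with N1 + N2 = N and N3 ~ 0."""
    n2 = total * pump_rate / (pump_rate + a21)
    n1 = total - n2
    return n1, n2


# Assumed rates in units of A21.
for pump_rate in (0.5, 1.0, 2.0):
    n1, n2 = three_level_steady_state(pump_rate, 1.0)
    print(f"R = {pump_rate}: N1 = {n1:.3f}, N2 = {n2:.3f}, inverted: {n2 > n1}")
```

In this simplified picture inversion requires R > A21, i.e. more than half of the atoms have to be removed from the ground state.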
Note that when A31 is not small, level 1 will quickly be filled by which population inversion
will be stopped. Then the laser output is a series of pulses. To have a continuous laser output,
atoms in level 1 should quickly decay to level 0.
Pumping may be done optically as described, but the required energy to transfer the atoms
from level 0 to level 2 can also be supplied by an electrical discharge in a gas or by an electric
current. After the pumping has achieved population inversion, there is initially no light emitted.
So how does the laser actually start? Lasing starts by spontaneous emission. The spontaneously
emitted photons stimulate atoms in level 2 to decay to level 1 while emitting a
photon of energy ħω. This stimulated emission occurs in phase with the exciting light and hence
the light continuously builds up coherently while it is bouncing back and forth between the
mirrors. One of the mirrors is slightly transparent and in this way some of the light is leaking
out of the laser.
8.4 Cavities
The amplifying medium can completely fill the space between the mirrors as at the top in
Fig. 8.10, or there can be space between the amplifier and the mirrors. For example, if the
amplifier is a gas it may be enclosed by a glass cylinder. The end faces of the cylinder are
positioned at the Brewster angle with respect to the axis as shown in the middle figure of
Fig. 8.10, to minimise reflections. This type of resonator is called a resonator with external
mirrors.
Usually one or both mirrors are concave as seen from inside the cavity, as shown in the bottom
figure of Fig. 8.10. We state
without proof that in that case the distance L between the mirrors and the radii of curvature
R1 and R2 of the mirrors have to satisfy
\[ 0 < \left(1 - \frac{L}{R_1}\right)\left(1 - \frac{L}{R_2}\right) < 1, \tag{8.29} \]
or else the laser light will ultimately leave the cavity. This condition is called the condition of
stability of the laser. The radii of curvature are positive for mirrors that are concave as seen from
inside the cavity; negative values have to be substituted when the mirror is convex. Clearly, when
both mirrors are convex, the laser is always unstable.
8.5. Problems of Laser Operation
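The stability condition (8.29) is straightforward to test for a given cavity. In the sketch below the radii are signed exactly as in (8.29), a plane mirror is represented by an infinite radius, and the example cavities are assumed values.

```python
import math


def stability_product(length, r1, r2):
    """The product (1 - L/R1) * (1 - L/R2) appearing in (8.29)."""
    return (1.0 - length / r1) * (1.0 - length / r2)


def is_stable(length, r1, r2):
    """True when the product lies strictly between 0 and 1."""
    return 0.0 < stability_product(length, r1, r2) < 1.0


# Assumed example cavities with L = 1 m (radii in metres, signed as in (8.29)).
examples = {
    "R1 = R2 = 2": (2.0, 2.0),
    "R1 = R2 = -2": (-2.0, -2.0),
    "R1 = R2 = 0.4": (0.4, 0.4),
    "plane mirror and R2 = 2": (math.inf, 2.0),
}
for label, (r1, r2) in examples.items():
    print(f"{label}: product = {stability_product(1.0, r1, r2):.2f}, "
          f"stable = {is_stable(1.0, r1, r2)}")
```

Two negative radii always give a product larger than 1, and positive radii that are too short compared with L also violate the condition.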
Figure 8.10: Three types of laser cavity. The yellow region is the amplifier. The middle case is
called a laser with external mirrors.
than the losses. One then says that the laser is above threshold for only one frequency. This can
be done by choosing the length L of the cavity so small that there is only one mode under the
gain curve, with gain higher than the losses. However, a small length of the amplifier means less
output power. Another method would be to restrict the pumping so that for only one mode the
gain compensates the losses. But this implies again that the laser output power is very limited.
A better solution is to add a Fabry-Perot cavity inside the laser cavity as shown in Fig. 8.12. The
cavity consists e.g. of a piece of glass of a certain thickness a. By choosing a sufficiently small, the
distance in frequency c/(2a) between the resonances of the Fabry-Perot cavity becomes so large
that there is only one Fabry-Perot resonance under the gain curve of the amplifier. Furthermore,
by choosing the proper angle for the Fabry-Perot cavity with respect to the axis of the laser cavity,
the Fabry-Perot resonance can be coupled to the desired resonance frequency. This frequency is
then the frequency of the laser output. All other resonance frequencies of the resonator under
the gain curve are damped because they are not a resonance of the Fabry-Perot cavity.
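The mode-selection argument above can be made concrete with a small calculation. The sketch counts roughly how many cavity resonances fit under the gain curve and checks whether the spacing c/(2a) of the inserted Fabry-Perot cavity exceeds the gain bandwidth. The gain bandwidth of 1.5 GHz, the cavity length and the thickness a are assumed values, and the refractive index of the glass is ignored as in the text.

```python
C = 299_792_458.0  # speed of light in vacuum (m/s)


def resonance_spacing(length_m, n=1.0):
    """Frequency spacing c / (2 n L) between neighbouring resonances."""
    return C / (2.0 * n * length_m)


def modes_under_gain(gain_bandwidth_hz, length_m, n=1.0):
    """Approximate number of resonances that fit inside the gain bandwidth."""
    return int(gain_bandwidth_hz // resonance_spacing(length_m, n))


gain_bandwidth = 1.5e9   # assumed gain bandwidth (Hz)
cavity_length = 1.0      # assumed laser cavity length (m)
etalon_thickness = 5e-3  # assumed thickness a of the inserted Fabry-Perot cavity (m)

print("cavity modes under the gain curve:", modes_under_gain(gain_bandwidth, cavity_length))
print("Fabry-Perot spacing exceeds gain bandwidth:",
      resonance_spacing(etalon_thickness) > gain_bandwidth)
```

With these assumed numbers about ten cavity modes compete, while at most one resonance of the thin Fabry-Perot cavity falls under the gain curve.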
Figure 8.11: Laser with cavity of length L and broad amplifier gain curve. Many resonance
frequencies of the resonator are above threshold, i.e. their gain compensates the losses.
8.6. Types of Lasers
Figure 8.12: Laser with cavity of length L and broad amplifier gain curve. Many resonance
frequencies are below the gain curve and have gain which is above the red dashed line, to
compensate the losses. Such modes are referred to as being above threshold.
can be eliminated by inserting an aperture in the laser cavity. This aperture is so small that the
transverse modes suffer high scattering losses, but is sufficiently large so that the Gaussian mode
is not affected.
Figure 8.14: Intensity pattern of several transverse modes.
Figure 8.15: Resonance frequencies of transverse modes that have sufficient gain to compensate
the losses.
If A is the atom in the ground state and A∗ is the excited atom, we have
ħω02 + A → A∗ , (8.30)
where ω02 is the frequency for the transition 0 → 2. The Ruby laser (the amplifier consists of
Al2O3 with 0.05 weight percent Cr2O3) was the first laser, invented in 1960. It emits pulses of
light of wavelength 694.3 nm and is optically pumped with a gas discharge lamp. Other optically
pumped lasers are the YAG, glass, fiber, semiconductor and the dye laser.
In the dye laser the amplifier is a liquid (e.g. Rhodamine6G). It is optically pumped by an
argon laser and it has a huge gain width, which covers almost the complete visible wavelength
range. We can select a certain wavelength by inserting a dispersive element like the Fabry-Perot
cavity inside the laser cavity and rotate it at the right angle to select the desired wavelength, as
explained above.
where e(E1 ) means an electron with energy E1 and where E1 − E2 is equal to ħω02 so that the
atom is transferred from the ground state to state 2 to obtain population inversion. Examples
are the Argon, Krypton, Xenon, Nitrogen and Copper lasers. Electrons can be created by a
discharge or by an electron beam.
Let B^m be atom B in an excited, so-called metastable state. This means that B^m, although
unstable, has a very long relaxation time, i.e. longer than 1 ms or so. If B^m collides with atom
A, it transfers energy to A:
B^m + A → B + A∗ , (8.32)
where A∗ is the excited state used for the stimulated emission. Let τm1 be the relaxation time of
metastable state B^m; then τm1 is very large and hence the spontaneous emission rate is very
small. This implies that the number of metastable atoms as function of time t is given by a
slowly decaying exponential function exp(−t/τm1 ). How can one get metastable atoms? One
can for example pump atoms B from their ground state 1 to an excited state 3 above state m,
such that the spontaneous emission rate 3 → m is large. The pumping can be done electrically
or using electron collisions or by any other means. If it is done electrically, then we have
Examples of these types of laser are He-Ne (which emits in the red at 632.8 nm), N2 -CO2 and
He-Cd. All of these depend on atom or molecule collisions, where the atom or molecule that
is mentioned first in the name is brought into the metastable state and the lasing occurs at a
wavelength corresponding to a level difference of the second mentioned atom (or molecule). In
the simplest case the metastable states are created by electrons generated by a discharge. The
CO2 laser emits at 10 µm and can achieve huge power.
Figure 8.19: HeNe laser with spherical external mirrors, a discharge tube with faces at the
Brewster angle to minimise reflections, and an anode and cathode for the discharge pumping.
Figure 8.20: Semiconductor laser with active p-n junction, polished end faces and current supply
for pumping.
Appendices
Appendix A
Vector Calculus
Below, A, B, C, and D are vector fields (or constant vectors) and φ and ψ are scalar functions.
Then:
A · (B × C) = B · (C × A) = C · (A × B). (A.1)
A × (B × C) = (A · C)B − (A · B)C. (A.2)
(A × B) · (C × D) = (A · C)(B · D) − (A · D)(B · C) (A.3)
∇ · (φA) = φ∇ · A + ∇φ · A. (A.4)
∇ × (φA) = φ∇ × A + ∇φ × A. (A.5)
∇ · (A × B) = −A · ∇ × B + B · ∇ × A. (A.6)
∇ × (A × B) = −(A · ∇)B + A∇ · B + (B · ∇)A − B∇ · A. (A.7)
∇(A · B) = (A · ∇)B + A × ∇ × B + (B · ∇)A + B × ∇ × A. (A.8)
∇ · ∇φ = ∆φ, (A.9)
where ∆ = ∂²/∂x² + ∂²/∂y² + ∂²/∂z² provided that (x, y, z) is an orthonormal basis.
∇ × ∇ × A = −∆A + ∇∇ · A. (A.10)
Remark: The last formula is only valid in a Cartesian coordinate system. This means that
the vector field A must be decomposed on the Cartesian basis and the derivatives must be com-
puted with respect to the corresponding Cartesian coordinates and then ∆A must be interpreted
component-by-component: ∆A= (∆Ax , ∆Ay , ∆Az )T , where Ax , Ay , Az are components with
respect to the Cartesian basis. The formula does not hold in cylindrical or spherical coordinates!
∇ × ∇φ = 0. (A.11)
∇ · (∇ × A) = 0. (A.12)
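The purely algebraic identities (A.1)-(A.3) can be spot-checked numerically with random vectors. The sketch below is only a sanity check under floating-point arithmetic, not a proof, and the helper names are chosen here for illustration.

```python
import random


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def max_residual(seed=0):
    """Largest deviation between both sides of (A.1), (A.2) and (A.3)."""
    rng = random.Random(seed)
    A, B, C, D = (tuple(rng.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(4))
    # (A.1): cyclic invariance of the triple product.
    triple = dot(A, cross(B, C))
    r1 = max(abs(triple - dot(B, cross(C, A))), abs(triple - dot(C, cross(A, B))))
    # (A.2): A x (B x C) = (A.C) B - (A.B) C.
    lhs = cross(A, cross(B, C))
    rhs = tuple(dot(A, C) * b - dot(A, B) * c for b, c in zip(B, C))
    r2 = max(abs(x - y) for x, y in zip(lhs, rhs))
    # (A.3): (A x B).(C x D) = (A.C)(B.D) - (A.D)(B.C).
    r3 = abs(dot(cross(A, B), cross(C, D))
             - (dot(A, C) * dot(B, D) - dot(A, D) * dot(B, C)))
    return max(r1, r2, r3)


print(max(max_residual(seed) for seed in range(20)))
```

The printed residual is at the level of floating-point rounding errors.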
In addition, the following integral theorems apply (V is a volume with surface area S and
outward unit normal n).
Gauss’s Theorem (or divergence theorem):
\[ \iiint_V \nabla \cdot \mathbf{A}\, dV = \iint_S \mathbf{A} \cdot \mathbf{n}\, dS. \tag{A.13} \]
Apply this to the vector field A = φ∇ψ. Because of (A.4) and (A.9), ∇ · A = φ∆ψ + ∇φ · ∇ψ
and thus (Green’s Theorem):
\[ \iiint_V \left( \phi \Delta \psi + \nabla\phi \cdot \nabla\psi \right) dV = \iint_S \phi\, \frac{\partial \psi}{\partial n}\, dS. \tag{A.14} \]
By subtracting the analogous relation from (A.14) with φ and ψ interchanged, one gets:
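\[ \iiint_V \left( \phi \Delta \psi - \psi \Delta \phi \right) dV = \iint_S \left( \phi\, \frac{\partial \psi}{\partial n} - \psi\, \frac{\partial \phi}{\partial n} \right) dS. \tag{A.15} \]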
= (n × A) · B) dS
Z ZS
= A · (n × B) dS, (A.16)
S
where n is the unit vector field that is perpendicular to S, which is in the direction to which a
right-handed corkscrew points if it is rotated in the positive direction of the line integral along
C.
There is also an analogue of Green’s Theorem for the curl operator:
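\[ \iiint_V (\nabla \times \mathbf{A}) \cdot \mathbf{B}\, dV - \iiint_V \mathbf{A} \cdot (\nabla \times \mathbf{B})\, dV = \iint_S (\mathbf{n} \times \mathbf{A}) \cdot \mathbf{B}\, dS = -\iint_S \mathbf{A} \cdot (\mathbf{n} \times \mathbf{B})\, dS. \tag{A.18} \]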
Appendix B
The Lorentz Model of Material Dispersion
The Lorentz model, which we already mentioned in Section 2.3, leads to a dispersion relation for
the susceptibility and hence for the permittivity given by
\[ \epsilon(\omega) = \frac{1 + 2q}{1 - q}, \tag{B.1} \]
with
\[ q = \frac{N q_e^2}{3 \epsilon_0 m_e} \sum_j \frac{f_j}{\omega_j^2 - \omega^2 + i\gamma_j \omega}. \tag{B.2} \]
The refractive index is the square root of the permittivity, and its real part is shown in Fig. B.1.
For dilute gases, N is small and hence q is small compared to 1. Then the permittivity
becomes equal to
\[ \epsilon(\omega) \approx 1 + 3q = 1 + \frac{N q_e^2}{\epsilon_0 m_e} \sum_j \frac{f_j}{\omega_j^2 - \omega^2 + i\gamma_j \omega}. \tag{B.3} \]
Figure B.1: Refractive index as function of frequency.
The resonances corresponding to transitions from a lower to a higher energy level of electrons
that are in the inner shells of an atom, typically are in the x-ray region, whereas transitions
of valence electrons can be in the ultra-violet to the visible. Resonances of relative motions of
atoms inside a molecule are often in the infrared. At a resonance, the atom absorbs a photon of
energy ħω equal to the difference between the energy levels. The material is then absorbing and
this corresponds to a permittivity with positive imaginary part. In between the resonances, the
absorption is low so that the imaginary part of the permittivity is almost zero while its real part
is slowly increasing with frequency (this is called "normal" dispersion). Close to a resonance,
the real part of the permittivity is quickly decreasing with frequency (anomalous dispersion).
Appendix C
We consider a time harmonic electric field that is more general than a plane wave (i.e. it is
not necessarily a single plane wave but a superposition of plane waves with wave vectors with
different directions). Let V be a bounded volume with closed boundary A. The time averaged
flux of electromagnetic energy through the boundary A outwards from the volume is given by
the surface integral
\[ F = \iint_A \mathbf{S}(\mathbf{r}) \cdot \hat{\mathbf{n}}\, dA, \tag{C.1} \]
where n̂ is the outwards pointing unit normal. We assume that there are no sources inside V .
There are then two possibilities:
1. F < 0. In this case there is a nonzero net flux into the volume. Because all fields are time
harmonic, there can only be a net influx if electromagnetic energy is absorbed inside the
volume. Hence the imaginary part of the permittivity must be positive. It can be shown
that the time average of the absorbed power is given by
\[ \text{Absorbed e.m. energy} = \frac{\omega}{2} \iiint_V \mathrm{Im}(\epsilon)\, |\mathbf{E}(\mathbf{r})|^2\, dV, \tag{C.2} \]
where E(r) is the complex amplitude of the electric field at position r.
2. F = 0. In this case the net energy flow through the boundary is zero and hence the matter
in the volume does not absorb.
APPENDIX C. ABOUT THE CONSERVATION OF ELECTROMAGNETIC ENERGY
Appendix D
Electromagnetic Momentum
Appendix E
E.1 Definitions
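\[ \mathcal{F}(h)(\xi, \eta) = \iint h(x, y)\, e^{-2\pi i (x\xi + y\eta)}\, dx\, dy. \tag{E.1} \]
\[ \mathcal{F}^{-1}(H)(x, y) = \iint H(\xi, \eta)\, e^{2\pi i (x\xi + y\eta)}\, d\xi\, d\eta. \tag{E.2} \]
where
\[ (g * h)(x, y) = \iint g(x - x', y - y')\, h(x', y')\, dx'\, dy'. \tag{E.8} \]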
where
\[ \hat{h}(n) = \frac{1}{p} \int_0^p h(x)\, e^{-2\pi i n x / p}\, dx. \tag{E.10} \]
where
\[ \mathrm{sinc}(x) = \frac{\sin(\pi x)}{\pi x}. \tag{E.12} \]
APPENDIX E. THE FOURIER TRANSFORM
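\[ \mathcal{F}\left[ e^{-\pi (a^2 x^2 + b^2 y^2)} \right](\xi, \eta) = \frac{1}{|ab|}\, e^{-\pi (\xi^2/a^2 + \eta^2/b^2)}. \tag{E.15} \]
Let
\[ 1_a(x, y) = \begin{cases} 1, & \text{if } \sqrt{x^2 + y^2} \le a, \\ 0, & \text{if } \sqrt{x^2 + y^2} > a. \end{cases} \tag{E.16} \]
Then
\[ \mathcal{F}(1_a)(\xi, \eta) = a\, \frac{J_1\left( 2\pi a \sqrt{\xi^2 + \eta^2} \right)}{\sqrt{\xi^2 + \eta^2}}. \tag{E.17} \]
\[ \mathcal{F}\left[ e^{i\pi (a^2 x^2 + b^2 y^2)} \right](\xi, \eta) = \frac{i}{|ab|}\, e^{-i\pi (\xi^2/a^2 + \eta^2/b^2)}. \tag{E.18} \]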
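The Gaussian transform pair (E.15) factorises into a product of two one-dimensional transforms, so it can be checked numerically in one dimension. The sketch below approximates the Fourier integral with the sign convention of (E.1) by a midpoint rule and compares it with the analytic result. The integration window, the number of samples and the test values of a and ξ are assumptions.

```python
import cmath
import math


def fourier_1d(h, xi, half_width=10.0, samples=4000):
    """Midpoint-rule approximation of the integral of h(x) exp(-2 pi i x xi) dx."""
    dx = 2.0 * half_width / samples
    total = 0j
    for k in range(samples):
        x = -half_width + (k + 0.5) * dx
        total += h(x) * cmath.exp(-2j * math.pi * x * xi)
    return total * dx


a, xi = 1.5, 0.7  # assumed test values
numeric = fourier_1d(lambda x: math.exp(-math.pi * a * a * x * x), xi)
analytic = math.exp(-math.pi * xi * xi / (a * a)) / abs(a)
print(numeric, analytic)
```

The real part of the numerical value agrees with the analytic expression and the imaginary part vanishes within rounding errors.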
Appendix F
Basis transformations
In this section, we discuss the relevance of basis transformations and how to apply them. So
what are basis transformations essentially? It comes down to the following: if we have some
physical object Ψ, we can describe it with a vector (which can in principle be a continuous
function). The form of the vector with which we represent Ψ depends on the basis that we
choose. For example, we could represent a position vector R in Cartesian coordinates (x, y, z),
or in spherical coordinates (ρ, φ, θ), or in cylindrical coordinates (r, φ, z). It is important to note
that the physical object remains unchanged, it is only the coefficients with which the ob-
ject is represented that change. The formulas that describe how the coefficients for one basis
transform to the coefficients in the other basis constitute the basis transformation. In case
these formulas can be described as a matrix operation, we have a linear basis transformation.
You have encountered this concept in Linear Algebra courses.
Basis transforms are ubiquitous, so it is important to be familiar with them also outside the
context of Optics. For example, if you have some signal Ψ, you can either express it in the time
domain or in the frequency domain. These are two different representations of the same
physical object, and the basis transformation that relates the two is the Fourier transform. In
the discrete case it would read
\[ X_k = \sum_{n=0}^{N-1} x_n\, e^{-2\pi i k n / N}, \tag{F.1} \]
where xn are the coefficients representing the signal in the time domain, and Xk are the coeffi-
cients representing the signal in the frequency domain. Note that this basis transformation can
be described as a matrix operation
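$$\begin{pmatrix} X_0 \\ X_1 \\ X_2 \\ \vdots \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & \dots \\ 1 & e^{-2\pi i/N} & e^{-4\pi i/N} & \dots \\ 1 & e^{-4\pi i/N} & e^{-8\pi i/N} & \dots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \end{pmatrix}, \tag{F.2}$$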
so the Fourier transform is a linear basis transformation. The benefit of applying such a basis
transformation is clear: different bases make different information apparent.
In the time domain one can see how the signal progresses in time, but it is
difficult to identify different frequency components, whereas in the frequency domain it is very
easy to see how much each frequency contributes to the signal, but it is difficult to see how the
signal changes in time. Also, sometimes it is more efficient to describe a signal in one basis than
in the other. For example, if the signal is a sine wave in the time domain, it takes infinitely many
nonzero coefficients (each coefficient being a point in time) to describe it in the time domain,
while it takes only two nonzero coefficients to describe it in the frequency domain. We say that
a signal can be sparse in a certain basis (sparse meaning that it can be represented with few
nonzero coefficients). This sparsity can help in compressing data, or it can be used as a constraint
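The matrix form of the discrete Fourier transform in Eq. (F.2) and the sparsity argument above can be illustrated with a minimal sketch. The signal length N = 64, the choice of a sine with exactly 5 cycles in the window, and the threshold used to count "nonzero" coefficients are assumptions made for this example only.

```python
import numpy as np

def dft_matrix(N):
    """N x N matrix with entries exp(-2 pi i k n / N), as in Eq. (F.2)."""
    n = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(n, n) / N)

N = 64                              # hypothetical signal length
n = np.arange(N)
x = np.sin(2 * np.pi * 5 * n / N)   # sine with exactly 5 cycles in the window

# The basis transformation is a plain matrix-vector product.
X = dft_matrix(N) @ x

# Cross-check against NumPy's FFT, which uses the same sign convention.
matches_fft = np.allclose(X, np.fft.fft(x))

# Count coefficients that are nonzero up to rounding error.
tol = 1e-9
nonzero_time = int(np.sum(np.abs(x) > tol))
nonzero_freq = int(np.sum(np.abs(X) > tol))
print(matches_fft, nonzero_time, nonzero_freq)
```

In this sampled example almost all time-domain coefficients are nonzero, whereas only the two frequency bins at k = 5 and k = N − 5 are. If the sine did not complete an integer number of cycles in the window, the frequency-domain representation would no longer be exactly sparse.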
A similar observation holds for the different representations of a quantum state. One can represent
a quantum state |ψ⟩ in the position basis (i.e. in terms of the eigenvectors of the position
operator x̂), or in the momentum basis (i.e. in terms of the eigenvectors of the momentum operator
p̂). Again, the physical object remains unchanged, but by representing it in different bases,
different parts of information become more apparent. In the position basis it becomes easier
to see where a particle may be located, while in the momentum basis it is easier to see what
momentum it may have. The basis transformation that relates the position representation to
the momentum representation is the Fourier transform. One can also represent a quantum state
|ψ⟩ in the energy basis (i.e. in terms of the eigenvectors of the energy operator Ĥ, also called
the Hamiltonian), in which case it is easier to see what energy a particle may have, and which
makes it easier to calculate the time-evolution of the wave function (because the time evolution
is determined by the Schrödinger equation, which is a differential equation involving Ĥ).
So we have seen that basis transformations can make certain properties of a vector
more apparent, or make its description simpler (i.e. more sparse). Another advantage
of a basis transformation is that applying operators can be easier in a certain
basis. In particular, applying a linear operator A to some vector Ψ is much easier if Ψ is
expressed in the eigenbasis of A. Suppose we can write
$$\Psi = \sum_k a_k v_k, \tag{F.3}$$
where the v_k are eigenvectors of A with eigenvalues λ_k, i.e. A v_k = λ_k v_k. Then, because A is
linear, applying A amounts to multiplying each coefficient by the corresponding eigenvalue:
$$A\Psi = \sum_k \lambda_k a_k v_k.$$
So we have seen how one may benefit from expressing Ψ in terms of eigenvectors of A. But
if Ψ is given in some arbitrary basis, how do we find the coefficients that represent it in the
eigenbasis of A? To do this, let us consider a simple example. Suppose Ψ has the following
representation in the x̂, ŷ basis
Ψ = 4x̂ + 7ŷ. (F.6)
Or in vector notation
$$\Psi_{xy} = \begin{pmatrix} 4 \\ 7 \end{pmatrix}. \tag{F.7}$$
Keep in mind that this is not the vector corresponding to Ψ. Rather, it is a representation of
Ψ which holds in the x̂, ŷ basis (i.e. it should be understood that the first entry in the vector
is the coefficient corresponding to x̂, and the second entry is the coefficient corresponding to ŷ).
Now, let us suppose that the linear operator A has eigenvectors
$$v_1 = 1\hat{x} + 3\hat{y}, \qquad v_2 = 2\hat{x} + 1\hat{y}. \tag{F.8}$$
Suppose we want to write Ψ in the v1, v2 basis. We need to find Ψ[v1], Ψ[v2] such that
$$\Psi = \Psi[v_1]\, v_1 + \Psi[v_2]\, v_2.$$
Obviously
Ψ = 2v1 + 1v2 , (F.11)
because in the x̂, ŷ basis this gives
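$$\begin{pmatrix} 4 \\ 7 \end{pmatrix} = 2\begin{pmatrix} 1 \\ 3 \end{pmatrix} + \begin{pmatrix} 2 \\ 1 \end{pmatrix}. \tag{F.12}$$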
Once again, let us emphasize that although Ψ is represented with different numbers, the object
itself hasn’t changed.
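The worked example above can be reproduced numerically. The sketch below collects v1 and v2 as the columns of a matrix (named B here) and solves the resulting linear system for the coefficients of Ψ in the v1, v2 basis. Using `np.linalg.solve` is one possible way to do this, not a method prescribed by the notes.

```python
import numpy as np

# Columns of B are v1 and v2 expressed in the x, y basis (Eq. F.8).
B = np.array([[1.0, 2.0],
              [3.0, 1.0]])

# Representation of Psi in the x, y basis (Eq. F.7).
psi_xy = np.array([4.0, 7.0])

# Find the coefficients c with  B @ c = psi_xy,
# i.e. Psi = c[0] * v1 + c[1] * v2.
psi_v = np.linalg.solve(B, psi_xy)

# Going back to the x, y basis recovers the original representation.
psi_back = B @ psi_v

print(psi_v)     # coefficients in the v1, v2 basis
print(psi_back)  # representation in the x, y basis again
```

The two arrays `psi_v` and `psi_xy` contain different numbers but describe the same object Ψ.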
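Let us put our previous calculations in more general terms. We know the representations of Ψ,
v1, v2 in the x̂, ŷ basis
$$\Psi = \begin{pmatrix} \Psi[x] \\ \Psi[y] \end{pmatrix}, \quad v_1 = \begin{pmatrix} v_1[x] \\ v_1[y] \end{pmatrix}, \quad v_2 = \begin{pmatrix} v_2[x] \\ v_2[y] \end{pmatrix}, \tag{F.14}$$
and we seek the coefficients Ψ[v1], Ψ[v2] for which
$$\begin{pmatrix} \Psi[x] \\ \Psi[y] \end{pmatrix} = \Psi[v_1] \begin{pmatrix} v_1[x] \\ v_1[y] \end{pmatrix} + \Psi[v_2] \begin{pmatrix} v_2[x] \\ v_2[y] \end{pmatrix}.$$
Here, Ψ[x], Ψ[y] represent Ψ in the x̂, ŷ basis, and Ψ[v1], Ψ[v2] represent Ψ in the v1, v2 basis.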
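Thus, defining the matrix
$$B = \begin{pmatrix} v_1[x] & v_2[x] \\ v_1[y] & v_2[y] \end{pmatrix}, \tag{F.16}$$
to go from the v1, v2 representation of Ψ to the x̂, ŷ representation of Ψ, we must calculate
$$\begin{pmatrix} \Psi[x] \\ \Psi[y] \end{pmatrix} = B \begin{pmatrix} \Psi[v_1] \\ \Psi[v_2] \end{pmatrix}. \tag{F.17}$$
Conversely, to go from the x̂, ŷ representation to the v1, v2 representation, we apply the inverse matrix
$$\begin{pmatrix} \Psi[v_1] \\ \Psi[v_2] \end{pmatrix} = B^{-1} \begin{pmatrix} \Psi[x] \\ \Psi[y] \end{pmatrix}.$$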
So now we know how to go from one basis representation to another and back. We have seen
previously that it can be convenient to go to the eigenbasis of a linear operator A, because in
that representation A is diagonal. Thus we can diagonalize A as
$$A = B \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} B^{-1}. \tag{F.19}$$
To summarize, with B^{-1} we go from some x̂, ŷ basis to the eigenbasis of A. The columns of
B contain the eigenvectors of A in the x̂, ŷ basis. Then we apply the operator A, which in its
eigenbasis is a diagonal matrix with its eigenvalues along the diagonal. Then, to go back from
the eigenbasis to the x̂, ŷ basis, we apply B.
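In particular, this is useful when A has to be applied many times, because in that case
$$A^N = B \begin{pmatrix} \lambda_1^N & 0 \\ 0 & \lambda_2^N \end{pmatrix} B^{-1}. \tag{F.20}$$
The same holds for the matrix exponential of A:
$$e^{A} = B \begin{pmatrix} e^{\lambda_1} & 0 \\ 0 & e^{\lambda_2} \end{pmatrix} B^{-1}.$$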
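This is for example used in the solution of the Schrödinger equation
$$\hat{H}|\psi\rangle = i\hbar \frac{d}{dt}|\psi\rangle \quad\Rightarrow\quad |\psi(t)\rangle = e^{-i\hat{H}t/\hbar}|\psi(0)\rangle, \tag{F.22}$$
which indicates why it's useful to describe |ψ(0)⟩ in the energy basis if we want to find its time
evolution.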
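The following sketch checks the diagonalization formulas for powers and exponentials of an operator numerically. It reuses the eigenvectors of Eq. (F.8); the eigenvalues 2 and 0.5, the exponent N = 5 and the use of a truncated Taylor series as an independent reference are assumptions made purely for illustration.

```python
import numpy as np

# Columns of B: eigenvectors v1, v2 in the x, y basis (Eq. F.8).
B = np.array([[1.0, 2.0],
              [3.0, 1.0]])
eigenvalues = np.array([2.0, 0.5])   # hypothetical eigenvalues
B_inv = np.linalg.inv(B)

# Operator A in the x, y basis, built from its eigendecomposition (Eq. F.19).
A = B @ np.diag(eigenvalues) @ B_inv

# A^N via the eigenbasis (Eq. F.20) versus repeated multiplication.
N = 5
A_power_eig = B @ np.diag(eigenvalues**N) @ B_inv
A_power_direct = np.linalg.matrix_power(A, N)

# exp(A) via the eigenbasis versus a truncated Taylor series sum_n A^n / n!.
exp_A_eig = B @ np.diag(np.exp(eigenvalues)) @ B_inv
exp_A_series = np.zeros_like(A)
term = np.eye(2)                     # current term A^n / n!, starting at n = 0
for n in range(1, 30):
    exp_A_series += term
    term = term @ A / n

print(np.allclose(A_power_eig, A_power_direct))
print(np.allclose(exp_A_eig, exp_A_series))
```

In the eigenbasis only the two diagonal entries need to be raised to a power or exponentiated, which is why this route becomes attractive when the operator is applied many times.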
So how can we apply these basis transformations and eigenvalue decompositions in Optics? Well,
suppose we know the transmission axis of a linear polarizer. Let's say it's $\begin{pmatrix} a \\ b \end{pmatrix}$. Then we know
all light polarized in that direction will be transmitted completely, so $\begin{pmatrix} a \\ b \end{pmatrix}$ is an eigenvector of
the polarizer operator with eigenvalue 1. We know that all light polarized in the direction $\begin{pmatrix} b \\ -a \end{pmatrix}$
(i.e. perpendicular to the transmission axis) will be completely blocked, so this is an eigenvector
with eigenvalue 0. Thus, given the transmission axis of a linear polarizer, we can immediately
write down its Jones matrix
$$J = \begin{pmatrix} a & b \\ b & -a \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} a & b \\ b & -a \end{pmatrix}^{-1}. \tag{F.23}$$
Conversely, from the eigenvalue decomposition of a Jones matrix we can immediately see what
its principal axes are, and how it acts on the components along those axes (i.e. whether it’s a
linear polarizer, half-wave plate, quarter-wave plate, or something else).
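Eq. (F.23) translates directly into a few lines of code. The sketch below builds the Jones matrix of a linear polarizer from its transmission axis (a, b) and checks two simple cases; the function name and the test axes are illustrative choices.

```python
import numpy as np

def linear_polarizer_jones(a, b):
    """Jones matrix of a linear polarizer with transmission axis (a, b),
    built from its eigendecomposition as in Eq. (F.23)."""
    B = np.array([[a, b],
                  [b, -a]], dtype=float)   # columns: eigenvectors (a, b) and (b, -a)
    D = np.diag([1.0, 0.0])                # eigenvalues: transmitted, blocked
    return B @ D @ np.linalg.inv(B)

# Transmission axis along x.
J_horizontal = linear_polarizer_jones(1.0, 0.0)

# Transmission axis at 45 degrees.
theta = np.pi / 4
a, b = np.cos(theta), np.sin(theta)
J_diagonal = linear_polarizer_jones(a, b)

print(J_horizontal)
print(J_diagonal)
```

Because the eigenvalues are 1 and 0, the resulting matrix is a projector: applying the same polarizer twice gives the same result as applying it once, and light polarized along (b, −a) is mapped to zero.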
Also, basis transformations can be used to describe optical activity. In optically active media,
there are different refractive indices for left-circularly and right-circularly polarized light, so it is
more convenient to represent the Jones vector in the basis of left-circularly and right-circularly
polarized light, rather than in the basis of two linear orthogonal polarizations.
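A minimal sketch of this idea, under stated assumptions: the two circular polarization states are taken as (1, i)/√2 and (1, −i)/√2 in the linear basis (which of the two is called left or right depends on the sign convention and is deliberately not labeled here), and the medium is modeled as giving them two different phase delays φ1 and φ2, whose values below are hypothetical. Transforming the diagonal matrix back to the linear basis yields a rotation of the polarization, up to a common phase factor.

```python
import numpy as np

def rotation_matrix(theta):
    """2x2 rotation matrix over an angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s, c]])

# Columns: the two circular polarization states in the linear (x, y) basis.
C = np.array([[1, 1],
              [1j, -1j]]) / np.sqrt(2)

def optically_active_jones(phi_1, phi_2):
    """Jones matrix in the linear basis of a medium that is diagonal in the
    circular basis, with phase delays phi_1 and phi_2 (assumed model)."""
    D = np.diag([np.exp(1j * phi_1), np.exp(1j * phi_2)])
    return C @ D @ np.linalg.inv(C)

phi_1, phi_2 = 0.4, 1.0   # hypothetical phase delays in radians
J = optically_active_jones(phi_1, phi_2)

# In the linear basis the same operator is a rotation times a common phase.
expected = np.exp(1j * (phi_1 + phi_2) / 2) * rotation_matrix((phi_2 - phi_1) / 2)
print(np.allclose(J, expected))
```

The operator is trivial (diagonal) in the circular basis, while in the linear basis it mixes the two components.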
It is also interesting to note the equivalence between the Jones vector and the quantum states of
photons that are used as qubits: the polarization of a photon is a two-state quantum-mechanical
system. This qubit can be represented as a superposition of two basis states with complex
coefficients α and β, which are analogous to the entries of the Jones vector. Indeed, in experiments on
quantum information with photons as qubits, wave plates are ubiquitous¹. Also in quantum
information, it is important to be familiar with basis transformations.
Another instance in Optics where we use a basis transformation (or more specifically: an eigen-
value decomposition) in order to apply an operation more easily is in the Angular Spectrum
Method. This method is used to propagate a field U0 , and it is explained in the chapter on
Diffraction Optics. The operation we want to apply in this case is the propagation operator
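$P_{\Delta z}$, which denotes propagation over a distance Δz. To do this, we decompose the field U0 in
eigenfunctions of $P_{\Delta z}$, which are plane waves $e^{i\mathbf{k}\cdot\mathbf{r}}$ because
$$P_{\Delta z}\, e^{i(k_x x + k_y y + k_z z)} = e^{i(k_x x + k_y y + k_z (z+\Delta z))} = e^{i k_z \Delta z}\, e^{i(k_x x + k_y y + k_z z)}.$$
So indeed, $e^{i\mathbf{k}\cdot\mathbf{r}}$ is an eigenfunction of $P_{\Delta z}$, with eigenvalue $e^{i k_z \Delta z}$. The basis transformation
we need to apply in order to decompose U0 into eigenfunctions of $P_{\Delta z}$ is the Fourier transform.
So, as prescribed in Eq. (F.19), to apply the propagation operator we Fourier transform U0 to
decompose it into eigenfunctions of $P_{\Delta z}$, multiply each component by the eigenvalue $e^{i k_z \Delta z}$,
and then inverse Fourier transform to go back to the original basis
$$P_{\Delta z} U_0 = \mathcal{F}^{-1}\left\{ e^{i k_z \Delta z}\, \mathcal{F}\{U_0\} \right\}.$$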
In this framework, it can be easily understood how this method should be altered for propagation
in non-homogeneous media. In that case the plane waves $e^{i\mathbf{k}\cdot\mathbf{r}}$ are no longer eigenfunctions of the
propagation operator, and instead we must find the appropriate eigenfunctions and eigenvalues
for propagation through such a medium.
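For a homogeneous medium, the three steps described above (Fourier transform, multiplication by the eigenvalue, inverse Fourier transform) can be sketched with FFTs. This is a minimal illustration, not the implementation used in the notes: the function name, the square sampling grid, and the numerical values of wavelength, pixel size and distance are assumptions, and practical issues such as zero padding and aliasing are ignored.

```python
import numpy as np

def angular_spectrum_propagate(U0, dx, wavelength, dz):
    """Propagate a sampled 2D field U0 over a distance dz.
    dx is the sample spacing (same in x and y); homogeneous medium assumed."""
    k = 2 * np.pi / wavelength
    ny, nx = U0.shape
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=dx)
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=dx)
    KX, KY = np.meshgrid(kx, ky)
    # k_z follows from |k| = 2 pi / wavelength; the complex square root makes
    # components with kx^2 + ky^2 > k^2 decay (evanescent waves) for dz > 0.
    kz = np.sqrt((k**2 - KX**2 - KY**2).astype(complex))
    spectrum = np.fft.fft2(U0)                  # decompose into plane waves
    spectrum = spectrum * np.exp(1j * kz * dz)  # multiply by the eigenvalues
    return np.fft.ifft2(spectrum)               # back to the original basis

# Check on a single plane wave that fits the grid periodically: it should
# simply be multiplied by its eigenvalue exp(i k_z dz).
N, dx = 64, 1e-6                 # hypothetical grid
wavelength, dz = 0.5e-6, 10e-6   # hypothetical wavelength and distance
coords = np.arange(N) * dx
X, Y = np.meshgrid(coords, coords)
kx0 = 2 * np.pi * 4 / (N * dx)
ky0 = 2 * np.pi * (-3) / (N * dx)
U0 = np.exp(1j * (kx0 * X + ky0 * Y))
kz0 = np.sqrt((2 * np.pi / wavelength)**2 - kx0**2 - ky0**2)

U = angular_spectrum_propagate(U0, dx, wavelength, dz)
print(np.allclose(U, np.exp(1j * kz0 * dz) * U0))
```

The plane-wave check mirrors the eigenfunction argument in the text: a field that is already an eigenfunction of the propagation operator only acquires the phase factor given by its eigenvalue.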
For other explanations of basis transformations, one could go to Khan Academy - Alternate
coordinate systems (bases), and Khan Academy - Showing that an eigenbasis makes for good
coordinate systems.
¹ See e.g. Experimental Demonstration of Blind Quantum Computing, S. Barz et al. (2011).