
An electronic version of this dissertation is available at

http://blackboard.tudelft.nl/.
Copyright © 2018 by Optica, TUDelft

Front Cover: Picture taken at TUDelft by Roland Horsten (2016).


Delft University of Technology

Bachelor Technische Natuurkunde

Optica (TN 2421)

Lecture Notes

Authors:
Paul Urbach
Aurèle Adam
Sander Konijnenberg

March-April 2018
Monday 16th April, 2018, 09:44
Optica Lecture Notes TN2421 2 of 165 Monday 16th April, 2018, 09:44
Contents

1 Introduction

2 The Basics of Electromagnetic and Wave Optics
  2.1 Electromagnetic Theory of Optics and Quantum Optics
  2.2 The Maxwell Equations in Vacuum
  2.3 Maxwell Equations in Matter
  2.4 The Scalar and Vector Wave Equation
  2.5 Time Harmonic Solutions of the Wave Equation
    2.5.1 Time Harmonic Plane Waves
    2.5.2 Complex Notation for Time Harmonic Functions
    2.5.3 Time Harmonic Spherical Waves
  2.6 Time Harmonic Maxwell Equations in Matter
  2.7 Time Harmonic Electromagnetic Plane Waves
  2.8 Electromagnetic Energy
  2.9 Time Averaged Energy
  2.10 Reflection and Transmission at an Interface
    2.10.1 Boundary Conditions
    2.10.2 S-polarisation
    2.10.3 P-polarization
    2.10.4 Interpretation of the Fresnel Coefficients
    2.10.5 Total Internal Reflection and Evanescent Waves
  2.11 Fiber Optics

3 Geometrical Optics
  3.1 Introduction
  3.2 Fermat’s Principle
  3.3 Some Consequences of Fermat’s Principle
  3.4 Perfect Imaging by Conic Sections
  3.5 Gaussian Geometrical Optics
    3.5.1 Gaussian Imaging by a Single Spherical Surface
    3.5.2 The Thin Lens
    3.5.3 Construction of the Image of a Finite Object
    3.5.4 Two Thin Lenses
    3.5.5 The Matrix Method
    3.5.6 The Thick Lens
  3.6 Stops
  3.7 Beyond Gaussian Geometrical Optics
    3.7.1 Aberrations
  3.8 Beyond Geometrical Optics

4 Optical Instruments
  4.1 The Camera Obscura
  4.2 The Camera
    4.2.1 Camera in a Mobile Phone
  4.3 The Human Eye
    4.3.1 Accommodation
    4.3.2 Retina
    4.3.3 Eyeglasses
    4.3.4 New Correction Technique
  4.4 Magnifying glass
    4.4.1 Magnifying power
    4.4.2 Nomenclature
  4.5 Eyepieces
  4.6 The Compound Microscope
  4.7 Telescopes

5 Polarisation
  5.1 Polarization states, Jones Vectors, Jones Matrices
  5.2 Creating and manipulating polarisation states
    5.2.1 Jones Matrices
    5.2.2 Linear Polarizers
    5.2.3 Quarter-Wave Plates
    5.2.4 Half-Wave Plates
    5.2.5 Full-Wave Plates
  5.3 How to Determine Whether a Matrix Corresponds to a Linear Polariser or a Wave Plate
  5.4 Decomposition of Elliptical Polarisation into Linear and Circular States

6 Interference and coherence
  6.1 Interference of Monochromatic Fields of the Same Frequency
  6.2 Coherence
    6.2.1 Polychromatic Light and its Intensity
    6.2.2 Temporal Coherence and the Michelson Interferometer
    6.2.3 Spatial Coherence and Young’s Experiment
  6.3 Change of Spatial Coherence due to Propagation
  6.4 Fringe Visibility
  6.5 Interference and polarization

7 Scalar Diffraction Optics
  7.1 Angular Spectrum Method
  7.2 Rayleigh-Sommerfeld Diffraction Integral
  7.3 Equivalence of the Two Propagation Methods
  7.4 Intuition for the Spatial Fourier Transform in Optics
  7.5 The Fresnel and Fraunhofer Approximations
    7.5.1 Fresnel Approximation
    7.5.2 Fraunhofer Approximation
    7.5.3 Examples of Fraunhofer fields
  7.6 Fourier Optics
    7.6.1 Focussing of a Parallel Incident Beam
    7.6.2 Imaging by a lens
    7.6.3 Optical Fourier Filtering
  7.7 Superresolution

8 Lasers
  8.1 Unique Properties of Lasers and Their Applications
    8.1.1 High Monochromaticity; Narrow Spectral Width; High Temporal Coherence.
    8.1.2 Highly Collimated Beam; Diffraction Limited Collimation.
    8.1.3 Very Small Focused Spot; Diffraction Limited Focused Spot; High Spatial Coherence.
    8.1.4 High Power; CW and Pulsed.
    8.1.5 Wide Tuning Range.
  8.2 Optical Resonator
  8.3 Amplification
    8.3.1 The A and B Einstein Coefficients
    8.3.2 Relation Between the Einstein Coefficients
    8.3.3 Population Inversion
  8.4 Cavities
  8.5 Problems of Laser Operation
    8.5.1 How to Realize Single Frequency Operation
    8.5.2 How to Prevent Transverse Modes
  8.6 Types of Lasers
    8.6.1 Optical Pumping
    8.6.2 Electron-Collision Pump
    8.6.3 Atom Collision
    8.6.4 Chemical Pump
    8.6.5 Semiconductor Laser

Appendices

A Vector Calculus

B The Lorentz Model of Material Dispersion

C About the Conservation of Electromagnetic Energy

D Electromagnetic Momentum

E The Fourier Transform
  E.1 Definitions
  E.2 General Equations
  E.3 Special Equations

F Basis transformations

Chapter 1

Introduction

As a physics student, you are probably familiar with many concepts of optics and the nature
of light. From secondary school you may remember Snell’s law of refraction, the lens formula
and ray tracing rules, and the interference fringes observed in the double-slit experiment. By
now you have also learned that with Maxwell’s equations one can show that light consists of
electromagnetic waves, that its speed c was found to be constant, which resulted in the
development of the theory of relativity, and that light exhibits a wave-particle duality that is
explained by quantum mechanics and by the De Broglie hypothesis in particular. Although this is
already a rather sizable body of knowledge, there is still much to be taught about optics, which is
what we will do in this course. However, many of these new topics do not necessarily require
knowledge of quantum mechanics or even Maxwell’s equations. Thus, we are in the somewhat
strange situation where we may have to "take a step back" or "forget" some of the things
we already know about light in order to explain certain concepts as simply and clearly as possible.

We remark that what you will learn in this course applies to a much larger part of physics
than only optics. In fact, optics refers strictly speaking only to electromagnetic fields of visible
wavelengths from 390 nm to 780 nm. But everything we will discuss applies to electromagnetic
radiation of any wavelength, from γ radiation of $10^{-13}$ nm wavelength to long radio waves of
wavelength of more than $10^{3}$ m. Since the approximate theories that we will discuss, such
as geometrical optics, are valid provided the wavelength is sufficiently small compared to the
size of the objects occurring in the problem, these theories also apply to any of the mentioned
wavelengths provided the same ratio of wavelength to typical size of the objects holds.

In this course, we will mostly follow the book "Optics" by Eugene Hecht, to which we will
usually refer simply as "Hecht". Although this book offers nice reading, it is very long.
Therefore these Lecture Notes are made self-contained. This means that you do not need the book
of Hecht to master the topics. If you sometimes have the feeling that you need a more detailed
explanation than given in the Lecture Notes, you can use the links to websites with additional
explanations and demonstrations that are provided at the end of several sections. If you would
still like to use a book for further reading, we recommend reading the section(s) in the book of
Hecht to which we refer.

We summarize the content of the Lecture Notes:

• First we recall some basic facts about Maxwell’s equations and show how the wave equation
is derived from these equations, and then we discuss some special solutions such as plane
waves and spherical waves. This material has already been treated in the course "Golven" and
therefore we will go through it quickly during the lectures. You may want to
study it yourself in more detail to refresh your knowledge because you have to master this
chapter to be able to follow other parts of the course. Chapter 2 corresponds to Appendix
1 of Hecht, Sections 2.7-2.9 and some parts of Chapter 3.


• Then we study light from the point of view of Geometrical Optics. The theory studied
can be found in a couple of sections of Chapters 4, 5 and 6 of Hecht. This model of optics
applies to cases where the wavelength of light can be considered to be vanishingly small
compared to other lengths of the problem. In geometrical optics light is considered to
travel as rays. With this concept we can explain phenomena observed in for example the
pinhole camera, or simple microscopes and telescopes.

• Next we study different kinds of polarisation of light and how one can manipulate them.
This corresponds to some sections of Chapter 8 of Hecht.

• Then we discuss the superposition of light waves and the phenomenon of interference of light
and how this depends on a property of light sources called coherence. This corresponds to
certain sections in Chapters 7, 9 and 12 of Hecht.

• In Chapter 7 we treat Diffraction Optics. In this model we describe light as a wave, with
which we can explain phenomena such as interference fringes caused by the interaction of
light with structures of finite size, such as a slit or aperture in a screen. Our treatment of
this subject differs quite a bit from how it is discussed in Hecht, but you may nevertheless
find it useful to read parts of Chapter 10 of Hecht.

• Finally, in Chapter 8 the unique properties of lasers and their applications are discussed.
In discussing lasers, many of the properties of light discussed in previous chapters will play
a role, in particular coherence and diffraction. A laser contains an optical resonator
and we will explain how such a resonator can be used to achieve very high light intensities.
A laser also requires a medium which amplifies the light by stimulated emission. To
understand the mechanism of stimulated emission, the theory of Einstein will be discussed.
We also consider some problems with lasers and how they are commonly solved.

It may bother you that you are sometimes taught models which we know are basically "wrong"
or "inaccurate" (such as Geometrical Optics). But then remember that in the end all of physics
is merely a model that tries to describe reality. Some models, which tend to be more complex,
are more accurate than others, but depending on the phenomena we want to predict, a simpler,
less accurate model may suffice. For example, in a substantial number of practical cases, such as
the modeling of image formation in cameras, geometrical optics is already sufficiently accurate
and a model based on Maxwell’s equations or even the scalar wave equation would be too
computationally demanding. From a pedagogical point of view, it surely seems preferable to
learn the simpler model prior to learning the more accurate model: think of how we need to
learn Newtonian mechanics before we can get a grasp of quantum mechanics or relativity.

Chapter 2

The Basics of Electromagnetic and Wave Optics

The content of this chapter is assumed to be known in the rest of the course.
It has already been treated in the course "Golven"; therefore it will be
discussed only briefly in the lectures. In summary, with the material in
this chapter you should be able to:

• Derive the scalar wave equation for the components of the electromagnetic
field.

• Work with complex notation of time harmonic functions and fields.

• Understand time harmonic plane waves, spherical waves, wave fronts and
the phase velocity.

• Derive long time averages of products of time harmonic functions.

• Compute the rate of energy flow and its long-time average.

• Derive the reflection and transmission of an incident plane wave at an interface.

• Understand the Brewster angle, total internal reflection and evanescent waves.

• Understand the principle of electromagnetic wave guides (fibers).

2.1 Electromagnetic Theory of Optics and Quantum Optics

Maxwell’s equations provide a very complete description of light which includes diffraction,
interference and polarization. Yet this description is, strictly speaking, not fully accurate
because it allows monochromatic electromagnetic waves to carry any amount of energy, whereas
according to quantum optics the energy is quantized. According to quantum optics light is a flow
of massless particles, the photons, each of which carries an extremely small quantum of energy.
For angular frequency ω this energy is $\hbar\omega$, where $\hbar = 6.63 \times 10^{-34}/(2\pi)$ J s.
Optical frequencies $\omega/(2\pi)$ are of the order of $5 \times 10^{14}$ Hz, hence
$\hbar\omega \approx 3.3 \times 10^{-19}$ J.
Quantum optics is mainly important in experiments involving only a small number of photons,
i.e. at very low light intensities, and for specially prepared states of photons (e.g. entangled
states) for which there is no classical description.
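
As a quick numerical check of the photon energy quoted above, the following minimal sketch evaluates ħω for a frequency of 5 × 10^14 Hz. The function name `photon_energy` and the rounded value of Planck's constant are illustrative choices, not part of the lecture notes.

```python
import math

H_PLANCK = 6.63e-34              # Planck constant h in J s (rounded value used in the text)
HBAR = H_PLANCK / (2 * math.pi)  # reduced Planck constant in J s


def photon_energy(frequency_hz: float) -> float:
    """Return the photon energy hbar * omega in joule for a frequency given in Hz."""
    omega = 2 * math.pi * frequency_hz  # angular frequency in rad/s
    return HBAR * omega


energy = photon_energy(5e14)
print(f"Photon energy at 5e14 Hz: {energy:.2e} J")
```

The result is an energy (in joule), which is why the photon energy carries the unit J rather than J s.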


Table 2.1: The mean photon flux density for a sampling of common sources.

Light source                                    Mean photon flux density φ/A
                                                (photons s$^{-1}$ m$^{-2}$)
----------------------------------------------------------------------------
Laser beam (10 mW, He-Ne, focused to 20 µm)     $10^{26}$
Laser beam (1 mW, He-Ne)                        $10^{21}$
Bright sunlight                                 $10^{18}$
Indoor light level                              $10^{16}$
Twilight                                        $10^{14}$
Moonlight                                       $10^{12}$
Starlight                                       $10^{10}$
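
The order of magnitude of the first entry of Table 2.1 can be reproduced with a short calculation. The sketch below assumes a He-Ne wavelength of 632.8 nm and interprets "focused to 20 µm" as the diameter of a circular spot; both are assumptions, as are the function and variable names.

```python
import math

H_PLANCK = 6.626e-34   # Planck constant in J s
C_VACUUM = 2.998e8     # speed of light in vacuum in m/s


def photon_flux_density(power_w: float, wavelength_m: float, spot_diameter_m: float) -> float:
    """Mean photon flux density in photons per second per square metre."""
    photon_energy = H_PLANCK * C_VACUUM / wavelength_m   # energy per photon in J
    photons_per_second = power_w / photon_energy
    spot_area = math.pi * (spot_diameter_m / 2) ** 2     # area of a circular spot in m^2
    return photons_per_second / spot_area


# Assumed parameters: 10 mW He-Ne laser at 632.8 nm, focused to a spot of 20 um diameter.
flux = photon_flux_density(10e-3, 632.8e-9, 20e-6)
print(f"Photon flux density: {flux:.1e} photons/(s m^2)")
```

Under these assumptions the result is of the order of the value listed in the table.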

2.2 The Maxwell Equations in Vacuum


Light is the transfer of electromagnetic energy, linear momentum and angular momentum in the
visible part of the spectrum (wavelengths of 390 to 780 nm). In a vacuum, light can be
mathematically described by two position- and time-dependent vector fields
$\boldsymbol{\mathcal{E}}(\mathbf{r}, t)$ and $\boldsymbol{\mathcal{B}}(\mathbf{r}, t)$, where
$\mathbf{r}$ is the position vector of the point of observation and $t$ is the time. These vector
fields are traditionally called the electric field strength and the magnetic induction,
respectively, and together they are referred to as "the electromagnetic field". This terminology
is explained by the fact that when the fields vary with time, the electric and magnetic field
always occur together, i.e. one does not exist without the other. Only when the fields are
constant in time can there be an electric field without a magnetic field and conversely. The
first case is called electrostatics, the second magnetostatics. In optics the fields change with
time, hence in optics we always have both the electric and the magnetic field.
Time-dependent electromagnetic fields are generated by moving electric charges, the so-called
sources. If the sources have charge density $\rho(\mathbf{r}, t)$ and current density
$\boldsymbol{\mathcal{J}}(\mathbf{r}, t)$, then conservation of charge implies that the rate of
increase of charge inside a volume $V$ must be equal to the flux of charges passing through its
surface $A$ from the outside to the inside of $V$, i.e.:
\[
-\int_A \boldsymbol{\mathcal{J}} \cdot \hat{\mathbf{n}} \, dA = \frac{d}{dt} \int_V \rho \, dV, \tag{2.1}
\]
where $\hat{\mathbf{n}}$ is the outward-pointing unit normal on $A$. Using Gauss’s divergence
theorem (A.13) to convert the left-hand side into a volume integral, the following differential
form of the conservation law is obtained:
\[
-\nabla \cdot \boldsymbol{\mathcal{J}} = \frac{\partial \rho}{\partial t}. \tag{2.2}
\]
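At every point in space and at every time the field vectors satisfy the Maxwell equations
(see footnotes 1 and 2):
\begin{align}
\nabla \times \boldsymbol{\mathcal{E}} &= -\frac{\partial \boldsymbol{\mathcal{B}}}{\partial t}, && \text{(Faraday’s law)}, \tag{2.3} \\
\nabla \times \frac{\boldsymbol{\mathcal{B}}}{\mu_0} &= \epsilon_0 \frac{\partial \boldsymbol{\mathcal{E}}}{\partial t} + \boldsymbol{\mathcal{J}}, && \text{(Maxwell’s law)}, \tag{2.4} \\
\nabla \cdot \epsilon_0 \boldsymbol{\mathcal{E}} &= \rho, && \text{(Gauss’s law)}, \tag{2.5} \\
\nabla \cdot \boldsymbol{\mathcal{B}} &= 0, && \text{(there is no magnetic charge)}, \tag{2.6}
\end{align}

1 Khan Academy - Faraday’s Law Introduction
2 Khan Academy - Magnetic field created by a current carrying wire (Ampere’s Law)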


where $\epsilon_0 = 8.8544 \times 10^{-12}$ C$^2$ N$^{-1}$ m$^{-2}$ is the so-called dielectric
permittivity and $\mu_0 = 1.2566 \times 10^{-6}$ m kg C$^{-2}$ is the magnetic permeability of
vacuum. The quantity $c = (1/\epsilon_0 \mu_0)^{1/2} = 2.997924562 \times 10^{8} \pm 1.1$ m/s is
the speed of light in vacuum.
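
The relation between the speed of light and the two vacuum constants is easy to check numerically. The following minimal sketch uses the rounded values quoted above; the variable names are illustrative.

```python
import math

EPSILON_0 = 8.8544e-12  # permittivity of vacuum in C^2 N^-1 m^-2 (value quoted in the text)
MU_0 = 1.2566e-6        # permeability of vacuum in m kg C^-2 (value quoted in the text)

# Speed of light in vacuum: c = (1 / (epsilon_0 * mu_0))^(1/2)
c_vacuum = 1.0 / math.sqrt(EPSILON_0 * MU_0)
print(f"c = {c_vacuum:.4e} m/s")
```

Because the constants are rounded to five digits, the computed value agrees with the quoted speed of light only to about the same number of digits.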

2.3 Maxwell Equations in Matter


Atoms are neutral and consist of a positive nucleus surrounded by a negative electron cloud. In
an electric field, the centres of mass of the positive and negative charges are displaced and no
longer coincide. Therefore an atom in an electric field behaves like an electric dipole. In polar
molecules, the centres of mass of the positive and negative charges are permanently separated,
even without an electric field. The molecular dipoles are randomly oriented but line up under the
influence of an electric field. Whatever the precise mechanism may be, under the influence of
an electric field a certain dipole moment per unit volume
$\boldsymbol{\mathcal{P}}(\mathbf{r}, t)$ [C/m$^2$] is induced which is proportional to the local
electric field $\boldsymbol{\mathcal{E}}(\mathbf{r}, t)$:
\[
\boldsymbol{\mathcal{P}}(\mathbf{r}, t) = \epsilon_0 \chi \boldsymbol{\mathcal{E}}(\mathbf{r}, t), \tag{2.7}
\]

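where χ is a dimensionless quantity, the so-called susceptibility. We stress that
$\boldsymbol{\mathcal{E}}$ is the total local field at the position of the dipole, i.e. it
contains the contribution of other dipoles that are also excited and radiate fields themselves.
Only in the case of dilute gases can the influence of the other dipoles in matter be neglected,
and the local electric field is then simply given by the field emitted by the external source. A
dipole moment density that changes with time corresponds to a current density
$\boldsymbol{\mathcal{J}}$ [C s$^{-1}$ m$^{-2}$] and a charge density $\rho$ given by
\begin{align}
\boldsymbol{\mathcal{J}}(\mathbf{r}, t) &= \frac{\partial \boldsymbol{\mathcal{P}}(\mathbf{r}, t)}{\partial t} = \epsilon_0 \chi \frac{\partial \boldsymbol{\mathcal{E}}(\mathbf{r}, t)}{\partial t}, \tag{2.8} \\
\rho(\mathbf{r}, t) &= -\nabla \cdot \boldsymbol{\mathcal{P}}(\mathbf{r}, t) = -\nabla \cdot (\epsilon_0 \chi \boldsymbol{\mathcal{E}}). \tag{2.9}
\end{align}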

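Substitution of (2.8) and (2.9) into Maxwell’s equations (2.4) and (2.5) gives
\begin{align}
\nabla \times \frac{\boldsymbol{\mathcal{B}}}{\mu_0} &= \epsilon_0 (1 + \chi) \frac{\partial \boldsymbol{\mathcal{E}}}{\partial t}, \tag{2.10} \\
\nabla \cdot (\epsilon_0 \boldsymbol{\mathcal{E}}) &= -\nabla \cdot (\epsilon_0 \chi \boldsymbol{\mathcal{E}}), \tag{2.11}
\end{align}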

where it is assumed that apart from the currents and charges of the induced polarisation, there
are no other currents and sources in the material considered. We define the permittivity (same
dimension as $\epsilon_0$) of the material by
\[
\epsilon = \epsilon_0 (1 + \chi). \tag{2.12}
\]
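Then
\begin{align}
\nabla \times \boldsymbol{\mathcal{E}} &= -\frac{\partial \boldsymbol{\mathcal{B}}}{\partial t}, && \text{(Faraday’s Law)}, \tag{2.13} \\
\nabla \times \boldsymbol{\mathcal{B}} &= \epsilon \mu_0 \frac{\partial \boldsymbol{\mathcal{E}}}{\partial t}, && \text{(Maxwell’s Law)}, \tag{2.14} \\
\nabla \cdot \epsilon \boldsymbol{\mathcal{E}} &= 0, && \text{(Gauss’s Law)}, \tag{2.15} \\
\nabla \cdot \boldsymbol{\mathcal{B}} &= 0, && \text{(there is no magnetic charge)}. \tag{2.16}
\end{align}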

It is seen that Maxwell’s equations in matter are identical to those in vacuum provided the
permittivity of vacuum is replaced by that of the material.

Remark: If the material is magnetic, the magnetic permeability is different from that of vacuum
and is denoted by $\mu$. In Maxwell’s equations one should then replace $\mu_0$ by $\mu$.
However, at optical frequencies magnetic effects are negligible (except in ferromagnetic
materials, which are rare). We will therefore take $\mu = \mu_0$ in these Lecture Notes.

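If the material is conducting, then there is an additional current density given by Ohm’s Law:
\[
\boldsymbol{\mathcal{J}}_c = \sigma \boldsymbol{\mathcal{E}}, \tag{2.17}
\]
where $\sigma$ [C s$^{-1}$ V$^{-1}$ m$^{-1}$] is the conductivity. This current density has to be
added to the right-hand side of (2.14), multiplied by $\mu_0$ because (2.14) is written for
$\nabla \times \boldsymbol{\mathcal{B}}$. The other Maxwell equations remain unchanged.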

2.4 The Scalar and Vector Wave Equation


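We consider the case of a pure dielectric, i.e. the conduction currents vanish. Take the curl of
(2.13) and the time derivative of (2.14) and add the equations thus obtained. This gives
\[
\nabla \times \nabla \times \boldsymbol{\mathcal{E}} + \epsilon \mu_0 \frac{\partial^2 \boldsymbol{\mathcal{E}}}{\partial t^2} = 0. \tag{2.18}
\]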
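
Written out, the two steps that lead to (2.18) can be sketched as follows. The sketch assumes that the permittivity does not depend on time, so that it can be taken outside the time derivative, and that spatial and time derivatives may be interchanged:

```latex
\begin{align*}
\nabla \times \nabla \times \boldsymbol{\mathcal{E}}
  + \frac{\partial}{\partial t} \left( \nabla \times \boldsymbol{\mathcal{B}} \right) &= 0
  && \text{(curl of (2.13))}, \\
-\frac{\partial}{\partial t} \left( \nabla \times \boldsymbol{\mathcal{B}} \right)
  + \epsilon \mu_0 \frac{\partial^2 \boldsymbol{\mathcal{E}}}{\partial t^2} &= 0
  && \text{(time derivative of (2.14))}.
\end{align*}
```

Adding the two lines cancels the terms that contain the magnetic induction and leaves (2.18).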
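Now for any vector field $\boldsymbol{\mathcal{A}}$ there holds:
\[
\nabla \times \nabla \times \boldsymbol{\mathcal{A}} = -\nabla^2 \boldsymbol{\mathcal{A}} + \nabla (\nabla \cdot \boldsymbol{\mathcal{A}}), \tag{2.19}
\]
where $\nabla^2 \boldsymbol{\mathcal{A}}$ is the vector:
\[
\nabla^2 \boldsymbol{\mathcal{A}} = \begin{pmatrix} \nabla^2 \mathcal{A}_x \\ \nabla^2 \mathcal{A}_y \\ \nabla^2 \mathcal{A}_z \end{pmatrix}, \tag{2.20}
\]
with
\[
\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}. \tag{2.21}
\]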
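Because Gauss’s law (2.15) implies, for a homogeneous material in which $\epsilon$ does not
depend on position, that $\nabla \cdot \boldsymbol{\mathcal{E}} = 0$, (2.19) applied to
$\boldsymbol{\mathcal{E}}$ implies
\[
\nabla \times \nabla \times \boldsymbol{\mathcal{E}} = -\nabla^2 \boldsymbol{\mathcal{E}}, \tag{2.22}
\]
and hence (2.18) becomes
\[
\nabla^2 \boldsymbol{\mathcal{E}} - \epsilon \mu_0 \frac{\partial^2 \boldsymbol{\mathcal{E}}}{\partial t^2} = 0. \tag{2.23}
\]
Hence every component $\mathcal{U}$ of the electric field satisfies the scalar wave equation:
\[
\nabla^2 \mathcal{U} - \epsilon \mu_0 \frac{\partial^2 \mathcal{U}}{\partial t^2} = 0. \tag{2.24}
\]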
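The refractive index is the dimensionless quantity defined by
\[
n = \sqrt{\frac{\epsilon}{\epsilon_0}}. \tag{2.25}
\]
The scalar wave equation can then be written as
\[
\nabla^2 \mathcal{U} - n^2 \epsilon_0 \mu_0 \frac{\partial^2 \mathcal{U}}{\partial t^2} = 0. \tag{2.26}
\]
The speed of light in matter is
\[
\frac{c}{n} = \frac{1}{\sqrt{\epsilon \mu_0}}. \tag{2.27}
\]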
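
The chain from susceptibility to refractive index to speed of light in the material, i.e. (2.12), (2.25) and (2.27), can be illustrated with a few lines of code. This is a minimal sketch; the susceptibility value is hypothetical and chosen only so that the refractive index comes out as 1.5, and the function names are not from the lecture notes.

```python
import math

C_VACUUM = 2.998e8  # speed of light in vacuum in m/s


def refractive_index(chi: float) -> float:
    """Refractive index n = sqrt(epsilon / epsilon_0) = sqrt(1 + chi), using (2.12) and (2.25)."""
    return math.sqrt(1.0 + chi)


def speed_in_matter(chi: float) -> float:
    """Speed of light c/n in a non-magnetic dielectric with susceptibility chi, cf. (2.27)."""
    return C_VACUUM / refractive_index(chi)


chi_example = 1.25  # hypothetical susceptibility, chosen so that n = 1.5
n = refractive_index(chi_example)
v = speed_in_matter(chi_example)
print(f"n = {n:.3f}, c/n = {v:.3e} m/s")
```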


2.5 Time Harmonic Solutions of the Wave Equation


2.5.1 Time Harmonic Plane Waves
Time harmonic solutions depend on time through a cosine or a sine. One can easily verify by
substitution that
\[
\mathcal{U}(\mathbf{r}, t) = \mathcal{A} \cos(kx - \omega t + \varphi), \tag{2.28}
\]
where the amplitude $\mathcal{A} > 0$ and the phase $\varphi$ are constant, is a solution of (2.24) if
\[
k = \omega (\epsilon \mu_0)^{1/2} = \omega n \sqrt{\epsilon_0 \mu_0} = n k_0, \tag{2.29}
\]
where $k_0 = \omega \sqrt{\epsilon_0 \mu_0}$ is the wave number in vacuum. The frequency
$\omega > 0$ can be chosen arbitrarily, but the wave number $k$ is then determined by (2.29). We
define $T = 2\pi/\omega$ and $\lambda = 2\pi/k$ as the period and the wavelength in the material,
respectively. Furthermore, $\lambda_0 = 2\pi/k_0$ is the wavelength in vacuum.

Remark: If we speak of "the wavelength", we always mean the wavelength in vacuum.

We can write (2.28) in the form
\[
\mathcal{U}(x, t) = \mathcal{A} \cos\left[ k \left( x - \frac{c}{n} t \right) + \varphi \right], \tag{2.30}
\]
where $c/n = 1/\sqrt{\epsilon \mu_0}$ is the speed of light in the material. At any time $t$, a
wave front is a surface of constant phase:
\[
x - \frac{c}{n} t = \text{constant}. \tag{2.31}
\]
These surfaces are planes perpendicular to the x-axis, and therefore the wave is called a plane
wave. As time proceeds, the wave front moves with velocity $c/n$ in the positive x-direction.
A time harmonic plane wave is more generally given by
\[
\mathcal{U}(\mathbf{r}, t) = \mathcal{A} \cos(\mathbf{k} \cdot \mathbf{r} - \omega t + \varphi), \tag{2.32}
\]
where $\mathbf{k} = k_x \hat{\mathbf{x}} + k_y \hat{\mathbf{y}} + k_z \hat{\mathbf{z}}$ is the wave
vector and the planes of constant phase are perpendicular to the direction of $\mathbf{k}$.
Eq. (2.32) is a solution of (2.24) if
\[
k_x^2 + k_y^2 + k_z^2 = \omega^2 \epsilon \mu_0 = \omega^2 n^2 \epsilon_0 \mu_0 = k_0^2 n^2. \tag{2.33}
\]
The direction of the wave vector can be chosen arbitrarily, but its length is fixed by the
frequency $\omega$.
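
That a plane wave with a wave vector of the correct length solves the scalar wave equation can also be checked numerically. The sketch below evaluates both terms of (2.26) with central finite differences at one point; the refractive index, the vacuum wavelength, the propagation direction and all names are assumed values chosen for illustration.

```python
import math

C_VACUUM = 2.998e8      # speed of light in vacuum in m/s
N_INDEX = 1.5           # assumed refractive index
WAVELENGTH_0 = 600e-9   # assumed vacuum wavelength in m

k0 = 2 * math.pi / WAVELENGTH_0   # wave number in vacuum
k = N_INDEX * k0                  # wave number in the material, (2.29)
omega = k0 * C_VACUUM             # angular frequency, from k0 = omega / c

# Wave vector of length k along an arbitrarily chosen direction, cf. (2.33).
direction = (1.0, 2.0, 2.0)
norm = math.sqrt(sum(d * d for d in direction))
k_vec = tuple(k * d / norm for d in direction)


def plane_wave(x, y, z, t, amplitude=1.0, phase=0.3):
    """Real plane wave A cos(k . r - omega t + phi), cf. (2.32)."""
    k_dot_r = k_vec[0] * x + k_vec[1] * y + k_vec[2] * z
    return amplitude * math.cos(k_dot_r - omega * t + phase)


def second_derivative(f, h):
    """Central finite-difference estimate of f''(0) with step h."""
    return (f(h) - 2 * f(0.0) + f(-h)) / h**2


# Evaluate both terms of the wave equation at an arbitrary point and time.
x0, y0, z0, t0 = 1e-7, -2e-7, 3e-7, 1e-15
hx = WAVELENGTH_0 / 1000            # spatial step, small compared to the wavelength
ht = (2 * math.pi / omega) / 1000   # time step, small compared to the period

laplacian = (
    second_derivative(lambda s: plane_wave(x0 + s, y0, z0, t0), hx)
    + second_derivative(lambda s: plane_wave(x0, y0 + s, z0, t0), hx)
    + second_derivative(lambda s: plane_wave(x0, y0, z0 + s, t0), hx)
)
d2_dt2 = second_derivative(lambda s: plane_wave(x0, y0, z0, t0 + s), ht)

# Residual of (2.26), using epsilon_0 * mu_0 = 1 / c^2 and scaled by k^2 to be dimensionless.
residual = (laplacian - (N_INDEX / C_VACUUM) ** 2 * d2_dt2) / k**2
print(f"relative residual of the wave equation: {residual:.2e}")
```

The residual is not exactly zero because finite differences only approximate the derivatives, but it is small compared to the individual terms, which are of order one after scaling.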

2.5.2 Complex Notation for Time Harmonic Functions


We consider a time harmonic solution of the wave equation (2.24):
\[
\mathcal{U}(\mathbf{r}, t) = \mathcal{A}(\mathbf{r}) \cos(\varphi(\mathbf{r}) - \omega t), \tag{2.34}
\]
where the amplitude $\mathcal{A}(\mathbf{r}) > 0$ and phase $\varphi(\mathbf{r})$ are functions of
position $\mathbf{r}$. The wave fronts, i.e. the surfaces of constant phase, are now in general
not planes, hence the solution is in general not a plane wave. Eq. (2.34) could for example be a
wave with spherical wave fronts, as discussed below.

Remark: We will show in Section 7.1 that every time-harmonic solution of the wave equation
can always be expanded in terms of, in general, infinitely many plane waves of the form (2.32).
This underlines the importance of studying plane waves.

Figure 2.1: Planes of constant phase and amplitude (panels: a plane wave moving in the
k-direction; wavefronts for a harmonic plane wave).

For time harmonic solutions it is often convenient to use complex notation. Define the
complex amplitude by:
\[
U(\mathbf{r}) = \mathcal{A}(\mathbf{r}) e^{i \varphi(\mathbf{r})}, \tag{2.35}
\]
i.e. the modulus of the complex number $U(\mathbf{r})$ is the amplitude $\mathcal{A}(\mathbf{r})$
and the argument of $U(\mathbf{r})$ is the phase $\varphi(\mathbf{r})$ at $t = 0$. Then (2.34)
can be written as
\[
\mathcal{U}(\mathbf{r}, t) = \mathrm{Re}\left[ U(\mathbf{r}) e^{-i \omega t} \right]. \tag{2.36}
\]
Hence $\mathcal{U}(\mathbf{r}, t)$ is the real part of the complex time harmonic function
\[
U(\mathbf{r}) e^{-i \omega t}. \tag{2.37}
\]

Remark: In the case of vector fields such as $\mathbf{E}$ and $\mathbf{H}$, the complex
amplitudes are also called "complex fields". Complex amplitudes and complex fields are functions
of position $\mathbf{r}$ only; the time-dependent factor $\exp(-i\omega t)$ is omitted. To get
the physically meaningful real quantity, the complex amplitude or complex field first has to be
multiplied by $\exp(-i\omega t)$ and then the real part must be taken.

The following convention is used throughout these Lecture Notes:

Real-valued physical quantities (whether they are time-harmonic or have a more general time
dependence) are denoted by a calligraphic letter, e.g. $\mathcal{U}$, $\mathcal{B}_x$, or
$\mathcal{E}_x$. The symbols are bold when we are dealing with a vector, e.g.
$\boldsymbol{\mathcal{E}}$ or $\boldsymbol{\mathcal{B}}$. The complex amplitude of a
time-harmonic function is linked to the real physical quantity by (2.36) and is written as an
ordinary letter such as $U$ and $\mathbf{E}$.


It is easier to calculate with complex quantities than with trigonometric functions (cosine and
sine). As long as all the operations carried out on the functions are linear, the operations can
be carried out on the complex quantities. To get the real-valued physical quantity (i.e. the
physically meaningful result), you multiply the complex amplitude that is finally obtained by
$\exp(-i\omega t)$ and take the real part. This is allowed because taking the real part commutes
with linear operations, i.e. taking first the real part to get the real-valued physical quantity
and then operating on this real physical quantity gives the same result as operating on the
complex scalar and taking the real part at the end.

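Example: (differentiation with respect to position)

Differentiation is a linear operation, hence it commutes with taking the real part. Indeed,
differentiation of the real function (2.34) gives:
\[
\frac{\partial \mathcal{U}}{\partial x}(\mathbf{r}, t) = \frac{\partial \mathcal{A}(\mathbf{r})}{\partial x} \cos(\varphi(\mathbf{r}) - \omega t) - \mathcal{A}(\mathbf{r}) \frac{\partial \varphi(\mathbf{r})}{\partial x} \sin(\varphi(\mathbf{r}) - \omega t). \tag{2.38}
\]
Differentiation of the corresponding complex amplitude (2.35) gives:
\[
\frac{\partial U}{\partial x}(\mathbf{r}) = \left[ \frac{\partial \mathcal{A}(\mathbf{r})}{\partial x} + i \mathcal{A}(\mathbf{r}) \frac{\partial \varphi(\mathbf{r})}{\partial x} \right] e^{i \varphi(\mathbf{r})}. \tag{2.39}
\]
Multiplying this result by $\exp(-i\omega t)$ and taking the real part yields:
\[
\mathrm{Re}\left[ \frac{\partial U(\mathbf{r})}{\partial x} e^{-i \omega t} \right] = \frac{\partial \mathcal{A}(\mathbf{r})}{\partial x} \cos(\varphi(\mathbf{r}) - \omega t) - \mathcal{A}(\mathbf{r}) \frac{\partial \varphi(\mathbf{r})}{\partial x} \sin(\varphi(\mathbf{r}) - \omega t), \tag{2.40}
\]
which is identical to (2.38). This proves that differentiation with respect to a spatial
coordinate commutes with taking the real part.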
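
The example above can be mirrored numerically in one dimension. The sketch below uses a hypothetical amplitude and phase profile (both invented for illustration) and compares a finite-difference derivative of the real field with the real part obtained from the differentiated complex amplitude.

```python
import cmath
import math

OMEGA = 2.0  # angular frequency (arbitrary units, illustrative)


def amplitude(x):
    """Hypothetical real amplitude A(x) > 0."""
    return math.exp(-x**2)


def phase(x):
    """Hypothetical phase phi(x)."""
    return 2.0 * x + 0.5 * x**2


def real_field(x, t):
    """Real field A(x) cos(phi(x) - omega t), cf. (2.34)."""
    return amplitude(x) * math.cos(phase(x) - OMEGA * t)


def complex_amplitude_derivative(x):
    """dU/dx for U = A exp(i phi), cf. (2.39)."""
    dA_dx = -2.0 * x * amplitude(x)   # derivative of exp(-x^2)
    dphi_dx = 2.0 + x                 # derivative of 2x + x^2/2
    return (dA_dx + 1j * amplitude(x) * dphi_dx) * cmath.exp(1j * phase(x))


x0, t0, h = 0.4, 0.7, 1e-5

# Route 1: differentiate the real field directly (central finite difference).
direct = (real_field(x0 + h, t0) - real_field(x0 - h, t0)) / (2 * h)

# Route 2: differentiate the complex amplitude, multiply by exp(-i omega t), take the real part.
via_complex = (complex_amplitude_derivative(x0) * cmath.exp(-1j * OMEGA * t0)).real

print(direct, via_complex)
```

The two routes agree up to the small error of the finite-difference approximation.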

Remarks
1. The complex quantity of which the real part has to be taken is: U exp(−iωt). It is not
necessary to drag the time dependent factor exp(−iωt) along in the computations: it suffices
to compute with the complex amplitude U , but to get the physical relevant quantity, you of
course first have to multiply U by exp(−iωt) and then take the real part. However, omitting
the factor exp(−iωt) and computing only with the complex amplitude U requires that
everywhere the differentiation with respect to time: ∂/∂t is replaced by multiplication by
−iω. This is for example done in the time-harmonic Maxwell’s equations in Section 2.6
below.
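2. An example showing that complex computations cannot be used for nonlinear operations is
taking the square of a time-harmonic function. Suppose the real-valued quantity is
$\mathcal{U} = \mathcal{A} \cos(\varphi - \omega t)$. In complex notation:
$$ \mathcal{U} = \mathrm{Re}\left[ U e^{-i\omega t} \right] = \mathrm{Re}\left[ \mathcal{A} e^{i\varphi - i\omega t} \right], \tag{2.41} $$
with $U = \mathcal{A} \exp(i\varphi)$. Squaring the real quantity gives
$$ \mathcal{U}^2 = \mathcal{A}^2 \cos^2(\varphi - \omega t). \tag{2.42} $$
Squaring the complex quantity gives
$$ U^2 = \mathcal{A}^2 e^{2i\varphi}. \tag{2.43} $$
Multiply by exp(−iωt) and then take the real part:
$$ \mathrm{Re}\left[ U^2 e^{-i\omega t} \right] = \mathrm{Re}\left[ \mathcal{A}^2 e^{i(2\varphi - \omega t)} \right] = \mathcal{A}^2 \cos(2\varphi - \omega t). \tag{2.44} $$
This is not the same as (2.42). Hence when the computations are nonlinear, such as in
computing the energy (squaring the field), do NOT use complex fields but work with the
real fields.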
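
A small numerical sketch of this remark, with arbitrarily chosen (hypothetical) values for the amplitude, phase, frequency and time: squaring the real field and squaring the complex amplitude give different numbers.

```python
import cmath
import math


def square_of_real_field(amp, phi, omega, t):
    """Square of the real field amp * cos(phi - omega t), as in (2.42)."""
    return (amp * math.cos(phi - omega * t)) ** 2


def real_part_of_squared_amplitude(amp, phi, omega, t):
    """Re[U^2 exp(-i omega t)] with U = amp * exp(i phi), as in (2.44)."""
    U = amp * cmath.exp(1j * phi)
    return (U**2 * cmath.exp(-1j * omega * t)).real


if __name__ == "__main__":
    # Illustrative values only; any generic choice shows the mismatch.
    amp, phi, omega, t = 2.0, 0.4, 1.0, 0.0
    print("square of the real field:     ", square_of_real_field(amp, phi, omega, t))
    print("naive complex computation:    ", real_part_of_squared_amplitude(amp, phi, omega, t))
```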


2.5.3 Time Harmonic Spherical Waves


A spherical wave depends on position only through the distance to a fixed point, which we choose
as the origin of our coordinate system. We thus seek a solution of the form $\mathcal{U}(r, t)$ with
$r = \sqrt{x^2 + y^2 + z^2}$. For spherically symmetric functions we have
$$ \nabla^2 \mathcal{U}(r, t) = \frac{1}{r} \frac{\partial^2}{\partial r^2} \left[ r\, \mathcal{U}(r, t) \right]. \tag{2.45} $$
The scalar wave equation (2.24) becomes, after multiplying by r:
$$ \frac{\partial^2}{\partial r^2} \left[ r\, \mathcal{U} \right] = \epsilon \mu_0 \frac{\partial^2}{\partial t^2} \left[ r\, \mathcal{U} \right]. \tag{2.46} $$
It is easy to see that
$$ \mathcal{U}(r, t) = \frac{f(r \pm ct/n)}{r} \tag{2.47} $$
satisfies (2.46) for any choice of the function f. The surfaces of constant phase are spheres:
$$ r = \text{constant} \mp \frac{c}{n}\, t, \tag{2.48} $$
and therefore the solutions are called spherical waves. Of particular interest are the time harmonic
spherical waves:
$$ \mathcal{U}(r, t) = \frac{\mathcal{A}}{r} \cos(kr \pm \omega t + \varphi), \tag{2.49} $$
where
$$ k = n\omega/c. \tag{2.50} $$
If the − sign applies, the wave propagates outwards, away from the origin. The wave is then
radiated by a source at the origin. In fact, if the − sign holds in (2.49), then a surface of constant
phase moves outwards when time increases. Similarly, if the + sign holds, the wave propagates
towards the origin, corresponding to a sink at the origin.
In either case the amplitude of the wave, $\mathcal{A}/r$, is inversely proportional to the distance, so
that energy is conserved. Energy is proportional to the square of the amplitude and hence is
proportional to the inverse distance squared. Since the area of a sphere grows as the distance
squared, the integral of the energy over a sphere is independent of its radius.
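
As a quick illustration of this argument, the sketch below multiplies the squared amplitude by the area of a sphere for several radii. The function name and the numerical values are hypothetical, and all physical proportionality constants are omitted.

```python
import math


def energy_through_sphere(amp, r):
    """Squared amplitude (amp / r)^2 integrated over a sphere of radius r.

    Physical prefactors are omitted, so only the dependence on r is meaningful.
    """
    energy_per_area = (amp / r) ** 2
    sphere_area = 4.0 * math.pi * r**2
    return sphere_area * energy_per_area


if __name__ == "__main__":
    for radius in (0.1, 1.0, 10.0, 1000.0):
        # The printed value should be the same for every radius.
        print(radius, energy_through_sphere(1.5, radius))
```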
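Using complex notation we have for the outwards propagating wave:
$$ \mathcal{U}(r, t) = \mathrm{Re}\left[ U(r) e^{-i\omega t} \right] = \mathrm{Re}\left[ \frac{A}{r} e^{ikr - i\omega t} \right], \tag{2.51} $$
with $U(r) = A \exp(ikr)/r$ and $A = \mathcal{A} \exp(i\varphi)$, where $\varphi$ is the argument and $\mathcal{A}$ the modulus of
the complex amplitude $A$.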
In Fig. 2.2 and Fig. 2.3 spherical wave fronts are shown. It is seen from Fig. 2.3 that for
an observer at very large distance to a source, the spherical wave fronts look plane. Hence if
your distance to the source is large, you perceive the spherical wave as a plane wave propagating
towards you.

2.6 Time Harmonic Maxwell Equations in Matter


Using complex notation we have

$$ \boldsymbol{\mathcal{E}}(\mathbf{r}, t) = \mathrm{Re}\left[ \mathbf{E}(\mathbf{r}) e^{-i\omega t} \right], \tag{2.52} $$

Figure 2.2: Spherical wave fronts.

Figure 2.3: Planes of constant phase in cross-section. At large distances the spherical wave front
becomes plane.

with

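$$ E_x(\mathbf{r}) = |E_x(\mathbf{r})| e^{i\varphi_x(\mathbf{r})}, \qquad E_y(\mathbf{r}) = |E_y(\mathbf{r})| e^{i\varphi_y(\mathbf{r})}, \qquad E_z(\mathbf{r}) = |E_z(\mathbf{r})| e^{i\varphi_z(\mathbf{r})}, $$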

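where $\varphi_x$ is the argument of the complex number $E_x$ etc. With similar notations for the magnetic
field, we obtain by substitution into Maxwell’s equations (2.13), (2.14), (2.15) and (2.16), the
time-harmonic Maxwell equations for the complex fields:

$$ \nabla \times \mathbf{E} = i\omega \mathbf{B}, \tag{2.53} $$
$$ \nabla \times \mathbf{B} = -i\omega \epsilon \mu_0 \mathbf{E}, \tag{2.54} $$
$$ \nabla \cdot \epsilon \mathbf{E} = 0, \tag{2.55} $$
$$ \nabla \cdot \mathbf{B} = 0, \tag{2.56} $$

whereas (2.23) becomes:

$$ \nabla^2 \mathbf{E} + \omega^2 \epsilon \mu_0 \mathbf{E} = 0. \tag{2.57} $$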

The scalar wave equation in time-harmonic form is called the Helmholtz equation. The
permittivity is in general complex-valued and depends on the frequency. The latter property
is called dispersion. The imaginary part of the permittivity is a measure of absorption of the
light. Except close to a resonance frequency, the imaginary part of ε(ω) is small and the
real part is a slowly increasing function of frequency. Near a resonance the real part changes
rapidly and decreases with ω, while the imaginary part has a maximum at the resonance
frequency, corresponding to maximum absorption at resonances. In some books the notation
ε = (n + iκ)² is used, where n and κ (not to be confused with the wavenumber k) are both real,
with n the refractive index and κ a measure of absorption. We then have Re(ε) = n² − κ² and
Im(ε) = 2nκ (see Fig. 2.4). As remarked before, at optical frequencies the magnetic permeability
differs only very little from the vacuum value µ0.
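
The relation between ε and (n, κ) is easy to evaluate numerically. The sketch below is an illustration under the assumption that ε is treated as a dimensionless number, as in the notation ε = (n + iκ)² above; the function names and the sample values n = 1.5, κ = 0.1 are hypothetical.

```python
import cmath


def permittivity_from_index(n, kappa):
    """Return eps = (n + i kappa)^2, so Re(eps) = n^2 - kappa^2 and Im(eps) = 2 n kappa."""
    return complex(n, kappa) ** 2


def index_from_permittivity(eps):
    """Return (n, kappa) from eps using the principal square root, which has n >= 0."""
    root = cmath.sqrt(eps)
    return root.real, root.imag


if __name__ == "__main__":
    eps = permittivity_from_index(1.5, 0.1)  # illustrative values only
    print("Re(eps) =", eps.real, " Im(eps) =", eps.imag)
    print("recovered (n, kappa) =", index_from_permittivity(eps))
```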

Figure 2.4: Real part n² − κ² and imaginary part 2nκ of the permittivity ε = (n + iκ)², as
function of wavelength and of frequency.

Figure 2.5: Refractive index as function of wavelength and frequency for several important optical
crystals.

Figure 2.6: The electromagnetic spectrum.

2.7 Time Harmonic Electromagnetic Plane Waves


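In this section we assume that the material in which the wave propagates does not absorb the
light, i.e. that its permittivity ε is real. The electric field of a time-harmonic plane wave is given
by
$$ \boldsymbol{\mathcal{E}}(\mathbf{r}, t) = \mathrm{Re}\left[ \mathbf{E}(\mathbf{r}) e^{-i\omega t} \right], \tag{2.58} $$
with
$$ \mathbf{E}(\mathbf{r}) = \mathbf{A} e^{i\mathbf{k}\cdot\mathbf{r}}, \tag{2.59} $$
where $\mathbf{A}$ is a complex vector
$$ \mathbf{A} = \begin{pmatrix} A_x \\ A_y \\ A_z \end{pmatrix}, \tag{2.60} $$
with $A_x = |A_x| e^{i\varphi_x}$ etc. Here $\mathbf{k}$ is the wave vector, which satisfies (2.33). By substituting (2.59)
into (2.55) it follows that
$$ \mathbf{A} \cdot \mathbf{k} = 0, \tag{2.61} $$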
which means that the electric field vector is in every point perpendicular to the wave vector.
For simplicity we now choose the wave vector in the direction of the x-axis and we assume that
the electric field vector is parallel to the y-axis. This case is called a y-polarised electromagnetic
wave. The complex field is then written as

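$$ \mathbf{E}(x) = A e^{ikx}\, \hat{\mathbf{y}}, \tag{2.62} $$

where $k = \omega\sqrt{\epsilon\mu_0}$ and $A = |A| \exp(i\varphi)$. It follows from Faraday’s Law (2.53) that

$$ \mathbf{B}(x) = \frac{k}{\omega}\, \hat{\mathbf{x}} \times \hat{\mathbf{y}}\, A e^{ikx} = \sqrt{\epsilon\mu_0}\, A e^{ikx}\, \hat{\mathbf{z}}. \tag{2.63} $$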
The real electromagnetic field is thus:

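$$ \boldsymbol{\mathcal{E}}(x, t) = \mathrm{Re}\left[ \mathbf{E}(x) e^{-i\omega t} \right] = |A| \cos(kx - \omega t + \varphi)\, \hat{\mathbf{y}}, \tag{2.64} $$
$$ \boldsymbol{\mathcal{B}}(x, t) = \mathrm{Re}\left[ \mathbf{B}(x) e^{-i\omega t} \right] = \sqrt{\epsilon\mu_0}\, |A| \cos(kx - \omega t + \varphi)\, \hat{\mathbf{z}}. \tag{2.65} $$

We conclude that the electric and magnetic field of a plane wave are in phase, i.e. at any given
point x both the electric and the magnetic field achieve their maximum and minimum values at
the same time.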

2.8 Electromagnetic Energy


Electromagnetic fields propagate energy. The flow of electromagnetic energy at a certain position
r and time t is given by Poynting’s vector which is defined by:

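$$ \boldsymbol{\mathcal{S}}(\mathbf{r}, t) = \boldsymbol{\mathcal{E}}(\mathbf{r}, t) \times \frac{\boldsymbol{\mathcal{B}}(\mathbf{r}, t)}{\mu_0}. \tag{2.66} $$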

More precisely, the flow of electromagnetic energy through a small surface dA with normal n̂ at
point r is given by
S(r, t) · n̂ dA. (2.67)
If this scalar product is positive, the energy flow is in the direction of n̂, otherwise it is opposite
to n̂. Hence the direction of S(r, t) is the direction of the flow of energy at point r and the length
‖S(r, t)‖ is the amount of the flow of energy, per unit of time and per unit of area perpendicular
to the direction of S.

Figure 2.7: (a) Orthogonal harmonic E and B fields of a plane polarized wave. (b) The wave
propagates in the direction of E × B.

By adopting the results from electrostatics and magnetostatics, it follows that the total energy
stored in the electromagnetic field per unit of volume at a point r is equal to the sum of the
electric and the magnetic energy densities:
$$ U_{em}(\mathbf{r}, t) = \epsilon\, \boldsymbol{\mathcal{E}}(\mathbf{r}, t) \cdot \boldsymbol{\mathcal{E}}(\mathbf{r}, t) + \frac{1}{\mu_0}\, \boldsymbol{\mathcal{B}}(\mathbf{r}, t) \cdot \boldsymbol{\mathcal{B}}(\mathbf{r}, t). \tag{2.68} $$

Remark: The energy flux $\boldsymbol{\mathcal{S}}$ and the energy density $U_{em}$ depend nonlinearly on the field. For
$U_{em}$ the quadratic dependence on the electric and magnetic fields is clear. To see that the Poynting
vector is also quadratic in the electromagnetic field, you should realize that the electric and
magnetic fields are inseparable: they together form the electromagnetic field. Stated differently:
if the amplitude of the electric field is doubled, then also that of the magnetic field is doubled and
hence the Poynting vector is increased by a factor 4.
Conclusion: if you have to compute the Poynting vector or the electromagnetic energy density of
a time harmonic electromagnetic field, substitute the real-valued vector fields and do NOT use
complex computations. An exception is the calculation of the long-time average of the Poynting
vector or the energy density. As we will show below, the time averages of the energy flux and
energy density can be expressed quite conveniently in terms of the complex field amplitudes in
this case.

For the plane wave (2.62), (2.63), substituting the real-valued fields in the Poynting vector
and the electromagnetic energy density gives:
$$ \boldsymbol{\mathcal{S}}(x, t) = \boldsymbol{\mathcal{E}}(x, t) \times \frac{\boldsymbol{\mathcal{B}}(x, t)}{\mu_0} = \sqrt{\frac{\epsilon}{\mu_0}}\, |A|^2 \cos^2(kx - \omega t + \varphi)\, \hat{\mathbf{x}}, \tag{2.69} $$
$$ U_{em}(x, t) = 2\epsilon |A|^2 \cos^2(kx - \omega t + \varphi). \tag{2.70} $$
We see that the energy flow of a plane wave is in the direction of the wave vector, which is also
the direction of the phase velocity.

2.9 Time Averaged Energy


Optical frequencies are in the range of $5 \times 10^{14}$ Hz and therefore no detector can measure the
fluctuations of the electromagnetic fields with time. A detector will always measure an average
value, taken over an interval of time that is very large, typically at least $10^5$ times longer than the
period 2π/ω of the light. We therefore compute averages over such time intervals of the Poynting
vector and of the electromagnetic energy. Because the Poynting vector depends nonlinearly
(quadratically) on the field amplitudes, we cannot compute with complex amplitudes and take the
real part afterwards but have to compute with the real-valued quantities. Nevertheless, it turns out
that the final result can be conveniently expressed in terms of the complex field amplitudes.
Consider two time harmonic functions:

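$$ \mathcal{A}(t) = \mathrm{Re}\left[ A e^{-i\omega t} \right] = |A| \cos(\varphi_A - \omega t), \tag{2.71} $$
$$ \mathcal{B}(t) = \mathrm{Re}\left[ B e^{-i\omega t} \right] = |B| \cos(\varphi_B - \omega t), \tag{2.72} $$

with $A = |A| \exp(i\varphi_A)$ and $B = |B| \exp(i\varphi_B)$ the complex amplitudes.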


For a general function of time f(t) we define the time average over an interval T at a certain
time t by
$$ \frac{1}{T} \int_{t-T/2}^{t+T/2} f(t')\, dt', \tag{2.73} $$
where T is much larger (say a factor of $10^5$) than the period of visible light. For time-harmonic
fields the average does not depend on the time t at which it is computed, and
we therefore take t = 0. We then write
$$ \langle f(t) \rangle = \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} f(t)\, dt. \tag{2.74} $$

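Because
$$ \mathcal{A}(t) = \mathrm{Re}\left[ A e^{-i\omega t} \right] = \frac{1}{2} \left[ A e^{-i\omega t} + A^* e^{i\omega t} \right], $$
where the ∗ means complex conjugation, and with a similar expression for $\mathcal{B}(t)$, it follows that

$$ \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} \mathcal{A}(t)\mathcal{B}(t)\, dt = \lim_{T\to\infty} \frac{1}{4T} \int_{-T/2}^{T/2} \left[ AB^* + A^*B + AB e^{-2i\omega t} + A^*B^* e^{2i\omega t} \right] dt $$
$$ = \lim_{T\to\infty} \frac{1}{4} \left[ AB^* + A^*B + AB\, \frac{e^{i\omega T} - e^{-i\omega T}}{2iT\omega} + A^*B^*\, \frac{e^{i\omega T} - e^{-i\omega T}}{2iT\omega} \right] $$
$$ = \frac{1}{2} \mathrm{Re}\left[ AB^* \right]. \tag{2.75} $$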
This important result will be used over and over again. In words:

If one takes the average of the product of two time harmonic quantities over a time interval
long compared with the period of the oscillation, the result is half the real part of the product of
the complex amplitude of one quantity and the complex conjugate of the other.
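
The rule (2.75) can be verified numerically. The sketch below is a minimal check under stated assumptions: the complex amplitudes, the frequency and the helper names are hypothetical, and the infinite-time limit is replaced by an average over an integer number of periods, for which the oscillating terms cancel.

```python
import cmath
import math


def time_average_product(A, B, omega, periods=100, samples_per_period=64):
    """Average of Re[A e^{-i w t}] * Re[B e^{-i w t}] over an integer number of periods.

    Uses the midpoint rule on a uniform grid centred on t = 0.
    """
    total_time = periods * 2.0 * math.pi / omega
    n = periods * samples_per_period
    dt = total_time / n
    total = 0.0
    for j in range(n):
        t = -total_time / 2.0 + (j + 0.5) * dt
        factor = cmath.exp(-1j * omega * t)
        total += (A * factor).real * (B * factor).real
    return total / n


def half_real_part(A, B):
    """Right-hand side of (2.75): (1/2) Re[A B*]."""
    return 0.5 * (A * B.conjugate()).real


if __name__ == "__main__":
    A = cmath.rect(2.0, 0.3)   # modulus 2.0, argument 0.3 (illustrative)
    B = cmath.rect(0.5, -1.1)  # modulus 0.5, argument -1.1 (illustrative)
    print(time_average_product(A, B, omega=3.0))
    print(half_real_part(A, B))
```

Both printed values should coincide up to rounding errors.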

If we apply this to Poynting’s vector of a general time harmonic electromagnetic field:

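$$ \boldsymbol{\mathcal{E}}(\mathbf{r}, t) = \mathrm{Re}\left[ \mathbf{E}(\mathbf{r}) e^{-i\omega t} \right], $$
$$ \boldsymbol{\mathcal{B}}(\mathbf{r}, t) = \mathrm{Re}\left[ \mathbf{B}(\mathbf{r}) e^{-i\omega t} \right], $$

then we find that the time averaged energy flow, denoted by $\mathbf{S}(\mathbf{r})$, is given by
$$ \mathbf{S}(\mathbf{r}) = \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} \boldsymbol{\mathcal{S}}(\mathbf{r}, t)\, dt = \frac{1}{2\mu_0} \mathrm{Re}\left[ \mathbf{E} \times \mathbf{B}^* \right]. \tag{2.76} $$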

Similarly, the time averaged electromagnetic energy density is:

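$$ \langle U_{em}(\mathbf{r}) \rangle \overset{\text{def}}{=} \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} U_{em}(\mathbf{r}, t')\, dt' = \frac{1}{2} \epsilon\, \mathbf{E}(\mathbf{r}) \cdot \mathbf{E}(\mathbf{r})^* + \frac{1}{2\mu_0} \mathbf{B}(\mathbf{r}) \cdot \mathbf{B}(\mathbf{r})^* $$
$$ = \frac{1}{2} \epsilon |\mathbf{E}(\mathbf{r})|^2 + \frac{1}{2\mu_0} |\mathbf{B}(\mathbf{r})|^2. \tag{2.77} $$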

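For the special case of our plane wave (2.62), (2.63), the time averaged energy flow becomes:
$$ \mathbf{S} = \frac{1}{2} \sqrt{\frac{\epsilon}{\mu_0}}\, \mathrm{Re}\left[ A A^* \right] \hat{\mathbf{x}} = \frac{1}{2} \sqrt{\frac{\epsilon}{\mu_0}}\, |A|^2\, \hat{\mathbf{x}}. \tag{2.78} $$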

The length of vector (2.78) is the time averaged flow of energy per unit of area in the direction
of the plane wave and is commonly called the intensity of the wave. For the time averaged
electromagnetic energy density of the plane wave we get:

$$ \langle U_{em} \rangle = \frac{1}{2} \epsilon |A|^2 + \frac{1}{2\mu_0} \mu_0 \epsilon |A|^2 = \epsilon |A|^2. \tag{2.79} $$

It is seen that both the time averaged energy flux and the time averaged energy density of a
plane wave are proportional to the modulus squared of the complex electric field.
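
As an illustration of (2.78), the sketch below evaluates the intensity of a plane wave from its complex electric field amplitude. It assumes SI units, a lossless non-magnetic medium with ε = n²ε0, and rounded values of the vacuum constants; the function name and the sample field strength are hypothetical.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m (rounded value)
MU_0 = 1.25663706212e-6       # vacuum permeability in H/m (rounded value)


def plane_wave_intensity(amplitude, n=1.0):
    """Intensity (1/2) sqrt(eps / mu0) |A|^2 in W/m^2, following (2.78).

    amplitude: complex electric field amplitude in V/m
    n: real refractive index of the medium, so that eps = n^2 * eps0 (assumption)
    """
    eps = n**2 * EPSILON_0
    return 0.5 * math.sqrt(eps / MU_0) * abs(amplitude) ** 2


if __name__ == "__main__":
    # A field amplitude of 1 kV/m with an arbitrary phase; the phase drops out.
    A = 1000.0 * complex(math.cos(0.4), math.sin(0.4))
    print("in vacuum:", plane_wave_intensity(A), "W/m^2")
    print("in glass (n = 1.5):", plane_wave_intensity(A, n=1.5), "W/m^2")
```

Note that the intensity depends only on the modulus of the complex amplitude, and that doubling the amplitude increases it by a factor of 4.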

2.10 Reflection and Transmission at an Interface


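Consider an interface y = 0 between two materials in y > 0 and y < 0 with permittivities $\epsilon_i$
and $\epsilon_t$, respectively. We assume that the materials are lossless, i.e. that the permittivities are
real. There is a plane wave incident from medium y > 0 onto this interface:
$$ \boldsymbol{\mathcal{E}}^i(\mathbf{r}, t) = \mathrm{Re}\left[ \mathbf{E}^i(\mathbf{r}) e^{-i\omega t} \right] = \mathrm{Re}\left[ \mathbf{A}^i e^{i(\mathbf{k}^i\cdot\mathbf{r} - \omega t)} \right], \quad \text{for } y > 0. \tag{2.80} $$

Part of the incident field is reflected into medium y > 0 and part is transmitted into medium
y < 0. The reflected and transmitted fields are also plane waves:
$$ \boldsymbol{\mathcal{E}}^r(\mathbf{r}, t) = \mathrm{Re}\left[ \mathbf{E}^r(\mathbf{r}) e^{-i\omega t} \right] = \mathrm{Re}\left[ \mathbf{A}^r e^{i(\mathbf{k}^r\cdot\mathbf{r} - \omega t)} \right], \quad \text{for } y > 0, \tag{2.81} $$
$$ \boldsymbol{\mathcal{E}}^t(\mathbf{r}, t) = \mathrm{Re}\left[ \mathbf{E}^t(\mathbf{r}) e^{-i\omega t} \right] = \mathrm{Re}\left[ \mathbf{A}^t e^{i(\mathbf{k}^t\cdot\mathbf{r} - \omega t)} \right], \quad \text{for } y < 0. \tag{2.82} $$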

Our aim is to determine the reflected and transmitted field amplitudes Ar and At .

2.10.1 Boundary Conditions


Boundary conditions for the tangential components of the electric and magnetic fields follow
from the Maxwell equations that contain the curl-operator, i.e. (2.13) and (2.14). There holds
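for y = 0:

$$ \hat{\mathbf{n}} \times (\boldsymbol{\mathcal{E}}^i + \boldsymbol{\mathcal{E}}^r) = \hat{\mathbf{n}} \times \boldsymbol{\mathcal{E}}^t, \tag{2.83} $$
$$ \hat{\mathbf{n}} \times (\boldsymbol{\mathcal{B}}^i + \boldsymbol{\mathcal{B}}^r) = \hat{\mathbf{n}} \times \boldsymbol{\mathcal{B}}^t, \tag{2.84} $$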


where $\hat{\mathbf{n}} = \hat{\mathbf{y}}$ is the unit normal on the interface. This means that the tangential components of
the total electric and total magnetic field are continuous across the interface, or explicitly:

$$ \mathcal{E}_x^i(x, y=0, z) + \mathcal{E}_x^r(x, y=0, z) = \mathcal{E}_x^t(x, y=0, z), \tag{2.85} $$
$$ \mathcal{E}_z^i(x, y=0, z) + \mathcal{E}_z^r(x, y=0, z) = \mathcal{E}_z^t(x, y=0, z), \tag{2.86} $$
$$ \mathcal{B}_x^i(x, y=0, z) + \mathcal{B}_x^r(x, y=0, z) = \mathcal{B}_x^t(x, y=0, z), \tag{2.87} $$
$$ \mathcal{B}_z^i(x, y=0, z) + \mathcal{B}_z^r(x, y=0, z) = \mathcal{B}_z^t(x, y=0, z). \tag{2.88} $$

We will only demonstrate this for the electric field. By choosing a closed loop in the (x, y)-
plane as shown in Fig. 2.8 and integrating the normal component of Faraday’s Law (2.13) over
the area A bounded by the loop L, we obtain:
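$$ -\frac{d}{dt} \iint_A \hat{\mathbf{z}} \cdot \boldsymbol{\mathcal{B}}\, dA = \iint_A \hat{\mathbf{z}} \cdot \nabla \times \boldsymbol{\mathcal{E}}\, dA = \oint_L \boldsymbol{\mathcal{E}} \cdot d\mathbf{l}, \tag{2.89} $$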

Figure 2.8: Closed loop in the (x, y)-plane enclosing the area A and surrounding part of the
interface y = 0, as used in Stokes Law to derive the continuity of the electric and magnetic
component tangential to the interface and parallel to the plane through the loop.

where in the last step we used Stokes theorem and the direction of integration over the loop
corresponds to the direction of the normal ẑ according to the positive screw driver rule. In
words: the rate of change of the magnetic flux through the surface A is equal to the integral of
the tangential electric field over the bounding closed loop L. By taking the limit dy → 0, the
surface integral and the integrals over the vertical parts of the loop vanish and there remain only
the integrals of the tangential electric field over the horizontal parts of the loop on both sides of
the interface y = 0. Since these integrals are traversed in opposite directions and the lengths of
these parts are the same, we conclude for the loop as shown in Fig. 2.8 that

x̂ · E(x, y + 0, z, t) = x̂ · E(x, y − 0, z, t). (2.90)

By choosing the closed loop in the (y, z)-plane instead of the (x, y)-plane one finds that also the
z-component of the electric field is continuous. Hence all tangential electric field compo-
nents are continuous across the interface.

By integrating Maxwell’s equations that contain the div-operator (2.15), (2.16) over a pill box
with height dy and top and bottom surfaces on either side and parallel to the interface, and
considering the limit dy → 0, we find continuity relations for the normal components of the
fields:

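$$ \epsilon_i\, \hat{\mathbf{n}} \cdot \boldsymbol{\mathcal{E}}(x, y+0, z, t) = \epsilon_t\, \hat{\mathbf{n}} \cdot \boldsymbol{\mathcal{E}}(x, y-0, z, t), \tag{2.91} $$
$$ \hat{\mathbf{n}} \cdot \boldsymbol{\mathcal{B}}(x, y+0, z, t) = \hat{\mathbf{n}} \cdot \boldsymbol{\mathcal{B}}(x, y-0, z, t). \tag{2.92} $$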


Since (2.85), (2.86), (2.87), (2.88), hold for all times, it is easy to see that the complex fields
satisfy the same boundary conditions for y = 0:

$$ E_x^i(x, y=0, z) + E_x^r(x, y=0, z) = E_x^t(x, y=0, z), \tag{2.93} $$
$$ E_z^i(x, y=0, z) + E_z^r(x, y=0, z) = E_z^t(x, y=0, z), \tag{2.94} $$
$$ B_x^i(x, y=0, z) + B_x^r(x, y=0, z) = B_x^t(x, y=0, z), \tag{2.95} $$
$$ B_z^i(x, y=0, z) + B_z^r(x, y=0, z) = B_z^t(x, y=0, z). \tag{2.96} $$

The same holds for the boundary conditions for the normal components, but these are not
needed in the derivation below: we will only need the continuity of the tangential electric and
magnetic components.

2.10.2 S-polarisation
We have shown in (2.61) and (2.63) of Section 2.7 that the electric and magnetic fields of a plane
wave are perpendicular to each other and to the wave vector. Let the incident plane wave have
wave vector:
$$ \mathbf{k}^i = k_x^i\, \hat{\mathbf{x}} + k_y^i\, \hat{\mathbf{y}} = k^i \sin\theta_i\, \hat{\mathbf{x}} - k^i \cos\theta_i\, \hat{\mathbf{y}}, \tag{2.97} $$

where $\theta_i$ is the angle with the normal as shown in Fig. 2.9, $k^i = \omega\sqrt{\epsilon_i \mu_0} = k_0 n_i$ with
$k_0 = \omega\sqrt{\epsilon_0 \mu_0}$ the wave number in vacuum, and $n_i$ is the refractive index of the medium in
y > 0. The plane of incidence is defined as the plane through the normal and the incident
wave vector. In the present case this is the (x, y)-plane. The electric vector $\mathbf{A}^i$ of the incident
plane wave can have any direction in the plane perpendicular to the wave vector. We distinguish
two (orthogonal) cases:

1. S-polarisation: in this case the electric field is in the z-direction, i.e. perpendicular
("Senkrecht") to the plane of incidence (see Fig. 2.9):
$$ \mathbf{E}^i(x, y) = A^i e^{i k^i [\sin(\theta_i) x - \cos(\theta_i) y]}\, \hat{\mathbf{z}}. \tag{2.98} $$
S-polarisation is sometimes called TE-polarisation (TE = Transverse Electric).

2. P-polarisation: then the electric field vector is parallel to the plane of incidence and
perpendicular to the wave vector (see Fig. 2.10):
$$ \mathbf{E}^i(x, y) = A^i e^{i k^i [\sin(\theta_i) x - \cos(\theta_i) y]} \left[ \cos\theta_i\, \hat{\mathbf{x}} + \sin\theta_i\, \hat{\mathbf{y}} \right]. \tag{2.99} $$
P-polarisation is sometimes called TM-polarisation (TM = Transverse Magnetic).

It will follow from the computations below that if the incident electric field is S-polarised, the
reflected and transmitted fields are also S-polarised and similarly for P-polarisation. This is the
reason for using this particular decomposition: it greatly simplifies the calculations and since
any incident electric field can always be written as the superposition of a S-polarised and a
P-polarised wave, studying these two cases is sufficient.
We will consider S-polarisation in this section. We seek amplitudes $A^r$ and $A^t$ such that the
reflected and transmitted plane waves are given by
$$ \mathbf{E}^r(x, y) = A^r e^{i k^i [\sin(\theta_r) x + \cos(\theta_r) y]}\, \hat{\mathbf{z}}, \tag{2.100} $$
$$ \mathbf{E}^t(x, y) = A^t e^{i k^t [\sin(\theta_t) x - \cos(\theta_t) y]}\, \hat{\mathbf{z}}, \tag{2.101} $$

where the wave number of the transmitted wave is

$$ k^t = \omega\sqrt{\epsilon_t \mu_0} = k_0 n_t, \tag{2.102} $$

Optica Lecture Notes TN2421 25 of 165 Monday 16th April, 2018, 09:44
and any point
m, ki = kr.
plane of the
$ 0i + E$ 0r = E$ 0t
E (4.25)
at CHAPTER 2. THE BASICS OF ELECTROMAGNETIC AND WAVE OPTICS

where the cosines cancel. Realize that the field vectors as shown
really ought to be envisioned at y = 0 (i.e., at the surface), from
which they have been displaced for the sake of clarity. Note too

y
$i ,
vectors, k
nce. Again,

!r
k
! ui
!Ei
(4.21) Bi ur
!i
k

z
!
Br
$i,
e. Thus k !Er x

e
ac
ial compo- ut

erf
Int
!
Bt !t
k
(4.22) !
Et

c>vi to get

(a)
origin O to
(4.21) that !
Ei !
Er !r
k
ent, though !i
k
use it from
Bi
Interface
! ui ur
ni
!
Br
x
nt
û n u
t
!
Et

!
Bt
!t
k

the phases (b)


re is still an Figure 2.9: An S-polarised incident plane wave: the electric field vector is normal to the plane
r , and E $ 0t, of incidence. The plane of incidence is the shaded plane in the upper figure.
hat a plane ki
ce separat-
!
B i
ui

ui ui
f the wave,
Bi cos ui
nts parallel
these con-
Optica Lecture Notes TN2421 26 of 165
(c) Monday 16th April, 2018, 09:44
2.10. Reflection and Transmission at an Interface

y One furthe
Law, wher
!
Ei
become (P
!
Er
!r
k
ui
!Bi
ur
!i
k

z
!Br x

e
ac
erf
Int
!t
k
!
Et

!
Bt

!
Ei !
Er
!r
k
A note
that the dir
!
Bi
ui ur !
Br in Figs. 4.4
!i
k ni x ample, in F
Interface
nt ward, whe
û n u Had we do
t
!
Et positive, le
!
Bt
!t
k
The signs
positive ex
field direc
Figure
Figure 2.10: An4.48 An incoming
P-polarised wavewave:
incident plane whose $
theEelectric
-field isfield
in the plane-of-
vector is parallel to the plane
of incidence.
incidence. will see, ju
$ r in Fig. 4
E
standardiz
the Fresne
Using the fact that mi = mr and ui = ur, we can combine these to the spec
formulas to obtain two more of the Fresnel Equations:
nt ni
cos ui - cos ut
Optica Lecture Notes TN2421 E0r 27 ofmt165 mi Monday 16 th
April, 2018, 09:44 EXAMPLE
ri K a b = (4.38)
E0i i ni nt An electro
cos ut + cos ui
CHAPTER 2. THE BASICS OF ELECTROMAGNETIC AND WAVE OPTICS

with $n_t$ the refractive index of the material in y < 0. The angles $\theta_r$ and $\theta_t$ are called the angle
of reflection and the angle of refraction (or transmission), respectively.
First of all we claim that the tangential components of the wave vectors of the three plane
waves, i.e. their x-components, must be identical, i.e.

$$ k^i \sin\theta_i = k^i \sin\theta_r = k^t \sin\theta_t. \tag{2.103} $$

If this were not the case, the incident, reflected and transmitted electric fields (2.98), (2.100) and
(2.101) would for y = 0 be periodic functions of x but with different periodicities. But fields that
are periodic but have different periods as functions of x can not satisfy the boundary conditions
at the interface y = 0 for all x. For P-polarisation a similar reasoning applies so that (2.103)
holds for all polarisations.
We conclude from (2.103) that

θr = θi , (2.104)
nt sin θt = ni sin θi , (Snell’s Law). (2.105)

Hence, the angle of reflection is the same as the angle of incidence and the transmitted wave is
refracted towards or away from the normal by the amount given by (2.105). Eq. (2.105) is Snell’s
Law, named after Willebrord Snellius (1580-1626), mathematics professor in Leiden, who never
published the result during his lifetime. René Descartes independently derived
the law in terms of sines, using heuristic momentum conservation arguments, in his 1637 essay
Dioptrics; in French-speaking countries it is therefore called Descartes’ Law. Note that
the Reflection Law (2.104) and Snell’s Law (2.105) hold independent of the polarization, hence
they hold also for the case of a P-polarised incident wave.
Snell’s Law implies that when the angle of incidence θi increases, the angle of transmission
increases as well. If the medium in y > 0 is air with refractive index ni = 1 and the other
medium is glass with refractive index nt = 1.5, then the maximum angle of transmission occurs
when θi = 90° and then

$$\theta_{t,\max} = \arcsin(n_i/n_t) = 41.8^\circ. \tag{2.106}$$

In case the light is incident from glass, i.e. ni = 1.5 and nt = 1.0, the angle of incidence θi
cannot be larger than 41.8° because otherwise there is no real solution for θt. It turns out that when
θi > 41.8°, the wave is totally reflected, i.e. there is no propagating transmitted wave in air.
The angle θi,critical = 41.8° is called the critical angle of total internal reflection. It exists
only if a wave is incident from a medium with larger refractive index on a medium with lower
refractive index (nt < ni). The critical angle is independent of the polarization of the incident
wave.
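The two numbers quoted above can be reproduced with a few lines of Python. This is only a minimal sketch: the function names `refraction_angle_deg` and `critical_angle_deg` are illustrative choices and do not appear in these notes.

```python
import math

def refraction_angle_deg(theta_i_deg, n_i, n_t):
    """Angle of refraction from Snell's Law (2.105), in degrees.

    Returns None when n_i*sin(theta_i) > n_t, i.e. when there is no real
    solution for theta_t (total internal reflection).
    """
    s = n_i * math.sin(math.radians(theta_i_deg)) / n_t
    if s > 1.0:
        return None
    return math.degrees(math.asin(s))

def critical_angle_deg(n_i, n_t):
    """Critical angle of total internal reflection; None if n_t >= n_i."""
    if n_t >= n_i:
        return None
    return math.degrees(math.asin(n_t / n_i))

# Air -> glass at grazing incidence gives the maximum angle of transmission (2.106).
print(round(refraction_angle_deg(90.0, 1.0, 1.5), 1))  # 41.8
# Glass -> air: the critical angle has the same numerical value.
print(round(critical_angle_deg(1.5, 1.0), 1))          # 41.8
```

That both values coincide is no accident: the refracted ray for grazing incidence from air is the time-reversed version of the critical ray incident from glass.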
We proceed now with computing the amplitudes Ar and At . Because Ai is known, there are
two unknowns, hence we need two equations to determine them. The first equation that we apply
is the continuity of the tangential electric field at y = 0. Since the z-component is tangential, it
follows that
Ai + Ar = At . (2.107)
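The second equation is obtained from the continuity of the tangential component of the magnetic
field. It follows from Faraday’s Law (2.13) that

$$\mathbf{B}^i(x,y) = \frac{\mathbf{k}^i}{\omega} \times \hat{\mathbf{z}}\, A^i e^{i k^i[\sin(\theta_i)x - \cos(\theta_i)y]} = -\frac{n_i}{c}\left[\cos\theta_i\,\hat{\mathbf{x}} + \sin\theta_i\,\hat{\mathbf{y}}\right] A^i e^{i k^i[\sin(\theta_i)x - \cos(\theta_i)y]}, \tag{2.108}$$

$$\mathbf{B}^r(x,y) = -\frac{n_i}{c}\left[-\cos\theta_i\,\hat{\mathbf{x}} + \sin\theta_i\,\hat{\mathbf{y}}\right] A^r e^{i k^i[\sin(\theta_i)x + \cos(\theta_i)y]}, \tag{2.109}$$

$$\mathbf{B}^t(x,y) = -\frac{n_t}{c}\left[\cos\theta_t\,\hat{\mathbf{x}} + \sin\theta_t\,\hat{\mathbf{y}}\right] A^t e^{i k^t[\sin(\theta_t)x - \cos(\theta_t)y]}, \tag{2.110}$$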

where c is the velocity of light in vacuum. The tangential components at the interface y = 0 are

$$B_x^i(x,0) = -\frac{n_i}{c}\cos(\theta_i)\, A^i e^{i k^i \sin(\theta_i)x}, \tag{2.111}$$

$$B_x^r(x,0) = \frac{n_i}{c}\cos(\theta_i)\, A^r e^{i k^i \sin(\theta_i)x}, \tag{2.112}$$

$$B_x^t(x,0) = -\frac{n_t}{c}\cos(\theta_t)\, A^t e^{i k^t \sin(\theta_t)x}. \tag{2.113}$$

Since the total tangential components are continuous at the interface, and the exponential factors are identical by (2.103):

$$-n_i\cos(\theta_i)A^i + n_i\cos(\theta_i)A^r = -n_t\cos(\theta_t)A^t. \tag{2.114}$$
The reflection and transmission coefficients are defined by A^r/A^i and A^t/A^i, respectively. They
are called the Fresnel coefficients of S-polarisation. Eqns. (2.107) and (2.114) imply:

$$r_S = \frac{A^r}{A^i} = \frac{n_i\cos\theta_i - n_t\cos\theta_t}{n_i\cos\theta_i + n_t\cos\theta_t} = -\frac{\sin(\theta_i - \theta_t)}{\sin(\theta_i + \theta_t)}, \tag{2.115}$$

$$t_S = \frac{A^t}{A^i} = \frac{2n_i\cos\theta_i}{n_i\cos\theta_i + n_t\cos\theta_t} = \frac{2\cos\theta_i\sin\theta_t}{\sin(\theta_i + \theta_t)}, \tag{2.116}$$

where at the far right we have substituted Snell’s Law.

2.10.3 P-polarization
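The incident electric field is in this case given by (2.99). The reflected and transmitted fields are
similarly given by

$$\mathbf{E}^r(x,y) = A^r e^{i k^i[\sin(\theta_i)x + \cos(\theta_i)y]}\left[-\cos\theta_i\,\hat{\mathbf{x}} + \sin\theta_i\,\hat{\mathbf{y}}\right], \tag{2.117}$$

$$\mathbf{E}^t(x,y) = A^t e^{i k^t[\sin(\theta_t)x - \cos(\theta_t)y]}\left[\cos\theta_t\,\hat{\mathbf{x}} + \sin\theta_t\,\hat{\mathbf{y}}\right]. \tag{2.118}$$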
The x-components of the electric field are continuous across the interface y = 0 because they are
tangential, hence
Ai cos θi − Ar cos θi = At cos θt . (2.120)
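The magnetic field of the incident wave follows from Faraday’s Law (2.13):

$$\mathbf{B}^i(x,y) = \frac{\mathbf{k}^i}{\omega} \times \left[\cos(\theta_i)\hat{\mathbf{x}} + \sin(\theta_i)\hat{\mathbf{y}}\right] A^i e^{i k^i[\sin(\theta_i)x - \cos(\theta_i)y]} = \frac{n_i}{c} A^i e^{i k^i[\sin(\theta_i)x - \cos(\theta_i)y]}\,\hat{\mathbf{z}}. \tag{2.121}$$

The magnetic fields of the reflected and transmitted waves follow analogously:

$$\mathbf{B}^r(x,y) = \frac{n_i}{c} A^r e^{i k^i[\sin(\theta_i)x + \cos(\theta_i)y]}\,\hat{\mathbf{z}}, \tag{2.122}$$

$$\mathbf{B}^t(x,y) = \frac{n_t}{c} A^t e^{i k^t[\sin(\theta_t)x - \cos(\theta_t)y]}\,\hat{\mathbf{z}}. \tag{2.123}$$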
Because ẑ is tangential to the interface and the tangential magnetic components are continuous
across y = 0, it follows that

$$n_i A^i + n_i A^r = n_t A^t. \tag{2.124}$$
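Eqns. (2.120) and (2.124) then imply for the Fresnel coefficients of P-polarisation:

$$r_P = \frac{A^r}{A^i} = \frac{\dfrac{\cos\theta_i}{n_i} - \dfrac{\cos\theta_t}{n_t}}{\dfrac{\cos\theta_i}{n_i} + \dfrac{\cos\theta_t}{n_t}} = \frac{\tan(\theta_i - \theta_t)}{\tan(\theta_i + \theta_t)}, \tag{2.125}$$

$$t_P = \frac{A^t}{A^i} = \frac{2\,\dfrac{\cos\theta_i}{n_t}}{\dfrac{\cos\theta_i}{n_i} + \dfrac{\cos\theta_t}{n_t}} = \frac{2\cos\theta_i\sin\theta_t}{\sin(\theta_i + \theta_t)\cos(\theta_i - \theta_t)}, \tag{2.126}$$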

where at the far right we used again Snell’s Law (2.105).

The Fresnel coefficients are easy to use. To compute the reflection and transmission of a generally
polarised incident plane wave, you first write the incident electric field as a linear combination
of S- and P-polarised waves. Then, for the given angle of incidence θi and given refractive indices
ni and nt, you calculate the angle of refraction θt from Snell’s Law. Next, you substitute θi and θt
into the right-hand sides of (2.115), (2.116) and (2.125), (2.126). Finally, you multiply the S- and
P-components of the incident field by the corresponding Fresnel coefficients and add the resulting
reflected (respectively transmitted) S- and P-polarised fields.
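The recipe above can be sketched in Python as follows. The function names are illustrative and not part of these notes; the sign conventions are those of (2.115), (2.116), (2.125) and (2.126), and the sketch assumes real refractive indices and an angle of incidence below the critical angle.

```python
import math

def fresnel_coefficients(theta_i_deg, n_i, n_t):
    """Return (r_S, t_S, r_P, t_P) for a real angle of refraction."""
    theta_i = math.radians(theta_i_deg)
    sin_t = n_i * math.sin(theta_i) / n_t            # Snell's Law (2.105)
    if sin_t > 1.0:
        raise ValueError("above the critical angle: no real angle of refraction")
    cos_i = math.cos(theta_i)
    cos_t = math.sqrt(1.0 - sin_t ** 2)
    r_s = (n_i * cos_i - n_t * cos_t) / (n_i * cos_i + n_t * cos_t)      # (2.115)
    t_s = 2.0 * n_i * cos_i / (n_i * cos_i + n_t * cos_t)                # (2.116)
    r_p = (cos_i / n_i - cos_t / n_t) / (cos_i / n_i + cos_t / n_t)      # (2.125)
    t_p = 2.0 * (cos_i / n_t) / (cos_i / n_i + cos_t / n_t)              # (2.126)
    return r_s, t_s, r_p, t_p

def reflect_and_transmit(a_s, a_p, theta_i_deg, n_i, n_t):
    """Split the incident amplitudes into S and P parts and scale each one.

    a_s, a_p are the (possibly complex) S- and P-amplitudes of the incident wave.
    Returns ((reflected S, reflected P), (transmitted S, transmitted P)).
    """
    r_s, t_s, r_p, t_p = fresnel_coefficients(theta_i_deg, n_i, n_t)
    return (r_s * a_s, r_p * a_p), (t_s * a_s, t_p * a_p)

# Example: equal S and P amplitudes incident from air on glass at 30 degrees.
reflected, transmitted = reflect_and_transmit(1.0, 1.0, 30.0, 1.0, 1.5)
print(reflected, transmitted)
```

Above the critical angle the coefficients become complex; that case is deliberately excluded here by raising an error.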

2.10.4 Interpretation of the Fresnel Coefficients


For normal incidence, θi = 0, Snell’s Law implies θt = 0. Hence (2.115) and (2.125) imply:

$$r_S(\theta_i = 0) = -r_P(\theta_i = 0) = \frac{n_i - n_t}{n_i + n_t}. \tag{2.127}$$

The reason that rP is not equal to rS but has an additional minus sign is that the reflected
(and also the transmitted) field for P-polarisation has been chosen such that the magnetic field is
parallel to the z-direction and in phase. As you can see by comparing the incident and reflected
electric vectors in Figs. 2.9 and 2.10, in the limit θi = 0 the reflected electric field is parallel
to the incident electric field in the S-polarised case whereas they are opposite in the P-polarised
case.
If the incident medium is air and the other medium is glass (ni = 1.0, nt = 1.5), we get

$$r_S(\theta_i = 0) = -r_P(\theta_i = 0) = -0.2, \tag{2.128}$$

and since the flow of energy is proportional to the square of the field, it follows that (0.2)² = 0.04,
i.e. 4% of normally incident light is reflected by the glass. Hence a glass lens without
anti-reflection coating reflects approximately 4% of the light at normal incidence. The transmission
coefficients for normal incidence are:

$$t_S(\theta_i = 0) = t_P(\theta_i = 0) = \frac{2n_i}{n_i + n_t}, \tag{2.129}$$

which for air-glass becomes 0.8.

Remark. Note that |rS|² + |tS|² ≠ 1 and |rP|² + |tP|² ≠ 1. The reason is that the ratio of
the transmitted energy flow in y < 0 to the incident energy flow is not given by |tS|². In fact, as
follows from (2.78), in computing this ratio one should take account of the velocity of light, which
is different in the two media. To obtain the correct ratio of the transmitted and the incident
energy flow at normal incidence from air to glass, |tS|² and |tP|² have to be multiplied by nt = 1.5;
indeed 0.04 + 1.5 × (0.8)² = 1. For general media and oblique incidence the factor is
nt cos θt / (ni cos θi).
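The remark can be checked numerically. The sketch below assumes the standard expression for the transmitted fraction of the energy flow through the interface, T = (nt cos θt)/(ni cos θi) |t|², which for normal incidence from air reduces to multiplication by nt; the function name `power_coefficients_s` is illustrative, and the angle of incidence must lie below the critical angle.

```python
import math

def power_coefficients_s(theta_i_deg, n_i, n_t):
    """Reflected and transmitted fractions (R, T) of the energy flow, S-polarisation."""
    theta_i = math.radians(theta_i_deg)
    cos_i = math.cos(theta_i)
    sin_t = n_i * math.sin(theta_i) / n_t             # Snell's Law (2.105)
    cos_t = math.sqrt(1.0 - sin_t ** 2)               # real below the critical angle
    r_s = (n_i * cos_i - n_t * cos_t) / (n_i * cos_i + n_t * cos_t)   # (2.115)
    t_s = 2.0 * n_i * cos_i / (n_i * cos_i + n_t * cos_t)             # (2.116)
    R = r_s ** 2
    # Assumed correction factor: ratio of refractive indices and of beam cross-sections.
    T = (n_t * cos_t) / (n_i * cos_i) * t_s ** 2
    return R, T

R, T = power_coefficients_s(0.0, 1.0, 1.5)
print(R, T, R + T)   # R is about 0.04, T about 0.96, and the sum is 1 up to rounding
```

The same check at oblique incidence, e.g. `power_coefficients_s(40.0, 1.0, 1.5)`, also gives R + T = 1 up to rounding, whereas |rS|² + |tS|² does not.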

Brewster angle. It follows from Snell’s Law (2.105) that sin θt = (ni/nt) sin θi. Hence θt
increases monotonically with θi and therefore there exists some θi such that

$$\theta_i + \theta_t = 90^\circ. \tag{2.130}$$

For this particular angle of incidence, tan(θi + θt) in the denominator of (2.125) becomes infinite
and hence the P-polarised wave is not reflected at all. This angle of incidence is called the
Brewster angle θB.³ It is easy to see from (2.115) that the reflection is never zero for
S-polarisation. Hence if unpolarized light is incident at the Brewster angle, the reflected light
will be purely S-polarized. We have θt = 90° − θi, hence sin(θt) = cos(θi) and by Snell’s Law
(writing θi = θB):

$$\tan(\theta_B) = \frac{n_t}{n_i}. \tag{2.131}$$

³ MIT OCW - Reflection at The Air-glass Boundary: demonstration of reflection of polarized light and the
Brewster angle.

We see that there is always a solution, independent of whether the wave is incident from the
material with smallest or largest refractive index. For the air-glass interface we have θB = 56.3°
and θt = 33.7°. By (2.115):

$$r_S(\theta_B = 56.3^\circ) = -0.38, \tag{2.132}$$

so that (0.38)²/2 = 0.07, or 7% of the unpolarised light is reflected as purely S-polarised light
at the air-glass interface. For a wave incident from glass, θB = 33.7°.
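As a quick numerical check of these Brewster-angle values, the following sketch evaluates (2.131) and the sine form of (2.115); the function name `brewster_angle_deg` is an illustrative choice.

```python
import math

def brewster_angle_deg(n_i, n_t):
    """Brewster angle from tan(theta_B) = n_t / n_i, Eq. (2.131)."""
    return math.degrees(math.atan2(n_t, n_i))

theta_b = brewster_angle_deg(1.0, 1.5)      # air -> glass
theta_t = 90.0 - theta_b                    # Eq. (2.130)
# Sine form of (2.115); at the Brewster angle sin(theta_i + theta_t) = 1.
r_s = -math.sin(math.radians(theta_b - theta_t))
reflected_fraction = r_s ** 2 / 2.0         # unpolarised light: half of the power is S-polarised

print(round(theta_b, 1), round(theta_t, 1))            # 56.3 33.7
print(round(r_s, 2), round(reflected_fraction, 2))     # -0.38 0.07
print(round(brewster_angle_deg(1.5, 1.0), 1))          # 33.7 (glass -> air)
```

Note that the Brewster angle for incidence from glass equals the refraction angle at the Brewster angle for incidence from air, as expected from (2.130).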

In Fig. 2.11 the reflection and transmission coefficients of S- and P-polarised waves are shown
as function of the angle of incidence for the case of incidence from air to glass. There is no critical
angle of total reflection in this case and hence the reflection and transmission coefficients are well
defined for all angles between 0° and 90°. The Brewster angle is indicated. It is seen that the
reflection coefficients decrease from the values −0.2 (S-polarisation) and 0.2 (P-polarisation) for
θi = 0° to −1 for θi = 90°. The transmission coefficients decrease monotonically from 0.8 to 0 at
θi = 90°.
Fig. 2.12 shows the Fresnel coefficients when the wave is incident from glass to air. The
critical angle is θc = 41.8° as derived earlier. At the critical angle of total internal reflection the
reflection coefficients are equal to 1. There is again an angle where the reflection of P-polarised
light is zero: θB = 33.7°.
Depending on the refractive indices and the angle of incidence, the reflection coefficients can
be negative. The reflected electric field then has an additional π phase shift compared to the
incident wave. In contrast, the transmitted field is always in phase with the incident field, i.e.
the transmission coefficients are always positive.
Figure 2.11: Reflection and transmission coefficients as function of the angle of incidence of S-
and P-polarised waves incident from air to glass. The Brewster angle is denoted by θp.

2.10.5 Total Internal Reflection and Evanescent Waves


We return to the case of a wave incident from glass to air, i.e. ni = 1.5 and nt = 1. As has been
explained, there is then a critical angle, given by

$$\sin\theta_{i,c} = \frac{n_t}{n_i}.$$
Figure 2.12: Reflection and transmission coefficients as function of the angle of incidence of S-
and P-polarised waves incident from glass to air.

This is equivalent to

$$k_x^t = k_0 n_i \sin\theta_{i,c} = k_0 n_t = k_0. \tag{2.133}$$

The wave vector $\mathbf{k}^t = k_x^t\hat{\mathbf{x}} + k_y^t\hat{\mathbf{y}}$ in y < 0 satisfies:

$$(k_x^t)^2 + (k_y^t)^2 = k_0^2. \tag{2.134}$$

Because of (2.133), we have at the critical angle

$$k_y^t = 0. \tag{2.135}$$

For angles of incidence above the critical angle we have $k_x^t > k_0$ and it follows from (2.134) that
$(k_y^t)^2 = k_0^2 - (k_x^t)^2 < 0$, hence $k_y^t$ is imaginary:

$$k_y^t = \pm\sqrt{k_0^2 - (k_x^t)^2} = \pm i\sqrt{(k_x^t)^2 - k_0^2}, \tag{2.136}$$

where the last square root is a positive real number. It can be shown that above the critical
angle the reflection coefficients are complex numbers with modulus 1: |rS| = |rP| = 1. This
implies that the reflected intensity is identical to the incident intensity while at the same time
the transmission coefficients are not zero! For example, for S-polarisation we have according to
(2.107):

$$t_S = 1 + r_S \neq 0 \tag{2.137}$$

because, although |rS| = 1, rS ≠ −1. Therefore there is an electric field in y < 0, given by

$$\mathbf{E}(x,y)e^{-i\omega t} = t_S\, e^{i k_x^t x + i k_y^t y - i\omega t}\,\hat{\mathbf{z}} = t_S\, e^{i(k_x^t x - \omega t)}\, e^{y\sqrt{(k_x^t)^2 - k_0^2}}\,\hat{\mathbf{z}}, \quad y < 0, \tag{2.138}$$

where we have chosen the − sign in (2.136) to guarantee that the field does not blow up for
y → −∞. Since $k_x^t$ is real, the wave propagates in the x-direction. In the y-direction, however,
the wave is not propagating. Its amplitude decreases exponentially as function of the distance |y| to
the interface and therefore the wave is confined to a thin layer adjacent to the interface. Such a
wave is called an evanescent wave. By computing the Poynting vector one can show that the
energy propagates parallel to the interface, in the direction in which $k_x^t$ is positive.
Hence no energy is transported away from the interface into the air region.
In summary, for angles of incidence in glass above the critical angle, the transmitted field in
air is evanescent. We shall return to evanescent waves in the chapter on diffraction theory.
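To get a feeling for the numbers, the sketch below evaluates the decay constant of (2.138) and the S-polarised reflection coefficient above the critical angle. It assumes that (2.115) remains valid when nt cos θt is replaced by the imaginary quantity −k_y^t/k0 obtained from the − sign in (2.136); the function name and the wavelength of 500 nm are illustrative choices, not taken from these notes.

```python
import math

def evanescent_field(theta_i_deg, n_i, n_t, wavelength):
    """Return (kappa, r_S, t_S) above the critical angle.

    kappa = sqrt((k_x^t)^2 - (k0*n_t)^2) is the decay constant: the field in
    y < 0 is proportional to exp(kappa * y), cf. (2.138) where n_t = 1.
    """
    k0 = 2.0 * math.pi / wavelength                       # vacuum wave number
    kx = k0 * n_i * math.sin(math.radians(theta_i_deg))   # tangential component, cf. (2.103)
    kt = k0 * n_t
    if kx <= kt:
        raise ValueError("at or below the critical angle: the transmitted wave is not evanescent")
    kappa = math.sqrt(kx ** 2 - kt ** 2)
    a = n_i * math.cos(math.radians(theta_i_deg))
    b = kappa / k0                                        # n_t*cos(theta_t) = i*b for the decaying solution
    r_s = complex(a, -b) / complex(a, b)                  # (2.115) with imaginary n_t*cos(theta_t)
    t_s = 1.0 + r_s                                       # (2.137)
    return kappa, r_s, t_s

kappa, r_s, t_s = evanescent_field(45.0, 1.5, 1.0, 500e-9)
print(abs(r_s))       # modulus 1: total reflection
print(abs(t_s))       # non-zero: there is a field in the air
print(1.0 / kappa)    # 1/e decay length of the amplitude, a fraction of the wavelength
```

For glass to air at 45° the amplitude decays to 1/e within less than half a wavelength, which illustrates why the evanescent wave is confined to a thin layer at the interface.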

External sources in recommended order:

1. Youtube video - 8.03 - Lect 18 - Index of Refraction, Reflection, Fresnel Equations, Brewster
Angle - Lecture by Walter Lewin

2. MIT OCW - Reflection at The Air-glass Boundary: demonstration of reflection of polarized
light and the Brewster angle.

2.11 Fiber Optics


By using the phenomenon of total internal reflection, light can be transported over long distances
with very little loss and without spreading its energy. The principle has been known for a long time,
but the topic was greatly boosted by the invention of the laser.
Consider a straight glass cylinder of refractive index ni, surrounded by air with refractive
index nt = 1. The diameter of the cylinder is of the order of that of a human hair and hence is many
wavelengths in size. This implies that when light strikes the cylindrical surface, we can locally
consider the cylinder as a flat surface. By focusing a laser beam at the entrance plane of the
fiber, light can be coupled into the fiber. The part of the light that strikes the cylinder surface
at an angle with the normal that is larger than the critical angle of total reflection will be totally
reflected. As it hits the opposite side of the cylinder surface it will again be totally reflected and
so on (Fig. 2.13). Since light has such high frequencies (of the order of 10¹⁵ Hz), one hundred thousand
times more information can be carried than with microwaves. Today fibers with very low losses
are fabricated so that signals can be sent around the earth with hardly any attenuation. By
packing thousands of fibers into a cable, images can be transferred, even if the bundle is bent
(Figs. 2.14, 2.15).
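The guiding condition can be illustrated with a small sketch. It rests on assumptions that are not stated in the text above: the ray lies in a plane through the fiber axis, the entrance face is flat and perpendicular to the axis, and the same medium (air) surrounds both the entrance face and the cylinder wall. The function name `is_guided` is illustrative.

```python
import math

def is_guided(alpha_deg, n_core, n_outside=1.0):
    """True if a ray entering the end face at angle alpha (to the fiber axis)
    is totally reflected at the cylinder wall.
    """
    # Refraction at the entrance face: n_outside*sin(alpha) = n_core*sin(beta).
    beta = math.asin(n_outside * math.sin(math.radians(alpha_deg)) / n_core)
    # Inside the fiber the ray makes angle beta with the axis,
    # hence 90 degrees - beta with the normal of the cylinder wall.
    wall_angle = math.pi / 2.0 - beta
    theta_c = math.asin(n_outside / n_core)      # critical angle at the wall
    return wall_angle > theta_c

print(is_guided(89.0, 1.5))   # True: for glass with n = 1.5 in air even a nearly grazing ray is guided
print(is_guided(80.0, 1.3))   # False: for a lower index, steep rays leak out
print(is_guided(10.0, 1.3))   # True
```

Under these assumptions every ray that enters the end face of a bare glass fiber with index 1.5 is guided, because the largest internal angle with the axis (41.8°) leaves an angle of 48.2° with the wall normal, which exceeds the critical angle.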

External sources in recommended order:

1. Hecht 5.6

2. MIT OCW - Single Mode Fiber: Demonstration of a single-mode fiber.

3. MIT OCW - Multi-mode Fiber: Demonstration of a multimode fiber.


Figure 2.13: Light reflected within a dielectric cylinder.

Figure 2.14: A fiber bundle.

Figure 2.15: A coherent bundle of 10 µm glass fibers which transmits an image even though it
is knotted and sharply bent.
Chapter 3

Geometrical Optics

What you should know and be able to do after studying this chapter.

• Principle of Fermat.

• Understand the approximation made in Gaussian geometrical optics.

• Know how to work with the sign convention of the lens maker’s formula (not
the derivation of the formula).

• Understand how the Lens maker’s Formula of a single lens follows from the
formula for a single interface.

• Understand how the image of two and more lenses is derived from that of
a single lens by constructing and computing the intermediate images. You
do not need to know the imaging equation and the formulae for the focal
distances of two thin lenses.

• Understand the matrix method (you do not need to know the matrices by
heart).

• Understand the modification of the lens model to incorporate a thick lens.

• Understand the limitations of geometrical optics: when is diffraction optics
needed?

Nice software for practicing geometrical optics: https://www.geogebra.org/m/X8RuneVy

3.1 Introduction
Geometrical optics is an old subject but it is still essential for understanding and designing optical
instruments such as cameras, microscopes and telescopes. Geometrical optics started long
before light was described as a wave as is done in wave optics, and long before it was discovered
that light is an electromagnetic wave governed by Maxwell’s equations.
In this chapter we go back in history and treat geometrical optics. That may seem strange now
that we have a much more accurate and better theory at our disposal. However, the predictions
of geometrical optics are under quite common circumstances very useful and also very accurate.
In fact, for many optical systems and practical instruments there is no good alternative to
geometrical optics because more accurate theories are much too complicated to use.
Molecules of a material that is being illuminated start to radiate spherical waves (more
precisely, they radiate like tiny electric dipoles) and the total wave scattered and transmitted by
the material is the sum of all these spherical waves. A time harmonic wave has at every point in
space and at every instant of time a well defined phase. A wave front is a surface of constant
phase. The wave front moves with the velocity of light in the direction of its local normal.
For plane waves we have shown that the normal
to the wave front is the direction of the wave vector, which also coincides with the direction of the
phase velocity and the direction of the flow of energy (the direction of the Poynting vector). For
more general waves, the normal to the wave front may still be considered to be in the direction
of the local flow of energy provided the curvature of the wave front is not too large. Such waves
behave locally as plane waves.

Geometrical optics is based on the intuitive idea that light consists of a bunch of rays. But
what is a ray?

A ray is an oriented curve which is everywhere perpendicular to the wavefronts
and points in the direction of the flow of energy.

Consider a point source at some distance before an opaque screen with an aperture. According
to the ray picture, the light distribution on a second screen further away from the source and
parallel to the first screen is simply an enlarged copy of the aperture (see Fig. 3.1). The copy is
enlarged due to the fanning out of the rays. However, this description is only accurate when the
wavelength of the light is very small compared to the diameter of the aperture. If the aperture is
only ten times the wavelength, the pattern is much broader due to the bending of the rays around
the edge of the aperture. This phenomenon is called diffraction and cannot be explained by
geometrical optics.

Figure 3.1: Light distribution on a screen due to a rectangular aperture. Left: for a large
aperture, we get an enlarged copy of the aperture. Right: for an aperture that is of the order of
the wavelength there is strong bending and diffraction of light.

Geometrical optics is accurate when the sizes of the objects in the system are large compared to
the wavelength. It is possible to derive geometrical optics from Maxwell’s equations by expanding
the electromagnetic field in a Taylor series in the wavelength and retaining only the first term of
this expansion.¹

3.2 Fermat’s Principle


The starting point of the treatment of geometrical optics is the very powerful

Principle of Fermat (1657) The path followed by a light ray between two points is the one
that costs the least amount of time.

¹ See Chapter 1 of M. Born & E. Wolf, "Principles of Optics".


The speed of light in a material with refractive index n is c/n, where c = 3 × 10⁸ m/s is
the speed of light in vacuum. At the time of Fermat, the conviction was that the speed of light
must be finite, but nobody at the time could suspect how incredibly large it actually is. In 1676
the Danish astronomer Ole Römer computed the speed from observations of the eclipses of a moon
of Jupiter and arrived at an estimate that was only 30% too low.
Let r(s) be a ray with s the length parameter. The ray links two points S and P. Suppose
that the refractive index varies with position: n(r). Over the distance from s to s + ds, the
speed of the light is

$$\frac{c}{n(\mathbf{r}(s))}. \tag{3.1}$$

Hence the time light needs to go from r(s) to r(s + ds) is:

$$dt = \frac{n(\mathbf{r}(s))}{c}\,ds, \tag{3.2}$$

and the total time to go from S to P is:

$$t_{S\to P} = \int_0^{s_P} \frac{n(\mathbf{r}(s))}{c}\,ds, \tag{3.3}$$

where s_P is the distance along the ray from S to P. The optical path length of the ray between
S and P is defined by:

$$\mathrm{OPL} = \int_0^{s_P} n(\mathbf{r}(s))\,ds. \tag{3.4}$$

So the OPL is the distance weighted by the refractive index.

Fermat’s Principle is thus equivalent to the statement that a ray follows the path
with shortest OPL.
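The definition (3.4) can be evaluated numerically for any path that is approximated by straight segments. The following is a minimal sketch in two dimensions; the function name `optical_path_length`, the midpoint rule and the two-layer example are illustrative choices, not part of these notes.

```python
import math

C = 299_792_458.0   # speed of light in vacuum in m/s

def optical_path_length(points, n, steps=1000):
    """Midpoint-rule approximation of OPL = integral of n(r(s)) ds, Eq. (3.4).

    points: list of (x, y) vertices of a piecewise-straight path.
    n:      function n(x, y) giving the refractive index at a position.
    """
    opl = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        ds = math.hypot(x1 - x0, y1 - y0) / steps
        for j in range(steps):
            u = (j + 0.5) / steps                   # midpoint of sub-interval j
            opl += n(x0 + u * (x1 - x0), y0 + u * (y1 - y0)) * ds
    return opl

# Homogeneous glass: the OPL is simply n times the geometrical length.
print(optical_path_length([(0.0, 0.0), (3.0, 4.0)], lambda x, y: 1.5))

# Two media: air (n = 1) in y > 0, glass (n = 1.5) in y < 0, path bent at the interface.
def two_layers(x, y):
    return 1.0 if y > 0 else 1.5

path = [(0.0, 1.0), (0.5, 0.0), (1.0, -1.0)]
opl = optical_path_length(path, two_layers)
print(opl, opl / C)   # optical path length in metres and travel time (3.3) in seconds
```

Comparing the OPL of several candidate paths between the same end points is then a direct numerical way of applying Fermat's Principle.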

Remark. Actually, Fermat’s Principle as formulated above is not complete. There are circumstances
in which a ray can take two paths between two points that have different travel times. Each
of these paths then corresponds to a minimum travel time compared to nearby paths, so the
travel time is in general a local minimum.

3.3 Some Consequences of Fermat’s Principle


• Homogeneous matter
In homogeneous matter, the refractive index is constant and hence paths of shortest OPL
are straight lines. Hence in homogeneous matter rays are straight lines.
• Inhomogeneous matter
When the refractive index is a function of position, such as in air with a temperature gradient,
the rays are in general curved, which leads to several well-known effects (see Figs. 3.2, 3.3
and 3.4).
• Law of reflection
Consider the mirror shown in Fig. 3.5. A ray from point P can end up in Q in two ways: by
going straight from P to Q or alternatively via the mirror. The two possibilities have different
path lengths and hence different travel times, and both are local minima. We consider
here the path by means of reflection by the mirror. Let the x-axis be the intersection of
the mirror and the plane through the points P and Q and perpendicular to the mirror. Let
the y-axis be normal to the mirror. Let (xP, yP) and (xQ, yQ) be the coordinates of P and
Q. If (x, 0) is the point where a ray from P to Q hits the mirror, its travel time is

$$\frac{n}{c}d_1(x) + \frac{n}{c}d_2(x) = \frac{n}{c}\sqrt{(x - x_P)^2 + y_P^2} + \frac{n}{c}\sqrt{(x_Q - x)^2 + y_Q^2}, \tag{3.5}$$

Optica Lecture Notes TN2421 37 of 165 Monday 16th April, 2018, 09:44
Fermat’s Principle in its modern fo
The same effect is well known as it applies to sound. Fig-
going from point S to point P must
ure 4.40 depicts the alternative understanding in terms of waves.
length that is stationary with respect t
n of Light The wavefronts bend because of temperature-induced changes
In essence what that means is that the
in speed and therefore in wavelength. (The speed of sound is
CHAPTER 3. GEOMETRICAL x will haveOPTICS
a somewhat flattened regio
proportional to the square root of the temperature.) The noises
of people on a hot beach climb up and away, and the place can the slope goes to zero. The zero-slope
ar en t po sition actual path taken. In other words, the
App
tory will equal, to a first approxima
n1 Ray fr immediately adjacent to it.† For exam
(a) om Su
n the OPL is a minimum, as with the ref
Warm Straigh
to Sun 4.36, the OPL curve will look somethi
t path
s2 n2 Earth change in x in the vicinity of O has lit
s3 n3
Figure 4.38 The bending of rays through inhomogeneous media.
a similar change in x anywhere well a
Because the rays bend as they pass through the atmosphere the Sun substantial change in OPL. Thus th
appears higher in the sky. Cold
neighboring the actual one that would
si ni for the light to traverse. This latter ins
In the same way, a road viewed at a glancing angle, as in begin to understand how light manag
Fig. 4.39, appears to reflect the environs as if it were covered meanderings.
with a sheet of water. The air near the roadway is warmer and Suppose that a beam of light adva
less dense than that farther above it. It was established experi- neous isotropic medium (Fig. 4.42) s
sm nm (b) points S to P. Atoms within the mater
mentally by Gladstone and Dale that for a gas of density r
Cold dent disturbance, and they reradiate in
(n - 1) ∝ r progressing along paths in the immedia
P
straight-line path will reach P by route
It follows from the Ideal Gas Law that at a fixed pressure, since in OPL (as with group-I in Fig. 4.42b).
r ∝ P>T, (n - 1) ∝ 1>T; the hotter the road, the lower the in- nearly in-phase and reinforce each oth
hrough a layered material. dex of refraction of the air immediately Warmabove it.
According to Fermat’s Principle, a ray leaving a branch in
Figu

OPL
Fig. 4.39a heading somewhat downward would take a route that show
th length and speed, respectively, minimized the OPL. Such a ray would bend upward, passing loca
ibution. Thus through more of the less dense air than if it had traveled straight. to a

m
Figure 3.2: The density
To
Figure and therefore
appreciate
4.40 The how
puddlethat the
mirage refractive
works,
can be imagine index
understood the decreases
air divided
via waves; the speed,and
into hence
an the light speed
1 warmerand therefore thebends
wavelength, increase in the less dense medium. That bends happens with sound
^
ci=1
nisi increases in(4.9)
waves.
infinite
air.
layers.
(a)
number
This
A ray
when the passing
surface
of theinfinitesimally
from
air is cold,
wavefront
layercan
sounds tobe
thin
and
layer
constant-n
rays.
heardwould
The
the wavefronts and the rays. The same effect is common with sound waves,
bend than
much farther
horizontal
same
(via Snell’s
Law) slightly
normal. upward
(b) And when at each
it’s warm, interface
sounds seem to (much asthe
vanish into in Fig.
air. 4.36 held x
O
known as the optical path length upside down with the ray run backwards). Of course, if the ray
n contrast to the spatial path length comes down nearly vertically it makes a small angle-of-incidence
mogeneous medium where n is a *See, for example, T. Kosa and P. Palffy-Muhoray, “Mirage mirror on the wall,” †The first derivative of the OPL vanishes in its Tay
Am. J. Phys. 68 (12), 1120 (2000). path is stationary.
ummation must be changed to an
Cool air
(a)

= 3 n(s) ds
P
(4.10) Hot air
S
Apparent reflecting
surface
corresponds to the distance in
stance traversed (s) in the medium
will correspond to the same number
M04_HECH6933_05_GE_C04.indd 119
= s>l, and the same phase change (b)

, we can restate Fermat’s Principle:


to P, traverses the route having the

n pass through the inhomogeneous


shown in Fig. 4.38, they bend so as
er regions as abruptly as possible, Figure 4.39 (a) At very low angles the rays appear to be coming from
Figure
one can still see the 3.3: itAt
Sun after hasvery beneath
low angles (a) the light appears to be coming from beneath the road as if
the road as if reflected in a puddle. (b) A photo of this puddle
orizon. reflected by a puddle (b).(Matt Malloy and Dan MacIsaac, Northern Arizona University, Physics & Astronomy)
effect.

Optica Lecture Notes TN2421 38 of 165 Monday 16th April, 2018, 09:44
n of Light 3.3. Some Consequences of Fermat’s Principle

Figure 3.4: The bending of rays through the inhomogeneous atmosphere. The upper layers are
less dense, hence the light speed is higher there. The Sun therefore seems to be higher up in the
sky than is actually the case.

where n is the refractive index of the medium in y > 0. The point (x, 0) should be such
that the travel time is minimum, i.e.

\[ \frac{d}{dx}\left[ d_1(x) + d_2(x) \right] = \frac{x - x_P}{d_1(x)} - \frac{x_Q - x}{d_2(x)} = 0. \tag{3.6} \]

Hence
\[ \sin\theta_i = \sin\theta_r, \tag{3.7} \]
or
\[ \theta_r = \theta_i, \tag{3.8} \]
where θi and θr are the angles of incidence and reflection as shown in Fig. 3.5.

Figure 3.5: Ray from P to Q via the mirror.

• Snell’s law of refraction
Next we consider refraction at an interface. Let y = 0 be the interface between a medium
with refractive index ni in y > 0 and nt in y < 0. Let P = (xP, yP) and Q = (xQ, yQ) with
yP > 0 and yQ < 0 (see Fig. 3.6). What path will a ray follow that goes from P to Q?
Since the refractive index is constant in both half spaces, the ray is a straight line in both
media. Let (x, 0) be the coordinate of the intersection point of the ray with the interface.
Then the travel time is
\[ \frac{n_i}{c} d_1(x) + \frac{n_t}{c} d_2(x) = \frac{n_i}{c}\sqrt{(x - x_P)^2 + y_P^2} + \frac{n_t}{c}\sqrt{(x_Q - x)^2 + y_Q^2}. \tag{3.9} \]
The travel time must be minimum, hence there must hold
\[ \frac{d}{dx}\left[ n_i d_1(x) + n_t d_2(x) \right] = n_i \frac{x - x_P}{d_1(x)} - n_t \frac{x_Q - x}{d_2(x)} = 0, \tag{3.10} \]
where the travel time has been multiplied by the speed of light in vacuum. Eq. (3.10)
implies
\[ n_i \sin\theta_i = n_t \sin\theta_t, \tag{3.11} \]


Figure 3.6: Ray from P to Q, refracted by an interface.

where θi and θt are the angles between the ray in the upper half space and the normal to
the surface and between the ray in the lower half space and the normal (Fig. 3.6).

Hence we have derived the law of reflection and Snell’s Law from Fermat’s Principle. In Chapter
2, we have previously derived the Reflection Law and Snell’s Law from the continuity conditions
for the electromagnetic field components at the interface.
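
The derivation of Snell's Law from Fermat's Principle can also be checked numerically. The sketch below is only an illustration: the positions of P and Q, the refractive indices and the helper names (`optical_path`, `minimise`) are assumptions chosen for this example and do not come from the notes. It minimises the optical path length n_i d_1(x) + n_t d_2(x) of Eq. (3.10) over the crossing point x with a golden-section search and then compares n_i sin θ_i with n_t sin θ_t.

```python
import math


def optical_path(x, P, Q, n_i, n_t):
    """Optical path length n_i*d1 + n_t*d2 for the ray P -> (x, 0) -> Q."""
    d1 = math.hypot(x - P[0], P[1])
    d2 = math.hypot(Q[0] - x, Q[1])
    return n_i * d1 + n_t * d2


def minimise(f, lo, hi, tol=1e-12):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


# Illustrative values (assumptions): P in air above the interface, Q in glass below.
P, Q = (0.0, 1.0), (2.0, -1.0)
n_i, n_t = 1.0, 1.5

x_min = minimise(lambda x: optical_path(x, P, Q, n_i, n_t), P[0], Q[0])

# Sines of the angles with the normal, as in Eq. (3.10).
sin_i = (x_min - P[0]) / math.hypot(x_min - P[0], P[1])
sin_t = (Q[0] - x_min) / math.hypot(Q[0] - x_min, Q[1])

# The two products should agree up to the accuracy of the search.
print(n_i * sin_i, n_t * sin_t)
```

The optical path length is a sum of two convex functions of x, so it has a single minimum between x_P and x_Q, which is why a simple bracketing search suffices here.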

3.4 Perfect Imaging by Conic Sections


(In this section the conic sections ellipse, hyperbola and parabola are important. In Fig. 3.7 their
definitions are shown as a quick reminder. See also: https://en.wikipedia.org/wiki/Conic_section)

What is perfect imaging?


Let S be a point source. The rays perpendicular to the spherical waves emitted by S radially
fan out from S. Due to objects such as lenses etc. the spherical wave fronts are deformed and
the directions of the rays start to deviate from the radial propagation direction. When there is
a cone of rays coming from point S and all rays intersect in the same point P, then by Fermat’s
Principle, all these rays have traversed paths of minimum travel time. In particular, their travel
times are equal and therefore they all add up in phase when they arrive in P. Hence at P there
is a high light intensity and P is called a perfect image of S. By reversing the arrows, S is
similarly a perfect image of P. The optical system in which this happens is called stigmatic for
the two points S and P.
We give in this section examples of stigmatic systems and discuss to what extent a system
that is stigmatic for a certain pair of points can also be stigmatic for other pairs of points.

1. Perfect focusing of a parallel beam of light by refraction. Suppose there are two
media with refractive index n1 and n2 , with n2 > n1 and suppose point S is at infinity in
the medium with refractive index n2 . We try to construct a surface (interface) between
the two media such that all rays from S are focused into the same point F (see Fig. 3.9).
Because S is at very large distance, the rays entering from the right are parallel. Since all
parallel rays have travelled the same distance when they hit the surface DD’ perpendicular
to the rays, all parallel rays have the same phase at their intersection points with the plane
DD’. If point A is on the interface sought for, the travel time for a ray from D to F via A
must be minimum and hence must be the same for all points A. Hence
\[ \frac{n_2}{c}|DA| + \frac{n_1}{c}|AF| = \text{constant}, \tag{3.12} \]


Figure 3.7: Overview of the definitions of some conic sections. The lower figure shows a defini-
tion that unifies the three definitions in the above figure by introducing a parameter called the
eccentricity e. The point F is the focus and the line e = ∞ is the directrix of the conic sections.


Figure 3.8: Perfect imaging: a cone of rays diverging from S converges onto P. The light continues
after P.

where “constant” means the same value for all rays, hence for all points A on the interface.
By moving the plane DD’ parallel to itself, we can achieve that for this new plane DD’ we
get:
\[ e|DA| - |AF| = 0, \tag{3.13} \]
where e = n2/n1 > 1. Hence the set of points A defines a hyperboloid.
In contrast, when n2 < n1, as shown at the right of Fig. 3.9, then e < 1 and the interface
is an ellipsoid with F as one of its focal points.

Figure 3.9: (a) Hyperboloid (n2 > n1) and (b) ellipsoid (n2 < n1) to perfectly focus a parallel
beam incident from the medium with refractive index n2 into a point in a medium with refractive
index n1.

Figure 3.10: Lens with hyperboloid surfaces for perfect imaging of a pair of points.

The direction of the rays in Fig. 3.9 can obviously also be reversed, in which case the rays
from point F are all perfectly collimated (i.e. parallel). If medium 2 consists of glass and
medium 1 of air, we conclude that by joining two hyperboloids as shown in Fig. 3.10, point
F1 in air is perfectly imaged to point F2, also in air.

2. Perfect focussing of parallel rays by a mirror. Let there be a parallel bundle of
rays in air (n = 1) and suppose we want to focus all rays in point F. We draw a plane
Σ1 perpendicular to the rays as shown in Fig. 3.11. The rays that hit Σ1 have traversed
the same optical path length. We draw a second surface Σ2 parallel to Σ1 . Consider rays
hitting the mirror in A1 and A2 . The OPL from Wj via Aj to F must be the same for all
rays:
\[ \mathrm{OPL} = |W_1 A_1| + |A_1 F| = |W_2 A_2| + |A_2 F|. \tag{3.14} \]

Since Σ2 is parallel to Σ1 :

|W1 A1 | + |A1 D1 | = |W2 A2 | + |A2 D2 |. (3.15)

Hence (3.14) will be satisfied for points A for which |AF | = |AD|, i.e. for which the distance
to F is the same as to Σ2 . This is a paraboloid with F as focus and Σ2 as directrix.
By reversing the arrows, we get (within geometrical optics) a perfect parallel beam from
a point source. Parabolic mirrors are used everywhere, from automobile headlights to
radio telescopes.

Figure 3.11: A paraboloid mirror.
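
The focusing property of the paraboloid can be verified with a few lines of code. The following is a minimal sketch under assumed conventions that are not in the notes: the mirror cross-section is taken as y = x²/(4f) with focus F = (0, f), the incident rays travel in the −y direction from a plane wavefront at height H, and the function names are hypothetical. It checks both that the optical path length via the mirror to F is the same for every ray, as in (3.14), and that the law of reflection sends every ray through F.

```python
import math

f = 0.5  # distance from vertex to focus (assumed value)
H = 3.0  # height of the incident plane wavefront above the vertex (assumed value)


def mirror_point(x):
    """Point A on the parabola y = x^2 / (4 f)."""
    return x, x * x / (4.0 * f)


def opl_via_mirror(x):
    """|WA| + |AF| for a ray that starts on the plane y = H and travels straight down."""
    xa, ya = mirror_point(x)
    return (H - ya) + math.hypot(xa, ya - f)


def axis_crossing(x):
    """Height at which the reflected ray crosses the optical axis (requires x != 0)."""
    xa, ya = mirror_point(x)
    # Unit normal of the mirror at A: gradient of y - x^2/(4f), normalised.
    nx, ny = -xa / (2.0 * f), 1.0
    norm = math.hypot(nx, ny)
    nx, ny = nx / norm, ny / norm
    # Law of reflection: r = d - 2 (d . n) n, with incident direction d = (0, -1).
    dx, dy = 0.0, -1.0
    dot = dx * nx + dy * ny
    rx, ry = dx - 2.0 * dot * nx, dy - 2.0 * dot * ny
    t = -xa / rx  # ray parameter at which the reflected ray reaches x = 0
    return ya + t * ry


for x in (-1.5, -0.5, 0.3, 1.0):
    print(x, opl_via_mirror(x), axis_crossing(x))
```

For every sampled ray the printed optical path length should equal H + f and the axis crossing should equal f, up to rounding.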

Remarks.

1. Although we found that conic surfaces give perfect imaging for a certain pair of points,
other points do NOT have perfect images in the sense that for a certain cone of rays, all
rays are refracted (or reflected) in the same point.

2. One should realize that only the cone of light that is captured by the conic surface is
refracted (reflected) into a point. The fact that rays outside this cone do not end up in
this point causes blurring by diffraction. The smaller the cone angle, the larger the blurring.
Since this blurring is due to diffraction, it can not be quantified in geometrical optics. In
geometrical optics a unique point of intersection of all rays captured by the optical element
is considered perfect.

External sources in recommended order:

1. KhanAcademy - Geometrical Optics: Playlist on geometrical optics, which mainly revises
secondary school material.

2. Yale Courses - 16. Ray or Geometrical Optics I - Lecture by Ramamurti Shankar

3. Yale Courses - 17. Ray or Geometrical Optics II - Lecture by Ramamurti Shankar


3.5 Gaussian Geometrical Optics


We have seen above that by using lenses or mirrors which have surfaces that are conic sections
we can perfectly image a certain pair of points, but for other points the image is in general not
perfect. The imperfections are caused by rays that make larger angles to the optical axis, i.e.
to the symmetry axis of the system. Rays for which these angles are small are called paraxial
rays. Because for paraxial rays the angles of incidence and transmission at the surfaces of the
lenses are small, the sines of the angles in Snell’s Law can be replaced by the angles themselves:

ni θi = nt θt (paraxial rays only). (3.16)

This approximation greatly simplifies the calculations.
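
To get a feeling for how good the paraxial approximation (3.16) is, one can compare it with the exact form of Snell's Law for a few angles of incidence. The sketch below is only illustrative; the refractive indices and the function names are assumptions made for this example.

```python
import math


def refraction_angle_exact(theta_i, n_i, n_t):
    """Angle of refraction from n_i sin(theta_i) = n_t sin(theta_t)."""
    return math.asin(n_i * math.sin(theta_i) / n_t)


def refraction_angle_paraxial(theta_i, n_i, n_t):
    """Angle of refraction from the paraxial form n_i theta_i = n_t theta_t, Eq. (3.16)."""
    return n_i * theta_i / n_t


def relative_error(theta_i_deg, n_i=1.0, n_t=1.5):
    """Relative error of the paraxial angle for an angle of incidence given in degrees."""
    theta_i = math.radians(theta_i_deg)
    exact = refraction_angle_exact(theta_i, n_i, n_t)
    paraxial = refraction_angle_paraxial(theta_i, n_i, n_t)
    return abs(paraxial - exact) / exact


# Assumed example: air (n = 1.0) to glass (n = 1.5).
for deg in (1, 5, 10, 20, 40):
    print(f"{deg:2d} deg: relative error {relative_error(deg):.2e}")
```

The error grows roughly with the square of the angle of incidence, which is why the approximation is restricted to rays close to the optical axis.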


Finally, when only paraxial rays are considered, one may replace any refracting surface by a
sphere with the same curvature at its vertex, i.e. at the common point of intersection with the
optical axis. For paraxial rays, errors caused by replacing the general surface by a sphere are
of second order and hence insignificant. We remark that spherical surfaces are not only
simpler in the derivations but also much easier to manufacture. Hence spherical surfaces are
widely used in the optical industry. To reduce imaging errors caused by non-paraxial rays
one applies two strategies: 1. adding more spherical surfaces; 2. replacing one of the spherical
surfaces (typically the last before image space) by an aspherical surface.

In gaussian geometrical optics only paraxial rays and spherical surfaces are con-
sidered. In gaussian geometrical optics every point has a perfect image.

3.5.1 Gaussian Imaging by a Single Spherical Surface


Fig. 3.12 shows a spherical interface between two media with refractive indices n1 and n2 . We
assume n2 > n1 . The sphere has radius R and centre C which is inside medium 2. We consider
a point object S to the left of the surface. We draw a ray from S perpendicular to the surface.
The point of intersection is V. Since for this ray the angle of incidence (with the local normal)
vanishes, the ray continues into the second medium without refraction and passes through the
centre C of the sphere. Next we draw a ray that hits the spherical surface in some point A and
draw the refracted ray in medium 2 using Snell’s law in the form (3.16) (note that the angles of
incidence and transmission must be measured with respect to the local normal, i.e. with respect
to CA). We assume that this ray intersects the first ray in point P. We will show that within
the approximation of gaussian geometrical optics, all rays from S end up in P, hence P is the
perfect image of S.
[Note: this derivation is not part of the exam].

Figure 3.12: Imaging by a spherical interface between two media with refractive indices n2 > n1 .


It suffices to show that P is independent of the ray, i.e. of A. Let so and si be the distances
from S to V and from P to V. We will express si in terms of so and show that the result is independent
of A. Choose a Cartesian coordinate system (x, y) with origin at V and such that the x-axis
points from V to C. Let α1 and α2 be the angles of the rays SA and AP with the x-axis as shown
in Fig. 3.12. θi is the angle of incidence of ray SA with the local normal CA on the surface and
θt similarly is the angle of refraction. By considering the angles of ∆ SCA we find

θi = α1 + ϕ. (3.17)

Similarly, from ∆ CPA we find


θt = −α2 + ϕ. (3.18)
The paraxial version of Snell’s Law (3.16) implies

n1 α1 + n2 α2 = (n2 − n1 )ϕ. (3.19)

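We have
\[ \alpha_1 \approx \tan\alpha_1 = \frac{y_A}{s_o + x_A}, \qquad \alpha_2 \approx \tan\alpha_2 = \frac{y_A}{s_i - x_A}, \tag{3.20} \]
and
\[ \varphi \approx \frac{y_A}{R}, \tag{3.21} \]
which is small for paraxial rays. Hence,
\[ x_A = R - R\cos\varphi \approx R - R\left(1 - \frac{\varphi^2}{2}\right) = \frac{R}{2}\varphi^2 \approx 0 \tag{3.22} \]
is of second order in yA and can therefore be neglected. Then (3.20) becomes
\[ \alpha_1 = \frac{y_A}{s_o}, \qquad \alpha_2 = \frac{y_A}{s_i}. \tag{3.23} \]
By substituting (3.23) and (3.21) into (3.19) we find
\[ \frac{n_1}{s_o} y_A + \frac{n_2}{s_i} y_A = \frac{n_2 - n_1}{R} y_A, \]
or
\[ \frac{n_1}{s_o} + \frac{n_2}{s_i} = \frac{n_2 - n_1}{R}. \tag{3.24} \]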
This implies that si and hence P are independent of yA , i.e. of the ray chosen, hence P is a
perfect image within the approximation of gaussian geometrical optics.
Note that when so → ∞ we have si → n2 R/(n2 − n1), which is called the focal distance fi on
the image side:
\[ f_i = \frac{n_2}{n_2 - n_1} R, \tag{3.25} \]
and (3.24) becomes
\[ \frac{n_1}{s_o} + \frac{n_2}{s_i} = \frac{n_2}{f_i}. \tag{3.26} \]
Similarly we have for the focal distance on the object side: fo = n1 R/(n2 − n1 ). It is clear that
our assumption that the refracted ray through point A intersects the extension of SV means that
so > fo .
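
Formula (3.24) and the focal distances are easy to evaluate numerically. The sketch below is a minimal illustration with assumed values (an air-glass surface with R = 0.1 m) and hypothetical function names; it is not taken from the notes.

```python
def image_distance(s_o, n1, n2, R):
    """Solve n1/s_o + n2/s_i = (n2 - n1)/R, Eq. (3.24), for s_i."""
    power = (n2 - n1) / R
    return n2 / (power - n1 / s_o)


def focal_distances(n1, n2, R):
    """Object-side and image-side focal distances f_o and f_i of a single surface."""
    f_o = n1 * R / (n2 - n1)
    f_i = n2 * R / (n2 - n1)
    return f_o, f_i


# Assumed example: air (n1 = 1.0) to glass (n2 = 1.5), radius R = 0.1 m.
n1, n2, R = 1.0, 1.5, 0.1
f_o, f_i = focal_distances(n1, n2, R)

s_i_real = image_distance(0.6, n1, n2, R)   # object beyond f_o: rays converge
s_i_far = image_distance(1.0e9, n1, n2, R)  # very distant object: s_i approaches f_i
s_i_near = image_distance(0.1, n1, n2, R)   # object inside f_o: negative result,
                                            # the refracted rays diverge

print(f_o, f_i, s_i_real, s_i_far, s_i_near)
```

The function is undefined for s_o = f_o, where the refracted rays are parallel and the image is at infinity.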
Suppose now that n1 > n2 . The rays are then refracted away from the normal. Suppose there
is a ray bundle incident from medium 1 which, when viewed from medium 1, seems to converge
to point S to the right of the surface. The point S is called a virtual object point because it is
not really present in medium 2. It can be shown that in this case (3.24) holds but with negative
sign in front of n1/so.

Figure 3.13: Imaging of a virtual object S by a spherical interface between two media with
refractive indices n1 > n2.

The sign convention listed in Table 3.1 makes such case-dependent minus signs in (3.24)
unnecessary. As follows from the table, the object distance so is
positive when the object is to the left of the surface and negative when S is a virtual object in
medium 2 to which the rays in medium 1 seem to converge. Similarly, the image distance si is
taken to be positive if the image is to the right of the surface, but negative if the image is virtual
(i.e. to the left of the surface). A virtual image occurs when the rays in image space do not
converge to a point but diverge. When these rays are extended backwards into object space, they
intersect in a point P in front of the surface. For an observer in image space the rays seem to
come from P, which is therefore called the virtual image point.
The same sign conventions apply for fo as for so and for fi as for si . Finally, the radius of
curvature R is positive if the surface when viewed from medium 1 is convex and negative when
it is concave when viewed from medium 1. With these sign conventions, there always holds

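\[ \frac{n_1}{s_o} + \frac{n_2}{s_i} = \frac{n_2}{f_i} = \frac{n_1}{f_o} \tag{3.27} \]

with
\[ f_o = \frac{n_1}{n_2 - n_1} R, \qquad f_i = \frac{n_2}{n_2 - n_1} R. \tag{3.28} \]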
n2 − n1 n2 − n1

Table 3.1: Sign convention for spherical refracting surfaces and thin lenses (light
entering from the left)

quantity    positive (+)           negative (−)
so, fo      left of V              right of V
xo          left of Fo             right of Fo
si, fi      right of V             left of V
xi          right of Fi            left of Fi
R           C is right of V        C is left of V
yo, yi      above optical axis     below optical axis

3.5.2 The Thin Lens


Fig. 3.14 shows a lens with two spherical surfaces. The refractive index of the lens is nl and that
of the ambient medium is nm. We will determine the image obtained with this lens of a point
S with distance so1 to the left vertex V1. First we determine the intermediate image due to the
left surface with radius R1. This intermediate image then serves as the object for imaging by
the second surface with radius R2.

Figure 3.14: A spherical lens.

Two rays starting from S are drawn in Fig. 3.14. These rays are refracted by the first surface
towards its local normal so that there must hold nl > nm. However, in the situation shown,
this refraction is not sufficiently strong to make these rays converge and intersect in an image
point after the first surface (if you extend the dashed rays after the first surface as if the second
surface was not present, the rays would never intersect but would diverge). If one extends these
diverging rays to the region before the first surface, they intersect in P’. This means that for an
“observer inside the lens”, the diverging rays after the first surface appear to come out of point
P’ in front of the surface, which therefore is the virtual image of S for the first surface.
According to (3.27) we have
\[ \frac{n_m}{s_{o1}} + \frac{n_l}{s_{i1}} = \frac{n_l - n_m}{R_1}, \tag{3.29} \]
where in the case of Fig. 3.14, R1 > 0. Let d be the distance between the vertices: d = |V1 V2|.
P’ is to the left of the second surface with object distance to vertex V2 given by
\[ s_{o2} = d + |s_{i1}| = d - s_{i1} > 0. \tag{3.30} \]

The situation shown in Fig. 3.14 is only one particular example. The intermediate image point
P’ of S due to the first surface can also be a real image point, i.e. it can be to the right of the
first surface, and we then distinguish between the following two cases:

• P’ is to the right of surface 1 but to the left of the second surface, i.e. 0 < si1 < d. Then
\[ s_{o2} = d - s_{i1} > 0. \tag{3.31} \]

• P’ is to the right of surface 2. Then si1 > d whereas so2 < 0 is given by
\[ s_{o2} = d - s_{i1}. \tag{3.32} \]

This underlines again that the sign convention is very convenient because, whatever case occurs,
(3.32) applies. With (3.27) applied to P’ and the second surface we get
\[ \frac{n_l}{d - s_{i1}} + \frac{n_m}{s_{i2}} = \frac{n_m - n_l}{R_2}. \tag{3.33} \]


Adding (3.29) and (3.33) gives:

\[ \frac{n_m}{s_{o1}} + \frac{n_m}{s_{i2}} = (n_l - n_m)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) + \frac{n_l d}{(s_{i1} - d)\, s_{i1}}. \tag{3.34} \]

Note that si1 is a function of so1 and it is not so easy to express the image distance si2 in terms
of the object distance so1 alone. One can use either (3.29) or (3.33) to solve for si1 and substitute
into the other equation. However, if the lens is thin (d → 0), (3.34) simplifies because si1 drops
out and we get:

Thin-lens Equation or Lensmaker’s Formula:

\[ \frac{n_m}{s_o} + \frac{n_m}{s_i} = (n_l - n_m)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) \quad \text{(thin lens)}, \tag{3.35} \]

where so = so1 and si = si2 are the object distance and image distance to the common vertex
V = V1 = V2 of the two surfaces. It follows from the sign convention that so is positive (negative)
if the object is to the left (right) of V and that si is positive (negative) if the image is to the
right (left) of V .
When so → ∞, the image distance becomes the focal length fi in image space whereas when
si → ∞, the object distance becomes fo. Since the refractive index in both half spaces is the
same (nm), these focal distances are the same: f = fi = fo:
\[ \frac{1}{f} = \frac{n_l - n_m}{n_m}\left(\frac{1}{R_1} - \frac{1}{R_2}\right). \tag{3.36} \]

The Thin-Lens Equation can thus be written:

\[ \frac{1}{s_o} + \frac{1}{s_i} = \frac{1}{f} \quad \text{(thin lens)}. \tag{3.37} \]

If f > 0 then the lens is convergent or positive, otherwise divergent or negative.
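
The two-step procedure above (image by the first surface, then use it as object for the second) translates directly into code, and it shows how the thin-lens result emerges for d → 0. The sketch below uses assumed values (a symmetric biconvex glass lens in air with |R| = 0.1 m) and hypothetical function names; it is an illustration, not a prescribed implementation.

```python
def surface_image(s_o, n1, n2, R):
    """Image distance for one spherical surface, solving Eq. (3.27) for s_i."""
    return n2 / ((n2 - n1) / R - n1 / s_o)


def lens_image(s_o1, n_l, n_m, R1, R2, d):
    """Image distance s_i2 (measured from V2) for a lens of thickness d."""
    s_i1 = surface_image(s_o1, n_m, n_l, R1)   # first surface, Eq. (3.29)
    s_o2 = d - s_i1                            # object distance for surface 2, Eq. (3.32)
    return surface_image(s_o2, n_l, n_m, R2)   # second surface, Eq. (3.33)


def thin_lens_focal_length(n_l, n_m, R1, R2):
    """Focal length of a thin lens, Eq. (3.36)."""
    return 1.0 / ((n_l - n_m) / n_m * (1.0 / R1 - 1.0 / R2))


def thin_lens_image(s_o, f):
    """Image distance from the thin-lens equation, Eq. (3.37)."""
    return 1.0 / (1.0 / f - 1.0 / s_o)


# Assumed example: biconvex lens, R1 = +0.1 m and R2 = -0.1 m (sign convention of Table 3.1).
n_l, n_m, R1, R2 = 1.5, 1.0, 0.1, -0.1
s_o = 0.3

f = thin_lens_focal_length(n_l, n_m, R1, R2)
print("thin lens:", thin_lens_image(s_o, f))
for d in (0.0, 0.001, 0.01):
    print("d =", d, "->", lens_image(s_o, n_l, n_m, R1, R2, d))
```

For d = 0 the two-surface calculation reproduces the thin-lens value; for small but finite d the image distance, which is then measured from V2, changes only slightly.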

Remarks

1. We have seen in Section 3.5.1 that when in gaussian geometrical optics a parallel bundle
of rays (so → ∞) is incident on a single spherical surface, the rays are focused in a point
at a distance f from the vertex. It is clear from symmetry that if the parallel bundle is
incident under an angle and we vary this angle, the focal point will be displaced and vary
over a sphere with centre the vertex and radius f given by (3.28). In the approximation
of gaussian geometrical optics where only paraxial rays are considered, this sphere may
be considered to be a plane. Therefore in gaussian geometrical optics a parallel
bundle is focused to a point in the focal plane at distance f given by (3.36)
from the lens.

2. Similarly, a line perpendicular to the optical axis is imaged into a circular arc which in the
paraxial limit may be regarded to be a straight line. Hence in gaussian geometrical
optics, lines are imaged to lines and planes to planes.

3.5.3 Construction of the Image of a Finite Object


Consider the imaging of a finite object S1 S2 as shown in Fig. 3.15. Let yo be the distance of
S2 to the optical axis. We have yo > 0 if the object is above the optical axis. Similarly, the
y-coordinate yi of a point in image space is positive if the point is above the optical axis and
negative otherwise.

Figure 3.15: Object and image for a thin lens.

Draw the ray through the focal point in object space and the ray through the centre O of
the lens. The first ray becomes parallel in image space while the latter is unrefracted. Their
intersection gives the location of the image point. The image is real if the intersection occurs in
image space and is virtual otherwise. From the similar triangles ∆ AOFi and ∆ P2 P1 Fi it follows
that
\[ \frac{y_o}{|y_i|} = \frac{f}{s_i - f}. \tag{3.38} \]
From the similar triangles ∆ S2 S1 Fo and ∆ BOFo:
\[ \frac{|y_i|}{y_o} = \frac{f}{s_o - f}. \tag{3.39} \]

(the absolute value of yi is taken because according to our sign convention yi in Fig. 3.15 is
negative whereas (3.39) is a ratio of lengths). By multiplying these two equations we get the
Newtonian form of the lens equation:

\[ x_o x_i = f^2, \tag{3.40} \]

where xo and xi are the distances of the object and image to the front and back focal planes,
respectively:
xo = so − f, xi = si − f. (3.41)
Here xo is reckoned positive if the object is to the left of Fo and xi is positive if the image is to
the right of Fi .
The transverse magnification is
\[ M = \frac{y_i}{y_o} = -\frac{s_i}{s_o} = -\frac{x_i}{f}, \tag{3.42} \]

where the last identity follows from considering similar triangles in Fig. 3.15. A positive M
means an erect image, a negative M means an inverted image.
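
As a quick consistency check, the thin-lens equation, the Newtonian form (3.40) and the two expressions for the magnification in (3.42) can be evaluated for one example. The numbers below (f = 0.10 m, so = 0.25 m, yo = 0.02 m) and the function name are assumptions chosen only for illustration.

```python
def thin_lens_image(s_o, f):
    """Image distance from 1/s_o + 1/s_i = 1/f, Eq. (3.37)."""
    return 1.0 / (1.0 / f - 1.0 / s_o)


# Assumed example values, in metres.
f = 0.10
s_o = 0.25
y_o = 0.02

s_i = thin_lens_image(s_o, f)

# Distances to the focal planes, Eq. (3.41).
x_o = s_o - f
x_i = s_i - f

M = -s_i / s_o         # magnification from object and image distances
M_newton = -x_i / f    # magnification from the Newtonian distance x_i
y_i = M * y_o          # image height; negative means below the optical axis

print("s_i =", s_i)
print("x_o * x_i =", x_o * x_i, "f^2 =", f ** 2)
print("M =", M, "M_newton =", M_newton, "y_i =", y_i)
```

Since so > f in this example, the image is real and M is negative, i.e. the image is inverted.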

3.5.4 Two Thin Lenses

The imaging by two thin lenses L1 and L2 can easily be obtained by construction. For example,
in Fig. 3.16, the distance between the lenses is larger than the sum of their focal lengths. First
the image of S1 is constructed as obtained by L1 as if L2 were not present: we construct
the intermediate image P1′ due to lens L1 using rays 2 and 3. P1′ is a real image of lens L1 and a
real object for lens L2. Ray 3 is parallel to the optical axis between the two lenses and is thus
refracted by lens L2 through its back focal point Fi2. Ray 4 is the ray from P1′ through the centre
of lens L2. The final image point P1 is the intersection of rays 3 and 4.

Figure 3.16: Two thin lenses separated by a distance that is larger than the sum of their focal
lengths.

In the case of Fig. 3.17 the distance d between two positive lenses is smaller than their
focal lengths. First the intermediate image P1′ is constructed. It is a real image for L1, which is
constructed by finding the intersection of rays 2 and 3 passing through the back and front focal
points Fi1 and Fo1 of lens L1, respectively. P1′ is now a virtual object point for lens L2. To find its
image by L2, draw ray 4 from P1′ through the centre of lens L2 back to S1 (this ray is refracted
by lens L1 but not by L2) and draw ray 3 as refracted by lens L2. Since ray 3 is parallel to
the optical axis between the lenses, it passes through the back focal point Fi2 of lens L2. The
intersection point of rays 3 and 4 is the final image point P1.
It is easy to express the image distance si in terms of the object distance so for two thin lenses.
These distances are measured from the vertices of L2 and L1, respectively. The intermediate image P1′
due to lens L1 has distance si,1 to the vertex of lens L1 satisfying:

\[ \frac{1}{s_o} + \frac{1}{s_{i,1}} = \frac{1}{f_1}. \tag{3.43} \]

P1′ is the object for lens L2 with object distance so,2 = d − si,1, where d is the distance between the
lenses. Hence, with si = si,2, the lens equation for lens L2 implies:

\[ \frac{1}{d - s_{i,1}} + \frac{1}{s_i} = \frac{1}{f_2}. \tag{3.44} \]

By solving for si,1 from (3.43) and substituting the result into (3.44) one finds:

\[ s_i = \frac{s_o f_1 f_2 - f_2 d\,(s_o - f_1)}{s_o f_1 - (d - f_2)(s_o - f_1)} \quad \text{(two thin lenses)}. \tag{3.45} \]


Figure 3.17: Two thin lenses at a distance smaller than their focal lengths.


By taking the limit $s_o \to \infty$, we obtain the back focal length of the two lenses, while by taking
the limit $s_i \to \infty$ we get the front focal length:

\[
\text{b.f.l.} = \frac{(f_1 - d) f_2}{f_1 + f_2 - d}, \tag{3.46}
\]
\[
\text{f.f.l.} = \frac{f_1 (f_2 - d)}{f_1 + f_2 - d}. \tag{3.47}
\]
By construction using the intermediate image it is clear that the magnification of the two lens
system is the product of the magnification of the two lenses:

M = M1 M2 . (3.48)

Remarks

1. When f1 + f2 = d the focal points are at infinity. Such a system is called telecentric.

2. In the limit that the lenses are very close together, $d \to 0$, (3.45) becomes
\[
\frac{1}{s_o} + \frac{1}{s_i} = \frac{1}{f_1} + \frac{1}{f_2}. \tag{3.49}
\]
The focal length $f$ of the two lenses in contact satisfies:
\[
\frac{1}{f} = \frac{1}{f_1} + \frac{1}{f_2}. \tag{3.50}
\]
Two positive lenses in close contact reinforce each other, i.e. the second positive lens makes
the convergence of the first lens stronger. Similarly, two negative lenses in contact make a
more strongly negative system. The same applies for more than two lenses in close contact.
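
The two-lens formulas above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the sample values ($f_1 = 0.30$ m, $f_2 = 0.20$ m, $d = 0.10$ m, $s_o = 0.50$ m) are illustrative assumptions. It compares (3.45) with a step-by-step application of (3.43) and (3.44).

```python
def two_lens_image_distance(s_o, f1, f2, d):
    """Image distance s_i behind the second lens, Eq. (3.45)."""
    numerator = s_o * f1 * f2 - f2 * d * (s_o - f1)
    denominator = s_o * f1 - (d - f2) * (s_o - f1)
    return numerator / denominator


def step_by_step_image_distance(s_o, f1, f2, d):
    """Same quantity from Eqs. (3.43) and (3.44) applied one lens at a time."""
    s_i1 = 1.0 / (1.0 / f1 - 1.0 / s_o)   # intermediate image of lens 1
    s_o2 = d - s_i1                        # object distance for lens 2
    return 1.0 / (1.0 / f2 - 1.0 / s_o2)


def back_focal_length(f1, f2, d):
    """Eq. (3.46)."""
    return (f1 - d) * f2 / (f1 + f2 - d)


def front_focal_length(f1, f2, d):
    """Eq. (3.47)."""
    return f1 * (f2 - d) / (f1 + f2 - d)


# Illustrative values in metres (assumed, not taken from the notes).
f1, f2, d, s_o = 0.30, 0.20, 0.10, 0.50

print("s_i from (3.45):      ", two_lens_image_distance(s_o, f1, f2, d))
print("s_i lens by lens:     ", step_by_step_image_distance(s_o, f1, f2, d))
print("b.f.l. from (3.46):   ", back_focal_length(f1, f2, d))
print("s_i for distant object:", two_lens_image_distance(1e9, f1, f2, d))
print("f.f.l. from (3.47):   ", front_focal_length(f1, f2, d))
```

For a very distant object the image distance approaches the back focal length, as expected from the limit $s_o \to \infty$.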

3.5.5 The Matrix Method


For optical systems consisting of more than two lenses it becomes complicated to derive the
imaging properties. The matrix formalism of gaussian geometrical optics is then very useful and
we explain in this section how it works.
In any plane perpendicular to the optical axis, a ray is specified by its distance y to the
optical axis and its angle α with the optical axis. We use the following convention:

Sign convention for the ray angle: α is the smallest angle between the ray and the op-
tical axis and is positive for a positive slope and negative for a negative slope.

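We define the vector
\[
\begin{pmatrix} n\alpha \\ y \end{pmatrix}, \tag{3.51}
\]
where $n$ is the local refractive index (the presence of the refractive index in the first element of
the vector turns out to be convenient). For a given ray specified by such a vector in plane 1, the
vector of the ray in plane 2 can be found by a matrix multiplication:
\[
\begin{pmatrix} n_2\alpha_2 \\ y_2 \end{pmatrix} =
\begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}
\begin{pmatrix} n_1\alpha_1 \\ y_1 \end{pmatrix}. \tag{3.52}
\]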

We write M for the matrix and specify it for a number of cases. One can prove (but we will not
do it) that all matrices have determinant=1.


a. Homogeneous medium with refractive index $n$. For example, if the planes have distance
$d$ and there is a homogeneous medium of refractive index $n$ between the two planes,
we have
\[
n\alpha_2 = n\alpha_1, \qquad y_2 = y_1 + \alpha_1 d,
\]
and hence
\[
M = \begin{pmatrix} 1 & 0 \\ d/n & 1 \end{pmatrix}. \tag{3.53}
\]
This matrix will also be denoted by the symbol $T_{12}$ ("transfer matrix").

b. Spherical surface. Let the two planes 1 and 2 be immediately to the left and right of a
spherical interface with radius $R$, centre $C$ and vertex $V$, with refractive indices $n_1$ and $n_2$
to the left and right of the interface. With reference to Fig. 3.12, the angle $\alpha_2$ is according
to the sign convention negative whereas in Eq. (3.19) it was taken positive. Hence, we
have to replace $\alpha_2$ by $-\alpha_2$ in (3.19) and obtain:
\[
n_2\alpha_2 = n_1\alpha_1 - \frac{(n_2 - n_1) y_1}{R}, \qquad y_2 = y_1,
\]
where we used (3.21): $\varphi = y_1/R$. Hence the transfer matrix for a single surface is
\[
S = \begin{pmatrix} 1 & -\dfrac{n_2 - n_1}{R} \\ 0 & 1 \end{pmatrix}. \tag{3.54}
\]

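c. Thin lens. If the two planes are the entrance and exit planes of a thin lens of refractive
index $n_l$ in an ambient medium $n_m$, we have
\[
L_{\text{thin}} = \begin{pmatrix} 1 & -\dfrac{n_m}{f} \\ 0 & 1 \end{pmatrix}, \tag{3.55}
\]
where $f$ is the focal length of the lens.

If between two planes there are several lenses, single spherical surfaces and homogeneous spaces,
then the matrix $M$ for the two planes is simply the product of the separate matrices, with the
matrix of the element that the ray meets first on the right.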
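
As an illustration of how the matrices (3.53)-(3.55) are combined, here is a minimal sketch in plain Python. The helper names (`matmul`, `homogeneous`, `surface`, `thin_lens`, `system`) and the sample values are assumptions made for this example and do not appear in the notes. The sketch builds the matrix of two thin lenses in air separated by a distance $d$ and recovers the back focal length (3.46) from a ray that enters parallel to the optical axis.

```python
def matmul(a, b):
    """Product a*b of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def homogeneous(d, n=1.0):
    """Propagation over distance d in a medium of index n, Eq. (3.53)."""
    return ((1.0, 0.0), (d / n, 1.0))


def surface(n1, n2, R):
    """Refraction at a spherical surface of radius R, Eq. (3.54)."""
    return ((1.0, -(n2 - n1) / R), (0.0, 1.0))


def thin_lens(f, n_m=1.0):
    """Thin lens of focal length f in a medium of index n_m, Eq. (3.55)."""
    return ((1.0, -n_m / f), (0.0, 1.0))


def system(*elements):
    """Matrix of a system; elements are listed in the order the ray meets them."""
    m = ((1.0, 0.0), (0.0, 1.0))
    for element in elements:
        m = matmul(element, m)   # each later element multiplies from the left
    return m


def determinant(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Two thin lenses in air (illustrative values in metres).
f1, f2, d = 0.30, 0.20, 0.10
M = system(thin_lens(f1), homogeneous(d), thin_lens(f2))

# A ray with alpha_1 = 0 and height y_1 leaves with alpha_2 = M12*y_1 and
# y_2 = M22*y_1, so it crosses the axis at distance -M22/M12 behind lens 2.
bfl_from_matrix = -M[1][1] / M[0][1]

print("system matrix:", M)
print("determinant:  ", determinant(M))
print("b.f.l.:       ", bfl_from_matrix)
```

The loop in `system` makes the ordering explicit: the matrix of the first element ends up rightmost in the product.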

Imaging conditions. Let $S$ be a point in the first plane with distance to the optical axis
denoted by $y_1$. Suppose there is an image point $P$ of $S$ in plane 2. This means that there must
be a $y_2$ such that any (paraxial) ray through $S$ will pass through point $P$ with distance $y_2$ to the
optical axis. Hence
\[
M_{21} n_1\alpha_1 + M_{22} y_1 = y_2, \tag{3.56}
\]
for all $\alpha_1$. This is possible only when
\[
M_{21} = 0, \qquad y_2 = M_{22} y_1 \qquad \text{(imaging conditions)}. \tag{3.57}
\]

The matrix element $M_{22}$ is the magnification.
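
The imaging conditions can be tried out on the simplest case, a single thin lens in air. The sketch below is an illustration under assumed values ($f = 0.10$ m, $s_o = 0.30$ m) with hypothetical helper names; it locates the plane behind the lens where $M_{21} = 0$ and reads off the magnification $M_{22}$.

```python
def matmul(a, b):
    """Product a*b of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def homogeneous(d, n=1.0):
    """Propagation over distance d in a medium of index n, Eq. (3.53)."""
    return ((1.0, 0.0), (d / n, 1.0))


def thin_lens(f, n_m=1.0):
    """Thin lens of focal length f, Eq. (3.55)."""
    return ((1.0, -n_m / f), (0.0, 1.0))


def image_distance(before, n=1.0):
    """Distance s_i behind the last element for which M21 = 0.

    Multiplying `before` from the left by the propagation matrix (3.53)
    gives M21 = (s_i / n) * before[0][0] + before[1][0], which is linear in s_i.
    """
    return -n * before[1][0] / before[0][0]


f, s_o = 0.10, 0.30                        # illustrative values in metres
before = matmul(thin_lens(f), homogeneous(s_o))
s_i = image_distance(before)
M = matmul(homogeneous(s_i), before)       # object plane to image plane

print("s_i =", s_i)                        # satisfies 1/s_o + 1/s_i = 1/f
print("M21 =", M[1][0])                    # vanishes in the image plane
print("magnification M22 =", M[1][1])      # equals -s_i/s_o
```

For these values the matrix result agrees with the lens law and with the magnification $-s_i/s_o$ of a single thin lens.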

3.5.6 The Thick Lens


In Fig. 3.18 a thick lens is shown. The front focal point is defined as the point whose rays
are refracted such that the emerging rays are parallel to the optical axis. Extend the incident
and emerging rays by straight segments. These intersect in a curved surface which close to the
optical axis (in the paraxial approximation) is in good approximation a plane perpendicular to
the optical axis. This plane is called the primary principal plane and its intersection with
the optical axis is called the first principal point $H_1$.
By considering the rays that are incident parallel to the optical axis and focused in the back
focal point, the second principal plane and second principal point $H_2$ are defined in a similar
way. The principal planes need not be inside the lens. In particular for meniscus lenses this is
not the case.

Figure 3.18: Principal planes of a thick lens.

One can show by a rather long computation, which we omit (a derivation can be found in K.K. Sharma,
Optics: Principles and Applications, Academic Press, 2006, Section 4.2), that when the object and image
distances $s_o$, $s_i$ are measured with respect to the principal points $H_1$ and $H_2$, we have
\[
\frac{1}{s_o} + \frac{1}{s_i} = \frac{1}{f}, \tag{3.58}
\]
where $f$, the focal length as measured from the first and second principal planes, satisfies:
\[
\frac{1}{f} = \frac{n_l - n_m}{n_m} \left( \frac{1}{R_1} - \frac{1}{R_2} + \frac{(n_l - n_m) d_l}{n_l R_1 R_2} \right), \tag{3.59}
\]
where $d_l$ is the distance between the vertices. For completeness we list the distances $h_1$ and $h_2$
from $H_1$ to $V_1$ and from $H_2$ to $V_2$ ($h_j > 0$ if $H_j$ is to the right of $V_j$):
\[
h_1 = -\frac{f (n_l - n_m) d_l}{R_2 n_l}, \tag{3.60}
\]
\[
h_2 = -\frac{f (n_l - n_m) d_l}{R_1 n_l}. \tag{3.61}
\]

Figure 3.19: Position of the principal planes for different lenses of the same power.

A thick lens can be described by a ray matrix which has the same shape as for a thin lens,
provided the reference planes, which for the thin lens pass through the vertices, are for
the thick lens taken to be the principal planes. So
\[
L_{\text{thick}} = \begin{pmatrix} 1 & -\dfrac{n_m}{f} \\ 0 & 1 \end{pmatrix}, \tag{3.62}
\]
with $1/f$ given by (3.59).
In summary, the model of refraction by a thick lens within gaussian geometrical optics is the
same as for a thin lens provided object and image distances and the focal distances are all
measured from the principal planes.
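
The thick-lens formulas can be cross-checked against the matrix method. The sketch below is an illustration only: it assumes the usual convention that a radius is positive when the centre of curvature lies to the right of the vertex, and it uses made-up values for a biconvex glass lens in air. It builds the lens from two surface matrices (3.54) and one propagation matrix (3.53), and compares the result with (3.59)-(3.61).

```python
def matmul(a, b):
    """Product a*b of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def thick_lens_matrix(n_l, n_m, R1, R2, d_l):
    """Matrix from the plane just before V1 to the plane just after V2."""
    first = ((1.0, -(n_l - n_m) / R1), (0.0, 1.0))    # Eq. (3.54), medium -> glass
    inside = ((1.0, 0.0), (d_l / n_l, 1.0))           # Eq. (3.53), through the glass
    second = ((1.0, -(n_m - n_l) / R2), (0.0, 1.0))   # Eq. (3.54), glass -> medium
    return matmul(second, matmul(inside, first))


def focal_length(n_l, n_m, R1, R2, d_l):
    """Focal length from Eq. (3.59)."""
    inv_f = (n_l - n_m) / n_m * (
        1.0 / R1 - 1.0 / R2 + (n_l - n_m) * d_l / (n_l * R1 * R2)
    )
    return 1.0 / inv_f


def principal_plane_offsets(n_l, n_m, R1, R2, d_l):
    """Offsets h1, h2 from Eqs. (3.60) and (3.61)."""
    f = focal_length(n_l, n_m, R1, R2, d_l)
    h1 = -f * (n_l - n_m) * d_l / (R2 * n_l)
    h2 = -f * (n_l - n_m) * d_l / (R1 * n_l)
    return h1, h2


# Biconvex lens in air, lengths in metres (illustrative values).
n_l, n_m, R1, R2, d_l = 1.5, 1.0, 0.10, -0.10, 0.02

M = thick_lens_matrix(n_l, n_m, R1, R2, d_l)
f_matrix = -n_m / M[0][1]            # M12 = -n_m / f
bfl = -n_m * M[1][1] / M[0][1]       # distance from V2 to the back focal point
ffl = -n_m * M[0][0] / M[0][1]       # distance from the front focal point to V1
h1, h2 = principal_plane_offsets(n_l, n_m, R1, R2, d_l)

print("f from matrix:", f_matrix, " f from (3.59):", focal_length(n_l, n_m, R1, R2, d_l))
print("h1 from (3.60):", h1, " from matrix:", f_matrix - ffl)
print("h2 from (3.61):", h2, " from matrix:", bfl - f_matrix)
```

In this symmetric example $h_1 > 0$ and $h_2 < 0$, so both principal planes lie inside the lens.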

3.6 Stops

An element such as the rim of a lens or a separate diaphragm, which determines the set of
rays that can contribute to the image, is called the aperture stop. An ordinary camera has a
variable diaphragm.
The entrance pupil is the image of the aperture stop by all elements preceding the aperture
stop. If there are no lenses between object and aperture stop, the aperture stop itself is the
entrance pupil. Similarly the exit pupil is the image of the aperture stop by all elements
following it. The entrance pupil determines the cone of light that enters the optical system while
the cone leaving it is determined by the exit pupil.

Figure 3.20: Thick lens geometry.

For any object point, the chief ray is the ray in the cone that passes through the centre
of the entrance pupil, and hence also through the centres of the aperture stop and the exit pupil.
A marginal ray is the ray that for an object point on the optical axis passes through the rim of
the entrance pupil (and hence also through the rims of the aperture stop and the exit pupil).
The field stop determines the size of the object that can be imaged. The field stop could
for example be the edge of a CCD detector.
For a fixed diameter $D$ of the exit pupil and fixed object distance $s_o$, the magnification of
the system is according to (3.42) and (3.40) given by $M = x_i/f = f/x_o$. It follows that when
$f$ is increased, the magnification increases. A larger magnification means a lower energy density,
hence a longer exposure time, i.e. the speed of the lens is reduced. Camera lenses are usually
specified by two numbers: the focal length $f$ and the largest diameter $D$ of the exit pupil. The
f-number is the ratio of the focal length to this diameter:
\[
\text{f-number} = \# = \frac{f}{D}, \qquad \text{written as } f/\#. \tag{3.63}
\]
For example, $f/2$ means $f = 2D$. Since the exposure time is proportional to the square of the
f-number, an f/1.4 lens is twice as fast as an f/2 lens.
The power of a lens is defined by
\[
\text{Power} = \mathcal{D} = \frac{n_m}{f}, \tag{3.64}
\]
where $n_m$ is the refractive index of the surrounding medium. The unit of the power is "Diopter"
when $f$ is specified in meters. Hence a lens in air with power 20 Diopter has focal distance 5 cm. The
power is positive for a positive (i.e. convergent) lens and negative for a negative (i.e. divergent)
lens.
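
The two numerical statements above (an f/1.4 lens is about twice as fast as an f/2 lens, and 20 Diopter corresponds to 5 cm in air) can be reproduced in a few lines. This is a small illustrative sketch; the function names are hypothetical.

```python
def f_number(focal_length, pupil_diameter):
    """f-number of Eq. (3.63): ratio of focal length to exit pupil diameter."""
    return focal_length / pupil_diameter


def exposure_time_ratio(f_number_a, f_number_b):
    """Exposure time of lens a relative to lens b, using the statement
    that the exposure time is proportional to the square of the f-number."""
    return (f_number_a / f_number_b) ** 2


def power_diopter(focal_length_m, n_medium=1.0):
    """Power of Eq. (3.64) in Diopter, focal length in metres."""
    return n_medium / focal_length_m


# An f/2 lens needs (2/1.4)^2, i.e. roughly twice, the exposure time of an f/1.4 lens.
print("exposure ratio f/2 vs f/1.4:", exposure_time_ratio(2.0, 1.4))

# A lens in air with focal distance 5 cm has a power of 20 Diopter.
print("power of f = 0.05 m lens:", power_diopter(0.05))

# A 50 mm lens with a 25 mm exit pupil is an f/2 lens.
print("f-number:", f_number(0.050, 0.025))
```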

3.7 Beyond Gaussian Geometrical Optics


For designing advanced optical systems gaussian geometrical optics is not sufficient. Instead, rays
must be traced more accurately using software based on Snell's Law without using the paraxial
approximation, i.e. with the sines of the angles of incidence and refraction. Many thousands of
rays are sometimes traced to evaluate the quality of an image.
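
To get a feeling for when the paraxial approximation breaks down, one can compare the exact and the linearised form of Snell's Law for a single refraction. The sketch below assumes refraction from air ($n_i = 1$) into glass ($n_t = 1.5$); these values and the function names are illustrative choices.

```python
import math


def refract_exact(theta_i, n_i, n_t):
    """Angle of refraction from Snell's Law, n_i sin(theta_i) = n_t sin(theta_t)."""
    return math.asin(n_i * math.sin(theta_i) / n_t)


def refract_paraxial(theta_i, n_i, n_t):
    """Paraxial form n_i theta_i = n_t theta_t."""
    return n_i * theta_i / n_t


def relative_error(theta_i, n_i=1.0, n_t=1.5):
    """Relative error of the paraxial angle of refraction."""
    exact = refract_exact(theta_i, n_i, n_t)
    return abs(refract_paraxial(theta_i, n_i, n_t) - exact) / exact


for degrees in (1, 5, 10, 20, 30):
    theta = math.radians(degrees)
    print(f"{degrees:2d} deg: relative error of paraxial angle = {relative_error(theta):.2e}")
```

The error grows roughly with the square of the angle of incidence, which is why rays far from the axis need the exact law.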

3.7.1 Aberrations
When non-paraxial rays are traced, it is found that in general these do not intersect the ideal
gaussian image point. Instead of a single spot, a spot diagram is found which is more or less
confined. The deviation from an ideal point image is quantified in terms of aberrations. One
distinguishes between monochromatic and chromatic aberrations. The latter are caused by the
fact that the refractive index depends on wavelength. Recall that in paraxial geometrical optics
Snell's Law (2.105) is replaced by: $n_i \theta_i = n_t \theta_t$, i.e. $\sin\theta_i$ and $\sin\theta_t$ are replaced by the linear
terms. If instead one retains the first two terms of the Taylor series of the sine, the errors in the
imaging can be quantified by five monochromatic aberrations, the so-called primary or Seidel aberrations.
The best known is spherical aberration, which is caused by the fact that for a convergent
spherical lens rays that make a large angle with the optical axis are focused closer to the lens
than the paraxial rays (see Fig. 3.22). Distortion is another of the five primary
aberrations. It causes deformation of images due to the fact that the magnification depends on
the distance of the object point to the optical axis.
For high quality imaging the aberrations have to be minimized by optimising the curvatures
of the surfaces, the thicknesses of the lenses and the distances between the lenses. Often more
lenses have to be added to reduce the aberrations and it may also be necessary to introduce
lenses with an aspherical shape. Systems with very small aberrations are extremely expensive.

Figure 3.21: Aperture stop (A.S.) between the second and third lens, with entrance pupil and
exit pupil (in this case these pupils are virtual images of the aperture stop). Also shown are the chief
ray and the marginal ray.

Figure 3.22: Spherical aberration of a planar-convex lens.
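
Spherical aberration can be made visible with an exact ray trace. The sketch below is a simplified illustration and not the planar-convex lens of Fig. 3.22: it assumes a single convex spherical surface of radius $R$ between air and glass of index $n$, with rays incident parallel to the optical axis at height $h$. All names and values are assumptions for this example.

```python
import math


def axis_crossing(h, R=1.0, n=1.5):
    """Distance from the vertex at which a ray, incident parallel to the axis
    at height h, crosses the axis after exact refraction at the sphere."""
    theta_i = math.asin(h / R)                  # angle of incidence at the surface
    theta_t = math.asin(math.sin(theta_i) / n)  # Snell's Law, air to glass
    sag = R - math.sqrt(R * R - h * h)          # axial position of the hit point
    # After refraction the ray makes the angle (theta_i - theta_t) with the axis.
    return sag + h / math.tan(theta_i - theta_t)


def paraxial_focus(R=1.0, n=1.5):
    """Paraxial focus of the surface, which follows from Eq. (3.54)."""
    return n * R / (n - 1.0)


print("paraxial focus:", paraxial_focus())
for h in (0.01, 0.1, 0.3, 0.5):
    print(f"h = {h:4.2f}: ray crosses the axis at {axis_crossing(h):.4f}")
```

Rays at larger height cross the axis closer to the surface than the paraxial focus, which is the behaviour described above for a convergent spherical lens.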
Figure 3.23: Lithographic lens system for DUV (193 nm). It costs more than half a million euros.

3.8 Beyond Geometrical Optics

Aberrations can be quantified by analysing the spot diagram or, alternatively, by considering
the wave front in image space converging to a point image. When a point object is imaged,
ideally the transmitted wave front at the exit pupil is part of a perfect sphere with centre the
gaussian image point. Aberrations cause the wave front to deviate from a perfect sphere.
According to a generally accepted criterion formulated first by Rayleigh, aberrations start to
deteriorate images considerably if the wavefront aberrations cause path length differences of
more than a quarter of the wavelength. When the aberrations are less than this, the system is
called diffraction limited. Even if the wave transmitted by the exit pupil would be perfectly
spherical, the wave front is only part of a sphere since the field is limited by the aperture. An
aperture causes diffraction, i.e. bending and spreading of the light. When one images a point
object on the optical axis, diffraction causes the light distribution called the Airy spot as shown
in Fig. 3.25. The Airy spot has full-width at half maximum:
\[
\text{FWHM} = 1.6 \, \frac{\lambda s_i}{D}, \tag{3.65}
\]
where $D$ is the diameter of the exit pupil and $s_i$ is the image distance as predicted by gaussian
geometrical optics. Diffraction depends on the wavelength and hence it cannot be described by
geometrical optics which applies to the limit of vanishing wavelength. We will treat diffraction
by apertures in Chapter 7.

Figure 3.24: EUV (Extreme UV) lithographic machine from ASML. Mirrors replace
the lenses because lenses are not available for this wavelength.

Figure 3.25: Left: intensity of the Airy pattern; right: cross section of the Airy pattern (amplitude).

Chapter 4

Optical Instruments

What you should know and be able to do after studying this chapter.

• Understand the working principle of a camera.

• Understand the principle of the eye and its accommodation with the near
and far point.

• The working of eye glasses.

• Know the principle of the magnifier and the eyepiece and their use in the
microscope and telescope.

• Understand the microscope and the telescope concept and the angular
magnification in both cases.

After having studied the laws of gaussian geometrical optics, we are able to build more complex
systems based on optical elements such as lenses and reflectors. In this chapter we describe the
most common systems.

4.1 The Camera Obscura


The camera obscura or pinhole camera is the simplest image forming system. It consists of a
closed box with a pinhole on one side. An inverted image is cast on the opposite side of the box.
If the hole is too large, the image is very blurred, but by reducing the diameter of the aperture,
the image becomes sharper, reaching a best image for a specific ratio of pinhole diameter to
screen distance. If the hole gets too small, diffraction (Chapter 7) will render the image blurred again.
The camera obscura can form images of objects across an extremely wide angular field due to
great depth of focus and over a large range of distances (great depth of field). If film is
used to record the image, very long exposure times are needed because only a small amount of
light enters the pinhole (f-number ≈ f/500).

4.2 The Camera


In Fig. 4.2 a single-lens reflex (SLR) camera is shown. After traversing the first few lens elements,
the light passes through an iris diaphragm with adjustable diameter with which the f-number
can be changed. On emerging from the lens, the light is reflected by a movable mirror tilted at
45° and then goes up through a prism, and goes out via the finder eyepiece. When the shutter
is released, the diaphragm closes to a preset value, the mirror swings up and the CCD or film is
exposed. To focus the camera the entire lens is moved toward or away from the detection plane.
Autofocus is based on maximizing the contrast of the images.

Figure 4.1: Left: The principle of the camera obscura and its use by a painter. It is suspected
that Vermeer used a camera obscura. Right: An example of an image with a view of Central
Park (NY) looking north in spring. © Abelardo Morell

The angular field of view (AFOV) is defined for scenes at large distances. The AFOV is the
angle subtended at the lens by the detector area when the image distance is the focal length
$f$. The AFOV decreases when $f$ increases. A standard SLR has a focal length of around 6
cm and the AFOV is then between 40° and 50°. The horizontal field of view (HFOV) is related
to the AFOV as shown in Fig. 4.3b. More complex systems of lenses can have a variable
focal distance, so they are able to zoom into a scene. To achieve such an effect, the lenses are
translated. Fig. 4.4 shows the effects on the picture of changing the distance between the object
and the camera at constant focal length, and of changing the focal length for a fixed position of
the observer.

Figure 4.2: Inside of a reflex camera. When taking an image the mirror will swing up and light
will go to the sensor instead. 1) Camera lens, 2) Reflex mirror, 3) Focal-plane shutter, 4) Image
sensor, 5) Matte focusing screen, 6) Condenser lens, 7) Pentaprism/pentamirror, 8) Viewfinder
eyepiece.

The depth of focus is determined by the diaphragm. When the aperture is wide open, rays
coming from different objects at various distances will not all be focused on the screen (see
Fig. 4.5) and the image is blurred. When the aperture is reduced, this effect is less and therefore
a smaller diaphragm implies a larger depth of focus. The drawback is that less light reaches the
sensor, therefore a longer exposure time is needed.
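
The definition of the angular field of view given earlier in this section translates directly into a formula: a detector of size $h$ at image distance $f$ subtends the angle $2\arctan(h/2f)$ at the lens. The sketch below evaluates it for an assumed 36 mm × 24 mm detector; the detector size and the function name are assumptions made for this example and are not taken from the notes.

```python
import math


def field_of_view_deg(detector_size_mm, focal_length_mm):
    """Angle (in degrees) subtended at the lens by a detector of the given
    size when the image distance equals the focal length."""
    return math.degrees(2.0 * math.atan(detector_size_mm / (2.0 * focal_length_mm)))


# Assumed detector: 36 mm wide, 24 mm high.
width_mm, height_mm = 36.0, 24.0
diagonal_mm = math.hypot(width_mm, height_mm)

for f_mm in (28.0, 50.0, 60.0, 200.0):
    horizontal = field_of_view_deg(width_mm, f_mm)
    diagonal = field_of_view_deg(diagonal_mm, f_mm)
    print(f"f = {f_mm:5.0f} mm: horizontal {horizontal:5.1f} deg, diagonal {diagonal:5.1f} deg")
```

The output shows the field of view shrinking as the focal length grows, in line with the statement that the AFOV decreases when $f$ increases.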

Figure 4.3: Angular and horizontal field of view (see
http://www.edmundoptics.com/resources/application-notes/imaging/understanding-focal-length-and-field-of-view/).

Figure 4.4: Effect of the focal length of the lens system and of the distance of the object in the
pictures taken.


Figure 4.5: Left: the ray diagram with rays showing the origin of the depth of focus. Right: two
pictures taken with different apertures. In the top image, only the black ball appears sharp, the
background is blurred; in the bottom image all balls appear sharp.

4.2.1 Camera in a Mobile Phone


Mobile phone cameras contain standard Double Gauss or Cooke triplet lenses but sometimes also more
advanced aspheres. The image sensor is CMOS. A digital zoom is obtained by cropping the central area
with the same aspect ratio as the original. This means that the lens is not moved as in an optical
zoom and no resolution is gained.

4.3 The Human Eye


The eye is a fascinating optical instrument. It is made of an almost spherical (24 mm long and
22 mm across) gelatinous substance called the vitreous humour with refractive index 1.337,
surrounded by a white shell, the sclera (Fig. 4.6). At the front the sclera has an opening with
a transparent lens called the cornea having index of refraction 1.376. Most of the bending of
the rays takes place at the air-cornea interface and this is why it is difficult to see clearly under
water ($n_{\text{water}} = 1.33$). Once the rays have passed the cornea, they reach the aqueous humour
($n \approx 1.336$) with the iris in it. The opening of the iris, the pupil, can expand or contract from
2 mm (bright sun) to 8 mm (low light) diameter to adapt to the light intensity. The iris also gives
colour to the eye. After this, the rays reach the flexible crystalline lens which has the size of a
bean (9 mm in diameter, and 4 mm thick in unaccommodated condition). Its index of refraction
varies from 1.406 in its centre to 1.386 at the edge.

Figure 4.6: Cross section of a human eye

Figure 4.7: Left: Optical rays showing how an eye accommodates by changing its focal length.
Right: Relaxed and contracted muscle at the crystalline lens needed for this accommodation.

The entire eye can effectively be treated as two lenses in contact, of which the crystalline
lens can change its focal length. In unaccommodated condition, the front focal distance of the
two-lens system is $f_1 = 16$ mm as measured from the cornea and the back focal distance is equal
to the length of the eye: $f_2 \approx 24$ mm. These focal distances are different because the refractive
indices of the surrounding media (air and vitreous humour) differ. The power of the intact
unaccommodated eye lens system is according to (3.64):
\[
\mathcal{D} = \frac{n_{vh}}{f_2} = \frac{1.337}{0.0243\ \text{m}} = 55 \text{ Diopter}. \tag{4.1}
\]

4.3.1 Accommodation
In relaxed condition the lens focuses light coming from infinity on the retina. When the object
is closer, the eye muscles contract due to which the crystalline lens becomes more convex and
the focal length of the system decreases as seen in Fig. 4.7 (right). At a certain point, the object
will be too close to be focused on the retina: this is called the near point of the eye. Due to the
loss of elasticity of the muscle, the near point distance moves from 7 cm for teenagers to 100 cm for
60-year-olds. Fig. 4.7 shows the optical rays entering the eye for two configurations: an object
at infinity and an object nearby. The so-called normal near point is at 25 cm. The far point
is the furthest object point that is imaged on the retina by the unaccommodated eye. For
the normal eye the far point is at infinity.

4.3.2 Retina
The retina is composed of approximately 125 million photoreceptor cells: the rods and the
cones. The rods are highly sensitive black and white (intensity) sensors, while the cones are
colour sensitive for the wavelengths 390 nm - 780 nm. UV light is absorbed by the lens (people
whose lens is removed because of cataract can "see" UV light). The fovea centralis is the most
sensitive centre of the retina with a high density of cones. The eyes move continuously to focus
the image on this area. The information is transferred by the optic nerve, placed at the back
where it causes a blind spot.

4.3.3 Eyeglasses
The eye can suffer from imperfections as seen in Fig. 4.8. We discuss the most common imper-
fections and their solutions.

a. Myopia or Nearsightedness. A myopic eye has too high power so that distant objects are
focused in front of the retina by the unaccommodated eye. The far point is thus not at infinity but
at a finite distance. This can be corrected by a negative lens. Suppose the far point is at 2 m.
If the concave lens makes a virtual image of a distant object at 2 m in front of the
cornea, the unaccommodated eye can see it clearly. The lens law (3.37) with $s_o = \infty$ then implies
$f = s_i = -2$ m. Hence the required power of the lens is:
\[
\mathcal{D} = \frac{1}{f} = -0.5 \text{ Diopter}. \tag{4.2}
\]
The lens is best put in the front focal plane of the eye, i.e. at approximately 16 mm in front of
the cornea. The reason is that in this case the magnification of the eye and the negative lens
together are the same as for the uncorrected eye. To see this, draw a ray from the top of the
object through the centre of the negative lens. This will then be made parallel to the optical
axis by the eye lens and the distance of this ray to the optical axis is the image size on the
retina. This ray will end up at the same point of the retina as when the negative lens is taken
out because it is unrefracted by this lens.
Contact lenses are very close to the eye lens and hence the total power of the eye with a
contact lens is simply the sum of the power of the eye and the contact lens.

b. Hyperopia or Farsightedness. In this case a distant object is imaged by the
unaccommodated eye behind the retina, i.e. the back focal distance of the unaccommodated eye is
larger than the depth of the eye. Close objects cannot be imaged on the retina, hence the near
point is relatively far from the cornea. In order to bend the rays more, a positive lens is placed
in front of the eye. Suppose that a hyperopic eye has a near point at 125 cm. For an object at
$s_o = 25$ cm to have a virtual image at $s_i = -125$ cm so that it can be seen, the focal length
(in meters) must satisfy
\[
\frac{1}{f} = \frac{1}{s_o} + \frac{1}{s_i} = \frac{1}{0.25} - \frac{1}{1.25} = \frac{1}{0.31}, \tag{4.3}
\]
hence the power must be $\mathcal{D} = 1/f = +3.2$ Diopter.

c. Presbyopia. This is the lack of accommodation of the eye such as in people over 40.
It results in an increase in the distance between the near point and the retina. This defect affects
all images. Presbyopia is usually corrected by glasses with progressive correction, the upper part
of the glass for distance vision and the lower part for near vision.

d. Astigmatism. In this case the focal distances for two directions perpendicular to the
optical axis are different. It is attributed to a lack of symmetry of revolution of the cornea. This
is compensated for by using glasses which themselves are astigmatic.
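
The two prescriptions worked out above, for the myopic eye with far point at 2 m and the hyperopic eye with near point at 125 cm, follow from the same lens law. The sketch below simply repeats that arithmetic; the function name is hypothetical, the corrective lens is treated as a thin lens in air, and distances are in metres.

```python
import math


def lens_power(s_o, s_i):
    """Power 1/f in Diopter from the lens law 1/f = 1/s_o + 1/s_i.

    s_o is the object distance and s_i the image distance (negative for a
    virtual image on the object side), both in metres.
    """
    return 1.0 / s_o + 1.0 / s_i


# Myopia: a distant object (s_o = infinity) must get a virtual image at the
# far point, 2 m in front of the lens.
myopia_power = lens_power(math.inf, -2.0)

# Hyperopia: an object at 25 cm must get a virtual image at the near point,
# 125 cm in front of the lens.
hyperopia_power = lens_power(0.25, -1.25)

print("myopia correction:   ", myopia_power, "Diopter")
print("hyperopia correction:", hyperopia_power, "Diopter")
```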

4.3.4 New Correction Technique


In recent years, to correct eye defects such as myopia and astigmatism, the local curvature of
the surface of the cornea is changed using an excimer laser. The laser is computer steered and
causes photo-ablation in appropriate parts of the cornea.

Figure 4.8: Correction of nearsighted (left) and farsighted (right) eye.

4.4 Magnifying glass
The image on the retina can be increased by bringing the object closer to the eye (reduce $s_o$ at
fixed $s_i$). But $s_o$ cannot be smaller than the near point $d_o$, which we take here to be 25 cm.
It is desirable to use a lens that makes a magnified erect image at a distance to the eye greater
than $d_o$. This can be achieved by a positive lens with the object closer to the lens than the front
focal point, thereby producing a magnified virtual image. An example is given in Fig. 4.10.

Figure 4.9: Example of a positive lens used as a magnifying glass.

4.4.1 Magnifying power


The magnifying power MP or angular magnification Ma is defined as the ratio of the size
of the retinal image obtained with the instrument over the size of the retinal image as seen by
the unaided eye at normal viewing distance do . To estimate the size of the retinal image we
compare in both cases where the chief ray through the top of the object and the centre
of the pupil of the eye hits the retina. Since the distance between the lens and the retina is
fixed, the ratio of the image size on the retina for the eye with and without magnifying glass is:
\[
MP = \frac{\alpha_a}{\alpha_u}, \tag{4.4}
\]
where $\alpha_a$ and $\alpha_u$ are the angles between the optical axis and the chief rays for the aided and
the unaided eye, as shown in Fig. 4.10. Working with these angles instead of distances is
particularly useful when the (virtual) image of the magnifying glass is at infinity (see below).
Using $\alpha_a \approx y_i/L$ and $\alpha_u \approx y_o/d_o$, with $y_i$ and $y_o$ positive and $L$ the positive distance from the
image to the eye, we have
\[
MP = \frac{y_i\, d_o}{y_o\, L}. \tag{4.5}
\]
Since $s_i < 0$ we have
\[
\frac{y_i}{y_o} = -\frac{s_i}{s_o} = 1 + \frac{-s_i}{f},
\]
where we used the Lens Equation for the magnifying glass. We have $-s_i = |s_i| = L - l$, where
$l$ is the distance between the magnifying glass and the eye. Hence, (4.5) becomes:
\[
MP = \frac{d_o}{L}\left[1 + \frac{L - l}{f}\right] = \frac{d_o}{L}\left[1 + D\,(L - l)\right], \tag{4.6}
\]
where $D$ is the power of the magnifying glass. We can distinguish three situations:

Figure 4.10: An unaided view (top) and an aided view using a magnifier (bottom).

1. $l = f$: the magnifying power is $MP = d_o D$ for every $L$.

2. $l = 0$: then $MP = d_o/L + d_o D$, so the largest value of MP corresponds to the smallest $L$,
which is $L = d_o$ because the image cannot be closer to the eye than the near point:
\[
MP|_{l=0,\,L=d_o} = d_o D + 1. \tag{4.7}
\]

3. The object is at the focal point of the magnifier ($s_o = f$), so the virtual image is at infinity
($L = \infty$):
\[
MP|_{L=\infty} = d_o D \tag{4.8}
\]
for every distance $l$ between the eye and the magnifying glass. The rays are parallel so that
the eye views the object in a relaxed way. This is the most common use of the magnifier.

In practice $d_o D = d_o/f$ is much larger than 1 so that MP is similar in the three cases.
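
The three situations can be compared numerically with a minimal sketch of Eq. (4.6). The function name, the 10 D example lens and the chosen distances are assumptions made for illustration only.

```python
import math


def magnifying_power(D, L, l, d_o=0.25):
    """Eq. (4.6): MP = (d_o / L) * [1 + D * (L - l)].

    D: lens power in dioptre, L: image-to-eye distance in m (may be math.inf),
    l: lens-to-eye distance in m, d_o: near-point distance in m.
    """
    if math.isinf(L):
        # Limit L -> infinity of Eq. (4.6): the dependence on l drops out.
        return d_o * D
    return (d_o / L) * (1.0 + D * (L - l))


D = 10.0      # hypothetical 10 D magnifier, so f = 0.1 m
f = 1.0 / D
case_1 = magnifying_power(D, L=0.5, l=f)            # l = f, arbitrary L
case_2 = magnifying_power(D, L=0.25, l=0.0)         # l = 0 and L = d_o
case_3 = magnifying_power(D, L=math.inf, l=0.05)    # image at infinity
print(case_1, case_2, case_3)
```

For this lens the first and third case give $d_o D = 2.5$ and the second gives $d_o D + 1 = 3.5$, up to rounding.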

4.4.2 Nomenclature
Magnifiers are normally specified by their magnifying power when $L = \infty$: for example,
a magnifier with a power of 10 D has an MP equal to 2.5 and is called 2.5×. In other
words, with the object in the focal plane of the lens, the retinal image is 2.5 times larger than it
would be if the object were at the near point of the unaided eye.

4.5 Eyepieces
An eyepiece or ocular is a magnifier used before the eye at the end of another optical instrument
such as a microscope or a telescope. The eye looks into the ocular and the ocular "looks" into
the optical instrument. The ocular provides a magnified virtual image of the image produced by
the optical instrument. Similar to the magnifying glass, the virtual image should preferably be
at or near infinity to be viewed by a relaxed eye. Several types of eyepieces exist and most of
them are made out of two lenses: 1. the field lens, which is the first lens in the ocular; 2. the
eye-lens, which is closest to the eye at a fixed distance called the eye relief. The aperture of
the eyepiece is controlled by a field stop. An example is given in Fig. 4.11. You can play with
the eyepieces that are commonly used at the following address: Oculaires

Figure 4.11: Example of an eyepiece used in situ: 1) real image 2) field diaphragm 3) eye relief
4) exit pupil

4.6 The Compound Microscope


A magnifier alone can provide very high magnification only at the cost of intolerable aberrations.
The compound microscope is a magnifier of close objects with a high angular magnification,
generally more than 30×. It was invented by Zacharias Janssen (Middelburg 1590). The
first element of the compound microscope is an objective (in Fig. 4.12 a simple positive lens)
which makes a real, inverted and magnified image of the object in the front focal plane of an
eyepiece (where there is also the field stop). The eyepiece makes a virtual image at infinity
as explained above. The magnifying power of the entire system is the product of the transverse
linear magnification of the objective $M_{To}$ and the angular magnification of the eyepiece $M_{Ae}$:
\[
MP = M_{To}\, M_{Ae}. \tag{4.9}
\]

Figure 4.12: Simple compound microscope. The objective forms a real image of a nearby object.
The eyepiece enlarges this intermediate image. The final image can be bigger than the barrel of
the device since it is virtual.

According to (3.42), $M_T = -x_i/f$, where $x_i$ is the distance of the image made by the objective
to its back focal plane. We have $x_i = L$, the tube length, i.e. the distance between the
second focus of the objective and the first focus of the eyepiece. The tube length is standardized
at 16 cm. We can then write:
\[
MP = -\frac{x_i}{f_o}\,\frac{d_o}{f_e} = -\frac{16}{f_o}\,\frac{25}{f_e}, \tag{4.10}
\]
with the standard near-point $d_o = 25$ cm and the focal lengths $f_o$ and $f_e$ expressed in cm. As an
example, an Amici objective gives 40× and combined with a 10× eyepiece one gets $|MP| = 400$.
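
As a quick numerical illustration of (4.10), the sketch below evaluates the two factors separately. The function name and the focal lengths (0.4 cm and 2.5 cm) are hypothetical values chosen so that the objective is 40× and the eyepiece 10×; they are not taken from the notes.

```python
def microscope_magnifying_power(f_objective_cm, f_eyepiece_cm,
                                tube_length_cm=16.0, near_point_cm=25.0):
    """Eq. (4.10): MP = -(x_i / f_o) * (d_o / f_e), all lengths in cm."""
    m_objective = -tube_length_cm / f_objective_cm  # transverse magnification M_To
    m_eyepiece = near_point_cm / f_eyepiece_cm      # angular magnification M_Ae
    return m_objective * m_eyepiece


# Hypothetical focal lengths: 16 / 0.4 = 40 for the objective, 25 / 2.5 = 10 for the eyepiece.
mp = microscope_magnifying_power(f_objective_cm=0.4, f_eyepiece_cm=2.5)
print(f"MP = {mp:.0f}")
```

The negative sign of the result reflects that the objective forms an inverted image.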

The numerical aperture of a microscope is a measure of the capability to gather light from
the object. It is defined by:
\[
\mathrm{NA} = n_i \sin\theta_{\max} \tag{4.11}
\]
with $n_i$ the refractive index of the immersing medium, usually air but it could be water or oil,
and $\theta_{\max}$ the half-angle of the maximum cone of light accepted by the lens. The numerical
aperture is the second number etched in the barrel of the objective. It ranges from 0.07 (low-
power objectives) to 1.4 for high-power objectives. In Chapter 7 it will be explained that the
resolving power is proportional to the NA, i.e. the minimum transverse distance between
two object points that can be resolved in the image is inversely proportional to the NA.
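
A small sketch of (4.11) shows why values of NA above 1 require an immersion medium. The function names are illustrative, and the refractive index 1.515 used for immersion oil is an assumed typical value, not a number from the notes.

```python
import math


def numerical_aperture(n_medium, half_angle_deg):
    """Eq. (4.11): NA = n_i * sin(theta_max)."""
    return n_medium * math.sin(math.radians(half_angle_deg))


def max_half_angle_deg(na, n_medium):
    """Acceptance half-angle for a given NA; NA cannot exceed n_medium."""
    if na > n_medium:
        raise ValueError("NA cannot exceed the refractive index of the medium")
    return math.degrees(math.asin(na / n_medium))


print(numerical_aperture(1.0, 30.0))    # dry objective accepting a 30 degree half-angle
print(max_half_angle_deg(1.4, 1.515))   # NA = 1.4 in oil (assumed n = 1.515)
```

Calling `max_half_angle_deg(1.4, 1.0)` raises an error, since in air $\sin\theta_{\max}$ would have to exceed 1.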

(4.12)

4.7 Telescopes
A telescope enlarges the retinal image of a distant object. Like a compound microscope it is
composed of an objective and an eyepiece, as seen in Fig. 4.13. The object in this figure is at a large
but finite distance, therefore an image is formed by the objective just after its second focal point.
The eyepiece makes a virtual magnified image to be viewed with a relaxed eye. Therefore the
intermediate image of the objective must be within the focal length $f_e$ from the eyepiece. The
final image is inverted.

Figure 4.13: Keplerian astronomical telescope, accommodating eye.

As seen earlier, the angular magnification is $MP = \alpha_a/\alpha_u$, where $\alpha_u$ is the half angle of the cone
of light that is collected and $\alpha_a$ is the half angle of the apparent cone of rays. From triangles
$F_{o1}BC$ and $F_{e2}DE$ in Fig. 4.14 we see that
\[
MP = -\frac{f_o}{f_e}. \tag{4.13}
\]
Indeed, in the paraxial approximation $|\alpha_u| \approx BC/f_o$ and $|\alpha_a| \approx DE/f_e$ with $BC = DE$; the
minus sign expresses that the image is inverted.

Figure 4.14: Ray angles for a telescope.

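
Equation (4.13) can be turned into a small design sketch. It assumes, beyond what is stated above, that the telescope is used with object and final image at infinity, so that the lens separation equals $f_o + f_e$; the function names and the numbers (separation 1.05 m, MP = −20) are illustrative.

```python
def telescope_magnifying_power(f_objective, f_eyepiece):
    """Eq. (4.13): MP = -f_o / f_e (both focal lengths in the same unit)."""
    return -f_objective / f_eyepiece


def focal_lengths_from_design(separation, mp):
    """Solve separation = f_o + f_e (assumed afocal setup) and MP = -f_o / f_e."""
    if mp >= 0:
        raise ValueError("a Keplerian telescope has MP < 0 (inverted image)")
    ratio = -mp                                # f_o / f_e
    f_eyepiece = separation / (1.0 + ratio)
    return ratio * f_eyepiece, f_eyepiece


f_o, f_e = focal_lengths_from_design(separation=1.05, mp=-20.0)
print(f_o, f_e, telescope_magnifying_power(f_o, f_e))
```

For these numbers the objective has a focal length of about 1 m and the eyepiece about 5 cm.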
Chapter 5

Polarisation

What you should know and be able to do after studying this chapter.

• Know that different states of polarization are due to the phase difference
between two orthogonal components of the electric field.

• Know that elliptical polarisation is the most general state of polarisation.

• Know that linear polarisation and circular polarisation states are special
cases.

• Be able to show that every elliptical state of polarisation can be written as
the sum of two orthogonal linear polarisation states and also as the sum of
two circular polarisation states: one right, the other left circularly polarised.

• Be able to work with Jones vectors and Jones matrices.

• Know how birefringence is exploited to create wave plates.

• Know the types of wave plate.

• Know how to rotate a state of linear polarisation over a given angle.

• Know how to change linear polarisation into circular and the reverse.

5.1 Polarization states, Jones Vectors, Jones Matrices


We have seen that light is an electromagnetic wave, whose wave equation can be derived from
Maxwell’s equations. Since the electric field is a vector that oscillates in a certain direction, we
say that the wave has a certain polarisation. In this section we look at the different types of
polarisation a wave can have, and how we can manipulate the polarisation of a light beam.
Let us start with Eqs. (2.59) and (2.61). They state that the electric field $\mathbf{E}(\mathbf{r}, t)$ of a plane wave
is always perpendicular to the direction of propagation (given by the direction of the wave vector
$\mathbf{k}$). Let us assume that the wave propagates in the $z$-direction:
\[
\mathbf{k} = \begin{pmatrix} 0 \\ 0 \\ k \end{pmatrix}. \tag{5.1}
\]
Then we know that the electric field vector does not have a $z$-component:
\[
\mathbf{E}(z, t) = \begin{pmatrix} A_x \cos(kz - \omega t + \varphi_x) \\ A_y \cos(kz - \omega t + \varphi_y) \\ 0 \end{pmatrix}. \tag{5.2}
\]

Here, Ax and Ay are the real-valued (positive) amplitudes of each of the electric field components.
While k and ω are fixed in this case, we can vary Ax , Ay , ϕx and ϕy in whatever way we like!
This degree of freedom is why different states of polarisation exist: the state of polarisation is
determined by the ratio of the amplitudes and by the phase difference ϕy −ϕx between
the two orthogonal components of the light wave. Varying the quantity ϕy − ϕx means
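that we are ‘shifting’ $E_y(\mathbf{r}, t)$ with respect to $E_x(\mathbf{r}, t)$¹. Let us consider the electric field in a
fixed plane $z = 0$:
\[
\begin{aligned}
\begin{pmatrix} E_x(0, t) \\ E_y(0, t) \end{pmatrix}
&= \begin{pmatrix} A_x \cos(-\omega t + \varphi_x) \\ A_y \cos(-\omega t + \varphi_y) \end{pmatrix} \\
&= \mathrm{Re}\left\{ \begin{pmatrix} E_x(0) \\ E_y(0) \end{pmatrix} e^{-i\omega t} \right\} \\
&= \mathrm{Re}\left\{ \begin{pmatrix} A_x e^{i\varphi_x} \\ A_y e^{i\varphi_y} \end{pmatrix} e^{-i\omega t} \right\}.
\end{aligned} \tag{5.3}
\]
The complex vector
\[
\mathbf{J} = \begin{pmatrix} E_x(0) \\ E_y(0) \end{pmatrix} = \begin{pmatrix} A_x e^{i\varphi_x} \\ A_y e^{i\varphi_y} \end{pmatrix} \tag{5.4}
\]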
is a vector commonly used to characterize the polarisation state, and it is called the Jones
vector.
Let us see what happens to the electric field vector at a fixed position in space as a function of
time for different choices of $A_x$, $A_y$ and $\varphi_y - \varphi_x$.
a) Linear polarisation: $\varphi_y - \varphi_x = 0$. In this case we have
\[
\mathbf{J} = \begin{pmatrix} A_x \\ A_y \end{pmatrix} e^{i\varphi_x}. \tag{5.5}
\]
Equality of the phases, $\varphi_y = \varphi_x$, means that the field components $E_x(z, t)$ and $E_y(z, t)$ are
in phase: when $E_x(z, t)$ is large $E_y(z, t)$ is large, and when $E_x(z, t)$ is small $E_y(z, t)$ is small.
We can write
\[
\begin{pmatrix} E_x(0, t) \\ E_y(0, t) \end{pmatrix} = \begin{pmatrix} A_x \\ A_y \end{pmatrix} \cos(\omega t - \varphi_x), \tag{5.6}
\]
which shows that for $\varphi_y - \varphi_x = 0$ the electric field simply oscillates in one direction given
by the vector $A_x \hat{\mathbf{x}} + A_y \hat{\mathbf{y}}$. See Fig. 5.1a.
b) Circular polarisation: $\varphi_y - \varphi_x = \pm\pi/2$, $A_x = A_y$. In this case the Jones vector is:
\[
\mathbf{J} = \begin{pmatrix} 1 \\ \pm i \end{pmatrix} A_x e^{i\varphi_x}. \tag{5.7}
\]
The field components $E_x(z, t)$ and $E_y(z, t)$ are $\pi/2$ radians (90 degrees) out of phase: when
$E_x(z, t)$ is large $E_y(z, t)$ is small, and when $E_x(z, t)$ is small $E_y(z, t)$ is large. We can write
for $z = 0$ and with $\varphi_x = 0$:
\[
\begin{pmatrix} E_x(0, t) \\ E_y(0, t) \end{pmatrix}
= \begin{pmatrix} A_x \cos(-\omega t) \\ A_x \cos(-\omega t \pm \pi/2) \end{pmatrix}
= A_x \begin{pmatrix} \cos(\omega t) \\ \pm \sin(\omega t) \end{pmatrix}. \tag{5.8}
\]
We see that the electric field vector moves in a circle. When the wave is approaching
you, i.e. you are looking towards the source, and the electric field is rotating
anticlockwise, the light is called left-circularly polarised (+ sign in (5.8)), and if it rotates
clockwise we call it right-circularly polarised (− sign in (5.8)). See Fig. 5.1b.
¹ KhanAcademy - Polarization of light, linear and circular: Explanation of different polarisation
states and their applications.


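c) Elliptical polarisation: $\varphi_y - \varphi_x = \pm\pi/2$, $A_x$ and $A_y$ arbitrary. The Jones vector is:
\[
\mathbf{J} = \begin{pmatrix} A_x \\ \pm i A_y \end{pmatrix} e^{i\varphi_x}. \tag{5.9}
\]
In this case we get instead of (5.8):
\[
\begin{pmatrix} E_x(0, t) \\ E_y(0, t) \end{pmatrix} = \begin{pmatrix} A_x \cos(\omega t) \\ \pm A_y \sin(\omega t) \end{pmatrix}, \tag{5.10}
\]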

which shows that the electric vector moves along an ellipse with major and minor axes
parallel to the x- and y-axis. When the + sign applies we say that the field is left elliptically
polarised, otherwise it is right elliptically polarised.

d) Elliptical polarisation: $\varphi_y - \varphi_x$ = anything else, $A_x$ and $A_y$ arbitrary. The Jones vector
is now the most general one:
\[
\mathbf{J} = \begin{pmatrix} A_x e^{i\varphi_x} \\ A_y e^{i\varphi_y} \end{pmatrix}. \tag{5.11}
\]
It can be shown² that the electric field vector will always move along an ellipse. The exact
shape and orientation of this ellipse of course vary with $\varphi_y - \varphi_x$ and $A_x$, $A_y$, and, except
when $\varphi_y - \varphi_x = \pm\pi/2$, the major and minor axes of the ellipse are not parallel to the $x$-
and $y$-axis. See Fig. 5.1c.

Remark. Frequently the Jones vector is normalised such that
\[
|J_x|^2 + |J_y|^2 = 1. \tag{5.12}
\]
The normalised vector represents of course the same polarisation state as the unnormalised one.
In general, multiplying the Jones vector by a nonzero complex number does not change the polarisation
state. If we multiply for example by $e^{i\theta}$, this has the same result as changing the instant that
$t = 0$, hence it does not change the polarisation state. In fact:
\[
\mathbf{E}(t) = \mathrm{Re}\left[ e^{i\theta}\, \mathbf{J}\, e^{-i\omega t} \right] = \mathrm{Re}\left[ \mathbf{J}\, e^{-i\omega (t - \theta/\omega)} \right]. \tag{5.13}
\]
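
The classification a) to d) and the remark on normalisation can be illustrated with a minimal sketch. The function names and the tolerance are illustrative choices; Jones vectors are represented as plain tuples of complex numbers. The sketch also treats a phase difference of $\pi$ as linear polarisation, a case not listed above but in which the field likewise oscillates along a single direction.

```python
import cmath
import math


def normalise(jones):
    """Scale a Jones vector (Jx, Jy) so that |Jx|^2 + |Jy|^2 = 1, Eq. (5.12)."""
    jx, jy = jones
    norm = math.sqrt(abs(jx) ** 2 + abs(jy) ** 2)
    return (jx / norm, jy / norm)


def classify(jones, tol=1e-9):
    """Name the polarisation state from Ay/Ax and phi_y - phi_x."""
    jx, jy = jones
    if abs(jx) < tol or abs(jy) < tol:
        return "linear"                       # field along one axis only
    ratio = jy / jx                           # (Ay/Ax) * exp(i (phi_y - phi_x))
    amplitude_ratio = abs(ratio)
    phase_difference = cmath.phase(ratio)     # lies in (-pi, pi]
    if abs(math.sin(phase_difference)) < tol:
        return "linear"                       # phase difference 0 or pi
    if (abs(amplitude_ratio - 1.0) < tol
            and abs(abs(phase_difference) - math.pi / 2) < tol):
        return "left circular" if phase_difference > 0 else "right circular"
    return "elliptical"


state = (1.0, 1.0j)                                      # Eq. (5.7) with the + sign
rescaled = tuple(3.0 * cmath.exp(0.7j) * c for c in state)
print(classify(state), classify(rescaled), classify(normalise(rescaled)))
```

Because only the ratio $J_y/J_x$ enters, multiplying the vector by any nonzero complex number leaves the result unchanged, in line with (5.13).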

External sources in recommended order:

1. KhanAcademy - Polarization of light, linear and circular: Explanation of different
polarisation states and their applications.

2. Hecht 8.1.1-8.1.4

5.2 Creating and manipulating polarisation states


We have seen how Maxwell’s equations allow the existence of plane waves with many different
states of polarisation. But how can we create these states, and how do these states manifest
themselves?
Natural light often does not have a definite polarisation. Instead, the polarisation fluctuates
rapidly with time³. In order to turn such randomly polarised light into linearly polarised light in
a certain direction, we must extinguish the light polarised in the perpendicular direction, so that
the remaining light is linearly polarised along the required direction. One could do this by using
light reflected under the Brewster angle (which extinguishes p-polarised light), or one could let
light pass through a dichroic crystal (which absorbs light polarised perpendicular to its so-called
optic axis⁴). A third method is sending the light through a wire grid polariser, which consists
of a metallic grating with subwavelength slits. Such a grating only transmits the electric field
component that is perpendicular to the slits.

Figure 5.1: Illustration of different types of polarisation. (a) Linear polarisation, $\varphi_y - \varphi_x = 0$;
(b) circular polarisation, $\varphi_y - \varphi_x = \pi/2$; (c) elliptical polarisation, $\varphi_y - \varphi_x = \pi/4$.
The red lines indicate the field components $E_x$, $E_y$. The blue line indicates the vector $\mathbf{E}$.
The black line indicates the trajectory of $\mathbf{E}(t)$.

² See Hecht 8.1.3 ‘Elliptical polarisation’.
³ Hecht, §8.1.4 ‘Natural light’.

So suppose that through one such process we have obtained linearly polarised light. How
can we change such a state of linear polarisation into circularly or elliptically polarised light?
Or how can we rotate a state of linear polarisation over a certain angle? Well, we have seen
that the polarisation state depends on the ratio of the amplitudes and on the phase difference
ϕy − ϕx of the orthogonal components Ey and Ex of the electric field. Thus, to change linearly
polarised light to some other state of polarisation, we must introduce one phase shift (say ∆ϕx ) to
one component (say Ex ), and another phase shift ∆ϕy to the orthogonal component Ey . We can
achieve this with birefringent crystals, such as calcite⁵. What is special about these crystals is
that they have two different refractive indices: light polarised in one certain direction experiences
a refractive index $n_o$, while light polarised perpendicular to it experiences another refractive index
$n_e$ (the subscripts o and e stand for ‘ordinary’ and ‘extraordinary’, but for our purpose we do
not need to understand this terminology). The direction for which the refractive index is smallest
(which can be either $n_o$ or $n_e$) is called the fast axis (since its phase velocity is largest), and
the other direction is the slow axis. Because there are two different refractive indices, one
can see double images through a birefringent crystal⁶. The difference between the two refractive
indices $\Delta n = n_e - n_o$ is called the birefringence.
⁴ Hecht §8.3.2 ‘Dichroic Crystals’.
⁵ Hecht §8.4 ‘Birefringence’.
⁶ Double Vision - Sixty Symbols: Demonstration of double refraction by a calcite crystal due to birefringence.

Suppose that the fast axis corresponds to $n_e$, and is aligned with $E_y$, while the slow axis
(which then is $n_o$) is aligned with $E_x$. If the wave travels a distance $d$ through the crystal, then
$E_y$ will accumulate a phase $\Delta\varphi_y = \frac{2\pi n_e}{\lambda} d$, and $E_x$ will accumulate a phase
$\Delta\varphi_x = \frac{2\pi n_o}{\lambda} d$. Thus, the phase difference $\varphi_y - \varphi_x$ has increased by
\[
\Delta\varphi_y - \Delta\varphi_x = \frac{2\pi}{\lambda}\, d\, (n_e - n_o). \tag{5.14}
\]
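
To get a feeling for the thicknesses involved, the sketch below evaluates (5.14) and inverts it for the thinnest plate that gives a chosen phase difference. The function names are illustrative, and the refractive indices $n_o \approx 1.658$, $n_e \approx 1.486$ are approximate values for calcite near 589 nm quoted here as an assumption, not numbers from the notes.

```python
import math


def phase_difference(thickness_m, n_e, n_o, wavelength_m):
    """Eq. (5.14): phase gained by E_y relative to E_x after the crystal."""
    return 2.0 * math.pi * thickness_m * (n_e - n_o) / wavelength_m


def thinnest_plate(retardation_rad, n_e, n_o, wavelength_m):
    """Smallest thickness for which |phase difference| equals retardation_rad."""
    return retardation_rad * wavelength_m / (2.0 * math.pi * abs(n_e - n_o))


# Assumed, approximate values for calcite at a wavelength of 589 nm.
N_O, N_E, WAVELENGTH = 1.658, 1.486, 589e-9

d_quarter = thinnest_plate(math.pi / 2, N_E, N_O, WAVELENGTH)
print(f"thinnest quarter-wave thickness: {d_quarter * 1e6:.2f} micrometre")
print(phase_difference(d_quarter, N_E, N_O, WAVELENGTH))  # negative: n_e < n_o
```

Under these assumptions the thickness comes out below one micrometre, and the sign of the phase difference is negative because $n_e < n_o$.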

5.2.1 Jones Matrices


By letting light pass through crystals of different thicknesses $d$, we can create different phase
differences between the orthogonal field components, and in this way we can create different states
of polarisation⁷. To be specific, let $J$, as given by (5.4), be the Jones vector before the crystal.
Then the Jones vector after passage through the crystal is

$$\tilde{J} = M J, \tag{5.15}$$

where

$$M = \begin{pmatrix} e^{\frac{2\pi i}{\lambda} d n_o} & 0 \\ 0 & e^{\frac{2\pi i}{\lambda} d n_e} \end{pmatrix} = e^{\frac{2\pi i}{\lambda} d n_o} \begin{pmatrix} 1 & 0 \\ 0 & e^{\frac{2\pi i}{\lambda} d (n_e - n_o)} \end{pmatrix}. \tag{5.16}$$

A matrix such as $M$, which transforms one state of polarisation into another, is called a Jones
matrix. Depending on the phase difference which a wave accumulates by travelling through the
crystal, these devices are called quarter-wave plates (phase difference π/2), half-wave plates
(phase difference π), or full-wave plates (phase difference 2π). The applications of these wave
plates will be discussed in later sections.
Consider as an example the Jones matrix which describes the change of linearly polarised light
into circularly polarised light. Assume that we have diagonally (linearly) polarised light, so that

$$J = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{5.17}$$

We want to change it to circularly polarised light, for which

$$J = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix}, \tag{5.18}$$

where one can check that indeed $\varphi_y - \varphi_x = \pi/2$. We can do this by passing the light through
a crystal such that $E_x$ accumulates no phase difference, and $E_y$ accumulates a phase difference
π/2. We can write this transformation as

$$\underbrace{\begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}}_{M} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix}. \tag{5.19}$$

Here, the matrix $M$ is the Jones matrix describing the operation of a quarter-wave plate⁸.
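The transformation (5.19) can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `apply` and `retarder` are hypothetical and Jones vectors are represented as plain tuples.

```python
import cmath
import math


def apply(matrix, vector):
    """Multiply a 2x2 Jones matrix with a Jones vector (both as nested tuples)."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)


def retarder(delta):
    """Jones matrix diag(1, exp(i*delta)): E_y gains a phase delta relative to E_x."""
    return ((1, 0), (0, cmath.exp(1j * delta)))


s = 1 / math.sqrt(2)
diagonal = (s, s)                                   # Eq. (5.17)
circular = apply(retarder(math.pi / 2), diagonal)   # should match Eq. (5.18)

# Relative phase phi_y - phi_x of the output state.
relative_phase = cmath.phase(circular[1]) - cmath.phase(circular[0])
```

Up to rounding, `circular` equals $(1, i)/\sqrt{2}$ and `relative_phase` equals π/2.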
Another important Jones matrix is the rotation matrix. In the preceding discussion we
have assumed that the fast and slow axes were aligned with the $y$- and $x$-direction (i.e. they
were parallel to $E_y$ and $E_x$). But what happens if we rotate our wave plate such that its fast and
slow axes no longer coincide with $y$ and $x$, but rather with some other $y'$ and $x'$ as in Fig. 5.2?
In that case we need to apply a basis transformation: the electric field vector which is expressed
in the $(x, y)$ basis should first be expressed in the $(x', y')$ basis before we apply the Jones matrix
⁷ Hecht §8.7 ‘Retarders’.
⁸ Hecht §8.13.2 ‘The Jones Vectors’, §8.13.3 ‘The Jones and Mueller Matrices’.


of the wave plate to it. After we have applied the Jones matrix, we must transform the electric
field vector back from the $(x', y')$ basis to the $(x, y)$ basis. By referring to Fig. 5.2 we see that

$$\begin{pmatrix} E_{x'} \\ E_{y'} \end{pmatrix} = \begin{pmatrix} E_x \cos\theta + E_y \sin\theta \\ -E_x \sin\theta + E_y \cos\theta \end{pmatrix} = R_{-\theta} \begin{pmatrix} E_x \\ E_y \end{pmatrix}, \tag{5.20}$$

where $E_{x'} = A_{x'} e^{i\varphi_{x'}}$ and $E_{y'} = A_{y'} e^{i\varphi_{y'}}$ are the components of $J$ on the $(x', y')$ basis, and
$R_{-\theta} = R_\theta^{-1}$, with $R_\theta$ the rotation matrix over an angle θ in the counter-clockwise direction⁹:

$$R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \tag{5.21}$$

(That $R_\theta$ indeed is a rotation over angle θ counter-clockwise is easy to see by considering what
happens when $R_\theta$ is applied to the vector $(1, 0)^T$.)
If $M$ is the Jones matrix defined in (5.16), then the matrix $M_\theta$ for the
same wave plate but with $x'$ as slow and $y'$ as fast axis is given by

$$M_\theta = R_\theta M R_{-\theta}. \tag{5.22}$$

For more information on basis transformations, see Appendix F.
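The basis transformation of Eq. (5.22) is easy to try out numerically. The sketch below is only an illustration with hypothetical helper names; it rotates the quarter-wave matrix of Eq. (5.19) by 90 degrees, which should simply swap the roles of the $x$- and $y$-axes.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def rotation(theta):
    """Counter-clockwise rotation matrix R_theta of Eq. (5.21)."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))


def rotated_plate(m, theta):
    """Eq. (5.22): M_theta = R_theta M R_{-theta}."""
    return matmul(rotation(theta), matmul(m, rotation(-theta)))


quarter_wave = ((1, 0), (0, 1j))                     # matrix M of Eq. (5.19)
swapped = rotated_plate(quarter_wave, math.pi / 2)   # axes rotated by 90 degrees
```

Up to rounding errors, `swapped` is the diagonal matrix with entries $i$ and 1: the phase factor $i$ now acts on $E_x$ instead of $E_y$.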

Figure 5.2: If the wave plate is rotated, the fast and slow axis no longer correspond to $y$ and $x$.
Instead, we have to introduce a new coordinate system $y'$, $x'$.

5.2.2 Linear Polarizers


A polarizer that only transmits horizontally polarised light is described by

$$M = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \tag{5.23}$$

Clearly, if one sends horizontally polarised light through it, all of it will be transmitted. However,
if one sends vertically polarised light through it, nothing will be transmitted. More generally, if
we send through it light that is linearly polarised at an angle α, we get

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \cos\alpha \\ \sin\alpha \end{pmatrix} = \begin{pmatrix} \cos\alpha \\ 0 \end{pmatrix}. \tag{5.24}$$

The amplitude of the transmitted field is reduced by the factor $\cos(\alpha)$, which implies that the
intensity of the transmitted light is reduced by the factor $\cos^2(\alpha)$. This relation is known as
Malus’ law.
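Malus' law can be reproduced directly from the Jones matrix (5.23). The following sketch assumes unit-intensity input light; the function name is hypothetical.

```python
import math


def transmitted_intensity(alpha):
    """Intensity behind a horizontal polariser for unit-intensity light polarised at angle alpha."""
    polariser = ((1, 0), (0, 0))                      # Eq. (5.23)
    j_in = (math.cos(alpha), math.sin(alpha))
    j_out = (
        polariser[0][0] * j_in[0] + polariser[0][1] * j_in[1],
        polariser[1][0] * j_in[0] + polariser[1][1] * j_in[1],
    )
    return abs(j_out[0]) ** 2 + abs(j_out[1]) ** 2


for degrees in (0, 30, 45, 60, 90):
    alpha = math.radians(degrees)
    print(degrees, round(transmitted_intensity(alpha), 3))
```

The printed intensities follow $\cos^2(\alpha)$, falling from 1 at 0 degrees to 0 at 90 degrees.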
⁹ KhanAcademy - Linear transformation examples: Rotations


5.2.3 Quarter-Wave Plates


A quarter-wave plate introduces a phase shift of π/2, so its Jones matrix reads

$$M = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}, \tag{5.25}$$

because $e^{i\pi/2} = i$. To describe the actual transmission through the quarter-wave plate, the
matrix should be multiplied by some global phase factor, but because we only care about the
relative phase difference between the field components, this global phase factor can be omitted
without consequence. The quarter-wave plate is typically used to convert linearly polarised
light to elliptically polarised light and vice versa¹⁰. If we consider a linearly polarised
state at angle α, we can calculate

$$\begin{pmatrix} \cos\alpha \\ i \sin\alpha \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix} \begin{pmatrix} \cos\alpha \\ \sin\alpha \end{pmatrix}. \tag{5.26}$$

In particular, if the incident light is linearly polarised at 45°, or equivalently, if the quarter-wave
plate is rotated over this angle, the plate will transform linearly polarised light into circularly polarised
light (and vice versa):

$$\frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{5.27}$$

A demonstration is shown in¹¹.

5.2.4 Half-Wave Plates


A half-wave plate introduces a phase shift of π, so its Jones matrix reads

$$M = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{5.28}$$

because $e^{i\pi} = -1$. An important application of the half-wave plate is to change the orientation
of linearly polarised light. After all, what this matrix does is mirror the polarisation state
in the $x$-axis. Thus, if we choose our mirroring axis correctly (i.e. if we choose the orientation
of the wave plate correctly), we can change the direction in which the light is linearly polarised
arbitrarily¹². A demonstration is shown in¹³. To give an example: the polarisation of a wave
that is linearly polarised parallel to the $x$-direction can be rotated over an angle α by rotating the crystal such that
the slow axis makes an angle α/2 with the $x$-axis. Upon propagation through the crystal, the field component
along the fast axis acquires an additional phase of π relative to the component along the slow axis, due to which
the electric vector makes an angle α with the $x$-axis
(see Fig. 5.3).
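The example above can be verified with Jones matrices: rotating the half-wave matrix (5.28) over α/2 by means of Eq. (5.22) and applying it to horizontally polarised light should give light polarised at angle α. This is a minimal sketch with hypothetical helper names and an arbitrarily chosen angle of 50 degrees.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def rotation(theta):
    """Counter-clockwise rotation matrix of Eq. (5.21)."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))


def half_wave_plate(theta):
    """Half-wave plate (5.28) whose slow axis makes an angle theta with the x-axis."""
    return matmul(rotation(theta), matmul(((1, 0), (0, -1)), rotation(-theta)))


def apply(matrix, vector):
    """Multiply a 2x2 Jones matrix with a Jones vector."""
    return (
        matrix[0][0] * vector[0] + matrix[0][1] * vector[1],
        matrix[1][0] * vector[0] + matrix[1][1] * vector[1],
    )


alpha = math.radians(50)                       # arbitrary target angle
j_out = apply(half_wave_plate(alpha / 2), (1.0, 0.0))
```

Up to rounding, `j_out` equals $(\cos\alpha, \sin\alpha)$, i.e. linear polarisation rotated over α.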

5.2.5 Full-Wave Plates


A full-wave plate introduces a phase difference of 2π, which is the same as introducing no phase
difference between the two field components, so what can possibly be its use? We need to recall
from Eq. (5.14) that the phase difference is 2π only for one particular wavelength $\lambda_0$. If we send
linearly (say vertically) polarised light of other wavelengths through the plate, with the axes of the
plate not aligned with the polarisation direction, this light will become elliptically
polarised, while the light with the wavelength $\lambda_0$ will stay vertically polarised. If we
then let all the light pass through a horizontal polarizer, the light with wavelength $\lambda_0$ will be
completely extinguished, while the light of other wavelengths will be able to pass through at least
partially. Thus, full-wave plates can be used to filter away specific wavelengths of
light.
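To get a feeling for this filtering effect, the sketch below computes the fraction of vertically polarised light that passes a full-wave plate followed by a horizontal polarizer. It rests on two assumptions that are not stated in the text: the axes of the plate are at 45 degrees to the vertical, and the birefringence does not depend on wavelength, so that by Eq. (5.14) the phase difference scales as $2\pi\lambda_0/\lambda$. All names are hypothetical.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def rotation(theta):
    """Counter-clockwise rotation matrix of Eq. (5.21)."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))


def apply(matrix, vector):
    """Multiply a 2x2 Jones matrix with a Jones vector."""
    return (
        matrix[0][0] * vector[0] + matrix[0][1] * vector[1],
        matrix[1][0] * vector[0] + matrix[1][1] * vector[1],
    )


def transmission(wavelength, design_wavelength):
    """Transmitted fraction for vertical input, plate axes at 45 degrees, horizontal polarizer."""
    # Assumption: dispersion is neglected, so the retardation is 2*pi at the design wavelength.
    delta = 2 * math.pi * design_wavelength / wavelength
    plate = ((1, 0), (0, cmath.exp(1j * delta)))
    theta = math.pi / 4
    rotated = matmul(rotation(theta), matmul(plate, rotation(-theta)))
    polariser = ((1, 0), (0, 0))
    ex, ey = apply(polariser, apply(rotated, (0.0, 1.0)))
    return abs(ex) ** 2 + abs(ey) ** 2
```

Under these assumptions the transmitted fraction equals $\sin^2(\pi\lambda_0/\lambda)$: it vanishes at the design wavelength and is non-zero for neighbouring wavelengths.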
¹⁰ Youtube video - QuarterWavePlate
¹¹ MIT OCW - Quarter-wave Plate: Demonstration of the quarter-wave plate to create elliptical (in particular circular) polarisation.
¹² Youtube video - HalfWavePlate
¹³ MIT OCW - Half-wave Plate: Demonstration of the half-wave plate.


Figure 5.3: Rotation of horizontally polarised light over an angle α using a half-wave plate.

5.3 How to Determine Whether a Matrix Corresponds to a Linear Polariser or a Wave Plate

If the direction of either the slow or the fast axis is given, it is easy to write down the Jones matrix
of the birefringent plate. Similarly, for a linear polariser it is trivial to write down the Jones
matrix if one knows the direction in which the polariser absorbs or transmits all the light. But
suppose that you are given a complex (2,2)-matrix:

$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

How can you determine whether it corresponds to a linear polariser or to another
polarisation-changing plate?

1. Linear Polariser. The matrix corresponds to a linear polariser if there is a real vector
which remains invariant under $M$ and the real vector orthogonal to it is mapped to
zero. In other words, there must be an orthogonal basis of real eigenvectors, and one of the
eigenvalues must be 1 and the other 0. Hence, to check that a given matrix corresponds to
a linear polariser, one should compute the eigenvalues and show that they are equal to 1 and 0,
and verify that the eigenvectors are real and orthogonal.

2. Wave plate. For a matrix to correspond to a wave plate, there should exist two
real orthogonal eigenvectors with, in general, complex eigenvalues of modulus 1. In fact,
one of the eigenvectors corresponds to the fast axis with refractive index $n_1$, say, and the
other to the slow axis with refractive index $n_2$, say. The eigenvalues are then

$$e^{ikn_1 d}, \quad e^{ikn_2 d},$$

where $d$ is the thickness of the plate and $k$ is the wave number. Hence, to verify that a
(2,2)-matrix corresponds to a wave plate, one has to compute the eigenvalues, check that
these have modulus 1, and check that the corresponding eigenvectors are real and orthogonal.
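The two checks above can be turned into a small program. The sketch below is one possible way to do it, not a prescribed method: it computes the eigenvalues of the (2,2)-matrix from its trace and determinant, treats an eigenvector as 'real' if it is a real vector times a global phase factor, and uses an arbitrarily chosen tolerance `TOL`. All function names are hypothetical.

```python
import cmath
import math

TOL = 1e-9  # assumed numerical tolerance


def eigen_2x2(m):
    """Eigenvalues and (unnormalised) eigenvectors of a 2x2 complex matrix."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    pairs = []
    for lam in ((trace + root) / 2, (trace - root) / 2):
        # A non-zero row (p, q) of M - lam*I satisfies p*x + q*y = 0 for an
        # eigenvector (x, y), so (-q, p) is an eigenvector.
        rows = ((a - lam, b), (c, d - lam))
        p, q = max(rows, key=lambda r: abs(r[0]) + abs(r[1]))
        if abs(p) + abs(q) < TOL:
            # M is a multiple of the identity: every vector is an eigenvector.
            return [(lam, (1, 0)), (lam, (0, 1))]
        pairs.append((lam, (-q, p)))
    return pairs


def is_real_direction(v):
    """True if v equals a real vector times a global phase factor."""
    size = abs(v[0]) ** 2 + abs(v[1]) ** 2
    return abs((v[0] * complex(v[1]).conjugate()).imag) < TOL * size


def are_orthogonal(u, v):
    """True if the complex inner product of u and v vanishes."""
    inner = u[0] * complex(v[0]).conjugate() + u[1] * complex(v[1]).conjugate()
    scale = math.hypot(abs(u[0]), abs(u[1])) * math.hypot(abs(v[0]), abs(v[1]))
    return abs(inner) < TOL * scale


def close(z, w):
    return abs(z - w) < TOL


def classify(m):
    """Return 'linear polariser', 'wave plate' or 'other' for a 2x2 Jones matrix."""
    (l1, v1), (l2, v2) = eigen_2x2(m)
    if not (is_real_direction(v1) and is_real_direction(v2) and are_orthogonal(v1, v2)):
        return "other"
    if (close(l1, 1) and close(l2, 0)) or (close(l1, 0) and close(l2, 1)):
        return "linear polariser"
    if close(abs(l1), 1) and close(abs(l2), 1):
        return "wave plate"
    return "other"
```

For instance, `classify(((1, 0), (0, 0)))` reports a linear polariser and `classify(((1, 0), (0, 1j)))` a wave plate, while a matrix whose eigenvectors are circular polarisation states falls in the category 'other'.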

5.4 Decomposition of Elliptical Polarisation into Linear and Circular States

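Any elliptical polarisation state can be written as the sum of two orthogonal linear states:

$$J = \begin{pmatrix} A_x e^{i\varphi_x} \\ A_y e^{i\varphi_y} \end{pmatrix} = A_x e^{i\varphi_x} \begin{pmatrix} 1 \\ 0 \end{pmatrix} + A_y e^{i\varphi_y} \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{5.29}$$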


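Alternatively, any elliptical polarisation state can be written as the sum of two circular
polarisation states, one right and the other left circularly polarised:

$$J = \begin{pmatrix} A_x e^{i\varphi_x} \\ A_y e^{i\varphi_y} \end{pmatrix} = \frac{1}{2}\left(A_x e^{i\varphi_x} - i A_y e^{i\varphi_y}\right) \begin{pmatrix} 1 \\ i \end{pmatrix} + \frac{1}{2}\left(A_x e^{i\varphi_x} + i A_y e^{i\varphi_y}\right) \begin{pmatrix} 1 \\ -i \end{pmatrix}. \tag{5.30}$$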

We conclude that to study what happens to elliptical polarisation, it suffices to consider two
orthogonal linear polarisations, or, if that is more convenient, left and right circularly polarised
light. In a birefringent material two linear polarisations, namely the ones parallel to the o- and the
e-axis, each propagate with their own refractive index. To predict what happens to an arbitrary
linear polarisation state which is not aligned with either of these axes, or more generally what
happens to an elliptical polarisation state, we write this polarisation as a linear combination of o-
and e-polarisation states, i.e. we expand the field on the o- and e-basis.
In sugar, the left and right circular polarisation states each propagate with their own refractive
index. Therefore sugars are said to be circularly birefringent. To see what happens to an
arbitrary elliptical polarisation state in such a material, you should write it as a linear combination
of left and right circular polarisations.
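The expansion on the circular basis in Eq. (5.30) amounts to two simple linear combinations of $E_x$ and $E_y$. The sketch below, with hypothetical function names and arbitrary test amplitudes, computes the two coefficients and rebuilds the original Jones vector from them.

```python
import cmath
import math


def to_circular(ex, ey):
    """Coefficients of (1, i) and (1, -i) in Eq. (5.30)."""
    return (ex - 1j * ey) / 2, (ex + 1j * ey) / 2


def from_circular(c_plus, c_minus):
    """Rebuild (E_x, E_y) from the two circular coefficients."""
    return c_plus + c_minus, 1j * (c_plus - c_minus)


# Arbitrary elliptical state used only for illustration.
ex = 0.8 * cmath.exp(0.3j)
ey = 0.6 * cmath.exp(1.1j)
c_plus, c_minus = to_circular(ex, ey)
rebuilt = from_circular(c_plus, c_minus)

# A purely circular state has only one non-zero coefficient.
s = 1 / math.sqrt(2)
pure = to_circular(s, 1j * s)
```

Here `rebuilt` reproduces `(ex, ey)`, and for the circular state $(1, i)/\sqrt{2}$ the coefficient of $(1, -i)$ vanishes.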

External sources in recommended order:

1. Double Vision - Sixty Symbols: Demonstration of double refraction by a calcite crystal due
to birefringence.

2. Hecht §8.4

3. Hecht §8.7.1

4. Hecht §8.13

5. MIT OCW - Linear Polarizer: Demonstration of linear polarizers and linear polarisation.

6. MIT OCW - Polarization Rotation Using Polarizers: Demonstration of polarisation rotation
using linear polarizers.

7. Youtube video - QuarterWavePlate

8. MIT OCW - Quarter-wave Plate: Demonstration of the quarter-wave plate to create ellip-
tical (in particular circular) polarisation.

9. Youtube video - HalfWavePlate

10. MIT OCW - Half-wave Plate: Demonstration of the half-wave plate.

Chapter 6

Interference and coherence

What you should know and be able to do after studying this chapter.

• Understand time coherence and spatial coherence.

• Understand the measurement of the time coherence by the Michelson interferometer.

• Understand the link between the time coherence and the frequency band-
width.

• Understand Young’s two-slit experiment to measure the spatial coherence of two points.

• Know the definition of fringe visibility.

• Know and understand the three Laws of Fresnel-Arago.

Although the model of geometrical optics helps us to design optical systems and explains many
phenomena, there are also phenomena that require a more elaborate model. For example, the
interference fringes observed in Young’s double-slit experiment, or the Arago spot, indicate that
light is more accurately modelled as a wave. From your first year of the Bachelor, you probably
remember that interference maxima in the double-slit experiment occur when
the difference in path length is an integer multiple of the wavelength:

$$d \sin\theta = m\lambda, \tag{6.1}$$

where $d$ is the distance between the two slits, θ is the angle of the propagation direction of
the light, $m$ is an integer, and λ is the wavelength of the light. You may also remember the
Huygens-Fresnel principle, which states that each of the narrow slits acts as a point source from
which spherical waves are radiated.
In this chapter we will find more results that derive from the wave model of light. We will
see how light interferes in certain cases and how the wave model of light can reduce to the ray
model of light by considering (in)coherence. In the largest part of the discussion we will assume
that all the light has the same polarization, so we can treat the fields as scalars. In the last part
we will look at how polarization affects interference, as is described in the Fresnel-Arago laws.
It is very much worth noting that the concepts of interference and coherence are not just
restricted to optics. Since quantum mechanics dictates that particles have a wave-like nature,
interference and coherence also play a role in e.g. solid state physics and quantum information.


• KhanAcademy - Interference of light waves: Playlist on wave interference at a secondary
school level.

• Yale Courses - 18. Wave Theory of Light

6.1 Interference of Monochromatic Fields of the Same Frequency


Let us first recall the basic concepts of interference. What causes interference is the fact that
light is a wave, which means it not only has an amplitude, but also a phase. Suppose for
example we evaluate a time-harmonic field in two points:

$$\mathcal{U}_1(t) = \cos(\omega t), \qquad \mathcal{U}_2(t) = \cos(\omega t + \varphi). \tag{6.2}$$

Here φ denotes the phase difference between the fields at the two points. If φ = 0, or φ is a
multiple of 2π, then the fields are in phase, and when they are superimposed (i.e. added) they
interfere constructively:

$$\mathcal{U}_1(t) + \mathcal{U}_2(t) = \cos(\omega t) + \cos(\omega t + 2m\pi) = 2\cos(\omega t). \tag{6.3}$$

However, when φ = π, or more generally φ = π + 2mπ for some integer $m$, then the waves are
out of phase, and when they are superimposed, they interfere destructively:

$$\begin{aligned} \mathcal{U}_1(t) + \mathcal{U}_2(t) &= \cos(\omega t) + \cos(\omega t + \pi + 2m\pi) \\ &= \cos(\omega t) - \cos(\omega t) \\ &= 0. \end{aligned} \tag{6.4}$$

We can sum the two fields for arbitrary φ more conveniently using complex notation:

$$\mathcal{U}_1(t) = \mathrm{Re}\{e^{-i\omega t}\}, \qquad \mathcal{U}_2(t) = \mathrm{Re}\{e^{-i\omega t} e^{-i\varphi}\}. \tag{6.5}$$

Adding gives

$$\begin{aligned} \mathcal{U}_1(t) + \mathcal{U}_2(t) &= \mathrm{Re}\{e^{-i\omega t}(1 + e^{-i\varphi})\} \\ &= \mathrm{Re}\{e^{-i\omega t} e^{-i\varphi/2}(e^{i\varphi/2} + e^{-i\varphi/2})\} \\ &= \mathrm{Re}\{e^{-i\omega t} e^{-i\varphi/2}\, 2\cos(\varphi/2)\} \\ &= 2\cos(\varphi/2)\cos(\omega t + \varphi/2). \end{aligned} \tag{6.6}$$
Indeed we see that for $\varphi = 2m\pi$ and $\varphi = \pi + 2m\pi$ we retrieve the results obtained previously. It
is important to note that what we see or detect physically (say, the ‘brightness’ of light) does not
correspond to the quantities $\mathcal{U}_1$, $\mathcal{U}_2$. After all, $\mathcal{U}_1$ and $\mathcal{U}_2$ can attain negative values, while there
is no such thing as ‘negative brightness’. What $\mathcal{U}_1$ and $\mathcal{U}_2$ describe are the fields, which may
be positive or negative. The ‘brightness’ is the irradiance or instantaneous intensity given by
(2.69): $\mathcal{U}(t)^2$ (we omit the factor $\sqrt{\epsilon/\mu_0}$). Actually, we see and measure only the time-average
of $\mathcal{U}(t)^2$, because it fluctuates so rapidly at optical frequencies. We recall the definition of the
time average over an interval of length $T$ at a specific time $t$ given in (2.74) in Chapter 2:

$$\langle f(t) \rangle = \frac{1}{T} \int_{t - T/2}^{t + T/2} f(t')\, dt', \tag{6.7}$$

where $T$ is a time interval that is very large compared to the period of visible light. For a
time-harmonic function, the time average is independent of $t$. Indeed for (6.6) we get

$$\begin{aligned} I &= \langle (\mathcal{U}_1(t) + \mathcal{U}_2(t))^2 \rangle \\ &= 4\cos^2(\varphi/2)\, \langle \cos^2(\omega t + \varphi/2) \rangle \\ &= 2\cos^2(\varphi/2) = 1 + \cos(\varphi), \end{aligned} \tag{6.8}$$


where $T\omega \gg 1$. It is important to note that one can use complex notation to obtain the factor
$1 + \cos(\varphi)$ more easily. Let us write

$$\mathcal{U}_1(t) = \mathrm{Re}\{U_1 e^{-i\omega t}\}, \qquad \mathcal{U}_2(t) = \mathrm{Re}\{U_2 e^{-i\omega t}\}, \tag{6.9}$$

where

$$U_1 = 1, \qquad U_2 = e^{-i\varphi}. \tag{6.10}$$

Then we find

$$\begin{aligned} |U_1 + U_2|^2 &= |1 + e^{-i\varphi}|^2 \\ &= (1 + e^{i\varphi})(1 + e^{-i\varphi}) \\ &= 1 + 1 + e^{-i\varphi} + e^{i\varphi} \\ &= 2 + 2\cos(\varphi), \end{aligned} \tag{6.11}$$

hence

$$I = \frac{1}{2} |U_1 + U_2|^2. \tag{6.12}$$

(To see why this works, recall Eq. (2.75) and choose $A = B = U_1 + U_2$.) In this chapter we omit
the factor 1/2 in the time-averaged intensity. Hence we define $I_1 = |U_1|^2$ and $I_2 = |U_2|^2$, and we
then find for the time-averaged intensity of the sum of $U_1$ and $U_2$:

$$\begin{aligned} I = |U_1 + U_2|^2 &= (U_1 + U_2)^* (U_1 + U_2) \\ &= |U_1|^2 + |U_2|^2 + U_1^* U_2 + U_1 U_2^* \\ &= I_1 + I_2 + 2\mathrm{Re}\{U_1^* U_2\}. \end{aligned} \tag{6.13}$$

Here, $2\mathrm{Re}\{U_1^* U_2\}$ is known as the interference term. In the famous double-slit experiment
(which we will discuss in a later section), we can interpret the terms as follows: let us say $U_1$
is the field that comes from slit 1, and $U_2$ is the field that comes from slit 2. If we were to open only slit 1,
we would measure on the screen some intensity $I_1$, and if we were to open only slit 2, we would
measure some intensity $I_2$. If we were to open both slits at the same time, we would not simply see
$I_1 + I_2$, but we would see fringes which are due to the interference term $2\mathrm{Re}\{U_1^* U_2\}$.
More generally, if we want to find the intensity of a sum of multiple time-harmonic fields $U_j$
that all have the same frequency, we have to compute the coherent sum

$$I = \Big| \sum_j U_j \Big|^2. \tag{6.14}$$

However, we will see in the next section that sometimes the fields are unable to interfere. In
that case all the interference terms of the coherent sum vanish, and the intensity is given by the
incoherent sum

$$I = \sum_j |U_j|^2. \tag{6.15}$$
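The relations (6.8) and (6.12) to (6.15) can be checked numerically. The sketch below is a small illustration with hypothetical function names; the time average is approximated by a uniform sum over a whole number of periods, and the angular frequency is an arbitrary dimensionless value.

```python
import cmath
import math


def time_average_intensity(phi, omega=2 * math.pi, periods=10, samples=10000):
    """Numerical time average of (cos(wt) + cos(wt + phi))^2 over whole periods."""
    total_time = periods * 2 * math.pi / omega
    acc = 0.0
    for k in range(samples):
        t = k * total_time / samples
        acc += (math.cos(omega * t) + math.cos(omega * t + phi)) ** 2
    return acc / samples


def coherent_sum(fields):
    """Eq. (6.14): squared modulus of the summed complex amplitudes."""
    return abs(sum(fields)) ** 2


def incoherent_sum(fields):
    """Eq. (6.15): sum of the individual intensities."""
    return sum(abs(u) ** 2 for u in fields)


phi = 1.0                                  # arbitrary phase difference
u1, u2 = 1.0, cmath.exp(-1j * phi)         # Eq. (6.10)
averaged = time_average_intensity(phi)     # should equal 1 + cos(phi), Eq. (6.8)
interference = coherent_sum([u1, u2]) - incoherent_sum([u1, u2])
```

With these definitions `averaged` agrees with half of `coherent_sum([u1, u2])`, as in Eq. (6.12), and `interference` equals $2\mathrm{Re}\{U_1^* U_2\} = 2\cos\varphi$.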

6.2 Coherence
In the discussion so far we have only considered monochromatic light. The light is then perfectly
coherent. But what happens when light is not monochromatic but consists of multiple frequencies
(e.g. laser light may consist of multiple harmonics)? Or what if our light comes from multiple
sources which radiate independently (e.g. each point of an extended source such as a lamp acts as
an independent radiator)? To answer such questions, we must study the topic of coherence. One
could make a distinction between coherent and incoherent light. An intuitive way to think


about these concepts is in terms of the ability to form interference fringes: for example, with
laser light (which is coherent) one can form an interference pattern using a double slit, while
with sunlight (which is incoherent) this is much more difficult. However, it is not impossible
to create interference fringes with natural light¹: the trick is to put the two slits very closely
together. More specifically, the distance between the slits should be smaller than the coherence
length of the light. This is a very important observation: no light is actually completely coherent
or completely incoherent in practice. All light is partially coherent, but some light is more
coherent than others: laser light has a longer coherence length than sunlight, but in the end it is
possible to get an interference pattern from sunlight, or to see no interference fringes with laser
light. In the next sections we discuss these properties of light more quantitatively.

6.2.1 Polychromatic Light and its Intensity


When dealing with the issue of coherence one has to consider fields that consist of a range of
different frequencies. Let $\mathcal{U}(\mathbf{r}, t)$ be the real-valued physical field component, which is an arbitrary
function of time. We can always write $\mathcal{U}(\mathbf{r}, t)$ as an integral over time-harmonic components:

$$\mathcal{U}(\mathbf{r}, t) = \mathrm{Re} \int_0^\infty A_\omega(\mathbf{r})\, e^{-i\omega t}\, d\omega, \tag{6.16}$$

where $A_\omega(\mathbf{r})$ is the complex amplitude of the time-harmonic field with frequency ω. When
only a certain frequency band contributes, $A_\omega = 0$ for ω outside this band. We
define the complex time-dependent field $U(\mathbf{r}, t)$ by

$$U(\mathbf{r}, t) = \int_0^\infty A_\omega(\mathbf{r})\, e^{-i\omega t}\, d\omega. \tag{6.17}$$

Then

$$\mathcal{U}(\mathbf{r}, t) = \mathrm{Re}\, U(\mathbf{r}, t). \tag{6.18}$$

Note that the complex field $U(\mathbf{r}, t)$ now contains the time dependence, in contrast with the
time-harmonic (i.e. single-frequency) case in Chapter 2, where the time-dependent factor $e^{-i\omega t}$ was a separate
factor.

Quasi-monochromatic field. If there is a frequency band with width Δω and centre
frequency $\omega_c$ and the band is very narrow, we speak of a quasi-monochromatic field. This can in
good approximation be considered to be time-harmonic:

$$U(\mathbf{r}, t) \approx \Delta\omega\, A_{\omega_c}(\mathbf{r})\, e^{-i\omega_c t}. \tag{6.19}$$

We see that $\Delta\omega\, A_{\omega_c}(\mathbf{r})$ is the complex amplitude of the time-harmonic field, which previously in
Chapter 2 and the following chapters has been written as $U(\mathbf{r})$.

Remark: In the present chapter $U(\mathbf{r}, t)$ will be the complex field which includes the time
dependence (even in the time-harmonic case).

We now compute the intensity of polychromatic light. The instantaneous energy flux is again
proportional to the square of the instantaneous field, $\mathcal{U}(\mathbf{r}, t)^2$. We average the instantaneous
intensity over the integration time $T$ of common detectors, which is very long compared to
the period 2π/ω of the field. Using the definition (2.74) and

$$\mathcal{U}(\mathbf{r}, t) = \left(U(\mathbf{r}, t) + U(\mathbf{r}, t)^*\right)/2, \tag{6.20}$$

¹ See Veritasium - The original double slit experiment, starting at 2:15: Demonstration of an interference
pattern obtained with sunlight. Or see Hecht §9.3.1 ‘Young’s experiment’.


we get

$$\begin{aligned} \langle \mathcal{U}(\mathbf{r}, t)^2 \rangle &= \frac{1}{4} \langle (U(\mathbf{r}, t) + U(\mathbf{r}, t)^*)(U(\mathbf{r}, t) + U(\mathbf{r}, t)^*) \rangle \\ &= \frac{1}{4} \left[ \langle U(\mathbf{r}, t)^2 \rangle + \langle (U(\mathbf{r}, t)^*)^2 \rangle + 2 \langle U(\mathbf{r}, t)^* U(\mathbf{r}, t) \rangle \right] \\ &= \frac{1}{2} \langle U(\mathbf{r}, t)^* U(\mathbf{r}, t) \rangle \\ &= \frac{1}{2} \langle |U(\mathbf{r}, t)|^2 \rangle, \end{aligned} \tag{6.21}$$

where the averages of $U(\mathbf{r}, t)^2$ and $(U(\mathbf{r}, t)^*)^2$ are zero because they are oscillating functions of
time. In contrast, $U(\mathbf{r}, t)^* U(\mathbf{r}, t)$ has a DC component which does not average to zero.

Remark: In contrast to the time-harmonic case, the long time average of polychromatic light
depends in principle on the time $t$ at which the average is taken. However, the fields that we
consider in this chapter have frequency bands that are sufficiently narrow, and hence the
duration of the fields is sufficiently long, that the precise time around which the average over the time
interval is computed does not matter.

We use for the intensity again the expression without the factor 1/2 in front, i.e.

$$I(\mathbf{r}) = \langle |U(\mathbf{r}, t)|^2 \rangle. \tag{6.22}$$

Hence the time-averaged intensity has been expressed in terms of the squared modulus of the
complex field.

6.2.2 Temporal Coherence and the Michelson Interferometer


Let us first discuss temporal coherence. This concept refers to how a field $U(\mathbf{r}, t)$ in some
point $\mathbf{r}$ interferes with itself a time τ later, i.e. how $U(\mathbf{r}, t)$ interferes with $U(\mathbf{r}, t + \tau)$. Since the
point $\mathbf{r}$ is always the same, we omit it from the formula.
Temporal coherence is closely related to the spectral content of the light: if the light consists
of fewer frequencies (think of monochromatic light), then it is more temporally coherent. To
study the interference of $U(t)$ with $U(t + \tau)$, a Michelson interferometer, shown in Fig. 6.1, is a
suitable setup. The light that goes through one arm takes time $t$ to reach the detector, while
the light that goes through the other (longer) arm takes time $t + \tau$; thus the detector observes
the time-averaged intensity $\langle |U(t) + U(t + \tau)|^2 \rangle$. As remarked before, this averaged intensity
does not depend on the time at which the average is taken, but it does depend on the time difference τ
between the two beams. The expression $\langle |U(t) + U(t + \tau)|^2 \rangle$ does not depend on position, so it
cannot describe interference fringes in space. To better observe what happens when τ is varied,
we introduce interference fringes by tilting one beam, so that the observed interference pattern
is given by

$$\begin{aligned} I(x, \tau) &= \langle |U(t) e^{i k_x x} + U(t + \tau)|^2 \rangle \\ &= \langle |U(t)|^2 \rangle + \langle |U(t + \tau)|^2 \rangle + 2\mathrm{Re}\{ \langle U^*(t) U(t + \tau) \rangle e^{-i k_x x} \}. \end{aligned} \tag{6.23}$$

If τ is changed, the maxima of the interference pattern shift as a function of $x$, which is easy
to observe. How interference fringes for tilted collimated (i.e. not converging or diverging) beams
are observed in a Michelson interferometer is demonstrated in². It is possible to obtain different
² MIT OCW - Two-beam Interference - Collimated Beams: Interference of laser light in a Michelson
interferometer for collimated beams.


[Figure 6.1 labels: Source, Beam Splitter, Detector; vary τ; $I(\tau) = \langle |U(t) + U(t + \tau)|^2 \rangle$.]

Figure 6.1: A Michelson interferometer that can be used to study the temporal coherence of a
field. A beam is split in two by a beam splitter, and the two beams interfere in a detector after
one beam has propagated through one arm of the interferometer for a time t, while another beam
has propagated through the other arm for a time t + τ . The time difference τ can be varied by
varying the length of one arm.

fringe patterns using diverging beams instead of collimated beams, as is demonstrated in³.
We define the self-coherence function $\Gamma(\tau)$ as

$$\Gamma(\tau) = \langle U^*(t) U(t + \tau) \rangle. \tag{6.24}$$

The intensity of $U(t)$ is

$$I_0 = \langle |U(t)|^2 \rangle = \Gamma(0). \tag{6.25}$$

The complex degree of self-coherence is defined as

$$\gamma(\tau) = \frac{\Gamma(\tau)}{\Gamma(0)}. \tag{6.26}$$

This is a complex number with modulus between 0 and 1:

$$0 \le |\gamma(\tau)| \le 1. \tag{6.27}$$

The observed intensity can then be written as

$$I(x, \tau) = 2 I_0 \left[ 1 + \mathrm{Re}\left\{ \gamma(\tau) e^{-i k_x x} \right\} \right]. \tag{6.28}$$

Recall that we vary τ by varying the length of one of the arms in the Michelson interferometer.
Let us see what happens for different cases.

Suppose $U(t)$ is a monochromatic wave

$$U(t) = e^{-i\omega t}. \tag{6.29}$$

In that case we get for the self-coherence

$$\Gamma(\tau) = \langle e^{i\omega t} e^{-i\omega(t + \tau)} \rangle = e^{-i\omega\tau}, \tag{6.30}$$

and

$$\gamma(\tau) = e^{-i\omega\tau}. \tag{6.31}$$
³ MIT OCW - Two-beam interference - Diverging Beams: Interference of laser light in a Michelson
interferometer for diverging beams.


Hence the interference pattern is given by

$$I(x, \tau) = 2\left[ 1 + \cos(k_x x + \omega\tau) \right]. \tag{6.32}$$

So for monochromatic light we expect to see a cosine interference pattern, which shifts as we
change the arm length of the interferometer (i.e. change τ). For all τ, a clear interference pattern
should be observed.

What happens if our light is a superposition of two frequencies, i.e.

$$U(t) = \frac{e^{-i(\omega_c + \Delta\omega/2)t} + e^{-i(\omega_c - \Delta\omega/2)t}}{2}\,? \tag{6.33}$$

In that case:

$$\begin{aligned} \Gamma(\tau) &= \frac{1}{4} \left\langle \left( e^{i(\omega_c + \Delta\omega/2)t} + e^{i(\omega_c - \Delta\omega/2)t} \right) \left( e^{-i(\omega_c + \Delta\omega/2)(t + \tau)} + e^{-i(\omega_c - \Delta\omega/2)(t + \tau)} \right) \right\rangle \\ &= \frac{e^{-i(\omega_c + \Delta\omega/2)\tau} + e^{-i(\omega_c - \Delta\omega/2)\tau}}{4} \\ &= \frac{e^{-i\omega_c\tau}}{2} \cos(\Delta\omega\, \tau/2), \end{aligned} \tag{6.34}$$

where in the second line we assumed that the terms that oscillate in time average to 0. Hence,
the complex degree of self-coherence is

$$\gamma(\tau) = \cos(\Delta\omega\, \tau/2)\, e^{-i\omega_c\tau}, \tag{6.35}$$

and (6.28), with $I_0 = \Gamma(0) = 1/2$, becomes

$$I(x, \tau) = 1 + \mathrm{Re}\left\{ \gamma(\tau) e^{-i k_x x} \right\} = 1 + \cos(\Delta\omega\, \tau/2) \cos(\omega_c\tau + k_x x). \tag{6.36}$$

It is seen that the interference term is the product of the function $\cos(\omega_c\tau + k_x x)$, which is a
quickly oscillating function of τ, and a slowly varying envelope $\cos(\Delta\omega\, \tau/2)$. It is interesting to
note that the envelope, and hence γ(τ), vanishes for some periodically spaced τ, which means that
for certain τ the degree of self-coherence vanishes and no interference fringes form. Indeed, this
is what we see in⁴,⁵. Note that if Δω increases, the intervals between the zeros of γ(τ) decrease.
If more frequencies are added, the envelope function is not a cosine function but on average
decreases with τ. The typical value of τ below which interference is observed is roughly
equal to half the first zero of the envelope function. This value is called the coherence time $\Delta\tau_c$.
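The result (6.35) for the two-frequency field can be verified by carrying out the time average numerically. The sketch below uses arbitrary dimensionless values for $\omega_c$ and Δω and averages over a whole number of beat periods, so that the oscillating cross terms cancel; all names are hypothetical.

```python
import cmath
import math


def field(t, omega_c, d_omega):
    """Two-frequency field of Eq. (6.33)."""
    return 0.5 * (cmath.exp(-1j * (omega_c + d_omega / 2) * t)
                  + cmath.exp(-1j * (omega_c - d_omega / 2) * t))


def self_coherence(tau, omega_c, d_omega, beats=4, samples=4000):
    """Time average <U*(t) U(t + tau)> over a whole number of beat periods."""
    total_time = beats * 2 * math.pi / d_omega
    acc = 0j
    for k in range(samples):
        t = k * total_time / samples
        acc += field(t, omega_c, d_omega).conjugate() * field(t + tau, omega_c, d_omega)
    return acc / samples


def degree_of_coherence(tau, omega_c, d_omega):
    """Eq. (6.26): gamma(tau) = Gamma(tau) / Gamma(0)."""
    return self_coherence(tau, omega_c, d_omega) / self_coherence(0.0, omega_c, d_omega)


OMEGA_C, D_OMEGA = 20.0, 2.0   # assumed, dimensionless illustration values
tau = 0.7
numeric = degree_of_coherence(tau, OMEGA_C, D_OMEGA)
analytic = math.cos(D_OMEGA * tau / 2) * cmath.exp(-1j * OMEGA_C * tau)   # Eq. (6.35)
```

The two values agree up to rounding, and at $\tau = \pi/\Delta\omega$ the numerically computed degree of self-coherence vanishes, in line with the zeros of the cosine envelope.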

⁴ MIT OCW - Fringe Contrast - Path Difference: Demonstration of how fringe contrast varies with propagation
distance.
⁵ MIT OCW - Coherence Length and Source Spectrum: Demonstration of how the coherence length depends
on the spectrum of the laser light.


Example. We consider an admittedly rather special, but also simple, case in which the frequency
band is $\omega_c - \Delta\omega/2 < \omega < \omega_c + \Delta\omega/2$ and the amplitudes of all frequency components are the same.
Then the complex field is (the factor 1/Δω is for normalisation):

$$U(t) = \frac{1}{\Delta\omega} \int_{\omega_c - \Delta\omega/2}^{\omega_c + \Delta\omega/2} e^{-i\omega t}\, d\omega = \frac{\sin(\Delta\omega\, t/2)}{\Delta\omega\, t/2}\, e^{-i\omega_c t}. \tag{6.37}$$

The complex degree of self-coherence is found to be (we omit the derivation):

$$\gamma(\tau) = \frac{\sin(\Delta\omega\, \tau/2)}{\Delta\omega\, \tau/2}\, e^{-i\omega_c\tau}. \tag{6.38}$$

It is seen that the modulus of the complex degree of self-coherence is given by the modulus of a sinc-function.
The sinc-function vanishes at certain τ, where the interference totally disappears, and decreases (although not
monotonically) to zero for τ → ∞. A reasonable definition of the pulse width of the field $U(t)$
is half the distance between the smallest positive and negative zeros; hence in this example the
pulse duration is 2π/Δω. The coherence time can be set equal to half the first zero of the sinc
function, i.e. $\Delta\tau_c = \pi/\Delta\omega$, which for this example is half the pulse duration. We show
$\mathrm{Re}\{\gamma(\tau)\}$ for three bandwidths in Fig. 6.2, with identical centre frequency $\omega_c$.
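One way to make Eq. (6.38) plausible without the full derivation is to evaluate γ(τ) as the normalised Fourier integral of the flat power spectrum (this is the Wiener-Khinchin relation) and compare it with the sinc expression. The sketch below does this with a simple midpoint rule; the values of $\omega_c$ and Δω are arbitrary dimensionless choices and all names are hypothetical.

```python
import cmath
import math


def gamma_numeric(tau, omega_c, d_omega, steps=4000):
    """Normalised Fourier integral of a flat spectrum of width d_omega (midpoint rule)."""
    h = d_omega / steps
    acc = 0j
    for k in range(steps):
        omega = omega_c - d_omega / 2 + (k + 0.5) * h
        acc += cmath.exp(-1j * omega * tau) * h
    return acc / d_omega


def gamma_analytic(tau, omega_c, d_omega):
    """Eq. (6.38): sinc envelope times the carrier exp(-i omega_c tau)."""
    x = d_omega * tau / 2
    envelope = 1.0 if x == 0 else math.sin(x) / x
    return envelope * cmath.exp(-1j * omega_c * tau)


OMEGA_C, D_OMEGA = 50.0, 2.0            # assumed, dimensionless illustration values
coherence_time = math.pi / D_OMEGA      # half the first zero of the sinc
first_zero = 2 * math.pi / D_OMEGA
```

At `coherence_time` the modulus of γ has dropped to 2/π (about 0.64), and at `first_zero` it vanishes, so the fringes disappear there.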

So let us further interpret the degree of self-coherence function γ(τ).

• From (6.28) we see that |γ(τ)| is a measure of the ability to form fringes in a Michelson
interferometer. In the limit of complete incoherence, where γ(τ) = δ(τ) (i.e. it vanishes
for all τ ≠ 0), no interference fringes can form, whereas when the coherence time $\tau_c$ is long, the
fringe contrast is relatively large for all $\tau < \tau_c$.

• Readers who are familiar with stochastic signal analysis recognize in $\Gamma(\tau) = \langle U^*(t) U(t + \tau) \rangle$
the autocorrelation of $U(t)$. Informally, one can interpret the autocorrelation function
as the ability to predict the field $U$ at time $t + \tau$ given the field at time $t$. For instance,
in the limit of complete incoherence, where Γ(τ) = δ(τ), we cannot predict anything about
$U(t + \tau)$ given $U(t)$ for τ ≠ 0, meaning $U(t)$ is completely random. In the limit of complete
coherence, where |γ(τ)| = 1, we can predict $U(t + \tau)$ given $U(t)$ for any τ.

• From the Wiener-Khinchin theorem we know that the Fourier transform of the self-
coherence function gives the spectral power of $U(t)$:

$$\hat{\Gamma}(\omega) = |\hat{U}(\omega)|^2. \tag{6.39}$$

Using the uncertainty principle, we can see that the larger the spread of the frequencies of
$U(t)$ (i.e. the larger the bandwidth), the more sharply peaked Γ(τ) is. Thus, the light gets
temporally less coherent when it consists of a larger range of frequencies.

It is also possible to think of temporal coherence as the length of wavetrains/pulses emitted
by the source: the less temporally coherent the source is, the shorter the emitted wavetrains.
After all, we have seen that if a source is less temporally coherent, it emits radiation at more
frequencies, and we have seen that according to the uncertainty principle this allows shorter
pulses to be formed⁶. One can think of the coherence time $\tau_c$ as the duration of such a pulse
(which is inversely proportional to the bandwidth of the signal), and of the coherence length
as the corresponding distance $l_c = c\tau_c$. In the case of perfectly temporally coherent light, the
wavetrains would be infinitely long, which is physically impossible. Therefore, there are in practice
no perfectly coherent (i.e. monochromatic) sources. Instead, if a source is nearly monochromatic,
⁶ See for example Heisenberg’s Microscope - Sixty Symbols, 0:20 to 2:38: Basic explanation of the uncertainty
principle (though in the context of quantum physics). Or Hecht §7.2, 7.4.2, 7.4.3.


Figure 6.2: Real part of the degree of self coherence Re {γ} as function of τ /ωc for a bandwidth
from top to bottom of ∆ω = 0.01ωc , ∆ω = 0.02ωc and ∆ω = 0.04ωc . In all cases ωc was the
same. The τ axes are normalised by the coherence time for the first chosen bandwidth, i.e. τ is
normalised by τc = π/(0.01ωc ). It is seen that the coherence time decreases as the bandwidth
increases.

Optica Lecture Notes TN2421 91 of 165 Monday 16th April, 2018, 09:44
CHAPTER 6. INTERFERENCE AND COHERENCE

we call it quasi-monochromatic. The narrow but finite bandwidth of a quasi-monochromatic


source is referred to as the linewidth.
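
The relation between bandwidth, coherence time and coherence length can be made concrete with a few numbers. The sketch below assumes the convention τc = π/∆ω used for Fig. 6.2 and lc = cτc; the centre wavelength of 600 nm and the function names are illustrative assumptions, not values from these notes.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def coherence_time(delta_omega):
    """Coherence time tau_c = pi / delta_omega (s) for a bandwidth delta_omega (rad/s)."""
    return math.pi / delta_omega


def coherence_length(delta_omega):
    """Coherence length l_c = c * tau_c (m)."""
    return C * coherence_time(delta_omega)


# Hypothetical source: centre wavelength 600 nm (assumed for illustration only).
LAMBDA_C = 600e-9
OMEGA_C = 2.0 * math.pi * C / LAMBDA_C

for fraction in (0.01, 0.02, 0.04):  # relative bandwidths of Fig. 6.2
    d_omega = fraction * OMEGA_C
    print(f"bandwidth {fraction:.2f} omega_c: "
          f"tau_c = {coherence_time(d_omega):.3e} s, "
          f"l_c = {coherence_length(d_omega) * 1e6:.1f} micrometre")
```

With this convention a relative bandwidth of 1% gives a coherence length of 50 centre wavelengths, and doubling the bandwidth halves both τc and lc.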

6.2.3 Spatial Coherence and Young’s Experiment


We have just looked at temporal coherence, which describes how the field at a certain point
interferes with the field at the same position but at a later time. What we will do now is look
at the spatial coherence, which describes how the fields at two points P1 and P2 separated in
space interfere with each other. While for studying temporal coherence we used a Michelson
interferometer, for studying spatial coherence the natural choice is Young’s experiment,
since it allows two points separated in space to interfere with each other⁷.

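[Figure 6.3 (diagram): a screen with two pinholes at r1 and r2, a distance d apart, with fields U(r1, t) and U(r2, t); path difference ΔR at angle θ and travel times T and T + τ to the second screen, where I(τ) = ⟨|U(r1, t) + U(r2, t + τ)|²⟩ is observed while τ is varied.]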

Figure 6.3: Young’s experiment can be used to study the spatial coherence of a field. A screen
with two holes at the two points of interest, r1 and r2 , is used to let the fields in these points
interfere with each other on a second screen at large distance. Because the light propagates
over different distances from the two holes to the point of observation, U (r1 , t) interferes with
U (r2 , t + τ ), where τ is the corresponding difference in propagation time. For different points
on the screen, the time difference τ varies. Due to the distance from the holes to the screen, the
amplitudes of U (r1 ) and U (r2 ) have decreased by the time they have reached the screen, but
when the distance between the screens is very large, all amplitudes have decreased by the same
factor which can then be omitted.

In Young’s experiment, a screen is used with two pinholes at the positions of the points P1
and P2 . Let r1 and r2 be the position vectors of the two points. We write the field in P1 as a
superposition of time-harmonic fields as in (6.17):

\[ U(\mathbf{r}_1, t) = \int A_\omega(\mathbf{r}_1)\, e^{-i\omega t}\, d\omega. \tag{6.40} \]

According to the Huygens-Fresnel principle, a time harmonic disturbance with frequency ω in the
pinhole at r1 causes a radiating spherical wave behind the screen, such that the time-harmonic
field in some point r is given by
\[ A_\omega(\mathbf{r}_1)\, \frac{e^{-i\omega(t - |\mathbf{r}-\mathbf{r}_1|/c)}}{|\mathbf{r}-\mathbf{r}_1|}. \tag{6.41} \]
⁷ See Hecht, §9.3.1 ‘Young’s experiment’


The total field in r due to the pinhole at P1 is obtained by integrating over all frequencies:

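\[ U(\mathbf{r}, t) = \int A_\omega(\mathbf{r}_1)\, \frac{e^{-i\omega(t - |\mathbf{r}-\mathbf{r}_1|/c)}}{|\mathbf{r}-\mathbf{r}_1|}\, d\omega = \frac{U(\mathbf{r}_1, t - |\mathbf{r}-\mathbf{r}_1|/c)}{|\mathbf{r}-\mathbf{r}_1|}. \tag{6.42} \]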

Hence, the field in r at time t due to the pinhole at r1 is proportional to the field at r1 at the
earlier time t − |r − r1|/c, i.e. earlier by the time it takes the light to propagate from r1 to r.
The proportionality factor is the reciprocal of the distance between r and r1.

Consider the set-up as shown in Fig. 6.3. The fields from the two pinholes at r1 and r2 interfere
with each other on a second screen at large distance. Because of the difference in propagation
distance ∆R = |r − r1| − |r − r2|, there is a time difference τ between the two fields when they
arrive at the same point r on the screen, given by

\[ \tau = \frac{\Delta R}{c}. \tag{6.43} \]
Furthermore, because of the propagation, the amplitudes are reduced by a factor proportional to
the reciprocal distance which is different for the two fields, but if the distance z between the two
screens is large enough, we can take both factors to be 1/z and then omit this common factor.
The interference pattern on the screen is then given by

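\[ \begin{aligned} I(\tau) &= \langle |U(\mathbf{r}_1, t) + U(\mathbf{r}_2, t+\tau)|^2 \rangle \\ &= \langle |U(\mathbf{r}_1, t)|^2 \rangle + \langle |U(\mathbf{r}_2, t+\tau)|^2 \rangle + 2\,\mathrm{Re}\{\langle U^*(\mathbf{r}_1, t)\, U(\mathbf{r}_2, t+\tau) \rangle\}. \end{aligned} \tag{6.44} \]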

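We define the mutual coherence function Γ12(τ)

\[ \Gamma_{12}(\tau) = \langle U^*(\mathbf{r}_1, t)\, U(\mathbf{r}_2, t+\tau) \rangle, \tag{6.45} \]

and the intensities

\[ \begin{aligned} I_1 &= \langle |U(\mathbf{r}_1, t)|^2 \rangle = \Gamma_{11}(0), \\ I_2 &= \langle |U(\mathbf{r}_2, t+\tau)|^2 \rangle = \Gamma_{22}(0). \end{aligned} \tag{6.46} \]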

The complex degree of mutual coherence is defined by using these intensities to normalize
Γ12(τ):

\[ \gamma_{12}(\tau) = \frac{\Gamma_{12}(\tau)}{\sqrt{\Gamma_{11}(0)}\,\sqrt{\Gamma_{22}(0)}}. \tag{6.47} \]

The modulus of γ12 is smaller than or equal to 1. We can write (6.44) as

\[ I(\tau) = I_1 + I_2 + 2\sqrt{I_1}\sqrt{I_2}\;\mathrm{Re}\{\gamma_{12}(\tau)\}. \tag{6.48} \]

By varying the point of observation r over the screen, we can vary τ and by measuring the
intensities we can deduce the real part of γ12 (τ ). Note that γ12 (τ ) indicates the ability to form
fringes: if γ12 (τ ) is zero for a certain τ , then the interference is zero and the intensity is the sum
of the intensities of the two holes.
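Let us see what happens when U(r, t) is a monochromatic field

\[ U(\mathbf{r}, t) = A(\mathbf{r})\, e^{-i\omega t}. \tag{6.49} \]

In that case

\[ \begin{aligned} \Gamma_{12}(\tau) &= \langle A^*(\mathbf{r}_1) A(\mathbf{r}_2)\, e^{i\omega t} e^{-i\omega(t+\tau)} \rangle \\ &= A^*(\mathbf{r}_1) A(\mathbf{r}_2)\, e^{-i\omega\tau}. \end{aligned} \tag{6.50} \]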


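So we get

\[ \gamma_{12}(\tau) = \frac{\Gamma_{12}(\tau)}{|A(\mathbf{r}_1)||A(\mathbf{r}_2)|} = e^{-i\omega\tau + i\phi}, \tag{6.51} \]

where ϕ is the phase difference of A(r2) and A(r1), i.e. ϕ = arg A(r2) − arg A(r1). In this case
γ12 has modulus 1, as expected for a monochromatic field. The intensity on the screen becomes

\[ I(\tau) = |A(\mathbf{r}_1)|^2 + |A(\mathbf{r}_2)|^2 + 2|A(\mathbf{r}_1)||A(\mathbf{r}_2)| \cos(\omega\tau - \phi). \tag{6.52} \]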

So indeed we see interference fringes as one would expect for a monochromatic wave. If ϕ = 0,
then interference maxima occur for

\[ \omega\tau = 0, \pm 2\pi, \pm 4\pi, \pm 6\pi, \ldots \tag{6.53} \]

Noting that ω = 2πc/λ and ∆R = cτ, we find that maxima occur when

\[ \Delta R = 0, \pm\lambda, \pm 2\lambda, \pm 3\lambda, \ldots \tag{6.54} \]

For a large distance between the screens (in the Fraunhofer limit), these path length differences
correspond to directions of the maxima given by the angles θm (see Fig. 6.3):

\[ \theta_m = \frac{\Delta R}{d} = m\,\frac{\lambda}{d}, \tag{6.55} \]

where d is the distance between the slits and m is an integer. This result should be familiar from
secondary school treatments of Young’s double slit experiment⁸.
For fields that are not monochromatic, the modulus of the degree of mutual coherence varies
with τ and is in general less than 1.
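
As a numerical illustration of the monochromatic case, the following sketch evaluates the intensity (6.52) with ϕ = 0 and the directions of the maxima (6.55), using the small-angle relation τ = ∆R/c ≈ dθ/c. The wavelength, pinhole separation and function names are assumptions chosen for the example.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def young_intensity(i1, i2, omega, tau):
    """Intensity (6.52) for monochromatic light, taking phi = 0."""
    return i1 + i2 + 2.0 * math.sqrt(i1 * i2) * math.cos(omega * tau)


def maxima_angles(wavelength, d, max_order):
    """Small-angle directions theta_m = m * wavelength / d of (6.55), in radians."""
    return [m * wavelength / d for m in range(-max_order, max_order + 1)]


# Assumed example values: green light and pinholes 0.1 mm apart.
wavelength = 500e-9
d = 0.1e-3
omega = 2.0 * math.pi * C / wavelength

for theta in maxima_angles(wavelength, d, 2):
    tau = d * theta / C  # delay between the two paths for this direction
    print(f"theta = {theta:+.4f} rad, I = {young_intensity(1.0, 1.0, omega, tau):.3f}")
```

For equal pinhole intensities the maxima reach four times the intensity of a single pinhole, and halfway between two maxima the intensity drops to zero.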

Example. Suppose that the fields in the points P1 and P2 are identical and given by (6.37):

\[ U(\mathbf{r}_1, t) = U(\mathbf{r}_2, t) = \frac{\sin(\Delta\omega\, t/2)}{\Delta\omega\, t/2}\, e^{-i\omega_c t}. \tag{6.56} \]

This field consists of a spectrum with centre frequency ωc and bandwidth ∆ω and with constant
spectral density. The complex degree of mutual coherence can be computed. We only state the
final result, without going into the details of the derivation:

\[ \gamma_{12}(\tau) = \frac{\sin(\Delta\omega\, \tau/2)}{\Delta\omega\, \tau/2}\, e^{-i\omega_c \tau}. \tag{6.57} \]

In particular,

\[ \mathrm{Re}\{\gamma_{12}(\tau)\} = \frac{\sin(\Delta\omega\, \tau/2)}{\Delta\omega\, \tau/2}\, \cos(\omega_c \tau), \tag{6.58} \]
where, as above, τ = ∆R/c ≈ d θ/c, with θ the angle between the normal on the screen and
the line connecting the point of observation to the point on the screen halfway between P1 and
P2. The factor cos(ωc τ) is equal to Re{γ12} for monochromatic light with frequency ωc. The
sinc function in (6.58) is a slowly varying envelope function that is smaller than 1 for τ > 0. For
τ = 2π/∆ω this factor vanishes and therefore the coherence time is set equal to ∆τc = π/∆ω
and the coherence length is ∆ℓc = πc/∆ω. In Fig. 6.4, the intensity

\[ I(\theta) = 1 + \mathrm{Re}\{\gamma_{12}\}, \tag{6.59} \]

is shown as function of angle θ for several values of the bandwidth. It is seen that between θ = 0
and the first zero of the envelope, the fringe amplitudes decrease for increasing angle θ because
of the increase of the difference in propagation distance ∆R from the two holes to the point of
observation. Furthermore, for larger bandwidth the amplitudes of the fringes decrease faster with
the angle.
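
The behaviour described in this example can be reproduced numerically. The sketch below evaluates (6.58) and (6.59) in dimensionless units (ωc = 1 and c = 1 are assumptions made only to keep the numbers simple) and prints the delay at which the sinc envelope first vanishes together with the coherence time π/∆ω.

```python
import math


def sinc(x):
    """sin(x)/x with the limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def re_gamma12(tau, omega_c, delta_omega):
    """Real part of the degree of mutual coherence, equation (6.58)."""
    return sinc(0.5 * delta_omega * tau) * math.cos(omega_c * tau)


def intensity(theta, d, omega_c, delta_omega, c=1.0):
    """Normalised intensity (6.59), using tau = d * theta / c."""
    tau = d * theta / c
    return 1.0 + re_gamma12(tau, omega_c, delta_omega)


OMEGA_C = 1.0  # dimensionless centre frequency (assumed unit choice)
for rel_bw in (0.025, 0.05, 0.1):  # relative bandwidths of Fig. 6.4
    delta_omega = rel_bw * OMEGA_C
    tau_zero = 2.0 * math.pi / delta_omega  # first zero of the envelope
    tau_c = math.pi / delta_omega           # coherence time as defined above
    print(f"bandwidth {rel_bw}: envelope zero at tau = {tau_zero:.1f}, tau_c = {tau_c:.1f}")
```

At the envelope zero the intensity equals 1, the sum of the two normalised single-hole intensities, so no fringes are visible at the corresponding angle.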


Figure 6.4: The intensity I(θ) as function of θ for a bandwidth from top to bottom of ∆ω =
0.025ωc , ∆ω = 0.05ωc and ∆ω = 0.1ωc . In all cases ωc has the same value. It is seen that the
fringe amplitudes decrease with increasing θ due to the fact that the difference in propagation
distances ∆R, and hence τ , increases with θ. The larger the bandwidth the faster this decrease
with angle. Furthermore for certain angles there is no interference at all.


Let us interpret the mutual coherence function Γ12(τ) and the complex degree of mutual
coherence.
• We have 0 ≤ |γ12(τ)| ≤ 1 and |γ12| indicates the ability to form fringes in a double-slit
experiment. When γ12(τ) is zero there is no interference.
• We recognize Γ12 (τ ) = hU ∗ (r1 , t)U (r2 , t + τ )i to be the cross-correlation of the two
signals U (r1 , t) and U (r2 , t).

6.3 Change of Spatial Coherence due to Propagation


When the light propagates, its coherence in general increases with propagation distance. For
example, the sun consists of incredibly many emitters which emit at random times and therefore
the fields at different points of the surface are completely uncorrelated, i.e. the field is spatially
incoherent. But after propagation to the earth, the sunlight has become partially coherent, with
a coherence length of roughly 50 µm around the wavelength of 510 nm.
To understand this phenomenon we consider two incoherent point sources P1 and P2 in the
z = 0-plane. We assume that their mutual coherence function is given by:

\[ \Gamma_{P_1 P_2}(\tau) = \begin{cases} 0, & \text{for all } \tau \text{ if } P_1 \neq P_2, \\ I_0\, \dfrac{\sin(\Delta\omega\, \tau/2)}{\Delta\omega\, \tau/2}\, e^{-i\omega_c \tau}, & \text{if } P_1 = P_2, \end{cases} \tag{6.60} \]

where I0 is the intensity of either source and where we assume the same spectral properties of the
field as in the example of Section 6.2.2 (see (6.38)): ωc is the centre frequency of the frequency
band with width ∆ω. For a narrow frequency band we approximate (6.60) by

\[ \Gamma_{P_j P_j}(\tau) = I_0\, e^{-i\omega_c \tau}, \quad j = 1, 2. \tag{6.61} \]
Consider two points Q1 , Q2 in a plane at distance z > 0 from the two point sources. We
will compute the mutual coherence ΓQ1 Q2 (0) of these points for zero time delay τ = 0 and for
varying z. The field in Q1 is the sum of the fields emitted by P1 and P2 :
\[ U(Q_1, t) \propto \frac{U(P_1, t - |P_1 Q_1|/c)}{|P_1 Q_1|} + \frac{U(P_2, t - |P_2 Q_1|/c)}{|P_2 Q_1|}, \tag{6.62} \]

where we used (6.42). Similarly,

\[ U(Q_2, t) \propto \frac{U(P_1, t - |P_1 Q_2|/c)}{|P_1 Q_2|} + \frac{U(P_2, t - |P_2 Q_2|/c)}{|P_2 Q_2|}. \tag{6.63} \]

Let us assume that z is so large that all distances |Pi Qj | in the denominators can be replaced
by z. Then the distances can be omitted and we get for the mutual coherence in Q1 and Q2 for
zero time delay τ = 0:
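\[ \begin{aligned} \Gamma_{Q_1 Q_2}(0) &= \langle U(Q_1, t)^* U(Q_2, t) \rangle \\ &= \langle U(P_1, t - |P_1 Q_1|/c)^* U(P_1, t - |P_1 Q_2|/c) \rangle \\ &\quad + \langle U(P_1, t - |P_1 Q_1|/c)^* U(P_2, t - |P_2 Q_2|/c) \rangle \\ &\quad + \langle U(P_2, t - |P_2 Q_1|/c)^* U(P_1, t - |P_1 Q_2|/c) \rangle \\ &\quad + \langle U(P_2, t - |P_2 Q_1|/c)^* U(P_2, t - |P_2 Q_2|/c) \rangle \\ &= \Gamma_{P_1 P_1}\!\left( \frac{|P_1 Q_1| - |P_1 Q_2|}{c} \right) + \Gamma_{P_1 P_2}\!\left( \frac{|P_1 Q_1| - |P_2 Q_2|}{c} \right) \\ &\quad + \Gamma_{P_2 P_1}\!\left( \frac{|P_2 Q_1| - |P_1 Q_2|}{c} \right) + \Gamma_{P_2 P_2}\!\left( \frac{|P_2 Q_1| - |P_2 Q_2|}{c} \right). \end{aligned} \tag{6.64} \]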
⁸ KhanAcademy - Young’s Double slit part 1


Figure 6.5: Two incoherent point sources P1 , P2 and two points Q1 , Q2 in a plane at distance z
from the point sources. The degree of mutual coherence at Q1 and Q2 increases to 1 for α → 0.

Now we use (6.60) and (6.61) to conclude

\[ \Gamma_{Q_1 Q_2}(0) = I_0 \left\{ e^{-i\frac{\omega_c}{c}(|P_1 Q_1| - |P_1 Q_2|)} + e^{-i\frac{\omega_c}{c}(|P_2 Q_1| - |P_2 Q_2|)} \right\}. \tag{6.65} \]

Similarly,

\[ \Gamma_{Q_1 Q_1}(0) = \Gamma_{Q_2 Q_2}(0) = 2 I_0. \tag{6.66} \]
The complex degree of mutual coherence for zero time delay becomes

\[ \gamma_{Q_1 Q_2}(0) = \frac{\Gamma_{Q_1 Q_2}(0)}{\sqrt{\Gamma_{Q_1 Q_1}(0)}\,\sqrt{\Gamma_{Q_2 Q_2}(0)}} = \frac{1}{2} \left[ e^{i\frac{\omega_c}{c}(|P_1 Q_2| - |P_1 Q_1|)} + e^{i\frac{\omega_c}{c}(|P_2 Q_2| - |P_2 Q_1|)} \right]. \tag{6.67} \]

The degree of mutual coherence at Q1 and Q2 is thus not zero and depends on the one hand on
the difference between the optical paths from P1 to Q1 and from P1 to Q2, and on the other hand
on the difference between the paths from P2 to Q1 and from P2 to Q2.
Suppose that P1 = (a/2, 0, 0), P2 = (−a/2, 0, 0) and Qj = (xj, 0, z) for j = 1, 2. Then, for z
large:

\[ |P_1 Q_j| = z \sqrt{1 + \frac{(a/2 - x_j)^2}{z^2}} \approx z + \frac{(a/2 - x_j)^2}{2z}, \tag{6.68} \]

and hence

\[ |P_1 Q_2| - |P_1 Q_1| \approx \frac{(a/2 - x_2)^2 - (a/2 - x_1)^2}{2z} = \frac{a(x_1 - x_2) + x_2^2 - x_1^2}{2z}. \tag{6.69} \]

Similarly,

\[ |P_2 Q_2| - |P_2 Q_1| \approx \frac{-a(x_1 - x_2) + x_2^2 - x_1^2}{2z}. \tag{6.70} \]

Hence,

\[ \gamma_{Q_1 Q_2}(0) = e^{i\frac{\omega_c}{c}\frac{x_2^2 - x_1^2}{2z}} \cos\left[ \frac{\omega_c a}{2cz}(x_1 - x_2) \right]. \tag{6.71} \]
It is thus seen that the degree of mutual coherence depends on the angle α ≈ a/z subtended by
the two point sources at the midpoint (0, 0, z) on the screen. The smaller this angle, the higher
the degree of spatial coherence.
We see from (6.71) that by keeping Q1 fixed, we can retrieve the angle α by measuring
Re{γQ1Q2(0)} for a number of different positions of Q2.
If the light is not quasi-monochromatic and has a larger bandwidth, the modulus |ΓQ1Q2(0)|
will be smaller, which means a reduced coherence compared to the quasi-monochromatic case.
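
The paraxial result can be checked against the exact path lengths. The sketch below computes γQ1Q2(0) from (6.65) and (6.66) with exact distances and compares its modulus with the modulus of the cosine factor in (6.71), which does not depend on the phase factor in front. The source separation, distance and wavelength are assumed example values, and the function names are illustrative.

```python
import cmath
import math


def gamma_q1q2(a, z, x1, x2, wavelength):
    """gamma_{Q1Q2}(0) from (6.65) and (6.66), using exact path lengths."""
    k = 2.0 * math.pi / wavelength  # equals omega_c / c

    def path(source_x, point_x):
        # Distance from a source at (source_x, 0, 0) to a point at (point_x, 0, z).
        return math.hypot(point_x - source_x, z)

    d1 = path(+a / 2.0, x1) - path(+a / 2.0, x2)  # |P1Q1| - |P1Q2|
    d2 = path(-a / 2.0, x1) - path(-a / 2.0, x2)  # |P2Q1| - |P2Q2|
    return 0.5 * (cmath.exp(-1j * k * d1) + cmath.exp(-1j * k * d2))


def modulus_paraxial(a, z, x1, x2, wavelength):
    """Modulus of the cosine factor in (6.71)."""
    k = 2.0 * math.pi / wavelength
    return abs(math.cos(k * a * (x1 - x2) / (2.0 * z)))


# Assumed example: sources 1 mm apart, observed at 1 m, wavelength 510 nm.
wavelength, a, z = 510e-9, 1.0e-3, 1.0
first_zero = wavelength * z / (2.0 * a)  # separation where the cosine vanishes
for x2 in (0.0, 1.0e-4, first_zero):
    exact = abs(gamma_q1q2(a, z, 0.0, x2, wavelength))
    approx = modulus_paraxial(a, z, 0.0, x2, wavelength)
    print(f"x2 = {x2:.3e} m: exact |gamma| = {exact:.6f}, paraxial = {approx:.6f}")
```

In this two-point model the fringes first disappear when Q1 and Q2 are separated by λz/(2a) = λ/(2α), so a smaller subtended angle α gives a wider region over which the field is spatially coherent.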


The dependence of the spatial coherence on propagation can be understood without
calculations by realizing that the fields at Q1 and Q2 both partly consist of the fields radiated by
P1 and P2. The fields at Q1 and Q2 radiated by the point source P1 are coherent provided the
difference in propagation distance from P1 to Q1 and from P1 to Q2 is smaller than
the coherence length. The same applies to the fields at Q1 and Q2 due to point source P2.
Therefore the fields at Q1 and Q2 are correlated (i.e. they are partially coherent), even though
the fields at P1 and P2 are completely uncorrelated.

Remark. Also in the case of an extended quasi-monochromatic spatially incoherent source (i.e.
a source which consists of many spatially incoherent point sources), the mutual coherence in
two points is directly related to the size of the extended source. The larger the angle the source
subtends at the plane of observation, the less spatially coherent the field is in this plane⁹. This
property is used in stellar interferometry¹⁰. It works as follows: we want to know the size of
a certain star. The size of the star, being an extended spatially incoherent source, determines
the spatial coherence of the light we receive on earth. Thus, by measuring the light with two
separated telescopes and letting the two signals interfere, we effectively create a double-slit
experiment, with which we can measure the degree of spatial coherence of the star light on earth,
and thus measure the angle which the star subtends on earth. Then, if we know the distance of
the star by independent means, e.g. from its spectral brightness, we can deduce its size.

6.4 Fringe Visibility


Several times now we have described the coherence of light as ‘the ability to form fringes’. With
this phrase we referred to the time average of the interference term ⟨Re{U1* U2}⟩: if this term
is 0 then no fringes will form, and the larger it is, the more distinct the fringes will be. However,
there is a quantity called the fringe visibility that uses directly observable data (i.e. intensities
instead of fields) to quantify the ability to form fringes. Given some interference intensity pattern
I(x) (see Fig. 6.6), the visibility is defined as

\[ V = \frac{I_{\max} - I_{\min}}{I_{\max} + I_{\min}}. \tag{6.72} \]

For example, if we have two perfectly coherent, monochromatic point sources U1, U2 (the slits
in a double-slit experiment), with intensities I1 = |U1|² and I2 = |U2|², then the interference
pattern is given by (6.52):

\[ I(\tau) = I_1 + I_2 + 2\sqrt{I_1 I_2}\, \cos(\omega\tau - \phi). \tag{6.73} \]

We then get

\[ I_{\max} = I_1 + I_2 + 2\sqrt{I_1 I_2}, \qquad I_{\min} = I_1 + I_2 - 2\sqrt{I_1 I_2}, \tag{6.74} \]

so

\[ V = \frac{2\sqrt{I_1 I_2}}{I_1 + I_2}. \tag{6.75} \]

In case I1 = I2, we find V = 1. In the opposite case where U1 and U2 are completely incoherent,
we find

\[ I(\tau) = I_1 + I_2, \tag{6.76} \]

from which follows

\[ I_{\max} = I_{\min} = I_1 + I_2, \tag{6.77} \]

which gives V = 0.
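
The visibility is easy to evaluate from sampled intensities. The sketch below applies definition (6.72) to a two-beam pattern sampled over one fringe period. The factor gamma_abs is an assumption added for illustration: it models partial coherence by scaling the interference term of (6.48) with a constant modulus |γ12|, so that gamma_abs = 1 is the coherent case and gamma_abs = 0 the incoherent one.

```python
import math


def visibility(intensities):
    """Fringe visibility (6.72) from a list of sampled intensities."""
    i_max, i_min = max(intensities), min(intensities)
    return (i_max - i_min) / (i_max + i_min)


def two_beam_pattern(i1, i2, gamma_abs=1.0, samples=360):
    """Sample I = I1 + I2 + 2 sqrt(I1 I2) |gamma| cos(phase) over one fringe period."""
    return [
        i1 + i2 + 2.0 * math.sqrt(i1 * i2) * gamma_abs * math.cos(2.0 * math.pi * j / samples)
        for j in range(samples)
    ]


print(visibility(two_beam_pattern(1.0, 1.0)))        # equal, fully coherent beams
print(visibility(two_beam_pattern(4.0, 1.0)))        # unequal intensities
print(visibility(two_beam_pattern(1.0, 1.0, 0.5)))   # partially coherent beams
print(visibility(two_beam_pattern(1.0, 1.0, 0.0)))   # incoherent beams
```

For equal intensities the visibility computed this way equals the assumed |γ12|, which is why the visibility is a convenient experimental measure of the degree of coherence.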
⁹ See Hecht §9.2.1 ‘Temporal and spatial coherence’
¹⁰ Hecht §12.4.1 ‘The Michelson Stellar Interferometer’


[Figure 6.6 (plot): I(x) versus x, with the levels Imax and Imin marked.]

Figure 6.6: Illustration of Imax and Imin of an interference pattern I(x) that determines the
visibility V.

6.5 Interference and polarization


We have now seen how light interferes and how the interference is affected by the coherence of the
light. We have however ignored the vectorial nature of light (i.e. its polarization) by assuming
that all the fields have the same polarization. Thus, the preceding discussion was only about a
specific, simplified case. How does light interfere in general?
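Suppose we have two vectorial fields E1, E2. The (instantaneous) intensity of each field is given
by

\[ \mathbf{E}_1 \cdot \mathbf{E}_1, \qquad \mathbf{E}_2 \cdot \mathbf{E}_2, \tag{6.78} \]

where we have ignored a multiplicative constant. If the two fields interfere, the instantaneous
intensity is given by

\[ (\mathbf{E}_1 + \mathbf{E}_2) \cdot (\mathbf{E}_1 + \mathbf{E}_2) = \mathbf{E}_1 \cdot \mathbf{E}_1 + \mathbf{E}_2 \cdot \mathbf{E}_2 + 2\,\mathbf{E}_1 \cdot \mathbf{E}_2, \tag{6.79} \]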

where 2E1 · E2 is the interference term. Suppose the polarization of E1 is orthogonal to the
polarization of E2, e.g.

\[ \mathbf{E}_1 = \begin{pmatrix} E_{1x} \\ 0 \\ 0 \end{pmatrix}, \qquad \mathbf{E}_2 = \begin{pmatrix} 0 \\ E_{2y} \\ 0 \end{pmatrix}. \tag{6.80} \]

Then E1 · E2 = 0, which means the two fields cannot interfere. This observation is the First
Fresnel-Arago law: fields with orthogonal polarization cannot interfere¹¹.
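
The first Fresnel-Arago law can be checked directly with (6.79). The sketch below treats the instantaneous fields as real three-component vectors with arbitrary example amplitudes; the function names are illustrative assumptions.

```python
def dot(e1, e2):
    """Scalar product of two real field vectors."""
    return sum(a * b for a, b in zip(e1, e2))


def instantaneous_intensity(e1, e2):
    """Left-hand side of (6.79): (E1 + E2) . (E1 + E2), constants ignored."""
    total = [a + b for a, b in zip(e1, e2)]
    return dot(total, total)


def interference_term(e1, e2):
    """The term 2 E1 . E2 in (6.79)."""
    return 2.0 * dot(e1, e2)


# Orthogonal polarizations as in (6.80): the interference term vanishes.
e1, e2 = (1.5, 0.0, 0.0), (0.0, 2.0, 0.0)
print(interference_term(e1, e2), instantaneous_intensity(e1, e2))

# Parallel polarizations: the interference term contributes.
p1, p2 = (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)
print(interference_term(p1, p2), instantaneous_intensity(p1, p2))
```

In the orthogonal case the total intensity is just the sum of the two separate intensities, whatever the amplitudes are.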

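Now let us write the fields in terms of their orthogonal components

\[ \mathbf{E}_1 = \begin{pmatrix} E_{1\perp} \\ E_{1\parallel} \end{pmatrix}, \qquad \mathbf{E}_2 = \begin{pmatrix} E_{2\perp} \\ E_{2\parallel} \end{pmatrix}. \tag{6.81} \]

This is always possible, whether the fields are polarized or randomly polarized. Then (6.79)
becomes

\[ \mathbf{E}_1 \cdot \mathbf{E}_1 + \mathbf{E}_2 \cdot \mathbf{E}_2 + 2\,\mathbf{E}_1 \cdot \mathbf{E}_2 = E_{1\perp}^2 + E_{2\perp}^2 + 2 E_{1\perp} E_{2\perp} + E_{1\parallel}^2 + E_{2\parallel}^2 + 2 E_{1\parallel} E_{2\parallel}. \tag{6.82} \]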

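If the fields are randomly polarized, the time average of the ⊥-part will equal the average of the
∥-part, so the time averaged intensity becomes

\[ \begin{aligned} I &= 2\, \langle E_{1\perp}^2 + E_{2\perp}^2 + 2 E_{1\perp} E_{2\perp} \rangle \\ &= 2\, \langle E_{1\parallel}^2 + E_{2\parallel}^2 + 2 E_{1\parallel} E_{2\parallel} \rangle. \end{aligned} \tag{6.83} \]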
¹¹ See Hecht 9.1


This is qualitatively the same as what we would get if the fields had parallel polarization, e.g.

\[ \mathbf{E}_1 = \begin{pmatrix} E_{1\perp} \\ 0 \end{pmatrix}, \qquad \mathbf{E}_2 = \begin{pmatrix} E_{2\perp} \\ 0 \end{pmatrix}. \tag{6.84} \]

This gives us the Second Fresnel-Arago law: two fields with parallel polarization
interfere the same way as two fields that are randomly polarized. This also indicates that
our initial assumption in the previous sections that all our fields have parallel polarization is not
as limiting as it may have appeared at first.

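The third Fresnel-Arago law states the following: suppose we have some field

\[ \mathbf{E} = \begin{pmatrix} E_\perp \\ E_\parallel \end{pmatrix} \tag{6.85} \]

that is randomly polarized. Suppose we separate the two polarizations, and rotate one so that
the two resulting fields are aligned, e.g.

\[ \mathbf{E}_1 = \begin{pmatrix} E_\perp \\ 0 \end{pmatrix}, \qquad \mathbf{E}_2 = \begin{pmatrix} E_\parallel \\ 0 \end{pmatrix}. \tag{6.86} \]

Then these two fields cannot interfere with each other, because E⊥ and E∥ are
incoherent.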

External sources in recommended order:

1. Veritasium - The original double slit experiment, starting at 2:15 - Demonstration of an
interference pattern obtained with sunlight.

2. MIT OCW - Two-beam Interference - Collimated Beams: Interference of laser light in a
Michelson interferometer.

3. MIT OCW - Fringe Contrast - Path Difference: Demonstration of how fringe contrast
varies with propagation distance.

4. MIT OCW - Coherence Length and Source Spectrum: Demonstration of how the coherence
length depends on the spectrum of the laser light.

5. Hecht - §7.2.1, 7.4.2, 7.4.3, 9.2.1, 9.3.1, 12.1, 12.3, 12.4.1.

6. Lecture - 18 Coherence: Lecture Series on Physics - I: Oscillations and Waves by
Prof. S. Bharadwaj, Department of Physics and Meteorology, IIT Kharagpur.

7. Lecture - 19 Coherence: Lecture Series on Physics - I: Oscillations and Waves by
Prof. S. Bharadwaj, Department of Physics and Meteorology, IIT Kharagpur.

8. Hecht 9.1

Chapter 7

Scalar Diffraction Optics

What you should know and be able to do after studying this chapter.

• Know when the scalar wave equation can be used to propagate fields.

• Know how to derive the Angular Spectrum Method of propagation, starting
from the scalar wave equation, via the Helmholtz equation. Know the
propagation formula of the Angular Spectrum Method (also known as the plane
wave expansion).

• Know the Rayleigh-Sommerfeld formula (not necessarily all details, but at
least be able to write the integral over spherical waves with amplitudes
proportional to the field in the starting plane).

• Know how to deduce the Fresnel and Fraunhofer approximation of the
Rayleigh-Sommerfeld integral.

• Understand intuitively how the Fourier transform says something about
resolution.

• Understand why propagation of light leads to loss of resolution (i.e. the
evanescent waves disappear).

• Understand intuitively the uncertainty principle of Fourier transforms.

• Know how the Fresnel and Fraunhofer propagation integrals relate to Fourier
transforms.

• Know that and why propagation to the far field corresponds to taking the
Fourier transform.

• Know that and why propagation to the focal plane of a lens corresponds to
taking the Fourier transform.

• Understand why the Numerical Aperture (NA) of a lens determines the
resolution of the images it forms.

• Understand how the Fourier transforming property of a lens can be used for
Fourier filtering.

In this chapter we will study the theory that describes how light propagates. Why is this
important? It is the propagation of light that revealed to us its wave-like nature: in the
double-slit experiment, we inferred from the interference pattern observed on a screen after
propagation that light is a wave. To demonstrate more convincingly that light is indeed a wave,
we require a detailed quantitative model of the propagation of light, which gives experimentally
verifiable predictions.

But a precise description of the propagation of light is not only important for fundamental
science, it also has many practical applications. For example, if we want to analyse a sample by
illuminating it and measuring the scattered light, we need to take into account the fact that the
detected light has not only been affected by the sample, but by both the sample and propagation.
Another example is lithography: if we want to print a pattern onto a substrate using a mask
that is illuminated, we need to realize that if there is a certain distance between the mask and
the photoresist, the light that reaches the resist doesn’t have the exact shape of the mask because
of propagation effects. Thus, the mask needs to be designed to compensate for this effect. These
motivations are illustrated in Fig. 7.1.

[Figure 7.1, panels (a) and (b): slits, propagation and an interference pattern on a screen; a sample, propagation and a detector.]

Figure 7.1: Several motivations for a quantitative model of the propagation of light. It may serve
fundamentally scientific purposes, since it would provide predictions that can be tested. It could
also be applied in sample analyses or lithography.

In Section 2.4 we have derived that in homogeneous matter (i.e. the permittivity and
refractive index are constant), the complex amplitude U(r) of every component
Re{U(r)e^{−iωt}} of a time-harmonic electromagnetic field satisfies the scalar Helmholtz
equation (2.23):

\[ \left( \nabla^2 + k^2 \right) U(\mathbf{r}) = 0, \tag{7.1} \]

where k = ω√(εμ0) is the wave number of the light in matter with permittivity ε and refractive
index n = √(ε/ε0).

One particular solution of the Helmholtz equation (if we do not impose boundary conditions) is
the plane wave solution
\[ U(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}, \tag{7.2} \]
where
\[ k_x^2 + k_y^2 + k_z^2 = k^2. \tag{7.3} \]
The choice of the sign of the components of the wave vector k determines in which direction the
plane wave propagates as time progresses.

In the following Sections 7.1 and 7.2 we will present two equivalent methods to compute the
propagation of the field through homogeneous matter, i.e. matter of which the refractive index is
independent of position. Although both methods in the end describe the same physics, the two
different forms give physical insight into different aspects of propagation, as will be seen in
Sections 7.4 and 7.5. The two methods which we will discuss can be applied to propagate any
component U of the electromagnetic field, provided the propagation is in homogeneous matter.
With this assumption the two methods give identical and rigorous results.

When the refractive index is not constant, Maxwell’s equations are no longer equivalent
to the wave equation for the individual electromagnetic field components, and the components
are then coupled by the curl operators in Maxwell’s equations. When the variation of
the refractive index is slow on the scale of the wavelength, the scalar wave equation may still be
a good approximation, but for structures that vary on the scale of the wavelength (i.e. on the
scale of ten microns or less), the scalar wave equation is not sufficiently accurate.

7.1 Angular Spectrum Method

[Figure 7.2 (diagram): the known field U(x, y, 0) in the plane z = 0 propagates along z to the unknown field U(x, y, z); axes x, y, z.]
Figure 7.2: Illustration of what is calculated. We know the field U in the plane z = 0. Given
that field U (x, y, 0), we want to find U in the plane z. It is assumed that the field propagates in
the positive z-direction, which means that all sources are in z < 0.

Our goal is to find what the field looks like in some plane z = constant, given the field
in the plane z = 0, as is illustrated in Fig. 7.2. The sources of the field are assumed to be
in the half space z < 0. One way to see how light propagates from one plane to another is
by using the angular spectrum method. We decompose the field in plane waves with a
two-dimensional Fourier transform. Since we know how each plane wave propagates, we can
propagate each component separately and then add them all together by taking the inverse
Fourier transform. Mathematically, it is described as follows: we know the field U(x, y, 0). We
will write U0(x, y) = U(x, y, 0) for convenience and apply a two-dimensional Fourier transform
to U0:

\[ \mathcal{F}(U_0)(\xi, \eta) = \iint U_0(x, y)\, e^{-2\pi i(\xi x + \eta y)}\, dx\, dy. \tag{7.4} \]

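The inverse Fourier transform implies:

\[ U_0(x, y) = \iint \mathcal{F}(U_0)(\xi, \eta)\, e^{2\pi i(\xi x + \eta y)}\, d\xi\, d\eta \tag{7.5} \]
\[ \phantom{U_0(x, y)} = \mathcal{F}^{-1}\{\mathcal{F}(U_0)\}(x, y). \tag{7.6} \]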

The most important properties of the Fourier transform are listed in Appendix E. By defining
kx = 2πξ, ky = 2πη, (7.5) can be written as

\[ U_0(x, y) = \frac{1}{4\pi^2} \iint \mathcal{F}(U_0)\!\left( \frac{k_x}{2\pi}, \frac{k_y}{2\pi} \right) e^{i(k_x x + k_y y)}\, dk_x\, dk_y. \tag{7.7} \]

The variables in the Fourier plane, (ξ, η) and (kx, ky), are called spatial frequencies.
Equation (7.7) says we can write U0(x, y) = U(x, y, z = 0) as an integral (a sum) of plane
waves¹ with wave vector k = (kx, ky, kz)ᵀ, each with weight F(U0)(kx/2π, ky/2π). We know how
each plane wave with complex amplitude F(U0)(kx/2π, ky/2π) and wave vector k = (kx, ky, kz)ᵀ
propagates over a distance z > 0:

\[ \mathcal{F}(U_0)\!\left( \frac{k_x}{2\pi}, \frac{k_y}{2\pi} \right) e^{i(k_x x + k_y y)} \;\rightarrow\; \mathcal{F}(U_0)\!\left( \frac{k_x}{2\pi}, \frac{k_y}{2\pi} \right) e^{i(k_x x + k_y y + k_z z)}. \tag{7.8} \]

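Thus, the field U(x, y, z) in the plane z (for some z > 0) is given by

\[ U(x, y, z) = \frac{1}{4\pi^2} \iint \mathcal{F}(U_0)\!\left( \frac{k_x}{2\pi}, \frac{k_y}{2\pi} \right) e^{i(k_x x + k_y y + k_z z)}\, dk_x\, dk_y, \tag{7.9} \]

where

\[ k_z = \sqrt{\left( \frac{2\pi}{\lambda} \right)^2 - k_x^2 - k_y^2}, \tag{7.10} \]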
with λ the wavelength of the light as measured in the material (hence, λ = λ0 /n, with λ0 the
wavelength in vacuum). The sign in front of the square root in (7.10) could in principle be chosen
negative: one would then also obtain a solution of the Helmholtz equation. The choice of the
sign of kz is determined by the direction in which the light propagates, which in turn depends
on the location of the sources. We have to choose here the + sign because the time dependence
is given by e−iωt and the sources are assumed to be in z < 0.
We can write (7.9) alternatively as

\[ U(x, y, z) = \mathcal{F}^{-1}\{\mathcal{F}(U_0)(\xi, \eta)\, e^{i k_z z}\}(x, y), \tag{7.11} \]

where now kz is to be interpreted as a function of (ξ, η):

\[ k_z = 2\pi \sqrt{\left( \frac{1}{\lambda} \right)^2 - \xi^2 - \eta^2}. \tag{7.12} \]

Note that one can interpret this as a diagonalization of the propagation operator, as explained
in Appendix F.
We can observe something interesting: if kx² + ky² > (2π/λ)², then kz becomes imaginary, and
e^{ikz z} decays exponentially for increasing z:

\[ e^{i\left[ k_x x + k_y y + z\sqrt{(2\pi/\lambda)^2 - k_x^2 - k_y^2} \right]} = e^{i(k_x x + k_y y)}\, e^{-z\sqrt{k_x^2 + k_y^2 - (2\pi/\lambda)^2}}. \tag{7.13} \]

These exponentially decaying waves are evanescent in the positive z-direction. We have met
evanescent waves already in the context of total internal reflection discussed in Section 2.10.5.
The physical consequences of evanescent waves in the angular spectrum decomposition will be
explained in Section 7.4.
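
The angular spectrum recipe of (7.10) and (7.11) can be sketched in a few lines for a field that depends on x only (so ky = 0). This is a minimal illustration under stated assumptions, not an optimized implementation: the field is sampled on a periodic grid, the Fourier integrals are replaced by a naive discrete Fourier transform, and all names and grid values are chosen for the example.

```python
import cmath
import math


def kz_of(kx, k):
    """z-component of the wave vector, (7.10) with ky = 0.

    For |kx| > k the principal square root is positive imaginary,
    which gives the exponentially decaying evanescent waves of (7.13).
    """
    return cmath.sqrt(k * k - kx * kx)


def dft(values, sign):
    """Naive discrete Fourier sum with kernel exp(sign * 2 pi i m q / n)."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2.0 * math.pi * 1j * m * q / n) for q, v in enumerate(values))
        for m in range(n)
    ]


def propagate_1d(u0, dx, wavelength, z):
    """Propagate samples u0 of U(x, 0) over a distance z, following (7.11)."""
    n = len(u0)
    k = 2.0 * math.pi / wavelength
    spectrum = dft(u0, -1)                       # decompose into plane waves
    shifted = []
    for m, amplitude in enumerate(spectrum):
        m_signed = m if m < n // 2 else m - n    # signed spatial-frequency index
        kx = 2.0 * math.pi * m_signed / (n * dx)
        shifted.append(amplitude * cmath.exp(1j * kz_of(kx, k) * z))
    return [v / n for v in dft(shifted, +1)]     # add the plane waves together


# Example in units of the wavelength: 32 samples, spacing lambda/4.
n, dx, wavelength = 32, 0.25, 1.0
tilted = [cmath.exp(2.0 * math.pi * 1j * 3 * q / n) for q in range(n)]     # kx = 0.375 k
evanescent = [cmath.exp(2.0 * math.pi * 1j * 10 * q / n) for q in range(n)]  # kx = 1.25 k
print(abs(propagate_1d(tilted, dx, wavelength, 2.5)[0]))
print(abs(propagate_1d(evanescent, dx, wavelength, 0.5)[0]))
```

The first field is a single propagating plane wave, so its modulus is unchanged by propagation; the second has kx > k and its modulus decays with z, as described above.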

¹ Every picture is made of waves - Sixty Symbols, 3:33 to 7:15: Basic explanation of Fourier transforms. Also
see Section 7.4.


Figure 7.3: The plane waves in the angular spectrum of a time-harmonic field which propagates in
the z-direction are parametrized by kx, ky. There are two types of waves: the propagating waves,
which correspond to √(kx² + ky²) < k and which have constant amplitude, and the evanescent waves,
for which √(kx² + ky²) > k and of which the amplitude decreases exponentially with propagation.

Remark. In homogeneous space the scalar Helmholtz equation for every electric field
component is equivalent to Maxwell’s equations and hence we may propagate each component Ex,
Ey and Ez individually using the angular spectrum method. If the data in the plane z = 0 of
these field components are physically consistent, the thus obtained electric field will automatically
satisfy the condition that the electric field is free of divergence, i.e.

\[ \nabla \cdot \mathbf{E} = 0, \tag{7.14} \]

everywhere in z > 0. This is equivalent to the statement that the electric vectors of the plane
waves in the angular spectrum are perpendicular to their wave vectors. Alternatively, one can
propagate only the Ex and Ey -components and afterwards determine Ez from the condition that
(7.14) must be satisfied.
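
For a single plane wave in the angular spectrum, condition (7.14) reduces to k · E = 0, so Ez follows from Ex and Ey as Ez = −(kx Ex + ky Ey)/kz. The sketch below applies this to one plane-wave component; the amplitudes and wave numbers are arbitrary example values, and the formula cannot be used when kz = 0.

```python
import cmath
import math


def ez_from_transverse(ex, ey, kx, ky, k):
    """Ez of one plane wave from the condition k . E = 0, see (7.14).

    kz follows from (7.10); the case kz = 0 (wave travelling parallel
    to the plane z = 0) is excluded because of the division by kz.
    """
    kz = cmath.sqrt(k * k - kx * kx - ky * ky)
    return -(kx * ex + ky * ey) / kz


# Example plane wave (assumed values), lengths in units of the wavelength.
k = 2.0 * math.pi
kx, ky = 0.3 * k, 0.4 * k
ex, ey = 1.0, 0.5j
ez = ez_from_transverse(ex, ey, kx, ky, k)
kz = cmath.sqrt(k * k - kx * kx - ky * ky)
print(ez, abs(kx * ex + ky * ey + kz * ez))
```

Applying this to every plane wave of the spectrum and summing the results gives the Ez component of the propagated field.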

7.2 Rayleigh-Sommerfeld Diffraction Integral


Another method to propagate a wave field is by using the Rayleigh-Sommerfeld integral. A
very good approximation of this integral states that each point in the plane z = 0 emits spherical
waves, and to find the field in a point (x, y, z), we have to add the contributions from all these
point sources together. This corresponds to the Huygens-Fresnel principle postulated earlier.
Because a more rigorous derivation starting from the Helmholtz equation² would be complicated
and lengthy, and would not give much additional physical insight, we will just present the final
result

\[ \begin{aligned} U(x, y, z) &= \frac{1}{i\lambda} \iint U(x', y', 0)\, \frac{z\, e^{ik\sqrt{(x - x')^2 + (y - y')^2 + z^2}}}{(x - x')^2 + (y - y')^2 + z^2}\, dx'\, dy' \\ &= \frac{1}{i\lambda} \iint U(x', y', 0)\, \frac{z}{r}\, \frac{e^{ikr}}{r}\, dx'\, dy', \end{aligned} \tag{7.15} \]
² Introduction to Fourier Optics, J. Goodman, §3.3, §3.4, §3.5 - A rigorous but lengthy derivation of the
Rayleigh-Sommerfeld diffraction integral.


where we defined

\[ r = \sqrt{(x - x')^2 + (y - y')^2 + z^2}. \tag{7.16} \]
Remark. In (7.15) there is an additional factor z/r compared to the expressions for a time-
harmonic spherical wave given in (2.51) and at the right-hand side of (6.41). This factor means
that the spherical waves in the Rayleigh-Sommerfeld diffraction integral have amplitudes that
depend on the angle of radiation (although their wave front is spherical), the amplitude being
largest in the forward direction.
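
The integral (7.15) can be evaluated numerically in simple geometries. The sketch below computes the field on the axis behind a circular hole of radius a that is illuminated by a unit-amplitude plane wave, so that U(x', y', 0) = 1 inside the hole and 0 outside; this aperture, the midpoint quadrature over rings and the function name are assumptions made for the example. Far from the hole, where r ≈ z and the phase kr is nearly constant over the hole, (7.15) reduces to a modulus of about πa²/(λz), which serves as a check.

```python
import cmath
import math


def rs_on_axis(radius, z, wavelength, n_rings=2000):
    """On-axis field from (7.15) for a circular hole lit by a unit plane wave.

    The hole is divided into thin rings of radius rho and width d_rho;
    every point of a ring has the same distance r to the axial point (0, 0, z).
    """
    k = 2.0 * math.pi / wavelength
    d_rho = radius / n_rings
    total = 0j
    for j in range(n_rings):
        rho = (j + 0.5) * d_rho                  # midpoint of ring j
        r = math.hypot(rho, z)                   # distance (7.16) for x = y = 0
        ring_area = 2.0 * math.pi * rho * d_rho
        total += (z / r) * cmath.exp(1j * k * r) / r * ring_area
    return total / (1j * wavelength)


# Lengths in units of the wavelength (assumed example values).
radius, wavelength = 10.0, 1.0
for z in (1.0e2, 1.0e3, 1.0e4):
    far_field_estimate = math.pi * radius ** 2 / (wavelength * z)
    print(f"z = {z:.0e}: |U| = {abs(rs_on_axis(radius, z, wavelength)):.5f}, "
          f"pi a^2/(lambda z) = {far_field_estimate:.5f}")
```

The number of rings must be large enough to resolve the phase variation of e^{ikr} across the hole; for points close to the aperture a finer subdivision is needed than in the far field.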

7.3 Equivalence of the Two Propagation Methods


One can note that the angular spectrum method relies on a multiplication in Fourier space,
while the Rayleigh-Sommerfeld integral is a convolution. It is one of the properties of Fourier
transforms that a multiplication in Fourier space corresponds to a convolution in real space and
vice versa. Indeed a mathematical result called Weyl’s identity shows that these formulas are
the same, as they should be.

7.4 Intuition for the Spatial Fourier Transform in Optics


Since spatial Fourier transformations have played and will play a significant role in our discussion
of the propagation of light, it is important to understand them not just mathematically, but also
intuitively.

What happens when an object is illuminated and the reflected or transmitted light is detected
at some distance from the object? Let us look at transmission for example. When the object is
much larger than the wavelength, a transmission function τ (x, y) is often defined and the field
transmitted by the object is then assumed to be simply the product of the incident field and the
function τ (x, y). For example, for a hole in a metallic screen with diameter large compared to
the wavelength, the transmission function would be 1 inside the hole and 0 outside. However, if
the object has features of the size of the order of the wavelength, this simple model breaks down
and the transmitted field must instead be determined by solving Maxwell’s equations. This is
not easy but there exist software packages that can do it.

Now suppose that the transmitted electric field has been obtained in a plane z = 0 very close
to the object (a distance within a fraction of a wavelength). This field is called the transmit-
ted near field and it may have been obtained by simply multiplying the incident field with a
transmission function τ (x, y) or by solving Maxwell’s equations. This transmitted near field is
a kind of footprint of the object. But it should be clear that although it is quite common in
optics to speak in terms of "imaging an object", strictly speaking we do NOT image an object
as such, but we image the transmitted (or reflected) near field which is a kind of copy of the object.

After the transmitted near field has been obtained, we apply the angular spectrum method
to propagate the individual components through homogeneous matter (e.g. air) from the object
to the detector plane or to an optical element like a lens. The first step is to Fourier transform
the transmitted component $U_0(x, y) = U(x, y, 0)$. So what is a spatial Fourier transform? A
spatial Fourier transform decomposes the component into plane waves. To each plane wave,
characterized by the wave numbers $k_x$ and $k_y$, it assigns a complex amplitude
$\mathcal{F}(U_0)\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}\right)$,
the magnitude of which indicates how important a role this particular wave plays in
the formation of the object. So what can we say about an object $U_0(x, y)$, simply by looking at
the magnitude of its spatial Fourier transform
$\left|\mathcal{F}(U_0)\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}\right)\right|$?


Suppose $U_0(x, y)$ has sharp features, i.e. there are regions where $U_0(x, y)$ varies rapidly as a
function of x and y. To describe these features as a combination of plane waves, these waves
must also vary rapidly as a function of x and y, which means that the length of their wave vectors
$\sqrt{k_x^2 + k_y^2}$ must be large. Thus, the more sharp features $U_0(x, y)$ has, the larger we can expect
$\left|\mathcal{F}(U_0)\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}\right)\right|$ to be for large $\sqrt{k_x^2 + k_y^2}$, i.e. high spatial frequencies can be expected to have
large amplitude. Similarly, the slowly varying, broad features of $U_0(x, y)$ are described by slowly
fluctuating waves, i.e. by $\mathcal{F}(U_0)\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}\right)$ for small $\sqrt{k_x^2 + k_y^2}$, i.e. for low spatial frequencies.
This is sketched in Fig. 7.4.

To illustrate these concepts we choose a certain field, take its Fourier transform, remove the
higher spatial frequencies and then invert the Fourier transform. We then expect that the result-
ing field has lost its sharp features and only retains its broad features, i.e. the image is blurred.
Conversely, if we remove the lower spatial frequencies but retain the higher, then the result will
only show its sharp features, i.e. its contours. These effects are shown in Fig. 7.5.
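
The filtering experiment just described can be sketched numerically as follows. This is only an illustration in Python with NumPy under stated assumptions: the test object (a bright square), the grid size and the cutoff frequency `cutoff` are our own hypothetical choices, and they are not the image or the settings used for Fig. 7.5.

```python
import numpy as np

n = 128
x = np.arange(n) - n // 2
X, Y = np.meshgrid(x, x)
# Illustrative object: a bright square on a dark background (sharp edges).
u0 = ((np.abs(X) < 20) & (np.abs(Y) < 20)).astype(float)

# Spatial frequencies in cycles per sample, and their radial magnitude.
f = np.fft.fftfreq(n)
FX, FY = np.meshgrid(f, f)
radius = np.sqrt(FX**2 + FY**2)

spectrum = np.fft.fft2(u0)
cutoff = 0.05  # hypothetical cutoff frequency, chosen only for illustration

low_pass = np.fft.ifft2(spectrum * (radius <= cutoff)).real   # broad features only
high_pass = np.fft.ifft2(spectrum * (radius > cutoff)).real   # sharp features only

# Largest jump between neighbouring pixels along the central row:
edge_original = np.max(np.abs(np.diff(u0[n // 2])))
edge_low_pass = np.max(np.abs(np.diff(low_pass[n // 2])))
print(edge_original, edge_low_pass)
```

Removing the high spatial frequencies reduces the steepest pixel-to-pixel jump, which is the blurring described above. Because the two filters are complementary, the low-pass and high-pass images add up to the original.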
Recall the observation we made about Eq. (7.13): if $k_x^2 + k_y^2 > \left(\frac{2\pi}{\lambda}\right)^2$, the plane wave decays
exponentially as the field propagates. We have seen that losing these high spatial frequencies
leads to loss of resolution: propagation of light leads to irrecoverable loss of resolution.
Because by propagation through homogeneous space all the information contained in the high
spatial frequencies corresponding to evanescent waves is lost (only exponentially small amplitudes
of the evanescent waves remain), perfect imaging is impossible no matter how well designed an
optical system is. It is this fact that motivates near-field microscopy which tries to detect these
evanescent waves by scanning close to the sample, thus obtaining a high resolution. Another
relevant research topic is hyperbolic metamaterials, in which evanescent waves can propagate,
rather than decay exponentially [3].

So we have seen how we can guess properties of some "object" $U_0(x, y)$ given the amplitude
of its spatial Fourier transform $\left|\mathcal{F}(U_0)\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}\right)\right|$. But what about the phase of $\mathcal{F}(U_0)\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}\right)$?
Although one cannot really guess properties of $U_0(x, y)$ by looking at the phase of $\mathcal{F}(U_0)\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}\right)$
the same way as we can by looking at its amplitude, it is in fact the phase that plays a larger role in
defining $U_0(x, y)$. This is illustrated in Fig. 7.6: if the amplitude information of $\mathcal{F}(U_0)\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}\right)$
is removed, features of the original object $U_0(x, y)$ may still be retrieved. However, if we only
know the amplitude $\left|\mathcal{F}(U_0)\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}\right)\right|$ but not the phase, then the original object is utterly lost.
Thus, the phase of a field $\mathcal{F}(U_0)$ is very important, arguably sometimes even more important
than its amplitude. However, we cannot measure the phase of a field directly, only its intensity
$I = |\mathcal{F}(U_0)|^2$ from which we can calculate the amplitude $|\mathcal{F}(U_0)|$. It is this fact that makes phase
retrieval an entire field of study on its own: how can we find the phase of a field, given that we
can only perform intensity measurements? This question is related to a new field of optics called
"lensless imaging", where amplitudes AND phases are retrieved from intensity measurements
and the image is reconstructed computationally. Interesting as this topic may be, we will not
treat it in these notes and refer instead to master courses in optics [4].
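
The roles of amplitude and phase can be explored with a small numerical experiment. The sketch below (Python with NumPy) is an illustration under our own assumptions: the object is a hypothetical off-centre rectangle, not the image of Fig. 7.6, and the correlation coefficient is only one crude way to measure similarity.

```python
import numpy as np

n = 128
x = np.arange(n)
X, Y = np.meshgrid(x, x)
# Illustrative object: an off-centre bright rectangle.
u0 = ((X > 30) & (X < 60) & (Y > 70) & (Y < 100)).astype(float)

spectrum = np.fft.fft2(u0)
amplitude = np.abs(spectrum)
phase = np.angle(spectrum)

# Keep the phase, discard the amplitude (set it to 1 everywhere).
phase_only = np.fft.ifft2(np.exp(1j * phase)).real
# Keep the amplitude, discard the phase (set it to 0 everywhere).
amplitude_only = np.fft.ifft2(amplitude).real

def correlation(a, b):
    """Normalised correlation coefficient between two real images."""
    a = a - a.mean()
    b = b - b.mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a**2) * np.sum(b**2)))

# Where does the amplitude-only image put its brightest pixel?
peak_index = tuple(int(i) for i in np.unravel_index(np.argmax(amplitude_only), amplitude_only.shape))
print(correlation(u0, phase_only), peak_index)
```

The phase-only image remains positively correlated with the object, whereas the amplitude-only image has its maximum at the array origin regardless of where the rectangle sits: the information about the position of the object is carried by the phase.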

Remark. The importance of the phase for the field can also be seen by looking at the plane wave
expansion (7.9). We have seen that the field in a plane z = constant can be obtained by propagating
the plane waves, i.e. by multiplying their amplitudes by the phase factors $e^{i k_z z}$, which depend
on the propagation distance z. If one leaves the evanescent waves out of consideration (which
after some distance do not contribute to the field anymore anyway), it follows that ONLY the
phases of the plane waves change upon propagation while their amplitudes (the moduli of their
complex amplitudes) do NOT change. Yet, depending on the propagation distance z, widely
differing light patterns are obtained (see e.g. Fig. 7.8).

[3] Hyperbolic materials allow in principle super-resolution. They are treated in the Master course "Advanced
Photonics".
[4] See the course Advanced Photonics.

[Figure 7.4: sketch of the spatial Fourier transform plane with axes kx and ky. The region near the origin is labelled "Low spatial frequencies: slow fluctuations, broad features"; the outer region is labelled "High spatial frequencies: fast fluctuations, sharp features".]

Figure 7.4: A qualitative interpretation of spatial Fourier transforms. The low spatial frequencies
(i.e. small $\sqrt{k_x^2 + k_y^2}$) represent slow fluctuations, and therefore contribute to the broad features
of the real-space object. The high spatial frequencies (i.e. large $\sqrt{k_x^2 + k_y^2}$) fluctuate rapidly, and
can therefore form sharp features in the real-space object.

Another important qualitative aspect of the Fourier transform is the uncertainty principle. It
states that many waves of different frequencies have to be added to get a function that is
confined to a small space [5]. Stated differently, if U(x, y) is confined to a very small region, then
$\mathcal{F}(U)(k_x, k_y)$ must be very spread out. This can also be illustrated by the scaling property of
the Fourier transform:
$$
\text{if } h(x) = f(ax) \text{ then } \mathcal{F}(h)\left(\frac{k_x}{2\pi}\right) = \frac{1}{|a|}\,\mathcal{F}(f)\left(\frac{k_x}{2\pi a}\right), \tag{7.17}
$$
which simply states that the more h(x) is squeezed, the more its Fourier transform $\mathcal{F}(h)$ spreads
out. This principle is shown in Fig. 7.7. Perhaps you are more familiar with the uncertainty
principle in the context of quantum physics: a particle cannot have both a definite momentum
and a definite position. In fact, this is just one particular manifestation of the uncertainty
principle just described. A quantum state $|\psi\rangle$ can be described in the position basis $\psi_x(x)$ as
well as in the momentum basis $\psi_p(p)$. The basis transformation that links these two expressions
is the Fourier transform
$$
\psi_p(p) = \mathcal{F}\{\psi_x(x)\}(p), \tag{7.18}
$$
so of course the two are subject to the uncertainty principle! In fact, any two quantum observables
which are related by Fourier transform (also called conjugate variables), such as position
and momentum, or voltage and electric charge, have this uncertainty relation. The uncertainty
relation roughly says that:
[5] Heisenberg’s Microscope - Sixty Symbols, 0:20 to 2:38: Basic explanation of the uncertainty principle (though
in the context of quantum physics).


[Figure 7.5: two panels, each showing the original image, the new image and the corresponding original and new Fourier transforms. (a) Removing the high spatial frequencies. (b) Removing the low spatial frequencies.]

Figure 7.5: Demonstration of the roles of different spatial frequencies. By removing the high
spatial frequencies, only the broad features of the image remain: we lose resolution. If the low
spatial frequencies are removed only the sharp features (i.e. the contours) in the image remain.

[Figure 7.6: two panels, each showing the original image, the new image and the corresponding original and new Fourier transforms. (a) Removing the amplitude information. (b) Removing the phase information.]

Figure 7.6: Demonstration of the role of the phase of the spatial Fourier transform. If the amplitude
information is removed, but phase information is kept, some features of the original image
are still recognizable. However, if the phase information is removed but amplitude information
is kept, the original image is completely lost.


If a function f (x) has width ∆x, its Fourier transform has a width ∆kx ≈ 2π/∆x.

Since after propagation over a distance z, the evanescent waves do not contribute to the Fourier
transform of the field, it follows that this Fourier transform has maximum width ∆kx = k. By
the uncertainty principle it follows that after propagation, the minimum width of the field is
∆x, ∆y ≈ 2π/k = λ. Hence, the minimum feature size of a field after propagation is
of the order of the wavelength. This poses a fundamental limit to resolution given by the
wavelength of the light.
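
The scaling property (7.17) and the resulting trade-off between widths can be verified numerically. The sketch below (Python with NumPy) assumes a Gaussian test function and uses root-mean-square widths of $|h|^2$ and $|\mathcal{F}(h)|^2$; both choices are ours, made only for illustration. For this particular definition of width and this test function the product of the two widths equals 1/2, whereas the rule of thumb above is an order-of-magnitude statement.

```python
import numpy as np

def rms_width(values, coords):
    """Root-mean-square width of |values|**2 treated as a distribution."""
    weights = np.abs(values) ** 2
    weights = weights / weights.sum()
    mean = np.sum(weights * coords)
    return float(np.sqrt(np.sum(weights * (coords - mean) ** 2)))

n = 4096
x = np.linspace(-50.0, 50.0, n, endpoint=False)
dx = x[1] - x[0]
kx = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(n, d=dx))  # angular spatial frequency

products = []
for a in (0.5, 1.0, 2.0):                 # h(x) = f(a x) with f a Gaussian
    h = np.exp(-(a * x) ** 2 / 2)
    h_hat = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(h)))
    width_x = rms_width(h, x)             # shrinks as a grows
    width_k = rms_width(h_hat, kx)        # grows as a grows
    products.append(width_x * width_k)
    print(a, width_x, width_k, width_x * width_k)
```

Squeezing the function by a factor a divides its width by a and multiplies the width of its Fourier transform by a, so the product stays fixed.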

[Figure 7.7: three rows of panels, each showing a field U(x, y) next to its Fourier transform Û(kx, ky).]

Figure 7.7: Demonstration of the uncertainty principle. The more confined U(x, y) is, the larger
the spread of $\mathcal{F}(U)\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}\right)$.


7.5 The Fresnel and Fraunhofer Approximations


The Fresnel approximation and the Fraunhofer approximation are two approximations of the
Rayleigh-Sommerfeld integral given in Eq. (7.15). The approximations are based on the assump-
tion that the field has propagated over a large distance z. In the Fraunhofer approximation, z
has to be very large, i.e. much larger than for the Fresnel approximation to hold. Putting it
differently: in order of most accurate to least accurate (i.e. only valid for large propagation
distances), the diffraction integrals would rank as

[Most accurate] Rayleigh-Sommerfeld → Fresnel → Fraunhofer [Least accurate]. (7.19)

7.5.1 Fresnel Approximation


For both approximations, we assume that in Eq. (7.15), z is so large that in the denominator we
can approximate r ≈ z:
$$
U(x,y,z) = \frac{1}{i\lambda}\iint U_0(x',y')\,\frac{z}{r}\,\frac{e^{ikr}}{r}\,dx'\,dy' \tag{7.20}
$$
$$
\approx \frac{1}{i\lambda z}\iint U_0(x',y')\,e^{ikr}\,dx'\,dy'. \tag{7.21}
$$
The reason why we can NOT apply the same approximation for r in the exponent, is because
there r is multiplied by k = 2π/λ which is very large, so any error introduced by approximating
r would be magnified significantly by k. To approximate r in the exponent $e^{ikr}$ we need to be
more careful, and instead apply a Taylor expansion. Recall
$$
r = \sqrt{(x-x')^2+(y-y')^2+z^2} = z\sqrt{\frac{(x-x')^2+(y-y')^2}{z^2}+1}. \tag{7.22}
$$
We know that for a small number s we can expand (compare (6.69)):
$$
\sqrt{s+1} = 1 + \frac{s}{2} - \frac{s^2}{8} + \ldots \tag{7.23}
$$
Since we assumed that z is large, $\frac{(x-x')^2+(y-y')^2}{z^2}$ is small, so we can expand
$$
\begin{aligned}
r &= z\sqrt{\frac{(x-x')^2+(y-y')^2}{z^2}+1} \\
&\approx z\left(1+\frac{(x-x')^2+(y-y')^2}{2z^2}\right) \\
&= z+\frac{(x-x')^2+(y-y')^2}{2z} \qquad \text{Fresnel approximation}
\end{aligned}
\tag{7.24}
$$
With this approximation, we arrive at the Fresnel diffraction integral, which can be written
in different forms

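$$
\begin{aligned}
U(x,y,z) &\approx \frac{e^{ikz}}{i\lambda z}\iint U_0(x',y')\,e^{\frac{ik}{2z}\left[(x-x')^2+(y-y')^2\right]}\,dx'\,dy' \\
&= \frac{e^{ikz}\,e^{\frac{ik(x^2+y^2)}{2z}}}{i\lambda z}\iint U_0(x',y')\,e^{\frac{ik(x'^2+y'^2)}{2z}}\,e^{-ik\left(\frac{x}{z}x'+\frac{y}{z}y'\right)}\,dx'\,dy' \\
&= \frac{e^{ikz}\,e^{\frac{ik(x^2+y^2)}{2z}}}{i\lambda z}\,\mathcal{F}\left\{U_0(x',y')\,e^{\frac{ik(x'^2+y'^2)}{2z}}\right\}\left(\frac{x}{\lambda z},\frac{y}{\lambda z}\right).
\end{aligned}
\tag{7.25}
$$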


Especially the last expression is interesting because it shows that

The Fresnel integral is the Fourier transform of the field $U_0(x', y')$ multiplied by
the Fresnel propagator $e^{\frac{ik(x'^2+y'^2)}{2z}}$.

Note that this propagator depends on the distance of propagation z.

Remark. By Fourier transforming (7.25), one gets the plane wave amplitudes of the Fresnel
integral. It turns out that these amplitudes are equal to $\mathcal{F}(U_0)$ multiplied by a phase factor.
This phase factor is a paraxial approximation of the exact phase factor given by $\exp(i k_z z)$, i.e.
it contains as exponent the parabolic approximation of $k_z$. Therefore the Fresnel approximation
is also called the paraxial approximation. In fact, it can be shown that the Fresnel diffraction
integral is a solution of the paraxial wave equation and conversely, that every solution of the
paraxial wave equation can be written as a Fresnel diffraction integral.
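
The argument of Section 7.5.1, that r ≈ z is acceptable in the denominator but not in the exponent, can be made concrete with a few numbers. The following sketch uses only the Python standard library; the wavelength, the propagation distance and the transverse offset `rho` are assumed values chosen for illustration.

```python
import math

wavelength = 550e-9          # assumed wavelength in metres (green light)
k = 2 * math.pi / wavelength
z = 1.0                      # assumed propagation distance in metres
rho = 1e-3                   # assumed transverse offset sqrt((x-x')^2 + (y-y')^2) in metres

r_exact = math.hypot(rho, z)
r_zeroth = z                           # approximation used in the denominator
r_fresnel = z + rho**2 / (2 * z)       # approximation (7.24) used in the exponent

phase_error_zeroth = k * (r_exact - r_zeroth)     # radians, error if r = z were used in exp(ikr)
phase_error_fresnel = k * (r_exact - r_fresnel)   # radians, error of the Fresnel approximation
relative_error_denominator = (r_exact - r_zeroth) / r_exact
print(phase_error_zeroth, phase_error_fresnel, relative_error_denominator)
```

For these values the relative error made in the denominator is below one part in a million, while using r = z in the exponent would give a phase error of several radians. The Fresnel approximation reduces the phase error to a negligible fraction of a radian.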

7.5.2 Fraunhofer Approximation


For the Fraunhofer approximation, we will make one further approximation to r in $e^{ikr}$:
$$
\begin{aligned}
r &\approx z+\frac{(x-x')^2+(y-y')^2}{2z} && \text{Fresnel approximation} && (7.26) \\
&\approx z+\frac{x^2+y^2-2xx'-2yy'}{2z} && \text{Fraunhofer approximation.} && (7.27)
\end{aligned}
$$
We have thus omitted the quadratic terms $x'^2 + y'^2$, so with respect to the Fresnel diffraction
integral, we simply omit $e^{\frac{ik(x'^2+y'^2)}{2z}}$ to obtain the Fraunhofer diffraction integral
$$
U(x,y,z) \approx \frac{e^{ikz}\,e^{\frac{ik(x^2+y^2)}{2z}}}{i\lambda z}\,\mathcal{F}(U_0)\left(\frac{x}{\lambda z},\frac{y}{\lambda z}\right). \tag{7.28}
$$

This leads to the following important observation:

The far field of $U_0(x', y')$ is simply its Fourier transform with an additional quadratic
phase factor [6].

Note that the coordinates in which we have to evaluate F(U0 ) scale with 1/z, and the overall
field U (x, y, z) also scales with 1/z. This means that as you choose z larger (i.e. you propagate
the field further), the field simply spreads out without changing its shape, and its amplitude goes
down. Hence the field diverges as the propagation distance z increases:

Eventually, for sufficiently large propagation distances, light always spreads out
while preserving the shape of the light distribution.

Remark 1. The Fresnel integral is, like the Fraunhofer integral, also a Fourier transform,
evaluated in spatial frequencies which depend on the point of observation:
$$
\xi = \frac{x}{\lambda z}, \qquad \eta = \frac{y}{\lambda z}. \tag{7.29}
$$
However, in contrast to the Fraunhofer integral, the Fresnel integral depends additionally in a
different way on the propagation distance z, namely the propagator in the integrand also depends
on z. This is the reason that the Fresnel integral does not merely depend on the ratios x/z and
y/z but in a more complicated manner on the position of the point of observation. Therefore
the Fresnel integral can yield quite diverse patterns depending on the value of the propagation
distance z, as is shown in Fig. 7.8.

[6] Also see Hecht §7.4.4, subsection ‘Fourier Analysis and Diffraction’.

Remark 2. Suppose that $U_0$ is the field immediately behind an aperture A in an opaque
screen, with D the diameter of the aperture. Then points (x, y, z) of observation for which the Fresnel
and Fraunhofer diffraction integrals are sufficiently accurate satisfy [These inequalities are
not part of the exam]
$$
\frac{z}{\lambda} > \left(\frac{\max_{(x',y')\in A}\sqrt{(x-x')^2+(y-y')^2}}{\lambda}\right)^{4/3}, \qquad \text{Fresnel} \tag{7.30}
$$
$$
\frac{z}{\lambda} > \left(\frac{D}{\lambda}\right)^2, \qquad \text{Fraunhofer} \tag{7.31}
$$
Suppose that D = 1 mm and the light is green with wavelength λ = 550 nm, then Fraunhofer’s
approximation is accurate if z > 2 m.
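
The number quoted in Remark 2 follows directly from (7.31), which can be rewritten as z > D²/λ. A minimal check in Python (the function name `fraunhofer_min_distance` is our own, introduced only for this illustration):

```python
def fraunhofer_min_distance(aperture_diameter, wavelength):
    """Smallest z allowed by (7.31): z / lambda > (D / lambda)**2, i.e. z > D**2 / lambda."""
    return aperture_diameter**2 / wavelength

z_min = fraunhofer_min_distance(1e-3, 550e-9)   # D = 1 mm, lambda = 550 nm
print(z_min)  # about 1.8 m, consistent with the rounded value z > 2 m
```

Because the bound scales with D², doubling the aperture diameter quadruples the distance beyond which the Fraunhofer approximation can be used.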

Remark 3. The points of observation where the Fraunhofer formulae can be used must in
any case satisfy:
$$
\frac{x}{z} < 1, \qquad \frac{y}{z} < 1. \tag{7.32}
$$
The reason is that the Fresnel approximation already fails when x/z > 1, hence so does Fraunhofer.
On top of that, when x/z > 1, the spatial frequency $k_x = \frac{2\pi x}{z\lambda} > k$, which corresponds to an
evanescent wave. An evanescent wave can obviously not contribute to the Fraunhofer far field.

7.5.3 Examples of Fraunhofer fields


1. Rectangular hole in a screen
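Let the screen be z = 0 and the slit be given by −a/2 < x < a/2, −b/2 < y < b/2. The
transmission function τ(x, y) of the slit is:
$$
\tau(x,y) = 1_{[-a/2,a/2]}(x)\,1_{[-b/2,b/2]}(y), \tag{7.33}
$$
where
$$
1_{[-a/2,a/2]}(x) = \begin{cases} 1, & \text{if } -\frac{a}{2} \le x \le \frac{a}{2}, \\ 0, & \text{otherwise}, \end{cases} \tag{7.34}
$$
and similarly for $1_{[-b/2,b/2]}(y)$. Let the slit be illuminated by a perpendicular incident plane
wave with unit amplitude. Then the field immediately behind the screen is:
$$
U_0(x,y) = \tau(x,y) = 1_{[-a/2,a/2]}(x)\,1_{[-b/2,b/2]}(y). \tag{7.35}
$$
We have
$$
\begin{aligned}
\mathcal{F}\left(1_{[-a/2,a/2]}\right)(\xi) &= \int_{-a/2}^{a/2} e^{-2\pi i \xi x}\,dx \\
&= \frac{e^{\pi i a\xi}-e^{-\pi i a\xi}}{2\pi i \xi} \\
&= a\,\frac{\sin(\pi a\xi)}{\pi a\xi} \\
&= a\,\mathrm{sinc}(\pi a\xi),
\end{aligned}
\tag{7.36}
$$
where sinc(u) = sin(u)/u. Hence,
$$
\mathcal{F}(U_0)\left(\frac{x}{\lambda z},\frac{y}{\lambda z}\right) = ab\,\mathrm{sinc}\left(\frac{\pi a x}{\lambda z}\right)\mathrm{sinc}\left(\frac{\pi b y}{\lambda z}\right). \tag{7.37}
$$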


[Figure 7.8: four plots of the intensity I as a function of x, for Fresnel numbers NF = 0.01 (top left), NF = 4 (top right), NF = 1.0 (bottom left) and NF = 10 (bottom right).]

Figure 7.8: An example of Fresnel fields of a slit. The distance to the slit increases from very
close to the slit at the bottom right, to increasing intermediate distances in the top right and the
bottom left figures, to very large propagation distances in the top left figure. $N_F = D^2/(\lambda z)$ is the
Fresnel number. In the top left, the Fresnel and Fraunhofer approximations give identical results
equal to the Fourier transform of the slit, which is the sinc function. The size of the slit relative
to the pattern is shown in each case by the black bar and hence the pattern becomes broader with
propagation distance. Note: the decrease of amplitude with distance is NOT shown. Increasing
the distance beyond that for the case shown in the top left figure does not change the shape of
the pattern anymore: it only gets broader while the amplitude decreases.


The Fraunhofer far field of a rectangular hole in a plane at large distance z is obtained by
substituting (7.37) into (7.28).

• The first zero along the x-direction from the centre x = 0 occurs for
$$
x = \pm\frac{\lambda z}{a}. \tag{7.38}
$$
The distance between the first zeros is 2λz/a and is thus larger when the width of the
slit is smaller.
• The inequalities (7.32) imply that when a < λ, the far field pattern does not have
any zeros as function of x. It is then not possible to deduce the width a from the
Fraunhofer intensity. This is an illustration of the fact that details of size less than
the wavelength can not propagate to the far field.
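
The position of the first zero in (7.38) can be checked by evaluating the intensity of (7.37) along the x-axis. The sketch below (Python with NumPy) uses assumed values for the wavelength, the distance and the slit width. Note that `np.sinc(t)` is defined as sin(πt)/(πt), so the helper `doc_sinc` (our own name) converts to the convention sinc(u) = sin(u)/u used in these notes.

```python
import numpy as np

def doc_sinc(u):
    """sinc(u) = sin(u)/u as defined in the notes (np.sinc uses sin(pi t)/(pi t))."""
    return np.sinc(np.asarray(u) / np.pi)

wavelength = 550e-9   # assumed values, for illustration only
z = 2.0               # propagation distance in metres
a = 1e-4              # slit width along x in metres

x = np.linspace(-0.05, 0.05, 20001)
intensity = doc_sinc(np.pi * a * x / (wavelength * z)) ** 2   # normalised to 1 at x = 0

first_zero = wavelength * z / a                 # prediction of (7.38)
# Locate the first minimum to the right of the centre numerically.
right = x > 0
i = np.argmax(np.diff(intensity[right]) > 0)    # first index where the intensity starts rising
numerical_zero = x[right][i]
print(first_zero, numerical_zero)
```

Halving the slit width a doubles `first_zero`, in line with the observation that a narrower slit gives a broader pattern.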

2. Periodic Array of slits


We can now predict what the diffraction pattern of a series of slits of finite width will look
like. We only look along the x-axis, i.e. we assume y = 0 so that we can treat the problem
as 1-dimensional. Suppose $W_{\text{slit}}(x)$ is a block function describing the transmission function
of a single slit. Defining the Dirac comb as
$$
\text{Ш}_\Delta(x) = \sum_{m=-\infty}^{\infty}\delta(x-m\Delta), \tag{7.39}
$$
an infinite series of slits with finite width is given by the convolution $\text{Ш}_\Delta(x) \otimes W_{\text{slit}}(x)$.
If we want the number of slits to be finite, we multiply the expression with another block
function $W_{\text{array}}(x)$ to get
$$
\tau(x) = \left(\text{Ш}_\Delta(x)\otimes W_{\text{slit}}(x)\right) W_{\text{array}}(x). \tag{7.40}
$$
The diffraction pattern in the far field is given by the Fourier transform of the transmitted
near field. If the incident illumination is a perpendicular plane wave with unit
amplitude, the transmitted near field is simply τ(x). Using the fact that convolutions in
real space correspond to products in Fourier space and vice versa, and using the fact that
$\mathcal{F}\{\text{Ш}_\Delta(x)\} = (1/\Delta)\,\text{Ш}_{1/\Delta}(\xi)$, we find
$$
\mathcal{F}(\tau) = \left[\frac{1}{\Delta}\,\text{Ш}_{1/\Delta}\,\mathcal{F}(W_{\text{slit}})\right]\otimes\mathcal{F}(W_{\text{array}}). \tag{7.41}
$$

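If the slit has width a:
$$
\begin{aligned}
\left[\frac{1}{\Delta}\,\text{Ш}_{1/\Delta}\,\mathcal{F}(W_{\text{slit}})\right](\xi) &= \frac{a}{\Delta}\sum_{m=-\infty}^{\infty}\delta\left(\xi-\frac{m}{\Delta}\right)\mathrm{sinc}(\pi a\xi) \\
&= \frac{a}{\Delta}\sum_{m=-\infty}^{\infty}\mathrm{sinc}\left(m\pi\frac{a}{\Delta}\right)\delta\left(\xi-\frac{m}{\Delta}\right).
\end{aligned}
\tag{7.42}
$$
If the total width of the array is A, then
$$
\mathcal{F}(W_{\text{array}})(\xi) = A\,\mathrm{sinc}(\pi A\xi), \tag{7.43}
$$
and we conclude that
$$
\mathcal{F}(\tau)(\xi) = \frac{aA}{\Delta}\sum_{m=-\infty}^{\infty}\mathrm{sinc}\left(m\pi\frac{a}{\Delta}\right)\mathrm{sinc}\left(\pi A\left(\xi-\frac{m}{\Delta}\right)\right), \tag{7.44}
$$
and the Fraunhofer field of the array of slits is (omitting the quadratic phase factor):
$$
\mathcal{F}(\tau)\left(\frac{x}{\lambda z}\right) = \frac{aA}{\Delta}\sum_{m=-\infty}^{\infty}\mathrm{sinc}\left(m\pi\frac{a}{\Delta}\right)\mathrm{sinc}\left(\pi A\left(\frac{x}{\lambda z}-\frac{m}{\Delta}\right)\right). \tag{7.45}
$$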

For the directions
$$
\frac{x}{z} = \frac{m\lambda}{\Delta}, \qquad m = 0, \pm 1, \pm 2, \ldots, \tag{7.46}
$$
the field has local maxima (peaks). These directions are called diffraction orders. Note
that, as explained above, x/z < 1 must hold in the Fraunhofer far field, and this
sets a limit to the number of diffracted orders that can occur. This limit depends on
the period and the wavelength and is given by:
$$
|m| \le \Delta/\lambda. \tag{7.47}
$$

The width of a diffraction order is given by the width of the function (7.43), i.e. it is given
by
$$
\text{angular width} = \frac{\lambda}{A}. \tag{7.48}
$$
Hence, the larger A, i.e. the more slits there are in the array, the narrower the peaks into
which the energy is diffracted. The phenomenon that the angles of diffraction of the orders
depend on wavelength is used to separate wavelengths. Grating spectrometers use periodic
structures such as this array of slits to separate and measure wavelengths.
The amplitudes of the diffracted orders,
$$
\mathrm{sinc}\left(m\pi\frac{a}{\Delta}\right), \tag{7.49}
$$
are determined by the width of the slits. Hence the envelope (i.e. large features) of the
Fraunhofer diffraction pattern is determined by the small-scale properties of the array,
namely the width of the slits. This is illustrated in Fig. 7.9.

Remark. A periodic row of slits is an example of a diffraction grating. A grating is
a periodic structure, i.e. the permittivity is a periodic function of position. Structures
can be periodic in one, two and three directions. A crystal acts as a three-dimensional
grating whose period is the period of the crystal, i.e. a few Angstrom. Electromagnetic
waves of wavelength equal to one Angstrom or less are called x-rays. When a beam of
x-rays illuminates a crystal, a detector in the far field measures the Fraunhofer diffraction
pattern given by the intensity of the Fourier transform of the refracted near field. These
diffraction orders of crystals for x-rays were discovered by Von Laue and are used to study
the atomic structure of crystals.
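
The relations (7.46) and (7.47) are easily turned into a small calculation that shows how a grating separates wavelengths. The sketch below uses only the Python standard library; the function name `diffraction_orders`, the grating period and the two wavelengths are assumptions made for illustration.

```python
import math

def diffraction_orders(period, wavelength):
    """Orders m with |m| <= period / wavelength (7.47), mapped to x/z = m * wavelength / period (7.46)."""
    m_max = math.floor(period / wavelength)
    return {m: m * wavelength / period for m in range(-m_max, m_max + 1)}

period = 2.0e-6   # assumed grating period in metres
for wavelength in (450e-9, 650e-9):   # assumed blue and red wavelengths
    orders = diffraction_orders(period, wavelength)
    print(wavelength, sorted(orders), orders[1])
```

For the same order m, the longer wavelength is diffracted into a larger x/z, and fewer orders exist for it. This wavelength dependence of the directions is what a grating spectrometer exploits.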

7.6 Fourier Optics


In this section we apply diffraction theory to a lens. We consider in particular the focussing of
a parallel beam and the imaging of an object. To model the imaging of a point object using
diffraction theory the general procedure is as follows. We apply Fresnel diffraction theory to
propagate the light from the object to the entrance pupil of the lens. Then we multiply the
field in the entrance pupil by the transmission function of the lens. This transmission function
incorporates the phase change of the field induced by the local thickness variation over the radius
of the lens. After having obtained the field in the exit pupil of the lens, we propagate this field
again using the Fresnel diffraction integral to the region which according to geometrical optics is
close to the image plane. As you will see, the results generalize those obtained with geometrical
optics.


Figure 7.9: An illustration of a diffraction pattern of a series of slits.

7.6.1 Focussing of a Parallel Incident Beam


We consider a thin positive lens as example. It induces a phase change to an incident field which
is proportional to the local thickness of the lens. Consider the focussing of a plane wave which is
propagating parallel to the optical axis. According to geometrical optics the rays are all focused
into the focal point if the aberrations of the lens are negligible, i.e. if the lens is diffraction limited.
According to the Principle of Fermat, all rays have travelled the same optical distance
when they intersect in the focal point. In the focal point constructive interference occurs and
the intensity is maximum. Since the wavefronts are surfaces of constant phase onto which the
rays are perpendicular, it follows that the wave fronts in image space are the parts of spheres
with centre the focal point. The spheres are cut off by the cone with top the focal point and
opening angle 2a/f as shown in Fig. 7.10. Behind the focal point, there is a second cone where
there are again spherical wave fronts but there the light is propagating away from the focal point,
of course. According to geometrical optics it is in image space completely dark outside of the two
cones in Fig. 7.10.
Let us choose the origin in the centre of the thin lens and let the positive z-axis be along the
axis of symmetry (optical axis), as usual. The focal point is in this coordinate system given by
(0, 0, f ), where f is the focal distance. Let (x, y, z) be a point between the lens and the focal
point and inside the cone, i.e. in the region where there is light according to geometrical optics.
We have 0 < z < f. The field that exists according to geometrical optics inside the cone can be
described in wave optics by:
$$
\frac{e^{-ik\sqrt{x^2+y^2+(z-f)^2}}}{\sqrt{x^2+y^2+(z-f)^2}}, \tag{7.50}
$$
Indeed the surfaces of constant phase:
$$
\sqrt{x^2+y^2+(z-f)^2} = \text{constant}, \tag{7.51}
$$
are spheres with centre the focal point, and the amplitude of the field is proportional to the
reciprocal of the distance between (x, y, z) and the focal point. The minus sign in the exponent
in (7.50) is explained by the (implicit) time dependence which is, as always for monochromatic
fields, given by the factor exp(−iωt) where ω > 0. After multiplying by this factor, it is seen that
the phase at a particular point and time is given by
$$
-k\sqrt{x^2+y^2+(z-f)^2}-\omega t, \tag{7.52}
$$
and hence as time increases the phase can only be kept constant when the distance to the focal
point decreases. Incidentally, for a point (x, y, z) to the right of the focal point, the spherical wave
fronts propagate away from the focal point and therefore one should choose there the plus sign
in the exponent.
Returning to the region 0 < z < f, we now consider the field in the exit pupil of the lens, i.e.
we choose z = 0, where the field is truncated by the aperture of the lens. Hence the field in
the plane z = 0 is given by
$$
1_{\odot_a}(x,y)\,\frac{e^{-ik\sqrt{x^2+y^2+f^2}}}{\sqrt{x^2+y^2+f^2}}, \qquad \text{for } (x,y,0) \text{ in the exit pupil of the lens}, \tag{7.53}
$$
where
$$
1_{\odot_a}(x,y) = \begin{cases} 1 & \text{if } x^2+y^2 < a^2, \\ 0 & \text{otherwise.} \end{cases} \tag{7.54}
$$
The field (7.53) is the field in the exit pupil as predicted by geometrical optics when the lens is
diffraction limited. In diffraction optics we compute the field in the focal region using diffraction
integrals, instead of using ray tracing as is done in geometrical optics. Hence, the modification
introduced by diffraction optics is due to the more accurate propagation of the field in the exit
pupil (as predicted by geometrical optics) into the region behind the exit pupil of the lens.
The starting field in the exit pupil is however still the same as in geometrical optics.
Because of the more accurate propagation using diffraction integrals we find that the field is
not strictly zero outside of the cones shown in Fig. 7.10, although most of the energy is indeed
concentrated inside the cones. Also, the field inside the cones is modified and not exactly given
by the spherical wave front, due to the diffraction by the exit pupil.
If a/f is sufficiently small, we may replace in the denominator of (7.53) the distance
$\sqrt{x^2+y^2+f^2}$ between a point in the exit pupil and the focal point by f. Replacing this
distance by f is not allowed in the exponent however, because the error made in this replacement
would be enhanced too much by the multiplication by the large wave number k. In the
exponent we therefore use instead the first two terms of the Taylor series, i.e. we apply the
paraxial approximation (7.23):
$$
\sqrt{x^2+y^2+f^2} = f\sqrt{1+\frac{x^2+y^2}{f^2}} \approx f+\frac{x^2+y^2}{2f}, \tag{7.55}
$$

which is valid for a/f sufficiently small. Then (7.53) becomes:
$$
1_{\odot_a}(x,y)\,e^{-ik\frac{x^2+y^2}{2f}}, \tag{7.56}
$$
where we dropped the constant factors $e^{-ikf}$ and 1/f. For a general incident field $U_0(x, y)$ in the
entrance pupil, the lens applies a transformation such that the field in the exit plane becomes:
$$
U_0(x,y) \rightarrow U_0(x,y)\,1_{\odot_a}(x,y)\,e^{-ik\frac{x^2+y^2}{2f}}, \tag{7.57}
$$
(transformation applied by a lens between its entrance and exit planes)

The function that multiplies $U_0(x, y)$ is the transmission function of the lens:
$$
\tau_{\text{lens}}(x,y) = 1_{\odot_a}(x,y)\,e^{-ik\frac{x^2+y^2}{2f}}. \tag{7.58}
$$

This result makes sense: in the centre (x, y) = 0 the lens is thickest, so the phase is shifted the
most (but we can define this phase shift to be zero because only phase differences matter, not
absolute phase). As is indicated by the minus-sign in the exponent, the further you go away
from the centre of the lens, the less the phase is shifted. For shorter f , the lens focuses more
strongly, so the curvature of the lens is higher, and the phase shift changes more rapidly as a
function of the radial coordinate. Note that the transmission function (7.58) has modulus 1 inside the aperture, so that
energy is conserved.

Figure 7.10: Top: wavefronts in image space due to the focussing of a plane wave that propagates
parallel to the optical axis according to geometrical optics. There is no light outside of the two
cones. Bottom left: amplitude as predicted by diffraction optics. There is no absolute darkness:
the boundary of the cones is diffuse. Furthermore, the intensity does not increase monotonically
with decreasing distance to the focal point as predicted by geometrical optics. Bottom right:
phase of the focused field as predicted by diffraction optics.

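If we propagate this new field using the Fresnel diffraction integral of Eq. (7.25), we get
$$
U(x,y,z) = \frac{e^{ikz}\,e^{\frac{ik(x^2+y^2)}{2z}}}{i\lambda z}\,\mathcal{F}\left\{U_0(x',y')\,1_{\odot_a}(x',y')\,e^{ik(x'^2+y'^2)\left(\frac{1}{2z}-\frac{1}{2f}\right)}\right\}\left(\frac{x}{\lambda z},\frac{y}{\lambda z}\right). \tag{7.59}
$$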

This expression can be used for sufficiently large distances from the exit pupil plane, in particular
in the focal plane and beyond it. It is seen at the bottom of Fig. 7.10 that the field is not
monotonically increasing with decreasing distance to the focal point. Instead, secondary maxima
are seen along the optical axis. Also the boundary of the light cone is not sharp as predicted by
geometrical optics but diffuse. For points in the back focal plane of the lens, i.e. z = f , we find
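
$$U(x, y, f) = \frac{e^{ikf}\, e^{\frac{ik(x^2+y^2)}{2f}}}{i\lambda f}\, \mathcal{F}\left\{ U_0(x', y')\, 1_{\bigodot_a}(x', y') \right\}\left(\frac{x}{\lambda f}, \frac{y}{\lambda f}\right), \tag{7.60}$$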

which is the same as the Fraunhofer integral! Thus, the field at the focal plane is the same as
the far field of the field in the entrance pupil of the lens, or to put it differently:

The field in the entrance pupil of the lens and the field at the focal plane are
related by a Fourier transform (apart from a quadratic phase factor in front of the
integral).

It can be shown that the fields in the front focal plane U(x, y, −f) and the back focal plane
U(x, y, f) are related exactly by a Fourier transform, i.e. without the additional quadratic
phase factor⁷.

So a lens performs a Fourier transform. Let us see if that corresponds to some of the things
we know from geometrical optics:

• We know from geometrical optics that if we illuminate a lens with parallel rays of light (a
plane wave), they all intersect in the back focal plane. This corresponds with the fact that
for U0(x, y) = 1 (i.e. plane wave illumination, neglecting the finite aperture of the lens and
hence diffraction effects), its Fourier transform is a delta peak:

$$\mathcal{F}(U_0)\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}\right) = \delta\left(\frac{k_x}{2\pi}\right) \delta\left(\frac{k_y}{2\pi}\right), \tag{7.61}$$

which represents the perfect focused spot (without diffraction).

• If in geometrical optics we illuminate a lens with tilted parallel rays of light (a plane wave
propagating in some other direction), then the point in the back focal plane where they
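intersect is laterally displaced. A tilted plane wave is described by $U_0(\mathbf{r}) = e^{i\mathbf{k}_0\cdot\mathbf{r}}$, and its
Fourier transform with respect to (x, y) is given by

$$\mathcal{F}\{U_0\}\left(\frac{k_x}{2\pi}, \frac{k_y}{2\pi}, z\right) = \delta\left(\frac{k_x - k_{0,x}}{2\pi}\right) \delta\left(\frac{k_y - k_{0,y}}{2\pi}\right),$$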

which is indeed a shifted delta peak (i.e. a shifted focal spot).
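The two examples above can be reproduced with a discrete Fourier transform. The following one-dimensional sketch samples a (tilted) plane wave and locates the peak of its spectrum; the position in the focal plane then follows from x = λfξ, which is the argument x/(λf) of Eq. (7.60) read backwards. The sampling window, wavelength and focal length are illustrative assumptions, and the sign of the displacement depends on the sign convention of the Fourier transform.

```python
import cmath
import math

def dft(samples):
    """Plain O(N^2) discrete Fourier transform with kernel exp(-2*pi*i*m*n/N)."""
    n_samples = len(samples)
    return [
        sum(s * cmath.exp(-2j * math.pi * m * n / n_samples) for n, s in enumerate(samples))
        for m in range(n_samples)
    ]

def peak_index(spectrum):
    """Index of the Fourier coefficient with the largest magnitude."""
    return max(range(len(spectrum)), key=lambda m: abs(spectrum[m]))

N = 64                   # number of samples (illustrative)
width = 6.4e-3           # width of the sampled window in metres (illustrative)
wavelength = 500e-9      # metres (illustrative)
focal_length = 0.1       # metres (illustrative)

for cycles in (0, 3, 10):            # phase cycles of the tilted wave across the window
    xi0 = cycles / width             # spatial frequency k0x / (2*pi) in 1/m
    field = [cmath.exp(2j * math.pi * xi0 * (n * width / N)) for n in range(N)]
    m = peak_index(dft(field))
    x_focus = wavelength * focal_length * (m / width)   # x = lambda * f * xi
    print(f"cycles = {cycles}: peak at bin {m}, focal spot displaced by {x_focus * 1e6:.1f} micrometre")
```

For the untilted wave the peak sits at zero frequency (a spot on the axis); tilting the wave moves the peak, and hence the focal spot, in proportion to the tilt.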

⁷ Introduction to Fourier Optics, J. Goodman, §5.2.2 - Several calculations on the Fourier transforming properties of lenses.

It seems that our new model of light propagation confirms what we know from geometrical
optics. But in the previous two examples we have discarded the influence of the finite size of the
pupil, i.e. we have left the function $1_{\bigodot_a}$ out of consideration in calculating the Fourier transform.
If U0(x, y) = 1 in the entrance pupil, the focused field is given by the Fourier transform of the
circular disc with radius a, evaluated at spatial frequencies ξ = x/(λf), η = y/(λf). This field is
called the Airy spot (see Appendix E.17): [NOTE: you do not need to know the following formula
for the examination]

$$\text{Airy spot}(x, y) = \frac{\pi a^2}{\lambda f}\, \frac{2 J_1\left(2\pi \frac{a}{\lambda f}\sqrt{x^2+y^2}\right)}{\frac{2\pi a}{\lambda f}\sqrt{x^2+y^2}}, \tag{7.62}$$

where we have omitted the phase factors in front of the Fourier transform. The pattern is
shown in Fig. 7.11. It is circularly symmetric and consists of a central maximum surrounded by
concentric rings of alternating zeros and secondary maxima with rapidly decreasing amplitudes.
In cross-section, as function of $r = \sqrt{x^2+y^2}$, the Airy pattern is very similar (but not identical) to the
sinc function. From the uncertainty principle shown in Fig. 7.7 it follows that the size of the
focal spot decreases as a increases. From (7.62) we see that the Airy function is a function of
the dimensionless variable ar/(λf), so the focal field becomes narrower as a/(λf) increases. The
Numerical Aperture (NA) is defined by

$$\mathrm{NA} = \frac{a}{f}. \tag{7.63}$$

Since the first zero of the Airy pattern occurs for 2ar/(λf) = 1.22, the size of the focal spot can
be estimated as

$$\text{Size of focal spot} \approx 0.6\, \frac{\lambda}{\mathrm{NA}}. \tag{7.64}$$

Figure 7.11: Airy spot: surface plot of the intensity and cross-section of the field.
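The numbers 1.22 and 0.6 in the estimate of the focal spot size follow from the first zero of the Bessel function J1. As a check, the sketch below evaluates J1 from Bessel's integral representation, finds its first positive zero by bisection and converts it to a spot radius. The helper names, the wavelength and the lens dimensions are assumptions made for this illustration.

```python
import math

def bessel_j1(u, steps=400):
    """J1(u) from Bessel's integral (1/pi) * int_0^pi cos(t - u*sin(t)) dt, midpoint rule."""
    h = math.pi / steps
    total = sum(math.cos(t - u * math.sin(t)) for t in (h * (j + 0.5) for j in range(steps)))
    return total * h / math.pi

def first_zero_of_j1(lo=3.0, hi=4.5, tol=1e-10):
    """Bisection for the first positive zero of J1, which lies between lo and hi."""
    f_lo = bessel_j1(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = bessel_j1(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

u0 = first_zero_of_j1()          # argument 2*pi*a*r/(lambda*f) at the first dark ring
wavelength = 500e-9              # metres (illustrative)
a, f = 5e-3, 50e-3               # lens radius and focal length in metres (illustrative)
NA = a / f                       # Eq. (7.63)
r_first_zero = u0 / (2 * math.pi) * wavelength / NA

print(f"first zero of J1: {u0:.4f}")
print(f"2 a r / (lambda f) at the first zero: {u0 / math.pi:.3f}")
print(f"radius of the first dark ring: {r_first_zero * 1e6:.2f} micrometre")
```

The ratio u0/π reproduces the value 1.22, and dividing by 2 gives the factor of about 0.6 in Eq. (7.64).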

7.6.2 Imaging by a lens


It follows from the derivations in the previous section that the Airy pattern is the image of a
point source infinitely far in front of a lens. In this section we study the imaging of a general
object. Consider first a point object at distance $s_o$ from a lens with focal distance f. There are
two ways to derive the field in image space.

1. The first way is analogous to the derivation in the previous section of the focused field
of an incident plane wave. We postulate that the field of the point object is given by a
spherical wave converging towards the geometrical optics image point and truncated by the
lens aperture. If the object point is on the optical axis, we find for the field the same
function as the Airy pattern except that the variable ar/(λf) must be replaced by

$$\frac{ar}{\lambda s_i}, \tag{7.65}$$


where si is the image distance as given by the Lens Law. This field is called the Point
Spread Function (PSF for short). For an object point that is not on the optical axis, the
PSF is translated such that it is centred on the image point according to geometrical optics.

Figure 7.12: Imaging of a point.

2. The second method is by propagating the field of the point object from the object plane
to the entrance pupil of the lens using the Fresnel diffraction formula, multiplying by the
transmission function of the lens given by (7.58), and finally propagating the field from the
exit pupil of the lens to the image plane using the Fresnel diffraction formula again.

Both methods give the same formula for the PSF⁸: [NOTE: you do not need
to know the following formula for the examination]

$$\mathrm{PSF}(x, y) = \frac{\pi a^2}{\lambda s_i}\, \frac{2 J_1\left(2\pi \frac{a}{\lambda s_i}\sqrt{x^2+y^2}\right)}{\frac{2\pi a}{\lambda s_i}\sqrt{x^2+y^2}}. \tag{7.66}$$

Usually we consider as object plane the plane immediately behind the object (on the side of
the lens). We assume that we know the field in this plane. This field has been transmitted (or
reflected) by the object and then propagates further through the optical system. The object
plane is discretised by a set of points and the images of these points are given by translated
versions of the PSF weighted by the field in these points. The total image field is then:

$$U_i(x, y, s_i) = \frac{1}{M} \iint \mathrm{PSF}\left(\frac{x}{M} - x_o, \frac{y}{M} - y_o\right) U_o(x_o, y_o, s_o)\, dx_o\, dy_o, \tag{7.67}$$

where xi = M xo , yi = M yo is the image point and M = si /so is the magnification and the factor
1/M is to preserve energy. If the magnification is unity, the image field is a convolution between
the PSF and the object field. If the magnification differs from unity, the integral can be made
into a convolution by rescaling the coordinates in image space.
It is clear from (7.66) that a larger radius a of the lens and a smaller wavelength λ imply a narrower
PSF. This in turn implies that the kernel in the convolution is more sharply peaked and hence
that the resolution of the image is higher⁹. Alternatively, one could think of the aperture of
the lens cutting away the higher spatial frequencies, as shown in Fig. 7.5, which causes loss of
resolution of the Fourier transform observed in the image plane.
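The link between the width of the PSF and the resolution can be made concrete with a discrete, one-dimensional analogue of the convolution (7.67) for M = 1. This is only a sketch: a sinc function is used as a stand-in for the cross-section of the Airy pattern (an assumption, not Eq. (7.66)), and the pixel grid, the PSF widths and the function names are chosen purely for illustration.

```python
import math

def sinc_psf(x, width):
    """1D stand-in for the PSF: sin(pi*x/width) / (pi*x/width). Assumed shape, not Eq. (7.66)."""
    if x == 0:
        return 1.0
    arg = math.pi * x / width
    return math.sin(arg) / arg

def image_field(object_field, psf_width):
    """Discrete analogue of Eq. (7.67) with M = 1: U_i[n] = sum over m of PSF(n - m) * U_o[m]."""
    n_pix = len(object_field)
    return [
        sum(sinc_psf(n - m, psf_width) * object_field[m] for m in range(n_pix))
        for n in range(n_pix)
    ]

obj = [0.0] * 41
obj[15] = 1.0
obj[25] = 1.0                 # two point objects, 10 pixels apart

for width in (4.0, 16.0):     # narrow PSF (large a/lambda) versus wide PSF (small a/lambda)
    img = image_field(obj, width)
    dip_between_points = abs(img[20]) < abs(img[15])
    print(f"PSF width {width}: points", "resolved" if dip_between_points else "not resolved")
```

A single point object is imaged as a translated copy of the PSF; two points are distinguishable only when the PSF is narrow compared with their separation.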

⁸ See Hecht §10.2.5 ‘The circular aperture’.
⁹ Hecht §10.2.6 ‘Resolution of imaging systems’.

7.6.3 Optical Fourier Filtering

• The field in the entrance pupil of the lens can be changed spatially by a so-called spatial
light modulator (SLM). An SLM has thousands of pixels by which very general fields can
be made. By applying three SLMs in series, one can tune the polarisation, the phase and
the amplitude pixel by pixel and hence very special fields can be made in the focal region of
a lens, for example an electric field with only a longitudinal component (i.e. Ez-component)
in the focal point¹⁰.

• Suppose we have the setup as shown in Fig. 7.13. With one lens we can create the Fourier
transform of some field U0 (x, y), and we can put a mask in the focal plane and then with a
second lens we can invert the Fourier transform of this new field. This procedure is called
Fourier filtering using lenses. Fourier filtering means that the amplitude and/or phase of
the plane waves in the angular spectrum of the field can be manipulated. An application
of this idea is the phase contrast microscope.

[Diagram: two lenses with the planes U(x,y), Û(kx,ky) and U(x,-y), each separated from the neighbouring lens by a distance f.]

Figure 7.13: Setup for Fourier filtering. The first lens creates a Fourier transform of U(x, y),
to which we can apply some operation (e.g. applying different phase shifts to different parts
of the field). The second lens then applies another Fourier transform (which is the same as the
inverse Fourier transform and a mirror transformation).
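That two successive Fourier transforms return a mirrored copy of the input, and that a mask in the Fourier plane modifies the output, can be illustrated with a discrete Fourier transform. The sketch below is a one-dimensional toy model of the two-lens setup; the sample values, the mask and the function names are assumptions made for this example.

```python
import cmath
import math

def dft(samples):
    """Plain O(N^2) discrete Fourier transform with kernel exp(-2*pi*i*m*n/N)."""
    n_samples = len(samples)
    return [
        sum(s * cmath.exp(-2j * math.pi * m * n / n_samples) for n, s in enumerate(samples))
        for m in range(n_samples)
    ]

def two_lens_system(field, mask=None):
    """Fourier transform, optional mask in the Fourier plane, then a second Fourier transform."""
    spectrum = dft(field)
    if mask is not None:
        spectrum = [c * w for c, w in zip(spectrum, mask)]
    n_samples = len(field)
    return [value / n_samples for value in dft(spectrum)]   # 1/N keeps the amplitude scale

field = [0, 1, 2, 3, 0, 0, 0, 0]

mirrored = two_lens_system(field)
print([round(abs(v), 6) for v in mirrored])

# Mask that blocks only the zero spatial frequency (the on-axis point in the Fourier plane).
block_dc = [0] + [1] * (len(field) - 1)
filtered = two_lens_system(field, block_dc)
print([round(v.real, 6) for v in filtered])
```

Without a mask the output is the input with its coordinate reversed. Blocking the zero frequency removes the mean value of the field, so the filtered output sums to zero.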

7.7 Superresolution
We have emphasized that evanescent waves set the ultimate limit to resolution in optics. In
Chapter 3, it was explained that, although within geometrical optics one can image a single
point perfectly using conical surfaces, several points, let alone an extended object, can not be
imaged perfectly. It was furthermore explained that when only paraxial rays are considered, i.e.
within Gaussian geometrical optics, perfect imaging of extended objects is possible. However,
rays that make large angles with the optical axis cause aberrations. But even when perfect imaging
would be possible in geometrical optics, a real image can never be perfect due to the fact that
information contained in the amplitudes and phases of the evanescent waves can not propagate.
The resolution that can be obtained with an optical system consisting of lenses is less than follows
from considering the effect of evanescent waves because, apart from the evanescent waves, also
the propagating waves with spatial frequencies that are so high that they are not captured by
the optical system can not contribute to the image. Therefore the image of a point object has
the size

$$\lambda / \mathrm{NA}_i, \tag{7.68}$$

where NAi = a/si is the numerical aperture in image space, i.e. it is the sine of half the opening
angle of the cone subtended by the exit pupil at the Gaussian image point on the optical axis.
This resolution limit is called the diffraction limit.
10
See Phys. Rev. Lett. 100, 123904, 2008


The size of the image of a point as given by the PSF in (7.66) is influenced by the magnification
of the system. To characterize the resolution of a diffraction limited system, it is therefore
better to consider the numerical aperture on the object side: NAo = NAi |M| = a/so. The value
of NAo is the sine of the half angle of the cone subtended by the entrance pupil of the system
at the object point on the optical axis. This is the cone of wave vectors emitted by this object
point that can contribute to the image. The larger the half angle of this cone, the higher the
spatial frequencies that can contribute and hence the higher the resolution.
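A short numerical example shows why the object-side numerical aperture is the better measure of resolution. The sketch uses the paraxial expressions NAi = a/si, NAo = a/so and M = si/so from the text; the aperture radius, the distances and the wavelength are illustrative assumptions.

```python
def numerical_apertures(aperture_radius, object_distance, image_distance):
    """Paraxial NA_o = a / s_o, NA_i = a / s_i and magnification M = s_i / s_o."""
    na_object = aperture_radius / object_distance
    na_image = aperture_radius / image_distance
    magnification = image_distance / object_distance
    return na_object, na_image, magnification

wavelength = 550e-9                    # metres (illustrative)
a, s_o, s_i = 2e-3, 10e-3, 200e-3      # metres (illustrative)

na_o, na_i, M = numerical_apertures(a, s_o, s_i)
spot_in_image = wavelength / na_i                 # Eq. (7.68), size measured in the image plane
spot_referred_to_object = spot_in_image / M       # same spot expressed in object coordinates

print(f"NA_o = {na_o:.3f}, NA_i = {na_i:.3f}, M = {M:.0f}")
print(f"image-plane spot: {spot_in_image * 1e6:.1f} micrometre")
print(f"referred to the object: {spot_referred_to_object * 1e6:.2f} micrometre")
```

The spot in the image plane is large because the image is magnified, but divided by M it equals λ/NAo, which depends only on the cone of light collected from the object.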
It should be clear by now that beating the diffraction limit is extremely difficult. Nevertheless,
a lot of research in physics has been and still is directed to realizing this goal. Many attempts
have been made, some successful, others not, but, whether successful or not, most were
based on very ingenious ideas. To close this chapter on diffraction theory, we will give a flavour
of the attempts to achieve what is called superresolution.

Confocal microscope. A focused spot is used to scan the object and the reflected field is
imaged onto a small detector (“point detector”). The resolution is roughly a factor 1.5
better than for normal imaging with full field of view using the same objective. The higher
resolution is achieved thanks to the illumination by oblique plane waves that are present in
the spatial (Fourier) spectrum of the focused spot. By illumination with plane waves under
large angles of incidence, higher spatial frequencies of the object which are under normal
incidence not accepted by the objective, are now “folded back” into the cone of plane waves
accepted by the objective. The higher resolution comes at the price of a longer imaging time
because of scanning. The confocal microscope was invented by Marvin Minsky in 1957.

Figure 7.14: A laser beam is focused by an objective and scanned over an object. The reflected
light is imaged by the same objective onto a small detector.

The Perfect Lens based on negative refraction. It can be shown that when a material has
negative permittivity and negative permeability, the phase velocity of a plane wave
is opposite to the energy velocity. Furthermore, when a slab of such material is surrounded
by material with positive permittivity and positive permeability equal to the absolute values
of the permittivity and permeability of the slab, the reflection of waves is zero for every
angle of incidence and every state of polarisation. Moreover, evanescent waves gain
amplitude inside the slab and it turns out that there are two planes, one inside the slab
and one on the other side of it, where a perfect image of a point in front of the slab occurs.
Note that the increase of amplitude of an evanescent wave does not violate the conservation
of energy because an evanescent wave does not propagate energy in the direction in which
it is evanescent.
The simple slab geometry which acts as a perfect lens was invented by John Pendry in
2000¹¹. Unfortunately, a material with negative permittivity and negative permeability
has not been found in nature, although there seems to be no fundamental reason why it
could not exist. Therefore many groups have attempted to mimic such a material by
conventional materials such as metals. There are however more fundamental reasons why
Pendry’s perfect lens will not work satisfactorily, even if the material would exist. We refer
to the master lecture Advanced Photonics for more details.

Figure 7.15: Pendry’s perfect lens consists of a slab of a material with negative permittivity and
negative permeability whose absolute values are equal to the positive permittivity and
positive permeability of the surrounding material. Points outside the slab are imaged perfectly
in two planes: one inside the slab and the other on the other side of the slab.

Hyperbolic materials. Hyperbolic materials are anisotropic, i.e. the phase velocity of a plane
wave depends on the polarisation. The permittivity of an anisotropic material is a tensor
(loosely speaking a (3,3)-matrix). Normally the eigenvalues of the permittivity matrix
are positive, however in a hyperbolic material two eigenvalues are of equal sign and the
third has opposite sign. In such a medium all waves of the so-called extra-ordinary type
of polarisation propagate, no matter how high the spatial frequencies are. Hence for this
state of polarisation there are no evanescent waves and therefore super-resolution and
perfect imaging should be possible in such a medium.
Natural hyperbolic media seem to exist for a few frequencies in the mid-infrared. For visible
wavelengths, materials with hyperbolic behaviour are too lossy to give superresolution.
Therefore one tries to approximate hyperbolic media by so-called metamaterials which are
made of very thin metallic and dielectric layers so that the effective permittivity has the
desired hyperbolic property. The success of this idea has however been moderate so far.

¹¹ J.B. Pendry, PRL 18, 2000

Figure 7.16: Examples of composite materials consisting of thin (sub-wavelength) layers of metals
and dielectrics. These artificial materials are called metamaterials.

Nonlinear effects. When the refractive index of a material depends on the local electric field,
the material is nonlinear. At optical frequencies nonlinear effects are in general quite
small, but with a strong laser they become significant. One effect is self-focusing, where
the refractive index is proportional to the local light intensity. The locally higher intensity
causes an increase of the refractive index, leading to a waveguiding effect due to which the
beam focuses even more strongly. Hence the focused beam becomes more and more narrow
while propagating until finally the material breaks down.

Figure 7.17: Self focussing of a beam of light in a nonlinear material.

Stimulated Emission Depletion Microscopy (STED). This technique was invented by
V.A. Okhonin in 1986 in the USSR and further developed by Stefan Hell and his co-workers
in the nineties. Hell received the Nobel Prize in Chemistry for his work in 2014. STED is
a non-linear technique to achieve super-resolution in fluorescence microscopy. Images in
this type of microscope are blurred when fluorescent molecules are very close together.
In the STED microscope a special trick is used to ensure that fluorescent molecules are
sufficiently isolated from other fluorescent molecules to be individually detectable. Two
focused spots are used: the first spot excites the molecules to a higher level. The second
spot is slightly red-shifted and has a doughnut shape (see Fig. 7.18). It causes decay of the
excited molecules to the lower level by stimulated emission (the excited state is depleted).
Because of the doughnut shape of the second spot, the molecule in the centre of the spot
is not affected and will still fluoresce. Crucially, a doughnut spot has a central dark
region which is very narrow, i.e. it can be much smaller than the Airy spot, and this is the
reason for the super-resolution.


Figure 7.18: Spot used for excitation (top left) and for depletion (top middle). Fluorescence
signal top right. In the lower figure the confocal image is compared to the STED image.

External sources in recommended order:

1. Every picture is made of waves - Sixty Symbols, 3:33 to 7:15: Basic explanation of Fourier
transforms.

2. Heisenberg’s Microscope - Sixty Symbols, 0:20 to 2:38: Basic explanation of the uncertainty
principle (though in the context of quantum physics).

3. Hecht, §7.4.4, subsection ‘Fourier Analysis and Diffraction’.

4. Introduction to Fourier Optics, J. Goodman, §5.2.2 - Several calculations on the Fourier
transforming properties of lenses.

5. Hecht §10.2.6 ‘Resolution of imaging systems’.

Chapter 8

Lasers

What you should know and be able to do after studying this chapter.

• Know the special properties of laser sources.

• Understand the optical resonator and the reason for needing it.

• Understand the Einstein theory of absorption, spontaneous and stimulated emission.

• Understand the role of the amplifier and explain what the gain curve is.

• Explain the principle of the population inversion and how it can be achieved.

• Explain how single frequency operation can be obtained.

• Understand what transverse modes are and how they can be eliminated.

• Know the different types of pumping.

In the early 1950s a new source of microwave radiation, the maser, was invented by C.H. Townes
in the USA and A.M. Prokhorov and N.G. Basov in the USSR. Maser stands for "Microwave
Amplification by Stimulated Emission of Radiation". In 1958, A.L. Schawlow and Townes
formulated the physical constraints to realize a similar device for visible light. This resulted in
1960 in the first optical maser, built by T.H. Maiman in the USA. This device has since then been
called Light Amplification by Stimulated Emission of Radiation or laser.
It has revolutionized science and engineering and is used in many different applications
such as:
• bar code readers,
• compact discs,
• computer printers,
• fiberoptic communication,
• sensors,
• material processing,
• non-destructive testing,
• position and motion control,
• spectroscopy,
• medical applications, such as treatment of retina detachment,
• nuclear fusion,
• holography.


8.1 Unique Properties of Lasers and Their Applications


The broad applications of lasers are made possible by their unique properties, which distinguish
lasers from all other light sources. We discuss these unique properties below.

8.1.1 High Monochromaticity; Narrow Spectral Width; High Temporal Coherence.
These three formulations basically mean the same thing. Saying that the laser has high
monochromaticity or that it has a very narrow spectral width means that it emits a very narrow band
of frequencies. A spectral lamp, like a gas discharge lamp based on mercury vapour, can have a
spectral width ∆ν = 10 GHz. Visible frequencies are around 5 × 10^14 Hz, hence the relative spectral
width of the lamp is roughly 0.002%. The line width measured in wavelengths satisfies

$$\frac{\Delta\lambda}{\lambda} = \frac{\Delta\nu}{\nu}, \tag{8.1}$$

and hence for λ = 550 nm, ∆λ of a spectral lamp is of the order of 0.01 nm.
By contrast, a laser can easily have a frequency band that is a factor of 1000 smaller, i.e. less
than 10 MHz = 10^7 Hz in the visible. For a wavelength of 550 nm this means that the linewidth
is only 10^-5 nm. As has been explained in Chapter 7, the coherence time of the emitted light
is the reciprocal of the frequency bandwidth. Light is emitted by atoms in bursts of harmonic
(cosine) waves consisting of a great but finite number of periods. The wave is thus not purely
monochromatic but has a certain frequency width given by

∆ν = 1/τc , (8.2)

where τc is called the coherence time, the typical duration of the bursts.
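The conversions in Eqs. (8.1) and (8.2) are easy to evaluate. The sketch below does this for a spectral width of 10 GHz (spectral lamp) and 10 MHz (laser) at λ = 550 nm; the function names are chosen for this example, and the distance c·τc is added only as an illustration of how long the wave trains are.

```python
C = 299_792_458.0   # speed of light in vacuum, m/s

def linewidth_in_wavelength(wavelength, delta_nu):
    """Eq. (8.1): delta_lambda / lambda = delta_nu / nu, with nu = c / lambda."""
    nu = C / wavelength
    return wavelength * delta_nu / nu

def coherence_time(delta_nu):
    """Eq. (8.2) rearranged: tau_c = 1 / delta_nu."""
    return 1.0 / delta_nu

wavelength = 550e-9   # metres
for name, delta_nu in (("spectral lamp", 10e9), ("laser", 10e6)):
    d_lambda = linewidth_in_wavelength(wavelength, delta_nu)
    tau_c = coherence_time(delta_nu)
    print(f"{name}: delta_lambda = {d_lambda * 1e9:.1e} nm, "
          f"tau_c = {tau_c:.1e} s, c * tau_c = {C * tau_c:.2f} m")
```

With these inputs the wave trains of the lamp are a few centimetres long, whereas those of the laser extend over tens of metres.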
Laser light is also emitted by atoms but, as will be explained in this chapter, due to the
special configuration of the laser, the wave trains can be extremely long corresponding to a very
long coherence time.
The property of a very narrow spectral width is essential for many of the already mentioned
applications of lasers, in particular in communication, high resolution spectroscopy, interferom-
etry and for sensors.

8.1.2 Highly Collimated Beam; Diffraction Limited Collimation.


Consider a discharge lamp as shown in Fig. 8.1. Every atom in the source emits a spherical wave
which lasts on average a time interval given by the reciprocal of the spectral width. To collimate
the light emitted, a lens is used with the discharge lamp in the focal plane. The spherical waves
emitted by all point sources (atoms) in the lamp are collimated into plane waves whose direction
depends on the position of the atoms in the source. The atoms at the edges of the source
determine the overall divergence angle θ given by

θ = h/f, (8.3)

where 2h is the size of the source and f is the focal length of the lens. Hence the light can
be collimated by either choosing a lens with large focal length or by reducing the size of the
source, or both. Both methods lead however to weak intensities. Before the invention of the
laser, collimated beams were obtained by using a tiny light source. Hence collimated beams were
in those days always very weak.
There exists a fundamental limit for the collimation of a beam. As follows from Chapter 7, a
time-harmonic beam of diameter D and wavelength λ has a diffraction limited divergence given
by:

$$\theta = \frac{\lambda}{D}. \tag{8.4}$$
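To get a feeling for the orders of magnitude, Eqs. (8.3) and (8.4) can be compared for a collimated lamp and a laser beam. The source size, focal length, wavelength and beam diameter below are illustrative assumptions, as is the propagation distance of 100 m.

```python
def lamp_divergence(source_half_size, focal_length):
    """Eq. (8.3): theta = h / f, in radians (small angles)."""
    return source_half_size / focal_length

def diffraction_limited_divergence(wavelength, beam_diameter):
    """Eq. (8.4): theta = lambda / D, in radians."""
    return wavelength / beam_diameter

theta_lamp = lamp_divergence(1e-3, 0.1)                        # 2 mm wide lamp, f = 10 cm
theta_laser = diffraction_limited_divergence(633e-9, 1e-3)     # 1 mm wide beam at 633 nm

distance = 100.0   # metres
for name, theta in (("lamp", theta_lamp), ("laser", theta_laser)):
    print(f"{name}: theta = {theta:.2e} rad, spread of order {theta * distance:.3f} m after {distance:.0f} m")
```

With these values the lamp beam spreads by about a metre over 100 m, the diffraction limited beam by a few centimetres. Making the lamp smaller would reduce its divergence, but only at the cost of emitted power.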


Figure 8.1: A discharge lamp is positioned in the focal plane of a converging lens. Every atom in
the lamp emits a spherical wave during a burst of radiation lasting on average a coherence time
τc. The overall divergence of the beam is determined by the atoms at the extreme positions of
the source.

Note that the minimum divergence depends on the wavelength. When a laser beam is used, the
diffraction limited divergence angle can almost be reached. The minimum divergence angle of
a laser beam therefore does not depend on the size of the laser source, as is the case for classical
light sources. Furthermore, all the power emitted by the laser can be collimated so that very
high intensities can be realized. A high degree of collimation is very useful for many applications
such as alignment, radar, bar code readers, etc.

Figure 8.2: A laser beam can almost reach diffraction limited collimation.

8.1.3 Very Small Focused Spot; Diffraction Limited Focused Spot; High Spatial Coherence.
If we add a second lens after the first lens in Fig. 8.1 a spot is obtained in the focal plane of the
second lens. This spot can be very small only when the light has been made almost perfectly
collimated by the first lens.
What is the smallest focal spot that one can achieve? If one focuses a perfectly collimated
beam with a lens with very small aberrations, the lateral size of the focused spot is limited by
diffraction. According to Chapter 7:

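$$\text{diffraction limited spot size} = 0.6\, \frac{\lambda}{\mathrm{NA}} = 1.2\, \frac{f}{D}\, \lambda, \tag{8.5}$$

where D is the diameter of the beam at the lens, so that NA = D/(2f) in agreement with (7.63).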
With a laser one can achieve a diffraction limited spot that has a very high intensity. Almost all
the light emitted by the laser can be focused into the spot.
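As an order-of-magnitude illustration, the sketch below evaluates the spot size 0.6 λ/NA and a rough mean intensity obtained by dividing the beam power by the area of a disc with that radius. The wavelength, numerical aperture and power are illustrative assumptions, and treating the spot size as the radius of a uniformly filled disc is a deliberate simplification.

```python
import math

def diffraction_limited_spot(wavelength, numerical_aperture):
    """Spot size of about 0.6 * lambda / NA, as in Eq. (8.5)."""
    return 0.6 * wavelength / numerical_aperture

def mean_intensity(power, spot_radius):
    """Power divided by the area of a disc with the given radius (rough estimate)."""
    return power / (math.pi * spot_radius ** 2)

spot = diffraction_limited_spot(633e-9, 0.5)   # illustrative wavelength and NA
intensity = mean_intensity(1e-3, spot)         # 1 mW beam, illustrative

print(f"spot size: {spot * 1e6:.2f} micrometre")
print(f"mean intensity: {intensity:.2e} W/m^2")
```

Even a modest 1 mW beam reaches an intensity of several hundred megawatts per square metre in a diffraction limited spot.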
As has been explained in Chapter 6, a light wave has high spatial coherence if it is well
behaved in space. At any given time, its amplitude and phase in any point can be predicted.
The spherical waves emitted by a point source have this property. But when there are many
point sources (atoms) that each emit bursts of harmonic waves that start at random times as
is the case in a classical light source, the amplitude and phase of the total emitted field at any


position in space can not be predicted. The only way to make the light spatially coherent is
by making the light source very small, but then there is hardly any light. As will be explained
below, by the design of the laser, the emissions by the atoms of the amplifying medium in a laser
are phase correlated, which leads to a very high temporal and spatial coherence.
The property of a small spot size with high intensity is essential for many applications. For the
compact disc, a scanning diffraction limited focused spot is necessary to obtain maximum resolution
and hence maximum storage density of data, while a high light intensity is needed for high sensitivity
of writing and reading the data. In material processing (cutting, welding and drilling) spots with
very high powers are needed. In retina surgery a very small high intensity spot is applied to weld
the retina without damaging the surrounding healthy tissue.

Figure 8.3: Diffraction limited spot obtained by focussing a collimated beam.

8.1.4 High Power; CW and Pulsed.


There are two types of lasers, namely CW (Continuous Wave) lasers, which produce a continuous
output, and pulsed lasers, which emit a train of pulses. These pulses can be very short: from
a nanosecond down to even femtoseconds (10^-15 s). A relatively low power CW laser is the HeNe
laser, which emits roughly 1 mW at the wavelength 632 nm. Other lasers can emit up to a
megawatt of continuous power. Pulsed lasers can emit enormous peak powers (i.e. at the
maximum of a pulse), ranging from 10^9 to 10^15 W, and it seems that even 10^18 W has been
reached.

Type     Power      Name
CW       10^-3 W    milliwatt
         10^0 W     watt
         10^3 W     kilowatt
         10^6 W     megawatt
pulsed   10^9 W     gigawatt
         10^12 W    terawatt
         10^15 W    petawatt
         10^18 W    exawatt

There are many applications of high power lasers, such as cutting and welding materials. In
Integrated Circuit (IC) manufacturing the required high transistor density on the chip will be
realised in the future by using the very short extreme ultra-violet (EUV) wavelength of 13.5 nm to
image the mask pattern into the photoresist. To obtain EUV light with sufficiently high intensity,
extremely powerful CO2 lasers are used to excite a plasma. Extremely high power lasers are
also applied to initiate fusion and in many nonlinear optics applications. Lasers with very short
pulses are used to study very fast phenomena with short decay times, for optical computers to
realise faster clocks and for high resolution imaging.


8.1.5 Wide Tuning Range.


Lasers are available for a wide range of wavelengths, from the vacuum ultra-violet (VUV), the
ultra-violet (UV), the visible, the infrared (IR) and the mid-infrared (MIR) up into the far
infrared (FIR). For some types of lasers, the tuning range can be quite broad. The gaps in the
electromagnetic spectrum that are not directly addressed by laser emission can be covered by
techniques such as higher harmonic generation and frequency differencing. In this way sources
are available for virtually all frequencies from VUV to FIR.

8.2 Optical Resonator


We will now explain the principle of lasers and why they have all the interesting properties mentioned
above. As its name suggests, a laser consists principally of:

• an optical resonator;
• an amplifying medium.

In this section we consider the optical resonator. The function of the resonator is to obtain a
high light energy density and to gain control over the emission wavelengths.
A resonator, whether it is mechanical like a pendulum, a spring or a string, or electrical like an
LRC circuit, has one or multiple resonance frequencies νres. Every resonator has losses, therefore
the oscillation will gradually die out if no energy is supplied after the initial excitation. The
losses cause an exponential decrease of the amplitude of the oscillation as shown in Fig. 8.4. The
oscillation is therefore not purely monochromatic but has a finite bandwidth given by ∆ν ≈ 1/τ
as shown in Fig. 8.4, where τ is the time at which the amplitude of the oscillation has reduced
to half the initial value.

Figure 8.4: Damped oscillation (left) and frequency spectrum of a damped oscillation (right),
centred at the resonance frequency and with a width equal to the reciprocal of the decay time.

The optical resonator is a region filled with some material with refractive index n bounded
by two aligned highly reflective mirrors at a distance L. The resonator is called a Fabry-Perot
cavity.
Let the z-axis be chosen along the axis of the cavity as shown in Fig. 8.5, and assume that the
transverse directions are so large that the light can be considered a plane wave bouncing back
and forth along the z-axis between the two mirrors. Let ω be the frequency and k = ω/c the
wave number in vacuum. The plane wave that propagates in the positive z-direction is given by:

$$E(z) = A e^{iknz}. \tag{8.6}$$

Figure 8.5: TO be changed

For very good mirrors, the amplitude remains unchanged upon reflection while the phase
typically changes by π. The phase changes at the two mirrors then add up to 2π and hence have
no effect, so that after one round trip (i.e. two reflections) the field (8.6) is:

\[ E(z) = A e^{2iknL} e^{iknz}. \tag{8.7} \]

A high field builds up when this wave constructively interferes with (8.6), i.e. when

\[ k = \frac{2\pi m}{2nL}, \quad \text{or} \quad \nu = \frac{kc}{2\pi} = m\,\frac{c}{2nL}, \tag{8.8} \]

for m = 1, 2, .... Hence, provided dispersion of the medium can be neglected (n is independent
of the frequency), the resonance frequencies are separated by

\[ \Delta\nu = \frac{c}{2nL}, \tag{8.9} \]

which is the so-called free spectral range. For a gas laser that is 1 m long, the free spectral
range is approximately 150 MHz.

Example Suppose that the cavity is 100 cm long and is filled with a material with refractive
index n = 1. Light with a visible wavelength of λ = 500 nm corresponds to mode number
m = 2L/λ = 4 × 10⁶ and the free spectral range is ∆ν = c/(2L) = 150 MHz.
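
The numbers in this example can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper names `free_spectral_range` and `mode_number` are hypothetical and are not part of the course material.

```python
# Illustrative check of the Fabry-Perot example (helper names are hypothetical).
C = 299_792_458.0  # speed of light in vacuum, m/s


def free_spectral_range(length_m, n=1.0):
    """Mode spacing c / (2 n L) of a cavity of length L, in Hz, cf. (8.9)."""
    return C / (2.0 * n * length_m)


def mode_number(length_m, wavelength_m, n=1.0):
    """Mode number m = 2 n L / lambda for vacuum wavelength lambda, cf. (8.8)."""
    return 2.0 * n * length_m / wavelength_m


if __name__ == "__main__":
    cavity_length = 1.0   # 100 cm
    wavelength = 500e-9   # 500 nm
    print(f"m   = {mode_number(cavity_length, wavelength):.3e}")
    print(f"FSR = {free_spectral_range(cavity_length) / 1e6:.1f} MHz")
```

Halving the cavity length doubles the free spectral range, which is the property used later to obtain single-frequency operation.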

Because of losses caused by the mirrors (which never reflect perfectly) and by the absorption
and scattering of the light, the resonances have a certain frequency width ∆ν. When a resonator
is used as a laser, one of the mirrors is given a small transmission to couple the laser light out.
This also corresponds to a loss of the resonator. To compensate for all losses, the cavity must
contain an amplifying medium. Due to the amplification the resonance line widths inside the
bandwidth of the amplifier are reduced to very sharp lines as shown in Fig. 8.6.

Figure 8.6: Resonant frequencies of a cavity of length L when the refractive index n = 1. With
an amplifier inside the cavity, the linewidths of the resonances within the bandwidth of the
amplifier are reduced. The red curve is the spectral function of the amplification.


8.3 Amplification
Amplification can be achieved by a medium with atomic resonances that are at or close to one of
the resonances of the resonator. We first recall the simple theory developed by Einstein in 1916
of the dynamic equilibrium of a material in the presence of electromagnetic radiation.

8.3.1 The A and B Einstein Coefficients


We consider two atomic energy levels E2 > E1. By absorbing a photon of energy

\[ \hbar\omega = E_2 - E_1, \tag{8.10} \]

an atom that is initially in the lower energy state 1 can be excited to state 2. Here ħ is
Planck's constant divided by 2π:

\[ \hbar = \frac{6.626070040}{2\pi} \times 10^{-34}\ \mathrm{J\,s}. \tag{8.11} \]

Suppose W(ω) is the time-averaged electromagnetic energy density per unit of frequency interval
around frequency ω. Hence W has dimension J s/m³. Let N1 and N2 be the number of atoms
in state 1 and 2, respectively, where

\[ N_1 + N_2 = N \tag{8.12} \]

is the total number of atoms (which is constant). The rate of absorption is the rate of decrease
of N1 and is proportional to the energy density and the number of atoms in state 1:

\[ \frac{dN_1}{dt} = -B_{12} N_1 W(\omega), \quad \text{absorption}, \tag{8.13} \]

where B12 > 0 is a constant of proportionality with dimension m³ J⁻¹ s⁻². Without any external
influence, an atom that is in the excited state will usually transfer to state 1 within 1 ns or so,
while emitting a photon of energy (8.10). This process is called spontaneous emission since
it happens also without an electromagnetic field present. The rate of spontaneous emission is
given by:

\[ \frac{dN_2}{dt} = -A_{21} N_2, \quad \text{spontaneous emission}, \tag{8.14} \]

where A21 has dimension s⁻¹. The lifetime for spontaneous emission is τsp = 1/A21. It is
important to note that the spontaneously emitted photon is emitted in a random direction.
In fact, since an atom can in general be described by a radiating electric dipole, the statistical
distribution of radiation angles is proportional to the intensity pattern of the field radiated by the
dipole. Furthermore, since the radiation occurs at a random time, there is no phase relation
between the spontaneously emitted field and the field that excites the atom.
It is less obvious that in the presence of an electromagnetic field of frequency close to the
atomic resonance, an atom in the excited state can also be stimulated by that field to emit a
photon and transfer to the lower energy state. The rate of stimulated emission is proportional
to the number of excited atoms and to the energy density of the field:

\[ \frac{dN_2}{dt} = -B_{21} N_2 W(\omega), \quad \text{stimulated emission}, \tag{8.15} \]

where B21 has the same dimension as B12. It is very important to remark that stimulated
emission occurs in the same electromagnetic mode (e.g. a plane wave) as the mode of the
field that excites the transition and that the phase of the radiated field is identical to that
of the exciting field. This implies that stimulated emission enhances the electromagnetic field by
constructive interference and this property is crucial for the operation of the laser.


Figure 8.7: Absorption, Spontaneous Emission and Stimulated Emission with their respective
rates.

8.3.2 Relation Between the Einstein Coefficients


The Einstein coefficients A21, B12 and B21 can be expressed as functions of each other. Consider
a black body such as a closed empty box. After a certain time, thermal equilibrium will be
reached. Because no radiation enters the box from outside or leaves it,
the electromagnetic energy density is the thermal density WT(ω), which according to Planck’s
law is independent of the material of which the box is made and is given by:

\[ W_T(\omega) = \frac{\hbar\omega^3}{\pi^2 c^3}\, \frac{1}{\exp\left(\frac{\hbar\omega}{k_B T}\right) - 1}, \tag{8.16} \]

where kB is Boltzmann’s constant:

\[ k_B = 1.38064852 \times 10^{-23}\ \mathrm{m^2\,kg\,s^{-2}\,K^{-1}}. \tag{8.17} \]

The rates of upward and downward transitions of the atoms in the wall of the box must be
identical:

\[ B_{12} N_1 W_T(\omega) = A_{21} N_2 + B_{21} N_2 W_T(\omega). \tag{8.18} \]

Hence,

\[ W_T(\omega) = \frac{A_{21}}{B_{12} N_1/N_2 - B_{21}}. \tag{8.19} \]

But in thermal equilibrium:

\[ \frac{N_1}{N_2} = \exp\left(-\frac{E_1 - E_2}{k_B T}\right) = \exp\left(\frac{\hbar\omega}{k_B T}\right). \tag{8.20} \]

By substituting (8.20) into (8.19), and comparing the result with (8.16), it follows that both
expressions for WT(ω) are identical for all temperatures only if

\[ B_{12} = B_{21}, \qquad A_{21} = \frac{\hbar\omega^3}{\pi^2 c^3}\, B_{21}. \tag{8.21} \]

Example For green light of λ = 550 nm, we have ω/c = 2π/λ = 1.1424 × 10⁷ m⁻¹ and thus

\[ \frac{A_{21}}{B_{21}} = \frac{\hbar}{\pi^2}\left(\frac{\omega}{c}\right)^3 = 1.593 \times 10^{-14}\ \mathrm{J\,s\,m^{-3}}. \tag{8.22} \]

Hence the spontaneous and stimulated emission rates are equal if W(ω) = 1.593 × 10⁻¹⁴ J s m⁻³.

Table 8.1: Typical intensities of light sources.

Source              I (W m⁻²)
Mercury lamp        10⁴
Continuous laser    10⁵
Pulsed laser        10¹³

For a (narrow) frequency band dω the time averaged energy density is W(ω) dω and for a
plane wave the energy density is related to the intensity I (i.e. the length of the time averaged
Poynting vector) as:

\[ W(\omega)\, d\omega = I/c. \tag{8.23} \]

A typical value for the frequency width of a narrow emission line of an ordinary light source
is 10¹⁰ Hz, i.e. dω = 2π × 10¹⁰ rad/s. Hence, the spontaneous and stimulated emission rates
are identical if the intensity is I ≈ 3.0 × 10⁵ W/m². As seen from Table 8.1, this intensity is of
the order reached by continuous lasers and is exceeded by many orders of magnitude by pulsed
lasers. For classical light sources the
spontaneous emission rate is much larger than the stimulated emission rate. If a beam with
frequency width dω and energy density W(ω) dω propagates through a material, the rate of loss
of energy is proportional to:

\[ (N_1 - N_2) B_{12} W(\omega). \tag{8.24} \]

According to (8.18) and (8.21), in thermal equilibrium this is equal to the spontaneous emission
rate A21 N2. Indeed, the spontaneously
emitted light corresponds to a loss of intensity of the beam because it is emitted in random
directions and with random phase.
When N2 > N1, the light is amplified. This state is called population inversion and it is
essential for the operation of the laser. Note that the ratio of the spontaneous and stimulated
emission rates is according to (8.21) proportional to ω³. Hence for shorter wavelengths such as
X-rays, it is much more difficult to make lasers than for visible light.
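
As a rough numerical illustration of (8.21) and (8.23), the following Python sketch evaluates the ratio A21/B21 and the intensity at which the spontaneous and stimulated emission rates are equal. The function names, the rounded value of ħ and the X-ray wavelength of 1 nm are assumptions chosen for illustration only.

```python
import math

HBAR = 1.054571800e-34  # Planck's constant divided by 2*pi, J s
C = 299_792_458.0       # speed of light in vacuum, m/s


def a_over_b(wavelength_m):
    """Ratio A21/B21 = hbar * omega^3 / (pi^2 c^3), in J s m^-3, cf. (8.21)."""
    k = 2.0 * math.pi / wavelength_m  # omega / c
    return HBAR * k**3 / math.pi**2


def crossover_intensity(wavelength_m, linewidth_hz):
    """Intensity I = c * W * d_omega at which W equals A21/B21, cf. (8.23)."""
    d_omega = 2.0 * math.pi * linewidth_hz
    return C * a_over_b(wavelength_m) * d_omega


if __name__ == "__main__":
    # Green light, and a 1 nm X-ray as an assumed example of a short wavelength.
    for wavelength in (550e-9, 1e-9):
        ratio = a_over_b(wavelength)
        intensity = crossover_intensity(wavelength, 1e10)
        print(f"lambda = {wavelength:.3e} m: A21/B21 = {ratio:.3e} J s m^-3, "
              f"I = {intensity:.3e} W/m^2")
```

Because the ratio scales as ω³, halving the wavelength raises the required energy density by a factor of eight, which is the scaling argument made above for X-ray lasers.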

8.3.3 Population Inversion


For electromagnetic energy density W(ω) per unit of frequency interval, the rate equations are

\[ \frac{dN_2}{dt} = -A_{21} N_2 + (N_1 - N_2) B_{12} W(\omega), \tag{8.25} \]

\[ \frac{dN_1}{dt} = A_{21} N_2 - (N_1 - N_2) B_{12} W(\omega). \tag{8.26} \]

Hence, for ∆N = N2 − N1:

\[ \frac{d\Delta N}{dt} = -A_{21} \Delta N - 2 \Delta N B_{12} W(\omega) - A_{21} N, \tag{8.27} \]

where as before N = N1 + N2 is constant. If initially (i.e. at t = 0) all atoms are in the lowest
state, ∆N(t = 0) = −N, then it follows from (8.27) that

\[ \Delta N(t) = -N \left[ \frac{A_{21}}{A_{21} + 2B_{12}W(\omega)} + \left( 1 - \frac{A_{21}}{A_{21} + 2B_{12}W(\omega)} \right) e^{-(A_{21} + 2B_{12}W(\omega))t} \right]. \tag{8.28} \]

An example where A21/(B12 W(ω)) = 0.5 is shown in Fig. 8.8. We always have ∆N < 0, hence
N2(t) < N1(t) for all times t. Hence a system with only two levels cannot have population
inversion and therefore always shows net absorption.
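
The behaviour of the two-level system can be checked numerically. The sketch below evaluates the closed-form solution (8.28) and, as an independent check, integrates (8.27) with a simple forward-Euler scheme. The function names, the step count and the parameter values (arbitrary time units with A21/(B12 W) = 0.5) are assumptions made for this illustration.

```python
import math


def delta_n_over_n(t, a21, b12_w):
    """Closed-form solution (8.28) for Delta N / N with Delta N(0) = -N.

    a21   : spontaneous emission rate A21
    b12_w : product B12 * W(omega), in the same units as a21
    """
    rate = a21 + 2.0 * b12_w
    steady = a21 / rate
    return -(steady + (1.0 - steady) * math.exp(-rate * t))


def delta_n_euler(t_end, a21, b12_w, steps=100_000):
    """Forward-Euler integration of (8.27) divided by N, starting from -1."""
    dt = t_end / steps
    x = -1.0
    for _ in range(steps):
        x += dt * (-(a21 + 2.0 * b12_w) * x - a21)
    return x


if __name__ == "__main__":
    a21, b12_w = 0.5, 1.0  # A21 / (B12 W) = 0.5
    for t in (0.0, 0.5, 1.0, 5.0):
        exact = delta_n_over_n(t, a21, b12_w)
        numeric = delta_n_euler(t, a21, b12_w)
        print(f"t = {t:3.1f}: closed form {exact:+.5f}, Euler {numeric:+.5f}")
```

For every t the result stays negative and tends to −A21/(A21 + 2 B12 W), so stronger pumping only brings ∆N closer to zero without ever changing its sign.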


Figure 8.8: ∆N/N as function of the dimensionless time (A21 + 2B12 W)t when all atoms are in
the ground state at t = 0, i.e. ∆N(0) = −N.

A way to achieve population inversion of levels 1 and 2, and hence amplification of the radiation
with frequency ω satisfying ħω = E2 − E1, is to use more atomic levels, for example three. In Fig. 8.9
the ground state is state 1 and there are two upper levels 2 and 3 such that E1 < E2 < E3. The transition
of interest is still that from level 2 to level 1. Initially almost all atoms are in the ground state
1. Atoms are pumped with rate R from level 1 directly to level 3. The transition 3 → 2 is
non-radiative and has a high rate A32, so that level 3 is quickly emptied and therefore N3 remains
small. State 2 is called a metastable state because the residence time of an atom in this
state is relatively long. Therefore its population tends to increase, which leads to a population
inversion between the metastable state 2 and the ground state 1 (which is continuously
being depopulated by pumping to the highest level).
To obtain population inversion, a majority of the atoms in the ground state (state 1) must be
promoted to the highly excited energy level (state 3), which requires a significant input of external energy.
Note that when A31 is not small, level 1 is quickly refilled, which ends the population
inversion. The laser output is then a series of pulses. To have a continuous laser output,
atoms in level 1 should quickly decay to a lower-lying level 0.
Pumping may be done optically as described, but the energy required to transfer the atoms
from level 0 to level 2 can also be supplied by an electrical discharge in a gas or by an electric
current. After the pumping has achieved population inversion, there is initially no light emitted.
So how does the laser actually start? Lasing starts by spontaneous emission. The spontaneously
emitted photons stimulate atoms in level 2 to decay to level 1 while emitting a
photon of energy ħω. This stimulated emission occurs in phase with the exciting light and hence
the light continuously builds up coherently while it bounces back and forth between the
mirrors. One of the mirrors is slightly transparent, and in this way some of the light leaks
out of the laser.

8.4 Cavities
The amplifying medium can completely fill the space between the mirrors as at the top in
Fig. 8.10, or there can be space between the amplifier and the mirrors. For example, if the
amplifier is a gas it may be enclosed by a glass cylinder. The end faces of the cylinder are
positioned at the Brewster angle with respect to the axis, as shown in the middle figure of
Fig. 8.10, to minimise reflections. This type of resonator is called a resonator with external
mirrors.
Figure 8.9: The three Einstein radiative transitions.

Usually one or both mirrors are convex as shown in the bottom figure of Fig. 8.10. We state
without proof that in that case the distance L between the mirrors and the radii of curvature
R1 and R2 of the mirrors have to satisfy

\[ 0 < \left(1 - \frac{L}{R_1}\right)\left(1 - \frac{L}{R_2}\right) < 1, \tag{8.29} \]

or else the laser light will ultimately leave the cavity. This condition is called the condition of
stability of the laser. The curvatures for the mirrors are positive for convex mirrors but negative
values have to be substituted when the mirror is concave. Clearly, when both mirrors are concave,
the laser is always unstable.
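
The stability condition (8.29) is easy to evaluate for a given geometry. The following sketch uses the sign convention stated in the text for R1 and R2; the function name `is_stable` and the example values are hypothetical.

```python
def is_stable(length, r1, r2):
    """Return True if 0 < (1 - L/R1)(1 - L/R2) < 1, i.e. condition (8.29).

    The radii r1 and r2 follow the sign convention of the text. A flat
    mirror can be passed as float("inf"), for which L/R evaluates to 0.
    """
    g1 = 1.0 - length / r1
    g2 = 1.0 - length / r2
    return 0.0 < g1 * g2 < 1.0


if __name__ == "__main__":
    flat = float("inf")
    examples = [
        (1.0, 2.0, 2.0),    # both radii positive and larger than L
        (1.0, flat, 2.0),   # one flat mirror
        (1.0, -2.0, -2.0),  # both radii negative
        (1.0, 0.4, 0.4),    # radii much smaller than half the cavity length
    ]
    for length, r1, r2 in examples:
        print(length, r1, r2, is_stable(length, r1, r2))
```

With two negative radii both factors exceed 1, so the product is always larger than 1, in agreement with the last sentence above.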

Figure 8.10: Three types of laser cavity. The yellow region is the amplifier. The middle case is
called a laser with external mirrors.

8.5 Problems of Laser Operation


In this section we consider some problems that occur with lasers and discuss what can be done
to solve these.

8.5.1 How to Realize Single Frequency Operation


In many applications such as communication, interferometry, spectroscopy and sensing one needs
a single wavelength. Consider a cavity of length L as shown in Fig. 8.11 and suppose that the
amplifier has a gain curve covering many resonances of the resonator. One way to achieve single
frequency output is by taking care that there is only one frequency for which the gain is larger


than the losses. One then says that the laser is above threshold for only one frequency. This can
be done by choosing the length L of the cavity so small that there is only one mode under the
gain curve, with gain higher than the losses. However, a small length of the amplifier means less
output power. Another method would be to restrict the pumping so that for only one mode the
gain compensates the losses. But this implies again that the laser output power is very limited.
A better solution is to add a Fabry-Perot cavity inside the laser cavity as shown in Fig. 8.12. The
cavity consists e.g. of a piece of glass of a certain thickness a. By choosing a sufficiently small, the
distance in frequency c/(2a) between the resonances of the Fabry-Perot cavity becomes so large
that there is only one Fabry-Perot resonance under the gain curve of the amplifier. Furthermore,
by choosing the proper angle for the Fabry-Perot cavity with respect to the axis of the laser cavity,
the Fabry-Perot resonance can be coupled to the desired resonance frequency. This frequency is
then the frequency of the laser output. All other resonance frequencies of the resonator under
the gain curve are damped because they are not a resonance of the Fabry-Perot cavity.
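
To get a feeling for the numbers, the sketch below estimates how many cavity resonances fall within a given gain bandwidth and how thin the intracavity Fabry-Perot plate must be so that its resonance spacing c/(2a) exceeds that bandwidth. The gain bandwidth of 1.5 GHz is an assumed illustrative value, the function names are hypothetical, and the threshold level is ignored.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def mode_spacing(length_m, n=1.0):
    """Spacing c / (2 n L) between neighbouring cavity resonances, in Hz."""
    return C / (2.0 * n * length_m)


def modes_under_gain(gain_bandwidth_hz, length_m, n=1.0):
    """Approximate number of resonances inside the gain bandwidth."""
    return int(gain_bandwidth_hz // mode_spacing(length_m, n))


def max_plate_thickness(gain_bandwidth_hz):
    """Largest thickness a for which c / (2a) still exceeds the bandwidth."""
    return C / (2.0 * gain_bandwidth_hz)


if __name__ == "__main__":
    bandwidth = 1.5e9  # assumed gain bandwidth in Hz
    print("modes in a 1 m cavity:", modes_under_gain(bandwidth, 1.0))
    print(f"plate thickness below {max_plate_thickness(bandwidth) * 100:.1f} cm")
```

Shortening the cavity reduces the number of modes under the gain curve in the same way, but at the cost of output power, which is why the additional Fabry-Perot plate is the preferred solution.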

Figure 8.11: Laser with cavity of length L and broad amplifier gain curve. Many resonance
frequencies of the resonator are above threshold, i.e. have sufficient gain to compensate the losses.

8.5.2 How to Prevent Transverse Modes


The best-known laser mode has a transverse intensity distribution that is a Gaussian function
of the transverse distance to the optical axis. We call a mode with Gaussian transverse shape a
longitudinal mode, and when its frequency satisfies ν = mc/(2L) it is called the mth longitudinal
mode. However, inside the laser cavity other modes with different transverse patterns
can also resonate. An example is shown in Fig. 8.13 where mode (1,0) consists of two spots.
There exist many more transverse modes as shown in Fig. 8.14. The transverse modes all have
slightly different frequencies. So even when there is only one Gaussian mode above threshold (i.e.
modes occur for only one value of m), there can be many transverse modes with frequencies very
close to the frequency of the Gaussian mode that are also above threshold. This is illustrated in
Fig. 8.15 where the frequencies of modes (0,0), (1,0) and (1,1) all are above threshold.
Usually one prefers the Gaussian mode and the transverse modes are undesired. How can we
get rid of them? Because the Gaussian mode has the smallest transverse width, the transverse modes
can be eliminated by inserting an aperture in the laser cavity. This aperture is so small that the
transverse modes suffer high scattering losses, but is sufficiently large so that the Gaussian mode
is not affected.

Figure 8.12: Laser with cavity of length L and broad amplifier gain curve. Many resonance
frequencies are below the gain curve and have gain which is above the red dashed line, to
compensate the losses. Such modes are referred to as being above threshold.

Figure 8.13: Laser cavity with (0,0) and (1,0) modes.

8.6 Types of Lasers


There are many types of lasers: gas, solid, liquid, semiconductor, chemical, excimer, e-beam, free
electron, fiber and even waveguide lasers. We classify them according to the pumping mechanism.

8.6.1 Optical Pumping


The energy to transfer the atom A from the ground state to the excited state is provided by
light. The source could be another laser or an incoherent light source such as a discharge lamp.

Figure 8.14: Intensity pattern of several transverse modes.

Figure 8.15: Resonance frequencies of transverse modes that have sufficient gain to compensate
the losses.

If A is the atom in the ground state and A* is the excited atom, we have

\[ \hbar\omega_{02} + A \to A^*, \tag{8.30} \]

where ω02 is the frequency for the transition 0 → 2. The ruby laser (the amplifier consists of
Al2O3 with 0.05 weight percent Cr2O3) was the first laser, invented in 1960. It emits pulses of
light of wavelength 694.3 nm and is optically pumped with a gas discharge lamp. Other optically
pumped lasers are the YAG, glass, fiber, semiconductor and dye lasers.
In the dye laser the amplifier is a liquid (e.g. Rhodamine 6G). It is optically pumped by an
argon laser and it has a huge gain width, which covers almost the complete visible wavelength
range. We can select a certain wavelength by inserting a dispersive element like the Fabry-Perot
cavity inside the laser cavity and rotating it to the right angle to select the desired wavelength, as
explained above.

Figure 8.16: Optical pumping.

8.6.2 Electron-Collision Pump


Energetic electrons are used to collide with the atoms of the amplifier, thereby transferring some
of their energy:

\[ A + e(E_1) \to A^* + e(E_2), \tag{8.31} \]

where e(E1) means an electron with energy E1 and where E1 − E2 is equal to ħω02, so that the
atom is transferred from the ground state to state 2 to obtain population inversion. Examples
are the argon, krypton, xenon, nitrogen and copper lasers. The electrons can be created by a
discharge or by an electron beam.


8.6.3 Atom Collision

Let Bᵐ be atom B in an excited, so-called metastable state. This means that Bᵐ, although
unstable, has a very long relaxation time, i.e. longer than 1 ms or so. If Bᵐ collides with atom
A, it transfers energy to A:

\[ B^m + A \to B + A^*, \tag{8.32} \]

where A* is the excited state used for the stimulated emission. Let τm1 be the relaxation time of the
metastable state Bᵐ; then τm1 is very large and hence the spontaneous emission rate is very
small. This implies that the number of metastable atoms as a function of time t is given by a
slowly decaying exponential function exp(−t/τm1). How can one get metastable atoms? One
can for example pump atoms B from their ground state 1 to an excited state 3 above state m,
such that the spontaneous emission rate 3 → m is large. The pumping can be done electrically
or using electron collisions or by any other means. If it is done electrically, then we have

\[ B + e(E_2) \to B^m + e(E_1). \tag{8.33} \]

Figure 8.17: Pumping atoms A to state 2 by collision with metastable atoms Bᵐ.

Figure 8.18: Creating metastable atoms Bᵐ by pumping.

Examples of these types of laser are He-Ne (which emits in the red at 632.8 nm), N2-CO2 and
He-Cd. All of these depend on atom or molecule collisions, where the atom or molecule that
is mentioned first in the name is brought into the metastable state and the lasing occurs at a
wavelength corresponding to a level difference of the second mentioned atom (or molecule). In
the simplest case the metastable states are created by electrons generated by a discharge. The
CO2 laser emits at 10 µm and can achieve huge power.


Figure 8.19: HeNe laser with spherical external mirrors, a discharge tube with faces at the
Brewster angle to minimise reflections, and an anode and cathode for the discharge pumping.

8.6.4 Chemical Pump


In some chemical reactions, a molecule is created in an excited state with population inversion.
An example is:

\[ A + B_2 \to (AB)^* + B. \tag{8.34} \]

So in this case the lasing will take place for a transition between states of molecule AB. The HF,
DF, Ar-F, Kr-F, Xe-F and Xe-Cl lasers are all chemically pumped.

8.6.5 Semiconductor Laser


In this case pumping is done by electron current injection. It is one of the smallest lasers and yet
it emits typically 20 mW of power. Transitions occur between the conduction and valence bands
close to the p-n junction. Electrons from the n-layer conduction band will recombine with the
holes in the p-layer. A cavity is obtained by polishing the end faces that are perpendicular to
the junction to make them highly reflecting. Semiconductor lasers are produced for wavelengths
from 700 nm to 30 µm and give continuous (CW) output.

Figure 8.20: Semiconductor laser with active p-n junction, polished end faces and current supply
for pumping.

Appendices

Appendix A

Vector Calculus

Below, A, B, C and D are vector fields (or constant vectors) and φ and ψ are scalar functions.
Then:
A · (B × C) = B · (C × A) = C · (A × B). (A.1)
A × (B × C) = (A · C)B − (A · B)C. (A.2)
(A × B) · (C × D) = (A · C)(B · D) − (A · D)(B · C) (A.3)
∇ · (φA) = φ∇ · A + ∇φ · A. (A.4)
∇ × (φA) = φ∇ × A + ∇φ × A. (A.5)
∇ · (A × B) = −A · ∇ × B + B · ∇ × A. (A.6)
∇ × (A × B) = −(A · ∇)B + A∇ · B + (B · ∇)A − B∇ · A. (A.7)
∇(A · B) = (A · ∇)B + A × ∇ × B + (B · ∇)A + B × ∇ × A. (A.8)
∇ · ∇φ = ∆φ, (A.9)
where ∆ = ∂ 2 /∂x2 + ∂ 2 /∂y 2 + ∂ 2 /∂z 2 provided that (x, y, z) is an orthonormal basis.

∇ × ∇ × A = −∆A + ∇∇ · A. (A.10)

Remark: Formula (A.10) is only valid in a Cartesian coordinate system. This means that
the vector field A must be decomposed on the Cartesian basis and the derivatives must be
computed with respect to the corresponding Cartesian coordinates, and then ∆A must be interpreted
component-by-component: ∆A = (∆Ax, ∆Ay, ∆Az)ᵀ, where Ax, Ay, Az are components with
respect to the Cartesian basis. The formula does not hold in cylindrical or spherical coordinates!

∇ × ∇φ = 0. (A.11)
∇ · (∇ × A) = 0. (A.12)
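
The purely algebraic identities (A.1)-(A.3) can be spot-checked numerically. The sketch below does so for one set of integer test vectors, so that all comparisons are exact; the helper functions and the test vectors are arbitrary choices made for illustration and do not constitute a proof.

```python
def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def check_a1(a, b, c):
    """A . (B x C) = B . (C x A) = C . (A x B)."""
    return dot(a, cross(b, c)) == dot(b, cross(c, a)) == dot(c, cross(a, b))


def check_a2(a, b, c):
    """A x (B x C) = (A . C) B - (A . B) C."""
    rhs = tuple(dot(a, c) * bi - dot(a, b) * ci for bi, ci in zip(b, c))
    return cross(a, cross(b, c)) == rhs


def check_a3(a, b, c, d):
    """(A x B) . (C x D) = (A . C)(B . D) - (A . D)(B . C)."""
    lhs = dot(cross(a, b), cross(c, d))
    return lhs == dot(a, c) * dot(b, d) - dot(a, d) * dot(b, c)


if __name__ == "__main__":
    A, B, C, D = (1, 2, 3), (-4, 5, 6), (7, -8, 9), (2, 0, -1)
    print(check_a1(A, B, C), check_a2(A, B, C), check_a3(A, B, C, D))
```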
In addition, the following integral theorems apply (V is a volume with surface S and
outward unit normal n).
Gauss’s Theorem (or divergence theorem):

\[ \iiint_V \nabla \cdot \mathbf{A}\, dV = \iint_S \mathbf{A} \cdot \mathbf{n}\, dS. \tag{A.13} \]

Apply this to the vector field A = φ∇ψ. Because of (A.4) and (A.9), ∇ · A = φ∆ψ + ∇φ · ∇ψ
and thus (Green’s Theorem):

\[ \iiint_V \left( \varphi \Delta\psi + \nabla\varphi \cdot \nabla\psi \right) dV = \iint_S \varphi\, \frac{\partial\psi}{\partial n}\, dS. \tag{A.14} \]

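By subtracting the analogous relation from (A.14) with φ and ψ interchanged, one gets:

\[ \iiint_V \left( \varphi \Delta\psi - \psi \Delta\varphi \right) dV = \iint_S \left( \varphi\, \frac{\partial\psi}{\partial n} - \psi\, \frac{\partial\varphi}{\partial n} \right) dS. \tag{A.15} \]

By using (A.6) and Gauss’s theorem it follows furthermore that

\[ \begin{aligned} \iiint_V \mathbf{B} \cdot \nabla \times \mathbf{A}\, dV - \iiint_V \mathbf{A} \cdot \nabla \times \mathbf{B}\, dV &= \iint_S \mathbf{n} \cdot (\mathbf{A} \times \mathbf{B})\, dS \\ &= \iint_S (\mathbf{n} \times \mathbf{A}) \cdot \mathbf{B}\, dS \\ &= -\iint_S \mathbf{A} \cdot (\mathbf{n} \times \mathbf{B})\, dS, \end{aligned} \tag{A.16} \]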

where in the right-hand side we used (A.1).


Stokes’ Theorem (S is a possibly curved surface with contour C):

\[ \iint_S \nabla \times \mathbf{A} \cdot \mathbf{n}\, dS = \oint_C \mathbf{A} \cdot d\mathbf{s}, \tag{A.17} \]

where n is the unit vector field that is perpendicular to S, which is in the direction to which a
right-handed corkscrew points if it is rotated in the positive direction of the line integral along
C.
There is also an analogue of Green’s Theorem for the curl operator:

\[ \iiint_V \nabla \times \mathbf{A} \cdot \mathbf{B}\, dV - \iiint_V \mathbf{A} \cdot \nabla \times \mathbf{B}\, dV = \iint_S (\mathbf{n} \times \mathbf{A}) \cdot \mathbf{B}\, dS = -\iint_S \mathbf{A} \cdot (\mathbf{n} \times \mathbf{B})\, dS. \tag{A.18} \]

Appendix B

The Lorentz Model of Material Dispersion

The Lorentz model, which we already mentioned in Section 2.3, leads to a dispersion relation for
the susceptibility and hence for the permittivity given by

\[ \epsilon(\omega) = \frac{1 + 2q}{1 - q}, \tag{B.1} \]

with

\[ q = \frac{N q_e^2}{3 \epsilon_0 m_e} \sum_j \frac{f_j}{\omega_j^2 - \omega^2 + i\gamma_j \omega}, \tag{B.2} \]

where N is the number density of electrons, qe and me are the charge and mass of the electron,
ωj are resonance frequencies of atoms or molecules of the material, γj > 0 is a damping term
and fj are weighting factors (so-called oscillator strengths) satisfying Σj fj = 1. The refractive
index is the square root of the permittivity, n = √ε, and its real part is shown in Fig. B.1.
For dilute gases, N is small and hence q is small compared to 1. Then the permittivity
becomes equal to

\[ \epsilon(\omega) \approx 1 + 3q = 1 + \frac{N q_e^2}{\epsilon_0 m_e} \sum_j \frac{f_j}{\omega_j^2 - \omega^2 + i\gamma_j \omega}. \tag{B.3} \]

Figure B.1: Refractive index as function of frequency.
The resonances corresponding to transitions from a lower to a higher energy level of electrons
that are in the inner shells of an atom, typically are in the x-ray region, whereas transitions
of valence electrons can be in the ultra-violet to the visible. Resonances of relative motions of
atoms inside a molecule are often in the infrared. At a resonance, the atom absorbs a photon of

151
APPENDIX B. THE LORENTZ MODEL OF MATERIAL DISPERSION

energy ~ω equal to the difference between the energy levels. The material is then absorbing and
this corresponds to a permittivity with positive imaginary part. In between the resonances, the
absorption is low so that the imaginary part of the permittivity is almost zero while its real part
is slowly increasing with frequency (this is called "normal" dispersion). Close to a resonance,
the real part of the permittivity is quickly decreasing with frequency (abnormal dispersion).

Appendix C

About the Conservation of Electromagnetic Energy

We consider a time harmonic electric field that is more general than a plane wave (i.e. it is
not necessarily a single plane wave but a superposition of plane waves with wave vectors with
different directions). Let V be a bounded volume with closed boundary A. The time averaged
flux of electromagnetic energy through the boundary A outwards from the volume is given by
the surface integral

\[ F = \iint_A \mathbf{S}(\mathbf{r}) \cdot \hat{\mathbf{n}}\, dA, \tag{C.1} \]

where n̂ is the outwards pointing unit normal. We assume that there are no sources inside V .
There are then two possibilities:

1. F < 0. In this case there is a nonzero net flux into the volume. Because all fields are time
harmonic, there can only be a net influx if electromagnetic energy is absorbed inside the
volume. Hence the imaginary part of the permittivity must be positive. It can be shown
that the time average of the absorbed power is given by

\[ \text{Absorbed e.m. power} = \frac{\omega}{2} \iiint_V \mathrm{Im}(\epsilon)\, |\mathbf{E}(\mathbf{r})|^2\, dV, \tag{C.2} \]

where E(r) is the complex amplitude of the electric field at position r.

2. F = 0. In this case the net energy flow through the boundary is zero and hence the matter
in the volume does not absorb.

Appendix D

Electromagnetic Momentum

We consider a time harmonic electromagnetic field in vacuum or in a dielectric without absorption.
An electromagnetic field not only transports energy but also momentum. The instantaneous
momentum, also called radiation pressure, points in the same direction as the flow of energy and
is given by

\[ \mathbf{P}(\mathbf{r}, t) = \frac{\mathbf{S}(\mathbf{r}, t)}{c}, \tag{D.1} \]

with c being the speed of light in the medium. The time averaged momentum per unit of area
carried by the field is thus

\[ \langle \mathbf{P} \rangle_{av} = \frac{\mathbf{S}(\mathbf{r})}{c}. \tag{D.2} \]

Appendix E

The Fourier Transform

E.1 Definitions
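\[ \mathcal{F}(h)(\xi, \eta) = \iint e^{-2\pi i (x\xi + y\eta)}\, h(x, y)\, dx\, dy. \tag{E.1} \]

\[ \mathcal{F}^{-1}(H)(x, y) = \iint e^{2\pi i (x\xi + y\eta)}\, H(\xi, \eta)\, d\xi\, d\eta. \tag{E.2} \]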

E.2 General Equations

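\[ \mathcal{F}^{-1}\mathcal{F}(h)(x, y) = h(x, y), \tag{E.3} \]

\[ \mathcal{F}(h)(\xi, \eta)^* = \mathcal{F}(h^*)(-\xi, -\eta), \tag{E.4} \]

(z* is the complex conjugate of z).

\[ \iint |h(x, y)|^2\, dx\, dy = \iint |\mathcal{F}(h)(\xi, \eta)|^2\, d\xi\, d\eta \quad \text{(Parseval's formula)}, \tag{E.5} \]

\[ \mathcal{F}(g * h) = \mathcal{F}(g)\, \mathcal{F}(h), \tag{E.6} \]

\[ \mathcal{F}(gh) = \mathcal{F}(g) * \mathcal{F}(h), \tag{E.7} \]

where

\[ (g * h)(x, y) = \iint g(x - x', y - y')\, h(x', y')\, dx'\, dy'. \tag{E.8} \]

If h(x) is a p-periodic function then

\[ \mathcal{F}(h)(\xi) = \sum_{n=-\infty}^{+\infty} \hat{h}(n)\, \delta\left(\xi - \frac{n}{p}\right), \tag{E.9} \]

where

\[ \hat{h}(n) = \frac{1}{p} \int_0^p h(x)\, e^{-2\pi i n x / p}\, dx. \tag{E.10} \]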

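E.3 Special Equations

\[
\mathcal{F}\!\left[ 1_{[-a,a]}(x)\, 1_{[-b,b]}(y) \right](\xi, \eta) = 4ab\, \mathrm{sinc}(2a\xi)\, \mathrm{sinc}(2b\eta), \tag{E.11}
\]
where
\[
\mathrm{sinc}(x) = \frac{\sin(\pi x)}{\pi x}. \tag{E.12}
\]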
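
Equation (E.11) factorises into a product of two one-dimensional transforms, each of the form 2a sinc(2aξ). The sketch below checks this one-dimensional factor by numerical integration. The half-width a = 0.5 and the sample frequencies are assumed values, and the helper names are hypothetical. Note that `np.sinc` uses the same normalised definition as Eq. (E.12).

```python
import numpy as np


def rect_ft_numeric(xi, a, n=20001):
    """Trapezoidal-rule estimate of the integral of exp(-2*pi*i*x*xi) over [-a, a]."""
    x = np.linspace(-a, a, n)
    f = np.exp(-2j * np.pi * x * xi)
    dx = x[1] - x[0]
    return dx * (f.sum() - 0.5 * (f[0] + f[-1]))


def rect_ft_analytic(xi, a):
    """One-dimensional factor of Eq. (E.11): 2a sinc(2a xi)."""
    return 2 * a * np.sinc(2 * a * xi)  # np.sinc(x) = sin(pi x) / (pi x)


a = 0.5  # assumed half-width of the rectangle
for xi in (0.0, 0.3, 1.0, 2.7):
    print(xi, rect_ft_numeric(xi, a).real, rect_ft_analytic(xi, a))
```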


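\[
\mathcal{F}\left[ \delta(x/a)\, \delta(y/b) \right] = |ab|. \tag{E.13}
\]
\[
\mathcal{F}[1](\xi, \eta) = \delta(\xi)\, \delta(\eta). \tag{E.14}
\]
\[
\mathcal{F}\left[ e^{-\pi (a^2 x^2 + b^2 y^2)} \right](\xi, \eta) = \frac{1}{|ab|}\, e^{-\pi (\xi^2/a^2 + \eta^2/b^2)}. \tag{E.15}
\]
Let
\[
1_a(x, y) = \begin{cases} 1, & \text{if } \sqrt{x^2 + y^2} \le a, \\ 0, & \text{if } \sqrt{x^2 + y^2} > a. \end{cases} \tag{E.16}
\]
Then
\[
\mathcal{F}(1_a)(\xi, \eta) = a\, \frac{J_1\!\left( 2\pi a \sqrt{\xi^2 + \eta^2} \right)}{\sqrt{\xi^2 + \eta^2}}. \tag{E.17}
\]
\[
\mathcal{F}\left[ e^{i\pi (a^2 x^2 + b^2 y^2)} \right](\xi, \eta) = \frac{i}{|ab|}\, e^{-i\pi (\xi^2/a^2 + \eta^2/b^2)}. \tag{E.18}
\]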

Appendix F

Basis transformations

In this section, we discuss the relevance of basis transformations and how to apply them. So
what are basis transformations essentially? It comes down to the following: if we have some
physical object Ψ, we can describe it with a vector (which can in principle be a continuous
function). The form of the vector with which we represent Ψ depends on the basis that we
choose. For example, we could represent a position vector R in Cartesian coordinates (x, y, z),
or in spherical coordinates (ρ, φ, θ), or in cylindrical coordinates (r, φ, z). It is important to note
that the physical object remains unchanged; it is only the coefficients with which the object
is represented that change. The formulas that describe how the coefficients for one basis
transform to the coefficients in the other basis constitute the basis transformation. In case
these formulas can be described as a matrix operation, we have a linear basis transformation.
You have encountered this concept in Linear Algebra courses.

Basis transforms are ubiquitous, so it is important to be familiar with them also outside the
context of Optics. For example, if you have some signal Ψ, you can either express it in the time
domain or in the frequency domain. These are two different representations of the same
physical object, and the basis transformation that relates the two is the Fourier transform. In
the discrete case it would read
\[
X_k = \sum_{n=0}^{N-1} x_n\, e^{-2\pi i k n / N}, \tag{F.1}
\]
where $x_n$ are the coefficients representing the signal in the time domain, and $X_k$ are the coefficients representing the signal in the frequency domain. Note that this basis transformation can
be described as a matrix operation
\[
\begin{pmatrix} X_0 \\ X_1 \\ X_2 \\ \vdots \end{pmatrix}
=
\begin{pmatrix}
1 & 1 & 1 & \dots \\
1 & e^{-2\pi i/N} & e^{-4\pi i/N} & \dots \\
1 & e^{-4\pi i/N} & e^{-8\pi i/N} & \dots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix}
\begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \end{pmatrix}, \tag{F.2}
\]
so the Fourier transform is a linear basis transformation. The benefit of applying such a basis
transformation is clear: different bases make different information apparent.
In the time domain one can see how the signal progresses in time, but it is
difficult to identify different frequency components, whereas in the frequency domain it is very
easy to see how much each frequency contributes to the signal, but it is difficult to see how the
signal changes in time. Also, sometimes it is more efficient to describe a signal in one basis than
in the other. For example, if the signal is a sine wave in the time domain, it takes infinitely many
nonzero coefficients (each coefficient being a point in time) to describe it in the time domain,
while it takes only two nonzero coefficients to describe it in the frequency domain. We say that
a signal can be sparse in a certain basis (sparse meaning that it can be represented with few nonzero
coefficients). This sparsity can help in compressing data, or it can be used as a constraint
in reconstruction algorithms (this field is known as compressed sensing).
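
The matrix form (F.2) of the discrete Fourier transform and the sparsity of a sine wave in the frequency domain can both be illustrated with a short sketch. The signal length N = 16 and the choice of a sine with exactly 3 periods in the sampling window are assumptions made for the example, and `dft_matrix` is a hypothetical helper name.

```python
import numpy as np


def dft_matrix(N):
    """Matrix of Eq. (F.2): entry (k, n) equals exp(-2*pi*i*k*n/N)."""
    n = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(n, n) / N)


N = 16                                # assumed signal length
n = np.arange(N)
x = np.sin(2 * np.pi * 3 * n / N)     # sine with 3 periods in the window (assumed)

X = dft_matrix(N) @ x                 # basis transformation as a matrix product
matches_fft = np.allclose(X, np.fft.fft(x))

# Indices of the frequency coefficients that are not (numerically) zero
nonzero = np.flatnonzero(np.abs(X) > 1e-9)
print(matches_fft, nonzero)
```

Only two frequency coefficients survive (one for the positive and one for the negative frequency), whereas almost all time-domain samples are nonzero.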

A similar observation holds for the different representations of a quantum state. One can represent
a quantum state |ψ⟩ in the position basis (i.e. in terms of the eigenvectors of the position
operator x̂), or in the momentum basis (i.e. in terms of the eigenvectors of the momentum operator
p̂). Again, the physical object remains unchanged, but by representing it in different bases,
different parts of information become more apparent. In the position basis it becomes easier
to see where a particle may be located, while in the momentum basis it is easier to see what
momentum it may have. The basis transformation that relates the position representation to
the momentum representation is the Fourier transform. One can also represent a quantum state
|ψ⟩ in the energy basis (i.e. in terms of the eigenvectors of the energy operator Ĥ, also called
the Hamiltonian), in which case it is easier to see what energy a particle may have, and which
makes it easier to calculate the time-evolution of the wave function (because the time evolution
is determined by the Schrödinger equation, which is a differential equation involving Ĥ).

So we have seen that basis transformations can help in making certain properties of a vector
become more apparent, or make its description simpler (i.e. more sparse). Another advantage
that a basis transformation can have is that applying operators can be easier in a certain
basis. In particular, applying a linear operator A to some vector Ψ is much easier if Ψ is
expressed in the eigenbasis of A. Suppose we can write
\[
\Psi = \sum_k a_k v_k, \tag{F.3}
\]
where $v_k$ are eigenvectors of A with eigenvalues $\lambda_k$; then applying A to Ψ will give
\[
A\Psi = \sum_k \lambda_k a_k v_k. \tag{F.4}
\]

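In matrix notation, this is written as
\[
A\Psi = \begin{pmatrix} \lambda_1 a_1 \\ \lambda_2 a_2 \\ \vdots \end{pmatrix}
= \begin{pmatrix} \lambda_1 & 0 & \dots \\ 0 & \lambda_2 & \dots \\ \vdots & \vdots & \ddots \end{pmatrix}
\begin{pmatrix} a_1 \\ a_2 \\ \vdots \end{pmatrix}. \tag{F.5}
\]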

Thus, we see that if Ψ is represented in terms of eigenvectors of the linear operator
A (i.e. in the eigenbasis of A), then the matrix representation of A is a diagonal
matrix, and on its diagonal are its eigenvalues.

So we have seen how one may benefit from expressing Ψ in terms of eigenvectors of A. But
if Ψ is given in some arbitrary basis, how do we find the coefficients that represent it in the
eigenbasis of A? To do this, let us consider a simple example. Suppose Ψ has the following
representation in the x̂, ŷ basis
Ψ = 4x̂ + 7ŷ. (F.6)
Or in vector notation
\[
\Psi_{xy} = \begin{pmatrix} 4 \\ 7 \end{pmatrix}. \tag{F.7}
\]
Keep in mind that this is not the vector corresponding to Ψ. Rather, it is a representation of
Ψ which holds in the x̂, ŷ basis (i.e. it should be understood that the first entry in the vector
is the coefficient corresponding to x̂, and the second entry is the coefficient corresponding to ŷ).
Now, let us suppose that the linear operator A has eigenvectors

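\[
\begin{aligned}
v_1 &= 1\hat{x} + 3\hat{y}, \\
v_2 &= 2\hat{x} + 1\hat{y},
\end{aligned} \tag{F.8}
\]
or in vector notation (in the x̂, ŷ basis)
\[
v_1 = \begin{pmatrix} 1 \\ 3 \end{pmatrix}, \quad v_2 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}. \tag{F.9}
\]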

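Suppose we want to write Ψ in the $v_1, v_2$ basis. We need to find $\Psi[v_1], \Psi[v_2]$ such that
\[
\Psi = \Psi[v_1]\, v_1 + \Psi[v_2]\, v_2. \tag{F.10}
\]
Obviously
\[
\Psi = 2 v_1 + 1 v_2, \tag{F.11}
\]
because in the x̂, ŷ basis this gives
\[
\begin{pmatrix} 4 \\ 7 \end{pmatrix} = 2 \begin{pmatrix} 1 \\ 3 \end{pmatrix} + \begin{pmatrix} 2 \\ 1 \end{pmatrix}. \tag{F.12}
\]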

Thus in the $v_1, v_2$ basis the vector representation of Ψ would read
\[
\Psi_A = \begin{pmatrix} 2 \\ 1 \end{pmatrix}. \tag{F.13}
\]

Once again, let us emphasize that although Ψ is represented with different numbers, the object
itself hasn’t changed.

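Let us put our previous calculations in more general terms. We know representations of Ψ,
$v_1$, $v_2$ in the x̂, ŷ basis
\[
\Psi = \begin{pmatrix} \Psi[x] \\ \Psi[y] \end{pmatrix}, \quad
v_1 = \begin{pmatrix} v_1[x] \\ v_1[y] \end{pmatrix}, \quad
v_2 = \begin{pmatrix} v_2[x] \\ v_2[y] \end{pmatrix}, \tag{F.14}
\]
and we want to find $\Psi[v_1], \Psi[v_2]$ such that
\[
\begin{aligned}
\begin{pmatrix} \Psi[x] \\ \Psi[y] \end{pmatrix}
&= \Psi[v_1] \begin{pmatrix} v_1[x] \\ v_1[y] \end{pmatrix} + \Psi[v_2] \begin{pmatrix} v_2[x] \\ v_2[y] \end{pmatrix} \\
&= \begin{pmatrix} v_1[x] & v_2[x] \\ v_1[y] & v_2[y] \end{pmatrix} \begin{pmatrix} \Psi[v_1] \\ \Psi[v_2] \end{pmatrix}.
\end{aligned} \tag{F.15}
\]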

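Here, $\Psi[x], \Psi[y]$ represent Ψ in the x̂, ŷ basis, and $\Psi[v_1], \Psi[v_2]$ represent Ψ in the $v_1, v_2$ basis.
Thus, defining the matrix
\[
B = \begin{pmatrix} v_1[x] & v_2[x] \\ v_1[y] & v_2[y] \end{pmatrix}, \tag{F.16}
\]
to go from the $v_1, v_2$ representation of Ψ to the x̂, ŷ representation of Ψ, we must calculate
\[
\begin{pmatrix} \Psi[x] \\ \Psi[y] \end{pmatrix} = B \begin{pmatrix} \Psi[v_1] \\ \Psi[v_2] \end{pmatrix}. \tag{F.17}
\]
Conversely, to go from the x̂, ŷ representation to the $v_1, v_2$ representation, we must calculate
\[
\begin{pmatrix} \Psi[v_1] \\ \Psi[v_2] \end{pmatrix} = B^{-1} \begin{pmatrix} \Psi[x] \\ \Psi[y] \end{pmatrix}. \tag{F.18}
\]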
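
The two transformations (F.17) and (F.18) can be tried out on the numbers of the worked example, Eqs. (F.6) to (F.13). The sketch below is one possible way to do this with NumPy. Solving the linear system with `np.linalg.solve` is a design choice of this sketch (it avoids forming the inverse of B explicitly) and gives the same result as multiplying by the inverse.

```python
import numpy as np

# Columns of B are v1 and v2 expressed in the x, y basis, Eqs. (F.9) and (F.16)
B = np.array([[1.0, 2.0],
              [3.0, 1.0]])

psi_xy = np.array([4.0, 7.0])          # representation of Psi in the x, y basis, Eq. (F.7)

# Eq. (F.18): from the x, y representation to the v1, v2 representation
psi_v = np.linalg.solve(B, psi_xy)

# Eq. (F.17): and back again
psi_back = B @ psi_v

print(psi_v, psi_back)
```

The coefficients found this way agree with Eq. (F.13), and transforming back recovers Eq. (F.7).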


So now we know how to go from one basis representation to another and back. We have seen
previously that it can be convenient to go to the eigenbasis of a linear operator A, because in
that representation A is diagonal. Thus we can diagonalize A as
\[
A = B \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} B^{-1}. \tag{F.19}
\]

To summarize, with $B^{-1}$ we go from some x̂, ŷ basis to the eigenbasis of A. The columns of
B contain the eigenvectors of A in the x̂, ŷ basis. Then we apply the operator A, which in its
eigenbasis is a diagonal matrix with its eigenvalues along the diagonal. Then, to go back from
the eigenbasis to the x̂, ŷ basis, we apply B.
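
The three steps of this summary can be sketched numerically. The eigenvectors are those of Eq. (F.9), but the notes do not specify the eigenvalues of A, so the values λ1 = 2 and λ2 = −1 used below are assumptions made purely for illustration.

```python
import numpy as np

B = np.array([[1.0, 2.0],
              [3.0, 1.0]])               # columns: eigenvectors v1, v2 from Eq. (F.9)
eigenvalues = np.array([2.0, -1.0])      # assumed values, not given in the notes

# Operator A in the x, y basis, built according to Eq. (F.19)
A = B @ np.diag(eigenvalues) @ np.linalg.inv(B)

psi_xy = np.array([4.0, 7.0])            # Psi in the x, y basis, Eq. (F.7)

coeffs = np.linalg.solve(B, psi_xy)      # step 1: go to the eigenbasis (apply B^-1)
coeffs = eigenvalues * coeffs            # step 2: A is diagonal there, Eq. (F.4)
result = B @ coeffs                      # step 3: go back to the x, y basis (apply B)

direct = A @ psi_xy                      # applying A directly, for comparison
print(result, direct)
```

Both routes give the same vector, as Eq. (F.19) requires.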

In particular, this can be useful when one has to apply A many times. Because in that case
\[
A^N = B \begin{pmatrix} \lambda_1^N & 0 \\ 0 & \lambda_2^N \end{pmatrix} B^{-1}. \tag{F.20}
\]
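
Equation (F.20) can be compared against repeated matrix multiplication. In the sketch below the eigenvectors are taken from Eq. (F.9), while the eigenvalues (2 and −1) and the power N = 5 are assumed values chosen only for the example.

```python
import numpy as np

B = np.array([[1.0, 2.0],
              [3.0, 1.0]])               # columns: eigenvectors from Eq. (F.9)
lam = np.array([2.0, -1.0])              # assumed eigenvalues, for illustration only
B_inv = np.linalg.inv(B)

A = B @ np.diag(lam) @ B_inv             # Eq. (F.19)

N = 5                                    # assumed number of applications of A
A_pow_eig = B @ np.diag(lam ** N) @ B_inv        # Eq. (F.20)
A_pow_direct = np.linalg.matrix_power(A, N)      # repeated multiplication

print(np.allclose(A_pow_eig, A_pow_direct))
```

For large N the eigenvalue route needs only scalar powers, which is the practical advantage referred to in the text.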

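Or, it could be useful when we want to exponentiate A
\[
\begin{aligned}
e^{A} &= \sum_{k=0}^{\infty} \frac{A^k}{k!} \\
&= B \left( \sum_{k=0}^{\infty} \frac{1}{k!} \begin{pmatrix} \lambda_1^k & 0 \\ 0 & \lambda_2^k \end{pmatrix} \right) B^{-1} \\
&= B \begin{pmatrix} e^{\lambda_1} & 0 \\ 0 & e^{\lambda_2} \end{pmatrix} B^{-1}.
\end{aligned} \tag{F.21}
\]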
This is for example used in the solution of the Schrödinger equation
\[
\hat{H}\psi = i\hbar \frac{d\psi}{dt}
\quad\Rightarrow\quad
\psi(t) = e^{-i\hat{H}t/\hbar}\, \psi(0), \tag{F.22}
\]
which indicates why it's useful to describe $\psi(0)$ in the energy basis if we want to find its time
evolution.
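
As a sketch of how Eqs. (F.21) and (F.22) work together, the example below evolves a two-level system by going to the energy basis, multiplying by the phase factors, and going back. The Hamiltonian (a simple coupling between two states), the units with ħ = 1, the time t = 0.7 and the function name `evolve` are all assumptions made for illustration. Because a Hamiltonian is Hermitian, `np.linalg.eigh` returns a unitary eigenvector matrix, so its inverse is simply its conjugate transpose.

```python
import numpy as np


def evolve(H, psi0, t, hbar=1.0):
    """Return exp(-i H t / hbar) psi0 using the eigenbasis of a Hermitian H."""
    E, V = np.linalg.eigh(H)                      # H = V diag(E) V^dagger
    coeffs = V.conj().T @ psi0                    # go to the energy basis
    coeffs = np.exp(-1j * E * t / hbar) * coeffs  # multiply by the eigenvalues of exp(-iHt/hbar)
    return V @ coeffs                             # go back to the original basis


omega = 1.0                                       # assumed coupling strength
H = omega * np.array([[0.0, 1.0],
                      [1.0, 0.0]])                # assumed two-level Hamiltonian
psi0 = np.array([1.0, 0.0], dtype=complex)        # system starts in the first state
t = 0.7                                           # assumed evolution time

psi_t = evolve(H, psi0, t)

# Closed-form result for this particular Hamiltonian, for comparison
expected = np.array([np.cos(omega * t), -1j * np.sin(omega * t)])
print(psi_t, expected)
```

The norm of the state is conserved, as expected for a unitary time evolution.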

So how can we apply these basis transformations and eigenvalue decompositions in Optics? Well,
suppose we know the transmission axis of a linear polarizer. Let's say it's $\begin{pmatrix} a \\ b \end{pmatrix}$. Then we know
all light polarized in that direction will be transmitted completely, so $\begin{pmatrix} a \\ b \end{pmatrix}$ is an eigenvector of
the polarizer operator with eigenvalue 1. We know that all light polarized in the direction $\begin{pmatrix} b \\ -a \end{pmatrix}$
(i.e. perpendicular to the transmission axis) will be completely blocked, so this is an eigenvector
with eigenvalue 0. Thus, given the transmission axis of a linear polarizer, we can immediately
write down its Jones matrix
\[
J = \begin{pmatrix} a & b \\ b & -a \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
\begin{pmatrix} a & b \\ b & -a \end{pmatrix}^{-1}. \tag{F.23}
\]
Conversely, from the eigenvalue decomposition of a Jones matrix we can immediately see what
its principal axes are, and how it acts on the components along those axes (i.e. whether it's a
linear polarizer, half-wave plate, quarter-wave plate, or something else).
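
Equation (F.23) can be evaluated directly. The sketch below builds the Jones matrix of a linear polarizer from its eigenvectors and eigenvalues and checks that it behaves as described. The transmission axis at 30° to the x-axis is an assumed example value, and `polarizer_jones` is a hypothetical helper name.

```python
import numpy as np


def polarizer_jones(a, b):
    """Jones matrix of a linear polarizer with transmission axis (a, b), Eq. (F.23)."""
    B = np.array([[a, b],
                  [b, -a]], dtype=float)   # columns: transmitted and blocked directions
    D = np.diag([1.0, 0.0])                # eigenvalues: 1 (transmitted), 0 (blocked)
    return B @ D @ np.linalg.inv(B)


theta = np.deg2rad(30.0)                   # assumed angle of the transmission axis
a, b = np.cos(theta), np.sin(theta)
J = polarizer_jones(a, b)

transmitted = J @ np.array([a, b])         # light polarized along the axis
blocked = J @ np.array([b, -a])            # light polarized perpendicular to the axis
print(J)
print(transmitted, blocked)
```

For a unit-length axis the result reduces to the familiar form with entries a², ab, ab, b², and applying the polarizer twice has the same effect as applying it once.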

Also, basis transformations can be used to describe optical activity. In optically active media,
there are different refractive indices for left-circularly and right-circularly polarized light, so it is
more convenient to represent the Jones vector in the basis of left-circularly and right-circularly
polarized light, rather than in the basis of two linear orthogonal polarizations.

It is also interesting to note the equivalence between the Jones vector and the quantum states of
photons that are used as qubits: the polarization of a photon is a two-state quantum-mechanical
system. This qubit can be represented as

\[
|\psi\rangle = \alpha\, |0\rangle + \beta\, |1\rangle, \tag{F.24}
\]

where α and β are analogous to the entries of the Jones vector. Indeed, in experiments on
quantum information with photons as qubits, wave plates are ubiquitous¹. Also in quantum
information, it is important to be familiar with basis transformations.

Another instance in Optics where we use a basis transformation (or more specifically: an eigenvalue
decomposition) in order to apply an operation more easily is in the Angular Spectrum
Method. This method is used to propagate a field $U_0$, and it is explained in the chapter on
Diffraction Optics. The operation we want to apply in this case is the propagation operator
$P_{\Delta z}$, which denotes the propagation over a distance $\Delta z$. To do this, we decompose the field $U_0$ in
eigenfunctions of $P_{\Delta z}$, which are plane waves $e^{i\mathbf{k}\cdot\mathbf{r}}$, because
\[
\begin{aligned}
P_{\Delta z}\, e^{i(k_x x + k_y y + k_z z)} &= e^{i(k_x x + k_y y + k_z (z + \Delta z))} \\
&= e^{i k_z \Delta z}\, e^{i(k_x x + k_y y + k_z z)}.
\end{aligned} \tag{F.25}
\]

So indeed, $e^{i\mathbf{k}\cdot\mathbf{r}}$ is an eigenfunction of $P_{\Delta z}$, with eigenvalue $e^{i k_z \Delta z}$. The basis transformation
we need to apply in order to decompose $U_0$ into eigenfunctions of $P_{\Delta z}$ is the Fourier transform.
So, as prescribed in Eq. (F.19), to apply the propagation operator we Fourier transform $U_0$ to
decompose it into eigenfunctions of $P_{\Delta z}$, we multiply each component with the eigenvalue $e^{i k_z \Delta z}$,
and then we inverse Fourier transform to go back to the original basis
\[
P_{\Delta z} U_0 = \mathcal{F}^{-1}\{ \mathcal{F}\{U_0\}\, e^{i k_z \Delta z} \}. \tag{F.26}
\]
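
A minimal numerical sketch of Eq. (F.26) is given below, using NumPy's FFT. It assumes a monochromatic scalar field sampled on a square grid with equal pixel size in x and y, a homogeneous medium with refractive index 1, and $k_z = \sqrt{k^2 - k_x^2 - k_y^2}$ (which becomes imaginary for evanescent components, so that these decay). The function name, the grid parameters and the test field are hypothetical choices for illustration, not the implementation used in the course.

```python
import numpy as np


def angular_spectrum_propagate(U0, wavelength, dx, dz):
    """Propagate the sampled field U0 over a distance dz following Eq. (F.26)."""
    ny, nx = U0.shape
    k = 2 * np.pi / wavelength
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=dx)    # angular spatial frequencies
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=dx)
    KX, KY = np.meshgrid(kx, ky)
    # Complex square root: evanescent waves get an imaginary kz and decay for dz > 0
    kz = np.sqrt((k ** 2 - KX ** 2 - KY ** 2).astype(complex))
    return np.fft.ifft2(np.fft.fft2(U0) * np.exp(1j * kz * dz))


# Check on a plane wave, which should only pick up the phase factor exp(i kz dz)
wavelength, dx, n, dz = 0.5, 0.1, 64, 2.0        # assumed values (e.g. in micrometres)
x = np.arange(n) * dx
X, Y = np.meshgrid(x, x)
kx0 = 2 * np.pi * 4 / (n * dx)                   # 4 periods fit in the window (periodic grid)
kz0 = np.sqrt((2 * np.pi / wavelength) ** 2 - kx0 ** 2)

U0 = np.exp(1j * kx0 * X)
U1 = angular_spectrum_propagate(U0, wavelength, dx, dz)
print(np.allclose(U1, np.exp(1j * kz0 * dz) * U0))
```

The plane-wave check is exactly the eigenfunction property of Eq. (F.25): the field keeps its shape and is multiplied by the eigenvalue.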

In this framework, it can be easily understood how this method should be altered for propagation
in non-homogeneous media. In that case the plane waves $e^{i\mathbf{k}\cdot\mathbf{r}}$ are no longer eigenfunctions of the
propagation operator, and instead we must find the appropriate eigenfunctions and eigenvalues
for propagation through such a medium.

For other explanations of basis transformations, one could go to Khan Academy - Alternate
coordinate systems (bases), and Khan Academy - Showing that an eigenbasis makes for good
coordinate systems.

¹ See e.g. Experimental Demonstration of Blind Quantum Computing, S. Barz et al. (2011).

