CHAPTER 1

The Phenomenon of Light


The origin of the special theory of relativity lies in a dilemma concerned with the
nature and velocity of light. Appreciation of this dilemma adds purpose and meaning
to relativity, and it is for this reason that the present chapter is concerned with light
and its properties. The first two sections trace the evolution of thought with respect
to whether light is corpuscular or wavelike, and whether its velocity is finite or infinite;
present-day views of these properties culminate both developments. Light and sound
(the latter being representative of wave phenomena requiring a tangible medium)
are compared in the third section and their essential similarities and differences are
highlighted; the resulting contrast prepares the way for the introduction, in Chapter 2,
of the aforementioned dilemma.

More than usual space is given in this chapter to the historical aspects of the subject.
An explanation of the decision to do this may be found in the Preface. The reader wishing to concentrate his efforts on the technical development may prefer to limit his
attention to the Bradley aberration experiment in Section 1.2 and the comparison of
light and sound in Section 1.3.

1.1* HISTORICAL SURVEY: THE NATURE OF LIGHT

Speculation about the nature of light can be traced back to antiquity. The Sicilian
Empedocles (c. 490-c. 435 B.C.) was credited with the view¹ that light consists of small
particles emitted from a visible body. These particles were presumed to enter the eyes
and were then returned to the visible body (a conservation law!) with the resulting
streams of particles being responsible for the sensations of shape and color. Unfortunately, only fragments of the writings of this extraordinary man have survived, and
the direct evidence of his view is merely suggestive, being contained in the lyrical
passage²
As when a man, about to sally forth,
Prepares a light and kindles him a blaze
Of flaming fire against the wintry night,

* Throughout this book the content of sections marked with an asterisk is primarily historical. The
reading of these sections can be omitted without materially affecting the technical exposition.
1 Plato, Meno. (See, e.g., the W. R. M. Lamb translation, Vol. 165 of the Loeb Classical Library, p.
285, Harvard University Press, 1962.)
2 W. E. Leonard, The Fragments of Empedocles, pp. 42-43, The Open Court Publishing Company,
Chicago, 1908.

In horny lantern shielding from all winds:
Though it protect from breath of blowing winds,
Its beam darts outward, as more fine and thin,
And with untiring rays lights up the sky:
Just so the Fire primeval once lay hid
In the round pupil of the eye, enclosed
In films and gauzy veils, which through and through
Were pierced with pores divinely fashioned,
And thus kept off the watery deeps around,
Whilst Fire burst outward, as more fine and thin.

Empedocles was a close observer of nature, the apparent originator of the longstanding and influential notion that all things are composed of the four elements: air,
fire, water, and earth. He was a poet of stature whose wide-ranging opinions exerted a
strong influence on later Greek scholars. Aristotle (384-322 B.C.) quotes him frequently, often contentiously, and in De Sensu says³
Empedocles at times seems to hold that vision is to be explained as above-stated, by light
issuing forth from the eye; e.g., in the following passage: [The 13 lines given above are then
quoted.] Sometimes he accounts for vision thus, but at other times he explains it by emanations from the visible objects.

Aristotle states his own opinion about the nature of light in De Anima:⁴
Now there clearly is something which is transparent, and by "transparent" I mean what
is visible, and yet not visible in itself, but rather owing its visibility to the color of something
else; of this character are air, water, and many solid bodies. Neither air nor water is transparent because it is air or water; they are transparent because each of them has contained
in it a certain substance which is the same in both and is also found in the eternal body
which constitutes the uppermost shell of the physical cosmos. Of this substance light is the
activity-the activity of what is transparent so far forth as it has in it the determinate
power of becoming transparent; where this power is present, there is also the potentiality
of the contrary, viz. darkness. Light is as it were the proper color of what is transparent,
and exists whenever the potentially transparent is excited to actuality by the influence of
fire or something resembling "the uppermost body"; for fire too contains something which is
one and the same with the substance in question.
We have now explained what the transparent is and what light is; light is neither fire
nor any kind whatsoever of body nor an efflux from any kind of body (if it were, it would
again itself be a kind of body)-it is the presence of fire or something resembling fire in
what is transparent. It is certainly not a body, for two bodies cannot be present in the same
place. The opposite of light is darkness; darkness is the absence from what is transparent
of the corresponding positive state above characterized; clearly therefore, light is just the
presence of that.

Aristotle's influence was greater with later cultures than with his own, and thus one
finds most ancient Greek scholars preferring to accept a simpler view similar to that of
Empedocles; for example, both Euclid and Ptolemy held the opinion that light consists
of rays which originate in the eye, illuminate the object seen, and then return to the eye.

3 Aristotle, De Sensu, 437b, 23, English translation under editorship of W. D. Ross, Oxford at the
Clarendon Press, 1931.
4 Aristotle, De Anima, 418b, 4, English translation under editorship of W. D. Ross, Oxford at the
Clarendon Press, 1931.

In contrast to the richness of Greek speculation about light, Roman scholars do not
appear to have been interested in this problem. Indeed, all of Roman science was
essentially derivative in character and of distinctly low order, contributing little that was
original, and nothing worthy of note in the present survey. Arabic science, on the other
hand, while also being derivative, was of a rather high order, being based on the finest
products of Greek scientific achievement. The successors of Mohammed evinced a
great interest in the ideas of the western people whom they conquered, and far from
being the destroyers of Western literature, they were its chief preservers. The Arabs
came into contact with the Greeks in Egypt as well as western Asia, and became their
virtual successors in carrying forward the torch of learning. Although inclined to be
conservative and traditional, thus accepting most Greek ideas as authoritative, the
Arabian scholars did make several independent discoveries of significance. An important example is the Arabic numbering system in use today, which evolved during this
period.
In the specific field of light, many accomplishments can be credited to Ibn al-Haitham
(c. 965-c. 1039), known to the Western world by the Latin name Alhazen. He was the
true physicist of medieval Islam, just as Archimedes had been in the Grecian period,
for he combined with rare skill both the experimental investigation of natural phenomena and the analysis of results by mathematics.⁵ Alhazen was one of the ablest
students of optics of all times and published a seven-volume treatise on this subject
which had great celebrity throughout the medieval period and strongly influenced
Western thought, notably that of Roger Bacon and Kepler.⁶ This treatise discussed
concave and convex mirrors in both cylindrical and spherical geometries, anticipated
Fermat's law of least time, and considered refraction and the magnifying power of
lenses. It contained a remarkably lucid description of the optical system of the eye,
which study led Alhazen to the belief that light consists of rays which originate in the
object seen, and not in the eye, a view contrary to that of Euclid and Ptolemy.
Ibn Sina, or Avicenna (980-1037), the most famous of the Islamic scientists, whose
immense medical encyclopedia, the Quanun, made him the greatest name in medicine for four centuries, was also a perceptive student of various physical questions
-motion, contact, force, vacuum, infinity, light, and heat. He shared Alhazen's
view that light originated in the luminous source and felt that it must consist of some
type of particles.⁷

5 H. J. J. Winter, Eastern Science, John Murray Publishers, Ltd., London, 1952.
6 G. Sarton, Introduction to the History of Science, Vol. 1, p. 721, Williams and Wilkins Company,
Baltimore, Md., 1927.
7 Ibid., Vol. 1, p. 710.

Roger Bacon (1214-1294), a learned scholar who stressed the value of reading works
in their original languages, was well-versed in the teaching of Aristotle, St. Augustine,
and the Muslim scientists Alhazen and Avicenna. During a sojourn in Paris, he so
impressed the future Clement IV that the latter, upon elevation to the Papacy in 1265,
requested Bacon to transmit copies of all his writings without delay. Up to that time,
Bacon had written but little; however, in the span of one year, he composed the Opus
Majus, the Opus Minus, and the Opus Tertium, a stupendous undertaking, the fruits
of which exerted a great influence on Western thought for centuries. In his masterpiece,
the Opus Majus, Bacon appears to endow Alhazen and Avicenna with an ambivalent
position by saying⁸
If, moreover, Alhazen and Avicenna, in the third book on the Soul . . . are cited as
opposed to this view, I reply that they are not opposed to the generation of the species of
vision, nor to the part it plays in producing sight; but they are opposed to those who have
maintained that some material substance as a visible or similar species is extended from the
sight to the object, in order that vision may perceive the object itself, and that it may
seize upon the species of the object seen and carry it back to the sight.

Bacon's own view coincided with the opinion of many of the ancients, that light
consists of emanations which originate in the eye, and he defends this view in the
passage⁹
The reason for this position is that everything in nature completes its action through its
own force and species alone, as, for example, the sun and the other celestial bodies through
their forces sent to the things of the world cause the generation and corruption of things;
and in a similar manner inferior things, as, for example, fire by its own force dries and consumes and does many things. Therefore vision must perform the act of seeing by its own
force. But the act of seeing is the perception of a visible object at a distance, and therefore
vision perceives what is visible by its own force multiplied to the object . . . it is clear to
him who gives it due consideration that vision must take place by means of its species
emitted to the visible object.

As for the species of light itself, Bacon says, in an explanation which has the interesting
tinge of wave motion, that¹⁰
. . . the species is not a body, nor is it changed as regards itself as a whole from one place
to another, but that which is produced in the first part of the air is not separated from that
part, since form cannot be separated from the matter in which it is, unless it be soul, but
the species forms a likeness to itself in the second position of the air, and so on. Therefore
it is not a motion as regards place, but is a propagation multiplied through the different
parts of the medium; nor is it a body which is there generated, but a corporeal form, without, however, dimensions per se, but it is produced subject to the dimensions of the air . . . .

8 R. Bacon, Opus Majus, Part 5, 7th Distinction, Chap. 3, the R. B. Burke translation, University of
Pennsylvania Press, Philadelphia, 1928.
9 Ibid., 7th Distinction, Chap. 4; Dth Distinction, Chap. 1.
10 Ibid., 9th Distinction, Chap. 4.

The passage of three centuries marks the interval between the death of Roger Bacon
and the birth of René Descartes (1596-1650), whose intellect and creative genius
were to stir scientific imagination, and whose prolific pen was to prove even more
influential than Bacon's. Descartes lived at a time in which the world was ripe for a
new conception of the nature of things. Major changes in attitude about man's surroundings were being culminated; Galileo and Kepler were advocating the overthrow
of the geocentric hypothesis of Ptolemy, the Magellan expedition had circumnavigated
the globe, the invention of the telescope was leading to expanded knowledge of the
skies, and Aristotelian scholasticism was under attack at all its weakest points. Montaigne's skepticism had paved the way for a break with tradition, and Descartes set
for himself the task of erecting a new structure to replace the old. In the words of
Whittaker,¹¹ "His aim was nothing less than to create from the beginning a theory
of the universe, worked out as far as possible in every detail."
To understand Descartes' position on the particular subject of the nature of light,
one must first appreciate the major features of his grand design of the universe and the
attitudes which shaped this design. His philosophy was essentially dualistic; he believed
the physical world to be mechanistic and divorced from the mind, the only connection
between the two being through God's intervention. In science, he supported the inductive method of Francis Bacon, but with emphasis on rationalization and logic, rather
than upon experiences. Mathematics was Descartes' greatest interest and he is widely
called the father of analytic geometry. Under Kepler's influence, he became convinced
that the precision and universality of mathematics set it apart from all other fields of
study. This admiration of the clarity of mathematical expression serves to explain
why, as the first rule in the Discourse on Method, Descartes vowed
never to accept anything as true if I had not evident knowledge of its being so; that is, to
accept only what presented itself to my mind so clearly and distinctly that I had no occasion
to doubt it.

This attitude led Descartes to the decision that, since effects produced by means of
contacts and collisions were the simplest and most comprehensible phenomena in the
physical world, he would accept no other causes. Such a decision implies that bodies can
act on each other only when they are contiguous, and thus Descartes ruled out action
at a distance. To account for such phenomena as the lunar influence on tides, Descartes
assumed that space is not a void but is a plenum,† being populated by transparent
particles capable of transmitting force. He actually went further than this, postulating
that all matter was in one of three distinct forms, the luminous matter of the sun, the
transparent matter of interplanetary space, and the opaque matter of the earth, giving
as his reason¹²
For, seeing that the sun and the fixed stars emit light, that the heavens transmit it, and
that the earth, the planets, and the comets reflect it, it appears to me that there is ground
for using these three qualities of luminosity, transparency, and opacity to distinguish the
three elements of the visible world.

† Thus did the concept of an ether enter science for the first time. The word is of Greek extraction and
originally meant blue sky.
11 E. Whittaker, A History of the Theories of Aether and Electricity, Vol. 1, p. 4, Thomas Nelson and
Sons, Ltd., London, 1951.
12 R. Descartes, Principes de la Philosophie, 4th ed., Part 3, Sec. 52, Chez Theodore Girard, Paris, 1681.
13 Ibid., Sec. 55-64.

Descartes assumed that the luminous matter of the sun consisted of particles which
were in continuous motion. Since there was no empty space for the particles to move
into, he argued that they took the places vacated by other particles which were also in
motion, and thus developed the notion of closed chains of moving particles. The
motions of these closed chains constituted vortices, an important concept in his explanation of the universe. Thus, according to Descartes' theory,¹³ the sun consists of an
enormous vortex composed of the first or subtlest kind of matter. The luminous particles of this vortex, due to centrifugal action, constantly strain away from their centers
of rotation and thus press against the transparent particles of the ether. The ether
Descartes imagined to consist of a closely packed assemblage of globules, of a size
intermediate between that of the luminous matter of the sun and the opaque matter
of the earth. The pressure of the vortex against these ether particles causes them to
tend to move, thus exerting a pressure on their neighbors, which in turn tend to move,
and in this manner the force exerted by the vortex is passed along through the ether
particles, from layer to layer. In Descartes' view, the transmission of this pressure constitutes light, a thought he summarizes in the passage¹⁴
. . . the force of light . . . does not consist in the duration of some motion but only in the
fact that these small globules (of the ether) are pressed and tend to move toward some new
location, although they do not actually move.

Descartes also provided the first theoretical derivation of the law of refraction, discovered experimentally somewhat earlier (1621) by Willebrord Snell. This derivation is
important because it contains a consequence which later loomed as a decisive factor in
settling the controversy as to the true nature of light. In the Descartes derivation, a
light ray is assumed to be incident on a plane interface between two media at an angle $i$
with respect to the normal, traveling at a velocity $v_i$ in the first medium, and departing
from the interface at a velocity $v_r$ in the second medium, in a direction making an angle
$r$ with respect to the normal. Descartes then assumed that the component of velocity
parallel to the interface was unaffected, obtaining

$$v_i \sin i = v_r \sin r$$

from which Snell's law

$$\frac{\sin i}{\sin r} = \frac{v_r}{v_i} = n$$

follows immediately. However, if the second medium is denser, so that $i > r$, it follows
that $v_r > v_i$. Thus Descartes' derivation leads to the conclusion that light must travel
faster in a denser medium, a conclusion which was later shown to be in contradiction
with experiment.
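
As a numerical illustration of this conclusion, the following minimal Python sketch applies Descartes' assumption that the velocity component parallel to the interface is unchanged and returns the implied ratio of the two speeds. The function name, the 45-degree angle of incidence, and the refractive index of 1.33 (a typical value for water, not taken from the text) are assumptions chosen only for illustration.

```python
import math


def descartes_speed_ratio(incidence_deg: float, refraction_deg: float) -> float:
    """Return v_r / v_i implied by conserving the velocity component
    parallel to the interface: v_i * sin(i) = v_r * sin(r)."""
    i = math.radians(incidence_deg)
    r = math.radians(refraction_deg)
    return math.sin(i) / math.sin(r)


# Hypothetical example: a ray entering a denser medium with n = 1.33.
n = 1.33            # assumed refractive index (illustrative value)
incidence = 45.0    # assumed angle of incidence, in degrees

# Angle of refraction from Snell's law, sin(i) / sin(r) = n.
refraction = math.degrees(math.asin(math.sin(math.radians(incidence)) / n))

ratio = descartes_speed_ratio(incidence, refraction)
print(f"i = {incidence:.1f} deg, r = {refraction:.1f} deg, v_r / v_i = {ratio:.3f}")
```

Because the ray bends toward the normal, the computed ratio is greater than one and in fact equals the assumed index: under Descartes' assumption the light would have to speed up on entering the denser medium.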
Descartes' opinions were vigorously attacked by Robert Hooke (1635-1703), whose
views mark a significant turning point in conjectures about the nature of light. Noted
for Hooke's law, he was an able mechanician who devised many improvements in
clocks and astronomical instruments, and was the first to formulate a theory of planetary movements as a mechanical problem. He was responsible for the development of
microscopy as a science in England, and his interest in this subject led him to many
experiments concerned with light itself. Hooke became convinced that light was an
undulatory phenomenon, and his reasons are lucidly expressed in the passage¹⁵
And first for Light, it seems very manifest, that there is no luminous Body but has the
parts of it in motion more or less . . . . It would be somewhat too long . . . to examine,
and positively to prove, what particular kind of motion it is that must be the efficient of
Light . . . . I found it ought to be exceeding quick . . . that in all extreamly hot shining
bodies, there is a very quick motion that causes Light, as well as a more robust that causes
Heat, may be argued from the celerity wherewith the bodies are dissolv'd.
Next, it must be a Vibrative motion. And for this the newly mention'd Diamond affords
us a good argument: since if the motion of the parts did not return, the Diamond must
after many rubbings decay and be wasted . . . .
And Thirdly, That it is a very short vibrating motion, I think the instances drawn from
the shining of Diamonds will also make probable. For a Diamond being the hardest body
we yet know in the World, and consequently the least apt to yield or bend, must consequently also have its vibrations exceeding short.

14 Ibid., Sec. 63.
15 R. Hooke, Micrographia, or Some Physiological Descriptions of Minute Bodies Made by Magnifying
Glasses, 1st ed., pp. 54-56, published by the Royal Society of London, reproduced by Dover Publications, Inc., New York, 1961.

Having proposed an explanation for the sources of light, Hooke then suggested
That the motion is propagated every way through an Homogeneous medium by direct or
straight lines extended every way like Rays from the center of a sphere . . . in an Homogeneous medium this motion is propagated every way with equal velocity, whence necessarily
every pulse or vibration of the luminous body will generate a Sphere, which will continually
increase, and grow bigger, just after the same manner (though indefinitely swifter) as the
waves or rings on the surface of the water do swell into bigger and bigger circles about a
point of it, where, by the sinking of a stone the motion was begun, whence it necessarily
follows, that all the parts of these Spheres undulated through an Homogeneous medium
cut the Rays at right angles.

Thus Hooke paralleled Descartes in postulating a medium as the vehicle of light.
However, he replaced Descartes' notion that light was a statical pressure in the medium
with the notion that it is a rapid undulatory motion of small amplitude. Hooke then
went on to replace the Descartes analysis of refraction with one of his own, based on the
tilting of a wavefront at the interface of two media, but he failed to notice that it would
be necessary to assume the velocity to be slower in the denser medium in order to be
consistent with Snell's law.
The issue of whether light was wavelike or particlelike was firmly joined with the
emergence on the scientific scene of Isaac Newton (1642-1727). Renowned for his discoveries in mechanics, Newton also made many significant contributions in the field of
light. His most notable discovery was that white light is made up of the spectral colors,
which led him to propound a theory of prismatic colors directly opposed to an earlier
theory put forward by Hooke. This precipitated a bitter controversy in which Hooke
displayed considerable vexation and accused Newton of favoring the doctrine that light
is a material substance. Newton gave his answer in a communication to the Royal
Society in 1675 in which he said¹⁶
Were I to assume an hypothesis, it should be this, if propounded more generally, so as
not to determine what light is, farther than that it is something or other capable of exciting
vibrations in the aether: for thus it will become so general and comprehensive of other
hypotheses, as to leave little room for new ones to be invented. And therefore, because I
have observed the heads of some great virtuosos to run much upon hypotheses, as if my discourses wanted an hypothesis to explain them by, and found, that some, when I could not
make them take my meaning, when I spake of the nature of light and colours abstractedly,
have readily apprehended it, when I illustrated my discourse by an hypothesis; for this
reason I have here thought fit to send you a description of the circumstances of this hypothesis as much tending to the illustration of the papers I herewith send you. And though I shall
not assume either this or any other hypothesis, not thinking it necessary to concern myself,
whether the properties of light, discovered by me, be explained by this, or Mr. Hooke's,
or any other hypothesis capable of explaining them; yet while I am describing this, I shall
sometimes, to avoid circumlocution, and to represent it more conveniently, speak of it,
as if I assumed it, and propounded it to be believed. This I thought fit to express, that no
man may confound this with my other discourses, or measure the certainty of one by the
other, or think me obliged to answer objections against this script: for I desire to decline
being involved in such troublesome and insignificant disputes.

16 I. Newton, Papers and Letters on Natural Philosophy, edited by I. Bernard Cohen, p. 179, Harvard
University Press, 1958.

Newton's lifelong distaste for controversy is clearly evident here, but equally evident
is his refreshing lack of dogmatism about rigid hypotheses. He thoroughly disliked
highly imaginative suppositions, such as Descartes had invoked for his grand scheme of
the universe, and was much more interested in the formulation of the laws which govern
natural phenomena. Despite this, he found it impossible to give coherence to the
observed facts about light without resorting to some speculation about its nature.
Thus in this same communication, after an exhaustive and detailed discussion of the
possible composition of an ether, Newton goes on to suppose that
Light is neither aether, nor its vibrating motion, but something of a different kind propagated from lucid bodies. They, that will, may suppose it an aggregate of various peripatetic
qualities. Others may suppose it multitudes of unimaginable small and swift corpuscles of
various sizes, springing from shining bodies at great distances one after another; but yet
without any sensible interval of time, and continually urged forward by a principle of
motion, which in the beginning accelerates them, till the resistance of the aethereal medium
equal the force of that principle, much after the manner that bodies let fall in water are
accelerated till the resistance of the water equals the force of gravity.

In Newton's lifetime, all the facts known about light could not be harmonized with
either the corpuscular or wave theories then being proposed. However, he leaned
toward a corpuscular hypothesis, and near the end of his life summed up his objections
to the wave theory in a query at the conclusion of a revised edition of his Opticks¹⁷
Are not all Hypotheses erroneous, in which Light is supposed to consist in Pression or
motion, propagated through a fluid Medium? . . . If Light consisted only in Pression propagated without actual Motion, it would not be able to agitate and heat the Bodies which
refract and reflect it . . . . And if it consisted in Pression or Motion, propagated either in
an instant or in time, it would bend into the Shadow. For Pression or Motion cannot be
propagated in a Fluid in right Lines, beyond an obstacle which stops part of the Motion,
but will bend and spread every way into the quiescent Medium which lies beyond the
Obstacle . . . . The Waves on the Surface of stagnating Water, passing by the sides of a
broad Obstacle which stops part of them, bend afterwards . . . . But Light is never known
to follow crooked Passages nor to bend into the Shadow.

Newton goes on, in this query, to add the further objection that the wave theory (as it
then existed) could not account for the recently discovered phenomenon of the polarization of light.

17 I. Newton, Opticks, 4th ed., pp. 362-370, William Innys, Publisher, London, 1730. (Reprinted by
Whittlesey House, McGraw-Hill Book Company, New York, 1931.)

The discoverer of this phenomenon of polarization was Christiaan Huygens (1629-1695), a contemporary of both Hooke and Newton, who sided with Hooke in favoring a
wave theory of light. Inventor of the pendulum clock, perceptive and influential critic
of Descartes' cosmological theories, Huygens is known principally for his work in optics.
He greatly extended and improved the wave theory first enunciated by Hooke and
subscribed wholeheartedly to Hooke's hypothesis that light consists of some form of
motion. Witness the passage.¹⁸
It is inconceivable to doubt that light consists in the motion of some sort of matter. For
whether one considers its production, one sees that here upon the Earth it is chiefly engendered by fire and flame which contain without doubt bodies that are in rapid motion, since
they dissolve and melt many other bodies, even the most solid; or whether one considers its
effects, one sees that when light is collected, as by concave mirrors, it has the property of
burning as a fire does, that is to say it disunites the particles of bodies. This is assuredly
the mark of motion, at least in the true Philosophy, in which one conceives the causes of all
natural effects in terms of mechanical motions. This, in my opinion, we must necessarily do,
or else renounce all hopes of ever comprehending anything in Physics.
And as, according to this Philosophy, one holds as certain that the sensation of sight is
excited only by the impression of some movement of a kind of matter which acts on the
nerves at the back of our eyes, there is here yet one reason more for believing that light
consists in a movement of the matter which exists between us and the luminous body.

Huygens next addresses himself to the question as to whether the motion is that of a
medium, as assumed by Hooke, or whether it is a stream of particles, as favored by
Newton. He says
Further, when one considers the extreme speed with which light spreads on every side,
and how, when it comes from different regions, even from those directly opposite, the rays
traverse one another without hindrance, one may well understand that when we see a
luminous object, it cannot be by any transport of matter coming to us from this object,
in the way in which a shot or an arrow traverses the air; for assuredly that would too greatly
impugn these two properties of light, especially the second of them.

Huygens shared with Newton the inclination to picture an ethereal medium in which
light propagated. Whereas Newton favored the idea that this medium was set into
vibration by the passage of light corpuscles through it, Huygens preferred to imagine
a process analogous to sound, in which the vibrating particles of the luminous source
would excite the contiguous portion of the medium into vibration, which would in turn
transfer this excitation on to the next portion, etc. This mechanical model of light
propagation led him to his most important contribution, ever since known as Huygens'
principle, and explained in the passage¹⁹
There is the further consideration in the emanation of these waves, that each particle of
matter in which a wave spreads, ought not to communicate its motion only to the next
particle which is in the straight line drawn from the luminous point, but that it also imparts
some of it necessarily to all the others which touch it and which oppose themselves to its
movement. So it arises that around each particle there is made a wave of which that particle
is the centre.

18 C. Huygens, Traité de la Lumière, pp. 3-4, first published in Leyden in 1690; English translation by
S. P. Thompson, London, 1912; reprinted by University of Chicago Press.
19 Ibid., p. 19.

Using this principle, Huygens was able to show how all the points in one wavefront
could be treated as secondary sources which created the next wavefront, and thus provided satisfactory explanations for propagation and reflection. By assuming that the
velocity of light was slower in a denser medium he was also able to explain refraction.
This proved to be a pivotal point a century and a half later in deciding between a
corpuscular or wave theory, since it has already been observed that the corpuscular
theory requires a faster velocity in a denser medium in order to be consistent with
the law of refraction.
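
The opposing predictions of the two theories can be put side by side in a short numerical sketch. It assumes the standard statement of each model: the corpuscular picture conserves the velocity component parallel to the interface, while the wave picture requires the wavefront to sweep along the interface at the same rate in both media, so that sin i / v_i = sin r / v_r. The function names, the unit speed in the first medium, the 30-degree angle of incidence, and the refractive index of 1.33 are illustrative assumptions, not values from the text.

```python
import math


def refraction_angle_deg(incidence_deg: float, n: float) -> float:
    """Angle of refraction in degrees from Snell's law, sin(i) / sin(r) = n."""
    sin_r = math.sin(math.radians(incidence_deg)) / n
    return math.degrees(math.asin(sin_r))


def corpuscular_speed(v_i: float, incidence_deg: float, refraction_deg: float) -> float:
    """Speed in the second medium if the velocity component parallel to the
    interface is conserved: v_i * sin(i) = v_r * sin(r)."""
    i = math.radians(incidence_deg)
    r = math.radians(refraction_deg)
    return v_i * math.sin(i) / math.sin(r)


def wave_speed(v_i: float, incidence_deg: float, refraction_deg: float) -> float:
    """Speed in the second medium if the wavefront's trace along the interface
    advances at the same rate on both sides: sin(i) / v_i = sin(r) / v_r."""
    i = math.radians(incidence_deg)
    r = math.radians(refraction_deg)
    return v_i * math.sin(r) / math.sin(i)


# Hypothetical example values (assumptions for illustration only).
v_i = 1.0          # speed in the first medium, arbitrary units
n = 1.33           # assumed refractive index of the denser medium
incidence = 30.0   # assumed angle of incidence, in degrees

refraction = refraction_angle_deg(incidence, n)
v_corpuscular = corpuscular_speed(v_i, incidence, refraction)
v_wave = wave_speed(v_i, incidence, refraction)

print(f"corpuscular prediction: v_r = {v_corpuscular:.3f} (faster than v_i)")
print(f"wave prediction:        v_r = {v_wave:.3f} (slower than v_i)")
```

Both models reproduce the same bending of the ray, so the angles alone cannot decide between them; only a measurement of the speed of light in the denser medium can.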
Huygens was unsuccessful in explaining interference effects, such as the colored rings
of thin films and sharp shadows past obstacles, partly because it was not then appreciated how short the wavelengths of visible light are. He also confessed his inability to
explain his own discovery of polarization, but this is easily understood when one remembers that in 1700 it was not recognized that light consisted of transverse vibrations.
Similarly, Newton had difficulty in explaining the colors of thin films under the corpuscular theory and the noninterference of beams of light whose paths crossed. Although neither theory was adequate, the esteem in which Newton was held by his
contemporaries and followers was so great that the wave theory was rejected and
allowed to remain unnourished for over a century. If the fact that Newton found the
corpuscular hypothesis more acceptable retarded the growth of the theory of light, as
some have claimed, the fault lay with those who blindly espoused all his views. It has
already been noted that Newton himself did not hold rigidly to any one hypothesis but
rather gave tentative acceptance to that theory which appeared to him to fit most of
the facts.
Although most scientists of the eighteenth century accepted the corpuscular hypothesis, the wave theory was not totally without advocates. Franklin (1706-1790)
favored it, and Euler (1707-1783) took the same position, being persuaded by the
notion that particle emission from a luminous source would cause a diminution in its
mass, an effect not observed, whereas the emission of waves did not involve such a
consequence. However, the wave theory did not make any serious headway until a
new champion arose when Thomas Young (1773-1829) turned his attention to the
subject. A man of diverse and considerable talent, Young was a practicing physician
on the staff of St. George's Hospital. He was also a physicist, whose lectures at the
Royal Institution of London introduced the modern physical concept of energy. He
was a prodigy at two, an accomplished linguist while still in his boyhood, a musician,
and an archeologist who participated in the deciphering of the Rosetta stone. He made
contributions to the theory of tides, explained capillarity, and established the coefficient
of elasticity known as Young's modulus.
Drawing upon an earlier explanation by Newton in connection with tides, Young
introduced the concept of interference by saying:²⁰
Suppose a number of equal waves of water to move upon the surface of a stagnant lake,
with a certain constant velocity, and to enter a narrow channel leading out of the lake.
Suppose then another similar cause to have excited another equal series of waves, which
arrive at the same channel, with the same velocity, and at the same time with the first.
Neither series of waves will destroy the other, but their effects will be combined: if they
enter the channel in such a manner that the elevations of one series coincide with those of
the other, they must together produce a series of greater joint elevations; but if the elevations of one series are so situated as to correspond to the depressions of the other, they
must exactly fill up those depressions, and the surface of the water must remain smooth:
at least I can discover no alternative, either from theory or from experiment.
20 T. Young, Miscellaneous Works, edited by George Peacock, Vol. 1, pp. 202-203, John Murray
Publishers, Ltd., London, 1855.

Now I maintain that similar effects take place whenever two portions of light are thus
mixed; and this I call the general law of the interference of light.

Young demonstrated this concept in an experiment performed before the Royal
Society of London in 1803. Using a distant source of a single color, he permitted light
to pass through two tiny holes placed close together in one screen, and to fall on a second
screen. The second screen showed a pattern of fine bands, alternately light and dark.
Young explained this pattern by recourse to a law he had enunciated²¹ in 1802:
Wherever two portions of the same light arrive at the eye by different routes, either
exactly or very nearly in the same direction, the light becomes most intense when the difference of the routes is any multiple of a certain length, and least intense in the intermediate state of the interfering portions; and this length is different for light of different
colours.
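
Young's law can be illustrated numerically. The sketch below is a minimal example, not Young's own calculation: it assumes two equal, fully coherent beams and uses the modern two-beam formula I = 2 I0 (1 + cos(2π Δ/λ)), where Δ is the path difference and λ is the "certain length" of the quotation. The function name `two_beam_intensity` and the 600 nm wavelength are hypothetical choices made only for illustration.

```python
import math


def two_beam_intensity(path_difference, wavelength, single_beam_intensity=1.0):
    """Combined intensity of two equal coherent beams.

    Assumes ideal two-beam interference: I = 2*I0*(1 + cos(phase)),
    where phase = 2*pi*(path difference)/(wavelength).
    """
    phase = 2.0 * math.pi * path_difference / wavelength
    return 2.0 * single_beam_intensity * (1.0 + math.cos(phase))


# Assumed illustrative wavelength (metres); any visible value would serve.
wavelength = 600e-9

# Whole multiples of the wavelength give maxima; half-odd multiples give minima.
for multiple in (0.0, 0.5, 1.0, 1.5, 2.0):
    intensity = two_beam_intensity(multiple * wavelength, wavelength)
    print(f"path difference = {multiple:.1f} wavelengths -> intensity {intensity:.3f}")
```

Under these assumptions the intensity peaks at four times that of a single beam whenever the path difference is a whole number of wavelengths, and falls to zero midway between, which is the alternation of light and dark bands Young described.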

He also used this law to give the first satisfactory explanation of the colors of light
reflected from thin plates, arguing that the incident light causes two beams to reach
the eye: the first of these beams has been reflected from the first surface of the thin
plate, and the other from the second. These two beams produce the colors in the
reflected light due to their interference. Indeed, Young used the measured thickness
of thin plates to determine for the first time the characteristic lengths, or wavelengths,
of the various colors of visible light, publishing²² a table of values which is remarkably
accurate by today's standards.
Despite a bitter attack on Young by the followers of Newton, support for the wave
theory accumulated rapidly. Fresnel (1788-1827) satisfactorily explained diffraction
past a sharp edge in terms of mutual interference of the secondary "Huygens" waves
generated by those portions of the original wavefront not obstructed by the diffracting
obstacle. Sharp shadows beyond obstacles big in terms of wavelengths thus became
understood, a point about the wave theory which had always bothered Newton. Fresnel also demonstrated light interference by employing two mirrors, and in a brilliant
experiment confirmed an hypothesis by Young that light consisted of transverse vibrations by showing that two cross-polarized beams of light do not interfere with each
other. This permitted an explanation under the wave theory of the phenomenon of
light polarization in crystals, which had earlier been a stumbling block for Huygens.
Kirchhoff (1824-1887), starting from the wave equation, developed a diffraction formula in which Huygens' secondary sources were revealed, thus putting that principle
on a much firmer foundation.²³
Finally, the coup de grace was delivered to the corpuscular theory in 1850 when
Foucault²⁴ (1819-1868) and Fizeau²⁵ (1819-1896) measured the velocity of light in
21 T. Young, "An Account of Some Cases of the Production of Colours," Phil Trans Roy Soc (London),
92, 387-397; July 1802.
22 T. Young, "On the Theory of Light and Colours," Phil Trans Roy Soc (London), 92, 12-48; November 1801.
23 Kirchhoff summarized his work in the textbook Vorlesungen über mathematische Optik, Zweite Vorlesung, Sec. 2, Berlin, 1891.
24 M. L. Foucault, "General Method for Measuring the Speed of Light in Air and Transparent Media.
Relative Speeds of Light in Air and Water," Compt Rend, 30, 551-560; May 1850.
25 H. Fizeau and L. Breguet, "Note on an Experiment Relative to the Comparative Velocities of
Light in Air and in Water," Compt Rend, 30, 562-563; May 1850.

air and water, finding that it was slower in the latter. This result was consistent with
the wave theory, whereas the reverse had been predicted by the corpuscular hypothesis.
With this experiment, all sensible objection to the wave theory of light had disappeared.
At about this time Maxwell (1831-1879) began formulating his theory of electromagnetism, culminating in the celebrated equations which bear his name. Wavelike
solutions to these equations indicated that electromagnetic fields would propagate
through a vacuum at the same speed as light. This led Maxwell to the important
conjecture that light is an electromagnetic phenomenon and further strengthened the
belief that light is basically wavelike in nature.
In 1887, Heinrich Hertz (1857-1894) provided the first successful demonstration of
the generation and propagation of electromagnetic waves, using separate spark gap
coils to transmit and receive. This achievement was hailed immediately by his contemporaries as the crowning victory of physics, the first experimental verification of
the validity of Maxwell's theory. Ironically, a side effect of this experiment was destined
to contribute to a great revolution in scientific thought. Hertz noticed that the sparks
produced in the gap of his receiving coil were influenced by the light falling on this gap
from the sparks in the transmitting coil. Further investigation led Hertz to conclude
that it was the ultraviolet portion of the light which was responsible for the effect,
and that the effect was greatest if the light were incident on the negative point of the
gap. Hertz reported these observations but carried the investigation no further. However, his discovery intrigued many others, and significant contributions were made by
Hallwachs, who showed that the photoelectric effect, as it came to be called, consisted
of the emission of negative charges, and by Lenard, who measured the charge to mass
ratio of the emitted charges and concluded that they were electrons.
A variety of materials was found to be photosensitive, but the characteristics of
the emission were surprising. The number of electrons emitted per unit time was proportional to the intensity of the incident light, which seemed reasonable. However,
the maximum kinetic energy of the emitted electrons was dependent on the frequency
of the light used, but independent of its intensity. A classical argument, assuming a
collision-like process, would anticipate that the greater the intensity of the incident
wave, the greater would be the energy of the electrons which were torn loose from the
surface.
Albert Einstein (1879-1955) offered an explanation of the photoelectric effect in 1905,
the same year he received his doctorate from Zurich and published his first paper on
relativity. Drawing on an hypothesis made several years earlier by Planck, who had been
concerned with the spectral distribution of black-body radiation, Einstein assumed²⁶
. . . that the incident light is composed of quanta of energy $(R/N_A)\beta\nu$ . . . . The quanta
of energy penetrate the surface of the material and their respective energies are at least
in part changed into the kinetic energy of electrons. The simplest process conceivable is
that a quantum of light gives up all its energy to a single electron . . . . Upon reaching the
surface, an electron originally inside the body will have lost a part of its kinetic energy.
Furthermore, one may assume that each electron in leaving the body does an amount of work
$W$, which is characteristic of the material. Those electrons which are ejected normal to and
from the immediate surface will have the greatest velocities. The kinetic energy of these
electrons is

$$\frac{R}{N_A}\beta\nu - W$$

26 A. Einstein, "An Heuristic Viewpoint Concerned with the Generation and Transformation of Light,"
Ann Phys, 322, 132-148; 1905.


Einstein thus hypothesized that the incident light was composed of quanta, or photons,
whose energy was proportional to the frequency $\nu$ of the light. His proportionality
factor consisted of a parameter $\beta$ multiplied by the Boltzmann constant, $k = R/N_A$,
with $R$ the ideal gas constant and $N_A$ Avogadro's number. Einstein then argued that,
if the photoelectric material were raised to a potential $V$ above a surrounding grounded
electrode, then even the most energetic emitted electrons would not reach the grounded
electrode if $V$ were of such magnitude that

$$Ve = \frac{R}{N_A}\beta\nu - W$$

in which e is the electronic charge. He then went on to say


If the formula derived is correct, it would follow that V, if plotted in cartesian coordinates as a function of the frequency of the exciting photons, would yield a straight line
whose slope is independent of the material under investigation . . . . If each quantum of
light were to give its energy to the electrons independently of all the others then the velocity
distribution . . . will be independent of the intensity of the exciting radiation; on the other
hand the numbers of electrons leaving the body under equal conditions will be directly proportional to the intensity of the incident radiation.

Einstein's formula and explanation are notable for their simplicity and fit all the
observed facts. At the time he proposed this explanation he had at his disposal only
qualitative data, but his equation received final and thorough experimental verification
through the precise work of Millikan in 1916.²⁷ Working with a circuit shown in simplified form in Figure 1.1, Millikan varied the reverse bias until it reached a value $V$
such that the ammeter read no current. Since this voltage was just enough to prevent
the most energetic electrons from reaching the second electrode, one could argue that
$Ve$ was the maximum kinetic energy any of the electrons had upon being emitted
from the photosensitive electrode. When Millikan varied $\nu$, the frequency of the incident light, and recorded $V$ for each frequency, he obtained a curve such as shown
in Figure 1.2. This experimental result was consistent with Einstein's equation
$Ve = (R/N_A)\beta\nu - W$, and the experimental significance of the intercept $\nu_0$ is that
light at a lower frequency cannot cause photoelectric emission from the metal concerned. The quantity $\nu_0$ was found to be characteristic of the photosensitive material
forming the electrode, but the slope of the curve was the same for all electrodes. The
slope, which is Einstein's proportionality constant $(R/N_A)\beta$, proved to be identical
with the constant $h$ which Planck employed to explain black-body radiation. Thus
Einstein's quantum of light, or photon, was found to have an energy $E = h\nu$.
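
The straight-line behavior that Millikan observed can be sketched numerically. The example below is an illustration only, written in the modern form Ve = hν − W of Einstein's equation. The numerical values of h and e are the present-day defined constants (not Millikan's measured figures), and the two work functions, 2.3 eV and 4.5 eV, are hypothetical values chosen to represent two different electrode materials.

```python
PLANCK_H = 6.62607015e-34            # J*s, modern defined value (assumption: not from the text)
ELEMENTARY_CHARGE = 1.602176634e-19  # C, modern defined value (assumption: not from the text)


def threshold_frequency(work_function_ev):
    """Frequency (Hz) below which no emission occurs: h*nu0 = W."""
    return work_function_ev * ELEMENTARY_CHARGE / PLANCK_H


def stopping_potential(frequency_hz, work_function_ev):
    """Stopping potential V (volts) from V*e = h*nu - W; zero below threshold."""
    volts = PLANCK_H * frequency_hz / ELEMENTARY_CHARGE - work_function_ev
    return max(volts, 0.0)


# Two hypothetical electrode materials, identified only by their work functions.
f_low, f_high = 1.2e15, 1.5e15  # Hz, both above either threshold
for work_function_ev in (2.3, 4.5):
    slope = (stopping_potential(f_high, work_function_ev)
             - stopping_potential(f_low, work_function_ev)) / (f_high - f_low)
    print(f"W = {work_function_ev} eV: "
          f"threshold = {threshold_frequency(work_function_ev):.3e} Hz, "
          f"slope of V vs. frequency = {slope:.4e} V*s")

print(f"h/e = {PLANCK_H / ELEMENTARY_CHARGE:.4e} V*s")
```

The threshold frequency changes with the assumed work function, while the slope of stopping potential against frequency comes out as h/e for both materials (equivalently, the slope of the energy Ve against frequency is h), mirroring the material-independent slope described above.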
However, the concept that light consists of discrete energy bundles, or photons,
smacks strongly of the earlier corpuscular theories. Is light wavelike or corpuscular?
The best current answer appears to be that it has a dual personality, exhibiting one
set of characteristics or the other, depending on how it is interacting with its environment. If the process being considered is at the microscopic level, the quantized nature
of light will most likely have to be considered; if it is a macroscopic process, the wave
nature of light should account successfully for the interaction.
It would seem that just about everybody was right all along.

27 R. A. Millikan, "A Direct Photoelectric Determination of Planck's h," Phys Rev, 7, 355-388; 1916.

FIGURE 1.1 Photoelectric diode.

FIGURE 1.2 Maximum electron energy vs. light frequency.

1.2 *

HISTORICAL SURVEY-THE VELOCITY OF LIGHT

Whereas a determination of the nature of light is not totally decisive, such ambivalence
does not exist when the discussion turns to the conception of the velocity of light.
Whether light is thought of as a stream of photons or a propagating wave, the transfer
of energy occurs at a speed which, today, can be measured with extraordinary precision.
Yet this speed is so great that it is not surprising to find earlier debates as to whether
the velocity of light is finite or infinite.

* The reader solely interested in the technical presentation may wish to omit this section except for
the discussion of Bradley's experiment.

The direct evidence is lost to us, but Empedocles apparently felt that the velocity is
finite, for Aristotle disputes with him in the passage:²⁸
Empedocles (and with him all others who used the same forms of expression) was wrong
in speaking of light as 'traveling' or being at a given moment between the earth and its
envelope, its movement being unobservable by us; that view is contrary both to the clear
evidence of argument and to the observed facts; if the distance traversed were short, the
movement might have been unobservable, but where the distance is from extreme East to
extreme West, the draught upon our powers of belief is too great.

Heron of Alexandria, whose life span has variously been placed in the period from the
second century B.C. to the third century A.D., and who is noted for his invention of
many contrivances operated by water, steam, or compressed air, believed with Euclid
and Ptolemy that light rays originated in the eye. This belief led him to an interesting
argument as proof that the velocity of light is infinite:²⁹
That the sight rays emanating from our eyes move with infinite velocity can also be seen
from the following. Namely if, after having closed our eyes, we look again upward to the
heavens, these rays reach the heavens without any time interval having elapsed (i.e., immediately). For in the same instant in which we open our eyes, we see the stars, even though
we may say that the distance is practically infinite. Also, if this distance were even greater,
the same occurrence would be repeated in any case, and thus it results that the rays emanating
from our eyes propagate with infinite velocity. They therefore suffer in their propagation
no interruption in their motion, nor do they make a detour, nor follow a broken-line path,
but rather move along the shortest line, namely the straight one.

Alhazen believed otherwise, and in his treatise on optics stated:³⁰
And we shall see that color will not be perceived in that which is color by the sight, nor
light in that which is light, except in time . . . the arrival of the sensation (of light) to the
hollow of the optic nerve is like the arrival of light from holes . . . the passing of light
from a hole to an object opposite the hole will not be possible except in time, even though
this fact is concealed from the mind.
The passing of light from a hole to an object opposite the hole cannot escape being in one
of the two following ways, namely, that either light will come to that part of the air which is
near the hole, before it can arrive to another following point, and thereafter it will come
to another point, and so to another, until it arrives at the object opposite the hole, or light
will arrive at the entire intermediate atmosphere between the hole and the object opposite
the hole, and to the very object, all at the same time. If the air received light in a successive fashion, the light would not arrive at the object opposite the hole, except through
movement. But movement does not exist except in time; thus, if the whole atmosphere
receives light at the same time, even the arrival of light to the atmosphere does not exist,
since it was not in the atmosphere before . . . .
28 Aristotle, De Anima, 418b, 20, English translation under editorship of W. D. Ross, Oxford at the
Clarendon Press, 1931.
29 Heronis Alexandrini, Catoptrica, Vol. 2, pp. 320-323, translated into German by L. Nix and W.
Schmidt, von B. G. Teubner, Leipzig, 1900. (Private English translation.)
30 Alhazen, Opticae Thesaurus, edited by Risner, Vol. 2, Chap. 2, Article 21, Basel, 1572. (Private
translation.)

If the hole through which the light enters becomes blocked, and then the blockage is removed, the instant during which the blockage is removed is different from the instant
during which the light reaches the contiguous atmosphere . . . . Therefore this is done by
a movement; but a movement does not exist except in time . . . . However, this time
element is strongly concealed from the mind due to the rapidity of the perception of the
sensation of light by the air.

Avicenna agreed with Alhazen, basing his opinion on the belief that light consisted of
the motion of finite particles which therefore could not have an infinite velocity. Roger
Bacon also sided with Alhazen, although he did not like the reasons advanced above
and preferred the argument Alhazen put forth in his seventh volume that "from the
same terminus the perpendicular ray reaches more quickly the terminus of the space
than the ray that is not perpendicular." However, Bacon was very gentle in his disagreement with Aristotle, drawing a fine distinction between perceptible and imperceptible intervals of time. His principal reason for believing in a finite velocity is contained in the passage:³¹
. . . an instant has the same relation to time as a point to a line. Therefore, interchanging
terms, an instant has the same relation to a point as time has to a line; but the passage
through a point is in an instant. Therefore the passage through the whole line is in time.
Therefore species [of light] passing through linear space, however small, will pass through
in time . . . . If, therefore, the multiplication of light is instantaneous, and not in time,
there will be an instant without time; because time does not exist without motion. But it
is impossible that there should be an instant without time, just as there cannot be a point
without a line. It remains, then, that light is multiplied in time, and likewise all species of
a visible thing and of vision. But nevertheless the multiplication does not occupy a sensible
time and one perceptible by vision, but an imperceptible one, since anyone has experience
that he himself does not perceive the time in which light travels from east to west.

Francis Bacon (1561-1626), an English philosopher credited with the formulation
and introduction of the inductive method of modern science, struggled with the question of the velocity of light in the absence of experimental information, as is evident in
this excerpt:³²
Even in sight, whereof the action is most rapid, it appears that there are required certain
moments of time for its accomplishment . . . . (It is not surprising that we do not see the
actual passage of light, for there are things which by reason of the velocity of their motion
cannot be seen-as when a ball is discharged from a musket . . . . ) This fact, with others
like it, has at times suggested to me a strange doubt, viz. whether the face of a clear and starlight sky be seen at the instant at which it really exists, and not a little later; and whether
or not, as regards our sight of heavenly bodies, [there is] a real time and an apparent time,
just like the real place and apparent place which is taken account of by astronomers in the
correction for parallaxes . . . [whether or not] the images or rays of heavenly bodies . . .
take a perceptible time in travelling to us. But this suspicion as to any considerable interval
between the real time and the apparent afterwards vanished entirely . . . what had most
weight of all with me was, that if any perceptible interval of time were interposed between
31 R. Bacon, Opus Majus, Part 5, 9th Distinction, Chap. 3, the R. B. Burke translation, University
of Pennsylvania Press, Philadelphia, 1928.
32 Francis Bacon, Philosophical Works, edited by J. M. Robertson from the edition of Ellis and Spedding, p. 363, London, 1905. (As quoted in I. B. Cohen, Roemer, p. 11, The Burndy Library, Inc.,
New York, 1944.)

the reality and the sight, it would follow that the images would oftentimes be intercepted
and confused by clouds rising in the meanwhile, and similar disturbances of the medium.

A contrast to all this metaphysical speculation is found in the attitude of Galileo
Galilei (1564-1642). Widely regarded as the father of modern physics, Galileo was a
champion of the experimental method. At the age of twenty-six, while professor of
mathematics at Pisa, he began a systematic investigation of the mechanical doctrines
of Aristotle. Having convinced himself by experiment of the error in many of Aristotle's
assertions, Galileo invoked the enmity of the Church by loudly proclaiming his dissensions. These included the question of whether or not a heavy body falls faster than a
light one, and later the profound question of whether the Ptolemaic or Copernican view
of the universe was the proper one.
Galileo was the first to observe that a simple pendulum has a natural period. He
properly deduced the formulas of uniformly accelerated motion, and his contributions
to mechanics were an important precursor to the generalizations made by Newton a
century later. He constructed the first astronomical telescope and with it discovered the
satellites of Jupiter, the crescent phases of Venus, sunspots and the rotation of the sun,
and the libration of the moon. Galileo became interested in the question of light velocity
and, believing it to be finite, undertook to establish this experimentally. His approach
was logical but doomed to failure because of the great velocity involved. In the famous
Dialogues, published in Leyden in 1638, Galileo proposed that³³
Each of two persons take a light contained in a lantern, or other receptacle, such that by
the interposition of the hand, the one can shut off or admit the light to the vision of the
other. Next let them stand opposite each other at a distance of a few cubits and practice
until they acquire such skill in uncovering and occulting their lights that the instant one
sees the light of his companion he will uncover his own. After a few trials the response will
be so prompt that without sensible error the uncovering of one light is immediately followed by the uncovering of the other, so that as soon as one exposes his light he will instantly
see that of the other. Having acquired skill at this short distance let the two experimenters,
equipped as before, take up positions separated by a distance of two or three miles and let
them perform the same experiment at night, noting carefully whether the exposures and
occultations occur in the same manner as at short distances; if they do, we may safely
conclude that the propagation of light is instantaneous; but if time is required at a distance of three miles which, considering the going of one light and the coming of the other,
really amounts to six, then the delay ought to be easily observable. If the experiment is to
be made at still greater distances, say eight or ten miles, telescopes may be employed, each
observer adjusting one for himself at the place where he is to make the experiment at night;
then although the lights are not large and are therefore invisible to the naked eye at so great
a distance, they can readily be covered and uncovered since by aid of the telescopes, once
adjusted and fixed, they will become easily visible.

Later he comments,
In fact I have tried the experiment only at a short distance, less than a mile, from which
I have not been able to ascertain with certainty whether the appearance of the opposite
light was instantaneous or not; but if not instantaneous it is extraordinarily rapid-I should
call it momentary; . . . .
33 Galileo Galilei, Dialogues Concerning Two New Sciences, p. 43, reprinted by Dover Publications, Inc.,
New York.

Galileo's experiment was repeated by scientists of the Florentine Academy but with
inconsistent results. The human reaction times were much too great, the separation of
the lanterns was only a few miles, and the timepieces of that era were extremely crude.
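
A rough calculation shows why the lantern method could not succeed. The sketch below is illustrative only: it uses the modern value of the speed of light, treats Galileo's "mile" as a modern statute mile, and assumes a human reaction time of about 0.2 second. None of these figures comes from the text, and the name `round_trip_delay` is a hypothetical helper.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s, modern defined value (assumption: not from the text)
METRES_PER_MILE = 1609.344      # statute mile (assumption: Galileo's mile may have differed)
ASSUMED_REACTION_TIME = 0.2     # s, rough human reaction time (assumption)


def round_trip_delay(separation_miles):
    """Time (s) for light to go out and back between the two observers."""
    return 2.0 * separation_miles * METRES_PER_MILE / SPEED_OF_LIGHT


for miles in (1.0, 3.0, 10.0):
    delay = round_trip_delay(miles)
    print(f"separation {miles:4.1f} mi: round-trip delay = {delay * 1e6:6.1f} microseconds, "
          f"reaction time / delay = {ASSUMED_REACTION_TIME / delay:,.0f}")
```

Under these assumptions, even at a separation of three miles the out-and-back delay is only a few tens of microseconds, several thousand times shorter than the reaction time of the observers.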
In the continuing absence of decisive experimental results, the speculation continued.
Kepler (1571-1630) held an Aristotelian view,³⁴ maintaining that light can be propagated an infinite distance in zero time. He based this view on the argument that light
is not matter and thus cannot offer resistance to the force which moves it. In Aristotelian mechanics, this requires that light attain an infinite velocity.
Descartes, as has already been noted, believed that light consisted of a transmission
of pressure through the tightly packed globules of the ether. However, in his conception,
light was not a motion because the globules only tended to move, being restrained in
position by their neighbors. Thus each globule was capable of transmitting force
instantaneously, which led Descartes to conclude³⁵
Thus, we shall have no trouble in realizing why such an effect, which I attribute to light,
extends in a spherical fashion all around the sun . . . and why such light propagates instantaneously to all distances.

It is interesting to observe that Descartes could believe both that the velocity of light
was infinite and that the velocity of light was not the same in different media, an
assumption he made in deriving Snell's law (see Section 1.2).
In a correspondence with the Dutch physicist Beekman (1570-1637), Descartes was
hard pressed to defend his metaphysical arguments in favor of an infinite light velocity,
and hit upon an argument which is scientifically sound, and which seemed to him to be
a complete proof that his position was the only correct one. Descartes proposed consideration of a lunar eclipse, caused by the earth being interposed between the sun and
the moon. He then supposed that it requires an hour for light to travel from the earth
to the moon, which would mean that the moon did not grow dark until an hour after
the instant of collinearity of the three bodies. People on earth would not be aware of
this darkening for an additional hour, or until the earth and moon had moved in their
orbits an additional two hours beyond the position of collinearity. But, argued
Descartes, this is clearly contrary to experience, for the eclipsed moon is always
observed at a point in the ecliptic opposite to the sun. Thus the light must travel
instantaneously.
Huygens challenged this proof at its only weak point, saying:³⁶
But it must be noted that the speed of light in this argument has been assumed such that
it takes a time of one hour to make the passage from here to the Moon. If one supposes
that for this . . . it requires only ten seconds of time . . . then it will not be easy to perceive anything of it in observations of the Eclipse; nor, consequently, will it be permissible
to deduce from it that the movement of light is instantaneous.
It is true that we are here supposing a strange velocity that would be a hundred thousand
times greater than that of Sound . . . . But this supposition ought not to seem to be an
impossibility; since it is not a question of the transport of a body with so great a speed,
but of a successive movement which is passed on from some bodies to others. I have then
34 J. Kepler, Ad Vitellionem paralipomena quibus astronomiae pars optica traditur, Frankfurt, 1604.
35 R. Descartes, Principes de la Philosophie, 4th ed., Part 3, Sec. 64, Chez Theodore Girard, Paris, 1681.
36 C. Huygens, Traité de la Lumière, pp. 6-7, first published in Leyden in 1690; English translation by
S. P. Thompson, London, 1912; reprinted by University of Chicago Press.

made no difficulty, in meditating on these things, in supposing that the emanation of light
is accomplished with time . . . .
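
The exchange between Descartes and Huygens can be put in rough numbers. The sketch below is a simplification, not the reasoning of either author: it assumes the moon drifts relative to the point opposite the sun at the mean synodic rate (one revolution per 29.53 days), and it uses modern values for the earth-moon distance and the speed of sound. All constants and function names are assumptions introduced for illustration.

```python
SYNODIC_MONTH_DAYS = 29.53          # assumption: mean interval between full moons
EARTH_MOON_DISTANCE_KM = 384_400.0  # assumption: modern mean earth-moon distance
SPEED_OF_SOUND_KM_S = 0.34          # assumption: speed of sound in air


def eclipse_lag_degrees(one_way_seconds):
    """Angle the moon drifts from the antisolar point during the out-and-back light time."""
    degrees_per_second = 360.0 / (SYNODIC_MONTH_DAYS * 86_400.0)
    return 2.0 * one_way_seconds * degrees_per_second


def speed_ratio_to_sound(one_way_seconds):
    """Implied speed of light divided by the speed of sound."""
    return (EARTH_MOON_DISTANCE_KM / one_way_seconds) / SPEED_OF_SOUND_KM_S


for label, seconds in (("Descartes, one hour", 3600.0), ("Huygens, ten seconds", 10.0)):
    lag = eclipse_lag_degrees(seconds)
    print(f"{label}: lag = {lag:.4f} degrees ({lag * 3600.0:.1f} arc seconds)")

print(f"ten-second transit implies {speed_ratio_to_sound(10.0):,.0f} times the speed of sound")
```

With these assumptions a one-hour transit displaces the eclipsed moon by about one degree, roughly twice its apparent diameter, whereas a ten-second transit displaces it by only about ten seconds of arc. The ten-second supposition also implies a speed on the order of a hundred thousand times that of sound, in line with Huygens' remark.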

Hooke also appreciated the weakness in Descartes' argument, and in speaking of the
propagation of light through a transparent body or medium, he asserted³⁷ that the
light
may be communicated or propagated through it to the greatest imaginable distance in
the least imaginable time; though I see no reason to affirm, that it must be in an instant:
For I know not any one Experiment or observation that does prove it. And, whereas it may
be objected, That we see the Sun risen at the very instant when it is above the sensible
Horizon, and that we see a Star hidden by the body of the Moon at the same instant, when
the Star, the Moon, and our Eye are all in the same line; and the like Observations, or
rather suppositions, may be urg'd. I have this to answer, That I can as easily deny as they
affirm; for I would fain know by what means anyone can be assured any more of the
Affirmative, than I of the Negative. If indeed the propagation were very slow, 'tis possible
something might be discovered by Eclypses of the Moon; but though we should grant the
progress of the light from the Earth to the Moon, and from the Moon back to the Earth
again to be full two Minutes in performing, I know not any possible means to discover
it . . . .

The distinction for having performed the first decisive determination of the velocity
of light goes to Ole Roemer (1644-1710). Born in Denmark, and educated under the
Bartholins at the University of Copenhagen, he then went to Paris as a young astronomer for the Académie Royale des Sciences, which at that time was undertaking a
project to prepare more accurate maps. A technique had been proposed whereby the
longitude of any place could be determined relative to the longitude of Paris by simultaneous observation of an astronomical phenomenon from the two positions. What was
needed was a celestial occurrence of reasonable frequency, and a tentative selection was
made of the eclipses of the satellites of Jupiter, a phenomenon which had been discovered earlier in the same century by Galileo.
In choosing Roemer to work on this project, the Académie picked a man who was to
prove to be one of the greatest practical astronomers of all time. He built the first good
transit instrument and the earliest transit circle, greatly improved on the construction
of micrometers, and showed that the epicycloid is the best shape for gear teeth, incorporating this discovery into the design of all his astronomical instruments; in his
later years he supervised the erection of an excellent observatory near Copenhagen.
While in Paris at the beginning of his career, and upon launching into a study of the
eclipses of Jupiter's moons, Roemer was struck by a surprising observation. Since one
would expect that the period of a moon would remain constant, knowing the time at
which one eclipse occurred, it was then a simple matter to predict a sequence of later
times at which a given moon would be eclipsed by Jupiter. But when Roemer did this,
he predicted a time sequence which did not agree with later eclipse measurements. He
attributed this disparity to the changed distance between Earth and Jupiter, which,
if the velocity of light were finite, would explain the irregularity in eclipse occurrences.
Accordingly, in September 1676, Roemer announced to members of the Paris
Académie that the next eclipse of the innermost satellite of Jupiter, expected on
37 R. Hooke, Micrographia, 1st ed., p. 56, published by the Royal Society of London, reproduced by
Dover Publications, Inc., New York, 1961.


November 9, would occur exactly ten minutes later than the time computed on the
basis of previous eclipses. When observation had confirmed this startling prediction,
Roemer again addressed the Académie, saying 38
The necessity of this new equation of the retardation of light, is established by all the
observations that have been made by the Académie Royale and by the Observatory during
the last eight years, and it has been confirmed anew by the emersion of the first satellite,
observed at Paris last November 9th at 5h 35m 45s at night, 10 minutes later than had been
expected . . . .

From his knowledge of the relative positions of the earth and Jupiter, Roemer deduced
that this retardation was such that light should take 22 minutes to cross the diameter
of the earth's orbit, which translates into a velocity of light of approximately 140,000
mi/sec. Roemer's value was thus about 25 percent low,† but his accomplishment was
nevertheless impressive. For the first time in history man had been able to measure
a velocity which was so great that many had thought it to be infinite.
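The arithmetic behind the 140,000 mi/sec figure can be sketched as follows. This is only a rough check: the Earth-Sun distance and the modern speed of light used here are present-day values assumed for illustration, not numbers Roemer had available.

```python
# Rough check of the velocity implied by Roemer's 22-minute figure.
# ASSUMPTION: both constants below are modern values, not Roemer's own.
AU_MILES = 92.96e6            # mean Earth-Sun distance in miles
MODERN_C_MI_PER_S = 186_282   # modern speed of light in miles per second


def speed_from_orbit_crossing(minutes: float) -> float:
    """Speed of light (mi/sec) if light crosses the diameter of the
    earth's orbit (2 AU) in the given number of minutes."""
    return 2 * AU_MILES / (minutes * 60.0)


roemer_c = speed_from_orbit_crossing(22.0)
shortfall = 1.0 - roemer_c / MODERN_C_MI_PER_S

print(f"Roemer's implied velocity: {roemer_c:,.0f} mi/sec")
print(f"Shortfall relative to the modern value: {shortfall:.1%}")
```

With these assumed constants the result comes out near 141,000 mi/sec, roughly a quarter below the modern value.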
Roemer's assertion was accepted promptly by Huygens and Newton, and many of
his colleagues were quick to rectify the error in his calculations. Thus Newton, in the
first edition of his Opticks (1704), introduces the proposition that 39
Light is propagated from luminous Bodies in time, and spends about seven or eight minutes of an Hour in passing from the Sun to the Earth.

adding that this effect was first observed by Roemer. However, no such acceptance was
found among the Cartesians, and such was the influence of Descartes' ideas that the
Continent remained unconvinced until the brilliant confirming experiments of Bradley
a half century later.
Bradley (1693-1762) was born in Gloucestershire and educated at Oxford. His
interest in astronomy was aroused early by an uncle whose home contained an excellent
amateur observatory, and he became an acute observer through having engaged in a
regular series of observations extending from boyhood. He was elected a member of the
Royal Society in 1718 and three years later was appointed Savilian Professor of
Astronomy at Oxford. He succeeded Halley as Astronomer Royal in 1742 and devoted
the remainder of his life to the Greenwich observatory.
In addition to the discovery of stellar aberration, to be discussed below, Bradley's
minute observations led him to the detection of the nutation of the earth's axis. In an
action so characteristic of his painstaking nature, Bradley refrained from announcing
the discovery of nutation until February 1748, after he had assured himself of its
certainty by careful measurements extending over an entire revolution (18.6 years).
Bradley's discovery and interpretation of the phenomenon of stellar aberration came

† His principal source of error was an oversight. Roemer had used eclipse data from the years 1671-1673 to predict the retardation time, because he had at his disposal many observations from that
period, and also because Jupiter at that time had been making an aphelion passage and thus was at a
nearly constant distance from the sun. However, in 1676 Jupiter was no longer in such a position, and
Roemer failed to account for its changed distance from the sun between eclipses, thus obtaining an
incorrect value for the change in the distance between Earth and Jupiter.
38 O. Roemer, "Demonstration Concerning the Movement of Light," J des Scavans, 233-236; December 7, 1676. (Reprinted in Phil Trans Roy Soc (London), 12, 893-894; June 25, 1677.)
39 I. Newton, Opticks, 4th ed., Book 2, Part 3, Proposition 11, William Innys, Publisher, London, 1730.
(Reprinted by Whittlesey House, McGraw-Hill Book Company, New York, 1931.)


as the result of an effort to detect stellar parallax, which he began in 1725. The absence
of any measurable parallax had long been a stumbling block for adherents of the
Copernican system. Tycho (1546-1601) had recognized earlier that, when viewed from
opposite sides of the earth's orbit, stars should show a displacement in direction, but his
careful observations convinced him that no such displacement so great as one minute of
arc existed. Later observers also had sought this effect in vain, and stellar parallax had
become one of the outstanding problems in astronomy.
Working with improved instruments, Bradley attacked this problem by systematically recording the position of γ Draconis, a bright star in the constellation Draco, at
various times during the year. As shown in Figure 1.3a, what he was seeking was a
difference in the angles $\alpha$ and $\beta$, which certainly should be evident if $r_1$ and $r_2$ were not
too much greater than the diameter of the earth's orbit. It is obvious from the figure
that this parallax effect should be greatest for stars near the ecliptic pole,† and thus
γ Draconis was an ideal choice. The plane containing the ecliptic axis and γ Draconis
cuts the earth's orbit in points the earth occupies in June and December. Thus Bradley
expected to find γ Draconis making its smallest angle to the ecliptic plane in December
and its greatest angle in June. To his surprise, he found that γ Draconis lies closest to
the ecliptic in March and is most elevated in September, the difference in these angles
being about 40 sec of arc.
Bradley checked his findings by observing other stars over a three-year period, always
with similar results. Finally satisfied that the effect was real, he reported 40 his observations in 1728. After carefully eliminating other possible explanations for the effect, he
said
At last I conjectured, that all the Phenomena hitherto mentioned, proceeded from the progressive Motion of Light and the Earth's annual Motion in its Orbit. For I perceived, that,
if Light was propagated in Time, the apparent Place of a fixt Object would not be the same
when the Eye is at Rest, as when it is moving in any other Direction, than that of the Line
passing through the Eye and Object; and that, when the Eye is moving in different Directions, the apparent Place of the Object would be different.

Bradley then proceeded to explain the apparent shift in position of the stars under this
hypothesis. His reasoning can be understood with reference to Figure 1.3b, in which
Cartesian axes have been chosen fixed in the sun, with the Z axis pointing toward the
ecliptic pole and γ Draconis in the XZ plane, close to the Z axis. In March the orbital
velocity of the earth is toward γ Draconis, whereas in September it is away from γ
Draconis. Neglecting the diurnal rotational motion of the earth (which is only about
1 percent of the orbital motion), Bradley reasoned in effect that in March the velocity
components of the light entering his telescope from γ Draconis were $(c_x + v, 0, c_z)$,
whereas in September they were $(c_x - v, 0, c_z)$, with $v$ the orbital speed and $c_x$, $c_z$ the
velocity components of the light relative to the sun. Thus in March he needed to point
his telescope at an angle $\alpha$ above the ecliptic plane given by $\tan\alpha = c_z/(c_x + v)$, and in
September he needed to point his telescope at a slightly higher angle $\beta$ above the

† The earth's orbit lies in the plane of the ecliptic, and the ecliptic pole is the axis perpendicular to this
plane and piercing it at the center of the earth's orbit.
40 J. Bradley, "An Account of a New Discovered Motion of the Fix'd Stars," Phil Trans Roy Soc
(London), 35, 637-660; December 1728.

FIGURE 1.3 Stellar aberration. [Panels (a) and (b); labels: ecliptic pole, ecliptic plane, sun, earth, directions to γ Draconis, and the earth's positions in March, June, September, and December.]

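ecliptic given by $\tan\beta = c_z/(c_x - v)$. Since $\beta - \alpha$ is small,

$$\frac{\tan\beta - \tan\alpha}{1 + \tan\beta\,\tan\alpha} = \frac{2 v c_z}{c^2 - v^2} = \tan(\beta - \alpha) \approx \beta - \alpha$$

from which, because $v \ll c$, it follows that

$$\beta - \alpha \approx 2\,\frac{v}{c}\,\frac{c_z}{c} \tag{1.1}$$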

Upon inserting measured values for $\alpha$, $\beta$, and $v$ into Equation (1.1), Bradley was able to
deduce a value for the velocity of light $c$, since he knew the direction cosine $c_z/c$. In
his own words,
. . . the Velocity of Light [is] to the Velocity of the Eye (which in this Case may be supposed
the same as the Velocity of the Earth's annual Motion in its Orbit) as 10,210 to One, from
whence it would follow, that Light moves, or is propagated as far as from the Sun to the
Earth in 8'12".
It is well known, that Mr. Romer, who first attempted to account for an apparent Inequality in the Times of the Eclipses of Jupiter's Satellites, by the Hypothesis of the progressive
Motion of Light, supposed that it spent about 11 Minutes of Time in its Passage from the
Sun to us: but it hath since been concluded by others from the like Eclipses, that it is propagated as far in about 7 Minutes. The Velocity of Light therefore deduced from the foregoing
Hypothesis, is as it were a Mean betwixt what had at different times been determined
from the Eclipses of Jupiter's Satellites.

Bradley's value for the time of passage of light from the sun to the earth translates into a light velocity of 189,000 mi/sec, a value in close agreement with modern
measurements.
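Bradley's figures can be checked with a short calculation. The sketch below assumes a circular orbit, a modern year length and Earth-Sun distance, and a direction cosine $c_z/c$ of 1 (a star exactly at the ecliptic pole); none of these values comes from Bradley's paper, and the function names are illustrative only.

```python
import math

# ASSUMPTIONS (not from Bradley): modern year length and Earth-Sun distance,
# a circular orbit, and c_z/c = 1 for a star at the ecliptic pole.
YEAR_S = 365.25 * 24 * 3600
AU_MILES = 92.96e6
ARCSEC = math.pi / (180 * 3600)   # one second of arc, in radians


def sun_to_earth_time(c_over_v: float) -> float:
    """Seconds for light to travel 1 AU, given the ratio c/v.
    For a circular orbit v = 2*pi*AU/year, so AU/c = year / (2*pi * c/v)."""
    return YEAR_S / (2 * math.pi * c_over_v)


def c_over_v_from_shift(shift_arcsec: float, cz_over_c: float = 1.0) -> float:
    """Invert Equation (1.1): beta - alpha ~ 2 (v/c)(c_z/c)."""
    return 2 * cz_over_c / (shift_arcsec * ARCSEC)


t = sun_to_earth_time(10_210)
minutes, seconds = divmod(t, 60)
print(f"Sun-to-Earth time: {int(minutes)} min {seconds:.0f} s")
print(f"Velocity of light: {AU_MILES / t:,.0f} mi/sec")
print(f"c/v from a 40-arcsec shift: {c_over_v_from_shift(40):,.0f}")
```

Under these assumptions the ratio 10,210 reproduces a transit time close to Bradley's 8'12", and a total shift of 40 sec of arc gives a ratio of the same order.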
Bradley termed this effect which shifts the apparent position of a star aberration.
When his findings became widely known, all sensible objection to the view that the
velocity of light is great, but finite, ceased to exist.
The first attempt to measure the velocity of light using a purely terrestrial method
was made by Fizeau in 1849. He employed a large toothed wheel as a light chopper
and selective receiver, sending light pulses to a remote mirror at a known distance.
Upon their return, the pulses would be unable to get past a tooth which had moved
over to replace a space, if the rotational speed of the wheel were a critical value; this
fact was used to deduce the time taken for a pulse to travel from the wheel to the
distant mirror and back, from which the velocity of light followed immediately.
A lifelong resident of Paris, Fizeau (1819-1896) devoted his long and productive
career to scientific research. With Foucault, he conducted an extensive series of experiments on interference of both light rays and heat rays. He explained the Doppler effect,
made valuable discoveries related to the polarization of light, and applied the principle
of light interference to the measurement of the dilatation of crystals. He is best remembered for determinations of the velocity of light in air and in moving water. The latter
determination played a significant role in the development of the special theory of
relativity and will be discussed in Chapter 2. Fizeau's determination of the velocity
of light in air was accomplished earlier, in 1849, with an apparatus which is suggested
in simplified form by Figure 1.4.
FIGURE 1.4 Fizeau's apparatus.

In this experiment, light from a source S was focused at f by means of the lens L1
and the half-silvered mirror P. The principal focus of the lens L2 was made to coincide
with f so that a parallel beam of light emerged from the apparatus and traveled to a
distant station consisting of the lens L3 and the spherical mirror M. This beam was
focused by L3 on M, whose center of curvature was chosen to lie in L3. Thus the reflected
beam emerged from L3 in a parallel pencil and was brought to a focus at f, from whence
it diverged to fall upon the half-silvered mirror P and be partially transmitted to the
eyepiece V.
When a toothed wheel W was inserted in the light path at f, an image of the source S
could be seen at V unless f were blocked by the presence of a tooth. Fizeau used a
wheel with 720 teeth separated by spaces congruent to the teeth, and connected the
wheel to a clockwork driven by weights, thus using the wheel to pulse the light. With
the wheel rotating very slowly, the image of S would appear and disappear successively
as the spaces and teeth passed before f. However, if the speed were increased to the point
that several teeth per second passed f, the persistence of vision would render a permanent image at half the intensity which had been seen with the wheel at rest and two
teeth straddling f.
When the speed of the toothed wheel was increased further, because of the finite
velocity of light, a sensible part of the light transmitted through a space toward M
would, upon returning, fall upon the adjacent tooth and be intercepted, thus decreasing
the intensity of the image. If the rotational speed became great enough so that, when
the light returned, the tooth had just moved into the position previously occupied
by the space, then all the returning light was intercepted and the image at V was
totally extinguished.
What occurred, therefore, was that at first a bright image was observed, which
faded away as the rotational speed increased to a value just sufficient to replace a space
by a tooth in the time T it took light to travel from f to M and back. When the rotational
speed was increased further, the image returned, increasing in brightness until a maximum was reached corresponding to one space replacing another in time T. Having
thus reached a maximum, the image would fade away again, and so on in succession
for higher and higher speeds.
From his knowledge of the wheel geometry and a measurement of the rotational


speed during image eclipse, Fizeau was able to deduce T and thus the velocity of light,
since he knew the distance from f to M. In reporting this experiment,41 he said
. . . the result turned out very well, and one was able to observe, depending on whether
the speed of rotation was more or less, a bright point of light or a total eclipse. Under the
conditions in which the experiment was performed, the first eclipse occurred for 12.6 rotations per second. For double that speed, a new bright point; for triple, a second eclipse . . .
and so forth.
The first station was placed in the belvedere of a house situated at Suresnes, the second
on the top of Montmartre, at a distance of approximately 8633 meters . . . .
These first attempts furnished a value for the velocity of light which differs but little from
that which has been obtained by astronomers. The mean deduced from twenty-eight observations made so far give for its value 70,948 leagues† . . . .
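The numbers Fizeau quotes are enough to reproduce his result. The following is a minimal sketch, assuming only the figures given above (720 teeth, first eclipse at 12.6 rotations per second, stations 8633 meters apart); the function name is illustrative.

```python
def fizeau_speed(teeth: int, rev_per_s: float, distance_m: float) -> float:
    """Speed of light (m/sec) from the rotation rate at the first eclipse.

    The wheel carries `teeth` teeth and as many equal spaces, so a tooth
    replaces a space after 1/(2*teeth) of a revolution. At the first
    eclipse that interval equals the round-trip time T over 2*distance_m.
    """
    T = 1.0 / (2 * teeth * rev_per_s)
    return 2 * distance_m / T


c = fizeau_speed(teeth=720, rev_per_s=12.6, distance_m=8633)
print(f"c = {c:.3e} m/sec")
```

The result is close to 3.13 × 10^8 m/sec, in line with the equivalent value quoted for Fizeau's figure in leagues.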

Fizeau's technique was limited in its accuracy because it was difficult to judge
just when the image had reached maximum or minimum intensity. Foucault devised
a modification of the apparatus which overcame this limitation by replacing the toothed
wheel with a rotating mirror. This mirror caused a measurable displacement of the
image, thus providing a determination of the velocity of light. In 1850 Foucault used
this apparatus to measure the relative velocities of light in air and water, and in 1862
he used an improved version to make an absolute determination of the velocity of light
in air.
Foucault (1819-1868) was also a Parisian, the son of a publisher. He originally
studied for a medical career but then abandoned it for physical science. With Fizeau
he carried on a series of investigations on the intensity of the light of the sun, as well
as the above-mentioned interference experiments. He established that the velocity of
light is inversely proportional to the refractive index of the medium, thus contributing
to the overthrow of the corpuscular theory. In 1851 he demonstrated the diurnal
motion of the earth via what has come to be known as the Foucault pendulum, and in
1852 he invented the gyroscope; for these two achievements he received the Copley
medal in 1855.
The 1862 determination of the velocity of light was achieved with the apparatus
shown in Figure 1.5. Foucault let solar light, transmitted from a rectangular aperture S,
pass through a half-silvered mirror P and fall upon the achromatic lens L. The light
then proceeded to a rotatable plane mirror R, which was initially fixed at the proper
angular position to bring the rays to a focus at the point M. A concave mirror fixed at
M, with a radius of curvature equal to RM, then reflected the light along a return
path such that half of the light came to a focus at A, to be viewed by a micrometer
eyepiece. A fine grating was stretched over the slit at S, so that the image at A was
crossed by dark lines, above which a cross-hair of the eyepiece could be positioned
accurately.
When the mirror R was rotated, it acted as a light chopper, in that only when R

† The league is an itinerary measure of distance which varies from country to country but is usually
estimated at about 3 mi. Fizeau used it in a precise sense such that his result was equivalent to a light
velocity of 3.13 × 10^8 m/sec or 194,000 mi/sec.
41 A. H. Fizeau, "On an Experiment Relative to the Speed of Propagation of Light," Compt Rend, 29,
90-92; July 1849.

FIGURE 1.5 Foucault's apparatus.

was in the proper angular position to deliver light to M would an image be seen at the
eyepiece. However, during the time T light takes to travel from R to M and back, the
mirror would rotate an additional angular amount $\alpha = \omega T$ in which $\omega$ was the angular
velocity of the mirror. This caused the reflected beam to be deflected an angle $2\alpha$, thus
shifting the image from A to A'. By measuring the displacement AA' and the rotational speed $\omega$, since he knew the relative positions of the components of his apparatus,
Foucault was able to determine T and thus the velocity of light.
Foucault placed the mirrors R and M an equivalent distance of 20 m apart through
the use of multiple reflections, and turned the mirror R at speeds up to 1,000 revolutions per second, obtaining image displacements in the order of 1 mm. Of his results
he said 42
Definitively, the velocity of light has been found to be noticeably diminished. Earlier data
had indicated that the velocity was 308 millions of meters per second, and this new experiment with the turning mirror gives a value, in round numbers, of 298 millions.
One is able, it seems to me, to count on the exactness of this number, in the sense that the
corrections it would have to suffer should not change its value more than 500,000 meters.
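The rotating-mirror relation (a rotation $\omega T$ during the round trip, doubled on reflection) can be put into a short sketch. It uses only figures quoted above: 20 m between R and M, 1,000 revolutions per second, and Foucault's 298 millions of meters per second. The function names are illustrative, and the optical lever arm needed to turn an angle into a displacement AA' is not given in the text, so only the angle is computed.

```python
import math


def beam_deflection(distance_m: float, rev_per_s: float, c: float) -> float:
    """Angle (radians) through which the returning beam is deflected.

    T = 2*distance/c is the round-trip time from R to M and back,
    omega*T is the mirror's rotation in that time, and reflection
    doubles the angle.
    """
    omega = 2 * math.pi * rev_per_s
    T = 2 * distance_m / c
    return 2 * omega * T


def speed_from_deflection(distance_m: float, rev_per_s: float,
                          deflection_rad: float) -> float:
    """Invert the relation above to recover c from a measured deflection."""
    omega = 2 * math.pi * rev_per_s
    return 4 * omega * distance_m / deflection_rad


defl = beam_deflection(20.0, 1000.0, 298e6)
c_back = speed_from_deflection(20.0, 1000.0, defl)
print(f"deflection = {defl:.3e} rad")
print(f"recovered c = {c_back:.3e} m/sec")
```

A deflection of under two milliradians makes plain why the image displacement stayed near a millimeter for any lever arm of modest length.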

Despite the confidence expressed by Foucault in this determination, his apparatus
also suffered from a serious limitation. The distance RM could not be increased significantly without diminishing the intensity of the image at A', since the intensity
of the light reflected from M was attenuated as (RM)^2 before returning to R. But with
RM at 20 m and extremely high speeds for the rotating mirror, the displacement AA'
was still small enough to be subject to considerable error.
Michelson eliminated this drawback by placing the lens L between R and M so that
S lay at its principal focus, thus providing a parallel beam to travel to M. The mirror M
could then be made plane and placed at a much larger distance from R, thus enhancing
the displacement AA'; indeed, Michelson was able to achieve such great image displacements that he eliminated the half-silvered mirror P. His simplified version of the
42 J. B. L. Foucault, "Experimental Determination of the Velocity of Light," Compt Rend, 55, 501-503; September 1862.

FIGURE 1.6 Michelson's apparatus.

apparatus is shown in Figure 1.6. About this apparatus and his measurements, Michelson said 43
In the following experiments the distance between the mirrors was nearly 2000 feet
and the speed of the mirror was about 257 revolutions per second. The deflection exceeded
133 millimeters, being about 200 times as great as that obtained by Foucault. If it were
necessary it could be still further increased. This deflection was measured within three
or four hundredths of a millimeter in each observation; and it is safe to say that the result,
so far as it is affected by this measurement, is correct to within one ten-thousandth part.
The revolving mirror was actuated by a current of air . . . . To regulate and measure
the speed of rotation a tuning fork, bearing on one prong a steel mirror, was employed. This
was kept in vibration by a current of electricity. The fork was so placed that the light from
the revolving mirror was reflected to a piece of plane glass in front of the eye-piece, and
thence reflected to the eye. When fork and mirror are both at rest, an image of the revolving
mirror is perceived. When the fork vibrates, this image is drawn out into a band of light.
When the mirror commences to revolve, this band breaks up into a number of moving
images of the mirror; and when, finally the mirror makes as many turns as the fork makes
vibrations, or any multiple . . . of this number, the images become stationary . . . .
The electric fork made about 128 vibrations per second. No dependence was placed upon
this rate, however, but at each set of observations it was compared with a standard Ut3
fork, the temperature being noted at the time.

Being thus assured of great accuracy in both of the critical measurements (image
displacement and mirror velocity), Michelson listed 200 data points, each of which was
the mean of 10 separate observations, and concluded that the velocity of light in air was
299,740 km/sec, being thus 299,820 km/sec in vacuo. In 1882 he repeated the experiment
and announced a new value for the velocity of light in vacuo, 299,853 km/sec. This was
to remain the accepted figure for forty-five years, and when it was replaced by a more
precise figure, Michelson was once again involved in the determination.
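The step from the air figure to the vacuum figure follows from the inverse proportionality between velocity and refractive index noted earlier. A minimal sketch is given below; the index of air used is an assumed typical modern value and is not quoted in the text.

```python
# ASSUMPTION: refractive index of air for visible light near room conditions.
# A typical modern figure, not a value taken from Michelson's paper.
N_AIR = 1.00027


def vacuum_speed(speed_in_air_km_s: float, n: float = N_AIR) -> float:
    """Velocity in vacuo from the velocity measured in a medium of index n."""
    return speed_in_air_km_s * n


c_vac = vacuum_speed(299_740)
print(f"in vacuo: {c_vac:,.0f} km/sec")
```

With this assumed index the correction is about 80 km/sec, which matches the difference between the two figures quoted above.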
Albert A. Michelson (1852-1931) was born in Poland but emigrated to America
with his parents at the age of two. They settled in the West following the gold rush
and he was raised in a mining town. A rare presidential appointment as midshipman
at the Naval Academy insured his college education and stimulated his interest in
science. Upon graduation he became an instructor at Annapolis and embarked on his
43 A. A. Michelson, "Experimental Determination of the Velocity of Light," Am J Sci, 18, 390-393;
November 1879.


first determination of the velocity of light, described above. There followed a period of
study in Europe during which he invented the interferometer and with it performed
the first ether drift experiment. Upon returning to the United States, he teamed with
Professor Morley to improve the interferometer and repeat this celebrated experiment
which has so influenced the subject of relativity. They also collaborated in a precise
repetition of Fizeau's moving-water experiment and in the establishment of the wavelength of sodium light as a standard of length.
Michelson's ingenuity at optical instrumentation also led to the development of an
echelon spectroscope, to a determination of the rigidity of the earth, and to measurements of the distances and diameters of giant stars. In recognition of his many contributions to physics, he was awarded the Nobel prize in 1907, the first American
scientist so honored.
In 1923 Michelson was asked to go to Pasadena to make another determination of
the speed of light, and this he accomplished with the apparatus shown in Figure 1.7.

FIGURE 1.7 Michelson's improved apparatus. [From Michelson and the Speed of Light by Bernard Jaffe. (Science Study Series). Copyright 1960 by Educational Services Incorporated. Reprinted by permission of Doubleday & Company, Inc.] Labeled components: arc light source, slit, lenses, prism, rotating octagonal prism on Mt. Wilson, mirror on Mt. Wilson, fixed mirror on Mt. San Antonio, observer.

The principle of operation was still the same, although many refinements of the original
apparatus are evident. An eight-sided rotating prism of nickel-steel, with its mirror
surfaces polished true to one part in a million, was used in place of the single rotating
mirror. Once again, an air blast was used to actuate the mirror system, and a tuning-fork stroboscope to measure its rotational speed. The two stations were considerably
farther apart, being placed on Mt. Wilson and Mt. San Antonio. The United States
Coast and Geodetic Survey established the distance between these stations within a
fraction of an inch in 22 miles. The intensity of the image was enhanced by using large
parabolic mirrors at both stations. Many observations yielded a mean value for the
velocity of light of 299,798 km/sec.
But Michelson was not yet through. He wanted to measure the velocity of light in as
near perfect a vacuum as possible, free from the obstruction of haze or smoke. A mile-long tube of corrugated steel was constructed and evacuated down to a pressure of
½ mm, with a version of the apparatus of Figure 1.7 enclosed. Unfortunately, Michelson did not live to see the end of this experiment, succumbing two years before its


completion. His colleagues made almost 3,000 independent observations, reporting 44 a
mean figure for the velocity of light in vacuum of 299,774 km/sec.
The value 299,792.5 km/sec in vacuo has been adopted as the velocity of light by the
International Union of Geodesy and Geophysics and by the International Scientific
Radio Union. This fundamental constant is within the limits of error of Michelson's
final figure.

1.3 SOUND WAVES AND LIGHT WAVES

The previous two sections have indicated that light as a wave phenomenon has characteristics common to those of all other types of waves. These include a wavelength, a
frequency, and their product the wave velocity, as well as a variety of interference
effects. However, light has one characteristic which makes it unique-it can propagate
in the absence of a tangible medium. This feature will prove to be of fundamental
significance.
It is instructive to contrast the properties of light with those of other wave phenomena. A comparison of the behavior of sound waves and light waves in air is a good
illustrative example, because the air can be permitted to become increasingly rarefied,
approaching in the limit the absence of a tangible medium.
The Acoustic Wave Equation.
Sound waves in air consist of longitudinal molecular
vibrations, resulting in alternate compression and rarefaction of the air. If one considers the case in which sound is propagating in the positive X direction, the molecules
which (on the average) lie in a plane x = constant will (on the average) oscillate in the
X direction. As seen in Figure 1.8, their instantaneous average position will be $x + \xi(x,t)$
in which $\xi(x,t)$ is the time-varying displacement around the average position $x$. Similarly,
the average position of molecules at an adjacent cross section will be $x + dx + \xi(x + dx, t)$.
For unit transverse area, the instantaneous volume between these two planes of molecules is

$$[x + dx + \xi(x + dx, t)] - [x + \xi(x,t)] = \left(1 + \frac{\partial \xi}{\partial x}\right) dx \tag{1.2}$$

and thus the fractional change in volume is $\partial\xi/\partial x$. Since the average number of molecules in this volume is a constant, it follows that the density is fluctuating. If the
instantaneous density is designated by $\rho_0 + \rho_1(x,t)$, then

$$[\rho_0 + \rho_1(x,t)]\left(1 + \frac{\partial \xi}{\partial x}\right) dx = \text{constant} = \rho_0\, dx \tag{1.3}$$

When it is assumed that the density fluctuation $\rho_1(x,t)$ is small compared to the average
value $\rho_0$ and that the fractional change in volume $\partial\xi/\partial x$ is small compared to unity,
Equation (1.3) yields the first-order result

$$\frac{\rho_1(x,t)}{\rho_0} = -\frac{\partial \xi}{\partial x} \tag{1.4}$$

44 A. A. Michelson, F. G. Pease, and F. Pearson, "Measurement of the Velocity of Light in a Partial
Vacuum," Astrophys J, 82, 26-61; July 1935.

FIGURE 1.8 Average behavior of layers of air molecules in presence of sound waves. [The figure marks a layer of molecules at its average position $x$ and its instantaneous displaced position $x + \xi(x,t)$, and an adjacent layer at its average position $x + dx$ and its instantaneous displaced position $x + dx + \xi(x + dx, t)$.]

The fluctuations in density of the air as the sound waves pass through are so rapid
that the air does not transfer heat. The compressions and rarefactions are thus adiabatic, and the process conforms to the gas law equation

$$p V^{\gamma} = \text{constant} \tag{1.5}$$

in which $p$ is the pressure, $V$ the volume, and $\gamma$ is the ratio of specific heats at constant
pressure and constant volume.


Since it has been observed that the volume occupied by a fixed number of molecules
is fluctuating, it follows from (1.5) that the total pressure is varying also. Thus one
may write

$$p = p_0 + p_1(x,t) \tag{1.6}$$

in which $p_1(x,t)$ is the small fluctuation around the relatively large constant average
pressure $p_0$.
Taking the total differential of (1.5) and then dividing by (1.5) itself, one obtains

$$\frac{dp}{p} = -\gamma\,\frac{dV}{V}$$

which yields the first-order result

$$\frac{p_1(x,t)}{p_0} = -\gamma\,\frac{\partial \xi}{\partial x} \tag{1.7}$$

because it has been noted, in connection with Equation (1.2), that $\partial\xi/\partial x$ is the fractional change in volume.
Newton's force law can be applied to the segment of air between the two adjacent
cross sections. The net force per unit transverse area acting on the molecules is
$-[p_1(x + dx, t) - p_1(x,t)]$. Since to first order the mass is $\rho_0\,dx$, one may write

$$\rho_0\,\frac{\partial^2 \xi}{\partial t^2} = -\frac{\partial p_1}{\partial x} \tag{1.8}$$

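Combination of (1.8) with the spatial derivative of (1.7) yields the wave equation

$$\frac{\partial^2 \xi}{\partial x^2} = \frac{1}{c_s^2}\,\frac{\partial^2 \xi}{\partial t^2} \tag{1.9}$$

in which

$$c_s = \left(\frac{\gamma p_0}{\rho_0}\right)^{1/2} \tag{1.10}$$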
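As a numerical illustration of Equation (1.10), the sketch below evaluates the sound speed for typical sea-level air. The three constants are assumed standard values chosen for illustration and do not appear in the text.

```python
import math

# ASSUMPTION: typical sea-level values for dry air near 15 C;
# none of these numbers is taken from the text.
GAMMA = 1.40      # ratio of specific heats for air
P0 = 101_325.0    # average pressure, Pa
RHO0 = 1.225      # average density, kg/m^3


def sound_speed(gamma: float, p0: float, rho0: float) -> float:
    """Equation (1.10): c_s = (gamma * p0 / rho0)^(1/2)."""
    return math.sqrt(gamma * p0 / rho0)


c_s = sound_speed(GAMMA, P0, RHO0)
print(f"c_s = {c_s:.0f} m/sec")
```

These inputs give a value of roughly 340 m/sec, the familiar order of magnitude for sound in air.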

The reader will have little difficulty convincing himself that the general solution of
(1.9) is

$$\xi(x,t) = f(x - c_s t) + g(x + c_s t) \tag{1.11}$$

in which $f$ and $g$ are arbitrary functions. At a time $t_1$ the spatial distribution of $f$ is
$f(x - c_s t_1)$, as illustrated in Figure 1.9. At a later time $t_2$ it is

$$f(x - c_s t_2) = f(\{x - c_s(t_2 - t_1)\} - c_s t_1)$$

-....c,

~c,

r--~-----~~-____:lI~-

FIGURE

1.9

t----.-..;..-----~~-____:lI-X

Traveling sound waves.

32

The Phenomenon of Light

CHAPTER 1

and is therefore the same spatial distribution as earlier, but shifted along the X axis a
distance cs(t2 - t 1). For this reason .r(x - cst) represents a wave of arbitrary but constant spatial shape, traveling in the + X direction at speed Ca. Similarly, g(x + cst)
represents an arbitrary wave traveling in the - X direction at speed C8 \ The speed of
these waves is seen, from Equation (1.10), to depend on the conditions of the medium,
namely, the pressure and density of the air. If the air is sufficiently well approximated
by the ideal gas law]

pV='JLRT
in which 'JL is the number of moles, then
CS

'JLRT)~

l' - -

poV

I"..J

T~2

(1.12)

since 'JLI Po V is a constant. Therefore this first-order theory yields the result that the
propagation velocity of sound waves in air depends only on the temperature of the air.
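As a rough numerical illustration of (1.10) and (1.12), the sketch below evaluates the speed of sound both from the pressure and density of the medium and from the temperature alone. The constants for air (ratio of specific heats 1.4, mean molar mass 0.028964 kg/mol) are assumed values introduced here for illustration, and the molar mass M stands for the constant ratio of mass to number of moles in (1.12); the function names are hypothetical.

```python
import math

GAMMA = 1.4          # ratio of specific heats, assumed value for air
R = 8.314462618      # molar gas constant, J/(mol K)
M_AIR = 0.028964     # mean molar mass of dry air in kg/mol, assumed value


def sound_speed_from_state(p0: float, rho0: float, gamma: float = GAMMA) -> float:
    """Equation (1.10): c_s = sqrt(gamma * p0 / rho0)."""
    return math.sqrt(gamma * p0 / rho0)


def sound_speed_from_temperature(temp_k: float, gamma: float = GAMMA,
                                 molar_mass: float = M_AIR) -> float:
    """Equation (1.12) with n R T / (rho0 V) written as R T / M."""
    return math.sqrt(gamma * R * temp_k / molar_mass)


if __name__ == "__main__":
    temp_k = 293.15                           # 20 degrees Celsius
    p0 = 101_325.0                            # Pa, assumed ambient pressure
    rho0 = p0 * M_AIR / (R * temp_k)          # ideal-gas density at this state
    print("c_s from (1.10):", sound_speed_from_state(p0, rho0), "m/s")
    print("c_s from (1.12):", sound_speed_from_temperature(temp_k), "m/s")
    # Halving the pressure at fixed temperature halves the density too,
    # so the speed from (1.10) is unchanged.
    print("c_s at half pressure:", sound_speed_from_state(p0 / 2, rho0 / 2), "m/s")
```

Both routes give a speed near 343 m/s at room temperature, and changing the pressure at fixed temperature leaves the result unchanged, in line with the conclusion drawn from (1.12).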
Propagation Independent of Source. A significant feature of Equation (1.10) is
its suggestion that $c_s$ is independent of the motion of the source of the sound waves and
is governed solely by the properties of the medium. This suggestion is confirmed by
experiment and is reasonable when one considers that only the air molecules in the
proximity of the source make contact with it, all others depending for their excitation
on somewhat-ordered collisions with their neighbors.
The fact that sound waves have a velocity controlled only by the medium and independent of the motion of the source can be used to explain the Doppler effect. This
effect is familiar through the common example of an approaching locomotive. As shown
in Figure 1.10, at an instant when the diaphragm of the locomotive's horn is in its most
forward position, the air adjacent to the diaphragm suffers a compression, and this
compression travels forward at a velocity $c_s$. If $\tau$ is the period of oscillation of the
diaphragm, then $\tau$ seconds later the next compression of air is about to be launched
from the horn. At this moment, the earlier compression is a distance $\lambda = (c_s - v)\tau$ in
front of the horn, with $v$ the speed of the locomotive. Here $\lambda$ is the separation between points
in the wave train representing positions of successive maximum compression and is
thus the wavelength. The frequency of the sound wave is therefore

$$\nu = \frac{c_s}{\lambda} = \frac{c_s}{c_s - v}\,\nu_0 \tag{1.13}$$

in which $\nu_0 = 1/\tau$ is the frequency the sound wave would have if the locomotive were
at rest ($\nu_0$ is also the frequency of oscillation of the diaphragm). Equation (1.13) has
been amply confirmed by experiment.
Thus the motion of the source of a sound wave affects both its frequency and wavelength but in such a way that their product remains constant at the value $c_s$ given by
(1.10).
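The relation (1.13) is easy to evaluate directly. The following sketch computes the shifted frequency and wavelength for an approaching source and confirms that their product stays equal to the speed of sound. The numerical values (a 440 Hz horn, a source speed of 30 m/s, a sound speed of 343 m/s) are assumed for illustration only, and the function names are hypothetical.

```python
def doppler_wavelength(nu0: float, v: float, c_s: float) -> float:
    """Wavelength ahead of a source approaching at speed v: (c_s - v) * tau."""
    if v >= c_s:
        raise ValueError("source speed must be below the speed of sound")
    tau = 1.0 / nu0                      # period of the diaphragm
    return (c_s - v) * tau


def doppler_frequency(nu0: float, v: float, c_s: float) -> float:
    """Equation (1.13): nu = c_s / lambda = c_s / (c_s - v) * nu0."""
    return c_s / doppler_wavelength(nu0, v, c_s)


if __name__ == "__main__":
    nu0, v, c_s = 440.0, 30.0, 343.0     # assumed illustrative values
    lam = doppler_wavelength(nu0, v, c_s)
    nu = doppler_frequency(nu0, v, c_s)
    print("wavelength:", lam, "m")
    print("frequency: ", nu, "Hz")
    print("product:   ", lam * nu, "m/s")   # stays equal to c_s
```

A negative value of v in this sketch corresponds to a receding source, for which the wavelength is lengthened and the frequency lowered.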
FIGURE 1.10 The Doppler effect in sound waves. (The horn moves at velocity v and the sound disturbance at velocity c_s. Between the instant the diaphragm is in its forward position and the instant it is again in its forward position one period τ later, the horn advances a distance vτ, and the earlier compression is then a distance (c_s - v)τ ahead of it.)

† This approximation becomes better as the air is rarefied.

Acoustic Power. The rate at which energy is being transmitted by the sound wave,
per unit transverse area of the wavefront, is called the intensity, and will be denoted
by $T$. Consider a column of air of unit cross section, extending to infinity from the layer
of molecules whose average position is $x$. The net force on this column is $p_1(x,t)$ and
during a time interval $dt$ the column is compressed an amount $(\partial\xi/\partial t)\,dt$ so that the
work done on the column during this interval is $p_1(\partial\xi/\partial t)\,dt$. With the aid of (1.7), the
rate of energy flow into the column can thus be written

$$T = p_1\frac{\partial\xi}{\partial t} = -\gamma p_0\,\frac{\partial\xi}{\partial x}\,\frac{\partial\xi}{\partial t} \tag{1.14}$$
For a simple harmonic wave traveling in the positive X direction one can write

$$\xi = A\cos\frac{2\pi}{\lambda}(x - c_s t) \tag{1.15}$$

which is a special case of (1.11). In this equation, $A$ is a constant (the amplitude of
molecule oscillation) and $\lambda$ is the wavelength of the sound disturbance. Since $c_s = \lambda\nu$,
introducing the wave number $k = 2\pi/\lambda$ and the angular frequency $\omega = 2\pi\nu$ enables
one to rewrite Equation (1.15) in the form

$$\xi = A\cos(\omega t - kx) \tag{1.16}$$

Substitution of (1.16) into (1.14) gives

$$T = \rho_0 c_s \omega^2 A^2 \sin^2(\omega t - kx) \tag{1.17}$$

At any cross section the time average flow is therefore

$$\overline{T} = \tfrac{1}{2}\rho_0 c_s \omega^2 A^2 \tag{1.18}$$


Equation (1.18) reveals that, if the air is increasingly rarefied, the intensity of a
sound wave diminishes. This occurs because the density $\rho_0$ decreases, whereas, if the
temperature remains constant, $c_s$ is unaffected (cf. Equation (1.12)); the amplitude of
molecule oscillation A is limited by the finite amplitude of oscillation of the source. In
the limit, with no molecules to transfer the oscillations to their neighbors, no acoustic
power can be transmitted, and the sound wave ceases to exist.
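The dependence of intensity on density can be made concrete with a short calculation. The sketch below evaluates the instantaneous intensity of (1.17), averages it numerically over one period, and compares the result with the closed form of one half the peak value, (1/2) rho0 c_s omega^2 A^2, which is the time average referred to as (1.18). The numerical inputs (air density 1.2 kg/m^3, 343 m/s, a 1 kHz tone, a 10 nm displacement amplitude) are assumed for illustration, and the function names are hypothetical.

```python
import math


def instantaneous_intensity(rho0, c_s, omega, amplitude, t, x=0.0):
    """Equation (1.17): rho0 * c_s * omega^2 * A^2 * sin^2(omega t - k x)."""
    k = omega / c_s                      # wave number, since c_s = omega / k
    return rho0 * c_s * omega**2 * amplitude**2 * math.sin(omega * t - k * x) ** 2


def mean_intensity(rho0, c_s, omega, amplitude):
    """Time average of (1.17): one half of its peak value."""
    return 0.5 * rho0 * c_s * omega**2 * amplitude**2


def numerical_time_average(rho0, c_s, omega, amplitude, samples=10_000):
    """Midpoint-rule average of the instantaneous intensity over one period."""
    period = 2.0 * math.pi / omega
    total = sum(
        instantaneous_intensity(rho0, c_s, omega, amplitude, (i + 0.5) * period / samples)
        for i in range(samples)
    )
    return total / samples


if __name__ == "__main__":
    rho0, c_s = 1.2, 343.0               # assumed values for air
    omega, amplitude = 2.0 * math.pi * 1000.0, 1e-8
    print("closed form :", mean_intensity(rho0, c_s, omega, amplitude), "W/m^2")
    print("numerical   :", numerical_time_average(rho0, c_s, omega, amplitude), "W/m^2")
    # Rarefying the air at fixed temperature lowers rho0 but not c_s.
    print("at rho0 / 10:", mean_intensity(rho0 / 10, c_s, omega, amplitude), "W/m^2")
```

Because the average is proportional to the density, reducing the density tenfold at fixed temperature reduces the transmitted power tenfold for the same source amplitude.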
This discussion can be summarized by saying that sound waves cannot exist without
the presence of a tangible medium, but that they are characterized by a wave velocity
which depends on the properties of the medium but not on the motion of the source.
These remarks are equally true of water waves, elastic waves in solids, etc.
Comparison. Does light share these characteristics? With respect to the requirement of a tangible medium, the answer is no. Light can propagate in gaseous, liquid,
and solid media, but it does not require the presence of these media to exist. Indeed, it
can propagate in the almost complete vacuum which separates the stars from each
other, and many times has been shown to traverse man-made vacua with an intensity
no less than it had when air was present. For example, Michelson's last experiments on
the determination of the speed of light were performed in a huge evacuated tunnel. In
this respect light† as a wave phenomenon is unique in not requiring a tangible medium
for its existence.
Does light share the second characteristic, that is, does it possess a wave velocity
which is independent of the motion of the source? An indication that it does was provided when Maxwell discovered that wavelike solutions to his equations described
electromagnetic fields which would propagate through space at the velocity of light,
leading him to assert that light is an electromagnetic phenomenon. But the equation he
used to obtain these wavelike solutions was similar to (1.9), the wave equation for
sound. Thus just as in the case of acoustic disturbances, Maxwell's analysis suggested
that the velocity of light should be completely independent of its source.
There is also strong experimental evidence to support this view. W. de Sitter⁴⁵ has
analyzed with great care the dynamics of eclipsing binary stars. Were the velocity of
light dependent on the motion of the source, it is apparent that the time for light to
reach the earth from the approaching star of a binary would be different from the time
for light to reach the earth from the receding star. De Sitter deduced that this would
introduce apparent eccentricities in their orbits as they circled each other, but such
eccentricities have never been observed. Some binary stars are at such a distance from
the earth and have sufficiently high orbital velocities that this effect could scarcely
escape observation. Because of this evidence the postulate will be accepted that light,
in common with all other wave phenomena, has a velocity which does not depend on the
motion of the source. (Many successful Doppler radar systems have been built under
this assumption.)
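The size of the effect de Sitter looked for can be estimated with a few lines of arithmetic. The sketch below supposes, contrary to the accepted postulate, that light leaves a star at speed c + v when the star approaches and c - v when it recedes, and computes the resulting spread in arrival times. The distance and orbital speed are hypothetical round numbers chosen for illustration, not data from de Sitter's paper, and the function name is an assumption.

```python
C = 299_792_458.0                 # speed of light in m/s
LIGHT_YEAR = C * 365.25 * 86_400  # metres travelled by light in one Julian year
DAY = 86_400.0                    # seconds


def arrival_time_spread(distance_m: float, orbital_speed: float) -> float:
    """Difference in travel time if light speed depended on source motion.

    Assumes (for the sake of argument) speed c + v from the approaching star
    and c - v from the receding star.
    """
    t_approaching = distance_m / (C + orbital_speed)
    t_receding = distance_m / (C - orbital_speed)
    return t_receding - t_approaching


if __name__ == "__main__":
    distance = 100.0 * LIGHT_YEAR   # hypothetical distance of the binary
    speed = 1.0e5                   # hypothetical orbital speed, 100 km/s
    spread = arrival_time_spread(distance, speed)
    print("arrival-time spread:", spread / DAY, "days")
    print("first-order estimate 2 L v / c^2:", 2 * distance * speed / C**2 / DAY, "days")
```

For these assumed figures the spread comes to roughly 24 days, which would be comparable to or longer than the orbital period of many close binaries and would grossly distort their apparent orbits.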
† The term "light" is used here in the broad sense to include the nonvisible portions of the
electromagnetic spectrum.
45 W. de Sitter, "An Astronomical Argument for the Constancy of the Velocity of Light," Z Phys, 14,
429; May 15, 1913.

The Ether. It has been noted earlier, in Section 1.1, that light was not really accepted as being wavelike in nature until the middle of the nineteenth century. By that
time many other wave phenomena were well understood. Since these other wave
phenomena all required a medium for transmission, it was natural to believe that light
did also, even after it was appreciated that light could propagate in a vacuum. Thus an
intangible medium was hypothesized to provide the support for light waves. The ether,
as this medium was called, being intangible, was endowed with extraordinary properties not shared by any other known medium. These included the ability to pass through
all substances without frictional resistance and the property of being mass-less and thus
unaffected by gravitation. Despite the mystical aspects of this hypothesis, most nineteenth-century scientists firmly believed in the existence of the ether and many serious
scientific experiments were undertaken to prove the validity of the ether concept. The
quest for the ether served to sharpen a dilemma concerned with the velocity of light, a
subject which will be explored in Chapter 2.


CHAPTER 2

The Special Theory of Relativity


RELATIVITY THEORY is usually divided into two categories, the special or "restricted"
theory, and the general theory. The special theory is concerned with phenomena as they
appear to different observers who have a constant velocity relative to each other. The
general theory removes this restriction and considers phenomena as they appear to
different observers who are in arbitrary relative motion. As one would expect, the
general theory is considerably more difficult. Only the special theory will be needed as a
foundation for the electromagnetics to be developed in the remaining chapters of this
text.
The concepts underlying the special theory of relativity are sometimes puzzling on
first consideration because they lead to predictions about space, time, and matter which
are contrary to common experience. However, once these concepts are grasped, and
it is recognized that common experience need not be rejected (because it consists of
phenomena in which relativistic effects are too small to be detected), the path is opened
to an understanding of important new relationships. Fortunately, the mathematical
tools required to comprehend the special theory do not extend beyond algebra and
some elementary calculus, so that in approaching this subject much of one's attention
can be concentrated on the concepts themselves.
It is difficult to appreciate fully the need for the special theory of relativity and its
accomplishments without first recognizing the impasse in physics which it solved.
For this reason an essentially dual chronological presentation of subject matter will be
followed in this chapter. In the first (or classical) chronology, the principle of relativity
is introduced and then its consequences in terms of classical mechanics are considered.
In order to be consistent with this principle, Newton's Law of Inertia is shown to
require the Galilean transformation as the connection between different inertial coordinate systems. This development requires the assumptions that distance intervals
and time intervals are invariants. When the additional assumption is made that mass
is an invariant, Newton's general force law also is seen to transform properly via the
Galilean equations. A by-product of this proof is the familiar classical law of velocity
transformation. Application of this velocity law to the case of sound waves yields a
result in agreement with observation; however, when this law is applied to the velocity of light, such agreement with observation is lacking, thus posing a fundamental
dilemma. This disagreement between classical prediction and observation is discussed
in terms of the Fizeau experiment involving light propagation in moving water, and the
Michelson-Morley ether drift experiment, the null result of which raises questions about
the existence of a light medium.


After various classical explanations of this dilemma are considered and rejected as
unsatisfactory, the second chronology† begins with a reexamination of the fundamental definitions of space and time. Einstein's two postulates of special relativity
are then used as the basis for a resolution of the impasse and a derivation of the Lorentz
transformation. This transformation is seen to be consistent with the principle of
relativity in the case of light velocity and provides a convincing explanation of the
Fizeau experiment. The view of Einstein that the concept of a luminiferous ether is
superfluous automatically explains the null result of ether drift experiments such as
those of Michelson and Morley and the more recent test by Cedarholm and Townes
using masers.
Application of the Lorentz equations to the transformation of the laws of mechanics
is found to have several significant consequences. These include the dependence of
length on motion, time dilatation, variation of mass, and the equivalence of mass and
energy. These effects combine to yield transformation laws for mass and force. The
latter is used in Chapter 4 to derive all the results of magnetostatics via a relativistic
transformation of Coulomb's electric force law.
A second transformation is used in Chapter 5 to derive Maxwell's equations for the
case of a source system consisting of steady currents and charges, as seen by one
observer, with respect to whom a second observer is in constant translational motion.
The second observer detects time-varying electromagnetic fields due to sources of
a restricted class; upon superimposing a set of such fields, one can establish Maxwell's equations for the general case of accelerated sources. In this manner all the
basic relations of electromagnetics are derived by using the special theory of relativity
to enlarge upon the single experimental postulate of Coulomb's law, without the need
to invoke the general theory.

2.1 *

HISTORICAL SURVEY

Bradley's discovery of stellar aberration in 1728 has already been recounted in the
previous chapter. At that time the corpuscular theory of light held sway, and thus
Bradley's explanation of the effect was based on the mechanistic law of addition of
velocities. A century later, when the wave theory of light had been revived successfully
by Young and his followers, the need existed to reexamine all optical phenomena,
including aberration, on a wave basis. The concept of a luminiferous ether through
which light propagates became a natural part of the wave theory, in analogy with all
other known wavelike disturbances, each of which requires a medium for transmission.
Thus it was Young himself who employed the ether concept to explain Bradley's
discovery of aberration in terms of a wave picture. In addressing the Royal Society in
1803 he remarked that¹

Upon considering the phenomena of the aberration of the stars, I am disposed to believe
that the luminiferous ether pervades the substance of all material bodies with little or no
resistance, as freely perhaps as the wind passes through a grove of trees.

† The two chronologies overlap because Einstein's explanation of the dilemma, though offered in 1905,
did not gain universal acceptance immediately; many classical attempts at an explanation were still
to be forthcoming for several decades.
* This section may be omitted without loss in continuity of the technical presentation.
1 T. Young, Miscellaneous Works, edited by George Peacock, Vol. 1, p. 188, John Murray Publishers,
London, 1855.

In this conception the earth glides through the ether, and the light from a distant
star is unaffected by the earth's motion. Thus the light waves, during the interval of
time they traverse the tube of a telescope, suffer a displacement equal to the displacement of the earth through the ether in the same time interval. This displacement can be
compensated for by making a small angular correction in the position of the telescope,
thus accounting for aberration.
The notion of an ether which pervaded the entire universe, being everywhere at
rest in some particular frame of reference, gained favor for the additional reason that it
lent support to the idea of an absolute frame of reference, with respect to which the
absolute position and velocity of all bodies could be specified.


Young enlarged on this ether concept with the suggestion that²

For explaining the phenomena of partial and total reflection, refraction, and inflection,
nothing more is necessary than to suppose all refracting media to retain, by their attraction,
a greater or less quantity of the luminous ether, so as to make its density greater than that
which it possesses in a vacuum, without increasing its elasticity.

Fresnel put this idea into a precise form by postulating that the density of ether in any
body is proportional to the square of its refractive index. The excess of ether density
over that in vacuo was assumed to be dragged along with the body, the remainder
staying at rest as part of a uniform background ether. With this model Fresnel was
able to derive an expression for the velocity of light $v$ in a moving body, namely,

$$v = v_0 + \left(1 - \frac{1}{n^2}\right)u$$

in which $n$ is the refractive index of the body, $u$ is the velocity of the body with respect to the ether, and $v_0$ is the velocity
light would have in the body if the body were stationary in the ether; all velocities are
measured with respect to the frame of reference in which the ether is at rest.
With this formula, Fresnel was successful in explaining refraction effects under the
wave theory, for bodies in motion as well as at rest with respect to the ether. His
theory was consistent with Arago's result that the apparent refraction in a moving
prism is equal to the absolute refraction in a stationary prism, and it further predicted
that if observations were made with a water-filled telescope, the aberration would be
unaffected by the presence of the water. This prediction was verified by Airy in 1871.
Fizeau, in a significant experiment, passed light through tubes of moving water, and
used an interference technique to substantiate the above Fresnel formula, this being
done in 1851.
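A small calculation shows the order of magnitude Fizeau had to resolve. The sketch below assumes the standard form of the Fresnel formula, v = v0 + (1 - 1/n^2) u with v0 = c/n, and evaluates it for water. The refractive index 1.333 and the water speed of 7 m/s are assumed illustrative values, not figures quoted in the text, and the function names are hypothetical.

```python
C = 299_792_458.0      # speed of light in vacuum, m/s


def drag_coefficient(n: float) -> float:
    """Fresnel's partial-drag factor 1 - 1/n^2 (assumed standard form)."""
    return 1.0 - 1.0 / n**2


def light_speed_in_moving_body(n: float, u: float) -> float:
    """Velocity of light in a body of index n moving at speed u through the ether."""
    v0 = C / n                         # speed in the body when it is stationary
    return v0 + drag_coefficient(n) * u


if __name__ == "__main__":
    n_water, u = 1.333, 7.0            # assumed illustrative values
    v0 = C / n_water
    v = light_speed_in_moving_body(n_water, u)
    print("drag coefficient for water:", drag_coefficient(n_water))
    print("change in light speed:     ", v - v0, "m/s")
    print("fractional change:         ", (v - v0) / v0)
```

The fractional change is of the order of one part in a hundred million, which is why an interference technique was needed. Note that the factor is less than unity: the light is carried along with only a fraction of the water's speed rather than the whole of it.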
Maxwell, who possessed a physical imagination akin to that of Faraday, firmly
believed in the existence of an ether. In the classic paper³ which introduced his theory
of the electromagnetic field, one finds the passage

It appears therefore that certain phenomena in electricity and magnetism lead to the
same conclusion as those of optics, namely, that there is an aethereal medium pervading all
bodies, and modified only in degree by their presence; that the parts of this medium are
capable of being set in motion by electric currents and magnets; that this motion is communicated from one part of the medium to another by forces arising from the connexions
of those parts; that under the action of these forces there is a certain yielding depending
on the elasticity of these connexions; and that therefore energy in two different forms may
exist in the medium, the one form being the actual energy of motion of its parts, and the
other being the potential energy stored up in the connexions, in virtue of their elasticity.

2 Ibid., p. 80.
3 J. C. Maxwell, "A Dynamical Theory of the Electromagnetic Field," Phil Trans Roy Soc (London),
155, 450; 1865. (See also J. C. Maxwell Scientific Papers, Vol. 1, pp. 526-597, Dover Publications,
Inc., New York.)

Several years earlier Maxwell had devised a mechanical conception of the electromagnetic field and had been led by analogy to the conclusion that electromagnetic
waves are propagated at the velocity of light. He therefore felt that light was an electromagnetic disturbance and made the assertion
We can scarcely avoid the inference that light consists in the transverse undulations of
the same medium which is the cause of electric and magnetic phenomena.

Thus an answer was provided to speculation as to whether or not several ethers existed
for the separate support of light, heat and electricity.
Interest in the detection of this luminiferous ether grew, but it was several decades
before an experiment of sensitivity sufficient to be definitive was performed. In 1881
Michelson invented an interferometer capable of measuring second-order effects in the
assumed velocity of the earth relative to the ether. His technique for determining
ether drift was analogous to the detection of a river current through comparison of
the round trip times of rowers who follow courses parallel to and perpendicular to the
flow. This first experiment gave a null result but the sensitivity was marginal, so the
apparatus was improved and the experiment repeated by Michelson and Morley in
1887. A null result was again obtained; it was as though the earth were at rest in the
ether.
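The river analogy can be worked through numerically. In the sketch below a rower of still-water speed c covers a course of length L out and back, once parallel to a current of speed w and once perpendicular to it; the same expressions, with c read as the speed of light and w as the assumed speed of the earth through the ether, give the second-order time difference the interferometer was designed to detect. All numerical values, including the 11 m path and the 30 km/s ether speed, are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def round_trip_parallel(length: float, speed: float, current: float) -> float:
    """Out and back along the flow: downstream at speed + current, upstream at speed - current."""
    return length / (speed + current) + length / (speed - current)


def round_trip_perpendicular(length: float, speed: float, current: float) -> float:
    """Out and back across the flow; heading upstream leaves a cross-stream
    speed of sqrt(speed^2 - current^2)."""
    return 2.0 * length / math.sqrt(speed**2 - current**2)


if __name__ == "__main__":
    # Rowers: 100 m course, 5 m/s in still water, 3 m/s current (assumed values).
    print("rowers, parallel:     ", round_trip_parallel(100.0, 5.0, 3.0), "s")
    print("rowers, perpendicular:", round_trip_perpendicular(100.0, 5.0, 3.0), "s")

    # Light: assumed 11 m path and an assumed ether speed of 30 km/s.
    c, w, path = 299_792_458.0, 3.0e4, 11.0
    dt = round_trip_parallel(path, c, w) - round_trip_perpendicular(path, c, w)
    print("light, time difference:", dt, "s")
    print("second-order estimate L w^2 / c^3:", path * w**2 / c**3, "s")
```

For the rowers the two round trips take 62.5 s and 50 s. For light the difference is proportional to the square of w/c, which is the sense in which the effect sought was of second order.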
This experiment caught the attention of the Dutch physicist H. A. Lorentz (1853-1928), who became convinced that the null result was a real effect and sought a reason
to explain it. In 1892 he hypothesized that a material body suffers a contraction in its
longitudinal dimension, due to its motion through the ether, just sufficient to prevent
the ether's detection with the Michelson interferometer. This same explanation had
been put forth verbally by G. F. FitzGerald (1851-1901) several years earlier and is
often referred to as the Lorentz-FitzGerald contraction hypothesis.
Lorentz attempted to develop a complete electron theory which would explain this
contraction in terms of a readjustment of electrical forces between molecules, due to
absolute motion through the ether. In a succession of papers he ultimately formulated
a theory in which Maxwell's equations would transform from one set of variables to
another without a change in form.⁴ The two sets of variables were related to each other
by what have come to be known as the Lorentz transformation equations; the representation is that of the connection between two coordinate systems in different states
of constant motion through the ether. To obtain this transformation Lorentz assumed
that spherical electrons were flattened into ellipsoids due to their motion through the
ether and introduced what he called "local time" in one frame of reference which
depended on both time and distance in the other frame. The physical meaning of this
local time was not elaborated.

4 H. A. Lorentz, "Electromagnetic Phenomena in a System Moving With Any Velocity Less Than
That of Light," Proc Amst Acad, 6, 809; 1904. (Reprinted in English in The Principle of Relativity,
pp. 11-34, Dover Publications, New York.)
In 1932 R. J. Kennedy devised an ingenious modification of the Michelson-Morley
experiment which showed the Lorentz-FitzGerald contraction hypothesis to be untenable. Meanwhile, the intervening years had seen a series of repetitions of the original
Michelson experiment by a number of investigators. Though the sensitivity and accuracy increased, no change from the null result was noted and a variety of explanations
based on an ether theory proved unsatisfactory.
Concomitantly, a new approach to the problem had been evolving. In 1900 while
addressing the International Congress of Physics at Paris, Poincaré reviewed the implications raised by the null result of the Michelson experiment and asked, "Our ether, does it really exist? I do not believe that more precise observations ever could reveal
anything more than relative displacements." Poincaré became convinced that it was
impossible to determine the earth's absolute motion (that is, its velocity through the
ether), and embraced this belief in the enunciation of a Principle of Relativity. Speaking in St. Louis in 1904, he said⁵

According to the Principle of Relativity, the laws of physical phenomena must be the
same for a "fixed" observer as for an observer who has a uniform motion of translation
relative to him: so that we have not, and cannot possibly have, any means of discerning
whether we are, or are not, carried along in such a motion.

In 1905 Einstein made a complete break with the ether concept, discarding it as
superfluous. He adopted the principle of relativity as a postulate and added as another
that light is always propagated in empty space with a definite velocity c which is independent of the state of motion of the emitting body. Upon careful reexamination of the
concepts of the measurements of space and time, he concluded that neither was an
invariant. In satisfying his second postulate, Einstein was led to the same transformation equations derived earlier by Lorentz. However, the derivation was on an entirely different basis, and one which has stood the test of time.
The noninvariance of spatial and temporal intervals has caused a major reinterpretation of the concepts of mechanics. In 1906 Max Planck determined the modifications
which would be needed in the Newtonian equations of motion to place them in accord
with the new relativity theory, and then developed expressions for the kinetic energy
and momentum of a material particle. It was recognized that the concept of mass as
an invariant must also be abandoned. This variability of mass was clearly illustrated
by G. N. Lewis and R. C. Tolman in 1909, when they considered the collision of two
similar balls as viewed from different coordinate systems, and found that either momentum was not conserved or mass depended on speed. The relativistic expression for
kinetic energy led Einstein, Lewis, and others to suggest that energy and mass were
related by the now-celebrated equation $E = mc^2$. Transformation laws based on the
Lorentz equations were worked out for velocity, mass, and force, from which emerged
the result that the velocity of light is the upper limit for motion of matter and energy.
Experimental evidence in support of Einstein's special theory of relativity is positive
and abundant. With the advent of atomic clocks, greatly increased precision in the
measurement of time intervals has made possible a variety of terrestrial experiments
which verify all the major predictions of the theory, including the dependence on speed
of distance, time, and mass. A variety of nuclear processes has confirmed the relation
$E = mc^2$. Much of this evidence will be presented in the sections to follow, together
with the principal developments of the theory which have been enumerated above.

5 An English translation of this address by G. B. Halstead can be found in The Monist, January 1905.

2.2

THE PRINCIPLE OF RELATIVITY AND ITS CLASSICAL IMPLICATIONS

The principle of relativity in science is an old idea whose origins are difficult to trace.
Simply stated, it expresses the belief that all the laws of nature should operate in the
same manner everywhere in the universe.⁶ This idea was given specific articulation by
Poincaré at a meeting of the International Congress of Physics at Paris in 1900 and
was raised to the status of a formal postulate by Einstein in 1905. Despite the apparent
simplicity and self-evident logic of this principle, it has deep-seated consequences.
Consider first the implications of the relativity principle with respect to Newton's
laws of motion. The First Law, or Law of Inertia, states: A body at rest or in uniform
motion will remain at rest or in uniform motion unless some external force is applied to it.
Implicit in this law is the notion of an observer who can determine that the motion
of the body is unaccelerated. But not all observers will make such an observation, for if
two observers are in accelerated motion with respect to each other, they cannot both
perceive the body to have an unaccelerated motion. Thus the Law of Inertia as stated
above is not applicable for all observers in all coordinate systems. Those systems in
which it is applicable are said to be inertial systems.
Let XYZ be a Cartesian coordinate system in which the Law of Inertia is valid.
By this one means that an observer O who is stationary in XYZ will determine that
any body which is removed from interaction with all other bodies will be at rest or
traveling in a straight line at constant speed. In the Newtonian conception of space,
O can imagine the X, Y, and Z axes extending as straight lines in three perpendicular
directions to the limits of the universe and can imagine a one-to-one correspondence
between the points in physical space and the triplets of numbers (x,y,z). As seen by O,
the instantaneous position of a particle can be described by its three coordinate variables x(t), y(t), z(t). If this particle is force-free, O can write

$$\ddot{x} = 0 \qquad \ddot{y} = 0 \qquad \ddot{z} = 0 \tag{2.1}$$

in which each dot signifies a time derivative. Integration once with respect to time gives

$$\dot{x} = v_x \qquad \dot{y} = v_y \qquad \dot{z} = v_z \tag{2.2}$$

with $\mathbf{v} = \mathbf{1}_x v_x + \mathbf{1}_y v_y + \mathbf{1}_z v_z$ the constant straight-line velocity which O observes the
particle to have, in conformance with the First Law.†

† In this text, unit vectors will be designated by the symbols $\mathbf{1}_x$, $\mathbf{1}_r$, $\mathbf{1}_\phi$, etc. See the Mathematical
Supplement.
6 Anaxagoras (c.500-430 B.C.) apparently held this belief. See D. E. Gershenson and D. A. Greenberg,
Anaxagoras and the Birth of the Scientific Method, Blaisdell Publishing Company, New York, 1964.

But clearly the frame of reference XYZ is not unique in the sense of being the
only inertial system, and one can readily imagine another observer O', at rest in a
different coordinate system X'Y'Z', for whom the same body also seems unaccelerated.
Since O and O' are observing the same phenomenon, they should be able to deduce
each other's measurements from a knowledge of the relative position and motion of
their two frames of reference. Stated differently, if observer O knows the triplet (x,y,z)
which establishes the instantaneous position of the particle as determined by himself,
he should be able to deduce the corresponding triplet (x',y',z') and thus know the
instantaneous position of the particle as seen by O'.
This connection is accomplished through the coordinate transformation equations†

$$x' = g_1(x,y,z,t) \qquad y' = g_2(x,y,z,t) \qquad z' = g_3(x,y,z,t) \tag{2.3}$$
Several restrictions can be invoked to determine a specific form of this transformation. First, if it is assumed that O and O' agree in their measurement of distance, then the functions $g_1$, $g_2$, and $g_3$ must be linear and commensurate in the spatial variables. Second, if O and O' are both to observe the particle to have a straight-line trajectory, only a translational motion of X'Y'Z' with respect to XYZ is permitted; rotational motion is excluded. Third, if it is assumed that O and O' agree in their measurement of time intervals, then the functions $g_1$, $g_2$, and $g_3$ must also be linear in the temporal variable, for otherwise one would not obtain $\ddot{x}' = \ddot{y}' = \ddot{z}' = 0$ when $\ddot{x} = \ddot{y} = \ddot{z} = 0$.
With these restrictions, the most general suitable solution of (2.3) is the Galilean transformation discussed in the Mathematical Supplement (Example V.16); namely,

\[
\begin{aligned}
x' &= (x - x_0 - u_x t)\cos xx' + (y - y_0 - u_y t)\cos yx' + (z - z_0 - u_z t)\cos zx' \\
y' &= (x - x_0 - u_x t)\cos xy' + (y - y_0 - u_y t)\cos yy' + (z - z_0 - u_z t)\cos zy' \\
z' &= (x - x_0 - u_x t)\cos xz' + (y - y_0 - u_y t)\cos yz' + (z - z_0 - u_z t)\cos zz'
\end{aligned}
\tag{2.4}
\]

In (2.4), $\mathbf{u} = \mathbf{1}_x u_x + \mathbf{1}_y u_y + \mathbf{1}_z u_z$ is the constant velocity of O' with respect to O; cos xx', cos yx', etc., are the cosines of the constant angles between the X and X' axes, the Y and X' axes, etc.; $(x_0, y_0, z_0)$ is the position of the origin of the X'Y'Z' system as seen by O at t = 0.
Equations (2.4) are known as the most general Galilean transformation. Their physical interpretation is that the primed system is in translative motion relative to the unprimed system at a speed $u = (u_x^2 + u_y^2 + u_z^2)^{1/2}$. This motion is in an arbitrary direction with respect to the XYZ axes. Furthermore, the X'Y'Z' axes are tilted arbitrarily relative to the XYZ axes and the primed origin is in an arbitrary position relative to the unprimed origin at t = 0.
It is a simple matter to show that Newton's First Law applies in one of these systems if it applies in the other. If x(t), y(t), and z(t) are the time-varying coordinates of a particle as seen by O, then differentiation of (2.4) gives

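\[
\begin{aligned}
\frac{dx'}{dt} &= (\dot{x} - u_x)\cos xx' + (\dot{y} - u_y)\cos yx' + (\dot{z} - u_z)\cos zx' \\
\frac{dy'}{dt} &= (\dot{x} - u_x)\cos xy' + (\dot{y} - u_y)\cos yy' + (\dot{z} - u_z)\cos zy' \\
\frac{dz'}{dt} &= (\dot{x} - u_x)\cos xz' + (\dot{y} - u_y)\cos yz' + (\dot{z} - u_z)\cos zz'
\end{aligned}
\tag{2.5}
\]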

† See the Mathematical Supplement, Sec. V.II.

Since it has been assumed that observers O and O' agree in their measurement of time intervals, so that it is proper to write dt' = dt, it follows that the left sides of Equations (2.5) are the instantaneous velocity components of the particle as seen by O'. Under this assumption, Equations (2.5) are known as the velocity transformation equations, and one additional differentiation gives the acceleration transformation, namely,

\[
\begin{aligned}
\ddot{x}' &= \ddot{x}\cos xx' + \ddot{y}\cos yx' + \ddot{z}\cos zx' \\
\ddot{y}' &= \ddot{x}\cos xy' + \ddot{y}\cos yy' + \ddot{z}\cos zy' \\
\ddot{z}' &= \ddot{x}\cos xz' + \ddot{y}\cos yz' + \ddot{z}\cos zz'
\end{aligned}
\tag{2.6}
\]

Thus if O is observing an unaccelerated particle, so that $\ddot{x} = \ddot{y} = \ddot{z} = 0$, then Equations (2.6) give $\ddot{x}' = \ddot{y}' = \ddot{z}' = 0$, indicating that the particle also appears unaccelerated to observer O'.
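The invariance of acceleration under the general Galilean transformation can also be checked numerically. The following is a minimal sketch, not taken from the text: it assumes primed axes tilted about the Z axis only (one special choice of the direction cosines), uses made-up values for the tilt angle, the origin offset, and the relative velocity, and estimates the primed acceleration by a central second difference. All function names are illustrative.

```python
import math

def direction_cosines_z(theta):
    """Direction cosines for primed axes tilted by theta about Z.
    Row k lists (cos xk', cos yk', cos zk') for the primed axis k."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def galilean(r, t, cosines, r0, u):
    """General Galilean transformation of the point r at time t."""
    shifted = [r[i] - r0[i] - u[i] * t for i in range(3)]
    return [sum(cosines[k][i] * shifted[i] for i in range(3)) for k in range(3)]

def primed_acceleration(path, t, h, cosines, r0, u):
    """Central second difference of the primed coordinates at time t."""
    before = galilean(path(t - h), t - h, cosines, r0, u)
    now = galilean(path(t), t, cosines, r0, u)
    after = galilean(path(t + h), t + h, cosines, r0, u)
    return [(after[k] - 2.0 * now[k] + before[k]) / h**2 for k in range(3)]

def free_particle(t):
    """Assumed straight-line, constant-velocity path as seen by O."""
    return [1.0 + 3.0 * t, -2.0 + 0.5 * t, 4.0 - 1.5 * t]

cosines = direction_cosines_z(math.radians(30.0))  # assumed tilt
r0 = [0.5, -1.0, 2.0]   # assumed primed origin as seen by O at t = 0
u = [10.0, -4.0, 7.0]   # assumed velocity of O' relative to O

acc = primed_acceleration(free_particle, 2.0, 0.1, cosines, r0, u)
print(acc)  # every component should vanish up to rounding error
```

Replacing `free_particle` with an accelerated path shows the complementary behavior: the primed acceleration is then the unprimed one resolved along the tilted axes, with no contribution from the origin offset or the relative velocity.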
Cartesian coordinate systems linked by a Galilean transformation have several other important properties. One of these is the invariance of distance. Suppose that $(x_2,y_2,z_2)$ are the coordinates of one particle and $(x_1,y_1,z_1)$ are the coordinates of another particle at a common time t, as noted by observer O, who is stationary in XYZ. O then says that the instantaneous distance separating the two particles is

\[ d = \left[(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2\right]^{1/2} \tag{2.7} \]

Similarly, observer O', who is stationary in X'Y'Z', finds that the instantaneous positions are $(x_2',y_2',z_2')$ and $(x_1',y_1',z_1')$ and concludes that the distance of separation is

\[ d' = \left[(x_2' - x_1')^2 + (y_2' - y_1')^2 + (z_2' - z_1')^2\right]^{1/2} \tag{2.8} \]

If the two coordinate systems are connected by a Galilean transformation, Equations (2.4) can be used to deduce that

\[
\begin{aligned}
x_2' - x_1' &= (x_2 - x_1)\cos xx' + (y_2 - y_1)\cos yx' + (z_2 - z_1)\cos zx' \\
y_2' - y_1' &= (x_2 - x_1)\cos xy' + (y_2 - y_1)\cos yy' + (z_2 - z_1)\cos zy' \\
z_2' - z_1' &= (x_2 - x_1)\cos xz' + (y_2 - y_1)\cos yz' + (z_2 - z_1)\cos zz'
\end{aligned}
\tag{2.9}
\]

If one substitutes (2.9) in (2.8), recognizing that terms of the type $\cos^2 xx' + \cos^2 xy' + \cos^2 xz'$ are unity, whereas terms of the type $\cos xx' \cos yx' + \cos xy' \cos yy' + \cos xz' \cos yz'$ are zero,† it becomes apparent that

\[ d' = d \tag{2.10} \]

Therefore the Galilean transformation leaves distance an invariant.
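The two properties of the direction cosines used in this argument, and the resulting invariance of distance, can be illustrated with a short numerical sketch. It is an illustration under assumptions, not the author's method: the tilt of the primed axes is built from two successive rotations with arbitrarily chosen angles, the two particle positions are made up, and the helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def tilt_about_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def tilt_about_x(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def primed_difference(cosines, p2, p1):
    """Coordinate differences seen by O'; offset and velocity terms cancel."""
    diff = [p2[i] - p1[i] for i in range(3)]
    return [sum(cosines[k][i] * diff[i] for i in range(3)) for k in range(3)]

def length(v):
    return math.sqrt(sum(c * c for c in v))

# cosines[k][i] is the cosine of the angle between unprimed axis i and
# primed axis k; the two tilt angles are assumed values.
cosines = matmul(tilt_about_x(math.radians(20.0)),
                 tilt_about_z(math.radians(35.0)))

# cos^2 xx' + cos^2 xy' + cos^2 xz' should be unity ...
unit_term = sum(cosines[k][0] ** 2 for k in range(3))
# ... and cos xx' cos yx' + cos xy' cos yy' + cos xz' cos yz' should vanish.
cross_term = sum(cosines[k][0] * cosines[k][1] for k in range(3))

p1, p2 = [1.0, 2.0, 3.0], [4.0, -1.0, 0.5]   # assumed particle positions
d = length([p2[i] - p1[i] for i in range(3)])
d_primed = length(primed_difference(cosines, p2, p1))
print(unit_term, cross_term, d, d_primed)
```

The distances agree because the matrix of direction cosines is orthogonal; the particular angles chosen play no role.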


This invariance of distance permits a simple proof of the most important property of a Galilean transformation: the fact that Newton's general force law is invariant (actually covariant‡) under such a transformation. It has been shown above that if the First Law (concerning unaccelerated bodies) is valid in the unprimed system, a Galilean transformation renders it valid in the primed system as well. But for accelerated bodies, if f = ma is a valid relation in XYZ, and it is transformed using (2.4), does one obtain f' = m'a'? To see that under suitable assumptions this does occur, consider the result of Example V.22 in the Mathematical Supplement.

† The term $\cos^2 xx' + \cos^2 xy' + \cos^2 xz'$ is seen to be a unit vector parallel to the X axis, resolved into components along the primed axes, and dotted with itself. The term $\cos xx' \cos yx' + \cos xy' \cos yy' + \cos xz' \cos yz'$ is seen to be the dot product of two perpendicular unit vectors, one parallel to the X axis, the other parallel to the Y axis, both resolved into their primed components.
‡ If the form of a law is unchanged by a certain coordinate transformation, that is, if the law has the same functional form in terms of either set of coordinates, the law is said to be covariant with respect to the transformation considered.

In that example, a mass m experiences gravitational forces due to an assemblage of other masses $m_1, \ldots, m_N$. The total force on m is found to be expressible as the negative of the gradient of the scalar potential function
\[ \Phi(x,y,z,x_1,y_1,z_1,\ldots,t) = -\sum_{i=1}^{N} G\,\frac{m m_i}{r_i} \tag{2.11} \]

in which G is the universal gravitational constant and

\[ r_i = \left[(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2\right]^{1/2} \]

is the instantaneous distance between m and $m_i$. Through use of (2.11), Newton's force law for the case of mass particles can be written in the form

\[ m\mathbf{a} = \nabla \sum_{i=1}^{N} G\,\frac{m m_i}{r_i} = -\nabla\Phi \tag{2.12} \]

Because distance is an invariant, and because of the form of (2.11), if it is assumed that mass is also an invariant, then

\[ \Phi(x',y',z',x_1',y_1',z_1',\ldots,t) = \Phi(x,y,z,x_1,y_1,z_1,\ldots,t) \tag{2.13} \]

In other words, for a given set of relative positions of the masses, observers O and O' will agree on the value of the potential function. Formation of partial derivatives of (2.13) gives

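\[
\begin{aligned}
\frac{\partial\Phi}{\partial x'} &= \frac{\partial\Phi}{\partial x}\cos xx' + \frac{\partial\Phi}{\partial y}\cos yx' + \frac{\partial\Phi}{\partial z}\cos zx' \\
\frac{\partial\Phi}{\partial y'} &= \frac{\partial\Phi}{\partial x}\cos xy' + \frac{\partial\Phi}{\partial y}\cos yy' + \frac{\partial\Phi}{\partial z}\cos zy' \\
\frac{\partial\Phi}{\partial z'} &= \frac{\partial\Phi}{\partial x}\cos xz' + \frac{\partial\Phi}{\partial y}\cos yz' + \frac{\partial\Phi}{\partial z}\cos zz'
\end{aligned}
\tag{2.14}
\]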

in which it has been recognized, through differentiation of the inverse of Equations (2.4), that $\partial x/\partial x' = \cos xx'$, $\partial y/\partial x' = \cos yx'$, etc.
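Substitution of the three components of (2.12) into (2.14) gives

\[
\begin{aligned}
-\frac{\partial\Phi}{\partial x'} &= m\ddot{x}\cos xx' + m\ddot{y}\cos yx' + m\ddot{z}\cos zx' \\
-\frac{\partial\Phi}{\partial y'} &= m\ddot{x}\cos xy' + m\ddot{y}\cos yy' + m\ddot{z}\cos zy' \\
-\frac{\partial\Phi}{\partial z'} &= m\ddot{x}\cos xz' + m\ddot{y}\cos yz' + m\ddot{z}\cos zz'
\end{aligned}
\tag{2.15}
\]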

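Upon comparing (2.15) with (2.6), one can conclude that

\[ -\frac{\partial\Phi}{\partial x'} = m\ddot{x}' \qquad -\frac{\partial\Phi}{\partial y'} = m\ddot{y}' \qquad -\frac{\partial\Phi}{\partial z'} = m\ddot{z}' \]

and thus that

\[ m\mathbf{a}' = \nabla' \sum_{i=1}^{N} G\,\frac{m m_i}{r_i} = -\nabla'\Phi \tag{2.16} \]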


which is Newton's general force law in the same form as (2.12). Thus, under the assumptions that time and mass are invariants (plus the consequence of the Galilean transformation that distance is an invariant), the general Galilean transformation leaves all of Newton's laws of mechanics for free mass particles unaltered in form. It is for this reason that considerable importance attaches to the transformation (2.4).
Other branches of mechanics, including hydrodynamics, elasticity, and the mechanics of rigid bodies, can be treated as extensions of the mechanics of free mass particles, through the introduction of suitable interaction energies in the form of potential functions whose gradients give forces. It is thus clear, without entering into a detailed treatment of these branches of mechanics, that the laws which govern them also transform properly via the Galilean Equations (2.4), under the same assumptions which were made in the preceding development. Therefore the two inertial systems XYZ and X'Y'Z' appear to be equivalent for the description of all the phenomena of mechanics. This belief is often referred to as the Galilean principle of relativity.
One special case of the general Galilean transformation proves particularly useful. Assume the situation of Figure 2.1, in which the primed and unprimed axes are respectively parallel, the origins having coincided at t = 0, and in which the X and X' axes are sliding along each other at a relative speed u. It is seen readily that for this case (2.4) reduces to

\[ x' = x - ut \qquad y' = y \qquad z' = z \tag{2.17} \]

FIGURE 2.1 Cartesian coordinate systems in constant translative relative motion.

Similarly, the velocity transformation equations (2.5) reduce to

\[ \dot{x}' = \dot{x} - u \qquad \dot{y}' = \dot{y} \qquad \dot{z}' = \dot{z} \tag{2.18} \]

a result which depends on the assumption that time is an invariant. Equations (2.18)
also could have been deduced directly by a time differentiation of (2.17).


The usefulness of the transformation (2.17) extends beyond its simplicity. Imagine a third coordinate system X*Y*Z* connected to the system XYZ by a static rotation and also imagine a fourth system X'*Y'*Z'* connected to the system X'Y'Z' by a static rotation plus a static translation.† Then X'*Y'*Z'* is moving relative to X*Y*Z* at a speed u. This motion is in an arbitrary direction with respect to the X*Y*Z* axes and is also in an arbitrary direction with respect to the X'*Y'*Z'* axes. Furthermore, the two origins are in an arbitrary relative position at t = 0.
But this is the description of two Cartesian frames connected by the general Galilean Equations (2.4). Therefore one can obtain the most general Galilean transformation, connecting X*Y*Z* and X'*Y'*Z'*, via two static transformations of Equations (2.17). Since two observers, each at rest in separate Cartesian systems of coordinates (but systems connected by a static Galilean transformation), are in complete agreement about measurements of motion, it follows that the observations of O and O* are equivalent, and that the observations of O' and O'* are equivalent. Any deductions based on (2.4) are also obtainable from (2.17). For this reason no loss in generality is suffered if, for brevity and clarity, all the remaining discussion is presented in terms of the simple Galilean Equations (2.17).
To summarize the ideas of this section, one can say that any two inertial frames, connected by a Galilean transformation of the type (2.17) (possibly through the intermediary of two static transformations), are equally suitable as references in which to express the general laws of mechanics. This conclusion requires the assumptions that distance, time, and mass are invariants.
In the nineteenth century, mechanics was such a highly developed branch of science, and there was such a satisfactory agreement between Newton's laws and experiment, that mechanics enjoyed a greater confidence and trust than any other area of physical knowledge. Since the principle of relativity seemed so logical and natural, and since the Galilean equations transformed all the laws of mechanics in conformance with the principle of relativity, the greatest confidence also reposed in the belief that the Galilean transformation was correct. A test of its correctness arose with the question whether or not all the other (nonmechanical) laws of physics also transform properly via the Galilean equations, as the principle of relativity in its broadest sense requires. The next several sections are concerned with this question.

† By a static rotation plus a static translation, one means that X'*Y'*Z'* and X'Y'Z' have no relative motion, but their axes are tilted with respect to each other and their origins do not coincide.

2.3 APPLICATIONS OF THE CLASSICAL VELOCITY TRANSFORMATION LAW

A simple mechanical example will serve to illustrate the reasonableness of the velocity transformation equations (2.18). Consider the case of an observer O standing beside a highway as a sedan goes by traveling at 50 mph relative to the ground. At the same time observer O' is in a second car which is traveling at 70 mph relative to the ground and in the process of passing the sedan. If the XYZ system is attached to the ground with the X axis parallel to the highway, the situation is suggested by Figure 2.2. Observer O will say that the sedan has a velocity given by $\dot{x} = 50$ mph. If the X'Y'Z' system is attached to the car in which O' is riding, then u = 70 mph is the relative speed of the two coordinate systems, and (2.18) gives $\dot{x}' = 50 - 70 = -20$ mph as the velocity of the sedan relative to O'. This is a result completely consistent with common sense, and is typical of many similar applications of (2.18) which can be encountered in everyday experience.
FIGURE 2.2 Relative speed.

Next consider an observer O, stationary with respect to the average motion of the air which surrounds him. An acoustic source is generating sound waves which pass O at a speed $c_s$, governed solely by the properties of the air. Imagine also an observer O' traveling at a speed u relative to O, in the direction of the wave motion. Equations (2.18) predict that the sound waves will pass O' at a speed $c_s - u$, and experimental observations are consistent with this prediction.
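Both applications of (2.18) reduce to a single subtraction along the common X, X' axis. The sketch below is illustrative only: the highway figures are those of the text, while the speed of sound and the speed of the moving observer are assumed values, and the function name is hypothetical.

```python
def primed_velocity(x_dot, u):
    """Classical velocity transformation along X: x_dot' = x_dot - u."""
    return x_dot - u

# Highway example from the text (mph): sedan at 50, passing car at 70.
sedan_seen_from_passing_car = primed_velocity(50.0, 70.0)
print(sedan_seen_from_passing_car)  # -20.0

# Sound example with assumed numbers (m/s): the air is at rest relative
# to O, and O' moves in the direction of the wave motion.
SPEED_OF_SOUND = 343.0   # assumed value for air near room temperature
observer_speed = 30.0    # assumed speed of O' relative to O
sound_seen_by_moving_observer = primed_velocity(SPEED_OF_SOUND, observer_speed)
print(sound_seen_by_moving_observer)
```

The negative sign in the first result simply records that the sedan falls behind the passing car.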
The motion of the acoustic source will be different as observed by O and O', but this has no effect on the wave velocity (cf. Section 1.3). The reason why O and O' observe a different value for the speed of the sound waves is that the medium is at rest with respect to O but is in motion at a speed u relative to O'.
Finally, consider an observer O, stationary in XYZ, past whom a light wave is propagating at speed c. If a second observer O' is traveling at a speed u relative to O in the direction of the wave motion, then Equations (2.18) predict that the light waves will pass O' at a speed c' = c - u. This result should be valid even without the presence of a tangible medium, since it is known that light can propagate through a vacuum. But in this extremity of the absence of a tangible medium, two possibilities need to be considered:

1. There is a detectable intangible medium, call it ether, which supports the light
waves, and in which light propagates at a speed c governed by the properties of
the ether.
2. There is not a detectable ether, and a vacuum is a region to which no physical
properties can be ascribed.


Under the first possibility, the existence of a detectable ether, the situation is completely analogous to the case of sound waves. For convenience in this discussion observer O can be assumed to be at rest in the ether, so that the light waves do pass him at a speed c. These light waves then pass O' at a speed c' = c - u because the medium is in motion at speed u relative to O'. The motion of the light source does not affect these values of c and c' because only the ether governs the velocity of propagation (cf. Section 1.3).
Under the second possibility, the nonexistence of a detectable ether, it is illogical to write

\[ c' = c - u \neq c \tag{2.19} \]

This point can be appreciated by recognizing that O is in a vacuum, observing a light source with some particular motion, and that this source is emitting light waves whose velocity relative to himself O can measure. But O' is also in a vacuum, observing the same light source with some particular motion, and this source is emitting light waves whose velocity relative to himself O' can measure. The only difference in the situation for observers O and O' is the motion of the source. But if the velocity of light is independent of the motion of the source (and the experimental evidence indicates that light does share this characteristic with sound), then O and O' should measure the same velocity for light, and conclude that

\[ c' = c \tag{2.20} \]

which is in violation of the classical law of velocity transformation.
Thus the two possibilities lead to different predictions, and one should be able to design experiments which will test the validity of each possibility. Several such experiments have been performed, two of which will be described in the sections to follow. However, before a discussion of these experiments is presented, it is significant to point out the implications of a choice between the two possibilities (1) and (2) listed above. If the principle of relativity is applicable to all the laws of physics, and if the Galilean transformation equations (2.17) are consistent with this principle, then since the velocity transformation law (2.18) is a direct consequence of (2.17), the presence of a detectable ether is required; without an ether, the relation c' = c - u is meaningless. Alternatively, if there is not a detectable ether, then either the Galilean transformation equations are incorrect or the principle of relativity holds for mechanics but not for light. A decision between possibilities (1) and (2) is of fundamental importance.
When the need to make this decision was first appreciated, there was every confidence that an ether would be detected, that the Galilean transformation was correct, and that the principle of relativity embraced all branches of physics. The actual detection of the ether was eagerly awaited, and there were philosophical overtones to the scientific interest evinced in this imminent discovery. Without an ether, no single inertial frame of reference could in any way be preferred over any other. However, if the ether could be detected, presumably it would not consist of different portions in relative motion, but would be everywhere at rest in one Galilean coordinate system. It would then seem logical to take this preferred frame as the absolute reference. The instantaneous position of every body in the universe with respect to this preferred frame could then be designated as its absolute instantaneous position.
An absolute reference frame for the entire universe had long been an appealing idea. (Newton, for example, had believed in absolute motion, defining it as translation of a body from one absolute place to another absolute place.) Detection of the ether was therefore not only expected to settle an outstanding question about light, but also to establish a means for defining absolute position and motion.

2.4 FIZEAU'S EXPERIMENT WITH MOVING WATER

In 1859, in a classic paper,⁷ Fizeau described an experiment he had performed to determine the influence upon the velocity of light of the motion of the tangible medium through which it passes. The result of this experiment has a strong bearing on the question of the existence of a detectable ether, and was later credited by Einstein as being of primary importance in his formulation of the special theory.
Fizeau divided a beam of light, which issued from a slit S placed at the principal focus of a lens, into two parallel beams, which he then passed through two parallel tubes (Figure 2.3). At the end of these tubes, the two beams impinged upon a second lens and were reunited at its focus, where Fizeau had placed a plane mirror. Upon reflection the rays crossed and were each returned through the other tube, to be reunited once again by the first lens and brought to a focus at the point F, through the interposition of the half-silvered mirror P.

FIGURE 2.3 Fizeau's moving water apparatus.
With both tubes filled with water, and the water at rest, transverse interference fringes could be observed at F with a bright central fringe corresponding to equal paths. If then the water were put in motion with equal speeds, but in opposite directions in the two tubes, and if the velocity of light were affected by this motion, one would predict that the central fringe would be displaced. This would be so because one beam of light would be traveling with the water flow, both out and back, whereas the other beam would be traveling against it. A simple measurement of the shift in position of the central fringe would yield the difference in times along the two paths and thus the dependence of light velocity on motion of the water.
When Fizeau performed this experiment, he did note a fringe shift which depended on the rate of flow of the water, and his data fitted the formula

\[ v = v_0 + u\left(1 - \frac{1}{n^2}\right) \tag{2.21} \]

in which v is the velocity of light in water when the water is moving at a speed u relative to the laboratory, $v_0$ is the velocity of light in stationary water, and n is the index of refraction. Both v and $v_0$ were measured relative to the laboratory.

7 A. H. Fizeau, "On Hypotheses Relative to a Luminous Ether," Ann de Chimie et de Phys, Ser. III, 57, 385-404; May 1859.
This result is at variance with a simple classical prediction. An observer O' at rest relative to the water should measure a velocity of light in the water of value $v_0$. An observer O at rest in the laboratory, seeing O' go by at speed u, can invoke Equations (2.18) to predict that $v = v_0 + u$. Since the index of refraction of water is approximately 1.3, the difference between (2.21) and the classical prediction is too great to be ascribed to experimental errors. Strong reinforcement of Fizeau's findings has been provided by the precise work of later investigators⁸,⁹ who repeated his experiments, also obtaining agreement with the formula (2.21).
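The size of the disagreement can be made concrete with a short calculation. The sketch below is only an illustration: it takes the formula fitted by Fizeau in the Fresnel form v = v0 + u(1 - 1/n^2), uses n = 1.3 as quoted in the text, and assumes a water speed of 7 m/s, a value chosen here for illustration rather than taken from the text. The function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def drag_coefficient(n):
    """Fraction of the water speed added to the light velocity."""
    return 1.0 - 1.0 / n**2

def fresnel_velocity(n, u, c=C):
    """Light velocity in moving water according to the fitted formula."""
    return c / n + u * drag_coefficient(n)

def classical_velocity(n, u, c=C):
    """Light velocity predicted by the Galilean velocity transformation."""
    return c / n + u

n_water = 1.3      # approximate index of refraction quoted in the text
water_speed = 7.0  # m/s, assumed for illustration

k = drag_coefficient(n_water)
gap = classical_velocity(n_water, water_speed) - fresnel_velocity(n_water, water_speed)
print(k)    # about 0.41: well under half of the water speed is added
print(gap)  # the two predictions differ by u / n**2

# As n approaches unity the drag coefficient vanishes, so the light
# velocity becomes independent of the motion of the medium.
print(drag_coefficient(1.0))
```

With n near 1.3 the classical prediction overstates the effect of the water motion by more than a factor of two, which is why the discrepancy cannot be ascribed to experimental error.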
This formula had actually been derived earlier by Fresnel on theoretical grounds through a complication of the ether concept. At the time he was interested in explaining an observation by Arago that the apparent refraction of light in a moving prism was equal to the absolute refraction in a fixed prism. However, the argument of Fresnel's derivation is equally applicable to the Fizeau experiment, and in that context proceeds as follows:
Assume that the ethereal density in any body is proportional to the square of its index of refraction. Then if c is the velocity of light in the ether in the absence of any tangible matter, and if $v_0$ is the velocity of light in the given material body when it is at rest, so that $n = c/v_0$ is the refractive index, it follows that

\[ \frac{\rho'}{\rho} = n^2 = \frac{c^2}{v_0^2} \]

in which $\rho$ is the density of the ether in free space and $\rho'$ is its density in the material body.
Fresnel made the additional assumption that when the material body was in motion at speed u, part of the ether was carried along with it, namely, that part which constitutes the excess of its density over the density of ether in free space. The rest of the ether within the space occupied by the body was assumed to remain stationary. In this manner the density of the ether carried along by the body could be computed as

\[ \rho' - \rho = (n^2 - 1)\rho \]

while a density $\rho$ remained at rest. The motion of the center of gravity of the ether within the body was therefore

\[ \frac{n^2 - 1}{n^2}\,u \]

Since this is the average motion, relative to the observer, of the ether associated with the body, he should add this term to $v_0$, the velocity of light in the body when it is at rest, in order to obtain v, the velocity of light in the body when it is in motion at speed u. This addition yields the formula (2.21).
Fresnel's derivation is seen to require further hypotheses about the behavior of the ether. No longer is the ether simply an intangible medium which is everywhere at rest in some absolute reference frame with all the material bodies of the universe gliding through it without interaction. The ether becomes more dense inside a material body and part of it is dragged along by the body's motion. Furthermore, Fresnel's derivation would be valid only for an observer at rest in the absolute reference frame, since he assumed that part of the ethereal density was stationary with respect to the observer. An ether hypothesis adequate to explain Fizeau's result is thus seen to be rather complicated.
It is interesting to note that if formula (2.21) is valid for all material bodies, and if a succession of material bodies is considered whose indices of refraction are successively closer to unity, in the limit as n → 1, (2.21) reduces to c' = c.

8 A. A. Michelson and E. W. Morley, Am J Sci, 31, 377; 1886.
9 P. Zeeman, Proc Amst Acad, 17, 445; 1914. Also 18, 398; 1915.

2.5 THE MICHELSON-MORLEY EXPERIMENT

A definitive experiment designed expressly to detect the presence of an ether was first performed by Michelson in 1881 and repeated with improved accuracy by Michelson and Morley in 1887. The essence of the approach is precisely analogous to Example V.3 in the Mathematical Supplement, in which two rowers determine a river's flow by noting the difference in their elapsed round-trip times, when one man rows across the river and the other parallel to the bank. The reader is urged to familiarize himself with that problem and to convince himself of the soundness of the logic underlying the analysis.
The apparatus employed in the ether experiment was an interferometer invented by Michelson and shown in schematic form in Figure 2.4. Light from a source is split into two parts by a half-silvered mirror P. One part travels over path $l_1$, is reflected by mirror $M_1$, and upon returning to P is partially reflected toward the viewing telescope F. The light which thus reaches F has gone through the plate P three times.
The other part of the original light beam travels over path $l_2$, through the equalizing plate, is reflected by $M_2$, and upon returning to P is partially transmitted toward the viewing telescope F. The light which thus reaches F has gone through the plate P once and the equalizing plate twice. Since these two plates are identical except for silvering, the paths in glass along the two routes are the same. When the light source is monochromatic, the relative phase of the two light components reaching F depends on the difference in round-trip times for light to travel along the two paths $P$-$M_1$-$P$ and $P$-$M_2$-$P$. This relative phase manifests itself by interference effects in the field of the viewing telescope.
Imagine that the half-silvered mirror P is set precisely at 45 deg, and that the tilts of $M_1$ and $M_2$ are adjusted so that the two light components which reach F via $M_1$ and $M_2$ travel parallel paths as they approach F. In this case the light intensity is essentially uniform over the central region of a transverse plane AA', and the level of intensity depends on the relative phase of the two light components. However, if the tilt of P is now shifted slightly away from 45 deg, the two light components reaching F via $M_1$ and $M_2$ no longer travel parallel paths as they approach F. Thus in the transverse plane AA' there will be alternate regions which are light and dark due to the constructive and destructive interference of the two light components. The positions of these light and dark regions, or interference fringes as they are called, depend on the relative phase of the two light components. If this relative phase changes, the interference fringes will shift transversely, and there will be a shift of one fringe for every 360 deg change in relative phase of the two light components. Upon focusing the viewing telescope and adjusting the tilt of P so that this transverse field of fringes is distinct, the operator of the interferometer has an extremely sensitive indication of a change in relative phase of the two light components arriving from $M_1$ and $M_2$, through an observed shift in the fringe pattern in his field of view.

FIGURE 2.4 The Michelson interferometer.
With this experimental technique in mind, assume that the earth, and with it the apparatus of Figure 2.4, are moving at a speed u relative to the ether in a direction that would take $M_2$ into P. According to the ether hypothesis, the speed of light is c in any direction in the ether. If $t_2'$ is the time for light to travel from P to $M_2$, then $ct_2'$ is the distance this light traveled through the ether. But this distance must also equal $l_2 - ut_2'$ due to the motion of the apparatus through the ether. Thus

\[ t_2' = \frac{l_2}{c + u} \]

Similarly, if $t_2''$ is the time for light to travel the return path from $M_2$ to P, then

\[ t_2'' = \frac{l_2}{c - u} \]

The total time for light to travel the path P to $M_2$ to P is therefore

\[ t_2 = t_2' + t_2'' = \frac{2l_2/c}{1 - u^2/c^2} \tag{2.22} \]
To compute the time for light to travel along the path to $M_1$ and back, one must account for the fact that, while the light travels from P to $M_1$, the whole apparatus moves a distance $\delta$ in the $M_2$-$P$ direction, as shown in Figure 2.5. The actual distance traveled by the light through the ether is therefore $(l_1^2 + \delta^2)^{1/2}$. If $t_1'$ is the time it takes for light to get from P to $M_1$, then

\[ ct_1' = (l_1^2 + \delta^2)^{1/2} \qquad \delta = ut_1' \]

Upon eliminating $\delta$, one obtains

\[ t_1' = \frac{l_1/c}{(1 - u^2/c^2)^{1/2}} \]

Since the time light takes along the return path from $M_1$ to P is the same, the total time for light to travel the path P to $M_1$ to P is

\[ t_1 = \frac{2l_1/c}{(1 - u^2/c^2)^{1/2}} \tag{2.23} \]

FIGURE 2.5 Ray path from P to $M_1$ to P.
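The two round-trip times can be compared numerically. The following sketch rests on the ether hypothesis described above and on assumed values (an arm length of 11 m, u = 30 km/sec, and c rounded to 3.0e8 m/sec); the function names are hypothetical. It checks that the sum of the outbound and return times along the direction of motion equals the closed form, and that the transverse time satisfies the right-triangle relation between the arm length and the drift distance.

```python
import math

def time_parallel(l, u, c):
    """Out-and-back time along the direction of motion: l/(c+u) + l/(c-u)."""
    return l / (c + u) + l / (c - u)

def time_parallel_closed(l, u, c):
    """Closed form of the same time: (2l/c) / (1 - u^2/c^2)."""
    return (2.0 * l / c) / (1.0 - (u / c) ** 2)

def time_transverse(l, u, c):
    """Out-and-back time across the motion: (2l/c) / sqrt(1 - u^2/c^2)."""
    return (2.0 * l / c) / math.sqrt(1.0 - (u / c) ** 2)

C = 3.0e8      # m/sec, rounded value assumed for illustration
U = 3.0e4      # m/sec, assumed speed through the ether
ARM = 11.0     # m, assumed arm length

t_par = time_parallel(ARM, U, C)
t_trans = time_transverse(ARM, U, C)

# One-way transverse leg: light covers the hypotenuse of a right triangle
# with sides ARM and the drift distance U * t.
t_one_way = t_trans / 2.0
hypotenuse = math.hypot(ARM, U * t_one_way)

print(t_par, t_trans, t_par - t_trans)
print(C * t_one_way, hypotenuse)
```

For equal arms the trip along the motion always takes slightly longer than the trip across it, and the difference is of second order in u/c.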

The difference in phase (assuming monochromatic light) of the two light components arriving at F is therefore

\[ \Delta = 2\pi\nu(t_1 - t_2) = \frac{4\pi\nu/c}{(1 - u^2/c^2)^{1/2}}\left[l_1 - \frac{l_2}{(1 - u^2/c^2)^{1/2}}\right] \tag{2.24} \]

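in which $\nu$ is the frequency of the light being used. As the apparatus is rotated through 90 deg, this difference in phase should steadily change, until at the end of the 90-deg rotation, the roles of $l_1$ and $l_2$ are interchanged. At this position, the difference in phase is

\[ \Delta' = \frac{4\pi\nu/c}{(1 - u^2/c^2)^{1/2}}\left[\frac{l_1}{(1 - u^2/c^2)^{1/2}} - l_2\right] \tag{2.25} \]

If an observer continually notes the interference fringes as the apparatus is rotated through 90 deg, he should see a total shift of n fringes, where n is the change in phase difference between the two positions divided by $2\pi$:

\[ n = \frac{\Delta' - \Delta}{2\pi} = \frac{2\nu(l_1 + l_2)/c}{(1 - u^2/c^2)^{1/2}}\left[\frac{1}{(1 - u^2/c^2)^{1/2}} - 1\right] \tag{2.26} \]

If u/c is small, a series expansion (cf. Mathematical Supplement, Part I) gives

\[ n \cong \frac{l_1 + l_2}{\lambda}\,\frac{u^2}{c^2} \tag{2.27} \]

in which $\lambda = c/\nu$ is the wavelength of the light, whereas, if u/c is not small, n is larger than the value given by (2.27). Thus (2.27) is the most pessimistic prediction for fringe shift and is seen to be a second-order expression in u/c.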
If an observer were willing to perform this experiment every day for six months, he would expect to encounter a value for u at least as great as the orbital velocity of the earth around the sun, namely 30 km/sec, this minimum value occurring if the sun were at rest in the ether. Upon inserting u = 30 km/sec in (2.27) one finds that $l_1 + l_2$ needs to be approximately 50 m in order to assure the observation of one fringe shift. It has proved possible to build an interferometer of this type capable of detecting as little as 1/1000th of a fringe shift, putting a reasonable requirement on the size of the apparatus. Thus the sensitivity needed is well within the capabilities of construction.
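The estimate of about 50 m can be reproduced with a few lines of arithmetic. The sketch below is hedged in several ways: it writes the small-u/c fringe shift as (l1 + l2)(u/c)^2 divided by the wavelength, and the exact shift as the difference of the two round-trip-time differences before and after the 90-deg rotation, both of which follow from the round-trip times derived above; it assumes a wavelength of 5.0e-7 m, which the text does not state, and rounds c to 3.0e8 m/sec. The function names are hypothetical.

```python
import math

def fringe_shift_exact(l_sum, u, c, wavelength):
    """Fringe shift on a 90-deg rotation, without approximating in u/c."""
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return 2.0 * l_sum / wavelength * g * (g - 1.0)

def fringe_shift_small(l_sum, u, c, wavelength):
    """Leading term for small u/c: (l1 + l2) u^2 / (wavelength c^2)."""
    return l_sum / wavelength * (u / c) ** 2

C = 3.0e8            # m/sec, rounded
U = 3.0e4            # m/sec, orbital speed of the earth quoted in the text
WAVELENGTH = 5.0e-7  # m, assumed visible-light wavelength

n_50m = fringe_shift_small(50.0, U, C, WAVELENGTH)
n_50m_exact = fringe_shift_exact(50.0, U, C, WAVELENGTH)
print(n_50m, n_50m_exact)   # both close to one fringe

# Two legs of about 11 m each (total 22 m) give a fraction of a fringe,
# still far above a sensitivity of 1/1000th of a fringe.
n_22m = fringe_shift_small(22.0, U, C, WAVELENGTH)
print(n_22m)
```

Because the shift scales linearly with the total path length, folding the light path with multiple reflections is a direct way to raise the expected signal.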
Additional factors affecting the accuracy are apparent. The relative positions of different parts of the apparatus must remain constant within a small fraction of a wavelength during operation. The stability of the light source is important, and the frequency bandwidth of the "monochromatic" light must be as small as possible. However,
if great care is taken in the assembly of the apparatus, with due consideration given to
these possible sources of error, one should expect to be able to measure the ether drift
regardless of how slowly the sun might be moving through the ether.
The first satisfactory trials by Michelson in 1881 indicated a null result, but the sensitivity was marginal. In Michelson's words:¹⁰

In the first experiment one of the principal difficulties encountered was that of revolving the apparatus without producing distortion; and another was its extreme sensitiveness to vibration. This was so great that it was impossible to see the interference fringes except at brief intervals when working in the city, even at two o'clock in the morning. Finally . . . the quantity to be observed; namely, a displacement of something less than a twentieth of the distance between the interference fringes may have been too small to be detected when masked by experimental errors.

10 A. A. Michelson and E. W. Morley, "On the Relative Motion of the Earth and the Luminiferous Ether," Am J Sci, ser. III, 34, 333-345; November 1887.

Accordingly, the apparatus underwent a major redesign before the 1887 trials. The
interferometer was mounted on a massive stone 1.5 m square and 0.3 m thick. The stone
rested on an annular wooden float whose outside diameter was 1.5 m, with an inside
diameter of 0.7 m and a thickness of 0.25 m. The wooden float rested on liquid mercury
(which Morley had collected and purified), contained in a cast iron trough 1.0 cm
thick, and of such dimensions as to leave a clearance of approximately one centimeter
around the float. A central pin was used to keep the float concentric with the trough.
The annular iron trough rested on a bed of concrete on a low brick pier built in the form
of a hollow octagon. An excavation was made down to bedrock to set the supporting
column for the apparatus.

FIGURE 2.6 Perspective view of the Michelson-Morley apparatus. Labeled parts: brick pier, cast iron trough, wooden float, stone slab, guiding pin, Argand lamp, half-silvered mirror, equalizing plate, banks of four mirrors, viewing telescope.

A bank of four mirrors was placed at each corner of the stone and multiple reflections
were utilized to increase the effective lengths of the two legs of the interferometer to
about 11 m. An Argand lamp was used as the light source and a wooden cover was
placed over the interferometer to prevent air currents and rapid changes in temperature. A perspective view of the complete equipment is shown in Figure 2.6.


It is demonstrated in Appendix A that if the two legs of the interferometer are equal
(as was the case in the Michelson-Morley experiment), and if the ether drift velocity
u is small compared to c, then the number of fringes shifted, n, is a function of the rotational angle θ of the apparatus, and is given by

$$ n = \frac{l u^2}{\lambda c^2} \cos 2\theta \tag{2.28} $$

Using l = 11 m and the wavelength of yellow light, and choosing the minimum value
u = 30 km/sec, one obtains

$$ n = 0.2 \cos 2\theta \tag{2.29} $$

as the minimum predicted fringe shift versus rotation angle of the apparatus. Equation
(2.29) assumes, in effect, that at some point in its orbit about the sun, the earth must
have a motion through the ether at least as great as 30 km/sec. If, while the earth is
in this orbital position, an experiment is performed in which the apparatus is rotated
through 360 deg, the fringes in the field of the viewing telescope should undergo a
cyclical displacement whose total (peak-to-peak) excursion is at least as great as four-tenths the distance
between adjacent fringes.
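
The step from (2.28) to (2.29) can be checked numerically. The sketch below assumes a wavelength of 5.9e-7 m for "yellow light" (the text does not quote a figure) and uses hypothetical function names; it evaluates n over a full rotation and reports the amplitude and the peak-to-peak excursion.

```python
import math

C = 2.998e8            # speed of light, m/s
U = 30e3               # minimum ether drift speed, m/s (from the text)
ARM_LENGTH = 11.0      # effective arm length l, m (from the text)
WAVELENGTH = 5.9e-7    # ASSUMED wavelength of yellow light, m


def fringe_shift(theta_deg, l=ARM_LENGTH, u=U, wavelength=WAVELENGTH):
    """Fringe shift n at rotation angle theta, per Eq. (2.28)."""
    coefficient = l * u**2 / (wavelength * C**2)
    return coefficient * math.cos(2.0 * math.radians(theta_deg))


# Sample the full 360-deg rotation in 15-deg steps.
shifts = [fringe_shift(theta) for theta in range(0, 361, 15)]
amplitude = max(shifts)
peak_to_peak = max(shifts) - min(shifts)

print(f"amplitude    = {amplitude:.3f} fringe")
print(f"peak-to-peak = {peak_to_peak:.3f} fringe")
```

With these inputs the coefficient rounds to 0.2, and the swing between extremes over one rotation rounds to 0.4 of a fringe spacing.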
Michelson and Morley conducted trials during the period July 8-12, 1887, and
plotted their data against one-eighth of the minimum predictable fringe shift given by (2.29).
Their curves for daytime and nighttime observations are reproduced in Figure 2.7.
FIGURE 2.7 Michelson-Morley data for July 1887. Curve labels: one-eighth of minimum predicted fringe shift; daytime observations. Scale marks: 0.05 and -0.05.

They estimated that the second harmonic of their experimental data was no greater
than 0.005 fringes and thus the maximum detected fringe shift was less than 1/40th of the
minimum predicted fringe shift. Of course, the possibility existed that in July, 1887, the
earth was nearly at rest in the ether, and thus Michelson and Morley concluded

It is just possible that the resultant velocity (of the earth relative to the ether) at the
time of the observations was small though the chances are much against it. The experiment
will therefore be repeated at intervals of three months, and thus all uncertainty will be
avoided.

However, after completing the July 1887 trials, Michelson and Morley did not
return to this problem. The completion of the ether drift experiment for all epochs was
finally accomplished by Dayton C. Miller, first in Cleveland and then at Mount Wilson,
during the years 1921 through 1926. The Cleveland data gave a null result comparable
in level to what had been obtained by Michelson and Morley, but considerable discussion was caused by the Mount Wilson data, because they seemed to indicate a small ether
drift: the observed fringe displacements were reduced only to about
one-thirteenth of the value predicted by the ether theory for a 30 km/sec velocity of
the earth in its orbit.
Miller's harmonic analysis of the data not only yielded a slight amplitude but also
a phase; however, the latter was incapable of being fitted into any logical relationship
corresponding to an oscillation of the north point during the course of a sidereal day.
This anomaly cast some doubt on the interpretation of the results and led to a critical
review of the data using statistical methods. The conclusion was reached that the small
observed second harmonic in the Mount Wilson experiment was not due to ether drift
but rather could be accounted for by temperature effects.¹¹

TABLE 2.1
TRIALS OF THE MICHELSON-MORLEY EXPERIMENT

Observer                   Year     Place                      l, cm   (2l/λ)(u/c)², fringe   A, fringe   Ratio
Michelson (a)              1881     Potsdam                      120   0.04                   0.01            2
Michelson and Morley (b)   1887     Cleveland                  1,100   0.40                   0.005          40
Morley and Miller (c)      1902-04  Cleveland                  3,220   1.13                   0.0073         80
Miller (d)                 1921     Mt. Wilson                 3,200   1.12                   0.04           15
Miller (e)                 1923-24  Cleveland                  3,200   1.12                   0.015          40
Miller (sunlight) (f)      1924     Cleveland                  3,200   1.12                   0.007          80
Tomaschek (starlight) (g)  1924     Heidelberg                   860   0.3                    0.01           15
Miller (h)                 1925-26  Mt. Wilson                 3,200   1.12                   0.044          13
Kennedy (i)                1926     Pasadena and Mt. Wilson      200   0.07                   0.001          35
Illingworth (j)            1927     Pasadena                     200   0.07                   0.0002        175
Piccard and Stahel (k)     1927     Mt. Rigi                     280   0.13                   0.003          20
Michelson et al. (l)       1929     Mt. Wilson                 2,590   0.9                    0.005          90
Joos (m)                   1930     Jena                       2,100   0.75                   0.001         375

a A. A. Michelson, Am. J. Sci. 22, 120 (1881); Phil. Mag. 13, 236 (1882).
b A. A. Michelson and E. W. Morley, Am. J. Sci. 34, 333 (1887); Phil. Mag. 24, 449 (1887).
c E. W. Morley and D. C. Miller, Phil. Mag. 9, 680 (1905); Proc. Am. Acad. Arts Sci. 41, 321 (1905).
d D. C. Miller, Data sheets of Observations December 9 to 11, 1921 (unpublished).
e D. C. Miller, Observations, August 23 to September 4, 1923; June 27 to July 26, 1924 (unpublished).
f D. C. Miller, "Observations with Sunlight on July 8 to 9, 1924," Proc. Natl. Acad. Sci. 11, 311 (1925).
g R. Tomaschek, Ann. d. Physik 73, 105 (1924).
h D. C. Miller, Revs. Modern Phys. 5, 203 (1933).
i R. J. Kennedy, Proc. Natl. Acad. Sci. 12, 621 (1926); Astrophys. J. 68, 367 (1928).
j K. K. Illingworth, Phys. Rev. 30, 692 (1927).
k A. Piccard and E. Stahel, Compt. rend. 183, 420 (1926); 184, 152, 451 (1927); 185, 1198 (1927); J. phys. radium 8, 56 (1927).
l Michelson, Pease, and Pearson, Nature 123, 88 (1929); J. Opt. Soc. Am. 18, 181 (1929).
m G. Joos, Ann. Physik 7, 385 (1930); Naturwiss. 38, 784 (1931).
Many other investigators have repeated the Michelson-Morley experiment, and a
summary of the various trials is given in Table 2.1 with the appropriate
journal references listed underneath.¹² In response to various objections to the original
experiment, several of the parameters were varied; sunlight and starlight were substituted for the terrestrial source, mountain-top installations were used to minimize a
possible "ether drag" over the surface of the earth, and one experiment was even
performed in a balloon.
In all these trials, the two arms of the interferometer were equal, the length being as
listed in Column 4 of Table 2.1. Column 5 gives the minimum predicted shift at some
time of year, based on the earth's orbital speed of 30 km/sec, and is twice the amplitude
of the corresponding second harmonic. Column 6 lists the amplitude A of the second
harmonic of fringe shift actually found by each observer. The last column gives the
ratio of the minimum predicted second harmonic to that actually observed. For many
of the trials this ratio is large enough that clearly a null result for the Michelson-Morley
experiment can be accepted with confidence.
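
The relation among the last three columns of Table 2.1 can be illustrated with a short Python sketch. It takes the description above at face value (the predicted second-harmonic amplitude is half of Column 5) and recomputes the ratio from the tabulated entries; the function name is hypothetical, and small differences from the printed ratios presumably reflect rounding in the tabulated values.

```python
# Columns 5, 6, and 7 of Table 2.1 as
# (observer, predicted shift, observed amplitude A, tabulated ratio).
TRIALS = [
    ("Michelson 1881", 0.04, 0.01, 2),
    ("Michelson and Morley 1887", 0.40, 0.005, 40),
    ("Morley and Miller 1902-04", 1.13, 0.0073, 80),
    ("Miller 1921", 1.12, 0.04, 15),
    ("Miller 1923-24", 1.12, 0.015, 40),
    ("Miller (sunlight) 1924", 1.12, 0.007, 80),
    ("Tomaschek (starlight) 1924", 0.3, 0.01, 15),
    ("Miller 1925-26", 1.12, 0.044, 13),
    ("Kennedy 1926", 0.07, 0.001, 35),
    ("Illingworth 1927", 0.07, 0.0002, 175),
    ("Piccard and Stahel 1927", 0.13, 0.003, 20),
    ("Michelson et al. 1929", 0.9, 0.005, 90),
    ("Joos 1930", 0.75, 0.001, 375),
]


def predicted_to_observed(predicted_shift, observed_amplitude):
    """Ratio of the predicted second-harmonic amplitude to the observed one.

    The predicted amplitude is taken as half of the Column 5 entry.
    """
    return (predicted_shift / 2.0) / observed_amplitude


computed_ratios = [predicted_to_observed(p, a) for _, p, a, _ in TRIALS]

for (observer, _, _, tabulated), computed in zip(TRIALS, computed_ratios):
    print(f"{observer:28s} computed {computed:7.1f}   tabulated {tabulated}")
```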

2.6 ETHER DRAG

The negative result of the Michelson-Morley experiment was totally unexpected and
very perplexing. If one presumes that there is a luminiferous ether in which the velocity
of light is c in all directions, the results of this experiment suggest that the light from a
distant star sweeps past an observer on earth at this velocity c regardless of where the
earth happens to be in its orbit, and thus regardless of the earth's velocity relative to
the ether. But this contradicts all common-sense knowledge of the law of addition of
velocities, embodied in Equations (2.18).
To phrase this problem more specifically, suppose that when the earth is at a certain
point in its orbit, Cartesian axes XYZ are constructed so that the earth is instantaneously at rest in XYZ and so that the X axis lies along the earth's orbit. Then for the
incoming starlight,

$$ \dot{x} = c_x \qquad \dot{y} = c_y \qquad \dot{z} = c_z $$

and

$$ c = (c_x^2 + c_y^2 + c_z^2)^{1/2} $$

is the speed of the starlight in XYZ. Six months later, another Cartesian frame can be
constructed whose X' axis slides along the original X axis at speed 2v, with v the earth's
orbital velocity. The earth will be instantaneously at rest in X'Y'Z', and according to
(2.18)

$$ \dot{x}' = c_x - 2v \qquad \dot{y}' = c_y \qquad \dot{z}' = c_z $$

so that

$$ c' = [(c_x - 2v)^2 + c_y^2 + c_z^2]^{1/2} \neq c $$

11 R. S. Shankland, S. W. McCuskey, F. C. Leone, and G. Kuerti, "New Analysis of the Interferometer Observations of Dayton C. Miller," Rev Mod Phys, 27, 167-178; April 1955.
12 This table is reproduced with the kind permission of Messrs. Shankland, McCuskey, Leone, and
Kuerti and is taken from their paper, Ibid., 168.

Thus the velocity of the starlight should be different at the two orbital positions. But
the results of the Michelson-Morley experiment do not reveal this difference. It is as
though the ether were caught up by the earth's atmosphere and dragged along with it,
thus accounting for the apparent constancy of the velocity of light in all directions
within the earth's atmosphere for all positions in the earth's orbit.
However, this concept of "ether drag" suffers a fatal objection when Bradley's discovery of aberration is recalled. If the ether were dragged along by the earth's atmosphere, then the setting of a telescope would not have to be altered to compensate for
the earth's orbital velocity and there would be no aberration in the position of any star.
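
The Galilean prediction that the starlight speed should differ between the two orbital positions can be put in numbers. In the sketch below the starlight directions are chosen arbitrarily for illustration and the function name is hypothetical; the formula is the classical velocity addition described above, with the second frame moving at 2v along X.

```python
import math

C = 2.998e8   # speed of starlight in the first frame XYZ, m/s
V = 30e3      # orbital speed of the earth, m/s


def speed_six_months_later(cx, cy, cz, v=V):
    """Galilean speed of the starlight in X'Y'Z', which moves at 2v along X."""
    return math.sqrt((cx - 2.0 * v) ** 2 + cy**2 + cz**2)


# Illustrative case 1: starlight traveling along the X axis.
along_x = speed_six_months_later(C, 0.0, 0.0)
# Illustrative case 2: starlight traveling perpendicular to the X axis.
across_x = speed_six_months_later(0.0, C, 0.0)

print(f"along X:  c' - c = {along_x - C:.1f} m/s")
print(f"across X: c' - c = {across_x - C:.3f} m/s")
```

The first case shows a first-order change of 2v, the second only a second-order change; in neither case does the classical law leave the speed equal to c.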

2.7 THE LORENTZ-FITZGERALD CONTRACTION HYPOTHESIS

Another attempt to preserve the ether hypothesis but remain consistent with the
negative result of the Michelson-Morley experiment was made by Lorentz.¹³ He
postulated that, as a result of its speed u through the ether, a material body is contracted by the factor $(1 - u^2/c^2)^{1/2}$ in the direction of its motion. This means, in the
Michelson-Morley experiment, that if $\ell_1$ and $\ell_2$ are the lengths of the arms of the interferometer when it is at rest in the ether, then under the conditions depicted by Figure 2.4, $l_2 = \ell_2 (1 - u^2/c^2)^{1/2}$ and $l_1 = \ell_1$. After the apparatus has been rotated 90 deg,
$l_2 = \ell_2$ and $l_1 = \ell_1 (1 - u^2/c^2)^{1/2}$. Making these substitutions in (2.24) and (2.25) gives

$$ n = \frac{\Delta' - \Delta}{2\pi} = 0 \tag{2.30} $$

which would account neatly for why no fringe shift was observed by Michelson and
Morley.
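
The cancellation expressed by (2.30) can be illustrated numerically. The sketch below does not reproduce (2.24) and (2.25); instead it assumes the standard ether-theory round-trip times for an arm parallel and an arm perpendicular to the drift, and an assumed wavelength of 5.9e-7 m. The function names and the sign convention for the shift are hypothetical choices made for this example.

```python
import math

C = 2.998e8           # speed of light in the ether, m/s
WAVELENGTH = 5.9e-7   # ASSUMED wavelength, m


def round_trip_times(l_parallel, l_perpendicular, u):
    """Classical ether-theory round-trip times for the two arms.

    Assumed forms (standard results, not quoted from the text):
      t_par  = 2 l / (c (1 - u^2/c^2))
      t_perp = 2 l / (c sqrt(1 - u^2/c^2))
    """
    beta_sq = (u / C) ** 2
    t_par = 2.0 * l_parallel / (C * (1.0 - beta_sq))
    t_perp = 2.0 * l_perpendicular / (C * math.sqrt(1.0 - beta_sq))
    return t_par, t_perp


def fringe_shift_on_rotation(rest_1, rest_2, u, contraction=False):
    """Fringe shift for a 90-deg rotation, with or without contraction."""
    factor = math.sqrt(1.0 - (u / C) ** 2) if contraction else 1.0
    # Before rotation: arm 2 lies along the motion through the ether.
    t2, t1 = round_trip_times(rest_2 * factor, rest_1, u)
    before = t2 - t1
    # After rotation: arm 1 lies along the motion through the ether.
    t1_rot, t2_rot = round_trip_times(rest_1 * factor, rest_2, u)
    after = t2_rot - t1_rot
    # Change in arrival-time difference, expressed in wavelengths.
    return C * (before - after) / WAVELENGTH


no_contraction = fringe_shift_on_rotation(11.0, 11.0, 30e3)
with_contraction = fringe_shift_on_rotation(11.0, 11.0, 30e3, contraction=True)

print(f"without contraction: n = {no_contraction:.4f}")
print(f"with contraction:    n = {with_contraction:.2e}")
```

Without the contraction the result agrees with the second-order estimate (l1 + l2) u^2 / (lambda c^2); with it, the shift vanishes to within floating-point rounding.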
In a footnote Lorentz acknowledges that this possibility had occurred independently
to FitzGerald, who apparently had limited his discussion of the idea to lectures to his
students and had not published his speculations. The length contraction hypothesis is
customarily identified with the names of both these men.
This proposal was sternly criticized by Poincaré, who objected to an ad hoc hypothesis, without experimental basis, designed to explain why one cannot detect the presence
of something else which has been hypothesized: the ether. Nevertheless, the proposal
was taken seriously by others, and Kennedy devised a modification of the Michelson-Morley experiment capable of testing the Lorentz-FitzGerald contraction hypothesis.¹⁴
Assuming that (2.30) is correct, if $l_1 = l_2$ (as was intended by Michelson and Morley
in the original experiment), Δ would remain constant at zero even if u, the speed of the
earth through the ether, were to change. Kennedy, with the assistance of Thorndike,
constructed an interferometer in which $l_1 - l_2$ was as great as the coherence of the
source would permit, attaining a value $l_1 - l_2 = 318$ mm. Then, instead of rotating the
apparatus, he held it fixed to see if there were any variation in Δ as the earth's speed
through the ether changed.

13 H. A. Lorentz, "Michelson's Interference Experiment," Versuch einer Theorie der elektrischen und optischen Erscheinungen in bewegten Körpern, Sections 89-92, Leyden, 1895. (An English translation appears in The Principle of Relativity, Dover Publications, Inc., New York, 1958.)
14 R. J. Kennedy and E. M. Thorndike, "Experimental Establishment of the Relativity of Time," Phys Rev, 42, 400-418; November 1932.
If it is assumed that the sun is gliding through the ether at a velocity $v_s$, that the
center of the earth is moving along its orbit at a velocity $v_e$ relative to the sun, and that
a point on the surface of the earth has an instantaneous velocity $v_r$ relative to the
earth's center, then
is the square of the instantaneous speed of this terrestrial point through the ether.
Twelve hours later it has changed to

whereas six months later it becomes

The fringe shift noted by not rotating the apparatus, but taking readings 12 hours
apart should be, according to (2.30),

$$ n_{21} = \frac{\Delta_2 - \Delta_1}{2\pi} = \frac{2(l_1 - l_2)}{\lambda} \left[ (1 - u_2^2/c^2)^{-1/2} - (1 - u_1^2/c^2)^{-1/2} \right] $$

Similarly, the fringe shift noted by not rotating the apparatus but taking readings six
months apart should be

$$ n_{31} = \frac{\Delta_3 - \Delta_1}{2\pi} = \frac{2(l_1 - l_2)}{\lambda} \left[ (1 - u_3^2/c^2)^{-1/2} - (1 - u_1^2/c^2)^{-1/2} \right] $$

If $u_1^2$, $u_2^2$, and $u_3^2$ are all small relative to $c^2$, these expressions reduce to
(2.31 )
(2.32)
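
A numerical sketch may help fix the orders of magnitude involved. Everything specific in it is an assumption made for illustration: the three velocities are taken to add vectorially, the rotational velocity is taken to reverse after twelve hours and the orbital velocity after six months, the sun's speed of 20 km/sec through the ether is a hypothetical value, all three velocities are taken to be collinear, and the wavelength is assumed to be 5.461e-7 m. The small-u form used for comparison is simply the first term of the binomial expansion of the bracketed expressions, namely (l1 - l2)(u_b^2 - u_a^2)/(lambda c^2).

```python
import math

C = 2.998e8               # speed of light, m/s
WAVELENGTH = 5.461e-7     # ASSUMED wavelength, m
ARM_DIFFERENCE = 0.318    # l1 - l2, m (value quoted in the text)


def speed_squared(*velocities):
    """Square of the magnitude of the vector sum of the given velocities."""
    total = [sum(components) for components in zip(*velocities)]
    return sum(component**2 for component in total)


def negate(velocity):
    """Return the reversed velocity vector."""
    return tuple(-component for component in velocity)


def shift_exact(u_first_sq, u_second_sq):
    """Fringe shift between two epochs from the unexpanded expression."""
    def inverse_root(u_sq):
        return 1.0 / math.sqrt(1.0 - u_sq / C**2)
    return (2.0 * ARM_DIFFERENCE / WAVELENGTH
            * (inverse_root(u_second_sq) - inverse_root(u_first_sq)))


def shift_small_u(u_first_sq, u_second_sq):
    """First-order binomial expansion of shift_exact (valid for u << c)."""
    return ARM_DIFFERENCE / WAVELENGTH * (u_second_sq - u_first_sq) / C**2


# HYPOTHETICAL collinear velocities, m/s.
v_sun = (20e3, 0.0, 0.0)
v_earth = (30e3, 0.0, 0.0)
v_rotation = (0.4e3, 0.0, 0.0)

u1_sq = speed_squared(v_sun, v_earth, v_rotation)
u2_sq = speed_squared(v_sun, v_earth, negate(v_rotation))   # 12 hours later
u3_sq = speed_squared(v_sun, negate(v_earth), v_rotation)   # six months later

n21_exact, n21_approx = shift_exact(u1_sq, u2_sq), shift_small_u(u1_sq, u2_sq)
n31_exact, n31_approx = shift_exact(u1_sq, u3_sq), shift_small_u(u1_sq, u3_sq)

print(f"n21: exact {n21_exact:.3e}, small-u {n21_approx:.3e}")
print(f"n31: exact {n31_exact:.3e}, small-u {n31_approx:.3e}")
```

Under these assumptions the exact and expanded forms agree closely, and the six-month shift is much larger than the twelve-hour shift because the orbital speed far exceeds the rotational speed.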

By using a precise photographic technique, Kennedy and Thorndike were able to


detect a fringe shift as small as 1 0100 th of the spacing between adjacent fringes. This
was almost two orders of magnitude more sensitive than the original Michelson and
Morley technique, and compensated for the lowered sensitivity in the length factor.
Thus expressions (2.31) and (2.32) for the case of the Kennedy-Thorndike experiment
give an overall sensitivity comparable to that arising from (2.27) in the case of the
Michelson-Morley experiment.
The result of the analysis of 300 exposures of the fringe pattern photographed at the
viewing telescope of the Kennedy-Thorndike apparatus once again gave a null result
within the limit of experimental error. As a result of this experiment, it is reasonable
to conclude that the Lorentz-FitzGerald contraction hypothesis put forth as a means
of explaining the null result of the Michelson-Morley experiment, while still preserving
the ether concept, is invalid.


2.8 EMISSION THEORIES


Several other explanations of the Michelson-Morley null result were attempted,
involving the assumption that the velocity of light was the same in all directions in a
coordinate system in which the source was at rest. These emission theories, as they were
called, differed from each other in that they predicted different results when the light
was reflected from a moving mirror. After reflection the three alternatives were that the
velocity of the light (1) remain c relative to the source, (2) become c relative to the
mirror, or (3) become c relative to the mirror image of the source.
The first alternative predicts complications in the interference pattern in the
Michelson-Morley experiment when extraterrestrial sources are used, but the results
of Miller using sunlight did not reveal any such effects. The second and third alternatives lead to coherence difficulties with reflected light, and all three emission theories
are inconsistent with the findings of de Sitter, previously mentioned, that the velocity
of light is independent of the motion of the source. Thus these attempted explanations
had to be rejected along with the Lorentz-FitzGerald contraction hypothesis and the
assumption of ether drag.
Classical physics had reached an impasse. The laws of mechanics seemed to obey a
relativity principle via the Galilean transformation. The velocity of light also seemed
to obey a relativity principle in that it appeared to be the same in all coordinate systems in vacuo. Still this was incompatible with the velocity addition law (2.18) arising
from the Galilean transformation. The ether hypothesis, which had at first seemed so
promising, had not been established after several decades of brilliant experimental
research. Clearly a new approach to the problem was needed. This was provided
by Einstein, who concentrated his attention on a reexamination of the basic principles involved in velocity determinations, namely, the measurement of space intervals
and time intervals. Realization that neither was an invariant led to a resolution of the
impasse and to a satisfactory modification of the Galilean transformation equations.
2.9 THE INTERDEPENDENCE OF SPACE AND TIME

In the introduction to his first paper on this subject Einstein said¹⁵


. . . The same laws of electrodynamics and optics will be valid for all frames of reference for which the equations of mechanics hold good. We will raise this conjecture (the
purport of which will hereafter be called the 'Principle of Relativity') to the status of a
postulate, and also introduce another postulate, which is only apparently irreconcilable
with the former, namely, that light is always propagated in empty space with a definite
velocity c which is independent of the state of motion of the emitting body. These two
postulates suffice for the attainment of a simple and consistent theory . . . . The introduction of a 'luminiferous ether' will prove to be superfluous inasmuch as the view here to
be developed will not require an 'absolutely stationary space' provided with special properties . . . .

Einstein thus accepted the principle of relativity in its broadest sense, postulating
that all the laws of physics take the same form in every inertial system of coordinates.
15 A. Einstein, "On the Electrodynamics of Moving Bodies," Ann Phys, 17, 891-921; 1905. (An English translation can be found in The Principle of Relativity, Dover Publications, Inc., New York, 1958.)


He further adopted the view that light waves, like sound waves, have a propagation
velocity which is unaffected by the motion of the source.† In discarding as superfluous
the concept of an ether, Einstein thus also accepted the notion that a light wave traveling through empty space will pass two different observers at the same speed c even
if these observers are in motion relative to each other.
Acceptance of the second postulate together with elimination of the ether concept
automatically explains the null result of the Michelson-Morley experiment. It also
leads to a modification of the Galilean transformation equations and therefore to a
modification of the velocity transformation law, and thus ultimately to a satisfactory
explanation of the Fizeau experiment. However, before proceeding to these developments it is desirable to consider the implications of the second postulate with respect
to the nature of space and time. The conclusions to be drawn will appear surprising
on first inspection because they are contrary to common experience, and it is this facet
of special relativity which often causes the greatest initial difficulty. Common experience
develops the ingrained belief that time and space are totally different and unconnected;
once this belief is successfully challenged, the remainder of the special theory of relativity follows logically and without great difficulty.
It takes only a simple example to challenge this belief. The one to be presented here
consists of a sequence of experiments designed to establish the lengths of rulers under
various conditions of motion. Some aspects of these experiments are not completely
practical, but could perhaps be made so by a modest amount of elaboration. However,
the experiments are completely logical, which is all that is essential.
That which follows will be developed in what might seem to be overly great detail.
However it is concerned with the crux of the dilemma involving light velocity and the
velocity transformation law, and a thorough understanding at this stage will greatly
facilitate all subsequent developments.
Two Rulers at Rest. Imagine two long slender rulers, R and R', perfectly straight
and rigid and laid out side by side on the ground, at rest in an inertial coordinate
system. Three observers, who will be designated as O, O', and O'', are in the process of
determining if these rulers are precisely the same length. They do this by lining up the
rulers so that they are parallel and flush at one end, and then seeing if they are flush
at the other end also. Having satisfied themselves that such is the case, the three
observers then establish midpoints on each of the rulers, by the use of standard techniques such as the construction of the perpendicular bisector or the employment of
an auxiliary third ruler of half-length. They have no difficulty doing all this because
the two rulers are at rest side by side.
The Length of a Moving Ruler. Next imagine that one of these rulers, say R, is
parallel to the ground and just above it, but is now in motion with respect to the ground
at a constant velocity. Let this velocity be parallel to the ground, but at an arbitrary
angle with respect to the long dimension of the ruler R. This situation is depicted in
Figure 2.8. Is the length of the moving ruler R the same as when it was at rest on the
ground?
To decide this question, one must first establish an operational definition of length
which is applicable to situations involving motion. A suitable definition is embodied in
the following technique of measurement: Let an observer stationary with respect to
the ground determine a fixed point $P_A$ on the ground directly under one end A of the
moving ruler at a specific time t. Similarly, let him determine a fixed point $P_B$ on the
ground directly under the other end B of the moving ruler at the same time t. These
two points fixed on the ground become a permanent record, and at their leisure, ground
observers can reposition the other ruler R' so that one of its ends coincides with $P_A$; if
its second end coincides with $P_B$, the moving ruler R may be said to have a length
which is unchanged from the value it had when the two rulers were at rest side by side.

† It is interesting to note that Einstein postulated this eight years before de Sitter's confirming
observations of the light arriving from binary stars.

FIGURE 2.8 The moving ruler R. Ruler R' is stationary on the ground; ruler R moves just above the ground at constant velocity v; points $P_A$ and $P_B$ are on the ground and underneath the two ends of R at a common time.
The crucial feature in this technique of dynamic length measurement is the requirement that the two fixed points $P_A$ and $P_B$ on the ground be determined at precisely the
same time. To ensure this the three observers O, O', and O'' can equip opposite ends of
each ruler with small, insulated charged probes of unlike electrical charge. Thus if one
refers again to Figure 2.8, it can be imagined that the ruler ends labeled A and A' contain positively electrified probes and the ruler ends labeled B and B' contain negatively
electrified probes. A detail of one of these probes is suggested in Figure 2.9a.
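
Why the two ground marks must be made at the same time can be seen from a minimal sketch. It makes simplifying assumptions not in the text: the ruler slides along its own length rather than at an arbitrary angle, and the parameter `length` stands for whatever length the ruler actually has in the ground frame (the very quantity the experiment seeks). The function names are hypothetical.

```python
def end_positions(t, length, speed, start=0.0):
    """Ground positions of ends A and B of a ruler sliding along its length."""
    position_a = start + speed * t
    position_b = position_a + length
    return position_a, position_b


def recorded_separation(t_mark_a, t_mark_b, length, speed):
    """Distance between ground marks P_A and P_B made at the given times."""
    p_a, _ = end_positions(t_mark_a, length, speed)
    _, p_b = end_positions(t_mark_b, length, speed)
    return abs(p_b - p_a)


# Marks made at a common time reproduce the ruler's length in the ground frame.
simultaneous = recorded_separation(1.0, 1.0, length=2.0, speed=3.0)
# Marking B half a time unit late adds speed * delay to the recorded distance.
delayed = recorded_separation(1.0, 1.5, length=2.0, speed=3.0)

print(simultaneous, delayed)
```

Any timing error between the two marks enters the recorded distance directly, multiplied by the ruler's speed, which is why the probes are arranged to signal both coincidences by light pulses.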
Ruler R' can then be placed on the ground, with its charged probes pointing up, in
approximately the position above which ruler R is expected to pass, as indicated in
Figure 2.9b. If the two rulers are still of equal length, and R' has been properly positioned
on the ground,† as R passes by with its probes pointing down, there is an instant at
which the negatively charged B probe is directly above the positively charged A' probe,
as suggested by Figure 2.9c. This coincidence causes an intense local field, resulting in
an arc of short duration, which is the source of a light pulse. At this same instant, the
positively charged A probe is directly above the negatively charged B' probe, and this
coincidence is the source of another light pulse. If O' is stationed at the midpoint of the
stationary ruler R', these two light pulses will reach him simultaneously. Conversely,
from the single observation that two light pulses reach him at the same time, O' will
deduce that A and B' did momentarily coincide, that A' and B did also momentarily
coincide, that these coincidences occurred at the same time, and thus that the two
rulers are still of equal length, even though R is now moving, whereas R' is stationary.

FIGURE 2.9 Details of the ruler experiment. (a) Probe construction. (b) Moving ruler R approaching coincidence with stationary ruler R'. (c) Coincidence of two probes.

If the two rulers are no longer of equal length, the probes at A' and B' may be displaced
equal amounts away from (or toward) the midpoint of R'. The ruler R' can then be
placed in a variety of positions on the ground in the hope of determining a position
which will cause, for some pass of the ruler R, two light pulses to reach the midpoint
of R' simultaneously. Ultimately, a placement of R' and a separation of its two probes
will be found such as to cause the simultaneous arrival of two light pulses at the midpoint of R'. The distance separating the two probes on R' is then the length of the
moving ruler R.
So far, a technique for determining the length of the moving ruler R has been developed, but the question still has not been answered as to whether R has a dynamic
length which is the same as its stationary length. To settle this question, the three
observers O, O', and O'' devise two symmetrical experiments.
Relative Motion Perpendicular to Length. In the first of these symmetrical experiments, O stations himself at the midpoint of the ruler R and takes off on a space
journey. Similarly, O' stations himself at the midpoint of the ruler R' and takes off on
a space journey. The flight paths and positions of the two rulers are mirror images
with respect to a vertical plane through the position occupied by O'', who will be
assumed to remain at rest in the original inertial coordinate system X''Y''Z''. These
flight paths will be such that O and O' arrange to encounter each other in outer space
in a region remote from all other bodies. They coast past each other on immediately
adjacent and parallel paths at a constant relative speed u, during a period of time when
neither ruler is accelerating and each ruler is perpendicular to the path.
Figure 2.10 indicates a sequence of positions of the two rulers as "seen" by O''. By
symmetry, O'' must observe that the two rulers are still of equal length and thus that
A and B' will coincide, as will A' and B, these coincidences occurring simultaneously.
When A and B' are precisely opposite each other, the positively charged probe at A and
the negatively charged probe at B' interact to generate a light pulse. Similarly, an arc as
the ends A' and B pass each other gives rise to another light pulse. The generation of
these light pulses is evidence that the rulers are still of equal length. Yet O'' cannot conclude from this that either ruler is now the same length that it was when at rest relative
to him because now both rulers are in motion relative to him.
However, either O or O' is in a position to judge this question, since each is stationary
with respect to one of the rulers. For example, during the period of encounter, observer
O senses no acceleration and can consider himself and the ruler R to be at rest in an
inertial coordinate system XYZ. Under Einstein's first postulate, all of the laws of
nature should be equally applicable in XYZ and the coordinate system X''Y''Z'' which
O formerly shared with O''. In particular, O has no reason to believe that the length of R
is now any different from what it was when he and R were both at rest in X''Y''Z''.
Therefore since O detects the two pulses caused by the passage of R', he concludes that
not only is R' still the same length as R, but also that it is still the same length as it had
been when at rest relative to himself. Similar remarks can be made about the observations of O'.

† It may take many trials of the experiment to determine this position, with R always making the
same pass.

FIGURE 2.10 Moving ruler experiment: transverse motion. Panels (a) and (b) show successive positions of rulers R and R' with observers O and O', as viewed by O''.

This result may be obtained in another way. Assume that when the rulers pass each
other, no light pulses occur, indicating that the rulers are now of dissimilar length. Then
let O move each of his probes the same small amount closer to the midpoint of R; let O'
extend each of his probes the same small amount further from the midpoint of R'; and
let the experiment be re-run. If this procedure is repeated until positions of the probes
are found which cause light pulses, then both O and O' can say, for example, that R is
the longer of the two rulers. But this result is impossible. The experiment is completely
symmetrical. If O thinks R' is longer, O' must think R is longer. The only symmetrical
answer possible is that no adjustment of the probe positions was needed, and that they
both think both rulers are the same length, unchanged from the value when each was
at rest relative to O''.
Therefore the conclusion is reached that when a body is in motion relative to an
observer, the measurement of length transverse to that motion is unaffected by the
motion, being the same as when the body is at rest relative to the observer.
Relative Motion Parallel to Length. Now let the space journeys of O and O' on
the rulers R and R', respectively, be repeated in all original details except that during
the period of encounter the two rulers are oriented parallel to the paths. In addition,
O and O' each now takes along a clock, the two clocks having been determined to be
identical when at rest relative to O''.
Figure 2.11 indicates a sequence of positions of the two rulers as "seen" by O''. When
the ends A and A' pass each other, nothing happens. But as A and B' are precisely opposite each other, the positively charged probe at A and the negatively charged probe at
B' interact to generate a light pulse. Similarly an arc as the ends A' and B pass each
other gives rise to another light pulse.
As O" views this sequence, the two rulers are moving in opposite directions at equal
speeds, and by symmetry the AB' coincidence occurs at the same time as the A'B
coincidence. The two pulses of light originate simultaneously and spread uniformly
thereafter as spherical wavefronts traveling at the velocity c. The centers of these
spherical wavefronts are the fixed points $P_1$ and $P_2$, indicated in position (c) of Figure 2.11. Because the velocity of light is finite, O" "sees" the two rulers separating as
the wavefronts grow and thus finds that the AB' pulse passes O before it passes O',
whereas the A'B pulse passes O' before it passes O.
Although O" can conclude that the two rulers are still the same length as each other,
he cannot conclude that either of them is still the length it was when at rest relative to
him because once again they are both in motion relative to him.
However, either O or O' is in a position to judge this question, since each is stationary
relative to one of the rulers. For example, during the period of encounter, observer O
senses no acceleration and can consider himself and the ruler R to be at rest in an
inertial coordinate system XYZ. Using Einstein's second postulate, O knows that the
velocity of light is independent of the motion of the source and equal to a constant c in
all directions in XYZ. Thus since he knows himself to be at the midpoint, that one pulse
of light originated at one fixed end of his ruler and that the other pulse of light originated at the other fixed end of his ruler, if these two pulses reach him simultaneously,
he will conclude that the arcs occurred simultaneously. This would mean to O that the
coincidences of AB' and A'B were simultaneous and thus that the ruler R' was still the
same length as his own.
Does O receive the two light pulses at the same time? Although this would be contrary to the observation made by O", let it nevertheless be assumed that he does. This
would mean that O concludes that the two rulers are still the same length and therefore
that O' was directly opposite O at the instant of origination of the two light pulses. Due
to the finite velocity of light, at the later instant when O receives these two pulses, O' is
already beyond O and the AB' pulse has not yet reached O' whereas the A'B pulse has
already passed him.
But this result is patently impossible. The experiment is completely symmetrical
with respect to O and O' and observer O cannot receive the pulses simultaneously unless
O' does also. Thus the two light pulses do not arrive at O at the same time and O no
longer thinks the two rulers are the same length.

[FIGURE 2.11. Moving ruler experiment: longitudinal motion. Positions (a), (b), and (c) of the two rulers as "seen" by O"; position (c) shows the spherical wavefront of the A'B pulse.]
In what order do the two light pulses reach O? If one takes the observations of O"
as a hint, one can assume that the AB' pulse reaches O an interval of time $\delta t$ ahead of
the A'B pulse, as measured by the clock O brought along with him. This implies that
the A'B pulse reaches O' sooner than the AB' pulse, say by an interval of time $\delta t'$, as
measured by the clock of O'. When this assumption is adjusted so that $\delta t = \delta t'$, the
result is completely symmetrical, as required.
An important conclusion has been reached. Observer O now feels that the ruler R' is
shorter than the ruler R by an amount $u\,\delta t$. Since he is still at rest relative to R, he has
no reason to believe that the length of his own ruler is any different from what it had
been originally when at rest in the coordinate system X"Y"Z". Thus he concludes that
the length of R' depends on its motion relative to him, when that motion is in a direction
parallel to its length.
In like manner observer O' now feels that the ruler R is shorter than the ruler R' by
an equal amount $u\,\delta t'$, and therefore that the length of R depends on its motion relative
to him, when that motion is in a direction parallel to its length.
It is clear that the two observers O and O' are no longer in agreement about measurements of distance, and that this is occasioned by their relative motion. Furthermore, the
two observers are not in agreement about the measurement of time intervals, for O
thinks that the AB' coincidence occurred first, whereas O' thinks that the A'B coincidence occurred first. Neither observer has any reason to believe that his own clock is
behaving in a different manner from when they were together at rest in X"Y"Z". Thus
each observer concludes that the other's clock has been affected by its motion relative
to him.
The temptation exists to raise the protest that the true picture of what is happening
is the sequence shown in Figure 2.11, as "seen" by O". If O and O' would only take their
motions into account, they could readily explain the time differentials in the arrival of
the two pulses and deduce that the two rulers are really still the same length. But this
point of view puts O" and his coordinate system in a privileged status. Why shouldn't
O and O' each have the right to consider himself at rest in a coordinate system in which
all the laws of physics are valid in the same form they have in the coordinate system
of O"? If this first postulate by Einstein is accepted as reasonable, then O' can properly
consider himself and his ruler to be at rest in a reference frame X'Y'Z' and to be measuring the length of R as it drifts by; he legitimately concludes that the measurement
of the length of R reveals a smaller value due to its relative motion.
These remarks can be made with equal validity when discussing O and his right to
consider himself at rest in a reference frame XYZ. Both observers conclude that the
other ruler is shorter, and both are correct, this surprising result being a consequence of
the operational definition of length enunciated earlier.
It has been noted previously that O" concludes that the two rulers are still the same
length as each other, and he is also correct; his conclusion is due to the fact that both
rulers have the same speed relative to him. O" cannot say that either ruler is the same
length it was when back on the ground at rest in front of him; to decide this, he would
have to perform an experiment of the type just concluded by O and O'. Were he to
perform such an experiment, he would find that the length of each ruler was now less,
and by the same amount, thus accounting for the fact that the two lengths are still
equal.
Thus the conclusion is reached that when a body is in motion relative to an observer,
the measurement of length parallel to that motion is affected by the motion, being
shorter than when the body is at rest relative to the observer.
General Remarks.
The previous set of hypothetical experiments, or Gedankenexperimente, reveals that space and time, upon being considered on an operational basis
involving measurements, are not invariant concepts. If a material body has a constant
velocity relative to an observer O, and he measures its longitudinal and transverse
dimensions, only his transverse results will agree with those obtained by an observer O'
who is stationary with respect to the body.
What makes this conclusion seem suspect is that in everyday experience one does
not perceive objects apparently shrinking as they take on a relative motion. Airplanes
are not noticeably shorter as they race down a runway, nor does a train seem to extend
its length as it draws to a stop at a station. The reason for this apparent invariance of
length can be traced to the fact that the velocity of light is so great compared to the
velocities of all material objects in one's common experience. In the ruler experiment
just discussed, if $c \gg u$, the time intervals $\delta t$ and $\delta t'$ are so small as to escape detection
when ruler sizes consistent with one's "common sense" are assumed. Thus the change
in length $u\,\delta t$ is normally so small as to be unobservable; however, in considering astronomical distances or great velocities this is no longer true and the effect is detectable
and significant.
Likewise, when one motors past a tower clock, the movement of its hands does not
appear altered by the relative motion. Here again this is due to the great disparity
between the velocity of light and the normal velocities of motor vehicles. Thus in the
ruler experiment just discussed, the size of $\delta t$ and $\delta t'$ is an index of the difference between
the readings of the clocks of O and O' as they record two events, the coincidences of
ends of the rulers. But for normal velocities u, and normal ruler lengths, $\delta t$ and $\delta t'$ are
negligible and the two clocks appear to agree. It is only when great velocities or distances come into play that the differences in the readings of these clocks become
important.
For this reason one can appreciate that the discovery that measurements of time and
distance depend on relative motion in no way upsets the large body of common experience built up during one's lifetime. Nevertheless, it is of the utmost scientific importance to recognize that these effects exist. This can only be done by considering situations beyond common experience in which such effects are significant. The purpose of
much of the remainder of this chapter will be to explore such situations.
The reader may wonder why light signals were chosen in these ruler experiments
rather than some other means of communication. The reason for this is that only light
signals can propagate in a vacuum, and if there were any other medium surrounding the
two rulers it would have a relative motion with respect to each ruler. In general these
relative motions would not be the same, thus destroying the complete symmetry of the
experiment, a crucial point in the argument.
From consideration of the ruler experiment involving longitudinal motion, it is evident that the change in length of a moving ruler, given by $u\,\delta t$, is dependent on clock
readings. The interval of time $\delta t$ between the AB' and A'B coincidences is in turn
dependent on the length of the rulers, so that measurements of time intervals and
space increments are interdependent.
Although two observers in relative motion will not, in general, agree about the distance between two points nor about the time interval between two events, this does not
mean that each cannot predict the values of the other's measurements from a knowledge of his own. Such predictions are accomplished via the coordinate transformation
equations which link the two frames of reference in which the two observers are stationary. It is now apparent that the Galilean Equations (2.4), which assume a time
invariance and predict a length invariance, are only approximate, and will need to be
modified in order to find the proper way to link the observations of O and O' so as to
be consistent with situations such as the two foregoing ruler experiments.

2.10 THE LORENTZ TRANSFORMATION

A satisfactory modification of the Galilean transformation can be accomplished by
returning to the ruler experiment involving longitudinal motion. Upon referring again
to the situation of Figure 2.11, one can let the origin of an XYZ coordinate system be
affixed to the tip of the probe at B and can let the origin of an X'Y'Z' coordinate system be placed at the tip of the probe at A', as shown in Figure 2.12. These two coordinate systems then have a relative speed u; the axes can be aligned so that X' and X
slide along each other, with the Y' and Y axes and the Z' and Z axes respectively parallel, thus duplicating the situation of Figure 2.1. Let it further be assumed that
observer O, who is stationary in XYZ, selects his time origin so that t = 0 corresponds to the A'B coincidence; in like manner O', who is stationary in X'Y'Z', will
be assumed to have chosen his time origin so that t' = 0 corresponds to the A'B coincidence, thus causing the two origins to coincide when t = t' = 0.

[FIGURE 2.12. Coordinate systems fixed on each ruler.]
It will be imagined that conceptually O determines a unique triplet of numbers
(x,y,z) for every point in XYZ space by laying out identical scales (e.g., in meters)
along his three axes, and that similarly O' determines a unique triplet of numbers
(x',y',z') for every point in X'Y'Z' space by laying out identical scales along his
three axes. It is further assumed that O and O' lay out these scales using the same
standard of length (e.g., a meter stick). By this it is meant that if O measures lengths
in terms of a ruler R marked in meters and at rest in XYZ, and if O' measures lengths
in terms of a ruler R' marked in meters and at rest in X'Y'Z', then if these two rulers
were brought to rest side by side, markings one meter apart on R would coincide with
markings one meter apart on R'.
Additionally, it will be necessary for each observer, O and O', to measure time
unambiguously at every point in his coordinate system. To this end it will be conceived that O has an inexhaustible supply of identical clocks, such that he has been
able to station one clock permanently at each point in XYZ. To ascertain that all
of these clocks are set properly and running at the same rate, O can then select one of
them as the reference and perform the following experiment: O places himself at the
reference clock and stations an auxiliary observer $O_1$ at the clock to be synchronized.
O sends out a pulse of light at time $t_a$ on the reference clock, directing it toward $O_1$, who
reflects it back by means of a mirror. The returned pulse of light reaches O at time $t_b$.
The clock where $O_1$ is stationed was set properly if it read $(t_a + t_b)/2$ at the instant the
light pulse reached the mirror. It is running at the proper rate if it proves to be set
properly every time O and $O_1$ choose to perform this experiment.
In this manner every clock in XYZ can be synchronized to the reference clock, and
thus to every other clock in XYZ. It will be assumed that this has been done, and this
will be the conception of time in the frame of reference XYZ.
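The synchronization test just described reduces to a single arithmetic check, sketched below. The function names, the 300 m separation, and the observed mirror reading are hypothetical values chosen only for illustration.

```python
C = 299_792_458.0  # speed of light in m/s

def expected_mirror_reading(t_a, t_b):
    """Reading a properly set distant clock shows when the pulse reaches the mirror."""
    return (t_a + t_b) / 2.0

def setting_error(t_a, t_b, observed_reading):
    """Positive if the distant clock is set ahead of the reference clock."""
    return observed_reading - expected_mirror_reading(t_a, t_b)

# Hypothetical run: the distant clock sits 300 m from the reference clock.
distance = 300.0
t_a = 0.0                        # emission time on the reference clock
t_b = t_a + 2.0 * distance / C   # return time on the reference clock
error = setting_error(t_a, t_b, observed_reading=1.2e-6)
print(f"expected {expected_mirror_reading(t_a, t_b):.4e} s, error {error:.4e} s")
```

Repeating the check at later times and finding the same error each time would indicate that the distant clock runs at the proper rate but is merely set wrong.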
Likewise, it can be conceived that O' has an inexhaustible supply of identical clocks
which he has arrayed at fixed points in X'Y'Z' and which he has synchronized by the
same procedure. It will be further assumed that if these two sets of clocks were brought
to rest relative to each other, they would be found to be identical and running at the
same rate.
With these concepts of spatial position and time, let an event be defined for
observer O as something which happens at a point P(x,y,z) at time t, or more briefly
at the "point" P(x,y,z,t). The same event will occur for observer O' at the "point"
P'(x',y',z',t').
Returning now to a consideration of the pulse of light caused by the coincidence of
A' and B, imagine that O has stationed an auxiliary observer $O_1$ at the fixed point
(x,y,z) and that $O_1$ records the event that this light pulse passes him as having occurred
at time t. Then it follows that O can characterize this event by the equation

$$x^2 + y^2 + z^2 = (ct)^2 \qquad (2.33)$$

Imagine further that O' has stationed an auxiliary observer $O'_1$ at the fixed point
(x',y',z') and that $O_1$ and $O'_1$ just happen to coincide at the instant the light pulse
passes. $O'_1$ records the event as having occurred at a time t', and O' can write

$$(x')^2 + (y')^2 + (z')^2 = (ct')^2 \qquad (2.34)$$

The transformation equations which link the observations in XYZ to those in X'Y'Z'
must be such that O can derive (2.34) from (2.33) and such that O' can derive (2.33)
from (2.34), since they are describing the same event.
The discussions of the previous section have already provided much information
about this transformation. For example, observers O and O' agree about distances in
the transverse directions and can write

$$y' = y \qquad z' = z \qquad (2.35)$$

Further, time intervals and spatial increments were found to be interdependent when
considering measurements in the longitudinal direction. Thus since every motion that
is uniform and rectilinear in XYZ must also appear uniform and rectilinear in X'Y'Z',
so that the transformation from (x,t) to (x',t') takes straight lines into straight lines,
and is therefore linear, it follows that

$$x' = \alpha_1 x + \alpha_2 t \qquad (2.36)$$
$$t' = \alpha_3 x + \alpha_4 t \qquad (2.37)$$

The absence of constant terms in these two equations is due to the fact that
(x' = 0, t' = 0) corresponds to (x = 0, t = 0). The problem now remains to evaluate the coefficients $\alpha_i$.
First of all, note that if a point P'(x',y',z') is fixed with respect to observer O', this
point appears to be moving in the positive X direction at speed u when observed by O.
For such a point, taking differentials of (2.36) gives

$$dx' = 0 = \alpha_1\,dx + \alpha_2\,dt$$

But in this case dx/dt = u so that

$$\alpha_2 = -\alpha_1 \frac{dx}{dt} = -u\alpha_1$$

and (2.36) can be rewritten

$$x' = \alpha_1 (x - ut) \qquad (2.38)$$

The remaining three constants, $\alpha_1$, $\alpha_3$, and $\alpha_4$, can be determined by requiring that
(2.33) and (2.34) transform into each other. Thus if Equations (2.35), (2.37), and (2.38)
are substituted into (2.34), one obtains

$$\alpha_1^2 x^2 - 2\alpha_1^2 u x t + \alpha_1^2 u^2 t^2 + y^2 + z^2 = \alpha_3^2 c^2 x^2 + 2\alpha_3 \alpha_4 c^2 x t + \alpha_4^2 c^2 t^2$$

Since this must agree with (2.33) for all values of x, y, z, and t, it follows that

$$\alpha_1^2 - \alpha_3^2 c^2 = 1$$
$$2\alpha_1^2 u + 2\alpha_3 \alpha_4 c^2 = 0$$
$$\alpha_4^2 c^2 - \alpha_1^2 u^2 = c^2$$

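Solution of these three equations gives

$$\alpha_1^2 = \alpha_4^2 = (1 - u^2/c^2)^{-1} \qquad \alpha_3 = -\frac{\alpha_1 u}{c^2}$$

which, upon taking the positive roots so that the transformation reduces to the identity when u = 0, yields the result

$$x' = K(x - ut)$$
$$y' = y$$
$$z' = z$$
$$t' = K(t - ux/c^2) \qquad (2.39)$$

with

$$K = (1 - u^2/c^2)^{-1/2}$$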
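As a numerical check on this solution, the sketch below substitutes the coefficients back into the three conditions and confirms that each residual vanishes. The function names are illustrative, and units with c = 1 are assumed for convenience.

```python
import math

def lorentz_coefficients(u, c=1.0):
    """Coefficients alpha_1..alpha_4 of x' = a1*x + a2*t, t' = a3*x + a4*t."""
    a1 = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    a2 = -u * a1           # from the requirement that a point fixed in X'Y'Z' moves at u
    a3 = -a1 * u / c**2
    a4 = a1                # positive root chosen
    return a1, a2, a3, a4

def condition_residuals(u, c=1.0):
    """Left side minus right side of the three coefficient conditions."""
    a1, _, a3, a4 = lorentz_coefficients(u, c)
    r1 = a1**2 - a3**2 * c**2 - 1.0
    r2 = 2.0 * a1**2 * u + 2.0 * a3 * a4 * c**2
    r3 = a4**2 * c**2 - a1**2 * u**2 - c**2
    return r1, r2, r3

print(condition_residuals(0.6))  # each entry should be zero to rounding error
```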

These important equations were derived by Einstein in his 1905 paper using an argument which has been reproduced in its essentials. They are commonly called the
Lorentz transformation equations, so named by Poincaré in honor of H. A. Lorentz,
who had derived them earlier (1903) under a different set of hypotheses.†
† These equations actually had been used even earlier by Voigt (1887) in connection with vibrating
motion. Lorentz in his development assumed the existence of an ether, the physical contraction of
bodies due to their motion through the ether, and required that Maxwell's equations transform
properly.

If u and the range of the variable x are both small compared to the velocity of light c,
Equations (2.39) reduce to

$$x' \approx x - ut$$
$$y' = y$$
$$z' = z$$
$$t' \approx t$$

which is essentially the Galilean transformation (2.17). Thus for velocities and distances encountered in common experience the Lorentz transformation can be approximated with negligible error by the Galilean transformation, a conclusion which is consistent with the discussion of the previous section.
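The size of the error committed by the Galilean approximation can be gauged with a short sketch. The event coordinates and the 30 m/sec speed below are hypothetical everyday values, and the function names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz(x, t, u, c=C):
    """Longitudinal part of the Lorentz transformation (2.39)."""
    k = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return k * (x - u * t), k * (t - u * x / c**2)

def galilean(x, t, u):
    """Low-speed limit: x' = x - ut, t' = t."""
    return x - u * t, t

# Hypothetical event: 1 km away, 10 s after the origins coincide, car at 30 m/s.
x, t, u = 1000.0, 10.0, 30.0
x_l, t_l = lorentz(x, t, u)
x_g, t_g = galilean(x, t, u)
dx, dt = abs(x_l - x_g), abs(t_l - t_g)
print(f"position differs by {dx:.2e} m, time by {dt:.2e} s")
```

Both differences come out many orders of magnitude below anything measurable with ordinary rulers and clocks.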
Equations (2.39) can be inverted to give the Lorentz transformation proceeding the
other way, namely,

$$x = K(x' + ut')$$
$$y = y'$$
$$z = z'$$
$$t = K(t' + ux'/c^2) \qquad (2.40)$$

Note that the only difference between (2.40) and (2.39) is the sign of u. But this is to
be expected; if X'Y'Z' is advancing along the X axis at velocity u, then XYZ is receding
along the X' axis at velocity -u.
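The sign-reversal property can be illustrated with a short sketch in which the inverse transformation is obtained simply by calling the forward one with -u. Units with c = 1 are assumed, and the sample event is chosen to lie on the light sphere of (2.33) so that (2.34) can be checked as well.

```python
import math

def forward(x, y, z, t, u, c=1.0):
    """Equations (2.39): coordinates of an event in X'Y'Z' from those in XYZ."""
    k = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return k * (x - u * t), y, z, k * (t - u * x / c**2)

def inverse(xp, yp, zp, tp, u, c=1.0):
    """Equations (2.40): identical in form to (2.39) with u replaced by -u."""
    return forward(xp, yp, zp, tp, -u, c)

event = (3.0, 4.0, 0.0, 5.0)      # satisfies x^2 + y^2 + z^2 = (ct)^2 with c = 1
primed = forward(*event, u=0.6)
recovered = inverse(*primed, u=0.6)

xp, yp, zp, tp = primed
light_sphere_residual = xp**2 + yp**2 + zp**2 - tp**2
print(primed, recovered, light_sphere_residual)
```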
A more general form of the Lorentz equations could be obtained by introducing
a third Cartesian coordinate system X*Y*Z* at rest with respect to XYZ but with
its axes tilted in an arbitrary way with respect to those of XYZ. This has the effect
of letting X'Y'Z' move through X*Y*Z* in an arbitrary direction. Even greater
generality could then be obtained by selecting a fourth frame X'*Y'*Z'* arbitrarily
tilted and displaced (statically) with respect to X'Y'Z'. The result would be that the
equations connecting X*Y*Z* and X'*Y'*Z'* form the most general Lorentz transformation, corresponding to the most general Galilean transformation (2.4). However, no loss
in generality will occur from confining one's attention to the simpler Lorentz transformation (2.39), since the transformations from XYZ to X*Y*Z* and from X'Y'Z' to
X'*Y'*Z'* are static, and therefore Galilean. This discussion parallels the remarks of
Section 2.2.
The Lorentz transformation equations can be looked upon as the means whereby one
links the two quartets of numbers P(x,y,z,t) and P'(x',y',z',t') which identify the
same event. This process has wide applicability since many physical phenomena can be
expressed in terms of events. For example, the progression of a mass particle along a
path can be thought of as a continuous sequence of events. P(x,y,z,t) traces this progression as seen by O, with the spatial variables continuous functions of the temporal
variable. The progression of this same particle as seen by O' can be deduced through use
of the Lorentz equations.
It should be noted that the transformations (2.39) and (2.40) are nonphysical for
$u \ge c$, since K is then infinite or imaginary.

2.11 LENGTH AND TIME UNDER THE LORENTZ TRANSFORMATION

It is now possible to give a quantitative interpretation of the second ruler experiment
of Section 2.9 in terms of the Lorentz transformation. Let the two ends of the ruler R'
be at $x'_1$ and $x'_2$. These spatial coordinates are independent of t' and observer O' can say
that the length of R' is

$$l_{R'}^0 = x'_2 - x'_1$$

If observer O wishes to measure the length of R', since it is in motion with respect to
him, he should measure its end coordinates $x_2$ and $x_1$ at a common time t. Using the
first of Equations (2.39), one can then write

$$x'_1 = K(x_1 - ut) \qquad x'_2 = K(x_2 - ut)$$
$$x'_2 - x'_1 = K(x_2 - x_1)$$

from which

$$l_{R'} = l_{R'}^0 (1 - u^2/c^2)^{1/2} \qquad (2.41)$$

in which $l_{R'}$ is the length of the ruler R', as determined by O. One could similarly
investigate the length of the ruler R using the first of Equations (2.40) and conclude
that R appears contracted to O' by the same factor.
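The measurement procedure just described, in which O records both end coordinates of the moving ruler at a common time t, can be sketched numerically. Units with c = 1, a rest length of 2, and u = 0.6 are hypothetical choices; the function name is illustrative.

```python
import math

def position_seen_by_O(x_prime, t, u, c=1.0):
    """Solve x' = K(x - ut) for x: where O finds a point fixed at x' in X'Y'Z' at time t."""
    k = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return x_prime / k + u * t

rest_length = 2.0   # length of R' as determined by O'
u = 0.6
t = 7.0             # any common time t serves; the result does not depend on it

x1 = position_seen_by_O(0.0, t, u)
x2 = position_seen_by_O(rest_length, t, u)
measured_length = x2 - x1
print(measured_length)  # rest_length * sqrt(1 - u^2), i.e. about 1.6
```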
Equation (2.41) is seen to be exactly the Lorentz-FitzGerald contraction formula.
However, it is to be remembered that the Lorentz contraction hypothesis included an
ether-filled space which did not contract, there being rather a physical contraction of
material bodies as they moved through the ether. Experiment proved this hypothesis
to be untenable. The interpretation to be placed on (2.41) is that the distance between
two points in one coordinate system appears to be contracted to an observer in relative
motion parallel to the line connecting these two points, whether a material body is present
or not. This is not an apparent contraction of material bodies alone, but of all of space;
as mentioned earlier, it is an effect caused by the operational definition of the measurement of length. If $u \ll c$, this contraction is insignificant unless the length itself is very
great. Two widely different examples serve to point up this effect.
EXAMPLE 2.1

The vehicular tunnel under Mont Blanc connecting France and Italy is 11.2 km long.
How much shorter does this tunnel appear to a motorist driving through it at 100 kph?
Equation (2.41) is applicable to this situation, and the first two terms of a power series
expansion give

$$l_{R'} = l_{R'}^0 \left(1 - \frac{1}{2}\frac{u^2}{c^2} + \cdots\right)$$

$$\Delta l = l_{R'}^0 - l_{R'} \approx \frac{l_{R'}^0}{2}\frac{u^2}{c^2} = \frac{11.2}{2}\left(\frac{10^5/3600}{3 \times 10^8}\right)^2 = 4.8 \times 10^{-14}\ \text{km} = 0.000000048\ \text{mm}$$
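The arithmetic of this example can be reproduced with the sketch below. It also indicates why the series form is convenient: evaluating 1 - sqrt(1 - u^2/c^2) directly in floating point loses most of its significant digits at such small u/c, so the "exact" value here is computed through log1p and expm1 instead. The rounded value of c is taken from the example; the function names are illustrative.

```python
import math

C = 3.0e8  # m/s, rounded as in the example

def shortening_series(l0, u, c=C):
    """First-order term of (2.41): l0 * u^2 / (2 c^2)."""
    return 0.5 * l0 * (u / c) ** 2

def shortening_exact(l0, u, c=C):
    """l0 * (1 - sqrt(1 - u^2/c^2)), evaluated without catastrophic cancellation."""
    b2 = (u / c) ** 2
    return -l0 * math.expm1(0.5 * math.log1p(-b2))

tunnel = 11.2e3          # tunnel length in m
speed = 100e3 / 3600.0   # 100 kph in m/s
series = shortening_series(tunnel, speed)
exact = shortening_exact(tunnel, speed)
print(f"{series:.3e} m (series), {exact:.3e} m (exact)")  # about 4.8e-11 m
```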
EXAMPLE 2.2

Sirius, the brightest star in the heavens, is estimated to be 8.5 light years from Earth. If a
group of space travelers were to journey from Earth to Sirius, having achieved a velocity
of 0.90c relative to the solar system before cutting out their rocket motors, how far away
from Earth would Sirius seem to them?
To these observers the distance would appear contracted, being given by

$$d' = 8.5[1 - (0.90)^2]^{1/2} = 3.7\ \text{light years}$$

which is far from being an insignificant contraction.


Since this segment of length is going past them at 0.90c, the space travelers compute
that the journey will take them a period of time

$$T' = \frac{3.7}{0.90} = 4.1\ \text{years}$$

An observer back on Earth will estimate that the journey will consume an amount of time

$$T = \frac{8.5}{0.90} = 9.44\ \text{years}$$


This disparity is due to the fact that the two sets of observers also disagree about time intervals because of their relative motion. The disparity is large because of the high velocity
and the great distance involved.
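The three numbers of this example follow from two short formulas, as the sketch below indicates. Distances are in light years and times in years, so that speeds are expressed as fractions of c; the variable names are illustrative.

```python
import math

def contracted_distance(d, beta):
    """Distance seen by travelers moving at speed beta*c, per (2.41)."""
    return d * math.sqrt(1.0 - beta**2)

d_earth = 8.5   # light years, as estimated in the solar system frame
beta = 0.90

d_travelers = contracted_distance(d_earth, beta)   # about 3.7 light years
t_travelers = d_travelers / beta                   # about 4.1 years
t_earth = d_earth / beta                           # about 9.4 years
print(f"{d_travelers:.2f} ly, {t_travelers:.2f} yr, {t_earth:.2f} yr")
```

The ratio of the two travel times is just the factor $(1 - u^2/c^2)^{1/2}$.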

The preceding example indicated a situation in which two observers in relative
motion would disagree about the time interval between two events. This phenomenon
can be treated more generally by considering a particular clock in X'Y'Z' which
remains at the fixed coordinates (x',y',z') and is thus being passed by a sequence of
XYZ clocks. One can define a first event when the hands of this single X'Y'Z' clock
indicate the time $t'_1$ and a second event when its hands indicate the time $t'_2$.
In XYZ, the first event will occur at the spatial position

$$x_1 = K(x' + ut'_1) \qquad y = y' \qquad z = z'$$

these equations resulting from the application of (2.40). The XYZ clock at this position
registers the time of the first event as

$$t_1 = K\left(t'_1 + \frac{u}{c^2}x'\right)$$

Similarly, in XYZ, the second event will occur at the spatial position

$$x_2 = K(x' + ut'_2) \qquad y = y' \qquad z = z'$$

and the XYZ clock at this position registers the time of the second event as

$$t_2 = K\left(t'_2 + \frac{u}{c^2}x'\right)$$

From this it follows that

$$t_2 - t_1 = \frac{t'_2 - t'_1}{(1 - u^2/c^2)^{1/2}} > t'_2 - t'_1 \qquad (2.42)$$

Consider this result first from the viewpoint of O', who is stationary beside the single
X'Y'Z' clock. He watches a succession of XYZ clocks go by and can take only a single
reading of each of them. However, he notices that they seem to be set progressively
further and further ahead, thus accounting for the inequality in (2.42).
On the other hand, observer O can take a sequence of readings of the X'Y'Z' clock
as it passes a succession of XYZ clocks. Since he knows these clocks are all synchronized, he concludes that the rate of the X'Y'Z' clock is slowed by its relative motion.
These results are symmetrical and the same conclusions could be reached if a single
XYZ clock were considered to be passing a succession of X'Y'Z' clocks. Thus it can be
concluded that when the readings of a succession of moving clocks are compared with
those of a single stationary clock, successive moving clocks appear to be set further and
further ahead; when the readings of a single moving clock are compared with those of
a family of stationary clocks, the moving clock appears to be running slow. This effect
is known as time dilatation and is given quantitatively by (2.42).
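The derivation of (2.42) can be traced numerically by transforming the two readings of a single X'Y'Z' clock into XYZ with Equations (2.40). The sketch assumes units with c = 1; the clock position, the two readings, and u = 0.8 are hypothetical values.

```python
import math

def to_unprimed(x_prime, t_prime, u, c=1.0):
    """Longitudinal part of (2.40): event coordinates in XYZ from those in X'Y'Z'."""
    k = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return k * (x_prime + u * t_prime), k * (t_prime + u * x_prime / c**2)

u = 0.8
x_clock = 2.0                  # fixed position of the single X'Y'Z' clock
t1_prime, t2_prime = 1.0, 4.0  # its two readings

x1, t1 = to_unprimed(x_clock, t1_prime, u)
x2, t2 = to_unprimed(x_clock, t2_prime, u)

interval_in_O = t2 - t1                # registered by two different XYZ clocks
interval_on_clock = t2_prime - t1_prime
print(interval_in_O, interval_on_clock)  # about 5.0 versus 3.0
```

The two XYZ clocks involved are separated by x2 - x1 = u (t2 - t1), which is simply the distance the moving clock covers between the two events as judged by O.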
EXAMPLE 2.3

Direct experimental evidence of the time dilatation effect exists. For example, the lifetimes of $\pi$ mesons have been studied both for the case of mesons at rest in the laboratory,
and for the case when they are in motion relative to the laboratory. $\pi$ mesons are unstable
and they decay into a $\mu$ meson and a neutrino, obeying the exponential law

$$N = N_0 e^{-t/\tau} \qquad (2.43)$$

when at rest in the laboratory. In Equation (2.43), $N_0$ is the number of $\pi$ mesons existing
at time t = 0 and N is the number surviving at a later time t; e is the base for natural logarithms and $\tau$ is the characteristic lifetime of the decay process. Several experimenters [16, 17, 18]
have established the average value $\tau = 2.56 \times 10^{-8}$ sec for $\pi^+$ mesons at rest.
The decay in a beam of $\pi^+$ mesons traveling at 0.755c relative to the laboratory has
also been studied [18]. By passing the beam through a series of counters, and noting the relative
numbers of counts in successive counters as a function of the separation distance between
counters, it was established that the separation distance needed to be 8.43 m in order to
have the fourth counter register the passage of only 1/e as many $\pi^+$ mesons as did the
third counter.
In the laboratory frame of reference, it takes the meson beam

$$t = \frac{8.43}{0.755c} = 3.72 \times 10^{-8}\ \text{sec}$$

to travel this distance. If this value for t is inserted in (2.43), it yields the prediction that
the fourth counter should be down from the third by 1/(1.57e), a value which is 36 percent
lower than the experimental results.
The difficulty lies in the fact that (2.43) is valid only in a frame of reference in which the
mesons are at rest. For an observer traveling along with the meson beam, the time interval
between passage of the third and fourth counters is only

$$t' = 3.72 \times 10^{-8}[1 - (0.755)^2]^{1/2} = 2.44 \times 10^{-8}\ \text{sec}$$

If this value is inserted for t in (2.43), excellent agreement between prediction and experiment results.
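The two predictions of this example can be compared directly. The sketch below uses only the figures quoted above, with c rounded to 3 x 10^8 m/sec as an assumption; the variable names are illustrative.

```python
import math

TAU = 2.56e-8   # lifetime of pi+ mesons at rest, sec
C = 3.0e8       # m/s, rounded (an assumption of this sketch)
BETA = 0.755    # beam speed as a fraction of c
SPACING = 8.43  # counter separation, m

t_lab = SPACING / (BETA * C)                 # transit time in the laboratory frame
t_rest = t_lab * math.sqrt(1.0 - BETA**2)    # transit time in the meson rest frame

observed = math.exp(-1.0)                    # fourth counter reads 1/e of the third
naive = math.exp(-t_lab / TAU)               # (2.43) misapplied with laboratory time
relativistic = math.exp(-t_rest / TAU)       # (2.43) applied with rest-frame time

print(f"t = {t_lab:.3e} s, t' = {t_rest:.3e} s")
print(f"naive/observed = {naive / observed:.2f}, "
      f"relativistic/observed = {relativistic / observed:.2f}")
```

The naive ratio reproduces the 36 percent shortfall quoted above, while the rest-frame value falls within a few percent of the observed 1/e.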
EXAMPLE 2.4

The Mössbauer effect [19] has also provided a graphic illustration of time dilatation. A source
consisting of the radioactive isotope cobalt 57, which has a convenient half-life of 280 days,
was plated on to the surface of a 0.8-cm diameter iron cylinder as shown in the figure [20].
This cylinder was rigidly mounted between two aluminum plates which also held a cylindrical shell of lucite. The latter was 13.28 cm in diam, 0.31 cm thick, and concentric with
the iron cylinder. An iron foil enriched in Fe$^{57}$ was glued to the inside surface of the lucite
shell. This assembly was mounted on a shaft and rotated at angular velocities as great as
3,000 rad/sec. A xenon-filled proportional counter was placed near the assembly, just
beyond an intervening lead shield, as shown in the diagram.
As the cobalt 57 nuclei decay, they change into excited nuclei of iron 57. These iron
nuclei emit gamma rays at a frequency $\nu_0 = 3 \times 10^{18}$, and these rays can be directed into
a beam aimed at the counter. However, the iron foil glued to the lucite shell, being enriched
with Fe$^{57}$, can absorb these gamma rays and then reradiate them isotropically. This absorption effect is greatest when the source-absorber assembly is at rest, for then the quantum
energy levels have the same separation $h\nu_0$ in source and absorber.

[Figure: rotating source-absorber assembly with lead shield and counter. After Hay, Schiffer, Cranshaw, and Egelstaff, Phys Rev L, 4, 165; 1960.]

16 M. Jakobson, A. Schulz, and J. Steinberger, "Detection of Positive $\pi$ Mesons by $\pi^+$ Decay," Phys Rev, 81, 894-895; March 1, 1951.
17 C. E. Wiegand, "Measurement of the Positive $\pi$ Meson Lifetime," Phys Rev, 83, 1085-1090; September 15, 1951.
18 R. P. Durbin, H. H. Loar, and W. W. Havens, Jr., "The Lifetimes of the $\pi^+$ and $\pi^-$ Mesons," Phys Rev, 88, 179-183; October 15, 1952.
19 R. L. Mössbauer, "Fluorescent Nuclear Resonance of Gamma Radiation in Iridium 191," Z. Phys. 151, 124-143; 1958. (For an excellent explanation of the Mössbauer effect, see the article by Sergio de Benedetti, Sci Amer, 202, 72-80; April 1960.)
20 H. J. Hay, J. P. Schiffer, T. E. Cranshaw, and P. A. Egelstaff, "Measurement of the Red Shift in an Accelerated System Using the Mössbauer Effect in Fe$^{57}$," Phys Rev L, 4, 165; February 15, 1960.
However, if the assembly is rotated, the source and absorber travel at different speeds
relative to the laboratory and thus the counting of time occurs at different rates in two
coordinate systems, in one of which the source is at rest, and in the other of which the absorber is at rest. An oscillation of period $\tau$ in the source frame will appear to take a greater
time $\tau'$ as measured by a clock in the absorber frame, the connection being

$$\tau = (1 - u^2/c^2)^{1/2}\,\tau' \approx \left(1 - \frac{1}{2}\frac{u^2}{c^2}\right)\tau'$$

Since $\tau = 1/\nu_0$ it follows that

$$\nu' \approx \nu_0\left(1 - \frac{1}{2}\frac{u^2}{c^2}\right)$$

in which $\nu'$ is the frequency of the photons from the cobalt source, as determined in a frame
at rest with respect to the Fe$^{57}$ absorber.
Since u is the relative speed of source and absorber, it follows that

$$\frac{u}{c} = \frac{\omega(R_2 - R_1)}{c} = \frac{\omega(6.64 - 0.4)}{3 \times 10^{10}}$$

in which $\omega$ is the angular velocity of the assembly. Thus the change in frequency of the
gamma rays is approximately

$$\Delta\nu = \nu_0 - \nu' = \frac{1}{2}\frac{u^2}{c^2}\,\nu_0 = 0.065\,\omega^2$$

[Graph: counter reading versus angular velocity (rps), 0 to 500 rps. After Hay, Schiffer, Cranshaw, and Egelstaff, Phys Rev L, 4, 165; 1960.]

But the absorption spectrum of Fe$^{57}$ is so sharp that the width of the resonance, or the line width, is only one part in $10^{12}$. In other words, if the incoming photons differ in frequency from $\nu_0$ by as little as one part in $10^{12}$, the absorption falls off markedly, and a lowered absorption is evident at even smaller changes in frequency.
This lowered absorption plus isotropic scattering by the Fe$^{57}$ foil manifests itself by an increased reading in the counter, since more of the original directed beam of gamma rays gets through to the counter if less is absorbed and scattered. A plot of counter reading versus angular velocity of the assembly is shown in the graph, and a theoretical curve based on the absorption spectrum is included for comparison. The agreement between theory and experiment is seen to be excellent. It is to be noted that this effect would not be predicted by a theory which assumed time to be an invariant.

2.12

PROPER TIME AND PROPER DISTANCE

One of the cardinal Newtonian beliefs is the invariance of distance, and it has already been seen in Section 2.2 that a Galilean transformation preserves this invariance. Another ingrained belief is the invariance of time, and this assumption was necessary to preserve the form of Newton's force law under a Galilean transformation. However, it has just been noted that under a Lorentz transformation neither time nor distance is an invariant.
However, time intervals and space intervals may be combined to form a quantity which is invariant with respect to a Lorentz transformation. Let $\tau_{12}^2$ and $(\tau'_{12})^2$ be defined by the relations

$$\tau_{12}^2 = (t_2 - t_1)^2 - \frac{1}{c^2}\left[(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2\right] \tag{2.44}$$

$$(\tau'_{12})^2 = (t'_2 - t'_1)^2 - \frac{1}{c^2}\left[(x'_2 - x'_1)^2 + (y'_2 - y'_1)^2 + (z'_2 - z'_1)^2\right] \tag{2.45}$$

By using either of the transformations (2.39) or (2.40) one can show that

$$\tau_{12}^2 = (\tau'_{12})^2 \tag{2.46}$$

and thus this quantity is an invariant.
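The invariance can be checked numerically. The sketch below assumes that (2.39) is the usual Lorentz transformation for relative motion along the X axis, $x' = \kappa(x - ut)$ and $t' = \kappa(t - ux/c^2)$; the two events and the frame speed are hypothetical values chosen only for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz(event, u):
    """Transform an event (t, x, y, z) to a frame moving at speed u along +X.

    Assumes the standard form x' = kappa (x - u t), t' = kappa (t - u x / c^2).
    """
    t, x, y, z = event
    kappa = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    return (kappa * (t - u * x / C**2), kappa * (x - u * t), y, z)


def tau_squared(e1, e2):
    """Squared proper time interval between two events, in s^2."""
    dt = e2[0] - e1[0]
    dr2 = sum((b - a) ** 2 for a, b in zip(e1[1:], e2[1:]))
    return dt**2 - dr2 / C**2


# Hypothetical events (t in seconds, positions in meters).
event_1 = (0.0, 0.0, 0.0, 0.0)
event_2 = (2.0e-6, 300.0, 40.0, -25.0)
u = 0.6 * C

tau2_unprimed = tau_squared(event_1, event_2)
tau2_primed = tau_squared(lorentz(event_1, u), lorentz(event_2, u))
print(tau2_unprimed, tau2_primed)  # the two values agree to rounding error
```

The time and space separations each change under the transformation; only the combination formed in `tau_squared` is unchanged.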


To appreciate the physical significance of $\tau_{12}$ (the positive square root of $\tau_{12}^2$), it can be recognized that if there exists a Lorentzian frame of reference XYZ in which two events take place at the same spatial point, then $\tau_{12}$ is the time interval between these two events as recorded by a single clock at rest at this spatial point in XYZ. For this reason $\tau_{12}$ is called the proper time interval. In another frame of reference X'Y'Z', $t'_2 - t'_1$ is measured by two different clocks because the events are not at the same spatial point, and thus $t'_2 - t'_1$ is sometimes called the nonproper time interval. The interdependence of space and time is clearly illustrated by (2.44) and (2.45).
It is not always possible to find a Lorentzian frame of reference in which two events take place at the same spatial point. For imagine that in XYZ they take place at $(x_1, y_1, z_1)$ and at $(x_2, y_2, z_2)$ at times $t_1$ and $t_2$ respectively, and that upon inserting


these values in (2.44) one finds that $\tau_{12}^2$ is negative. Then $\tau_{12}$ is imaginary for all Lorentzian frames since it is an invariant. The trouble is that the two points $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ are widely enough separated in XYZ that even an X'Y'Z' frame going at a relative speed $u \to c$ cannot cover the distance between $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ in so small a time interval as $t_2 - t_1$. Since Lorentzian frames of reference are physically restricted to relative velocities $u < c$ (for otherwise $x'$ and $t'$ would be imaginary) it follows that only when $\tau_{12}$ is real is it possible to find a Lorentzian frame in which the two events occur at the same spatial point. Whenever the value of $\tau_{12}$ is real, the interval between the two events will be called timelike.
To accommodate situations in which $\tau_{12}$ is imaginary, a new quantity $\epsilon_{12}$ can be defined by the relation

$$\epsilon_{12}^2 = -c^2\tau_{12}^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 - c^2(t_2 - t_1)^2 \tag{2.47}$$

from which it follows immediately that $\epsilon_{12}^2$ is an invariant, for

$$\epsilon_{12}^2 = (\epsilon'_{12})^2 \tag{2.48}$$

by virtue of (2.46). $\epsilon_{12}$ (the positive square root of $\epsilon_{12}^2$) is called the proper space interval, because in a Lorentzian frame in which two events take place at the same time, $\epsilon_{12}$ is the Cartesian distance between the two events. In another reference frame X'Y'Z', $t'_2 - t'_1 \neq 0$, and the distance

$$\left[(x'_2 - x'_1)^2 + (y'_2 - y'_1)^2 + (z'_2 - z'_1)^2\right]^{1/2}$$

is sometimes called the nonproper space interval.
Whenever the value of $\epsilon_{12}$ is real, the interval between the two events will be called spacelike. Except when $\tau_{12} = \epsilon_{12} = 0$, it is always possible to carry out a Lorentz transformation to a new frame of reference in which either the two events occur at the same spatial point ($\tau_{12}$ real) or at the same time ($\epsilon_{12}$ real) but not both. $\tau_{12} = \epsilon_{12} = 0$ is the boundary between these possibilities, and corresponds to the situation in which the two events can just be connected by a light ray which leaves the site of one event as it occurs and arrives at the site of the other event as it occurs.
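The three cases can be told apart from the sign of $c^2\tau_{12}^2$ alone. The following sketch is illustrative only; the function name `classify_interval` and the sample events are hypothetical, and the label "lightlike" for the boundary case is a conventional name not used in the text above.

```python
C = 299_792_458.0  # speed of light in m/s


def classify_interval(e1, e2):
    """Classify the interval between two events given as (t, x, y, z).

    Uses the sign of c^2 * tau_12^2 = c^2 (dt)^2 - (dx^2 + dy^2 + dz^2).
    The exact-zero test is only reliable for inputs chosen to cancel exactly;
    real data would need a tolerance.
    """
    dt = e2[0] - e1[0]
    dr2 = sum((b - a) ** 2 for a, b in zip(e1[1:], e2[1:]))
    s = (C * dt) ** 2 - dr2
    if s > 0:
        return "timelike"   # tau_12 real: a frame exists where both occur at one point
    if s < 0:
        return "spacelike"  # epsilon_12 real: a frame exists where both are simultaneous
    return "lightlike"      # boundary case: connected by a light ray


origin = (0.0, 0.0, 0.0, 0.0)
print(classify_interval(origin, (1.0, 1000.0, 0.0, 0.0)))     # 1 km apart, 1 s apart
print(classify_interval(origin, (1.0e-9, 1000.0, 0.0, 0.0)))  # 1 km apart, 1 ns apart
print(classify_interval(origin, (1.0, C, 0.0, 0.0)))          # on the light ray
```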
These ideas can be given a simple pictorial representation if attention is confined to events which happen along the X axis. Let an observer O be at $x_1$ at time $t_1$, and let light signals traveling along the X axis in each direction pass through $x_1$ at $t_1$. The tracks of these light signals in the XT plane are shown in Figure 2.13 as the lines AB and CD. The equations for these lines arise from the condition $\tau_{12}^2 = \epsilon_{12}^2 = 0$, and are

$$x - x_1 = \pm c(t - t_1) \tag{2.49}$$

These two lines are thus the boundaries between spacelike intervals and timelike intervals. For events which occur in the areas marked Future and Past, the interval between such an event and the event $(x_1, t_1)$ is timelike. For events which occur in the areas marked Present, the interval between such an event and the event $(x_1, t_1)$ is spacelike.
An event $P_3$ anywhere in the Future region is such that the observer O at $x_1$ still has the opportunity to influence it causally, since he can send a signal over the distance $|x_3 - x_1|$ at a velocity less than $c$ and have it arrive there in less time than $t_3 - t_1$. An event $P_4$ anywhere in the region labeled Past happened long enough ago that the observer O could have learned about it via a signal traveling the distance $|x_4 - x_1|$ at a velocity less than $c$ and requiring a time interval less than $t_1 - t_4$.
However, an event $P_5$ anywhere in the region marked Present could be occurring without observer O being aware of it, for a signal could not be sent in either direction over the distance $|x_5 - x_1|$, traveling at a velocity no greater than $c$, and cover this distance in so small a time interval as $|t_5 - t_1|$.

FIGURE 2.13 The divisions of space-time.

Newtonian mechanics can be viewed as a theory in which the velocity of light is infinite, for then the Lorentz transformation is seen to reduce to the Galilean transformation. This would have the effect on Figure 2.13 of making the lines AB and CD horizontal and coincident. The Present would then be reduced to a single line of events occurring at all positions $x$, but at the single time Now ($t_1$). There would be no spacelike intervals, just timelike intervals. One consequence of special relativity is thus the enlargement of the domain of the Present at the expense of the Past and the Future.
EXAMPLE

2.5

Imagine that the X axis is selected to be pointing at the star Betelgeuse, and that this star is at the position $x_6$. At the present time $t_1$ observer O, stationed at $x_1$, sees Betelgeuse as it was at the earlier time $t_6$; that is, he sees the event $P_6$. Imagine that at a later time $t_5$ Betelgeuse undergoes a supernova explosion, this being the event $P_5$. At time $t_1$ observer O is not yet aware of this occurrence. However, as time goes on the crossed lines AB and CD move vertically upward in the diagram of Figure 2.13. When they have shifted an amount $t_5 - t_6$, the line CD will cross the event $P_5$ and observer O will become aware of the supernova.

Another enlightening geometrical construction results when one conceptually plots events in the four-dimensional space $(x, y, z, t)$. For example, the projection on the XT plane of the history of a moving point might be the sequence of events shown as the line PQ in Figure 2.14.
In this same plane one can show the axes X' and T'; the equations for these axes can be obtained by setting $t'$ and $x'$ equal to zero in (2.40). There is no reason why X and T should be shown orthogonal; if they are so shown, X' and T' most decidedly are not orthogonal. The line PQ is known as the world line of the moving point and it has the property of being the same for all Lorentzian frames, the latter differing in the directions of their space and time axes on Figure 2.14. The locus of all the time axes is the Past-Future area of Figure 2.13, whereas the locus of all space axes is the Present area. A world line can follow one of the four axes in Figure 2.14, in which case length contraction and the slowing of clocks can be deduced geometrically.

FIGURE 2.14 World lines.

2.13

VELOCITY

The general motion of a point, in which the spatial variables are continuous functions of the temporal variable, can be traced in terms of differentials. From (2.39) and (2.40) these are

$$\begin{aligned}
dx' &= \kappa(dx - u\,dt) & dx &= \kappa(dx' + u\,dt') \\
dy' &= dy & dy &= dy' \\
dz' &= dz & dz &= dz' \\
dt' &= \kappa\left(dt - \frac{u}{c^2}\,dx\right) & dt &= \kappa\left(dt' + \frac{u}{c^2}\,dx'\right)
\end{aligned}$$

Ratios of these differentials may be formed to yield velocity components. For example

$$v'_x = \frac{dx'}{dt'} = \frac{dx - u\,dt}{dt - (u/c^2)\,dx} = \frac{dx/dt - u}{1 - (u/c^2)\,dx/dt} = \frac{v_x - u}{1 - uv_x/c^2}$$

Proceeding in this manner, one can derive the Lorentz velocity transformation equations, namely,

$$\begin{aligned}
v'_x &= \frac{v_x - u}{1 - uv_x/c^2} & v_x &= \frac{v'_x + u}{1 + uv'_x/c^2} \\
v'_y &= \frac{v_y}{\kappa(1 - uv_x/c^2)} & v_y &= \frac{v'_y}{\kappa(1 + uv'_x/c^2)} \\
v'_z &= \frac{v_z}{\kappa(1 - uv_x/c^2)} & v_z &= \frac{v'_z}{\kappa(1 + uv'_x/c^2)}
\end{aligned} \tag{2.50}$$


It may be noticed that the transformation one way differs from the transformation the other way only in the sign of $u$. If $u$ and $v$ are small compared to $c$, Equations (2.50) are approximated quite well by the Galilean velocity law (2.18).
EXAMPLE

2.6

Let there be two particles moving along the X axis. As seen from the XYZ frame of reference, let one particle have a velocity $v_x = v$ and let the other particle have a velocity $v_x = -v$. What is their relative velocity?
To answer this question, let X'Y'Z' ride along with one particle by setting $u = v$. Then from (2.50), the velocity of the other particle in X'Y'Z' is

$$v'_x = \frac{v_x - u}{1 - uv_x/c^2} = \frac{-v - v}{1 + v^2/c^2} = -\frac{2v}{1 + v^2/c^2}$$

For $v$ small this yields the classic result $v'_x = -2v$. However, as $v \to c$, $v'_x \to -c$. Thus even though in XYZ the two particles might be going in opposite directions with speeds each of which approaches $c$ relative to XYZ, their recessional velocities relative to each other are still less than $c$.
For $v \geq c$ the entire analysis is improper because one cannot then put $u = v$ in (2.50), since the Lorentz transformation is nonphysical for $u \geq c$.
It can be concluded from the foregoing that if a particle is traveling at a velocity less than that of light in one Lorentzian frame, it travels at a velocity less than that of light in all Lorentzian frames.
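The behavior described in Example 2.6 can be tabulated with a few lines of code. This is a minimal sketch of the first of Equations (2.50); the function name `transform_vx` and the sample speeds are hypothetical choices for illustration.

```python
C = 299_792_458.0  # speed of light in m/s


def transform_vx(vx, u):
    """x component of velocity seen from a frame moving at speed u along +X."""
    return (vx - u) / (1.0 - u * vx / C**2)


# Two particles with velocities +v and -v; ride along with the first (u = v).
for beta in (1.0e-4, 0.5, 0.9, 0.999):
    v = beta * C
    relative = transform_vx(-v, v)
    # Columns: v/c, relativistic relative velocity / c, Galilean value / c
    print(f"{beta:8.4f}  {relative / C:+.6f}  {-2.0 * beta:+.6f}")
```

For small `beta` the second and third columns agree; as `beta` approaches 1 the relativistic value stays above -1 while the Galilean value approaches -2.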

2.14

RELATIVISTIC INTERPRETATION OF THE FIZEAU EXPERIMENT

It will be recalled from Section 2.4 that Fizeau found the velocity of light in water to be dependent on the motion of the water, this dependency being expressed by Equation (2.21). An explanation of this result based on an ether hypothesis had been made earlier by Fresnel, who assumed part of the ether to be dragged along by the water.
A simpler explanation of Fizeau's data is possible in terms of the Lorentz velocity transformation. Let XYZ and X'Y'Z' be two frames of reference such that X and X' are aligned with the flow and X' is sliding along X at speed $u$. Then X'Y'Z' can be chosen to be at rest relative to the water, resulting in XYZ being at rest relative to the laboratory. In X'Y'Z' the velocity of the light waves as they pass through the water is

$$v'_x = v_0 = \frac{c}{n} \tag{2.51}$$

in which $n$ is the index of refraction.
If the appropriate equation from (2.50) is used, this velocity, as viewed from XYZ, is

$$v_x = \frac{c/n + u}{1 + u/cn} \tag{2.52}$$

Expansion of the denominator of (2.52) in a power series (cf. Mathematical Supplement) gives

$$v_x = \left(\frac{c}{n} + u\right)\left(1 - \frac{u}{cn} + \frac{u^2}{c^2n^2} - \cdots\right) = \frac{c}{n} + u - \frac{u}{n^2} - \frac{u^2}{cn} + \cdots \tag{2.53}$$

Retaining only terms containing $c$ in powers above $c^{-1}$ gives

$$v_x = v_0 + u\left(1 - \frac{1}{n^2}\right)$$

which is in agreement with (2.21). Neither Fizeau nor his followers had sufficient experimental sensitivity to detect the effect of the higher order terms in (2.53). Thus the Fizeau experiment is completely consistent with the Lorentz velocity transformation.
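To see how small the neglected terms are, the exact expression (2.52) can be compared with the first-order result. In the sketch below the index of refraction n = 1.33 and the flow speed u = 7 m/s are assumed, illustrative values; they are not taken from the text.

```python
C = 299_792_458.0  # speed of light in m/s


def speed_in_moving_medium(n, u):
    """Exact laboratory-frame speed of light in a medium moving at speed u, per (2.52)."""
    return (C / n + u) / (1.0 + u / (C * n))


def speed_first_order(n, u):
    """First-order result: c/n plus the partial drag u (1 - 1/n^2)."""
    return C / n + u * (1.0 - 1.0 / n**2)


n_water = 1.33  # assumed index of refraction
u_flow = 7.0    # assumed flow speed in m/s

exact = speed_in_moving_medium(n_water, u_flow)
approx = speed_first_order(n_water, u_flow)
drag = exact - C / n_water  # change in speed caused by the flow

print(drag, u_flow * (1.0 - 1.0 / n_water**2))  # observed drag vs first-order drag
print(abs(exact - approx))                       # size of the neglected terms, in m/s
```

The drag is a fraction $1 - 1/n^2$ of the flow speed, while the higher order terms are smaller than the flow speed by a factor of order $u/c$, far below what an interferometer of Fizeau's era could resolve.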

2.15

THE CEDARHOLM-TOWNES MASER EXPERIMENT

With the advent of very precise clocks based on the maser principle,$^{21}$ it has recently become possible to perform an even more sensitive test of the presence of an ether than that afforded by the Michelson interferometer. This has been accomplished by pointing the beams of ammonia molecules comprising two masers in opposite directions and measuring the difference in their oscillating frequencies.
The operation of one of these masers is suggested in Figure 2.15. Ammonia gas is emitted through an opening in a source S and sprays out into a region containing a cylinder of electrostatically charged rods. The ammonia molecules normally exist as a gas in a balance between two energy states, there being a greater population in the lower state. However, the charged rods repel the ammonia molecules in the higher state, whereas they attract those in the lower state. As the ammonia gas drifts through the cylinder of charged rods, the two states start to separate. Those molecules in the lower state (represented by black dots) diverge whereas those in the upper state (represented by grey dots) converge. The latter then enter a cavity where, due to the unbalanced population, some of them spontaneously revert to the lower energy state, emitting photons of characteristic frequency in the process. As these photons bounce around in the cavity, a field builds up; if the dimensions of the cavity are properly chosen to resonate this effect, self-sustaining oscillations can occur, and an electromagnetic signal of great purity at a stable, precise frequency can be extracted from the cavity by means of a probe.

FIGURE 2.15 The ammonia beam maser. [After Gordon, Sci Amer, 199, 42; 1958.]

21 J. P. Gordon, H. J. Zeiger, and C. H. Townes, "The Maser-New Type of Microwave Amplifier, Frequency Standard, and Spectrometer," Phys Rev, 99, 1264-1274; August 15, 1955.

If two such ammonia beam masers are placed back to back, and the signals coupled out of their respective cavities are compared, a detectable beat will occur if the signal frequencies are not the same. That an ether theory would predict the presence of a beat can be seen from the following argument:
Assume that the two masers are back to back and at rest in a coordinate system X'Y'Z'. Their ammonia beams are presumed to have velocities $\mathbf{v}$ and $-\mathbf{v}$ with respect to X'Y'Z' and this entire system is presumed to be traveling with respect to the ether at a velocity $\mathbf{u}$. Under these conditions Møller has shown$^{22}$ that photons emitted from the first maser beam in the direction characterized by the unit vector $\mathbf{e}'$ have a frequency $\nu'_1$ in the laboratory frame of reference given by

$$\nu'_1 = \nu_0\left[1 + \frac{\mathbf{v}\cdot\mathbf{e}'}{c} + \frac{(\mathbf{v}\cdot\mathbf{e}')^2}{c^2} + \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right] \tag{2.54}$$

in which $\nu_0$ is the photon frequency as determined by an observer at rest relative to the ammonia molecules. A derivation of Equation (2.54) can be found in Appendix B.
In the cavity of the maser oscillator the ammonia molecules emit photons in all directions, and as a result the signal coupled out of the cavity will have a mean frequency given by
(2.55)
in which $d\Omega$ is an element of solid angle and $f(\mathbf{e}')$ is a weighting function dependent on the geometrical arrangement of the cavity. Upon introducing (2.54) into (2.55), one gets for the mean frequency

$$\bar{\nu}'_1 = \nu_0\left[1 + g(v) + \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right] \tag{2.56}$$

in which $g(v)$ is the mean value of $\dfrac{\mathbf{v}\cdot\mathbf{e}'}{c} + \dfrac{(\mathbf{v}\cdot\mathbf{e}')^2}{c^2}$ and is thus a function only of the magnitude of $\mathbf{v}$.
If this argument is repeated for the second maser beam, for which $\mathbf{v}$ is replaced by $-\mathbf{v}$, the mean value $\bar{\nu}'_2$ can similarly be found. The difference in these two mean frequencies is therefore

$$\bar{\nu}'_1 - \bar{\nu}'_2 = \frac{2\nu_0}{c^2}(\mathbf{u}\cdot\mathbf{v}) \tag{2.57}$$

It has proved possible to achieve a precision of one part in $10^{12}$ in this frequency comparison. Thus since $v$ = 0.6 km/sec for each ammonia beam, with $\nu_0$ = 23,870 Mc/sec, an ether drift $u$ as small as 1/1000th of the orbital velocity of the earth should be detectable with this apparatus. This is 50 times more sensitive than the apparatus of Joos which incorporated a Michelson interferometer and was used in 1930.

22 C. Møller, "On the Possibility of Terrestrial Tests of the General Theory of Relativity," Nuovo Cimento, 6, Suppl, 381-398; 1957.

Back to back maser oscillators have been constructed by Cedarholm and Townes$^{23}$ and used in the manner just described. The outputs from the cavities of the two masers were compared in frequency as the entire apparatus was rotated through 180 deg, thus ensuring in (2.57) a maximum value of $\mathbf{u}\cdot\mathbf{v}$ for some position of the apparatus. In the words of the investigators

The experiment . . . was carefully done for the first time on September 20, 1958. No proper effect (in the frequency difference) so large as 1/50 cps was found. Hence, since the orbital velocity of the Earth of 30 km/s would have given an effect of 20 cps, the ether drift could not have been larger than 1/1000 of this value, or 30 m/s. It is, of course, possible for the motion of the earth to be just cancelled by the motion of the solar system through the ether at some particular time of the year. The experiment has now been repeated at the Watson Laboratory during 24-hr. runs at approximately three-month intervals throughout the year. In none of these runs was any effect so large as 1/50 cps found.

This null result makes the case against an ether theory even more compelling. Einstein's formulation, which treats the ether as superfluous, predicts a null result in the Cedarholm-Townes experiment.
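The figures quoted by the investigators can be checked against Equation (2.57). The sketch below uses the beam speed and frequency given in this section; the function name `beat_frequency` and the assumption that the drift is exactly parallel to the beam in one orientation are illustrative simplifications.

```python
C = 299_792_458.0   # speed of light in m/s
NU_0 = 23_870.0e6   # maser frequency in cps (23,870 Mc/sec)
V_BEAM = 600.0      # ammonia beam speed in m/s (0.6 km/sec)


def beat_frequency(u, cos_angle=1.0):
    """Frequency difference of (2.57), in cps, for an ether drift of speed u.

    cos_angle is the cosine of the angle between the drift and the beam
    velocity of the first maser (assumed +1 for perfect alignment).
    """
    return 2.0 * NU_0 * u * V_BEAM * cos_angle / C**2


orbital_speed = 30.0e3  # m/s
aligned = beat_frequency(orbital_speed)
swing = aligned - beat_frequency(orbital_speed, cos_angle=-1.0)  # change on a 180 deg rotation
small_drift_swing = beat_frequency(30.0) - beat_frequency(30.0, cos_angle=-1.0)

print(aligned, swing, small_drift_swing)
```

Under these assumptions the change in beat frequency on rotating the apparatus through 180 deg comes out near 20 cps for the orbital velocity and near 1/50 cps for a drift of 30 m/s, in line with the quoted values.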

2.16

THE VARIATION OF MASS

Since it has been established that the Lorentz transformation affords a satisfactory explanation of phenomena involving the velocity of light, it now becomes necessary to reexamine the laws of mechanics. If the principle of relativity is to hold for all physical laws, and if the Lorentz transformation is the proper link between inertial coordinate systems, then the laws of mechanics, if properly expressed, should transform satisfactorily via the Lorentz equations. In a sense this reexamination has already been started in that the concepts involving the measurement of distance intervals and time intervals form an integral part of all mechanical laws. A critical study of the operational definitions of these measurements for moving systems has revealed that both distance intervals and time intervals are dependent on relative motion. This reexamination will now be continued by the introduction of another hypothetical experiment$^{24}$ whose symmetry raises a question about the invariance of mass.
Imagine that two exactly similar elastic balls suffer a collision which in the X'Y'Z' frame appears as shown in Figure 2.16a. They are seen to approach each other along parallel lines, collide, and then recede from each other along parallel lines. Their approach speeds are equal and by symmetry so too are their recessional speeds. (A perfectly elastic collision is assumed with no loss of energy, thus causing the recessional speed to equal the speed of approach.) This experiment can be assumed to take place either in a region free from gravitational attraction, or on a level frictionless table over which the balls are sliding.†
Now imagine this same collision as viewed from an XYZ frame which is moving in the direction of the $-X'$ axis at a speed $u = v'_x$. To an observer O stationary in XYZ, ball A is moving parallel to the Y axis, and ball B makes a more grazing incidence to the X axis.

FIGURE 2.16 The collision of two balls, panels (a) and (b).

As seen in X'Y'Z', each ball has its $y'$ component of velocity reversed by the collision but its $x'$ component of velocity is unchanged. As seen in XYZ, ball B has its $y$ component of velocity reversed by the collision but its $x$ component is unaffected. In XYZ ball A does not have an $x$ component of velocity either before or after the collision; however, it does have a $y$ component which suffers a reversal.

† A rolling motion would complicate the discussion unnecessarily.
23 J. P. Cedarholm and C. H. Townes, "A New Experimental Test of Special Relativity," Nature, 184, 1350-1351; October 31, 1959.
24 This hypothetical experiment and the ensuing analysis were first offered by G. N. Lewis and R. C. Tolman in the paper "The Principle of Relativity and Non-Newtonian Mechanics," Phil Mag, 18, 510-523; 1909.
Classical mechanics would yield the result for this experiment that $v_y = v'_y$ for ball B and that in the XYZ frame the velocity of ball A is $\pm v_y$. In terms of a Lorentz transformation, one would be ill-advised to assume this without checking. Therefore, let $w_y$ represent the velocity of ball A in XYZ before and after the collision. Using (2.50) one finds that for ball B

$$v'_y = \frac{v_y}{\kappa(1 - uv_x/c^2)}$$

whereas for ball A, which has no $x$ component of velocity in XYZ,

$$v'_y = \frac{w_y}{\kappa}$$

The ratio gives

$$\frac{v_y}{w_y} = 1 - \frac{uv_x}{c^2} \tag{2.58}$$

and thus $v_y < w_y$. Viewed from XYZ, ball A has a greater $y$ component of velocity than does ball B. (For ordinary velocities the difference is exceedingly small.)


Equation (2.58) requires the abandonment of one or the other of two principles of classical mechanics. If mass is an invariant, then the principle of conservation of linear momentum is violated in the $y$ direction in XYZ. If the momentum principle is valid, then mass cannot be an invariant. The latter assumption has proved to be the one which is consistent with experiment, and will be the basis for what follows.
Let $m'_A = m'_B$ be the two masses in the X'Y'Z' frame (they are equal by symmetry) and let $m_A \neq m_B$ be the two masses in the XYZ frame. Then

$$m_A w_y = m_B v_y$$

so that

$$\frac{m_B}{m_A} = \frac{w_y}{v_y} = \frac{1}{1 - uv_x/c^2}$$

This result can be rephrased entirely in terms of XYZ quantities by using (2.50) to substitute for $v'_x$. This gives

$$u = v'_x = \frac{v_x - u}{1 - uv_x/c^2}$$

from which $u^2v_x/c^2 - 2u + v_x = 0$, or, after multiplying by $v_x/c^2$ and rearranging,

$$1 - \frac{2uv_x}{c^2} + \frac{u^2v_x^2}{c^4} = 1 - \frac{v_x^2}{c^2}$$

and thus

$$\frac{m_B}{m_A} = \left(1 - \frac{uv_x}{c^2}\right)^{-1} = \left(1 - \frac{v_x^2}{c^2}\right)^{-1/2} \tag{2.59}$$

This relation is seen not to depend on $v_y$ and should hold even when $v_y = 0$. But then $w_y = 0$ also, and as seen from X'Y'Z' the two balls approach each other along the X' axis and barely touch as they pass. As seen from XYZ, ball A is at rest and ball B passes by, barely touching A as it travels parallel to the X axis. With $m_0$ the mass of ball A when it is at rest, Equation (2.59) can be rewritten

$$m_B = \frac{m_0}{(1 - v_x^2/c^2)^{1/2}} \tag{2.60}$$

One can now argue that it no longer matters whether ball A is present or not. Further, the rest mass of ball B should also be $m_0$, since in X'Y'Z' one started with a symmetrical experiment using identical balls. With only ball B left, in constant rectilinear motion, the subscripts can be dropped on $m_B$ and $v_x$, giving

$$m = \frac{m_0}{(1 - v^2/c^2)^{1/2}} \tag{2.61}$$

In Equation (2.61), $m_0$ is the rest mass of ball B in the Lorentz frame XYZ, and $m$ is its dynamic mass when going at a speed $v$ relative to XYZ.
It is inferred from this result that the mass of any material body depends on its relative motion, increasing with speed according to the relation (2.61).
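The algebra leading to (2.59) can be spot-checked numerically. The sketch below assumes the geometry of the collision described above (ball A at rest along X in XYZ, ball B with $v'_x = u$ in X'Y'Z'); the function names and the sample frame speed are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def collision_mass_ratio(u):
    """Return (m_B / m_A, v_x) for the symmetric collision viewed from XYZ.

    Ball B has v'_x = u in X'Y'Z', so by (2.50) its x velocity in XYZ is
    v_x = 2u / (1 + u^2/c^2); the mass ratio is then 1 / (1 - u v_x / c^2).
    """
    vx = 2.0 * u / (1.0 + (u / C) ** 2)
    return 1.0 / (1.0 - u * vx / C**2), vx


def dynamic_mass_factor(v):
    """m / m_0 according to (2.61)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


ratio, vx_ball_b = collision_mass_ratio(0.3 * C)
print(ratio, dynamic_mass_factor(vx_ball_b))  # the two numbers should coincide
```

The momentum-balance ratio and the factor $(1 - v_x^2/c^2)^{-1/2}$ agree to rounding error for any frame speed below $c$.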

EXAMPLE

2.7

A clear confirmation of the variability of mass has been given by Zahn and Spees.$^{26}$ Employing a radioactive source S to generate high-speed electrons, they selected a small velocity range of these electrons through the use of a velocity filter; with a Geiger counter as detector, they were able to determine the dynamic mass.
As indicated by the figure of the apparatus, C is a parallel plate condenser with extremely small spacing between the plates ($d$ = 0.4663 mm) and a length of 12 cm. Any electron which reaches the Geiger counter must pass between these plates. Helmholtz coils (not shown) create a uniform, constant magnetic field $B = 120.85 \times 10^{-4}$ webers/m$^2$ in a direction perpendicular to the plane of the paper. The electrons which leave S have trajectories in the plane of the paper which are circles. Only those electrons whose center of curvature is at P will enter the condenser traveling parallel to the plates. The radius of curvature $r$ can be found from the geometry, which yields the formula

$$r^2 = (r - a)^2 + b^2, \qquad r = \frac{a^2 + b^2}{2a}$$

in which $a$ is the distance that the source S is below the mouth of the condenser, and $b$ is the distance that S is to the right of the mouth. In one run of the apparatus, these distances had the values $a$ = 0.02992 m and $(a^2 + b^2)^{1/2}$ = 0.0977 m, yielding $r$ = 0.1595 m.

[Figure: the apparatus. After Zahn and Spees, Phys Rev, 53, 365; 1938.]

The force law can be invoked to determine the velocity of the electrons traveling this particular trajectory. One obtains

$$\frac{mv^2}{r} = evB$$

and using (2.61) for the dynamic mass, this becomes

$$\frac{v}{(1 - v^2/c^2)^{1/2}} = \frac{e}{m_0}\,rB$$

in which $e$ is the electronic charge ($1.6 \times 10^{-19}$ coul) and $m_0$ is the rest mass ($9.1 \times 10^{-31}$ kg). Solving for $v$, one obtains

$$v = 0.749c = 2.25 \times 10^8 \text{ m/sec}$$

Under the action of the magnetic field, these electrons would continue in their circular path, thus striking the bottom plate of the condenser, if it were not for the electrostatic voltage between the plates. When this voltage $V$ is properly adjusted, a balancing electrostatic force results and the electrons are able to travel between the plates, emerging from the other end to be detected by the Geiger counter. It is clear that regardless of the value of $V$, no other electrons, traveling in any other orbit as they leave S, can pass through the condenser, and those electrons traveling at the speed $2.25 \times 10^8$ m/sec will get through only if $V$ has a value such that

$$\frac{eV}{d} = evB$$

$$V = dvB = (0.4663 \times 10^{-3})(2.25 \times 10^8)(120.85 \times 10^{-4})$$

$$V = 1{,}270 \text{ volts}$$

26 C. T. Zahn and A. H. Spees, "The Specific Charge of Disintegration Electrons from Radium E," Phys Rev, 53, 365-373; March 1, 1938.

The experimental data, showing counts per minute versus condenser voltage, are given in the graph. The agreement between theory and experiment is seen to be very good.

[Graph: counts per minute (0 to 70) versus condenser voltage (0 to 2000 volts). After Zahn and Spees, Phys Rev, 53, 365; 1938.]

Had the rest mass rather than the dynamic mass been used above in the force equation, the velocity of the electrons getting through the filter would have been computed to be $3.4 \times 10^8$ m/sec (greater than $c$), and the predicted condenser voltage to permit passage would have been 50 percent higher.
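The arithmetic of Example 2.7 can be reproduced in a short script. The sketch below uses the rounded constants quoted in the example and a rounded value $c = 3 \times 10^8$ m/sec (an assumption); the variable names are illustrative, and the printed numbers differ slightly from the text because the text rounds $v$ before computing $V$.

```python
import math

C = 3.0e8            # speed of light in m/s (rounded; assumed)
E_CHARGE = 1.6e-19   # electronic charge in coulombs
M_REST = 9.1e-31     # electron rest mass in kg
B_FIELD = 120.85e-4  # magnetic field in webers/m^2
PLATE_GAP = 0.4663e-3  # condenser plate spacing d in m
A_DIST = 0.02992       # distance a in m
HYPOTENUSE = 0.0977    # (a^2 + b^2)^(1/2) in m

# Radius of curvature from r = (a^2 + b^2) / (2a).
radius = HYPOTENUSE**2 / (2.0 * A_DIST)

# v / sqrt(1 - v^2/c^2) = (e / m0) r B; call the right-hand side q.
q = (E_CHARGE / M_REST) * radius * B_FIELD
v_relativistic = q / math.sqrt(1.0 + (q / C) ** 2)  # solve for v
v_classical = q                                     # result if m = m0 is used

voltage_relativistic = PLATE_GAP * v_relativistic * B_FIELD
voltage_classical = PLATE_GAP * v_classical * B_FIELD

print(radius, v_relativistic / C, voltage_relativistic)
print(v_classical, voltage_classical / voltage_relativistic)
```

The classical speed exceeds $c$, and the ratio of the two predicted voltages equals the mass factor $m/m_0$ for these electrons, which is close to 1.5.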

2.17 MOMENTUM AND ENERGY


Now let attention be turned to the more general case of a body whose velocity is not necessarily a constant with time, in either magnitude or direction. Let it be assumed that Equation (2.61) is applicable to this general case and define momentum by the relation

$$\mathbf{p} = m\mathbf{v} = \frac{m_0\mathbf{v}}{(1 - v^2/c^2)^{1/2}} \tag{2.62}$$


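It is apparent that for modest velocities this reduces to the classical definition of momentum.
Further let Newton's force law be defined by the relation

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{2.63}$$

Since $\mathbf{v}$ is now permitted to be time-dependent, it follows that the dynamic mass $m$ is a function of time and thus that

$$\frac{d\mathbf{p}}{dt} = m\frac{d\mathbf{v}}{dt} + \mathbf{v}\frac{dm}{dt} \tag{2.64}$$

Expressions (2.63) and (2.64) also are seen to reduce to their classical forms for normal velocities.
The kinetic energy $T$ of a moving body still can be defined as the work supplied to bring it from rest to its state of motion. Thus

$$T = \int \mathbf{F}\cdot d\mathbf{l} = \int \frac{d\mathbf{p}}{dt}\cdot\mathbf{v}\,dt = \int \mathbf{v}\cdot d\mathbf{p} \tag{2.65}$$

From (2.62)

$$d\mathbf{p} = \frac{m_0\,d\mathbf{v}}{(1 - v^2/c^2)^{1/2}} + \frac{m_0\mathbf{v}\,(\mathbf{v}\cdot d\mathbf{v})}{c^2(1 - v^2/c^2)^{3/2}}$$

so that

$$\mathbf{v}\cdot d\mathbf{p} = \frac{m_0\,\mathbf{v}\cdot d\mathbf{v}}{(1 - v^2/c^2)^{3/2}}$$

However since $d(\mathbf{v}\cdot\mathbf{v}) = 2\mathbf{v}\cdot d\mathbf{v}$, this can be rewritten

$$\mathbf{v}\cdot d\mathbf{p} = \frac{m_0\,d(v^2)}{2(1 - v^2/c^2)^{3/2}}$$

Therefore, with $w = v^2/c^2$, (2.65) becomes

$$T = \frac{m_0c^2}{2}\int_0^{v^2/c^2}\frac{dw}{(1 - w)^{3/2}} = m_0c^2\left[\left(1 - \frac{v^2}{c^2}\right)^{-1/2} - 1\right] \tag{2.66}$$

Equation (2.66) can be expanded in a power series (cf. Mathematical Supplement) to give

$$T = \frac{1}{2}m_0v^2 + \frac{3}{8}m_0\frac{v^4}{c^2} + \cdots \tag{2.67}$$

If $v \ll c$, (2.67) is approximated quite well by the conventional expression for kinetic energy.
Equation (2.66) can be written in the interesting form

$$T = \left[\frac{m_0}{(1 - v^2/c^2)^{1/2}} - m_0\right]c^2 = (m - m_0)c^2 \tag{2.68}$$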

This suggests that the kinetic energy can be interpreted as the square of the velocity of light times the change in mass. If the increase in energy is thought of as the cause of the increase in mass, it becomes an attractive hypothesis to imagine that even the rest mass $m_0$ is due to an internal amount of energy $m_0c^2$. If $m_0c^2$ is called the rest energy of the body, then the total energy $E$, being the sum of the rest energy and the kinetic energy, is given by

$$E = m_0c^2 + T = mc^2 \tag{2.69}$$

This celebrated equation is one of the most important results of the special theory and has been amply substantiated by a wide variety of atomic and nuclear experiments. It lies at the heart of the explanation of fission and fusion bombs and has led to a satisfactory explanation of stellar energy processes. Verification of (2.69) provides convincing support of the soundness of the generalized definitions of momentum and the force law given earlier, on which the derivation of (2.69) was based.
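The relation between the exact kinetic energy (2.66) and its series (2.67) can be examined numerically. This is a small sketch with hypothetical function names; the unit rest mass and the sample speeds are chosen only for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def kinetic_energy(m0, v):
    """Exact relativistic kinetic energy, Equation (2.66), in joules."""
    return m0 * C**2 * (1.0 / math.sqrt(1.0 - (v / C) ** 2) - 1.0)


def kinetic_energy_series(m0, v):
    """First two terms of the series (2.67)."""
    return 0.5 * m0 * v**2 + 0.375 * m0 * v**4 / C**2


def total_energy(m0, v):
    """Total energy E = m c^2, Equation (2.69), with m from (2.61)."""
    return m0 * C**2 / math.sqrt(1.0 - (v / C) ** 2)


m0 = 1.0  # kg, illustrative
for beta in (0.01, 0.1, 0.5, 0.9):
    v = beta * C
    exact = kinetic_energy(m0, v)
    series = kinetic_energy_series(m0, v)
    # Columns: v/c, series/exact, and the check E - m0 c^2 = T
    print(beta, series / exact, total_energy(m0, v) - m0 * C**2 - exact)
```

The ratio in the second column stays near 1 at low speed and falls well below 1 as $v$ approaches $c$, which is where the truncated series stops being useful.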
EXAMPLE

2.8

The dynamic balance within a stable star can be explained by arguing that the great mass causes a high gravitational pressure at the core. This intense pressure serves to elevate the temperature of the core to millions of degrees and thus permit fusion processes to occur. The most likely of these processes is the conversion of hydrogen to helium. Four hydrogen atoms, each consisting of a proton and an electron, can be transformed into a single helium atom consisting of two protons, two neutrons, and two electrons, as suggested by the diagram. A charge balance is achieved because two positively charged protons plus two negatively charged electrons are replaced by two uncharged neutrons. However, a mass balance is not achieved. Since the atomic weights of hydrogen and helium are 1.008 and 4.003, respectively, it follows that 4.032 units of mass are replaced by only 4.003 units. The loss in mass, multiplied by $c^2$, represents the energy radiated during the transformation. As this energy streams outward from the core it causes a radiation pressure which balances the gravitational pressure, causing the star to maintain a stable size. This stability ensures that the rate of the fusion process will remain essentially constant over a long period of time (billions of years). This in turn makes the solar power available to a planetary system a constant, a desirable factor in evolutionary processes.
When radiation pressure is computed on the basis $\Delta E = c^2\,\Delta m$, theoretical calculations yield values for stellar diameters and surface temperatures which are in satisfactory agreement with observations. Spectrographic studies of our own sun indicate hydrogen is its most abundant element, with helium next. The relative abundance of these two elements suggests that the process has been going on for about five billion years, a figure which is in good agreement with geological data. It also suggests that the process can continue in stable fashion for another five billion years.
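The energy released per helium atom follows directly from the atomic weights quoted in Example 2.8. In the sketch below the conversion factors for the atomic mass unit and the electron volt are assumed standard values, not figures from the text.

```python
C = 299_792_458.0      # speed of light in m/s
AMU_KG = 1.66054e-27   # kilograms per atomic mass unit (assumed standard value)
JOULE_PER_MEV = 1.60218e-13  # joules per MeV (assumed standard value)

HYDROGEN_WEIGHT = 1.008  # atomic weight from the example
HELIUM_WEIGHT = 4.003    # atomic weight from the example

mass_loss_amu = 4.0 * HYDROGEN_WEIGHT - HELIUM_WEIGHT  # about 0.029 units
energy_joule = mass_loss_amu * AMU_KG * C**2           # Delta E = c^2 Delta m
energy_mev = energy_joule / JOULE_PER_MEV
fraction_converted = mass_loss_amu / (4.0 * HYDROGEN_WEIGHT)

print(mass_loss_amu, energy_joule, energy_mev, fraction_converted)
```

With these inputs the mass loss is a little under one percent of the original hydrogen mass, corresponding to roughly 27 MeV per helium atom formed.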

2.18

THE TRANSFORMATION LAW FOR MASS

Equation (2.61) is not, of course, the transformation law for mass, because it relates
the rest mass in one coordinate system to the dynamic mass in the same coordinate
system. However, it can be used to relate the dynamic mass in two different coordinate
systems as follows:
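Let a body of rest mass m₀ have a velocity v(x,y,z,t) in XYZ and a velocity v'(x',y',z',t')
in X'Y'Z'. Then

\[ m = \frac{m_0}{(1 - v^2/c^2)^{1/2}} \qquad \text{and} \qquad m' = \frac{m_0}{[1 - (v')^2/c^2]^{1/2}} \]

are the expressions for the dynamic mass in the two coordinate systems. Thus

\[ m' = \left[\frac{1 - v^2/c^2}{1 - (v')^2/c^2}\right]^{1/2} m \]

From the velocity transformation Equations (2.50),

\[ (v')^2 = \frac{(1 - u^2/c^2)(v^2 - v_x^2) + (v_x - u)^2}{(1 - u v_x/c^2)^2} \]

so that, with κ = (1 - u²/c²)^(-1/2),

\[ m' = \kappa\left(1 - \frac{u v_x}{c^2}\right) m \tag{2.70} \]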

Equation (2.70) is the transformation law for mass. In using it one should remember
that in general both m and m' are functions of time.
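
A quick numerical check can make the mass transformation law concrete. The sketch below computes m' in two ways: directly from the transformed velocity, and from the law of Equation (2.70) written as m' = κ(1 - u v_x/c²)m. It assumes units in which c = 1, takes κ = (1 - u²/c²)^(-1/2), and uses the standard velocity transformation for relative motion along X; the function names and sample values are illustrative only.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption of this sketch


def kappa(u):
    """Lorentz factor for frame speed u along the X axis."""
    return 1.0 / math.sqrt(1.0 - u * u / C**2)


def transform_velocity(v, u):
    """Velocity components in X'Y'Z' for a body with velocity v in XYZ."""
    vx, vy, vz = v
    d = 1.0 - u * vx / C**2
    k = kappa(u)
    return ((vx - u) / d, vy / (k * d), vz / (k * d))


def dynamic_mass(m0, v):
    """Dynamic mass m0 / sqrt(1 - v^2/c^2) for velocity vector v."""
    v2 = sum(c * c for c in v)
    return m0 / math.sqrt(1.0 - v2 / C**2)


def transformed_mass(m, v, u):
    """Mass transformation law: m' = kappa * (1 - u*vx/c^2) * m."""
    return kappa(u) * (1.0 - u * v[0] / C**2) * m


m0, v, u = 1.0, (0.3, 0.4, 0.5), 0.6
m = dynamic_mass(m0, v)
direct = dynamic_mass(m0, transform_velocity(v, u))
via_law = transformed_mass(m, v, u)
print(direct, via_law)
```

The two printed values should agree to rounding error for any admissible v and u.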

2.19

THE TRANSFORMATION LAW FOR FORCE

On the presumption that the Lorentz equations properly transform all the laws of
physics (as required by the relativity principle), one can write

\[ \mathbf{F} = \frac{d}{dt}(m\mathbf{v}) \tag{2.71} \]

\[ \mathbf{F}' = \frac{d}{dt'}(m'\mathbf{v}') \tag{2.72} \]

and inquire what the force transformation law must be in order to derive either of these
equations from the other via the Lorentz equations.
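With the help of (2.50) and (2.70), Equation (2.72) can be expanded to give

\[
\begin{aligned}
F'_x &= \frac{dt}{dt'}\,\frac{d}{dt}\left[\kappa (v_x - u)\, m\right] \\
F'_y &= \frac{dt}{dt'}\,\frac{d}{dt}\left[m v_y\right] \\
F'_z &= \frac{dt}{dt'}\,\frac{d}{dt}\left[m v_z\right]
\end{aligned}
\tag{2.73}
\]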

Formation of the differentials of the last of (2.39) yields

\[ \frac{dt}{dt'} = \frac{1}{\kappa (1 - u v_x/c^2)} \]

so that Equations (2.73) become

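\[
\begin{aligned}
F'_x &= \left(1 - \frac{u v_x}{c^2}\right)^{-1} \frac{d}{dt}(m v_x - m u) = \frac{F_x - u\,(dm/dt)}{1 - u v_x/c^2} \\
F'_y &= \frac{1}{\kappa (1 - u v_x/c^2)}\,\frac{d}{dt}(m v_y) = \frac{F_y}{\kappa (1 - u v_x/c^2)} \\
F'_z &= \frac{1}{\kappa (1 - u v_x/c^2)}\,\frac{d}{dt}(m v_z) = \frac{F_z}{\kappa (1 - u v_x/c^2)}
\end{aligned}
\tag{2.74}
\]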

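From (2.61)

\[ \frac{dm}{dt} = \frac{m_0/c^2}{(1 - v^2/c^2)^{3/2}}\; \mathbf{v}\cdot\frac{d\mathbf{v}}{dt} = \frac{1}{1 - v^2/c^2}\,\frac{\mathbf{v}}{c^2}\cdot\left(m\,\frac{d\mathbf{v}}{dt}\right) \]

With the help of (2.64) this can be rewritten

\[ (c^2 - v^2)\,\frac{dm}{dt} = \mathbf{v}\cdot\left(\mathbf{F} - \mathbf{v}\,\frac{dm}{dt}\right) \]

so that

\[ \frac{dm}{dt} = \frac{\mathbf{v}\cdot\mathbf{F}}{c^2} \tag{2.75} \]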

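Finally

\[
\begin{aligned}
F'_x &= F_x - \frac{u v_y/c^2}{1 - u v_x/c^2}\,F_y - \frac{u v_z/c^2}{1 - u v_x/c^2}\,F_z \\
F'_y &= \frac{F_y}{\kappa (1 - u v_x/c^2)} \\
F'_z &= \frac{F_z}{\kappa (1 - u v_x/c^2)}
\end{aligned}
\tag{2.76}
\]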

Equations (2.76) are known as the force transformation law. It is evident that if u and v
are small, F' ≈ F, indicating that in such cases the classical expression, which equates
these forces, is a valid approximation.
It is significant that Equations (2.76) are linear in the force components. Recalling
that F or F' is the total force acting on the body of rest mass m₀, if F is composed of
partial forces such that

\[ \mathbf{F} = \mathbf{F}_1 + \mathbf{F}_2 + \cdots + \mathbf{F}_N \]

then each of these partial forces has a counterpart such that

\[ \mathbf{F}' = \mathbf{F}'_1 + \mathbf{F}'_2 + \cdots + \mathbf{F}'_N \]

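Equations (2.76) can then be written in the expanded form

\[
\begin{aligned}
(F'_{1x} + F'_{2x} + \cdots + F'_{Nx}) &= (F_{1x} + F_{2x} + \cdots + F_{Nx})
 - \frac{u v_y/c^2}{1 - u v_x/c^2}\,(F_{1y} + F_{2y} + \cdots + F_{Ny}) \\
 &\quad - \frac{u v_z/c^2}{1 - u v_x/c^2}\,(F_{1z} + F_{2z} + \cdots + F_{Nz}) \\
(F'_{1y} + F'_{2y} + \cdots + F'_{Ny}) &= \frac{F_{1y} + F_{2y} + \cdots + F_{Ny}}{\kappa (1 - u v_x/c^2)} \\
(F'_{1z} + F'_{2z} + \cdots + F'_{Nz}) &= \frac{F_{1z} + F_{2z} + \cdots + F_{Nz}}{\kappa (1 - u v_x/c^2)}
\end{aligned}
\]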


Since the partial forces are in general independent, it follows that

\[
\begin{aligned}
F'_{nx} &= F_{nx} - \frac{u v_y/c^2}{1 - u v_x/c^2}\,F_{ny} - \frac{u v_z/c^2}{1 - u v_x/c^2}\,F_{nz} \\
F'_{ny} &= \frac{F_{ny}}{\kappa (1 - u v_x/c^2)} \\
F'_{nz} &= \frac{F_{nz}}{\kappa (1 - u v_x/c^2)}
\end{aligned}
\tag{2.77}
\]

in which F_n and F'_n represent the nth partial force as determined in XYZ or X'Y'Z',
with 1 ≤ n ≤ N. Thus the partial forces transform according to the same law as the
total forces. However, it should be recognized that, whereas (2.77) contains partial
forces, it does not contain partial velocities. The terms v_x, v_y, and v_z occurring in (2.77)
refer to the total instantaneous motion of the mass m, resulting from the action of all
the forces.
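
Two properties of the force transformation law discussed above, its linearity in the force components and its reduction to F' ≈ F at low speeds, can be illustrated numerically. The sketch below is a minimal illustration, not a derivation: it assumes units with c = 1, takes κ = (1 - u²/c²)^(-1/2), and uses arbitrary sample forces and velocities; the function names are hypothetical.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption of this sketch


def kappa(u):
    """Lorentz factor for frame speed u along the X axis."""
    return 1.0 / math.sqrt(1.0 - u * u / C**2)


def transform_force(F, v, u):
    """Force transformation law: F in XYZ -> F' in X'Y'Z'.

    v is the total instantaneous velocity of the body in XYZ, and u is the
    speed of X'Y'Z' along X.
    """
    Fx, Fy, Fz = F
    vx, vy, vz = v
    d = 1.0 - u * vx / C**2
    k = kappa(u)
    Fx_p = Fx - (u * vy / C**2) / d * Fy - (u * vz / C**2) / d * Fz
    return (Fx_p, Fy / (k * d), Fz / (k * d))


v, u = (0.3, 0.4, 0.5), 0.6
F1, F2 = (1.0, -2.0, 0.5), (0.25, 0.75, -1.5)

# Linearity: transforming the total force equals summing the transformed parts,
# provided the same total velocity v is used for every partial force.
total = tuple(a + b for a, b in zip(F1, F2))
lhs = transform_force(total, v, u)
rhs = tuple(a + b for a, b in zip(transform_force(F1, v, u),
                                  transform_force(F2, v, u)))
print(lhs, rhs)

# Low-speed limit: F' is very nearly F when u and v are small compared with c.
slow = transform_force((1.0, 2.0, 3.0), (1e-6, 2e-6, 0.0), 1e-6)
print(slow)
```

Note that the same velocity v is passed for both partial forces, in keeping with the remark that the law contains partial forces but not partial velocities.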
This important transformation law will be central to the development of the field
of a moving charge, a topic to be considered in Chapter 4. The results there obtained
will serve as additional evidence for the validity of this reconstitution of the laws of
mechanics in keeping with the Lorentz transformation.
PROBLEMS

2.1

Assume that two plane waves of light are propagating almost parallel to the Y axis, such
that they are given by

\[ \psi_1 = K \cos(2\pi\nu t - k_x x - k_y y) \]

\[ \psi_2 = K \cos(2\pi\nu t + k_x x - k_y y + \alpha) \]

in which K is the constant amplitude of both waves and α is their relative phase at the
origin. Show that in any transverse plane y = constant these waves interfere so as to
give alternate regions of light and dark. What is the spacing of these interference fringes?
How does the position of these fringes depend on α? (Note that this effect is used in the
Michelson interferometer.)
2.2

In the Michelson interferometer, how does fringe shift depend on the rotational position
of the apparatus if the two arms are not equal? (Cf. Appendix A.)

2.3

In Section 2.10 of the text, the Lorentz transformation equations were derived using the
pulse of light which occurred at the coincidence of ends of the two rulers. Show that the
Lorentz equations can also be derived by requiring that O and O' obtain symmetrical
results, thus giving an analytic parallel to the literal arguments of Section 2.9.

2.4

Show that two Lorentz transformations carried out one after the other are equivalent to
one Lorentz transformation for which the relative velocity is

\[ u = \frac{u_1 + u_2}{1 + u_1 u_2/c^2} \]

with u₁ and u₂ the relative velocities of the two transformations. Thus show that it is
impossible to combine a sequence of Lorentz transformations into one yielding a relative
velocity greater than c.
2.5

In Section 2.9 of the text, a literal argument was used to show that observers O and O'
each concluded that the other ruler had shrunk when relative motion occurred. Use the
Lorentz transformation equations to demonstrate that the events A opposite B' and A'
opposite B occur in reverse time sequence for the two observers.

2.6

The result (2.41) was obtained when observer O found the positions of the two ends of
the ruler R' at a common time t. Show that the same result is obtained if O determines
how long it takes for R' to pass a fixed point in XYZ and then multiplies this time interval
by the speed u.

2.7

Show that the time dilatation effect may also be obtained by determining the distance
in XYZ between two events which occur at the same point in X'Y'Z' and dividing this
distance by u to get the time interval in XYZ.

2.8

A jet passenger airplane 150 ft long is cruising at a ground speed of 600 mph. By how much
does the plane appear shortened to a ground observer? How long would the pilot need to
fly at this speed before his clock appeared to a ground observer to have lost 1 sec?

2.9

Use the time dilatation formula to check the results of Example 2.2.

2.10

A space vehicle whose rest length is 100 m is traveling away from the earth at a constant
velocity v = 0.8c. A pulse of light is sent from the earth toward the spacecraft. As the light
pulse passes the rear of the vehicle it triggers a clock. It then continues to the front of the
vehicle where it is reflected by a mirror and returns to the clock. What time interval does
the clock record between the two passages of the light pulse? What time interval would
earth clocks record between the same two events?

2.11

An electronic clock is shown in the figure, and consists of a flashtube F and a photoelectric
cell P shielded from each other by a baffle, plus a mirror M rigidly mounted a fixed distance D above the assembly. A circuit in the box B is arranged so that when P receives
a light pulse from F via M, it causes the flashtube to emit another pulse of light with
negligible delay. This clock thus "ticks" once every 2D/c sec when at rest. Now suppose
that this clock is moving at a constant velocity v relative to the laboratory frame and
determine its period. Does your answer depend on the direction of v?
2.12

A cosmic ray μ meson enters the earth's atmosphere vertically at a speed v = 0.98c. In
its own rest system the μ meson decays into an electron and 2 neutrinos with a mean
lifetime of 2.2 × 10⁻⁶ sec. What is its mean life expectancy as determined by an earth
observer? How far will a shower of these μ mesons penetrate the earth's atmosphere before
half of them have decayed?

2.13

Suppose that the frequency of a ray of light is ν, as determined by O who is stationary
in XYZ, and that this light ray is traveling at an angle θ with respect to the X axis. Show
that an observer O', stationary in X'Y'Z', will find that the frequency of the light ray is

\[ \nu' = \frac{\nu\,[1 - (u/c)\cos\theta]}{(1 - u^2/c^2)^{1/2}} \]

This result is known as the relativistic Doppler formula. Note that the numerator is the
classical expression.
2.14

A distant galaxy is receding from Earth with a radial velocity component of 1,000 km/sec.
By how many angstrom units will the Hβ line (4,861 Å) be shifted? Is the shift toward the
red or toward the violet end of the spectrum?

2.15

A straight line fixed in the XZ plane makes an angle θ with the X axis. What angle does
this line appear to make with the X' axis to an observer O' stationary in X'Y'Z'?

2.16

A small particle of mass m is moving at a constant speed v in a straight-line path in the
XZ plane. This path makes an angle θ with the X axis. Find the velocity components of
this particle in the X'Y'Z' frame. What angle does the particle's path make with the X'
axis? Is this answer consistent with the result of the previous problem?

2.17

A particle of rest mass M₀, moving through XYZ at the constant velocity V, collides
inelastically with a second particle of rest mass m₀. If the second particle were initially at
rest in XYZ, find the speed of the composite particle.

2.18

Explain the aberration of light from a distant star in terms of the Lorentz transformation.

2.19

To what speed must a particle of rest mass m be accelerated in order to quadruple its
mass? What is its kinetic energy at this speed? How does this answer compare with
½m₀v²?
2.20

Find a formula connecting the momentum p and the kinetic energy T of a particle of rest
mass m₀.

2.21

A Compton collision occurs when a photon strikes an electron and is thereby scattered.
Find the change in frequency of the photon as a function of the angle θ through which it is
deflected. What is the change in energy of the electron?

2.22

An excited atom, at rest in XYZ, drops to a quantum state whose energy level is lower
by ΔE. A photon is emitted and the atom recoils. Therefore the frequency of the photon
will not be precisely ν = ΔE/h, but rather will be

\[ \nu = \frac{\Delta E}{h}\left(1 - \frac{1}{2}\,\frac{\Delta E}{M_0 c^2}\right) \]

in which M₀ is the rest mass of the atom and h is Planck's constant. Show this result.
2.23

Consider the collision of a particle of initial energy E and rest energy E₀ with a like
particle which is at rest. Show that the maximum energy available in the zero momentum
frame is (2EE₀ + 2E₀²)^(1/2).

2.24

Find the Lorentz transformation law for acceleration and express your answer in terms of
acceleration components which are perpendicular to and parallel to the velocity.

2.25

In an XYZ frame a particle is moving in the XY plane and has instantaneous velocity
components v_x = v_y = ½c. At this same instant the two force components are equal. In
what Lorentzian frame will the force appear to be entirely Y directed at this instant?
What will be the magnitude of this force?

2.26

Show that the force defined by (2.71) is parallel to the acceleration only if the acceleration
is either parallel to or perpendicular to the velocity.

2.27

An electron and a positron can combine at rest, annihilating each other with the result
that two γ rays are emitted. Assuming that energy and momentum are conserved, calculate the wavelength of the γ rays.

2.28

Consider a rocket ship in which mass can be converted to energy which provides a thrust.
Find the terminal velocity of this rocket ship relative to a frame in which it was initially
at rest, as a function of the percent of original mass converted.

CHAPTER

Electrostatics in Free Space


This chapter and the two which follow will be concerned with the establishment of
an electrical theory in the absence of dielectric and permeable materials. Conductors
will be considered in a limited way, but only as supporting structures for the distribution or transport of charge; attention will be focused on the fields set up by these
charges and not on any interaction with their conducting environment. The conductors themselves will be treated as an electrically neutral background, consisting locally
of equal amounts of positive and negative charge in a vacuum. In this way a simplified theory of electromagnetic fields caused by charges in free space can be developed. Subsequent chapters will then be concerned with the extension of this theory to
situations which include the effects of materials.
The present chapter begins with a formulation of the electric field due to a static
assemblage of charges and then proceeds to the introduction of the electrostatic potential. Electric flux density is defined, following which Gauss' law and its applications are
discussed, including the use of flux maps. The relationship between field and charge at
a conductor-vacuum interface is established and the method of images is then developed and applied to several cases. Poisson's and Laplace's equations are derived and
a variety of boundary-value problems considered. The concept of capacitance is defined
and generalized to a system of conductors, and the chapter closes with a discussion of
the energy stored in an electrostatic field.
At this point a dipole theory of the behavior of dielectric materials could have been
introduced, but it would perforce be limited to static stresses. For this reason it was
felt desirable to defer the discussion on dielectrics until the general time-varying case
could be considered. Similarly, the next chapter, which deals with magnetic fields due
to time-independent currents, could logically contain sections on d.c. conductivity and
static effects in magnetic materials; these topics too have been postponed so that time-varying effects could be included for completeness.

3.1 *

HISTORICAL SURVEY

* This section may be omitted without loss in continuity of the technical presentation.

Electrostatic theory is based on the single experimental postulate that electric charges
exert forces on each other which vary directly as the product of the strengths of the
charges and inversely as the square of their distance of separation. Thus if q and q' are
chosen to represent the strengths of two point charges, and r is the distance between
them, the force which one charge exerts on the other may be expressed in the form

\[ \mathbf{f} \propto \frac{q q'}{r^2}\,\mathbf{1}_r \tag{3.1} \]

in which 1_r is a unit vector along their connecting line. The two electric charges can be
alike or opposite, causing the force to be repulsive or attractive; this feature is accommodated mathematically by permitting the symbols q and q' to have an intrinsic algebraic sign which is positive or negative.
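
The sign convention in (3.1) can be illustrated with a short sketch. Because (3.1) is only a proportionality, the constant k below is a placeholder assumed for illustration (set to 1), and the function name and argument layout are likewise hypothetical.

```python
import math


def coulomb_force(q, q_prime, r_from, r_to, k=1.0):
    """Force on charge q_prime at r_to due to charge q at r_from.

    Implements f = k * q * q' / r^2 along the unit vector pointing from q
    toward q'. The constant k is an assumed placeholder, since (3.1) only
    states a proportionality.
    """
    d = [b - a for a, b in zip(r_from, r_to)]
    r = math.sqrt(sum(x * x for x in d))
    unit = [x / r for x in d]
    magnitude = k * q * q_prime / r**2  # signed: positive means repulsion
    return [magnitude * x for x in unit]


like = coulomb_force(1.0, 1.0, (0.0, 0.0, 0.0), (2.0, 0.0, 0.0))
unlike = coulomb_force(1.0, -1.0, (0.0, 0.0, 0.0), (2.0, 0.0, 0.0))
farther = coulomb_force(1.0, 1.0, (0.0, 0.0, 0.0), (4.0, 0.0, 0.0))
print(like, unlike, farther)
```

With like signs the force on q' points away from q; with unlike signs it points toward q; doubling the separation quarters the magnitude.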
This inverse square law, as it is usually called, has a curious history of discovery and
rediscovery. As is true with respect to most major scientific principles, its establishment
cannot be wholly credited to the efforts of one man. Perhaps the first significant contribution to the realization of this law was made by Benjamin Franklin (1706-1790).
Writing to Dr. John Lining of Charlestown, South Carolina, on March 18, 1755,
Franklin described an experiment he had performed in the following words:¹
. . . I electrified a silver pint cann, on an electric stand, and then lowered into it a cork-ball,
of about an inch diameter, hanging by a silk string, till the cork touched the bottom of the
cann. The cork was not attracted to the inside of the cann as it would have been to the outside, and though it touched the bottom, yet, when drawn out, it was not found to be electrified by that touch, as it would have been touching the outside. The fact is singular. You
require the reason; I do not know it. Perhaps you may discover it, and then you will be so
good as to communicate it to me. I find a frank acknowledgment of one's ignorance is not
only the easiest way to get rid of a difficulty, but the likeliest way to obtain information,
and therefore I practice it: I think it an honest policy. Those who affect to be thought to
know every thing, and so undertake to explain every thing, often remain long ignorant of
many things that others could and would instruct them in, if they appeared less conceited.

Later, upon editing a collection of his letters for publication, Franklin added the
footnote
. . . Mr. F. has since thought, that, possibly the mutual repulsion of the inner opposite
sides of the electrised cann, may prevent the accumulating an electric atmosphere upon
them and occasion it to stand chiefly on the outside. But recommends it to the farther
examination of the curious.

Very little progress was made with this idea until Franklin described the above-mentioned experiment to his good friend Joseph Priestley and asked Priestley to repeat
the investigation and verify his results. Priestley (1733-1804), better known as the discoverer of oxygen, undertook experiments beginning in December 1766. He suspended
two pith balls from threads which were entirely inside an electrically charged cup. Like
Franklin, Priestley found² that the balls
. . . remained just where they were placed, without being in the least affected by the
electricity; but that, if a finger, or any conducting substance communicating with the earth,
touched them, or was even presented towards them, near the mouth of the cup, they
immediately separated, being attracted to the sides; as they also were in raising them up,
the moment that the threads appeared above the mouth of the cup.

¹ Bernard Cohen, ed., Benjamin Franklin's Experiments, pp. 331-338, Harvard University Press, 1941.
² J. Priestley, The History and Present State of Electricity with Original Experiments, p. 732, printed for
J. Dodsley, London, 1767.

Based on the results of this experiment, Priestley then made the observation
May we not infer from this experiment, that the attraction of electricity is subject to
the same laws with that of gravitation, and is therefore according to the square of the
distances; since it is easily demonstrated that were the earth in the form of a shell, a body
in the inside of it would not be attracted to one side more than another.

Despite the fact that Priestley was prompt to publish these experimental findings
and his inference of the inverse square relation, the scientific community of his day
failed to appreciate the significance. Indeed, Priestley himself apparently did not
regard this accomplishment as a sufficiently rigorous proof and did not champion his
deductions.
Two years later in 1769, Dr. John Robison (1739-1805), of Edinburgh, undertook
the task of determining the law of force between electric charges by direct experiment.
Little attention has been given to the historical priority of his discovery, since Robison
made scant attempt at the time to publicize his findings. This is unfortunate, because
he was an accomplished investigator of wide interests, whose discoveries could have
benefited the progress of science. His lectures and scientific researches were published
posthumously in Edinburgh in 1822 and are clearly and engagingly presented in an
extensive four-volume treatise entitled Mechanical Philosophy. Commencing on page 73
of the fourth volume of this treatise, Robison describes in detail an electrometer which
he constructed for the purpose of determining the force law between electrified particles.
Figure 3.1 is a reproduction of Robison's sketch of the electrometer, a device which
balances gravitational and electrical forces, thus giving a mechanical equivalent
of electrical attraction or repulsion. Robison's lengthy description of the apparatus
and its method of operation can be paraphrased by noting that A and B are metallic
balls which, in the course of the experiment, will be electrified. B is attached to an
insulating stalk and counterpoised by the ball D, with the stalk freely hinged at C.
A is insulated by the glass arm FEL to which is attached the hinge C. With the two
balls A and B uncharged, the apparatus is adjusted so that, when BD hangs vertically,
A and B barely touch. The shaft FI is then rotated until
. . . the line LA is horizontal, and so is CB; and the movable ball B is resting on A and is
carried by it. Now electrify the balls, and gently turn the handle I backwards . . . noticing
carefully the two balls. It will happen that, in some particular position of the index, they
will be observed to separate. Bring them together again, and again cause them to separate,
till the exact position at separation is ascertained. This will shew their repulsive force in
contact, or at the distance of their centres, equal to the sum of their radii. Having determined this point, turn the instrument still more toward the vertical position. The balls will
now separate more and more . . . this electrometer . . . will give absolute measures:
for . . . by laying some grains weight on the cork-ball D, till it becomes horizontal and
perfectly balanced, and computing for the proportional lengths of BC and DC, we know
exactly the number of grains with which the balls must repel each other (when the stalk is
in a horizontal position) in order merely to separate. Then a very simple computation will
tell us the grains of repulsion when they separate in any oblique position of the stalk; and
another computation, by the resolution of forces, will shew us the repulsion exerted between
them when AL is oblique, and BC makes any given angle with it.

FIGURE 3.1 Robison's apparatus.

After revealing his talent for careful instrumentation by instructing the reader in the
proper construction and care of the critical components of the electrometer, Robison
moved on to a discussion of the results he had obtained. Noting that he had made
many hundreds of measurements with different instruments, he concluded that
the mutual repulsion of two spheres, electrified positively or negatively, was very nearly in
the inverse proportion of the squares of the distance of their centres, or rather in a proportion somewhat greater, approaching to 1/r^2.06.

By rotating the apparatus so that B was under A, Robison was able to make measurements of the attractive force between unlike charges. The results were similar and
he concluded that the force law was probably the inverse square of distance for both
attraction and repulsion. He failed to recognize the importance of this result, perhaps
because of the subordinate position in which he tended to place experimental work
relative to mathematics.
Another definitive demonstration of the inverse square law was achieved by Henry
Cavendish (1731-1810) in 1773. His experiment had the same basic form as the
approach used earlier by Franklin and Priestley, although it is not clear that Cavendish
was aware of their efforts. He went far beyond their accomplishments, however, and
obtained a quantitative result for the law of force, including an estimate of the precision of his data.
The laboratory technique displayed by Cavendish in all his researches would earn
the admiration of any modern experimenter. In his earlier work with electricity, he had
developed the concept of "degree of electrification" (now called potential), and had
then convinced himself that when two charged conductors are connected by a wire
they redistribute charge in order to attain the same potential. He incorporated this
result into many experiments designed to compare the charge on two bodies which had
been brought to a common potential.
In one of these experiments, Cavendish showed that the charges on similar bodies at
the same potential are in the ratio of their linear dimensions. Using this knowledge, he
expressed the charge on any body in terms of the diameter of a sphere which, when at
the same potential, would have an equal charge. This, in modern language, is the concept of capacitance, and when Cavendish spoke of the charge of a body as "globular
inches" or simply "inches of electricity" he meant that the capacitance of the body
in question was equal to that of a sphere whose diameter in inches was the value quoted.
Cavendish took as his standard a conducting spherical shell whose diameter was 12.1 in.
and he then ascertained, by a well-arranged series of measurements, the relative capacitances of a great number of bodies of many shapes.
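
Cavendish's "inches of electricity" can be related to modern units with a short sketch. It relies on a fact that is not stated in the text and is supplied here as a modern gloss: the capacitance of an isolated conducting sphere of radius R in free space is 4πε₀R in SI units, which is proportional to the diameter, as Cavendish found. The constant values and function names below are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12   # permittivity of free space, F/m (assumed SI value)
INCH = 0.0254             # metres per inch


def sphere_capacitance(diameter_m):
    """Capacitance in farads of an isolated sphere: 4*pi*eps0*R = 2*pi*eps0*d."""
    return 2.0 * math.pi * EPS0 * diameter_m


def globular_inches_to_farads(inches):
    """Capacitance of a body quoted as so many 'globular inches'.

    By Cavendish's definition this equals the capacitance of a sphere whose
    diameter in inches is the quoted value.
    """
    return sphere_capacitance(inches * INCH)


standard = globular_inches_to_farads(12.1)   # Cavendish's 12.1 in. standard globe
print(standard)
```

On these assumptions the 12.1 in. standard globe corresponds to roughly 17 picofarads.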
His electric force experiment had the intention³
. . . to find out whether, when a hollow globe is electrified, a smaller globe inclosed within
it and communicating with the outer one by some conducting substance is rendered at all
over or undercharged; and thereby to discover the law of the electric attraction and repulsion.

To this end, Cavendish constructed an apparatus consisting of a 12.1 in. diam inner
globe, mounted on a glass rod, and surrounded by two hemispheres of diameter 13.3 in.
He then
. . . made a communication between them by a piece of wire run through one of the hemispheres and touching the inner globe, a piece of silk string being fastened to the end of the
wire, by which I could draw it out at pleasure.

Cavendish next charged the outer globe, withdrew the connecting wire, removed the
two hemispheres, and tested for charge on the inner globe by touching to it an electrometer consisting of two pith balls suspended by fine linen threads. However, he was not
satisfied with the first form of his apparatus, and went to an improved design, about
which he says
For the more convenient performing this operation, I made use of the following apparatus.
It is more complicated, indeed, than was necessary, but as the experiment was of great
importance to my purpose, I was willing to try it in the most accurate manner.
ABCDEF and AbcDef [Figure 3.2] are two frames of wood of the same size and shape,
supported by hinges at A and D in such manner that each frame is moveable on the horizontal
line AD as an axis. H is one of the hemispheres, fastened to the frame ABCD by the four
sticks of glass, Mm, Nn, Pp, and Rr, covered with sealing-wax, h is the other hemisphere
fastened in the same manner to the frame AbcD. G is the inner globe, suspended by the horizontal stick of glass Ss, the frame of wood by which Ss and the hinges at A and D are supported being not represented in the figure to avoid confusion.
Tt is a stick of glass with a slip of tinfoil bound round it at x, the place where it is intended
to touch the globe, and the pith balls are suspended from the tinfoil.

FIGURE 3.2 The Cavendish apparatus. (a) Cavendish's original sketch. (b) Maxwell's drawing.

³ J. Clerk Maxwell, ed., The Scientific Papers of the Honourable Henry Cavendish, revised by J. Larmor,
vol. 1, p. 118, et seq., Cambridge University Press, 1921.

Cavendish describes how the inner globe and hemispheres were coated with tinfoil to
make them good conductors, and how the frame was adjusted so that the hemispheres
would fit accurately together and concentrically around the inner globe. He then goes
on to explain that
It was also so contrived, by means of different strings, that the same motion of the hand
which drew away the wire by which the hemispheres were electrified, immediately after
that was done, drew out the wire which made the communication between the hemispheres
and the inner globe, and immediately after that was drawn out, separated the hemispheres
from each other and approached the stick of glass Tt to the inner globe. It was also contrived
so that the electricity of the hemispheres and of the wire by which they were electrified was
discharged as soon as they were separated from each other, as otherwise their repulsion
might have made the pith balls to separate, though the inner globe was not at all overcharged.

Upon electrifying the outer shell and following the procedure just described, Cavendish brought his pith-ball electrometer into contact with the inner globe, and observed
The result was, that though the experiment was repeated several times, I could never
perceive the pith balls to separate or shew any signs of electricity.

These experiments were performed on December 18-24, 1772 and April 4, 1773. On
the later date Cavendish improved on the detectability of his electrometer by first precharging the pith balls positively or negatively. In this situation, a small like charge on
the inner globe would have slightly altered the separation of the pith balls, whereas
Cavendish observed in both cases that, upon contact with the inner globe, the pith
balls collapsed toward each other, assuming a position in which they were barely
separated. This indicated that the greater capacitance of the inner globe was draining
most of the precharge off the pith balls, and thus that the charge which had been on
the inner globe was much less than the charge with which he had pre-electrified the
electrometer.
Cavendish next turned his attention to the question of the accuracy of his measurements. At issue was the minimum charge on the inner globe which his pith-ball electrometer could detect. To make an estimate of this minimum detectable charge, Cavendish totally discharged the condenser which had been used to charge the outer sphere
in the electric force experiment. He then recharged this condenser with 1/60th of its
original charge, being sure of this value through his use of a set of calibrated condensers.
Upon connecting the recharged condenser to the inner globe (with the hemispheres
removed), he was certain that the charge transferred to the inner globe was less than
1/60th of the charge which had been transferred to the outer sphere in the original electric force experiment. Now, upon bringing his electrometer in contact with the inner
globe, Cavendish found a sensible effect on the separation of the pith balls. Thus he
was led to the conclusion
It appears, therefore, that if a globe 12.1 inches in diameter is inclosed within a hollow globe
13.3 inches in diameter, and communicates with it by some conducting substance, and the
whole is positively electrified, the quantity of redundant fluid lodged in the inner globe is
certainly less than 1/60th of that lodged in the outer globe, and that there is no reason to
think from any circumstance of the experiment that the inner globe is at all overcharged.

Cavendish then proceeded to argue that the law of electric attraction and repulsion
must be inversely as the square of the distance. But he was not yet satisfied. He wanted
. . . to form some estimate how much the law of the electric attraction and repulsion may
differ from that of the inverse duplicate ratio of the distances without its having been
perceived in this experiment . . .

Reasoning as had Newton in the case of gravitational attraction, Cavendish assumed
that the electric charge would spread uniformly over a sphere, and that each element
of charge would exert forces on all other elements according to the same law, with the
principle of superposition applying. He then assumed that the force law had an inverse
distance dependency of the 2 + 1/50th power and computed the amount of charge which
would have to reside on the inner globe so that the net force on a charge located at the
midpoint of the wire connecting the two globes was zero. This amount of charge turned
out to be 1/57th of the charge on the outer sphere. Since 1/57th is larger than the detectable
1/60th amount, Cavendish concluded that
. . . the electric attraction and repulsion must be inversely as some power of the distance
between that of the 2 + 1/50th and that of the 2 - 1/50th, and there is no reason to think
that it differs at all from the inverse duplicate ratio.

Cavendish also investigated the manner in which the electric force law is dependent
on the amount of charge. Once again, he ingeniously contrived a quasi-null experiment,
the apparatus for which is depicted in Figure 3.3. In Cavendish's own words: 4
CD is a wooden rod 43 inches long, covered with tinfoil and supported horizontally by
non-conductors. At the end C is suspended, as in the figure, the electrometer described 5 in
Article 249, and at the other end D is suspended a similar electrometer, only the straws
reached to the bottom of the cork balls A and B, but not beyond them, and were left open so
as to put in pieces of wire, and thereby increase their weight and the force with which they
endeavoured to close.

The two Leyden jars E and F were approximately equal in capacitance and each
exceeded the capacitance of the bar and electrometers together by a factor of about
4 Ibid., pp. 189-193.
5 This electrometer consisted of two wheaten straws, suspended by pin bearings from a brass block,
and terminated by gilted cork balls. Ibid., 131.


one hundred. The outer coatings of both jars were grounded and the inner coating of E
was permanently attached to the bar CD. With the wire weights in the straws A and B,
the system consisting of the jar E, the bar CD, and the two electrometers was charged
until A and B were separated by a measurable amount. The jar F was then connected
to the system, essentially halving the charge on the electrometers. The electrometer
at C was then observed to have a separation almost equal to the separation which

FIGURE 3.3 Cavendish method for determining relation between force and charge intensity.

A and B had previously experienced with double the charge. Since Cavendish had
determined that the two electrometers were almost identical and since he had chosen
the wire weights so that they would quadruple the force tending to close AB (the
actual factor was 3.9), he was able to conclude that the electric force was directly
proportional to the amount of each charge.
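The inference can be checked with a line of arithmetic. The sketch below assumes a power law, F proportional to Q**n, for the dependence of the repulsive force on the common charge Q carried by an electrometer; the symbol n and the variable names are introduced here for illustration only and do not appear in Cavendish's account. Halving Q reduced the force by the measured factor 3.9, so n follows from a ratio of logarithms.

```python
import math

# Values taken from the text above.
charge_ratio = 2.0  # charge before connecting jar F divided by charge after
force_ratio = 3.9   # measured factor by which the wire weights raised the closing force

# Assumed model (for illustration): F proportional to Q**n.
n = math.log(force_ratio) / math.log(charge_ratio)

# n = 2 would correspond to a force proportional to the product of the two charges.
print(f"implied exponent n = {n:.3f}")
```

The exponent comes out at about 1.96. An exponent of 2 in the common charge corresponds to a force directly proportional to each of the two charges, which is the conclusion Cavendish drew.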
The results of these highly original and definitive experiments were unknown to the
scientific community for almost a century for, like Robison, Cavendish chose not to
publicize his findings. By the time a general awareness had developed that each of
these men had established the inverse square law, the credit and fame had been bestowed properly on someone else.
That someone else was Charles Augustin de Coulomb (1736-1806) who, in 1785, also
demonstrated the law of electric force, using a technique totally different from those
employed by any of his predecessors. Coulomb's procedure involved the use of a torsion
balance which he had invented. With it, he measured the repulsive force between two
like charges, balancing this force by the torsion in a wire from which a bar containing
one of the charges was suspended. His celebrated First Memoir on Electricity and Magnetism contains a preliminary section in which the torsion balance is clearly described
as follows: 6
On a glass cylinder ABCD [Figure 3.4, sub-Figure 1] . . . we place a glass plate . . .
which completely covers the cylinder; this plate is pierced by two holes . . . one at the
center f, upon which is erected a glass tube; this tube is bonded over the hole f; at the upper
end h of this tube, a torsion micrometer is placed which we see in detail in sub-Figure 2.
6 C. A. de Coulomb, "Première Mémoire sur l'Électricité et Magnétisme," Histoire de l'Académie
Royale des Sciences, 569; 1785. For an English translation of excerpts, see W. F. Magie, A Source Book
in Physics, McGraw-Hill Book Company, New York, 1935.

FIGURE 3.4 Coulomb's apparatus.

The upper part, No. 1, carries the milled head b, the index io, and the chuck q; this piece fits
into the hole G of piece No. 2; piece No. 2 consists of a circle ab divided along its girth into
360 degrees, and a copper tube Φ which fits into the tube H, No. 3, which in turn is sealed to
the interior of the upper end of the glass tube fh of sub-Figure 1. The chuck q is shaped much
like the end of a solid pencil holder, and is closed by means of the ring q. In this chuck is
clamped the end of a very fine silver wire: the other end of the silver wire [sub-Figure 3] is
held at P in a clamp made of a cylinder Po of copper or iron . . . whose upper end P is split
so as to form a clamp which is closed by means of the sliding piece Φ. This small cylinder is
enlarged and pierced at C in order to permit the needle ag to pass through; it is necessary
that the weight of the small cylinder be sufficient to stretch the silver wire without breaking
it. The needle, ag, is seen [sub-Figure 1] to be suspended horizontally, and about half-way
up in the large cylinder which encloses it, and is formed either of a silk thread or straw soaked
in Spanish wax and finished off from q to a for eighteen lines† of its length by a cylindrical
rod of shellac; at the extremity a of this needle is found a small pith ball two or three lines
in diameter; at g there is a little vertical piece of paper soaked in turpentine, which serves
as a counterbalance to the ball a and retards oscillations.
We have said that the cover AC was pierced by a second hole at m; into this second hole
is introduced a slender rod mΦt, whose lower portion Φt is made of shellac; at t is another pith
ball; around the perimeter of the glass cylinder ABCD, at the height of the needle, is
described a circle zQ divided into 360 degrees; for greater simplicity I use a strip of paper
divided into 360 degrees which is pasted around the cylinder at the height of the needle.

Coulomb then goes on to explain how the instrument is prepared for the experiment
by securing the pith ball t in place and adjusting the micrometer head so that the two
pith balls are just touching. His description of the actual experiment follows:
We electrify a small conductor [Figure 3.4, sub-Figure 4] which is simply a large-headed
pin insulated by plunging its point into the end of a rod of Spanish wax; this pin is introduced
through the hole m and permitted to touch the ball t, which is in contact with the ball a;

upon withdrawing the pin, the two balls are left electrified with the same kind of electricity
and they repel each other to a distance which is measured by looking beyond the suspension
wire and the center of the ball a to the corresponding division of the circle zOQ; then by
rotating the index of the micrometer in the direction pno, we twist the suspension wire lP
and exert a force proportional to the angle of torsion, which tends to bring the ball a closer
to the ball t. In this way one can observe the distance through which different angles of
torsion bring the ball a toward the ball t; comparison of the forces of torsion with the
corresponding distances of the two balls determines the law of repulsion. I shall here only
present some trials which are easy to repeat and which will at once make evident this law
of repulsion.

Coulomb then indicated an initial separation of the two balls of 36 deg, causing an
initial torsional twist of 36 deg in the suspending wire. He next turned the micrometer
head through 126 deg, causing the balls to reduce their separation to 18 deg. Finally,
by turning the micrometer head through 567 deg, he observed that the separation had
been reduced to 8½ deg. Since the force of torsion is proportional to the angle of twist,
these data can be tabulated as shown in Table 3.1.
The values of wire twist in the second column are composed of the rotation of the
micrometer head and the angular displacement of ball a. The distance of separation of
the two pith balls is proportional to the sine of one-half their angle of separation, but
since all these angles are small, the distance of separation is essentially proportional, in
† Before adoption of the metric system in France, one line equalled 1/12 in.


TABLE 3.1
COULOMB'S EXPERIMENTAL DATA FOR THE LAW OF REPULSION

Angular separation of the     Angular measure of the
two pith balls, deg           force of torsion, deg

          36                           36
          18                          144
          8½                          575½

this experiment, to the angle of separation. One notes from the first column of the table
that the angles of separation are almost in the ratio 4 : 2 : 1. The second column of the
table lists quantities proportional to the restoring force, and these figures are essentially
in the ratio 1 : 4 : 16. In analyzing the data, Coulomb was led to the conclusion
It results then from these three trials that the repulsive action which the two balls exert
on each other when they are electrified similarly is in the inverse ratio of the square of the
distance.

It is interesting to note the great change which has taken place in the method of
reporting in scientific journals since Coulomb's time. Whereas Coulomb went into
great detail in describing his apparatus, the experimental procedure, and possible
sources of error, when it came to reporting data he listed only three points, one of
which deviated from the inverse square law by 6 percent. His statement in the First
Memoir just prior to introduction of the data clearly suggests that other trials had
been undertaken and one can only surmise that Coulomb felt a small sample of his
data would be sufficiently convincing.
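The three trials of Table 3.1 are easy to check numerically. The sketch below is offered only as an illustration: it takes the angular separation as a stand-in for distance (the small-angle approximation described above), uses the first trial as the reference, and fits an exponent p under the assumed model that the force varies as the separation to the power -p. The variable names are introduced here and are not Coulomb's.

```python
import math

# Table 3.1: (angular separation in deg, total torsion in deg).
trials = [(36.0, 36.0), (18.0, 144.0), (8.5, 575.5)]

sep0, torsion0 = trials[0]
for sep, torsion in trials:
    # Torsion an exact inverse square law would require at this separation.
    predicted_torsion = torsion0 * (sep0 / sep) ** 2
    # Separation at which an inverse square law would give the measured torsion.
    implied_sep = sep0 * math.sqrt(torsion0 / torsion)
    deviation = (implied_sep - sep) / sep
    print(f"sep={sep:5.1f}  torsion={torsion:6.1f}  "
          f"predicted torsion={predicted_torsion:6.1f}  "
          f"implied sep={implied_sep:5.2f}  deviation={deviation:+.1%}")

# Exponent fitted from the first and third trials (assumed model: F ~ sep**(-p)).
p = math.log(trials[2][1] / trials[0][1]) / math.log(trials[0][0] / trials[2][0])
print(f"fitted exponent p = {p:.2f}")
```

The second trial reproduces the inverse square prediction exactly, while the third implies a separation of about 9 deg against the recorded 8½ deg, a discrepancy of roughly 6 percent in distance, consistent with the deviation quoted above.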
Upon turning his attention to an investigation of the law of electric force for the
attraction between oppositely charged bodies, Coulomb encountered a new difficulty: 7
I wished to use the same method to determine the attractive force between two pith balls
charged with opposite natures of electricity, but by using this same balance to measure the
attractive force, I found an experimental difficulty which did not occur during the measurement of repulsive force. The experimental difficulty arises when the two balls are drawn near
to each other. The attractive force . . . frequently increases at a greater rate than the
torsional force, which increases only directly as the angle of twist; as a consequence, if several
readings are desired, the balls must be prevented from touching each other by means of an
insulating stop in the path of the needle. Since the balance is often required to measure forces
of less than one thousandth of a grain, the collision of the needle with the insulating stop
influences the results and causes part of the electric charge to be lost.

Coulomb displayed his ingenuity in circumventing this difficulty by devising an


experiment in which he related the period of a pendulum to the spatial dependence of
the force of electric attraction. He suspended a horizontal needlelike insulator from a
thin silk thread (Figure 3.5) and attached to it a tinsel disc at the end l and a counterbalance at g. Nearby was placed a globe G, and in Coulomb's words
7 C. A. de Coulomb, "Seconde Mémoire sur l'Électricité et Magnétisme," Histoire de l'Académie
Royale des Sciences, 579; 1785.


. . . we adjust the globe G in such a way that its horizontal diameter Gr is opposite the
center of the tinsel disc l, which is some inches away from it. We give an electric spark to
the globe from a Leyden jar [condenser]; we then ground the disc l with a conductor, and
the action of the electrified globe on the electric fluid of the unelectrified tinsel disc gives to
the disc a charge of the opposite type from that of the globe; so that when the ground is
removed, the globe and disc act on each other by attraction.

FIGURE 3.5 Coulomb's apparatus for unlike charges.

Designating by d the distance from the needle's center c to the center of the globe G,
Coulomb varied d and, after setting the needle into oscillation, recorded the time it
took for the needle to perform a specified number of oscillations. He listed three trials,
the recorded data for which are reproduced in Table 3.2.
TABLE 3.2
COULOMB'S EXPERIMENTAL DATA FOR THE LAW OF ATTRACTION

d, in.     No. of oscillations     Elapsed time, sec

   9               15                     20
  18               15                     41
  24               15                     60

FIGURE 3.6 Composition of forces in Coulomb experiment.

To analyze the data, it is necessary to determine the relationship between oscillation
time and disposition of the parts of the apparatus. Figure 3.6 shows the needle in an
arbitrary angular position and indicates the attractive electric force $F_e$ resolved into
tangential and radial components. If r is the distance from the tinsel disc l to the
center of the needle c, then

$$I \frac{d^2\theta}{dt^2} = -F_e \, r \cos\varphi$$

in which I is the moment of inertia of the needle about the axis containing its suspending thread. Under the assumption that the law of attraction is the same as the
law of repulsion, $F_e = K(d')^{-2}$, in which K is a constant and d' is the distance from l
to the center of the globe G. Then

$$\frac{d^2\theta}{dt^2} = -\frac{K r \cos\varphi}{I (d')^2}$$

Since d' was considerably greater than r in Coulomb's experiment, d' can be replaced
by d in the above equation. Further, since $\varphi = 90^\circ - \theta - \alpha$, if $90^\circ - \theta$ is very much
greater than $\alpha$ (small oscillations) then $\varphi \cong 90^\circ - \theta$, and $\cos\varphi \cong \sin\theta \cong \theta$. Making
these substitutions gives

$$\frac{d^2\theta}{dt^2} \cong -\frac{K r}{I d^2} \, \theta$$

which is the equation for simple harmonic motion. The period is therefore

$$T \cong 2\pi d \sqrt{\frac{I}{K r}}$$

Thus for small oscillations, if the law of attraction is the inverse square, the period
should be proportional to the separation distance d. This analysis would predict periods
in the ratio 20 : 40 : 54 whereas Table 3.2 indicates that the measured periods were in
the ratio 20 : 41 : 60.
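The predicted ratio can be reproduced directly from Table 3.2. The following sketch simply scales the first trial by the proportionality of period to d; it applies no correction for charge leakage, and the variable names are introduced here for illustration.

```python
# Table 3.2: distance d (in.) and elapsed time (sec) for 15 oscillations.
distances = [9.0, 18.0, 24.0]
measured = [20.0, 41.0, 60.0]

# With the period proportional to d, scale every prediction from the first trial.
predicted = [measured[0] * d / distances[0] for d in distances]

for d, t_meas, t_pred in zip(distances, measured, predicted):
    excess = (t_meas - t_pred) / t_pred
    print(f"d={d:4.0f} in.  measured={t_meas:4.0f} s  "
          f"predicted={t_pred:5.1f} s  excess={excess:+.1%}")
```

Before any correction, the second trial exceeds the inverse square prediction by 2.5 percent and the third by 12.5 percent; both measured times are longer than predicted, which is the direction expected if charge is being lost as the trials proceed.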
Coulomb made some measurements of the rate of dissipation of electric charge and
then corrected his data (the entire experiment took 4 minutes and he found that 1/40th
of the charge was dissipated per minute). After correction, the lack of agreement was
negligible for the second trial and only 5 percent for the third trial, which led him to
conclude:
We have thus come, by a method absolutely different from the first, to a similar result;
we may therefore conclude that the mutual attraction of the electric fluid which is called
positive on the electric fluid which is customarily called negative is in the inverse ratio of the square of the
distances; just as we have found in the first memoir, that the mutual action of the electric
fluid of the same type is in the inverse ratio of the square of the distance.

Coulomb also investigated the manner in which the amount of electric charge affected
the electric force. To do this, he replaced the stationary pith ball t (Figure 3.4) by a
small iron circle and proceeded in the following manner: 8
He electrified these two bodies [the pith ball a and the small iron circle] simultaneously by
means of the head of a pin, and the repulsive force separated the needle from the iron circle;
when it was brought back and placed at a distance of 30 degrees the index pointed to 110
degrees; the repulsive force therefore was [proportional to] 140 degrees. He then touched the
little iron circle with another of the same substance and same diameter; the needle immediately approached the circle, and to bring it back to the distance of 30 degrees, it was
found necessary to untwist the wire till the index stood at 40 degrees; therefore the repulsive
force was reduced to 40 + 30 or 70 degrees, the half of 140 degrees, the measure of its
former intensity.

Arguing that when the charged iron circle was touched by a similar uncharged circle,
its charge was reduced to half, Coulomb then concluded that the electric force is linearly
proportional to the charge on each body.
These discoveries by Coulomb formed the first quantitative basis for a mathematical
statement of the law of electric force. Although his method lacked the degree of accuracy of the approach used by Cavendish, it was direct, it was quantitative, and it was
easy to comprehend. The scientific world readily accepted Coulomb's results, the first
of a substantive nature to be published and widely distributed.
This acceptance was greatly furthered by the theoretical contributions of Simeon
Denis Poisson (1781-1840) who, in two brilliant memoirs 9 presented in the years 1812
and 1813, lifted electrostatic theory to a mature state of development. He accomplished
this by accepting Coulomb's inverse square law as a fundamental postulate and making
rich use of the analogy to gravitational theory, a subject already highly advanced at
that time.
In an article in the Mémoires de Berlin in 1777, Lagrange had shown that if a function
ψ(x,y,z) be formed by adding together the masses of all the particles of an attracting
system, each divided by its distance from (x,y,z), then the derivatives of this function
were equal to the components of the attractive force at (x,y,z). Laplace later demonstrated 10 that this function ψ satisfies the equation

$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} = 0$$

at all points not occupied by masses.
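Laplace's result is easy to test numerically. The sketch below builds Lagrange's function for a few hypothetical point masses (the masses, their positions, and the function names are invented for illustration) and estimates the sum of the three second derivatives by central differences at a point where no mass is present.

```python
import math

# Hypothetical point masses: (mass, x, y, z), in arbitrary units.
MASSES = [(1.0, 0.0, 0.0, 0.0), (2.5, 1.0, -0.5, 0.3), (0.7, -0.8, 0.4, 1.1)]

def psi(x, y, z):
    """Lagrange's function: each mass divided by its distance from (x, y, z)."""
    return sum(m / math.sqrt((x - a) ** 2 + (y - b) ** 2 + (z - c) ** 2)
               for m, a, b, c in MASSES)

def laplacian(f, x, y, z, h=1e-3):
    """Second-order central-difference estimate of f_xx + f_yy + f_zz."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6.0 * f(x, y, z)) / h ** 2

# Evaluate at a point well away from every mass.
value = laplacian(psi, 2.0, 1.5, -1.0)
print(f"psi = {psi(2.0, 1.5, -1.0):.4f}, estimated Laplacian = {value:.2e}")
```

The estimate is zero to within the discretization and rounding error of the finite-difference formula, while ψ itself is of order unity at the same point.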
8 J. Farrar, Elements of Electricity, Magnetism, and Electromagnetism, Hilliard and Metcalf, Boston,
1826. (Notes selected from Biot's Précis Élémentaire de Physique, compiled for the use of students of
the University at Cambridge, New England.)
9 S. D. Poisson, "On the Distribution of Electricity at the Surface of Conducting Bodies." First
Memoir read to the French Academy on May 19 and August 3, 1812. Printed in Mém. de l'Institut,
part 1, 1-92; 1811. Second Memoir read on September 6, 1813. Printed in Mém. de l'Institut, part 2,
164-274; 1811.
10 P. S. Laplace, "Theory of Attractions of Spheroids," Mém. de l'Académie Royale, 113-196; 1782
(published in 1785).


In laying the groundwork for a similar formulation involving electric charge, Poisson
opened his First Memoir by remarking
The theory of electricity which is most generally admitted is that which attributes all the
phenomena to two different fluids, distributed within all bodies of nature. It is supposed
that the molecules of one fluid repel each other and that they attract the molecules of the
other; these forces of attraction and repulsion obey the inverse square law of distance; at
the same distance the attractive power is equal to the repulsive power; from which it follows
that when all the parts of a body contain equal amounts of the two fluids, the latter do not
exercise any influence on the fluids contained in neighboring bodies, and as a consequence
no signs of electricity are manifest. This equal and uniform distribution of the two fluids is
called the natural state; when this state is disturbed for any reason, the body becomes electrified, and the various phenomena of electricity begin to take place.
All the bodies of nature do not behave the same way with respect to the electric fluid:
some, such as the metals, do not appear to exert any influence on it, but permit it to move
about freely in their interior: for this reason they are called conductors. Others, on the contrary, very dry air for example, oppose the passage of the electric fluid in their interior; in
this way they serve to prevent dissipation throughout space of the fluid accumulated in
conducting bodies. The phenomena associated with electrified conductors, whether these
conductors be considered singly, or whether they be considered in conjunction and exerting a mutual influence, are the objectives of this Memoir, in which I propose to apply
the calculus to this important branch of physics. But before entering into these matters,
I wish to state in some detail the principles which serve as the basis for my analysis, and
to make known the most remarkable results to which they have led me.

Poisson's central principle, of course, was the assumption of Coulomb's inverse square
law, on the basis of which he introduced a function† Φ(x,y,z), composed of the sum of
the charges of an electrical system, each divided by its distance from (x,y,z). He then
argued, as had Lagrange in the case of gravitational attraction, that the derivatives

$$\frac{\partial\Phi}{\partial x}, \qquad \frac{\partial\Phi}{\partial y}, \qquad \frac{\partial\Phi}{\partial z}$$

would yield the components of electric force‡ at (x,y,z).
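A short numerical sketch may make the connection concrete. It assumes a few hypothetical point charges, sets the constant of proportionality in the force law to 1, and adopts the sign convention that the force on a unit positive test charge is the negative gradient of Φ; the text does not state Poisson's own sign convention, so that choice, like the function names, is an assumption made here for illustration.

```python
import math

# Hypothetical point charges (q, x, y, z) in arbitrary units; the constant of
# proportionality in the force law is set to 1.
CHARGES = [(1.0, 0.0, 0.0, 0.0), (-2.0, 1.0, 0.5, 0.0), (0.5, -0.4, 0.2, 0.9)]

def potential(x, y, z):
    """Sum of the charges, each divided by its distance from (x, y, z)."""
    return sum(q / math.dist((x, y, z), (a, b, c)) for q, a, b, c in CHARGES)

def force_from_potential(x, y, z, h=1e-5):
    """Negative gradient of the potential, estimated by central differences."""
    return (
        -(potential(x + h, y, z) - potential(x - h, y, z)) / (2 * h),
        -(potential(x, y + h, z) - potential(x, y - h, z)) / (2 * h),
        -(potential(x, y, z + h) - potential(x, y, z - h)) / (2 * h),
    )

def force_direct(x, y, z):
    """Inverse square forces on a unit test charge, summed component by component."""
    fx = fy = fz = 0.0
    for q, a, b, c in CHARGES:
        r = math.dist((x, y, z), (a, b, c))
        fx += q * (x - a) / r ** 3
        fy += q * (y - b) / r ** 3
        fz += q * (z - c) / r ** 3
    return fx, fy, fz

point = (2.0, -1.0, 0.5)
from_phi = force_from_potential(*point)
direct = force_direct(*point)
print("from potential:", from_phi)
print("direct sum:    ", direct)
```

The two results agree to within the accuracy of the finite-difference step, which is the practical content of the statement that a single scalar function carries all three components of the force.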


Turning his attention to conducting bodies, Poisson assumed that an excess of one
electric fluid had been placed on a conductor, and reasoned:
By virtue of the repulsive force between these [excess] particles, and because the metal does
not oppose their movement, one can imagine that the added fluid is transported to the
surface of the body, where it will remain because of the air environment. Coulomb has
proved in effect, by direct experiments, that no atoms of electricity reside in the interior of
an electrified conductor except the natural electricity of the body: all the added fluid distributes itself over the surface . . . it exerts neither attraction nor repulsion at any interior
point of the body; for if this condition were not satisfied, the action of the surface layer of
electricity on interior points would decompose a new quantity of the natural electricity of
the body, and its electric state would be changed.

† Fifteen years later, in generalizing Poisson's work on electric and magnetic phenomena, George
Green (1793-1841) gave to this function the name potential, and this appellation has been universally
adopted ever since.
‡ Poisson's original notation has been altered to be consistent with the remainder of this chapter.


As a consequence of this argument, Poisson adopted the principle that he could find
the manner in which the excess charge distributed itself over the outer surface of a
conductor, by imposing the condition that this distribution must lead to no net electric force at any interior point of the conductor. In terms of the potential function Φ,
this meant that if P were a point in the interior of an electrified conductor, then
. . . the value of Φ is independent of the coordinates of the point P; because then the
partial derivatives of this function being null, the force at the interior point P will be also.

Thus was the concept formulated that a conducting body in electrostatic equilibrium
is an equipotential.
Poisson next turned his attention to conditions at the surface of an electrified conductor and argued, following a suggestion by Laplace, that the electric force at a point
immediately outside the conductor is proportional to the local concentration of surface
charge density. He did this by dividing the force into a part f due to the element of
charged surface immediately adjacent to the point, and a part F due to the rest of the
surface. At a neighboring point just inside the conductor, F will be unchanged but f will
have to be reversed to give a null force. Therefore the resultant force at the exterior
point must be 2f. But if the exterior point is extremely close to the surface, the immediately adjacent surface element looks like an infinite plane, uniformly charged, for
which case Poisson showed the force f to be proportional to the charge per unit area of
the surface, thus completing the theorem.
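The bookkeeping in this argument can be written out in a few lines. The following merely restates the step in symbols, treating f and F as components normal to the surface; the labels $E_{\text{in}}$ and $E_{\text{out}}$ and the symbol $\sigma$ for the charge per unit area are introduced here for illustration, and the relation $f \propto \sigma$ is the result the text attributes to Poisson.

```latex
\begin{align*}
E_{\text{in}}  &= F - f = 0 \quad\Longrightarrow\quad F = f, \\
E_{\text{out}} &= F + f = 2f, \\
f \propto \sigma &\quad\Longrightarrow\quad E_{\text{out}} \propto \sigma .
\end{align*}
```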
Using the principle that a charged conductor must be an equipotential, Poisson
deduced the surface distribution for several simple shapes, including an ellipsoid, and
then enlarged his analysis to the study of two charged spheres placed at any distance
from each other. This was a classic and difficult problem to which he devoted over
three-quarters of the space occupied by these two lengthy memoirs. The solution
involves single or double gamma functions, depending on whether or not the two
spheres are in contact. Poisson laboriously computed the values of his integrals for a
variety of conditions and exhibited very satisfactory agreement with the earlier
experimental results of Coulomb.
The year 1813 recorded another significant contribution by Poisson when, in a brief
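note 11 he extended Laplace's equation to include points occupied by matter, obtaining

$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} = -4\pi\rho$$

in which ρ is the volume density of mass. The same connection exists, of course,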
between electric potential and charge density. Poisson's proof of the validity of this
important differential equation, which bears his name, has a simple elegance which will
fully reward a decision to consult the original paper. An alternative derivation will be
offered in section 3.9.
The admiration invoked by recounting these achievements of Poisson perhaps has
been expressed best by Whittaker: 12
11 S. D. Poisson, "Remarks on an Equation Which Occurs in the Theory of Attractions of Spheroids,"
Bull. de la Soc. Philomathique, 3, 388-392; 1813.
12 E. Whittaker, A History of the Theories of Aether and Electricity, vol. 1, p. 62, Thomas Nelson and
Sons, Ltd., 1951.


The rapidity with which Poisson passed from the barest elements of the subject to
such recondite problems as those just mentioned may well excite admiration. His success is,
no doubt, partly explained by the high state of development to which analysis had been
advanced by the great mathematicians of the eighteenth century; but even after allowance
has been made for what is due to his predecessors, Poisson's investigation must be accounted
a splendid memorial of his genius.

Poisson's differential equation, linking spatial derivatives of the electrostatic potential to charge distribution, found its integral counterpart through a discovery by Karl
Friedrich Gauss (1777-1855). In 1813 Gauss established 13 the famous divergence
theorem

$$\oint_S \mathbf{D} \cdot d\mathbf{S} = \int_V \nabla \cdot \mathbf{D} \, dV$$

connecting a volume integral throughout V to a surface integral over S, with S being
the closed surface bounding the volume V, and D being any vector function possessing
continuous first derivatives in a region containing V. If D is a radial field which varies
inversely with the square of the distance from some point O, that is, if $\mathbf{D} = \mathbf{1}_r / r^2$, then the surface integral of Gauss' divergence theorem yields the simple result

$$\oint_S \mathbf{D} \cdot d\mathbf{S} = 4\pi$$

if O is inside V; otherwise the result is zero. This special result is known as Gauss'
integral. When D is properly related to Coulomb's inverse square law, $\oint_S \mathbf{D} \cdot d\mathbf{S}$ equals
the net charge enclosed by S. This result, coupled with the divergence theorem, yields
an integral form of Poisson's equation. These deductions will be elaborated in the sections to follow.
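Gauss' integral can be verified by brute-force quadrature. The sketch below places the source point O at the origin, takes D to be the unit radial vector divided by the square of the distance from O, and integrates the outward flux over a unit sphere whose center is displaced from O. The choice of a sphere for S, the midpoint rule, and the function name are all assumptions made for illustration; any closed surface would serve.

```python
import math

def flux_through_sphere(center, radius, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the outward flux of D = r_hat / r**2
    (source at the origin) through a sphere with the given center and radius."""
    cx, cy, cz = center
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Outward unit normal of the sphere at this surface point.
            nx, ny, nz = sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t
            # Position of the surface point relative to the source at the origin.
            x, y, z = cx + radius * nx, cy + radius * ny, cz + radius * nz
            r_cubed = (x * x + y * y + z * z) ** 1.5
            # D . n = (r_vec . n) / r**3
            d_dot_n = (x * nx + y * ny + z * nz) / r_cubed
            # Surface element of the sphere: radius**2 * sin(theta) dtheta dphi.
            total += d_dot_n * radius ** 2 * sin_t * d_theta * d_phi
    return total

inside = flux_through_sphere((0.3, 0.0, 0.0), 1.0)   # O lies inside V
outside = flux_through_sphere((3.0, 0.0, 0.0), 1.0)  # O lies outside V
print(f"O inside:  {inside:.4f}  (4*pi = {4 * math.pi:.4f})")
print(f"O outside: {outside:.4f}")
```

The first value approaches 4π even though the sphere is not centered on O, and the second approaches zero, in line with the two cases stated above.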
Another great advance in electrostatic theory, though it was not so recognized at
the time, was made by Michael Faraday (1791-1867). His keen sense of physical visualization led him to picture all force functions in terms of flux lines. This technique
first suggested itself to Faraday because of the common custom of illustrating magnetic power by strewing iron filings on a sheet of paper and noticing the curves along
which they arranged themselves when a magnet was placed underneath the paper.
From this Faraday evolved the idea of lines of magnetic force, whose direction at every
point coincided with the direction of the magnetic intensity.
It was a simple extension to apply this concept of flux lines to gravitational effects
and to electric intensity. About the latter, Faraday said 14
The lines of force of the static condition of electricity are present in all cases of induction.
They terminate at the surfaces of the conductors under induction, or at the particles of nonconductors, which, being electrified, are in that condition.

This conception permitted Faraday to replace action at a distance with a local interaction of charge and a field of force, a viewpoint which had great appeal for Maxwell
13 K. F. Gauss, "Theoria Attractionis Corporum Sphaeroidicorum Ellipticorum Homogeneorum,"
reprinted in his Werke, vol. 5, pp. 1-22, published by the Royal Society of Science, Göttingen, 1870.
14 M. Faraday, Experimental Researches in Electricity, vol. 3, art. 3249, published by Bernard Quaritch,
London, 1855.


(1831-1879). In the Preface to the first edition of his celebrated Treatise on Electricity
and Magnetism, Maxwell wrote
. . . before I began the study of electricity I resolved to read no mathematics on the subject
till I had first read through Faraday's Experimental Researches in Electricity. I was aware
that there was supposed to be a difference between Faraday's way of conceiving phenomena
and that of the mathematicians, so that neither he nor they were satisfied with each other's
language. I had also the conviction that this discrepancy did not arise from either party
being wrong.

Maxwell found, as he proceeded with the study of Experimental Researches, that it was
possible to couch Faraday's ideas in mathematical terms and thus compare them with
the formulations preferred by mathematicians. As part of the contrast, he noted
For instance, Faraday, in his mind's eye, saw lines of force traversing all space where the
mathematicians saw centres of force attracting at a distance: Faraday saw a medium where
they saw nothing but distance: Faraday sought the seat of the phenomena in real actions
going on in the medium, they were satisfied that they had found it in a power of action at a
distance impressed on the electric fluids.

Maxwell's skillful mathematical exposition of Faraday's ideas led him to conclude that
the results of the two methods coincided, but that Faraday's viewpoint was much
richer. Thus he adopted it and furthered it with many ideas of his own. It was Maxwell
who introduced the concept of the electric flux density function D (he called it the
displacement), a concept which becomes especially meaningful in the study of dielectrics. Using Green's Theorem, he obtained an expression for the energy stored in an
electrostatic system in the form

$$W = \frac{\epsilon_0}{2} \int_V E^2(x,y,z) \, dx \, dy \, dz$$

which highlights the interpretation of the electric field E as the seat of the phenomena.
Maxwell also solved a variety of boundary-value problems, obtaining both the potential and electric field, and displaying these for the first time as precise field maps, to
illustrate Faraday's idea of lines of force. The plates appended to both volumes of his
Treatise include some of the most beautiful flux maps which have ever been prepared.
The field approach of Faraday and Maxwell, with strong emphasis on the local interaction of a charge and a field, will be found to have permeated the remainder of this
text.
Maxwell also contributed to the establishment of the inverse square law. His interest
in this problem had been aroused by his reading of the unpublished manuscripts of
Cavendish. These manuscripts had been brought to the attention of Lord Kelvin after
Cavendish's death. Recognizing their importance and desiring that they be published,
Kelvin urged the Duke of Devonshire, to whom the manuscripts belonged, to entrust
them to Maxwell. This he did in 1874.
The Cavendish experiment which particularly caught Maxwell's admiration was the
one concerned with the determination of the law of electric force, and he resolved to
repeat it. Accordingly, with Sir Donald McAlister, he devised an apparatus which
improved in several particulars on Cavendish's original design. The principal innovations were the use as detector of a more sensitive quadrant electrometer and the adoption of a technique which did not require the dismantling of the outer spherical shell.
Maxwell provided a thorough analysis of the accuracy of this method which, coupled
with McAlister's data, led him to conclude that the force law was bounded by $r^{-(2+\delta)}$
in which $|\delta| < 1/21{,}600$.† The McAlister experiment and Maxwell's analysis will be considered in Section 3.20.
By far the most sensitive investigation of the electric force law which has ever been
undertaken was accomplished by S. J. Plimpton and W. E. Lawton at the Worcester
Polytechnic Institute in 1936. Together they skillfully applied all the advantages of
modern technology and electronic instrumentation to a repetition of the Cavendish
experiment and were able to show that the distance dependency in the electric force
law deviated from the inverse square by less than two parts in one billion. This remarkable achievement stands as the most compelling reason for basing an electrical theory
on the inverse square law. The Plimpton-Lawton experiment and an analysis of the
accuracy of their results will be considered in Section 3.21.

3.2 MATHEMATICAL FORMULATION OF THE INVERSE SQUARE LAW


The preceding section of this chapter has shown how Coulomb's experiments led to the
formulation of the law of force for electric charges; namely

$$F \propto \frac{q\,q'}{r^{2}} \qquad (3.1)$$

The increasing accuracy of the experiments performed by Cavendish, by Maxwell and
McAlister, and by Plimpton and Lawton has raised the confidence level in this law
almost to the point of certainty. Yet, it seems appropriate to point out certain limitations in all these experiments and to circumscribe the limits of validity of this force law.
First, none of these experiments was undertaken with the charges extremely close
to each other, nor excessively far apart. Thus, the question can be raised as to the
limits of r within which the law is valid. As yet, there is little direct evidence at very
large distances. However, if one accepts the premise that an entire electromagnetic
theory can be based on Coulomb's inverse square law, then the indirect evidence supports the belief that this law operates at astronomical distances. Concerning short
distances of separation, Rutherford's experiments, in which he bombarded atomic
nuclei with α particles, have shown that the law holds at distances as small as $10^{-14}$ m.
It may be valid at closer distances, but nuclear forces then come into play and partially
mask the effect.
Second, note that the law as stated presumes point charges and it is clear that this
is an approximation which can be good only when the extent of each charge is small
compared to r. For example, Coulomb's experiments involved, not point charges, but
rather charge distributions on balls of finite size. Induction effects caused these distributions to be nonuniform. (In the case of repulsion, the remote sides of the two pith balls
attained a heavier charge density.) Thus it became somewhat uncertain what to use
for the true spacing r.

† Maxwell was apparently being conservative in using as the bound one part in 21,600, for in the
Introduction to The Scientific Papers of the Honourable Henry Cavendish he states "We can now use
Thomson's Quadrant electrometer, and thereby detect a deviation from the law of the inverse square
not exceeding one in 72,000."


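In the case of the Cavendish method, the nonconcentration of charge at a single point is even more evident, and another assumption entered heavily into the experiments. It was assumed that the law of superposition of forces holds for electric charges
so that, if q' were replaced by a distribution of charges, one could write for the force on q

$$\mathbf{F} \propto q \sum_{n=1}^{N} q_n \frac{\boldsymbol{\mathfrak{x}}_n}{\mathfrak{x}_n^{3}} = q \sum_{n=1}^{N} \frac{q_n}{\mathfrak{x}_n^{2}}\,\mathbf{1}_{\mathfrak{x}_n} \qquad (3.2)$$

in which $\boldsymbol{\mathfrak{x}}_n$ is drawn from the charge $q_n$ to the charge q, and $\mathbf{1}_{\mathfrak{x}_n} = \boldsymbol{\mathfrak{x}}_n/\mathfrak{x}_n$ is a unit vector
in the same direction as $\boldsymbol{\mathfrak{x}}_n$.†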
Quite obviously, none of the Cavendish-type experiments proved the validity of the
assumption of superposition; nevertheless, superposition is important for many applications of the theory. The validity of this assumption is accepted by virtue of the fact that
results predicated on it are consistent with experiment, but it should be recognized that
the principle of superposition for the forces among electric charges was not directly
demonstrated.
Third, the force law for electric charges, (3.1), includes the implication that the line
of action of the force coincides with the straight line connecting the two charges.
Coulomb's experiments did not reveal any transverse component of force, but they
were hardly sensitive enough to be definitive on this point. The Cavendish approach
requires the assumption that this is so, in that symmetry is used to cancel out certain
components of force in computing the net force on a charge between the two spheres.
Here again, the assumption that the force acts along the line joining the charges gains
its strength not from the original experiments, but rather from the accuracy of predictions based on making this assumption.
Fourth, the law (3.1) states that the force varies directly as the algebraic product of
the two quantities of charge. Coulomb was able to show that like charges repel and
unlike charges attract according to the same function of distance, and also showed that
halving one charge reduced the force by a factor of two. Cavendish demonstrated that
doubling each charge quadrupled the force. But neither showed the general validity
of using the product qq', and this is accepted by inference.
Fifth, the law (3.1) states nothing explicitly about the medium in which the charges
q and q' are immersed. The approach to be adopted here will be that only a vacuum is
particle-free, and that any other medium can be viewed at the atomic level as an
assemblage of particles, some of which may be electrified. In the cases of such media,
the generalized form (3.2) can then be used to find the force on the charge q, with some
of the charges qn belonging to the particles which constitute the medium.
Finally, the law (3.1) also states nothing explicitly about whether or not the distance
between q and q' is time-dependent. The experiments were all essentially static. However, to develop a useful theory, one must be able to let charges move. It will be seen
in retrospect that a satisfactory theory can be based on Equation (3.2) if one assumes
that it is valid for the case that $q_1 \ldots q_N$ are all fixed in position, but that motion of q
is permitted.

† The distance symbol $\mathfrak{x}$ is actually a German lower-case x, but it has the semantic advantage of
looking like an r. It will be used throughout this text to designate the distance from a source point (the
position of $q_n$) to a field point (the position of q). The symbol r will be reserved for distances measured from the origin. Other authors have achieved this distinction by using r' for the distance between
source point and field point. Unfortunately that notation is not convenient here, since many of the
developments in Chapters 4 and 5 will involve two coordinate systems XYZ and X'Y'Z'. r and r'
(or $\mathfrak{x}$ and $\mathfrak{x}'$) will then mean the distance between the same two points as measured in the two different
coordinate systems. The reader may find it convenient to call the symbol $\mathfrak{x}$ by the name r-sub-c or
r-cedilla.
There is ample experimental evidence to support this assumption. Many modern
electronic instruments and particle accelerators employ static distributions of charges
to create steady electric fields and thus accelerate charged particles which move
through these fields. When a theoretical trajectory for these accelerated particles
is computed based on (3.2), excellent agreement with the experimental trajectory is
obtained. This has been found to be true even when the trajectory speed is high enough
to require inclusion of relativistic effects.
It has already been seen in Chapter 2 that mass is a function of velocity and the question can be raised at this point whether charge is not also a function of velocity. It will
be seen in Chapter 4 that it is convenient to define charge to be an invariant. However,
a general answer to this question can be deferred. For the present it will be assumed
merely that the inverse square law is valid whether or not q moves, with the symbol
q which occurs in the formulas always referring to the value of charge when q is at rest.
With all the foregoing limitations and implicit assumptions in mind, the experiments
described earlier in this chapter will be taken as justification for acceptance of the
inverse square law as a fundamental postulate. Since this will be the only purely electrical postulate needed to develop a complete theory of electromagnetics, space will now
be taken to recapitulate and construct a concise mathematical formulation of this law:
As suggested by Figure 3.7, let there be a static assemblage of N charged particles,
containing respectively charges $q_1, q_2, \ldots, q_N$, arbitrarily arranged in otherwise
empty space. The quantities $q_n$ are real numbers which can be either positive or negative. The positions of these charged particles can be described in a coordinate system
XYZ so that the nth particle is identified by the coordinates $(x_n, y_n, z_n)$ or by the position vector $\mathbf{r}_n = \mathbf{1}_x x_n + \mathbf{1}_y y_n + \mathbf{1}_z z_n$. These coordinates are not functions of time.

FIGURE 3.7 Notation for Coulomb's law.
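Additionally, let a particle containing a charge q be instantaneously at the point
(x,y,z) described by the position vector $\mathbf{r} = \mathbf{1}_x x + \mathbf{1}_y y + \mathbf{1}_z z$. This charge will be permitted to move, so that the coordinates x, y, and z can be general functions of time. The
total electrical force exerted on q is then

$$\mathbf{F} = \sum_{n=1}^{N} \mathbf{F}_n \qquad (3.3)$$

in which

$$\mathbf{F}_n = \frac{1}{4\pi\epsilon_0}\,\frac{q\,q_n}{\mathfrak{x}_n^{2}}\,\mathbf{1}_{\mathfrak{x}_n} = \frac{q\,q_n}{4\pi\epsilon_0}\,\frac{\mathbf{r} - \mathbf{r}_n}{|\mathbf{r} - \mathbf{r}_n|^{3}} \qquad (3.4)$$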

The proportionality constant $(4\pi\epsilon_0)^{-1}$ serves to convert (3.1) to an equality. Inclusion
of the factor 4π is known as rationalization and is done so that a factor of 4π will not
appear in the more often used Maxwell's equations to be derived subsequently from
(3.3). The factor $\epsilon_0$ can be looked upon as a units-adjusting parameter. It is called the
permittivity of free space, and in the MKS system of units to be used in this text has
the measured value of $8.854 \times 10^{-12}$ farads/m. This choice for $\epsilon_0$ permits charge to be
measured in coulombs when distance is measured in meters and force in newtons.
The dimensions of $\epsilon_0$ will become clear later in this chapter when the concept of
capacitance is introduced. The newton is a unit of force equal to $10^5$ dynes. Thus 1 lb
(force) equals 4.4482 newtons, so that 1 newton can be remembered conveniently as
being slightly less than a quarter of a pound. One coulomb, the unit of charge, is defined
as the quantity of electricity passing a cross section when a current of 1 amp is flowing.
(The primary definition of an ampere will be discussed in Chapter 4 after Ampere's law
has been derived.) One coulomb is also the quantity of electricity required to deposit
0.001118 g of silver from an aqueous solution of silver nitrate.
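Combining (3.3) and (3.4), one can write for the total electrical force on q,

$$\mathbf{F} = \frac{q}{4\pi\epsilon_0} \sum_{n=1}^{N} q_n \frac{\boldsymbol{\mathfrak{x}}_n}{\mathfrak{x}_n^{3}} \qquad (3.5)$$

in which

$$\boldsymbol{\mathfrak{x}}_n = \mathbf{r} - \mathbf{r}_n = \mathbf{1}_x (x - x_n) + \mathbf{1}_y (y - y_n) + \mathbf{1}_z (z - z_n) \qquad (3.6)$$

is the vector drawn from $q_n$ to q.†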
Equation (3.5) is a mathematical statement of the inverse square law, and will hereafter be referred to as Coulomb's law. This equation will prove to be the cornerstone
of all the theory which is to follow, and thus can equally well be called the fundamental
postulate of electricity.
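The superposition sum in Coulomb's law lends itself to a direct numerical evaluation. The following is a minimal sketch, not part of the original text: the function name `coulomb_force`, the tuple layout of the source list, and the sample charges are all assumptions chosen for illustration.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space in farads/m, the value quoted in the text


def coulomb_force(q, r, sources):
    """Force (newtons) on a charge q at field point r, summed over discrete sources.

    sources is a list of (q_n, r_n) pairs: charge in coulombs, position in meters.
    Hypothetical helper for illustration only.
    """
    fx = fy = fz = 0.0
    for qn, rn in sources:
        # Vector drawn from the source point to the field point
        dx, dy, dz = r[0] - rn[0], r[1] - rn[1], r[2] - rn[2]
        dist = math.sqrt(dx * dx + dy * dy + dz * dz)
        k = q * qn / (4.0 * math.pi * EPS0 * dist ** 3)
        fx += k * dx
        fy += k * dy
        fz += k * dz
    return (fx, fy, fz)


# Two equal charges of 21 micro-coulombs placed 1 m apart along the X axis
force = coulomb_force(21e-6, (1.0, 0.0, 0.0), [(21e-6, (0.0, 0.0, 0.0))])
print(force)
```

The sources are held fixed while the field point may be changed from call to call, which mirrors the assumption that only q is permitted to move.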
It should be noted that Equation (3.5) is being adopted as a postulate only for
discrete charges which are in otherwise empty space. Although on the face of it this appears impractical, it will be seen later in this chapter that many electrostatic systems of charges exist in the presence of electrically neutral and unpolarized material
bodies which can be treated as though they were empty space. In Chapter 6 material
bodies which cannot be so treated will be introduced and the theory will be enlarged to
take them into account.

† The force F may be implicitly a function of time if x, y, and z are functions of time.

3.3 THE ELECTRIC FIELD

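If a static assemblage of charges $q_n$ exists at the points $(x_n, y_n, z_n)$ and a small test
charge Δq is placed at the point (x,y,z), Equation (3.5) gives for the force on Δq

$$\Delta\mathbf{F} = \frac{\Delta q}{4\pi\epsilon_0} \sum_{n=1}^{N} q_n \frac{\boldsymbol{\mathfrak{x}}_n}{\mathfrak{x}_n^{3}} \qquad (3.7)$$

in which $\boldsymbol{\mathfrak{x}}_n$ is drawn from $q_n$ to Δq.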


When it is assumed that the charge Δq is small enough so that its presence or absence
does not affect the spatial distribution of the other charges, the vector

$$\mathbf{E} = \frac{\Delta\mathbf{F}}{\Delta q} = \frac{1}{4\pi\epsilon_0} \sum_{n=1}^{N} q_n \frac{\boldsymbol{\mathfrak{x}}_n}{\mathfrak{x}_n^{3}} \qquad (3.8)$$

is defined as the force per unit charge at (x,y,z). E can be expressed in the units of
newtons per coulomb and is variously called the electric force, the electric intensity, or
the electric field strength.
By implication, if a charge q of arbitrary size is placed at (x,y,z), it experiences a
force

$$\mathbf{F} = q\mathbf{E} \qquad (3.9)$$

However, one must be careful in using (3.9) to ascertain that the presence of q has not
disturbed the positions of the other charges. For example, if the assemblage of charges
$q_n$ is distributed over the surface of a conductor and the charge q is placed in the vicinity, the charges $q_n$, being free to move, will redistribute themselves to new positions of
equilibrium.
Equation (3.8) indicates that the electric force depends on the charges $q_n$ and their
positions relative to the point (x,y,z) but that it does not depend on Δq. An intensity
E(x,y,z) can thus be associated with the point (x,y,z) whether Δq is there or not. If the
vector function E is interpreted in this manner, it can be taken as a fundamental subject of investigation. This is the field viewpoint of Maxwell and Faraday, which differs
from the action-at-a-distance theories of their predecessors. In this view, the source
charges qn set up an electric field at the point P(x,y,z); the field in turn will exert a
force on any charge which might be introduced at P.
With this interpretation, E as defined by (3.8) is an electrostatic field, since the source
points $(x_n, y_n, z_n)$ are static and the field point (x,y,z) has coordinates which are not
connected to the possible motion of any particle. This functional dependence is usually
indicated by writing

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0} \sum_{n=1}^{N} q_n \frac{\boldsymbol{\mathfrak{x}}_n}{\mathfrak{x}_n^{3}} \qquad (3.10)$$

The dependence of E on the sources and their positions is not usually explicitly indicated in the left side of (3.10), but is nevertheless understood implicitly.
In many problems it will be appropriate to consider the total charge ρ dV in a volume
element dV in lieu of the discrete charge $q_n$. In such cases (3.10) can be written

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0} \int_V \rho(\xi,\eta,\zeta)\,\frac{\boldsymbol{\mathfrak{x}}}{\mathfrak{x}^{3}}\,d\xi\,d\eta\,d\zeta \qquad (3.11)$$

in which ρ(ξ,η,ζ) is the volume charge density function, expressed in coul/m³, and

$$\boldsymbol{\mathfrak{x}} = \mathbf{1}_x (x - \xi) + \mathbf{1}_y (y - \eta) + \mathbf{1}_z (z - \zeta) \qquad (3.12)$$

is the positional vector drawn from the volume element centered at (ξ,η,ζ) to the field
point (x,y,z). The volume region V is sufficient to encompass all the sources ρ dV.
Similarly, there will be occasions when it is useful to consider the total charge σ dS
on a surface element dS, or the total charge λ dℓ on a line element dℓ, in place of the
discrete charge $q_n$. For example, in the case of surface distributions, (3.10) becomes

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0} \int_S \sigma(\xi,\eta,\zeta)\,\frac{\boldsymbol{\mathfrak{x}}}{\mathfrak{x}^{3}}\,dS \qquad (3.13)$$

in which σ is the surface charge density function, given in coul/m², and $\boldsymbol{\mathfrak{x}}$ is drawn
from the surface element dS, centered at (ξ,η,ζ), to the field point (x,y,z).

EXAMPLE 3.1

To gain some appreciation for the effects caused by 1 coul of electric charge, recall that the
charge on a single electron is $1.6 \times 10^{-19}$ coul. Therefore, it takes $6.25 \times 10^{18}$ electrons to
comprise 1 coul of charge. If these electrons were to be arranged in a cubical lattice 3 Å on
centers, the resulting cube would be approximately 500 microns on a side. An exterior
electron of this assemblage would be an average distance of perhaps 250 microns from the
remainder of the charge and would thus experience a repulsive force in the order of

$$F = eE \approx \frac{1.6 \times 10^{-19}}{4\pi (8.854 \times 10^{-12})(250 \times 10^{-6})^{2}} \approx 0.02\ \text{newton}$$

Although this does not seem to be a great force, when it is remembered that the mass of
an electron is only $9.1 \times 10^{-31}$ kg, the initial acceleration of a free electron experiencing this
force would be approximately $10^{27}$ g. In the absence of compensating forces, this assemblage
of electrons is highly unstable.
Suppose that on a macroscopic scale this entire coulomb of charge is essentially concentrated at a point. Then the electric field due to it is radial and given by

$$\mathbf{E}(r) = \mathbf{1}_r \frac{1}{4\pi\epsilon_0 r^{2}} \approx \mathbf{1}_r \frac{10^{10}}{r^{2}}$$

This field is so great that if another charge of 1 coul were placed 3 m distant, it would experience a force in excess of 100 kilotons. Further, in the presence of normal air, this field
intensity would cause breakdown out to a distance r = 100 m. In every respect 1 coul
represents an enormous amount of charge.


Alternatively, if one asks what amount of charge exerts normal forces at normal distances, a feeling for this can be gained by considering two equal charges 1 m apart which
exert a force of 4 newtons (approximately one pound) on each other. Solving Coulomb's
force equation gives q = 21 micro-coul.
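The order-of-magnitude figures quoted in this example can be reproduced with a few lines of arithmetic. The sketch below is only a check under stated assumptions: the variable names are illustrative, and standard gravity is taken as 9.8 m/s², a value not given in the text.

```python
import math

EPS0 = 8.854e-12     # farads/m, as quoted in the text
E_CHARGE = 1.6e-19   # coul, magnitude of the electron charge
E_MASS = 9.1e-31     # kg, electron mass
G = 9.8              # m/s^2, standard gravity (assumed for the comparison in units of g)

n_electrons = 1.0 / E_CHARGE                     # electrons making up 1 coul
cube_side = n_electrons ** (1.0 / 3.0) * 3e-10   # cubical lattice on 3-angstrom centers, meters

# Force on an exterior electron, treating the 1 coul as lying 250 microns away
force = E_CHARGE / (4.0 * math.pi * EPS0 * (250e-6) ** 2)
accel_in_g = force / E_MASS / G

# Force between two point charges of 1 coul each, 3 m apart
force_3m = 1.0 / (4.0 * math.pi * EPS0 * 3.0 ** 2)

# Equal charges which exert 4 newtons on each other at a spacing of 1 m
q_normal = math.sqrt(4.0 * 4.0 * math.pi * EPS0 * 1.0 ** 2)

print(f"electrons per coul       : {n_electrons:.3g}")
print(f"cube side (m)            : {cube_side:.3g}")
print(f"force on electron (N)    : {force:.3g}")
print(f"acceleration (g)         : {accel_in_g:.3g}")
print(f"force at 3 m (N)         : {force_3m:.3g}")
print(f"charge for 4 N at 1 m (C): {q_normal:.3g}")
```

The cube side comes out slightly above 500 microns and the force on the exterior electron near 0.02 newton, in line with the rounded values used in the example.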
EXAMPLE 3.2

Imagine that all of space is populated with electric charge, but in such a way that the
charge density varies in only one direction. Let X be this direction, so that the situation can
be pictured in terms of plane layers of charge, stacked one next to the other, all transverse
to the X axis. The volume charge density ρ(ξ) varies from one layer to the next, but is constant throughout any layer. Let it be desired to find the electric field distribution due to
this charge system.

[Figure: a plane layer of charge transverse to the X axis, with the field point P(x,y,z).]
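The layer of charge contained between the planes ξ and ξ + dξ is pictured, together with
the field point (x,y,z) at which the electric field is to be determined. By symmetry, the
charges in the four volume elements shown exert a net effect at (x,y,z) which is X directed.
On the basis of the contributions from all volume elements in this double-paired fashion,
Equation (3.11) gives

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0} \int_{-\infty}^{\infty} d\xi \int_{-\infty}^{\infty} d\eta \int_{-\infty}^{\infty} \frac{\rho(\xi)\,\mathbf{1}_x (x - \xi)}{[(x - \xi)^{2} + (y - \eta)^{2} + (z - \zeta)^{2}]^{3/2}}\,d\zeta$$

$$\mathbf{E}(x,y,z) = \frac{\mathbf{1}_x}{\pi\epsilon_0} \int_{-\infty}^{\infty} \rho(\xi')\,\xi'\,d\xi' \int_{0}^{\infty} d\eta' \int_{0}^{\infty} \frac{d\zeta'}{[(\xi')^{2} + (\eta')^{2} + (\zeta')^{2}]^{3/2}}$$

in which ξ' = x − ξ, η' = y − η, and ζ' = z − ζ. Integration gives

$$\mathbf{E}(x,y,z) = \frac{\mathbf{1}_x}{\pi\epsilon_0} \int_{-\infty}^{\infty} \rho(\xi')\,\xi'\,d\xi' \int_{0}^{\infty} \frac{d\eta'}{(\xi')^{2} + (\eta')^{2}}$$

$$\mathbf{E}(x,y,z) = \pm\frac{\mathbf{1}_x}{2\epsilon_0} \int_{-\infty}^{\infty} \rho(\xi')\,d\xi'$$

the plus or minus sign being taken according to whether ξ' is greater or less than
zero. Thus the electric field is independent of y and z and is given by

$$\mathbf{E}(x) = \frac{\mathbf{1}_x}{2\epsilon_0} \left[ \int_{-\infty}^{x} \rho(\xi)\,d\xi - \int_{x}^{\infty} \rho(\xi)\,d\xi \right]$$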

This result has a simple physical interpretation. If a column of unit cross-sectional area
extending from ξ = −∞ to ξ = +∞ is considered, the first integral is all the charge in
that part of the column to the left of ξ = x, whereas the second integral is all the charge in
that part of the column to the right of ξ = x. Thus the electric field at any cross section is
uniform over that cross section, normal to it, and equal to $1/2\epsilon_0$ times the difference in the
total charge per unit area found to the left and right of the cross section.
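The column interpretation translates directly into a short numerical routine: sum the charge per unit area on each side of the cross section and divide the difference by 2ε0. The sketch below assumes a density that vanishes outside a finite interval; the function name `layered_field` and the uniform slab used to exercise it are hypothetical choices for illustration.

```python
EPS0 = 8.854e-12  # farads/m


def layered_field(rho, x, lo, hi, steps=20000):
    """X-directed field (volts/m) at x for a density rho(xi) that is zero outside [lo, hi].

    Implements (charge per unit area to the left - charge to the right) / (2 * EPS0)
    with a midpoint-rule sum. Hypothetical helper for illustration only.
    """
    h = (hi - lo) / steps
    left = right = 0.0
    for i in range(steps):
        xi = lo + (i + 0.5) * h  # midpoint of the i-th layer
        if xi < x:
            left += rho(xi) * h
        else:
            right += rho(xi) * h
    return (left - right) / (2.0 * EPS0)


def slab(xi):
    """Assumed test density: uniform 1e-6 coul/m^3 for |xi| < 0.1 m, zero elsewhere."""
    return 1e-6 if abs(xi) < 0.1 else 0.0


e_center = layered_field(slab, 0.0, -0.5, 0.5)
e_outside = layered_field(slab, 0.3, -0.5, 0.5)
print(e_center, e_outside)
```

For this slab the field vanishes at the midplane, where equal charge lies on either side, and takes the same value at every point outside the slab.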
EXAMPLE 3.3

Consider a spherical conducting† shell of outer radius a which contains a net charge Q.
What is the electric field distribution for this system?
Because of repulsion, the charge Q will distribute itself uniformly over the outer surface
of the shell with a density $\sigma = Q/4\pi a^{2}$. By symmetry, the field E at any distance r from the
center of the shell will be independent of the direction of that distance. With the use of
spherical coordinates, (3.13) can be written for this example in the form

$$\mathbf{E}(r) = \frac{1}{4\pi\epsilon_0} \int_{0}^{2\pi}\!\int_{0}^{\pi} \frac{Q}{4\pi a^{2}}\,\frac{\boldsymbol{\mathfrak{x}}}{\mathfrak{x}^{3}}\,a^{2} \sin\theta\,d\theta\,d\phi$$

in which, for convenience, the electric field is being evaluated at the point (r,0,0).
Referring to the figure, one sees that the charge contained within the band of area
$2\pi a^{2} \sin\theta\,d\theta$ exerts a net effect at (r,0,0) which is entirely radial, and thus the above integral
can be written

$$\mathbf{E}(r) = \frac{\mathbf{1}_r Q}{8\pi\epsilon_0} \int_{0}^{\pi} \frac{\cos\alpha\,\sin\theta\,d\theta}{\mathfrak{x}^{2}}$$

† For the purposes of electrostatics, materials may be divided into two categories: conductors of
electricity and insulators (dielectrics). A conductor may be viewed as an aggregation of charged particles occupying a region which, on the atomic level, is mostly vacuous; conductors are thus brought
within the purview of the present analysis. A large number of these charged particles (usually electrons) are free to wander throughout the confines of the conductor. These mobile charges respond
readily to an electric field and will continue to move as long as they experience a field. Thus whenever
the mobile carriers are arranged in a spatial distribution whose statistical time-average value is zero,
there is no net static electric field anywhere within the conductor. Only situations in which this is the
case will be considered in this chapter, the treatment of dynamic situations being reserved to Chapter 8. The discussion of dielectric materials will be deferred until Chapter 6.

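[Figure: charged spherical shell with the field point P(r,0,0).]

Using the law of cosines, one can convert this to the form

$$\mathbf{E}(r) = \frac{\mathbf{1}_r Q}{16\pi\epsilon_0 a r^{2}} \int_{|r-a|}^{r+a} \frac{\mathfrak{x}^{2} + r^{2} - a^{2}}{\mathfrak{x}^{2}}\,d\mathfrak{x}$$

which gives

$$\mathbf{E}(r) = \begin{cases} \mathbf{1}_r\,\dfrac{Q}{4\pi\epsilon_0 r^{2}}, & r > a \\[1ex] 0, & r < a \end{cases}$$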

Thus charge which is uniformly distributed over a spherical shell creates an electric field
at all external points as though the charge were concentrated at the center of the shell; at
all internal points it exerts no electric field whatsoever.
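The final step of this example can be checked by evaluating the integral over distance in closed form. The following is a brief sketch, writing $\mathfrak{x}$ for the distance from a charge element on the shell to the field point, with limits $|r-a|$ and $r+a$ as supplied by the law of cosines:

```latex
\int_{|r-a|}^{r+a} \frac{\mathfrak{x}^{2} + r^{2} - a^{2}}{\mathfrak{x}^{2}}\,d\mathfrak{x}
  = \left[ \mathfrak{x} - \frac{r^{2} - a^{2}}{\mathfrak{x}} \right]_{|r-a|}^{r+a}
  = \begin{cases}
      2a - (-2a) = 4a, & r > a \\
      2a - 2a = 0,     & r < a
    \end{cases}
```

Multiplying by the prefactor $Q/16\pi\epsilon_0 a r^{2}$ then gives $Q/4\pi\epsilon_0 r^{2}$ outside the shell and zero inside it.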
EXAMPLE 3.4

An academic problem which later can be extended to a practical situation concerns two
infinitely long parallel line charges. As shown in the figure (see next page), the upper line
contains a uniform charge density of λ coul/m and the lower line contains the opposite uniform charge density of −λ coul/m. The electric field at any point in space can be deduced
by first noting that by symmetry the answer will be independent of z. Thus the value of E
will be sought at the point P(x,y,0).
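In analogy with what has been said previously for volume and surface distributions, this
lineal distribution of charge gives rise to an electric field expressible as

$$\mathbf{E} = \frac{\lambda}{4\pi\epsilon_0} \int \frac{\boldsymbol{\mathfrak{x}}_1}{\mathfrak{x}_1^{3}}\,d\ell - \frac{\lambda}{4\pi\epsilon_0} \int \frac{\boldsymbol{\mathfrak{x}}_2}{\mathfrak{x}_2^{3}}\,d\ell$$

in which $\boldsymbol{\mathfrak{x}}_1$ is drawn from a charge element in the upper line to P and $\boldsymbol{\mathfrak{x}}_2$ is drawn from a
charge element in the lower line to P. Thus

$$\mathbf{E}(x,y,0) = \frac{\lambda}{4\pi\epsilon_0} \int_{-\infty}^{\infty} \frac{\mathbf{1}_x x + \mathbf{1}_y (y - d/2) - \mathbf{1}_z \zeta}{[x^{2} + (y - d/2)^{2} + \zeta^{2}]^{3/2}}\,d\zeta - \frac{\lambda}{4\pi\epsilon_0} \int_{-\infty}^{\infty} \frac{\mathbf{1}_x x + \mathbf{1}_y (y + d/2) - \mathbf{1}_z \zeta}{[x^{2} + (y + d/2)^{2} + \zeta^{2}]^{3/2}}\,d\zeta$$

$$\mathbf{E}(x,y) = \frac{\lambda}{2\pi\epsilon_0} \left[ \frac{\mathbf{1}_x x + \mathbf{1}_y (y - d/2)}{x^{2} + (y - d/2)^{2}} - \frac{\mathbf{1}_x x + \mathbf{1}_y (y + d/2)}{x^{2} + (y + d/2)^{2}} \right]$$

[Figure: two infinitely long parallel line charges of opposite sign, with the field point P(x,y,0).]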
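The passage from the integral form to the closed form in the line-charge example rests on two elementary integrals. As a brief sketch, let b denote the perpendicular distance from one of the lines to the field point (the symbol b is introduced here only for illustration, with $b^{2} = x^{2} + (y \mp d/2)^{2}$), and let ζ be the coordinate measured along the line:

```latex
\int_{-\infty}^{\infty} \frac{d\zeta}{(b^{2} + \zeta^{2})^{3/2}}
  = \left[ \frac{\zeta}{b^{2}\sqrt{b^{2} + \zeta^{2}}} \right]_{-\infty}^{\infty}
  = \frac{2}{b^{2}},
\qquad
\int_{-\infty}^{\infty} \frac{\zeta\,d\zeta}{(b^{2} + \zeta^{2})^{3/2}} = 0
```

The second integral vanishes because its integrand is odd, so each line contributes only a transverse field, of magnitude equal to its lineal charge density divided by $2\pi\epsilon_0 b$ and directed along the perpendicular from the line to the field point.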

The Dirac delta function can be used to advantage in the formulation of many field
problems. Written δ(x − a), the one-dimensional delta function is defined as having the
properties:

$$\delta(x - a) = 0 \quad \text{for all } x \neq a$$

$$\int \delta(x - a)\,dx = 1 \quad \text{if } x = a \text{ is included in the region of integration; otherwise the integral is zero}$$

The delta function can be pictured conveniently as the limit of a Gaussian curve, or
some other similarly peaked distribution, when in the limit the curve narrows and
heightens indefinitely, but in such a way that the area under the curve remains unity.
This area is considered to be dimensionless.
From these definitions it follows that, if f(x) is any arbitrary function,

$$\int f(x)\,\delta(x - a)\,dx = f(a) \qquad (3.14)$$

$$\int f(x)\,\delta'(x - a)\,dx = -f'(a) \qquad (3.15)$$

in which the prime denotes differentiation. The first result comes readily from the mean
value theorem whereas the second can be derived from the differential of a product. In
both formulas a is assumed to lie within the integration interval.
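The picture of the delta function as a narrowing Gaussian can be tested numerically: with a sufficiently narrow unit-area Gaussian in place of the delta function, the two integrals should approach f(a) and −f'(a). The sketch below is illustrative only; the helper names, the width of 0.001, and the choice of sin x as test function are assumptions.

```python
import math


def gaussian(x, a, width):
    """Unit-area Gaussian centered at a; it narrows toward a delta function as width -> 0."""
    return math.exp(-0.5 * ((x - a) / width) ** 2) / (width * math.sqrt(2.0 * math.pi))


def gaussian_prime(x, a, width):
    """Derivative of the Gaussian with respect to x."""
    return -(x - a) / width ** 2 * gaussian(x, a, width)


def integrate(func, lo, hi, steps=20000):
    """Midpoint-rule integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * h) for i in range(steps)) * h


a, width = 0.7, 1e-3
f = math.sin  # any smooth test function would serve

# Stand-ins for the integrals of f(x) delta(x - a) and f(x) delta'(x - a)
sift = integrate(lambda x: f(x) * gaussian(x, a, width), a - 0.05, a + 0.05)
sift_prime = integrate(lambda x: f(x) * gaussian_prime(x, a, width), a - 0.05, a + 0.05)

print(sift, math.sin(a))
print(sift_prime, -math.cos(a))
```

The integration interval is deliberately chosen to contain x = a; shifting it so that a lies outside drives both integrals toward zero.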
Delta functions in multidimensional space can be fabricated by forming products of
one-dimensional delta functions. In three dimensions

$$\delta(\mathbf{r} - \mathbf{r}_0) = \delta(x - a)\,\delta(y - b)\,\delta(z - c)$$

in which $\mathbf{r}$ and $\mathbf{r}_0$ are the position vectors drawn respectively to the points (x,y,z) and
(a,b,c). As a consequence of the earlier definitions, $\delta(\mathbf{r} - \mathbf{r}_0)$ vanishes except at $\mathbf{r} = \mathbf{r}_0$
and, if dV = dx dy dz,

$$\int_V \delta(\mathbf{r} - \mathbf{r}_0)\,dV = \begin{cases} 1 & \text{if } V \text{ contains } (a,b,c) \\ 0 & \text{if } V \text{ does not contain } (a,b,c) \end{cases}$$

Through the use of the delta function, an assemblage of discrete charges can be represented by a volume charge density function in the form

$$\rho(\xi,\eta,\zeta) = \sum_{n=1}^{N} q_n\,\delta(\mathbf{r}' - \mathbf{r}_n) \qquad (3.16)$$

in which $\mathbf{r}'$ is the position vector drawn to the point (ξ,η,ζ). This representation can be
verified by inserting (3.16) in (3.11), which gives

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0} \int_V \sum_{n=1}^{N} q_n\,\delta(\mathbf{r}' - \mathbf{r}_n)\,\frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^{3}}\,d\xi\,d\eta\,d\zeta$$

Use of the result (3.14) reduces this to (3.10).


Similarly, a surface charge distribution can be represented in terms of a volume
charge density through the use of a one-dimensional delta function along the direction
of the normal to the surface; a lineal charge distribution can be represented by a volume
charge density through the use of a two-dimensional delta function in a transverse surface.
By means of these representations a general discussion in terms of ρ-type distributions
has much wider applicability.

3.4 ELECTROSTATIC POTENTIAL

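Use of the del operator

$$\nabla = \mathbf{1}_x \frac{\partial}{\partial x} + \mathbf{1}_y \frac{\partial}{\partial y} + \mathbf{1}_z \frac{\partial}{\partial z}$$

to form the gradient of inverse distance (cf. Mathematical Supplement) gives

$$\nabla\left(\frac{1}{\mathfrak{x}}\right) = -\frac{\mathbf{1}_x (x - \xi) + \mathbf{1}_y (y - \eta) + \mathbf{1}_z (z - \zeta)}{[(x - \xi)^{2} + (y - \eta)^{2} + (z - \zeta)^{2}]^{3/2}}$$

so that

$$\frac{\boldsymbol{\mathfrak{x}}}{\mathfrak{x}^{3}} = -\nabla\left(\frac{1}{\mathfrak{x}}\right)$$

from which it follows that (3.11) can be written

$$\mathbf{E}(x,y,z) = -\frac{1}{4\pi\epsilon_0} \int_V \rho(\xi,\eta,\zeta)\,\nabla\left(\frac{1}{\mathfrak{x}}\right) d\xi\,d\eta\,d\zeta$$

Since neither ρ(ξ,η,ζ) nor the limits of integration are functions of the field point (x,y,z),
the order of integration and differentiation can be interchanged giving

$$\mathbf{E}(x,y,z) = -\nabla \int_V \frac{\rho(\xi,\eta,\zeta)\,d\xi\,d\eta\,d\zeta}{4\pi\epsilon_0\,\mathfrak{x}} \qquad (3.17)$$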

Thus the electric field is expressible as the negative of the gradient of the scalar function

$$\Phi(x,y,z) = \int_V \frac{\rho(\xi,\eta,\zeta)\,d\xi\,d\eta\,d\zeta}{4\pi\epsilon_0\,\mathfrak{x}} \qquad (3.18)$$

Φ is called the scalar electrostatic potential function and it has several important properties which will now be developed.
Since

$$\mathbf{E} = -\nabla\Phi \qquad (3.19)$$

use of the vector identity (V.112) gives

$$\nabla \times \mathbf{E} = 0 \qquad (3.20)$$

and thus the static electric field is irrotational. Application of Stokes' theorem then
yields

$$\oint_C \mathbf{E} \cdot d\boldsymbol{\ell} = \int_S (\nabla \times \mathbf{E}) \cdot d\mathbf{S} = 0 \qquad (3.21)$$

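in which C is any closed contour, forming the sole boundary of an open surface S.
Consider a contour C which is arbitrarily divided into two segments $C_1$ and $C_2$ by the
points $P_1$ and $P_2$, as shown in Figure 3.8. As a consequence of (3.21)

$$\int_{P_1\,(C_1)}^{P_2} \mathbf{E} \cdot d\boldsymbol{\ell} = \int_{P_1\,(C_2)}^{P_2} \mathbf{E} \cdot d\boldsymbol{\ell} \qquad (3.22)$$

FIGURE 3.8 A segmented contour.

In words, the line integral of the longitudinal component of E from $P_1$ to $P_2$ is independent of the path and therefore the static electric field is said to be conservative. But
from (3.19)

$$\mathbf{E} \cdot d\boldsymbol{\ell} = -\nabla\Phi \cdot d\boldsymbol{\ell} = -d\Phi$$

or

$$\Phi(P_2) - \Phi(P_1) = -\int_{P_1}^{P_2} \mathbf{E} \cdot d\boldsymbol{\ell} \qquad (3.23)$$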


The difference in the value of the scalar electrostatic potential function at any two points
is thus the negative of the line integral of longitudinal E along any path connecting the
two points.
If the total charge in the system is finite and occupies a finite volume, it follows from
(3.18) that the value of Φ at infinity is zero. If in (3.23) the point $P_2$ is allowed to
approach infinity, one obtains

$$\Phi(P_1) = \int_{P_1}^{\infty} \mathbf{E} \cdot d\boldsymbol{\ell} \qquad (3.24)$$
This result has a clear physical interpretation. If there is a distribution of charges
ρ(ξ,η,ζ) dV which create a static electric field E, and if a charge q is placed at $P_1$, it will
experience a force qE (providing its presence does not alter the charge distribution).
If q is then displaced an amount dℓ along an arbitrary path, the charge system does an
amount of work on q equal to qE · dℓ. If this process is continued and q is permitted to
recede to infinity, the total amount of energy extracted from the charge system is

$$W = \int_{P_1}^{\infty} q\mathbf{E} \cdot d\boldsymbol{\ell} = q\,\Phi(P_1) \qquad (3.25)$$

Therefore W is the amount of energy potentially available when q is at $P_1$, energy which
can be extracted from the charge system by removing q to a point remote from all of the
charges in the system. For this reason $\Phi(P_1) = W/q$ can be interpreted as the potential
energy available per coulomb at the point $P_1$ due to the charge distribution ρ dV. Φ is
expressed in units of joules per coulomb, or volts. For this reason E is customarily given
in volts per meter, a unit which can be understood by virtue of (3.23).
Of course, the value of $\Phi(P_1)$ can be negative just as well as positive. If it is positive,
the net action of the charge system on a positive charge q as it moves away is repulsive.
If on the other hand $\Phi(P_1)$ is negative, the net action of the charge system on a positive
charge q as it moves away is attractive. In this latter case it takes an external force and
an external supply of energy to pull the positive charge q away. Instead of the charge
system having provided energy to remove q, it has required the addition of energy at the
expense of external sources in order to effect the removal. These arguments are inverted
if q is a negative charge, but then the signs of W and Φ are opposite, as seen from (3.25),
so the interpretation with respect to the supply or removal of energy is the same.
$\Phi(P_1)$ as given by (3.24) is called the absolute potential, whereas $\Phi(P_2) - \Phi(P_1)$ as
expressed in (3.23) is customarily given the name of potential difference.
The result (3.21) is consistent with the law of conservation of energy. Since the integrand can be thought of as the work done by the charge system on a positive unit charge
as it undergoes a displacement dℓ, the closed line integral is the work done on a positive
unit charge as it moves from an initial point around any path and ultimately back to the
initial point. Upon its return the original situation is reproduced. If there had been a net
value for the work done, this cycle could be repeated endlessly and would constitute a
perpetual-motion machine.
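The vanishing of the closed line integral can be observed numerically for any static arrangement of point charges. The sketch below is only an illustration under stated assumptions: the function names, the two sample charges, and the circular contour of unit radius in the plane z = 0 are all hypothetical, and the charges are placed off that plane so the contour never passes through a source.

```python
import math

EPS0 = 8.854e-12  # farads/m


def field(p, sources):
    """Electric field (volts/m) at point p from a list of (q_n, position) pairs."""
    ex = ey = ez = 0.0
    for qn, rn in sources:
        dx, dy, dz = p[0] - rn[0], p[1] - rn[1], p[2] - rn[2]
        d3 = (dx * dx + dy * dy + dz * dz) ** 1.5
        k = qn / (4.0 * math.pi * EPS0 * d3)
        ex, ey, ez = ex + k * dx, ey + k * dy, ez + k * dz
    return (ex, ey, ez)


def loop_integral(sources, center, radius, steps=4000):
    """Approximate the closed line integral of E . dl around a circle parallel to the XY plane."""
    total = 0.0
    for i in range(steps):
        t0 = 2.0 * math.pi * i / steps
        t1 = 2.0 * math.pi * (i + 1) / steps
        tm = 0.5 * (t0 + t1)
        # Chord from t0 to t1, with the field sampled at the midpoint of the arc
        dl = (radius * (math.cos(t1) - math.cos(t0)),
              radius * (math.sin(t1) - math.sin(t0)))
        pm = (center[0] + radius * math.cos(tm),
              center[1] + radius * math.sin(tm),
              center[2])
        e = field(pm, sources)
        total += e[0] * dl[0] + e[1] * dl[1]
    return total


sources = [(1e-9, (0.3, 0.1, 0.2)), (-2e-9, (-0.4, 0.5, -0.1))]
circulation = loop_integral(sources, (0.0, 0.0, 0.0), 1.0)
print(circulation)
```

The individual contributions to the sum are of the order of hundredths of a volt, yet they cancel around the loop to within rounding error, as the conservative character of the field requires.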
Because of the independence of path, the result (3.23) can be written

$$\Phi(P_2) - \Phi(P_1) = \int_{P_2}^{\infty} \mathbf{E} \cdot d\boldsymbol{\ell} - \int_{P_1}^{\infty} \mathbf{E} \cdot d\boldsymbol{\ell}$$

which is consonant with (3.24) and indicates that the potential difference is the difference in absolute potentials at P2 and PI, as well as being a 111eaSUre of the energy which
can be extracted from the system if a positive unit charge is moved from PI to P2. One
may interpret (3.23) by saying that, if E on the average is oriented from PI to P 2 ,
energy is extracted from the charge system as a positive unit charge moves from PI to
P2. This means that the energy potentially available at P2 is less than at PI, thus
explaining the minus sign on the right side of (3.23).
The scalar function <fl(x,Y,z), defined by (3.18), can be set equal to a constant, thereby
prescribing an equipotential surface. The discussion of gradient in the Mathematical
Supplement establishes the facts that V<fl is perpendicular to the equipotential surface
and has a magnitude and direction synonymous with the maximum spatial rate of
increase of <fl. Because of (3.19), E(x,Y,z) is therefore normal to the equipotential surface
which contains (x,Y,z), and has a magnitude and direction which give the maximum
spatial rate of decrease of . The negative feature of this interpretation is in accord with
the discussion of the negative sign in (3.23). More will be said in Section 3.6 about the
orthogonality between E and the equipotential surfaces, in connection with flux maps.
To summarize the results of this section, a scalar electrostatic potential function
$\Phi(x,y,z)$ has been introduced, with (3.18) the defining relation. $\Phi(x,y,z)$ has the physical
significance of being the potential energy available if a positive unit charge is placed at
$(x,y,z)$, this energy being extractable through removal of the unit charge to infinity. $\Phi$ is
related to the electric field by Equations (3.19) and (3.24) and is related to the sources by
(3.18). $\Phi$ is an exact function because the system is conservative, the line integral in
(3.24) being independent of the path.
Since surface distributions of charge $\sigma\,dS$, lineal distributions $\chi\,d\ell$, and discrete
charge distributions $q_n$ can be represented by volume distributions $\rho\,dV$ through use
of the Dirac delta function (cf. Section 3.3), all the results just obtained apply to these
types of distributions as well. Appropriate forms for the potential function include

$$\Phi(x,y,z) = \int_S \frac{\sigma(\xi,\eta,\zeta)\,dS}{4\pi\epsilon_0\,\mathscr{r}} \tag{3.26}$$

$$\Phi(x,y,z) = \frac{1}{4\pi\epsilon_0}\sum_{n=1}^{N}\frac{q_n}{\mathscr{r}_n} \tag{3.27}$$

The second of these results can be obtained by inserting (3.16) into (3.18) and the first
can be found by a similar procedure. Alternatively, Equations (3.8) and (3.13) can be
rephrased in terms of the gradient of inverse distance and the procedure which led
to (3.18) can be repeated for these two cases.
Because (3.18) is a scalar integral, for many volume distributions it is much easier
to evaluate than the vector integral (3.11). When such is the case, it probably will
prove to be simpler to find $\mathbf{E}$ by first finding $\Phi$ and then forming $-\nabla\Phi$, rather than
attempting to find $\mathbf{E}$ directly.
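The two routes to the field can be compared numerically. The following is a minimal sketch, assuming a system of discrete charges so that the scalar sum (3.27) applies; the function names, the sample charges, and the finite-difference step `h` are illustrative choices, not part of the text.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def potential(charges, point):
    """Scalar sum of Equation (3.27); charges is a list of (q, (x, y, z))."""
    return sum(q / math.dist(point, pos) for q, pos in charges) / (4 * math.pi * EPS0)


def field_from_potential(charges, point, h=1e-4):
    """E = -grad(Phi), approximated by central differences of step h (an assumption)."""
    e = []
    for axis in range(3):
        plus = list(point)
        minus = list(point)
        plus[axis] += h
        minus[axis] -= h
        e.append(-(potential(charges, plus) - potential(charges, minus)) / (2 * h))
    return tuple(e)


def field_direct(charges, point):
    """Vector Coulomb sum, built component by component for comparison."""
    e = [0.0, 0.0, 0.0]
    for q, pos in charges:
        r = math.dist(point, pos)
        for axis in range(3):
            e[axis] += q * (point[axis] - pos[axis]) / (4 * math.pi * EPS0 * r**3)
    return tuple(e)


# Hypothetical sample system: two point charges and one field point.
charges = [(1e-9, (0.0, 0.0, 0.0)), (-2e-9, (0.3, 0.0, 0.0))]
p = (0.5, 0.7, -0.2)
print(field_from_potential(charges, p))
print(field_direct(charges, p))
```

The scalar route needs only one distance per charge; the three field components then follow from differences of a single function.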
EXAMPLE 3.5

An important special case of an electric charge system is the doublet, or dipole, consisting
of two charges $q$ and $-q$ separated by a small distance $d$. It is desired to find the potential
and electric field of this charge system a distance $\mathscr{r}$ from its center, with $\mathscr{r} \gg d$. This result
derives some of its importance from a model of dielectric materials, whose behavior can be
explained in terms of atomic and molecular dipoles (see Chapter 6).
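On the basis of the figure and Equation (3.27), the potential at a remote point $P(\mathscr{r},\theta,\phi)$
is given by

$$\Phi = \frac{q}{4\pi\epsilon_0}\left(\frac{1}{\mathscr{r}_1} - \frac{1}{\mathscr{r}_2}\right) = \frac{q}{4\pi\epsilon_0}\,\frac{\mathscr{r}_2 - \mathscr{r}_1}{\mathscr{r}_1\mathscr{r}_2}$$

in which $\mathscr{r}_1$ and $\mathscr{r}_2$ are the distances from $P$ to the charges $+q$ and $-q$, and $\theta$ is measured from the dipole axis. But, by the law of cosines,

$$\mathscr{r}_1 = \left[\mathscr{r}^2 + \left(\frac{d}{2}\right)^2 - \mathscr{r}d\cos\theta\right]^{1/2} \qquad \mathscr{r}_2 = \left[\mathscr{r}^2 + \left(\frac{d}{2}\right)^2 + \mathscr{r}d\cos\theta\right]^{1/2}$$

and, since $\mathscr{r} \gg d$, these expressions may be expanded by the binomial theorem (cf. Mathematical Supplement) into rapidly converging series. Retaining only dominant terms gives

$$\mathscr{r}_2 - \mathscr{r}_1 \approx d\cos\theta \qquad \mathscr{r}_1\mathscr{r}_2 \approx \mathscr{r}^2$$

and thus

$$\Phi(\mathscr{r},\theta,\phi) \approx \frac{qd\cos\theta}{4\pi\epsilon_0\mathscr{r}^2}$$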

The product $qd$ is called the dipole moment, a phrase borrowed from mechanics. It is useful
to introduce a vector $\mathbf{p}$ whose magnitude is $qd$ and whose direction is from the charge $-q$ to
the charge $+q$. If $\boldsymbol{\mathscr{r}}$ is taken to mean the directed distance from the center of the dipole to
the remote point $P$, the above result can be written

$$\Phi = \frac{\mathbf{p}\cdot\boldsymbol{\mathscr{r}}}{4\pi\epsilon_0\mathscr{r}^3} \tag{3.28}$$

This form of expression of the potential of a static electric dipole will prove convenient in
later discussions.


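The electric field due to the dipole can be found by employing the gradient operator for
spherical coordinates centered at the dipole, namely,

$$\mathbf{E}(\mathscr{r},\theta,\phi) = -\nabla\left(\frac{qd\cos\theta}{4\pi\epsilon_0\mathscr{r}^2}\right) = \frac{qd}{4\pi\epsilon_0\mathscr{r}^3}\left(\mathbf{1}_{\mathscr{r}}\,2\cos\theta + \mathbf{1}_{\theta}\sin\theta\right) \tag{3.29}$$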

It can be observed that the electric field and potential of a doublet diminish with distance
as the inverse cube and square, respectively, whereas Equations (3.8) and (3.27) indicate
that for a single charge the electric field and potential diminish with distance only to the
inverse second and first powers. The explanation lies in the fact that the doublet consists of
two equal and opposite charges close enough together so that they partly neutralize each
other's effect.
Field plots proportional to Equations (3.28) and (3.29) can be found with Example 3.11.
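The quality of the far-field approximation can be gauged numerically. The sketch below, under the assumption that $+q$ sits at $z = +d/2$ and $-q$ at $z = -d/2$ with $\theta$ measured from the $z$ axis, compares the exact two-charge potential with the dipole form; the function names and sample values are illustrative only.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def exact_potential(q, d, r, theta):
    """Two-charge potential, with distances from the law of cosines."""
    r1 = math.sqrt(r * r + d * d / 4 - r * d * math.cos(theta))  # distance to +q
    r2 = math.sqrt(r * r + d * d / 4 + r * d * math.cos(theta))  # distance to -q
    return q / (4 * math.pi * EPS0) * (1 / r1 - 1 / r2)


def dipole_potential(q, d, r, theta):
    """Dominant term: q d cos(theta) / (4 pi eps0 r^2)."""
    return q * d * math.cos(theta) / (4 * math.pi * EPS0 * r * r)


# Hypothetical sample: 1 nC charges 1 mm apart, observed at increasing range.
q, d, theta = 1e-9, 1e-3, 0.7
for ratio in (5, 20, 100):
    r = ratio * d
    exact = exact_potential(q, d, r, theta)
    approx = dipole_potential(q, d, r, theta)
    print(ratio, abs(exact - approx) / abs(exact))
```

The relative discrepancy falls off roughly as $(d/\mathscr{r})^2$, which is why only the dominant terms need be retained when $\mathscr{r} \gg d$.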
EXAMPLE 3.6

Consider again a spherical conducting shell of outer radius $a$ which contains a net charge $Q$.
The electric potential distribution for this system can be found with the aid of (3.26) in
the form

$$\Phi(r,0,0) = \frac{1}{4\pi\epsilon_0}\int \frac{Q}{4\pi a^2}\,\frac{a^2\sin\theta\,d\theta\,d\phi}{\mathscr{r}}$$

in which the geometry of the figure of Example 3.3 applies. Since

$$\mathscr{r}^2 = a^2 + r^2 - 2ar\cos\theta \qquad \mathscr{r}\,d\mathscr{r} = ar\sin\theta\,d\theta$$

the integral can be rewritten

$$\Phi(r) = \frac{Q}{8\pi\epsilon_0 ar}\int_{|r-a|}^{r+a} d\mathscr{r}$$

which yields

$$\Phi(r) = \frac{Q}{4\pi\epsilon_0 r} \quad r \ge a \qquad\qquad \Phi(r) = \frac{Q}{4\pi\epsilon_0 a} \quad r \le a$$

Therefore, charge which is uniformly distributed over a spherical surface creates an electric
potential at all external points as though the charge were concentrated at the center; at all
internal points it causes a constant potential.
From these potential expressions the electric field can be found through use of the gradient
operator for spherical coordinates. The result is

$$\mathbf{E}(r) = -\nabla\Phi = \mathbf{1}_r\,\frac{Q}{4\pi\epsilon_0 r^2} \quad r > a \qquad\qquad \mathbf{E}(r) = 0 \quad r < a$$

which agrees with Example 3.3.
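The closed forms can be cross-checked by evaluating the surface integral directly. The sketch below assumes the $\phi$ integration has already been carried out (contributing a factor $2\pi$) and applies a midpoint rule to the remaining $\theta$ integral; the function names and the number of steps are illustrative choices.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def shell_potential_numeric(Q, a, r, n=20000):
    """Midpoint rule for (Q / 8 pi eps0) * integral of sin(theta) d(theta) / script-r."""
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        dist = math.sqrt(a * a + r * r - 2 * a * r * math.cos(theta))
        total += math.sin(theta) / dist * dtheta
    return Q * total / (8 * math.pi * EPS0)


def shell_potential_closed(Q, a, r):
    """Q / (4 pi eps0 r) outside the shell, Q / (4 pi eps0 a) inside."""
    return Q / (4 * math.pi * EPS0 * max(r, a))


# Hypothetical sample: 1 nC on a shell of radius 1 m.
for r in (0.5, 2.0):
    print(r, shell_potential_numeric(1e-9, 1.0, r), shell_potential_closed(1e-9, 1.0, r))
```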


EXAMPLE 3.7

As an extension of Example 3.4, the potential due to two parallel line charges of opposite
sign can be deduced. Referring again to that figure, one sees that the potential is $z$ independent and given by

$$\Phi(x,y) = \frac{1}{4\pi\epsilon_0}\int_{-\infty}^{\infty}\frac{\chi\,d\zeta}{\mathscr{r}_1} - \frac{1}{4\pi\epsilon_0}\int_{-\infty}^{\infty}\frac{\chi\,d\zeta}{\mathscr{r}_2} = \frac{\chi}{4\pi\epsilon_0}\ln\frac{x^2 + (y + d/2)^2}{x^2 + (y - d/2)^2} \tag{3.30}$$

Equipotential surfaces for this distribution occur when

$$\frac{x^2 + (y + d/2)^2}{x^2 + (y - d/2)^2} = k^2$$

in which $k$ is a constant, for then

$$\Phi(x,y) = \frac{\chi}{2\pi\epsilon_0}\ln k$$

Rearrangement of terms gives

$$x^2 + \left(y + \frac{d}{2}\,\frac{1 + k^2}{1 - k^2}\right)^2 = \frac{d^2k^2}{(1 - k^2)^2} \tag{3.31}$$

which is the equation of a right circular cylinder parallel to the Z axis. The equipotential surfaces are a family of nonconcentric nesting cylinders with centers at $(0, -d(1 + k^2)/2(1 - k^2))$
and radii $d\,|k/(1 - k^2)|$. A few contours of these equipotentials are sketched in the figure.
Formation of the gradient of (3.30) gives

$$\mathbf{E} = -\nabla\Phi = \frac{\chi}{2\pi\epsilon_0}\left[\frac{\mathbf{1}_x x + \mathbf{1}_y(y - d/2)}{x^2 + (y - d/2)^2} - \frac{\mathbf{1}_x x + \mathbf{1}_y(y + d/2)}{x^2 + (y + d/2)^2}\right]$$

which agrees with the result of Example 3.4.
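That the circles obtained by completing the square really are equipotentials can be confirmed point by point. The sketch below assumes the positive line at $y = +d/2$ and the negative line at $y = -d/2$, takes the circle center at $y = -(d/2)(1+k^2)/(1-k^2)$ and radius $d\,|k/(1-k^2)|$, and evaluates the logarithm in (3.30) around the circle; the function names and the sample values of $k$ and $d$ are illustrative.

```python
import math


def log_ratio(x, y, d):
    """ln of the distance-squared ratio appearing in Equation (3.30)."""
    return math.log((x * x + (y + d / 2) ** 2) / (x * x + (y - d / 2) ** 2))


def equipotential_circle(k, d):
    """Center ordinate and radius of the cylinder on which the ratio equals k**2."""
    y_center = -(d / 2) * (1 + k * k) / (1 - k * k)
    radius = d * abs(k / (1 - k * k))
    return y_center, radius


# Hypothetical sample: k = 2 selects a cylinder around the positive line.
k, d = 2.0, 1.0
y_center, radius = equipotential_circle(k, d)
for i in range(8):
    t = 2 * math.pi * i / 8
    x, y = radius * math.cos(t), y_center + radius * math.sin(t)
    print(round(log_ratio(x, y, d), 12), round(2 * math.log(k), 12))
```

Every sampled point returns the same value $2\ln k$, so the potential $(\chi/4\pi\epsilon_0)\,2\ln k$ is constant on the cylinder.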

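3.5 GAUSS' LAW

Let a new static vector field be introduced by the relation

$$\mathbf{D}_0(x,y,z) = \epsilon_0\mathbf{E} = \frac{1}{4\pi}\int_V \rho(\xi,\eta,\zeta)\,\frac{\boldsymbol{\mathscr{r}}}{\mathscr{r}^3}\,d\xi\,d\eta\,d\zeta \tag{3.32}$$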

in which, through use of the delta function, $\rho$ can represent volume, surface, lineal, or
discrete charge distributions. $V$ is a volume large enough to encompass all the charges
of the system. $\mathbf{D}_0$ is called the electric flux density function and the zero subscript is a
reminder that the present discussion is limited to charges in a vacuum. The units of
$\mathbf{D}_0$ are coul/m², as can be seen by inspection of (3.32).
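Consider evaluation of the surface integral

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \oint_{S_G}\left[\frac{1}{4\pi}\int_V \rho\,\frac{\boldsymbol{\mathscr{r}}}{\mathscr{r}^3}\,dV\right]\cdot d\mathbf{S}$$

in which $S_G$ is a closed surface bounding a volume $V_G$. $V_G$ and $V$ can bear any general
relation to each other: either may totally include the other, they may have a subvolume in common, or they may be nonintersecting. Since the extents of these two
volumes are independent, the order of integration can be reversed, giving

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \frac{1}{4\pi}\int_V \rho(\xi,\eta,\zeta)\left[\oint_{S_G}\frac{\boldsymbol{\mathscr{r}}\cdot d\mathbf{S}}{\mathscr{r}^3}\right]d\xi\,d\eta\,d\zeta \tag{3.33}$$

In (3.33), $\rho(\xi,\eta,\zeta)\,d\xi\,d\eta\,d\zeta$ is a source element of charge in the volume $V$, and $\boldsymbol{\mathscr{r}}$ is
drawn from the source element to the surface element $d\mathbf{S}$ in $S_G$. This situation is
depicted in Figure 3.9.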
FIGURE 3.9 Geometry for establishment of Gauss' law.

Consider the evaluation of the surface integral on the right side of (3.33) for the
case of a point $(\xi,\eta,\zeta)$ interior to $S_G$. Let $dS_A$ be a surface element with a central point
$P_A$ as shown. If lines are drawn from every point on the perimeter of $dS_A$ to $(\xi,\eta,\zeta)$,
the cone thus formed includes a solid angle $d\Omega_A$. The projection of $dS_A$ onto a sphere
through $P_A$ with $(\xi,\eta,\zeta)$ as center therefore has an area equal to $\mathscr{r}^2\,d\Omega_A$. Thus

$$\mathscr{r}^2\,d\Omega_A = dS_A\cos\theta_A = \frac{\boldsymbol{\mathscr{r}}\cdot\mathbf{1}_n}{\mathscr{r}}\,dS_A$$

in which $\mathbf{1}_n$ is the outward-drawn unit vector normal to $dS_A$ and $\theta_A$ is the acute
angle of intersection of the sphere and $dS_A$. By consideration, in this fashion, of all the
elements of solid angle centered at $(\xi,\eta,\zeta)$, every surface element in $S_G$ is included
and one can write

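$$\oint_{S_G}\frac{\boldsymbol{\mathscr{r}}\cdot d\mathbf{S}}{\mathscr{r}^3} = \int_{4\pi}\frac{\boldsymbol{\mathscr{r}}\cdot\mathbf{1}_n}{\mathscr{r}}\,\frac{d\Omega}{\cos\theta}$$

But $\boldsymbol{\mathscr{r}}\cdot\mathbf{1}_n/\mathscr{r} = -\cos\theta$ or $+\cos\theta$ depending on whether or not $\boldsymbol{\mathscr{r}}$ and $\mathbf{1}_n$ are oppositely
directed. For the solid angle $d\Omega_A$ there is one intersection with the surface $S_G$ and the
contribution to the above integral is therefore $+d\Omega_A$. For the solid angle $d\Omega_B$ there are
three intersections. The contributions at $P_B$ and $P_B''$ are each $+d\Omega_B$ whereas the contribution at $P_B'$ is $-d\Omega_B$, for a net contribution of $+d\Omega_B$. It is apparent that since
$(\xi,\eta,\zeta)$ is inside $S_G$, for each element of solid angle $d\Omega$ there must be an odd number of
intersections and hence a net contribution of $+d\Omega$. Thus

$$\oint_{S_G}\frac{\boldsymbol{\mathscr{r}}\cdot d\mathbf{S}}{\mathscr{r}^3} = \int_{4\pi} d\Omega = 4\pi \tag{3.34}$$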

When an exterior point $(\xi',\eta',\zeta')$ is considered, each element of solid angle $d\Omega$ makes
an even number of intersections with $S_G$, half of which give a contribution $+d\Omega$ and
half of which give a contribution $-d\Omega$, for a net contribution which is null. Thus the
result (3.34) applies for any point $(\xi,\eta,\zeta)$ interior to $S_G$ but must be replaced by zero
for any point $(\xi',\eta',\zeta')$ exterior to $S_G$. Therefore (3.33) becomes

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \int_{V_G}\rho(\xi,\eta,\zeta)\,dV \tag{3.35}$$

In words, the integral of the normal component of $\mathbf{D}_0$ over a closed surface $S_G$ is equal
to the net charge in the volume $V_G$ enclosed by $S_G$. This is Gauss' law.
Several useful corollaries to Gauss' law follow readily. First, if the closed surface $S_G$
is constructed so that every point of $S_G$ is occupied by conducting material, the total
charge within $S_G$ is zero. This is a consequence of the fact that in electrostatic equilibrium, $\mathbf{E} \equiv 0$ within a conductor† and thus $\mathbf{D}_0 \equiv 0$ also. Therefore $\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = 0$.
Second, in electrostatic equilibrium, there is no net charge at any interior point of a
conductor. This follows because such a point can be surrounded by a surface $S_G$ which
lies wholly within the conductor. Thus the excess charge of a conductor resides on its
outer surface in electrostatic equilibrium.‡
Third, if any number of arbitrarily charged bodies be placed inside a hollow closed
conductor, the charge on its inside surface will be equal and opposite to the total charge
on the enclosed bodies. This can be shown by constructing $S_G$ to consist only of interior
points of the hollow conductor, at all of which points $\mathbf{D}_0 = 0$.
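Gauss' law lends itself to a direct numerical check. The sketch below assumes a single point charge placed on the $z$ axis at height `z0` and a spherical gaussian surface of radius `R` centered at the origin, and integrates the normal component of $\mathbf{D}_0$ over the sphere by a midpoint rule in the polar angle; the function name and sample values are illustrative.

```python
import math


def flux_through_sphere(q, z0, R, n=20000):
    """Integrate D0 . dS over a sphere of radius R for a charge q at (0, 0, z0)."""
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        # distance from the charge to the surface point at polar angle theta
        dist = math.sqrt(R * R + z0 * z0 - 2 * R * z0 * math.cos(theta))
        # normal component of D0 = (q / 4 pi) * (script-r vector) / script-r**3
        d_normal = q / (4 * math.pi) * (R - z0 * math.cos(theta)) / dist**3
        # ring of area 2 pi R^2 sin(theta) d(theta)
        total += d_normal * 2 * math.pi * R * R * math.sin(theta) * dtheta
    return total


q = 1e-9
print(flux_through_sphere(q, 0.5, 1.0))  # charge inside the surface
print(flux_through_sphere(q, 2.0, 1.0))  # charge outside the surface
```

An off-center interior charge still yields a total flux equal to $q$, while an exterior charge yields a net flux that vanishes to within the integration error.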
EXAMPLE 3.8

Consider a static electric system consisting of a single charge of strength $q$ located at the
origin. For this case (3.32) gives

$$\mathbf{D}_0 = \frac{q}{4\pi}\,\frac{\mathbf{r}}{r^3}$$

If a gaussian surface $S_G$ is constructed, consisting of a sphere of radius $r$ centered on the
charge, then over this sphere $\mathbf{D}_0$ is everywhere normal with a magnitude $q/4\pi r^2$. Thus

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \frac{q}{4\pi r^2}\,(4\pi r^2) = q$$

since $4\pi r^2$ is the surface area of the sphere. This result is seen to agree with Gauss' law.
If a charge $Q$ resides on a spherical conducting shell of outer radius $a$, as in Example 3.3,
by the second corollary to Gauss' law, this charge is found on the outer surface. By symmetry it has a uniform surface density $\sigma = Q/4\pi a^2$. If a gaussian surface $S_G$ is drawn, which
consists of a concentric sphere of radius $b$, then if $b > a$,

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = Q$$

from which, by symmetry

$$D_0 = \frac{Q}{4\pi r^2}$$

However, if $b < a$, then

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = 0$$

and thus, again invoking symmetry

$$D_0 \equiv 0.$$

Upon division of these results by $\epsilon_0$ to obtain $\mathbf{E}$, agreement is found with Example 3.3.

† To say that $\mathbf{E} \equiv 0$ requires some elaboration. In a metallic body, for example, at the atomic level
one can imagine an array of positive ions which can vibrate about their lattice sites, plus a cloud of
electrons which are free to wander throughout the body. The vibrations of the ions and the wanderings
of the free electrons are both random thermal effects, and local fluctuating electric fields exist at any
atom site, varying with the motions of the nearby ions and electrons. However, the time-average value
of this electric field is zero unless there is a drift of the electron cloud (current) through the metallic
body. In electrostatic equilibrium no such macroscopic drifts occur. The statement that $\mathbf{E}(x,y,z) \equiv 0$
implies this, and $\mathbf{E}(x,y,z)$ can be interpreted properly as the time-independent component of the electric
field within the conductor. Since the conductor is being viewed as an assemblage of charges in a
vacuum, the defining relation $\mathbf{D}_0 = \epsilon_0\mathbf{E}$ is applicable and therefore, within the conductor, $\mathbf{D}_0 \equiv 0$ also.
‡ The net result is that a charged conductor can be viewed internally as an electrically neutral body,
but one possessing a charge distribution over its exterior surface. The interior of the body is equivalent
to a vacuum once equilibrium is established, its role having been properly to distribute the surface
charge. This proper distribution is such that if the conducting body were removed, leaving the surface
charges intact in a vacuum, $\mathbf{E}$ would still be zero at all points formerly occupied by conductor.

EXAMPLE 3.9

Another academic problem, from which several practical results can be derived in due
course, involves two concentric conducting right circular cylinders of infinite length. A short
length of this geometry is shown in the figure. If it is assumed that a lineal charge density of
$+\chi$ coul/m exists on the inner conductor, the second corollary to Gauss' law indicates that
this charge resides on the surface $r = a$; by symmetry it is uniformly distributed over this
surface. The third corollary leads to the conclusion that the surface $r = b$ contains a charge
of $-\chi$ coul/m; by symmetry this is also uniformly distributed.
The field that exists in the vacuous region between these coaxial cylinders can be deduced
with the aid of Gauss' law. Let a gaussian surface $S_G$ be erected, composed of a concentric
cylinder of radius $r$, with $a < r < b$, and two end caps at the positions $z = \pm L$. By symmetry, $\mathbf{D}_0$ is entirely radial so the integrals over the end caps vanish, and

$$2L\chi = \oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \int_{-L}^{L} D_0(r)\,2\pi r\,dz = 4\pi rL\,D_0(r)$$

From this it follows that

$$D_0(r) = \frac{\chi}{2\pi r} \qquad \mathbf{E}(r) = \mathbf{1}_r\,\frac{\chi}{2\pi\epsilon_0 r} \qquad \Phi(r) = \Phi(a) - \frac{\chi}{2\pi\epsilon_0}\ln\frac{r}{a}$$

If the outer cylinder is grounded and the inner cylinder maintained at a potential $V_b$, this
last result becomes

$$\Phi(r) = V_b\,\frac{\ln(r/b)}{\ln(a/b)}$$

As one would expect, the equipotential surfaces are concentric cylinders.


These results can be put to practical use in the treatment of tubular condensers and
coaxial transmission lines.
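As an illustration of these formulas, the sketch below evaluates the potential, the lineal charge density, and the radial field between the cylinders for a given inner-conductor voltage. It assumes the outer cylinder is grounded; the function names and the dimensions in the sample call are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def coax_potential(r, v_inner, a, b):
    """Phi(r) = V ln(r/b) / ln(a/b), with Phi(b) = 0 and Phi(a) = V."""
    return v_inner * math.log(r / b) / math.log(a / b)


def coax_line_charge(v_inner, a, b):
    """Lineal charge density from V = (chi / 2 pi eps0) ln(b/a)."""
    return 2 * math.pi * EPS0 * v_inner / math.log(b / a)


def coax_field(r, v_inner, a, b):
    """Radial field E(r) = chi / (2 pi eps0 r)."""
    return coax_line_charge(v_inner, a, b) / (2 * math.pi * EPS0 * r)


# Hypothetical dimensions: 1 mm inner radius, 5 mm outer radius, 100 V applied.
a, b, v = 1e-3, 5e-3, 100.0
for r in (1e-3, 2e-3, 5e-3):
    print(r, coax_potential(r, v, a, b), coax_field(r, v, a, b))
```

The field is strongest at the inner conductor, $r = a$, which is where breakdown would first be expected in a tubular condenser.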

3.6 ELECTRIC FLUX

A graphic method for displaying any vector field is described in Example V.21 of the
Mathematical Supplement. This technique can be applied to the field Do with the
advantage that it frequently provides conceptual help in the understanding of problems.
The spatial vector function $\mathbf{D}_0(x,y,z)$ has a value at the point $P(x,y,z)$ given by
Equation (3.32). Imagine that lines are constructed at $P$ parallel to $\mathbf{D}_0$ and marked
with arrows pointing in the direction of $\mathbf{D}_0$. If the number of lines per square meter
which pass through a small area erected at $P$ transverse to $\mathbf{D}_0$ is chosen to be numerically
equal to the magnitude of $\mathbf{D}_0$ at $P$, and if this is done for all points $P$, a field map of
$\mathbf{D}_0$ results.
The lines which represent $\mathbf{D}_0$ are known as electric flux lines, and since their density
gives the value of $\mathbf{D}_0$, one can see why $\mathbf{D}_0$ is known as the electric flux density function.
For any closed surface $S_G$

$$\psi = \oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} \tag{3.36}$$

is the net number of electric flux lines emerging from $S_G$. Hence an alternative statement
of Gauss' law is that the net number of electric flux lines emerging from a closed surface
$S_G$ is numerically equal to the total charge enclosed by $S_G$. Since $S_G$ may be chosen
small enough so as to enclose only one discrete charge, it follows that the number of
electric flux lines originating on a positive charge $q_1$ is numerically equal to $q_1$, and that
the number of electric flux lines terminating on a negative charge $q_2$ is numerically
equal to $|q_2|$. If $S_G$ encloses no charge at all, $\psi = 0$, which means that as many flux lines
enter $S_G$ as leave it. Thus electric flux lines are continuous except at points where there
is charge. All these deductions may be summed up by the two statements:
1. All lines of electric flux originate on positive charge and terminate on negative
charge.†
2. The net efflux $\psi$ at a point $P$ is numerically and algebraically equal to the electric
charge at $P$.

† Some charge may have to be considered to be placed at infinity in order to "complete" the charge
system.

EXAMPLE 3.10

Consider again the case of a single charge $q$ placed at the origin. If $q$ is positive, then $\psi = q$
lines of electric flux emerge from the origin; by symmetry, they are uniformly distributed
in three dimensions, as suggested by the figure. At any radius $r$, the density of these lines is

$$D_0 = \frac{\psi}{4\pi r^2}$$

so that

$$\mathbf{D}_0 = \frac{q}{4\pi}\,\frac{\mathbf{r}}{r^3}$$

which agrees with Example 3.8.
In the absence of all other charge, these flux lines would extend radially to infinity, there
to be terminated by a total charge $-q$, uniformly distributed over an infinite sphere. If the
charge $q$ at the origin is negative, the directions of the arrows on the flux lines are reversed.

EXAMPLE 3.11

Next, reconsider the doublet of Example 3.5, consisting of charges $q$ and $-q$ a distance $d$
apart. All $q$ of the lines of electric flux leaving the positive charge terminate on the negative
charge. Since for this system

$$\mathbf{D}_0 = \frac{q}{4\pi}\left(\frac{\boldsymbol{\mathscr{r}}_1}{\mathscr{r}_1^3} - \frac{\boldsymbol{\mathscr{r}}_2}{\mathscr{r}_2^3}\right)$$

it follows that for points very close to the charge $+q$,

$$\mathbf{D}_0 \approx \frac{q}{4\pi}\,\frac{\boldsymbol{\mathscr{r}}_1}{\mathscr{r}_1^3}$$

so that the flux lines start out radially and uniformly from $+q$ and then bend around toward
the charge $-q$, where they enter radially and uniformly. This is enough information to
allow a rapid and informative sketching of the field, with the result indicated by the heavy
solid lines in the figure.
Through use of the field expression developed in Example 3.5, if $\mathscr{r} \gg d$,

$$\mathbf{D}_0 = \frac{qd}{4\pi\mathscr{r}^3}\left(\mathbf{1}_{\mathscr{r}}\,2\cos\theta + \mathbf{1}_{\theta}\sin\theta\right)$$

which is seen to be consistent with the flux plot.

[Figure: flux lines and equipotentials of the doublet]

Equipotential surfaces can also be added to a flux map and they are everywhere
perpendicular to the electric flux lines. In the preceding simple example of a single
charge, the equipotential surfaces are concentric spheres. In the case of the doublet,
they are figures of rotation about the line connecting the charges; the profiles of several
equipotentials are shown in the figure.

3.7 A CONDUCTOR-VACUUM INTERFACE


The relation between flux lines and the charge which resides on conductors is of special
interest. Consider the case of the electrified conductor of exterior surface $S$ shown in
Figure 3.10. It already has been established, through the second corollary to Gauss'
law, that all the excess charge resides on the exterior surface $S$. Further, it has been seen
that in electrostatic equilibrium, $\mathbf{E} \equiv 0$ throughout the conductor, for if this were not
so, charges would be flowing, in denial of the assumption of equilibrium. Thus between
any two points in the conductor $\int\mathbf{E}\cdot d\boldsymbol{\ell} \equiv 0$, because these points can be connected
by a path which lies wholly in the conductor. It follows that the conductor is an
equipotential.
If the conducting body is viewed as an assemblage of charges (some mobile) in a
vacuum, it follows that $\mathbf{D}_0 = \epsilon_0\mathbf{E} \equiv 0$ within the conductor, and therefore all the
electric flux lines associated with the excess charge are external to the exterior surface $S$.
Further, these flux lines must be normal to $S$. If they were not, this would imply a
tangential component of $\mathbf{D}_0$, and thus of $\mathbf{E}$, in $S$; this would give rise to surface charge
flow, violating the premise of static equilibrium. This conclusion that the flux lines
must be normal to $S$ is also consistent with the fact that $S$ is an equipotential surface.

FIGURE 3.10 An electrified conductor: panels (a), (b), and (c).

If the total excess charge in a surface element $dS$ of $S$ is $\sigma\,dS$, Gauss' law requires
that $d\psi = \sigma\,dS$ be the total number of flux lines just outside of $dS$ associated with the
charge in $dS$. But $d\psi = D_0\,dS$, in which $D_0$ is the electric flux density immediately
outside of $dS$. Therefore, at any point on the exterior surface of an electrified conductor

$$\sigma = D_0 \tag{3.37}$$

In general $\sigma$ and $D_0$ are functions of position on the conductor surface. A possible
distribution is indicated by the flux lines in Figure 3.10a.
All of the foregoing conclusions apply whether the conductor whose exterior surface
is $S$ is solid or hollow. Let it be assumed that it is a hollow body, with an interior surface $S'$, as shown in Figure 3.10b. A gaussian surface $S_G$ can be constructed which lies
as close to $S'$ as one pleases, with all points of $S_G$ lying in conductor. Gauss' law then
yields the result that $S'$, considered as a whole, must be electrically neutral. This does
not prove that $S'$ must everywhere be locally neutral. One could imagine a charge
distribution over $S'$, somewhat as suggested by Figure 3.10c, with the flux lines running
through the hollow interior from the positive cluster of charge to the negative cluster
of charge.
But such a charge distribution on $S'$ is not in static equilibrium. To appreciate this,
one need only form the integral $\int\mathbf{E}\cdot d\boldsymbol{\ell}$ along two paths from $P_1$ to $P_2$, one path being
entirely in the conductor, the other through the hollow interior along a flux line. Since
$\mathbf{E} \equiv 0$ along the first path, the integral vanishes there; because potential difference is
independent of path, it must vanish along the second path as well. Along a flux line
$\mathbf{E}\cdot d\boldsymbol{\ell}$ does not change sign, so $\mathbf{E}$ must be identically zero along the second path. Therefore, the interior surface $S'$ must be
everywhere locally neutral and the hollow interior region is field-free. This is a feature
which can be used to advantage when it is desired to shield equipment from external
electric fields.
EXAMPLE 3.12

Previous examples have established that the electric flux density external to a conducting
spherical shell of outer radius $a$, possessing an excess charge $Q$, is $\mathbf{D}_0 = (Q/4\pi)(\mathbf{r}/r^3)$.
Right at the surface

$$\sigma = D_0 = \frac{Q}{4\pi a^2}$$

which is consistent with the fact that the surface area is $4\pi a^2$.

3.8 THE METHOD OF IMAGES

Certain problems in electrostatics may be simplified by the application of an image
technique which can be established through the use of Gauss' law. To this end, consider an arbitrary complete system of discrete fixed charges.† Figure 3.11a is intended
to suggest this general situation, with flux lines drawn from positive charges to negative charges, and equipotentials shown lighter and transverse to the $\mathbf{D}_0$ field.
Let $\Phi_0$ be a closed equipotential surface with $\mathbf{1}_n$ an outward-drawn unit vector
normal to $\Phi_0$ at the point $P$. The surface $\Phi_0$ divides the system of charges into an
internal part and an external part such that

$$\sum_{n=1}^{N} q_n = Q_{\text{int}} + Q_{\text{ext}} = 0$$

In words, the algebraic sum of the external charges equals minus the algebraic sum
of the internal charges.
If a surface charge density $\sigma = -\mathbf{D}_0\cdot\mathbf{1}_n$ coul/m² is placed at each point $P$ on the
surface $\Phi_0$, at the same time that all charges exterior to $\Phi_0$ are removed, the result
will be as shown in Figure 3.11b. The $\mathbf{D}_0$ field inside $\Phi_0$ will be unaltered, whereas the field
outside will be completely erased. The surface $\Phi_0$ will still be an equipotential.
Suppose next that an extremely thin, electrically neutral conductor, shaped in the
form of the surface $\Phi_0$, is slipped into the position of $\Phi_0$ in such a way as not to disturb
any of the interior charge. Suppose further that the charges which make up the distribution $\sigma$ become attached to the conducting surface. Since these charges were in
transverse equilibrium before the insertion of the conducting surface ($\mathbf{E}_{\tan} \equiv 0$ over an
equipotential surface), they will not move even though they are now on a conductor and
free to do so. The important conclusion is reached that the field inside $\Phi_0$ is the same
in the presence of the conductor, charged with the distribution $\sigma$, as it was originally
when the external charges were present.
For any gaussian surface $S_G$ erected outside the conducting shell, $\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = 0$ by
virtue of the fact that now $\mathbf{D}_0 \equiv 0$ outside $\Phi_0$. This means that the total charge inside
$S_G$ is zero, or that

$$Q_{\text{int}} + \oint_{\Phi_0}\sigma\,dS = 0 \qquad\text{so that}\qquad \oint_{\Phi_0}\sigma\,dS = Q_{\text{ext}}$$

FIGURE 3.11 The method of images: panels (a), (b), and (c).

† By a complete system one means $N$ charges of values $q_n$, placed at arbitrary positions, but such that
$\sum q_n = 0$. If charges are imagined to exist at infinity, every system can be considered a complete
system.

A vivid way to picture what has been done is to imagine that all flux lines which
have extremities external to $\Phi_0$ have unit charges of appropriate sign attached to those
extremities, these charges making up $Q_{\text{ext}}$. These flux lines are permitted to contract,
pulling the unit charges with them until all of $Q_{\text{ext}}$ has collapsed onto a conductor placed
at $\Phi_0$, thus forming the distribution $\sigma$. This erases the external field but leaves the
internal field intact.
Alternatively, if a surface charge density $\sigma' = \mathbf{D}_0\cdot\mathbf{1}_n = -\sigma$ is placed at all points
on the surface $\Phi_0$ at the same time that all interior charges are removed, the result
will be as shown in Figure 3.11c. The field outside $\Phi_0$ will be unaltered whereas the
interior field will be completely erased. If the shaped conducting shell is put in place
and the charges comprising $\sigma'$ are allowed to become attached to it, no change in their
distribution will occur. Thus the field outside $\Phi_0$ is the same in the presence of the
conductor containing the surface charge distribution $\sigma'$ as it was when the original
discrete internal charges were present. One can conclude that

$$\oint_{\Phi_0}\sigma'\,dS = Q_{\text{int}}$$

and say that if $Q_{\text{int}}$ is allowed to collapse along its flux lines onto a conductor placed
at $\Phi_0$, thus forming the distribution $\sigma'$, then the internal field will be erased whereas
the external field will remain intact.
Once the situation of either Figure 3.11b or Figure 3.11c is achieved, it is no longer
necessary that the conductor be an extremely thin shell. It can assume any thickness
which encroaches only on the field-free region and can even be a solid conductor which
completely fills this region.
The above procedure turned around is the method of images. If one wishes to find the
field due to fixed charges and/or charged conducting bodies, a simple solution is available if the conductors can be replaced by properly positioned equivalent charges. For
simple conductor shapes the proper equivalence is often easily recognizable.
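The simplest instance of the procedure, a point charge above a grounded plane (treated analytically in Example 3.13), can be checked numerically. The sketch below assumes a charge $q$ at height $d/2$ and an image $-q$ at depth $d/2$, confirms that the plane $z = 0$ is at zero potential, and integrates the induced surface density over the plane using the substitution $r = (d/2)\tan u$ to reach infinity; the function names and sample values are illustrative.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def potential_above_plane(q, d, r, z):
    """Potential of q at z = +d/2 together with its image -q at z = -d/2."""
    to_charge = math.hypot(r, z - d / 2)
    to_image = math.hypot(r, z + d / 2)
    return q / (4 * math.pi * EPS0) * (1 / to_charge - 1 / to_image)


def induced_density(q, d, r):
    """sigma(r) = -(q d / 4 pi) [r^2 + (d/2)^2]^(-3/2) on the plane z = 0."""
    return -q * d / (4 * math.pi) * (r * r + (d / 2) ** 2) ** -1.5


def total_induced_charge(q, d, n=2000):
    """Integrate sigma(r) 2 pi r dr from 0 to infinity with r = (d/2) tan(u)."""
    du = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        r = (d / 2) * math.tan(u)
        dr_du = (d / 2) / math.cos(u) ** 2
        total += induced_density(q, d, r) * 2 * math.pi * r * dr_du * du
    return total


q, d = 1e-9, 0.02  # hypothetical: 1 nC held 1 cm above the plane
print(potential_above_plane(q, d, 0.05, 0.0))  # zero on the plane
print(total_induced_charge(q, d))              # close to -q
```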
EXAMPLE 3.13

Previous examples have been concerned with the field and potential due to an electric dipole
(doublet) and a flux map for this system can be found with Example 3.11. The equipotential
contours which have been added to that flux map show that the plane which forms the perpendicular bisector of the line connecting the two charges is an equipotential surface of
value $\Phi = 0$. Therefore, the system consisting of a discrete charge $q$ a distance $d/2$ above
a grounded conducting plane is equivalent (in a half-space) to a doublet. The image charge
$-q$, a distance $d/2$ behind the plane, can replace the plane and all the surface charge it
contains, for the purpose of computing the field anywhere above the plane. With reference to
the figure, the $\mathbf{D}_0$ field at any point $P(r,\phi,z)$ above the plane is therefore

$$\mathbf{D}_0 = \frac{q}{4\pi}\left\{\frac{\mathbf{1}_r r + \mathbf{1}_z(z - d/2)}{[r^2 + (z - d/2)^2]^{3/2}} - \frac{\mathbf{1}_r r + \mathbf{1}_z(z + d/2)}{[r^2 + (z + d/2)^2]^{3/2}}\right\}$$

in which cylindrical coordinates are being employed, with the origin selected as that point
in the conductor closest to $q$.
At the plane $z = 0$

$$\sigma(r) = \mathbf{D}_0\cdot\mathbf{1}_z = -\frac{qd}{4\pi}\left[r^2 + \left(\frac{d}{2}\right)^2\right]^{-3/2}$$

a distribution which is shown as the dotted area in the figure. The image technique is thus
seen to be additionally useful in yielding the charge distribution in the conductor. As a
check,

$$\int_0^{\infty}\sigma(r)\,2\pi r\,dr = -\frac{qd}{2}\int_0^{\infty}\frac{r\,dr}{\left[r^2 + (d/2)^2\right]^{3/2}} = -q$$

EXAMPLE 3.14

The case of two parallel line charges of opposite polarity has been treated in Example 3.7 in
which the equipotential surfaces were found to be nesting right circular cylinders. Combining the image technique with this result facilitates the solution of a problem of considerable practical importance.
Consider now the case of two straight parallel circular conductors, each of diameter $2a$,
with a center-to-center spacing $D$, as shown in the figure. Let the upper and lower conductors
contain lineal charge densities of $+\chi$ and $-\chi$ coul/m, respectively. Since it is already known
that the equipotential surfaces for parallel line charges are cylindrical, it follows that the
system shown in the figure is equivalent to two line charges at an appropriate spacing $d$.
This spacing $d$ must be selected so as to give equipotentials for the two surfaces occupied
by the outer skins of the two conductors. By use of Equation (3.31) this means that

$$\frac{D}{2} = \frac{d}{2}\,\frac{1 + k^2}{1 - k^2} \qquad a = d\,\frac{k}{1 - k^2}$$

which, solving for $d$, gives

$$d = (D^2 - 4a^2)^{1/2}$$

Upon inserting this value for $d$ in (3.30) one obtains the expression for the potential anywhere in the space surrounding the two parallel conductors, namely

$$\Phi(x,y) = \frac{\chi}{4\pi\epsilon_0}\ln\frac{x^2 + \left[y + \tfrac{1}{2}(D^2 - 4a^2)^{1/2}\right]^2}{x^2 + \left[y - \tfrac{1}{2}(D^2 - 4a^2)^{1/2}\right]^2}$$

In particular, if $(x,y)$ be chosen to correspond to any point on the surface of the upper conductor, such as $(0, D/2 - a)$, the potential of the upper cylinder is found to be

$$\Phi_+ = \frac{\chi}{2\pi\epsilon_0}\ln\left\{\frac{D}{2a} + \left[\left(\frac{D}{2a}\right)^2 - 1\right]^{1/2}\right\}$$

Since the median plane is at zero potential, it follows that the lower conductor is at a
potential $\Phi_- = -\Phi_+$. This can also be deduced by inserting an appropriate point, such as
$(0, -D/2 + a)$, into (3.30). The potential difference between the two conductors is therefore

$$V = 2\Phi_+ = \frac{\chi}{\pi\epsilon_0}\ln\left\{\frac{D}{2a} + \left[\left(\frac{D}{2a}\right)^2 - 1\right]^{1/2}\right\} = \frac{\chi}{\pi\epsilon_0}\cosh^{-1}\frac{D}{2a}$$

If $D \gg a$, as is often the case in practice, then to first order

$$V = \frac{\chi}{\pi\epsilon_0}\ln\frac{D}{a}$$

These results are useful in a discussion of two-wire transmission lines.
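A short numerical comparison shows how quickly the thin-wire approximation becomes adequate. The sketch below evaluates the image-line spacing and the potential difference in its exact (inverse hyperbolic cosine), logarithmic, and first-order forms; the function names and the sample dimensions are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def image_line_spacing(D, a):
    """Spacing d of the equivalent line charges, d = (D^2 - 4 a^2)^(1/2)."""
    return math.sqrt(D * D - 4 * a * a)


def voltage_exact(chi, D, a):
    """V = (chi / pi eps0) * arccosh(D / 2a)."""
    return chi / (math.pi * EPS0) * math.acosh(D / (2 * a))


def voltage_log_form(chi, D, a):
    """The same result written with the logarithm, as obtained from 2 * Phi_plus."""
    u = D / (2 * a)
    return chi / (math.pi * EPS0) * math.log(u + math.sqrt(u * u - 1))


def voltage_thin_wire(chi, D, a):
    """First-order form for D >> a: V = (chi / pi eps0) * ln(D / a)."""
    return chi / (math.pi * EPS0) * math.log(D / a)


chi, a = 1e-9, 1e-3  # hypothetical: 1 nC/m on wires of 1 mm radius
for D in (3e-3, 1e-2, 1e-1):
    exact = voltage_exact(chi, D, a)
    approx = voltage_thin_wire(chi, D, a)
    print(D, image_line_spacing(D, a), exact, (approx - exact) / exact)
```

For closely spaced wires the equivalent line charges lie noticeably closer together than the conductor axes ($d < D$), which is the proximity effect that the first-order formula ignores.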


The problem of a uniform line charge parallel to a conducting cylinder is quite evidently
within the scope of this analysis and is left as an exercise.
EXAMPLE

3.15

The method of images can be applied to the problem of calculating the field due to a discrete
charge in the presence of a spherically shaped conductor, upon recognition of the fact that
any two discrete particles, statically separated and bearing charges of opposite sign but
arbitrary magnitude, give rise to one equipotential surface which is a sphere. To see this,
refer to the figure, in which PI and P 2 are the positions of the two charges ql and Q2, and 0
/---,p

I
I

//

\
\

"

\
l2 \

I r, P,

PI

-I

.....

......... _ - - /

II

rl

'

..

is a point on the line which connects them. The potential at the point P is
ip(P) = _1_

41l"Eo

(~ + ~)
~l

~2

A zero potential surface is defined by the condition

:=-~
ql

~l

and this surface will be a sphere if the constant ratio ~2/~1 permits the distance a from 0 to

P to be a constant also. This will be true if the triangles OPP1 and OP2P are similar, for then
r2

~2

- -~l
a

a
rl

in which r i and rs are the fixed distances from 0 to the two charges. Thus an equipotential
surface exists which is spherical, with center at 0 and with a radius a which is the geometric
mean of the distances from 0 to each charge.
As can be seen by turning this problem around, if a grounded spherical conductor of
radius a is in the presence of an external discrete charge ql, placed a distance rl from the
center of the sphere, the field for this system can be computed by replacing the sphere with
a discrete charge

this equivalent charge being positioned a distance

from 0, in the direction toward ql.


This solution can be generalized to permit any value $\Phi$ for the potential of the spherical
conductor by placing an additional equivalent charge

$$q_3 = 4\pi\epsilon_0 a\,\Phi$$

at the point $O$. Letting $q_3 = 0$ gives the case just discussed, that of a grounded sphere. Letting $q_3 = -q_2$ gives the case of an electrically neutral sphere.
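The image construction for the grounded sphere can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample values (a 1 nC charge placed 0.5 m from the center of a sphere of radius 0.2 m) are assumptions chosen only for illustration. It places the image charge $-q_1 a/r_1$ at the distance $a^2/r_1$ from $O$ and confirms that the combined potential vanishes on the circle of radius $a$.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def image_charge(q1, r1, a):
    """Image charge q2 and its distance r2 from the center of a grounded sphere."""
    return -q1 * a / r1, a * a / r1


def potential(x, y, q1, r1, a):
    """Potential at (x, y) from q1 at (r1, 0) and its image at (r2, 0)."""
    q2, r2 = image_charge(q1, r1, a)
    l1 = math.hypot(x - r1, y)   # distance from the field point to q1
    l2 = math.hypot(x - r2, y)   # distance from the field point to q2
    return (q1 / l1 + q2 / l2) / (4.0 * math.pi * EPS0)


def max_surface_potential(q1=1e-9, r1=0.5, a=0.2, samples=360):
    """Largest |potential| found on a ring of sample points at radius a."""
    worst = 0.0
    for i in range(samples):
        angle = 2.0 * math.pi * i / samples
        phi = potential(a * math.cos(angle), a * math.sin(angle), q1, r1, a)
        worst = max(worst, abs(phi))
    return worst


if __name__ == "__main__":
    print("image charge and position:", image_charge(1e-9, 0.5, 0.2))
    print("largest potential on the sphere:", max_surface_potential())
```

The ring of sample points lies in a plane containing the line of charges; by symmetry about that line, this is enough to test the whole sphere.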

3.9 POISSON'S EQUATION

If a satisfactory model of the electrostatic system in question results from assuming
that the volume charge density function $\rho$ is well-behaved (i.e., has continuous first
derivatives), then it follows that $\mathbf{D}_0$ is well-behaved also. In this event, the divergence
theorem can be applied to (3.35) giving

$$\int_{V_G} \nabla\cdot\mathbf{D}_0\,dV = \oint_{S_G} \mathbf{D}_0\cdot d\mathbf{S} = \int_{V_G} \rho\,dV \tag{3.38}$$

in which $V_G$ is the volume bounded by the closed surface $S_G$. Since $V_G$ is completely
arbitrary, the integrands of the two volume integrals in (3.38) must equate point by
point; thus

$$\nabla\cdot\mathbf{D}_0 = \rho \tag{3.39}$$

But one can also write

$$\mathbf{D}_0 = \epsilon_0\mathbf{E} = \epsilon_0(-\nabla\Phi)$$

so that (3.39) can be rewritten

$$\nabla^2\Phi = -\frac{\rho}{\epsilon_0} \tag{3.40}$$

This important differential equation is due to Poisson. It relates the spatial derivatives
of electrostatic potential at a point to the volume charge density at the point and can
be viewed as the differential form of Gauss' law. The voltage distributions inside
vacuum tubes are solutions to (3.40) and it is the basis for analysis of electron beam
compositions, for design problems in electron beam shaping, and for the determination
of electron densities in plasmas.
The formal solution to this differential equation has already been found and is given
by (3.18). However, Equation (3.18) is principally useful in problems for which the
charge distribution is known beforehand and one desires to find the potential function.
There are also problems in which one begins by knowing neither the potential function
nor the charge distribution and wishes to determine both; in such cases it is often
advantageous to begin with (3.40) and seek a particular solution.
EXAMPLE 3.16

Consider two parallel plates composed of conducting material. As shown in the figure, one
plate has its interior surface situated at x = 0 and it is to be supposed that this plate is
heated so that it emits electrons into the interspace. The other plate has its interior surface
at x = l. A constant potential difference is applied between the plates by a battery so that
the unheated plate is Vb volts above the heated plate, thus attracting the emitted electrons.
A steady time-independent current results and this device can be recognized as a rudimentary model of a diode. It is desired to find the distribution of electrons and potential in the
interspace and also the connection between current and plate voltage.
One may wonder why this problem, which involves moving charges, is treated in a
chapter on electrostatics. However, since the current is time-independent, at any point
between the plates there always must be as much charge arriving per unit time as leaving, so
the amount of charge at the point remains constant. Thus, even though the identity of the
charges in a volume element $dV$ keeps changing, the amount of charge $\rho\,dV$ does not.
Therefore, $\rho$ may be a function of space but not of time and Poisson's Equation (3.40) may
be used to deduce $\Phi$, which also will be a function of space but not of time.

[Figure: the two parallel plates, with the heated plate at $x = 0$, where $\Phi = 0$, and the unheated plate at $x = l$.]

For simplicity the plate dimensions will be taken large compared to $l$ so that variations
of voltage and charge density in the transverse directions may be ignored. Then (3.40)
becomes

$$\frac{d^2\Phi}{dx^2} = -\frac{\rho(x)}{\epsilon_0}$$

It will be assumed that the electrons, under the repulsive action of the electron cloud already
in the interspace, are barely able to get out of the cathode, emerging with negligible
initial energy. Then if $v(x)$ is the electron velocity at a distance $x$ from the heated plate
(cathode),

$$\tfrac{1}{2}mv^2 = e\Phi(x)$$

relates the kinetic energy of the electron to the work that has been done on it by the field,
as it moves from the cathode through a distance $x$. In the above expression $-e$ and $m$ are the
electronic charge and mass and the cathode has been assumed grounded.
Further, since $\rho$ is the volume density of charge, $\rho v\,dA\,dt$ is the charge in a tube of cross-section $dA$ and of directed length $v\,dt$. All this charge will pass through the tube in time $dt$.
Defining current as charge/sec passing a cross section, one can write

$$\iota\,dA = \frac{\rho v\,dA\,dt}{dt}$$

in which $\iota$ is the current density, expressed in amp/m$^2$. Thus

$$\iota = \rho v$$

is the relation between current and the flow of charge. In this problem, $\iota$ is a constant, being
independent of both time and space. $\rho$ and $v$ are both functions of $x$ but their product is not.
With the aid of these two auxiliary relations, Poisson's equation can be rewritten

$$\frac{d^2\Phi}{dx^2} = -\frac{\iota}{\epsilon_0}\left(\frac{m}{2e}\right)^{1/2}\Phi^{-1/2}$$

which is readily integrated to give

$$\Phi(x) = V_b\left(\frac{x}{l}\right)^{4/3}$$

a solution which satisfies $\Phi(0) = 0$ and $\Phi(l) = V_b$ as well as the condition $d\Phi/dx = 0$ at
$x = 0$, imposed by the assumption that the electrons barely get out of the cathode.
From this it follows that

$$v(x) = \left(\frac{2e}{m}V_b\right)^{1/2}\left(\frac{x}{l}\right)^{2/3} \qquad \rho(x) = -\frac{4}{9}\,\epsilon_0 V_b\,(xl^2)^{-2/3}$$

so that

$$\iota = -K\,\frac{V_b^{3/2}}{l^2} \tag{3.41}$$

in which

$$\frac{K}{l^2} = \frac{4}{9}\,\epsilon_0\left(\frac{2e}{m}\right)^{1/2}\frac{1}{l^2} = \frac{2.33\times 10^{-6}}{l^2}$$

with $l$ expressed in meters.

Equation (3.41) is known as the Child-Langmuir law and is obtained for any geometry
of cathode and plate; the only factor affected by a change in geometry is the parameter
$K$, and its variability is not great. This nonlinear relation between current and voltage in a
diode is a most vital characteristic, being responsible, as an example, for a useful technique
in signal detection. Equation (3.41) has been verified extensively by experiments employing
a variety of geometries.
The presence of a minus sign in (3.41) may seem surprising but it can be traced to the
equation $\iota = \rho v$. The electrons have velocities in the positive X direction but their charge
density $\rho$ is negative so that $\iota$ is negative; that is, it constitutes a current in the negative X
direction.
Plots of $\Phi$, $\rho$, and $v$ versus $x$ can be found in the graph.
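The relations of this example can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original text: the function names are hypothetical and the physical constants are standard SI values. It evaluates the constant $(4/9)\,\epsilon_0 (2e/m)^{1/2}$ and confirms that the product of the four-thirds-power charge density and the corresponding electron velocity is the same at every $x$ and equals the current density of (3.41).

```python
import math

EPS0 = 8.8541878128e-12        # permittivity of free space, F/m
E_CHARGE = 1.602176634e-19     # magnitude of the electronic charge, C
E_MASS = 9.1093837015e-31      # electronic mass, kg


def child_langmuir_k():
    """K = (4/9) * eps0 * sqrt(2e/m), in SI units."""
    return (4.0 / 9.0) * EPS0 * math.sqrt(2.0 * E_CHARGE / E_MASS)


def current_density(v_b, l):
    """Space-charge-limited current density of (3.41), amp/m^2 (negative)."""
    return -child_langmuir_k() * v_b ** 1.5 / l ** 2


def potential(x, v_b, l):
    """Four-thirds-power potential distribution between the plates."""
    return v_b * (x / l) ** (4.0 / 3.0)


def velocity(x, v_b, l):
    """Electron speed from (1/2) m v^2 = e * Phi(x)."""
    return math.sqrt(2.0 * E_CHARGE * potential(x, v_b, l) / E_MASS)


def charge_density(x, v_b, l):
    """Volume charge density, -(4/9) eps0 V_b (x l^2)^(-2/3)."""
    return -(4.0 / 9.0) * EPS0 * v_b * (x * l * l) ** (-2.0 / 3.0)


if __name__ == "__main__":
    v_b, l = 100.0, 0.01          # assumed sample values: 100 volts, 1 cm spacing
    print("K =", child_langmuir_k())
    for x in (0.002, 0.005, 0.01):
        print(x, charge_density(x, v_b, l) * velocity(x, v_b, l), current_density(v_b, l))
```

The three printed products agree with each other, which is the statement that $\rho$ and $v$ vary with $x$ while their product does not.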

3.10 LAPLACE'S EQUATION

For those electrostatic problems in which the charge distribution is known completely,
Equation (3.18) can be used to determine the potential function and the relation
$\mathbf{E} = -\nabla\Phi$ can then be used to deduce the field intensity. It has been noted earlier,
however, that problems in which the charge distribution is not known beforehand arise
frequently. When this is the case the potential function in regions containing charge
can be obtained by solving Poisson's equation. Unfortunately, solutions to this equation are very difficult to obtain except for a limited class of relatively simple situations.
However, an extensive number of practical problems exists in which the charges are
confined to the surfaces of conductors, or otherwise constrained to occupy a limited
region. Under such conditions it is often advantageous first to determine the potential
distribution in the adjacent charge-free regions. For such regions, Poisson's equation
reduces to
$$\nabla^2\Phi = 0 \tag{3.42}$$
which is known as the Laplace equation after its discoverer.
Solutions to Laplace's equation must be chosen to match the conditions at the
boundaries of the charge-free regions, which is the link whereby the charges of the
system affect the potential distributions. For example, if all the boundaries of the
charge-free regions are conducting surfaces, then the constant potential over each
of these surfaces might be specified. This is called a Dirichlet problem. Solving Laplace's
equation for the potential subject to these boundary conditions, and forming the
gradient, permits one to determine Do at all boundary points; by this means the charge
distributions over the conducting surfaces are deduced.
Alternatively, the charge distributions over all the boundary surfaces may be specified, which is equivalent to stating the normal derivative of potential as a boundary
condition. This is called a Neumann problem. Solving Laplace's equation for the
potential, subject to these boundary conditions, yields not only the potential distribution throughout the charge-free region but also the potential of each conducting surface
forming a boundary.
It is of value to know that the solutions to Laplace's equation so obtained are unique.
To see that this is the case, imagine that two functions $\Phi_1$ and $\Phi_2$ have been found,
each of which satisfies Laplace's equation in the charge-free regions, and each of which
satisfies the boundary conditions. Then, since Laplace's equation is linear, their difference $\Phi_1 - \Phi_2$ is also a solution and one can use the divergence theorem to write

$$\oint_S (\Phi_1 - \Phi_2)\,\nabla(\Phi_1 - \Phi_2)\cdot d\mathbf{S} = \int_V \nabla\cdot\left[(\Phi_1 - \Phi_2)\,\nabla(\Phi_1 - \Phi_2)\right]dV \tag{3.43}$$

in which $S$ is the totality of boundary surfaces of the charge-free regions and $V$ is the
entire charge-free volume.
But the integrand in the surface integral of (3.43) is the product of $\Phi_1 - \Phi_2$ and the
normal derivative of $\Phi_1 - \Phi_2$. One or the other of these factors is zero on any surface
element $dS$ because both $\Phi_1$ and $\Phi_2$ are assumed to satisfy the boundary conditions. Therefore

$$0 = \int_V \left[(\Phi_1 - \Phi_2)\,\nabla^2(\Phi_1 - \Phi_2) + |\nabla(\Phi_1 - \Phi_2)|^2\right]dV$$

in which use has been made of the vector identity for the divergence of the product of a
scalar and a vector, given in the Mathematical Supplement. Because $\Phi_1 - \Phi_2$ satisfies
Laplace's equation everywhere in $V$ this reduces to

$$\int_V |\nabla(\Phi_1 - \Phi_2)|^2\,dV = 0 \tag{3.44}$$

Since the integrand of (3.44) can nowhere be negative, it follows that

$$\nabla(\Phi_1 - \Phi_2) \equiv 0$$

Thus $\Phi_1$ and $\Phi_2$ can differ from each other at most by an additive constant. This
constant will have no influence on potential differences between points in $V$ and
disappears in taking the gradient. Thus, $\Phi_1$ and $\Phi_2$ yield the same electric field distribution and the solution is unique.

3.11 SOLUTIONS TO LAPLACE'S EQUATION IN RECTANGULAR COORDINATES

When the electrostatic potential $\Phi$ is expressed as a function of the three Cartesian
coordinates $x$, $y$, $z$, Laplace's equation takes the form†

$$\frac{\partial^2\Phi}{\partial x^2} + \frac{\partial^2\Phi}{\partial y^2} + \frac{\partial^2\Phi}{\partial z^2} = 0 \tag{3.45}$$

Using the method of separation of variables, one can assume a product solution of the
form

$$\Phi(x,y,z) = f_1(x)\,f_2(y)\,f_3(z) \tag{3.46}$$

Since $\Phi$ is a real function, it is convenient to assume that the functions $f_i$ are real also.
Upon substituting (3.46) into (3.45) one obtains

$$\frac{1}{f_3}\frac{d^2f_3}{dz^2} = -\frac{1}{f_1}\frac{d^2f_1}{dx^2} - \frac{1}{f_2}\frac{d^2f_2}{dy^2} \tag{3.47}$$

Since the right side of (3.47) is at most a function of $x$ and $y$, whereas the left side
can be a function only of $z$, it follows that both sides must be equal to the same constant.
For convenience this constant, which may have any real value, will be designated by
$k_z^2$. Then

$$\frac{d^2f_3}{dz^2} - k_z^2 f_3 = 0 \tag{3.48}$$

$$-\frac{1}{f_1}\frac{d^2f_1}{dx^2} = \frac{1}{f_2}\frac{d^2f_2}{dy^2} + k_z^2 \tag{3.49}$$

The left side of (3.49) is at most a function of $x$ whereas the right side can be a
function only of $y$. Consequently, both sides must equal a constant, say $k_x^2$, so that

$$\frac{d^2f_1}{dx^2} + k_x^2 f_1 = 0 \tag{3.50}$$

$$\frac{d^2f_2}{dy^2} + k_y^2 f_2 = 0 \tag{3.51}$$

in which

$$k_x^2 + k_y^2 = k_z^2 \tag{3.52}$$

If no one of the three constants $k_x$, $k_y$, $k_z$ is zero, appropriate solutions of (3.48),
(3.50), and (3.51) give

$$\Phi(x,y,z) = \{e^{jk_x x}\}\{e^{jk_y y}\}\{e^{k_z z}\} \tag{3.53}$$

in which the brace notation is intended to signify

$$\{e^{jk_x x}\} = a\,e^{jk_x x} + b\,e^{-jk_x x}$$

with $a$ and $b$ constants; etc.

If any one of the separation constants is zero, for example, if $k_x = 0$, then

$$f_1(x) = cx + d \tag{3.54}$$

and the appropriate factor in (3.53) is replaced by this linear solution.

Since Laplace's equation is linear, any sum of solutions of the type (3.53) is also a
valid representation for $\Phi(x,y,z)$. A particularly useful combination, applicable when
the potential is repetitive in intervals $L_x$ and $L_y$ in the X and Y directions is

$$\Phi(x,y,z) = \sum_{m=-\infty}^{\infty}\sum_{n=-\infty}^{\infty} \exp\left(j\frac{2\pi mx}{L_x}\right)\exp\left(j\frac{2\pi ny}{L_y}\right)\left[a_{mn}\exp\left(2\pi\sqrt{\frac{m^2}{L_x^2} + \frac{n^2}{L_y^2}}\;z\right) + b_{mn}\exp\left(-2\pi\sqrt{\frac{m^2}{L_x^2} + \frac{n^2}{L_y^2}}\;z\right)\right] \tag{3.55}$$

The complex constant coefficients $a_{mn}$ and $b_{mn}$ may be determined from the boundary
conditions by the usual Fourier techniques. This formulation can be extended to nonrepetitive geometries by replacing (3.55) with a Fourier integral.

† Cf. Mathematical Supplement, Sec. V.16.
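The role of the relation among the separation constants can be seen in a quick numerical experiment. The sketch below is illustrative only: the function names, the particular real product $\cos(k_x x)\sin(k_y y)e^{-k_z z}$, and the sample values are assumptions, not taken from the text. A central-difference Laplacian is close to zero when $k_z^2 = k_x^2 + k_y^2$ and clearly nonzero when that condition is violated.

```python
import math


def product_solution(x, y, z, kx, ky, kz=None):
    """cos(kx x) sin(ky y) exp(-kz z); kz defaults to sqrt(kx^2 + ky^2)."""
    if kz is None:
        kz = math.hypot(kx, ky)
    return math.cos(kx * x) * math.sin(ky * y) * math.exp(-kz * z)


def laplacian(f, x, y, z, h=1e-3):
    """Central-difference estimate of the Laplacian of f at (x, y, z)."""
    neighbours = (f(x + h, y, z) + f(x - h, y, z) +
                  f(x, y + h, z) + f(x, y - h, z) +
                  f(x, y, z + h) + f(x, y, z - h))
    return (neighbours - 6.0 * f(x, y, z)) / h ** 2


def matched(x, y, z):
    """Product solution whose constants satisfy kz^2 = kx^2 + ky^2."""
    return product_solution(x, y, z, 2.0, 3.0)


def mismatched(x, y, z):
    """Same product with a deliberately wrong kz."""
    return product_solution(x, y, z, 2.0, 3.0, kz=1.0)


if __name__ == "__main__":
    print("matched:   ", laplacian(matched, 0.3, 0.4, 0.5))
    print("mismatched:", laplacian(mismatched, 0.3, 0.4, 0.5))
```

The small residual in the matched case is the truncation error of the difference formula; it shrinks as the step `h` is reduced.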
EXAMPLE 3.17

Consider again the case of two parallel conducting plates, as first treated in Example 3.16,
but now assume that neither plate is heated, although they still differ in potential by $V_b$
volts, the plate at $x = l$ being at the higher potential. This is now a parallel plate capacitance
problem, and with no electron emission occurring, Laplace's equation applies for the region
between the plates. Assuming transverse dimensions large compared to $l$, $\Phi$ can be taken
independent of $y$ and $z$ and Equation (3.54) is an appropriate solution. Inserting the boundary conditions that $\Phi(0) = 0$ and $\Phi(l) = V_b$ gives

$$\Phi(x) = V_b\,\frac{x}{l} \tag{3.56}$$

and thus the potential increases linearly from one plate to the other. This should be contrasted to the heated cathode case of Example 3.16 in which the space charge caused the
potential to increase as the four-thirds power of distance. The two potential distributions
are shown in the figure.

[Figure: $\Phi(x)$ versus $x$ for the two cases, the linear distribution and the four-thirds power distribution.]

From (3.56) one can deduce that the electric field is

$$\mathbf{E} = -\nabla\Phi = -\mathbf{1}_x\frac{d\Phi}{dx} = -\mathbf{1}_x\frac{V_b}{l} \tag{3.57}$$

Therefore the electric field is uniform between the plates and

$$\mathbf{D}_0 = \epsilon_0\mathbf{E} = -\mathbf{1}_x\frac{\epsilon_0 V_b}{l} \tag{3.58}$$

The plate at $x = l$ has a uniform surface charge density given by

$$\sigma_0 = D_0 = \frac{\epsilon_0 V_b}{l} \tag{3.59}$$

there being an equal and opposite distribution on the plate at $x = 0$. The subscript on $\sigma$ is
a reminder that these plates are in a vacuum: later this configuration will be reconsidered in
the presence of a dielectric.
EXAMPLE 3.18

As an illustration of the applicability of harmonic solutions, consider an array of thin conducting strips lying in the $z = 0$ plane. Each strip is $a/2$ units wide in the X direction and
infinitely long in the Y direction. Alternate strips are at potentials of $V$ and $-V$ volts,
and insulated from each other by negligibly thin spacings $b$. This geometry is suggested by
the figure. It is desired to find the potential distribution in the upper half space $z > 0$.
First of all it is evident by applying (3.55) that one should select $n = 0$ to be compatible
with no variations of potential in the Y direction. Also, $a_{m0}$ should be set equal to zero for
all $m$ since the associated exponential term in $z$ diverges as $z \to \infty$. With these simplifications,
(3.55) reduces to

$$\Phi(x,z) = \sum_{m=-\infty}^{\infty} b_m \exp\left(j\frac{2\pi mx}{a}\right)\exp\left(-\frac{2\pi |m| z}{a}\right)$$

and this solution is subject to the boundary condition that the potential is a square wave in
$x$ when $z = 0$. Thus

$$\Phi(x,0) = \sum_{m=-\infty}^{\infty} b_m \exp\left(j\frac{2\pi mx}{a}\right)$$

must agree with the potential plot indicated by the graph.

[Figure: the array of strips in the $z = 0$ plane, and the square-wave plot of $\Phi(x,0)$ alternating between $V$ and $-V$.]

This is a well-known problem in Fourier series¹⁵ and the coefficients are given by

$$b_m = \frac{1}{a}\int_{-a/2}^{a/2} \Phi(x,0)\exp\left(-j\frac{2\pi mx}{a}\right)dx = \frac{V}{j\pi m}\,(1 - \cos m\pi)$$

for $m \neq 0$ ($b_0 = 0$). Therefore

$$\Phi(x,z) = \sum_{m=1}^{\infty} 2j\,b_m \sin\frac{2\pi mx}{a}\exp\left(-\frac{2\pi mz}{a}\right)$$

$$= \frac{2V}{\pi}\sum_{m=1}^{\infty}\frac{1 - \cos m\pi}{m}\sin\frac{2\pi mx}{a}\exp\left(-\frac{2\pi mz}{a}\right)$$

$$= \frac{4V}{\pi}\left[\sin\frac{2\pi x}{a}\exp\left(-\frac{2\pi z}{a}\right) + \frac{1}{3}\sin\frac{6\pi x}{a}\exp\left(-\frac{6\pi z}{a}\right) + \frac{1}{5}\sin\frac{10\pi x}{a}\exp\left(-\frac{10\pi z}{a}\right) + \cdots\right]$$

This is a series in which the higher harmonic terms decay very rapidly in the + Z direction.
One does not need to be very far above the plane Z = 0 in order to find a potential distribution which is almost a pure sinusoid in the X direction.
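The rapid decay of the higher harmonics can be made concrete with a short computation. This is a sketch under stated assumptions rather than part of the original example: the function names, the normalization $V = 1$, $a = 1$, and the choice that the strip occupying $0 < x < a/2$ is the one at $+V$ (which is what makes the expansion a pure sine series) are all illustrative.

```python
import math


def strip_potential(x, z, V=1.0, a=1.0, terms=200):
    """Partial sum of the odd-harmonic series for Phi(x, z), valid for z >= 0."""
    total = 0.0
    for m in range(1, 2 * terms, 2):          # odd harmonics m = 1, 3, 5, ...
        k = 2.0 * math.pi * m / a
        total += math.sin(k * x) * math.exp(-k * z) / m
    return 4.0 * V / math.pi * total


def fundamental(x, z, V=1.0, a=1.0):
    """First term of the series alone."""
    k = 2.0 * math.pi / a
    return 4.0 * V / math.pi * math.sin(k * x) * math.exp(-k * z)


if __name__ == "__main__":
    x = 0.25                                  # middle of a strip held at +V
    for z in (0.0, 0.1, 0.5):
        print(z, strip_potential(x, z), fundamental(x, z))
```

At $z = 0$ the partial sum approaches the strip potential, while at a height of only half a period the full series and its first term differ by less than one part in a thousand.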

3.12 SOLUTIONS TO LAPLACE'S EQUATION IN CYLINDRICAL COORDINATES

For problems involving boundaries which are coordinate surfaces in a cylindrical geometry, it is advantageous to express Laplace's equation in terms of the cylindrical variables $(r,\phi,z)$. For this case Equation (V.86) of the Mathematical Supplement becomes

$$\frac{\partial^2\Phi}{\partial r^2} + \frac{1}{r}\frac{\partial\Phi}{\partial r} + \frac{1}{r^2}\frac{\partial^2\Phi}{\partial\phi^2} + \frac{\partial^2\Phi}{\partial z^2} = 0 \tag{3.60}$$

¹⁵ See, e.g., I. S. Sokolnikoff and R. M. Redheffer, Mathematics of Physics and Modern Engineering,
pp. 180-181, McGraw-Hill Book Company, New York, 1958.

Once again if the method of separation of variables is used, a product of three real functions can be assumed in the form

$$\Phi(r,\phi,z) = f_1(r)\,f_2(\phi)\,f_3(z) \tag{3.61}$$

which leads to the three ordinary differential equations

$$\frac{d^2f_3}{dz^2} - k^2 f_3 = 0 \tag{3.62}$$

$$\frac{d^2f_2}{d\phi^2} + \nu^2 f_2 = 0 \tag{3.63}$$

$$\frac{d^2f_1}{dr^2} + \frac{1}{r}\frac{df_1}{dr} + \left(k^2 - \frac{\nu^2}{r^2}\right)f_1 = 0 \tag{3.64}$$

in which the separation constants $k^2$ and $\nu^2$ are arbitrary real numbers. Some of this
arbitrariness is removed upon writing the solution for (3.63) as

$$f_2(\phi) = \{e^{j\nu\phi}\}$$

and thus recognizing that $\nu$ must be an integer $n$ if the range of $\phi$ is unrestricted and the
potential is to be single-valued. Imposing this condition, one may write

$$f_2(\phi) = \{e^{jn\phi}\} \tag{3.65}$$

Ordinarily, for the special case $n = 0$, the linear solution

$$f_2(\phi) = c\phi + d \tag{3.66}$$

would be indicated. However, the requirement that $\Phi$ be single-valued imposes the constraint that the constant $c$ in (3.66) be zero. The remaining constant $d$ can be accommodated in (3.65) by permitting that solution to apply also for the case $n = 0$. Thus (3.65)
will be used for $n = 0, 1, 2, \ldots$.
The solution of (3.62) proceeds in similar fashion, giving

$$f_3(z) = \{e^{kz}\} \qquad k \neq 0 \tag{3.67}$$

$$f_3(z) = c'z + d' \qquad k = 0 \tag{3.68}$$

However, no integer restriction exists on the allowable values of $k$; indeed, since $k^2$ can
be any real constant, $k$ can be any pure real number or any pure imaginary number.
If both $k$ and $n$ are zero, (3.64) has the simple solution

$$f_1(r) = a\ln r + b \qquad n = k = 0 \tag{3.69}$$

whereas if only $k$ equals zero

$$f_1(r) = \{r^n\} \qquad k = 0 \tag{3.70}$$

which can be verified by substitution.

For $k \neq 0$ it is best to proceed by introducing the substitution variable $v = kr$,
which converts (3.64) to

$$\frac{d^2f_1}{dv^2} + \frac{1}{v}\frac{df_1}{dv} + \left(1 - \frac{n^2}{v^2}\right)f_1 = 0 \tag{3.71}$$

This can be recognized as Bessel's differential equation.

It will be assumed that the reader is familiar with the details of the method used for
solving (3.71) and only the principal results will be stated here.¹⁶ The assumption of a
power series representation for $f_1(v)$ leads to the conclusion that (3.71) has two independent solutions given by

$$J_n(kr) = \sum_{m=0}^{\infty}\frac{(-1)^m (kr/2)^{n+2m}}{m!\,(n+m)!} \tag{3.72}$$

$$Y_n(kr) = \frac{2}{\pi}\left(\gamma + \ln\frac{kr}{2}\right)J_n(kr) - \frac{1}{\pi}\sum_{m=0}^{n-1}\frac{(n-m-1)!}{m!}\left(\frac{2}{kr}\right)^{n-2m} - \frac{1}{\pi}\sum_{m=0}^{\infty}\frac{(-1)^m (kr/2)^{n+2m}}{m!\,(n+m)!}\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{m} + 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n+m}\right) \tag{3.73}$$

where $\gamma = 0.5772$.

These solutions are known as Bessel functions of the first and second kind. They
appear formidable but are rarely needed in these forms in practice, since both functions
are tabulated.¹⁷ For $k$ real, the two functions are oscillatory, this feature being shown in
Figure 3.12 for the first few values of $n$.
For large arguments ($kr \gg 1, n$),

$$J_n(kr) \to \sqrt{\frac{2}{\pi kr}}\,\cos\left(kr - \frac{n\pi}{2} - \frac{\pi}{4}\right) \tag{3.74}$$

$$Y_n(kr) \to \sqrt{\frac{2}{\pi kr}}\,\sin\left(kr - \frac{n\pi}{2} - \frac{\pi}{4}\right) \tag{3.75}$$

For this reason it is convenient to introduce particular linear combinations of $J_n$ and
$Y_n$ through the definitions

$$H_n^{(1)}(kr) = J_n(kr) + jY_n(kr) \tag{3.76}$$

$$H_n^{(2)}(kr) = J_n(kr) - jY_n(kr) \tag{3.77}$$

These combinations also form a fundamental set of solutions to Bessel's equation and
are known as Bessel functions of the third kind, or more commonly as Hankel functions.
For large arguments they have the asymptotic forms $(2/\pi kr)^{1/2}\exp[\pm j(kr - n\pi/2 - \pi/4)]$ and thus, when combined with the harmonic time function $e^{j\omega t}$, represent incoming
and outgoing cylindrical waves. Their principal utility will arise later in the discussion
of time-varying fields.

¹⁶ Many excellent discussions of Bessel's equation and the properties of its solutions exist in the literature. For example, the interested reader is referred to J. Irving and N. Mullineux, Mathematics in
Physics and Engineering, pp. 75-82, 128-174, Academic Press, New York, 1959.
¹⁷ See, e.g., the tables appended to G. N. Watson, A Treatise on the Theory of Bessel Functions, 2d ed.,
pp. 666-752, Cambridge Press, London, 1952.
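Although tables (or library routines) are the practical route, the power series for $J_n$ is easy to evaluate directly for moderate arguments. The following sketch is illustrative only; the function names are hypothetical and the truncation at 60 terms is an assumption that is adequate for arguments up to about 20. It sums the series and compares the result with the large-argument cosine form.

```python
import math


def bessel_j(n, x, terms=60):
    """J_n(x) from its power series; adequate for moderate x (roughly x <= 20)."""
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m * (x / 2.0) ** (n + 2 * m)
                  / (math.factorial(m) * math.factorial(n + m)))
    return total


def bessel_j_asymptotic(n, x):
    """Large-argument form sqrt(2/(pi x)) cos(x - n pi/2 - pi/4)."""
    return math.sqrt(2.0 / (math.pi * x)) * math.cos(x - n * math.pi / 2.0 - math.pi / 4.0)


if __name__ == "__main__":
    for x in (1.0, 5.0, 10.0, 20.0):
        print(x, bessel_j(0, x), bessel_j_asymptotic(0, x))
```

For large arguments the alternating series suffers from cancellation between very large terms, which is one reason tabulated values or the asymptotic forms are preferred there.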

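FIGURE 3.12 Lower order Bessel functions of the first and second kind.

For small arguments ($kr \ll 1$),

$$J_n(kr) \to \frac{1}{n!}\left(\frac{kr}{2}\right)^n \tag{3.78}$$

$$Y_n(kr) \to \begin{cases} \dfrac{2}{\pi}\left(0.5772 + \ln\dfrac{kr}{2}\right) & n = 0 \\[2ex] -\dfrac{(n-1)!}{\pi}\left(\dfrac{2}{kr}\right)^n & n \neq 0 \end{cases} \tag{3.79}$$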

Regardless of the value of the index n, the Bessel functions of the second kind are seen
to possess a singularity at r = 0 and thus they must be excluded from the solutions to
physical problems in regions containing the Z axis, unless a line source exists at r = O.

If $k$ is imaginary, the series solutions (3.72) and (3.73) are still valid, but for convenience a pair of modified Bessel functions is employed. Letting $k = j\ell$, with $\ell$ real,
since the series (3.72) is even or odd according to the nature of $n$, it is possible to define a
real function by the relation

$$I_n(\ell r) = j^{-n} J_n(j\ell r) \tag{3.80}$$

If one attempts similarly to modify $Y_n$, a study of the series (3.73) reveals that a comparably simple definition will yield a complex function, a result which is unwieldy
when one recalls that an effort is being made to represent a real potential function $\Phi$.
However, this difficulty can be avoided by introducing the function

$$K_n(\ell r) = \frac{\pi}{2}\,j^{\,n+1} H_n^{(1)}(j\ell r) \tag{3.81}$$

Inspection of the complex sum of the two series (3.72) and (3.73) for imaginary argument reveals that $K_n(\ell r)$ is a real function; further, $I_n$ and $K_n$ comprise an independent
set of solutions to Bessel's equation and their asymptotic forms for large argument are
the simple expressions

$$I_n(\ell r) \to \frac{e^{\ell r}}{\sqrt{2\pi\ell r}} \tag{3.82}$$

$$K_n(\ell r) \to \sqrt{\frac{\pi}{2\ell r}}\;e^{-\ell r} \tag{3.83}$$

where $\ell r \gg 1, n$.
The functions $I_n$ and $K_n$ do not oscillate and only $K_n$ is well-behaved at infinity. The
graphical forms for the first few modified Bessel functions are shown in Figure 3.13.
For low values of $n$ they are widely tabulated.¹⁸
In summary of these results, $\Phi(r,\phi,z)$ may be composed of a suitable product of three
factors from among:

$$\begin{array}{rll}
f_1(r) & = a\ln r + b & n = k = 0 \\
 & = \{r^n\} & k = 0 \\
 & = a_n J_n(kr) + b_n Y_n(kr) & k \text{ real, nonzero} \\
 & = a_n I_n(\ell r) + b_n K_n(\ell r) & k = j\ell \text{ imaginary, nonzero} \\
f_2(\phi) & = \{e^{jn\phi}\} & n = 0, 1, 2, \ldots \\
f_3(z) & = c'z + d' & k = 0 \\
 & = \{e^{kz}\} & k \text{ real, nonzero} \\
 & = \{e^{j\ell z}\} & k = j\ell \text{ imaginary, nonzero}
\end{array} \tag{3.84}$$

One can observe from (3.84) that oscillatory functions in $r$ combine with nonoscillatory
functions in $z$ and vice versa.
Since Laplace's equation is linear, products formed from the factors in (3.84) can be
summed to give more general solutions. Of particular value are the summations of
products which comprise complete orthogonal sets. For example, if the potential is repetitive in the Z direction with a characteristic length $L_z$, then an appropriate formulation is

$$\Phi(r,\phi,z) = \sum_{m=-\infty}^{\infty}\sum_{n=-\infty}^{\infty}\left[a_{mn}\,I_n\!\left(\frac{2\pi mr}{L_z}\right) + b_{mn}\,K_n\!\left(\frac{2\pi mr}{L_z}\right)\right]e^{jn\phi}\,e^{j2\pi mz/L_z} \tag{3.85}$$

This can be recognized as a double Fourier series in $\phi$ and $z$. The complex coefficients
$a_{mn}$ and $b_{mn}$ can be determined in the usual way through knowledge of the boundary
conditions. The terms for $n = 0$, $k = 0$, must be treated individually in that the linear
forms in (3.84) should be substituted where needed.

FIGURE 3.13 Lower order modified Bessel functions.

¹⁸ See, e.g., G. N. Watson, loc. cit.
An orthogonal set of functions can also be generated in the radial direction. The procedure can be illustrated with the function $J_1(kr)$ which is plotted in Figure 3.14a out to
its first root $\gamma_{11}$, in Figure 3.14b out to its second root $\gamma_{12}$, and in Figure 3.14c out
to its third root $\gamma_{13}$. If each of these curves is stretched (or compressed) to a common
length $r_0$ and multiplied by $\sqrt{r}$, the result is the family of curves shown in Figures
3.14d-f. These curves are plots of the functions $\sqrt{r}\,J_1(\gamma_{1m}r/r_0)$ in the interval
$0 \le r \le r_0$. The functions form part of an orthogonal set, and this procedure can be
followed for any value of $n$ since

$$\int_0^{r_0} r\,J_n\!\left(\gamma_{nm}\frac{r}{r_0}\right)J_n\!\left(\gamma_{np}\frac{r}{r_0}\right)dr = \frac{r_0^2}{2}\,J_{n+1}^2(\gamma_{nm})\,\delta_{mp} \tag{3.86}$$

for all integral values of $n$, and for any integral values of $m \ge 1$, $p \ge 1$. In (3.86) the
symbol $\delta_{mp}$ is the Kronecker delta and has the value unity if $m = p$, but is otherwise
zero. Equation (3.86) is known as the orthogonality relation for the Bessel functions of
the first kind. Its derivation can be found in Appendix C together with a collection of
the more useful recursion relations and differential and integral formulas which connect
different Bessel functions.

FIGURE 3.14 Construction of orthogonal Bessel functions.

When this type of expansion is appropriate, the potential function can be expressed in
the form

$$\Phi(r,\phi,z) = \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty} J_n\!\left(\gamma_{nm}\frac{r}{r_0}\right)e^{jn\phi}\left[a_{mn}\,e^{\gamma_{nm}z/r_0} + b_{mn}\,e^{-\gamma_{nm}z/r_0}\right] \tag{3.87}$$

The complex coefficients $a_{mn}$ and $b_{mn}$ can be evaluated for the given boundary conditions
with the aid of (3.86) and the usual Fourier formula associated with the series in $\phi$.
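The orthogonality relation for the Bessel functions of the first kind can be spot-checked by numerical integration. The sketch below is an illustration, not the derivation of Appendix C: the helper names are hypothetical, the Bessel functions come from a truncated power series, the integral is estimated with Simpson's rule, and the two roots of $J_0$ are standard tabulated values entered by hand.

```python
import math

GAMMA_01 = 2.404825557695773   # first root of J_0 (standard tabulated value)
GAMMA_02 = 5.520078110286311   # second root of J_0 (standard tabulated value)


def bessel_j(n, x, terms=40):
    """J_n(x) from its power series; adequate for the small arguments used here."""
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m * (x / 2.0) ** (n + 2 * m)
                  / (math.factorial(m) * math.factorial(n + m)))
    return total


def overlap(n, gamma_m, gamma_p, r0=1.0, steps=400):
    """Simpson's-rule estimate of the integral of r J_n(gamma_m r/r0) J_n(gamma_p r/r0)
    over 0 <= r <= r0 (steps must be even)."""
    h = r0 / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * r * bessel_j(n, gamma_m * r / r0) * bessel_j(n, gamma_p * r / r0)
    return total * h / 3.0


if __name__ == "__main__":
    print("m != p:", overlap(0, GAMMA_01, GAMMA_02))
    print("m == p:", overlap(0, GAMMA_01, GAMMA_01),
          "expected:", 0.5 * bessel_j(1, GAMMA_01) ** 2)
```

The first integral is zero to within the integration error, and the second matches $(r_0^2/2)\,J_1^2(\gamma_{01})$ with $r_0 = 1$.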
EXAMPLE 3.19

Consider the case of two concentric conducting cylinders with the figure of Example 3.9
once again applicable. With the inner cylinder maintained at a potential $V_b$ volts above the
outer cylinder, let it be required to find the potential distribution in the space between
cylinders.
By symmetry, the answer should be independent of $\phi$ and $z$, so the proper selection from
(3.84) is the logarithmic form

$$\Phi(r) = a\ln r + b$$

in which $a$ and $b$ are the constants of (3.84). Use of the boundary conditions $\Phi(b) = 0$, $\Phi(a) = V_b$,
in which $a$ and $b$ now denote the radii of the inner and outer cylinders, gives

$$\Phi(r) = V_b\,\frac{\ln(r/b)}{\ln(a/b)}$$

From this

$$\mathbf{D}_0 = -\epsilon_0\nabla\Phi = -\mathbf{1}_r\,\frac{\epsilon_0 V_b}{r\ln(a/b)}$$

so that

$$\sigma = D_0(a) = \frac{\epsilon_0 V_b/a}{\ln(b/a)}$$

and the charge per unit length on the inner cylinder is

$$2\pi a\,\sigma = \frac{2\pi\epsilon_0 V_b}{\ln(b/a)}$$

all of which is harmonious with the results of Example 3.9.

EXAMPLE 3.20

As indicated by the figure, a grounded conducting cylinder is immersed in what had been a
uniform electric field $\mathbf{E}_0 = \mathbf{1}_x E_0$. The cylinder axis coincides with the Z axis and its radius
is $r_0$. Find the potential distribution external to the cylinder.

[Figure: cross section of the cylinder in the applied field $\mathbf{E}_0$, with the field point $P(r,\phi)$.]

Since the problem has no $z$ dependence, $k = 0$ and the proper selection from (3.84) is

$$\Phi(r,\phi) = {\sum_{n=-\infty}^{\infty}}'\,(a_n r^n + b_n r^{-n})(c_n e^{jn\phi} + d_n e^{-jn\phi}) + a\ln r + b$$

in which $\sum'$ signifies that the term $n = 0$ is excluded from the summation. Symmetry conditions indicate that the solution should be even in $\phi$ and therefore the above expression
reduces to

$$\Phi(r,\phi) = \sum_{n=1}^{\infty}(a_n r^n + b_n r^{-n})\cos n\phi + a\ln r + b$$

The boundary condition at large $r$ is such that there should still be a uniform field $\mathbf{E}_0$, since
the effect of the cylinder is local. Therefore the potential at large $r$ should be

$$\Phi = -E_0 x = -E_0 r\cos\phi$$

For this reason $a = b = 0$ and $a_n = 0$, $n \neq 1$, with $a_1 = -E_0$. Thus

$$\Phi(r,\phi) = -E_0 r\cos\phi + \sum_{n=1}^{\infty} b_n r^{-n}\cos n\phi$$

The constants $b_n$ are determined by the boundary condition that $\Phi(r_0,\phi) = 0$. This gives

$$0 = -E_0 r_0\cos\phi + \sum_{n=1}^{\infty} b_n r_0^{-n}\cos n\phi$$

and therefore $b_n = 0$, $n \neq 1$, whereas $b_1 = E_0 r_0^2$. The final form of the solution is

$$\Phi(r,\phi) = \left(-E_0 r + \frac{E_0 r_0^2}{r}\right)\cos\phi$$
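The final expression can be checked against the three requirements used to obtain it. The sketch below is illustrative only: the function names and the normalization $E_0 = 1$, $r_0 = 1$ are assumptions. It confirms that the potential vanishes on the cylinder, approaches $-E_0 r\cos\phi$ far away, and satisfies the two-dimensional Laplace equation in polar coordinates to within finite-difference error.

```python
import math


def cylinder_potential(r, phi, E0=1.0, r0=1.0):
    """Potential outside a grounded cylinder of radius r0 in an applied uniform field."""
    return (-E0 * r + E0 * r0 ** 2 / r) * math.cos(phi)


def polar_laplacian(f, r, phi, h=1e-3):
    """Central-difference estimate of f_rr + f_r / r + f_phiphi / r^2."""
    centre = f(r, phi)
    d2r = (f(r + h, phi) - 2.0 * centre + f(r - h, phi)) / h ** 2
    dr = (f(r + h, phi) - f(r - h, phi)) / (2.0 * h)
    d2phi = (f(r, phi + h) - 2.0 * centre + f(r, phi - h)) / h ** 2
    return d2r + dr / r + d2phi / r ** 2


if __name__ == "__main__":
    print("on the cylinder:", cylinder_potential(1.0, 0.3))
    print("Laplacian at (2.0, 0.7):", polar_laplacian(cylinder_potential, 2.0, 0.7))
    print("far-field ratio:", cylinder_potential(1000.0, 0.3) / (-1000.0 * math.cos(0.3)))
```

The far-field ratio differs from unity by $(r_0/r)^2$, which shows how quickly the disturbance caused by the cylinder dies away.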

EXAMPLE 3.21

Consider a grounded hollow cylindrical can of radius $r_0$ and height $h$, as shown in the figure.
Its lid has been removed just sufficiently to be insulated from the can, and raised to a potential $V_0$. Find the potential distribution within the cylinder.

[Figure: the cylindrical can, of height $h$, with its insulated lid.]

This problem can be simplified from the outset by recognizing that the geometry is
$\phi$ symmetric and therefore that $n = 0$. Also, since the potential must vanish at $r = r_0$ and
be finite at $r = 0$, only the $J_0$ functions will suffice. Thus the representation (3.87) is
appropriate in the form

$$\Phi(r,z) = \sum_{m=1}^{\infty} A_m \sinh\left(\gamma_{0m}\frac{z}{r_0}\right)J_0\!\left(\gamma_{0m}\frac{r}{r_0}\right)$$

in which the exponentials in $z$ have been combined to give the hyperbolic sine in accordance
with the requirement that $\Phi \equiv 0$ over the bottom of the can.
The coefficients $A_m$ can be evaluated with the aid of (3.86) and the boundary conditions
at $z = h$. Since

$$\Phi(r,h) = V_0 = \sum_{m=1}^{\infty} A_m \sinh\left(\gamma_{0m}\frac{h}{r_0}\right)J_0\!\left(\gamma_{0m}\frac{r}{r_0}\right)$$

it follows, upon multiplying both sides by $rJ_0(\gamma_{0p}r/r_0)$ and integrating, that

$$\int_0^{r_0} V_0\,r\,J_0\!\left(\gamma_{0p}\frac{r}{r_0}\right)dr = \sum_{m=1}^{\infty} A_m \sinh\left(\gamma_{0m}\frac{h}{r_0}\right)\int_0^{r_0} r\,J_0\!\left(\gamma_{0m}\frac{r}{r_0}\right)J_0\!\left(\gamma_{0p}\frac{r}{r_0}\right)dr = A_p \sinh\left(\gamma_{0p}\frac{h}{r_0}\right)\frac{r_0^2}{2}\,J_1^2(\gamma_{0p})$$

Use of Equation (C.14) from Appendix C gives

$$\frac{d}{d(\gamma_{0p}r/r_0)}\left[\gamma_{0p}\frac{r}{r_0}\,J_1\!\left(\gamma_{0p}\frac{r}{r_0}\right)\right] = \gamma_{0p}\frac{r}{r_0}\,J_0\!\left(\gamma_{0p}\frac{r}{r_0}\right)$$

so that the above integral yields

$$A_p = \frac{2V_0}{\gamma_{0p}\,J_1(\gamma_{0p})\sinh(\gamma_{0p}h/r_0)}$$

and the potential is given by

$$\Phi(r,z) = 2V_0\sum_{m=1}^{\infty}\frac{J_0(\gamma_{0m}r/r_0)\,\sinh(\gamma_{0m}z/r_0)}{\gamma_{0m}\,J_1(\gamma_{0m})\,\sinh(\gamma_{0m}h/r_0)}$$

For specific values of ro and h the table of roots in Appendix C may be used to determine the
relative richness of harmonics in the above series.
EXAMPLE 3.22

In a variation of the preceding problem, the cylinder is insulated from the bottom lid as
well as the top lid. If the two lids are grounded and the cylinder is kept at a potential $V_0$,
the internal field can be determined by utilizing the representation (3.85). This choice is
dictated by the need to have the potential vanish at $z = 0, h$, a condition which cannot be
satisfied by hyperbolic functions of $z$. Once again there is $\phi$ symmetry, making $n = 0$, and
the functions $K_n$ cannot be used because the potential is finite at $r = 0$. Therefore

$$\Phi(r,z) = \sum_{m=1}^{\infty} A_m \sin\frac{m\pi z}{h}\;I_0\!\left(\frac{m\pi r}{h}\right)$$

Since $\Phi(r_0,z) = V_0$,

$$\int_{-h}^{h}\Phi(r_0,z)\,\sin\frac{p\pi z}{h}\,dz = \sum_{m=1}^{\infty} A_m\,I_0\!\left(\frac{m\pi r_0}{h}\right)\int_{-h}^{h}\sin\frac{m\pi z}{h}\,\sin\frac{p\pi z}{h}\,dz$$

in which the mirror potential $-V_0$ has been assumed for convenience in the range $-h < z < 0$. Thus

$$A_p = \frac{2V_0(1 - \cos p\pi)}{p\pi\,I_0(p\pi r_0/h)}$$

and the potential is given by the expression

$$\Phi(r,z) = \frac{4V_0}{\pi}\,{\sum_{m=1}^{\infty}}'\,\frac{I_0(m\pi r/h)}{m\,I_0(m\pi r_0/h)}\,\sin\frac{m\pi z}{h}$$

in which $\sum'$ denotes odd values of $m$ only.
3.13 SOLUTIONS TO LAPLACE'S EQUATION IN SPHERICAL COORDINATES

When the potential problem of interest involves boundaries which are coordinate
surfaces in a spherical geometry, Laplace's equation is best written in terms of the
spherical variables $(r,\theta,\phi)$; Equation (V.86) of the Mathematical Supplement then
becomes

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\Phi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial\Phi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\Phi}{\partial\phi^2} = 0 \tag{3.88}$$

A product solution in the form

$$\Phi(r,\theta,\phi) = f_1(r)\,f_2(\theta)\,f_3(\phi) \tag{3.89}$$

can be assumed in which the functions $f_i$ are real. This leads to the separation

$$\frac{\sin^2\theta}{f_1}\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) + \frac{\sin\theta}{f_2}\frac{d}{d\theta}\left(\sin\theta\,\frac{df_2}{d\theta}\right) + \frac{1}{f_3}\frac{d^2f_3}{d\phi^2} = 0 \tag{3.90}$$

Since the last term is a function only of $\phi$ whereas the first two terms are not, it follows
that

$$\frac{1}{f_3}\frac{d^2f_3}{d\phi^2} = -m^2 \tag{3.91}$$

in which $m$ must be an integer if the potential is to be single-valued. Thus

$$f_3(\phi) = c_m\cos m\phi + d_m\sin m\phi \qquad m = 0, 1, 2, \ldots \tag{3.92}$$

Replacement of $(1/f_3)\,d^2f_3/d\phi^2$ by $-m^2$ in (3.90) and division by $\sin^2\theta$ gives

$$\frac{1}{f_1}\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) + \frac{1}{f_2\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{df_2}{d\theta}\right) - \frac{m^2}{\sin^2\theta} = 0 \tag{3.93}$$

Because only the first term of (3.93) is a function of $r$, this term must equal a real
constant which shall be designated $n(n + 1)$, for reasons which will emerge shortly.
Thus

$$\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) - n(n+1)\,f_1 = 0 \tag{3.94}$$

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{df_2}{d\theta}\right) + \left[n(n+1) - \frac{m^2}{\sin^2\theta}\right]f_2 = 0 \tag{3.95}$$

Equation (3.94) is readily solved and gives

$$f_1(r) = a\,r^n + b\,r^{-(n+1)} \tag{3.96}$$

Solution of (3.95) is facilitated by making the substitution $u = \cos\theta$ which leads to

$$(1 - u^2)\frac{d^2f_2}{du^2} - 2u\,\frac{df_2}{du} + \left[n(n+1) - \frac{m^2}{1 - u^2}\right]f_2 = 0 \tag{3.97}$$

The functions which satisfy this equation are the associated Legendre functions and
the two independent solutions are normally designated $P_n^m(u)$ and $Q_n^m(u)$. The latter
have singularities at the poles $\theta = 0, \pi$ and must be excluded if the polar axis is part
of the region of interest. In all that follows this will be assumed to be the case. Appendix
D includes a discussion of the manner in which (3.97) is solved, together with a development of the major properties of the functions $P_n^m(u)$, and only the principal results
will be stated here.

SECTION

13

Solutions to Laplace's Equotum in Spherical Coordinates 167

The assumption of a power series solution of (3.97) for the case m == 0 leads to the
conclusion that, if n is an integer, f2(U) can be expressed as a polynomial which is
well-behaved in the entire region -1 ~ u ~ 1 and given by
1 dn
Pn(u) = - n - (u 2
2 n ! dun

l)n

(3.98)

The first few of these polynomials are

Po(u)

==

and these functions are plotted in Figure 3.15. If all positive integral values of $n$ are included, the Legendre polynomials generated by (3.98) constitute a complete orthogonal set in the interval $[-1,1]$, and for this reason nonintegral values of $n$ normally are not considered.

FIGURE 3.15 Legendre polynomials. (Plot of $P_n(u)$ against $u$ for $-1 \le u \le 1$.)
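
The orthogonality claimed for the Legendre polynomials can be checked numerically. What follows is a minimal Python sketch under stated assumptions and is not part of the original development: the function names `legendre` and `inner_product` are illustrative, the polynomials are generated with Bonnet's recurrence (a standard identity, used here in place of (3.98)), and the integral over $[-1,1]$ is estimated with a simple midpoint rule.

```python
def legendre(n, u):
    """Evaluate P_n(u) by Bonnet's recurrence (a standard identity, not taken from the text)."""
    p_prev, p = 1.0, u
    if n == 0:
        return p_prev
    for k in range(1, n):
        # (k + 1) P_{k+1}(u) = (2k + 1) u P_k(u) - k P_{k-1}(u)
        p_prev, p = p, ((2 * k + 1) * u * p - k * p_prev) / (k + 1)
    return p


def inner_product(n, l, steps=2000):
    """Midpoint-rule estimate of the integral of P_n(u) P_l(u) over [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        u = -1.0 + (i + 0.5) * h
        total += legendre(n, u) * legendre(l, u)
    return total * h
```

With these definitions, `inner_product(2, 3)` should vanish to rounding error, while `inner_product(2, 2)` should approach $2/(2n+1) = 2/5$.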

For $m \neq 0$, the associated Legendre function $P_n^m(u)$ satisfies (3.97) and is given by

$$P_n^m(u) = (1-u^2)^{m/2}\frac{d^m P_n(u)}{du^m} \tag{3.99}$$
Since $P_n(u)$ is an $n$th-order polynomial, $m$ cannot exceed $n$ in value. A variety of recurrence formulas connecting associated Legendre functions and/or their derivatives for different values of the indices can be found in Appendix D, together with a list of the specific functions generated from (3.99) for low values of $m$ and $n$.
The associated Legendre functions are also orthogonal in $[-1,1]$, the normalization integral being

$$\int_{-1}^{1} P_n^m(u)P_l^m(u)\,du = \frac{2}{2n+1}\frac{(n+m)!}{(n-m)!}\,\delta_{ln} \tag{3.100}$$

When (3.89) is expanded in terms of the solutions which have been found for the constituent functions, one obtains

$$\Phi(r,\theta,\phi) = \sum_{n=0}^{\infty}\sum_{m=0}^{n}\left[a_n r^n + b_n r^{-(n+1)}\right]P_n^m(\cos\theta)\left[c_m\cos m\phi + d_m\sin m\phi\right] \tag{3.101}$$

The combination $P_n^m(\cos\theta)[c_m\cos m\phi + d_m\sin m\phi]$ is called a spherical harmonic. Being orthogonal in both $\cos\theta$ and $\phi$, it is suitable for the expansion of arbitrary functions of $\theta$ and $\phi$ in spherical coordinates in exactly the same way that a double Fourier series is used in two dimensions in rectangular coordinates.
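
The normalization integral for the associated Legendre functions can be verified exactly with rational arithmetic. The sketch below is illustrative only and rests on stated assumptions: it takes $P_n^m(u) = (1-u^2)^{m/2}\,d^m P_n/du^m$ as the definition (a sign factor $(-1)^m$, used in some references, would not change the integral), it builds $P_n$ from Bonnet's recurrence rather than from (3.98), and all function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def legendre_coeffs(n):
    """Coefficients of P_n(u), lowest power first, from Bonnet's recurrence."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for k in range(1, n):
        up = [Fraction(0)] + p                                   # u * P_k
        prev = p_prev + [Fraction(0)] * (len(up) - len(p_prev))  # pad P_{k-1}
        p_prev, p = p, [((2 * k + 1) * a - k * b) / (k + 1) for a, b in zip(up, prev)]
    return p


def derivative(coeffs, times):
    """Differentiate a polynomial (lowest power first) the given number of times."""
    for _ in range(times):
        coeffs = [i * a for i, a in enumerate(coeffs)][1:] or [Fraction(0)]
    return coeffs


def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def assoc_integral(n, l, m):
    """Exact integral over [-1, 1] of P_n^m(u) P_l^m(u) under the assumed definition."""
    prod = poly_mul(derivative(legendre_coeffs(n), m), derivative(legendre_coeffs(l), m))
    for _ in range(m):
        prod = poly_mul(prod, [Fraction(1), Fraction(0), Fraction(-1)])  # times (1 - u^2)
    # Odd powers integrate to zero; u^k with k even integrates to 2 / (k + 1).
    return sum(Fraction(2, k + 1) * a for k, a in enumerate(prod) if k % 2 == 0)


def normalization(n, m):
    """Right side of the normalization integral for l = n."""
    return Fraction(2, 2 * n + 1) * Fraction(factorial(n + m), factorial(n - m))
```

For example, `assoc_integral(2, 2, 1)` and `normalization(2, 1)` should both equal 12/5, and `assoc_integral(3, 1, 1)` should be zero.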
EXAMPLE 3.23

Imagine a uniform electric field of strength $E_0$ into which an insulated conducting sphere of radius $r_0$ is placed. For convenience take the polar axis parallel to the original field and accept the resulting potential of the sphere as the zero reference. What is the field distribution in the region exterior to the sphere?
Equation (3.101) can be used with the simplification that $m = 0$ because of $\phi$ symmetry. Then

$$\Phi(r,\theta) = \sum_{n=0}^{\infty}\left[a_n r^n + b_n r^{-(n+1)}\right]P_n(\cos\theta)$$

$$\Phi(r_0,\theta) = 0 = \sum_{n=0}^{\infty}\left[a_n r_0^n + b_n r_0^{-(n+1)}\right]P_n(\cos\theta)$$

Multiplying both sides of the second of these equations by $P_l(\cos\theta)$ and integrating gives

$$0 = \sum_{n=0}^{\infty}\left[a_n r_0^n + b_n r_0^{-(n+1)}\right]\int_{-1}^{1}P_n(\cos\theta)P_l(\cos\theta)\,d(\cos\theta) = \left[a_l r_0^l + b_l r_0^{-(l+1)}\right]\frac{2}{2l+1}$$

this second result arising from the normalization integral (D.24) in Appendix D. Thus

$$b_l = -a_l r_0^{2l+1}$$

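Since the electric field must be $\mathbf{1}_z E_0$ at points remote from the sphere,

$$\lim_{r\to\infty}\Phi(r,\theta) = \lim_{r\to\infty}\sum_{n=0}^{\infty}a_n r^n P_n(\cos\theta) = \lim_{r\to\infty}(-E_0 r\cos\theta)$$

Therefore only $a_1$ is nonzero and $a_1 = -E_0$. The potential distribution is then

$$\Phi(r,\theta) = -E_0\left[1 - \left(\frac{r_0}{r}\right)^3\right]r\cos\theta$$

The field intensity is given by

$$\mathbf{E} = -\nabla\Phi = \mathbf{1}_r E_0\left[1 + 2\left(\frac{r_0}{r}\right)^3\right]\cos\theta - \mathbf{1}_\theta E_0\left[1 - \left(\frac{r_0}{r}\right)^3\right]\sin\theta$$

The surface density of charge induced on the sphere is

$$\sigma(\theta) = \epsilon_0 E_r(r_0) = 3\epsilon_0 E_0\cos\theta$$

Flux lines and equipotentials are shown in the figure. It can be observed that the sphere exerts little influence at distances larger than one radius from its surface.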
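
The result of this example lends itself to a quick numerical check. The following Python sketch is only illustrative: the function names and the default values $E_0 = 1$, $r_0 = 1$ are assumptions, and the radial field is estimated by a central difference of the potential rather than from a closed-form expression.

```python
import math


def potential(r, theta, e0=1.0, r0=1.0):
    """Phi(r, theta) = -E0 [1 - (r0/r)^3] r cos(theta), intended for r >= r0."""
    return -e0 * (1.0 - (r0 / r) ** 3) * r * math.cos(theta)


def radial_field(r, theta, e0=1.0, r0=1.0, h=1e-6):
    """E_r = -dPhi/dr, estimated with a central difference of step h."""
    upper = potential(r + h, theta, e0, r0)
    lower = potential(r - h, theta, e0, r0)
    return -(upper - lower) / (2.0 * h)
```

On the sphere the potential should vanish for every $\theta$, the radial field at the pole should be close to $3E_0$ (so that $\sigma = 3\epsilon_0 E_0\cos\theta$), and far from the sphere the radial field on the axis should return to $E_0$.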



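EXAMPLE 3.24

The potential of a point charge $q_1$ at a distance $r_1$ from the origin of coordinates has $\phi$ symmetry if the polar axis is chosen to pass through the charge. With reference to the figure, the potential at the point $P(r,\theta,\phi)$ is

$$\Phi = \frac{q_1}{4\pi\epsilon_0\ell}$$

in which $\ell$ is the distance from the charge to $P$ and, in accordance with the cosine law,

$$\frac{1}{\ell} = \frac{1}{r_1}\left[\left(\frac{r}{r_1}\right)^2 + 1 - 2\frac{r}{r_1}\cos\theta\right]^{-1/2} = \frac{1}{r}\left[\left(\frac{r_1}{r}\right)^2 + 1 - 2\frac{r_1}{r}\cos\theta\right]^{-1/2}$$

But since

$$(1 - 2ut + t^2)^{-1/2} = \sum_{n=0}^{\infty}t^n P_n(u)$$

is a generating function for Legendre polynomials if $|t| < 1$ (cf. Appendix D), it follows that

$$\frac{1}{\ell} = \begin{cases}\dfrac{1}{r_1}\displaystyle\sum_{n=0}^{\infty}\left(\frac{r}{r_1}\right)^n P_n(\cos\theta) & r < r_1 \\[2ex] \dfrac{1}{r}\displaystyle\sum_{n=0}^{\infty}\left(\frac{r_1}{r}\right)^n P_n(\cos\theta) & r > r_1\end{cases}$$

The potential of the point charge can therefore be expressed as

$$\Phi(r,\theta) = \begin{cases}\dfrac{q_1}{4\pi\epsilon_0 r_1}\displaystyle\sum_{n=0}^{\infty}\left(\frac{r}{r_1}\right)^n P_n(\cos\theta) & r < r_1 \\[2ex] \dfrac{q_1}{4\pi\epsilon_0 r}\displaystyle\sum_{n=0}^{\infty}\left(\frac{r_1}{r}\right)^n P_n(\cos\theta) & r > r_1\end{cases}$$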

This formulation can be used to find the potential distribution due to a point charge outside a grounded conducting sphere. If the polar axis is taken through the point charge $q_1$, which is a distance $r_1$ from the center of the sphere, it follows that the charge induced on the sphere has a $\phi$-symmetric distribution. Using (3.101) with $m = 0$ to represent the part of the potential due to the charges on the sphere, and using the above series to represent the potential due to $q_1$, one can write

$$\Phi(r,\theta) = \sum_{n=0}^{\infty}b_n r^{-(n+1)}P_n(\cos\theta) + \frac{q_1}{4\pi\epsilon_0 r_1}\sum_{n=0}^{\infty}\left(\frac{r}{r_1}\right)^n P_n(\cos\theta) \qquad a \le r < r_1$$

$$\Phi(r,\theta) = \sum_{n=0}^{\infty}b_n r^{-(n+1)}P_n(\cos\theta) + \frac{q_1}{4\pi\epsilon_0 r}\sum_{n=0}^{\infty}\left(\frac{r_1}{r}\right)^n P_n(\cos\theta) \qquad r > r_1$$

in which $a$ is the radius of the sphere. The radial functions have been chosen appropriately to satisfy the finiteness condition at infinity.
Since $\Phi(a,\theta) = 0$, the orthogonality of the Legendre polynomials leads to

$$b_n a^{-(n+1)} = -\frac{q_1}{4\pi\epsilon_0 r_1}\left(\frac{a}{r_1}\right)^n$$

from which it follows that

$$\sum_{n=0}^{\infty}b_n r^{-(n+1)}P_n(\cos\theta) = \frac{q_2}{4\pi\epsilon_0 r}\sum_{n=0}^{\infty}\left(\frac{r_2}{r}\right)^n P_n(\cos\theta)$$

with

$$q_2 = -\frac{a}{r_1}q_1 \qquad r_2 = \frac{a^2}{r_1}$$

Thus the grounded sphere is equivalent to a second point charge at an interior point on the polar axis. This result is in agreement with Example 3.15.
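
Both steps of this example can be checked numerically: the Legendre series for the reciprocal distance, and the vanishing of the total potential on the grounded sphere. The Python sketch below is a hedged illustration, not the text's own procedure. The function names are hypothetical, the potential is expressed in units of $q_1/4\pi\epsilon_0$, the series is truncated at 60 terms, and the image charge is taken to have strength $-a/r_1$ (relative to $q_1$) at radius $a^2/r_1$.

```python
import math


def legendre(n, u):
    """P_n(u) from Bonnet's recurrence (standard identity, assumed here)."""
    p_prev, p = 1.0, u
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * u * p - k * p_prev) / (k + 1)
    return p


def distance(r, r1, theta):
    """Distance between a point at radius r and a point on the axis at r1 (cosine law)."""
    return math.sqrt(r * r + r1 * r1 - 2.0 * r * r1 * math.cos(theta))


def inverse_distance_series(r, r1, theta, terms=60):
    """Partial sum of the Legendre series for 1/distance, using the convergent branch."""
    small, large = min(r, r1), max(r, r1)
    u = math.cos(theta)
    return sum((small / large) ** n * legendre(n, u) for n in range(terms)) / large


def grounded_sphere_potential(r, theta, a, r1):
    """Potential in units of q1 / (4 pi eps0): charge at r1 plus its image inside the sphere."""
    return 1.0 / distance(r, r1, theta) - (a / r1) / distance(r, a * a / r1, theta)
```

A comparison of `inverse_distance_series` with `1 / distance(...)` on either side of $r_1$, and an evaluation of `grounded_sphere_potential` at $r = a$, gives a direct check of the two results.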
EXAMPLE 3.25

A conducting spherical shell of radius $a$ is divided into four equal sectors as shown in the figure. These sectors are insulated from each other by small gaps and are alternately at the potentials $V_0$ and $-V_0$. Find the potential distribution in the region external to the sphere.

The general expansion (3.101) may be used once again with $a_n = 0$ to satisfy conditions at infinity. Then since the potential must be an odd function of $\phi$,

$$\Phi(r,\theta,\phi) = \sum_{n=0}^{\infty}\sum_{m=0}^{n}d_{mn}\sin m\phi\; r^{-(n+1)}P_n^m(\cos\theta)$$

At $r = a$, the potential is independent of $\theta$ and is a square wave function in $\phi$, and thus

$$\int_{-1}^{1}\int_0^{2\pi}\Phi(a,\phi)P_l^p(u)\sin p\phi\,du\,d\phi = \int_{-1}^{1}\int_0^{2\pi}P_l^p(u)\sin p\phi\sum_{n=0}^{\infty}\sum_{m=0}^{n}d_{mn}a^{-(n+1)}\sin m\phi\,P_n^m(u)\,du\,d\phi = \pi a^{-(l+1)}\frac{2}{2l+1}\frac{(l+p)!}{(l-p)!}\,d_{pl}$$

by virtue of (D.30) and the Fourier normalization formula.
The left side of the above equation reduces to

$$V_0\int_{-1}^{1}P_l^p(u)\,du\left[4\int_0^{\pi/2}\sin p\phi\,d\phi\right] = \frac{8V_0}{p}\int_{-1}^{1}P_l^p(u)\,du$$

in which $p$ is restricted to the values $2(2s+1)$ with $s = 0, 1, 2, \ldots$. Therefore

$$d_{pl} = \frac{4a^{l+1}V_0}{\pi}\frac{(2l+1)(l-p)!}{p\,(l+p)!}F_l^p$$

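in which $F_l^p = \int_{-1}^{1}P_l^p(u)\,du$. Because the boundary potential is independent of $\theta$, $F_l^p$ vanishes unless $l - p$ is even; the dominant term in this series is therefore the one with $p = l = 2$, for which $F_2^2 = 4$. Thus

$$\Phi(r,\theta,\phi) \approx \frac{5V_0}{3\pi}\left(\frac{a}{r}\right)^3\sin 2\phi\,P_2^2(\cos\theta) = \frac{5V_0}{\pi}\left(\frac{a}{r}\right)^3\sin 2\phi\,\sin^2\theta$$

for $r \gg a$.

3.14 GREEN'S FUNCTIONS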

A technique for solving boundary-value problems which has the virtue of systematization arises from the use of Green's second integral theorem. With reference to Section V.21 of the Mathematical Supplement, if $\Phi(\xi,\eta,\zeta)$ and $\Psi(\xi,\eta,\zeta)$ are well-behaved functions in a volume $V$ which is enclosed by a surface $S$, this theorem leads to the identity

$$\int_V(\Phi\nabla_S^2\Psi - \Psi\nabla_S^2\Phi)\,dV = \oint_S(\Phi\nabla_S\Psi - \Psi\nabla_S\Phi)\cdot d\mathbf{S} \tag{3.102}$$

in which

$$\nabla_S = \mathbf{1}_x\frac{\partial}{\partial\xi} + \mathbf{1}_y\frac{\partial}{\partial\eta} + \mathbf{1}_z\frac{\partial}{\partial\zeta}$$

Let $\Phi$ be the potential function being sought and let $\Psi = G$ be the Green's function defined by

$$G = \frac{1}{4\pi\epsilon_0\ell} + \chi \tag{3.103}$$

with $\nabla_S^2\chi = 0$ throughout $V$ and $\ell = [(x-\xi)^2 + (y-\eta)^2 + (z-\zeta)^2]^{1/2}$. $G$ can be interpreted as the potential due to a unit charge placed at $(x,y,z)$ plus the potential due to a source system exterior to $V$. Because of the singularity in $G$, if $(x,y,z)$ lies within $V$, a small sphere of radius $\epsilon$ can be erected around $(x,y,z)$ and its volume excluded from $V$, or alternatively the singularity can be represented by the Dirac delta function. The latter approach gives

$$\nabla_S^2 G = -\frac{\delta(\mathbf{r}-\mathbf{r}_S)}{\epsilon_0} \tag{3.104}$$

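in which $\mathbf{r}$ and $\mathbf{r}_S$ are drawn from the origin to the point $(x,y,z)$ and to the point $(\xi,\eta,\zeta)$ respectively.
Since $\nabla_S^2\Phi = -\rho/\epsilon_0$ in $V$, (3.102) becomes

$$-\int_V\left[\Phi\frac{\delta(\mathbf{r}-\mathbf{r}_S)}{\epsilon_0} - G\frac{\rho}{\epsilon_0}\right]dV = \oint_S\left(\Phi\frac{\partial G}{\partial n} - G\frac{\partial\Phi}{\partial n}\right)dS$$

wherein $n$ is a spatial variable in the direction of the outward-drawn normal. This yields the general result

$$\Phi(x,y,z) = \int_V G\rho\,dV + \epsilon_0\oint_S\left(G\frac{\partial\Phi}{\partial n} - \Phi\frac{\partial G}{\partial n}\right)dS \tag{3.105}$$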

if $(x,y,z)$ is within $V$; otherwise the left side of (3.105) is zero. If the Green's function $G$ is known, Equation (3.105) gives the solution for the wanted potential function $\Phi$ in terms of its sources within $V$ and the values of $\Phi$ and its normal derivative on $S$.
Several classes of problems which can be solved by this formulation are worthy of mention:

1. If $G$ is chosen so that $\chi = 0$, if $S$ consists of a single surface which goes to infinity, and if $\Phi$ decreases with $\ell$ as $\ell \to \infty$ (as it will for any finite source distribution), then the surface integrals in (3.105) vanish and the familiar result (3.18) is obtained.
2. If $\Phi$ has no sources in $V$, so that $\nabla_S^2\Phi = 0$ in $V$, then (3.105) reduces to

$$\Phi(x,y,z) = \epsilon_0\oint_S\left(G\frac{\partial\Phi}{\partial n} - \Phi\frac{\partial G}{\partial n}\right)dS \tag{3.106}$$

However, it already has been noted in Section 3.10, in connection with the uniqueness proof for solutions to Laplace's equation, that if $\nabla_S^2\Phi = 0$ in $V$, knowledge of $\Phi$ or $\partial\Phi/\partial n$ on $S$ is sufficient to determine $\Phi$ everywhere in $V$. Thus (3.106) as it stands is over-determined, and additional conditions may be imposed. The most commonly imposed conditions are (a) $G = 0$ on $S$ (the Dirichlet problem) and (b) $\partial G/\partial n = 0$ on $S$ (the Neumann problem).¹⁹
In the Dirichlet problem the Green's function may be considered to arise from a unit charge placed at $(x,y,z)$ plus a system of "image" charges so positioned outside $S$ as to cause $G = 0$ on $S$. Another interpretation is to imagine (for the purpose of determining $G$) that $S$ is a grounded conducting surface which contains an induced surface charge distribution due to the presence of a unit charge at $(x,y,z)$. For any Dirichlet problem

$$\Phi(x,y,z) = -\epsilon_0\oint_S\Phi\frac{\partial G}{\partial n}\,dS \tag{3.107}$$

in which $\epsilon_0\,\partial G/\partial n$ is the induced surface charge distribution.

Similarly, in the Neumann problem, the Green's function may be considered to be due to a unit charge placed at $(x,y,z)$, plus a distribution of charges exterior to $V$ such that $\partial G/\partial n = 0$ on $S$. For any Neumann problem

$$\Phi(x,y,z) = \epsilon_0\oint_S G\frac{\partial\Phi}{\partial n}\,dS \tag{3.108}$$

19 See J. D. Jackson, Classical Electrodynamics, p. 19, John Wiley and Sons, Inc., New York, 1962, for a less restrictive Neumann condition.

The advantage of both formulas (3.107) and (3.108) is that if $G$ is once found for a particular geometry (i.e., a particular shape of surface $S$) an entire class of problems is formally solved.
3. Since Poisson's equation is linear, solutions may be superimposed so that, if $\Phi$ has sources in $V$, a generalized Dirichlet problem gives

$$\Phi(x,y,z) = \int_V G\rho\,dV - \epsilon_0\oint_S\Phi\frac{\partial G}{\partial n}\,dS \tag{3.109}$$

in which $G = 0$ on $S$. Similarly, a generalized Neumann problem gives

$$\Phi(x,y,z) = \int_V G\rho\,dV + \epsilon_0\oint_S G\frac{\partial\Phi}{\partial n}\,dS \tag{3.110}$$

with $\partial G/\partial n = 0$ on $S$.
4. Finally, mixed boundary conditions are possible, with $G = 0$ over $S'$, a part of $S$, and $\partial G/\partial n = 0$ over $S''$, the other part of $S$. The appropriate expressions for $\Phi$ are then natural extensions of the results already given.
EXAMPLE 3.26

If an arbitrary potential distribution $\Phi(a,\vartheta,\varphi)$ is established over a spherical surface of radius $a$ by means of a source system external to the sphere, the potential anywhere within the sphere can be determined with the aid of (3.107). On the basis of the results of Example 3.15, $G$ will be due to a unit charge placed at the point $(r,\theta,\phi)$ and an image charge of strength $-a/r$ placed at the point $(a^2/r,\theta,\phi)$. Then

$$\Phi(r,\theta,\phi) = -\frac{1}{4\pi}\int_0^{\pi}\int_0^{2\pi}\Phi(a,\vartheta,\varphi)\,\frac{\partial}{\partial a}\left(\frac{1}{\ell_1} - \frac{a/r}{\ell_2}\right)a^2\sin\vartheta\,d\vartheta\,d\varphi$$

If the distances $\ell_1$ and $\ell_2$ from the surface point $(a,\vartheta,\varphi)$ to the unit charge and to its image are expressed as functions of $a$ and $r$ through use of the law of cosines, the integrand may be arranged in a form which is suitable for evaluation.
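
The Dirichlet formula for the sphere can be exercised numerically. The Python sketch below is offered under explicit assumptions and is not the text's own method: the function names are hypothetical, the Green's function is handled in the scaled form $4\pi\epsilon_0 G$, the normal derivative is approximated by a central difference in the radius of the surface point, and the surface integral is approximated by a midpoint rule on a $60 \times 120$ grid. The field point must satisfy $0 < r < a$.

```python
import math


def cartesian(rad, theta, phi):
    """Spherical to rectangular coordinates."""
    return (rad * math.sin(theta) * math.cos(phi),
            rad * math.sin(theta) * math.sin(phi),
            rad * math.cos(theta))


def g_scaled(source, field, a):
    """4 pi eps0 G: unit charge at the field point plus image -a/r at radius a^2/r."""
    r = math.sqrt(sum(c * c for c in field))
    image = tuple(c * (a / r) ** 2 for c in field)
    return 1.0 / math.dist(source, field) - (a / r) / math.dist(source, image)


def interior_potential(boundary, field, a=1.0, n_theta=60, n_phi=120, h=1e-5):
    """Midpoint-rule estimate of -(1/4 pi) * surface integral of Phi * d(4 pi eps0 G)/dn."""
    d_theta, d_phi = math.pi / n_theta, 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for k in range(n_phi):
            phi = (k + 0.5) * d_phi
            # Outward normal derivative by a central difference in the source radius.
            dg_dn = (g_scaled(cartesian(a + h, theta, phi), field, a)
                     - g_scaled(cartesian(a - h, theta, phi), field, a)) / (2.0 * h)
            total += boundary(theta, phi) * dg_dn * a * a * math.sin(theta) * d_theta * d_phi
    return -total / (4.0 * math.pi)
```

As a usage check, a boundary potential $\cos\vartheta$ on a unit sphere is the surface value of the harmonic function $r\cos\theta$, so the estimate at an interior point should be close to $r\cos\theta$ there; a constant boundary potential should be reproduced unchanged.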

3.15 SOLUTIONS TO LAPLACE'S EQUATION IN TWO DIMENSIONS WITH THE USE OF CONFORMAL MAPPING

A large class of two-dimensional potential problems fits the condition that, with proper orientation of the coordinate axes, $\Phi$ is independent of $z$ and Laplace's equation reduces to

$$\frac{\partial^2\Phi}{\partial x^2} + \frac{\partial^2\Phi}{\partial y^2} = 0 \tag{3.111}$$

When this is the case, a powerful method of attack utilizing the theory of functions of a complex variable may be brought to bear on the problem. Using the real variables $x$ and $y$, one may define the complex variable $\mathfrak{z} = x + jy$, in which $j = \sqrt{-1}$. It is then convenient to associate any given value of $\mathfrak{z}$ with a point in the $xy$ plane, as shown in Fig. 3.16a, and thus refer to this as a representation in the complex $\mathfrak{z}$ plane. The coordinates may also be expressed in polar form through the transformation equations

$$R = (x^2+y^2)^{1/2} \qquad \phi = \tan^{-1}\left(\frac{y}{x}\right) \tag{3.112}$$

and then

$$\mathfrak{z} = Re^{j\phi} \tag{3.113}$$

FIGURE 3.16 Functions in a complex plane. (a) $\mathfrak{z}$ plane; (b) $\mathfrak{w}$ plane.

Imagine that additionally there is a different complex variable defined by the relations

$$\mathfrak{w} = u + jv = \mathcal{R}e^{j\varphi} \tag{3.114}$$

$\mathfrak{w}$ can similarly be represented by points in a complex plane, either in terms of rectangular coordinates $u$, $v$, or polar coordinates $\mathcal{R}$, $\varphi$. (See Figure 3.16b.) If $\mathfrak{w}$ is some function of $\mathfrak{z}$, such that for each assigned value of $\mathfrak{z}$ there is some rule which specifies a corresponding value of $\mathfrak{w}$, this relationship can be symbolized by writing

$$\mathfrak{w} = f(\mathfrak{z}) \tag{3.115}$$

Then if $\mathfrak{z}$ is permitted to vary continuously, its representative point in the $\mathfrak{z}$ plane may trace out a curve $C$, as shown. In general, the corresponding values of $\mathfrak{w}$ will trace out a curve $\mathcal{C}$ in the $\mathfrak{w}$ plane.
A small change $\Delta\mathfrak{z}$ in the independent complex variable $\mathfrak{z}$ will occasion a corresponding change $\Delta\mathfrak{w}$ in the dependent complex variable $\mathfrak{w}$. The derivative of $f(\mathfrak{z})$ may then be defined in the usual way as the limit of $\Delta\mathfrak{w}/\Delta\mathfrak{z}$, namely:

$$\frac{d\mathfrak{w}}{d\mathfrak{z}} = \lim_{\Delta\mathfrak{z}\to 0}\frac{f(\mathfrak{z}+\Delta\mathfrak{z}) - f(\mathfrak{z})}{\Delta\mathfrak{z}} \tag{3.116}$$

However, unlike the case of the derivative of a function of a real variable, there is no unique path in the $\mathfrak{z}$ plane along which $\mathfrak{z} + \Delta\mathfrak{z}$ must approach $\mathfrak{z}$, which is to say that $\Delta\mathfrak{z}$ may have any direction. If the derivative $d\mathfrak{w}/d\mathfrak{z}$ has a unique value at $\mathfrak{z}$, regardless of the path along which $\Delta\mathfrak{z}$ is chosen, the function $\mathfrak{w}(\mathfrak{z})$ is said to be analytic, or regular, at $\mathfrak{z}$.
If the value of the derivative is to be independent of the direction of $\Delta\mathfrak{z}$, the same result must be obtained if $\mathfrak{z}$ is changed solely in the $x$ direction, or solely in the $y$ direction. Since $\mathfrak{w} = u(x,y) + jv(x,y)$, in the former case

$$\frac{d\mathfrak{w}}{d\mathfrak{z}} = \lim_{\Delta x\to 0}\frac{u(x+\Delta x, y) + jv(x+\Delta x, y) - u(x,y) - jv(x,y)}{\Delta x} = \frac{\partial u}{\partial x} + j\frac{\partial v}{\partial x} \tag{3.117}$$

whereas in the latter case

$$\frac{d\mathfrak{w}}{d\mathfrak{z}} = \lim_{j\Delta y\to 0}\frac{u(x, y+\Delta y) + jv(x, y+\Delta y) - u(x,y) - jv(x,y)}{j\,\Delta y} = \frac{\partial v}{\partial y} - j\frac{\partial u}{\partial y} \tag{3.118}$$

If $\mathfrak{w}$ is to be analytic at $\mathfrak{z}$, it is therefore a necessary condition that

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \tag{3.119}$$

$$\frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y} \tag{3.120}$$

These are known as the Cauchy-Riemann equations. Since any path through $\mathfrak{z}$ can be expressed as the linear sum of the displacements $\Delta x$ and $j\,\Delta y$, it follows that the Cauchy-Riemann equations are also a sufficient condition that $\mathfrak{w}(\mathfrak{z})$ be analytic at $\mathfrak{z}$.
EXAMPLE 3.27

Consider the function $\mathfrak{w} = \mathfrak{z}^3$, which gives

$$u + jv = (x+jy)^3 = (x^3 - 3xy^2) + j(3x^2y - y^3)$$

so that

$$u = x^3 - 3xy^2 \qquad v = 3x^2y - y^3$$

$$\frac{\partial u}{\partial x} = 3x^2 - 3y^2 \qquad \frac{\partial v}{\partial x} = 6xy \qquad \frac{\partial u}{\partial y} = -6xy \qquad \frac{\partial v}{\partial y} = 3x^2 - 3y^2$$

The Cauchy-Riemann equations are seen to be satisfied for all values of $x$ and $y$, and thus $\mathfrak{w} = \mathfrak{z}^3$ is an analytic function for all points in the complex $\mathfrak{z}$ plane.
It is left as an exercise to show that $\mathfrak{z}^n$ is analytic ($n \ge 0$) and thus that the series $\sum_{n=0}^{\infty}a_n\mathfrak{z}^n$ is analytic and can therefore be used to represent a general class of regular functions.
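
The Cauchy-Riemann test of this example can also be carried out numerically for an arbitrary function. The Python sketch below is a hedged illustration: the name `cauchy_riemann_residuals` and the step size are assumptions, and the partial derivatives are estimated by central differences, so the residuals are only approximately zero for an analytic function.

```python
def cauchy_riemann_residuals(f, z, h=1e-6):
    """Return finite-difference estimates of (u_x - v_y, v_x + u_y) at the point z."""
    df_dx = (f(z + h) - f(z - h)) / (2.0 * h)            # du/dx + j dv/dx
    df_dy = (f(z + 1j * h) - f(z - 1j * h)) / (2.0 * h)  # du/dy + j dv/dy
    return df_dx.real - df_dy.imag, df_dx.imag + df_dy.real
```

For `f = lambda z: z**3` both residuals should be negligible at any point, whereas for the complex conjugate, `f = lambda z: z.conjugate()`, the first residual is 2, showing that the conjugate is not analytic.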

When the complex derivative defined by (3.116) exists, it may be found by the same rules which are used for functions of a real variable. As examples

$$\frac{d}{d\mathfrak{z}}(\cos\mathfrak{z}) = -\sin\mathfrak{z} \qquad \frac{d}{d\mathfrak{z}}(\ln\mathfrak{z}) = \mathfrak{z}^{-1}$$

Returning to the Cauchy-Riemann equations, if (3.119) is differentiated with respect to $x$ and (3.120) with respect to $y$ and the difference taken, one obtains

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \tag{3.121}$$

Alternatively, if the differentiation is reversed, there results

$$\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0 \tag{3.122}$$

Thus both the real and imaginary components of an analytic function of a complex variable satisfy Laplace's equation in two dimensions. It is for this reason that the theory of functions of a complex variable is so rich in applications to potential problems.
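
That the real and imaginary parts of an analytic function are harmonic can be illustrated with a five-point finite-difference Laplacian. This is a minimal Python sketch with assumed names (`parts`, `laplacian`) and an assumed step size; it is a numerical illustration, not a proof.

```python
import cmath


def parts(f):
    """Split a complex function f into real functions u(x, y) and v(x, y)."""
    return (lambda x, y: f(complex(x, y)).real,
            lambda x, y: f(complex(x, y)).imag)


def laplacian(g, x, y, h=1e-3):
    """Five-point finite-difference estimate of g_xx + g_yy at (x, y)."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h) - 4.0 * g(x, y)) / (h * h)
```

Applied to the parts of `cmath.cos` or of `lambda z: z**3`, `laplacian` should return values near zero, while for the non-analytic function $|\mathfrak{z}|^2 = x^2 + y^2$ it returns 4.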
For a problem in which one of the two components of $\mathfrak{w}$ is chosen to represent the potential function $\Phi$, it is interesting to note that the other component is related to the electric flux. To see this relation, let $u(x,y)$ be chosen as the potential function for a particular two-dimensional problem. Then since $\mathbf{D}_0 = -\epsilon_0\nabla u$,

$$D_{0x} = -\epsilon_0\frac{\partial u}{\partial x} \qquad D_{0y} = -\epsilon_0\frac{\partial u}{\partial y} \tag{3.123}$$

and the vector

$$\mathbf{1}_x\frac{\partial u}{\partial x} + \mathbf{1}_y\frac{\partial u}{\partial y} \tag{3.124}$$

is perpendicular to the contour $u(x,y)$ = constant and tangent to the flux line which passes through the point $(x,y)$. But

$$dv = \frac{\partial v}{\partial x}\,dx + \frac{\partial v}{\partial y}\,dy$$

which, with the aid of the Cauchy-Riemann equations, can be written

$$dv = -\frac{\partial u}{\partial y}\,dx + \frac{\partial u}{\partial x}\,dy \tag{3.125}$$

In moving along a contour $v$ = constant, $dv = 0$ and the displacements $dx$ and $dy$ are in the ratio $(\partial u/\partial x) : (\partial u/\partial y)$, which is a vector parallel to (3.124). Thus the lines $v$ = constant coincide with the electric flux lines.

Further, combination of (3.123) with (3.125) gives

$$dv = \frac{1}{\epsilon_0}(D_{0y}\,dx - D_{0x}\,dy) \tag{3.126}$$

If $d\mathbf{l} = \mathbf{1}_x\,dx + \mathbf{1}_y\,dy$ is the general displacement implied in (3.126), then a surface element

$$d\mathbf{S} = d\mathbf{l}\times\mathbf{1}_z$$

can be composed which is a rectangle, with a side $d\mathbf{l}$ in the $xy$ plane and a side of unit length in the $z$ direction. The flux through this surface element is

$$d\psi = \mathbf{D}_0\cdot d\mathbf{S} = D_{0x}\,dy - D_{0y}\,dx \tag{3.127}$$

Comparison of (3.126) and (3.127) reveals that

$$d\psi = -\epsilon_0\,dv \tag{3.128}$$

and, except for an integration constant (which can be made zero by choosing the reference for flux at $v = 0$),

$$\psi = -\epsilon_0 v \tag{3.129}$$

This equation can be interpreted as saying that the total electric flux between the contours $v = v_1$ and $v = v_2$, per unit length in the $z$ direction, is $-\epsilon_0(v_2 - v_1)$. Not only are the contours of $v$ the flux lines themselves, but the value of $v$ can be made a measure of the total flux.
If the foregoing is represented in the complex $\mathfrak{w}$ plane of Figure 3.16b, a very simple picture emerges. The horizontal equispaced grid of lines $v$ = constant traces the electric flux density, and the vertical equispaced grid of lines $u$ = constant gives the equipotentials. This is the electrostatic field picture for the region between parallel conducting plates which have been oppositely charged (cf. Example 3.17). However, the connection to real space requires the knowledge of $u$ and $v$ as functions of $x$ and $y$.
This connection may be viewed in terms of the transformation function $\mathfrak{w} = f(\mathfrak{z})$ relating the contours $C$ and $\mathcal{C}$, as described earlier and shown in Figures 3.16a and b. If $\mathcal{C}$ is a line $u$ = constant, then $C$ will be an equipotential contour in real space; if $\mathcal{C}$ is a line $v$ = constant, $C$ will be a flux line in real space. If the correct transformation equation $f(\mathfrak{z})$ is found, these equipotentials and flux lines will be the solution to the problem under consideration.
It has been seen already that the function $f(\mathfrak{z})$ must be analytic if $u$ and $v$ are to satisfy Laplace's equation. But then the derivative must be independent of path and $d\mathfrak{w}$ is uniquely related to $d\mathfrak{z}$ by the expression

$$d\mathfrak{w} = f'(\mathfrak{z})\,d\mathfrak{z} \tag{3.130}$$

If $f'(\mathfrak{z})$ is considered in polar form, so that $f'(\mathfrak{z}) = Ae^{j\alpha}$, then Equation (3.130) states that the magnitude of $d\mathfrak{w}$ is $A$ times the magnitude of $d\mathfrak{z}$ and that the angle of $d\mathfrak{w}$ is the angle of $d\mathfrak{z}$ augmented by $\alpha$. Therefore the entire infinitesimal region in the neighborhood of a point $\mathfrak{w}$ is similar to the infinitesimal region in the vicinity of the corresponding point $\mathfrak{z}$, merely being magnified by the scale factor $A$ and rotated through an angle $\alpha$. For this reason, if two curves $C$ and $C'$ intersect at a certain angle in the $\mathfrak{z}$ plane, the transformed curves $\mathcal{C}$ and $\mathcal{C}'$ will intersect at the same angle in the $\mathfrak{w}$ plane, since both have been rotated by the angle $\alpha$. Transformations possessing this property of angle preservation are said to be conformal, and every analytic function is therefore a conformal transformation. As a particular example of this conformal property, the angle between a flux line and an equipotential contour in the $\mathfrak{z}$ plane is 90 deg; the angle between a $u$ line and a $v$ line in the $\mathfrak{w}$ plane is also 90 deg.
The problem of determining a two-dimensional electrostatic field distribution is thus seen to be equivalent to finding the correct analytic function $f(\mathfrak{z})$ which will transform the sought-for flux-potential map into a simple rectangular grid in the $\mathfrak{w}$ plane. As is so often the case in analysis, the inverse problem is simpler, namely, to study a known function $f(\mathfrak{z})$ and see what physical potential problem it represents.
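
The angle-preserving property can be illustrated numerically by mapping two short displacements that leave a common point and comparing the angle between their images with the angle between the originals. The Python sketch below uses an assumed function name and step size, and applies only where the derivative of the mapping is nonzero.

```python
import cmath


def image_angle(f, z0, angle1, angle2, h=1e-6):
    """Angle between the images under f of two short segments leaving z0."""
    d1 = f(z0 + h * cmath.exp(1j * angle1)) - f(z0)
    d2 = f(z0 + h * cmath.exp(1j * angle2)) - f(z0)
    return cmath.phase(d2 / d1)
```

For `f = lambda z: z**3` at $\mathfrak{z}_0 = 1 + 0.5j$, two segments at angles 0.2 and 1.1 rad should map into segments separated by very nearly 0.9 rad, the same separation as in the original plane.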
EXAMPLE 3.28

If $n$ is a positive real number (not necessarily an integer) the function

$$\mathfrak{w} = \mathfrak{z}^n = R^n e^{jn\phi}$$

is analytic and has real and imaginary components given by

$$u = R^n\cos n\phi \qquad v = R^n\sin n\phi$$

It is evident from the exponential form of this function that a semi-infinite straight line, drawn from the origin in the $\mathfrak{z}$ plane at an angle $\phi$, will transform into a semi-infinite straight line, drawn from the origin in the $\mathfrak{w}$ plane at an angle $n\phi$. Therefore this transformation is useful in determining the fields near conducting corners. As examples, consider the grounded interior and exterior corners shown in the figure. For the interior corner, the boundaries are the semi-infinite lines at $\phi = 0, \pi/2$. If $v$ is chosen as the potential and $n = 2$, these lines transform into the $\mathfrak{w}$ plane as the two halves of the $v = 0$ line, thus satisfying the condition of zero potential. Then in the $\mathfrak{z}$ plane, the potential distribution is

$$v(R,\phi) = R^2\sin 2\phi$$

and the flux lines can be found by letting $u$ = constant. Both fields are plotted in the figure.
Similarly for the exterior angle, since the boundaries are the semi-infinite lines at $\phi = 0, 3\pi/2$, if $n = \tfrac{2}{3}$, once again these boundaries transform into the $\mathfrak{w}$ plane as the two halves of the $v = 0$ line. The potential distribution is therefore

$$v(R,\phi) = R^{2/3}\sin\tfrac{2}{3}\phi$$

and $u$ = constant gives the flux lines. These fields are also shown in the figure.
This solution is applicable to corners of any angle.
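
The corner solutions can be evaluated directly. In the Python sketch below the function names are assumptions, and the field magnitude is taken to be $|d\mathfrak{w}/d\mathfrak{z}| = nR^{n-1}$, which follows from the Cauchy-Riemann equations rather than from a formula stated in the example.

```python
import math


def corner_potential(R, phi, n):
    """v(R, phi) = R^n sin(n phi) for the transformation w = z^n."""
    return R ** n * math.sin(n * phi)


def field_magnitude(R, n):
    """|grad v| = |dw/dz| = n R^(n - 1); an inference from the Cauchy-Riemann equations."""
    return n * R ** (n - 1)
```

With $n = 2/3$ the potential vanishes on both faces $\phi = 0$ and $\phi = 3\pi/2$ of the exterior corner and the field magnitude grows without limit as $R \to 0$; with $n = 2$ the field magnitude falls to zero at the apex of the interior corner.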
EXAMPLE 3.29

Next consider the function

$$\mathfrak{w} = \cos^{-1}\mathfrak{z}$$

which gives

$$\mathfrak{z} = x + jy = \cos(u + jv) = \cos u\cosh v - j\sin u\sinh v$$

$$x = \cos u\cosh v \qquad y = -\sin u\sinh v$$

from which it follows that

$$\frac{x^2}{\cosh^2 v} + \frac{y^2}{\sinh^2 v} = 1$$

$$\frac{x^2}{\cos^2 u} - \frac{y^2}{\sin^2 u} = 1$$

The first of these equations, for constant $v$, gives a family of confocal ellipses with the foci at $x = \pm 1$. The second equation, for constant $u$, yields a family of confocal hyperbolas. The two sets of contours are orthogonal, as shown in the first figure.
Inspection of this figure reveals a variety of problems which may be solved by this transformation. If $v$ is chosen to represent the potential, one can solve for: (1) the field between two confocal elliptic cylinders, or between an elliptic cylinder and a flat strip stretched between its foci; (2) the field external to a charged elliptic cylinder, including the limiting case of a flat strip extending between the foci.
If $u$ is chosen to represent the potential, one can solve for: the field between two confocal hyperbolic cylinders, including the cases that one or both of them is a plane ($u = \pi/2$ and/or $u = 0$ and/or $u = \pi$). The special cases include two perpendicular charged plates separated by a gap and two coplanar charged plates separated by a gap.
As an illustration, consider the case of two semi-infinite conducting planes, both lying in the $xz$ plane, and separated by a gap of width $d$, as shown in the second figure. The two conductors are assumed to be equally but oppositely charged, with the right plate at potential zero and the left plate at potential $V_0$.
Choosing $u$ to represent the potential function, one sees that if $d = 2$ and $V_0 = \pi$, the preceding development applies without modification and $u(x,y)$ is given implicitly by

$$\frac{x^2}{\cos^2 u} - \frac{y^2}{\sin^2 u} = 1$$

However, it is a simple matter to scale this solution since if

$$\nabla^2 u(x,y) = 0$$

then

$$\nabla^2[Ku(kx,ky)] = 0$$

with $K$ and $k$ arbitrary constants. In other words, both potential and distance can be scaled linearly without affecting a solution to Laplace's equation. Therefore if general values of $d$ and $V_0$ are used in this problem, the solution for $u(x,y)$ is contained in the equation

$$\frac{x^2}{\cos^2(\pi u/V_0)} - \frac{y^2}{\sin^2(\pi u/V_0)} = \left(\frac{d}{2}\right)^2$$

The flux lines, given by $v$ = constant, scale similarly and both fields are sketched in the figure in the upper half of space. The plots in the lower half would be similar.
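
The implicit solution for the two coplanar plates can be made explicit with a complex inverse cosine. The Python sketch below is an illustration under stated assumptions: the function name `gap_potential` is hypothetical, the scaled form $u = (V_0/\pi)\,\mathrm{Re}[\cos^{-1}(2\mathfrak{z}/d)]$ is inferred from the scaling argument of the example, and the principal branch of `cmath.acos` (real part between 0 and $\pi$) is relied upon.

```python
import cmath
import math


def gap_potential(x, y, d=2.0, v0=math.pi):
    """Potential u(x, y) for coplanar plates separated by a gap d, left plate at v0."""
    return (v0 / math.pi) * cmath.acos(complex(x, y) / (d / 2.0)).real
```

At the center of the gap the potential should be $V_0/2$, on the right plate it should be zero, and on the left plate it should be $V_0$. With the default values $d = 2$, $V_0 = \pi$, the returned $u$ also satisfies the hyperbola relation $x^2/\cos^2 u - y^2/\sin^2 u = 1$.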

3.16 THE SCHWARZ TRANSFORMATION

Example 3.28 dealt with the function $\mathfrak{w} = \mathfrak{z}^n$, which was seen to map an angular section of the $\mathfrak{z}$ plane into the upper half of the $\mathfrak{w}$ plane. This angular section was bounded by the semi-infinite lines $\phi = 0$ and $\phi = \pi/n$ (cf. Equation (3.112) for the meaning of $\phi$) and was thus controlled by $n$. Field distributions for grounded corners could be deduced readily by means of this transformation.
A generalization of this technique is known as the Schwarz transformation and permits the interior of a polygon in the $\mathfrak{z}$ plane to be mapped into the upper half of the $\mathfrak{w}$ plane. (The polygon need not be closed.) To see how this is accomplished, consider the inverse transformation

$$\frac{d\mathfrak{z}}{d\mathfrak{w}} = K(\mathfrak{w}-a_1)^{\alpha_1}(\mathfrak{w}-a_2)^{\alpha_2}\cdots(\mathfrak{w}-a_N)^{\alpha_N} \tag{3.131}$$

in which $K$ is a constant, possibly complex, and the parameters $a_n$, $\alpha_n$ are real, but as yet otherwise undetermined. This transformation is analytic everywhere except at the points $\mathfrak{w} = a_n$. Therefore if $\mathfrak{w}$ is caused to trace out the segment of the real axis in the $\mathfrak{w}$ plane which lies between $a_{n-1}$ and $a_n$, Equation (3.131) indicates the phase of $d\mathfrak{z}$ is constant and thus the corresponding contour in the $\mathfrak{z}$ plane is also a straight-line segment.
The angle $\phi_n$ which this $\mathfrak{z}$ segment makes with the real axis is equal to the argument of $d\mathfrak{z}/d\mathfrak{w}$ evaluated at the $n$th segment. But this is given by

$$\arg\left(\frac{d\mathfrak{z}}{d\mathfrak{w}}\right) = \arg K + \alpha_1\arg(\mathfrak{w}-a_1) + \cdots + \alpha_N\arg(\mathfrak{w}-a_N)$$

If the points $a_n$ are graduated so that $a_{n-1} < a_n$ for all $n$, and if the $n$th segment is bounded by $a_{n-1}$ and $a_n$, then it follows that for any point on the real axis, $\arg(\mathfrak{w}-a_n)$ equals zero or $\pi$ according to whether or not $\mathfrak{w} > a_n$. Thus

$$\phi_n = \arg K + \pi(\alpha_n + \alpha_{n+1} + \cdots + \alpha_N) \tag{3.132}$$

If (3.132) is subtracted from a similar expression for $\phi_{n+1}$ one obtains

$$\phi_{n+1} - \phi_n = -\pi\alpha_n$$

By referring to Figure 3.17, one sees that all points on the real axis segment $a_n - a_{n-1}$ in the $\mathfrak{w}$ plane map into a line segment of slope $\phi_n$ in the $\mathfrak{z}$ plane; similarly the segment $a_{n+1} - a_n$ maps into the segment of slope $\phi_{n+1}$. The interior angle $\beta_n$ is given by

$$\beta_n = \pi - (\phi_{n+1} - \phi_n) = \pi(1 + \alpha_n)$$

Hence Equation (3.131) may be written

$$\frac{d\mathfrak{z}}{d\mathfrak{w}} = K\prod_{n=1}^{N}(\mathfrak{w}-a_n)^{(\beta_n/\pi)-1} \tag{3.134}$$

This is the Schwarz transformation for a polygon with internal angles $\beta_n$. The complex constant $K$ controls the relative scale and orientation of the figure in the $\mathfrak{z}$ plane.

FIGURE 3.17 Mapping of a polygon. (a) $\mathfrak{z}$ plane; (b) $\mathfrak{w}$ plane.

This transformation is useful if (3.134) is easily integrable. The difficulties obviously increase with the number of sides, and the inverse nature of the transformation causes some inconvenience, in that it would be more desirable to use as independent variables the $\mathfrak{z}$ coordinates which describe the polygon under consideration. Despite these limitations, the approach is a powerful one and will yield the field distributions around a variety of grounded segmented shapes. If one or more of the vertices of the polygon are at infinity, different parts of the boundary need not be at the same potential.
EXAMPLE 3.30

The field between two semi-infinite conducting planes charged to a potential difference $V_0$ may be solved by using the Schwarz transformation. With reference to the figure, part (a) shows the actual geometry being considered, and part (b) shows a polygon† in the $\mathfrak{z}$ plane which will tend to the actual geometry as the points $b_2$ and $b_4$ tend to $-\infty$. This limiting polygon can be mapped into a $\mathfrak{w}$ plane by recognizing that the "interior" angles tend to the limits $\beta_2 = 0$, $\beta_1 = \beta_3 = \beta_4 = 2\pi$. Then by use of (3.134)

$$\mathfrak{z} = K\int(\mathfrak{w}-a_1)(\mathfrak{w}-a_2)^{-1}(\mathfrak{w}-a_3)(\mathfrak{w}-a_4)\,d\mathfrak{w} + K'$$

† This polygon is exterior to the contour shown so as to include the region in which the field is desired, and comprises the nondotted area of the plane.

in which $K'$ is a constant of integration. This may be written

$$\mathfrak{z} = Ka_4\int(\mathfrak{w}-a_1)(\mathfrak{w}-a_2)^{-1}(\mathfrak{w}-a_3)\left(\frac{\mathfrak{w}}{a_4}-1\right)d\mathfrak{w} + K'$$

If now $a_4 \to \infty$ so that the entire real axis in the $\mathfrak{w}$ plane may be utilized, then $K$ can be permitted to tend to zero in such a way that $-K''$ is the limiting value of $Ka_4$. Upon making the further choices $a_1 = -1$, $a_2 = 0$, $a_3 = +1$, the above expression becomes

$$\mathfrak{z} = K''\int\frac{\mathfrak{w}^2-1}{\mathfrak{w}}\,d\mathfrak{w} + K'$$

and the $u$ axis is divided into four segments as shown in part (c) of the figure. Integration gives

$$\mathfrak{z} = K''\left[\frac{\mathfrak{w}^2}{2} - \ln\mathfrak{w}\right] + K'$$

The constants may be evaluated by using the information that $\mathfrak{z} = 0 + jl$ when $\mathfrak{w} = -1$ and that $\mathfrak{z} = 0 + j0$ when $\mathfrak{w} = +1$. This gives

$$\mathfrak{z} = \frac{l}{\pi}\left[\frac{1-\mathfrak{w}^2}{2} + \ln\mathfrak{w}\right]$$

as the final form for the transformation.
It may be observed that the negative half of the $u$ axis is an equipotential of value $V_0$ and that the positive half of the $u$ axis is an equipotential of value zero. Therefore the potential distribution in the upper half of the $\mathfrak{w}$ plane is given by

$$\Phi = \frac{V_0}{\pi}\varphi$$

in which $\varphi$ is measured counterclockwise from the $u$ axis. If $\mathfrak{w}$ is written in the form $\mathfrak{w} = \mathcal{R}e^{j\varphi}$, setting $\varphi$ = constant traces out an equipotential, whereas setting $\mathcal{R}$ = constant traces out a flux line. This is shown in part (d) of the first figure. The corresponding $\mathfrak{z}$ traces may be found from the transformation, and lead to the flux map shown in the second figure. This solution may be used to deduce the fringing capacitance of a parallel plate condenser. (Cf. Example 3.32.)
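
The final transformation of this example can be explored numerically. The Python sketch below is hedged throughout: it assumes the form $\mathfrak{z} = (l/\pi)[(1-\mathfrak{w}^2)/2 + \ln\mathfrak{w}]$ fixed by the two conditions $\mathfrak{z} = jl$ at $\mathfrak{w} = -1$ and $\mathfrak{z} = 0$ at $\mathfrak{w} = +1$, it assumes the potential $\Phi = V_0\varphi/\pi$ in the upper half of the $\mathfrak{w}$ plane, and the function names are hypothetical. The principal branch of `cmath.log` is used, so points on the negative real axis are approached from above.

```python
import cmath
import math


def plate_edge_map(w, l=1.0):
    """z = (l / pi) [(1 - w^2) / 2 + ln w], mapping the upper half w plane to the z plane."""
    return (l / math.pi) * ((1.0 - w * w) / 2.0 + cmath.log(w))


def potential_w(w, v0=1.0):
    """Assumed potential V0 * phase(w) / pi in the upper half of the w plane."""
    return v0 * cmath.phase(w) / math.pi
```

Points on the positive real $\mathfrak{w}$ axis should map onto the plate $y = 0$, $x \le 0$, and points on the negative real axis onto the plate $y = l$, $x \le 0$. Tracing `plate_edge_map` along rays of constant phase then gives the equipotentials, and along semicircles of constant magnitude the flux lines.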

3.17

CAPACITANCE

The concept of electric flux which originates on positive charge and terminates on negative charge has already been introduced, and has been seen to be a useful pictorialization of Gauss' law. It also has been noted that when charge resides statically on a conductor, it does so on the outer surface. The flux lines are then normal to the surface on the vacuum side of the interface and are confined to the vacuum side, there being no field within the conductor. This fact led to the conclusion that at each point on the outer surface of the conductor $D_0 = \sigma$, with $D_0$ the flux density and $\sigma$ the surface charge density.

FIGURE 3.18 Capacitance of two arbitrary conductors.

Let these ideas be applied to the case of two conductors in free space, as shown in Fig. 3.18. Each conductor has an arbitrary shape, and their relative position and orientation are also arbitrary. If one conductor contains a net excess charge $Q$ and the other a net excess charge $-Q$, all the flux lines leaving one conductor terminate on the other, as suggested by the figure. From a knowledge of $\mathbf{D}_0(x,y,z)$ in the intervening space, one could deduce $\mathbf{E}(x,y,z)$ and then determine the potential difference between the two conductors by computing the line integral of longitudinal $E$ along an arbitrary path extending from one conductor to the other.
Suppose that one were to double $Q$ and $-Q$. This would double $\mathbf{D}_0$ everywhere and double $\mathbf{E}$ everywhere, thus doubling the voltage difference between the two conductors. From this it follows that the ratio of the charge on one of the conductors to the voltage between them is a constant. This constant is a useful index of the charge storage capability and is called capacitance.†
The conclusion just reached is valid for arbitrary conductors and the general definition of electrostatic capacitance is therefore

$$C_0 = \frac{Q}{V} \tag{3.135}$$

in which $Q$ is the charge in coulombs and $V$ is the potential difference in volts. The subscript on $C_0$ is a reminder that the intervening space is a vacuum. Capacitance is measured in units of coul/volt, more commonly called farads.
EXAMPLE 3.31

Other illustrative examples have included a variety of situations involving two conductors containing equal and opposite charges. Making use of the results of Example 3.9, one may conclude that the capacitance per unit length of two concentric cylinders is

$$C_0 = \frac{2\pi\epsilon_0}{\ln (b/a)}$$

From Example 3.14, the capacitance per unit length of two parallel tubular conductors of radius $a$ and spacing $D$ is

$$C_0 = \frac{\pi\epsilon_0}{\cosh^{-1}(D/2a)}$$

From Example 3.17, the capacitance per unit area of two closely spaced parallel plates is

$$C_0 = \frac{\epsilon_0}{l}$$

These expressions for capacitance all contain the multiplicative factor $\epsilon_0$ and indicate why the units for $\epsilon_0$ are ordinarily taken as farads/m.
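The three expressions of Example 3.31 can be evaluated numerically. The following is a minimal sketch, assuming SI units; the function names and the dimensions used are hypothetical and chosen only for illustration.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space, farads/m


def coax_capacitance_per_length(a, b):
    """Concentric cylinders, inner radius a and outer radius b (farads/m)."""
    return 2.0 * math.pi * EPS0 / math.log(b / a)


def two_wire_capacitance_per_length(a, spacing):
    """Parallel tubular conductors of radius a and spacing D (farads/m)."""
    return math.pi * EPS0 / math.acosh(spacing / (2.0 * a))


def plate_capacitance_per_area(l):
    """Closely spaced parallel plates with separation l (farads/m^2)."""
    return EPS0 / l


# Hypothetical dimensions, all in meters.
c_coax = coax_capacitance_per_length(a=0.5e-3, b=2.0e-3)
c_pair = two_wire_capacitance_per_length(a=1.0e-3, spacing=10.0e-3)
c_plate = plate_capacitance_per_area(l=1.0e-3)

print(f"concentric cylinders: {c_coax:.3e} F/m")
print(f"parallel tubes:       {c_pair:.3e} F/m")
print(f"parallel plates:      {c_plate:.3e} F/m^2")
```

Each result scales directly with $\epsilon_0$, which is the point made in the example about its units.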
EXAMPLE 3.32

The expression given above for capacitance per unit area of two parallel plates leads to the approximate result

$$C_0 = \frac{\epsilon_0 A}{l}$$

as the total capacitance, if the area of each plate is $A$. However, this result neglects fringing and assumes that the $\mathbf{D}_0$ field is uniform and does not extend beyond the plate edges, as suggested by the figure. The true field is more like the flux map shown in Example 3.30, which indicates an extension of the field beyond the plate edges and some additional charge storage on the back sides of the plates. A more accurate expression for capacitance may be deduced with the aid of the Schwarz transformation there derived.

† The two conductors are said to comprise a capacitor or condenser.

[Figure: two parallel plates in the $z$ plane, at potentials $\Phi = V_0$ and $\Phi = 0$.]

For two semi-infinite parallel plates, separated in distance an amount $l$, and in potential an amount $V_0$, a transformation to the $w$ plane led to the result that

$$\Phi = \frac{V_0}{\pi}\,\varphi$$

with $w = \mathcal{R}e^{j\varphi}$. The charge density in $w$ space is therefore

$$\sigma = -\epsilon_0 \frac{\partial \Phi}{\partial n} = -\frac{\epsilon_0}{u}\frac{\partial \Phi}{\partial \varphi} = -\frac{\epsilon_0 V_0}{\pi u}$$

The total charge on the lower plate, per unit length perpendicular to the $z$ plane, between $x = 0$ and $x = \xi$, and accounting for both the inner and outer surfaces, is

$$Q(\xi) = -\frac{\epsilon_0 V_0}{\pi}\int_{u_1}^{u_2}\frac{du}{u}$$

in which $u_1$ and $u_2$ straddle the point $u = 1$ in such a way that

$$\xi + j0 = \frac{l}{\pi}\left[\frac{1 - u_1^2}{2} + \ln u_1\right] \qquad \xi + j0 = \frac{l}{\pi}\left[\frac{1 - u_2^2}{2} + \ln u_2\right]$$

these expressions arising from the Schwarz transformation.

If $|\xi| \gg l$, a good approximate solution may be obtained to these transcendental equations, for then $u_1 \ll 1$ and $u_2 \gg 1$. Hence $(1 - u_1^2)/2$ is negligible compared to $\ln u_1$, and $\ln u_2$ may be neglected in comparison to $-u_2^2/2$. Thus

$$\ln u_1 \approx \frac{\pi\xi}{l} \qquad \ln u_2 \approx \frac{1}{2}\ln\left(-\frac{2\pi\xi}{l}\right)$$

$$Q(\xi) \approx -\frac{\epsilon_0 V_0}{\pi}\left[\frac{\pi|\xi|}{l} + \frac{1}{2}\ln\left(\frac{2\pi|\xi|}{l}\right)\right]$$

The first term in this expression for charge is the value which would occur if there were no fringing and the second term is therefore the correction due to fringing.
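The quality of this approximation can be examined numerically. The sketch below assumes the transcendental relation $(1 - u^2)/2 + \ln u = \pi\xi/l$ with $\xi < 0$, solves it by bisection on $t = \ln u$ for the two roots on either side of $u = 1$, and compares $\ln u_2 - \ln u_1$ (the bracket multiplying $-\epsilon_0 V_0/\pi$) with the approximate bracket. The function names are hypothetical.

```python
import math


def solve_log_u(rhs, lo, hi, iterations=200):
    """Bisection on t = ln(u) for (1 - u^2)/2 + ln(u) = rhs, t in [lo, hi]."""
    def g(t):
        return (1.0 - math.exp(2.0 * t)) / 2.0 + t - rhs

    sign_lo = g(lo) > 0.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (g(mid) > 0.0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def charge_brackets(xi_over_l):
    """Return (exact, approximate) values of ln(u2) - ln(u1) for xi/l < 0."""
    rhs = math.pi * xi_over_l
    # Root with u1 < 1: ln(u1) lies between rhs - 1 and 0.
    t1 = solve_log_u(rhs, rhs - 1.0, 0.0)
    # Root with u2 > 1: a generous upper bracket for ln(u2).
    t2 = solve_log_u(rhs, 0.0, math.log(10.0 * math.sqrt(1.0 - 2.0 * rhs)))
    exact = t2 - t1
    x = abs(xi_over_l)
    approx = math.pi * x + 0.5 * math.log(2.0 * math.pi * x)
    return exact, approx


for ratio in (-5.0, -50.0):
    exact, approx = charge_brackets(ratio)
    print(ratio, exact, approx, abs(exact - approx) / exact)
```

In this sketch the relative discrepancy shrinks as $|\xi|/l$ grows, which is consistent with the condition $|\xi| \gg l$ under which the approximation was introduced.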

This result may be applied to the practical problem of a parallel plate capacitor of area $A = ab$ and spacing $l$. If it is assumed that $a \gg l$, $b \gg l$, the fringing charge is approximately

$$Q_f \approx \frac{\epsilon_0 V_0}{\pi}\left[b\ln\left(\frac{\pi a}{l}\right) + a\ln\left(\frac{\pi b}{l}\right)\right]$$

in magnitude, in which the effects at the four edges have been summed, with $|\xi|$ chosen as $a/2$ or $b/2$ as appropriate. The total capacitance is then given by

$$C_0 = \frac{\epsilon_0 A}{l}\left[1 + \frac{\ln(\pi a/l)}{\pi a/l} + \frac{\ln(\pi b/l)}{\pi b/l}\right]$$

As a specific illustration of this result, if the two plates are 2 in. × 4 in. and spaced 0.1 in. apart, the capacitance is 10 percent higher due to fringing.
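The 10 percent figure can be reproduced from the bracketed factor in the capacitance expression. The following is a small sketch; the function name is hypothetical, and only the ratios $a/l$ and $b/l$ matter, so the dimensions may be left in inches.

```python
import math


def fringing_factor(a, b, l):
    """Bracketed factor 1 + ln(pi a/l)/(pi a/l) + ln(pi b/l)/(pi b/l)."""
    xa = math.pi * a / l
    xb = math.pi * b / l
    return 1.0 + math.log(xa) / xa + math.log(xb) / xb


# Plates 2 in. x 4 in., spaced 0.1 in. apart (values from the text).
factor = fringing_factor(2.0, 4.0, 0.1)
print(f"capacitance increase due to fringing: {(factor - 1.0) * 100.0:.1f} percent")
```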

3.18

MULTICAPACITOR SYSTEMS

The results just obtained for a capacitor consisting of two conducting bodies oppositely charged may be extended to the case of many conductors. Let otherwise empty space be populated by $N$ conducting bodies whose outer surfaces are designated as $S_n$. These conducting bodies may have arbitrary size, shape, orientation, and position, and their general distribution is suggested by Figure 3.19. Without loss of generality, one of the conductors may be considered so vast as to be an "earth," that is, an infinite reservoir of both types of charge, and at potential zero.

FIGURE 3.19 A system of conducting bodies.

As consequences of the uniqueness theorem for solutions to Laplace's equation, two general propositions may be established for this system of conductors:
1. If the electrostatic potential of every conductor is specified, there is only one distribution of electric charges which will yield these potentials.
2. If the total charge on each conductor is specified, there is only one way in which the charges can distribute themselves over the surfaces $S_n$ in order to be in equilibrium.


The first proposition is based on the fact that, if all the boundary potentials are prescribed, a unique electrostatic potential distribution, $\Phi(x,y,z)$, exists in the intervening space. But then a unique $\partial\Phi/\partial n$ is established everywhere, including all points contiguous to $S_n$. Since the charge distribution $\sigma$ is proportional to $\partial\Phi/\partial n$ at such points, it follows that $\sigma$ is a unique distribution over all the conducting boundaries $S_n$.
The second proposition may be established by a similar argument. Each conductor is an equipotential surface once its total charge is in an equilibrium distribution. But this leads to a unique $\Phi(x,y,z)$, a unique $\partial\Phi/\partial n$, and thus a unique $\sigma$, with $\int_{S_n}\sigma\,dS_n$ equalling the total charge on the $n$th body.
These two results may be summarized by saying that the distribution of electric charge over the outer conducting surfaces $S_n$ is fully specified if one knows either (1) the potential of each conductor or (2) the total charge on each conductor.
Now suppose that there are two equilibrium distributions of charge:
1. A distribution $\sigma$ giving total charges $Q_1, Q_2, \ldots$ on the different conductors, with their potentials being $V_1, V_2, \ldots$
2. A distribution $\sigma'$ giving total charges $Q'_1, Q'_2, \ldots$ on the different conductors, with their potentials being $V'_1, V'_2, \ldots$
Since Laplace's equation is linear, these distributions may be superposed with the result that $\sigma + \sigma'$ will give a total charge $Q_n + Q'_n$ on $S_n$, its potential being $V_n + V'_n$. Clearly this conclusion may be extended to the superposition of any number of charge distributions.
As a particular application of the foregoing, if charges $Q_1, Q_2, \ldots, Q_N$ give rise to potentials $V_1, V_2, \ldots, V_N$ on the $N$ conducting bodies, then charges $kQ_1, kQ_2, \ldots, kQ_N$ will cause potentials $kV_1, kV_2, \ldots, kV_N$, with $k$ any real constant.
Suppose next that a positive unit charge is placed on the first conductor with all other conductors left uncharged, and that this produces the potentials

$$p_{11},\; p_{21},\; \ldots,\; p_{N1}$$

on the $N$ conductors respectively. Then if a charge $Q_1$ is placed on the first conductor, with all others left uncharged, this will cause potentials

$$p_{11}Q_1,\; p_{21}Q_1,\; \ldots,\; p_{N1}Q_1$$

Similarly, if placing a positive unit charge on the $n$th conductor and leaving the others uncharged produces potentials

$$p_{1n},\; p_{2n},\; \ldots,\; p_{Nn}$$

then placing $Q_n$ on $S_n$ and maintaining the other bodies uncharged will yield potentials

$$p_{1n}Q_n,\; p_{2n}Q_n,\; \ldots,\; p_{Nn}Q_n$$

If these distributions are superposed, the effect of charges $Q_1, Q_2, \ldots, Q_N$ on the $N$ bodies is to cause their potentials to become $V_1, V_2, \ldots, V_N$ where

$$\begin{aligned}
V_1 &= p_{11}Q_1 + p_{12}Q_2 + \cdots + p_{1N}Q_N \\
V_2 &= p_{21}Q_1 + p_{22}Q_2 + \cdots + p_{2N}Q_N \\
&\;\;\vdots \\
V_N &= p_{N1}Q_1 + p_{N2}Q_2 + \cdots + p_{NN}Q_N
\end{aligned} \tag{3.136}$$


These equations give the potentials in terms of the charges. The factors $p_{ij}$ are called coefficients of potential; they are purely geometrical quantities which depend on the size, shape, orientation, and position of the various conductors. Except for a few simple geometries, the calculation of the coefficients is quite involved, but their values may be deduced experimentally with little difficulty.
Some of the properties of Equations (3.136) may be brought out with the aid of Green's reciprocation theorem. This theorem is concerned with a set of $M$ point charges $q_m$, placed at positions where the potentials due to the other $M - 1$ charges are given by a set of numbers $\Phi_m$. These potentials may be written

$$\Phi_m = \frac{1}{4\pi\epsilon_0}\sum_{n=1}^{M}{}'\,\frac{q_n}{R_{mn}}$$

in which $R_{mn}$ is the distance from $q_n$ to $q_m$, and the prime on the summation sign indicates that the term for $n = m$ is deleted from the sum.
Alternatively, if a different set of charges $q'_m$ is placed at the same points, the potentials will be

$$\Phi'_m = \frac{1}{4\pi\epsilon_0}\sum_{n=1}^{M}{}'\,\frac{q'_n}{R_{mn}}$$

Upon multiplying the first of these summations by $q'_m$, the second by $q_m$, and then summing each resulting expression over the index $m$, one obtains

$$\sum_{m=1}^{M}\Phi_m q'_m = \frac{1}{4\pi\epsilon_0}\sum_{m=1}^{M}\sum_{n=1}^{M}{}'\,\frac{q_n q'_m}{R_{mn}} \tag{3.137}$$

$$\sum_{m=1}^{M}\Phi'_m q_m = \frac{1}{4\pi\epsilon_0}\sum_{m=1}^{M}\sum_{n=1}^{M}{}'\,\frac{q'_n q_m}{R_{mn}} \tag{3.138}$$

The right sides of (3.137) and (3.138) are equal to each other, since either can be converted to the other by an interchange of the summation indices $m$ and $n$. Thus

$$\sum_{m=1}^{M}\Phi_m q'_m = \sum_{m=1}^{M}\Phi'_m q_m \tag{3.139}$$

which is Green's reciprocation theorem. It may be extended to a set of $N$ conducting bodies whose potentials are $V_n$, and which possess total charges $Q_n$, by combining all the points of a common potential in (3.139) into a single term. This gives

$$\sum_{n=1}^{N}Q'_n V_n = \sum_{n=1}^{N}Q_n V'_n \tag{3.140}$$

Consider now the special case that $Q_i = 1$, all other conductors being uncharged, so that (3.136) gives

$$V_1 = p_{1i},\quad V_2 = p_{2i},\quad \ldots,\quad V_N = p_{Ni}$$

If instead, $Q'_j = 1$, all other conductors being uncharged, then

$$V'_1 = p_{1j},\quad V'_2 = p_{2j},\quad \ldots,\quad V'_N = p_{Nj}$$
and application of (3.140) yields

$$\sum_{n=1}^{N}Q'_n V_n = p_{ji} \qquad \sum_{n=1}^{N}Q_n V'_n = p_{ij}$$

so that

$$p_{ij} = p_{ji} \tag{3.141}$$

and the coefficients of potential are symmetrical, with only $N(N + 1)/2$ of them being independent. Other properties of these coefficients may be deduced from their basic definition. Since the $p_{ij}$ are the potentials at the surfaces $S_i$ due to a positive unit charge on $S_j$, all the $p_{ij}$ must be positive. Further, the conductor possessing the charge must be at the most positive potential and thus

$$p_{jj} \geq p_{ij} \tag{3.142}$$
EXAMPLE 3.33

Equation (3.141) may be interpreted in words by saying that the potential to which $S_i$ is raised by placing unit charge on $S_j$, all other bodies being uncharged, is the same as the potential to which $S_j$ is raised by placing unit charge on $S_i$, all other bodies being uncharged. This is, of course, still true if $S_i$ and $S_j$ are the only two bodies in the system.
As a special case of this result, let the first conductor be reduced to a point $P$ and suppose that the system contains additionally only a second conductor. Then the potential to which the conductor is raised by placing a unit charge at $P$, with the conductor itself uncharged, is the same as the potential which would be found at $P$ if unit charge were placed on the conductor.
Specifically, let the conductor be a sphere and let the point $P$ be a distance $r$ from its center. Since a unit charge on the sphere causes a potential $1/4\pi\epsilon_0 r$ at $P$, if a unit charge is placed at $P$, the uncharged sphere will assume a potential $1/4\pi\epsilon_0 r$.

Equations (3.136) comprise a linear set of $N$ equations which may be solved to give

$$\begin{aligned}
Q_1 &= c_{11}V_1 + c_{12}V_2 + \cdots + c_{1N}V_N \\
Q_2 &= c_{21}V_1 + c_{22}V_2 + \cdots + c_{2N}V_N \\
&\;\;\vdots
\end{aligned} \tag{3.143}$$

in which the coefficients $c_{ij}$ represent appropriate ratios of two determinants involving the $p_{ij}$'s. Thus the $c_{ij}$'s are also purely geometrical quantities, depending on the size, shape, orientation, and position of each conducting body. $c_{ii}$ is called a coefficient of capacitance, and $c_{ij}$ $(i \neq j)$ is called a coefficient of electrostatic induction.
It follows from Green's reciprocation theorem that

$$c_{ij} = c_{ji} \tag{3.144}$$

If the $j$th conductor is raised to a positive potential while all the other bodies are grounded, $Q_j$ must be positive, but all other charges must be negative. Therefore

$$c_{ij} \leq 0 \qquad (i \neq j) \tag{3.145}$$

Furthermore, since the total charge of the system cannot be negative in this situation,

$$\sum_{i=1}^{N}c_{ij} \geq 0 \tag{3.146}$$

for any value of the index $j$.

Equations (3.143) may be rewritten in a more revealing form by making the substitutions

$$C_{ii} = \sum_{j=1}^{N}c_{ij} \qquad C_{ij} = -c_{ij} \quad (i \neq j)$$

which leads to

$$\begin{aligned}
Q_1 &= C_{11}V_1 + C_{12}(V_1 - V_2) + \cdots + C_{1N}(V_1 - V_N) \\
Q_2 &= C_{21}(V_2 - V_1) + C_{22}V_2 + \cdots + C_{2N}(V_2 - V_N) \\
&\;\;\vdots
\end{aligned} \tag{3.147}$$

The quantities $C_{ii}$ and $C_{ij}$ are known as the self-capacitance of the $i$th body and the mutual capacitance between the $i$th and $j$th bodies. $C_{ij} = C_{ji}$, by virtue of (3.144), and all the $C_{ii}$'s and $C_{ij}$'s are positive because of the defining relations and the results (3.145) and (3.146).
An interpretation of (3.147) may be undertaken with reference to the first of these equations. The total charge $Q_1$ has a component $C_{11}V_1$ which may be attributed to a capacitance $C_{11}$ between the first body and ground, since $V_1$ is the absolute potential. Additionally, there is a charge $C_{12}(V_1 - V_2)$ residing on the first body, which may be attributed to a capacitance $C_{12}$ between the first and second bodies, with this capacitance charged to a voltage difference $V_1 - V_2$. Since $C_{21} = C_{12}$, there is an equal and opposite charge $C_{21}(V_2 - V_1)$ residing on the second body. A number of flux lines $C_{12}(V_1 - V_2)$ connects these two bodies, originating on $C_{12}(V_1 - V_2)$ and terminating on $C_{21}(V_2 - V_1)$. Similar explanations may be offered for the other terms $C_{1n}(V_1 - V_n)$ occurring in the first equation. Thus the entire set of Equations (3.147) may be interpreted in terms of a capacitance $C_{ii}$ between the $i$th body and ground, plus capacitances $C_{ij}$ between the $i$th body and each other body in the system. These capacitances are purely geometrical quantities and often can best be determined by experiment.
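The chain from coefficients of potential to self- and mutual capacitances can be sketched for two conductors plus ground. The numerical values of the $p_{ij}$ below are hypothetical, chosen only to be symmetric and to satisfy $p_{jj} \geq p_{ij}$; the helper name is likewise an assumption of this sketch.

```python
import math


def invert_2x2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical coefficients of potential, volts per coulomb.
p = [[3.0e10, 1.0e10],
     [1.0e10, 2.0e10]]

# Coefficients of capacitance (diagonal) and induction (off-diagonal).
c = invert_2x2(p)

# Self-capacitances C_ii = sum_j c_ij and mutual capacitances C_ij = -c_ij.
n = len(c)
C = [[sum(c[i]) if i == j else -c[i][j] for j in range(n)] for i in range(n)]

# The two forms of the charge equations must agree for any potentials.
V = [10.0, 4.0]
Q_from_c = [sum(c[i][j] * V[j] for j in range(n)) for i in range(n)]
Q_from_C = [C[i][i] * V[i]
            + sum(C[i][j] * (V[i] - V[j]) for j in range(n) if j != i)
            for i in range(n)]

print("c =", c)
print("C =", C)
print("Q =", Q_from_c, Q_from_C)
```

For these assumed values the off-diagonal $c_{ij}$ come out negative while every $C_{ii}$ and $C_{ij}$ is positive, in line with the sign properties stated above.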

3.19

ELECTROSTATIC STORED ENERGY

Since electric charges exert forces on each other, work is performed when they move.
In particular, energy normally is required to assemble a system of charges into a given
distribution. This energy may be said to be stored in the system. The technique
employed in Section 3.8 to develop the method of images provides a simple means for
calculating this stored energy.
Let it be desired to find the electrostatic energy stored in the charge system of Figure 3.11a. If the charge $Q_{ext}$ is allowed to collapse onto the surface $\Phi_0$, forming the distribution $\sigma'$, the external field disappears. If, in addition, the charge $Q_{int}$ is allowed to collapse onto the surface $\Phi_0$, forming the distribution $\sigma''$, the internal field also disappears. Since $\sigma'' = -\sigma'$, the net charge everywhere on $\Phi_0$ will be zero and the system has become electrically neutral.
This provides an excellent starting point for the creation of the system of Figure 3.11a. Consider the family of surfaces $S_0, S_1, S_2, \ldots$, which is to become the family of equipotentials $\Phi_0, \Phi_1, \Phi_2, \ldots$, in the final system, with $\Phi_0$ the innermost of these surfaces. One begins by placing the charge distributions $\sigma'$ and $\sigma''$ on $S_0$. The charges comprising $\sigma'$ are then moved to $S_1$ and changed to the distribution $\sigma'_1 = D_1$, in which $D_1$ is the flux density distribution the final system is to have over $S_1$. The charges of $\sigma'_1$ are next moved to $S_2$ and changed to the distribution $\sigma'_2 = D_2$, in which $D_2$ is the flux density distribution the final system is to have over $S_2$. If this process is continued to completion, all the charges comprising $Q_{ext}$ will be in their proper places and the field external to $S_0$ will be precisely that of the final system. Since $\sigma''$ is still on $S_0$, the field internal to $S_0$ still will be identically zero.
How much energy is expended in moving the charges of $\sigma'$ from $S_0$ to $S_1$? Consider a surface element $dS_0$ in $S_0$, as shown in Figure 3.20. The charge $\sigma'\,dS_0$ is to be transferred from $dS_0$ to a surface element $dS_1$ in $S_1$ such that $\sigma'\,dS_0 = \sigma'_1\,dS_1$. Let $d\rho\,dS_0$ be the amount of charge which is transferred at a time when $\rho\,dS_0$ units of charge have already been transferred. At this time the density of charge on $dS_0$ is $\sigma'' + (\sigma' - \rho) = -\rho$ and the density of charge on $dS_1$ is (to first order) $\rho$. If $d\ell$ is the distance from $dS_0$ to $dS_1$, the work done in transferring the charge $d\rho\,dS_0$ is

$$d^4W = \frac{\rho}{\epsilon_0}\,d\rho\,dS_0\,d\ell$$

The work required to transfer all of the charge $\sigma'\,dS_0$ is therefore

$$d^3W = \frac{1}{2}\frac{\sigma'^2}{\epsilon_0}\,dS_0\,d\ell = \frac{\epsilon_0}{2}E^2\,dS_0\,d\ell \tag{3.148}$$

FIGURE 3.20 Transfer of charge.

Thus the work done in moving all the charges comprising $\sigma'$ from $S_0$ to $S_1$ is

$$dW = \frac{\epsilon_0}{2}\int_{V_1 - V_0}E^2\,dV$$

in which $V_1 - V_0$ is the volume between $S_1$ and $S_0$.
It follows that the energy stored in the system as the charges of $\sigma'$ are moved from $S_0$ to their final positions is given by

$$W_{ext} = \frac{\epsilon_0}{2}\int_{V_{ext}}E^2\,dV \tag{3.149}$$

with $V_{ext}$ the entire volume external to $S_0$.
If $S_0, S'_1, S'_2, \ldots$ is the family of surfaces which is to become the family of equipotentials $\Phi_0, \Phi'_1, \Phi'_2, \ldots$, with $S_0$ the outermost surface, the charges comprising $\sigma''$ may now be transferred successively from $S_0$ to $S'_1$, $S'_1$ to $S'_2$, etc., until they reach their proper final positions. The work necessary to do this will be

$$W_{int} = \frac{\epsilon_0}{2}\int_{V_{int}}E^2\,dV \tag{3.150}$$

with $V_{int}$ the entire volume internal to $S_0$. The conclusion is reached therefore that the total electrostatic energy stored in a system of static charges surrounded by free space is given by

$$W_E = \frac{\epsilon_0}{2}\int_V E^2\,dV = \frac{1}{2}\int_V \mathbf{E}\cdot\mathbf{D}_0\,dV \tag{3.151}$$

in which the volume integration extends throughout all space. This suggests that the energy is stored in the field with a volume density of $\epsilon_0E^2/2$ joules/m³, but of course no experimental verification of this is possible. Only the integrated form (3.151) is susceptible to check. However, this interpretation will prove very attractive when time-varying fields are considered in Chapter 5.
If the ultimate position of all the charge $Q_{ext}$ is such that it is distributed over the surface of a single conductor, and if similarly $Q_{int}$ finally ends by being distributed over the surface of a second conductor, an electrostatic system such as the one shown in Figure 3.18 will have been created. For such a case it is interesting to return to (3.148) and write

$$d^3W = \tfrac{1}{2}(\sigma'\,dS_0)(E\,d\ell)$$

as the work required to transfer the charge $\sigma'\,dS_0$ to the surface element $dS_1$. The work required to transfer this charge to its ultimate proper place on the conductor is therefore

$$d^2W = \tfrac{1}{2}\sigma'\,dS_0\int E\,d\ell = \tfrac{1}{2}\sigma'\,dS_0\,(\Phi_A - \Phi_0)$$

in which $\Phi_A$ is the potential of the conductor. The work invested to transfer all of $Q_{ext}$ to the first conductor is then

$$W_{ext} = \tfrac{1}{2}(\Phi_A - \Phi_0)\int\sigma'\,dS_0 = \tfrac{1}{2}(\Phi_A - \Phi_0)Q_{ext}$$

Similarly, the work required to move all the charge $Q_{int}$ from $S_0$ to the second conductor is

$$W_{int} = \tfrac{1}{2}(\Phi_B - \Phi_0)Q_{int}$$

in which $\Phi_B$ is the potential of the second conductor. Letting $Q = Q_{ext} = -Q_{int}$, and letting $V = \Phi_A - \Phi_B$ be the voltage difference of the two conductors, one can conclude that

$$W_E = \tfrac{1}{2}QV \tag{3.152}$$

is the total energy stored. Since $Q = C_0V$, with $C_0$ the capacitance, this result may be written in the alternative forms

$$W_E = \tfrac{1}{2}C_0V^2 \tag{3.153}$$

$$W_E = \frac{1}{2}\frac{Q^2}{C_0} \tag{3.154}$$

These equations probably already are familiar from circuit analysis, and it is to be noted that the derivation just given is valid for any arbitrary pair of conductors.
EXAMPLE 3.34

If a parallel plate capacitor of plate area $A$ and spacing $l$ is charged to a potential difference $V_b$, the electric field is uniform (neglecting fringing) of value

$$E = \frac{V_b}{l}$$

By virtue of (3.151), the energy stored in the capacitor is

$$W_E = \frac{\epsilon_0}{2}\int_V E^2\,dV = \frac{\epsilon_0}{2}\left(\frac{V_b}{l}\right)^2 Al = \frac{1}{2}\left(\frac{\epsilon_0A}{l}\right)V_b^2$$

But it already has been noted that the capacitance is $C_0 = \epsilon_0A/l$ and therefore

$$W_E = \tfrac{1}{2}C_0V_b^2$$

in agreement with (3.153).
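The agreement between the field expression (3.151) and the circuit expression (3.154) can also be checked for a geometry in which the field is not uniform. The sketch below treats two concentric cylinders, assuming the standard radial field $E = \lambda/2\pi\epsilon_0 r$ for a charge $\lambda$ per unit length and the capacitance per unit length quoted in Example 3.31; the radii, the charge, and the function name are hypothetical.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space, farads/m


def field_energy_per_length(charge_per_length, a, b, steps=20000):
    """Midpoint-rule integral of (eps0/2) E^2 over the annulus a < r < b."""
    dr = (b - a) / steps
    total = 0.0
    for k in range(steps):
        r = a + (k + 0.5) * dr
        e_field = charge_per_length / (2.0 * math.pi * EPS0 * r)
        # Volume element per unit length is 2 pi r dr.
        total += 0.5 * EPS0 * e_field ** 2 * 2.0 * math.pi * r * dr
    return total


a, b = 1.0e-3, 4.0e-3   # hypothetical radii, meters
lam = 1.0e-9            # hypothetical charge per unit length, coul/m

w_field = field_energy_per_length(lam, a, b)
c0 = 2.0 * math.pi * EPS0 / math.log(b / a)
w_circuit = lam ** 2 / (2.0 * c0)

print(w_field, w_circuit)
```

Both numbers are energies per unit length in joules/m, and they should agree to within the accuracy of the numerical integration.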

The energy stored in a multicapacitor system may be deduced by a generalization of the argument leading to (3.152). If $\Phi_0$ is at absolute zero potential, and $\sigma\,dS_0$ is an element of charge which will ultimately reside on a conducting surface $S_n$, then

$$d^2W = \tfrac{1}{2}\sigma\,dS_0\,V_n$$

is the work needed to move this charge from $S_0$ to $S_n$, where the potential is to be $V_n$. When all elements of charge in $\sigma'$ and $\sigma''$ are moved to their final positions, the energy stored is

$$W_E = \frac{1}{2}\sum_{n=1}^{N}Q_nV_n \tag{3.155}$$

Use of (3.143) allows (3.155) to be written

$$W_E = \frac{1}{2}\sum_{m=1}^{N}\sum_{n=1}^{N}c_{nm}V_mV_n \tag{3.156}$$
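The equivalence of (3.155) and (3.156) may be illustrated with a two-conductor system. The coefficients $c_{nm}$ and the potentials below are hypothetical values, assumed only to be symmetric with negative off-diagonal entries.

```python
import math

# Hypothetical coefficients of capacitance and induction, farads.
c = [[4.0e-11, -2.0e-11],
     [-2.0e-11, 6.0e-11]]
v = [10.0, 4.0]  # hypothetical conductor potentials, volts
n = len(v)

# Charges from the linear relations Q_n = sum_m c_nm V_m.
q = [sum(c[i][m] * v[m] for m in range(n)) for i in range(n)]

# Stored energy as one half the sum of Q_n V_n.
w_from_charges = 0.5 * sum(q[i] * v[i] for i in range(n))

# Stored energy as one half the double sum of c_nm V_m V_n.
w_from_potentials = 0.5 * sum(c[i][m] * v[m] * v[i]
                              for i in range(n) for m in range(n))

print(q, w_from_charges, w_from_potentials)
```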

The field expression for electrostatic stored energy, (3.151), may be converted to still another form which leads to a generalized geometric formula for capacitance. Using the relation $\mathbf{E} = -\nabla\Phi$ gives

$$W_E = \frac{\epsilon_0}{2}\int_V \mathbf{E}\cdot(-\nabla\Phi)\,dV$$

which, through application of the vector identity (V.107), becomes

$$W_E = \frac{\epsilon_0}{2}\int_V \Phi\,\nabla\cdot\mathbf{E}\,dV - \frac{\epsilon_0}{2}\int_V \nabla\cdot(\Phi\mathbf{E})\,dV$$

Substitution of $\epsilon_0\nabla\cdot\mathbf{E} = \rho$ into the first integral and application of the divergence theorem to the second yields

$$W_E = \frac{1}{2}\int_V \rho\Phi\,dV - \frac{\epsilon_0}{2}\oint_S \Phi\,\mathbf{E}\cdot d\mathbf{S}$$

But $S$ may be taken as a sphere at infinity, and since $\mathbf{E}$ decreases as $R^{-2}$ and $\Phi$ decreases as $R^{-1}$, while the area of $S$ grows only as $R^2$, the surface integral is seen to vanish. Therefore

$$W_E = \frac{1}{2}\int_V \rho\Phi\,dV \tag{3.157}$$

The electrostatic potential function $\Phi$ may be expressed as the integral

$$\Phi = \int_{V'}\frac{\rho'\,dV'}{4\pi\epsilon_0R} \tag{3.158}$$

wherein primes are used so as to be able to distinguish between the contributions to the integrals in (3.157) and (3.158). Thus the electrostatic stored energy may also be expressed in the form

$$W_E = \frac{1}{2}\int_V\int_{V'}\frac{\rho\rho'}{4\pi\epsilon_0R}\,dV\,dV' \tag{3.159}$$

in which $R$ is the distance between the volume elements $dV$ and $dV'$, and the integration is to be performed twice throughout all of space containing elements of charge.
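This result may be applied to situations in which equal and opposite amounts of charge $[Q, -Q]$ reside on two conducting bodies whose exterior surfaces are $S_1$ and $S_2$. The surface charge densities $\sigma_1(\xi,\eta,\zeta)$ on $S_1$ and $\sigma_2(\xi,\eta,\zeta)$ on $S_2$ are both linearly proportional to $Q$ so that one may write

$$\sigma_1(\xi,\eta,\zeta) = Qf_1(\xi,\eta,\zeta)$$

for any point $(\xi,\eta,\zeta)$ on $S_1$, and

$$\sigma_2(\xi,\eta,\zeta) = Qf_2(\xi,\eta,\zeta)$$

for any point $(\xi,\eta,\zeta)$ on $S_2$, with $f_1$ and $f_2$ functions which give the normalized charge distribution. Under these conditions, (3.159) may be written

$$W_E = \frac{Q^2}{2}\int_{S_1}\int_{S'_1}\frac{f_1f'_1}{4\pi\epsilon_0R}\,dS_1\,dS'_1 + \frac{Q^2}{2}\int_{S_2}\int_{S'_2}\frac{f_2f'_2}{4\pi\epsilon_0R}\,dS_2\,dS'_2 + Q^2\int_{S_1}\int_{S_2}\frac{f_1f_2}{4\pi\epsilon_0R}\,dS_1\,dS_2$$

wherein $f'_1$ implies $f_1(\xi',\eta',\zeta')$, etc. It is evident from (3.154) that the capacitance of this system must be given by

$$\frac{1}{C_0} = \int_{S_1}\int_{S'_1}\frac{f_1f'_1}{4\pi\epsilon_0R}\,dS_1\,dS'_1 + \int_{S_2}\int_{S'_2}\frac{f_2f'_2}{4\pi\epsilon_0R}\,dS_2\,dS'_2 + 2\int_{S_1}\int_{S_2}\frac{f_1f_2}{4\pi\epsilon_0R}\,dS_1\,dS_2 \tag{3.160}$$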

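EXAMPLE 3.35

Two concentric conducting shells of radii $r_1$ and $r_2$ form a capacitor whose capacitance can be determined with the aid of (3.160). On the basis of the figure, let charges $-Q$ and $+Q$ be placed on the inner and outer shells; then $f_1(\xi,\eta,\zeta) = -1/4\pi r_1^2$ and $f_2(\xi,\eta,\zeta) = 1/4\pi r_2^2$ are constants and

$$2\int_{S_1}\int_{S_2}\frac{f_1f_2}{4\pi\epsilon_0R}\,dS_1\,dS_2 = -\frac{1}{2\pi\epsilon_0}\left(\frac{1}{4\pi r_1^2}\right)\left(\frac{1}{4\pi r_2^2}\right)\int_{S_1}\int_{S_2}\frac{dS_1\,dS_2}{R}$$

Let $dS_2$ be the zenith area element shown in the figure, and let $dS_1$ be the ring $2\pi r_1^2\sin\theta\,d\theta$. Then, since

$$R^2 = r_1^2 + r_2^2 - 2r_1r_2\cos\theta \qquad 2R\,dR = 2r_1r_2\sin\theta\,d\theta \qquad dS_1 = 2\pi\frac{r_1}{r_2}R\,dR$$

it follows that

$$2\int_{S_1}\int_{S_2}\frac{f_1f_2}{4\pi\epsilon_0R}\,dS_1\,dS_2 = -\frac{2}{4\pi\epsilon_0r_2}$$

Following the same procedure, one finds that the other two integrals in (3.160) give $+1/4\pi\epsilon_0r_1$ and $+1/4\pi\epsilon_0r_2$ and therefore

$$\frac{1}{C_0} = \frac{1}{4\pi\epsilon_0r_1} - \frac{2}{4\pi\epsilon_0r_2} + \frac{1}{4\pi\epsilon_0r_2} = \frac{1}{4\pi\epsilon_0r_1} - \frac{1}{4\pi\epsilon_0r_2}$$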
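The three integrals of this example can be checked by direct numerical quadrature. The sketch below assumes uniform normalized distributions $f_1 = -1/4\pi r_1^2$ and $f_2 = 1/4\pi r_2^2$, uses the ring element $2\pi r^2\sin\theta\,d\theta$ with a midpoint rule, and relies on spherical symmetry to replace the second surface integration by a multiplication by the shell area. The radii and function name are hypothetical.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space, farads/m


def ring_integral(r_ring, r_point, steps=20000):
    """Integral of dS/R over a sphere of radius r_ring, evaluated at a point
    on the zenith axis a distance r_point from the center (midpoint rule)."""
    dtheta = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * dtheta
        dist = math.sqrt(r_ring ** 2 + r_point ** 2
                         - 2.0 * r_ring * r_point * math.cos(theta))
        total += 2.0 * math.pi * r_ring ** 2 * math.sin(theta) * dtheta / dist
    return total


r1, r2 = 0.05, 0.10  # hypothetical shell radii, meters
f1 = -1.0 / (4.0 * math.pi * r1 ** 2)
f2 = 1.0 / (4.0 * math.pi * r2 ** 2)
k = 1.0 / (4.0 * math.pi * EPS0)

# By symmetry every element of the second surface sees the same ring
# integral, so the double integral is that value times the shell area.
self_1 = k * f1 * f1 * ring_integral(r1, r1) * 4.0 * math.pi * r1 ** 2
self_2 = k * f2 * f2 * ring_integral(r2, r2) * 4.0 * math.pi * r2 ** 2
cross = 2.0 * k * f1 * f2 * ring_integral(r1, r2) * 4.0 * math.pi * r2 ** 2

c_numeric = 1.0 / (self_1 + self_2 + cross)
c_closed = 4.0 * math.pi * EPS0 * r1 * r2 / (r2 - r1)

print(c_numeric, c_closed)
```

The closed form used for comparison is simply the reciprocal of $1/4\pi\epsilon_0r_1 - 1/4\pi\epsilon_0r_2$.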

3.20*

THE MAXWELL-McALISTER EXPERIMENT

This section and the next are concerned with improvements in accuracy which Maxwell and McAlister and later Plimpton and Lawton made in the Cavendish electric force experiment (described in Section 3.1). Though not adding further to the electrostatic theory just presented, a discussion of these experiments enhances confidence in the postulation of the inverse square law, and affords the opportunity to consider Maxwell's analysis of the accuracy of such experiments.

FIGURE 3.21 The Maxwell-McAlister apparatus. (Labels: two thin spherical metal shells; air; insulators.)

With McAlister's help in the laboratory, Maxwell repeated the Cavendish experiment in 1878, using an apparatus of improved design. The need Cavendish had to remove the outer hemispheres was avoided by introducing a trap door which provided access to the inner globe, and through which the testing electrode of an electrometer could be inserted to detect the presence of charge on the inner globe. As suggested by Figure 3.21, the outer hemispheres were sealed together and placed on an insulating stand. The inner globe was spaced and insulated from the outer shell, in a concentric position, through the use of a piece of ebonite tubing. The trap door in the outer shell was so constructed that, in its closed position, it formed an electrical connection between the two spheres. The trap door could be lifted by an insulating thread, thus breaking this electrical connection. The detector inserted through the resulting opening was a version of Thomson's quadrant electrometer, a much more sensitive instrument than the pith-ball electrometer available to Cavendish a century earlier. The case of this electrometer and one of its electrodes were permanently grounded, and the testing electrode was also kept grounded except when used to test the potential of the inner globe. To estimate the original charge on the outer shell, a small brass ball was placed on an insulating stand at a distance of about 60 cm from the center of the shell.

* This section may be omitted without loss in continuity of the technical presentation.
The procedure followed was to close the trap door, thus connecting the outer shell to the inner globe, and then to charge the outer shell positively from a condenser which was brought in from another room for this purpose and then promptly removed. After this, in Maxwell's words,²⁰
The small brass ball was then connected to earth for an instant, so as to give it a negative charge by induction, and was then left insulated. The lid was then lifted up by means of the silk string, so as to take away the communication between the shell and the globe. The shell was then discharged and kept connected to earth. The testing electrode of the electrometer was then disconnected from earth, and made to pass through the hole in the shell so as to touch the globe within without touching the shell.
Not the slightest deflexion of the electrometer could be observed.

Because of the relative sizes of the small brass ball and the outer shell, and their separation distance, Maxwell and McAlister knew that at the time the brass ball had been momentarily grounded, it had taken on an induced negative charge which was approximately 1/54th of the positive charge which had been applied to the outer shell. Later, when the outer shell was grounded, it actually retained a small positive charge, through induction, due to the presence of the insulated, negatively charged brass ball. This small positive charge was computed to be about 1/9th of the negative charge on the brass ball, or 1/486th of the original charge applied to the shell system. Thus at this stage of the experiment, the outer and inner shells were insulated from each other, the outer shell was at ground potential and possessed a positive charge approximately 1/486th of the original charge, the small brass ball was insulated and contained a negative charge approximately 1/54th as big as the original charge applied to the shell system, and the electrometer indicated no charge on the inner globe.
To test the sensitivity of the instrumentation, the outer shell was disconnected from ground and connected instead to the electrometer. Being still at ground potential (due to the presence of the negatively charged brass ball) the outer shell caused no deflection of the electrometer. However, at this juncture, the small brass ball was grounded, thus raising the potential of the outer shell and producing a large deflection of the electrometer.
Calling this observed deflection $D$, and letting $d$ be the largest deflection which could escape detection, Maxwell and McAlister then knew that the maximum charge which resided on the inner globe was $\frac{1}{486}(d/D)$th of the original charge applied to the shell system. Thus Maxwell concludes
20 J. Clerk Maxwell, ed., The Scientific Papers of the Honourable Henry Cavendish, vol. 1, pp. 404-409, revised by J. Larmor, Cambridge University Press, 1921. (Note 19 of Notes by the Editor.)

We know that the potential of the globe at the end of the first part of the experiment cannot differ from zero by more than

$$\pm\frac{1}{486}\frac{d}{D}V$$

where $V$ is the potential of the shell when first charged.
But it appears from the mathematical theory that if the law of repulsion had been as $r^{-(2+\delta)}$, the potential of the globe when tested would have been

$$0.1478\,\delta V$$

Hence $\delta$ cannot differ from zero by more than $\pm\frac{1}{72}(d/D)$.
Now, even in a rough experiment, $D$ was certainly more than $300d$. In fact, no sensible value of $d$ was ever observed. We may therefore conclude that $\delta$, the excess of the true index above 2, must either be zero, or must differ from zero by less than

$$\pm\frac{1}{21600}$$
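The arithmetic behind the fractions quoted in this account may be retraced as follows. This is only a sketch of the bookkeeping, assuming the figures stated in the text (1/54, 1/9, the coefficient 0.1478, and $D > 300d$); the variable names are hypothetical.

```python
from fractions import Fraction

ball_fraction = Fraction(1, 54)     # induced charge on the brass ball
retained_fraction = Fraction(1, 9)  # shell charge relative to the ball charge

# Charge retained by the grounded shell, as a fraction of the original charge.
shell_residual = ball_fraction * retained_fraction

sensitivity = 0.1478  # coefficient of delta * V quoted in the passage

# Bound on delta expressed as a multiple of d/D.
bound_per_ratio = float(shell_residual) / sensitivity

# With D at least 300 d, the ratio d/D is at most 1/300.
delta_bound = bound_per_ratio * float(Fraction(1, 300))

print(shell_residual)         # fraction of the original charge
print(1.0 / bound_per_ratio)  # close to 72
print(1.0 / delta_bound)      # close to 21600
```

The computed reciprocals come out slightly below 72 and 21600; the quoted values appear to be rounded.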

The mathematical theory to which Maxwell refers in the above passage is developed in the remainder of his Note 19. This development is slightly paraphrased, with modified notation, as follows:
Assume that the law of force between two electric charges a distance $r$ apart is $qq'F(r)$ in which $q$ and $q'$ are algebraic quantities representing the amounts of charge and $F(r)$ is a function to be determined. Since the system is conservative, the potential energy can be written

$$W(r) = qq'\int_r^\infty F(R)\,dR \tag{3.161}$$

in which the result is independent of the path. This may be expressed in the form

$$W(r) = qq'\,\frac{f'(r)}{r} \tag{3.162}$$

so that

$$f(r) = \int_0^r\left[\int_r^\infty F(R)\,dR\right]r\,dr \tag{3.163}$$

Imagine in the foregoing that the charge $q'$ is taken successively to be each element of a charge $Q_a$ uniformly distributed over the outer sphere (Figure 3.22) of radius $a$ and of a charge $Q_b$ which is uniformly distributed over the inner sphere of radius $b$. The potential energies between the charge $q$ and each of these elements of charge may be added.
Let $\sigma_a$ and $\sigma_b$ be the respective surface charge densities so that

$$Q_a = 4\pi a^2\sigma_a \qquad Q_b = 4\pi b^2\sigma_b$$

Referring to the figure, one can consider first an element of charge at $P'$, residing in the element of area $a^2\sin\theta\,d\theta\,d\phi$. (The usual spherical coordinates are implied.) Placing the charge $q$ at $P$, a distance $c$ from the center, and letting $r = PP'$, one finds that

$$r^2 = a^2 - 2ac\cos\theta + c^2 \tag{3.164}$$

and that the potential energy between this element and $q$ is, from (3.162),

$$q\sigma_aa^2\sin\theta\,d\theta\,d\phi\,\frac{f'(r)}{r}$$

FIGURE 3.22 Geometry for Maxwell's analysis. (Labels: two shells; air.)

The potential energy between $q$ and all the charge on the outer shell is therefore

$$W_a = \frac{qQ_a}{4\pi}\int_0^{2\pi}\int_0^{\pi}\frac{f'(r)}{r}\sin\theta\,d\theta\,d\phi$$

Since $r$ is independent of $\phi$, this becomes

$$W_a = \frac{qQ_a}{2}\int_0^{\pi}\frac{f'(r)}{r}\sin\theta\,d\theta \tag{3.165}$$

However, the differential of (3.164) gives $r\,dr = ac\sin\theta\,d\theta$ so that (3.165) may be rewritten

$$W_a = \frac{qQ_a}{2ac}\int_{a-c}^{a+c}f'(r)\,dr = \frac{qQ_a}{2ac}\,[f(a + c) - f(a - c)]$$

When this procedure is repeated for the inner sphere, one obtains

$$W_b = \frac{qQ_b}{2bc}\,[f(c + b) - f(c - b)]$$

If $q$ is taken to be a unit charge, the electric potential at $P$ is

$$V(c) = \frac{Q_a}{2ac}\,[f(a + c) - f(a - c)] + \frac{Q_b}{2bc}\,[f(c + b) - f(c - b)] \tag{3.166}$$

From this, it follows that the potential of the outer sphere is

$$V(a) = \frac{Q_a}{2a^2}\,f(2a) + \frac{Q_b}{2ab}\,[f(a+b) - f(a-b)] \qquad (3.167)$$

whereas the potential of the inner sphere is

$$V(b) = \frac{Q_b}{2b^2}\,f(2b) + \frac{Q_a}{2ab}\,[f(a+b) - f(a-b)] \qquad (3.168)$$


In the Cavendish experiment, the two spheres first were joined by a short wire and charged to a common potential $V_1$ above ground. Putting $V(a) = V(b) = V_1$ into the above equations, and solving for $Q_b$, the charge on the inner sphere, one obtains

$$Q_b = 2V_1 b\,\frac{b f(2a) - a[f(a+b) - f(a-b)]}{f(2a)f(2b) - [f(a+b) - f(a-b)]^2} \qquad (3.169)$$

The two spheres were next disconnected from each other and the outer sphere grounded. At this point, the charge on the outer sphere changed to $Q_a'$, which can be determined from

$$V(a) = 0 = \frac{Q_a'}{2a^2}\,f(2a) + \frac{Q_b}{2ab}\,[f(a+b) - f(a-b)]$$

At the same time, the potential of the inner sphere became

$$V(b) = V_2 = \frac{Q_b}{2b^2}\,f(2b) + \frac{Q_a'}{2ab}\,[f(a+b) - f(a-b)]$$

Elimination of $Q_a'$ from these two equations yields the relation

$$V_2 = V_1\left[1 - \frac{a}{b}\,\frac{f(a+b) - f(a-b)}{f(2a)}\right] \qquad (3.170)$$

It is this potential $V_2$ of the inner globe which the electrometer was used to detect.†
Next assume, with Cavendish, that the law of electric force is some inverse power of the distance which differs but little from the inverse square; that is, let

$$F(r) = r^{-(2+\delta)}$$

so that

$$f(r) = \frac{r^{1-\delta}}{1-\delta^2}$$

Since $\delta$ is assumed small, $r^{-\delta}$ can be expanded in the rapidly converging power series

$$r^{-\delta} = 1 - \delta\ln r + \frac{(\delta\ln r)^2}{2!} - \cdots$$

(This is merely a Maclaurin series in powers of $\delta$. Cf. the Mathematical Supplement, Part I.)
Use of the first two terms of this expansion in (3.170) gives the first-order result

$$V_2 = V_1\,\frac{\delta}{2}\left[\frac{a}{b}\ln\frac{a+b}{a-b} - \ln\frac{4a^2}{a^2-b^2}\right] \qquad (3.171)$$

Insertion in (3.171) of the values used by Maxwell and McAlister for the radii a and b gives $V_2 = 0.1478\,\delta V_1$, the relation used by Maxwell in the previous quotation. It was in this manner that Maxwell was able to determine the bound $\delta = 1/21{,}600$.

† In the Maxwell-McAlister version of this experiment, the presence of the small charged brass ball adds another term to the expressions for V(a) and V(b), but this term cancels out in (3.169) and (3.170), and thus (3.170) is valid both for the original Cavendish experiment and for the later Maxwell-McAlister experiment.
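The quality of the first-order approximation can be examined numerically. The sketch below is an illustration only: it evaluates the bracketed ratio of (3.170) directly with $f(r) = r^{1-\delta}/(1-\delta^2)$ and compares it with the first-order expression $\frac{\delta}{2}\left[\frac{a}{b}\ln\frac{a+b}{a-b} - \ln\frac{4a^2}{a^2-b^2}\right]$. The radii are hypothetical (they are not Maxwell's values, which are not quoted here), and the function names are invented for this example.

```python
import math

def f(r, delta):
    """f(r) for the trial force law F(r) = r^-(2+delta)."""
    return r ** (1.0 - delta) / (1.0 - delta ** 2)

def ratio_exact(a, b, delta):
    """V2/V1 from (3.170), evaluated without approximation."""
    return 1.0 - (a / b) * (f(a + b, delta) - f(a - b, delta)) / f(2.0 * a, delta)

def ratio_first_order(a, b, delta):
    """V2/V1 keeping only terms linear in delta."""
    bracket = ((a / b) * math.log((a + b) / (a - b))
               - math.log(4.0 * a * a / (a * a - b * b)))
    return 0.5 * delta * bracket

a, b = 2.5, 2.0  # hypothetical outer and inner radii, any common unit
for delta in (1e-2, 1e-3, 1e-4):
    print(delta, ratio_exact(a, b, delta), ratio_first_order(a, b, delta))
```

The two columns approach each other as $\delta$ decreases, and the exact ratio vanishes identically at $\delta = 0$, as a null result requires.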


One can carry this analysis further and assume that $Q_b = 0$ and thus that $V_2 = 0$. Equation (3.170) then gives

$$b f(2a) - a f(a+b) + a f(a-b) = 0$$

Holding a fixed, letting b vary, and differentiating twice with respect to b, one obtains

$$f''(a+b) = f''(a-b)$$

Since this must be true for any b < a, it follows that $f''(r) = K_1$ and $f'(r) = K_1 r + K_2$, in which $K_1$ and $K_2$ are constants. Hence,

$$\int_r^\infty F(R)\,dR = \frac{f'(r)}{r} = K_1 + \frac{K_2}{r}$$

from which

$$F(r) = \frac{K_2}{r^2} \qquad (3.172)$$

Thus on the basis of the assumption of a null result in the Cavendish experiment, the
electric force law is the inverse square. This proof is due to Maxwell. It borrows from a
procedure first used by Laplace who showed that no function of the distance except
the inverse square satisfies the condition that a uniform spherical mass shell exerts no
gravitational force on a particle within it.
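
The two differentiation steps in this proof are compressed; written out under the same assumptions (a held fixed, b varied, a null result for every b < a), a sketch of the intermediate algebra is:

```latex
\frac{\partial^2}{\partial b^2}\Bigl\{ b\,f(2a) - a\,f(a+b) + a\,f(a-b) \Bigr\}
   = -a\,f''(a+b) + a\,f''(a-b) = 0
   \quad\Longrightarrow\quad f''(a+b) = f''(a-b)

\frac{d}{dr}\int_r^\infty F(R)\,dR = -F(r), \qquad
\frac{d}{dr}\left(K_1 + \frac{K_2}{r}\right) = -\frac{K_2}{r^2}
   \quad\Longrightarrow\quad F(r) = \frac{K_2}{r^2}
```

The term $b\,f(2a)$ is linear in b and so drops out of the second derivative. Since a + b and a - b can be made to take any pair of positive values, equality of $f''$ at both arguments forces $f''$ to be constant.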

3.21 *

THE PLIMPTON-LAWTON EXPERIMENT

The accuracy of the Maxwell-McAlister result stood until 1936, when Plimpton and Lawton of Worcester Polytechnic Institute undertook the task of attempting a more exact measurement.²¹ Using modern equipment, they were able to show that the electric force must be an inverse power of the distance, $r^{-(2+\delta)}$, in which $\delta$ is bounded by $2 \times 10^{-9}$.
Plimpton and Lawton examined Maxwell's version of the Cavendish experiment very carefully. It was at first believed that by using a greater charging potential on the outer sphere and a more sensitive electrometer, one could increase the accuracy of Maxwell's method. This did not prove to be the case; in fact, it was concluded that Maxwell had apparently reached the limit attainable by his method, even granting the sensitivity of modern equipment. Maxwell's method suffered from two limitations: (1) radioactive contamination of the metal surfaces makes spontaneous ionization possible, and this could affect the charge on the inner globe during measurement, and (2) contact potentials establish a lower bound on the detectable voltage of the inner sphere. The second effect is the more severe of the two.
The contact potential difficulty was eliminated completely and the spontaneous ionization problem reduced by an ingenious modification of the apparatus. Plimpton and Lawton placed the detector inside the inner globe and thus were able to make permanent connections between the electrometer and the inner globe. This eliminated contact potentials entirely. It was also possible to seal the inner globe inside the outer one, thus reducing contamination of its surface. The time duration of the data-taking was drastically curtailed, thereby decreasing the accumulated effects of spontaneous ionization.

* This section may be omitted without loss in continuity of the technical presentation.
²¹ S. J. Plimpton and W. E. Lawton, "A Very Accurate Test of Coulomb's Law of Force between Charges," Phys. Rev., 50, 1066-1077; 1936.

The apparatus used by Plimpton and Lawton is shown in Figure 3.23. The outer
globe consisted of two hemispheres 5 feet in diameter, soldered together, and mounted
on a porcelain insulator. A slat floor was constructed inside the outer sphere and on it
was placed the detector, housed in copper boxes. These copper boxes formed the lower
half of the inner globe, the upper half being a 4-ft diam hemisphere, mounted on pyrex
glass insulators and connected to the detector boxes. (Plimpton and Lawton showed
that this deformation of the geometry used by Cavendish did not invalidate the applicability of Maxwell's analysis, as given in Section 3.20.)

FIGURE 3.23 The Plimpton-Lawton apparatus. (Labeled components: mirror, telescope, central rheostat, condenser generator, 110 a.c. supply.)

The detector used was a five-stage amplifier operating a galvanometer. This assembly
was suspended on rubber to avoid microphonics. The Johnson noise of the input resistor
caused an indication of only ~~ microvolt. The galvanometer was viewed through a
conducting window in the outer sphere, this being simply a glass-bottomed vessel
filled with a salt water solution.
During preliminary investigations, it was found that when switches were opened or
closed, the galvanometer deflected, due to magnetic field surges. A quasistatic procedure
was devised to circumvent this difficulty. The outer globe was charged by a sinusoidal
voltage source whose frequency was adjusted to the resonance of the galvanometer.
This greatly enhanced the sensitivity of the instrumentation as well as the signal-to-noise ratio. It was found that a frequency of about 2 cycles per second submerged the
galvanometer fluctuations due to inductive effects below the Johnson noise.


Plimpton and Lawton were unable to find a commercially available generator at such a low frequency; therefore they designed and constructed their own. The time-varying voltage was developed by moving the center plate of a tri-plate condenser connected to a suitable power supply.
The calibration procedure was simplified by employing a high-resistance potentiometer, as shown in Figure 3.24. During calibration, a small known fraction of the charging voltage was applied to the inner hemisphere. The potentiometer was varied until the smallest detectable voltage was determined. This voltage was consistently less than one microvolt.

FIGURE 3.24 Method of calibration. (Labeled components: feed from condenser generator, oscilloscope.)

During the actual experiment, the outer sphere was charged with a voltage which was always in excess of 3,000 volts. Although many trials were made, no detectable deflection was ever observed in the galvanometer.
To adapt Maxwell's analysis to the Plimpton-Lawton experiment, one can return to the expressions for the potentials of the two spheres, namely, Equations (3.167) and (3.168). Since Plimpton and Lawton were using such a low frequency (2 cps), these equations are still valid in their case, only now there is an implied time dependence of $e^{j\omega t}$. Solving these two equations for $Q_b$ gives

$$Q_b = \frac{2b^2 f(2a)\,V(b) - 2ab\,[f(a+b) - f(a-b)]\,V(a)}{f(2a)f(2b) - [f(a+b) - f(a-b)]^2} \qquad (3.173)$$

Since the detector is connected between the inner and outer spheres, the current flowing through the extremely high input resistance R of the detector is $\dot{Q}_b = j\omega Q_b$. Thus

$$V(a) - V(b) = j\omega Q_b R \qquad (3.174)$$


Elimination of $Q_b$ from Equations (3.173) and (3.174) gives

$$\left\{2ab\,[f(a+b) - f(a-b)] + \frac{f(2a)f(2b) - [f(a+b) - f(a-b)]^2}{j\omega R}\right\} V(a) = \left\{2b^2 f(2a) + \frac{f(2a)f(2b) - [f(a+b) - f(a-b)]^2}{j\omega R}\right\} V(b)$$

Since R is so large, this reduces to

$$\frac{a}{b}\,[f(a+b) - f(a-b)]\,V(a) = f(2a)\,V(b)$$

from which it follows that

$$V(a) - V(b) = V(a)\left[1 - \frac{a}{b}\,\frac{f(a+b) - f(a-b)}{f(2a)}\right] \qquad (3.175)$$

But this is the same expression which Maxwell obtained for the static case. Now, however, the voltages are sinusoidal, and the potential of the inner sphere is being measured with respect to the outer sphere rather than ground.
Once again, the assumption can be made that the electric force law is of the form $r^{-(2+\delta)}$ with $\delta$ small, yielding the first-order result

$$V(a) - V(b) \approx V(a)\,\frac{\delta}{2}\left[\frac{a}{b}\ln\frac{a+b}{a-b} - \ln\frac{4a^2}{a^2-b^2}\right]$$

When the values of a and b used by Plimpton and Lawton† are inserted in this expression, one obtains

$$V(a) - V(b) \approx \frac{\delta}{12}\,V(a) \qquad (3.176)$$

Their measurements indicated that V(a) - V(b) was not greater than one-half microvolt even for V(a) as great as 3,000 volts. This yields the result that $\delta$ is bracketed by the limits $\pm 2 \times 10^{-9}$.
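
The order of magnitude of this bound can be reproduced with a short calculation. The sketch below is a rough check under stated assumptions, not the authors' computation: it takes the nominal radii implied by the 5-ft outer globe and the 4-ft upper hemisphere described above (a = 2.5 ft, b = 2 ft), whereas the text notes that b should really be a suitable average over a non-spherical inner surface, so the coefficient obtained here differs from the one used in the text. The function and variable names are invented for this illustration.

```python
import math

def first_order_coefficient(a, b):
    """Coefficient C in [V(a) - V(b)] ~ C * delta * V(a), to first order in delta."""
    return 0.5 * ((a / b) * math.log((a + b) / (a - b))
                  - math.log(4.0 * a * a / (a * a - b * b)))

a_ft, b_ft = 2.5, 2.0   # nominal radii in feet (assumption: ideal concentric spheres)
dv_max = 0.5e-6         # largest undetected potential difference, volts (from the text)
v_outer = 3000.0        # charging voltage on the outer globe, volts (from the text)

coeff = first_order_coefficient(a_ft, b_ft)
delta_bound = dv_max / (coeff * v_outer)
print(f"coefficient = {coeff:.4f}, |delta| < {delta_bound:.2e}")
```

With these nominal radii the estimate comes out near $10^{-9}$, the same order of magnitude as the quoted limit; the remaining factor reflects the averaged value of b appropriate to the actual inner surface.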
Because of the great accuracy of this determination, Plimpton and Lawton even
investigated possible effects due to gravity. This influence was shown to cause a potential difference between the spheres of less than $10^{-10}$ volt, an effect which could be
neglected.
The Plimpton-Lawton determination of the law of electric force stands as a model
of precise experimentation and provides the most confident basis for a development
of an electromagnetic theory which uses the inverse square law as a postulate.
† The value used for b needs to be a suitable average, since their inner surface was not entirely spherical.

REFERENCES

1. Corson, D. R., and P. Lorrain, Introduction to Electromagnetic Fields and Waves, W. H. Freeman and Company, San Francisco, 1962.
2. Jackson, J. D., Classical Electrodynamics, John Wiley and Sons, Inc., New York, 1962.
3. Jeans, J., The Mathematical Theory of Electricity and Magnetism, 5th ed., Cambridge University Press, London, 1946.
4. Langmuir, R. V., Electromagnetic Fields and Waves, McGraw-Hill Book Company, New York, 1961.
5. Lenard, P., Great Men of Science, The Macmillan Company, Inc., New York, 1933.
6. Magie, W. F., A Source Book in Physics, McGraw-Hill Book Company, New York, 1935.
7. Panofsky, W. K. H., and M. Phillips, Classical Electricity and Magnetism, Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1956.
8. Plonsey, R., and R. E. Collin, Principles and Applications of Electromagnetic Fields, McGraw-Hill Book Company, New York, 1961.
9. Ramo, S., and J. R. Whinnery, Fields and Waves in Modern Radio, 2nd ed., John Wiley and Sons, Inc., New York, 1953.
10. Reitz, J. R., and F. J. Milford, Foundations of Electromagnetic Theory, Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1960.
11. Shamos, M. H., Great Experiments in Physics, Holt-Dryden Company, Inc., New York, 1959.
12. Shire, E., Classical Electricity and Magnetism, Cambridge University Press, London, 1960.
13. Smythe, W. R., Static and Dynamic Electricity, McGraw-Hill Book Company, New York, 1939.
14. Whittaker, E., A History of the Theories of Aether and Electricity, Vol. 1, Thomas Nelson and Sons, Ltd., London, 1951.
15. Wolf, A., A History of Science, Technology and Philosophy in the Eighteenth Century, The Macmillan Company, Inc., New York, 1939.

PROBLEMS

3.1

Two particles of equal mass m and equal charge q are suspended from a common point by
light strings of equal length. Find the angle of separation of the two strings.

3.2

Two small spheres are placed 1 m apart and a charge of 1 coul is placed on each. Will two
strong men be able to hold the spheres in position? (Define a strong man as one weighing
200 lbs and able to lift twice his weight.) How many excess electrons could be placed on
each sphere before the men begin to feel a sense of achievement?

3.3

An uncharged conductor of volume V is placed in a uniform electric field of strength E.


What force does it experience?

3.4

A quantity of negative charge -Ze is distributed uniformly throughout a sphere of radius
r. A positive point charge +Ze is situated at an arbitrary point within the negative cloud
of charge. Find the force on the positive charge.

3.5

Use the Dirac delta function to express a surface charge density distribution in the form
of a volume distribution. Insert this expression in (3.11) and (3.18) to verify (3.13) and
(3.26).

3.6

A metal shell of radius a contains a charge $Q_a$ and a second metal shell of radius b is given
a charge $Q_b$. If these two shells are then connected by a wire, in which direction will current
flow?


3.7

Use an energy argument to show that the charge on a conductor resides on its outer
surface.

3.8

A discrete system of N charges $q_n$ is located at the points $(\xi_n, \eta_n, \zeta_n)$. Find the energy which can
be extracted from this system when each charge is allowed to move infinitely far away
from all the others.

3.9

Rutherford, in 1911, gave a satisfactory explanation for the deflection of $\alpha$ particles by adopting as a model of the atom a uniformly distributed cloud of negative electricity of amount -Ze, contained in a sphere of radius $r_a$, with a positive charge Ze at its center. He obtained the following expressions for electric field and potential at any point within the atom

$$E(r) = \frac{Ze}{4\pi\epsilon_0}\left(\frac{1}{r^2} - \frac{r}{r_a^3}\right)$$

$$\Phi(r) = \frac{Ze}{4\pi\epsilon_0}\left(\frac{1}{r} - \frac{3}{2r_a} + \frac{r^2}{2r_a^3}\right)$$

Verify that these expressions are correct.

3.10

Find the equation for the equipotential surfaces due to an electric doublet. Plot a few of
the contours to scale. Are the results valid near the dipole? In what way should they be
modified?

3.11

A dipole of moment $\mathbf{p}$ is located in a uniform electric field of strength $\mathbf{E}$. Assume that $\mathbf{p}$
is initially perpendicular to $\mathbf{E}$ and say that in this position the dipole has zero potential
energy. Then show that if the dipole is rotated into any new position, its potential energy
has changed to $-\mathbf{p} \cdot \mathbf{E}$.

3.12

A flat circular ring of radius a contains a uniform charge density $\lambda$ coul/m. Find the potential and field intensity at any point along the axis.

3.13

Use the result of the previous problem to deduce the potential and field intensity at any
point along the axis of a disc of radius a which has a uniform surface charge density $\sigma$.

3.14

Two extensive metal plates are parallel and oppositely charged, each being insulated, with
the intervening space being a vacuum. If the plates are pulled further apart, explain why
their potential difference increases. Suppose that the initial separation was 1 mm and that
initially the potential difference was 1,000 volts. If the ultimate separation is 1 m, what
is the final voltage? This effect accounts for the high voltage of some lightning discharges.

3.15

Three thin concentric conducting spherical shells of radii a = 1 m, b = 2 m, and c = 4 m
are originally uncharged and insulated from each other by a vacuum. A total charge of +3
coul is then placed on the middle shell B. Next, A and C are electrically connected by a
thin wire which goes through a negligibly small hole in B without touching B. After the
wire is removed, the total charge on A and C is measured. Predict the results of these
measurements.

3.16

Use Gauss' law to show that the average potential over any spherical surface in a charge-free region is equal to the value of the potential at the center of the sphere.

3.17

With the aid of Gauss' law, determine the electric field distribution inside and outside
an electron beam of circular cross section of radius a. Assume the beam possesses a uniform
current density $\iota$ amp/m$^2$ and that the electrons are moving at a constant velocity $\mathbf{v} = \mathbf{1}_z v$.
(This problem may be treated electrostatically even though the charges are moving
because the time-average amount of charge at any position in the beam is a constant.)

3.18

Point charges q and -q are placed at the points A, B. The flux line which leaves A, making
an angle $\alpha$ with AB, meets the plane which bisects AB at right angles, in a point P. Show
that

$$\sin\frac{\alpha}{2} = \sqrt{2}\,\sin\frac{\angle PAB}{2}$$

Hint: When this flux line is rotated about AB as axis, the surface thus generated encloses
no net charge.

3.19

A point charge of +q coul is placed d m above an infinite grounded conducting plane in
otherwise empty space. Let P be that point in the plane nearest to q, and with P as center
draw a circle in the plane of radius r. If the circular area thus formed contains one-quarter
of the total induced charge in the plane, find the value of r.

3.20

Two infinite grounded conducting planes intersect at right angles and a point charge q is
placed a distance d from each plane. Find the surface charge distribution in each plane.

3.21

Let the electric field intensity at the surface of a thin spherical conducting shell be E.
Show that if an extremely small hole pierces the shell, then the electric field at the mouth
of the hole is $\tfrac{1}{2}E$.

3.22

Use the method of images to derive a field expression for the system consisting of a uniform
line charge parallel to and external to a right circular conducting cylinder. Consider the
general case in which the cylinder is at an arbitrary potential.

3.23

A grounded conducting tube of infinite length has a circular cross section of inner radius
10 cm. A line charge of 2 microcoul/m is placed parallel to and at a distance of 5 cm from
the axis of the tube. Determine the surface charge density distribution induced on the
inner surface of the tube. What is the net force per unit length acting on the tube?

3.24

Consider an electrified system consisting of a metallic sphere of radius a and excess charge
Q together with a point charge q a distance b > a from the center of the sphere. Find the
force on q and show that under certain circumstances it can be attractive even if q and Q
are of like sign.

3.25

A hollow conducting sphere has an internal radius of 1 m. A point charge of 1 microcoul
is placed within the sphere at a distance of 50 cm from the center. Find the surface charge
density distribution on the sphere and determine the net electrostatic force experienced
by the sphere.

3.26

Two equal charges q are placed at equal distances d from the center of a grounded conducting shell of radius a. If a > d and the charges are on the same diameter, find the net
force on each charge.

3.27

A hollow conductor is formed by a quarter of a sphere and two perpendicular diametral


planes. Find the image of a charge placed at any internal point.

3.28

A spherical conducting shell of radius b is insulated and uncharged and surrounds a


spherical conductor of radius a, the distance c between their centers being small. The inner
conductor contains a charge Q. Find the potential distribution between conductors and
the surface charge density.

3.29

A cylindrical volume of radius b, extending to infinity in both axial directions, contains


a space charge of constant density $\rho_0$. Find the field intensity E(r) and the electrostatic
potential $\Phi(r)$ for any radial distance.

3.30

Two infinite coaxial cylindrical conducting shells of radii a and b bound a uniform space
charge density $\rho_0$. Determine the potential distribution in this intervening region if the
inner and outer cylinders are held at potentials $\Phi(a) = 0$ and $\Phi(b) = V_0$ respectively.

3.31

Determine the trajectory of an outer electron in an electron beam of circular cross section
subject to spreading caused by space charge repulsions within the beam. Assume axial
velocities to be constant and radial velocities such that the beam diverges symmetrically
with no crossing of trajectories. Obtain the radial electric field as a function of r and relate
this to the radial acceleration; then integrate and determine r as a function of axial
distance. Assume that for $z \le 0$ the beam is confined to $r_m$ with no radial velocity.

[Figure for Problem 3.31: beam profile, with the radius $r_m$ marked.]

3.32

How long does it take an electron to travel from cathode to plate in a planar diode? Insert
typical values for the parameters and compute the time in microseconds.

3.33

A spherical volume of radius b contains a space charge density $\rho(r) = b^2 - r^2$. Find the
field intensity E(r) and the electrostatic potential $\Phi(r)$ for any radial distance. Check your
results by substitution in Poisson's equation.

3.34

A thin conducting spherical shell of radius a contains a total excess charge Q, and has its
inner surface coated with a thin insulating film. An equal amount of charge is distributed
throughout its hollow interior such that
T

<a

Find the charge distribution within the sphere and the surface charge density on its outer
surface. What is the absolute potential of the sphere? Of the point at its center?
3.35

Use Laplace's equation to show that the electrostatic potential cannot have a maximum
or a minimum value at any point in space not occupied by an electric charge. Then show
that if $\Phi$ is maximum at a point, the point must be occupied by a positive charge, whereas
a negative charge must be at a point where the potential is a minimum.

3.36

Two infinite parallel conducting plates are separated by a distance b as shown in the
figure. A very thin conducting septum, infinitely long and of height d, is connected to the
grounded plate, with the other plate kept at a constant potential $V_0$. Solve for the potential distribution between plates.

[Figure for Problem 3.36: parallel plates a distance b apart, with a septum on the grounded plate ($\Phi = 0$).]

3.37

Two L-shaped conducting channels are placed near each other so as to form a narrow
longitudinal slit, as shown in the figure. The two channels are kept at a difference in
potential $V_0$. Assume that the structure is infinite in the z direction and solve Laplace's
equation to obtain a solution for $\Phi$ in the region between the channels.

[Figure for Problem 3.37: channels at $\Phi = V_0$ and $\Phi = 0$ separated by a slit; x and y axes.]

3.38

Find the potential distribution and electric field between two half-plane conductors set
at an angle $\phi_0$ but not quite touching. Ignore edge effects and assume that one plate is
grounded with the other at a potential $\Phi(\phi_0) = V_0$. What is the charge distribution?

[Figure for Problem 3.38: half-plane conductors at $\Phi = 0$ and $\Phi = V_0$.]

3.39

Find the potential distribution between a four-segment commutator of radius $r_1$ and a
grounded concentric cylinder of radius $r_2 > r_1$. Alternate segments of the commutator
are at the potentials $V_0$ volts, as shown in the figure.


3.40

A spherical conducting shell of radius a is divided into two hemispheres by a narrow


equatorial gap. If the hemispheres are kept at a difference in potential $V_0$, find the potential distribution both inside and outside the shell.

3.41

For the quartered spherical shell of Example 3.25, find the potential distribution inside
the shell.

3.42

Find the Green's functions suitable for solving Dirichlet-type boundary-value problems
involving a rectangular box.

3.43

Repeat the preceding problem for a cylindrical box.

3.44

A line charge of density $\lambda$ coul/m is located symmetrically inside a 90-deg grounded conducting corner, being a distance d from each face, as shown in the figure. By using a suitable conformal transformation, and then employing the image principle, find the potential
distribution of this system.

[Figure for Problem 3.44: grounded 90-deg corner, $\Phi = 0$.]

3.45

Repeat the above problem if the line charge is placed symmetrically inside a grounded
trough, as shown in the figure.

[Figure for Problem 3.45: grounded trough, $\Phi = 0$.]

3.46

With the aid of a Schwarz transformation, find the potential and field distributions in the
region between the two right-angle conducting wedges shown in the figure.

[Figure for Problem 3.46: two right-angle wedges at $\Phi = V_0$ and $\Phi = 0$; dimension b.]

3.47

A capacitor is formed of three concentric cylinders of which the inner and outer are connected together. Neglecting end effects, obtain a formula for the capacitance per unit
length.

3.48

Use a Schwarz transformation to determine the change in capacitance, for the geometry
shown, over the value which would be obtained if a uniform field existed in both parallel
plane regions.

[Figure for Problem 3.48: conductors at $\Phi = V_0$ and $\Phi = 0$.]

3.49

A sandwich line consists of three parallel plane conductors, as shown in the figure.
Assuming these conductors are infinitely long in a direction perpendicular to the paper,
and neglecting fringing, find the coefficients of capacitance per unit length.

[Figure for Problem 3.49: three parallel plane conductors; dimension b.]

3.50

With the aid of a transformation in the complex plane, find the coefficients of capacitance
per unit length for the geometry shown in the figure.

3.51

Show that the energy stored in the field of a coaxial capacitor is consistent with the
formula $Q^2/2C$. Repeat this calculation for a spherical capacitor.

3.52

Consider an electrostatic system consisting of N conducting bodies possessing charges
$Q_n$ and potentials $V_n$. Prove Thomson's theorem, which states that the charge will
distribute itself so that when in equilibrium the electrostatic stored energy in the field is
a minimum.

3.53

A system of N conductors is charged in any manner and then charges are transferred
among the conductors until they are all brought to the same potential V. Show that there
has been a decrease in the stored electrostatic energy equal to what would be the energy
of the system if each of the original potentials had been decreased by an amount V.

3.54

Under the assumption that the error in measuring the angular position of ball a (see
Figure 3.4) was so much larger than any other error that it was the determining factor in
accuracy, and that Coulomb could measure this position within deg, to what accuracy
did he determine the inverse square law?

3.55

A proof of the inverse square law, assuming a null result in the Cavendish experiment, was
provided by Maxwell based on a formulation of potentials. Can you give an alternative
proof based on force?

3.56

What was the size of the auxiliary small brass ball used by Maxwell and McAlister?

3.57

From the description given of the Plimpton-Lawton experiment, what is the upper bound
on the total charge residing on the inner globe?

3.58

What was the average diameter of the inner closed surface in the Plimpton-Lawton
experiment?

CHAPTER

Magnetostatics in Free Space


Magnetostatics, as a topic within electromagnetic theory, is usually introduced by drawing upon experimental evidence to postulate either the Biot-Savart law or Ampere's circuital law. The theory of magnetic fields due to time-independent current distributions in free space is then developed. Following this, the behavior of magnetic materials is considered, usually in terms of aggregations of atomic current loops or equivalent magnetic dipoles. A satisfactory description of all gross static magnetic effects may be achieved in this manner.
The approach to be presented in this chapter differs in several respects from the above. First of all, no new experimentally based postulates will be introduced. Instead, the previously obtained results of special relativity and electrostatics will be used to derive the Biot-Savart law. The procedure will be to consider the force exerted on a test charge by a system of charges which are at rest relative to an observer O'. To a second observer O, in constant motion relative to O', this charge system is drifting at a constant velocity, and thus can take on the appearance of a steady current. The second observer detects a slightly different force to be acting on the test charge. This slight difference is determinable through the force transformation law and proves to be the seat of magnetostatics.
After the force transformation equations have been used to transform Coulomb's law and thus establish the Biot-Savart law, the chapter proceeds conventionally with the introduction of the magnetostatic vector potential function and the derivation of Ampere's circuital law. The magnetostatic vector potential function is found to satisfy Poisson's equation, leading to the solution of a class of boundary-value problems, in analogy with what was presented in Chapter 3 in the case of electrostatics. As illustrations of the theory, a variety of problems is solved, including the far field of a small current loop. This result forms the building block for an explanation of magnetic effects in materials. However, the discussion of magnetic materials will be deferred until Chapter 7, in order to be able to include time-varying effects.

4.1 *

HISTORICAL SURVEY

* This section may be omitted without loss in continuity of the technical presentation.

Man's awareness of magnetic effects appears to be almost as old as recorded history, but most of the early knowledge was concerned with the properties of permanent magnets. The subject of magnetic fields caused by electric currents had a very well-defined beginning in the winter of 1819-1820. During that period, Professor Hans Christian Oersted (1777-1851) of the University of Copenhagen experimented with the placement of a closed electric circuit near a compass needle. He had been motivated in this study by the observation that a compass needle fluctuated erratically during a thunderstorm. Accordingly, he set up an apparatus consisting of a galvanic battery and a short-circuiting wire. Apparently during one of his lectures Oersted placed the wire at right angles to a compass needle, but observed no effect. At the end of this lecture the thought occurred to him to place the wire parallel to the needle. This action immediately caused a pronounced deflection in the needle. After putting together a more powerful galvanic battery, Oersted assembled some of his colleagues as witnesses and repeated the experiment. Excerpts of his own account¹ of what was observed follow:
The opposite ends of the galvanic battery were joined by a metallic wire, which, for shortness sake, we shall call the uniting conductor . . . . To the effect which takes place in this
conductor and in the surrounding space, we shall give the name of the conflict of electricity.
Let the straight part of this wire be placed horizontally above the magnetic needle,
properly suspended, and parallel to it . . . . Things being in this state, the needle will be
moved . . . .
If the distance of the uniting conductor does not exceed three-quarters of an inch from
the needle, the declination of the needle makes an angle of about 45°. If the distance is
increased, the angle diminishes proportionally. The declination likewise varies with the
power of the battery . . . .
The effect of the uniting conductor passes to the needle through glass, metals, wood, water,
resin, stoneware, stones; . . . . The effects, therefore, which take place in the conflict of
electricity are very different from the effects of either of the electricities.
If the uniting conductor be placed in a horizontal plane under the magnetic needle,
all the effects are the same as when it is above the needle, only they are in the opposite
direction . . . .

After noting that a rotation of the wire would be tracked by a rotation of the magnetic
needle, and that no effect was observed for needles made of brass, glass, or gum lac,
Oersted offered a few observations in the nature of an explanation of the phenomenon:
The electric conflict acts only on the magnetic particles of matter. All non-magnetic
bodies appear penetrable by the electric conflict, while magnetic bodies, or rather their
magnetic particles, resist the passage of this conflict. Hence they can be moved by the
impetus of the contending powers.
It is sufficiently evident from the preceding facts that the electric conflict is not confined
to the conductor, but dispersed pretty widely in the circumjacent space.
From the preceding facts we may likewise collect that this conflict performs circles; for
without this condition, it seems impossible that the one part of the uniting conductor, when
placed below the magnetic pole, should drive it towards the east, and when placed above it
towards the west . . . .

There has been some debate as to the extent of the honor which should be accorded
Oersted for this discovery. A principal factor occasioning this debate is a letter from
Hansteen (one of Oersted's students) to Faraday, in which he says:²
1 H. C. Oersted, "Experiments on the Effect of a Current of Electricity on the Magnetic Needle,"
a pamphlet dated July 21, 1820, distributed privately to scientists and scientific societies. English
translation in Ann. of Philosophy, 16, 273-276; 1820.
2 Bence Jones, The Life and Letters of Faraday, vol. 2, pp. 389-392, Longmans, Green and Company,
London, 1870.


Professor Oersted was a man of genius, but he was a very unhappy experimenter: he
could not manipulate instruments . . . . Oersted tried to place the wire of his galvanic
battery perpendicular over the magnetic needle, but remarked no sensible motion. Once,
after the end of his lecture, as he had used a strong galvanic battery to other experiments, he
said, "Let us now once, as the battery is in activity, try to place the wire parallel with the
needle"; as this was made, he was quite struck with perplexity by seeing the needle make
a great oscillation . . . . Thus the great detection was made; and it has been said, not
without reason, that "he tumbled over it by accident." He had not before any more idea
than any other person that the force should be transversal. But as Lagrange has said of
Newton in a similar occasion, "such accidents only meet persons who deserve them."

Considerable weight has been given to this letter by Hansteen, because he was apparently
a witness to the original discovery. However, the letter was written in 1857, almost
forty years after the fact, and six years after Oersted's passing. True, Oersted's own
account, paraphrased above, was completely lacking in quantitative determination, but
all the salient features of the phenomenon had quite clearly been investigated, including
the dependence on current strength, distance of separation, and even shielding effects.
The inference of a circular distribution of magnetic field lines was certainly an able
deduction. As Lenard³ has pointed out, the fact that Oersted had a battery and compass
needle on the table indicates he was looking for such an effect and that the discovery
cannot fairly be labeled as a pure accident. But whatever the true circumstances were
surrounding this discovery, it was one of the most important in the history of science,
linking for the first time the fields of electricity and magnetism.
Oersted's discovery was promptly enlarged by others. The academician Arago learned
of it while traveling abroad and, upon his return to Paris, described the effect at a meeting of the French Academy on September 11, 1820. This news excited the interest of
several investigators, and the next discovery was announced by Andre-Marie Ampere
(1775-1836) just one week later. Reasoning that, if magnets exert forces on each other,
and if electric currents exert forces on magnets, then two electric currents should
interact, Ampere devised an experiment in which⁴
. . . in parallel directions, two straight parts of two conducting wires joined the terminals
of two voltaic piles; the one being fixed, and the other suspended from points and made very
mobile by a counterpoise, being able to approach or withdraw while still retaining its
parallelism with the first wire, I have then observed that upon passing an electric current
through each of them, they mutually attract if the two currents are in the same direction,
and that they repel each other when, instead, (the currents) are in opposite directions.

Meanwhile, Jean-Baptiste Biot (1774-1862) and Felix Savart (1791-1841) repeated
Oersted's experiments, and announced to the Academy at the October 30th meeting
that they had determined a law of force which governed the effect. The following brief
notice of their announcement was printed in the Journal de Physique:⁵
The beautiful observations of M. Oersted, combined with precise measurements of torsion
and oscillation, give the following expression for the action exercised at a distance on an
austral or boreal magnetic pole, by a nearby thin copper wire, of great length, connected to
the two terminals of a voltaic apparatus. From the point of the pole draw a perpendicular
to the axis of the wire. The force acting on the pole is perpendicular to the axis of the wire.
Its intensity is proportional to the reciprocal of the distance. The nature of its action is the
same as if a magnetic needle were to be placed tangentially to the contour of the wire (in
place of the wire), in which case the austral and boreal magnetic poles would be acted upon
in opposite senses, but always along the same straight line determined by the preceding
construction.

3 P. Lenard, Great Men of Science, p. 214, The Macmillan Company, Inc., New York, 1933.
4 A. M. Ampere, "Memoir on the Mutual Action of Two Electric Currents," Annales de Chimie et
Physique, 15, pp. 59-76; 1820.
5 J de Phys (Paris), 91, 151; 1820. See also Ann de Chimie et Physique, 15, 222-223; 1820.

The best source for the details of the experiment which established this law is Biot's
Précis Élémentaire de Physique.⁶ The method used can be understood with reference to
Figure 4.1a, which is a reproduction of Biot's original drawing. Shown is a compass
needle $AB$, which can freely pivot about its center point, and which is placed a distance
$r$ from a long, straight wire $CMZ$. A permanent magnet $A'B'$ (not shown) is positioned
nearby in such a way as to cancel the effect of the earth's magnetic field. The equilibrium
position of the needle is then found to be perpendicular to the wire axis. If the
needle is pictured as having two equal and opposite magnetic poles at its extremities,
the forces exerted by the current on these poles are thus equal, opposite, and circumferential.
If then the needle is displaced from equilibrium by a small angle $\theta$, as shown in
Figure 4.1b, a restoring couple is experienced by the needle, and its equation of motion
is

$$-F(r)\,L\sin\theta = I\ddot{\theta}$$

in which $L$ is the length of the needle and $I$ is its moment of inertia. For small
displacements, $\sin\theta \approx \theta$ and harmonic oscillations will occur of period

$$T = 2\pi\sqrt{\frac{I}{F(r)\,L}}$$

Thus, in Biot's words,

. . . if we compare in this way, the squares of the periods, for different distances of the
uniting wire from the needle, supposing always the condition of isochronism to be fulfilled,
we shall obtain the ratios of the component forces exerted in these different cases by the
uniting wire, parallel to the direction of equilibrium about which the needle oscillates.

6 The third edition of this text was printed in Paris in 1824. An English translation is embodied in
J. Farrar's Elements of Electricity, Magnetism, and Electromagnetism, printed by Hilliard and Metcalf,
Cambridge, Massachusetts, 1826.

FIGURE 4.1 The Biot-Savart experiments. Panels (a), (b), and (c).

Upon performing this experiment, Biot and Savart obtained the data reproduced
in Table 4.1. The last column of data was calculated under the assumption that the
law was inverse with distance. Since the errors were alternately positive and negative,
irregular, and greater for the larger distances, Biot and Savart concluded that the
law had been fairly established.
TABLE 4.1
DATA FOR THE BIOT-SAVART EXPERIMENT

Distance of          Duration of ten oscillations
the wire, mm         Observed, sec      Calculated, sec

     15                 30.00              30.99
     20                 33.50              33.88
     40                 48.85              48.62
     50                 54.75              53.74
     60                 56.75              59.40
    120                 89.00              84.25
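The reasoning behind the comparison can be sketched numerically. If the force varies as $1/r$, the period formula gives $T \propto \sqrt{r}$, so a single constant $k$ in $T = k\sqrt{r}$ should describe every row. The sketch below is an illustration only: the pairing of distances with observed durations follows the table as laid out above, the least-squares choice of `k` is an assumption of this example (it is not Biot's own calculation and does not reproduce the "Calculated" column), and any residual field acting on the needle is ignored.

```python
import math

# Table 4.1 as laid out above: wire distance (mm) and observed
# duration of ten oscillations (s). The pairing is an assumption.
distances_mm = [15, 20, 40, 50, 60, 120]
observed_s = [30.00, 33.50, 48.85, 54.75, 56.75, 89.00]

# Inverse-distance law: F ~ 1/r and T ~ 1/sqrt(F), hence T = k * sqrt(r).
roots = [math.sqrt(r) for r in distances_mm]

# Least-squares value of k for a one-parameter model through the origin.
k = sum(t * s for t, s in zip(observed_s, roots)) / sum(s * s for s in roots)

fitted_s = [k * s for s in roots]
residuals_s = [t - f for t, f in zip(observed_s, fitted_s)]

for r, t, f, e in zip(distances_mm, observed_s, fitted_s, residuals_s):
    print(f"{r:4d} mm  observed {t:6.2f}  fitted {f:6.2f}  residual {e:+6.2f}")
```

The residuals change sign from row to row and are largest at the two greatest distances, which is the pattern the text describes.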

Biot extended this experiment significantly by inquiring what the action must be
on the compass needle due to an infinitesimal length of the wire. Since the influence
of the entire straight wire varied as $r^{-1}$, and since $r^{-1}$ is the integral of $r^{-2}$, he felt that
each element of the wire should make a contribution to the total force which is proportional
to the inverse square of its distance from the needle. However, he realized that
the contribution might also depend on the orientation of the element relative to the
needle, and devised an experiment to deduce this relation. Referring to Figure 4.1c,
Biot introduced an additional V-shaped wire with its apex close to the central point
of the first wire. He then determined the period of the compass needle as a function of
$r$, with a steady current alternately passing through the straight wire and the bent
wire. The difference in period under the action of the two wires could be explained⁷ by
the assumption that the contribution from a single current element $I\,d\boldsymbol{\ell}$ was proportional
to $(\sin\omega)/\mathcal{R}^2$, in which $\mathcal{R}$ is the distance from the element to the needle and $\omega$ is
the angle between the element and the line joining it to the needle. The discovery of this
fact led Biot to proclaim
. . . the elementary action of any lamina whatever (is) proportional to $\sin\omega/\mathcal{R}^2$; and
uniting with this expression, which is founded upon experiment, the knowledge of the
absolute direction of the force which is perpendicular to the plane drawn through each
distance and through the direction of each longitudinal element of the wire under consideration, we may assign by calculation the total resultant of the action exerted by a wire,
or by any portion of a wire, whether straight or curved, limited or indefinite.

In present-day notation, this result is equivalent to saying that a system of steady
currents $I$ creates a magnetic field at point $(x,y,z)$ given by

$$\mathbf{B}(x,y,z) \propto \int \frac{I\,d\boldsymbol{\ell} \times \boldsymbol{\mathcal{R}}}{\mathcal{R}^3}$$

and that if a magnetic pole of strength $m$ is placed at $(x,y,z)$ it will experience a force
$m\mathbf{B}$. In the above formula, $\mathcal{R}$ is the distance from the element $d\boldsymbol{\ell}$ to $(x,y,z)$. This
important equation is known as the Biot-Savart law and is often taken as the experimental
postulate on which magnetostatics is based.
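Biot's step from the element law to the law for the whole wire can be checked with a short numerical sum. The sketch below adds the elementary actions $\sin\omega/\mathcal{R}^2$ along a straight wire and confirms that the total falls off as $1/r$. The function name, the finite half-length standing in for a wire "of great length", and the midpoint rule are all choices made for this illustration; constants of proportionality are dropped.

```python
import math

def straight_wire_action(r, half_length=1000.0, n=200_000):
    """Sum sin(omega)/R^2 over the elements of a straight wire.

    r is the perpendicular distance of the pole from the wire, which
    runs from -half_length to +half_length along its own axis.
    """
    dz = 2.0 * half_length / n
    total = 0.0
    for i in range(n):
        z = -half_length + (i + 0.5) * dz   # midpoint of the i-th element
        R = math.hypot(r, z)                # distance from element to pole
        sin_omega = r / R                   # sine of the angle between element and R
        total += sin_omega / R**2 * dz
    return total

for r in (1.0, 2.0, 4.0):
    action = straight_wire_action(r)
    print(f"r = {r}: total action = {action:.6f}, r * action = {r * action:.6f}")
```

The product `r * action` stays close to 2 for every distance, which is the inverse-distance law recovered from the inverse-square element law.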
Ampere, following his announcement of the discovery of the force between two
currents, continued his investigations, and later that year published a memoir⁸ which
succeeded in clarifying much of what was known about electricity at that time. He
distinguished phenomena involving electricity at rest from phenomena involving
electricity in motion, introducing for the former the name electrostatics, and for the
latter the name electrodynamics. He also distinguished between electric tension (voltage) and electric current. At that time, people were accustomed to speak of the conduction and flow of electricity, but since the two-fluid theory was popular, considerable
confusion existed with respect to the nature of the flow process. Ampere decided that
he would call the whole process an electric current, without regard to its inner nature,
and with the direction of the current defined as the direction in which the positive fluid
was presumed to move. This made the electric current something definite in terms of
which phenomena could be described.
The concept of electric potential, or tension, had been privately appreciated by
Cavendish, and had been admirably developed for electrostatics by Poisson. Ampere
noted that electric tension was observable in a voltaic pile before the circuit was closed,
being detectable through use of an electrometer or electroscope, instruments which
Ampere labeled as measurers of tension. As for the current itself, Ampere felt that it
was best measured by means of its magnetic effects, and he introduced for this purpose
an instrument which he called a galvanometer, an instrument which is still in use today.
To Ampere, tension appeared as a cause, and current as an effect. Noting that as
soon as the effect appears through completion of the circuit, the tension "disappears,
or at least becomes very small," Ampere then made the interesting observation⁹
The currents of which I speak self-accelerate until the inertia of the electric fluids and the
resistance that they encounter due to the imperfections in even the best conductors cause
equilibrium with the electromotive force, after which they continue indefinitely at a constant
speed such that this force remains at a constant intensity; but they cease entirely at
the instant that the circuit is interrupted.

Ohm's law, which was to be enunciated seven years later, is thus seen to be not far off
in Ampere's thinking.

7 See Farrar, ibid., pp. 334-339. See also Problem 4.3 at the end of this chapter.
8 Ibid., pp. 59-68.
With a clear definition of current, and a means for measuring it, Ampere continued
his researches over the next three years, and in 1825 collected his results in a lengthy
memoir¹⁰ which must rank as one of the most distinguished in the history of science.
In this memoir, Ampere concerned himself with the problem of determining the law
of force between two current elements. A wide variety of experiments on an assortment
of wire geometries had led him to four conclusions about the force interaction between
currents:
1. The action of one current on another is unchanged in magnitude, but reversed
in direction, when the direction of the current is reversed.
2. The effect of a conductor bent or twisted in any small manner is the same as if
the contour were smoothed out.
3. The force exerted by a closed circuit on a current element is always normal to the
element.
4. If all dimensions of a circuit are changed proportionally, with the currents
unchanged, the forces retain their original values.
When Ampere added to these four conditions the natural assumption that the force
$d^2\mathbf{F}$ between two current elements $I\,d\boldsymbol{\ell}$ and $I'\,d\boldsymbol{\ell}'$ is along their connecting line (an
assumption consistent with Newton's gravitational theory and the Coulomb-Poisson
electrostatic theory), he was able, by an astute piece of analysis, to establish the force
law

$$d^2\mathbf{F} \propto II'\,\boldsymbol{\mathcal{R}}\left[\frac{2\,d\boldsymbol{\ell}\cdot d\boldsymbol{\ell}'}{\mathcal{R}^3} - \frac{3\,(d\boldsymbol{\ell}\cdot\boldsymbol{\mathcal{R}})(d\boldsymbol{\ell}'\cdot\boldsymbol{\mathcal{R}})}{\mathcal{R}^5}\right]$$

in which $\mathcal{R}$ is the distance separating the two current elements. A clear exposition of the
analysis leading to this formula, as well as the experimental basis for Ampere's four
conditions, can be found in Mason and Weaver.¹¹
If Ampere's third condition is formulated in terms of a field concept, one may write

$$d\mathbf{F} = I'\,d\boldsymbol{\ell}' \times \mathbf{B}$$

in which $d\mathbf{F}$ is the force exerted on the current element $I'\,d\boldsymbol{\ell}'$ and $\mathbf{B}$ is the field caused
by the closed circuit. From this, if $I\,d\boldsymbol{\ell}$ is an element of the closed circuit, according
to the Biot-Savart law, it exerts a force on $I'\,d\boldsymbol{\ell}'$ given by

$$d^2\mathbf{F} \propto II'\,\frac{d\boldsymbol{\ell}' \times (d\boldsymbol{\ell} \times \boldsymbol{\mathcal{R}})}{\mathcal{R}^3}$$

which does not agree with Ampere's formula for $d^2\mathbf{F}$.

9 Ibid., p. 64.
10 A. M. Ampere, "On the Mathematical Theory of Electrodynamic Phenomena Uniquely Deduced
from Experiment," Mem Acad, 175-388; 1825.
11 M. Mason and W. Weaver, The Electromagnetic Field, pp. 176-183, The University of Chicago Press,
1929. Reprinted by Dover Publications, Inc., New York.


A lively controversy ensued for some time as to which of these formulas for differential
force is correct. Various investigators have shown¹² that for closed circuits
$\iint d^2\mathbf{F}$ gives the same answer when starting with either formula. In Ampere's time it
was not possible to decide the question by experiment. Now that the motion of free
charges can be studied under the influence of magnetic fields, the decision is clearly
in favor of the Biot-Savart formula. Ampere's difficulty can be traced to the improper
assumption that the elemental force acts along the connecting line.
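The agreement for closed circuits can be illustrated numerically. The sketch below sums both differential formulas over a closed circular loop acting on a single test element and compares the totals. Several things here are assumptions of the example rather than statements from the text: the common constant of proportionality is set to one, Ampere's formula is given the overall minus sign that makes parallel like currents attract, $\boldsymbol{\mathcal{R}}$ is drawn from the source element to the test element, and the loop radius, test point, and test-element direction are arbitrary.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def loop_on_element(point, dl_test, radius=1.0, n=4000):
    """Total force of a closed circular loop on one test element, both ways."""
    ampere = [0.0, 0.0, 0.0]
    biot_savart = [0.0, 0.0, 0.0]
    for i in range(n):
        a0 = 2.0 * math.pi * i / n
        a1 = 2.0 * math.pi * (i + 1) / n
        p0 = (radius * math.cos(a0), radius * math.sin(a0), 0.0)
        p1 = (radius * math.cos(a1), radius * math.sin(a1), 0.0)
        dl = tuple(b - a for a, b in zip(p0, p1))            # source element
        mid = tuple(0.5 * (a + b) for a, b in zip(p0, p1))   # its midpoint
        R = tuple(x - m for x, m in zip(point, mid))         # source -> test
        Rm = math.sqrt(dot(R, R))
        # Ampere: force along the connecting line
        bracket = (2.0 * dot(dl, dl_test) / Rm**3
                   - 3.0 * dot(dl, R) * dot(dl_test, R) / Rm**5)
        # Biot-Savart: dl' x (dl x R) / R^3
        triple = cross(dl_test, cross(dl, R))
        for k in range(3):
            ampere[k] += -R[k] * bracket
            biot_savart[k] += triple[k] / Rm**3
    return ampere, biot_savart

ampere_total, biot_savart_total = loop_on_element((0.3, 0.2, 0.5), (0.0, 1.0, 0.0))
print("Ampere     :", ampere_total)
print("Biot-Savart:", biot_savart_total)
```

The two totals agree to within the discretization error even though the individual element contributions differ in both magnitude and direction.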
Unlike Biot, who regarded magnetic poles as fundamental, Ampere considered
magnetism to be basically an electrical phenomenon. He viewed a magnetized rod as
equivalent to a coil carrying an uninterrupted current. He showed that two solenoids
deflect each other in exactly the same way as do two magnetized rods, and was even
able to show that a single current loop, when free to move, sets itself like a compass
needle with respect to the earth's magnetic field. Ultimately, Ampere came to the view
that every magnetic molecule is really a small permanent circular current. This viewpoint was much too advanced for his contemporaries. The meager knowledge of atomic
structure would not permit the conception of permanent currents within materials
without a source of power. However, the impression produced by this memoir was deep
and lasting, and today Ampere's views of these phenomena form the core of magnetic
theory. He is properly credited with authorship of the force law $d\mathbf{F} = I'\,d\boldsymbol{\ell}' \times \mathbf{B}$,
even though Biot and Savart deserve citation for the correct formulation of B in terms
of the current elements in a closed circuit. Ampere himself extended the applicability
of this formula by showing that a permanent magnet will exert a force on a current.
His achievements were truly remarkable and Maxwell, writing a half century later,
labeled his memoir "one of the most brilliant achievements in science." As a fitting
tribute, the unit for electric current and the circuital law linking magnetic field and
current are named in his honor.
During this same period Faraday made a discovery of the greatest practical importance. His interest had been aroused in electromagnetism in April 1821 when Wollaston,
a colleague at the Royal Institution, attempted to make a current-carrying wire revolve
around its own axis in the presence of a magnet. Although the experiment was unsuccessful, it piqued Faraday's interest. He began by reading what had been done by
Oersted, Ampere, Biot and Savart, and others, and repeated many of their experiments.
Finally, upon repeating Wollaston's experiment, he noted:¹³
. . . Magnets of different power brought perpendicularly to this wire did not make it
revolve as Dr. Wollaston expected, but thrust it from side to side . . . . The effort of the
wire is always to pass off at a right angle from the pole, indeed to go in a circle round it; . . .
a single magnet pole in the centre of one of the circles should make the wire continually
turn round. Arranged a magnet needle in a glass tube with mercury about it and by a cork,
water, etc., supported a connecting wire so that the upper end should go into the silver cup
and its mercury and the lower move in the channel of mercury round the pole of the
needle . . . . In this way got the revolution of the wire round the pole of the magnet . . . .
Very Satisfactory, but make more sensible apparatus.

12 Cf., e.g., Mason and Weaver, ibid., pp. 183-185.
13 Faraday's Diary, vol. 1, pp. 49-50, being entries in his laboratory notebook for September 3rd, 1821.
Published by G. Bell and Sons, Ltd., London, 1932.

This was the first electric motor. The next day Faraday improved on it and shortly
thereafter invented the commutator. But he left to others the reduction to practice.
Magnetostatic theory was advanced and placed more in analogy with electrostatic
theory by Franz Neumann (1798-1895) of Konigsberg, who introduced the concept
of the magnetic vector potential function $\mathbf{A}$. Neumann discovered the utility of this
formulation¹⁴ while devising a theory based on Faraday's emf law, and the $\mathbf{A}$ which
he defined is the more general time-varying function which will be encountered in
Chapter 5. However, its time-independent counterpart facilitates the solution of many
magnetostatic problems and will be used extensively in the sections to follow.
An entirely different approach to magnetostatic theory was first perceived by Leigh
Page (1884-1952) of Yale University.¹⁵ Adopting the principles of special relativity
and Coulomb's law as fundamental postulates, he began with a system of charges at
rest relative to an observer $O'$. Upon introducing a second observer $O$, in constant motion
relative to $O'$, he observed that the charge system took on the appearance to $O$ of a
steady current. Upon transforming the Coulomb force, Page noted that the force on a
test charge, as observed by $O$, was slightly different from the value determined by $O'$.
He was able to show that this small difference precisely accounted for the magnetic field.
Not only was this demonstration additional evidence in support of the validity of
special relativity, but it also further illumined the basic unity of all electrical phenomena,
whether they are due to charges at rest or in motion. A generalization of Page's
development will form the core of the present chapter.
4.2

THE TRANSFORMATION OF ELECTRIC FORCE

Let an electric charge $q_1'$ be at rest at an arbitrary point $(x_1', y_1', z_1')$ in the $X'Y'Z'$
coordinate system and let a moving† charge $q'$ be instantaneously at the position

$$\mathbf{r}' = \mathbf{1}_x x' + \mathbf{1}_y y' + \mathbf{1}_z z'$$

The coordinates $x'$, $y'$, $z'$ may be arbitrary functions of time so that, in general, the
charge $q'$ has a velocity $\mathbf{v}'(t')$. This situation is depicted in Figure 4.2.
An observer $O'$, stationary in $X'Y'Z'$, will determine the force exerted by $q_1'$ on $q'$
through application of Coulomb's law, obtaining

$$\mathbf{F}' = \frac{q' q_1'}{4\pi\epsilon_0}\,\frac{\boldsymbol{\mathcal{R}}'}{(\mathcal{R}')^3} \tag{4.1}$$

in which $\boldsymbol{\mathcal{R}}' = \mathbf{1}_x (x' - x_1') + \mathbf{1}_y (y' - y_1') + \mathbf{1}_z (z' - z_1')$ and it is assumed that the
charges are in free space.
† For the moment this statement means that this charge had a value of $q'$ when at rest in $X'Y'Z'$.
It shall be seen shortly that it is convenient to consider charge to be an invariant.
14 F. E. Neumann, "The Mathematical Laws for Induced Electric Currents," Berlin Abhandlungen,
p. 1; 1845. Also p. 1; 1848. Reprinted as nos. 10 and 36 of Ostwald's Klassiker.
15 L. Page, "A Derivation of the Fundamental Relations of Electrodynamics From Those of
Electrostatics," Am J Sci, 34, 57-68; 1912.

FIGURE 4.2 Notation for a moving charge acted on by a fixed charge.

Let the coordinate axes of an $XYZ$ system be aligned with the corresponding axes
of the $X'Y'Z'$ system and let the $X$ axis slide along the $X'$ axis in the negative direction
at a speed $u$. Upon the coincidence of the two origins let $t = t' = 0$. If $\mathbf{v}_1$ and $\mathbf{v}$ are the
velocities of $q_1'$ and $q'$ with respect to an observer $O$ who is stationary in $XYZ$, then

(4.2)
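From Equations (2.77), the force exerted on $q'$ by $q_1'$, as determined by $O$, is given by

$$\mathbf{F} = \mathbf{F}_e + \mathbf{F}_m \tag{4.3}$$

in which

$$\mathbf{F}_e = \mathbf{1}_x F_x' + \mathbf{1}_y \kappa F_y' + \mathbf{1}_z \kappa F_z' \tag{4.4}$$

$$\mathbf{F}_m = \frac{\kappa u}{c^2}\left[\mathbf{1}_x (v_y F_y' + v_z F_z') - \mathbf{1}_y v_x F_y' - \mathbf{1}_z v_x F_z'\right] \tag{4.5}$$

with

$$\kappa = (1 - u^2/c^2)^{-1/2}$$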

The division of $\mathbf{F}$ into two parts is arbitrary but useful in that $\mathbf{F}_e$ contains all the
terms not dependent on the motion of $q'$ relative to $XYZ$, whereas $\mathbf{F}_m$ contains all the
terms which do depend on $\mathbf{v}$. The subscripts on the forces $\mathbf{F}_e$ and $\mathbf{F}_m$ refer to the fact
that they shall be designated the coulomb force and the magnetostatic force for reasons
which shall become evident.
Two cases of this force transformation will now be considered.

Case 1: $\mathbf{v} \equiv 0$.

In this case, $q_1'$ is static in $X'Y'Z'$ and $q'$ is static in $XYZ$. The force exerted by $q_1'$ on
$q'$, as determined by $O$, is simply $\mathbf{F}_e$ given by (4.4).

Suppose that one wishes to apply the concept of electric flux density to this situation.
The reader will recall that in Chapter 3 the idea of electric flux was introduced in
connection with a system of fixed charges. Consistent with that discussion, observer $O'$
can say, since $q_1'$ is at rest in his coordinate system, that the electric flux density
associated with $q_1'$ is given by

$$\mathbf{D}_0' = \frac{q_1'}{4\pi}\,\frac{\boldsymbol{\mathcal{R}}'}{(\mathcal{R}')^3} \tag{4.6}$$

in which $\boldsymbol{\mathcal{R}}'$ is drawn from $q_1'$ to a field point $P'$ where $\mathbf{D}_0'$ is being determined.
$O'$ can say further that if $q'$ is instantaneously at the point $P'$, it interacts with the
field $\mathbf{D}_0'$ so as to experience a force

$$\mathbf{F}' = q'\,\frac{\mathbf{D}_0'}{\epsilon_0} \tag{4.7}$$

which is consistent with (4.1).


One needs to proceed cautiously, however, in picturing force as the interaction of
charge and electric flux, when adopting the viewpoint of observer $O$. For him, the charge
$q_1'$ is not at rest. To discuss the concept of electric flux associated with a moving charge
requires an extension of the original definition of electric flux.
It is useful to explore the consequences of enlarging the original definition by assuming
that a quantity of electric charge and the total electric flux associated with it are
invariants. The $O'$ observer, for whom $q_1'$ is at rest, will picture the electric flux as
emanating from $q_1'$ with a spherically uniform distribution. Referring to Figure 4.3a, he
can find the components of $\mathbf{D}_0'$ at any point $P'$ by erecting small displacements $P'P_1'$
and $P'P_2'$ as shown. When the segment $P'P_1'$ is rotated around the line $L'$ as axis, it
cuts out a band of area $S_1'$, as shown in Figure 4.3b. If all the electric flux which pierces
this band is counted and divided by the area $S_1'$, $O'$ obtains the transverse flux density
$[(D_{0y}')^2 + (D_{0z}')^2]^{1/2}$. Similarly, when the segment $P'P_2'$ is rotated about $L'$, it cuts out
a ring of area $S_2'$ as shown in Figure 4.3c. If all the electric flux which pierces this ring
is counted and divided by the area $S_2'$, the longitudinal flux density $D_{0x}'$ is obtained.
The $O$ observer, for whom the charge $q_1$ has the velocity $\mathbf{1}_x u$, sees all longitudinal
dimensions of $X'Y'Z'$ contracted by the factor $(1 - u^2/c^2)^{1/2}$ and all transverse
dimensions unaltered, thus picturing the instantaneous situation shown in Figure 4.3d.
Under the assumption that charge and total flux are invariants, $O$ will count the same
number of flux lines piercing the area $S_1$ (generated by rotating $PP_1$ about $L$ as in
Figure 4.3e) that $O'$ counts piercing the area $S_1'$. However, $O$ finds the area $S_1$ to be
smaller than $S_1'$, the relation being

$$S_1 = S_1'/\kappa$$

Thus $O$ concludes that the transverse flux density is

$$(D_{0y}^2 + D_{0z}^2)^{1/2} = \kappa\left[(D_{0y}')^2 + (D_{0z}')^2\right]^{1/2} \tag{4.8}$$

By rotating $PP_2$ about $L$ as axis (Figure 4.3f), $O$ generates the same area as does $O'$
and counts the same number of piercing flux lines and thus concludes that

$$D_{0x} = D_{0x}' \tag{4.9}$$
FIGURE 4.3 Comparison of flux densities. Panels (a) to (c) show the construction used by $O'$;
panels (d) to (f) show the corresponding construction used by $O$.

If $O$ considers the force exerted by $q_1$ on $q$ to be computable in the usual way, in terms
of an interaction between $q$ and the flux field of $q_1$, he can write, by virtue of (4.8) and
(4.9),

$$F_x = q\,\frac{D_{0x}}{\epsilon_0} = q\,\frac{D_{0x}'}{\epsilon_0} = F_x'$$

$$F_y = q\,\frac{D_{0y}}{\epsilon_0} = q\,\frac{\kappa D_{0y}'}{\epsilon_0} = \kappa F_y'$$

$$F_z = q\,\frac{D_{0z}}{\epsilon_0} = q\,\frac{\kappa D_{0z}'}{\epsilon_0} = \kappa F_z'$$

But these equations agree with (4.4). Therefore by defining charge and total electric
flux to be invariants, and by making use of the relativistic contraction of length, one
can extend the validity of the relation

$$\mathbf{F}_e = q\,\frac{\mathbf{D}_0}{\epsilon_0} \tag{4.10}$$

to include the case that $\mathbf{D}_0$ is the flux density due to a charge in constant translatory
motion.

FIGURE 4.4 Interaction of electric field and charge in relative motion. (Labels in the
figure: moving charge, stationary charge, stationary field.)

The flux distribution as visualized by the two observers is illustrated in Figure 4.4.
For observer $O'$ the field is stationary and spherically symmetrical; the charge $q'$ is
moving through this field at the constant velocity $-\mathbf{1}_x u$. For observer $O$ the charge $q$
is stationary and the field is moving past it at the constant velocity $\mathbf{1}_x u$; the field
exhibits some longitudinal compression due to its motion. For both observers the force
is time-varying: for observer $O'$ this is because $q'$ keeps moving into new regions of
different static field intensity; for observer $O$ it is because, at the static position
occupied by $q$, the field intensity keeps changing with time.

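Since measurements of distance by the two observers can be connected through the
relation

$$\mathcal{R}' = \left[\kappa^2 (x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2\right]^{1/2}$$

it follows that (4.8), (4.9), and (4.10) can be combined to give

$$\mathbf{F}_e = \frac{q q_1 \kappa}{4\pi\epsilon_0}\,\frac{\mathbf{1}_x (x - x_1) + \mathbf{1}_y (y - y_1) + \mathbf{1}_z (z - z_1)}{\left[\kappa^2 (x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2\right]^{3/2}} \tag{4.11}$$

in which $\mathbf{F}_e$ has been expressed entirely in terms of quantities measured in $XYZ$.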
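The longitudinal compression of the field can be made concrete with a small calculation. The sketch below builds $\mathbf{D}_0$ for a uniformly moving charge in three steps: it forms the separation as judged by $O'$ (longitudinal component stretched by $\kappa$), evaluates the static Coulomb flux density there, and then applies the rules that the longitudinal component is unchanged while the transverse components are multiplied by $\kappa$. The function name and the choice of units with $c = 1$ are assumptions made for this illustration.

```python
import math

def flux_density_moving_charge(q1, u, dx, dy, dz, c=1.0):
    """D0 seen by O at separation (dx, dy, dz) from a charge moving at 1x*u."""
    kappa = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    # Distance between charge and field point as judged by O'
    r_prime = math.sqrt((kappa * dx) ** 2 + dy ** 2 + dz ** 2)
    coeff = q1 / (4.0 * math.pi * r_prime ** 3)
    # Static Coulomb flux density in X'Y'Z'
    dpx, dpy, dpz = coeff * kappa * dx, coeff * dy, coeff * dz
    # Longitudinal component unchanged, transverse components scaled by kappa
    return (dpx, kappa * dpy, kappa * dpz)

static = 1.0 / (4.0 * math.pi)              # unit charge at unit distance, at rest
ahead = flux_density_moving_charge(1.0, 0.6, 1.0, 0.0, 0.0)[0]
abreast = flux_density_moving_charge(1.0, 0.6, 0.0, 1.0, 0.0)[1]
print("ahead / static  :", ahead / static)    # weakened by 1/kappa^2
print("abreast / static:", abreast / static)  # strengthened by kappa
```

At $u = 0.6c$ the factor $\kappa$ is 1.25, so the flux density directly ahead of the charge is reduced by $1/\kappa^2$ and the flux density abreast of it is raised by $\kappa$, relative to the static value at the same distance.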

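Case 2: $\mathbf{v} \neq 0$.

Under this condition, in $X'Y'Z'$, $q_1'$ is at rest and $q'$ has the general velocity $\mathbf{v}'(t')$.
In $XYZ$, $q_1$ has the constant velocity $\mathbf{v}_1 = \mathbf{1}_x u$ and $q$ is no longer at rest, but rather has
the general velocity $\mathbf{v}(t)$. The force exerted by $q_1$ on $q$, as determined by $O$, is now given
by $\mathbf{F} = \mathbf{F}_e + \mathbf{F}_m$ in which $\mathbf{F}_e$ has the same value as in Case 1 and, from (4.5),

$$\mathbf{F}_m = \kappa\,\frac{\mathbf{v}}{c} \times \left(\frac{\mathbf{v}_1}{c} \times \mathbf{F}'\right) \tag{4.12}$$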
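That the vector triple product reproduces the component form of the magnetostatic force can be checked directly. The sketch below evaluates $\mathbf{F}_m$ both ways for arbitrary inputs, taking the component form as $(\kappa u/c^2)[\mathbf{1}_x(v_yF_y' + v_zF_z') - \mathbf{1}_y v_xF_y' - \mathbf{1}_z v_xF_z']$ and the vector form as $(\kappa/c^2)\,\mathbf{v} \times (\mathbf{v}_1 \times \mathbf{F}')$ with $\mathbf{v}_1 = \mathbf{1}_x u$. The function names, the sample numbers, and the units with $c = 1$ are assumptions of the example.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def fm_components(u, v, f_prime, c=1.0):
    """Magnetostatic force written out component by component."""
    kappa = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    vx, vy, vz = v
    _, fy, fz = f_prime                      # F'_x does not enter
    s = kappa * u / c ** 2
    return (s * (vy * fy + vz * fz), -s * vx * fy, -s * vx * fz)

def fm_triple_product(u, v, f_prime, c=1.0):
    """Magnetostatic force as kappa/c^2 * v x (v1 x F'), with v1 along X."""
    kappa = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    v1 = (u, 0.0, 0.0)
    outer = cross(v, cross(v1, f_prime))
    return tuple(kappa * x / c ** 2 for x in outer)

sample_v = (0.1, 0.2, -0.3)
sample_f = (1.5, -0.7, 2.0)
by_components = fm_components(0.6, sample_v, sample_f)
by_vectors = fm_triple_product(0.6, sample_v, sample_f)
print(by_components)
print(by_vectors)
```

The $x$ component of $\mathbf{F}'$ drops out of both expressions, because $\mathbf{v}_1 \times \mathbf{F}'$ retains only the part of $\mathbf{F}'$ transverse to the motion of the source.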

If the idea is retained that charge and total electric flux are invariants, then it
is still true that $\mathbf{F}_e = q\mathbf{D}_0/\epsilon_0$, but the total force $\mathbf{F}$ on $q$ is no longer equal to $\mathbf{F}_e$. One
could discard the assumption of invariance of charge and flux and require that both
vary in such a way that the transformation linking $\mathbf{D}_0'$ and $\mathbf{D}_0$ yields the relation
$\mathbf{F} = \mathbf{F}_e + \mathbf{F}_m = q\mathbf{D}_0/\epsilon_0$. But it is apparent from a study of (4.12) that this would
require that the flux field due to an electric system (the rigidly translating charge $q_1$)
would depend on the motion of a test charge $q$ which was not a part of the system.
Such an unwieldy definition has no utility. Therefore, the postulate that charge
and electric flux are invariants will be adopted generally and an additional
vector field will be introduced to account for the force $\mathbf{F}_m$. The reader has perhaps
already surmised that this will be the magnetic field.
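In summation, if $q_1$ is moving in $XYZ$ at constant velocity, the force exerted by $q_1$
on the arbitrarily moving charge $q$ can be expressed as

$$\mathbf{F} = q\,\frac{\mathbf{D}_0}{\epsilon_0} + \kappa\,\frac{\mathbf{v}}{c} \times \left(\frac{\mathbf{v}_1}{c} \times \frac{q\mathbf{D}_0'}{\epsilon_0}\right) \tag{4.13}$$

wherein use has been made of (4.7). Since $\mathbf{v}_1$ is $X$ directed, this equation may be
converted into a form containing only $XYZ$ quantities through utilization of (4.8), with
the result

$$\mathbf{F} = q\left[\frac{\mathbf{D}_0}{\epsilon_0} + \frac{\mathbf{v}}{c} \times \left(\frac{\mathbf{v}_1}{c} \times \frac{\mathbf{D}_0}{\epsilon_0}\right)\right] \tag{4.14}$$

in which the nature of the field $\mathbf{D}_0$ associated with the moving charge $q_1$ is precisely as
described in Case 1 and pictured in Figure 4.4.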
Let a new vector field $\mathbf{B}(x,y,z,t)$ be defined by the relation

$$\mathbf{B} = \frac{\mathbf{v}_1 \times \mathbf{D}_0}{\epsilon_0 c^2} \tag{4.15}$$

in which it is noted that $\mathbf{B}$ is a function of time (by virtue of the fact that $\mathbf{D}_0$ is
time-varying) and depends on the source system (the moving charge $q_1$) but not on the charge
$q$. $\mathbf{B}$ is called the magnetic flux density function and has the units of webers per square
meter. A weber is 1 volt-sec and these units will take on more meaning in Chapter 5.
Substitution of (4.15) in (4.14), with $\mathbf{E} = \mathbf{D}_0/\epsilon_0$, gives

$$\mathbf{F} = q[\mathbf{E} + \mathbf{v} \times \mathbf{B}] \tag{4.16}$$

This important equation is known as the Lorentz force law. So far, it has been formulated
only for the simplest source system of a single charge $q_1$ in constant translational
motion, exerting a force $\mathbf{F}$ on an arbitrarily moving charge $q$. Now a generalization
of this result will be undertaken.
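The chain of reasoning in this section can be traced numerically from end to end. The sketch below computes the force on a moving test charge by two routes: first by transforming the Coulomb force found by $O'$ into the coulomb and magnetostatic parts seen by $O$, and second by forming the fields seen by $O$ and applying $q[\mathbf{E} + \mathbf{v} \times \mathbf{B}]$. The following are assumptions of the example: $\mathbf{B}$ is taken as $\mathbf{v}_1 \times \mathbf{D}_0/(\epsilon_0 c^2)$, which is the form that turns (4.14) into the Lorentz force law; $\mathbf{E}$ is taken as $\mathbf{D}_0/\epsilon_0$; units are chosen so that $\epsilon_0 = c = 1$; and the function name and sample numbers are hypothetical.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def force_two_routes(q, q1, u, sep, v, eps0=1.0, c=1.0):
    """Force on q (velocity v) from q1 (velocity 1x*u), computed two ways."""
    kappa = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    dx, dy, dz = sep                          # field point minus source, in XYZ
    rp = (kappa * dx, dy, dz)                 # same separation as judged by O'
    rp_mag = math.sqrt(sum(x * x for x in rp))
    d_prime = tuple(q1 * x / (4.0 * math.pi * rp_mag ** 3) for x in rp)  # static flux density
    f_prime = tuple(q * d / eps0 for d in d_prime)                       # Coulomb force for O'

    # Route 1: transformed force, coulomb part plus magnetostatic part
    fx, fy, fz = f_prime
    vx, vy, vz = v
    s = kappa * u / c ** 2
    f_e = (fx, kappa * fy, kappa * fz)
    f_m = (s * (vy * fy + vz * fz), -s * vx * fy, -s * vx * fz)
    transformed = tuple(a + b for a, b in zip(f_e, f_m))

    # Route 2: fields seen by O, then q[E + v x B]
    d0 = (d_prime[0], kappa * d_prime[1], kappa * d_prime[2])
    e_field = tuple(d / eps0 for d in d0)
    b_field = tuple(x / (eps0 * c ** 2) for x in cross((u, 0.0, 0.0), d0))
    v_cross_b = cross(v, b_field)
    lorentz = tuple(q * (e + w) for e, w in zip(e_field, v_cross_b))
    return transformed, lorentz

transformed, lorentz = force_two_routes(1.0, 2.0, 0.6, (0.3, -0.4, 0.5), (0.1, 0.2, -0.3))
print("transformed Coulomb force:", transformed)
print("q[E + v x B]             :", lorentz)
```

The two routes give the same vector, which is the content of the statement that the magnetic field accounts exactly for the velocity-dependent part of the transformed force.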

4.3

THE FIELDS DUE TO A CLOSED CIRCULATING CHARGE SYSTEM

If there is a system of charges $q_1 \ldots q_N$ at rest in $X'Y'Z'$, instead of the single charge
$q_1$, the charges of this system will have rigid translatory motion through $XYZ$ at the
common velocity $\mathbf{1}_x u$. By use of the principle of linear superposition, the total field $\mathbf{D}_0'$
due to this charge system can be found by the methods of Chapter 3, and the total field
$\mathbf{D}_0$ can then be found with the aid of Equations (4.8) and (4.9). An observer at rest in
$XYZ$ can determine a magnetic field $\mathbf{B}(x,y,z,t)$ due to the moving charge system by
using (4.15) and can then compute the force on a test charge $q$ moving at a velocity $\mathbf{v}$
by employing (4.16). In other words, the results of the previous section are applicable
to a system of rigidly translating charges as well as to the single translating charge $q_1$.
Admittedly, a rigidly translating system of charges is not a physically realizable
source; however, it may be used as a constituent part of any real system of charges
and currents. As an illustration, consider the closed system of circulating charges shown
in Figure 4.5. It is assumed that the motion of these charges is such that the amount of
charge in a given volume element is always the same, albeit the identity of the charge
keeps changing. It is further assumed that the charge velocity associated with a particular volume element is unchanging, so that the flow can be characterized by a static
volume charge distribution $\rho(\xi,\eta,\zeta)$ and by a static velocity distribution $\mathbf{v}_1(\xi,\eta,\zeta)$. This
circulating stream of charge constitutes a time-independent current density $\boldsymbol{\iota}(\xi,\eta,\zeta) = \rho\mathbf{v}_1$.
(Cf. Example 3.16.) If a separate test charge $q$ is instantaneously at the point $(x,y,z)$
with a velocity v(t), the force F which the circulating charge system exerts on q can be
found as follows:
The circulating charge system in XYZ can be shown to be equivalent to a linear
superposition of static charge distributions in all other Lorentzian frames X'Y'Z'
which move at a variety of constant velocities $\mathbf{u}$ with respect to XYZ. (See Appendix E.) Therefore, by superposition, the results of the previous section are extendable
to this source system. The force on $q$ can then be thought of as being composed of differential contributions due to each source element in the circulating charge system.
Specifically, the charge $\rho(\xi,\eta,\zeta)\,d\xi\,d\eta\,d\zeta$, which is moving at the velocity $\mathbf{v}_1(\xi,\eta,\zeta)$, can
be said to exert a force on $q$ given by

$$d\mathbf{F} = q(d\mathbf{E} + \mathbf{v} \times d\mathbf{B}) \tag{4.17}$$

in which

$$d\mathbf{E} = \frac{d\mathbf{D}_0}{\epsilon_0} \qquad d\mathbf{B} = \frac{\mathbf{v}_1 \times d\mathbf{D}_0}{c^2\epsilon_0} \tag{4.18}$$


The electric flux field $d\mathbf{D}_0$ at $(x,y,z)$, due to the moving charge $\rho\,dV$, would be time-varying except that other charges keep moving into $dV$ and assuming the velocity $\mathbf{v}_1$,
thus assuring a steady contribution to the field at $(x,y,z)$.

FIGURE 4.5 Force due to a circulating charge system.

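All of the circulating charges can be taken into account by integrating (4.17) to
obtain

$$\mathbf{F} = q[\mathbf{E} + \mathbf{v} \times \mathbf{B}] \tag{4.19}$$

which is a generalization of the Lorentz force law (4.16), based on the principle of
superposition. The fields contained in (4.19) are given by

$$\mathbf{E}(x,y,z) = \int_V \frac{d\mathbf{D}_0}{\epsilon_0} \tag{4.20}$$

$$\mathbf{B}(x,y,z) = \int_V \frac{\mathbf{v}_1 \times d\mathbf{D}_0}{c^2\epsilon_0} \tag{4.21}$$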

in which $V$ is the volume occupied by the circulating charge system. These fields are
static because each elemental electric flux field $d\mathbf{D}_0$ is time-independent. This fact
can be appreciated by still another argument. Imagine that the charge $q$ is at a particular point $(x,y,z)$ with a particular velocity $\mathbf{v}$. If, at a later time, $q$ is once again at
this same point with the same velocity, it will experience the same force as before
because the state of the circulating charge system is unchanged. Thus the force F in
(4.19) is a function of time only because of the motion of q, and not because of any time
variance of E and B.
Since charge and electric flux have been defined as invariants, it follows that Gauss'
law is applicable to this situation so that

$$\nabla \cdot \mathbf{D}_0 = \rho \tag{4.22}$$

Also, since $\rho(\xi,\eta,\zeta)$ is static (because of the nature of the circulating charge system under
discussion), it further follows that

$$\mathbf{D}_0(x,y,z) = \int_V \frac{\rho\,\mathbf{R}\,dV}{4\pi R^3} \tag{4.23}$$

with $\mathbf{R}$ drawn from $(\xi,\eta,\zeta)$ to $(x,y,z)$. One may conclude that the closed circulating
system of charges gives rise to a static electric field which does not differ from what
would occur if the charges were at rest. This should be contrasted with the case of the
electric field associated with a single charge, which was found to depend on the charge's
motion.
It follows further that, since the volume $V$ is arbitrary,

$$d\mathbf{D}_0 = \frac{\rho\,\mathbf{R}\,dV}{4\pi R^3} \tag{4.24}$$

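Returning to (4.20) and (4.21), one can write the static fields in the forms

$$\mathbf{E}(x,y,z) = \int_V \frac{\rho\,\mathbf{R}\,dV}{4\pi\epsilon_0 R^3} \tag{4.25}$$

$$\mathbf{B}(x,y,z) = \int_V \frac{\rho\,\mathbf{v}_1 \times \mathbf{R}\,dV}{4\pi c^2\epsilon_0 R^3} \tag{4.26}$$

in which $\mathbf{R}$ is drawn from the source point $(\xi,\eta,\zeta)$ to the field point $(x,y,z)$.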

If the source system is specified, these integrals may be evaluated and the results
inserted in (4.19) to obtain the force on q.

4.4 THE BIOT-SAVART LAW

Inspection of Equations (4.19), (4.25), and (4.26) reveals that if $v_1/c$ and $v/c$ are much
less than unity, which is usually the case, then $|\mathbf{v} \times \mathbf{B}| \ll E$, and a good approximation
to the Lorentz force on $q$ is to ignore the $\mathbf{B}$ field altogether. This may be a valid conclusion for the system of uncompensated circulating charges discussed in Section 4.3.
However, imagine now that an additional system of charges with distribution $-\rho(\xi,\eta,\zeta)$
is superimposed on the first, but that the individual charges of the second system do not
move. Then each moving charge $q_1$ of the circulating system finds itself alongside a
charge $-q_1$ of the noncirculating system. The charge pair $[q_1, -q_1]$ exerts equal and
opposite coulomb forces on $q$ but only the moving charge $q_1$ contributes to the magnetostatic force $\mathbf{F}_m$ acting on $q$. Under these conditions the net force on $q$ is

$$\mathbf{F}_m = q\mathbf{v} \times \mathbf{B} \tag{4.27}$$

with $\mathbf{B}$ given by (4.26). In this case, ignoring $\mathbf{B}$ is tantamount to ignoring the entire
force on $q$.
With the two systems of sources superimposed in this manner, one circulating and
the other not, every volume element $dV$ is electrically neutral. This situation describes
conditions which prevail inside conductors through which steady currents are flowing.
The drift velocities of the electrons have the distribution $\mathbf{v}_1(\xi,\eta,\zeta)$. The moving electronic charge $\rho\,dV$ is compensated by the stationary ionic charge $-\rho\,dV$. A time-independent current density $\boldsymbol{\iota} = \rho\mathbf{v}_1$ amp/m$^2$ flows through the volume element $dV$
(cf. Example 3.16). Although the individual electrons have drift velocities so low that
$v_1/c$ may be as small as $10^{-10}$, there are usually so many electrons participating in the
current flow inside conductors that the calculation of $\mathbf{B}$ from (4.26) yields a value
which is often not insignificant.
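The smallness of the drift velocity can be made concrete with a rough estimate. The sketch below is an illustration only; the conduction-electron density assumed for copper, the current, and the wire cross section are assumed values, and `drift_speed` is a hypothetical helper name.

```python
# Rough estimate of the electron drift speed in a current-carrying wire,
# using |iota| = n e v_1 with iota = I / A. All inputs are assumed values.

E_CHARGE = 1.602e-19      # coulombs
C_LIGHT = 2.998e8         # m/sec

def drift_speed(current, n, area):
    """Drift speed (m/sec) for current I (amp), carrier density n (1/m^3),
    and wire cross-sectional area A (m^2)."""
    return current / (n * E_CHARGE * area)

n_copper = 8.5e28         # conduction electrons per m^3 (assumed, typical for copper)
current = 10.0            # amperes (assumed)
area = 1.0e-6             # m^2, i.e. one square millimeter (assumed)

v1 = drift_speed(current, n_copper, area)
ratio = v1 / C_LIGHT
carriers_per_meter = n_copper * area

print(v1, ratio, carriers_per_meter)
```

Under these assumptions the drift speed is a fraction of a millimeter per second, while each meter of wire contains on the order of $10^{22}$ participating electrons, which is why the resulting magnetic field need not be negligible.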
Let a substitution constant $\mu_0$, called the permeability of free space, be defined by the
relation

$$\mu_0^{-1} = c^2\epsilon_0 \tag{4.28}$$

In MKS rationalized units, $\mu_0 = 4\pi \times 10^{-7}$ henries/m; this unit will become more
meaningful when the circuit concept of inductance is introduced. With this substitution,
(4.26) can be written

$$\mathbf{B}(x,y,z) = \int_V \frac{\boldsymbol{\iota}(\xi,\eta,\zeta) \times \mathbf{R}\,dV}{4\pi\mu_0^{-1}R^3} \tag{4.29}$$

This is the Biot-Savart law and permits computation of the B field arising from any
distribution of steady currents. Equation (4.27) can then be used to find the force
which this field exerts on a charge q moving at a velocity v. Because the magnetic field B
due to the system of steady currents is time-independent, this subject is given the name
magnetostatics.
EXAMPLE 4.1

As an illustration of the use of (4.29), consider the case of a long straight wire in free space,
extending along the Z axis from $-z_1$ to $+z_1$ and carrying a time-independent current $I$.
Then $\boldsymbol{\iota}\,dV = \boldsymbol{\iota}A\,d\ell = I\,d\boldsymbol{\ell} = \mathbf{1}_z I\,d\zeta$
is a current element, with $A$ the small cross-sectional
area of the wire. The magnetic flux density at a point $(r,\phi,0)$ in cylindrical coordinates will
be

$$\mathbf{B}(r,\phi,0) = \int_{-z_1}^{z_1} \frac{\mathbf{1}_z I\,d\zeta \times (\mathbf{1}_r r - \mathbf{1}_z \zeta)}{4\pi\mu_0^{-1}(r^2 + \zeta^2)^{3/2}}$$

in which the current element is situated at the source point $(0,0,\zeta)$, as shown in the figure.

[Figure: straight wire along the Z axis with the field point $P(r,\phi,0)$]

The above expression integrates readily to give

$$\mathbf{B}(r,\phi,0) = \mathbf{1}_\phi\,\frac{I}{2\pi\mu_0^{-1}r}\,\frac{z_1}{(r^2 + z_1^2)^{1/2}}$$

For points not too near the ends of the wire, and not too far removed from the wire, so that
$r \ll z_1$,

$$\mathbf{B} = \mathbf{1}_\phi\,\frac{I}{2\pi\mu_0^{-1}r}$$

In such regions the magnetic flux density can be mapped as a system of concentric circular
lines which thin out with distance from the wire as $r^{-1}$.
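The integration in Example 4.1 can be checked numerically. The following sketch is an illustration under stated assumptions: it evaluates the Biot-Savart integral for the finite wire by a midpoint rule and compares it with the closed form and with the long-wire limit. The function names, the current, and the dimensions are assumptions made for the example.

```python
import math

MU0 = 4.0e-7 * math.pi    # henries/m, permeability of free space

def b_phi_numeric(current, r, z1, steps=20000):
    """Midpoint-rule evaluation of the Biot-Savart integral at (r, phi, 0).
    Since 1_z x (1_r r - 1_z zeta) = 1_phi r, only a phi component survives."""
    h = 2.0 * z1 / steps
    total = 0.0
    for k in range(steps):
        zeta = -z1 + (k + 0.5) * h
        total += r / (r * r + zeta * zeta) ** 1.5
    return MU0 * current * total * h / (4.0 * math.pi)

def b_phi_closed(current, r, z1):
    """Closed form for the wire extending from -z1 to +z1."""
    return MU0 * current / (2.0 * math.pi * r) * z1 / math.sqrt(r * r + z1 * z1)

def b_phi_long(current, r):
    """Limit r << z1 (long wire)."""
    return MU0 * current / (2.0 * math.pi * r)

current, r, z1 = 5.0, 0.02, 1.0   # amperes, meters, meters (assumed values)
print(b_phi_numeric(current, r, z1))
print(b_phi_closed(current, r, z1))
print(b_phi_long(current, r))
```

For the assumed geometry ($r/z_1 = 0.02$) the three values agree closely, which illustrates how quickly the long-wire approximation becomes adequate.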
EXAMPLE 4.2

A freely moving charge $q$ enters a region in which a steady magnetic field exists, being
described by the equation $\mathbf{B} = \mathbf{1}_z B_0$, with $B_0$ independent of spatial coordinates as well as
time. If the entering velocity is the constant $\mathbf{v}_0 = \mathbf{1}_x v_{0x} + \mathbf{1}_y v_{0y} + \mathbf{1}_z v_{0z}$, find the subsequent motion of the charge.
The equation of motion is given by

$$\mathbf{F}_m = q\mathbf{v} \times \mathbf{B} = \frac{d}{dt}(m\mathbf{v})$$

If the velocity of the charge is never so great that its mass $m$ need be considered relativistically, this equation can be broken down into the components

$$qv_y B_0 = m\dot{v}_x \qquad -qv_x B_0 = m\dot{v}_y \qquad 0 = m\dot{v}_z$$

These equations integrate to give

$$v_x = v_{0x}\cos\omega_b t + v_{0y}\sin\omega_b t$$
$$v_y = v_{0y}\cos\omega_b t - v_{0x}\sin\omega_b t$$
$$v_z = v_{0z}$$

in which $\omega_b = qB_0/m$ is known as the cyclotron frequency. The charge thus follows a helical
path parallel to the Z axis, the radius of the helix being

$$r_0 = \frac{(v_{0x}^2 + v_{0y}^2)^{1/2}}{\omega_b}$$

One interesting aspect of this solution is that what would otherwise be the lateral drift of the
charge has been converted to a circular motion, which can be very tight if $B$ is large. Further, if $(x_0,y_0,z_0)$ is a point in the trajectory, then so too is the point $(x_0,\,y_0,\,z_0 + 2\pi v_{0z}/\omega_b)$.
Thus, if a group of charges is injected at the point $(x_0,y_0,z_0)$, with random initial transverse
velocities but a common $v_{0z}$, they will all come to a "focus" at $2\pi v_{0z}/\omega_b$ units of distance
further along the Z axis. This principle is used in the design of many electron devices, including some cathode ray tubes.
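The solution of Example 4.2 can be exercised numerically. The sketch below is illustrative only: it evaluates the velocity components, the helix radius, and the focusing distance for an electron, and uses a finite-difference check to confirm that the stated solution satisfies the component equations of motion. The field strength, entering velocity, and function names are assumptions; absolute values are taken so that the radius and focusing distance are positive for a negative charge.

```python
import math

def velocity(t, v0, omega_b):
    """Velocity components at time t for entering velocity v0 = (v0x, v0y, v0z)."""
    v0x, v0y, v0z = v0
    c, s = math.cos(omega_b * t), math.sin(omega_b * t)
    return (v0x * c + v0y * s, v0y * c - v0x * s, v0z)

def helix_radius(v0, omega_b):
    """Radius of the helical path."""
    return math.hypot(v0[0], v0[1]) / abs(omega_b)

def focus_distance(v0, omega_b):
    """Axial distance travelled during one cyclotron period."""
    return 2.0 * math.pi * abs(v0[2] / omega_b)

Q_E = -1.602e-19          # coulombs, electron charge
M_E = 9.109e-31           # kilograms, electron rest mass
B0 = 1.0e-3               # webers/m^2 (assumed)
omega_b = Q_E * B0 / M_E  # cyclotron frequency, rad/sec (negative for an electron)

v0 = (1.0e6, 2.0e6, 3.0e6)   # m/sec (assumed, nonrelativistic)
radius = helix_radius(v0, omega_b)
pitch = focus_distance(v0, omega_b)
print(omega_b, radius, pitch)

# Finite-difference check of q v_y B0 = m dv_x/dt and -q v_x B0 = m dv_y/dt.
t, dt = 1.0e-9, 1.0e-13
vp, vm, vt = velocity(t + dt, v0, omega_b), velocity(t - dt, v0, omega_b), velocity(t, v0, omega_b)
ax = (vp[0] - vm[0]) / (2.0 * dt)
ay = (vp[1] - vm[1]) / (2.0 * dt)
print(ax, omega_b * vt[1])
print(ay, -omega_b * vt[0])
```

For the assumed values the helix radius is roughly a centimeter and the focus lies roughly ten centimeters downstream, independent of the transverse components of the entering velocity.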

EXAMPLE 4.3

In 1879 Dr. Edwin Hall of Harvard observed that when a conductor carrying a steady
current is placed transverse to a magnetic field, as indicated in the figure, a transverse
charge separation occurs in the conductor. This phenomenon, called the Hall effect, has
proved useful in the determination of charge densities in materials, including semiconductors. It can be explained by the following argument:

[Figure: conductor of rectangular cross section carrying current along X in a magnetic field directed along Y]

Let the magnetic field be locally uniform and given by $\mathbf{B} = \mathbf{1}_y B_y$. Let the current flow in
a conductor of rectangular cross section, and let it have the uniform value $\boldsymbol{\iota} = \mathbf{1}_x \iota_x$. Then
the conduction electrons have an average drift velocity $\mathbf{v} = \mathbf{1}_x v_x$ and

$$\iota_x = \rho v_x = -nev_x$$

in which $-e$ is the electronic charge and $n$ is the volume density of conduction electrons.
These electrons experience an average force given by

$$\mathbf{f} = -e\mathbf{v} \times \mathbf{B} = -\mathbf{1}_z ev_x B_y$$

This force causes a charge separation in the Z direction until oppositely charged layers are
built up on the top and bottom faces of just the proper value to cause a compensating force.
If $E_H$ is the electric field caused by this charge separation, then

$$0 = -eE_H - ev_x B_y = -eE_H + \frac{\iota_x B_y}{n}$$

so that the free charge per unit volume is

$$ne = \frac{\iota_x B_y}{E_H}$$

Since all the quantities on the right side of this equation can be measured, the number of
free electrons per atom can be determined for various metals in this manner. If the technique
is applied to a p-type semiconductor, the Hall field $E_H$ is reversed, indicating that the current
is caused predominantly by positive carriers.
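The Hall relation $ne = \iota_x B_y / E_H$ lends itself to a short numerical illustration. The sketch below assumes hypothetical measured values (they are not data from the text) and a hypothetical helper `carrier_density`; a negative result would correspond to a reversed Hall field, as for positive carriers.

```python
E_CHARGE = 1.602e-19   # coulombs, magnitude of the electronic charge

def carrier_density(current_density, b_field, hall_field):
    """Carrier density n (1/m^3) from n e = iota_x B_y / E_H."""
    return current_density * b_field / (E_CHARGE * hall_field)

# Hypothetical measurement, chosen only for illustration.
iota_x = 1.0e6     # amp/m^2
b_y = 1.0          # webers/m^2
e_hall = 7.35e-5   # volts/m

n = carrier_density(iota_x, b_y, e_hall)
print(n)

# A reversed Hall field (as reported for p-type material) flips the sign.
print(carrier_density(iota_x, b_y, -e_hall))
```

The assumed numbers give a density of order $10^{28}$ to $10^{29}$ carriers per cubic meter, the order of magnitude usually quoted for good metallic conductors.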

4.5 THE MAGNETIC FIELD INTENSITY


In Chapter 3, when using Coulomb's law, it was convenient to introduce the concept
of an electric field by splitting the force expression into two factors, namely,

$$\mathbf{F}_e = q\mathbf{E} = q\int_V \frac{\rho\,\mathbf{R}\,dV}{4\pi\epsilon_0 R^3} \tag{4.30}$$

Similarly, it has proved convenient, when discussing the force on a moving charge, to
introduce the concept of a magnetic field by writing

$$\mathbf{F}_m = q\mathbf{v} \times \mathbf{B} = q\mathbf{v} \times \int_V \frac{\boldsymbol{\iota} \times \mathbf{R}\,dV}{4\pi\mu_0^{-1}R^3} \tag{4.31}$$

The forms of these two integrals suggest a certain analogy between $\mathbf{B}$ and $\mathbf{E}$, with
$\rho\,dV$ being the sources for $\mathbf{E}$ and $\boldsymbol{\iota}\,dV$ being the sources for $\mathbf{B}$. The comparison between
magnetostatics and electrostatics is further heightened by the introduction of a new
vector function $\mathbf{H}_0$, analogous to $\mathbf{D}_0$, defined by the relation

$$\mathbf{H}_0 = \mu_0^{-1}\mathbf{B} \tag{4.32}$$

in which the zero subscript is a reminder that the discussion so far excludes magnetic
materials. $\mathbf{H}_0$ is called the magnetic intensity, and when (4.32) is combined with the
Biot-Savart law one finds that

$$\mathbf{H}_0(x,y,z) = \int_V \frac{\boldsymbol{\iota}(\xi,\eta,\zeta) \times \mathbf{R}\,dV}{4\pi R^3} \tag{4.33}$$

Thus the units of $\mathbf{H}_0$ are amperes/m.

The manner in which $\mathbf{E}$ and $\mathbf{B}$ enter the Lorentz force law, with $\mathbf{E}$ acting on a charge
element $q$ to give a force, and $\mathbf{B}$ acting on a current element $q\mathbf{v}$ to give a force; the similar manner in which $\mathbf{E}$ and $\mathbf{B}$ are related to the sources $\rho\,dV$ and $\boldsymbol{\iota}\,dV$; the similarity
between the defining relations $\mathbf{D}_0 = \epsilon_0\mathbf{E}$ and $\mathbf{H}_0 = \mu_0^{-1}\mathbf{B}$; these all serve to point up the fact
that $\mathbf{B}$ plays a role similar to $\mathbf{E}$ and that $\mathbf{H}_0$ is analogous to $\mathbf{D}_0$. It is unfortunate that
this point was not fully appreciated during the early evolvement of electrical theory,
since awareness of the analogy would be enhanced by use of the reciprocal of $\mu_0$ rather
than $\mu_0$ itself. In this text $\mu_0^{-1}$ shall be used wherever convenient in order to emphasize
this duality.
The value of introducing $\mathbf{H}_0$ will begin to emerge shortly when Ampere's circuital
law (which is analogous to Gauss' law) is established. The principal utility of introducing $\mathbf{D}$ and $\mathbf{H}$ arises when dielectric and permeable materials are discussed (cf. Chapters 6 and 7).
4.6 THE FORCE BETWEEN CURRENTS


If the solitary moving charge $q$ is replaced by a volume charge element $\rho_a\,dV_a$ so that

$$q\mathbf{v} \rightarrow \rho_a\mathbf{v}\,dV_a = \boldsymbol{\iota}_a\,dV_a$$

with $\boldsymbol{\iota}_a\,dV_a$ a current element which is not necessarily time-independent, Equation
(4.27) becomes

$$d\mathbf{F}_m = \boldsymbol{\iota}_a \times \mathbf{B}\,dV_a \tag{4.34}$$

Equation (4.34) gives the elemental force on a general current element $\boldsymbol{\iota}_a\,dV_a$ due to a
steady magnetic field $\mathbf{B}(x,y,z)$. This field is caused by a distribution of steady current
elements and is deducible from the Biot-Savart law. Equation (4.34) is a differential
form of what is often called Ampere's force law.
The total force on all the current elements in the circuit of which $\boldsymbol{\iota}_a\,dV_a$ is a part
can be written

$$\mathbf{F}_m = \int_{V_a} \boldsymbol{\iota}_a \times \left[\int_{V_b} \frac{\boldsymbol{\iota}_b \times \mathbf{R}\,dV_b}{4\pi\mu_0^{-1}R^3}\right] dV_a \tag{4.35}$$

in which $\boldsymbol{\iota}_b$ is the steady current distribution giving rise to the field $\mathbf{B}$ and $\mathbf{R}$ is drawn
from $dV_b$ to $dV_a$. The two volumes $V_a$ and $V_b$ may overlap. For example, a closed
circuit of steady current can exert a magnetic force on itself.
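EXAMPLE 4.4

A simple case of magnetic interaction of some importance involves two long thin straight
parallel wires carrying steady currents $I_1$ and $I_2$ and separated a distance $d$. This situation
can be approximated by assuming the wires to be vanishingly thin and infinitely long. Then
(4.34) gives

$$d\mathbf{F}_m = I_2\,d\boldsymbol{\ell}_2 \times \int_{-\infty}^{\infty} \frac{I_1\,d\boldsymbol{\ell}_1 \times \mathbf{R}}{4\pi\mu_0^{-1}R^3}$$

as the force on a length $d\ell_2$ of the second wire, due to all the current elements in the first
wire. (No current element in the second wire experiences a force due to any other current
element in the second wire because $d\boldsymbol{\ell}_2 \times \mathbf{R} \equiv 0$.)
No loss in generality arises from taking $d\boldsymbol{\ell}_2$ at the position $\zeta_2 = 0$, as shown in the figure,
and writing for $\mathbf{R}$ (with the first wire taken along the Z axis and the second wire passing through $x = d$) the relation

$$\mathbf{R} = \mathbf{1}_x d - \mathbf{1}_z\zeta_1 \qquad d\boldsymbol{\ell}_1 = \mathbf{1}_z\,d\zeta_1$$

If by $\mathcal{F}$ one means the force per unit length on the second wire, then

$$\mathcal{F} = -\mathbf{1}_x\,\frac{I_1 I_2}{2\pi\mu_0^{-1}d} \tag{4.36}$$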
Thus the force between the wires is attractive if the currents are in the same direction;
otherwise it is repulsive. This simple formula can be used to define the unit of current,
the ampere, in terms of a mechanical measurement of force.

[Figure: two parallel wires separated by the distance $d$ along the X axis]
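A short numerical illustration of (4.36) follows. It is a sketch under assumptions: the function name `force_per_length` and the sign convention (negative meaning attraction) are choices made for the example. The case of one ampere in each of two wires one meter apart corresponds to the force of $2 \times 10^{-7}$ newton per meter that was long used to define the ampere.

```python
import math

MU0 = 4.0e-7 * math.pi   # henries/m

def force_per_length(i1, i2, d):
    """Signed force per unit length (newtons/m) on the second wire,
    measured along the direction pointing away from the first wire.
    A negative value therefore means attraction."""
    return -MU0 * i1 * i2 / (2.0 * math.pi * d)

same_direction = force_per_length(1.0, 1.0, 1.0)
opposite_direction = force_per_length(1.0, -1.0, 1.0)
print(same_direction, opposite_direction)
```

Parallel currents give a negative (attractive) value and antiparallel currents a positive (repulsive) one, in agreement with the statement following (4.36).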

When (4.36) is considered in conjunction with Coulomb's law, each is seen to contain
two electrical quantities, either $q$ and $\epsilon_0$ or $I$ and $\mu_0^{-1}$. But $q$ and $I$ are related through a time
derivative and $\mu_0^{-1}$ and $\epsilon_0$ are related by (4.28). Thus in reality (4.36) and Coulomb's law
each contain the same two electrical quantities, and these two force laws taken together
permit the definition of all electrical quantities in terms of the units of mass, length, and
time, indicating that a fourth fundamental unit is unnecessary.
This is not to deny that a fourth fundamental unit is convenient nor to suggest that mass
is more fundamental than charge. One could equally well start with electricity instead of
gravitation and conclude by being able to define mass in terms of the units of charge,
length, and time.

4.7 THE TIME-INDEPENDENT MAGNETIC VECTOR POTENTIAL FUNCTION

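If $\boldsymbol{\iota}(\xi,\eta,\zeta)$ and $\mathbf{R}$ are other than very simple functions, the evaluation of $\mathbf{B}$ from (4.29)
can be extremely difficult. This same situation was encountered with the electric field
in Chapter 3 and was eased by the introduction of the scalar electric potential function,
whose gradient gave $-\mathbf{E}$. By analogy one is led to wonder whether $\mathbf{B}$ can be expressed
alternatively as a vector derivative of a potential function. That this is possible can be
demonstrated by the following argument:
Since $\mathbf{R} = \mathbf{1}_x(x - \xi) + \mathbf{1}_y(y - \eta) + \mathbf{1}_z(z - \zeta)$, the relation

$$\nabla_F\left(\frac{1}{R}\right) = -\frac{\mathbf{R}}{R^3} = -\nabla_S\left(\frac{1}{R}\right) \tag{4.37}$$

can be used to rewrite (4.29) in the form

$$\mathbf{B} = -\frac{1}{4\pi\mu_0^{-1}}\int_V \boldsymbol{\iota} \times \nabla_F\left(\frac{1}{R}\right) dV \tag{4.38}$$

In (4.37)

$$\nabla_F = \mathbf{1}_x\frac{\partial}{\partial x} + \mathbf{1}_y\frac{\partial}{\partial y} + \mathbf{1}_z\frac{\partial}{\partial z} \qquad \text{and} \qquad \nabla_S = \mathbf{1}_x\frac{\partial}{\partial \xi} + \mathbf{1}_y\frac{\partial}{\partial \eta} + \mathbf{1}_z\frac{\partial}{\partial \zeta}$$

are the gradient operators with respect to the coordinates of the field point and the source point
respectively.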
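The vector identity (V.109) can be utilized to obtain

$$\nabla_F \times \left(\frac{\boldsymbol{\iota}}{R}\right) = \frac{1}{R}\,\nabla_F \times \boldsymbol{\iota} + \nabla_F\left(\frac{1}{R}\right) \times \boldsymbol{\iota}$$

But $\nabla_F \times \boldsymbol{\iota} \equiv 0$ because $\boldsymbol{\iota}$ is a function only of the source variables $\xi$, $\eta$, $\zeta$. Thus (4.38)
can be written

$$\mathbf{B} = \frac{1}{4\pi\mu_0^{-1}}\int_V \nabla_F \times \left(\frac{\boldsymbol{\iota}}{R}\right) dV$$

However, since the limits of integration are also independent of the field point $P(x,y,z)$,
the order of integration and differentiation may be inverted to give

$$\mathbf{B} = \nabla_F \times \int_V \frac{\boldsymbol{\iota}\,dV}{4\pi\mu_0^{-1}R}$$

Therefore, it is convenient to define a magnetic vector potential function $\mathbf{A}$ by the
expression

$$\mathbf{A}(x,y,z) = \int_V \frac{\boldsymbol{\iota}(\xi,\eta,\zeta)\,dV}{4\pi\mu_0^{-1}R} \tag{4.39}$$

from which

$$\mathbf{B} = \nabla \times \mathbf{A} \tag{4.40}$$

In almost every case it is simpler to compute $\mathbf{A}$ first and take its curl to find $\mathbf{B}$ rather
than to compute $\mathbf{B}$ directly from (4.29).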


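One important consequence of (4.40) is the fact that

$$\nabla \cdot \mathbf{B} \equiv 0 \tag{4.41}$$

This follows from the vector identity $\nabla \cdot \nabla \times \mathbf{A} \equiv 0$ (cf. Mathematical Supplement,
Equation V.111). Because of the defining relation (4.32) it also follows that

$$\nabla \cdot \mathbf{H}_0 \equiv 0 \tag{4.42}$$

EXAMPLE 4.5

For the long straight wire of Example 4.1, the magnetic vector potential function is simply

$$\mathbf{A}(r,\phi,z) = \mathbf{1}_z\int_{-z_1}^{z_1} \frac{I\,d\zeta}{4\pi\mu_0^{-1}[r^2 + (z - \zeta)^2]^{1/2}}$$

which integrates to give

$$\mathbf{A} = \mathbf{1}_z\,\frac{I}{4\pi\mu_0^{-1}}\,\ln\frac{(z + z_1) + [r^2 + (z + z_1)^2]^{1/2}}{(z - z_1) + [r^2 + (z - z_1)^2]^{1/2}}$$

Taking the curl in cylindrical coordinates, and then inserting the point $P(r,\phi,0)$,
one obtains

$$\mathbf{B}(r,\phi,0) = \mathbf{1}_\phi\,\frac{I}{2\pi\mu_0^{-1}r}\,\frac{z_1}{(z_1^2 + r^2)^{1/2}}$$

in agreement with Example 4.1.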
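The statement that the curl of $\mathbf{A}$ reproduces the field of Example 4.1 can be checked numerically. In the sketch below, offered only as an illustration, the $\phi$ component of the curl of a purely axial potential, $-\partial A_z/\partial r$, is approximated by a central difference and compared with the closed-form flux density. The function names, step size, and numerical inputs are assumptions.

```python
import math

MU0 = 4.0e-7 * math.pi   # henries/m

def a_z(current, r, z, z1):
    """Axial vector potential of the wire extending from -z1 to +z1."""
    upper = (z + z1) + math.sqrt(r * r + (z + z1) ** 2)
    lower = (z - z1) + math.sqrt(r * r + (z - z1) ** 2)
    return MU0 * current / (4.0 * math.pi) * math.log(upper / lower)

def b_phi_from_a(current, r, z, z1, h=1.0e-6):
    """(curl A)_phi = -dA_z/dr when A has only a z component;
    evaluated here by a central difference of step h."""
    return -(a_z(current, r + h, z, z1) - a_z(current, r - h, z, z1)) / (2.0 * h)

def b_phi_closed(current, r, z1):
    """Closed-form flux density at (r, phi, 0) from Example 4.1."""
    return MU0 * current / (2.0 * math.pi * r) * z1 / math.sqrt(r * r + z1 * z1)

current, r, z1 = 3.0, 0.05, 2.0   # assumed values (amperes, meters, meters)
print(b_phi_from_a(current, r, 0.0, z1))
print(b_phi_closed(current, r, z1))
```

The two printed values should agree to within the accuracy of the finite-difference approximation.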


EXAMPLE 4.6

Imagine a small circular loop of radius $a$ carrying a steady uniform current $I$, as suggested
by the figure. Let localized spherical coordinates $(r,\theta,\phi)$ be set up with origin at the center
of the loop, and such that the loop lies in the $\theta = \pi/2$ plane. It is desired to find the magnetic field at a remote point $P(r,\theta,\phi)$ such that $r \gg a$.

[Figure: circular current loop in the XY plane with the current element $Ia\,d\beta$ and the remote field point $P$ at polar angle $\theta$]

Consider the current element $Ia\,d\beta$ situated $\beta$ deg beyond the $\phi$ plane. The contributions
to $\mathbf{A}$ of this element and its twin, which is $\beta$ deg in front of the $\phi$ plane, sum to only an $A_\phi$
component. By thus arranging all the current elements in pairs one can conclude that
$\mathbf{A}(r,\theta,\phi) = \mathbf{1}_\phi A_\phi(r,\theta,\phi)$ and the task has simplified to one of finding $A_\phi(r,\theta,\phi)$. By symmetry, $A_\phi$ is not a function of $\phi$, so there is no loss in generality resulting from placing
$P(r,\theta,\phi)$ in the YZ plane. Then

$$\mathbf{1}_\phi Ia\,d\beta = -\mathbf{1}_x Ia\cos\beta\,d\beta - \mathbf{1}_y Ia\sin\beta\,d\beta$$

and, from (4.39),

$$A_\phi(r,\theta) = 2\int_0^\pi \frac{Ia\cos\beta\,d\beta}{4\pi\mu_0^{-1}R}$$

in which

$$R = [(a\sin\beta)^2 + (r\sin\theta - a\cos\beta)^2 + (r\cos\theta)^2]^{1/2}$$

is the distance from the element $Ia\,d\beta$ to the point $P$. Since $r \gg a$,

$$\frac{1}{R} = \frac{1}{r}\left(1 + \frac{a}{r}\sin\theta\cos\beta - \frac{a^2}{2r^2} + \cdots\right)$$

If terms of higher order than $r^{-2}$ are neglected,

$$A_\phi(r,\theta) = \frac{Ia}{2\pi\mu_0^{-1}r}\int_0^\pi \left(1 + \frac{a}{r}\sin\theta\cos\beta\right)\cos\beta\,d\beta$$

Integration gives

$$A_\phi(r,\theta) = \frac{\pi a^2 I\sin\theta}{4\pi\mu_0^{-1}r^2} \tag{4.43}$$

It is useful to define what is known as the magnetic moment $\mathbf{m}$ of this small current
loop. $\mathbf{m}$ is chosen to have a magnitude equal to the area of the loop times the current
and a direction perpendicular to the plane of the loop. Thus $m = \pi a^2 I$; the direction of $\mathbf{m}$
obeys the right-hand rule, which means that if the fingers of the right hand are placed along
the loop in the direction of current flow, then the right thumb points in the direction of $\mathbf{m}$.
With this definition, Equation (4.43) may be written

$$\mathbf{A} = \frac{\mathbf{m} \times \mathbf{r}}{4\pi\mu_0^{-1}r^3} \tag{4.44}$$

in which $\mathbf{r}$ is drawn from the center of the loop to the point $P$.
The use of (4.40) yields

$$\mathbf{B} = \frac{m}{4\pi\mu_0^{-1}r^3}\,(\mathbf{1}_r 2\cos\theta + \mathbf{1}_\theta\sin\theta) \tag{4.45}$$

Equation (4.45) has special significance since it is found to be in the same form as (3.29);
thus there is a duality between electric dipoles and small current loops.
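The quality of the remote-point approximation (4.45) can be gauged by comparing it with a direct numerical evaluation of the Biot-Savart law around the loop. The sketch below is illustrative only; the loop radius, current, field point, and function names are assumptions, and the loop integral is approximated by a midpoint sum over short straight elements.

```python
import math

MU0 = 4.0e-7 * math.pi   # henries/m

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def loop_field(current, a, point, steps=2000):
    """Biot-Savart sum over a circular loop of radius a in the XY plane."""
    bx = by = bz = 0.0
    dbeta = 2.0 * math.pi / steps
    for k in range(steps):
        beta = (k + 0.5) * dbeta
        src = (a * math.cos(beta), a * math.sin(beta), 0.0)
        dl = (-a * math.sin(beta) * dbeta, a * math.cos(beta) * dbeta, 0.0)
        rvec = (point[0] - src[0], point[1] - src[1], point[2] - src[2])
        rmag = math.sqrt(rvec[0] ** 2 + rvec[1] ** 2 + rvec[2] ** 2)
        c = cross(dl, rvec)
        f = MU0 * current / (4.0 * math.pi * rmag ** 3)
        bx += f * c[0]
        by += f * c[1]
        bz += f * c[2]
    return (bx, by, bz)

def dipole_field(current, a, point):
    """Equation (4.45), converted from spherical to rectangular components."""
    m = math.pi * a * a * current
    x, y, z = point
    r = math.sqrt(x * x + y * y + z * z)
    cos_t, sin_t = z / r, math.hypot(x, y) / r
    phi = math.atan2(y, x)
    pref = MU0 * m / (4.0 * math.pi * r ** 3)
    b_r, b_t = pref * 2.0 * cos_t, pref * sin_t
    return (b_r * sin_t * math.cos(phi) + b_t * cos_t * math.cos(phi),
            b_r * sin_t * math.sin(phi) + b_t * cos_t * math.sin(phi),
            b_r * cos_t - b_t * sin_t)

point = (0.3, 0.2, 0.4)                 # meters (assumed), so that r >> a
exact = loop_field(1.0, 0.01, point)
approx = dipole_field(1.0, 0.01, point)
print(exact)
print(approx)
```

For the assumed geometry the ratio $a/r$ is about 0.02, and the discrepancy between the two results is correspondingly of the order of $(a/r)^2$.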
EXAMPLE 4.7

If the small circular loop of the previous example is immersed in a region of uniform magnetic field $\mathbf{B}_0$, it experiences a torque tending to align its magnetic moment with $\mathbf{B}_0$. This
effect can be appreciated by referring to the figure, in which the loop is seen edge on and the
uniform field is indicated by flux lines. The current in the loop is assumed to be coming out of
the paper on the left side and into the paper on the right, and therefore $\mathbf{m}$ is upward, as
shown.
Application of Equation (4.34) leads to the conclusion that the $\mathbf{B}_0$ field exerts forces on the
left-hand and right-hand current elements which are outward, causing a couple which tends
to rotate the loop so that $\mathbf{m}$ will be parallel to $\mathbf{B}_0$. A quantitative expression for this couple
can be derived as follows:
With no loss in generality, the uniform field may be assumed not to have an X component,
in which case it can be given by

$$\mathbf{B}_0 = B_0(\mathbf{1}_y\sin\theta_0 + \mathbf{1}_z\cos\theta_0)$$

with $\theta_0$ a constant polar angle measured from the Z axis.
A current element $I\,d\boldsymbol{\ell}$ situated at the latitudinal angle $\phi$ can be represented by

$$I\,d\boldsymbol{\ell} = Ia\,d\phi\,(-\mathbf{1}_x\sin\phi + \mathbf{1}_y\cos\phi)$$

and, according to (4.34), this current element experiences a force

$$d\mathbf{F}_m = IaB_0\,d\phi\,(-\mathbf{1}_x\sin\phi + \mathbf{1}_y\cos\phi) \times (\mathbf{1}_y\sin\theta_0 + \mathbf{1}_z\cos\theta_0)$$
$$= IaB_0\,d\phi\,(\mathbf{1}_x\cos\theta_0\cos\phi + \mathbf{1}_y\cos\theta_0\sin\phi - \mathbf{1}_z\sin\theta_0\sin\phi)$$

This force causes a torque around the center point of the loop (cf. Example V.9, Mathematics Supplement) given by

$$d\mathbf{T} = \mathbf{r} \times d\mathbf{F}_m$$

in which $\mathbf{r} = \mathbf{1}_x a\cos\phi + \mathbf{1}_y a\sin\phi$ is drawn from the center of the loop to the current
element. Therefore,

$$d\mathbf{T} = Ia^2B_0\,d\phi\,(-\mathbf{1}_x\sin\theta_0\sin^2\phi + \mathbf{1}_y\sin\theta_0\sin\phi\cos\phi)$$

The total torque about the center point of the loop can be determined by integration:

$$\mathbf{T} = \int_0^{2\pi} d\mathbf{T} = -\mathbf{1}_x Ia^2B_0\sin\theta_0\int_0^{2\pi}\sin^2\phi\,d\phi = -\mathbf{1}_x(\pi a^2 I)B_0\sin\theta_0 = \mathbf{m} \times \mathbf{B}_0 \tag{4.46}$$

Equation (4.46) indicates that the equilibrium position for the current loop occurs when its
magnetic moment is aligned with the field. If the loop is rotated from this equilibrium
alignment, its potential energy is increased. The energy which must be supplied to rotate
the loop from an initial angle $\theta_1$ to a final angle $\theta_2$ is given by

$$U = \int_{\theta_1}^{\theta_2} T\,d\theta = mB_0\int_{\theta_1}^{\theta_2}\sin(\theta - \theta_0)\,d\theta = -mB_0[\cos(\theta_2 - \theta_0) - \cos(\theta_1 - \theta_0)]$$

If the zero reference level for the potential energy $U$ is taken as occurring when the magnetic moment is transverse to the $\mathbf{B}_0$ field ($\theta_1 = \theta_0 + \pi/2$), then

$$U = -mB_0\cos(\theta_2 - \theta_0) = -\mathbf{m} \cdot \mathbf{B}_0 \tag{4.47}$$

in which the final direction of $\mathbf{m}$ is used in the dot product in (4.47).
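The integration leading to (4.46) can be mirrored numerically by summing the elemental torques $\mathbf{r} \times d\mathbf{F}_m$ around the loop and comparing the total with $\mathbf{m} \times \mathbf{B}_0$. The sketch below is an illustration under assumed values; the function names and the numbers chosen for the current, radius, and field are not from the text.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def torque_by_summation(current, a, b_field, steps=3600):
    """Sum r x (I dl x B) over the elements of a loop lying in the XY plane."""
    tx = ty = tz = 0.0
    dphi = 2.0 * math.pi / steps
    for k in range(steps):
        phi = (k + 0.5) * dphi
        r = (a * math.cos(phi), a * math.sin(phi), 0.0)
        dl = (-a * math.sin(phi) * dphi, a * math.cos(phi) * dphi, 0.0)
        d_force = tuple(current * c for c in cross(dl, b_field))
        d_torque = cross(r, d_force)
        tx += d_torque[0]
        ty += d_torque[1]
        tz += d_torque[2]
    return (tx, ty, tz)

current, a, b0 = 2.0, 0.01, 0.5          # amperes, meters, webers/m^2 (assumed)
theta0 = math.radians(30.0)              # polar angle of the field (assumed)
b_field = (0.0, b0 * math.sin(theta0), b0 * math.cos(theta0))
m = (0.0, 0.0, math.pi * a * a * current)

t_sum = torque_by_summation(current, a, b_field)
t_formula = cross(m, b_field)            # Equation (4.46)
u = -dot(m, b_field)                     # Equation (4.47)
print(t_sum)
print(t_formula)
print(u)
```

Both torque results should point along $-\mathbf{1}_x$ with magnitude $mB_0\sin\theta_0$, and the energy should equal $-mB_0\cos\theta_0$ for the assumed orientation.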

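The magnetic vector potential function $\mathbf{A}$ has the additional important property
that its divergence is zero. This can be seen by returning to (4.39) and writing

$$\nabla_F \cdot \mathbf{A} = \frac{1}{4\pi\mu_0^{-1}}\int_V \nabla_F \cdot \left[\frac{\boldsymbol{\iota}(\xi,\eta,\zeta)}{R}\right] dV$$

in which differentiation inside the integral sign is permissible because the volume limits
are not functions of $x$, $y$, $z$. Use of the vector identity (V.107) gives

$$\nabla_F \cdot \left(\frac{\boldsymbol{\iota}}{R}\right) = \frac{1}{R}\,\nabla_F \cdot \boldsymbol{\iota} + \boldsymbol{\iota} \cdot \nabla_F\left(\frac{1}{R}\right)$$

But $\nabla_F \cdot \boldsymbol{\iota} \equiv 0$, since $\boldsymbol{\iota}$ is a function only of $\xi$, $\eta$, $\zeta$; therefore

$$\nabla_F \cdot \mathbf{A} = \frac{1}{4\pi\mu_0^{-1}}\int_V \boldsymbol{\iota} \cdot \nabla_F\left(\frac{1}{R}\right) dV = -\frac{1}{4\pi\mu_0^{-1}}\int_V \boldsymbol{\iota} \cdot \nabla_S\left(\frac{1}{R}\right) dV$$

Use of the same vector identity gives

$$\nabla_S \cdot \left(\frac{\boldsymbol{\iota}}{R}\right) = \frac{1}{R}\,\nabla_S \cdot \boldsymbol{\iota} + \boldsymbol{\iota} \cdot \nabla_S\left(\frac{1}{R}\right)$$

However, $\nabla_S \cdot \boldsymbol{\iota} \equiv 0$ because the currents are time-independent and the net efflux of
current from a volume element must be zero. Therefore,

$$\nabla_F \cdot \mathbf{A} = -\frac{1}{4\pi\mu_0^{-1}}\int_V \nabla_S \cdot \left(\frac{\boldsymbol{\iota}}{R}\right) dV$$

The divergence theorem is now applicable and permits the conversion to

$$\nabla_F \cdot \mathbf{A} = -\frac{1}{4\pi\mu_0^{-1}}\oint_S \frac{\boldsymbol{\iota} \cdot d\mathbf{S}}{R}$$

in which $S$ is the closed surface bounding $V$. Since $V$ can be maintained finite and yet
made large enough that none of the currents of the system intersects $S$, it follows that
one can make $\boldsymbol{\iota} \equiv 0$ on $S$. This means that

$$\nabla \cdot \mathbf{A} \equiv 0 \tag{4.48}$$

as asserted.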
EXAMPLE 4.8

In Example 4.5 the magnetic vector potential function for a long straight wire carrying a
steady current $I$ was found to be

$$\mathbf{A} = \mathbf{1}_z\,\frac{I}{4\pi\mu_0^{-1}}\,\ln\frac{(z + z_1) + [r^2 + (z + z_1)^2]^{1/2}}{(z - z_1) + [r^2 + (z - z_1)^2]^{1/2}}$$

and therefore the divergence of $\mathbf{A}$ is

$$\nabla \cdot \mathbf{A} = \frac{\partial A_z}{\partial z} = \frac{I}{4\pi\mu_0^{-1}}\left\{\frac{1 + (z + z_1)[r^2 + (z + z_1)^2]^{-1/2}}{(z + z_1) + [r^2 + (z + z_1)^2]^{1/2}} - \frac{1 + (z - z_1)[r^2 + (z - z_1)^2]^{-1/2}}{(z - z_1) + [r^2 + (z - z_1)^2]^{1/2}}\right\}$$

$$= \frac{I}{4\pi\mu_0^{-1}}\left\{\frac{1}{[r^2 + (z + z_1)^2]^{1/2}} - \frac{1}{[r^2 + (z - z_1)^2]^{1/2}}\right\}$$

This expression for $\nabla \cdot \mathbf{A}$ is not quite zero and the reason can be traced to conditions at the
two ends of the wire. There the current has been assumed to end abruptly and $\nabla \cdot \boldsymbol{\iota} \neq 0$,
which violates a condition imposed in deriving the result $\nabla \cdot \mathbf{A} = 0$. If one were to include
the steady currents in the remainder of the circuit, of which this long wire is a part, then a
null value for the divergence of $\mathbf{A}$ would be obtained. Alternatively, if $z_1$ approaches
$\infty$ it is seen that $\nabla \cdot \mathbf{A} \rightarrow 0$ for finite $z$.
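The way in which the residual divergence of Example 4.8 disappears as the wire is lengthened can be seen from a few evaluations of the closed-form expression. The sketch below is illustrative; the function name `div_a` and the numerical inputs are assumptions.

```python
import math

MU0 = 4.0e-7 * math.pi   # henries/m

def div_a(current, r, z, z1):
    """Closed-form divergence of A for the wire extending from -z1 to +z1."""
    return MU0 * current / (4.0 * math.pi) * (
        1.0 / math.sqrt(r * r + (z + z1) ** 2)
        - 1.0 / math.sqrt(r * r + (z - z1) ** 2))

current, r, z = 1.0, 0.1, 0.5            # assumed values
half_lengths = [1.0, 10.0, 100.0, 1000.0]
values = [div_a(current, r, z, z1) for z1 in half_lengths]
for z1, value in zip(half_lengths, values):
    print(z1, value)
```

For a fixed field point the magnitude falls off roughly as $z_1^{-2}$, and by symmetry the expression vanishes identically in the central plane $z = 0$.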

4.8 AMPERE'S CIRCUITAL LAW

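The fact that $\nabla \cdot \mathbf{A} \equiv 0$ opens the way to the proof of an important theorem, the result
of which is known as Ampere's circuital law. Recall that in electrostatics the equations

$$\nabla^2\Phi = -\frac{\rho}{\epsilon_0} \qquad \Phi = \int_V \frac{\rho\,dV}{4\pi\epsilon_0 R}$$

were encountered, the first being a differential equation for the electric potential, and
the second its solution. But from (4.39), a component of $\mathbf{A}$, for example the $y$ component, is given by

$$A_y = \int_V \frac{\iota_y\,dV}{4\pi\mu_0^{-1}R}$$

and therefore must satisfy Poisson's equation, namely,

$$\nabla^2 A_y = -\frac{\iota_y}{\mu_0^{-1}} \tag{4.49}$$

If both sides of (4.49) are multiplied by $\mathbf{1}_y$ and the result is added to similar terms
involving the $x$ and $z$ components, one obtains

$$\nabla^2\mathbf{A} = -\frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{4.50}$$

and $\mathbf{A}(x,y,z)$ is seen to satisfy a vector form of Poisson's equation.
Since $\mathbf{B} \equiv \nabla \times \mathbf{A}$, use of the vector identity (V.113) gives

$$\nabla \times \mathbf{B} \equiv \nabla \times \nabla \times \mathbf{A} \equiv \nabla(\nabla \cdot \mathbf{A}) - \nabla^2\mathbf{A} = -\nabla^2\mathbf{A}$$

by virtue of (4.48). Thus

$$\nabla \times \mathbf{B} = \frac{\boldsymbol{\iota}}{\mu_0^{-1}}$$

which means that

$$\nabla \times \mathbf{H}_0 = \boldsymbol{\iota} \tag{4.51}$$

This is the differential form of Ampere's circuital law. Integration gives

$$\int_S \nabla \times \mathbf{H}_0 \cdot d\mathbf{S} = \int_S \boldsymbol{\iota} \cdot d\mathbf{S}$$

in which $S$ is an open surface bounded by the closed contour $C$. Application of Stokes'
theorem yields

$$\oint_C \mathbf{H}_0 \cdot d\boldsymbol{\ell} = \int_S \boldsymbol{\iota} \cdot d\mathbf{S} = I_{\text{enclosed}} \tag{4.52}$$

This is the integral form of Ampere's circuital law and it plays the same role in magnetostatics that Gauss' law does in electrostatics.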
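EXAMPLE 4.9

The results of Example 4.1 can be used to deduce that the magnetic field due to a steady
current in a long straight wire is

$$\mathbf{H}_0 = \mathbf{1}_\phi\,\frac{I}{2\pi r}$$

at points not too far removed from the wire nor too near its ends. Let a closed contour $C$ be
erected which encloses such a wire. An element of length along $C$, expressed in cylindrical
coordinates, is (cf. Example V.17)

$$d\boldsymbol{\ell} = \mathbf{1}_r\,dr + \mathbf{1}_\phi\,r\,d\phi + \mathbf{1}_z\,dz$$

and therefore,

$$\oint_C \mathbf{H}_0 \cdot d\boldsymbol{\ell} = \oint_C \left(\frac{I}{2\pi r}\right)(r\,d\phi) = \frac{I}{2\pi}\int_0^{2\pi} d\phi = I$$

which agrees with Ampere's law.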
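The independence of the result from the shape of the contour can be illustrated numerically. The sketch below, offered only as an illustration, evaluates the circulation of $\mathbf{H}_0$ around two rectangular contours in a transverse plane, one enclosing the wire (placed on the Z axis) and one not. The function names, the contour corners, and the current are assumptions.

```python
import math

def h_field(current, x, y):
    """Rectangular components of H_0 = 1_phi I / (2 pi r) for a wire on the Z axis."""
    r2 = x * x + y * y
    return (-current * y / (2.0 * math.pi * r2),
            current * x / (2.0 * math.pi * r2))

def circulation(current, corners, steps_per_side=2000):
    """Midpoint-rule line integral of H_0 . dl around a closed polygon."""
    total = 0.0
    n = len(corners)
    for i in range(n):
        x0, y0 = corners[i]
        x1, y1 = corners[(i + 1) % n]
        dx = (x1 - x0) / steps_per_side
        dy = (y1 - y0) / steps_per_side
        for k in range(steps_per_side):
            xm = x0 + (k + 0.5) * dx
            ym = y0 + (k + 0.5) * dy
            hx, hy = h_field(current, xm, ym)
            total += hx * dx + hy * dy
    return total

current = 3.0   # amperes (assumed)
# Counterclockwise rectangle enclosing the wire, deliberately off center.
enclosing = [(-1.0, -1.0), (2.0, -1.0), (2.0, 3.0), (-1.0, 3.0)]
# Counterclockwise square that does not enclose the wire.
excluding = [(1.0, 1.0), (2.0, 1.0), (2.0, 2.0), (1.0, 2.0)]

print(circulation(current, enclosing))
print(circulation(current, excluding))
```

The first circulation should come out equal to the assumed current and the second equal to zero, to within the accuracy of the numerical integration.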

EXAMPLE 4.10

Consider an infinitely long cylindrical tube, shown in cross section in the figure, which
carries a uniformly distributed axial current $I$. What magnetic field is caused by this system
of sources?

[Figure: cross section of the cylindrical tube, with inner radius $a$ and outer radius $b$]

An answer can be given to this question by first noting that symmetry requires that $\mathbf{H}_0$ be
independent of $\phi$; since $\mathbf{A}$ has only an axial component, $\nabla \times \mathbf{A}$ has only a $\phi$ component.
Therefore the magnetic field is a function $H_\phi(r)$.
Next imagine that a concentric circular contour $C$ of radius $r$ has been constructed in a
transverse $z$ plane. If $r \le a$, $\oint_C H_\phi\,d\ell = 0$ and therefore,

$$H_\phi \equiv 0 \qquad r \le a$$

If $a \le r \le b$, some current is enclosed by $C$. The uniform current density is given by

$$\iota_z = \frac{I}{\pi(b^2 - a^2)}$$

and thus

$$\int_0^{2\pi} H_\phi(r)\,r\,d\phi = 2\pi r H_\phi(r) = \int_a^r \frac{I}{\pi(b^2 - a^2)}\,2\pi r'\,dr' = I\,\frac{r^2 - a^2}{b^2 - a^2}$$

$$H_\phi(r) = \frac{I}{2\pi r}\,\frac{r^2 - a^2}{b^2 - a^2} \qquad a \le r \le b$$

Finally, if $r \ge b$, all the current is enclosed by $C$ and

$$H_\phi(r) = \frac{I}{2\pi r} \qquad b \le r$$

Interior points of the tube are shielded; at all exterior points the field acts as though the
entire current were concentrated on the axis.
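The three regions of Example 4.10 combine naturally into a single piecewise function. The sketch below is an illustration only; the function name `h_phi_tube` and the tube dimensions are assumptions.

```python
import math

def h_phi_tube(current, a, b, r):
    """Magnetic intensity H_phi(r), in amperes/m, for a tube of inner radius a
    and outer radius b carrying a uniformly distributed axial current."""
    if r <= a:
        return 0.0                                   # hollow interior is shielded
    if r < b:
        return current / (2.0 * math.pi * r) * (r * r - a * a) / (b * b - a * a)
    return current / (2.0 * math.pi * r)             # as if the current were on the axis

current, a, b = 10.0, 0.01, 0.02   # amperes, meters, meters (assumed)
for r in (0.005, 0.010, 0.015, 0.020, 0.040):
    print(r, h_phi_tube(current, a, b, r))
```

The printed values rise from zero at the inner wall to $I/2\pi b$ at the outer wall and then fall off as $r^{-1}$, with no discontinuity at either surface.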
By superposition, if two concentric conducting tubes carry equal and opposite steady
currents, which are uniformly distributed, the field between them is

$$H_\phi(r) = \frac{I}{2\pi r}$$

Throughout the hollow interior of the inner tube and outside the outer tube the field is
everywhere zero.
EXAMPLE

4.11

Consider the long thin and tightly wound circular cylindrical solenoid shown in cross
section in the figure. Let a-a' represent the central transverse plane with ]J any point
(external or internal) in a-a'. Let the first task be to find B(IJ). I f I is the steady curren t in
the winding, COIning out of the paper at the upper cross section of each turn, symmetrically
disposed pairs of current elements I d and I d' can be found, such as the two shown in the
b

I
I

b'

a'

figure at distances ~ and i' from P. These t\VO current elements will make contributions to
the magnetic flux density at P which can be written

dB

I df

47r,uOl~3

and which are shown in the figure. By symmetry these two contributions sum to a longitudinal component of B only. When all the current elements in the solenoid are paired in
this fashion, it is evident that the entire B field is longitudinal at every point in the central
transverse plane a-a'.
X ext consider the field B(lJ 1) at an internal point in a noncentral transverse plane, such
as b-b', One can begin to construct B(IJ 1) by once again considering pairs of current elemerits, this time symmetrically disposed about P'; After awhile, all of the current elements
to the left of 1)1 will have been used up, but there will still be 80n1C left over far to the right
of ]J I . However, since the solenoid is long and thin, these leftover current elements can be
considered to advantage in pairs of a different sort, such as I df' and I
(See figure.)
If b-b' is not too near either end, the posi tion vectors drawn from I df' to P 1 and from I d"
to ]J 1 must be almost parallel as well as almost of equal length. Since the two current
elements are oppositely directed, it follows that their paired contribution to B(]J 1) is
negligible. Thus if the transverse plane b-b' is sufficiently remote from an end, B is essentially longitudinal at all interior points of b-b',

se,

248

it!agnetostatics in Free Space

CHAPTER

z------------~--::e

x
With this information about the nature of the field inside the solenoid, Ampere's circuital
law can be applied to the contour 1234 shown dotted in the second figure. Since this contour
encloses no current, and since B is essentially perpendicular to the legs 23 anrl14, it follows
that
2

JB.di= JBodi
1

and thus that B is uniform over a transverse plane inside the solenoid, provided this transverse plane is not too near either end.
Finally, n can be deduced at a point P far removed from the solenoid, with the use of the
coordinate system indicated in the second figure. If a is the radius of the solenoid, L its
length, and r the distance to the remote point 1~, then a L r, Let n be the number of
turns per unit length of the solenoid, so that n dr is the number of turns of a flat loop at the
source position t. On the basis of Equation (4.~14), the contribution to A at J:> for this flat
loop is
dA

= 7ra 2I n d((lz

X ~)

41r,uol~3

in which ~ = lxx + 1 y Y + lz(z - s) is the position vector drawn from the center of this
loop to the distant point P(x,y,z).
Since r L, ~ varies insignificantly as S ranges from -L/2 to +L/2. Thus
A

and

1ra2InL 1
zXr
o lr

~--3
41r,u

1ra 2 I nL
B = V X A ~ - - 1 - 3 (lr2 cos ()
47r,uo r

+ 18 SIn. ()

and it is as though the entire solenoid were concentrated in the Xl" plane.
These conclusions permit an approximate sketching of the B field for a long slender
solenoid, with the result suggested by the third figure.
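
The far-field conclusion can be checked numerically on the axis of the solenoid. The sketch below is an illustration only: it assumes the standard closed-form expression for the on-axis field of a finite solenoid (not derived in this example), writes $\mu_0$ for $1/\mu_0^{-1}$, and uses hypothetical function names and sample dimensions.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space (henry/m), i.e. 1 / mu0^-1


def axial_field_exact(z, a, L, n, I):
    """On-axis B_z of a finite solenoid centered at the origin.

    Uses the standard closed form (mu0*n*I/2) * (cos(theta1) - cos(theta2)),
    which is assumed here rather than taken from the example.
    """
    upper = z + L / 2
    lower = z - L / 2
    return 0.5 * MU0 * n * I * (upper / math.hypot(upper, a)
                                - lower / math.hypot(lower, a))


def axial_field_dipole(z, a, L, n, I):
    """Far-field expression from the example, evaluated at theta = 0, r = z."""
    strength = math.pi * a**2 * I * n * L
    return MU0 * strength * 2.0 / (4 * math.pi * z**3)


if __name__ == "__main__":
    a, L, n, I = 0.01, 0.2, 1000.0, 1.0   # sample values with a << L
    for z in (1.0, 5.0, 20.0):            # distances with L << z
        ratio = axial_field_exact(z, a, L, n, I) / axial_field_dipole(z, a, L, n, I)
        print(f"z = {z:5.1f} m   exact/far-field = {ratio:.6f}")
```

The ratio approaches unity as the distance grows relative to L, which is the sense in which the solenoid behaves as though it were concentrated at the origin.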

4.9

BOUNDARY-VALUE PROBLEMS IN MAGNETOSTATICS

In connection with Equation (4.50), it has been noted that A satisfies a vector form
of Poisson's equation; in regions removed from the current sources this reduces
to a vector form of Laplace's equation. Therefore, all the techniques discussed in
Chapter 3 pertinent to solving $\nabla^2\Phi = 0$ would appear to be applicable to boundary-value problems concerning A. Unfortunately, the situation is not that simple, due to the
vector nature of A. As discussed in Section V.16 of the Mathematical Supplement, the
Laplacian of a vector function generally involves more than the Laplacian of its scalar
component functions; additional terms may arise through the spatial derivatives of the
unit vectors. Only in rectangular coordinates is this not the case, because the unit
vectors have constant directions. In all other coordinate systems, the change in direction of these unit vectors with spatial position adds terms which complicate the solution
of the differential equation. For example, in cylindrical coordinates
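
$$\nabla^2\mathbf{A} = \mathbf{1}_r\left(\nabla^2 A_r - \frac{2}{r^2}\frac{\partial A_\phi}{\partial\phi} - \frac{A_r}{r^2}\right) + \mathbf{1}_\phi\left(\nabla^2 A_\phi + \frac{2}{r^2}\frac{\partial A_r}{\partial\phi} - \frac{A_\phi}{r^2}\right) + \mathbf{1}_z\nabla^2 A_z \qquad (4.53)$$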

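and in spherical coordinates

$$\begin{aligned}
\nabla^2\mathbf{A} ={}& \mathbf{1}_r\left[\nabla^2 A_r - \frac{2A_r}{r^2} - \frac{2}{r^2\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\,A_\theta) - \frac{2}{r^2\sin\theta}\frac{\partial A_\phi}{\partial\phi}\right] \\
&+ \mathbf{1}_\theta\left[\nabla^2 A_\theta - \frac{A_\theta}{r^2\sin^2\theta} + \frac{2}{r^2}\frac{\partial A_r}{\partial\theta} - \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial A_\phi}{\partial\phi}\right] \\
&+ \mathbf{1}_\phi\left[\nabla^2 A_\phi - \frac{A_\phi}{r^2\sin^2\theta} + \frac{2}{r^2\sin\theta}\frac{\partial A_r}{\partial\phi} + \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial A_\theta}{\partial\phi}\right]
\end{aligned} \qquad (4.54)$$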
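
A short worked case may help show why the extra terms in (4.53) cannot be dropped. The potential chosen below, that of a uniform axial field of strength $B_0$, is an illustration introduced here and is not one of the text's examples.

```latex
\begin{align*}
\mathbf{A} &= \mathbf{1}_\phi\,\tfrac{1}{2}B_0 r ,
\qquad
\nabla\times\mathbf{A} = \mathbf{1}_z\,\frac{1}{r}\frac{d}{dr}\bigl(rA_\phi\bigr) = \mathbf{1}_z B_0 \\[4pt]
\nabla^2 A_\phi &= \frac{1}{r}\frac{d}{dr}\left(r\frac{dA_\phi}{dr}\right) = \frac{B_0}{2r} \neq 0 \\[4pt]
\bigl(\nabla^2\mathbf{A}\bigr)_\phi &= \nabla^2 A_\phi - \frac{A_\phi}{r^2}
 = \frac{B_0}{2r} - \frac{B_0}{2r} = 0
\end{align*}
```

The scalar Laplacian of the component $A_\phi$ does not vanish, yet the vector Laplacian does, as it must in a current-free region. The difference is supplied entirely by the term $-A_\phi/r^2$, which arises from the changing direction of $\mathbf{1}_\phi$.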


It is apparent that one is generally confronted with the problem of solving more
complicated differential equations than the Laplacian of a scalar function. These
equations can be mixed, and will take different forms as the type of symmetry is
changed. For this reason the techniques tend to be more specialized than was found
to be the case when solving for the electrostatic potential function. A few examples
will serve to illustrate possible approaches.
EXAMPLE

4.12

A simple configuration in cylindrical coordinates has been treated in Example 4.10, that of
an infinitely long cylindrical tube carrying a uniformly distributed axial current. Such a
current distribution yields an A which is entirely axial and a function only of r. But if
$\mathbf{A} = \mathbf{1}_z A_z(r)$, inspection of (4.53) reveals that $\nabla^2\mathbf{A} = \mathbf{1}_z\nabla^2 A_z$. Therefore, in this case Poisson's
Equation (4.50) is simply

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dA_z}{dr}\right) = 0 \qquad r < a,\ r > b$$

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dA_z}{dr}\right) = -\frac{I}{\pi\mu_0^{-1}(b^2 - a^2)} \qquad a < r < b$$

The solutions to these equations may be written

$$A_z(r) = C_1 \qquad r < a$$

$$A_z(r) = -\frac{I r^2}{4\pi\mu_0^{-1}(b^2 - a^2)} + C_2\ln r + C_3 \qquad a < r < b$$

$$A_z(r) = C_4\ln r + C_5 \qquad r > b$$

in which the $C_i$ are constants of integration. Determination of the values of $C_1$, $C_3$, and $C_5$
is not important, since they vanish in taking the curl of A to find the magnetic field. The
requirement that $\partial A_z/\partial r$ be continuous across the interfaces leads to the evaluations

$$C_2 = \frac{I a^2}{2\pi\mu_0^{-1}(b^2 - a^2)} \qquad C_4 = -\frac{I}{2\pi\mu_0^{-1}}$$

When these values for the constants are substituted in the above expressions for $A_z$, performance of the curl operation yields expressions for the magnetic field in the three regions
which are in agreement with the results of Example 4.10.
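
The matching of $\partial A_z/\partial r$ at the two interfaces can be sketched numerically. The snippet assumes a tube of inner radius a and outer radius b carrying total current I, with the constants $C_2 = \mu_0 I a^2 / 2\pi(b^2 - a^2)$ and $C_4 = -\mu_0 I / 2\pi$ that follow from continuity of the derivative; the function names and sample values are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space (henry/m), i.e. 1 / mu0^-1


def dAz_dr(r, a, b, I):
    """Radial derivative of A_z in the three regions of the tube problem."""
    k = MU0 * I / (2 * math.pi * (b**2 - a**2))
    if r < a:
        return 0.0                        # A_z = C1 inside the bore
    if r <= b:
        return -k * r + k * a**2 / r      # particular solution plus C2 / r
    return -MU0 * I / (2 * math.pi * r)   # C4 / r outside the tube


def B_phi(r, a, b, I):
    """Azimuthal field from curl(1_z A_z) = -1_phi dA_z/dr."""
    return -dAz_dr(r, a, b, I)


if __name__ == "__main__":
    a, b, I = 0.01, 0.02, 10.0
    eps = 1e-9
    print("jump at r = a:", dAz_dr(a + eps, a, b, I) - dAz_dr(a - eps, a, b, I))
    print("jump at r = b:", dAz_dr(b + eps, a, b, I) - dAz_dr(b - eps, a, b, I))
    print("B_phi(2b):", B_phi(2 * b, a, b, I), "vs mu0 I / (2 pi r):",
          MU0 * I / (2 * math.pi * 2 * b))
```

Outside the tube the field reduces to that of a line current I on the axis, and inside the bore it vanishes, as Ampere's circuital law requires.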
EXAMPLE

4.13

A problem in spherical coordinates which can be extended to several practical situations
involves a $\phi$-directed sheet of current lying in a thin spherical shell of radius a. If $\delta$ is the
thickness of the shell, then $\mathbf{j} = \boldsymbol{\iota}\,\delta$ amp/m can be taken as the lineal current density in the
surface of the shell. It will be assumed that $\mathbf{j} = \mathbf{1}_\phi\,j_\phi(\theta)$; that is, the current density will not
be taken as a function of $\phi$.
It follows that A will have only a $\phi$ component, which is a function of r and $\theta$ but not $\phi$.
Inspection of (4.54) indicates that for this case

$$\nabla^2\mathbf{A} = \mathbf{1}_\phi\left(\nabla^2 A_\phi - \frac{A_\phi}{r^2\sin^2\theta}\right) = 0$$

for points not in the shell. Expansion of the Laplacian operator gives

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial A_\phi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial A_\phi}{\partial\theta}\right) - \frac{A_\phi}{r^2\sin^2\theta} = 0$$

Upon assuming that $A_\phi = f_1(r)\,f_2(\theta)$ one obtains

$$\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) - n(n+1)\,f_1 = 0$$

$$\frac{d}{d\theta}\left(\sin\theta\,\frac{df_2}{d\theta}\right) + \left[n(n+1)\sin\theta - \frac{1}{\sin\theta}\right]f_2 = 0$$

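in which n(n + 1) is a separation constant. Both of these equations were encountered in
Chapter 3 in connection with solutions of Laplace's equation in spherical coordinates. The
most general appropriate solution is

$$A_\phi = \sum_{n=1}^{\infty} a_n\left(\frac{r}{a}\right)^{n} P_n^1(\cos\theta) \qquad r < a$$

$$A_\phi = \sum_{n=1}^{\infty} a_n\left(\frac{a}{r}\right)^{n+1} P_n^1(\cos\theta) \qquad r > a$$

with these series constituting a complete orthogonal set.
Performance of the curl operation yields

$$\mathbf{B} = \mathbf{1}_r\,\frac{1}{a}\sum_{n=1}^{\infty} n(n+1)\,a_n\left(\frac{r}{a}\right)^{n-1} P_n(\cos\theta) - \mathbf{1}_\theta\,\frac{1}{a}\sum_{n=1}^{\infty}(n+1)\,a_n\left(\frac{r}{a}\right)^{n-1} P_n^1(\cos\theta) \qquad r < a$$

$$\mathbf{B} = \mathbf{1}_r\,\frac{1}{a}\sum_{n=1}^{\infty} n(n+1)\,a_n\left(\frac{a}{r}\right)^{n+2} P_n(\cos\theta) + \mathbf{1}_\theta\,\frac{1}{a}\sum_{n=1}^{\infty} n\,a_n\left(\frac{a}{r}\right)^{n+2} P_n^1(\cos\theta) \qquad r > a$$
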

If a contour C is drawn in a plane of constant $\phi$, straddling the shell as shown in the figure, application
of Ampere's circuital law gives

$$j_\phi(\theta)\,a\,d\theta = \mu_0^{-1}\,a\,d\theta\left[\frac{1}{a}\sum_{n=1}^{\infty} n\,a_n P_n^1(\cos\theta) + \frac{1}{a}\sum_{n=1}^{\infty}(n+1)\,a_n P_n^1(\cos\theta)\right]$$

so that

$$j_\phi(\theta) = \frac{\mu_0^{-1}}{a}\sum_{n=1}^{\infty}(2n+1)\,a_n P_n^1(\cos\theta)$$

is the lineal current density, expressed as a sum of orthogonal terms. If the current distribution is specified, the normalization integral for Legendre polynomials can be used to deduce
the constants $a_n$.
The case in which all of the $a_n$ coefficients are zero except for n = 1 is particularly interesting, for then inside the shell $B_r = B\cos\theta$ and $B_\theta = -B\sin\theta$ and the field strength is
uniform. The current distribution required to achieve this effect varies as $\sin\theta$.
All the foregoing can be extended to problems involving $\phi$-type currents flowing in
spherical volumes by considering such volumes to be composed of nesting spherical shells;
the results given here then become a prototype solution.
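
The uniform-field case n = 1 can be checked independently by summing the on-axis fields of the current rings that make up the shell. This is a sketch under stated assumptions: the lineal density is taken as $j_\phi = j_0\sin\theta$ with $j_0$ a hypothetical amplitude, the standard on-axis field of a circular loop is used, and the expected interior value $2\mu_0 j_0/3$ follows from the series above with $a_1 = \mu_0 j_0 a/3$.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space (henry/m), i.e. 1 / mu0^-1


def axial_field_of_shell(z, a, j0, steps=4000):
    """B_z on the polar axis at height z from a shell with j_phi = j0*sin(theta).

    The shell is cut into thin rings; each ring contributes the standard
    on-axis field of a circular current loop (midpoint rule in theta).
    """
    dtheta = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * dtheta
        ring_radius = a * math.sin(theta)
        ring_height = a * math.cos(theta)
        ring_current = j0 * math.sin(theta) * a * dtheta
        dist2 = ring_radius**2 + (z - ring_height)**2
        total += MU0 * ring_current * ring_radius**2 / (2 * dist2**1.5)
    return total


if __name__ == "__main__":
    a, j0 = 0.1, 1000.0
    expected = 2 * MU0 * j0 / 3
    for frac in (0.0, 0.3, 0.6):
        print(f"z = {frac:.1f} a   B_z / (2 mu0 j0 / 3) = "
              f"{axial_field_of_shell(frac * a, a, j0) / expected:.6f}")
```

The printed ratios stay at unity for interior points along the axis, consistent with a uniform interior field produced by a $\sin\theta$ current distribution.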

Several other techniques have proved helpful in the solution of magnetostatic
boundary-value problems. The differential form of Ampere's law yields $\nabla\times\mathbf{B} = 0$
away from the sources, and thus in such regions B may be expressed as the gradient
of a scalar potential function in much the same manner as found in electrostatics. This
technique has been widely used when describing magnetic fields in terms of equivalent
magnetic dipoles.¹⁶
Since $\nabla\cdot\mathbf{A} = 0$ it is possible to introduce a vector function W by the relation
$\mathbf{A} = \nabla\times\mathbf{W}$. In turn, W has proved to be expressible as a series of orthogonal functions,
and a variety of problems are solvable by this technique,¹⁷ including the spherical
shell of current discussed in Example 4.13.

4.10

COMPOSITE FIELDS

At this stage in the analysis, it is possible to formulate an expression for the force F
on a charge q which is moving through XYZ at a velocity v(t), when that force is
contributed to by a composite of three sets of sources:
1. A static volume charge distribution $\rho_1(\xi,\eta,\zeta)$.
2. A system of uncompensated charges $\rho_2(\xi,\eta,\zeta)\,dV$ moving through space at the
constant† velocities $\mathbf{v}_2(\xi,\eta,\zeta)$.
3. A system of compensated charges $\rho_3(\xi,\eta,\zeta)\,dV$ moving at the constant velocities
$\mathbf{v}_3(\xi,\eta,\zeta)$. There are stationary charges $-\rho_3(\xi,\eta,\zeta)\,dV$ providing the compensation,
and one may talk conveniently of the charge pairs $(\rho_3\,dV, -\rho_3\,dV)$.

Through the use of the Dirac delta function, these volume charge densities can equally
well represent surface and lineal distributions, or discrete point charges.
The force on q is given by the Lorentz force law

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B})$$

† By this it should be recalled one means that the charge $\rho_2\,dV$, which is instantaneously in that
volume element dV which contains the point $(\xi,\eta,\zeta)$ has, for the moment, the particular velocity
$\mathbf{v}_2(\xi,\eta,\zeta)$. The identity of the charge in dV keeps changing, but on the time-average there is always
charge at this position with this velocity. Alternatively, if the progress of a specific charge is followed,
it will be found to occupy a succession of positions, momentarily taking on a progression of velocities
$\mathbf{v}_2$, which need have neither the same magnitudes nor directions.
¹⁶ M. Abraham and R. Becker, The Classical Theory of Electricity and Magnetism, 2d English ed.,
Chap. 7, Hafner Publishing Company, New York, 1949.
¹⁷ W. R. Smythe, Static and Dynamic Electricity, pp. 260-271, McGraw-Hill Book Company, New
York, 1939.

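in which

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0}\int_V \frac{\rho\,\boldsymbol{\xi}}{\xi^3}\,dV \qquad (4.55)$$

$$\mathbf{B}(x,y,z) = \frac{1}{4\pi\mu_0^{-1}}\int_V \frac{\boldsymbol{\iota}\times\boldsymbol{\xi}}{\xi^3}\,dV \qquad (4.56)$$

with $\rho = \rho_1 + \rho_2$ and $\boldsymbol{\iota} = \rho_2\mathbf{v}_2 + \rho_3\mathbf{v}_3$. The fields E and B, as given by (4.55) and
(4.56), satisfy the differential relations

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0} \qquad \nabla\cdot\mathbf{B} = 0 \qquad \nabla\times\mathbf{E} = 0 \qquad \nabla\times\mathbf{B} = \frac{\boldsymbol{\iota}}{\mu_0^{-1}} \qquad (4.57)$$

These composite fields, which are due to the most general aggregation of time-independent sources, are therefore the most general electrostatic and magnetostatic
fields obtainable.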
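
Once E and B are known at the position of the charge, evaluating the Lorentz force law is a matter of one cross product. The following minimal sketch uses plain tuples for Cartesian components; the function names and the sample values are illustrative assumptions, not quantities from the text.

```python
def cross(u, v):
    """Cartesian cross product u x v of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def lorentz_force(q, E, v, B):
    """F = q (E + v x B), all vectors given as (x, y, z) tuples in SI units."""
    v_cross_B = cross(v, B)
    return tuple(q * (E[i] + v_cross_B[i]) for i in range(3))


if __name__ == "__main__":
    # A unit charge moving along x through a field directed along z.
    print(lorentz_force(1.0, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)))
```

The magnetic part of the force is always perpendicular to both v and B, so in this sample case it points along the negative y direction.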
REFERENCES
1. Abraham, M., and R. Becker, The Classical Theory of Electricity and Magnetism, 2d English ed., Hafner Publishing Company, New York, 1949.
2. Corson, D. R., and P. Lorrain, Introduction to Electromagnetic Fields and Waves, W. H. Freeman and Company, San Francisco, 1962.
3. Duckworth, H. E., Electricity and Magnetism, Holt, Rinehart and Winston, Inc., New York, 1960.
4. Langmuir, R. V., Electromagnetic Fields and Waves, McGraw-Hill Book Company, New York, 1961.
5. Page, L., and N. I. Adams, Jr., Electrodynamics, D. Van Nostrand and Company, Inc., New York, 1940.
6. Panofsky, W. K. H., and M. Phillips, Classical Electricity and Magnetism, Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1956.
7. Shedd, P. C., Fundamentals of Electromagnetic Waves, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1954.
8. Smythe, W. R., Static and Dynamic Electricity, McGraw-Hill Book Company, New York, 1939.
9. Whittaker, E., A History of the Theories of Aether and Electricity, Vol. 1, Thomas Nelson and Sons, Ltd., London, 1951.

PROBLEMS
4.1

Two straight horizontal insulated aluminum wires carry equal steady currents in opposite
directions. If each wire is 0.5 cm in diameter and one wire lies on top of the other, what
current will barely cause separation?

4.2

A rectangular loop consists of a U-shaped conductor and a sliding bar. Find the force on
the bar as a function of the dimensions of the loop and the strength of the steady current I which flows in the loop.


4.3

Referring to Figure 4.1c, Biot bent the second wire into the form of a right angle and
found that the period of oscillation of the needle, with a steady current I passing through
the bent wire, was $T_1$. Upon passing half this much current through the straight vertical
wire, he measured a period $T_2$. The ratio $T_2/T_1$ was independent of the distance r and had a
mean value of 0.917. Is this result compatible with the Biot-Savart law? How should the
ratio of periods vary with the apex angle of the bent wire?

4.4

A circular loop carrying a current I and a long straight wire carrying a current I' lie in
the same plane. Show that the mutual force is proportional to $II'(\sec\alpha - 1)$ with $\alpha$ the
angle subtended by the circle at the nearest point of the straight wire.

4.5

If in the last problem the straight wire is placed perpendicular to the plane of the loop,
show that a torque exists tending to set the two wires in the same plane. Does your
answer depend on whether or not the straight wire is within the loop?

4.6

Show that the net self-force on a plane circular loop carrying a steady current is zero.

4.7

Show that a simple loop of arbitrary shape carrying a steady current tends to assume the
form of a plane circular loop.

4.8

Two circular wires of radii a and b have a common center and are free to turn on an axis
which is a diameter of both. Find the torque existing between these coils if they carry
steady currents $I_a$ and $I_b$ and are (1) at right angles, and (2) in the same plane.

4.9

For a current-carrying circular loop of radius a, show that the rate of change of field
along the axis is constant at a distance a/2 from the center. Thus show that if two identical coils are placed coaxially a distance apart equal to their radii, an extended region
exists in which the magnetic field strength is essentially constant. Two such coils arranged
in this manner are known as Helmholtz coils.

4.10

A long thin solenoid of length L and radius a carries a steady current I through its N turns.
Find the B field along the axis.

4.11

A high-speed electron has a dynamic mass which is 1.5 times its rest mass. At what
radius of curvature will it travel in a perpendicular magnetic field of strength $B_0 = 0.5$ webers/m²?

4.12

In a parallel plane diode operating at a constant potential $V_0$, the electrons normally
travel directly across from cathode to plate. If a transverse uniform magnetic field of
strength $B_0$ is interposed, what must be the value of $B_0$ just to prevent the electrons from
reaching the plate?

4.13

If ions of mass M and charge q are injected transversely into a region of uniform magnetic
field $B_0$, after having accelerated through a potential $V_0$, show that they will travel a
circular path of radius

$$R = \left[\frac{2V_0}{B_0^2}\left(\frac{M}{q}\right)\right]^{1/2}$$

If a variety of ions of different mass and the same charge are collected after having
traveled through a semicircle, they will be separated laterally due to their different orbital
radii. This is the principle of the mass spectroscope.

4.14

In a Wien velocity filter, ions of a particular velocity Vo are not deflected in passing
through a region containing steady electric and magnetic fields. How are the fields
arranged to accomplish this?

4.15

What magnetic moment results if a spherical conductor of radius a, possessing a net
charge Q, rotates with a constant angular velocity $\omega$?

4.16

Find the magnetic field everywhere if two infinite coaxial cylindrical shells of radii $a < b$
carry equal and opposite steady currents I.

4.17

An infinitely long cylindrical shell is segmented into four equal quarter-circles. These four
segments carry axial currents in alternate directions of uniform lineal density j amp/m.
If the cylinder radius is a, find the magnetic field B at all points.

4.18

A fine wire is wound in the form of a flat spiral of N turns, shaped like a disc of radius a.
Find the magnetic dipole moment if a steady current I flows through this winding.

4.19

A toroidal coil consists of a large number N of turns of wire and carries a steady current I.
Use Ampere's circuital law to determine the field at any point inside the toroid. If a and b
are the inner and outer radii of the toroid, find the percent variation in B over a cross
section as a function of b/a.

4.20

A fine wire is wound in a single layer of N closely spaced turns on the surface of an insulating spherical shell, such that the axes of the turns coincide with the polar axis of the
sphere. If a steady current I is passed through the winding, find the magnetic field
everywhere.

CHAPTER

Electromagnetics in Free Space


As the structure of the word implies, electromagnetics is concerned with interrelated
electric and magnetic fields, an effect which occurs when the two fields are time-varying.
This interrelation is normally introduced by accepting Faraday's emf law as an experimental postulate and adding to it the continuity equation for current, from which
ultimately Maxwell's equations may be deduced. However, the approach to be presented here will not require this additional experimental postulate. Instead, the most
general static electric and magnetic fields will be created in one coordinate system and the
resulting force expression transformed to another (moving) coordinate system. The
transformed force expression will be recognized as a generalization of the Lorentz
force law, and permits the definition of time-varying electric and magnetic fields. These
fields are then shown to be related through Maxwell's equations, one consequence of
which is Faraday's emf law. This approach provides the additional satisfaction of
identifying the electromagnetic fields in the Lorentz force law and in Maxwell's
equations as one and the same, an identity which can only be postulated in the customary derivation.
After Maxwell's equations have been established, the vector Green's theorem is used
to obtain a general solution for the electromagnetic field. Conditions at infinity are
studied, and convergence is demonstrated for real sources. The wavelike nature of the
general solution is demonstrated and then Poynting's theorem is derived to show the
energy content transported by these waves. The chapter continues with discussions of
solutions to the vector wave equation in rectangular, cylindrical, and spherical coordinate systems and concludes with a Minkowskian formulation of the field equations.

5.1 *

HISTORICAL SURVEY

* This section may be omitted without loss in continuity of the technical presentation.

The two major discoveries on which the theory of time-varying electromagnetic fields
is ordinarily based were made by Michael Faraday (1791-1867) and James Clerk Maxwell (1831-1879). Faraday's discovery was experimental and consisted of the significant
observation that a changing magnetic field would induce an electric field. Maxwell
was led by an analogy to the theoretical conclusion that the converse was also true,
namely, that a changing electric field would induce a magnetic field. In this respect,
time-varying electric fields play the same role as conduction currents, and Maxwell
combined the two into a total current which he showed to be continuous. The mathematical formulation of all these effects and their interrelations constitutes what is
known as Maxwell's theory.
Faraday was undoubtedly motivated in his discovery by what appears to have been
a basic tenet of his scientific philosophy-that every cause and effect has its converse.
Thus since Oersted's experiment and many developments which followed had clearly
shown that electricity can produce magnetic effects, it was reasonable to expect that
magnetism should be able to produce electricity. Faraday attacked this problem many
times without success. His laboratory notebook contains an entry dated December 28,
1824 describing an experiment in which a magnet was placed inside a helical coil
" . . . but in no case did the magnet seem to affect the current so as to alter its intensity
as shewn upon a magnetic needle placed under a distant part of it. . ."
Again, on November 28, 1825, his laboratory notes refer to a battery-connected wire
" . . . parallel to which was another similar wire separated from it only by two thicknesses of paper. The ends of the latter wire attached to a galvanometer exhibited no
action." Replacing either straight wire by a helix also had no effect.
A third try was recorded on April 22, 1828. Faraday suspended a copper ring by a
thread and placed a bar magnet inside the ring but could detect no induced current.
Faraday's efforts were paralleled by those of many other scientists, but no one was
having any appreciable measure of success. The difficulty lay in the fact that everyone
was looking for the creation of a steady current. Perhaps the most significant discovery
had been made by Arago in 1824. He suspended a magnetic compass needle over a
copper plate and set it into oscillation, noting that the presence of the copper plate
enhanced the damping. Upon eliminating air disturbances and rotating the copper
plate, Arago was able to make the needle revolve also, and even showed that this
dragging effect depended on the conductivity of the rotating plate. Faraday repeated
Arago's experiment in 1825 but, despite the suggestiveness of the results, the true
explanation of the phenomenon eluded both investigators.
Finally, on August 29, 1831, six years after his first attempt, Faraday discovered the
effect he had been seeking. His notes for that day state:¹
Have had an iron ring made (soft iron), iron round and 7/8 inches thick and ring 6 inches in
external diameter. Wound many coils of copper wire round one half, the coils being separated by twine and calico - there were three lengths of wire
each about 24 feet long and they could be connected as one
length or used as separate lengths. By trial with a trough each
was insulated from the other. Will call this side of the ring A.
On the other side but separated by an interval was wound wire
in two pieces together amounting to about 60 feet in length, the
direction being as with the former coils; this side call B.
Charged a battery of 10 pr. plates 4 inches square. Made the
coil on B side one coil and connected its extremities by a copper
wire passing to a distance and just over a magnetic needle (3
feet from iron ring). Then connected the ends of one of the
pieces on A side with battery; immediately a sensible effect on
needle. It oscillated and settled at last in original position. On breaking connection of A
side with Battery again a disturbance of the needle.
Made all the wires on A side one coil and sent current from battery through the whole.
Effect on needle much stronger than before.

¹ Faraday's Diary, vol. 1, p. 367. Published by G. Bell and Sons, Ltd., London, 1932.


This discovery of transformer action quickly led Faraday to an appreciation of the
entire effect. On September 24th he tried a different experiment. Using a remote helix
and compass needle as indicator, he wrapped a helical coil around a soft iron cylinder
and built up an apparatus which he described as follows:²
The iron cylinder and helix . . . . All the wires made into one helix and
these connected with the indicating helix at distance by copper wire: Then the
iron placed between the poles of bar magnets as . . . in fig. Every time the
magnetic contact at N or S was made or broken there was magnetic motion
at the indicating helix, the effect being as in former cases not permanent, but
a mere momentary push or pull. But if the electric communication (i.e. by the
copper wire) was broken then these disjunctions and contacts produced no effect
whatever. Hence here distinct conversion of Magnetism into Electricity.

On October 1st Faraday repeated the transformer experiment but
with a wooden core, and once again obtained the same effect, though
enough weaker that he had to substitute a galvanometer for the indicating helix. He concluded: "Hence there is an inducing effect without the
presence of iron . . . ."
Finally, on October 17th, Faraday performed the most significant experiment of all. He prepared a helical wire in the form of a cylinder and then³
. . . . a cylindrical bar magnet 3/4 inch in diameter and 8 1/2 inches in length had one end
just inserted into the end of the helix cylinder - then it was quickly thrust in the whole
length and the galvanometer needle moved - then pulled out and again the needle moved but
in the opposite direction. This effect was repeated every time the magnet was put in or out
and therefore a wave of Electricity was so produced from mere approximation of a magnet
and not from its formation in situ.

As noted in Chapter 3, Faraday preferred to think of all electric and magnetic effects
in terms of lines of force, having been first attracted to this view by observing the
disposition of iron filings in the neighborhood of a permanent magnet. He thus sought
to explain this new phenomenon of induced electricity in terms of an interaction with
magnetic flux lines. His raw thoughts on this subject are contained in an entry in the
laboratory notebook dated August 1, 1851, which contains the passages⁴
The force of a given magnet is definite and may be considered as represented by its
curves . . . . The curves . . . exist within the magnet as well as without; but within they
are in the contrary or return direction . . . . Whatever the condition of the interior of the
magnet: it has . . . the same kind and amount of power as the outside, and so is in full
analogy and similitude with an electro helix.
The intensity of the curves of a magnet vary greatly at different distances from the
magnet . . . . But the amount of force is definite and the same for every section of all the
curves . . . .
Hence it follows that whether the curves are intersected directly or obliquely makes no
difference provided they are intersected. The effect depends upon the number of curves
intersected. A wire moving obliquely may intersect fewer curves and therefore have a
feebler current evolved in it; but if it intersected only the same curves directly across,
it would have no larger a current.
So with a given moving wire or with a given wire under which a magnet is moving, the
quantity of electricity generated is directly as the amount of curves passed over or through.
With the same curves therefore it varies directly with the velocity of the motion.

² Ibid., vol. 1, p. 372.
³ Ibid., vol. 1, pp. 375-376.
⁴ Ibid., vol. 5, pp. 409-411.

This explanation of induction as being due to the relative motion between magnetic
lines of force and a conductor was refined by Faraday and included in a paper read
to the Royal Society later that year.⁵ It was given mathematical articulation by Maxwell as the equation

$$e = -\frac{d}{dt}\int_S B_n\,dS$$

in which e is the emf induced in a contour C and $\int_S B_n\,dS$ is the total magnetic flux
enclosed by C. If the contour C is occupied by a conductor, e is the source of the resulting induced electric current. In the above, S is an open surface erected on C as boundary, and $B_n$ is the normal component of flux density, thus representing the number of
magnetic lines of force per unit area. This famous equation is known as Faraday's emf
law.
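
A small numerical sketch may make the content of the emf law concrete. It assumes, purely for illustration, a fixed plane loop of a given area threaded by a spatially uniform flux density $B_n = B_0\sin\omega t$; the function names and numbers are hypothetical. The emf obtained by differencing the flux is compared with the analytic derivative.

```python
import math


def flux(t, B0, omega, area):
    """Total flux through the loop for a uniform normal field B0*sin(omega*t)."""
    return B0 * math.sin(omega * t) * area


def emf_numeric(t, B0, omega, area, dt=1e-6):
    """emf = -d(flux)/dt, estimated with a central difference."""
    return -(flux(t + dt, B0, omega, area) - flux(t - dt, B0, omega, area)) / (2 * dt)


def emf_analytic(t, B0, omega, area):
    """Closed-form derivative of the assumed flux."""
    return -B0 * omega * area * math.cos(omega * t)


if __name__ == "__main__":
    B0, omega, area = 0.1, 2 * math.pi * 50, 0.01   # tesla, rad/s, m^2
    for t in (0.0, 0.003, 0.005):
        print(f"t = {t:.3f} s   numeric = {emf_numeric(t, B0, omega, area):+.6f} V"
              f"   analytic = {emf_analytic(t, B0, omega, area):+.6f} V")
```

The emf is largest when the flux is changing most rapidly (at the zero crossings of the flux) and vanishes at the instants of maximum flux, which is why the early searches for a steady induced current found nothing.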
After his initial discovery of induction, Faraday continued to experiment with the
phenomenon. On October 28, 1831, he invented the first direct-current generator,
consisting of a copper plate rotating between magnetic poles, with an external circuit
attached between the center and rim of the plate. Through the years Faraday designed
and tested a variety of such generators, and his entry for October 11, 1851 describes
a machine consisting of a rotating wire rectangle with a commutator attached, this being
the prototype of the modern electric generator.
Faraday also discovered the phenomenon of self-induction (in 1834), unaware that
Joseph Henry (1797-1878) had made an independent discovery of the effect two years
earlier.†
It is impossible in a survey this brief to do justice to the painstaking, thorough
manner in which Faraday carefully built and enlarged his knowledge of electrical
phenomena. The interested reader is encouraged to read extensive sections of Faraday's
laboratory notebook in order to gain a full flavor of his accomplishments. As for the
discovery of electromagnetic induction itself, this must rank as one of the most important contributions ever made to scientific knowledge.

† In fairness to Henry, it should be stated that during this period he and Faraday independently discovered many important electromagnetic phenomena, including self- and mutual induction and many
of the principles of electric machines. Henry also developed the electromagnetic relay, perfected an
electromagnetic telegraph, and showed that voltage could be stepped up or down by properly proportioning the coils in a transformer. Henry's lack of promptness in announcing the results of his
experiments has probably been the primary cause of his neglect, but the remoteness of the New World
from the Old, in those days of slow communications, was a contributing factor. Faraday's achievements were more promptly disseminated to the European centers of learning, and news of Henry's
accomplishments often bore the appearance of mere confirmation of what Faraday had already done.
In the stimulation of further scientific inquiry by others, Faraday's influence was inestimably greater.
⁵ M. Faraday, "On Lines of Magnetic Force; Their Definite Character; and Their Distribution Within
a Magnet and Through Space," Phil. Trans. Roy. Soc. (London), 142, 25-56; 1852.


Although Faraday's law of induction was readily accepted, his explanation in terms
of lines of force fell mainly on deaf ears. The scientists of his day had been reared on
theories of action at a distance, theories which had enjoyed wide success in describing
a variety of electric and magnetic phenomena, as well as gravitational effects. The
eminent Astronomer Royal, Sir George Biddell Airy, declared that he could "hardly
imagine anyone who knows the agreement between observation and calculation based
on action at a distance to hesitate an instant between this simple and precise action
on the one hand and anything so vague and varying as lines of force on the other."
Maxwell was only twenty-four when he undertook to overcome this objection and
place Faraday's ideas on a firm mathematical basis. In the introduction to his first
paper on electricity, he stated that⁶
. . . the limit of my design is to show how, by a strict application of the ideas and methods
of Faraday, the connection of the very different orders of phenomena which he has discovered may be clearly placed before the mathematical mind.

After defining a single line of force as a curve in space whose direction at each point is
that of the force on a positive charge, or the force on an elementary north magnetic
pole, whichever the case may be, Maxwell continued
. . . We might in the same way draw other lines of force, till we had filled all space with
curves indicating by their direction that of the force at any assigned point.
We should thus obtain a geometrical model of the physical phenomena, which would
tell us the direction of the force, but we should still require some method of indicating the
intensity of the force at any point. If we consider these curves not as mere lines, but as
fine tubes of variable section carrying an incompressible fluid, then, since the velocity of
the fluid is inversely as the section of the tube, we may make the velocity vary according
to any given law, by regulating the section of the tube, and in this way we might represent
the intensity of the force as well as its direction by the motion of the fluid in these tubes.

Maxwell then pointed out that if the force law involves distance to the inverse square,
there would be no interstices between his tubes of force.
. . . The tubes will then be mere surfaces, directing the motion of a fluid filling up the
whole space. It has been usual to commence the investigation of the laws of these forces
by at once assuming that the phenomena are due to attractive or repulsive forces acting
between certain points. We may, however, obtain a different view of the subject, and one
more suited to our difficult inquiries, by adopting for the definition of the forces of which
we treat, that they may be represented in magnitude and direction by the uniform motion
of an incompressible fluid.

With this conception, Maxwell proceeded to show that all results obtained for static
charges or permanent magnets, using action-at-a-distance formulas, were also obtainable in terms of the distribution of tubes of force. Upon pointing out the equivalence
of a steady current element and a magnetic dipole, he was also able to extend this conclusion to magnetic phenomena caused by time-independent currents. However, in
discussing induced electric currents, Maxwell admitted
. . . The idea of the electro-tonic state,† however, has not yet presented itself to my mind
in such a form that its nature and properties may be clearly explained without reference
to mere symbols, and therefore I propose in the following investigation to use symbols
freely, and to take for granted the ordinary mathematical operations. By a careful study
of the laws of elastic solids and of the motions of viscous fluids, I hope to discover a method
of forming a mechanical conception of this electro-tonic state adapted to general reasoning.

⁶ J. C. Maxwell, "On Faraday's Lines of Force," read to the Cambridge Philosophical Society on
December 10, 1855 and February 11, 1856. Reprinted in Scientific Papers, vol. 1, pp. 155-229, Cambridge University Press, London, 1890.

Maxwell then concluded this first paper with an extensive mathematical development
in which the vector potential emerged as being representative of the electrotonic state,
its curl giving the magnetic field, and its time derivative yielding the induction effect.
He also showed that the curl of the magnetic field at any point was equal to the current
density at that point.
This first electrical paper by Maxwell can fairly be described as principally achieving
mathematical expression for all known electric and magnetic phenomena in terms of
Faraday's physical conceptions. It exhibits Maxwell's characteristic fondness for
models, a fondness which had led him to construct a top to illustrate the dynamics of a
rigid body rotating about a fixed point, and to construct a model of Saturn's rings
(now in the Cavendish Laboratory) to illustrate the motion of the satellites in the
rings. This rich physical imagination was now to lead Maxwell to his most important
discovery, through an extension of the tube of force model so as to explain the electrotonic state. This extension was accomplished in a second paper which appeared six
years later in the Philosophical Magazine, in which he offered the introductory remark 7
I propose now to examine magnetic phenomena from a mechanical point of view, and to
determine what tensions in, or motions of, a medium are capable of producing the mechanical phenomena observed. If, by the same hypothesis, we can connect the phenomena of
magnetic attraction with electromagnetic phenomena and with those of induced currents,
we shall have found a theory which, if not true, can only be proved to be erroneous by
experiments which will greatly enlarge our knowledge of this part of physics.

It has already been noted that Faraday looked upon electrostatic and magnetic induction as taking place along curved lines of force. He imagined these lines to be ropes of
molecules starting from a charged conductor or magnet, and acting on other nearby
bodies. These ropes of molecules were in tension, tending to shorten and at the same
time bulge out laterally. Thus the charged conductor or magnet tends to draw bodies
to itself, contracting its lines of force like the fibers of a muscle. Maxwell sought to
represent this longitudinal tension and transverse pressure in terms of equivalent
conditions in a fluid medium.
Let us now suppose that the phenomena of magnetism depend on the existence of a
tension in the direction of the lines of force, combined with a hydrostatic pressure; or in
other words, a pressure greater in the equatorial than in the axial direction: the next question
is, what mechanical explanation can we give of this inequality of pressures in a fluid or
mobile medium? The explanation which most readily occurs to the mind is that the excess
of pressure in the equatorial direction arises from the centrifugal force of vortices or eddies
in the medium having their axes in directions parallel to the lines of force . . . .

† Faraday called the state into which any body was thrown, due to the presence of a magnetic field,
the electrotonic state, and explained induction as being due to changes in the electrotonic state.
7 J. C. Maxwell, "On Physical Lines of Force," Phil Mag, 21, 161-175, 281-291, 338-348; 1861.
Reprinted in Scientific Papers, vol. 1, pp. 451-513, Cambridge University Press, London, 1890.


We shall suppose at present that all the vortices in any one part of the field are revolving in the same direction about axes nearly parallel, but that in passing from one part
of the field to another, the direction of the axes, the velocity of rotation, and the density
of the substance of the vortices are subject to change. We shall investigate the resultant
mechanical effect upon an element of the medium, and from the mathematical expression
of this resultant we shall deduce the physical character of its different component parts.

In order to have adjacent vortices rotating in the same direction, Maxwell next supposed that there exist between them a large number of minute spherical bodies which
roll, without sliding, in contact with the surfaces of the vortices. These particles, which
Maxwell assumed to constitute electricity, thus play the role of idler wheels. Under
this construction, for example, the static magnetic field of a permanent magnet can be
envisioned as consisting of vortices which fill the tubes of force, with the rotational
velocity of a vortex proportional to the strength of the field and thus varying with tube
cross section. With adjacent vortices in the magnetic field rotating at the same speed
in the same direction, the particles between them rotate idly but remain in the same
position. However, if a change should occur in the magnetic field, this would mean that
one of the vortices began rotating faster than the other, and thus the particles between
them would change position, indicating an electric current. In this way, Maxwell's
model demonstrated the creation of electric currents due to changes in the magnetic
field; hydrodynamical considerations of the relations between the rotational velocities
of adjacent vortices and the displacement of the idler particles led to a mathematical
statement of Faraday's emf law.
It was precisely at this point that the great value of the model became apparent.
If a change in vortex motion can cause a displacement of the idler particles, then the
converse should be true-a displacement of the idler particles should occasion a change
in vortex motion. Cause and effect are interchangeable. A changing magnetic field can
create an electric field; a changing electric field should produce a magnetic field. Maxwell was reaching the heart of his greatest contribution when, in Part 3 of the paper, he
said 8
According to our theory, the particles which form the partitions between the cells (vortices) constitute the matter of electricity. The motion of these particles constitutes an
electric current; the tangential force with which the particles are pressed by the matter of
the cells is electromotive force, and the pressure of the particles on each other corresponds
to the tension or potential of the electricity.
If we can now explain the condition of a body with respect to the surrounding medium
when it is said to be "charged" with electricity, and account for the force acting between
electrified bodies, we shall have established a connexion between all the principal phenomena of electrical science.

After pointing out that electromotive force (voltage due to magnetic effects) is the
same thing as electric tension (voltage due to charge separation), Maxwell distinguished
between conductors and insulators, concluding
Here then we have two independent qualities of bodies, one by which they allow of the
passage of electricity through them, and the other by which they allow of electrical action
being transmitted through them without any electricity being allowed to pass. A conducting body may be compared to a porous membrane which opposes more or less resistance to the passage of a fluid, while a dielectric is like an elastic membrane which may be
impervious to the fluid, but transmits the pressure of the fluid on one side to that on the
other.

8 Ibid., p. 490.

Maxwell next discussed the relation between conduction current and potential in a
conductor and then went on to say
Electromotive force acting on a dielectric produces a state of polarization of its parts
. . . . In a dielectric under induction, we may conceive that the electricity in each molecule
is so displaced that one side is rendered positively, and the other negatively electrical, but
that the electricity remains entirely connected with the molecule, and does not pass from
one molecule to another.
The effect of this action on the whole dielectric mass is to produce a general displacement of the electricity in a certain direction. This displacement does not amount to a
current, because when it has attained a certain value it remains constant, but it is the
commencement of a current, and its variations constitute currents in the positive or negative direction, according as the displacement is increasing or diminishing. The amount of
the displacement depends on the nature of the body, and on the electromotive force . . . .

Thus Maxwell introduced for the first time the concept that variations in position of
bound charge were equivalent in their effect to a conduction current. By letting motion
of the idler particles of his model represent either or both, and finding the variation in vortex velocity due to a particle displacement, he arrived at a generalization of
Ampere's circuital law.
The importance of this generalization cannot be overstated. If motion of the idler
particles could only represent conduction current, then an electrical disturbance could
only propagate through a conductive medium. But with the concept of displacement
current, field changes could be transmitted through dielectric media, including air, and
even including free space (which Maxwell considered to be an ether).
Maxwell recognized that a finite velocity would be associated with the propagation
of any disturbance through his model medium. He described the mechanism of propagation by imagining that a translational motion of one layer of idler particles would
initiate a change in angular velocity of the contiguous vortices. These in turn would set
the next layer of idler particles into translational motion, and in this manner the
disturbance would be transferred through a sequence of layers. Maxwell computed
the kinetic and potential energy which were transferred in this fashion, thus obtaining
a velocity of transport. By associating kinetic energy and potential energy with the
magnetic and electric fields respectively, he deduced that the velocity of propagation
of an electromagnetic disturbance was governed by the electrostatic permittivity and
magnetostatic permeability of the supporting medium. Upon using the values for
these constants, determined for air by Kohlrausch and Weber, Maxwell deduced that
the velocity of an electromagnetic disturbance should be 193,088 mi/sec. He then
concluded
. . . the velocity of light in air, as determined by M. Fizeau, is . . . 195,647 miles per second. The velocity of transverse undulations in our hypothetical medium . . . agrees so
exactly with the velocity of light calculated from the optical experiments of M. Fizeau, that
we can scarcely avoid the inference that light consists in the transverse undulations of the same
medium which is the cause of electric and magnetic phenomena.

This discovery may be likened to an earlier occasion when Newton first tested his
law of universal gravitation by making calculations on the distance of the moon. It
was Newton's misfortune to use an inaccurate value for the diameter of the earth, and
this led to such poor agreement that he put the theory aside for nearly two decades.
Maxwell was spared a similar disappointment in that both his value and Fizeau's
were in error in the same direction.
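The remark that both values erred in the same direction can be checked with a short calculation. The sketch below is not part of the original text: it assumes modern SI values for the permittivity and permeability of free space (not the Kohlrausch and Weber figures Maxwell used) and a conversion of 1609.344 m per mile, and it compares the resulting speed with the two figures quoted above.

```python
import math

# Assumed modern SI constants; these are NOT the Kohlrausch-Weber data Maxwell used.
EPS0 = 8.8541878128e-12        # permittivity of free space, F/m
MU0 = 4.0e-7 * math.pi         # permeability of free space, H/m (classical value)
METERS_PER_MILE = 1609.344


def wave_speed(eps, mu):
    """Speed of an electromagnetic disturbance, 1/sqrt(eps*mu), in m/s."""
    return 1.0 / math.sqrt(eps * mu)


c_mi_per_s = wave_speed(EPS0, MU0) / METERS_PER_MILE
maxwell_value = 193_088.0      # mi/sec, as quoted in the text
fizeau_value = 195_647.0       # mi/sec, as quoted in the text

maxwell_excess = maxwell_value / c_mi_per_s - 1.0
fizeau_excess = fizeau_value / c_mi_per_s - 1.0

print(f"modern value:      {c_mi_per_s:,.0f} mi/sec")
print(f"Maxwell's excess:  {100 * maxwell_excess:.1f} percent")
print(f"Fizeau's excess:   {100 * fizeau_excess:.1f} percent")
```

Both quoted figures come out a few percent above the modern value, which is consistent with the statement that the two errors lay in the same direction.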
It should be remembered that at this time no one had ever wittingly generated or
detected electromagnetic waves. The concept was completely new, as was the notion
of a displacement current. To link light to these hypothetical phenomena was a flash
of brilliance seldom equalled in the history of science. It was not to be until eight years
after Maxwell's death that these hypotheses would receive substantiation through the
experiments of Hertz.
Maxwell next discarded the model which had served so well as a scaffolding with
which to erect his theory, and in a third paper entitled "A Dynamical Theory of the
Electromagnetic Field," presented the theory completely in electrical terms. 9 The
properties of the field are described in terms of 20 equations, which include the relation
between displacement current and conduction current, and the continuity equation
linking charge to current, as well as what are now known conventionally as Maxwell's
equations. This paper was so carefully written that it later appears almost intact in
his Treatise.
These accomplishments, added to his contributions in color vision and molecular
theory, have earned Maxwell the place as the greatest theoretical physicist of the nineteenth century. At a centenary in 1931 honoring his birth, Max Planck reviewed the
evolution of man's knowledge of electrical phenomena and concluded 10 by saying of
Maxwell
. . . it was his task to build and complete the classical theory, and in so doing he achieved
greatness unequalled. His name stands magnificently over the portal of classical physics,

and we can say this of him: by his birth, James Clerk Maxwell belongs to Edinburgh,
by his personality he belongs to Cambridge, by his work he belongs to the whole world.

Maxwell's equations, as has already been noted in Chapter 2, played a central role
in the development of the theory of special relativity. Lorentz used them as an invariant
to derive the transformation which bears his name, and Einstein devoted much of his
first paper to the same subject. In the sections to follow, this process will in effect be
reversed. The Lorentz transformation has already been established in terms of fundamental considerations of length and time measurements. The Lorentz equations will be
used to derive Maxwell's equations from a transformation of Coulomb's law.

5.2 THE TRANSFORMATION EQUATIONS FOR ELECTRIC AND MAGNETIC FIELDS

Suppose that an observer O', stationary in X'Y'Z', has created the most general composite
electrostatic and magnetostatic fields, E'(x',y',z') and B'(x',y',z'). He can do this
through the use of three types of sources: (1) a static system of charges, (2) a steady
current consisting of the flow of uncompensated charges, and (3) a steady current
against the background of static compensating charges. This situation is suggested in
Figure 5.1. Formulas for the static fields arising from a composition of these three
types of sources were given in Section 4.10. If, in the presence of these fields, a charge q
is moving through X'Y'Z' at a velocity v', observer O' can say that the force on q is

$$\mathbf{F}' = q(\mathbf{E}' + \mathbf{v}' \times \mathbf{B}') \tag{5.1}$$

which is a use of the Lorentz force law.

[Figure 5.1: static charges, uncompensated circulating charges, and compensated circulating charges in X'Y'Z', with a charge q at (x',y',z') moving at velocity v'(t').]
FIGURE 5.1 Composite sources causing most general static fields B'(x',y',z') and E'(x',y',z') which interact with a moving charge q.

9 First read to the Royal Society in 1864. Published in Phil Trans Roy Soc (London), 155; 1865.
Reprinted in Scientific Papers, vol. 1, pp. 526-597, Cambridge University Press, London, 1890.
10 James Clerk Maxwell, A collection of commemorative essays, p. 65, The Macmillan Company,
New York, 1931.


Imagine that a second observer O is stationary in a frame XYZ which is in constant
motion with respect to X'Y'Z' such that the respective axes are aligned and X is sliding
along the -X' axis at speed u. Then the Lorentz transformation equations (2.40) are
applicable and observer O will deduce that the force on q is F, where

$$\mathbf{F} = [\mathbf{1}_x F'_x + K(\mathbf{1}_y F'_y + \mathbf{1}_z F'_z)] + \frac{Ku}{c^2}\,\mathbf{v} \times (\mathbf{1}_x \times \mathbf{F}') \tag{5.2}$$

in which $K = (1 - u^2/c^2)^{-1/2}$ and v(t) is the velocity of q in XYZ. Equation (5.2) is
merely a restatement of (2.76) and the present development in some ways parallels
the opening development of Chapter 4. Two cases of (5.2) will now be considered.

Case 1: v = 0.

In this case, q is static in XYZ and the force F is just the bracketed term in (5.2).
Observer O, who is accustomed to the idea that magnetic fields exert forces only on
moving charges, will ascribe this force to an electric field, since q is not moving relative
to O. Thus he defines an electric field such that

$$\mathbf{F} = q\mathbf{E} = \mathbf{1}_x F'_x + K(\mathbf{1}_y F'_y + \mathbf{1}_z F'_z) \tag{5.3}$$

This electric field depends on time as well as spatial position because the sources of O'
are moving relative to O.
With q stationary in XYZ, $\mathbf{v}' = -\mathbf{1}_x u$ and (5.1) gives

$$F'_x = qE'_x \qquad F'_y = q(E'_y + uB'_z) \qquad F'_z = q(E'_z - uB'_y) \tag{5.4}$$

Insertion of (5.4) in (5.3) yields the field transformation equations

$$E_x = E'_x \qquad E_y = K(E'_y + uB'_z) \qquad E_z = K(E'_z - uB'_y) \tag{5.5}$$

The electric field in XYZ is seen to be contributed to by both the electric and magnetic
fields of X'Y'Z', and in relative amounts controlled by u.

Case 2: v ≠ 0.
In this case q has an arbitrary motion v(t) in XYZ and the force F is the entire
expression (5.2). The charge q also has an arbitrary motion v'(t') in X'Y'Z', and
(5.1) gives

$$\begin{aligned}
F'_x &= q(E'_x + v'_y B'_z - v'_z B'_y) \\
F'_y &= q(E'_y + v'_z B'_x - v'_x B'_z) \\
F'_z &= q(E'_z + v'_x B'_y - v'_y B'_x)
\end{aligned} \tag{5.6}$$

If the velocity transformation equations (2.50) are used to replace the components of
v' by those of v in (5.6), and the resulting equations are inserted in (5.2), one obtains

$$\begin{aligned}
\mathbf{F} = q\Big\{ & \mathbf{1}_x\Big[E'_x + \frac{Ku}{c^2}(v_y E'_y + v_z E'_z) + K(v_y B'_z - v_z B'_y)\Big] \\
& + \mathbf{1}_y\Big[K\Big(1 - \frac{uv_x}{c^2}\Big)E'_y + v_z B'_x - K(v_x - u)B'_z\Big] \\
& + \mathbf{1}_z\Big[K\Big(1 - \frac{uv_x}{c^2}\Big)E'_z - v_y B'_x + K(v_x - u)B'_y\Big]\Big\}
\end{aligned} \tag{5.7}$$

Observer O can introduce a magnetic field B(x,y,z,t) to account for the fact that the
force on q is different because it is now in motion through XYZ. This magnetic field
will be time-varying because the sources of O' are moving relative to O. If observer O
chooses to define B(x,y,z,t) such that the Lorentz force law is still valid, then he can
write

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}) \tag{5.8}$$

in which qE is the force on q when it is at rest in XYZ, and thus E is as given in Case 1
by Equations (5.5). The magnetic field B must be such that (5.7) and (5.8) equate.
Upon comparing components of these two equations, one finds that

$$B_x = B'_x \qquad B_y = K\Big(B'_y - \frac{u}{c^2}E'_z\Big) \qquad B_z = K\Big(B'_z + \frac{u}{c^2}E'_y\Big) \tag{5.9}$$
These transformation equations, together with (5.5), form a set from which 0 can
determine the fields which interact with q to produce the force given by the Lorentz
equation (5.8). The sources which have produced these fields are of a restricted class,
being time-independent in X'Y'Z', but this restriction will be lifted shortly.
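A numerical spot check can make the consistency of this construction concrete. The sketch below is an illustration, not part of the original text. It assumes that (5.9) reads B_x = B'_x, B_y = K(B'_y - uE'_z/c^2), B_z = K(B'_z + uE'_y/c^2), that (2.50) is the usual relativistic velocity addition for frames in relative motion along x, and that (5.2) has the form F = [1_x F'_x + K(1_y F'_y + 1_z F'_z)] + (Ku/c^2) v x (1_x x F'). All numerical values are arbitrary and all function names are hypothetical. The force on q is computed once directly from (5.8) with the transformed fields, and once by transforming the force that O' computes from (5.1).

```python
import math

C = 299_792_458.0  # speed of light, m/s


def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lorentz_force(q, E, v, B):
    """Lorentz force law, the form of (5.1) and (5.8)."""
    vxB = cross(v, B)
    return tuple(q * (E[i] + vxB[i]) for i in range(3))


def fields_in_xyz(Ep, Bp, u):
    """Fields seen by O from those of O', following (5.5) and the assumed (5.9)."""
    K = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    E = (Ep[0], K * (Ep[1] + u * Bp[2]), K * (Ep[2] - u * Bp[1]))
    B = (Bp[0],
         K * (Bp[1] - u * Ep[2] / C ** 2),
         K * (Bp[2] + u * Ep[1] / C ** 2))
    return E, B


def velocity_in_primed(v, u):
    """Velocity of q in X'Y'Z' from its velocity in XYZ (assumed form of (2.50))."""
    K = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    d = 1.0 - u * v[0] / C ** 2
    return ((v[0] - u) / d, v[1] / (K * d), v[2] / (K * d))


def force_in_xyz(Fp, v, u):
    """Force seen by O from the force seen by O' (assumed form of (5.2))."""
    K = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    first = (Fp[0], K * Fp[1], K * Fp[2])
    second = cross(v, cross((1.0, 0.0, 0.0), Fp))
    return tuple(first[i] + K * u / C ** 2 * second[i] for i in range(3))


# Arbitrary illustrative values.
q = 1.6e-19                         # coul
u = 0.6 * C                         # relative speed of the frames
v = (0.3 * C, -0.2 * C, 0.5 * C)    # velocity of q in XYZ
Ep = (1.0e3, -2.0e3, 5.0e2)         # static E' in X'Y'Z', V/m
Bp = (1.0e-5, 2.0e-5, -3.0e-5)      # static B' in X'Y'Z', T

E, B = fields_in_xyz(Ep, Bp, u)
F_direct = lorentz_force(q, E, v, B)
F_transformed = force_in_xyz(
    lorentz_force(q, Ep, velocity_in_primed(v, u), Bp), v, u)

scale = max(abs(component) for component in F_direct)
max_mismatch = max(abs(F_direct[i] - F_transformed[i]) for i in range(3)) / scale
print("relative mismatch between the two routes:", max_mismatch)
```

Under these assumptions the two routes agree to rounding error, which is the content of the statement that B must be chosen so that (5.7) and (5.8) equate.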
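Upon properly combining (5.5) and (5.9) one can establish that the inverse transformation is

$$E'_x = E_x \qquad E'_y = K(E_y - uB_z) \qquad E'_z = K(E_z + uB_y) \tag{5.10}$$

$$B'_x = B_x \qquad B'_y = K\Big(B_y + \frac{u}{c^2}E_z\Big) \qquad B'_z = K\Big(B_z - \frac{u}{c^2}E_y\Big) \tag{5.11}$$

5.3 THE TRANSFORMATION EQUATIONS FOR THE SOURCE DENSITIES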

It will be desirable to relate the fields E(x,y,z,t) and B(x,y,z,t) to the time-independent
sources ρ'(x',y',z') and ι'(x',y',z') created by O'. However, it is convenient first to transform these sources into their XYZ equivalents. This can be done by considering an
arbitrary volume element dV_0, in which an amount of charge ρ_0 dV_0 is at rest. This
volume element is assumed to be moving through X'Y'Z' at velocity w', and through
XYZ at velocity w. To O', this volume element has a size $dV' = dV_0[1 - (w')^2/c^2]^{1/2}$
and to O it has a size $dV = dV_0(1 - w^2/c^2)^{1/2}$. Since charge is an invariant,

$$\rho'\,dV' = \rho_0\,dV_0 = \rho\,dV$$

from which

$$\frac{\rho}{\rho'} = \frac{dV'}{dV} = \left[\frac{1 - (w')^2/c^2}{1 - w^2/c^2}\right]^{1/2} \tag{5.12}$$

The velocity transformation equations (2.50) give

$$w^2 = \Big(1 + \frac{uw'_x}{c^2}\Big)^{-2}\Big\{[(w')^2 - (w'_x)^2]\Big(1 - \frac{u^2}{c^2}\Big) + (w'_x + u)^2\Big\}$$

which may be used to convert (5.12) to the form

$$\rho(x,y,z,t) = K\rho'(x',y',z')\Big(1 + \frac{uw'_x}{c^2}\Big) \tag{5.13}$$

Since ι(x,y,z,t) = ρ(x,y,z,t)w(x,y,z,t), another use of the velocity transformation (2.50)
yields

$$\iota_x = K\rho'(w'_x + u) = K(\iota'_x + u\rho') \qquad \iota_y = \rho' w'_y = \iota'_y \qquad \iota_z = \rho' w'_z = \iota'_z \tag{5.14}$$


Equations (5.13) and (5.14) are called the source transformation equations and will be
of assistance in the determination of the dependence of E and B on the sources.
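The source transformation can be exercised numerically. The following sketch is illustrative only. It assumes (5.13) in the form ρ = Kρ'(1 + uw'_x/c^2), takes ι = ρw with w obtained from the usual velocity addition (assumed to be what (2.50) states), and uses arbitrary sample values. It checks the result against the volume-contraction ratio (5.12) and against the component form ι_x = K(ι'_x + uρ'), ι_y = ι'_y, ι_z = ι'_z.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def sources_in_xyz(rho_p, w_p, u):
    """Charge and current density seen by O, from rho' and the drift velocity w' seen by O'.

    Assumes (5.13) as rho = K rho' (1 + u w'_x / c^2) and iota = rho * w,
    with w given by relativistic velocity addition along x.
    """
    K = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    d = 1.0 + u * w_p[0] / C ** 2
    rho = K * rho_p * d
    w = ((w_p[0] + u) / d, w_p[1] / (K * d), w_p[2] / (K * d))
    iota = tuple(rho * component for component in w)
    return rho, iota, w


def speed_squared(w):
    return sum(component ** 2 for component in w)


# Arbitrary illustrative values.
u = 0.5 * C
rho_p = 2.0e-6                          # coul/m^3 in X'Y'Z'
w_p = (0.2 * C, 0.3 * C, -0.1 * C)      # drift velocity of the charge in X'Y'Z'
iota_p = tuple(rho_p * component for component in w_p)

rho, iota, w = sources_in_xyz(rho_p, w_p, u)
K = 1.0 / math.sqrt(1.0 - (u / C) ** 2)

# Ratio predicted by the contraction argument of (5.12).
ratio_from_volumes = math.sqrt(
    (1.0 - speed_squared(w_p) / C ** 2) / (1.0 - speed_squared(w) / C ** 2))

# Component form of the current transformation.
iota_expected = (K * (iota_p[0] + u * rho_p), iota_p[1], iota_p[2])

print("rho/rho' from (5.13):", rho / rho_p)
print("rho/rho' from (5.12):", ratio_from_volumes)
```

The two ratios coincide, as they must if (5.13) is a rewriting of (5.12), and the current components match the component form to rounding error.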

5.4 MAXWELL'S EQUATIONS

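The relations between the static fields E'(x',y',z'), B'(x',y',z') and the time-independent
sources ρ'(x',y',z'), ι'(x',y',z') are already known, being given by

$$\nabla' \cdot \mathbf{E}' = \rho'/\epsilon_0 \qquad \nabla' \times \mathbf{E}' = 0 \qquad \nabla' \times \mathbf{B}' = \mu_0\boldsymbol{\iota}' \qquad \nabla' \cdot \mathbf{B}' = 0 \tag{5.15}$$

If all the quantities in these four equations are converted to their XYZ equivalents,
with the assistance of the transformations developed in Sections 5.2 and 5.3, the result
will be a set of equations in which the dependence of E and B on the sources is displayed. To see how this is accomplished, consider any function f of the four coordinate
variables. Upon making use of the Lorentz Equations (2.40), one can establish that

$$\begin{aligned}
\frac{\partial f}{\partial x'} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial x'} + \frac{\partial f}{\partial t}\frac{\partial t}{\partial x'} = K\frac{\partial f}{\partial x} + \frac{Ku}{c^2}\frac{\partial f}{\partial t} \\
\frac{\partial f}{\partial t'} &= \frac{\partial f}{\partial t}\frac{\partial t}{\partial t'} + \frac{\partial f}{\partial x}\frac{\partial x}{\partial t'} = K\frac{\partial f}{\partial t} + Ku\frac{\partial f}{\partial x} \\
\frac{\partial f}{\partial y'} &= \frac{\partial f}{\partial y} \qquad \frac{\partial f}{\partial z'} = \frac{\partial f}{\partial z}
\end{aligned} \tag{5.16}$$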
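The derivative relations can be spot-checked by finite differences. The sketch below is not from the original text. It assumes that (2.40) has the form x = K(x' + ut'), t = K(t' + ux'/c^2), so that x' = K(x - ut) and t' = K(t - ux/c^2), which is the form implied by (5.16). It works in units with c = 1 and uses an arbitrary smooth test function; the names are hypothetical.

```python
import math

C = 1.0      # units with c = 1 (an assumption, to keep step sizes well scaled)
U = 0.6      # relative speed u in those units
K = 1.0 / math.sqrt(1.0 - (U / C) ** 2)


def to_primed(x, t):
    """Assumed inverse of the Lorentz equations (2.40) for motion along x."""
    return K * (x - U * t), K * (t - U * x / C ** 2)


def f_primed(xp, tp):
    """Arbitrary smooth test function of the primed coordinates."""
    return math.sin(1.3 * xp) * math.exp(-0.4 * tp) + xp * tp


def f_unprimed(x, t):
    """The same function expressed in the unprimed coordinates."""
    return f_primed(*to_primed(x, t))


def central(fun, a, b, index, h=1.0e-5):
    """Central-difference partial derivative with respect to argument `index`."""
    if index == 0:
        return (fun(a + h, b) - fun(a - h, b)) / (2.0 * h)
    return (fun(a, b + h) - fun(a, b - h)) / (2.0 * h)


x, t = 0.7, 0.2
xp, tp = to_primed(x, t)

df_dx = central(f_unprimed, x, t, 0)
df_dt = central(f_unprimed, x, t, 1)

lhs_x = central(f_primed, xp, tp, 0)              # df/dx'
rhs_x = K * df_dx + K * U / C ** 2 * df_dt        # first line of (5.16)
lhs_t = central(f_primed, xp, tp, 1)              # df/dt'
rhs_t = K * df_dt + K * U * df_dx                 # second line of (5.16)

print("df/dx' :", lhs_x, "vs", rhs_x)
print("df/dt' :", lhs_t, "vs", rhs_t)
```

The y and z relations are omitted because those coordinates are unchanged by the transformation.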

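Application of these formulas to the curl of E' gives terms such as

$$\frac{\partial E'_x}{\partial z'} - \frac{\partial E'_z}{\partial x'} = \frac{\partial E'_x}{\partial z} - K\frac{\partial E'_z}{\partial x} - \frac{Ku}{c^2}\frac{\partial E'_z}{\partial t}$$

which, with the use of (5.10), can be written

$$\frac{\partial E'_x}{\partial z'} - \frac{\partial E'_z}{\partial x'} = \frac{\partial E_x}{\partial z} - K^2\Big(\frac{\partial E_z}{\partial x} + u\frac{\partial B_y}{\partial x}\Big) - \frac{K^2 u}{c^2}\Big(\frac{\partial E_z}{\partial t} + u\frac{\partial B_y}{\partial t}\Big)$$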

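Upon determining all three components in this manner, one may write

$$\begin{aligned}
\nabla' \times \mathbf{E}' = 0 = {} & \mathbf{1}_x K\Big[\Big(\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z}\Big) + u\Big(\nabla \cdot \mathbf{B} - \frac{\partial B_x}{\partial x}\Big)\Big] \\
& + \mathbf{1}_y\Big[\Big(\frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x}\Big) + (1 - K^2)\frac{\partial E_z}{\partial x} - K^2 u\frac{\partial B_y}{\partial x} - \frac{K^2 u}{c^2}\frac{\partial E_z}{\partial t} - \frac{K^2 u^2}{c^2}\frac{\partial B_y}{\partial t}\Big] \\
& + \mathbf{1}_z\Big[\Big(\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y}\Big) - (1 - K^2)\frac{\partial E_y}{\partial x} - K^2 u\frac{\partial B_z}{\partial x} + \frac{K^2 u}{c^2}\frac{\partial E_y}{\partial t} - \frac{K^2 u^2}{c^2}\frac{\partial B_z}{\partial t}\Big]
\end{aligned} \tag{5.17}$$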

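This result can be simplified considerably. If f is any function of x', y', z' but not of t',
it follows from the second of Equations (5.16) that

$$\frac{\partial f}{\partial t} = -u\frac{\partial f}{\partial x}$$

When this relation is used in (5.17) one obtains

$$\begin{aligned}
0 = {} & \mathbf{1}_x K\Big[\Big(\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z}\Big) + \frac{\partial B_x}{\partial t} + u\,\nabla \cdot \mathbf{B}\Big] \\
& + \mathbf{1}_y\Big[\Big(\frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x}\Big) + \frac{\partial B_y}{\partial t}\Big] + \mathbf{1}_z\Big[\Big(\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y}\Big) + \frac{\partial B_z}{\partial t}\Big]
\end{aligned} \tag{5.18}$$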
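Further simplification is possible through determination of ∇·B. Since

$$\nabla \cdot \mathbf{B} = \frac{\partial B_x}{\partial x} + \frac{\partial B_y}{\partial y} + \frac{\partial B_z}{\partial z} = K\frac{\partial B_x}{\partial x'} - \frac{Ku}{c^2}\frac{\partial B_x}{\partial t'} + \frac{\partial B_y}{\partial y'} + \frac{\partial B_z}{\partial z'}$$

use of (5.9) and recognition of the fact that partial derivatives with respect to t' are
all zero leads to

$$\nabla \cdot \mathbf{B} = K\,\nabla' \cdot \mathbf{B}' - \frac{Ku}{c^2}\Big(\frac{\partial E'_z}{\partial y'} - \frac{\partial E'_y}{\partial z'}\Big) \tag{5.19}$$

The right side of (5.19) is zero by virtue of (5.15) and thus

$$\nabla \cdot \mathbf{B} = 0 \tag{5.20}$$

which means that (5.18) can be written

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{5.21}$$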

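When this procedure is repeated for the curl of B' one obtains

$$\begin{aligned}
\nabla' \times \mathbf{B}' = \mu_0\boldsymbol{\iota}' = {} & \mathbf{1}_x\Big\{K\Big[\Big(\frac{\partial B_z}{\partial y} - \frac{\partial B_y}{\partial z}\Big) - \frac{1}{c^2}\frac{\partial E_x}{\partial t}\Big] - \frac{Ku}{c^2}\,\nabla \cdot \mathbf{E}\Big\} \\
& + \mathbf{1}_y\Big[\Big(\frac{\partial B_x}{\partial z} - \frac{\partial B_z}{\partial x}\Big) - \frac{1}{c^2}\frac{\partial E_y}{\partial t}\Big] + \mathbf{1}_z\Big[\Big(\frac{\partial B_y}{\partial x} - \frac{\partial B_x}{\partial y}\Big) - \frac{1}{c^2}\frac{\partial E_z}{\partial t}\Big]
\end{aligned} \tag{5.22}$$

Once again reduction is possible since

$$\nabla \cdot \mathbf{E} = K\,\nabla' \cdot \mathbf{E}' + Ku\Big(\frac{\partial B'_z}{\partial y'} - \frac{\partial B'_y}{\partial z'}\Big) = K\Big(\frac{\rho'}{\epsilon_0} + u\mu_0\iota'_x\Big) = \frac{K\rho'}{\epsilon_0}\Big(1 + \frac{uw'_x}{c^2}\Big)$$

Use of (5.13) yields

$$\nabla \cdot \mathbf{E} = \frac{\rho}{\epsilon_0} \tag{5.23}$$

and then use of (5.14) means that (5.22) can be written

$$\nabla \times \mathbf{B} = \mu_0\boldsymbol{\iota} + \frac{1}{c^2}\frac{\partial \mathbf{E}}{\partial t} \tag{5.24}$$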


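All these results are relativistically exact. When collected together, they are known
as Maxwell's equations and can be written in the form

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{5.25a}$$

$$\nabla \times \mathbf{H}_0 = \boldsymbol{\iota} + \frac{\partial \mathbf{D}_0}{\partial t} \tag{5.25b}$$

$$\nabla \cdot \mathbf{D}_0 = \rho \tag{5.25c}$$

$$\nabla \cdot \mathbf{B} = 0 \tag{5.25d}$$

in which

$$\mathbf{D}_0 = \epsilon_0\mathbf{E} \tag{5.26}$$

$$\mathbf{H}_0 = \mu_0^{-1}\mathbf{B} \tag{5.27}$$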

The sources ρ, ι and the fields due to them, E, B, D_0, H_0, are all time-varying. However,
they are due to a restricted class of sources, namely those consisting of static charges
and steady currents in X'Y'Z'. But this restriction can be lifted by a simple argument.
If a second set of steady sources exists in another coordinate system X''Y''Z'', they
will give rise to additional time-varying fields in XYZ which will also satisfy (5.25).
By superposition, the sum of the sources in X'Y'Z' and in X''Y''Z'' will give rise to
fields in XYZ which satisfy (5.25). If this is generalized to include steady sources in all
Lorentzian frames, including those traveling at any speed in any direction relative to
XYZ, the sum of such sources can result in completely general distributions ρ(x,y,z,t)
and ι(x,y,z,t). This fact is demonstrated in Appendix E. Thus Equations (5.25) have
the widest validity and can form the basis for the study of all types of electromagnetic
fields. Of course, observer O need not rely on the steady sources of O', O'', etc., to
establish his time-varying electromagnetic fields, but can do this equally well himself
by direct creation of the time-varying sources ρ and ι.
Integral forms of Maxwell's equations follow readily from (5.25) with the aid of
Stokes' theorem and the divergence theorem. They are:

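$$\oint_C \mathbf{E} \cdot d\boldsymbol{\ell} = -\int_S \dot{\mathbf{B}} \cdot d\mathbf{S} \tag{5.28a}$$

$$\oint_C \mathbf{H}_0 \cdot d\boldsymbol{\ell} = \int_S (\boldsymbol{\iota} + \dot{\mathbf{D}}_0) \cdot d\mathbf{S} \tag{5.28b}$$

$$\oint_S \mathbf{D}_0 \cdot d\mathbf{S} = \int_V \rho\,dV \tag{5.28c}$$

$$\oint_S \mathbf{B} \cdot d\mathbf{S} = 0 \tag{5.28d}$$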

The first of these equations is often called Faraday's emf law and states that the line
integral of longitudinal E around any closed path is equal to minus the time rate of
change of magnetic flux enclosed. The second equation is a generalization of Ampere's
circuital law and casts $\dot{\mathbf{D}}_0$ in the same role as ι. This point was first appreciated by
Maxwell, who gave to D_0 the name displacement. For this reason, $\dot{\mathbf{D}}_0$ is called the displacement current. The third equation is a generalization of the integral form of
Poisson's equation, and the fourth integral states that at all times the total magnetic
flux piercing any closed surface is zero.
If the divergence of Equation (5.25a) is formed, the left side is zero because of the
vector identity (V.111); this is matched by the right side, which is also zero by virtue
of (5.25d). Similarly, if the divergence of (5.25b) is taken, one obtains

$$\nabla \cdot \nabla \times \mathbf{H}_0 = \nabla \cdot (\boldsymbol{\iota} + \dot{\mathbf{D}}_0) = 0$$

indicating that the total current is continuous. Thus

$$\nabla \cdot \boldsymbol{\iota} = -\frac{\partial}{\partial t}(\nabla \cdot \mathbf{D}_0)$$

Use of (5.25c) converts this to

$$\nabla \cdot \boldsymbol{\iota} = -\dot{\rho} \tag{5.29}$$

which is known as the continuity equation. In words, $\nabla \cdot \boldsymbol{\iota}\,dV$ is the net efflux of current from a volume element dV, and $-\dot{\rho}\,dV$ is the time rate of decrease of charge within
dV. It is quite natural that these two quantities should be equal.
The continuity equation, which links charge and current, in no way denies the existence of charge without current, since it involves only $\dot{\rho}$. A static charge distribution
ρ(x,y,z) satisfies (5.29) with no current flow.
EXAMPLE 5.1

Consider two rectangular conducting blocks, as shown in the figure, separated by a small
distance l so as to form a parallel plate capacitor. Assume a uniform electron flow in the
direction indicated, so that $\boldsymbol{\iota} = \mathbf{1}_z\iota_z$ is upward in both blocks. If A is the cross-sectional
area of each block, charge is accumulating on the adjacent faces at a rate $\iota_z A$ coul/sec.

[Figure: two conducting blocks separated by a small gap, with the direction of electron flow indicated.]

Therefore the total flux between faces, neglecting fringing, is increasing at the rate of
$\iota_z A$ lines/sec, or

$$\dot{\mathbf{D}}_0 = \boldsymbol{\iota}$$

Thus the conduction current in the blocks is exactly replaced by a displacement current
in the interspace and the total current is continuous, in agreement with $\nabla \cdot (\boldsymbol{\iota} + \dot{\mathbf{D}}_0) = 0$,
as derived above.
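The bookkeeping of this example can be followed numerically. The sketch below is an illustration only: the current density, plate area, and time step are hypothetical values, fringing is neglected as in the example, and the faces are assumed uncharged at t = 0. It accumulates charge on the faces at the rate ι_z A, converts it to flux density, and differentiates numerically to recover the displacement current density.

```python
import math


def gap_flux_density(iota_z, area, t):
    """Flux density D_0 (coul/m^2) in the gap at time t, starting from uncharged faces.

    The charge on each face grows at the rate iota_z * area, and with fringing
    neglected the flux density equals the surface charge density.
    """
    charge = iota_z * area * t
    return charge / area


iota_z = 2.0e3      # conduction current density in the blocks, A/m^2 (hypothetical)
area = 1.0e-4       # cross-sectional area A, m^2 (hypothetical)
t0 = 1.0e-6         # instant at which the rate is evaluated, s
dt = 1.0e-9         # step for the numerical time derivative, s

D_dot = (gap_flux_density(iota_z, area, t0 + dt)
         - gap_flux_density(iota_z, area, t0)) / dt

print("conduction current density in the blocks:", iota_z, "A/m^2")
print("displacement current density in the gap: ", D_dot, "A/m^2")
```

The two densities agree, so the total current crossing any horizontal plane is the same whether the plane cuts a block or the gap.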

If this entire development, beginning with Section 5.2, had been undertaken by
starting with steady sources in X'Y'Z', and asking what fields would be determined
by an observer O*, in a coordinate system X*Y*Z* which was moving at a speed u*
relative to X'Y'Z', all the same results would once again be obtained. Time-varying
fields E*, B*, due to time-varying source distributions ρ*, ι* would be found to satisfy
Maxwell's equations. The question could then be raised as to the relations between the
fields E, B observed by O and the fields E*, B* observed by O*. It is easy to show that
these two sets of fields are related by the previously obtained transformations (5.5)
and (5.9). A proof can be found in Appendix F.

5.5 INTEGRAL SOLUTIONS OF MAXWELL'S EQUATIONS IN TERMS OF THE SOURCES

Since Maxwell's equations are linear in free space, no loss in generality results from
assuming that time variations are harmonic and represented by $e^{j\omega t}$. The angular
frequency ω may be a component of a Fourier series or a Fourier integral, thus bringing
arbitrary time dependence within the purview of the following analysis. Accordingly, if f(x,y,z,t) is any field component or source component, it will be assumed that
$f(x,y,z,t) = f(x,y,z)e^{j\omega t}$. In this case, Maxwell's equations can be written

$$\nabla \times \mathbf{E} = -j\omega\mathbf{B} \tag{5.30}$$

$$\nabla \times \mathbf{H}_0 = \boldsymbol{\iota} + j\omega\mathbf{D}_0 \tag{5.31}$$

$$\nabla \cdot \mathbf{E} = \frac{\rho}{\epsilon_0} \tag{5.32}$$

$$\nabla \cdot \mathbf{B} = 0 \tag{5.33}$$

and the continuity equation becomes

$$\nabla \cdot \boldsymbol{\iota} = -j\omega\rho \tag{5.34}$$

In all the above equations, E = E(x,y,z), etc., the time-dependence being suppressed.
E, B, etc., are now complex vectors. (See Mathematical Supplement, Section V.23.)
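The replacement of time differentiation by multiplication with jω can be illustrated with a scalar example. The sketch below is not from the original text. It assumes the common convention that the physical quantity is the real part of the complex amplitude times e^{jωt}, and it uses an arbitrary amplitude and frequency.

```python
import cmath
import math


def instantaneous(phasor, omega, t):
    """Physical value Re{phasor * e^{j omega t}} (real-part convention assumed)."""
    return (phasor * cmath.exp(1j * omega * t)).real


omega = 2.0 * math.pi * 1.0e6        # angular frequency, rad/sec (arbitrary)
F = 3.0 * cmath.exp(1j * 0.7)        # complex amplitude (arbitrary)
t0 = 0.3e-6                          # instant of evaluation, sec
h = 1.0e-12                          # step for the central difference, sec

# Time derivative of the physical quantity, computed numerically.
numeric = (instantaneous(F, omega, t0 + h)
           - instantaneous(F, omega, t0 - h)) / (2.0 * h)

# Same derivative obtained by multiplying the complex amplitude by j*omega.
from_phasor = instantaneous(1j * omega * F, omega, t0)

print("d/dt by finite difference:", numeric)
print("d/dt via j*omega:         ", from_phasor)
```

This is the step that turns the time derivatives in (5.25) and (5.29) into the factors jω appearing in (5.30), (5.31) and (5.34).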
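Additionally, if the curl of (5.21) and of (5.24) is taken, and if then (5.21) and (5.24)
are used to eliminate either E or B, one obtains the vector wave equations

$$\nabla \times \nabla \times \mathbf{E} + \frac{1}{c^2}\frac{\partial^2 \mathbf{E}}{\partial t^2} = -\mu_0\frac{\partial \boldsymbol{\iota}}{\partial t} \tag{5.35a}$$

$$\nabla \times \nabla \times \mathbf{B} + \frac{1}{c^2}\frac{\partial^2 \mathbf{B}}{\partial t^2} = \mu_0\,\nabla \times \boldsymbol{\iota} \tag{5.35b}$$

For an $e^{j\omega t}$ time-dependence, these become

$$\nabla \times \nabla \times \mathbf{E} - k^2\mathbf{E} = -j\omega\mu_0\boldsymbol{\iota} \tag{5.36a}$$

$$\nabla \times \nabla \times \mathbf{B} - k^2\mathbf{B} = \mu_0\,\nabla \times \boldsymbol{\iota} \tag{5.36b}$$

in which $k = \omega\sqrt{\mu_0\epsilon_0}$ is called the propagation constant, for reasons which will emerge
shortly. These last two equations can be integrated through use of a technique first
introduced by Stratton and Chu, and based on a vector formulation of Green's second
identity. 11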
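As a quick numerical illustration of the propagation constant, the sketch below evaluates k = ω√(μ0ε0) for one frequency. It is not part of the original text: the SI constants are assumed modern values, the 1 GHz frequency is arbitrary, and the wavelength relation k = 2π/λ is the standard free-space result rather than something stated in this section.

```python
import math

# Assumed SI constants of free space.
EPS0 = 8.8541878128e-12      # F/m
MU0 = 4.0e-7 * math.pi       # H/m (classical value)
C = 299_792_458.0            # speed of light, m/s


def propagation_constant(frequency_hz):
    """k = omega * sqrt(mu0 * eps0), in rad/m."""
    omega = 2.0 * math.pi * frequency_hz
    return omega * math.sqrt(MU0 * EPS0)


frequency = 1.0e9                         # 1 GHz, arbitrary choice
k = propagation_constant(frequency)
wavelength = 2.0 * math.pi / k            # free-space wavelength, m

print(f"k          = {k:.4f} rad/m")
print(f"wavelength = {wavelength:.5f} m")
```

Since √(μ0ε0) = 1/c, the same number follows from k = ω/c.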

FIGURE 5.2 Geometry for the vector Green's theorem.

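Consider a region V, bounded by the surfaces S_1 ... S_N as shown in Figure 5.2.
Let F and G be two vector functions of position in this region, each continuous and
having continuous first and second derivatives everywhere within V and on the
boundary surfaces S_i. Using the vector identity

$$\nabla \cdot [\mathbf{A} \times \mathbf{B}] = \mathbf{B} \cdot \nabla \times \mathbf{A} - \mathbf{A} \cdot \nabla \times \mathbf{B}$$

and letting A = F while B = ∇ × G, one obtains

$$\nabla \cdot [\mathbf{F} \times \nabla \times \mathbf{G}] = \nabla \times \mathbf{G} \cdot \nabla \times \mathbf{F} - \mathbf{F} \cdot \nabla \times \nabla \times \mathbf{G}$$

whereas, if A = G and B = ∇ × F, there results

$$\nabla \cdot [\mathbf{G} \times \nabla \times \mathbf{F}] = \nabla \times \mathbf{F} \cdot \nabla \times \mathbf{G} - \mathbf{G} \cdot \nabla \times \nabla \times \mathbf{F}$$

If the difference in these results is integrated over the volume V one obtains

$$\int_V (\mathbf{F} \cdot \nabla \times \nabla \times \mathbf{G} - \mathbf{G} \cdot \nabla \times \nabla \times \mathbf{F})\,dV = \int_V \nabla \cdot [\mathbf{G} \times \nabla \times \mathbf{F} - \mathbf{F} \times \nabla \times \mathbf{G}]\,dV$$

If one lets 1_n be the inward-drawn normal from any boundary surface S_i into the volume
V, use of the divergence theorem gives

$$\int_V (\mathbf{F} \cdot \nabla \times \nabla \times \mathbf{G} - \mathbf{G} \cdot \nabla \times \nabla \times \mathbf{F})\,dV = -\int_{S_1 \cdots S_N} (\mathbf{G} \times \nabla \times \mathbf{F} - \mathbf{F} \times \nabla \times \mathbf{G}) \cdot \mathbf{1}_n\,dS \tag{5.37}$$

This result is the vector Green's theorem. The minus sign arises because 1_n is drawn into V rather than out of it.

11 J. A. Stratton and L. J. Chu, "Diffraction Theory of Electromagnetic Waves," Phys Rev, 56, 99-107;
July 1939. Also, see the excellent treatment in S. Silver, Microwave Antenna Theory and Design,
MIT Rad. Lab. Series, vol. 12, pp. 80-89, McGraw-Hill Book Company, New York, 1949. The present
development differs from Silver's principally in the nonuse of fictitious magnetic currents and charges.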


Suppose that the E and B of (5.36a) and (5.36b) meet the conditions required of the
function F in V and let G be the vector Green's function defined by

$$\mathbf{G} = \frac{e^{-jkr}}{r}\,\mathbf{a} = \psi\mathbf{a} \tag{5.38}$$

in which a is an arbitrary constant vector and r is the distance from an arbitrary point
P(x,y,z) within V to any point (ξ,η,ζ) within V or on S_i.
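The usefulness of this choice rests on the fact that the scalar factor ψ = e^{-jkr}/r satisfies the homogeneous Helmholtz equation, ∇²ψ + k²ψ = 0, everywhere except at r = 0. That property is a standard result rather than something stated in the passage above, and the sketch below checks it by finite differences. Differentiation is taken with respect to the coordinates (ξ,η,ζ); the points and the value of k are arbitrary, and the function names are hypothetical.

```python
import cmath
import math


def psi(source, field_point, k):
    """Scalar Green's function e^{-jkr}/r, with r the distance between the two points."""
    r = math.dist(source, field_point)
    return cmath.exp(-1j * k * r) / r


def laplacian_wrt_source(source, field_point, k, h=1.0e-3):
    """Second-order central-difference Laplacian with respect to the source coordinates."""
    center = psi(source, field_point, k)
    total = 0.0
    for axis in range(3):
        plus = list(source)
        minus = list(source)
        plus[axis] += h
        minus[axis] -= h
        total += (psi(plus, field_point, k) - 2.0 * center
                  + psi(minus, field_point, k)) / h ** 2
    return total


k = 2.0 * math.pi                 # propagation constant for a 1 m wavelength (arbitrary)
P = (0.0, 0.0, 0.0)               # field point P(x, y, z)
Q = (0.9, -0.7, 1.1)              # point (xi, eta, zeta), kept well away from P

residual = laplacian_wrt_source(Q, P, k) + k ** 2 * psi(Q, P, k)
relative_residual = abs(residual) / abs(k ** 2 * psi(Q, P, k))

print("relative residual of the Helmholtz equation:", relative_residual)
```

The residual is at the level of the finite-difference truncation error. The check says nothing about the point P itself, where ψ is singular.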
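G as defined by (5.38) satisfies the conditions of the vector Green's theorem everywhere except at P. Therefore, one can surround P by a sphere Σ of radius δ and consider that portion V' of V which is bounded by the surfaces S_1 ... S_N, Σ. Letting
E = F, one obtains

$$\int_{V'} (\mathbf{E} \cdot \nabla_S \times \nabla_S \times \psi\mathbf{a} - \psi\mathbf{a} \cdot \nabla_S \times \nabla_S \times \mathbf{E})\,dV = -\int_{S_1 \cdots S_N,\Sigma} (\psi\mathbf{a} \times \nabla_S \times \mathbf{E} - \mathbf{E} \times \nabla_S \times \psi\mathbf{a}) \cdot \mathbf{1}_n\,dS \tag{5.39}$$

in which, since ψ is a function of (x,y,z) as well as (ξ,η,ζ), it is necessary to distinguish
between differentiation with respect to these two sets of variables by subscripting the
operators so that

$$\nabla_S = \mathbf{1}_x\frac{\partial}{\partial \xi} + \mathbf{1}_y\frac{\partial}{\partial \eta} + \mathbf{1}_z\frac{\partial}{\partial \zeta} \qquad \text{and} \qquad \nabla_P = \mathbf{1}_x\frac{\partial}{\partial x} + \mathbf{1}_y\frac{\partial}{\partial y} + \mathbf{1}_z\frac{\partial}{\partial z}$$

It is shown in Appendix G that both sides of this equation may be transformed so
that a is brought outside the integral signs, the result being

$$\mathbf{a} \cdot \int_{V'} \Big(j\omega\mu_0\psi\boldsymbol{\iota} - \frac{\rho}{\epsilon_0}\nabla_S\psi\Big)\,dV = \mathbf{a} \cdot \int_{S_1 \cdots S_N,\Sigma} (\mathbf{1}_n \cdot \mathbf{E})\nabla_S\psi\,dS - \mathbf{a} \cdot \int_{S_1 \cdots S_N,\Sigma} [j\omega\psi(\mathbf{1}_n \times \mathbf{B}) - (\mathbf{1}_n \times \mathbf{E}) \times \nabla_S\psi]\,dS \tag{5.40}$$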


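Since a is arbitrary, it follows that the integrals on the two sides of (5.40) may be
equated, yielding

$$\begin{aligned}
\int_{V'} \Big(j\omega\mu_0\psi\boldsymbol{\iota} - \frac{\rho}{\epsilon_0}\nabla_S\psi\Big)\,dV & - \int_{S_1 \cdots S_N} [(\mathbf{1}_n \cdot \mathbf{E})\nabla_S\psi + (\mathbf{1}_n \times \mathbf{E}) \times \nabla_S\psi - j\omega\psi(\mathbf{1}_n \times \mathbf{B})]\,dS \\
& = \int_{\Sigma} [(\mathbf{1}_n \cdot \mathbf{E})\nabla_S\psi + (\mathbf{1}_n \times \mathbf{E}) \times \nabla_S\psi - j\omega\psi(\mathbf{1}_n \times \mathbf{B})]\,dS
\end{aligned} \tag{5.41}$$

where, for convenience, the surface integral over the sphere Σ is displayed separately.
It is further shown in Appendix G that the right side of (5.41) reaches the limit
$-4\pi\mathbf{E}(x,y,z)$, with (x,y,z) the coordinates of the point P, as Σ shrinks to zero. Therefore
the limiting value of (5.41) is

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_V \Big(\frac{\rho}{\epsilon_0}\nabla_S\psi - j\omega\mu_0\psi\boldsymbol{\iota}\Big)\,dV + \frac{1}{4\pi}\int_{S_1 \cdots S_N} [(\mathbf{1}_n \cdot \mathbf{E})\nabla_S\psi + (\mathbf{1}_n \times \mathbf{E}) \times \nabla_S\psi - j\omega\psi(\mathbf{1}_n \times \mathbf{B})]\,dS \tag{5.42}$$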

This important formula gives .E at any point in the volume V in terms of the sources
within V plus the field values on the surfaces which bound ~T.
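One may proceed in a similar fashion, by letting B = F, and deduce a companion formula for B(x,y,z). Alternatively, the curl of (5.42) may be taken and then (5.30) employed to obtain B. By either procedure, one finds that

$$\mathbf{B}(x,y,z) = \frac{1}{4\pi}\int_V\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\nabla_S\psi\,dV + \frac{1}{4\pi}\int_{S_1\cdots S_N}\left[\frac{j\omega\psi}{c^2}(\mathbf{1}_n\times\mathbf{E}) + (\mathbf{1}_n\times\mathbf{B})\times\nabla_S\psi + (\mathbf{1}_n\cdot\mathbf{B})\nabla_S\psi\right]dS \tag{5.43}$$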

Inspection of the volume integrals in (5.42) and (5.43) reveals that B is given in terms of the current sources only, whereas the expression for E contains terms involving both the currents and the charges. However, the continuity equation (5.34) may be used to give

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_V\frac{1}{j\omega\epsilon_0}\left[(\boldsymbol{\iota}\cdot\nabla_S)\nabla_S\psi + k^2\psi\,\boldsymbol{\iota}\right]dV + \frac{1}{4\pi}\int_{S_1\cdots S_N}\left[(\mathbf{1}_n\cdot\mathbf{E})\nabla_S\psi + (\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi - j\omega\psi(\mathbf{1}_n\times\mathbf{B})\right]dS \tag{5.44}$$

Equations (5.43) and (5.44) constitute a solution of Maxwell's field equations in terms of the current sources within V and the field values over the bounding surfaces $S_i$.

5.6

CONDITIONS AT INFINITY

Let it now be assumed that the surface $S_N$ of Figure 5.2 becomes a large sphere of radius $\mathscr{R}$ centered at the point P. $\mathscr{R}$ initially will be taken great enough to enclose all the sources $\boldsymbol{\iota}$ and ρ of the fields; ultimately $\mathscr{R}$ will be permitted to become infinitely large. Under these circumstances, consider the contributions to (5.43) and (5.44) of the surface integrals over $S_N$.
If $\mathbf{1}_{\mathscr{R}}$ is a unit vector directed outward along the radius of the spherical surface $S_N$, so that $\mathbf{1}_{\mathscr{R}} = -\mathbf{1}_n$, one may write for the appropriate part of (5.43)

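$$\begin{aligned}
\frac{1}{4\pi}\int_{S_N}&\left[\frac{j\omega\psi}{c^2}(\mathbf{1}_n\times\mathbf{E}) + (\mathbf{1}_n\times\mathbf{B})\times\nabla_S\psi + (\mathbf{1}_n\cdot\mathbf{B})\nabla_S\psi\right]dS\\
&= \frac{1}{4\pi}\int_{S_N}\left\{-\frac{jk}{c}(\mathbf{1}_{\mathscr{R}}\times\mathbf{E}) + \left(jk + \frac{1}{\mathscr{R}}\right)\left[(\mathbf{1}_{\mathscr{R}}\times\mathbf{B})\times\mathbf{1}_{\mathscr{R}} + (\mathbf{1}_{\mathscr{R}}\cdot\mathbf{B})\mathbf{1}_{\mathscr{R}}\right]\right\}\frac{e^{-jk\mathscr{R}}}{\mathscr{R}}\,dS\\
&= \frac{1}{4\pi}\int_{S_N}\left\{-\frac{jk}{c}\left[(\mathbf{1}_{\mathscr{R}}\times\mathbf{E}) - c\mathbf{B}\right] + \frac{\mathbf{B}}{\mathscr{R}}\right\}\frac{e^{-jk\mathscr{R}}}{\mathscr{R}}\,dS
\end{aligned}\tag{5.45}$$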

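Similarly, the appropriate part of (5.44) becomes

$$\frac{1}{4\pi}\int_{S_N}\left[(\mathbf{1}_n\cdot\mathbf{E})\nabla_S\psi + (\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi - j\omega\psi(\mathbf{1}_n\times\mathbf{B})\right]dS = \frac{1}{4\pi}\int_{S_N}\left\{j\omega\left[(\mathbf{1}_{\mathscr{R}}\times\mathbf{B}) + \frac{\mathbf{E}}{c}\right] + \frac{\mathbf{E}}{\mathscr{R}}\right\}\frac{e^{-jk\mathscr{R}}}{\mathscr{R}}\,dS \tag{5.46}$$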

If $\mathscr{R}\to\infty$, since the surface of the sphere increases as $\mathscr{R}^2$, the surface integral (5.45) will vanish if

$$\lim_{\mathscr{R}\to\infty}\mathscr{R}\,\mathbf{B}\ \text{is finite} \tag{5.47}$$

$$\lim_{\mathscr{R}\to\infty}\mathscr{R}\left[(\mathbf{1}_{\mathscr{R}}\times\mathbf{E}) - c\mathbf{B}\right] = 0 \tag{5.48}$$

Similarly, the surface integral (5.46) will vanish if

$$\lim_{\mathscr{R}\to\infty}\mathscr{R}\,\mathbf{E}\ \text{is finite} \tag{5.49}$$

$$\lim_{\mathscr{R}\to\infty}\mathscr{R}\left[(\mathbf{1}_{\mathscr{R}}\times\mathbf{B}) + \frac{\mathbf{E}}{c}\right] = 0 \tag{5.50}$$

The relations (5.47) through (5.50) are known as the Sommerfeld conditions at infinity. Expressions (5.47) and (5.49) are commonly called the finiteness conditions (Endlichkeit Bedingungen) and expressions (5.48) and (5.50) are customarily given the name of radiation conditions (Ausstrahlung Bedingungen). The finiteness conditions require that E and B diminish as $\mathscr{R}^{-1}$ while the radiation conditions require that they bear the relation to each other found in wave propagation in regions remote from the sources. (See Section 5.7.)
It is now possible to demonstrate the extremely important result that real sources, confined to a finite volume, always give rise to fields which satisfy the Sommerfeld conditions. To see this, consider Equations (5.43) and (5.44) when the only boundary surface is the large sphere $S_N$ whose radius will be permitted to become infinitely large. It shall be assumed that the real sources $\boldsymbol{\iota}$ and ρ are finite and confined to a finite volume $V_0$. With the surface $S_N$ becoming an infinite sphere, the volume V in (5.43) and (5.44) also becomes infinite, but no convergence difficulties arise with the volume integrals because the sources are all within $V_0$.
Borrowing from the results of Section 5.9, the fields over $S_N$ will consist of outgoing waves whose power density is $\mathbf{E}\times\mathbf{H}_0$ watts/m². Since the surface area of $S_N$ is increasing as $\mathscr{R}^2$, if there is even the minutest loss in V, the law of conservation of energy requires that E and $\mathbf{H}_0$ diminish more rapidly than $\mathscr{R}^{-1}$, and thus conditions (5.47)-(5.50) are satisfied. One can then conclude that in an unbounded region, B(x,y,z) and E(x,y,z) are given solely by the volume integrals which appear in (5.43) and (5.44).
A check on this conclusion for the limiting case of no loss in V may be obtained
through an ordering of the terms which comprise the volume integrals. To see this,
select as origin an arbitrary point in V o and let r be the vector drawn from the origin
to the field point P(x,y,z); the vector drawn from the source element to P will be labeled $\boldsymbol{\mathscr{r}}$. Then

(jk+-1) e-~

jkl

Let>
- all
-+- all)
-

(L8

~ ao'

~ sin 0' ac/>'

in which spherical coordinates (~,O',c/>') centered at P have been employed and


~

1r = - -r
Performing the indicated differentiations, one obtains

(5.51 )
The functions ψ, $\nabla_S\psi$, and $(\boldsymbol{\iota}\cdot\nabla_S)\nabla_S\psi$ are all seen to involve polynomials in the variable $\mathscr{r}^{-1}$. Retain for the moment only first-order terms; then substitution in (5.43) and (5.44) gives

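$$\mathbf{B}(x,y,z) = \frac{1}{4\pi}\int_V jk\,\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\mathbf{1}_{\mathscr{r}}\,\frac{e^{-jk\mathscr{r}}}{\mathscr{r}}\,dV \tag{5.52}$$

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_V\frac{1}{j\omega\epsilon_0}\left[-k^2(\boldsymbol{\iota}\cdot\mathbf{1}_{\mathscr{r}})\mathbf{1}_{\mathscr{r}} + k^2\boldsymbol{\iota}\right]\frac{e^{-jk\mathscr{r}}}{\mathscr{r}}\,dV \tag{5.53}$$

But

$$\mathscr{r} = \left[(x-\xi)^2 + (y-\eta)^2 + (z-\zeta)^2\right]^{1/2} = \left[(r\sin\theta\cos\phi - \xi)^2 + (r\sin\theta\sin\phi - \eta)^2 + (r\cos\theta - \zeta)^2\right]^{1/2}$$

in which now conventional spherical coordinates (r,θ,φ) centered at the origin have been introduced. As P becomes remote, $\mathscr{r}$ can be expressed in the rapidly converging series

$$\mathscr{r} = r - (\xi\sin\theta\cos\phi + \eta\sin\theta\sin\phi + \zeta\cos\theta) + O(r^{-1})$$

Similarly,

$$\mathscr{r}^{-1} = r^{-1} + O(r^{-2}) \qquad \lim_{r\to\infty}\mathbf{1}_{\mathscr{r}} = \mathbf{1}_r \tag{5.54}$$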
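The series leading to (5.54) may be checked numerically. The following Python sketch is illustrative only and is not part of the original development: the function names, the sample source point, and the observation angles are assumptions chosen for the check. It compares the exact distance from a source point (ξ,η,ζ) to the field point with the two-term approximation, and the error is expected to fall off as 1/r.

```python
import math


def field_point(r, theta, phi):
    """Rectangular coordinates of the field point P given (r, theta, phi)."""
    return (
        r * math.sin(theta) * math.cos(phi),
        r * math.sin(theta) * math.sin(phi),
        r * math.cos(theta),
    )


def exact_distance(r, theta, phi, source):
    """Exact distance from the source point (xi, eta, zeta) to P."""
    return math.dist(field_point(r, theta, phi), source)


def directional_position(theta, phi, source):
    """The combination xi sin(theta) cos(phi) + eta sin(theta) sin(phi) + zeta cos(theta)."""
    xi, eta, zeta = source
    return (
        xi * math.sin(theta) * math.cos(phi)
        + eta * math.sin(theta) * math.sin(phi)
        + zeta * math.cos(theta)
    )


def far_field_distance(r, theta, phi, source):
    """Two-term far-field approximation: r minus the directional position."""
    return r - directional_position(theta, phi, source)


if __name__ == "__main__":
    source = (0.3, -0.2, 0.5)  # hypothetical source point near the origin
    for r in (10.0, 100.0, 1000.0):
        error = exact_distance(r, 1.0, 0.5, source) - far_field_distance(r, 1.0, 0.5, source)
        print(f"r = {r:7.1f}   error = {error:.3e}   r * error = {r * error:.4f}")
```

The product r × error settles toward a constant as r grows, which is the behavior implied by a remainder of order $r^{-1}$.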

and thus as r becomes very large, Equations (5.52) and (5.53) may be written

$$\mathbf{B}(x,y,z) = \frac{jk\,e^{-jkr}}{4\pi r}\int_V\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\mathbf{1}_r\,e^{jk\text{л}}\,dV + O(r^{-2}) \tag{5.55}$$

$$\mathbf{E}(x,y,z) = \frac{j\omega\,e^{-jkr}}{4\pi r}\int_V\mathbf{1}_r\times\left(\mathbf{1}_r\times\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\right)e^{jk\text{л}}\,dV + O(r^{-2}) \tag{5.56}$$

in which л = ξ sin θ cos φ + η sin θ sin φ + ζ cos θ.†
If one were to go back and include all the terms in the expressions for $\nabla_S\psi$ and $(\boldsymbol{\iota}\cdot\nabla_S)\nabla_S\psi$, they would alter the results (5.55) and (5.56) only at the level of $O(r^{-2})$. Therefore these two expressions for B and E may be taken as exact.
In considering the expressions (5.55) and (5.56) with respect to the Sommerfeld conditions, one notices that the terms of $O(r^{-2})$ and below satisfy all four conditions and thus concern may be focused on the explicit first-order terms. But
$$\lim_{r\to\infty} r\mathbf{B} = \frac{jk}{4\pi}\lim_{r\to\infty} e^{-jkr}\int_V\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\mathbf{1}_r\,e^{jk\text{л}}\,dV \tag{5.57}$$

and, since the volume integral is a function of the source coordinates and the angular direction to P, but not of r, this limit is finite. A similar argument establishes that $\lim_{r\to\infty} r\mathbf{E}$ is also finite and thus both finiteness conditions are satisfied.

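Further,

$$\lim_{r\to\infty} r\left[(\mathbf{1}_r\times\mathbf{B}) + \frac{\mathbf{E}}{c}\right] = \lim_{r\to\infty}\frac{e^{-jkr}}{4\pi}\int_V\left[jk\,\mathbf{1}_r\times\left(\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\mathbf{1}_r\right) + \frac{j\omega}{c}\,\mathbf{1}_r\times\left(\mathbf{1}_r\times\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\right)\right]e^{jk\text{л}}\,dV \tag{5.58}$$

The integrand in (5.58) is identically zero and therefore condition (5.50) is satisfied. In like manner, the condition (5.48) is found to be satisfied also. This supports the argument that any system of real sources confined to a finite volume $V_0$ gives rise to an electromagnetic field at infinity which satisfies Sommerfeld's conditions, that the surface integral over an infinite sphere $S_N$ gives a null contribution, and that in an unbounded region the electromagnetic field at any point P, near or remote, is given precisely by

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_V\frac{1}{j\omega\epsilon_0}\left[(\boldsymbol{\iota}\cdot\nabla_S)\nabla_S\psi + k^2\psi\,\boldsymbol{\iota}\right]dV \tag{5.59}$$

$$\mathbf{B}(x,y,z) = \frac{1}{4\pi}\int_V\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\nabla_S\psi\,dV \tag{5.60}$$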

Suppose now that parts of the volume $V_0$ are excluded from V by the finite, regular closed surfaces $S_1, S_2, \ldots$. These surfaces may exclude some of the sources from V or not, but their presence does not alter the results at infinity. However, now the more general expressions (5.43) and (5.44) apply, and one may conclude by saying that (5.43) and (5.44) are valid even if the volume V is infinite, so long as real sources in a finite volume are assumed. If the volume V is infinite, the surface at infinity need not be considered.

† This symbol is the Russian lower-case "ell" and may be called the directional position of the source point.

5.7

THE POTENTIAL FUNCTIONS

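If the volume V is totally unbounded, Equations (5.42) and (5.43) give

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_V\left(\frac{\rho}{\epsilon_0}\nabla_S\psi - j\omega\psi\,\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\right)dV \tag{5.61}$$

$$\mathbf{B}(x,y,z) = \frac{1}{4\pi}\int_V\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\nabla_S\psi\,dV \tag{5.62}$$

Since $\nabla_P\psi = -\nabla_S\psi$, and since $\boldsymbol{\iota}$ and the limits of integration are functions of (ξ,η,ζ), but not of (x,y,z), these integrals may be written

$$\mathbf{E} = -\nabla_P\int_V\frac{\rho\,\psi}{4\pi\epsilon_0}\,dV - j\omega\int_V\frac{\boldsymbol{\iota}\,\psi}{4\pi\mu_0^{-1}}\,dV \tag{5.63}$$

$$\mathbf{B} = \nabla_P\times\int_V\frac{\boldsymbol{\iota}\,\psi}{4\pi\mu_0^{-1}}\,dV \tag{5.64}$$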

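Therefore it is convenient to introduce two potential functions by the defining relations

$$\mathbf{A}(x,y,z,t) = \int_V\frac{\boldsymbol{\iota}(\xi,\eta,\zeta)\,e^{j(\omega t - k\mathscr{r})}}{4\pi\mu_0^{-1}\,\mathscr{r}}\,dV \tag{5.65}$$

$$\Phi(x,y,z,t) = \int_V\frac{\rho(\xi,\eta,\zeta)\,e^{j(\omega t - k\mathscr{r})}}{4\pi\epsilon_0\,\mathscr{r}}\,dV \tag{5.66}$$

in which the time factor $e^{j\omega t}$ has been reinserted and $e^{-jk\mathscr{r}}/\mathscr{r}$ has been substituted for ψ. A is known as the magnetic vector potential function and Φ is known as the electric scalar potential function.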
Since k = ω/c, one may write

$$\exp[j(\omega t - k\mathscr{r})] = \exp[j\omega(t - \mathscr{r}/c)]$$

Therefore each current element in the integrand of (5.65), and each charge element in the integrand of (5.66), makes a contribution to the potential at (x,y,z) at time t which is in accord with the value it had at the earlier time $t - \mathscr{r}/c$. But this is consistent with the idea that it takes a time $\mathscr{r}/c$ for a disturbance to travel from (ξ,η,ζ) to (x,y,z). For this reason, (5.65) and (5.66) are often called the retarded potentials.
From (5.63) and (5.64),

$$\mathbf{E} = -\nabla\Phi - \dot{\mathbf{A}} \tag{5.67}$$

$$\mathbf{B} = \nabla\times\mathbf{A} \tag{5.68}$$

in which the subscripts on the del operators have been dropped, since A and Φ are functions only of (x,y,z) and not also of (ξ,η,ζ).

The differential equations satisfied by A and Φ may be deduced by taking the divergence of (5.67) and the curl of (5.68), which leads to

$$\nabla^2\mathbf{A} - \frac{1}{c^2}\ddot{\mathbf{A}} = -\frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{5.69}$$

$$\nabla^2\Phi - \frac{1}{c^2}\ddot{\Phi} = -\frac{\rho}{\epsilon_0} \tag{5.70}$$

these results being valid whether $\boldsymbol{\iota}$ and ρ are harmonic functions of time, or more general time functions representable by Fourier integrals. A proof may be found in Appendix H.
At points away from the sources, it is unnecessary to solve for both Φ and A (unless static source distributions are involved). One need find only A, then use (5.68) to obtain B, and then use (5.24) to deduce E. The latter may be rewritten

$$\mathbf{E} = \frac{c}{jk}\,\nabla\times\nabla\times\mathbf{A} \tag{5.71}$$
It is interesting to observe that for time-independent sources (ω = k = 0), Equations (5.66) and (5.67) reduce to the electrostatic relations encountered in Chapter 3, whereas Equations (5.65) and (5.68) reduce to the magnetostatic relations developed in Chapter 4.
For the more general time-harmonic case, if all the sources are confined to a finite volume $V_0$, and if $\mathscr{r}$ from any point in $V_0$ to (x,y,z) is much bigger than the maximum dimension of $V_0$, then (5.65) may be approximated by replacing $\mathscr{r}$ with r in the denominator of the integrand, and by replacing $\mathscr{r}$ with (5.54) in the phase factor, where r is drawn from an origin in $V_0$ to (x,y,z). This gives the far-field approximation

$$\mathbf{A}(x,y,z,t) = \frac{e^{j(\omega t - kr)}}{4\pi\mu_0^{-1}\,r}\int_V\boldsymbol{\iota}(\xi,\eta,\zeta)\,e^{jk\text{л}}\,dV \tag{5.72}$$

in which, as before, л = ξ sin θ cos φ + η sin θ sin φ + ζ cos θ. To this same approximation, using (5.68) and (5.71), one obtains

$$\mathbf{B} = -jk\,\mathbf{1}_r\times\mathbf{A} \tag{5.73}$$

$$\mathbf{E} = -c\,\mathbf{1}_r\times\mathbf{B} = -j\omega\mathbf{A}_T \tag{5.74}$$

with $\mathbf{A}_T$ that part of A which is transverse to the radial direction $\mathbf{1}_r$.


A study of Equations (5.72) through (5.74) shows that the fields are in the form of an outgoing spherical wave

$$\frac{e^{j(\omega t - kr)}}{4\pi\mu_0^{-1}\,r} \tag{5.75}$$

which diminishes as the reciprocal of the distance, and that this wave is modified by the directional weighting function

$$\boldsymbol{\mathscr{a}}(\theta,\phi) = \int_V\boldsymbol{\iota}(\xi,\eta,\zeta)\,e^{jk\text{л}}\,dV \tag{5.76}$$

For this reason $\boldsymbol{\mathscr{a}}(\theta,\phi)$ may be called the field pattern, and is closely related to the power radiation pattern of the system of sources, as will become evident shortly.

From the form of (5.75), it is apparent that the wave is propagating in the radial direction at such a speed that a point of constant phase satisfies the relation

$$\omega t - kr = \text{constant}$$

which gives

$$v_{ph} = \frac{dr}{dt} = \frac{\omega}{k} = c$$

as the phase velocity of the wave.
Further, (5.73) and (5.74) indicate that both B and E are transverse to the direction of propagation and that in the transverse plane they are perpendicular to each other, their magnitudes being in the ratio E/B = c.
These properties are common to all time-varying electromagnetic fields in free space at points remote from the sources.
EXAMPLE

5.2

A simple source of great practical importance is the half-wave dipole. It may be assumed to consist of a filamentary current disposed along the Z axis, as shown in the figure, with an amplitude distribution which is spatially sinusoidal.

[Figure: half-wave dipole along the Z axis, with current distribution I(ζ) = I_m cos kζ]

Thus one may describe the current distribution by the equation

$$I(\zeta,t) = I_m\cos k\zeta\;e^{j\omega t}$$

in which $I_m$ is the amplitude of the current at the central feeding terminals, which are assumed to be negligibly separated.
Use of (5.76) gives

$$\boldsymbol{\mathscr{a}}(\theta) = \mathbf{1}_z\int_{-\lambda/4}^{\lambda/4} I_m\cos k\zeta\;e^{jk\zeta\cos\theta}\,d\zeta = \mathbf{1}_z\,\frac{2I_m}{k}\,\frac{\cos(\pi/2\,\cos\theta)}{\sin^2\theta}$$

so that

$$\mathbf{A}_T = -\mathbf{1}_\theta\sin\theta\,A_z = -\mathbf{1}_\theta\,\frac{I_m e^{j(\omega t - kr)}}{2\pi\mu_0^{-1}\,kr}\,\frac{\cos(\pi/2\,\cos\theta)}{\sin\theta}$$

from which

$$\mathbf{E} = \mathbf{1}_\theta\,\frac{j\omega I_m e^{j(\omega t - kr)}}{2\pi\mu_0^{-1}\,kr}\left[\frac{\cos(\pi/2\,\cos\theta)}{\sin\theta}\right] = \mathbf{1}_\theta\,\frac{jcI_m}{2\pi\mu_0^{-1}}\,\frac{e^{j(\omega t - kr)}}{r}\left[\frac{\cos(\pi/2\,\cos\theta)}{\sin\theta}\right]$$

$$\mathbf{B} = \frac{1}{c}\,\mathbf{1}_r\times\mathbf{E} = \mathbf{1}_\phi\,\frac{jI_m}{2\pi\mu_0^{-1}}\,\frac{e^{j(\omega t - kr)}}{r}\left[\frac{\cos(\pi/2\,\cos\theta)}{\sin\theta}\right]$$

The directional weighting function [cos (π/2 cos θ)]/sin θ is plotted in polar form in the second figure, for a half-plane φ = constant. The three-dimensional field pattern may be obtained by rotating this plot around the Z axis, and bears some resemblance to a torus.

[Figure: polar plot of cos (π/2 cos θ)/sin θ versus θ, with θ measured from the Z axis]
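The closed form quoted for the dipole pattern integral can be confirmed by direct numerical integration. The sketch below is an illustration under stated assumptions, not part of the original example: the substitution x = kζ is used so that the integral is taken in units of $I_m/k$, and the function names and the midpoint rule are choices made here.

```python
import cmath
import math


def pattern_integral(theta, n=2000):
    """Midpoint-rule estimate of the integral of cos(x) exp(j x cos(theta))
    for x from -pi/2 to pi/2, i.e. the dipole integral in units of I_m / k."""
    u = math.cos(theta)
    h = math.pi / n
    total = 0j
    for i in range(n):
        x = -math.pi / 2 + (i + 0.5) * h
        total += math.cos(x) * cmath.exp(1j * x * u)
    return total * h


def closed_form(theta):
    """Closed form 2 cos((pi/2) cos(theta)) / sin(theta)**2 in the same units."""
    return 2.0 * math.cos(0.5 * math.pi * math.cos(theta)) / math.sin(theta) ** 2


if __name__ == "__main__":
    for theta in (0.3, 1.0, math.pi / 2, 2.5):
        numeric = pattern_integral(theta)
        print(f"theta = {theta:.3f}  numeric = {numeric.real:.6f}  closed form = {closed_form(theta):.6f}")
```

The imaginary part of the numerical result vanishes by symmetry, which is consistent with the real-valued closed form.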

5.8 MAGNETIC STORED ENERGY


With the aid of Faraday's emf law, it is now possible to derive a relation for the energy stored in a magnetic field. To this end, consider first a charge q which is part of a current system giving rise to an electromagnetic field E, B. If v(t) is the instantaneous velocity of the charge, in time dt it suffers a displacement v dt and experiences a force q(E + v × B). The work done on the charge during this displacement is

$$dW = q(\mathbf{E} + \mathbf{v}\times\mathbf{B})\cdot\mathbf{v}\,dt = q\mathbf{v}\cdot\mathbf{E}\,dt$$

Therefore the power being supplied to the charge by the field is

$$P = \frac{dW}{dt} = q\mathbf{v}\cdot\mathbf{E} \tag{5.77}$$

If, in place of q, one considers all the charge $\rho_1\,dV$ in a volume element dV which possesses the instantaneous velocity $\mathbf{v}_1(t)$, the power being supplied to this charge is

$$d^3P_1 = \rho_1\mathbf{v}_1\cdot\mathbf{E}\,dV = \boldsymbol{\iota}_1\cdot\mathbf{E}\,dV$$

Similarly, for all the charge $\rho_2\,dV$ in the same volume element dV, which has the different instantaneous velocity $\mathbf{v}_2(t)$, the power being supplied is $\boldsymbol{\iota}_2\cdot\mathbf{E}\,dV$. Upon superimposing the contributions for all the charges in dV, one obtains

$$d^3P = \boldsymbol{\iota}\cdot\mathbf{E}\,dV \tag{5.78}$$

in which $\boldsymbol{\iota} = \boldsymbol{\iota}_1 + \boldsymbol{\iota}_2 + \cdots$.
Next, consider a distribution of current density $\boldsymbol{\iota}(\xi,\eta,\zeta)$ which has established a steady magnetic field B(x,y,z). Let $\boldsymbol{\iota}'(\xi,\eta,\zeta,t)$ be an intermediate value of the current density as it is slowly raised from zero to its final value $\boldsymbol{\iota}$, and let B'(x,y,z,t) be the corresponding intermediate value of the magnetic field. As suggested by Figure 5.3, let B' · dS be a tube of flux through the point P(x,y,z) and let C be the contour of this tube.

FIGURE 5.3 Energy build-up in a magnetic field.

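If S' is an open surface with C as its sole boundary, then

$$\int_{S'}\boldsymbol{\iota}'\cdot d\mathbf{S}'$$

is the total current linked by C.
Let C' be the contour of one of the tubes of current $\boldsymbol{\iota}'\cdot d\mathbf{S}'$ which pierces S'. Then the tube of flux $\mathbf{B}'\cdot d\mathbf{S}$ induces an electric field along C' given by

$$\oint_{C'} d^2\mathbf{E}\cdot d\boldsymbol{\ell}' = -\dot{\mathbf{B}}'\cdot d\mathbf{S}$$

This electric field opposes the growth of the current $\boldsymbol{\iota}'\cdot d\mathbf{S}'$ and energy must be supplied by the current to the field at the rate

$$d^4P = -(\boldsymbol{\iota}'\cdot d\mathbf{S}')\oint_{C'} d^2\mathbf{E}\cdot d\boldsymbol{\ell}' = (\boldsymbol{\iota}'\cdot d\mathbf{S}')(\dot{\mathbf{B}}'\cdot d\mathbf{S})$$

which is a use of (5.78).
When all the tubes of current piercing S' are included

$$d^2P = -\int_{V'} d^2\mathbf{E}\cdot\boldsymbol{\iota}'\,dV' = \dot{\mathbf{B}}'\cdot d\mathbf{S}\int_{S'}\boldsymbol{\iota}'\cdot d\mathbf{S}'$$

in which V' is the region of current flow for all the tubes of current which pierce S'. Since the field is changing so slowly that $\dot{\mathbf{D}}_0'$ may be neglected in comparison to $\boldsymbol{\iota}'$,

$$\oint_C\mathbf{H}_0'\cdot d\boldsymbol{\ell} = \int_{S'}\boldsymbol{\iota}'\cdot d\mathbf{S}'$$

and it follows that

$$d^2P = \int_{\delta V}\mathbf{H}_0'\cdot\dot{\mathbf{B}}'\,dV$$

wherein δV is the volume of the tube of flux whose contour is C. Upon including all the tubes of flux, one obtains for the power being supplied to the entire field

$$P = \int_V\mathbf{H}_0'\cdot\dot{\mathbf{B}}'\,dV = \frac{d}{dt}\int_V\tfrac{1}{2}\mu_0^{-1}B'^2\,dV \tag{5.79}$$

with V the volume of all space.
If $W_m$ is the energy stored in the magnetic field, so that $P = dW_m/dt$, then

$$W_m = \tfrac{1}{2}\mu_0^{-1}\int_V B^2\,dV = \tfrac{1}{2}\int_V\mathbf{B}\cdot\mathbf{H}_0\,dV \tag{5.80}$$

in which B now has its final steady value. Equation (5.80) is a companion formula to (3.151), which gave the electrostatic stored energy for a steady electric field distribution.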
EXAMPLE

5.3

It was shown in Chapter 4, at the end of Example 4.10, that if two long concentric conducting tubes carry equal steady currents in opposite directions, the field between them is given by

$$B_\phi(r) = \frac{I}{2\pi\mu_0^{-1}\,r}$$

in which r ranges from a, the outer radius of the inner tube, to b, the inner radius of the outer tube. I is the total current in either tube.
The energy stored between tubes, per unit length, is

$$W_m = \tfrac{1}{2}\mu_0^{-1}\int_a^b\left(\frac{I}{2\pi\mu_0^{-1}\,r}\right)^2 2\pi r\,dr = \frac{I^2}{4\pi\mu_0^{-1}}\int_a^b\frac{dr}{r} = \frac{I^2}{4\pi\mu_0^{-1}}\ln\frac{b}{a}$$

As an example, if a = tin., b = t in., and I = 1 amp, the stored energy in the magnetic
field, per meter length of the t\VO concentric tubes, is 0.07 micro-joules.
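The stored-energy formula of this example is easily evaluated. The short Python sketch below is an illustration only; the function name is hypothetical, and the radius ratio b/a = 2 is an assumption made here because the radii are not legible in this copy and that ratio reproduces the quoted figure of about 0.07 microjoule per meter.

```python
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space in henries per meter


def stored_energy_per_length(current, a, b):
    """Magnetic energy per unit length (J/m) stored between two concentric
    tubes of radii a < b carrying equal and opposite currents."""
    return MU0 * current ** 2 / (4.0 * math.pi) * math.log(b / a)


if __name__ == "__main__":
    # Assumed radius ratio b/a = 2; only the ratio enters the result.
    energy = stored_energy_per_length(current=1.0, a=1.0, b=2.0)
    print(f"W_m = {energy * 1e6:.4f} microjoules per meter")
```

Since only ln(b/a) appears, the units chosen for a and b do not matter as long as they are the same.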

5.9

POYNTING'S THEOREM

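Consideration can next be given to the power balance in a time-varying electromagnetic field. Assume that there is a system of impressed sources $\boldsymbol{\iota}^i$ which causes an electromagnetic field $\mathbf{E}^i$, $\mathbf{B}^i$, and that in response to this impressed field there is an induced system† of currents $\boldsymbol{\iota}^r$ creating an additional field $\mathbf{E}^r$, $\mathbf{B}^r$. The total current density and field at any point is therefore

$$\boldsymbol{\iota} = \boldsymbol{\iota}^i + \boldsymbol{\iota}^r \qquad \mathbf{E} = \mathbf{E}^i + \mathbf{E}^r \qquad \mathbf{B} = \mathbf{B}^i + \mathbf{B}^r$$

In accordance with (5.78), the total field E reacts on the impressed source density $\boldsymbol{\iota}^i$ in such a way that, if power is being supplied to the field, it must be at the rate

$$d^3P = -\mathbf{E}\cdot\boldsymbol{\iota}^i\,dV$$

But from Maxwell's equations,

$$\boldsymbol{\iota}^i = \nabla\times\mathbf{H}_0 - \frac{\partial\mathbf{D}_0}{\partial t} - \boldsymbol{\iota}^r$$

so that

$$d^3P = \left[-\mathbf{E}\cdot\nabla\times\mathbf{H}_0 + \frac{\partial}{\partial t}\left(\tfrac{1}{2}\epsilon_0E^2\right) + \mathbf{E}\cdot\boldsymbol{\iota}^r\right]dV \tag{5.81}$$

Application of the vector identity (V.108) gives

$$\nabla\cdot(\mathbf{E}\times\mathbf{H}_0) = \mathbf{H}_0\cdot\nabla\times\mathbf{E} - \mathbf{E}\cdot\nabla\times\mathbf{H}_0$$

and this result, coupled with (5.25a) yields

$$-\mathbf{E}\cdot\nabla\times\mathbf{H}_0 = \nabla\cdot(\mathbf{E}\times\mathbf{H}_0) + \mathbf{H}_0\cdot\frac{\partial\mathbf{B}}{\partial t} \tag{5.82}$$

† The decomposition of the total current system into impressed and response current densities is arbitrary, but often forms a natural division. As an illustration, the currents which flow in the dipole of Example 5.2 may be considered as response currents, whereas the currents which flow in the generator and transmission line leading up to the terminals of the dipole may be taken as the impressed source system.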

Therefore (5.81) may be rewritten

$$d^3P = \left[\frac{\partial}{\partial t}\left(\tfrac{1}{2}\epsilon_0E^2 + \tfrac{1}{2}\mu_0^{-1}B^2\right) + \mathbf{E}\cdot\boldsymbol{\iota}^r + \nabla\cdot(\mathbf{E}\times\mathbf{H}_0)\right]dV \tag{5.83}$$

This result gives the power balance in a volume element dV. The left side of (5.83) is the instantaneous power being supplied by the sources to dV. The factor $\tfrac{1}{2}\epsilon_0E^2 + \tfrac{1}{2}\mu_0^{-1}B^2$ has been shown to represent the density of energy stored in static electric and magnetic fields. If it is assumed that this factor bears the same interpretation for dynamic fields (and since it is a point function, this is a most reasonable assumption), then the term

$$\frac{\partial}{\partial t}\left(\tfrac{1}{2}\epsilon_0E^2 + \tfrac{1}{2}\mu_0^{-1}B^2\right)$$

may be identified as the time rate of change of the density of stored energy.
The factor $\mathbf{E}\cdot\boldsymbol{\iota}^r$ represents the power density being absorbed from the field by the response current $\boldsymbol{\iota}^r$. If, for example, the response current is flowing in a conductor, this term accounts for ohmic loss. Alternatively, if $\boldsymbol{\iota}^r$ is due to freely moving charges, $\mathbf{E}\cdot\boldsymbol{\iota}^r$ accounts for their increase in kinetic energy.
When the law of conservation of energy is invoked, it follows that the term $\nabla\cdot(\mathbf{E}\times\mathbf{H}_0)$ may be interpreted as the volume density of power leaving dV.
This conclusion may be seen from another point of view by integrating (5.83). With the aid of the divergence theorem, one may write

$$P = \frac{d}{dt}\int_V\left(\tfrac{1}{2}\epsilon_0E^2 + \tfrac{1}{2}\mu_0^{-1}B^2\right)dV + \int_V\mathbf{E}\cdot\boldsymbol{\iota}^r\,dV + \int_S\mathbf{E}\times\mathbf{H}_0\cdot d\mathbf{S} \tag{5.84}$$

The left side of (5.84) represents the entire instantaneous power being supplied by all the sources. The first integral on the right side of this equation accounts for the time rate of change of the entire stored energy of the field. The second integral stands for the power being absorbed by the system of response currents. The last integral therefore represents the entire instantaneous power flow outward across the surface S bounding the volume V. For this reason, one may define the Poynting vector as

$$\boldsymbol{\mathscr{P}} = \mathbf{E}\times\mathbf{H}_0 \tag{5.85}$$

and place upon it the interpretation that it gives in magnitude and direction the instantaneous rate of energy flow per unit area at a point. This is Poynting's theorem.
Since the units of E and $\mathbf{H}_0$ are volts per meter and amperes per meter respectively, it is seen that the units of $\boldsymbol{\mathscr{P}}$ are watts per square meter.
EXAMPLE

5.4

For the field of the half-wave dipole treated in Example 5.2, at points remote from the dipole (the far-field), Equations (5.73) and (5.74) are applicable and therefore

$$\mathbf{H}_0 = \mu_0^{-1}\mathbf{B} = \frac{\mu_0^{-1}}{c}\,\mathbf{1}_r\times\mathbf{E} = \frac{1}{\eta}\,\mathbf{1}_r\times\mathbf{E}$$

in which η = 377 ohms is called the impedance of free space. Therefore the Poynting vector (5.85) may be written for this case in the form

$$\boldsymbol{\mathscr{P}} = \mathbf{E}\times\mathbf{H}_0 = \mathbf{E}\times\frac{1}{\eta}(\mathbf{1}_r\times\mathbf{E}) = \mathbf{1}_r\frac{E^2}{\eta} = \mathbf{1}_r\,\frac{\eta I_m^2}{(2\pi r)^2}\left[\frac{\cos(\pi/2\,\cos\theta)}{\sin\theta}\right]^2\sin^2(\omega t - kr)$$

since $\mathscr{R}e\,[je^{j(\omega t - kr)}] = -\sin(\omega t - kr)$. If $\bar{\boldsymbol{\mathscr{P}}}$ is the time-average value of $\boldsymbol{\mathscr{P}}$, then

$$\bar{\boldsymbol{\mathscr{P}}} = \mathbf{1}_r\,\frac{\eta I_m^2}{8\pi^2r^2}\left[\frac{\cos(\pi/2\,\cos\theta)}{\sin\theta}\right]^2$$

and one sees that there is a steady radial flow of energy away from the dipole. The total average power being radiated may be determined by integrating $\bar{\boldsymbol{\mathscr{P}}}$ over the surface of a large sphere S centered at the dipole. This gives

$$P_{rad} = \int_S\bar{\boldsymbol{\mathscr{P}}}\cdot d\mathbf{S} = \int_0^\pi\frac{\eta I_m^2}{8\pi^2r^2}\left[\frac{\cos(\pi/2\,\cos\theta)}{\sin\theta}\right]^2 2\pi r^2\sin\theta\,d\theta = \frac{\eta I_m^2}{4\pi}\int_0^\pi\frac{\cos^2(\pi/2\,\cos\theta)}{\sin\theta}\,d\theta = \frac{\eta I_m^2}{4\pi}\,(1.2186)$$

in which the integral has been evaluated by first expanding the integrand in a power series.
As a specific illustration, if $I_{eff} = 0.707I_m = 1$ amp, since η = 377 ohms, the radiated power is $P_{rad}$ = 73 watts.
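The radiation integral of this example may also be evaluated by simple quadrature rather than by a power series. The Python sketch below is illustrative and rests on assumptions made here: the midpoint rule, the function names, and the rounded value η = 377 ohms taken from the text. It is expected to return an integral close to the value 1.2186 quoted above and a radiated power close to 73 watts for an effective current of 1 amp.

```python
import math

ETA = 377.0  # impedance of free space in ohms, as quoted in the text


def radiation_integral(n=20000):
    """Midpoint-rule estimate of the integral over theta from 0 to pi of
    cos((pi/2) cos(theta))**2 / sin(theta)."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += math.cos(0.5 * math.pi * math.cos(theta)) ** 2 / math.sin(theta)
    return total * h


def radiated_power(peak_current):
    """Total time-average power radiated by the half-wave dipole, in watts."""
    return ETA * peak_current ** 2 / (4.0 * math.pi) * radiation_integral()


if __name__ == "__main__":
    print(f"integral = {radiation_integral():.4f}")
    # An effective (rms) current of 1 amp corresponds to a peak of sqrt(2) amp.
    print(f"P_rad    = {radiated_power(math.sqrt(2.0)):.1f} watts")
```

The integrand is well behaved at both end points, where it falls to zero, so the midpoint rule needs no special treatment there.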

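Cases such as the preceding example, in which the currents and fields are varying harmonically in time, occur so frequently and have such importance as to deserve special discussion. Expressing all quantities in the form of a complex spatial vector function multiplied by $e^{j\omega t}$, such as

$$\mathbf{E}(x,y,z,t) = \mathscr{R}e\;\boldsymbol{\mathscr{E}}(x,y,z)\,e^{j\omega t}$$

one may write

$$\begin{aligned}
\boldsymbol{\mathscr{P}} = \mathbf{E}\times\mathbf{H}_0 &= \tfrac{1}{4}\left(\boldsymbol{\mathscr{E}}e^{j\omega t} + \boldsymbol{\mathscr{E}}^*e^{-j\omega t}\right)\times\left(\boldsymbol{\mathscr{H}}_0e^{j\omega t} + \boldsymbol{\mathscr{H}}_0^*e^{-j\omega t}\right)\\
&= \tfrac{1}{4}\left(\boldsymbol{\mathscr{E}}\times\boldsymbol{\mathscr{H}}_0^* + \boldsymbol{\mathscr{E}}^*\times\boldsymbol{\mathscr{H}}_0\right) + \tfrac{1}{4}\left(\boldsymbol{\mathscr{E}}\times\boldsymbol{\mathscr{H}}_0e^{j2\omega t} + \boldsymbol{\mathscr{E}}^*\times\boldsymbol{\mathscr{H}}_0^*e^{-j2\omega t}\right)\\
&= \tfrac{1}{2}\mathscr{R}e\,(\mathbf{E}\times\mathbf{H}_0^*) + \tfrac{1}{2}\mathscr{R}e\,(\mathbf{E}\times\mathbf{H}_0)
\end{aligned}\tag{5.86}$$

The term $\tfrac{1}{2}\mathscr{R}e\,(\mathbf{E}\times\mathbf{H}_0^*)$ is independent of time and thus represents the time-average value of $\boldsymbol{\mathscr{P}}$, giving

$$\bar{\boldsymbol{\mathscr{P}}} = \tfrac{1}{2}\mathscr{R}e\,(\mathbf{E}\times\mathbf{H}_0^*) \tag{5.87}$$

The term $\tfrac{1}{2}\mathscr{R}e\,(\mathbf{E}\times\mathbf{H}_0)$ contains the factor $e^{j2\omega t}$ and thus represents the oscillating portion of Poynting's vector. $\boldsymbol{\mathscr{P}}$ may therefore be interpreted at a point as consisting of a steady flow of energy density plus a flow which surges back and forth at frequency 2ω.
Similarly

$$\tfrac{1}{2}\epsilon_0E^2 = \tfrac{1}{2}\epsilon_0\left[\tfrac{1}{2}\left(\boldsymbol{\mathscr{E}}e^{j\omega t} + \boldsymbol{\mathscr{E}}^*e^{-j\omega t}\right)\cdot\tfrac{1}{2}\left(\boldsymbol{\mathscr{E}}e^{j\omega t} + \boldsymbol{\mathscr{E}}^*e^{-j\omega t}\right)\right] = \tfrac{1}{4}\epsilon_0\mathbf{E}\cdot\mathbf{E}^* + \tfrac{1}{4}\epsilon_0\mathscr{R}e\,(\mathbf{E}\cdot\mathbf{E}) \tag{5.88}$$

and

$$\tfrac{1}{2}\mu_0^{-1}B^2 = \tfrac{1}{4}\mu_0^{-1}\mathbf{B}\cdot\mathbf{B}^* + \tfrac{1}{4}\mu_0^{-1}\mathscr{R}e\,(\mathbf{B}\cdot\mathbf{B}) \tag{5.89}$$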
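The rule that a time average of a product of two harmonic quantities equals one-half the real part of one phasor times the conjugate of the other, which underlies (5.87), can be illustrated with scalars. The Python sketch below is a scalar analog only and is not the vector statement of the text; the function names and the sample phasors are assumptions made for the illustration.

```python
import cmath
import math


def time_average_product(a, b, omega=1.0, n=1000):
    """Average over one period of Re(a e^{j w t}) * Re(b e^{j w t}),
    using n equally spaced samples."""
    period = 2.0 * math.pi / omega
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * period / n
        rotation = cmath.exp(1j * omega * t)
        total += (a * rotation).real * (b * rotation).real
    return total / n


def phasor_average(a, b):
    """One-half the real part of a times the conjugate of b."""
    return 0.5 * (a * b.conjugate()).real


if __name__ == "__main__":
    a = 2.0 + 1.0j   # hypothetical complex amplitude
    b = 0.5 - 3.0j   # hypothetical complex amplitude
    print(time_average_product(a, b), phasor_average(a, b))
```

The part of the product that varies as $e^{j2\omega t}$ averages to zero over a period, which is why only the conjugate term survives.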

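The terms $\tfrac{1}{4}\epsilon_0\mathbf{E}\cdot\mathbf{E}^*$ and $\tfrac{1}{4}\mu_0^{-1}\mathbf{B}\cdot\mathbf{B}^*$ are independent of time and represent the time-average stored energies; their time derivatives are zero. The terms $\tfrac{1}{4}\epsilon_0\mathscr{R}e\,(\mathbf{E}\cdot\mathbf{E})$ and $\tfrac{1}{4}\mu_0^{-1}\mathscr{R}e\,(\mathbf{B}\cdot\mathbf{B})$ oscillate at a frequency 2ω and they represent the variable components of the stored energy.
Finally,

$$\mathbf{E}\cdot\boldsymbol{\iota}^r = \tfrac{1}{4}\left(\boldsymbol{\mathscr{E}}e^{j\omega t} + \boldsymbol{\mathscr{E}}^*e^{-j\omega t}\right)\cdot\left(\boldsymbol{\mathscr{J}}^re^{j\omega t} + \boldsymbol{\mathscr{J}}^{r*}e^{-j\omega t}\right) = \tfrac{1}{2}\mathscr{R}e\;\mathbf{E}\cdot\boldsymbol{\iota}^{r*} + \tfrac{1}{2}\mathscr{R}e\;\mathbf{E}\cdot\boldsymbol{\iota}^r \tag{5.90}$$

Here again, the term $\tfrac{1}{2}\mathscr{R}e\;\mathbf{E}\cdot\boldsymbol{\iota}^{r*}$ represents the time-average power density being absorbed by the response currents; the term $\tfrac{1}{2}\mathscr{R}e\;\mathbf{E}\cdot\boldsymbol{\iota}^r$ oscillates at a frequency 2ω and represents the energy density being cyclically absorbed and released by the response currents.
With this formulation, Equation (5.84) may be rewritten in two parts. The time-average power balance is seen to be

$$\bar{P} = \tfrac{1}{2}\mathscr{R}e\int_V\mathbf{E}\cdot\boldsymbol{\iota}^{r*}\,dV + \tfrac{1}{2}\mathscr{R}e\int_S\mathbf{E}\times\mathbf{H}_0^*\cdot d\mathbf{S} \tag{5.91}$$

whereas the time-variable part, oscillating at a frequency 2ω, may be written

$$P(2\omega) = \frac{d}{dt}\int_V\left[\tfrac{1}{4}\epsilon_0\mathscr{R}e\,(\mathbf{E}\cdot\mathbf{E}) + \tfrac{1}{4}\mu_0^{-1}\mathscr{R}e\,(\mathbf{B}\cdot\mathbf{B})\right]dV + \tfrac{1}{2}\mathscr{R}e\int_V\mathbf{E}\cdot\boldsymbol{\iota}^r\,dV + \tfrac{1}{2}\mathscr{R}e\int_S\mathbf{E}\times\mathbf{H}_0\cdot d\mathbf{S} \tag{5.92}$$

Thus, on the time average, the sources supply power only to that component of the response currents in phase with the electric field, represented by the first integral in (5.91), and to the net energy flow out of the volume V across the surface S. In addition, the sources may have to furnish energy and take it back at the cyclic rate 2ω if the right side of (5.92) is not zero. However, in many practical circumstances, the individual integrals in (5.92) may not be in phase, but may be adjusted purposely so that they cancel each other, thus "matching" the generator.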
EXAMPLE

5.5

In a volume V away from all currents, Equations (5.91) and (5.92) give

$$\tfrac{1}{2}\mathscr{R}e\int_S\mathbf{E}\times\mathbf{H}_0^*\cdot d\mathbf{S} = 0$$

$$\tfrac{1}{2}\mathscr{R}e\int_S\mathbf{E}\times\mathbf{H}_0\cdot d\mathbf{S} = -\frac{d}{dt}\int_V\left[\tfrac{1}{4}\epsilon_0\mathscr{R}e\,(\mathbf{E}\cdot\mathbf{E}) + \tfrac{1}{4}\mu_0^{-1}\mathscr{R}e\,(\mathbf{B}\cdot\mathbf{B})\right]dV$$

The first equation says that the average power flow into V equals the average power flow out. This is as it should be since V consists of free space. The second equation says that the energy which surges back and forth across S accommodates the cyclic variation of stored energy within V.
As a specific illustration, let V be sufficiently remote from the currents so that the fields may be described by (5.73) and (5.74). If then all dimensions of V are small compared to r (the distance to the currents), A assumes the simple form throughout V of

$$\mathbf{A} = \boldsymbol{\mathscr{a}}_0\,e^{j(\omega t - kz)}$$

with $\boldsymbol{\mathscr{a}}_0$ a constant, and the local Z axis chosen in the direction of propagation. It then follows that

$$\mathbf{E} = \mathbf{1}_x\left(\mathscr{E}_0e^{-jkz}\right)e^{j\omega t} \qquad \mathbf{B} = \mathbf{1}_y\left(\frac{\mathscr{E}_0}{c}\,e^{-jkz}\right)e^{j\omega t}$$

in which $\mathscr{E}_0$ is taken to be a real constant, and the X and Y axes have been oriented appropriately in a transverse plane. Thus

$$\tfrac{1}{4}\epsilon_0\mathscr{R}e\,(\mathbf{E}\cdot\mathbf{E}) = \tfrac{1}{4}\epsilon_0\mathscr{R}e\;\mathscr{E}_0^2e^{j2(\omega t - kz)} = \tfrac{1}{4}\epsilon_0\mathscr{E}_0^2\cos 2(\omega t - kz)$$

$$\tfrac{1}{4}\mu_0^{-1}\mathscr{R}e\,(\mathbf{B}\cdot\mathbf{B}) = \tfrac{1}{4}\frac{\mu_0^{-1}}{c^2}\,\mathscr{E}_0^2\cos 2(\omega t - kz) = \tfrac{1}{4}\epsilon_0\mathscr{R}e\,(\mathbf{E}\cdot\mathbf{E})$$

The time-varying energies stored in the electric and magnetic fields have the same peak values. The same is true of the time-average values.
Further,

$$\tfrac{1}{2}\mathbf{E}\times\mathbf{H}_0^* = c\,\mathbf{1}_z\left(\tfrac{1}{2}\epsilon_0\mathscr{E}_0^2\right)$$

$$\tfrac{1}{2}\mathscr{R}e\,(\mathbf{E}\times\mathbf{H}_0) = c\,\mathbf{1}_z\left(\tfrac{1}{2}\epsilon_0\mathscr{E}_0^2\right)\cos 2(\omega t - kz)$$

The first of these two expressions has only a real part, is independent of spatial position, and gives the time-average value of the Poynting vector. It is interesting to note that the average power crossing unit transverse area is equal to the average energy stored in a volume c units long and unit area in cross section. Since the waves are propagating at a speed c, this is a most reasonable result.
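The two integrals at the beginning of this example may be applied to the case for which V is a rectangular volume of square and unit cross section, one-quarter wavelength long in the Z direction. Then the first integral becomes

$$\tfrac{1}{2}\mathscr{R}e\int_S\mathbf{E}\times\mathbf{H}_0^*\cdot d\mathbf{S} = c\left(\tfrac{1}{2}\epsilon_0\mathscr{E}_0^2\right)\int_S\mathbf{1}_z\cdot d\mathbf{S}$$

This surface integral has contributions only over the two transverse surfaces, these contributions being equal and opposite, thus giving the required null result.
For the second integral,

$$\tfrac{1}{2}\mathscr{R}e\int_S\mathbf{E}\times\mathbf{H}_0\cdot d\mathbf{S} = c\left(\tfrac{1}{2}\epsilon_0\mathscr{E}_0^2\right)\int_S\mathbf{1}_z\cos 2(\omega t - kz)\cdot d\mathbf{S} = -c\,\epsilon_0\mathscr{E}_0^2\cos 2\omega t$$

$$-\frac{d}{dt}\int_V\left(\tfrac{1}{2}\epsilon_0\mathscr{E}_0^2\right)\cos 2(\omega t - kz)\,dV = -\frac{d}{dt}\left(\frac{c\,\epsilon_0\mathscr{E}_0^2}{2\omega}\sin 2\omega t\right) = -c\,\epsilon_0\mathscr{E}_0^2\cos 2\omega t$$

and thus the two sides of the second equation of this example are seen to agree.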

All the principal features of the preceding discussion of Poynting's theorem for time-harmonic fields may be retained by deriving a complex form of Equation (5.84). If one assumes that all fields and currents are expressible as complex vector functions of position, multiplied by the time factor $e^{j\omega t}$, one can let

$$d^3P = -\mathbf{E}\cdot\boldsymbol{\iota}^{i*}\,dV \tag{5.93}$$

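be defined as the complex power being supplied by the sources to the field. This concept of complex power will require and receive subsequent interpretation. Then, since

$$\boldsymbol{\iota}^{i*} = \nabla\times\mathbf{H}_0^* - \frac{\partial\mathbf{D}_0^*}{\partial t} - \boldsymbol{\iota}^{r*} = \nabla\times\mathbf{H}_0^* + j\omega\mathbf{D}_0^* - \boldsymbol{\iota}^{r*}$$

one may write

$$d^3P = \left[-\mathbf{E}\cdot\nabla\times\mathbf{H}_0^* - j2\omega\left(\tfrac{1}{2}\epsilon_0\mathbf{E}\cdot\mathbf{E}^*\right) + \mathbf{E}\cdot\boldsymbol{\iota}^{r*}\right]dV$$

Once again, use of (V.108) and (5.25a) gives

$$-\mathbf{E}\cdot\nabla\times\mathbf{H}_0^* = \nabla\cdot(\mathbf{E}\times\mathbf{H}_0^*) + \mathbf{H}_0^*\cdot\frac{\partial\mathbf{B}}{\partial t} = \nabla\cdot(\mathbf{E}\times\mathbf{H}_0^*) + j2\omega\left(\tfrac{1}{2}\mu_0^{-1}\mathbf{B}\cdot\mathbf{B}^*\right)$$

and therefore

$$d^3P = \left[j2\omega\left(\tfrac{1}{2}\mu_0^{-1}\mathbf{B}\cdot\mathbf{B}^* - \tfrac{1}{2}\epsilon_0\mathbf{E}\cdot\mathbf{E}^*\right) + \mathbf{E}\cdot\boldsymbol{\iota}^{r*} + \nabla\cdot(\mathbf{E}\times\mathbf{H}_0^*)\right]dV \tag{5.94}$$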

If this expression is integrated the result is

$$P = j4\omega(W_m - W_e) + \int_V\mathbf{E}\cdot\boldsymbol{\iota}^{r*}\,dV + \int_S\mathbf{E}\times\mathbf{H}_0^*\cdot d\mathbf{S} \tag{5.95}$$

in which, by virtue of (5.88) and (5.89), $W_m$ and $W_e$ are the time-average values of the total energies stored in the magnetic and electric fields in the volume V.
One-half the real part of (5.95) is seen to be identical with (5.91) and gives the time-average power being delivered by the sources, that is,

$$\bar{P} = \tfrac{1}{2}\mathscr{R}e\,P \tag{5.96}$$

No equivalent simple interpretation may be placed upon the imaginary part of (5.95) in the general case. However, in regions away from the sources

$$\tfrac{1}{2}\mathscr{I}m\int_S\mathbf{E}\times\mathbf{H}_0^*\cdot d\mathbf{S} = 2\omega(W_e - W_m) \tag{5.97}$$

Example 5.5 contained an illustration of (5.97) in that $\mathbf{E}\times\mathbf{H}_0^*$ did not have an imaginary part and the time-average values of electric and magnetic stored energies were found to be equal.
Because of the utility of the preceding formulation, it is customary when dealing with time-harmonic fields to define a complex Poynting vector by the relation

$$\boldsymbol{\mathscr{P}} = \mathbf{E}\times\mathbf{H}_0^* \tag{5.98}$$

from whence it follows, by use of (5.87), that the time-average value of energy flow at a point, per unit transverse area, is given by

$$\bar{\boldsymbol{\mathscr{P}}} = \tfrac{1}{2}\mathscr{R}e\,\boldsymbol{\mathscr{P}} \tag{5.99}$$

EXAMPLE

5.6

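In the application of Equation (5.95) to radiation problems, all points of the surface S are customarily remote from the currents, so that (5.73) and (5.74) are applicable and

$$\boldsymbol{\mathscr{P}} = \mathbf{E}\times\mathbf{H}_0^* = \mathbf{1}_r\,\frac{k^2\eta}{(4\pi r)^2}\left(\mathscr{a}_\theta\mathscr{a}_\theta^* + \mathscr{a}_\phi\mathscr{a}_\phi^*\right) \tag{5.100}$$

in which $\mathscr{a}_\theta$ and $\mathscr{a}_\phi$ are components of $\boldsymbol{\mathscr{a}}(\theta,\phi)$ as given by (5.76).
It should be noted that $\boldsymbol{\mathscr{P}}$ in (5.100) has only a real component and is therefore twice the time-average energy flow in watts per square meter. At a fixed large distance r from the system of currents, $\boldsymbol{\mathscr{P}}$ is a function of θ and φ and is known as the power radiation pattern. The directional dependence of $\boldsymbol{\mathscr{P}}$ is controlled by the current distribution through (5.76). If $\boldsymbol{\mathscr{P}}(\theta,\phi)$ is specified, (5.76) becomes an integral equation involving the sought-for current distribution $\boldsymbol{\iota}(\xi,\eta,\zeta)$; this defines an antenna synthesis problem. If $\boldsymbol{\iota}(\xi,\eta,\zeta)$ is specified, (5.76) is an integral solution for the power pattern $\boldsymbol{\mathscr{P}}(\theta,\phi)$; this defines an antenna analysis problem. These subjects have been treated extensively in the literature.¹²
For the specific case of the half-wave dipole of Example 5.2, since

$$\boldsymbol{\mathscr{a}}_T = -\mathbf{1}_\theta\,\frac{2I_m}{k}\,\frac{\cos(\pi/2\,\cos\theta)}{\sin\theta}$$

use of (5.99) and (5.100) gives

$$\bar{\boldsymbol{\mathscr{P}}} = \tfrac{1}{2}\mathscr{R}e\,\boldsymbol{\mathscr{P}} = \mathbf{1}_r\,\frac{\eta I_m^2}{8\pi^2r^2}\left[\frac{\cos(\pi/2\,\cos\theta)}{\sin\theta}\right]^2$$

which agrees with the result found in Example 5.4.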

5.10

SOLUTIONS TO THE WAVE EQUATION IN RECTANGULAR COORDINATES-UNGUIDED WAVES

Equations (5.35) may be looked upon as dynamic analogs to Poisson's equation. The
developments in Sections 5.5 and 5.6 have revealed that solutions to these equations
in regions remote from the currents may have a wavelike nature. But at points not
occupied by currents, (5.35a) and (5.35b) reduce to
(5.101a)
(5.101b)

and these homogeneous vector differential equations may be likened to Laplace's equation. Because the general solutions to (5.101) are wavelike (as will be seen shortly), (5.101a) and (5.101b) are customarily referred to as the homogeneous vector wave equations.

¹² See, e.g., H. Jasik, ed., Antenna Engineering Handbook, McGraw-Hill Book Company, New York, 1961. Also, R. C. Hansen, ed., Microwave Scanning Antennas, Academic Press, New York, 1964.
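Since $\nabla\cdot\mathbf{E} \equiv 0$, $\nabla\cdot\mathbf{B} \equiv 0$ away from the currents, these equations further reduce to

$$\nabla^2\mathbf{E} - \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = 0 \tag{5.102a}$$

$$\nabla^2\mathbf{B} - \frac{1}{c^2}\frac{\partial^2\mathbf{B}}{\partial t^2} = 0 \tag{5.102b}$$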

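If $f$ is any component of E or B, then in rectangular coordinates

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} + \frac{\partial^2 f}{\partial (jct)^2} = 0 \tag{5.103}$$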

and this is seen to be a four-dimensional form of Laplace's equation. Using the method
of separation of variables, in exactly the same manner that it was employed in Section
3.11, one obtains as a primitive solution of (5.103)

$$f = e^{j\omega t - j\mathbf{k}\cdot\mathbf{r}} \tag{5.104}$$

with $\mathbf{r}$ drawn from the origin to the point $(x,y,z)$ and

$$\mathbf{k} = \mathbf{1}_x k_x + \mathbf{1}_y k_y + \mathbf{1}_z k_z \tag{5.105}$$

$$k^2 = k_x^2 + k_y^2 + k_z^2 = \frac{\omega^2}{c^2} \tag{5.106}$$

The solution (5.104) is recognized as representing a uniform plane wave, in that all
points in a plane transverse to k have the same amplitude and a common phase. The
wave propagates in the direction of $\mathbf{k}$ at the velocity of light and has a wavelength $\lambda$
given by

$$\lambda = \frac{2\pi}{k} \tag{5.107}$$

If attention is restricted to uniform plane waves propagating in the positive and
negative X directions, Equation (5.104) indicates the fundamental solution

$$f(x,t) = a\,e^{j(\omega t - kx)} + b\,e^{j(\omega t + kx)} \tag{5.108}$$

in which $a$ and $b$ are constants.


By linear superposition, and with the aid of the Fourier integral theorem, a general
solution may be constructed from (5.108) of the form

$$f(x,t) = f_1(x - ct) + f_2(x + ct) \tag{5.109}$$

in which $f_1$ and $f_2$ are arbitrary functions. The forms of these functions show clearly
that any spatial waveform existing at a time $t_1$ is preserved and merely displaced a
distance $c(t_2 - t_1)$ at a later time $t_2$, indicating undistorted propagation at the velocity
of light.
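The undistorted propagation described by (5.109) can be checked numerically. The following is a minimal sketch, assuming a Gaussian pulse as the arbitrary function $f_1$ (the pulse shape and the names `pulse`, `field`, and `wave_equation_residual` are illustrative choices, not taken from the text). It compares the waveform at two instants and estimates the residual of the one-dimensional form of (5.103) by central differences.

```python
import math

C = 299_792_458.0  # velocity of light in free space, m/s


def pulse(u):
    """Hypothetical waveform f1(u): a Gaussian pulse (illustrative choice)."""
    return math.exp(-u * u)


def field(x, t):
    """f(x, t) = f1(x - c t), a disturbance travelling toward +x."""
    return pulse(x - C * t)


def wave_equation_residual(x, t, hx=1e-3):
    """Central-difference estimate of d2f/dx2 - (1/c^2) d2f/dt2 at (x, t)."""
    ht = hx / C  # time step that corresponds to the space step hx
    d2f_dx2 = (field(x + hx, t) - 2.0 * field(x, t) + field(x - hx, t)) / hx**2
    d2f_dt2 = (field(x, t + ht) - 2.0 * field(x, t) + field(x, t - ht)) / ht**2
    return d2f_dx2 - d2f_dt2 / C**2


if __name__ == "__main__":
    t1, t2 = 1.0e-8, 1.2e-8
    shift = C * (t2 - t1)  # displacement c(t2 - t1) of the waveform
    print(field(0.5, t1), field(0.5 + shift, t2))  # the two values should agree
    print(wave_equation_residual(3.2, t1))         # should be close to zero
```

The value sampled at $x$ at time $t_1$ reappears at $x + c(t_2 - t_1)$ at time $t_2$, which is the displacement property stated above.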
If waves propagating in all directions are considered, three-dimensional Fourier
integrals may be used to fabricate arbitrary spatial distributions. In particular, a
solution

$$f(x,y,z,t) = g_1(y,z)\,f_1(x - ct) + g_2(y,z)\,f_2(x + ct) \tag{5.110}$$

may be constructed with $g_1$ and $g_2$ arbitrary functions. This is seen to represent nonuniform plane waves propagating in the X direction at the velocity of light, with
arbitrary amplitude distributions in a transverse plane. By insertion of (5.110) in
(5.103) it is evident that $g_1$ and $g_2$ satisfy the two-dimensional Laplace's equation.
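If one returns to the constituent solution (5.104), which applies for any component
of E or B, it follows that on putting the components together, a uniform plane wave
may be represented by

$$\mathbf{E}(x,y,z,t) = \mathbf{1}_E E_0\,e^{j\omega t - j\mathbf{k}\cdot\mathbf{r}} \tag{5.111}$$

$$\mathbf{B}(x,y,z,t) = \mathbf{1}_B B_0\,e^{j\omega t - j\mathbf{k}\cdot\mathbf{r}} \tag{5.112}$$

wherein $\mathbf{1}_E$ and $\mathbf{1}_B$ are unit vectors and $E_0$ and $B_0$ are complex constants. Since $\nabla\cdot\mathbf{E} \equiv 0$
and $\nabla\cdot\mathbf{B} \equiv 0$, it follows that both $\mathbf{1}_E$ and $\mathbf{1}_B$ must be transverse to $\mathbf{k}$, and for this
reason the electromagnetic wave is said to be transverse.
E and B are related through Maxwell's equations, and insertion of (5.111) and
(5.112) in (5.25a) gives

$$(\mathbf{k}\times\mathbf{1}_E)E_0 - \mathbf{1}_B\,\omega B_0 = 0$$

which requires that

$$\mathbf{1}_B = \mathbf{1}_k\times\mathbf{1}_E \tag{5.113}$$

$$B_0 = \frac{k}{\omega}E_0 = \frac{E_0}{c} \tag{5.114}$$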

Therefore the E and B fields are crossed, both being transverse to the direction of
propagation, and their amplitudes are in the ratio c. These properties are held in
common with spherical waves at great distances from the sources, as has already been
noted in Section 5.7. But this is hardly surprising, since spherical waves at great radii
of curvature are well-approximated by plane waves.
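Relations (5.113) and (5.114) amount to $\mathbf{B} = (\mathbf{k}\times\mathbf{E})/\omega$ for the complex amplitudes. As a small illustration, the sketch below computes the magnetic amplitude vector from a given wave vector and electric amplitude vector; the function name `magnetic_amplitude` and the tolerance used for the transversality check are assumptions made for this example only.

```python
import math

C = 299_792_458.0  # velocity of light in free space, m/s


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Plain (unconjugated) dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def norm(a):
    """Euclidean length; abs() lets the components be complex amplitudes."""
    return math.sqrt(sum(abs(c) ** 2 for c in a))


def magnetic_amplitude(k_vec, e_vec, omega):
    """Return B = (k x E) / omega for a uniform plane wave in free space."""
    if abs(dot(k_vec, e_vec)) > 1e-9 * norm(k_vec) * norm(e_vec):
        raise ValueError("E must be transverse to k for a uniform plane wave")
    return tuple(component / omega for component in cross(k_vec, e_vec))


if __name__ == "__main__":
    omega = 2.0 * math.pi * 1.0e9   # 1 GHz, an arbitrary choice
    k = omega / C                   # from (5.106)
    b = magnetic_amplitude((0.0, 0.0, k), (1.0, 0.0, 0.0), omega)
    print(b)                        # Y-directed, magnitude E0 / c
```

For propagation along Z with E along X, the result is Y-directed with amplitude $E_0/c$, so the fields are crossed and their amplitudes are in the ratio $c$.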
The power density of this uniform plane wave is given by
(5.115)
and many of the remarks put forth in Example 5.5, in which a plane wave approximation was made, are applicable to this case.
By linear superposition, the above results for a uniform plane wave may be generalized to the case of a nonuniform plane wave through use of Fourier integrals. The development parallels what has already been said for a single component of the field.
EXAMPLE 5.7

Consider a uniform plane wave traveling in the +X direction and imagine that it encounters a flat conducting surface in the plane $x = 0$. If it is a good conductor, practically
all the energy in the incident wave will be reflected. This situation may be idealized by
assuming that the conductor is a perfect reflector, meaning that $\tfrac{1}{2}\,\mathrm{Re}\,(\mathbf{E}\times\mathbf{H}^*) \equiv 0$ within
the conductor.
Assume that the Y axis is oriented parallel to the incident electric field $\mathbf{E}^i$. Then from
(5.111)

$$\mathbf{E}^i(x,t) = \mathbf{1}_y E_0^i\,e^{j\omega t - jkx}$$

whereas from (5.12) and (5.13),

$$\mathbf{B}^i(x,t) = \mathbf{1}_z\,\frac{E_0^i}{c}\,e^{j\omega t - jkx}$$

Equation (5.115) gives for the incident power density

$$\mathcal{P}^i = \mathbf{1}_x\,c\left(\tfrac{1}{2}\epsilon_0|E_0^i|^2\right)$$

Let the reflected wave be represented by

$$\mathbf{E}^r(x,t) = \mathbf{1}_y E_0^r\,e^{j\omega t + jkx}$$

Then, since the power flow for the reflected wave must be in the $-X$ direction, Equations
(5.12) and (5.13) give

$$\mathbf{B}^r(x,t) = -\mathbf{1}_z\,\frac{E_0^r}{c}\,e^{j\omega t + jkx}$$

Because no field exists inside the idealized conductor, the total electric field at $x = 0^-$ must vanish, so that

$$\mathbf{E}^i(0^-,t) + \mathbf{E}^r(0^-,t) = \mathbf{1}_y\,e^{j\omega t}\,(E_0^i + E_0^r) \equiv 0$$

which requires that $E_0^r = -E_0^i$. This in turn satisfies the condition that the power density
in the reflected wave equals the incident power density.
The total magnetic field just in front of the idealized conductor is therefore

$$\mathbf{B}^i(0^-,t) + \mathbf{B}^r(0^-,t) = \mathbf{1}_z\,e^{j\omega t}\left(\frac{E_0^i}{c} - \frac{E_0^r}{c}\right) = \mathbf{1}_z\,\frac{2E_0^i}{c}\,e^{j\omega t}$$

whereas the total magnetic field just inside the conductor is zero. This discontinuity in the
magnetic field is accommodated by a sheet of current which flows in the surface of the conductor. This current sheet has been induced by the incident wave and is the source of the
reflected wave. Its strength may be deduced by recourse to the dynamic form of Ampere's
circuital law, (5.28b).

[Figure: incident and reflected waves in free space in front of a conductor whose surface lies in the plane x = 0; axes X and Z.]

With reference to the figure, if a rectangular contour is chosen in the
XZ plane such that its long legs, of length $l$, are just inside and just outside the conducting
surface, then

$$\oint \mathbf{H}\cdot d\mathbf{l} = \frac{2l}{\mu_0 c}\,E_0^i\,e^{j\omega t}$$

This must equal the total conduction current enclosed by the contour, since $\dot{\mathbf{D}}$ will make
no contribution if the short legs of the contour are reduced to infinitesimals. Therefore,
the total conduction current enclosed is

$$I = 2lc\,\epsilon_0 E_0^i\,e^{j\omega t}$$

and a lineal current density $\boldsymbol{\jmath}$ flows in the conducting surface such that

$$\boldsymbol{\jmath} = \mathbf{1}_y\,2c\,\epsilon_0 E_0^i\,e^{j\omega t}$$

In causing the flow of energy in the electromagnetic wave to be turned around, the
conductor suffers a reaction which may be computed from the Ampere force law. Since

$$d^3\mathbf{F} = \boldsymbol{\iota}\times\mathbf{B}\,dV$$

if the areal current density $\boldsymbol{\iota}$ is "collapsed" into the surface to give a lineal current density $\boldsymbol{\jmath}$,
one obtains

$$d^2\mathbf{F} = \boldsymbol{\jmath}\times\mathbf{B}\,dS$$

so that the conductor experiences a pressure due to the wave given by

$$\mathbf{p} = \frac{d^2\mathbf{F}}{dS} = \boldsymbol{\jmath}\times\mathbf{B}$$

Since the sheet of current is immersed in a magnetic field whose spatial average value is
$\mathbf{1}_z\,(E_0^i/c)\,e^{j\omega t}$, the pressure is

$$\mathbf{p} = \mathbf{1}_x\,2\epsilon_0\left[\mathrm{Re}\,(E_0^i\,e^{j\omega t})\right]^2$$

and this has an average value

$$\mathbf{1}_x\,2\left(\tfrac{1}{2}\epsilon_0|E_0^i|^2\right)$$

which is twice the energy density in the wave (cf. Example 5.5).
This result is consistent with the viewpoint that the energy possessed by the incident
wave in a column $c$ units long and unit cross section is $c\left(\tfrac{1}{2}\epsilon_0|E_0^i|^2\right)$. According to the mass-energy equivalence formula, this may be equated to $mc^2$. But then this much of the wave
has a momentum equal to

$$mc = \tfrac{1}{2}\epsilon_0|E_0^i|^2$$

and this much momentum is reversed in 1 sec against unit area of the conductor, causing
a radiation pressure of $2mc$.
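To give a feeling for magnitudes, the result of this example can be evaluated numerically. The sketch below assumes the time-average relations quoted above (energy density $\tfrac{1}{2}\epsilon_0|E_0^i|^2$, incident power density $c$ times that, and pressure equal to twice the energy density); the function names and the 1 kV/m test amplitude are hypothetical choices for illustration.

```python
EPS0 = 8.8541878128e-12   # permittivity of free space, F/m
C = 299_792_458.0         # velocity of light in free space, m/s


def energy_density(e0_peak):
    """Time-average energy density (1/2) eps0 |E0|^2 in J/m^3."""
    return 0.5 * EPS0 * abs(e0_peak) ** 2


def incident_power_density(e0_peak):
    """Time-average incident power density c (1/2) eps0 |E0|^2 in W/m^2."""
    return C * energy_density(e0_peak)


def radiation_pressure(e0_peak):
    """Average pressure on a perfect reflector at normal incidence, N/m^2."""
    return 2.0 * energy_density(e0_peak)


if __name__ == "__main__":
    e0 = 1000.0  # peak incident field in V/m (arbitrary illustrative value)
    print(incident_power_density(e0), "W/m^2")
    print(radiation_pressure(e0), "N/m^2")
```

The pressure equals twice the incident power density divided by $c$, which is the momentum-reversal argument given at the end of the example.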

If one returns to (5.111) and (5.112), which are the expressions for a uniform plane
wave, and the Z axis is chosen in the direction of propagation, then

$$\mathbf{1}_E E_0 = \mathbf{1}_x E_1 + \mathbf{1}_y E_2$$

in which $E_1$ and $E_2$ are complex constants. Therefore

$$\mathbf{E} = \mathbf{1}_x E_1\,e^{j(\omega t - kz)} + \mathbf{1}_y E_2\,e^{j(\omega t - kz)} \tag{5.116}$$

$$\mathbf{B} = -\mathbf{1}_x\,\frac{E_2}{c}\,e^{j(\omega t - kz)} + \mathbf{1}_y\,\frac{E_1}{c}\,e^{j(\omega t - kz)} \tag{5.117}$$

Inspection of these equations reveals that $(E_x,B_y)$ and $(E_y,B_x)$ are linearly independent
fields. $(E_x,B_y)$ is said to be an X-polarized wave and $(E_y,B_x)$ is called a Y-polarized
wave, the designation referring to the spatial direction of the electric field. The total
field (E,B) is the superposition of these two cross-polarized waves.
If the complex factors $E_1$ and $E_2$ have the same phase, then E at any point in space
oscillates along a directional line which makes a constant angle $\varphi$ with the X axis, this
angle being given by $\varphi = \tan^{-1}(E_2/E_1)$. Under this condition, the wave is said to be
linearly polarized.
If $E_1$ and $E_2$ have the same magnitude, but their phases differ by 90 deg, then E at
any point in space does not oscillate. Its magnitude is constant, but its direction rotates
at the angular velocity $\omega$. To see this, let $E_2 = \pm jE_1$, so that (5.116) yields

$$\mathbf{E}(z,t) = \mathrm{Re}\,(\mathbf{1}_x \pm j\mathbf{1}_y)\,E_1\,e^{j(\omega t - kz)} = E_1\left[\mathbf{1}_x\cos(\omega t - kz) \mp \mathbf{1}_y\sin(\omega t - kz)\right] \tag{5.118}$$

in which, for simplicity, the phase of $E_1$ has been chosen as zero. Therefore,

$$|\mathbf{E}(z,t)| = E_1\left[\cos^2(\omega t - kz) + \sin^2(\omega t - kz)\right]^{1/2} = E_1$$

a result which is independent of time and position. Further, the direction of E makes an
angle $\varphi$ with the X axis given by

$$\varphi = \mp\tan^{-1}\frac{\sin(\omega t - kz)}{\cos(\omega t - kz)} = \mp(\omega t - kz) \tag{5.119}$$

At a fixed point $(x,y,z)$, $\varphi$ changes linearly with time at the angular rate $\omega$. If the
thumb of the right hand is placed in the direction of propagation, E thus either rotates
in the direction indicated by the other fingers, or counter to this direction. When the
rotation of E agrees with the direction of the fingers ($E_2 = -jE_1$), the wave is said
to be right-handed circularly polarized; if E rotates counter to the finger direction
($E_2 = +jE_1$), the wave is said to be left-handed circularly polarized.
Alternatively, if time is held fixed, and the direction of E is viewed as a function of $z$,
Equation (5.119) indicates that the locus of the tip of E is a helix whose axis is the
Z axis, the $z$ length of one turn being a wavelength. The helix resembles either a left-hand thread or a right-hand thread, depending on whether $E_2$ lags or leads $E_1$.
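The stored energy and the energy flow associated with either a right-handed or left-handed circularly polarized wave are given by

$$w_e = \tfrac{1}{2}\epsilon_0|\mathbf{E}(z,t)|^2 = \tfrac{1}{2}\epsilon_0 E_1^2 = w_m \tag{5.120}$$

$$\mathcal{P} = \mathbf{E}\times\mathbf{H} = \mathbf{1}_z\,\frac{E_1^2}{\eta} \tag{5.121}$$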

Therefore, the energy density and the energy flow are both independent of time and
space, a characteristic not shared by linearly polarized plane waves.
If one returns again to the general solution (5.116), and if $E_1$ and $E_2$ have arbitrary
relative amplitudes and phases, at any point in space the tip of E describes a locus
which is an ellipse, and for this reason the wave is said to be elliptically polarized. It is
left as an exercise to develop the properties of such plane waves, including the useful
fact that any elliptically polarized wave may be represented by appropriate amounts of
left-handed and right-handed circularly polarized waves.
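The decomposition mentioned in the exercise can be sketched numerically. With the handedness convention used above (right-handed for $E_2 = -jE_1$, left-handed for $E_2 = +jE_1$), one possible choice of circular amplitudes is $a_R = (E_1 + jE_2)/2$ and $a_L = (E_1 - jE_2)/2$, so that $\mathbf{1}_xE_1 + \mathbf{1}_yE_2 = a_R(\mathbf{1}_x - j\mathbf{1}_y) + a_L(\mathbf{1}_x + j\mathbf{1}_y)$. The function names below are hypothetical and the normalization is only one of several in common use.

```python
import cmath
import math


def circular_components(e1, e2):
    """Split (E1, E2) into right- and left-handed circular amplitudes."""
    a_right = (e1 + 1j * e2) / 2.0   # multiplies (1x - j 1y), i.e. E2 = -j E1
    a_left = (e1 - 1j * e2) / 2.0    # multiplies (1x + j 1y), i.e. E2 = +j E1
    return a_right, a_left


def from_circular(a_right, a_left):
    """Recombine circular amplitudes into the linear pair (E1, E2)."""
    return a_right + a_left, 1j * (a_left - a_right)


def instantaneous_field(e1, e2, phase):
    """Real field components (Ex, Ey) at phase = omega*t - k*z."""
    rotor = cmath.exp(1j * phase)
    return (e1 * rotor).real, (e2 * rotor).real


if __name__ == "__main__":
    # Right-handed circular wave: the magnitude stays constant as the phase advances.
    for step in range(4):
        ex, ey = instantaneous_field(1.0, -1j, step * math.pi / 6.0)
        print(round(math.hypot(ex, ey), 12))
    # An elliptically polarized wave expressed through its circular parts.
    print(circular_components(2.0, 0.5j))
```

A purely circular wave yields one vanishing amplitude, while a general elliptical wave yields two unequal nonzero amplitudes.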

5.11

RECTILINEAR GUIDED WAVES

Many structures, such as two-wire lines, coaxial cables, waveguides, and dielectric
rods, have been found to possess the property of being able to guide electromagnetic
waves from one point to another. When this guiding occurs along a straight-line path,
the problem is amenable to analysis, for then every component of the electromagnetic
wave may be represented in the form
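$$f(u,v)\,e^{j(\omega t - k_z z)} \tag{5.122}$$

in which $z$ is chosen as the propagation direction and $u$, $v$ are generalized orthogonal
coordinates in a transverse plane. (See Mathematical Supplement, Section V.11.)
Under this assumption, in a source-free region, Maxwell's equations become

$$\begin{aligned}
\frac{1}{h_2}\frac{\partial E_z}{\partial v} + jk_z E_v &= -j\omega B_u \\
-jk_z E_u - \frac{1}{h_1}\frac{\partial E_z}{\partial u} &= -j\omega B_v \\
\frac{1}{h_1h_2}\left[\frac{\partial}{\partial u}(h_2E_v) - \frac{\partial}{\partial v}(h_1E_u)\right] &= -j\omega B_z
\end{aligned} \tag{5.123}$$

wherein $h_1$ and $h_2$ are the scale factors associated with $u$ and $v$.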
These equations can be solved for the transverse field components, yielding

(5.124)

in which $k^2 = \omega^2\mu_0\epsilon_0 = (\omega/c)^2 = (2\pi/\lambda)^2$ is the square of the free-space wave number.
Equations (5.124) indicate that, in general, the entire electromagnetic field can be
determined from knowledge of the longitudinal components.
An important exception to this occurs when the field is propagating in the Z direction at the velocity of light, for then $k_z = k$ and Equations (5.124) have a pole, unless
$E_z = H_z \equiv 0$. Once again the conclusion is reached that electromagnetic waves propagating in free space at the velocity of light are transverse.
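If this case ($k_z = k$) is pursued further, with the longitudinal field components zero,
Maxwell's equations (5.123) give

$$E_u = cB_v \qquad E_v = -cB_u \tag{5.125}$$

so that

$$\mathbf{E}\cdot\mathbf{B} = (\mathbf{1}_u cB_v - \mathbf{1}_v cB_u)\cdot(\mathbf{1}_u B_u + \mathbf{1}_v B_v) \equiv 0 \tag{5.126}$$

and thus the transverse electric and magnetic fields are orthogonal. In addition, if one
writes

$$\mathbf{E} = \boldsymbol{\mathcal{E}}(u,v)\,e^{j(\omega t - kz)} \qquad \mathbf{B} = \boldsymbol{\mathcal{B}}(u,v)\,e^{j(\omega t - kz)} \tag{5.127}$$

with the implication being that $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{B}}$ are transverse two-dimensional static fields,
then

$$\nabla\times\boldsymbol{\mathcal{E}} = \begin{vmatrix} \dfrac{\mathbf{1}_u}{h_2} & \dfrac{\mathbf{1}_v}{h_1} & \dfrac{\mathbf{1}_z}{h_1h_2} \\[2mm] \dfrac{\partial}{\partial u} & \dfrac{\partial}{\partial v} & \dfrac{\partial}{\partial z} \\[2mm] h_1\mathcal{E}_u & h_2\mathcal{E}_v & 0 \end{vmatrix} = 0 \tag{5.128}$$
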
Therefore $\boldsymbol{\mathcal{E}}$ may be expressed as the negative gradient of an electrostatic potential
function. For this reason, if any two-dimensional static electric field, such as those
found in Chapter 3, is put in motion at the velocity of light, the result is a valid dynamic
solution to Maxwell's equations. The rich storehouse of solved two-dimensional electrostatic problems is thus available for consideration in the creation of rectilinear guided
waves.
Furthermore, since $\boldsymbol{\mathcal{B}}$ is orthogonal to $\boldsymbol{\mathcal{E}}$, it follows that the flux lines of $\boldsymbol{\mathcal{B}}$ lie in the
equipotential surfaces of any two-dimensional electrostatic problem. Field maps such
as those in Example 3.29 may be viewed as giving an electrostatic field and its equipotentials, or alternatively, as the transverse electric and magnetic fields of a propagating wave.

EXAMPLE 5.8

In Example 3.14, the image principle was used to determine the electrostatic potential
distribution due to two infinitely long, parallel tubular conductors, each of diameter $2a$
and center-to-center spacing $D$, when the upper cylinder contained a net charge $+\lambda$ coul/m
and the lower cylinder contained a net charge $-\lambda$ coul/m. With the coordinate system
arranged as in the figure (see next page), the potential was given by

$$\Phi(x,y) = \frac{\lambda}{4\pi\epsilon_0}\,\ln\left\{\frac{x^2 + \left[y + \tfrac{1}{2}(D^2 - 4a^2)^{1/2}\right]^2}{x^2 + \left[y - \tfrac{1}{2}(D^2 - 4a^2)^{1/2}\right]^2}\right\}$$

When the coordinates of a point on the upper conductor were inserted in this expression,
the potential of the upper conductor was deduced. When the same thing was done for the
lower conductor, the potential difference was found to be

$$V = \frac{\lambda}{\pi\epsilon_0}\,\ln\left\{\frac{D}{2a} + \left[\left(\frac{D}{2a}\right)^2 - 1\right]^{1/2}\right\}$$

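The electrostatic field caused by this static charge distribution may be determined from
$\boldsymbol{\mathcal{E}} = -\nabla\Phi$, giving

$$\boldsymbol{\mathcal{E}}(x,y) = \frac{V\,\mathbf{f}(x,y)}{2\ln\left\{D/2a + \left[(D/2a)^2 - 1\right]^{1/2}\right\}}$$

in which

$$\mathbf{f}(x,y) = \frac{\mathbf{1}_x x + \mathbf{1}_y\left[y - \tfrac{1}{2}(D^2 - 4a^2)^{1/2}\right]}{x^2 + \left[y - \tfrac{1}{2}(D^2 - 4a^2)^{1/2}\right]^2} - \frac{\mathbf{1}_x x + \mathbf{1}_y\left[y + \tfrac{1}{2}(D^2 - 4a^2)^{1/2}\right]}{x^2 + \left[y + \tfrac{1}{2}(D^2 - 4a^2)^{1/2}\right]^2}$$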

If this static electric field is put in motion along the cylinders at the velocity of light
(this assumes the cylinders are perfectly conducting and in a free-space environment),

then the electric field distribution becomes

$$\mathbf{E}(x,y,z,t) = \boldsymbol{\mathcal{E}}(x,y)\,e^{j(\omega t - kz)} = \frac{\mathbf{f}(x,y)}{2\ln\left\{D/2a + \left[(D/2a)^2 - 1\right]^{1/2}\right\}}\,V e^{j(\omega t - kz)}$$

and a voltage wave can be imagined to travel along the twin cylinders in conjunction with
the electric field.
The accompanying magnetic field may be deduced from Maxwell's equations (5.123) or
from (5.125) and is given by

Using the integral form (5.28b) of Maxwell's second equation, and taking a contour in the
XY plane which coincides with the perimeter of the upper conductor, one is able to determine the current flow in the upper conductor. Since $\dot{\mathbf{D}} \equiv 0$ within a perfect conductor,
the result is that

$$I_{\text{enclosed}} = I e^{j(\omega t - kz)} = \oint \mathbf{H}\cdot d\mathbf{l} = \frac{V e^{j(\omega t - kz)}}{\dfrac{1}{\pi}\sqrt{\dfrac{\mu_0}{\epsilon_0}}\,\ln\left\{D/2a + \left[(D/2a)^2 - 1\right]^{1/2}\right\}}$$

so that the complex current amplitude, $I$, is linearly proportional to the complex voltage
amplitude, $V$. The current in the upper conductor also is seen to be a wave; a counter
current flows in the lower conductor. This two-conductor system, which guides the electromagnetic wave rectilinearly, is called a two-wire transmission line.
The ratio $V/I$ is of some interest, and is called the characteristic impedance of the
transmission line. It is given by

$$Z = \frac{V}{I} = 120\,\ln\left\{\frac{D}{2a} + \left[\left(\frac{D}{2a}\right)^2 - 1\right]^{1/2}\right\}\ \text{ohms}$$

in which the numerical values of $\mu_0$ and $\epsilon_0$ have been inserted. $Z$ is seen to be pure real
and to have a value governed by the geometry of the twin conductors. If a finite length
of this two-wire line is terminated by a lumped resistor of value $R = Z$, a wave traveling
along the wires, upon reaching the resistive termination, will be totally absorbed, since the
voltage-to-current ratio in the resistor is exactly the value required by the wave. If $R \neq Z$,
there must be a reflection.
This procedure may be repeated for a variety of transmission line geometries. Several
cases are included among the problems at the end of this chapter.
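As a numerical illustration of the characteristic impedance formula of this example, the sketch below evaluates $Z = (1/\pi)\sqrt{\mu_0/\epsilon_0}\,\ln\{D/2a + [(D/2a)^2 - 1]^{1/2}\}$ for given conductor dimensions. The function name `two_wire_impedance` and the sample dimensions are assumptions chosen for illustration; perfect conductors in free space are assumed, as in the example.

```python
import math

MU0 = 4.0e-7 * math.pi       # permeability of free space, H/m (classical value)
EPS0 = 8.8541878128e-12      # permittivity of free space, F/m


def two_wire_impedance(spacing, radius):
    """Characteristic impedance in ohms of a two-wire line in free space.

    spacing -- center-to-center distance D between the conductors
    radius  -- conductor radius a (same units as spacing)
    """
    ratio = spacing / (2.0 * radius)
    if ratio <= 1.0:
        raise ValueError("conductors would touch or overlap: need D > 2a")
    eta = math.sqrt(MU0 / EPS0)  # about 376.7 ohms, so eta/pi is about 120
    return (eta / math.pi) * math.log(ratio + math.sqrt(ratio * ratio - 1.0))


if __name__ == "__main__":
    # Hypothetical line: 1 mm diameter wires spaced 10 mm apart.
    print(two_wire_impedance(10.0e-3, 0.5e-3), "ohms")
```

The logarithmic factor is the inverse hyperbolic cosine of $D/2a$, so the impedance rises only slowly as the conductors are moved apart.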

Returning to Equations (5.124), if $k_z \neq k$, one can use these equations to find the
transverse-field components if the longitudinal components are known. Since these
equations are linear functions of $E_z$ and $B_z$, partial transverse fields due to $E_z$ alone
may be determined by setting $B_z \equiv 0$. Such fields are called transverse magnetic, or
more briefly TM waves. Similarly, a second set of partial transverse fields due to $B_z$
alone may be determined by setting $E_z \equiv 0$. These fields are called transverse electric,
or TE waves. The most general solution is then an arbitrary sum of the two sets of
partial fields.
Because the spatial derivatives of $\mathbf{1}_u$ and $\mathbf{1}_v$ are transverse, the vector wave equation
(5.102) has a separable Z component (cf. Mathematical Supplement, Section V.16)
which may be written

$$\frac{1}{h_1h_2}\left(\frac{\partial}{\partial u}\frac{h_2}{h_1}\frac{\partial}{\partial u} + \frac{\partial}{\partial v}\frac{h_1}{h_2}\frac{\partial}{\partial v}\right)E_z + (k^2 - k_z^2)E_z = 0 \tag{5.129a}$$

$$\frac{1}{h_1h_2}\left(\frac{\partial}{\partial u}\frac{h_2}{h_1}\frac{\partial}{\partial u} + \frac{\partial}{\partial v}\frac{h_1}{h_2}\frac{\partial}{\partial v}\right)B_z + (k^2 - k_z^2)B_z = 0 \tag{5.129b}$$

Solutions to (5.129) which fit the boundary conditions of the problem under consideration may be inserted in (5.124) to determine the transverse-field components, thereby
completing the description of the electromagnetic waves which are traveling along the
guiding structure.
EXAMPLE 5.9

A rectangular waveguide consists of a hollow pipe, usually made of good conductor, with a
rectangular cross section, as shown in the figure. This waveguide will support both TM and
TE waves, as may be seen by the following argument:
In Cartesian coordinates, (5.129a) reduces to

$$\frac{\partial^2 E_z}{\partial x^2} + \frac{\partial^2 E_z}{\partial y^2} + (k^2 - k_z^2)E_z = 0$$

[Figure: rectangular waveguide with interior width a along the X axis and height b along the Y axis; the Z axis lies along the guide.]

If the walls are good conductors, $E_z \approx 0$ against each wall. If this were not so, the currents
induced in the walls would be so high as not to match properly the tangential magnetic
field. This condition will be modeled by choosing as boundary conditions $E_z \equiv 0$ in each of
the four walls. Then the suitable primitive solution of the above wave equation is

$$E_z = \frac{2}{\sqrt{ab}}\,\sin\frac{m\pi x}{a}\,\sin\frac{n\pi y}{b}\,e^{j(\omega t - k_z z)} \tag{5.130}$$

in which $m$ and $n$ are independent positive integers (greater than zero). $E_z$, as given by
(5.130), and the four transverse-field components associated with it, determinable from
(5.124), together are called the TM$_{mn}$ mode for rectangular waveguide. In (5.130) the
factor $2/\sqrt{ab}$ is included for normalization purposes, such that

$$\int_0^b\!\!\int_0^a \psi_{mn}\,\psi_{rs}\,dx\,dy = \delta_{mn}^{rs}$$

wherein

$$\psi_{mn} = \frac{2}{\sqrt{ab}}\,\sin\frac{m\pi x}{a}\,\sin\frac{n\pi y}{b} \tag{5.131}$$

and the Kronecker delta, $\delta_{mn}^{rs}$, equals unity if $r = m$ and $s = n$, being otherwise zero.
Upon substituting (5.130) in the wave equation, one finds that $k_z = \beta_{mn}$ where

$$\beta_{mn} = \left[k^2 - \left(\frac{m\pi}{a}\right)^2 - \left(\frac{n\pi}{b}\right)^2\right]^{1/2} \tag{5.132}$$

and this mode will propagate only if

$$k^2 > \left(\frac{m\pi}{a}\right)^2 + \left(\frac{n\pi}{b}\right)^2$$

If this condition is not met, the mode will attenuate exponentially. For given interior
dimensions $(a,b)$, the higher the values of $m$ and $n$, the shorter must be the free-space
wavelength $\lambda$ in order to achieve propagation.
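The propagation condition can be made concrete with a short calculation. The sketch below assumes the standard separation result $\beta_{mn} = [k^2 - (m\pi/a)^2 - (n\pi/b)^2]^{1/2}$ with $k = \omega/c$, and returns a negative-imaginary $\beta_{mn}$ below cutoff so that $e^{-j\beta_{mn}z}$ decays with $z$. The function names and the sample interior dimensions (22.86 mm by 10.16 mm) are illustrative assumptions, not values from the text.

```python
import math

C = 299_792_458.0  # velocity of light in free space, m/s


def cutoff_frequency(m, n, a, b):
    """Lowest frequency in Hz at which the (m, n) mode propagates."""
    return 0.5 * C * math.sqrt((m / a) ** 2 + (n / b) ** 2)


def propagation_constant(m, n, a, b, frequency):
    """beta_mn as a complex number: real above cutoff, -j*alpha below it."""
    k = 2.0 * math.pi * frequency / C
    kc_squared = (m * math.pi / a) ** 2 + (n * math.pi / b) ** 2
    difference = k * k - kc_squared
    if difference >= 0.0:
        return complex(math.sqrt(difference), 0.0)
    # Below cutoff: choose the root that makes exp(-j*beta*z) decay for z > 0.
    return complex(0.0, -math.sqrt(-difference))


if __name__ == "__main__":
    a, b = 22.86e-3, 10.16e-3   # hypothetical interior dimensions in meters
    print(cutoff_frequency(1, 1, a, b))
    print(propagation_constant(1, 1, a, b, 10.0e9))   # attenuating
    print(propagation_constant(1, 1, a, b, 20.0e9))   # propagating
```

For these dimensions the mode with $m = n = 1$ is cut off at 10 GHz and propagates at 20 GHz, in line with the statement that higher mode indices require shorter free-space wavelengths.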
The most general solution for $E_z$ consists of a linear superposition of terms like (5.130)
for all possible values of $m$ and $n$, that is,

$$E_z = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty} K_{mn}\,\psi_{mn}(x,y)\,e^{j(\omega t - \beta_{mn}z)} \tag{5.133}$$

in which the $K_{mn}$ are arbitrary complex constants. Equations (5.124) may be used to
obtain expressions for the four transverse-field components associated with (5.133). The
resulting collection of five equations describes the most general combination of TM modes
traveling in the positive Z direction in a rectangular waveguide. Reversing the sign before
$\beta_{mn}$ will give a similar solution for propagation in the negative Z direction.
In like manner, in Cartesian coordinates (5.129b) reduces to

$$\frac{\partial^2 B_z}{\partial x^2} + \frac{\partial^2 B_z}{\partial y^2} + (k^2 - k_z^2)B_z = 0$$

In order to satisfy the boundary conditions that $E_x \equiv 0$ in the top and bottom walls, and
that $E_y \equiv 0$ in the side walls, the suitable primitive solution of this wave equation is

$$B_z = \Psi_{mn}(x,y)\,e^{j(\omega t - \beta_{mn}z)} \tag{5.134}$$

in which

$$\Psi_{mn} = \frac{2}{\sqrt{ab}}\,\cos\frac{m\pi x}{a}\,\cos\frac{n\pi y}{b} \tag{5.135}$$


and $m$ and $n$ are independent positive integers. (One or the other can be zero, but not
both.) $\beta_{mn}$ once again is given by (5.132). $B_z$, as given by (5.134), and the four transverse-field components associated with it, determinable from (5.124), are together called the
TE$_{mn}$ mode for rectangular waveguide.
The most general solution for $B_z$ consists of a linear superposition of terms like (5.134)
for all possible values of $m$ and $n$, that is,

$$B_z = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} K_{mn}\,\Psi_{mn}(x,y)\,e^{j(\omega t - \beta_{mn}z)} \tag{5.136}$$


with the $K_{mn}$ arbitrary complex constants. Use of Equations (5.124) will yield expressions
for the four transverse-field components associated with (5.136). The resulting collection
of five equations describes the most general combination of TE modes traveling in the positive Z direction in a rectangular waveguide. Reversing the sign before $\beta_{mn}$ will give a
similar solution for propagation in the negative Z direction.
A study of (5.132) reveals that, if $a > b$, the selection $m = 1$, $n = 0$ yields the lowest
possible value of $k$ which will permit propagation. Therefore the TE mode for which $m = 1$,
$n = 0$ (i.e., the TE$_{10}$ mode) will propagate at a lower frequency (longer free-space wavelength) than any other TE mode, and than any TM mode. This means that, at a given
frequency, it is possible to choose $a$ and $b$ such that only the TE$_{10}$ mode will propagate.
It is left as an exercise to show that to ensure this condition, $\lambda/2 < a < \lambda$, $b < \lambda/2$.
Because of this unique feature of the TE$_{10}$ mode, it is particularly useful when electromagnetic energy must be conveyed with a well-defined field distribution. From (5.136) and
(5.124), the field components of this fundamental mode are given by
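$$\begin{aligned}
E_y &= -\frac{j\omega}{\pi/a}\,K\,\sin\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)} \\
B_x &= \frac{j\beta_{10}}{\pi/a}\,K\,\sin\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)} \\
B_z &= K\,\cos\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)}
\end{aligned} \tag{5.137}$$

with $K$ an arbitrary complex constant and

$$\beta_{10} = \left[k^2 - \left(\frac{\pi}{a}\right)^2\right]^{1/2} \tag{5.138}$$
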

Inspection of Equations (5.137) reveals that the electric flux lines for the TE$_{10}$ mode
go straight across from one broad wall to another, originating on positive charge and
terminating on negative charge. The instantaneous electric flux pattern at one cross section
and charge distribution along one broad wall are shown in part (a) of the second figure.

[Figure: TE$_{10}$ mode in rectangular waveguide. (a) Electric field and charge distribution. (b) Magnetic field. (c) Current flow.]

The magnetic flux lines are closed loops which lie in planes parallel to the broad walls
and which encircle the y-directed displacement current $\dot{\mathbf{D}}$. Some of these flux lines are
shown in part (b) of the second figure. The magnetic flux density against the waveguide
walls is associated with current flow in the walls which may be deduced from the integral
form of Maxwell's second equation, (5.28b). If perfectly conducting walls are assumed,
$\dot{\mathbf{D}} \equiv 0$ within the conductor, and a lineal current density flows in the walls of amount
(5.139)
in which $\mathbf{1}_n$ is a unit vector normal to the wall and pointing into the interior of the waveguide. Part (c) of the second figure shows an instantaneous plot of some of the current lines.

5.12

SOLUTIONS TO THE WAVE EQUATION IN CYLINDRICAL COORDINATES

Equations (5.102) were used as the point of departure in deducing wavelike solutions
to the field equations in rectangular coordinates. This was a relatively direct procedure
because of the simple form taken by the Laplacian of a vector in Cartesian frames of
reference. However, reinspection of Equation (4.52) reveals that the task is not so
simple in circular cylindrical coordinates, except when dealing with an axial field component. For this reason, the approach to be adopted in the following study of cylindrical waves assumes that

$$\mathbf{E}(r,\phi,z,t) = \boldsymbol{\mathcal{E}}(r,\phi)\,e^{j\omega t - jk_z z} \tag{5.140a}$$

$$\mathbf{B}(r,\phi,z,t) = \boldsymbol{\mathcal{B}}(r,\phi)\,e^{j\omega t - jk_z z} \tag{5.140b}$$

in which $k_z$ is the wave number in the Z direction and $\omega$ is the angular frequency. By
treating $k_z$ as a parameter, traveling or standing waves in the Z direction may be
represented by appropriate linear combinations of fields of the type given by (5.140a)
and (5.140b). Use of the Fourier integral theorem will embrace a still wider variety of
physically realizable distributions within this formulation.
With E and B assumed in the form (5.140), the analysis of Section 5.11 is pertinent,
and in this coordinate geometry (5.124) gives
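$$\begin{aligned}
\mathcal{E}_r &= \frac{1}{k^2 - k_z^2}\left(-jk_z\frac{\partial\mathcal{E}_z}{\partial r} - \frac{j\omega}{r}\frac{\partial\mathcal{B}_z}{\partial\phi}\right) \\
\mathcal{E}_\phi &= \frac{1}{k^2 - k_z^2}\left(-\frac{jk_z}{r}\frac{\partial\mathcal{E}_z}{\partial\phi} + j\omega\frac{\partial\mathcal{B}_z}{\partial r}\right) \\
\mathcal{B}_r &= \frac{1}{k^2 - k_z^2}\left(-jk_z\frac{\partial\mathcal{B}_z}{\partial r} + \frac{j\omega}{c^2 r}\frac{\partial\mathcal{E}_z}{\partial\phi}\right) \\
\mathcal{B}_\phi &= \frac{1}{k^2 - k_z^2}\left(-\frac{jk_z}{r}\frac{\partial\mathcal{B}_z}{\partial\phi} - \frac{j\omega}{c^2}\frac{\partial\mathcal{E}_z}{\partial r}\right)
\end{aligned} \tag{5.141}$$
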

so that, if the longitudinal field components can be found, the transverse components
are generally determinable.
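The wave equations (5.129) are applicable and in cylindrical coordinates give

$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial\mathcal{E}_z}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2\mathcal{E}_z}{\partial\phi^2} + (k^2 - k_z^2)\,\mathcal{E}_z = 0 \qquad \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial\mathcal{B}_z}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2\mathcal{B}_z}{\partial\phi^2} + (k^2 - k_z^2)\,\mathcal{B}_z = 0 \tag{5.142}$$

By assuming either $\mathcal{E}_z$ or $\mathcal{B}_z$ to be expressible in the form $f_1(r)f_2(\phi)$, one achieves the
separation

$$r\frac{d}{dr}\left(r\frac{df_1}{dr}\right) + \left[(k^2 - k_z^2)r^2 - k_\phi^2\right]f_1 = 0 \tag{5.143}$$

$$\frac{d^2f_2}{d\phi^2} + k_\phi^2 f_2 = 0 \tag{5.144}$$

with $k_\phi^2$ a separation constant.
If $k_\phi$ is limited to the integral values $n = 0, 1, 2, \ldots$ (which provides a complete Fourier series), then letting $v = \sqrt{k^2 - k_z^2}\;r$ converts (5.143) to

$$\frac{d^2f_1}{dv^2} + \frac{1}{v}\frac{df_1}{dv} + \left(1 - \frac{n^2}{v^2}\right)f_1 = 0$$

which is seen to be identical with (3.71) and is recognized as Bessel's differential equation. This equation and its solutions were discussed in Chapter 3 and Appendix C. The
solutions may be given in many forms, including Bessel functions of the first and second
kinds, modified Bessel functions, and Hankel functions. Because of the asymptotic
forms (3.74) and (3.75), the Hankel functions are particularly convenient when representing radial waves.
By virtue of the foregoing, $E_z(r,\phi,z,t)$ and $B_z(r,\phi,z,t)$ may be composed of suitable
products of the factors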
$$f_1(v) = Z_n(v) \qquad f_2(\phi) = \{e^{\pm jn\phi}\} \qquad f_3(z) = e^{-jk_z z} \qquad f_4(t) = e^{j\omega t} \tag{5.145}$$

with $Z_n(v)$ representing suitable cylinder Bessel functions. The usual Fourier techniques may be used to generalize $f_2$, $f_3$, and $f_4$, and orthogonal expansions are also
available for $f_1$ (cf. Chapter 3).
If the axis $r = 0$ is included, unless it contains sources, $f_1(v)$ must be expressed in
terms of $J_n(v)$ alone. If only a sector in the $\phi$ direction is being considered, $n$ need not
be an integer; the corresponding Bessel functions have a non-integral index and are
unlikely to be tabulated.
An elementary wave function consists of the product of the above four factors with
$n$, $k_z$, and $k$ (or $\omega$) specified, this triplet of numbers serving as identification. More
general solutions may then be given by summing on these three indices. The specific
solutions for $E_z$ and $B_z$ will differ in the values attached to the different elementary
wave functions through imposition of the boundary conditions.
EXAMPLE 5.10

Assume that a tubular sheet of current $\mathbf{1}_z\,\jmath\,e^{j\omega t}$ amp/m flows in the cylindrical surface $r = a$
between the limits $z = \pm\infty$, with $\jmath$ a complex constant. What field does this source create
in the region $r > a$?
Because of the disposition of the currents, B must be transverse to the Z axis, and the
entire solution may be given in terms of $E_z$. By symmetry, the resultant field must be
independent of $\phi$ and $z$; thus $n = k_z = 0$. Further, at large radial distances

$$H_0^{(1),(2)}(kr)\,e^{j\omega t} \to \sqrt{\frac{2}{\pi kr}}\;e^{j(\omega t \pm kr \mp \pi/4)}$$

with the plus sign applying for $H_0^{(1)}$ and the minus sign for $H_0^{(2)}$; since this source system
causes outgoing waves, only $H_0^{(2)}$ need be selected. Therefore

$$E_z(r,t) = b_0\,H_0^{(2)}(kr)\,e^{j\omega t}$$

with $k = \omega/c$ and the constant $b_0$ yet to be determined.
Using (5.141), one finds that $E_r \equiv 0$, $E_\phi \equiv 0$, $B_r \equiv 0$, and

$$B_\phi(r,t) = \frac{j}{c}\,b_0\,H_1^{(2)}(kr)\,e^{j\omega t}$$

and the fields are given by

The ratio of the field components is

$$\frac{E_z}{B_\phi} = -jc\,\frac{H_0^{(2)}(kr)}{H_1^{(2)}(kr)}$$

and this ratio approaches $-c$ as $r \to \infty$. This same ratio also has been observed for the
components of rectangular plane waves.

The time-average power density is

$$\mathcal{P} = \tfrac{1}{2}\,\mathrm{Re}\,(\mathbf{E}\times\mathbf{H}^*) = \mathbf{1}_r\,\frac{\eta\,|\jmath|^2}{2\,|H_1^{(2)}(ka)|^2}\;\mathrm{Re}\left[\,j\,H_0^{(2)}(kr)\,H_1^{(1)}(kr)\right]$$

At large radial distances this formula may be reduced by using the asymptotic expressions
for the Hankel functions. Further simplification is possible if $ka$ is large, in which case

EXAMPLE 5.11

A circular cylindrical waveguide consists of a hollow conducting tube of inner radius $a$. Find
the general expression for transverse magnetic waves propagating axially inside this tube.
Because the axis $r = 0$ is included in the region of interest, $J_n(v)$ must be chosen as the
radial function and (5.145) gives

$$E_z = J_n(v)\begin{Bmatrix}\cos n\phi \\ \sin n\phi\end{Bmatrix} e^{j(\omega t - k_z z)}$$

If the walls are assumed to be composed of a perfect conductor, $E_z(a,\phi,z,t) \equiv 0$, which
requires that

$$J_n\!\left(\sqrt{k^2 - k_z^2}\;a\right) = 0$$

If the roots of $J_n$ are designated by $\gamma_{n1}, \gamma_{n2}, \ldots, \gamma_{nm}, \ldots$, then

$$(k^2 - k_z^2)\,a^2 = \gamma_{nm}^2$$

and the propagation constant $k_z$ may have a sequence of values given by

$$k_z = \beta_{nm} = \sqrt{k^2 - \frac{\gamma_{nm}^2}{a^2}}$$

Therefore there is a doubly infinite set of transverse magnetic modes which can exist in a
circular cylindrical waveguide, and these modes are distinguished by the indices $n$, $m$. For
the TM$_{nm}$ mode,

$$E_z = J_n\!\left(\gamma_{nm}\frac{r}{a}\right)\begin{Bmatrix}\cos n\phi \\ \sin n\phi\end{Bmatrix} e^{j(\omega t - \beta_{nm}z)}$$

with the other field components deducible from (5.141). Whether or not the TM$_{nm}$ mode
will propagate is governed by whether $\beta_{nm}$ is real or imaginary, which is determined by
whether or not $ka$ is greater than $\gamma_{nm}$.
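The roots $\gamma_{nm}$ are tabulated, but they can also be found numerically. The following is a minimal sketch using only the standard library: $J_n$ is evaluated from Bessel's integral $J_n(x) = (1/\pi)\int_0^\pi \cos(n\tau - x\sin\tau)\,d\tau$ by the trapezoidal rule, and the roots are located by scanning for sign changes and bisecting. The function names, step sizes, and the 10 mm radius are assumptions for illustration, and the scan is intended only for small orders $n$ and the first few roots.

```python
import math

C = 299_792_458.0  # velocity of light in free space, m/s


def bessel_j(n, x, steps=400):
    """J_n(x) for integer n from Bessel's integral, by the trapezoidal rule."""
    def integrand(tau):
        return math.cos(n * tau - x * math.sin(tau))

    h = math.pi / steps
    total = 0.5 * (integrand(0.0) + integrand(math.pi))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h / math.pi


def bessel_roots(n, count, step=0.1, tol=1e-10):
    """First `count` positive roots of J_n, found by scanning and bisection."""
    roots = []
    x_lo = max(float(n), 0.5)  # the first positive root of J_n lies beyond n
    f_lo = bessel_j(n, x_lo)
    while len(roots) < count:
        x_hi = x_lo + step
        f_hi = bessel_j(n, x_hi)
        if f_lo * f_hi < 0.0:
            a, b, fa = x_lo, x_hi, f_lo
            while b - a > tol:
                mid = 0.5 * (a + b)
                fm = bessel_j(n, mid)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            roots.append(0.5 * (a + b))
        x_lo, f_lo = x_hi, f_hi
    return roots


def tm_cutoff_frequency(n, m, radius):
    """Frequency in Hz at which k*a equals the m-th root of J_n."""
    gamma_nm = bessel_roots(n, m)[m - 1]
    return gamma_nm * C / (2.0 * math.pi * radius)


if __name__ == "__main__":
    print(bessel_roots(0, 2))
    print(tm_cutoff_frequency(0, 1, 10.0e-3))  # hypothetical 10 mm inner radius
```

Above the frequency returned by `tm_cutoff_frequency`, $ka$ exceeds $\gamma_{nm}$ and $\beta_{nm}$ is real, so the corresponding TM mode propagates.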

5.13

SOLUTIONS TO THE WAVE EQUATION IN SPHERICAL COORDINATES

If Equations (5.102) are used as the starting point for the deduction of wavelike solutions to the field equations in spherical coordinates, a study of Equation (4.54) indicates the difficulty of the task. All three field components are involved in all three
components of the vector wave equation and separation is possible only in a few particular situations of symmetry. Still another technique must be found for the solution
of (5.102) by indirect means.


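The approach to be followed begins with a study of the scalar wave equation

$$\nabla^2\Phi - \frac{1}{c^2}\frac{\partial^2\Phi}{\partial t^2} = 0 \tag{5.146}$$

in which $\Phi(r,\theta,\phi,t)$ is a scalar function expressed in spherical coordinates. Letting time
variations be accounted for by writing $\Phi = \Psi(r,\theta,\phi)\,e^{j\omega t}$ yields

$$(\nabla^2 + k^2)\,\Psi = 0 \tag{5.147}$$

which may be separated by assuming $\Psi = f_1(r)f_2(\theta)f_3(\phi)$. This gives

$$\frac{\sin^2\theta}{f_1}\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) + \frac{\sin\theta}{f_2}\frac{d}{d\theta}\left(\sin\theta\frac{df_2}{d\theta}\right) + k^2r^2\sin^2\theta + \frac{1}{f_3}\frac{d^2f_3}{d\phi^2} = 0$$

which breaks into the pieces

$$\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) + \left[k^2r^2 - n(n+1)\right]f_1 = 0 \tag{5.148}$$

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{df_2}{d\theta}\right) + \left[n(n+1) - \frac{m^2}{\sin^2\theta}\right]f_2 = 0 \tag{5.149}$$

$$\frac{d^2f_3}{d\phi^2} + m^2 f_3 = 0 \tag{5.150}$$
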

with $m$ and $n$ separation constants. If the field is to be single-valued and a complete
azimuthal region is being considered, $m$ must be an integer; this choice for $m$ also gives
a complete Fourier series representation to $f_3$.
Equation (5.149) was encountered in Section 3.13 and is recognized as being the
associated Legendre equation. If the axis $\theta = 0, \pi$ is to be included, it has finite solutions
$P_n^m(\cos\theta)$ whose properties are described in Appendix D. Restricting $n$ to zero and the
positive integers provides a complete orthogonal set of functions from which to construct $f_2$.
If the substitution $f_1 = (kr)^{-1/2}F(r)$ is made, Equation (5.148) may be transformed to

(5.151)
which may be recognized as Bessel's differential equation, being in the same form as
(3.64). It gives rise to the solutions
(5.152)
in which Zn+~'l(kr) is a cylinder function of half-order, i.e., an appropriate choice
among Bessel functions of the first and second kinds, Hankel functions, etc. These
functions have the same asymptotic behavior, recurrence relations, and orthogonal
properties as the cylinder functions of whole order (cf. Appendix C).
It is customary to define a spherical Bessel function by the notation
(5.153)
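The relation between spherical Bessel functions and half-order cylinder functions can be checked numerically. The following is a minimal sketch, assuming the common normalization $j_n(x) = (\pi/2x)^{1/2}J_{n+1/2}(x)$ and the standard upward recurrence; the function names `spherical_jn` and `bessel_half_order` are illustrative and do not come from the text.

```python
import math

def spherical_jn(n, x):
    """Spherical Bessel function of the first kind, j_n(x), for x > 0.

    Starts from the closed forms of j_0 and j_1 and applies the upward
    recurrence j_{m+1}(x) = (2m + 1)/x * j_m(x) - j_{m-1}(x).
    Upward recurrence loses accuracy when n is much larger than x.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    j_prev = math.sin(x) / x                        # j_0
    if n == 0:
        return j_prev
    j_curr = math.sin(x) / x**2 - math.cos(x) / x   # j_1
    for m in range(1, n):
        j_prev, j_curr = j_curr, (2 * m + 1) / x * j_curr - j_prev
    return j_curr

def bessel_half_order(n, x):
    """Cylinder function J_{n+1/2}(x), recovered from j_n(x) (assumed normalization)."""
    return math.sqrt(2.0 * x / math.pi) * spherical_jn(n, x)
```

For n = 0 this reproduces the familiar closed form $J_{1/2}(x) = (2/\pi x)^{1/2}\sin x$, which is a convenient sanity check on the normalization.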


and with this terminology one may construct solutions for $\Psi$ by forming products of the partial solutions

$$f_1(r) = z_n(kr) \qquad f_2(\theta) = P_n^m(\cos\theta) \qquad f_3(\phi) = e^{jm\phi} \qquad (5.154)$$

Next consider the vector function

$$\mathbf{G} = \mathbf{1}_r\,r \times \nabla\Psi = -\nabla \times (\mathbf{r}\Psi) \qquad (5.155)$$

in which $\Psi$ is a solution to the scalar wave equation (5.147). It is shown in Appendix I that G satisfies (5.102) and is therefore an appropriate solution for either E or B. By itself G cannot represent a general field, since it has no component in the r direction. However, if $\Psi_1$ and $\Psi_2$ are two independent solutions to the scalar wave equation, then a general solution may be constructed by choosing

$$\mathbf{E}_1 = \mathbf{1}_r\,r \times \nabla\Psi_1 \qquad \mathbf{B}_1 = -\frac{1}{j\omega}\nabla \times \mathbf{E}_1$$

$$\mathbf{B}_2 = \mathbf{1}_r\,r \times \nabla\Psi_2 \qquad \mathbf{E}_2 = \frac{c^2}{j\omega}\nabla \times \mathbf{B}_2$$

with the total field given by $\mathbf{E} = \mathbf{E}_1 + \mathbf{E}_2$, $\mathbf{B} = \mathbf{B}_1 + \mathbf{B}_2$. In this manner the total field is expressed as the sum of two partial fields, one of which is TE with respect to the radial direction, the other being TM. Since $\Psi_1$ and $\Psi_2$ are expressible in terms of complete sets of orthogonal functions, this is a broadly useful representation.
EXAMPLE 5.12

A spherical cavity consists of a conducting shell of inner radius a. Find the expressions for those resonant fields in this cavity which are without an $E_r$ component.

For such fields, $\Psi_1$ must be of the form

$$\Psi_1 = j_n(kr)\,P_n^m(\cos\theta)\begin{Bmatrix}\cos m\phi \\ \sin m\phi\end{Bmatrix} e^{j\omega t}$$

in which $j_n$ has been chosen for the Bessel function to ensure regularity at $r = 0$. Then

$$E_\theta = -\frac{1}{\sin\theta}\frac{\partial\Psi_1}{\partial\phi} \qquad E_\phi = \frac{\partial\Psi_1}{\partial\theta}$$

$$B_r = \frac{n(n+1)}{j\omega r}\Psi_1 \qquad B_\theta = \frac{1}{j\omega r}\frac{\partial}{\partial r}\left(r\frac{\partial\Psi_1}{\partial\theta}\right) \qquad B_\phi = \frac{1}{j\omega r\sin\theta}\frac{\partial}{\partial r}\left(r\frac{\partial\Psi_1}{\partial\phi}\right)$$

and, if a perfect conductor is assumed, $E_\theta$ and $E_\phi$ must vanish identically at $r = a$, which requires that

$$j_n(ka) = 0$$

If $\gamma_{nm}$ is the mth root of $j_n$, such that $j_n(\gamma_{nm}) = 0$, then k has the allowed values

$$k = \frac{\gamma_{nm}}{a}$$

which determines the resonant frequencies for the TE modes in the cavity.
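The allowed values of k for the TE modes can be found numerically by locating the zeros of $j_n$. The sketch below is illustrative only: it assumes the free-space relation $\omega = kc$, uses a simple scan-and-bisect root finder, and is intended for small n (the upward recurrence it relies on degrades for large n at small arguments). The names `roots_of_jn` and `te_resonant_frequency` are hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def spherical_jn(n, x):
    """Spherical Bessel function j_n(x) for x > 0 via upward recurrence."""
    j_prev = math.sin(x) / x
    if n == 0:
        return j_prev
    j_curr = math.sin(x) / x**2 - math.cos(x) / x
    for m in range(1, n):
        j_prev, j_curr = j_curr, (2 * m + 1) / x * j_curr - j_prev
    return j_curr

def roots_of_jn(n, count, start=0.5, step=0.01):
    """Return the first `count` positive zeros of j_n (scan, then bisect)."""
    roots = []
    x = start
    f_left = spherical_jn(n, x)
    while len(roots) < count:
        f_right = spherical_jn(n, x + step)
        if f_left * f_right < 0.0:          # sign change brackets a zero
            lo, hi, f_lo = x, x + step, f_left
            for _ in range(60):             # bisection refinement
                mid = 0.5 * (lo + hi)
                f_mid = spherical_jn(n, mid)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
        x, f_left = x + step, f_right
    return roots

def te_resonant_frequency(gamma, radius_m):
    """Resonant frequency in Hz for k = gamma / a, assuming omega = k * c."""
    return C0 * gamma / (2.0 * math.pi * radius_m)

if __name__ == "__main__":
    gamma_11 = roots_of_jn(1, 1)[0]
    print(gamma_11, te_resonant_frequency(gamma_11, 0.10))
```

The zeros of $j_1$ coincide with the positive solutions of $\tan x = x$, which gives an independent check on the root finder.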

5.14

INDUCTANCE

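The results of Section 5.8, concerned with magnetic stored energy, may be expressed in an alternative manner. When the vector identity (V.108) is employed, if A is the magnetic vector potential function, such that $\mathbf{B} = \nabla \times \mathbf{A}$, then

$$\nabla \cdot (\mathbf{A} \times \mathbf{B}) = \mathbf{B} \cdot \nabla \times \mathbf{A} - \mathbf{A} \cdot \nabla \times \mathbf{B} = \mathbf{B} \cdot \mathbf{B} - \mathbf{A} \cdot \nabla \times \mathbf{B}$$

and (5.80) may be written

$$W_m = \tfrac{1}{2}\mu_0^{-1}\int_V \nabla \cdot (\mathbf{A} \times \mathbf{B})\,dV + \tfrac{1}{2}\mu_0^{-1}\int_V \mathbf{A} \cdot \nabla \times \mathbf{B}\,dV$$

$$\phantom{W_m} = -\tfrac{1}{2}\mu_0^{-1}\oint_S (\mathbf{A} \times \mathbf{B}) \cdot d\mathbf{S} + \tfrac{1}{2}\mu_0^{-1}\int_V \mathbf{A} \cdot \nabla \times \mathbf{B}\,dV$$

But S may be taken as a sphere at infinity, and since A decreases as $r^{-1}$ and B decreases as $r^{-2}$, the surface integral is seen to vanish. Therefore

$$W_m = \tfrac{1}{2}\mu_0^{-1}\int_V \mathbf{A} \cdot \nabla \times \mathbf{B}\,dV \qquad (5.156)$$

For static magnetic fields, $\nabla \times \mathbf{B} = \boldsymbol{\iota}/\mu_0^{-1}$ and

$$\mathbf{A} = \int_{V'} \frac{\boldsymbol{\iota}'\,dV'}{4\pi\mu_0^{-1}r} \qquad (5.157)$$

wherein primes are used so as to be able to distinguish between contributions to the integrals in (5.156) and (5.157). Thus the stored magnetic energy may also be expressed by

$$W_m = \frac{1}{2}\int_V\int_{V'} \frac{\boldsymbol{\iota} \cdot \boldsymbol{\iota}'}{4\pi\mu_0^{-1}r}\,dV\,dV' \qquad (5.158)$$

in which r is the distance between the volume elements dV and dV', and the integration is to be performed twice throughout all of space containing current elements. This development should be compared to the similar analysis presented for electrostatic energy in Section 3.19; expressions (3.159) and (5.158) are seen to be completely analogous.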
Equation (5.158) may be applied to current systems of which the prototype is indicated by Figure 5.4. Shown are two distinct circuital volumes $V_1$ and $V_2$ which contain steady current distributions, the rest of space being source-free. If $I_1$ is the total current flow at some reference cross section in $V_1$, and if similarly $I_2$ is the total current flow at some reference cross section in $V_2$, it will often occur that the current density at any point in $V_1$ is linearly proportional to $I_1$, whereas the current density at any point in $V_2$ is linearly proportional to $I_2$. In such cases,

$$\boldsymbol{\iota}(\xi,\eta,\zeta) = I_1\,\mathbf{f}_1(\xi,\eta,\zeta)$$

for any point $(\xi,\eta,\zeta)$ in $V_1$, and

$$\boldsymbol{\iota}(\xi,\eta,\zeta) = I_2\,\mathbf{f}_2(\xi,\eta,\zeta)$$

for any point $(\xi,\eta,\zeta)$ in $V_2$, with $\mathbf{f}_1$ and $\mathbf{f}_2$ functions which give the normalized current intensities.

FIGURE 5.4 Self- and mutual inductance.

Under these conditions, (5.158) may be written

$$W_m = \frac{I_1^2}{2}\int_{V_1}\int_{V_1'} \frac{\mathbf{f}_1 \cdot \mathbf{f}_1'}{4\pi\mu_0^{-1}r}\,dV\,dV' + I_1I_2\int_{V_1}\int_{V_2'} \frac{\mathbf{f}_1 \cdot \mathbf{f}_2'}{4\pi\mu_0^{-1}r}\,dV\,dV' + \frac{I_2^2}{2}\int_{V_2}\int_{V_2'} \frac{\mathbf{f}_2 \cdot \mathbf{f}_2'}{4\pi\mu_0^{-1}r}\,dV\,dV' \qquad (5.159)$$

wherein $\mathbf{f}_1'$ implies $\mathbf{f}_1(\xi',\eta',\zeta')$, etc. The integrals appearing in (5.159) depend only on the normalized distributions of the two systems of currents, and for a given conductor configuration are constants, so that one may write

$$W_m = \tfrac{1}{2}L_{11}I_1^2 + M_{12}I_1I_2 + \tfrac{1}{2}L_{22}I_2^2 \qquad (5.160)$$

in which $L_{11}$ and $L_{22}$ are called self-inductances and $M_{12}$ is called a mutual inductance, their units being given the name henries. This development can be extended readily to situations in which the volumes $V_1$ and $V_2$ overlap and/or in which there is any number of separately identifiable volumes containing current systems.

EXAMPLE 5.13

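Find the mutual inductance between two coaxial, coplanar filamentary loops of radii a and b, as shown in the figure. Assume $b \gg a$.

From (5.159),

$$M = \int_{V_1}\int_{V_2'} \frac{\mathbf{f}_1 \cdot \mathbf{f}_2'}{4\pi\mu_0^{-1}r}\,dV\,dV'$$

In this case, the current densities may be taken as uniform over the cross-sectional areas $S_1$ and $S_2$ of the inner and outer loops. Then

$$\mathbf{f}_1 = \frac{\mathbf{1}_\phi}{S_1} \qquad \mathbf{f}_2' = \frac{\mathbf{1}_{\phi'}}{S_2}$$

and

$$M = \oint_{C_1}\oint_{C_2} \frac{d\boldsymbol{\ell}_1 \cdot d\boldsymbol{\ell}_2'}{4\pi\mu_0^{-1}r}$$

in which $C_1$ and $C_2$ are the median contours of the two loops.

[Figure: coplanar loops in the XY plane, with the angle $\phi'$ measured from the X axis.]

Upon first considering that element $d\boldsymbol{\ell}_1$ which is on the X axis, one sees that by symmetry, the elements $d\boldsymbol{\ell}_2'$ may be taken in pairs symmetrically disposed with respect to the X axis such that if

$$\mathbf{1}_{\phi'}\,d\ell_2' = -\mathbf{1}_x\,b\sin\phi'\,d\phi' + \mathbf{1}_y\,b\cos\phi'\,d\phi'$$

then the X components may be discarded. This gives

$$M = \oint_{C_1} d\ell_1 \int_0^{\pi} \frac{2b\cos\phi'\,d\phi'}{4\pi\mu_0^{-1}r}$$

With $b \gg a$, $r^{-1} \approx b^{-1} + a\cos\phi'/b^2$ and thus

$$M \approx \oint_{C_1} d\ell_1 \int_0^{\pi} \frac{2ab\cos^2\phi'\,d\phi'}{4\pi\mu_0^{-1}b^2} = \frac{\pi a^2}{2\mu_0^{-1}b}$$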
EXAMPLE 5.14

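A wire of circular cross section whose radius is b is bent into the form of a closed circular ring of mean radius a. Find its self-inductance if $a \gg b$.

If one again makes use of (5.159) and (5.160), then

$$L = \int_V\int_{V'} \frac{\mathbf{f}_1 \cdot \mathbf{f}_1'}{4\pi\mu_0^{-1}r}\,dV\,dV'$$

The geometry identifying the two volume elements dV and dV' is shown in the figures,

[Figures: cross section at $\theta'$; cross section at $\theta' + \theta$.]

from which it follows that

$$\mathbf{f}_1 \cdot \mathbf{f}_1' = f_1f_1'\cos\theta \qquad dV = (a\,d\theta)(\rho\,d\phi\,d\rho) \qquad dV' = (a\,d\theta')(\rho'\,d\phi'\,d\rho')$$

Upon letting $\theta = 2\eta + \pi$ one finds that

$$r^2 = \left\{[2a + \rho\cos(\phi + \phi') + \rho'\cos\phi']^2 + [\rho\sin(\phi + \phi') - \rho'\sin\phi']^2\right\}(1 - k^2\sin^2\eta) \approx 4a^2(1 - k^2\sin^2\eta)$$

in which

$$k^2 = \frac{4(a + \rho'\cos\phi')[a + \rho\cos(\phi + \phi')]}{[\rho\sin(\phi + \phi') - \rho'\sin\phi']^2 + [2a + \rho\cos(\phi + \phi') + \rho'\cos\phi']^2}$$

so that

$$1 - k^2 \approx \frac{\rho^2 + (\rho')^2 - 2\rho\rho'\cos\phi}{4a^2}$$

Let it be assumed that $f_1$ is a function of $\rho$ but not of $\theta$ nor $\phi$. Then

$$L = \frac{a^2}{4\pi\mu_0^{-1}}\int \rho f_1(\rho)\,d\rho \int \rho' f_1(\rho')\,d\rho' \int_0^{2\pi} d\phi \int_0^{2\pi} d\phi' \int_0^{2\pi} d\theta' \int_0^{2\pi} \frac{\cos\theta}{r}\,d\theta$$

$$\phantom{L} = \frac{2\pi a}{4\pi\mu_0^{-1}}\int \rho f_1(\rho)\,d\rho \int \rho' f_1(\rho')\,d\rho' \int_0^{2\pi} d\phi \int_0^{2\pi} d\phi' \int_{-\pi/2}^{\pi/2} \frac{(2\sin^2\eta - 1)\,d\eta}{(1 - k^2\sin^2\eta)^{1/2}}$$

$$\phantom{L} = \frac{2\pi a}{4\pi\mu_0^{-1}}\int \rho f_1(\rho)\,d\rho \int \rho' f_1(\rho')\,d\rho' \int_0^{2\pi} d\phi' \int_0^{2\pi} 2\left[-\frac{2}{k^2}E(k) + \left(\frac{2}{k^2} - 1\right)K(k)\right] d\phi$$

wherein

$$E(k) = \int_0^{\pi/2} (1 - k^2\sin^2\eta)^{1/2}\,d\eta \qquad K(k) = \int_0^{\pi/2} \frac{d\eta}{(1 - k^2\sin^2\eta)^{1/2}}$$

are elliptic integrals. Since $a \gg b$, it follows that $k^2 \approx 1$ and

$$E(k) \approx \int_0^{\pi/2} \sqrt{1 - \sin^2\eta}\,d\eta = \int_0^{\pi/2} \cos\eta\,d\eta = 1$$

To find K(k), let $\beta = \pi/2 - \eta$ and then

$$K(k) = \int_0^{\beta_0} \frac{d\beta}{\sqrt{\sin^2\beta + \cos^2\beta - k^2\cos^2\beta}} + \int_{\beta_0}^{\pi/2} \frac{d\beta}{\sqrt{1 - k^2\cos^2\beta}}$$

with $\sqrt{1 - k^2} \ll \beta_0 \ll 1$. This division of the integral into two parts is done so that $\sin\beta$ may be approximated by $\beta$ when $\beta \le \beta_0$. If one places $\sin\beta = \beta$, $\cos\beta = 1$ in the first integral, and $k = 1$ in the second, the result is that

$$K(k) \approx \ln\frac{4}{\sqrt{1 - k^2}}$$

and therefore

$$\int_0^{2\pi}\left[-\frac{2}{k^2}E(k) + \left(\frac{2}{k^2} - 1\right)K(k)\right] d\phi \approx \int_0^{2\pi}\left(\ln 4 - 2 - \ln\sqrt{1 - k^2}\right) d\phi = 2\pi(\ln 4 - 2) - \int_0^{2\pi} \ln\sqrt{\frac{\rho^2 + (\rho')^2 - 2\rho\rho'\cos\phi}{4a^2}}\,d\phi$$

which means

$$d^2L = \frac{4\pi^2a}{\mu_0^{-1}}\left(\ln\frac{8a}{\rho_>} - 2\right)\rho f_1(\rho)\,d\rho\;\rho' f_1(\rho')\,d\rho'$$

in which $\rho_>$ denotes the larger of $\rho$ and $\rho'$.

Two cases of interest may now be distinguished. First, if the wire is assumed to have infinite conductivity, so that all the current flows on its surface, then $f_1(\rho) = (2\pi b)^{-1}\delta(\rho - b)$, and

$$L = \frac{a}{\mu_0^{-1}}\left(\ln\frac{8a}{b} - 2\right)$$

Second, if the current is assumed to be uniformly distributed over any cross section of the wire, then $f_1 = (\pi b^2)^{-1}$ and

$$L = \frac{a}{\mu_0^{-1}}\left(\ln\frac{8a}{b} - \frac{7}{4}\right)$$

The difference in these results is slight under normal circumstances.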
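Two numerical checks may help here. The sketch below first compares the logarithmic approximation for the complete elliptic integral K against an arithmetic-geometric-mean evaluation, and then evaluates the standard thin-ring inductance formulas, $L \approx \mu_0 a[\ln(8a/b) - 2]$ for surface current and $L \approx \mu_0 a[\ln(8a/b) - 7/4]$ for uniform current. The function names and the sample dimensions are assumptions made for illustration.

```python
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space, H/m (conventional value)

def elliptic_k_from_complement(k_prime):
    """Complete elliptic integral K(k), given k' = sqrt(1 - k^2), via the AGM."""
    a, g = 1.0, k_prime
    for _ in range(40):                  # the AGM converges quadratically
        a, g = 0.5 * (a + g), math.sqrt(a * g)
    return math.pi / (2.0 * a)

def elliptic_k_log_approx(k_prime):
    """Approximation K ~ ln(4 / sqrt(1 - k^2)), valid as k approaches 1."""
    return math.log(4.0 / k_prime)

def ring_inductance_surface(a, b):
    """Thin ring (a >> b), all current on the wire surface."""
    return MU0 * a * (math.log(8.0 * a / b) - 2.0)

def ring_inductance_uniform(a, b):
    """Thin ring (a >> b), current uniform over the wire cross section."""
    return MU0 * a * (math.log(8.0 * a / b) - 1.75)

if __name__ == "__main__":
    for kp in (1e-1, 1e-2, 1e-3):
        print(kp, elliptic_k_from_complement(kp), elliptic_k_log_approx(kp))
    a, b = 0.10, 0.001                   # hypothetical ring: 10 cm radius, 1 mm wire
    print(ring_inductance_surface(a, b), ring_inductance_uniform(a, b))
```

The two inductance values differ by exactly $\mu_0 a/4$, which is small compared with either value whenever $\ln(8a/b)$ is large.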

The concepts of capacitance and inductance may be extended to situations involving time-varying currents and charges by the following argument: Let time variations of the fields and sources be of the form $e^{j\omega t}$ ($\omega$ may be only one Fourier component of the total spectrum). Then the potential functions are

$$\mathbf{A}(x,y,z,t) = \int_V \frac{\boldsymbol{\iota}(\xi,\eta,\zeta)\,e^{j(\omega t - kr)}}{4\pi\mu_0^{-1}r}\,dV$$

$$\Phi(x,y,z,t) = \int_V \frac{\rho(\xi,\eta,\zeta)\,e^{j(\omega t - kr)}}{4\pi\epsilon_0 r}\,dV$$

If the currents and charges are disposed throughout a system of conductors, and if no point $(x,y,z)$ in the system is an appreciable part of a wavelength removed from any other point $(\xi,\eta,\zeta)$, then retarded time may be ignored between two such points, and

$$\mathbf{A}(x,y,z)\,e^{j\omega t} = e^{j\omega t}\int_V \frac{\boldsymbol{\iota}(\xi,\eta,\zeta)\,dV}{4\pi\mu_0^{-1}r} \qquad (5.161)$$

$$\Phi(x,y,z)\,e^{j\omega t} = e^{j\omega t}\int_V \frac{\rho(\xi,\eta,\zeta)\,dV}{4\pi\epsilon_0 r} \qquad (5.162)$$

Upon deleting the factors $e^{j\omega t}$, one sees that A and $\Phi$ have the same forms as were encountered in magnetostatics and electrostatics, except that now $\boldsymbol{\iota}$ and $\rho$ are, in general, complex quantities.
The condition that the extent of the conductor system be small compared to a
wavelength characterizes a lumped circuit. If within the circuit the regions where
conduction current and displacement current dominate are distinct and separate, the
analyses leading to Equations (3.159) and (5.158) may be repeated, with the potential
functions (5.161) and (5.162) replacing their static counterparts. The net result is
that the same formulas for capacitance and inductance are obtained. One concludes
that the static formulas for capacitance and inductance are valid so long as the frequency is low enough to insure that the circuit dimensions are small compared to
the wavelength.
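The lumped-circuit condition can be turned into a quick numerical check. The sketch below is illustrative only: the text requires that the circuit be small compared with a wavelength but gives no numerical bound, so the threshold of one tenth of a wavelength is an assumed rule of thumb, and the function names are hypothetical.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s

def electrical_size(dimension_m, frequency_hz):
    """Largest circuit dimension expressed in free-space wavelengths."""
    wavelength = C0 / frequency_hz
    return dimension_m / wavelength

def is_lumped(dimension_m, frequency_hz, threshold=0.1):
    """True if the circuit is small compared with a wavelength.

    `threshold` is an assumed engineering rule of thumb, not a value
    taken from the text.
    """
    return electrical_size(dimension_m, frequency_hz) < threshold

if __name__ == "__main__":
    for f in (60.0, 1.0e6, 1.0e9):
        print(f, electrical_size(0.05, f), is_lumped(0.05, f))
```

For a 5 cm circuit, the static formulas would be accepted by this criterion at power and radio frequencies but rejected at 1 GHz.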

5.15

TRANSFORMATION OF THE INTEGRAL SOLUTIONS TO FORMS SUITABLE FOR WAVEGUIDE PROBLEMS

In the development leading to the integral solutions for E and B, given by Equations (5.42) and (5.43), the Green's function $\psi = e^{-jkr}/r$ was employed. By retracing the steps leading to (5.42) and (5.43), the reader will have no difficulty in convincing himself that, if the more general Green's function

$$G(x,y,z,\xi,\eta,\zeta) = \frac{e^{-jkr}}{r} + g(x,y,z,\xi,\eta,\zeta) \qquad (5.163)$$

be used in place of $\psi$, with g any function which satisfies

$$(\nabla_S^2 + k^2)\,g = 0 \qquad (5.164)$$

everywhere in the volume V and over the bounding surfaces $S_1 \ldots S_N$, then (5.42) and (5.43) are once again obtained, with G replacing $\psi$ everywhere. In particular, if V is a source-free region bounded by the single closed surface S, then at any point $(x,y,z)$ within V,

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\oint_S \left[(\mathbf{1}_n \cdot \mathbf{E})\nabla_S G + (\mathbf{1}_n \times \mathbf{E}) \times \nabla_S G - j\omega G(\mathbf{1}_n \times \mathbf{B})\right] dS \qquad (5.165)$$

$$\mathbf{B}(x,y,z) = \frac{1}{4\pi}\oint_S \left[\frac{j\omega G}{c^2}(\mathbf{1}_n \times \mathbf{E}) + (\mathbf{1}_n \times \mathbf{B}) \times \nabla_S G + (\mathbf{1}_n \cdot \mathbf{B})\nabla_S G\right] dS \qquad (5.166)$$

in which $\mathbf{1}_n$ is the inward-drawn normal.


These equations may be applied to waveguide problems in the following manner: let any cross section of a cylindrical waveguide be represented as in Figure 5.5, with C the contour of the cross section, A the cross-sectional surface, u the outward-drawn normal direction, and v the peripheral direction.

FIGURE 5.5 Cross-sectional geometry of cylindrical waveguide.
Let the closed surface S alluded to in (5.165) and (5.166) be taken to consist of the interior surface of a portion of the waveguide, extending from $\zeta = \zeta_1$ to $\zeta = \zeta_2$, plus end caps of area A at $\zeta = \zeta_1$ and at $\zeta = \zeta_2$. Let the sources of the electromagnetic waves within the waveguide be fields externally impressed over holes in the waveguide walls. Further, let these holes be confined to a finite axial extent of the guide. Then, if a small loss is assumed in the medium (an air-filled guide, instead of an evacuated guide, for example), and if $\zeta_1 \to -\infty$, $\zeta_2 \to \infty$, it follows that the contributions to $\mathbf{E}(x,y,z)$ and $\mathbf{B}(x,y,z)$ in (5.165) and (5.166) due to the integrals over the two end caps are negligible. This condition will be assumed, and the surface S in (5.165) and (5.166) will be taken to consist of the interior waveguide surface of infinite axial extent.
For convenience, different Green's functions will be used in (5.165) and (5.166). If one selects

$$G_1(x,y,z,u,v,\zeta) = \frac{e^{-jkr}}{r} + g_1(x,y,z,u,v,\zeta) \qquad (5.167)$$

for use in (5.165), and further stipulates that $G_1 \equiv 0$ on S (the Dirichlet condition), then the z component of (5.165) becomes

$$E_z(x,y,z) = -\frac{1}{4\pi}\int\!\!\int E_z(u,v,\zeta)\,\frac{\partial G_1}{\partial u}\,dv\,d\zeta \qquad (5.168)$$

in which $G_1$ satisfies

$$(\nabla_S^2 + k^2)\,G_1 = -4\pi\,\delta(P_F - P_S) \qquad (5.169)$$

with $P_F$ the field point $(x,y,z)$ and $P_S$ the source point $(u,v,\zeta)$. Only $E_z$ need be found from (5.165), since all other field components of TM waveguide modes may be expressed in terms of $E_z$ (cf. Section 5.11).
In like manner, if one selects

$$G_2(x,y,z,u,v,\zeta) = \frac{e^{-jkr}}{r} + g_2(x,y,z,u,v,\zeta) \qquad (5.170)$$

for use in (5.166) and further stipulates that $\partial G_2/\partial u \equiv 0$ on S (the Neumann condition), then (5.166) yields the z component

$$B_z(x,y,z) = \frac{1}{4\pi}\int\!\!\int \frac{1}{j\omega}\left\{E_v(u,v,\zeta)\left[\frac{\partial^2}{\partial\zeta^2} + k^2\right]G_2 - E_z(u,v,\zeta)\,\frac{\partial^2 G_2}{\partial v\,\partial\zeta}\right\} dv\,d\zeta \qquad (5.171)$$

in which $G_2$ satisfies

$$(\nabla_S^2 + k^2)\,G_2 = -4\pi\,\delta(P_F - P_S) \qquad (5.172)$$

In reaching the result (5.171), Maxwell's equations have been used to replace $B_u$ by terms involving $E_v$ and $E_z$, and several integrations by parts have been employed to transfer the differentiations in the kernel to the Green's function.
Knowledge of tangential E everywhere on the waveguide walls permits determination of all field components at all points, through use of (5.168) and (5.171). When the walls are made of good conductor, $E_{\tan} \approx 0$ except over the holes in the walls, and the extent of the integration is thereby reduced.
If the waveguide has a simple cross section, such as rectangular or circular, the Green's functions $G_1$ and $G_2$ can be expressed as complete series of orthonormal functions, thereby greatly simplifying the analysis. As an illustration, the Green's functions for a rectangular waveguide are derived in Appendix J. When the results are substituted in (5.168) and (5.171), one obtains
E,(x,y,z)

i f .pm.,.~.c,y) J

= -

m=1 n=1

~ L~
- ~

(mir / a)

m =0 n =0

2) {3mn

+ (n

7r /

2w/3mn

II

m=O n=O

a.pm.,(1;,7]) E,(1;,7],t)e-

au

b) 2 '1t (
)
mn x,y

Wmn(X,y)
2w

ifJ m"I,-

J 'ltmn() 1-

.s

~,'YJ

av

~v ~,'YJ,s

J aw mn(I;,7]) E,(1;,7],t)e-

t l dv dt

ifJ m"I,-

(;3.173)

1
e-'~
] Iz- tid
, v Gs
mn

t l dv dt

(5.174)

in which the upper sign is used in the second sum of (5.174) if $z > \zeta$, the lower sign being used if $z < \zeta$. $\psi_{mn}$ and $\Psi_{mn}$ have been defined previously by Equations (5.131) and (5.135).

Equations (5.173) and (5.174) are known as the generating functions for rectangular waveguide and permit the determination of the fields anywhere within the guide if the fields are completely known on the walls. These generating functions were first obtained by Stevenson.¹³
EXAMPLE 5.15

Two identical rectangular waveguides are joined so as to have a common broad wall, as suggested by the figure. It is assumed that the dimensions a and b are so chosen that only the TE₁₀ mode will propagate at the frequency being considered (cf. Example 5.9). A pair of small crossed slots is cut in the common broad wall at a distance $x_1$ from the nearest side wall. In what is to follow, it will be shown that if $x_1$ is properly chosen, this assembly is a matched directional coupler. By this one means that if a TE₁₀ mode is fed into port 1, no reflected waves are detected at ports 1 and 2, but waves are detected at ports 3 and 4, their relative amplitudes being controlled by the size of the crossed slots.

[Figure panels: directional coupler; slot dimensions.]

If the results of Example 5.9 are utilized, a TE₁₀ mode incident on the crossed slots from port 1 may be represented by

$$B_z = K_0\cos\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)}$$

in which $K_0$ is the complex amplitude of the incident wave at $z = 0$. The other magnetic field component of this incident wave is

$$B_x = \frac{j\beta_{10}}{\pi/a}\,K_0\sin\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)}$$

¹³ A. F. Stevenson, "Theory of Slots in Rectangular Waveguides," J App Phys, 19, 24-38; January 1948.

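At the center of the slots $(x_1,0,0)$, the magnetic field components are

$$B_x = \frac{j\beta_{10}}{\pi/a}\,K_0\sin\frac{\pi x_1}{a}\,e^{j\omega t} \qquad B_z = K_0\cos\frac{\pi x_1}{a}\,e^{j\omega t}$$

If $x_1$ is chosen so that

$$\tan\frac{\pi x_1}{a} = \frac{\pi}{\beta_{10}a} = \left[\left(\frac{2a}{\lambda}\right)^2 - 1\right]^{-1/2}$$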

then at the center of the slots, $B_x$ and $B_z$ will have equal amplitudes and will be in time quadrature. With this condition assumed, if the slots were not present, the X and Z components of current density in the broad wall at $(x_1,0,0)$ also would have equal amplitudes and be in time quadrature. If the slots are narrow and small ($l \gg d$ but $l \ll \lambda$), the conduction current which each slot interrupts is replaced mainly by a displacement current across each slot. This means that electric fields which are in time quadrature are induced across the two slots by the incident wave. These electric fields may be approximated by the expressions

in which E is the complex induced amplitude at the center of each slot and it is assumed that the electric field goes to zero at both ends of each slot. The first electric field expression applies for the Z-directed slot, the second for the X-directed slot.

By virtue of (5.174), the back-scattered TE₁₀ mode which appears as a reflected wave at port 1 is given by
B(l)

1r

iwt

Ee

}wf3loa 2b

cos 1rX
a

{~

7r

7rr

cos ~ cos
e-itllolz-rl
a sal
-

(.J

fJIO

d~ dr

f.

1r ~
SIn cos 7T"(~ Sal

Xl)

e-j{jlolz-t\ d5t d~r }

If the substitution is made that $\xi' = \xi - x_1$, this becomes

l/2

-7f cos 7fXI


a

cos -7fS cos


l

{310S

df

X
. 7f
{310 SIn
- l

l/2

a ()

t'}

COS -7f ~' cos 7f


- ~' d t;;
l
a

Since $\sin(\pi x_1/a) = (\pi/\beta_{10}a)\cos(\pi x_1/a)$, and since $l \ll \lambda$, so that $\cos\beta_{10}\zeta \approx 1$ and $\cos(\pi\xi'/a) \approx 1$ throughout the integration interval, it follows that

$$B_z^{(1)} = 0$$

and there is no back-scattering to port 1. Similarly, $B_z^{(2)} = 0$. However, for ports 3 and 4, the upper sign must be used in the second integral of (5.174) and one obtains

$$B_z^{(3)} = B_z^{(4)} = K_1\cos\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)}$$

u E
Kl=--

in which

a 3b c

The total field emerging from port 3 is given by $K_0 + K_1$; at port 4 it is given by $K_1$. The mode amplitude $K_1$ is seen to increase with l so that the fraction of power diverted to port 4 may be controlled by slot size. Experimentation has revealed that an extensive dynamic range of power diversion is feasible with crossed slots of this type, making them suitable for use not only as a waveguide coupler, but also as a circularly polarized radiator (with the upper guide removed).¹⁴
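The slot position that puts the two magnetic field components in amplitude balance follows directly from the TE₁₀ propagation constant. The sketch below assumes an air-filled guide with $\beta_{10} = [(2\pi/\lambda)^2 - (\pi/a)^2]^{1/2}$ and the balance condition $\tan(\pi x_1/a) = \pi/(\beta_{10}a)$; the function names and the sample guide width are illustrative assumptions.

```python
import math

def beta10(a, wavelength):
    """TE10 phase constant in rad/m for guide width a and free-space wavelength."""
    k = 2.0 * math.pi / wavelength
    kc = math.pi / a
    if k <= kc:
        raise ValueError("TE10 mode is cut off: wavelength must be less than 2a")
    return math.sqrt(k * k - kc * kc)

def slot_offset(a, wavelength):
    """Distance x1 from the side wall at which |B_x| = |B_z| for the TE10 mode."""
    return (a / math.pi) * math.atan(math.pi / (beta10(a, wavelength) * a))

if __name__ == "__main__":
    a = 0.02286            # hypothetical guide width in metres
    lam = 0.03             # free-space wavelength in metres
    x1 = slot_offset(a, lam)
    b10 = beta10(a, lam)
    # Relative amplitudes of B_x and B_z at the slot center (K0 factored out)
    bx = (b10 * a / math.pi) * math.sin(math.pi * x1 / a)
    bz = math.cos(math.pi * x1 / a)
    print(x1, bx, bz)
```

Because $\beta_{10}$ depends on frequency, the offset computed this way is exact at only one frequency; away from it the two amplitudes drift apart.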

5.16*

A MINKOWSKIAN FORMULATION OF THE FIELD EQUATIONS

Shortly after the appearance of Einstein's first paper on relativity, Hermann Minkowski (1864-1909) recognized that a considerable clarification in notation was possible if the variable jct were treated as a fourth dimension and the equations of physics restated accordingly.¹⁵ Thus, for example, the proper distance (2.47) could then be written

$$ds^2 = dx_1^2 + dx_2^2 + dx_3^2 + dx_4^2 \qquad (5.175)$$

in which, in place of the coordinates x, y, z, t, the new coordinates

$$x_1 = x \qquad x_2 = y \qquad x_3 = z \qquad x_4 = -jct \qquad (5.176)$$

have been introduced, and the two events occasioning (5.175) have been assumed to be an infinitesimal distance and time apart. Equation (5.175) is seen to be an extension to four dimensions of the familiar expression for differential distance in three dimensions.

With this notation, functional derivatives with respect to $x_4$ assume the same form as those with respect to the spatial variables. As an illustration, the scalar wave equation becomes a four-dimensional version of Laplace's equation. The laws of dynamics may be recast in a static form and the governing equations of electrical phenomena also assume a simple and elegant structure, as shall be seen by what follows.

In the Preface to his Electrodynamics, Sommerfeld indicated his admiration of Minkowski's formulation by saying

. . . After I had heard Hermann Minkowski's lecture on "Space and Time" in 1909 in Cologne, I carefully developed the four-dimensional form of electro-dynamics as an apotheosis of Maxwell's theory . . . in return, this has always met with an enthusiastic reception on the part of my audience.

Sommerfeld's presentation of this material is especially appealing because of its conciseness and clarity, and the ensuing development is patterned after his approach wherever appropriate. Recently another excellent treatment of the subject has been offered by L. J. Chu.¹⁶ For applications of this formulation to other branches of physics, such as dynamics, the reader is referred to the literature of those fields.¹⁷

* This section may be omitted without loss in continuity of the technical presentation.
¹⁴ A. J. Simmons, "Circularly Polarized Slot Radiators," IRE Trans Antennas Propagation, AP-5 (1), 31-36; January, 1957.
¹⁵ H. Minkowski, Space and Time, an address delivered at the 80th Assembly of German Natural Scientists and Physicians, at Cologne, September 21, 1908. A translation may be found in The Principle of Relativity, Dover Publications, Inc., New York.
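Confining attention to a rephrasing of the governing equations of electromagnetics, let a four-dimensional generalization of the Laplacian operator be defined by

$$\Box^2 = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2} + \frac{\partial^2}{\partial x_4^2} \qquad (5.177)$$

Additionally, let the potentials A and $\Phi$, defined by (5.65) and (5.66), be combined to form the four-potential $\mathsf{A}$ whose components are

$$A_1 = A_x \qquad A_2 = A_y \qquad A_3 = A_z \qquad A_4 = \frac{\Phi}{jc} \qquad (5.178)$$

With these definitions, the differential equations (5.69) and (5.70) become

$$\Box^2\mathsf{A} = -\frac{\mathsf{I}}{\mu_0^{-1}} \qquad (5.179)$$

with $\mathsf{I}$ called the four-current density and possessing the components

$$I_1 = \iota_x \qquad I_2 = \iota_y \qquad I_3 = \iota_z \qquad I_4 = \frac{c\rho}{j} \qquad (5.180)$$

Equation (5.179) is the four-dimensional wave equation relating the potential function to its sources.

Equation (H.5), which connects the potentials A and $\Phi$, may be written

$$\nabla \cdot \mathbf{A} = \frac{\partial A_1}{\partial x_1} + \frac{\partial A_2}{\partial x_2} + \frac{\partial A_3}{\partial x_3} = -\frac{1}{c^2}\dot{\Phi} = -\frac{\partial A_4}{\partial x_4}$$

which gives

$$\Box \cdot \mathsf{A} \equiv 0 \qquad (5.181)$$

wherein $\Box$ is the four-vector $\left(\dfrac{\partial}{\partial x_1}, \dfrac{\partial}{\partial x_2}, \dfrac{\partial}{\partial x_3}, \dfrac{\partial}{\partial x_4}\right)$. Thus the four-dimensional divergence of the four-potential function is identically zero.

¹⁶ R. M. Fano, L. J. Chu, and R. B. Adler, Electromagnetic Fields, Energy and Forces, Appendix 1, John Wiley and Sons, Inc., New York, 1960.
¹⁷ See, e.g., H. Goldstein, Classical Mechanics, Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1953.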
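Turning now to Equations (5.67) and (5.68), which relate the field vectors to the potentials, one may write

$$B_x = \frac{\partial A_3}{\partial x_2} - \frac{\partial A_2}{\partial x_3} \qquad B_y = \frac{\partial A_1}{\partial x_3} - \frac{\partial A_3}{\partial x_1} \qquad B_z = \frac{\partial A_2}{\partial x_1} - \frac{\partial A_1}{\partial x_2}$$

$$\frac{jE_x}{c} = \frac{\partial A_4}{\partial x_1} - \frac{\partial A_1}{\partial x_4} \qquad \frac{jE_y}{c} = \frac{\partial A_4}{\partial x_2} - \frac{\partial A_2}{\partial x_4} \qquad \frac{jE_z}{c} = \frac{\partial A_4}{\partial x_3} - \frac{\partial A_3}{\partial x_4} \qquad (5.182)$$

These equations suggest the utility of introducing the four-dimensional curl

$$\operatorname{curl}_{mn}\mathsf{A} = \frac{\partial A_n}{\partial x_m} - \frac{\partial A_m}{\partial x_n} \qquad (5.183)$$

Since

$$\operatorname{curl}_{mm} = 0 \qquad \operatorname{curl}_{mn} = -\operatorname{curl}_{nm}$$

it follows that $\operatorname{curl}_{mn}\mathsf{A}$ is an antisymmetric tensor with six distinct components which differ from zero. Equations (5.182) may then be written in the tensor form

$$\operatorname{curl}\mathsf{A} = \begin{bmatrix} 0 & \left(\dfrac{\partial A_2}{\partial x_1} - \dfrac{\partial A_1}{\partial x_2}\right) & \left(\dfrac{\partial A_3}{\partial x_1} - \dfrac{\partial A_1}{\partial x_3}\right) & \left(\dfrac{\partial A_4}{\partial x_1} - \dfrac{\partial A_1}{\partial x_4}\right) \\ \left(\dfrac{\partial A_1}{\partial x_2} - \dfrac{\partial A_2}{\partial x_1}\right) & 0 & \left(\dfrac{\partial A_3}{\partial x_2} - \dfrac{\partial A_2}{\partial x_3}\right) & \left(\dfrac{\partial A_4}{\partial x_2} - \dfrac{\partial A_2}{\partial x_4}\right) \\ \left(\dfrac{\partial A_1}{\partial x_3} - \dfrac{\partial A_3}{\partial x_1}\right) & \left(\dfrac{\partial A_2}{\partial x_3} - \dfrac{\partial A_3}{\partial x_2}\right) & 0 & \left(\dfrac{\partial A_4}{\partial x_3} - \dfrac{\partial A_3}{\partial x_4}\right) \\ \left(\dfrac{\partial A_1}{\partial x_4} - \dfrac{\partial A_4}{\partial x_1}\right) & \left(\dfrac{\partial A_2}{\partial x_4} - \dfrac{\partial A_4}{\partial x_2}\right) & \left(\dfrac{\partial A_3}{\partial x_4} - \dfrac{\partial A_4}{\partial x_3}\right) & 0 \end{bmatrix} = \mathcal{F} \qquad (5.184)$$

in which $\mathcal{F}$ is an antisymmetric tensor possessing six distinct components different from zero, and properly may be called the field tensor. It is given by

$$\mathcal{F} = \begin{bmatrix} 0 & B_z & -B_y & \dfrac{jE_x}{c} \\ -B_z & 0 & B_x & \dfrac{jE_y}{c} \\ B_y & -B_x & 0 & \dfrac{jE_z}{c} \\ -\dfrac{jE_x}{c} & -\dfrac{jE_y}{c} & -\dfrac{jE_z}{c} & 0 \end{bmatrix} \qquad (5.185)$$

Similarly, the Maxwell equations can be cast in a four-dimensional form. If the operation

$$\operatorname{div}_m\mathcal{F} \equiv \sum_{n=1}^{4} \frac{\partial\mathcal{F}_{mn}}{\partial x_n} \qquad (5.186)$$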
applied to tensors is given the name reduction or divergence, it is seen to reduce a four-dimensional tensor to a four-vector, and may be contrasted to the operation $\Box\,\cdot$ introduced earlier, which reduced a four-vector to a scalar. Upon applying the operation (5.186) to the tensor $\mathcal{F}$ given in (5.185), one obtains

$$\operatorname{div}\mathcal{F} = \frac{\mathsf{I}}{\mu_0^{-1}} \qquad (5.187)$$

The first three components of this equation embody the Maxwell equation (5.24) and the last component is a rephrasing of (5.23).

Since for any antisymmetric tensor $\mathcal{T}$ the components are related such that

$$\mathcal{T}_{mn} = -\mathcal{T}_{nm}$$

it follows that

$$\Box \cdot \operatorname{div}\mathcal{T} = 0$$

and thus from (5.187) one is able to deduce that

$$\Box \cdot \mathsf{I} = 0 \qquad (5.188)$$

which is a restatement of the continuity equation (5.29).


Finally, let the dual of $\mathcal{F}$ be defined by the relation

$$\mathcal{F}^* = \begin{bmatrix} 0 & \dfrac{jE_z}{c} & -\dfrac{jE_y}{c} & B_x \\ -\dfrac{jE_z}{c} & 0 & \dfrac{jE_x}{c} & B_y \\ \dfrac{jE_y}{c} & -\dfrac{jE_x}{c} & 0 & B_z \\ -B_x & -B_y & -B_z & 0 \end{bmatrix} \qquad (5.189)$$

which is formed by an interchange of the real and imaginary constituents of $\mathcal{F}$. The reduction of this tensor is seen to give

$$\operatorname{div}\mathcal{F}^* = 0 \qquad (5.190)$$

The first three components of (5.190) comprise the Maxwell equation (5.21) whereas the fourth component is a restatement of (5.20).
To recapitulate, a complete representation of Maxwell's electromagnetic theory for free space is therefore embodied in the equations

$$\operatorname{div}\mathcal{F} = \frac{\mathsf{I}}{\mu_0^{-1}} \qquad \operatorname{div}\mathcal{F}^* = 0 \qquad (5.191)$$

The field tensor $\mathcal{F}$ may be found from the four-potential $\mathsf{A}$ through the relation

$$\mathcal{F} = \operatorname{curl}\mathsf{A} \qquad (5.192)$$

whereas the potential $\mathsf{A}$ is deducible from the equations

$$\Box^2\mathsf{A} = -\frac{\mathsf{I}}{\mu_0^{-1}} \qquad \Box \cdot \mathsf{A} = 0 \qquad (5.193)$$

For further development and use of this notation the interested reader is referred to Sommerfeld's Electrodynamics.
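The structure of the field tensor and its dual can be explored numerically. The following sketch assumes one particular sign convention, in which the entries of the fourth column of the field tensor are $jE/c$ and those of the fourth row are $-jE/c$; other texts use the opposite sign, so the convention should be checked against (5.185) before relying on it. The function names are illustrative. Full contraction of the tensors gives the two familiar field invariants, proportional to $B^2 - E^2/c^2$ and to $\mathbf{E} \cdot \mathbf{B}$.

```python
def field_tensor(E, B, c):
    """Antisymmetric 4x4 field tensor (assumed convention: F[0][3] = j*E_x/c)."""
    ex, ey, ez = (1j * comp / c for comp in E)
    bx, by, bz = B
    return [
        [0.0,  bz, -by,  ex],
        [-bz, 0.0,  bx,  ey],
        [ by, -bx, 0.0,  ez],
        [-ex, -ey, -ez, 0.0],
    ]

def dual_tensor(E, B, c):
    """Dual tensor: real and imaginary constituents of the field tensor interchanged."""
    ex, ey, ez = (1j * comp / c for comp in E)
    bx, by, bz = B
    return [
        [0.0,  ez, -ey,  bx],
        [-ez, 0.0,  ex,  by],
        [ ey, -ex, 0.0,  bz],
        [-bx, -by, -bz, 0.0],
    ]

def is_antisymmetric(T):
    """True if T[m][n] == -T[n][m] for every pair of indices."""
    return all(T[m][n] == -T[n][m] for m in range(4) for n in range(4))

def contract(P, Q):
    """Full contraction: sum over m and n of P[m][n] * Q[m][n]."""
    return sum(P[m][n] * Q[m][n] for m in range(4) for n in range(4))

if __name__ == "__main__":
    E, B, c = (3.0, -1.0, 2.0), (0.5, 4.0, -2.5), 2.0   # arbitrary sample values
    F, F_dual = field_tensor(E, B, c), dual_tensor(E, B, c)
    print(contract(F, F))        # equals 2 * (B.B - E.E / c^2)
    print(contract(F, F_dual))   # equals 4j * (E.B) / c
```

Working with plain nested lists keeps the example dependency-free; an array library would make the contractions shorter but adds nothing to the point being illustrated.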
REFERENCES

1. Campbell, L., and W. Garnett, The Life of James Clerk Maxwell, Macmillan and Company, London, 1882.

2. Crowther, J. G., Men of Science, W. W. Norton and Company, New York, 1936.

3. Fano, R. M., L. J. Chu, and R. B. Adler, Electromagnetic Fields, Energy and Forces, John Wiley and Sons, Inc., New York, 1960.

4. Glazebrook, R. T., James Clerk Maxwell and Modern Physics, Cassell and Company, Ltd., London, 1896.

5. Harrington, R. F., Time-Harmonic Electromagnetic Fields, McGraw-Hill Book Company, New York, 1961.

6. Jackson, J. D., Classical Electrodynamics, John Wiley and Sons, Inc., New York, 1962.

7. Jordan, E. C., Electromagnetic Waves and Radiating Systems, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1950.

8. Jones, Bence, The Life and Letters of Faraday, Longmans, Green and Company, London, 1870.

9. Panofsky, W. K. H., and M. Phillips, Classical Electricity and Magnetism, Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1955.

10. Ramo, S., and J. R. Whinnery, Fields and Waves in Modern Radio, 2nd ed., John Wiley and Sons, Inc., New York, 1953.

11. Shedd, P. C., Fundamentals of Electromagnetic Waves, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1954.

12. Sommerfeld, A., Electrodynamics, Academic Press, Inc., New York, 1952.

13. Stratton, J. A., Electromagnetic Theory, McGraw-Hill Book Company, New York, 1941.

14. Whittaker, E., A History of the Theories of Aether and Electricity, vol. 1, Thomas Nelson and Sons, Ltd., London, 1951.

PROBLEMS

5.1 Because of the result of Appendix E that the most general sources $\boldsymbol{\iota}(x,y,z,t)$ and $\rho(x,y,z,t)$ in XYZ may be built up from static charge distributions in all other Lorentzian frames, it follows that one may derive Maxwell's equations by starting only with $\rho'(x',y',z')$ in X'Y'Z', without including $\boldsymbol{\iota}'(x',y',z')$. To see this, assume only a static charge distribution in X'Y'Z' and parallel the development of Sections 5.2-5.4 to obtain the X'-connected portions of Maxwell's equations. Then invoke superposition to obtain equations (5.25). This procedure has the advantage in rigor of basing the derivation of the general field equations solely on electrostatics, and it then permits all the results of Chapter 4 to become special cases of the more general theory. In particular, Equation (4.29), which expresses the Biot-Savart law, is seen to be a limiting form of Equation (5.60), with $k = 0$.

5.2 Let B = F in the vector Green's theorem (5.37) and show that B is expressible in the form (5.43).

5.3 By taking the curl of (5.42), show that B may be written in the form (5.43).

5.4 Use the continuity equation to show that (5.42) may be converted to (5.44).

5.5 Using Fourier integral theory, show that the general form for the retarded potential $\Phi$, as given by (H.7), is a natural extension of the harmonic form (H.2).

5.6 Find the field pattern for a dipole which is one wavelength long if the current distribution is

5.7 Use Poynting's theorem to determine the total radiated power for the full-wave dipole of Problem 5.6.

5.8 Find the stored magnetostatic energy per unit length for the system consisting of two thin parallel conducting tubes of radius a and center-to-center spacing D, if they carry equal and opposite steady currents I.

5.9 Static charges and steady currents can set up time-independent electric and magnetic fields in a common region such that $\boldsymbol{\mathcal{P}} = \mathbf{E} \times \mathbf{H}$ is not identically zero, but still no net power flow exists. Show that under these conditions $\oint_S \boldsymbol{\mathcal{P}} \cdot d\mathbf{S} = 0$ for any closed surface S in the region.

5.10 If two uniform plane waves of common polarization but different angular frequencies $\omega_1$ and $\omega_2$ propagate simultaneously in the same direction, show that the net time-average power flow is equal to the sum of the individual time-average power flows.

5.11 Find the radiation pressure if a plane wave is normally incident on a perfectly absorbing plane screen.

5.12 Show that an elliptically polarized plane wave may be decomposed into appropriate amounts of right-handed and left-handed circularly polarized waves.

5.13 Determine expressions for the instantaneous stored energy density and Poynting's vector for an elliptically polarized plane wave. Check that your answers have the proper limits for linearly and circularly polarized waves.

5.14 A circularly polarized plane wave is normally incident on a perfectly conducting plane screen. What can be said about the polarization of the reflected wave?

5.15 Establish the law of reflection for a linearly polarized wave of arbitrary polarization incident on a perfectly conducting screen at an angle $\alpha$ with respect to the normal.

5.16 Repeat the analysis of Example 5.8 for the case of a coaxial transmission line consisting of two concentric circular cylindrical shells of radii a and b, with $b > a$.

5.17 Repeat the analysis of Example 5.8 for the case of two semi-infinite planar conducting sheets which lie in the same plane but are separated by a constant gap width a.

5.18 A rectangular cavity of dimensions a, b, c is excited in the mode

$$E_z = \sin\frac{\pi x}{a}\sin\frac{\pi y}{b}\,e^{j\omega t}$$

If $H_z = E_x = E_y \equiv 0$ and the walls are perfectly conducting, find the resonant frequency $\omega$ and the force exerted on each face.

5.19 Determine an expression for the resonant frequency for a TM mode in a cylindrical cavity of radius a and length l.

5.20 Determine the integral of the Poynting vector over any cross section of the circular cylindrical waveguide of Example 5.11 for any transverse magnetic mode.

5.21 Establish the expressions for the field components and the allowed frequencies for the resonant TM modes in a spherical cavity.

5.22 Deduce the self-inductance per meter of two parallel wires, each of radius a and center-to-center spacing D ($D \gg a$). Assume that the wires carry equal and opposite harmonic currents, and that $D \ll \lambda$, conditions often encountered in practice, such as in telephonic communications.

5.23

Calculate the mutual inductance between the t\VO coils shown in the figure. Assume that
b and that the respective numbers of turns are N and Ni:

[Figure: two coils separated by a distance d.]

5.24 For any antisymmetric tensor $\mathcal{T}$, show that $\Box \cdot \operatorname{div}\mathcal{T} = 0$.

5.25 In a Minkowskian formulation of the field equations, define a suitable field tensor whose terms represent the components of H and D.

CHAPTER

l)ieleclric lklalerials
WITH RESPECT to some aspects of their electrical behavior, materials 111ay be classified
as conductors, semiconductors, and dielectrics or insulators. (See Section 8.2.) An ideal
dielectric is a material which possesses no free charges and thus completely inhibits
the passage of steady electric current. Since many real materials of practical importance
approach this idealization, it constitutes a useful model on which to base an analysis
of electric behavior, as shall be seen in subsequent sections of this chapter.
Although electrically neutral, any dielectric is composed of molecules, which in turn
are composed of charged particles (nuclei and electrons), and these particles are usually
affected by the presence of an electric field. Such a field influences the positively and
negatively charged parts of a molecule oppositely, and these parts may suffer oppositely
directed displacements from their equilibrium positions, thus causing the molecule to
become polarized. These displacements are limited by strong restoring forces, caused
by the altered charge distribution within the molecule, such that the charge shift is
seldom more than a small fraction of a molecular diameter. In such cases the molecule
may be viewed as an elementary electric dipole (or several dipoles) whose distant field
can be calculated using the techniques developed in earlier chapters. When the contributions due to all the molecular dipoles are summed, the resultant is often found to
alter the field distribution significantly from the value it had in the absence of the
dielectric, both for points inside and outside the dielectric.
The dipole behavior of a molecule may arise from three distinct causes. First, the
electron cloud of a constituent atom may shift relative to its nucleus due to the presence
of an electric field. This induced effect is called electronic polarization. Second, the
molecular structure may be due to an arrangement of oppositely charged ions, which
can shift from their equilibrium positions under the action of an electric field, thus
giving rise to ionic polarization. Third, the molecule may consist of an arrangement of
atoms which, in the absence of an electric field, is a randomly oriented permanent
electric dipole. The presence of the field then causes a partial orientation of the permanent molecular dipole, causing a net polarization, and this phenomenon is termed
orientational polarization. All three effects may be present in a given material.
To account for these three sources of dielectric behavior, the material may be treated
as though it were a collection of dipole moments P in a vacuum, with consideration
of the detailed composition of P deferred to a later discussion. Accordingly, a distribution of static dipole moments will first be assumed and an expression derived for the
total electric field due to static primary charges and the assemblage of dipoles. This
expression will then be used to explain the manner in which dielectric materials affect
capacitance, to generalize the meaning of the electric flux density D₀, and to deduce a
relation for the local field at the site of any molecular dipole. Consideration will then
be given to the problem of connecting, for a linear dielectric material, the strengths
of the local field and the induced moments P. This will be done for electronic, ionic, and
orientational polarization and will be seen to permit simplifications of the expression
for the total field; additionally, it will lead to the relation D = εE, with the permittivity
factor ε serving to describe dielectric behavior for large classes of materials.
Some attention next will be given to nonlinear materials, notably ferroelectric
crystals, in which the polarization is not only not linearly proportional to the applied
field, but also depends on the prior history of excitation.
For linear dielectric materials, the theory will then be extended to the case of time-varying fields, at which point the necessity to include dielectric losses will arise. The
concept of a complex permittivity will be introduced whose imaginary part accounts
for these losses, and the dependence of permittivity on frequency will be considered
for all three types of polarization.
Finally, the free-space form of Maxwell's equations derived in Chapter 5 will be
extended to apply in regions occupied by dielectric materials. Additional extensions
of Maxwell's equations will occur at the ends of Chapter 7 (for magnetic materials) and
Chapter 8 (for conductive materials).

6.1 *

HISTORICAL SURVEY

The recognition of a distinction between two classes of materials, conductors and
insulators, dates from 1729 when Stephen Gray discovered the phenomenon of electric
conduction.¹ Most common substances were soon categorized with respect to this
property; the metals were identified as good conductors, and many excellent insulators
were known and widely used by electrical experimenters of the eighteenth century.
It will be remembered from Chapter 3 that during this period Henry Cavendish became
interested in electrostatics, and was the first to observe that the presence of an insulator
between the plates of a condenser increased its capacity to store charge for a given
voltage. He measured the relative dielectric constants of many common substances,
such as shellac, beeswax, ebonite, and paraffin, and performed experiments which
indicated that the dielectric constant was independent of voltage (for glass) and
independent of temperature (for rosin).² However, Cavendish's papers were still unpublished in 1837 when Michael Faraday rediscovered the effect.
Faraday was led to this problem by researches he had conducted on the decomposition of chemical compounds placed between electrodes. Contrary to the behavior of
liquid electrolytes, he observed that when a solid substance such as sulphur was used,
it did not conduct electricity and was in no way decomposed; yet its presence between
the electrodes did cause an effect in that the charge stored on each electrode was
increased over the value found when air was the intervening medium. Faraday pursued
this discovery, selecting shellac and sulphur as the two insulating substances best
* This section may be omitted without loss in continuity of the technical presentation.
1 S. Gray, "Several Experiments Concerning Electricity," Phil Trans Roy Soc (London), 37, 18-44; Feb., 1731.
2 H. Cavendish, Electrical Researches, ed. by J. C. Maxwell, Cambridge University Press, London, 1879. See particularly Notes 15 and 27.

suited for experimental study of the phenomenon. Using two identical spherical
capacitors, he left one air-filled and fitted a hemispherical shellac cup between the
conducting shells of the other. Upon comparing the ratio of charge to voltage for the
two condensers, Faraday concluded³
. . . assuming the capacity of the air apparatus as 1, that of the shell-lac apparatus would
be tfi or 1.55 . . . this by no means expresses the relation of lac to air. The lac only
occupies one half of the space . . . if the effect of the two upper halves of the globes be
abstracted, then the comparison of the shell-lac power in the lower half of the one, with
the power of the air in the lower half of the other, will be as 2: 1 . . . . I cannot resist the
conclusion that shell-lac does exhibit a case of specific inductive capacity.

This coefficient of specific inductive capacity, which Faraday introduced as a quantitative measure of capacity enhancement, is today called the relative dielectric constant. For sulphur he determined it to be 2.24, and then extended his experiments
to include a variety of insulators, both liquid and gaseous, as well as solid.
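
The step from the measured capacity ratio of 1.55 to the quoted 2:1 can be checked with a short calculation. The sketch below is not part of the original text; it assumes an idealized apparatus in which the shellac fills exactly half of the gap between the concentric spheres, so that the filled and unfilled halves act as two capacitors in parallel and the overall capacity is the average of the two. The function names are hypothetical labels chosen for this illustration.

```python
def half_filled_capacity_ratio(eps_r: float) -> float:
    """Capacity of a half-filled spherical capacitor relative to the air-filled one.

    Assumption (idealized): the dielectric fills exactly one hemisphere of the gap,
    so the two halves behave as capacitors in parallel: C / C_air = (1 + eps_r) / 2.
    """
    return 0.5 * (1.0 + eps_r)


def relative_permittivity_from_half_filled(capacity_ratio: float) -> float:
    """Invert the relation above: eps_r = 2 * (C / C_air) - 1."""
    if capacity_ratio <= 0.0:
        raise ValueError("capacity ratio must be positive")
    return 2.0 * capacity_ratio - 1.0


if __name__ == "__main__":
    # Faraday's measured ratio for the shellac apparatus, as quoted in the text.
    eps_shellac = relative_permittivity_from_half_filled(1.55)
    print(f"Implied specific inductive capacity of shellac: {eps_shellac:.2f}")
```

Under this assumption the measured ratio of 1.55 implies a value of about 2.1 for the filled half relative to air, which is consistent with Faraday's rounded figure of 2:1.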
To explain this phenomenon, Faraday formed a physical conception of the action of
insulators, based on an idea originally put forward by Davy some years earlier to
describe the behavior of a voltaic pile. Davy had supposed that prior to chemical
decomposition, the molecules of the liquid electrolyte became electrically polarized.
Faraday supported this hypothesis by noting⁴
When I discovered the general fact that electrolytes refused to yield their elements to a
current when in the solid state, though they gave them forth freely if in the liquid condition, I thought I saw an opening to the elucidation of inductive action . . . . For let
the electrolyte be water, a plate of ice being coated with platina foil on its two surfaces,
and these coatings connected with any continued source of the two electrical powers, the
ice will charge like a Leyden arrangement, representing a case of common induction, but
no current will pass. If the ice be liquefied, the induction will fall to a certain degree, because a current can now pass; but its passing is dependent upon a peculiar molecular arrangement of the particles . . . .
Faraday then drew the inference
. . . As, therefore, in the electrolytic action, induction appeared to be the first step, and
decomposition the second . . . as the induction was the same in its nature as that through
air, glass, wax, . . . produced by any of the ordinary means: and as the whole effect in
the electrolyte appeared to be an action of the particles thrown into a peculiar or polarized
state, I was led to suspect that common induction itself was in all cases an action of contiguous particles . . . .

In the following year Faraday elaborated on this idea, saying⁵


The particles of an insulating dielectric whilst under induction may be compared to a
series of small magnetic needles, or more correctly still to a series of small insulated conductors. If the space round a charged globe were filled with a mixture of an insulating
dielectric, as oil of turpentine or air, and small globular conductors, as shot, the latter
being at a little distance from each other so as to be insulated, then these would in their
condition and action exactly resemble what I consider to be the condition and action of
3 M. Faraday, Experimental Researches in Electricity, vol. 1, Sec. 1252-1270, Bernard Quaritch, Publisher, London, 1839.
4 Ibid., Sec. 1164.
5 Ibid., Sec. 1679.

the particles of the insulating dielectric itself. If the globe were charged, these little conductors would all be polar; if the globe were discharged, they would all return to their
normal state, to be polarized again upon the recharging of the globe.

This insight is all the more remarkable when one remembers the primitive state of
atomic theory in 1838. With this model, Faraday was able to deduce that the polarization of the dielectric would be opposite to the influence causing it, thus requiring more
primary charge to maintain the same voltage. This provided an explanation for the
increase of capacity due to the presence of a dielectric.
In drawing an analogy between dielectric polarization and the behavior of a series
of small magnetic needles, Faraday established a link to a successful theory of magnetization promulgated fourteen years earlier by Poisson.⁶ Poisson had adopted Coulomb's
doctrine of two magnetic fluids as the basis for his analysis. These fluids were presumed
to neutralize each other unless magnetically excited, in which case they moved to
opposite ends of the individual elements inside a magnetic body, but were incapable of
passing from one element to the next. This polarization of the two magnetic fluids
then caused a magnetic field distribution which was derivable as the gradient of a
potential function Φ_m. Poisson showed this potential function to be given by the
expression

$$\Phi_m = \oint \frac{\mathbf{M} \cdot d\mathbf{S}}{\xi} - \int \frac{\nabla \cdot \mathbf{M}}{\xi} \, dV$$


with the first integral taken over the surface of the magnetic body, the second taken
throughout its volume, and M the polarization density, or magnetization. This formula
shows that the magnetic field produced by the body is the same as would be caused
by a fictitious distribution of magnetic charges, consisting of a surface layer whose
density is the normal component of M, plus a volume distribution of density −∇·M.
With this interpretation, Poisson was able to explain the magnetic phenomena known
at that time, and this theory was one of his most significant achievements.
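
As an illustrative special case (not taken from the original text), consider a sphere of radius a with uniform magnetization of magnitude M directed along the polar axis. The symbols σ_m (the fictitious surface density) and θ (the polar angle measured from the direction of M) are labels introduced only for this sketch.

```latex
\nabla \cdot \mathbf{M} = 0 \quad (r < a),
\qquad
\sigma_m = \mathbf{M} \cdot \hat{\mathbf{n}} = M \cos\theta \quad (r = a)
```

In this case the fictitious volume distribution vanishes and only the surface layer remains. A surface layer varying as cos θ is the familiar source of a dipole field outside the sphere, which is what one would expect of a uniformly magnetized body.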
Accepting Faraday's conception of an analogous electric polarization in insulating
materials, all that was needed to provide a theory for dielectric behavior was to translate Poisson's theory of induced magnetism into electrostatic language. This was
done independently by Lord Kelvin⁷ and F. O. Mossotti,⁸ and the essence of their
development will be found in Section 6.3. In his memoir, Mossotti also showed the
manner in which the dielectric constant of a material depends on its mass density.
This relation, known as the Clausius-Mossotti equation, was derived independently by
Clausius some years later,⁹ and will be considered in Section 6.12.
With the growth of atomic theory, the nature of polarization mechanisms became
more clearly understood, and both electronic and ionic polarization were recognized
as forms of induction. It was appreciated that an atom or ion pair would become
polarized under the action of a local electric field, and that this local field was contributed to not only by the external sources, but also by the induced dipoles in all other
atoms or ion pairs of the material. Lorentz derived a particularly useful expression for
6 S. D. Poisson, "Memoir on the Theory of Magnetism," Mem Acad Sci (Paris), ser. 2, 5, 247-338; February 1824.
7 W. Thomson, Cambridge and Dublin Math J, 1, 75; November 1845.
8 F. O. Mossotti, Arch des Sc Phys (Geneva), 6, 193; 1847. Mem della Soc Ital Modena (2), 14, 49; 1850.
9 R. Clausius, Die mechanische Wärmelehre, vol. 2, pp. 62-97, Vieweg, 1879.

the local field in terms of the externally applied field and the polarization density.¹⁰ If
one uses the Lorentz expression, it is possible to obtain an explicit relation between the
externally applied field and the induced polarization, thus accounting for the alteration
in field distribution due to the presence of the material.
Orientational polarization, due to the existence of permanent dipole moments in
certain molecules, was first hypothesized by Debye¹¹ in 1912. He used this concept to
explain the high static dielectric constants of water, alcohol, and similar liquids.
Borrowing from an earlier analysis of Langevin, concerned with the analogous problem
of orientation of magnetic dipoles in a permeable material, Debye was able to demonstrate a temperature dependence for the dielectric constant of substances containing
polar molecules. He extended the Clausius-Mossotti equation to include orientational
polarization, and provided a technique whereby experimental data giving dielectric
constant versus temperature could be used to separate the orientational polarizability
from the electronic and i